Channel9 syntax highlighter bug fix

Maurits

Hold ∞ in the palm of your hand [B̲̅l̲̅a̲̅k̲̅e̲̅]
Posted by Maurits // Fri, Aug 26, 2005 2:37 AM

Suggested fix for syntax highlighter bug



void foo(int bar)
{
    int baz = bar * 2; // this is a comment, should be green
    baz -= 6; // the previous statement should not be green, but is
    // the following closing brace should not be green either, but is
    return;
}


 


Hmm... seems nothing is green :-|
EDIT: 8/9/05 now everything is green after the first //
EDIT2: Testing 8/24/05
Better... now comments don't subsume the rest of the code block in their verdant hue.
But there are two new bugs (or existing bugs that are only now visible):
single-line comments stop being green after their first comma
keywords on lines after a single-line comment are not blue

How hard would it be to add languages, I wonder... things like SQL with -- comments...
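
As a rough illustration of the "add languages" idea: if the highlighter looked up its comment marker in a per-language table, SQL's -- would just be one more entry. This is only a sketch in Python, not the site's code. The table, its language names, and split_line_comment are all made up for the example, and it ignores string literals entirely.

```python
# Hypothetical per-language table; the names and entries are illustrative only.
LINE_COMMENT_MARKERS = {
    "c": "//",
    "csharp": "//",
    "sql": "--",
}


def split_line_comment(line, language):
    """Return (code, comment) for one line; comment is "" if there is none.

    String literals are ignored, so a marker inside quotes is misread.
    """
    marker = LINE_COMMENT_MARKERS[language]
    code, found, rest = line.partition(marker)
    return code, found + rest
```

With this, split_line_comment("SELECT * FROM t -- all rows", "sql") separates the query text from the trailing comment, and the C case goes through exactly the same path.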
  Charles
  Welcome Change
 
  Wed, Aug 3 2005 4:49 PM
Thanks. We are looking into this. You should see this fixed in the next C9 site revision.

C

  Maurits
  Hold ∞ in the palm of your hand [B̲̅l̲̅a̲̅k̲̅e̲̅]
 
  Wed, Aug 3 2005 5:25 PM
Sweet... could you post a before-and-after code sampling when it's done?


  Maurits
  Hold ∞ in the palm of your hand [B̲̅l̲̅a̲̅k̲̅e̲̅]
 
  Fri, Aug 5 2005 4:11 PM
The attached code is a quick fix.  If I were going to engineer a code-coloring solution, I'd do something with this kind of interface (pseudocode follows):

string FormatCodeBlocks(string raw)
{
    // code blocks look like this
    coderegex = [code...language="(.*?)"](.*?)[/code]

    // create a CodeBlock object
    // this allows us to hide the implementation details from the basic plan
    return coderegex.Replace(raw, new CodeBlock($1, $2).ToHtmlString())
}

// all the different kinds of code
enum codechunktype
{
    literalString, // text = " like /* this */ ";
    multilineComment, /* text = " like this" */
    singlelineComment, // like this
    keyword, // return;
    whiteSpace, // may not be necessary
    normal // everything else, really
};

/* this becomes:
    (codechunk)
    {
        codechunktype.multilineComment,
        "/* this becomes:\n ... */"
    }
*/
private struct codechunk
{
    codechunktype codetype;
    string text;
};

class CodeBlock
{
    // all the parsing logic goes in the c'tor
    public CodeBlock(
        string language, // HTML or plaintext?
        string code
        // maybe a html/plaintext enum?
    );

    // all the styling goes here
    public string ToHtmlString();

    private codechunk[] m_code;
}

The constructor would populate the codechunk[] array given the language and the code.  ToHtmlString() would concatenate the code chunks together with styling information.

This would be harder to set up initially, but much easier to debug if something went wrong.
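
To make the shape of that design concrete, here is a minimal sketch in Python rather than the pseudocode's C#-like syntax. It is one possible reading of the interface, not a finished highlighter. The keyword list, the colors, the inline-style span markup, and the pre wrapper are all assumptions chosen for the example. The whiteSpace chunk type is left out, and the language argument is stored but not used.

```python
import html
import re
from enum import Enum


class ChunkType(Enum):
    LITERAL_STRING = "str"
    MULTILINE_COMMENT = "mlc"
    SINGLELINE_COMMENT = "slc"
    KEYWORD = "kw"
    NORMAL = "normal"


# Hypothetical keyword list and colors, for illustration only.
KEYWORDS = {"void", "int", "return"}
COLORS = {
    ChunkType.LITERAL_STRING: "maroon",
    ChunkType.MULTILINE_COMMENT: "green",
    ChunkType.SINGLELINE_COMMENT: "green",
    ChunkType.KEYWORD: "blue",
}

# Alternatives are tried left to right at each position, so a string
# literal is consumed whole before any comment marker inside it is seen.
TOKEN = re.compile(
    r'(?P<str>"(?:\\.|[^"\\\n])*")'
    r"|(?P<mlc>/\*.*?\*/)"
    r"|(?P<slc>//[^\r\n]*)"
    r"|(?P<word>[A-Za-z_]\w*)",
    re.DOTALL,
)


class CodeBlock:
    def __init__(self, language, code):
        """All the parsing happens here: split code into typed chunks."""
        self.language = language  # stored but unused in this sketch
        self.chunks = []  # list of (ChunkType, text) pairs
        pos = 0
        for m in TOKEN.finditer(code):
            if m.start() > pos:
                # Text between tokens (operators, digits, whitespace).
                self.chunks.append((ChunkType.NORMAL, code[pos:m.start()]))
            text = m.group()
            if m.lastgroup == "word":
                ctype = ChunkType.KEYWORD if text in KEYWORDS else ChunkType.NORMAL
            else:
                ctype = ChunkType(m.lastgroup)
            self.chunks.append((ctype, text))
            pos = m.end()
        if pos < len(code):
            self.chunks.append((ChunkType.NORMAL, code[pos:]))

    def to_html_string(self):
        """All the styling happens here: escape each chunk and color it."""
        parts = []
        for ctype, text in self.chunks:
            escaped = html.escape(text)
            color = COLORS.get(ctype)
            if color:
                parts.append(f'<span style="color:{color}">{escaped}</span>')
            else:
                parts.append(escaped)
        return "<pre>" + "".join(parts) + "</pre>"
```

Because the chunks always concatenate back to the original text, a mis-colored span can be traced to a single (type, text) pair, which is the debugging advantage mentioned above.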


  Maurits
  Hold ∞ in the palm of your hand [B̲̅l̲̅a̲̅k̲̅e̲̅]
 
  Fri, Aug 26 2005 2:22 AM
Glory be and the saints be praised it works!

Except... now "return" in the code sample is a keyword and should be blue, but isn't...

I'm betting there's code to avoid keyword-izing keywords "after" a double-slash "on the same line", and guess what... the line breaks aren't being recognized properly.  So keywords after a single-line comment aren't being recognized, even if they're on another line.

Seems commas terminate single-line comments :-?
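
Here is a small Python sketch of how the line-break half of that guess could play out. It is speculation about the cause, not the site's actual code. In Python's re module, "." matches everything except "\n", so a naive comment pattern runs straight through bare carriage returns and swallows the following lines. Normalizing the line endings first avoids that. The function names are invented for the example, and string literals are ignored.

```python
import re

# Naive pattern: "." stops only at "\n", so "\r" counts as comment text.
NAIVE_COMMENT = re.compile(r"//.*")


def normalize_newlines(code):
    """Convert CRLF and bare CR line endings to LF."""
    return code.replace("\r\n", "\n").replace("\r", "\n")


def find_comments(code):
    """Return the text of every match of the naive comment pattern."""
    return NAIVE_COMMENT.findall(code)


# Bare CR line endings: the "comment" swallows return; and the brace.
raw = "baz -= 6; // a comment, with a comma\rreturn;\r}"
swallowed = find_comments(raw)
fixed = find_comments(normalize_newlines(raw))
```

Here swallowed holds one match that runs to the end of the input, while fixed holds only the comment itself, comma included.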



  Charles
  Welcome Change
 
  Fri, Aug 26 2005 2:30 AM

I haven't been able to invest as much time in this as I would like. Thanks for your patience. I haven't forgotten about this.

 

C



  Manip
  Life's too short for chess.
 
  Fri, Aug 26 2005 7:48 AM
Hello World (I need to watch my language)Hello World Hello World

You also haven't had time to dump in the code I submitted to fix the tag removal.

  NeoTOM
  The Internet makes you stupid
 
  Sat, Aug 27 2005 11:36 PM
Charles wrote:
Thanks. We are looking into this. You should see this fixed in the next C9 site revision.

C


And to think when I read this I thought you meant just a little looksee over the backend servers.