    Enabling lexer guessing

    Review Request #3156 — Created June 22, 2012 and discarded — Latest diff uploaded

    Prolog is not Perl, and Pygments is smart enough to differentiate the two if we
    give it a chance. Pygments' *_lexer_for_filename() functions can guess the lexer
    based on two things...
    - the filename
    - the contents of the file, especially the shebang line
    
    The latter of the two was disabled in commit '36b761a', making Pygments pretty
    dumb about applying syntax highlighting.
    
    For instance, if you ever see red rectangles around '$' characters in Perl code
    then that's what's happening. Both Prolog and Perl use the '.pl' file extension,
    and without file contents Pygments flips a coin and half the time guesses
    Prolog (it sorts the candidates by a heuristic score, and without contents both
    scores are zero).
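
    To illustrate the difference, here is a minimal sketch. It assumes a Pygments
    version where get_lexer_for_filename() accepts the file contents as an optional
    second argument; pick_lexer() and the two sample sources are hypothetical names
    made up for this example and are not part of the patch.

```python
from pygments.lexers import get_lexer_for_filename

# Hypothetical sample files that share the '.pl' extension.
PERL_SOURCE = '#!/usr/bin/perl\nprint "hello\\n";\n'
PROLOG_SOURCE = "parent(X, Y) :- father(X, Y).\n"


def pick_lexer(filename, contents=None):
    """Return the name of the lexer Pygments selects for this file.

    With contents=None only the filename pattern is used; with contents,
    Pygments also scores each matching lexer against the text.
    """
    return get_lexer_for_filename(filename, contents).name


# Same extension, different contents, different lexers.
perl_choice = pick_lexer("script.pl", PERL_SOURCE)
prolog_choice = pick_lexer("family.pl", PROLOG_SOURCE)
```

    The Perl sample is recognized through its shebang line and the Prolog sample
    through its ':-' operator. Calling pick_lexer() with the filename alone leaves
    the choice to whatever tie-breaking the installed Pygments version applies.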
    
    If there's only one possible option (for instance, a '.java' extension) then
    this doesn't have any extra runtime cost. If there is ambiguity (such as Perl
    vs Prolog) then it's an O(n) operation over the file contents. To avoid having
    this (or the following pygments.highlight() call) take too long, we're imposing
    a two-second timeout.
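
    One way such a timeout could look is sketched below. This is not necessarily
    the mechanism used in the patch: it assumes a Unix host and that the call runs
    in the main thread, since it relies on SIGALRM. The names time_limit(),
    HighlightTimeout and highlight_or_plain() are hypothetical.

```python
import signal
from contextlib import contextmanager


class HighlightTimeout(Exception):
    """Raised when the guarded block exceeds its time budget."""


@contextmanager
def time_limit(seconds):
    """Abort the enclosed block after `seconds` (Unix, main thread only)."""
    def on_alarm(signum, frame):
        raise HighlightTimeout()

    previous = signal.signal(signal.SIGALRM, on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        yield
    finally:
        # Always cancel the timer and restore the previous handler.
        signal.setitimer(signal.ITIMER_REAL, 0)
        signal.signal(signal.SIGALRM, previous)


def highlight_or_plain(render, text, seconds=2.0):
    """Return render(text), or the unhighlighted text if it takes too long."""
    try:
        with time_limit(seconds):
            return render(text)
    except HighlightTimeout:
        return text
```

    Here render stands in for whatever does the lexer guessing plus the
    pygments.highlight() call. Falling back to the plain text means a slow file
    loses its highlighting but the diff still renders.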
    
    We're exercising this change with a ReviewBoard 1.5 instance, and it has slightly
    increased the ReviewBoardDiffFragment latency (p50 raised from 0.1 to 0.2
    seconds), but that's about it. IMHO this latency cost is well worth having a
    far more readable diff (I've had syntax highlighting turned off for years
    because it's distracting to read code incorrectly marked as being full of
    syntax errors).
    
    Note that this patch itself has not been tested against master - we're running
    an identical change against RB 1.5.