Regex messes up highlighting for most of the rest of the page #2839

Closed
TWiStErRob opened this issue Feb 16, 2016 · 5 comments

Comments

@TWiStErRob

This regex messes up highlighting for most of the rest of the page:

state :descriptors do
  # a full-form tag
  rule /!<[0-9A-Za-z;\/?:@&=+$,_.!~*'()\[\]%-]+>/, Keyword::Type
  # a tag in the form '!', '!suffix' or '!handle!suffix'
  rule %r(
    (?:![\w-]+)? # handle
    !(?:[\w;/?:@&=+$,.!~*\'()\[\]%-]*) # suffix
  )x, Keyword::Type
  # an anchor
  rule /&[\w-]+/, Name::Label
  # an alias
  rule /\*[\w-]+/, Name::Variable
end

from https://github.com/jneen/rouge/blob/master/lib/rouge/lexers/yaml.rb#L152

The apostrophe that the highlighter treats as the start of a single-quoted string (everything is colored blue from there on) is inside a character class (delimited by []), so it shouldn't have any special meaning. If I remove the apostrophe, the percent sign triggers the same problem. If I remove all special characters from the character class, it still breaks because of the >/, which I can't explain.

I reported it to GitHub support, which led me here:

We use open source TextMate-style language grammars for syntax highlighting, which are available here:
https://github.com/github/linguist/blob/master/grammars.yml
Linguist pulls in grammar updates with each new release, which usually happens every couple of weeks.

So based on https://github.com/github/linguist/blob/master/grammars.yml#L480
The code is at https://github.com/aroben/ruby.tmbundle/tree/4636a3023153c3034eb6ffc613899ba9cf33b41f

The problem is that this is a fork, and it is pinned to a branch.
It's about 20 commits behind and 2 minor commits ahead of the official master, and the official repository has changed so much that GitHub can't even show the diff:
aroben/ruby.tmbundle@pl...textmate:master

Syntaxes/Ruby.plist
2,663 additions, 1,710 deletions not shown because the diff is too large.
Please use a local Git client to view these changes.

It's possible that this issue was already fixed by those 20 commits. Would it be possible to pull those 2 minor changes into the original repo and use that instead, so that updates from other contributors are picked up?

/cc @aroben

@TWiStErRob
Author

Wow, nice tool! Yeah, I didn't see any particular commit related to this, but the diff was so big I hoped for it. Can I help in any way here?

@aroben
Contributor

aroben commented Feb 16, 2016

If you want to fix the regex bug you reported, feel free to work on the grammar and try out changes using Lightshow. Moving off our fork is waiting on textmate/ruby.tmbundle#73; perhaps a comment in that PR could help move things along.

@TWiStErRob
Author

I tried to test the editable grammar in Lightshow, but I was getting HTTP 414 Request-URI Too Long from the server. I even tried zipping the grammar source to see whether that would be a viable option, but the result is still too long:

<input type="hidden" name="grammar_text">
<textarea id="grammar_text_raw" class="js-optional-field code form-control hidden">
...
<script type="text/javascript">
// On submit, zip the raw grammar and put it, base64-encoded, into the hidden field.
submitter.addEventListener("submit", function() {
    var zip = new JSZip();
    zip.file("source.txt", document.getElementById('grammar_text_raw').value);
    submitter.grammar_text.value = zip.generate({type:"base64", compression:"DEFLATE"});
});
</script>

I think the submit method should be changed to POST for long texts. Long grammars don't work with GET anyway, so you wouldn't lose any linkability; it would just allow a little more leeway for use cases like this one.

Anyway, after this I just uploaded a file to somewhere I can edit it. I also managed to make the example smaller and runnable, because Sublime Text has the same issue. Here's the minimal code (lightshow):

puts %r([0-9'])
puts /[0-9']/ # all from here is part of a string until the next apostrophe
puts "hello"
# this is still a comment, ' but highlighted as code: def x = 1+2
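
As a sanity check that the problem lies in the grammar and not in the Ruby code, here is a minimal sketch that can be run with plain Ruby (2.4 or later, for match?). The constant and method names are made up for illustration; only the patterns themselves come from this thread. It suggests that Ruby treats the /.../ and %r(...) forms as the same pattern and that the apostrophe inside [] is just an ordinary character:

```ruby
# Hypothetical names; only the patterns are taken from the snippets in this issue.
SLASH_FORM   = /[0-9']/     # the form after which highlighting breaks in the example
PERCENT_FORM = %r([0-9'])   # the %r form of the same pattern

# Full-form tag rule as quoted from the Rouge YAML lexer.
FULL_TAG = /!<[0-9A-Za-z;\/?:@&=+$,_.!~*'()\[\]%-]+>/

# True when the regex matches a lone apostrophe, i.e. treats it as a literal.
def matches_apostrophe?(regex)
  regex.match?("'")
end

puts SLASH_FORM == PERCENT_FORM                   # Regexp#== compares source and options
puts matches_apostrophe?(SLASH_FORM)              # apostrophe is a plain class member
puts FULL_TAG.match?("!<tag:yaml.org,2002:str>")  # the long rule parses and matches a tag
```

If all three checks come out true, the code itself is fine and only the highlighter's handling of the /.../ literal is at fault.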

The fix I found was that /.../ needs to be matched at the same time as the %r×...× (string.regexp.mod-r.ruby) versions. Sadly, I don't know enough about Ruby or syntax grammars to know whether this is the correct approach.

@pchaigno
Contributor

pchaigno commented Dec 4, 2016

@TWiStErRob Thanks for working on this! Please report your findings to the grammar repository, as it is not something that can be fixed in Linguist.

@pchaigno pchaigno closed this as completed Dec 4, 2016
@github-linguist github-linguist locked as resolved and limited conversation to collaborators Jun 17, 2024