medium sized line (530 chars) is not highlighted #2647

Closed
mechatroner opened this issue Feb 26, 2019 · 8 comments

@mechatroner

Description

Syntax highlighting breaks on the second line (530 characters long) of the following file:

world_rank,university_name,country,total_score,num_students,international_students
3,Very long field Very long field Very long field Very long field Very long field Very long field Very long field Very long field Very long field Very long field Very long field Very long field Very long field Very long field Very long field Very long field Very long field Very long field Very long field Very long field Very long field Very long field Very long field Very long field Very long field Very long field Very long field Very long field Very long field Very long field Very long,United States of America,95.,100,33%

Language name: "CSV (Rainbow)" - rainbow_csv extension

Steps to reproduce

  1. Install rainbow_csv extension
  2. Create file with 2 lines above
  3. Set "CSV (Rainbow)" language
  4. Add one character (e.g. "f") to the end of the last line; the highlighting will disappear

Screenshot before adding the last "f" character: 529_chars
Screenshot after adding the last "f" character: 530_chars

Expected behavior

The second line is still highlighted.

Actual behavior

The second line is not highlighted.

Environment

  • Build: [e.g. 3176 - type "About" in the Command Palette]
  • Operating system and version: Windows 10

I know there is a problem with 16384-character lines (#513), but my line is much shorter.

@keith-hall
Collaborator

See #2603 (comment).
Presumably, when ST uses Oniguruma, it has a timeout within which the regex must match, to avoid lagging the editor while typing, etc.

@mechatroner
Author

Thanks! I will try to use "sublime-syntax".
But if "sublime-syntax" also uses Oniguruma internally, doesn't that mean it will run into the same timeout issue?

@mechatroner
Author

BTW, I use the same regex for the VSCode and Atom editors (they also use the Oniguruma engine), and this issue doesn't reproduce in either of them. Maybe they have a different timeout?

@FichteFoll
Collaborator

For sublime-syntax, ST uses a custom non-backtracking regex engine called sregex that is significantly more performant (mostly because it is non-backtracking), as long as the regex patterns don't rely on backtracking (which they usually don't).

@keith-hall
Collaborator

keith-hall commented Feb 26, 2019

ST uses its custom regex engine for tmLanguage grammars too, as it parses them into the same internal representation.

Even if your regex pattern is compatible with sregex, a simpler pattern spread over multiple contexts may perform better and not exhibit this problem. That is, define a variable that matches one CSV field and reuse it in multiple contexts, each of which decides which scope to apply to the match for your desired rainbow highlighting. In each context, set to the next one when the field separator is matched. In the prototype, match the end of the line (line terminator) to set back to the main context, which starts again and resets the scopes/colors that will be applied.
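
A minimal sketch of the layout described above, assuming a comma separator, no quoted fields, and only three colors in the cycle. The syntax name, the top-level scope, the `rainbow2`/`rainbow3` context names and the `meta.rainbow.N` scope names are placeholders chosen for illustration; they are not taken from the rainbow_csv extension.

```yaml
%YAML 1.2
---
# Hypothetical sketch: all names and scopes below are illustrative assumptions.
name: CSV (Rainbow sketch)
scope: text.csv.rainbow-sketch

variables:
  # One unquoted CSV field: anything up to the next comma or newline.
  field: '[^,\n]+'

contexts:
  # The prototype is included in every context, so a line terminator
  # always resets the cycle to the first column color.
  prototype:
    - match: '\n'
      set: main

  main:
    - match: '{{field}}'
      scope: meta.rainbow.1
    - match: ','
      set: rainbow2

  rainbow2:
    - match: '{{field}}'
      scope: meta.rainbow.2
    - match: ','
      set: rainbow3

  rainbow3:
    - match: '{{field}}'
      scope: meta.rainbow.3
    # After the last color, wrap around to the first one.
    - match: ','
      set: main
```

Each pattern here is short and needs no backtracking, so the per-line work does not depend on one large regex succeeding against the whole line.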

@mechatroner
Author

I've made an experimental sublime-syntax grammar (gist; double quotes are ignored) that highlights a long CSV line where an equivalent tmLanguage grammar fails. It also works about 2 times faster on large CSV files.

@FichteFoll
Collaborator

FichteFoll commented Mar 1, 2019

👍
You might want to pop when encountering the end of the line instead of pushing the rainbow1 context, so that the newline doesn't receive a meta scope; otherwise the entire remainder of the line would be highlighted if a background color was defined.

Also, as a general suggestion, multiple scopes like entity.name.other meta.rainbow.3 allow for cleaner selectors based solely on your custom meta scope, instead of being all over the place with string.rainbow5 and similar. You can do this in a single scope: or meta_scope: instruction.
You might also want to scope the separator as punctuation.separator.column.csv.
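
As a rough illustration of the two scope suggestions, assuming a comma separator and a single column: only the scope names `entity.name.other meta.rainbow.3` and `punctuation.separator.column.csv` come from the comment above, while the syntax name and top-level scope are placeholders.

```yaml
%YAML 1.2
---
# Hypothetical sketch: the name and top-level scope are illustrative assumptions.
name: CSV (scope naming sketch)
scope: text.csv.scope-sketch

contexts:
  main:
    # Two scopes in one `scope:` key, separated by a space. A color scheme
    # or plugin can then select on `meta.rainbow.3` alone.
    - match: '[^,\n]+'
      scope: entity.name.other meta.rainbow.3
    # The separator gets its own punctuation scope.
    - match: ','
      scope: punctuation.separator.column.csv
```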

@wbond
Member

wbond commented Mar 1, 2019

The original report was most likely related to backtracking within Oniguruma. The current dev builds have a newer version of Oniguruma, and it will print to the console when it gives up due to having tried too many times to find a match.

Between this and the fact that a much better syntax approach has been devised, I'd say this issue is "resolved".

@mechatroner If you want more help with syntax definitions, I'd recommend posting in the forum, or on the Discord server (see the forum for the server URL).
