Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Restrict Vim modeline parsing to a single line #4138

Merged
merged 3 commits into from
May 27, 2018

Conversation

jatoben
Copy link
Contributor

@jatoben jatoben commented May 24, 2018

Description

Files which have a similar syntax to Vim modelines, such as YAML, can trigger regex matching behavior that makes filetype detection very slow. In this pull request, I've replaced \s in the Vim modeline regex (which considers newlines as whitespace) with the POSIX bracket expression [[:blank:]], which doesn't ([ \t] should also work, if that's preferred). This constrains the parsing of each modeline to a single line and avoids the pathological case.

This short YAML file can be used to reproduce the behavior:

# vim: ft=yaml:ts=2:sw=2:expandtab
defaults:
  milestone: q1
  priority:  2
branches:
  master:
    events:
      push:
        :state: integrate
        :resolution: change
      merge:
        :state: analyze
        :substate: review

Before the change, using Ruby 2.4.2 on Linux:

$ time bundle exec bin/linguist pathological.yml 
pathological.yml: 15 lines (13 sloc)
  type:      Text
  mime type: text/x-yaml
  language:  YAML

real	1m3.701s
user	1m3.144s
sys	0m0.156s

After:

$ time bundle exec bin/linguist pathological.yml 
pathological.yml: 15 lines (13 sloc)
  type:      Text
  mime type: text/x-yaml
  language:  YAML

real	0m0.850s
user	0m0.524s
sys	0m0.104s

Checklist:

  • I am associating a language with a new file extension.

  • I am adding a new language.

  • I am fixing a misclassified language

    • I have included a new sample for the misclassified language:
      • Sample source(s):
        • [URL to each sample source, if applicable]
      • Sample license(s):
    • I have included a change to the heuristics to distinguish my language from others using the same extension.
  • I am changing the source of a syntax highlighting grammar

  • I am adding new or changing current functionality

    • I have added or updated the tests for the new or changed functionality.

I have not added any tests here, as I think the existing coverage in fixtures like test/fixtures/Data/Modelines/iamjs2.pl already covers the desired behavior. If I've missed anything, please let me know!

@jatoben jatoben requested a review from Alhadis May 24, 2018 04:28
Copy link
Collaborator

@Alhadis Alhadis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! 👍 Well done.

I've left some feedback.

@@ -49,20 +49,20 @@ class Modeline
# serves as a terminator for an option sequence, delimited by whitespace.
(?=
# So we have to ensure the modeline ends with a colon
: (?=\s* set? \s [^\n:]+ :) |
: (?=[[:blank:]]* set? [[:blank:]] [^\n:]+ :) |
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd recommend the use of [ \t] instead, as it's arguably clearer than a POSIX character class.

|
\s* : \s* # Note that whitespace around colons is accepted too:
[[:blank:]]* : [[:blank:]]* # Note that whitespace around colons is accepted too:
) # vim: noai : ft=ruby:noexpandtab
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should indent this line so the comment matches the line above it.


# Option's value. Might be blank; `vim: ft= ` says "use no filetype".
(?:
[^\\\s] # Beware of escaped characters: titlestring=\ ft=ruby
[^\\[[:blank:]]] # Beware of escaped characters: titlestring=\ ft=ruby
| # will be read by Vim as { titlestring: " ft=ruby" }.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indent needed here too.

@jatoben
Copy link
Contributor Author

jatoben commented May 25, 2018

Thanks for the review @Alhadis! I've made those updates in eda3e02 and 5fed5cd.

@Alhadis
Copy link
Collaborator

Alhadis commented May 26, 2018

LGTM. 👍

Copy link
Member

@lildude lildude left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow! Nice improvement @jatoben 🙇‍♂️

@lildude lildude merged commit 89d3d31 into github-linguist:master May 27, 2018
@jatoben jatoben deleted the vim-modeline-fix branch May 28, 2018 05:51
lildude added a commit that referenced this pull request May 30, 2018
@sheerun
Copy link
Contributor

sheerun commented Oct 18, 2020

better late than never :D

@github-linguist github-linguist locked as resolved and limited conversation to collaborators Jun 17, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants