Case sensitivity seems to have slowed down large scanning by x4 #98

Closed
nschonni opened this issue May 27, 2019 · 10 comments

@nschonni
Collaborator

For example, on Friday the build took around 5 minutes: https://dev.azure.com/azure-sdk/public/_build/results?buildId=38575&view=logs
Now it is up to 21 minutes: https://dev.azure.com/azure-sdk/public/_build/results?buildId=38845
I'll see if I can trace it back to a particular version.

@Jason3S Jason3S added the bug label May 28, 2019
@Jason3S
Collaborator

Jason3S commented May 28, 2019

Thank you. I will speed it up.

@Jason3S
Collaborator

Jason3S commented May 28, 2019

Can you try pinning the version to 4.0.13? That should help for now.

@nschonni
Collaborator Author

> Thank you. I will speed it up.

No pressure, just reporting something I hit 😉
I'll see if I can test out any options to speed it back up again, while keeping the case sensitivity

@Jason3S
Collaborator

Jason3S commented May 29, 2019

Please update to 4.0.16. I have changed the order of checking to speed things up.

@nschonni
Collaborator Author

Thanks, looks like that took 5 minutes off the scan.

@Jason3S
Collaborator

Jason3S commented May 31, 2019

@nschonni It is awesome to see cspell used on a large project. It is also a perfect performance test case.

I have done some more work to speed things up. There is still more to do, especially for large files.

One thing you can do that will take a couple of minutes off your run time is to create a dictionary for the words. cspell caches dictionaries, but it cannot cache word lists because of how they are composed, so it must rebuild the word list lookup for each file being checked. A dictionary, on the other hand, gets cached.
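As a rough illustration of why that matters, here is a toy Python model of the difference. It is not cspell's actual code: `build_lookup`, `load_dictionary`, and `check_file` are hypothetical names, and the real lookup structure is assumed to be more expensive to build than a set. The point is only that the inline word list is rebuilt on every call, while the file-backed dictionary is built once per path.

```python
from functools import lru_cache


def build_lookup(words: tuple[str, ...]) -> frozenset[str]:
    """Build a case-folded membership lookup (stand-in for the real, costlier build)."""
    return frozenset(w.casefold() for w in words)


@lru_cache(maxsize=None)
def load_dictionary(path: str) -> frozenset[str]:
    """Read a one-word-per-line file and build its lookup once per path."""
    with open(path, encoding="utf-8") as handle:
        return build_lookup(tuple(line.strip() for line in handle if line.strip()))


def check_file(text: str, inline_words: tuple[str, ...], dictionary_path: str) -> list[str]:
    """Return the words in `text` that appear in neither word source."""
    inline = build_lookup(inline_words)        # rebuilt for every file checked
    cached = load_dictionary(dictionary_path)  # built on the first call only
    unknown = []
    for word in text.split():
        key = word.casefold()
        if key not in inline and key not in cached:
            unknown.append(word)
    return unknown
```

With 1,654 files, the per-file rebuild runs 1,654 times in this model, while the cached build runs once.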

Copy all the words out of cSpell.json into a file called custom-words.txt, with one word per line:

AADDS
aadiam
abcxyz
ABFS
Accel
acceptors
accesspoint
...
Xero
XSMB
YYMMDD
Zabbix
Zilla
ziplist
Zoho
zset
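For a long word list, the copy can be scripted. The following is a minimal Python sketch, assuming cSpell.json is strict JSON (no `//` comments) and keeps its words under a top-level `"words"` key; the function name `export_words` is made up for this example.

```python
import json
from pathlib import Path


def export_words(config_path: Path, words_path: Path) -> list[str]:
    """Write the config's "words" entries to a text file, one word per line."""
    # Assumption: the config is strict JSON. If it contains // comments,
    # json.loads raises and the comments have to be stripped first.
    config = json.loads(config_path.read_text(encoding="utf-8"))
    # De-duplicate and sort case-insensitively so later diffs stay small.
    words = sorted(set(config.get("words", [])), key=lambda w: (w.lower(), w))
    words_path.write_text("".join(f"{w}\n" for w in words), encoding="utf-8")
    return words
```

Calling `export_words(Path("cSpell.json"), Path("custom-words.txt"))` from the directory that holds the config writes the file next to it.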

Update cSpell.json:

"words": [],
"dictionaryDefinitions": [
    {
        "name": "custom-words",
        "file": "./custom-words.txt",  // <-- relative to the `cSpell.json` file
        "description": "Project Words"
    }
],
...
"dictionaries": [
    "custom-words",   // <-- Anywhere in the list of dictionaries.
    "companies",
    "softwareTerms",
    "html",
    "typescript",
    "python",
    "node",
    "go",
    "java",
    "csharp"
],
...
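The same config change can be applied programmatically. This Python sketch works on an already-parsed config dict and only touches the three keys shown above; `use_custom_dictionary` is a hypothetical helper, not part of cspell.

```python
def use_custom_dictionary(config: dict, name: str = "custom-words",
                          file: str = "./custom-words.txt") -> dict:
    """Return a copy of the config that reads project words from a dictionary file."""
    updated = dict(config)
    updated["words"] = []  # the words now live in the text file

    # Replace any existing definition with the same name, keep the others.
    definitions = [d for d in config.get("dictionaryDefinitions", [])
                   if d.get("name") != name]
    definitions.append({"name": name, "file": file, "description": "Project Words"})
    updated["dictionaryDefinitions"] = definitions

    # Position in the list does not matter; the name goes first here.
    dictionaries = list(config.get("dictionaries", []))
    if name not in dictionaries:
        dictionaries.insert(0, name)
    updated["dictionaries"] = dictionaries
    return updated
```

The function does not modify its input and is safe to run twice: a second call returns an identical config.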

@Jason3S
Collaborator

Jason3S commented May 31, 2019

Here are my timing results:

Words in cSpell.json

CSpell: Files checked: 1654, Issues found: 1 in 1 files

real	6m53.006s
user	8m29.446s
sys	0m12.091s

Words in custom-words.txt

CSpell: Files checked: 1654, Issues found: 1 in 1 files

real	5m38.420s
user	6m45.880s
sys	0m9.982s
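To put the two "real" (wall-clock) times side by side, here is a small Python sketch that parses the `time`-style durations and computes the saving. The helper name `parse_duration` is made up for this example.

```python
import re


def parse_duration(text: str) -> float:
    """Convert a `time`-style duration such as '6m53.006s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", text.strip())
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


words_in_config = parse_duration("6m53.006s")  # "real" time, words in cSpell.json
words_in_file = parse_duration("5m38.420s")    # "real" time, words in custom-words.txt
saved = words_in_config - words_in_file
print(f"saved {saved:.1f}s ({saved / words_in_config:.1%} of wall-clock time)")
```

That works out to roughly 75 seconds, or about 18% of the original run.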

@Jason3S Jason3S added the FAQ label May 31, 2019
@nschonni
Collaborator Author

Thanks, I'll take a look at the dictionary idea! I just PR stuff over there to fix it, so I also just have to wait and see if they'll land it 😉

@kowalk

kowalk commented Jul 25, 2019

@nschonni Same here: when I use the glob "some dir/**/*.html", version 3.2.17 is 2x faster than version 4.0.26 ;(

@github-actions
Contributor

This issue has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 14, 2021