Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add codespell: configuration, workflow (to detect new typos) and have it fixed some typos #5176

Closed
wants to merge 7 commits into from

Conversation

yarikoptic
Copy link

📑 Summary

See https://github.com/codespell-project/codespell to discover more about project

📏 Design Decisions

Mostly implemented by running https://github.com/yarikoptic/improveit/blob/main/codespellit

📋 Tasks

Make sure you

  • 📖 have read the contribution guidelines but didn't test locally if any side effects, hopefully CI would show
  • 💻 have added necessary unit/e2e tests.
  • 📓 have added documentation. Make sure MERMAID_RELEASE_VERSION is used for all new features.
  • 🔖 targeted develop branch

Copy link

netlify bot commented Jan 3, 2024

Deploy Preview for mermaid-js ready!

Name Link
🔨 Latest commit b13e752
🔍 Latest deploy log https://app.netlify.com/sites/mermaid-js/deploys/65a5557f23b0c80008378e66
😎 Deploy Preview https://deploy-preview-5176--mermaid-js.netlify.app
📱 Preview on mobile
Toggle QR Code...

QR Code

Use your smartphone camera to open QR code link.

To edit notification comments on pull requests, go to your Netlify site configuration.

Copy link

codecov bot commented Jan 3, 2024

Codecov Report

Attention: 3 lines in your changes are missing coverage. Please review.

Comparison is base (6028036) 79.38% compared to head (6e659cf) 85.16%.
Report is 74 commits behind head on develop.

❗ Current head 6e659cf differs from pull request most recent head b13e752. Consider uploading reports for the commit b13e752 to get more accurate results

Additional details and impacted files

Impacted file tree graph

@@             Coverage Diff             @@
##           develop    #5176      +/-   ##
===========================================
+ Coverage    79.38%   85.16%   +5.77%     
===========================================
  Files          166      165       -1     
  Lines        13886    11166    -2720     
  Branches       707      698       -9     
===========================================
- Hits         11024     9509    -1515     
+ Misses        2708     1501    -1207     
- Partials       154      156       +2     
Flag Coverage Δ
e2e 85.16% <75.00%> (-0.04%) ⬇️
unit ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
...ckages/mermaid/src/diagrams/sequence/sequenceDb.js 84.64% <0.00%> (ø)
packages/mermaid/src/diagrams/c4/c4Db.js 65.27% <81.81%> (ø)

... and 24 files with indirect coverage changes

Copy link
Contributor

@nirname nirname left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome.

We already have cspell. It seems that its configuration needs to be adjusted to cover all the cases. Or may be we should switch to codespell completely? Or keep both?

cspell gives plenty of false positives, but it also gives some interesting results as

/mermaid/packages/mermaid/src/diagrams/gantt/ganttRenderer.js:728:27 - Unknown word (Occurances) fix: (Occurrences)

which either skipped or ignored by codespell.

What do you think?

This is our current config https://github.com/mermaid-js/mermaid/blob/develop/cSpell.json

cypress/platform/xss4.html Outdated Show resolved Hide resolved
@yarikoptic
Copy link
Author

cool! didn't know or forgot about cspell -- sounds like a neat and "professionally done" project!

I guess it could not hurt too much to try both for a while and then possibly kick away codespell or cspell if one annoys more than the other with false positives. Meanwhile filed also

to hopefully eventually make cspell to cover also what codespell covers ;-)

=== Do not change lines below ===
{
 "chain": [],
 "cmd": "codespell -w -i 3 -C 2",
 "exit": 0,
 "extra_inputs": [],
 "inputs": [],
 "outputs": [],
 "pwd": "."
}
^^^ Do not change lines above ^^^
=== Do not change lines below ===
{
 "chain": [],
 "cmd": "codespell -w",
 "exit": 0,
 "extra_inputs": [],
 "inputs": [],
 "outputs": [],
 "pwd": "."
}
^^^ Do not change lines above ^^^
@Jason3S
Copy link
Contributor

Jason3S commented Jan 16, 2024

cool! didn't know or forgot about cspell -- sounds like a neat and "professionally done" project!

I guess it could not hurt too much to try both for a while and then possibly kick away codespell or cspell if one annoys more than the other with false positives. Meanwhile filed also

to hopefully eventually make cspell to cover also what codespell covers ;-)

CSpell will pick up everything that codespell does, but the CSpell configuration will need to be adjusted.

The patterns found in ./cSpell.json tell the spell check to ignore all code in Markdown files and portions of any other file type that match the patterns.

mermaid/cSpell.json

Lines 166 to 201 in dc857f8

"patterns": [
{ "name": "Markdown links", "pattern": "\\((.*)\\)", "description": "" },
{
"name": "Markdown code blocks",
"pattern": "/^(\\s*`{3,}).*[\\s\\S]*?^\\1/gmx",
"description": "Taken from the cSpell example at https://cspell.org/configuration/patterns/#verbose-regular-expressions"
},
{
"name": "Inline code blocks",
"pattern": "\\`([^\\`\\r\\n]+?)\\`",
"description": "https://stackoverflow.com/questions/41274241/how-to-capture-inline-markdown-code-but-not-a-markdown-code-fence-with-regex"
},
{ "name": "Link contents", "pattern": "\\<a(.*)\\>", "description": "" },
{ "name": "Snippet references", "pattern": "-- snippet:(.*)", "description": "" },
{
"name": "Snippet references 2",
"pattern": "\\<\\[sample:(.*)",
"description": "another kind of snippet reference"
},
{ "name": "Multi-line code blocks", "pattern": "/^\\s*```[\\s\\S]*?^\\s*```/gm" },
{
"name": "HTML Tags",
"pattern": "<[^>]*>",
"description": "Reference: https://stackoverflow.com/questions/11229831/regular-expression-to-remove-html-tags-from-a-string"
}
],
"ignoreRegExpList": [
"Markdown links",
"Markdown code blocks",
"Inline code blocks",
"Link contents",
"Snippet references",
"Snippet references 2",
"Multi-line code blocks",
"HTML Tags"
],

The patterns even impact the ESLint cspell plug-in.

Everything in the image with a gray background is not checked.
image

Most of the issuea are cause by the HTML Tags and Markdown links patterns.

mermaid/cSpell.json

Lines 187 to 189 in dc857f8

"name": "HTML Tags",
"pattern": "<[^>]*>",
"description": "Reference: https://stackoverflow.com/questions/11229831/regular-expression-to-remove-html-tags-from-a-string"

and

{ "name": "Markdown links", "pattern": "\\((.*)\\)", "description": "" },

The cspell check <filename> command is useful for visualizing what has been excluded.

Moving the ignoreRegExpList into languageSettings to be applied when the langaugeId is markdown would prevent the patterns bleeding into other file types.

  "languageSettings": [
    {
      "languageId": "markdown",
      "ignoreRegExpList": [
        "Markdown links",
        "Markdown code blocks",
        "Inline code blocks",
        "Link contents",
        "Snippet references",
        "Snippet references 2",
        "Multi-line code blocks",
        "HTML Tags"
      ]
    }
  ],

Switching over the cspell.config.yaml would make it much easier to work with regular expressions.

patterns:
  - name: Markdown links
    pattern: \((.*)\)
    description: ''
  - name: Markdown code blocks
    pattern: /^(\s*`{3,}).*[\s\S]*?^\1/gmx
    description: Taken from the cSpell example at
      https://cspell.org/configuration/patterns/#verbose-regular-expressions
  - name: Inline code blocks
    pattern: \`([^\`\r\n]+?)\`
    description: https://stackoverflow.com/questions/41274241/how-to-capture-inline-markdown-code-but-not-a-markdown-code-fence-with-regex
  - name: Link contents
    pattern: \<a(.*)\>
    description: ''

Note: the Markdown dictionary already excludes Markdown Links: cspell-dicts/markdown.

Creating a repository dictionary might also be useful to avoid scripts like: fixCSpell.ts

It might also be wise to update cspell to 8.x.

@sidharthv96
Copy link
Member

sidharthv96 commented Jan 19, 2024

@Jason3S can you please create a PR with the required changes?
I'm not a fan of adding yet another dependency when the one we already have can do the job. 😅

Creating a repository dictionary might also be useful to avoid scripts like: fixCSpell.ts

That script is to sort and lowercase all the words added by VSCode extension when we use the "Update cSpell.json" command. This avoids merge conflicts as new lines won't be always added at the bottom, and makes it cleaner.

@Jason3S
Copy link
Contributor

Jason3S commented Jan 25, 2024

@Jason3S can you please create a PR with the required changes? I'm not a fan of adding yet another dependency when the one we already have can do the job. 😅

Creating a repository dictionary might also be useful to avoid scripts like: fixCSpell.ts

That script is to sort and lowercase all the words added by VSCode extension when we use the "Update cSpell.json" command. This avoids merge conflicts as new lines won't be always added at the bottom, and makes it cleaner.

I'll take a look next week.

@yarikoptic
Copy link
Author

ping on this -- please advise on either should be closed/fixedup/whatnot?

@yarikoptic
Copy link
Author

image

dang -- I didn't know that there would be a bounties! I will try harder next time ;)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants