Tooling request: use a large test corpus #11

Open
SKalt opened this issue Mar 29, 2023 · 1 comment

@SKalt
Contributor

SKalt commented Mar 29, 2023

We could run the resulting parser over all the git commit messages from a long-running OSS project such as the Linux kernel to identify parser bugs and performance issues. The parser shouldn't report any errors, since committed messages are valid by construction.

Would you be interested in a PR with a script that could run this kind of fuzz/benchmarking test?
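
A rough sketch of what such a script could look like, assuming Python 3.9+ and a local clone of the corpus repository. The function names are hypothetical, and `has_error` is a placeholder for whatever call actually runs this grammar's parser on one message and reports whether the tree contains an error; it is not an existing API.

```python
import subprocess
import time
from typing import Callable, Iterable


def split_messages(raw: bytes) -> list[str]:
    """Split NUL-delimited `git log` output into individual commit messages."""
    chunks = raw.split(b"\x00")
    # Undecodable bytes are replaced rather than aborting the whole run.
    messages = [c.decode("utf-8", errors="replace").strip("\n") for c in chunks]
    return [m for m in messages if m]


def read_commit_messages(repo: str) -> list[str]:
    """Read every commit message reachable from HEAD in the clone at `repo`."""
    # %B is the raw message body; %x00 emits a NUL byte after each commit,
    # which is a safe delimiter because messages may contain blank lines.
    raw = subprocess.run(
        ["git", "-C", repo, "log", "--format=%B%x00"],
        check=True,
        capture_output=True,
    ).stdout
    return split_messages(raw)


def run_corpus(
    messages: Iterable[str],
    has_error: Callable[[str], bool],  # hypothetical hook into the parser
) -> tuple[list[str], float]:
    """Parse each message; return the failing messages and the elapsed seconds."""
    failures = []
    start = time.perf_counter()
    for message in messages:
        if has_error(message):
            failures.append(message)
    return failures, time.perf_counter() - start
```

Usage would be something like `run_corpus(read_commit_messages("path/to/clone"), has_error)`, then printing the failing messages and the total time. Keeping the parser behind a callable means the same loop could serve as either a correctness check or a crude benchmark.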

@the-mikedavis
Owner

The commit messages themselves almost never have parse failures - errors tend to come from trying to parse the description in the comments. Those comments are dropped when performing the commit so we're left with just basic messages which are very easy to parse.
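
To illustrate what gets dropped, here is a minimal sketch that approximates git's default `strip` cleanup on an editor buffer. It assumes the default `#` comment character and does not handle the scissors line used by verbose commits; the function name is made up for this example.

```python
def strip_commit_comments(text: str, comment_char: str = "#") -> str:
    """Approximate `git commit --cleanup=strip` on an editor buffer."""
    kept: list[str] = []
    for line in text.splitlines():
        # Commentary lines (starting with the comment character) are dropped.
        if line.startswith(comment_char):
            continue
        line = line.rstrip()
        # Skip leading blank lines and collapse runs of blank lines.
        if not line and (not kept or not kept[-1]):
            continue
        kept.append(line)
    # Drop trailing blank lines.
    while kept and not kept[-1]:
        kept.pop()
    return "\n".join(kept) + "\n" if kept else ""
```

What survives is just the subject and body, which is why a corpus of already-committed messages would not exercise the comment-parsing rules.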

Tree-sitter grammar repositories for some programming languages have integration tests that clone some popular libraries and run the parser in test mode against those. I don't think it makes sense to do that for commit messages, though, since there aren't any really good examples with comments. It would still be good to expand the unit tests with more cases.
