Add Parlot to JsonBench #96
base: main
Interesting. Looks like I can no longer claim to be the fastest in C#! 😉 I'm curious where Parlot gets its speed from. Is it purely down to the fact that Parlot does less thorough error reporting?
I have no clue where the difference could be, but it's easier to make something faster when you have a baseline. If you want to use this as an opportunity, I'd suggest checking why Pidgin allocates so much more in the DeepJson scenario, which shows the biggest difference. I am not sure what you mean by thorough error reporting; maybe I am not aware of a specific feature in Pidgin. In Parlot, errors are reported explicitly with a custom parser construct: if that parser is reached (or the previous one fails), the error is reported. The only limitation I am aware of right now is that there is a single error message, so I need to improve it to continue parsing and report more errors when possible. What we paid attention to for perf is using ref structs, not creating results when not necessary, removing interface dispatch, and keeping most things strongly typed. I think I saw a few boxing code paths in Pidgin at some point, which could be a difference. I had a hard time removing such code paths while maintaining a consistent API. Maybe the main thing is that @lahma seems to like making my dumb code faster ;) He knows all the tricks to gain a few ns here and there.
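To illustrate the interface dispatch and boxing point, here is a minimal sketch. It is not code from Parlot or Pidgin; `ICharPredicate`, `IsDigit`, and `Scanner` are hypothetical names made up for the example. It shows the same loop written twice: once taking the predicate as an interface (a struct argument gets boxed and each call goes through interface dispatch), and once as a constrained generic (no boxing, and the JIT can specialise the method for the struct type).

```csharp
using System;

// Hypothetical names throughout; this is not code from Parlot or Pidgin.
public interface ICharPredicate
{
    bool Matches(char c);
}

public readonly struct IsDigit : ICharPredicate
{
    public bool Matches(char c) => c >= '0' && c <= '9';
}

public static class Scanner
{
    // Predicate passed as an interface: a struct argument is boxed (one heap
    // allocation per call site invocation) and Matches is an interface call.
    public static int CountViaInterface(ReadOnlySpan<char> input, ICharPredicate predicate)
    {
        var count = 0;
        while (count < input.Length && predicate.Matches(input[count])) count++;
        return count;
    }

    // Predicate passed as a constrained generic: the JIT generates a version
    // of this method per struct type, so nothing is boxed and the Matches
    // call can be resolved directly (and is often inlined).
    public static int CountGeneric<TPredicate>(ReadOnlySpan<char> input, TPredicate predicate)
        where TPredicate : struct, ICharPredicate
    {
        var count = 0;
        while (count < input.Length && predicate.Matches(input[count])) count++;
        return count;
    }
}

public static class Program
{
    public static void Main()
    {
        ReadOnlySpan<char> input = "12345abc";
        Console.WriteLine(Scanner.CountViaInterface(input, new IsDigit())); // boxes IsDigit, prints 5
        Console.WriteLine(Scanner.CountGeneric(input, new IsDigit()));      // no boxing, prints 5
    }
}
```

Both calls return the same count; the difference only shows up in allocations and call overhead, which is the kind of thing a benchmark's allocation column would surface.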
Re error reporting, Pidgin does quite a lot of work to keep track of what the parser was expecting to encounter, including across branches, so that I can give error messages like There's also a certain amount of overhead associated with supporting different types of input (that is, not always parsing from a string). That's one of the reasons I have a separate function to enable backtracking ( Beyond that, there might be some overhead in the implementation of the parsers themselves, rather than across-the-board costs (perhaps the loops themselves are not optimised). That seems quite directly tractable, if I can diagnose the worst performers!
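As a rough picture of what "keeping track of what the parser was expecting, including across branches" can involve, here is a minimal sketch. It is an assumption-laden illustration, not Pidgin's implementation or API: `ExpectedTracker`, its methods, and the message format are all hypothetical. Every failed alternative records what it wanted at the furthest position reached, and the final message merges those expectations.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; not Pidgin's actual implementation, API, or message format.
public sealed class ExpectedTracker
{
    private readonly string _input;
    private int _furthest = -1;
    private readonly SortedSet<string> _expected = new SortedSet<string>(StringComparer.Ordinal);

    public ExpectedTracker(string input) => _input = input;

    // Tries to match a literal at pos; on failure, records what was expected there.
    public bool Literal(string text, int pos, out int next)
    {
        var matched = pos + text.Length <= _input.Length
            && string.CompareOrdinal(_input, pos, text, 0, text.Length) == 0;
        if (matched)
        {
            next = pos + text.Length;
            return true;
        }
        RecordFailure(pos, "\"" + text + "\"");
        next = pos;
        return false;
    }

    private void RecordFailure(int pos, string what)
    {
        if (pos > _furthest)
        {
            // A failure further into the input supersedes earlier ones.
            _furthest = pos;
            _expected.Clear();
        }
        if (pos == _furthest) _expected.Add(what);
    }

    public string Describe()
    {
        var alternatives = string.Join(" or ", _expected);
        return $"Parse error at position {_furthest}: expected {alternatives}";
    }
}

public static class Program
{
    // value := "true" | "false" | "null"  (three branches tried in order)
    private static bool Value(ExpectedTracker t, int pos, out int next) =>
        t.Literal("true", pos, out next)
        || t.Literal("false", pos, out next)
        || t.Literal("null", pos, out next);

    public static void Main()
    {
        var tracker = new ExpectedTracker("[nope]");
        var ok = tracker.Literal("[", 0, out var pos)
            && Value(tracker, pos, out pos)
            && tracker.Literal("]", pos, out pos);
        if (!ok) Console.WriteLine(tracker.Describe());
        // prints: Parse error at position 1: expected "false" or "null" or "true"
    }
}
```

The bookkeeping in `RecordFailure` runs on every failed branch, including ones that are later recovered from, which is one plausible source of the across-the-board cost described above.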
If I were you, I'd keep this PR around in case you want to use it to make Pidgin faster, if you are willing and have the time for that.
Parlot is a new parser combinator library by @sebastienros. I added it to JsonBench for reference by bringing over the parser from Parlot's repository.