perf(parser): optimize conditional advance on ASCII values #4298

Merged
merged 1 commit into from
Jul 27, 2024

Conversation

lucab
Contributor

@lucab lucab commented Jul 16, 2024

Part of #3291.

@github-actions github-actions bot added the A-parser Area - Parser label Jul 16, 2024

codspeed-hq bot commented Jul 16, 2024

CodSpeed Performance Report

Merging #4298 will not alter performance

Comparing ups/parser-advance-ascii-eq (868fc87) with main (e2735ca)

Summary

✅ 32 untouched benchmarks

@lucab lucab marked this pull request as ready for review July 16, 2024 14:58
@overlookmotel
Contributor

Surprising and disappointing that this has no effect on benchmarks. I really thought it would, and if I remember right when I played with this back in Dec/Jan it did produce a speed-up.

I'm not sure what's going on! The problem previously was that the `Chars` iterator was really slow, because the compiler didn't recognise that `self.remaining().chars().next() == Some('.')` only requires checking 1 byte, and instead generated a bunch of code to read a potentially multi-byte char before checking whether it's `'.'`. Maybe between then and now the `Chars` iterator has become more optimized?
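For readers following along, here is a minimal sketch of the two styles of check being compared. The function names `next_eq_dot_chars` and `next_eq_dot_byte` are hypothetical and are not taken from the oxc source or the Godbolt links; the sketch only assumes that the remaining source text is available as a `&str`. It relies on the UTF-8 property that an ASCII byte never appears inside a multi-byte sequence, so comparing the first byte gives the same answer as decoding the first char.

```rust
/// Char-based check: decode the first `char` via the `Chars` iterator,
/// then compare it with '.'.
fn next_eq_dot_chars(remaining: &str) -> bool {
    remaining.chars().next() == Some('.')
}

/// Byte-based check: compare only the first byte. Every byte of a
/// multi-byte UTF-8 sequence is >= 0x80, so a first byte equal to b'.'
/// can only be the ASCII character '.'.
fn next_eq_dot_byte(remaining: &str) -> bool {
    remaining.as_bytes().first() == Some(&b'.')
}

fn main() {
    // Both checks should agree on empty, ASCII, and non-ASCII input.
    for s in ["", ".", ".x", "x.", "é.", "日本"] {
        assert_eq!(next_eq_dot_chars(s), next_eq_dot_byte(s));
    }
    println!("both checks agree");
}
```

Whether the byte-based form actually compiles to fewer instructions depends on the compiler version and surrounding code, so it is worth confirming on Godbolt rather than assuming.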

@overlookmotel
Contributor

Nope. The `Chars` iterator still generates a ton more code on Rust 1.79.0 than checking by byte.

https://godbolt.org/z/dsv93Pnce

I'll take another look at this tomorrow with fresh eyes. I really hope I haven't wasted your time.

@lucab lucab force-pushed the ups/parser-advance-ascii-eq branch from 529dde4 to 6a15bca Compare July 17, 2024 07:50
@overlookmotel
Contributor

overlookmotel commented Jul 17, 2024

I am really struggling to understand why this isn't doing more on benchmarks.

Here is a more complete version of before vs after this PR on Godbolt: https://godbolt.org/z/fjc96GfeE

  • Old version `next_eq_dot_old` is 55 instructions.
  • New version `next_eq_dot_new` is 11.

So the old code is way more verbose. It also requires 2 branches before it gets to the `cmp esi, 46` instruction, which is where it does the actual check for "is this byte `.`?" (46 is the ASCII code for `.`). Granted, both those branches are almost never taken: the 1st is only taken when we have reached EOF, and the 2nd is only taken if the byte read is not ASCII (which never happens in our benchmark fixtures), so the branch predictor will speculatively continue down the common path without delay. But still, I'd have expected that:

  1. Trimming even a few extra instructions in a path as hot as this would make a real difference, at least to the lexer benchmarks.
  2. The excessive amount of code generated by the old version might cause more instruction cache to be consumed and therefore cause more instruction cache misses.

But no!

I've also tried adding `#[inline(always)]` to both of the relevant functions to make sure inlining is definitely happening. But I didn't expect that to make any difference, and sure enough it doesn't.

The Godbolt link above also includes an alternative version, `advance_if_ascii_eq2`, which converts the unpredictable branch into straight-line code. But I doubt that'd help in practice: in actual code, we're always branching on the result of `next_ascii_char_eq` too, so that'd re-introduce the branch.
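To illustrate the idea of turning the conditional advance into straight-line code, here is a rough sketch. The `Cursor` type and both method bodies are assumptions made for illustration; the thread does not show the actual oxc or Godbolt code, and only the name `advance_if_ascii_eq2` is borrowed from the comment above. The second method adds the comparison result (0 or 1) to the position instead of branching on it, which gives the compiler the chance to avoid a conditional jump.

```rust
/// Hypothetical stand-in for the lexer's source cursor.
struct Cursor<'a> {
    bytes: &'a [u8],
    pos: usize,
}

impl<'a> Cursor<'a> {
    fn new(src: &'a str) -> Self {
        Self { bytes: src.as_bytes(), pos: 0 }
    }

    /// Branching form: advance by one byte only if the next byte equals `ascii`.
    fn advance_if_ascii_eq(&mut self, ascii: u8) -> bool {
        debug_assert!(ascii.is_ascii());
        if self.bytes.get(self.pos) == Some(&ascii) {
            self.pos += 1;
            true
        } else {
            false
        }
    }

    /// Straight-line form: add the comparison result (0 or 1) to the position.
    fn advance_if_ascii_eq2(&mut self, ascii: u8) -> bool {
        debug_assert!(ascii.is_ascii());
        let matched = self.bytes.get(self.pos) == Some(&ascii);
        self.pos += usize::from(matched);
        matched
    }
}

fn main() {
    let mut a = Cursor::new("..x");
    let mut b = Cursor::new("..x");
    // Both forms should report the same result and end at the same position.
    for _ in 0..4 {
        assert_eq!(a.advance_if_ascii_eq(b'.'), b.advance_if_ascii_eq2(b'.'));
        assert_eq!(a.pos, b.pos);
    }
    println!("final position: {}", b.pos);
}
```

As the comment notes, a caller that writes `if cursor.advance_if_ascii_eq2(b'.') { ... }` branches on the returned `bool` anyway, so the straight-line form may not remove the branch in practice.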

@lucab I don't know how used you are to looking at assembly like this, but any ideas?

@strager I don't know if you're out there (and I know you deny being an expert in such matters!), but I'm a bit lost here and would really appreciate your thoughts if you have time.

@DonIsaac
Contributor

Any updates on this PR?

@Boshen
Member

Boshen commented Jul 27, 2024

I see no objections, merging.

We can also use /usr/bin/time to measure CPU instruction counts.

@Boshen Boshen added the 0-merge Merge with Graphite Merge Queue label Jul 27, 2024

graphite-app bot commented Jul 27, 2024

Merge activity

  • Jul 26, 9:17 PM EDT: The merge label 'merge' was detected. This PR will be added to the Graphite merge queue once it meets the requirements.
  • Jul 26, 9:17 PM EDT: Boshen added this pull request to the Graphite merge queue.
  • Jul 26, 9:20 PM EDT: Boshen merged this pull request with the Graphite merge queue.

@Boshen Boshen force-pushed the ups/parser-advance-ascii-eq branch from 81d54e0 to 868fc87 Compare July 27, 2024 01:17
@graphite-app graphite-app bot merged commit 868fc87 into main Jul 27, 2024
24 checks passed
@graphite-app graphite-app bot deleted the ups/parser-advance-ascii-eq branch July 27, 2024 01:20
@oxc-bot oxc-bot mentioned this pull request Jul 27, 2024
Dunqing pushed a commit that referenced this pull request Jul 28, 2024
## [0.22.1] - 2024-07-27

### Features

- 2477330 ast: Add `AstKind::TSExportAssignment` (#4501) (Dunqing)
- aaee07e ast: Add `AstKind::AssignmentTargetPattern`,
`AstKind::ArrayAssignmentTarget` and `AstKind::ObjectAssignmentTarget`
(#4456) (Dunqing)
- fd363d1 ast: Add AstKind::get_container_scope_id (#4450) (DonIsaac)
- e2735ca span: Add `contains_inclusive` method (#4491) (DonIsaac)

### Bug Fixes

- 368112c ast: Remove `#[visit(ignore)]` from
`ExportDefaultDeclarationKind`'s `TSInterfaceDeclaration` (#4497)
(Dunqing)
- 36bb680 semantic: `TSExportAssignment` cannot reference type binding
(#4502) (Dunqing)
- cb2fa49 semantic: `typeof` operator cannot reference type-only import
(#4500) (Dunqing)
- ef0e953 semantic: Generic passed to typeof not counted as a reference
(#4499) (Dunqing)
- 40cafb8 semantic: Params in `export default (function() {})` flagged
as `SymbolFlags::Export` (#4480) (Dunqing)
- 2e01a45 semantic: Non-exported namespace member symbols flagged as
exported (#4493) (Don Isaac)
- e4ca06a semantic: Incorrect symbol’s scope_id after var hoisting
(#4458) (Dunqing)
- 77bd5f1 semantic: Use correct span for namespace symbols (#4448) (Don
Isaac)
- 5db7bed sourcemap: Fix pre-calculation of required segments for
building JSON (#4490) (overlookmotel)
- 1667491 syntax: Correct `is_reserved_keyword_or_global_object`'s
incorrect function calling. (#4484) (Ethan Goh)
- 82ba2a0 syntax: Fix unsound use of `NonZeroU32` (#4466)
(overlookmotel)
- c04b9aa transformer: Add to `SymbolTable::declarations` for all
symbols (#4460) (overlookmotel)
- ecdee88 transformer/typescript: Incorrect eliminate exports when the
referenced symbol is both value and type (#4507) (Dunqing)

### Performance

- 963a2d1 mangler: Reduce unnecessary allocation (#4498) (Dunqing)
- 868fc87 parser: Optimize conditional advance on ASCII values (#4298)
(lucab)
- 24beaeb semantic: Give `AstNodeId` a niche (#4469) (overlookmotel)
- 348c1ad semantic: Remove `span` field from `Reference` (#4464)
(overlookmotel)
- 6a9f4db semantic: Reduce storage size for symbol redeclarations
(#4463) (overlookmotel)
- 705e19f sourcemap: Reduce memory copies encoding JSON (#4489)
(overlookmotel)
- 4d10c6c sourcemap: Pre allocate String buf while encoding (#4476)
(Brooooooklyn)

### Documentation

- f5f0ba8 ast: Add doc comments to more AST nodes (#4413) (Don Isaac)
- 871b3d6 semantic: Add doc comments for SymbolTester and SemanticTester
(#4433) (DonIsaac)

### Refactor

- 9c5d2f9 ast/builder: Use `Box::new_in` over `.into_in` (#4428)
(overlookmotel)
- ccb1835 semantic: Methods take `Span` as param, not `&Span` (#4470)
(overlookmotel)
- f17254a semantic: Populate `declarations` field in
`SymbolTable::create_symbol` (#4461) (overlookmotel)
- a49f491 semantic: Re-order `SymbolTable` fields (#4459)
(overlookmotel)
- 7cd53f3 semantic: Var hoisting (#4379) (Dunqing)
- 4f5a7cb semantic: Mark SemanticTester and SymbolTester as must_use
(#4430) (DonIsaac)
- c958a55 sourcemap: `push_list` method for building JSON (#4486)
(overlookmotel)
- c99b3eb syntax: Give `ScopeId` a niche (#4468) (overlookmotel)
- 96fc94f syntax: Use `NonMaxU32` for IDs (#4467) (overlookmotel)

### Testing

- 4b274a8 semantic: Add more test cases for symbol references (#4429)
(DonIsaac)

Co-authored-by: Boshen <[email protected]>
overlookmotel added a commit that referenced this pull request Jul 29, 2024
These 2 `#[inline(always)]` attributes were introduced by accident while I was playing around in #4298. One should be kept, but the other we should leave to the compiler to decide.
Labels
0-merge Merge with Graphite Merge Queue A-parser Area - Parser
4 participants