Surprising accessor performance for ARec #150
Comments
Wow, that is surprising! I’ll take a look as soon as I get a chance.
The core that GHC produces for the
So GHC merges multiple pattern matches, making the access pattern linear rather than quadratic. Quite a clever optimization. I'm still surprised that ARec fares so badly.
Yes, the instance unrolling and inlining was an original key feature of the performance story. I don’t recall the exact motivations of the ARec variant, but I imagine it was some combination of:
All that said, I’m surprised by your result, too, in particular by how much worse the times are than in the existing benchmark. I’m looking forward to digging into it.
Ah, yes, I was aware of the instance optimizations. The one that surprised me was that GHC re-uses pattern matches, so instead of repeatedly matching on the outermost The reason I'm mentioning it is that it means my function doesn't actually measure the "average case" for accessing random fields; instead, it just pattern matches all the way to the last field and uses all the values it finds on the way. That's great news if we have an access pattern where we want to look at multiple fields: we effectively only have to pay for the deepest one. But it invalidates my benchmark.
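To make the pattern-match sharing described above concrete, here is a minimal sketch. `HRec`, `get0` to `get2`, `sumFields` and `sumFieldsMerged` are hypothetical names chosen for illustration; this is a simplified stand-in, not vinyl's actual `Rec` (it has no functor parameter and no type-class-based accessors). The idea is that once the accessors are inlined, all of them scrutinise the same spine, so the optimizer can, in principle, turn `sumFields` into something shaped like `sumFieldsMerged`.

```haskell
{-# LANGUAGE DataKinds, GADTs, KindSignatures, TypeOperators #-}

import Data.Kind (Type)

-- Hypothetical, simplified heterogeneous record: a typed cons list.
data HRec (ts :: [Type]) where
  RNil :: HRec '[]
  (:&) :: !t -> HRec ts -> HRec (t ': ts)
infixr 5 :&

-- One accessor per index; each walks the spine from the front,
-- so in isolation the cost grows with the index.
get0 :: HRec (a ': ts) -> a
get0 (x :& _) = x

get1 :: HRec (a ': b ': ts) -> b
get1 (_ :& x :& _) = x

get2 :: HRec (a ': b ': c ': ts) -> c
get2 (_ :& _ :& x :& _) = x

-- Reads every field once, in the style of the "sum all fields" benchmark.
sumFields :: HRec '[Int, Int, Int] -> Int
sumFields r = get0 r + get1 r + get2 r

-- The shape the code can take after inlining and merging the matches:
-- a single traversal that binds every field on the way down.
sumFieldsMerged :: HRec '[Int, Int, Int] -> Int
sumFieldsMerged (a :& b :& c :& RNil) = a + b + c
```

In GHCi, `sumFields (1 :& 2 :& 3 :& RNil)` and `sumFieldsMerged (1 :& 2 :& 3 :& RNil)` both evaluate to 6. Whether GHC actually produces the merged form for a given program depends on optimization flags and inlining decisions, so inspecting the generated Core is the only reliable check.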
I've tried turning (All values in ns)
The three variants look like this:

Newtype

    -- Morally: newtype ElField (s, t) = Field t
    -- But GHC doesn't allow that
    newtype ElField (t :: (Symbol, *)) = Field (Snd t)

ADT

    data ElField (field :: (Symbol, *)) where
      Field :: !t -> ElField '(s,t)

GADT

    data ElField (field :: (Symbol, *)) where
      Field :: KnownSymbol s => !t -> ElField '(s,t)

These are the only changes I made to the code. It looks like removing the
* Move from Array to SmallArray# to avoid intermediate list during construction
* Use class instead of recursive function in toARec for improved inlining
* Add (&:), arnil and arec pseudo-constructors
* Saves 1 word per field
* Improves accessor performance significantly both for Rec as well as ARec
The accessors benchmark suggests that ARec field access should be only slightly worse than Rec field access for fields at the front, and should stay constant for fields with higher indices, while Rec fields become increasingly expensive to access. For example, on my machine, access to Rec takes 6.1 ns for index 0 up to 12.4 ns for index 15, for an average of ~9.2 ns. On the other hand, the access time for ARec fields is constant at ~7.5 ns. (See included figure.)

This would suggest that for access patterns that exercise all fields equally, ARec should outperform Rec. However, I added a new benchmark simulating random read access where I retrieve all elements once and calculate their sum: Link to gist
Here, ARec fares significantly worse than Rec with 68 vs 55 ns!
I find this discrepancy surprising. Is there perhaps a problem in the way the benchmark is written?
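For reference, the array-backed side of such a comparison can be sketched as follows. This is a simplified illustration under stated assumptions, not vinyl's ARec: `ARecSketch`, `mkARec3`, `unsafeIndex` and `sumARec3` are hypothetical names, the record is fixed to three `Int` fields, and it uses a boxed `Array` from the `array` package with `unsafeCoerce`. Each field read is an independent array index plus a coercion, so the cost per read is constant, but reading several fields leaves nothing for the optimizer to share between the reads.

```haskell
{-# LANGUAGE DataKinds, KindSignatures #-}

import Data.Array (Array, listArray, (!))
import Data.Kind (Type)
import GHC.Exts (Any)
import Unsafe.Coerce (unsafeCoerce)

-- Hypothetical array-backed record: fields are stored untyped,
-- and the list of field types exists only at the type level.
newtype ARecSketch (ts :: [Type]) = ARecSketch (Array Int Any)

-- Builds a three-field record (illustrative helper only).
mkARec3 :: Int -> Int -> Int -> ARecSketch '[Int, Int, Int]
mkARec3 a b c =
  ARecSketch (listArray (0, 2) [unsafeCoerce a, unsafeCoerce b, unsafeCoerce c])

-- Constant-time read by position. Unsafe: the caller must request
-- the type that was actually stored at that index.
unsafeIndex :: Int -> ARecSketch ts -> a
unsafeIndex i (ARecSketch arr) = unsafeCoerce (arr ! i)

-- Reads every field once and sums them; each read is a separate
-- bounds-checked array lookup.
sumARec3 :: ARecSketch '[Int, Int, Int] -> Int
sumARec3 r = unsafeIndex 0 r + unsafeIndex 1 r + unsafeIndex 2 r
```

In GHCi, `sumARec3 (mkARec3 1 2 3)` evaluates to 6. A real benchmark would wrap a function like this in a benchmarking harness and compare it against the equivalent function over Rec.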