Deriving performance, take two #153
Comments
Some hare-brained ideas that might be fun to chase down:
I don't think any of these will have an impact, but they should be easy enough to test with the code generator.
I meant to include a link to this excellent post: https://www.parsonsmatt.org/2019/11/27/keeping_compilation_fast.html

Looking at the performance of […]: for the cases where everything is in a single file, everything performs about the same. Potential takeaway: set […]
Regardless of the implementation (manual, generic, TH) and optimization level (0, 1, 2), multiple files are slower with […]. That alone would be a compelling reason to prefer multiple files. But you also have to consider recompilation when you edit a file, in which case multiple files easily win. The single file obviously has no choice but to recompile everything; with multiple files, only the one that changed and (maybe) the ones that depend on it have to be recompiled.
In terms of the different implementation options, manual is faster than TH, which is faster than generics. Using […]

Across different configurations the ratios change, but the idea stays the same: generics are slow, Template Haskell is faster, and writing instances by hand is the fastest. That being said, two caveats: […]

I'm not sure what the takeaway is. Even though generics are slow, maybe they're the least bad? Or maybe it would be nice to have a GHC source plugin that can write instances for you, similar to TH but without forcing recompilation.
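To make the comparison concrete, here is a minimal sketch of the three implementation styles for aeson's `ToJSON`. The record types and field names are made up for illustration; they are not the ones used in the benchmark.

```haskell
{-# LANGUAGE DeriveGeneric #-}
{-# LANGUAGE OverloadedStrings #-}
{-# LANGUAGE TemplateHaskell #-}

import Data.Aeson (ToJSON (..), defaultOptions, encode, genericToJSON, object, (.=))
import Data.Aeson.TH (deriveToJSON)
import qualified Data.ByteString.Lazy.Char8 as BL8
import GHC.Generics (Generic)

-- Generic: cheapest to write, typically the slowest to compile.
data GenericPerson = GenericPerson { gName :: String } deriving (Generic)
instance ToJSON GenericPerson where
  toJSON = genericToJSON defaultOptions

-- Template Haskell: the splice generates concrete instance code at compile time.
data TemplatePerson = TemplatePerson { tName :: String }
$(deriveToJSON defaultOptions ''TemplatePerson)

-- Manual: the most work to write and maintain, but the fastest to compile.
data ManualPerson = ManualPerson { mName :: String }
instance ToJSON ManualPerson where
  toJSON p = object ["mName" .= mName p]

main :: IO ()
main = do
  BL8.putStrLn (encode (GenericPerson "a"))   -- {"gName":"a"}
  BL8.putStrLn (encode (TemplatePerson "b"))  -- {"tName":"b"}
  BL8.putStrLn (encode (ManualPerson "c"))    -- {"mName":"c"}
```

All three produce the same JSON; the difference being measured is how long GHC takes to produce the instance code, not what the code does at runtime.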
I have written about the performance of `deriving` various type classes in Haskell before. Reflecting on that, I think I did some things that weren't really worth it:

- I benchmarked too many versions of GHC. I mean this both in terms of major/minor versions (7.0.4 through 8.2.1) and patch versions (7.0.1, 7.0.2, 7.0.3, and 7.0.4). Just doing the latest major/minor is probably fine (8.10.1 at the time of writing), and there's no reason to benchmark out-of-date patch versions (like 8.8.1 or 8.8.2).
- I focused too much on type classes from `base` that were available in all the tested versions of GHC. In particular, I didn't benchmark `Generic`. I think people are probably more interested in type classes they're likely to find in the wild, like `FromJSON` and `ToJSON` from `aeson`.
- I didn't compare derived performance to hand-written performance. At the time, writing a code generator seemed too difficult, but in retrospect I don't know why I thought that. Also, this is what people probably want to know: is generating the code slow, or is compiling the generated code slow?
- Related to the above, I didn't look at generic deriving versus Template Haskell. (There are other derivation options, like `newtype` deriving, but I don't think that's worth investigating.)
- I didn't look at various optimization levels. People normally do development builds with `-O0` and production builds with `-O1` or `-O2`. It would be interesting to see how performance compares between them.
- I didn't look at splitting modules. In particular, I put everything into one huge module, rather than one module per type (which I think is more common).
- Related to the above, I didn't look at using multiple jobs. Most machines have multiple cores. Compiling with `-jN` for various values of `N` could have a big impact on timing.
- I didn't actually answer this question: is it something related to type classes that's slow, or would the equivalent functions be slow too? For example, instead of providing an `Eq` instance, what if I wrote `eq :: SomeType -> SomeType -> Bool`?
- Big picture, I didn't really provide a takeaway. What do I think people should do with this information? Having put all this effort into benchmarking, I should provide some suggestions.
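The class-versus-function question can be sketched directly. Here is a derived `Eq` instance next to the equivalent standalone function; the fields of `SomeType` are placeholders invented for illustration.

```haskell
-- A made-up SomeType; the fields are placeholders for illustration.
data SomeType = SomeType
  { someInt :: Int
  , someName :: String
  } deriving (Eq, Show) -- derived Eq instance

-- The equivalent plain function, with no type class involved. Benchmarking
-- both would show whether the class machinery itself costs anything to
-- compile, or whether the generated comparison code is the expensive part.
eq :: SomeType -> SomeType -> Bool
eq a b = someInt a == someInt b && someName a == someName b

main :: IO ()
main = do
  let x = SomeType 1 "a"
      y = SomeType 1 "a"
  print (x == y) -- True, via the derived instance
  print (eq x y) -- True, via the standalone function
```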
I'm thinking about all this now because at $WORK we typically write JSON instances by hand, but we're considering changing that. I'm curious how it will affect compile times.
To that end, I've been working on a new benchmark for this. It generates code in various configurations and then benchmarks how long it takes to compile. I'm trying to focus on GHC 8.10.1 and JSON. The things I'm comparing are manual-vs-generic-vs-template instances, single-vs-multiple modules, optimization levels, and parallelism.
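As a rough sketch of what such a code generator might look like (the function and names here are invented, not the ones in the gist), one could render one module per type as a plain record declaration, append whichever instance style a configuration calls for, and then time `ghc` over the output:

```haskell
import Data.List (intercalate)

-- Invented sketch of a code generator: render one module per type as a
-- plain record declaration. A manual, generic, or TH instance would be
-- appended depending on the benchmark configuration.
genModule :: String -> [(String, String)] -> String
genModule ty fields = unlines
  [ "module " ++ ty ++ " where"
  , ""
  , "data " ++ ty ++ " = " ++ ty
  , "  { " ++ intercalate "\n  , " [ f ++ " :: " ++ t | (f, t) <- fields ]
  , "  }"
  ]

main :: IO ()
main = putStr (genModule "Person" [("personName", "String"), ("personAge", "Int")])
```

For the single-module configuration, the same renderer would emit every type into one file instead of one file each; the interesting measurements are then wall-clock times for `ghc` at each optimization level and job count.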
https://gist.github.com/tfausak/a5cae9e41e5ccd0b0a5b4e49f1e2104d