Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Deriving classes is slow #127

Closed
tfausak opened this issue May 3, 2017 · 18 comments · Fixed by #135
Closed

Deriving classes is slow #127

tfausak opened this issue May 3, 2017 · 18 comments · Fixed by #135
Assignees
Labels

Comments

@tfausak
Copy link
Owner

tfausak commented May 3, 2017

I was curious about how long it takes to derive type classes in Haskell. So I ripped out all the data types from Rattletrap, put them in a single module, and did some benchmarks.

deriving-haskell-classes

TL;DR:
deriving (): 1.5 seconds
deriving (Typeable, Eq, Generic, Show, Ord, Read, Data): 33.9 seconds

@tfausak tfausak added the idea label May 3, 2017
@tfausak
Copy link
Owner Author

tfausak commented May 5, 2017

I wrote a benchmark. Here are the raw results:

GHC version Type class Build time
8.0.2 None 0.336130375389239
8.0.2 Data 1.5771875359084617
8.0.2 Eq 0.6777693837569073
8.0.2 Generic 1.266567514411137
8.0.2 NFData 1.2762650131446902
8.0.2 Ord 1.5052250140474281
8.0.2 Read 1.73459779399804
8.0.2 Show 1.092500759077842
8.0.2 Typeable 0.3532055427451053
8.0.2 All 5.0138953359279546
8.0.1 None 0.642286714471736
8.0.1 Data 1.4861675602674982
8.0.1 Eq 0.6601576980742156
8.0.1 Generic 0.8592153585328393
8.0.1 NFData 1.3250290498818624
8.0.1 Ord 1.599108449953122
8.0.1 Read 1.4810040962807849
8.0.1 Show 1.1400197187941261
8.0.1 Typeable 0.34479866036444556
8.0.1 All 4.878157194680894
7.10.3 None 0.27517507044003525
7.10.3 Data 1.728103816337813
7.10.3 Eq 0.5770432599674674
7.10.3 Generic 1.0832548244696107
7.10.3 NFData 1.3955298367053286
7.10.3 Ord 1.3983414151005624
7.10.3 Read 1.398078999297864
7.10.3 Show 1.0514038457539043
7.10.3 Typeable 0.2830704650798367
7.10.3 All 5.3985303367390545
7.10.2 None 0.2595341259969595
7.10.2 Data 1.6681961400282896
7.10.2 Eq 0.5786647988913943
7.10.2 Generic 1.0236920422812912
7.10.2 NFData 1.382406515177711
7.10.2 Ord 1.3563632561903007
7.10.2 Read 1.4199409673957482
7.10.2 Show 1.0099546781179887
7.10.2 Typeable 0.27846318639592726
7.10.2 All 5.338841296871909
7.10.1 None 0.2636182499461309
7.10.1 Data 1.8802394867687349
7.10.1 Eq 0.5844070539189424
7.10.1 Generic 1.171292429401193
7.10.1 NFData 1.4222185805523921
7.10.1 Ord 1.30852455310019
7.10.1 Read 1.5405771345418975
7.10.1 Show 0.9493555430882701
7.10.1 Typeable 0.28911154863069255
7.10.1 All 5.6471964526325635

@tfausak
Copy link
Owner Author

tfausak commented May 5, 2017

Although I ran the benchmark against many versions of the compiler, the output is difficult to understand when graphed.

screenshot from 2017-05-04 23-08-34

8.0.x is generally a little slower than 7.10.x, but the difference is both inconsistent and not that big. The thing I'm actually trying to investigate here is the impact on compile time of deriving type classes. It was important to me to show that GHC 8.0.2 isn't drastically different than other recent versions of GHC.

(I could test with even older versions of GHC, but I'd have to exclude NFData. It requires the DeriveAnyClass extension, which was added in 7.10.)

screenshot from 2017-05-04 23-13-31

This shows that deriving Ord takes about 4.4x as long as not deriving anything (1.51s vs 0.34s).

@Profpatsch
Copy link

Compiling modules with types always takes the longest … I wonder why that is?
Deriving stuff should only be some quite simple code generation, right? Especially for classes like show.

@puffnfresh
Copy link

This is something pointed out as something to work on for Summer of Haskell, there's one more day for applications!

https://summer.haskell.org/ideas.html#galois-ghc

https://ghc.haskell.org/trac/ghc/wiki/Performance/Compiler#Derivinginstances

@puffnfresh
Copy link

puffnfresh commented May 5, 2017

For Read, I think this ticket is applicable:

https://ghc.haskell.org/trac/ghc/ticket/10980

@puffnfresh
Copy link

And Ord should hopefully be a bit better in 8.2.1:

https://ghc.haskell.org/trac/ghc/ticket/10858

@tfausak
Copy link
Owner Author

tfausak commented May 5, 2017

I ran my benchmark against GHC 8.2.1-rc1 (https://gist.github.com/tfausak/a36862c53a2cc53029cab18a05788b95). Here are the raw results:

GHC version Type class Build time
8.2.0 None 0.44000624959403134
8.2.0 Typeable 0.4253976581374423
8.2.0 Eq 0.7552512676430324
8.2.0 Show 1.2090433944487813
8.2.0 Generic 0.9957070012428888
8.2.0 NFData 2.598061319007836
8.2.0 Ord 1.5106823807037193
8.2.0 Data 1.8887927285953232
8.2.0 Read 1.6389913346312228

And here it is graphed compared to 8.0.2:

screenshot from 2017-05-05 07-51-13

Generic got a little better. Data and NFData got worse. Everything else stayed about the same.

@tfausak
Copy link
Owner Author

tfausak commented May 5, 2017

To get a feel for the generated instances, here they are (via -ddump-deriv) for the Section data type:

Data (implies Typeable)
instance Data.Data.Data body_a4Wo =>
         Data.Data.Data (Deriving.Data.Section body_a4Wo) where
  Data.Data.gfoldl
    k_a4YQ
    z_a4YR
    (Deriving.Data.Section a1_a4YS a2_a4YT a3_a4YU)
    = (((z_a4YR Deriving.Data.Section `k_a4YQ` a1_a4YS)
        `k_a4YQ` a2_a4YT)
       `k_a4YQ` a3_a4YU)
  Data.Data.gunfold k_a4YV z_a4YW _
    = k_a4YV (k_a4YV (k_a4YV (z_a4YW Deriving.Data.Section)))
  Data.Data.toConstr (Deriving.Data.Section _ _ _)
    = Deriving.Data.$cCOLrOXxQLtLE5asoBRhrjA
  Data.Data.dataTypeOf _ = Deriving.Data.$tCOLrOXxQLtLE5asoBRhrjA
  Data.Data.dataCast1 f_a4YX = Data.Typeable.gcast1 f_a4YX
Eq
instance GHC.Classes.Eq body_a2PO =>
         GHC.Classes.Eq (Deriving.Eq.Section body_a2PO) where
  (GHC.Classes.==)
    (Deriving.Eq.Section a1_a2PS a2_a2PT a3_a2PU)
    (Deriving.Eq.Section b1_a2PV b2_a2PW b3_a2PX)
    = ((((a1_a2PS GHC.Classes.== b1_a2PV))
        GHC.Classes.&& ((a2_a2PT GHC.Classes.== b2_a2PW)))
       GHC.Classes.&& ((a3_a2PU GHC.Classes.== b3_a2PX)))
  (GHC.Classes./=) a_a2PY b_a2PZ
    = GHC.Classes.not ((GHC.Classes.==) a_a2PY b_a2PZ)
Generic
instance GHC.Generics.Generic
           (Deriving.Generic.Section body_a2ww) where
  GHC.Generics.from x_a2xu
    = GHC.Generics.M1
        (case x_a2xu of {
           Deriving.Generic.Section g1_a2xv g2_a2xw g3_a2xx
             -> GHC.Generics.M1
                  ((GHC.Generics.:*:)
                     (GHC.Generics.M1 (GHC.Generics.K1 g1_a2xv))
                     ((GHC.Generics.:*:)
                        (GHC.Generics.M1 (GHC.Generics.K1 g2_a2xw))
                        (GHC.Generics.M1 (GHC.Generics.K1 g3_a2xx)))) })
  GHC.Generics.to (GHC.Generics.M1 x_a2xy)
    = case x_a2xy of {
        GHC.Generics.M1 ((GHC.Generics.:*:) (GHC.Generics.M1 (GHC.Generics.K1 g1_a2xz))
                                            ((GHC.Generics.:*:) (GHC.Generics.M1 (GHC.Generics.K1 g2_a2xA))
                                                                (GHC.Generics.M1 (GHC.Generics.K1 g3_a2xB))))
          -> Deriving.Generic.Section g1_a2xz g2_a2xA g3_a2xB }
NFData (implies Generic)
instance Control.DeepSeq.NFData body_a7gu =>
         Control.DeepSeq.NFData (Deriving.NFData.Section body_a7gu) where
Ord (implies Eq)
instance GHC.Classes.Ord body_a3uw =>
         GHC.Classes.Ord (Deriving.Ord.Section body_a3uw) where
  GHC.Classes.compare a_a3uQ b_a3uR
    = case a_a3uQ of {
        Deriving.Ord.Section a1_a3uS a2_a3uT a3_a3uU
          -> case b_a3uR of {
               Deriving.Ord.Section b1_a3uV b2_a3uW b3_a3uX
                 -> case (GHC.Classes.compare a1_a3uS b1_a3uV) of {
                      GHC.Types.LT -> GHC.Types.LT
                      GHC.Types.EQ
                        -> case (GHC.Classes.compare a2_a3uT b2_a3uW) of {
                             GHC.Types.LT -> GHC.Types.LT
                             GHC.Types.EQ -> (a3_a3uU `GHC.Classes.compare` b3_a3uX)
                             GHC.Types.GT -> GHC.Types.GT }
                      GHC.Types.GT -> GHC.Types.GT } } }
  (GHC.Classes.<) a_a3uY b_a3uZ
    = case a_a3uY of {
        Deriving.Ord.Section a1_a3v0 a2_a3v1 a3_a3v2
          -> case b_a3uZ of {
               Deriving.Ord.Section b1_a3v3 b2_a3v4 b3_a3v5
                 -> case (GHC.Classes.compare a1_a3v0 b1_a3v3) of {
                      GHC.Types.LT -> GHC.Types.True
                      GHC.Types.EQ
                        -> case (GHC.Classes.compare a2_a3v1 b2_a3v4) of {
                             GHC.Types.LT -> GHC.Types.True
                             GHC.Types.EQ -> (a3_a3v2 GHC.Classes.< b3_a3v5)
                             GHC.Types.GT -> GHC.Types.False }
                      GHC.Types.GT -> GHC.Types.False } } }
  (GHC.Classes.<=) a_a3v6 b_a3v7
    = case a_a3v6 of {
        Deriving.Ord.Section a1_a3v8 a2_a3v9 a3_a3va
          -> case b_a3v7 of {
               Deriving.Ord.Section b1_a3vb b2_a3vc b3_a3vd
                 -> case (GHC.Classes.compare a1_a3v8 b1_a3vb) of {
                      GHC.Types.LT -> GHC.Types.True
                      GHC.Types.EQ
                        -> case (GHC.Classes.compare a2_a3v9 b2_a3vc) of {
                             GHC.Types.LT -> GHC.Types.True
                             GHC.Types.EQ -> (a3_a3va GHC.Classes.<= b3_a3vd)
                             GHC.Types.GT -> GHC.Types.False }
                      GHC.Types.GT -> GHC.Types.False } } }
  (GHC.Classes.>=) a_a3ve b_a3vf
    = case a_a3ve of {
        Deriving.Ord.Section a1_a3vg a2_a3vh a3_a3vi
          -> case b_a3vf of {
               Deriving.Ord.Section b1_a3vj b2_a3vk b3_a3vl
                 -> case (GHC.Classes.compare a1_a3vg b1_a3vj) of {
                      GHC.Types.LT -> GHC.Types.False
                      GHC.Types.EQ
                        -> case (GHC.Classes.compare a2_a3vh b2_a3vk) of {
                             GHC.Types.LT -> GHC.Types.False
                             GHC.Types.EQ -> (a3_a3vi GHC.Classes.>= b3_a3vl)
                             GHC.Types.GT -> GHC.Types.True }
                      GHC.Types.GT -> GHC.Types.True } } }
  (GHC.Classes.>) a_a3vm b_a3vn
    = case a_a3vm of {
        Deriving.Ord.Section a1_a3vo a2_a3vp a3_a3vq
          -> case b_a3vn of {
               Deriving.Ord.Section b1_a3vr b2_a3vs b3_a3vt
                 -> case (GHC.Classes.compare a1_a3vo b1_a3vr) of {
                      GHC.Types.LT -> GHC.Types.False
                      GHC.Types.EQ
                        -> case (GHC.Classes.compare a2_a3vp b2_a3vs) of {
                             GHC.Types.LT -> GHC.Types.False
                             GHC.Types.EQ -> (a3_a3vq GHC.Classes.> b3_a3vt)
                             GHC.Types.GT -> GHC.Types.True }
                      GHC.Types.GT -> GHC.Types.True } } }
Read
instance GHC.Read.Read body_a2E5 =>
         GHC.Read.Read (Deriving.Read.Section body_a2E5) where
  GHC.Read.readPrec
    = GHC.Read.parens
        (Text.ParserCombinators.ReadPrec.prec
           11
           (do { GHC.Read.expectP (Text.Read.Lex.Ident "Section");
                 GHC.Read.expectP (Text.Read.Lex.Punc "{");
                 GHC.Read.expectP (Text.Read.Lex.Ident "sectionSize");
                 GHC.Read.expectP (Text.Read.Lex.Punc "=");
                 a1_a2Eb <- Text.ParserCombinators.ReadPrec.reset GHC.Read.readPrec;
                 GHC.Read.expectP (Text.Read.Lex.Punc ",");
                 GHC.Read.expectP (Text.Read.Lex.Ident "sectionCrc");
                 GHC.Read.expectP (Text.Read.Lex.Punc "=");
                 a2_a2Ec <- Text.ParserCombinators.ReadPrec.reset GHC.Read.readPrec;
                 GHC.Read.expectP (Text.Read.Lex.Punc ",");
                 GHC.Read.expectP (Text.Read.Lex.Ident "sectionBody");
                 GHC.Read.expectP (Text.Read.Lex.Punc "=");
                 a3_a2Ed <- Text.ParserCombinators.ReadPrec.reset GHC.Read.readPrec;
                 GHC.Read.expectP (Text.Read.Lex.Punc "}");
                 GHC.Base.return (Deriving.Read.Section a1_a2Eb a2_a2Ec a3_a2Ed) }))
  GHC.Read.readList = GHC.Read.readListDefault
  GHC.Read.readListPrec = GHC.Read.readListPrecDefault
Show
instance GHC.Show.Show body_a2aI =>
         GHC.Show.Show (Deriving.Show.Section body_a2aI) where
  GHC.Show.showsPrec
    a_a2aM
    (Deriving.Show.Section b1_a2aN b2_a2aO b3_a2aP)
    = GHC.Show.showParen
        (a_a2aM GHC.Classes.>= 11)
        ((GHC.Base..)
           (GHC.Show.showString "Section {")
           ((GHC.Base..)
              (GHC.Show.showString "sectionSize = ")
              ((GHC.Base..)
                 (GHC.Show.showsPrec 0 b1_a2aN)
                 ((GHC.Base..)
                    (GHC.Show.showString ", ")
                    ((GHC.Base..)
                       (GHC.Show.showString "sectionCrc = ")
                       ((GHC.Base..)
                          (GHC.Show.showsPrec 0 b2_a2aO)
                          ((GHC.Base..)
                             (GHC.Show.showString ", ")
                             ((GHC.Base..)
                                (GHC.Show.showString "sectionBody = ")
                                ((GHC.Base..)
                                   (GHC.Show.showsPrec 0 b3_a2aP)
                                   (GHC.Show.showString "}"))))))))))
  GHC.Show.showList = GHC.Show.showList__ (GHC.Show.showsPrec 0)
Typeable Nothing :)

I would like to see how non-derived instances affect compile times. Unfortunately these dumped instances aren't clean enough to put as-is in a module. Also even copy-pasting them would be tedious. I'll see if I can come up with something to replace deriving with the derived instance automatically.

@tfausak
Copy link
Owner Author

tfausak commented May 5, 2017

I was curious if just having the instances affected compile times at all. So I tested manually adding instances like this:

instance Eq   t where (==)      = undefined
instance Ord  t where compare   = undefined
instance Read t where readsPrec = undefined
instance Show t where show      = undefined

Here are the raw results (from a faster machine than previous benchmarks):

GHC version Type class Build time (ms)
8.0.2 none 274
8.0.2 Eq 444
8.0.2 Ord (and Eq) 712
8.0.2 Read 518
8.0.2 Show 492
8.0.2 all 1098

Graphed:

screenshot from 2017-05-05 09-17-58

Providing empty instances for Eq, Ord, Read, and Show increases the build time by 4x.

@tfausak
Copy link
Owner Author

tfausak commented May 6, 2017

I compared Read to my own type classes and functions.

GHC version Description Build time (ms)
8.0.2 types only 341
8.0.2 class C a 342
8.0.2 class F a where f :: a -> () 343
8.0.2 fA :: a -> (); f _ = () 360
8.0.2 instance C a 355
8.0.2 instance F a where f _ = () 360
8.0.2 instance Read a where readsPrec _ _ = [] 469
8.0.2 deriving instance Read a 1629
8.0.2 deriving (Read) 1631

screenshot from 2017-05-06 08-42-25

So providing a Read instance is a little slower. Deriving an instance is a lot slower (as expected). I would like to see how much of that time is generating the instance code versus parsing and compiling that instance code. To do that, I need a way to come up with realistic instances.

@tfausak
Copy link
Owner Author

tfausak commented May 16, 2017

GHC 8.2.1-rc2 is moderately faster than 8.0.2.

GHC version Type class Build time
8.2.1-rc2 None 0.3055767752315335
8.2.1-rc2 Typeable 0.30928339773903957
8.2.1-rc2 Eq 0.5202797332739533
8.2.1-rc2 Show 0.8212172114064861
8.2.1-rc2 Generic 0.6350646736765615
8.2.1-rc2 NFData 1.0582798662304622
8.2.1-rc2 Ord 0.9473899293428875
8.2.1-rc2 Data 1.1560555675647806
8.2.1-rc2 Read 1.1084714069734536

screenshot from 2017-05-16 09-29-49

@tfausak
Copy link
Owner Author

tfausak commented May 19, 2017

I was curious how FromJSON and ToJSON compared to the built-in classes. They're about the same.

screenshot from 2017-05-18 22-07-27

It's crazy how deriving all these classes slows compilation down by 19x.

screenshot from 2017-05-18 22-08-27

The good news is that the actual build time for all the type classes is lower than the expected build time (made by summing the deltas for each type class).

Resolver Type class Build time
lts-8.14 None 0.34668435990148755
lts-8.14 Typeable 0.399200164514984
lts-8.14 Eq 0.6520125281962724
lts-8.14 Generic 0.7019598378348447
lts-8.14 NFData 1.0905038354090586
lts-8.14 Show 1.2038608355412057
lts-8.14 Read 1.4557853140280852
lts-8.14 Data 1.5663556925766864
lts-8.14 Ord 1.6532348879424859
lts-8.14 FromJSON 1.8014229908045831
lts-8.14 ToJSON 1.9800566505620332
lts-8.14 All 6.748310899728042

Anyway, I still want to see how much of this time is code generation. That might be easier with FromJSON and ToJSON because I can use -ddump-splices together with deriveJSON to get somewhat reasonable code to paste into a file.

@tfausak
Copy link
Owner Author

tfausak commented May 19, 2017

Oh, also: I think I'll wait until GHC 8.2.1 is out before actually writing this blog post.It should be out relatively recently.

screenshot from 2017-05-18 22-18-18

@tfausak
Copy link
Owner Author

tfausak commented May 21, 2017

I should test different optimization levels too.

@tfausak
Copy link
Owner Author

tfausak commented Jul 24, 2017

GHC 8.2.1 has been officially released, so I should re-run my benchmarks against it.

@NorfairKing
Copy link

Mentioned here: https://www.reddit.com/r/haskell/comments/6nkx2d/ghc_compiletime_optimisation_idea/

@tfausak
Copy link
Owner Author

tfausak commented Jul 24, 2017

I ran the benchmarks against GHC 8.2.1:

Resolver Type class Build time
ghc-8.2.1 None 0.4917697891028346
ghc-8.2.1 Typeable 0.41437587709084217
ghc-8.2.1 Eq 0.9920900059415579
ghc-8.2.1 Generic 1.1503812945551521
ghc-8.2.1 NFData 1.738287153631161
ghc-8.2.1 Show 1.3229546929778038
ghc-8.2.1 Read 1.5611044007162134
ghc-8.2.1 Data 2.472278546300551
ghc-8.2.1 Ord 1.8569808139220736
ghc-8.2.1 All 5.425151673793732

screenshot from 2017-07-24 08-33-27

@tfausak
Copy link
Owner Author

tfausak commented Jul 25, 2017

I massively reworked the benchmark to test more versions of GHC and more easily change the list of derived classes via CPP: https://gist.github.com/tfausak/0c90ed1b450ae84136c110f026e16bc6

The new benchmark takes a long time to run. Setting up the Docker container takes at least an hour. Running the actual benchmark takes about an hour. I ran the benchmark on my laptop and saved the results (raw, text, CSV, JSON, and HTML): report.zip

I also uploaded the HTML results to my blog so that you can click around: http://taylor.fausak.me/static/pages/2017-07-25-haskell-type-class-deriving-performance.html

The Criterion report isn't great for visualizing the results, so I'll need to make some better graphs from the CSV or JSON. Regardless, here are the results for deriving all classes (Data, Eq, Ord, Read, Show, Typeable) across GHC versions 7.0.1 to 8.2.1:

screenshot from 2017-07-25 16-43-37

Bars show compile time in seconds. Lower is better. The fastest was GHC 7.8.4 at 3.84 seconds. The slowest was GHC 7.8.1 at 4.65 seconds. Broadly speaking, GHC 7.10 and 8.0 were slow, but GHC 8.2 seems to have returned us to GHC 7.8-level speeds.

I also have this data broken down by class, but it's a lot to analyze. Glancing over the graphs, Read seems faster now, Eq and Data seem slower now.

For the base case of deriving no classes, the Criterion graph makes the bars pretty short.

screenshot from 2017-07-25 17-10-50

(Same as before: Compile time in seconds. Lower is better.) The fastest is GHC 7.0.4 at 172 milliseconds. The slowest is GHC 7.10.2 at 475 milliseconds. Things got slower with GHC 7.8 (304 milliseconds) and haven't recovered, even with GHC 8.2.1 (328 milliseconds).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants