Use a custom parsing monad instead of attoparsec. (google#298)

All decoding benchmarks show significant speedups after this change. The biggest improvement is to decoding packed data which is 4-5x as fast as before. (See below for a full list of benchmark diffs.)

This parsing monad follows the approach of, e.g., the `store` and `persist` packages. It requires that all data be in a *strict* `ByteString`, and uses simple pointer arithmetic internally to walk through its bytes.

This effectively works against google#62 (streaming parsers) since it needs to read all the input data before starting the parse. However, that issue has already existed since the beginning of this library for, e.g., submessages; see that bug for more details. So this change doesn't appear to be a regression. We also have freedom to later try out different implementations without changing the API, since `Parser` is opaque as of google#294.

The implementation of `Parser` differs from `store` and `persist` by using `ExceptT` to pass around errors internally, rather than exceptions (or closures, as in `attoparsec`). We may want to experiment with this later, but in my initial experiments I didn't see a significant improvement from those approaches.

Benchmark results (the "time" output from Criterion):

flat(602B)/decode/whnf:
  13.14 μs (13.02 μs .. 13.29 μs)
  => 8.686 μs (8.514 μs .. 8.873 μs)

nested(900B)/decode/whnf:
  26.35 μs (25.85 μs .. 26.86 μs)
  => 11.66 μs (11.36 μs .. 11.99 μs)

int32-packed(1003B)/decode/whnf:
  36.23 μs (35.75 μs .. 36.69 μs)
  => 17.31 μs (17.11 μs .. 17.50 μs)

int32-unpacked(2000B)/decode/whnf:
  65.18 μs (64.19 μs .. 66.68 μs)
  => 19.35 μs (19.13 μs .. 19.58 μs)

float-packed(4003B)/decode/whnf:
  78.61 μs (77.53 μs .. 79.46 μs)
  => 19.56 μs (19.40 μs .. 19.76 μs)

float-unpacked(5000B)/decode/whnf:
  108.9 μs (107.8 μs .. 110.3 μs)
  => 22.29 μs (22.00 μs .. 22.66 μs)

no-unused(10003B)/decode/whnf:
  571.7 μs (560.0 μs .. 586.6 μs)
  => 356.5 μs (349.0 μs .. 365.0 μs)

with-unused(10003B)/decode/whnf:
  786.6 μs (697.8 μs .. 875.5 μs)
  => 368.3 μs (361.8 μs .. 376.4 μs)

Also added `isolate` and used it for parsing messages and packed fields. This improved the nested benchmark a bit compared to without it:

benchmarking nested(900B)/decode/whnf
  14.32 μs (14.08 μs .. 14.57 μs)
  => 11.66 μs (11.36 μs .. 11.99 μs)

It didn't make a significant difference in the packed benchmark, I think because the effects of using lists currently dominate everything else.
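
To make the `isolate` idea concrete, here is a minimal, self-contained sketch of a parser over a strict `ByteString` that can be restricted to a length-delimited window. It is not the library's implementation: for simplicity it assumes an integer offset and plain `Either String` in place of pointer arithmetic and `ExceptT`, and the names `lengthDelimited` and `packedVarInts` are hypothetical. Whether the real `isolate` rejects leftover bytes is not stated above; this sketch chooses to reject them.

```haskell
import qualified Data.ByteString as B
import Data.Bits (shiftL, testBit, (.&.), (.|.))
import Data.Word (Word64, Word8)

-- Simplified stand-in for the opaque Parser type (an assumption, not the
-- real definition): input buffer, end-of-window offset, current offset.
newtype Parser a = Parser
    { unParser :: B.ByteString -> Int -> Int -> Either String (a, Int) }

instance Functor Parser where
    fmap f (Parser p) = Parser $ \bs end pos ->
        case p bs end pos of
            Left e          -> Left e
            Right (x, pos') -> Right (f x, pos')

instance Applicative Parser where
    pure x = Parser $ \_ _ pos -> Right (x, pos)
    Parser pf <*> Parser px = Parser $ \bs end pos -> do
        (f, pos')  <- pf bs end pos
        (x, pos'') <- px bs end pos'
        Right (f x, pos'')

instance Monad Parser where
    Parser p >>= f = Parser $ \bs end pos -> do
        (x, pos') <- p bs end pos
        unParser (f x) bs end pos'

-- Run a parser over the whole (strict) input.
runParser :: Parser a -> B.ByteString -> Either String a
runParser (Parser p) bs = fst <$> p bs (B.length bs) 0

-- True once the current window has been fully consumed.
atEnd :: Parser Bool
atEnd = Parser $ \_ end pos -> Right (pos >= end, pos)

getWord8 :: Parser Word8
getWord8 = Parser $ \bs end pos ->
    if pos < end
        then Right (B.index bs pos, pos + 1)
        else Left "getWord8: unexpected end of input"

-- Base-128 varint: 7 payload bits per byte, high bit set on all but the last.
getVarInt :: Parser Word64
getVarInt = go 0 0
  where
    go sh acc = do
        b <- getWord8
        let acc' = acc .|. (fromIntegral (b .&. 0x7f) `shiftL` sh)
        if testBit b 7 then go (sh + 7) acc' else return acc'

-- Run a parser as if only the next 'len' bytes of input existed.
isolate :: Int -> Parser a -> Parser a
isolate len (Parser p) = Parser $ \bs end pos ->
    if len < 0 || len > end - pos
        then Left "isolate: not enough input"
        else case p bs (pos + len) pos of
            Left e -> Left e
            Right (x, pos')
                | pos' == pos + len -> Right (x, pos')
                | otherwise -> Left "isolate: leftover bytes in window"

-- Hypothetical helper: read a varint length, then parse exactly that window.
lengthDelimited :: Parser a -> Parser a
lengthDelimited p = do
    len <- getVarInt
    isolate (fromIntegral len) p

-- Hypothetical packed-field parser: varints until the window is exhausted.
packedVarInts :: Parser [Word64]
packedVarInts = do
    done <- atEnd
    if done then return [] else (:) <$> getVarInt <*> packedVarInts

main :: IO ()
main = print (runParser (lengthDelimited packedVarInts) (B.pack [3, 1, 0x96, 0x01]))
-- prints: Right [1,150]
```

The point of the design is that the inner parser runs directly on the same buffer with a tighter end bound, so no intermediate `ByteString` has to be sliced out and handed to a second `runParser` call.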
avdv · Jan 3, 2019 · e7ad153
1 parent 7bc47ea
commit e7ad153
Showing 2 changed files with 19 additions and 9 deletions.
diff --git a/src/Data/ProtoLens/Compiler/Generate/Encoding.hs b/src/Data/ProtoLens/Compiler/Generate/Encoding.hs
@@ -237,7 +237,6 @@ parseFieldCase loop x f = case plainFieldKind f of
     _ -> [valueCase]
   where
     y = "y"
-    bytes = "bytes"
     entry = "entry"
     info = plainFieldInfo f
     valueCase = pLitInt (fieldTag info) --> do'
@@ -258,11 +257,7 @@ parseFieldCase loop x f = case plainFieldKind f of
             $ x
         ]
     packedCase = pLitInt (packedFieldTag info) --> do'
-        [ bytes <-- parseFieldType lengthy
-        , y <-- "Data.ProtoLens.Encoding.Bytes.runEither"
-                    @@ ("Data.ProtoLens.Encoding.Bytes.runParser"
-                        @@ parsePackedField info
-                        @@ bytes)
+        [ y <-- isolatedLengthy (parsePackedField info)
         , stmt . loop . updateParseState (overField info ("Prelude.++" @@ y))
             $ x
         ]

diff --git a/src/Data/ProtoLens/Compiler/Generate/FieldEncoding.hs b/src/Data/ProtoLens/Compiler/Generate/FieldEncoding.hs
@@ -11,6 +11,7 @@ module Data.ProtoLens.Compiler.Generate.FieldEncoding
     , fieldEncoding
     , lengthy
     , groupEnd
+    , isolatedLengthy
     ) where
 
 import Data.Word (Word8)
@@ -181,10 +182,24 @@ stringField = partialField "Data.Text.Encoding.encodeUtf8" decodeUtf8P lengthy
 
 -- | A protobuf message type.
 message :: FieldEncoding
-message = partialField
+message = lengthy
+        { buildFieldType = "Prelude.." @@
+            buildFieldType lengthy @@
             "Data.ProtoLens.encodeMessage"
-            (\m -> "Data.ProtoLens.decodeMessage" @@ m)
-            lengthy
+        , parseFieldType = isolatedLengthy "Data.ProtoLens.parseMessage"
+        }
+
+-- | Takes a @Parser a@, reads a varint and then runs the parser
+-- isolated to the given length.
+isolatedLengthy :: Exp -> Exp
+isolatedLengthy parser = do'
+    [ len <-- getVarInt'
+    , stmt $ "Data.ProtoLens.Encoding.Bytes.isolate"
+                @@ (fromIntegral' @@ len)
+                @@ parser
+    ]
+  where
+    len = "len"
 
 -- | Some functions that are used in multiple places in the generated code.
 getVarInt', putVarInt', fromIntegral' :: Exp