Use a custom parsing monad instead of attoparsec. #298

judah · 2018-12-21T20:54:29Z

All decoding benchmarks show significant speedups after this change.
The biggest improvement is in decoding packed data, which is 4-5x as fast
as before. (See below for a full list of benchmark diffs.)

This parsing monad follows the approach of, e.g., the store and persist
packages. It requires that all data be in a strict ByteString,
and uses simple pointer arithmetic internally to walk through its bytes.

This effectively works against #62 (streaming parsers) since it
needs to read all the input data before starting the parse. However,
that issue has already existed since the beginning of this library for,
e.g., submessages; see that bug for more details. So this change
doesn't appear to be a regression. We also have freedom to later
try out different implementations without changing the API, since
Parser is opaque as of #294.

The implementation of Parser differs from store and persist by using
ExceptT to pass around errors internally, rather than exceptions (or
closures, as in attoparsec). We may want to experiment with this later,
but in my initial experiments I didn't see a significant improvement
from those approaches.
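As a rough illustration of the design described above, here is a minimal sketch of a pointer-walking parser that threads errors through ExceptT. It is not the code from this PR: the `Result` type, the exact shape of the `Parser` newtype, and `runParser` are assumptions made for the example. Only the `\end pos ->` argument order and the `getWord8` name are taken from the review snippets.

```haskell
import Control.Monad.Trans.Except (ExceptT (..), runExceptT, throwE)
import qualified Data.ByteString as B
import qualified Data.ByteString.Unsafe as BU
import Data.Word (Word8)
import Foreign.Ptr (Ptr, castPtr, plusPtr)
import Foreign.Storable (peek)
import System.IO.Unsafe (unsafePerformIO)

-- | Hypothetical result type: the new position plus the parsed value.
data Result a = Result !(Ptr Word8) a

-- | A parser receives the end-of-input pointer and the current position.
-- Failures travel through ExceptT rather than exceptions or closures.
newtype Parser a = Parser
  { unParser :: Ptr Word8 -> Ptr Word8 -> ExceptT String IO (Result a) }

instance Functor Parser where
  fmap f (Parser g) = Parser $ \end pos -> do
    Result pos' x <- g end pos
    return (Result pos' (f x))

instance Applicative Parser where
  pure x = Parser $ \_ pos -> return (Result pos x)
  Parser f <*> Parser g = Parser $ \end pos -> do
    Result pos' h <- f end pos
    Result pos'' x <- g end pos'
    return (Result pos'' (h x))

instance Monad Parser where
  Parser f >>= k = Parser $ \end pos -> do
    Result pos' x <- f end pos
    unParser (k x) end pos'

-- | Read one byte and advance the pointer, failing if no input remains.
getWord8 :: Parser Word8
getWord8 = Parser $ \end pos ->
  if pos >= end
    then throwE "getWord8: Unexpected end of input"
    else ExceptT $ do
      w <- peek pos
      return (Right (Result (pos `plusPtr` 1) w))

-- | Run a parser over a strict ByteString. The whole input must already be
-- in memory, which is why this approach does not help streaming.
runParser :: Parser a -> B.ByteString -> Either String a
runParser (Parser f) bs = unsafePerformIO $
  BU.unsafeUseAsCStringLen bs $ \(p, len) -> do
    let start = castPtr p
    r <- runExceptT (f (start `plusPtr` len) start)
    return (fmap (\(Result _ x) -> x) r)
```

For example, `runParser ((,) <$> getWord8 <*> getWord8)` reads the first two bytes of its input and fails with the `getWord8` message if fewer than two are available.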

Benchmark results (the "time" output from Criterion):

flat(602B)/decode/whnf:
13.14 μs (13.02 μs .. 13.29 μs)
=> 8.686 μs (8.514 μs .. 8.873 μs)

nested(900B)/decode/whnf:
26.35 μs (25.85 μs .. 26.86 μs)
=> 11.66 μs (11.36 μs .. 11.99 μs)

int32-packed(1003B)/decode/whnf:
36.23 μs (35.75 μs .. 36.69 μs)
=> 17.31 μs (17.11 μs .. 17.50 μs)

int32-unpacked(2000B)/decode/whnf:
65.18 μs (64.19 μs .. 66.68 μs)
=> 19.35 μs (19.13 μs .. 19.58 μs)

float-packed(4003B)/decode/whnf:
78.61 μs (77.53 μs .. 79.46 μs)
=> 19.56 μs (19.40 μs .. 19.76 μs)

float-unpacked(5000B)/decode/whnf:
108.9 μs (107.8 μs .. 110.3 μs)
=> 22.29 μs (22.00 μs .. 22.66 μs)

no-unused(10003B)/decode/whnf:
571.7 μs (560.0 μs .. 586.6 μs)
=> 356.5 μs (349.0 μs .. 365.0 μs)

with-unused(10003B)/decode/whnf:
786.6 μs (697.8 μs .. 875.5 μs)
=> 368.3 μs (361.8 μs .. 376.4 μs)
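To make the ratios easier to read off, the point estimates listed above can be turned into speedup factors. This is only a small sketch: the numbers are copied from the list, and the names `timings` and `speedups` are made up for the example.

```haskell
-- | (benchmark, time before, time after), both in microseconds,
-- copied from the Criterion point estimates listed above.
timings :: [(String, Double, Double)]
timings =
  [ ("flat", 13.14, 8.686)
  , ("nested", 26.35, 11.66)
  , ("int32-packed", 36.23, 17.31)
  , ("int32-unpacked", 65.18, 19.35)
  , ("float-packed", 78.61, 19.56)
  , ("float-unpacked", 108.9, 22.29)
  , ("no-unused", 571.7, 356.5)
  , ("with-unused", 786.6, 368.3)
  ]

-- | Speedup factor for each benchmark: old time divided by new time.
speedups :: [(String, Double)]
speedups = [(name, before / after) | (name, before, after) <- timings]
```

Evaluating `speedups` in GHCi gives one factor per benchmark; the float benchmarks come out at roughly 4x or more, and the remaining ones fall between about 1.5x and 3.4x.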

blackgnezdo

Very cool!

I'd love for @augustss to also chime in on this one.

proto-lens/src/Data/ProtoLens/Encoding/Parser.hs

proto-lens/tests/parser_test.hs

This improves the nested benchmark a bit:

benchmarking nested(900B)/decode/whnf
14.32 μs (14.08 μs .. 14.57 μs)
=> 11.66 μs (11.36 μs .. 11.99 μs)

It didn't make a significant difference in the packed benchmark, I think because the effects of using lists currently dominate everything else.

judah · 2018-12-22T04:07:31Z

Thank you @blackgnezdo for the detailed review!

blackgnezdo · 2018-12-24T04:57:31Z

proto-lens/src/Data/ProtoLens/Encoding/Parser.hs

+-- @len@ bytes remaining.  That is, once @len@ bytes have been
+-- consumed, 'atEnd' will return 'True' and other actions
+-- like 'getWord8' will act like there is no input remaining.
+isolate :: Int -> Parser a -> Parser a


Where do we catch negative len?

Good catch; fixed to fail the parse in that case.
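The behaviour discussed in this thread can be sketched on a simplified model. The version below is an assumption-laden illustration, not the PR's code: it uses integer offsets into the ByteString instead of pointers, and treating a length that exceeds the remaining input as a failure is a guess. Only the `isolate` signature, the `atEnd` and `getWord8` names, and the "fail on negative length" decision come from the thread.

```haskell
import qualified Data.ByteString as B
import Data.Word (Word8)

-- | Simplified model: a parser sees the input, an exclusive end offset,
-- and the current offset, and returns the new offset with its result.
newtype Parser a = Parser
  { unParser :: B.ByteString -> Int -> Int -> Either String (Int, a) }

getWord8 :: Parser Word8
getWord8 = Parser $ \bs end pos ->
  if pos >= end
    then Left "getWord8: Unexpected end of input"
    else Right (pos + 1, B.index bs pos)

atEnd :: Parser Bool
atEnd = Parser $ \_ end pos -> Right (pos, pos >= end)

-- | Run the inner parser with only @len@ bytes visible.
isolate :: Int -> Parser a -> Parser a
isolate len (Parser m)
  | len >= 0 = Parser $ \bs end pos ->
      let end' = pos + len
      in if end' > end
           then Left "isolate: Unexpected end of input"
           else m bs end' pos -- the inner parser only sees the narrowed end
  | otherwise = Parser $ \_ _ _ -> Left "isolate: negative length"

runParser :: Parser a -> B.ByteString -> Either String a
runParser (Parser m) bs = snd <$> m bs (B.length bs) 0
```

With this model, `isolate 0 atEnd` reports `True` even when more input follows, and `isolate (-1) p` fails the parse instead of moving the end offset backwards.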

blackgnezdo · 2018-12-24T04:59:33Z

proto-lens/src/Data/ProtoLens/Encoding/Parser.hs

+-- It is only safe for @f@ to peek between its argument @p@ and
+-- @p `plusPtr` (len - 1)@, inclusive.
+withSized :: Int -> String -> (Ptr Word8 -> IO a) -> Parser a
+withSized len message f = Parser $ \end pos ->


Where do we catch negative len?

The only exposed function that passes a user-defined value is getBytes. It calls packCStringLen which does check for negative length; however, that function throws an exception in that case.

Fixed by adding a manual check inside getBytes.
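The difference between an exception and a parse failure can be seen in a small standalone sketch. The helper names `packChecked` and `packThrows` are hypothetical and exist only for this illustration; `B.packCStringLen` is the real bytestring function mentioned above.

```haskell
import Control.Exception (SomeException, try)
import qualified Data.ByteString as B
import Foreign.C.String (CStringLen)

-- | Hypothetical wrapper: reject a negative length up front, so the caller
-- gets a 'Left' instead of an exception raised from inside IO.
packChecked :: CStringLen -> IO (Either String B.ByteString)
packChecked (p, n)
  | n < 0 = return (Left "getBytes: negative length")
  | otherwise = Right <$> B.packCStringLen (p, n)

-- | For contrast: call 'B.packCStringLen' directly and report whether it threw.
packThrows :: CStringLen -> IO Bool
packThrows cs = do
  r <- try (B.packCStringLen cs) :: IO (Either SomeException B.ByteString)
  return (either (const True) (const False) r)
```

The manual check keeps the failure inside the parser's own error channel, which matters because an exception would bypass the error handling the parser uses for everything else.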

blackgnezdo

I love the newly added test coverage!

blackgnezdo · 2019-01-02T17:50:02Z

proto-lens/src/Data/ProtoLens/Encoding/Parser.hs

 getBytes :: Int -> Parser B.ByteString
-getBytes n = withSized n "getBytes: Unexpected end of input" $ \pos ->
+getBytes n
+    | n < 0 = fail "getBytes: negative length"


Is parsing 0 bytes a normal (useful?) thing to support? If so I'd document it.

Yes, for example if you have a proto string field, then the empty string value may be encoded as the varint "0".

Updated the comment.
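To show why a zero-length read comes up in practice, here is a sketch of decoding a length-delimited payload: a varint length followed by that many bytes. It works directly on ByteString values rather than on the PR's Parser type, and the names `getVarInt` and `getLengthDelimited` are chosen for the example only.

```haskell
import Data.Bits (shiftL, testBit, (.&.), (.|.))
import qualified Data.ByteString as B
import Data.Word (Word64)

-- | Decode a base-128 varint from the front of the input, returning the
-- value and the unconsumed rest. (No overflow check, to keep this short.)
getVarInt :: B.ByteString -> Either String (Word64, B.ByteString)
getVarInt = go 0 0
  where
    go shift acc bs = case B.uncons bs of
      Nothing -> Left "getVarInt: Unexpected end of input"
      Just (b, rest)
        | testBit b 7 -> go (shift + 7) acc' rest -- continuation bit set
        | otherwise -> Right (acc', rest)
        where
          acc' = acc .|. (fromIntegral (b .&. 0x7f) `shiftL` shift)

-- | Read a varint length, then split off that many payload bytes.
getLengthDelimited :: B.ByteString -> Either String (B.ByteString, B.ByteString)
getLengthDelimited bs = do
  (len, rest) <- getVarInt bs
  let n = fromIntegral len :: Int
  if n < 0 || n > B.length rest
    then Left "getLengthDelimited: bad length"
    else Right (B.splitAt n rest)
```

When the length prefix is the single byte 0, the payload is the empty ByteString and the rest of the input is left untouched, so reading 0 bytes has to succeed.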

blackgnezdo · 2019-01-02T17:51:10Z

proto-lens/src/Data/ProtoLens/Encoding/Parser.hs

-getBytes n = withSized n "getBytes: Unexpected end of input" $ \pos ->
+getBytes n
+    | n < 0 = fail "getBytes: negative length"
+    | otherwise = withSized n "getBytes: Unexpected end of input" $ \pos ->


I'd probably prefer to rephrase it with the correct case guarded with the assertion and fail otherwise. It will then become order independent and the expected behavior will go first.
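One way to read this suggestion is sketched below, again on a simplified offset-based model rather than the PR's pointer-based Parser (the newtype and `runParser` here are assumptions for the example). The valid case carries its own guard and comes first, so the two branches no longer depend on their order.

```haskell
import qualified Data.ByteString as B

-- | Simplified model: input, exclusive end offset, current offset.
newtype Parser a = Parser
  { unParser :: B.ByteString -> Int -> Int -> Either String (Int, a) }

-- | Read @n@ bytes. Reading 0 bytes is valid and yields the empty string.
getBytes :: Int -> Parser B.ByteString
getBytes n
  | n >= 0 = Parser $ \bs end pos ->
      if n > end - pos
        then Left "getBytes: Unexpected end of input"
        else Right (pos + n, B.take n (B.drop pos bs))
  | otherwise = Parser $ \_ _ _ -> Left "getBytes: negative length"

runParser :: Parser a -> B.ByteString -> Either String a
runParser (Parser m) bs = snd <$> m bs (B.length bs) 0
```

Whether this ordering reads better than `n < 0` first is a matter of taste; both versions behave the same.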

blackgnezdo · 2019-01-02T17:52:08Z

proto-lens/src/Data/ProtoLens/Encoding/Parser.hs

    B.packCStringLen (castPtr pos, n)

 -- | Helper function for reading bytes from the current position and
 -- advancing the pointer.
 --
 -- It is only safe for @f@ to peek between its argument @p@ and
 -- @p `plusPtr` (len - 1)@, inclusive.
+--
+-- This function is not safe to use with a negative length.


Could you quantify the impact of including the assertion branch into this function?

Moved the assertion into withSized. Happily, it turns out that GHC with -O will completely elide the assertion if the length is constant.

Also added isolate and used it for parsing messages and packed fields. This improved the nested benchmark a bit compared to without it:

benchmarking nested(900B)/decode/whnf
14.32 μs (14.08 μs .. 14.57 μs)
=> 11.66 μs (11.36 μs .. 11.99 μs)

It didn't make a significant difference in the packed benchmark, I think because the effects of using lists currently dominate everything else.

googlebot added the cla: yes label Dec 21, 2018

judah requested a review from blackgnezdo December 21, 2018 20:54

Fix haddocks

ff052a0

blackgnezdo reviewed Dec 21, 2018

judah added 3 commits December 21, 2018 16:35

Some review fixes.

2277970

Comments about copying

61420b4

judah force-pushed the encoding-parser branch from 28df564 to e992e18 Compare December 22, 2018 04:05

blackgnezdo reviewed Dec 24, 2018

judah added 2 commits January 2, 2019 07:33

Fail on negative lengths

a6a9506

More explicit about safety.

021d48e

blackgnezdo approved these changes Jan 2, 2019

Review fixes

95d4550

blackgnezdo approved these changes Jan 2, 2019

judah merged commit b250a54 into master Jan 3, 2019

judah deleted the encoding-parser branch January 3, 2019 00:41
