performance anomaly and possible related bug #6
Comments
Hey Les! Thanks, it's been quite interesting to work on it. The decode error is because of using whnf functionToTime100'' 1000 instead of whnf functionToTime1000'' 1000. I think the performance difference is two things:
I figured this out by doing:

import Debug.Trace (trace)

debug x = trace (show x) x

decodeMessages' n b = V.generate n (\i -> decodeEx (debug (S.drop (i * tradeTotalBytes) b)) :: Trade)

Since no trace messages were emitted, I knew that the value was never being demanded, and this is why we don't get any decode exceptions. It ran so quickly because it just created a vector of thunks. This may well be an interesting thing to explicitly support in Store. It could cause input memory leaks, but could also be quite handy for things that only need to decode part of their input.
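One quick way to confirm this, outside of tracing, is to demand every element of the result, which makes each per-element decode (and any PeekException) actually happen. A minimal sketch, assuming data Trade = Trade !Int !Int !Int in line with the Foo definition later in this thread:

import qualified Data.Vector as V

data Trade = Trade !Int !Int !Int

-- Pattern matching on each element forces its thunk, so every per-element
-- decodeEx hidden in the lazily generated vector actually runs (or throws).
forceAll :: V.Vector Trade -> Int
forceAll = V.foldl' (\acc (Trade a _ _) -> acc + a) 0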
Note that store already has instances for Vector, so you can peek and poke them directly. I should probably use V.generate instead of the current approach, but I bet the performance is the same. I'll give it a shot!
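For illustration, a minimal round trip through the built-in Vector instance might look like this; the Trade type and its Generic-derived Store instance are assumptions standing in for the benchmark's record type:

{-# LANGUAGE DeriveGeneric #-}

import qualified Data.Vector as V
import Data.Store (Store, encode, decodeEx)
import GHC.Generics (Generic)

data Trade = Trade !Int !Int !Int deriving (Show, Generic)
instance Store Trade

-- The Vector instance writes a length prefix followed by the elements, so
-- decodeEx can reconstruct the vector without being told how many there are.
roundTrip :: V.Vector Trade -> V.Vector Trade
roundTrip = decodeEx . encode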
Hey Michael,

Thanks. That explains the performance difference. My case is always existing binary that is close to the generated (Vector a) encoding but without the leading length field. I almost need a decodeRawEx or something. Given that, how would you code decodeMessages'?

Thanks,
Note that I edited my comment on github to also explain that the decoding error is from using whnf functionToTime100'' 1000 instead of whnf functionToTime1000'' 1000. The decode error doesn't happen in the single prime benchmark because no actual decoding is performed, due to laziness.

I'd define it pretty much the way you are defining it in decodeMessages''. I was wrong about the replicateM; I thought you were using the one from Traversable, not the one from vector. To make this more polymorphic (but still fast when given a specific type), use replicateM from mono-traversable (https://hackage.haskell.org/package/mono-traversable-0.10.2/docs/Data-Sequences.html#v:replicateM).

Not sure what decodeRawEx would look like. Perhaps you want decodeExWithOffset? This allows you to decode starting at an offset.
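A length-taking helper along those lines might look roughly like this; peekN is a hypothetical name, and this is only a sketch using mono-traversable's replicateM together with store's peek:

{-# LANGUAGE FlexibleContexts #-}
{-# LANGUAGE TypeFamilies #-}

import Data.MonoTraversable (Element)
import qualified Data.Sequences as Seq
import Data.Store (Store, Peek, peek)

-- Peek a known number of elements into any sequence type. Seq.replicateM
-- dispatches to a per-container implementation, so this should still
-- specialize well when used at a concrete type such as Vector.
peekN :: (Seq.IsSequence seq, Store (Element seq), Seq.Index seq ~ Int) => Int -> Peek seq
peekN n = Seq.replicateM n peek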
I tried switching to the V.generate approach. I suggest something like this definition:

import qualified Data.Vector as V
import qualified Data.Vector.Mutable as MV
import Control.Monad (forM_)
import Control.Monad.IO.Class (liftIO)
import Data.Store (Peek)

peekVectorOfLength
  :: Int
  -> Peek a
  -> Peek (V.Vector a)
peekVectorOfLength n f = do
  mut <- liftIO (MV.new n)
  forM_ [0 .. n - 1] $ \i -> f >>= liftIO . MV.write mut i
  liftIO (V.unsafeFreeze mut)
{-# INLINE peekVectorOfLength #-}
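As a usage sketch (decodeN is a hypothetical name; decodeExWith is the Data.Store function used later in this thread, and peekVectorOfLength is the definition above):

import qualified Data.ByteString as BS
import qualified Data.Vector as V
import Data.Store (Store, decodeExWith, peek)

-- Decode a buffer known (e.g. from a header) to hold exactly n records and
-- no length prefix.
decodeN :: Store a => Int -> BS.ByteString -> V.Vector a
decodeN n = decodeExWith (peekVectorOfLength n peek)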
Thanks. Yes, just checking: decodeMessages' is simply broken and throws an exception when it is actually called (sorry, late-night coding ;)… The decodeMessages'' version does work, and I will try the mono-traversable version. I would love to have decodeExWithOffset, but it does not quite solve this problem (it solves another one I have).

In my case I know the number of messages ahead of the decodeEx call. In this case, and for almost all financial exchange data, we have LE binary laid out as

[Header: byte count, ...][mid][Msg][mid][Msg][Header: byte count, ...][mid][Msg][mid][Msg]…

Nasdaq, BATS, NYSE, etc. are very similar. The header and each message are easily defined with Store. Consider data Trade = Trade !Int !Int !Int; this is a good model for the Msg part. A Vector Int gives roughly Length Int Int Int …, but my existing binary skips the Length field, since it is in the header record. So I read the header record and know exactly how many Trade records to read. I need something that super-efficiently decodes N of a given Peek a, just like decodeMessages': a "decodeExWithOffsetN" and a "decodeExWithN".

I'm not familiar with Handle internals, but it looked like it might be simple to provide Handle support as well. The following idea isn't fully cooked, but here goes:

hGlimpseEx :: Store a => Handle -> a
bGlimpseEx :: Store a => BS.ByteString -> a

(replace Glimpse with Decode for proper naming). Handle support would be very handy, since data often comes from a handle into a ByteString via S.hGetSome or similar before using Store.

bDecodeExN 10 bytes :: Vector Foo  -- assumes 10 Foo payloads but does not look for a leading length

Thoughts?
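For what it's worth, a sketch of what the proposed names could reduce to with the existing API; bDecodeExN and hDecodeExN are the hypothetical names from above, not store functions:

import qualified Data.ByteString as BS
import qualified Data.Vector as V
import System.IO (Handle)
import Data.Store (Store, peek, decodeExWith)

-- Decode exactly n records from a buffer assumed to contain exactly those n
-- encoded records and no leading length field.
bDecodeExN :: Store a => Int -> BS.ByteString -> V.Vector a
bDecodeExN n = decodeExWith (V.replicateM n peek)

-- Handle variant for fixed-size records: read the exact payload for n
-- records (recordBytes each), then decode it.
hDecodeExN :: Store a => Int -> Int -> Handle -> IO (V.Vector a)
hDecodeExN recordBytes n h = do
  bytes <- BS.hGet h (recordBytes * n)
  pure (bDecodeExN n bytes)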
Thanks! I’ll give that a try.
Perhaps in general?
The question is, how do we create the m? I do have a variety of generic utilities for the case where the length is stored:

peekSequence :: (IsSequence t, Store (Element t), Index t ~ Int) => Peek t

peekSet :: (IsSet t, Store (Element t)) => Peek t

peekMap
  :: (Store (ContainerKey t), Store (MapValue t), IsMap t)
  => Peek t

peekMutableSequence
  :: Store a
  => (Int -> IO r)
  -> (r -> Int -> a -> IO ())
  -> Peek r

I can provide equivalents of such utilities that take a known length, if that is convenient.
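A known-length equivalent of the last one might look roughly like this; peekMutableSequenceOfLength is a hypothetical name, following the peekVectorOfLength pattern above:

import Control.Monad (forM_)
import Control.Monad.IO.Class (liftIO)
import Data.Store (Store, Peek, peek)

-- Same shape as peekMutableSequence, but the element count is supplied by
-- the caller (e.g. read from a header) instead of peeked from the input.
peekMutableSequenceOfLength
  :: Store a
  => Int                      -- number of elements
  -> (Int -> IO r)            -- allocate
  -> (r -> Int -> a -> IO ()) -- write
  -> Peek r
peekMutableSequenceOfLength n new write = do
  r <- liftIO (new n)
  forM_ [0 .. n - 1] $ \i -> peek >>= liftIO . write r i
  return r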
Yes, that would be a very clean solution. Thanks!
Hey Michael,

_peek n = V.replicateM n (peek :: Store.Peek Foo)

data Foo = Trade !Int !Int !Int deriving (Show, Generic)

Using the generated Peek (Vector Foo) in one case and Data.Store.decodeExWith _peek in the other, we get 192/258 ns for n = 10 and 14.7/21.5 µs for n = 1000. V.replicateM is in the ballpark of (but slower than) your internal approach. I'm not sure how that jives with the V.replicateM result you got earlier, but I thought you might find this interesting.

Also, by way of a sanity check, and as advertisement for the package documentation, it might be interesting to create a C version of _peek, just as a reference for a simple example or two and to show relative times. Perhaps there are standard benchmarks against C already.

Cheers,
Closing this in favor of the summarization in #40
Great package!
I've coded and benchmarked two functions that decode a ByteString of simple packed encoded structures (compiled with -O2).
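Based on the discussion in the comments above, the two functions were roughly along these lines; this is a reconstruction under stated assumptions (e.g. tradeTotalBytes = 24 for three 8-byte Ints), not the exact benchmarked code:

{-# LANGUAGE DeriveGeneric #-}

import qualified Data.ByteString as S
import qualified Data.Vector as V
import Data.Store (Store, Peek, peek, decodeEx, decodeExWith)
import GHC.Generics (Generic)

data Trade = Trade !Int !Int !Int deriving (Show, Generic)
instance Store Trade

tradeTotalBytes :: Int
tradeTotalBytes = 24

-- First function: build the vector lazily, decoding each record
-- independently at a computed offset. (As noted in the comments, this
-- throws once its elements are actually forced.)
decodeMessages' :: Int -> S.ByteString -> V.Vector Trade
decodeMessages' n b =
  V.generate n (\i -> decodeEx (S.drop (i * tradeTotalBytes) b) :: Trade)

-- Second function: one Peek action that reads n records in sequence.
decodeMessages'' :: Int -> S.ByteString -> V.Vector Trade
decodeMessages'' n = decodeExWith (V.replicateM n (peek :: Peek Trade))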
The first function is about 2-3x faster. Question 1: why? I would have guessed that the second would be faster.
Question 2: If you uncomment the line for the 1000'' benchmark, you get:

benchmarking instance/Store/1000''
sbug: PeekException 0 "Attempted to read too many bytes for Int. Needed 8, but only 0 remain."
My guess is that replicateM is overflowing a stack with 1000 elements... but that seems bad, and the error message might also be misleading.
Question 3: Any suggestions on making this as fast as possible with Store?
Question 4: In general, if you want to encode and decode N simple structures without a length field, what is the recommended approach?
Thanks!