-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix 32 bit compilation #322
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This would be easier to follow if we could split it into two patches: the first making it compile with older compilers (fixing the things that look like they were just broken and type incorrect), and then a patch to also make it build with newer compilers (that make the changes to the unboxed primitive numeric types).
cborg/src/Codec/CBOR/Decoding.hs
Outdated
@@ -423,7 +424,7 @@ decodeWord64 = | |||
#if defined(ARCH_64bit) | |||
Decoder (\k -> return (ConsumeWord (\w# -> k (toWord64 w#)))) | |||
#else | |||
Decoder (\k -> return (ConsumeWord64 (\w64# -> k (toWord64 w64#)))) | |||
Decoder (\k -> return (ConsumeWord64 (\w64# -> k (W64# w64#)))) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand why we're replacing our conversion helper function toWord64
with a concrete implementation. And indeed deleting the helper function. Why not just make the helper function do the right thing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cborg/src/Codec/CBOR/Decoding.hs
Outdated
#else | ||
toInt64 n = I64# (intToInt64# n) | ||
toWord64 n = W64# (wordToWord64# n) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are we deleting these rather than fixing them? It's breaking the pattern used below. And it also means we cannot support multiple compiler versions that need differences in these functions.
Surely we should just use CPP here to change the impl of toWord64
and toInt64
so that with older GHC versions they're implemented as now, and with newer versions (where the unboxed integer types changed) to use the new implementation (that's inlined at the use site below) of
toInt64 n = I64# n
toWord64 n = W64# n
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I just removed these compat shims as they were not strictly necessary, but I will add them back for consistency and future-proofing.
cborg/src/Codec/CBOR/Decoding.hs
Outdated
toWord8 :: Word# -> Word8 | ||
toWord16 :: Word# -> Word16 | ||
toWord32 :: Word# -> Word32 | ||
#if WORD_SIZE_IN_BITS == 64 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For consistency, can we use the same macro as everywhere else: either ARCH_64bit
or ARCH_32bit
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, will also change the existing usage of WORD_SIZE_IN_BITS
👍
@@ -986,7 +987,7 @@ type ByteOffset = Int64 | |||
-- @since 0.2.2.0 | |||
peekByteOffset :: Decoder s ByteOffset | |||
peekByteOffset = Decoder (\k -> return (PeekByteOffset (\off# -> k (I64# | |||
#if MIN_VERSION_base(4,17,0) | |||
#if MIN_VERSION_base(4,17,0) && !defined(ARCH_32bit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Surely this can't be quite the right logic, because 32bit arches prior to some ghc version did need intToInt64#
didn't they?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PeekByteOffset
always provides an Int64#
on 32bit:
cborg/cborg/src/Codec/CBOR/Decoding.hs
Lines 191 to 195 in c8013b3
#if defined(ARCH_32bit) | |
| PeekByteOffset (Int64# -> ST s (DecodeAction s a)) | |
#else | |
| PeekByteOffset (Int# -> ST s (DecodeAction s a)) | |
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PeekByteOffset always provides an Int64# on 32bit
which would mean this logic here is wrong. The logic currently says "use intToInt64 if base >= 4.17 and only on 64bit arches".
But what you're saying above is that it should be "always use intToInt64 on 32bit arches", so the combo would be
#if MIN_VERSION_base(4,17,0) || defined(ARCH_32bit)
so that we use intToInt64 on 32bit arches (for all base versions), and we also use it on all arches from base 4.17 onwards.
Did we check this patch actually compiles on 32bit arches prior to ghc 9.2?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PeekByteOffset always provides an Int64# on 32bit
which would mean this logic here is wrong. The logic currently says "use intToInt64 if base >= 4.17 and only on 64bit arches".
But what you're saying above is that it should be "always use intToInt64 on 32bit arches", so the combo would be
#if MIN_VERSION_base(4,17,0) || defined(ARCH_32bit)
so that we use intToInt64 on 32bit arches (for all base versions), and we also use it on all arches from base 4.17 onwards.
I think the existing logic is exactly right (it also compiles on all GHC versions >=8.4), as seen eg by a case distinction.
The purpose of this code is to fill the hole _f
in PeekByteOffset $ \off# -> k (I64# (_f off#))
(where k
has input type ByteOffset ~ Int64
).
- On 32bit:
We haveoff# :: Int64#
andI64# :: Int64# -> Int64
on all GHCs. Hence, we can fill_f = id
. - On 64bit:
We haveoff# :: Int#
.- On GHC < 9.4, we have
I64# :: Int# -> Int64
.
Hence, we can fill_f = id
. - Since GHC >= 9.4, we have
I64# :: Int64# -> Int64
(FTR: corresponding section in the migration guide for 9.4).
Hence, we can fill_f = intToInt64# :: Int# -> Int64#
- On GHC < 9.4, we have
Note that we can fill _f = id
except when we are on GHC >=9.4 and on 64bit, which is exactly the existing condition.
Would be neat if there were some more principled/locally understandable alternative to the current CPP maze 😅
Did we check this patch actually compiles on 32bit arches prior to ghc 9.2?
Yes, to double check once again, I pushed to a branch in my fork with the second commit (Support GHC >=9.2 on 32bit) removed: https://github.com/amesgen/cborg/actions/runs/6183531233
One can see that i686 compilation with all GHCs from 8.4 to 9.0 succeeds (as expected, compilation on >=9.2 fails as the respective commit was removed).
Also, here is the CI run that everything works on all GHCs with all three commits: https://github.com/amesgen/cborg/actions/runs/6138762687
DecodedToken sz i@(I64# i#) | ||
| isInt64Canonical sz i -> k i# >>= go_fast (BS.unsafeDrop sz bs) | ||
DecodedToken sz (I64# i#) | ||
| isInt64Canonical sz i# -> k i# >>= go_fast (BS.unsafeDrop sz bs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused by this. Did this simply never work?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have to do some git blame
archaeology to find out whether this worked in the past due to different type signatures or was just introduced without any testing; even more extremely, the code didn't even parse on 32bit (introduced in #273) previously, so this code went without any testing since at least then.
@@ -1796,7 +1796,7 @@ tryConsumeInteger hdr !bs = case word8ToWord hdr of | |||
0x1b -> let !w = eatTailWord64 bs | |||
sz = 9 | |||
#if defined(ARCH_32bit) | |||
in DecodedToken sz (BigIntToken (isWord64Canonical sz (word64ToWord w)) $! toInteger w) | |||
in DecodedToken sz (BigIntToken (isWord64Canonical sz w) $! toInteger w) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also looks like this couldn't ever have been type correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
see #322 (comment)
db7d049
to
8011748
Compare
Good idea, did that in the latest push (also addressed most comments) 👍 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still struggling to follow this, and I'm not totally convinced the first patch does compile on 32bit arches. I'd like to try that.
cborg/src/Codec/CBOR/Decoding.hs
Outdated
#if defined(ARCH_64bit) | ||
toInt64 n = I64# n | ||
toWord64 n = W64# n | ||
#endif | ||
#endif | ||
#if defined(ARCH_32bit) | ||
toInt64 n = I64# n |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's unclear why this identity change is needed. The definitions in the 64 and 32bit cases here are the same as each other, and they are the same definitions as what where here before this code change.
I was half expecting it to be an identity refactoring in preparation for the next patch, but it's not. Was this intended to be something else? Otherwise it has no effect but makes the logic more complicated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
they are the same definitions as what where here before this code change.
No, they are different for 32bit and GHC >=9.4: Before this change, this meant that we compiled to toInt64 n = I64# (intToInt64# n)
(incorrect, see #322 (comment)), and with this change, to toInt64 n = I64# n
(correct).
But I just realized that this diff belongs into the second, not the first commit; will change that and also try to simplify the logic a bit.
@@ -986,7 +987,7 @@ type ByteOffset = Int64 | |||
-- @since 0.2.2.0 | |||
peekByteOffset :: Decoder s ByteOffset | |||
peekByteOffset = Decoder (\k -> return (PeekByteOffset (\off# -> k (I64# | |||
#if MIN_VERSION_base(4,17,0) | |||
#if MIN_VERSION_base(4,17,0) && !defined(ARCH_32bit) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
PeekByteOffset always provides an Int64# on 32bit
which would mean this logic here is wrong. The logic currently says "use intToInt64 if base >= 4.17 and only on 64bit arches".
But what you're saying above is that it should be "always use intToInt64 on 32bit arches", so the combo would be
#if MIN_VERSION_base(4,17,0) || defined(ARCH_32bit)
so that we use intToInt64 on 32bit arches (for all base versions), and we also use it on all arches from base 4.17 onwards.
Did we check this patch actually compiles on 32bit arches prior to ghc 9.2?
Yes, I found it very hard to modify these CPP-overflowing modules without constantly double-checking everything by actually compiling it. See #322 (comment) at the bottom for CI runs confirming that everything compiles as epxected on various GHCs on 32bit. |
8011748
to
f490e01
Compare
f490e01
to
c3b5c69
Compare
This is a rebased and fixed version of #296 (it does not compile for me on e.g. GHC 9.2 and 9.6). This should resolve #293 and resolve #309.
This PR also adds i386 CI jobs (also see #207); feel free to remove if you consider this out-of-scope.
My actual motivation for this PR is the GHC 9.6/9.8 WASM/WASI backend; I locally ensured that the tests pass, but I did not add a CI job as this still needs unmerged/unreleased PRs of transitive dependencies (concretely, haskellari/splitmix#73).