-
Notifications
You must be signed in to change notification settings - Fork 45
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Require explicit, single-threaded initialization #91
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fec_initialized = 1; | ||
|
||
void | ||
fec_init (void) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thread synchronization isn't immediate. If thread 1 and 2 are both started, and thread 1 calls init, thread 2 won't necessarily see updated versions of the memory modified by thread 1, and even if it does it may be inconsistent across different memory locations. Put another way, modifying the static does not create a happens-before relationship, except for threads started after init_fec()
is called.
The solution is to make fec_initialized
an atomic, which is supported by modern C, and then read/write it with acquire/release atomic ordering.
https://marabos.nl/atomics/ is good book about this; Rust's model is based on C++'s model, and C's model is probably also C++'s model, so the semantics will all be the same even if the APIs are slightly different.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another alternative is to just create a static lock that needs to be acquired to initialized or check initialization.(Acquiring/releasing a lock is equivalent to acquire/release semantic on an atomic variable, so it too creates a happens-before relationship.)
This would simplify the API in that you don't need to make fec_init()
public, you could just always call it. It would also make fec_new()
slightly slower, but probably not in a very meaningful way.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, great points here. From some simple experiments, it looks like neither C11 "atomics" nor "threads" are available from the MSVC we get from current CI (eg, stdatomic.h
exists but merely including it leads to numerous errors). I don't quite understand this since we're using the Windows 2022 image and https://github.com/actions/runner-images/blob/main/images/win/Windows2022-Readme.md says this includes MSVC 17.7. The build failures have a version number that looks like "14.35" ... but maybe that's not the MSVC version? Maybe https://reactos.org/wiki/Visual_Studio_Versions is suggesting that "14.35" actually means "17.7". I don't know.
So for now I'm going with "documentation". Hopefully later someone can figure out how to get at atomics/threads on Windows and we can strengthen the implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wouldn't a mutex suffice here? Maybe that is available?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A statically-initializable mutex here would work, yes. There isn't one in C99 so we'd need a third-party library. There is probably one but I am reluctant to give zfec it's first third-party build-time dependency. It seems likely that pthreads offers some kind of solution here, which would at least solve the problem for Linux (maybe macOS too?), but it's not the obvious pthread_mutex_t because those are not statically initializable (so all you do is move the initialization problem from "coordinate zfec initialization" to "coordinate zfec initialization mutex initialization"). Windows probably has its own solution but I don't know what it is offhand either (there's at least 5 solutions that aren't statically initializable, for sure).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have vague memory that certain Python releases require certain MSVC versions? So older Python might get older MSVC or something.
2341a71
to
90296c4
Compare
There was a segfault during one of the macOS test runs. I managed to reproduce this (at least, it appears to show up in the same test method) on my Linux laptop on the branch (after ~800 iterations) and on master (after ~1800 iterations) - so I'm supposing for now that's some pre-existing issue. |
It's nice to have access to them on CI but more importantly showing build progress prevents the build from timing out...
Might be able to find the problem with one of clang's sanitizers, or with valgrind. |
The test suite takes a long time to run, particularly on CI, this might help it a bit. Also the number of successes can be overridden on the command line if necessary.
Fixes #89
This changes the underlying C library to require an initial call to
fec_init
beforefec_new
(and therefore the other APIs) can succeed. This avoids the problem where multiple threads callfec_new
and implicitly trigger parallel initialization leading to corruption of some zfec internal state.Calls to the C function
fec_new
will now return NULL if a call to the C functionfec_init
has not yet completed.This change is not visible to the Python bindings because (a) they always call into fec while holding the GIL and so there is no chance for multithreaded usage and (b) the Python module initializer in the bindings now calls the initialization function automatically.
This change is visible to the Haskell bindings because the Haskell API can really make parallel calls into the fec library from multiple OS threads. Therefore, there is a new Haskell API,
Codec.FEC.initialize
, which must be used beforeCodec.FEC.new
. If it is not,Codec.FEC.new
will throw an exception.This is an incompatible change to the Haskell API so it comes with a version bump for those bindings.