Support for single-file multithreading #403

Closed
mbauman opened this issue Jun 24, 2022 · 4 comments

Comments

@mbauman
Member

mbauman commented Jun 24, 2022

#230 implemented thread-safety at a file level. However, if I try to read from a single file from multiple threads, I hit segfaults. I understand that multithreaded writing to a single file is fraught, but would it be a major challenge to support multithreaded reads?

@JonasIsensee
Collaborator

Hi @mbauman,

This really should be possible.
First, the reason it doesn't work today:
JLD2 keeps a global record of every file that is currently open.
This ensures that you don't, for example, accidentally open a single file in both read and write mode at the same time.

  • If you disable that logic, threaded reading should work. See this check in the JLD2 source:
      if haskey(OPEN_FILES, rname)

With the current logic it doesn't work because the same file handle is used to process all reads. The JLDFile object also has plenty of internal caching that could hit race conditions.

  • This could in principle be fixed by introducing locks around all operations that change the cache or really read from the file.
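The locking idea can be sketched in plain Julia. This is only an illustration under stated assumptions: CachedReader, load_from_disk, read_dataset, and demo_locked_reads are hypothetical names invented here, not JLD2 types or functions, and the "disk read" is simulated. It shows one ReentrantLock guarding both the cache update and the read that fills it, so concurrent tasks cannot corrupt the cache or load the same entry twice.

```julia
# Hypothetical stand-in for a file object with an internal cache.
# This is NOT JLD2's JLDFile; it only illustrates the locking pattern.
struct CachedReader
    lock::ReentrantLock
    cache::Dict{String,Vector{Int}}
    loads::Base.RefValue{Int}   # counts simulated "disk reads"
end

CachedReader() = CachedReader(ReentrantLock(), Dict{String,Vector{Int}}(), Ref(0))

# Simulated read. A real implementation would seek and read via the file handle.
load_from_disk(name::AbstractString) = collect(1:length(name))

# Every operation that touches the cache or the (simulated) file happens
# while holding the lock, so only one task at a time can be inside.
function read_dataset(r::CachedReader, name::AbstractString)
    lock(r.lock) do
        get!(r.cache, name) do
            r.loads[] += 1
            load_from_disk(name)
        end
    end
end

# Spawn several tasks that all request the same dataset.
function demo_locked_reads(; ntasks::Int = 8)
    reader = CachedReader()
    results = Vector{Vector{Int}}(undef, ntasks)
    @sync for i in 1:ntasks
        Threads.@spawn results[i] = read_dataset(reader, "dataset")
    end
    return reader, results
end
```

A single coarse lock like this serializes all reads, so it buys safety rather than speed. Separate handles per thread avoid that cost.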

@ejmeitz
Contributor

ejmeitz commented Jul 16, 2023

How hard would it be for a non-JLD expert to implement this? I have an application where everything else threads perfectly, but an HDF5 read is the one thing preventing me from multithreading the program. I can change my program to work around this, but it would make my code a lot more hacky.

@JonasIsensee
Collaborator

This wouldn't be very hard and does not require any real JLD2 knowledge.

This is the relevant function definition:

function jldopen(fname::AbstractString, wr::Bool, create::Bool, truncate::Bool, iotype::T=MmapIO;

Currently, there is a "global" list that keeps track of all open files and ensures that only a single JLDFile can exist per file.

This condition could be relaxed:

  • Always create a new file handle for read-only access
  • Keep a separate, thread-safe list of "read-only" file handles, so that an error can be thrown if you try to create a write-access file handle while a read-only one is open
  • Alternatively, introduce a new optional kwarg that bypasses the whole safety mechanism
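The relaxed bookkeeping described in these bullets might look roughly like the following sketch. Everything in it is an assumption made for illustration: HandleRegistry, Handle, open_handle, close_handle, and the skip_checks keyword are hypothetical and are not JLD2's API or what #477 actually implements. It tracks paths only and opens no files. Real code would presumably normalize paths first, as the rname in the existing check suggests.

```julia
# Hypothetical registry of open handles, guarded by a lock so it is thread-safe.
struct HandleRegistry
    lock::ReentrantLock
    writers::Set{String}        # paths with an open write handle
    readers::Dict{String,Int}   # path => number of open read-only handles
end

HandleRegistry() = HandleRegistry(ReentrantLock(), Set{String}(), Dict{String,Int}())

struct Handle
    path::String
    writable::Bool
    tracked::Bool   # false when the safety checks were bypassed
end

function open_handle(reg::HandleRegistry, path::AbstractString;
                     write::Bool = false, skip_checks::Bool = false)
    p = String(path)
    # Optional escape hatch: bypass the whole safety mechanism.
    skip_checks && return Handle(p, write, false)
    lock(reg.lock) do
        if write
            # A writer needs exclusive access to the path.
            (p in reg.writers || get(reg.readers, p, 0) > 0) &&
                error("$p is already open; cannot open it for writing")
            push!(reg.writers, p)
        else
            # Any number of readers may coexist, but not alongside a writer.
            p in reg.writers && error("$p is open for writing; cannot open it read-only")
            reg.readers[p] = get(reg.readers, p, 0) + 1
        end
        Handle(p, write, true)
    end
end

function close_handle(reg::HandleRegistry, h::Handle)
    h.tracked || return nothing
    lock(reg.lock) do
        if h.writable
            delete!(reg.writers, h.path)
        else
            n = get(reg.readers, h.path, 0) - 1
            n > 0 ? (reg.readers[h.path] = n) : delete!(reg.readers, h.path)
        end
    end
    return nothing
end
```

With this layout each thread gets its own read-only handle, so no caches or file positions are shared between threads, while mixing read and write access to one file still raises an error.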

@JonasIsensee
Collaborator

Implemented by #477.

3 participants