2GB .fif limit #7897
Should be fixed in master by #7740. Can you try it on one of your problematic cases? We didn't backport to 0.20 but maybe we should.
I've now backported #7740 to maint/0.20. @agramfort, worth another quick release for this?
Sure, why not.
Thanks, guys! I will check it out and get back to you if there are any further issues.
Hi everyone, this problem has reared its head again. I had it originally in version 0.20.6 and then updated just now to 0.21.0, and I'm still getting the error. Basically, I am able to save a 13 GB Epochs instance: it splits it over 3-4 FIFF files and numbers the parts. I then equally balance all of the data in the respective classes, concatenate the epochs into a new data structure, try to save, and I get the following error:
The new file is only 9.24 GB, and MNE had no trouble saving the 13 GB version, but it doesn't like my splitting, extracting and concatenation of the parts. I had this issue earlier, and an update was released that fixed the problem, at least I thought it did. So I'm not sure how connected the fix is (but the error is identical). Here is the full trace if it helps:
@alex159 can you try replicating this problem using an EpochsArray built from synthetic data of a similar size? If it doesn't work with EpochsArray, it suggests that there is something about your Epochs that we aren't accounting for when we try to figure out where to split things.
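The snippet originally attached to this comment was not preserved in the extract; the sketch below is only a guess at the kind of synthetic test being suggested, with the number of epochs, channel count, and sampling rate chosen arbitrarily so the data exceed 2 GB.

```python
import numpy as np
import mne

# Synthetic data large enough to force splitting; all sizes here are
# illustrative assumptions (3000 epochs x 64 channels x 2000 samples of
# float64 is roughly 3 GB in memory).
rng = np.random.RandomState(42)
data = rng.randn(3000, 64, 2000)
info = mne.create_info(64, sfreq=1000.0, ch_types="eeg")
epochs = mne.EpochsArray(data, info)

# If the splitting logic is working, this should write several <2 GB parts
# rather than raising an error.
epochs.save("test-epo.fif", overwrite=True)
```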
@larsoner Hi Eric, thanks for the response. I can confirm that using the code in your response, everything saves fine, so the problem must be something about my Epochs rather than EpochsArray itself.

I think I might have worked it out, though. I am concatenating a lot of events, and it looks like I hit numeric overflow in the first column of the events array: the sample numbers increase right until the end, and then I get extreme negative event time samples from the overflow. That then leads to a 'samples not in chronological order' warning, and that must be what causes the inability to save. I will take enough epochs out of the concatenation that no overflow happens and try to save the data again; I suspect this will work and that the overflow / lack of chronological ordering is the issue at hand.

Update: yes, it works fine now. The issue comes from out-of-order events brought about by event timings exceeding the maximum value of an int32 (> 2,147,483,647). Can I just change the dtype of the events array to int64 instead of int32 without causing any problems, do you think?
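For reference, a minimal NumPy illustration (not MNE code) of the wraparound described here, together with the int64 cast being asked about; the `epochs` name is a hypothetical placeholder:

```python
import numpy as np

# int32 sample numbers wrap to large negative values once they pass
# 2,147,483,647, which is what produces the out-of-order events.
samples = np.array([2_147_483_000], dtype=np.int32)
print(samples + 10_000)  # -> [-2147474296] instead of 2147493000

# The cast in question would look something like this (epochs is hypothetical):
# epochs.events = epochs.events.astype(np.int64)
```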
Are you on Windows? Because Windows uses int32 by default for integers.
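For context, the platform default is easy to check; on Windows builds of NumPy before 2.0 the default integer type is int32 (the C long is 32-bit there), while on Linux it is int64:

```python
import numpy as np

print(np.array([1]).dtype)     # int32 on older NumPy/Windows, int64 on Linux
print(np.iinfo(np.int32).max)  # 2147483647, the limit being hit above
```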
We should make sure we use int64 here.
Actually, it looks like the FIFF code imposes a restriction here.
Yes, I'm on Windows for this analysis, as the data is restricted to the university network. I can run something similar on a Linux cluster and see the results, but will have to wait until tomorrow to try that part.

I managed to make it work: I took out enough epochs to keep the concatenated events within the int32 limit and the save succeeded, with MNE correctly splitting it all up evenly. So the issue seems to be connected to the int32 limit, but is not specifically related to event time samples being in ascending order. I don't know enough about the internals to figure out what the issue could be; as @larsoner has pointed out, the FIFF code does seem to imply some sort of restriction here. The Linux test will be done tomorrow.

Responding to the question about using smaller values for the
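A quick way to check for the overflow condition described above before saving; the helper name is made up for illustration and is not part of MNE:

```python
import numpy as np

def events_fit_in_int32(epochs):
    """Return True if all event sample numbers fit in int32 (hypothetical helper)."""
    return int(epochs.events[:, 0].max()) <= np.iinfo(np.int32).max
```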
Hi everyone,
Mid-term MNE-Python user here. I have never run into issues saving FIF files before, because if I have a dataset with around 6 GB worth of data, whenever I save my epochs, MNE-Python usually parcellates them into several < 2 GB files and appends "-1", "-2" onto the filenames. Then, all I have to do is load the main file and it automatically detects that a split has happened, iteratively loads all these < 2 GB divisions of the FIF data, and puts it all back together again into the variable assigned to mne.read_epochs.
Now, as I am dealing with more and more data, I notice a few edge cases. It sometimes seems that I get the error telling me I can't save a file due to this < 2 GB restriction, even when the file is much smaller than another file that has saved successfully.
My analysis pipeline requires larger subsets and smaller subsets taken from the same data. The large data saves fine (MNE-Python splits the files evenly), but I am noticing now that in many of the cases where I extract subsets, the splitting fails: it tries to create > 2 GB files in my current directory and then raises the error.
I am not sure how it works, but it looks like there is a calculation of sorts that determines what to do when the data is large. The subsets could easily have been split up into multiple < 2 GB files, but this isn't happening if the data is, say, slightly over the limit.
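As far as I know, the split threshold is exposed as the split_size parameter of epochs.save (default '2GB'), so one possible workaround, sketched below with a hypothetical `epochs` instance and filename (e.g. the synthetic one from the sketch earlier in the thread), is to request smaller parts explicitly and leave headroom for per-file overhead:

```python
import mne

# Assumption: `epochs` is the subset that fails to save with default settings.
# Requesting smaller parts forces a split well below the FIFF 2 GB ceiling.
epochs.save("subset-epo.fif", split_size="1GB", overwrite=True)

# Reading the first part reassembles all splits ("-1", "-2", ...) automatically.
epochs_back = mne.read_epochs("subset-epo.fif")
```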
I can describe the saving process as having three main cases:
1. epochs.save works fine
2. epochs.save splits the files and successfully saves
3. epochs.save doesn't split and raises an exception
Does anyone have any insight into the exact condition checking and can see if there is a bug somewhere? It's awkward to keep splitting the data for saving and concatenating it back together when I need it. Those processes take a fair bit of time, and ideally I would like not to have to split the data arbitrarily when it's slightly over the limit.
If it's not a bug, then that's fine. I am just interested in knowing why it works that way.
So, if anyone can shed some light, it'd be very much appreciated.
I was trying to think of a code snippet to demonstrate the example, but that's problematic given it's a large-data problem and not something easily demonstrated with a nice, clean, reproducible example (sorry!).
Best
Alex