
2GB .fif limit #7897

Closed
algrmur opened this issue Jun 13, 2020 · 11 comments · Fixed by #8449

Comments

@algrmur

algrmur commented Jun 13, 2020

Hi everyone,

Mid-term MNE-Python user here. I have never run into problems saving FIF files before: if I have a dataset with around 6GB worth of data, whenever I save my epochs, MNE-Python usually splits them into several < 2GB files and appends "-1", "-2" to the filenames. Then all I have to do is load the main file; it automatically detects that a split has happened, iteratively loads these < 2GB parts of the FIF data, and puts everything back together into the variable assigned from mne.read_epochs.
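A minimal sketch of that save/load round trip, assuming an existing epochs object and a hypothetical filename (split_size defaults to '2GB'):

import mne

# Saving epochs larger than split_size produces numbered parts,
# e.g. sub01-epo.fif, sub01-epo-1.fif, ...
epochs.save('sub01-epo.fif', split_size='2GB', overwrite=True)

# Reading the first part detects the split and loads all parts back
# into a single Epochs object.
epochs_back = mne.read_epochs('sub01-epo.fif')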

Now that I am dealing with more and more data, I am noticing a few edge cases. Sometimes I get the error telling me I can't save a file due to this < 2GB restriction, even though the file is much smaller than another file that saved successfully.

My analysis pipeline requires larger subsets and smaller subsets taken from the same data. The large data saves fine (MNE-Python splits the files equally), but I am noticing now that in many of the cases where I extract subsets, the splitting fails: it tries to create > 2GB files in my current directory and then raises the error.

I am not sure how it works, but it looks like there is a calculation of sorts that determines what to do when the data is large. The subsets could easily have been split up into multiple < 2GB files, but this isn't happening when the data is, say, slightly over the limit.

I can describe the saving process as having three main cases:

  • Data is < 2GB, epochs.save works fine
  • Data is >>>> 2GB (way over), epochs.save splits the files and successfully saves
  • Data is "slightly" over 2GB, epochs.save doesn't split and raises an exception

Does anyone have insight into the exact condition checking and can see whether there is a bug somewhere? It's awkward to keep splitting the data myself for saving and concatenating it back together when I need it. Those processes take a fair bit of time, and ideally I would not have to split arbitrarily when the data is only slightly over the limit.

If it's not a bug, then that's fine. I am just interested in knowing why it works that way.
So, if anyone can shed some light, it'd be very much appreciated.

I was trying to think of a code snippet to demonstrate the problem, but that's difficult given it's a large-data issue and not something easily shown with a nice, clean, reproducible example (sorry!).

Best
Alex

@larsoner
Member

Should be fixed in master by #7740. Can you try it on one of your problematic cases? We didn't backport to 0.20 but maybe we should.

@larsoner
Member

I've now backported #7740 to maint/0.20. @agramfort, worth another quick release for this?

@agramfort
Member

agramfort commented Jun 13, 2020 via email

@algrmur
Author

algrmur commented Jun 13, 2020

Thanks, guys! I will check it out and get back if any further issues.
Keep up the good work!

@Alxmrphi

Alxmrphi commented Sep 29, 2020

Hi everyone,

This problem has reared its head again. I had it originally in version 0.20.6 and have just updated to 0.21.0, and I'm still getting the error. Basically, I am able to save a 13 GB Epochs instance; it splits it over 3-4 FIFF files and numbers the parts. I then equally balance the data across the respective classes, concatenate the epochs into a new data structure, try to save, and get the following error:

OSError: FIFF file exceeded 2GB limit, please split file or save to a different format

The new file is only 9.24 GB, and MNE had no trouble saving the 13 GB version, but it doesn't like my splitting, extracting, and concatenation of the parts. I had this issue earlier and an update was released that fixed the problem, or at least I thought it did. So I'm not sure how connected this is to that fix (but the error is identical).

Here is the full trace if it helps:

OSError                                   Traceback (most recent call last)
<ipython-input-18-74621510a84e> in <module>
      1 # save
----> 2 epochs_concat_reduced.save('bigram_10class_equal_epochs-epo.fif', overwrite=True)

<decorator-gen-199> in save(self, fname, split_size, fmt, overwrite, verbose)

~\AppData\Local\Continuum\anaconda3\lib\site-packages\mne\epochs.py in save(self, fname, split_size, fmt, overwrite, verbose)
   1647             # avoid missing event_ids in splits
   1648             this_epochs.event_id = self.event_id
-> 1649             _save_split(this_epochs, fname, part_idx, n_parts, fmt)
   1650 
   1651     def equalize_event_counts(self, event_ids, method='mintime'):

~\AppData\Local\Continuum\anaconda3\lib\site-packages\mne\epochs.py in _save_split(epochs, fname, part_idx, n_parts, fmt)
    185     end_block(fid, FIFF.FIFFB_PROCESSED_DATA)
    186     end_block(fid, FIFF.FIFFB_MEAS)
--> 187     end_file(fid)
    188 
    189 

~\AppData\Local\Continuum\anaconda3\lib\site-packages\mne\io\write.py in end_file(fid)
    335     """Write the closing tags to a fif file and closes the file."""
    336     write_nop(fid, last=True)
--> 337     check_fiff_length(fid)
    338     fid.close()
    339 

~\AppData\Local\Continuum\anaconda3\lib\site-packages\mne\io\write.py in check_fiff_length(fid, close)
    328         if close:
    329             fid.close()
--> 330         raise IOError('FIFF file exceeded 2GB limit, please split file or '
    331                       'save to a different format')
    332 

@larsoner
Member

@alex159 can you try replicating this problem using mne.EpochsArray(np.zeros(...), create_info(...)).save(...)? Presumably if you use the same dimensions as the data you're trying to save you'll hit the problem.

If it doesn't work with EpochsArray it suggests that there is something about your Epochs that we aren't accounting for when we try to figure out where to split things.
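A rough sketch of that reproduction test (the dimensions and sampling rate below are placeholders; substitute the shape of the real data that fails to save):

import numpy as np
import mne

# Placeholder dimensions -- use the shape of the data that fails to save.
n_epochs, n_channels, n_times = 10000, 64, 1000
info = mne.create_info(n_channels, sfreq=1000., ch_types='eeg')
epochs = mne.EpochsArray(np.zeros((n_epochs, n_channels, n_times)), info)
epochs.save('zeros-epo.fif', overwrite=True)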

@larsoner larsoner reopened this Sep 29, 2020
@Alxmrphi

Alxmrphi commented Sep 29, 2020

@larsoner Hi Eric, thanks for the response. I can confirm that using the code in your response, everything saves fine, so it looks like there is something specific to my own Epochs. I think I might have worked it out though. I am concatenating a lot of events and it looks like I hit numeric overflow in the first column of the events array. Everything increases right until the end, where I get extremely large negative event sample numbers from the overflow. That then leads to a 'samples not in chronological order' warning, and that must be what leads to the inability to save.

I will remove enough epochs from the concatenation so that no numerical overflow happens and will try to save the data again. I suspect this will work and that the overflow / lack of chronological ordering is the issue at hand.


Yes, it works fine now. The issue relates to out-of-order events brought about by event sample times exceeding the maximum value of an int32 (> 2,147,483,647).

Can I just change the dtype of the events array to be of int64 instead of int32 without causing any problems, do you think?
I'm guessing I'd need to change it for all of the EpochsArray items that are being concatenated, such that mne.concatenate_epochs retains the same dtype in its events array?
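A minimal sketch of that idea, assuming all_epochs is the (hypothetical) list of EpochsArray objects being concatenated; this only changes the in-memory dtype, so whether it survives concatenation and saving is exactly the open question:

import numpy as np
import mne

# Cast each events array to int64 so the sample column (first column)
# cannot wrap around during concatenation.
for ep in all_epochs:
    ep.events = ep.events.astype(np.int64)

combined = mne.concatenate_epochs(all_epochs)
print(combined.events.dtype, combined.events[:, 0].max())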

@agramfort
Member

agramfort commented Sep 30, 2020 via email

@larsoner
Member

We should make sure we use np.int64 explicitly for the events array in our code. @alex159 can you track down whether the problem arose from you creating events using int on Windows (which will be 32-bit, as @agramfort says) or whether it's a problem with our code using int somewhere? It could even need fixing in both places...
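For reference, a small illustration of the platform difference (NumPy's default integer is 64-bit on Linux/macOS but 32-bit on Windows, so arithmetic on an events array built without an explicit dtype can silently wrap there):

import numpy as np

events = np.array([[1000, 0, 1]])        # int32 on Windows, int64 on Linux/macOS
print(events.dtype)

shifted = events.copy()
shifted[:, 0] += 2_147_483_000           # pushes past the int32 limit: wraps negative on Windows
print(shifted[:, 0])

events64 = np.array([[1000, 0, 1]], dtype=np.int64)  # explicit dtype avoids the ambiguity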

@larsoner
Member

larsoner commented Sep 30, 2020

Actually it looks like we write_int the events, which for FIFF means they are 32-bit integers. Given that there is no int64 in FIFF, I think we should probably check that the events are all <= 2147483647. @alex159 I think in your case you will need to use smaller values for the events array :( You could perhaps add the larger numbers as metadata.
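A rough sketch of the suggested check, assuming an epochs instance about to be saved (this is only an illustration of the idea, not what MNE currently does):

import numpy as np

INT32_MAX = np.iinfo(np.int32).max  # 2147483647

# FIFF stores event values as 32-bit integers, so anything beyond this
# range cannot round-trip through the file.
if np.abs(epochs.events).max() > INT32_MAX:
    raise ValueError('events contain values outside the int32 range; '
                     'use smaller sample numbers or keep the originals '
                     'in epochs.metadata')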

@Alxmrphi

Yes, I'm on Windows for this analysis, as the data is restricted to the university network. I can run something similar on a Linux cluster and see the results, but will have to wait until tomorrow to try that part. I managed to make concatenate_epochs use the int64 dtype by changing the dtypes of all the EpochsArray objects being concatenated. This did avoid the overflow in the event time samples, which now stay in chronological order. However, another overflow message arises somewhere I can't quite locate, and the saving issue comes back.

I reduced the number of epochs to keep the concatenated events within the int32 limit and managed to save successfully, with MNE correctly splitting it all up evenly. So the issue seems to be connected to the int32 limit, but is not specifically related to event time samples being in ascending order. I don't know enough about the internals to figure out what the issue could be.

It seems @larsoner has pointed out that the FIFF code does impose a restriction here. The Linux test will be done tomorrow.

Responding to the question about using smaller values for the events array, how can I go about that?
Is it possible to just overwrite the event times without destroying the ability to pull out the correct data when requested?
If that's possible, it seems like a good interim solution!
