MNE notes
These notes are in support of plotting MNE data with HoloViz+Bokeh, and potentially later creating an actual plotting backend.
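In MNE-Python, the Raw object is used to represent continuous data. For most readers, a Raw object read from a file keeps the data on disk by default and reads it on demand, with data access and manipulation methods provided by the object's interface. Passing preload=True to the reader (or calling raw.load_data()) loads the data into memory.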
On-Disk Format: The on-disk format varies with the source of the data. For instance, if you load data from a FIF (Functional Image File) file, which is the file format native to the Neuromag MEG system, the data on disk is stored in the FIF binary layout. Other file formats, such as EDF, CNT, BrainVision, etc., each have their own on-disk binary format. MNE-Python provides a dedicated function to load each of these file types into a Raw object, e.g., mne.io.read_raw_fif for FIF files.
In-Memory Format: Once the Raw object is loaded into memory (either partially or fully), the data is represented as a 2D NumPy array. The dimensions of this array are (n_channels, n_times). The data unit depends on the type of channel (e.g., for EEG channels, the data is in volts). The data is typically an array of floats (although the precise float precision can depend on the original data format and how it's loaded).
The Raw object also includes an info attribute which is a dictionary-like object containing metadata about the recording, such as channel names, types, sample rate, and any additional relevant information.
MNE-Python uses matplotlib and PyQtGraph (new-ish) backends for data visualization, depending on the context or user selection. PyQtGraph is a graphics and GUI library built on PyQt and numpy. Here is the source code for the MNE PyQtGraph backend.
When it comes to visualizing large datasets with these libraries, MNE-Python uses a lazy loading feature to load and visualize data. Specifically, when you call a plotting function on a Raw object, MNE-Python will not load the entire data into memory. Instead, it will only load the segment of data that is to be displayed. The details vary slightly depending on the specific plot being generated, but the principle is the same: load only what is necessary.
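The "load only what is necessary" principle can be sketched with plain NumPy. This is an illustration under stated assumptions, not MNE-Python's actual implementation: the file layout (raw float64, channels × samples), the file name, and the load_window helper are all hypothetical. With a real Raw object, the comparable call would be raw.get_data(start=..., stop=...).

```python
import os
import tempfile

import numpy as np

n_channels, n_times, sfreq = 4, 10_000, 100.0  # hypothetical recording

# Write a synthetic recording to disk as raw float64, channels x samples.
path = os.path.join(tempfile.mkdtemp(), "recording.dat")
rng = np.random.default_rng(0)
rng.standard_normal((n_channels, n_times)).tofile(path)

# Memory-map the file: nothing is read until a slice is requested.
on_disk = np.memmap(path, dtype=np.float64, mode="r",
                    shape=(n_channels, n_times))


def load_window(tmin, tmax):
    """Return only the samples between tmin and tmax (seconds), in memory."""
    start = max(int(round(tmin * sfreq)), 0)
    stop = min(int(round(tmax * sfreq)), n_times)
    return np.array(on_disk[:, start:stop])  # copy just the visible segment


window = load_window(10.0, 20.0)
print(window.shape)
```

Only the 10 s window is copied into memory; the rest of the file stays on disk until another window is requested.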
As you zoom out to see more of the dataset and the visible span becomes too large to fit into memory, MNE-Python downsamples the data to a manageable size and plots the downsampled version. Downsampling here means reducing the number of data points by taking, for example, every nth data point. This gives a rough idea of the overall shape of the data without loading all of it into memory. When you zoom in on a part of the plot that fits into memory, MNE-Python loads the corresponding segment at full resolution.
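To make the "every nth data point" idea concrete, here is a small NumPy sketch. The function names are hypothetical and this is not MNE-Python's code. The min/max variant is included only for comparison, as a commonly used alternative that keeps the signal envelope; the notes above do not claim MNE uses it.

```python
import numpy as np


def decimate_every_nth(y, n):
    """Keep every nth sample; fast, but can miss narrow peaks."""
    return y[..., ::n]


def decimate_min_max(y, n):
    """Keep the min and max of each block of n samples (envelope-preserving)."""
    n_blocks = y.shape[-1] // n
    blocks = y[..., : n_blocks * n].reshape(*y.shape[:-1], n_blocks, n)
    # Interleave min and max so each block contributes two points.
    out = np.stack([blocks.min(axis=-1), blocks.max(axis=-1)], axis=-1)
    return out.reshape(*y.shape[:-1], 2 * n_blocks)


signal = np.zeros((1, 1000))
signal[0, 123] = 5.0  # a single-sample spike

every_nth = decimate_every_nth(signal, 10)
min_max = decimate_min_max(signal, 10)
print(every_nth.shape, every_nth.max())  # spike falls between kept samples
print(min_max.shape, min_max.max())      # spike survives in its block's max
```

The trade-off is that plain striding is cheapest but can drop short transients, while min/max decimation doubles the number of plotted points per block in exchange for keeping them.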
In practice, however, displaying and interacting with (zooming, panning) large amounts of data is poor with matplotlib and quite laggy with PyQtGraph. To address this, we would probably need to benchmark the downsampling/data-handling and the plotting separately. It could be that any significant performance gains would come from the data-handling side, which would likely go beyond the scope of HoloViz+Bokeh. Alternatively, if the data handling is already performant and the significant gains are to be had in the plotting, then the work would fall within HoloViz+Bokeh scope.
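One possible way to separate the two timings is a small harness that times each stage on its own. This is only a sketch: best_time, data_handling, and rendering are hypothetical names, the array size is arbitrary, and the rendering stage is a placeholder to be replaced with the real draw call of whichever backend is being measured.

```python
import time

import numpy as np


def best_time(fn, repeats=5):
    """Return the fastest wall-clock time (seconds) over several calls to fn."""
    timings = []
    for _ in range(repeats):
        t0 = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - t0)
    return min(timings)


data = np.random.default_rng(0).standard_normal((32, 100_000))


def data_handling():
    # Stand-in for the slicing/downsampling stage.
    return data[:, ::100].copy()


def rendering():
    # Placeholder: swap in the actual draw/update call of the backend under test.
    pass


timings = {
    "data_handling": best_time(data_handling),
    "rendering": best_time(rendering),
}
for stage, seconds in timings.items():
    print(f"{stage}: {seconds * 1e3:.3f} ms")
```

Comparing the two numbers across dataset sizes would indicate which stage dominates, and therefore whether the bottleneck falls inside or outside HoloViz+Bokeh scope.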
Note: "Data channel" in MNE means data from the brain |
---|
- 'mag': Magnetometers (scaled by 1e+15 to plot in fT)
- 'grad': Gradiometers (scaled by 1e+13 to plot in fT/cm)
- 'eeg': EEG (scaled by 1e+06 to plot in µV)
- 'csd': Current source density (scaled by 1000 to plot in mV/m²)
- 'seeg': sEEG (scaled by 1000 to plot in mV)
- 'ecog': ECoG (scaled by 1e+06 to plot in µV)
- 'dbs': DBS (scaled by 1e+06 to plot in µV)
- 'hbo': Oxyhemoglobin (scaled by 1e+06 to plot in µM)
- 'hbr': Deoxyhemoglobin (scaled by 1e+06 to plot in µM)
- 'fnirs_cw_amplitude': fNIRS (CW amplitude) (scaled by 1 to plot in V)
- 'fnirs_fd_ac_amplitude': fNIRS (FD AC amplitude) (scaled by 1 to plot in V)
- 'fnirs_fd_phase': fNIRS (FD phase) (scaled by 1 to plot in rad)
- 'fnirs_od': fNIRS (OD) (scaled by 1 to plot in V)
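
As a rough illustration of how the scale factors in the list above turn SI-unit data into plotting units, the sketch below applies a per-channel-type factor to each row of an (n_channels, n_times) array. The dictionary holds a few of the factors copied from the list; the to_plot_units helper is hypothetical and not part of MNE-Python.

```python
import numpy as np

# A few scale factors copied from the list above (SI units -> plotting units).
PLOT_SCALINGS = {"mag": 1e15, "grad": 1e13, "eeg": 1e6, "seeg": 1e3}
PLOT_UNITS = {"mag": "fT", "grad": "fT/cm", "eeg": "µV", "seeg": "mV"}


def to_plot_units(data, ch_types):
    """Scale each row of data (n_channels, n_times) by its channel type's factor."""
    factors = np.array([PLOT_SCALINGS[t] for t in ch_types])
    return data * factors[:, np.newaxis]


data = np.array([[2e-5, -1e-5],    # EEG channel, in V
                 [3e-13, 1e-13]])  # magnetometer channel, in T
ch_types = ["eeg", "mag"]
scaled = to_plot_units(data, ch_types)

for row, ch_type in zip(scaled, ch_types):
    print(ch_type, row, PLOT_UNITS[ch_type])
```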
- raw.get_data() returns:
  - data: NumPy ndarray, shape (n_channels, n_times). Copy of the data in the given range.
  - times: NumPy ndarray, shape (n_times,). Times associated with the data samples. Only returned if return_times=True.
- raw.to_data_frame() provides a pandas DataFrame, either in wide format (a time column plus one column per data channel; epoch and condition columns are added for Epochs objects) or in long format, which has channel (name) and ch_type (group) columns so that the data values are combined into a single value column.