Data Outputs
Data is output from MARGO in three phases. Once the play button is pushed, MARGO generates and saves a .mat file with all meta data during initialization of the experiment. During tracking, MARGO writes raw data fields to the hard drive. Once tracking is finished, a round of protocol-specific, automatic data processing is performed to calculate metrics for higher level features of the raw data.
Raw data is saved frame by frame to binary data files located in the subdirectory under the user-defined save path. Each tracked field is saved independently to its own .bin binary data file. The simplest way to access the data quickly is through MARGO's custom RawDataMaps. However, the binary data files can be accessed via MATLAB's fread function or similar data reading functions from other languages. If attempting to access the raw data directly, keep in mind that the data must be read in the correct precision and dimensions. In addition to the ability to record user-defined raw data, MARGO can record the following pre-defined raw data fields:
N = number of frames
M = number of ROIs
Name | Description | Dimensions | Data Type | Default Output |
---|---|---|---|---|
Centroid | centroid (x,y) of each object in each frame | N x 2 x M | Single | Yes |
Time | time stamps recorded as the interframe interval in seconds between the previous frame and the current frame | N x 1 | Single | Yes |
Dropped Frames | status (true/false) of whether or not each trace was assigned a centroid in each frame | N x M | Logical | Yes |
Area | size of each object in each frame in pixels² (default) or mm² | N x M | Single | No |
Speed | speed of each object in each frame in pixels s⁻¹ (default) or mm s⁻¹ | N x M | Single | No |
Orientation | angle between the major axis of the fitted ellipse of each object and the x-axis in each frame in degrees (-90 to 90) | N x M | Single | No |
Weighted Centroid | centroid (x,y) of each object in each frame weighted by the luminance of each pixel in the background subtraction image | N x 2 x M | Single | No |
Major Axis Length | length (pixels or mm) of the major axis of ellipses fitted to each object in each frame | N x M | Single | No |
Minor Axis Length | length (pixels or mm) of the minor axis of ellipses fitted to each object in each frame | N x M | Single | No |
Heading Direction | angle (radians) of the heading direction of each object in each frame (-π to π) | N x M | Single | No |
Radius | radial position (0-1) of each object in each frame, equivalent to the normalized radial polar coordinate relative to the ROI center | N x M | Single | No |
Four-Quadrant Arctangent | angle (radians) between each object and the center of its ROI in each frame (-π to π), equivalent to the angular polar coordinate relative to the ROI. Reported angles are relative to the positive x-axis with the center of the ROI defined as the origin. | N x M | Single | No |
Video Index | status (true/false) of whether or not video data from each frame was sampled for output video file | N x 1 | Logical | No |
A single object, ExperimentData, allows the user to access all raw and processed data output. The object is saved to a .mat file in an auto-generated, time-stamped directory under the save directory. The ExperimentData object can be easily loaded into MATLAB and is a convenient way to browse and manipulate all the data. ExperimentData contains four core properties:
- Data - Contains custom RawDataField objects which contain meta data about the raw data files and allow the user to dynamically access raw data from the files.
- Meta - Contains experiment meta data (e.g. ROIs, background references, labels, file path information).
- Parameters - Contains all experimental parameters and their values.
- Hardware - Contains hardware objects (e.g. cameras, displays) and associated meta data.
See the expmt subfield tables for complete details of the contents of the ExperimentData object.
All meta data from the experiment is saved to the ExperimentData object during initialization of tracking. By default, MARGO records the following meta data:
- time, date, and duration
- ROI position and dimensions
- label meta data
- camera and other hardware settings
- referencing and tracking parameters
- imaging noise statistics
- raw data file paths, format, and dimensions
- protocol specific parameters
Raw data files can be accessed (and de-accessed) through the ExperimentData object using a custom RawDataMap object. These objects are built on top of MATLAB's memory maps for efficient access to large binary data files. Although MARGO has a much lighter data footprint than raw video data, raw data files can still be cumbersome or impossible to hold in memory: because MARGO can efficiently track and record activity from thousands of individuals over very long timescales, raw data files can easily exceed several gigabytes in size. For this reason, raw data is read dynamically from the hard drive to avoid out-of-memory errors. Mapped raw data can be accessed under expmt.data.(raw field name).raw like a normal MATLAB array.
For example, to assign all centroid data and time stamps to arrays in their native dimensions:
% leaving indices blank reads all the data from
% the binary file in its native dimensions
centroids = expmt.data.centroid.raw();
time_stamps = expmt.data.time.raw();
Even though no indices are specified and the native dimensions of the two data files are different, meta data stored in the MARGO raw data maps is used to ensure that the data is read as the correct data type in the correct dimensions. The first operation reads all centroid data as an N x 2 x M matrix where N = number of frames and M = number of ROIs. The second dimension stores the X and Y coordinates respectively. The second operation reads all the time stamp data and stores it in an N x 1 array. RawDataMaps can also be indexed like a typical array to retrieve only portions of the data.
For example, to assign the X and Y coordinates for ROI #10 to separate variables:
% retrieve all centroid data for ROI #10
x = expmt.data.centroid.raw(:, 1, 10);
y = expmt.data.centroid.raw(:, 2, 10);
RawDataFields and RawDataMaps are used to access data from binary raw data files, such as frame-to-frame centroid data and time stamps. The fields listed below are the default, minimum raw data outputs.
Field | Description | Subfields |
---|---|---|
centroid | RawDataMap object for tracking centroid data (num frames x 2 x num ROIs) | map, path, precision, fID, dimension |
time | RawDataMap object for inter-frame interval (sec) data (num frames x 1) | map, path, precision, fID, dimension |
dropped_frames | RawDataMap object for which objects were tracked in each frame (num frames x num ROIs) | map, path, precision, fID, dimension |
Struct of meta data for the experiment such as ROI data, label info, background reference, and noise correction.
Field | Description | Subfields |
---|---|---|
date | string: timestamp for start of the experiment in MM-DD-YY-hh-mm-ss format | none |
fields | cell array: names of raw data fields | none |
finish | boolean: post-process data | none |
labels_table | table: label data for each ROI | box, comments, day, ID, sex, strain, treatment |
name | string: name of the experimental protocol | none |
noise | struct: pixel noise distributions and other noise correction meta data | dist, mean, std, roi_dist, roi_mean, roi_std |
num_dropped | int: number of frames dropped (i.e. not tracked) for each trace | none |
num_frames | int: number of frames recorded | none |
num_traces | int: number of individual traces tracked | none |
options | struct: post-processing options | areathresh, bouts, bootstrap, disable, handedness, raw, regress, save, slide |
path | struct: file path information for the parent ExperimentData object | dir, name, full, fig |
ref | struct: background reference image and parameters | cen, ct, im, last_update, stack, t, thresh, update |
roi | struct: ROI positions and other meta data | bounds, centers, corners, im, mask, mode, n, orientation, pixIdx, (optional: cam_dist, col, grid, row, shape, tform, vec) |
sex | string: sex of the first labeled group | none |
strain | string: strain name of the first labeled group | none |
source | string: video data input source (camera or video) | none |
track_mode | string: tracking mode (single or multitrack) | none |
treatment | string: experimental condition or treatment of the first labeled group | none |
video | struct: video reader object for video file input and associated meta data | |
vignette | struct: vignette subtraction image and other vignette correction meta data | im, mode |
Struct of hardware devices detected by MARGO and available configurations.
Field | Description | Subfields |
---|---|---|
cam | struct: video input objects and associated parameters | activeID, AdaptorName, bitDepth, DeviceInfo, calibrate, calibration, DeviceIDs, frame_rate, settings, src, vid |
COM | struct: serial COM objects, COM ports, and associated parameters | aux, light, ports |
projector | struct: displays detected through psychtoolbox and associated parameters | reg_params, Fx, Fy |
screen | struct: psychtoolbox parameters for the active display | ifi, screenNumber, vbl, waitframes, window, windowRect, xCenter, yCenter |
Field | Description | Values |
---|---|---|
area_min | minimum blob size for extraction | numeric constant |
bg_adjust | adjust difference image to enhance contrast | false, true |
bg_mode | expected color of background | auto, dark, light |
dilate_el | structuring element for blob dilation | structuring element |
dilate_sz | pixel radius of blob dilation element | numeric constant |
dist_thresh | minimum distance to ROI center for distance ROI sorting | numeric constant |
duration | experiment duration in hours | numeric constant |
erode_el | structuring element for blob erosion | structuring element |
erode_sz | pixel radius of blob erosion element | numeric constant |
estimate_trace_num | flags automatic estimation of number of active traces in each ROI | false, true |
initialized | initialization status of tracking | false, true |
max_trace_duration | required number of consecutive frames for traces to die or revive | numeric constant |
mm_per_pix | millimeter to pixel unit conversion factor | numeric constant |
noise_estimate_missing | status of pixel noise distribution | false, true |
noise_ref_thresh | standard deviations above noise sample required to force background reset | numeric constant |
noise_sample | noise sampling required prior to tracking | false, true |
noise_skip_thresh | standard deviations above noise sample required to skip frames | numeric constant |
ref_depth | number of background reference images in rolling stack | numeric constant |
ref_freq | frequency to add background reference image to rolling stack (per min) | numeric constant |
ref_fun | function used to compute background reference from stack | mean, median |
ref_mode | selects between a rolling (live) reference and a single static reference | live, static |
roi_mode | ROI definition mode | auto, grid |
roi_thresh | automatic ROI segmentation threshold | numeric constant |
roi_tol | number of standard deviations from cluster mean to group ROIs | numeric constant |
sort_mode | ROI sorting criteria | bounds, distance, grid |
speed_thresh | maximum allowed speed of tracked objects (distance/sec) | numeric constant |
track_thresh | struct: video input objects and associated parameters | none |
target_rate | upper limit for acquisition rate (frames/sec) | numeric constant |
num_traces | number of traces per ROI | numeric vector |
units | distance unit of measurement | millimeters, pixels |
vignette_sigma | standard deviation of vignette correction gaussian | numeric constant |
vignette_weight | vignette correction gaussian weight | numeric constant (0-1) |
When tracking is finished, MARGO can be configured to run a protocol-specific data-processing script to perform data pre-processing or analysis. All tracking protocols in MARGO minimally record centroid position and inter-frame interval. As examples of higher-level features, MARGO uses post-processing to generate measures of individual activity, locomotor handedness, and stimulus-evoked behaviors.
MARGO will execute an analyze file for the accompanying run file upon finishing tracking. Processing times can vary greatly depending on your computer and the size of the raw data files. Once analysis is complete, the user will be given the option to view raw centroid data. At this point, the ExperimentData master can be loaded into the MATLAB workspace by copying and executing the command printed to the command window or by browsing to the save directory and manually loading the file.
Users may wish to try processing the data multiple ways or set optional processing flags or parameters. For this reason, MARGO has functions for re-processing data files, repairing broken references to raw data files, and setting optional processing flags.
The simplest way to get started is the analyze_multiFile function. This function allows the user to browse and select a parent directory containing all files to be reprocessed. Once a parent directory is selected, the function searches recursively through all directories underneath it for any .mat files. An optional keyword can be provided to restrict the query to file names containing the keyword argument. The name property of ExperimentData is used to find and execute the accompanying analyze file for each file sequentially. This means that ExperimentData files of many different types can be processed together in batches. The following name-value pairs can be set to customize the analysis:
Name-Value pairs for analyze_multiFile
Name | Description | Values | Default | Data Types |
---|---|---|---|---|
Keyword | Restricts file search to .mat files containing keyword | any string | none | string |
Save | Toggles figure and file saving | true, false | true | binary |
Raw | Sets raw data files to be generated from centroid trace features | 'Speed', 'Direction', 'Theta', 'Radius' | none | string, cell array of strings |
Bootstrap | Toggles parsing speed data into discrete bouts for bootstrap resampling | true, false | false | binary |
Regress | Toggles correction of speed data by regressing out lens distortion | true, false | false | binary |
Handedness | Toggles extrapolation of handedness metrics from centroid traces | true, false | false | binary |
Slide | Toggles calculating a sliding average of speed over time | true, false | false | binary |
AreaThresh | Toggles calculating individual area thresh for parsing floor/ceiling bouts | true, false | false | binary |
The ExperimentData object and the custom RawDataMap methods used to read the raw binary data files cannot operate outside of the MATLAB environment. This section explains how to export MARGO data to file formats that are friendly to other languages and how to read the raw data and experiment meta data outside of MATLAB.
Binary data files are essentially formatted as a long sequence of bits. Accurately reading binary data requires knowing:
- The original format used to write the data
- The original dimensionality of the data (see important note below)
The data format tells us how many bits make up each number in the file (e.g. 32-bit) and how to interpret those bits as a number (e.g. uint32, an unsigned 32-bit integer). The dimensionality tells us how to structure the data (i.e. how many rows, columns, pages, etc.). By default, binary files are read as a long one-dimensional vector.
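As a quick sanity check of these two requirements, the size of a binary file in bytes should equal the number of elements implied by its dimensions multiplied by the bytes per element of its precision. The sketch below (Python, standard library only) illustrates the arithmetic. The `BYTES_PER_ELEMENT` table and the `expected_file_size` helper are illustrative names, not part of MARGO, and the table covers only a few common MATLAB precision strings.

```python
import math

# Bytes per element for a few common MATLAB precision strings.
# (Illustrative subset; extend as needed for other precisions.)
BYTES_PER_ELEMENT = {
    "single": 4,
    "double": 8,
    "uint8": 1,
    "uint16": 2,
    "uint32": 4,
}


def expected_file_size(dim, precision):
    """Return the byte count of a file holding prod(dim) values of `precision`."""
    return math.prod(dim) * BYTES_PER_ELEMENT[precision]


# Example: 258 frames x 2 coordinates x 10 ROIs stored in single precision
size_bytes = expected_file_size([258, 2, 10], "single")
print(size_bytes)  # 20640
```

Comparing this number with the actual file size on disk (for example via `os.path.getsize`) is a cheap way to catch a wrong precision or dimension before reshaping the data.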
MARGO stores meta data about each binary data file in the ExperimentData object (see below for accessing meta data outside MATLAB) for each experiment under the precision and dim properties of the RawDataField associated with each binary file.
>> expmt.data.centroid
ans =
258x2x10 RawDataField array with properties:
raw: [258x2x10 RawDataMap]
path: 'C:/Users/deBivortLab/Documents/MATLAB/margo_data/...
fID: 7
dim: [258 2 10]
precision: 'single'
An example of reading the data in MATLAB is shown below. Note that the first and last dimensions have to be swapped to get the data in the correct format:
% get binary file meta data
dim = expmt.data.centroid.dim;
bin_path = expmt.data.centroid.path;
precision = expmt.data.centroid.precision;
% flip the first and last dimensions (see the note below)
dim = dim([3 2 1]);
num_elements = prod(dim);
% specify precision in the format [write_prcn => read_prcn]
precision = [precision '=>' precision];
% open the binary file, read it, and release the file handle
fid = fopen(bin_path, 'r');
centroids = fread(fid, num_elements, precision);
fclose(fid);
% reshape the data into the dimensions of the binary file (traces x 2 x frames)
centroids = reshape(centroids, dim);
% swap back to frames x 2 x traces to match the dimensions reported in the meta data
centroids = permute(centroids, [3 2 1]);
Most languages have functions for reading & writing binary data files that allow users to specify the format of the data. The table below offers some examples of functions from commonly used languages.
language | function |
---|---|
MATLAB | fread |
Python | numpy.fromfile |
R | readBin |
C/C++ | fread, fscanf |
For readability, MARGO's RawDataField swaps the first dimension of the raw data (number of traces) with the last dimension (number of frames) as the data is returned and as the dimensionality is reported. This means that structuring the data correctly requires flipping the first and last dimensions. For example:
% (left) dimensions as recorded in meta data, (right) dimensionality of the binary file
[258x2x10] -> [10x2x258]
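The same layout can be read outside MATLAB. The sketch below is a dependency-free Python illustration, assuming (as described above) that the file stores values in MATLAB's column-major order with dimensions traces x 2 x frames, in single precision and little-endian byte order. Under that assumption the trace index varies fastest, then the x/y coordinate, then the frame. `read_centroids` is a hypothetical helper, not part of MARGO, and the demo writes a small synthetic file so that the example runs on its own.

```python
import os
import sys
import tempfile
from array import array


def read_centroids(bin_path, n_frames, n_rois):
    """Return centroids indexed as centroids[frame][coordinate][roi]."""
    values = array("f")  # 4-byte floats, i.e. MATLAB 'single'
    with open(bin_path, "rb") as f:
        # raises EOFError if the file holds fewer values than expected
        values.fromfile(f, n_frames * 2 * n_rois)
    if sys.byteorder != "little":
        values.byteswap()  # assumption: the file was written little-endian

    centroids = []
    for frame in range(n_frames):
        start = frame * 2 * n_rois
        x = values[start : start + n_rois].tolist()
        y = values[start + n_rois : start + 2 * n_rois].tolist()
        centroids.append((x, y))
    return centroids


# Demo with a synthetic file: 3 frames, 2 ROIs.
# Each value encodes its position as 100*frame + 10*coordinate + roi,
# written with the ROI index varying fastest.
n_frames, n_rois = 3, 2
synthetic = array("f", [
    100 * fr + 10 * c + r
    for fr in range(n_frames) for c in range(2) for r in range(n_rois)
])
if sys.byteorder != "little":
    synthetic.byteswap()
with tempfile.NamedTemporaryFile(suffix=".bin", delete=False) as tmp:
    synthetic.tofile(tmp)
    demo_path = tmp.name

centroids = read_centroids(demo_path, n_frames, n_rois)
os.remove(demo_path)

x_roi2 = [frame[0][1] for frame in centroids]  # x trace of the second ROI
y_roi2 = [frame[1][1] for frame in centroids]  # y trace of the second ROI
print(x_roi2)  # [1.0, 101.0, 201.0]
print(y_roi2)  # [11.0, 111.0, 211.0]
```

With NumPy installed, the equivalent read would be `numpy.fromfile(path, dtype=numpy.float32).reshape(n_frames, 2, n_rois)`: NumPy's default row-major order makes the last index vary fastest, so under the assumptions above the dimensions reported in the meta data can be used directly.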
Raw data from experiments can be converted from binary data (.bin) to delimited text (.csv) in batch. To convert data to CSV:
1. From the GUI, select File > export > raw data to .csv
2. Browse to and select the parent folder containing the file(s) to convert. In this example, all raw data files in the margo_data folder and all of its subfolders will be exported.
3. Wait for the conversion to complete. This may take a while depending on system specs and file size.
4. Each output file will be saved to the same folder as its associated binary data file. For example:
./margo_data
|
└── 03-30-2019-17-27-40__Basic_Tracking_1-10_Day1
| 03-30-2019-17-27-40__Basic_Tracking_1-10_Day1.mat
|
└── raw_data
03-30-2019-17-27-40__centroid.bin
03-30-2019-17-27-40__centroid_x.csv
03-30-2019-17-27-40__centroid_y.csv
03-30-2019-17-27-40__dropped_frames.bin
03-30-2019-17-27-40__dropped_frames.csv
03-30-2019-17-27-40__time.bin
03-30-2019-17-27-40__time.csv
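Because the time field stores inter-frame intervals rather than absolute time stamps, elapsed time is the running sum of the exported values. A minimal Python sketch follows, using made-up interval values; loading them from the exported file is left out because the exact CSV layout is not described here, and the first interval is assumed to be zero.

```python
from itertools import accumulate

# Hypothetical inter-frame intervals in seconds (first value assumed to be 0)
intervals = [0.0, 0.5, 0.25, 0.25]

# Elapsed time at each frame is the cumulative sum of the intervals
elapsed = list(accumulate(intervals))
print(elapsed)  # [0.0, 0.5, 0.75, 1.0]

# Mean acquisition rate in frames per second over the recording
mean_rate = (len(intervals) - 1) / elapsed[-1]
print(mean_rate)  # 3.0
```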
With MATLAB R2016b or newer, MARGO meta data can be exported to JavaScript Object Notation (JSON), a language-independent text file format used to encode name/value pairs and nested data structures. All basic numeric and text data contained in the ExperimentData object can be stored in a JSON file. To export a file to JSON:
1. From the GUI, select File > export > meta data to .json
2. Browse to and select the parent folder containing the file(s) to convert. In this example, all ExperimentData objects in the margo_data folder and all of its subfolders will be exported.
3. Each output file will be saved to the same folder as its associated ExperimentData object.
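An exported JSON file can be read with any standard JSON parser. The Python sketch below is only an illustration: the key names (`meta`, `num_frames`, `num_traces`) are assumed to mirror the ExperimentData fields documented above and are not a confirmed schema of the export, so a stand-in file is written first to keep the example self-contained.

```python
import json
import os
import tempfile

# Stand-in for an exported file. The keys are ASSUMED to mirror the
# ExperimentData fields documented above; inspect a real export to confirm.
example = {"meta": {"name": "Basic Tracking", "num_frames": 258, "num_traces": 10}}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as tmp:
    json.dump(example, tmp)
    json_path = tmp.name

# Load the meta data into nested dictionaries and lists
with open(json_path) as f:
    expmt = json.load(f)
os.remove(json_path)

# Use .get so that a missing key yields None instead of raising an error
meta = expmt.get("meta", {})
print(meta.get("num_frames"), meta.get("num_traces"))  # 258 10
```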
Depending on the selected tracking protocol, MARGO may output figures during post-processing. All figures are saved to an auto-generated figures directory under the user-specified save location. The path to the figure directory is stored in expmt.meta.path.fig in the master data container.
Users will be prompted to browse raw trace data upon tracking completion. Select browse traces to open a simple GUI for plotting centroid data. Traces from any ExperimentData object can be browsed at any time by loading the .mat file and running:
% plot raw centroid data in the margo trace browser
plotTraces(expmt)