Live Script Performance Overview

ehennestad edited this page Oct 2, 2023 · 8 revisions

Brain-Observatory-Toolbox v0.9.3

The following table shows the runtime for many of the live scripts across different platforms for v0.9.3:

| Platform \ Live script | Ephys Quickstart | Ophys Quickstart | Ephys Demo | Ophys Demo | Ephys Tutorial | Ophys Tutorial |
|---|---|---|---|---|---|---|
| MATLAB Online (1st run) | 246 sec | 119 sec | N/A | 134 sec | 353 sec | 123 sec |
| MATLAB Online (2nd run) | 13 sec | 5 sec | N/A | 10 sec | 295 sec | 5 sec |
| DandiHub (1st run) | 253 sec | 117 sec | 465 sec | 121 sec | 327 sec | 104 sec |
| DandiHub (2nd run) | 16 sec | 5 sec | 185 sec | 20 sec | 43 sec | 5 sec |
| Local machine(*) (1st run) | 440 sec | 215 sec | 599 sec | 192 sec | 747 sec | 212 sec |
| Local machine(*) (2nd run) | 0.97 sec | 0.86 sec | 73 sec | 5.1 sec | 30 sec | 0.55 sec |

(*) Local machine refers to a 2021 MacBook M1 Pro (3.2 GHz 10 core CPU / 16GB Memory, 100Mbps)

Summary

For each 1st run, the cache was completely reset, so both the item tables and the NWB files had to be downloaded. For each 2nd run, the file cache was present, but all in-memory caches were cleared to mimic the experience of starting a new MATLAB session.
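The two measurement conditions can be illustrated with a small, self-contained sketch. This is not the toolbox's code: `TwoLevelCache`, `fake_download`, and `run_script` are hypothetical names, and the download is simulated with a short sleep. It only shows the idea of a cold run (everything downloaded) followed by a warm run with the file cache kept and the in-memory cache cleared.

```python
import json
import tempfile
import time
from pathlib import Path


class TwoLevelCache:
    """Toy cache: an in-memory dict in front of a directory of JSON files."""

    def __init__(self, cache_dir, download):
        self.cache_dir = Path(cache_dir)
        self.download = download  # callable(key) -> JSON-serializable value
        self.memory = {}
        self.download_count = 0

    def get(self, key):
        if key in self.memory:  # fastest: already in memory
            return self.memory[key]
        path = self.cache_dir / f"{key}.json"
        if path.exists():  # slower: read from the file cache
            value = json.loads(path.read_text())
        else:  # slowest: download, then store on disk
            value = self.download(key)
            self.download_count += 1
            path.write_text(json.dumps(value))
        self.memory[key] = value
        return value

    def clear_memory(self):
        """Mimic starting a new session: keep files, drop in-memory data."""
        self.memory.clear()

    def reset(self):
        """Mimic a 1st run: drop in-memory data and delete all cached files."""
        self.clear_memory()
        for path in self.cache_dir.glob("*.json"):
            path.unlink()


def fake_download(key):
    # Stand-in for a real download (hypothetical; just sleeps briefly).
    time.sleep(0.05)
    return {"name": key, "rows": list(range(100))}


def run_script(cache):
    # Stand-in for a live script that needs an item table and one NWB file.
    for key in ("item_table", "session_nwb"):
        cache.get(key)


def timed(func):
    start = time.perf_counter()
    func()
    return time.perf_counter() - start


cache = TwoLevelCache(tempfile.mkdtemp(), fake_download)

cache.reset()  # 1st run: nothing cached anywhere
first_run = timed(lambda: run_script(cache))

cache.clear_memory()  # 2nd run: file cache present, memory cleared
second_run = timed(lambda: run_script(cache))

print(f"1st run: {first_run:.3f} s, 2nd run: {second_run:.3f} s")
```

In this sketch the second run performs no downloads, so its duration reflects only the cost of reading from the file cache.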

On MATLAB Online, the EphysDemo currently cannot be run because of timeout or memory issues. The EphysTutorial sometimes runs, but at other times it fails with an h5 file related error (which also appears to be memory related). On DandiHub, the OphysDemo cannot be run because some MATLAB toolboxes are missing (Image Processing Toolbox, Statistics and Machine Learning Toolbox).

On the online MATLAB platforms, download speeds are fairly consistent, but on the local machine they vary considerably. All the Ophys live scripts are quick to run if the data are already available. They all use the same session item, so in practice, once a user has run one of them, the remaining ones will be quick to run. This is not the case for the Ephys live scripts, where each live script uses a different session item.

Metadata download / cache retrieval

A significant portion of the runtime for the live scripts is the download/retrieval of metadata. The table below shows the time to download (1st run) or retrieve metadata from cache (2nd run).

| Platform \ Dataset | Ephys Metadata | Ophys Metadata |
|---|---|---|
| MATLAB Online (1st run) | 56 sec | 99 sec |
| MATLAB Online (2nd run) | 11 sec | 4 sec |
| DandiHub (1st run) | 79 sec | 95 sec |
| DandiHub (2nd run) | 13 sec | 4 sec |
| Local machine(*) (1st run) | 52 sec | 171 sec |
| Local machine(*) (2nd run) | 12 sec | 4 sec |

(*) Local machine refers to a 2021 MacBook M1 Pro (3.2 GHz 10 core CPU / 16GB Memory, 100Mbps)
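To put the metadata timings in context, the sketch below computes the metadata time as a share of the Quickstart live script runtime for each 1st run in v0.9.3. The numbers are copied from the two tables on this page. It assumes the metadata timings and the live script timings come from comparable runs, which the page does not state explicitly, so the percentages are only indicative.

```python
# 1st-run timings in seconds, copied from the v0.9.3 tables on this page.
quickstart_total = {
    "MATLAB Online": {"Ephys": 246, "Ophys": 119},
    "DandiHub": {"Ephys": 253, "Ophys": 117},
    "Local machine": {"Ephys": 440, "Ophys": 215},
}
metadata_time = {
    "MATLAB Online": {"Ephys": 56, "Ophys": 99},
    "DandiHub": {"Ephys": 79, "Ophys": 95},
    "Local machine": {"Ephys": 52, "Ophys": 171},
}


def metadata_share(platform, dataset):
    """Metadata time as a rounded percentage of the Quickstart runtime."""
    total = quickstart_total[platform][dataset]
    return round(100 * metadata_time[platform][dataset] / total)


shares = {
    (platform, dataset): metadata_share(platform, dataset)
    for platform in quickstart_total
    for dataset in ("Ephys", "Ophys")
}

for (platform, dataset), percent in shares.items():
    print(f"{platform:14s} {dataset}: {percent}% of Quickstart 1st run")
```

Under that assumption, metadata accounts for roughly 80% of the Ophys Quickstart 1st run on every platform, but a much smaller share (about 12% to 31%) of the Ephys Quickstart 1st run.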

Improvements

  • The most significant improvement would be to update all the Ephys live scripts to use the same session item, while making sure this session item still fits the scientific narrative of each live script.

  • Additionally, for the EphysDemo, improvements can be made to the EphysSession/getPresentationwiseSpikeCounts method, more specifically to the local function zlclBuildSpikeHistogram. On the local machine, runtime for this method was reduced from ~27 sec to ~1 sec with targeted modifications to the underlying functions.

  • Another option is to exclude the sections on LFP / CSD data from the Ephys live scripts (Demo+Tutorial). These data are large and take a long time both to download and to load from file, while the current examples using these data are minimal. An alternative is to create an additional live script showing the usage of LFP/CSD data.

  • A significant contribution to the runtime of the Ophys live scripts is the downloading and parsing of the ophys metadata (item tables), specifically the cell metadata (cell_metadata.json). This could be improved if the metadata were available in a different file format, e.g. CSV. This has been raised as an issue on the Allen SDK GitHub repository. Fetching the item tables for the Ophys dataset takes ~80 seconds on DandiHub, about 80% of the total runtime.

  • Implementing issue #156 offers additional speedup for EphysQuickstart and EphysTutorial, as the session NWB download is not necessary for these live scripts.
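The page does not describe which modifications were made to the spike counting functions. As a general illustration only, the sketch below shows one common way to count spikes per stimulus presentation efficiently: if the spike times are sorted, two binary searches per presentation window replace a scan over all spikes. The function name `count_spikes_per_presentation` and the example data are hypothetical and are not part of the toolbox.

```python
from bisect import bisect_left


def count_spikes_per_presentation(spike_times, windows):
    """Count spikes in each half-open window [start, stop).

    Assumes spike_times is sorted in ascending order, so each window
    needs only two binary searches instead of a scan over all spikes.
    """
    counts = []
    for start, stop in windows:
        first = bisect_left(spike_times, start)  # first spike at or after start
        last = bisect_left(spike_times, stop)    # first spike at or after stop
        counts.append(last - first)
    return counts


# Made-up spike times (seconds) and presentation windows for illustration.
spike_times = [0.10, 0.25, 0.40, 1.05, 1.20, 2.70]
windows = [(0.0, 0.5), (1.0, 1.5), (2.0, 2.5)]

counts = count_spikes_per_presentation(spike_times, windows)
print(counts)
```

With N spikes and P presentations, this approach costs on the order of P log N comparisons, whereas checking every spike against every window costs on the order of N times P.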

Brain-Observatory-Toolbox v0.9.4 (preliminary results)

The following table shows the runtime for many of the live scripts across different platforms for v0.9.4. Values in parentheses give the change in seconds relative to v0.9.3:

| Platform \ Live script | Ephys Quickstart | Ophys Quickstart | Ephys Demo | Ophys Demo | Ephys Tutorial | Ophys Tutorial |
|---|---|---|---|---|---|---|
| MATLAB Online (1st run) | 85 (-161) sec** | 113 sec | 303 sec** | 111 sec | 122 (-231) sec** | 101 sec |
| MATLAB Online (2nd run) | N/A | 5 sec | N/A | 10 sec | N/A | 5 sec |
| DandiHub (1st run) | 192 (-60) sec | 121 sec | 384 (-81) sec | 129 sec | 222 (-105) sec | 110 sec |
| DandiHub (2nd run) | 16 sec | 6 sec | 85 (-100) sec | 26 sec | 46 sec | 6 sec |
| Local machine(*) (1st run) | 364 sec | 252 sec | 776 sec | 277 sec | 366 sec | 205 sec |
| Local machine(*) (2nd run) | 12 sec | 6 sec | 58 sec | 9 sec | 48 sec | 4 sec |

(*) Local machine refers to a 2021 MacBook M1 Pro (3.2 GHz 10 core CPU / 16GB Memory, 100Mbps)

(**) The Ephys live scripts on MATLAB Online are run using a "scratch" directory, so a second run takes as long as the first run. For this reason, no separate 2nd run times are reported for them.

Example incorporating suggestions for improvements applied to EphysDemo

The following results are for the EphysDemo live script tested on DandiHub. Some of the modifications are also possible for the EphysTutorial.

| Description | 1st run | 2nd run |
|---|---|---|
| v0.9.3 | 454 sec | 192 sec |
| after updating zlclFindSpikeCounts(*) | 336 sec | 68 sec |
| after excluding display of LFP data | 312 sec | 47 sec |
| after excluding sections on LFP/CSD data | 231 sec | 45 sec |
| using the session with the smallest NWB file | 173 sec | 39 sec |

(*) Also includes some minor performance improvements in other EphysSession functions

Note: Each row includes all the previous modifications.
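Because each row is cumulative, the contribution of an individual modification is the difference between consecutive rows. The short sketch below works this out from the numbers in the table above. It is plain arithmetic on the reported timings; the helper names are illustrative only.

```python
# (description, 1st run in sec, 2nd run in sec), copied from the table above.
rows = [
    ("v0.9.3", 454, 192),
    ("after updating zlclFindSpikeCounts", 336, 68),
    ("after excluding display of LFP data", 312, 47),
    ("after excluding sections on LFP/CSD data", 231, 45),
    ("using the session with the smallest NWB file", 173, 39),
]


def step_savings(rows, column):
    """Seconds saved by each modification relative to the previous row."""
    values = [row[column] for row in rows]
    return [prev - cur for prev, cur in zip(values, values[1:])]


def total_reduction_percent(rows, column):
    """Overall reduction from the first row to the last row, in percent."""
    baseline, final = rows[0][column], rows[-1][column]
    return round(100 * (baseline - final) / baseline)


first_run_savings = step_savings(rows, 1)
second_run_savings = step_savings(rows, 2)

for (description, _, _), first, second in zip(
    rows[1:], first_run_savings, second_run_savings
):
    print(f"{description}: -{first} sec (1st run), -{second} sec (2nd run)")

print(f"Total reduction, 1st run: {total_reduction_percent(rows, 1)}%")
print(f"Total reduction, 2nd run: {total_reduction_percent(rows, 2)}%")
```

For the 1st run, the largest single savings come from the spike counting update (118 sec) and from excluding the LFP/CSD sections (81 sec). For the 2nd run, almost all of the savings come from the spike counting update (124 of 153 sec).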