Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOC: First pass at SurfaceImage BIAP #1056

Closed
wants to merge 23 commits into from

Conversation

effigies
Copy link
Member

Haven't fully covered what's in the HackMD, but did expand a bit where we made some decisions that didn't get written down. Also tried to pin down some terminology, as I think it's easy to speak imprecisely here. Some of this may be conventions that we need to enforce by transforming data on input (as we do with volumetric images come with affines that project into something besides RAS+).

Everything is still negotiable, so please feel free to make drastic suggestions.

Will start on implementations next week unless I feel inspired sooner.

cc @larsoner @htwangtw @NicolasGensollen @matthew-brett

@codecov
Copy link

codecov bot commented Sep 17, 2021

Codecov Report

Merging #1056 (e278981) into master (a93f4b3) will decrease coverage by 0.32%.
The diff coverage is 42.85%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1056      +/-   ##
==========================================
- Coverage   92.22%   91.89%   -0.33%     
==========================================
  Files         100      100              
  Lines       12212    12322     +110     
  Branches     2136     2410     +274     
==========================================
+ Hits        11262    11323      +61     
- Misses        622      675      +53     
+ Partials      328      324       -4     
Impacted Files Coverage Δ
nibabel/surfaceimages.py 42.85% <42.85%> (ø)
nibabel/streamlines/array_sequence.py 97.97% <0.00%> (-0.66%) ⬇️
nibabel/viewers.py 95.49% <0.00%> (-0.65%) ⬇️
nibabel/data.py 86.99% <0.00%> (ø)
nibabel/affines.py 100.00% <0.00%> (ø)
nibabel/arrayproxy.py 100.00% <0.00%> (ø)
nibabel/nicom/dicomwrappers.py 90.90% <0.00%> (ø)
nibabel/_h5py_compat.py
nibabel/streamlines/tractogram.py 99.68% <0.00%> (+<0.01%) ⬆️
nibabel/analyze.py 98.52% <0.00%> (+<0.01%) ⬆️
... and 13 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update a93f4b3...e278981. Read the comment docs.

* GIFTI: :py:class:`~nibabel.gifti.gifti.GiftiImage`
* Every image contains a collection of data arrays, which may be
coordinates, topology, or data (further subdivided by type and intent)
* CIFTI-2: :py:class:`~nibabel.cifti2.cifti2.Cifti2Image`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i wouldn't completely call cifti-2 a surface format. there is no geometry information stored in the file itself (unless someone hacked it. it's a point cloud that happens to have been extracted from some combination of a surface geometry and a volume geometry.

also many of the cifti-2 types are parcel based.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed CIFTI supports multimodal data not purely for surface. It kind of deserve its own category.
IMO with the popularity of HCP data and fmriprep supporting output in fsLR template, surface based data support for CIFTI is too common to ignore. A major road block for user is data I/O. Working on how CIFTI-2 image relates to geometry template is a starting point none the less.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I meant by "Pure data array". Fair point that it accepts data that has no geometric basis (parcels). Made a note of that case.

Copy link

@NicolasGensollen NicolasGensollen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @effigies !
I just made a quick pass but it looks good to me so far.

doc/source/devel/biaps/biap_0009.rst Outdated Show resolved Hide resolved
doc/source/devel/biaps/biap_0009.rst Show resolved Hide resolved
Copy link
Contributor

@bthirion bthirion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thx. Just left tiny comments.

doc/source/devel/biaps/biap_0009.rst Show resolved Hide resolved
doc/source/devel/biaps/biap_0009.rst Outdated Show resolved Hide resolved
doc/source/devel/biaps/biap_0009.rst Outdated Show resolved Hide resolved
@effigies
Copy link
Member Author

@danjgale I would be curious about your thoughts, given that you have the surfplot package. What would make working with surface data smoother?

@danjgale
Copy link

danjgale commented Sep 30, 2021

@effigies thanks for the tag. TBH I found working with surfaces really easy when putting together surfplot, and I'm trying to recall any issues I had. When I eventually get back to polishing it up (soon🤞) and come across any issues, I can let you know.

The biggest challenges I've had primarily concern manipulating CIFTIs, such as changing the label table, creating a new CIFTI file from a modified data array, etc. Here's an example manipulation I've done in a different package, though I'm not sure how applicable/relevant these challenges are to the current topic.

Copy link
Contributor

@larsoner larsoner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some preliminary small stuff, looks good so far!

* Geometry (e.g. ``lh.pial``):
:py:func:`~nibabel.freesurfer.io.read_geometry` /
:py:func:`~nibabel.freesurfer.io.write_geometry`
* Data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth detailing the number of dimensions supported in each of these, unless they all support arbitrary dimensions. Some or all (or none, I'm not sure!) of these could be a single scalar per vertex, so you can't for example have the n-dimensionality mentioned in the Data array definition above, you're stuck with n=0 (per vertex)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered that n=0, since one axis is the vertex.

Prominent use cases
===================

* Arithmetic/modeling - per-vertex mathematical operations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think as long as there is a way to expose per-vertex data ndarray, we can just say "use NumPy-compatible tools" for this


* Arithmetic/modeling - per-vertex mathematical operations
* Smoothing - topology/geometry-respecting smoothing
* Plotting - paint the data array as a texture on a surface
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having worked with VTK / Mayavi / PyVista / VisPy / mplot3d each a bit, I think this is going to be more difficult than it seems. This to me seems better tackled by a separate package, otherwise maintenance will be difficult.

I see the text below about NiBabel not necessarily providing each operation, so maybe adding to the Proposal below explicitly what functionality is out of scope would be good?

Comment on lines 107 to 108
These are not necessarily operations that NiBabel must provide, but the
components needed for each should be readily retrieved.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would put this at the top of this section -- when reading the list first, I assumed that the goal was to support/implement all prominent use cases.

@pep8speaks
Copy link

pep8speaks commented Oct 20, 2021

Hello @effigies, Thank you for updating!

Line 112:38: W291 trailing whitespace

To test for issues locally, pip install flake8 and then run flake8 nibabel.

Comment last updated at 2022-02-11 02:14:46 UTC

@larsoner
Copy link
Contributor

larsoner commented Nov 5, 2021

Sorry about missing the meeting today. I didn't get to the FreeSurfer stuff in the last two weeks but it looks like you started (yay!). Let me know if you want me to try it or add to it or anything else

@effigies
Copy link
Member Author

effigies commented Nov 5, 2021

Hey @larsoner, I've spent the morning fiddling with a collection API, if you want to have a look. I did the FreeSurfer stuff just to ground myself before diving into CIFTI and CaretSpecFile (now workbench specs), so probably won't be spending much more (self-directed) time on it, so you should feel free to play with it and see what it still needs to be useful.

@neurolabusc
Copy link

@effigies you may want to consider handling the mz3 format, the page includes Python, Blender, JavaScript code as well as a full specification and examples datasets. It attempts to live at the Pareto frontier of file size and speed. It is both faster and smaller than GIfTI. The layout design maps directly onto modern GPU architectures (and hence WebGL, Metal and Vulkan APIs). I developed this primarily for the sample files distributed with Surfice, to reduce download times, disk usage and the time to run the demos / regression testing. It is intentionally simple, and does not have the flexibility of other formats. It does not attempt to do everything, but it can fill a niche.

@effigies
Copy link
Member Author

@neurolabusc Thanks! Opened an issue to avoid this getting lost...

Copy link
Contributor

@larsoner larsoner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some preliminary testing from me, but looks like a good start. I can see the direction from the code and it makes sense.

bold.load_geometry("/data/anat/hemi-L_midthickness.surf.gii")
# Not implementing networkx weighted graph here, so assume we have a method
graph = bold.geometry.get_graph()
distances = distance_matrix(graph) # n_coords x n_coords matrix
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is maybe going to be harder than we think:

>>> rr, tris = nib.freesurfer.read_geometry('/mnt/bakraid/data/structurals/fsaverage/surf/lh.white')
>>> from mne.surface import mesh_dist
>>> adjacency = mesh_dist(tris, rr)
>>> t0 = time.time(); dist_matrix = dijkstra(adjacency, directed=False, limit=10.); print(time.time() - t0)
...
numpy.core._exceptions.MemoryError: Unable to allocate 200. GiB for an array with shape (163842, 163842) and data type float64

So we'll need to improve SciPy's dijkstra to support sparse outputs start...

If we don't use the distance-based method and instead use the "smudging" operation by pushing data to neighboring vertices using a number of steps, this process is avoided. I guess this is a bit of an actual implementation detail, but we should think about these two methods as options with different tradeoffs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, full pairwise distances is painfully naive; not sure if iterated smudging is equivalent to distance-based (up to accumulated floating point error) but seems reasonable.

Mostly as a bookmark to myself, it might be worth looking into https://github.com/gilShamai/Fast-Pairwise-Geodesic-Distance-Computation to see if they have any good ideas.


class FreeSurferSubject(GeometryCollection):
@classmethod
def from_subject(klass, subject_id, subjects_dir=None):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So far:

>>> from nibabel.tests.test_surfaceimages import FreeSurferSubject
>>> subj = FreeSurferSubject.from_subject('fsaverage', '/mnt/bakraid/data/structurals')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/home/larsoner/python/nibabel/nibabel/tests/test_surfaceimages.py", line 210, in from_subject
    return klass.from_directory(Path(subjects_dir) / subject_id)
AttributeError: type object 'FreeSurferSubject' has no attribute 'from_directory'

So some unit tests are clearly needed :)

And if I change the call to use from_spec, then there is no self referred to in that class method

return ap


class FreeSurferHemisphere(SurfaceGeometry):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to work well as a subclass in local testing

@effigies effigies mentioned this pull request Nov 19, 2021
@jbteves
Copy link

jbteves commented Feb 8, 2022

@larsoner @effigies where should I put some toy code as an example of converting an STC to a GIFTI? Should that go here, or should that be something we put in MNE first, and then add to this API later?

@effigies
Copy link
Member Author

effigies commented Feb 8, 2022

Would a gist be sensible in the short term?

@jbteves
Copy link

jbteves commented Feb 9, 2022

Copy link
Contributor

@larsoner larsoner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feel free to merge and address my comments later @effigies , already looks like a great start!

dlpfc_idcs = img.coordaxis.get_indices(parcel=structure, indices=vtx_idcs)

# Subsampled coordinate axes will override any duplicate information from header
dlpfc_img = CoordinateImage(img.dataobj[dlpfc_idcs], img.coordaxis[dlpfc_idcs], img.header)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or in syntactic sugar mode via __getitem__:

dlpfc_img = img[dlpfc_idcs]

and under the hood you just do the operation above?

# Now load geometry so we can plot
wbspec = CaretSpec("fsLR.wb.spec")
dlpfc_img.coordaxis.load_structures(wbspec)
...
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add an example of trivial HDF5-based round-trip?

Native HDF5 format support
--------------------------
To support saving without loss of information, we also support serialization
to HDF5:
...

bold = CoordinateImage.from_filename("/data/func/hemi-L_bold.func.gii")
dm = make_first_level_design_matrix(...)
labels, results = run_glm(bold.get_fdata(), dm)
betas = CoordinateImage(results["betas"], bold.coordaxis, bold.header)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing to note here is that betas will have collapsed over the time axis of the data, so the bold.header might be incorrect now, or otherwise need adjustment.

# that retrieves a graph for each structure
graphs = get_graphs(bold.coordaxis)
distances = distance_matrix(graphs['lh']) # n_coords x n_coords matrix
weights = normalize(gaussian(distances, sigma))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
weights = normalize(gaussian(distances, sigma))
weights = normalize(gaussian(distances, sigma)) # sparse matrix

distances = distance_matrix(graphs['lh']) # n_coords x n_coords matrix
weights = normalize(gaussian(distances, sigma))
# Wildly inefficient smoothing algorithm
smoothed = CoordinateImage(weights @ bold.get_fdata(), bold.coordaxis, bold.header)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As you mentioned, this will kill most people's machines. Maybe you need to do this for GIfTI, but you shouldn't need to for the native HDF5 format, or for most operations (including plotting) -- it can just be done on the fly.

WDYT about an upsampler property or way to specify how to go from the subsampled / parcel space to the full-resolution space? By default / None could mean "map directly to the specified parcel, leave all else zero", but you in principle should be able to have this also be a scipy.sparse CSR or CSC matrix of shape (n_full_space, n_parcel) that you can @ with your data to get to the full-res space i.e., high-res surface for surfaces, or T1 / 1mm voxel spacing (usually) for volumetric.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants