-
Notifications
You must be signed in to change notification settings - Fork 259
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DOC: First pass at SurfaceImage BIAP #1056
Conversation
Codecov Report
@@ Coverage Diff @@
## master #1056 +/- ##
==========================================
- Coverage 92.22% 91.89% -0.33%
==========================================
Files 100 100
Lines 12212 12322 +110
Branches 2136 2410 +274
==========================================
+ Hits 11262 11323 +61
- Misses 622 675 +53
+ Partials 328 324 -4
Continue to review full report at Codecov.
|
* GIFTI: :py:class:`~nibabel.gifti.gifti.GiftiImage` | ||
* Every image contains a collection of data arrays, which may be | ||
coordinates, topology, or data (further subdivided by type and intent) | ||
* CIFTI-2: :py:class:`~nibabel.cifti2.cifti2.Cifti2Image` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i wouldn't completely call cifti-2 a surface format. there is no geometry information stored in the file itself (unless someone hacked it. it's a point cloud that happens to have been extracted from some combination of a surface geometry and a volume geometry.
also many of the cifti-2 types are parcel based.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed CIFTI supports multimodal data not purely for surface. It kind of deserve its own category.
IMO with the popularity of HCP data and fmriprep supporting output in fsLR template, surface based data support for CIFTI is too common to ignore. A major road block for user is data I/O. Working on how CIFTI-2 image relates to geometry template is a starting point none the less.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what I meant by "Pure data array". Fair point that it accepts data that has no geometric basis (parcels). Made a note of that case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @effigies !
I just made a quick pass but it looks good to me so far.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thx. Just left tiny comments.
@danjgale I would be curious about your thoughts, given that you have the surfplot package. What would make working with surface data smoother? |
@effigies thanks for the tag. TBH I found working with surfaces really easy when putting together surfplot, and I'm trying to recall any issues I had. When I eventually get back to polishing it up (soon🤞) and come across any issues, I can let you know. The biggest challenges I've had primarily concern manipulating CIFTIs, such as changing the label table, creating a new CIFTI file from a modified data array, etc. Here's an example manipulation I've done in a different package, though I'm not sure how applicable/relevant these challenges are to the current topic. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some preliminary small stuff, looks good so far!
* Geometry (e.g. ``lh.pial``): | ||
:py:func:`~nibabel.freesurfer.io.read_geometry` / | ||
:py:func:`~nibabel.freesurfer.io.write_geometry` | ||
* Data |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Might be worth detailing the number of dimensions supported in each of these, unless they all support arbitrary dimensions. Some or all (or none, I'm not sure!) of these could be a single scalar per vertex, so you can't for example have the n-dimensional
ity mentioned in the Data array
definition above, you're stuck with n=0 (per vertex)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered that n=0
, since one axis is the vertex.
Prominent use cases | ||
=================== | ||
|
||
* Arithmetic/modeling - per-vertex mathematical operations |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think as long as there is a way to expose per-vertex data ndarray
, we can just say "use NumPy-compatible tools" for this
|
||
* Arithmetic/modeling - per-vertex mathematical operations | ||
* Smoothing - topology/geometry-respecting smoothing | ||
* Plotting - paint the data array as a texture on a surface |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having worked with VTK / Mayavi / PyVista / VisPy / mplot3d each a bit, I think this is going to be more difficult than it seems. This to me seems better tackled by a separate package, otherwise maintenance will be difficult.
I see the text below about NiBabel not necessarily providing each operation, so maybe adding to the Proposal
below explicitly what functionality is out of scope would be good?
doc/source/devel/biaps/biap_0009.rst
Outdated
These are not necessarily operations that NiBabel must provide, but the | ||
components needed for each should be readily retrieved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would put this at the top of this section -- when reading the list first, I assumed that the goal was to support/implement all prominent use cases.
Hello @effigies, Thank you for updating!
To test for issues locally, Comment last updated at 2022-02-11 02:14:46 UTC |
282a40c
to
0f72055
Compare
Sorry about missing the meeting today. I didn't get to the FreeSurfer stuff in the last two weeks but it looks like you started (yay!). Let me know if you want me to try it or add to it or anything else |
Hey @larsoner, I've spent the morning fiddling with a collection API, if you want to have a look. I did the FreeSurfer stuff just to ground myself before diving into CIFTI and CaretSpecFile (now workbench specs), so probably won't be spending much more (self-directed) time on it, so you should feel free to play with it and see what it still needs to be useful. |
@effigies you may want to consider handling the mz3 format, the page includes Python, Blender, JavaScript code as well as a full specification and examples datasets. It attempts to live at the Pareto frontier of file size and speed. It is both faster and smaller than GIfTI. The layout design maps directly onto modern GPU architectures (and hence WebGL, Metal and Vulkan APIs). I developed this primarily for the sample files distributed with Surfice, to reduce download times, disk usage and the time to run the demos / regression testing. It is intentionally simple, and does not have the flexibility of other formats. It does not attempt to do everything, but it can fill a niche. |
@neurolabusc Thanks! Opened an issue to avoid this getting lost... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just some preliminary testing from me, but looks like a good start. I can see the direction from the code and it makes sense.
doc/source/devel/biaps/biap_0009.rst
Outdated
bold.load_geometry("/data/anat/hemi-L_midthickness.surf.gii") | ||
# Not implementing networkx weighted graph here, so assume we have a method | ||
graph = bold.geometry.get_graph() | ||
distances = distance_matrix(graph) # n_coords x n_coords matrix |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is maybe going to be harder than we think:
>>> rr, tris = nib.freesurfer.read_geometry('/mnt/bakraid/data/structurals/fsaverage/surf/lh.white')
>>> from mne.surface import mesh_dist
>>> adjacency = mesh_dist(tris, rr)
>>> t0 = time.time(); dist_matrix = dijkstra(adjacency, directed=False, limit=10.); print(time.time() - t0)
...
numpy.core._exceptions.MemoryError: Unable to allocate 200. GiB for an array with shape (163842, 163842) and data type float64
So we'll need to improve SciPy's dijkstra to support sparse
outputs start...
If we don't use the distance-based method and instead use the "smudging" operation by pushing data to neighboring vertices using a number of steps, this process is avoided. I guess this is a bit of an actual implementation detail, but we should think about these two methods as options with different tradeoffs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, full pairwise distances is painfully naive; not sure if iterated smudging is equivalent to distance-based (up to accumulated floating point error) but seems reasonable.
Mostly as a bookmark to myself, it might be worth looking into https://github.com/gilShamai/Fast-Pairwise-Geodesic-Distance-Computation to see if they have any good ideas.
|
||
class FreeSurferSubject(GeometryCollection): | ||
@classmethod | ||
def from_subject(klass, subject_id, subjects_dir=None): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So far:
>>> from nibabel.tests.test_surfaceimages import FreeSurferSubject
>>> subj = FreeSurferSubject.from_subject('fsaverage', '/mnt/bakraid/data/structurals')
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/home/larsoner/python/nibabel/nibabel/tests/test_surfaceimages.py", line 210, in from_subject
return klass.from_directory(Path(subjects_dir) / subject_id)
AttributeError: type object 'FreeSurferSubject' has no attribute 'from_directory'
So some unit tests are clearly needed :)
And if I change the call to use from_spec
, then there is no self
referred to in that class method
nibabel/tests/test_surfaceimages.py
Outdated
return ap | ||
|
||
|
||
class FreeSurferHemisphere(SurfaceGeometry): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to work well as a subclass in local testing
Would a gist be sensible in the short term? |
Sure, here you go: https://gist.github.com/jbteves/384bfdd3777190f59620c3d988a314fd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel free to merge and address my comments later @effigies , already looks like a great start!
dlpfc_idcs = img.coordaxis.get_indices(parcel=structure, indices=vtx_idcs) | ||
|
||
# Subsampled coordinate axes will override any duplicate information from header | ||
dlpfc_img = CoordinateImage(img.dataobj[dlpfc_idcs], img.coordaxis[dlpfc_idcs], img.header) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or in syntactic sugar mode via __getitem__
:
dlpfc_img = img[dlpfc_idcs]
and under the hood you just do the operation above?
# Now load geometry so we can plot | ||
wbspec = CaretSpec("fsLR.wb.spec") | ||
dlpfc_img.coordaxis.load_structures(wbspec) | ||
... |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add an example of trivial HDF5-based round-trip?
Native HDF5 format support
--------------------------
To support saving without loss of information, we also support serialization
to HDF5:
...
bold = CoordinateImage.from_filename("/data/func/hemi-L_bold.func.gii") | ||
dm = make_first_level_design_matrix(...) | ||
labels, results = run_glm(bold.get_fdata(), dm) | ||
betas = CoordinateImage(results["betas"], bold.coordaxis, bold.header) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing to note here is that betas
will have collapsed over the time axis of the data, so the bold.header
might be incorrect now, or otherwise need adjustment.
# that retrieves a graph for each structure | ||
graphs = get_graphs(bold.coordaxis) | ||
distances = distance_matrix(graphs['lh']) # n_coords x n_coords matrix | ||
weights = normalize(gaussian(distances, sigma)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
weights = normalize(gaussian(distances, sigma)) | |
weights = normalize(gaussian(distances, sigma)) # sparse matrix |
distances = distance_matrix(graphs['lh']) # n_coords x n_coords matrix | ||
weights = normalize(gaussian(distances, sigma)) | ||
# Wildly inefficient smoothing algorithm | ||
smoothed = CoordinateImage(weights @ bold.get_fdata(), bold.coordaxis, bold.header) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As you mentioned, this will kill most people's machines. Maybe you need to do this for GIfTI, but you shouldn't need to for the native HDF5 format, or for most operations (including plotting) -- it can just be done on the fly.
WDYT about an upsampler
property or way to specify how to go from the subsampled / parcel space to the full-resolution space? By default / None could mean "map directly to the specified parcel, leave all else zero", but you in principle should be able to have this also be a scipy.sparse
CSR or CSC matrix of shape (n_full_space, n_parcel)
that you can @
with your data to get to the full-res space i.e., high-res surface for surfaces, or T1 / 1mm voxel spacing (usually) for volumetric.
Haven't fully covered what's in the HackMD, but did expand a bit where we made some decisions that didn't get written down. Also tried to pin down some terminology, as I think it's easy to speak imprecisely here. Some of this may be conventions that we need to enforce by transforming data on input (as we do with volumetric images come with affines that project into something besides RAS+).
Everything is still negotiable, so please feel free to make drastic suggestions.
Will start on implementations next week unless I feel inspired sooner.
cc @larsoner @htwangtw @NicolasGensollen @matthew-brett