Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

DOC: First pass at SurfaceImage BIAP #1056

Closed
wants to merge 23 commits into from
Closed
Show file tree
Hide file tree
Changes from 16 commits
Commits
Show all changes
23 commits
Select commit Hold shift + click to select a range
cb7d7d3
DOC: First pass at SurfaceImage BIAP
effigies Sep 17, 2021
ea2a466
DOC: Address suggestions
effigies Sep 29, 2021
0f4fddc
DOC: Clarify use cases are motivating, not necessarily implementation…
effigies Oct 8, 2021
3b2b575
DOC: Small updates
effigies Oct 8, 2021
9ddbcc7
DOC: Update BIAP with a couple examples leading to further questions
effigies Oct 8, 2021
27fd6f4
DOC: Separate Geometry and Header objects
effigies Oct 8, 2021
3e89a50
DOC: Smoothing example, typo
effigies Oct 20, 2021
b51e7a0
ENH: First pass at surfaceimage template classes
effigies Oct 20, 2021
0f72055
TEST: Build HDF5/Numpy-based surface classes
effigies Oct 21, 2021
408a227
DOC: Add VolumeGeometry stub
effigies Nov 5, 2021
79a801e
DOC: Fix header formatting
effigies Nov 5, 2021
eabc77f
TEST: Example FreeSurfer subject (not fitting into class hierarchy)
effigies Nov 5, 2021
199547f
DOC: Add concatenable structure proposal
effigies Nov 5, 2021
efef027
ENH: Add structure collection API
effigies Nov 5, 2021
d88c7c6
TEST: Rewrite FreeSurferSubject as GeometryCollection
effigies Nov 5, 2021
25e6d52
ENH: Possible VolumeGeometry
effigies Nov 5, 2021
e6be497
STY: Geometry -> Pointset
effigies Nov 8, 2021
3f62a80
Rename SurfaceGeometry to TriangularMesh
effigies Nov 19, 2021
dcd2050
FIX: FreeSurfer example implementation
effigies Nov 19, 2021
ff19edc
ENH: Flesh out VolumeGeometry
effigies Nov 19, 2021
4af4897
BIAP: Add SurfaceHeader.get_geometry() method
effigies Nov 19, 2021
a3adfe7
Rename Geometry -> Pointset
effigies Jan 14, 2022
e278981
DOC: Commit current thinking on BIAP0009
effigies Feb 11, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
309 changes: 309 additions & 0 deletions doc/source/devel/biaps/biap_0009.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,309 @@
.. _biap9:

#############################
BIAP9 - The Surface Image API
#############################

:Author: Chris Markiewicz
:Status: Draft
:Type: Standards
:Created: 2021-09-16

**********
Background
**********

Surface data is generally kept separate from geometric metadata
===============================================================

In contrast to volumetric data, whose geometry can be fully encoded in the
shape of a data array and a 4x4 affine matrix, data sampled to a surface
requires the location of each sample to be explicitly represented by a
coordinate. In practice, the most common approach is to have a geometry file
and a data file.

A geometry file consists of a vertex coordinate array and a triangle array
describing the adjacency of vertices, while a data file is an n-dimensional
array with one axis corresponding to vertex.

Keeping these files separate is a pragmatic optimization to avoid costly
reproductions of geometric data, but presents an administrative burden to
direct consumers of the data.

Terminology
===========

For the purposes of this BIAP, the following terms are used:

* Coordinate - a triplet of floating point values in RAS+ space
* Vertex - an index into a table of coordinates
* Triangle (or face) - a triplet of adjacent vertices (A-B-C);
effigies marked this conversation as resolved.
Show resolved Hide resolved
the normal vector for the face is ($\overline{AB}\times\overline{AC}$)
* Topology - vertex adjacency data, independent of vertex coordinates,
typically in the form of a list of triangles
* Geometry - topology + a specific set of coordinates for a surface
* Patch - a connected subset of vertices (can be the full topology)
* Data array - an n-dimensional array with one axis corresponding to the
vertices (typical) OR faces (more rare) in a patch


Currently supported surface formats
===================================

* FreeSurfer
* Geometry (e.g. ``lh.pial``):
:py:func:`~nibabel.freesurfer.io.read_geometry` /
:py:func:`~nibabel.freesurfer.io.write_geometry`
* Data
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be worth detailing the number of dimensions supported in each of these, unless they all support arbitrary dimensions. Some or all (or none, I'm not sure!) of these could be a single scalar per vertex, so you can't for example have the n-dimensionality mentioned in the Data array definition above, you're stuck with n=0 (per vertex)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I considered that n=0, since one axis is the vertex.

* Morphometry:
:py:func:`~nibabel.freesurfer.io.read_morph_data` /
:py:func:`~nibabel.freesurfer.io.write_morph_data`
* Labels: :py:func:`~nibabel.freesurfer.io.read_label`
* MGH: :py:class:`~nibabel.freesurfer.mghformat.MGHImage`
* GIFTI: :py:class:`~nibabel.gifti.gifti.GiftiImage`
* Every image contains a collection of data arrays, which may be
coordinates, topology, or data (further subdivided by type and intent)
* CIFTI-2: :py:class:`~nibabel.cifti2.cifti2.Cifti2Image`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i wouldn't completely call cifti-2 a surface format. there is no geometry information stored in the file itself (unless someone hacked it. it's a point cloud that happens to have been extracted from some combination of a surface geometry and a volume geometry.

also many of the cifti-2 types are parcel based.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed CIFTI supports multimodal data not purely for surface. It kind of deserve its own category.
IMO with the popularity of HCP data and fmriprep supporting output in fsLR template, surface based data support for CIFTI is too common to ignore. A major road block for user is data I/O. Working on how CIFTI-2 image relates to geometry template is a starting point none the less.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I meant by "Pure data array". Fair point that it accepts data that has no geometric basis (parcels). Made a note of that case.

* Pure data array, with image header containing flexible axes
* Geometry referred to by an associated ``wb.spec`` file
(no current implementation in NiBabel)
* Possible to have one with no geometric information, e.g., parcels x time


*********************************
Desiderata for a SurfaceImage API
*********************************

The following are provisional guiding principles

1. A surface image (data array) should carry a reference to geometric metadata
that is easily transferred to a new image.
2. Partial images (data only or geometry only) should be possible. Absence of
components should have a well-defined signature, such as a property that is
``None`` or a specific ``Exception`` is raised.
3. All arrays (coordinates, triangles, data arrays) should be proxied to
avoid excess memory consumption
4. Selecting among coordinates (e.g., gray/white boundary, inflated surface)
for a single topology should be possible.
5. Combining multiple brain structures (canonically, left and right hemispheres)
in memory should be easy; serializing to file may be format-specific.
6. Splitting a data array into independent patches that can be separately
operated on and serialized should be possible.


Prominent use cases
===================

We consider the following use cases for working with surface data.
A good API will make retrieving the components needed for each use case
straightforward, as well as storing the results in new images.

* Arithmetic/modeling - per-vertex mathematical operations
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think as long as there is a way to expose per-vertex data ndarray, we can just say "use NumPy-compatible tools" for this

* Smoothing - topology/geometry-respecting smoothing
* Plotting - paint the data array as a texture on a surface
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having worked with VTK / Mayavi / PyVista / VisPy / mplot3d each a bit, I think this is going to be more difficult than it seems. This to me seems better tackled by a separate package, otherwise maintenance will be difficult.

I see the text below about NiBabel not necessarily providing each operation, so maybe adding to the Proposal below explicitly what functionality is out of scope would be good?

* Decimation - subsampling a topology (possibly a subset, possibly with
interpolated vertex locations)
* Resampling to a geometrically-aligned surface
* Downsampling by decimating, smoothing, resampling
* Inter-subject resampling by using ``?h.sphere.reg``
* Interpolation of per-vertex and per-face data arrays

When possible, we prefer to expose NumPy ``ndarray``\s and
allow use of numpy, scipy, scikit-learn. In some cases, it may
make sense for NiBabel to provide methods.

********
Proposal
********

The basic API is as follows:

.. code-block:: python

class Geometry:
@property
def n_coords(self):
""" Number of coordinates """

def get_coords(self, name=None):
effigies marked this conversation as resolved.
Show resolved Hide resolved
""" Nx3 array of coordinates in RAS+ space """


class SurfaceGeometry(Geometry):
@property
def n_triangles(self):
""" Number of faces """

def get_triangles(self, name=None):
""" Mx3 array of indices into coordinate table """

def get_mesh(self, name=None):
return self.get_coords(name=name), self.get_triangles(name=name)

def get_names(self):
""" List of surface names that can be passed to
``get_{coords,triangles,mesh}``
"""

def decimate(self, *, n_coords=None, ratio=None):
""" Return a SurfaceHeader with a smaller number of vertices that
preserves the geometry of the original """
# To be overridden when a format provides optimization opportunities

def load_vertex_data(self, filename):
""" Return a SurfaceImage with data corresponding to each vertex """

def load_face_data(self, filename):
""" Return a SurfaceImage with data corresponding to each face """


class SurfaceHeader(FileBasedHeader):
...


class SurfaceImage(FileBasedImage):
@property
def header(self):
""" A SurfaceHeader """

@property
def geometry(self):
""" A SurfaceGeometry or None """

@property
def dataobj(self):
""" An ndarray or ArrayProxy with one of the following properties:

1) self.dataobj.shape[0] == self.header.ncoords
2) self.dataobj.shape[0] == self.header.nfaces
"""

def load_geometry(self, pathlike):
""" Specify a geometry for a data-only image """


To enable a similar interface to raveled voxel data:

.. code-block:: python

class VolumeGeometry(Geometry):
_affine # Or _affines, if we want multiple, e.g. qform, sform
_shape
_ijk_coords

def n_coords(self):
""" Number of coordinates """

def get_coords(self):
""" Nx3 array of coordinates in RAS+ space """


Here we present common use cases:


Modeling
========

.. code-block:: python

from nilearn.glm.first_level import make_first_level_design_matrix, run_glm

bold = SurfaceImage.from_filename("/data/func/hemi-L_bold.func.gii")
dm = make_first_level_design_matrix(...)
labels, results = run_glm(bold.get_data(), dm)
betas = bold.__class__(results["betas"], bold.header)
betas.to_filename("/data/stats/hemi-L_betas.mgz")

Data images have their own metadata apart from geometry that needs handling in a header:

* Axis labels (time series, labels, tensors)
* Data dtype


Smoothing
=========

.. code-block:: python

bold = SurfaceImage.from_filename("/data/func/hemi-L_bold.func.gii")
bold.load_geometry("/data/anat/hemi-L_midthickness.surf.gii")
# Not implementing networkx weighted graph here, so assume we have a method
graph = bold.geometry.get_graph()
distances = distance_matrix(graph) # n_coords x n_coords matrix
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is maybe going to be harder than we think:

>>> rr, tris = nib.freesurfer.read_geometry('/mnt/bakraid/data/structurals/fsaverage/surf/lh.white')
>>> from mne.surface import mesh_dist
>>> adjacency = mesh_dist(tris, rr)
>>> t0 = time.time(); dist_matrix = dijkstra(adjacency, directed=False, limit=10.); print(time.time() - t0)
...
numpy.core._exceptions.MemoryError: Unable to allocate 200. GiB for an array with shape (163842, 163842) and data type float64

So we'll need to improve SciPy's dijkstra to support sparse outputs start...

If we don't use the distance-based method and instead use the "smudging" operation by pushing data to neighboring vertices using a number of steps, this process is avoided. I guess this is a bit of an actual implementation detail, but we should think about these two methods as options with different tradeoffs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, full pairwise distances is painfully naive; not sure if iterated smudging is equivalent to distance-based (up to accumulated floating point error) but seems reasonable.

Mostly as a bookmark to myself, it might be worth looking into https://github.com/gilShamai/Fast-Pairwise-Geodesic-Distance-Computation to see if they have any good ideas.

weights = normalize(gaussian(distances, sigma))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
weights = normalize(gaussian(distances, sigma))
weights = normalize(gaussian(distances, sigma)) # sparse matrix

smoothed = bold.__class__(bold.get_fdata() @ weights, bold.header)
smoothed.to_filename(f"/data/func/hemi-L_smooth-{sigma}_bold.func.gii")


Plotting
========

Nilearn currently provides a
`plot_surf <https://nilearn.github.io/modules/generated/nilearn.plotting.plot_surf.html>`_ function.
With the proposed API, we could interface as follows:

.. code-block:: python

def plot_surf_img(img, surface="inflated"):
from nilearn.plotting import plot_surf
coords, triangles = img.geometry.get_mesh(name=surface)

data = tstats.get_data()

return plot_surf((triangles, coords), data)

tstats = SurfaceImage.from_filename("/data/stats/hemi-L_contrast-taskVsBase_tstat.mgz")
tstats.load_geometry("/data/subjects/fsaverage5")
plot_surf_img(tstats)

This assumes that ``load_geometry()`` will be able to identify a FreeSurfer subject directory.


**************
Open questions
**************

1) Can/should image and geometry objects be promiscuous? If I load a GIFTI geometry,
should ``geometry.load_vertex_data()`` accept a FreeSurfer curvature file, or
should there be a bit more ecosystem-local logic?


********************************
Concatenable structures proposal
********************************

A brain is typically described with multiple structures,
such as left and right hemispheres and a volumetric subcortical region.
We will consider two reference cases:

1) FreeSurfer, in which a subject directory can be used to retrieve
multiple surfaces or volumetric regions. Data arrays are specific to a
single structure.
2) CIFTI-2, which permits an axis to have type ``CIFTI_INDEX_TYPE_BRAIN_MODELS``,
indicating that each index along an axis corresponds to a vertex or voxel in
an associated surface or volume.
The data array thus may span multiple structures.

In the former case, the structures may be considered an unordered collection;
in the latter, the sequence of structures directly corresponds to segments of
the data array.

.. code-block:: python

class GeometryCollection:
def get_geometry(self, name):
""" Return a geometry by name """

@property
def names(self):
""" A set of structure names """

@classmethod
def from_spec(klass, pathlike):
""" Load a collection of geometries from a specification, broadly construed. """


class GeometrySequence(GeometryCollection, Geometry):
# Inherit and override get_coords, n_coords from Geometry
def get_indices(self, name):
""" Return indices for data array corresponding to a named geometry """
1 change: 1 addition & 0 deletions doc/source/devel/biaps/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -19,6 +19,7 @@ proposals.
biap_0006
biap_0007
biap_0008
biap_0009

.. toctree::
:hidden:
Expand Down
Loading