Populate space_unit attribute upon dataset loading/creation? #386

Open
niksirbi opened this issue Jan 24, 2025 · 0 comments
Labels
question Further information is requested

Comments

@niksirbi
Member

Proposal

PR #384 introduces an attribute, provisionally named space_unit, that specifies the spatial unit (e.g. "mm" or "pixels"). Transformation functions such as scale can optionally add this attribute.

I wonder whether it might be beneficial to create this attribute right from the start when loading a dataset via from_file() or from_numpy(), for both "poses" and "bboxes" datasets.

This would mirror our handling of the time_unit attribute:

  • Set to "frames" if fps=None.
  • Set to "seconds" if a valid fps value is supplied.

Making this change would likely require updates to load_poses.py, load_bboxes.py, and possibly validators/datasets.py. For all of our supported formats, with the possible exception of "Anipose", the spatial unit would be "pixels".
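
To make the proposal concrete, here is a minimal sketch of the defaulting rule. The helper name default_unit_attrs and its parameters are hypothetical and do not exist in the codebase; treating a "valid" fps as any positive number and leaving the "Anipose" unit undecided (None) are assumptions made only for illustration.

```python
def default_unit_attrs(fps=None, source_software=None):
    """Return illustrative unit attributes for a newly loaded dataset.

    Hypothetical helper, not an existing function.
    """
    # Assumption: a "valid" fps is any positive number.
    if fps is not None and fps <= 0:
        raise ValueError("fps must be a positive number or None")

    # Existing rule for time_unit, as described above.
    attrs = {"time_unit": "frames" if fps is None else "seconds"}

    # Proposed rule for space_unit: "pixels" for every supported format.
    # "Anipose" is flagged as a potential exception, so it is left
    # undecided (None) in this sketch.
    attrs["space_unit"] = None if source_software == "Anipose" else "pixels"
    return attrs


print(default_unit_attrs())
print(default_unit_attrs(fps=30))
```

Where these attributes should be stored (Dataset or DataArray) is a separate question, discussed under Caveats.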

Caveats

  • time_unit is currently a Dataset-level attribute, because both the "position" and "confidence" arrays have a time dimension.
  • space_unit in the transforms module is added at the DataArray level, which makes sense since those transforms operate on individual data arrays. As a result, transforms only modify the attribute on a per-array basis.
  • It is not entirely clear where space_unit should be added when creating a "poses" or "bboxes" dataset. Logically, it might belong at the DataArray level (because confidence does not have a space dimension). For a "poses" dataset, space_unit would then be an attribute of position, whereas for a "bboxes" dataset it would be an attribute of both position and shape.
  • If we retain time_unit at the Dataset level but introduce space_unit at the DataArray level, it becomes more difficult to unify them into a common units={dim_name: dim_unit} dictionary. Perhaps all dimension units should be defined at the array level.
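
The trade-off in these caveats can be sketched with a toy xarray dataset. The array shapes, the dimension names, and the per-array units dictionary are illustrative assumptions, not a description of what the loaders currently produce.

```python
import numpy as np
import xarray as xr

# Toy "poses"-like dataset: position has a "space" dimension, confidence does not.
position = xr.DataArray(
    np.zeros((5, 2, 3, 1)),
    dims=("time", "space", "keypoints", "individuals"),
)
confidence = xr.DataArray(
    np.ones((5, 3, 1)),
    dims=("time", "keypoints", "individuals"),
)
ds = xr.Dataset({"position": position, "confidence": confidence})

# Current convention: time_unit lives on the Dataset.
ds.attrs["time_unit"] = "frames"

# Option A: add space_unit only to arrays that have a "space" dimension.
for name in ds.data_vars:
    if "space" in ds[name].dims:
        ds[name].attrs["space_unit"] = "pixels"

# Option B (assumed layout): every array carries its own {dim_name: dim_unit} dict.
for name in ds.data_vars:
    units = {"time": ds.attrs["time_unit"]}
    if "space" in ds[name].dims:
        units["space"] = "pixels"
    ds[name].attrs["units"] = units

print(ds["position"].attrs)
print(ds["confidence"].attrs)
```

With option A the two units end up at different levels of the dataset. Option B keeps them together, at the cost of repeating the time unit on every array.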

Alternative

We could leave things as they are: space_unit would not exist by default and would only be added by transformations, at which point the user must specify or decide upon the spatial unit.
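
For comparison, the status-quo behaviour might look roughly like the sketch below. scale_sketch is a hypothetical stand-in for a transform such as scale; its signature and its handling of a missing unit (dropping any earlier space_unit) are assumptions, not the actual implementation from PR #384.

```python
import numpy as np
import xarray as xr


def scale_sketch(data, factor=1.0, space_unit=None):
    """Scale a data array and optionally record the resulting spatial unit.

    Hypothetical stand-in for a transform; the real function may differ.
    """
    scaled = data * factor
    # Set attrs explicitly rather than relying on xarray's attribute propagation.
    scaled.attrs = dict(data.attrs)
    # Assumption: an earlier unit is no longer valid after scaling, so drop it
    # unless the caller states the new unit.
    scaled.attrs.pop("space_unit", None)
    if space_unit is not None:
        scaled.attrs["space_unit"] = space_unit
    return scaled


# A freshly created array has no space_unit at all.
position = xr.DataArray(np.ones((4, 2)), dims=("time", "space"))
in_mm = scale_sketch(position, factor=0.5, space_unit="mm")
unitless = scale_sketch(position, factor=2)

print(in_mm.attrs)
print(unitless.attrs)
```

Under this alternative, the attribute appears only once the user has supplied a unit explicitly.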
