Annotations utils #18

Open
3 of 4 tasks
sfmig opened this issue Dec 10, 2024 · 2 comments
Comments


sfmig commented Dec 10, 2024

  • Read VIA/COCO json annotation files
  • Save as VIA/COCO json annotation file
  • Utils
    • Combine annotation files
    • Filter out bboxes based on min/max area
    • Check for duplicate bboxes and (optionally) correct
      • see movement validations for bboxes dataset
    • Print and/or plot summary statistics for manual annotations
    • Filter out bboxes if boundaries are outside image?
    • Filter out bboxes if inside a certain region (use shapely)
    • Convert between VIA and COCO json
      • via "standard" dataframe

Probably a good idea to handle each outer bullet point in a separate PR.
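The duplicate-check utility from the list above could be sketched with plain pandas. This is a minimal sketch, not the module's implementation; the standard-dataframe column names (`image_filename`, `x`, `y`, `width`, `height`) are assumptions:

```python
import pandas as pd

# Hypothetical standard bbox dataframe: one row per annotation.
# Column names are assumptions about the "standard" dataframe format.
df = pd.DataFrame(
    {
        "image_filename": ["img1.png", "img1.png", "img2.png"],
        "x": [10.0, 10.0, 50.0],
        "y": [20.0, 20.0, 60.0],
        "width": [30.0, 30.0, 40.0],
        "height": [30.0, 30.0, 40.0],
    }
)

bbox_cols = ["image_filename", "x", "y", "width", "height"]

# flag duplicates: same image and identical bbox geometry
is_duplicate = df.duplicated(subset=bbox_cols)
n_duplicates = int(is_duplicate.sum())

# (optionally) correct by dropping the duplicated rows
df_clean = df.drop_duplicates(subset=bbox_cols)
```

A stricter check (near-duplicates with high IoU rather than exact matches) would need a pairwise comparison instead of `duplicated`.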


sfmig commented Dec 17, 2024

The scripts shared on the VIA site may be useful to check.
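For reference, a minimal sketch of what parsing VIA rectangle annotations into a flat dataframe could look like. The `_via_img_metadata` / `shape_attributes` layout follows the VIA 2.x project json, but treat both that layout and the output column names as assumptions to verify against those scripts:

```python
import pandas as pd

# Minimal VIA-style annotation payload (assumed VIA 2.x project json layout)
via_data = {
    "_via_img_metadata": {
        "img1.png-1": {
            "filename": "img1.png",
            "regions": [
                {
                    "shape_attributes": {
                        "name": "rect",
                        "x": 10,
                        "y": 20,
                        "width": 30,
                        "height": 40,
                    }
                }
            ],
        }
    }
}

# flatten rectangle regions into one row per bbox
rows = []
for img in via_data["_via_img_metadata"].values():
    for region in img["regions"]:
        shape = region["shape_attributes"]
        if shape.get("name") == "rect":
            rows.append(
                {
                    "image_filename": img["filename"],
                    "x": shape["x"],
                    "y": shape["y"],
                    "width": shape["width"],
                    "height": shape["height"],
                }
            )

df_bboxes = pd.DataFrame(rows)
```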


sfmig commented Jan 31, 2025

From PR #31

Mock script

A mock script of the annotations module
import pandas as pd
from shapely import Point, Polygon

from ethology.annotations.curation import (
    remove_inside_polygon,
    remove_outside_image,
)
from ethology.annotations.io import (
    df_bboxes_from_file,
    df_bboxes_to_COCO_file,
    df_bboxes_to_VIA_file,
    df_keypoints_from_file,
)
from ethology.annotations.transforms import (
    compute_bboxes_from_keypoints,
    compute_centroid_from_keypoints,
    compute_masks_from_keypoints,
)

############################################
## Read data from two files, option 1
df_bboxes_1 = df_bboxes_from_file("path/to/annotations_1.json")
df_bboxes_2 = df_bboxes_from_file("path/to/annotations_2.json")

# combine dataframes
df_bboxes = pd.concat([df_bboxes_1, df_bboxes_2])
# and remove duplicates
df_bboxes = df_bboxes.drop_duplicates()

############################################
## Read data from two files, option 2
# or: we combine and remove duplicates in one step
df_bboxes = df_bboxes_from_file(
    ["path/to/annotations_1.json", "path/to/annotations_2.json"]
)

############################################
# Filter out boxes whose boundaries fall outside the image
# (x, y) is the bbox centroid, so the edges are at x ± 0.5*width, y ± 0.5*height;
# keep only boxes fully inside the image
image_size = (1000, 1000)
df_bboxes = df_bboxes[
    df_bboxes["x"] + 0.5 * df_bboxes["width"] <= image_size[0]
]
df_bboxes = df_bboxes[
    df_bboxes["y"] + 0.5 * df_bboxes["height"] <= image_size[1]
]
df_bboxes = df_bboxes[df_bboxes["x"] - 0.5 * df_bboxes["width"] >= 0]
df_bboxes = df_bboxes[df_bboxes["y"] - 0.5 * df_bboxes["height"] >= 0]

# or:
df_bboxes = remove_outside_image(df_bboxes, image_size)

############################################
# Filter out boxes that are within a specified polygon using shapely
polygon = Polygon([(0, 0), (0, 100), (100, 100), (100, 0)])
df_bboxes = df_bboxes[
    df_bboxes.apply(
        lambda x: not polygon.contains(Point(x["x"], x["y"])), axis=1
    )
]
# or
df_bboxes = remove_inside_polygon(
    df_bboxes, polygon
)  # with default columns to check in the dataframe: 'x', 'y'

############################################
# Print summary statistics
print(df_bboxes.describe())

############################################
# Export data

#  as a VIA json file
df_bboxes_to_VIA_file(df_bboxes, "path/to/output_VIA_file.json")

# as a COCO json file
df_bboxes_to_COCO_file(df_bboxes, "path/to/output_COCO_file.json")

# as a csv file with specified header
# df_bboxes.to_csv("path/to/output_csv_file.csv", header=["x", "y", "width", "height"])

############################################
# Transforms module
# other nice-to-have-in-the-future

# read SLEAP annotated data
df_keypoints = df_keypoints_from_file(
    "path/to/annotations_keypoints.slp",
    source_software="SLEAP",
)  # reads data into df with standard headings

# compute bounding boxes from keypoints
df_bboxes = compute_bboxes_from_keypoints(df_keypoints)

# compute RLE masks from keypoints
# (could prompt SAM2 under the hood?)
df_masks = compute_masks_from_keypoints(df_keypoints)
# plus additional arguments to determine the buffer, etc.

# compute centroid from keypoints
df_centroid_kpts = compute_centroid_from_keypoints(df_keypoints)


# read idtracker data
df_keypoints = df_keypoints_from_file(
    "path/to/annotations_idtracker.csv",  # or .npy or .json
    source_software="idtracker",
)  # reads data into df with standard headings

# compute bounding boxes from keypoints
df_bboxes = compute_bboxes_from_keypoints(df_keypoints)
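The min/max-area filter from the task list isn't covered by the mock script; a minimal sketch, reusing the same assumed standard-dataframe columns (`width`, `height`) and hypothetical thresholds, could be:

```python
import pandas as pd

# Hypothetical bbox dataframe using the assumed standard columns
df_bboxes = pd.DataFrame(
    {
        "x": [5.0, 50.0, 80.0],
        "y": [5.0, 50.0, 80.0],
        "width": [2.0, 20.0, 200.0],
        "height": [2.0, 20.0, 200.0],
    }
)

# assumed area thresholds, in pixels^2
min_area, max_area = 10.0, 10_000.0

# keep boxes whose area is within [min_area, max_area]
area = df_bboxes["width"] * df_bboxes["height"]
df_filtered = df_bboxes[(area >= min_area) & (area <= max_area)]
```

Here the three boxes have areas 4, 400 and 40000 px², so only the middle one survives the filter.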
