This repository includes resources for The Caltech Fish Counting Dataset: A Benchmark for Multiple-Object Tracking and Counting (ECCV 2022):
- Links to download the dataset and annotations
- Evaluation code to reproduce our results and evaluate new algorithms
- A script to convert raw sonar frames into the enhanced format used by the Baseline++ method in the paper
Data can be downloaded from CaltechDATA using the following links.
Training, validation, and testing images [123 GB]
- Running `md5sum` on the tar.gz file should produce: 176648e618fc5013db972aa7ded01517 fish_counting_frames.tar.gz
We also make available a Tiny Dataset [1.44 GB], a subset of the full dataset intended for exploring the data without downloading everything. It includes data from all 6 river locations/subsets (elwha, kenai-channel, kenai-rightbank, kenai-train, kenai-val, nushagak), with 20 clips (videos) from each river and up to 50 consecutive frames from each clip. The subdirectory and file structure follow the formats listed below.
- Running `md5sum` on the tar.gz file should produce: 6ae6062d50a90d4b084ebe476a392c0e tiny_dataset.tar.gz
Metadata
- Running `md5sum` on the tar.gz file should produce: 152286bd6f25f965aadf41e8a0c44140 fish_counting_metadata.tar.gz
MOT-style Annotations
- Running `md5sum` on the tar.gz file should produce: 34c6bbd5e9187f05bfc0be72df002b19 fish_counting_annotations.tar.gz
COCO-style Annotations [11 MB]
- Running `md5sum` on the tar.gz file should produce: 92694f9df235cf1e089f65e06faa9c82 coco_formatted_annotations.tar.gz
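If `md5sum` is not available on your system, the same check can be done from Python. A minimal sketch (the filename is whichever archive you downloaded):

```python
import hashlib

def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file in streaming fashion."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Compare the printed digest against the checksum listed above.
print(md5_of("fish_counting_frames.tar.gz"))
```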
Frames are provided as single-channel JPGs. After extracting the tar.gz, frames are organized as follows. Each location described in the paper -- Kenai Left Bank (training & validation), Kenai Right Bank, Kenai Channel, Nushagak, and Elwha -- has its own directory, with subdirectories for each video clip at that location.
frames/
  raw/
    kenai-train/
      One directory per video sequence in the training set.
        Images are named by frame index in the video clip, e.g. 0.jpg, 1.jpg, ...
    kenai-val/
      One directory per video sequence in the validation set.
        0.jpg, 1.jpg, ...
    kenai-rightbank/
      ...
    kenai-channel/
      ...
    nushagak/
      ...
    elwha/
      ...
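As an illustration of this layout, here is a minimal sketch that walks the clips for one location and counts their frames; the root path and location name are assumptions, so adjust them to wherever you extracted the archive:

```python
import os

FRAMES_ROOT = "frames/raw"   # assumed extraction location
LOCATION = "kenai-val"       # any of the six location directories

location_dir = os.path.join(FRAMES_ROOT, LOCATION)
for clip_name in sorted(os.listdir(location_dir)):
    clip_dir = os.path.join(location_dir, clip_name)
    if not os.path.isdir(clip_dir):
        continue
    # Frames are named by integer index, so sort numerically rather than lexically.
    frame_files = sorted(
        (f for f in os.listdir(clip_dir) if f.endswith(".jpg")),
        key=lambda f: int(os.path.splitext(f)[0]),
    )
    print(clip_name, len(frame_files), "frames")
```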
The 3-channel frames used by our Baseline++ method can be generated using convert.py:
python convert.py --in_dir PATH/TO/frames/raw --out_dir PATH/TO/OUTPUT/DIRECTORY
The directory structure will be maintained.
Clip-level metadata is provided for each video clip in the dataset. One JSON file is provided for each location: kenai-train.json, kenai-val.json, kenai-rightbank.json, kenai-channel.json, nushagak.json, and elwha.json. Each JSON file contains a list of dictionaries, one per video clip. Each entry contains the following metadata:
{ // Information for a single clip
  "clip_name":          // Unique ID for this clip; matches the name of the directory containing image frames
  "num_frames":         // Number of frames in the video clip
  "upstream_direction": // Either `left` or `right`
  "width":              // Image width in pixels
  "height":             // Image height in pixels
  "framerate":          // Video frame rate, in frames per second
  "x_meter_start":      // Meter distance from the sonar camera at x = 0
  "x_meter_stop":       // Meter distance from the sonar camera at x = width-1
  "y_meter_start":      // Meter distance from the sonar camera at y = 0
  "y_meter_stop":       // Meter distance from the sonar camera at y = height-1
}
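As a usage sketch, the snippet below loads one metadata file and maps a pixel column to a distance in meters. The file path is an example, and the linear interpolation between x_meter_start and x_meter_stop is an assumption implied by those fields:

```python
import json

# Load clip metadata for one location (path is an assumption; adjust to your extraction).
with open("metadata/kenai-val.json") as f:
    clips = json.load(f)

def x_pixel_to_meters(x, clip):
    # Assumes a linear mapping between x = 0 and x = width - 1,
    # as suggested by the x_meter_start / x_meter_stop fields.
    frac = x / (clip["width"] - 1)
    return clip["x_meter_start"] + frac * (clip["x_meter_stop"] - clip["x_meter_start"])

clip = clips[0]
print(clip["clip_name"], x_pixel_to_meters(clip["width"] // 2, clip), "meters at image center")
```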
We provide annotations in COCO format as well as MOTChallenge format. For both, we use the default directory structure as described here. After extracting the tar.gz, the directory structure is as follows:
annotations/
  kenai-train/
    One directory per video sequence in the training set.
      gt.txt
  kenai-val/
    One directory per video sequence in the validation set.
      gt.txt
  kenai-rightbank/
    ...
  kenai-channel/
    ...
  nushagak/
    ...
  elwha/
    ...
Following the MOTChallenge format, each gt.txt file contains one entry per track per frame. Each line contains 10 values:
<frame_number>, <track_id>, <bb_left>, <bb_top>, <bb_width>, <bb_height>, <conf>, <x>, <y>, <z>
The world coordinates x, y, z are ignored for 2D data and are filled with -1. For ground truth tracks, conf=-1 as well. All frame numbers, target IDs, and bounding boxes are 1-indexed (i.e. the minimum bb_left and bb_top values are 1, not 0). Here is an example:
1, 3, 794.27, 247.59, 71.245, 174.88, -1, -1, -1, -1
1, 6, 1648.1, 119.61, 66.504, 163.24, -1, -1, -1, -1
1, 8, 875.49, 399.98, 95.303, 233.93, -1, -1, -1, -1
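A minimal parsing sketch for these files (the path in the usage comment is a placeholder):

```python
import csv

def load_mot_annotations(gt_path):
    """Parse a MOTChallenge-format gt.txt into a list of dicts."""
    entries = []
    with open(gt_path) as f:
        for row in csv.reader(f):
            frame, track_id = int(row[0]), int(row[1])
            bb_left, bb_top, bb_w, bb_h = map(float, row[2:6])
            entries.append({
                "frame": frame,                          # 1-indexed frame number
                "track_id": track_id,
                "bbox": (bb_left, bb_top, bb_w, bb_h),   # 1-indexed pixel coordinates
            })
    return entries

# Example usage (placeholder path):
# tracks = load_mot_annotations("annotations/kenai-val/{clip_name}/gt.txt")
```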
coco_formatted_annotations/
  kenai-train/
    coco.json
  kenai-val/
    coco.json
  kenai-rightbank/
    ...
  kenai-channel/
    ...
  nushagak/
    ...
  elwha/
    ...
Each coco.json file includes the standard COCO information for image and annotation labels, with the addition that each image entry also has a dir_name field corresponding to the directory name of the video sequence it belongs to. For example:
{
  "images": [
    {
      "dir_name": "2018-06-01-JD152_LeftFar_Stratum2_Set1_LO_2018-06-01_191003_0_280",
      "file_name": "0.jpg",
      "height": 1915,
      "width": 781,
      "id": 0
    },
    ...
  ],
  "categories": [
    {
      "supercategory": "",
      "id": 1,
      "name": "fish"
    }
  ],
  "annotations": [
    {
      "id": 1,
      "image_id": 17,
      "bbox": [
        241.0,
        1851.0,
        15.0,
        10.0
      ],
      "area": 150,
      "iscrowd": 0,
      "category_id": 1
    },
    ...
  ]
}
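A short sketch for working with these files, e.g. grouping annotations by image (the path is an example; adjust to your extraction):

```python
import json
from collections import defaultdict

# Load one COCO-style annotation file and index annotations by image id.
with open("coco_formatted_annotations/kenai-val/coco.json") as f:
    coco = json.load(f)

annotations_by_image = defaultdict(list)
for ann in coco["annotations"]:
    annotations_by_image[ann["image_id"]].append(ann)

for img in coco["images"][:5]:
    boxes = annotations_by_image[img["id"]]
    print(img["dir_name"], img["file_name"], len(boxes), "fish boxes")
```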
We provide output from our Baseline and Baseline++ methods in MOTChallenge format as well.
ECCV22 Baseline Results [18 MB]
- Running `md5sum` on the tar.gz file should produce: ef8d517ad45419edce7af2e7dc5016be fish_counting_results.tar.gz
Note that the directory structure for predictions is different from the ground truth annotations. After extracting the tar.gz, the directory structure is as follows:
results/
  kenai-val/
    baseline/
      data/
        One text file per clip, named {clip_name}.txt
    baseline++/
      data/
        One text file per clip, named {clip_name}.txt
  kenai-rightbank/
    ...
  kenai-channel/
    ...
  nushagak/
    ...
  elwha/
    ...
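Since these prediction files are also in MOTChallenge format, they can be parsed the same way as gt.txt. The sketch below prints a naive per-clip count (number of unique track IDs); this is purely illustrative and is not the counting procedure evaluated in the paper, which is computed by evaluate.py:

```python
import csv
import glob
import os

# Naive illustration: count unique track IDs per tracker output file.
# This is NOT the paper's counting metric; use evaluate.py for that.
results_dir = "results/kenai-val/baseline/data"   # path is an assumption
for path in sorted(glob.glob(os.path.join(results_dir, "*.txt"))):
    track_ids = set()
    with open(path) as f:
        for row in csv.reader(f):
            track_ids.add(int(row[1]))
    print(os.path.basename(path), len(track_ids), "tracks")
```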
Clone the repo with submodules to enable MOT evaluation:
git clone --recursive https://github.com/visipedia/caltech-fish-counting.git
# or
git clone --recursive git@github.com:visipedia/caltech-fish-counting.git
If you already cloned, submodules can be retroactively initialized with:
git submodule init
git submodule update
We provide evaluation code using the TrackEval codebase. In addition to the CLEAR, ID, and HOTA tracking metrics, we extend the TrackEval codebase with a custom metric, nMAE, as described in the paper:
$$\mathrm{nMAE} = \frac{\sum_{i=1}^{N} E_i}{\sum_{i=1}^{N} \hat{z}_i}$$

where $N$ is the number of video clips and $E_i$ is the absolute counting error for each clip, with $z_i$ denoting predicted counts, $\hat{z}_i$ ground truth counts, and the superscripts the two upstream directions (left and right):

$$E_i = \left| \hat{z}_i^{\mathrm{left}} - z_i^{\mathrm{left}} \right| + \left| \hat{z}_i^{\mathrm{right}} - z_i^{\mathrm{right}} \right|$$

and $\hat{z}_i$ is the ground truth count for clip $i$:

$$\hat{z}_i = \hat{z}_i^{\mathrm{left}} + \hat{z}_i^{\mathrm{right}}$$
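A small worked example of the metric (illustrative only; the per-clip counts below are made up, and the official implementation is the TrackEval extension used by evaluate.py):

```python
# Illustrative nMAE computation (not the TrackEval implementation).
# Each entry holds (predicted_left, predicted_right, gt_left, gt_right) for one clip;
# the numbers are invented for the example.
clips = [
    (3, 1, 4, 1),
    (0, 2, 0, 2),
    (5, 0, 6, 1),
]

total_error = sum(abs(pl - gl) + abs(pr - gr) for pl, pr, gl, gr in clips)
total_gt = sum(gl + gr for _, _, gl, gr in clips)
nmae = total_error / total_gt
print(f"nMAE = {nmae:.3f}")
```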
Run the evaluation script from the command line to reproduce the results from the paper:
python evaluate.py --results_dir PATH/TO/results --anno_dir PATH/TO/annotations --metadata_dir PATH/TO/metadata --tracker baseline
python evaluate.py --results_dir PATH/TO/results --anno_dir PATH/TO/annotations --metadata_dir PATH/TO/metadata --tracker baseline++
Justin Kay, Peter Kulits, Suzanne Stathatos, Siqi Deng, Erik Young, Sara Beery, Grant Van Horn, and Pietro Perona
We present the Caltech Fish Counting Dataset (CFC), a large-scale dataset for detecting, tracking, and counting fish in sonar videos. We identify sonar videos as a rich source of data for advancing low signal-to-noise computer vision applications and tackling domain generalization for multiple-object tracking (MOT) and counting. In comparison to existing MOT and counting datasets, which are largely restricted to videos of people and vehicles in cities, CFC is sourced from a natural-world domain where targets are not easily resolvable and appearance features cannot be easily leveraged for target re-identification. With over half a million annotations in over 1,500 videos sourced from seven different sonar cameras, CFC allows researchers to train MOT and counting algorithms and evaluate generalization performance at unseen test locations. We perform extensive baseline experiments and identify key challenges and opportunities for advancing the state of the art in generalization in MOT and counting.
If you find our work useful in your research, please consider citing our paper:
@inproceedings{cfc2022eccv,
author = {Kay, Justin and Kulits, Peter and Stathatos, Suzanne and Deng, Siqi and Young, Erik and Beery, Sara and Van Horn, Grant and Perona, Pietro},
title = {The Caltech Fish Counting Dataset: A Benchmark for Multiple-Object Tracking and Counting},
booktitle = {European Conference on Computer Vision (ECCV)},
year = {2022}
}