[Datumaro] Update documentation (#2059)
* Update docs

* fix indent

* update design
zhiltsov-max authored Aug 21, 2020
1 parent 082fc7a commit 3fb3f67
Showing 6 changed files with 495 additions and 68 deletions.
2 changes: 1 addition & 1 deletion datumaro/CONTRIBUTING.md
@@ -129,7 +129,7 @@ Plugins reside in plugin directories:
- `<project_dir>/.datumaro/plugins` for project-specific components

A plugin is a python file or package with any name, which exports some symbols.
To export a symbol put it to `exports` list of the module like this:
To export a symbol, put it to `exports` list of the module like this:

``` python
class MyComponent1: ...
12 changes: 6 additions & 6 deletions datumaro/README.md
@@ -1,13 +1,13 @@
# Dataset Framework (Datumaro)
# Dataset Management Framework (Datumaro)

A framework to build, transform, and analyze datasets.

<!--lint disable fenced-code-flag-->
```
CVAT annotations -- ---> Annotation tool
... \ /
\ /
COCO-like dataset -----> Datumaro ---> dataset ------> Model training
... / \
/ \
VOC-like dataset -- ---> Publication etc.
```
<!--lint enable fenced-code-flag-->
@@ -55,12 +55,12 @@ VOC-like dataset -- ---> Publication etc.
- Dataset building operations:
- Merging multiple datasets into one
- Dataset filtering with custom conditions, for instance:
- remove all annotations except polygons of a certain class
- remove polygons of a certain class
- remove images without a specific class
- remove occluded annotations from images
- remove `occluded` annotations from images
- keep only vertically-oriented images
- remove small area bounding boxes from annotations
- Annotation conversions, for instance
- Annotation conversions, for instance:
- polygons to instance masks and vice versa
- apply a custom colormap for mask annotations
- rename or remove dataset labels
2 changes: 1 addition & 1 deletion datumaro/datumaro/cli/contexts/project/__init__.py
@@ -579,7 +579,7 @@ def build_transform_parser(parser_ctor=argparse.ArgumentParser):
|n
Examples:|n
- Convert instance polygons to masks:|n
|s|stransform -n polygons_to_masks
|s|stransform -t polygons_to_masks
""" % ', '.join(builtins),
formatter_class=MultilineFormatter)

9 changes: 8 additions & 1 deletion datumaro/datumaro/plugins/transforms.py
@@ -409,6 +409,13 @@ def transform_item(self, item):
.format(item=item))

class RemapLabels(Transform, CliPlugin):
"""
Changes labels in the dataset.|n
Examples:|n
- Rename 'person' to 'car' and 'cat' to 'dog', keep 'bus', remove others:|n
|s|sremap_labels -l person:car -l bus:bus -l cat:dog --default delete
"""

DefaultAction = Enum('DefaultAction', ['keep', 'delete'])

@staticmethod
@@ -428,7 +435,7 @@ def build_cmdline_parser(cls, **kwargs):
parser.add_argument('--default',
choices=[a.name for a in cls.DefaultAction],
default=cls.DefaultAction.keep.name,
help="Action for unspecified labels")
help="Action for unspecified labels (default: %(default)s)")
return parser

def __init__(self, extractor, mapping, default=None):
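
The behaviour described in the `RemapLabels` docstring above can be sketched in isolation. This is a minimal standalone sketch, not the Datumaro implementation: `build_parser`, `parse_mapping`, and `remap` are hypothetical helpers, the long option name `--label` and the `src:dst` parsing are assumptions inferred from the example command, and only the `--default` argument is taken from the diff.

```python
import argparse
from enum import Enum

# Same enum as in the diff; everything else below is illustrative.
DefaultAction = Enum('DefaultAction', ['keep', 'delete'])

def build_parser():
    # Hypothetical parser; only '--default' mirrors the diff.
    parser = argparse.ArgumentParser(prog='remap_labels')
    parser.add_argument('-l', '--label', action='append', default=[],
        help="Label mapping in the form 'src:dst' (repeatable)")
    parser.add_argument('--default',
        choices=[a.name for a in DefaultAction],
        default=DefaultAction.keep.name,
        # argparse substitutes %(default)s with the default value in --help
        help="Action for unspecified labels (default: %(default)s)")
    return parser

def parse_mapping(pairs):
    # Turn ['person:car', ...] into {'person': 'car', ...}
    mapping = {}
    for pair in pairs:
        src, sep, dst = pair.partition(':')
        if not (sep and src and dst):
            raise ValueError("Expected 'src:dst', got '%s'" % pair)
        mapping[src] = dst
    return mapping

def remap(labels, mapping, default):
    # Rename mapped labels; keep or drop the rest depending on 'default'
    result = []
    for label in labels:
        if label in mapping:
            result.append(mapping[label])
        elif default == DefaultAction.keep.name:
            result.append(label)
        # with 'delete', unmapped labels are dropped
    return result

# The example from the docstring: rename 'person' to 'car' and
# 'cat' to 'dog', keep 'bus', remove others
args = build_parser().parse_args(
    ['-l', 'person:car', '-l', 'bus:bus', '-l', 'cat:dog',
     '--default', 'delete'])
mapping = parse_mapping(args.label)
remapped = remap(['person', 'bus', 'cat', 'tree'], mapping, args.default)
print(remapped)
```

With `--default keep` (the default), the unmapped `'tree'` label would pass through unchanged instead of being dropped.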
24 changes: 13 additions & 11 deletions datumaro/docs/design.md
@@ -49,6 +49,8 @@ Datumaro is:
- Versioning (for images, annotations, subsets, sources etc., comparison)
- Documentation generation
- Provision of iterators for user code
- Dataset downloading
- Dataset generation
- Dataset building (export in a specific format, indexation, statistics, documentation)
- Dataset exporting to other formats
- Dataset debugging (run inference, generate dataset slices, compute statistics)
@@ -111,25 +113,25 @@ can be downloaded by user to be operated on with Datumaro CLI.
- [ ] with TensorBoard

- Calculation of statistics for datasets
- [ ] Pixel mean, std
- [ ] Object counts (detection scenario)
- [ ] Image-Class distribution (classification scenario)
- [ ] Pixel-Class distribution (segmentation scenario)
- [ ] Image clusters
- [x] Pixel mean, std
- [x] Object counts (detection scenario)
- [x] Image-Class distribution (classification scenario)
- [x] Pixel-Class distribution (segmentation scenario)
- [ ] Image similarity clusters
- [ ] Custom statistics

- Dataset building
- [x] Composite dataset building
- [ ] Annotation remapping
- [ ] Subset splitting
- [x] Class remapping
- [x] Subset splitting
- [x] Dataset filtering (`extract`)
- [x] Dataset merging (`merge`)
- [ ] Dataset item editing (`edit`)

- Dataset comparison (`diff`)
- [x] Annotation-annotation comparison
- [x] Annotation-inference comparison
- [ ] Annotation quality estimation (for CVAT)
- [x] Annotation quality estimation (for CVAT)
- Provide a simple method to check
annotation quality with a model and generate summary

@@ -142,9 +144,9 @@ can be downloaded by user to be operated on with Datumaro CLI.
- [x] Task export
- [x] Datumaro project export
- [x] Dataset export
- [ ] Original raw data (images, a video file) can be downloaded (exported)
- [x] Original raw data (images, a video file) can be downloaded (exported)
together with annotations or just have links
on CVAT server (in the future support S3, etc)
on CVAT server (in future, support S3, etc)
- [x] Be able to use local files instead of remote links
- [ ] Specify cache directory
- [x] Use case "annotate for model training"
@@ -154,7 +156,7 @@ can be downloaded by user to be operated on with Datumaro CLI.
- convert to a training format
- train a DL model
- [x] Use case "annotate - reannotate problematic images - merge"
- [ ] Use case "annotate and estimate quality"
- [x] Use case "annotate and estimate quality"
- create a task
- annotate
- estimate quality of annotations