Raw image copying in dataset export #2229

Merged
nmanovic merged 5 commits into develop from zm/image-copying-in-export on Oct 7, 2020

Contributor

@zhiltsov-max zhiltsov-max commented Sep 25, 2020

Motivation and context

Depends on openvinotoolkit/datumaro#27

  • Allows copying raw images instead of re-encoding them on dataset export
  • Improves dataset export performance (by 4-6x or more)
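The difference between the two export paths can be sketched as follows. This is only an illustration of the general idea, not the code from this PR or the Datumaro `ByteImage` API; the function name `export_image` and the `recode` callback are hypothetical.

```python
import shutil
from pathlib import Path
from typing import Callable, Optional


def export_image(src: Path, dst: Path,
                 recode: Optional[Callable[[bytes], bytes]] = None) -> None:
    """Write an image to the export directory.

    With recode=None the encoded bytes are copied as-is (the fast path).
    Otherwise the hypothetical recode callback stands in for a
    decode-and-re-encode step, which costs CPU time per image.
    """
    dst.parent.mkdir(parents=True, exist_ok=True)
    if recode is None:
        # Raw copy: no decoding, no quality loss, I/O bound
        shutil.copyfile(src, dst)
    else:
        # Recoding: read, transform, write
        dst.write_bytes(recode(src.read_bytes()))
```

Copying is possible only when the stored image is already in a format the target dataset accepts; otherwise the recoding path is still needed.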

How has this been tested?

Checklist

License

  • I submit my code changes under the same MIT License that covers the project.
    Feel free to contact the maintainers if that's a concern.
  • I have updated the license header for each file (see an example below)
# Copyright (C) 2020 Intel Corporation
#
# SPDX-License-Identifier: MIT

TBD:
test on a large video with chunks and cache enabled, check top output

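import gc
import tracemalloc

from cvat.apps.engine.frame_provider import FrameProvider
from cvat.apps.engine.models import Task

tracemalloc.start(10)  # store up to 10 stack frames per allocation
fp = FrameProvider(Task.objects.get(pk=39).data)  # a large video task
f = None
try:
    # Decode all frames over and over until interrupted (e.g. with Ctrl+C)
    while True:
        for i, f in enumerate(fp.get_frames(out_type=fp.Type.NUMPY_ARRAY)):
            pass
except BaseException:  # also catches KeyboardInterrupt, like the bare except
    # Drop the references and collect garbage first, so that the snapshot
    # shows only the memory that is still held
    del fp, f
    gc.collect()
    snapshot = tracemalloc.take_snapshot()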
  • Very slow (is the debugger to blame?): no, the cause is Pillow and image conversion
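The snippet above stops at taking the snapshot. A minimal, standalone sketch of how such a snapshot might be inspected is given below; it uses only the standard `tracemalloc` API, and the helper name `top_allocations` and its parameters are illustrative assumptions rather than anything from CVAT.

```python
import gc
import tracemalloc


def top_allocations(workload, limit=5, nframes=10):
    """Run workload() under tracemalloc and return the largest live allocations."""
    tracemalloc.start(nframes)
    try:
        result = workload()  # keep the result alive so its memory is counted
        gc.collect()         # drop unreachable objects before measuring
        snapshot = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    del result
    # Group allocations by source line, largest total size first
    return snapshot.statistics("lineno")[:limit]


if __name__ == "__main__":
    def workload():
        # Stand-in for frame decoding: allocate about 1 MB of buffers
        return [bytes(1000) for _ in range(1000)]

    for stat in top_allocations(workload):
        print(stat)
```

Each returned statistic carries the source location, total size, and allocation count, which is usually enough to tell whether frames are being retained somewhere.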

@zhiltsov-max
Contributor Author

@azhavoro, please take a look. To get it running, run:
pip install 'git+https://github.com/openvinotoolkit/datumaro@zm/byteimage' --force-reinstall

Please also check #2241 on a large video task.

@azhavoro azhavoro linked an issue Sep 30, 2020 that may be closed by this pull request
@azhavoro
Contributor

azhavoro commented Oct 1, 2020

@zhiltsov-max there is something wrong with the image conversion:
[image: frame_000000]

@zhiltsov-max
Contributor Author

@azhavoro, fixed. The problem was that video frames are read as yuv420p (as encoded in chunks).
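To illustrate why yuv420p frames look wrong when their bytes are treated as RGB, here is a pure-Python sketch of the conversion. It is not the fix applied in this PR (the thread does not show it); it assumes a planar buffer (full Y plane, then quarter-size U and V planes) and BT.601 limited-range coefficients, and the function name is hypothetical.

```python
def yuv420p_to_rgb(data: bytes, width: int, height: int) -> list:
    """Convert a planar YUV 4:2:0 buffer to rows of (R, G, B) tuples.

    Assumes BT.601 limited-range video (Y in 16..235, U/V centered at 128).
    """
    if width % 2 or height % 2:
        raise ValueError("width and height must be even for 4:2:0 subsampling")
    y_size = width * height
    c_width = width // 2
    c_size = c_width * (height // 2)
    if len(data) != y_size + 2 * c_size:
        raise ValueError("unexpected buffer size")

    y_plane = data[:y_size]
    u_plane = data[y_size:y_size + c_size]
    v_plane = data[y_size + c_size:]

    def clamp(x: float) -> int:
        return max(0, min(255, int(round(x))))

    rows = []
    for row in range(height):
        out = []
        for col in range(width):
            y = y_plane[row * width + col] - 16
            # Each chroma sample is shared by a 2x2 block of pixels
            c_idx = (row // 2) * c_width + col // 2
            u = u_plane[c_idx] - 128
            v = v_plane[c_idx] - 128
            out.append((
                clamp(1.164 * y + 1.596 * v),
                clamp(1.164 * y - 0.392 * u - 0.813 * v),
                clamp(1.164 * y + 2.017 * u),
            ))
        rows.append(out)
    return rows
```

A yuv420p frame holds only 1.5 bytes per pixel instead of 3, so passing it to code that expects packed RGB produces a distorted picture. In practice the conversion would normally be delegated to the video library rather than done per pixel in Python.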

@zhiltsov-max zhiltsov-max changed the title from "[WIP] Raw image copying in dataset export" to "Raw image copying in dataset export" on Oct 2, 2020
@nmanovic
Contributor

nmanovic commented Oct 3, 2020

@zhiltsov-max, could you please look at the problem?

f.handle(*args, **options)
  File "/usr/local/lib/python3.8/dist-packages/django/core/management/base.py", line 85, in wrapped
    res = handle_func(*args, **kwargs)
  File "/usr/local/lib/python3.8/dist-packages/django/core/management/commands/migrate.py", line 92, in handle
    executor = MigrationExecutor(connection, self.migration_progress_callback)
  File "/usr/local/lib/python3.8/dist-packages/django/db/migrations/executor.py", line 18, in __init__
    self.loader = MigrationLoader(self.connection)
  File "/usr/local/lib/python3.8/dist-packages/django/db/migrations/loader.py", line 53, in __init__
    self.build_graph()
  File "/usr/local/lib/python3.8/dist-packages/django/db/migrations/loader.py", line 210, in build_graph
    self.load_disk()
  File "/usr/local/lib/python3.8/dist-packages/django/db/migrations/loader.py", line 112, in load_disk
    migration_module = import_module(migration_path)
  File "/usr/lib/python3.8/importlib/__init__.py", line 127, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
  File "<frozen importlib._bootstrap>", line 991, in _find_and_load
  File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 783, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/home/django/cvat/apps/engine/migrations/0017_db_redesign_20190221.py", line 7, in <module>
    from cvat.apps.dataset_manager.task import _merge_table_rows
  File "/home/django/cvat/apps/dataset_manager/task.py", line 18, in <module>
    from .bindings import TaskData
  File "/home/django/cvat/apps/dataset_manager/bindings.py", line 16, in <module>
    from datumaro.util.image import ByteImage, Image
ImportError: cannot import name 'ByteImage' from 'datumaro.util.image' (/usr/local/lib/python3.8/dist-packages/datumaro/util/image.py)
The command "docker-compose -f docker-compose.yml -f docker-compose.ci.yml run cvat_ci /bin/bash -c 'coverage run -a manage.py test cvat/apps && coverage run -a manage.py test --pattern="_test*.py" cvat/apps/dataset_manager/tests cvat/apps/engine/tests utils/cli && mv .coverage ${CONTAINER_COVERAGE_DATA_DIR}'" exited with 1.
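The traceback suggests that the datumaro package installed in the CI image does not yet provide `ByteImage` (the PR depends on openvinotoolkit/datumaro#27). A small diagnostic along these lines could confirm that inside the container; the helper name `module_exports` is hypothetical and uses only the standard `importlib` module.

```python
import importlib


def module_exports(module_name: str, attr: str) -> bool:
    """Return True if the module can be imported and defines the given name."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # The module itself is missing or fails to import
        return False
    return hasattr(module, attr)


if __name__ == "__main__":
    # In the failing environment this would be expected to print False
    print(module_exports("datumaro.util.image", "ByteImage"))
```

If the check prints False, reinstalling datumaro from the branch mentioned earlier in the thread is the likely remedy.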

@coveralls

Coverage increased (+0.009%) to 65.32% when pulling 2071dd7 on zm/image-copying-in-export into c78cbb8 on develop.

@nmanovic nmanovic merged commit 84b8a85 into develop Oct 7, 2020
@nmanovic nmanovic deleted the zm/image-copying-in-export branch October 7, 2020 09:52
Successfully merging this pull request may close these issues.

Download raw data from the server
4 participants