Any way to avoid copying and compressing files when creating new task? #204

Closed
Paperone80 opened this issue Nov 21, 2018 · 6 comments · Fixed by #2377

Comments

@Paperone80

Paperone80 commented Nov 21, 2018

Hi,

Is there any way to avoid copying and compressing images when a new task is created with Source: Shared? The source is also mounted read-only, so nothing there can be overwritten.
I am not sure why the copy is made, but I have some high-resolution imagery and I am losing the details I need to set proper attributes and polygon masks. It also takes up unnecessary time and disk space. Thanks.

I am using cvat github version from 2018-11-19.

@gzvulon

gzvulon commented Jan 2, 2019

Same here. I have a lot of HD images on a network share and on S3.
I'd like to map them to the /share dir and use only links, without copying any data.

@nmanovic nmanovic added this to the Backlog milestone Jan 21, 2019
@nmanovic nmanovic self-assigned this Jan 21, 2019
@nmanovic nmanovic added the enhancement New feature or request label Jan 21, 2019
@vfdev-5
Contributor

vfdev-5 commented Sep 13, 2019

Any updates on this?

@nmanovic
Contributor

@vfdev-5, we are going to reimplement the way we serve data from the server (https://github.com/opencv/cvat/tree/az/video_stream). I hope to see that functionality merged in a month or so. After it is merged, we will think about how to implement this feature. It will probably be possible to provide data in a pre-defined format. I hope to see the feature in v1.0.0, but we cannot promise.

@nmanovic
Contributor

Some notes for future reference.

Pipeline to use original data:

  1. Prepare the data in a format which CVAT can understand and put it onto your remote storage:
    • A directory with images.
    • Use a CVAT script (will be provided) to prepare the data in the right format (chunks).
    • Use a CVAT script (will be provided) to prepare the data for "protected" access (e.g. S3 with credentials). Instead of the original data, the user uploads text files with links to the original data and meta information about it.
  2. CVAT will use the remote storage as is and will not try to copy files internally. There are three main use cases:
    • For a directory with images, CVAT will convert them into its "own format" on the fly and cache the result using DiskCache (http://www.grantjenks.com/docs/diskcache/) in a temporary directory (so the next access should be fast and the storage size will be limited).
    • Serve the original data as is if it was prepared using the CVAT script, i.e. as remote links to the required data. The user has already prepared the original data for us, so nothing else needs to be done. For compressed data, however, we need to compress it on the fly and cache it using DiskCache (http://www.grantjenks.com/docs/diskcache/) in a temporary directory (so the next access should be fast and the storage size will be limited). Because the original data was prepared as small chunks, preparing the compressed data will be fast enough.
  3. cvat-data will be responsible for accepting the prepared data and transforming it into actual images with meta information.
    • chunkN.zip with images will be unzipped
    • chunkN.mp4 with video frames will be decoded
    • chunkN.txt with links to data will be converted to images if credentials are provided by the client.
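
A rough sketch of the "build on the fly, cache with a size limit, then unzip" flow from steps 2 and 3, for illustration only. It is plain Python using the standard library, not CVAT code: `BoundedChunkCache`, `build_zip_chunk` and `unpack_zip_chunk` are hypothetical names, and the small in-memory LRU cache is a stand-in for DiskCache (which stores entries on disk and has its own API). In CVAT the unpacking side would live in cvat-data on the client; here both sides are shown in one language to keep the example self-contained.

```python
import io
import zipfile
from collections import OrderedDict
from typing import Callable, Dict, List, Tuple


class BoundedChunkCache:
    """In-memory LRU cache with a byte limit (illustrative stand-in for DiskCache)."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self._items = OrderedDict()  # chunk number -> chunk bytes, oldest first
        self._size = 0

    def __contains__(self, key: int) -> bool:
        return key in self._items

    def get_or_create(self, key: int, build: Callable[[], bytes]) -> bytes:
        """Return the cached chunk, calling build() only on a cache miss."""
        if key in self._items:
            self._items.move_to_end(key)  # mark as most recently used
            return self._items[key]
        data = build()
        self._items[key] = data
        self._size += len(data)
        # Evict least recently used chunks until we are back under the limit,
        # always keeping the chunk that was just built.
        while self._size > self.max_bytes and len(self._items) > 1:
            _, evicted = self._items.popitem(last=False)
            self._size -= len(evicted)
        return data


def build_zip_chunk(frames: List[Tuple[str, bytes]]) -> bytes:
    """Pack (file name, image bytes) pairs into an in-memory chunkN.zip."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for name, payload in frames:
            archive.writestr(name, payload)
    return buffer.getvalue()


def unpack_zip_chunk(chunk: bytes) -> Dict[str, bytes]:
    """Unzip a chunk back into a mapping of file name -> image bytes."""
    with zipfile.ZipFile(io.BytesIO(chunk)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}


if __name__ == "__main__":
    # Placeholder payloads; real frames would be read from the shared storage.
    frames = [("000000.jpg", b"fake-image-0"), ("000001.jpg", b"fake-image-1")]
    cache = BoundedChunkCache(max_bytes=1 << 20)
    chunk = cache.get_or_create(0, lambda: build_zip_chunk(frames))
    print(sorted(unpack_zip_chunk(chunk)))
```

The point of the size limit is the one made in step 2: repeated access to a chunk is served from the cache, while older chunks are dropped so that the temporary storage does not grow without bound.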

@bsekachev
Member

@Marishka17, do you think this issue was addressed by #2377?

@Marishka17
Contributor

@bsekachev, I think yes.

@Marishka17 Marishka17 linked a pull request Feb 16, 2021 that will close this issue