Ability to create a task with a subset of frames for a video file and images #194

Closed
2 tasks done
nmanovic opened this issue Nov 13, 2018 · 7 comments

@nmanovic
Contributor

nmanovic commented Nov 13, 2018

In some cases it is necessary to annotate only every 15th frame of a video file. Currently, the only way to do that is to convert the video to 2 FPS and then create an annotation task. It would be good to have some basic support for this case in the tool.

  • The dump should have correct frame numbers (e.g. 000000, 000015, 000030, ...)
  • The meta information should describe this filtering as well.
nmanovic added the enhancement New feature or request label Nov 13, 2018
nmanovic added this to the Backlog milestone Nov 13, 2018
nmanovic self-assigned this Nov 13, 2018
nmanovic modified the milestones: Backlog, 0.4.0 - Alpha Dec 1, 2018
nmanovic modified the milestones: 0.4.0 - Alpha, 0.4.0 - Beta, Backlog Jan 21, 2019
@zliang7
Contributor

zliang7 commented Apr 15, 2019

@nmanovic, here are my questions, as mentioned in #382:

  1. Why and when do we need to make images correspond to video frame numbers? How can this correspondence be checked in the UI?
  2. If we have to keep it, is it OK to use frame_#fps_00000, frame_#fps_00001, ... to express the relationship?
  3. If we can't use the -r option, we have to extract all frames and then pick the needed images. For a big video, isn't that too resource- and time-consuming?
  4. If uploading a video takes too long, the user may extract frames locally with the -r option and upload the images. Does this cause problems compared to uploading the video directly?
  5. Because the front end doesn't know how many frames the video has, how can it validate the specified end frame number?
  6. How does the user know the number of the start frame? I suppose no video player shows the current frame number.
  7. If we use time values to specify the start and stop frames, are they as precise as frame numbers? For example, in a 30 fps video, is the 2nd frame at 33.33 ms or 33.333333 ms?

Thanks.

@nmanovic
Contributor Author

nmanovic commented Apr 16, 2019

> Why and when do we need to make images correspond to video frame numbers? How can this correspondence be checked in the UI?

The final annotation file should have a simple property: if somebody takes the file and reads frame #N, it has to correspond to frame #N inside the video sequence. In your case, frame #N can correspond to any frame in the video, and it depends on how you create the task. In the UI and in the server code you can use frame numbers without gaps, but in the dump file they should be correct. We now have a Video table in the nm/rest_api branch. It can probably be used for this purpose.

> If we can't use the -r option, we have to extract all frames and then pick the needed images. For a big video, isn't that too resource- and time-consuming?

Will the -vf "select=not(mod(n,10))" option work?
https://trac.ffmpeg.org/wiki/Create%20a%20thumbnail%20image%20every%20X%20seconds%20of%20the%20video
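As a rough sketch of how the select filter could be driven from Python, the helper below only builds the ffmpeg argument list; it does not run anything. The function name, its parameters, and the use of between()/gte() for a start/stop range are assumptions made for illustration, not something taken from the PR. The output numbering stays continuous. Recent ffmpeg builds may prefer -fps_mode vfr over -vsync vfr.

```python
def build_extract_command(video, out_pattern, step, start=0, stop=None):
    """Build an ffmpeg argument list that keeps every `step`-th frame.

    Hypothetical helper: `start` and `stop` are 0-based frame numbers,
    and `stop` is assumed to be inclusive.
    """
    if step < 1 or start < 0 or (stop is not None and stop < start):
        raise ValueError("expected step >= 1 and 0 <= start <= stop")

    # Commas inside a filter expression must be escaped, otherwise ffmpeg
    # reads them as separators between filters.
    comma = "\\,"
    if start:
        keep = f"not(mod(n-{start}{comma}{step}))"
    else:
        keep = f"not(mod(n{comma}{step}))"

    if stop is not None:
        keep = f"between(n{comma}{start}{comma}{stop})*{keep}"
    elif start:
        keep = f"gte(n{comma}{start})*{keep}"

    # -vsync vfr drops the unselected frames instead of duplicating
    # the selected ones to fill the gaps.
    return ["ffmpeg", "-i", video, "-vf", f"select={keep}",
            "-vsync", "vfr", out_pattern]


if __name__ == "__main__":
    print(" ".join(build_extract_command("input.mp4", "frame_%06d.jpg", 10)))
```

The resulting list can be passed to subprocess.run as is, so no shell quoting is needed.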

> If uploading a video takes too long, the user may extract frames locally with the -r option and upload the images. Does this cause problems compared to uploading the video directly?

Too much manual work is the only problem I see at the moment.

> Because the front end doesn't know how many frames the video has, how can it validate the specified end frame number?

The UI can request frame N as usual (without gaps); this complexity needs to be hidden from the UI. At the same time, it is better to show the correct frame number. Thus the annotation task should have "start_frame" and "frame_step" fields. The correct frame number (to display in the UI) will be start_frame + N*frame_step.
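The mapping can be sketched in a few lines. The field names start_frame and frame_step come from the comment above; the function names and the inverse mapping are assumptions added for illustration.

```python
def original_frame(index, start_frame=0, frame_step=1):
    """Map a gap-free task frame index N to the frame number in the video."""
    return start_frame + index * frame_step


def task_index(frame, start_frame=0, frame_step=1):
    """Inverse mapping: video frame number back to the gap-free index.

    Raises ValueError if the frame was not selected for the task.
    """
    offset = frame - start_frame
    if offset < 0 or offset % frame_step != 0:
        raise ValueError(f"frame {frame} is not part of the task")
    return offset // frame_step


if __name__ == "__main__":
    # Every 15th frame starting from frame 0.
    print([original_frame(n, 0, 15) for n in range(3)])
```

With start_frame=0 and frame_step=15, indices 0, 1, 2 map to frames 0, 15, 30, which matches the numbering requested in the issue description.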

> If we use time values to specify the start and stop frames, are they as precise as frame numbers? For example, in a 30 fps video, is the 2nd frame at 33.33 ms or 33.333333 ms?

In general it is necessary to find the nearest frame for the timestamp. But to avoid any problems it is OK to accept frame numbers only.
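For reference, a minimal sketch of the nearest-frame lookup, assuming a constant frame rate and 0-based frame numbers (so the 2nd frame is frame 1). The function name is hypothetical.

```python
def nearest_frame(timestamp_s, fps):
    """Return the 0-based number of the frame closest to `timestamp_s`.

    Assumes a constant frame rate; variable frame rate videos would need
    the real per-frame timestamps instead.
    """
    if timestamp_s < 0 or fps <= 0:
        raise ValueError("expected timestamp_s >= 0 and fps > 0")
    return int(round(timestamp_s * fps))


if __name__ == "__main__":
    # Both roundings of 1/30 s resolve to the same frame.
    print(nearest_frame(0.03333, 30), nearest_frame(0.033333333, 30))
```

Rounding makes the precision of the timestamp mostly irrelevant here, but accepting frame numbers only avoids the question entirely.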

@nmanovic
Contributor Author

From previous PR: #382

Let me clarify my vision for the feature:

  • Several extra fields are needed:

    • start frame. It should be empty by default and less than the stop frame. It would be good to be able to specify it as a frame number.
    • stop frame. It should be empty by default (which corresponds to the end), >= 0, and greater than the start frame. It would be good to be able to specify it as a frame number.
    • step. It should be empty by default (which corresponds to the minimum possible step) and > 0. It would be good to be able to specify it as a number of frames.
  • Inside the dump file you should have exact correspondence between the video file and the dumped frames. For example, if you annotate only 1 frame out of every 30, you should have <frame000000>, <frame000030>, ... inside the annotation file.

  • As I said previously, the patch should be submitted to the nm/rest_api branch. It will be merged into the develop branch very soon.
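To make the intended defaults concrete, here is a small sketch of how empty fields could resolve to a list of video frame numbers. It assumes that an empty start means frame 0, that the stop frame is inclusive, and that the values were already validated; none of this is confirmed by the thread, and the function names are hypothetical.

```python
def selected_frames(total_frames, start=None, stop=None, step=None):
    """Resolve optional start/stop/step fields to video frame numbers.

    None stands for an empty field. Assumptions: empty start means 0,
    empty stop means the last frame, empty step means 1, and `stop`
    is inclusive.
    """
    first = 0 if start is None else start
    last = total_frames - 1 if stop is None else stop
    stride = 1 if step is None else step
    return list(range(first, last + 1, stride))


def dump_names(frames):
    """Format frame numbers the way they could appear in a dump file."""
    return [f"frame{number:06d}" for number in frames]


if __name__ == "__main__":
    print(dump_names(selected_frames(40, step=15)))
```

For a 40-frame video and a step of 15, the selection is frames 0, 15 and 30, and those are the numbers the dump should carry.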

@zliang7
Contributor

zliang7 commented Apr 17, 2019

The select filter does work. But the numeric suffix of the generated image filenames is continuous. Is it OK to use the image_%start_%step_%d.jpg pattern? Or do we have to rename all the files to a flat pattern such as image_xxx.jpg?

I have no idea how to check the end frame in the front end. The UI can only check whether it is a number. If the backend finds that it is invalid, it reports an error. Is this OK?

Last, I still have a concern about how the user gets the exact video frame numbers if he/she wants to set these fields. But I'm OK with it if you don't think this is a problem.

@nmanovic
Contributor Author

nmanovic commented Apr 17, 2019

@zliang7 ,

> The select filter does work. But the numeric suffix of the generated image filenames is continuous. Is it OK to use the image_%start_%step_%d.jpg pattern? Or do we have to rename all the files to a flat pattern such as image_xxx.jpg?

I think continuous numbering should be OK. In general, we only need to fix the "dump". You can leave the client and server as they are (they can assume that they work with continuous frames). You should get the same behaviour as if the user had extracted all the necessary frames manually and created a task from them. I hope you get my idea.
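A sketch of what "fix the dump only" could look like: annotations stay keyed by the continuous index everywhere else and are renumbered only when the dump is written. The dictionary layout and the function name are assumptions for illustration and do not reflect the actual dump code.

```python
def renumber_for_dump(annotations, start_frame=0, frame_step=1):
    """Rewrite continuous task indices as zero-padded video frame numbers.

    `annotations` is a hypothetical mapping {task_index: shapes}; client
    and server keep using the continuous indices, only the dump changes.
    """
    return {
        f"{start_frame + index * frame_step:06d}": shapes
        for index, shapes in sorted(annotations.items())
    }


if __name__ == "__main__":
    task_annotations = {0: ["car"], 1: [], 2: ["person"]}
    print(renumber_for_dump(task_annotations, start_frame=0, frame_step=15))
```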

> I have no idea how to check the end frame in the front end. The UI can only check whether it is a number. If the backend finds that it is invalid, it reports an error. Is this OK?

I believe it is OK. In this case the server should return BAD REQUEST (400 error code), which means that one of the input parameters is incorrect.
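A framework-independent sketch of that server-side check follows. It assumes the backend already knows the total number of frames and that the constraints are the ones listed earlier in this thread; the function name and the messages are hypothetical.

```python
from http import HTTPStatus


def check_frame_range(total_frames, start=None, stop=None, step=None):
    """Validate optional start/stop/step values against a known frame count.

    Returns (status, message). None stands for an empty field. The rules
    follow the constraints discussed in this thread; how the real server
    reports them is an assumption.
    """
    errors = []
    if start is not None and not 0 <= start < total_frames:
        errors.append("start frame is outside the video")
    if stop is not None and not 0 <= stop < total_frames:
        errors.append("stop frame is outside the video")
    if start is not None and stop is not None and start >= stop:
        errors.append("start frame must be less than stop frame")
    if step is not None and step <= 0:
        errors.append("step must be greater than 0")

    if errors:
        return HTTPStatus.BAD_REQUEST, "; ".join(errors)
    return HTTPStatus.OK, ""


if __name__ == "__main__":
    status, message = check_frame_range(100, start=0, stop=500, step=15)
    print(int(status), message)
```

The front end then only needs to check that each field is a number and to display the message that comes back with the 400 response.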

> Last, I still have a concern about how the user gets the exact video frame numbers if he/she wants to set these fields. But I'm OK with it if you don't think this is a problem.

In the future it will be necessary to implement data import/export for an annotation task (#152). For now, we can specify the correct command in our documentation. Even now the user has to know how to extract frames.

nmanovic modified the milestones: 0.4.0 - Beta, 0.5.0 - Alpha Apr 20, 2019
@nmanovic
Contributor Author

nmanovic commented May 5, 2019

@zliang7 ,

In my comments on your PR I suggested using frame_filter with a step=N value (the input field should have hints for users) instead of a step. I hope it will not take too much time to update your PR to support that. While reviewing the PR I realized that this would be much better. I got the idea from your code.
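A sketch of how a frame_filter value of the form step=N might be parsed. Only the step=N form mentioned above is handled. Treating an empty value as a step of 1, tolerating whitespace, and the function name are all assumptions; the syntax the tool finally accepts may differ.

```python
import re

# Accepts "step=N" with optional surrounding whitespace (assumed syntax).
_STEP_PATTERN = re.compile(r"^\s*step\s*=\s*(\d+)\s*$")


def parse_frame_filter(value):
    """Return the frame step encoded in a frame_filter string.

    An empty filter is treated as "every frame" (step 1).
    Raises ValueError for anything that is not step=N with N >= 1.
    """
    if value is None or not value.strip():
        return 1
    match = _STEP_PATTERN.match(value)
    if match is None or int(match.group(1)) < 1:
        raise ValueError(f"unsupported frame_filter: {value!r}")
    return int(match.group(1))


if __name__ == "__main__":
    print(parse_frame_filter("step=15"))
```

A ValueError raised here is the kind of input error that would map naturally to a 400 response on the server side.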

I also suggested supporting images. Our primary goal is to make CVAT better for our customers. I believe that we need to look to the future (e.g. they may have a directory with frames, in which case they need the same functionality for images). Let me know if you think that supporting images will take a lot of time.

Thanks for your PR. I am happy to see you in our virtual team!

nmanovic changed the title from "Ability to create a task with a subset of frames for a video" to "Ability to create a task with a subset of frames for a video file and images" May 5, 2019
@nmanovic
Contributor Author

#437
