Ability to create a task with a subset of frames for a video file and images #194

nmanovic · 2018-11-13T10:13:14Z

In some cases it is necessary to annotate only every 15th frame of a video file. Currently the video has to be converted to 2 FPS first, and only then can an annotation task be created. It would be good to have basic support for this case in the tool.

Dump should have correct frame numbers (e.g. 000000, 000015, 000030, ...)
Meta information should describe this filtering as well.

zliang7 · 2019-04-15T04:12:32Z

@nmanovic, here are my questions from #382:

Why and when do we need to make images correspond to video frame numbers? How can the correspondence be checked in the UI?
If we have to keep it, is it OK to use frame_#fps_00000, frame_#fps_00001 ... to express the relationship?
If we can't use the -r option, we have to extract all frames and then pick the needed images. For a big video, isn't that too resource- and time-consuming?
If uploading a video takes too long, the user may extract frames locally with the -r option and upload the images. Does this cause problems compared to uploading the video directly?
Because the front-end doesn't know how many frames are in the video, how can it validate the specified end frame number?
How does the user know the number of the start frame? I suppose no video player shows the current frame number.
If we use time values to specify the start and stop frames, are they as precise as frame numbers? For example, in a 30 fps video, is the 2nd frame at 33.33 ms or 33.333333 ms?

Thanks.

nmanovic · 2019-04-16T09:28:42Z

Why and when do we need to make images correspond to video frame numbers? How can the correspondence be checked in the UI?

The final annotation file should have a simple property: if somebody takes the file and reads frame #N, it has to correspond to frame #N inside the video sequence. In your case frame #N can correspond to any frame in the video, depending on how you create the task. In the UI and in server code you can use frame numbers without gaps, but in the dump file they should be correct. We now have a Video table in the nm/rest_api branch. It can probably be used for this purpose.

If we can't use the -r option, we have to extract all frames and then pick the needed images. For a big video, isn't that too resource- and time-consuming?

Will -vf "select=not(mod(n,10))" option work?
https://trac.ffmpeg.org/wiki/Create%20a%20thumbnail%20image%20every%20X%20seconds%20of%20the%20video
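As a rough illustration of the select-filter idea, the sketch below only builds an ffmpeg argument list that keeps every Nth frame; it does not run ffmpeg. The function name `build_select_command`, the file names, and the output pattern are hypothetical. The comma inside the expression is escaped because ffmpeg otherwise treats it as a filter separator, and `-vsync vfr` is assumed here to stop ffmpeg from duplicating frames to keep the original frame rate (newer ffmpeg releases offer `-fps_mode vfr` for the same purpose).

```python
import shlex


def build_select_command(video_path, output_pattern, step):
    """Build an ffmpeg argv that keeps every `step`-th frame (0, step, 2*step, ...)."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    # "\," keeps the comma inside the expression instead of splitting the filter chain.
    select_expr = f"select=not(mod(n\\,{step}))"
    return [
        "ffmpeg",
        "-i", video_path,
        "-vf", select_expr,
        "-vsync", "vfr",  # assumption: drop unselected frames instead of duplicating
        output_pattern,
    ]


# Hypothetical file names, for illustration only.
cmd = build_select_command("input.mp4", "frame_%06d.jpg", 10)
print(shlex.join(cmd))
```

Note that the output files are still numbered continuously by ffmpeg; the mapping back to original frame numbers has to be kept separately.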

If uploading a video takes too long, the user may extract frames locally with the -r option and upload the images. Does this cause problems compared to uploading the video directly?

Too much manual work is the only problem I see at the moment.

Because the front-end doesn't know how many frames are in the video, how can it validate the specified end frame number?

The UI can request frame N as usual (without gaps). We need to hide the complexity from the UI. But at the same time it is better to show the correct frame number. Thus the annotation task should have "start_frame" and "frame_step" fields. The correct frame number (to display in the UI) will be start_frame + N*frame_step.
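The mapping from the gap-free index N to the original frame number can be sketched as follows. This is only an illustration of the formula start_frame + N*frame_step; the function name `original_frame_number` and the default values are assumptions, not part of the tool.

```python
def original_frame_number(n, start_frame=0, frame_step=1):
    """Map the gap-free task index `n` to the frame number in the source video."""
    if n < 0 or start_frame < 0 or frame_step < 1:
        raise ValueError("n and start_frame must be >= 0, frame_step must be >= 1")
    return start_frame + n * frame_step


# Every 15th frame starting from frame 0.
for n in range(3):
    print(n, original_frame_number(n, start_frame=0, frame_step=15))
```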

If we use time values to specify the start and stop frames, are they as precise as frame numbers? For example, in a 30 fps video, is the 2nd frame at 33.33 ms or 33.333333 ms?

In general it is necessary to find the nearest frame for the timestamp. But to avoid any problems it is OK to accept frame numbers only.
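For reference, finding the nearest frame for a timestamp could look like the sketch below. It assumes a constant frame rate and zero-based frame numbers (so the 2nd frame is frame 1); the function name `nearest_frame` is hypothetical. With rounding, both 33.33 ms and 33.333333 ms resolve to the same frame, which is why the precision of the timestamp matters less than it may seem.

```python
def nearest_frame(timestamp_s, fps):
    """Return the zero-based frame number closest to `timestamp_s` (in seconds)."""
    if timestamp_s < 0 or fps <= 0:
        raise ValueError("timestamp must be >= 0 and fps must be > 0")
    return round(timestamp_s * fps)


print(nearest_frame(0.03333, 30))
print(nearest_frame(0.033333333, 30))
```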

nmanovic · 2019-04-16T09:32:36Z

From previous PR: #382

Let me clarify my vision for the feature:

Several extra fields are needed:
- start frame. It should be empty by default and less than the stop frame. It would be good to specify it as a frame number.
- stop frame. It should be empty by default (which corresponds to the end of the video), >= 0, and greater than the start frame. It would be good to specify it as a frame number.
- step. It should be empty by default (which corresponds to the minimum possible step) and > 0. It would be good to specify it as a number of frames.
Inside the dump file you should have an exact correspondence between the video file and the dumped frames. For example, if you annotate only 1 frame out of every 30, you should have <frame000000>, <frame000030>, ... inside the annotation file.
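A minimal sketch of how the dumped frame identifiers could be derived from the start, stop, and step fields is shown below. The function name `dumped_frame_ids`, the inclusive stop frame, and the `frame` prefix with six digits are assumptions made for illustration; the actual dump format may differ.

```python
def dumped_frame_ids(start, stop, step):
    """List dump identifiers that use the original video frame numbers."""
    if start < 0 or stop < start or step < 1:
        raise ValueError("expected 0 <= start <= stop and step >= 1")
    # Assumption: the stop frame itself is included when it falls on the step grid.
    return [f"frame{n:06d}" for n in range(start, stop + 1, step)]


print(dumped_frame_ids(0, 30, 15))
```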
As I said previously, the patch should be submitted to the nm/rest_api branch. It will be merged into the develop branch very soon.

zliang7 · 2019-04-17T13:47:26Z

The select filter does work, but the numeric suffix of the generated image filenames is continuous. Is it OK to use an image_%start_%step_%d.jpg pattern? Or do we have to rename all files to a flat image_xxx.jpg pattern?

I have no idea how to check the end frame in the front-end. The UI can only check whether it is a number. If the backend finds it invalid, it reports an error. Is this OK?

Last, I still have a concern about how the user gets the exact number of video frames if he/she wants to set the fields. But I'm OK if you don't think this is a problem.

nmanovic · 2019-04-17T14:22:42Z

@zliang7 ,

The select filter does work, but the numeric suffix of the generated image filenames is continuous. Is it OK to use an image_%start_%step_%d.jpg pattern? Or do we have to rename all files to a flat image_xxx.jpg pattern?

I think that continuous numbering should be OK. In general we need to fix only the "dump". You can leave the client and server as they are (they can assume that they work with continuous frames). You should get the same behaviour as if the user had extracted all necessary frames manually and created a task from them. I hope you get my idea.

I have no idea how to check the end frame in the front-end. The UI can only check whether it is a number. If the backend finds it invalid, it reports an error. Is this OK?

I believe it is OK. In this case the server should return BAD REQUEST (400 error code). It means that one of the input parameters is incorrect.
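The server-side check could be sketched roughly like this. It is framework-independent and returns a status/message pair rather than a real HTTP response; the function name `check_frame_range` and the error messages are hypothetical, and the total frame count is assumed to be known on the server once the video has been opened.

```python
from http import HTTPStatus


def check_frame_range(start_frame, stop_frame, total_frames):
    """Return (status, message) for a requested frame range."""
    if start_frame < 0:
        return HTTPStatus.BAD_REQUEST, "start frame must be >= 0"
    if stop_frame >= total_frames:
        return HTTPStatus.BAD_REQUEST, "stop frame is past the end of the video"
    if start_frame >= stop_frame:
        return HTTPStatus.BAD_REQUEST, "start frame must be less than stop frame"
    return HTTPStatus.OK, "ok"


status, message = check_frame_range(0, 500, total_frames=300)
print(int(status), message)
```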

Last, I still have a concern about how the user gets the exact number of video frames if he/she wants to set the fields. But I'm OK if you don't think this is a problem.

In the future it will be necessary to implement data import/export for an annotation task (#152). For now we can give the correct command in our documentation. Even now the user has to know how to extract frames.

nmanovic · 2019-05-05T13:34:12Z

@zliang7 ,

In my comments on your PR I suggested using frame_filter with a step=N value (the input field should have hints for users) instead of a step field. I hope it will not take too much time to update your PR to support that. While reviewing the PR I realized that this would be much better. I got the idea from your code.
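To illustrate the frame_filter idea, a parser for a "step=N" value might look like the sketch below. Only the "step=N" form comes from the discussion; the function name `parse_frame_filter`, the tolerance for whitespace, and treating an empty value as step 1 are assumptions.

```python
import re

# Accepts values such as "step=15"; surrounding whitespace is tolerated.
_STEP_RE = re.compile(r"^\s*step\s*=\s*(\d+)\s*$")


def parse_frame_filter(frame_filter):
    """Return the step encoded in a 'step=N' filter string (1 if the filter is empty)."""
    if not frame_filter:
        return 1  # assumption: an empty filter means every frame
    match = _STEP_RE.match(frame_filter)
    if match is None:
        raise ValueError(f"unsupported frame_filter: {frame_filter!r}")
    step = int(match.group(1))
    if step < 1:
        raise ValueError("step must be a positive integer")
    return step


print(parse_frame_filter("step=15"))
```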

Also I suggested supporting images as well. Our primary goal is to make CVAT better for our customers. I believe that we need to look to the future (e.g. they can have a directory with frames, and in this case they need the same functionality for images). Let me know if you think that supporting images will take a lot of time.

Thanks for your PR. I am happy to see you in our virtual team!

nmanovic · 2019-05-30T20:33:27Z

#437

nmanovic added the enhancement New feature or request label Nov 13, 2018

nmanovic added this to the Backlog milestone Nov 13, 2018

nmanovic self-assigned this Nov 13, 2018

nmanovic modified the milestones: Backlog, 0.4.0 - Alpha Dec 1, 2018

nmanovic added the good first issue label Jan 10, 2019

nmanovic modified the milestones: 0.4.0 - Alpha, 0.4.0 - Beta, Backlog Jan 21, 2019

nmanovic mentioned this issue Mar 4, 2019

Reduce number of jpeg file generated from video input file #343

Closed

zliang7 mentioned this issue Apr 9, 2019

Set frame rate of video extracting #382

Closed

nmanovic modified the milestones: 0.4.0 - Beta, 0.5.0 - Alpha Apr 20, 2019

nmanovic changed the title ~~Ability to create a task with a subset of frames for a video~~ Ability to create a task with a subset of frames for a video file and images May 5, 2019

nmanovic closed this as completed May 30, 2019
