Track "K" frames at once instead of frame by frame #7515

Open
2 tasks done
rohit901 opened this issue Feb 23, 2024 · 8 comments
Labels
enhancement New feature or request gsoc2024

Comments

@rohit901

Actions before raising this issue

  • I searched the existing issues and did not find anything similar.
  • I read/searched the docs

Is your feature request related to a problem? Please describe.

Annotating videos takes a long time, even with AI trackers like TransT.
I'm running the model on the CPU of an M1 MacBook Air, but even on a GPU, I think running it frame by frame would make it very slow.

What is the best way to annotate videos? My videos contain many frames, roughly 200-700 each.

I have to run TransT frame by frame, and each frame takes 2-3 seconds to process :/ Is manual annotation with interpolation currently the only other way to annotate videos?

Describe the solution you'd like

A faster, semi-automatic or fully automatic way to annotate a video dataset.

Describe alternatives you've considered

No response

Additional context

You may refer to related issues: #5686, #2949

@rohit901 rohit901 added the enhancement New feature or request label Feb 23, 2024
@nmanovic
Contributor

@rohit901, could you please share your ideas on how to annotate faster using a tracker? I'm not sure I understood them from your issue. You can describe the pipeline step by step or share an existing good implementation in a 3rd-party tool/demo.

@rohit901
Author

Hi @nmanovic,
The existing tracker integration seems to run frame by frame. If it could run on multiple frames of a video together and return all the results at once, it would be much faster.

It would be nice if the tracker predicted the next "K" frames instead of only the next "1" frame.

For example, a third-party tool, Supervisely (https://supervisely.com/), allows you to track the next "K" frames at once. A similar feature in CVAT would be beneficial, since we could then run self-hosted CVAT on our local machines with GPUs.
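The request above can be sketched as a single function call that advances a tracker over the next "K" frames and returns all boxes together. This is only a minimal illustration: the `Tracker` interface, `track_k_frames`, and the toy `ShiftTracker` are hypothetical names made up for this sketch, not CVAT's or TransT's actual API.

```python
from typing import List, Protocol, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


class Tracker(Protocol):
    """Hypothetical single-object tracker interface (not CVAT's actual API)."""

    def init(self, frame: object, box: Box) -> None: ...

    def update(self, frame: object) -> Box: ...


def track_k_frames(tracker: Tracker, frames: Sequence[object], start: int,
                   box: Box, k: int) -> List[Box]:
    """Track `box` from frames[start] over up to the next k frames in one call."""
    tracker.init(frames[start], box)
    stop = min(start + 1 + k, len(frames))  # do not run past the end of the video
    return [tracker.update(frames[i]) for i in range(start + 1, stop)]


class ShiftTracker:
    """Toy stand-in for a real model: moves the box 1 px to the right per frame."""

    def init(self, frame: object, box: Box) -> None:
        self.box = box

    def update(self, frame: object) -> Box:
        x1, y1, x2, y2 = self.box
        self.box = (x1 + 1, y1, x2 + 1, y2)
        return self.box


# One call yields the boxes for the next 3 frames instead of 3 separate calls.
boxes = track_k_frames(ShiftTracker(), frames=[None] * 10, start=0,
                       box=(0, 0, 5, 5), k=3)
print(boxes)
```

With an interface like this, the annotator would review K predicted boxes at a time and correct the track only where it drifts.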

@nmanovic nmanovic changed the title Faster way to annotate videos? Track "K" frames at once instead of frame by frame Mar 5, 2024
@Ashish8329

Hello @nmanovic,

I am interested in contributing to this project. Could you please guide me on the next steps to get involved?

@nmanovic
Contributor

Hi, please find our organization page on GSoC and contact us using one of the methods mentioned there. We will guide you. https://summerofcode.withgoogle.com/programs/2024/organizations/cvat

@siddtmb

siddtmb commented Mar 19, 2024

Why don't you just add a function to track every frame automatically in one server request? Tracking is not slow because it processes every frame; it is slow because CVAT sends a separate server request for every frame.
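The trade-off described here can be illustrated with a rough cost model: total time is the per-request overhead times the number of requests, plus the model's inference time for every frame. The function name and all numbers below are hypothetical and chosen only for illustration; they are not measurements of CVAT.

```python
import math


def total_seconds(n_frames: int, k: int, request_overhead_s: float,
                  inference_s: float) -> float:
    """Estimated time to track n_frames when each request covers k frames."""
    n_requests = math.ceil(n_frames / k)
    return n_requests * request_overhead_s + n_frames * inference_s


# Hypothetical numbers, for illustration only.
per_frame = total_seconds(500, k=1, request_overhead_s=0.5, inference_s=0.1)
batched = total_seconds(500, k=100, request_overhead_s=0.5, inference_s=0.1)
print(per_frame, batched)
```

Batching shrinks only the overhead term. If the model itself needs 2-3 seconds per frame on a CPU, as reported earlier in this thread, the inference term stays the same and would still dominate.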

@siddtmb

siddtmb commented Mar 19, 2024

It would be nice if SiamMask could auto-annotate the next 100 frames or so.

@siddtmb

siddtmb commented Mar 23, 2024

I am also interested in contributing to this, not because of GSoC but because I really need a tracker and the current functionality is very slow. Guidance on how this could be done would be appreciated.

@nmanovic
Contributor

@siddtmb, feel free to contribute. It is an open-source project.

Projects
None yet
Development

No branches or pull requests

4 participants