Switching to the Fast Model and Performance Considerations #849

Open
uriorbar17 opened this issue Jan 8, 2025 · 5 comments

@uriorbar17

  1. How can I enable the "fast model"? Does it prioritize speed over accuracy, and what impact might this have on tasks like file type identification?
    https://github.com/google/magika/tree/main/assets/models/fast_v2_1

  2. Is there a way to control or restrict the file types supported by the model? If so, would narrowing down the list of supported file types improve overall performance?

  3. From a performance standpoint, can the model achieve file type identification in less than 5ms per file buffer?

Thank you

@reyammer
Collaborator

/cc @ia0

Here are some answers:

  1. From the Python module, you can specify which model to use via `model_dir`; the Rust client, however, currently always uses standard_v2_1. You can recompile the Rust binary and point it at the fast model, see Keep support for model v1 within the rust client #593 (comment). We are discussing with @ia0 how we could allow selecting a different model and how to prioritize this against other features; feedback like this helps, thanks! For context: we may soon have another model that should be significantly faster with the same accuracy, but it is still WIP. (A short Python sketch of the `model_dir` option is at the end of this comment.)

  2. Yes, we could train a model that focuses only on specific content types (and we could make it smaller). This is unfortunately tricky to run as a service, but we are thinking about how we could offer this tradeoff.

  3. The fast_v2_1 model should get you there; standard_v2_1 should be at roughly ~5ms depending on the hardware, even on CPU. And we have another model pending that should significantly cut the inference time, so keep checking the repo for updates!

Hope this helps!
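
For reference, here is a minimal sketch of point 1 from the Python side. The `model_dir` argument is the one mentioned above; the local model path is a placeholder, and the exact result fields may differ between magika releases, so treat this as a sketch rather than the canonical usage:

```python
# Minimal sketch: selecting the fast model from the Python module via `model_dir`.
# Assumption: a local checkout of google/magika with assets/models/fast_v2_1.
from pathlib import Path

from magika import Magika

fast_model_dir = Path("magika/assets/models/fast_v2_1")  # placeholder local path

m = Magika(model_dir=fast_model_dir)

# Identify a small in-memory buffer; printing the repr avoids depending on
# exact result attribute names, which have changed between releases.
result = m.identify_bytes(b"#!/usr/bin/env python3\nprint('hello')\n")
print(result)
```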

@uriorbar17
Author

Thank you for your earlier response! I ran a benchmark on a regular Windows PC; the tested setup calls the Rust code exported to C++. (A rough Python analogue of the timing loop is sketched at the end of this comment.)

  1. The fast_v2_1 model achieved an average processing time of ~4.5 ms per file, whereas the standard_v2_1 model took ~17 ms per file, a ~73% reduction in per-file time (roughly a 3.8x speedup). Is this level of improvement in line with what I should expect when switching to the fast model?

  2. I noticed a slight speed improvement when upgrading ort from rc8 to rc9
    with the fast_v2_1 model. The improvement was small but consistent
    across runs. Should I expect meaningful performance gains from
    upgrading the ort version, or could this be a coincidence?

  3. While using the fast_v2_1 model, I observed some degradation in
    classification quality compared to the standard_v2_1 model. I assume
    this tradeoff is reasonable given the focus on performance. Can you
    confirm if this is expected?

Thanks again for your insights.
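
As promised above, here is a rough Python analogue of the per-file timing loop I used. The model path and sample directory are placeholders, and Python-side overhead will differ from the Rust/C++ setup, so the absolute numbers won't match mine:

```python
# Rough Python analogue of the per-file timing loop.
# Assumptions: a local checkout of assets/models/fast_v2_1 and a "samples"
# directory of test files; both paths are placeholders.
import time
from pathlib import Path

from magika import Magika

m = Magika(model_dir=Path("magika/assets/models/fast_v2_1"))  # placeholder path

buffers = [p.read_bytes() for p in Path("samples").glob("*") if p.is_file()]

start = time.perf_counter()
for buf in buffers:
    m.identify_bytes(buf)
elapsed_ms = (time.perf_counter() - start) * 1000

print(f"{elapsed_ms / len(buffers):.2f} ms per file buffer on average")
```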

@reyammer
Collaborator

Hello,

Thanks for these tests and for your notes!

Some answers/comments:

About 1: Yes, the fast model should be ~4x faster than the standard one. But the inference times you are getting are rather high! On my machine, I get ~6.2ms for the standard_v2_1 model.

About 2: Interesting, I didn't know about this potential performance boost from the ort version. @ia0, can we look into this? Which version are we using right now?

About 3: Correct, the fast model is faster but has lower accuracy.

That being said, I just merged a new standard_v3_0 model in #866. See the PR notes for more context, but my tests show it's ~3x faster than standard_v2_1 with pretty much the same overall accuracy (and it should be 20% faster than standard_v1). This new model is not yet integrated into the Rust client, but we should have that soon (tracked in #868). After that, your additional tests would be very appreciated!

@ia0
Member

ia0 commented Jan 20, 2025

About 2: This is most probably the "Tensor extract optimization" from ort v2.0.0-rc.9. Such performance improvements are expected, since most of what the magika library does is call ort with the Magika model (plus feature extraction).

We have been using rc9 since #821 (November 2024), but this is not published yet; the latest published version uses rc8. I would suggest publishing after each change; we don't modify the code so often that it makes sense to batch multiple changes. This would also make publishing Python easier, since there would be no need to wait to publish Rust first: Rust would always already be in a published state.

@reyammer
Collaborator

We just released a new -rc version, which ships the latest Rust client, the latest model (which should be significantly faster), and also a pure Python wheel so magika can be installed on less common platforms.

https://pypi.org/project/magika/0.6.1rc0/

You should be able to install and test with `pip install --pre magika`.

Please let us know if you run into any issues, thanks!
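
A quick way to sanity-check the install (a sketch only; the result layout may differ between releases, so only the repr is printed):

```python
# Quick sanity check after `pip install --pre magika`.
from importlib.metadata import version

from magika import Magika

print("magika version:", version("magika"))

m = Magika()  # default bundled model
print(m.identify_bytes(b"PK\x03\x04"))  # first bytes of a ZIP-like buffer
```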
