
chore: Improve package/binary size by remove jinja2 #1063

Closed
nguyenhoangthuan99 opened this issue Aug 23, 2024 · 3 comments · Fixed by #1289
Labels: P1: important (Important feature / fix)
Milestone: v1.0.0

Comments

@nguyenhoangthuan99
Contributor

  • jinja2 doubles the binary size because it pulls in too many dependencies from Boost, yet it is only used to parse the chat template from GGUF models. We can remove this part and let cortex.llamacpp handle it instead, to reduce the size.
@imtuyethan imtuyethan transferred this issue from another repository Sep 2, 2024
@freelerobot freelerobot changed the title Improve package/binary size by remove jinja2 chore: Improve package/binary size by remove jinja2 Sep 6, 2024
@freelerobot freelerobot added the P1: important Important feature / fix label Sep 6, 2024
@freelerobot
Contributor

@nguyenhoangthuan99 can you elaborate on this issue?
Are you proposing to include jinja2 only in cortex.llamacpp rather than in the overall cortexcpp package, or something else?

@dan-menlo dan-menlo moved this from Planning to Scheduled in Menlo Sep 8, 2024
@nguyenhoangthuan99
Contributor Author

Problem

  • cortex-cpp uses the Jinja2Cpp library to parse the chat template from GGUF files. This lets us run models from any source.
  • Jinja2Cpp is the only C++ library that can render jinja2 templates and be built for multiple platforms, but it pulls in many dependencies from Boost.
  • llama.cpp can also parse these jinja2 templates into a chat format internally (see the sketch after this list), but using that feature would require building llama.cpp along with cortex-cpp, which is also not recommended because the llama.cpp repository is too large.
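
For context, here is a minimal sketch of the llama.cpp facility mentioned above, assuming the llama.cpp C API of roughly this period (llama_chat_apply_template in llama.h); the wrapper function is hypothetical, not cortex-cpp code:

```cpp
// Minimal sketch of llama.cpp's built-in chat-template support.
// llama_chat_apply_template is a real llama.cpp API of this period;
// the wrapper below is hypothetical, for illustration only.
#include "llama.h"

#include <string>
#include <vector>

// Render a chat into a prompt string. Passing tmpl == nullptr tells
// llama.cpp to use the template embedded in the model's GGUF metadata.
std::string ApplyChatTemplate(const llama_model* model,
                              const std::vector<llama_chat_message>& msgs) {
  std::string buf(1024, '\0');
  int32_t n = llama_chat_apply_template(model, /*tmpl=*/nullptr, msgs.data(),
                                        msgs.size(), /*add_ass=*/true,
                                        buf.data(), buf.size());
  if (n < 0) return "";  // template missing or not supported by llama.cpp
  if (static_cast<size_t>(n) > buf.size()) {
    buf.resize(n);       // buffer too small: retry with the exact size
    n = llama_chat_apply_template(model, nullptr, msgs.data(), msgs.size(),
                                  true, buf.data(), buf.size());
  }
  buf.resize(n);
  return buf;
}
```

Calling this from cortex-cpp would mean linking llama.cpp into cortex-cpp, which is exactly what the bullet above argues against.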

Solution

  • Models in GGUF format only run with the cortex.llamacpp engine, so we will move the chat-template parsing into cortex.llamacpp. It will be executed at runtime: when a user starts a model with the cortex.llamacpp engine, the engine parses the chat template (a sketch follows this list).

  • This solution requires more effort, but it saves about 60 MB of binary size.
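
To illustrate the runtime step, here is a sketch of how the engine could read the raw template from GGUF metadata when a model starts. llama_model_meta_val_str and the "tokenizer.chat_template" key are real llama.cpp/GGUF conventions; the function name is hypothetical and this is not the actual cortex.llamacpp code:

```cpp
// Sketch: read the raw jinja2 chat template from GGUF metadata at model
// start. GGUF stores it under the "tokenizer.chat_template" key; the
// surrounding function is hypothetical, not actual cortex.llamacpp code.
#include "llama.h"

#include <string>

std::string ReadChatTemplate(const llama_model* model) {
  std::string buf(2048, '\0');
  int32_t n = llama_model_meta_val_str(model, "tokenizer.chat_template",
                                       buf.data(), buf.size());
  if (n < 0) return "";  // model ships no chat template
  if (static_cast<size_t>(n) >= buf.size()) {
    buf.resize(n + 1);   // value truncated: retry with the full size
    n = llama_model_meta_val_str(model, "tokenizer.chat_template",
                                 buf.data(), buf.size());
  }
  buf.resize(n);
  return buf;  // raw jinja2 template text, rendered by the engine at runtime
}
```

Because this runs inside cortex.llamacpp, which already links llama.cpp, cortex-cpp itself no longer needs Jinja2Cpp or its Boost dependencies.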

@nguyenhoangthuan99 nguyenhoangthuan99 moved this from Scheduled to In Progress in Menlo Sep 23, 2024
@nguyenhoangthuan99 nguyenhoangthuan99 moved this from In Progress to In Review in Menlo Sep 23, 2024
@github-project-automation github-project-automation bot moved this from In Review to Completed in Menlo Sep 23, 2024
@nguyenhoangthuan99 nguyenhoangthuan99 moved this from Completed to QA in Menlo Sep 23, 2024
@gabrielle-ong
Contributor

Closing issue, thanks @nguyenhoangthuan99

@gabrielle-ong gabrielle-ong moved this from Review + QA to Completed in Menlo Oct 3, 2024
@gabrielle-ong gabrielle-ong added this to the v1.0.0 milestone Oct 3, 2024