
iq2_xxs: tune quantization #5320

Merged 1 commit on Feb 5, 2024
Commits on Feb 4, 2024

  1. iq2_xxs: tune quantization

    We get slightly better PPL and cut quantization time nearly in half.
    
    The trick is to first quantize without forcing points onto the E8 lattice.
    We can then use a narrower search range around the block scale obtained
    that way.
    Kawrakow committed Feb 4, 2024
    Commit f3798f7
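The two-pass idea in the commit message can be sketched as follows. This is an illustrative Python sketch under simplified assumptions, not llama.cpp's actual implementation: it uses plain round-to-nearest quantization and a comment placeholder where the real E8-lattice projection would go, and the function name and parameters (`quantize_block_two_pass`, `width`, `nsteps`) are hypothetical.

```python
import numpy as np

def quantize_block_two_pass(x, nlevels=8, width=0.1, nsteps=9):
    """Illustrative two-pass block-scale search (not llama.cpp's code).

    Pass 1 quantizes without any lattice constraint, which yields a good
    initial block scale cheaply.  Pass 2 then searches only a narrow range
    around that scale instead of a wide global sweep, which is where the
    time saving of the commit comes from.
    """
    amax = float(np.max(np.abs(x)))
    if amax == 0.0:
        return 0.0, np.zeros_like(x)
    # Pass 1: unconstrained scale estimate from the block maximum.
    d0 = amax / (nlevels - 1)
    # Pass 2: narrow search around d0; keep the scale with least error.
    best_d, best_err, best_q = d0, np.inf, None
    for d in np.linspace(d0 * (1 - width), d0 * (1 + width), nsteps):
        q = np.clip(np.round(x / d), -(nlevels - 1), nlevels - 1)
        # A real implementation would project q onto the E8 lattice here
        # before measuring the error.
        err = float(np.sum((x - d * q) ** 2))
        if err < best_err:
            best_d, best_err, best_q = d, err, q
    return best_d, best_q
```

The key design point is that the unconstrained pass is cheap and its scale is close to the constrained optimum, so the expensive constrained search only needs to cover a small neighborhood around it.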