Olive-ai 0.5.2
Examples
The following examples are added
Passes (optimization techniques)
- SliceGPT: SliceGPT is a post-training sparsification scheme that makes transformer networks smaller. It applies an orthogonal transformation to each transformer layer and then slices off the least significant rows and columns of the weight matrices, which yields speedups and a reduced memory footprint.
- ExtractAdapters: Extracts the LoRA adapter weights (float or statically quantized) and saves them in a separate file.
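
To illustrate the rotate-then-slice idea behind SliceGPT, here is a minimal NumPy sketch on a toy pair of linear layers. It is not Olive's or SliceGPT's implementation: the function name `rotate_and_slice`, its arguments, and the purely linear setup are assumptions made for illustration (the real method also has to account for normalization layers and residual connections).

```python
import numpy as np


def rotate_and_slice(w_in, w_out, hidden_samples, keep):
    """Toy rotate-then-slice on two stacked linear layers (hypothetical helper).

    w_in:           (d, d_in) weights producing the hidden vector h = w_in @ x
    w_out:          (d_out, d) weights consuming h
    hidden_samples: (n, d) calibration samples of h
    keep:           number of hidden dimensions to retain
    """
    # Principal directions of the hidden activations give an orthogonal basis.
    cov = hidden_samples.T @ hidden_samples
    eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]        # most significant direction first
    q_keep = eigvecs[:, order][:, :keep]     # (d, keep), orthonormal columns

    # Rotate both layers into the new basis and drop the least significant
    # rows of w_in and columns of w_out.
    w_in_sliced = q_keep.T @ w_in            # (keep, d_in)
    w_out_sliced = w_out @ q_keep            # (d_out, keep)
    return w_in_sliced, w_out_sliced


# Small demo with random weights (assumed shapes, for illustration only).
rng = np.random.default_rng(0)
w_in = rng.normal(size=(6, 5))
w_out = rng.normal(size=(3, 6))
x = rng.normal(size=(5, 50))
hidden = (w_in @ x).T                        # (50, 6) calibration activations
w_in_s, w_out_s = rotate_and_slice(w_in, w_out, hidden, keep=5)
```

With `keep` equal to the full hidden size, the rotation is orthogonal and the product of the two layers is unchanged. Slicing then removes only the directions that carry the least activation energy.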
Engine
- Simplify the engine config
Fix
- GenAIModelExporter: Fixed an issue on Windows where the cache_dir path used by the GenAI model exporter could exceed the 260-character path limit.
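
For context on the path-length problem, the sketch below shows one common way to keep a cache directory short: replace a long, descriptive folder name with a short hash. This is only an illustration and not necessarily how Olive resolved the issue. The helper names `short_cache_dir` and `fits_max_path` are hypothetical.

```python
import hashlib
from pathlib import PureWindowsPath

# Classic Windows MAX_PATH limit; it includes the terminating null character.
WINDOWS_MAX_PATH = 260


def short_cache_dir(root: str, long_name: str, digest_chars: int = 8) -> str:
    """Build a cache path from a short, deterministic hash of long_name (hypothetical helper)."""
    digest = hashlib.sha256(long_name.encode("utf-8")).hexdigest()[:digest_chars]
    return str(PureWindowsPath(root) / digest)


def fits_max_path(path: str) -> bool:
    """Return True if the path stays below the classic MAX_PATH limit."""
    return len(path) < WINDOWS_MAX_PATH
```

Hashing keeps the directory name at a fixed length regardless of how long the model or config identifier is, at the cost of a less readable folder name.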