YAML Parser #3

dnhkng · 2024-03-09T21:32:24Z

This adds a YAML parser to make it easy to write high-level model merging configs.

Use it like this:

python yamlconfig.py some.yml

It operates on YAML files with this structure:

slices:
- sources:
  - layer_range: [0, 1]
    model: models/mistral-7b-instruct-v0.1.Q5_K_M.gguf
    weight: 0.6
  - layer_range: [0, 1]
    model: models/zephyr-7b-beta.Q5_K_M.gguf
    weight: 0.4
  merge_method: slerp
- sources:
  - layer_range: [1, 16]
    model: models/mistral-7b-instruct-v0.1.Q5_K_M.gguf
    weight: 0.2
  - layer_range: [3, 18]
    model: models/zephyr-7b-beta.Q5_K_M.gguf
    weight: 0.8
  merge_method: linear
- sources:
  - layer_range: [8, 24]
    model: models/mistral-7b-instruct-v0.1.Q5_K_M.gguf
  merge_method: copy
- sources:
  - layer_range: [16, 30]
    model: models/zephyr-7b-beta.Q5_K_M.gguf
  merge_method: copy
- sources:
  - layer_range: [31, 32]
    model: models/mistral-7b-instruct-v0.1.Q5_K_M.gguf
    weight: 0.1
  - layer_range: [31, 32]
    model: models/zephyr-7b-beta.Q5_K_M.gguf
    weight: 0.9
  merge_method: linear

This config merges models slice by slice. A slice can be copied from a single source model (merge_method: copy) or combined from two source models (merge_method: linear or slerp), in whatever order the config lists them. There is also a simpler 'model' based merging system, where similar models can be merged as a whole by just linear or spherical interpolation.
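To make the slice structure concrete, here is a minimal sketch of how a parsed config could be sanity-checked before merging. It works on the plain dict a YAML loader would return, so it needs no YAML library. The field names come from the example config; the function name `check_slices` and the validation rules (one source for copy, two weighted sources of equal length for linear and slerp) are assumptions for illustration, not what yamlconfig.py actually does.

```python
# Sketch only: field names follow the example config; the rules below are
# assumptions, not the checks implemented in yamlconfig.py.
MIXING_METHODS = {"linear", "slerp"}  # methods that blend two sources


def check_slices(config):
    """Return (merge_method, sources) pairs, raising ValueError on a bad slice."""
    checked = []
    for i, sl in enumerate(config["slices"]):
        method = sl["merge_method"]
        sources = sl["sources"]
        if method == "copy":
            if len(sources) != 1:
                raise ValueError(f"slice {i}: copy expects exactly one source")
        elif method in MIXING_METHODS:
            if len(sources) != 2:
                raise ValueError(f"slice {i}: {method} expects two sources")
            if any("weight" not in s for s in sources):
                raise ValueError(f"slice {i}: {method} sources need a weight")
            # Both ranges must cover the same number of layers to pair them up.
            lengths = {s["layer_range"][1] - s["layer_range"][0] for s in sources}
            if len(lengths) != 1:
                raise ValueError(f"slice {i}: layer ranges differ in length")
        else:
            raise ValueError(f"slice {i}: unknown merge_method {method!r}")
        checked.append((method, sources))
    return checked


# The same shape as two slices of the example config, as an already-parsed dict.
config = {
    "slices": [
        {
            "sources": [
                {"layer_range": [1, 16],
                 "model": "models/mistral-7b-instruct-v0.1.Q5_K_M.gguf",
                 "weight": 0.2},
                {"layer_range": [3, 18],
                 "model": "models/zephyr-7b-beta.Q5_K_M.gguf",
                 "weight": 0.8},
            ],
            "merge_method": "linear",
        },
        {
            "sources": [
                {"layer_range": [8, 24],
                 "model": "models/mistral-7b-instruct-v0.1.Q5_K_M.gguf"},
            ],
            "merge_method": "copy",
        },
    ]
}

for method, sources in check_slices(config):
    print(method, [s["layer_range"] for s in sources])
```

Failing early with the slice index keeps errors close to the config line that caused them, which matters once a config has many slices.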

Note: with the current low-level method, this slice:

- sources:
  - layer_range: [31, 32]
    model: models/mistral-7b-instruct-v0.1.Q5_K_M.gguf
    weight: 0.1
  - layer_range: [31, 32]
    model: models/zephyr-7b-beta.Q5_K_M.gguf
    weight: 0.9

would be:

output layer 46
all linear 31,31,0.1,0.9

Currently, the model identity is not needed, and so the format is:
layer, layer, weight, weight
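As a rough sketch, the translation from a two-source slice to these low-level lines could look like the following. Several things here are assumptions inferred from the single example above: that layer_range is half-open (so [31, 32] means layer 31 only), that a multi-layer slice emits one pair of lines per layer, and that the caller tracks the output layer number. The function name `low_level_lines` is hypothetical, and the `all` keyword is copied verbatim from the example.

```python
# Sketch only: the line format is copied from the example above; the handling
# of multi-layer ranges and the output-layer counter are assumptions.
def low_level_lines(sources, method, first_output_layer):
    """Emit low-level lines for one slice that blends two sources."""
    a, b = sources
    a_layers = range(*a["layer_range"])  # assumed half-open: [31, 32] -> 31
    b_layers = range(*b["layer_range"])
    lines = []
    for offset, (la, lb) in enumerate(zip(a_layers, b_layers)):
        lines.append(f"output layer {first_output_layer + offset}")
        # Current format: layer, layer, weight, weight (no model identity).
        lines.append(f"all {method} {la},{lb},{a['weight']:g},{b['weight']:g}")
    return lines


sources = [
    {"layer_range": [31, 32],
     "model": "models/mistral-7b-instruct-v0.1.Q5_K_M.gguf",
     "weight": 0.1},
    {"layer_range": [31, 32],
     "model": "models/zephyr-7b-beta.Q5_K_M.gguf",
     "weight": 0.9},
]
print("\n".join(low_level_lines(sources, "linear", 46)))
```

Note that the model paths never reach the output, which is exactly the limitation of the current format.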

However, if we were working with more than two models, or if the order changed, this would become confusing. For example, we might want to try merging three models, sometimes merging [1,2] and other times [2,3], or we might have the order reversed, [2,1].
Perhaps we should use the format:
model, model, layer, layer, weight, weight

If this format seems better, I will update the parser to match.
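For discussion, here is a small sketch of what the proposed format could look like when generated from a slice. The helper name `proposed_args` and the `model_ids` mapping (model path to a 1-based index) are hypothetical; the PR does not define how models would be numbered. It only handles single-layer sources, taking the first entry of each layer_range.

```python
# Sketch only: `model_ids` and the 1-based numbering are assumptions made
# for illustration; the proposal above fixes only the field order.
def proposed_args(sources, model_ids):
    """Format 'model, model, layer, layer, weight, weight' for one layer."""
    models = [str(model_ids[s["model"]]) for s in sources]
    layers = [str(s["layer_range"][0]) for s in sources]
    weights = [f"{s['weight']:g}" for s in sources]
    return ",".join(models + layers + weights)


model_ids = {
    "models/mistral-7b-instruct-v0.1.Q5_K_M.gguf": 1,
    "models/zephyr-7b-beta.Q5_K_M.gguf": 2,
}

# Sources listed in reversed order, i.e. the [2,1] case.
reversed_sources = [
    {"layer_range": [31, 32],
     "model": "models/zephyr-7b-beta.Q5_K_M.gguf",
     "weight": 0.9},
    {"layer_range": [31, 32],
     "model": "models/mistral-7b-instruct-v0.1.Q5_K_M.gguf",
     "weight": 0.1},
]
print(proposed_args(reversed_sources, model_ids))
```

With the model indices in front, a reversed pair stays unambiguous: each weight can be matched to its model without relying on the order in which models were loaded.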

YAML Parser

8e88e68

dnhkng mentioned this pull request Mar 9, 2024

WIP: Add model merge example ggerganov/llama.cpp#5741

Draft

Refactor and documentation

dbb5cb7
