Releases: mosaicml/llm-foundry

v0.15.0

23 Nov 02:13

New Features

Open Source Embedding + Contrastive Code (#1615)

LLM Foundry now supports finetuning embedding models with a contrastive loss. Negative passages for the contrastive loss can be either randomly selected or pre-defined. For more information, please see the README.
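
As a rough illustration of the technique, here is a minimal sketch of an InfoNCE-style contrastive loss with in-batch negatives; the function name, shapes, and temperature are illustrative and do not reflect Foundry's actual API or the pre-defined negative path.

import torch
import torch.nn.functional as F

def contrastive_loss(query_emb: torch.Tensor,
                     passage_emb: torch.Tensor,
                     temperature: float = 0.05) -> torch.Tensor:
    # query_emb, passage_emb: (batch, dim). Row i of passage_emb is the
    # positive for row i of query_emb; every other row in the batch acts
    # as a randomly selected negative.
    query_emb = F.normalize(query_emb, dim=-1)
    passage_emb = F.normalize(passage_emb, dim=-1)
    logits = query_emb @ passage_emb.T / temperature  # (batch, batch)
    # The correct "class" for query i is passage i (the diagonal).
    labels = torch.arange(logits.size(0), device=logits.device)
    return F.cross_entropy(logits, labels)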

PyTorch 2.5.1 (#1665)

This release updates LLM Foundry to the PyTorch 2.5.1 release, bringing with it support for the new features and optimizations in PyTorch 2.5.1.

Improved error messages (#1657, #1660, #1623, #1625)

Various error messages have been improved, making it easier to debug user errors.

What's Changed

New Contributors

Full Changelog: v0.14.5...v0.15.0

v0.14.5

18 Nov 17:15
  • Move transform_model_pre_registration in hf_checkpointer (#1664)

Full Changelog: v0.14.4...v0.14.5

v0.14.4

07 Nov 20:42
  • Add max shard size to transformers save_pretrained by @b-chu in #1648
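
For reference, max_shard_size is the standard save_pretrained knob in Transformers; a minimal usage sketch (the model name is a placeholder):

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained('gpt2')  # placeholder model
# Split the saved weights so no single shard file exceeds 2GB.
model.save_pretrained('./checkpoint', max_shard_size='2GB')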

Full Changelog: v0.14.3...v0.14.4

v0.14.3

05 Nov 15:41

What's Changed

Full Changelog: v0.14.2...v0.14.3

v0.14.2

04 Nov 02:14

Bug Fixes

Move loss generating token counting to the dataloader (#1632)

Fixes a throughput regression introduced by #1610, which was released in v0.14.0.
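
A minimal sketch of the idea, assuming the usual Hugging Face convention that label -100 is ignored by the loss; the key names are illustrative, not Foundry's actual dataloader code.

import torch

IGNORE_INDEX = -100  # cross_entropy's default ignore_index

def count_loss_generating_tokens(batch: dict) -> int:
    # Count only positions that actually contribute to the loss,
    # i.e. exclude padding and prompt tokens masked out with -100.
    labels: torch.Tensor = batch['labels']  # (batch, seq_len)
    return int((labels != IGNORE_INDEX).sum().item())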

What's Changed

  • Move loss generating token counting to the dataloader by @dakinggg in #1632

Full Changelog: v0.14.1...v0.14.2

v0.14.1

01 Nov 23:55

New Features

Use log_model for registering models (#1544)

Instead of calling the MLflow register API directly, we now use the intended log_model API, which both logs the model to the MLflow run's artifacts and registers it to Unity Catalog.
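
A hedged sketch of the log_model pattern using MLflow's transformers flavor; the model, task, and Unity Catalog name below are placeholders, and in Foundry the actual call happens inside the checkpointing code rather than in user scripts.

import mlflow
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained('gpt2')  # placeholder model
tokenizer = AutoTokenizer.from_pretrained('gpt2')

mlflow.set_registry_uri('databricks-uc')  # register to Unity Catalog
with mlflow.start_run():
    mlflow.transformers.log_model(
        transformers_model={'model': model, 'tokenizer': tokenizer},
        artifact_path='model',  # logged under the run's artifacts
        task='text-generation',
        registered_model_name='main.my_schema.my_model',  # hypothetical UC name
    )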

What's Changed

Full Changelog: v0.14.0...v0.14.1

v0.14.0

28 Oct 22:41

New Features

Load Checkpoint Callback (#1570)

We added support for Composer's LoadCheckpoint callback, which loads a checkpoint at a specified event. This enables use cases like loading base model weights when training with PEFT.

callbacks:
    load_checkpoint:
        load_path: /path/to/your/weights

Breaking Changes

Accumulate over tokens in a Batch for Training Loss (#1618, #1610, #1595)

We added a new flag, accumulate_train_batch_on_tokens, which specifies whether training loss is accumulated over the number of tokens in a batch rather than the number of samples. It defaults to true, which will slightly change loss curves for models trained with padding. The old behavior can be recovered by explicitly setting this flag to false.
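
A minimal sketch of the difference between the two modes, not Foundry's actual trainer code: with per-token accumulation, a heavily padded sample contributes fewer loss terms than a full-length one, whereas per-sample accumulation weights every sample equally.

import torch
import torch.nn.functional as F

IGNORE_INDEX = -100

def batch_loss(logits, labels, accumulate_on_tokens: bool):
    # logits: (batch, seq, vocab); labels: (batch, seq)
    per_token = F.cross_entropy(
        logits.flatten(0, 1), labels.flatten(),
        ignore_index=IGNORE_INDEX, reduction='none',
    ).view(labels.shape)  # zero at ignored positions
    mask = labels != IGNORE_INDEX
    if accumulate_on_tokens:
        # Normalize by loss-generating tokens across the whole batch.
        return per_token[mask].sum() / mask.sum()
    # Normalize each sample by its own token count, then average samples.
    per_sample = (per_token * mask).sum(dim=1) / mask.sum(dim=1)
    return per_sample.mean()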

Default Run Name (#1611)

If no run name is provided, we now default to Composer's randomly generated run names. (Previously, we defaulted to "llm" for the run name.)

What's Changed

Full Changelog: v0.13.0...v0.14.0

v0.13.1

18 Oct 16:50

🚀 LLM Foundry v0.13.1

What's Changed

  • Add configurability to HF checkpointer timeout by @dakinggg in #1599

Full Changelog: v0.13.0...v0.13.1

v0.13.0

15 Oct 06:23

🚀 LLM Foundry v0.13.0

🛠️ Bug Fixes & Cleanup

PyTorch 2.4 Checkpointing (#1569, #1581, #1583)

Resolved issues related to checkpointing for Curriculum Learning (CL) callbacks.

🔧 Dependency Updates

  • Bumped tiktoken from 0.4.0 to 0.8.0 (#1572)
  • Updated onnxruntime from 1.19.0 to 1.19.2 (#1590)

What's Changed

Full Changelog: v0.12.0...v0.13.0

v0.12.0

26 Sep 03:52

🚀 LLM Foundry v0.12.0

New Features

PyTorch 2.4 (#1505)

This release updates LLM Foundry to the PyTorch 2.4 release, bringing with it support for the new features and optimizations in PyTorch 2.4.

Extensibility improvements (#1450, #1449, #1468, #1467, #1478, #1493, #1495, #1511, #1512, #1527)

Numerous improvements to the extensibility of the modeling and data loading code, making it easier to subclass and extend. Please see the linked PRs for more details on each change.

Improved error messages (#1457, #1459, #1519, #1518, #1522, #1534, #1548, #1551)

Various error messages have been improved, making it easier to debug user errors.

Sliding window in torch attention (#1455)

We've added support for sliding window attention to the reference attention implementation, allowing easier testing and comparison against more optimized attention variants.
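
A minimal sketch of what sliding-window masking looks like in a plain PyTorch (eager) attention implementation; this is illustrative and not the exact code from #1455.

import torch

def torch_attention(q, k, v, window: int):
    # q, k, v: (batch, heads, seq_len, head_dim)
    scale = q.size(-1) ** -0.5
    scores = (q @ k.transpose(-2, -1)) * scale  # (batch, heads, seq, seq)
    i = torch.arange(q.size(-2), device=q.device).unsqueeze(1)  # query positions
    j = torch.arange(k.size(-2), device=q.device).unsqueeze(0)  # key positions
    # Causal, and each query sees at most the `window` most recent keys.
    allowed = (j <= i) & (i - j < window)
    scores = scores.masked_fill(~allowed, float('-inf'))
    return torch.softmax(scores, dim=-1) @ v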

Bug fixes

Extra BOS token for llama 3.1 with completion data (#1476)

A bug resulted in an extra BOS token being added between the prompt and the response during finetuning. This has been fixed so that the prompt and response supplied by the user are concatenated without any extra tokens between them.
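
An illustration of the corrected behavior, assuming a Llama-style tokenizer that prepends BOS by default; the model name is a placeholder and this is not Foundry's actual finetuning code.

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained('meta-llama/Llama-3.1-8B')  # placeholder
prompt_ids = tokenizer('What is 2+2?').input_ids  # BOS prepended here
# Suppress special tokens on the response so no second BOS appears
# at the prompt/response boundary.
response_ids = tokenizer(' 4', add_special_tokens=False).input_ids
input_ids = prompt_ids + response_ids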

What's Changed

New Contributors

Full Changelog: v0.11.0...v0.12.0