Registry based config - Part 1 #975

Merged: 67 commits, Mar 20, 2024
Changes from 63 commits
Commits
030180e
fix alls and imports in utils
dakinggg Feb 10, 2024
cef0bac
first try
dakinggg Feb 10, 2024
5e96253
fix?
dakinggg Feb 10, 2024
ed05e31
add import option
dakinggg Feb 10, 2024
4948006
add import option
dakinggg Feb 10, 2024
4145481
decorate
dakinggg Feb 10, 2024
2e53466
new approach
dakinggg Feb 10, 2024
8420d73
first draft
dakinggg Feb 14, 2024
dc5ebdd
fix
dakinggg Feb 15, 2024
05d3589
wip
dakinggg Feb 15, 2024
48477c6
wip
dakinggg Feb 15, 2024
70c1245
wip
dakinggg Feb 20, 2024
50e60f4
merge
dakinggg Feb 29, 2024
c094466
test organization
dakinggg Feb 29, 2024
f32936d
fixes
dakinggg Feb 29, 2024
7dd3ba0
fix
dakinggg Feb 29, 2024
4797337
rm
dakinggg Feb 29, 2024
91d208c
fix
dakinggg Feb 29, 2024
bf4efa4
fix
dakinggg Feb 29, 2024
5825fe1
fix
dakinggg Feb 29, 2024
80f1f76
rename
dakinggg Feb 29, 2024
5665df4
fix
dakinggg Feb 29, 2024
81ed9df
modularize
dakinggg Feb 29, 2024
4b2d7df
reexport
dakinggg Feb 29, 2024
235c12d
test
dakinggg Feb 29, 2024
8caf484
test
dakinggg Feb 29, 2024
4e0f524
fix
dakinggg Feb 29, 2024
f047023
Merge branch 'main' into registry
dakinggg Feb 29, 2024
0aafaf0
wip
dakinggg Mar 1, 2024
a8fb87a
fix and tests
dakinggg Mar 1, 2024
4215fa8
merge
dakinggg Mar 2, 2024
cdaff87
pc
dakinggg Mar 2, 2024
561b1ab
pyright?
dakinggg Mar 2, 2024
ae2f0e0
3.9 compat
dakinggg Mar 2, 2024
c4b5c9b
Merge branch 'main' into registry
dakinggg Mar 5, 2024
9342f6f
circular and descriptions
dakinggg Mar 5, 2024
fb4cdad
wip
dakinggg Mar 5, 2024
c145f39
Merge branch 'main' into registry
dakinggg Mar 8, 2024
d176ae1
pc
dakinggg Mar 8, 2024
bacec38
Merge branch 'main' into registry
dakinggg Mar 8, 2024
dd07597
move import_file around
dakinggg Mar 8, 2024
40d1c0a
fix
dakinggg Mar 9, 2024
b924c1d
pc
dakinggg Mar 9, 2024
63d0a4d
basic docs
dakinggg Mar 9, 2024
9912576
all
dakinggg Mar 9, 2024
8379bb0
pc
dakinggg Mar 9, 2024
1de7e74
temp change for testing
dakinggg Mar 9, 2024
870310b
more
dakinggg Mar 9, 2024
a4ec3a3
merge
dakinggg Mar 11, 2024
81ef36a
undo temp
dakinggg Mar 11, 2024
870460c
skip readme tests
dakinggg Mar 11, 2024
cf7cc75
merge
dakinggg Mar 11, 2024
e5e0ddd
pc
dakinggg Mar 11, 2024
3938cf2
construction emojis
dakinggg Mar 11, 2024
5aa86ba
docstring
dakinggg Mar 11, 2024
91d8fc8
Merge branch 'main' into registry
dakinggg Mar 11, 2024
068ea77
Bump version to 0.6.0 (#1023)
dakinggg Mar 12, 2024
6b08d59
merge
dakinggg Mar 12, 2024
be0e0c5
Merge branch 'main' into registry
dakinggg Mar 13, 2024
5a17268
Allow code-quality workflow to be callable (#1026)
b-chu Mar 13, 2024
662ac84
merge
dakinggg Mar 16, 2024
fe15f27
Merge branch 'main' into registry
dakinggg Mar 18, 2024
c106ab6
fix type checking for experimental
dakinggg Mar 18, 2024
d078a13
pr comments 1
dakinggg Mar 20, 2024
b0d8c65
pr comments
dakinggg Mar 20, 2024
2593d94
fix test
dakinggg Mar 20, 2024
3659e3e
fix
dakinggg Mar 20, 2024
4 changes: 4 additions & 0 deletions README.md
@@ -255,6 +255,10 @@ export HUGGING_FACE_HUB_TOKEN=your-auth-token

and uncomment the line containing `--hf_repo_for_upload ...` in the above call to `inference/convert_composer_to_hf.py`.

# :construction: UNDER CONSTRUCTION: Registry

We are adopting an extensible registry for LLM Foundry to allow various extensions of the library without forking it. See [REGISTRY.md](./REGISTRY.md) for more information as it develops.

# Learn more about LLM Foundry!

Check out [TUTORIAL.md](https://github.com/mosaicml/llm-foundry/blob/main/TUTORIAL.md) to keep learning about working with LLM Foundry. The tutorial highlights example workflows, points you to other resources throughout the repo, and answers frequently asked questions!
84 changes: 84 additions & 0 deletions REGISTRY.md
@@ -0,0 +1,84 @@
# :construction: LLM Foundry Registry

Some components of LLM Foundry are registrable. This means that you can register options for these components, and then use them in your yaml config, without forking the library.

## How to register

There are a few ways to register a new component:

### Python entrypoints

If you are building your own package, you can register its components through Python entry points.

For example, the following would register the `MyLogger` class, under the key `my_logger`, in the `llm_foundry.loggers` registry:

<!--pytest.mark.skip-->
```toml
[build-system]
requires = ["setuptools>=42", "wheel"]
build-backend = "setuptools.build_meta"

[project]
name = "foundry_registry"
version = "0.1.0"
dependencies = [
"mosaicml",
"llm-foundry",
]

[project.entry-points."llm_foundry.loggers"]
my_logger = "foundry_registry.loggers:MyLogger"
```
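
To check what a given entry point group exposes in the current environment, a minimal sketch using only the standard library is shown here. The helper name `list_entry_points` is hypothetical, and the group name is taken from the example above; whether LLM Foundry discovers entry points exactly this way is an assumption.

```python
from importlib.metadata import entry_points


def list_entry_points(group: str, load: bool = False) -> dict:
    """Return {name: target} for every installed entry point in `group`.

    With load=False the target is the "module:attr" string from the package
    metadata; with load=True the referenced object is imported and returned.
    """
    eps = entry_points()
    # Newer Python versions expose .select(); older ones return a dict of lists.
    if hasattr(eps, "select"):
        selected = eps.select(group=group)
    else:
        selected = eps.get(group, [])
    return {ep.name: (ep.load() if load else ep.value) for ep in selected}


# Empty unless a package declaring this group is installed.
print(list_entry_points("llm_foundry.loggers"))
```

If the `foundry_registry` package from the example were installed, the printed mapping would be expected to contain `my_logger`.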

### Direct call to register

You can also register a component directly in your code:

<!--pytest.mark.skip-->
```python
from composer.loggers import LoggerDestination
from llmfoundry.registry import loggers

class MyLogger(LoggerDestination):
    pass

loggers.register("my_logger", func=MyLogger)
```

### Decorators

You can also use decorators to register components directly from your code:

<!--pytest.mark.skip-->
```python
from composer.loggers import LoggerDestination
from llmfoundry.registry import loggers

@loggers.register("my_logger")
class MyLogger(LoggerDestination):
    pass
```
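
The direct call and the decorator are two spellings of the same operation. The following self-contained sketch shows one way a `register` method can support both forms. `ToyRegistry` and the two logger classes are hypothetical stand-ins for illustration; the real registry in `llmfoundry.registry` may be implemented differently.

```python
from typing import Any, Dict, Optional


class ToyRegistry:
    """Hypothetical stand-in for a registry; not the llmfoundry implementation."""

    def __init__(self, name: str) -> None:
        self.name = name
        self._entries: Dict[str, Any] = {}

    def register(self, key: str, func: Optional[Any] = None) -> Any:
        """Register `func` under `key`, or return a decorator if `func` is None."""

        def _do_register(obj: Any) -> Any:
            self._entries[key] = obj
            return obj  # returning obj keeps a decorated class usable

        if func is not None:
            return _do_register(func)  # direct call form
        return _do_register  # decorator form

    def get(self, key: str) -> Any:
        if key not in self._entries:
            raise KeyError(f"{key!r} is not registered in {self.name}")
        return self._entries[key]


loggers = ToyRegistry("loggers")


class DirectLogger:
    pass


# Direct call form.
loggers.register("direct_logger", func=DirectLogger)


# Decorator form.
@loggers.register("decorated_logger")
class DecoratedLogger:
    pass
```

In this sketch, `loggers.get("direct_logger")` and `loggers.get("decorated_logger")` return the respective classes, and an unknown key raises `KeyError`.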

For both the direct call and decorator approaches, if you are using the LLM Foundry train/eval scripts, you will need to provide the `code_paths` argument, which is a list of files that must be executed in order to register your components. For example, you may have a file called `foundry_imports.py` that contains the following:

<!--pytest.mark.skip-->
```python
from foundry_registry.loggers import MyLogger
from llmfoundry.registry import loggers

loggers.register("my_logger", func=MyLogger)
```

You would then provide `code_paths` to the train/eval scripts in your yaml config:

<!--pytest.mark.skip-->
```yaml
...
code_paths:
- foundry_imports.py
...
```
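
Registration is a side effect of executing the listed files. As a rough sketch of what executing a file by path can look like, the helper below uses the standard `importlib` machinery. The name `import_file_sketch` is hypothetical, and how the train/eval scripts actually process `code_paths` is an assumption here.

```python
import importlib.util
from pathlib import Path
from types import ModuleType


def import_file_sketch(path: str) -> ModuleType:
    """Execute the Python file at `path` as a module and return it."""
    file_path = Path(path).resolve()
    if not file_path.is_file():
        raise FileNotFoundError(f"No such file: {file_path}")
    spec = importlib.util.spec_from_file_location(file_path.stem, file_path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot build an import spec for {file_path}")
    module = importlib.util.module_from_spec(spec)
    # Running the module executes its top-level code, including any
    # register(...) calls it contains.
    spec.loader.exec_module(module)
    return module
```

Calling `import_file_sketch("foundry_imports.py")` once per entry in `code_paths` would run each file's top-level registration calls before the config is resolved.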


## Discovering registrable components
Coming soon
6 changes: 5 additions & 1 deletion llmfoundry/__init__.py
@@ -19,7 +19,7 @@

hf_dynamic_modules_logger.addFilter(new_files_warning_filter)

-from llmfoundry import optim, utils
+from llmfoundry import algorithms, callbacks, loggers, optim, registry, utils
from llmfoundry.data import (ConcatTokensDataset, MixtureOfDenoisersCollator,
NoConcatDataset, Seq2SeqFinetuningCollator,
build_finetuning_dataloader,
@@ -65,7 +65,11 @@
'build_alibi_bias',
'optim',
'utils',
'loggers',
'algorithms',
'callbacks',
'TiktokenTokenizerWrapper',
'registry',
]

__version__ = '0.6.0'
12 changes: 12 additions & 0 deletions llmfoundry/algorithms/__init__.py
@@ -0,0 +1,12 @@
# Copyright 2024 MosaicML LLM Foundry authors
# SPDX-License-Identifier: Apache-2.0

from composer.algorithms import (Alibi, GatedLinearUnits, GradientClipping,
LowPrecisionLayerNorm)

from llmfoundry.registry import algorithms

algorithms.register('gradient_clipping', func=GradientClipping)
algorithms.register('alibi', func=Alibi)
algorithms.register('gated_linear_units', func=GatedLinearUnits)
algorithms.register('low_precision_layernorm', func=LowPrecisionLayerNorm)
25 changes: 25 additions & 0 deletions llmfoundry/callbacks/__init__.py
@@ -1,6 +1,11 @@
# Copyright 2022 MosaicML LLM Foundry authors
# SPDX-License-Identifier: Apache-2.0

from composer.callbacks import (EarlyStopper, Generate, LRMonitor,
MemoryMonitor, MemorySnapshot, OOMObserver,
OptimizerMonitor, RuntimeEstimator,
SpeedMonitor)

from llmfoundry.callbacks.async_eval_callback import AsyncEval
from llmfoundry.callbacks.curriculum_learning_callback import CurriculumLearning
from llmfoundry.callbacks.eval_gauntlet_callback import EvalGauntlet
@@ -11,6 +16,26 @@
from llmfoundry.callbacks.resumption_callbacks import (GlobalLRScaling,
LayerFreezing)
from llmfoundry.callbacks.scheduled_gc_callback import ScheduledGarbageCollector
from llmfoundry.registry import callbacks, callbacks_with_config

callbacks.register('lr_monitor', func=LRMonitor)
callbacks.register('memory_monitor', func=MemoryMonitor)
callbacks.register('memory_snapshot', func=MemorySnapshot)
callbacks.register('speed_monitor', func=SpeedMonitor)
callbacks.register('runtime_estimator', func=RuntimeEstimator)
callbacks.register('optimizer_monitor', func=OptimizerMonitor)
callbacks.register('generate_callback', func=Generate)
callbacks.register('early_stopper', func=EarlyStopper)
callbacks.register('fdiff_metrics', func=FDiffMetrics)
callbacks.register('hf_checkpointer', func=HuggingFaceCheckpointer)
callbacks.register('global_lr_scaling', func=GlobalLRScaling)
callbacks.register('layer_freezing', func=LayerFreezing)
callbacks.register('mono_checkpoint_saver', func=MonolithicCheckpointSaver)
callbacks.register('scheduled_gc', func=ScheduledGarbageCollector)
callbacks.register('oom_observer', func=OOMObserver)

callbacks_with_config.register('async_eval', func=AsyncEval)
callbacks_with_config.register('curriculum_learning', func=CurriculumLearning)

__all__ = [
'FDiffMetrics',
5 changes: 3 additions & 2 deletions llmfoundry/callbacks/async_eval_callback.py
@@ -14,14 +14,15 @@
from typing import Any, Dict, List, Optional, Tuple, Union

from composer.callbacks import CheckpointSaver
-from composer.core import Callback, Event, State, Time, Timestamp, TimeUnit
+from composer.core import Event, State, Time, Timestamp, TimeUnit
from composer.loggers import Logger
from composer.loggers.mosaicml_logger import (MOSAICML_PLATFORM_ENV_VAR,
RUN_NAME_ENV_VAR)
from composer.utils import dist
from composer.utils.file_helpers import list_remote_objects
from composer.utils.misc import create_interval_scheduler

from llmfoundry.interfaces import CallbackWithConfig
from mcli import Run, RunConfig, create_run, get_run

log = logging.getLogger(__name__)
@@ -177,7 +178,7 @@ def validate_eval_run_config(
CHECKS_PER_INTERVAL = 4


-class AsyncEval(Callback):
+class AsyncEval(CallbackWithConfig):
"""Run the eval loop asynchronously as part of a MosaicML platform run.

This callback is currently experimental. The API may change in the future.
9 changes: 5 additions & 4 deletions llmfoundry/callbacks/curriculum_learning_callback.py
@@ -10,18 +10,19 @@
import logging
from typing import Any, Dict

-from composer.core import Callback, State
+from composer.core import State
from composer.loggers import Logger
from streaming import StreamingDataset
from torch.utils.data import DataLoader

from llmfoundry.interfaces import CallbackWithConfig
from llmfoundry.utils.warnings import experimental

log = logging.getLogger(__name__)


@experimental('CurriculumLearning callback')
-class CurriculumLearning(Callback):
+class CurriculumLearning(CallbackWithConfig):
"""Starts an epoch with a different dataset when resuming from a checkpoint.

Args:
@@ -30,13 +31,13 @@ class CurriculumLearning(Callback):
being used.
"""

-    def __init__(self, dataset_index: int, current_dataset_config: Dict):
+    def __init__(self, dataset_index: int, train_config: Dict):
         self.dataset_index = dataset_index
         self.saved_dataset_index = 0
         self.all_dataset_configs = []
         self.current_dataset_state = {}
         # The current dataset config is resolved and passed in train.py
-        self.current_dataset_config = current_dataset_config
+        self.current_dataset_config = train_config['dataloader']

     def before_load(self, state: State, logger: Logger):
         del logger
8 changes: 8 additions & 0 deletions llmfoundry/interfaces/__init__.py
@@ -0,0 +1,8 @@
# Copyright 2024 MosaicML LLM Foundry authors
# SPDX-License-Identifier: Apache-2.0

from llmfoundry.interfaces.callback_with_config import CallbackWithConfig

__all__ = [
'CallbackWithConfig',
]
21 changes: 21 additions & 0 deletions llmfoundry/interfaces/callback_with_config.py
@@ -0,0 +1,21 @@
# Copyright 2024 MosaicML LLM Foundry authors
# SPDX-License-Identifier: Apache-2.0

import abc
from typing import Any

from composer.core import Callback

__all__ = ['CallbackWithConfig']


class CallbackWithConfig(Callback, abc.ABC):
    """A callback that takes a config dictionary as an argument.

    The config is passed in addition to the callback's other kwargs.
    """

    def __init__(self, config: dict[str, Any], *args: Any,
                 **kwargs: Any) -> None:
        del config, args, kwargs
        pass
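
The split between the `callbacks` and `callbacks_with_config` registries suggests that a builder decides, per callback name, whether to pass the training config along. The sketch below illustrates that dispatch with plain dictionaries and toy classes. Everything in it is hypothetical: the class names, the `build_callback_sketch` helper, and the `train_config` keyword (borrowed from the `CurriculumLearning` signature in this diff) are assumptions, not the library's builder.

```python
from typing import Any, Dict


class PlainCallback:
    """Toy callback that only needs its own kwargs."""

    def __init__(self, interval: int = 1) -> None:
        self.interval = interval


class ConfigAwareCallback:
    """Toy callback that also reads values from the training config."""

    def __init__(self, train_config: Dict[str, Any], interval: int = 1) -> None:
        self.interval = interval
        self.run_name = train_config.get("run_name")


# Hypothetical stand-ins for the two registries.
plain_registry = {"plain": PlainCallback}
with_config_registry = {"config_aware": ConfigAwareCallback}


def build_callback_sketch(name: str, kwargs: Dict[str, Any],
                          train_config: Dict[str, Any]) -> Any:
    """Construct a callback by name, adding the config only where it is expected."""
    if name in with_config_registry:
        return with_config_registry[name](train_config=train_config, **kwargs)
    if name in plain_registry:
        return plain_registry[name](**kwargs)
    raise ValueError(f"Unknown callback: {name!r}")


cb = build_callback_sketch("config_aware", {"interval": 5}, {"run_name": "demo"})
```

Keeping two registries lets ordinary callbacks stay unaware of the config, while callbacks such as `AsyncEval` and `CurriculumLearning` can opt in to receiving it.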
14 changes: 14 additions & 0 deletions llmfoundry/loggers/__init__.py
@@ -0,0 +1,14 @@
# Copyright 2024 MosaicML LLM Foundry authors
# SPDX-License-Identifier: Apache-2.0

from composer.loggers import (InMemoryLogger, MLFlowLogger, TensorboardLogger,
WandBLogger)

from llmfoundry.registry import loggers

loggers.register('wandb', func=WandBLogger)
loggers.register('tensorboard', func=TensorboardLogger)
loggers.register('inmemory', func=InMemoryLogger)
loggers.register('in_memory_logger',
func=InMemoryLogger) # for backwards compatibility
loggers.register('mlflow', func=MLFlowLogger)
18 changes: 18 additions & 0 deletions llmfoundry/optim/__init__.py
@@ -1,8 +1,26 @@
# Copyright 2022 MosaicML LLM Foundry authors
# SPDX-License-Identifier: Apache-2.0

from composer.optim import (ConstantWithWarmupScheduler,
CosineAnnealingWithWarmupScheduler, DecoupledAdamW,
LinearWithWarmupScheduler)

from llmfoundry.optim.adaptive_lion import DecoupledAdaLRLion, DecoupledClipLion
from llmfoundry.optim.lion import DecoupledLionW
from llmfoundry.optim.scheduler import InverseSquareRootWithWarmupScheduler
from llmfoundry.registry import optimizers, schedulers

optimizers.register('adalr_lion', func=DecoupledAdaLRLion)
optimizers.register('clip_lion', func=DecoupledClipLion)
optimizers.register('decoupled_lionw', func=DecoupledLionW)
optimizers.register('decoupled_adamw', func=DecoupledAdamW)

schedulers.register('constant_with_warmup', func=ConstantWithWarmupScheduler)
schedulers.register('cosine_with_warmup',
func=CosineAnnealingWithWarmupScheduler)
schedulers.register('linear_decay_with_warmup', func=LinearWithWarmupScheduler)
schedulers.register('inv_sqrt_with_warmup',
func=InverseSquareRootWithWarmupScheduler)

__all__ = [
'DecoupledLionW',