
v0.14.0-alpha1

Pre-release
@github-actions released this 06 Jan 12:37 · 27 commits to main since this release · 693924b

Support for concurrent predictions

This release introduces support for concurrent processing of predictions through the use of an async predict function.

To enable the feature, add the new concurrency.max entry to your cog.yaml file:

concurrency:
  max: 32

And update your predictor to use the async def predict syntax:

class Predictor(BasePredictor):
    async def setup(self) -> None:
        print("async setup is also supported...")

    async def predict(self) -> str:
        print("async predict")
        return "hello world"

Cog will now process up to 32 predictions simultaneously. Once at capacity, subsequent prediction requests will receive a 409 HTTP response.
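The capacity limit behaves like a counting semaphore: a request is admitted only while fewer than concurrency.max predictions are in flight, and is rejected immediately otherwise. A minimal sketch of that gating logic with asyncio (an illustration only, not Cog's actual implementation; the names CapacityError, MAX, and handle_prediction are made up):

```python
import asyncio

MAX = 32  # mirrors the concurrency.max entry in cog.yaml

class CapacityError(Exception):
    """Stand-in for the HTTP 409 response."""

_sem = asyncio.Semaphore(MAX)

async def handle_prediction(coro):
    # Reject immediately when at capacity instead of queueing.
    if _sem.locked():
        raise CapacityError("409: too many running predictions")
    async with _sem:
        return await coro

async def predict() -> str:
    await asyncio.sleep(0.01)  # pretend to do some work
    return "hello world"

print(asyncio.run(handle_prediction(predict())))  # hello world
```

Rejecting rather than queueing keeps latency predictable under load: the caller learns immediately that the server is saturated and can retry elsewhere.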

Iterators

If your model currently uses Iterator or ConcatenateIterator, it will need to be updated to use AsyncIterator or AsyncConcatenateIterator respectively.

from cog import AsyncConcatenateIterator, BasePredictor

class Predict(BasePredictor):
    async def predict(self) -> AsyncConcatenateIterator[str]:
        for fruit in ["apple", "banana", "orange"]:
            yield fruit
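Outside of Cog, an async generator like the one above is consumed with async for; a quick stand-alone illustration in plain Python (no Cog types, predict and main are just local names):

```python
import asyncio

async def predict():
    # Same shape as the async predict above, minus the Cog types.
    for fruit in ["apple", "banana", "orange"]:
        yield fruit

async def main():
    chunks = []
    async for chunk in predict():  # items arrive as they are yielded
        chunks.append(chunk)
    return chunks

print(asyncio.run(main()))  # ['apple', 'banana', 'orange']
```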

Migrating from 0.10.0a

An earlier fork of cog with concurrency support was published under the 0.10.0a release channel. That fork is now unsupported and will receive no further updates. Some breaking API changes will be introduced with the 0.14.0 beta; this alpha release remains backwards compatible, and you will see deprecation warnings when calling the deprecated functions.

  • emit_metric(name, value) - this has been replaced by current_scope().record_metric(name, value)
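The new call shape can be sketched with a hypothetical stub standing in for cog's real current_scope (in real code you would import current_scope from cog; the _Scope class and the "output_tokens" metric name here are invented for illustration):

```python
# Hypothetical stub mirroring the documented call shape; real code would
# use `from cog import current_scope` instead of defining these.
class _Scope:
    def __init__(self):
        self.metrics = {}

    def record_metric(self, name, value):
        self.metrics[name] = value

_scope = _Scope()

def current_scope():
    return _scope

# Old, deprecated style:
#   emit_metric("output_tokens", 42)
# New style:
current_scope().record_metric("output_tokens", 42)
print(current_scope().metrics)  # {'output_tokens': 42}
```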

Note

The use of current_scope is still experimental and will output warnings to the console. To suppress these, ignore the ExperimentalFeatureWarning:

import warnings
from cog import ExperimentalFeatureWarning
warnings.filterwarnings("ignore", category=ExperimentalFeatureWarning)

Known limitations

  • An async setup method cannot be used without an async predict method. Supported combinations are: sync setup/sync predict, sync setup/async predict, and async setup/async predict.
  • File uploads will block the event loop. If your model outputs File or Path types, uploading them currently blocks the event loop; this may be an issue for large file outputs and will be fixed in a future release.
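Until that fix lands, the standard way to keep your own blocking file I/O off the event loop in async code is to push it to a worker thread with asyncio.to_thread. A generic asyncio sketch (not Cog-specific, and it does not change how Cog itself uploads outputs; write_blocking and predict_like are invented names):

```python
import asyncio
import os
import tempfile

def write_blocking(path: str, data: bytes) -> None:
    # Ordinary blocking file write.
    with open(path, "wb") as f:
        f.write(data)

async def predict_like() -> str:
    path = os.path.join(tempfile.mkdtemp(), "out.bin")
    # Offload the blocking write so other predictions keep running.
    await asyncio.to_thread(write_blocking, path, b"hello world")
    return path

path = asyncio.run(predict_like())
print(open(path, "rb").read())  # b'hello world'
```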

Other Changes

  • Change torch vision to 0.20.0 for torch 2.5.0 cpu by @8W9aG in #2074
  • Ignore files within a .git directory by @8W9aG in #2087
  • Add fast build flag to cog by @8W9aG in #2086
  • Make dockerfile generators abstract by @8W9aG in #2088
  • Do not run a separate python install stage by @8W9aG in #2094

Full Changelog: v0.13.6...v0.14.0-alpha1