0.10.0
Benchmark v2 🎉
Benchmark has received several internal improvements. While general usage mostly stays the same, there are a few user-facing changes from previous versions:
Breaking changes
- Datatype.TABULAR_NUMERIC and Datatype.TABULAR_MIXED have been replaced by a single enum variant, Datatype.TABULAR.
- If you're passing a list of multiple sources to make_dataset, an exception will be raised.
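If old variant names are stored as strings (in config files, for example), a small mapping can ease the switch. This is a minimal sketch that works on names only and does not import Trainer; the helper name migrate_datatype_name is hypothetical and not part of the library.

```python
# Variant names removed in 0.10.0, mapped to their single replacement.
LEGACY_DATATYPE_NAMES = {
    "TABULAR_NUMERIC": "TABULAR",
    "TABULAR_MIXED": "TABULAR",
}


def migrate_datatype_name(name: str) -> str:
    """Return the 0.10.0 variant name for a possibly legacy Datatype name.

    Hypothetical helper: names that are not in the mapping pass through unchanged.
    """
    return LEGACY_DATATYPE_NAMES.get(name, name)


for old_name in ("TABULAR_NUMERIC", "TABULAR_MIXED", "TABULAR"):
    print(old_name, "->", migrate_datatype_name(old_name))
```

Assuming Datatype is a standard Python Enum, the migrated name could then be looked up with Datatype[migrate_datatype_name(name)].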
Deprecations
- make_dataset is being replaced by create_dataset
- The freestanding functions for Gretel datasets (get_gretel_dataset, list_gretel_datasets, list_gretel_dataset_tags) are being replaced by methods on a new object:

    repo = GretelDatasetRepo()
    repo.get_dataset(...)
    repo.list_datasets(...)
    repo.list_gretel_dataset_tags(...)
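For codebases with many call sites of the freestanding functions, one option is a thin local wrapper that keeps the old call shape and delegates to a repo object. The sketch below is an illustration only: StandInDatasetRepo is a hypothetical stand-in so the example runs offline, and the wrapper is not Trainer's own deprecation mechanism. In real code the repo argument would be a GretelDatasetRepo instance.

```python
import warnings


class StandInDatasetRepo:
    """Hypothetical stand-in for GretelDatasetRepo so this sketch runs offline."""

    def get_dataset(self, name):
        return {"name": name}


def get_gretel_dataset(name, repo=None):
    """Local wrapper: old call shape, delegating to a repo object's method."""
    warnings.warn(
        "get_gretel_dataset is deprecated; call repo.get_dataset instead",
        DeprecationWarning,
        stacklevel=2,
    )
    repo = repo if repo is not None else StandInDatasetRepo()
    return repo.get_dataset(name)


# Record warnings so remaining old-style call sites are easy to find.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    dataset = get_gretel_dataset("example")

print(dataset, [str(w.message) for w in caught])
```

The wrapper can be deleted once every call site uses the repo methods directly.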
Trainer column partitioning removed 👋
Trainer no longer partitions datasets by column. The max_header_clusters argument to the Gretel model classes in gretel_trainer.models is deprecated, and will be removed in a future release.
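If model arguments are assembled in a dictionary before being passed to a model class, the deprecated argument can be filtered out ahead of its removal. This is a minimal sketch: strip_deprecated_kwargs is a hypothetical helper, and other_option is a placeholder key rather than a real Trainer parameter.

```python
import warnings

# Arguments deprecated in 0.10.0 according to the notes above.
DEPRECATED_MODEL_KWARGS = {"max_header_clusters"}


def strip_deprecated_kwargs(kwargs: dict) -> dict:
    """Return a copy of kwargs without the deprecated model arguments."""
    cleaned = {k: v for k, v in kwargs.items() if k not in DEPRECATED_MODEL_KWARGS}
    dropped = sorted(set(kwargs) - set(cleaned))
    if dropped:
        warnings.warn(
            f"dropping deprecated arguments: {', '.join(dropped)}",
            stacklevel=2,
        )
    return cleaned


# "other_option" is a placeholder, not a real Trainer parameter.
model_kwargs = strip_deprecated_kwargs({"max_header_clusters": 20, "other_option": 1})
print(model_kwargs)
```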
Smaller notes 🧹
- A bug when downloading record handler data in Relational Trainer in hybrid deployments has been fixed
- Relational Trainer uses Pandas features that were added in 1.5, so the dependency version has been corrected.
- Trainer no longer depends on gretel-synthetics
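To check whether an existing environment already has a new enough Pandas, a quick check along these lines may help. It is a sketch that assumes the corrected requirement amounts to "1.5 or newer"; the exact version specifier is not stated in these notes, and meets_minimum is a hypothetical helper.

```python
import re
from importlib.metadata import PackageNotFoundError, version


def meets_minimum(installed: str, minimum=(1, 5)) -> bool:
    """Compare the leading major.minor of a version string against a minimum."""
    match = re.match(r"(\d+)\.(\d+)", installed)
    if match is None:
        raise ValueError(f"unrecognized version string: {installed!r}")
    return (int(match.group(1)), int(match.group(2))) >= tuple(minimum)


try:
    pandas_version = version("pandas")
    status = "ok" if meets_minimum(pandas_version) else "older than 1.5"
    print("pandas", pandas_version, status)
except PackageNotFoundError:
    print("pandas is not installed")
```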
All PRs
- Clean up some imports by @mikeknep in #141
- Drop relational drawing spike by @mikeknep in #143
- Dependency tweaks by @mikeknep in #145
- Benchmark v2 by @mikeknep in #146
- Misc by @mikeknep in #148
- Remove column partitioning by @mikeknep in #144
- Move benchmark log setup out of init by @mikeknep in #149
- Export helper as fixture instead of importing from test package by @mikeknep in #147
- Cleanup by @mikeknep in #150
- Wrap record handler data download in smart_open by @mikeknep in #151
Full Changelog: v0.9.1...v0.10.0