A Python package for the Open Benchmark for Tabular Data

This Python package supports working with the tabben benchmark of tabular datasets for machine learning tasks. It provides tools for:

loading, processing, or inspecting datasets,
evaluating models on test sets using consistent evaluation metrics, and
examining collections of datasets from the benchmark.

See the tabben website for more info about the project.

Set Up

For the most recent stable release, you can install the tabben package from PyPI:

pip3 install tabben --upgrade

To test or use the package from source, first clone this repository, then install the package in editable mode from the python subdirectory:

pip3 install -e .
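
After either install route, it can be useful to confirm that the package is visible to the interpreter you intend to use. The following is a minimal sketch that relies only on the standard library's importlib.metadata; the helper name installed_version is an illustrative choice and is not part of tabben.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name: str):
    """Return the installed version string of a package, or None if it is absent.

    Hypothetical helper for illustration; it is not provided by tabben.
    """
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_version("tabben")
    if found is None:
        print("tabben is not installed in this environment")
    else:
        print(f"tabben {found} is installed")
```

Running the script with the same interpreter that pip3 installed into reports either the detected version or that the package is missing.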

Documentation

See the package docs for tutorials, API references, and details about each of the datasets included in the benchmark.

Testing

After installing pytest and the required test dependencies, run all the tests from the package directory:

pytest

Tests that use large datasets carry a "large" pytest mark. Because these tests can take a while to run (upwards of a minute, depending on the network and computer), you can exclude them by deselecting that mark:

pytest -m "not large"
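
For readers unfamiliar with pytest marks, the sketch below shows how a test is typically tagged so that the -m "not large" filter skips it. The test names and bodies are hypothetical placeholders, not tests from this repository; only the pytest.mark decorator mechanism itself is standard pytest.

```python
import pytest


@pytest.mark.large
def test_with_large_dataset():
    # Hypothetical stand-in for a test that loads a large dataset.
    # `pytest -m "not large"` deselects this test because of the mark above.
    rows = list(range(1_000_000))
    assert len(rows) == 1_000_000


def test_with_small_dataset():
    # Hypothetical unmarked test; it runs with or without the filter.
    rows = [1, 2, 3]
    assert sum(rows) == 6
```

Custom marks are usually also declared under the markers option in the pytest configuration so that pytest does not warn about an unknown mark; whether and where this project declares it is not shown here.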
