feat: specify `extras` instead of `features` in `to_tabular_dataset` #685

lars-reimann · 2024-05-01T18:14:59Z

Closes #623

Summary of Changes

When creating a tabular dataset, users can now optionally specify extra columns, i.e. columns that are neither target nor feature. The feature columns are implicitly all columns that are neither target nor extra.

Previously, users had to specify the features instead and the extras were implicit. However, the list of features is usually much longer than the list of extras, making the previous approach cumbersome.
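The rule described above (features are whatever is left after removing the target and the extras) can be sketched in plain Python. This is only an illustration of the idea, not the library's implementation: the helper `infer_feature_names` and the column names are hypothetical and do not appear in this PR.

```python
from __future__ import annotations


def infer_feature_names(
    column_names: list[str],
    target_name: str,
    extra_names: list[str] | None = None,
) -> list[str]:
    """Return all columns that are neither the target nor an extra.

    Hypothetical helper for illustration only; it mirrors the rule stated in
    the PR description, not the actual library code.
    """
    extras = set(extra_names or [])

    # Reject inputs that cannot describe a valid dataset.
    if target_name not in column_names:
        raise ValueError(f"Unknown target column: {target_name!r}")
    unknown = extras - set(column_names)
    if unknown:
        raise ValueError(f"Unknown extra columns: {sorted(unknown)}")
    if target_name in extras:
        raise ValueError("The target column cannot also be an extra column.")

    # Features are implicit: everything that is left, in the original order.
    return [
        name
        for name in column_names
        if name != target_name and name not in extras
    ]


# Made-up columns: only the single extra ("id") has to be listed,
# rather than every feature.
columns = ["id", "age", "income", "rooms", "price"]
features = infer_feature_names(columns, target_name="price", extra_names=["id"])
print(features)  # ['age', 'income', 'rooms']
```

With the previous approach, the caller would have had to list `age`, `income`, and `rooms` explicitly, and that list grows with every added feature column.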


github-actions · 2024-05-01T18:16:32Z

🦙 MegaLinter status: ✅ SUCCESS

| Descriptor | Linter | Files | Fixed | Errors | Elapsed time |
|---|---|---|---|---|---|
| ✅ PYTHON | black | 28 | 0 | 0 | 1.45s |
| ✅ PYTHON | mypy | 28 | | 0 | 2.78s |
| ✅ PYTHON | ruff | 28 | 0 | 0 | 0.24s |
| ✅ REPOSITORY | git_diff | yes | | no | 0.38s |

See detailed report in MegaLinter reports
Set VALIDATE_ALL_CODEBASE: true in mega-linter.yml to validate all sources, not only the diff


codecov · 2024-05-01T18:43:04Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (74b8a35) to head (836bd5e).

Additional details and impacted files

@@            Coverage Diff            @@
##              main      #685   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           67        67           
  Lines         4816      4787   -29     
=========================================
- Hits          4816      4787   -29

☔ View full report in Codecov by Sentry.

## [0.22.0](v0.21.0...v0.22.0) (2024-05-01)

### Features

* `is_fitted` is now always a property ([#662](#662)) ([b1db881](b1db881)), closes [#586](#586)
* add `Column.missing_value_count` ([#682](#682)) ([f084916](f084916)), closes [#642](#642)
* Add `InputConversion` & `OutputConversion` for nn interface ([#625](#625)) ([fd723f7](fd723f7)), closes [#621](#621)
* Add hash,eq and sizeof in ForwardLayer ([#634](#634)) ([72f7fde](72f7fde)), closes [#633](#633)
* allow using tables that already contain target for prediction ([#687](#687)) ([e9f1cfb](e9f1cfb)), closes [#636](#636)
* callback `Row.sort_columns` takes four parameters instead of two tuples ([#683](#683)) ([9c3e3de](9c3e3de)), closes [#584](#584)
* rename `group_rows_by` in `Table` to `group_rows` ([#661](#661)) ([c1644b7](c1644b7)), closes [#611](#611)
* rename `number_of_column` in `Row` to `number_of_columns` ([#660](#660)) ([0a08296](0a08296)), closes [#646](#646)
* rework `TaggedTable` ([#680](#680)) ([db2b613](db2b613)), closes [#647](#647)
* show missing value count/ratio in summarized statistics ([#684](#684)) ([74b8a35](74b8a35)), closes [#619](#619)
* specify `extras` instead of `features` in `to_tabular_dataset` ([#685](#685)) ([841657f](841657f)), closes [#623](#623)

### Bug Fixes

* actually use `kernel` of support vector machines for training ([#681](#681)) ([09c5082](09c5082)), closes [#602](#602)

### Performance Improvements

* Faster plot_histograms and more reliable plots ([#659](#659)) ([b5f0a12](b5f0a12))

lars-reimann · 2024-05-01T19:42:34Z

🎉 This PR is included in version 0.22.0 🎉

The release is available on:

v0.22.0
GitHub release

Your semantic-release bot 📦🚀

lars-reimann added 7 commits May 1, 2024 18:52

feat: property extras to get columns of dataset that are neither feature nor target

82adb20

feat: specify extras instead of features when creating a `TabularDataset`

2e1e372

test: adjust tests to API changes

d91a625

refactor: remove TabularDataset._from_table

c8283ad

Use the constructor instead

test: fix classical ML tests

5c2b7f6

fix: sklearn utils

65885c8

refactor: remove TimeSeries._from_tabular_dataset

d862d3d

It was an internal method that was only used for tests. Moreover, this conversion makes little sense. We should instead be able to go back to a time series from a time series dataset.

lars-reimann linked an issue May 1, 2024 that may be closed by this pull request

In tagColumns specify hidden instead of features #623

Closed

megalinter-bot and others added 4 commits May 1, 2024 18:16

style: apply automated linter fixes

a548870

style: apply automated linter fixes

cdc0c92

docs: fix wrong example

198d7a2

docs: get Notebooks running again

836bd5e

lars-reimann marked this pull request as ready for review May 1, 2024 18:45

lars-reimann requested a review from a team as a code owner May 1, 2024 18:45

lars-reimann merged commit 841657f into main May 1, 2024
8 checks passed

lars-reimann deleted the 623-in-tagcolumns-specify-hidden-instead-of-features branch May 1, 2024 18:45

lars-reimann added the released Included in a release label May 1, 2024
