perf: improved performance of `TabularDataset.__eq__` by a factor of up to 2 #697
…up to 2

perf: slightly improved performance of `TabularDataset.__hash__`

fix: corrected `TabularDataset.__sizeof__`
🦙 MegaLinter status: ✅ SUCCESS
See detailed report in MegaLinter reports
@lars-reimann do you really want to save the exact same data twice in a |
Codecov Report

All modified and coverable lines are covered by tests ✅

Additional details and impacted files

@@ Coverage Diff @@
## main #697 +/- ##
=========================================
Coverage 100.00% 100.00%
=========================================
Files 66 66
Lines 4873 4873
=========================================
Hits 4873 4873

☔ View full report in Codecov by Sentry.
Due to We should add some performance benchmarks, though, to check runtime and memory use of the current implementation (not needed for this PR).
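A minimal sketch of what such a benchmark could look like, using only the standard library. The helper name `measure` is hypothetical, and two plain lists stand in for the datasets being compared; nothing here is taken from the project's code.

```python
import timeit
import tracemalloc


def measure(func, *, number=100):
    """Return (best seconds per call, peak bytes allocated during one call).

    Hypothetical helper for illustration; not part of the library.
    """
    # Take the best of three timing runs to reduce noise from other processes.
    best = min(timeit.repeat(func, number=number, repeat=3)) / number

    # Trace Python-level allocations made during a single call.
    tracemalloc.start()
    try:
        func()
        _, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return best, peak


# Stand-in data: two equal lists instead of real datasets (assumption).
left = list(range(10_000))
right = list(range(10_000))

seconds, peak_bytes = measure(lambda: left == right)
print(f"{seconds:.2e} s per call, peak {peak_bytes} B")
```

Note that `tracemalloc` only sees allocations made through Python's allocator, so memory held by native extensions may not show up in the peak value.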
Regarding column order: Should that matter for equality? If so, we should compare the tables and the names of target, features, and extras. |
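To make the column-order question concrete, here is a small sketch of an order-sensitive equality check. `ToyDataset` and all of its attributes are hypothetical stand-ins, not the actual `TabularDataset` implementation; the point is only that comparing the names of target, features, and extras as ordered sequences, plus the columns in order, makes a reordered dataset compare unequal.

```python
class ToyDataset:
    """Hypothetical stand-in for a tabular dataset; not the library's class."""

    def __init__(self, columns, target_name, feature_names, extra_names=()):
        self.columns = dict(columns)            # column name -> tuple of values
        self.target_name = target_name
        self.feature_names = tuple(feature_names)
        self.extra_names = tuple(extra_names)

    def __eq__(self, other):
        if not isinstance(other, ToyDataset):
            return NotImplemented
        if self is other:
            return True
        return (
            # Cheap checks first: role names, compared as ordered tuples.
            self.target_name == other.target_name
            and self.feature_names == other.feature_names
            and self.extra_names == other.extra_names
            # Plain dict equality ignores insertion order, so compare
            # the (name, values) pairs as ordered lists instead.
            and list(self.columns.items()) == list(other.columns.items())
        )


a = ToyDataset({"x": (1, 2), "y": (0, 1)}, "y", ["x"])
same = ToyDataset({"x": (1, 2), "y": (0, 1)}, "y", ["x"])
reordered = ToyDataset({"y": (0, 1), "x": (1, 2)}, "y", ["x"])

print(a == same)       # same columns in the same order
print(a == reordered)  # same columns, different order
```

If column order should not matter, the last comparison could fall back to plain dict equality, at which point `a == reordered` would hold.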
In the |
…mprove-tabular-dataset
…up to 2 (#697)

### Summary of Changes

perf: improved performance of `TabularDataset.__eq__` by a factor of up to 2

perf: slightly improved performance of `TabularDataset.__hash__`

fix: corrected `TabularDataset.__sizeof__`

---------

Co-authored-by: megalinter-bot <[email protected]>
## [0.24.0](v0.23.0...v0.24.0) (2024-05-09)

### Features

* `Column.plot_histogram()` using `Table.plot_histograms` for consistent results ([#726](#726)) ([576492c](576492c))
* `Regressor.summarize_metrics` and `Classifier.summarize_metrics` ([#729](#729)) ([1cc14b1](1cc14b1)), closes [#713](#713)
* `Table.keep_only_rows` ([#721](#721)) ([923a6c2](923a6c2))
* `Table.remove_rows` ([#720](#720)) ([a1cdaef](a1cdaef)), closes [#698](#698)
* Add `ImageDataset` and Layer for ConvolutionalNeuralNetworks ([#645](#645)) ([5b6d219](5b6d219)), closes [#579](#579) [#580](#580) [#581](#581)
* added load_percentage parameter to ImageList.from_files to load a subset of the given files ([#739](#739)) ([0564b52](0564b52)), closes [#736](#736)
* added rnn layer and TimeSeries conversion ([#615](#615)) ([6cad203](6cad203)), closes [#614](#614) [#648](#648) [#656](#656) [#601](#601)
* Basic implementation of cell with polars ([#734](#734)) ([004630b](004630b)), closes [#712](#712)
* deprecate `Table.add_column` and `Table.add_row` ([#723](#723)) ([5dd9d02](5dd9d02)), closes [#722](#722)
* deprecated `Table.from_excel_file` and `Table.to_excel_file` ([#728](#728)) ([c89e0bf](c89e0bf)), closes [#727](#727)
* Larger histogram plot if table only has one column ([#716](#716)) ([31ffd12](31ffd12))
* polars implementation of a column ([#738](#738)) ([732aa48](732aa48)), closes [#712](#712)
* polars implementation of a row ([#733](#733)) ([ff627f6](ff627f6)), closes [#712](#712)
* polars implementation of table ([#744](#744)) ([fc49895](fc49895)), closes [#638](#638) [#641](#641) [#649](#649) [#712](#712)
* regularization for decision trees and random forests ([#730](#730)) ([102de2d](102de2d)), closes [#700](#700)
* Remove device information in image class ([#735](#735)) ([d783caa](d783caa)), closes [#524](#524)
* return fitted transformer and transformed table from `fit_and_transform` ([#724](#724)) ([2960d35](2960d35)), closes [#613](#613)

### Bug Fixes

* make `Image.clone` internal ([#725](#725)) ([215a472](215a472)), closes [#626](#626)

### Performance Improvements

* improved performance of `TabularDataset.__eq__` by a factor of up to 2 ([#697](#697)) ([cd7f55b](cd7f55b))
🎉 This PR is included in version 0.24.0 🎉

The release is available on:

Your semantic-release bot 📦🚀
### Summary of Changes

perf: improved performance of `TabularDataset.__eq__` by a factor of up to 2

perf: slightly improved performance of `TabularDataset.__hash__`

fix: corrected `TabularDataset.__sizeof__`