diff --git a/CHANGELOG.md b/CHANGELOG.md index 6d4bdfb8d98..dda2e02f593 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -3,8 +3,244 @@ Please see https://github.com/rapidsai/cudf/releases/tag/v22.04.00a for the latest changes to this development branch. # cuDF 22.02.00 (Date TBD) +# cuDF 22.02.00 (2 Feb 2022) + +## 🚨 Breaking Changes + +- ORC writer API changes for granular statistics ([#10058](https://github.com/rapidsai/cudf/pull/10058)) [@mythrocks](https://github.com/mythrocks) +- `decimal128` Support for `to/from_arrow` ([#9986](https://github.com/rapidsai/cudf/pull/9986)) [@codereport](https://github.com/codereport) +- Remove deprecated method `one_hot_encoding` ([#9977](https://github.com/rapidsai/cudf/pull/9977)) [@isVoid](https://github.com/isVoid) +- Remove str.subword_tokenize ([#9968](https://github.com/rapidsai/cudf/pull/9968)) [@VibhuJawa](https://github.com/VibhuJawa) +- Remove deprecated `method` parameter from `merge` and `join`. ([#9944](https://github.com/rapidsai/cudf/pull/9944)) [@bdice](https://github.com/bdice) +- Remove deprecated method DataFrame.hash_columns. ([#9943](https://github.com/rapidsai/cudf/pull/9943)) [@bdice](https://github.com/bdice) +- Remove deprecated method Series.hash_encode. 
([#9942](https://github.com/rapidsai/cudf/pull/9942)) [@bdice](https://github.com/bdice) +- Refactoring ceil/round/floor code for datetime64 types ([#9926](https://github.com/rapidsai/cudf/pull/9926)) [@mayankanand007](https://github.com/mayankanand007) +- Introduce `nan_as_null` parameter for `cudf.Index` ([#9893](https://github.com/rapidsai/cudf/pull/9893)) [@galipremsagar](https://github.com/galipremsagar) +- Add regex_flags parameter to strings replace_re functions ([#9878](https://github.com/rapidsai/cudf/pull/9878)) [@davidwendt](https://github.com/davidwendt) +- Break tie for `top` categorical columns in `Series.describe` ([#9867](https://github.com/rapidsai/cudf/pull/9867)) [@isVoid](https://github.com/isVoid) +- Add partitioning support in parquet writer ([#9810](https://github.com/rapidsai/cudf/pull/9810)) [@devavret](https://github.com/devavret) +- Move `drop_duplicates`, `drop_na`, `_gather`, `take` to IndexFrame and create their `_base_index` counterparts ([#9807](https://github.com/rapidsai/cudf/pull/9807)) [@isVoid](https://github.com/isVoid) +- Raise temporary error for `decimal128` types in parquet reader ([#9804](https://github.com/rapidsai/cudf/pull/9804)) [@galipremsagar](https://github.com/galipremsagar) +- Change default `dtype` of all nulls column from `float` to `object` ([#9803](https://github.com/rapidsai/cudf/pull/9803)) [@galipremsagar](https://github.com/galipremsagar) +- Remove unused masked udf cython/c++ code ([#9792](https://github.com/rapidsai/cudf/pull/9792)) [@brandon-b-miller](https://github.com/brandon-b-miller) +- Pick smallest decimal type with required precision in ORC reader ([#9775](https://github.com/rapidsai/cudf/pull/9775)) [@vuule](https://github.com/vuule) +- Add decimal128 support to Parquet reader and writer ([#9765](https://github.com/rapidsai/cudf/pull/9765)) [@vuule](https://github.com/vuule) +- Refactor TableTest assertion methods to a separate utility class ([#9762](https://github.com/rapidsai/cudf/pull/9762)) [@jlowe](https://github.com/jlowe) +- Use cuFile direct device reads/writes 
by default in cuIO ([#9722](https://github.com/rapidsai/cudf/pull/9722)) [@vuule](https://github.com/vuule) +- Match pandas scalar result types in reductions ([#9717](https://github.com/rapidsai/cudf/pull/9717)) [@brandon-b-miller](https://github.com/brandon-b-miller) +- Add parameters to control row group size in Parquet writer ([#9677](https://github.com/rapidsai/cudf/pull/9677)) [@vuule](https://github.com/vuule) +- Refactor bit counting APIs, introduce valid/null count functions, and split host/device side code for segmented counts. ([#9588](https://github.com/rapidsai/cudf/pull/9588)) [@bdice](https://github.com/bdice) +- Add support for `decimal128` in cudf python ([#9533](https://github.com/rapidsai/cudf/pull/9533)) [@galipremsagar](https://github.com/galipremsagar) +- Implement `lists::index_of()` to find positions in list rows ([#9510](https://github.com/rapidsai/cudf/pull/9510)) [@mythrocks](https://github.com/mythrocks) +- Rewriting row/column conversions for Spark <-> cudf data conversions ([#8444](https://github.com/rapidsai/cudf/pull/8444)) [@hyperbolic2346](https://github.com/hyperbolic2346) -Please see https://github.com/rapidsai/cudf/releases/tag/v22.02.00a for the latest changes to this development branch. 
+## 🐛 Bug Fixes + +- Add check for negative stripe index in ORC reader ([#10074](https://github.com/rapidsai/cudf/pull/10074)) [@vuule](https://github.com/vuule) +- Update Java tests to expect DECIMAL128 from Arrow ([#10073](https://github.com/rapidsai/cudf/pull/10073)) [@jlowe](https://github.com/jlowe) +- Avoid index materialization when `DataFrame` is created with un-named `Series` objects ([#10071](https://github.com/rapidsai/cudf/pull/10071)) [@galipremsagar](https://github.com/galipremsagar) +- fix gcc 11 compilation errors ([#10067](https://github.com/rapidsai/cudf/pull/10067)) [@rongou](https://github.com/rongou) +- Fix `columns` ordering issue in parquet reader ([#10066](https://github.com/rapidsai/cudf/pull/10066)) [@galipremsagar](https://github.com/galipremsagar) +- Fix dataframe setitem with `ndarray` types ([#10056](https://github.com/rapidsai/cudf/pull/10056)) [@galipremsagar](https://github.com/galipremsagar) +- Remove implicit copy due to conversion from cudf::size_type and size_t ([#10045](https://github.com/rapidsai/cudf/pull/10045)) [@robertmaynard](https://github.com/robertmaynard) +- Include <optional> in headers that use std::optional ([#10044](https://github.com/rapidsai/cudf/pull/10044)) [@robertmaynard](https://github.com/robertmaynard) +- Fix repr and concat of `StructColumn` ([#10042](https://github.com/rapidsai/cudf/pull/10042)) [@galipremsagar](https://github.com/galipremsagar) +- Include row group level stats when writing ORC files ([#10041](https://github.com/rapidsai/cudf/pull/10041)) [@vuule](https://github.com/vuule) +- build.sh respects the `--build_metrics` and `--incl_cache_stats` flags ([#10035](https://github.com/rapidsai/cudf/pull/10035)) [@robertmaynard](https://github.com/robertmaynard) +- Fix memory leaks in JNI native code. 
([#10029](https://github.com/rapidsai/cudf/pull/10029)) [@mythrocks](https://github.com/mythrocks) +- Update JNI to use new arena mr constructor ([#10027](https://github.com/rapidsai/cudf/pull/10027)) [@rongou](https://github.com/rongou) +- Fix null check when comparing structs in `arg_min` operation of reduction/groupby ([#10026](https://github.com/rapidsai/cudf/pull/10026)) [@ttnghia](https://github.com/ttnghia) +- Wrap CI script shell variables in quotes to fix local testing. ([#10018](https://github.com/rapidsai/cudf/pull/10018)) [@bdice](https://github.com/bdice) +- cudftestutil no longer propagates compile flags to external users ([#10017](https://github.com/rapidsai/cudf/pull/10017)) [@robertmaynard](https://github.com/robertmaynard) +- Remove `CUDA_DEVICE_CALLABLE` macro usage ([#10015](https://github.com/rapidsai/cudf/pull/10015)) [@hyperbolic2346](https://github.com/hyperbolic2346) +- Add missing list filling header in meta.yaml ([#10007](https://github.com/rapidsai/cudf/pull/10007)) [@devavret](https://github.com/devavret) +- Fix `conda` recipes for `custreamz` & `cudf_kafka` ([#10003](https://github.com/rapidsai/cudf/pull/10003)) [@ajschmidt8](https://github.com/ajschmidt8) +- Fix matching regex word-boundary (\b) in strings replace ([#9997](https://github.com/rapidsai/cudf/pull/9997)) [@davidwendt](https://github.com/davidwendt) +- Fix null check when comparing structs in `min` and `max` reduction/groupby operations ([#9994](https://github.com/rapidsai/cudf/pull/9994)) [@ttnghia](https://github.com/ttnghia) +- Fix octal pattern matching in regex string ([#9993](https://github.com/rapidsai/cudf/pull/9993)) [@davidwendt](https://github.com/davidwendt) +- `decimal128` Support for `to/from_arrow` ([#9986](https://github.com/rapidsai/cudf/pull/9986)) [@codereport](https://github.com/codereport) +- Fix groupby shift/diff/fill after selecting from a `GroupBy` ([#9984](https://github.com/rapidsai/cudf/pull/9984)) [@shwina](https://github.com/shwina) +- Fix the overflow problem of decimal rescale 
([#9966](https://github.com/rapidsai/cudf/pull/9966)) [@sperlingxx](https://github.com/sperlingxx) +- Use default value for decimal precision in parquet writer when not specified ([#9963](https://github.com/rapidsai/cudf/pull/9963)) [@devavret](https://github.com/devavret) +- Fix cudf java build error. ([#9958](https://github.com/rapidsai/cudf/pull/9958)) [@firestarman](https://github.com/firestarman) +- Use gpuci_mamba_retry to install local artifacts. ([#9951](https://github.com/rapidsai/cudf/pull/9951)) [@bdice](https://github.com/bdice) +- Fix regression HostColumnVectorCore requiring native libs ([#9948](https://github.com/rapidsai/cudf/pull/9948)) [@jlowe](https://github.com/jlowe) +- Rename aggregate_metadata in writer to fix name collision ([#9938](https://github.com/rapidsai/cudf/pull/9938)) [@devavret](https://github.com/devavret) +- Fixed issue with percentile_approx where output tdigests could have uninitialized data at the end. ([#9931](https://github.com/rapidsai/cudf/pull/9931)) [@nvdbaranec](https://github.com/nvdbaranec) +- Resolve racecheck errors in ORC kernels ([#9916](https://github.com/rapidsai/cudf/pull/9916)) [@vuule](https://github.com/vuule) +- Fix the java build after parquet partitioning support ([#9908](https://github.com/rapidsai/cudf/pull/9908)) [@revans2](https://github.com/revans2) +- Fix compilation of benchmark for parquet writer. 
([#9905](https://github.com/rapidsai/cudf/pull/9905)) [@bdice](https://github.com/bdice) +- Fix a memcheck error in ORC writer ([#9896](https://github.com/rapidsai/cudf/pull/9896)) [@vuule](https://github.com/vuule) +- Introduce `nan_as_null` parameter for `cudf.Index` ([#9893](https://github.com/rapidsai/cudf/pull/9893)) [@galipremsagar](https://github.com/galipremsagar) +- Fix fallback to sort aggregation for grouping only hash aggregate ([#9891](https://github.com/rapidsai/cudf/pull/9891)) [@abellina](https://github.com/abellina) +- Add zlib to cudfjni link when using static libcudf library dependency ([#9890](https://github.com/rapidsai/cudf/pull/9890)) [@jlowe](https://github.com/jlowe) +- TimedeltaIndex constructor raises an AttributeError. ([#9884](https://github.com/rapidsai/cudf/pull/9884)) [@skirui-source](https://github.com/skirui-source) +- Fix cudf.Scalar string datetime construction ([#9875](https://github.com/rapidsai/cudf/pull/9875)) [@brandon-b-miller](https://github.com/brandon-b-miller) +- Load libcufile.so with RTLD_NODELETE flag ([#9872](https://github.com/rapidsai/cudf/pull/9872)) [@vuule](https://github.com/vuule) +- Break tie for `top` categorical columns in `Series.describe` ([#9867](https://github.com/rapidsai/cudf/pull/9867)) [@isVoid](https://github.com/isVoid) +- Fix null handling for structs `min` and `arg_min` in groupby, groupby scan, reduction, and inclusive_scan ([#9864](https://github.com/rapidsai/cudf/pull/9864)) [@ttnghia](https://github.com/ttnghia) +- Add one-level list encoding support in parquet reader ([#9848](https://github.com/rapidsai/cudf/pull/9848)) [@PointKernel](https://github.com/PointKernel) +- Fix an out-of-bounds read in validity copying in contiguous_split. ([#9842](https://github.com/rapidsai/cudf/pull/9842)) [@nvdbaranec](https://github.com/nvdbaranec) +- Fix join of MultiIndex to Index with one column and overlapping name. 
([#9830](https://github.com/rapidsai/cudf/pull/9830)) [@vyasr](https://github.com/vyasr) +- Fix caching in `Series.applymap` ([#9821](https://github.com/rapidsai/cudf/pull/9821)) [@brandon-b-miller](https://github.com/brandon-b-miller) +- Enforce boolean `ascending` for dask-cudf `sort_values` ([#9814](https://github.com/rapidsai/cudf/pull/9814)) [@charlesbluca](https://github.com/charlesbluca) +- Fix ORC writer crash with empty input columns ([#9808](https://github.com/rapidsai/cudf/pull/9808)) [@vuule](https://github.com/vuule) +- Change default `dtype` of all nulls column from `float` to `object` ([#9803](https://github.com/rapidsai/cudf/pull/9803)) [@galipremsagar](https://github.com/galipremsagar) +- Load native dependencies when Java ColumnView is loaded ([#9800](https://github.com/rapidsai/cudf/pull/9800)) [@jlowe](https://github.com/jlowe) +- Fix dtype-argument bug in dask_cudf read_csv ([#9796](https://github.com/rapidsai/cudf/pull/9796)) [@rjzamora](https://github.com/rjzamora) +- Fix overflow for min calculation in strings::from_timestamps ([#9793](https://github.com/rapidsai/cudf/pull/9793)) [@revans2](https://github.com/revans2) +- Fix memory error due to lambda return type deduction limitation ([#9778](https://github.com/rapidsai/cudf/pull/9778)) [@karthikeyann](https://github.com/karthikeyann) +- Revert regex $/EOL end-of-string new-line special case handling ([#9774](https://github.com/rapidsai/cudf/pull/9774)) [@davidwendt](https://github.com/davidwendt) +- Fix missing streams ([#9767](https://github.com/rapidsai/cudf/pull/9767)) [@karthikeyann](https://github.com/karthikeyann) +- Fix make_empty_scalar_like on list_type ([#9759](https://github.com/rapidsai/cudf/pull/9759)) [@sperlingxx](https://github.com/sperlingxx) +- Update cmake and conda to 22.02 ([#9746](https://github.com/rapidsai/cudf/pull/9746)) [@devavret](https://github.com/devavret) +- Fix out-of-bounds memory write in decimal128-to-string conversion ([#9740](https://github.com/rapidsai/cudf/pull/9740)) 
[@davidwendt](https://github.com/davidwendt) +- Match pandas scalar result types in reductions ([#9717](https://github.com/rapidsai/cudf/pull/9717)) [@brandon-b-miller](https://github.com/brandon-b-miller) +- Fix regex non-multiline EOL/$ matching strings ending with a new-line ([#9715](https://github.com/rapidsai/cudf/pull/9715)) [@davidwendt](https://github.com/davidwendt) +- Fixed build by adding more checks for int8, int16 ([#9707](https://github.com/rapidsai/cudf/pull/9707)) [@razajafri](https://github.com/razajafri) +- Fix `null` handling when `boolean` dtype is passed ([#9691](https://github.com/rapidsai/cudf/pull/9691)) [@galipremsagar](https://github.com/galipremsagar) +- Fix stream usage in `segmented_gather()` ([#9679](https://github.com/rapidsai/cudf/pull/9679)) [@mythrocks](https://github.com/mythrocks) + +## 📖 Documentation + +- Update `decimal` dtypes related docs entries ([#10072](https://github.com/rapidsai/cudf/pull/10072)) [@galipremsagar](https://github.com/galipremsagar) +- Fix regex doc describing hexadecimal escape characters ([#10009](https://github.com/rapidsai/cudf/pull/10009)) [@davidwendt](https://github.com/davidwendt) +- Fix cudf compilation instructions. 
([#9956](https://github.com/rapidsai/cudf/pull/9956)) [@esoha-nvidia](https://github.com/esoha-nvidia) +- Fix see also links for IO APIs ([#9895](https://github.com/rapidsai/cudf/pull/9895)) [@galipremsagar](https://github.com/galipremsagar) +- Fix build instructions for libcudf doxygen ([#9837](https://github.com/rapidsai/cudf/pull/9837)) [@davidwendt](https://github.com/davidwendt) +- Fix some doxygen warnings and add missing documentation ([#9770](https://github.com/rapidsai/cudf/pull/9770)) [@karthikeyann](https://github.com/karthikeyann) +- update cuda version in local build ([#9736](https://github.com/rapidsai/cudf/pull/9736)) [@karthikeyann](https://github.com/karthikeyann) +- Fix doxygen for enum types in libcudf ([#9724](https://github.com/rapidsai/cudf/pull/9724)) [@davidwendt](https://github.com/davidwendt) +- Spell check fixes ([#9682](https://github.com/rapidsai/cudf/pull/9682)) [@karthikeyann](https://github.com/karthikeyann) +- Fix links in C++ Developer Guide. ([#9675](https://github.com/rapidsai/cudf/pull/9675)) [@bdice](https://github.com/bdice) + +## 🚀 New Features + +- Remove libcudacxx patch needed for nvcc 11.4 ([#10057](https://github.com/rapidsai/cudf/pull/10057)) [@robertmaynard](https://github.com/robertmaynard) +- Allow CuPy 10 ([#10048](https://github.com/rapidsai/cudf/pull/10048)) [@jakirkham](https://github.com/jakirkham) +- Add in support for NULL_LOGICAL_AND and NULL_LOGICAL_OR binops ([#10016](https://github.com/rapidsai/cudf/pull/10016)) [@revans2](https://github.com/revans2) +- Add `groupby.transform` (only support for aggregations) ([#10005](https://github.com/rapidsai/cudf/pull/10005)) [@shwina](https://github.com/shwina) +- Add partitioning support to Parquet chunked writer ([#10000](https://github.com/rapidsai/cudf/pull/10000)) [@devavret](https://github.com/devavret) +- Add jni for sequences ([#9972](https://github.com/rapidsai/cudf/pull/9972)) [@wbo4958](https://github.com/wbo4958) +- Java bindings for mixed left, inner, and full joins 
([#9941](https://github.com/rapidsai/cudf/pull/9941)) [@jlowe](https://github.com/jlowe) +- Java bindings for JSON reader support ([#9940](https://github.com/rapidsai/cudf/pull/9940)) [@wbo4958](https://github.com/wbo4958) +- Enable transpose for string columns in cudf python ([#9937](https://github.com/rapidsai/cudf/pull/9937)) [@galipremsagar](https://github.com/galipremsagar) +- Support structs for `cudf::contains` with column/scalar input ([#9929](https://github.com/rapidsai/cudf/pull/9929)) [@ttnghia](https://github.com/ttnghia) +- Implement mixed equality/conditional joins ([#9917](https://github.com/rapidsai/cudf/pull/9917)) [@vyasr](https://github.com/vyasr) +- Add cudf::strings::extract_all API ([#9909](https://github.com/rapidsai/cudf/pull/9909)) [@davidwendt](https://github.com/davidwendt) +- Implement JNI for `cudf::scatter` APIs ([#9903](https://github.com/rapidsai/cudf/pull/9903)) [@ttnghia](https://github.com/ttnghia) +- JNI: Function to copy and set validity from bool column. ([#9901](https://github.com/rapidsai/cudf/pull/9901)) [@mythrocks](https://github.com/mythrocks) +- Add dictionary support to cudf::copy_if_else ([#9887](https://github.com/rapidsai/cudf/pull/9887)) [@davidwendt](https://github.com/davidwendt) +- add run_benchmarks target for running benchmarks with json output ([#9879](https://github.com/rapidsai/cudf/pull/9879)) [@karthikeyann](https://github.com/karthikeyann) +- Add regex_flags parameter to strings replace_re functions ([#9878](https://github.com/rapidsai/cudf/pull/9878)) [@davidwendt](https://github.com/davidwendt) +- Add_suffix and add_prefix for DataFrames and Series ([#9846](https://github.com/rapidsai/cudf/pull/9846)) [@mayankanand007](https://github.com/mayankanand007) +- Add JNI for `cudf::drop_duplicates` ([#9841](https://github.com/rapidsai/cudf/pull/9841)) [@ttnghia](https://github.com/ttnghia) +- Implement per-list sequence ([#9839](https://github.com/rapidsai/cudf/pull/9839)) [@ttnghia](https://github.com/ttnghia) +- adding `series.transpose` 
([#9835](https://github.com/rapidsai/cudf/pull/9835)) [@mayankanand007](https://github.com/mayankanand007) +- Adding support for `Series.autocorr` ([#9833](https://github.com/rapidsai/cudf/pull/9833)) [@mayankanand007](https://github.com/mayankanand007) +- Support round operation on datetime64 datatypes ([#9820](https://github.com/rapidsai/cudf/pull/9820)) [@mayankanand007](https://github.com/mayankanand007) +- Add partitioning support in parquet writer ([#9810](https://github.com/rapidsai/cudf/pull/9810)) [@devavret](https://github.com/devavret) +- Raise temporary error for `decimal128` types in parquet reader ([#9804](https://github.com/rapidsai/cudf/pull/9804)) [@galipremsagar](https://github.com/galipremsagar) +- Add decimal128 support to Parquet reader and writer ([#9765](https://github.com/rapidsai/cudf/pull/9765)) [@vuule](https://github.com/vuule) +- Optimize `groupby::scan` ([#9754](https://github.com/rapidsai/cudf/pull/9754)) [@PointKernel](https://github.com/PointKernel) +- Add sample JNI API ([#9728](https://github.com/rapidsai/cudf/pull/9728)) [@res-life](https://github.com/res-life) +- Support `min` and `max` in inclusive scan for structs ([#9725](https://github.com/rapidsai/cudf/pull/9725)) [@ttnghia](https://github.com/ttnghia) +- Add `first` and `last` method to `IndexedFrame` ([#9710](https://github.com/rapidsai/cudf/pull/9710)) [@isVoid](https://github.com/isVoid) +- Support `min` and `max` reduction for structs ([#9697](https://github.com/rapidsai/cudf/pull/9697)) [@ttnghia](https://github.com/ttnghia) +- Add parameters to control row group size in Parquet writer ([#9677](https://github.com/rapidsai/cudf/pull/9677)) [@vuule](https://github.com/vuule) +- Run compute-sanitizer in nightly build ([#9641](https://github.com/rapidsai/cudf/pull/9641)) [@karthikeyann](https://github.com/karthikeyann) +- Implement Series.datetime.floor ([#9571](https://github.com/rapidsai/cudf/pull/9571)) [@skirui-source](https://github.com/skirui-source) +- ceil/floor for `DatetimeIndex` 
([#9554](https://github.com/rapidsai/cudf/pull/9554)) [@mayankanand007](https://github.com/mayankanand007) +- Add support for `decimal128` in cudf python ([#9533](https://github.com/rapidsai/cudf/pull/9533)) [@galipremsagar](https://github.com/galipremsagar) +- Implement `lists::index_of()` to find positions in list rows ([#9510](https://github.com/rapidsai/cudf/pull/9510)) [@mythrocks](https://github.com/mythrocks) +- custreamz oauth callback for kafka (librdkafka) ([#9486](https://github.com/rapidsai/cudf/pull/9486)) [@jdye64](https://github.com/jdye64) +- Add Pearson correlation for sort groupby (python) ([#9166](https://github.com/rapidsai/cudf/pull/9166)) [@skirui-source](https://github.com/skirui-source) +- Interchange dataframe protocol ([#9071](https://github.com/rapidsai/cudf/pull/9071)) [@iskode](https://github.com/iskode) +- Rewriting row/column conversions for Spark <-> cudf data conversions ([#8444](https://github.com/rapidsai/cudf/pull/8444)) [@hyperbolic2346](https://github.com/hyperbolic2346) + +## 🛠️ Improvements + +- Prepare upload scripts for Python 3.7 removal ([#10092](https://github.com/rapidsai/cudf/pull/10092)) [@Ethyling](https://github.com/Ethyling) +- Simplify custreamz and cudf_kafka recipes files ([#10065](https://github.com/rapidsai/cudf/pull/10065)) [@Ethyling](https://github.com/Ethyling) +- ORC writer API changes for granular statistics ([#10058](https://github.com/rapidsai/cudf/pull/10058)) [@mythrocks](https://github.com/mythrocks) +- Remove python constraints in custreamz and cudf_kafka recipes ([#10052](https://github.com/rapidsai/cudf/pull/10052)) [@Ethyling](https://github.com/Ethyling) +- Unpin `dask` and `distributed` in CI ([#10028](https://github.com/rapidsai/cudf/pull/10028)) [@galipremsagar](https://github.com/galipremsagar) +- Add `_from_column_like_self` factory ([#10022](https://github.com/rapidsai/cudf/pull/10022)) [@isVoid](https://github.com/isVoid) +- Replace custom CUDA bindings previously provided by RMM with official CUDA Python bindings 
([#10008](https://github.com/rapidsai/cudf/pull/10008)) [@shwina](https://github.com/shwina) +- Use `cuda::std::is_arithmetic` in `cudf::is_numeric` trait. ([#9996](https://github.com/rapidsai/cudf/pull/9996)) [@bdice](https://github.com/bdice) +- Clean up CUDA stream use in cuIO ([#9991](https://github.com/rapidsai/cudf/pull/9991)) [@vuule](https://github.com/vuule) +- Use addressed-ordered first fit for the pinned memory pool ([#9989](https://github.com/rapidsai/cudf/pull/9989)) [@rongou](https://github.com/rongou) +- Add strings tests to transpose_test.cpp ([#9985](https://github.com/rapidsai/cudf/pull/9985)) [@davidwendt](https://github.com/davidwendt) +- Use gpuci_mamba_retry on Java CI. ([#9983](https://github.com/rapidsai/cudf/pull/9983)) [@bdice](https://github.com/bdice) +- Remove deprecated method `one_hot_encoding` ([#9977](https://github.com/rapidsai/cudf/pull/9977)) [@isVoid](https://github.com/isVoid) +- Minor cleanup of unused Python functions ([#9974](https://github.com/rapidsai/cudf/pull/9974)) [@vyasr](https://github.com/vyasr) +- Use new efficient partitioned parquet writing in cuDF ([#9971](https://github.com/rapidsai/cudf/pull/9971)) [@devavret](https://github.com/devavret) +- Remove str.subword_tokenize ([#9968](https://github.com/rapidsai/cudf/pull/9968)) [@VibhuJawa](https://github.com/VibhuJawa) +- Forward-merge branch-21.12 to branch-22.02 ([#9947](https://github.com/rapidsai/cudf/pull/9947)) [@bdice](https://github.com/bdice) +- Remove deprecated `method` parameter from `merge` and `join`. ([#9944](https://github.com/rapidsai/cudf/pull/9944)) [@bdice](https://github.com/bdice) +- Remove deprecated method DataFrame.hash_columns. ([#9943](https://github.com/rapidsai/cudf/pull/9943)) [@bdice](https://github.com/bdice) +- Remove deprecated method Series.hash_encode. 
([#9942](https://github.com/rapidsai/cudf/pull/9942)) [@bdice](https://github.com/bdice) +- use ninja in java ci build ([#9933](https://github.com/rapidsai/cudf/pull/9933)) [@rongou](https://github.com/rongou) +- Add build-time publish step to cpu build script ([#9927](https://github.com/rapidsai/cudf/pull/9927)) [@davidwendt](https://github.com/davidwendt) +- Refactoring ceil/round/floor code for datetime64 types ([#9926](https://github.com/rapidsai/cudf/pull/9926)) [@mayankanand007](https://github.com/mayankanand007) +- Remove various unused functions ([#9922](https://github.com/rapidsai/cudf/pull/9922)) [@vyasr](https://github.com/vyasr) +- Raise in `query` if dtype is not supported ([#9921](https://github.com/rapidsai/cudf/pull/9921)) [@brandon-b-miller](https://github.com/brandon-b-miller) +- Add missing imports tests ([#9920](https://github.com/rapidsai/cudf/pull/9920)) [@Ethyling](https://github.com/Ethyling) +- Spark Decimal128 hashing ([#9919](https://github.com/rapidsai/cudf/pull/9919)) [@rwlee](https://github.com/rwlee) +- Replace `thrust/std::get` with structured bindings ([#9915](https://github.com/rapidsai/cudf/pull/9915)) [@codereport](https://github.com/codereport) +- Upgrade thrust version to 1.15 ([#9912](https://github.com/rapidsai/cudf/pull/9912)) [@robertmaynard](https://github.com/robertmaynard) +- Remove conda envs for CUDA 11.0 and 11.2. ([#9910](https://github.com/rapidsai/cudf/pull/9910)) [@bdice](https://github.com/bdice) +- Return count of set bits from inplace_bitmask_and. 
([#9904](https://github.com/rapidsai/cudf/pull/9904)) [@bdice](https://github.com/bdice) +- Use dynamic nullate for join hasher and equality comparator ([#9902](https://github.com/rapidsai/cudf/pull/9902)) [@davidwendt](https://github.com/davidwendt) +- Update ucx-py version on release using rvc ([#9897](https://github.com/rapidsai/cudf/pull/9897)) [@Ethyling](https://github.com/Ethyling) +- Remove `IncludeCategories` from `.clang-format` ([#9876](https://github.com/rapidsai/cudf/pull/9876)) [@codereport](https://github.com/codereport) +- Support statically linking CUDA runtime for Java bindings ([#9873](https://github.com/rapidsai/cudf/pull/9873)) [@jlowe](https://github.com/jlowe) +- Add `clang-tidy` to libcudf ([#9860](https://github.com/rapidsai/cudf/pull/9860)) [@codereport](https://github.com/codereport) +- Remove deprecated methods from Java Table class ([#9853](https://github.com/rapidsai/cudf/pull/9853)) [@jlowe](https://github.com/jlowe) +- Add test for map column metadata handling in ORC writer ([#9852](https://github.com/rapidsai/cudf/pull/9852)) [@vuule](https://github.com/vuule) +- Use pandas `to_offset` to parse frequency string in `date_range` ([#9843](https://github.com/rapidsai/cudf/pull/9843)) [@isVoid](https://github.com/isVoid) +- add templated benchmark with fixture ([#9838](https://github.com/rapidsai/cudf/pull/9838)) [@karthikeyann](https://github.com/karthikeyann) +- Use list of column inputs for `apply_boolean_mask` ([#9832](https://github.com/rapidsai/cudf/pull/9832)) [@isVoid](https://github.com/isVoid) +- Added a few more tests for Decimal to String cast ([#9818](https://github.com/rapidsai/cudf/pull/9818)) [@razajafri](https://github.com/razajafri) +- Run doctests. 
([#9815](https://github.com/rapidsai/cudf/pull/9815)) [@bdice](https://github.com/bdice) +- Avoid overflow for fixed_point round ([#9809](https://github.com/rapidsai/cudf/pull/9809)) [@sperlingxx](https://github.com/sperlingxx) +- Move `drop_duplicates`, `drop_na`, `_gather`, `take` to IndexFrame and create their `_base_index` counterparts ([#9807](https://github.com/rapidsai/cudf/pull/9807)) [@isVoid](https://github.com/isVoid) +- Use vector factories for host-device copies. ([#9806](https://github.com/rapidsai/cudf/pull/9806)) [@bdice](https://github.com/bdice) +- Refactor host device macros ([#9797](https://github.com/rapidsai/cudf/pull/9797)) [@vyasr](https://github.com/vyasr) +- Remove unused masked udf cython/c++ code ([#9792](https://github.com/rapidsai/cudf/pull/9792)) [@brandon-b-miller](https://github.com/brandon-b-miller) +- Allow custom sort functions for dask-cudf `sort_values` ([#9789](https://github.com/rapidsai/cudf/pull/9789)) [@charlesbluca](https://github.com/charlesbluca) +- Improve build time of libcudf iterator tests ([#9788](https://github.com/rapidsai/cudf/pull/9788)) [@davidwendt](https://github.com/davidwendt) +- Copy Java native dependencies directly into classpath ([#9787](https://github.com/rapidsai/cudf/pull/9787)) [@jlowe](https://github.com/jlowe) +- Add decimal types to cuIO benchmarks ([#9776](https://github.com/rapidsai/cudf/pull/9776)) [@vuule](https://github.com/vuule) +- Pick smallest decimal type with required precision in ORC reader ([#9775](https://github.com/rapidsai/cudf/pull/9775)) [@vuule](https://github.com/vuule) +- Avoid overflow for `fixed_point` `cudf::cast` and performance optimization ([#9772](https://github.com/rapidsai/cudf/pull/9772)) [@codereport](https://github.com/codereport) +- Use CTAD with Thrust function objects ([#9768](https://github.com/rapidsai/cudf/pull/9768)) [@codereport](https://github.com/codereport) +- Refactor TableTest assertion methods to a separate utility class ([#9762](https://github.com/rapidsai/cudf/pull/9762)) 
[@jlowe](https://github.com/jlowe) +- Use Java classloader to find test resources ([#9760](https://github.com/rapidsai/cudf/pull/9760)) [@jlowe](https://github.com/jlowe) +- Allow cast decimal128 to string and add tests ([#9756](https://github.com/rapidsai/cudf/pull/9756)) [@razajafri](https://github.com/razajafri) +- Load balance optimization for contiguous_split ([#9755](https://github.com/rapidsai/cudf/pull/9755)) [@nvdbaranec](https://github.com/nvdbaranec) +- Consolidate and improve `reset_index` ([#9750](https://github.com/rapidsai/cudf/pull/9750)) [@isVoid](https://github.com/isVoid) +- Update to UCX-Py 0.24 ([#9748](https://github.com/rapidsai/cudf/pull/9748)) [@pentschev](https://github.com/pentschev) +- Skip cufile tests in JNI build script ([#9744](https://github.com/rapidsai/cudf/pull/9744)) [@pxLi](https://github.com/pxLi) +- Enable string to decimal 128 cast ([#9742](https://github.com/rapidsai/cudf/pull/9742)) [@razajafri](https://github.com/razajafri) +- Use stop instead of stop_. ([#9735](https://github.com/rapidsai/cudf/pull/9735)) [@bdice](https://github.com/bdice) +- Forward-merge branch-21.12 to branch-22.02 ([#9730](https://github.com/rapidsai/cudf/pull/9730)) [@bdice](https://github.com/bdice) +- Improve cmake format script ([#9723](https://github.com/rapidsai/cudf/pull/9723)) [@vyasr](https://github.com/vyasr) +- Use cuFile direct device reads/writes by default in cuIO ([#9722](https://github.com/rapidsai/cudf/pull/9722)) [@vuule](https://github.com/vuule) +- Add directory-partitioned data support to cudf.read_parquet ([#9720](https://github.com/rapidsai/cudf/pull/9720)) [@rjzamora](https://github.com/rjzamora) +- Use stream allocator adaptor for hash join table ([#9704](https://github.com/rapidsai/cudf/pull/9704)) [@PointKernel](https://github.com/PointKernel) +- Update check for inf/nan strings in libcudf float conversion to ignore case ([#9694](https://github.com/rapidsai/cudf/pull/9694)) [@davidwendt](https://github.com/davidwendt) +- Update cudf JNI to 22.02.0-SNAPSHOT 
([#9681](https://github.com/rapidsai/cudf/pull/9681)) [@pxLi](https://github.com/pxLi) +- Replace cudf's concurrent_ordered_map with cuco::static_map in semi/anti joins ([#9666](https://github.com/rapidsai/cudf/pull/9666)) [@vyasr](https://github.com/vyasr) +- Some improvements to `parse_decimal` function and bindings for `is_fixed_point` ([#9658](https://github.com/rapidsai/cudf/pull/9658)) [@razajafri](https://github.com/razajafri) +- Add utility to format ninja-log build times ([#9631](https://github.com/rapidsai/cudf/pull/9631)) [@davidwendt](https://github.com/davidwendt) +- Allow runtime has_nulls parameter for row operators ([#9623](https://github.com/rapidsai/cudf/pull/9623)) [@davidwendt](https://github.com/davidwendt) +- Use fsspec.parquet for improved read_parquet performance from remote storage ([#9589](https://github.com/rapidsai/cudf/pull/9589)) [@rjzamora](https://github.com/rjzamora) +- Refactor bit counting APIs, introduce valid/null count functions, and split host/device side code for segmented counts. ([#9588](https://github.com/rapidsai/cudf/pull/9588)) [@bdice](https://github.com/bdice) +- Use List of Columns as Input for `drop_nulls`, `gather` and `drop_duplicates` ([#9558](https://github.com/rapidsai/cudf/pull/9558)) [@isVoid](https://github.com/isVoid) +- Simplify merge internals and reduce overhead ([#9516](https://github.com/rapidsai/cudf/pull/9516)) [@vyasr](https://github.com/vyasr) +- Add `struct` generation support in datagenerator & fuzz tests ([#9180](https://github.com/rapidsai/cudf/pull/9180)) [@galipremsagar](https://github.com/galipremsagar) +- Simplify write_csv by removing unnecessary writer/impl classes ([#9089](https://github.com/rapidsai/cudf/pull/9089)) [@cwharris](https://github.com/cwharris) # cuDF 21.12.00 (9 Dec 2021) diff --git a/build.sh b/build.sh index c2eba134c35..8b3add1dddd 100755 --- a/build.sh +++ b/build.sh @@ -185,12 +185,9 @@ if buildAll || hasArg libcudf; then fi # get the current count before the compile starts - FILES_IN_CCACHE="" - if [[ 
"$BUILD_REPORT_INCL_CACHE_STATS" == "ON" && -x "$(command -v ccache)" ]]; then - FILES_IN_CCACHE=$(ccache -s | grep "files in cache") - echo "$FILES_IN_CCACHE" - # zero the ccache statistics - ccache -z + if [[ "$BUILD_REPORT_INCL_CACHE_STATS" == "ON" && -x "$(command -v sccache)" ]]; then + # zero the sccache statistics + sccache --zero-stats fi cmake -S $REPODIR/cpp -B ${LIB_BUILD_DIR} \ @@ -216,11 +213,12 @@ if buildAll || hasArg libcudf; then echo "Formatting build metrics" python ${REPODIR}/cpp/scripts/sort_ninja_log.py ${LIB_BUILD_DIR}/.ninja_log --fmt xml > ${LIB_BUILD_DIR}/ninja_log.xml MSG="

" - # get some ccache stats after the compile - if [[ "$BUILD_REPORT_INCL_CACHE_STATS"=="ON" && -x "$(command -v ccache)" ]]; then - MSG="${MSG}
$FILES_IN_CCACHE" - HIT_RATE=$(ccache -s | grep "cache hit rate") - MSG="${MSG}
${HIT_RATE}" + # get some sccache stats after the compile + if [[ "$BUILD_REPORT_INCL_CACHE_STATS" == "ON" && -x "$(command -v sccache)" ]]; then + COMPILE_REQUESTS=$(sccache -s | grep "Compile requests \+ [0-9]\+$" | awk '{ print $NF }') + CACHE_HITS=$(sccache -s | grep "Cache hits \+ [0-9]\+$" | awk '{ print $NF }') + HIT_RATE=$(echo - | awk "{printf \"%.2f\n\", $CACHE_HITS / $COMPILE_REQUESTS * 100}") + MSG="${MSG}
cache hit rate ${HIT_RATE} %" fi MSG="${MSG}
parallel setting: $PARALLEL_LEVEL" MSG="${MSG}
parallel build time: $compile_total seconds" diff --git a/ci/cpu/build.sh b/ci/cpu/build.sh index 6f19f174da0..574a55d26b6 100755 --- a/ci/cpu/build.sh +++ b/ci/cpu/build.sh @@ -31,6 +31,10 @@ if [[ "$BUILD_MODE" = "branch" && "$SOURCE_BRANCH" = branch-* ]] ; then export VERSION_SUFFIX=`date +%y%m%d` fi +export CMAKE_CUDA_COMPILER_LAUNCHER="sccache" +export CMAKE_CXX_COMPILER_LAUNCHER="sccache" +export CMAKE_C_COMPILER_LAUNCHER="sccache" + ################################################################################ # SETUP - Check environment ################################################################################ @@ -77,6 +81,8 @@ if [ "$BUILD_LIBCUDF" == '1' ]; then gpuci_conda_retry build --no-build-id --croot ${CONDA_BLD_DIR} conda/recipes/libcudf $CONDA_BUILD_ARGS mkdir -p ${CONDA_BLD_DIR}/libcudf/work cp -r ${CONDA_BLD_DIR}/work/* ${CONDA_BLD_DIR}/libcudf/work + gpuci_logger "sccache stats" + sccache --show-stats # Copy libcudf build metrics results LIBCUDF_BUILD_DIR=$CONDA_BLD_DIR/libcudf/work/cpp/build diff --git a/ci/gpu/build.sh b/ci/gpu/build.sh index d5fb7451769..6a5c28faeff 100755 --- a/ci/gpu/build.sh +++ b/ci/gpu/build.sh @@ -36,6 +36,10 @@ export DASK_DISTRIBUTED_GIT_TAG='2022.01.0' # ucx-py version export UCX_PY_VERSION='0.25.*' +export CMAKE_CUDA_COMPILER_LAUNCHER="sccache" +export CMAKE_CXX_COMPILER_LAUNCHER="sccache" +export CMAKE_C_COMPILER_LAUNCHER="sccache" + ################################################################################ # TRAP - Setup trap for removing jitify cache ################################################################################ diff --git a/conda/recipes/libcudf/meta.yaml b/conda/recipes/libcudf/meta.yaml index 2cbe5173de0..70c020d4abd 100644 --- a/conda/recipes/libcudf/meta.yaml +++ b/conda/recipes/libcudf/meta.yaml @@ -22,13 +22,15 @@ build: - PARALLEL_LEVEL - VERSION_SUFFIX - PROJECT_FLASH - - CCACHE_DIR - - CCACHE_NOHASHDIR - - CCACHE_COMPILERCHECK - CMAKE_GENERATOR - 
CMAKE_C_COMPILER_LAUNCHER - CMAKE_CXX_COMPILER_LAUNCHER - CMAKE_CUDA_COMPILER_LAUNCHER + - SCCACHE_S3_KEY_PREFIX=libcudf-aarch64 # [aarch64] + - SCCACHE_S3_KEY_PREFIX=libcudf-linux64 # [linux64] + - SCCACHE_BUCKET=rapids-sccache + - SCCACHE_REGION=us-west-2 + - SCCACHE_IDLE_TIMEOUT=32768 run_exports: - {{ pin_subpackage("libcudf", max_pin="x.x") }} diff --git a/cpp/include/cudf/binaryop.hpp b/cpp/include/cudf/binaryop.hpp index daf55c0befe..177fd904b0b 100644 --- a/cpp/include/cudf/binaryop.hpp +++ b/cpp/include/cudf/binaryop.hpp @@ -45,7 +45,7 @@ enum class binary_operator : int32_t { PMOD, ///< positive modulo operator ///< If remainder is negative, this returns (remainder + divisor) % divisor ///< else, it returns (dividend % divisor) - PYMOD, ///< operator % but following python's sign rules for negatives + PYMOD, ///< operator % but following Python's sign rules for negatives POW, ///< lhs ^ rhs LOG_BASE, ///< logarithm to the base ATAN2, ///< 2-argument arctangent diff --git a/cpp/include/cudf/fixed_point/fixed_point.hpp b/cpp/include/cudf/fixed_point/fixed_point.hpp index a7112ae415d..f027e2783b1 100644 --- a/cpp/include/cudf/fixed_point/fixed_point.hpp +++ b/cpp/include/cudf/fixed_point/fixed_point.hpp @@ -1,5 +1,5 @@ /* - * Copyright (c) 2020-2021, NVIDIA CORPORATION. + * Copyright (c) 2020-2022, NVIDIA CORPORATION. * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. @@ -440,6 +440,21 @@ class fixed_point { CUDF_HOST_DEVICE inline friend fixed_point operator/( fixed_point const& lhs, fixed_point const& rhs); + /** + * @brief operator % (for computing the modulo operation of two `fixed_point` numbers) + * + * If `_scale`s are equal, the modulus is computed directly. + * If `_scale`s are not equal, the number with larger `_scale` is shifted to the + * smaller `_scale`, and then the modulus is computed. 
+ * @tparam Rep1 Representation type of number being modulo-ed to `this` + * @tparam Rad1 Radix (base) type of number being modulo-ed to `this` + * @return The resulting `fixed_point` number + */ + template <typename Rep1, Radix Rad1> + CUDF_HOST_DEVICE inline friend fixed_point<Rep1, Rad1> operator%( + fixed_point<Rep1, Rad1> const& lhs, fixed_point<Rep1, Rad1> const& rhs); + /** * @brief operator == (for comparing two `fixed_point` numbers) * @@ -750,6 +765,16 @@ CUDF_HOST_DEVICE inline bool operator>(fixed_point<Rep1, Rad1> const& lhs, return lhs.rescaled(scale)._value > rhs.rescaled(scale)._value; } +// MODULO OPERATION +template <typename Rep1, Radix Rad1> +CUDF_HOST_DEVICE inline fixed_point<Rep1, Rad1> operator%(fixed_point<Rep1, Rad1> const& lhs, + fixed_point<Rep1, Rad1> const& rhs) +{ + auto const scale = std::min(lhs._scale, rhs._scale); + auto const remainder = lhs.rescaled(scale)._value % rhs.rescaled(scale)._value; + return fixed_point<Rep1, Rad1>{scaled_integer<Rep1>{remainder, scale}}; +} + using decimal32 = fixed_point<int32_t, Radix::BASE_10>; using decimal64 = fixed_point<int64_t, Radix::BASE_10>; using decimal128 = fixed_point<__int128_t, Radix::BASE_10>; diff --git a/cpp/src/binaryop/binaryop.cpp b/cpp/src/binaryop/binaryop.cpp index 5f9ff2574e3..dfa7896c37a 100644 --- a/cpp/src/binaryop/binaryop.cpp +++ b/cpp/src/binaryop/binaryop.cpp @@ -88,7 +88,10 @@ bool is_basic_arithmetic_binop(binary_operator op) op == binary_operator::MUL or // operator * op == binary_operator::DIV or // operator / using common type of lhs and rhs op == binary_operator::NULL_MIN or // 2 null = null, 1 null = value, else min - op == binary_operator::NULL_MAX; // 2 null = null, 1 null = value, else max + op == binary_operator::NULL_MAX or // 2 null = null, 1 null = value, else max + op == binary_operator::MOD or // operator % + op == binary_operator::PMOD or // positive modulo operator + op == binary_operator::PYMOD; // operator % but following Python's negative sign rules } /** diff --git a/cpp/src/binaryop/compiled/operation.cuh b/cpp/src/binaryop/compiled/operation.cuh index 4b5f78dc400..de9d46b6280 100644 --- a/cpp/src/binaryop/compiled/operation.cuh +++ 
b/cpp/src/binaryop/compiled/operation.cuh @@ -162,12 +162,24 @@ struct PMod { if (rem < 0) rem = std::fmod(rem + yconv, yconv); return rem; } + + template <typename TypeLhs, + typename TypeRhs, + std::enable_if_t<cudf::is_fixed_point<TypeLhs>() and + std::is_same_v<TypeLhs, TypeRhs>>* = nullptr> + __device__ inline auto operator()(TypeLhs x, TypeRhs y) + { + auto const remainder = x % y; + return remainder.value() < 0 ? (remainder + y) % y : remainder; + } }; struct PyMod { template <typename TypeLhs, typename TypeRhs, - std::enable_if_t<(std::is_integral_v<std::common_type_t<TypeLhs, TypeRhs>>)>* = nullptr> + std::enable_if_t<(std::is_integral_v<std::common_type_t<TypeLhs, TypeRhs>> or + (cudf::is_fixed_point<TypeLhs>() and + std::is_same_v<TypeLhs, TypeRhs>))>* = nullptr> __device__ inline auto operator()(TypeLhs x, TypeRhs y) -> decltype(((x % y) + y) % y) { return ((x % y) + y) % y; diff --git a/cpp/src/binaryop/compiled/util.cpp b/cpp/src/binaryop/compiled/util.cpp index 9481c236142..d8f1eb03a16 100644 --- a/cpp/src/binaryop/compiled/util.cpp +++ b/cpp/src/binaryop/compiled/util.cpp @@ -45,7 +45,11 @@ struct common_type_functor { // Eg. d=t-t return data_type{type_to_id()}; } - return {}; + + // A compiler bug may cause a compilation error when using empty initializer list to construct + // an std::optional object containing no `data_type` value. Therefore, we should explicitly + // return `std::nullopt` instead. + return std::nullopt; } }; template diff --git a/cpp/tests/binaryop/binop-compiled-fixed_point-test.cpp b/cpp/tests/binaryop/binop-compiled-fixed_point-test.cpp index 29905171907..335de93c976 100644 --- a/cpp/tests/binaryop/binop-compiled-fixed_point-test.cpp +++ b/cpp/tests/binaryop/binop-compiled-fixed_point-test.cpp @@ -1,5 +1,5 @@ /* - * Copyright (c) 2021, NVIDIA CORPORATION. + * Copyright (c) 2021-2022, NVIDIA CORPORATION. * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. 
@@ -33,14 +33,14 @@ namespace cudf::test::binop { template <typename T> -struct FixedPointCompiledTestBothReps : public cudf::test::BaseFixture { +struct FixedPointCompiledTest : public cudf::test::BaseFixture { }; template <typename T> using wrapper = cudf::test::fixed_width_column_wrapper<T>; -TYPED_TEST_SUITE(FixedPointCompiledTestBothReps, cudf::test::FixedPointTypes); +TYPED_TEST_SUITE(FixedPointCompiledTest, cudf::test::FixedPointTypes); -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpAdd) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpAdd) { using namespace numeric; using decimalXX = TypeParam; @@ -73,7 +73,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpAdd) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected_col, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpMultiply) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpMultiply) { using namespace numeric; using decimalXX = TypeParam; @@ -109,7 +109,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpMultiply) template <typename T> using fp_wrapper = cudf::test::fixed_point_column_wrapper<T>; -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpMultiply2) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpMultiply2) { using namespace numeric; using decimalXX = TypeParam; @@ -128,7 +128,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpMultiply2) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpDiv) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpDiv) { using namespace numeric; using decimalXX = TypeParam; @@ -147,7 +147,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpDiv) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpDiv2) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpDiv2) { using namespace numeric; using decimalXX = TypeParam; @@ -166,7 +166,7 @@ 
TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpDiv2) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpDiv3) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpDiv3) { using namespace numeric; using decimalXX = TypeParam; @@ -183,7 +183,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpDiv3) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpDiv4) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpDiv4) { using namespace numeric; using decimalXX = TypeParam; @@ -203,7 +203,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpDiv4) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpAdd2) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpAdd2) { using namespace numeric; using decimalXX = TypeParam; @@ -222,7 +222,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpAdd2) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpAdd3) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpAdd3) { using namespace numeric; using decimalXX = TypeParam; @@ -241,7 +241,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpAdd3) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpAdd4) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpAdd4) { using namespace numeric; using decimalXX = TypeParam; @@ -258,7 +258,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpAdd4) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpAdd5) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpAdd5) { using namespace numeric; using decimalXX = TypeParam; @@ -275,7 +275,7 @@ 
TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpAdd5) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpAdd6) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpAdd6) { using namespace numeric; using decimalXX = TypeParam; @@ -294,7 +294,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpAdd6) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected1, result1->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointCast) +TYPED_TEST(FixedPointCompiledTest, FixedPointCast) { using namespace numeric; using decimalXX = TypeParam; @@ -308,7 +308,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointCast) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpMultiplyScalar) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpMultiplyScalar) { using namespace numeric; using decimalXX = TypeParam; @@ -325,7 +325,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpMultiplyScalar) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpSimplePlus) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpSimplePlus) { using namespace numeric; using decimalXX = TypeParam; @@ -344,7 +344,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpSimplePlus) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpEqualSimple) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpEqualSimple) { using namespace numeric; using decimalXX = TypeParam; @@ -361,7 +361,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpEqualSimple) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpEqualSimpleScale0) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpEqualSimpleScale0) { using 
namespace numeric; using decimalXX = TypeParam; @@ -377,7 +377,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpEqualSimpleScale0) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpEqualSimpleScale0Null) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpEqualSimpleScale0Null) { using namespace numeric; using decimalXX = TypeParam; @@ -393,7 +393,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpEqualSimpleScale0Nu CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpEqualSimpleScale2Null) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpEqualSimpleScale2Null) { using namespace numeric; using decimalXX = TypeParam; @@ -409,7 +409,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpEqualSimpleScale2Nu CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpEqualLessGreater) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpEqualLessGreater) { using namespace numeric; using decimalXX = TypeParam; @@ -453,7 +453,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpEqualLessGreater) CUDF_TEST_EXPECT_COLUMNS_EQUAL(true_col, greater_result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpNullMaxSimple) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpNullMaxSimple) { using namespace numeric; using decimalXX = TypeParam; @@ -473,7 +473,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpNullMaxSimple) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpNullMinSimple) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpNullMinSimple) { using namespace numeric; using decimalXX = TypeParam; @@ -493,7 +493,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpNullMinSimple) 
CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpNullEqualsSimple) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpNullEqualsSimple) { using namespace numeric; using decimalXX = TypeParam; @@ -510,7 +510,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpNullEqualsSimple) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOp_Div) { using namespace numeric; using decimalXX = TypeParam; @@ -526,7 +526,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div2) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOp_Div2) { using namespace numeric; using decimalXX = TypeParam; @@ -542,7 +542,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div2) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div3) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOp_Div3) { using namespace numeric; using decimalXX = TypeParam; @@ -558,7 +558,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div3) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div4) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOp_Div4) { using namespace numeric; using decimalXX = TypeParam; @@ -574,7 +574,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div4) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div6) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOp_Div6) { using namespace numeric; using decimalXX = TypeParam; @@ -591,7 +591,7 @@ 
TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div6) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div7) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOp_Div7) { using namespace numeric; using decimalXX = TypeParam; @@ -608,7 +608,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div7) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div8) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOp_Div8) { using namespace numeric; using decimalXX = TypeParam; @@ -624,7 +624,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div8) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div9) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOp_Div9) { using namespace numeric; using decimalXX = TypeParam; @@ -640,7 +640,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div9) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div10) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOp_Div10) { using namespace numeric; using decimalXX = TypeParam; @@ -656,7 +656,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div10) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div11) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOp_Div11) { using namespace numeric; using decimalXX = TypeParam; @@ -672,7 +672,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div11) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpThrows) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpThrows) { using namespace numeric; using decimalXX = TypeParam; @@ 
-684,6 +684,132 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpThrows) cudf::logic_error); } +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpModSimple) +{ + using namespace numeric; + using decimalXX = TypeParam; + using RepType = device_storage_type_t<decimalXX>; + + auto const lhs = fp_wrapper<RepType>{{-33, -22, -11, 11, 22, 33, 44, 55}, scale_type{-1}}; + auto const rhs = fp_wrapper<RepType>{{10, 10, 10, 10, 10, 10, 10, 10}, scale_type{-1}}; + auto const expected = fp_wrapper<RepType>{{-3, -2, -1, 1, 2, 3, 4, 5}, scale_type{-1}}; + + auto const type = + cudf::binary_operation_fixed_point_output_type(cudf::binary_operator::MOD, + static_cast<cudf::column_view>(lhs).type(), + static_cast<cudf::column_view>(rhs).type()); + auto const result = cudf::binary_operation(lhs, rhs, cudf::binary_operator::MOD, type); + + CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); +} + +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpPModSimple) +{ + using namespace numeric; + using decimalXX = TypeParam; + using RepType = device_storage_type_t<decimalXX>; + + auto const lhs = fp_wrapper<RepType>{{-33, -22, -11, 11, 22, 33, 44, 55}, scale_type{-1}}; + auto const rhs = fp_wrapper<RepType>{{10, 10, 10, 10, 10, 10, 10, 10}, scale_type{-1}}; + auto const expected = fp_wrapper<RepType>{{7, 8, 9, 1, 2, 3, 4, 5}, scale_type{-1}}; + + for (auto const op : {cudf::binary_operator::PMOD, cudf::binary_operator::PYMOD}) { + auto const type = cudf::binary_operation_fixed_point_output_type( + op, static_cast<cudf::column_view>(lhs).type(), static_cast<cudf::column_view>(rhs).type()); + auto const result = cudf::binary_operation(lhs, rhs, op, type); + + CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); + } +} + +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpModSimple2) +{ + using namespace numeric; + using decimalXX = TypeParam; + using RepType = device_storage_type_t<decimalXX>; + + auto const lhs = fp_wrapper<RepType>{{-33, -22, -11, 11, 22, 33, 44, 55}, scale_type{-1}}; + auto const rhs = make_fixed_point_scalar<decimalXX>(10, scale_type{-1}); + auto const expected = fp_wrapper<RepType>{{-3, -2, -1, 1, 2, 3, 4, 5}, scale_type{-1}}; + 
+ auto const type = cudf::binary_operation_fixed_point_output_type( + cudf::binary_operator::MOD, static_cast<cudf::column_view>(lhs).type(), rhs->type()); + auto const result = cudf::binary_operation(lhs, *rhs, cudf::binary_operator::MOD, type); + + CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); +} + +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpPModAndPyModSimple2) +{ + using namespace numeric; + using decimalXX = TypeParam; + using RepType = device_storage_type_t<decimalXX>; + + auto const lhs = fp_wrapper<RepType>{{-33, -22, -11, 11, 22, 33, 44, 55}, scale_type{-1}}; + auto const rhs = make_fixed_point_scalar<decimalXX>(10, scale_type{-1}); + auto const expected = fp_wrapper<RepType>{{7, 8, 9, 1, 2, 3, 4, 5}, scale_type{-1}}; + + for (auto const op : {cudf::binary_operator::PMOD, cudf::binary_operator::PYMOD}) { + auto const type = cudf::binary_operation_fixed_point_output_type( + op, static_cast<cudf::column_view>(lhs).type(), rhs->type()); + auto const result = cudf::binary_operation(lhs, *rhs, op, type); + + CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); + } +} + +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpMod) +{ + using namespace numeric; + using decimalXX = TypeParam; + using RepType = device_storage_type_t<decimalXX>; + auto constexpr N = 1000; + + for (auto scale : {-1, -2, -3}) { + auto const iota = thrust::make_counting_iterator(-500); + auto const lhs = fp_wrapper<RepType>{iota, iota + N, scale_type{-1}}; + auto const rhs = make_fixed_point_scalar<decimalXX>(7, scale_type{scale}); + + auto const factor = static_cast<int>(std::pow(10, -1 - scale)); + auto const f = [factor](auto i) { return (i * factor) % 7; }; + auto const exp_iter = cudf::detail::make_counting_transform_iterator(-500, f); + auto const expected = fp_wrapper<RepType>{exp_iter, exp_iter + N, scale_type{scale}}; + + auto const type = cudf::binary_operation_fixed_point_output_type( + cudf::binary_operator::MOD, static_cast<cudf::column_view>(lhs).type(), rhs->type()); + auto const result = cudf::binary_operation(lhs, *rhs, cudf::binary_operator::MOD, type); + + 
CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); + } +} + +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpPModAndPyMod) +{ + using namespace numeric; + using decimalXX = TypeParam; + using RepType = device_storage_type_t<decimalXX>; + auto constexpr N = 1000; + + for (auto const scale : {-1, -2, -3}) { + auto const iota = thrust::make_counting_iterator(-500); + auto const lhs = fp_wrapper<RepType>{iota, iota + N, scale_type{-1}}; + auto const rhs = make_fixed_point_scalar<decimalXX>(7, scale_type{scale}); + + auto const factor = static_cast<int>(std::pow(10, -1 - scale)); + auto const f = [factor](auto i) { return (((i * factor) % 7) + 7) % 7; }; + auto const exp_iter = cudf::detail::make_counting_transform_iterator(-500, f); + auto const expected = fp_wrapper<RepType>{exp_iter, exp_iter + N, scale_type{scale}}; + + for (auto const op : {cudf::binary_operator::PMOD, cudf::binary_operator::PYMOD}) { + auto const type = cudf::binary_operation_fixed_point_output_type( + op, static_cast<cudf::column_view>(lhs).type(), rhs->type()); + auto const result = cudf::binary_operation(lhs, *rhs, op, type); + + CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); + } + } +} + template <typename T> struct FixedPointTest_64_128_Reps : public cudf::test::BaseFixture { }; diff --git a/python/cudf/cudf/core/series.py b/python/cudf/cudf/core/series.py index 2b1caa05c92..3ee799df03c 100644 --- a/python/cudf/cudf/core/series.py +++ b/python/cudf/cudf/core/series.py @@ -2722,42 +2722,6 @@ def unique(self): res = self._column.unique() return Series(res, name=self.name) - def nunique(self, method="sort", dropna=True): - """Returns the number of unique values of the Series: approximate version, - and exact version to be moved to libcudf - - Excludes NA values by default. - - Parameters - ---------- - dropna : bool, default True - Don't include NA values in the count. 
- - Returns - ------- - int - - Examples - -------- - >>> import cudf - >>> s = cudf.Series([1, 3, 5, 7, 7]) - >>> s - 0 1 - 1 3 - 2 5 - 3 7 - 4 7 - dtype: int64 - >>> s.nunique() - 4 - """ - if method != "sort": - msg = "non sort based distinct_count() not implemented yet" - raise NotImplementedError(msg) - if self.null_count == len(self): - return 0 - return super().nunique(method, dropna) - def value_counts( self, normalize=False, diff --git a/python/cudf/cudf/core/single_column_frame.py b/python/cudf/cudf/core/single_column_frame.py index ef479f19363..599224a6995 100644 --- a/python/cudf/cudf/core/single_column_frame.py +++ b/python/cudf/cudf/core/single_column_frame.py @@ -343,4 +343,6 @@ def nunique(self, method: builtins.str = "sort", dropna: bool = True): int Number of unique values in the column. """ + if self._column.null_count == len(self): + return 0 return self._column.distinct_count(method=method, dropna=dropna) diff --git a/python/dask_cudf/dask_cudf/io/tests/test_parquet.py b/python/dask_cudf/dask_cudf/io/tests/test_parquet.py index 3e59b9c3fcc..f5c1e53258e 100644 --- a/python/dask_cudf/dask_cudf/io/tests/test_parquet.py +++ b/python/dask_cudf/dask_cudf/io/tests/test_parquet.py @@ -40,12 +40,7 @@ def test_roundtrip_from_dask(tmpdir, stats): tmpdir = str(tmpdir) ddf.to_parquet(tmpdir, engine="pyarrow") files = sorted( - ( - os.path.join(tmpdir, f) - for f in os.listdir(tmpdir) - # TODO: Allow "_metadata" in list after dask#6047 - if not f.endswith("_metadata") - ), + (os.path.join(tmpdir, f) for f in os.listdir(tmpdir)), key=natural_sort_key, )