diff --git a/CHANGELOG.md b/CHANGELOG.md
index 6d4bdfb8d98..dda2e02f593 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -3,8 +3,244 @@
 Please see https://github.com/rapidsai/cudf/releases/tag/v22.04.00a for the latest changes to this development branch.
 
-# cuDF 22.02.00 (Date TBD)
+# cuDF 22.02.00 (2 Feb 2022)
+
+## 🚨 Breaking Changes
+
+- ORC writer API changes for granular statistics ([#10058](https://github.com/rapidsai/cudf/pull/10058)) [@mythrocks](https://github.com/mythrocks)
+- `decimal128` Support for `to/from_arrow` ([#9986](https://github.com/rapidsai/cudf/pull/9986)) [@codereport](https://github.com/codereport)
+- Remove deprecated method `one_hot_encoding` ([#9977](https://github.com/rapidsai/cudf/pull/9977)) [@isVoid](https://github.com/isVoid)
+- Remove str.subword_tokenize ([#9968](https://github.com/rapidsai/cudf/pull/9968)) [@VibhuJawa](https://github.com/VibhuJawa)
+- Remove deprecated `method` parameter from `merge` and `join`. ([#9944](https://github.com/rapidsai/cudf/pull/9944)) [@bdice](https://github.com/bdice)
+- Remove deprecated method DataFrame.hash_columns. ([#9943](https://github.com/rapidsai/cudf/pull/9943)) [@bdice](https://github.com/bdice)
+- Remove deprecated method Series.hash_encode. ([#9942](https://github.com/rapidsai/cudf/pull/9942)) [@bdice](https://github.com/bdice)
+- Refactoring ceil/round/floor code for datetime64 types ([#9926](https://github.com/rapidsai/cudf/pull/9926)) [@mayankanand007](https://github.com/mayankanand007)
+- Introduce `nan_as_null` parameter for `cudf.Index` ([#9893](https://github.com/rapidsai/cudf/pull/9893)) [@galipremsagar](https://github.com/galipremsagar)
+- Add regex_flags parameter to strings replace_re functions ([#9878](https://github.com/rapidsai/cudf/pull/9878)) [@davidwendt](https://github.com/davidwendt)
+- Break tie for `top` categorical columns in `Series.describe` ([#9867](https://github.com/rapidsai/cudf/pull/9867)) [@isVoid](https://github.com/isVoid)
+- Add partitioning support in parquet writer ([#9810](https://github.com/rapidsai/cudf/pull/9810)) [@devavret](https://github.com/devavret)
+- Move `drop_duplicates`, `drop_na`, `_gather`, `take` to IndexedFrame and create their `_base_index` counterparts ([#9807](https://github.com/rapidsai/cudf/pull/9807)) [@isVoid](https://github.com/isVoid)
+- Raise temporary error for `decimal128` types in parquet reader ([#9804](https://github.com/rapidsai/cudf/pull/9804)) [@galipremsagar](https://github.com/galipremsagar)
+- Change default `dtype` of all nulls column from `float` to `object` ([#9803](https://github.com/rapidsai/cudf/pull/9803)) [@galipremsagar](https://github.com/galipremsagar)
+- Remove unused masked udf cython/c++ code ([#9792](https://github.com/rapidsai/cudf/pull/9792)) [@brandon-b-miller](https://github.com/brandon-b-miller)
+- Pick smallest decimal type with required precision in ORC reader ([#9775](https://github.com/rapidsai/cudf/pull/9775)) [@vuule](https://github.com/vuule)
+- Add decimal128 support to Parquet reader and writer ([#9765](https://github.com/rapidsai/cudf/pull/9765)) [@vuule](https://github.com/vuule)
+- Refactor TableTest assertion methods to a separate utility class ([#9762](https://github.com/rapidsai/cudf/pull/9762)) [@jlowe](https://github.com/jlowe)
+- Use cuFile direct device reads/writes by default in cuIO ([#9722](https://github.com/rapidsai/cudf/pull/9722)) [@vuule](https://github.com/vuule)
+- Match pandas scalar result types in reductions ([#9717](https://github.com/rapidsai/cudf/pull/9717)) [@brandon-b-miller](https://github.com/brandon-b-miller)
+- Add parameters to control row group size in Parquet writer ([#9677](https://github.com/rapidsai/cudf/pull/9677)) [@vuule](https://github.com/vuule)
+- Refactor bit counting APIs, introduce valid/null count functions, and split host/device side code for segmented counts. ([#9588](https://github.com/rapidsai/cudf/pull/9588)) [@bdice](https://github.com/bdice)
+- Add support for `decimal128` in cudf python ([#9533](https://github.com/rapidsai/cudf/pull/9533)) [@galipremsagar](https://github.com/galipremsagar)
+- Implement `lists::index_of()` to find positions in list rows ([#9510](https://github.com/rapidsai/cudf/pull/9510)) [@mythrocks](https://github.com/mythrocks)
+- Rewriting row/column conversions for Spark <-> cudf data conversions ([#8444](https://github.com/rapidsai/cudf/pull/8444)) [@hyperbolic2346](https://github.com/hyperbolic2346)
-Please see https://github.com/rapidsai/cudf/releases/tag/v22.02.00a for the latest changes to this development branch.
+## 🐛 Bug Fixes + +- Add check fo negative stipe index in ORC eade ([#10074](https://github.com/rapidsai/cudf/pull/10074)) [@vuule](https://github.com/vuule) +- Update Java tests to expect DECIMAL128 fom Aow ([#10073](https://github.com/rapidsai/cudf/pull/10073)) [@jlowe](https://github.com/jlowe) +- Avoid index mateialization when `DataFame` is ceated with un-named `Seies` objects ([#10071](https://github.com/rapidsai/cudf/pull/10071)) [@galipemsaga](https://github.com/galipemsaga) +- fix gcc 11 compilation eos ([#10067](https://github.com/rapidsai/cudf/pull/10067)) [@ongou](https://github.com/ongou) +- Fix `columns` odeing issue in paquet eade ([#10066](https://github.com/rapidsai/cudf/pull/10066)) [@galipemsaga](https://github.com/galipemsaga) +- Fix datafame setitem with `ndaay` types ([#10056](https://github.com/rapidsai/cudf/pull/10056)) [@galipemsaga](https://github.com/galipemsaga) +- Remove implicit copy due to convesion fom cudf::size_type and size_t ([#10045](https://github.com/rapidsai/cudf/pull/10045)) [@obetmaynad](https://github.com/obetmaynad) +- Include <optional> in heades that use std::optional ([#10044](https://github.com/rapidsai/cudf/pull/10044)) [@obetmaynad](https://github.com/obetmaynad) +- Fix ep and concat of `StuctColumn` ([#10042](https://github.com/rapidsai/cudf/pull/10042)) [@galipemsaga](https://github.com/galipemsaga) +- Include ow goup level stats when witing ORC files ([#10041](https://github.com/rapidsai/cudf/pull/10041)) [@vuule](https://github.com/vuule) +- build.sh espects the `--build_metics` and `--incl_cache_stats` flags ([#10035](https://github.com/rapidsai/cudf/pull/10035)) [@obetmaynad](https://github.com/obetmaynad) +- Fix memoy leaks in JNI native code. 
([#10029](https://github.com/rapidsai/cudf/pull/10029)) [@mythocks](https://github.com/mythocks) +- Update JNI to use new aena m constucto ([#10027](https://github.com/rapidsai/cudf/pull/10027)) [@ongou](https://github.com/ongou) +- Fix null check when compaing stucts in `ag_min` opeation of eduction/goupby ([#10026](https://github.com/rapidsai/cudf/pull/10026)) [@ttnghia](https://github.com/ttnghia) +- Wap CI scipt shell vaiables in quotes to fix local testing. ([#10018](https://github.com/rapidsai/cudf/pull/10018)) [@bdice](https://github.com/bdice) +- cudftestutil no longe popagates compile flags to extenal uses ([#10017](https://github.com/rapidsai/cudf/pull/10017)) [@obetmaynad](https://github.com/obetmaynad) +- Remove `CUDA_DEVICE_CALLABLE` maco usage ([#10015](https://github.com/rapidsai/cudf/pull/10015)) [@hypebolic2346](https://github.com/hypebolic2346) +- Add missing list filling heade in meta.yaml ([#10007](https://github.com/rapidsai/cudf/pull/10007)) [@devavet](https://github.com/devavet) +- Fix `conda` ecipes fo `custeamz` & `cudf_kafka` ([#10003](https://github.com/rapidsai/cudf/pull/10003)) [@ajschmidt8](https://github.com/ajschmidt8) +- Fix matching egex wod-bounday () in stings eplace ([#9997](https://github.com/rapidsai/cudf/pull/9997)) [@davidwendt](https://github.com/davidwendt) +- Fix null check when compaing stucts in `min` and `max` eduction/goupby opeations ([#9994](https://github.com/rapidsai/cudf/pull/9994)) [@ttnghia](https://github.com/ttnghia) +- Fix octal patten matching in egex sting ([#9993](https://github.com/rapidsai/cudf/pull/9993)) [@davidwendt](https://github.com/davidwendt) +- `decimal128` Suppot fo `to/fom_aow` ([#9986](https://github.com/rapidsai/cudf/pull/9986)) [@codeepot](https://github.com/codeepot) +- Fix goupby shift/diff/fill 
afte selecting fom a `GoupBy` ([#9984](https://github.com/rapidsai/cudf/pull/9984)) [@shwina](https://github.com/shwina) +- Fix the oveflow poblem of decimal escale ([#9966](https://github.com/rapidsai/cudf/pull/9966)) [@spelingxx](https://github.com/spelingxx) +- Use default value fo decimal pecision in paquet wite when not specified ([#9963](https://github.com/rapidsai/cudf/pull/9963)) [@devavet](https://github.com/devavet) +- Fix cudf java build eo. ([#9958](https://github.com/rapidsai/cudf/pull/9958)) [@fiestaman](https://github.com/fiestaman) +- Use gpuci_mamba_ety to install local atifacts. ([#9951](https://github.com/rapidsai/cudf/pull/9951)) [@bdice](https://github.com/bdice) +- Fix egession HostColumnVectoCoe equiing native libs ([#9948](https://github.com/rapidsai/cudf/pull/9948)) [@jlowe](https://github.com/jlowe) +- Rename aggegate_metadata in wite to fix name collision ([#9938](https://github.com/rapidsai/cudf/pull/9938)) [@devavet](https://github.com/devavet) +- Fixed issue with pecentile_appox whee output tdigests could have uninitialized data at the end. ([#9931](https://github.com/rapidsai/cudf/pull/9931)) [@nvdbaanec](https://github.com/nvdbaanec) +- Resolve acecheck eos in ORC kenels ([#9916](https://github.com/rapidsai/cudf/pull/9916)) [@vuule](https://github.com/vuule) +- Fix the java build afte paquet patitioning suppot ([#9908](https://github.com/rapidsai/cudf/pull/9908)) [@evans2](https://github.com/evans2) +- Fix compilation of benchmak fo paquet wite. 
([#9905](https://github.com/rapidsai/cudf/pull/9905)) [@bdice](https://github.com/bdice) +- Fix a memcheck eo in ORC wite ([#9896](https://github.com/rapidsai/cudf/pull/9896)) [@vuule](https://github.com/vuule) +- Intoduce `nan_as_null` paamete fo `cudf.Index` ([#9893](https://github.com/rapidsai/cudf/pull/9893)) [@galipemsaga](https://github.com/galipemsaga) +- Fix fallback to sot aggegation fo gouping only hash aggegate ([#9891](https://github.com/rapidsai/cudf/pull/9891)) [@abellina](https://github.com/abellina) +- Add zlib to cudfjni link when using static libcudf libay dependency ([#9890](https://github.com/rapidsai/cudf/pull/9890)) [@jlowe](https://github.com/jlowe) +- TimedeltaIndex constucto aises an AttibuteEo. ([#9884](https://github.com/rapidsai/cudf/pull/9884)) [@skiui-souce](https://github.com/skiui-souce) +- Fix cudf.Scala sting datetime constuction ([#9875](https://github.com/rapidsai/cudf/pull/9875)) [@bandon-b-mille](https://github.com/bandon-b-mille) +- Load libcufile.so with RTLD_NODELETE flag ([#9872](https://github.com/rapidsai/cudf/pull/9872)) [@vuule](https://github.com/vuule) +- Beak tie fo `top` categoical columns in `Seies.descibe` ([#9867](https://github.com/rapidsai/cudf/pull/9867)) [@isVoid](https://github.com/isVoid) +- Fix null handling fo stucts `min` and `ag_min` in goupby, goupby scan, eduction, and inclusive_scan ([#9864](https://github.com/rapidsai/cudf/pull/9864)) [@ttnghia](https://github.com/ttnghia) +- Add one-level list encoding suppot in paquet eade ([#9848](https://github.com/rapidsai/cudf/pull/9848)) [@PointKenel](https://github.com/PointKenel) +- Fix an out-of-bounds ead in validity copying in contiguous_split. 
([#9842](https://github.com/rapidsai/cudf/pull/9842)) [@nvdbaanec](https://github.com/nvdbaanec) +- Fix join of MultiIndex to Index with one column and ovelapping name. ([#9830](https://github.com/rapidsai/cudf/pull/9830)) [@vyas](https://github.com/vyas) +- Fix caching in `Seies.applymap` ([#9821](https://github.com/rapidsai/cudf/pull/9821)) [@bandon-b-mille](https://github.com/bandon-b-mille) +- Enfoce boolean `ascending` fo dask-cudf `sot_values` ([#9814](https://github.com/rapidsai/cudf/pull/9814)) [@chalesbluca](https://github.com/chalesbluca) +- Fix ORC wite cash with empty input columns ([#9808](https://github.com/rapidsai/cudf/pull/9808)) [@vuule](https://github.com/vuule) +- Change default `dtype` of all nulls column fom `float` to `object` ([#9803](https://github.com/rapidsai/cudf/pull/9803)) [@galipemsaga](https://github.com/galipemsaga) +- Load native dependencies when Java ColumnView is loaded ([#9800](https://github.com/rapidsai/cudf/pull/9800)) [@jlowe](https://github.com/jlowe) +- Fix dtype-agument bug in dask_cudf ead_csv ([#9796](https://github.com/rapidsai/cudf/pull/9796)) [@jzamoa](https://github.com/jzamoa) +- Fix oveflow fo min calculation in stings::fom_timestamps ([#9793](https://github.com/rapidsai/cudf/pull/9793)) [@evans2](https://github.com/evans2) +- Fix memoy eo due to lambda etun type deduction limitation ([#9778](https://github.com/rapidsai/cudf/pull/9778)) [@kathikeyann](https://github.com/kathikeyann) +- Revet egex $/EOL end-of-sting new-line special case handling ([#9774](https://github.com/rapidsai/cudf/pull/9774)) [@davidwendt](https://github.com/davidwendt) +- Fix missing steams ([#9767](https://github.com/rapidsai/cudf/pull/9767)) [@kathikeyann](https://github.com/kathikeyann) +- Fix make_empty_scala_like on list_type 
([#9759](https://github.com/rapidsai/cudf/pull/9759)) [@spelingxx](https://github.com/spelingxx) +- Update cmake and conda to 22.02 ([#9746](https://github.com/rapidsai/cudf/pull/9746)) [@devavet](https://github.com/devavet) +- Fix out-of-bounds memoy wite in decimal128-to-sting convesion ([#9740](https://github.com/rapidsai/cudf/pull/9740)) [@davidwendt](https://github.com/davidwendt) +- Match pandas scala esult types in eductions ([#9717](https://github.com/rapidsai/cudf/pull/9717)) [@bandon-b-mille](https://github.com/bandon-b-mille) +- Fix egex non-multiline EOL/$ matching stings ending with a new-line ([#9715](https://github.com/rapidsai/cudf/pull/9715)) [@davidwendt](https://github.com/davidwendt) +- Fixed build by adding moe checks fo int8, int16 ([#9707](https://github.com/rapidsai/cudf/pull/9707)) [@azajafi](https://github.com/azajafi) +- Fix `null` handling when `boolean` dtype is passed ([#9691](https://github.com/rapidsai/cudf/pull/9691)) [@galipemsaga](https://github.com/galipemsaga) +- Fix steam usage in `segmented_gathe()` ([#9679](https://github.com/rapidsai/cudf/pull/9679)) [@mythocks](https://github.com/mythocks) + +## 📖 Documentation + +- Update `decimal` dtypes elated docs enties ([#10072](https://github.com/rapidsai/cudf/pull/10072)) [@galipemsaga](https://github.com/galipemsaga) +- Fix egex doc descibing hexadecimal escape chaactes ([#10009](https://github.com/rapidsai/cudf/pull/10009)) [@davidwendt](https://github.com/davidwendt) +- Fix cudf compilation instuctions. 
([#9956](https://github.com/rapidsai/cudf/pull/9956)) [@esoha-nvidia](https://github.com/esoha-nvidia) +- Fix see also links fo IO APIs ([#9895](https://github.com/rapidsai/cudf/pull/9895)) [@galipemsaga](https://github.com/galipemsaga) +- Fix build instuctions fo libcudf doxygen ([#9837](https://github.com/rapidsai/cudf/pull/9837)) [@davidwendt](https://github.com/davidwendt) +- Fix some doxygen wanings and add missing documentation ([#9770](https://github.com/rapidsai/cudf/pull/9770)) [@kathikeyann](https://github.com/kathikeyann) +- update cuda vesion in local build ([#9736](https://github.com/rapidsai/cudf/pull/9736)) [@kathikeyann](https://github.com/kathikeyann) +- Fix doxygen fo enum types in libcudf ([#9724](https://github.com/rapidsai/cudf/pull/9724)) [@davidwendt](https://github.com/davidwendt) +- Spell check fixes ([#9682](https://github.com/rapidsai/cudf/pull/9682)) [@kathikeyann](https://github.com/kathikeyann) +- Fix links in C++ Develope Guide. 
([#9675](https://github.com/rapidsai/cudf/pull/9675)) [@bdice](https://github.com/bdice) + +## 🚀 New Featues + +- Remove libcudacxx patch needed fo nvcc 11.4 ([#10057](https://github.com/rapidsai/cudf/pull/10057)) [@obetmaynad](https://github.com/obetmaynad) +- Allow CuPy 10 ([#10048](https://github.com/rapidsai/cudf/pull/10048)) [@jakikham](https://github.com/jakikham) +- Add in suppot fo NULL_LOGICAL_AND and NULL_LOGICAL_OR binops ([#10016](https://github.com/rapidsai/cudf/pull/10016)) [@evans2](https://github.com/evans2) +- Add `goupby.tansfom` (only suppot fo aggegations) ([#10005](https://github.com/rapidsai/cudf/pull/10005)) [@shwina](https://github.com/shwina) +- Add patitioning suppot to Paquet chunked wite ([#10000](https://github.com/rapidsai/cudf/pull/10000)) [@devavet](https://github.com/devavet) +- Add jni fo sequences ([#9972](https://github.com/rapidsai/cudf/pull/9972)) [@wbo4958](https://github.com/wbo4958) +- Java bindings fo mixed left, inne, and full joins ([#9941](https://github.com/rapidsai/cudf/pull/9941)) [@jlowe](https://github.com/jlowe) +- Java bindings fo JSON eade suppot ([#9940](https://github.com/rapidsai/cudf/pull/9940)) [@wbo4958](https://github.com/wbo4958) +- Enable tanspose fo sting columns in cudf python ([#9937](https://github.com/rapidsai/cudf/pull/9937)) [@galipemsaga](https://github.com/galipemsaga) +- Suppot stucts fo `cudf::contains` with column/scala input ([#9929](https://github.com/rapidsai/cudf/pull/9929)) [@ttnghia](https://github.com/ttnghia) +- Implement mixed equality/conditional joins ([#9917](https://github.com/rapidsai/cudf/pull/9917)) [@vyas](https://github.com/vyas) +- Add cudf::stings::extact_all API ([#9909](https://github.com/rapidsai/cudf/pull/9909)) [@davidwendt](https://github.com/davidwendt) +- 
Implement JNI fo `cudf::scatte` APIs ([#9903](https://github.com/rapidsai/cudf/pull/9903)) [@ttnghia](https://github.com/ttnghia) +- JNI: Function to copy and set validity fom bool column. ([#9901](https://github.com/rapidsai/cudf/pull/9901)) [@mythocks](https://github.com/mythocks) +- Add dictionay suppot to cudf::copy_if_else ([#9887](https://github.com/rapidsai/cudf/pull/9887)) [@davidwendt](https://github.com/davidwendt) +- add un_benchmaks taget fo unning benchmaks with json output ([#9879](https://github.com/rapidsai/cudf/pull/9879)) [@kathikeyann](https://github.com/kathikeyann) +- Add egex_flags paamete to stings eplace_e functions ([#9878](https://github.com/rapidsai/cudf/pull/9878)) [@davidwendt](https://github.com/davidwendt) +- Add_suffix and add_pefix fo DataFames and Seies ([#9846](https://github.com/rapidsai/cudf/pull/9846)) [@mayankanand007](https://github.com/mayankanand007) +- Add JNI fo `cudf::dop_duplicates` ([#9841](https://github.com/rapidsai/cudf/pull/9841)) [@ttnghia](https://github.com/ttnghia) +- Implement pe-list sequence ([#9839](https://github.com/rapidsai/cudf/pull/9839)) [@ttnghia](https://github.com/ttnghia) +- adding `seies.tanspose` ([#9835](https://github.com/rapidsai/cudf/pull/9835)) [@mayankanand007](https://github.com/mayankanand007) +- Adding suppot fo `Seies.autoco` ([#9833](https://github.com/rapidsai/cudf/pull/9833)) [@mayankanand007](https://github.com/mayankanand007) +- Suppot ound opeation on datetime64 datatypes ([#9820](https://github.com/rapidsai/cudf/pull/9820)) [@mayankanand007](https://github.com/mayankanand007) +- Add patitioning suppot in paquet wite ([#9810](https://github.com/rapidsai/cudf/pull/9810)) [@devavet](https://github.com/devavet) +- Raise tempoay eo fo `decimal128` types in paquet eade 
([#9804](https://github.com/rapidsai/cudf/pull/9804)) [@galipemsaga](https://github.com/galipemsaga) +- Add decimal128 suppot to Paquet eade and wite ([#9765](https://github.com/rapidsai/cudf/pull/9765)) [@vuule](https://github.com/vuule) +- Optimize `goupby::scan` ([#9754](https://github.com/rapidsai/cudf/pull/9754)) [@PointKenel](https://github.com/PointKenel) +- Add sample JNI API ([#9728](https://github.com/rapidsai/cudf/pull/9728)) [@es-life](https://github.com/es-life) +- Suppot `min` and `max` in inclusive scan fo stucts ([#9725](https://github.com/rapidsai/cudf/pull/9725)) [@ttnghia](https://github.com/ttnghia) +- Add `fist` and `last` method to `IndexedFame` ([#9710](https://github.com/rapidsai/cudf/pull/9710)) [@isVoid](https://github.com/isVoid) +- Suppot `min` and `max` eduction fo stucts ([#9697](https://github.com/rapidsai/cudf/pull/9697)) [@ttnghia](https://github.com/ttnghia) +- Add paametes to contol ow goup size in Paquet wite ([#9677](https://github.com/rapidsai/cudf/pull/9677)) [@vuule](https://github.com/vuule) +- Run compute-sanitize in nightly build ([#9641](https://github.com/rapidsai/cudf/pull/9641)) [@kathikeyann](https://github.com/kathikeyann) +- Implement Seies.datetime.floo ([#9571](https://github.com/rapidsai/cudf/pull/9571)) [@skiui-souce](https://github.com/skiui-souce) +- ceil/floo fo `DatetimeIndex` ([#9554](https://github.com/rapidsai/cudf/pull/9554)) [@mayankanand007](https://github.com/mayankanand007) +- Add suppot fo `decimal128` in cudf python ([#9533](https://github.com/rapidsai/cudf/pull/9533)) [@galipemsaga](https://github.com/galipemsaga) +- Implement `lists::index_of()` to find positions in list ows ([#9510](https://github.com/rapidsai/cudf/pull/9510)) [@mythocks](https://github.com/mythocks) +- custeamz oauth 
callback fo kafka (libdkafka) ([#9486](https://github.com/rapidsai/cudf/pull/9486)) [@jdye64](https://github.com/jdye64) +- Add Peason coelation fo sot goupby (python) ([#9166](https://github.com/rapidsai/cudf/pull/9166)) [@skiui-souce](https://github.com/skiui-souce) +- Intechange datafame potocol ([#9071](https://github.com/rapidsai/cudf/pull/9071)) [@iskode](https://github.com/iskode) +- Rewiting ow/column convesions fo Spak <-> cudf data convesions ([#8444](https://github.com/rapidsai/cudf/pull/8444)) [@hypebolic2346](https://github.com/hypebolic2346) + +## 🛠️ Impovements + +- Pepae upload scipts fo Python 3.7 emoval ([#10092](https://github.com/rapidsai/cudf/pull/10092)) [@Ethyling](https://github.com/Ethyling) +- Simplify custeamz and cudf_kafka ecipes files ([#10065](https://github.com/rapidsai/cudf/pull/10065)) [@Ethyling](https://github.com/Ethyling) +- ORC wite API changes fo ganula statistics ([#10058](https://github.com/rapidsai/cudf/pull/10058)) [@mythocks](https://github.com/mythocks) +- Remove python constaints in cuteamz and cudf_kafka ecipes ([#10052](https://github.com/rapidsai/cudf/pull/10052)) [@Ethyling](https://github.com/Ethyling) +- Unpin `dask` and `distibuted` in CI ([#10028](https://github.com/rapidsai/cudf/pull/10028)) [@galipemsaga](https://github.com/galipemsaga) +- Add `_fom_column_like_self` factoy ([#10022](https://github.com/rapidsai/cudf/pull/10022)) [@isVoid](https://github.com/isVoid) +- Replace custom CUDA bindings peviously povided by RMM with official CUDA Python bindings ([#10008](https://github.com/rapidsai/cudf/pull/10008)) [@shwina](https://github.com/shwina) +- Use `cuda::std::is_aithmetic` in `cudf::is_numeic` tait. 
([#9996](https://github.com/rapidsai/cudf/pull/9996)) [@bdice](https://github.com/bdice) +- Clean up CUDA steam use in cuIO ([#9991](https://github.com/rapidsai/cudf/pull/9991)) [@vuule](https://github.com/vuule) +- Use addessed-odeed fist fit fo the pinned memoy pool ([#9989](https://github.com/rapidsai/cudf/pull/9989)) [@ongou](https://github.com/ongou) +- Add stings tests to tanspose_test.cpp ([#9985](https://github.com/rapidsai/cudf/pull/9985)) [@davidwendt](https://github.com/davidwendt) +- Use gpuci_mamba_ety on Java CI. ([#9983](https://github.com/rapidsai/cudf/pull/9983)) [@bdice](https://github.com/bdice) +- Remove depecated method `one_hot_encoding` ([#9977](https://github.com/rapidsai/cudf/pull/9977)) [@isVoid](https://github.com/isVoid) +- Mino cleanup of unused Python functions ([#9974](https://github.com/rapidsai/cudf/pull/9974)) [@vyas](https://github.com/vyas) +- Use new efficient patitioned paquet witing in cuDF ([#9971](https://github.com/rapidsai/cudf/pull/9971)) [@devavet](https://github.com/devavet) +- Remove st.subwod_tokenize ([#9968](https://github.com/rapidsai/cudf/pull/9968)) [@VibhuJawa](https://github.com/VibhuJawa) +- Fowad-mege banch-21.12 to banch-22.02 ([#9947](https://github.com/rapidsai/cudf/pull/9947)) [@bdice](https://github.com/bdice) +- Remove depecated `method` paamete fom `mege` and `join`. ([#9944](https://github.com/rapidsai/cudf/pull/9944)) [@bdice](https://github.com/bdice) +- Remove depecated method DataFame.hash_columns. ([#9943](https://github.com/rapidsai/cudf/pull/9943)) [@bdice](https://github.com/bdice) +- Remove depecated method Seies.hash_encode. 
([#9942](https://github.com/rapidsai/cudf/pull/9942)) [@bdice](https://github.com/bdice) +- use ninja in java ci build ([#9933](https://github.com/rapidsai/cudf/pull/9933)) [@ongou](https://github.com/ongou) +- Add build-time publish step to cpu build scipt ([#9927](https://github.com/rapidsai/cudf/pull/9927)) [@davidwendt](https://github.com/davidwendt) +- Refactoing ceil/ound/floo code fo datetime64 types ([#9926](https://github.com/rapidsai/cudf/pull/9926)) [@mayankanand007](https://github.com/mayankanand007) +- Remove vaious unused functions ([#9922](https://github.com/rapidsai/cudf/pull/9922)) [@vyas](https://github.com/vyas) +- Raise in `quey` if dtype is not suppoted ([#9921](https://github.com/rapidsai/cudf/pull/9921)) [@bandon-b-mille](https://github.com/bandon-b-mille) +- Add missing impots tests ([#9920](https://github.com/rapidsai/cudf/pull/9920)) [@Ethyling](https://github.com/Ethyling) +- Spak Decimal128 hashing ([#9919](https://github.com/rapidsai/cudf/pull/9919)) [@wlee](https://github.com/wlee) +- Replace `thust/std::get` with stuctued bindings ([#9915](https://github.com/rapidsai/cudf/pull/9915)) [@codeepot](https://github.com/codeepot) +- Upgade thust vesion to 1.15 ([#9912](https://github.com/rapidsai/cudf/pull/9912)) [@obetmaynad](https://github.com/obetmaynad) +- Remove conda envs fo CUDA 11.0 and 11.2. ([#9910](https://github.com/rapidsai/cudf/pull/9910)) [@bdice](https://github.com/bdice) +- Retun count of set bits fom inplace_bitmask_and. 
([#9904](https://github.com/rapidsai/cudf/pull/9904)) [@bdice](https://github.com/bdice) +- Use dynamic nullate fo join hashe and equality compaato ([#9902](https://github.com/rapidsai/cudf/pull/9902)) [@davidwendt](https://github.com/davidwendt) +- Update ucx-py vesion on elease using vc ([#9897](https://github.com/rapidsai/cudf/pull/9897)) [@Ethyling](https://github.com/Ethyling) +- Remove `IncludeCategoies` fom `.clang-fomat` ([#9876](https://github.com/rapidsai/cudf/pull/9876)) [@codeepot](https://github.com/codeepot) +- Suppot statically linking CUDA untime fo Java bindings ([#9873](https://github.com/rapidsai/cudf/pull/9873)) [@jlowe](https://github.com/jlowe) +- Add `clang-tidy` to libcudf ([#9860](https://github.com/rapidsai/cudf/pull/9860)) [@codeepot](https://github.com/codeepot) +- Remove depecated methods fom Java Table class ([#9853](https://github.com/rapidsai/cudf/pull/9853)) [@jlowe](https://github.com/jlowe) +- Add test fo map column metadata handling in ORC wite ([#9852](https://github.com/rapidsai/cudf/pull/9852)) [@vuule](https://github.com/vuule) +- Use pandas `to_offset` to pase fequency sting in `date_ange` ([#9843](https://github.com/rapidsai/cudf/pull/9843)) [@isVoid](https://github.com/isVoid) +- add templated benchmak with fixtue ([#9838](https://github.com/rapidsai/cudf/pull/9838)) [@kathikeyann](https://github.com/kathikeyann) +- Use list of column inputs fo `apply_boolean_mask` ([#9832](https://github.com/rapidsai/cudf/pull/9832)) [@isVoid](https://github.com/isVoid) +- Added a few moe tests fo Decimal to Sting cast ([#9818](https://github.com/rapidsai/cudf/pull/9818)) [@azajafi](https://github.com/azajafi) +- Run doctests. 
([#9815](https://github.com/rapidsai/cudf/pull/9815)) [@bdice](https://github.com/bdice) +- Avoid oveflow fo fixed_point ound ([#9809](https://github.com/rapidsai/cudf/pull/9809)) [@spelingxx](https://github.com/spelingxx) +- Move `dop_duplicates`, `dop_na`, `_gathe`, `take` to IndexFame and ceate thei `_base_index` countepats ([#9807](https://github.com/rapidsai/cudf/pull/9807)) [@isVoid](https://github.com/isVoid) +- Use vecto factoies fo host-device copies. ([#9806](https://github.com/rapidsai/cudf/pull/9806)) [@bdice](https://github.com/bdice) +- Refacto host device macos ([#9797](https://github.com/rapidsai/cudf/pull/9797)) [@vyas](https://github.com/vyas) +- Remove unused masked udf cython/c++ code ([#9792](https://github.com/rapidsai/cudf/pull/9792)) [@bandon-b-mille](https://github.com/bandon-b-mille) +- Allow custom sot functions fo dask-cudf `sot_values` ([#9789](https://github.com/rapidsai/cudf/pull/9789)) [@chalesbluca](https://github.com/chalesbluca) +- Impove build time of libcudf iteato tests ([#9788](https://github.com/rapidsai/cudf/pull/9788)) [@davidwendt](https://github.com/davidwendt) +- Copy Java native dependencies diectly into classpath ([#9787](https://github.com/rapidsai/cudf/pull/9787)) [@jlowe](https://github.com/jlowe) +- Add decimal types to cuIO benchmaks ([#9776](https://github.com/rapidsai/cudf/pull/9776)) [@vuule](https://github.com/vuule) +- Pick smallest decimal type with equied pecision in ORC eade ([#9775](https://github.com/rapidsai/cudf/pull/9775)) [@vuule](https://github.com/vuule) +- Avoid oveflow fo `fixed_point` `cudf::cast` and pefomance optimization ([#9772](https://github.com/rapidsai/cudf/pull/9772)) [@codeepot](https://github.com/codeepot) +- Use CTAD with Thust function objects 
([#9768](https://github.com/rapidsai/cudf/pull/9768)) [@codereport](https://github.com/codereport) +- Refactor TableTest assertion methods to a separate utility class ([#9762](https://github.com/rapidsai/cudf/pull/9762)) [@jlowe](https://github.com/jlowe) +- Use Java classloader to find test resources ([#9760](https://github.com/rapidsai/cudf/pull/9760)) [@jlowe](https://github.com/jlowe) +- Allow cast decimal128 to string and add tests ([#9756](https://github.com/rapidsai/cudf/pull/9756)) [@razajafi](https://github.com/razajafi) +- Load balance optimization for contiguous_split ([#9755](https://github.com/rapidsai/cudf/pull/9755)) [@nvdbaranec](https://github.com/nvdbaranec) +- Consolidate and improve `reset_index` ([#9750](https://github.com/rapidsai/cudf/pull/9750)) [@isVoid](https://github.com/isVoid) +- Update to UCX-Py 0.24 ([#9748](https://github.com/rapidsai/cudf/pull/9748)) [@pentschev](https://github.com/pentschev) +- Skip cufile tests in JNI build script ([#9744](https://github.com/rapidsai/cudf/pull/9744)) [@pxLi](https://github.com/pxLi) +- Enable string to decimal 128 cast ([#9742](https://github.com/rapidsai/cudf/pull/9742)) [@razajafi](https://github.com/razajafi) +- Use stop instead of stop_.
([#9735](https://github.com/rapidsai/cudf/pull/9735)) [@bdice](https://github.com/bdice) +- Forward-merge branch-21.12 to branch-22.02 ([#9730](https://github.com/rapidsai/cudf/pull/9730)) [@bdice](https://github.com/bdice) +- Improve cmake format script ([#9723](https://github.com/rapidsai/cudf/pull/9723)) [@vyasr](https://github.com/vyasr) +- Use cuFile direct device reads/writes by default in cuIO ([#9722](https://github.com/rapidsai/cudf/pull/9722)) [@vuule](https://github.com/vuule) +- Add directory-partitioned data support to cudf.read_parquet ([#9720](https://github.com/rapidsai/cudf/pull/9720)) [@rjzamora](https://github.com/rjzamora) +- Use stream allocator adaptor for hash join table ([#9704](https://github.com/rapidsai/cudf/pull/9704)) [@PointKernel](https://github.com/PointKernel) +- Update check for inf/nan strings in libcudf float conversion to ignore case ([#9694](https://github.com/rapidsai/cudf/pull/9694)) [@davidwendt](https://github.com/davidwendt) +- Update cudf JNI to 22.02.0-SNAPSHOT ([#9681](https://github.com/rapidsai/cudf/pull/9681)) [@pxLi](https://github.com/pxLi) +- Replace cudf's concurrent_ordered_map with cuco::static_map in semi/anti joins ([#9666](https://github.com/rapidsai/cudf/pull/9666)) [@vyasr](https://github.com/vyasr) +- Some improvements to `parse_decimal` function and bindings for `is_fixed_point` ([#9658](https://github.com/rapidsai/cudf/pull/9658)) [@razajafi](https://github.com/razajafi) +- Add utility to format ninja-log build times ([#9631](https://github.com/rapidsai/cudf/pull/9631)) [@davidwendt](https://github.com/davidwendt) +- Allow runtime has_nulls parameter for row operators ([#9623](https://github.com/rapidsai/cudf/pull/9623)) [@davidwendt](https://github.com/davidwendt) +- Use fsspec.parquet for improved read_parquet performance from remote storage
([#9589](https://github.com/rapidsai/cudf/pull/9589)) [@rjzamora](https://github.com/rjzamora) +- Refactor bit counting APIs, introduce valid/null count functions, and split host/device side code for segmented counts. ([#9588](https://github.com/rapidsai/cudf/pull/9588)) [@bdice](https://github.com/bdice) +- Use List of Columns as Input for `drop_nulls`, `gather` and `drop_duplicates` ([#9558](https://github.com/rapidsai/cudf/pull/9558)) [@isVoid](https://github.com/isVoid) +- Simplify merge internals and reduce overhead ([#9516](https://github.com/rapidsai/cudf/pull/9516)) [@vyasr](https://github.com/vyasr) +- Add `struct` generation support in datagenerator & fuzz tests ([#9180](https://github.com/rapidsai/cudf/pull/9180)) [@galipremsagar](https://github.com/galipremsagar) +- Simplify write_csv by removing unnecessary writer/impl classes ([#9089](https://github.com/rapidsai/cudf/pull/9089)) [@cwharris](https://github.com/cwharris) # cuDF 21.12.00 (9 Dec 2021) diff --git a/build.sh b/build.sh index c2eba134c35..8b3add1dddd 100755 --- a/build.sh +++ b/build.sh @@ -185,12 +185,9 @@ if buildAll || hasArg libcudf; then fi # get the current count before the compile starts - FILES_IN_CCACHE="" - if [[ "$BUILD_REPORT_INCL_CACHE_STATS" == "ON" && -x "$(command -v ccache)" ]]; then - FILES_IN_CCACHE=$(ccache -s | grep "files in cache") - echo "$FILES_IN_CCACHE" - # zero the ccache statistics - ccache -z + if [[ "$BUILD_REPORT_INCL_CACHE_STATS" == "ON" && -x "$(command -v sccache)" ]]; then + # zero the sccache statistics + sccache --zero-stats fi cmake -S $REPODIR/cpp -B ${LIB_BUILD_DIR} \ @@ -216,11 +213,12 @@ if buildAll || hasArg libcudf; then echo "Formatting build metrics" python ${REPODIR}/cpp/scripts/sort_ninja_log.py ${LIB_BUILD_DIR}/.ninja_log --fmt xml > ${LIB_BUILD_DIR}/ninja_log.xml MSG="<br/><br/>" - # get some ccache stats after the compile - if [[ "$BUILD_REPORT_INCL_CACHE_STATS"=="ON" && -x "$(command -v ccache)" ]]; then - MSG="${MSG}<br/>$FILES_IN_CCACHE" - HIT_RATE=$(ccache -s | grep "cache hit rate") - MSG="${MSG}<br/>${HIT_RATE}" + # get some sccache stats after the compile + if [[ "$BUILD_REPORT_INCL_CACHE_STATS" == "ON" && -x "$(command -v sccache)" ]]; then + COMPILE_REQUESTS=$(sccache -s | grep "Compile requests \+ [0-9]\+$" | awk '{ print $NF }') + CACHE_HITS=$(sccache -s | grep "Cache hits \+ [0-9]\+$" | awk '{ print $NF }') + HIT_RATE=$(echo - | awk "{printf \"%.2f\n\", $CACHE_HITS / $COMPILE_REQUESTS * 100}") + MSG="${MSG}<br/>cache hit rate ${HIT_RATE} %" fi MSG="${MSG}<br/>parallel setting: $PARALLEL_LEVEL" MSG="${MSG}<br/>parallel build time: $compile_total seconds" diff --git a/ci/cpu/build.sh b/ci/cpu/build.sh index 6f19f174da0..574a55d26b6 100755 --- a/ci/cpu/build.sh +++ b/ci/cpu/build.sh @@ -31,6 +31,10 @@ if [[ "$BUILD_MODE" = "branch" && "$SOURCE_BRANCH" = branch-* ]] ; then export VERSION_SUFFIX=`date +%y%m%d` fi +export CMAKE_CUDA_COMPILER_LAUNCHER="sccache" +export CMAKE_CXX_COMPILER_LAUNCHER="sccache" +export CMAKE_C_COMPILER_LAUNCHER="sccache" + ################################################################################ # SETUP - Check environment ################################################################################ @@ -77,6 +81,8 @@ if [ "$BUILD_LIBCUDF" == '1' ]; then gpuci_conda_retry build --no-build-id --croot ${CONDA_BLD_DIR} conda/recipes/libcudf $CONDA_BUILD_ARGS mkdir -p ${CONDA_BLD_DIR}/libcudf/work cp -r ${CONDA_BLD_DIR}/work/* ${CONDA_BLD_DIR}/libcudf/work + gpuci_logger "sccache stats" + sccache --show-stats # Copy libcudf build metrics results LIBCUDF_BUILD_DIR=$CONDA_BLD_DIR/libcudf/work/cpp/build diff --git a/ci/gpu/build.sh b/ci/gpu/build.sh index d5fb7451769..6a5c28faeff 100755 --- a/ci/gpu/build.sh +++ b/ci/gpu/build.sh @@ -36,6 +36,10 @@ export DASK_DISTRIBUTED_GIT_TAG='2022.01.0' # ucx-py version export UCX_PY_VERSION='0.25.*' +export CMAKE_CUDA_COMPILER_LAUNCHER="sccache" +export CMAKE_CXX_COMPILER_LAUNCHER="sccache" +export CMAKE_C_COMPILER_LAUNCHER="sccache" + ################################################################################ # TRAP - Setup trap for removing jitify cache ################################################################################ diff --git a/ci/utils/nbtestlog2junitxml.py b/ci/utils/nbtestlog2junitxml.py index 15b362e4b70..6a421279112 100644 --- a/ci/utils/nbtestlog2junitxml.py +++ b/ci/utils/nbtestlog2junitxml.py @@ -7,11 +7,11 @@ from enum import Enum -startingPatt = re.compile("^STARTING: ([\w\.\-]+)$") -skippingPatt = re.compile("^SKIPPING: ([\w\.\-]+)\s*(\(([\w\.\-\ \,]+)\))?\s*$")
-exitCodePatt = re.compile("^EXIT CODE: (\d+)$") -folderPatt = re.compile("^FOLDER: ([\w\.\-]+)$") -timePatt = re.compile("^real\s+([\d\.ms]+)$") +startingPatt = re.compile(r"^STARTING: ([\w\.\-]+)$") +skippingPatt = re.compile(r"^SKIPPING: ([\w\.\-]+)\s*(\(([\w\.\-\ \,]+)\))?\s*$") +exitCodePatt = re.compile(r"^EXIT CODE: (\d+)$") +folderPatt = re.compile(r"^FOLDER: ([\w\.\-]+)$") +timePatt = re.compile(r"^real\s+([\d\.ms]+)$") linePatt = re.compile("^" + ("-" * 80) + "$") diff --git a/conda/recipes/libcudf/meta.yaml b/conda/recipes/libcudf/meta.yaml index 2cbe5173de0..70c020d4abd 100644 --- a/conda/recipes/libcudf/meta.yaml +++ b/conda/recipes/libcudf/meta.yaml @@ -22,13 +22,15 @@ build: - PARALLEL_LEVEL - VERSION_SUFFIX - PROJECT_FLASH - - CCACHE_DIR - - CCACHE_NOHASHDIR - - CCACHE_COMPILERCHECK - CMAKE_GENERATOR - CMAKE_C_COMPILER_LAUNCHER - CMAKE_CXX_COMPILER_LAUNCHER - CMAKE_CUDA_COMPILER_LAUNCHER + - SCCACHE_S3_KEY_PREFIX=libcudf-aarch64 # [aarch64] + - SCCACHE_S3_KEY_PREFIX=libcudf-linux64 # [linux64] + - SCCACHE_BUCKET=rapids-sccache + - SCCACHE_REGION=us-west-2 + - SCCACHE_IDLE_TIMEOUT=32768 run_exports: - {{ pin_subpackage("libcudf", max_pin="x.x") }} diff --git a/cpp/include/cudf/binaryop.hpp b/cpp/include/cudf/binaryop.hpp index daf55c0befe..177fd904b0b 100644 --- a/cpp/include/cudf/binaryop.hpp +++ b/cpp/include/cudf/binaryop.hpp @@ -45,7 +45,7 @@ enum class binary_operator : int32_t { PMOD, ///< positive modulo operator ///< If remainder is negative, this returns (remainder + divisor) % divisor ///< else, it returns (dividend % divisor) - PYMOD, ///< operator % but following python's sign rules for negatives + PYMOD, ///< operator % but following Python's sign rules for negatives POW, ///< lhs ^ rhs LOG_BASE, ///< logarithm to the base ATAN2, ///< 2-argument arctangent diff --git a/cpp/include/cudf/fixed_point/fixed_point.hpp b/cpp/include/cudf/fixed_point/fixed_point.hpp index a7112ae415d..f027e2783b1 100644 --- 
a/cpp/include/cudf/fixed_point/fixed_point.hpp +++ b/cpp/include/cudf/fixed_point/fixed_point.hpp @@ -1,5 +1,5 @@ /* - * Copyright (c) 2020-2021, NVIDIA CORPORATION. + * Copyright (c) 2020-2022, NVIDIA CORPORATION. * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License. @@ -440,6 +440,21 @@ class fixed_point { CUDF_HOST_DEVICE inline friend fixed_point<Rep1, Rad1> operator/( fixed_point<Rep1, Rad1> const& lhs, fixed_point<Rep1, Rad1> const& rhs); + /** + * @brief operator % (for computing the modulo operation of two `fixed_point` numbers) + * + * If `_scale`s are equal, the modulus is computed directly. + * If `_scale`s are not equal, the number with larger `_scale` is shifted to the + * smaller `_scale`, and then the modulus is computed. + * + * @tparam Rep1 Representation type of number being modulo-ed to `this` + * @tparam Rad1 Radix (base) type of number being modulo-ed to `this` + * @return The resulting `fixed_point` number + */ + template <typename Rep1, Radix Rad1> + CUDF_HOST_DEVICE inline friend fixed_point<Rep1, Rad1> operator%( + fixed_point<Rep1, Rad1> const& lhs, fixed_point<Rep1, Rad1> const& rhs); + /** * @brief operator == (for comparing two `fixed_point` numbers) * @@ -750,6 +765,16 @@ CUDF_HOST_DEVICE inline bool operator>(fixed_point<Rep1, Rad1> const& lhs, return lhs.rescaled(scale)._value > rhs.rescaled(scale)._value; } +// MODULO OPERATION +template <typename Rep1, Radix Rad1> +CUDF_HOST_DEVICE inline fixed_point<Rep1, Rad1> operator%(fixed_point<Rep1, Rad1> const& lhs, + fixed_point<Rep1, Rad1> const& rhs) +{ + auto const scale = std::min(lhs._scale, rhs._scale); + auto const remainder = lhs.rescaled(scale)._value % rhs.rescaled(scale)._value; + return fixed_point<Rep1, Rad1>{scaled_integer<Rep1>{remainder, scale}}; +} + using decimal32 = fixed_point<int32_t, Radix::BASE_10>; using decimal64 = fixed_point<int64_t, Radix::BASE_10>; using decimal128 = fixed_point<__int128_t, Radix::BASE_10>; diff --git a/cpp/scripts/run-clang-format.py b/cpp/scripts/run-clang-format.py index a7c83da22c5..3d462d65fb8 100755 --- a/cpp/scripts/run-clang-format.py +++ b/cpp/scripts/run-clang-format.py @@ -13,7 +13,6 @@ #
limitations under the License. # -from __future__ import print_function import argparse import os @@ -124,9 +123,9 @@ def run_clang_format(src, dst, exe, verbose, inplace): os.makedirs(dstdir) # run the clang format command itself if src == dst: - cmd = "%s -i %s" % (exe, src) + cmd = f"{exe} -i {src}" else: - cmd = "%s %s > %s" % (exe, src, dst) + cmd = f"{exe} {src} > {dst}" try: subprocess.check_call(cmd, shell=True) except subprocess.CalledProcessError: @@ -134,9 +133,9 @@ def run_clang_format(src, dst, exe, verbose, inplace): raise # run the diff to check if there are any formatting issues if inplace: - cmd = "diff -q %s %s >/dev/null" % (src, dst) + cmd = f"diff -q {src} {dst} >/dev/null" else: - cmd = "diff %s %s" % (src, dst) + cmd = f"diff {src} {dst}" try: subprocess.check_call(cmd, shell=True) diff --git a/cpp/scripts/run-clang-tidy.py b/cpp/scripts/run-clang-tidy.py index 3a1a663e231..30e937d7f4d 100644 --- a/cpp/scripts/run-clang-tidy.py +++ b/cpp/scripts/run-clang-tidy.py @@ -13,7 +13,6 @@ # limitations under the License. 
# -from __future__ import print_function import re import os import subprocess @@ -67,7 +66,7 @@ def parse_args(): def get_all_commands(cdb): - with open(cdb, "r") as fp: + with open(cdb) as fp: return json.load(fp) @@ -195,10 +194,10 @@ def collect_result(result): def print_result(passed, stdout, file): status_str = "PASSED" if passed else "FAILED" - print("%s File:%s %s %s" % (SEPARATOR, file, status_str, SEPARATOR)) + print(f"{SEPARATOR} File:{file} {status_str} {SEPARATOR}") if stdout: print(stdout) - print("%s File:%s ENDS %s" % (SEPARATOR, file, SEPARATOR)) + print(f"{SEPARATOR} File:{file} ENDS {SEPARATOR}") def print_results(): diff --git a/cpp/scripts/sort_ninja_log.py b/cpp/scripts/sort_ninja_log.py index 33c369b254f..85eb800879a 100755 --- a/cpp/scripts/sort_ninja_log.py +++ b/cpp/scripts/sort_ninja_log.py @@ -33,7 +33,7 @@ # build a map of the log entries entries = {} -with open(log_file, "r") as log: +with open(log_file) as log: last = 0 files = {} for line in log: diff --git a/cpp/src/binaryop/binaryop.cpp b/cpp/src/binaryop/binaryop.cpp index 5f9ff2574e3..dfa7896c37a 100644 --- a/cpp/src/binaryop/binaryop.cpp +++ b/cpp/src/binaryop/binaryop.cpp @@ -88,7 +88,10 @@ bool is_basic_arithmetic_binop(binary_operator op) op == binary_operator::MUL or // operator * op == binary_operator::DIV or // operator / using common type of lhs and rhs op == binary_operator::NULL_MIN or // 2 null = null, 1 null = value, else min - op == binary_operator::NULL_MAX; // 2 null = null, 1 null = value, else max + op == binary_operator::NULL_MAX or // 2 null = null, 1 null = value, else max + op == binary_operator::MOD or // operator % + op == binary_operator::PMOD or // positive modulo operator + op == binary_operator::PYMOD; // operator % but following Python's negative sign rules } /** diff --git a/cpp/src/binaryop/compiled/operation.cuh b/cpp/src/binaryop/compiled/operation.cuh index 4b5f78dc400..de9d46b6280 100644 --- a/cpp/src/binaryop/compiled/operation.cuh +++ 
b/cpp/src/binaryop/compiled/operation.cuh @@ -162,12 +162,24 @@ struct PMod { if (rem < 0) rem = std::fmod(rem + yconv, yconv); return rem; } + + template <typename TypeLhs, + typename TypeRhs, + std::enable_if_t<(cudf::is_fixed_point<TypeLhs>() and + std::is_same_v<TypeLhs, TypeRhs>)>* = nullptr> + __device__ inline auto operator()(TypeLhs x, TypeRhs y) + { + auto const remainder = x % y; + return remainder.value() < 0 ? (remainder + y) % y : remainder; + } }; struct PyMod { template <typename TypeLhs, typename TypeRhs, - std::enable_if_t<(std::is_integral_v<std::common_type_t<TypeLhs, TypeRhs>>)>* = nullptr> + std::enable_if_t<(std::is_integral_v<std::common_type_t<TypeLhs, TypeRhs>> or + (cudf::is_fixed_point<TypeLhs>() and + std::is_same_v<TypeLhs, TypeRhs>))>* = nullptr> __device__ inline auto operator()(TypeLhs x, TypeRhs y) -> decltype(((x % y) + y) % y) { return ((x % y) + y) % y; diff --git a/cpp/src/binaryop/compiled/util.cpp b/cpp/src/binaryop/compiled/util.cpp index 9481c236142..d8f1eb03a16 100644 --- a/cpp/src/binaryop/compiled/util.cpp +++ b/cpp/src/binaryop/compiled/util.cpp @@ -45,7 +45,11 @@ struct common_type_functor { // Eg. d=t-t return data_type{type_to_id()}; } - return {}; + + // A compiler bug may cause a compilation error when using empty initializer list to construct + // an std::optional object containing no `data_type` value. Therefore, we should explicitly + // return `std::nullopt` instead. + return std::nullopt; } }; template diff --git a/cpp/tests/binaryop/binop-compiled-fixed_point-test.cpp b/cpp/tests/binaryop/binop-compiled-fixed_point-test.cpp index 29905171907..335de93c976 100644 --- a/cpp/tests/binaryop/binop-compiled-fixed_point-test.cpp +++ b/cpp/tests/binaryop/binop-compiled-fixed_point-test.cpp @@ -1,5 +1,5 @@ /* - * Copyright (c) 2021, NVIDIA CORPORATION. + * Copyright (c) 2021-2022, NVIDIA CORPORATION. * * Licensed under the Apache License, Version 2.0 (the "License"); * you may not use this file except in compliance with the License.
@@ -33,14 +33,14 @@ namespace cudf::test::binop { template -struct FixedPointCompiledTestBothReps : public cudf::test::BaseFixture { +struct FixedPointCompiledTest : public cudf::test::BaseFixture { }; template using wrapper = cudf::test::fixed_width_column_wrapper; -TYPED_TEST_SUITE(FixedPointCompiledTestBothReps, cudf::test::FixedPointTypes); +TYPED_TEST_SUITE(FixedPointCompiledTest, cudf::test::FixedPointTypes); -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpAdd) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpAdd) { using namespace numeric; using decimalXX = TypeParam; @@ -73,7 +73,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpAdd) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected_col, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpMultiply) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpMultiply) { using namespace numeric; using decimalXX = TypeParam; @@ -109,7 +109,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpMultiply) template using fp_wrapper = cudf::test::fixed_point_column_wrapper; -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpMultiply2) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpMultiply2) { using namespace numeric; using decimalXX = TypeParam; @@ -128,7 +128,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpMultiply2) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpDiv) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpDiv) { using namespace numeric; using decimalXX = TypeParam; @@ -147,7 +147,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpDiv) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpDiv2) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpDiv2) { using namespace numeric; using decimalXX = TypeParam; @@ -166,7 +166,7 @@ 
TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpDiv2) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpDiv3) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpDiv3) { using namespace numeric; using decimalXX = TypeParam; @@ -183,7 +183,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpDiv3) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpDiv4) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpDiv4) { using namespace numeric; using decimalXX = TypeParam; @@ -203,7 +203,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpDiv4) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpAdd2) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpAdd2) { using namespace numeric; using decimalXX = TypeParam; @@ -222,7 +222,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpAdd2) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpAdd3) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpAdd3) { using namespace numeric; using decimalXX = TypeParam; @@ -241,7 +241,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpAdd3) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpAdd4) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpAdd4) { using namespace numeric; using decimalXX = TypeParam; @@ -258,7 +258,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpAdd4) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpAdd5) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpAdd5) { using namespace numeric; using decimalXX = TypeParam; @@ -275,7 +275,7 @@ 
TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpAdd5) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpAdd6) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpAdd6) { using namespace numeric; using decimalXX = TypeParam; @@ -294,7 +294,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpAdd6) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected1, result1->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointCast) +TYPED_TEST(FixedPointCompiledTest, FixedPointCast) { using namespace numeric; using decimalXX = TypeParam; @@ -308,7 +308,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointCast) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpMultiplyScalar) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpMultiplyScalar) { using namespace numeric; using decimalXX = TypeParam; @@ -325,7 +325,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpMultiplyScalar) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpSimplePlus) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpSimplePlus) { using namespace numeric; using decimalXX = TypeParam; @@ -344,7 +344,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpSimplePlus) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpEqualSimple) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpEqualSimple) { using namespace numeric; using decimalXX = TypeParam; @@ -361,7 +361,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpEqualSimple) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpEqualSimpleScale0) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpEqualSimpleScale0) { using 
namespace numeric; using decimalXX = TypeParam; @@ -377,7 +377,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpEqualSimpleScale0) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpEqualSimpleScale0Null) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpEqualSimpleScale0Null) { using namespace numeric; using decimalXX = TypeParam; @@ -393,7 +393,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpEqualSimpleScale0Nu CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpEqualSimpleScale2Null) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpEqualSimpleScale2Null) { using namespace numeric; using decimalXX = TypeParam; @@ -409,7 +409,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpEqualSimpleScale2Nu CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpEqualLessGreater) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpEqualLessGreater) { using namespace numeric; using decimalXX = TypeParam; @@ -453,7 +453,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpEqualLessGreater) CUDF_TEST_EXPECT_COLUMNS_EQUAL(true_col, greater_result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpNullMaxSimple) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpNullMaxSimple) { using namespace numeric; using decimalXX = TypeParam; @@ -473,7 +473,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpNullMaxSimple) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpNullMinSimple) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpNullMinSimple) { using namespace numeric; using decimalXX = TypeParam; @@ -493,7 +493,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpNullMinSimple) 
CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpNullEqualsSimple) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpNullEqualsSimple) { using namespace numeric; using decimalXX = TypeParam; @@ -510,7 +510,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpNullEqualsSimple) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOp_Div) { using namespace numeric; using decimalXX = TypeParam; @@ -526,7 +526,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div2) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOp_Div2) { using namespace numeric; using decimalXX = TypeParam; @@ -542,7 +542,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div2) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div3) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOp_Div3) { using namespace numeric; using decimalXX = TypeParam; @@ -558,7 +558,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div3) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div4) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOp_Div4) { using namespace numeric; using decimalXX = TypeParam; @@ -574,7 +574,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div4) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div6) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOp_Div6) { using namespace numeric; using decimalXX = TypeParam; @@ -591,7 +591,7 @@ 
TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div6) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div7) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOp_Div7) { using namespace numeric; using decimalXX = TypeParam; @@ -608,7 +608,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div7) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div8) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOp_Div8) { using namespace numeric; using decimalXX = TypeParam; @@ -624,7 +624,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div8) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div9) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOp_Div9) { using namespace numeric; using decimalXX = TypeParam; @@ -640,7 +640,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div9) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div10) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOp_Div10) { using namespace numeric; using decimalXX = TypeParam; @@ -656,7 +656,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div10) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div11) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOp_Div11) { using namespace numeric; using decimalXX = TypeParam; @@ -672,7 +672,7 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOp_Div11) CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); } -TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpThrows) +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpThrows) { using namespace numeric; using decimalXX = TypeParam; @@ 
-684,6 +684,132 @@ TYPED_TEST(FixedPointCompiledTestBothReps, FixedPointBinaryOpThrows) cudf::logic_error); } +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpModSimple) +{ + using namespace numeric; + using decimalXX = TypeParam; + using RepType = device_storage_type_t<decimalXX>; + + auto const lhs = fp_wrapper<RepType>{{-33, -22, -11, 11, 22, 33, 44, 55}, scale_type{-1}}; + auto const rhs = fp_wrapper<RepType>{{10, 10, 10, 10, 10, 10, 10, 10}, scale_type{-1}}; + auto const expected = fp_wrapper<RepType>{{-3, -2, -1, 1, 2, 3, 4, 5}, scale_type{-1}}; + + auto const type = + cudf::binary_operation_fixed_point_output_type(cudf::binary_operator::MOD, + static_cast<cudf::column_view>(lhs).type(), + static_cast<cudf::column_view>(rhs).type()); + auto const result = cudf::binary_operation(lhs, rhs, cudf::binary_operator::MOD, type); + + CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); +} + +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpPModSimple) +{ + using namespace numeric; + using decimalXX = TypeParam; + using RepType = device_storage_type_t<decimalXX>; + + auto const lhs = fp_wrapper<RepType>{{-33, -22, -11, 11, 22, 33, 44, 55}, scale_type{-1}}; + auto const rhs = fp_wrapper<RepType>{{10, 10, 10, 10, 10, 10, 10, 10}, scale_type{-1}}; + auto const expected = fp_wrapper<RepType>{{7, 8, 9, 1, 2, 3, 4, 5}, scale_type{-1}}; + + for (auto const op : {cudf::binary_operator::PMOD, cudf::binary_operator::PYMOD}) { + auto const type = cudf::binary_operation_fixed_point_output_type( + op, static_cast<cudf::column_view>(lhs).type(), static_cast<cudf::column_view>(rhs).type()); + auto const result = cudf::binary_operation(lhs, rhs, op, type); + + CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); + } +} + +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpModSimple2) +{ + using namespace numeric; + using decimalXX = TypeParam; + using RepType = device_storage_type_t<decimalXX>; + + auto const lhs = fp_wrapper<RepType>{{-33, -22, -11, 11, 22, 33, 44, 55}, scale_type{-1}}; + auto const rhs = make_fixed_point_scalar<decimalXX>(10, scale_type{-1}); + auto const expected = fp_wrapper<RepType>{{-3, -2, -1, 1, 2, 3, 4, 5}, scale_type{-1}}; + 
+ auto const type = cudf::binary_operation_fixed_point_output_type( + cudf::binary_operator::MOD, static_cast<cudf::column_view>(lhs).type(), rhs->type()); + auto const result = cudf::binary_operation(lhs, *rhs, cudf::binary_operator::MOD, type); + + CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); +} + +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpPModAndPyModSimple2) +{ + using namespace numeric; + using decimalXX = TypeParam; + using RepType = device_storage_type_t<decimalXX>; + + auto const lhs = fp_wrapper<RepType>{{-33, -22, -11, 11, 22, 33, 44, 55}, scale_type{-1}}; + auto const rhs = make_fixed_point_scalar<decimalXX>(10, scale_type{-1}); + auto const expected = fp_wrapper<RepType>{{7, 8, 9, 1, 2, 3, 4, 5}, scale_type{-1}}; + + for (auto const op : {cudf::binary_operator::PMOD, cudf::binary_operator::PYMOD}) { + auto const type = cudf::binary_operation_fixed_point_output_type( + op, static_cast<cudf::column_view>(lhs).type(), rhs->type()); + auto const result = cudf::binary_operation(lhs, *rhs, op, type); + + CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view()); + } +} + +TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpMod) +{ + using namespace numeric; + using decimalXX = TypeParam; + using RepType = device_storage_type_t<decimalXX>; + auto constexpr N = 1000; + + for (auto scale : {-1, -2, -3}) { + auto const iota = thrust::make_counting_iterator(-500); + auto const lhs = fp_wrapper<RepType>{iota, iota + N, scale_type{-1}}; + auto const rhs = make_fixed_point_scalar<decimalXX>(7, scale_type{scale}); + + auto const factor = static_cast<RepType>(std::pow(10, -1 - scale)); + auto const f = [factor](auto i) { return (i * factor) % 7; }; + auto const exp_iter = cudf::detail::make_counting_transform_iterator(-500, f); + auto const expected = fp_wrapper<RepType>{exp_iter, exp_iter + N, scale_type{scale}}; + + auto const type = cudf::binary_operation_fixed_point_output_type( + cudf::binary_operator::MOD, static_cast<cudf::column_view>(lhs).type(), rhs->type()); + auto const result = cudf::binary_operation(lhs, *rhs, cudf::binary_operator::MOD, type); + 
+    CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view());
+  }
+}
+
+TYPED_TEST(FixedPointCompiledTest, FixedPointBinaryOpPModAndPyMod)
+{
+  using namespace numeric;
+  using decimalXX = TypeParam;
+  using RepType   = device_storage_type_t<decimalXX>;
+  auto constexpr N = 1000;
+
+  for (auto const scale : {-1, -2, -3}) {
+    auto const iota = thrust::make_counting_iterator(-500);
+    auto const lhs  = fp_wrapper<RepType>{iota, iota + N, scale_type{-1}};
+    auto const rhs  = make_fixed_point_scalar<decimalXX>(7, scale_type{scale});
+
+    auto const factor   = static_cast<RepType>(std::pow(10, -1 - scale));
+    auto const f        = [factor](auto i) { return (((i * factor) % 7) + 7) % 7; };
+    auto const exp_iter = cudf::detail::make_counting_transform_iterator(-500, f);
+    auto const expected = fp_wrapper<RepType>{exp_iter, exp_iter + N, scale_type{scale}};
+
+    for (auto const op : {cudf::binary_operator::PMOD, cudf::binary_operator::PYMOD}) {
+      auto const type = cudf::binary_operation_fixed_point_output_type(
+        op, static_cast<cudf::column_view>(lhs).type(), rhs->type());
+      auto const result = cudf::binary_operation(lhs, *rhs, op, type);
+
+      CUDF_TEST_EXPECT_COLUMNS_EQUAL(expected, result->view());
+    }
+  }
+}
+
 template <typename T>
 struct FixedPointTest_64_128_Reps : public cudf::test::BaseFixture {
 };
diff --git a/docs/cudf/source/conf.py b/docs/cudf/source/conf.py
index 3d6d3ceb399..60704f3e6ae 100644
--- a/docs/cudf/source/conf.py
+++ b/docs/cudf/source/conf.py
@@ -1,6 +1,4 @@
 #!/usr/bin/env python3
-# -*- coding: utf-8 -*-
-#
 # Copyright (c) 2018-2021, NVIDIA CORPORATION.
# # cudf documentation build configuration file, created by @@ -118,17 +116,6 @@ html_theme = "pydata_sphinx_theme" html_logo = "_static/RAPIDS-logo-purple.png" -# on_rtd is whether we are on readthedocs.org -on_rtd = os.environ.get("READTHEDOCS", None) == "True" - -if not on_rtd: - # only import and set the theme if we're building docs locally - # otherwise, readthedocs.org uses their theme by default, - # so no need to specify it - import pydata_sphinx_theme - - html_theme = "pydata_sphinx_theme" - html_theme_path = pydata_sphinx_theme.get_html_theme_path() # Theme options are theme-specific and customize the look and feel of a theme diff --git a/java/src/main/java/ai/rapids/cudf/Aggregation128Utils.java b/java/src/main/java/ai/rapids/cudf/Aggregation128Utils.java new file mode 100644 index 00000000000..9a0ac709e3e --- /dev/null +++ b/java/src/main/java/ai/rapids/cudf/Aggregation128Utils.java @@ -0,0 +1,67 @@ +/* + * Copyright (c) 2022, NVIDIA CORPORATION. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. + * You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. + */ + +package ai.rapids.cudf; + +/** + * Utility methods for breaking apart and reassembling 128-bit values during aggregations + * to enable hash-based aggregations and detect overflows. + */ +public class Aggregation128Utils { + static { + NativeDepsLoader.loadNativeDeps(); + } + + /** + * Extract a 32-bit chunk from a 128-bit value. 
+ * @param col column of 128-bit values (e.g.: DECIMAL128) + * @param outType integer type to use for the output column (e.g.: UINT32 or INT32) + * @param chunkIdx index of the 32-bit chunk to extract where 0 is the least significant chunk + * and 3 is the most significant chunk + * @return column containing the specified 32-bit chunk of the input column values. A null input + * row will result in a corresponding null output row. + */ + public static ColumnVector extractInt32Chunk(ColumnView col, DType outType, int chunkIdx) { + return new ColumnVector(extractInt32Chunk(col.getNativeView(), + outType.getTypeId().getNativeId(), chunkIdx)); + } + + /** + * Reassemble a column of 128-bit values from a table of four 64-bit integer columns and check + * for overflow. The 128-bit value is reconstructed by overlapping the 64-bit values by 32-bits. + * The least significant 32-bits of the least significant 64-bit value are used directly as the + * least significant 32-bits of the final 128-bit value, and the remaining 32-bits are added to + * the next most significant 64-bit value. The lower 32-bits of that sum become the next most + * significant 32-bits in the final 128-bit value, and the remaining 32-bits are added to the + * next most significant 64-bit input value, and so on. + * + * @param chunks table of four 64-bit integer columns with the columns ordered from least + * significant to most significant. The last column must be of type INT64. + * @param type the type to use for the resulting 128-bit value column + * @return table containing a boolean column and a 128-bit value column of the requested type. + * The boolean value will be true if an overflow was detected for that row's value when + * it was reassembled. A null input row will result in a corresponding null output row. 
+ */ + public static Table combineInt64SumChunks(Table chunks, DType type) { + return new Table(combineInt64SumChunks(chunks.getNativeView(), + type.getTypeId().getNativeId(), + type.getScale())); + } + + private static native long extractInt32Chunk(long columnView, int outTypeId, int chunkIdx); + + private static native long[] combineInt64SumChunks(long chunksTableView, int dtype, int scale); +} diff --git a/java/src/main/native/CMakeLists.txt b/java/src/main/native/CMakeLists.txt index 00747efff27..ffbeeb155e0 100755 --- a/java/src/main/native/CMakeLists.txt +++ b/java/src/main/native/CMakeLists.txt @@ -1,5 +1,5 @@ # ============================================================================= -# Copyright (c) 2019-2021, NVIDIA CORPORATION. +# Copyright (c) 2019-2022, NVIDIA CORPORATION. # # Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except # in compliance with the License. You may obtain a copy of the License at @@ -219,7 +219,7 @@ endif() add_library( cudfjni SHARED - src/row_conversion.cu + src/Aggregation128UtilsJni.cpp src/AggregationJni.cpp src/CudfJni.cpp src/CudaJni.cpp @@ -236,7 +236,9 @@ add_library( src/RmmJni.cpp src/ScalarJni.cpp src/TableJni.cpp + src/aggregation128_utils.cu src/map_lookup.cu + src/row_conversion.cu src/check_nvcomp_output_sizes.cu ) diff --git a/java/src/main/native/src/Aggregation128UtilsJni.cpp b/java/src/main/native/src/Aggregation128UtilsJni.cpp new file mode 100644 index 00000000000..71c36cb724a --- /dev/null +++ b/java/src/main/native/src/Aggregation128UtilsJni.cpp @@ -0,0 +1,47 @@ +/* + * Copyright (c) 2022, NVIDIA CORPORATION. + * + * Licensed under the Apache License, Version 2.0 (the "License"); + * you may not use this file except in compliance with the License. 
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include "aggregation128_utils.hpp"
+#include "cudf_jni_apis.hpp"
+#include "dtype_utils.hpp"
+
+extern "C" {
+
+JNIEXPORT jlong JNICALL Java_ai_rapids_cudf_Aggregation128Utils_extractInt32Chunk(
+    JNIEnv *env, jclass, jlong j_column_view, jint j_out_dtype, jint j_chunk_idx) {
+  JNI_NULL_CHECK(env, j_column_view, "column is null", 0);
+  try {
+    cudf::jni::auto_set_device(env);
+    auto cview = reinterpret_cast<cudf::column_view const *>(j_column_view);
+    auto dtype = cudf::jni::make_data_type(j_out_dtype, 0);
+    return cudf::jni::release_as_jlong(cudf::jni::extract_chunk32(*cview, dtype, j_chunk_idx));
+  }
+  CATCH_STD(env, 0);
+}
+
+JNIEXPORT jlongArray JNICALL Java_ai_rapids_cudf_Aggregation128Utils_combineInt64SumChunks(
+    JNIEnv *env, jclass, jlong j_table_view, jint j_dtype, jint j_scale) {
+  JNI_NULL_CHECK(env, j_table_view, "table is null", 0);
+  try {
+    cudf::jni::auto_set_device(env);
+    auto tview = reinterpret_cast<cudf::table_view const *>(j_table_view);
+    std::unique_ptr<cudf::table> result =
+        cudf::jni::assemble128_from_sum(*tview, cudf::jni::make_data_type(j_dtype, j_scale));
+    return cudf::jni::convert_table_for_return(env, result);
+  }
+  CATCH_STD(env, 0);
+}
+}
diff --git a/java/src/main/native/src/ColumnVectorJni.cpp b/java/src/main/native/src/ColumnVectorJni.cpp
index 0e559ad0403..f01d832eb19 100644
--- a/java/src/main/native/src/ColumnVectorJni.cpp
+++ b/java/src/main/native/src/ColumnVectorJni.cpp
@@ -252,8 +252,8 @@ JNIEXPORT jlong JNICALL Java_ai_rapids_cudf_ColumnVector_makeListFromOffsets(
   JNI_NULL_CHECK(env, offsets_handle, "offsets_handle is null", 0)
   try {
     cudf::jni::auto_set_device(env);
-    auto const *child_cv = reinterpret_cast<cudf::column_view const *>(child_handle);
-    auto const *offsets_cv = reinterpret_cast<cudf::column_view const *>(offsets_handle);
+    auto const child_cv = reinterpret_cast<cudf::column_view const *>(child_handle);
+    auto const offsets_cv = reinterpret_cast<cudf::column_view const *>(offsets_handle);
     CUDF_EXPECTS(offsets_cv->type().id() == cudf::type_id::INT32,
                  "Input offsets does not have type INT32.");
diff --git a/java/src/main/native/src/ColumnViewJni.cpp b/java/src/main/native/src/ColumnViewJni.cpp
index 63247eb0066..eec4a78a457 100644
--- a/java/src/main/native/src/ColumnViewJni.cpp
+++ b/java/src/main/native/src/ColumnViewJni.cpp
@@ -408,7 +408,7 @@ JNIEXPORT jlong JNICALL Java_ai_rapids_cudf_ColumnView_dropListDuplicatesWithKey
   JNI_NULL_CHECK(env, keys_vals_handle, "keys_vals_handle is null", 0);
   try {
     cudf::jni::auto_set_device(env);
-    auto const *input_cv = reinterpret_cast<cudf::column_view const *>(keys_vals_handle);
+    auto const input_cv = reinterpret_cast<cudf::column_view const *>(keys_vals_handle);
     CUDF_EXPECTS(input_cv->offset() == 0, "Input column has non-zero offset.");
     CUDF_EXPECTS(input_cv->type().id() == cudf::type_id::LIST,
                  "Input column is not a lists column.");
@@ -460,7 +460,8 @@ JNIEXPORT jlong JNICALL Java_ai_rapids_cudf_ColumnView_dropListDuplicatesWithKey
   auto out_structs =
       cudf::make_structs_column(out_child_size, std::move(out_structs_members), 0, {});
   return release_as_jlong(cudf::make_lists_column(input_cv->size(), std::move(out_offsets),
-                                                  std::move(out_structs), 0, {}));
+                                                  std::move(out_structs), input_cv->null_count(),
+                                                  cudf::copy_bitmask(*input_cv)));
   }
   CATCH_STD(env, 0);
 }
diff --git a/java/src/main/native/src/aggregation128_utils.cu b/java/src/main/native/src/aggregation128_utils.cu
new file mode 100644
index 00000000000..865f607ff7d
--- /dev/null
+++ b/java/src/main/native/src/aggregation128_utils.cu
@@ -0,0 +1,127 @@
+/*
+ * Copyright (c) 2022, NVIDIA CORPORATION.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include <cstddef>
+#include <memory>
+#include <vector>
+
+#include <cudf/column/column_factories.hpp>
+#include <cudf/copying.hpp>
+#include <cudf/table/table.hpp>
+#include <cudf/types.hpp>
+#include <cudf/utilities/error.hpp>
+#include <rmm/cuda_stream_view.hpp>
+#include <rmm/exec_policy.hpp>
+
+#include "aggregation128_utils.hpp"
+
+namespace {
+
+// Functor to reassemble a 128-bit value from four 64-bit chunks with overflow detection.
+class chunk_assembler : public thrust::unary_function<cudf::size_type, __int128_t> {
+public:
+  chunk_assembler(bool *overflows, uint64_t const *chunks0, uint64_t const *chunks1,
+                  uint64_t const *chunks2, int64_t const *chunks3)
+      : overflows(overflows), chunks0(chunks0), chunks1(chunks1), chunks2(chunks2),
+        chunks3(chunks3) {}
+
+  __device__ __int128_t operator()(cudf::size_type i) const {
+    // Starting with the least significant input and moving to the most significant, propagate the
+    // upper 32-bits of the previous column into the next column, i.e.: propagate the "carry" bits
+    // of each 64-bit chunk into the next chunk.
+    uint64_t const c0 = chunks0[i];
+    uint64_t const c1 = chunks1[i] + (c0 >> 32);
+    uint64_t const c2 = chunks2[i] + (c1 >> 32);
+    int64_t const c3 = chunks3[i] + (c2 >> 32);
+    uint64_t const lower64 = (c1 << 32) | static_cast<uint32_t>(c0);
+    int64_t const upper64 = (c3 << 32) | static_cast<uint32_t>(c2);
+
+    // check for overflow by ensuring the sign bit matches the top carry bits
+    int32_t const replicated_sign_bit = static_cast<int32_t>(c3) >> 31;
+    int32_t const top_carry_bits = static_cast<int32_t>(c3 >> 32);
+    overflows[i] = (replicated_sign_bit != top_carry_bits);
+
+    return (static_cast<__int128_t>(upper64) << 64) | lower64;
+  }
+
+private:
+  // output column for overflow detected
+  bool *const overflows;
+
+  // input columns for the four 64-bit values
+  uint64_t const *const chunks0;
+  uint64_t const *const chunks1;
+  uint64_t const *const chunks2;
+  int64_t const *const chunks3;
+};
+
+} // anonymous namespace
+
+namespace cudf::jni {
+
+// Extract a 32-bit chunk from a 128-bit value.
+std::unique_ptr<cudf::column> extract_chunk32(cudf::column_view const &in_col,
+                                              cudf::data_type type, int chunk_idx,
+                                              rmm::cuda_stream_view stream) {
+  CUDF_EXPECTS(in_col.type().id() == cudf::type_id::DECIMAL128, "not a 128-bit type");
+  CUDF_EXPECTS(chunk_idx >= 0 && chunk_idx < 4, "invalid chunk index");
+  CUDF_EXPECTS(type.id() == cudf::type_id::INT32 || type.id() == cudf::type_id::UINT32,
+               "not a 32-bit integer type");
+  auto const num_rows = in_col.size();
+  auto out_col = cudf::make_fixed_width_column(type, num_rows, copy_bitmask(in_col));
+  auto out_view = out_col->mutable_view();
+  auto const in_begin = in_col.begin<int32_t>();
+
+  // Build an iterator for every fourth 32-bit value, i.e.: one "chunk" of a __int128_t value
+  thrust::transform_iterator transform_iter{thrust::counting_iterator{0},
+                                            [] __device__(auto i) { return i * 4; }};
+  thrust::permutation_iterator stride_iter{in_begin + chunk_idx, transform_iter};
+
+  thrust::copy(rmm::exec_policy(stream), stride_iter, stride_iter + num_rows,
+               out_view.data<int32_t>());
+
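As an aside on the strided copy above: chunk `k` of a 128-bit value is simply bits `[32*k, 32*k + 32)` of its two's-complement pattern, least significant chunk first, which is exactly what reading every fourth 32-bit word yields on a little-endian layout. An illustrative host-side Python sketch (not part of the patch):

```python
# Illustrative host-side model of extract_chunk32: pull 32-bit chunk k out of
# each (signed) 128-bit value, with chunk 0 the least significant.
def extract_chunk32(values, chunk_idx):
    assert 0 <= chunk_idx < 4
    mask128 = (1 << 128) - 1  # two's-complement bit pattern of the value
    return [((v & mask128) >> (32 * chunk_idx)) & 0xFFFFFFFF for v in values]

chunks = [extract_chunk32([0x00000001_00000002_00000003_00000004], k)[0] for k in range(4)]
# chunks == [4, 3, 2, 1]
```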
return out_col; +} + +// Reassemble a column of 128-bit values from four 64-bit integer columns with overflow detection. +std::unique_ptr assemble128_from_sum(cudf::table_view const &chunks_table, + cudf::data_type output_type, + rmm::cuda_stream_view stream) { + CUDF_EXPECTS(output_type.id() == cudf::type_id::DECIMAL128, "not a 128-bit type"); + CUDF_EXPECTS(chunks_table.num_columns() == 4, "must be 4 column table"); + auto const num_rows = chunks_table.num_rows(); + auto const chunks0 = chunks_table.column(0); + auto const chunks1 = chunks_table.column(1); + auto const chunks2 = chunks_table.column(2); + auto const chunks3 = chunks_table.column(3); + CUDF_EXPECTS(cudf::size_of(chunks0.type()) == 8 && cudf::size_of(chunks1.type()) == 8 && + cudf::size_of(chunks2.type()) == 8 && + chunks3.type().id() == cudf::type_id::INT64, + "chunks type mismatch"); + std::vector> columns; + columns.push_back(cudf::make_fixed_width_column(cudf::data_type{cudf::type_id::BOOL8}, num_rows, + copy_bitmask(chunks0))); + columns.push_back(cudf::make_fixed_width_column(output_type, num_rows, copy_bitmask(chunks0))); + auto overflows_view = columns[0]->mutable_view(); + auto assembled_view = columns[1]->mutable_view(); + thrust::transform(rmm::exec_policy(stream), thrust::make_counting_iterator(0), + thrust::make_counting_iterator(num_rows), + assembled_view.begin<__int128_t>(), + chunk_assembler(overflows_view.begin(), chunks0.begin(), + chunks1.begin(), chunks2.begin(), + chunks3.begin())); + return std::make_unique(std::move(columns)); +} + +} // namespace cudf::jni diff --git a/java/src/main/native/src/aggregation128_utils.hpp b/java/src/main/native/src/aggregation128_utils.hpp new file mode 100644 index 00000000000..30c1032b795 --- /dev/null +++ b/java/src/main/native/src/aggregation128_utils.hpp @@ -0,0 +1,69 @@ +/* + * Copyright (c) 2022, NVIDIA CORPORATION. 
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+#include <memory>
+
+#include <cudf/column/column_view.hpp>
+#include <cudf/table/table_view.hpp>
+#include <rmm/cuda_stream_view.hpp>
+
+namespace cudf::jni {
+
+/**
+ * @brief Extract a 32-bit integer column from a column of 128-bit values.
+ *
+ * Given a 128-bit input column, a 32-bit integer column is returned corresponding to
+ * the index of which 32-bit chunk of the original 128-bit values to extract.
+ * 0 corresponds to the least significant chunk, and 3 corresponds to the most
+ * significant chunk.
+ *
+ * A null input row will result in a corresponding null output row.
+ *
+ * @param col Column of 128-bit values
+ * @param dtype Integer type to use for the output column (e.g.: UINT32 or INT32)
+ * @param chunk_idx Index of the 32-bit chunk to extract
+ * @param stream CUDA stream to use
+ * @return A column containing the extracted 32-bit integer values
+ */
+std::unique_ptr<cudf::column>
+extract_chunk32(cudf::column_view const &col, cudf::data_type dtype, int chunk_idx,
+                rmm::cuda_stream_view stream = rmm::cuda_stream_default);
+
+/**
+ * @brief Reassemble a 128-bit column from four 64-bit integer columns with overflow detection.
+ *
+ * The 128-bit value is reconstructed by overlapping the 64-bit values by 32-bits. The least
+ * significant 32-bits of the least significant 64-bit value are used directly as the least
+ * significant 32-bits of the final 128-bit value, and the remaining 32-bits are added to the next
+ * most significant 64-bit value. The lower 32-bits of that sum become the next most significant
+ * 32-bits in the final 128-bit value, and the remaining 32-bits are added to the next most
+ * significant 64-bit input value, and so on.
+ *
+ * A null input row will result in a corresponding null output row.
+ *
+ * @param chunks_table Table of four 64-bit integer columns with the columns ordered from least
+ *                     significant to most significant. The last column must be an INT64 column.
+ * @param output_type The type to use for the resulting 128-bit value column
+ * @param stream CUDA stream to use
+ * @return Table containing a boolean column and a 128-bit value column of the
+ *         requested type. The boolean value will be true if an overflow was detected
+ *         for that row's value.
+ */
+std::unique_ptr<cudf::table>
+assemble128_from_sum(cudf::table_view const &chunks_table, cudf::data_type output_type,
+                     rmm::cuda_stream_view stream = rmm::cuda_stream_default);
+
+} // namespace cudf::jni
diff --git a/java/src/test/java/ai/rapids/cudf/Aggregation128UtilsTest.java b/java/src/test/java/ai/rapids/cudf/Aggregation128UtilsTest.java
new file mode 100644
index 00000000000..11e2aff7259
--- /dev/null
+++ b/java/src/test/java/ai/rapids/cudf/Aggregation128UtilsTest.java
@@ -0,0 +1,80 @@
+/*
+ * Copyright (c) 2022, NVIDIA CORPORATION.
+ *
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */ + +package ai.rapids.cudf; + +import org.junit.jupiter.api.Test; + +import java.math.BigInteger; + +public class Aggregation128UtilsTest extends CudfTestBase { + @Test + public void testExtractInt32Chunks() { + BigInteger[] intvals = new BigInteger[] { + null, + new BigInteger("123456789abcdef0f0debc9a78563412", 16), + new BigInteger("123456789abcdef0f0debc9a78563412", 16), + new BigInteger("123456789abcdef0f0debc9a78563412", 16), + null + }; + try (ColumnVector cv = ColumnVector.decimalFromBigInt(-38, intvals); + ColumnVector chunk1 = Aggregation128Utils.extractInt32Chunk(cv, DType.UINT32, 0); + ColumnVector chunk2 = Aggregation128Utils.extractInt32Chunk(cv, DType.UINT32, 1); + ColumnVector chunk3 = Aggregation128Utils.extractInt32Chunk(cv, DType.UINT32, 2); + ColumnVector chunk4 = Aggregation128Utils.extractInt32Chunk(cv, DType.INT32, 3); + Table actualChunks = new Table(chunk1, chunk2, chunk3, chunk4); + ColumnVector expectedChunk1 = ColumnVector.fromBoxedUnsignedInts( + null, 0x78563412, 0x78563412, 0x78563412, null); + ColumnVector expectedChunk2 = ColumnVector.fromBoxedUnsignedInts( + null, -0x0f214366, -0x0f214366, -0x0f214366, null); + ColumnVector expectedChunk3 = ColumnVector.fromBoxedUnsignedInts( + null, -0x65432110, -0x65432110, -0x65432110, null); + ColumnVector expectedChunk4 = ColumnVector.fromBoxedInts( + null, 0x12345678, 0x12345678, 0x12345678, null); + Table expectedChunks = new Table(expectedChunk1, expectedChunk2, expectedChunk3, expectedChunk4)) { + AssertUtils.assertTablesAreEqual(expectedChunks, actualChunks); + } + } + + @Test + public void testCombineInt64SumChunks() { + try (ColumnVector chunks0 = ColumnVector.fromBoxedUnsignedLongs( + null, 0L, 1L, 0L, 0L, 0x12345678L, 0x123456789L, 0x1234567812345678L, 0xfedcba9876543210L); + ColumnVector chunks1 = ColumnVector.fromBoxedUnsignedLongs( + null, 0L, 2L, 0L, 0L, 0x9abcdef0L, 0x9abcdef01L, 0x1122334455667788L, 0xaceaceaceaceaceaL); + ColumnVector chunks2 = 
ColumnVector.fromBoxedUnsignedLongs( + null, 0L, 3L, 0L, 0L, 0x11223344L, 0x556677889L, 0x99aabbccddeeff00L, 0xbdfbdfbdfbdfbdfbL); + ColumnVector chunks3 = ColumnVector.fromBoxedLongs( + null, 0L, -1L, 0x100000000L, 0x80000000L, 0x55667788L, 0x01234567L, 0x66554434L, -0x42042043L); + Table chunksTable = new Table(chunks0, chunks1, chunks2, chunks3); + Table actual = Aggregation128Utils.combineInt64SumChunks(chunksTable, DType.create(DType.DTypeEnum.DECIMAL128, -20)); + ColumnVector expectedOverflows = ColumnVector.fromBoxedBooleans( + null, false, false, true, true, false, false, true, false); + ColumnVector expectedValues = ColumnVector.decimalFromBigInt(-20, + null, + new BigInteger("0", 16), + new BigInteger("-fffffffcfffffffdffffffff", 16), + new BigInteger("0", 16), + new BigInteger("-80000000000000000000000000000000", 16), + new BigInteger("55667788112233449abcdef012345678", 16), + new BigInteger("123456c56677892abcdef0223456789", 16), + new BigInteger("ef113244679ace0012345678", 16), + new BigInteger("7bf7bf7ba8ca8ca8e9ab678276543210", 16)); + Table expected = new Table(expectedOverflows, expectedValues)) { + AssertUtils.assertTablesAreEqual(expected, actual); + } + } +} diff --git a/java/src/test/java/ai/rapids/cudf/ColumnVectorTest.java b/java/src/test/java/ai/rapids/cudf/ColumnVectorTest.java index 8f39c3c51ce..f9c8029ed84 100644 --- a/java/src/test/java/ai/rapids/cudf/ColumnVectorTest.java +++ b/java/src/test/java/ai/rapids/cudf/ColumnVectorTest.java @@ -4380,12 +4380,14 @@ void testDropListDuplicatesWithKeysValues() { 3, 4, 5, // list2 null, 0, 6, 6, 0, // list3 null, 6, 7, null, 7 // list 4 + // list5 (empty) ); ColumnVector inputChildVals = ColumnVector.fromBoxedInts( 10, 20, // list1 30, 40, 50, // list2 60, 70, 80, 90, 100, // list3 110, 120, 130, 140, 150 // list4 + // list5 (empty) ); ColumnVector inputStructsKeysVals = ColumnVector.makeStruct(inputChildKeys, inputChildVals); ColumnVector inputOffsets = ColumnVector.fromInts(0, 2, 5, 10, 15, 15); 
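The expected values in `testCombineInt64SumChunks` above can be cross-checked on the host. A minimal Python sketch of the same carry-propagating reassembly and overflow check (illustrative only, not part of the patch):

```python
# Illustrative host-side model of combineInt64SumChunks: propagate the carry
# (upper 32 bits) of each 64-bit partial sum into the next chunk, overlap the
# low 32 bits of each, and flag overflow when the bits above bit 127 disagree
# with the sign bit of the reassembled value.
M64 = (1 << 64) - 1

def to_i64(x):
    # Reinterpret a 64-bit pattern as a signed integer.
    x &= M64
    return x - (1 << 64) if x >> 63 else x

def combine_chunks(c0, c1, c2, c3):
    c1 = (c1 + (c0 >> 32)) & M64
    c2 = (c2 + (c1 >> 32)) & M64
    c3 = to_i64(c3 + (c2 >> 32))            # final chunk is signed, wraps at 64 bits
    bits = ((c3 & 0xFFFFFFFF) << 96) | ((c2 & 0xFFFFFFFF) << 64) \
           | ((c1 & 0xFFFFFFFF) << 32) | (c0 & 0xFFFFFFFF)
    value = bits - (1 << 128) if bits >> 127 else bits
    sign = -1 if (c3 >> 31) & 1 else 0      # replicate bit 31 of the final chunk
    overflow = (c3 >> 32) != sign           # compare against the top carry bits
    return value, overflow

# Rows mirrored from testCombineInt64SumChunks:
assert combine_chunks(1, 2, 3, -1) == (-0xfffffffcfffffffdffffffff, False)
assert combine_chunks(0, 0, 0, 0x100000000) == (0, True)
assert combine_chunks(0x12345678, 0x9abcdef0, 0x11223344, 0x55667788) == \
       (0x55667788112233449abcdef012345678, False)
```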
@@ -4402,7 +4404,8 @@ void testDropListDuplicatesWithKeysValues() { 10, 20, 30, 40, 50, 100, 90, 60, - 120, 150, 140); + 120, 150, 140 + ); ColumnVector expectedStructsKeysVals = ColumnVector.makeStruct(expectedChildKeys, expectedChildVals); ColumnVector expectedOffsets = ColumnVector.fromInts(0, 2, 5, 8, 11, 11); @@ -4416,6 +4419,60 @@ void testDropListDuplicatesWithKeysValues() { } } + @Test + void testDropListDuplicatesWithKeysValuesNullable() { + try(ColumnVector inputChildKeys = ColumnVector.fromBoxedInts( + 1, 2, // list1 + // list2 (null) + 3, 4, 5, // list3 + null, 0, 6, 6, 0, // list4 + null, 6, 7, null, 7 // list 5 + // list6 (null) + ); + ColumnVector inputChildVals = ColumnVector.fromBoxedInts( + 10, 20, // list1 + // list2 (null) + 30, 40, 50, // list3 + 60, 70, 80, 90, 100, // list4 + 110, 120, 130, 140, 150 // list5 + // list6 (null) + ); + ColumnVector inputStructsKeysVals = ColumnVector.makeStruct(inputChildKeys, inputChildVals); + ColumnVector inputOffsets = ColumnVector.fromInts(0, 2, 2, 5, 10, 15, 15); + ColumnVector tmpInputListsKeysVals = inputStructsKeysVals.makeListFromOffsets(6,inputOffsets); + ColumnVector templateBitmask = ColumnVector.fromBoxedInts(1, null, 1, 1, 1, null); + ColumnVector inputListsKeysVals = tmpInputListsKeysVals.mergeAndSetValidity(BinaryOp.BITWISE_AND, templateBitmask); + + ColumnVector expectedChildKeys = ColumnVector.fromBoxedInts( + 1, 2, // list1 + // list2 (null) + 3, 4, 5, // list3 + 0, 6, null, // list4 + 6, 7, null // list5 + // list6 (null) + ); + ColumnVector expectedChildVals = ColumnVector.fromBoxedInts( + 10, 20, // list1 + // list2 (null) + 30, 40, 50, // list3 + 100, 90, 60, // list4 + 120, 150, 140 // list5 + // list6 (null) + ); + ColumnVector expectedStructsKeysVals = ColumnVector.makeStruct(expectedChildKeys, + expectedChildVals); + ColumnVector expectedOffsets = ColumnVector.fromInts(0, 2, 2, 5, 8, 11, 11); + ColumnVector tmpExpectedListsKeysVals = expectedStructsKeysVals.makeListFromOffsets(6, + 
expectedOffsets); + ColumnVector expectedListsKeysVals = tmpExpectedListsKeysVals.mergeAndSetValidity(BinaryOp.BITWISE_AND, templateBitmask); + + ColumnVector output = inputListsKeysVals.dropListDuplicatesWithKeysValues(); + ColumnVector sortedOutput = output.listSortRows(false, false); + ) { + assertColumnsAreEqual(expectedListsKeysVals, sortedOutput); + } + } + @SafeVarargs private static ColumnVector makeListsColumn(DType childDType, List... rows) { HostColumnVector.DataType childType = new HostColumnVector.BasicType(true, childDType); @@ -4716,7 +4773,7 @@ void testStringSplit() { Table resultSplitOnce = v.stringSplit(pattern, 1); Table resultSplitAll = v.stringSplit(pattern)) { assertTablesAreEqual(expectedSplitOnce, resultSplitOnce); - assertTablesAreEqual(expectedSplitAll, resultSplitAll); + assertTablesAreEqual(expectedSplitAll, resultSplitAll); } } @@ -6068,7 +6125,7 @@ void testCopyWithBooleanColumnAsValidity() { } // Negative case: Mismatch in row count. - Exception x = assertThrows(CudfException.class, () -> { + Exception x = assertThrows(CudfException.class, () -> { try (ColumnVector exemplar = ColumnVector.fromBoxedInts(1, 2, 3, 4, 5, 6, 7, 8, 9, 10); ColumnVector validity = ColumnVector.fromBoxedBooleans(F, T, F, T); ColumnVector result = exemplar.copyWithBooleanColumnAsValidity(validity)) { diff --git a/python/cudf/cudf/_fuzz_testing/fuzzer.py b/python/cudf/cudf/_fuzz_testing/fuzzer.py index 484b3fb26f4..a51a5073510 100644 --- a/python/cudf/cudf/_fuzz_testing/fuzzer.py +++ b/python/cudf/cudf/_fuzz_testing/fuzzer.py @@ -14,7 +14,7 @@ ) -class Fuzzer(object): +class Fuzzer: def __init__( self, target, diff --git a/python/cudf/cudf/_fuzz_testing/io.py b/python/cudf/cudf/_fuzz_testing/io.py index 193fb4c7f7f..dfc59a1f18d 100644 --- a/python/cudf/cudf/_fuzz_testing/io.py +++ b/python/cudf/cudf/_fuzz_testing/io.py @@ -16,7 +16,7 @@ ) -class IOFuzz(object): +class IOFuzz: def __init__( self, dirs=None, @@ -59,7 +59,7 @@ def __init__( self._current_buffer = 
None def _load_params(self, path): - with open(path, "r") as f: + with open(path) as f: params = json.load(f) self._inputs.append(params) diff --git a/python/cudf/cudf/_fuzz_testing/main.py b/python/cudf/cudf/_fuzz_testing/main.py index 7b28a4c4970..6b536fc3e2e 100644 --- a/python/cudf/cudf/_fuzz_testing/main.py +++ b/python/cudf/cudf/_fuzz_testing/main.py @@ -3,7 +3,7 @@ from cudf._fuzz_testing import fuzzer -class PythonFuzz(object): +class PythonFuzz: def __init__(self, func, params=None, data_handle=None, **kwargs): self.function = func self.data_handler_class = data_handle diff --git a/python/cudf/cudf/_version.py b/python/cudf/cudf/_version.py index a511ab98acf..c6281349c50 100644 --- a/python/cudf/cudf/_version.py +++ b/python/cudf/cudf/_version.py @@ -86,7 +86,7 @@ def run_command( stderr=(subprocess.PIPE if hide_stderr else None), ) break - except EnvironmentError: + except OSError: e = sys.exc_info()[1] if e.errno == errno.ENOENT: continue @@ -96,7 +96,7 @@ def run_command( return None, None else: if verbose: - print("unable to find command, tried %s" % (commands,)) + print(f"unable to find command, tried {commands}") return None, None stdout = p.communicate()[0].strip() if sys.version_info[0] >= 3: @@ -149,7 +149,7 @@ def git_get_keywords(versionfile_abs): # _version.py. 
keywords = {} try: - f = open(versionfile_abs, "r") + f = open(versionfile_abs) for line in f.readlines(): if line.strip().startswith("git_refnames ="): mo = re.search(r'=\s*"(.*)"', line) @@ -164,7 +164,7 @@ def git_get_keywords(versionfile_abs): if mo: keywords["date"] = mo.group(1) f.close() - except EnvironmentError: + except OSError: pass return keywords @@ -188,11 +188,11 @@ def git_versions_from_keywords(keywords, tag_prefix, verbose): if verbose: print("keywords are unexpanded, not using") raise NotThisMethod("unexpanded keywords, not a git-archive tarball") - refs = set([r.strip() for r in refnames.strip("()").split(",")]) + refs = {r.strip() for r in refnames.strip("()").split(",")} # starting in git-1.8.3, tags are listed as "tag: foo-1.0" instead of # just "foo-1.0". If we see a "tag: " prefix, prefer those. TAG = "tag: " - tags = set([r[len(TAG) :] for r in refs if r.startswith(TAG)]) + tags = {r[len(TAG) :] for r in refs if r.startswith(TAG)} if not tags: # Either we're using git < 1.8.3, or there really are no tags. We use # a heuristic: assume all version tags have a digit. The old git %d @@ -201,7 +201,7 @@ def git_versions_from_keywords(keywords, tag_prefix, verbose): # between branches and tags. By ignoring refnames without digits, we # filter out many common branch names like "release" and # "stabilization", as well as "HEAD" and "master". 
- tags = set([r for r in refs if re.search(r"\d", r)]) + tags = {r for r in refs if re.search(r"\d", r)} if verbose: print("discarding '%s', no digits" % ",".join(refs - tags)) if verbose: @@ -308,10 +308,9 @@ def git_pieces_from_vcs(tag_prefix, root, verbose, run_command=run_command): if verbose: fmt = "tag '%s' doesn't start with prefix '%s'" print(fmt % (full_tag, tag_prefix)) - pieces["error"] = "tag '%s' doesn't start with prefix '%s'" % ( - full_tag, - tag_prefix, - ) + pieces[ + "error" + ] = f"tag '{full_tag}' doesn't start with prefix '{tag_prefix}'" return pieces pieces["closest-tag"] = full_tag[len(tag_prefix) :] diff --git a/python/cudf/cudf/comm/gpuarrow.py b/python/cudf/cudf/comm/gpuarrow.py index b6089b65aa5..7879261139d 100644 --- a/python/cudf/cudf/comm/gpuarrow.py +++ b/python/cudf/cudf/comm/gpuarrow.py @@ -58,7 +58,7 @@ def to_dict(self): return dc -class GpuArrowNodeReader(object): +class GpuArrowNodeReader: def __init__(self, table, index): self._table = table self._field = table.schema[index] diff --git a/python/cudf/cudf/core/_base_index.py b/python/cudf/cudf/core/_base_index.py index 6569184e90b..2e6f138d2e3 100644 --- a/python/cudf/cudf/core/_base_index.py +++ b/python/cudf/cudf/core/_base_index.py @@ -1,9 +1,8 @@ # Copyright (c) 2021, NVIDIA CORPORATION. -from __future__ import annotations, division, print_function +from __future__ import annotations import pickle -import warnings from typing import Any, Set import pandas as pd @@ -1350,28 +1349,6 @@ def isin(self, values): return self._values.isin(values).values - def memory_usage(self, deep=False): - """ - Memory usage of the values. - - Parameters - ---------- - deep : bool - Introspect the data deeply, - interrogate `object` dtypes for system-level - memory consumption. - - Returns - ------- - bytes used - """ - if deep: - warnings.warn( - "The deep parameter is ignored and is only included " - "for pandas compatibility." 
- ) - return self._values.memory_usage() - @classmethod def from_pandas(cls, index, nan_as_null=None): """ diff --git a/python/cudf/cudf/core/column/column.py b/python/cudf/cudf/core/column/column.py index 19313dd3fe2..2c3951c0e5e 100644 --- a/python/cudf/cudf/core/column/column.py +++ b/python/cudf/cudf/core/column/column.py @@ -77,12 +77,12 @@ pandas_dtypes_alias_to_cudf_alias, pandas_dtypes_to_np_dtypes, ) -from cudf.utils.utils import mask_dtype +from cudf.utils.utils import NotIterable, mask_dtype T = TypeVar("T", bound="ColumnBase") -class ColumnBase(Column, Serializable): +class ColumnBase(Column, Serializable, NotIterable): def as_frame(self) -> "cudf.core.frame.Frame": """ Converts a Column to Frame @@ -130,9 +130,6 @@ def to_pandas(self, index: pd.Index = None, **kwargs) -> "pd.Series": pd_series.index = index return pd_series - def __iter__(self): - cudf.utils.utils.raise_iteration_error(obj=self) - @property def values_host(self) -> "np.ndarray": """ diff --git a/python/cudf/cudf/core/column/string.py b/python/cudf/cudf/core/column/string.py index 6467fd39ddd..22b7a0f9d2c 100644 --- a/python/cudf/cudf/core/column/string.py +++ b/python/cudf/cudf/core/column/string.py @@ -5083,7 +5083,7 @@ def to_arrow(self) -> pa.Array: """ if self.null_count == len(self): return pa.NullArray.from_buffers( - pa.null(), len(self), [pa.py_buffer((b""))] + pa.null(), len(self), [pa.py_buffer(b"")] ) else: return super().to_arrow() diff --git a/python/cudf/cudf/core/dataframe.py b/python/cudf/cudf/core/dataframe.py index 3735a949277..9d179994174 100644 --- a/python/cudf/cudf/core/dataframe.py +++ b/python/cudf/cudf/core/dataframe.py @@ -1,6 +1,6 @@ # Copyright (c) 2018-2022, NVIDIA CORPORATION. 
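Several hunks in this patch (`ColumnBase` here, `MultiIndex` and `SingleColumnFrame` later) drop per-class `__iter__` overrides that called `cudf.utils.utils.raise_iteration_error` in favor of a shared `NotIterable` mixin imported from `cudf.utils.utils`. The mixin itself is not shown in the diff; a minimal sketch of the pattern, with the error message borrowed from the comment removed in `single_column_frame.py` (the exact cudf implementation may differ), might look like:

```python
class NotIterable:
    """Mixin making GPU-backed classes explicitly non-iterable.

    Iterating over a GPU object is not efficient, so instead of silently
    supporting it, point users at explicit host-transfer APIs.
    """

    def __iter__(self):
        raise TypeError(
            f"{type(self).__name__} object is not iterable. Consider using "
            "`.to_arrow()`, `.to_pandas()` or `.values_host` if you wish to "
            "iterate over the values."
        )


# Hypothetical subclass standing in for ColumnBase / MultiIndex.
class FakeColumn(NotIterable):
    pass


try:
    list(FakeColumn())
except TypeError as exc:
    message = str(exc)

assert "not iterable" in message
```

Centralizing the override this way means each class gains the behavior by inheritance instead of repeating the same `__iter__` body.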
-from __future__ import annotations, division +from __future__ import annotations import functools import inspect @@ -1242,66 +1242,9 @@ def _slice(self: T, arg: slice) -> T: return result def memory_usage(self, index=True, deep=False): - """ - Return the memory usage of each column in bytes. - The memory usage can optionally include the contribution of - the index and elements of `object` dtype. - - Parameters - ---------- - index : bool, default True - Specifies whether to include the memory usage of the DataFrame's - index in returned Series. If ``index=True``, the memory usage of - the index is the first item in the output. - deep : bool, default False - If True, introspect the data deeply by interrogating - `object` dtypes for system-level memory consumption, and include - it in the returned values. - - Returns - ------- - Series - A Series whose index is the original column names and whose values - is the memory usage of each column in bytes. - - Examples - -------- - >>> dtypes = ['int64', 'float64', 'object', 'bool'] - >>> data = dict([(t, np.ones(shape=5000).astype(t)) - ... for t in dtypes]) - >>> df = cudf.DataFrame(data) - >>> df.head() - int64 float64 object bool - 0 1 1.0 1.0 True - 1 1 1.0 1.0 True - 2 1 1.0 1.0 True - 3 1 1.0 1.0 True - 4 1 1.0 1.0 True - >>> df.memory_usage(index=False) - int64 40000 - float64 40000 - object 40000 - bool 5000 - dtype: int64 - - Use a Categorical for efficient storage of an object-dtype column with - many repeated values. - - >>> df['object'].astype('category').memory_usage(deep=True) - 5008 - """ - if deep: - warnings.warn( - "The deep parameter is ignored and is only included " - "for pandas compatibility." 
- ) - ind = list(self.columns) - sizes = [col.memory_usage() for col in self._data.columns] - if index: - ind.append("Index") - ind = cudf.Index(ind, dtype="str") - sizes.append(self.index.memory_usage()) - return Series(sizes, index=ind) + return Series( + {str(k): v for k, v in super().memory_usage(index, deep).items()} + ) def __array_ufunc__(self, ufunc, method, *inputs, **kwargs): if method == "__call__" and hasattr(cudf, ufunc.__name__): @@ -2547,11 +2490,6 @@ def reset_index( inplace=inplace, ) - def take(self, indices, axis=0): - out = super().take(indices) - out.columns = self.columns - return out - @annotate("INSERT", color="green", domain="cudf_python") def insert(self, loc, name, value, nan_as_null=None): """Add a column to DataFrame at the index specified by loc. @@ -4229,7 +4167,7 @@ def _verbose_repr(): dtype = self.dtypes.iloc[i] col = pprint_thing(col) - line_no = _put_str(" {num}".format(num=i), space_num) + line_no = _put_str(f" {i}", space_num) count = "" if show_counts: count = counts[i] @@ -5576,9 +5514,7 @@ def select_dtypes(self, include=None, exclude=None): if issubclass(dtype.type, e_dtype): exclude_subtypes.add(dtype.type) - include_all = set( - [cudf_dtype_from_pydata_dtype(d) for d in self.dtypes] - ) + include_all = {cudf_dtype_from_pydata_dtype(d) for d in self.dtypes} if include: inclusion = include_all & include_subtypes @@ -6329,8 +6265,8 @@ def _align_indices(lhs, rhs): lhs_out = DataFrame(index=df.index) rhs_out = DataFrame(index=df.index) common = set(lhs.columns) & set(rhs.columns) - common_x = set(["{}_x".format(x) for x in common]) - common_y = set(["{}_y".format(x) for x in common]) + common_x = {f"{x}_x" for x in common} + common_y = {f"{x}_y" for x in common} for col in df.columns: if col in common_x: lhs_out[col[:-2]] = df[col] diff --git a/python/cudf/cudf/core/frame.py b/python/cudf/cudf/core/frame.py index 2e01a29b961..6b83f927727 100644 --- a/python/cudf/cudf/core/frame.py +++ b/python/cudf/cudf/core/frame.py @@ 
-337,6 +337,26 @@ def empty(self): """ return self.size == 0 + def memory_usage(self, deep=False): + """Return the memory usage of an object. + + Parameters + ---------- + deep : bool + The deep parameter is ignored and is only included for pandas + compatibility. + + Returns + ------- + The total bytes used. + """ + if deep: + warnings.warn( + "The deep parameter is ignored and is only included " + "for pandas compatibility." + ) + return {name: col.memory_usage() for name, col in self._data.items()} + def __len__(self): return self._num_rows diff --git a/python/cudf/cudf/core/groupby/groupby.py b/python/cudf/cudf/core/groupby/groupby.py index a393d8e9457..ff700144bed 100644 --- a/python/cudf/cudf/core/groupby/groupby.py +++ b/python/cudf/cudf/core/groupby/groupby.py @@ -1461,7 +1461,7 @@ def apply(self, func): # TODO: should we define this as a dataclass instead? -class Grouper(object): +class Grouper: def __init__( self, key=None, level=None, freq=None, closed=None, label=None ): diff --git a/python/cudf/cudf/core/index.py b/python/cudf/cudf/core/index.py index fc59d15e264..f71f930a21c 100644 --- a/python/cudf/cudf/core/index.py +++ b/python/cudf/cudf/core/index.py @@ -1,6 +1,6 @@ # Copyright (c) 2018-2021, NVIDIA CORPORATION. 
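The new `Frame.memory_usage` above centralizes per-column accounting: it returns a plain `{column_name: bytes}` mapping and leaves aggregation to subclasses. A self-contained stand-in for that base behavior (`FakeColumn` and `FrameLike` are illustrative names, not cudf's):

```python
class FakeColumn:
    def __init__(self, nbytes):
        self._nbytes = nbytes

    def memory_usage(self):
        return self._nbytes


class FrameLike:
    """Stand-in for cudf's Frame; _data maps column names to columns."""

    def __init__(self, data):
        self._data = data

    def memory_usage(self, deep=False):
        # `deep` is accepted only for pandas compatibility (the real
        # method warns when it is set); report usage per column and let
        # subclasses decide how to aggregate.
        return {name: col.memory_usage() for name, col in self._data.items()}


f = FrameLike({"a": FakeColumn(40000), "b": FakeColumn(5000)})
assert f.memory_usage() == {"a": 40000, "b": 5000}
```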
-from __future__ import annotations, division, print_function +from __future__ import annotations import math import pickle @@ -826,6 +826,9 @@ def _concat(cls, objs): result.name = name return result + def memory_usage(self, deep=False): + return sum(super().memory_usage(deep=deep).values()) + @annotate("INDEX_EQUALS", color="green", domain="cudf_python") def equals(self, other, **kwargs): """ diff --git a/python/cudf/cudf/core/indexed_frame.py b/python/cudf/cudf/core/indexed_frame.py index 8ecab2c7c65..fab5d75f62b 100644 --- a/python/cudf/cudf/core/indexed_frame.py +++ b/python/cudf/cudf/core/indexed_frame.py @@ -473,6 +473,68 @@ def sort_index( out = out.reset_index(drop=True) return self._mimic_inplace(out, inplace=inplace) + def memory_usage(self, index=True, deep=False): + """Return the memory usage of an object. + + Parameters + ---------- + index : bool, default True + Specifies whether to include the memory usage of the index. + deep : bool, default False + The deep parameter is ignored and is only included for pandas + compatibility. + + Returns + ------- + Series or scalar + For DataFrame, a Series whose index is the original column names + and whose values is the memory usage of each column in bytes. For a + Series the total memory usage. + + Examples + -------- + **DataFrame** + + >>> dtypes = ['int64', 'float64', 'object', 'bool'] + >>> data = dict([(t, np.ones(shape=5000).astype(t)) + ... for t in dtypes]) + >>> df = cudf.DataFrame(data) + >>> df.head() + int64 float64 object bool + 0 1 1.0 1.0 True + 1 1 1.0 1.0 True + 2 1 1.0 1.0 True + 3 1 1.0 1.0 True + 4 1 1.0 1.0 True + >>> df.memory_usage(index=False) + int64 40000 + float64 40000 + object 40000 + bool 5000 + dtype: int64 + + Use a Categorical for efficient storage of an object-dtype column with + many repeated values. 
+ + >>> df['object'].astype('category').memory_usage(deep=True) + 5008 + + **Series** + >>> s = cudf.Series(range(3), index=['a','b','c']) + >>> s.memory_usage() + 43 + + Not including the index gives the size of the rest of the data, which + is necessarily smaller: + + >>> s.memory_usage(index=False) + 24 + """ + usage = super().memory_usage(deep=deep) + if index: + usage["Index"] = self.index.memory_usage() + return usage + def hash_values(self, method="murmur3"): """Compute the hash of values in this column. diff --git a/python/cudf/cudf/core/join/join.py b/python/cudf/cudf/core/join/join.py index 704274815f6..39ff4718550 100644 --- a/python/cudf/cudf/core/join/join.py +++ b/python/cudf/cudf/core/join/join.py @@ -169,13 +169,11 @@ def __init__( if on else set() if (self._using_left_index or self._using_right_index) - else set( - [ - lkey.name - for lkey, rkey in zip(self._left_keys, self._right_keys) - if lkey.name == rkey.name - ] - ) + else { + lkey.name + for lkey, rkey in zip(self._left_keys, self._right_keys) + if lkey.name == rkey.name + } ) def perform_merge(self) -> Frame: diff --git a/python/cudf/cudf/core/multiindex.py b/python/cudf/cudf/core/multiindex.py index adce3c24a83..8581b97c217 100644 --- a/python/cudf/cudf/core/multiindex.py +++ b/python/cudf/cudf/core/multiindex.py @@ -5,7 +5,6 @@ import itertools import numbers import pickle -import warnings from collections.abc import Sequence from numbers import Integral from typing import Any, List, MutableMapping, Optional, Tuple, Union @@ -23,10 +22,14 @@ from cudf.core._compat import PANDAS_GE_120 from cudf.core.frame import Frame from cudf.core.index import BaseIndex, _lexsorted_equal_range, as_index -from cudf.utils.utils import _maybe_indices_to_slice, cached_property +from cudf.utils.utils import ( + NotIterable, + _maybe_indices_to_slice, + cached_property, +) -class MultiIndex(Frame, BaseIndex): +class MultiIndex(Frame, BaseIndex, NotIterable): """A multi-level or hierarchical index. 
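The subclasses then aggregate the shared dict in different ways: `IndexedFrame.memory_usage` above appends an `"Index"` entry when `index=True`, `Series.memory_usage` (later hunk) sums the values, and `DataFrame.memory_usage` (earlier hunk) stringifies the keys so mixed-type column labels can back a string index. A sketch with a plain dict standing in for the intermediate result (values are illustrative):

```python
# Per-column usage as returned by the shared helper, including the
# "Index" entry added when index=True; the integer key 0 mimics a
# non-string column label.
usage = {0: 40000, "float64": 40000, "Index": 40}

# DataFrame path: stringify keys before building the result Series.
dataframe_result = {str(k): v for k, v in usage.items()}

# Series path: collapse to a single byte count.
series_result = sum(usage.values())

assert dataframe_result == {"0": 40000, "float64": 40000, "Index": 40}
assert series_result == 80040
```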
Provides N-Dimensional indexing into Series and DataFrame objects. @@ -115,7 +118,7 @@ def __init__( "MultiIndex has unequal number of levels and " "codes and is inconsistent!" ) - if len(set(c.size for c in codes._data.columns)) != 1: + if len({c.size for c in codes._data.columns}) != 1: raise ValueError( "MultiIndex length of codes does not match " "and is inconsistent!" @@ -367,9 +370,6 @@ def copy( return mi - def __iter__(self): - cudf.utils.utils.raise_iteration_error(obj=self) - def __repr__(self): max_seq_items = get_option("display.max_seq_items") or len(self) @@ -752,7 +752,7 @@ def _index_and_downcast(self, result, index, index_key): # Pandas returns an empty Series with a tuple as name # the one expected result column result = cudf.Series._from_data( - {}, name=tuple((col[0] for col in index._data.columns)) + {}, name=tuple(col[0] for col in index._data.columns) ) elif out_index._num_columns == 1: # If there's only one column remaining in the output index, convert @@ -1202,7 +1202,7 @@ def _poplevels(self, level): if not pd.api.types.is_list_like(level): level = (level,) - ilevels = sorted([self._level_index_from_level(lev) for lev in level]) + ilevels = sorted(self._level_index_from_level(lev) for lev in level) if not ilevels: return None @@ -1412,22 +1412,14 @@ def _clean_nulls_from_index(self): ) def memory_usage(self, deep=False): - if deep: - warnings.warn( - "The deep parameter is ignored and is only included " - "for pandas compatibility." 
- ) - - n = 0 - for col in self._data.columns: - n += col.memory_usage() + usage = sum(super().memory_usage(deep=deep).values()) if self.levels: for level in self.levels: - n += level.memory_usage(deep=deep) + usage += level.memory_usage(deep=deep) if self.codes: for col in self.codes._data.columns: - n += col.memory_usage() - return n + usage += col.memory_usage() + return usage def difference(self, other, sort=None): if hasattr(other, "to_pandas"): diff --git a/python/cudf/cudf/core/scalar.py b/python/cudf/cudf/core/scalar.py index b0770b71ca6..134b94bf0f2 100644 --- a/python/cudf/cudf/core/scalar.py +++ b/python/cudf/cudf/core/scalar.py @@ -17,7 +17,7 @@ ) -class Scalar(object): +class Scalar: """ A GPU-backed scalar object with NumPy scalar like properties May be used in binary operations against other scalars, cuDF diff --git a/python/cudf/cudf/core/series.py b/python/cudf/cudf/core/series.py index 12a2538b776..5823ea18d1b 100644 --- a/python/cudf/cudf/core/series.py +++ b/python/cudf/cudf/core/series.py @@ -167,7 +167,7 @@ def __getitem__(self, arg: Any) -> Union[ScalarLike, DataFrameOrSeries]: if ( isinstance(arg, tuple) and len(arg) == self._frame._index.nlevels - and not any((isinstance(x, slice) for x in arg)) + and not any(isinstance(x, slice) for x in arg) ): result = result.iloc[0] return result @@ -953,52 +953,7 @@ def to_frame(self, name=None): return cudf.DataFrame({col: self._column}, index=self.index) def memory_usage(self, index=True, deep=False): - """ - Return the memory usage of the Series. - - The memory usage can optionally include the contribution of - the index and of elements of `object` dtype. - - Parameters - ---------- - index : bool, default True - Specifies whether to include the memory usage of the Series index. - deep : bool, default False - If True, introspect the data deeply by interrogating - `object` dtypes for system-level memory consumption, and include - it in the returned value. 
- - Returns - ------- - int - Bytes of memory consumed. - - See Also - -------- - cudf.DataFrame.memory_usage : Bytes consumed by - a DataFrame. - - Examples - -------- - >>> s = cudf.Series(range(3), index=['a','b','c']) - >>> s.memory_usage() - 43 - - Not including the index gives the size of the rest of the data, which - is necessarily smaller: - - >>> s.memory_usage(index=False) - 24 - """ - if deep: - warnings.warn( - "The deep parameter is ignored and is only included " - "for pandas compatibility." - ) - n = self._column.memory_usage() - if index: - n += self._index.memory_usage() - return n + return sum(super().memory_usage(index, deep).values()) def __array_ufunc__(self, ufunc, method, *inputs, **kwargs): if method == "__call__": @@ -2722,42 +2677,6 @@ def unique(self): res = self._column.unique() return Series(res, name=self.name) - def nunique(self, method="sort", dropna=True): - """Returns the number of unique values of the Series: approximate version, - and exact version to be moved to libcudf - - Excludes NA values by default. - - Parameters - ---------- - dropna : bool, default True - Don't include NA values in the count. 
- - Returns - ------- - int - - Examples - -------- - >>> import cudf - >>> s = cudf.Series([1, 3, 5, 7, 7]) - >>> s - 0 1 - 1 3 - 2 5 - 3 7 - 4 7 - dtype: int64 - >>> s.nunique() - 4 - """ - if method != "sort": - msg = "non sort based distinct_count() not implemented yet" - raise NotImplementedError(msg) - if self.null_count == len(self): - return 0 - return super().nunique(method, dropna) - def value_counts( self, normalize=False, @@ -2969,7 +2888,7 @@ def _prepare_percentiles(percentiles): return percentiles def _format_percentile_names(percentiles): - return ["{0}%".format(int(x * 100)) for x in percentiles] + return [f"{int(x * 100)}%" for x in percentiles] def _format_stats_values(stats_data): return map(lambda x: round(x, 6), stats_data) @@ -3071,7 +2990,7 @@ def _describe_timestamp(self): .to_numpy(na_value=np.nan), ) ), - "max": str(pd.Timestamp((self.max()))), + "max": str(pd.Timestamp(self.max())), } return Series( @@ -3327,6 +3246,11 @@ def merge( method="hash", suffixes=("_x", "_y"), ): + warnings.warn( + "Series.merge is deprecated and will be removed in a future " + "release. Use cudf.merge instead.", + FutureWarning, + ) if left_on not in (self.name, None): raise ValueError( "Series to other merge uses series name as key implicitly" @@ -3550,7 +3474,7 @@ def wrapper(self, other, level=None, fill_value=None, axis=0): setattr(Series, binop, make_binop_func(binop)) -class DatetimeProperties(object): +class DatetimeProperties: """ Accessor object for datetimelike properties of the Series values. @@ -4492,7 +4416,7 @@ def strftime(self, date_format, *args, **kwargs): ) -class TimedeltaProperties(object): +class TimedeltaProperties: """ Accessor object for timedeltalike properties of the Series values. 
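The `_format_percentile_names` hunk above (and many test-suite hunks later in the patch) replaces `str.format` with f-strings; both render identically, the f-string just keeps the expression inline at the call site:

```python
percentiles = [0.25, 0.5, 0.75]

# Older idiom, as removed by the patch.
old = ["{0}%".format(int(x * 100)) for x in percentiles]

# f-string form, as added.
new = [f"{int(x * 100)}%" for x in percentiles]

assert old == new == ["25%", "50%", "75%"]
```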
diff --git a/python/cudf/cudf/core/single_column_frame.py b/python/cudf/cudf/core/single_column_frame.py index ef479f19363..bf867923b57 100644 --- a/python/cudf/cudf/core/single_column_frame.py +++ b/python/cudf/cudf/core/single_column_frame.py @@ -15,11 +15,12 @@ from cudf.api.types import _is_scalar_or_zero_d_array from cudf.core.column import ColumnBase, as_column from cudf.core.frame import Frame +from cudf.utils.utils import NotIterable T = TypeVar("T", bound="Frame") -class SingleColumnFrame(Frame): +class SingleColumnFrame(Frame, NotIterable): """A one-dimensional frame. Frames with only a single column share certain logic that is encoded in @@ -85,12 +86,6 @@ def shape(self): """Get a tuple representing the dimensionality of the Index.""" return (len(self),) - def __iter__(self): - # Iterating over a GPU object is not efficient and hence not supported. - # Consider using ``.to_arrow()``, ``.to_pandas()`` or ``.values_host`` - # if you wish to iterate over the values. - cudf.utils.utils.raise_iteration_error(obj=self) - def __bool__(self): raise TypeError( f"The truth value of a {type(self)} is ambiguous. Use " @@ -343,4 +338,6 @@ def nunique(self, method: builtins.str = "sort", dropna: bool = True): int Number of unique values in the column. 
""" + if self._column.null_count == len(self): + return 0 return self._column.distinct_count(method=method, dropna=dropna) diff --git a/python/cudf/cudf/core/udf/typing.py b/python/cudf/cudf/core/udf/typing.py index da7ff4c0e32..56e8bec74dc 100644 --- a/python/cudf/cudf/core/udf/typing.py +++ b/python/cudf/cudf/core/udf/typing.py @@ -133,8 +133,8 @@ def typeof_masked(val, c): class MaskedConstructor(ConcreteTemplate): key = api.Masked units = ["ns", "ms", "us", "s"] - datetime_cases = set(types.NPDatetime(u) for u in units) - timedelta_cases = set(types.NPTimedelta(u) for u in units) + datetime_cases = {types.NPDatetime(u) for u in units} + timedelta_cases = {types.NPTimedelta(u) for u in units} cases = [ nb_signature(MaskedType(t), t, types.boolean) for t in ( diff --git a/python/cudf/cudf/datasets.py b/python/cudf/cudf/datasets.py index 2341a5c23b9..d7a2fedef59 100644 --- a/python/cudf/cudf/datasets.py +++ b/python/cudf/cudf/datasets.py @@ -57,9 +57,7 @@ def timeseries( pd.date_range(start, end, freq=freq, name="timestamp") ) state = np.random.RandomState(seed) - columns = dict( - (k, make[dt](len(index), state)) for k, dt in dtypes.items() - ) + columns = {k: make[dt](len(index), state) for k, dt in dtypes.items()} df = pd.DataFrame(columns, index=index, columns=sorted(columns)) if df.index[-1] == end: df = df.iloc[:-1] @@ -110,7 +108,7 @@ def randomdata(nrows=10, dtypes=None, seed=None): if dtypes is None: dtypes = {"id": int, "x": float, "y": float} state = np.random.RandomState(seed) - columns = dict((k, make[dt](nrows, state)) for k, dt in dtypes.items()) + columns = {k: make[dt](nrows, state) for k, dt in dtypes.items()} df = pd.DataFrame(columns, columns=sorted(columns)) return cudf.from_pandas(df) diff --git a/python/cudf/cudf/tests/test_api_types.py b/python/cudf/cudf/tests/test_api_types.py index 4d104c122d1..e7cf113f604 100644 --- a/python/cudf/cudf/tests/test_api_types.py +++ b/python/cudf/cudf/tests/test_api_types.py @@ -17,9 +17,7 @@ (int(), False), 
(float(), False), (complex(), False), - (str(), False), ("", False), - (r"", False), (object(), False), # Base Python types. (bool, False), @@ -128,9 +126,7 @@ def test_is_categorical_dtype(obj, expect): (int(), False), (float(), False), (complex(), False), - (str(), False), ("", False), - (r"", False), (object(), False), # Base Python types. (bool, True), @@ -235,9 +231,7 @@ def test_is_numeric_dtype(obj, expect): (int(), False), (float(), False), (complex(), False), - (str(), False), ("", False), - (r"", False), (object(), False), # Base Python types. (bool, False), @@ -342,9 +336,7 @@ def test_is_integer_dtype(obj, expect): (int(), True), (float(), False), (complex(), False), - (str(), False), ("", False), - (r"", False), (object(), False), # Base Python types. (bool, False), @@ -450,9 +442,7 @@ def test_is_integer(obj, expect): (int(), False), (float(), False), (complex(), False), - (str(), False), ("", False), - (r"", False), (object(), False), # Base Python types. (bool, False), @@ -557,9 +547,7 @@ def test_is_string_dtype(obj, expect): (int(), False), (float(), False), (complex(), False), - (str(), False), ("", False), - (r"", False), (object(), False), # Base Python types. (bool, False), @@ -664,9 +652,7 @@ def test_is_datetime_dtype(obj, expect): (int(), False), (float(), False), (complex(), False), - (str(), False), ("", False), - (r"", False), (object(), False), # Base Python types. (bool, False), @@ -771,9 +757,7 @@ def test_is_list_dtype(obj, expect): (int(), False), (float(), False), (complex(), False), - (str(), False), ("", False), - (r"", False), (object(), False), # Base Python types. (bool, False), @@ -881,9 +865,7 @@ def test_is_struct_dtype(obj, expect): (int(), False), (float(), False), (complex(), False), - (str(), False), ("", False), - (r"", False), (object(), False), # Base Python types. (bool, False), @@ -988,9 +970,7 @@ def test_is_decimal_dtype(obj, expect): int(), float(), complex(), - str(), "", - r"", object(), # Base Python types. 
bool, @@ -1070,9 +1050,7 @@ def test_pandas_agreement(obj): int(), float(), complex(), - str(), "", - r"", object(), # Base Python types. bool, diff --git a/python/cudf/cudf/tests/test_binops.py b/python/cudf/cudf/tests/test_binops.py index 921f2de38c2..76add8b9c5d 100644 --- a/python/cudf/cudf/tests/test_binops.py +++ b/python/cudf/cudf/tests/test_binops.py @@ -1,6 +1,5 @@ # Copyright (c) 2018-2022, NVIDIA CORPORATION. -from __future__ import division import decimal import operator diff --git a/python/cudf/cudf/tests/test_copying.py b/python/cudf/cudf/tests/test_copying.py index 21a6a9172db..0d0ba579f22 100644 --- a/python/cudf/cudf/tests/test_copying.py +++ b/python/cudf/cudf/tests/test_copying.py @@ -1,5 +1,3 @@ -from __future__ import division, print_function - import numpy as np import pandas as pd import pytest diff --git a/python/cudf/cudf/tests/test_cuda_apply.py b/python/cudf/cudf/tests/test_cuda_apply.py index a00dbbba5f0..e8bd64b5061 100644 --- a/python/cudf/cudf/tests/test_cuda_apply.py +++ b/python/cudf/cudf/tests/test_cuda_apply.py @@ -98,7 +98,7 @@ def kernel(in1, in2, in3, out1, out2, extra1, extra2): expect_out1 = extra2 * in1 - extra1 * in2 + in3 expect_out2 = np.hstack( - np.arange((e - s)) for s, e in zip(chunks, chunks[1:] + [len(df)]) + np.arange(e - s) for s, e in zip(chunks, chunks[1:] + [len(df)]) ) outdf = df.apply_chunks( @@ -141,8 +141,7 @@ def kernel(in1, in2, in3, out1, out2, extra1, extra2): expect_out1 = extra2 * in1 - extra1 * in2 + in3 expect_out2 = np.hstack( - tpb * np.arange((e - s)) - for s, e in zip(chunks, chunks[1:] + [len(df)]) + tpb * np.arange(e - s) for s, e in zip(chunks, chunks[1:] + [len(df)]) ) outdf = df.apply_chunks( diff --git a/python/cudf/cudf/tests/test_dataframe.py b/python/cudf/cudf/tests/test_dataframe.py index ba2caf7c6c8..5022f1a675b 100644 --- a/python/cudf/cudf/tests/test_dataframe.py +++ b/python/cudf/cudf/tests/test_dataframe.py @@ -246,17 +246,15 @@ def test_series_init_none(): sr1 = cudf.Series() got 
= sr1.to_string() - expect = sr1.to_pandas().__repr__() - # values should match despite whitespace difference - assert got.split() == expect.split() + expect = repr(sr1.to_pandas()) + assert got == expect # 2: Using `None` as an initializer sr2 = cudf.Series(None) got = sr2.to_string() - expect = sr2.to_pandas().__repr__() - # values should match despite whitespace difference - assert got.split() == expect.split() + expect = repr(sr2.to_pandas()) + assert got == expect def test_dataframe_basic(): @@ -843,21 +841,20 @@ def test_dataframe_to_string_with_masked_data(): def test_dataframe_to_string_wide(monkeypatch): monkeypatch.setenv("COLUMNS", "79") # Test basic - df = cudf.DataFrame() - for i in range(100): - df["a{}".format(i)] = list(range(3)) - pd.options.display.max_columns = 0 - got = df.to_string() + df = cudf.DataFrame({f"a{i}": [0, 1, 2] for i in range(100)}) + with pd.option_context("display.max_columns", 0): + got = df.to_string() - expect = """ - a0 a1 a2 a3 a4 a5 a6 a7 ... a92 a93 a94 a95 a96 a97 a98 a99 -0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 -1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 -2 2 2 2 2 2 2 2 2 ... 2 2 2 2 2 2 2 2 -[3 rows x 100 columns] -""" - # values should match despite whitespace difference - assert got.split() == expect.split() + expect = textwrap.dedent( + """\ + a0 a1 a2 a3 a4 a5 a6 a7 ... a92 a93 a94 a95 a96 a97 a98 a99 + 0 0 0 0 0 0 0 0 0 ... 0 0 0 0 0 0 0 0 + 1 1 1 1 1 1 1 1 1 ... 1 1 1 1 1 1 1 1 + 2 2 2 2 2 2 2 2 2 ... 
2 2 2 2 2 2 2 2 + + [3 rows x 100 columns]""" # noqa: E501 + ) + assert got == expect def test_dataframe_empty_to_string(): @@ -865,9 +862,8 @@ def test_dataframe_empty_to_string(): df = cudf.DataFrame() got = df.to_string() - expect = "Empty DataFrame\nColumns: []\nIndex: []\n" - # values should match despite whitespace difference - assert got.split() == expect.split() + expect = "Empty DataFrame\nColumns: []\nIndex: []" + assert got == expect def test_dataframe_emptycolumns_to_string(): @@ -877,9 +873,8 @@ def test_dataframe_emptycolumns_to_string(): df["b"] = [] got = df.to_string() - expect = "Empty DataFrame\nColumns: [a, b]\nIndex: []\n" - # values should match despite whitespace difference - assert got.split() == expect.split() + expect = "Empty DataFrame\nColumns: [a, b]\nIndex: []" + assert got == expect def test_dataframe_copy(): @@ -890,14 +885,14 @@ def test_dataframe_copy(): df2["b"] = [4, 5, 6] got = df.to_string() - expect = """ - a -0 1 -1 2 -2 3 -""" - # values should match despite whitespace difference - assert got.split() == expect.split() + expect = textwrap.dedent( + """\ + a + 0 1 + 1 2 + 2 3""" + ) + assert got == expect def test_dataframe_copy_shallow(): @@ -908,14 +903,14 @@ def test_dataframe_copy_shallow(): df2["b"] = [4, 2, 3] got = df.to_string() - expect = """ - a -0 1 -1 2 -2 3 -""" - # values should match despite whitespace difference - assert got.split() == expect.split() + expect = textwrap.dedent( + """\ + a + 0 1 + 1 2 + 2 3""" + ) + assert got == expect def test_dataframe_dtypes(): @@ -1163,7 +1158,7 @@ def test_dataframe_hash_partition(nrows, nparts, nkeys): gdf = cudf.DataFrame() keycols = [] for i in range(nkeys): - keyname = "key{}".format(i) + keyname = f"key{i}" gdf[keyname] = np.random.randint(0, 7 - i, nrows) keycols.append(keyname) gdf["val1"] = np.random.randint(0, nrows * 2, nrows) diff --git a/python/cudf/cudf/tests/test_factorize.py b/python/cudf/cudf/tests/test_factorize.py index 1f16686a6a6..3081b7c4a6e 100644 --- 
a/python/cudf/cudf/tests/test_factorize.py +++ b/python/cudf/cudf/tests/test_factorize.py @@ -23,7 +23,7 @@ def test_factorize_series_obj(ncats, nelem): assert isinstance(uvals, cp.ndarray) assert isinstance(labels, Index) - encoder = dict((labels[idx], idx) for idx in range(len(labels))) + encoder = {labels[idx]: idx for idx in range(len(labels))} handcoded = [encoder[v] for v in arr] np.testing.assert_array_equal(uvals.get(), handcoded) @@ -42,7 +42,7 @@ def test_factorize_index_obj(ncats, nelem): assert isinstance(uvals, cp.ndarray) assert isinstance(labels, Index) - encoder = dict((labels[idx], idx) for idx in range(len(labels))) + encoder = {labels[idx]: idx for idx in range(len(labels))} handcoded = [encoder[v] for v in arr] np.testing.assert_array_equal(uvals.get(), handcoded) diff --git a/python/cudf/cudf/tests/test_gcs.py b/python/cudf/cudf/tests/test_gcs.py index db53529b22f..307232b1305 100644 --- a/python/cudf/cudf/tests/test_gcs.py +++ b/python/cudf/cudf/tests/test_gcs.py @@ -48,14 +48,14 @@ def mock_size(*args): # use_python_file_object=True, because the pyarrow # `open_input_file` command will fail (since it doesn't # use the monkey-patched `open` definition) - got = cudf.read_csv("gcs://{}".format(fpath), use_python_file_object=False) + got = cudf.read_csv(f"gcs://{fpath}", use_python_file_object=False) assert_eq(pdf, got) # AbstractBufferedFile -> PythonFile conversion # will work fine with the monkey-patched FS if we # pass in an fsspec file object fs = gcsfs.core.GCSFileSystem() - with fs.open("gcs://{}".format(fpath)) as f: + with fs.open(f"gcs://{fpath}") as f: got = cudf.read_csv(f) assert_eq(pdf, got) @@ -69,7 +69,7 @@ def mock_open(*args, **kwargs): return open(local_filepath, "wb") monkeypatch.setattr(gcsfs.core.GCSFileSystem, "open", mock_open) - gdf.to_orc("gcs://{}".format(gcs_fname)) + gdf.to_orc(f"gcs://{gcs_fname}") got = pa.orc.ORCFile(local_filepath).read().to_pandas() assert_eq(pdf, got) diff --git 
a/python/cudf/cudf/tests/test_groupby.py b/python/cudf/cudf/tests/test_groupby.py index f5decd62ea9..61c7d1958a0 100644 --- a/python/cudf/cudf/tests/test_groupby.py +++ b/python/cudf/cudf/tests/test_groupby.py @@ -84,11 +84,6 @@ def make_frame( return df -def get_nelem(): - for elem in [2, 3, 1000]: - yield elem - - @pytest.fixture def gdf(): return DataFrame({"x": [1, 2, 3], "y": [0, 1, 1]}) @@ -1096,7 +1091,7 @@ def test_groupby_cumcount(): ) -@pytest.mark.parametrize("nelem", get_nelem()) +@pytest.mark.parametrize("nelem", [2, 3, 1000]) @pytest.mark.parametrize("as_index", [True, False]) @pytest.mark.parametrize( "agg", ["min", "max", "idxmin", "idxmax", "mean", "count"] diff --git a/python/cudf/cudf/tests/test_hdfs.py b/python/cudf/cudf/tests/test_hdfs.py index 24554f113bb..2d61d6693cb 100644 --- a/python/cudf/cudf/tests/test_hdfs.py +++ b/python/cudf/cudf/tests/test_hdfs.py @@ -62,7 +62,7 @@ def test_read_csv(tmpdir, pdf, hdfs, test_url): host, port, basedir ) else: - hd_fpath = "hdfs://{}/test_csv_reader.csv".format(basedir) + hd_fpath = f"hdfs://{basedir}/test_csv_reader.csv" got = cudf.read_csv(hd_fpath) @@ -81,7 +81,7 @@ def test_write_csv(pdf, hdfs, test_url): host, port, basedir ) else: - hd_fpath = "hdfs://{}/test_csv_writer.csv".format(basedir) + hd_fpath = f"hdfs://{basedir}/test_csv_writer.csv" gdf.to_csv(hd_fpath, index=False) @@ -107,7 +107,7 @@ def test_read_parquet(tmpdir, pdf, hdfs, test_url): host, port, basedir ) else: - hd_fpath = "hdfs://{}/test_parquet_reader.parquet".format(basedir) + hd_fpath = f"hdfs://{basedir}/test_parquet_reader.parquet" got = cudf.read_parquet(hd_fpath) @@ -126,7 +126,7 @@ def test_write_parquet(pdf, hdfs, test_url): host, port, basedir ) else: - hd_fpath = "hdfs://{}/test_parquet_writer.parquet".format(basedir) + hd_fpath = f"hdfs://{basedir}/test_parquet_writer.parquet" gdf.to_parquet(hd_fpath) @@ -153,7 +153,7 @@ def test_write_parquet_partitioned(tmpdir, pdf, hdfs, test_url): host, port, basedir ) else: - 
hd_fpath = "hdfs://{}/test_parquet_partitioned.parquet".format(basedir) + hd_fpath = f"hdfs://{basedir}/test_parquet_partitioned.parquet" # Clear data written from previous runs hdfs.rm(f"{basedir}/test_parquet_partitioned.parquet", recursive=True) gdf.to_parquet( @@ -186,7 +186,7 @@ def test_read_json(tmpdir, pdf, hdfs, test_url): host, port, basedir ) else: - hd_fpath = "hdfs://{}/test_json_reader.json".format(basedir) + hd_fpath = f"hdfs://{basedir}/test_json_reader.json" got = cudf.read_json(hd_fpath, engine="cudf", orient="records", lines=True) @@ -207,9 +207,9 @@ def test_read_orc(datadir, hdfs, test_url): hdfs.upload(basedir + "/file.orc", buffer) if test_url: - hd_fpath = "hdfs://{}:{}{}/file.orc".format(host, port, basedir) + hd_fpath = f"hdfs://{host}:{port}{basedir}/file.orc" else: - hd_fpath = "hdfs://{}/file.orc".format(basedir) + hd_fpath = f"hdfs://{basedir}/file.orc" got = cudf.read_orc(hd_fpath) expect = orc.ORCFile(buffer).read().to_pandas() @@ -226,7 +226,7 @@ def test_write_orc(pdf, hdfs, test_url): host, port, basedir ) else: - hd_fpath = "hdfs://{}/test_orc_writer.orc".format(basedir) + hd_fpath = f"hdfs://{basedir}/test_orc_writer.orc" gdf.to_orc(hd_fpath) @@ -247,9 +247,9 @@ def test_read_avro(datadir, hdfs, test_url): hdfs.upload(basedir + "/file.avro", buffer) if test_url: - hd_fpath = "hdfs://{}:{}{}/file.avro".format(host, port, basedir) + hd_fpath = f"hdfs://{host}:{port}{basedir}/file.avro" else: - hd_fpath = "hdfs://{}/file.avro".format(basedir) + hd_fpath = f"hdfs://{basedir}/file.avro" got = cudf.read_avro(hd_fpath) with open(fname, mode="rb") as f: @@ -270,7 +270,7 @@ def test_storage_options(tmpdir, pdf, hdfs): # Write to hdfs hdfs.upload(basedir + "/file.csv", buffer) - hd_fpath = "hdfs://{}/file.csv".format(basedir) + hd_fpath = f"hdfs://{basedir}/file.csv" storage_options = {"host": host, "port": port} @@ -293,7 +293,7 @@ def test_storage_options_error(tmpdir, pdf, hdfs): # Write to hdfs hdfs.upload(basedir + "/file.csv", 
buffer) - hd_fpath = "hdfs://{}:{}{}/file.avro".format(host, port, basedir) + hd_fpath = f"hdfs://{host}:{port}{basedir}/file.avro" storage_options = {"host": host, "port": port} diff --git a/python/cudf/cudf/tests/test_query.py b/python/cudf/cudf/tests/test_query.py index 3de38b2cf6f..09129a43f07 100644 --- a/python/cudf/cudf/tests/test_query.py +++ b/python/cudf/cudf/tests/test_query.py @@ -1,6 +1,5 @@ # Copyright (c) 2018, NVIDIA CORPORATION. -from __future__ import division, print_function import datetime import inspect diff --git a/python/cudf/cudf/tests/test_reductions.py b/python/cudf/cudf/tests/test_reductions.py index 40add502309..7106ab54686 100644 --- a/python/cudf/cudf/tests/test_reductions.py +++ b/python/cudf/cudf/tests/test_reductions.py @@ -1,6 +1,5 @@ # Copyright (c) 2020-2022, NVIDIA CORPORATION. -from __future__ import division, print_function import re from decimal import Decimal diff --git a/python/cudf/cudf/tests/test_s3.py b/python/cudf/cudf/tests/test_s3.py index da1ffc1fc16..4807879a730 100644 --- a/python/cudf/cudf/tests/test_s3.py +++ b/python/cudf/cudf/tests/test_s3.py @@ -147,7 +147,7 @@ def test_read_csv(s3_base, s3so, pdf, bytes_per_thread): # Use fsspec file object with s3_context(s3_base=s3_base, bucket=bname, files={fname: buffer}): got = cudf.read_csv( - "s3://{}/{}".format(bname, fname), + f"s3://{bname}/{fname}", storage_options=s3so, bytes_per_thread=bytes_per_thread, use_python_file_object=False, @@ -157,7 +157,7 @@ def test_read_csv(s3_base, s3so, pdf, bytes_per_thread): # Use Arrow PythonFile object with s3_context(s3_base=s3_base, bucket=bname, files={fname: buffer}): got = cudf.read_csv( - "s3://{}/{}".format(bname, fname), + f"s3://{bname}/{fname}", storage_options=s3so, bytes_per_thread=bytes_per_thread, use_python_file_object=True, @@ -174,7 +174,7 @@ def test_read_csv_arrow_nativefile(s3_base, s3so, pdf): fs = pa_fs.S3FileSystem( endpoint_override=s3so["client_kwargs"]["endpoint_url"], ) - with 
fs.open_input_file("{}/{}".format(bname, fname)) as fil: + with fs.open_input_file(f"{bname}/{fname}") as fil: got = cudf.read_csv(fil) assert_eq(pdf, got) @@ -193,7 +193,7 @@ def test_read_csv_byte_range( # Use fsspec file object with s3_context(s3_base=s3_base, bucket=bname, files={fname: buffer}): got = cudf.read_csv( - "s3://{}/{}".format(bname, fname), + f"s3://{bname}/{fname}", storage_options=s3so, byte_range=(74, 73), bytes_per_thread=bytes_per_thread, @@ -213,15 +213,15 @@ def test_write_csv(s3_base, s3so, pdf, chunksize): gdf = cudf.from_pandas(pdf) with s3_context(s3_base=s3_base, bucket=bname) as s3fs: gdf.to_csv( - "s3://{}/{}".format(bname, fname), + f"s3://{bname}/{fname}", index=False, chunksize=chunksize, storage_options=s3so, ) - assert s3fs.exists("s3://{}/{}".format(bname, fname)) + assert s3fs.exists(f"s3://{bname}/{fname}") # TODO: Update to use `storage_options` from pandas v1.2.0 - got = pd.read_csv(s3fs.open("s3://{}/{}".format(bname, fname))) + got = pd.read_csv(s3fs.open(f"s3://{bname}/{fname}")) assert_eq(pdf, got) @@ -248,7 +248,7 @@ def test_read_parquet( buffer.seek(0) with s3_context(s3_base=s3_base, bucket=bname, files={fname: buffer}): got1 = cudf.read_parquet( - "s3://{}/{}".format(bname, fname), + f"s3://{bname}/{fname}", open_file_options=( {"precache_options": {"method": precache}} if use_python_file_object @@ -265,10 +265,10 @@ def test_read_parquet( # Check fsspec file-object handling buffer.seek(0) with s3_context(s3_base=s3_base, bucket=bname, files={fname: buffer}): - fs = get_fs_token_paths( - "s3://{}/{}".format(bname, fname), storage_options=s3so - )[0] - with fs.open("s3://{}/{}".format(bname, fname), mode="rb") as f: + fs = get_fs_token_paths(f"s3://{bname}/{fname}", storage_options=s3so)[ + 0 + ] + with fs.open(f"s3://{bname}/{fname}", mode="rb") as f: got2 = cudf.read_parquet( f, bytes_per_thread=bytes_per_thread, @@ -297,7 +297,7 @@ def test_read_parquet_ext( buffer.seek(0) with s3_context(s3_base=s3_base, 
bucket=bname, files={fname: buffer}): got1 = cudf.read_parquet( - "s3://{}/{}".format(bname, fname), + f"s3://{bname}/{fname}", storage_options=s3so, bytes_per_thread=bytes_per_thread, footer_sample_size=3200, @@ -326,7 +326,7 @@ def test_read_parquet_arrow_nativefile(s3_base, s3so, pdf, columns): fs = pa_fs.S3FileSystem( endpoint_override=s3so["client_kwargs"]["endpoint_url"], ) - with fs.open_input_file("{}/{}".format(bname, fname)) as fil: + with fs.open_input_file(f"{bname}/{fname}") as fil: got = cudf.read_parquet(fil, columns=columns) expect = pdf[columns] if columns else pdf @@ -343,7 +343,7 @@ def test_read_parquet_filters(s3_base, s3so, pdf_ext, precache): filters = [("String", "==", "Omega")] with s3_context(s3_base=s3_base, bucket=bname, files={fname: buffer}): got = cudf.read_parquet( - "s3://{}/{}".format(bname, fname), + f"s3://{bname}/{fname}", storage_options=s3so, filters=filters, open_file_options={"precache_options": {"method": precache}}, @@ -360,13 +360,13 @@ def test_write_parquet(s3_base, s3so, pdf, partition_cols): gdf = cudf.from_pandas(pdf) with s3_context(s3_base=s3_base, bucket=bname) as s3fs: gdf.to_parquet( - "s3://{}/{}".format(bname, fname), + f"s3://{bname}/{fname}", partition_cols=partition_cols, storage_options=s3so, ) - assert s3fs.exists("s3://{}/{}".format(bname, fname)) + assert s3fs.exists(f"s3://{bname}/{fname}") - got = pd.read_parquet(s3fs.open("s3://{}/{}".format(bname, fname))) + got = pd.read_parquet(s3fs.open(f"s3://{bname}/{fname}")) assert_eq(pdf, got) @@ -383,7 +383,7 @@ def test_read_json(s3_base, s3so): with s3_context(s3_base=s3_base, bucket=bname, files={fname: buffer}): got = cudf.read_json( - "s3://{}/{}".format(bname, fname), + f"s3://{bname}/{fname}", engine="cudf", orient="records", lines=True, @@ -407,7 +407,7 @@ def test_read_orc(s3_base, s3so, datadir, use_python_file_object, columns): with s3_context(s3_base=s3_base, bucket=bname, files={fname: buffer}): got = cudf.read_orc( - "s3://{}/{}".format(bname, 
fname), + f"s3://{bname}/{fname}", columns=columns, storage_options=s3so, use_python_file_object=use_python_file_object, @@ -432,7 +432,7 @@ def test_read_orc_arrow_nativefile(s3_base, s3so, datadir, columns): fs = pa_fs.S3FileSystem( endpoint_override=s3so["client_kwargs"]["endpoint_url"], ) - with fs.open_input_file("{}/{}".format(bname, fname)) as fil: + with fs.open_input_file(f"{bname}/{fname}") as fil: got = cudf.read_orc(fil, columns=columns) if columns: @@ -445,10 +445,10 @@ def test_write_orc(s3_base, s3so, pdf): bname = "orc" gdf = cudf.from_pandas(pdf) with s3_context(s3_base=s3_base, bucket=bname) as s3fs: - gdf.to_orc("s3://{}/{}".format(bname, fname), storage_options=s3so) - assert s3fs.exists("s3://{}/{}".format(bname, fname)) + gdf.to_orc(f"s3://{bname}/{fname}", storage_options=s3so) + assert s3fs.exists(f"s3://{bname}/{fname}") - with s3fs.open("s3://{}/{}".format(bname, fname)) as f: + with s3fs.open(f"s3://{bname}/{fname}") as f: got = pa.orc.ORCFile(f).read().to_pandas() assert_eq(pdf, got) diff --git a/python/cudf/cudf/tests/test_sorting.py b/python/cudf/cudf/tests/test_sorting.py index 00cd31e7539..10c3689fcd7 100644 --- a/python/cudf/cudf/tests/test_sorting.py +++ b/python/cudf/cudf/tests/test_sorting.py @@ -105,7 +105,7 @@ def test_series_argsort(nelem, dtype, asc): ) def test_series_sort_index(nelem, asc): np.random.seed(0) - sr = Series((100 * np.random.random(nelem))) + sr = Series(100 * np.random.random(nelem)) psr = sr.to_pandas() expected = psr.sort_index(ascending=asc) diff --git a/python/cudf/cudf/tests/test_text.py b/python/cudf/cudf/tests/test_text.py index a447a60c709..5ff66fc750f 100644 --- a/python/cudf/cudf/tests/test_text.py +++ b/python/cudf/cudf/tests/test_text.py @@ -763,7 +763,7 @@ def test_read_text(datadir): chess_file = str(datadir) + "/chess.pgn" delimiter = "1." 
- with open(chess_file, "r") as f: + with open(chess_file) as f: content = f.read().split(delimiter) # Since Python split removes the delimiter and read_text does diff --git a/python/cudf/cudf/tests/test_transform.py b/python/cudf/cudf/tests/test_transform.py index 021c4052759..bd7ee45fbf8 100644 --- a/python/cudf/cudf/tests/test_transform.py +++ b/python/cudf/cudf/tests/test_transform.py @@ -1,6 +1,5 @@ # Copyright (c) 2018-2020, NVIDIA CORPORATION. -from __future__ import division import numpy as np import pytest diff --git a/python/cudf/cudf/tests/test_udf_binops.py b/python/cudf/cudf/tests/test_udf_binops.py index c5cd8f8b717..173515509cd 100644 --- a/python/cudf/cudf/tests/test_udf_binops.py +++ b/python/cudf/cudf/tests/test_udf_binops.py @@ -1,5 +1,4 @@ # Copyright (c) 2018, NVIDIA CORPORATION. -from __future__ import division import numpy as np import pytest diff --git a/python/cudf/cudf/tests/test_unaops.py b/python/cudf/cudf/tests/test_unaops.py index e79b74e3aab..2e8da615e3e 100644 --- a/python/cudf/cudf/tests/test_unaops.py +++ b/python/cudf/cudf/tests/test_unaops.py @@ -1,5 +1,3 @@ -from __future__ import division - import itertools import operator import re diff --git a/python/cudf/cudf/utils/applyutils.py b/python/cudf/cudf/utils/applyutils.py index 3cbbc1e1ce7..593965046e6 100644 --- a/python/cudf/cudf/utils/applyutils.py +++ b/python/cudf/cudf/utils/applyutils.py @@ -125,7 +125,7 @@ def make_aggregate_nullmask(df, columns=None, op="and"): return out_mask -class ApplyKernelCompilerBase(object): +class ApplyKernelCompilerBase: def __init__( self, func, incols, outcols, kwargs, pessimistic_nulls, cache_key ): @@ -253,7 +253,7 @@ def row_wise_kernel({args}): srcidx.format(a=a, start=start, stop=stop, stride=stride) ) - body.append("inner({})".format(args)) + body.append(f"inner({args})") indented = ["{}{}".format(" " * 4, ln) for ln in body] # Finalize source @@ -309,7 +309,7 @@ def chunk_wise_kernel(nrows, chunks, {args}): slicedargs = {} for a in 
argnames: if a not in extras: - slicedargs[a] = "{}[start:stop]".format(a) + slicedargs[a] = f"{a}[start:stop]" else: slicedargs[a] = str(a) body.append( @@ -361,4 +361,4 @@ def _load_cache_or_make_chunk_wise_kernel(func, *args, **kwargs): def _mangle_user(name): """Mangle user variable name""" - return "__user_{}".format(name) + return f"__user_{name}" diff --git a/python/cudf/cudf/utils/cudautils.py b/python/cudf/cudf/utils/cudautils.py index f0533dcaa72..742c747ab69 100755 --- a/python/cudf/cudf/utils/cudautils.py +++ b/python/cudf/cudf/utils/cudautils.py @@ -218,7 +218,7 @@ def make_cache_key(udf, sig): codebytes = udf.__code__.co_code constants = udf.__code__.co_consts if udf.__closure__ is not None: - cvars = tuple([x.cell_contents for x in udf.__closure__]) + cvars = tuple(x.cell_contents for x in udf.__closure__) cvarbytes = dumps(cvars) else: cvarbytes = b"" diff --git a/python/cudf/cudf/utils/dtypes.py b/python/cudf/cudf/utils/dtypes.py index 44bbb1b493d..4cd1738996f 100644 --- a/python/cudf/cudf/utils/dtypes.py +++ b/python/cudf/cudf/utils/dtypes.py @@ -160,8 +160,8 @@ def numeric_normalize_types(*args): def _find_common_type_decimal(dtypes): # Find the largest scale and the largest difference between # precision and scale of the columns to be concatenated - s = max([dtype.scale for dtype in dtypes]) - lhs = max([dtype.precision - dtype.scale for dtype in dtypes]) + s = max(dtype.scale for dtype in dtypes) + lhs = max(dtype.precision - dtype.scale for dtype in dtypes) # Combine to get the necessary precision and clip at the maximum # precision p = s + lhs @@ -525,7 +525,7 @@ def find_common_type(dtypes): ) for dtype in dtypes ): - if len(set(dtype._categories.dtype for dtype in dtypes)) == 1: + if len({dtype._categories.dtype for dtype in dtypes}) == 1: return cudf.CategoricalDtype( cudf.core.column.concat_columns( [dtype._categories for dtype in dtypes] diff --git a/python/cudf/cudf/utils/hash_vocab_utils.py b/python/cudf/cudf/utils/hash_vocab_utils.py 
index 45004c5f107..11029cbfe5e 100644 --- a/python/cudf/cudf/utils/hash_vocab_utils.py +++ b/python/cudf/cudf/utils/hash_vocab_utils.py @@ -79,10 +79,8 @@ def _pick_initial_a_b(data, max_constant, init_bins): longest = _new_bin_length(_longest_bin_length(bins)) if score <= max_constant and longest <= MAX_SIZE_FOR_INITIAL_BIN: - print( - "Attempting to build table using {:.6f}n space".format(score) - ) - print("Longest bin was {}".format(longest)) + print(f"Attempting to build table using {score:.6f}n space") + print(f"Longest bin was {longest}") break return bins, a, b @@ -170,7 +168,7 @@ def _pack_keys_and_values(flattened_hash_table, original_dict): def _load_vocab_dict(path): vocab = {} - with open(path, mode="r", encoding="utf-8") as f: + with open(path, encoding="utf-8") as f: counter = 0 for line in f: vocab[line.strip()] = counter @@ -193,17 +191,17 @@ def _store_func( ): with open(out_name, mode="w+") as f: - f.write("{}\n".format(outer_a)) - f.write("{}\n".format(outer_b)) - f.write("{}\n".format(num_outer_bins)) + f.write(f"{outer_a}\n") + f.write(f"{outer_b}\n") + f.write(f"{num_outer_bins}\n") f.writelines( - "{} {}\n".format(coeff, offset) + f"{coeff} {offset}\n" for coeff, offset in zip(inner_table_coeffs, offsets_into_ht) ) - f.write("{}\n".format(len(hash_table))) - f.writelines("{}\n".format(kv) for kv in hash_table) + f.write(f"{len(hash_table)}\n") + f.writelines(f"{kv}\n" for kv in hash_table) f.writelines( - "{}\n".format(tok_id) + f"{tok_id}\n" for tok_id in [unk_tok_id, first_token_id, sep_token_id] ) @@ -295,6 +293,6 @@ def hash_vocab( ) assert ( val == value - ), "Incorrect value found. Got {} expected {}".format(val, value) + ), f"Incorrect value found. 
Got {val} expected {value}" print("All present tokens return correct value.") diff --git a/python/cudf/cudf/utils/queryutils.py b/python/cudf/cudf/utils/queryutils.py index d9153c2b1d2..64218ddf46a 100644 --- a/python/cudf/cudf/utils/queryutils.py +++ b/python/cudf/cudf/utils/queryutils.py @@ -136,7 +136,7 @@ def query_compile(expr): key "args" is a sequence of name of the arguments. """ - funcid = "queryexpr_{:x}".format(np.uintp(hash(expr))) + funcid = f"queryexpr_{np.uintp(hash(expr)):x}" # Load cache compiled = _cache.get(funcid) # Cache not found @@ -147,7 +147,7 @@ def query_compile(expr): # compile devicefn = cuda.jit(device=True)(fn) - kernelid = "kernel_{}".format(funcid) + kernelid = f"kernel_{funcid}" kernel = _wrap_query_expr(kernelid, devicefn, args) compiled = info.copy() @@ -173,10 +173,10 @@ def _add_idx(arg): if arg.startswith(ENVREF_PREFIX): return arg else: - return "{}[idx]".format(arg) + return f"{arg}[idx]" def _add_prefix(arg): - return "_args_{}".format(arg) + return f"_args_{arg}" glbls = {"queryfn": fn, "cuda": cuda} kernargs = map(_add_prefix, args) diff --git a/python/cudf/cudf/utils/utils.py b/python/cudf/cudf/utils/utils.py index add4ecd8f01..65a803d6768 100644 --- a/python/cudf/cudf/utils/utils.py +++ b/python/cudf/cudf/utils/utils.py @@ -204,12 +204,13 @@ def __getattr__(self, key): ) -def raise_iteration_error(obj): - raise TypeError( - f"{obj.__class__.__name__} object is not iterable. " - f"Consider using `.to_arrow()`, `.to_pandas()` or `.values_host` " - f"if you wish to iterate over the values." - ) +class NotIterable: + def __iter__(self): + raise TypeError( + f"{self.__class__.__name__} object is not iterable. " + f"Consider using `.to_arrow()`, `.to_pandas()` or `.values_host` " + f"if you wish to iterate over the values." 
+ ) def pa_mask_buffer_to_mask(mask_buf, size): diff --git a/python/cudf/setup.py b/python/cudf/setup.py index a8e14504469..e4e43bc1595 100644 --- a/python/cudf/setup.py +++ b/python/cudf/setup.py @@ -63,9 +63,7 @@ def get_cuda_version_from_header(cuda_include_dir, delimeter=""): cuda_version = None - with open( - os.path.join(cuda_include_dir, "cuda.h"), "r", encoding="utf-8" - ) as f: + with open(os.path.join(cuda_include_dir, "cuda.h"), encoding="utf-8") as f: for line in f.readlines(): if re.search(r"#define CUDA_VERSION ", line) is not None: cuda_version = line diff --git a/python/cudf_kafka/cudf_kafka/_version.py b/python/cudf_kafka/cudf_kafka/_version.py index 5ab5c72e457..6cd10cc10bf 100644 --- a/python/cudf_kafka/cudf_kafka/_version.py +++ b/python/cudf_kafka/cudf_kafka/_version.py @@ -86,7 +86,7 @@ def run_command( stderr=(subprocess.PIPE if hide_stderr else None), ) break - except EnvironmentError: + except OSError: e = sys.exc_info()[1] if e.errno == errno.ENOENT: continue @@ -96,7 +96,7 @@ def run_command( return None, None else: if verbose: - print("unable to find command, tried %s" % (commands,)) + print(f"unable to find command, tried {commands}") return None, None stdout = p.communicate()[0].strip() if sys.version_info[0] >= 3: @@ -149,7 +149,7 @@ def git_get_keywords(versionfile_abs): # _version.py. 
keywords = {} try: - f = open(versionfile_abs, "r") + f = open(versionfile_abs) for line in f.readlines(): if line.strip().startswith("git_refnames ="): mo = re.search(r'=\s*"(.*)"', line) @@ -164,7 +164,7 @@ def git_get_keywords(versionfile_abs): if mo: keywords["date"] = mo.group(1) f.close() - except EnvironmentError: + except OSError: pass return keywords @@ -188,11 +188,11 @@ def git_versions_from_keywords(keywords, tag_prefix, verbose): if verbose: print("keywords are unexpanded, not using") raise NotThisMethod("unexpanded keywords, not a git-archive tarball") - refs = set([r.strip() for r in refnames.strip("()").split(",")]) + refs = {r.strip() for r in refnames.strip("()").split(",")} # starting in git-1.8.3, tags are listed as "tag: foo-1.0" instead of # just "foo-1.0". If we see a "tag: " prefix, prefer those. TAG = "tag: " - tags = set([r[len(TAG) :] for r in refs if r.startswith(TAG)]) + tags = {r[len(TAG) :] for r in refs if r.startswith(TAG)} if not tags: # Either we're using git < 1.8.3, or there really are no tags. We use # a heuristic: assume all version tags have a digit. The old git %d @@ -201,7 +201,7 @@ def git_versions_from_keywords(keywords, tag_prefix, verbose): # between branches and tags. By ignoring refnames without digits, we # filter out many common branch names like "release" and # "stabilization", as well as "HEAD" and "master". 
- tags = set([r for r in refs if re.search(r"\d", r)]) + tags = {r for r in refs if re.search(r"\d", r)} if verbose: print("discarding '%s', no digits" % ",".join(refs - tags)) if verbose: @@ -308,10 +308,9 @@ def git_pieces_from_vcs(tag_prefix, root, verbose, run_command=run_command): if verbose: fmt = "tag '%s' doesn't start with prefix '%s'" print(fmt % (full_tag, tag_prefix)) - pieces["error"] = "tag '%s' doesn't start with prefix '%s'" % ( - full_tag, - tag_prefix, - ) + pieces[ + "error" + ] = f"tag '{full_tag}' doesn't start with prefix '{tag_prefix}'" return pieces pieces["closest-tag"] = full_tag[len(tag_prefix) :] diff --git a/python/cudf_kafka/versioneer.py b/python/cudf_kafka/versioneer.py index 2260d5c2dcf..c7dbfd76734 100644 --- a/python/cudf_kafka/versioneer.py +++ b/python/cudf_kafka/versioneer.py @@ -275,7 +275,6 @@ """ -from __future__ import print_function import errno import json @@ -345,7 +344,7 @@ def get_config_from_root(root): # the top of versioneer.py for instructions on writing your setup.cfg . setup_cfg = os.path.join(root, "setup.cfg") parser = configparser.SafeConfigParser() - with open(setup_cfg, "r") as f: + with open(setup_cfg) as f: parser.readfp(f) VCS = parser.get("versioneer", "VCS") # mandatory @@ -407,7 +406,7 @@ def run_command( stderr=(subprocess.PIPE if hide_stderr else None), ) break - except EnvironmentError: + except OSError: e = sys.exc_info()[1] if e.errno == errno.ENOENT: continue @@ -417,7 +416,7 @@ def run_command( return None, None else: if verbose: - print("unable to find command, tried %s" % (commands,)) + print(f"unable to find command, tried {commands}") return None, None stdout = p.communicate()[0].strip() if sys.version_info[0] >= 3: @@ -964,7 +963,7 @@ def git_get_keywords(versionfile_abs): # _version.py. 
keywords = {} try: - f = open(versionfile_abs, "r") + f = open(versionfile_abs) for line in f.readlines(): if line.strip().startswith("git_refnames ="): mo = re.search(r'=\s*"(.*)"', line) @@ -979,7 +978,7 @@ def git_get_keywords(versionfile_abs): if mo: keywords["date"] = mo.group(1) f.close() - except EnvironmentError: + except OSError: pass return keywords @@ -1003,11 +1002,11 @@ def git_versions_from_keywords(keywords, tag_prefix, verbose): if verbose: print("keywords are unexpanded, not using") raise NotThisMethod("unexpanded keywords, not a git-archive tarball") - refs = set([r.strip() for r in refnames.strip("()").split(",")]) + refs = {r.strip() for r in refnames.strip("()").split(",")} # starting in git-1.8.3, tags are listed as "tag: foo-1.0" instead of # just "foo-1.0". If we see a "tag: " prefix, prefer those. TAG = "tag: " - tags = set([r[len(TAG) :] for r in refs if r.startswith(TAG)]) + tags = {r[len(TAG) :] for r in refs if r.startswith(TAG)} if not tags: # Either we're using git < 1.8.3, or there really are no tags. We use # a heuristic: assume all version tags have a digit. The old git %d @@ -1016,7 +1015,7 @@ def git_versions_from_keywords(keywords, tag_prefix, verbose): # between branches and tags. By ignoring refnames without digits, we # filter out many common branch names like "release" and # "stabilization", as well as "HEAD" and "master". 
- tags = set([r for r in refs if re.search(r"\d", r)]) + tags = {r for r in refs if re.search(r"\d", r)} if verbose: print("discarding '%s', no digits" % ",".join(refs - tags)) if verbose: @@ -1123,9 +1122,8 @@ def git_pieces_from_vcs(tag_prefix, root, verbose, run_command=run_command): if verbose: fmt = "tag '%s' doesn't start with prefix '%s'" print(fmt % (full_tag, tag_prefix)) - pieces["error"] = "tag '%s' doesn't start with prefix '%s'" % ( - full_tag, - tag_prefix, + pieces["error"] = "tag '{}' doesn't start with prefix '{}'".format( + full_tag, tag_prefix, ) return pieces pieces["closest-tag"] = full_tag[len(tag_prefix) :] @@ -1175,13 +1173,13 @@ def do_vcs_install(manifest_in, versionfile_source, ipy): files.append(versioneer_file) present = False try: - f = open(".gitattributes", "r") + f = open(".gitattributes") for line in f.readlines(): if line.strip().startswith(versionfile_source): if "export-subst" in line.strip().split()[1:]: present = True f.close() - except EnvironmentError: + except OSError: pass if not present: f = open(".gitattributes", "a+") @@ -1245,7 +1243,7 @@ def versions_from_file(filename): try: with open(filename) as f: contents = f.read() - except EnvironmentError: + except OSError: raise NotThisMethod("unable to read _version.py") mo = re.search( r"version_json = '''\n(.*)''' # END VERSION_JSON", @@ -1272,7 +1270,7 @@ def write_to_version_file(filename, versions): with open(filename, "w") as f: f.write(SHORT_VERSION_PY % contents) - print("set %s to '%s'" % (filename, versions["version"])) + print("set {} to '{}'".format(filename, versions["version"])) def plus_or_dot(pieces): @@ -1497,7 +1495,7 @@ def get_versions(verbose=False): try: ver = versions_from_file(versionfile_abs) if verbose: - print("got version from file %s %s" % (versionfile_abs, ver)) + print(f"got version from file {versionfile_abs} {ver}") return ver except NotThisMethod: pass @@ -1773,7 +1771,7 @@ def do_setup(): try: cfg = get_config_from_root(root) except ( - 
EnvironmentError, + OSError, configparser.NoSectionError, configparser.NoOptionError, ) as e: @@ -1803,9 +1801,9 @@ def do_setup(): ipy = os.path.join(os.path.dirname(cfg.versionfile_source), "__init__.py") if os.path.exists(ipy): try: - with open(ipy, "r") as f: + with open(ipy) as f: old = f.read() - except EnvironmentError: + except OSError: old = "" if INIT_PY_SNIPPET not in old: print(" appending to %s" % ipy) @@ -1824,12 +1822,12 @@ def do_setup(): manifest_in = os.path.join(root, "MANIFEST.in") simple_includes = set() try: - with open(manifest_in, "r") as f: + with open(manifest_in) as f: for line in f: if line.startswith("include "): for include in line.split()[1:]: simple_includes.add(include) - except EnvironmentError: + except OSError: pass # That doesn't cover everything MANIFEST.in can do # (http://docs.python.org/2/distutils/sourcedist.html#commands), so @@ -1863,7 +1861,7 @@ def scan_setup_py(): found = set() setters = False errors = 0 - with open("setup.py", "r") as f: + with open("setup.py") as f: for line in f.readlines(): if "import versioneer" in line: found.add("import") diff --git a/python/custreamz/custreamz/_version.py b/python/custreamz/custreamz/_version.py index a3409a06953..106fc3524f9 100644 --- a/python/custreamz/custreamz/_version.py +++ b/python/custreamz/custreamz/_version.py @@ -86,7 +86,7 @@ def run_command( stderr=(subprocess.PIPE if hide_stderr else None), ) break - except EnvironmentError: + except OSError: e = sys.exc_info()[1] if e.errno == errno.ENOENT: continue @@ -96,7 +96,7 @@ def run_command( return None, None else: if verbose: - print("unable to find command, tried %s" % (commands,)) + print(f"unable to find command, tried {commands}") return None, None stdout = p.communicate()[0].strip() if sys.version_info[0] >= 3: @@ -149,7 +149,7 @@ def git_get_keywords(versionfile_abs): # _version.py. 
keywords = {} try: - f = open(versionfile_abs, "r") + f = open(versionfile_abs) for line in f.readlines(): if line.strip().startswith("git_refnames ="): mo = re.search(r'=\s*"(.*)"', line) @@ -164,7 +164,7 @@ def git_get_keywords(versionfile_abs): if mo: keywords["date"] = mo.group(1) f.close() - except EnvironmentError: + except OSError: pass return keywords @@ -188,11 +188,11 @@ def git_versions_from_keywords(keywords, tag_prefix, verbose): if verbose: print("keywords are unexpanded, not using") raise NotThisMethod("unexpanded keywords, not a git-archive tarball") - refs = set([r.strip() for r in refnames.strip("()").split(",")]) + refs = {r.strip() for r in refnames.strip("()").split(",")} # starting in git-1.8.3, tags are listed as "tag: foo-1.0" instead of # just "foo-1.0". If we see a "tag: " prefix, prefer those. TAG = "tag: " - tags = set([r[len(TAG) :] for r in refs if r.startswith(TAG)]) + tags = {r[len(TAG) :] for r in refs if r.startswith(TAG)} if not tags: # Either we're using git < 1.8.3, or there really are no tags. We use # a heuristic: assume all version tags have a digit. The old git %d @@ -201,7 +201,7 @@ def git_versions_from_keywords(keywords, tag_prefix, verbose): # between branches and tags. By ignoring refnames without digits, we # filter out many common branch names like "release" and # "stabilization", as well as "HEAD" and "master". 
- tags = set([r for r in refs if re.search(r"\d", r)]) + tags = {r for r in refs if re.search(r"\d", r)} if verbose: print("discarding '%s', no digits" % ",".join(refs - tags)) if verbose: @@ -308,10 +308,9 @@ def git_pieces_from_vcs(tag_prefix, root, verbose, run_command=run_command): if verbose: fmt = "tag '%s' doesn't start with prefix '%s'" print(fmt % (full_tag, tag_prefix)) - pieces["error"] = "tag '%s' doesn't start with prefix '%s'" % ( - full_tag, - tag_prefix, - ) + pieces[ + "error" + ] = f"tag '{full_tag}' doesn't start with prefix '{tag_prefix}'" return pieces pieces["closest-tag"] = full_tag[len(tag_prefix) :] diff --git a/python/custreamz/custreamz/tests/test_dataframes.py b/python/custreamz/custreamz/tests/test_dataframes.py index 24f6e46f6c5..a7378408c24 100644 --- a/python/custreamz/custreamz/tests/test_dataframes.py +++ b/python/custreamz/custreamz/tests/test_dataframes.py @@ -4,7 +4,6 @@ Tests for Streamz Dataframes (SDFs) built on top of cuDF DataFrames. *** Borrowed from streamz.dataframe.tests | License at thirdparty/LICENSE *** """ -from __future__ import division, print_function import json import operator diff --git a/python/dask_cudf/dask_cudf/_version.py b/python/dask_cudf/dask_cudf/_version.py index 8ca2cf98381..104879fce36 100644 --- a/python/dask_cudf/dask_cudf/_version.py +++ b/python/dask_cudf/dask_cudf/_version.py @@ -86,7 +86,7 @@ def run_command( stderr=(subprocess.PIPE if hide_stderr else None), ) break - except EnvironmentError: + except OSError: e = sys.exc_info()[1] if e.errno == errno.ENOENT: continue @@ -96,7 +96,7 @@ def run_command( return None, None else: if verbose: - print("unable to find command, tried %s" % (commands,)) + print(f"unable to find command, tried {commands}") return None, None stdout = p.communicate()[0].strip() if sys.version_info[0] >= 3: @@ -149,7 +149,7 @@ def git_get_keywords(versionfile_abs): # _version.py. 
keywords = {} try: - f = open(versionfile_abs, "r") + f = open(versionfile_abs) for line in f.readlines(): if line.strip().startswith("git_refnames ="): mo = re.search(r'=\s*"(.*)"', line) @@ -164,7 +164,7 @@ def git_get_keywords(versionfile_abs): if mo: keywords["date"] = mo.group(1) f.close() - except EnvironmentError: + except OSError: pass return keywords @@ -188,11 +188,11 @@ def git_versions_from_keywords(keywords, tag_prefix, verbose): if verbose: print("keywords are unexpanded, not using") raise NotThisMethod("unexpanded keywords, not a git-archive tarball") - refs = set([r.strip() for r in refnames.strip("()").split(",")]) + refs = {r.strip() for r in refnames.strip("()").split(",")} # starting in git-1.8.3, tags are listed as "tag: foo-1.0" instead of # just "foo-1.0". If we see a "tag: " prefix, prefer those. TAG = "tag: " - tags = set([r[len(TAG) :] for r in refs if r.startswith(TAG)]) + tags = {r[len(TAG) :] for r in refs if r.startswith(TAG)} if not tags: # Either we're using git < 1.8.3, or there really are no tags. We use # a heuristic: assume all version tags have a digit. The old git %d @@ -201,7 +201,7 @@ def git_versions_from_keywords(keywords, tag_prefix, verbose): # between branches and tags. By ignoring refnames without digits, we # filter out many common branch names like "release" and # "stabilization", as well as "HEAD" and "master". 
- tags = set([r for r in refs if re.search(r"\d", r)]) + tags = {r for r in refs if re.search(r"\d", r)} if verbose: print("discarding '%s', no digits" % ",".join(refs - tags)) if verbose: @@ -308,10 +308,9 @@ def git_pieces_from_vcs(tag_prefix, root, verbose, run_command=run_command): if verbose: fmt = "tag '%s' doesn't start with prefix '%s'" print(fmt % (full_tag, tag_prefix)) - pieces["error"] = "tag '%s' doesn't start with prefix '%s'" % ( - full_tag, - tag_prefix, - ) + pieces[ + "error" + ] = f"tag '{full_tag}' doesn't start with prefix '{tag_prefix}'" return pieces pieces["closest-tag"] = full_tag[len(tag_prefix) :] diff --git a/python/dask_cudf/dask_cudf/core.py b/python/dask_cudf/dask_cudf/core.py index e191873f82b..729db6c232d 100644 --- a/python/dask_cudf/dask_cudf/core.py +++ b/python/dask_cudf/dask_cudf/core.py @@ -516,7 +516,7 @@ def _extract_meta(x): elif isinstance(x, list): return [_extract_meta(_x) for _x in x] elif isinstance(x, tuple): - return tuple([_extract_meta(_x) for _x in x]) + return tuple(_extract_meta(_x) for _x in x) elif isinstance(x, dict): return {k: _extract_meta(v) for k, v in x.items()} return x @@ -611,9 +611,7 @@ def reduction( if not isinstance(args, (tuple, list)): args = [args] - npartitions = set( - arg.npartitions for arg in args if isinstance(arg, _Frame) - ) + npartitions = {arg.npartitions for arg in args if isinstance(arg, _Frame)} if len(npartitions) > 1: raise ValueError("All arguments must have same number of partitions") npartitions = npartitions.pop() @@ -636,7 +634,7 @@ def reduction( ) # Chunk - a = "{0}-chunk-{1}".format(token or funcname(chunk), token_key) + a = f"{token or funcname(chunk)}-chunk-{token_key}" if len(args) == 1 and isinstance(args[0], _Frame) and not chunk_kwargs: dsk = { (a, 0, i): (chunk, key) @@ -654,7 +652,7 @@ def reduction( } # Combine - b = "{0}-combine-{1}".format(token or funcname(combine), token_key) + b = f"{token or funcname(combine)}-combine-{token_key}" k = npartitions depth = 0 
while k > split_every: @@ -670,7 +668,7 @@ depth += 1 # Aggregate - b = "{0}-agg-{1}".format(token or funcname(aggregate), token_key) + b = f"{token or funcname(aggregate)}-agg-{token_key}" conc = (list, [(a, depth, i) for i in range(k)]) if aggregate_kwargs: dsk[(b, 0)] = (apply, aggregate, [conc], aggregate_kwargs) diff --git a/python/dask_cudf/dask_cudf/io/orc.py b/python/dask_cudf/dask_cudf/io/orc.py index 00fc197da9b..2d326e41c3e 100644 --- a/python/dask_cudf/dask_cudf/io/orc.py +++ b/python/dask_cudf/dask_cudf/io/orc.py @@ -79,7 +79,7 @@ def read_orc(path, columns=None, filters=None, storage_options=None, **kwargs): ex = set(columns) - set(schema) if ex: raise ValueError( - "Requested columns (%s) not in schema (%s)" % (ex, set(schema)) + f"Requested columns ({ex}) not in schema ({set(schema)})" ) else: columns = list(schema) diff --git a/python/dask_cudf/dask_cudf/io/tests/test_parquet.py b/python/dask_cudf/dask_cudf/io/tests/test_parquet.py index 706b0e272ea..f5c1e53258e 100644 --- a/python/dask_cudf/dask_cudf/io/tests/test_parquet.py +++ b/python/dask_cudf/dask_cudf/io/tests/test_parquet.py @@ -40,12 +40,7 @@ def test_roundtrip_from_dask(tmpdir, stats): tmpdir = str(tmpdir) ddf.to_parquet(tmpdir, engine="pyarrow") files = sorted( - [ - os.path.join(tmpdir, f) - for f in os.listdir(tmpdir) - # TODO: Allow "_metadata" in list after dask#6047 - if not f.endswith("_metadata") - ], + (os.path.join(tmpdir, f) for f in os.listdir(tmpdir)), key=natural_sort_key, ) diff --git a/python/dask_cudf/setup.py b/python/dask_cudf/setup.py index 39491a45e7e..635f21fd906 100644 --- a/python/dask_cudf/setup.py +++ b/python/dask_cudf/setup.py @@ -33,9 +33,7 @@ def get_cuda_version_from_header(cuda_include_dir, delimeter=""): cuda_version = None - with open( - os.path.join(cuda_include_dir, "cuda.h"), "r", encoding="utf-8" - ) as f: + with open(os.path.join(cuda_include_dir, "cuda.h"), encoding="utf-8") as f: for line in f.readlines(): if re.search(r"#define 
CUDA_VERSION ", line) is not None: cuda_version = line
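One of the changes above (in `cudf/utils/utils.py`) replaces the module-level `raise_iteration_error` helper with a `NotIterable` mixin class. A minimal sketch of how that mixin behaves; the mixin body is taken from the diff, while `FakeSeries` is a hypothetical consumer used here only for illustration (cudf's real `Series`/`Index` classes inherit it alongside their other bases):

```python
class NotIterable:
    """Mixin that makes iteration fail loudly instead of silently iterating."""

    def __iter__(self):
        raise TypeError(
            f"{self.__class__.__name__} object is not iterable. "
            f"Consider using `.to_arrow()`, `.to_pandas()` or `.values_host` "
            f"if you wish to iterate over the values."
        )


# Hypothetical consumer: any class mixing in NotIterable raises on `for`/`iter`.
class FakeSeries(NotIterable):
    pass


try:
    list(FakeSeries())
except TypeError as exc:
    print(exc)
```

Because `__iter__` raises, the error names the concrete subclass (`FakeSeries` here), so each inheriting class gets an accurate message without duplicating the helper.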