# cuDF 23.02.00 (9 Feb 2023)
## 🚨 Breaking Changes
- Pin `dask` and `distributed` for release ([#12695](https://github.com/rapidsai/cudf/pull/12695)) [@galipremsagar](https://github.com/galipremsagar)
- Change ways to access `ptr` in `Buffer` ([#12587](https://github.com/rapidsai/cudf/pull/12587)) [@galipremsagar](https://github.com/galipremsagar)
- Remove column names ([#12578](https://github.com/rapidsai/cudf/pull/12578)) [@vuule](https://github.com/vuule)
- Default `cudf::io::read_json` to nested JSON parser ([#12544](https://github.com/rapidsai/cudf/pull/12544)) [@vuule](https://github.com/vuule)
- Switch `engine=cudf` to the new `JSON` reader ([#12509](https://github.com/rapidsai/cudf/pull/12509)) [@galipremsagar](https://github.com/galipremsagar)
- Add trailing comma support for nested JSON reader ([#12448](https://github.com/rapidsai/cudf/pull/12448)) [@karthikeyann](https://github.com/karthikeyann)
- Upgrade to `arrow-10.0.1` ([#12327](https://github.com/rapidsai/cudf/pull/12327)) [@galipremsagar](https://github.com/galipremsagar)
- Fail loudly to avoid data corruption with unsupported input in `read_orc` ([#12325](https://github.com/rapidsai/cudf/pull/12325)) [@vuule](https://github.com/vuule)
- CSV, JSON reader to infer integer column with nulls as int64 instead of float64 ([#12309](https://github.com/rapidsai/cudf/pull/12309)) [@karthikeyann](https://github.com/karthikeyann)
- Remove deprecated code for 23.02 ([#12281](https://github.com/rapidsai/cudf/pull/12281)) [@vyasr](https://github.com/vyasr)
- Null element for parsing error in numeric types in JSON, CSV reader ([#12272](https://github.com/rapidsai/cudf/pull/12272)) [@karthikeyann](https://github.com/karthikeyann)
- Purge non-empty nulls for `superimpose_nulls` and `push_down_nulls` ([#12239](https://github.com/rapidsai/cudf/pull/12239)) [@ttnghia](https://github.com/ttnghia)
- Rename `cudf::structs::detail::superimpose_parent_nulls` APIs ([#12230](https://github.com/rapidsai/cudf/pull/12230)) [@ttnghia](https://github.com/ttnghia)
- Remove JIT type names, refactor id_to_type. ([#12158](https://github.com/rapidsai/cudf/pull/12158)) [@bdice](https://github.com/bdice)
- Floor division uses integer division for integral arguments ([#12131](https://github.com/rapidsai/cudf/pull/12131)) [@wence-](https://github.com/wence-)
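
As a rough illustration of why the floor-division change above matters, the sketch below uses plain Python integers rather than cuDF. It assumes the general pitfall is a float64 intermediate losing precision for large integers; the helper names `floordiv_via_float` and `floordiv_integer` are hypothetical and do not come from cuDF.

```python
import math


def floordiv_via_float(a: int, b: int) -> int:
    """Floor division computed through a float64 intermediate (hypothetical helper)."""
    return math.floor(a / b)


def floordiv_integer(a: int, b: int) -> int:
    """Floor division computed purely with integer arithmetic (hypothetical helper)."""
    return a // b


# 2**53 + 1 is the smallest positive integer that float64 cannot represent exactly.
big = 2**53 + 1

print(floordiv_via_float(big, 1))  # rounded by the float intermediate
print(floordiv_integer(big, 1))    # exact
```

For small operands the two helpers agree (for example, `-7` divided by `2` floors to `-4` either way); they diverge only once the operands exceed the 53-bit float64 mantissa.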
## 🐛 Bug Fixes
- Fix a mask data corruption in UDF ([#12647](https://github.com/rapidsai/cudf/pull/12647)) [@galipremsagar](https://github.com/galipremsagar)
- pre-commit: Update isort version to 5.12.0 ([#12645](https://github.com/rapidsai/cudf/pull/12645)) [@wence-](https://github.com/wence-)
- tests: Skip cuInit tests if cuda-gdb is not found or not working ([#12644](https://github.com/rapidsai/cudf/pull/12644)) [@wence-](https://github.com/wence-)
- Revert regex program java APIs and tests ([#12639](https://github.com/rapidsai/cudf/pull/12639)) [@cindyyuanjiang](https://github.com/cindyyuanjiang)
- Fix leaks in ColumnVectorTest ([#12625](https://github.com/rapidsai/cudf/pull/12625)) [@jlowe](https://github.com/jlowe)
- Handle when spillable buffers own each other ([#12607](https://github.com/rapidsai/cudf/pull/12607)) [@madsbk](https://github.com/madsbk)
- Fix incorrect null counts for sliced columns in JCudfSerialization ([#12589](https://github.com/rapidsai/cudf/pull/12589)) [@jlowe](https://github.com/jlowe)
- lists: Transfer dtypes correctly through list.get ([#12586](https://github.com/rapidsai/cudf/pull/12586)) [@wence-](https://github.com/wence-)
- timedelta: Don't go via float intermediates for floordiv ([#12585](https://github.com/rapidsai/cudf/pull/12585)) [@wence-](https://github.com/wence-)
- Fixing BUG, `get_next_chunk()` should use the blocking function `device_read()` ([#12584](https://github.com/rapidsai/cudf/pull/12584)) [@madsbk](https://github.com/madsbk)
- Make JNI QuoteStyle accessible outside ai.rapids.cudf ([#12572](https://github.com/rapidsai/cudf/pull/12572)) [@mythrocks](https://github.com/mythrocks)
- `partition_by_hash()`: support index ([#12554](https://github.com/rapidsai/cudf/pull/12554)) [@madsbk](https://github.com/madsbk)
- Mixed Join benchmark bug due to wrong conditional column ([#12553](https://github.com/rapidsai/cudf/pull/12553)) [@divyegala](https://github.com/divyegala)
- Update List Lexicographical Comparator ([#12538](https://github.com/rapidsai/cudf/pull/12538)) [@divyegala](https://github.com/divyegala)
- Dynamically read PTX version ([#12534](https://github.com/rapidsai/cudf/pull/12534)) [@brandon-b-miller](https://github.com/brandon-b-miller)
- build.sh switch to use `RAPIDS` magic value ([#12525](https://github.com/rapidsai/cudf/pull/12525)) [@robertmaynard](https://github.com/robertmaynard)
- Loosen runtime arrow pinning ([#12522](https://github.com/rapidsai/cudf/pull/12522)) [@vyasr](https://github.com/vyasr)
- Enable metadata transfer for complex types in transpose ([#12491](https://github.com/rapidsai/cudf/pull/12491)) [@galipremsagar](https://github.com/galipremsagar)
- Fix issues with parquet chunked reader ([#12488](https://github.com/rapidsai/cudf/pull/12488)) [@nvdbaranec](https://github.com/nvdbaranec)
- Fix missing metadata transfer in concat for `ListColumn` ([#12487](https://github.com/rapidsai/cudf/pull/12487)) [@galipremsagar](https://github.com/galipremsagar)
- Rename libcudf substring source files to slice ([#12484](https://github.com/rapidsai/cudf/pull/12484)) [@davidwendt](https://github.com/davidwendt)
- Fix compile issue with arrow 10 ([#12465](https://github.com/rapidsai/cudf/pull/12465)) [@ttnghia](https://github.com/ttnghia)
- Fix List offsets bug in mixed type list column in nested JSON reader ([#12447](https://github.com/rapidsai/cudf/pull/12447)) [@karthikeyann](https://github.com/karthikeyann)
- Fix xfail incompatibilities ([#12423](https://github.com/rapidsai/cudf/pull/12423)) [@vyasr](https://github.com/vyasr)
- Fix bug in Parquet column index encoding ([#12404](https://github.com/rapidsai/cudf/pull/12404)) [@etseidl](https://github.com/etseidl)
- When building Arrow shared look for a shared OpenSSL ([#12396](https://github.com/rapidsai/cudf/pull/12396)) [@robertmaynard](https://github.com/robertmaynard)
- Fix get_json_object to return empty column on empty input ([#12384](https://github.com/rapidsai/cudf/pull/12384)) [@davidwendt](https://github.com/davidwendt)
- Pin arrow 9 in testing dependencies to prevent conda solve issues ([#12377](https://github.com/rapidsai/cudf/pull/12377)) [@vyasr](https://github.com/vyasr)
- Fix reductions any/all return value for empty input ([#12374](https://github.com/rapidsai/cudf/pull/12374)) [@davidwendt](https://github.com/davidwendt)
- Fix debug compile errors in parquet.hpp ([#12372](https://github.com/rapidsai/cudf/pull/12372)) [@davidwendt](https://github.com/davidwendt)
- Purge non-empty nulls in `cudf::make_lists_column` ([#12370](https://github.com/rapidsai/cudf/pull/12370)) [@ttnghia](https://github.com/ttnghia)
- Use correct memory resource in io::make_column ([#12364](https://github.com/rapidsai/cudf/pull/12364)) [@vyasr](https://github.com/vyasr)
- Add code to detect possible malformed page data in parquet files. ([#12360](https://github.com/rapidsai/cudf/pull/12360)) [@nvdbaranec](https://github.com/nvdbaranec)
- Fail loudly to avoid data corruption with unsupported input in `read_orc` ([#12325](https://github.com/rapidsai/cudf/pull/12325)) [@vuule](https://github.com/vuule)
- Fix NumericPairIteratorTest for float values ([#12306](https://github.com/rapidsai/cudf/pull/12306)) [@davidwendt](https://github.com/davidwendt)
- Fixes memory allocation in nested JSON tokenizer ([#12300](https://github.com/rapidsai/cudf/pull/12300)) [@elstehle](https://github.com/elstehle)
- Reconstruct dtypes correctly for list aggs of struct columns ([#12290](https://github.com/rapidsai/cudf/pull/12290)) [@wence-](https://github.com/wence-)
- Fix regex \A and \Z to strictly match string begin/end ([#12282](https://github.com/rapidsai/cudf/pull/12282)) [@davidwendt](https://github.com/davidwendt)
- Fix compile issue in `json_chunked_reader.cpp` ([#12280](https://github.com/rapidsai/cudf/pull/12280)) [@ttnghia](https://github.com/ttnghia)
- Change reductions any/all to return valid values for empty input ([#12279](https://github.com/rapidsai/cudf/pull/12279)) [@davidwendt](https://github.com/davidwendt)
- Only exclude join keys that are indices from key columns ([#12271](https://github.com/rapidsai/cudf/pull/12271)) [@wence-](https://github.com/wence-)
- Fix spill to device limit ([#12252](https://github.com/rapidsai/cudf/pull/12252)) [@madsbk](https://github.com/madsbk)
- Correct behaviour of sort in `concat` for singleton concatenations ([#12247](https://github.com/rapidsai/cudf/pull/12247)) [@wence-](https://github.com/wence-)
- Purge non-empty nulls for `superimpose_nulls` and `push_down_nulls` ([#12239](https://github.com/rapidsai/cudf/pull/12239)) [@ttnghia](https://github.com/ttnghia)
- Patch CUB DeviceSegmentedSort and remove workaround ([#12234](https://github.com/rapidsai/cudf/pull/12234)) [@davidwendt](https://github.com/davidwendt)
- Fix memory leak in udf_string::assign(&&) function ([#12206](https://github.com/rapidsai/cudf/pull/12206)) [@davidwendt](https://github.com/davidwendt)
- Workaround thrust-copy-if limit in json get_tree_representation ([#12190](https://github.com/rapidsai/cudf/pull/12190)) [@davidwendt](https://github.com/davidwendt)
- Fix page size calculation in Parquet writer ([#12182](https://github.com/rapidsai/cudf/pull/12182)) [@etseidl](https://github.com/etseidl)
- Add cudf::detail::sizes_to_offsets_iterator to allow checking overflow in offsets ([#12180](https://github.com/rapidsai/cudf/pull/12180)) [@davidwendt](https://github.com/davidwendt)
- Workaround thrust-copy-if limit in wordpiece-tokenizer ([#12168](https://github.com/rapidsai/cudf/pull/12168)) [@davidwendt](https://github.com/davidwendt)
- Floor division uses integer division for integral arguments ([#12131](https://github.com/rapidsai/cudf/pull/12131)) [@wence-](https://github.com/wence-)
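
The regex `\A` and `\Z` fix listed above concerns strict begin-of-string and end-of-string anchors. The sketch below shows the distinction using Python's standard `re` module, not cuDF; it assumes the libcudf anchors are meant to behave like Python's here, which the entry title suggests but this changelog does not spell out.

```python
import re

text = "abc\n"

# `$` may match just before a trailing newline, so this search succeeds.
dollar_match = re.search(r"abc$", text)

# `\Z` in Python's `re` matches only at the very end of the string, so this fails.
strict_end_match = re.search(r"abc\Z", text)

multi = "x\nabc"

# With MULTILINE, `^` matches at the start of every line ...
caret_match = re.search(r"^abc", multi, flags=re.MULTILINE)

# ... while `\A` still matches only at the start of the whole string.
strict_begin_match = re.search(r"\Aabc", multi, flags=re.MULTILINE)

print(bool(dollar_match), bool(strict_end_match))
print(bool(caret_match), bool(strict_begin_match))
```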
## 📖 Documentation
- Fix link to NVTX ([#12598](https://github.com/rapidsai/cudf/pull/12598)) [@sameerz](https://github.com/sameerz)
- Include missing groupby functions in documentation ([#12580](https://github.com/rapidsai/cudf/pull/12580)) [@quasiben](https://github.com/quasiben)
- Fix documentation author ([#12527](https://github.com/rapidsai/cudf/pull/12527)) [@bdice](https://github.com/bdice)
- Update libcudf reduction docs for casting output types ([#12526](https://github.com/rapidsai/cudf/pull/12526)) [@davidwendt](https://github.com/davidwendt)
- Add JSON reader page in user guide ([#12499](https://github.com/rapidsai/cudf/pull/12499)) [@GregoryKimball](https://github.com/GregoryKimball)
- Link unsupported iteration API docstrings ([#12482](https://github.com/rapidsai/cudf/pull/12482)) [@galipremsagar](https://github.com/galipremsagar)
- `strings_udf` doc update ([#12469](https://github.com/rapidsai/cudf/pull/12469)) [@brandon-b-miller](https://github.com/brandon-b-miller)
- Update cudf_assert docs with correct NDEBUG behavior ([#12464](https://github.com/rapidsai/cudf/pull/12464)) [@robertmaynard](https://github.com/robertmaynard)
- Update pre-commit hooks guide ([#12395](https://github.com/rapidsai/cudf/pull/12395)) [@bdice](https://github.com/bdice)
- Update test docs to not use detail comparison utilities ([#12332](https://github.com/rapidsai/cudf/pull/12332)) [@PointKernel](https://github.com/PointKernel)
- Fix doxygen description for regex_program::compute_working_memory_size ([#12329](https://github.com/rapidsai/cudf/pull/12329)) [@davidwendt](https://github.com/davidwendt)
- Add eval to docs. ([#12322](https://github.com/rapidsai/cudf/pull/12322)) [@vyasr](https://github.com/vyasr)
- Turn on xfail_strict=true ([#12244](https://github.com/rapidsai/cudf/pull/12244)) [@wence-](https://github.com/wence-)
- Update 10 minutes to cuDF ([#12114](https://github.com/rapidsai/cudf/pull/12114)) [@wence-](https://github.com/wence-)
## 🚀 New Features
- Use kvikIO as the default IO backend ([#12574](https://github.com/rapidsai/cudf/pull/12574)) [@vuule](https://github.com/vuule)
- Use `has_nonempty_nulls` instead of `may_contain_non_empty_nulls` in `superimpose_nulls` and `push_down_nulls` ([#12560](https://github.com/rapidsai/cudf/pull/12560)) [@ttnghia](https://github.com/ttnghia)
- Add strings methods removeprefix and removesuffix ([#12557](https://github.com/rapidsai/cudf/pull/12557)) [@davidwendt](https://github.com/davidwendt)
- Add `regex_program` java APIs and unit tests ([#12548](https://github.com/rapidsai/cudf/pull/12548)) [@cindyyuanjiang](https://github.com/cindyyuanjiang)
- Default `cudf::io::read_json` to nested JSON parser ([#12544](https://github.com/rapidsai/cudf/pull/12544)) [@vuule](https://github.com/vuule)
- Make string quoting optional on CSV write ([#12539](https://github.com/rapidsai/cudf/pull/12539)) [@mythrocks](https://github.com/mythrocks)
- Use new nvCOMP API to optimize the compression temp memory size ([#12533](https://github.com/rapidsai/cudf/pull/12533)) [@vuule](https://github.com/vuule)
- Support "values" orient (array of arrays) in Nested JSON reader ([#12498](https://github.com/rapidsai/cudf/pull/12498)) [@karthikeyann](https://github.com/karthikeyann)
- `one_hot_encode` to use experimental row comparators ([#12478](https://github.com/rapidsai/cudf/pull/12478)) [@divyegala](https://github.com/divyegala)
- Support %W and %w format specifiers in cudf::strings::to_timestamps ([#12475](https://github.com/rapidsai/cudf/pull/12475)) [@davidwendt](https://github.com/davidwendt)
- Add JSON Writer ([#12474](https://github.com/rapidsai/cudf/pull/12474)) [@karthikeyann](https://github.com/karthikeyann)
- Refactor `thrust_copy_if` into `cudf::detail::copy_if_safe` ([#12455](https://github.com/rapidsai/cudf/pull/12455)) [@ttnghia](https://github.com/ttnghia)
- Add trailing comma support for nested JSON reader ([#12448](https://github.com/rapidsai/cudf/pull/12448)) [@karthikeyann](https://github.com/karthikeyann)
- Extract `tokenize_json.hpp` detail header from `src/io/json/nested_json.hpp` ([#12432](https://github.com/rapidsai/cudf/pull/12432)) [@ttnghia](https://github.com/ttnghia)
- JNI bindings to write CSV ([#12425](https://github.com/rapidsai/cudf/pull/12425)) [@mythrocks](https://github.com/mythrocks)
- Nested JSON depth benchmark ([#12371](https://github.com/rapidsai/cudf/pull/12371)) [@karthikeyann](https://github.com/karthikeyann)
- Implement `lists::reverse` ([#12336](https://github.com/rapidsai/cudf/pull/12336)) [@ttnghia](https://github.com/ttnghia)
- Use `device_read` in experimental `read_json` ([#12314](https://github.com/rapidsai/cudf/pull/12314)) [@vuule](https://github.com/vuule)
- Implement JNI for `strings::reverse` ([#12283](https://github.com/rapidsai/cudf/pull/12283)) [@ttnghia](https://github.com/ttnghia)
- Null element for parsing error in numeric types in JSON, CSV reader ([#12272](https://github.com/rapidsai/cudf/pull/12272)) [@karthikeyann](https://github.com/karthikeyann)
- Add cudf::strings::like function with multiple patterns ([#12269](https://github.com/rapidsai/cudf/pull/12269)) [@davidwendt](https://github.com/davidwendt)
- Add environment variable to control host memory allocation in `hostdevice_vector` ([#12251](https://github.com/rapidsai/cudf/pull/12251)) [@vuule](https://github.com/vuule)
- Add cudf::strings::reverse function ([#12227](https://github.com/rapidsai/cudf/pull/12227)) [@davidwendt](https://github.com/davidwendt)
- Selectively use dictionary encoding in Parquet writer ([#12211](https://github.com/rapidsai/cudf/pull/12211)) [@etseidl](https://github.com/etseidl)
- Support `replace` in `strings_udf` ([#12207](https://github.com/rapidsai/cudf/pull/12207)) [@brandon-b-miller](https://github.com/brandon-b-miller)
- Add support to read binary encoded decimals in parquet ([#12205](https://github.com/rapidsai/cudf/pull/12205)) [@PointKernel](https://github.com/PointKernel)
- Support regex EOL where the string ends with a new-line character ([#12181](https://github.com/rapidsai/cudf/pull/12181)) [@davidwendt](https://github.com/davidwendt)
- Updating `stream_compaction/unique` to use new row comparators ([#12159](https://github.com/rapidsai/cudf/pull/12159)) [@divyegala](https://github.com/divyegala)
- Add device buffer datasource ([#12024](https://github.com/rapidsai/cudf/pull/12024)) [@PointKernel](https://github.com/PointKernel)
- Implement groupby apply with JIT ([#11452](https://github.com/rapidsai/cudf/pull/11452)) [@bwyogatama](https://github.com/bwyogatama)
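
The `removeprefix` and `removesuffix` strings methods listed above share their names with Python's built-in `str` methods (Python 3.9+). The sketch below shows the built-in semantics on plain Python strings, assuming the cuDF methods apply the same rule to each element of a strings column; the sample data is made up for illustration.

```python
# Made-up sample data for illustration only.
names = ["tmp_orders", "tmp_users", "events"]
files = ["a.csv", "b.csv", "c.json"]

# The prefix is removed only when the string actually starts with it.
stripped = [name.removeprefix("tmp_") for name in names]

# The suffix is removed only when the string actually ends with it.
stems = [file.removesuffix(".csv") for file in files]

print(stripped)
print(stems)
```

Unlike `lstrip` and `rstrip`, which strip any run of the given characters, these methods remove the exact substring at most once, so non-matching strings pass through unchanged.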
## 🛠️ Improvements
- Update shared workflow branches ([#12696](https://github.com/rapidsai/cudf/pull/12696)) [@ajschmidt8](https://github.com/ajschmidt8)
- Pin `dask` and `distributed` for release ([#12695](https://github.com/rapidsai/cudf/pull/12695)) [@galipremsagar](https://github.com/galipremsagar)
- Don't upload `libcudf-example` to Anaconda.org ([#12671](https://github.com/rapidsai/cudf/pull/12671)) [@ajschmidt8](https://github.com/ajschmidt8)
- Pin wheel dependencies to same RAPIDS release ([#12659](https://github.com/rapidsai/cudf/pull/12659)) [@sevagh](https://github.com/sevagh)
- Use CTK 118/cp310 branch of wheel workflows ([#12602](https://github.com/rapidsai/cudf/pull/12602)) [@sevagh](https://github.com/sevagh)
- Change ways to access `ptr` in `Buffer` ([#12587](https://github.com/rapidsai/cudf/pull/12587)) [@galipremsagar](https://github.com/galipremsagar)
- Version a parquet writer xfail ([#12579](https://github.com/rapidsai/cudf/pull/12579)) [@galipremsagar](https://github.com/galipremsagar)
- Remove column names ([#12578](https://github.com/rapidsai/cudf/pull/12578)) [@vuule](https://github.com/vuule)
- Parquet reader optimization to address V100 regression. ([#12577](https://github.com/rapidsai/cudf/pull/12577)) [@nvdbaranec](https://github.com/nvdbaranec)
- Add support for `category` dtypes in CSV reader ([#12571](https://github.com/rapidsai/cudf/pull/12571)) [@galipremsagar](https://github.com/galipremsagar)
- Remove `spill_lock` parameter from `SpillableBuffer.get_ptr()` ([#12564](https://github.com/rapidsai/cudf/pull/12564)) [@madsbk](https://github.com/madsbk)
- Optimize `cudf::make_lists_column` ([#12547](https://github.com/rapidsai/cudf/pull/12547)) [@ttnghia](https://github.com/ttnghia)
- Remove `cudf::strings::repeat_strings_output_sizes` from Java and JNI ([#12546](https://github.com/rapidsai/cudf/pull/12546)) [@ttnghia](https://github.com/ttnghia)
- Test that cuInit is not called when RAPIDS_NO_INITIALIZE is set ([#12545](https://github.com/rapidsai/cudf/pull/12545)) [@wence-](https://github.com/wence-)
- Rework repeat_strings to use sizes-to-offsets utility ([#12543](https://github.com/rapidsai/cudf/pull/12543)) [@davidwendt](https://github.com/davidwendt)
- Replace exclusive_scan with sizes_to_offsets in cudf::lists::sequences ([#12541](https://github.com/rapidsai/cudf/pull/12541)) [@davidwendt](https://github.com/davidwendt)
- Rework nvtext::ngrams_tokenize to use sizes-to-offsets utility ([#12540](https://github.com/rapidsai/cudf/pull/12540)) [@davidwendt](https://github.com/davidwendt)
- Fix binary-ops gtests coded in namespace cudf::test ([#12536](https://github.com/rapidsai/cudf/pull/12536)) [@davidwendt](https://github.com/davidwendt)
- More `@acquire_spill_lock()` and `as_buffer(..., exposed=False)` ([#12535](https://github.com/rapidsai/cudf/pull/12535)) [@madsbk](https://github.com/madsbk)
- Guard CUDA runtime APIs with error checking ([#12531](https://github.com/rapidsai/cudf/pull/12531)) [@PointKernel](https://github.com/PointKernel)
- Update TODOs from issue 10432. ([#12528](https://github.com/rapidsai/cudf/pull/12528)) [@bdice](https://github.com/bdice)
- Update rapids-cmake definitions version in GitHub Actions style checks. ([#12511](https://github.com/rapidsai/cudf/pull/12511)) [@bdice](https://github.com/bdice)
- Switch `engine=cudf` to the new `JSON` reader ([#12509](https://github.com/rapidsai/cudf/pull/12509)) [@galipremsagar](https://github.com/galipremsagar)
- Fix SUM/MEAN aggregation type support. ([#12503](https://github.com/rapidsai/cudf/pull/12503)) [@bdice](https://github.com/bdice)
- Stop using pandas._testing ([#12492](https://github.com/rapidsai/cudf/pull/12492)) [@vyasr](https://github.com/vyasr)
- Fix ROLLING_TEST gtests coded in namespace cudf::test ([#12490](https://github.com/rapidsai/cudf/pull/12490)) [@davidwendt](https://github.com/davidwendt)
- Fix erroneously skipped ORC ZSTD test ([#12486](https://github.com/rapidsai/cudf/pull/12486)) [@vuule](https://github.com/vuule)
- Rework nvtext::generate_character_ngrams to use make_strings_children ([#12480](https://github.com/rapidsai/cudf/pull/12480)) [@davidwendt](https://github.com/davidwendt)
- Raise warnings as errors in the test suite ([#12468](https://github.com/rapidsai/cudf/pull/12468)) [@vyasr](https://github.com/vyasr)
- Remove `int32` hard-coding in python ([#12467](https://github.com/rapidsai/cudf/pull/12467)) [@galipremsagar](https://github.com/galipremsagar)
- Use cudaMemcpyDefault. ([#12466](https://github.com/rapidsai/cudf/pull/12466)) [@bdice](https://github.com/bdice)
- Update workflows for nightly tests ([#12462](https://github.com/rapidsai/cudf/pull/12462)) [@ajschmidt8](https://github.com/ajschmidt8)
- Build CUDA `11.8` and Python `3.10` Packages ([#12457](https://github.com/rapidsai/cudf/pull/12457)) [@ajschmidt8](https://github.com/ajschmidt8)
- JNI build image default as cuda11.8 ([#12441](https://github.com/rapidsai/cudf/pull/12441)) [@pxLi](https://github.com/pxLi)
- Re-enable `Recently Updated` Check ([#12435](https://github.com/rapidsai/cudf/pull/12435)) [@ajschmidt8](https://github.com/ajschmidt8)
- Rework remaining cudf::strings::from_xyz functions to use make_strings_children ([#12434](https://github.com/rapidsai/cudf/pull/12434)) [@vuule](https://github.com/vuule)
- Build wheels alongside conda CI ([#12427](https://github.com/rapidsai/cudf/pull/12427)) [@sevagh](https://github.com/sevagh)
- Remove arguments for checking exception messages in Python ([#12424](https://github.com/rapidsai/cudf/pull/12424)) [@vyasr](https://github.com/vyasr)
- Clean up cuco usage ([#12421](https://github.com/rapidsai/cudf/pull/12421)) [@PointKernel](https://github.com/PointKernel)
- Fix warnings in remaining modules ([#12406](https://github.com/rapidsai/cudf/pull/12406)) [@vyasr](https://github.com/vyasr)
- Update `ops-bot.yaml` ([#12402](https://github.com/rapidsai/cudf/pull/12402)) [@ajschmidt8](https://github.com/ajschmidt8)
- Rework cudf::strings::integers_to_ipv4 to use make_strings_children utility ([#12401](https://github.com/rapidsai/cudf/pull/12401)) [@davidwendt](https://github.com/davidwendt)
- Use `numpy.empty()` instead of `bytearray` to allocate host memory for spilling ([#12399](https://github.com/rapidsai/cudf/pull/12399)) [@madsbk](https://github.com/madsbk)
- Deprecate chunksize from dask_cudf.read_csv ([#12394](https://github.com/rapidsai/cudf/pull/12394)) [@rjzamora](https://github.com/rjzamora)
- Expose the RMM pool size in JNI ([#12390](https://github.com/rapidsai/cudf/pull/12390)) [@revans2](https://github.com/revans2)
- Fix COPYING_TEST: gtests coded in namespace cudf::test ([#12387](https://github.com/rapidsai/cudf/pull/12387)) [@davidwendt](https://github.com/davidwendt)
- Rework cudf::strings::url_encode to use make_strings_children utility ([#12385](https://github.com/rapidsai/cudf/pull/12385)) [@davidwendt](https://github.com/davidwendt)
- Use make_strings_children in parse_data nested json reader ([#12382](https://github.com/rapidsai/cudf/pull/12382)) [@karthikeyann](https://github.com/karthikeyann)
- Fix warnings in test_datetime.py ([#12381](https://github.com/rapidsai/cudf/pull/12381)) [@vyasr](https://github.com/vyasr)
- Mixed Join Benchmarks ([#12375](https://github.com/rapidsai/cudf/pull/12375)) [@divyegala](https://github.com/divyegala)
- Fix warnings in dataframe.py ([#12369](https://github.com/rapidsai/cudf/pull/12369)) [@vyasr](https://github.com/vyasr)
- Update conda recipes. ([#12368](https://github.com/rapidsai/cudf/pull/12368)) [@bdice](https://github.com/bdice)
- Use gpu-latest-1 runner tag ([#12366](https://github.com/rapidsai/cudf/pull/12366)) [@bdice](https://github.com/bdice)
- Rework cudf::strings::from_booleans to use make_strings_children ([#12365](https://github.com/rapidsai/cudf/pull/12365)) [@vuule](https://github.com/vuule)
- Fix warnings in test modules up to test_dataframe.py ([#12355](https://github.com/rapidsai/cudf/pull/12355)) [@vyasr](https://github.com/vyasr)
- JSON column performance optimization - struct column nulls ([#12354](https://github.com/rapidsai/cudf/pull/12354)) [@karthikeyann](https://github.com/karthikeyann)
- Accelerate stable-segmented-sort with CUB segmented sort ([#12347](https://github.com/rapidsai/cudf/pull/12347)) [@davidwendt](https://github.com/davidwendt)
- Add size check to make_offsets_child_column utility ([#12345](https://github.com/rapidsai/cudf/pull/12345)) [@davidwendt](https://github.com/davidwendt)
- Enable max compression ratio small block optimization for ZSTD ([#12338](https://github.com/rapidsai/cudf/pull/12338)) [@vuule](https://github.com/vuule)
- Fix warnings in test_monotonic.py ([#12334](https://github.com/rapidsai/cudf/pull/12334)) [@vyasr](https://github.com/vyasr)
- Improve JSON column creation performance (list offsets) ([#12330](https://github.com/rapidsai/cudf/pull/12330)) [@karthikeyann](https://github.com/karthikeyann)
- Upgrade to `arrow-10.0.1` ([#12327](https://github.com/rapidsai/cudf/pull/12327)) [@galipremsagar](https://github.com/galipremsagar)
- Fix warnings in test_orc.py ([#12326](https://github.com/rapidsai/cudf/pull/12326)) [@vyasr](https://github.com/vyasr)
- Fix warnings in test_groupby.py ([#12324](https://github.com/rapidsai/cudf/pull/12324)) [@vyasr](https://github.com/vyasr)
- Fix `test_notebooks.sh` ([#12323](https://github.com/rapidsai/cudf/pull/12323)) [@ajschmidt8](https://github.com/ajschmidt8)
- Fix transform gtests coded in namespace cudf::test ([#12321](https://github.com/rapidsai/cudf/pull/12321)) [@davidwendt](https://github.com/davidwendt)
- Fix `check_style.sh` script ([#12320](https://github.com/rapidsai/cudf/pull/12320)) [@ajschmidt8](https://github.com/ajschmidt8)
- Rework cudf::strings::from_timestamps to use make_strings_children ([#12317](https://github.com/rapidsai/cudf/pull/12317)) [@davidwendt](https://github.com/davidwendt)
- Fix warnings in test_index.py ([#12313](https://github.com/rapidsai/cudf/pull/12313)) [@vyasr](https://github.com/vyasr)
- Fix warnings in test_multiindex.py ([#12310](https://github.com/rapidsai/cudf/pull/12310)) [@vyasr](https://github.com/vyasr)
- CSV, JSON reader to infer integer column with nulls as int64 instead of float64 ([#12309](https://github.com/rapidsai/cudf/pull/12309)) [@karthikeyann](https://github.com/karthikeyann)
- Fix warnings in test_indexing.py ([#12305](https://github.com/rapidsai/cudf/pull/12305)) [@vyasr](https://github.com/vyasr)
- Fix warnings in test_joining.py ([#12304](https://github.com/rapidsai/cudf/pull/12304)) [@vyasr](https://github.com/vyasr)
- Unpin `dask` and `distributed` for development ([#12302](https://github.com/rapidsai/cudf/pull/12302)) [@galipremsagar](https://github.com/galipremsagar)
- Re-enable `sccache` for Jenkins builds ([#12297](https://github.com/rapidsai/cudf/pull/12297)) [@ajschmidt8](https://github.com/ajschmidt8)
- Define needs for pr-builder workflow. ([#12296](https://github.com/rapidsai/cudf/pull/12296)) [@bdice](https://github.com/bdice)
- Forward merge 22.12 into 23.02 ([#12294](https://github.com/rapidsai/cudf/pull/12294)) [@vyasr](https://github.com/vyasr)
- Fix warnings in test_stats.py ([#12293](https://github.com/rapidsai/cudf/pull/12293)) [@vyasr](https://github.com/vyasr)
- Fix table gtests coded in namespace cudf::test ([#12292](https://github.com/rapidsai/cudf/pull/12292)) [@davidwendt](https://github.com/davidwendt)
- Change cython for regex calls to use cudf::strings::regex_program ([#12289](https://github.com/rapidsai/cudf/pull/12289)) [@davidwendt](https://github.com/davidwendt)
- Improved error reporting when reading multiple JSON files ([#12285](https://github.com/rapidsai/cudf/pull/12285)) [@vuule](https://github.com/vuule)
- Deprecate Frame.sum_of_squares ([#12284](https://github.com/rapidsai/cudf/pull/12284)) [@vyasr](https://github.com/vyasr)
- Remove deprecated code for 23.02 ([#12281](https://github.com/rapidsai/cudf/pull/12281)) [@vyasr](https://github.com/vyasr)
- Clean up handling of max_page_size_bytes in Parquet writer ([#12277](https://github.com/rapidsai/cudf/pull/12277)) [@etseidl](https://github.com/etseidl)
- Fix replace gtests coded in namespace cudf::test ([#12270](https://github.com/rapidsai/cudf/pull/12270)) [@davidwendt](https://github.com/davidwendt)
- Add pandas nullable type support in `Index.to_pandas` ([#12268](https://github.com/rapidsai/cudf/pull/12268)) [@galipremsagar](https://github.com/galipremsagar)
- Rework nvtext::detokenize to use indexalator for row indices ([#12267](https://github.com/rapidsai/cudf/pull/12267)) [@davidwendt](https://github.com/davidwendt)
- Fix reduction gtests coded in namespace cudf::test ([#12257](https://github.com/rapidsai/cudf/pull/12257)) [@davidwendt](https://github.com/davidwendt)
- Remove default parameters from cudf::detail::sort function declarations ([#12254](https://github.com/rapidsai/cudf/pull/12254)) [@davidwendt](https://github.com/davidwendt)
- Add `duplicated` support for `Series`, `DataFrame` and `Index` ([#12246](https://github.com/rapidsai/cudf/pull/12246)) [@galipremsagar](https://github.com/galipremsagar)
- Replace column/table test utilities with macros ([#12242](https://github.com/rapidsai/cudf/pull/12242)) [@PointKernel](https://github.com/PointKernel)
- Rework cudf::strings::pad and zfill to use make_strings_children ([#12238](https://github.com/rapidsai/cudf/pull/12238)) [@davidwendt](https://github.com/davidwendt)
- Fix sort gtests coded in namespace cudf::test ([#12237](https://github.com/rapidsai/cudf/pull/12237)) [@davidwendt](https://github.com/davidwendt)
- Wrapping concat and file writes in `@acquire_spill_lock()` ([#12232](https://github.com/rapidsai/cudf/pull/12232)) [@madsbk](https://github.com/madsbk)
- Rename `cudf::structs::detail::superimpose_parent_nulls` APIs ([#12230](https://github.com/rapidsai/cudf/pull/12230)) [@ttnghia](https://github.com/ttnghia)
- Cover parsing to decimal types in `read_json` tests ([#12229](https://github.com/rapidsai/cudf/pull/12229)) [@vuule](https://github.com/vuule)
- Spill Statistics ([#12223](https://github.com/rapidsai/cudf/pull/12223)) [@madsbk](https://github.com/madsbk)
- Use CUDF_JNI_ENABLE_PROFILING to conditionally enable profiling support. ([#12221](https://github.com/rapidsai/cudf/pull/12221)) [@bdice](https://github.com/bdice)
- Clean up of `test_spilling.py` ([#12220](https://github.com/rapidsai/cudf/pull/12220)) [@madsbk](https://github.com/madsbk)
- Simplify repetitive boolean logic ([#12218](https://github.com/rapidsai/cudf/pull/12218)) [@vuule](https://github.com/vuule)
- Add `Series.hasnans` and `Index.hasnans` ([#12214](https://github.com/rapidsai/cudf/pull/12214)) [@galipremsagar](https://github.com/galipremsagar)
- Add cudf::strings::udf::replace function ([#12210](https://github.com/rapidsai/cudf/pull/12210)) [@davidwendt](https://github.com/davidwendt)
- Adds in new java APIs for appending byte arrays to host columnar data ([#12208](https://github.com/rapidsai/cudf/pull/12208)) [@revans2](https://github.com/revans2)
- Remove Python dependencies from Java CI. ([#12193](https://github.com/rapidsai/cudf/pull/12193)) [@bdice](https://github.com/bdice)
- Fix null order in sort-based groupby and improve groupby tests ([#12191](https://github.com/rapidsai/cudf/pull/12191)) [@divyegala](https://github.com/divyegala)
- Move strings children functions from cudf/strings/detail/utilities.cuh to new header ([#12185](https://github.com/rapidsai/cudf/pull/12185)) [@davidwendt](https://github.com/davidwendt)
- Clean up existing JNI scalar to column code ([#12173](https://github.com/rapidsai/cudf/pull/12173)) [@revans2](https://github.com/revans2)
- Remove JIT type names, refactor id_to_type. ([#12158](https://github.com/rapidsai/cudf/pull/12158)) [@bdice](https://github.com/bdice)
- Update JNI version to 23.02.0-SNAPSHOT ([#12129](https://github.com/rapidsai/cudf/pull/12129)) [@pxLi](https://github.com/pxLi)
- Minor refactor of cpp/src/io/parquet/page_data.cu ([#12126](https://github.com/rapidsai/cudf/pull/12126)) [@etseidl](https://github.com/etseidl)
- Add codespell as a linter ([#12097](https://github.com/rapidsai/cudf/pull/12097)) [@benfred](https://github.com/benfred)
- Enable specifying exceptions in error macros ([#12078](https://github.com/rapidsai/cudf/pull/12078)) [@vyasr](https://github.com/vyasr)
- Move `_label_encoding` from Series to Column ([#12040](https://github.com/rapidsai/cudf/pull/12040)) [@shwina](https://github.com/shwina)
- Add GitHub Actions Workflows ([#12002](https://github.com/rapidsai/cudf/pull/12002)) [@ajschmidt8](https://github.com/ajschmidt8)
- Consolidate dask-cudf `groupby_agg` calls in one place ([#10835](https://github.com/rapidsai/cudf/pull/10835)) [@charlesbluca](https://github.com/charlesbluca)
# cuDF 22.12.00 (8 Dec 2022)
## 🚨 Breaking Changes
- Add JNI for `substring` without 'end' parameter. ([#12113](https://github.com/rapidsai/cudf/pull/12113)) [@firestarman](https://github.com/firestarman)
- Refactor `purge_nonempty_nulls` ([#12111](https://github.com/rapidsai/cudf/pull/12111)) [@ttnghia](https://github.com/ttnghia)
- Create an `int8` column in `read_csv` when all elements are missing ([#12110](https://github.com/rapidsai/cudf/pull/12110)) [@vuule](https://github.com/vuule)
- Throw an error when libcudf is built without cuFile and `LIBCUDF_CUFILE_POLICY` is set to `"ALWAYS"` ([#12080](https://github.com/rapidsai/cudf/pull/12080)) [@vuule](https://github.com/vuule)
- Fix type promotion edge cases in numerical binops ([#12074](https://github.com/rapidsai/cudf/pull/12074)) [@wence-](https://github.com/wence-)
- Reduce/Remove reliance on `**kwargs` and `*args` in `IO` readers & writers ([#12025](https://github.com/rapidsai/cudf/pull/12025)) [@galipremsagar](https://github.com/galipremsagar)
- Rollback of `DeviceBufferLike` ([#12009](https://github.com/rapidsai/cudf/pull/12009)) [@madsbk](https://github.com/madsbk)
- Remove unused `managed_allocator` ([#12005](https://github.com/rapidsai/cudf/pull/12005)) [@vyasr](https://github.com/vyasr)
- Pass column names to `write_csv` instead of `table_metadata` pointer ([#11972](https://github.com/rapidsai/cudf/pull/11972)) [@vuule](https://github.com/vuule)
- Accept const refs instead of const unique_ptr refs in reduce and scan APIs. ([#11960](https://github.com/rapidsai/cudf/pull/11960)) [@vyasr](https://github.com/vyasr)
- Default to equal NaNs in make_merge_sets_aggregation. ([#11952](https://github.com/rapidsai/cudf/pull/11952)) [@bdice](https://github.com/bdice)
- Remove validation that requires introspection ([#11938](https://github.com/rapidsai/cudf/pull/11938)) [@vyasr](https://github.com/vyasr)
- Trim quotes for non-string values in nested json parsing ([#11898](https://github.com/rapidsai/cudf/pull/11898)) [@karthikeyann](https://github.com/karthikeyann)
- Add tests ensuring that cudf's default stream is always used ([#11875](https://github.com/rapidsai/cudf/pull/11875)) [@vyasr](https://github.com/vyasr)
- Support nested types as groupby keys in libcudf ([#11792](https://github.com/rapidsai/cudf/pull/11792)) [@PointKernel](https://github.com/PointKernel)
- Default to equal NaNs in make_collect_set_aggregation. ([#11621](https://github.com/rapidsai/cudf/pull/11621)) [@bdice](https://github.com/bdice)
- Removing int8 column option from parquet byte_array writing ([#11539](https://github.com/rapidsai/cudf/pull/11539)) [@hyperbolic2346](https://github.com/hyperbolic2346)
- part1: Simplify BaseIndex to an abstract class ([#10389](https://github.com/rapidsai/cudf/pull/10389)) [@skirui-source](https://github.com/skirui-source)
## 🐛 Bug Fixes
- Fix include line for IO Cython modules ([#12250](https://github.com/rapidsai/cudf/pull/12250)) [@vyasr](https://github.com/vyasr)
- Make dask pinning looser ([#12231](https://github.com/rapidsai/cudf/pull/12231)) [@vyasr](https://github.com/vyasr)
- Workaround for CUB segmented-sort bug with boolean keys ([#12217](https://github.com/rapidsai/cudf/pull/12217)) [@davidwendt](https://github.com/davidwendt)
- Fix `from_dict` backend dispatch to match upstream `dask` ([#12203](https://github.com/rapidsai/cudf/pull/12203)) [@galipremsagar](https://github.com/galipremsagar)
- Merge branch-22.10 into branch-22.12 ([#12198](https://github.com/rapidsai/cudf/pull/12198)) [@davidwendt](https://github.com/davidwendt)
- Fix compression in ORC writer ([#12194](https://github.com/rapidsai/cudf/pull/12194)) [@vuule](https://github.com/vuule)
- Don't use CMake 3.25.0 as it has a show stopping FindCUDAToolkit bug ([#12188](https://github.com/rapidsai/cudf/pull/12188)) [@robertmaynard](https://github.com/robertmaynard)
- Fix data corruption when reading ORC files with empty stripes ([#12160](https://github.com/rapidsai/cudf/pull/12160)) [@vuule](https://github.com/vuule)
- Fix decimal binary operations ([#12142](https://github.com/rapidsai/cudf/pull/12142)) [@galipremsagar](https://github.com/galipremsagar)
- Ensure dlpack include is provided to cudf interop lib ([#12139](https://github.com/rapidsai/cudf/pull/12139)) [@robertmaynard](https://github.com/robertmaynard)
- Safely allocate `udf_string` pointers in `strings_udf` ([#12138](https://github.com/rapidsai/cudf/pull/12138)) [@brandon-b-miller](https://github.com/brandon-b-miller)
- Fix/disable jitify lto ([#12122](https://github.com/rapidsai/cudf/pull/12122)) [@robertmaynard](https://github.com/robertmaynard)
- Fix conditional_full_join benchmark ([#12121](https://github.com/rapidsai/cudf/pull/12121)) [@GregoryKimball](https://github.com/GregoryKimball)
- Fix regex working-memory-size refactor error ([#12119](https://github.com/rapidsai/cudf/pull/12119)) [@davidwendt](https://github.com/davidwendt)
- Add in negative size checks for columns ([#12118](https://github.com/rapidsai/cudf/pull/12118)) [@revans2](https://github.com/revans2)
- Add JNI for `substring` without 'end' parameter. ([#12113](https://github.com/rapidsai/cudf/pull/12113)) [@firestarman](https://github.com/firestarman)
- Fix reading of CSV files with blank second row ([#12098](https://github.com/rapidsai/cudf/pull/12098)) [@vuule](https://github.com/vuule)
- Fix an error in IO with `GzipFile` type ([#12085](https://github.com/rapidsai/cudf/pull/12085)) [@galipremsagar](https://github.com/galipremsagar)
- Workaround groupby aggregate thrust::copy_if overflow ([#12079](https://github.com/rapidsai/cudf/pull/12079)) [@davidwendt](https://github.com/davidwendt)
- Fix alignment of compressed blocks in ORC writer ([#12077](https://github.com/rapidsai/cudf/pull/12077)) [@vuule](https://github.com/vuule)
- Fix singleton-range `__setitem__` edge case ([#12075](https://github.com/rapidsai/cudf/pull/12075)) [@wence-](https://github.com/wence-)
- Fix type promotion edge cases in numerical binops ([#12074](https://github.com/rapidsai/cudf/pull/12074)) [@wence-](https://github.com/wence-)
- Force using old fmt in nvbench. ([#12067](https://github.com/rapidsai/cudf/pull/12067)) [@vyasr](https://github.com/vyasr)
- Fixes List offset bug in Nested JSON reader ([#12060](https://github.com/rapidsai/cudf/pull/12060)) [@karthikeyann](https://github.com/karthikeyann)
- Allow falling back to `shim_60.ptx` by default in `strings_udf` ([#12056](https://github.com/rapidsai/cudf/pull/12056)) [@brandon-b-miller](https://github.com/brandon-b-miller)
- Force black exclusions for pre-commit. ([#12036](https://github.com/rapidsai/cudf/pull/12036)) [@bdice](https://github.com/bdice)
- Add `memory_usage` & `items` implementation for `Struct` column & dtype ([#12033](https://github.com/rapidsai/cudf/pull/12033)) [@galipremsagar](https://github.com/galipremsagar)
- Reduce/Remove reliance on `**kwargs` and `*args` in `IO` readers & writers ([#12025](https://github.com/rapidsai/cudf/pull/12025)) [@galipremsagar](https://github.com/galipremsagar)
- Fixes bug in csv_reader_options construction in cython ([#12021](https://github.com/rapidsai/cudf/pull/12021)) [@karthikeyann](https://github.com/karthikeyann)
- Fix issues when both `usecols` and `names` options are used in `read_csv` ([#12018](https://github.com/rapidsai/cudf/pull/12018)) [@vuule](https://github.com/vuule)
- Port thrust's pinned_allocator to cudf, since Thrust 1.17 removes the type ([#12004](https://github.com/rapidsai/cudf/pull/12004)) [@robertmaynard](https://github.com/robertmaynard)
- Revert "Replace most of preprocessor usage in nvcomp adapter with `constexpr`" ([#11999](https://github.com/rapidsai/cudf/pull/11999)) [@vuule](https://github.com/vuule)
- Fix bug where `df.loc` resulting in single row could give wrong index ([#11998](https://github.com/rapidsai/cudf/pull/11998)) [@eriknw](https://github.com/eriknw)
- Switch to DISABLE_DEPRECATION_WARNINGS to match other RAPIDS projects ([#11989](https://github.com/rapidsai/cudf/pull/11989)) [@robertmaynard](https://github.com/robertmaynard)
- Fix maximum page size estimate in Parquet writer ([#11962](https://github.com/rapidsai/cudf/pull/11962)) [@vuule](https://github.com/vuule)
- Fix local offset handling in bgzip reader ([#11918](https://github.com/rapidsai/cudf/pull/11918)) [@upsj](https://github.com/upsj)
- Fix an issue reading struct-of-list types in Parquet. ([#11910](https://github.com/rapidsai/cudf/pull/11910)) [@nvdbaranec](https://github.com/nvdbaranec)
- Fix memcheck error in TypeInference.Timestamp gtest ([#11905](https://github.com/rapidsai/cudf/pull/11905)) [@davidwendt](https://github.com/davidwendt)
- Fix type casting in Series.__setitem__ ([#11904](https://github.com/rapidsai/cudf/pull/11904)) [@wence-](https://github.com/wence-)
- Fix memcheck error in get_dremel_data ([#11903](https://github.com/rapidsai/cudf/pull/11903)) [@davidwendt](https://github.com/davidwendt)
- Fixes Unsupported column type error due to empty list columns in Nested JSON reader ([#11897](https://github.com/rapidsai/cudf/pull/11897)) [@karthikeyann](https://github.com/karthikeyann)
- Fix segmented-sort to ignore indices outside the offsets ([#11888](https://github.com/rapidsai/cudf/pull/11888)) [@davidwendt](https://github.com/davidwendt)
- Fix cudf::stable_sorted_order for NaN and -NaN in FLOAT64 columns ([#11874](https://github.com/rapidsai/cudf/pull/11874)) [@davidwendt](https://github.com/davidwendt)
- Fix writing of Parquet files with many fragments ([#11869](https://github.com/rapidsai/cudf/pull/11869)) [@etseidl](https://github.com/etseidl)
- Fix RangeIndex unary operators. ([#11868](https://github.com/rapidsai/cudf/pull/11868)) [@vyasr](https://github.com/vyasr)
- JNI Avoid NPE for reading host binary data ([#11865](https://github.com/rapidsai/cudf/pull/11865)) [@revans2](https://github.com/revans2)
- Fix decimal benchmark input data generation ([#11863](https://github.com/rapidsai/cudf/pull/11863)) [@karthikeyann](https://github.com/karthikeyann)
- Fix pre-commit copyright check ([#11860](https://github.com/rapidsai/cudf/pull/11860)) [@galipremsagar](https://github.com/galipremsagar)
- Fix Parquet support for seconds and milliseconds duration types ([#11854](https://github.com/rapidsai/cudf/pull/11854)) [@vuule](https://github.com/vuule)
- Ensure better compiler cache results between cudf cal-ver branches ([#11835](https://github.com/rapidsai/cudf/pull/11835)) [@robertmaynard](https://github.com/robertmaynard)
- Fix make_column_from_scalar for all-null strings column ([#11807](https://github.com/rapidsai/cudf/pull/11807)) [@davidwendt](https://github.com/davidwendt)
- Tell jitify_preprocess where to search for libnvrtc ([#11787](https://github.com/rapidsai/cudf/pull/11787)) [@robertmaynard](https://github.com/robertmaynard)
- Add V2 page header support to Parquet reader ([#11778](https://github.com/rapidsai/cudf/pull/11778)) [@etseidl](https://github.com/etseidl)
- Parquet reader: bug fix for a num_rows/skip_rows corner case, w/optimization for nested preprocessing ([#11752](https://github.com/rapidsai/cudf/pull/11752)) [@nvdbaranec](https://github.com/nvdbaranec)
- Determine if Arrow has S3 support at runtime in unit test. ([#11560](https://github.com/rapidsai/cudf/pull/11560)) [@bdice](https://github.com/bdice)
## 📖 Documentation
- Use rapidsai CODE_OF_CONDUCT.md ([#12166](https://github.com/rapidsai/cudf/pull/12166)) [@bdice](https://github.com/bdice)
- Add symlinks to notebooks. ([#12128](https://github.com/rapidsai/cudf/pull/12128)) [@bdice](https://github.com/bdice)
- Add `truncate` API to python doc pages ([#12109](https://github.com/rapidsai/cudf/pull/12109)) [@galipremsagar](https://github.com/galipremsagar)
- Update Numba docs links. ([#12107](https://github.com/rapidsai/cudf/pull/12107)) [@bdice](https://github.com/bdice)
- Remove "Multi-GPU with Dask-cuDF" notebook. ([#12095](https://github.com/rapidsai/cudf/pull/12095)) [@bdice](https://github.com/bdice)
- Fix link to c++ developer guide from `CONTRIBUTING.md` ([#12084](https://github.com/rapidsai/cudf/pull/12084)) [@brandon-b-miller](https://github.com/brandon-b-miller)
- Add pivot_table and crosstab to docs. ([#12014](https://github.com/rapidsai/cudf/pull/12014)) [@bdice](https://github.com/bdice)
- Fix doxygen text for cudf::dictionary::encode ([#11991](https://github.com/rapidsai/cudf/pull/11991)) [@davidwendt](https://github.com/davidwendt)
- Replace default_stream_value with get_default_stream in docs. ([#11985](https://github.com/rapidsai/cudf/pull/11985)) [@vyasr](https://github.com/vyasr)
- Add dtype docs pages and docstrings for `cudf` specific dtypes ([#11974](https://github.com/rapidsai/cudf/pull/11974)) [@galipremsagar](https://github.com/galipremsagar)
- Update Unit Testing in libcudf guidelines to code tests outside the cudf::test namespace ([#11959](https://github.com/rapidsai/cudf/pull/11959)) [@davidwendt](https://github.com/davidwendt)
- Rename libcudf++ to libcudf. ([#11953](https://github.com/rapidsai/cudf/pull/11953)) [@bdice](https://github.com/bdice)
- Fix documentation referring to removed as_gpu_matrix method. ([#11937](https://github.com/rapidsai/cudf/pull/11937)) [@bdice](https://github.com/bdice)
- Remove "experimental" warning for struct columns in ORC reader and writer ([#11880](https://github.com/rapidsai/cudf/pull/11880)) [@vuule](https://github.com/vuule)
- Initial draft of policies and guidelines for libcudf usage. ([#11853](https://github.com/rapidsai/cudf/pull/11853)) [@vyasr](https://github.com/vyasr)
- Add clear indication of non-GPU accelerated parameters in read_json docstring ([#11825](https://github.com/rapidsai/cudf/pull/11825)) [@GregoryKimball](https://github.com/GregoryKimball)
- Add developer docs for writing tests ([#11199](https://github.com/rapidsai/cudf/pull/11199)) [@vyasr](https://github.com/vyasr)
## 🚀 New Features
- Adds an EventHandler to Java MemoryBuffer to be invoked on close ([#12125](https://github.com/rapidsai/cudf/pull/12125)) [@abellina](https://github.com/abellina)
- Support `+` in `strings_udf` ([#12117](https://github.com/rapidsai/cudf/pull/12117)) [@brandon-b-miller](https://github.com/brandon-b-miller)
- Support `upper` and `lower` in `strings_udf` ([#12099](https://github.com/rapidsai/cudf/pull/12099)) [@brandon-b-miller](https://github.com/brandon-b-miller)
- Add wheel builds ([#12096](https://github.com/rapidsai/cudf/pull/12096)) [@vyasr](https://github.com/vyasr)
- Allow setting malloc heap size in string udfs ([#12094](https://github.com/rapidsai/cudf/pull/12094)) [@brandon-b-miller](https://github.com/brandon-b-miller)
- Support `strip`, `lstrip`, and `rstrip` in `strings_udf` ([#12091](https://github.com/rapidsai/cudf/pull/12091)) [@brandon-b-miller](https://github.com/brandon-b-miller)
- Mark nvcomp zstd compression stable ([#12059](https://github.com/rapidsai/cudf/pull/12059)) [@jbrennan333](https://github.com/jbrennan333)
- Add debug-only onAllocated/onDeallocated to RmmEventHandler ([#12054](https://github.com/rapidsai/cudf/pull/12054)) [@abellina](https://github.com/abellina)
- Enable building against the libarrow contained in pyarrow ([#12034](https://github.com/rapidsai/cudf/pull/12034)) [@vyasr](https://github.com/vyasr)
- Add strings `like` jni and native method ([#12032](https://github.com/rapidsai/cudf/pull/12032)) [@cindyyuanjiang](https://github.com/cindyyuanjiang)
- Cleanup common parsing code in JSON, CSV reader ([#12022](https://github.com/rapidsai/cudf/pull/12022)) [@karthikeyann](https://github.com/karthikeyann)
- byte_range support for JSON Lines format ([#12017](https://github.com/rapidsai/cudf/pull/12017)) [@karthikeyann](https://github.com/karthikeyann)
- Minor cleanup of root CMakeLists.txt for better organization ([#11988](https://github.com/rapidsai/cudf/pull/11988)) [@robertmaynard](https://github.com/robertmaynard)
- Add inplace arithmetic operators to `MaskedType` ([#11987](https://github.com/rapidsai/cudf/pull/11987)) [@brandon-b-miller](https://github.com/brandon-b-miller)
- Implement JNI for chunked Parquet reader ([#11961](https://github.com/rapidsai/cudf/pull/11961)) [@ttnghia](https://github.com/ttnghia)
- Add method argument to DataFrame.quantile ([#11957](https://github.com/rapidsai/cudf/pull/11957)) [@rjzamora](https://github.com/rjzamora)
- Add gpu memory watermark apis to JNI ([#11950](https://github.com/rapidsai/cudf/pull/11950)) [@abellina](https://github.com/abellina)
- Adds retryCount to RmmEventHandler.onAllocFailure ([#11940](https://github.com/rapidsai/cudf/pull/11940)) [@abellina](https://github.com/abellina)
- Enable returning string data from UDFs used through `apply` ([#11933](https://github.com/rapidsai/cudf/pull/11933)) [@brandon-b-miller](https://github.com/brandon-b-miller)
- Switch over to rapids-cmake patches for thrust ([#11921](https://github.com/rapidsai/cudf/pull/11921)) [@robertmaynard](https://github.com/robertmaynard)
- Add strings udf C++ classes and functions for phase II ([#11912](https://github.com/rapidsai/cudf/pull/11912)) [@davidwendt](https://github.com/davidwendt)
- Trim quotes for non-string values in nested json parsing ([#11898](https://github.com/rapidsai/cudf/pull/11898)) [@karthikeyann](https://github.com/karthikeyann)
- Enable CEC for `strings_udf` ([#11884](https://github.com/rapidsai/cudf/pull/11884)) [@brandon-b-miller](https://github.com/brandon-b-miller)
- ArrowIPCTableWriter writes an empty batch in the case of an empty table. ([#11883](https://github.com/rapidsai/cudf/pull/11883)) [@firestarman](https://github.com/firestarman)
- Implement chunked Parquet reader ([#11867](https://github.com/rapidsai/cudf/pull/11867)) [@ttnghia](https://github.com/ttnghia)
- Add `read_orc_metadata` to libcudf ([#11815](https://github.com/rapidsai/cudf/pull/11815)) [@vuule](https://github.com/vuule)
- Support nested types as groupby keys in libcudf ([#11792](https://github.com/rapidsai/cudf/pull/11792)) [@PointKernel](https://github.com/PointKernel)
- Adding feature Truncate to DataFrame and Series ([#11435](https://github.com/rapidsai/cudf/pull/11435)) [@VamsiTallam95](https://github.com/VamsiTallam95)
## 🛠️ Improvements
- Reduce number of tests marked `spilling` ([#12197](https://github.com/rapidsai/cudf/pull/12197)) [@madsbk](https://github.com/madsbk)
- Pin `dask` and `distributed` for release ([#12165](https://github.com/rapidsai/cudf/pull/12165)) [@galipremsagar](https://github.com/galipremsagar)
- Don't rely on GNU find in headers_test.sh ([#12164](https://github.com/rapidsai/cudf/pull/12164)) [@wence-](https://github.com/wence-)
- Update cp.clip call ([#12148](https://github.com/rapidsai/cudf/pull/12148)) [@quasiben](https://github.com/quasiben)
- Enable automatic column projection in groupby().agg ([#12124](https://github.com/rapidsai/cudf/pull/12124)) [@rjzamora](https://github.com/rjzamora)
- Refactor `purge_nonempty_nulls` ([#12111](https://github.com/rapidsai/cudf/pull/12111)) [@ttnghia](https://github.com/ttnghia)
- Create an `int8` column in `read_csv` when all elements are missing ([#12110](https://github.com/rapidsai/cudf/pull/12110)) [@vuule](https://github.com/vuule)
- Spilling to host memory ([#12106](https://github.com/rapidsai/cudf/pull/12106)) [@madsbk](https://github.com/madsbk)
- First pass of `pd.read_orc` changes in tests ([#12103](https://github.com/rapidsai/cudf/pull/12103)) [@galipremsagar](https://github.com/galipremsagar)
- Expose engine argument in dask_cudf.read_json ([#12101](https://github.com/rapidsai/cudf/pull/12101)) [@rjzamora](https://github.com/rjzamora)
- Remove CUDA 10 compatibility code. ([#12088](https://github.com/rapidsai/cudf/pull/12088)) [@bdice](https://github.com/bdice)
- Move and update `dask` nightly install in CI ([#12082](https://github.com/rapidsai/cudf/pull/12082)) [@galipremsagar](https://github.com/galipremsagar)
- Throw an error when libcudf is built without cuFile and `LIBCUDF_CUFILE_POLICY` is set to `"ALWAYS"` ([#12080](https://github.com/rapidsai/cudf/pull/12080)) [@vuule](https://github.com/vuule)
- Remove macros that inspect the contents of exceptions ([#12076](https://github.com/rapidsai/cudf/pull/12076)) [@vyasr](https://github.com/vyasr)
- Fix ingest_raw_data performance issue in Nested JSON reader due to RVO ([#12070](https://github.com/rapidsai/cudf/pull/12070)) [@karthikeyann](https://github.com/karthikeyann)
- Remove overflow error during decimal binops ([#12063](https://github.com/rapidsai/cudf/pull/12063)) [@galipremsagar](https://github.com/galipremsagar)
- Change cudf::detail::tdigest to cudf::tdigest::detail ([#12050](https://github.com/rapidsai/cudf/pull/12050)) [@davidwendt](https://github.com/davidwendt)
- Fix quantile gtests coded in namespace cudf::test ([#12049](https://github.com/rapidsai/cudf/pull/12049)) [@davidwendt](https://github.com/davidwendt)
- Add support for `DataFrame.from_dict`/`to_dict` and `Series.to_dict` ([#12048](https://github.com/rapidsai/cudf/pull/12048)) [@galipremsagar](https://github.com/galipremsagar)
- Refactor Parquet reader ([#12046](https://github.com/rapidsai/cudf/pull/12046)) [@ttnghia](https://github.com/ttnghia)
- Forward merge 22.10 into 22.12 ([#12045](https://github.com/rapidsai/cudf/pull/12045)) [@vyasr](https://github.com/vyasr)
- Standardize newlines at ends of files. ([#12042](https://github.com/rapidsai/cudf/pull/12042)) [@bdice](https://github.com/bdice)
- Trim trailing whitespace from all files. ([#12041](https://github.com/rapidsai/cudf/pull/12041)) [@bdice](https://github.com/bdice)
- Use nosync policy in gather and scatter implementations. ([#12038](https://github.com/rapidsai/cudf/pull/12038)) [@bdice](https://github.com/bdice)
- Remove smart quotes from all docstrings. ([#12035](https://github.com/rapidsai/cudf/pull/12035)) [@bdice](https://github.com/bdice)
- Update cuda-python dependency to 11.7.1 ([#12030](https://github.com/rapidsai/cudf/pull/12030)) [@galipremsagar](https://github.com/galipremsagar)
- Add cython-lint to pre-commit checks. ([#12020](https://github.com/rapidsai/cudf/pull/12020)) [@bdice](https://github.com/bdice)
- Use pragma once ([#12019](https://github.com/rapidsai/cudf/pull/12019)) [@bdice](https://github.com/bdice)
- New GHA to add issues/prs to project board ([#12016](https://github.com/rapidsai/cudf/pull/12016)) [@jarmak-nv](https://github.com/jarmak-nv)
- Add DataFrame.pivot_table. ([#12015](https://github.com/rapidsai/cudf/pull/12015)) [@bdice](https://github.com/bdice)
- Rollback of `DeviceBufferLike` ([#12009](https://github.com/rapidsai/cudf/pull/12009)) [@madsbk](https://github.com/madsbk)
- Remove default parameters for nvtext::detail functions ([#12007](https://github.com/rapidsai/cudf/pull/12007)) [@davidwendt](https://github.com/davidwendt)
- Remove default parameters for cudf::dictionary::detail functions ([#12006](https://github.com/rapidsai/cudf/pull/12006)) [@davidwendt](https://github.com/davidwendt)
- Remove unused `managed_allocator` ([#12005](https://github.com/rapidsai/cudf/pull/12005)) [@vyasr](https://github.com/vyasr)
- Remove default parameters for cudf::strings::detail functions ([#12003](https://github.com/rapidsai/cudf/pull/12003)) [@davidwendt](https://github.com/davidwendt)
- Remove unnecessary code from dask-cudf _Frame ([#12001](https://github.com/rapidsai/cudf/pull/12001)) [@rjzamora](https://github.com/rjzamora)
- Ignore python docs build artifacts ([#12000](https://github.com/rapidsai/cudf/pull/12000)) [@galipremsagar](https://github.com/galipremsagar)
- Use rapids-cmake for google benchmark. ([#11997](https://github.com/rapidsai/cudf/pull/11997)) [@vyasr](https://github.com/vyasr)
- Leverage rapids_cython for more automated RPATH handling ([#11996](https://github.com/rapidsai/cudf/pull/11996)) [@vyasr](https://github.com/vyasr)
- Remove stale labeler ([#11995](https://github.com/rapidsai/cudf/pull/11995)) [@raydouglass](https://github.com/raydouglass)
- Move protobuf compilation to CMake ([#11986](https://github.com/rapidsai/cudf/pull/11986)) [@vyasr](https://github.com/vyasr)
- Replace most of preprocessor usage in nvcomp adapter with `constexpr` ([#11980](https://github.com/rapidsai/cudf/pull/11980)) [@vuule](https://github.com/vuule)
- Add missing noexcepts to column_in_metadata methods ([#11973](https://github.com/rapidsai/cudf/pull/11973)) [@vyasr](https://github.com/vyasr)
- Pass column names to `write_csv` instead of `table_metadata` pointer ([#11972](https://github.com/rapidsai/cudf/pull/11972)) [@vuule](https://github.com/vuule)
- Accelerate libcudf segmented sort with CUB segmented sort ([#11969](https://github.com/rapidsai/cudf/pull/11969)) [@davidwendt](https://github.com/davidwendt)
- Feature/remove default streams ([#11967](https://github.com/rapidsai/cudf/pull/11967)) [@vyasr](https://github.com/vyasr)
- Add pool memory resource to libcudf basic example ([#11966](https://github.com/rapidsai/cudf/pull/11966)) [@davidwendt](https://github.com/davidwendt)
- Fix some libcudf calls to cudf::detail::gather ([#11963](https://github.com/rapidsai/cudf/pull/11963)) [@davidwendt](https://github.com/davidwendt)
- Accept const refs instead of const unique_ptr refs in reduce and scan APIs. ([#11960](https://github.com/rapidsai/cudf/pull/11960)) [@vyasr](https://github.com/vyasr)
- Add deprecation warning for set_allocator. ([#11958](https://github.com/rapidsai/cudf/pull/11958)) [@vyasr](https://github.com/vyasr)
- Fix lists and structs gtests coded in namespace cudf::test ([#11956](https://github.com/rapidsai/cudf/pull/11956)) [@davidwendt](https://github.com/davidwendt)
- Add full page indexes to Parquet writer benchmarks ([#11955](https://github.com/rapidsai/cudf/pull/11955)) [@etseidl](https://github.com/etseidl)
- Use gather-based strings factory in cudf::strings::strip ([#11954](https://github.com/rapidsai/cudf/pull/11954)) [@davidwendt](https://github.com/davidwendt)
- Default to equal NaNs in make_merge_sets_aggregation. ([#11952](https://github.com/rapidsai/cudf/pull/11952)) [@bdice](https://github.com/bdice)
- Add `strip_delimiters` option to `read_text` ([#11946](https://github.com/rapidsai/cudf/pull/11946)) [@upsj](https://github.com/upsj)
- Refactor multibyte_split `output_builder` ([#11945](https://github.com/rapidsai/cudf/pull/11945)) [@upsj](https://github.com/upsj)
- Remove validation that requires introspection ([#11938](https://github.com/rapidsai/cudf/pull/11938)) [@vyasr](https://github.com/vyasr)
- Add `.str.find_multiple` API ([#11928](https://github.com/rapidsai/cudf/pull/11928)) [@galipremsagar](https://github.com/galipremsagar)
- Add regex_program class for use with all regex APIs ([#11927](https://github.com/rapidsai/cudf/pull/11927)) [@davidwendt](https://github.com/davidwendt)
- Enable backend dispatching for Dask-DataFrame creation ([#11920](https://github.com/rapidsai/cudf/pull/11920)) [@rjzamora](https://github.com/rjzamora)
- Performance improvement in JSON Tree traversal ([#11919](https://github.com/rapidsai/cudf/pull/11919)) [@karthikeyann](https://github.com/karthikeyann)
- Fix some gtests incorrectly coded in namespace cudf::test (part I) ([#11917](https://github.com/rapidsai/cudf/pull/11917)) [@davidwendt](https://github.com/davidwendt)
- Refactor pad/zfill functions for reuse with strings udf ([#11914](https://github.com/rapidsai/cudf/pull/11914)) [@davidwendt](https://github.com/davidwendt)
- Add `nanosecond` & `microsecond` to `DatetimeProperties` ([#11911](https://github.com/rapidsai/cudf/pull/11911)) [@galipremsagar](https://github.com/galipremsagar)
- Pin mimesis version in setup.py. ([#11906](https://github.com/rapidsai/cudf/pull/11906)) [@bdice](https://github.com/bdice)
- Error on `ListColumn` or any new unsupported column in `cudf.Index` ([#11902](https://github.com/rapidsai/cudf/pull/11902)) [@galipremsagar](https://github.com/galipremsagar)
- Add thrust output iterator fix (1805) to thrust.patch ([#11900](https://github.com/rapidsai/cudf/pull/11900)) [@davidwendt](https://github.com/davidwendt)
- Relax `codecov` threshold diff ([#11899](https://github.com/rapidsai/cudf/pull/11899)) [@galipremsagar](https://github.com/galipremsagar)
- Use public APIs in STREAM_COMPACTION_NVBENCH ([#11892](https://github.com/rapidsai/cudf/pull/11892)) [@GregoryKimball](https://github.com/GregoryKimball)
- Add coverage for string UDF tests. ([#11891](https://github.com/rapidsai/cudf/pull/11891)) [@vyasr](https://github.com/vyasr)
- Provide `data_chunk_source` wrapper for `datasource` ([#11886](https://github.com/rapidsai/cudf/pull/11886)) [@upsj](https://github.com/upsj)
- Handle `multibyte_split` byte_range out-of-bounds offsets on host ([#11885](https://github.com/rapidsai/cudf/pull/11885)) [@upsj](https://github.com/upsj)
- Add tests ensuring that cudf's default stream is always used ([#11875](https://github.com/rapidsai/cudf/pull/11875)) [@vyasr](https://github.com/vyasr)
- Change expect_strings_empty into expect_column_empty libcudf test utility ([#11873](https://github.com/rapidsai/cudf/pull/11873)) [@davidwendt](https://github.com/davidwendt)
- Add ngroup ([#11871](https://github.com/rapidsai/cudf/pull/11871)) [@shwina](https://github.com/shwina)
- Reduce memory usage in nested JSON parser - tree generation ([#11864](https://github.com/rapidsai/cudf/pull/11864)) [@karthikeyann](https://github.com/karthikeyann)
- Unpin `dask` and `distributed` for development ([#11859](https://github.com/rapidsai/cudf/pull/11859)) [@galipremsagar](https://github.com/galipremsagar)
- Remove unused includes for table/row_operators ([#11857](https://github.com/rapidsai/cudf/pull/11857)) [@GregoryKimball](https://github.com/GregoryKimball)
- Use conda-forge's `pyorc` ([#11855](https://github.com/rapidsai/cudf/pull/11855)) [@jakirkham](https://github.com/jakirkham)
- Add libcudf strings examples ([#11849](https://github.com/rapidsai/cudf/pull/11849)) [@davidwendt](https://github.com/davidwendt)
- Remove `cudf_io` namespace alias ([#11827](https://github.com/rapidsai/cudf/pull/11827)) [@vuule](https://github.com/vuule)
- Test/remove thrust vector usage ([#11813](https://github.com/rapidsai/cudf/pull/11813)) [@vyasr](https://github.com/vyasr)
- Add BGZIP reader to python `read_text` ([#11802](https://github.com/rapidsai/cudf/pull/11802)) [@upsj](https://github.com/upsj)
- Merge branch-22.10 into branch-22.12 ([#11801](https://github.com/rapidsai/cudf/pull/11801)) [@davidwendt](https://github.com/davidwendt)
- Fix compile warning from CUDF_FUNC_RANGE in a member function ([#11798](https://github.com/rapidsai/cudf/pull/11798)) [@davidwendt](https://github.com/davidwendt)
- Update cudf JNI version to 22.12.0-SNAPSHOT ([#11764](https://github.com/rapidsai/cudf/pull/11764)) [@pxLi](https://github.com/pxLi)
- Update flake8 to 5.0.4 and use flake8-force to check Cython. ([#11736](https://github.com/rapidsai/cudf/pull/11736)) [@bdice](https://github.com/bdice)
- Add BGZIP multibyte_split benchmark ([#11723](https://github.com/rapidsai/cudf/pull/11723)) [@upsj](https://github.com/upsj)
- Bifurcate Dependency Lists ([#11674](https://github.com/rapidsai/cudf/pull/11674)) [@bdice](https://github.com/bdice)
- Default to equal NaNs in make_collect_set_aggregation. ([#11621](https://github.com/rapidsai/cudf/pull/11621)) [@bdice](https://github.com/bdice)
- Conform "bench_isin" to match generator column names ([#11549](https://github.com/rapidsai/cudf/pull/11549)) [@GregoryKimball](https://github.com/GregoryKimball)
- Removing int8 column option from parquet byte_array writing ([#11539](https://github.com/rapidsai/cudf/pull/11539)) [@hyperbolic2346](https://github.com/hyperbolic2346)
- Add checks for HLG layers in dask-cudf groupby tests ([#10853](https://github.com/rapidsai/cudf/pull/10853)) [@charlesbluca](https://github.com/charlesbluca)
- part1: Simplify BaseIndex to an abstract class ([#10389](https://github.com/rapidsai/cudf/pull/10389)) [@skirui-source](https://github.com/skirui-source)
- Make all `nvcc` warnings into errors ([#8916](https://github.com/rapidsai/cudf/pull/8916)) [@trxcllnt](https://github.com/trxcllnt)
# cuDF 22.10.00 (12 Oct 2022)
## 🚨 Breaking Changes
- Disable Zstandard decompression on nvCOMP 2.4 and Pascal GPUs ([#11856](https://github.com/rapidsai/cudf/pull/11856)) [@vuule](https://github.com/vuule)
- Disable nvCOMP DEFLATE integration ([#11811](https://github.com/rapidsai/cudf/pull/11811)) [@vuule](https://github.com/vuule)
- Fix return type of `Index.isna` & `Index.notna` ([#11769](https://github.com/rapidsai/cudf/pull/11769)) [@galipremsagar](https://github.com/galipremsagar)
- Remove `kwargs` in `read_csv` & `to_csv` ([#11762](https://github.com/rapidsai/cudf/pull/11762)) [@galipremsagar](https://github.com/galipremsagar)
- Fix `cudf::partition*` APIs that do not return offsets for empty output table ([#11709](https://github.com/rapidsai/cudf/pull/11709)) [@ttnghia](https://github.com/ttnghia)
- Fix regex negated classes to not automatically include new-lines ([#11644](https://github.com/rapidsai/cudf/pull/11644)) [@davidwendt](https://github.com/davidwendt)
- Update zfill to match Python output ([#11634](https://github.com/rapidsai/cudf/pull/11634)) [@davidwendt](https://github.com/davidwendt)
- Upgrade `pandas` to `1.5` ([#11617](https://github.com/rapidsai/cudf/pull/11617)) [@galipremsagar](https://github.com/galipremsagar)
- Change default value of `ordered` to `False` in `CategoricalDtype` ([#11604](https://github.com/rapidsai/cudf/pull/11604)) [@galipremsagar](https://github.com/galipremsagar)
- Move cudf::strings::findall_record to cudf::strings::findall ([#11575](https://github.com/rapidsai/cudf/pull/11575)) [@davidwendt](https://github.com/davidwendt)
- Adding optional parquet reader schema ([#11524](https://github.com/rapidsai/cudf/pull/11524)) [@hyperbolic2346](https://github.com/hyperbolic2346)
- Deprecate `skiprows` and `num_rows` in `read_orc` ([#11522](https://github.com/rapidsai/cudf/pull/11522)) [@galipremsagar](https://github.com/galipremsagar)
- Remove support for skip_rows / num_rows options in the parquet reader. ([#11503](https://github.com/rapidsai/cudf/pull/11503)) [@nvdbaranec](https://github.com/nvdbaranec)
- Drop support for `skiprows` and `num_rows` in `cudf.read_parquet` ([#11480](https://github.com/rapidsai/cudf/pull/11480)) [@galipremsagar](https://github.com/galipremsagar)
- Disable Arrow S3 support by default. ([#11470](https://github.com/rapidsai/cudf/pull/11470)) [@bdice](https://github.com/bdice)
- Convert thrust::optional usages to std::optional ([#11455](https://github.com/rapidsai/cudf/pull/11455)) [@robertmaynard](https://github.com/robertmaynard)
- Remove unused is_struct trait. ([#11450](https://github.com/rapidsai/cudf/pull/11450)) [@bdice](https://github.com/bdice)
- Refactor the `Buffer` class ([#11447](https://github.com/rapidsai/cudf/pull/11447)) [@madsbk](https://github.com/madsbk)
- Return empty dataframe when reading an ORC file using empty `columns` option ([#11446](https://github.com/rapidsai/cudf/pull/11446)) [@vuule](https://github.com/vuule)
- Refactor pad_side and strip_type enums into side_type enum ([#11438](https://github.com/rapidsai/cudf/pull/11438)) [@davidwendt](https://github.com/davidwendt)
- Remove HASH_SERIAL_MURMUR3 / serial32BitMurmurHash3 ([#11383](https://github.com/rapidsai/cudf/pull/11383)) [@bdice](https://github.com/bdice)
- Use the new JSON parser when the experimental reader is selected ([#11364](https://github.com/rapidsai/cudf/pull/11364)) [@vuule](https://github.com/vuule)
- Remove deprecated Series.applymap. ([#11031](https://github.com/rapidsai/cudf/pull/11031)) [@bdice](https://github.com/bdice)
- Remove deprecated expand parameter from str.findall. ([#11030](https://github.com/rapidsai/cudf/pull/11030)) [@bdice](https://github.com/bdice)
## 🐛 Bug Fixes
- Fix bug in temporary decompression space estimation before calling nvCOMP ([#11879](https://github.com/rapidsai/cudf/pull/11879)) [@abellina](https://github.com/abellina)
- Handle `ptx` file paths during `strings_udf` import ([#11862](https://github.com/rapidsai/cudf/pull/11862)) [@galipremsagar](https://github.com/galipremsagar)
- Disable Zstandard decompression on nvCOMP 2.4 and Pascal GPUs ([#11856](https://github.com/rapidsai/cudf/pull/11856)) [@vuule](https://github.com/vuule)
- Reset `strings_udf` CEC and solve several related issues ([#11846](https://github.com/rapidsai/cudf/pull/11846)) [@brandon-b-miller](https://github.com/brandon-b-miller)
- Fix bug in new shuffle-based groupby implementation ([#11836](https://github.com/rapidsai/cudf/pull/11836)) [@rjzamora](https://github.com/rjzamora)
- Fix `is_valid` checks in `Scalar._binaryop` ([#11818](https://github.com/rapidsai/cudf/pull/11818)) [@wence-](https://github.com/wence-)
- Fix operator `NotImplemented` issue with `numpy` ([#11816](https://github.com/rapidsai/cudf/pull/11816)) [@galipremsagar](https://github.com/galipremsagar)
- Disable nvCOMP DEFLATE integration ([#11811](https://github.com/rapidsai/cudf/pull/11811)) [@vuule](https://github.com/vuule)
- Build `strings_udf` package with other python packages in nightlies ([#11808](https://github.com/rapidsai/cudf/pull/11808)) [@brandon-b-miller](https://github.com/brandon-b-miller)
- Revert problematic shuffle=explicit-comms changes ([#11803](https://github.com/rapidsai/cudf/pull/11803)) [@rjzamora](https://github.com/rjzamora)
- Fix regex out-of-bounds write in strided rows logic ([#11797](https://github.com/rapidsai/cudf/pull/11797)) [@davidwendt](https://github.com/davidwendt)
- Build `cudf` locally before building `strings_udf` conda packages in CI ([#11785](https://github.com/rapidsai/cudf/pull/11785)) [@brandon-b-miller](https://github.com/brandon-b-miller)
- Fix an issue in cudf::row_bit_count involving structs and lists at multiple levels. ([#11779](https://github.com/rapidsai/cudf/pull/11779)) [@nvdbaranec](https://github.com/nvdbaranec)
- Fix return type of `Index.isna` & `Index.notna` ([#11769](https://github.com/rapidsai/cudf/pull/11769)) [@galipremsagar](https://github.com/galipremsagar)
- Fix issue with set-item in case of `list` and `struct` types ([#11760](https://github.com/rapidsai/cudf/pull/11760)) [@galipremsagar](https://github.com/galipremsagar)
- Ensure all libcudf APIs run on cudf's default stream ([#11759](https://github.com/rapidsai/cudf/pull/11759)) [@vyasr](https://github.com/vyasr)
- Resolve dask_cudf failures caused by upstream groupby changes ([#11755](https://github.com/rapidsai/cudf/pull/11755)) [@rjzamora](https://github.com/rjzamora)
- Fix ORC string sum statistics ([#11740](https://github.com/rapidsai/cudf/pull/11740)) [@vuule](https://github.com/vuule)
- Add `strings_udf` package for python 3.9 ([#11730](https://github.com/rapidsai/cudf/pull/11730)) [@brandon-b-miller](https://github.com/brandon-b-miller)
- Ensure that all tests launch kernels on cudf's default stream ([#11726](https://github.com/rapidsai/cudf/pull/11726)) [@vyasr](https://github.com/vyasr)
- Don't assume stream is a compile-time constant expression ([#11725](https://github.com/rapidsai/cudf/pull/11725)) [@vyasr](https://github.com/vyasr)
- Fix get_thrust.cmake format at patch command ([#11715](https://github.com/rapidsai/cudf/pull/11715)) [@davidwendt](https://github.com/davidwendt)
- Fix `cudf::partition*` APIs that do not return offsets for empty output table ([#11709](https://github.com/rapidsai/cudf/pull/11709)) [@ttnghia](https://github.com/ttnghia)
- Fix cudf::lists::sort_lists for NaN and Infinity values ([#11703](https://github.com/rapidsai/cudf/pull/11703)) [@davidwendt](https://github.com/davidwendt)
- Modify ORC reader timestamp parsing to match the apache reader behavior ([#11699](https://github.com/rapidsai/cudf/pull/11699)) [@vuule](https://github.com/vuule)
- Fix `DataFrame.from_arrow` to preserve type metadata ([#11698](https://github.com/rapidsai/cudf/pull/11698)) [@galipremsagar](https://github.com/galipremsagar)
- Fix compile error due to missing header ([#11697](https://github.com/rapidsai/cudf/pull/11697)) [@ttnghia](https://github.com/ttnghia)
- Default to Snappy compression in `to_orc` when using cuDF or Dask ([#11690](https://github.com/rapidsai/cudf/pull/11690)) [@vuule](https://github.com/vuule)
- Fix an issue related to `MultiIndex` when `group_keys=True` ([#11689](https://github.com/rapidsai/cudf/pull/11689)) [@galipremsagar](https://github.com/galipremsagar)
- Transfer correct dtype to exploded column ([#11687](https://github.com/rapidsai/cudf/pull/11687)) [@wence-](https://github.com/wence-)
- Ignore protobuf generated files in `mypy` checks ([#11685](https://github.com/rapidsai/cudf/pull/11685)) [@galipremsagar](https://github.com/galipremsagar)
- Maintain the index name after `.loc` ([#11677](https://github.com/rapidsai/cudf/pull/11677)) [@shwina](https://github.com/shwina)
- Fix issue with extracting nested column data & dtype preservation ([#11671](https://github.com/rapidsai/cudf/pull/11671)) [@galipremsagar](https://github.com/galipremsagar)
- Ensure that all cudf tests and benchmarks are conda env aware ([#11666](https://github.com/rapidsai/cudf/pull/11666)) [@robertmaynard](https://github.com/robertmaynard)
- Update to Thrust 1.17.2 to fix cub ODR issues ([#11665](https://github.com/rapidsai/cudf/pull/11665)) [@robertmaynard](https://github.com/robertmaynard)
- Fix multi-file remote datasource bug ([#11655](https://github.com/rapidsai/cudf/pull/11655)) [@rjzamora](https://github.com/rjzamora)
- Fix invalid regex quantifier check to not include alternation ([#11654](https://github.com/rapidsai/cudf/pull/11654)) [@davidwendt](https://github.com/davidwendt)
- Fix bug in `device_write()`: it uses an incorrect size ([#11651](https://github.com/rapidsai/cudf/pull/11651)) [@madsbk](https://github.com/madsbk)
- Fix overflows in benchmarks ([#11649](https://github.com/rapidsai/cudf/pull/11649)) [@elstehle](https://github.com/elstehle)
- Fix regex negated classes to not automatically include new-lines ([#11644](https://github.com/rapidsai/cudf/pull/11644)) [@davidwendt](https://github.com/davidwendt)
- Fix compile error in benchmark nested_json.cpp ([#11637](https://github.com/rapidsai/cudf/pull/11637)) [@davidwendt](https://github.com/davidwendt)
- Update zfill to match Python output ([#11634](https://github.com/rapidsai/cudf/pull/11634)) [@davidwendt](https://github.com/davidwendt)
- Removed converted type for INT32 and INT64 since they do not convert ([#11627](https://github.com/rapidsai/cudf/pull/11627)) [@hyperbolic2346](https://github.com/hyperbolic2346)
- Fix host scalars construction of nested types ([#11612](https://github.com/rapidsai/cudf/pull/11612)) [@galipremsagar](https://github.com/galipremsagar)
- Fix compile warning in nested_json_gpu.cu ([#11607](https://github.com/rapidsai/cudf/pull/11607)) [@davidwendt](https://github.com/davidwendt)
- Change default value of `ordered` to `False` in `CategoricalDtype` ([#11604](https://github.com/rapidsai/cudf/pull/11604)) [@galipremsagar](https://github.com/galipremsagar)
- Preserve order if necessary when deduping categoricals internally ([#11597](https://github.com/rapidsai/cudf/pull/11597)) [@brandon-b-miller](https://github.com/brandon-b-miller)
- Add is_timestamp test for leap second (60) ([#11594](https://github.com/rapidsai/cudf/pull/11594)) [@davidwendt](https://github.com/davidwendt)
- Fix an issue with `to_arrow` when column name type is not a string ([#11590](https://github.com/rapidsai/cudf/pull/11590)) [@galipremsagar](https://github.com/galipremsagar)
- Fix exception in segmented-reduce benchmark ([#11588](https://github.com/rapidsai/cudf/pull/11588)) [@davidwendt](https://github.com/davidwendt)
- Fix encode/decode of negative timestamps in ORC reader/writer ([#11586](https://github.com/rapidsai/cudf/pull/11586)) [@vuule](https://github.com/vuule)
- Correct distribution data type in `quantiles` benchmark ([#11584](https://github.com/rapidsai/cudf/pull/11584)) [@vuule](https://github.com/vuule)
- Fix multibyte_split benchmark for host buffers ([#11583](https://github.com/rapidsai/cudf/pull/11583)) [@upsj](https://github.com/upsj)
- xfail custreamz display test for now ([#11567](https://github.com/rapidsai/cudf/pull/11567)) [@shwina](https://github.com/shwina)
- Fix JNI for TableWithMeta to use schema_info instead of column_names ([#11566](https://github.com/rapidsai/cudf/pull/11566)) [@jlowe](https://github.com/jlowe)
- Reduce code duplication for `dask` & `distributed` nightly/stable installs ([#11565](https://github.com/rapidsai/cudf/pull/11565)) [@galipremsagar](https://github.com/galipremsagar)
- Fix groupby failures in dask_cudf CI ([#11561](https://github.com/rapidsai/cudf/pull/11561)) [@rjzamora](https://github.com/rjzamora)
- Fix for pivot: error when 'values' is a multicharacter string ([#11538](https://github.com/rapidsai/cudf/pull/11538)) [@shaswat-indian](https://github.com/shaswat-indian)
- find_package(cudf) + arrow9 usable with cudf build directory ([#11535](https://github.com/rapidsai/cudf/pull/11535)) [@robertmaynard](https://github.com/robertmaynard)
- Fixing crash when writing binary nested data in parquet ([#11526](https://github.com/rapidsai/cudf/pull/11526)) [@hyperbolic2346](https://github.com/hyperbolic2346)
- Fix error when assigning a value to an empty series ([#11523](https://github.com/rapidsai/cudf/pull/11523)) [@shaswat-indian](https://github.com/shaswat-indian)
- Fix invalid results from conditional-left-anti-join in debug build ([#11517](https://github.com/rapidsai/cudf/pull/11517)) [@davidwendt](https://github.com/davidwendt)
- Fix cmake error after upgrading to Arrow 9 ([#11513](https://github.com/rapidsai/cudf/pull/11513)) [@ttnghia](https://github.com/ttnghia)
- Fix reverse binary operators acting on a host value and cudf.Scalar ([#11512](https://github.com/rapidsai/cudf/pull/11512)) [@bdice](https://github.com/bdice)
- Update parquet fuzz tests to drop support for `skiprows` & `num_rows` ([#11505](https://github.com/rapidsai/cudf/pull/11505)) [@galipremsagar](https://github.com/galipremsagar)
- Use rapids-cmake 22.10 best practice for RAPIDS.cmake location ([#11493](https://github.com/rapidsai/cudf/pull/11493)) [@robertmaynard](https://github.com/robertmaynard)
- Handle some zero-sized corner cases in dlpack interop ([#11449](https://github.com/rapidsai/cudf/pull/11449)) [@wence-](https://github.com/wence-)
- Return empty dataframe when reading an ORC file using empty `columns` option ([#11446](https://github.com/rapidsai/cudf/pull/11446)) [@vuule](https://github.com/vuule)
- Update libcudf C++ example to CPM version 0.35.3 ([#11417](https://github.com/rapidsai/cudf/pull/11417)) [@robertmaynard](https://github.com/robertmaynard)
- Fix regex quantifier check to include capture groups ([#11373](https://github.com/rapidsai/cudf/pull/11373)) [@davidwendt](https://github.com/davidwendt)
- Fix read_text when byte_range is aligned with field ([#11371](https://github.com/rapidsai/cudf/pull/11371)) [@upsj](https://github.com/upsj)
- Fix to_timestamps truncated subsecond calculation ([#11367](https://github.com/rapidsai/cudf/pull/11367)) [@davidwendt](https://github.com/davidwendt)
- column: calculate null_count before release()ing the cudf::column ([#11365](https://github.com/rapidsai/cudf/pull/11365)) [@wence-](https://github.com/wence-)
## 📖 Documentation
- Update `guide-to-udfs` notebook ([#11861](https://github.com/rapidsai/cudf/pull/11861)) [@brandon-b-miller](https://github.com/brandon-b-miller)
- Update docstring for cudf.read_text ([#11799](https://github.com/rapidsai/cudf/pull/11799)) [@GregoryKimball](https://github.com/GregoryKimball)
- Add doc section for `list` & `struct` handling ([#11770](https://github.com/rapidsai/cudf/pull/11770)) [@galipremsagar](https://github.com/galipremsagar)
- Document that minimum required CMake version is now 3.23.1 ([#11751](https://github.com/rapidsai/cudf/pull/11751)) [@robertmaynard](https://github.com/robertmaynard)
- Update libcudf documentation build command in DOCUMENTATION.md ([#11735](https://github.com/rapidsai/cudf/pull/11735)) [@davidwendt](https://github.com/davidwendt)
- Add docs for use of string data to `DataFrame.apply` and `Series.apply` and update guide to UDFs notebook ([#11733](https://github.com/rapidsai/cudf/pull/11733)) [@brandon-b-miller](https://github.com/brandon-b-miller)
- Enable more Pydocstyle rules ([#11582](https://github.com/rapidsai/cudf/pull/11582)) [@bdice](https://github.com/bdice)
- Remove unused cpp/img folder ([#11554](https://github.com/rapidsai/cudf/pull/11554)) [@davidwendt](https://github.com/davidwendt)
- Publish C++ developer docs ([#11475](https://github.com/rapidsai/cudf/pull/11475)) [@vyasr](https://github.com/vyasr)
- Fix a misalignment in `cudf.get_dummies` docstring ([#11443](https://github.com/rapidsai/cudf/pull/11443)) [@galipremsagar](https://github.com/galipremsagar)
- Update contributing doc to include links to the developer guides ([#11390](https://github.com/rapidsai/cudf/pull/11390)) [@davidwendt](https://github.com/davidwendt)
- Fix table_view_base doxygen format ([#11340](https://github.com/rapidsai/cudf/pull/11340)) [@davidwendt](https://github.com/davidwendt)
- Create main developer guide for Python ([#11235](https://github.com/rapidsai/cudf/pull/11235)) [@vyasr](https://github.com/vyasr)
- Add developer documentation for benchmarking ([#11122](https://github.com/rapidsai/cudf/pull/11122)) [@vyasr](https://github.com/vyasr)
- cuDF error handling document ([#7917](https://github.com/rapidsai/cudf/pull/7917)) [@isVoid](https://github.com/isVoid)
## 🚀 New Features
- Add hasNull statistic reading ability to ORC ([#11747](https://github.com/rapidsai/cudf/pull/11747)) [@devavret](https://github.com/devavret)
- Add `istitle` to string UDFs ([#11738](https://github.com/rapidsai/cudf/pull/11738)) [@brandon-b-miller](https://github.com/brandon-b-miller)
- JSON Column creation in GPU ([#11714](https://github.com/rapidsai/cudf/pull/11714)) [@karthikeyann](https://github.com/karthikeyann)
- Adds option to take explicit nested schema for nested JSON reader ([#11682](https://github.com/rapidsai/cudf/pull/11682)) [@elstehle](https://github.com/elstehle)
- Add BGZIP `data_chunk_reader` ([#11652](https://github.com/rapidsai/cudf/pull/11652)) [@upsj](https://github.com/upsj)
- Support DECIMAL order-by for RANGE window functions ([#11645](https://github.com/rapidsai/cudf/pull/11645)) [@mythrocks](https://github.com/mythrocks)
- Change CMake version to 3.23.3 ([#11619](https://github.com/rapidsai/cudf/pull/11619)) [@hyperbolic2346](https://github.com/hyperbolic2346)
- Generate unique keys table in Java JNI `contiguousSplitGroups` ([#11614](https://github.com/rapidsai/cudf/pull/11614)) [@res-life](https://github.com/res-life)
- Generic type casting to support the new nested JSON reader ([#11613](https://github.com/rapidsai/cudf/pull/11613)) [@elstehle](https://github.com/elstehle)
- JSON tree traversal ([#11610](https://github.com/rapidsai/cudf/pull/11610)) [@karthikeyann](https://github.com/karthikeyann)
- Add casting operators to masked UDFs ([#11578](https://github.com/rapidsai/cudf/pull/11578)) [@brandon-b-miller](https://github.com/brandon-b-miller)
- Adds type inference and type conversion for leaf-columns to the nested JSON parser ([#11574](https://github.com/rapidsai/cudf/pull/11574)) [@elstehle](https://github.com/elstehle)
- Add strings 'like' function ([#11558](https://github.com/rapidsai/cudf/pull/11558)) [@davidwendt](https://github.com/davidwendt)
- Handle hyphen as literal for regex cclass when incomplete range ([#11557](https://github.com/rapidsai/cudf/pull/11557)) [@davidwendt](https://github.com/davidwendt)
- Enable ZSTD compression in ORC and Parquet writers ([#11551](https://github.com/rapidsai/cudf/pull/11551)) [@vuule](https://github.com/vuule)
- Adds support for json lines format to the nested JSON reader ([#11534](https://github.com/rapidsai/cudf/pull/11534)) [@elstehle](https://github.com/elstehle)
- Adding optional parquet reader schema ([#11524](https://github.com/rapidsai/cudf/pull/11524)) [@hyperbolic2346](https://github.com/hyperbolic2346)
- Adds GPU implementation of JSON-token-stream to JSON-tree ([#11518](https://github.com/rapidsai/cudf/pull/11518)) [@karthikeyann](https://github.com/karthikeyann)
- Add `gdb` pretty-printers for simple types ([#11499](https://github.com/rapidsai/cudf/pull/11499)) [@upsj](https://github.com/upsj)
- Add `create_random_column` function to the data generator ([#11490](https://github.com/rapidsai/cudf/pull/11490)) [@vuule](https://github.com/vuule)
- Add fluent API builder to `data_profile` ([#11479](https://github.com/rapidsai/cudf/pull/11479)) [@vuule](https://github.com/vuule)
- Adds nested JSON benchmark ([#11466](https://github.com/rapidsai/cudf/pull/11466)) [@karthikeyann](https://github.com/karthikeyann)
- Convert thrust::optional usages to std::optional ([#11455](https://github.com/rapidsai/cudf/pull/11455)) [@robertmaynard](https://github.com/robertmaynard)
- Python API for the future experimental JSON reader ([#11426](https://github.com/rapidsai/cudf/pull/11426)) [@vuule](https://github.com/vuule)
- Return schema info from JSON reader ([#11419](https://github.com/rapidsai/cudf/pull/11419)) [@vuule](https://github.com/vuule)
- Add regex ASCII flag support for matching builtin character classes ([#11404](https://github.com/rapidsai/cudf/pull/11404)) [@davidwendt](https://github.com/davidwendt)
- Truncate parquet column indexes ([#11403](https://github.com/rapidsai/cudf/pull/11403)) [@etseidl](https://github.com/etseidl)
- Adds the end-to-end JSON parser implementation ([#11388](https://github.com/rapidsai/cudf/pull/11388)) [@elstehle](https://github.com/elstehle)
- Use the new JSON parser when the experimental reader is selected ([#11364](https://github.com/rapidsai/cudf/pull/11364)) [@vuule](https://github.com/vuule)
- Add placeholder for the experimental JSON reader ([#11334](https://github.com/rapidsai/cudf/pull/11334)) [@vuule](https://github.com/vuule)
- Add read-only functions on string dtypes to `DataFrame.apply` and `Series.apply` ([#11319](https://github.com/rapidsai/cudf/pull/11319)) [@brandon-b-miller](https://github.com/brandon-b-miller)
- Added 'crosstab' and 'pivot_table' features ([#11314](https://github.com/rapidsai/cudf/pull/11314)) [@shaswat-indian](https://github.com/shaswat-indian)
- Quickly error out when trying to build with unsupported nvcc versions ([#11297](https://github.com/rapidsai/cudf/pull/11297)) [@robertmaynard](https://github.com/robertmaynard)
- Adds JSON tokenizer ([#11264](https://github.com/rapidsai/cudf/pull/11264)) [@elstehle](https://github.com/elstehle)
- List lexicographic comparator ([#11129](https://github.com/rapidsai/cudf/pull/11129)) [@devavret](https://github.com/devavret)
- Add generic type inference for cuIO ([#11121](https://github.com/rapidsai/cudf/pull/11121)) [@PointKernel](https://github.com/PointKernel)
- Fully support nested types in `cudf::contains` ([#10656](https://github.com/rapidsai/cudf/pull/10656)) [@ttnghia](https://github.com/ttnghia)
- Support nested types in `lists::contains` ([#10548](https://github.com/rapidsai/cudf/pull/10548)) [@ttnghia](https://github.com/ttnghia)
## 🛠️ Improvements
- Pin `dask` and `distributed` for release ([#11822](https://github.com/rapidsai/cudf/pull/11822)) [@galipremsagar](https://github.com/galipremsagar)
- Add examples for Nested JSON reader ([#11814](https://github.com/rapidsai/cudf/pull/11814)) [@GregoryKimball](https://github.com/GregoryKimball)
- Support shuffle-based groupby aggregations in dask_cudf ([#11800](https://github.com/rapidsai/cudf/pull/11800)) [@rjzamora](https://github.com/rjzamora)
- Update strings udf version updater script ([#11772](https://github.com/rapidsai/cudf/pull/11772)) [@galipremsagar](https://github.com/galipremsagar)
- Remove `kwargs` in `read_csv` & `to_csv` ([#11762](https://github.com/rapidsai/cudf/pull/11762)) [@galipremsagar](https://github.com/galipremsagar)
- Pass `dtype` param to avoid `pd.Series` warnings ([#11761](https://github.com/rapidsai/cudf/pull/11761)) [@galipremsagar](https://github.com/galipremsagar)
- Enable `schema_element` & `keep_quotes` support in json reader ([#11746](https://github.com/rapidsai/cudf/pull/11746)) [@galipremsagar](https://github.com/galipremsagar)
- Add ability to construct `ListColumn` when size is `None` ([#11745](https://github.com/rapidsai/cudf/pull/11745)) [@galipremsagar](https://github.com/galipremsagar)
- Reduces memory requirements in JSON parser and adds bytes/s and peak memory usage to benchmarks ([#11732](https://github.com/rapidsai/cudf/pull/11732)) [@elstehle](https://github.com/elstehle)
- Add missing copyright headers. ([#11712](https://github.com/rapidsai/cudf/pull/11712)) [@bdice](https://github.com/bdice)
- Fix copyright check issues in pre-commit ([#11711](https://github.com/rapidsai/cudf/pull/11711)) [@bdice](https://github.com/bdice)
- Include decimal in supported types for range window order-by columns ([#11710](https://github.com/rapidsai/cudf/pull/11710)) [@mythrocks](https://github.com/mythrocks)
- Disable very large column gtest for contiguous-split ([#11706](https://github.com/rapidsai/cudf/pull/11706)) [@davidwendt](https://github.com/davidwendt)
- Drop split_out=None test from groupby.agg ([#11704](https://github.com/rapidsai/cudf/pull/11704)) [@wence-](https://github.com/wence-)
- Use CubinLinker for CUDA Minor Version Compatibility ([#11701](https://github.com/rapidsai/cudf/pull/11701)) [@gmarkall](https://github.com/gmarkall)
- Add regex capture-group parameter to auto convert to non-capture groups ([#11695](https://github.com/rapidsai/cudf/pull/11695)) [@davidwendt](https://github.com/davidwendt)
- Add a `__dataframe__` method to the protocol dataframe object ([#11692](https://github.com/rapidsai/cudf/pull/11692)) [@rgommers](https://github.com/rgommers)
- Special-case multibyte_split for single-byte delimiter ([#11681](https://github.com/rapidsai/cudf/pull/11681)) [@upsj](https://github.com/upsj)
- Remove isort exclusions ([#11680](https://github.com/rapidsai/cudf/pull/11680)) [@bdice](https://github.com/bdice)
- Refactor CSV reader benchmarks with nvbench ([#11678](https://github.com/rapidsai/cudf/pull/11678)) [@PointKernel](https://github.com/PointKernel)
- Check conda recipe headers with pre-commit ([#11669](https://github.com/rapidsai/cudf/pull/11669)) [@bdice](https://github.com/bdice)
- Remove redundant style check for clang-format. ([#11668](https://github.com/rapidsai/cudf/pull/11668)) [@bdice](https://github.com/bdice)
- Add support for `group_keys` in `groupby` ([#11659](https://github.com/rapidsai/cudf/pull/11659)) [@galipremsagar](https://github.com/galipremsagar)
- Fix pandoc pinning. ([#11658](https://github.com/rapidsai/cudf/pull/11658)) [@bdice](https://github.com/bdice)
- Revert removal of skip_rows / num_rows options from the Parquet reader. ([#11657](https://github.com/rapidsai/cudf/pull/11657)) [@nvdbaranec](https://github.com/nvdbaranec)
- Update git metadata ([#11647](https://github.com/rapidsai/cudf/pull/11647)) [@bdice](https://github.com/bdice)
- Call set_null_count on a returning column if null-count is known ([#11646](https://github.com/rapidsai/cudf/pull/11646)) [@davidwendt](https://github.com/davidwendt)
- Fix some libcudf detail calls not passing the stream variable ([#11642](https://github.com/rapidsai/cudf/pull/11642)) [@davidwendt](https://github.com/davidwendt)
- Update to mypy 0.971 ([#11640](https://github.com/rapidsai/cudf/pull/11640)) [@wence-](https://github.com/wence-)
- Refactor strings strip functor to details header ([#11635](https://github.com/rapidsai/cudf/pull/11635)) [@davidwendt](https://github.com/davidwendt)
- Fix incorrect `nullCount` in `get_json_object` ([#11633](https://github.com/rapidsai/cudf/pull/11633)) [@trxcllnt](https://github.com/trxcllnt)
- Simplify `hostdevice_vector` ([#11631](https://github.com/rapidsai/cudf/pull/11631)) [@upsj](https://github.com/upsj)
- Refactor parquet writer benchmarks with nvbench ([#11623](https://github.com/rapidsai/cudf/pull/11623)) [@PointKernel](https://github.com/PointKernel)
- Rework contains_scalar to check nulls at runtime ([#11622](https://github.com/rapidsai/cudf/pull/11622)) [@davidwendt](https://github.com/davidwendt)
- Fix incorrect memory resource used in rolling temp columns ([#11618](https://github.com/rapidsai/cudf/pull/11618)) [@mythrocks](https://github.com/mythrocks)
- Upgrade `pandas` to `1.5` ([#11617](https://github.com/rapidsai/cudf/pull/11617)) [@galipremsagar](https://github.com/galipremsagar)
- Move type-dispatcher calls from traits.hpp to traits.cpp ([#11616](https://github.com/rapidsai/cudf/pull/11616)) [@davidwendt](https://github.com/davidwendt)
- Refactor parquet reader benchmarks with nvbench ([#11611](https://github.com/rapidsai/cudf/pull/11611)) [@PointKernel](https://github.com/PointKernel)
- Forward-merge branch-22.08 to branch-22.10 ([#11608](https://github.com/rapidsai/cudf/pull/11608)) [@bdice](https://github.com/bdice)
- Use stream in Java API. ([#11601](https://github.com/rapidsai/cudf/pull/11601)) [@bdice](https://github.com/bdice)
- Refactors of public/detail APIs, CUDF_FUNC_RANGE, stream handling. ([#11600](https://github.com/rapidsai/cudf/pull/11600)) [@bdice](https://github.com/bdice)
- Improve ORC writer benchmark with nvbench ([#11598](https://github.com/rapidsai/cudf/pull/11598)) [@PointKernel](https://github.com/PointKernel)
- Tune multibyte_split kernel ([#11587](https://github.com/rapidsai/cudf/pull/11587)) [@upsj](https://github.com/upsj)
- Move split_utils.cuh to strings/detail ([#11585](https://github.com/rapidsai/cudf/pull/11585)) [@davidwendt](https://github.com/davidwendt)
- Fix warnings due to compiler regression with `if constexpr` ([#11581](https://github.com/rapidsai/cudf/pull/11581)) [@ttnghia](https://github.com/ttnghia)
- Add full 24-bit dictionary support to Parquet writer ([#11580](https://github.com/rapidsai/cudf/pull/11580)) [@etseidl](https://github.com/etseidl)
- Expose "explicit-comms" option in shuffle-based dask_cudf functions ([#11576](https://github.com/rapidsai/cudf/pull/11576)) [@rjzamora](https://github.com/rjzamora)
- Move cudf::strings::findall_record to cudf::strings::findall ([#11575](https://github.com/rapidsai/cudf/pull/11575)) [@davidwendt](https://github.com/davidwendt)
- Refactor dask_cudf groupby to use apply_concat_apply ([#11571](https://github.com/rapidsai/cudf/pull/11571)) [@rjzamora](https://github.com/rjzamora)
- Add ability to write `list(struct)` columns as `map` type in orc writer ([#11568](https://github.com/rapidsai/cudf/pull/11568)) [@galipremsagar](https://github.com/galipremsagar)
- Add byte_range to multibyte_split benchmark + NVBench refactor ([#11562](https://github.com/rapidsai/cudf/pull/11562)) [@upsj](https://github.com/upsj)
- JNI support for writing binary columns in parquet ([#11556](https://github.com/rapidsai/cudf/pull/11556)) [@revans2](https://github.com/revans2)
- Support additional dictionary bit widths in Parquet writer ([#11547](https://github.com/rapidsai/cudf/pull/11547)) [@etseidl](https://github.com/etseidl)
- Refactor string/numeric conversion utilities ([#11545](https://github.com/rapidsai/cudf/pull/11545)) [@davidwendt](https://github.com/davidwendt)
- Removing unnecessary asserts in parquet tests ([#11544](https://github.com/rapidsai/cudf/pull/11544)) [@hyperbolic2346](https://github.com/hyperbolic2346)
- Clean up ORC reader benchmarks with NVBench ([#11543](https://github.com/rapidsai/cudf/pull/11543)) [@PointKernel](https://github.com/PointKernel)
- Reuse MurmurHash3_32 in Parquet page data. ([#11528](https://github.com/rapidsai/cudf/pull/11528)) [@bdice](https://github.com/bdice)
- Add hexadecimal value separators ([#11527](https://github.com/rapidsai/cudf/pull/11527)) [@bdice](https://github.com/bdice)
- Deprecate `skiprows` and `num_rows` in `read_orc` ([#11522](https://github.com/rapidsai/cudf/pull/11522)) [@galipremsagar](https://github.com/galipremsagar)
- Struct support for `NULL_EQUALS` binary operation ([#11520](https://github.com/rapidsai/cudf/pull/11520)) [@rwlee](https://github.com/rwlee)
- Bump hadoop-common from 3.2.3 to 3.2.4 in /java ([#11516](https://github.com/rapidsai/cudf/pull/11516)) [@dependabot[bot]](https://github.com/dependabot[bot])
- Fix Feather test warning. ([#11511](https://github.com/rapidsai/cudf/pull/11511)) [@bdice](https://github.com/bdice)
- copy_range ballot_syncs to have no execution dependency ([#11508](https://github.com/rapidsai/cudf/pull/11508)) [@robertmaynard](https://github.com/robertmaynard)
- Upgrade to `arrow-9.x` ([#11507](https://github.com/rapidsai/cudf/pull/11507)) [@galipremsagar](https://github.com/galipremsagar)
- Remove support for skip_rows / num_rows options in the parquet reader. ([#11503](https://github.com/rapidsai/cudf/pull/11503)) [@nvdbaranec](https://github.com/nvdbaranec)
- Single-pass `multibyte_split` ([#11500](https://github.com/rapidsai/cudf/pull/11500)) [@upsj](https://github.com/upsj)
- Sanitize percentile_approx() output for empty input ([#11498](https://github.com/rapidsai/cudf/pull/11498)) [@SrikarVanavasam](https://github.com/SrikarVanavasam)
- Unpin `dask` and `distributed` for development ([#11492](https://github.com/rapidsai/cudf/pull/11492)) [@galipremsagar](https://github.com/galipremsagar)
- Move SparkMurmurHash3_32 functor. ([#11489](https://github.com/rapidsai/cudf/pull/11489)) [@bdice](https://github.com/bdice)
- Refactor group_nunique.cu to use nullate::DYNAMIC for reduce-by-key functor ([#11482](https://github.com/rapidsai/cudf/pull/11482)) [@davidwendt](https://github.com/davidwendt)
- Drop support for `skiprows` and `num_rows` in `cudf.read_parquet` ([#11480](https://github.com/rapidsai/cudf/pull/11480)) [@galipremsagar](https://github.com/galipremsagar)
- Add reduction `distinct_count` benchmark ([#11473](https://github.com/rapidsai/cudf/pull/11473)) [@ttnghia](https://github.com/ttnghia)
- Add groupby `nunique` aggregation benchmark ([#11472](https://github.com/rapidsai/cudf/pull/11472)) [@ttnghia](https://github.com/ttnghia)
- Disable Arrow S3 support by default. ([#11470](https://github.com/rapidsai/cudf/pull/11470)) [@bdice](https://github.com/bdice)
- Add groupby `max` aggregation benchmark ([#11464](https://github.com/rapidsai/cudf/pull/11464)) [@ttnghia](https://github.com/ttnghia)
- Extract Dremel encoding code from Parquet ([#11461](https://github.com/rapidsai/cudf/pull/11461)) [@vyasr](https://github.com/vyasr)
- Add missing Thrust #includes. ([#11457](https://github.com/rapidsai/cudf/pull/11457)) [@bdice](https://github.com/bdice)
- Make CMake hooks verbose ([#11456](https://github.com/rapidsai/cudf/pull/11456)) [@vyasr](https://github.com/vyasr)
- Control Parquet page size through Python API ([#11454](https://github.com/rapidsai/cudf/pull/11454)) [@etseidl](https://github.com/etseidl)
- Add control of Parquet column index creation to python ([#11453](https://github.com/rapidsai/cudf/pull/11453)) [@etseidl](https://github.com/etseidl)
- Remove unused is_struct trait. ([#11450](https://github.com/rapidsai/cudf/pull/11450)) [@bdice](https://github.com/bdice)
- Refactor the `Buffer` class ([#11447](https://github.com/rapidsai/cudf/pull/11447)) [@madsbk](https://github.com/madsbk)
- Refactor pad_side and strip_type enums into side_type enum ([#11438](https://github.com/rapidsai/cudf/pull/11438)) [@davidwendt](https://github.com/davidwendt)
- Update to Thrust 1.17.0 ([#11437](https://github.com/rapidsai/cudf/pull/11437)) [@bdice](https://github.com/bdice)
- Add in JNI for parsing JSON data and getting the metadata back too. ([#11431](https://github.com/rapidsai/cudf/pull/11431)) [@revans2](https://github.com/revans2)
- Convert byte_array_view to use std::byte ([#11424](https://github.com/rapidsai/cudf/pull/11424)) [@hyperbolic2346](https://github.com/hyperbolic2346)
- Deprecate unflatten_nested_columns ([#11421](https://github.com/rapidsai/cudf/pull/11421)) [@SrikarVanavasam](https://github.com/SrikarVanavasam)
- Remove HASH_SERIAL_MURMUR3 / serial32BitMurmurHash3 ([#11383](https://github.com/rapidsai/cudf/pull/11383)) [@bdice](https://github.com/bdice)
- Add Spark list hashing Java tests ([#11379](https://github.com/rapidsai/cudf/pull/11379)) [@bdice](https://github.com/bdice)
- Move cmake to the build section. ([#11376](https://github.com/rapidsai/cudf/pull/11376)) [@vyasr](https://github.com/vyasr)
- Remove use of CUDA driver API calls from libcudf ([#11370](https://github.com/rapidsai/cudf/pull/11370)) [@shwina](https://github.com/shwina)
- Add column constructor from device_uvector&& ([#11356](https://github.com/rapidsai/cudf/pull/11356)) [@SrikarVanavasam](https://github.com/SrikarVanavasam)
- Remove unused custreamz thirdparty directory ([#11343](https://github.com/rapidsai/cudf/pull/11343)) [@vyasr](https://github.com/vyasr)
- Update jni version to 22.10.0-SNAPSHOT ([#11338](https://github.com/rapidsai/cudf/pull/11338)) [@pxLi](https://github.com/pxLi)
- Enable using upstream jitify2 ([#11287](https://github.com/rapidsai/cudf/pull/11287)) [@shwina](https://github.com/shwina)
- Cache cudf.Scalar ([#11246](https://github.com/rapidsai/cudf/pull/11246)) [@shwina](https://github.com/shwina)
- Remove deprecated Series.applymap. ([#11031](https://github.com/rapidsai/cudf/pull/11031)) [@bdice](https://github.com/bdice)
- Remove deprecated expand parameter from str.findall. ([#11030](https://github.com/rapidsai/cudf/pull/11030)) [@bdice](https://github.com/bdice)
# cuDF 22.08.00 (17 Aug 2022)
## 🚨 Breaking Changes
- Remove legacy join APIs ([#11274](https://github.com/rapidsai/cudf/pull/11274)) [@vyasr](https://github.com/vyasr)
- Remove `lists::drop_list_duplicates` ([#11236](https://github.com/rapidsai/cudf/pull/11236)) [@ttnghia](https://github.com/ttnghia)
- Remove Index.replace API ([#11131](https://github.com/rapidsai/cudf/pull/11131)) [@vyasr](https://github.com/vyasr)
- Remove deprecated Index methods from Frame ([#11073](https://github.com/rapidsai/cudf/pull/11073)) [@vyasr](https://github.com/vyasr)
- Remove public API of cudf.merge_sorted. ([#11032](https://github.com/rapidsai/cudf/pull/11032)) [@bdice](https://github.com/bdice)
- Drop python `3.7` in code-base ([#11029](https://github.com/rapidsai/cudf/pull/11029)) [@galipremsagar](https://github.com/galipremsagar)
- Return empty dataframe when reading a Parquet file using empty `columns` option ([#11018](https://github.com/rapidsai/cudf/pull/11018)) [@vuule](https://github.com/vuule)
- Remove Arrow CUDA IPC code ([#10995](https://github.com/rapidsai/cudf/pull/10995)) [@shwina](https://github.com/shwina)
- Buffer: make `.ptr` read-only ([#10872](https://github.com/rapidsai/cudf/pull/10872)) [@madsbk](https://github.com/madsbk)
## 🐛 Bug Fixes
- Fix `distributed` error related to `loop_in_thread` ([#11428](https://github.com/rapidsai/cudf/pull/11428)) [@galipremsagar](https://github.com/galipremsagar)
- Relax arrow pinning to just 8.x and remove cuda build dependency from cudf recipe ([#11412](https://github.com/rapidsai/cudf/pull/11412)) [@kkraus14](https://github.com/kkraus14)
- Revert "Allow CuPy 11" ([#11409](https://github.com/rapidsai/cudf/pull/11409)) [@jakirkham](https://github.com/jakirkham)
- Fix `moto` timeouts ([#11369](https://github.com/rapidsai/cudf/pull/11369)) [@galipremsagar](https://github.com/galipremsagar)
- Set `+/-infinity` as the `identity` values for floating-point numbers in device operators `min` and `max` ([#11357](https://github.com/rapidsai/cudf/pull/11357)) [@ttnghia](https://github.com/ttnghia)
- Fix memory_usage() for `ListSeries` ([#11355](https://github.com/rapidsai/cudf/pull/11355)) [@thomcom](https://github.com/thomcom)
- Fix constructing Column from column_view with expired mask ([#11354](https://github.com/rapidsai/cudf/pull/11354)) [@shwina](https://github.com/shwina)
- Handle parquet corner case: Columns with more rows than are in the row group. ([#11353](https://github.com/rapidsai/cudf/pull/11353)) [@nvdbaranec](https://github.com/nvdbaranec)
- Fix `DatetimeIndex` & `TimedeltaIndex` constructors ([#11342](https://github.com/rapidsai/cudf/pull/11342)) [@galipremsagar](https://github.com/galipremsagar)
- Fix unsigned-compare compile warning in IntPow binops ([#11339](https://github.com/rapidsai/cudf/pull/11339)) [@davidwendt](https://github.com/davidwendt)
- Fix performance issue and add a new code path to `cudf::detail::contains` ([#11330](https://github.com/rapidsai/cudf/pull/11330)) [@ttnghia](https://github.com/ttnghia)
- Pin `pytorch` to temporarily unblock from `libcupti` errors ([#11289](https://github.com/rapidsai/cudf/pull/11289)) [@galipremsagar](https://github.com/galipremsagar)
- Workaround for nvcomp zstd overwriting blocks for orc due to underestimate of sizes ([#11288](https://github.com/rapidsai/cudf/pull/11288)) [@jbrennan333](https://github.com/jbrennan333)
- Fix inconsistency when hashing two tables in `cudf::detail::contains` ([#11284](https://github.com/rapidsai/cudf/pull/11284)) [@ttnghia](https://github.com/ttnghia)
- Fix issue related to numpy array and `category` dtype ([#11282](https://github.com/rapidsai/cudf/pull/11282)) [@galipremsagar](https://github.com/galipremsagar)
- Add NotImplementedError when on is specified in DataFrame.join. ([#11275](https://github.com/rapidsai/cudf/pull/11275)) [@vyasr](https://github.com/vyasr)
- Fix invalid allocate_like() and empty_like() tests. ([#11268](https://github.com/rapidsai/cudf/pull/11268)) [@nvdbaranec](https://github.com/nvdbaranec)
- Returns DataFrame When Concatenating Along Axis 1 ([#11263](https://github.com/rapidsai/cudf/pull/11263)) [@isVoid](https://github.com/isVoid)
- Fix compile error due to missing header ([#11257](https://github.com/rapidsai/cudf/pull/11257)) [@ttnghia](https://github.com/ttnghia)
- Fix a memory aliasing/crash issue in scatter for lists. ([#11254](https://github.com/rapidsai/cudf/pull/11254)) [@nvdbaranec](https://github.com/nvdbaranec)
- Fix `tests/rolling/empty_input_test` ([#11238](https://github.com/rapidsai/cudf/pull/11238)) [@ttnghia](https://github.com/ttnghia)
- Fix const qualifier when using `host_span<bitmask_type const*>` ([#11220](https://github.com/rapidsai/cudf/pull/11220)) [@ttnghia](https://github.com/ttnghia)
- Avoid using `nvcompBatchedDeflateDecompressGetTempSizeEx` in cuIO ([#11213](https://github.com/rapidsai/cudf/pull/11213)) [@vuule](https://github.com/vuule)
- Generate benchmark data with correct run length regardless of cardinality ([#11205](https://github.com/rapidsai/cudf/pull/11205)) [@vuule](https://github.com/vuule)
- Fix cumulative count index behavior ([#11188](https://github.com/rapidsai/cudf/pull/11188)) [@brandon-b-miller](https://github.com/brandon-b-miller)
- Fix assertion in dask_cudf test_struct_explode ([#11170](https://github.com/rapidsai/cudf/pull/11170)) [@rjzamora](https://github.com/rjzamora)
- Provides a method for the user to remove the hook and re-register the hook in a custom shutdown hook manager ([#11161](https://github.com/rapidsai/cudf/pull/11161)) [@res-life](https://github.com/res-life)
- Fix compatibility issues with pandas 1.4.3 ([#11152](https://github.com/rapidsai/cudf/pull/11152)) [@vyasr](https://github.com/vyasr)
- Ensure cuco export set is installed in cmake build ([#11147](https://github.com/rapidsai/cudf/pull/11147)) [@jlowe](https://github.com/jlowe)
- Avoid redundant deepcopy in `cudf.from_pandas` ([#11142](https://github.com/rapidsai/cudf/pull/11142)) [@galipremsagar](https://github.com/galipremsagar)
- Fix compile error due to missing header ([#11126](https://github.com/rapidsai/cudf/pull/11126)) [@ttnghia](https://github.com/ttnghia)
- Fix `__cuda_array_interface__` failures ([#11113](https://github.com/rapidsai/cudf/pull/11113)) [@galipremsagar](https://github.com/galipremsagar)
- Support octal and hex within regex character class pattern ([#11112](https://github.com/rapidsai/cudf/pull/11112)) [@davidwendt](https://github.com/davidwendt)
- Fix split_re matching logic for word boundaries ([#11106](https://github.com/rapidsai/cudf/pull/11106)) [@davidwendt](https://github.com/davidwendt)
- Handle multiple files metadata in `read_parquet` ([#11105](https://github.com/rapidsai/cudf/pull/11105)) [@galipremsagar](https://github.com/galipremsagar)
- Fix index alignment for Series objects with repeated index ([#11103](https://github.com/rapidsai/cudf/pull/11103)) [@shwina](https://github.com/shwina)
- FindcuFile now searches in the current CUDA Toolkit location ([#11101](https://github.com/rapidsai/cudf/pull/11101)) [@robertmaynard](https://github.com/robertmaynard)
- Fix regex word boundary logic to include underline ([#11099](https://github.com/rapidsai/cudf/pull/11099)) [@davidwendt](https://github.com/davidwendt)
- Exclude CudaFatalTest when selecting all Java tests ([#11083](https://github.com/rapidsai/cudf/pull/11083)) [@jlowe](https://github.com/jlowe)
- Fix duplicate `cudatoolkit` pinning issue ([#11070](https://github.com/rapidsai/cudf/pull/11070)) [@galipremsagar](https://github.com/galipremsagar)
- Maintain the input index in the result of a groupby-transform ([#11068](https://github.com/rapidsai/cudf/pull/11068)) [@shwina](https://github.com/shwina)
- Fix bug with row count comparison for expect_columns_equivalent(). ([#11059](https://github.com/rapidsai/cudf/pull/11059)) [@nvdbaranec](https://github.com/nvdbaranec)
- Fix BPE uninitialized size value for null and empty input strings ([#11054](https://github.com/rapidsai/cudf/pull/11054)) [@davidwendt](https://github.com/davidwendt)
- Include missing header for usage of `get_current_device_resource()` ([#11047](https://github.com/rapidsai/cudf/pull/11047)) [@AtlantaPepsi](https://github.com/AtlantaPepsi)
- Fix warn_unused_result error in parquet test ([#11026](https://github.com/rapidsai/cudf/pull/11026)) [@karthikeyann](https://github.com/karthikeyann)
- Return empty dataframe when reading a Parquet file using empty `columns` option ([#11018](https://github.com/rapidsai/cudf/pull/11018)) [@vuule](https://github.com/vuule)
- Fix small error in page row count limiting ([#10991](https://github.com/rapidsai/cudf/pull/10991)) [@etseidl](https://github.com/etseidl)
- Fix a row index entry error in ORC writer ([#10989](https://github.com/rapidsai/cudf/pull/10989)) [@vuule](https://github.com/vuule)
- Fix grouped covariance to require both values to be convertible to double. ([#10891](https://github.com/rapidsai/cudf/pull/10891)) [@bdice](https://github.com/bdice)
## 📖 Documentation
- Fix issues with day & night modes in python docs ([#11400](https://github.com/rapidsai/cudf/pull/11400)) [@galipremsagar](https://github.com/galipremsagar)
- Update missing data handling APIs in docs ([#11345](https://github.com/rapidsai/cudf/pull/11345)) [@galipremsagar](https://github.com/galipremsagar)
- Add lists filtering APIs to doxygen group. ([#11336](https://github.com/rapidsai/cudf/pull/11336)) [@bdice](https://github.com/bdice)
- Remove unused import in README sample ([#11318](https://github.com/rapidsai/cudf/pull/11318)) [@vyasr](https://github.com/vyasr)
- Note null behavior in `where` docs ([#11276](https://github.com/rapidsai/cudf/pull/11276)) [@brandon-b-miller](https://github.com/brandon-b-miller)
- Update docstring for spans in `get_row_data_range` ([#11271](https://github.com/rapidsai/cudf/pull/11271)) [@vyasr](https://github.com/vyasr)
- Update nvCOMP integration table ([#11231](https://github.com/rapidsai/cudf/pull/11231)) [@vuule](https://github.com/vuule)
- Add dev docs for documentation writing ([#11217](https://github.com/rapidsai/cudf/pull/11217)) [@vyasr](https://github.com/vyasr)
- Documentation fix for concatenate ([#11187](https://github.com/rapidsai/cudf/pull/11187)) [@dagardner-nv](https://github.com/dagardner-nv)
- Fix unresolved links in markdown ([#11173](https://github.com/rapidsai/cudf/pull/11173)) [@karthikeyann](https://github.com/karthikeyann)
- Fix cudf version in README.md install commands ([#11164](https://github.com/rapidsai/cudf/pull/11164)) [@jvanstraten](https://github.com/jvanstraten)
- Switch `language` from `None` to `"en"` in docs build ([#11133](https://github.com/rapidsai/cudf/pull/11133)) [@galipremsagar](https://github.com/galipremsagar)
- Remove docs mentioning scalar_view since no such class exists. ([#11132](https://github.com/rapidsai/cudf/pull/11132)) [@bdice](https://github.com/bdice)
- Add docstring entry for `DataFrame.value_counts` ([#11039](https://github.com/rapidsai/cudf/pull/11039)) [@galipremsagar](https://github.com/galipremsagar)
- Add docs to rolling var, std, count. ([#11035](https://github.com/rapidsai/cudf/pull/11035)) [@bdice](https://github.com/bdice)
- Fix docs for Numba UDFs. ([#11020](https://github.com/rapidsai/cudf/pull/11020)) [@bdice](https://github.com/bdice)
- Replace column comparison utilities functions with macros ([#11007](https://github.com/rapidsai/cudf/pull/11007)) [@karthikeyann](https://github.com/karthikeyann)
- Fix Doxygen warnings in multiple headers files ([#11003](https://github.com/rapidsai/cudf/pull/11003)) [@karthikeyann](https://github.com/karthikeyann)
- Fix doxygen warnings in utilities/ headers ([#10974](https://github.com/rapidsai/cudf/pull/10974)) [@karthikeyann](https://github.com/karthikeyann)
- Fix Doxygen warnings in table header files ([#10964](https://github.com/rapidsai/cudf/pull/10964)) [@karthikeyann](https://github.com/karthikeyann)
- Fix Doxygen warnings in column header files ([#10963](https://github.com/rapidsai/cudf/pull/10963)) [@karthikeyann](https://github.com/karthikeyann)
- Fix Doxygen warnings in strings / header files ([#10937](https://github.com/rapidsai/cudf/pull/10937)) [@karthikeyann](https://github.com/karthikeyann)
- Generate Doxygen Tag File for Libcudf ([#10932](https://github.com/rapidsai/cudf/pull/10932)) [@isVoid](https://github.com/isVoid)
- Fix doxygen warnings in structs, lists headers ([#10923](https://github.com/rapidsai/cudf/pull/10923)) [@karthikeyann](https://github.com/karthikeyann)
- Fix doxygen warnings in fixed_point.hpp ([#10922](https://github.com/rapidsai/cudf/pull/10922)) [@karthikeyann](https://github.com/karthikeyann)
- Fix doxygen warnings in ast/, rolling, tdigest/, wrappers/, dictionary/ headers ([#10921](https://github.com/rapidsai/cudf/pull/10921)) [@karthikeyann](https://github.com/karthikeyann)
- Fix doxygen warnings in cudf/io/types.hpp, other header files ([#10913](https://github.com/rapidsai/cudf/pull/10913)) [@karthikeyann](https://github.com/karthikeyann)
- Fix doxygen warnings in cudf/io/ avro, csv, json, orc, parquet header files ([#10912](https://github.com/rapidsai/cudf/pull/10912)) [@karthikeyann](https://github.com/karthikeyann)
- Fix doxygen warnings in cudf/*.hpp ([#10896](https://github.com/rapidsai/cudf/pull/10896)) [@karthikeyann](https://github.com/karthikeyann)
- Add missing documentation in aggregation.hpp ([#10887](https://github.com/rapidsai/cudf/pull/10887)) [@karthikeyann](https://github.com/karthikeyann)
- Revise PR template. ([#10774](https://github.com/rapidsai/cudf/pull/10774)) [@bdice](https://github.com/bdice)
## 🚀 New Features
- Change cmake to allow controlling Arrow version via cmake variable ([#11429](https://github.com/rapidsai/cudf/pull/11429)) [@kkraus14](https://github.com/kkraus14)
- Adding support for `list<int8>` columns to be written as byte arrays in parquet ([#11328](https://github.com/rapidsai/cudf/pull/11328)) [@hyperbolic2346](https://github.com/hyperbolic2346)
- Adding byte array view structure ([#11322](https://github.com/rapidsai/cudf/pull/11322)) [@hyperbolic2346](https://github.com/hyperbolic2346)
- Adding byte_array statistics ([#11303](https://github.com/rapidsai/cudf/pull/11303)) [@hyperbolic2346](https://github.com/hyperbolic2346)
- Add column indexes to Parquet writer ([#11302](https://github.com/rapidsai/cudf/pull/11302)) [@etseidl](https://github.com/etseidl)
- Provide an Option for Default Integer and Floating Bitwidth ([#11272](https://github.com/rapidsai/cudf/pull/11272)) [@isVoid](https://github.com/isVoid)
- FST benchmark ([#11243](https://github.com/rapidsai/cudf/pull/11243)) [@karthikeyann](https://github.com/karthikeyann)
- Adds the Finite-State Transducer algorithm ([#11242](https://github.com/rapidsai/cudf/pull/11242)) [@elstehle](https://github.com/elstehle)
- Refactor `collect_set` to use `cudf::distinct` and `cudf::lists::distinct` ([#11228](https://github.com/rapidsai/cudf/pull/11228)) [@ttnghia](https://github.com/ttnghia)
- Treat zstd as stable in nvcomp releases 2.3.2 and later ([#11226](https://github.com/rapidsai/cudf/pull/11226)) [@jbrennan333](https://github.com/jbrennan333)
- Add 24 bit dictionary support to Parquet writer ([#11216](https://github.com/rapidsai/cudf/pull/11216)) [@devavret](https://github.com/devavret)
- Enable positive group indices for extractAllRecord on JNI ([#11215](https://github.com/rapidsai/cudf/pull/11215)) [@anthony-chang](https://github.com/anthony-chang)
- JNI bindings for NTH_ELEMENT window aggregation ([#11201](https://github.com/rapidsai/cudf/pull/11201)) [@mythrocks](https://github.com/mythrocks)
- Add JNI bindings for extractAllRecord ([#11196](https://github.com/rapidsai/cudf/pull/11196)) [@anthony-chang](https://github.com/anthony-chang)
- Add `cudf.options` ([#11193](https://github.com/rapidsai/cudf/pull/11193)) [@isVoid](https://github.com/isVoid)
- Add thrift support for parquet column and offset indexes ([#11178](https://github.com/rapidsai/cudf/pull/11178)) [@etseidl](https://github.com/etseidl)
- Adding binary read/write as options for parquet ([#11160](https://github.com/rapidsai/cudf/pull/11160)) [@hyperbolic2346](https://github.com/hyperbolic2346)
- Support `nth_element` for window functions ([#11158](https://github.com/rapidsai/cudf/pull/11158)) [@mythrocks](https://github.com/mythrocks)
- Implement `lists::distinct` and `cudf::detail::stable_distinct` ([#11149](https://github.com/rapidsai/cudf/pull/11149)) [@ttnghia](https://github.com/ttnghia)
- Implement Groupby pct_change ([#11144](https://github.com/rapidsai/cudf/pull/11144)) [@skirui-source](https://github.com/skirui-source)
- Add JNI for set operations ([#11143](https://github.com/rapidsai/cudf/pull/11143)) [@ttnghia](https://github.com/ttnghia)
- Remove deprecated PER_THREAD_DEFAULT_STREAM ([#11134](https://github.com/rapidsai/cudf/pull/11134)) [@jbrennan333](https://github.com/jbrennan333)
- Added a Java method to check the existence of a list of keys in a map ([#11128](https://github.com/rapidsai/cudf/pull/11128)) [@razajafri](https://github.com/razajafri)
- Feature/python benchmarking ([#11125](https://github.com/rapidsai/cudf/pull/11125)) [@vyasr](https://github.com/vyasr)
- Support `nan_equality` in `cudf::distinct` ([#11118](https://github.com/rapidsai/cudf/pull/11118)) [@ttnghia](https://github.com/ttnghia)
- Added JNI for getMapValueForKeys ([#11104](https://github.com/rapidsai/cudf/pull/11104)) [@razajafri](https://github.com/razajafri)
- Refactor `semi_anti_join` ([#11100](https://github.com/rapidsai/cudf/pull/11100)) [@ttnghia](https://github.com/ttnghia)
- Replace remaining instances of rmm::cuda_stream_default with cudf::default_stream_value ([#11082](https://github.com/rapidsai/cudf/pull/11082)) [@jbrennan333](https://github.com/jbrennan333)
- Adds the Logical Stack algorithm ([#11078](https://github.com/rapidsai/cudf/pull/11078)) [@elstehle](https://github.com/elstehle)
- Add doxygen-check pre-commit hook ([#11076](https://github.com/rapidsai/cudf/pull/11076)) [@karthikeyann](https://github.com/karthikeyann)
- Use new nvCOMP API to optimize the decompression temp memory size ([#11064](https://github.com/rapidsai/cudf/pull/11064)) [@vuule](https://github.com/vuule)
- Add Doxygen CI check ([#11057](https://github.com/rapidsai/cudf/pull/11057)) [@karthikeyann](https://github.com/karthikeyann)
- Support `duplicate_keep_option` in `cudf::distinct` ([#11052](https://github.com/rapidsai/cudf/pull/11052)) [@ttnghia](https://github.com/ttnghia)
- Support set operations ([#11043](https://github.com/rapidsai/cudf/pull/11043)) [@ttnghia](https://github.com/ttnghia)
- Support for ZLIB compression in ORC writer ([#11036](https://github.com/rapidsai/cudf/pull/11036)) [@vuule](https://github.com/vuule)
- Adding feature swaplevels ([#11027](https://github.com/rapidsai/cudf/pull/11027)) [@VamsiTallam95](https://github.com/VamsiTallam95)
- Use nvCOMP for ZLIB decompression in ORC reader ([#11024](https://github.com/rapidsai/cudf/pull/11024)) [@vuule](https://github.com/vuule)
- Function for bfill, ffill #9591 ([#11022](https://github.com/rapidsai/cudf/pull/11022)) [@Sreekiran096](https://github.com/Sreekiran096)
- Generate group offsets from element labels ([#11017](https://github.com/rapidsai/cudf/pull/11017)) [@ttnghia](https://github.com/ttnghia)
- Feature axes ([#10979](https://github.com/rapidsai/cudf/pull/10979)) [@VamsiTallam95](https://github.com/VamsiTallam95)
- Generate group labels from offsets ([#10945](https://github.com/rapidsai/cudf/pull/10945)) [@ttnghia](https://github.com/ttnghia)
- Add missing cuIO benchmark coverage for duration types ([#10933](https://github.com/rapidsai/cudf/pull/10933)) [@vuule](https://github.com/vuule)
- Dask-cuDF cumulative groupby ops ([#10889](https://github.com/rapidsai/cudf/pull/10889)) [@brandon-b-miller](https://github.com/brandon-b-miller)
- Reindex Improvements ([#10815](https://github.com/rapidsai/cudf/pull/10815)) [@brandon-b-miller](https://github.com/brandon-b-miller)
- Implement value_counts for DataFrame ([#10813](https://github.com/rapidsai/cudf/pull/10813)) [@martinfalisse](https://github.com/martinfalisse)
## 🛠️ Improvements
- Pin `dask` & `distributed` for release ([#11433](https://github.com/rapidsai/cudf/pull/11433)) [@galipremsagar](https://github.com/galipremsagar)
- Use documented header template for `doxygen` ([#11430](https://github.com/rapidsai/cudf/pull/11430)) [@galipremsagar](https://github.com/galipremsagar)
- Relax arrow version in dev env ([#11418](https://github.com/rapidsai/cudf/pull/11418)) [@galipremsagar](https://github.com/galipremsagar)
- Allow CuPy 11 ([#11393](https://github.com/rapidsai/cudf/pull/11393)) [@jakirkham](https://github.com/jakirkham)
- Improve multibyte_split performance ([#11347](https://github.com/rapidsai/cudf/pull/11347)) [@cwharris](https://github.com/cwharris)
- Switch death test to use explicit trap. ([#11326](https://github.com/rapidsai/cudf/pull/11326)) [@vyasr](https://github.com/vyasr)
- Add --output-on-failure to ctest args. ([#11321](https://github.com/rapidsai/cudf/pull/11321)) [@vyasr](https://github.com/vyasr)
- Consolidate remaining DataFrame/Series APIs ([#11315](https://github.com/rapidsai/cudf/pull/11315)) [@vyasr](https://github.com/vyasr)
- Add JNI support for the join_strings API ([#11309](https://github.com/rapidsai/cudf/pull/11309)) [@revans2](https://github.com/revans2)
- Add cupy version to setup.py install_requires ([#11306](https://github.com/rapidsai/cudf/pull/11306)) [@vyasr](https://github.com/vyasr)
- Remove some unused code ([#11305](https://github.com/rapidsai/cudf/pull/11305)) [@hyperbolic2346](https://github.com/hyperbolic2346)
- Add test of wildcard selection ([#11300](https://github.com/rapidsai/cudf/pull/11300)) [@vyasr](https://github.com/vyasr)
- Update parquet reader to take stream parameter ([#11294](https://github.com/rapidsai/cudf/pull/11294)) [@PointKernel](https://github.com/PointKernel)
- Spark list hashing ([#11292](https://github.com/rapidsai/cudf/pull/11292)) [@bdice](https://github.com/bdice)
- Remove legacy join APIs ([#11274](https://github.com/rapidsai/cudf/pull/11274)) [@vyasr](https://github.com/vyasr)
- Fix `cudf` recipes syntax ([#11273](https://github.com/rapidsai/cudf/pull/11273)) [@ajschmidt8](https://github.com/ajschmidt8)
- Fix `cudf` recipe ([#11267](https://github.com/rapidsai/cudf/pull/11267)) [@ajschmidt8](https://github.com/ajschmidt8)
- Cleanup config files ([#11266](https://github.com/rapidsai/cudf/pull/11266)) [@vyasr](https://github.com/vyasr)
- Run mypy on all packages ([#11265](https://github.com/rapidsai/cudf/pull/11265)) [@vyasr](https://github.com/vyasr)
- Update to isort 5.10.1. ([#11262](https://github.com/rapidsai/cudf/pull/11262)) [@vyasr](https://github.com/vyasr)
- Consolidate flake8 and pydocstyle configuration ([#11260](https://github.com/rapidsai/cudf/pull/11260)) [@vyasr](https://github.com/vyasr)
- Remove redundant black config specifications. ([#11258](https://github.com/rapidsai/cudf/pull/11258)) [@vyasr](https://github.com/vyasr)
- Ensure DeprecationWarnings are not introduced via pre-commit ([#11255](https://github.com/rapidsai/cudf/pull/11255)) [@wence-](https://github.com/wence-)
- Optimization to gpu::PreprocessColumnData in parquet reader. ([#11252](https://github.com/rapidsai/cudf/pull/11252)) [@nvdbaranec](https://github.com/nvdbaranec)
- Move rolling impl details to detail/ directory. ([#11250](https://github.com/rapidsai/cudf/pull/11250)) [@mythrocks](https://github.com/mythrocks)
- Remove `lists::drop_list_duplicates` ([#11236](https://github.com/rapidsai/cudf/pull/11236)) [@ttnghia](https://github.com/ttnghia)
- Use `cudf::lists::distinct` in Python binding ([#11234](https://github.com/rapidsai/cudf/pull/11234)) [@ttnghia](https://github.com/ttnghia)
- Use `cudf::lists::distinct` in Java binding ([#11233](https://github.com/rapidsai/cudf/pull/11233)) [@ttnghia](https://github.com/ttnghia)
- Use `cudf::distinct` in Java binding ([#11232](https://github.com/rapidsai/cudf/pull/11232)) [@ttnghia](https://github.com/ttnghia)
- Pin `dask-cuda` in dev environment ([#11229](https://github.com/rapidsai/cudf/pull/11229)) [@galipremsagar](https://github.com/galipremsagar)
- Remove cruft in map_lookup ([#11221](https://github.com/rapidsai/cudf/pull/11221)) [@mythrocks](https://github.com/mythrocks)
- Deprecate `skiprows` & `num_rows` in parquet reader ([#11218](https://github.com/rapidsai/cudf/pull/11218)) [@galipremsagar](https://github.com/galipremsagar)
- Remove Frame._index ([#11210](https://github.com/rapidsai/cudf/pull/11210)) [@vyasr](https://github.com/vyasr)
- Improve performance for `cudf::contains` when searching for a scalar ([#11202](https://github.com/rapidsai/cudf/pull/11202)) [@ttnghia](https://github.com/ttnghia)
- Document why Development component is needed for CMake. ([#11200](https://github.com/rapidsai/cudf/pull/11200)) [@vyasr](https://github.com/vyasr)
- Clean up unused code in rolling_test.hpp ([#11195](https://github.com/rapidsai/cudf/pull/11195)) [@karthikeyann](https://github.com/karthikeyann)
- Standardize join internals around DataFrame ([#11184](https://github.com/rapidsai/cudf/pull/11184)) [@vyasr](https://github.com/vyasr)
- Move character case table declarations from src to detail ([#11183](https://github.com/rapidsai/cudf/pull/11183)) [@davidwendt](https://github.com/davidwendt)
- Remove usage of Frame in StringMethods ([#11181](https://github.com/rapidsai/cudf/pull/11181)) [@vyasr](https://github.com/vyasr)
- Expose get_json_object_options to Python ([#11180](https://github.com/rapidsai/cudf/pull/11180)) [@SrikarVanavasam](https://github.com/SrikarVanavasam)
- Fix decimal128 stats in parquet writer ([#11179](https://github.com/rapidsai/cudf/pull/11179)) [@etseidl](https://github.com/etseidl)
- Modify CheckPageRows in parquet_test to use datasources ([#11177](https://github.com/rapidsai/cudf/pull/11177)) [@etseidl](https://github.com/etseidl)
- Pin max version of `cuda-python` to `11.7.0` ([#11174](https://github.com/rapidsai/cudf/pull/11174)) [@Ethyling](https://github.com/Ethyling)
- Refactor and optimize Frame.where ([#11168](https://github.com/rapidsai/cudf/pull/11168)) [@vyasr](https://github.com/vyasr)
- Add npos const static member to cudf::string_view ([#11166](https://github.com/rapidsai/cudf/pull/11166)) [@davidwendt](https://github.com/davidwendt)
- Move _drop_rows_by_label from Frame to IndexedFrame ([#11157](https://github.com/rapidsai/cudf/pull/11157)) [@vyasr](https://github.com/vyasr)
- Clean up _copy_type_metadata ([#11156](https://github.com/rapidsai/cudf/pull/11156)) [@vyasr](https://github.com/vyasr)
- Add `nvcc` conda package in dev environment ([#11154](https://github.com/rapidsai/cudf/pull/11154)) [@galipremsagar](https://github.com/galipremsagar)
- Struct binary comparison op functionality for spark rapids ([#11153](https://github.com/rapidsai/cudf/pull/11153)) [@rwlee](https://github.com/rwlee)
- Refactor inline conditionals. ([#11151](https://github.com/rapidsai/cudf/pull/11151)) [@bdice](https://github.com/bdice)
- Refactor Spark hashing tests ([#11145](https://github.com/rapidsai/cudf/pull/11145)) [@bdice](https://github.com/bdice)
- Add new `_from_data_like_self` factory ([#11140](https://github.com/rapidsai/cudf/pull/11140)) [@vyasr](https://github.com/vyasr)
- Update get_cucollections to use rapids-cmake ([#11139](https://github.com/rapidsai/cudf/pull/11139)) [@vyasr](https://github.com/vyasr)
- Remove unnecessary extra function for libcudacxx detection ([#11138](https://github.com/rapidsai/cudf/pull/11138)) [@vyasr](https://github.com/vyasr)
- Allow initial value for cudf::reduce and cudf::segmented_reduce. ([#11137](https://github.com/rapidsai/cudf/pull/11137)) [@SrikarVanavasam](https://github.com/SrikarVanavasam)
- Remove Index.replace API ([#11131](https://github.com/rapidsai/cudf/pull/11131)) [@vyasr](https://github.com/vyasr)
- Move char-type table function declarations from src to detail ([#11127](https://github.com/rapidsai/cudf/pull/11127)) [@davidwendt](https://github.com/davidwendt)
- Clean up repo root ([#11124](https://github.com/rapidsai/cudf/pull/11124)) [@bdice](https://github.com/bdice)
- Improve print formatting of strings containing newline characters. ([#11108](https://github.com/rapidsai/cudf/pull/11108)) [@nvdbaranec](https://github.com/nvdbaranec)
- Fix cudf::string_view::find() to return pos for empty string argument ([#11107](https://github.com/rapidsai/cudf/pull/11107)) [@davidwendt](https://github.com/davidwendt)
- Forward-merge branch-22.06 to branch-22.08 ([#11086](https://github.com/rapidsai/cudf/pull/11086)) [@bdice](https://github.com/bdice)
- Take iterators by value in clamp.cu. ([#11084](https://github.com/rapidsai/cudf/pull/11084)) [@bdice](https://github.com/bdice)
- Performance improvements for row to column conversions ([#11075](https://github.com/rapidsai/cudf/pull/11075)) [@hyperbolic2346](https://github.com/hyperbolic2346)
- Remove deprecated Index methods from Frame ([#11073](https://github.com/rapidsai/cudf/pull/11073)) [@vyasr](https://github.com/vyasr)
- Use per-page max compressed size estimate for compression ([#11066](https://github.com/rapidsai/cudf/pull/11066)) [@devavret](https://github.com/devavret)
- Column to row refactor for performance ([#11063](https://github.com/rapidsai/cudf/pull/11063)) [@hyperbolic2346](https://github.com/hyperbolic2346)
- Include `skbuild` directory into `build.sh` `clean` operation ([#11060](https://github.com/rapidsai/cudf/pull/11060)) [@galipremsagar](https://github.com/galipremsagar)
- Unpin `dask` & `distributed` for development ([#11058](https://github.com/rapidsai/cudf/pull/11058)) [@galipremsagar](https://github.com/galipremsagar)
- Add support for `Series.between` ([#11051](https://github.com/rapidsai/cudf/pull/11051)) [@galipremsagar](https://github.com/galipremsagar)
- Fix groupby include ([#11046](https://github.com/rapidsai/cudf/pull/11046)) [@bwyogatama](https://github.com/bwyogatama)
- Regex cleanup internal reclass and reclass_device classes ([#11045](https://github.com/rapidsai/cudf/pull/11045)) [@davidwendt](https://github.com/davidwendt)
- Remove public API of cudf.merge_sorted. ([#11032](https://github.com/rapidsai/cudf/pull/11032)) [@bdice](https://github.com/bdice)
- Drop python `3.7` in code-base ([#11029](https://github.com/rapidsai/cudf/pull/11029)) [@galipremsagar](https://github.com/galipremsagar)
- Addition & integration of the integer power operator ([#11025](https://github.com/rapidsai/cudf/pull/11025)) [@AtlantaPepsi](https://github.com/AtlantaPepsi)
- Refactor `lists::contains` ([#11019](https://github.com/rapidsai/cudf/pull/11019)) [@ttnghia](https://github.com/ttnghia)
- Change build.sh to find C++ library by default and avoid shadowing CMAKE_ARGS ([#11013](https://github.com/rapidsai/cudf/pull/11013)) [@vyasr](https://github.com/vyasr)
- Clean up parquet unit test ([#11005](https://github.com/rapidsai/cudf/pull/11005)) [@PointKernel](https://github.com/PointKernel)
- Add missing #pragma once to header files ([#11004](https://github.com/rapidsai/cudf/pull/11004)) [@karthikeyann](https://github.com/karthikeyann)
- Cleanup `iterator.cuh` and add fixed point support for `scalar_optional_accessor` ([#10999](https://github.com/rapidsai/cudf/pull/10999)) [@ttnghia](https://github.com/ttnghia)
- Refactor `cudf::contains` ([#10997](https://github.com/rapidsai/cudf/pull/10997)) [@ttnghia](https://github.com/ttnghia)
- Remove Arrow CUDA IPC code ([#10995](https://github.com/rapidsai/cudf/pull/10995)) [@shwina](https://github.com/shwina)
- Change file extension for groupby benchmark ([#10985](https://github.com/rapidsai/cudf/pull/10985)) [@ttnghia](https://github.com/ttnghia)
- Sort recipe include checks. ([#10984](https://github.com/rapidsai/cudf/pull/10984)) [@bdice](https://github.com/bdice)
- Update cuCollections for thrust upgrade ([#10983](https://github.com/rapidsai/cudf/pull/10983)) [@PointKernel](https://github.com/PointKernel)
- Expose row-group size options in cudf ParquetWriter ([#10980](https://github.com/rapidsai/cudf/pull/10980)) [@rjzamora](https://github.com/rjzamora)
- Cleanup cudf::strings::detail::regex_parser class source ([#10975](https://github.com/rapidsai/cudf/pull/10975)) [@davidwendt](https://github.com/davidwendt)
- Handle missing fields as nulls in get_json_object() ([#10970](https://github.com/rapidsai/cudf/pull/10970)) [@SrikarVanavasam](https://github.com/SrikarVanavasam)
- Fix license families to match all-caps expected by conda-verify. ([#10931](https://github.com/rapidsai/cudf/pull/10931)) [@bdice](https://github.com/bdice)
- Include `<optional>` for GCC 11 compatibility. ([#10927](https://github.com/rapidsai/cudf/pull/10927)) [@bdice](https://github.com/bdice)
- Enable builds with scikit-build ([#10919](https://github.com/rapidsai/cudf/pull/10919)) [@vyasr](https://github.com/vyasr)
- Improve `distinct` by using `cuco::static_map::retrieve_all` ([#10916](https://github.com/rapidsai/cudf/pull/10916)) [@PointKernel](https://github.com/PointKernel)
- Update cudfjni to 22.08.0-SNAPSHOT ([#10910](https://github.com/rapidsai/cudf/pull/10910)) [@pxLi](https://github.com/pxLi)
- Improve the capture of fatal cuda error ([#10884](https://github.com/rapidsai/cudf/pull/10884)) [@sperlingxx](https://github.com/sperlingxx)
- Cleanup regex compiler operators and operands source ([#10879](https://github.com/rapidsai/cudf/pull/10879)) [@davidwendt](https://github.com/davidwendt)
- Buffer: make `.ptr` read-only ([#10872](https://github.com/rapidsai/cudf/pull/10872)) [@madsbk](https://github.com/madsbk)
- Configurable NaN handling in device_row_comparators ([#10870](https://github.com/rapidsai/cudf/pull/10870)) [@rwlee](https://github.com/rwlee)
- Register `cudf.core.groupby.Grouper` objects to dask `grouper_dispatch` ([#10838](https://github.com/rapidsai/cudf/pull/10838)) [@brandon-b-miller](https://github.com/brandon-b-miller)
- Upgrade to `arrow-8` ([#10816](https://github.com/rapidsai/cudf/pull/10816)) [@galipremsagar](https://github.com/galipremsagar)
- Remove _getattr_ method in RangeIndex class ([#10538](https://github.com/rapidsai/cudf/pull/10538)) [@skirui-source](https://github.com/skirui-source)
- Adding bins to value counts ([#8247](https://github.com/rapidsai/cudf/pull/8247)) [@marlenezw](https://github.com/marlenezw)
# cuDF 22.06.00 (7 Jun 2022)
## 🚨 Breaking Changes
- Enable Zstandard decompression only when all nvcomp integrations are enabled ([#10944](https://github.com/rapidsai/cudf/pull/10944)) [@vuule](https://github.com/vuule)
- Rename `sliced_child` to `get_sliced_child`. ([#10885](https://github.com/rapidsai/cudf/pull/10885)) [@bdice](https://github.com/bdice)
- Add parameters to control page size in Parquet writer ([#10882](https://github.com/rapidsai/cudf/pull/10882)) [@etseidl](https://github.com/etseidl)
- Make cudf::test::expect_columns_equal() fail when comparing unsanitary lists. ([#10880](https://github.com/rapidsai/cudf/pull/10880)) [@nvdbaranec](https://github.com/nvdbaranec)
- Cleanup regex compiler fixed quantifiers source ([#10843](https://github.com/rapidsai/cudf/pull/10843)) [@davidwendt](https://github.com/davidwendt)
- Refactor `cudf::contains`, renaming and switching parameters role ([#10802](https://github.com/rapidsai/cudf/pull/10802)) [@ttnghia](https://github.com/ttnghia)
- Generic serialization of all column types ([#10784](https://github.com/rapidsai/cudf/pull/10784)) [@wence-](https://github.com/wence-)
- Return per-file metadata from readers ([#10782](https://github.com/rapidsai/cudf/pull/10782)) [@vuule](https://github.com/vuule)