Fixes CSV-reader type inference for thousands separator and decimal point #8261

Merged

@elstehle elstehle commented May 17, 2021

This PR fixes #6655, where the CSV reader incorrectly infers a column's type when numbers contain the thousands separator.
It also makes type inference respect a user-specified decimal point, so types are now inferred correctly when the decimal point is not '.'.
It additionally includes minor Doxygen fixes and style changes from camelCase to snake_case.
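
To illustrate the behavior described above, here is a minimal Python sketch of per-field type inference that honors a configurable decimal point and thousands separator. It is not libcudf's implementation (which is C++/CUDA); the function name `infer_field_type` and its parameters are hypothetical and chosen only for this example. The sketch assumes the two characters differ and does not check that separators appear at valid positions.

```python
def infer_field_type(field: str, decimal: str = ".", thousands: str = "") -> str:
    """Classify one CSV field as 'int', 'float', or 'str'.

    Hypothetical helper for illustration only; not libcudf's algorithm.
    An empty `thousands` means no thousands separator is configured.
    """
    text = field.strip()
    # Allow a single leading sign.
    if text and text[0] in "+-":
        text = text[1:]

    digits = 0
    decimal_points = 0
    for ch in text:
        if ch in "0123456789":
            digits += 1
        elif thousands and ch == thousands:
            # Thousands separators carry no type information; skip them.
            continue
        elif ch == decimal:
            decimal_points += 1
        else:
            # Any other character means the field is not numeric.
            return "str"

    if digits == 0 or decimal_points > 1:
        return "str"
    return "float" if decimal_points == 1 else "int"
```

With this sketch, `"1,234"` is classified as an integer only when `thousands=","` is passed, and `"3,14"` is classified as a float only when `decimal=","` is passed; with the defaults, both are treated as strings.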

@elstehle requested a review from a team as a code owner May 17, 2021 15:48
@github-actions bot added the libcudf label May 17, 2021
@elstehle added the 3 - Ready for Review, bug, and non-breaking labels May 17, 2021

@rgsl888prabhu left a comment

minor comment, other than that looks good.

cpp/tests/io/csv_test.cpp (outdated, resolved)
@elstehle requested a review from vuule May 17, 2021 16:32

@vuule left a comment

Looks good except for the existing comment on testing 👍

@elstehle force-pushed the fix/read-csv-auto-detect-types branch from 79c2e55 to a7e2715 May 19, 2021 15:14

vuule commented May 19, 2021

@elstehle please don't force-push. If I remember correctly, it messes with the comment traceability.

@vuule added the 5 - Ready to Merge label and removed the 3 - Ready for Review label May 19, 2021

codecov bot commented May 19, 2021

Codecov Report

❗ No coverage uploaded for pull request base (branch-21.06@59d8d5e).
The diff coverage is n/a.

❗ Current head 8252777 differs from pull request most recent head a7e2715. Consider uploading reports for the commit a7e2715 to get more accurate results.

@@               Coverage Diff               @@
##             branch-21.06    #8261   +/-   ##
===============================================
  Coverage                ?   82.84%           
===============================================
  Files                   ?      105           
  Lines                   ?    17865           
  Branches                ?        0           
===============================================
  Hits                    ?    14800           
  Misses                  ?     3065           
  Partials                ?        0           

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update 59d8d5e...a7e2715.


vuule commented May 19, 2021

@gpucibot merge

@rapids-bot bot merged commit 2b9fc62 into rapidsai:branch-21.06 May 19, 2021
Labels: 5 - Ready to Merge (Testing and reviews complete, ready to merge); bug (Something isn't working); libcudf (Affects libcudf (C++/CUDA) code.); non-breaking (Non-breaking change)
Linked issue: [BUG] CSV reader incorrectly infers type when numbers contain the thousands character
3 participants