Large strings support in cudf::concatenate #15195
@@ -226,6 +197,53 @@ TEST_F(StringColumnTest, ConcatenateTooLarge)
  EXPECT_THROW(cudf::concatenate(input_cols), std::overflow_error);
}

TEST_F(StringColumnTest, ConcatenateLargeStrings)
{
  CUDF_TEST_ENABLE_LARGE_STRINGS();
What's the long term plan for CUDF_TEST_ENABLE_LARGE_STRINGS? Will we need this forever or is it only until we turn on long strings by default?
This will be temporary until long strings are turned on by default.
It will be useful until then in case specific tests require them.
Also, a future PR will introduce a new test fixture where this feature is turned on by default.
And then this test may be moved to use that fixture at that time.
I like how you're breaking this into digestible pieces. Thank you for the trouble you're going through for this.
Beautiful. Also I would echo @hyperbolic2346's comment, I appreciate the work that is going into making this project digestible in small PRs.
/merge
Description
Enables cudf::concatenate to create and return a large strings column (offsets are INT64). This also introduces the LIBCUDF_LARGE_STRINGS_ENABLED environment variable and utilities around it. One internal utility checks the value so that the calling logic can either throw an overflow exception or build INT64 offsets.
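A minimal sketch of that decision, for illustration only. The names large_strings_enabled and choose_offsets_type, the accepted value "1", and the use of the INT32 maximum as the threshold are assumptions made for this example, not the libcudf internals.

```cpp
#include <cstdint>
#include <cstdlib>
#include <limits>
#include <stdexcept>
#include <string>

// Hypothetical helper: true when the environment variable is set to "1".
// The set of accepted values is an assumption for this sketch.
inline bool large_strings_enabled()
{
  char const* value = std::getenv("LIBCUDF_LARGE_STRINGS_ENABLED");
  return value != nullptr && std::string(value) == "1";
}

enum class offsets_type { INT32, INT64 };

// Hypothetical helper: picks the offsets type for a strings column whose
// character data totals `total_bytes`, or throws if it cannot be represented.
inline offsets_type choose_offsets_type(std::int64_t total_bytes)
{
  constexpr std::int64_t max_int32 = std::numeric_limits<std::int32_t>::max();
  if (total_bytes <= max_int32) { return offsets_type::INT32; }  // fits in INT32 offsets
  if (large_strings_enabled()) { return offsets_type::INT64; }   // large strings path
  throw std::overflow_error("total size of strings exceeds the INT32 offsets limit");
}
```

With the variable unset, an oversized input takes the std::overflow_error path that ConcatenateTooLarge expects; with it set, the same input would get INT64 offsets instead.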
The cudf::test::large_strings_enabler is introduced to set/unset the env var for individual tests as needed. A follow-on PR will attempt to consolidate these kinds of tests with a specialized test fixture using this utility class.
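One way such a set/unset helper could work is as a scoped guard that restores the previous state when it is destroyed. The sketch below is a hypothetical stand-in (the class name scoped_large_strings and the value "1" are assumptions), not the cudf::test::large_strings_enabler implementation, and it relies on the POSIX setenv/unsetenv functions.

```cpp
#include <cstdlib>
#include <string>

// Name of the environment variable introduced by this PR.
constexpr char const* large_strings_env = "LIBCUDF_LARGE_STRINGS_ENABLED";

// Hypothetical scoped guard: sets (or unsets) the variable on construction
// and restores whatever was there before on destruction.
class scoped_large_strings {
 public:
  explicit scoped_large_strings(bool enable = true)
  {
    if (char const* old = std::getenv(large_strings_env)) {
      had_previous_ = true;
      previous_     = old;
    }
    if (enable) {
      setenv(large_strings_env, "1", 1);  // POSIX; overwrite any existing value
    } else {
      unsetenv(large_strings_env);
    }
  }

  ~scoped_large_strings()
  {
    if (had_previous_) {
      setenv(large_strings_env, previous_.c_str(), 1);
    } else {
      unsetenv(large_strings_env);
    }
  }

  scoped_large_strings(scoped_large_strings const&)            = delete;
  scoped_large_strings& operator=(scoped_large_strings const&) = delete;

 private:
  bool had_previous_{false};
  std::string previous_;
};
```

Declaring a guard at the top of a test body keeps the setting local to that test, which is roughly the role the CUDF_TEST_ENABLE_LARGE_STRINGS() line plays in the diff above.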
Checklist