Large strings gtest fixture and utilities #15513
#include <vector>

struct ConcatenateTest : public cudf::test::StringsLargeTest {};
Moved from copying/concatenate_tests.cpp
#include <vector>

struct MergeTest : public cudf::test::StringsLargeTest {};
Moved from merge/merge_string_test.cpp
// create object to automatically be destroyed at the end of main()
auto lsd = cudf::test::LargeStringsData();
// set object pointer into static variable
cudf::test::StringsLargeTest::g_ls_data = &lsd;
Oh no, manually assigning a static variable like this is not good practice. Can we initialize it automatically?
The issue is that the variable (or at least its pointer) needs to be accessible globally, but its lifetime must be scoped within main(). So lsd must be created and destroyed within the main() scope while also being a singleton for the entire process.
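The constraint described above can be sketched in isolation. This is only an illustration under assumptions: `LargeData` and `LargeFixture` are hypothetical stand-ins for `cudf::test::LargeStringsData` and `cudf::test::StringsLargeTest`, and `run_tests_like_main()` stands in for the body of main(). The owning object lives on the stack of that scope, while a non-owning static pointer makes it reachable from every test.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for cudf::test::LargeStringsData.
struct LargeData {
  std::vector<std::string> rows{"a", "b", "c"};
};

// Hypothetical stand-in for the fixture; the pointer is non-owning.
struct LargeFixture {
  static LargeData* g_data;
};
LargeData* LargeFixture::g_data = nullptr;

// Stands in for the body of main(): the data is created and destroyed here.
std::size_t run_tests_like_main()
{
  LargeData data;                // owned by this scope only
  LargeFixture::g_data = &data;  // publish a process-wide, non-owning pointer

  // A "test" reaching the shared data through the static pointer.
  assert(LargeFixture::g_data != nullptr);
  std::size_t seen = LargeFixture::g_data->rows.size();

  LargeFixture::g_data = nullptr;  // clear the pointer before `data` is destroyed
  return seen;
}
```

Resetting the pointer at the end of the scope is an addition in this sketch; it avoids leaving a dangling pointer behind once the owning object is gone.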
How about create-on-first-access?
struct StringsLargeTest : public cudf::test::BaseFixture {
 public:
  static auto get_ls_data() {
    // allocate only on the first access; later calls reuse the same object
    if (g_ls_data == nullptr) { g_ls_data = new cudf::test::LargeStringsData; }
    return g_ls_data;
  }
 private:
  static LargeStringsData* g_ls_data;
};
get_ls_data() should then be called within the main() scope.
This won't work because it will return a pointer to main(), which will not automatically destroy the object.
Is there value in making this a function static? Its construction is guaranteed to be thread safe, and it will be destroyed in reverse order of construction.
static auto get_ls_data() {
  // function-local static: constructed once, on the first call
  static auto the_instance = cudf::test::LargeStringsData{};
  return &the_instance;
}
We could use a smart pointer here. This seems to work.
static auto get_lsd_data(int v) {
  auto ls_data = std::make_unique<LargeStringsData>(v);
  g_ls_data = ls_data.get();
  return ls_data;
}
When the smart pointer goes out of scope, it will delete the object.
This is more in line with what RMM does with resource memory manager objects.
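A self-contained sketch of the smart-pointer idea follows. It is an assumption-laden illustration, not the PR's code: `LargeData`, `LargeFixture`, `make_data()` and `scoped_use()` are hypothetical names, and `live_count` exists only so the destruction can be observed. The caller owns the object through `std::unique_ptr`, and the fixture keeps just a raw, non-owning pointer.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for LargeStringsData; live_count tracks live objects.
struct LargeData {
  static int live_count;
  int value;
  explicit LargeData(int v) : value(v) { ++live_count; }
  ~LargeData() { --live_count; }
};
int LargeData::live_count = 0;

// Hypothetical stand-in for the fixture.
struct LargeFixture {
  static LargeData* g_data;  // non-owning

  // Creates the data, records a raw pointer, and hands ownership to the caller.
  static std::unique_ptr<LargeData> make_data(int v)
  {
    auto owner = std::make_unique<LargeData>(v);
    g_data     = owner.get();
    return owner;
  }
};
LargeData* LargeFixture::g_data = nullptr;

// Stands in for main(): the unique_ptr deletes the object when it leaves scope.
int scoped_use()
{
  auto owner = LargeFixture::make_data(7);
  assert(LargeData::live_count == 1);
  int seen = LargeFixture::g_data->value;
  LargeFixture::g_data = nullptr;  // avoid a dangling pointer after `owner` dies
  return seen;
}
```

The design choice here is that ownership stays with whoever called `make_data()`, so the object's lifetime follows that caller's scope rather than static destruction order.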
> If the object (not the pointer) is a static variable anywhere there is a chance it could be destroyed outside of main().
Sorry, I don't fully understand the concern. The function static object is guaranteed to be alive until main() exits. Does that not suit?
Do we have a dependency somewhere in the global static destruction sequence, or something?
The getter needs to check and throw if g_ls_data is not yet initialized.
@ttnghia: With a function static, the object is guaranteed to be initialized once, on the first call.
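A checked getter of the kind mentioned above might look like the following sketch. The names (`LargeData`, `LargeFixture`, `set_data`, `get_data`) and the choice of `std::logic_error` are assumptions made for illustration; the PR may use different names and a different exception type.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical stand-in for LargeStringsData.
struct LargeData {
  int value = 42;
};

// Hypothetical stand-in for the fixture with a checked accessor.
class LargeFixture {
 public:
  static void set_data(LargeData* data) { g_data = data; }

  // Throws instead of returning a null pointer when nothing has been registered.
  static LargeData& get_data()
  {
    if (g_data == nullptr) { throw std::logic_error("large strings data not initialized"); }
    return *g_data;
  }

 private:
  static LargeData* g_data;
};
LargeData* LargeFixture::g_data = nullptr;

// Returns true if the getter rejects access before initialization.
bool getter_throws_when_unset()
{
  LargeFixture::set_data(nullptr);
  try {
    LargeFixture::get_data();
  } catch (std::logic_error const&) {
    return true;
  }
  return false;
}

// Registers data for the duration of this scope and reads it back.
int read_after_set()
{
  LargeData data;
  LargeFixture::set_data(&data);
  int seen = LargeFixture::get_data().value;
  LargeFixture::set_data(nullptr);
  assert(seen == 42);
  return seen;
}
```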
The initialization part is not too challenging. Global static destruction is not good since the object holds device memory.
Here is a godbolt which I hope will explain some of this: https://godbolt.org/z/rTa9ceEKf
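The lifetime difference at issue can also be sketched locally. This is not the content of the linked godbolt, only a minimal illustration with hypothetical names (`Tracked`, `event_log`, `function_static`, `scope_like_main`). It shows that an object scoped to a function is destroyed when that scope ends, while a function-local static is still alive afterwards and is destroyed only during static destruction at process exit. For an object holding device memory, that later point may come after the device runtime has started shutting down, which is the stated concern.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Shared log of construction and destruction events.
std::vector<std::string>& event_log()
{
  static std::vector<std::string> log;
  return log;
}

// Records its own construction and destruction in the log.
struct Tracked {
  std::string name;
  explicit Tracked(std::string n) : name(std::move(n)) { event_log().push_back("ctor " + name); }
  ~Tracked() { event_log().push_back("dtor " + name); }
};

// Function-local static: constructed on first call, destroyed at process exit.
Tracked& function_static()
{
  static Tracked instance{"static"};
  return instance;
}

// Stands in for main(): `scoped` is destroyed when this function returns.
void scope_like_main()
{
  Tracked scoped{"scoped"};
  function_static();
  assert(event_log().size() >= 2);
}
```

After `scope_like_main()` returns, the log holds a destructor entry for `scoped` but none for the static instance, which is the behavior the main()-scoped design relies on.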
> Global static destruction is not good since the object holds device memory.
Hmm. Thank you, I'll try to bear this in mind.
This looks good to me. 👍
(Barring the current discussion with @ttnghia.)
/merge
Description
Creates the base class and utilities for testing APIs that produce large strings.
The main purpose of the fixture is to enable the large strings environment variable(s) and to set up large test data that can be reused by multiple tests.
Checklist