Write cuDF version in Parquet "created_by" metadata field #14721
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2019-2023, NVIDIA CORPORATION.
+ * Copyright (c) 2019-2024, NVIDIA CORPORATION.
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -99,6 +99,10 @@ struct aggregate_writer_metadata {
     }
   }
 
+#ifndef CUDF_VERSION
+#error "CUDF_VERSION is not defined"
+#endif
This is in the middle of the
I wanted it close to where it's used, but it looked uglier to me in the middle of the function. Moved it to the top just after the includes.
+
   FileMetaData get_metadata(size_t part)
   {
     CUDF_EXPECTS(part < files.size(), "Invalid part index queried");
@@ -108,7 +112,7 @@ struct aggregate_writer_metadata {
     meta.num_rows = this->files[part].num_rows;
     meta.row_groups = this->files[part].row_groups;
     meta.key_value_metadata = this->files[part].key_value_metadata;
-    meta.created_by = this->created_by;
+    meta.created_by = "cudf version " CUDF_STRINGIFY(CUDF_VERSION);
ttnghia marked this conversation as resolved.
     meta.column_orders = this->column_orders;
     return meta;
   }
@@ -171,7 +175,6 @@ struct aggregate_writer_metadata {
     std::vector<std::vector<uint8_t>> column_indexes;
   };
   std::vector<per_file_metadata> files;
-  std::string created_by = "";
   thrust::optional<std::vector<ColumnOrder>> column_orders = thrust::nullopt;
 };
Line 650 above this uses `set_source_files_properties`, so should we use it here?

I'm not a CMake expert 😅. I used `set_property` since it's also used below for setting the same property for `jit/cache.cpp`.

I'm also not a CMake expert. OK, they can probably both achieve the same output 👍

`set_property` is probably easier to work with here because you can `APPEND` with it. This solution is fine.
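
For readers unfamiliar with the pattern in the diff, here is a minimal standalone sketch of how a version supplied as a compile definition can end up in a string such as `created_by`. It assumes `CUDF_STRINGIFY` is a conventional two-level stringification macro. All `DEMO_` names and the version `1.2.3` are hypothetical stand-ins and are not taken from cuDF. In a real build the definition would come from the build system (for example, a per-source compile definition set through CMake) rather than from the fallback shown here.

```cpp
#include <string>

// Hypothetical stand-in for a definition normally injected by the build system,
// e.g. -DDEMO_VERSION=1.2.3. It is defined here only so the sketch compiles alone.
#ifndef DEMO_VERSION
#define DEMO_VERSION 1.2.3
#endif

// Same kind of guard as in the diff: fail the build early if the definition is missing.
#ifndef DEMO_VERSION
#error "DEMO_VERSION is not defined"
#endif

// Two-level stringification: the outer macro expands its argument first,
// then the inner macro applies the # operator to the expanded tokens.
#define DEMO_STRINGIFY_DETAIL(x) #x
#define DEMO_STRINGIFY(x) DEMO_STRINGIFY_DETAIL(x)

// Adjacent string literals are concatenated at compile time,
// so this returns "cudf version 1.2.3" with the fallback above.
inline std::string demo_created_by()
{
  return "cudf version " DEMO_STRINGIFY(DEMO_VERSION);
}
```

Calling `demo_created_by()` from any translation unit that includes this code yields the prefixed version string. A single-level `#x` macro would instead produce the literal text `DEMO_VERSION`, which is why the extra level of indirection is needed.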