Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(core-clp)!: Migrate archive metadata file format to MessagePack. #700
feat(core-clp)!: Migrate archive metadata file format to MessagePack. #700
Changes from 4 commits
ecde0d3
a6974c7
d3a52d4
72452c1
aafee6e
f19a7ed
620ce22
fd7a451
fa67ad6
13d5fdf
1b1b290
202c4ee
a885dc8
fc00b4a
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally, exceptions should only be used for exceptional circumstances. I don't think corrupt data is exceptional since it can occur under normal circumstances (e.g., an interrupted download, a drive failure, etc.). Thus, I think this method should return these errors as error codes. To support a return value that can be an error code OR an object, you can use outcome. See
clp::ffi::get_schema_subtree_bitmap
for an example.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function is called in a try-catch block, where any exception reading metadata is critical and re thrown. I think it would look weird to return std::result here , then throw an exception if there is an error, inside the existing try-catch block. It may be possible to get rid of the try-catch block entirely, and just use std::result, but then any undocumented exceptions in msgpack/file_reader may be caught higher up in code, and not in metadata handler like they are now. Lmk ur thoughts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha. So from a guideline perspective, we should:
I think (1) & (2) is doable in this PR (but feel free to disagree if it seems like too much code to change). (3) will probably be larger than the scope of this PR. So if we just do (1) & (2), to be able to propagate the error to the caller, I'd throw an exception.
Let's just make sure we're on the same page with the above before we go ahead with the change (don't want you to have to change too much).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general I am still a little suspect about using std::result now that i have looked at a bit more. We need to use since this function returns
ArchiveMetadata
. We don't have custom error codes defined in this namespace yet (see Zhihao's custom codes), and dosen't look like the standard clp error codes have been migrated? It may be out-of-scope of this PR to add these new error codes, unless you want to do a new PR first, to add support for new codes into this class, or migrate clp codes or something else?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we can use regular error code, and pass in a pointer to default constructed ArchivedMetadata, but that is a bit gross
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Uh, we usually use the generic error codes (I gave a bad example, lol), but I guess since it isn't clean, let's leave it for now. I'll review the rest of the PR when I get a chance today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Add error handling for the write operation.
The write operation should be checked for errors to ensure the data is written successfully.
Apply this diff to add error handling:
📝 Committable suggestion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Use std::string instead of std::string_view for member variables.
Using std::string_view for member variables that store string literals is risky as it doesn't own the string data. If the string literal's lifetime isn't guaranteed, this could lead to dangling references.
Apply this diff to use std::string:
📝 Committable suggestion
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The stings are constexpression
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@davemarco Ah, you're absolutely right. Since these are constexpr strings with static storage duration, using
std::string_view
is actually the correct and more efficient approach. I apologize for the confusion in my previous review.✏️ Learnings added
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since the FileReader goes out of scope right after the metadata is created, can the method takes metadata_file_path as its parameter and create a FileReader internally?
Edit: Now i see that you are just trying to reuse the original signature, which makes sense.. but maybe we can also improve the method interface in this PR since it's a few lines of change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do you feel about taking a clp::array instead as input. It will move some of the code (the read and stat) into archive reader, but then this constructor can also be used for sfa. Like for sfa, we can't pass in a filereader (since no way to get the size of metadata in larger file by itself, and the filepath is also for the entire sfa)?
We could also have 2 constructors. One for file path, and one for buffer. And they just share a method to deserialize from buffer? Let me know your thoughts
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this planned in a future PR? I think it makes sense, but the change you propose might be easier to reason about when we are actually making the changes that include sfa.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay for now, I can take in a file path. I can add the buffer feature later when sfa reader is implemented