-
Notifications
You must be signed in to change notification settings - Fork 102
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add rcutils_list_directory function #249
Conversation
{ | ||
RCUTILS_CHECK_FOR_NULL_WITH_MSG( | ||
directory_path, "directory_path is null", return RCUTILS_RET_INVALID_ARGUMENT); | ||
RCUTILS_CHECK_FOR_NULL_WITH_MSG( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rcutils_string_array_init also does this check.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's fair. To be pedantic, the argument name could be different in that function so the error message might not be correct, but it might be worth the performance since this is likely to be a rare error case.
string_array, "string_array is null", return RCUTILS_RET_INVALID_ARGUMENT); | ||
|
||
// Start with 8 entries | ||
rcutils_ret_t ret = rcutils_string_array_init(string_array, 8, &allocator); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would do check size 1st in the directory, and the allocate array what we need and store the data. this reduces array adjustment operation. I think this is cost effective compared to allocate/de-allocate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think my biggest problem with that is that it introduces a race with the filesystem. Entries could be removed or added during or after the initial enumeration and then our size would be wrong. To handle this, we'd essentially need to have all the code already here, plus the initial enumeration to give us the size hint.
I'm also not convinced that the performance would be better, but I could be convinced by some data on the topic.
goto fail; | ||
} | ||
struct dirent * entry; | ||
for (; NULL != (entry = readdir(dir)); ++count) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
counting .
and ..
directory included, is this intended?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I felt that we can leave it up to the consumer whether they want to ignore those entries rather than performing the comparisons here. The API proposed in #323 mentions this behavior in the API docs.
This change adds a simple function for listing the contents of a directory, and returning the names of those files as a string_array object. Signed-off-by: Scott K Logan <[email protected]>
@cottsay What's the status of this one? Is it something you are still interested in getting in? |
I really think some sort of directory iteration would be a great thing to have in The main reason I haven't pushed further on this is that I'm having doubts about the scalability of this solution. It's a really convenient API, but having an iteration-based API that doesn't need to allocate space for the entire directory listing might be nice to have. Really, we could do both. I'd love to hear your thoughts on it. |
Oh, that is a really good point. It would be much better not to allocate all of the memory at once, particularly for potentially large directories. I like your idea of a directory iterator; essentially I would think of something like the My personal suggestion is that we close this out, and instead open an issue tracking the addition of a directory iterator. But if you still think this has value, we can leave it open. |
While I agree with you that there might be better and more efficient methods out there, I wonder if it's not just worth finishing up the current solution in order to move forward and iterate over it in a point of time. Opposing the idea of closing this PR. |
The issue that @cottsay brings up is more than just being inefficient, though. If you happen to have a directory with many, many files, then calling this API inside of that directory could potentially use up all of the memory on your machine just storing the names of those files. Since this API is not recursive, this may be somewhat mitigated by the limits of the underlying filesystem on the number of files in a directory, but I'd really rather not depend on that. I think that in order to move forward here we need to switch this to a directory iterator in order to avoid that problem (and that matches what the underlying OS APIs look like as well). |
I've prototyped an iteration model, but it wasn't terribly elegant due to some fundamentally different approaches that Windows and Linux take. I'm not overly thrilled about the way errors would be handled, but I'll try to dig up that old branch and put it somewhere for review. Another reason to take that approach, however, is that we can update the location in |
Closed in favor of #323. |
This change adds a simple function for listing the contents of a directory, and returning the names of those files as a string_array object. The iteration method is largely based on the one used in
rcutils_calculate_directory_size
.The signature for this new function is modeled after the existing
rcutils_split
function:rcutils/include/rcutils/split.h
Lines 27 to 44 in 6acf8d8
Requires #247 and #248