Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid libc buffered IO #294

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

stewartsmith
Copy link
Contributor

the FILE related IO functions in libc do buffering inside the C library, and are generally less performant than using the file descriptor based open()/read()/write() functions.

The FILE based approach somewhat limits the maximum throughput of librepo, as well as increases CPU usage. In my benchmarks of a reposync of the Amazon Linux 2023 x86-64 repositories, this move to file descriptor based IO saves about 1 second of user time, and .5 seconds of system time, for a wall clock time benefit of a few seconds (102s vs 99s).

@stewartsmith
Copy link
Contributor Author

stewartsmith commented Feb 9, 2024

For reference, my benchmarking has been done on a m5n.16xlarge EC2 instance to the in-region S3 buckets as well as to the CDN repositories. That instance type has 256GB memory, a 75Gbit network connection, and is a 64 core Cascade Lake system. The root volume is a 256GB gp3 EBS volume with 500MB/sec of IO and 3000 IOPs.

The background of this is that a lot of EC2 instances don't live that long (relatively speaking), and never install RPMs except on launch - so all the time-to-install RPMs is time spent scaling up a system that could be better served by running the customer workload.

@@ -619,7 +619,7 @@ lr_writecb(char *ptr, size_t size, size_t nmemb, void *userdata)
if (range_start <= 0 && range_end <= 0) {
// Write everything curl give to you
target->writecb_recieved += all;
return fwrite(ptr, size, nmemb, target->f);
return write(target->fd, ptr, all);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Replacing fwrite() with write() is not so simple:

  • write() expects size_t in its last argument, while you pass gint64. That won't match on 32-bit memory model, e.g. on 32-bit x86 platform.
  • write() returns a signed ssize_t, while fwrite() unsigned size_t. Your return statement does match lr_writecb() prototype.
  • write() can return before writing all data with errno==EINTR and you need to call write() again.

Please fix these issues at all places you replaced fwrite() with write().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

Included some code for keeping to reasonable IO sizes and just avoiding any of these issues with signed/unsigned.

@@ -684,7 +684,7 @@ lr_writecb(char *ptr, size_t size, size_t nmemb, void *userdata)
}

assert(nmemb > 0);
cur_written = fwrite(ptr, size, nmemb, target->f);
cur_written = write(target->fd, ptr, size * nmemb);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What if size*nmemb overflows size_t? fwrite() handled it for you. Now you need to implement it check yourself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to check for overflow and fall back to a well known size.

@@ -1033,9 +1033,9 @@ remove_librepo_xattr(LrDownloadTarget * target)
gboolean
lr_zck_clear_header(LrTarget *target, GError **err)
{
assert(target && target->f && target->target && target->target->path);
assert(target && target->fd && target->target && target->target->path);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fd is an int. fd==0 is a valid descriptor. You should compare it to -1, or at least to < 0 as in invalid descriptor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -1295,7 +1295,7 @@ static gboolean
check_zck(LrTarget *target, GError **err)
{
assert(!err || *err == NULL);
assert(target && target->f && target->target);
assert(target && target->fd && target->target);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again. fd == 0 is a valid value. Compare to -1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

fclose(target->f);
target->f = NULL;
close(target->fd);
target->fd = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Assign -1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@@ -1573,8 +1563,7 @@ prepare_next_transfer(LrDownload *dd, gboolean *candidatefound, GError **err)

if (target->original_offset == -1) {
// Determine offset
fseek(target->f, 0L, SEEK_END);
gint64 determined_offset = ftell(target->f);
gint64 determined_offset = lseek(target->fd, 0L, SEEK_END);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lseek() returns off_t. Not gint64. Those won't match on 32-bit platforms without large file support enabled.

Copy link
Contributor Author

@stewartsmith stewartsmith Jun 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed (in upcoming updated patch...)

there's this snipped below from the original code:

        gint64 used_offset = target->original_offset;
        g_debug("%s: Used offset for download resume: %"G_GINT64_FORMAT,
                __func__, used_offset);

        c_rc = curl_easy_setopt(h, CURLOPT_RESUME_FROM_LARGE,
                                (curl_off_t) used_offset);

So there might be some places in existing code that doesn't match everywhere. Possible that this should all just move to off_t ?

if (target->f != NULL) {
fclose(target->f);
target->f = NULL;
if (target->fd != 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compare to -1 and set to -1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

fclose(target->f);
target->f = NULL;
close(target->fd);
target->fd = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Set to -1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

fclose(target->f);
target->f = NULL;
close(target->fd);
target->fd = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Set to -1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

@@ -2803,7 +2791,7 @@ lr_download(GSList *targets,
for (GSList *elem = dd.targets; elem; elem = g_slist_next(elem)) {
LrTarget *target = elem->data;
assert(target->curl_handle == NULL);
assert(target->f == NULL);
assert(target->fd == 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Compare against -1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@ppisar ppisar self-assigned this Feb 12, 2024
@ppisar ppisar added the Triaged Someone on the DNF 5 team has read the issue and determined the next steps to take label Feb 12, 2024
@ppisar
Copy link
Contributor

ppisar commented Feb 12, 2024

In general, I like your patch, but please address the different semantics I pointed in-line.

@stewartsmith
Copy link
Contributor Author

Thanks for the eyes on it. I can clean up the review comments, and spin a rev2.

the FILE related IO functions in libc do buffering inside the C library,
and are generally less performant than using the file descriptor based
open()/read()/write() functions.

The FILE based approach somewhat limits the maximum throughput of
librepo, as well as increases CPU usage. In my benchmarks of a reposync
of the Amazon Linux 2023 x86-64 repositories, this move to file
descriptor based IO saves about 1 second of user time, and .5 seconds of
system time, for a wall clock time benefit of a few seconds (102s vs
99s).
@stewartsmith stewartsmith force-pushed the no-libc-buffered-io branch from a38226c to ebd4be0 Compare June 5, 2024 00:47
@stewartsmith
Copy link
Contributor Author

I think I've managed to address the comments and ensure correct behavior in (hopefully) all error conditions.

Copy link
Member

@Conan-Kudo Conan-Kudo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall, it looks fine. One small change, though.

* We write up to 32MB at a time, mainly to ensure that the below loops
* never bitrot, and it's regularly tested with real world RPMs.
*/
#define WRITECB_CHUNK_MAX 32*1024*1024
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please indent to match the line starting point.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Triaged Someone on the DNF 5 team has read the issue and determined the next steps to take
Projects
Status: Review
Development

Successfully merging this pull request may close these issues.

3 participants