Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix read example use io.CopyBuffer #42

Merged
merged 2 commits into from
Nov 4, 2024

Conversation

nirs
Copy link
Member

@nirs nirs commented Nov 3, 2024

Due to zero copy optimizations in os.File, using io.CopyBuffer() with
os.File and qcow2reader.Image ends as io.Copy() ignoring our buffer.
This is slow and a bad example for users of the library.

Fix the issue by hiding os.File ReadFrom method with a wrapper, forcing
io.CopyBuffer() to use our buffer.

This speeds up the copy significantly:

  • Reading qcow2 image is 3.7 times faster
  • Copying qcow2 image is 1.7 times faster
  • Reading qcow2 compressed image is 1.05 times faster

Based on the benchmarks, update the default buffer size in the read exxample
to 2 MiB.

buffer-size-qcow2

Fixes: #41

@@ -29,7 +29,7 @@ func cmdRead(args []string) error {
flag.PrintDefaults()
}
fs.BoolVar(&debug, "debug", false, "enable printing debug messages")
fs.IntVar(&bufferSize, "buffer-size", 65536, "buffer size")
fs.IntVar(&bufferSize, "buffer-size", 512*1024, "buffer size")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a separate topic and should be another commit

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Separated to another commit.


// Writer forces os.File to write using our buffer by hiding it's ReadFrom
// method. io.CopyBuffer is up to XX times faster with this wrrapper.
type Writer struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Writer isn't a descriptive name.

What about:

Suggested change
type Writer struct {
type hideReadFrom struct {

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Renamed.


return err
}

// Writer forces os.File to write using our buffer by hiding it's ReadFrom
// method. io.CopyBuffer is up to XX times faster with this wrrapper.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you want to fulfill XX?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe just add a link to this PR (#42) for the benchmark result and the context

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll link the pr.

Copy link
Member Author

@nirs nirs Nov 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I updated the docs including a link to the pr.


return err
}

// Writer forces os.File to write using our buffer by hiding it's ReadFrom
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's → its ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure, too easy to get this wrong :-)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, should be "its"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed in current version.

nirs added 2 commits November 4, 2024 05:46
Due to zero copy optimizations in os.File, using io.CopyBuffer() with
os.File and qcow2reader.Image ends as io.Copy() ignoring our buffer.
This is slow and a bad example for users of the library.

Fix the issue by hiding os.File ReadFrom method with a wrapper, forcing
io.CopyBuffer() to use our buffer.

This speeds up the copy significantly:
- Reading qcow2 image is 3.7 times faster
- Copying qcow2 image is 1.7 times faster
- Reading qcow2 compressed image is 1.05 times faster

Making the example more complicated is not great but users need to know
how to workaround the "optimizations" in io.CopyBuffer(). I think this
should be fixed in the standard library later.

Reading Ubuntu 24.04 image in qcow2 format:

    % hyperfine -w3 -r10 -L b 32768,65536,131072,262144,524288,1048576,2097152,4194304,8388608 \
                "./go-qcow2reader-example read -buffer-size {b} /var/tmp/images/test.qcow2 >/dev/null"
    Benchmark 1: ./go-qcow2reader-example read -buffer-size 32768 /var/tmp/images/test.qcow2 >/dev/null
      Time (mean ± σ):     697.8 ms ±  13.0 ms    [User: 107.2 ms, System: 591.8 ms]
      Range (min … max):   684.0 ms … 724.0 ms    10 runs

    Benchmark 2: ./go-qcow2reader-example read -buffer-size 65536 /var/tmp/images/test.qcow2 >/dev/null
      Time (mean ± σ):     430.9 ms ±   4.7 ms    [User: 75.3 ms, System: 356.5 ms]
      Range (min … max):   426.0 ms … 438.0 ms    10 runs

    Benchmark 3: ./go-qcow2reader-example read -buffer-size 131072 /var/tmp/images/test.qcow2 >/dev/null
      Time (mean ± σ):     306.1 ms ±   3.7 ms    [User: 62.2 ms, System: 244.2 ms]
      Range (min … max):   302.9 ms … 315.0 ms    10 runs

    Benchmark 4: ./go-qcow2reader-example read -buffer-size 262144 /var/tmp/images/test.qcow2 >/dev/null
      Time (mean ± σ):     259.8 ms ±   5.8 ms    [User: 55.6 ms, System: 204.2 ms]
      Range (min … max):   253.6 ms … 266.8 ms    10 runs

    Benchmark 5: ./go-qcow2reader-example read -buffer-size 524288 /var/tmp/images/test.qcow2 >/dev/null
      Time (mean ± σ):     223.9 ms ±   2.9 ms    [User: 49.5 ms, System: 170.9 ms]
      Range (min … max):   220.6 ms … 228.3 ms    10 runs

    Benchmark 6: ./go-qcow2reader-example read -buffer-size 1048576 /var/tmp/images/test.qcow2 >/dev/null
      Time (mean ± σ):     201.1 ms ±   1.7 ms    [User: 48.3 ms, System: 148.9 ms]
      Range (min … max):   198.0 ms … 204.2 ms    10 runs

    Benchmark 7: ./go-qcow2reader-example read -buffer-size 2097152 /var/tmp/images/test.qcow2 >/dev/null
      Time (mean ± σ):     192.8 ms ±   3.2 ms    [User: 47.5 ms, System: 140.4 ms]
      Range (min … max):   188.5 ms … 196.6 ms    10 runs

    Benchmark 8: ./go-qcow2reader-example read -buffer-size 4194304 /var/tmp/images/test.qcow2 >/dev/null
      Time (mean ± σ):     189.3 ms ±   5.9 ms    [User: 47.7 ms, System: 135.0 ms]
      Range (min … max):   183.5 ms … 201.5 ms    10 runs

    Benchmark 9: ./go-qcow2reader-example read -buffer-size 8388608 /var/tmp/images/test.qcow2 >/dev/null
      Time (mean ± σ):     191.4 ms ±   1.8 ms    [User: 48.9 ms, System: 139.1 ms]
      Range (min … max):   189.8 ms … 195.5 ms    10 runs

    Summary
      ./go-qcow2reader-example read -buffer-size 4194304 /var/tmp/images/test.qcow2 >/dev/null ran
        1.01 ± 0.03 times faster than ./go-qcow2reader-example read -buffer-size 8388608 /var/tmp/images/test.qcow2 >/dev/null
        1.02 ± 0.04 times faster than ./go-qcow2reader-example read -buffer-size 2097152 /var/tmp/images/test.qcow2 >/dev/null
        1.06 ± 0.03 times faster than ./go-qcow2reader-example read -buffer-size 1048576 /var/tmp/images/test.qcow2 >/dev/null
        1.18 ± 0.04 times faster than ./go-qcow2reader-example read -buffer-size 524288 /var/tmp/images/test.qcow2 >/dev/null
        1.37 ± 0.05 times faster than ./go-qcow2reader-example read -buffer-size 262144 /var/tmp/images/test.qcow2 >/dev/null
        1.62 ± 0.05 times faster than ./go-qcow2reader-example read -buffer-size 131072 /var/tmp/images/test.qcow2 >/dev/null
        2.28 ± 0.08 times faster than ./go-qcow2reader-example read -buffer-size 65536 /var/tmp/images/test.qcow2 >/dev/null
        3.69 ± 0.13 times faster than ./go-qcow2reader-example read -buffer-size 32768 /var/tmp/images/test.qcow2 >/dev/null

Copying Ubuntu 24.04 image in qcow2 format:

    % hyperfine -w3 -r10 -L b 32768,65536,131072,262144,524288,1048576,2097152,4194304,8388608 \
                "./go-qcow2reader-example read -buffer-size {b} /var/tmp/images/test.qcow2 >/tmp/tmp.img"
    Benchmark 1: ./go-qcow2reader-example read -buffer-size 32768 /var/tmp/images/test.qcow2 >/tmp/tmp.img
      Time (mean ± σ):      2.000 s ±  0.283 s    [User: 0.126 s, System: 1.678 s]
      Range (min … max):    1.847 s …  2.792 s    10 runs

    Benchmark 2: ./go-qcow2reader-example read -buffer-size 65536 /var/tmp/images/test.qcow2 >/tmp/tmp.img
      Time (mean ± σ):      1.425 s ±  0.010 s    [User: 0.096 s, System: 1.265 s]
      Range (min … max):    1.411 s …  1.447 s    10 runs

    Benchmark 3: ./go-qcow2reader-example read -buffer-size 131072 /var/tmp/images/test.qcow2 >/tmp/tmp.img
      Time (mean ± σ):      1.269 s ±  0.029 s    [User: 0.077 s, System: 1.050 s]
      Range (min … max):    1.244 s …  1.333 s    10 runs

    Benchmark 4: ./go-qcow2reader-example read -buffer-size 262144 /var/tmp/images/test.qcow2 >/tmp/tmp.img
      Time (mean ± σ):      1.167 s ±  0.022 s    [User: 0.058 s, System: 0.928 s]
      Range (min … max):    1.139 s …  1.206 s    10 runs

    Benchmark 5: ./go-qcow2reader-example read -buffer-size 524288 /var/tmp/images/test.qcow2 >/tmp/tmp.img
      Time (mean ± σ):      1.218 s ±  0.048 s    [User: 0.048 s, System: 0.870 s]
      Range (min … max):    1.161 s …  1.289 s    10 runs

    Benchmark 6: ./go-qcow2reader-example read -buffer-size 1048576 /var/tmp/images/test.qcow2 >/tmp/tmp.img
      Time (mean ± σ):      1.210 s ±  0.046 s    [User: 0.044 s, System: 0.855 s]
      Range (min … max):    1.148 s …  1.284 s    10 runs

    Benchmark 7: ./go-qcow2reader-example read -buffer-size 2097152 /var/tmp/images/test.qcow2 >/tmp/tmp.img
      Time (mean ± σ):      1.178 s ±  0.045 s    [User: 0.041 s, System: 0.837 s]
      Range (min … max):    1.145 s …  1.296 s    10 runs

    Benchmark 8: ./go-qcow2reader-example read -buffer-size 4194304 /var/tmp/images/test.qcow2 >/tmp/tmp.img
      Time (mean ± σ):      1.170 s ±  0.034 s    [User: 0.041 s, System: 0.861 s]
      Range (min … max):    1.140 s …  1.234 s    10 runs

    Benchmark 9: ./go-qcow2reader-example read -buffer-size 8388608 /var/tmp/images/test.qcow2 >/tmp/tmp.img
      Time (mean ± σ):      1.176 s ±  0.028 s    [User: 0.043 s, System: 0.908 s]
      Range (min … max):    1.147 s …  1.221 s    10 runs

    Summary
      ./go-qcow2reader-example read -buffer-size 262144 /var/tmp/images/test.qcow2 >/tmp/tmp.img ran
        1.00 ± 0.03 times faster than ./go-qcow2reader-example read -buffer-size 4194304 /var/tmp/images/test.qcow2 >/tmp/tmp.img
        1.01 ± 0.03 times faster than ./go-qcow2reader-example read -buffer-size 8388608 /var/tmp/images/test.qcow2 >/tmp/tmp.img
        1.01 ± 0.04 times faster than ./go-qcow2reader-example read -buffer-size 2097152 /var/tmp/images/test.qcow2 >/tmp/tmp.img
        1.04 ± 0.04 times faster than ./go-qcow2reader-example read -buffer-size 1048576 /var/tmp/images/test.qcow2 >/tmp/tmp.img
        1.04 ± 0.05 times faster than ./go-qcow2reader-example read -buffer-size 524288 /var/tmp/images/test.qcow2 >/tmp/tmp.img
        1.09 ± 0.03 times faster than ./go-qcow2reader-example read -buffer-size 131072 /var/tmp/images/test.qcow2 >/tmp/tmp.img
        1.22 ± 0.02 times faster than ./go-qcow2reader-example read -buffer-size 65536 /var/tmp/images/test.qcow2 >/tmp/tmp.img
        1.71 ± 0.24 times faster than ./go-qcow2reader-example read -buffer-size 32768 /var/tmp/images/test.qcow2 >/tmp/tmp.img

Fixes: lima-vm#41
Signed-off-by: Nir Soffer <[email protected]>
Based on the benchmarks, 2 MiB buffer gives good results for reading and
copying an image with the read example.

Signed-off-by: Nir Soffer <[email protected]>
@nirs nirs force-pushed the fix-read-example branch from ee2b114 to 9fef7ca Compare November 4, 2024 13:27
@nirs nirs requested a review from AkihiroSuda November 4, 2024 13:35
Copy link
Member

@AkihiroSuda AkihiroSuda left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks

@AkihiroSuda AkihiroSuda merged commit a04a2a1 into lima-vm:master Nov 4, 2024
2 checks passed
@nirs nirs deleted the fix-read-example branch November 4, 2024 14:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

read example use of buffer-size is ineffective and bad example
2 participants