Use streaming file reads when comparing for writes #4093

jridgewell · 2023-03-07T04:46:22Z

Description

Following #3526, this reimplements `write(content)`'s file comparison to stream the contents of the file. Not every file we write is massive, but some of the ones we generate are 100 KB+. I hope this reduces memory pressure a bit by reading in consistent 8 KB blocks, instead of allocating a buffer for the full file contents just to check whether we should write.
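To make the idea concrete, here is a minimal sketch of a block-wise comparison. It is not the code from this PR: it uses synchronous `std::io::Read` rather than tokio, and `BLOCK_SIZE`, `read_block`, and `streams_equal` are hypothetical names chosen for illustration. The point is only that memory use stays at two fixed 8 KiB blocks regardless of file size.

```rust
use std::io::{self, Read};

/// Block size for the comparison (8 KiB, the size mentioned in the description).
const BLOCK_SIZE: usize = 8 * 1024;

/// Fills `buf` as far as possible and returns the number of bytes read.
/// A count shorter than `buf.len()` means the reader reached end-of-file.
fn read_block<R: Read>(reader: &mut R, buf: &mut [u8]) -> io::Result<usize> {
    let mut filled = 0;
    while filled < buf.len() {
        match reader.read(&mut buf[filled..]) {
            Ok(0) => break, // end-of-file
            Ok(n) => filled += n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(filled)
}

/// Hypothetical helper: returns `Ok(true)` when both readers yield identical bytes.
pub fn streams_equal<A: Read, B: Read>(mut a: A, mut b: B) -> io::Result<bool> {
    let mut buf_a = [0u8; BLOCK_SIZE];
    let mut buf_b = [0u8; BLOCK_SIZE];
    loop {
        let n_a = read_block(&mut a, &mut buf_a)?;
        let n_b = read_block(&mut b, &mut buf_b)?;
        // A length mismatch or any differing byte means the contents differ.
        if n_a != n_b || buf_a[..n_a] != buf_b[..n_b] {
            return Ok(false);
        }
        // A short (or empty) block means both readers hit end-of-file together.
        if n_a < BLOCK_SIZE {
            return Ok(true);
        }
    }
}
```

A caller would skip the write when `streams_equal` returns `Ok(true)` and fall through to writing otherwise.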

Testing Instructions

Wait for benchmarks.

vercel · 2023-03-07T04:46:33Z

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name	Status	Preview	Comments	Updated
examples-tailwind-web	🔄 Building (Inspect)			Mar 8, 2023 at 7:28PM (UTC)

9 Ignored Deployments

Name	Status	Preview	Updated
examples-basic-web	⬜️ Ignored (Inspect)		Mar 8, 2023 at 7:28PM (UTC)
examples-cra-web	⬜️ Ignored (Inspect)		Mar 8, 2023 at 7:28PM (UTC)
examples-designsystem-docs	⬜️ Ignored (Inspect)		Mar 8, 2023 at 7:28PM (UTC)
examples-kitchensink-blog	⬜️ Ignored (Inspect)		Mar 8, 2023 at 7:28PM (UTC)
examples-native-web	⬜️ Ignored (Inspect)		Mar 8, 2023 at 7:28PM (UTC)
examples-nonmonorepo	⬜️ Ignored (Inspect)		Mar 8, 2023 at 7:28PM (UTC)
examples-svelte-web	⬜️ Ignored (Inspect)		Mar 8, 2023 at 7:28PM (UTC)
examples-vite-web	⬜️ Ignored (Inspect)		Mar 8, 2023 at 7:28PM (UTC)
turbo-site	⬜️ Ignored (Inspect)	Visit Preview	Mar 8, 2023 at 7:28PM (UTC)

github-actions · 2023-03-07T05:12:15Z

🟢 CI successful 🟢

Thanks

github-actions · 2023-03-07T06:46:53Z

Benchmark for `6fd5d29`

Test	Base	PR	%	Significant %
bench_hmr_to_commit/Turbopack CSR/1000 modules	8186.63µs ± 93.62µs	8652.43µs ± 40.62µs	+5.69%	+2.36%
bench_hmr_to_commit/Turbopack RCC/1000 modules	12.50ms ± 0.11ms	11.87ms ± 0.12ms	-5.01%	-1.41%
bench_hmr_to_commit/Turbopack RSC/1000 modules	493.38ms ± 2.28ms	500.39ms ± 1.16ms	+1.42%	+0.03%
bench_hmr_to_eval/Turbopack RCC/1000 modules	11.70ms ± 0.12ms	10.59ms ± 0.16ms	-9.46%	-4.75%

Full benchmark

Test	Base	PR	%	Significant %
bench_hmr_to_commit/Turbopack CSR/1000 modules	8186.63µs ± 93.62µs	8652.43µs ± 40.62µs	+5.69%	+2.36%
bench_hmr_to_commit/Turbopack RCC/1000 modules	12.50ms ± 0.11ms	11.87ms ± 0.12ms	-5.01%	-1.41%
bench_hmr_to_commit/Turbopack RSC/1000 modules	493.38ms ± 2.28ms	500.39ms ± 1.16ms	+1.42%	+0.03%
bench_hmr_to_commit/Turbopack SSR/1000 modules	8645.52µs ± 47.66µs	8760.52µs ± 57.49µs	+1.33%
bench_hmr_to_eval/Turbopack CSR/1000 modules	7117.56µs ± 33.82µs	7120.10µs ± 65.72µs	+0.04%
bench_hmr_to_eval/Turbopack RCC/1000 modules	11.70ms ± 0.12ms	10.59ms ± 0.16ms	-9.46%	-4.75%
bench_hmr_to_eval/Turbopack SSR/1000 modules	7508.75µs ± 123.26µs	7441.63µs ± 64.98µs	-0.89%
bench_hydration/Turbopack RCC/1000 modules	3526.20ms ± 10.78ms	3502.50ms ± 11.18ms	-0.67%
bench_hydration/Turbopack RSC/1000 modules	3146.45ms ± 7.78ms	3179.91ms ± 13.47ms	+1.06%
bench_hydration/Turbopack SSR/1000 modules	3231.98ms ± 6.35ms	3229.11ms ± 5.80ms	-0.09%
bench_startup/Turbopack CSR/1000 modules	2507.96ms ± 2.26ms	2501.29ms ± 6.08ms	-0.27%
bench_startup/Turbopack RCC/1000 modules	2083.65ms ± 2.46ms	2091.32ms ± 3.17ms	+0.37%
bench_startup/Turbopack RSC/1000 modules	2015.56ms ± 4.47ms	2025.83ms ± 4.09ms	+0.51%
bench_startup/Turbopack SSR/1000 modules	1988.19ms ± 3.21ms	1995.40ms ± 3.34ms	+0.36%

sokra · 2023-03-07T17:58:56Z

crates/turbo-tasks-fs/src/util.rs

+/// continue to be returned until the full buffer has been consumed. This allows
+/// us to skip the overhead of, eg, repeated sys calls to read from disk as we
+/// process a smaller number of bytes.
+pub struct AsyncBufReader<'a, T: AsyncRead + Unpin + Sized> {


How does this differ from https://docs.rs/tokio/latest/tokio/io/struct.BufReader.html?

😳 I searched the docs for AsyncBufReader and not BufReader, and implemented this from scratch because I didn't find anything.
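For readers unfamiliar with the buffered-reader pattern discussed here, the sketch below shows it with the standard library's synchronous `std::io::BufReader`, used as a stand-in for the tokio type linked above. `CountingReader` and `consume_in_steps` are hypothetical names introduced only for this illustration. The wrapper counts how often the underlying reader is actually asked for data, so the effect of buffering on read calls is visible.

```rust
use std::io::{self, BufRead, BufReader, Read};

/// Hypothetical wrapper that counts calls made to the underlying reader.
struct CountingReader<R> {
    inner: R,
    reads: usize,
}

impl<R: Read> Read for CountingReader<R> {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        self.reads += 1;
        self.inner.read(buf)
    }
}

/// Consumes `data` in pieces of at most `step` bytes through a `BufReader`
/// with the given capacity. Returns (bytes consumed, underlying read calls).
pub fn consume_in_steps(data: &[u8], capacity: usize, step: usize) -> io::Result<(usize, usize)> {
    assert!(step > 0, "step must be non-zero or the loop never advances");
    let counting = CountingReader { inner: data, reads: 0 };
    let mut reader = BufReader::with_capacity(capacity, counting);
    let mut total = 0;
    loop {
        // `fill_buf` only touches the underlying reader when the buffer is empty.
        let available = reader.fill_buf()?;
        if available.is_empty() {
            break; // end-of-file
        }
        let n = available.len().min(step);
        // `consume` marks bytes as used; the rest stays buffered for the next call.
        reader.consume(n);
        total += n;
    }
    Ok((total, reader.get_ref().reads))
}
```

With 100 bytes of input, a 64-byte buffer, and 10-byte steps, the underlying reader is called three times: two calls that return data and one that reports end-of-file. The many small `consume` steps in between are served from the buffer.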

sokra · 2023-03-07T18:09:36Z

crates/turbo-tasks-fs/src/lib.rs

+        // So meta matches, and we have a file handle. Let's stream the contents to see
+        // if they match.
+        let mut new_contents = new_file.read();
+        let mut old_contents = AsyncBufReader::new(&mut old_file);


Performance-wise, it would probably be better to avoid the buffered reader here and read directly from old_file as much as possible. That probably requires slightly more complicated compare logic to handle consumed positions.

I think you're assuming that reading from a file returns a buffer, and that we then copy from that buffer into our BufReader buffer. That's not the case: the read() API requires you to maintain a buffer, which gets passed to the system to copy bytes into. Reading doesn't return a system buffer.

So we can either maintain our own buffer in this loop, or we can reuse the buffer created as part of BufReader.
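The first option, maintaining our own buffer in the loop, could look roughly like the sketch below. It is an illustration only, written against synchronous `std::io::Read` rather than the async file handle in this PR, and `matches_expected` is a hypothetical name. It shows both points from the discussion: `read()` fills a buffer the caller owns, and the compare logic has to track how much of the new content has already been matched.

```rust
use std::io::{self, Read};

/// Hypothetical helper: compares the bytes produced by `old` against the
/// in-memory `new` contents, using one caller-owned scratch buffer.
pub fn matches_expected<R: Read>(mut old: R, new: &[u8], scratch: &mut [u8]) -> io::Result<bool> {
    assert!(!scratch.is_empty(), "scratch buffer must not be empty");
    let mut pos = 0; // number of bytes of `new` matched so far
    loop {
        // `read` copies bytes into our buffer and reports how many it wrote.
        let n = match old.read(scratch) {
            Ok(n) => n,
            Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        };
        if n == 0 {
            // End-of-file: equal only if every byte of `new` was matched.
            return Ok(pos == new.len());
        }
        let end = pos + n;
        // Reject if `old` is longer than `new` or the current chunk differs.
        if end > new.len() || scratch[..n] != new[pos..end] {
            return Ok(false);
        }
        pos = end;
    }
}
```

Because a single `read` may return fewer bytes than the buffer holds, the position is advanced by the returned count rather than by the buffer length.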

github-actions · 2023-03-07T20:35:38Z

Benchmark for `fbf69aa`

Test	Base	PR	%	Significant %
bench_hmr_to_commit/Turbopack RCC/1000 modules	12.88ms ± 0.19ms	12.10ms ± 0.10ms	-6.07%	-1.54%
bench_hmr_to_commit/Turbopack RSC/1000 modules	503.39ms ± 0.51ms	513.28ms ± 2.20ms	+1.96%	+0.89%

Full benchmark

Test	Base	PR	%	Significant %
bench_hmr_to_commit/Turbopack CSR/1000 modules	9296.79µs ± 24.51µs	9259.43µs ± 38.28µs	-0.40%
bench_hmr_to_commit/Turbopack RCC/1000 modules	12.88ms ± 0.19ms	12.10ms ± 0.10ms	-6.07%	-1.54%
bench_hmr_to_commit/Turbopack RSC/1000 modules	503.39ms ± 0.51ms	513.28ms ± 2.20ms	+1.96%	+0.89%
bench_hmr_to_commit/Turbopack SSR/1000 modules	9527.05µs ± 36.13µs	9535.97µs ± 56.53µs	+0.09%
bench_hmr_to_eval/Turbopack CSR/1000 modules	8260.08µs ± 25.50µs	8277.96µs ± 41.94µs	+0.22%
bench_hmr_to_eval/Turbopack RCC/1000 modules	11.01ms ± 0.33ms	10.63ms ± 0.12ms	-3.41%
bench_hmr_to_eval/Turbopack SSR/1000 modules	8542.37µs ± 74.82µs	8543.46µs ± 43.35µs	+0.01%
bench_hydration/Turbopack RCC/1000 modules	3562.39ms ± 9.87ms	3549.87ms ± 7.62ms	-0.35%
bench_hydration/Turbopack RSC/1000 modules	3227.00ms ± 9.22ms	3202.42ms ± 9.70ms	-0.76%
bench_hydration/Turbopack SSR/1000 modules	3266.28ms ± 5.65ms	3269.62ms ± 3.74ms	+0.10%
bench_startup/Turbopack CSR/1000 modules	2581.65ms ± 6.64ms	2583.01ms ± 7.27ms	+0.05%
bench_startup/Turbopack RCC/1000 modules	2107.22ms ± 4.55ms	2095.79ms ± 2.87ms	-0.54%
bench_startup/Turbopack RSC/1000 modules	2037.40ms ± 6.16ms	2042.60ms ± 4.08ms	+0.26%
bench_startup/Turbopack SSR/1000 modules	2021.13ms ± 3.89ms	2017.91ms ± 3.34ms	-0.16%

github-actions · 2023-03-08T00:33:40Z

Benchmark for `b22cce7`

Test	Base	PR	%	Significant %
bench_hmr_to_commit/Turbopack RSC/1000 modules	512.16ms ± 2.89ms	541.35ms ± 4.22ms	+5.70%	+2.89%

Full benchmark

Test	Base	PR	%	Significant %
bench_hmr_to_commit/Turbopack CSR/1000 modules	9124.89µs ± 128.37µs	9533.24µs ± 101.62µs	+4.48%
bench_hmr_to_commit/Turbopack RCC/1000 modules	13.53ms ± 0.31ms	12.72ms ± 0.26ms	-5.99%
bench_hmr_to_commit/Turbopack RSC/1000 modules	512.16ms ± 2.89ms	541.35ms ± 4.22ms	+5.70%	+2.89%
bench_hmr_to_commit/Turbopack SSR/1000 modules	10.14ms ± 0.07ms	9850.43µs ± 125.37µs	-2.85%
bench_hmr_to_eval/Turbopack CSR/1000 modules	7701.54µs ± 48.92µs	7690.37µs ± 88.15µs	-0.14%
bench_hmr_to_eval/Turbopack RCC/1000 modules	11.65ms ± 0.23ms	11.10ms ± 0.15ms	-4.72%
bench_hmr_to_eval/Turbopack SSR/1000 modules	8500.99µs ± 83.62µs	8435.56µs ± 40.88µs	-0.77%
bench_hydration/Turbopack RCC/1000 modules	3597.18ms ± 6.03ms	3610.43ms ± 12.12ms	+0.37%
bench_hydration/Turbopack RSC/1000 modules	3229.12ms ± 19.16ms	3212.96ms ± 11.73ms	-0.50%
bench_hydration/Turbopack SSR/1000 modules	3313.98ms ± 8.86ms	3338.35ms ± 8.67ms	+0.74%
bench_startup/Turbopack CSR/1000 modules	2660.41ms ± 10.24ms	2687.79ms ± 8.11ms	+1.03%
bench_startup/Turbopack RCC/1000 modules	2114.01ms ± 4.95ms	2099.76ms ± 7.13ms	-0.67%
bench_startup/Turbopack RSC/1000 modules	2018.22ms ± 6.93ms	2033.73ms ± 9.83ms	+0.77%
bench_startup/Turbopack SSR/1000 modules	2050.18ms ± 7.72ms	2064.88ms ± 8.24ms	+0.72%

github-actions · 2023-03-08T06:41:47Z

Benchmark for `8228312`

Test	Base	PR	%	Significant %
bench_hmr_to_commit/Turbopack RSC/1000 modules	502.77ms ± 3.12ms	522.66ms ± 1.94ms	+3.96%	+1.92%
bench_hmr_to_eval/Turbopack CSR/1000 modules	8590.84µs ± 28.35µs	8789.61µs ± 61.16µs	+2.31%	+0.23%
bench_startup/Turbopack CSR/1000 modules	2663.10ms ± 5.33ms	2625.76ms ± 7.09ms	-1.40%	-0.47%

Full benchmark

Test	Base	PR	%	Significant %
bench_hmr_to_commit/Turbopack CSR/1000 modules	9464.29µs ± 49.70µs	9454.66µs ± 93.89µs	-0.10%
bench_hmr_to_commit/Turbopack RCC/1000 modules	13.08ms ± 0.17ms	12.42ms ± 0.20ms	-5.08%
bench_hmr_to_commit/Turbopack RSC/1000 modules	502.77ms ± 3.12ms	522.66ms ± 1.94ms	+3.96%	+1.92%
bench_hmr_to_commit/Turbopack SSR/1000 modules	9604.51µs ± 67.51µs	9634.72µs ± 47.25µs	+0.31%
bench_hmr_to_eval/Turbopack CSR/1000 modules	8590.84µs ± 28.35µs	8789.61µs ± 61.16µs	+2.31%	+0.23%
bench_hmr_to_eval/Turbopack RCC/1000 modules	10.46ms ± 0.06ms	10.68ms ± 0.19ms	+2.11%
bench_hmr_to_eval/Turbopack SSR/1000 modules	8587.92µs ± 49.15µs	8691.27µs ± 63.80µs	+1.20%
bench_hydration/Turbopack RCC/1000 modules	3568.56ms ± 14.76ms	3602.62ms ± 10.89ms	+0.95%
bench_hydration/Turbopack RSC/1000 modules	3228.27ms ± 8.88ms	3252.92ms ± 11.43ms	+0.76%
bench_hydration/Turbopack SSR/1000 modules	3314.65ms ± 7.18ms	3318.55ms ± 10.10ms	+0.12%
bench_startup/Turbopack CSR/1000 modules	2663.10ms ± 5.33ms	2625.76ms ± 7.09ms	-1.40%	-0.47%
bench_startup/Turbopack RCC/1000 modules	2143.52ms ± 8.54ms	2146.89ms ± 3.22ms	+0.16%
bench_startup/Turbopack RSC/1000 modules	2088.71ms ± 6.60ms	2104.28ms ± 11.16ms	+0.75%
bench_startup/Turbopack SSR/1000 modules	2052.22ms ± 4.75ms	2059.88ms ± 6.20ms	+0.37%

crates/turbo-tasks-fs/src/lib.rs

Co-authored-by: Alex Kirszenberg <[email protected]>

### Description

Following vercel#3526, this reimplements `write(content)`'s file comparison to stream the contents of the file. Not every file we write is massive, but some of the ones we generate are 100kb+. I hope this can reduce memory pressure a bit by using a consistent 8kb block size, instead of allocating a buffer for the full file contents just to check if we should write.

### Testing Instructions

Wait for benchmarks.

---------

Co-authored-by: Alex Kirszenberg <[email protected]>

github-actions · 2023-03-08T20:46:18Z

Benchmark for `604a028`

Test	Base	PR	%	Significant %
bench_hmr_to_commit/Turbopack RSC/1000 modules	498.68ms ± 1.29ms	512.42ms ± 1.71ms	+2.76%	+1.54%

Full benchmark

Test	Base	PR	%	Significant %
bench_hmr_to_commit/Turbopack CSR/1000 modules	9378.39µs ± 33.98µs	9346.23µs ± 63.03µs	-0.34%
bench_hmr_to_commit/Turbopack RCC/1000 modules	12.27ms ± 0.20ms	12.06ms ± 0.20ms	-1.72%
bench_hmr_to_commit/Turbopack RSC/1000 modules	498.68ms ± 1.29ms	512.42ms ± 1.71ms	+2.76%	+1.54%
bench_hmr_to_commit/Turbopack SSR/1000 modules	9655.89µs ± 37.80µs	9561.45µs ± 49.20µs	-0.98%
bench_hmr_to_eval/Turbopack CSR/1000 modules	8312.07µs ± 48.87µs	8376.13µs ± 39.55µs	+0.77%
bench_hmr_to_eval/Turbopack RCC/1000 modules	10.60ms ± 0.18ms	10.55ms ± 0.20ms	-0.48%
bench_hmr_to_eval/Turbopack SSR/1000 modules	8558.96µs ± 74.90µs	8360.95µs ± 53.81µs	-2.31%
bench_hydration/Turbopack RCC/1000 modules	3448.85ms ± 7.29ms	3465.44ms ± 8.29ms	+0.48%
bench_hydration/Turbopack RSC/1000 modules	3166.40ms ± 13.62ms	3194.66ms ± 12.43ms	+0.89%
bench_hydration/Turbopack SSR/1000 modules	3345.27ms ± 10.80ms	3319.50ms ± 10.36ms	-0.77%
bench_startup/Turbopack CSR/1000 modules	2609.57ms ± 5.18ms	2592.59ms ± 9.83ms	-0.65%
bench_startup/Turbopack RCC/1000 modules	1997.44ms ± 3.18ms	1993.21ms ± 2.90ms	-0.21%
bench_startup/Turbopack RSC/1000 modules	1980.76ms ± 5.17ms	1974.00ms ± 4.80ms	-0.34%
bench_startup/Turbopack SSR/1000 modules	2041.95ms ± 3.15ms	2040.41ms ± 2.88ms	-0.08%


jridgewell requested a review from sokra March 7, 2023 04:46

jridgewell requested a review from a team as a code owner March 7, 2023 04:46

turbo-orchestrator bot added the team: turbopack label Mar 7, 2023

sokra reviewed Mar 7, 2023

jridgewell requested a review from sokra March 7, 2023 19:34

jridgewell added 5 commits March 7, 2023 17:10

Implement streaming disk file comparison

affbe3c

Extract helper

fb0161e

Fix clippy

b5ea54c

Use tokio BufReader impl

b612d81

Clippy

36b5d76

jridgewell force-pushed the jrl-fs-streaming-compare branch from 43e62c7 to 36b5d76 March 7, 2023 23:12

vercel bot deployed to Preview – turbo-site March 7, 2023 23:13

jridgewell force-pushed the jrl-fs-streaming-compare branch from fbc152f to 36b5d76 March 8, 2023 05:25

alexkirsz approved these changes Mar 8, 2023

crates/turbo-tasks-fs/src/lib.rs (outdated, resolved)

remove new_uninit

8bb0a9d

Co-authored-by: Alex Kirszenberg <[email protected]>

jridgewell added the pr: automerge Kodiak will merge these automatically after checks pass label Mar 8, 2023

kodiakhq bot merged commit d2d0d3e into main Mar 8, 2023

kodiakhq bot deleted the jrl-fs-streaming-compare branch March 8, 2023 19:50

jridgewell mentioned this pull request Mar 9, 2023

Update Turbopack to 230309.2 vercel/next.js#46971

Merged

ijjk pushed a commit to vercel/next.js that referenced this pull request Mar 9, 2023

Update Turbopack to 230309.2 (#46971)

8e14b67

# New Features
- vercel/turborepo#3975

# Bug Fixes
- vercel/turborepo#4129
- vercel/turborepo#4134
- vercel/turborepo#4062

# Performance
- vercel/turborepo#4093
