Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new test case for possible bug in BufReader #58913

Merged
merged 2 commits into from
Mar 21, 2019
Merged

Add new test case for possible bug in BufReader #58913

merged 2 commits into from
Mar 21, 2019

Conversation

andre-vm
Copy link
Contributor

@andre-vm andre-vm commented Mar 4, 2019

When reading a large chunk from a BufReader, if all the bytes from the buffer have been already consumed, the internal buffer is bypassed entirely. However, it is not invalidated, and it's possible to access its contents using the seek_relative method, because it tries to reuse the existing buffer.

@rust-highfive
Copy link
Collaborator

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @joshtriplett (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Mar 4, 2019
@rust-highfive

This comment has been minimized.

@rust-highfive
Copy link
Collaborator

The job x86_64-gnu-llvm-6.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
travis_time:end:155f79e0:start=1551708531382670248,finish=1551708532305729200,duration=923058952
$ git checkout -qf FETCH_HEAD
travis_fold:end:git.checkout

Encrypted environment variables have been removed for security reasons.
See https://docs.travis-ci.com/user/pull-requests/#pull-requests-and-security-restrictions
$ export SCCACHE_BUCKET=rust-lang-ci-sccache2
$ export SCCACHE_REGION=us-west-1
Setting environment variables from .travis.yml
$ export IMAGE=x86_64-gnu-llvm-6.0
---
travis_time:start:test_debuginfo
Check compiletest suite=debuginfo mode=debuginfo-both (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
[01:21:28] 
[01:21:28] running 119 tests
[01:21:55] .iiiii...i.....i..i...i..i.i..i.ii...i.....i..i....i..........iiii..........i...ii...i.......ii.i.i. 100/119
[01:21:59] i......iii.i.....ii
[01:21:59] 
[01:21:59]  finished in 31.395
[01:21:59] travis_fold:end:test_debuginfo

---
[01:38:58] travis_fold:start:test_stage1-std
travis_time:start:test_stage1-std
Testing std stage1 (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
[01:38:59]    Compiling std v0.0.0 (/checkout/src/libstd)
[01:39:01] error[E0433]: failed to resolve: use of undeclared type or module `RevBufReader`
[01:39:01]      |
[01:39:01]      |
[01:39:01] 1075 |         let mut reader = RevBufReader::with_capacity(3, io::Cursor::new(inner));
[01:39:01]      |                          ^^^^^^^^^^^^ use of undeclared type or module `RevBufReader`
[01:39:13] error: aborting due to previous error
[01:39:13] 
[01:39:13] For more information about this error, try `rustc --explain E0433`.
[01:39:13] error: Could not compile `std`.
[01:39:13] error: Could not compile `std`.
[01:39:13] 
[01:39:13] To learn more, run the command again with --verbose.
[01:39:13] 
[01:39:13] 
[01:39:13] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "test" "--target" "x86_64-unknown-linux-gnu" "-j" "4" "--release" "--locked" "--color" "always" "--features" "panic-unwind backtrace" "--manifest-path" "/checkout/src/libstd/Cargo.toml" "-p" "std" "--" "--quiet"
[01:39:13] 
[01:39:13] 
[01:39:13] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test
[01:39:13] Build completed unsuccessfully in 0:29:57
[01:39:13] Build completed unsuccessfully in 0:29:57
[01:39:13] Makefile:48: recipe for target 'check' failed
[01:39:13] make: *** [check] Error 1
The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 2.
travis_time:start:0ced8d38
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)
Mon Mar  4 15:48:16 UTC 2019
Mon Mar  4 15:48:16 UTC 2019
echo $CORE | sed 's|obj/cores/core\.[0-9]*\.!checkout!\(.*\)|\1|;y|!|/|'); if [ -f "$EXE" ]; then printf travis_fold":start:crashlog\n\033[31;1m%s\033[0m\n" "$CORE"; gdb --batch -q -c "$CORE" "$EXE" -iex 'set auto-load off' -iex 'dir src/' -iex 'set sysroot .' -ex bt -ex q; echo travis_fold":"end:crashlog; fi; done || true
travis_fold:end:after_failure.4
travis_fold:start:after_failure.5
travis_time:start:14fcb930
travis_time:start:14fcb930
$ cat ./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers || true
cat: ./obj/build/x86_64-unknown-linux-gnu/native/asan/build/lib/asan/clang_rt.asan-dynamic-i386.vers: No such file or directory
travis_fold:end:after_failure.5
travis_fold:start:after_failure.6
travis_time:start:0d08b418
$ dmesg | grep -i kill

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@rust-highfive
Copy link
Collaborator

The job x86_64-gnu-llvm-6.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
travis_time:end:0c9b8eaf:start=1551715561873205141,finish=1551715563775151375,duration=1901946234
$ git checkout -qf FETCH_HEAD
travis_fold:end:git.checkout

Encrypted environment variables have been removed for security reasons.
See https://docs.travis-ci.com/user/pull-requests/#pull-requests-and-security-restrictions
$ export SCCACHE_BUCKET=rust-lang-ci-sccache2
$ export SCCACHE_REGION=us-west-1
Setting environment variables from .travis.yml
$ export IMAGE=x86_64-gnu-llvm-6.0
---
travis_time:start:test_debuginfo
Check compiletest suite=debuginfo mode=debuginfo-both (x86_64-unknown-linux-gnu -> x86_64-unknown-linux-gnu)
[01:20:17] 
[01:20:17] running 119 tests
[01:20:46] .iiiii...i.....i..i...i..i.i..i.ii...i.....i..i....i..........iiii..........i...ii...i.......ii.i.i. 100/119
[01:20:51] i......iii.i.....ii
[01:20:51] 
[01:20:51]  finished in 33.384
[01:20:51] travis_fold:end:test_debuginfo

---
[01:39:01] .................................................................................................... 100/774
[01:39:01] .................................................................................................... 200/774
[01:39:01] thread '<unnamed>' panicked at 'explicit panic', src/libstd/io/buffered.rs:1339:17
[01:39:01] thread '<unnamed>' panicked at 'explicit panic', src/libstd/io/stdio.rs:805:13
[01:39:01] ......................................................F............................................. 300/774
[01:39:02] .................................................................................................... 500/774
[01:39:02] thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: RecvError', src/libcore/result.rs:997:5
[01:39:02] thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: RecvError', src/libcore/result.rs:997:5
[01:39:02] thread '<unnamed>' panicked at 'called `Result::unwrap()` on an `Err` value: "SendError(..)"', src/libcore/result.rs:997:5
---
[01:39:13] failures:
[01:39:13] 
[01:39:13] ---- io::buffered::tests::test_buffered_reader_seek_relative_read stdout ----
[01:39:13] thread '<unnamed>' panicked at 'assertion failed: `(left == right)`
[01:39:13]   left: `[6, 7]`,
[01:39:13]  right: `[3, 4]`', src/libstd/io/buffered.rs:1087:9
[01:39:13] 
[01:39:13] failures:
[01:39:13]     io::buffered::tests::test_buffered_reader_seek_relative_read
[01:39:13] 
[01:39:13] 
[01:39:13] test result: FAILED. 773 passed; 1 failed; 0 ignored; 0 measured; 0 filtered out
[01:39:13] 
[01:39:13] error: test failed, to rerun pass '--lib'
[01:39:13] 
[01:39:13] 
[01:39:13] command did not execute successfully: "/checkout/obj/build/x86_64-unknown-linux-gnu/stage0/bin/cargo" "test" "--target" "x86_64-unknown-linux-gnu" "-j" "4" "--release" "--locked" "--color" "always" "--features" "panic-unwind backtrace" "--manifest-path" "/checkout/src/libstd/Cargo.toml" "-p" "std" "--" "--quiet"
[01:39:13] 
[01:39:13] 
[01:39:13] failed to run: /checkout/obj/build/bootstrap/debug/bootstrap test
[01:39:13] Build completed unsuccessfully in 0:31:25
[01:39:13] Build completed unsuccessfully in 0:31:25
[01:39:13] Makefile:48: recipe for target 'check' failed
[01:39:13] make: *** [check] Error 1
The command "stamp sh -x -c "$RUN_SCRIPT"" exited with 2.
travis_time:start:0019d106
$ date && (curl -fs --head https://google.com | grep ^Date: | sed 's/Date: //g' || true)
Mon Mar  4 17:45:28 UTC 2019
---
travis_time:end:0011346a:start=1551721530381457301,finish=1551721530386629441,duration=5172140
travis_fold:end:after_failure.3
travis_fold:start:after_failure.4
travis_time:start:036a4ca4
$ ln -s . checkout && for CORE in obj/cores/core.*; do EXE=$(e

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

Copy link
Member

@joshtriplett joshtriplett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the test case makes sense; if the buffer still exists, seeking to a location within it seems reasonable.

I don't think it makes sense to throw away the buffer until it gets overwritten. But once it does get overwritten, it needs invalidating.

Do you have a test case in which you can read data from the buffer that you shouldn't be able to? (Or am I misunderstanding what you're trying to demonstrate here?)

@andre-vm
Copy link
Contributor Author

andre-vm commented Mar 4, 2019

@joshtriplett Yes, I believe the test case that I presented already shows that it is possible to access data from the buffer when we shouldn't:

  1. [5, 6, 7] is read and stored in the buffer;
  2. [0, 1, 2, 3, 4] is read bypassing the buffer. The buffer should now be invalidated but it's not, the chunk [5, 6, 7] remains there. At this point, we're at the end of the stream.
  3. We call seek(-2). The buffer is 3 bytes long, so BufReader understands that we're still in its range, and that it doesn't need to fetch data from the inner reader.
  4. We read again, but instead of getting [3, 4] the test fails and we get [6, 7], which was incorrectly stored in the buffer.

@andre-vm
Copy link
Contributor Author

andre-vm commented Mar 4, 2019

By the way, sorry for the mess. I'd like to squash those mistaken commits and force push them to keep it organized, but I don't have good internet access right now. I didn't even clone the Rust repository, I edited the file and commited on GitHub itself.

@joshtriplett
Copy link
Member

joshtriplett commented Mar 5, 2019 via email

@andre-vm
Copy link
Contributor Author

andre-vm commented Mar 5, 2019

Yes, the test passes when we uncomment those lines. I'll do it as soon as possible.

There are two moments when a BufRead object needs to empty it's internal
buffer:

- In a seek call;
- In a read call when all data in the internal buffer had been already
  consumed and the output buffer has a greater or equal size than the
  internal buffer.

In both cases, the buffer was not being properly emptied, but only
marked as consumed (self.pos = self.cap). That should be no problem if
the inner reader is only Read, but if it is Seek as well, then it's
possible to access the data in the buffer by using the seek_relative
method. In order to prevent this from happening, both self.pos and
self.cap should be set to 0.

Two test cases were added to detect that failure:

- test_buffered_reader_invalidated_after_read
- test_buffered_reader_invalidated_after_seek

Both tests are very similar to each other. The inner reader contains the
following data: [5, 6, 7, 0, 1, 2, 3, 4]. The buffer capacity is 3
bytes.

- First, we call fill_buffer, which loads [5, 6, 7] into the internal
  buffer, and then consume those 3 bytes.
- Then we either read the 5 remaining bytes in a single read call or we
  move to the end of the stream by calling seek. In both cases the
  buffer should be emptied to prevent the previous data [5, 6, 7] from
  being read.
- We now call seek_relative(-2) and read two bytes, which should give us
  the last 2 bytes of the stream: [3, 4].

Before this commit, the the seek_relative method would consider that
we're still in the range of the internal buffer, so instead of fetching
data from the inner reader, it would return the two last bytes that were
incorrectly still in the buffer: [6, 7]. Therefore, the test would fail.

Now, when seek_relative is called the buffer is empty. So the expected
data [3, 4] is fetched from the inner reader and the test passes.
@andre-vm
Copy link
Contributor Author

andre-vm commented Mar 6, 2019

@joshtriplett I realized that this problem not only happened in the read method, but also in the seek method. I just fixed the problem on both methods (I hope I didn't forget anything this time) and added another test case to cover seek as well. I also wrote a rationale in the commit message.

@joshtriplett
Copy link
Member

I feel like I need an extra pair of eyes here. I can see how this makes sense, but it feels like it might be invalidating more than it needs to, and I also wonder why it looks like seek's control flow can pass through both the invalidates you added in one call.

@Diggsey, @sfackler, it looks like you've done a fair bit of work in this area, can you take a look, please?

@andre-vm
Copy link
Contributor Author

andre-vm commented Mar 6, 2019

[...] and I also wonder why it looks like seek's control flow can pass through both the invalidates you added in one call.

I had the same question when I first read that code. It seems to me that the invalidation inside the else is there because the seek just below it can fail and return without passing through the invalidation outside the else.

self.pos = self.cap; // empty the buffer
// Empty the buffer
self.cap = 0;
self.pos = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need to mark the buffer as empty here, when we do the same further down?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Never-mind just saw the conversation above!

@Diggsey
Copy link
Contributor

Diggsey commented Mar 6, 2019

What about read_vectored - I notice that doesn't even attempt to clear the buffer?

@Diggsey
Copy link
Contributor

Diggsey commented Mar 6, 2019

Also, it might be worth pulling this out into a method, discard_buffer() so that this mistake isn't repeated.

@andre-vm
Copy link
Contributor Author

andre-vm commented Mar 6, 2019

@Diggsey Good point, I didn't understand yet how read_vectored works, so I didn't touch it. And I like the discard_buffer() idea, I'll do it.

@Diggsey
Copy link
Contributor

Diggsey commented Mar 6, 2019

It's pretty much the same as read(), just using multiple user-provided buffers instead of one. It needs a call to discard_buffer() in the same place as for the read() call.

@andre-vm
Copy link
Contributor Author

andre-vm commented Mar 6, 2019

Does discard_buffer need any other annotation?

   /// Invalidates all data in the internal buffer.
    #[inline]
    fn discard_buffer(&mut self) {
        self.pos = 0;
        self.cap = 0;
    }

@rust-highfive
Copy link
Collaborator

The job x86_64-gnu-llvm-6.0 of your PR failed on Travis (raw log). Through arcane magic we have determined that the following fragments from the build log may contain information about the problem.

Click to expand the log.
travis_time:end:0bf9a846:start=1551900765085032461,finish=1551900766017839675,duration=932807214
$ git checkout -qf FETCH_HEAD
travis_fold:end:git.checkout

Encrypted environment variables have been removed for security reasons.
See https://docs.travis-ci.com/user/pull-requests/#pull-requests-and-security-restrictions
$ export SCCACHE_BUCKET=rust-lang-ci-sccache2
$ export SCCACHE_REGION=us-west-1
$ export GCP_CACHE_BUCKET=rust-lang-ci-cache
Setting environment variables from .travis.yml

I'm a bot! I can only do what humans tell me to, so if this was not helpful or you have suggestions for improvements, please ping or otherwise contact @TimNN. (Feature Requests)

@andre-vm
Copy link
Contributor Author

andre-vm commented Mar 7, 2019

I had tried to improve the readability of the seek method, but I ended up changing its behavior as well, so I just rolled back.

@Dylan-DPC-zz
Copy link

ping from triage @joshtriplett any updates on this?

@joshtriplett
Copy link
Member

@bors r+

@bors
Copy link
Contributor

bors commented Mar 18, 2019

📌 Commit c36d91c has been approved by joshtriplett

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 18, 2019
Centril added a commit to Centril/rust that referenced this pull request Mar 19, 2019
…iplett

Add new test case for possible bug in BufReader

When reading a large chunk from a BufReader, if all the bytes from the buffer have been already consumed, the internal buffer is bypassed entirely. However, it is not invalidated, and it's possible to access its contents using the `seek_relative` method, because it tries to reuse the existing buffer.
Centril added a commit to Centril/rust that referenced this pull request Mar 19, 2019
…iplett

Add new test case for possible bug in BufReader

When reading a large chunk from a BufReader, if all the bytes from the buffer have been already consumed, the internal buffer is bypassed entirely. However, it is not invalidated, and it's possible to access its contents using the `seek_relative` method, because it tries to reuse the existing buffer.
bors added a commit that referenced this pull request Mar 21, 2019
Add new test case for possible bug in BufReader

When reading a large chunk from a BufReader, if all the bytes from the buffer have been already consumed, the internal buffer is bypassed entirely. However, it is not invalidated, and it's possible to access its contents using the `seek_relative` method, because it tries to reuse the existing buffer.
@bors
Copy link
Contributor

bors commented Mar 21, 2019

⌛ Testing commit c36d91c with merge 15a5dfa...

@bors
Copy link
Contributor

bors commented Mar 21, 2019

☀️ Test successful - checks-travis, status-appveyor
Approved by: joshtriplett
Pushing 15a5dfa to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Mar 21, 2019
@bors bors merged commit c36d91c into rust-lang:master Mar 21, 2019
@rust-highfive
Copy link
Collaborator

📣 Toolstate changed by #58913!

Tested on commit 15a5dfa.
Direct link to PR: #58913

💔 rls on windows: test-pass → test-fail (cc @nrc @Xanewok, @rust-lang/infra).

rust-highfive added a commit to rust-lang-nursery/rust-toolstate that referenced this pull request Mar 21, 2019
Tested on commit rust-lang/rust@15a5dfa.
Direct link to PR: <rust-lang/rust#58913>

💔 rls on windows: test-pass → test-fail (cc @nrc @Xanewok, @rust-lang/infra).
@andre-vm andre-vm deleted the patch_buf_reader branch March 21, 2019 12:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants