Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Set filesystem constructor parameter for FilePrefetchBuffer inside Pr…
…efetchTail (#13157) Summary: After #13118 was merged, I did some investigation to see whether the file system buffer reuse code was actually being used. The good news is that I was able to see from the CPU profiling results that my code is getting invoked through the warm storage stress tests. The bad news is that most of the time, the optimization is not being used, so we end up going through the regular old `RandomAccessFileReader::Read` path. Here is the entire function call chain up to `FilePrefetchBuffer::Read` 1. rocksdb::DB::MultiGet 2. rocksdb::DBImpl::MultiGet 3. rocksdb::DBImpl::MultiGetCommon 4. rocksdb::DBImpl::MultiGetImpl 5. rocksdb::Version::MultiGet 6. rocksdb::Version::MultiGetFromSST 7. rocksdb::TableCache::MultiGet 8. rocksdb::TableCache::FindTable 9. rocksdb::TableCache::GetTableReader 10. rocksdb::BlockBasedTableFactory::NewTableReader 11. rocksdb::BlockBasedTable::Open 12. rocksdb::BlockBasedTable::PrefetchTail 13. rocksdb::FilePrefetchBuffer::Prefetch 14. rocksdb::FilePrefetchBuffer::Read At this point, we split into `rocksdb::RandomAccessFileReader::Read` and `rocksdb::FilePrefetchBuffer::FSBufferDirectRead`. `FSBufferDirectRead` gets called <3% of the time. I think the root cause is that the `FileSystem* fs` parameter is not getting passed into the `FilePrefetchBuffer` constructor. When `fs` is `nullptr`, `UseFSBuffer()` will always return `false` and we do not end up calling `FSBufferDirectRead`. Luckily, it does not seem like there are too many places I need to change. `BlockBasedTable` resets its `prefetch_buffer` in 3 separate places. When it disables the prefetch buffer (2/3 of the instances), we don't care about whether the `fs` parameter is there. This PR is addressing the third instance, where it is not trying to disable the buffer. Note that there is another method, `PrefetchBufferCollection::GetOrCreatePrefetchBuffer` that creates new `FilePrefetchBuffer`s without the `fs` parameter. This method gets called by `compaction_iterator` and `merge_helper`. I think we can address this in a subsequent PR: 1. Each of these changes effectively "unlocks" the buffer reuse feature. Separating the changes would be helpful when I look at the profiling results again, since I can isolate what impact this PR had on the percentage of time that `rocksdb::FilePrefetchBuffer::FSBufferDirectRead` was invoked. 2. I still need to look into what exactly I would need to changes I need to make to `PrefetchBufferCollection` 3. This code seems to be for blob prefetching in particular, and I don't think it has the biggest ROI anyways. ```cpp const Status s = blob_fetcher_->FetchBlob( user_key(), blob_index, prefetch_buffer, &blob_value_, &bytes_read); ``` 4. I am not sure if the current benchmark I am using for warm storage exercises this blob prefetching code, so I may need to find another way to assess the performance impact. Pull Request resolved: #13157 Test Plan: The existing unit test coverage guards against obvious bugs. I ran another set of performance tests to confirm there were no regressions in CPU utilization. Reviewed By: anand1976 Differential Revision: D66464704 Pulled By: archang19 fbshipit-source-id: 260145cfcc05ac46cf2dd77a53a85e8808031dea
- Loading branch information