This repository has been archived by the owner on Jan 22, 2025. It is now read-only.

Tune RocksDB to avoid unbounded cache growth #15266

Closed
wants to merge 1 commit into from


ryoqun
Contributor

@ryoqun ryoqun commented Feb 11, 2021

Problem

Yet another slow memory leak has been spotted; this time it is caused by our usage of RocksDB (or rather, our lack of proper configuration).

Unless we set max_open_files (the default is -1, i.e. infinite), RocksDB keeps the cache associated with every opened file in memory. Setting aside the file descriptors themselves, this burdens long-running validator operation. Note that we generally keep creating new SST files.
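To make the mechanism concrete, here is a toy model in Rust. It is not RocksDB's actual table cache: `TableCache`, `simulate` and the 256 KiB per-file figure are made up for illustration, and it ignores that compaction also deletes files. It only shows that pinned memory follows the number of files ever opened when there is no cap, and stays bounded when there is one. (In the Rust `rocksdb` crate the knob itself is `Options::set_max_open_files`.)

```rust
use std::collections::VecDeque;

/// Toy stand-in for a table cache: each open SST file pins some bytes
/// (think index/filter blocks) until its reader is evicted.
struct TableCache {
    /// `None` models max_open_files = -1 (never evict).
    capacity: Option<usize>,
    /// Open readers in least-recently-used order: (file number, pinned bytes).
    open: VecDeque<(u64, usize)>,
}

impl TableCache {
    fn new(max_open_files: i32) -> Self {
        let capacity = if max_open_files < 0 {
            None
        } else {
            Some(max_open_files as usize)
        };
        TableCache { capacity, open: VecDeque::new() }
    }

    /// Open (or touch) a file, evicting least recently used readers over the cap.
    fn open_file(&mut self, file_no: u64, pinned_bytes: usize) {
        if let Some(pos) = self.open.iter().position(|(n, _)| *n == file_no) {
            // Already open: move it to the most-recently-used end.
            let entry = self.open.remove(pos).unwrap();
            self.open.push_back(entry);
            return;
        }
        self.open.push_back((file_no, pinned_bytes));
        if let Some(cap) = self.capacity {
            while self.open.len() > cap {
                self.open.pop_front();
            }
        }
    }

    /// Total bytes currently pinned by open readers.
    fn pinned_bytes(&self) -> usize {
        self.open.iter().map(|(_, b)| *b).sum()
    }
}

/// Open `files_created` distinct files once each and report pinned bytes.
fn simulate(max_open_files: i32, files_created: u64, bytes_per_file: usize) -> usize {
    let mut cache = TableCache::new(max_open_files);
    for file_no in 0..files_created {
        cache.open_file(file_no, bytes_per_file);
    }
    cache.pinned_bytes()
}

fn main() {
    let per_file = 256 * 1024; // hypothetical amount of pinned data per file
    println!("unbounded:     {} bytes", simulate(-1, 10_000, per_file));
    println!("capped at 512: {} bytes", simulate(512, 10_000, per_file));
}
```

In this model the uncapped run grows linearly with the number of files, while the capped run never pins more than 512 files' worth of data.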

Comparison of validators before and after the change, run for roughly equal time with the same options against mainnet-beta under heaptrack:

BEFORE

$ tail heaptrack.solana-validator2.190486-5.gz.report
total runtime: 133453.78s.

heaptrack.solana-validator2.190486-5.gz.report:1.01GB peak memory consumed over 1558 calls from
heaptrack.solana-validator2.190486-5.gz.report:570.43MB peak memory consumed over 5950 calls from

heaptrack.solana-validator2.190486-5.gz.report:1.07GB leaked over 1558 calls from
heaptrack.solana-validator2.190486-5.gz.report:570.43MB leaked over 5950 calls from

(Note: peak is about the same as the leaked amount, so nothing is being freed at all. lol)

AFTER

$ tail heaptrack.solana-validator3-all-v2.1-leak-fixes.813196-7.zst.report
total runtime: 133720.55s.

heaptrack.solana-validator3-all-v2.1-leak-fixes.813196-7.zst.report:570.43MB peak memory consumed over 9115 calls from
heaptrack.solana-validator3-all-v2.1-leak-fixes.813196-7.zst.report:327.85MB peak memory consumed over 2105 calls from

heaptrack.solana-validator3-all-v2.1-leak-fixes.813196-7.zst.report:469.76MB leaked over 9115 calls from
heaptrack.solana-validator3-all-v2.1-leak-fixes.813196-7.zst.report:333.79MB leaked over 2105 calls from

(Note: this report was taken when the rocksdb directory contained more than 512 files (512 being the new tuned value) and lsof -p showed fewer than 512 open file descriptors.)

After a very long run, even after the daily manually triggered compaction had completed:
$ tail heaptrack.solana-validator2.190486-9.gz.report
total runtime: 342044.07s.

heaptrack.solana-validator2.190486-9.gz.report:2.68GB peak memory consumed over 6003 calls from
heaptrack.solana-validator2.190486-9.gz.report:771.75MB peak memory consumed over 14875 calls from

heaptrack.solana-validator2.190486-9.gz.report:2.73GB leaked over 6003 calls from
heaptrack.solana-validator2.190486-9.gz.report:771.75MB leaked over 14875 calls from

(Note: not sure about the small difference between peak and leaked.)
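For a rough sense of the growth rate, the largest "leaked" entries of the two validator2 reports can be compared. This is only a sketch: it assumes both reports are snapshots of the same process (the file names share the 190486 id), that heaptrack prints decimal units (1 GB = 1000 MB), and the helper names are mine.

```rust
/// Convert a heaptrack size string such as "2.73GB" or "570.43MB" to megabytes.
/// Assumes decimal units (1 GB = 1000 MB); other suffixes are rejected.
fn parse_mb(s: &str) -> Option<f64> {
    let (num, scale) = if let Some(n) = s.strip_suffix("GB") {
        (n, 1000.0)
    } else if let Some(n) = s.strip_suffix("MB") {
        (n, 1.0)
    } else {
        return None;
    };
    num.parse::<f64>().ok().map(|v| v * scale)
}

/// Average growth in MB per day between two (runtime in seconds, size) samples.
fn growth_mb_per_day(t0_s: f64, size0: &str, t1_s: f64, size1: &str) -> Option<f64> {
    let delta_mb = parse_mb(size1)? - parse_mb(size0)?;
    let delta_days = (t1_s - t0_s) / 86_400.0;
    if delta_days <= 0.0 {
        return None;
    }
    Some(delta_mb / delta_days)
}

fn main() {
    // Runtime and largest "leaked" entry from the two validator2 reports above.
    match growth_mb_per_day(133_453.78, "1.07GB", 342_044.07, "2.73GB") {
        Some(rate) => println!("approx. {:.0} MB/day", rate),
        None => println!("could not compute growth rate"),
    }
}
```

Under those assumptions this comes out at roughly 0.7 GB per day for that single call site.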

resource: facebook/rocksdb#4112 (sadly, very cluttered, non-authoritative info...)

Summary of Changes

TODO:

  • assess perf degradation (maybe, run on one of our validators?)
  • assess risk of worsening known rocksdb troubles (stalls caused by compaction, and the still-unexplained one: rocksdb compaction sometimes cause complete blockprocesing stalls #14586 )
  • google around to see best practices used by others (I have done so lightly and have some links to grok)
    • goal is then to decide the correct tuning value (static or dynamic? a different value depending on whether --rpc-port is set?).
    • alternatively, the goal is to find a more appropriate memory tuning option?

context #14366

@ryoqun
Contributor Author

ryoqun commented Feb 11, 2021

FYI, these two are the exact call sites that allocate the unfreed memory:

2.73GB leaked over 6003 calls from
  rocksdb::BlockFetcher::ReadBlockContents()
    in /home/ubuntu/solana-validator2
  rocksdb::Status rocksdb::(anonymous namespace)::ReadBlockFromFile<>(rocksdb::RandomAccessFileReader*, rocksdb::FilePrefetchBuffer*, rocksdb::Footer const&, rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, std::unique_ptr<>*, rocksdb::ImmutableCFOptions const&, bool, bool, rocksdb::BlockType, rocksdb::UncompressionDict const&, rocksdb::PersistentCacheOptions const&, unsigned long, rocksdb::MemoryAllocator*, bool, bool, rocksdb::FilterPolicy const*) [clone .constprop.412]
    in /home/ubuntu/solana-validator2
  rocksdb::Status rocksdb::BlockBasedTable::RetrieveBlock<>(rocksdb::FilePrefetchBuffer*, rocksdb::ReadOptions const&, rocksdb::BlockHandle const&, rocksdb::UncompressionDict const&, rocksdb::CachableEntry<>*, rocksdb::BlockType, rocksdb::GetContext*, rocksdb::BlockCacheLookupContext*, bool, bool) const
    in /home/ubuntu/solana-validator2
  rocksdb::BlockBasedTable::IndexReaderCommon::ReadIndexBlock(rocksdb::BlockBasedTable const*, rocksdb::FilePrefetchBuffer*, rocksdb::ReadOptions const&, bool, rocksdb::GetContext*, rocksdb::BlockCacheLookupContext*, rocksdb::CachableEntry<>*)
    in /home/ubuntu/solana-validator2
  rocksdb::BinarySearchIndexReader::Create(rocksdb::BlockBasedTable const*, rocksdb::FilePrefetchBuffer*, bool, bool, bool, rocksdb::BlockCacheLookupContext*, std::unique_ptr<>*)
    in /home/ubuntu/solana-validator2
  rocksdb::BlockBasedTable::CreateIndexReader(rocksdb::FilePrefetchBuffer*, rocksdb::InternalIteratorBase<>*, bool, bool, bool, rocksdb::BlockCacheLookupContext*, std::unique_ptr<>*)
    in /home/ubuntu/solana-validator2
  rocksdb::BlockBasedTable::PrefetchIndexAndFilterBlocks(rocksdb::FilePrefetchBuffer*, rocksdb::InternalIteratorBase<>*, rocksdb::BlockBasedTable*, bool, rocksdb::BlockBasedTableOptions const&, int, unsigned long, unsigned long, rocksdb::BlockCacheLookupContext*)
    in /home/ubuntu/solana-validator2
  rocksdb::BlockBasedTable::Open(rocksdb::ImmutableCFOptions const&, rocksdb::EnvOptions const&, rocksdb::BlockBasedTableOptions const&, rocksdb::InternalKeyComparator const&, std::unique_ptr<>&&, unsigned long, std::unique_ptr<>*, rocksdb::SliceTransform const*, bool, bool, int, bool, unsigned long, bool, rocksdb::TailPrefetchStats*, rocksdb::BlockCacheTracer*, unsigned long)
    in /home/ubuntu/solana-validator2
  rocksdb::BlockBasedTableFactory::NewTableReader(rocksdb::TableReaderOptions const&, std::unique_ptr<>&&, unsigned long, std::unique_ptr<>*, bool) const
    in /home/ubuntu/solana-validator2
  rocksdb::TableCache::GetTableReader(rocksdb::FileOptions const&, rocksdb::InternalKeyComparator const&, rocksdb::FileDescriptor const&, bool, bool, rocksdb::HistogramImpl*, std::unique_ptr<>*, rocksdb::SliceTransform const*, bool, int, bool, unsigned long)
    in /home/ubuntu/solana-validator2
  rocksdb::TableCache::FindTable(rocksdb::FileOptions const&, rocksdb::InternalKeyComparator const&, rocksdb::FileDescriptor const&, rocksdb::Cache::Handle**, rocksdb::SliceTransform const*, bool, bool, rocksdb::HistogramImpl*, bool, int, bool, unsigned long)
    in /home/ubuntu/solana-validator2
  rocksdb::TableCache::NewIterator(rocksdb::ReadOptions const&, rocksdb::FileOptions const&, rocksdb::InternalKeyComparator const&, rocksdb::FileMetaData const&, rocksdb::RangeDelAggregator*, rocksdb::SliceTransform const*, rocksdb::TableReader**, rocksdb::HistogramImpl*, rocksdb::TableReaderCaller, rocksdb::Arena*, bool, int, unsigned long, rocksdb::InternalKey const*, rocksdb::InternalKey const*, bool)
    in /home/ubuntu/solana-validator2
  rocksdb::CompactionJob::Run()::{lambda(rocksdb::Status&)#1}::operator()(rocksdb::Status&) const
    in /home/ubuntu/solana-validator2
  rocksdb::CompactionJob::Run()
    in /home/ubuntu/solana-validator2
  rocksdb::DBImpl::BackgroundCompaction(bool*, rocksdb::JobContext*, rocksdb::LogBuffer*, rocksdb::DBImpl::PrepickedCompaction*, rocksdb::Env::Priority)
    in /home/ubuntu/solana-validator2
  rocksdb::DBImpl::BackgroundCallCompaction(rocksdb::DBImpl::PrepickedCompaction*, rocksdb::Env::Priority)
    in /home/ubuntu/solana-validator2
  rocksdb::DBImpl::BGWorkCompaction(void*)
    in /home/ubuntu/solana-validator2
  rocksdb::ThreadPoolImpl::Impl::BGThread(unsigned long)
    in /home/ubuntu/solana-validator2
  0x7f9adf306111
    in /lib/x86_64-linux-gnu/libstdc++.so.6
  start_thread
    in /lib/x86_64-linux-gnu/libpthread.so.0
  __clone
    in /lib/x86_64-linux-gnu/libc.so.6

771.75MB leaked over 14875 calls from
  rocksdb::Arena::AllocateNewBlock(unsigned long)
    in /home/ubuntu/solana-validator2
  rocksdb::Arena::AllocateFallback(unsigned long, bool)
    in /home/ubuntu/solana-validator2
  rocksdb::ConcurrentArena::AllocateAligned(unsigned long, unsigned long, rocksdb::Logger*)
    in /home/ubuntu/solana-validator2
  rocksdb::(anonymous namespace)::SkipListRep::Allocate(unsigned long, char**)
    in /home/ubuntu/solana-validator2
  rocksdb::MemTable::Add(unsigned long, rocksdb::ValueType, rocksdb::Slice const&, rocksdb::Slice const&, bool, rocksdb::MemTablePostProcessInfo*, void**)
    in /home/ubuntu/solana-validator2
  rocksdb::MemTableInserter::PutCFImpl(unsigned int, rocksdb::Slice const&, rocksdb::Slice const&, rocksdb::ValueType)
    in /home/ubuntu/solana-validator2
  rocksdb::MemTableInserter::PutCF(unsigned int, rocksdb::Slice const&, rocksdb::Slice const&)
    in /home/ubuntu/solana-validator2
  rocksdb::WriteBatchInternal::Iterate(rocksdb::WriteBatch const*, rocksdb::WriteBatch::Handler*, unsigned long, unsigned long)
    in /home/ubuntu/solana-validator2
  rocksdb::WriteBatch::Iterate(rocksdb::WriteBatch::Handler*) const
    in /home/ubuntu/solana-validator2
  rocksdb::WriteBatchInternal::InsertInto(rocksdb::WriteThread::WriteGroup&, unsigned long, rocksdb::ColumnFamilyMemTables*, rocksdb::FlushScheduler*, rocksdb::TrimHistoryScheduler*, bool, unsigned long, rocksdb::DB*, bool, bool, bool)
    in /home/ubuntu/solana-validator2
  rocksdb::DBImpl::WriteImpl(rocksdb::WriteOptions const&, rocksdb::WriteBatch*, rocksdb::WriteCallback*, unsigned long*, unsigned long, bool, unsigned long*, unsigned long, rocksdb::PreReleaseCallback*)
    in /home/ubuntu/solana-validator2
  rocksdb::DBImpl::Write(rocksdb::WriteOptions const&, rocksdb::WriteBatch*)
    in /home/ubuntu/solana-validator2
  rocksdb_write
    in /home/ubuntu/solana-validator2
  rocksdb::db::DB::write_opt::h266e54c7e87f6832
    at /home/ryoqun/.cargo/registry/src/github.com-1ecc6299db9ec823/rocksdb-0.15.0/src/db.rs:426
    in /home/ubuntu/solana-validator2
  rocksdb::db::DB::write::h794ba91d7cdd0167
    at /home/ryoqun/.cargo/registry/src/github.com-1ecc6299db9ec823/rocksdb-0.15.0/src/db.rs:432
  solana_ledger::blockstore_db::Rocks::write::h6b70c70b688d6641
    at /home/ryoqun/work/solana/solana/ledger/src/blockstore_db.rs:389
    in /home/ubuntu/solana-validator2
  solana_ledger::blockstore_db::Database::write::h8f6b387e3e68224b
    at /home/ryoqun/work/solana/solana/ledger/src/blockstore_db.rs:812
  solana_ledger::blockstore::Blockstore::insert_shreds_handle_duplicate::h1dc3518d6a9b7729
    at /home/ryoqun/work/solana/solana/ledger/src/blockstore.rs:927
    in /home/ubuntu/solana-validator2
  solana_core::window_service::run_insert::h6ee2101b41d292ef
    at /home/ryoqun/work/solana/solana/core/src/window_service.rs:150
    in /home/ubuntu/solana-validator2
  solana_core::window_service::WindowService::start_window_insert_thread::_$u7b$$u7b$closure$u7d$$u7d$::hece92de0cbcf3122
    at /home/ryoqun/work/solana/solana/core/src/window_service.rs:435
    in /home/ubuntu/solana-validator2
  std::sys_common::backtrace::__rust_begin_short_backtrace::haf66992bc3b40712
    at /home/ryoqun/.rustup/toolchains/nightly-2021-01-23-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/sys_common/backtrace.rs:125
  std::thread::Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::_$u7b$$u7b$closure$u7d$$u7d$::h1013b57d8a8a41ab
    at /home/ryoqun/.rustup/toolchains/nightly-2021-01-23-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/mod.rs:474
    in /home/ubuntu/solana-validator2
  _$LT$std..panic..AssertUnwindSafe$LT$F$GT$$u20$as$u20$core..ops..function..FnOnce$LT$$LP$$RP$$GT$$GT$::call_once::h4151c2bbd9216aa3
    at /home/ryoqun/.rustup/toolchains/nightly-2021-01-23-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:322
  std::panicking::try::do_call::hc62a8de8198d2325
    at /home/ryoqun/.rustup/toolchains/nightly-2021-01-23-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:379
  std::panicking::try::hc0223c4c82740a26
    at /home/ryoqun/.rustup/toolchains/nightly-2021-01-23-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panicking.rs:343
  std::panic::catch_unwind::hf0e2afba486849e2
    at /home/ryoqun/.rustup/toolchains/nightly-2021-01-23-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/panic.rs:396
  std::thread::Builder::spawn_unchecked::_$u7b$$u7b$closure$u7d$$u7d$::h2eb4bf163ea54282
    at /home/ryoqun/.rustup/toolchains/nightly-2021-01-23-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/std/src/thread/mod.rs:473
  core::ops::function::FnOnce::call_once$u7b$$u7b$vtable.shim$u7d$$u7d$::h1ee78c9fe8915e4a
    at /home/ryoqun/.rustup/toolchains/nightly-2021-01-23-x86_64-unknown-linux-gnu/lib/rustlib/src/rust/library/core/src/ops/function.rs:227
  _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h8ffdf8dc1f37e360
    at /rustc/22ddcd1a13082b7be0fc99b720677efd2b733816/library/alloc/src/boxed.rs:1484
    in /home/ubuntu/solana-validator2
  _$LT$alloc..boxed..Box$LT$F$C$A$GT$$u20$as$u20$core..ops..function..FnOnce$LT$Args$GT$$GT$::call_once::h74f6ec149ce6acc8
    at /rustc/22ddcd1a13082b7be0fc99b720677efd2b733816/library/alloc/src/boxed.rs:1484
  std::sys::unix::thread::Thread::new::thread_start::h565bef3956c58d58
    at /rustc/22ddcd1a13082b7be0fc99b720677efd2b733816//library/std/src/sys/unix/thread.rs:71
  start_thread
    in /lib/x86_64-linux-gnu/libpthread.so.0
  __clone
    in /lib/x86_64-linux-gnu/libc.so.6

@codecov

codecov bot commented Feb 11, 2021

Codecov Report

Merging #15266 (1ea5500) into master (ab0f4c6) will increase coverage by 0.0%.
The diff coverage is 100.0%.

@@           Coverage Diff           @@
##           master   #15266   +/-   ##
=======================================
  Coverage    79.5%    79.5%           
=======================================
  Files         402      402           
  Lines      102344   102345    +1     
=======================================
+ Hits        81431    81445   +14     
+ Misses      20913    20900   -13     

@sakridge
Contributor

We could plumb it as a command-line option, and then it will be easy to test out.
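A minimal sketch of what such plumbing might look like, using only the standard library. The flag name `--rocksdb-max-open-files` is hypothetical (it is not an existing solana-validator option), and the default of 512 is simply the tuned value mentioned in this PR; the real validator would presumably wire this through its existing argument parser instead.

```rust
/// Default taken from the tuned value mentioned in this PR.
const DEFAULT_MAX_OPEN_FILES: i32 = 512;
/// Hypothetical flag name, used for illustration only.
const FLAG: &str = "--rocksdb-max-open-files";

/// Scan the arguments for FLAG and return its value, or the default if absent.
/// -1 keeps RocksDB's "unlimited" behaviour; 0 and other negatives are rejected.
fn parse_max_open_files<I, S>(args: I) -> Result<i32, String>
where
    I: IntoIterator<Item = S>,
    S: AsRef<str>,
{
    let mut args = args.into_iter();
    while let Some(arg) = args.next() {
        if arg.as_ref() == FLAG {
            let value = args
                .next()
                .ok_or_else(|| format!("{} requires a value", FLAG))?;
            let n: i32 = value
                .as_ref()
                .parse()
                .map_err(|e| format!("invalid value for {}: {}", FLAG, e))?;
            if n == 0 || n < -1 {
                return Err(format!("{} must be -1 or a positive integer", FLAG));
            }
            return Ok(n);
        }
    }
    Ok(DEFAULT_MAX_OPEN_FILES)
}

fn main() {
    match parse_max_open_files(std::env::args().skip(1)) {
        Ok(n) => println!("max_open_files = {}", n),
        Err(e) => {
            eprintln!("{}", e);
            std::process::exit(1);
        }
    }
}
```

The parsed value would then be passed to the blockstore when it builds its RocksDB options, which makes it easy to compare different caps on a running validator.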

@stale

stale bot commented Mar 19, 2021

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the stale [bot only] Added to stale content; results in auto-close after a week. label Mar 19, 2021
@stale

stale bot commented Apr 19, 2021

This stale pull request has been automatically closed. Thank you for your contributions.

@stale stale bot closed this Apr 19, 2021