Core dump when opening a new rocksdb database #4900

Closed
gaojieliu opened this issue Jan 18, 2019 · 5 comments

gaojieliu commented Jan 18, 2019

We are using org.rocksdb:rocksdbjni:5.14.2, and we hit a core dump when opening a new RocksDB database. It does not happen consistently.
When the crash occurred, several RocksDB open, close, and data-read operations were taking place.
Here is the stack reported by the JVM. We do not have a core dump file yet, since the crash is not easy to reproduce:

Stack: [0x00007fdafa364000,0x00007fdafa465000],  sp=0x00007fdafa461ed0,  free space=1015k
Native frames: (J=compiled Java code, j=interpreted, Vv=VM code, C=native code)
C  [libstdc++.so.6+0x9d25b]  std::basic_string<char, std::char_traits<char>, std::allocator<char> >::basic_string(std::string const&)+0xb
C  [librocksdbjni7645520449361009434.so+0x383ca6]  rocksdb::DB::Open(rocksdb::Options const&, std::string const&, rocksdb::DB**)+0x36
C  [librocksdbjni7645520449361009434.so+0x2db03a]  std::_Function_handler<rocksdb::Status ()(rocksdb::Options const&, std::string const&, rocksdb::DB**), rocksdb::Status (*)(rocksdb::Options const&, std::string const&, rocksdb::DB**)>::_M_invoke(std::_Any_data const&, rocksdb::Options const&, std::string const&, rocksdb::DB**)+0x1a
C  [librocksdbjni7645520449361009434.so+0x2d32fe]  rocksdb_open_helper(JNIEnv_*, long, _jstring*, std::function<rocksdb::Status ()(rocksdb::Options const&, std::string const&, rocksdb::DB**)>)+0x7e
C  [librocksdbjni7645520449361009434.so+0x2d343e]  Java_org_rocksdb_RocksDB_open__JLjava_lang_String_2+0x3e
J 25523  org.rocksdb.RocksDB.open(JLjava/lang/String;)J (0 bytes) @ 0x00007fe6b4b10527 [0x00007fe6b4b10440+0xe7]

When I tried to use addr2line to get the line number, it didn't print any useful info, so the library we are using was probably not compiled with the '-g' option.
addr2line -e librocksdbjni7645520449361009434.so 0x383ca6
??:0

Have you guys seen this issue before?
On our end, we will try to reproduce it and get the core dump file later.
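
For the addr2line step above, a small helper can pull the library name and offset out of each native frame in the JVM crash report and print a matching addr2line command. This is only a sketch: the class name FrameOffsets and the regular expression are illustrative assumptions, not part of RocksJava or the JDK, and the library name in the frame has to be replaced with the full path of the extracted .so file. The commands only resolve to source lines when the library was built with debug symbols.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: turns native ("C") frames from a JVM crash report
// into addr2line invocations.
final class FrameOffsets {
    // Matches frames such as: C  [libfoo.so+0x1234]  symbol+0x10
    // Group 1 is the library name, group 2 is the offset inside the library.
    private static final Pattern NATIVE_FRAME =
        Pattern.compile("^C\\s+\\[([^\\]]+)\\+(0x[0-9a-fA-F]+)\\]");

    static List<String> addr2lineCommands(List<String> frames) {
        List<String> commands = new ArrayList<>();
        for (String frame : frames) {
            Matcher m = NATIVE_FRAME.matcher(frame);
            if (m.find()) {
                // -f prints the function name, -C demangles C++ symbols,
                // -e names the binary to inspect.
                commands.add("addr2line -f -C -e " + m.group(1) + " " + m.group(2));
            }
            // Java frames (J/j) and VM frames (V) are skipped: addr2line
            // cannot resolve them.
        }
        return commands;
    }

    public static void main(String[] args) {
        List<String> frames = Arrays.asList(
            "C  [librocksdbjni7645520449361009434.so+0x2d343e]  Java_org_rocksdb_RocksDB_open__JLjava_lang_String_2+0x3e",
            "J 25523  org.rocksdb.RocksDB.open(JLjava/lang/String;)J (0 bytes)");
        for (String command : addr2lineCommands(frames)) {
            System.out.println(command);
        }
    }
}
```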

@adamretter
Collaborator

I am afraid there is not much I can tell from the stack trace. If you build a debug version of RocksJava and run that, the line numbers will be resolvable with addr2line: https://github.com/facebook/rocksdb/wiki/JNI-Debugging

@gaojieliu
Author

Thanks @adamretter for your reply.
We will try to gather more info with the debug version of RocksJava.

@gaojieliu
Author

gaojieliu commented Jan 26, 2019

@adamretter
I followed the instructions to build a debug version of RocksJava:
make jclean clean
export DEBUG_LEVEL=1
make rocksdbjava

When I copied the generated RocksDB Java jar, rocksdbjni-5.14.2-linux64.jar, into my war, it core dumped during startup with the following stack:

(gdb) info stack
#0  0x00007f1dd0602207 in raise () from /lib64/libc.so.6
#1  0x00007f1dd06038f8 in abort () from /lib64/libc.so.6
#2  0x00007f1dcfeff0b5 in os::abort(bool) () from /export/apps/jdk/JDK-1_8_0_121/jre/lib/amd64/server/libjvm.so
#3  0x00007f1dd00a1443 in VMError::report_and_die() () from /export/apps/jdk/JDK-1_8_0_121/jre/lib/amd64/server/libjvm.so
#4  0x00007f1dcff045bf in JVM_handle_linux_signal () from /export/apps/jdk/JDK-1_8_0_121/jre/lib/amd64/server/libjvm.so
#5  0x00007f1dcfefab03 in signalHandler(int, siginfo*, void*) () from /export/apps/jdk/JDK-1_8_0_121/jre/lib/amd64/server/libjvm.so
#6  <signal handler called>
#7  rocksdb::LRUCache::LRUCache (this=0x7f1dcba94470, capacity=8388608, num_shard_bits=4, strict_capacity_limit=false, high_pri_pool_ratio=<unavailable>)
    at cache/lru_cache.cc:475
#8  0x00007f1c27a9cb6b in construct<rocksdb::LRUCache, unsigned long&, int&, bool&, double&> (__p=<optimized out>, this=<synthetic pointer>)
    at /usr/include/c++/4.8.2/ext/new_allocator.h:120
#9  _S_construct<rocksdb::LRUCache, unsigned long&, int&, bool&, double&> (__p=<optimized out>, __a=<synthetic pointer>)
    at /usr/include/c++/4.8.2/bits/alloc_traits.h:254
#10 construct<rocksdb::LRUCache, unsigned long&, int&, bool&, double&> (__p=<optimized out>, __a=<synthetic pointer>)
    at /usr/include/c++/4.8.2/bits/alloc_traits.h:393
#11 __shared_ptr<std::allocator<rocksdb::LRUCache>, unsigned long&, int&, bool&, double&> (__a=..., __tag=..., this=<optimized out>)
    at /usr/include/c++/4.8.2/bits/shared_ptr_base.h:991
#12 shared_ptr<std::allocator<rocksdb::LRUCache>, unsigned long&, int&, bool&, double&> (__a=..., __tag=..., this=<optimized out>)
    at /usr/include/c++/4.8.2/bits/shared_ptr.h:316
#13 allocate_shared<rocksdb::LRUCache, std::allocator<rocksdb::LRUCache>, unsigned long&, int&, bool&, double&> (__a=...)
    at /usr/include/c++/4.8.2/bits/shared_ptr.h:598
#14 make_shared<rocksdb::LRUCache, unsigned long&, int&, bool&, double&> () at /usr/include/c++/4.8.2/bits/shared_ptr.h:614
#15 rocksdb::NewLRUCache (capacity=<optimized out>, num_shard_bits=4, strict_capacity_limit=<optimized out>, high_pri_pool_ratio=<optimized out>)
    at cache/lru_cache.cc:549
#16 0x00007f1c27c16d90 in rocksdb::BlockBasedTableFactory::BlockBasedTableFactory (this=0x7f1dcba94390, _table_options=...) at table/block_based_table_factory.cc:44
#17 0x00007f1c27bee01c in rocksdb::ColumnFamilyOptions::ColumnFamilyOptions (this=0x7f1c27ffc700 <rocksdb::OptionsHelper::dummy_cf_options>)
    at options/options.cc:100

The code at cache/lru_cache.cc:475:

472 LRUCache::LRUCache(size_t capacity, int num_shard_bits,
473                    bool strict_capacity_limit, double high_pri_pool_ratio)
474     : ShardedCache(capacity, num_shard_bits, strict_capacity_limit) {
475   num_shards_ = 1 << num_shard_bits;
476   shards_ = new LRUCacheShard[num_shards_];
477   SetCapacity(capacity);
478   SetStrictCapacityLimit(strict_capacity_limit);
479   for (int i = 0; i < num_shards_; i++) {
480     shards_[i].SetHighPriorityPoolRatio(high_pri_pool_ratio);
481   }
482 }

I am not sure why the debug version would fail at that point.
Could you let me know whether I built the debug version the right way, or whether I missed something important?

Thanks a lot.

@adamretter
Collaborator

adamretter commented Jan 26, 2019

You likely want:

make clean jclean
DEBUG_LEVEL=2 make rocksdbjavastatic

@gaojieliu
Author

@adamretter
We did more debugging on our end and found that, because of a bug in our code, we were passing an already-freed Options object when opening a new RocksDB database.
After we fixed the bug, the crash didn't happen again.

Thanks for your help.
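
For readers who hit the same crash: a RocksJava Options object owns native memory that is released when it is closed, so using it afterwards can crash inside native code instead of raising a Java exception. One way to surface such a bug earlier is to hand out the object through a small guard that rejects use after close. The sketch below is an assumption-laden illustration, not RocksJava API: GuardedHandle and FakeOptions are hypothetical names, and FakeOptions merely stands in for a native-backed object so that the example runs without the RocksDB library.

```java
import java.util.Objects;
import java.util.function.Function;

// Hypothetical wrapper (not part of RocksJava): fails fast in Java when a
// closed resource is used, rather than letting native code touch freed memory.
final class GuardedHandle<T extends AutoCloseable> implements AutoCloseable {
    private T resource; // set to null once closed

    GuardedHandle(T resource) {
        this.resource = Objects.requireNonNull(resource);
    }

    // Runs the action against the resource, or throws if it was already closed.
    // Synchronized so that close() cannot run in the middle of an action.
    synchronized <R> R use(Function<T, R> action) {
        if (resource == null) {
            throw new IllegalStateException("handle used after close()");
        }
        return action.apply(resource);
    }

    @Override
    public synchronized void close() {
        if (resource == null) {
            return; // closing twice is a no-op
        }
        T toClose = resource;
        resource = null;
        try {
            toClose.close();
        } catch (Exception e) {
            throw new IllegalStateException("closing the resource failed", e);
        }
    }
}

// Stand-in for a native-backed object such as an options instance.
final class FakeOptions implements AutoCloseable {
    boolean freed = false;

    @Override
    public void close() {
        freed = true;
    }
}

final class GuardedHandleDemo {
    public static void main(String[] args) {
        GuardedHandle<FakeOptions> handle = new GuardedHandle<>(new FakeOptions());
        System.out.println("freed before close: " + handle.use(o -> o.freed));
        handle.close();
        try {
            handle.use(o -> o.freed); // models opening a database with freed options
        } catch (IllegalStateException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

In real code the action passed to use would be the call that opens the database, so a stale handle fails with an IllegalStateException and a Java stack trace pointing at the caller.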
