Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[YSQL][SQLsmith] Internal error (yb/common/ql_expr.cc:702): Unable to initialize catalog manager: Failed to initialize sys tables async: Failed log replay. Reason: Column unexpectedly not found in cache #11229

Open
def- opened this issue Jan 26, 2022 · 1 comment
Assignees
Labels
area/ysql Yugabyte SQL (YSQL) kind/bug This issue is a bug kind/failing-test Tests and testing infra priority/medium Medium priority issue qa_automation Bugs identified via itest-system, LST, Stress automation or causing automation failures

Comments

@def-
Copy link
Contributor

def- commented Jan 26, 2022

Jira Link: DB-931

Description

yb-master failed to come up when restarting (in Debug build) after a SQLsmith session. FATAL file reports:

Log file created at: 2022/01/26 18:06:14
Running on machine: si
Application fingerprint: version 2.11.3.0 build PRE_RELEASE revision 9ac94afd3726d3bc683af0096a2c9f2389528c7a build_type DEBUG built at 26 Jan 2022 17:57:07 CET
Running duration (h:mm:ss): 0:00:00
Log line format: [IWEF]mmdd hh:mm:ss.uuuuuu threadid file:line] msg
F0126 18:06:14.417448 84608384 master_main.cc:126] Internal error (yb/common/ql_expr.cc:702): Unable to initialize catalog manager: Failed to initialize sys tables async: Failed log replay. Reason: Column unexpectedly not found in cache

I can provide the data files if required. (10 GB)

@def- def- added kind/bug This issue is a bug area/docdb YugabyteDB core features labels Jan 26, 2022
@def- def- changed the title [DocDB] Internal error (yb/common/ql_expr.cc:702): Unable to initialize catalog manager: Failed to initialize sys tables async: Failed log replay. Reason: Column unexpectedly not found in cache [DocDB][SQLsmith] Internal error (yb/common/ql_expr.cc:702): Unable to initialize catalog manager: Failed to initialize sys tables async: Failed log replay. Reason: Column unexpectedly not found in cache Feb 3, 2022
@jaki
Copy link
Contributor

jaki commented Feb 26, 2022

Got same message when trying to fix #5658. With partial fix patch on latest 2.6 and centos fastdebug gcc build, rolling upgrade rf=3 2.0.5.2 to 2.2.2.0 to this 2.6 build causes master FATAL. Slightly modified msg to show stack trace and column number:

F0225 18:10:22.651639 14713 master_main.cc:138] Internal error (yb/common/ql_expr.cc:692): Unable to initialize catalog manager: Failed to initialize sys tables async: Failed log replay. Reason: Column 10 unexpectedly not found in cache,     @     0x7fcd13d8244f  yb::QLTableRow::Column(int) const (src/yb/common/ql_expr.cc:691)
    @     0x7fcd13d827cb  yb::QLTableRow::GetValue(int, yb::QLValue*) const (src/yb/common/ql_expr.cc:711)
    @     0x7fcd1c6cfac4  yb::QLTableRow::GetValue(yb::ColumnId const&, yb::QLValue*) const (src/yb/common/ql_expr.h:138)
    @     0x7fcd1c6cfac4  yb::master::EnumerateSysCatalog(yb::docdb::DocRowwiseIterator*, yb::Schema const&, signed char, std::function<yb::Status (yb::Slice const&, yb::Slice const&)> const&) (src/yb/master/sys_catalog_writer.cc:195)
    @     0x7fcd1c6d0f04  yb::master::EnumerateSysCatalog(yb::tablet::Tablet*, yb::Schema const&, signed char, std::function<yb::Status (yb::Slice const&, yb::Slice const&)> const&) (src/yb/master/sys_catalog_writer.cc:172)
    @     0x7fcd1c66fa36  yb::master::MasterSnapshotCoordinator::Impl::Load(yb::tablet::Tablet*) (src/yb/master/master_snapshot_coordinator.cc:242)
    @     0x7fcd1c66fa36  yb::master::MasterSnapshotCoordinator::Load(yb::tablet::Tablet*) (src/yb/master/master_snapshot_coordinator.cc:1297)
    @     0x7fcd1b7f6d47  yb::tablet::TabletBootstrap::PlaySegments(yb::consensus::ConsensusBootstrapInfo*) (src/yb/tablet/tablet_bootstrap.cc:1213)
    @     0x7fcd1b7f980f  yb::tablet::TabletBootstrap::Bootstrap(shared_ptr<yb::tablet::Tablet>*, scoped_refptr<yb::log::Log>*, yb::consensus::ConsensusBootstrapInfo*) (src/yb/tablet/tablet_bootstrap.cc:558)
    @     0x7fcd1b7e5eb2  yb::tablet::BootstrapTabletImpl(yb::tablet::BootstrapTabletData const&, shared_ptr<yb::tablet::Tablet>*, scoped_refptr<yb::log::Log>*, yb::consensus::ConsensusBootstrapInfo*) (src/yb/tablet/tablet_bootstrap.cc:1671)
    @     0x7fcd1b7facde  yb::tablet::BootstrapTablet(yb::tablet::BootstrapTabletData const&, shared_ptr<yb::tablet::Tablet>*, scoped_refptr<yb::log::Log>*, yb::consensus::ConsensusBootstrapInfo*) (src/yb/tablet/tablet_bootstrap_if.cc:92)
    @     0x7fcd1c6b271e  yb::master::SysCatalogTable::OpenTablet(scoped_refptr<yb::tablet::RaftGroupMetadata> const&) (src/yb/master/sys_catalog.cc:565)
    @     0x7fcd1c6b3bd3  yb::master::SysCatalogTable::SetupTablet(scoped_refptr<yb::tablet::RaftGroupMetadata> const&) (src/yb/master/sys_catalog.cc:508)
    @     0x7fcd1c6b48ee  yb::master::SysCatalogTable::Load(yb::FsManager*) (src/yb/master/sys_catalog.cc:288)
    @     0x7fcd1c53f854  yb::master::CatalogManager::InitSysCatalogAsync() (src/yb/master/catalog_manager.cc:1466)
    @     0x7fcd1c55600c  yb::master::CatalogManager::Init() (src/yb/master/catalog_manager.cc:652)
    @     0x7fcd1c62d6ff  yb::master::Master::InitCatalogManager() (src/yb/master/master.cc:285)
    @     0x7fcd1c62e724  yb::master::Master::InitCatalogManagerTask() (src/yb/master/master.cc:274)
    @     0x7fcd1c634a0c  yb::internal::RunnableAdapter<void (yb::master::Master::*)()>::Run(yb::master::Master*) (src/yb/gutil/bind_internal.h:149)
    @     0x7fcd1c634a0c  yb::internal::InvokeHelper<false, void, yb::internal::RunnableAdapter<void (yb::master::Master::*)()>, void (yb::master::Master*)>::MakeItSo(yb::internal::RunnableAdapter<void (yb::master::Master::*)()>, yb::master::Master*) (src/yb/gutil/bind_internal.h:886)
    @     0x7fcd1c634a0c  yb::internal::Invoker<1, yb::internal::BindState<yb::internal::RunnableAdapter<void (yb::master::Master::*)()>, void (yb::master::Master*), void (yb::internal::UnretainedWrapper<yb::master::Master>)>, void (yb::master::Master*)>::Run(yb::internal::BindStateBase*) (src/yb/gutil/bind_internal.h:1081)
    @     0x7fcd11d1caf0  yb::Callback<void ()>::Run() const (src/yb/gutil/callback.h:411)
    @     0x7fcd11d1cbce  void std::_Mem_fn_base<void (yb::Callback<void ()>::*)() const, true>::operator()<, void>(yb::Callback<void ()> const&) const (gcc/5.5.0_4/include/c++/5.5.0/functional:583)
    @     0x7fcd11d1cbce  void std::_Bind<std::_Mem_fn<void (yb::Callback<void ()>::*)() const> (yb::Callback<void ()>)>::__call<void, , 0ul>(tuple<>&&, std::_Index_tuple<0ul>) (gcc/5.5.0_4/include/c++/5.5.0/functional:1074)
    @     0x7fcd11d1cbce  void std::_Bind<std::_Mem_fn<void (yb::Callback<void ()>::*)() const> (yb::Callback<void ()>)>::operator()<, void>() (gcc/5.5.0_4/include/c++/5.5.0/functional:1133)
    @     0x7fcd11d1cbce  yb::RunnableImpl<std::_Bind<std::_Mem_fn<void (yb::Callback<void ()>::*)() const> (yb::Callback<void ()>)> >::Run() (src/yb/util/threadpool.h:77)
    @     0x7fcd11d19591  yb::ThreadPool::DispatchThread(bool) (src/yb/util/threadpool.cc:617)
    @     0x7fcd11d1cb1c  void std::_Mem_fn_base<void (yb::ThreadPool::*)(bool), true>::operator()<bool&, void>(yb::ThreadPool*, bool&) const (gcc/5.5.0_4/include/c++/5.5.0/functional:600)
    @     0x7fcd11d1cb1c  void std::_Bind<std::_Mem_fn<void (yb::ThreadPool::*)(bool)> (yb::ThreadPool*, bool)>::__call<void, , 0ul, 1ul>(tuple<>&&, std::_Index_tuple<0ul, 1ul>) (gcc/5.5.0_4/include/c++/5.5.0/functional:1074)
    @     0x7fcd11d1cb1c  void std::_Bind<std::_Mem_fn<void (yb::ThreadPool::*)(bool)> (yb::ThreadPool*, bool)>::operator()<, void>() (gcc/5.5.0_4/include/c++/5.5.0/functional:1133)
    @     0x7fcd11d1cb1c  std::_Function_handler<void (), std::_Bind<std::_Mem_fn<void (yb::ThreadPool::*)(bool)> (yb::ThreadPool*, bool)> >::_M_invoke(std::_Any_data const&) (gcc/5.5.0_4/include/c++/5.5.0/functional:1871)
    @     0x7fcd11d1202f  std::function<void ()>::operator()() const (gcc/5.5.0_4/include/c++/5.5.0/functional:2267)
    @     0x7fcd11d1202f  yb::Thread::SuperviseThread(void*) (src/yb/util/thread.cc:771)
    @     0x7fcd0d5e7693  start_thread (/tmp/glibc-20181130-26094-cs1x60/glibc-2.23/nptl/pthread_create.c:333)
    @     0x7fcd0d32941c  (unknown) (sysdeps/unix/sysv/linux/x86_64/clone.S:109)
    @ 0xffffffffffffffff

For reproducibility, here's the partial fix patch for #5658:

diff --git a/src/yb/tablet/tablet_metadata.cc b/src/yb/tablet/tablet_metadata.cc
index b15b47a7ab..6209fcd49c 100644
--- a/src/yb/tablet/tablet_metadata.cc
+++ b/src/yb/tablet/tablet_metadata.cc
@@ -51,6 +51,7 @@
 #include "yb/gutil/map-util.h"
 #include "yb/gutil/stl_util.h"
 #include "yb/gutil/strings/substitute.h"
+#include "yb/master/sys_catalog_constants.h"
 #include "yb/rocksutil/yb_rocksdb.h"
 #include "yb/rocksutil/yb_rocksdb_logger.h"
 #include "yb/server/metadata.h"
@@ -181,6 +182,18 @@ Status KvStoreInfo::LoadTablesFromPB(
     RETURN_NOT_OK(table_info->LoadFromPB(table_pb));
     if (table_info->table_id != primary_table_id) {
       if (table_pb.schema().table_properties().is_ysql_catalog_table()) {
+        // For now, only master syscatalog tablet has YSQL system tables.
+        CHECK_EQ(primary_table_id, master::kSysCatalogTableId);
+
+        // Backwards compatibility with clusters that upgraded over v2.0.9-v2.0.10: these clusters
+        // may incorrectly have pgtable_id set for system tables, so remove it.  See issue #5658 for
+        // details.
+        if (table_info->schema.has_pgtable_id()) {
+          LOG(INFO) << "Removing pgtable id " << table_info->schema.pgtable_id()
+                    << " from YSQL catalog table " << table_info->table_id;
+          table_info->schema.set_pgtable_id(0);
+        }
+
         Uuid cotable_id;
         CHECK_OK(cotable_id.FromHexString(table_info->table_id));
         // TODO(#79): when adding for multiple KV-stores per Raft group support - check if we need

@yugabyte-ci yugabyte-ci added the priority/medium Medium priority issue label Jun 8, 2022
@bmatican bmatican assigned tverona1 and unassigned bmatican Jul 19, 2022
@bmatican bmatican added the status/awaiting-triage Issue awaiting triage label Jul 19, 2022
@yugabyte-ci yugabyte-ci changed the title [DocDB][SQLsmith] Internal error (yb/common/ql_expr.cc:702): Unable to initialize catalog manager: Failed to initialize sys tables async: Failed log replay. Reason: Column unexpectedly not found in cache [YSQL][SQLsmith] Internal error (yb/common/ql_expr.cc:702): Unable to initialize catalog manager: Failed to initialize sys tables async: Failed log replay. Reason: Column unexpectedly not found in cache Jul 19, 2022
@yugabyte-ci yugabyte-ci added area/ysql Yugabyte SQL (YSQL) and removed area/docdb YugabyteDB core features labels Jul 19, 2022
@yugabyte-ci yugabyte-ci removed the status/awaiting-triage Issue awaiting triage label Jul 27, 2022
@kripasreenivasan kripasreenivasan added the qa_automation Bugs identified via itest-system, LST, Stress automation or causing automation failures label Sep 13, 2022
@yugabyte-ci yugabyte-ci added the kind/failing-test Tests and testing infra label Oct 12, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/ysql Yugabyte SQL (YSQL) kind/bug This issue is a bug kind/failing-test Tests and testing infra priority/medium Medium priority issue qa_automation Bugs identified via itest-system, LST, Stress automation or causing automation failures
Projects
None yet
Development

No branches or pull requests

6 participants