[DPB] redis (database) input/output errors #6935

Closed
vadymhlushko-mlnx opened this issue Mar 2, 2021 · 7 comments

@vadymhlushko-mlnx
Contributor

vadymhlushko-mlnx commented Mar 2, 2021

Description

The Dynamic Port Breakout (DPB) CLI fails with a Redis database exception.

Steps to reproduce the issue:

  1. Install the latest master image via ONIE
  2. config interface breakout Ethernet0 2x50G[40G,25G,10G] -v -y

Describe the results you received:

root@r-tigris-13:/home/admin# config interface breakout Ethernet0 2x50G[40G,25G,10G] -v -y

Running Breakout Mode : 1x100G[50G,40G,25G,10G] 
Target Breakout Mode : 2x50G[40G,25G,10G]

Ports to be deleted : 
 {
    "Ethernet0": "100000"
}
Ports to be added : 
 {
    "Ethernet0": "50000",
    "Ethernet2": "50000"
}

After running Logic to limit the impact

Final list of ports to be deleted : 
 {
    "Ethernet0": "100000"
} 
Final list of ports to be added :  
 {
    "Ethernet0": "50000",
    "Ethernet2": "50000"
}
Loaded below Yang Models
['sonic-acl', 'sonic-breakout_cfg', 'sonic-crm', 'sonic-device_metadata', 'sonic-device_neighbor', 'sonic-extension', 'sonic-flex_counter', 'sonic-interface', 'sonic-loopback-interface', 'sonic-port', 'sonic-portchannel', 'sonic-types', 'sonic-versions', 'sonic-vlan']
RedisReply catches system_error: command: *2
$7
HGETALL
$21
CONFIG_DB_INITIALIZED
, reason: WRONGTYPE Operation against a key holding the wrong kind of value: Input/output error: Input/output error
ConfigMgmt Class creation failed
Failed to break out Port. Error: Failed to load the config. Error: ConfigMgmtDPB Class creation failed

Brief logs output

Mar  2 09:52:53.008274 r-tigris-13 INFO ConfigMgmt: Reading data from Redis configDb
Mar  2 09:52:53.012886 r-tigris-13 ERR python3: :- guard: RedisReply catches system_error: command: *2#015#012$7#015#012HGETALL#015#012$21#015#012CONFIG_DB_INITIALIZED#015#012, reason: WRONGTYPE Operation against a key holding the wrong kind of value: Input/output error
Mar  2 09:52:53.013150 r-tigris-13 ERR ConfigMgmt: RedisReply catches system_error: command: *2#015#012$7#015#012HGETALL#015#012$21#015#012CONFIG_DB_INITIALIZED#015#012, reason: WRONGTYPE Operation against a key holding the wrong kind of value: Input/output error: Input/output error
Mar  2 09:52:53.013418 r-tigris-13 ERR ConfigMgmt: ConfigMgmt Class creation failed
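
For readers decoding the log above: the logged command appears to be the raw Redis wire format (RESP), an array of two bulk strings, each prefixed by its byte length. The sketch below rebuilds it in Python, assuming that the #015 and #012 sequences are syslog octal escapes for CR and LF. The helper names encode_resp_command and unescape_syslog are made up for this illustration and are not part of any SONiC library.

```python
import re


def encode_resp_command(*args: str) -> bytes:
    """Encode a command as a RESP array of bulk strings."""
    parts = [f"*{len(args)}\r\n".encode()]
    for arg in args:
        data = arg.encode()
        # Each bulk string is "$<byte length>\r\n<bytes>\r\n".
        parts.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(parts)


def unescape_syslog(text: str) -> str:
    """Turn '#NNN' octal escapes (assumed syslog style) back into characters."""
    return re.sub(r"#([0-7]{3})", lambda m: chr(int(m.group(1), 8)), text)


# Command text copied from the syslog lines in this report.
logged = "*2#015#012$7#015#012HGETALL#015#012$21#015#012CONFIG_DB_INITIALIZED#015#012"
wire = encode_resp_command("HGETALL", "CONFIG_DB_INITIALIZED")

print(unescape_syslog(logged).encode() == wire)  # prints True
```

Read this way, the failing request is a plain HGETALL on the key CONFIG_DB_INITIALIZED, and the WRONGTYPE reason is the server's reply to that single command.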

Describe the results you expected:

The DPB CLI should work.

Output of show version:

SONiC Software Version: SONiC.SONIC.master.87-724785d_Internal
Distribution: Debian 10.8
Kernel: 4.19.0-12-2-amd64
Build commit: 724785db
Build date: Tue Mar  2 07:03:48 UTC 2021
Built by: sw-r2d2-bot@r-build-sonic-ci02

Platform: x86_64-mlnx_msn3800-r0
HwSKU: ACS-MSN3800
ASIC: mellanox
ASIC Count: 1
Serial Number: MT1937X00527
Uptime: 10:12:59 up 32 min,  2 users,  load average: 4.12, 2.66, 2.47

Docker images:
REPOSITORY                    TAG                                IMAGE ID            SIZE
docker-snmp                   SONIC.master.87-724785d_Internal   69dfabae7945        438MB
docker-snmp                   latest                             69dfabae7945        438MB
docker-platform-monitor       SONIC.master.87-724785d_Internal   d898e5f67dac        689MB
docker-platform-monitor       latest                             d898e5f67dac        689MB
docker-sonic-telemetry        SONIC.master.87-724785d_Internal   7602dd363057        487MB
docker-sonic-telemetry        latest                             7602dd363057        487MB
docker-fpm-frr                SONIC.master.87-724785d_Internal   46ab1e9ddacc        426MB
docker-fpm-frr                latest                             46ab1e9ddacc        426MB
docker-syncd-mlnx             SONIC.master.87-724785d_Internal   bbb841bb23b3        662MB
docker-syncd-mlnx             latest                             bbb841bb23b3        662MB
docker-teamd                  SONIC.master.87-724785d_Internal   ee725c56f5c4        408MB
docker-teamd                  latest                             ee725c56f5c4        408MB
docker-sonic-mgmt-framework   SONIC.master.87-724785d_Internal   c02661a132d2        616MB
docker-sonic-mgmt-framework   latest                             c02661a132d2        616MB
docker-nat                    SONIC.master.87-724785d_Internal   90a7858d821a        411MB
docker-nat                    latest                             90a7858d821a        411MB
docker-router-advertiser      SONIC.master.87-724785d_Internal   6eec1f77c832        398MB
docker-router-advertiser      latest                             6eec1f77c832        398MB
docker-lldp                   SONIC.master.87-724785d_Internal   c7978680254c        438MB
docker-lldp                   latest                             c7978680254c        438MB
docker-database               SONIC.master.87-724785d_Internal   7e03d2f87033        397MB
docker-database               latest                             7e03d2f87033        397MB
docker-orchagent              SONIC.master.87-724785d_Internal   21ced8f27770        427MB
docker-orchagent              latest                             21ced8f27770        427MB
docker-macsec                 SONIC.master.87-724785d_Internal   112d48f0f140        411MB
docker-macsec                 latest                             112d48f0f140        411MB
docker-dhcp-relay             SONIC.master.87-724785d_Internal   5dc7f7d83002        404MB
docker-dhcp-relay             latest                             5dc7f7d83002        404MB
docker-sflow                  SONIC.master.87-724785d_Internal   41b66dfe4cd8        409MB
docker-sflow                  latest                             41b66dfe4cd8        409MB

Additional information you deem important (e.g. issue happens only occasionally):

sudo generate_dump
sonic_dump_r-tigris-13_20210302_095319.tar.gz

@vadymhlushko-mlnx
Contributor Author

@samaity, @praveen-li, @zhenggen-xu - could you please take a look?

@praveen-li
Collaborator

praveen-li commented Mar 2, 2021

@vadymhlushko-mlnx

https://github.com/Azure/sonic-py-swsssdk/blob/fa760c46b25b154a5985fc1a50bfda18beb67bec/src/swsssdk/configdb.py#L331

This problem seems to come from the line marked below and is not related to Dynamic Port Breakout. You can modify this file on the switch to print the exception there.

    def get_config(self):
        """Read all config data. 
        Returns:
            Config data in a dictionary form of 
            { 
                'TABLE_NAME': { 'row_key': {'column_key': 'value', ...}, ...},
                'MULTI_KEY_TABLE_NAME': { ('l1_key', 'l2_key', ...) : {'column_key': 'value', ...}, ...},
                ...
            }
        """
        client = self.get_redis_client(self.db_name)
        keys = client.keys('*')
        data = {}
        for key in keys:
            try:
                (table_name, row) = key.split(self.TABLE_NAME_SEPARATOR, 1)
                entry = self.raw_to_typed(client.hgetall(key))  # <<< suspected source of the error
                if entry != None:
                    data.setdefault(table_name, {})[self.deserialize_key(row)] = entry
            except ValueError:
                pass    #Ignore non table-formated redis entries
        return data

@anshuv-mfst

To be discussed in DPB workgroup

@praveen-li
Collaborator

praveen-li commented Mar 3, 2021 via email

@dmytroxshevchuk
Contributor

@qiluo-msft please take a look.
It looks like we are trying to read a hash value from an entry that holds a string. The error is produced in the code below, when hgetall is called on the CONFIG_DB_INITIALIZED key:
https://github.com/Azure/sonic-swss-common/blob/master/common/configdb.cpp#L250-L257

        size_t pos = key.find(TABLE_NAME_SEPARATOR);
        string table_name = key.substr(0, pos);
        string row;
        if (pos != string::npos)
        {
            row = key.substr(pos + 1);
        }
        auto const& entry = client.hgetall<map<string, string>>(key);

The key CONFIG_DB_INITIALIZED holds a string value, but hgetall only works on hash values.
This key also has no separator, so it does not fit the result structure: map<string, map<string, map<string, string>>> data;
In my opinion, we need to either read this value with the get function or skip such keys.

Please take a look at the fix in this PR, where I simply skip keys with no separator:
sonic-net/sonic-swss-common#465

The handling of keys without a separator was in place until this PR:
sonic-net/sonic-swss-common#446
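
The idea of skipping keys that have no separator can be sketched in Python as follows. This is only an illustration of the approach described in this comment, not the code from the PR: FakeRedis, WrongTypeError, the sample keys, and the stored value "1" are hypothetical stand-ins so the example runs without a Redis server.

```python
TABLE_NAME_SEPARATOR = "|"


class WrongTypeError(Exception):
    """Stands in for the WRONGTYPE reply a real Redis server returns."""


class FakeRedis:
    """Hypothetical in-memory stand-in for a Redis client (not a real API)."""

    def __init__(self, store):
        self._store = store

    def keys(self):
        return list(self._store)

    def hgetall(self, key):
        value = self._store[key]
        if not isinstance(value, dict):
            # A real server rejects HGETALL on a non-hash key.
            raise WrongTypeError(f"WRONGTYPE for key {key!r}")
        return dict(value)


def get_config(client):
    """Collect TABLE|row hashes, skipping keys that have no separator."""
    data = {}
    for key in client.keys():
        table_name, sep, row = key.partition(TABLE_NAME_SEPARATOR)
        if not sep:
            continue  # e.g. CONFIG_DB_INITIALIZED: not a table entry
        data.setdefault(table_name, {})[row] = client.hgetall(key)
    return data


# Illustrative data: one table row plus the plain string key from this issue.
client = FakeRedis({
    "PORT|Ethernet0": {"speed": "100000"},
    "CONFIG_DB_INITIALIZED": "1",
})
print(get_config(client))  # prints {'PORT': {'Ethernet0': {'speed': '100000'}}}
```

Without the separator check, the loop would call hgetall on CONFIG_DB_INITIALIZED and fail in the same way as the log in this issue. Note that the Python get_config quoted earlier in the thread gets a similar effect from its tuple unpacking, which raises ValueError for a key with no separator before hgetall is reached.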

@mykolaxgerasymenko
Contributor

The fix for this issue was merged. Take a look at the PR:
sonic-net/sonic-swss-common#465

I also created a PR to update the sonic-swss-common submodule:
#7121

@mykolaxgerasymenko
Contributor

@anshuv-mfst @lguohan PR #7121 ([swss-common] Update submodule) was merged, so this bug is now fixed in SONiC. I think we can close this issue.
