Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Not supported attribute SAI_SWITCH_ATTR_AVAILABLE_IPMC_ENTRY #6563

Closed
Junchao-Mellanox opened this issue Jan 26, 2021 · 9 comments · Fixed by sonic-net/sonic-swss#1613 or sonic-net/sonic-swss#1641
Assignees

Comments

@Junchao-Mellanox
Copy link
Collaborator

Description

See following warning and error in syslog. I assume this is related to PR sonic-net/sonic-swss#1511. These warning and error causes a lot of log analyzer error in sonic-mgmt regression. Please remove them.

Jan 26 07:08:51.743242 mts-sonic-dut WARNING syncd#SDK: [SAI_UTILS.WARNING] mlnx_sai_utils.c[2003]- check_attribs_metadata: Not supported attribute SAI_SWITCH_ATTR_AVAILABLE_IPMC_ENTRY
Jan 26 07:08:51.743298 mts-sonic-dut ERR swss#orchagent: :- getResAvailableCounters: Failed to get switch attribute 60 , rv:-327680
Jan 26 07:08:51.743627 mts-sonic-dut WARNING syncd#SDK: [SAI_UTILS.WARNING] mlnx_sai_utils.c[2322]- sai_get_attributes: Failed attribs check, key:Switch ID 1
Jan 26 07:08:51.745417 mts-sonic-dut WARNING syncd#SDK: [SAI_UTILS.WARNING] mlnx_sai_utils.c[1970]- check_attribs_metadata: Not implemented attribute SAI_SWITCH_ATTR_AVAILABLE_SNAT_ENTRY (vendor data not found)
Jan 26 07:08:51.745675 mts-sonic-dut WARNING syncd#SDK: [SAI_UTILS.WARNING] mlnx_sai_utils.c[2322]- sai_get_attributes: Failed attribs check, key:Switch ID 1
Jan 26 07:08:51.745972 mts-sonic-dut ERR swss#orchagent: :- getResAvailableCounters: Failed to get switch attribute 61 , rv:-196608
Jan 26 07:08:51.758755 mts-sonic-dut WARNING syncd#SDK: [SAI_UTILS.WARNING] mlnx_sai_utils.c[1970]- check_attribs_metadata: Not implemented attribute SAI_SWITCH_ATTR_AVAILABLE_DNAT_ENTRY (vendor data not found)
Jan 26 07:08:51.759107 mts-sonic-dut ERR swss#orchagent: :- getResAvailableCounters: Failed to get switch attribute 62 , rv:-196608
Jan 26 07:08:51.759107 mts-sonic-dut WARNING syncd#SDK: [SAI_UTILS.WARNING] mlnx_sai_utils.c[2322]- sai_get_attributes: Failed attribs check, key:Switch ID 1
Jan 26 07:13:50.082558 mts-sonic-dut WARNING syncd#SDK: [SAI_UTILS.WARNING] mlnx_sai_utils.c[2003]- check_attribs_metadata: Not supported attribute SAI_SWITCH_ATTR_AVAILABLE_IPMC_ENTRY
Jan 26 07:13:50.082558 mts-sonic-dut WARNING syncd#SDK: [SAI_UTILS.WARNING] mlnx_sai_utils.c[2322]- sai_get_attributes: Failed attribs check, key:Switch ID 1

Steps to reproduce the issue:

  1. Load master image on Mellanox platform and check syslog

Describe the results you received:

See error and warning in syslog about SAI_SWITCH_ATTR_AVAILABLE_IPMC_ENTRY.

Describe the results you expected:

No such error and warning in syslog

Additional information you deem important (e.g. issue happens only occasionally):

This issue is found at commit hash 46b3bd5.

@lguohan
Copy link
Collaborator

lguohan commented Jan 26, 2021

what sai value are we returning? if it is not supported, then it should query only once.

https://github.com/Azure/sonic-swss/blob/master/orchagent/crmorch.cpp#L460

@itaibaz
Copy link

itaibaz commented Jan 26, 2021

SAI_SWITCH_ATTR_AVAILABLE_IPMC_ENTRY returns SAI_STATUS_ATTR_NOT_SUPPORTED_0 = -(0x50000) = -327680

SAI_SWITCH_ATTR_AVAILABLE_SNAT_ENTRY, SAI_SWITCH_ATTR_AVAILABLE_DNAT_ENTRY return SAI_STATUS_ATTR_NOT_IMPLEMENTED_0 = -(0x30000) = -196608

@itaibaz
Copy link

itaibaz commented Jan 26, 2021

I think Sonic should query if NAT API is implemented (either query NAT API itself, or sai_query_attribute_capability for the SAI_SWITCH_ATTR_AVAILABLE_SNAT_ENTRY/DNAT), and not just try to read this attribs
Same for IPMC
As I would think, NAT and IPMC are "advanced optional" APIs, and not the mandatory basic APIs that all vendors must support
And since we do have a clean way to query capability, why not use it ?

@lguohan
Copy link
Collaborator

lguohan commented Jan 26, 2021

first I agree with you on using sai_query_attribute_capability. second, if we simply add additional check on line 460 to include NOT_IMPLEMENTED, will it work?

@ben-gale
Copy link
Collaborator

Shouldn't we use the SAI_SWITCH_ATTR_SUPPORTED_OBJECT_TYPE_LIST query for SONiC/CRM to see if the object is available in SAI? This seems better than making making per-feature decisions on what is/is not an advanced feature.

In the meantime, we can implement Guohan's suggestion of the additional check (for SAI_STATUS_ATTR_NOT_IMPLEMENTED).

@liat-grozovik
Copy link
Collaborator

@ben-gale probably need to extend the 'unsupported' logic to also include an additional unsupported error code. As can be seen above you need to add ' SAI_STATUS_ATTR_NOT_SUPPORTED_0'. Suggest to check the list on SAI API.
Please note that there is a logic in SWSS unit testing which can simulate the return code and thus verify the SONiC logic. You may check how it is done in the ACL capabilities check.

I would also suggest not to have warning or error for unsupported or not implemented items. This should be logged in notice level (assuming it is printed only once when swss is restarted).

@nazariig
Copy link
Collaborator

@volodymyrsamotiy / @liat-grozovik we still see a lot of warning on latest 202012:

Feb 11 13:15:30.281516 sonic WARNING syncd#SDK: [SAI_UTILS.WARNING] mlnx_sai_utils.c[2003]- check_attribs_metadata: Not supported attribute SAI_SWITCH_ATTR_AVA
ILABLE_IPMC_ENTRY
Feb 11 13:15:30.281516 sonic WARNING syncd#SDK: [SAI_UTILS.WARNING] mlnx_sai_utils.c[2322]- sai_get_attributes: Failed attribs check, key:Switch ID 1
Feb 11 13:15:30.281967 sonic NOTICE swss#orchagent: :- getResAvailableCounters: Switch attribute 60 not supported
Feb 11 13:15:30.287611 sonic WARNING syncd#SDK: [SAI_UTILS.WARNING] mlnx_sai_utils.c[1970]- check_attribs_metadata: Not implemented attribute SAI_SWITCH_ATTR_AVAILABLE_SNAT_ENTRY (vendor data not found)
Feb 11 13:15:30.287611 sonic WARNING syncd#SDK: [SAI_UTILS.WARNING] mlnx_sai_utils.c[2322]- sai_get_attributes: Failed attribs check, key:Switch ID 1
Feb 11 13:15:30.287994 sonic NOTICE swss#orchagent: :- getResAvailableCounters: Switch attribute 61 not supported
Feb 11 13:15:30.293317 sonic WARNING syncd#SDK: [SAI_UTILS.WARNING] mlnx_sai_utils.c[1970]- check_attribs_metadata: Not implemented attribute SAI_SWITCH_ATTR_AVAILABLE_DNAT_ENTRY (vendor data not found)
Feb 11 13:15:30.293317 sonic WARNING syncd#SDK: [SAI_UTILS.WARNING] mlnx_sai_utils.c[2322]- sai_get_attributes: Failed attribs check, key:Switch ID 1
Feb 11 13:15:30.293777 sonic NOTICE swss#orchagent: :- getResAvailableCounters: Switch attribute 62 not supported

@nazariig nazariig reopened this Feb 11, 2021
@nazariig
Copy link
Collaborator

@lguohan / @PrabhuSreenivasan FYI. There is swss crash observed due to last CRM changes:

root@sonic:/# cgdb /usr/bin/orchagent core/orchagent.1612988495.36.core
(gdb) bt
#0  0x00007fba45afc7bb in raise () from /lib/x86_64-linux-gnu/libc.so.6
#1  0x00007fba45ae7535 in abort () from /lib/x86_64-linux-gnu/libc.so.6
#2  0x00007fba45eb1983 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#3  0x00007fba45eb78c6 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#4  0x00007fba45eb7901 in std::terminate() () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#5  0x00007fba45eb7b34 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#6  0x00007fba45eb386b in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00005593e1714d0c in std::map<CrmResourceType, unsigned int, std::less<CrmResourceType>, std::allocator<std::pair<CrmResourceType const, unsigned int> > >::at (__
k=@0x5593e2be8090: 32, this=0x5593e180cdc0 <crmResSaiAvailAttrMap>) at crmorch.cpp:470
#8  CrmOrch::getResAvailableCounters (this=0x5593e2ab32d0) at crmorch.cpp:440
#9  0x00005593e1714d78 in CrmOrch::doTask (this=0x5593e2ab32d0, timer=...) at crmorch.cpp:428
#10 0x00005593e16211ce in OrchDaemon::start (this=this@entry=0x5593e2aacf20) at orchdaemon.cpp:547
#11 0x00005593e15ffc88 in main (argc=<optimized out>, argv=<optimized out>) at main.cpp:410

Codebase:

https://github.com/Azure/sonic-swss/blob/202012/orchagent/crmorch.cpp#L433

void CrmOrch::getResAvailableCounters()
{
    SWSS_LOG_ENTER();

    for (auto &res : m_resourcesMap)
    {
        sai_attribute_t attr;
        attr.id = crmResSaiAvailAttrMap.at(res.first);

        switch (attr.id)
        {
            case SAI_SWITCH_ATTR_AVAILABLE_IPV4_ROUTE_ENTRY:
            case SAI_SWITCH_ATTR_AVAILABLE_IPV6_ROUTE_ENTRY:
            case SAI_SWITCH_ATTR_AVAILABLE_IPV4_NEXTHOP_ENTRY:
            case SAI_SWITCH_ATTR_AVAILABLE_IPV6_NEXTHOP_ENTRY:
            case SAI_SWITCH_ATTR_AVAILABLE_IPV4_NEIGHBOR_ENTRY:
            case SAI_SWITCH_ATTR_AVAILABLE_IPV6_NEIGHBOR_ENTRY:
            case SAI_SWITCH_ATTR_AVAILABLE_NEXT_HOP_GROUP_MEMBER_ENTRY:
            case SAI_SWITCH_ATTR_AVAILABLE_NEXT_HOP_GROUP_ENTRY:
            case SAI_SWITCH_ATTR_AVAILABLE_FDB_ENTRY:
            case SAI_SWITCH_ATTR_AVAILABLE_IPMC_ENTRY:
            case SAI_SWITCH_ATTR_AVAILABLE_SNAT_ENTRY:
            case SAI_SWITCH_ATTR_AVAILABLE_DNAT_ENTRY:
            {
                sai_status_t status = sai_switch_api->get_switch_attribute(gSwitchId, 1, &attr);
                if (status != SAI_STATUS_SUCCESS)
                {
                    if((status == SAI_STATUS_NOT_SUPPORTED) ||
                       (status == SAI_STATUS_NOT_IMPLEMENTED) ||
                       SAI_STATUS_IS_ATTR_NOT_SUPPORTED(status) ||
                       SAI_STATUS_IS_ATTR_NOT_IMPLEMENTED(status))
                    {
                        // remove unsupported resources from map
                        m_resourcesMap.erase(res.first);
                        SWSS_LOG_NOTICE("Switch attribute %u not supported", attr.id);
                        break;
                    }
                    SWSS_LOG_ERROR("Failed to get switch attribute %u , rv:%d", attr.id, status);
                    break;
                }

                res.second.countersMap[CRM_COUNTERS_TABLE_KEY].availableCounter = attr.value.u32;

                break;
            }

https://github.com/Azure/sonic-swss/blob/202012/orchagent/crmorch.cpp#L440

attr.id = crmResSaiAvailAttrMap.at(res.first);

https://github.com/Azure/sonic-swss/blob/202012/orchagent/crmorch.cpp#L470

SWSS_LOG_ERROR("Failed to get switch attribute %u , rv:%d", attr.id, status);

Conclusions:

Seems like the issue is caused by m_resourcesMap state modification while using C++11 range-based loops

@PrabhuSreenivasan
Copy link
Contributor

Hi @nazariig, Understood the problem. Will push a fix.

qiluo-msft pushed a commit to sonic-net/sonic-swss that referenced this issue Feb 15, 2021
Fix sonic-net/sonic-buildimage#6563

**What I did**
Instead of erasing unsupported resource from m_resourcesMap, mark it as unsupported.

**Why I did it**
Erasing an entry while iterating using range based iterator makes the iterator invalid.
Also removing a resource from the m_resourcesMap impacts cross access while iterating other maps (eg crmUsedCntsTableMap).

**How I verified it**
Added a new unsupported resource for testing and verified the logs.
daall pushed a commit to sonic-net/sonic-swss that referenced this issue Feb 16, 2021
Fix sonic-net/sonic-buildimage#6563

**What I did**
Instead of erasing unsupported resource from m_resourcesMap, mark it as unsupported.

**Why I did it**
Erasing an entry while iterating using range based iterator makes the iterator invalid.
Also removing a resource from the m_resourcesMap impacts cross access while iterating other maps (eg crmUsedCntsTableMap).

**How I verified it**
Added a new unsupported resource for testing and verified the logs.
DavidZagury pushed a commit to DavidZagury/sonic-swss that referenced this issue Mar 4, 2021
Fix sonic-net/sonic-buildimage#6563

**What I did**
Instead of erasing unsupported resource from m_resourcesMap, mark it as unsupported.

**Why I did it**
Erasing an entry while iterating using range based iterator makes the iterator invalid.
Also removing a resource from the m_resourcesMap impacts cross access while iterating other maps (eg crmUsedCntsTableMap).

**How I verified it**
Added a new unsupported resource for testing and verified the logs.
raphaelt-nvidia pushed a commit to raphaelt-nvidia/sonic-swss that referenced this issue Oct 5, 2021
Fix sonic-net/sonic-buildimage#6563

**What I did**
Instead of erasing unsupported resource from m_resourcesMap, mark it as unsupported.

**Why I did it**
Erasing an entry while iterating using range based iterator makes the iterator invalid.
Also removing a resource from the m_resourcesMap impacts cross access while iterating other maps (eg crmUsedCntsTableMap).

**How I verified it**
Added a new unsupported resource for testing and verified the logs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
7 participants