Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HLD] SONiC ConfigDB support Yang default values and profiles. #989

Open
wants to merge 31 commits into
base: master
Choose a base branch
from

Conversation

liuh-80
Copy link
Contributor

@liuh-80 liuh-80 commented Apr 22, 2022

Add HLD for SONiC config DB support load Yang model default value and profiles.

Here are PRs related with this HLD:

Repo PR Title State
sonic-buildimage Add new Redis database PROFILE_DB. GitHub issue/pull request detail
sonic-swss-common Add ProfileProvider class to support read profile config from PROFILE_DB. GitHub issue/pull request detail
sonic-swss-common Modify azure pipeline to install libyang in azure pipeline steps. GitHub issue/pull request detail
sonic-swss-common Load Yang model default value to DefaultValueProvider. GitHub issue/pull request detail
sonic-swss-common Add decorator for profile config. GitHub issue/pull request detail
sonic-swss-common Add decorator for Yang default value. GitHub issue/pull request detail
sonic-buildimage Update sonic-cfggen to support generate profile config. GitHub issue/pull request detail
sonic-utilities Import buffer config to Profile DB when load minigraph. GitHub issue/pull request detail

@stephenxs
Copy link
Collaborator

stephenxs commented Jun 28, 2022

I was just wondering whether it's possible to introduce a way to "shadow" the fields that existed in STATIC_CONFIG_DB but are removed from CONFIG_DB by the user. Maybe we can refer to how it is handled in APPL_DB: introduce a set to store the items, and fields that are deleted by the user.
By doing so, we do not have to rule out a table from STATIC_CONFIG_DB only because the table can be modified by a user.
I think it's better to decide whether a table is eligible to be in STATIC_CONFIG_DB based on the meaning of the table instead of the possibility to be modified.

@liuh-80 liuh-80 changed the title [POC] swsscommon default value from yang model [POC] Overlay config DB. Jul 13, 2022
@liuh-80 liuh-80 requested a review from neethajohn July 13, 2022 05:45
@liuh-80
Copy link
Contributor Author

liuh-80 commented Jul 13, 2022

I was just wondering whether it's possible to introduce a way to "shadow" the fields that existed in STATIC_CONFIG_DB but are removed from CONFIG_DB by the user. Maybe we can refer to how it is handled in APPL_DB: introduce a set to store the items, and fields that are deleted by the user. By doing so, we do not have to rule out a table from STATIC_CONFIG_DB only because the table can be modified by a user. I think it's better to decide whether a table is eligible to be in STATIC_CONFIG_DB based on the meaning of the table instead of the possibility to be modified.

Currently there are 2 scenario about delete:

  1. Xoff can be deleted/reset by user.
  2. Mellanox may have multiple pool config in J2 template, and will delete pool config.

For 1, Delete means change database schema, maybe we should use Xoff = 0 or Xoff=-1 instead of delete.

For 2, still not quite undertsand the scenario.

Will setup a offline discussion for this.

@liuh-80 liuh-80 requested a review from qiluo-msft July 13, 2022 05:50
@liuh-80 liuh-80 changed the title [POC] Overlay config DB. [HLD] SONiC overlay config DB. Jul 13, 2022
@stephenxs
Copy link
Collaborator

I was just wondering whether it's possible to introduce a way to "shadow" the fields that existed in STATIC_CONFIG_DB but are removed from CONFIG_DB by the user. Maybe we can refer to how it is handled in APPL_DB: introduce a set to store the items, and fields that are deleted by the user. By doing so, we do not have to rule out a table from STATIC_CONFIG_DB only because the table can be modified by a user. I think it's better to decide whether a table is eligible to be in STATIC_CONFIG_DB based on the meaning of the table instead of the possibility to be modified.

Currently there are 2 scenario about delete:

  1. Xoff can be deleted/reset by user.
  2. Mellanox may have multiple pool config in J2 template, and will delete pool config.

For 1, Delete means change database schema, maybe we should use Xoff = 0 or Xoff=-1 instead of delete.

For 2, still not quite undertsand the scenario.

Will setup a offline discussion for this.

For xoff, it's OK to disable it via setting it to 0.
For deleting a pool, a possible example is to remove the lossless pools when a device is deployed in a lossy-traffic-only scenario.

Anyway, now that we are introducing a new mechanism to support multiple overlay DB, I believe it should not be restricted in the buffer area. Probably the requirement is also applicable to other features.

- Merge default value to API result.

## 2.3 Other code change
- Following method will change to get default value from DefaultValueProvider
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API design changed, will update this later.

- Merge default value to API result.

## 2.3 Other code change
- Following method will change to get default value from DefaultValueProvider
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

API design changed, will update this later.


| | Pros | Cons |
| ------------------------------------------------------------- | ------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
| Get default value from Yang model in read API. | Redis config DB keeps no change. | 3 MB memory per-process because need load Yang model and reference libyang.<br>50ms to load yang model.<br>8ms to read default value 10000 times. |
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, currently the load and mapping only happened when first time read default values.
Will update design document with this.


## 2.2 New class

- YangModelLoader class
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's a new C++ class, and it's a internal class, not visible to user.

doc/mgmt/SONiC_overlay_ConfigDB.md Outdated Show resolved Hide resolved
- For delete operation, data in static config table will not be delete, profile underlay use PROFILE_DELETE table to handle delete/revert:
- When delete a static config item, profile underlay will add the item key to PROFILE_DELETE table.
- When read static config, any key in PROFILE_DELETE will not exist in result.
- When user set deleted item back, the item key will be remove from PROFILE_DELETE table.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

remove

typo: removed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed

}
}
```
- On mellanox devices, when set buffer mode to 'dynamic' mode, buffer manager will use swsscommon API to delete buffer pool to reclam unused buffer resource.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

reclam

typo

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed


### How to get static config

- Static config will stored in a new redis database 'PROFILE_DB', database index is 15.
Copy link
Contributor

@qiluo-msft qiluo-msft Aug 24, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

15

Some UT is using 15 as TEST_DB. Please be aware for any conflict risk. #Pending

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will test this issue with this PR: sonic-net/sonic-swss-common#625
Will update here after all UT passed.

@liuh-80 liuh-80 changed the title [HLD] SONiC overlay config DB. [HLD] SONiC ConfigDB support Yang default values and profiles. Aug 29, 2022
@liuh-80
Copy link
Contributor Author

liuh-80 commented Aug 29, 2022

/easycla

2 similar comments
@liuh-80
Copy link
Contributor Author

liuh-80 commented Aug 29, 2022

/easycla

@liuh-80
Copy link
Contributor Author

liuh-80 commented Aug 29, 2022

/easycla

@liuh-80 liuh-80 marked this pull request as ready for review August 29, 2022 09:31
@liuh-80
Copy link
Contributor Author

liuh-80 commented Aug 29, 2022

/easycla

@liuh-80 liuh-80 force-pushed the dev/liuh/configdb_default_value branch from e4b10a6 to 8de3a60 Compare August 30, 2022 02:21
@liuh-80
Copy link
Contributor Author

liuh-80 commented Aug 30, 2022

Fix EasyCLA issue by merge all change to one and do a force push.

@liuh-80 liuh-80 requested a review from lguohan August 30, 2022 05:50
@liuh-80 liuh-80 force-pushed the dev/liuh/configdb_default_value branch from 3838154 to ece1aa7 Compare September 6, 2022 01:18

- Profile

- Profile is part of OS image, should upgrade with OS upgrade.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the profile here represent a generic concept? It is a bit confusing because there is a similar word buffer profile...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 'profile' is a common concept, profile DB will not only be used by buffer config.


- Backward compatibility with existing code and applications.

- For backward compatibility when initialize buffer config from minigraph, config will be write to both config tables and profile tables.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to add terminology to explain config tables and profile tables? It's hard to understand ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, add more detail and change this to config DB and profile DB.

```
- On mellanox devices, when set buffer mode to 'dynamic' mode, buffer manager will use swsscommon API to delete buffer pool to reclaim unused buffer resource.
- When delete a profile item 'ingress_lossless_pool', the key of this item will be write to PROFILE_DELETE table. and swsscommon read API will not return this item anymore.
- Revert operation: when user set buffer mode back, buffer manager will write the deleted item back, when this happen, the key of the item will be removed from PROFILE_DELETE table. and swsscommon read API will return this item.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be more precise,

  • if the content user writes back is the same as that in the profile table, removing the key from PROFILE_DELETE table suffice
  • otherwise, the content should also be written to config table

Correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not correct, when the content writes back is same with item with profile table, following happen:

  1. Removing the key from PROFILE_DELETE table
  2. Write content to config table, this is because the 'SET' operation is from user, so it will be a user config, even it exactly same with the data in profile DB.

- Profile will stored in a new redis database 'PROFILE_DB', database index is 15.
- Profile DB tables will have exactly same name and schema with config DB tables.
- Data will read form profile DB with swsscommon API.
- Profile DB will save and persist for warm-reboot and fast-reboot.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if user perform a SONiC-SONiC upgrade?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When upgrade from old version of SONiC which does not support this feature to new version:

  1. The config DB will keep no change, all data in config DB profile table will be treat as user config.
  2. Profile DB will be created, and DB migrator will initialize Profile DB.

So, from user aspect:

  1. all existing buffer pool/profile keeps no change.
  2. There maybe new buffer pool/profile exist, which introduced by the new version of OS.

I will update this to reboot part.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed


## 3.3 Fast-reboot

- Fast-reboot will reload minigraph, profile DB will re-render in reload minigraph.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, minigraph won't be reloaded during fast reboot. It should read the config_db. no?
Same as cold reboot.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed, I check the reboot code and fix the mistake here.

}
}
```
- On mellanox devices, when set buffer mode to 'dynamic' mode, buffer manager will use swsscommon API to delete buffer pool to reclaim unused buffer resource.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just FYI, on Mellanox devices, no buffer pool or buffer profile will be removed to reclaim unused buffers.
But it's possible that a user decides to remove some buffer pool, eg.

  • to migrate from double-ingress-pool mode to single-ingress-pool mode
  • or to execute config qos clear

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, my mistake, the buffermgrd are deleting items in appldb and statedb. config DB keeps no change.

- Profile: defined in j2 template.
- Currently all config stored in config DB, so above configs can't be update after OS upgrade.
- Potential risk, Yang model default value conflict with hardcoded value:
- Default value hardcoded in source code.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to double check, did you mean the hardcoded value in the templates?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's ins the source code, for example in hostcfgd, when read config from config DB, if some item does not exist in config DB, hostcfgd will use default values hardcoded in it's code.

- DB migrator will re-render profile DB to handle schema change.
- Profile DB will generate by sonic-cfggen command.
- DB migrator code will improve by deprate hardcoded config and complex upgrade logic.
- Some profile configuration change are disruptive, for example buffer config change,those change need fast-reboot to make sure change refelected on ASIC.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it should also work to do a warm reboot to upgrade the schema

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, the code/warm/fast reboot will use DB migrator to upgrade schema.


- Existing API keeps no change.
- Add decorator API to return default value and profile data.
- Load yang model as lazy as possible.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can cache the parsed yang model information somewhere globally so that we just need to parse the yang model once and can use them many times.
But I'm not sure whether it is worth doing so since caching/storing it also takes time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed by add more detail, current design is to only load once and cache it.

}

organization
"BRCM";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change BRCM to SONiC.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

list PROFILE_DELETE {
key "itemkey";
scommon:key-delim "|";
scommon:key-pattern "PROFILE_DELETE_TABLE|{itemkey}";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We dont need key-delim and key-pattern, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, 2 lines removed.

}

container sonic-profile-delete {
scommon:db-name "CONFIG_DB";
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Config-DB is by default, we don't need it here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@liuh-80
Copy link
Contributor Author

liuh-80 commented Oct 12, 2022

@stephenxs @venkatmahalingam , thanks for your comments, I have updated the HLD and fix it, please check.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants