Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[HLD] SONiC ConfigDB support Yang default values and profiles. #989

Open
wants to merge 31 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from 27 commits
Commits
Show all changes
31 commits
Select commit Hold shift + click to select a range
89fef3e
Add TACACS+ protocol requirement document.
liuh-80 Jul 9, 2021
5e3fc17
Improve document.
liuh-80 Jul 9, 2021
d1805c6
Update document according to PR comments.
liuh-80 Jul 13, 2021
f3c8020
Improve document.
liuh-80 Jul 15, 2021
bf1e468
Update requirement document, remove some authorization limitation bec…
liuh-80 Jul 20, 2021
e060412
Save document change.
liuh-80 Jul 27, 2021
8ce6293
Improve design document.
liuh-80 Jul 28, 2021
39e1376
Update design document.
liuh-80 Jul 29, 2021
ecc6f3e
Fix section numbers.
liuh-80 Jul 29, 2021
ef48e3d
Improve document.
liuh-80 Jul 29, 2021
241e35d
Update document according to EOS AAA behavior.
liuh-80 Aug 4, 2021
dc8e5e0
Improve document.
liuh-80 Aug 6, 2021
e06e901
Fix typo
liuh-80 Aug 18, 2021
e8da9bc
Improve document according to review feedback.
liuh-80 Aug 19, 2021
4cf0581
Improve document according to community review.
liuh-80 Sep 22, 2021
1bec8fc
Improve document according review comments.
liuh-80 Sep 22, 2021
4d1660c
Fix PR comments.
liuh-80 Sep 22, 2021
15bd52c
Update design document.
liuh-80 Oct 12, 2021
92ea017
Add comments for AAA table 'login' attribute name issue.
liuh-80 Oct 19, 2021
f6f62ac
Merge branch 'Azure:master' into master
liuh-80 Dec 2, 2021
6942286
Merge branch 'sonic-net:master' into master
liuh-80 Jun 15, 2022
2829484
Merge branch 'sonic-net:master' into master
liuh-80 Jul 22, 2022
25734fc
Merge branch 'sonic-net:master' into master
liuh-80 Aug 30, 2022
8de3a60
Add design document for "ConfigDB support Yang default value and prof…
liuh-80 Aug 30, 2022
3b6efc4
Fix typo
liuh-80 Aug 30, 2022
ba3790a
Improve reboot handling
liuh-80 Aug 31, 2022
ece1aa7
Improve migration plan
liuh-80 Aug 31, 2022
3838154
Remove a unnecessary class from design document
liuh-80 Sep 5, 2022
8046143
Improve design doc and yang model
liuh-80 Oct 12, 2022
993a279
Merge branch 'dev/liuh/configdb_default_value' of https://github.com/…
liuh-80 Oct 17, 2022
88d7749
Add more example
liuh-80 Oct 17, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
392 changes: 392 additions & 0 deletions doc/mgmt/SONiC_ConfigDB_support_Yang_default_values_and_profiles.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,392 @@
# SONiC ConfigDB support Yang default values and profiles

## Table of Contents

- [SONiC ConfigDB support Yang default values and profiles](#sonic-configdb-support-yang-default-values-and-profiles)
- [Table of Contents](#table-of-contents)
- [About this Manual](#about-this-manual)
- [Terminologies](#terminologies)
- [Problem Statement](#problem-statement)
- [1 Functional Requirement](#1-functional-requirement)
- [1.1 swss-common return default value from Yang model](#11-swss-common-return-default-value-from-yang-model)
- [1.2 swss-common return profile from profile DB](#12-swss-common-return-profile-from-profile-db)
- [2 Design](#2-design)
- [Current design:](#current-design)
- [2.1 Considerations](#21-considerations)
- [How to get default value](#how-to-get-default-value)
- [How to get profile](#how-to-get-profile)
- [API compatibility](#api-compatibility)
- [2.2 Other solutions for Yang model default value](#22-other-solutions-for-yang-model-default-value)
- [2.3 New class](#23-new-class)
- [2.4 Other code change](#24-other-code-change)
- [2.5 Database Schema](#25-database-schema)
- [2.6 Code example](#26-code-example)
- [3 Reboot](#3-reboot)
- [3.1 Cold-reboot](#31-cold-reboot)
- [3.2 Warm-reboot](#31-warm-reboot)
- [3.3 Fast-reboot](#31-fast-reboot)
- [3.4 Schema upgrade and DB migration](#34-schema-upgrade-and-db-migration)
- [4 Error handling](#4-error-handling)
- [5 Serviceability and Debug](#5-serviceability-and-debug)
- [6 Unit Test](#6-unit-test)
- [7 Migration steps](#7-migration-steps)
- [7.1 Phase 1](#71-phase-1)
- [7.2 Phase 2](#72-phase-2)
- [8 References](#8-references)
- [SONiC YANG MODEL GUIDELINES](#sonic-yang-model-guidelines)

# About this Manual

This document provides a detailed description on the new features for:

- Yang model default value.
- Profile DB.
- swss-common API change.

## Terminologies

- Yang model:

- The Yang model define a hierarchical data structure.
- SONiC Define config DB schema with Yang model, please refer to [SONiC YANG MODEL GUIDELINES](#7-1-sonic-yang-model-guidelines)

- Default value:

- Yang model support define default value for configuration items.
- For example, default value for nat_zone:

```
leaf nat_zone {
description "NAT Zone for the vlan interface";
type uint8 {
range "0..3" {
error-message "Invalid nat zone for the vlan interface.";
error-app-tag nat-zone-invalid;
}
}
default "0";
}
```

- J2 template:

- SONiC using Jinja2 template generate configuration files during deply minigraph.

- For example, buffer config are rendered by J2 template:

```
{%- set default_cable = '40m' %}
{%- macro generate_buffer_pool_and_profiles() %}
"BUFFER_POOL": {
"ingress_lossless_pool": {
"size": "26531072",
"type": "ingress",
"mode": "dynamic",
"xoff": "6291456"
},
"egress_lossless_pool": {
"size": "32822528",
"type": "egress",
"mode": "static"
}
},

......

{%- endmacro %}
```

- Profile

- Profile is part of OS image, should upgrade with OS upgrade.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the profile here represent a generic concept? It is a bit confusing because there is a similar word buffer profile...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 'profile' is a common concept, profile DB will not only be used by buffer config.

- Profile are suggested values based on expirence.
- User can overwrite profile.

## Problem Statement

- SONiC still using old default value and config from j2 template after OS upgrade:
- Following config may update after OS upgrade:
- Default value: defined in yang model.
- Profile: defined in j2 template.
- Currently all config stored in config DB, so above configs can't be update after OS upgrade.
- Potential risk, Yang model default value conflict with hardcoded value:
- Default value hardcoded in source code.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to double check, did you mean the hardcoded value in the templates?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, it's ins the source code, for example in hostcfgd, when read config from config DB, if some item does not exist in config DB, hostcfgd will use default values hardcoded in it's code.

- Yang model default value does not be used.
- SONiC utilities not support get user config and all config:
- Vender OS have different show command:
- show running: only return user config.
- show running all: return user config, default value from yang model, and config from j2 template.
- Currently SONiC only support 'show running'
- DB migrator has complex logic and hardcoded config data:
- DB migrator will be run in post startup action.
- DB migrator complex logic and hardcoded config for migrate from every historical version to latest version. This will keep increase and difficult to maintance.

# 1 Functional Requirement

## 1.1 swss-common return default value from Yang model

- Return default value is optional.
- Application can read config without default value, also can read config with default value.
- Backward compatibility with existing code and applications.

## 1.2 swss-common return profile from profile DB

- Buffer profile stored in profile tables.

- Return profile is optional.

- Application can decide read config with profile or not:
- When application read profile:
- If user overwrite the profile, user config will be return.
- If user not overwrite the profile, profile will be return.
- If user 'delete' the profile, will return nothing.
- When application not read profile, API will only return user config.
- If user overwrite the profile, user config will be return.
- If user not overwrite the profile, will return nothing.
- If user 'delete' the profile, then there also will no user config exist, will return nothing.

- Backward compatibility with existing code and applications.

- For backward compatibility when initialize buffer config from minigraph, config will be write to both config tables and profile tables.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to add terminology to explain config tables and profile tables? It's hard to understand ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, add more detail and change this to config DB and profile DB.

- After all code migrate to use profile DB, profile will only write to profile tables.
- Profile support delete/revert operation:
- In some user scenario, user need delete a profile, and also may add config back later.
- For delete operation, data in profile table will not be delete, profile use PROFILE_DELETE table to handle delete/revert:
- When delete a profile item, profile provider will add the item key to PROFILE_DELETE table.
- When read profile, any key in PROFILE_DELETE will not exist in result.
- When user set deleted item back, the item key will be remove from PROFILE_DELETE table.
- For example:
- BUFFER_POOL table are profile:

```
"BUFFER_POOL": {
"ingress_lossless_pool": {
"size": "26531072",
"type": "ingress",
"mode": "dynamic",
"xoff": "6291456"
},
"egress_lossless_pool": {
"size": "32822528",
"type": "egress",
"mode": "static"
}
}
```
- On mellanox devices, when set buffer mode to 'dynamic' mode, buffer manager will use swsscommon API to delete buffer pool to reclaim unused buffer resource.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just FYI, on Mellanox devices, no buffer pool or buffer profile will be removed to reclaim unused buffers.
But it's possible that a user decides to remove some buffer pool, eg.

  • to migrate from double-ingress-pool mode to single-ingress-pool mode
  • or to execute config qos clear

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, my mistake, the buffermgrd are deleting items in appldb and statedb. config DB keeps no change.

- When delete a profile item 'ingress_lossless_pool', the key of this item will be write to PROFILE_DELETE table. and swsscommon read API will not return this item anymore.
- Revert operation: when user set buffer mode back, buffer manager will write the deleted item back, when this happen, the key of the item will be removed from PROFILE_DELETE table. and swsscommon read API will return this item.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be more precise,

  • if the content user writes back is the same as that in the profile table, removing the key from PROFILE_DELETE table suffice
  • otherwise, the content should also be written to config table

Correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not correct, when the content writes back is same with item with profile table, following happen:

  1. Removing the key from PROFILE_DELETE table
  2. Write content to config table, this is because the 'SET' operation is from user, so it will be a user config, even it exactly same with the data in profile DB.


# 2 Design

- Config DB design diagram:

<img src="./images/swss-common-layer.png" />

- API design diagram:

<img src="./images/swss-common-default-value.png" />

### Current design:

- Existing API keeps no change.
- Add decorator API to return default value and profile data.
- Load yang model as lazy as possible.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we can cache the parsed yang model information somewhere globally so that we just need to parse the yang model once and can use them many times.
But I'm not sure whether it is worth doing so since caching/storing it also takes time.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed by add more detail, current design is to only load once and cache it.


## 2.1 Considerations

### How to get default value

| | Pros | Cons |
| ------------------------------------------------------------- | ------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------- |
| Get default value from Yang model in read API. | Redis config DB keeps no change. | 3 MB memory per-process because need load Yang model and reference libyang.<br>50ms to load yang model.<br>8ms to read default value 10000 times. |
| Write default value to default value DB when write config DB. | Better read performance, Less memory consumption. | Need add new Redis DB for default value. |

### How to get profile

- Profile will stored in a new redis database 'PROFILE_DB', database index is 15.
- Profile DB tables will have exactly same name and schema with config DB tables.
- Data will read form profile DB with swsscommon API.
- Profile DB will save and persist for warm-reboot and fast-reboot.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if user perform a SONiC-SONiC upgrade?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When upgrade from old version of SONiC which does not support this feature to new version:

  1. The config DB will keep no change, all data in config DB profile table will be treat as user config.
  2. Profile DB will be created, and DB migrator will initialize Profile DB.

So, from user aspect:

  1. all existing buffer pool/profile keeps no change.
  2. There maybe new buffer pool/profile exist, which introduced by the new version of OS.

I will update this to reboot part.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed


### API compatibility

| | Pros | Cons |
| ----------------------------------------------- | ------------------------------------------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------- |
| Change API to return default value and profile. | Less code change, all app will get default value and profile automatically. | For default value, there are hardcoded default value may different with Yang model, new default value from config DB may cause code bug. |
| Existing API keeps no change. | When update existing code, can cleanup code to remove hard coded default value. | All apps need code update. |

## 2.2 Other solutions for Yang model default value

| | Pros | Cons |
| ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| 1. All existing APIs change to return default value.<br>2. Add new API to get 'real' data from config DB, which not have default value. | Less code change, all app will get default value automatically. | 1. There are hardcoded default value in many different place, the default value of those code may different with default value from Yang model, so new default value from config DB may cause code bug, this is a potential risk.<br/>2. 3 MB memory per-process because need load Yang model.<br/>3. 0.05 second to load yang model |
| 1. Write API change: when write data to config DB, also write default value to 'Default_value_DB'.<br/>2. Read API change: read default value from 'Default_value_DB' and merge with config DB result. | 1. Less memory consumption and better performance when only call read API: read API no need to load yang model.<br/>2. Less code change, all app will get default value automatically. | Hardcoded default value code still need cleanup. |

## 2.3 New class

- YangModelLoader class

- load table name to default value mapping to memory.

- DefaultValueProvider class

- Find default value information by table name and config DB key
- Merge default value to API result.

- ProfileProvider class

- Read profile from profile DB.
- Merge profile to API result.

- YangDefaultDecorator python class

- DecoratorTable c++ class

- DecoratorSubscriberStateTable c++ class

## 2.4 Other code change

- Add new methods to TableEntryEnumerable interface:
- virtual bool hget(const std::string &key, const std::string &field, std::string &value) = 0;

## 2.5 Database Schema

- All profile table will have exactly same schema with existing ConfigDB tables.

- BUFFER_POOL
- BUFFER_PROFILE

- PROFILE_DELETE Table:
- For usage of this table, please refer to [Backward compatibility with existing code and applications.](#backward-compatibility-with-existing-code-and-applications)
- This table is a config DB table, it will presist cross reboot. When re-render config, this table will be flushed as well as other config DB tables.
- Yang model: [link](./sonic-profile-delete.yang "profile delete table")
```
; Key
itemkey = 1*256VCHAR ; Deleted profile item key.
```

## 2.6 Code example

- Connector Decorator:

```
from swsscommon.swsscommon import SonicV2Connector, ConfigDBConnector, YangDefaultDecorator

conn = ConfigDBConnector()
decorator = YangDefaultDecorator(conn)
decorator.connect()
decorator.get_table("VLAN_INTERFACE")
decorator.get_entry("VLAN_INTERFACE", "Vlan1000")
decorator.get_config()
```

- DecoratorTable:

```
from swsscommon.swsscommon import DBConnector, Table, DecoratorTable

db = DBConnector("CONFIG_DB", 0)
# Still can use Table to read user config:
# table = Table(db, 'VLAN_INTERFACE')
# Use DecoratorTable to read default value, profile and use config:
table = DecoratorTable(db, 'VLAN_INTERFACE')
table.get("Vlan1000")
```

# 3 Reboot

-

## 3.1 Cold-reboot

- Code-reboot will reload minigraph, profile DB will re-render in reload minigraph.
- sonic-cfggen and sonic-utility will update to support generate profile DB during reload minigraph.

## 3.2 Warm-reboot

- Profile DB will follow SONiC warm-reboot process, warm-reboot will save whole Redis DB to /host/warmboot/dump.rdb, please refer to [SONiC YANG MODEL GUIDELINES](#7-1-sonic-yang-model-guidelines)
- DB migrator will run after warm reboot, will handle OS upgrade in DB migrator.

## 3.3 Fast-reboot

- Fast-reboot will reload minigraph, profile DB will re-render in reload minigraph.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

AFAIK, minigraph won't be reloaded during fast reboot. It should read the config_db. no?
Same as cold reboot.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, fixed, I check the reboot code and fix the mistake here.

- sonic-cfggen and sonic-utility will update to support generate profile DB during reload minigraph.

## 3.4 Schema upgrade and DB migration

- DB migrator will re-render profile DB to handle schema change.
- Profile DB will generate by sonic-cfggen command.
- DB migrator code will improve by deprate hardcoded config and complex upgrade logic.
- Some profile configuration change are disruptive, for example buffer config change,those change need fast-reboot to make sure change refelected on ASIC.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO it should also work to do a warm reboot to upgrade the schema

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, the code/warm/fast reboot will use DB migrator to upgrade schema.


# 4 Error handling

- Load yang model: throw exception when found yang model data issue.
- swss-common API: if not found Yang model schema data for a given table name, write warning message to syslog.

# 5 Serviceability and Debug

- Debug version will write debug log to syslog.

# 6 Unit Test

- All new code will 100% covered by gtest or pytest test case.

# 7 Migration steps

## 7.1 Phase 1

- swss common API change:

- support Yang model default value.
- support profile value.

- Profile DB code change:

- sonic-cfggen change to support generate profile to PROFILE_DB.
- sonic-util change to support generate profile when load minigraph.
- for backward compatibility, config tables still generate to CONFIG_DB.

## 7.2 Phase 2

- Update sonic cli.

- Add 'show all config' command, which will show following config:
- User config from config DB.
- Yang model default value.
- Profile.
- Add 'show user config' command, which will only show user config.
- For backward compatibility, 'show config' will mapping to 'show all config'.
- This will improve user expirence for debug config related issue.

- Find out all projects need update by code scan:

- Any project using swsssdk.
- Any project using swss common c++ lib.
- Any project using swss common python lib.

- Involve project owner to migrate to new API.

- If project still using swsssdk, then switch to swsscommon with new API.
- When migrate to new API, also clean up hardcoded default values.
- Fix code in buffer manager for a special case for dynamic buffer profile.

- Buffer manager change to use profile table class.

- After this, sonic-cfggen and sonic-util change to not generate profile to CONFIG_DB.

- Improve DB migrator.

- Re-generate profile DB when OS version changed after warm-reboot.
- Deprecate hardcoded profile configuration and profile migrate logic.

# 8 References

## SONiC YANG MODEL GUIDELINES

https://github.com/Azure/SONiC/blob/master/doc/mgmt/SONiC_YANG_Model_Guidelines.md

## System-wide Warmboot

https://github.com/sonic-net/SONiC/blob/9d800b1af99c7d2f22c234f1a17224a4de112c93/doc/warm-reboot/system-warmboot.md
Binary file added doc/mgmt/images/swss-common-default-value.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added doc/mgmt/images/swss-common-layer.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Loading