-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Barefoot dtel design #182
Barefoot dtel design #182
Conversation
|
||
|
||
## Testing | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we also need virtual switch test to test pure control plan functionality.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comments! I will add a VS test as well. Also, will create a separate test plan document and upload it as part of this PR
|
||
## euclid | ||
|
||
euclid (Euclid Universal Configuration LIbrary for Dataplane telemetry) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this a open source project, does it have a website?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No. But, we intend to upstream SONiC specific changes as part of this effort
* Value: | ||
* Report session OID | ||
* Reference count | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I want to follow up with the reference count discussion we had during the meeting:
In meeting, the reference count usage was described as:
- When reference count is not zero, the object is protected from deletion.
- When reference count drops to zero, a delete request will free the object.
I have concern for this approach. The danger is that a delete call at the right time is required to free the object. So the caller of delete will have to check if the object has been freed or denied of deletion. This makes code logic complicated and encourages opportunities for either memory leak or double free.
I think a safe approach to use reference count is following:
- An object get reference count of 1 at creation time.
- A delete request reduces the reference count by 1.
- An association with another entity increases the reference count by 1.
- Detaching from another entity decrease the reference count by 1.
- Object is freed when the reference count drops to 0.
The advantage of this approach is that the delete can be called whenever the code decide to free. And the object will be freed when all the associations are gone. This makes code logical simpler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comment! This makes sense, I will change the design accordingly
|
||
Components of SONiC that will be modified or newly added are discussed in the following sub-sections. | ||
|
||
## Config DB |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think you should explain why you decided to have the A,B,C, etc in the table name as this is strange.
as discussed over the design review we should have better way in SONiC to force an order.
as for naming convention i beleive table has been removed once we moved to config_db. please double check
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback! Regarding using tables in config_db, if we remove them, we run into the problem where events are re-ordered upon config replay. Since, DTel has specific dependency requirements between events, can we keep the tables until SONiC has better infrastructure to control event ordering (within an orchagent)? Another option is to create two separate orch agents, one to handle pre-requisite config and another for dependent config, and order the init of these orch agents.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the feedback! Regarding using tables in config_db, if we remove them, we run into the problem where events are re-ordered upon config replay. Since, DTel has specific dependency requirements between events, can we keep the tables until SONiC has better infrastructure to control event ordering (within an orchagent)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@liatgrozovik - i think the question is whether this feature should bear the burden with resolving an issue that exists in sonic for all features? In spirit of making progress, we can have a short term solution as we did right now and adhere to long term solution when its designed properly and reviewed by community. Clearly issue of dependency / order of operations is something to be resolved for all features.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@liatgrozovik - @lguohan was making a few other suggestions to us on this topic, @shruthi9 can give summary on what she will implement
|
||
Most events are handled as shown in the sequence diagram below. Ones which are different are depicted in the following figures. | ||
|
||
![Control flow](cflow-1.jpg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to add handling of port stat events and handle cases where ports are not yet configured.
IMO the request should be queued and replayed when the port configuration is ready (port or lag).
1. At least one drop watchlist with drop report enabled | ||
|
||
**Queue report specific:** | ||
1. Reporting enabled on at least one queue with some threshold (latency or depth) set or tail drop enabled. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please note that we have both VS testing as well as ACL testbed which covers ACL and everlfow.
you will need to make sure those are not broken plus add more test cases to cover this.
i don't see any reference to what will happen if a user will try to define an ACL which is not supported.
As there is an ACL HLD already with flows, you should review it and probably update it with all the ACL changes/ additionals. I IMO this HLD should covers details on AC and flowsL. We should have one place and a reference from here.
``` | ||
|
||
|
||
## Testing |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
VirtualSwitch (a.k.a) VS testing should covers all swss parts including ACL changes as well as the new Dtel.
On top of that we have for each feature a testbed test which should cover system level tests and full functional validation with a real hw. this one should have dedicated test plan document and should be reviewed by community.
you have few examples of those in the github: crm, acl, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will create a separate test plan document and upload it as part of this PR.
Does anyone can tell me where is te euclid tool ? it is mentioned in dtel design spec, also where is the test files? |
Does anyone knows how to use euclid tool ? |
Design document for Barefoot Dataplane Telemetry support in SONiC.