Skip to content

Commit

Permalink
Merge pull request #1132 from lolyu/aa_dualtor_traffic_forwarding_design
Browse files Browse the repository at this point in the history
This is to add the traffic forwarding section to the active-active dualtor HLD.

1. normal traffic forwarding behavior.
2. special handling for control plane traffic to NiC IP addresses.

Signed-off-by: Longxiang Lyu [email protected]
  • Loading branch information
lolyu authored Dec 7, 2022
2 parents 3223866 + 0d97d96 commit 3d1efbe
Show file tree
Hide file tree
Showing 4 changed files with 36 additions and 8 deletions.
44 changes: 36 additions & 8 deletions doc/dualtor/active_active_hld.md
Original file line number Diff line number Diff line change
Expand Up @@ -4,9 +4,10 @@ Active-active dual ToR link manager is an evolution of active-standby dual ToR l

## Revision

| Rev | Date | Author | Change Description |
|:---:|:--------:|:---------------:|--------------------|
| 0.1 | 05/23/22 | Jing Zhang | Initial version |
| Rev | Date | Author | Change Description |
| :---: | :------: | :-----------: | ------------------------------ |
| 0.1 | 05/23/22 | Jing Zhang | Initial version |
| 0.2 | 12/02/22 | Longxiang Lyu | Add Traffic Forwarding section |

## Scope
This document provides the high level design of SONiC dual toR solution, supporting active-active setup.
Expand Down Expand Up @@ -41,8 +42,11 @@ This document provides the high level design of SONiC dual toR solution, support
- [3.5 Transceiver Daemon](#35-transceiver-daemon)
- [3.5.1 Cable Control through gRPC](#351-cable-control-through-grpc)
- [3.6 State Transition Flow](#36-state-transition-flow)
- [3.7 Further Enhancement](#37-further-enhancement)
- [3.8 Command Line](#38-command-line)
- [3.7 Traffic Forwarding](#37-traffic-forwarding)
- [3.7.1 Special Cases of Traffic Forwarding](#371-special-cases-of-traffic-forwarding)
- [3.7.1.1 gRPC Traffic to the NiC IP](#3711-grpc-traffic-to-the-nic-ip)
- [3.8 Further Enhancement](#38-further-enhancement)
- [3.9 Command Line](#39-command-line)

[4 Warm Reboot Support](#4-warm-reboot-support)

Expand Down Expand Up @@ -427,11 +431,35 @@ MuxOrch will listen to state changes from linkmgrd and does the following at a h
1. Shutdown / restart notification from SoC to ToR.
### 3.6 State Transition Flow
The following UML sequence illustrates the state transtion when linkmgrd state moves to active. The flow will be similar for moving to standby.
The following UML sequence illustrates the state transition when linkmgrd state moves to active. The flow will be similar for moving to standby.
![state transition flow](./image/state_transition_flow.png)
### 3.7 Further Enhancement
### 3.7 Traffic Forwarding
The following shows the traffic forwarding behaviors:
* both ToRs are active.
* one ToR is active while the another ToR is standby.
![Traffic Forwarding](./image/traffic_forwarding.png)
#### 3.7.1 Special Cases of Traffic Forwarding
##### 3.7.1.1 gRPC Traffic to the NiC IP
There is a scenario that, if the upper ToR enters standby when its peer(the lower ToR) is already in standby state, all downstream I/O from ToR A will be forwarded through the tunnel to the peer ToR(the lower ToR), so does the control plane gRPC traffic from the transceiver daemon. As the lower ToR is in standby, those tunneled I/O will be blackholed, the NiC will never know that the upper ToR has entered standby in this case.
To solve this issue, we want the control plane gRPC traffic from the transceiver daemon to be forwarded directly via the local devices. This is to differentiate the control plane traffic to the NiC IPs from dataplane traffic that its forwarding behavior honors the mux state and be forwarded to the peer active ToR via the tunnel when the port comes to standby.
The following shows the traffic forwarding behavior when the lower ToR is active while the upper ToR is standby. Now, gRPC traffic from the standby ToR(Upper ToR) is forwarded to the NiC directly. The downstream dataplane traffic to the Upper ToR are directed to the tunnel to the active Lower ToR.
<img src="./image/traffic_forwarding_enhanced.png" width="600" />
<br/><br/>
When orchagent is notified to change to standby, it will re-program both the ASIC and the kernel to let both control plane and data plane traffic be forwarded via the tunnel. To achieve the design proposed above, MuxOrch now will be changed to skip notifying the Tunnelmgrd if the neighbor address is the NiC IP address, so Tunnelmgrd will not re-program the kernel route in this case and the gRPC traffic to the NiC IP address from the transceiver daemon will be forwarded directly.
The following UML diagram shows this change when Linkmgrd state moves to standby:
![message change flow](./image/message_flow_standby.png)
### 3.8 Further Enhancement
**Advertise updated routes to T1**
Current failover strategy can smoothly handle the link failure cases, but if one of the ToRs crashes, and if T1 still sends traffic to the crashed ToR, we will see packet loss.
Expand All @@ -440,7 +468,7 @@ A further improvement in rescuing scenario, is when detecting peer's unhealthy s
**Server Servicing & ToR Upgrade**
For server graceful restart, We already have gRPC service defined in [3.5.1](#351-cable-control-through-grpc). An indicator of ongoing server servicing should be defined based on that notification, so ToR can avoid upgrades in the meantime. Vice versa, we can also define gRPC APIs to notify server when ToR upgrade is ongoing.
### 3.8 Command Line
### 3.9 Command Line
TBD
## 4 Warm Reboot Support
Expand Down
Binary file added doc/dualtor/image/message_flow_standby.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added doc/dualtor/image/traffic_forwarding.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added doc/dualtor/image/traffic_forwarding_enhanced.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.

0 comments on commit 3d1efbe

Please sign in to comment.