Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[interfaces-config] Fix race condition between networking service and interface-config service #139

Closed
wants to merge 1 commit into from

Conversation

Junchao-Mellanox
Copy link
Owner

@Junchao-Mellanox Junchao-Mellanox commented Apr 8, 2022

Why I did it

The PR is aimed to fix a bug that mgmt port eth0 may loss IP even if user configured static IP of eth0. This is not a always reproduceable issue, the reproducing flow is like:

  1. Systemd starts networking service, which runs a dhcp based configuration and assigned an ip from dhcp.
  2. Systemd starts interface-config service who depends on networking service
  3. Interface-config service runs command “ifdown –force eth0”, check line. but networking service is still running so that this line failed with error: “error: Another instance of this program is already running.”. This error is printed by ifupdown2 lib who is the main process of networking service. So, ifdown actually does not work here, the ip of eth0 is not down.
  4. Interface-config service updates /etc/networking/interface to static configuration.
  5. Interface-config service runs command “systemctl restart networking”. This command kills the previous networking related processes (log: networking.service: Main process exited, code=killed, status=15/TERM), and try to reconfigure the ip address with static configuration. But it detects that the configured IP and the existing IP are the same, and it does not really configure the ip to kernel. Hence, the ip is still getting from dhcp. (this could be a bug of ifupdown2: previous ip is from dhcp, new ip is a static ip, it treats them as same instead of re-configuring the IP)
  6. When the lease of the ip expires, the ip of eth0 is removed by kernel and the issue reproduces.

The issue is not always reproduceable because networking service usually runs fast so that it won't hit step#3.

How I did it

Check networking service state before running "ifdown –force eth0", wait for it done if it is activating.

How to verify it

Manual test.

Which release branch to backport (provide reason below if selected)

  • 201811
  • 201911
  • 202006
  • 202012
  • 202106
  • 202111

Description for the changelog

Link to config_db schema for YANG module changes

A picture of a cute animal (not mandatory but encouraged)

…rvice

Conflicts:
	files/image_config/interfaces/interfaces-config.sh
@Junchao-Mellanox Junchao-Mellanox requested a review from keboliu April 8, 2022 07:20
Junchao-Mellanox pushed a commit that referenced this pull request Oct 25, 2022
79edf66 Longxiang Lyu Wed Aug 17 08:12:37 2022 +0800 Fix azure pipeline (#118)
8e0f2c6 Longxiang Lyu Wed Aug 17 08:36:07 2022 +0800 Update linkmgr health after getting default route update (#117)
b14ffb8 Jing Zhang Wed Aug 17 15:44:37 2022 -0700 [active-active] post mux metrics events (#123)
a30dbb3 Jing Zhang Thu Aug 18 18:16:04 2022 -0700 Update handleMuxConfigNotification logic (#125)
e14aaba Jing Zhang Tue Aug 23 10:02:17 2022 -0700 [active-active] Remove unnecessary mux wait timeout logs (#122)
cc83717 Longxiang Lyu Fri Sep 2 02:17:53 2022 +0800 Fix mux config (#128)
5429281 Mai Bui Thu Sep 1 17:44:04 2022 -0400 [linkmgrd] Replace memset function in link_prober (#126)
b5aaec1 Jing Zhang Fri Sep 9 14:01:03 2022 -0700 [active-active] shutdown link prober when starting as isolated (#130)
75f02cf Jing Zhang Tue Sep 13 10:34:32 2022 -0700 [active-standby] update warmboot reconciliation logic (#129)
a5a9f90 Hua Liu Fri Sep 16 09:54:32 2022 +0800 Install libyang to azure pipeline (#132)
6fe4f0f Jing Zhang Tue Sep 20 10:10:16 2022 -0700 [Active-Active] flaky LinkmgrdBootupSequence unit tests (#134)
ea68e8c Jing Zhang Wed Sep 21 10:52:18 2022 -0700 Post switchover reasons to STATE DB (#131)
60c35b5 Jing Zhang Thu Sep 22 13:00:41 2022 -0700 [Active-Active] server side admin forwarding state sync up (#133)
08e1be5 Jing Zhang Mon Sep 26 10:59:27 2022 -0700 [Active-Active] avoid being stuck in unknown after process init (#136)
2579988 Jing Zhang Mon Oct 3 09:40:55 2022 -0700 [Active-Standby] fix syslog flood caused by unkown -> standby switchovers (#137)
7e9f670 Jing Zhang Wed Oct 5 10:03:45 2022 -0700 [Active-Active] Retry config mux mode standby (#139)
23feb3b Jing Zhang Wed Oct 5 15:22:58 2022 -0700 [Active-Active] Post link prober stats to state db (#140)
e650098 Jing Zhang Fri Oct 7 15:27:17 2022 -0700 [Active-Active] Update default route shutdown heartbeat logic (#141)
d0653e7 Jing Zhang Tue Oct 11 10:22:02 2022 -0700 [Active-Standby] avoid posting mux metrics event when receiving unsolicited mux state notification (#142)

dcf6460 Longxiang Lyu Fri Oct 21 12:15:42 2022 +0800 [active-active] Add support to send/handle mux probe request (#147)
fdf42ed Longxiang Lyu Fri Oct 21 10:34:47 2022 +0800 Fix link prober state event report twice issue (#149)
5fd19a3 Longxiang Lyu Mon Oct 17 09:20:27 2022 +0800 [active-active] Fix config reload (#145)

sign-off: Jing Zhang [email protected]
Junchao-Mellanox pushed a commit that referenced this pull request Oct 28, 2022
…#12292)

linkmgrd:
* a5ac7f6 2022-10-05 | [Active-Active] Post link prober stats to state db  (#140) (HEAD -> 202205, github/202205) [Jing Zhang]
* f4b0e53 2022-10-05 | [Active-Active] Retry config mux mode standby (#139) [Jing Zhang]

utilities:
* a255838 2022-10-04 | [minigraph] new workflow for golden path (sonic-net#2396) (HEAD -> 202205, github/202205) [jingwenxie]
* 99425a8 2022-10-03 | [actions] Support Semgrep by Github Actions (sonic-net#2417) [Mai Bui]
* f41e4d1 2022-09-30 | Fix for show vxlan tunnel command display issue sonic-net#11902 (sonic-net#2391) [Senthil Bhava]
* e1d827e 2022-09-29 | [VxLAN]Fix Vxlan delete command to throw error when there are references (sonic-net#2404) [Sudharsan Dhamal Gopalarathnam]
* d77acf8 2022-09-28 | [doc] add documentation on automatic techsupport based on memory (sonic-net#2411) [Stepan Blyshchak]
* 2cfc75a 2022-09-28 | [doc] update "config feature" section with "--block" option (sonic-net#2409) [Stepan Blyshchak]
* 9dc8471 2022-09-28 | [Vxlanmgrd] [CPA] Update the vxlan_tunnel name len to be under IFNAMIZ to overcome netdev creation failure (sonic-net#2398) [Vivek]
* 342589e 2022-10-03 | Added cisco config platform commands (sonic-net#2242) (sonic-net#2418) [yucgu]

swss:
* 9d9f395 2022-10-04 | [intfmgr]: Enable `accept_untracked_na` kernel param (sonic-net#2436) (HEAD -> 202205, github/202205) [Lawrence Lee]
* 6b6d25d 2022-10-04 |  [orchdaemon]: Fixed sairedis record file rotation (sonic-net#2480) [Bryan Crossland]

Signed-off-by: Ying Xie <[email protected]>

Signed-off-by: Ying Xie <[email protected]>
Junchao-Mellanox pushed a commit that referenced this pull request Apr 17, 2023
Why I did it
[Submodule][202211] Advance sonic-restapi pointer

The branch 202012 has already updated to commit 47e4b53.

4f6f979 Fix the redis security issue CVE-2023-28858 and CVE-2023-28859 (#139)
47e4b53 Fix adv_pfx len for ipv6 (#135)
44121be Support ipv6 prefix lenght greater than 64 and check for adv_prefix (#134)
99c467d Add API support for adv prefix and custom monitoring (#133)
347684a Use github code scanning instead of LGTM (#132)
86543d0 Updates to route PATCH API (#129)
a1af82c Install libyang to azure pipeline (#128)
2007c4c Increase coverage threshold (#126)

Work item tracking
Microsoft ADO (number only): 17705422
How I did it
How to verify it
Junchao-Mellanox pushed a commit that referenced this pull request Jun 25, 2023
update restapi commits till:

4f6f979 (HEAD -> master, origin/master, origin/HEAD) [Security] Fix the redis security issue CVE-2023-28858 and CVE-2023-28859 (#139)
Junchao-Mellanox pushed a commit that referenced this pull request Aug 4, 2023
…lly (sonic-net#16014)

#### Why I did it
src/sonic-gnmi
```
* 58a7b20 - (HEAD -> master, origin/master, origin/HEAD) Add delete field to On change response when key is deleted (#139) (8 hours ago) [Zain Budhwani]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Junchao-Mellanox pushed a commit that referenced this pull request Jun 3, 2024
…omatically (sonic-net#19024)

#### Why I did it
src/sonic-mgmt-common
```
* 1e12744 - (HEAD -> master, origin/master, origin/HEAD) Interface and Ethernet Counter support (#139) (31 hours ago) [Satoru Shinohara]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Junchao-Mellanox pushed a commit that referenced this pull request Aug 1, 2024
…utomatically (sonic-net#19665)

#### Why I did it
src/sonic-host-services
```
* ca6b3cd - (HEAD -> master, origin/master, origin/HEAD) Fix security vulnerability in caclmgrd (#139) (5 days ago) [Zhaohui Sun]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Junchao-Mellanox pushed a commit that referenced this pull request Aug 8, 2024
…utomatically (sonic-net#19732)

#### Why I did it
src/sonic-host-services
```
* 81a4ee8 - (HEAD -> 202405, origin/202405) Fix security vulnerability in caclmgrd (#139) (4 days ago) [Zhaohui Sun]
```
#### How I did it
#### How to verify it
#### Description for the changelog
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant