-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add bgpmon to be started as a new daemon under BGP docker #5329
Conversation
Added one more change to include Monit to monitor this new daemon under BGP so that when it failed/crashed, it will raise an alert to syslog the issue. It will show up something like following:
|
@qiluo-msft Sorry... Have to add one more change to ensure this new daemon is monitored by MONIT and raise an alert (syslog the issue) when bgpmon stops running (crashed). |
@@ -21,3 +22,6 @@ check process staticd matching "/usr/lib/frr/staticd" | |||
|
|||
check process bgpcfgd matching "python /usr/local/bin/bgpcfgd" | |||
if does not exist for 5 times within 5 cycles then alert | |||
|
|||
check process bgpmon matching "python2 /usr/bin/bgpmon.py" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/usr/bin/bgpmon.py [](start = 39, length = 18)
Does the process name include "python2"? If yes, possible to remove? #Closed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@qiluo-msft Per comment in sonic-net/sonic-swss#1429 to make the daemon specific to python2, once that change is in place we see it gets reflected under the output of show services somethis as following:
admin@SONic:~$ show services
...
bgp docker
---------------------------
USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
root 1 5.2 0.2 28344 22532 pts/0 Ss+ 03:47 0:03 /usr/bin/python2 /usr/bin/supervisord
root 21 0.7 0.2 22660 16684 pts/0 S 03:47 0:00 python /usr/bin/supervisor-proc-exit-listener --container-name bgp
root 25 0.1 0.0 225856 3488 pts/0 Sl 03:47 0:00 /usr/sbin/rsyslogd -n -iNONE
frr 29 0.8 0.1 503176 16156 pts/0 Sl 03:47 0:00 /usr/lib/frr/zebra -A 127.0.0.1 -s 90000000 -M fpm -M snmp
frr 30 0.2 0.0 43176 6140 pts/0 S 03:47 0:00 /usr/lib/frr/staticd -A 127.0.0.1
frr 31 1.3 0.2 294536 20960 pts/0 Sl 03:47 0:00 /usr/lib/frr/bgpd -A 127.0.0.1 -M snmp
root 38 6.1 0.6 69172 55992 pts/0 S 03:47 0:03 /usr/bin/python /usr/local/bin/bgpcfgd
root 41 0.6 0.1 20996 15028 pts/0 S 03:47 0:00 python2 /usr/bin/bgpmon.py
...
For this reason the Monit config specification needs to match it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Based on offline comment from Guohan we have decided to pull in the changes done in (sonic-net/sonic-swss#1429) to here and just have one single PR instead of 2 separate PRs. Another reason is that the sonic-swss repo was not the right place for this new bgpmon daemon code to reside with. Its functionality make more sense to be under "sonic-bgpcfgd". |
src/sonic-bgpcfgd/setup.py
Outdated
@@ -9,7 +9,7 @@ | |||
author_email='[email protected]', | |||
url='https://github.com/Azure/sonic-buildimage', | |||
packages=setuptools.find_packages(), | |||
scripts=['bgpcfgd'], | |||
scripts=['bgpcfgd','bgpmon.py'], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you do console_script packaging? it is better to do that and we can get rid of .py suffix.
https://python-packaging.readthedocs.io/en/latest/command-line-scripts.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@lguohan Agreed. I have modified to use console_scripts packaging and it indeed got rid of the .py suffix.
Code re-tested to be functioning as before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Please also resolve others' comments.
retest vsimage please |
@gechiang Could you add some unit test or vs test? |
…5329) * Add bgpmon under sonic-bgpcfgd to be started as a new daemon under BGP docker * Added bgpmon to be monitored by Monit so that if it crashed, it gets alerted * use console_scripts entry point to package bgpmon
This PR is to auto start the bgpmon (BGP Monitor) Daemon under the BGP docker.
The purpose of bgpmon (BGP Monitor daemon) is to assist gathering BGP related information periodically based on BGP activity detection to store BGP Neighbor information into State DB. For components such as SNMP agent that uses this new BGP related DB in state DB to support the MIB operation.
This PR is to auto start the bgpmon daemon under the BGP docker. There is another PR which I originally used to implements the bgpmon processing code (sonic-net/sonic-swss#1429) that was later decided to be moved into this PR instead of under the sonic-swss repo. Many valuable comments are still in that PR so reader should refer to that PR for review comments history purpose only.
Note: This new daemon is also added to be monitored by Monit so that in case it stops running it will be reported.
The HLD doc PR can be found here:
The following PR (SNMP Agent Changes to use State DB for BGP MIB )has a dependency on this PR:
The following PR contains the Pytest code that validate this feature functionality whenever VS Image test is run as part of all PRs as well as used in DUT pytest:
-[Azure-sonic-mgmt] (sonic-net/sonic-mgmt#2241)
Signed-off-by: Gen-Hwa Chiang [email protected]
- What I did
Added a new daemon to be run under BGP docker periodically to gather the BGP peer state information and update them in
the state DB. In order not to continuously running even when there are no BGP peer activities, this new daemon will use the BGP frr.log file timestamp as an indicator whether there are any activities in BGP that warrants it to inspect the peer state info for possible updates.
- How I did it
- How to verify it
once the BGP docker is up and running, you can perform "show services" to see that this new daemon "bgpmon.py" is running under the BGP docker service similar to the following output:
For IPV4 You can perform "show ip bgp summary" and use that output to check against the state DB for each peer.
The state DB key would be something like "NEIGH_STATE_TABLE|10.0.0.33" where 10.0.0.33 is the peer ip address.
you can then read out the state information by issuing the redis cmd hgetall "NEIGH_STATE_TABLE|10.0.0.33". Compare that with the corresponding output of "show ip bgp summary".
Here is a sample output from the State DB:
admin@str-s6000-acs-8:~$ redis-cli -n 6
127.0.0.1:6379[6]> keys "NEIGH*"
127.0.0.1:6379[6]> hgetall "NEIGH_STATE_TABLE|10.0.0.1"
127.0.0.1:6379[6]>
you can do the same for ipv6 by performing "show ipv6 bgp summary" and follow the same steps above to validate with the corresponding peer states stored in state DB.