SAIREDIS Record when facing continuous SAI events is not compressing and rotating the record causing /var/log partition to completely run out of space. #8162

Closed
gechiang opened this issue Jul 13, 2021 · 2 comments
Labels
Awaiting Info ⌛, Triaged

Comments

@gechiang
Collaborator

Description

While running a continuous L2 MAC station move test case, I observed that the sairedis record file does not get rotated properly.
If the events never stop, the current sairedis record eventually occupies the entire /var/log partition.
With the same L2 MAC station move trigger, syslog rotates fine, roughly every 10 minutes. This is not the case with the sairedis record.
This issue is found in the master and 202012 branches. We will confirm later whether the 201911 branch is also affected and update this issue.

By the time the /var/log partition storage is completely depleted, normal router activities do not seem to be impacted. But once this occurs, we lose all previous logs, so it is a debuggability issue.

Steps to reproduce the issue:

  1. Trigger something that continuously generates SAI events that get logged into the sairedis record, such as continuous L2 MAC station moves.
  2. Check that the free space of /var/log decreases via the "df -h" command.
  3. Eventually you will observe the /var/log partition completely run out of storage space, while the sairedis record continues to grow uncontrolled.
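
Step 2 can also be scripted. The following is only a minimal sketch: the function name `log_usage` is made up for illustration, and the default record path `/var/log/swss/sairedis.rec` is an assumption about where the record lives on the device, so adjust it as needed.

```python
import os
import shutil


def log_usage(partition="/var/log", record="/var/log/swss/sairedis.rec"):
    """Report partition usage and the current record file size.

    Both default paths are assumptions; pass the real ones for your image.
    """
    usage = shutil.disk_usage(partition)
    # A missing record file is reported as 0 bytes rather than raising.
    record_bytes = os.path.getsize(record) if os.path.exists(record) else 0
    used_pct = round(100 * usage.used / usage.total, 1) if usage.total else 0.0
    return {
        "free_bytes": usage.free,
        "used_pct": used_pct,
        "record_bytes": record_bytes,
    }


if __name__ == "__main__":
    print(log_usage())
```

Calling it periodically while the trigger is running should show `record_bytes` growing and `free_bytes` shrinking if the record is not being rotated.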

Describe the results you expected:

Expect the sairedis record to be rotated properly instead of growing uncontrolled.

@zhangyanzhao added the Awaiting Info ⌛ and Triaged labels Jul 21, 2021
@zhangyanzhao
Collaborator

Platform issue; logs need to be cleaned up regularly. Need someone from MSFT to take a further look. @lguohan

@bacrossland

I've opened a PR on sonic-swss to fix this issue: sonic-net/sonic-swss#2299

bacrossland added a commit to target/sonic-swss that referenced this issue Sep 29, 2022
Fix sonic-net/sonic-buildimage#8162

Moved sairedis record file rotation logic out of flush() to fix issue.

Why I did it
Sairedis record file was not releasing the file handle on rotation. This is because the file handle release was inside the flush() which was only being called if a select timeout was triggered. Moved the logic to its own function which is called in the start() loop.

How I verified it
Ran a script to fill log and verified that rotation was happening correctly.

Signed-off-by: Bryan Crossland [email protected]
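
The behaviour described in the commit message can be illustrated with a toy simulation. This is not the sonic-swss code: `Recorder`, `run_loop`, and their parameters are hypothetical names. It assumes only what the message states, namely that flush() ran solely on a select timeout and that the rotation check used to live inside it.

```python
from dataclasses import dataclass


@dataclass
class Recorder:
    """Toy stand-in for the sairedis recorder (hypothetical, not the real class)."""

    rotate_requested: bool = False
    rotations: int = 0
    flushes: int = 0

    def rotate_if_requested(self) -> None:
        # Release and reopen the record file only if rotation was requested.
        if self.rotate_requested:
            self.rotate_requested = False
            self.rotations += 1

    def flush(self, rotate_inside_flush: bool) -> None:
        self.flushes += 1
        if rotate_inside_flush:
            self.rotate_if_requested()


def run_loop(events, rotate_at, rotate_inside_flush):
    """Simulate the main loop and return the number of rotations performed.

    events: iterable of bools, True = select returned an event,
            False = select timed out.
    rotate_at: iteration index at which a rotation request arrives.
    rotate_inside_flush: True models the old behaviour, False the fix.
    """
    rec = Recorder()
    for i, got_event in enumerate(events):
        if i == rotate_at:
            rec.rotate_requested = True
        if not rotate_inside_flush:
            rec.rotate_if_requested()  # fix: checked on every iteration
        if not got_event:
            rec.flush(rotate_inside_flush)  # flush only runs on a timeout
    return rec.rotations


busy = [True] * 100  # continuous events, select never times out
print(run_loop(busy, rotate_at=10, rotate_inside_flush=True))   # 0
print(run_loop(busy, rotate_at=10, rotate_inside_flush=False))  # 1
```

Under a continuous event stream the timeout branch is never taken, so the old placement never honours the rotation request, while the per-iteration check does.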
bacrossland added a commit to target/sonic-swss that referenced this issue Oct 4, 2022
yxieca pushed a commit to sonic-net/sonic-swss that referenced this issue Oct 4, 2022
* [orchdaemon]: Fixed sairedis record file rotation

qiluo-msft pushed a commit to sonic-net/sonic-swss that referenced this issue Oct 6, 2022
Pterosaur pushed a commit to Pterosaur/sonic-swss that referenced this issue Nov 5, 2022
* [orchdaemon]: Fixed sairedis record file rotation
