Fixing Event Log file cleanup issue #30

Merged: 7 commits, Jul 13, 2021

@khushbr (Collaborator) commented Jul 13, 2021

Is your feature request related to a problem?
Issue : #26
Previous PR [now closed] : opensearch-project/performance-analyzer#36
PA side PR : opensearch-project/performance-analyzer#36

Describe the solution you are proposing

  1. Remove the 'MetricsPurgeActivity' collector and move responsibility for event log file cleanup to 'EventLogQueueProcessor'. The queue processor first invokes deleteFiles() [taken from MetricsPurgeActivity] to clean up the old event log files and only then writes the latest event log file. Because 'EventLogQueueProcessor' always performs the cleanup before writing new files, we never run into the issue of lingering files.
  2. Add the metrics 'EVENT_LOG_FILES_DELETION_TIME', 'EVENT_LOG_FILES_DELETED', 'METRICS_WRITE_ERROR', 'METRICS_REMOVE_ERROR' and 'METRICS_REMOVE_FAILURE'.
  3. Refactor the MetricConfig class.
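The cleanup-before-write ordering in item 1 can be sketched roughly as follows. This is a minimal illustration and not the actual 'EventLogQueueProcessor' code: the class name, the method names `deleteOldFiles` and `purgeThenWrite`, the retention parameter, and the assumption that event log files are named by an epoch-millisecond time bucket are all hypothetical.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical sketch of "clean up first, then write"; not the real processor. */
final class CleanupThenWriteSketch {

    // Assumption: each event log file is named after its epoch-millisecond time bucket.
    static int deleteOldFiles(Path dir, long cutoffMillis) throws IOException {
        int deleted = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir)) {
            for (Path file : files) {
                if (!Files.isRegularFile(file)) {
                    continue; // skip sub-directories and other non-file entries
                }
                long bucket;
                try {
                    bucket = Long.parseLong(file.getFileName().toString());
                } catch (NumberFormatException e) {
                    continue; // not an event log file under the naming assumption
                }
                if (bucket < cutoffMillis && Files.deleteIfExists(file)) {
                    deleted++;
                }
            }
        }
        return deleted;
    }

    // Cleanup and write happen back to back on the calling thread, so a write
    // can never happen without the preceding cleanup having been attempted.
    static int purgeThenWrite(Path dir, long nowMillis, long retentionMillis, String payload)
            throws IOException {
        int deleted = deleteOldFiles(dir, nowMillis - retentionMillis);
        Files.write(dir.resolve(Long.toString(nowMillis)),
                payload.getBytes(StandardCharsets.UTF_8));
        return deleted;
    }
}
```

The point of the design is that there is no second thread whose failure could leave the writer running without cleanup.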

Describe alternatives you've considered
Another approach was to launch a new thread and invoke 'MetricsPurgeActivity' within it. However, we would run into the same issue again if that thread died, so keeping the cleanup and the write within the same thread was the better option.
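To illustrate the risk with a separate cleanup thread, here is a small hypothetical sketch (not code from this PR) of one way a background task can stop silently: a periodic task on a `ScheduledExecutorService` is not rescheduled once a run throws, while the rest of the application carries on. The class and method names are made up for the example.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration of a background purge task that stops after a failure. */
final class SeparatePurgeThreadSketch {

    static int runsBeforePurgeStops() throws InterruptedException {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger runs = new AtomicInteger();
        try {
            ScheduledFuture<?> purge = scheduler.scheduleAtFixedRate(() -> {
                // Simulate a failure on the second cleanup pass.
                if (runs.incrementAndGet() == 2) {
                    throw new IllegalStateException("simulated purge failure");
                }
            }, 0, 10, TimeUnit.MILLISECONDS);
            try {
                purge.get(); // returns control only once the periodic task has failed
            } catch (ExecutionException expected) {
                // The failure is visible only to code that inspects the future.
            }
            Thread.sleep(50); // leave time for further runs; none are scheduled
            return runs.get();
        } finally {
            scheduler.shutdownNow();
        }
    }
}
```

After the simulated failure the run counter stays at 2, which is the kind of silent stop that running cleanup and write in one thread avoids.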

Testing
Tested by spinning up a Docker container and manually copying 100 dummy files to /dev/shm/performanceanalyzer/.
Enabled DEBUG logs to verify that the cleanup works as expected:

[2021-07-01T20:30:52,950][DEBUG][c.a.o.e.p.r.EventLogFileHandler] Starting to delete old writer files
[2021-07-01T20:30:52,950][DEBUG][c.a.o.e.p.r.EventLogFileHandler] Files discovered 169
[2021-07-01T20:30:52,977][DEBUG][c.a.o.e.p.r.EventLogFileHandler] '153' Old writer files cleaned up.

Metrics:

Metrics=EventLogFilesDeletionTime=27.0 millis aggr|MEAN,EventLogFilesDeletionTime=27 millis aggr|MAX,EventLogFilesDeletionTime=27 millis aggr|SUM,EventLogFilesDeleted=153 count aggr|SUM,EventLogFilesDeleted=153 count aggr|MAX
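The identical MEAN, MAX and SUM values above are what one would expect if the reporting window contained a single cleanup pass; that reading is an assumption, not something the output states. A minimal sketch using the JDK's `LongSummaryStatistics` (not the project's metrics classes; the class and method names are hypothetical) shows why the three aggregates coincide for one sample:

```java
import java.util.LongSummaryStatistics;

/** Hypothetical sketch: aggregating deletion-time samples from one reporting window. */
final class SingleSampleAggregationSketch {

    static LongSummaryStatistics aggregate(long... samplesMillis) {
        LongSummaryStatistics stats = new LongSummaryStatistics();
        for (long sample : samplesMillis) {
            stats.accept(sample); // updates count, sum, min and max
        }
        return stats;
    }
}
```

Calling `aggregate(27)` yields a sum of 27, a max of 27 and an average of 27.0; with more than one cleanup pass in a window the three values would generally diverge.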

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@khushbr khushbr merged commit 5b41dd0 into main Jul 13, 2021
3 participants