
Fixing Event Log file cleanup issue #30

Merged
khushbr merged 7 commits into main from khushbr-writer-purge-fix
Jul 13, 2021

Collaborator

@khushbr khushbr commented Jul 13, 2021

Is your feature request related to a problem?
Issue : #26
Previous PR [now closed] : opensearch-project/performance-analyzer#36
PA side PR : opensearch-project/performance-analyzer#36

Describe the solution you are proposing

  1. Remove the 'MetricsPurgeActivity' collector and move responsibility for event log file cleanup to 'EventLogQueueProcessor'. The queue processor first invokes deleteFiles() [taken from MetricsPurgeActivity] to clean up old event log files and then writes the latest event log file. Because 'EventLogQueueProcessor' always performs the cleanup before writing new files, we should no longer run into the issue of lingering files.
  2. Add the metrics 'EVENT_LOG_FILES_DELETION_TIME', 'EVENT_LOG_FILES_DELETED', 'METRICS_WRITE_ERROR', 'METRICS_REMOVE_ERROR' and 'METRICS_REMOVE_FAILURE'.
  3. Refactor the MetricConfig class.
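
The cleanup-before-write ordering in item 1 can be illustrated with a minimal sketch. This is not the code in this PR: the class name `CleanupThenWriteSketch`, the method `purgeThenWrite`, and the age-based cutoff are assumptions made for illustration, and the real deleteFiles() may select files differently. It only shows both steps running on the same thread, with the deletion timed so that values like 'EVENT_LOG_FILES_DELETION_TIME' and 'EVENT_LOG_FILES_DELETED' could be reported.

```java
import java.io.File;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;

/** Hypothetical sketch; not the actual EventLogQueueProcessor. */
class CleanupThenWriteSketch {

    /** Deletes regular files in dir last modified before cutoffMillis; returns the number deleted. */
    static int deleteOldFiles(File dir, long cutoffMillis) {
        File[] files = dir.listFiles();
        if (files == null) {
            return 0; // dir is missing or is not a directory
        }
        int deleted = 0;
        for (File f : files) {
            if (f.isFile() && f.lastModified() < cutoffMillis && f.delete()) {
                deleted++;
            }
        }
        return deleted;
    }

    /**
     * Cleans up first, then writes the latest file, on the calling thread.
     * Returns the number of files deleted by the cleanup step.
     */
    static int purgeThenWrite(File dir, String fileName, String payload, long cutoffMillis) {
        long start = System.nanoTime();
        int deleted = deleteOldFiles(dir, cutoffMillis);
        long deletionMillis = (System.nanoTime() - start) / 1_000_000;
        // Assumption: a real implementation would publish these two values as metrics.
        System.out.println("deleted=" + deleted + " deletionMillis=" + deletionMillis);
        try {
            Files.write(new File(dir, fileName).toPath(),
                    payload.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return deleted;
    }
}
```

Since the write cannot happen without the cleanup having run first, there is no separate purge thread whose death could leave files behind.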

Describe alternatives you've considered
Another approach was to launch a new thread and invoke 'MetricsPurgeActivity' within it. However, we would run into the same issue again if that thread died, so keeping the cleanup and the write within the same thread is the better option.

Testing
Tested by spinning up a Docker container and manually copying 100 dummy files to /dev/shm/performanceanalyzer/.
Enabled DEBUG logs to verify that cleanup works as expected:

[2021-07-01T20:30:52,950][DEBUG][c.a.o.e.p.r.EventLogFileHandler] Starting to delete old writer files
[2021-07-01T20:30:52,950][DEBUG][c.a.o.e.p.r.EventLogFileHandler] Files discovered 169
[2021-07-01T20:30:52,977][DEBUG][c.a.o.e.p.r.EventLogFileHandler] '153' Old writer files cleaned up.

Metrics:

Metrics=EventLogFilesDeletionTime=27.0 millis aggr|MEAN,EventLogFilesDeletionTime=27 millis aggr|MAX,EventLogFilesDeletionTime=27 millis aggr|SUM,EventLogFilesDeleted=153 count aggr|SUM,EventLogFilesDeleted=153 count aggr|MAX

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@khushbr khushbr merged commit 5b41dd0 into main Jul 13, 2021