Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[wip] notify users of killing workers when exceed memory #17640

Closed
wants to merge 1 commit into from

Conversation

kbrock
Copy link
Member

@kbrock kbrock commented Jun 26, 2018

When a worker exceeds the memory threshold, it is killed and restarted.
But it is difficult for users to detect these events and fix the underlying problem.

https://bugzilla.redhat.com/show_bug.cgi?id=1535177

This change adds a notification to bring the problem to the attention of administrators. It includes the worker name and the current memory value to make action easier.

worker_out_of_memory_notification

implementation notes:
The event stream record after_create callback currently creates the notification record.
This lost some necessary data, so I skipped the after create callback and manually create the notification record.

MiqEvent.raise_evm_event_queue(w.miq_server, "evm_worker_memory_exceeded", :event_details => msg, :type => w.class.name)
notification_options = {
:name => w.type,
:memory_usage => usage.try!(:/, 1024*1024),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we leverage the number_to_human_size method instead?

[2] pry(main)> helper.number_to_human_size(10.megabytes)
=> "10 MB"
[3] pry(main)> helper.number_to_human_size(10.gigabytes)
=> "10 GB"

Copy link
Member Author

@kbrock kbrock Jun 26, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Fryguy was hoping to pass data. only have format for the messages, but wanted computers to be able to understand. I suppose the computer can parse these values

@@ -269,3 +269,8 @@
:expires_in: 24.hours
:level: :error
:audience: superadmin
- :name: evm_worker_memory_exceeded
:message: 'Killing Worker %{name} using %{memory_usage}mb exceeded limit of %{memory_threshold}mb'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The wording feels funny...perhaps "Killing worker %{name} due to excessive memory usage. %{memory_usage} used memory exceeds limit of %{memory_threshold}." ?

@kbrock kbrock force-pushed the monitor_notifier_v2 branch from 2d0b750 to b5b19d0 Compare June 27, 2018 14:33
@kbrock kbrock changed the title notify users of killing workers when exceed memory [wip] notify users of killing workers when exceed memory Jun 28, 2018
@kbrock kbrock added the wip label Jun 28, 2018
@kbrock
Copy link
Member Author

kbrock commented Jun 28, 2018

This was deemed too verbose when systems are not properly configured.

If you want verbose server exceeded notifications, then add the following to db/fixtures/notification_types.yml:

- :name: evm_worker_memory_exceeded
  :message: 'Killing worker %{name} due to excessive memory usage. %{memory_usage} used memory exceeds limit of %{memory_threshold}.'
  :expires_in: 24.hours
  :level: :error
  :audience: superadmin

@kbrock kbrock force-pushed the monitor_notifier_v2 branch from b5b19d0 to c325336 Compare June 28, 2018 15:31
@miq-bot
Copy link
Member

miq-bot commented Jul 11, 2018

This pull request is not mergeable. Please rebase and repush.

- Provide more complete information for worker out of memory errors
- Provide notification messages for out of memory errors
- Introduce ability to disable notification

https://bugzilla.redhat.com/show_bug.cgi?id=1535177
@kbrock kbrock force-pushed the monitor_notifier_v2 branch from c325336 to f97d013 Compare July 13, 2018 20:38
@miq-bot
Copy link
Member

miq-bot commented Jul 13, 2018

Checked commit kbrock@f97d013 with ruby 2.3.3, rubocop 0.52.1, haml-lint 0.20.0, and yamllint 1.10.0
4 files checked, 0 offenses detected
Everything looks fine. ⭐

@kbrock
Copy link
Member Author

kbrock commented Aug 13, 2018

defer to #17673

@kbrock kbrock closed this Aug 13, 2018
@kbrock kbrock deleted the monitor_notifier_v2 branch August 13, 2018 14:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants