Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support worker heartbeat to a local file instead of Drb. #15377

Merged
merged 1 commit into from
Jun 23, 2017

Conversation

gtanzillo
Copy link
Member

@gtanzillo gtanzillo commented Jun 14, 2017

For the new architecture, we'd like to move away for using Drb. This PR takes a step in that direction by enabling workers to heartbeat to a file instead of Drb.

https://www.pivotaltracker.com/story/show/147230943

@gtanzillo gtanzillo changed the title [WIP][DO NOT MERGE] Prototype of worker heartbreaking to a local file. [WIP][DO NOT MERGE] Prototype of worker heartbreating to a local file. Jun 14, 2017
@gtanzillo gtanzillo changed the title [WIP][DO NOT MERGE] Prototype of worker heartbreating to a local file. [WIP][DO NOT MERGE] Prototype of worker heartbeating to a local file. Jun 14, 2017
@gtanzillo gtanzillo force-pushed the rearch-heartbeat-to-file branch from 47ea14b to de2e90e Compare June 14, 2017 21:23
@miq-bot miq-bot added the wip label Jun 14, 2017
Copy link
Member

@NickLaMuro NickLaMuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like what you got hear. Simple, and should do the trick. My comments are pedantic as always.

end

def heartbeat_to_file
require 'fileutils'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My two cents with this:

Just include it at the top of the file. This costs us around a megabyte to include it, and there is a good chance it is already being included by some other gem anyway.

If we do it this way, it means we are making this require "fileutils" call on every heartbeat_to_file call, and I would personally want to make this as lightweight of an operation as possible (though, I think subsequent requires are pretty good at being quick, but still).

@@ -219,6 +219,10 @@ def worker_settings(options = {})
self.class.fetch_worker_settings_from_server(miq_server, options)
end

def heartbeat_file
ENV["WORKER_HEARTBEAT_FILE"] || Rails.root.join("tmp", guid + ".hb")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wat!? No string interpolation... but using string concatenation! For shame... :trollface:

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😭 I'll fix that!

@gtanzillo gtanzillo force-pushed the rearch-heartbeat-to-file branch from de2e90e to 0760cba Compare June 16, 2017 14:06
@@ -219,6 +219,10 @@ def worker_settings(options = {})
self.class.fetch_worker_settings_from_server(miq_server, options)
end

def heartbeat_file
ENV["WORKER_HEARTBEAT_FILE"] || Rails.root.join("tmp", "#{guid}.hb")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TODO: Need to clean up heartbeat files.

Copy link
Member

@NickLaMuro NickLaMuro Jun 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Morbid curiosity (and since I got pinged on this thread), why?

Is an old Heartbeat file (like a stale worker record) a problem? Not saying we shouldn't clean things up, but just curious if there are issues you have identified where cleaning these up is necessary to resolve them.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No issue other just having it clean up after itself. Every time a worker is started it will get a new guid and a new file so they could accumulate fast.

I think you got pinged because you previously added comments.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you got pinged because you previously added comments.

Yup, knew that, but poorly conveyed. Pure curiosity on something that I am actively following and trying not to act like a dic-tator 🥔 .

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry @NickLaMuro, I misread you comment and though you were asking why you got pinged, lol. I maded the change to cleanup the heartbeat file create on prior runs.

send(heartbeat_method)

do_heartbeat_work
rescue DRb::DRbError => err
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be moved down in the drb method

@@ -93,9 +93,17 @@ def validate_heartbeat(w)

private

def workers_last_heartbeat(pid)
def workers_last_heartbeat(w)
send("workers_last_heartbeat_to_#{get_heartbeat_method(w)}", w)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use ternary operator here instead of building method names.

@gtanzillo gtanzillo force-pushed the rearch-heartbeat-to-file branch from 0760cba to 55851ce Compare June 19, 2017 01:25
@gtanzillo gtanzillo changed the title [WIP][DO NOT MERGE] Prototype of worker heartbeating to a local file. Prototype of worker heartbeating to a local file. Jun 19, 2017
@gtanzillo gtanzillo changed the title Prototype of worker heartbeating to a local file. Support worker heartbeat to a local file instead of Drb. Jun 19, 2017
@gtanzillo gtanzillo force-pushed the rearch-heartbeat-to-file branch from 55851ce to b8f15ad Compare June 22, 2017 21:04
@gtanzillo
Copy link
Member Author

@jrafanie This should be ready to merge

end
end

def workers_last_heartbeat_to_file(w)
File.mtime(w.heartbeat_file).utc
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gtanzillo Do you think we probably need to do a File.exist? check here? Is it possible we run this method either:

  • before a worker creates the file
  • after the file is removed but the worker row remains

Is this possible?

We don't have to tackle this here since we're defaulting to DRb, it's just a thought I had.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good thought @jrafanie. I'll add that check because I don't think it should just assume it exists.

Copy link
Member

@NickLaMuro NickLaMuro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gtanzillo just some pedantic thoughts I had while looking this over. Probably not worth holding up the merge of this PR to address.

private

def workers_last_heartbeat(pid)
def workers_last_heartbeat(w)
ENV["WORKER_HEARTBEAT_METHOD"] == "file" ? workers_last_heartbeat_to_file(w) : workers_last_heartbeat_to_drb(w)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gtanzillo This is pedantic, but while you are addressing @jrafanie 's comment, I feel like these methods should use from instead of to:

  • workers_last_heartbeat_to_file => workers_last_heartbeat_from_file
  • workers_last_heartbeat_to_drb => workers_last_heartbeat_from_drb

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought of that too (I'm telling the truth!) and discussed it with @jrafanie and we thought that consistency in method names would be better than having them semantically correct.

@@ -219,6 +219,10 @@ def worker_settings(options = {})
self.class.fetch_worker_settings_from_server(miq_server, options)
end

def heartbeat_file
ENV["WORKER_HEARTBEAT_FILE"] || Rails.root.join("tmp", "#{guid}.hb")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we memoize this value into an instance variable?

def heartbeat_file
  @heartbeat_file ||= ENV["WORKER_HEARTBEAT_FILE"] || Rails.root.join("tmp", "#{guid}.hb")
end

@@ -152,6 +152,7 @@ def start
#############################################################
# Start all the configured workers
#############################################################
clean_heartbeat_files
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this something we want to do conditionally? By that, I mean only worry about this if we are using a file based heartbeat? This can be done within the method of course, just thought I would bring it up.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @NickLaMuro I though about that but didn't think it would hurt to just do it blindly since it's only done once at server startup.

@gtanzillo gtanzillo force-pushed the rearch-heartbeat-to-file branch from b8f15ad to 4668215 Compare June 23, 2017 19:19
@gtanzillo gtanzillo force-pushed the rearch-heartbeat-to-file branch from 4668215 to 34a11ab Compare June 23, 2017 19:46
@miq-bot
Copy link
Member

miq-bot commented Jun 23, 2017

Checked commit gtanzillo@34a11ab with ruby 2.2.6, rubocop 0.47.1, and haml-lint 0.20.0
5 files checked, 2 offenses detected

app/models/miq_worker/runner.rb

@jrafanie jrafanie merged commit a0b650f into ManageIQ:master Jun 23, 2017
@jrafanie jrafanie added this to the Sprint 64 Ending Jul 3, 2017 milestone Jun 23, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants