Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cross-linking Middleware server model with containers. #14043

Merged
merged 1 commit into from
Apr 5, 2017

Conversation

rubenvp8510
Copy link
Contributor

No description provided.

@rubenvp8510 rubenvp8510 force-pushed the xlink-container branch 2 times, most recently from 813d768 to cbfaab7 Compare February 23, 2017 18:53
@rubenvp8510
Copy link
Contributor Author

@miq-bot add_label providers/hawkular

@rubenvp8510
Copy link
Contributor Author

@pilhuhn Could you please review this? Thank you!

alternate << swap_part(machine_id[12, 4])
alternate << machine_id[16, 4]
alternate << machine_id[20, 12]
alternate << swap_part(machine_id[0, 8]) unless machine_id[0, 8].nil?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't there a way to have those nil checks around the whole block? Or can parts of the machine_id be nil and others not?

alternate << swap_part(machine_id[8, 4]) unless machine_id[8, 4].nil?
alternate << swap_part(machine_id[12, 4]) unless machine_id[12, 4].nil?
alternate << machine_id[16, 4] unless machine_id[16, 4].nil?
alternate << machine_id[20, 12] unless machine_id[20, 4].nil?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the array in the nil check does not match the one in front of it.

@pilhuhn
Copy link
Contributor

pilhuhn commented Feb 28, 2017

LGTM
@miq-bot assign @bronaghs

@miq-bot miq-bot assigned bronaghs and unassigned blomquisg Feb 28, 2017
@bronaghs
Copy link

@miq-bot assign @agrare

@miq-bot miq-bot assigned agrare and unassigned bronaghs Feb 28, 2017
Copy link
Member

@agrare agrare left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple of comments, also can you add a description explaining in a bit more detail about what you're doing here?
Also this is going to need some spec tests, if you have vcr cassettes you should re-record with some container cross linking and that alternate_machine_id method needs some specific tests where the machine_id is invalid.

@@ -34,7 +34,13 @@ def fetch_middleware_servers
host_instance = find_host_by_bios_uuid(machine_id) ||
find_host_by_bios_uuid(alternate_machine_id(machine_id))

if host_instance
if server[:properties].try { |prop| prop['In Container'] } == 'true'
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like a good place to use fetch_path, try server.fetch_path(:properties, 'In Container') == true
Using a method could help readability as well

@@ -34,7 +34,13 @@ def fetch_middleware_servers
host_instance = find_host_by_bios_uuid(machine_id) ||
find_host_by_bios_uuid(alternate_machine_id(machine_id))

if host_instance
if server[:properties].try { |prop| prop['In Container'] } == 'true'
container = Container.find_by(:name => machine_id)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the name guaranteed to be unique? You should at least add :ems_id to your find_by, and searching by :ems_id and :ems_ref is the generally accepted way to find an item.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure about this, but this is the only way I have to x-linking the container with the hawkular agent, but it seems like we need to change this, WDYT @pilhuhn

@@ -101,6 +107,9 @@ def alternate_machine_id(machine_id)
# providers store it in downcase, so, we let the upcase/downcase logic to other methods with more
# business knowledge.
# 1 - https://bugzilla.redhat.com/show_bug.cgi?id=1294461
return if machine_id[0, 8].nil? || machine_id[8, 4].nil? || machine_id[12, 4].nil? || machine_id[16, 4].nil? ||
machine_id[20, 12].nil?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty gross, if you're trying to validate a UUID check out MiqUUID instead of doing anything on the string directly
If you do need to do something like this at least a method that is descriptive like return unless machine_id_valid?(machine_id)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we an detect from within the container what the env outside is and/or in which OpenShift/K8s cluster it is running.

@pilhuhn
Copy link
Contributor

pilhuhn commented Mar 27, 2017

I think the issue is that the code on agent side that provides the container id makes/made an invalid assumption.

I wrote it the way that it reuses the machine-id property by just passing in -Dhawkular.agent.machine-id=.
My thought was that this way it ends up in the right place in ManageIQ, where ManageIQ code then just looks up a container instead of a (virtual) machine with that code.

Now there is code in MiQ that makes some trickery with the machine id in refresh_parser/alternate_machine_id that is not compatible with the container id.
While in fact the container id is the machine id of a container, there is apparently a real thing "machine id", which is normally baked into the hardware (or at least derived from hardware).

So we have several options that I can think of now:

  1. we make the agent not set machine id, but a new property "container id", which is then forwarded all along to MiQ and the refresh parser then takes it and uses it for x-linking
  2. we leave agent as is, but in the refresh parser we check before dorking with alernate-machine-id if the "in container" flag is set and then don't branch into the real-machine-id code, but do the x-linking with that id and the container provider.

I guess the 2nd is something we can do quickly, while the former needs more work (but is cleaner). I guess the former is something we can do post 4.5 - as we already know the logic on the MiQ side for x-linking to containers. We would then just not read the machine-id and branch, but the container-id.
The UI code could even stay the same.

@rubenvp8510
Copy link
Contributor Author

rubenvp8510 commented Mar 27, 2017

@pilhuhn

I agree on both solutions that you proposes here, but in the case of OpenShift containers, the machine-id is set to the pod name, which is unique for one OpenShift provider, but we could have two or more providers, in each of them we could have a pod with the same name. That doesn't happens with "hardware" machine-id I think because it's a UUID.

@pilhuhn
Copy link
Contributor

pilhuhn commented Mar 27, 2017

@rubenvp8510 It is correct that this is no machine-id, but machine-id is not available for the containers, so we need something else.

@abonas
Copy link
Member

abonas commented Mar 28, 2017

I'm linking this here as it is the solution for option 1 which was originally planned for a later time
hawkular/hawkular-agent#306
So please sync so we will not end up with the agent solution merged for option 1, but with miq code that fits option 2 :)
@pilhuhn @rubenvp8510 @jmazzitelli

@jmazzitelli
Copy link

Yes, I need someone to peer review hawkular/hawkular-agent#306 to confirm it provides what the MiQ code needs.

@rubenvp8510 rubenvp8510 force-pushed the xlink-container branch 2 times, most recently from f4eccd4 to f5182ff Compare April 4, 2017 15:19
@rubenvp8510
Copy link
Contributor Author

Now I'm getting the container id from the agent and doing the cross-linking using that is, also I addressed some code style comments.

@rubenvp8510
Copy link
Contributor Author

Could someone on this thread of comments review it please? Thank you!

server[:lives_on_id] = host_instance.id
server[:lives_on_type] = host_instance.type
if server.fetch_path(:properties, 'In Container') == 'true'
container_id = @ems.container_id(eap.feed)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible that container_id will be empty? (and then the backing_ref will be corrupted)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like travis is failing on exactly this case ref

Failure/Error: backing_ref = 'docker://' + container_id
     
     TypeError:
       no implicit conversion of nil into String

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In theory, this shouldn't happen, if the hawkular agent is running into a container, it should have a valid container-id, but I'll validate this just in case.

backing_ref = 'docker://' + container_id
container = Container.find_by(:backing_ref => backing_ref)
if container
server[:lives_on_id] = container.id
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can't lines 38-39 and 46-47 be unified for code reuse? they are essentially the same, just a different object is passed

@@ -101,6 +110,7 @@ def alternate_machine_id(machine_id)
# providers store it in downcase, so, we let the upcase/downcase logic to other methods with more
# business knowledge.
# 1 - https://bugzilla.redhat.com/show_bug.cgi?id=1294461
return if machine_id.length < 31
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isn't this addition causing alternate_machine_id method to sometimes return empty? then find_host_by bios_uuid will search with an empty var, no?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What changed in this PR that is causing a different machine_id to be passed to this method that is making this extra check necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the method find_host_by_bios_uuid received nil as a parameter it returns nil immediately, it doesn't try to find anything, so if the machine_id is not valid (line 43), host_instance will be nil.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@rubenvp8510 I don't see any change to how this method is being called so I'm not sure why this is in here, if there is an issue with this method can it be addressed in another PR? As far as I can see the calling code is just moved but the argument is still just eap.feed

@@ -107,6 +107,10 @@ def machine_id(feed)
os_resource_for(feed).try(:properties).try { |prop| prop['Machine Id'] }
end

def container_id(feed)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this class a proper place for such methods (although this addition repeats another method that has been already here)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@cfcosta wdyt? Rails wise, where is the "right" place to put such methods?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd extract all resource-related stuff into it's own class, but I'd also keep it on providers/hawkular. There's no need to put it further away from where it is actually being used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be done on a separate PR?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The machine_id method is already here so if we are going to move it, best to do in a follow-up PR

@abonas
Copy link
Member

abonas commented Apr 5, 2017

Could this be done on a separate PR?

sure

@agrare
Copy link
Member

agrare commented Apr 5, 2017

Outside of the scope of this PR but why not normalize the UUID used for the :uid_ems in the refresh_parser before saving it to the DB? That way you don't have to try to look it up twice here you can always assume it will be a properly formatted UUID

@rubenvp8510 rubenvp8510 force-pushed the xlink-container branch 3 times, most recently from ca48c4c to e2420bc Compare April 5, 2017 16:03
@pilhuhn
Copy link
Contributor

pilhuhn commented Apr 5, 2017

@agrare The "agare requested changes" flag is from over a month ago. Could you tell what stands between todays code and being merged (except for the test failure)?
If it is only "cosmetics" - could they be addressed in a subsequent PR?

@agrare
Copy link
Member

agrare commented Apr 5, 2017

@pilhuhn it looks like the last of my issues were just addressed in e2420bc

Copy link
Member

@agrare agrare left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Will merge when green

@agrare
Copy link
Member

agrare commented Apr 5, 2017

@rubenvp8510 please address the failing tests

@miq-bot
Copy link
Member

miq-bot commented Apr 5, 2017

Checked commit rubenvp8510@0e59b1c with ruby 2.2.6, rubocop 0.47.1, and haml-lint 0.20.0
2 files checked, 0 offenses detected
Everything looks good. 🍪

@agrare agrare merged commit 0a3d7e4 into ManageIQ:master Apr 5, 2017
@agrare agrare added this to the Sprint 58 Ending Apr 10, 2017 milestone Apr 5, 2017
@abonas
Copy link
Member

abonas commented Apr 5, 2017

@miq-bot add_label fine/yes

@agrare
Copy link
Member

agrare commented Apr 5, 2017

Was this for a BZ? If so it should be in the description and if not we'll need to create one in order to backport it.
Scratch that I've been notified no BZ needed to backport to fine yet

simaishi pushed a commit that referenced this pull request Apr 5, 2017
Cross-linking Middleware server model with containers.
(cherry picked from commit 0a3d7e4)
@simaishi
Copy link
Contributor

simaishi commented Apr 5, 2017

Fine backport details:

$ git log -1
commit 22d25adc220d27c535a057c4c9cfb53b56cea82c
Author: Adam Grare <[email protected]>
Date:   Wed Apr 5 14:30:15 2017 -0400

    Merge pull request #14043 from rubenvp8510/xlink-container
    
    Cross-linking Middleware server model with containers.
    (cherry picked from commit 0a3d7e40ae82b13aa7d1c7d80a709b4763370ac6)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.