Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add spec test for a full refresh #138

Merged
merged 3 commits into from
Mar 12, 2018
Merged

Conversation

skovic
Copy link

@skovic skovic commented Mar 7, 2018

This PR adds a test case for testing a full refresh. In particular, it attempts to recreate the guest_devices table problem described in Bug 1542710.

@skovic
Copy link
Author

skovic commented Mar 7, 2018

@miq-bot add_label wip

@miq-bot miq-bot changed the title Add spec test for a full refresh [WIP] Add spec test for a full refresh Mar 7, 2018
@miq-bot miq-bot added the wip label Mar 7, 2018
end

def assert_guest_table_contents
nic = GuestDevice.where(:device_name => "Broadcom 2-port 1GbE NIC Card for IBM").first
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably better going through a physical_server association since that name doesn't look like it'd be particularly unique

nic = GuestDevice.where(:device_name => "Broadcom 2-port 1GbE NIC Card for IBM").first
ports = nic.child_devices

expect(ports[0].device_name).to eq("Physical Port 1")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a good idea to dereference the array without sorting it first since the return order isn't guaranteed and can lead to sporadic test failures

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better to say something like

port_1 = ports.find_by(:device_name => "Physical Port 1")

@agrare
Copy link
Member

agrare commented Mar 7, 2018

@skovic commented, looks like the VCRs might have to be re-recorded? There are vcr errors in the spec tests

@skovic
Copy link
Author

skovic commented Mar 7, 2018

@agrare It works running it locally (with the same vcr files). I'm not sure what to make of this error. This is why it's failing:

An HTTP request has been made that VCR does not know how to handle: GET https://ip-10-243-9-123.ec2.internal/aicc

Notice the strange format of the IP address. Any idea about why this may be occurring?

@agrare
Copy link
Member

agrare commented Mar 7, 2018

@skovic got me my guess is that something in init_parser is doing something with the local machine's hostname, and that was the hostname of the travis appliance running the test

@agrare
Copy link
Member

agrare commented Mar 7, 2018

so @skovic when you said you reproduced this with only one refresh, what exactly was it that you saw? I thought the issue was that ports were getting switched between two refreshes but maybe I don't understand what the bug is exactly.

@skovic
Copy link
Author

skovic commented Mar 7, 2018

@agrare For example, before a refresh, a NIC may have two ports with the following names and MAC addresses:

Physical Port 1    00:0A:F7:25:67:38
Physical Port 2    00:0A:F7:25:67:39

After a refresh, the NIC now has the following two ports:

Physical Port 1    00:0A:F7:25:67:38
Physical Port 1    00:0A:F7:25:67:38

It's as if Port 2 disappeared and was replaced by a duplicate of Port 1.

@skovic
Copy link
Author

skovic commented Mar 7, 2018

@agrare Looks like the same issue again with the malformed IP address. Any idea how I can debug this possible init_parser problem? Thanks

@agrare
Copy link
Member

agrare commented Mar 7, 2018

@skovic I would step into what that connection.discover_aicc method is doing, you might have to stub it out in your spec

@skovic
Copy link
Author

skovic commented Mar 8, 2018

@agrare @rodneyhbrown7 Regarding the bug itself. I believe that I've figured out more precisely the conditions under which the bug occurs and what might be occurring. Here are the preconditions:

  • One LXCA, lxca1, is managing a particular server, server1.
  • Another LXCA, lxca2, is managing the same server, server1.

Here is how to reproduce the problem:

  1. Add a physical infrastructure provider for lxca1. This is provider1.
  2. Add a physical infrastructure provider for lxca2. This is provider2.
  3. Refresh the second provider, provider2.
  4. Once the refresh is complete, navigate to the server summary page for server1 and notice that there are duplicate entries in the Network Devices table.

When the second provider, provider2, is added, a second entry for server1 is created in the physical_servers table in the database. Likewise, a second entry is created for server1's NIC in the guest_devices table. When the second provider, provider2, is refreshed, the entries for the NIC in the guest_devices table somehow become mangled.

@rodneyhbrown7
Copy link
Contributor

I have a theory about the spec test failure. This is the only spec test where we call EmsRefresh. I think EmsRefresh is attempting to create an actual ems instance instead of using the factory instance. We have code in our provider that attempts to resolve any hostname given. This way we can convert IP addresses to hostnames.

In this case I think we are asking DNS to resolve the provided host name. This is going out to Amazon's hosted DNS service and returning a private host name for the given address of the form xx-xx-xx-xx.ec.internal. When we run internally, this IP actually resolves to a valid address so we don't run into this block.

See reference: https://docs.aws.amazon.com/AmazonVPC/latest/UserGuide/vpc-dns.html

@skovic Maybe you can disable the resolver temporarily and see if that enables the test to pass?

@skovic skovic closed this Mar 8, 2018
@skovic skovic reopened this Mar 8, 2018
@skovic skovic closed this Mar 8, 2018
@skovic skovic reopened this Mar 8, 2018
@skovic
Copy link
Author

skovic commented Mar 8, 2018

@rodneyhbrown7 Thanks. Temporarily disabling resolving of hostnames let the test pass.
@agrare How should we address this issue, so the tests pass? Rodney's theory was correct.

@agrare
Copy link
Member

agrare commented Mar 8, 2018

@skovic what if you set the ipaddress in the FactoryGirl ems? Looks like that will shortcut the resolve code

@skovic
Copy link
Author

skovic commented Mar 8, 2018

@agrare I thought we were already doing that:

:ipaddress => "https://10.243.9.123:443"

@agrare
Copy link
Member

agrare commented Mar 8, 2018

So it will only try to resolve if one of these is true, probably if you switched the full URL to the hostname or made up a dummy hostname it won't call to resolve

if @ems.ipaddress.blank?
  resolve_ip_address(hostname, @ems)
end
if @ems.hostname_ipaddress?(hostname)
  resolve_hostname(hostname, @ems)
end

@skovic skovic force-pushed the full-refresh-test branch from 7b10e1c to b6997d2 Compare March 9, 2018 19:00
@miq-bot
Copy link
Member

miq-bot commented Mar 9, 2018

Checked commits skovic/manageiq-providers-lenovo@1d00d59~...b6997d2 with ruby 2.3.3, rubocop 0.52.1, haml-lint 0.20.0, and yamllint 1.10.0
3 files checked, 1 offense detected

**

  • 💣 💥 🔥 🚒 - Linter/Yaml - missing config files

@skovic
Copy link
Author

skovic commented Mar 9, 2018

@agrare Looks like a dummy hostname has fixed the problem. Please let me know if you would like me to make any other changes. Thanks

@skovic
Copy link
Author

skovic commented Mar 9, 2018

@miq-bot remove_label wip

@miq-bot miq-bot changed the title [WIP] Add spec test for a full refresh Add spec test for a full refresh Mar 9, 2018
@miq-bot miq-bot removed the wip label Mar 9, 2018
@agrare agrare self-assigned this Mar 12, 2018
@agrare agrare merged commit 36d36fd into ManageIQ:master Mar 12, 2018
@agrare agrare added this to the Sprint 81 Ending Mar 12, 2018 milestone Mar 12, 2018
@agrare agrare added the test label Mar 12, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants