Skip to content
This repository has been archived by the owner on Jan 31, 2023. It is now read-only.

#1 Identify Distro #3

Merged
4 commits merged into from
Apr 3, 2018
Merged

#1 Identify Distro #3

4 commits merged into from
Apr 3, 2018

Conversation

ghost
Copy link

@ghost ghost commented Mar 26, 2018

How does this work?

Houndigrade is a CLI script that lives in a container, the primary use case is running it on container host that has N amount of volumes attached. Houndigrade will mount the volumes and run some static checks to see what OS and (soon™) software is used on said volume. Houndigrade can inspect multiple volumes at the same time simply by passing extra ones in via the -t flag. Once the inspection is complete results are written to a queue for consumption by a different service.

How do I run it?

For Dev/Locally

Currently there is an included docker compose file that helps you run it locally. The file combined with an entrypoint script will start a queue container (accessible at localhost:15672 with default credentials guest/guest) and the houndigrade container. Inside the houndigrade container the entrypoint script mounts two block devices using losetup and then runs a scan against those volumes. Each one has a single partition with either RHEL or CentOS fingerprints. Once the run is done a message will be placed on the queue.

In AWS

High level overview of launching this in AWS is as follows:

  1. Have host in ECS cluster.
  2. Attach volumes to inspect to that host.
  3. Create a task definition with environmental variables defining queue information and script options as the container command. (Script itself is already the entrypoint)
  4. Launch a task on that host and await the results.

Sample Output

Here is sample output of a message that you'd find on a queue from a real™ scan run in AWS that was called with command parameters -c aws -t ami-test1 /dev/xvdba -t ami-test2 /dev/xvdbb -t ami-test3 /dev/xvdbc

{
    "cloud": "aws",
    "inspection_targets": [
        [
            "ami-test1",
            "/dev/xvdba"
        ],
        [
            "ami-test2",
            "/dev/xvdbb"
        ],
        [
            "ami-test3",
            "/dev/xvdbc"
        ]
    ],
    "facts": {
        "/dev/xvdba": {
            "image_id": "ami-test1",
            "/dev/xvdba3": [
                {
                    "rhel_found": true,
                    "release_file": "/mnt/inspect/etc/os-release",
                    "release_file_contents": "NAME=\"Red Hat Enterprise Linux Server\"\nVERSION=\"7.4 (Maipo)\"\nID=\"rhel\"\nID_LIKE=\"fedora\"\nVARIANT=\"Server\"\nVARIANT_ID=\"server\"\nVERSION_ID=\"7.4\"\nPRETTY_NAME=\"Red Hat Enterprise Linux Server 7.4 (Maipo)\"\nANSI_COLOR=\"0;31\"\nCPE_NAME=\"cpe:/o:redhat:enterprise_linux:7.4:GA:server\"\nHOME_URL=\"https://www.redhat.com/\"\nBUG_REPORT_URL=\"https://bugzilla.redhat.com/\"\n\nREDHAT_BUGZILLA_PRODUCT=\"Red Hat Enterprise Linux 7\"\nREDHAT_BUGZILLA_PRODUCT_VERSION=7.4\nREDHAT_SUPPORT_PRODUCT=\"Red Hat Enterprise Linux\"\nREDHAT_SUPPORT_PRODUCT_VERSION=\"7.4\"\n"
                },
                {
                    "rhel_found": true,
                    "release_file": "/mnt/inspect/etc/redhat-release",
                    "release_file_contents": "Red Hat Enterprise Linux Server release 7.4 (Maipo)\n"
                },
                {
                    "rhel_found": true,
                    "release_file": "/mnt/inspect/etc/system-release",
                    "release_file_contents": "Red Hat Enterprise Linux Server release 7.4 (Maipo)\n"
                }
            ],
            "/dev/xvdba2": [
                {
                    "error": "mount: unknown filesystem type 'swap'\n"
                }
            ],
            "/dev/xvdba1": [
                {
                    "rhel_found": false,
                    "status": "No release files found on /dev/xvdba1"
                }
            ]
        },
        "/dev/xvdbb": {
            "image_id": "ami-test2",
            "/dev/xvdbb3": [
                {
                    "rhel_found": false,
                    "release_file": "/mnt/inspect/etc/centos-release",
                    "release_file_contents": "CentOS Linux release 7.4.1708 (Core) \n"
                },
                {
                    "rhel_found": false,
                    "release_file": "/mnt/inspect/etc/os-release",
                    "release_file_contents": "NAME=\"CentOS Linux\"\nVERSION=\"7 (Core)\"\nID=\"centos\"\nID_LIKE=\"rhel fedora\"\nVERSION_ID=\"7\"\nPRETTY_NAME=\"CentOS Linux 7 (Core)\"\nANSI_COLOR=\"0;31\"\nCPE_NAME=\"cpe:/o:centos:centos:7\"\nHOME_URL=\"https://www.centos.org/\"\nBUG_REPORT_URL=\"https://bugs.centos.org/\"\n\nCENTOS_MANTISBT_PROJECT=\"CentOS-7\"\nCENTOS_MANTISBT_PROJECT_VERSION=\"7\"\nREDHAT_SUPPORT_PRODUCT=\"centos\"\nREDHAT_SUPPORT_PRODUCT_VERSION=\"7\"\n\n"
                },
                {
                    "rhel_found": false,
                    "release_file": "/mnt/inspect/etc/redhat-release",
                    "release_file_contents": "CentOS Linux release 7.4.1708 (Core) \n"
                },
                {
                    "rhel_found": false,
                    "release_file": "/mnt/inspect/etc/system-release",
                    "release_file_contents": "CentOS Linux release 7.4.1708 (Core) \n"
                }
            ],
            "/dev/xvdbb2": [
                {
                    "error": "mount: unknown filesystem type 'swap'\n"
                }
            ],
            "/dev/xvdbb1": [
                {
                    "rhel_found": false,
                    "status": "No release files found on /dev/xvdbb1"
                }
            ]
        },
        "/dev/xvdbc": {
            "image_id": "ami-test3",
            "/dev/xvdbc2": [
                {
                    "error": "mount: unknown filesystem type 'LVM2_member'\n"
                }
            ],
            "/dev/xvdbc1": [
                {
                    "rhel_found": false,
                    "status": "No release files found on /dev/xvdbc1"
                }
            ]
        }
    }
}

Demo

https://asciinema.org/a/QEqfQSH6MAzIJ67TTWuYKhxtb

@codecov
Copy link

codecov bot commented Mar 26, 2018

Codecov Report

❗ No coverage uploaded for pull request base (master@1e72bd2). Click here to learn what that means.
The diff coverage is 100%.

Impacted file tree graph

@@           Coverage Diff           @@
##             master     #3   +/-   ##
=======================================
  Coverage          ?   100%           
=======================================
  Files             ?      2           
  Lines             ?    193           
  Branches          ?     10           
=======================================
  Hits              ?    193           
  Misses            ?      0           
  Partials          ?      0
Impacted Files Coverage Δ
houndigrade/test_cli.py 100% <100%> (ø)
houndigrade/cli.py 100% <100%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1e72bd2...485a05c. Read the comment docs.

@ghost ghost force-pushed the 1-identify-distro branch 7 times, most recently from 2f46ed1 to 87524ba Compare March 26, 2018 20:17
@ghost ghost changed the title WIP: #1 Identify Distro #1 Identify Distro Mar 26, 2018
@ghost ghost self-requested a review March 26, 2018 20:31
blentz
blentz previously approved these changes Mar 27, 2018
Copy link

@blentz blentz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 lgtm

@ghost ghost force-pushed the 1-identify-distro branch from 2c2cec5 to 87524ba Compare March 27, 2018 15:30
Dockerfile Outdated
@@ -0,0 +1,16 @@
FROM python:3.6-alpine

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Went looking at pyup to see how they handle Pipfile and I found this interesting article around creating an entrypoint that activates the pipenv virtualenv at entry https://pyup.io/posts/pipfiles-and-docker/

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good info, but unfortunately I don't think it really buys us anything, especially since we actually have to run the container in privileged mode anyway.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, I hadn't made it all the way through things when I commented here.

Copy link

@adberglund adberglund left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was not as bad as you claimed! I know next to nothing around file systems, so I have no expertise to inject here. It scans, it reads files, it enqueues a message. ✅

@@ -0,0 +1,15 @@
[[source]]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pipfile support for pyup might not be 100%

pyupio/pyup#197

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea I'm a little on the fence about it, per pyupio/pyup#197 (comment) it seems like it should be at least usable. But if it misbehaves I have no issue with pulling it out and going back to traditional requirements files.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm all for going forward with the pipfile.

if rhel_found:
click.echo('RHEL found on: {}'.format(partition))

results.append({

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The results.append() call does not need to be duplicated in the if/else blocks. You could also use the rhel_found as the value for the rhel_found key in the result.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch!

})
else:
for release_file_path in release_file_paths:
rhel_found, file = check_file(release_file_path)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I got uneasy about file as a variable name, but then realized it's no longer a builit-in.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Python3 🎉

That said I'm open to changing it to a different name if you have suggestions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seconded on the file name. Conventions I've seen around this can be as simple as the_file or _file, but here I think a more descriptive name like contents or file_contents would be appropriate since it looks like that's what this really is.

@ghost ghost force-pushed the 1-identify-distro branch from 87524ba to 18be243 Compare March 27, 2018 18:56
is_flag=True,
help=_('Print debug output.'))
def main(cloud, target, debug):
r"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this a raw string? What's with the \b a few lines later?

Copy link
Author

@ghost ghost Mar 29, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

\b is for http://click.pocoo.org/5/documentation/#preventing-rewrapping and python wants it to be a raw string if I have back slashes.

To expand on this a bit, I suppose I could use a normal string and escape that backslash and see if the no re-wrapping flag still works, but I didn't see any harm in just using a raw string.

EDIT: Tested, doesn't work 😞

}

for image_id, drive in target:
click.echo('Checking drive {}'.format(drive))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason for using click.echo instead of the standard Python logger? Does echo give us the option to include formatting, timestamp, etc. in its output? Do you think there would be any value in having a richer output format that could be given by the Python logger, or will we only ever want bare-bones stdout from this script?

(FWIW, I'm not terribly familiar with the click library, but I did look enough into the docs to see that echo appears to be a thin wrapper around print that can handle things like ANSI color codes.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

imo even the stdout is overkill and I've considered removing it entirely, or putting all of it behind the --debug flag since normally where this will be running will never really be seen by anyone. There was no specific reason for using click.echo, but it does offer few things over standard print (like being able to easily output to stderr), and ability to use its other functions later on if deemed necessary (things like progress bars, pagers, colors, formatting path names, etc.).

As for the standard python logger ¯\_(ツ)_/¯ I don't see a lot of value since we're running in a container without having to spend additional time to set up another tool that would consume those logs so they wouldn't disappear when the container exits.

click.echo('Checking drive {}'.format(drive))

results['facts'][drive] = {'image_id': image_id}
for partition in get_partitions(drive):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd consider putting this for loop logic into a separate function like you did with report_results. That should make it easier to test and improve readability.

})
else:
for release_file_path in release_file_paths:
rhel_found, file = check_file(release_file_path)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seconded on the file name. Conventions I've seen around this can be as simple as the_file or _file, but here I think a more descriptive name like contents or file_contents would be appropriate since it looks like that's what this really is.

that gets placed on a queue once the processing is done.

\b
Args:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is effectively duplicating what's already stated in the decorators, I'd suggest dropping the "Args" lines.

with open('{}/xvdf1/etc/redhat-release'.format(drive_path), 'w') as f:
f.write('Red Hat Enterprise Linux Server release 7.4 (Maipo)\n')
with open('{}/xvdf1/etc/os-release'.format(drive_path), 'w') as f:
f.write('NAME=\"Red Hat Enterprise Linux Server\"\nVERSION=\"7.4 '
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are all those double-quote escapes really necessary since the string is defined with single-quotes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, what would you think about making this a triple-quote string so you don't have to escape the newlines?

Maybe shove the whole thing into a variable using triple-quotes and pass it through textwrap.dedent before f.writeing it. I think that might make this more legible.

click.echo('Checking partition {}'.format(partition))

results['facts'][drive][partition] = []
try:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The more I look at this, the more I think it should be a context manager. Something like:

with mount(partition, path, blah blah):
    check_release_files(partition, results['facts'][drive][partition])

That would move the exception handling and unmount command into the context manager's definition, leaving just the interesting business logic here.

I'm digging through pypi to see if there are any decent wrapper libraries for mount that would give us a context manager for free and/or save us from using subprocesses, but I haven't found anything yet. 😞

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Context Managered™

@ghost ghost force-pushed the 1-identify-distro branch from 46f34ad to 04fd7a1 Compare April 2, 2018 15:53
'error': e.stderr
})

continue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this continue do anything here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not anymore! That freeloader is outta there.

"""
try:
with open(file_path) as f:
file = f.read()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you forget to rename file or decide not to?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😅


except subprocess.CalledProcessError as e:
click.echo(
_('Mount of {} failed '
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this split string can fit on the same line.

(I assume this is a leftover from refactoring and moving around code.)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not with the format on the end, I can shuffle some stuff around to make it a bit more visually appealing, but I don't think number of lines changes.

}

for image_id, drive in target:
click.echo('Checking drive {}'.format(drive))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any thoughts on the earlier comment about lifting the contents of this for loop into a separate function (effectively "inspect this one image_id/drive pair"). My thought was that it would improve readability and testability, although the latter may not be an issue now if the tests are already there.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was a little hesitant to pull it out as it feels like we are building a matryoshka doll of function calls with them only being called in one place. Test wise, the tests are already there so that point is a bit moot, and calling the main method isn't exactly difficult in this instance.

That said, I did go ahead and pull it out, let me know if you like the look of it more now.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like it! 🎉

}

for image_id, drive in target:
click.echo('Checking drive {}'.format(drive))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one and a few other echo calls (line 55, 146, …) are missing the _ translation function.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed!

@ghost ghost force-pushed the 1-identify-distro branch from 04fd7a1 to 0282020 Compare April 2, 2018 16:46
Copy link
Member

@infinitewarp infinitewarp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Whoops. I put on my i18n/l10n goggles and noticed a few issues I missed before… 🔍

@click.option('--cloud',
'-c',
default='aws',
help='Cloud in which we are performing the inspection.',
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_() 🙂

that gets placed on a queue once the processing is done.

"""
click.echo(_('Provided cloud: {}'.format(cloud)))
Copy link
Member

@infinitewarp infinitewarp Apr 2, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The matryoshka doll is closing incorrectly here. You want the .format to happen after the translation occurs. Like this:

_('Provided cloud: {}').format(cloud)

Otherwise we're in the situation of asking for un-translated strings dynamically at runtime.


"""
click.echo(_('Provided cloud: {}'.format(cloud)))
click.echo(_('Provided drive(s) to inspect: {}'.format(target)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like above,

_('Provided drive(s) to inspect: {}').format(target)

results (dict): The results of the inspection.

"""
click.echo(_('Checking drive {}'.format(drive)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like previous comment…

click.echo(_('Checking drive {}'.format(drive)))
results['facts'][drive] = {'image_id': image_id}
for partition in get_partitions(drive):
click.echo(_('Checking partition {}'.format(partition)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like previous comment…

release_file_paths = find_release_files()

if not release_file_paths:
click.echo(_('No release files found on {}'.format(partition)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like previous comment…

click.echo(_('No release files found on {}'.format(partition)))
results.append({
'rhel_found': False,
'status': 'No release files found on {}'.format(partition)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like previous comment…

rhel_found, contents = check_file(release_file_path)

if rhel_found:
click.echo(_('RHEL found on: {}'.format(partition)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like previous comment…

if rhel_found:
click.echo(_('RHEL found on: {}'.format(partition)))
else:
click.echo(_('RHEL not found on: {}'.format(partition)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Like previous comment…

self.assertTrue(mock_subprocess_run.called)
self.assertEqual(result.exit_code, 0)
self.assertIn(
'"status": "No release files found on ./dev/xvdf1"', result.output)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You should construct the "No release files found on" string using _ like it is in the main code because if someone runs the tests on a non-English-localized system, that could cause this test to fail.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Or we force the English localization on our tests! I'm okay with either solution, although I don't know off-hand how to conveniently force English in the scope of a test.

@ghost ghost force-pushed the 1-identify-distro branch 2 times, most recently from df89c01 to 9865bca Compare April 2, 2018 17:15
@ghost ghost force-pushed the 1-identify-distro branch from 9865bca to 7baa4ec Compare April 2, 2018 17:23

click.echo('Unmounting partition: {}'.format(partition))
subprocess.run(['umount', '{}'.format(INSPECT_PATH)])
@contextmanager

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👏

@ghost ghost force-pushed the 1-identify-distro branch from 7baa4ec to 485a05c Compare April 2, 2018 17:55
Copy link
Member

@infinitewarp infinitewarp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ghost ghost merged commit 5d7e735 into master Apr 3, 2018
@ghost ghost deleted the 1-identify-distro branch April 3, 2018 13:22
This pull request was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants