#1 Identify Distro #3

ghost · 2018-03-26T19:41:48Z

How does this work?

Houndigrade is a CLI script that lives in a container. The primary use case is running it on a container host that has N volumes attached. Houndigrade mounts the volumes and runs some static checks to see what OS and (soon™) software is used on each volume. It can inspect multiple volumes in one run by passing extra ones in via the -t flag. Once the inspection is complete, the results are written to a queue for consumption by a different service.
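To illustrate how a repeated -t flag can collect several image/volume pairs, here is a minimal sketch. It uses the standard library's argparse as a stand-in (the real script is built on click), and the long option name --target and the helper build_parser are assumptions made for illustration only.

```python
import argparse


def build_parser():
    """Build a stand-in parser; the real script uses click, not argparse."""
    parser = argparse.ArgumentParser(prog="houndigrade")
    parser.add_argument("-c", "--cloud", default="aws",
                        help="Cloud in which we are performing the inspection.")
    # Each -t takes two values; repeating the flag appends another pair.
    parser.add_argument("-t", "--target", nargs=2, action="append",
                        metavar=("IMAGE_ID", "DRIVE"), default=[],
                        help="Image id and block device to inspect.")
    return parser


args = build_parser().parse_args(
    ["-c", "aws", "-t", "ami-test1", "/dev/xvdba", "-t", "ami-test2", "/dev/xvdbb"]
)
for image_id, drive in args.target:
    print(image_id, drive)
```

Each pair arrives as (image id, device), which matches the shape of inspection_targets in the sample message.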

How do I run it?

For Dev/Locally

Currently there is an included Docker Compose file that helps you run it locally. The file, combined with an entrypoint script, starts a queue container (accessible at localhost:15672 with default credentials guest/guest) and the houndigrade container. Inside the houndigrade container, the entrypoint script mounts two block devices using losetup and then runs a scan against those volumes. Each one has a single partition with either RHEL or CentOS fingerprints. Once the run is done, a message is placed on the queue.

In AWS

A high-level overview of launching this in AWS:

Have host in ECS cluster.
Attach volumes to inspect to that host.
Create a task definition with environment variables defining the queue information, and pass the script options as the container command. (The script itself is already the entrypoint.)
Launch a task on that host and await the results.

Sample Output

Here is sample output of a message that you'd find on a queue from a real™ scan run in AWS, called with the command parameters -c aws -t ami-test1 /dev/xvdba -t ami-test2 /dev/xvdbb -t ami-test3 /dev/xvdbc:

{
    "cloud": "aws",
    "inspection_targets": [
        [
            "ami-test1",
            "/dev/xvdba"
        ],
        [
            "ami-test2",
            "/dev/xvdbb"
        ],
        [
            "ami-test3",
            "/dev/xvdbc"
        ]
    ],
    "facts": {
        "/dev/xvdba": {
            "image_id": "ami-test1",
            "/dev/xvdba3": [
                {
                    "rhel_found": true,
                    "release_file": "/mnt/inspect/etc/os-release",
                    "release_file_contents": "NAME=\"Red Hat Enterprise Linux Server\"\nVERSION=\"7.4 (Maipo)\"\nID=\"rhel\"\nID_LIKE=\"fedora\"\nVARIANT=\"Server\"\nVARIANT_ID=\"server\"\nVERSION_ID=\"7.4\"\nPRETTY_NAME=\"Red Hat Enterprise Linux Server 7.4 (Maipo)\"\nANSI_COLOR=\"0;31\"\nCPE_NAME=\"cpe:/o:redhat:enterprise_linux:7.4:GA:server\"\nHOME_URL=\"https://www.redhat.com/\"\nBUG_REPORT_URL=\"https://bugzilla.redhat.com/\"\n\nREDHAT_BUGZILLA_PRODUCT=\"Red Hat Enterprise Linux 7\"\nREDHAT_BUGZILLA_PRODUCT_VERSION=7.4\nREDHAT_SUPPORT_PRODUCT=\"Red Hat Enterprise Linux\"\nREDHAT_SUPPORT_PRODUCT_VERSION=\"7.4\"\n"
                },
                {
                    "rhel_found": true,
                    "release_file": "/mnt/inspect/etc/redhat-release",
                    "release_file_contents": "Red Hat Enterprise Linux Server release 7.4 (Maipo)\n"
                },
                {
                    "rhel_found": true,
                    "release_file": "/mnt/inspect/etc/system-release",
                    "release_file_contents": "Red Hat Enterprise Linux Server release 7.4 (Maipo)\n"
                }
            ],
            "/dev/xvdba2": [
                {
                    "error": "mount: unknown filesystem type 'swap'\n"
                }
            ],
            "/dev/xvdba1": [
                {
                    "rhel_found": false,
                    "status": "No release files found on /dev/xvdba1"
                }
            ]
        },
        "/dev/xvdbb": {
            "image_id": "ami-test2",
            "/dev/xvdbb3": [
                {
                    "rhel_found": false,
                    "release_file": "/mnt/inspect/etc/centos-release",
                    "release_file_contents": "CentOS Linux release 7.4.1708 (Core) \n"
                },
                {
                    "rhel_found": false,
                    "release_file": "/mnt/inspect/etc/os-release",
                    "release_file_contents": "NAME=\"CentOS Linux\"\nVERSION=\"7 (Core)\"\nID=\"centos\"\nID_LIKE=\"rhel fedora\"\nVERSION_ID=\"7\"\nPRETTY_NAME=\"CentOS Linux 7 (Core)\"\nANSI_COLOR=\"0;31\"\nCPE_NAME=\"cpe:/o:centos:centos:7\"\nHOME_URL=\"https://www.centos.org/\"\nBUG_REPORT_URL=\"https://bugs.centos.org/\"\n\nCENTOS_MANTISBT_PROJECT=\"CentOS-7\"\nCENTOS_MANTISBT_PROJECT_VERSION=\"7\"\nREDHAT_SUPPORT_PRODUCT=\"centos\"\nREDHAT_SUPPORT_PRODUCT_VERSION=\"7\"\n\n"
                },
                {
                    "rhel_found": false,
                    "release_file": "/mnt/inspect/etc/redhat-release",
                    "release_file_contents": "CentOS Linux release 7.4.1708 (Core) \n"
                },
                {
                    "rhel_found": false,
                    "release_file": "/mnt/inspect/etc/system-release",
                    "release_file_contents": "CentOS Linux release 7.4.1708 (Core) \n"
                }
            ],
            "/dev/xvdbb2": [
                {
                    "error": "mount: unknown filesystem type 'swap'\n"
                }
            ],
            "/dev/xvdbb1": [
                {
                    "rhel_found": false,
                    "status": "No release files found on /dev/xvdbb1"
                }
            ]
        },
        "/dev/xvdbc": {
            "image_id": "ami-test3",
            "/dev/xvdbc2": [
                {
                    "error": "mount: unknown filesystem type 'LVM2_member'\n"
                }
            ],
            "/dev/xvdbc1": [
                {
                    "rhel_found": false,
                    "status": "No release files found on /dev/xvdbc1"
                }
            ]
        }
    }
}
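For whoever consumes these messages, a rough sketch of pulling out the image ids that had RHEL detected might look like the following. The function name rhel_images is hypothetical, and the embedded message is a trimmed-down copy of the shape shown above, not real scan output.

```python
import json

# Trimmed-down message in the same shape as the sample above.
raw = """
{
    "cloud": "aws",
    "facts": {
        "/dev/xvdba": {
            "image_id": "ami-test1",
            "/dev/xvdba3": [{"rhel_found": true, "release_file": "/mnt/inspect/etc/redhat-release"}],
            "/dev/xvdba2": [{"error": "mount: unknown filesystem type 'swap'"}]
        },
        "/dev/xvdbb": {
            "image_id": "ami-test2",
            "/dev/xvdbb1": [{"rhel_found": false, "status": "No release files found on /dev/xvdbb1"}]
        }
    }
}
"""
message = json.loads(raw)


def rhel_images(message):
    """Return image ids with at least one partition entry reporting rhel_found."""
    found = []
    for drive, facts in message["facts"].items():
        for key, entries in facts.items():
            if key == "image_id":
                continue  # every other key is a partition with a list of entries
            if any(entry.get("rhel_found") for entry in entries):
                found.append(facts["image_id"])
                break
    return found


print(rhel_images(message))
```

Entries that only carry an "error" key (swap, LVM2_member) are treated as "not found" here, which is an assumption about how a consumer would want to read them.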

Demo

https://asciinema.org/a/QEqfQSH6MAzIJ67TTWuYKhxtb

codecov · 2018-03-26T19:48:29Z

Codecov Report

❗ No coverage uploaded for pull request base (master@1e72bd2).
The diff coverage is 100%.

@@           Coverage Diff           @@
##             master     #3   +/-   ##
=======================================
  Coverage          ?   100%           
=======================================
  Files             ?      2           
  Lines             ?    193           
  Branches          ?     10           
=======================================
  Hits              ?    193           
  Misses            ?      0           
  Partials          ?      0

Impacted Files	Coverage Δ
houndigrade/test_cli.py	`100% <100%> (ø)`
houndigrade/cli.py	`100% <100%> (ø)`

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Last update 1e72bd2...485a05c.

blentz

👍 lgtm

adberglund · 2018-03-27T16:41:49Z

Dockerfile

@@ -0,0 +1,16 @@
+FROM python:3.6-alpine


Went looking at pyup to see how they handle Pipfile and I found this interesting article around creating an entrypoint that activates the pipenv virtualenv at entry https://pyup.io/posts/pipfiles-and-docker/

Good info, but unfortunately I don't think it really buys us anything, especially since we actually have to run the container in privileged mode anyway.

Right, I hadn't made it all the way through things when I commented here.

adberglund

This was not as bad as you claimed! I know next to nothing around file systems, so I have no expertise to inject here. It scans, it reads files, it enqueues a message. ✅

adberglund · 2018-03-27T16:46:30Z

Pipfile

@@ -0,0 +1,15 @@
+[[source]]


Pipfile support for pyup might not be 100%

pyupio/pyup#197

Yea I'm a little on the fence about it, per pyupio/pyup#197 (comment) it seems like it should be at least usable. But if it misbehaves I have no issue with pulling it out and going back to traditional requirements files.

I'm all for going forward with the pipfile.

adberglund · 2018-03-27T16:48:28Z

houndigrade/cli.py

+            if rhel_found:
+                click.echo('RHEL found on: {}'.format(partition))
+
+                results.append({


The results.append() call does not need to be duplicated in the if/else blocks. You could also use the rhel_found as the value for the rhel_found key in the result.
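A minimal sketch of that suggestion, assuming a hypothetical helper named record_result and using print in place of click.echo so it runs without extra dependencies. The dict keys are taken from the sample output in the PR description.

```python
def record_result(results, partition, release_file_path, rhel_found, contents):
    """Append one result entry; only the echoed message depends on rhel_found."""
    if rhel_found:
        print('RHEL found on: {}'.format(partition))
    else:
        print('RHEL not found on: {}'.format(partition))

    # Single append: rhel_found is used directly as the value.
    results.append({
        'rhel_found': rhel_found,
        'release_file': release_file_path,
        'release_file_contents': contents,
    })


results = []
record_result(results, '/dev/xvdba3', '/mnt/inspect/etc/redhat-release',
              True, 'Red Hat Enterprise Linux Server release 7.4 (Maipo)\n')
```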

Good catch!

adberglund · 2018-03-27T16:51:47Z

houndigrade/cli.py

+        })
+    else:
+        for release_file_path in release_file_paths:
+            rhel_found, file = check_file(release_file_path)


I got uneasy about file as a variable name, but then realized it's no longer a built-in.

Python3 🎉

That said I'm open to changing it to a different name if you have suggestions.

Seconded on the file name. Conventions I've seen around this can be as simple as the_file or _file, but here I think a more descriptive name like contents or file_contents would be appropriate since it looks like that's what this really is.
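To show how the rename might read, here is a rough sketch of check_file returning (rhel_found, contents). The 'Red Hat' substring check and the OSError handling are assumptions for illustration; the actual detection logic in the PR is not shown in this thread.

```python
import os
import tempfile


def check_file(file_path):
    """Return (rhel_found, contents) for one release file."""
    try:
        with open(file_path) as f:
            contents = f.read()
    except OSError:
        # Assumption: an unreadable file counts as "not RHEL".
        return False, None
    # Assumption: a simple substring check stands in for the real detection.
    return 'Red Hat' in contents, contents


with tempfile.TemporaryDirectory() as tmp:
    release_file_path = os.path.join(tmp, 'redhat-release')
    with open(release_file_path, 'w') as f:
        f.write('Red Hat Enterprise Linux Server release 7.4 (Maipo)\n')
    rhel_found, contents = check_file(release_file_path)
```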

infinitewarp · 2018-03-29T12:59:53Z

houndigrade/cli.py

+              is_flag=True,
+              help=_('Print debug output.'))
+def main(cloud, target, debug):
+    r"""


Why is this a raw string? What's with the \b a few lines later?

\b is for http://click.pocoo.org/5/documentation/#preventing-rewrapping and python wants it to be a raw string if I have back slashes.

To expand on this a bit, I suppose I could use a normal string and escape that backslash and see if the no re-wrapping flag still works, but I didn't see any harm in just using a raw string.

EDIT: Tested, doesn't work 😞

infinitewarp · 2018-03-29T13:02:21Z

houndigrade/cli.py

+    }
+
+    for image_id, drive in target:
+        click.echo('Checking drive {}'.format(drive))


What's the reason for using click.echo instead of the standard Python logger? Does echo give us the option to include formatting, timestamp, etc. in its output? Do you think there would be any value in having a richer output format that could be given by the Python logger, or will we only ever want bare-bones stdout from this script?

(FWIW, I'm not terribly familiar with the click library, but I did look enough into the docs to see that echo appears to be a thin wrapper around print that can handle things like ANSI color codes.)

imo even the stdout is overkill and I've considered removing it entirely, or putting all of it behind the --debug flag, since the place where this normally runs will never really be seen by anyone. There was no specific reason for using click.echo, but it does offer a few things over standard print (like being able to easily output to stderr), plus the ability to use its other functions later on if deemed necessary (things like progress bars, pagers, colors, formatting path names, etc.).

As for the standard python logger ¯\_(ツ)_/¯ I don't see a lot of value: we're running in a container, so we'd have to spend additional time setting up another tool to consume those logs so they wouldn't disappear when the container exits.

infinitewarp · 2018-03-29T13:16:36Z

houndigrade/cli.py

+        click.echo('Checking drive {}'.format(drive))
+
+        results['facts'][drive] = {'image_id': image_id}
+        for partition in get_partitions(drive):


I'd consider putting this for loop logic into a separate function like you did with report_results. That should make it easier to test and improve readability.
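Something along these lines, as a sketch only. The name inspect_drive and the idea of passing get_partitions and check_partition in as arguments are assumptions made so the example runs on its own; the real helpers live in cli.py.

```python
def inspect_drive(image_id, drive, results, get_partitions, check_partition):
    """Inspect one image_id/drive pair and record its facts in results."""
    results['facts'][drive] = {'image_id': image_id}
    for partition in get_partitions(drive):
        results['facts'][drive][partition] = []
        check_partition(partition, results['facts'][drive][partition])


# Fakes standing in for the real partition lookup and release-file checks.
def fake_get_partitions(drive):
    return [drive + '1']


def fake_check_partition(partition, partition_results):
    partition_results.append({'rhel_found': False})


results = {'facts': {}}
inspect_drive('ami-test1', '/dev/xvdba', results,
              fake_get_partitions, fake_check_partition)
```

With the loop body isolated like this, a test can exercise a single drive without going through main.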

infinitewarp · 2018-03-29T13:22:52Z

houndigrade/cli.py

+        })
+    else:
+        for release_file_path in release_file_paths:
+            rhel_found, file = check_file(release_file_path)


Seconded on the file name. Conventions I've seen around this can be as simple as the_file or _file, but here I think a more descriptive name like contents or file_contents would be appropriate since it looks like that's what this really is.

infinitewarp · 2018-04-02T13:24:59Z

houndigrade/cli.py

+    that gets placed on a queue once the processing is done.
+
+    \b
+    Args:


Since this is effectively duplicating what's already stated in the decorators, I'd suggest dropping the "Args" lines.

infinitewarp · 2018-04-02T13:53:03Z

houndigrade/test_cli.py

+        with open('{}/xvdf1/etc/redhat-release'.format(drive_path), 'w') as f:
+            f.write('Red Hat Enterprise Linux Server release 7.4 (Maipo)\n')
+        with open('{}/xvdf1/etc/os-release'.format(drive_path), 'w') as f:
+            f.write('NAME=\"Red Hat Enterprise Linux Server\"\nVERSION=\"7.4 '


Are all those double-quote escapes really necessary since the string is defined with single-quotes?

Also, what would you think about making this a triple-quote string so you don't have to escape the newlines?

Maybe shove the whole thing into a variable using triple-quotes and pass it through textwrap.dedent before handing it to f.write. I think that might make this more legible.
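Roughly what I have in mind, as a sketch. The constant name OS_RELEASE and the shortened file contents are illustrative, and the temporary directory stands in for the test's drive_path.

```python
import os
import tempfile
import textwrap

# The backslash after the opening quotes suppresses the leading newline;
# dedent then strips the common four-space indent from every line.
OS_RELEASE = textwrap.dedent('''\
    NAME="Red Hat Enterprise Linux Server"
    VERSION="7.4 (Maipo)"
    ID="rhel"
    VERSION_ID="7.4"
''')

with tempfile.TemporaryDirectory() as drive_path:
    release_path = os.path.join(drive_path, 'os-release')
    with open(release_path, 'w') as f:
        f.write(OS_RELEASE)
    with open(release_path) as f:
        written = f.read()
```

No quote or newline escapes are needed, since the double quotes sit inside a single-quoted triple-quote string.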

infinitewarp · 2018-04-02T14:10:58Z

houndigrade/cli.py

+            click.echo('Checking partition {}'.format(partition))
+
+            results['facts'][drive][partition] = []
+            try:


The more I look at this, the more I think it should be a context manager. Something like:

with mount(partition, path, blah blah):
    check_release_files(partition, results['facts'][drive][partition])

That would move the exception handling and unmount command into the context manager's definition, leaving just the interesting business logic here.

I'm digging through pypi to see if there are any decent wrapper libraries for mount that would give us a context manager for free and/or save us from using subprocesses, but I haven't found anything yet. 😞

Context Managered™
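For readers following along, a minimal sketch of what such a context manager could look like. This is not the code from the PR: the exact mount arguments, the run parameter (added here so the example can be exercised without root), and the decision to let a failed mount propagate as CalledProcessError are all assumptions.

```python
import subprocess
from contextlib import contextmanager

INSPECT_PATH = '/mnt/inspect'


@contextmanager
def mount(partition, inspect_path=INSPECT_PATH, run=subprocess.run):
    """Mount partition at inspect_path and always unmount on exit."""
    # With subprocess.run, check=True raises CalledProcessError when mount
    # fails, so the with-block body never runs against an unmounted path.
    run(['mount', partition, inspect_path], check=True)
    try:
        yield inspect_path
    finally:
        # Runs even if the body raises, so the partition is not left mounted.
        run(['umount', inspect_path])


calls = []


def fake_run(cmd, **kwargs):
    """Record the command instead of executing it (no root needed)."""
    calls.append(cmd)


with mount('/dev/xvdf1', run=fake_run) as path:
    print('inspecting', path)
```

The caller is left with only the business logic inside the with-block; whether mount errors are caught inside or outside the manager is a design choice this sketch leaves to the caller.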

infinitewarp · 2018-04-02T16:07:14Z

houndigrade/cli.py

+                    'error': e.stderr
+                })
+
+                continue


Does this continue do anything here?

Not anymore! That freeloader is outta there.

infinitewarp · 2018-04-02T16:09:46Z

houndigrade/cli.py

+    """
+    try:
+        with open(file_path) as f:
+            file = f.read()


Did you forget to rename file or decide not to?

infinitewarp · 2018-04-02T16:10:59Z

houndigrade/cli.py

+
+            except subprocess.CalledProcessError as e:
+                click.echo(
+                    _('Mount of {} failed '


I think this split string can fit on the same line.

(I assume this is a leftover from refactoring and moving around code.)

Not with the format on the end, I can shuffle some stuff around to make it a bit more visually appealing, but I don't think number of lines changes.

infinitewarp · 2018-04-02T16:19:51Z

houndigrade/cli.py

+    }
+
+    for image_id, drive in target:
+        click.echo('Checking drive {}'.format(drive))


Any thoughts on the earlier comment about lifting the contents of this for loop into a separate function (effectively "inspect this one image_id/drive pair")? My thought was that it would improve readability and testability, although the latter may not be an issue now if the tests are already there.

I was a little hesitant to pull it out as it feels like we are building a matryoshka doll of function calls with them only being called in one place. Test wise, the tests are already there so that point is a bit moot, and calling the main method isn't exactly difficult in this instance.

That said, I did go ahead and pull it out, let me know if you like the look of it more now.

I like it! 🎉

infinitewarp · 2018-04-02T16:26:23Z

houndigrade/cli.py

+    }
+
+    for image_id, drive in target:
+        click.echo('Checking drive {}'.format(drive))


This one and a few other echo calls (line 55, 146, …) are missing the _ translation function.

infinitewarp

Whoops. I put on my i18n/l10n goggles and noticed a few issues I missed before… 🔍

infinitewarp · 2018-04-02T16:54:20Z

houndigrade/cli.py

+@click.option('--cloud',
+              '-c',
+              default='aws',
+              help='Cloud in which we are performing the inspection.',


_(…) 🙂

infinitewarp · 2018-04-02T16:56:29Z

houndigrade/cli.py

+    that gets placed on a queue once the processing is done.
+
+    """
+    click.echo(_('Provided cloud: {}'.format(cloud)))


The matryoshka doll is closing incorrectly here. You want the .format to happen after the translation occurs. Like this:

_('Provided cloud: {}').format(cloud)

Otherwise we're in the situation of asking for un-translated strings dynamically at runtime.
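A small sketch of why the order matters, using a dict-backed stand-in for the translation function. The CATALOG contents and the Spanish string are made up for illustration; a real project would load translations through gettext.

```python
# Stand-in catalog keyed by the untranslated template string.
CATALOG = {'Provided cloud: {}': 'Nube proporcionada: {}'}


def _(message):
    """Return the translation for message, or message itself if unknown."""
    return CATALOG.get(message, message)


cloud = 'aws'

# Wrong: the lookup key is 'Provided cloud: aws', which no catalog contains,
# so the message falls through untranslated.
wrong = _('Provided cloud: {}'.format(cloud))

# Right: look up the template first, then fill in the value.
right = _('Provided cloud: {}').format(cloud)

print(wrong)
print(right)
```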

infinitewarp · 2018-04-02T16:57:01Z

houndigrade/cli.py

+
+    """
+    click.echo(_('Provided cloud: {}'.format(cloud)))
+    click.echo(_('Provided drive(s) to inspect: {}'.format(target)))


Like above,

_('Provided drive(s) to inspect: {}').format(target)

infinitewarp · 2018-04-02T16:57:24Z

houndigrade/cli.py

+        results (dict): The results of the inspection.
+
+    """
+    click.echo(_('Checking drive {}'.format(drive)))


Like previous comment…

infinitewarp · 2018-04-02T16:57:32Z

houndigrade/cli.py

+    click.echo(_('Checking drive {}'.format(drive)))
+    results['facts'][drive] = {'image_id': image_id}
+    for partition in get_partitions(drive):
+        click.echo(_('Checking partition {}'.format(partition)))


Like previous comment…

infinitewarp · 2018-04-02T16:58:05Z

houndigrade/cli.py

+    release_file_paths = find_release_files()
+
+    if not release_file_paths:
+        click.echo(_('No release files found on {}'.format(partition)))


Like previous comment…

infinitewarp · 2018-04-02T16:58:11Z

houndigrade/cli.py

+        click.echo(_('No release files found on {}'.format(partition)))
+        results.append({
+            'rhel_found': False,
+            'status': 'No release files found on {}'.format(partition)


Like previous comment…

infinitewarp · 2018-04-02T16:58:17Z

houndigrade/cli.py

+            rhel_found, contents = check_file(release_file_path)
+
+            if rhel_found:
+                click.echo(_('RHEL found on: {}'.format(partition)))


Like previous comment…

infinitewarp · 2018-04-02T16:58:21Z

houndigrade/cli.py

+            if rhel_found:
+                click.echo(_('RHEL found on: {}'.format(partition)))
+            else:
+                click.echo(_('RHEL not found on: {}'.format(partition)))


Like previous comment…

infinitewarp · 2018-04-02T17:00:00Z

houndigrade/test_cli.py

+        self.assertTrue(mock_subprocess_run.called)
+        self.assertEqual(result.exit_code, 0)
+        self.assertIn(
+            '"status": "No release files found on ./dev/xvdf1"', result.output)


You should construct the "No release files found on" string using _ like it is in the main code because if someone runs the tests on a non-English-localized system, that could cause this test to fail.

Or we force the English localization on our tests! I'm okay with either solution, although I don't know off-hand how to conveniently force English in the scope of a test.
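One possible way to pin the tests to the untranslated (English) strings, sketched under the assumption that the code's _ is a gettext-style function: gettext.NullTranslations returns every message unchanged, so a test could build its expected string with it, or patch the module's _ with it. How _ is actually wired up in cli.py is not shown here, so treat this as a starting point rather than a drop-in fix.

```python
import gettext

# NullTranslations performs no lookup and returns each message as-is,
# independent of the locale of the machine running the tests.
_ = gettext.NullTranslations().gettext

partition = './dev/xvdf1'
expected = _('No release files found on {}').format(partition)
print(expected)
```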

adberglund · 2018-04-02T17:53:32Z

houndigrade/cli.py


-            click.echo('Unmounting partition: {}'.format(partition))
-            subprocess.run(['umount', '{}'.format(INSPECT_PATH)])
+@contextmanager


infinitewarp

Add skeleton and supporting files.

9c5a6ad

ghost force-pushed the 1-identify-distro branch 7 times, most recently from 2f46ed1 to 87524ba Compare March 26, 2018 20:17

ghost changed the title ~~WIP: #1 Identify Distro~~ #1 Identify Distro Mar 26, 2018

ghost self-requested a review March 26, 2018 20:31

blentz previously approved these changes Mar 27, 2018

ghost dismissed blentz’s stale review via 2c2cec5 March 27, 2018 15:22

ghost force-pushed the 1-identify-distro branch from 2c2cec5 to 87524ba Compare March 27, 2018 15:30

adberglund reviewed Mar 27, 2018

Add houndigrade cli.

18be243

ghost force-pushed the 1-identify-distro branch from 87524ba to 18be243 Compare March 27, 2018 18:56

Update docker container to be centos based

ff18f3e

infinitewarp suggested changes Mar 29, 2018

infinitewarp suggested changes Apr 2, 2018

ghost force-pushed the 1-identify-distro branch from 46f34ad to 04fd7a1 Compare April 2, 2018 15:53

infinitewarp suggested changes Apr 2, 2018

ghost force-pushed the 1-identify-distro branch from 04fd7a1 to 0282020 Compare April 2, 2018 16:46

infinitewarp suggested changes Apr 2, 2018

ghost force-pushed the 1-identify-distro branch 2 times, most recently from df89c01 to 9865bca Compare April 2, 2018 17:15

ghost force-pushed the 1-identify-distro branch from 9865bca to 7baa4ec Compare April 2, 2018 17:23

adberglund reviewed Apr 2, 2018

Context managers and other amazing PR tweaks.

485a05c

ghost force-pushed the 1-identify-distro branch from 7baa4ec to 485a05c Compare April 2, 2018 17:55

infinitewarp approved these changes Apr 3, 2018

ghost merged commit 5d7e735 into master Apr 3, 2018

ghost deleted the 1-identify-distro branch April 3, 2018 13:22

This pull request was closed.
