Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[haproxy] unix socket stats urls #3005

Merged
merged 9 commits into from
Nov 17, 2016
Merged

Conversation

sj26
Copy link
Contributor

@sj26 sj26 commented Nov 8, 2016

What does this PR do?

Support unix sockets for haproxy stats collection.

Motivation

Running a separate stats socket with tight permissions instead of over a tcp port provides greater security.

Testing Guidelines

Sorry, I have no idea how to get the tests actually working in a local environment. It has been a world of pain. I'm hoping travis does a better job at running them and providing some useful feedback.

I'm on macOS 10.12, using homebrew, ruby 2.3.1, python 2.7.12. rake aborts on rubocop errors. So does rake ci:run. rake ci:run[haproxy] tries to compile haproxy which can't find openssl, even if I supply correct CFLAGS and LDFLAGS to my homebrew'd openssl install as env. Subsequent rake ci:run[haproxy] runs don't try to recompile haproxy, they fail with Errno::ENOENT No such file or directory - /Users/sj26/Projects/dd-agent/embedded/haproxy_1.5.11/haproxy. And cleanup fails because it can't find an haproxy pid.

I haven't attempted to figure out mocks yet.

Additional Notes

I realise this introduces a new dependency which requires updating your omnibus, but I'd like to get the code working first before jumping on that dragon.

@sj26 sj26 force-pushed the haproxy-unix-stats-urls branch from e0b2874 to bed8d80 Compare November 8, 2016 01:04
@sj26 sj26 force-pushed the haproxy-unix-stats-urls branch from bed8d80 to 0a08eac Compare November 8, 2016 01:12
@masci
Copy link
Contributor

masci commented Nov 8, 2016

Hi @sj26 and thanks for your code, we now that running the tests locally is not as easy as it should be, sorry for that!

Aside from the details that are making the CI fail, I'm a little bit concerned about the overall approach - if I understood correctly, by monkey_patching requests you're changing the behaviour of the check entirely, switching from TCP to unix sockets.

I agree moving to unix socket would improve security but we cannot break the compatibility with existing configurations. I'd make the check compatible with both approaches, defaulting to the old behaviour and making available the new one through the config file.

What do you think?

@sj26
Copy link
Contributor Author

sj26 commented Nov 8, 2016

The monkey patch shouldn't prevent requests from functioning as normal, or change any existing integrations, it just adds an extra adapter for the "http+unix" scheme. :-)

Copy link
Contributor

@masci masci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure this will work when passing a unix socket

@masci
Copy link
Contributor

masci commented Nov 8, 2016

Nice, that was my impression looking at the CI but I discovered the existence of requests_unixsocket only this morning 😄

pid = spawn("#{haproxy_rootdir}/haproxy", '-d', '-f',
"#{ENV['TRAVIS_BUILD_DIR']}/ci/resources/haproxy/#{name}.cfg",
"#{ENV["VOLATILE_DIR"]}/#{name}.cfg",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

change these to single quotes to make rubocop stop complaining

@masci masci added this to the Triage milestone Nov 8, 2016

r = requests.get(url, auth=auth, headers=headers(self.agentConfig), verify=verify, timeout=self.default_integration_http_timeout)
r.raise_for_status()
sock.close()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sj26
Copy link
Contributor Author

sj26 commented Nov 15, 2016

@masci sorry for the delays.

Turns out I was completely wrong and the admin/stats sockets don't run HTTP at all but uses their own line-based protocol. I've updated the implementation to accept unix:///var/run/haproxy.sock style values and talk unix sockets directly instead of using http in those cases. It actually makes things easier, and we don't need another egg.

I also managed to get the tests working (kind of) on my local environment, so can actually verify it works.

Mind taking another look?

Copy link
Contributor

@masci masci left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly nitpicks but the code looks good to me!

import time
try:
import urlparse
except LoadError:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't actually need this since the agent ships with python 2.7 (the exception name should be ImportError anyways, tests are failing on this).

simply import urlparse is ok.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Too much Ruby on the brain.)

sock.send("show stat\r\n")

response = ""
output = sock.recv(8192)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you define this number as a constant on the top of the module? Like:
BUFSIZE=8192

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@@ -5,6 +5,10 @@ instances:
# username: username
# password: password
#
# or, with a unix stats or admin socket:
#
# url: unix:///var/run/haproxy.sock
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nitpick: could you add a leading - to the example so it's clear that should be another instance and not an attribute of the first one?

 instances:
   - url: http://localhost/admin?stats
      # username: username
      # password: password
   # or, with a unix stats or admin socket:
   # - url: unix:///var/run/haproxy.sock

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, that looks odd to me, but updated.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah at first I thought it'd be clearer but doesn't seem so...
Feel free to adjust it as you think it's better from the user perspective, or leave it as it is and we'll fix that later.

@masci masci self-assigned this Nov 16, 2016
@masci masci modified the milestones: 5.11.0, Triage Nov 16, 2016
@masci masci removed the triage label Nov 16, 2016
@masci masci merged commit c9bd5b0 into DataDog:master Nov 17, 2016
@sj26
Copy link
Contributor Author

sj26 commented Nov 18, 2016

Thanks!

@masci masci removed this from the 5.11.0 milestone Jan 24, 2017
@masci masci modified the milestones: 5.12.0, 5.11.0 Jan 24, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants