[synapse] enable generator config update from ZK #241

Merged: 9 commits from feng-pan-synapse into airbnb:master, Sep 6, 2017
Conversation

@lap1817 (Contributor) commented Aug 22, 2017

to: @jolynch, @igor47
cc: @liangg, @jtai, @wei-zhang

Dynamic generator config update in Synapse

In the current implementation, the "watcher" class has a "config_for_generator" field which is initialized when Synapse starts by loading the static configuration from the Synapse config file.
The major changes are:

  • Allow the Zookeeper watcher to update its "config_for_generator" field when it receives an update from ZK.
  • Trigger config regeneration when either the backend list or the generator config changes.
  • Store the frontend/backend configuration in the HAProxy state file, not only the backend list.
  • Trigger an HAProxy restart when the frontend/backend configuration changes.

These changes fall into the following classes.

BaseWatcher Class

The 'set_backends' function takes a second argument, 'new_config_for_generator', which defaults to an empty hash. The function compares the new config with the existing one, updates the 'config_for_generator' field, and triggers a Synapse re-configuration if necessary. Watchers other than ZookeeperWatcher do not need to change their calls to 'set_backends', since the new argument has a default value.
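
A minimal sketch of that behaviour (not the exact Synapse code; helper names such as update_backends and reconfigure! are illustrative):

  def set_backends(new_backends, new_config_for_generator = {})
    backends_changed = update_backends(new_backends)   # existing backend-list handling
    config_changed = false
    unless new_config_for_generator.nil? || new_config_for_generator.empty?
      config_changed = (@config_for_generator != new_config_for_generator)
      @config_for_generator = new_config_for_generator if config_changed
    end
    # either kind of change causes Synapse to regenerate and reload its config
    reconfigure! if backends_changed || config_changed
  end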

ZookeeperWatcher Class

In the "discovery" function, besides reading the children of the znode, also read the znode's data and parse it into a generator config.
Pass the new config to the "set_backends" function.
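
A rough sketch of that flow, assuming the generator config is stored as JSON in the data blob of the service znode (method and variable names below are illustrative):

  def discover
    path = @discovery['path']
    new_backends = []
    @zk.children(path, :watch => true).each do |id|
      node, _ = @zk.get("#{path}/#{id}")
      new_backends << deserialize_service_instance(node)
    end
    # the znode's own data blob carries the per-service generator config
    data, _ = @zk.get(path, :watch => true)
    new_config_for_generator = (data.nil? || data.empty?) ? {} : JSON.parse(data)
    set_backends(new_backends, new_config_for_generator)
  end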

Haproxy Class

The HAProxy state file should also keep the HAProxy config data.
In 'update_config', compare the new config with the one from the state file. If the HAProxy config has changed, trigger an HAProxy restart.
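
Illustratively, the restart decision could look like the sketch below (none of these method or variable names are the actual Synapse identifiers):

  def update_config(watchers)
    watchers.each do |watcher|
      old_config = @state_cache.config_for_generator(watcher.name)   # from the state file
      new_config = watcher.config_for_generator['haproxy']
      # any difference in the generator config forces a full restart,
      # since frontend/listen sections may have changed
      @restart_required ||= (old_config != new_config)
    end
    write_config(generate_config(watchers))
    restart if @restart_required
  end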

In order to keep the HAProxy state file backward compatible, we propose to add a new key-value pair under each service, alongside the list of backends. The key is specially named "watcher_config_for_generator". We also create a wrapper class around the current "seen" object (which deserializes the HAProxy state file) so that it knows how to separate the added config element from the rest of the backend server list.
{
  "service1": {
    "watcher_config_for_generator": { /* watcher.config_for_generator["haproxy"] */ },
    "i-xxxxxxx_10.0.0.0:80": {
      "name": "i-xxxxxx",
      "host": "10.0.0.0.83",
      "port": 80,
      "id": 75,
      "weight": null,
      "haproxy_server_options": null,
      "labels": null,
      "timestamp": 1502949735
    }
  }
}
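
A hypothetical sketch of such a wrapper around the deserialized "seen" hash (class and method names are illustrative; only the key name comes from this change):

  class HaproxyState
    KEY_WATCHER_CONFIG_FOR_GENERATOR = "watcher_config_for_generator".freeze

    def initialize(seen)
      @seen = seen   # { "service" => { special config key plus backend entries } }
    end

    # every entry except the special key is a backend server
    def backends(service)
      (@seen[service] || {}).reject { |k, _| k == KEY_WATCHER_CONFIG_FOR_GENERATOR }
    end

    # the generator config stored alongside the backends, or nil for old-format state files
    def config_for_generator(service)
      (@seen[service] || {})[KEY_WATCHER_CONFIG_FOR_GENERATOR]
    end
  end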

For the purpose of comparing two hashes (of the config_for_generator), I explicitly added a dependency on the HashDiff gem, which was previously only an implicit dependency through other gems that Synapse uses.
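
For example, with the pre-1.0 HashDiff API (where the module is named HashDiff), the change detection reduces to checking whether the diff is empty:

  require 'hashdiff'

  old_config = { 'frontend' => ['bind 0.0.0.0:3213'] }
  new_config = { 'frontend' => ['bind 0.0.0.0:3214'] }

  # HashDiff.diff returns an empty array when the two hashes are equivalent
  config_changed = !HashDiff.diff(old_config, new_config).empty?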

Test

  • updated existing rspec
  • added new rspec test cases to cover the new feature

@lap1817 (Contributor Author) commented Aug 23, 2017

@jolynch @igor47 ,

Did you get a chance to take a look at the PR? We have a tight timeline and are blocked on this change to make progress.
Please let me know what you think. Happy to chat online.

@jolynch (Collaborator) commented Aug 24, 2017

@lap1817 thanks for the patch, I'll try to take a look asap.

I am curious though, what is the driving motivation? Are you dynamically adding in different ACLs or something based on some external script?

@lap1817 (Contributor Author) commented Aug 24, 2017

@jolynch, thank you!

I sent you an email earlier this week with a document covering the whole story behind this change, including the motivation. Is [email protected] the correct email address?

Anyway, here is the doc: https://1drv.ms/b/s!AkCYvnLbJodli0E2OY_GuHITRHuC

And since you probably haven't received the email: I am Feng Pan, a developer on the Airbnb infra team. This patch addresses some of the pain points our developers have hit when creating and maintaining services, in terms of service discovery.

@jolynch (Collaborator) commented Aug 24, 2017

Ah, very cool. I actually left Yelp about a week ago for another opportunity, but given that most of my SmartStack work has been on my own time I'm happy to keep working on it heh. I am reading the document now and have a few comments, mostly around the tradeoffs of creating something like synapse-tools (a periodic per machine job that polls sources of services e.g. chef, mesos, etc ... and then filters by upstream service) vs a centralized system that publishes configuration to zookeeper itself.

I'm currently traveling but will try to review the patch tonight. This seems useful regardless of why, especially if we could also discover the list of services from ZK as well (eventually probably not in this iteration).

@lap1817 (Contributor Author) commented Aug 24, 2017

Thank you! I cannot see the comments you left in the doc. Do you mind pasting them here? Or we can discuss in the email thread ([email protected]).

@jolynch (Collaborator) left a review

I'd like to see some documentation updates for this change, but the code seems pretty reasonable to me, clever usage of the existing ZK node to provide the configuration data for the watcher itself.

I am curious though, how is this more dynamic than the job that configures synapse writing out this config and restarting synapse as needed? I suppose you only have to have one machine do the re-configuring per zk cluster with this design (whichever one writes the configuration data to the zk node), but I'm not sure how big of a benefit that is.

For example at Yelp we've really benefited from having fine control over each machine, including which services are advertised (nerve) and discovered (synapse) at a per machine level, while this architecture would essentially bring it down to you can control services per ZK cluster differently.

# generates a new config based on the state of the watchers
def generate_config(watchers)
  new_config = generate_base_config
  shared_frontend_lines = generate_shared_frontend

  watchers.each do |watcher|
    watcher_config = watcher.config_for_generator[name]
    next if watcher_config.nil? || watcher_config.empty? || watcher_config['disabled']
jolynch (Collaborator):

Interesting, so we're changing the previous behavior that empty watcher configs would be present without a listening port (e.g. a bare backend). Is this intentional?

I think if it's just HAProxy this is potentially reasonable (typically you have backend options for bare backends but ...)

lap1817 (Contributor Author):

It looks like the parse_watcher_config function (on the next line) tries to get fields from watcher_config['listen'] and will throw an exception if there is no 'listen' field. So I thought the watcher_config should at least be non-empty.

What is a valid use case when the watcher_config is completely empty?


# if we've never needed the backends, now is the time to load them
@seen = read_state_file if @seen.nil?
KEY_WATCHER_CONFIG_FOR_GENERATOR = "watcher_config_for_generator"
jolynch (Collaborator):

Could we achieve the same end perhaps slightly cleaner by versioning the state file and just supporting version 1 or the new version 2? E.g. in the new state files we write out there is a synapse_state_version key? If the key's there and it's == version 2 we can parse out well defined fields, otherwise we treat it as a list of backends.

What do you think?
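
A minimal sketch of that idea, assuming a top-level synapse_state_version key (all names here are illustrative, not committed code):

  def parse_state_file(raw)
    data = JSON.parse(raw)
    if data['synapse_state_version'] == 2
      data['services']   # well-defined schema: backends plus generator config per service
    else
      data               # legacy format: a plain map of service => backends
    end
  end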

lap1817 (Contributor Author):

I'd prefer to do a versioned file once we start introducing breaking schemas into the cache. This particular change is intended to be non-breaking: the field is added under each watcher and is optional for each watcher to have. So it seems to me that having a file-level version does not help much here, since we still need logic such as checking whether the watcher has the config_for_generator field. Does that make sense to you? Or am I missing some of your points?

jolynch (Collaborator):

I think the main part that's a bit hacky to me is that there is a special key (watcher_config_for_generator) that we effectively treat as a version check. This is fine, I just feel like having a well defined schema for what's in these hashes in each version would make reasoning about the state file easier (as opposed to this change which makes it so that we implicitly support two versions of the state file). This is not a strong objection, just raising the point (feel free add a TODO or just not do it heh, if I care enough I can always go back and change it).

I am all for backwards compatibility, and you could achieve something similar with a synapse_state_file_version key, which if found would indicate this newer formatting.

@lap1817 (Contributor Author) commented Aug 31, 2017

@jolynch, thanks for the review! I've updated the README file accordingly. To your question 'why is this more dynamic than synapse-tools', it is not. The main reason we chose this implementation is that it fits Airbnb's current stack better, adds less extra load on ZK, and has a lower migration cost (potentially a long period given all the legacy services we have).

  • use node.first as the data blob
  • make the haproxy generator refresh the config before generating the config
  • make the config_for_generator hash read-only in the base watcher via deep clone
@lap1817 (Contributor Author) commented Sep 2, 2017

@jolynch, I'd like to merge this PR. I can make follow-up PRs if you have further questions on the two comments. Can you grant me write permission so I can do it?

And in order for this change to be effective, I made a very minor change in the nerve gem too. Please help take a look: airbnb/nerve#95

@jolynch (Collaborator) left a review

lg2m, I can merge it for you if you don't have merge rights, otherwise go for it :-)

Gemfile.lock (Outdated)
@@ -1,9 +1,10 @@
 PATH
   remote: .
   specs:
-    synapse (0.14.7)
+    synapse (0.15.1)
jolynch (Collaborator):

Let's just go to 0.15.0



@lap1817 (Contributor Author) commented Sep 4, 2017

@jolynch, I changed the version to 0.15.0 and added a TODO comment in the HaproxyState class.
I don't have the "merge" button. Please help me merge and publish.

Thanks for taking care of this over the long weekend. Have a good one!

@lap1817 (Contributor Author) commented Sep 5, 2017

Hi @jolynch, can you help me merge? I don't have permission.

@jolynch merged commit 117e26d into airbnb:master on Sep 6, 2017
@jolynch (Collaborator) commented Sep 6, 2017

@lap1817 ok thanks! merged :-)

I think that folks at airbnb can still admin this project fwiw, it's still under the airbnb namespace so I imagine someone should be able to give you merge access in the future (or someone internal if I don't get back soon enough).

@lap1817 (Contributor Author) commented Sep 6, 2017

@jolynch thank you, I will ask around to check (Igor, who was the last person I know of with that permission, left Airbnb a while back).

Did you publish the gem btw?

@jolynch (Collaborator) commented Sep 6, 2017

Yup, 0.15.0 on rubygems

@lap1817 deleted the feng-pan-synapse branch on September 6, 2017 at 17:08