Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Docker module cleanup #2665

Merged
merged 1 commit into from
Oct 4, 2016
Merged

Docker module cleanup #2665

merged 1 commit into from
Oct 4, 2016

Conversation

ruflin
Copy link
Contributor

@ruflin ruflin commented Oct 3, 2016

  • Make default period 10s, same as for all other beats
  • Remove unnecessary code comments
  • Make convertContainerPort local to container MetricSet
  • Only add labels to container MetricSet if not empty
  • Apply some code conventions
  • Convert local @timestamp to timestamp to prevent confusion
  • Replace rx / tx with in / out in network metricset to be consistent with system network metricset
  • Add interface name as field instead of having it as a dynamic element. This makes it easier to filter.
  • If not containers are running, just no events are returned
  • Add data.json for all metricsets
  • Introduce ToMapStr for Container data
  • Move socket to Container MapStr. Is Socket info really needed?
  • Remove unnecessary nesting levels as namespace already given by metricset
  • Have one docker client per metricset instead of having one global client

@ruflin ruflin added in progress Pull request is currently in progress. review Metricbeat Metricbeat labels Oct 3, 2016
@ruflin
Copy link
Contributor Author

ruflin commented Oct 3, 2016

@douaejeouit It would be nice if you could have a look at this PR and especially have a look at the // TODO I added. Would be good to get your thoughts on these.

@ruflin ruflin force-pushed the docker-module-cleanup branch from 9a4abde to 4cdcb27 Compare October 3, 2016 20:13

event := common.MapStr{
"@timestamp": time.Now(),
"timestamp": time.Now(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I understand this, does it mean we'll have both @timestamp and timestamp? Seems unnecessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tsg Agree. Before one was overwriting the other (or at least it tried and didn't as it was not common.Time). With the above implementation each event can have its separate timestamp instead of one for all events of this batch. But as the heavy work happens when the info from the docker api is fetched and the timestamp is generated afterwards, I suggest to just remove it.

Perhaps @douaejeouit can share some insights here?

Copy link
Contributor

@douaejeouit douaejeouit Oct 4, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tsg The @ timestamp here shouldn't be time.now() ! The idea is to get the time generate by Docker daemon while sending the events requested ( here the containers event), this is what I did for all the other metricsets.
I will be on it tomorrow morning! There is some smalll changes to do, otherwise, we can simply keep it as it is for the containers event! What do you think guys ? @ruflin @tsg

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@douaejeouit I saw this part, but we are talking here about a milli second difference in time. I don't think it is worth to do this and have the logic because of this different from all other modules / metricsets where the timestamp represents when the event fetching started. The nice part of this is that timestamp + rrt is when the even creation ended. With the above implementation it is not 100% clear to me what the timestamp will exactly represent.

@douaejeouit Perhaps you can share some insights on what the timestamp exactly is.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ruflin Ok!
But .. what do you mean by having one dockerclient per metricset instead of having one global client? For me the docker client instance should be created only once. Could you please clarify this point?
Concerning timestamp, as I said before, it's overwritten and it represents the time when the stats were read from the deamon. But since it's only a matter of a milli seconds difference in time, I agree with you! We can change the logic 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous implementation didn't allow to have two different docker host machines as only one global client was used. In general MetricSets are designed to no share information with each other. So in case one MetricSet has some issues, it doesn't affect the others. This should also be true for the client connection, means every MetricSet should have its own instance of the client.

The downside of this is when using all docker module MetricSets, 5 connections are open to the Docker host instead of one. Currently I would say this is acceptable, but in case this becomes a problem we will tackle it. There are options like making the Module more powerful so multiple MetricSets in one Module, meaning if in the config multiple metricsets are defined in a module, this module could also be used to share some information. This will potentially require some refactoring as it exceeds the current capability of the interface methods of the module interface.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok,
Indeed, I made the choice to have one client for failing to 5 simultaneous connections if all the metricsets are used.. anyway, it's Ok as it is not problematic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before it was not only 1 client for all 5 metricsets, but one client for all docker modules (one docker module can be used multiple times in the config) which broke the implementation.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All right .. I can see now! Thanks

@ruflin ruflin force-pushed the docker-module-cleanup branch 3 times, most recently from fbb350a to 896d480 Compare October 4, 2016 08:52
@tsg
Copy link
Contributor

tsg commented Oct 4, 2016

LGTM.

@ruflin ruflin removed the in progress Pull request is currently in progress. label Oct 4, 2016
* Make default period 10s, same as for all other beats
* Remove unnecessary code comments
* Make convertContainerPort local to container MetricSet
* Only add labels to container MetricSet if not empty
* Apply some code conventions
* Convert local @timestamp to timestamp to prevent confusion
* Replace rx / tx with in / out in network metricset to be consistent with system network metricset
* Add interface name as field instead of having it as a dynamic element. This makes it easier to filter.
* If not containers are running, just no events are returned
* Add data.json for all metricsets
* Introduce ToMapStr for Container data
* Move socket to Container MapStr. Is Socket info really needed?
* Remove unnecessary nesting levels as namespace already given by metricset
* Have one docker client per metricset instead of having one global client
@ruflin ruflin force-pushed the docker-module-cleanup branch from 896d480 to 83e7c61 Compare October 4, 2016 11:26
@ruflin ruflin mentioned this pull request Oct 4, 2016
19 tasks
@tsg tsg merged commit f9effaf into elastic:master Oct 4, 2016
@ruflin
Copy link
Contributor Author

ruflin commented Oct 4, 2016

@douaejeouit We merged this PR, but lets continue this discussion about the timestamp. Happy to readd it if needed. Please also have a look at the PR to check if I changed some code that should not be changed.

@ruflin ruflin deleted the docker-module-cleanup branch October 4, 2016 12:27
@@ -73,6 +74,8 @@ func TestCPUService_UsageInUsermode(t *testing.T) {
// THEN
assert.Equal(t, float64(0.50), result)
}

/* TODO: uncomment
Copy link
Contributor

@douaejeouit douaejeouit Oct 5, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ruflin Should I comment it? This function needs to be updated in order to match with last changes you made. I'll do it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please. Thanks

"id": c.Id,
"name": c.Name,
// TODO: Is this really needed
"socket": GetSocket(),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's an extra information. We can ignore it if you judge it isn't needed!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm trying to understand the use case behind it. It see that people will search for id and name, but will the socket information be of any use? I think this is useful in the info metricset, but it is required in the other metricsets. I'm hesitant to remove it as I'm sure there was a good reason to add it which I'm just not aware of yet.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main objectif of having it as a field is to have an clear idea about the conf and to check it as well. Espacially in case we have two docker modules running and each one listen on different socket. Since we can bind docker to another host or port using a tcp connection, it's always good to know if it listen on TCP or a Unix socket or both, that's why it's present n all the metricsets!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm thinking in case a socket is used, if we should copy over the socket information to metricset.host info instead?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, it sounds good for me!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a task to the list here: #2629

i := 0
var keys []string
for key := range labels {
keys = append(keys, key)
}
sort.Strings(keys)
for _, k := range keys {
// Replace all . in the labels by _
// TODO: WHY?
Copy link
Contributor

@douaejeouit douaejeouit Oct 5, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was the naming used in dockbeat Project, where it has been decided to format the labels output . I didn't want it to be different, that's why I left it at it is. If it's not necessery. We can leave it as it's generated by docker's API.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was first thinking . were replaced by _ because in ES <2.4 . can't be used in field names. But it seems to be the names where it is replaced? Perhaps there are some disadvantages in Kibana with the . ? Would be interesting to get some more background info on this decision.

Copy link
Contributor

@douaejeouit douaejeouit Oct 6, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, it was due to some ES & Kibana issues in some previous version. Apparently, Kibana couldn't distinguish between fields whose names contains a . and those who are nested under a different objects! So as a temporary solution, we replaced the feild names containing ( . ) by ( _ ) to avoid any potential problem. By the way, I'm wondering if the bug was fixed. If so, we can ignore this function and keep our field names without any modification.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't aware, that the above was and issue for values. I would expect it to be an issue for keys, but also values? Were the values defined as text or keywords? That could have been the issue, that if it was not defined as keywors that the tokenizer was applied? But this can be solved by applying the right mapping.

We should test the behaviour here, as in general I think it would be nice to keep the name of the labels the same as the user set it, as this is what he will probably look for.

Copy link
Contributor

@douaejeouit douaejeouit Oct 6, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I'm not mistaken, it concerns only keys !
+1 for not changing the name of labels

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a task to the list here: #2629

@rikatz
Copy link
Contributor

rikatz commented Nov 20, 2016

Guys, with my previous contribution #3024 about labels and dots, is this still necessary?

@ruflin
Copy link
Contributor Author

ruflin commented Nov 21, 2016

@rikatz There were quite a lot of changes and discussions in this PR. Which part are you referring to?

@rikatz
Copy link
Contributor

rikatz commented Nov 21, 2016

@ruflin Sorry! about the labels cleanup, as mentioned on the helper.go here, about the replacement of dots to underlines.

Also, the task "Change and test labels to be taken with . if possible: #2665 (comment)" on #2629

@ruflin
Copy link
Contributor Author

ruflin commented Nov 21, 2016

@rikatz I checked the checkboxes in the issue as done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants