Skip to content
This repository has been archived by the owner on Dec 1, 2018. It is now read-only.

Feature/elasticsearch new mapping #1313

Merged

Conversation

AlmogBaku
Copy link
Contributor

@AlmogBaku AlmogBaku commented Sep 23, 2016

fixes #1310, #1278 #1263 #1262:

This PR brings a lot of changes, most of them caused by refactoring the schema .
In order to allow effective usage of the ELK stack, we had to modify the schema so it'll work properly with Kibana(otherwise, it require development of a custom special dashboard! which is insane!)

This PR fixes the following issues:

  • Types per "metric family"(cpu/memory/network/fs/etc): each entry includes more than one parameter that related to its family; allowing us to correlate the data(such as "usage VS limit")
    • Uses bulk functionalities of ES
    • Creates daily indexes (allowing easy removal of old data, as Logstash and others do)
    • Create aliases for the indexes

This is a big PR, which bring breaking changes(old schema not valid anymore). However- I have to admit that the old schema is totally unusable, and probably not many users use it if any(but me)

cc @huangyuqi @DirectXMan12 @piosz @Thermi

You can see the exported schema here: https://gist.github.com/AlmogBaku/8fe24fd9c61df5a72402a4d25992cb36


This change is Reviewable

@huangyuqi huangyuqi self-assigned this Sep 24, 2016
@AlmogBaku
Copy link
Contributor Author

for some reason the tests takes too long, and it caused travis to fail..

@AlmogBaku
Copy link
Contributor Author

@huangyuqi how can we push this forward?

@AlmogBaku
Copy link
Contributor Author

signed the linux foundation cla

@AlmogBaku
Copy link
Contributor Author

@piosz @huangyuqi ? how can we push this forward?

@DirectXMan12
Copy link
Contributor

@huangyuqi can you please do a review? I'm not particularly familiar with ElasticSearch

@AlmogBaku
Copy link
Contributor Author

I fixed an issue where some of the points haven't committed due to not passing the threshold of the bulk-processor..

Now it's flush on one of the following scenarios:

  • every 10s
  • 2MB of data
  • = 1000 requests

  • end of the processing(end of ExportData or ExportEvents)

@piosz
Copy link
Contributor

piosz commented Oct 24, 2016

@AlmogBaku thanks for your contribution. I'm fine with this PR, but I'm not familiarized with ES. We need someone to review the code. @commixon offered his help.

In the meantime could you please reorganize commits into the following pattern:

  • godeps changes
  • real changes

If you think you need more than 2 commits please explain why.

@AlmogBaku AlmogBaku force-pushed the feature/elasticsearch_mapping branch from a76afd1 to 56e4bc7 Compare October 24, 2016 12:51
@googlebot
Copy link

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for the commit author(s). If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.

1 similar comment
@googlebot
Copy link

We found a Contributor License Agreement for you (the sender of this pull request), but were unable to find agreements for the commit author(s). If you authored these, maybe you used a different email address in the git commits than was used to sign the CLA (login here to double check)? If these were authored by someone else, then they will need to sign a CLA as well, and confirm that they're okay with these being contributed to Google.

@googlebot
Copy link

CLAs look good, thanks!

1 similar comment
@googlebot
Copy link

CLAs look good, thanks!

@AlmogBaku
Copy link
Contributor Author

@piosz done.

@AlmogBaku AlmogBaku force-pushed the feature/elasticsearch_mapping branch 2 times, most recently from cbe2f6c to ffa3587 Compare October 24, 2016 13:20
@AlmogBaku
Copy link
Contributor Author

Your test run exceeded 120 minutes.
Can someone try to fix this?

@piosz
Copy link
Contributor

piosz commented Oct 24, 2016

@k8s-bot ok to test

@piosz
Copy link
Contributor

piosz commented Oct 24, 2016

@AlmogBaku I don't know how to fix this. If Jenkins is green we can assume that tests pass.

@piosz
Copy link
Contributor

piosz commented Oct 26, 2016

@huangyuqi please create an issue and we can discuss it there. Thanks for offering the help

@AlmogBaku
Copy link
Contributor Author

@piosz @huangyuqi this can be solved by using glide #1314 (or specifically see my review comment in #1335 )

@andrejvanderzee
Copy link

@piosz See #1363.

@andrejvanderzee
Copy link

Review status: 0 of 100 files reviewed at latest revision, 2 unresolved discussions.


common/elasticsearch/elasticsearch.go, line 37 at r4 (raw file):

  bulkProcessor *elastic.BulkProcessor
  base_index    string
}

Why underscore instead of camelcase?


common/elasticsearch/elasticsearch.go, line 44 at r4 (raw file):

func (esConfig ElasticSearchService) IndexAlias(date time.Time, typeName string) string {
  return date.Format(fmt.Sprintf("%s-%s-2006.01.02", esConfig.base_index, typeName))
}

This should be esSvc and not esConfig like below. Probably also better to use a pointer type as a receiver like below.


Comments from Reviewable

@andrejvanderzee
Copy link

Review status: 0 of 100 files reviewed at latest revision, 3 unresolved discussions.


common/elasticsearch/mapping.go, line 177 at r4 (raw file):

      "index": "not_analyzed"
    }
  }

Have you considered to disable _all (its enabled by default)? That will save disk space:

    "_default_": {
            "_all": {
                "enabled": false,
            },

Comments from Reviewable

@andrejvanderzee
Copy link

Review status: 0 of 100 files reviewed at latest revision, 3 unresolved discussions.


common/elasticsearch/mapping.go, line 177 at r4 (raw file):

Previously, andrejvanderzee wrote…

Have you considered to disable _all (its enabled by default)? That will save disk space:

    "_default_": {
            "_all": {
                "enabled": false,
            },
https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-all-field.html

Comments from Reviewable

@andrejvanderzee
Copy link

@AlmogBaku I have looked through the PR and added few comments.

@AlmogBaku
Copy link
Contributor Author

Review status: 0 of 100 files reviewed at latest revision, 3 unresolved discussions.


common/elasticsearch/elasticsearch.go, line 37 at r4 (raw file):

Previously, andrejvanderzee wrote…

Why underscore instead of camelcase?

Done.

common/elasticsearch/elasticsearch.go, line 44 at r4 (raw file):

Previously, andrejvanderzee wrote…

This should be esSvc and not esConfig like below. Probably also better to use a pointer type as a receiver like below.

Done.

common/elasticsearch/mapping.go, line 177 at r4 (raw file):

Previously, andrejvanderzee wrote…

https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-all-field.html

Done.

Comments from Reviewable

events remapping

fixes review comments

flush data also in metrics
@andrejvanderzee
Copy link

Reviewed 5 of 9 files at r1, 90 of 91 files at r2, 3 of 4 files at r3, 2 of 2 files at r4, 4 of 4 files at r5.
Review status: all files reviewed at latest revision, all discussions resolved.


Comments from Reviewable

@AlmogBaku
Copy link
Contributor Author

@k8s-bot test this

@k8s-ci-robot
Copy link
Contributor

@AlmogBaku: you can't request testing unless you are a kubernetes member.

In response to [this comment](https://github.com//pull/1313#issuecomment-260325392):

@k8s-bot test this

If you have questions or suggestions related to this bot's behavior, please file an issue against the kubernetes/test-infra repository.

@andyxning
Copy link
Contributor

@AlmogBaku can you guys please have a look at #1379 #1374. I also make some RP based on the current master branch. BTW, creating index daily is really a wonderful thing as i also searched for how to make ES store time-limited events. :)

@andyxning
Copy link
Contributor

andyxning commented Nov 20, 2016

Can we just split the new features into small ones and small PRs. In this way maybe we can push this forward faster. :)

@AlmogBaku
Copy link
Contributor Author

@piosz, few people already review the code and that it functioning(see above).. can we merge this?

@andyxning
Copy link
Contributor

andyxning commented Nov 20, 2016

@AlmogBaku Can you also please have a look at #1380 also. Ignore the build failure status, it seems that Heapster build will burst above the memory limit Travis CI container set and the build will be OOM killed. So, in #1379 i have updated the Travis CI configuration to use a VM which according to Travis CI has about 7.5GB memory except longer build start time. :)

@piosz piosz assigned ghost and unassigned huangyuqi Nov 23, 2016
@piosz
Copy link
Contributor

piosz commented Nov 23, 2016

I'm ok with the change. Can one of the reviewers confirm that it's ready to merge by writing LGTM?

@piosz
Copy link
Contributor

piosz commented Nov 23, 2016

@k8s-bot test this

@andrejvanderzee
Copy link

LGTM

@k8s-ci-robot
Copy link
Contributor

Jenkins GCE e2e failed for commit d8894f3. Full PR test history.

The magic incantation to run this job again is @k8s-bot test this. Please help us cut down flakes by linking to an open flake issue when you hit one in your PR.

@piosz piosz added the lgtm Indicates that a PR is ready to be merged. label Nov 23, 2016
@piosz
Copy link
Contributor

piosz commented Nov 23, 2016

Jenkins failure unrelated

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ElasticSearch | time-based index
9 participants