Add Elastic Stack deployment for flow visualization #836
Conversation
Thanks for your PR. The following commands are available:
These commands can only be run by members of the vmware-tanzu organization.
You are checking in a lot of large files here. Is there an alternative? Based on our previous conversation, I thought the plan was to check in a small yml file with our custom Antrea IPFIX fields?
@antoninbas Since we want to modify the cache_ttl parameter in the config files, I adopted the first solution we discussed last time. The reason we need to modify that parameter: according to the RFC, TCP templates in Logstash should not be removed, but the Netflow plugin of Logstash clears its template cache after some time (by default 4000 seconds). So we decided to set that value as large as possible. Because the parameter change involves modifying the original Elastiflow config files, possible alternatives are 1) upload only the elastiflow image or 2) upload both the config files and the IPFIX definitions. Solution 2) is more lightweight; should I switch to that approach?
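For reference, the parameter being discussed is the Netflow codec's `cache_ttl`. A minimal Logstash input sketch (the port and the exact value here are illustrative assumptions, not the settings checked in by this PR) might look like:

```conf
input {
  tcp {
    port => 4739                # IANA-registered IPFIX port; illustrative
    codec => netflow {
      versions => [10]          # IPFIX is Netflow v10
      # The codec expires cached templates after cache_ttl seconds
      # (default 4000); raise it so TCP templates are effectively kept.
      cache_ttl => 1000000000
    }
  }
}
```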
If we need to build a custom Docker image, we should still be able to have a much simpler process that doesn't involve checking in all the files: your Dockerfile can inherit from the official image. The alternative is the ConfigMap approach we discussed previously, but it gets a bit complicated if multiple files are involved, so I am fine with the above approach, as long as we find a way to not copy all the files from the elastiflow repo. In addition to making this repo larger and more complex, there may be some licensing issues (although I haven't looked at the license in detail)?
BTW the docker image that you build should have a different name I think (not
@antoninbas Thanks for the suggestions. Yes, it makes sense not to copy all the elastiflow files, and building our own image may cause extra work and license issues. I tried to combine the configs into one file and simplify the ConfigMap (so we don't need to upload the Netflow definitions). In this case we don't need to build a custom image. Please have a look at whether this solution is acceptable.
@zyiou this looks fine, but what about the
@antoninbas I put it here. This conf file will overwrite all the config files previously in the conf.d folder.
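As a sketch, a single pipeline file that overwrites conf.d combines the three Logstash stages in one place (section contents are elided here; this is the general shape, not the PR's actual file):

```conf
# logstash.conf -- single pipeline file mounted over conf.d via the ConfigMap
input {
  # IPFIX/Netflow listener configuration
}
filter {
  # Elastiflow-derived processing of the decoded flow fields
}
output {
  # elasticsearch { ... }  -- index name and hosts are deployment-specific
}
```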
I like the new version. LGTM.
I assume this is dependent on the IPFIX PRs from @srikartati?
Could you talk to @lzhecheng about the possibility of having a daily CI job for validating the manifest (once everything is merged of course)?
docs/elastiflow-integration.md
Outdated
```
Kibana dashboard is exposed as a Nodeport, which can be accessed via `http://[NodeIP]: 30007`
```
To import the dashboard into Kibana, go to `Management -> Saved Objects` and
Maybe this doesn't need to be rendered as code.
docs/elastiflow-integration.md
Outdated
## Instruction
To put everything in elastiflow namespace and get the configuration up and
running, run:
```shell script
I didn't know you could provide `shell script` as the language here; is it different from just `shell`?
`shell` should be the correct one. I think everything after 'shell' is just ignored.
docs/elastiflow-integration.md
Outdated
[Elastiflow](https://github.com/robcowart/elastiflow) is a network flow data
collection and visualization tool based on [Elastic
Stack](https://www.elastic.co/elastic-stack) (Elasticsearch, Logstash and
Kibana). It supports Netflow v5/v9, sFlow and IPFIX flow types. We will use the
s/flow types/protocols for flow data collection
docs/elastiflow-integration.md
Outdated
collection and visualization tool based on [Elastic
Stack](https://www.elastic.co/elastic-stack) (Elasticsearch, Logstash and
Kibana). It supports Netflow v5/v9, sFlow and IPFIX flow types. We will use the
IPFIX flow records with following fields:
Add explicitly that we use the IPFIX protocol -- "Flow exporter feature in Antrea Agent uses IPFIX protocol to export flow records."
The following sentence can be:
IPFIX flow records contain the following Antrea-specific fields along with standard IANA fields.
docs/elastiflow-integration.md
Outdated
For the requirements to deploy Elastiflow, please refer to
[this](https://github.com/robcowart/elastiflow/blob/master/INSTALL.md#requirements).

## Instruction
s/Instruction/Deployment Steps
docs/elastiflow-integration.md
Outdated
```
Kibana dashboard is exposed as a Nodeport, which can be accessed via `http://[NodeIP]: 30007`

To import the dashboard into Kibana, go to **Management -> Saved Objects** and
s/dashboard/pre-built and recommended dashboard
dns {
  id => "dns_node_name"
  reverse => [ "[node][hostname]" ]
  action => "replace"
Is this code added by us? I am asking because this code is before the license header from the elastiflow repo. Should we keep the placement like that?
Most of the code is from the Elastiflow config except the input part (lines 1-19). The filter and output parts process the data for integration with the Kibana dashboard. I would suggest keeping it as is if it does not affect our functionality.
Also thanks for mentioning the license header; I will need to figure out whether we need to keep or change it.
It was a question from me whether we differentiate the code added by us from the code in the Elastiflow repo by the license header.
}
}

if [flow][sampling_interval] {
We are not supporting IPFIX sampling for now. Do we still need the related code in the conf? Will it add unnecessary overhead in Elastiflow?
Thanks for the info, I will remove that part in that case.
Thanks for the change. Visualization with elastic stack seems to be more than enough to begin with.
General comment:
Rename directory flow -> flow-collector/flowcollector to be more specific.
docs/flow-visualization.md
Outdated
@@ -0,0 +1,72 @@
# Flow Visualization
## Purpose
Antrea supports sending IPFIX flow records as flow exporter. The Elastic Stack
s/as flow exporter/through a flow exporter (we could add a link when documentation for flow exporter is added.)
docs/flow-visualization.md
Outdated
## Purpose
Antrea supports sending IPFIX flow records as flow exporter. The Elastic Stack
(ELK Stack) works
as a data collector for flow records and flow-related information can be
Could we mention storage too, for completeness?
docs/flow-visualization.md
Outdated
## About Elastic Stack
[Elastic Stack](https://www.elastic.co) is a group of open source products from
Elastic to help collect, store, search, analyze and visualize data in real
Remove 'Elastic' as you already provided link to the website.
docs/flow-visualization.md
Outdated
visualization.
[Logstash](https://www.elastic.co/logstash) works as data collector to
centralize flow records. [Logstach Netflow codec
plugin](https://www.elastic.co/guide/en/logstash/current/plugins-codecs-netflow.html)
typo--s/Logstach/Logstash
docs/flow-visualization.md
Outdated
[Logstash](https://www.elastic.co/logstash) works as data collector to
centralize flow records. [Logstach Netflow codec
plugin](https://www.elastic.co/guide/en/logstash/current/plugins-codecs-netflow.html)
supports Netflow v5/v9, sFlow and IPFIX protocols for flow data collection.
Can the Netflow plugin support the sFlow protocol? Please mention Netflow v10 in parentheses for IPFIX so there won't be any confusion with the plugin name.
Thanks for pointing this out. I think it only supports v5/v9/v10
@@ -183,49 +181,44 @@ kind: Service
metadata:
name: logstash
labels:
app: elastiflow
app: antreaflow
antrea-flowcollector/antrea-flow-collector is more apt IMO.
docs/flow-visualization.md
Outdated
To create all the necessary resources in the `antreaflow` namespace and get
everything up-and-running, run:
```shell
kubectl create namespace antreaflow
Here as well: antreaflow to antrea-flowcollector/antrea-flow-collector?
docs/flow-visualization.md
Outdated
The following dashboards are pre-built and recommended for Antrea flow
visualization.
### Overview
<img src="/docs/assets/flow-visualization-overview.png" width="600" alt="Flow
A comment on the PNGs: I saw the figures by opting for the rich diff option, and I feel the text was not visible. Maybe increase the font to make it visible in the screenshots, if possible?
yes I think you can set the width to 900
btw, I want to check that these are screenshots that you made? if not, we probably don't want to host them in the Antrea repo, but instead use a link to their original location.
Thanks and yes, I created these screenshots
Thanks for confirming. Since these are "large" binary files, we should only check them in if we expect to never update them, or very infrequently. But it seems to be the case here.
Just realized that the graphs said "antrea", so my first question was kind of silly :)
Can one of the admins verify this patch?
ci/test-flow-collector.sh
Outdated
@@ -0,0 +1,66 @@
set -eu
@zyiou Why is this added in CI? And not as an e2e test?
That was suggested by Antonin: we want to create a script like test-conformance-gke.sh -- it can be run locally and also in a daily CI job to validate the manifest.
It makes sense under ci/ IMO, since this is not a Golang test.
I was thinking of the other option of turning it into a Golang test and running it as part of the e2e suite if possible. Running it only once daily makes sense, as the main purpose is to check the Elastic Stack manifest.
@@ -0,0 +1,232 @@
apiVersion: storage.k8s.io/v1
How about calling it elk-flow-collector?
Sure
@@ -0,0 +1,38 @@
{"attributes":{"description":"","kibanaSavedObjectMeta":{"searchSourceJSON":"{\"query\":{\"query\":\"\",\"language\":\"lucene\"},\"filter\":[]}"},"title":"Antrea: Node Throughput Diagram (Bytes)","uiStateJSON":"{}","version":1,"visState":"{\"title\":\"Antrea: Node Throughput Diagram (Bytes)\",\"type\":\"vega\",\"params\":{\"spec\":\"{\\n \\\"$schema\\\": \\\"https://vega.github.io/schema/vega/v3.0.json\\\",\\n \\\"data\\\": [\\n {\\n \\\"name\\\": \\\"rawData\\\",\\n \\\"url\\\": {\\n \\\"%context%\\\": true,\\n \\\"%timefield%\\\": \\\"@timestamp\\\",\\n \\\"index\\\": \\\"flow-*\\\",\\n \\\"body\\\": {\\n \\\"size\\\": 0,\\n \\\"aggs\\\": {\\n \\\"table\\\": {\\n \\\"composite\\\": {\\n \\\"size\\\": 1000,\\n \\\"sources\\\": [\\n {\\\"stk1\\\": {\\\"terms\\\": {\\\"field\\\": \\\"ipfix.sourceNodeName.keyword\\\"}}},\\n {\\\"stk2\\\": {\\\"terms\\\": {\\\"field\\\": \\\"ipfix.destinationNodeName.keyword\\\"}}},\\n {\\\"stk3\\\": {\\\"terms\\\": {\\\"field\\\": \\\"ipfix.sourceIPv4Address.keyword\\\"}}},\\n {\\\"stk4\\\": {\\\"terms\\\": {\\\"field\\\": \\\"ipfix.destinationIPv4Address.keyword\\\"}}}\\n ]\\n },\\n \\t\\t\\t\\\"aggs\\\": {\\n \\t\\t\\t\\t\\\"bytes\\\": {\\n \\t\\t\\t\\t\\t\\\"sum\\\": {\\n \\t\\t\\t\\t\\t\\t\\\"field\\\": \\\"ipfix.octetDeltaCount\\\"\\n \\t\\t\\t\\t\\t}\\n \\t\\t\\t\\t}\\n \\t\\t\\t}\\n }\\n }\\n }\\n },\\n \\\"format\\\": {\\\"property\\\": \\\"aggregations.table.buckets\\\"},\\n \\\"transform\\\": [\\n {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"(datum.key.stk1==''?datum.key.stk3:datum.key.stk1)\\\", \\\"as\\\": \\\"stk1\\\"},\\n {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"(datum.key.stk2==''?datum.key.stk4:datum.key.stk2)\\\", \\\"as\\\": \\\"stk2\\\"},\\n {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"datum.key.stk3\\\", \\\"as\\\": \\\"stk3\\\"},\\n {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"datum.key.stk4\\\", \\\"as\\\": \\\"stk4\\\"},\\n {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": 
\\\"datum.bytes.value\\\", \\\"as\\\": \\\"size\\\"}\\n ]\\n },\\n {\\n \\\"name\\\": \\\"nodes\\\",\\n \\\"source\\\": \\\"rawData\\\",\\n \\\"transform\\\": [\\n {\\n \\\"type\\\": \\\"filter\\\",\\n \\\"expr\\\": \\\"!groupSelector || groupSelector.stk1 == datum.stk1 || groupSelector.stk2 == datum.stk2\\\"\\n },\\n {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"datum.stk1+datum.stk2+datum.stk3+datum.stk4\\\", \\\"as\\\": \\\"key\\\"},\\n {\\\"type\\\": \\\"fold\\\", \\\"fields\\\": [\\\"stk1\\\", \\\"stk2\\\"], \\\"as\\\": [\\\"stack\\\", \\\"grpId\\\"]},\\n {\\n \\\"type\\\": \\\"formula\\\",\\n \\\"expr\\\": \\\"datum.stack == 'stk1' ? datum.stk1+datum.stk2 : datum.stk2+datum.stk1\\\",\\n \\\"as\\\": \\\"sortField\\\"\\n },\\n {\\n \\\"type\\\": \\\"stack\\\",\\n \\\"groupby\\\": [\\\"stack\\\"],\\n \\\"sort\\\": {\\\"field\\\": \\\"sortField\\\", \\\"order\\\": \\\"descending\\\"},\\n \\\"field\\\": \\\"size\\\"\\n },\\n {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"(datum.y0+datum.y1)/2\\\", \\\"as\\\": \\\"yc\\\"}\\n ]\\n },\\n {\\n \\\"name\\\": \\\"groups\\\",\\n \\\"source\\\": \\\"nodes\\\",\\n \\\"transform\\\": [\\n {\\n \\\"type\\\": \\\"aggregate\\\",\\n \\\"groupby\\\": [\\\"stack\\\", \\\"grpId\\\"],\\n \\\"fields\\\": [\\\"size\\\"],\\n \\\"ops\\\": [\\\"sum\\\"],\\n \\\"as\\\": [\\\"total\\\"]\\n },\\n {\\n \\\"type\\\": \\\"stack\\\",\\n \\\"groupby\\\": [\\\"stack\\\"],\\n \\\"sort\\\": {\\\"field\\\": \\\"grpId\\\", \\\"order\\\": \\\"descending\\\"},\\n \\\"field\\\": \\\"total\\\"\\n },\\n {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"scale('y', datum.y0)\\\", \\\"as\\\": \\\"scaledY0\\\"},\\n {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"scale('y', datum.y1)\\\", \\\"as\\\": \\\"scaledY1\\\"},\\n {\\\"type\\\": \\\"formula\\\", \\\"expr\\\": \\\"datum.stack == 'stk1'\\\", \\\"as\\\": \\\"rightLabel\\\"},\\n {\\n \\\"type\\\": \\\"formula\\\",\\n \\\"expr\\\": \\\"datum.total/domain('y')[1]\\\",\\n \\\"as\\\": 
\\\"percentage\\\"\\n }\\n ]\\n },\\n {\\n \\\"name\\\": \\\"destinationNodes\\\",\\n \\\"source\\\": \\\"nodes\\\",\\n \\\"transform\\\": [{\\\"type\\\": \\\"filter\\\", \\\"expr\\\": \\\"datum.stack == 'stk2'\\\"}]\\n },\\n {\\n \\\"name\\\": \\\"edges\\\",\\n \\\"source\\\": \\\"nodes\\\",\\n \\\"transform\\\": [\\n {\\\"type\\\": \\\"filter\\\", \\\"expr\\\": \\\"datum.stack == 'stk1'\\\"},\\n {\\n \\\"type\\\": \\\"lookup\\\",\\n \\\"from\\\": \\\"destinationNodes\\\",\\n \\\"key\\\": \\\"key\\\",\\n \\\"fields\\\": [\\\"key\\\"],\\n \\\"as\\\": [\\\"target\\\"]\\n },\\n {\\n \\\"type\\\": \\\"linkpath\\\",\\n \\\"orient\\\": \\\"horizontal\\\",\\n \\\"shape\\\": \\\"diagonal\\\",\\n \\\"sourceY\\\": {\\\"expr\\\": \\\"scale('y', datum.yc)\\\"},\\n \\\"sourceX\\\": {\\\"expr\\\": \\\"scale('x', 'stk1') + bandwidth('x')\\\"},\\n \\\"targetY\\\": {\\\"expr\\\": \\\"scale('y', datum.target.yc)\\\"},\\n \\\"targetX\\\": {\\\"expr\\\": \\\"scale('x', 'stk2')\\\"}\\n },\\n {\\n \\\"type\\\": \\\"formula\\\",\\n \\\"expr\\\": \\\"range('y')[0]-scale('y', datum.size)\\\",\\n \\\"as\\\": \\\"strokeWidth\\\"\\n },\\n {\\n \\\"type\\\": \\\"formula\\\",\\n \\\"expr\\\": \\\"datum.size/domain('y')[1]\\\",\\n \\\"as\\\": \\\"percentage\\\"\\n }\\n ]\\n }\\n ],\\n \\\"scales\\\": [\\n {\\n \\\"name\\\": \\\"x\\\",\\n \\\"type\\\": \\\"band\\\",\\n \\\"range\\\": \\\"width\\\",\\n \\\"domain\\\": [\\\"stk1\\\", \\\"stk2\\\"],\\n \\\"paddingOuter\\\": 0.01,\\n \\\"paddingInner\\\": 0.98\\n },\\n {\\n \\\"name\\\": \\\"y\\\",\\n \\\"type\\\": \\\"linear\\\",\\n \\\"range\\\": \\\"height\\\",\\n \\\"domain\\\": {\\\"data\\\": \\\"nodes\\\", \\\"field\\\": \\\"y1\\\"}\\n },\\n {\\n \\\"name\\\": \\\"color\\\",\\n \\\"type\\\": \\\"ordinal\\\",\\n \\\"range\\\": \\\"category\\\",\\n \\\"domain\\\": {\\\"data\\\": \\\"rawData\\\", \\\"fields\\\": [\\\"stk1\\\",\\\"stk2\\\"]}\\n },\\n {\\n \\\"name\\\": \\\"stackNames\\\",\\n \\\"type\\\": \\\"ordinal\\\",\\n \\\"range\\\": 
[\\\"Source\\\", \\\"Destination\\\"],\\n \\\"domain\\\": [\\\"stk1\\\", \\\"stk2\\\"]\\n }\\n ],\\n \\\"axes\\\": [\\n {\\n \\\"orient\\\": \\\"bottom\\\",\\n \\\"scale\\\": \\\"x\\\",\\n \\\"labelColor\\\": {\\n \\\"value\\\": \\\"#888888\\\"\\n },\\n \\\"encode\\\": {\\n \\\"labels\\\": {\\n \\\"update\\\": {\\n \\\"text\\\": {\\\"scale\\\": \\\"stackNames\\\", \\\"field\\\": \\\"value\\\"},\\n \\\"fontSize\\\": {\\\"value\\\": 14}\\n }\\n }\\n }\\n },\\n {\\n \\\"orient\\\": \\\"left\\\",\\n \\\"scale\\\": \\\"y\\\",\\n \\\"labelColor\\\": {\\n \\\"value\\\": \\\"#888888\\\"\\n },\\n \\\"encode\\\": {\\n \\\"labels\\\": {\\n \\\"update\\\": {\\n \\\"text\\\": {\\\"signal\\\": \\\"format(datum.value, '.2s') + 'B'\\\"},\\n \\\"fontSize\\\": {\\\"value\\\": 12}\\n }\\n }\\n }\\n }\\n ],\\n \\\"marks\\\": [\\n {\\n \\\"type\\\": \\\"path\\\",\\n \\\"name\\\": \\\"edgeMark\\\",\\n \\\"from\\\": {\\\"data\\\": \\\"edges\\\"},\\n \\\"clip\\\": true,\\n \\\"encode\\\": {\\n \\\"update\\\": {\\n \\\"stroke\\\": [\\n {\\n \\\"test\\\": \\\"groupSelector && groupSelector.stack=='stk1'\\\",\\n \\\"scale\\\": \\\"color\\\",\\n \\\"field\\\": \\\"stk2\\\"\\n },\\n {\\\"scale\\\": \\\"color\\\", \\\"field\\\": \\\"stk1\\\"}\\n ],\\n \\\"strokeWidth\\\": {\\\"field\\\": \\\"strokeWidth\\\"},\\n \\\"path\\\": {\\\"field\\\": \\\"path\\\"},\\n \\\"strokeOpacity\\\": {\\n \\\"signal\\\": \\\"!groupSelector && (groupHover.stk1 == datum.stk1 || groupHover.stk2 == datum.stk2) ? 0.75 : 0.3\\\"\\n },\\n \\\"zindex\\\": {\\n \\\"signal\\\": \\\"!groupSelector && (groupHover.stk1 == datum.stk1 || groupHover.stk2 == datum.stk2) ? 
1 : 0\\\"\\n },\\n \\\"tooltip\\\": {\\n \\\"signal\\\": \\\"{'title': datum.stk1 + ' → ' + datum.stk2 + ' ' + format(datum.size, '.2s') + 'B (' + format(datum.percentage, '.1%') + ')', 'IP Address': datum.stk3 +' → ' + datum.stk4 }\\\"\\n }\\n },\\n \\\"hover\\\": {\\\"strokeOpacity\\\": {\\\"value\\\": 0.75}}\\n }\\n },\\n {\\n \\\"type\\\": \\\"rect\\\",\\n \\\"name\\\": \\\"groupMark\\\",\\n \\\"from\\\": {\\\"data\\\": \\\"groups\\\"},\\n \\\"encode\\\": {\\n \\\"enter\\\": {\\n \\\"fill\\\": {\\\"scale\\\": \\\"color\\\", \\\"field\\\": \\\"grpId\\\"},\\n \\\"width\\\": {\\\"scale\\\": \\\"x\\\", \\\"band\\\": 1}\\n },\\n \\\"update\\\": {\\n \\\"x\\\": {\\\"scale\\\": \\\"x\\\", \\\"field\\\": \\\"stack\\\"},\\n \\\"y\\\": {\\\"field\\\": \\\"scaledY0\\\"},\\n \\\"y2\\\": {\\\"field\\\": \\\"scaledY1\\\"},\\n \\\"fillOpacity\\\": {\\\"value\\\": 0.7},\\n \\\"tooltip\\\": {\\n \\\"signal\\\": \\\"datum.grpId + ' ' + format(datum.total, '.2s') + 'B (' + format(datum.percentage, '.1%') + ')'\\\"\\n }\\n },\\n \\\"hover\\\": {\\\"fillOpacity\\\": {\\\"value\\\": 1}}\\n }\\n },\\n {\\n \\\"type\\\": \\\"text\\\",\\n \\\"from\\\": {\\\"data\\\": \\\"groups\\\"},\\n \\\"interactive\\\": false,\\n \\\"encode\\\": {\\n \\\"update\\\": {\\n \\\"x\\\": {\\n \\\"signal\\\": \\\"scale('x', datum.stack) + (datum.rightLabel ? bandwidth('x') + 8 : -8)\\\"\\n },\\n \\\"yc\\\": {\\\"signal\\\": \\\"(datum.scaledY0 + datum.scaledY1)/2\\\"},\\n \\\"align\\\": {\\\"signal\\\": \\\"datum.rightLabel ? 'left' : 'right'\\\"},\\n \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"},\\n \\\"fontWeight\\\": {\\\"value\\\": \\\"bold\\\"},\\n \\\"fontSize\\\": {\\\"value\\\": 12},\\n \\\"text\\\": {\\n \\\"signal\\\": \\\"abs(datum.scaledY0-datum.scaledY1) > 10 ? 
datum.grpId : ''\\\"\\n }\\n }\\n }\\n },\\n {\\n \\\"type\\\": \\\"group\\\",\\n \\\"data\\\": [\\n {\\n \\\"name\\\": \\\"dataForShowAll\\\",\\n \\\"values\\\": [{}],\\n \\\"transform\\\": [{\\\"type\\\": \\\"filter\\\", \\\"expr\\\": \\\"groupSelector\\\"}]\\n }\\n ],\\n \\\"encode\\\": {\\n \\\"enter\\\": {\\n \\\"xc\\\": {\\\"signal\\\": \\\"width/2\\\"},\\n \\\"y\\\": {\\\"value\\\": 30},\\n \\\"width\\\": {\\\"value\\\": 100},\\n \\\"height\\\": {\\\"value\\\": 36}\\n }\\n },\\n \\\"marks\\\": [\\n {\\n \\\"type\\\": \\\"group\\\",\\n \\\"name\\\": \\\"groupReset\\\",\\n \\\"from\\\": {\\\"data\\\": \\\"dataForShowAll\\\"},\\n \\\"encode\\\": {\\n \\\"enter\\\": {\\n \\\"cornerRadius\\\": {\\\"value\\\": 3.5},\\n \\\"fill\\\": {\\\"value\\\": \\\"#666666\\\"},\\n \\\"height\\\": {\\\"field\\\": {\\\"group\\\": \\\"height\\\"}},\\n \\\"width\\\": {\\\"field\\\": {\\\"group\\\": \\\"width\\\"}}\\n },\\n \\\"update\\\": {\\\"opacity\\\": {\\\"value\\\": 1}},\\n \\\"hover\\\": {\\\"fill\\\": {\\\"value\\\": \\\"#444444\\\"}}\\n },\\n \\\"marks\\\": [\\n {\\n \\\"type\\\": \\\"text\\\",\\n \\\"interactive\\\": false,\\n \\\"encode\\\": {\\n \\\"enter\\\": {\\n \\\"xc\\\": {\\\"field\\\": {\\\"group\\\": \\\"width\\\"}, \\\"mult\\\": 0.5},\\n \\\"yc\\\": {\\\"field\\\": {\\\"group\\\": \\\"height\\\"}, \\\"mult\\\": 0.5, \\\"offset\\\": 1},\\n \\\"align\\\": {\\\"value\\\": \\\"center\\\"},\\n \\\"baseline\\\": {\\\"value\\\": \\\"middle\\\"},\\n \\\"text\\\": {\\\"value\\\": \\\"Show All\\\"},\\n \\\"fontSize\\\": {\\\"value\\\": 14},\\n \\\"stroke\\\": {\\\"value\\\": \\\"#ecf0f1\\\"}\\n }\\n }\\n }\\n ]\\n }\\n ]\\n }\\n ],\\n \\\"signals\\\": [\\n {\\n \\\"name\\\": \\\"groupHover\\\",\\n \\\"value\\\": {},\\n \\\"on\\\": [\\n {\\n \\\"events\\\": \\\"@groupMark:mouseover\\\",\\n \\\"update\\\": \\\"{stk1:datum.stack=='stk1' && datum.grpId, stk2:datum.stack=='stk2' && datum.grpId}\\\"\\n },\\n {\\\"events\\\": \\\"mouseout\\\", \\\"update\\\": \\\"{}\\\"}\\n 
]\\n },\\n {\\n \\\"name\\\": \\\"groupSelector\\\",\\n \\\"value\\\": false,\\n \\\"on\\\": [\\n {\\n \\\"events\\\": \\\"@groupMark:click!\\\",\\n \\\"update\\\": \\\"{stack:datum.stack, stk1:datum.stack=='stk1' && datum.grpId, stk2:datum.stack=='stk2' && datum.grpId}\\\"\\n },\\n {\\n \\\"events\\\": [\\n {\\\"type\\\": \\\"click\\\", \\\"markname\\\": \\\"groupReset\\\"},\\n {\\\"type\\\": \\\"dblclick\\\"}\\n ],\\n \\\"update\\\": \\\"false\\\"\\n }\\n ]\\n }\\n ]\\n}\"},\"aggs\":[]}"},"id":"5b165620-cd36-11ea-8911-87da3aad0324","migrationVersion":{"visualization":"7.8.0"},"references":[],"type":"visualization","updated_at":"2020-08-04T04:22:08.308Z","version":"WzcsMV0="} |
How is this file generated? Could we add some comments to explain that?
It is auto-generated by the Kibana export function. Once imported, users will be able to view the pre-built dashboards. Sure, I will add a brief explanation in the doc, since JSON files do not support comments.
@@ -0,0 +1,4056 @@
---
Again, how is this file generated?
It is modified based on the ipfix file from the Logstash Netflow codec plugin, which is used to interpret the fields from flow records. I added Antrea-specific fields at the end of the file and modified 2 data types.
@@ -0,0 +1,50 @@
input {
Could we add some comments to explain the configurations?
Done
docs/flow-visibility.md
Outdated
@@ -0,0 +1,85 @@
# Flow Visibility
"Network Flow Visibility"?
docs/flow-visibility.md
Outdated
@@ -0,0 +1,85 @@
# Flow Visibility
## Purpose
Antrea supports sending IPFIX flow records through a flow exporter. The Elastic
flow exporter -> flow collector?
The code that sends flow records in the Antrea Agent is called the flow exporter.
Antrea supports sending records to flow collector through its flow exporter.
We plan to merge this document with the flow exporter document: #980
Maybe that will make this clearer.
docs/flow-visibility.md
Outdated
kubectl create configmap logstash-configmap -n antrea-flow-collector --from-file=build/yamls/flow-collector/logstash/
kubectl apply -f build/yamls/flow-collector/flow-collector.yml -n antrea-flow-collector
```
Kibana dashboard is exposed as a Nodeport, which can be accessed via
a NodePort Service?
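For context, exposing Kibana at node port 30007 (the URL mentioned in the doc) would use a NodePort Service roughly like the following; the Service name, selector and comments are assumptions for illustration, not the PR's actual manifest:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: kibana                # hypothetical name
spec:
  type: NodePort
  selector:
    app: kibana               # hypothetical selector label
  ports:
  - port: 5601                # Kibana's default HTTP port
    targetPort: 5601
    nodePort: 30007           # matches http://[NodeIP]:30007 in the doc
```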
ci/test-flow-collector.sh
Outdated
}

check_record() {
echo "=== Wait for 5 minutes to receive data ==="
since we are not deploying any workload Pods, are we just looking at CoreDNS traffic here?
Yes, as long as we receive flow records containing one of the desired fields, I will consider it successful. Do you suggest adding workload Pods?
If the job works, it's good enough for me. The whole point of this job is to validate the yaml manifest. May want to add a comment about this though.
Sounds good. Done
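A minimal sketch of what such a record check might look like, assuming the job greps an Elasticsearch `_search` response for an Antrea-specific field (the index name, response shape, and field path here are assumptions, not the actual script):

```shell
# Hypothetical check: did the collector receive flow records containing an
# Antrea-specific field? The real job would obtain the response with e.g.:
#   response=$(curl -s "http://${ES_SERVICE_IP}:9200/elastiflow-*/_search?q=netflow.sourceNodeName:*")
check_record() {
    local response="$1"
    if echo "$response" | grep -q '"sourceNodeName"'; then
        echo "record found"
    else
        echo "no record"
    fi
}

sample='{"hits":{"hits":[{"_source":{"netflow":{"sourceNodeName":"k8s-node-1"}}}]}}'
check_record "$sample"   # prints "record found"
```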
docs/flow-visibility.md
Outdated
@@ -0,0 +1,85 @@
# Flow Visibility
## Purpose
Antrea supports sending IPFIX flow records through a flow exporter. The Elastic |
The code that sends flow records in the Antrea Agent is called the flow exporter.
docs/flow-visibility.md
Outdated
[Logstash](https://www.elastic.co/logstash) works as data collector to
centralize flow records. [Logstash Netflow codec plugin](https://www.elastic.co/guide/en/logstash/current/plugins-codecs-netflow.html)
supports Netflow v5/v9/v10(IPFIX) protocols for flow data collection.
Flow exporter feature in Antrea Agent uses IPFIX (Netflow v10) protocol to |
The flow exporter feature in Antrea Agent uses the IPFIX (Netflow v10) protocol to ...
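To make the `cache_ttl` discussion above concrete, the Netflow codec options would sit in the Logstash input configuration, roughly like this (a sketch: the UDP port and definitions path are assumptions; `versions`, `cache_ttl`, and `ipfix_definitions` are documented options of the logstash-codec-netflow plugin):

```conf
input {
  udp {
    port  => 4739
    codec => netflow {
      versions => [10]
      # Keep IPFIX templates cached as long as possible instead of expiring
      # them after the default 4000 seconds.
      cache_ttl => 2147483647
      # Custom Antrea information element definitions.
      ipfix_definitions => "/usr/share/logstash/definitions/ipfix.yml"
    }
  }
}
```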
docs/flow-visibility.md
Outdated
Flow exporter feature in Antrea Agent uses IPFIX (Netflow v10) protocol to
export flow records.

IPFIX flow records contain following Antrea specific fields along with standard |
Exported IPFIX flow records contain the following Antrea-specific fields along with ...
docs/flow-visibility.md
Outdated

## Deployment Steps
To create all the necessary resources in the `antrea-flow-collector` namespace
and get |
why the line break?
my mistake. Fixed.
docs/flow-visibility.md
Outdated
### Flows
#### Pod-to-pod Traffic
<img src="http://downloads.antrea.io/static/flow-visualization-flow-1.png" width="900" alt="Flow
Visualization Flows Dashboard">
<img src="http://downloads.antrea.io/static/flow-visualization-flow-2.png" width="900" alt="Flow
Visualization Flow Dashboard">

#### Pod-to-service Traffic
<img src="http://downloads.antrea.io/static/flow-visualization-flow-3.png" width="900" alt="Flow
Visualization Flows Dashboard">
<img src="http://downloads.antrea.io/static/flow-visualization-flow-4.png" width="900" alt="Flow
Visualization Flow Dashboard">

### Flow Records
<img src="http://downloads.antrea.io/static/flow-visualization-flow-record.png" width="900" alt="Flow
Visualization Flow Record Dashboard">

### Node Throughput
<img src="http://downloads.antrea.io/static/flow-visualization-node-1.png" width="900" alt="Flow
Visualization Node Throughput Dashboard">
<img src="http://downloads.antrea.io/static/flow-visualization-node-2.png" width="900" alt="Flow
Visualization Node Throughput Dashboard">
when I look at the rendered markdown file, this part is really packed with images. Having these images is great, but do you think you could add a line or 2 of test for every section to break the flow of images and explain what is being shown?
Additionally, I would really recommend using the full AWS S3 link even though it is longer (which is not an issue IMO since the link is not used by users), in order to load images over HTTPs.
Agreed. Added some elaboration and changed the image links.
some more nits
docs/network-flow-visibility.md
Outdated

### Flows
#### Pod-to-pod Traffic
Pod-to-pod Tx and Rx traffic is shown in sankey diagrams. Corresponding |
everywhere in this doc:
s/pod/Pod
s/node/Node
s/service/Service
for consistency with the rest of the documentation
docs/network-flow-visibility.md
Outdated
### Flows
#### Pod-to-pod Traffic
Pod-to-pod Tx and Rx traffic is shown in sankey diagrams. Corresponding
source/destinationpod throughput is visualized using stacked line graph. |
missing whitespace
docs/network-flow-visibility.md
Outdated
Visualization Flow Dashboard">

### Flow Records
Flow Records dashboard shows raw flow records over time with filter. |
s/with filter/with support for filters
docs/network-flow-visibility.md
Outdated
| sourceNodeName | 55829 | 104 | string |
| destinationNodeName | 55829 | 105 | string |
| destinationClusterIP | 55829 | 106 | ipv4Address |
| destinationServicePort | 55829 | 107 | unsigned16 | |
We are not populating this field in flow exporter. It is probably better to remove it.
the field ID of destinationServicePortName is still 108 then?
Yes, the ID remains the same.
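The remaining Antrea-specific fields could then be declared for the Netflow codec's `ipfix_definitions` file roughly as follows (a sketch: the nesting under the Antrea enterprise number 55829 and the type symbols such as `:ip4_addr` follow the codec's bundled ipfix.yaml conventions but should be verified against the codec documentation; field 107 is omitted per the discussion above):

```yaml
# Hypothetical custom Antrea information element definitions for the
# Logstash Netflow codec, keyed by enterprise number, then field ID.
55829:
  104:
  - :string
  - :sourceNodeName
  105:
  - :string
  - :destinationNodeName
  106:
  - :ip4_addr
  - :destinationClusterIP
  108:
  - :string
  - :destinationServicePortName
```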
@@ -0,0 +1,101 @@
# Network Flow Visibility
## Purpose |
As this document talks about the purpose and details of the ELK flow collector, the title "ELK Flow Collector" is appropriate. This can be a section in the document, like "Flow Exporter": https://github.com/vmware-tanzu/antrea/pull/980/files#diff-a27d53d71ee3ed485cf12d6147782edb
docs/network-flow-visibility.md
Outdated
Visualization Overview Dashboard">

### Flows
#### Pod-to-pod Traffic |
s/pod/Pod
At other places too.
docs/network-flow-visibility.md
Outdated
Visualization Flow Dashboard">

#### Pod-to-service Traffic
Pod-to-service traffic is presented similar to pod-to-pod traffic. |
s/service/Service
At other places too..
docs/network-flow-visibility.md
Outdated

#### Pod-to-service Traffic
Pod-to-service traffic is presented similar to pod-to-pod traffic.
Corresponding source/destination IP addresses are shown in tooltips. |
s/source/destination/source or destination
docs/network-flow-visibility.md
Outdated
Visualization Flow Dashboard">

### Flow Records
Flow Records dashboard shows raw flow records over time with filter. |
the raw flow records..
docs/network-flow-visibility.md
Outdated
Visualization Flow Record Dashboard">

### Node Throughput
Node Throughput dashboard visualizes inter-node and intra-node traffic |
... shows the visualization of inter-node and intra-node traffic by aggregating all the pod traffic per each node.
docs/network-flow-visibility.md
Outdated


## Pre-built Dashboards
The following dashboards are pre-built and recommended for Antrea flow
The following dashboards are pre-built and recommended for Antrea flow |
and are recommended
docs/network-flow-visibility.md
Outdated
`http://[NodeIP]: 30007`

`build/yamls/flow/kibana.ndjson` is an auto-generated reusable file containing
pre-built objects for visualizing pod-to-pod, pod-to-service and node-to-node |
Capitalize pod, service, and node to be consistent.
LGTM
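As an aside on the `kibana.ndjson` file, Kibana also exposes a saved objects import API that could load the pre-built dashboards without the UI. A hypothetical sketch (NODE_IP is a placeholder; the command is only printed here as a dry run rather than executed against a live cluster):

```shell
# Hypothetical alternative to the Kibana UI import: the saved objects
# import API (requires the kbn-xsrf header).
NODE_IP="${NODE_IP:-127.0.0.1}"
import_cmd="curl -X POST http://${NODE_IP}:30007/api/saved_objects/_import \
  -H 'kbn-xsrf: true' --form file=@build/yamls/flow/kibana.ndjson"
# Dry run: print the command instead of running it.
echo "$import_cmd"
```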
## Deployment Steps
To create all the necessary resources in the `elk-flow-collector` namespace
and get everything up-and-running, run:
```shell |
can you add an empty line before and after the code block? I think Cody reported issues with netlify (website) without them.
Done.
/test-all
docs/network-flow-visibility.md
Outdated

<img src="https://s3-us-west-2.amazonaws.com/downloads.antrea.io/static/flow-visualization-flow-record.png" width="900" alt="Flow
Visualization Flow Record Dashboard">

### Node Throughput
Node Throughput dashboard visualizes inter-node and intra-node traffic
by aggregating pod traffic per node.
Node Throughput dashboard shows the visualization of inter-Node and |
IMO, here with inter-Node, intra-Node and per Node, we have to use lower case because we are not referencing a particular Node. Similarly for "all the Pod traffic", we need to use lower case because we are characterizing the traffic as pod traffic and not referencing a particular Pod.
Sounds great. How about changing it to:
Node Throughput dashboard provides an insight into inter-Node and intra-Node traffic through aggregated Pod traffic.
LGTM
/test-all
Merging this as there are approvals and Antonin's final comment has been addressed.
* Add Elastic stack deployment
* address comments
* modified format
* add comments for check_record
* address comments
* address comments
This PR is a follow-up of PR #825, adding Elastic Stack manifests and documentation for collecting IPFIX flow records and visualizing them in Kibana dashboards.