
[Kubernetes] Add data_stream.dataset pre filter is dashboards and remove duplicated dashboard tags #4972

Merged

Conversation

MichaelKatsoulis
Copy link
Contributor

@MichaelKatsoulis MichaelKatsoulis commented Jan 11, 2023

What does this PR do?

This PR adds pre filters based on the field data_stream.dataset to all visualizations of the Kubernetes integration, except the ones below, which are still under development (elastic/beats#34237):

  • [Metrics Kubernetes] Proxy
  • [Metrics Kubernetes] Controller Manager
  • [Metrics Kubernetes] Scheduler

Example Visualization:

(screenshot: example visualization)

In addition, based on the discussions in #4969, this PR removes all tags (Kubernetes, Managed) specifically added by the Kubernetes dashboards. The reason is that those tags are already added by default by Fleet, so they end up duplicated.

Important notes:

  1. Some of the visualizations were already using pre filters, but with event.dataset instead of data_stream.dataset. This approach didn't produce the performance benefits that the data stream naming scheme offers (faster queries on keyword-type fields). See https://www.elastic.co/blog/an-introduction-to-the-elastic-data-stream-naming-scheme
  2. In the [Metrics Kubernetes] Cluster Overview dashboard, some of the visualizations use ad-hoc data views. Those ad-hoc data views pre filter differently: they are based on an index pattern like metrics-kubernetes.state_pod*. The reason behind this is to leverage the power of custom fields. In those visualizations the data_stream filter is redundant. See the screenshots.
    (screenshots: ad-hoc data view example, custom field)
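The pre-filter change in note 1 can be sketched as Elasticsearch query DSL; this is an illustration of the two filter shapes, not an excerpt from the dashboards:

```python
# Sketch of the two pre-filter shapes as Elasticsearch query DSL term queries.

# Old approach: filter on event.dataset.
old_filter = {"term": {"event.dataset": "kubernetes.state_pod"}}

# New approach: filter on data_stream.dataset. In the data stream naming
# scheme this is a constant_keyword field, so Elasticsearch can skip whole
# backing indices that cannot match, which is where the speed-up comes from.
new_filter = {"term": {"data_stream.dataset": "kubernetes.state_pod"}}
```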

@MichaelKatsoulis MichaelKatsoulis requested review from a team as code owners January 11, 2023 15:41
@MichaelKatsoulis MichaelKatsoulis marked this pull request as draft January 11, 2023 15:41
@elasticmachine

elasticmachine commented Jan 11, 2023

💚 Build Succeeded


Build stats

  • Start Time: 2023-01-19T07:44:54.384+0000

  • Duration: 26 min 13 sec

Test stats 🧪

Test Results
Failed 0
Passed 91
Skipped 0
Total 91

🤖 GitHub comments


To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

@MichaelKatsoulis MichaelKatsoulis marked this pull request as ready for review January 12, 2023 08:50
@MichaelKatsoulis
Contributor Author

cc @ruflin

@ruflin
Contributor

ruflin commented Jan 12, 2023

I like the idea of the embedded data views; unfortunately it seems that, currently, if there is no data for a visualisation, Kibana shows extra errors? If that is the case, we should take this up with the Kibana team to get it out of the way.

On the naming of the embedded data views, let's follow a pattern similar to the data stream naming scheme: prefix the name with the data stream pattern we query, then an addition that might explain which visualisation it is part of. This should make debugging easier when errors show up. Something like metrics-kubernetes.state_pod/{visualisation_name/title} (just an idea).

For the visualisations that query multiple data sets, ideally we could apply some optimisations, like using a prefix query on data_stream.dataset, using metrics-kubernetes.*, or, if there are two relevant datasets, an or query. The less data that needs to be touched, the better.
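The optimisations suggested above can be sketched as Elasticsearch query DSL; the dataset names are illustrative, not taken from specific visualisations:

```python
# Prefix query: match every kubernetes.* dataset with a single clause.
prefix_filter = {"prefix": {"data_stream.dataset": "kubernetes."}}

# "Or" query: when exactly two datasets are relevant, a bool/should with
# two term clauses touches less data than a wildcard over all metrics.
or_filter = {
    "bool": {
        "should": [
            {"term": {"data_stream.dataset": "kubernetes.pod"}},
            {"term": {"data_stream.dataset": "kubernetes.state_pod"}},
        ],
        "minimum_should_match": 1,
    }
}
```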

@MichaelKatsoulis
Contributor Author

I like the idea of the embedded data views; unfortunately it seems that, currently, if there is no data for a visualisation, Kibana shows extra errors? If that is the case, we should take this up with the Kibana team to get it out of the way.

Error handling on the Kibana side has already been discussed with the Kibana/Lens team, and they have plans to fix those issues in some ways, like changing the errors to warnings and adding custom warning messages.
elastic/kibana#143673

I agree about the naming conventions for the ad hoc data views. But as far as I have seen so far, their names do not appear in the errors shown, only the index pattern they use. See the picture:
(screenshot: errors)

About the visualisations that query multiple datasets, we follow the or-query approach for now; in all of them the datasets are relevant. But yes, touching as little data as we can is the optimal solution.
Maybe we could consider using ad hoc data views everywhere, without the need for any pre filtering.

It is quite interesting that when testing Kubernetes at scale, with 5-10 thousand pods, the visualizations with the ad hoc data views were very efficient: they showed the right results in a few seconds. I remember from the Metricbeat tests we did with BY in the past that the previous dashboards with complex visualizations (TSVB) took minutes to show results with too many pods.

@gizas
Contributor

gizas commented Jan 12, 2023

Just for reference, we are going to investigate ad-hoc data view problems in a 10000-pod k8s cluster: elastic/kibana#148058

@@ -458,7 +460,7 @@
"panelIndex": "3e6790d6-de88-47de-8c3e-d8aa2c89c538",
"title": "StatefulSet Generation Observed [Metrics Kubernetes]",
"type": "visualization",
"version": "8.4.0-SNAPSHOT"
"version": "8.6.0-SNAPSHOT"
Contributor

Suggested change
"version": "8.6.0-SNAPSHOT"
"version": "8.6.0"

},
"coreMigrationVersion": "8.5.0",
"coreMigrationVersion": "8.6.0",
Contributor

@gizas gizas Jan 12, 2023

In general this should be 8.5.0 everywhere, not 8.6.0, right? So there is no need to introduce it after 8.6.0.

Contributor Author

We are using 8.5 Kibana features only, so yes, 8.5.0 should be fine. But that version is created automatically; honestly, I don't think it plays a role. But I can set them all to 8.5.0.

Contributor

@tetianakravchenko tetianakravchenko left a comment

Container and Statefulset dashboards do not have all queries updated, e.g.:

"query": "event.dataset :\"kubernetes.state_statefulset\" "

(for Container there are more event.dataset occurrences). Is that expected?

@@ -327,6 +324,8 @@
"rowHeightLines": 1
}
},
"title": "CronJobs Informations [Metrics Kubernetes]",
Contributor

@tetianakravchenko tetianakravchenko Jan 12, 2023

should we remove [Metrics Kubernetes] from the name here as well? Similar to this comment in #4948 (comment)

not for this PR, in general

Contributor Author

Thanks for spotting those. In the Statefulset dashboard I missed updating one of the visualization filters.
The other ones, for Container, were extra filters used in the advanced settings of some fields in two visualizations. So they would not have been a problem, as the visualizations were already using the right pre filters at the top level. But I updated them all now for consistency.

Contributor

After commit 43482cc the Statefulset dashboard still uses event.dataset:

"query": "event.dataset :\"kubernetes.state_statefulset\" "

Is that expected?

@tetianakravchenko
Contributor

@MichaelKatsoulis

In the [Metrics Kubernetes] Cluster Overview dashboard, some of the visualizations use ad-hoc data views. Those ad-hoc data views pre filter differently: they are based on an index pattern like metrics-kubernetes.state_pod*. The reason behind this is to leverage the power of custom fields. In those visualizations the data_stream filter is redundant. See the screenshots.

do we have any documentation on pre filter using ad-hoc data views, comparison to the data_stream.dataset pre filter somewhere?

@MichaelKatsoulis
Contributor Author

do we have any documentation on pre filter using ad-hoc data views, comparison to the data_stream.dataset pre filter somewhere?

Unfortunately, there is no documentation about this, because we did not use the ad hoc data views for pre-filtering purposes; we used them as the best way to create ad hoc fields. I just decided, when creating those data views, to limit the scope of the index pattern to metrics-kubernetes.state_pod instead of the generic metrics-*.
As for the comparison to the data_stream.dataset pre filter from a performance perspective, both approaches are almost the same, as far as I have understood.
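To illustrate why an ad-hoc data view is handy for custom fields: runtime fields can be attached to it and are computed at query time. The field name and Painless script below are invented for illustration and are not taken from the actual dashboards:

```python
# Hypothetical sketch of an ad-hoc data view carrying a runtime ("custom") field.
ad_hoc_data_view = {
    "title": "metrics-kubernetes.state_pod*",  # narrow index pattern
    "runtimeFieldMap": {
        "pod.cpu.usage_pct": {  # hypothetical custom field name
            "type": "double",
            "script": {
                # illustrative Painless source, evaluated at query time
                "source": "emit(doc['kubernetes.pod.cpu.usage.node.pct'].value * 100)"
            },
        }
    },
}
```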

"id": "metrics-*",
"name": "indexpattern-datasource-current-indexpattern",
"type": "index-pattern"
},
{
"id": "metrics-*",
"name": "indexpattern-datasource-layer-7711169c-3a7b-4071-98d0-3644aa1dde0b",
Contributor

Are there 2 different metrics-* index patterns? See line 173

Contributor Author

In the references section there are five entries referencing the same id, metrics-*. I don't know how so many got created. I will see if just one is OK.

Contributor

This might be the "data view as value" entries. If that is the case, I suspect there will be one per visualisation. I think that is one of the limitations @joshdover stumbled over recently.

Contributor Author

Yes, exactly. There is one per visualization, and all of them are needed for the dashboard to render.

Contributor Author

I tried to remove one of the two in this example, but then I get errors in the visualization using it, like undefined is not an object (evaluating 't.find((({name:t})=>t===wc(e))).id'). So I will just keep it as it is.

Contributor Author

I created this issue for Kibana team elastic/kibana#149105

…d of pop up kibana errors in case of no data stream found
@MichaelKatsoulis
Contributor Author

In the latest commit, 1b6f2c0, I updated the ad-hoc data views to use the more generic index pattern metrics-* rather than more specific ones like metrics-kubernetes.state_pod or metrics-kubernetes.state_deployment.
Although more specific patterns make more sense if we only need the data derived from them, Kibana is not yet ready to handle error cases gracefully.
The pop-ups appearing when there is no data stream indicate that the specific index pattern, like metrics-kubernetes.state_pod, did not match any indices.
If we use a generic index pattern like metrics-*, those pop-ups do not appear; instead we see errors like field not found on each visualisation area.
(screenshot: errors)

So until Kibana handles the errors more gracefully, we will use the generic index pattern and filter on data_stream.dataset only.
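The interim setup described above boils down to a generic pattern plus a dataset filter. As a sketch (dataset name illustrative), the KQL line shows how the filter would appear in the Kibana query bar, and the DSL dict is a rough equivalent:

```python
# Generic index pattern: always matches some index, so no "missing index" pop-ups.
index_pattern = "metrics-*"

# Pre filter as it would look in the Kibana query bar (KQL).
kql_filter = 'data_stream.dataset : "kubernetes.state_pod"'

# Roughly equivalent query DSL filter.
dsl_filter = {"term": {"data_stream.dataset": "kubernetes.state_pod"}}
```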

@@ -258,7 +259,7 @@
"panelIndex": "2f904623-cc34-4a48-afce-46fff964dbdf",
"title": "DaemonSet Replicas Available [Metrics Kubernetes]",
"type": "visualization",
"version": "8.4.0-SNAPSHOT"
"version": "8.6.0-SNAPSHOT"
Contributor

Suggested change
"version": "8.6.0-SNAPSHOT"
"version": "8.6.0"

@@ -158,7 +158,7 @@
"panelIndex": "74f075ae-ea3e-40aa-a84c-2538a2195f6a",
"title": "DaemonSet Replicas Desired [Metrics Kubernetes]",
"type": "visualization",
"version": "8.4.0-SNAPSHOT"
"version": "8.6.0-SNAPSHOT"
Contributor

Can you remove all snapshot references?

Contributor Author

Why? These are created by the elastic-package tool. We also have them in all our previous dashboard versions. We should not change the YAML files manually; it is not safe.

Contributor

I agree we should not modify these manually. But I think there is a different question here: what is the oldest version this package is compatible with? The assets should be built with that version to make sure things are compatible. So if it is, let's say, 8.3, that version should be used for building the dashboards. Otherwise there is a chance that we use features which are not compatible.

Contributor Author

Yes, agreed. 8.6 is the oldest version it is compatible with, and I used that to build the dashboards. The thing is that I used the 8.6-SNAPSHOT image, which may be why this field has this value. Then I switched to the 8.6.0 image and exported the dashboards again with the elastic-package tool, but it did not change that. Building the dashboards with a higher version of the stack needs attention, because fields get added that are not supported by previous versions, like "usesAdHocDataView": false for example.

Contributor

Then I switched to the 8.6.0 image and exported the dashboards again with the elastic-package tool, but it did not change that.

I'm surprised by this one. I would expect that if you have an 8.6.0 stack running, there should not be a -SNAPSHOT mention anywhere. Can you raise this with the ecosystem team?

Contributor Author

I searched a bit more into that, and it seems that elastic-package changes the version of all visualizations of a dashboard only if there was an update in at least one of those visualizations. If there is no update, it keeps the old one and changes only the "created_at" field.

Contributor

@tetianakravchenko tetianakravchenko left a comment

LGTM!
@MichaelKatsoulis could you double check this comment #4972 (comment) ?

@MichaelKatsoulis
Contributor Author

@tetianakravchenko thanks! Updated. 95f4a1d

@elasticmachine

elasticmachine commented Jan 17, 2023

🌐 Coverage report

Name Metrics % (covered/total) Diff
Packages 100.0% (0/0) 💚
Files 100.0% (0/0) 💚 3.043
Classes 100.0% (0/0) 💚 3.043
Methods 94.872% (74/78) 👍 3.423
Lines 100.0% (0/0) 💚 9.304
Conditionals 100.0% (0/0) 💚

@gizas
Contributor

gizas commented Jan 18, 2023

@MichaelKatsoulis Can we also test the allowNoIndex field in ad-hoc data views?
Even manually?

Of course the bad thing here is that this change will be lost if we recreate the dashboards automatically. But we can keep a note in the code (especially for the general dashboard).

@MichaelKatsoulis
Contributor Author

Can we also test the allowNoIndex field in ad-hoc data views? Even manually?

The allowNoIndex field can, as far as I have understood, only be added in the dashboard JSON file, to mitigate the problem of unwanted pop-up errors in case of missing indices.
For this PR we tackled this problem by using the more generic index pattern, as described in #4972 (comment).

I can try allowNoIndex to test whether it works, but I will not include it in this PR; I will wait until it is available as an option in the Kibana UI.
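As a hypothetical sketch of what the discussed change would look like in a data view's attributes (the index pattern is from the earlier discussion; the surrounding shape is illustrative):

```python
# With allowNoIndex set to true, a data view whose index pattern matches
# no indices should not surface as a hard error in the dashboard.
data_view_attributes = {
    "title": "metrics-kubernetes.state_pod*",
    "allowNoIndex": True,
}
```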

@MichaelKatsoulis
Contributor Author

/test

@MichaelKatsoulis
Contributor Author

MichaelKatsoulis commented Jan 18, 2023

CI is failing due to some missing fields in some of the Kubernetes data streams, caused by a change introduced in Beats 8.6. We are working on a fix with @tetianakravchenko, and then this PR can be merged.
#5044

@andrewkroh andrewkroh changed the title Add data_stream.dataset pre filter in kubernetes dashboards and remove duplicated tags from the dashboards [Kubernetes] Add data_stream.dataset pre filter is dashboards and remove duplicated dashboard tags Jan 18, 2023
Contributor

@drewdaemon drewdaemon left a comment

This is a great optimization. Thank you!

Snapshot version

Removing the SNAPSHOT version has already been discussed.

However, from a process perspective I'm concerned to see a SNAPSHOT version popping up here in the first place. This means that the dashboards were changed using an unreleased version of Kibana, right?

As far as this PR goes, you're probably okay since it was created only one day after 8.6 was publicly released.

But, we should avoid doing this as a rule since Kibana features can change before the public release, potentially breaking the integration assets that are created on the non-released version.

Other Kibana issues

As far as I can tell, all the issues that have been filed with the Kibana team have been answered. However, if you don't feel like any of them are getting sufficient attention, please ping me.

@MichaelKatsoulis
Contributor Author

/test

@MichaelKatsoulis
Contributor Author

Thanks @andrewctate for your comment.
So I understand that the right approach from now on is to never use a snapshot version of Kibana to develop dashboards, for example.
What concerns me is that the official image is released when it is GA, which is usually close to a month after the FF. If we want to deliver a new package version which leverages some features of that stack/Kibana release, then the development process cannot start before the GA.
Is this rationale right?

@ruflin
Contributor

ruflin commented Jan 19, 2023

@MichaelKatsoulis The build of new packages can start at any time. But as long as a -SNAPSHOT build is used for development, these packages should only be released as a pre-release package. As soon as we hit FF, there should be a release candidate that does not include the -SNAPSHOT anymore, but let's talk about this internally a bit more.

@MichaelKatsoulis MichaelKatsoulis merged commit e215f99 into elastic:main Jan 19, 2023
@elasticmachine

Package kubernetes - 1.31.0 containing this change is available at https://epr.elastic.co/search?package=kubernetes

@joshdover
Contributor

The build of new packages can start at any time. But as long as a -SNAPSHOT build is used for development, these packages should only be released as a pre-release package. As soon as we hit FF, there should be a release candidate that does not include the -SNAPSHOT anymore, but let's talk about this internally a bit more.

Using pre-release versions to build the dashboards and then simply removing the SNAPSHOT tag in the SO version numbers is risky.

What happens if the Kibana team removes or renames a field in the dashboard SO schema between the time the dashboard was built on a SNAPSHOT version and the final release? If we just remove the SNAPSHOT tag, these dashboards won't get migrated by the migration system and we could end up with a corrupt dashboard that doesn't work.

We should probably make elastic-package refuse to export dashboards from snapshot versions (maybe with some override flag like --accept-unstable-snapshots)
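The suggested guard could look roughly like this. This is a sketch, not elastic-package's real code; the flag name mirrors the --accept-unstable-snapshots idea from the comment:

```python
def check_exported_stack_version(version: str, accept_unstable_snapshots: bool = False) -> None:
    """Refuse to export dashboards built on a -SNAPSHOT stack unless overridden."""
    if version.endswith("-SNAPSHOT") and not accept_unstable_snapshots:
        raise ValueError(
            "refusing to export dashboards from snapshot stack version " + version
        )

# A released stack version passes the check; a snapshot raises unless overridden.
check_exported_stack_version("8.6.0")
check_exported_stack_version("8.6.0-SNAPSHOT", accept_unstable_snapshots=True)
```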

@MichaelKatsoulis
Contributor Author

So I understand that package developers should wait until the FF and the release candidate in order to develop a package, only in cases where it requires features of that version of Kibana. This is mainly because it is controlled by a different team.
Otherwise they can start development of the package with a snapshot version, but then, after FF, use the release candidate to change something in the dashboard again so that the export process removes the SNAPSHOT from the JSON.
