Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert Istio dashboards to Kibana Lens #5268

Merged
merged 8 commits into from
Mar 7, 2023

Conversation

gsantoro
Copy link
Contributor

@gsantoro gsantoro commented Feb 14, 2023

What does this PR do?

Convert the Istio dashboards to Lens

Checklist

  • I have reviewed tips for building integrations and this pull request is aligned with them.
  • I have verified that all data streams collect metrics or logs.
  • I have added an entry to my package's changelog.yml file.
  • I have verified that Kibana version constraints are current according to guidelines.

Author's Checklist

  • [ ]

How to test this PR locally

Related issues

Screenshots

@gsantoro gsantoro added enhancement New feature or request draft Draft labels Feb 14, 2023
@gsantoro gsantoro self-assigned this Feb 14, 2023
@gsantoro gsantoro requested a review from a team as a code owner February 14, 2023 12:23
@gsantoro gsantoro requested review from gizas and devamanv February 14, 2023 12:23
@elasticmachine
Copy link

elasticmachine commented Feb 14, 2023

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2023-03-06T20:56:41.791+0000

  • Duration: 15 min 44 sec

Test stats 🧪

Test Results
Failed 0
Passed 14
Skipped 0
Total 14

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

@constanca-m
Copy link
Contributor

Seems ok. For the jenkins error, update the kibana requirement here to 8.6.0, or change the version here and here for the dashboard (8.4.0 if possible).

"dataType": "number",
"isBucketed": false,
"label": " ",
"operationType": "max",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be max?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

About this metric. I don't even know what it's doing but it is not one of those ones reported by Stephen to convert from Max to Sum. Before I change that, I'll have a chat with Stephen.

"title": "Galley Validation Failed",
"type": "visualization",
"version": "8.4.0"
"title": "Galley Validation Failed by Label",
Copy link
Contributor

@gizas gizas Feb 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just tempted to ask if we need Galley in general. Do you think it gives value? Do you think we can remove it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no idea what it is yet. @ChrsMark do you know?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Istiod still reports this metric, right? If so, why to remove it?
The definition of this metric seems to be Count of validations failed by the configuration validation webhook. The metric is reported as galley_validation_failed{group="networking.istio.io",reason="invalid_resource",resource="gateways",version="v1alpha3"} 1.

So I would say if it's still provided, it's a valuable indicator that Istio provides and some other tools use it to create alerts. See an example here. So I wouldn't remove it if there is room for it in the dashboard but it's up to you folks to make the last call.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll have a look at that metric and I'll write here about it.

@gizas
Copy link
Contributor

gizas commented Feb 15, 2023

We can keep adding short info for how we test and also screenshots to provide evidence for your results.
Dashboard stories with screenshot evidence should be a must.

Also in general the description text like Metrics from Istiod does not give much value. Either remove it or enhance it with valuable information on how the users should read the dashboards .

Some more ideas like grouping Pilot XDs together or Time related visualisation can also be handy. Especially in overview I am getting lost to read so many information.

Also @constanca-m used markdown to group specific visualisations. You can also group by this your visualisations like here elastic/beats#34161

Overall great improvement Giuseppe!

@gsantoro
Copy link
Contributor Author

We can keep adding short info for how we test and also screenshots to provide evidence for your results. Dashboard stories with screenshot evidence should be a must.

I agree. I opened this PR in draft mostly to see what was broken with the Jenkins build before adding screenshots and instructions to test.

Also in general the description text like Metrics from Istiod does not give much value. Either remove it or enhance it with valuable information on how the users should read the dashboards .

I am not sure what to put in there. I just wanted to convey that those metrics in Overview are from Istiod compared to the ones in Traffic which have a description Metrics from Istio Sidecar Proxies. I am open to suggestions.

Some more ideas like grouping Pilot XDs together or Time related visualisation can also be handy. Especially in overview I am getting lost to read so many information.

Also @constanca-m used markdown to group specific visualisations. You can also group by this your visualisations like here elastic/beats#34161

Looks good. I'll have a look into those. Thanks

Overall great improvement Giuseppe!

@gsantoro
Copy link
Contributor Author

@constanca-m and @gizas

Also @constanca-m used markdown to group specific visualisations. You can also group by this your visualisations like here elastic/beats#34161

It looks great that piece of markdown. Unfortunately it's a TSVB panel and the styling with custom CSS is not available in Lens. Given that the topic of this PR is to move away from TSVB I am not sure that I should add a TSVB panel for styling some markdown.

Screenshot 2023-02-15 at 17 11 22

@elasticmachine
Copy link

elasticmachine commented Feb 15, 2023

🌐 Coverage report

Name Metrics % (covered/total) Diff
Packages 100.0% (1/1) 💚
Files 100.0% (3/3) 💚
Classes 100.0% (3/3) 💚
Methods 90.0% (27/30) 👍 65.0
Lines 97.472% (347/356) 👎 -2.528
Conditionals 100.0% (0/0) 💚

@gizas
Copy link
Contributor

gizas commented Feb 16, 2023

  • Ok markdown in TSVB is not something we must do it. Just group then the related visualisations
  • For suggestions for text: You can even put links to Istio pages that describe the components you visualise for reference.
  • Keep Galley
  • Have a look for the max comment Convert Istio dashboards to Kibana Lens #5268 (comment)

@gsantoro
Copy link
Contributor Author

I would like to start a discussion about the changes applied to migrate from TSVB to Lens. I'll highlight every change in a separate comment so that we easily reference them and comment them.

  1. in Lens you can't see the last value for a metric in the Legend. The current value is displayed as usual when you over the metric.

In TSVB
Screenshot 2023-02-16 at 10 24 51

in Lens
Screenshot 2023-02-16 at 10 24 57

@gsantoro
Copy link
Contributor Author

  1. Requests per source/destination used to use the Max function instead of Sum function. . Now instead, we sum the number of requests per source/destination and we calculate the rate in seconds as well. They have migrated to Lens as well

old implementation in TSVB with Max function
Screenshot 2023-02-16 at 13 24 35

new implementation in Lens with Sum function
Screenshot 2023-02-16 at 13 24 51

@gsantoro
Copy link
Contributor Author

  1. Size of requests/responses has been rewritten from TSVB to Lens. (99) for 99th percentile in the legends has been replaced by (99th percentile) in the titles

in TSVB
Screenshot 2023-02-16 at 13 29 35

in Lens
Screenshot 2023-02-16 at 13 29 28

@gsantoro
Copy link
Contributor Author

  1. Similar discussion as point 3 for Duration of requests by source/destination

in TSVB
Screenshot 2023-02-16 at 13 31 12

in Lens
Screenshot 2023-02-16 at 13 30 39

@gsantoro
Copy link
Contributor Author

  1. Multiple Gauges being replaced by Numeric Metric colour coded in 3 ranges (>0 and < 50%, >= 50% and < 75%, > 75% and <100%) with colours Green, Yellow, Red similarly to the behaviour with Gauges

in TSVB
Screenshot 2023-02-16 at 13 34 56
Screenshot 2023-02-16 at 13 35 01

in Lens
Screenshot 2023-02-16 at 13 35 13
Screenshot 2023-02-16 at 13 35 18

@gsantoro
Copy link
Contributor Author

  1. Multiple percentiles in the same visualization

in TSVB (notice the last value after the metric is a bit confusing here
Screenshot 2023-02-16 at 13 36 37

In Lens (percentiles are indicated with (25) for 25th percentile, ...
Screenshot 2023-02-16 at 13 36 47

@gsantoro
Copy link
Contributor Author

  1. The final look of both dashboards Overview and Traffic

in TSVB
Screenshot 2023-02-16 at 13 40 56

Screenshot 2023-02-16 at 13 40 46

in Lens
Screenshot 2023-02-16 at 13 40 21
Screenshot 2023-02-16 at 13 39 59

@gsantoro
Copy link
Contributor Author

Hey @bvader,
do you mind having a look at this PR for the changes that we discussed on our last meeting?

@gsantoro
Copy link
Contributor Author

  1. I recently saw that Kubernetes integration comes with a new way to implement Gauges (like in the below screenshot).

Screenshot 2023-02-16 at 15 16 19

The problem with that is it's coming from a Gauge TSVB visualization.

Since there was lot of focus on moving the Istio integration entirely to Lens, I am now wondering how much of hour current dashboards still rely on TSVB given that I have seen at least

  • markdown for category of dashboards
  • Gauges for percentage usage
  • (there might be more I am not aware of yet)

@gizas
Copy link
Contributor

gizas commented Feb 20, 2023

At the moment all above changes LGTM.
Two minor:

  1. Foot Labels like @timestamp per hour can be removed? Or to be included in the header of visualisation?
  2. Numeric Metrics in 5: Can you change the actual number to be in the center and big? Also Capitialise headers and make them onliners?

@gsantoro
Copy link
Contributor Author

At the moment all above changes LGTM. Two minor:
thanks for the feedback

  1. Foot Labels like @timestamp per hour can be removed? Or to be included in the header of visualisation?

About the foot label. I can either remove the label entirely or customize the @timestamp part. The second part of that label actually change based on the timeframe. So that part might be per hour or per seconds. I can customize that part other than fixing the time frame or remove that part entirely.

this is how you remove the @timestamp part of that label

Screenshot 2023-02-21 at 14 43 46

and this is how it looks with no X axis label.

Screenshot 2023-02-21 at 14 45 00

What do you suggest?

  1. Numeric Metrics in 5: Can you change the actual number to be in the center and big? Also Capitialise headers and make them onliners?

I can't modify any of the visual aspect of this Metric visualization, meaning either the aspect of the number or the headers. TSVB has lot more customization capabilities (via CSS) than Lens. The only alternative is to have a single big percentage number but we will be getting rid of the color that indicate a possible range of values.

@constanca-m
Copy link
Contributor

constanca-m commented Feb 21, 2023

It looks great that piece of markdown. Unfortunately it's a TSVB panel and the styling with custom CSS is not available in Lens. Given that the topic of this PR is to move away from TSVB I am not sure that I should add a TSVB panel for styling some markdown.

This is unfortunately true. There are other things that Lens can't do yet, so I wouldn't call having this text as a problem. Example of another thing: summing the last value of some metric that breaks down in a label (Limitations section in this PR)

@constanca-m
Copy link
Contributor

Foot Labels like @timestamp per hour can be removed? Or to be included in the header of visualisation?

If it were me, I would try to remove the axis labels, and try to be clear with the title and the legend (if it exists). And there is no reason to add the unit type (example, sec, %) in the title, if we can write the axis value with a suffix - like 4 sec. @gsantoro

@gsantoro
Copy link
Contributor Author

If it were me, I would try to remove the axis labels, and try to be clear with the title and the legend (if it exists). And there is no reason to add the unit type (example, sec, %) in the title, if we can write the axis value with a suffix - like 4 sec. @gsantoro

I am more than glad to remove the X axis entirely, fix the date histrogram resolution (like @gizas was suggesting) and let the user guess the timeframe by moving the cursor through the values. Unfortunately you can't add a suffix to an X axis when the axis has a Date histogram. I have done than when the Y axis was numeric to add a s for seconds.

This is how it looks with no X axis and 1h fixed timeframe

Screenshot 2023-02-21 at 16 30 27

@constanca-m
Copy link
Contributor

let the user guess the timeframe by moving the cursor through the values ... This is how it looks with no X axis and 1h fixed timeframe

The visualization is very understandable, I don't see a problem in not having a suffix in the x values.

@gsantoro
Copy link
Contributor Author

gsantoro commented Feb 21, 2023

Here it is the updated look of the dashboard. Changes are:

  • inline all pilot Numeric metrics
  • removed X axis from all visualization
  • set the default Date Histogram resolution to 1h. Just to be clear. If you have more than a couple of days of time frame, the dashboard use a 3h bucket size instead of 1h. This bucket size changes automatically by default based on the timeframe. Fixing it to 1h might have performance implication if you look at more 1w time frame. I hope everyone is ok with this tradeoff.

Screenshot 2023-02-21 at 17 30 22

@constanca-m
Copy link
Contributor

The link to the screenshot just leads me to the description of the PR @gsantoro

@gizas
Copy link
Contributor

gizas commented Feb 22, 2023

  • set the default Date Histogram resolution to 1h. Just to be clear. If you have more than a couple of days of time frame, the dashboard use a 3h bucket size instead of 1h. This bucket size changes automatically by default based on the timeframe. Fixing it to 1h might have performance implication if you look at more 1w time frame. I hope everyone is ok with this tradeoff.

Such kind f information can go to a small markdown if you think the user should know. Sth like Default size of bucket is 1h, you can anticipate performance degradation if large number is set

Overall LGTM

@gsantoro
Copy link
Contributor Author

@constanca-m sorry about that. I fixed the screenshot now.

@gizas I am personally not a huge fan of adding a markdown text to explain the decision behind a choice that might affect the performance of a dashboard. I would have preferred to keep the time frame dynamic like we do elsewhere and then do not required the markdown. Nevertheless, I am happy to go with this solution if you think it is better.

@gizas
Copy link
Contributor

gizas commented Feb 22, 2023

Sure no strong opinion on this, just an alternative if you wanted to consider.

@gsantoro
Copy link
Contributor Author

I got some feedback from Stephen Brown that the Istiod Dashboards with 3 visualization per row are a a bit cramped even on a fairly wide screen.

This is how it looks with 2 visualization per row instead.

Screenshot 2023-02-22 at 11 18 00

I agree with him that it looks better now. So I'll keep it like this unless someone object here.

About the time frame being fixed or not from a previous discussion. This is really what I would like to do:

  • keep the time frame dynamic like per default settings so that it doesn't create buckets that are too small or too big, it doesn't impact the performance of the dashboard and we don't have to add the extra bit of markdown
  • I will still remove the X axis label @timestamp per 3h
  • I'll leave to the user to notice the time frame resolution by just moving the mouse over a couple of sample point from each visualization.

Since we don't have strong guidelines on the previous topics, this is what I think it's sensible but feel free to point out to a better way to do this or some documentation.

@gsantoro
Copy link
Contributor Author

gsantoro commented Mar 6, 2023

A couple of small changes:

  • @bvader made me notice that we have both "Requests per source/destination" and "Requests / sec per source/destination". So I remove the first 2 in favour of just leaving the last 2.
  • I have removed the X labels from all dashboards (this was already agreed but I had some still there)
  • I have added the dashboard pre-filters according to Add data_stream.dataset pre filter in Istio dashboards #4978
  • I have tested everything with some real Istio data

Screenshot 2023-03-06 at 18 03 48

Screenshot 2023-03-06 at 18 04 01

@gsantoro gsantoro mentioned this pull request Mar 6, 2023
14 tasks
Copy link
Member

@ChrsMark ChrsMark left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@gsantoro gsantoro merged commit 3c02fde into elastic:main Mar 7, 2023
@gsantoro gsantoro deleted the feature/istio_lens branch March 7, 2023 09:51
@elasticmachine
Copy link

Package istio - 0.2.4 containing this change is available at https://epr.elastic.co/search?package=istio

agithomas pushed a commit to agithomas/integrations that referenced this pull request Mar 20, 2023
* convert dashboards from TSVB to Lens
* add data_set.datastream pre-filter
* some other layout optimizations
agithomas pushed a commit to agithomas/integrations that referenced this pull request Mar 21, 2023
* convert dashboards from TSVB to Lens
* add data_set.datastream pre-filter
* some other layout optimizations
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
draft Draft enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add data_stream.dataset pre filter in Istio dashboards
5 participants