
[Monitoring] Duplicated pipelines on the Pipelines overview page #49462

Closed
magnuslarsen opened this issue Oct 28, 2019 · 6 comments · Fixed by #49978
Labels
bug (Fixes for quality problems that affect the customer experience), Feature:Logstash Pipelines (Logstash Pipeline UI related), Team:Monitoring (Stack Monitoring team), triage_needed

Comments

@magnuslarsen

Kibana version:
7.4.1
Elasticsearch version:
7.4.1
Server OS version:
18.04.3 LTS
Browser version:
Firefox 70, Chromium Version 78.0.3904.70 (Official Build) snap (64-bit)
Browser OS version:
Kubuntu 19.10
Original install method (e.g. download page, yum, from source, etc.):
DEB packages
Describe the bug:
On the Monitoring page, under Pipelines, ALL pipelines are duplicated n times. In my test environment I see each pipeline duplicated 4 times, and in my production environment I see 3 copies of each pipeline.

Steps to reproduce:
I am actually not entirely sure...

In my test environment I searched for a pipeline, and after clearing the search I saw duplicates.
In my production environment I simply upgraded the Elastic Stack (from 7.4.0), went to the page, and discovered duplicates.

Expected behavior:
Only see one of each pipeline.

Screenshots (if relevant):
(screenshot attached)

Errors in browser console (if relevant):
No errors in the console; however, looking at the pipelines request, the response contains the duplicated results:
Request:
(screenshot attached)
Response:
(screenshot attached)

Provide logs and/or server output (if relevant):
No relevant entries in the Kibana log.
Any additional context:
Both environments worked in 7.4.0 and were upgraded from that version to 7.4.1. The duplication started only after the 7.4.1 upgrade.

Without knowing for sure, this change may have introduced it: #47154

If I can help with anything, don't hesitate to message me :)

@markov00 added the bug, Feature:Logstash Pipelines, Team:Monitoring, and triage_needed labels on Oct 28, 2019
@elasticmachine
Contributor

Pinging @elastic/stack-monitoring (Team:Monitoring)

@igoristic (Contributor) commented Oct 29, 2019

@magnuslarsen

I can't reproduce this issue for some reason.

Can you please test and see if you have the same behavior in Management > Pipelines, e.g.: http://localhost:5601/app/kibana#/management/logstash/pipelines

Also, see if you get any duplicates by running this query in Dev Tools (I have already pre-filled it with the time range from your description above):

GET .monitoring-logstash-*/_search
{
  "size": 0,
  "_source": ["aggregations.check.buckets.pipelines_nested.by_pipeline_id.buckets"],
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "logstash_stats.timestamp": {
              "format": "epoch_millis",
              "gte": 1572254116200,
	      "lte": 1572257716200
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "check": {
      "date_histogram": {
        "field": "logstash_stats.timestamp",
        "fixed_interval": "10m"
      },
      "aggs": {
        "pipelines_nested": {
          "nested": {
            "path": "logstash_stats.pipelines"
          },
          "aggs": {
            "by_pipeline_id": {
              "terms": {
                "field": "logstash_stats.pipelines.id",
                "size": 1000
              },
              "aggs": {
                "by_pipeline_hash": {
                  "terms": {
                    "field": "logstash_stats.pipelines.hash",
                    "size": 1000
                  },
                  "aggs": {
                    "by_ephemeral_id": {
                      "terms": {
                        "field": "logstash_stats.pipelines.ephemeral_id",
                        "size": 1000
                      },
                      "aggs": {
                        "events_stats": {
                          "stats": {
                            "field": "logstash_stats.pipelines.events.out"
                          }
                        },
                        "throughput": {
                          "bucket_script": {
                            "script": "params.max - params.min",
                            "buckets_path": {
                              "min": "events_stats.min",
                              "max": "events_stats.max"
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
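
For reference, here is a minimal sketch (not part of the original comment) of how the response to the query above could be checked for duplicates, written in TypeScript and assuming the response has been saved locally as response.json; the property names simply mirror the aggregation names in the query:

// Walk the date-histogram buckets and list pipeline ids that resolve to more
// than one ephemeral_id, which is what shows up as duplicates in the UI.
import * as fs from "fs";

const response = JSON.parse(fs.readFileSync("response.json", "utf8"));
const ephemeralIdsByPipeline = new Map<string, Set<string>>();

for (const dateBucket of response.aggregations.check.buckets ?? []) {
  for (const pipeline of dateBucket.pipelines_nested.by_pipeline_id.buckets ?? []) {
    const ids = ephemeralIdsByPipeline.get(pipeline.key) ?? new Set<string>();
    for (const hashBucket of pipeline.by_pipeline_hash.buckets ?? []) {
      for (const ephemeralBucket of hashBucket.by_ephemeral_id.buckets ?? []) {
        ids.add(ephemeralBucket.key);
      }
    }
    ephemeralIdsByPipeline.set(pipeline.key, ids);
  }
}

for (const [id, ephemeralIds] of ephemeralIdsByPipeline) {
  if (ephemeralIds.size > 1) {
    console.log(`${id}: ${ephemeralIds.size} ephemeral_ids`);
  }
}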

I'm also wondering if it was a fluke from upgrading. Have you tried deleting your indices via:
DELETE .monitoring-logstash-*
(Warning: this will delete your Logstash monitoring data. The index will be re-created once data is written to it again.)

PS: I think this can also happen if you run the same Logstash again, but with a different data path via bin/logstash -E path.data=...

@magnuslarsen (Author) commented Oct 29, 2019

Test environment
Going to Management -> Pipelines, I see one of each pipeline (as expected).
Going to Monitoring -> Nodes (under Logstash, that is) -> Pipelines, I also see one of each pipeline, as expected (for all nodes).
Going to Monitoring -> Pipelines (under Logstash), I get "No items found" :(
(screenshot attached)
Going back a couple of hours, the pipelines come back, seemingly without data though (same pipelines as yesterday):
(screenshot attached)

Running the query you provided yielded one of each (as expected too)

Prod environment
Going to Management -> Pipelines, I see one of each pipeline (as expected).
Going to Monitoring -> Nodes (under Logstash, that is) -> Pipelines, I also see one of each pipeline, as expected (for all nodes).
Going to Monitoring -> Pipelines (under Logstash), I get 2 of each pipeline... one fewer than yesterday. So it seems random?

Running the query you provided yielded one of each (as expected too).


I am using a CI/CD pipeline to push Pipeline config via the Kibana Centralized Pipeline Management API.

Also, running a search and clearing it under Monitoring -> Nodes (Logstash) -> Pipelines yields no duplicates. So either the bug is not present in that pipeline view, or it appears somewhere else.

As for the Logstash data path, it has always been static at /var/lib/logstash.

I am happy to delete the .monitoring-logstash-* indices if you don't need more debug information from me/them 👍

@chrisronline
Contributor

Hi @magnuslarsen

Thanks again for opening this issue! We did recently make changes here and this feedback is excellent and very helpful!

I'm not sure if you read through the PR you linked (great find, btw!), but we've modified the way we query for the list of pipelines to avoid performance issues when loading that page with many, many pipelines.

Can you run the following query and report back on the results?

POST .monitoring-logstash-*/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "logstash_stats.timestamp": {
              "gte": "now-1h"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "nested_context": {
      "nested": {
        "path": "logstash_stats.pipelines"
      },
      "aggs": {
        "composite_data": {
          "composite": {
            "size": 10000,
            "sources": [
              {
                "id": {
                  "terms": {
                    "field": "logstash_stats.pipelines.id"
                  }
                }
              },
              {
                "hash": {
                  "terms": {
                    "field": "logstash_stats.pipelines.hash"
                  }
                }
              },
              {
                "ephemeral_id": {
                  "terms": {
                    "field": "logstash_stats.pipelines.ephemeral_id"
                  }
                }
              }
            ]
          }
        }
      }
    },
    "clusters": {
      "terms": {
        "field": "cluster_uuid",
        "size": 10
      }
    }
  }
}
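
As an aside (not part of the original comment), here is a minimal TypeScript sketch of how the composite buckets from this query could be grouped by pipeline id to spot duplicates, assuming the response has been saved locally as composite.json:

// Group the flat composite buckets by pipeline id and report ids that appear
// with more than one (hash, ephemeral_id) combination.
import * as fs from "fs";

interface CompositeBucket {
  key: { id: string; hash: string; ephemeral_id: string };
  doc_count: number;
}

const response = JSON.parse(fs.readFileSync("composite.json", "utf8"));
const buckets: CompositeBucket[] =
  response.aggregations.nested_context.composite_data.buckets ?? [];

const byId = new Map<string, CompositeBucket[]>();
for (const bucket of buckets) {
  const entries = byId.get(bucket.key.id) ?? [];
  entries.push(bucket);
  byId.set(bucket.key.id, entries);
}

for (const [id, entries] of byId) {
  if (entries.length > 1) {
    console.log(id, entries.map((e) => `${e.key.ephemeral_id} (${e.doc_count} docs)`));
  }
}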

@magnuslarsen (Author) commented Nov 1, 2019

FYI: I will not update to the new 7.4.2 version until I get your clearance.

Running your query (as is) on my test environment yields two buckets for each pipeline (looking at hash or id); ephemeral_id is unique per bucket.
The strange thing, however, is that each pair of buckets has doc_count = 349 and 351 (no difference between the pairs).
Full output can be found here: https://pastebin.com/r7RuXsxJ

My production environment looks identical, except that each pipeline pair (yes, two of each there too) has doc_count = 343 and 343.


Looking at the Monitoring -> Pipelines page in Kibana:
Test environment
It looks like something happened on the main overview page; it stopped working some 3 days ago (when I look back at the earlier picture)?
(screenshot attached)

Production environment
2 of each pipeline, as per "usual".

In both environments, Monitoring -> Nodes (Logstash) -> Pipelines is still correct (one of each pipeline, and data is shown with the time range set to Last 1 hour):
(screenshot attached)


Worth noting again: both my test and production environments are identical, aside from naming, IP addresses, VLANs and such, and the actual number of pipelines (test: 20, prod: 36).
Everything is provisioned via Puppet (mostly custom modules).

@chrisronline
Contributor

@magnuslarsen Awesome, thanks so much for the data!

It looks like this is a legit bug! We're not properly de-duping when the ephemeral_id changes. This is consistent with what you're seeing, because a given time range may or may not contain an ephemeral_id change.

For now, there is not much you can do, but I'll get a PR up for this soon!
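
For illustration only (the actual fix landed in #49978; this is not that change), a rough TypeScript sketch of the de-duplication idea: collapse buckets that share the same pipeline id, so that a restart, which produces a new ephemeral_id, no longer shows up as a second row:

interface PipelineBucket {
  id: string;
  hash: string;
  ephemeral_id: string;
  doc_count: number;
}

// Keep a single bucket per pipeline id; prefer the ephemeral_id that covers
// most of the selected time range (i.e. the bucket with the most documents).
function dedupePipelines(buckets: PipelineBucket[]): PipelineBucket[] {
  const bestById = new Map<string, PipelineBucket>();
  for (const bucket of buckets) {
    const existing = bestById.get(bucket.id);
    if (!existing || bucket.doc_count > existing.doc_count) {
      bestById.set(bucket.id, bucket);
    }
  }
  return [...bestById.values()];
}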
