
[Monitoring] Duplicated pipelines on the Pipelines overview page #49462

Closed
magnuslarsen opened this issue Oct 28, 2019 · 6 comments · Fixed by #49978
Labels
bug (Fixes for quality problems that affect the customer experience), Feature:Logstash Pipelines (Logstash Pipeline UI related), Team:Monitoring (Stack Monitoring team), triage_needed

Comments

@magnuslarsen

Kibana version:
7.4.1
Elasticsearch version:
7.4.1
Server OS version:
18.04.3 LTS
Browser version:
Firefox 70, Chromium Version 78.0.3904.70 (Official Build) snap (64-bit)
Browser OS version:
Kubuntu 19.10
Original install method (e.g. download page, yum, from source, etc.):
DEB packages
Describe the bug:
On the Monitoring page, under Pipelines, ALL pipelines are duplicated n times. In my test environment I see each pipeline duplicated 4 times, and in my production environment I see 3 copies of each pipeline.

Steps to reproduce:
I am actually not entirely sure...

In my test environment I searched for a pipeline, and after clearing the search I saw duplicates.
In my production environment I simply upgraded the Elastic Stack (from 7.4.0), went to the page, and discovered duplicates.

Expected behavior:
Only see one of each pipeline.

Screenshots (if relevant):
(screenshot attached)

Errors in browser console (if relevant):
No errors in the console; however, looking at the pipelines request, the response contains the duplicated results:
Request:
(screenshot attached)
Response:
(screenshot attached)

Provide logs and/or server output (if relevant):
No relevant entries in the Kibana log.
Any additional context:
Both environments worked in 7.4.0 and were upgraded from that version to 7.4.1. The duplication started only after the 7.4.1 upgrade.

Without knowing for sure, this change may have introduced it: #47154

If I can help with anything, don't hesitate to message me :)

@markov00 added the bug, Feature:Logstash Pipelines, Team:Monitoring, and triage_needed labels on Oct 28, 2019
@elasticmachine
Contributor

Pinging @elastic/stack-monitoring (Team:Monitoring)

@igoristic (Contributor) commented Oct 29, 2019

@magnuslarsen

I can't reproduce this issue for some reason.

Can you please test and see if you have the same behavior in Management > Pipelines, e.g.: http://localhost:5601/app/kibana#/management/logstash/pipelines

Also, see if you get any duplicates by running this query in Dev Tools (I have already pre-filled it with the time range from your description above):

GET .monitoring-logstash-*/_search
{
  "size": 0,
  "_source": ["aggregations.check.buckets.pipelines_nested.by_pipeline_id.buckets"],
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "logstash_stats.timestamp": {
              "format": "epoch_millis",
              "gte": 1572254116200,
	      "lte": 1572257716200
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "check": {
      "date_histogram": {
        "field": "logstash_stats.timestamp",
        "fixed_interval": "10m"
      },
      "aggs": {
        "pipelines_nested": {
          "nested": {
            "path": "logstash_stats.pipelines"
          },
          "aggs": {
            "by_pipeline_id": {
              "terms": {
                "field": "logstash_stats.pipelines.id",
                "size": 1000
              },
              "aggs": {
                "by_pipeline_hash": {
                  "terms": {
                    "field": "logstash_stats.pipelines.hash",
                    "size": 1000
                  },
                  "aggs": {
                    "by_ephemeral_id": {
                      "terms": {
                        "field": "logstash_stats.pipelines.ephemeral_id",
                        "size": 1000
                      },
                      "aggs": {
                        "events_stats": {
                          "stats": {
                            "field": "logstash_stats.pipelines.events.out"
                          }
                        },
                        "throughput": {
                          "bucket_script": {
                            "script": "params.max - params.min",
                            "buckets_path": {
                              "min": "events_stats.min",
                              "max": "events_stats.max"
                            }
                          }
                        }
                      }
                    }
                  }
                }
              }
            }
          }
        }
      }
    }
  }
}
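
For reference, here is a minimal sketch (not part of the original comment) of how the response to the query above could be checked for duplicates, written in TypeScript and assuming the response has been saved locally as response.json; the property names simply mirror the aggregation names in the query:

// Walk the date-histogram buckets and list pipeline ids that resolve to more
// than one ephemeral_id, which is what shows up as duplicates in the UI.
import * as fs from "fs";

const response = JSON.parse(fs.readFileSync("response.json", "utf8"));
const ephemeralIdsByPipeline = new Map<string, Set<string>>();

for (const dateBucket of response.aggregations.check.buckets ?? []) {
  for (const pipeline of dateBucket.pipelines_nested.by_pipeline_id.buckets ?? []) {
    const ids = ephemeralIdsByPipeline.get(pipeline.key) ?? new Set<string>();
    for (const hashBucket of pipeline.by_pipeline_hash.buckets ?? []) {
      for (const ephemeralBucket of hashBucket.by_ephemeral_id.buckets ?? []) {
        ids.add(ephemeralBucket.key);
      }
    }
    ephemeralIdsByPipeline.set(pipeline.key, ids);
  }
}

for (const [id, ephemeralIds] of ephemeralIdsByPipeline) {
  if (ephemeralIds.size > 1) {
    console.log(`${id}: ${ephemeralIds.size} ephemeral_ids`);
  }
}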

I'm also wondering if it was a fluke from upgrading. Have you tried deleting your indices via:
DELETE .monitoring-logstash-*
(Warning: this will delete your Logstash monitoring data. The index will be re-created once data is written to it again.)

PS: I think this can also happen if you run the same Logstash again, but with a different data path via bin/logstash -E path.data=...

@magnuslarsen (Author) commented Oct 29, 2019

Test environment
Going to Management -> Pipelines, I see one of each pipeline (as expected).
Going to Monitoring -> Nodes (under Logstash, that is) -> Pipelines, I also see one of each pipeline, as expected (for all nodes).
Going to Monitoring -> Pipelines (under Logstash), I get "No items found" :(
(screenshot attached)
Going back a couple of hours, the pipelines come back, seemingly without data though (same pipelines as yesterday):
(screenshot attached)

Running the query you provided yielded one of each (as expected too)

Prod environment
Going to Management -> Pipelines, I see one of each pipeline (as expected).
Going to Monitoring -> Nodes (under Logstash, that is) -> Pipelines, I also see one of each pipeline, as expected (for all nodes).
Going to Monitoring -> Pipelines (under Logstash), I get 2 of each pipeline... one fewer than yesterday. So it seems random?

Running the query you provided yielded one of each (as expected too).


I am using a CI/CD pipeline to push Pipeline config via the Kibana Centralized Pipeline Management API.

Also, running a search and clearing it under Monitoring -> Nodes (Logstash) -> Pipelines yields no duplicates. So either the bug is not present in that pipeline view, or it appears somewhere else.

As for the Logstash data path, it has always been static at /var/lib/logstash.

I am happy to delete the .monitoring-logstash-* indices if you don't need more debug information from me/them 👍

@chrisronline
Contributor

Hi @magnuslarsen

Thanks again for opening this issue! We did recently make changes here and this feedback is excellent and very helpful!

I'm not sure if you read through the PR you linked (great find, btw!), but we've modified the way we query for the list of pipelines to avoid performance issues when loading that page with many, many pipelines.

Can you run the following query and report back on the results?

POST .monitoring-logstash-*/_search
{
  "size": 0,
  "query": {
    "bool": {
      "filter": [
        {
          "range": {
            "logstash_stats.timestamp": {
              "gte": "now-1h"
            }
          }
        }
      ]
    }
  },
  "aggs": {
    "nested_context": {
      "nested": {
        "path": "logstash_stats.pipelines"
      },
      "aggs": {
        "composite_data": {
          "composite": {
            "size": 10000,
            "sources": [
              {
                "id": {
                  "terms": {
                    "field": "logstash_stats.pipelines.id"
                  }
                }
              },
              {
                "hash": {
                  "terms": {
                    "field": "logstash_stats.pipelines.hash"
                  }
                }
              },
              {
                "ephemeral_id": {
                  "terms": {
                    "field": "logstash_stats.pipelines.ephemeral_id"
                  }
                }
              }
            ]
          }
        }
      }
    },
    "clusters": {
      "terms": {
        "field": "cluster_uuid",
        "size": 10
      }
    }
  }
}
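
As an aside (not part of the original comment), here is a minimal TypeScript sketch of how the composite buckets from this query could be grouped by pipeline id to spot duplicates, assuming the response has been saved locally as composite.json:

// Group the flat composite buckets by pipeline id and report ids that appear
// with more than one (hash, ephemeral_id) combination.
import * as fs from "fs";

interface CompositeBucket {
  key: { id: string; hash: string; ephemeral_id: string };
  doc_count: number;
}

const response = JSON.parse(fs.readFileSync("composite.json", "utf8"));
const buckets: CompositeBucket[] =
  response.aggregations.nested_context.composite_data.buckets ?? [];

const byId = new Map<string, CompositeBucket[]>();
for (const bucket of buckets) {
  const entries = byId.get(bucket.key.id) ?? [];
  entries.push(bucket);
  byId.set(bucket.key.id, entries);
}

for (const [id, entries] of byId) {
  if (entries.length > 1) {
    console.log(id, entries.map((e) => `${e.key.ephemeral_id} (${e.doc_count} docs)`));
  }
}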

@magnuslarsen (Author) commented Nov 1, 2019

FYI: I will not update to the new 7.4.2 version until I get your clearance.

Running your query (as is) on my test environment yields two buckets for each pipeline (looking at hash or id); ephemeral_id is unique per bucket.
The strange thing, however, is that each pair of buckets has doc_count = 349 and 351 (no difference between the pairs).
Full output can be found here: https://pastebin.com/r7RuXsxJ

My production environment looks identical, except that each pipeline pair (yes, two of each there too) has doc_count = 343 and 343.


Looking at the Monitoring -> Pipelines page in Kibana:
Test environment
It looks like something happened on the main overview page; it stopped working some 3 days ago (when I look back at the earlier picture)?
(screenshot attached)

Production environment
2 of each pipeline, as per "usual".

In both environments, Monitoring -> Nodes (Logstash) -> Pipelines is still correct (one of each pipeline, and data is shown with the time range set to Last 1 hour):
(screenshot attached)


Worth noting again: both my test and production environments are identical, aside from naming, IP addresses, VLANs and such, and the actual number of pipelines (test: 20, prod: 36).
Everything is provisioned via Puppet (mostly custom modules).

@chrisronline
Contributor

@magnuslarsen Awesome, thanks so much for the data!

It looks like this is a legit bug! We're not properly de-duping when the ephemeral_id changes. This is consistent with what you're seeing, because a given time range may or may not contain an ephemeral_id change.

For now, there is not much you can do, but I'll get a PR up for this soon!
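
For illustration only (the actual fix landed in #49978; this is not that change), a rough TypeScript sketch of the de-duplication idea: collapse buckets that share the same pipeline id, so that a restart, which produces a new ephemeral_id, no longer shows up as a second row:

interface PipelineBucket {
  id: string;
  hash: string;
  ephemeral_id: string;
  doc_count: number;
}

// Keep a single bucket per pipeline id; prefer the ephemeral_id that covers
// most of the selected time range (i.e. the bucket with the most documents).
function dedupePipelines(buckets: PipelineBucket[]): PipelineBucket[] {
  const bestById = new Map<string, PipelineBucket>();
  for (const bucket of buckets) {
    const existing = bestById.get(bucket.id);
    if (!existing || bucket.doc_count > existing.doc_count) {
      bestById.set(bucket.id, bucket);
    }
  }
  return [...bestById.values()];
}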
