Elastic/logs redis.log-default datastream doesn't use start_date / end_date track parameters #541

Open
cbuescher opened this issue Dec 11, 2023 · 1 comment

@cbuescher
Member

While doing some experiments with the elastic/logs track, I saw that the "@timestamp" fields of the documents ending up in the ".ds-logs-redis.log-default-*" datastream do not seem to be affected by the "start_date"/"end_date" or "bulk_start_date"/"bulk_end_date" track parameters. I'm wondering whether those documents are then queried correctly in the querying challenges.

Ways to reproduce locally:

I ran the "logging-querying" challenge with the following parameters:

"logging-querying-params.json":
{
   "number_of_shards": 1,
   "wait_for_status":"yellow",
   "raw_data_volume_per_day": "4GB",
   "bulk_start_date": "2020-01-01",
   "bulk_end_date": "2020-01-02"
}

I started the race like this:

esrally race --challenge="logging-querying" --track-params="logging-querying-params.json" --preserve-install --kill-running-processes --track="elastic/logs"

I inspected the data after restarting the cluster used by that race.
My expectation was that all documents would fall in the date range between 2020-01-01 and 2020-01-02, which would also be the defaults. Looking at date histograms of the "@timestamp" field, I found that mostly the "redis" datastreams mentioned above have documents in today's time range:

POST /logs*/_search?size=0
{
    
    "aggs": {
        "timestamp": {
            "date_histogram": {
                "field": "@timestamp",
                "fixed_interval": "1h",
                "min_doc_count": 1
            },
            "aggs": {
                "stream_name": {
                    "terms": {
                        "field": "_index",
                        "order": {
                            "_key": "asc"
                        }
                    }
                }
            }
        }
    }
}
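
For anyone repeating this check, the response of the aggregation above can be summarized with a few lines of Python. This is only a minimal sketch, not part of the track: the function name `indices_outside_window` is made up for illustration, it assumes the standard response layout of a `date_histogram` with a nested `terms` aggregation (bucket keys in epoch milliseconds), and it treats the expected window as half-open `[start, end)` in UTC, which is an assumption about how the date parameters are meant to be interpreted.

```python
from datetime import datetime, timezone


def indices_outside_window(response, start, end):
    """Sum doc counts per index for histogram buckets outside [start, end).

    `response` is the parsed JSON body returned by the search above.
    `start` and `end` are timezone-aware datetimes (assumed half-open window).
    """
    start_ms = int(start.timestamp() * 1000)
    end_ms = int(end.timestamp() * 1000)
    offenders = {}
    # Aggregation names "timestamp" and "stream_name" match the request above.
    for bucket in response["aggregations"]["timestamp"]["buckets"]:
        if start_ms <= bucket["key"] < end_ms:
            continue  # bucket starts inside the expected window
        for sub in bucket["stream_name"]["buckets"]:
            offenders[sub["key"]] = offenders.get(sub["key"], 0) + sub["doc_count"]
    return offenders


# Expected window taken from the track parameters used in this report.
EXPECTED_START = datetime(2020, 1, 1, tzinfo=timezone.utc)
EXPECTED_END = datetime(2020, 1, 2, tzinfo=timezone.utc)
```

Calling `indices_outside_window(response, EXPECTED_START, EXPECTED_END)` returns a dictionary mapping each backing index name to the number of documents found outside the window; an empty dictionary means every bucket started inside it. Note that the `terms` aggregation returns only the top 10 indices per bucket by default, so a larger `size` may be needed for a complete picture.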

I'm curious whether this is an error and/or a problem when querying this track's data.

@cbuescher cbuescher added the bug label Dec 11, 2023
@AI-IshanBhatt
Contributor

I was able to reproduce this and could verify it with the following query:

POST /logs-redis.log-default*/_search?size=0
{
    
    "aggs": {
        "timestamp": {
            "date_histogram": {
                "field": "@timestamp",
                "fixed_interval": "1h",
                "min_doc_count": 1
            },
            "aggs": {
                "stream_name": {
                    "terms": {
                        "field": "_index",
                        "order": {
                            "_key": "asc"
                        }
                    }
                }
            }
        }
    }
}

I ran the same query for different indices as well, and it turns out only redis.log-default has this issue. Looking further.
