Fix NPE when indexing a document that just has been deleted in a tsdb index #96461

Merged
merged 6 commits into from
Jun 1, 2023

Conversation

martijnvg
Member

@martijnvg martijnvg commented May 31, 2023

Sometimes a segment only contains tombstone documents. In that case, loading the min and max @timestamp field values can result in an NPE, because these documents don't have a @timestamp field.

This change fixes that by checking for the existence of the @timestamp field in the segment's field infos.
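
The guard described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the code from this pull request: `Segment`, `has_field`, `load_timestamp_range`, and the `_tombstone` field name are hypothetical stand-ins for the real segment and field-info structures, and the timestamp values are arbitrary.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple

TIMESTAMP_FIELD = "@timestamp"


@dataclass
class Segment:
    """Hypothetical stand-in for a segment; not a real Lucene or Elasticsearch API."""

    # Maps a field name to the (min, max) of the values indexed for it.
    field_ranges: Dict[str, Tuple[int, int]] = field(default_factory=dict)

    def has_field(self, name: str) -> bool:
        # Plays the role of looking the field up in the segment's field infos.
        return name in self.field_ranges


def load_timestamp_range(segment: Segment) -> Optional[Tuple[int, int]]:
    """Return (min, max) of @timestamp, or None when the segment lacks the field."""
    if not segment.has_field(TIMESTAMP_FIELD):
        # Tombstone-only segment: skip loading instead of dereferencing a missing value.
        return None
    return segment.field_ranges[TIMESTAMP_FIELD]


# Arbitrary illustrative values.
live = Segment({TIMESTAMP_FIELD: (1000, 2000)})
tombstones_only = Segment({"_tombstone": (1, 1)})

print(load_timestamp_range(live))
print(load_timestamp_range(tombstones_only))
```

Without the `has_field` check, the lookup on the tombstone-only segment would fail, which is the toy equivalent of the NPE reported here.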

Reproduction of the issue:

Create a tsdb index:

PUT weather_sensors
{
    "mappings": {
        "properties": {
            "@timestamp": {
                "type": "date"
            },
            "humidity": {
                "type": "half_float",
                "time_series_metric": "gauge"
            },
            "location": {
                "type": "keyword",
                "time_series_dimension": true
            },
            "sensor_id": {
                "type": "keyword",
                "time_series_dimension": true
            },
            "temperature": {
                "type": "half_float",
                "time_series_metric": "gauge"
            }
        }
    },
    "settings": {
        "index": {
            "time_series": {
                "start_time": "2000-01-01T00:00:00.000Z",
                "end_time": "2099-12-31T23:59:59.999Z"
            },
            "routing_path": [
                "sensor_id",
                "location"
            ],
            "mode": "time_series",
            "codec": "best_compression",
            "number_of_shards": "1"
        }
    }
}

Index a document:

POST weather_sensors/_doc
{
  "@timestamp": "2023-05-31T08:41:15.000Z",
  "sensor_id": "SYKENET-000001",
  "location": "swamp",
  "temperature": 32.4,
  "humidity": 88.9
}

Delete that document:

DELETE weather_sensors/_doc/crxuhC8WO3aVdhvtAAABiHD35_g

Index the same document again:

POST weather_sensors/_doc
{
  "@timestamp": "2023-05-31T08:31:15.000Z",
  "sensor_id": "SYKENET-000001",
  "location": "swamp",
  "temperature": 42.4,
  "humidity": 88.9
}

The last API call results in the following error:

{
    "error": {
        "root_cause": [
            {
                "type": "null_pointer_exception",
                "reason": "Cannot invoke \"org.apache.lucene.index.CompositeReaderContext.reader()\" because \"org.apache.lucene.index.LeafReader.getContext().parent\" is null"
            }
        ],
        "type": "null_pointer_exception",
        "reason": "Cannot invoke \"org.apache.lucene.index.CompositeReaderContext.reader()\" because \"org.apache.lucene.index.LeafReader.getContext().parent\" is null"
    },
    "status": 500
}
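
The reproduction steps after index creation can be written down as a request sequence. This is a rough sketch that only builds and prints the requests and sends nothing; `reproduction_steps` and its `first_doc_id` parameter are hypothetical names. It assumes the `_id` used in the DELETE is taken from the response to the first index request.

```python
import json
from typing import List, Optional, Tuple

INDEX = "weather_sensors"

Request = Tuple[str, str, Optional[dict]]


def reproduction_steps(first_doc_id: str) -> List[Request]:
    """Build (method, path, body) for the index, delete, index-again sequence."""
    first = {
        "@timestamp": "2023-05-31T08:41:15.000Z",
        "sensor_id": "SYKENET-000001",
        "location": "swamp",
        "temperature": 32.4,
        "humidity": 88.9,
    }
    # Same dimensions (sensor_id, location); timestamp and temperature as in the steps above.
    second = dict(first)
    second["@timestamp"] = "2023-05-31T08:31:15.000Z"
    second["temperature"] = 42.4
    return [
        ("POST", f"/{INDEX}/_doc", first),
        ("DELETE", f"/{INDEX}/_doc/{first_doc_id}", None),
        ("POST", f"/{INDEX}/_doc", second),
    ]


for method, path, body in reproduction_steps("crxuhC8WO3aVdhvtAAABiHD35_g"):
    print(method, path, json.dumps(body) if body is not None else "")
```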

…ionAndSeqNoLookup.

Sometimes segments only contain delete tombstone documents. In that case, loading the min and max `@timestamp` field values can result in an NPE, because these documents don't have a `@timestamp` field.

The change fixes that by checking whether numDocs is not zero. These tombstone documents are marked as deleted, so checking whether there are live documents fixes this bug.
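
The live-document check described in this commit message can be sketched as follows. This is a toy model, not the committed code: `SegmentStats` and `should_load_timestamp_range` are hypothetical names, and `max_doc` is included only to contrast with `num_docs`.

```python
from dataclasses import dataclass


@dataclass
class SegmentStats:
    """Hypothetical per-segment counters; not a real Lucene or Elasticsearch API."""

    max_doc: int   # all documents in the segment, including deleted ones
    num_docs: int  # live (non-deleted) documents only


def should_load_timestamp_range(stats: SegmentStats) -> bool:
    # Tombstones are marked as deleted, so a tombstone-only segment has no live documents.
    return stats.num_docs != 0


tombstones_only = SegmentStats(max_doc=1, num_docs=0)
mixed = SegmentStats(max_doc=3, num_docs=2)

print(should_load_timestamp_range(tombstones_only))
print(should_load_timestamp_range(mixed))
```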
@elasticsearchmachine
Collaborator

Hi @martijnvg, I've created a changelog YAML for you.

@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label May 31, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

Sometimes segments only contain delete tombstone documents. In that case, loading the min and max `@timestamp` field values can result in an NPE, because these documents don't have a `@timestamp` field.

This change fixes that by checking for the existence of the `@timestamp` field in the segment's field infos.
@martijnvg martijnvg added v8.8.1 auto-backport Automatically create backport pull requests when merged labels Jun 1, 2023
@elasticsearchmachine
Collaborator

💚 Backport successful

Status Branch Result
8.8

martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Jun 1, 2023
… index (elastic#96461)
martijnvg added a commit that referenced this pull request Jun 1, 2023
…a tsdb index (#96476)

Backporting #96461 to 8.8 branch.
Labels
auto-backport Automatically create backport pull requests when merged >bug :StorageEngine/TSDB You know, for Metrics Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.8.1 v8.9.0

3 participants