Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[exporter/elasticsearch] Fix dynamic mapping for double values storing integers #34814

Merged

Conversation

carsonip
Copy link
Contributor

@carsonip carsonip commented Aug 22, 2024

Description:

Use json.Visitor SetExplicitRadixPoint option to ensure that double values are encoded like double values, such that ES determines the correct dynamic mapping for the field.

Link to tracking Issue:

Fixes #34680

Testing:

Updated tests

Documentation:

@carsonip carsonip marked this pull request as ready for review August 22, 2024 14:36
@carsonip carsonip requested review from a team and TylerHelmuth August 22, 2024 14:36
@carsonip carsonip force-pushed the fix-double-value-encoding branch from 047bd30 to c24f64b Compare August 22, 2024 15:44
@github-actions github-actions bot added the cmd/otelcontribcol otelcontribcol command label Aug 22, 2024
Copy link
Member

@andrzej-stencel andrzej-stencel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm worried that this may potenially be a breaking change. Consider the following scenario:

A user uses the Elasticsearch exporter to export a metric that is defined as a double in OTLP, but that contains only integer values, into an Elasticsearch index that has the field mapped as an integer or long. After this change, the user will get an error from Elasticsearch, as the ES exporter will start sending values with the decimal point to Elasticsearch.

I think that at the very least, we should mark this as a breaking change, describing which users may be affected by this.

If we agree this is a breaking change, we need to also phase it in gradually, probably via a feature gate.

@carsonip
Copy link
Contributor Author

Thanks for reviewing and thinking about the implication of this change. I understand your concern, but it should not be an issue.

but that contains only integer values, into an Elasticsearch index that has the field mapped as an integer or long. After this change, the user will get an error from Elasticsearch, as the ES exporter will start sending values with the decimal point to Elasticsearch.

I experimented with this. Sending double values to a field mapped as long will not result in any failures. They will be stored as a double in _source, while it will be truncated to return a long in fields API. Therefore, I'd say that this is addressed by ES and this change would not be a breaking change due to this.

{
    "took": 2,
    "timed_out": false,
    "_shards": {
        "total": 1,
        "successful": 1,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": {
            "value": 3,
            "relation": "eq"
        },
        "max_score": 1.0,
        "hits": [
            {
                "_index": ".ds-logs-foo-default-2024.08.30-000001",
                "_id": "2YRyo5EBmDIc-wF_g7hx",
                "_score": 1.0,
                "_source": {
                    "foo": 4.5,
                    "@timestamp": "2024-08-30T13:21:16.657225033Z"
                },
                "fields": {
                    "foo": [
                        4
                    ]
                }
            },
            {
                "_index": ".ds-logs-foo-default-2024.08.30-000001",
                "_id": "2IRxo5EBmDIc-wF_07hR",
                "_score": 1.0,
                "_source": {
                    "foo": 4.0,
                    "@timestamp": "2024-08-30T13:20:31.569660078Z"
                },
                "fields": {
                    "foo": [
                        4
                    ]
                }
            },
            {
                "_index": ".ds-logs-foo-default-2024.08.30-000001",
                "_id": "1IRxo5EBmDIc-wF_dLgs",
                "_score": 1.0,
                "_source": {
                    "foo": 4,
                    "@timestamp": "2024-08-30T13:20:06.687976753Z"
                },
                "fields": {
                    "foo": [
                        4
                    ]
                }
            }
        ]
    }
}

@felixbarny @axw are there any cases I missed that would make this change a breaking change?

@felixbarny
Copy link
Contributor

I'd also expect ES to coerce the floats to integers/longs, therefore this shouldn't be a breaking change. Thanks @carsonip for testing that coercion works as expected. Also good catch @andrzej-stencel on the potential breaking change here.

@andrzej-stencel
Copy link
Member

Oh in that case all good, thanks a lot @carsonip for verifying this.

Can you fix the CI checks and we should be good to go.

Generated code is out of date, please run "make genotelcontribcol" and commit the changes in this PR.

@carsonip
Copy link
Contributor Author

carsonip commented Sep 5, 2024

Can you fix the CI checks and we should be good to go.

I think the previous main was broken. I've rebased (merge main) and let's see if it works

@andrzej-stencel andrzej-stencel merged commit cb71224 into open-telemetry:main Sep 5, 2024
156 checks passed
@github-actions github-actions bot added this to the next release milestone Sep 5, 2024
f7o pushed a commit to f7o/opentelemetry-collector-contrib that referenced this pull request Sep 12, 2024
…g integers (open-telemetry#34814)

**Description:** <Describe what has changed.>
<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->

Use json.Visitor SetExplicitRadixPoint option to ensure that double
values are encoded like double values, such that ES determines the
correct dynamic mapping for the field.

**Link to tracking Issue:** <Issue number if applicable>

Fixes open-telemetry#34680

**Testing:** <Describe what testing was performed and which tests were
added.>

Updated tests

**Documentation:** <Describe the documentation added.>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[exporter/elasticsearch] Double values encoded without decimal point causes wrong dynamic mapping
4 participants