500 error on writing data out of policy range #20359

Closed
ssoroka opened this issue Dec 16, 2020 · 11 comments · Fixed by #20442
Labels
area/HTTP area/2.x OSS 2.0 related issues and PRs kind/bug

Comments

@ssoroka

ssoroka commented Dec 16, 2020

Writing data outside of policy range should result in a 4xx error, so that the client knows not to retry sending the data. Clients are expected to treat 5xx errors as transient and re-transmit the same data to avoid data loss.
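
To make the retry expectation concrete, here is a minimal sketch of how a client might act on the write response. `write_action` is a hypothetical helper written for illustration, not Telegraf's actual logic; it only encodes the 4xx versus 5xx rule described above.

```shell
# Decide what a client should do with a write response status code.
# Assumption: any 5xx is treated as transient, any 4xx as permanent.
write_action() {
  case "$1" in
    2??) echo "ok" ;;       # write accepted
    4??) echo "drop" ;;     # client error: resending the same data will not help
    5??) echo "retry" ;;    # server error: keep the data and send it again
    *)   echo "unknown" ;;
  esac
}

for code in 204 422 500; do
  echo "$code -> $(write_action "$code")"
done
```

With a 500 for out-of-range points, a client following this rule keeps resending data that can never be accepted.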

This issue was raised by a customer here influxdata/telegraf#8571

Steps to reproduce:

I've configured a bucket with a retention policy of 2 weeks. When I write some measurements that are older than those two weeks, InfluxDB responds with a 500 error.

  1. Set up InfluxDB v2.
  2. Create a bucket with a retention policy of N weeks.
  3. Send some data to this bucket that is older than N weeks, e.g. via curl.
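
The steps above can be sketched roughly as follows. The URL, org, and bucket values are placeholders (assumptions, overridable through the environment), the `note mod=13` point is just sample data, and `stale_timestamp`, `stale_point`, and `write_stale_point` are hypothetical helper names.

```shell
# Placeholders (assumptions): adjust for your own instance.
INFLUX_URL="${INFLUX_URL:-http://localhost:8086}"
INFLUX_ORG="${INFLUX_ORG:-dev}"
INFLUX_BUCKET="${INFLUX_BUCKET:-two-weeks}"

# Print a Unix timestamp (seconds) one day older than an N-week retention period.
stale_timestamp() {
  weeks="$1"
  now="$(date +%s)"
  echo $(( now - weeks * 7 * 86400 - 86400 ))
}

# Print a line-protocol point stamped outside an N-week retention period.
stale_point() {
  echo "note mod=13 $(stale_timestamp "$1")"
}

# Send the stale point and print only the HTTP status code of the response.
# Requires INFLUX_TOKEN to be set.
write_stale_point() {
  curl -s -o /dev/null -w '%{http_code}\n' -XPOST \
    "$INFLUX_URL/api/v2/write?org=$INFLUX_ORG&bucket=$INFLUX_BUCKET&precision=s" \
    --header "Authorization: Token $INFLUX_TOKEN" \
    --data-raw "$(stale_point "$1")"
}

# Show the point that would be sent for a 2-week retention policy.
stale_point 2
```

Calling `write_stale_point 2` against a bucket with a 2-week retention policy should print the status code under discussion.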

Expected behavior:

The server responds with a client error (4xx), such as 406 Not Acceptable or 422 Unprocessable Entity.

Actual behavior:

Server responds with 500 Internal Server Error

Environment info:

  • InfluxDB version: 2.x

Logs:
see logs from influxdata/telegraf#8571

@psteinbachs
Contributor

Added @danxmoran for tracking what we'll need to add for OSS.

@StacieClark

This is the set of default errors allowed by the swagger definition enum:

  • internal error
  • not found
  • conflict
  • invalid
  • unprocessable entity
  • empty value
  • unavailable
  • forbidden
  • too many requests
  • unauthorized
  • method not allowed

422 Unprocessable Entity should be used rather than 406 Not Acceptable, as "not acceptable" is not on our list.
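
As a small illustration of that reasoning, the check below tests a candidate error against the enum values listed in this comment. `ALLOWED_CODES` and `is_allowed_code` are hypothetical names for this sketch and are not taken from the InfluxDB codebase.

```shell
# Error values as listed in this comment (the swagger enum), one per line.
ALLOWED_CODES="internal error
not found
conflict
invalid
unprocessable entity
empty value
unavailable
forbidden
too many requests
unauthorized
method not allowed"

# Succeed only if the argument exactly matches one of the listed values.
is_allowed_code() {
  printf '%s\n' "$ALLOWED_CODES" | grep -qxF -- "$1"
}

for candidate in "unprocessable entity" "not acceptable"; do
  if is_allowed_code "$candidate"; then
    echo "$candidate: allowed"
  else
    echo "$candidate: not in enum"
  fi
done
```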

@ssoroka
Author

ssoroka commented Dec 17, 2020

@StacieClark do we have any timeline for this?

@StacieClark

I'm looking at it today. I'll let you know when I know what the fix is

@StacieClark

StacieClark commented Dec 22, 2020

I configured a bucket to have a 1 hr retention policy against a cloud repo

curl -i -XPOST 'http://localhost:8080/api/v2/write?org=dev&bucket=1hr&precision=s' --header "Authorization: Token $INFLUX_TOKEN" --data-raw "note mod=13 1234567"

HTTP/1.1 204 No Content
Date: Tue, 22 Dec 2020 15:31:45 GMT
Server: Caddy
The point does not get inserted. Points with no timestamp, or with a timestamp within the last hour, do get inserted.

So this is incorrect as well.

@StacieClark

StacieClark commented Dec 22, 2020

This issue is not reproducible as a 500 error on cloud. On cloud the response is 204. While not correct, it will not cause Telegraf to go into a retry loop.

@ssoroka
Author

ssoroka commented Dec 27, 2020

Why are you getting Server: Caddy in the response? It seems like you're not getting a response from the right server.

@StacieClark

That is a local cluster. From an AWS cluster:

curl -i -XPOST 'https://us-east-1-1.aws.cloud2.influxdata.com/api/v2/[email protected]&bucket=dev&precision=s' --header "Authorization: Token $INFLUX_TOKEN" --data-raw "note mod=15 12345"

HTTP/2 204
date: Mon, 04 Jan 2021 19:29:57 GMT
strict-transport-security: max-age=15724800; includeSubDomains

@StacieClark

StacieClark commented Jan 4, 2021

Was this issue found on OSS and not cloud?

@danxmoran
Contributor

danxmoran commented Jan 4, 2021

I can confirm this is a problem on OSS master, & can prioritize working on a fix for the upcoming 2.0.4 release. I'm unclear about the desired behavior change, though: do we want OSS to match cloud and return a 204, or do its own thing and return some 4xx code? My main concern about returning a 4xx is that users might think all of their data was rejected, but in reality any of the points that fell inside the RP will have been accepted & stored.
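
To illustrate the partial-write concern, here is a rough sketch that labels each point in a batch by whether it falls inside the retention period. `partition_points` is a hypothetical helper, and it assumes the timestamp is the third space-separated field, which ignores escaped spaces and quoted strings in real line protocol.

```shell
# Label each line-protocol point read from stdin as accepted or rejected.
# $1 is the cutoff: the oldest timestamp (seconds) still inside the retention period.
partition_points() {
  awk -v cutoff="$1" '
    NF == 0              { next }                        # skip blank lines
    NF < 3               { print "accepted " $0; next }  # no timestamp given
    $3 + 0 >= cutoff + 0 { print "accepted " $0; next }  # inside the retention period
                         { print "rejected " $0 }        # older than the cutoff
  '
}

now="$(date +%s)"
cutoff=$(( now - 3600 ))   # 1 hr retention, as in the earlier test
printf '%s\n' "note mod=13 1234567" "note mod=14 $now" "note mod=15" |
  partition_points "$cutoff"
```

A single 4xx for a batch like this would not tell the client which of the points were actually stored.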

@psteinbachs
Contributor

psteinbachs commented Jan 4, 2021

@danxmoran In general, we should always be staying in sync with cloud. We can have a deeper discussion if there are aspects that aren't applicable to one or the other. In this case, we need to get it corrected in both cloud and OSS.
