Decode JSON body when Prometheus returns 400 to an api call #414

bboreham · 2018-05-31T14:20:04Z

400 and 422 are documented error codes from Prometheus, so we should attempt to parse the error returned for both of them. Discussed briefly on IRC last night.

Needed to change a test that was requiring the old behaviour - made it use 500 instead of 400.
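
For readers following along, here is a minimal sketch of the behaviour described above. It is not the code in this PR: `decodeError` is a hypothetical helper, and every `apiResponse` field other than `Error` is assumed from the documented Prometheus JSON envelope. The body is decoded only for 400 and 422; any other non-2xx status (such as the 500 now used in the test) is reported without touching the body.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
)

// apiResponse holds the parts of the JSON envelope relevant to errors.
// Only the Error field is visible in this PR's diff; the rest is assumed.
type apiResponse struct {
	Status    string          `json:"status"`
	Data      json.RawMessage `json:"data"`
	ErrorType string          `json:"errorType"`
	Error     string          `json:"error"`
}

// apiError reports whether code is one of the two statuses for which
// the body is expected to carry a JSON-encoded error (400 and 422).
func apiError(code int) bool {
	return code == http.StatusBadRequest || code == http.StatusUnprocessableEntity
}

// decodeError is a hypothetical helper: it turns a status code and body
// into an error, decoding the body only when apiError(code) is true.
func decodeError(code int, body []byte) error {
	if code/100 == 2 {
		return nil // success, nothing to report
	}
	if !apiError(code) {
		// Not a documented API error code: do not try to parse the body.
		return fmt.Errorf("server returned HTTP status %d", code)
	}
	var r apiResponse
	if err := json.Unmarshal(body, &r); err != nil {
		return fmt.Errorf("server returned HTTP status %d with undecodable body: %v", code, err)
	}
	return fmt.Errorf("%s: %s", r.ErrorType, r.Error)
}

func main() {
	body := []byte(`{"status":"error","errorType":"bad_data","error":"parse error"}`)
	fmt.Println(decodeError(400, body)) // bad_data: parse error
	fmt.Println(decodeError(500, body)) // server returned HTTP status 500
}
```

With this shape, a caller sees the server's own error text for a malformed query instead of a bare status code.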

@beorn7 as maintainer

beorn7 · 2018-06-02T16:06:03Z

Thanks @bboreham. My maintainer role for the api part of this repo is more a formality mishap than a real thing. It would be helpful to get the people who were part of the discussion to chime in here. (@brian-brazil IIRC.)

In general, this makes sense to me. HTTP, as I understand it, may always give you something useful in the body. Thus, I'm a fan of trying to decode the body by default and skipping that only when it is certain (part of the API contract or something) that the body can or should be ignored.

bboreham · 2018-06-02T16:10:00Z

FWIW the contributor instructions say to @-mention the maintainer.

brian-brazil · 2018-06-02T16:20:13Z

I think this makes sense; we should pass back error information when we can.

beorn7 · 2018-06-03T22:45:17Z

> FWIW the contributor instructions say to @-mention the maintainer.

@bboreham you did nothing wrong. What's wrong is the situation. The api subdir was dropped here a while ago by important Prometheus developer A as an experiment and without any intentions of maintaining it themself.

When we formalized maintainership a bit more, I wanted to add a note that the api subdir is an unmaintained experiment. But then important Prometheus developer B said that would appear unprofessional or something…

I will bring this up with the team again.

beorn7

Looks like we can take this as is.

Could you just satisfy my OCD and fix that nit I raised?

beorn7 · 2018-06-03T22:47:38Z

api/prometheus/v1/api.go

@@ -455,6 +455,11 @@ type apiResponse struct {
 	Error     string          `json:"error"`
 }

+func apiError(code int) bool {
+	// These are the codes that Prometheus sends when it returns an error


Nit: End sentence with a period.

400 and 422 are documented error codes from Prometheus, so we should attempt to parse the error returned for both of them.

Needed to change a test that was requiring the old behaviour - made it use 500 instead of 400.

Signed-off-by: Bryan Boreham <[email protected]>

bboreham · 2018-06-04T10:36:53Z

Nit fixed.

beorn7 · 2018-06-04T13:00:06Z

Thanks again.

bboreham force-pushed the decode-400s branch from 5ae24c4 to 33b86be Compare May 31, 2018 14:22

beorn7 approved these changes Jun 3, 2018

bboreham added 2 commits June 4, 2018 10:35

Add non-nil Data because Go 1.7 needs it

04c0326

Signed-off-by: Bryan Boreham <[email protected]>

bboreham force-pushed the decode-400s branch from 2801796 to 04c0326 Compare June 4, 2018 10:35

beorn7 merged commit 7540c07 into prometheus:master Jun 4, 2018

si74 mentioned this pull request Jun 18, 2018

Update vendoring of Prometheus Go client prometheus/prometheus#4283

Merged

si74 pushed a commit to si74/prometheus that referenced this pull request Jun 18, 2018

Update vendoring of Prometheus Go client

70ebf3e

This is to pickup changes from prometheus/client_golang#414. It leads to better error output in promtool. Signed-off-by: Sneha Inguva <[email protected]>
