BulkRequest Handling on Failure #666

Closed
sundarv85 opened this issue Dec 20, 2017 · 3 comments
sundarv85 commented Dec 20, 2017

Which version of Elastic are you using?

[x] elastic.v5 (for Elasticsearch 5.x)

v5.0.30

Please describe the actual behavior

When some bulk requests fail:

ERRO[0551] Elastic Flushing Failed                       err=elastic: Error 400 (Bad Request): Validation Failed: 1: index is missing;2: index is missing;3: index is missing;4: index is missing;5: index is missing;6: index is missing;7: index is missing;8: index is missing;9: index is missing;10: index is missing;11: index is missing;12: index is missing;13: index is missing;14: index is missing;

The retry loops forever. In this case, since the records have an incorrect index and type, the request never succeeds (of course, this should not happen in a production env).

Please describe the expected behavior

  1. Could we have a configurable maximum number of retries for bulk updates?
  2. Another question: there were 30 records in this request, and only 14 of them failed. When the retry happens, all 30 are resent. Could the 14 failures be identified and only those retried?
  3. Another interesting behaviour: I'm using the following code:
	afterCallback := func(id int64, requests []elastic.BulkableRequest, response *elastic.BulkResponse, err error) {
		if err != nil {
			log.WithFields(log.Fields{"id": id, "err": err.Error()}).Error("Elastic Flushing Failed")
			// Dump the source of every request in the failed batch.
			for k := range requests {
				src, rerr := requests[k].Source()
				if rerr == nil {
					for i := range src {
						fmt.Printf("Request[%d] Source[%d]: %s\n", k, i, src[i])
					}
				} else {
					fmt.Printf("Bulkable Request[%d] Not Returned: %s\n", k, rerr.Error())
				}
			}

			// List the individual items the bulk response reports as failed.
			if response != nil {
				failed := response.Failed()
				for i := range failed {
					if failed[i].Error != nil {
						fmt.Printf("Failed[%d]: %+v\n", i, failed[i].Error)
					}
				}
			}
		}
	}

and I'm not seeing any "Failed" prints. So how can I find which of these 14 were the real failures?

Any steps to reproduce the behavior?

One of my teammates hit this issue during development. To reproduce it for testing, I gave an incorrect type to certain records and an incorrect index to some others, and added them to the bulk request.

olivere commented Dec 27, 2017

Okay, I see. This looks like we need configurable backoff/retries for the Bulk API as well (just like the Scroll API).

With regards to your questions:

  1. Max limit of retries: You could do that once we have configurable backoff/retries for the Bulk API. Maybe it's worth adding a configurable backoff/retry for every API, using the client's settings as the default.

  2. Retrying: I really don't want Elastic to be "clever" and pick up the things that "should" happen. I try to simply make Elastic a transparent layer for working with Elasticsearch. I think it's better to let the application pick the correct logic on what should happen in case of an error.

  3. Could you try to log the response and set up a unit test in a separate issue? With that we can check if there's something wrong. Either Elasticsearch changed its response structure or your code is missing something. There already is a test for the "failed" case, so I'm not sure what's wrong here...
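To illustrate point 2, the application-side logic could pick out only the failed documents for retry by matching IDs from the bulk response against the original batch. The doc and failedItem types below are hypothetical stand-ins for the library's BulkableRequest and the items returned by response.Failed():

```go
package main

import "fmt"

// doc is a stand-in for a document sent in the bulk request.
type doc struct {
	ID   string
	Body string
}

// failedItem is a stand-in for a failed item in the bulk response.
type failedItem struct {
	ID     string
	Reason string
}

// selectFailed returns only the docs whose IDs appear among the
// failed items, so just those can be re-queued instead of all 30.
func selectFailed(sent []doc, failed []failedItem) []doc {
	failedIDs := make(map[string]bool, len(failed))
	for _, f := range failed {
		failedIDs[f.ID] = true
	}
	var retry []doc
	for _, d := range sent {
		if failedIDs[d.ID] {
			retry = append(retry, d)
		}
	}
	return retry
}

func main() {
	sent := []doc{{ID: "1"}, {ID: "2"}, {ID: "3"}}
	failed := []failedItem{{ID: "2", Reason: "index is missing"}}
	fmt.Println(len(selectFailed(sent, failed)))
}
```

This keeps the selection logic in the application, matching the stance above that the library itself stays a transparent layer.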

olivere added a commit that referenced this issue Jan 8, 2018
This commit extends the `PerformRequestOptions` to pass a custom
`Retrier` per request. This is enabled for the Scroll and Bulk API for
now.

See #666 and #610
olivere added a commit that referenced this issue Jan 8, 2018
This commit backports the `PerformRequestOptions` structure from v6 and
introduces `PerformRequestWithOptions`, which allows us to extend the
list of parameters passed when executing an HTTP request.

This change allows us to now also pass a custom `Retrier` per request,
which we allow for Scroll API and Bulk API via `Retrier(...)` on the
service respectively.

Backported from d2219c2.

See also #666 and #610
olivere commented Jan 8, 2018

I just released 5.0.60 and 6.1.1, which both allow setting a per-request Retrier (with configurable Backoff). For now, only the Bulk API and Scroll API make use of it. Use Retrier(...) on the BulkService or ScrollService to pass your own retrier for the given service.

Have you looked into point 3 above? Are you able to create a test that reproduces the problem?

olivere commented Jan 18, 2018

Closing for inactivity. Let me know if I can do anything, and I'll re-open.
