Improve response checks in bulk API runner #207

danielmitterdorfer · 2017-01-27T09:34:57Z

At the moment, Rally checks for bulk errors by inspecting the errors property of the bulk response (and adds it to the request meta-data).

When errors occur, we should inspect the response more thoroughly and also report more meta-data (such as items with failed shards).


bleskes · 2017-01-27T09:52:10Z

note that failing replicas will not bubble up as an error. You need to look at the failures section of the _shards header of each item.
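A minimal sketch of that per-item check, assuming each bulk item carries a `_shards` object with `total`, `successful` and `failed` counts. The function name and the sample response are illustrative only and are not taken from Rally:

```python
def items_with_failed_shards(bulk_response):
    """Return (position, op_type, shards) for every item that reports failed shard copies."""
    flagged = []
    for position, item in enumerate(bulk_response.get("items", [])):
        # Each item is a single-key dict such as {"index": {...}} or {"update": {...}}.
        for op_type, data in item.items():
            shards = data.get("_shards", {})
            if shards.get("failed", 0) > 0:
                flagged.append((position, op_type, shards))
    return flagged


# Hypothetical response: the top-level "errors" flag is false although
# one shard copy failed for the second item.
sample_response = {
    "took": 12,
    "errors": False,
    "items": [
        {"index": {"_id": "1", "status": 201,
                   "_shards": {"total": 2, "successful": 2, "failed": 0}}},
        {"index": {"_id": "2", "status": 201,
                   "_shards": {"total": 2, "successful": 1, "failed": 1}}},
    ],
}

failed = items_with_failed_shards(sample_response)
```

Checking only `sample_response["errors"]` would report success here, whereas `failed` contains the second item.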

danielmitterdorfer · 2017-01-27T09:54:52Z

Thanks for the pointer. In that case I think we should add this as an option which needs to be enabled by the user in the track.

bleskes · 2017-01-27T09:57:05Z

maybe we should improve the bulk response to bubble this up. I think it has merit for exactly the reason you think of making it optional - people won't have to parse the response.

danielmitterdorfer · 2017-01-27T10:02:34Z

> maybe we should improve the bulk response to bubble this up.

This would definitely help in this case.

The reason I wanted to make response parsing optional is to avoid adding any bottlenecks in the load test driver. I have not yet checked how long it takes to extract this data (for different bulk sizes). My gut feeling tells me that the overhead is negligible, but I'd rather measure it first and base my decision on hard numbers. :)
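One way to get such numbers is a small `timeit` harness that compares reading the top-level `errors` flag with a full pass over all items. This is only a sketch on synthetic responses; the helper names and the bulk sizes are assumptions for illustration, and the absolute timings depend on the machine:

```python
import timeit


def build_response(bulk_size):
    """Build a synthetic, error-free bulk response with `bulk_size` items."""
    item = {"index": {"status": 201,
                      "_shards": {"total": 2, "successful": 2, "failed": 0}}}
    # The same dict object is reused for every item; that is fine for read-only timing.
    return {"took": 1, "errors": False, "items": [item] * bulk_size}


def check_errors_flag(response):
    """Cheap check: only look at the top-level flag."""
    return not response["errors"]


def scan_all_items(response):
    """Expensive check: visit every item and count those with failed shards."""
    failed = 0
    for item in response["items"]:
        for data in item.values():
            if data["_shards"]["failed"] > 0:
                failed += 1
    return failed


def seconds_per_call(fn, response, repeat=5, number=20):
    """Best-of-`repeat` average time for a single call of `fn(response)`."""
    return min(timeit.repeat(lambda: fn(response), repeat=repeat, number=number)) / number


timings = {}
for size in (100, 1000, 5000):
    response = build_response(size)
    timings[size] = (seconds_per_call(check_errors_flag, response),
                     seconds_per_call(scan_all_items, response))

for size, (flag_time, scan_time) in timings.items():
    print(f"bulk size {size}: flag check {flag_time:.2e}s, full scan {scan_time:.2e}s")
```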

danielmitterdorfer · 2017-02-07T17:11:09Z

Rally will return a structure like:

{
  "weight": 5000,
  "unit": "docs",
  "bulk-size": 5000,
  "success": true,
  "success-count": 5000,
  "error-count": 0,
  "ops": {
    "index": {
      "item-count": 5000,
      "created": 5000
    }
  },
  "shards_histogram": [
    {
      "item-count": 5000,
      "shards": {
        "total": 2,
        "successful": 2,
        "failed": 0
      }
    }
  ]
}

when the new parameter detailed-results is set to true in the track specification. If the value is false, it will instead return the following meta-data:

{
  "weight": 5000,
  "unit": "docs",
  "bulk-size": 5000,
  "success": true,
  "success-count": 5000,
  "error-count": 0
}

The default value will be false because this feature adds significant overhead: Rally has to iterate over the complete response structure. The benchmarks (which can be run with make benchmark) indicate a slowdown by a factor greater than 1,000: whereas we can return within single-digit microseconds when detailed-results is false, the call needs several milliseconds when detailed-results is true. Whether this is acceptable needs to be decided on a case-by-case basis (by looking at typical response times of the bulk operation in a benchmark). To keep users from running into trouble unintentionally, I decided to set it to false by default.
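To illustrate where the cost comes from, here is a rough sketch of how the two shapes of meta-data could be derived from a bulk response. It is not Rally's actual implementation: the function name is made up, it assumes that items expose `status`, `result` and `_shards`, and it counts an item as an error only when its status is above 299.

```python
def bulk_meta_data(response, bulk_size, detailed_results=False):
    """Summarize a bulk response; only iterate over the items when it is needed."""
    error_count = 0
    ops = {}
    histogram = {}
    # Fast path: without detailed results and without errors, no item is touched.
    if detailed_results or response.get("errors", False):
        for item in response["items"]:
            # Each item is a single-key dict: {"index": {...}}, {"update": {...}}, ...
            op_type, data = next(iter(item.items()))
            if data.get("status", 200) > 299:  # assumption: only HTTP-style errors count
                error_count += 1
            if detailed_results:
                op_stats = ops.setdefault(op_type, {"item-count": 0})
                op_stats["item-count"] += 1
                result = data.get("result")  # e.g. "created"; assumed to be present
                if result:
                    op_stats[result] = op_stats.get(result, 0) + 1
                shards = data.get("_shards")
                if shards:
                    key = (shards["total"], shards["successful"], shards["failed"])
                    histogram[key] = histogram.get(key, 0) + 1

    meta = {
        "weight": bulk_size,
        "unit": "docs",
        "bulk-size": bulk_size,
        "success": error_count == 0,
        "success-count": bulk_size - error_count,
        "error-count": error_count,
    }
    if detailed_results:
        meta["ops"] = ops
        meta["shards_histogram"] = [
            {"item-count": count,
             "shards": {"total": total, "successful": successful, "failed": failed}}
            for (total, successful, failed), count in histogram.items()
        ]
    return meta
```

With `detailed_results=False` and an error-free response the function returns after a single dictionary lookup, while `detailed_results=True` always visits every item, which is the source of the overhead described above.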

danielmitterdorfer · 2017-02-07T17:15:51Z

By the way, the pydoc comment has a few more examples that might also be interesting.

bleskes · 2017-02-08T17:08:02Z

@danielmitterdorfer does it mean you gave up on:

> maybe we should improve the bulk response to bubble this up. I think it has merit for exactly the reason you think of making it optional - people won't have to parse the response.

danielmitterdorfer · 2017-02-09T07:48:08Z

@bleskes I did not give up, but Rally works with ES 1.x, 2.x, 5.x and 6.x, and I needed to implement a solution that works for all of them. We can still implement this in ES though. I can create a ticket later.

bleskes · 2017-02-09T07:51:02Z

@danielmitterdorfer makes total sense. Thanks for explaining.

danielmitterdorfer · 2017-02-13T11:24:28Z

I've created elastic/elasticsearch#23143 now as a follow-up.

danielmitterdorfer added the :Load Driver, :Metrics and enhancement labels Jan 27, 2017

danielmitterdorfer added this to the 0.5.0 milestone Jan 27, 2017

danielmitterdorfer self-assigned this Feb 7, 2017

danielmitterdorfer closed this as completed in deee0a2 Feb 7, 2017

danielmitterdorfer removed their assignment Feb 7, 2017

danielmitterdorfer mentioned this issue Feb 13, 2017

Expose more summary metrics in bulk API response elastic/elasticsearch#23143

Closed
