Gateway HTTP Metrics inconsistently measure failed response times #8820
Labels
kind/bug
A bug in existing code (including security flaws)
need/triage
Needs initial labeling and prioritization
Checklist
Installation method
built from source
Version
Config
No response
Description
The per-format latency metrics added as part of #8441 are inconsistent as to whether they include failed responses or not.
gw_unixfs_file_get_duration_seconds, gw_raw_block_get_duration_seconds: include timings for failed responses
gw_car_stream_get_duration_seconds, gw_unixfs_gen_dir_listing_get_duration_seconds: do not include timings for failed responses
These should be consistent, and my preference is that they should not include timings for failed responses (since errors will skew latency timings lower in general).
Tech note: the difference is caused by the use of http.ServeContent for serving files and blocks, which does not return an error. We should pass a custom ResponseWriter that detects whether an error occurred during the write and then only record the metric if there is no error.
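A rough sketch of what such a wrapper could look like, using only the standard library. The names statusWriter, observeIfOK, brokenWriter, runOK and runBroken are hypothetical and not taken from the gateway code, and the observe callback stands in for whatever records the histogram sample. Treating any status of 400 or above as a failure is an assumption on top of the write-error check described above.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
	"time"
)

// statusWriter (hypothetical name) wraps an http.ResponseWriter and remembers
// the status code and the first error returned by Write.
type statusWriter struct {
	http.ResponseWriter
	status   int
	writeErr error
}

func (w *statusWriter) WriteHeader(code int) {
	if w.status == 0 {
		w.status = code
	}
	w.ResponseWriter.WriteHeader(code)
}

func (w *statusWriter) Write(p []byte) (int, error) {
	if w.status == 0 {
		w.status = http.StatusOK // implicit 200 when WriteHeader was never called
	}
	n, err := w.ResponseWriter.Write(p)
	if err != nil && w.writeErr == nil {
		w.writeErr = err
	}
	return n, err
}

// failed reports a write error or (by assumption) any 4xx/5xx status.
func (w *statusWriter) failed() bool {
	return w.writeErr != nil || w.status >= 400
}

// observeIfOK times serve and calls observe only if the response succeeded.
func observeIfOK(w http.ResponseWriter, serve func(http.ResponseWriter), observe func(time.Duration)) {
	sw := &statusWriter{ResponseWriter: w}
	start := time.Now()
	serve(sw)
	if !sw.failed() {
		observe(time.Since(start))
	}
}

// brokenWriter simulates a client whose connection fails on every write.
type brokenWriter struct{ http.ResponseWriter }

func (brokenWriter) Write([]byte) (int, error) {
	return 0, errors.New("connection reset")
}

// run serves a small file through http.ServeContent and returns how many
// duration samples were recorded for that single request.
func run(w http.ResponseWriter) int {
	r := httptest.NewRequest(http.MethodGet, "/ipfs/example", nil)
	recorded := 0
	observeIfOK(w, func(rw http.ResponseWriter) {
		http.ServeContent(rw, r, "hello.txt", time.Time{}, strings.NewReader("hello"))
	}, func(time.Duration) { recorded++ })
	return recorded
}

func runOK() int     { return run(httptest.NewRecorder()) }
func runBroken() int { return run(brokenWriter{httptest.NewRecorder()}) }

func main() {
	fmt.Println(runOK())     // healthy client: one sample recorded
	fmt.Println(runBroken()) // failed write: no sample recorded
}
```

Because the wrapper only exposes Header, Write and WriteHeader, the copy inside http.ServeContent goes through the wrapped Write, so a failed write is visible even though ServeContent itself returns nothing. A real implementation would probably also need to forward optional interfaces such as http.Flusher if the handlers rely on them.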