Return timing data in response header(s) #357

milesrichardson · 2023-04-12T22:33:58Z

Related: #356

Feature Request

It would be nice if the response from Seafowl included some information about execution time in the response headers. Currently, a client has no insight into execution metrics, other than measuring the total round trip time of the request and response.

Motivation

Having access to more timing data would be helpful for diagnostics, and for writing interactive query tools similar to the Splitgraph Query Console. For example, the Splitgraph DDN provides fields executionTime and executionTimeHighRes, which represent the time spent by the Splitgraph Engine "processing the query," which, in effect, is the interval defined by the boundaries where the "DDN web bridge" stops reading the HTTP request, and when it starts writing the HTTP response.

Open questions:

What should be measured?

A minimal implementation should measure the total processing time, as delineated by the interval between when Seafowl finished reading the HTTP request body, and when it started writing the HTTP response body.
Other more granular "internal" metrics might also be helpful, e.g. time spent resolving tables or fetching an external resource. But this generalizes to a telemetry framework, since it's possible to measure the time between any two events. So perhaps there are some important metrics worth including by default.

How should it be returned in the response headers?

All considerations from #356 (comment) apply:

Each metric could be its own response header
All metrics could be in one response header
The most important metrics could be in the parameters of the Content-Type header, similarly to how we might return the field types. (The advantage of this option is that Content-Type is a default CORS-safelisted HTTP response header.)

For the content-type option, perhaps any non-field metadata could be in parameters conventionally prefixed with __, e.g.:

content-type: application/octet-stream; __exec_ms=15; __resolving_ms=20; __total_ms=35; 
total_volume=FLOAT; year=INT;
...
{"total_volume":78.0,"year":2018}

The text was updated successfully, but these errors were encountered:

onpaws · 2023-06-15T06:31:20Z

OK, after flailing around in Warp and it's world of Filters, I ended up switching gears and now added an explicit timer inside cached_read_query /uncached_read_write_query.
It's very simple and not yet DRY'd but wanted to push this up as a PoC of 'whole request' timing.

Probably whether Seafowl returns this header should either be 1) configurable or 2) perhaps it should follow warp::trace::request()'s footsteps and return only at INFO/DEBUG level. Or maybe both. (?)

Thoughts?

gruuya · 2023-07-13T06:27:00Z

Closed by #438

onpaws mentioned this issue Jun 15, 2023

Return timing data in response header(s) #438

Merged

7 tasks

gruuya closed this as completed Jul 13, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Return timing data in response header(s) #357

Return timing data in response header(s) #357

milesrichardson commented Apr 12, 2023 •

edited

Loading

onpaws commented Jun 15, 2023 •

edited

Loading

gruuya commented Jul 13, 2023

Return timing data in response header(s) #357

Return timing data in response header(s) #357

Comments

milesrichardson commented Apr 12, 2023 • edited Loading

onpaws commented Jun 15, 2023 • edited Loading

gruuya commented Jul 13, 2023

milesrichardson commented Apr 12, 2023 •

edited

Loading

onpaws commented Jun 15, 2023 •

edited

Loading