Since we can only improve on what we measure, we started implementing the Server Timing API on all HTTP endpoints.
The API offers a simple way to attach performance measurements to the HTTP response headers.
For a general introduction to the topic, see the Mozilla documentation (https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Server-Timing).
For the implementation itself, we use the axum-server-timing crate for Rust, which provides timing data for axum servers out of the box (https://crates.io/crates/axum-server-timing).
Because our architecture spans multiple datacenters, we do not publish these metrics as a single measurement. Our network is layered, and the header carries several entries rather than one total.
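To illustrate how several measurements can end up in one header, here is a minimal sketch that joins per-layer timings into a single server-timing value. It is an illustration only, not the code we run: the LayerTiming type and the format_server_timing function are hypothetical names made up for this example, and it assumes that each layer reports a name, an optional description, and a duration in milliseconds.

```rust
/// Timing reported by one layer (hypothetical type for this sketch).
struct LayerTiming {
    name: String,
    desc: Option<String>,
    dur_ms: u64,
}

/// Joins per-layer timings into a single `server-timing` header value.
/// Simplification: descriptions are written as-is, so a description
/// containing `"` or `\` would need escaping first.
fn format_server_timing(layers: &[LayerTiming]) -> String {
    layers
        .iter()
        .map(|l| match &l.desc {
            // Entry with a description: name;desc="...";dur=N
            Some(d) => format!("{};desc=\"{}\";dur={}", l.name, d, l.dur_ms),
            // Entry without a description: name;dur=N
            None => format!("{};dur={}", l.name, l.dur_ms),
        })
        .collect::<Vec<_>>()
        .join(", ")
}
```

Entries are separated by commas and parameters by semicolons, which is the format the Server-Timing header defines.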
Here are some sample response headers produced by our API:
server-timing: dtz-red;dur=765, dtz-blue;desc="dtz-blue(warm)";dur=538
content-length: 19276864
last-modified: Sun, 07 May 2023 10:25:33 GMT
date: Sun, 07 May 2023 10:27:52 GMT
content-type: application/octet-stream
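
On the client side, the server-timing value can be split back into its entries. The following is a minimal sketch using only the Rust standard library. The TimingEntry type and the parse_server_timing function are hypothetical names for this example, and the parser is deliberately simplified: it assumes that quoted descriptions contain no commas or semicolons.

```rust
/// One entry of a `server-timing` header value (hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
struct TimingEntry {
    name: String,
    desc: Option<String>,
    dur_ms: Option<f64>,
}

/// Parses the value of a `server-timing` header into its entries.
/// Simplification: assumes no ',' or ';' inside quoted descriptions.
fn parse_server_timing(value: &str) -> Vec<TimingEntry> {
    value
        .split(',')
        .filter_map(|metric| {
            let mut parts = metric.split(';').map(str::trim);
            // The first part is the metric name; skip empty entries.
            let name = parts.next().filter(|n| !n.is_empty())?.to_string();
            let mut entry = TimingEntry { name, desc: None, dur_ms: None };
            for param in parts {
                if let Some((key, val)) = param.split_once('=') {
                    let val = val.trim().trim_matches('"');
                    match key.trim().to_ascii_lowercase().as_str() {
                        "desc" => entry.desc = Some(val.to_string()),
                        "dur" => entry.dur_ms = val.parse().ok(),
                        _ => {} // unknown parameters are ignored
                    }
                }
            }
            Some(entry)
        })
        .collect()
}
```

Applied to the sample value above, this yields two entries: dtz-red with a duration of 765 and no description, and dtz-blue with the description dtz-blue(warm) and a duration of 538.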