-
Notifications
You must be signed in to change notification settings - Fork 9
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
otel: instrument observability (metrics + traces) on the database #2285
Comments
A simpler path is probably to switch to using the standard library's sql API and use OTEL's instrumentation for that. We're currently using sqlc's pgx5 driver because it supports bulk inserts (IIRC?), but we can likely work around that. |
Niiiiice that looks fantastic! From some brief grepping, I couldn't find any It seems like the biggest thing we'd be losing is the codegen. What do you think would be the best way to move forward with the existing |
Notes from standup: we don't need to remove sqlc, just switch out the driver. :) |
Fixes #2285 Register the standard otel database metrics using: https://github.com/XSAM/otelsql Found call sites by grepping for `sql.Open` and skipping all test code. Sample: ``` ScopeMetrics #2 ScopeMetrics SchemaURL: InstrumentationScope github.com/XSAM/otelsql 0.32.0 Metric #0 Descriptor: -> Name: db.sql.latency -> Description: The latency of calls in milliseconds -> Unit: ms -> DataType: Histogram -> AggregationTemporality: Cumulative HistogramDataPoints #0 Data point attributes: -> method: Str(sql.conn.begin_tx) -> status: Str(ok) StartTimestamp: 2024-08-08 23:18:01.011762 +0000 UTC Timestamp: 2024-08-08 23:18:11.012177 +0000 UTC Count: 23 Sum: 17.082795 Min: 0.137584 Max: 3.233917 ``` --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
reopening until we can unrevert the revert |
Fixes #2285 Unreverts #2305 with these changes: * Reverts [`copyAny`](https://github.com/TBD54566975/ftl/blob/main/internal/reflect/reflect.go#L149) changes except for the `list.List` handling. * The original PR caused a panic when deploying to prod: `reflect: internal error: must be a Pointer or Ptr; got unsafe.Pointer` * #2376 removed the DB handles from the module context Database struct, so we no longer need to handle all these types in `DeepCopy`. * Updates all `reflect.Ptr` refs to `reflect.Pointer`. Not a big deal, but `Ptr` is the [old name](https://cs.opensource.google/go/go/+/refs/tags/go1.23.0:src/reflect/type.go;l=303), and since we're using `reflect.Pointer` elsewhere in this file, it's better to stay consistent. Accordingly, this also fixes the error message that we saw in the prod deployment, which previously implied `Pointer` and `Ptr` were two _separate_ types. --------- Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
From #2194 (comment):
Immediate Needs
For metrics in DD, the well-lit path appears to be DD's official Postgres integration. That comes with not only all the core metrics we want, but also a nice dashboard. For immediate use cases, this is definitely the best solution. Next step on that is talking to @bradleydwyer
If we cannot do that^, then we'll need to instrument custom metrics right now, which is possible but not ideal, both because of the implementation time and also because of the added complexity to the sql code (much of which is generated by sqlc, so we can't just do whatever we want to make the interface nice/intuitive).
Long Term
That said, FTL is not tied to DD, so in the long run, we’ll likely want some integration for this outside of the DD world as well. The first reasonable avenue for this was otel's zero code instrumentation for Go. Unfortunately, that integration only supports traces right now, not metrics. The whole project is marked as a work in progress, but there doesn't appear to be any immediate plans to add metric support.
The urgency for supporting non-DD database metrics is fairly low. Our current customer team and next customer team are both internal, so they should both use DD. Most likely, the next one after that will be as well. We will want to lift this instrumentation out of the DD-only world before significant amounts of open source development, but perhaps otel will also provide better built-ins for instrumenting database metrics by then. This seems like something we can afford to wait for.
Parent doc link for zero-code: https://opentelemetry.io/docs/concepts/instrumentation/zero-code/
The text was updated successfully, but these errors were encountered: