Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: add Database Link to span context #107

Open
axw opened this issue Jul 4, 2019 · 5 comments
Open

Proposal: add Database Link to span context #107

axw opened this issue Jul 4, 2019 · 5 comments

Comments

@axw
Copy link
Member

axw commented Jul 4, 2019

Description of the issue

In elastic/apm-agent-java#696 support is being added for extracting the Database Link from queries. This is an Oracle database feature for linking multiple databases to form a single logical database.

Queries across these databases may have different performance characteristics, since they may be geographically distributed, on different hardware, or under different load. As such it may be useful to know which remote database a query is operating on.

Proposal

Add the following field to span context:

Intake API field Elasticsearch field
context.db.link (string, optional) span.db.link (keyword)

Initially this field would not be displayed in the APM UI, as the link is already visible in the statement. It would be indexed separately for custom visualisations, breaking down metrics by destination database. (See elastic/apm-agent-java#696 (comment))

@wolframhaussig @elastic/apm-agent-devs @elastic/apm-server please take a look and comment if you have any concerns.

@wolframhaussig
Copy link

The proposal looks good to me - just for completeness: This is not a function supported only by Oracle - only the syntax is Oracle specific. Here are the documentations as reference:
MS SQL: "When you execute a distributed query against a linked server, include a fully qualified, four-part table name for each data source to query. This four-part name should be in the form linked_server_name.catalog.schema.object_name."
Oracle: "Oracle Database uses the global database name to name the schema objects globally using the following scheme: schema.schema_object@global_database_name"

@axw
Copy link
Member Author

axw commented Jul 4, 2019

Thanks for the further context @wolframhaussig. That makes me wonder if this is the right way to go then, since then the behaviour would be different for MS SQL and Oracle. Currently we would include the linked_server_name part for MS SQL in the span name.

@wolframhaussig
Copy link

As I have no MSSQL database I cannot test the agent and the sql parsing logic against the Microsoft database. Therefore I didn't add the logic of parsing the link for this db. But from what I found in the internet I think we are on the right track here - we just have to extract the db links for MSSQL in the future too:

As both Oracle and MSSQL are not limited to link to databases of the same vendor it is possible that different applications access the same target database via different "jump hosts":
Application 1(Oracle): SELECT * FROM SCHEMA.TESTTABLE@DBLINK
Application 2(MSSQL):SELECT * FROM DBLINK.SCHEMA.TESTTABLE

If we do not separate the dblink from the schema and tablename we would not recognise that both selects access the same table SCHEMA.TESTTABLE on database DBLINK

@axw
Copy link
Member Author

axw commented Jul 5, 2019

Right. I was hoping, a bit naively, to avoid vendor-specific logic in the parser.

The MSSQL/Transact-SQL object name syntax is described here: https://docs.microsoft.com/en-us/sql/t-sql/language-elements/transact-sql-syntax-conventions-transact-sql?view=sql-server-2017#multipart-names

If I read it correctly, then we could unambiguously extract the linked server name; database and schema may be omitted, but the number of dots between name components can be used to identify them.

@wolframhaussig
Copy link

Right. I was hoping, a bit naively, to avoid vendor-specific logic in the parser.

Maybe we can split the parser logic? SignatureParser contains all parsing logic for SQL standard features and will be used for the majority of databases. If we need vendor specific logic it could go into OracleSignatureParser or MsSqlSignatureParser which subclass SignatureParser. Of course StatementInstrumentation and PreparedStatementInstrumentation would then have to recognise the vendor...

If I read it correctly, then we could unambiguously extract the linked server name; database and schema may be omitted, but the number of dots between name components can be used to identify them.

Right, that looks easy enough. At first I thought that the dots are otpional too but fortunately, they are required - even if that looks a bit weird: server...object

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants