Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SqlServer: some tags have whitespaces appended up to a total length of 128 characters #7241

Closed
leepfrog-ger opened this issue Mar 27, 2020 · 10 comments · Fixed by #7351
Closed
Assignees
Labels
area/sqlserver bug unexpected problem or unintended behavior
Milestone

Comments

@leepfrog-ger
Copy link

Relevant telegraf.conf:

[[inputs.sqlserver]]
  ## Specify instances to monitor with a list of connection strings.
  ## All connection parameters are optional.
  ## By default, the host is localhost, listening on default port, TCP 1433.
  ##   for Windows, the user is the currently running AD user (SSO).
  ##   See https://github.com/denisenkom/go-mssqldb for detailed connection
  ##   parameters.
  servers = [
    "Server=REDACTED"
  ]

  ## Optional parameter, setting this to 2 will use a new version
  ## of the collection queries that break compatibility with the original
  ## dashboards.
  query_version = 2

System info:

  • Telegraf 1.14.0-rc1 on Windows Server 2012 R2
  • Querying SQL 2014/2016 on Windows Server 2012 R2
  • Writing to InfluxDB

Steps to reproduce:

  1. Collect SQL Metrics
  2. View tag values that are written to the InfluxDB Database

Expected behavior:

Tags, e.g. the instance tag that can contain values like the database name for a specific metric are a string that contains exactly the name of the object (e.g. tempdb)

Actual behavior:

Some tags seem to be appended with spaces to be exactly 128 characters in length (e.g. tempdb [put 122 spaces here])

Additional info:

  1. Issue did not exist with Telegraf 1.10.4 and appeard when running 1.14.0-rc1, so it was introduced somewhere between those versions
  2. The issue did start exactly when switching to the new Telegraf version, when taking the tempdb example there are NO entries with appended spaces before before the upgrade. There are NO entries without appended spaces after the time when the new version was deployed.
  3. From what I can tell the SQL query that is reading the values from the SQL server is returning correct data (so in the example it would return tempdb without any appended spaces)
  4. This issue breaks several things down the line, e.g. dashboards and search filters, you now have two "different" entries for all objects
  5. Github is truncating the spaces so I cannot post the influx query output here, but see the following screenshot to understand the issue:

influx_sql

@danielnelson danielnelson added area/sqlserver bug unexpected problem or unintended behavior labels Mar 27, 2020
@danielnelson danielnelson added this to the 1.14.1 milestone Mar 27, 2020
@leepfrog-ger
Copy link
Author

leepfrog-ger commented Mar 27, 2020

Update: can be reproduced with 1.14.0 release. Also observable when using file output: sql_output_json.txt

In the example you can also see that it is not happening for all entries

@leepfrog-ger
Copy link
Author

Is there any other data I can provide to help to narrow this issue down?
Or does someone have an idea where that behaviour is caused?

@danielnelson
Copy link
Contributor

I'm planning to take a look soon, in the meantime can you try this workaround:

[[processors.strings.trim]]
  namepass = ["sqlserver_performance"]
  tag = "instance"

@leepfrog-ger
Copy link
Author

Thanks for the great Idea!

I wasn't aware of that option. I had to modify it slightly but the following seems to work flawless:

[[processors.strings]]
  namepass = ["sqlserver_performance"]
  [[processors.strings.trim]]
    tag = "instance"

@Trovalo
Copy link
Collaborator

Trovalo commented Apr 7, 2020

@leepfrog-ger can you provide some additional information about this issue?

SQL instance:

  • Edition (on-prem, azure SQL db), looks on-prem form the issue description
  • Version (Sql 2012, 2014, etc)

If you are not sure just run:

SELECT
  SERVERPROPERTY('ProductVersion') as [FullVersion]
  ,SERVERPROPERTY('EngineEdition') as [EngineEdition]

Telegraf/InfluxDB:

  • the measurement name in which this happens

In the latest version, some queries have been edited, but the way the tag "instance" is created has not been changed. it is a simple.

SELECT REPLACE(@@SERVERNAME,'\',':') AS [sql_instance]

@leepfrog-ger
Copy link
Author

All servers are on prem. Issue observed on all servers we are collecting data from with telegraf (multiple 2014 Enterprise and 2016 Enterprise).

It seems to be limited to the "instance" field, I've posted example output here

@danielnelson danielnelson removed this from the 1.14.1 milestone Apr 13, 2020
@danielnelson
Copy link
Contributor

cc @denzilribeiro

@Trovalo
Copy link
Collaborator

Trovalo commented Apr 16, 2020

I had a second look at it.

Looks like the query is missing an RTRIM() for the column that becomes the tag "instance".
I will make a PR soon to add it

@denzilribeiro
Copy link
Contributor

@Trovalo thank you, I think Rtrim was missed in all the conditional logic to get right DB name from managed instance - aka this part needs an RTRIM and have it in a couple parts, missed it in a couple.

	CASE
	WHEN CAST(SERVERPROPERTY('EngineEdition') AS int) IN (5,8)  --- needed to get actual DB Name for SQL DB/ Managed instance
	THEN N'CASE WHEN (
                         RTRIM(spi.object_name) LIKE ''%:Databases''
                         OR RTRIM(spi.object_name) LIKE ''%:Database Replica''
                         OR RTRIM(spi.object_name) LIKE ''%:Catalog Metadata''
                         OR RTRIM(spi.object_name) LIKE ''%:Query Store''
                         OR RTRIM(spi.object_name) LIKE ''%:Columnstore''
                         OR RTRIM(spi.object_name) LIKE ''%:Advanced Analytics'')
                         AND TRY_CONVERT(uniqueidentifier, spi.instance_name) 
						 IS NOT NULL -- for cloud only
                   THEN d.name
		WHEN RTRIM(object_name) LIKE ''%:Availability Replica''
			AND TRY_CONVERT(uniqueidentifier, spi.instance_name) IS NOT NULL -- for cloud only
		THEN d.name + RTRIM(SUBSTRING(spi.instance_name, 37, LEN(spi.instance_name)))
                   ELSE spi.instance_name
            END AS instance_name,'
	ELSE 'spi.instance_name as instance_name, '

@leepfrog-ger
Copy link
Author

I can confirm this is now working as intended with the most recent release (1.14.2)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/sqlserver bug unexpected problem or unintended behavior
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants