{Doc} Add telemetry doc #22689

Merged on Jun 29, 2022 (5 commits)
Changes from 4 commits
81 changes: 81 additions & 0 deletions doc/telemetry/README.md
@@ -0,0 +1,81 @@
Telemetry Documentation
=======================

> Two types of telemetry are used to monitor and analyze the execution of Azure CLI commands. ARM telemetry is recorded by ARM based on HTTP traffic; client-side telemetry is sent by Azure CLI itself.

### ARM Telemetry
ARM telemetry tracks all HTTP requests and responses that go through the ARM endpoint. As far as we know, the following cases produce no ARM telemetry record:
- The command fails to create a request, for instance because a parameter fails validation or the request or payload cannot be constructed.
- The command calls a data plane service API.
- The network is inaccessible.
- No request is needed during execution.

Kusto Cluster and Database: https://dataexplorer.azure.com/clusters/armprod/databases/ARMProd


### CLI Client Telemetry
Client-side telemetry is sent at the end of Azure CLI command execution. It covers all commands, whether they send HTTP requests or only perform local operations.
Sanitized data is stored in a Kusto cluster managed by the DevDiv Data team.

Kusto Cluster and Database: https://dataexplorer.azure.com/clusters/ddazureclients/databases/AzureCli

All Azure CLI data is stored in a large JSON field named `Properties` in the table `RawEventsAzCli`. Some properties are flattened, some are not. Here are some useful fields:
> The telemetry used a different schema before Azure CLI 2.0.28. All the fields explained below are for the new schema, in other words, CLI version > 2.0.28.
- `EventName`: `azurecli/command`, `azurecli/fault` or `azurecli/extension`
  - `azurecli/command`: the common event record, with the general `Properties` field.
  - `azurecli/fault`: an additional event record with extra exception info in the `Properties` field.
  - `azurecli/extension`: an additional event record with customized info in the `Properties` field.
  - Additional event records can be joined with the common event record using `CorrelationId`.
- `CorrelationId`: GUID used to join additional event records with the common event record
  - `azurecli/extension` event records have the same `CorrelationId` as the related `azurecli/command` event record.
  - `azurecli/fault` event records have a `Properties['reserved.datamodel.correlation.1']` field. Its value is `{correlationId},UserTask,` where `{correlationId}` is the `CorrelationId` field of the related `azurecli/command` record.
- `EntityType`: `UserTask`/`Fault`
  - For `EventName == 'azurecli/command'`, the `EntityType` is `UserTask`.
  - For `EventName == 'azurecli/fault'`, the `EntityType` is `Fault`.
- `EventTimeStamp`: time when the telemetry record is sent
- `ProductVersion`: CLI core version in the format `azurecli@{version}`
- `CoreVersion`: CLI core version
- `ExeVersion`: `{cli_core_version}@{module_version}`. In the new schema (CLI version > 2.0.28), all module versions are `none`, so this field is `{cli_core_version}@none`.
- `OsType`: operating system, e.g. linux, windows
- `OsVersion`: OS platform version, e.g. 10.0.14942
- `PythonVersion`: platform Python version
- `ShellType`: cmd/bash/ksh/zsh/cloud-shell/... Note: may not be accurate.
- `MacAddressHash`: SHA256-hashed MAC address
- `MachineId`: GUID derived from the first 128 bits of `MacAddressHash`
- `UserId`: CLI installation ID. Each locally installed CLI client has a GUID as its installation ID.
- `SessionId`: SHA256 hash of the CLI installation ID, the parent process (terminal session) creation time and the parent process (terminal session) ID. Note: may not be accurate.
- `RawCommand`: CLI command name
- `Params`: CLI command arguments (without argument values)
- `AzureSubscriptionId`: current subscription ID
- `ClientRequestId`: GUID that is set on the HTTP header
- `StartTime`: time when the command begins executing
- `EndTime`: time when the command exits
- `ActionResult`:
  - For `EntityType == 'UserTask'`, it can be `Success`/`Failure`/`UserFault`/`None`. Any value other than `Success` means failure.
  - For `EntityType == 'Fault'`, it is empty.
- `ResultSummary`: details of the result; may be suppressed to meet security & privacy requirements.
- `ExceptionMessage`: details of the exception; may be suppressed to meet security & privacy requirements.
- `Properties`: large JSON containing all constructed fields. The unflattened fields not covered above include:
  - `reserved.datamodel.entityname`: CLI command name with hyphens
  - `reserved.datamodel.correlation.1`: additional field when `EventName == 'azurecli/fault'`. Its format is `{correlationId},UserTask,` where `{correlationId}` is the `CorrelationId` field of the related `EventName == 'azurecli/command'` record.
  - `reserved.datamodel.fault.typestring`: additional field when `EventName == 'azurecli/fault'`. It logs the exception class.
  - `reserved.datamodel.fault.description`: additional field when `EventName == 'azurecli/fault'`. It logs the exception description or fault type.
  - `context.default.vs.core.os.platform`: OS platform
  - `context.default.azurecli.source`: `az`/`completer`. It is `completer` if argument auto-completion settings are found in the OS environment variables.
  - `context.default.azurecli.environmentvariables`: logs the customer's environment variables whose names start with `AZURE_CLI`
  - `context.default.azurecli.extensionname`: logs the extension name and version in the format `{extension_name}@{extension_version}` if the command comes from a CLI extension.
  - `context.default.azurecli.installer`: value of the OS environment variable `AZ_INSTALLER`
  - `context.default.azurecli.error_type`: logs the exception class name.
  - `context.default.azurecli.exception_name`: supplements `context.default.azurecli.error_type`
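
The link between `azurecli/fault` and `azurecli/command` records can also be followed outside Kusto, for example on exported rows. The Python sketch below is illustrative only: the row dictionaries, the made-up IDs and the helper names `parse_fault_correlation` and `attach_faults` are hypothetical. Only the field names and the `{correlationId},UserTask,` format are taken from the list above.

```python
from collections import defaultdict

CORRELATION_KEY = "reserved.datamodel.correlation.1"


def parse_fault_correlation(value):
    """Return the command CorrelationId embedded in '{correlationId},UserTask,'."""
    correlation_id, _, _ = value.partition(",")
    return correlation_id.strip()


def attach_faults(records):
    """Map each command record's CorrelationId to its related fault records."""
    faults_by_command = defaultdict(list)
    for record in records:
        if record["EventName"] == "azurecli/fault":
            raw = record["Properties"][CORRELATION_KEY]
            faults_by_command[parse_fault_correlation(raw)].append(record)
    return {
        record["CorrelationId"]: faults_by_command.get(record["CorrelationId"], [])
        for record in records
        if record["EventName"] == "azurecli/command"
    }


# Hypothetical sample rows with made-up IDs, standing in for RawEventsAzCli rows.
sample = [
    {"EventName": "azurecli/command", "CorrelationId": "aaaa-1111", "Properties": {}},
    {"EventName": "azurecli/fault", "CorrelationId": "bbbb-2222",
     "Properties": {CORRELATION_KEY: "aaaa-1111,UserTask,"}},
]
linked = attach_faults(sample)
```

Here `linked` maps the command's `CorrelationId` to the list of fault rows that point at it; a command without faults maps to an empty list.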

### Accessing Client Telemetry
To use our data tools and data, you must complete the required trainings and join a security group.

Follow the instructions in [Accessing DevDiv Data](https://devdiv.visualstudio.com/DevDiv/_wiki/wikis/DevDiv.wiki/9768/Accessing-DevDiv-Data) to get access permission.


### Doc Sections

- [Kusto Examples](kusto_examples.md) - Sample Kusto queries

- [FAQ](faq.md) - Commonly asked questions
27 changes: 27 additions & 0 deletions doc/telemetry/faq.md
@@ -0,0 +1,27 @@
FAQ
===

### What's the relationship of CLI telemetry and ARM telemetry?

- CLI telemetry is client telemetry. It logs the OS, platform, command, parameters, result and other client info.
- ARM telemetry is server telemetry. It tracks all HTTP requests and responses that go through the ARM endpoint from different clients, including CLI, PowerShell and SDKs.
- They share the same `clientRequestId`, which you can use to join `HttpIncomingRequests` (ARM telemetry table) with `RawEventsAzCli` (CLI telemetry table).


### How can I filter CLI requests from ARM telemetry?

[Execute in Web](https://dataexplorer.azure.com/clusters/armprod/databases/ARMProd?query=H4sIAAAAAAAAA/MoKSnwzEvOz83MSw9KLSxNLS4p5qpRKM9ILUpVCPH0dQ0OcfQNULBTSEzP1zDM0ITLlRanFjmmp+aVKCTn55UkZuYVK6g7VpUWpTr7eOob6akDFZYkZqcqGBoAAPoAVdxjAAAA)
```
HttpIncomingRequests
| where TIMESTAMP > ago(1h)
| where userAgent contains 'AzureCLI/2.'
| take 10
```
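
The same filter can be applied locally, for instance to user agent strings exported from `HttpIncomingRequests`. The Python sketch below is an assumption-laden illustration: only the `AzureCLI/2.` prefix comes from the query above, while the `major.minor.patch` token shape, the helper name `cli_version_from_user_agent` and the sample strings are hypothetical. The match is case-insensitive, mirroring Kusto's `contains` operator.

```python
import re

# Assumed token shape: "AzureCLI/" followed by a 2.x.y version (any letter case).
_CLI_TOKEN = re.compile(r"azurecli/(2\.\d+\.\d+)", re.IGNORECASE)


def cli_version_from_user_agent(user_agent):
    """Return the Azure CLI 2.x version found in a user agent, or None."""
    match = _CLI_TOKEN.search(user_agent or "")
    return match.group(1) if match else None


# Hypothetical user agent strings, for illustration only.
print(cli_version_from_user_agent("AZURECLI/2.37.0 (DEB) example-sdk/1.0"))
print(cli_version_from_user_agent("SomeOtherClient/8.0"))
```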

### How can I collect customized properties into CLI telemetry?

You can use the `add_extension_event` [function](https://github.com/Azure/azure-cli/blob/dev/src/azure-cli-core/azure/cli/core/telemetry.py#L418-L420) to collect properties for your extension.

When customers run a command, in addition to the general CLI record whose `EventName` is `azurecli/command`, another record whose `EventName` is `azurecli/extension` is recorded in CLI telemetry.

You can join the general `azurecli/command` record with the `azurecli/extension` record on the `CorrelationId` field.
99 changes: 99 additions & 0 deletions doc/telemetry/kusto_examples.md
@@ -0,0 +1,99 @@
Samples for kusto query
=======================
### Query for new schema
CLI telemetry uses a different schema after version 2.0.28.
[Execute in Web](https://dataexplorer.azure.com/clusters/ddazureclients/databases/AzureCli?query=H4sIAAAAAAAAAwtKLHctS80rKXascs7J5KpRKM9ILUpVKC7IySzRCCjKTylNLglLLSrOzM/TUVDXU9eMNohVsLVVUE+sKi1KTc7JdDBSh+sqyc/MK9HAo9cwVlPBTsFAIb+IsFojsFojC6DpJYnZqQqmAIl8AharAAAA)
```
RawEventsAzCli
| where split(ProductVersion, '.')[0] == 'azurecli@2'
| where toint(split(ProductVersion, '.')[1]) > 0 or toint(split(ProductVersion, '.')[2]) > 28
| take 5
```
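
The version test in this query can be hard to read because `ProductVersion` is a string of the form `azurecli@{version}`. As a rough Python equivalent for rows processed outside Kusto, the sketch below parses the version and compares it as a tuple. The helper name `is_new_schema` is hypothetical, and unlike the query, which only matches major version 2, the tuple comparison would also accept any later major version.

```python
def is_new_schema(product_version):
    """True when ProductVersion ('azurecli@{version}') is later than 2.0.28."""
    prefix, _, version = product_version.partition("@")
    if prefix != "azurecli":
        return False
    try:
        parts = tuple(int(part) for part in version.split(".")[:3])
    except ValueError:
        return False  # non-numeric components are treated as unknown
    return len(parts) == 3 and parts > (2, 0, 28)


for candidate in ("azurecli@2.0.28", "azurecli@2.0.29", "azurecli@2.37.0"):
    print(candidate, is_new_schema(candidate))
```

Comparing integer tuples avoids the string-ordering pitfall where `2.0.9` would sort after `2.0.28`.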

### Query for specific command
e.g. `az account show` command

[Execute in Web](https://dataexplorer.azure.com/clusters/ddazureclients/databases/AzureCli?query=H4sIAAAAAAAAAwtKLHctS80rKXascs7J5KpRKM9ILUpVAIuFZOamFpck5hYo2CkkpudrGGZowhUEJZY75+fmJualKNjaKqgnJifnl+aVKBRn5JerAxWVJGanKhgaAABH9KmYXgAAAA==)
```
RawEventsAzCli
| where EventTimestamp > ago(1h)
| where RawCommand == 'account show'
| take 10
```

### Query for specific command group
e.g. `az storage account` command group

[Execute in Web](https://dataexplorer.azure.com/clusters/ddazureclients/databases/AzureCli?query=H4sIAAAAAAAAAz3KQQqAIBAAwHuvWDzVLT8QhPQB6QOLLSqlhm4J0eOLDl2H0VinkyKX8VKbb26ojjLBZ7MPVBjDDgOgTa103R80VpVCwLjAWzKX6tmBKJwyWgI0Jh2RxfsZVwLZP6LV9udpAAAA)
```
RawEventsAzCli
| where EventTimestamp > ago(1h)
| where RawCommand startswith "storage account"
| take 10
```

### Query for specific command with specific cli version
e.g. `az account show` command with CLI version `2.35.0`

[Execute in Web](https://dataexplorer.azure.com/clusters/ddazureclients/databases/AzureCli?query=H4sIAAAAAAAAAz3KsQqDMBAA0L1fcZt1Ea04Wlqku0hxP+LRhJpcSS4GpB9ftOD6eAOmx0JOwn3tZnP6QtLkCXZ7GktB0H7gCvjic6XzIwyYOrYW3QRtCxkqxdEJBM0pO1LveYpKRvLBsPvHNXpSs7ldiropyu0Kvgmq8gc6Rz0AigAAAA==)
```
RawEventsAzCli
| where EventTimestamp > ago(1h)
| where RawCommand == 'account show'
| where ProductVersion == 'azurecli@2.35.0'
| take 10
```

### Query for specific command with specific cli extension version
e.g. `az connectedk8s connect` command with version `1.2.8` of extension `connectedk8s`

[Execute in Web](https://dataexplorer.azure.com/clusters/ddazureclients/databases/AzureCli?query=H4sIAAAAAAAAA12OvQ6CQBCEe59iQwM0F7GiwWgIrTHEzlhsYMUL3B25W8QQH97D+BOtJjv5ZmdKHIsraXbbKe/k4g7jhSzB0ztIRY5R9bAGbEyUXGIP0I1J11DM4qTRO1QEGbBxbKVuor01PVmW5I5BZTR7XtR0xqFjgdNgqeqkoHda+3Rwij/FJY65UQp9Q5ZB6B9oqpjqNnXwOsLvyt8Nf/wmESuRzjRjS5AsH7YF5vDsAAAA)
```
RawEventsAzCli
| where EventTimestamp > ago(1h)
| extend ExtensionName = tostring(Properties["context.default.azurecli.extensionname"])
| where RawCommand == 'connectedk8s connect'
| where ExtensionName == 'connectedk8s@1.2.8'
| take 10
```

### Count specific command calls by date
e.g. `az account show` usage by date

[Execute in Web](https://dataexplorer.azure.com/clusters/ddazureclients/databases/AzureCli?query=H4sIAAAAAAAAA1WMwQrCMBAF737FuzUFL548RZDiDxR/YJssNmASyW4NFj/eEkHwOjPMSPXy5KRyXod72L1RZy6Mxq4hsijFB06gWzZH3/+CkeqQY6TkYS06ci4vSSFzrt0WybK5ElaGS2qbMz2mF1TsFJL5/+9xaOtcPJdvBc/iPgod8yqdAAAA)
```
RawEventsAzCli
| where EventTimestamp > ago(7d)
| where RawCommand == 'account show'
| summarize cnt=count() by ts=bin(EventTimestamp, 1d)
| order by ts desc
```

### Calculate success rate for specific command
e.g. `az group create` command

[Execute in Web](https://dataexplorer.azure.com/clusters/ddazureclients/databases/AzureCli?query=H4sIAAAAAAAAA1XMwQrCMBAE0Ltfsbe2J4+eKpTi1UPwB9Z0qYEkWzYbg8GPb60geBuGN2OwXJ4UNQ119O7whvIgIdi7mwuUFMMCZ8CZ29PU/YDBMnIIGCfoe2hm4byAFUKl5v/lioF2gzULWe+O9jv8uJS3KK4SWM5R2w7uLxisOo6GUva6AvX30C+gAAAA)
```
RawEventsAzCli
| where EventTimestamp > ago(7d)
| where RawCommand == 'group create'
| where EventName == 'azurecli/command'
| summarize count() by ActionResult
```
Note: `where EventName == 'azurecli/command'` is necessary because a command record can have additional records whose `EventName` is `azurecli/extension` or `azurecli/fault`. If you count these additional records, some calls are counted two or more times.
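
The query returns one row per `ActionResult` value rather than a single rate. A minimal Python sketch of the final division follows, treating every value other than `Success` as a failure. The helper name `success_rate` and the counts are made up for illustration.

```python
def success_rate(action_result_counts):
    """Share of calls whose ActionResult is 'Success', or None when there are no calls."""
    total = sum(action_result_counts.values())
    if total == 0:
        return None
    return action_result_counts.get("Success", 0) / total


# Hypothetical counts, shaped like the output of `summarize count() by ActionResult`.
counts = {"Success": 90, "UserFault": 7, "Failure": 3}
rate = success_rate(counts)
print(f"{rate:.1%}")
```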

### Query failure details for specific command
e.g. `az group create` command

[Execute in Web](https://dataexplorer.azure.com/clusters/ddazureclients/databases/AzureCli?query=H4sIAAAAAAAAA52Ry2rDMBBF9/0KdeUEhKEf4EJI3V1LcLIrJQzS1FGxHozGcRL68ZVtGseUbroV554Z3amgK4/oOK4u68bcfYnugIRieNsZi5HBBvEooPaLB728AhV0a28tOC2KQmQ1+TYIRQiM2RVaKTbeVRjbhsV9wratUhhjT+CJMYVFSeRpdw4oCsE+MhlXLzbkAxIbjG+Z8o4TnGv8gOTJ4dISqsbk2Cf3nKLZ+3IylieFoZ/7CvYf0p/03qX4TPzck08YFZmB+MtNGJGOqHMNqTyvscnHIXrKjuJA/hMVj23368qbXqXYAIGNclajnAqT86/KXwsm25DZtklI5xv+JR0B6hSZ9v4GVmNP5AkCAAA=)
```
RawEventsAzCli
| where EventTimestamp > ago(1d)
| where RawCommand == 'group create'
| where ActionResult != 'Success'
| extend ErrorType = tostring(Properties['context.default.azurecli.error_type'])
| extend ExceptionName = tostring(Properties['context.default.azurecli.exception_name'])
| extend FaultDescription = tostring(Properties['reserved.datamodel.fault.description'])
| project EventName, RawCommand, Params, ActionResult, ErrorType, ExceptionName, FaultDescription, ResultSummary, ExceptionMessage, Properties
```
Note: `ResultSummary` and `ExceptionMessage` might be suppressed to meet security & privacy requirements.