Emit structured log events to ApplicationInsights #308

Merged
merged 13 commits into from
Jul 11, 2024

3 changes: 3 additions & 0 deletions cluster/dev-values.yaml
@@ -86,6 +86,9 @@ pctasks:
keyvault:
enabled: false

applicationinsights:
enabled: false

pcdev:
services:
pctasks:
4 changes: 4 additions & 0 deletions deployment/helm/deploy-values.template.yaml
@@ -90,6 +90,10 @@ pctasks:
enabled: true
url: "{{ tf.keyvault_url }}"

applicationinsights:
enabled: true
connection_string: "{{ tf.applicationinsights_connection_string }}"

pcingress:
services:
pctasks:
@@ -199,6 +199,11 @@ spec:
value: "{{ .Values.pctasks.run.keyvault.url }}"
{{- end }}

{{- if .Values.pctasks.run.applicationinsights.enabled }}
- name: PCTASKS_RUN__APPLICATIONINSIGHTS_CONNECTION_STRING
value: "{{ .Values.pctasks.run.applicationinsights.connection_string }}"
{{- end }}

livenessProbe:
httpGet:
path: "/_mgmt/ping"
4 changes: 4 additions & 0 deletions deployment/helm/published/pctasks-server/values.yaml
@@ -174,3 +174,7 @@ pctasks:
sp_tenant_id: ""
sp_client_id: ""
sp_client_secret: ""

applicationinsights:
enabled: false
connection_string: ""
4 changes: 4 additions & 0 deletions deployment/terraform/resources/output.tf
@@ -160,6 +160,10 @@ output "instrumentation_key" {
  value = azurerm_application_insights.pctasks.instrumentation_key
}

output "applicationinsights_connection_string" {
  value = azurerm_application_insights.pctasks.connection_string
}

## PCTasks Server

output "argo_wf_node_group_name" {
35 changes: 35 additions & 0 deletions docs/getting_started/telemetry.md
@@ -0,0 +1,35 @@
# Telemetry

## Structured Logs

The pctasks executor generates logs for the following events:

1. `WorkflowCreated`
2. `WorkflowFinished`
3. `JobCreated`
4. `JobFinished`
5. `JobPartitionCreated`
6. `JobPartitionFinished`
7. `TaskCreated`
8. `TaskFinished`

In general, a record is emitted when something is created or finished, at the Workflow, Job, JobPartition, and Task levels.

Depending on the level (Workflow, Job, JobPartition, Task) the logs will contain the following fields:

| Field | Record Levels | Description |
| ----------- | ----------------------- | --------------------------------------------------------------------- |
| type | All | The event type, from the list above |
| workflowId | All | The ID of the workflow, from the workflow definition |
| datasetId | All | The ID of the dataset, from the workflow definition |
| runId | All | The ID of the workflow run, generated by pctasks |
| recordLevel | All | The level (Workflow, Job, JobPartition, Task) this record belongs to. |
| jobId | Job, JobPartition, Task | The ID of the job, from the workflow definition |
| partitionId | JobPartition, Task | The ID of the partition, from the workflow definition and pctasks |
| taskId | Task | The ID of the task, from the workflow definition and pctasks |
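
The level-to-field mapping in the table above can be written as a small lookup. This is a minimal sketch rather than the pctasks implementation; `COMMON_FIELDS`, `ID_FIELDS_BY_LEVEL`, and `expected_fields` are hypothetical names introduced only for illustration.

```python
# Hypothetical helper: which fields a record at each level should carry,
# following the table above. Not part of pctasks.
COMMON_FIELDS = ("type", "workflowId", "datasetId", "runId", "recordLevel")

# Each level adds its own ID on top of the IDs of the levels above it.
ID_FIELDS_BY_LEVEL = {
    "Workflow": (),
    "Job": ("jobId",),
    "JobPartition": ("jobId", "partitionId"),
    "Task": ("jobId", "partitionId", "taskId"),
}


def expected_fields(record_level: str) -> tuple:
    """Return the field names expected on a record at the given level."""
    return COMMON_FIELDS + ID_FIELDS_BY_LEVEL[record_level]
```

A lookup like this could back a test that checks emitted records against the documented schema.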

Depending on the record, additional fields will be included:

* `status`: Present for "Finished" events, indicating success or failure of that operation.
* `errors`: Present for `JobFinished` and `TaskFinished` events when
`status="failed"`, containing a list of errors.
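
As an illustration of the `status` and `errors` rules above, the sketch below assembles a "Finished" record as a plain dictionary. It is not the executor's actual code: `finished_event` is a hypothetical function, and the ID values and error text are placeholders.

```python
import json
from typing import List, Optional


def finished_event(
    event_type: str,
    record_level: str,
    ids: dict,
    status: str,
    errors: Optional[List[str]] = None,
) -> dict:
    """Assemble a hypothetical "Finished" record (illustrative only)."""
    record = {"type": event_type, "recordLevel": record_level, **ids, "status": status}
    # Per the docs, `errors` applies to JobFinished and TaskFinished
    # records whose status is "failed".
    if status == "failed" and event_type in ("JobFinished", "TaskFinished"):
        record["errors"] = list(errors or [])
    return record


record = finished_event(
    "TaskFinished",
    "Task",
    {
        "workflowId": "wf",  # placeholder IDs, not real values
        "datasetId": "ds",
        "runId": "run-1",
        "jobId": "job",
        "partitionId": "0",
        "taskId": "task",
    },
    status="failed",
    errors=["example error"],
)
print(json.dumps(record, indent=2))
```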
16 changes: 16 additions & 0 deletions pctasks/run/pctasks/run/argo/client.py
@@ -197,6 +197,14 @@ def submit_workflow(
        else:
            kwargs = {}

        if run_settings.applicationinsights_connection_string:
            env.append(
                EnvVar(
                    name="APPLICATIONINSIGHTS_CONNECTION_STRING",
                    value=run_settings.applicationinsights_connection_string,
                )
            )

        # Enable local secrets for development environment
        if run_settings.local_secrets:
            for env_var in [
@@ -317,6 +325,14 @@ def submit_task(
        else:
            kwargs = {}

        if run_settings.applicationinsights_connection_string:
            env.append(
                EnvVar(
                    name="APPLICATIONINSIGHTS_CONNECTION_STRING",
                    value=run_settings.applicationinsights_connection_string,
                )
            )

        templates = [
            IoArgoprojWorkflowV1alpha1Template(
                name="run-workflow",
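
Both `submit_workflow` and `submit_task` append the same variable, so the logic could live in one helper. The following is a rough sketch under stated assumptions: `EnvVar` here is a stand-in dataclass, not the Argo SDK model used in `client.py`, and `append_appinsights_env` is a hypothetical name that does not exist in this PR.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class EnvVar:
    """Stand-in for the Argo SDK EnvVar model; illustration only."""

    name: str
    value: str


def append_appinsights_env(env: List[EnvVar], connection_string: Optional[str]) -> None:
    """Append the connection string variable only when one is configured."""
    if connection_string:
        env.append(
            EnvVar(
                name="APPLICATIONINSIGHTS_CONNECTION_STRING",
                value=connection_string,
            )
        )


env: List[EnvVar] = []
append_appinsights_env(env, None)  # unset: env stays empty
append_appinsights_env(env, "InstrumentationKey=00000000-0000-0000-0000-000000000000")
```

Keeping the check in one place would mean a future change to the variable name touches a single function instead of two call sites.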
2 changes: 2 additions & 0 deletions pctasks/run/pctasks/run/settings.py
@@ -121,6 +121,8 @@ def section_name(cls) -> str:
    # Type of workflow runner to use.
    workflow_runner_type: WorkflowRunnerType = WorkflowRunnerType.ARGO

    applicationinsights_connection_string: Optional[str] = None

    @property
    def batch_settings(self) -> BatchSettings:
        if not (self.batch_url and self.batch_key and self.batch_default_pool_id):
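
The Helm template sets `PCTASKS_RUN__APPLICATIONINSIGHTS_CONNECTION_STRING`, which presumably populates the new `applicationinsights_connection_string` field. The sketch below shows that mapping with the standard library only. It assumes the `PCTASKS_RUN__` prefix corresponds to the run settings section and does not reproduce pctasks' real settings loader; `RunSettingsSketch` and `load_sketch` are hypothetical names.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

ENV_PREFIX = "PCTASKS_RUN__"  # assumed prefix, taken from the Helm template


@dataclass
class RunSettingsSketch:
    """Hypothetical stand-in for the real RunSettings class."""

    applicationinsights_connection_string: Optional[str] = None


def load_sketch(environ: Mapping[str, str] = os.environ) -> RunSettingsSketch:
    """Read the connection string from the environment, if present."""
    # An unset or empty variable leaves the field as None, so the
    # optional field keeps its default.
    value = environ.get(ENV_PREFIX + "APPLICATIONINSIGHTS_CONNECTION_STRING") or None
    return RunSettingsSketch(applicationinsights_connection_string=value)
```

Because the field defaults to `None`, deployments that leave `applicationinsights.enabled` set to `false` need no further configuration.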