Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Bug] Database errors in microbatches are missing "node_info" in structured logs #10840

Closed
2 tasks done
Tracked by #10624
davidharting opened this issue Oct 10, 2024 · 2 comments · Fixed by #11003
Closed
2 tasks done
Tracked by #10624
Assignees
Labels
bug Something isn't working Impact: Exp incremental Incremental modeling with dbt logging microbatch Issues related to the microbatch incremental strategy pre-release Bug not yet in a stable release

Comments

@davidharting
Copy link

davidharting commented Oct 10, 2024

Is this a new bug in dbt-core?

  • I believe this is a new bug in dbt-core
  • I have searched the existing issues, and I could not find an existing issue for this bug

Current Behavior

When using structured JSON logs, database errors in a node do not have node_info.

Expected Behavior

I expected database error message to have node_info attached to the, like they do for regular models.

Example database error message for a table-materialized model. It contains the `data.node_info` key.
{
  "data": {
    "msg": "Database Error in model simulated_raw_events (models/events/simulated_raw_events.sql)\n  001003 (42000): SQL compilation error:\n  syntax error line 10 at position 8 unexpected ''2024-10-08''.\n  syntax error line 18 at position 8 unexpected ''2024-10-08''.\n  compiled code at target/run/microbatch_example/models/events/simulated_raw_events.sql",
    "node_info": {
      "materialized": "table",
      "meta": {},
      "node_finished_at": "",
      "node_name": "simulated_raw_events",
      "node_path": "events/simulated_raw_events.sql",
      "node_relation": {
        "alias": "simulated_raw_events",
        "database": "analytics_dev",
        "relation_name": "analytics_dev.dbt_dharting.simulated_raw_events",
        "schema": "dbt_dharting"
      },
      "node_started_at": "",
      "node_status": "None",
      "resource_type": "model",
      "unique_id": "model.microbatch_example.simulated_raw_events"
    }
  },
  "info": {
    "category": "",
    "code": "Z024",
    "extra": {},
    "invocation_id": "94829561-4308-4c87-b607-cbc2c3b3a55c",
    "level": "error",
    "msg": "  Database Error in model simulated_raw_events (models/events/simulated_raw_events.sql)\n  001003 (42000): SQL compilation error:\n  syntax error line 10 at position 8 unexpected ''2024-10-08''.\n  syntax error line 18 at position 8 unexpected ''2024-10-08''.\n  compiled code at target/run/microbatch_example/models/events/simulated_raw_events.sql",
    "name": "RunResultError",
    "pid": 59726,
    "thread": "MainThread",
    "ts": "2024-10-10T19:24:10.953726Z"
  }
}

Steps To Reproduce

You can watch me perform these reproduction steps in this Zoom clip.

  1. Initialize a new project using dbt-snowflake >= 1.9.0
  2. Create a new folder: models/events. Create two files in that folder.

File 1: models/events/simulated_raw_events.sql

{{ config(
    materialized='table'
) }}

{% set num_iterations = 100 %}

{% for i in range(1, num_iterations + 1) %}
    SELECT
        {{ i }} AS event_id
        '2024-10-08'::timestamptz + interval '{{ i }} hours' AS occurred_at,
        uniform(0, 1000, random()) AS value
    {% if not loop.last %}
    UNION ALL
    {% endif %}
{% endfor %}

File 2: models/events/stg_raw_events_inc.sql (missing a comma, so it will cause a database error)

{{ config(
    materialized='incremental',
    incremental_strategy='microbatch',
    event_time='occurred_at',
    begin='2019-12-31',
    batch_size='day',
    lookback=3
) }}

with renamed as (

    select
        event_id,
        occurred_at -- Note the invalid sql here, missing a coma
        value
    from {{ ref("simulated_raw_events") }}

)

select * from renamed
  1. Build the "simulated raw events" table: dbt build -m simulated_raw_events (no issues here)

  2. Build the incremental model, with JSON logging:
    DBT_EXPERIMENTAL_MICROBATCH=True dbt --log-format=json build -m stg_raw_events_inc --log-path=logs/1

  3. Inspect the log file.
    a. Note that you can't find the compilation errors when looking at the node-specific logs:
    cat ./logs/1/dbt.log | jq 'select(.info.level == "error" and .data.node_info.unique_id != null)'
    b. But you can see them when looking at "system logs" (logs not associated with a node):
    cat ./logs/1/dbt.log | jq 'select(.info.level == "error" and .data.node_info.unique_id == null)'

I've included the output from "logs/1/dbt.log" from my steps above in the "relevant log output" section.

Relevant log output

No response

Environment

- OS: MacOS 15.0.1
- Python: 3.12.5
- dbt: Core: 1.9.0-b2, Snowflake:  1.9.0b1

Which database adapter are you using with dbt?

No response

Additional Context

dbt.log

@davidharting davidharting added bug Something isn't working triage Impact: Exp labels Oct 10, 2024
@davidharting
Copy link
Author

I labeled this as Impact: Exp, because we rely on data.node_info.unique_id in order to gather log messages for a particular node in one place.

@davidharting davidharting changed the title [Bug] <title> [Bug] Database errors in microbatches are missing "node_info" in structured logs Oct 10, 2024
@dbeatty10 dbeatty10 added incremental Incremental modeling with dbt logging pre-release Bug not yet in a stable release labels Oct 10, 2024
@davidharting
Copy link
Author

Thanks for the triage @dbeatty10!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working Impact: Exp incremental Incremental modeling with dbt logging microbatch Issues related to the microbatch incremental strategy pre-release Bug not yet in a stable release
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants