
DbtRuntimeError.__init__() missing 1 required positional argument: 'msg' #169

Closed
ozeanith opened this issue Jul 19, 2023 · 7 comments
Labels: bug (Something isn't working)

@ozeanith

Describe the bug

I am getting this error when trying to build both materialized tables and incremental tables. The odd thing is that sometimes the tables build fine and sometimes I get this error:

DbtRuntimeError.__init__() missing 1 required positional argument: 'msg'
    
    > in macro materialization_table_clickhouse (macros/materializations/table.sql)
    > called by model stg_transaction_event_card (models/prd/staging/stg_transaction_event_card.sql)

When successful, the table is built with all the data. But when it fails, an empty table is created in my database.

Steps to reproduce

For materialized tables

  1. profiles.yml
local_clickhouse:
      type: clickhouse
      schema: warehouse_dev
      user: "{{ env_var('CLICKHOUSE_USER') }}"
      password: "{{ env_var('CLICKHOUSE_PASS') }}"
      port: 9000
      host: localhost
      cluster: default
      retries: 1
      connect_timeout: 10
      send_receive_timeout: 3600 # 1hr
  2. stg_transaction_event_card.sql example code:
{{
    config(
        materialized = "table",
        engine = "ReplicatedMergeTree",
        unique_key = "transaction_id",
        pre_hook = "TRUNCATE TABLE IF EXISTS {{ this }} ON CLUSTER default",
        indexes=[
          {'columns': ['transaction_id'], 'type': 'btree', 'unique': True },
          {'columns': ['last_mod'], 'type': 'btree'}
        ]
    )
}}

WITH txns_events AS (
    SELECT
        col1,
        col2,
        col3
    FROM postgresql('host:port', 'db_name', 'table_name', 'db_user', 'db_password', 'schema')
)

SELECT
    col1,
    col2,
    col3
FROM txns_events

For incremental tables

  1. profiles.yml
local_clickhouse:
      type: clickhouse
      schema: warehouse_dev
      user: "{{ env_var('CLICKHOUSE_USER') }}"
      password: "{{ env_var('CLICKHOUSE_PASS') }}"
      port: 9000
      host: localhost
      cluster: default
      retries: 1
      connect_timeout: 10
      send_receive_timeout: 3600 # 1hr
      use_lw_deletes: true
  2. stg_transaction_event_card.sql example code:
{{
    config(
        materialized = "incremental",
        unique_key = "transaction_id",
        indexes=[
          {'columns': ['transaction_id'], 'type': 'btree', 'unique': True },
          {'columns': ['last_mod'], 'type': 'btree'}
        ]
    )
}}

WITH txns_events AS (
    SELECT
        col1,
        col2,
        col3
    FROM postgresql('host:port', 'db_name', 'table_name', 'db_user', 'db_password', 'schema')

    {% if is_incremental() %}

        WHERE last_mod > (SELECT max(last_mod) FROM {{ this }})

    {% endif %}
)

SELECT
    col1,
    col2,
    col3
FROM txns_events

Configuration

Environment

  • dbt version: dbt-core=1.4.6
  • dbt-clickhouse version: dbt-clickhouse=1.4.2
  • Python version: Python=3.11.4
  • Operating system: M1 Mac
ozeanith added the bug label on Jul 19, 2023
@genzgd (Contributor) commented Jul 19, 2023

It looks like there's a bug in error message handling/logging that keeps you from seeing the underlying ClickHouse exception. I've pushed a fix for that in the new release, 1.4.4. My guess is that bug is hiding the real error message, which might be related to some Postgres issue. If you can try dbt-clickhouse 1.4.4, that might point to the root cause.

You might also check your ClickHouse server logs for related errors.

@ozeanith (Author)

@genzgd I've updated dbt-clickhouse to version 1.4.4 and now I'm seeing this error:

Completed with 1 error and 0 warnings:
14:57:33  
14:57:33  Runtime Error in model stg_transaction_event_card (models/prd/staging/stg_transaction_event_card.sql)
14:57:33    ClickHouse exception:  Unexpected EOF while reading bytes

I found out that the error message shows up when I run the model with this syntax:

WITH txns_events AS (
    SELECT
        col1,
        col2,
        col3
    FROM postgresql('host:port', 'db_name', 'table_name', 'db_user', 'db_password', 'schema')
)

SELECT
    col1,
    col2,
    col3
FROM txns_events

But when running the model like this, the table is successfully built:

SELECT
    col1,
    col2,
    col3
FROM postgresql('host:port', 'db_name', 'table_name', 'db_user', 'db_password', 'schema')

Here are some server logs, though to be honest these also show up when dbt succeeds:

<Debug> system.metric_log (dc9bb045-6d58-43be-9cc9-74b41fd8f736): Removing 1 parts from filesystem (serially): Parts: [202307_253284_253284_0]
<Debug> system.metric_log (dc9bb045-6d58-43be-9cc9-74b41fd8f736): Removing 1 parts from memory: Parts: [202307_253284_253284_0]
<Trace> AsynchronousMetrics: MemoryTracking: was 955.30 MiB, peak 2.82 GiB, free memory in arenas 58.20 MiB, will set to 966.10 MiB (RSS), difference: 10.80 MiB
<Trace> AsynchronousMetrics: MemoryTracking: was 1.07 GiB, peak 2.82 GiB, free memory in arenas 57.03 MiB, will set to 979.50 MiB (RSS), difference: -116.07 MiB
<Trace> SystemLog (system.trace_log): Flushing system log, 81 entries to flush up to offset 10193
<Trace> system.trace_log (6fd4d340-82b6-4bd8-b1fa-d74d84939d91): Trying to reserve 1.00 MiB using storage policy from min volume index 0
<Trace> DiskLocal: Reserved 1.00 MiB on local disk `default`, having unreserved 73.72 GiB.
<Trace> MergedBlockOutputStream: filled checksums 202307_49_49_0 (state Temporary)
<Trace> system.trace_log (6fd4d340-82b6-4bd8-b1fa-d74d84939d91): Renaming temporary part tmp_insert_202307_49_49_0 to 202307_372917_372917_0 with tid (1, 1, 00000000-0000-0000-0000-000000000000).
<Trace> SystemLog (system.trace_log): Flushed system log up to offset 10193
<Trace> system.asynchronous_metric_log (2db8234d-75a8-4a52-af05-9478570fd30b): Found 2 old parts to remove. Parts: [202307_467011_467593_244, 202307_467597_467597_0]
<Debug> system.asynchronous_metric_log (2db8234d-75a8-4a52-af05-9478570fd30b): Removing 2 parts from filesystem (serially): Parts: [202307_467011_467593_244, 202307_467597_467597_0]
<Debug> system.asynchronous_metric_log (2db8234d-75a8-4a52-af05-9478570fd30b): Removing 2 parts from memory: Parts: [202307_467011_467593_244, 202307_467597_467597_0]
<Trace> AsynchronousMetrics: MemoryTracking: was 979.46 MiB, peak 2.82 GiB, free memory in arenas 52.82 MiB, will set to 990.95 MiB (RSS), difference: 11.48 MiB
<Trace> SystemLog (system.metric_log): Flushing system log, 7 entries to flush up to offset 369
<Trace> system.metric_log (dc9bb045-6d58-43be-9cc9-74b41fd8f736): Trying to reserve 1.00 MiB using storage policy from min volume index 0
<Trace> DiskLocal: Reserved 1.00 MiB on local disk `default`, having unreserved 73.72 GiB.
<Trace> MergedBlockOutputStream: filled checksums 202307_49_49_0 (state Temporary)
<Trace> system.metric_log (dc9bb045-6d58-43be-9cc9-74b41fd8f736): Renaming temporary part tmp_insert_202307_49_49_0 to 202307_253347_253347_0 with tid (1, 1, 00000000-0000-0000-0000-000000000000).
<Debug> system.metric_log (dc9bb045-6d58-43be-9cc9-74b41fd8f736) (MergerMutator): Selected 6 parts from 202307_252732_253342_122 to 202307_253347_253347_0
<Trace> system.metric_log (dc9bb045-6d58-43be-9cc9-74b41fd8f736): Trying to reserve 1.00 MiB using storage policy from min volume index 0
<Trace> DiskLocal: Reserved 1.00 MiB on local disk `default`, having unreserved 73.72 GiB.

@genzgd (Contributor) commented Jul 20, 2023

That looks like a network error in the underlying HTTP request from dbt to your ClickHouse server. How long does this select usually take? Where is the ClickHouse server you are connecting to?

@genzgd (Contributor) commented Jul 20, 2023

Oh, you're doing this on localhost, so it's probably an error between your localhost and the Postgres server?

@ozeanith (Author) commented Jul 20, 2023

The query takes around 20-25 minutes for materialized=table; for incremental it takes roughly 1 minute 30 seconds. But I don't understand why there would be an error between my localhost and Postgres specifically when using WITH statements. @genzgd

@genzgd (Contributor) commented Jul 20, 2023

It could just be a coincidence. Unfortunately the error still doesn't provide much information once it's been processed through several dbt layers. You could try running the same query using just the underlying Python client (clickhouse-driver if you are using the native protocol over port 9000, clickhouse-connect if you are using HTTP over port 8123). The error is currently coming directly from the clickhouse-driver query, so you should see the same thing, hopefully with more information.

Does the dbt debug log show anything in addition to that error message?
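
For reference, a minimal sketch of that kind of standalone test with clickhouse-driver (native protocol over port 9000); the connection details and query below are placeholders, not values from this issue:

from clickhouse_driver import Client

# Placeholder connection details; mirror whatever is in profiles.yml.
client = Client(
    host='localhost',
    port=9000,
    user='default',
    password='',
    connect_timeout=10,
    send_receive_timeout=3600,  # same 1hr window as the dbt profile
)

# Placeholder query standing in for the model's SELECT over the
# postgresql() table function.
rows = client.execute(
    "SELECT col1, col2, col3 "
    "FROM postgresql('host:port', 'db_name', 'table_name', 'db_user', 'db_password', 'schema')"
)
print(f"fetched {len(rows)} rows")

If this reproduces the "Unexpected EOF while reading bytes" error, the problem sits below dbt entirely, and the driver's own traceback should be more specific.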

@ozeanith (Author) commented Jul 28, 2023

Found my issue!

  • So basically it wasn't a network error, it was a memory issue. Scaling up my memory did the trick!
  • Other people also suggested raising the send/receive timeout as well as max_insert_threads; a sketch of those profile settings follows below.
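
For anyone who lands here with the same symptom, a hedged sketch of where those knobs live in the profile. The values are illustrative, and custom_settings is the dbt-clickhouse profile key for passing arbitrary ClickHouse settings through the connection (check that your dbt-clickhouse version supports it):

local_clickhouse:
  type: clickhouse
  # ... connection details as above ...
  send_receive_timeout: 7200   # raise beyond 3600 if long postgresql() pulls still fail
  custom_settings:             # passed through to ClickHouse as session settings
    max_insert_threads: 4      # illustrative value; tune to the server's cores and memory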

genzgd closed this as completed on Aug 2, 2023