
Is threads > 0 supported? #197

Closed
the4thamigo-uk opened this issue Oct 16, 2023 · 12 comments
Labels
bug Something isn't working

Comments

@the4thamigo-uk
Contributor

We have a ClickHouse cluster with more than one node.
Hi, I have a dbt pipeline which works fine with threads = 1 but yields a variety of irregular errors when I run with threads > 1. Is this expected to work?

the4thamigo-uk added the bug label Oct 16, 2023
@genzgd
Contributor

genzgd commented Oct 16, 2023

We run tests against ClickHouse Cloud with threads = 4. What type of errors are you seeing? Support for distributed tables is also experimental, so that might be related.

@the4thamigo-uk
Contributor Author

the4thamigo-uk commented Oct 18, 2023

First of all, I often see this right after running dbt run:

19:07:45  Encountered an error:
Database Error
  Database Error
    Failed to create dbt_dev database due to ClickHouse exception

Then various steps will often fail, and some steps will be skipped, in ways that I don't see when running on a single thread. Here is one example of an error I never see single-threaded:

19:09:24  Running with dbt=1.4.9
19:09:24  [WARNING]: Configuration paths exist in your dbt_project.yml file which do not apply to any resources.
There are 1 unused configuration paths:
- models.clickhouse.lookups
19:09:24  Found 15 models, 0 tests, 0 snapshots, 0 analyses, 325 macros, 1 operation, 1 seed file, 11 sources, 0 exposures, 0 metrics
19:09:24  
19:09:24  Concurrency: 4 threads (target='dev')
19:09:24  
19:09:24  1 of 15 START sql incremental model dbt_dev_intermediate.int_sales__daily__external  [RUN]
19:09:24  2 of 15 START sql table model dbt_dev_staging.stg_configuration__adunit__site__placement  [RUN]
19:09:24  3 of 15 START sql table model dbt_dev_staging.stg_configuration__placements .... [RUN]
19:09:24  4 of 15 START sql table model dbt_dev_staging.stg_configuration__sites ......... [RUN]
19:09:25  2 of 15 OK created sql table model dbt_dev_staging.stg_configuration__adunit__site__placement  [OK in 0.83s]
19:09:25  5 of 15 START sql table model dbt_dev_staging.stg_dim__bidders ................. [RUN]
19:09:26  4 of 15 OK created sql table model dbt_dev_staging.stg_configuration__sites .... [OK in 1.07s]
19:09:26  6 of 15 START sql table model dbt_dev_staging.stg_dim__channels ................ [RUN]
19:09:26  3 of 15 OK created sql table model dbt_dev_staging.stg_configuration__placements  [OK in 1.13s]
19:09:26  7 of 15 START sql table model dbt_dev_staging.stg_dim__countries ............... [RUN]
19:09:26  5 of 15 OK created sql table model dbt_dev_staging.stg_dim__bidders ............ [OK in 0.75s]
19:09:26  8 of 15 START sql table model dbt_dev_staging.stg_dim__device_types ............ [RUN]
19:09:26  6 of 15 OK created sql table model dbt_dev_staging.stg_dim__channels ........... [OK in 0.79s]
19:09:26  9 of 15 START sql table model dbt_dev_staging.stg_dim__media_types ............. [RUN]
19:09:26  7 of 15 OK created sql table model dbt_dev_staging.stg_dim__countries .......... [OK in 0.88s]
19:09:26  10 of 15 START sql view model dbt_dev_staging.stg_landscape__realtime .......... [RUN]
19:09:27  8 of 15 OK created sql table model dbt_dev_staging.stg_dim__device_types ....... [OK in 0.69s]
19:09:27  9 of 15 OK created sql table model dbt_dev_staging.stg_dim__media_types ........ [OK in 0.67s]
19:09:27  10 of 15 OK created sql view model dbt_dev_staging.stg_landscape__realtime ..... [OK in 0.91s]
19:09:27  11 of 15 START sql incremental model dbt_dev_staging.stg_landscape__hourly ..... [RUN]
19:09:31  11 of 15 ERROR creating sql incremental model dbt_dev_staging.stg_landscape__hourly  [ERROR in 3.83s]
19:09:31  12 of 15 SKIP relation dbt_dev_staging.stg_landscape__daily .................... [SKIP]
19:09:31  13 of 15 SKIP relation dbt_dev_intermediate.int_sales__daily__internal ......... [SKIP]
19:11:50  1 of 15 OK created sql incremental model dbt_dev_intermediate.int_sales__daily__external  [OK in 145.72s]
19:11:50  14 of 15 SKIP relation dbt_dev_intermediate.int_sales__daily ................... [SKIP]
19:11:50  15 of 15 SKIP relation dbt_dev_marts.mrt_sales__daily .......................... [SKIP]
19:11:50  
19:11:50  Running 1 on-run-end hook
19:11:50  1 of 1 START hook: clickhouse.on-run-end.0 ..................................... [RUN]
19:11:51  1 of 1 OK hook: clickhouse.on-run-end.0 ........................................ [OK in 0.30s]
19:11:51  
19:11:51  
19:11:51  Finished running 8 table models, 4 view models, 3 incremental models, 1 hook in 0 hours 2 minutes and 26.40 seconds (146.40s).
19:11:51  
19:11:51  Completed with 1 error and 0 warnings:
19:11:51  
19:11:51  Database Error in model stg_landscape__hourly (models/staging/landscape/stg_landscape__hourly.sql)
19:11:51    :HTTPDriver for http://clickhouse-dev.dragonite.venatusmedia.com:8123 returned response code 500)
19:11:51     Code: 36. DB::Exception: ALTER UPDATE/ALTER DELETE statement with subquery may be nondeterministic, see allow_nondeterministic_mutations setting. (BAD_ARGUMENTS) (version 23.8.1.2992 (official build))
19:11:51    compiled Code at target/run/clickhouse/models/staging/landscape/stg_landscape__hourly.sql
19:11:51  
19:11:51  Done. PASS=10 WARN=0 ERROR=1 SKIP=4 TOTAL=15

Note that, as far as I can tell, there is no non-deterministic code in that model; it is just a time aggregation over another ClickHouse table, configured as:

{{
  config(
    materialized='incremental',
    incremental_strategy='delete_insert',
    engine='ReplicatedMergeTree',
    on_schema_change='append_new_columns',
    unique_key='x,y,z'
  )
}}

@genzgd
Contributor

genzgd commented Oct 18, 2023

It would be useful to understand (maybe from the ClickHouse logs?) what the exception is for the create database DDL query. It seems like it might be permissions related.

Certain incremental materializations using "lightweight deletes" with the delete_insert strategy use subqueries that ClickHouse may consider "non-deterministic" (although they aren't actually non-deterministic, because dbt should be executing the affected queries in sequence as part of the materialization). Recent versions of dbt-clickhouse attempt to automatically send the allow_nondeterministic_mutations setting, but that doesn't always seem to work (and could also be permissions-related).
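
To make the failure mode concrete, here is a minimal sketch (not the exact SQL the adapter generates) of the kind of mutation that triggers Code 36; the table, staging table, and key columns are placeholders loosely based on the config above:

-- Hedged illustration only: the delete+insert strategy issues a mutation of roughly this
-- shape, and the subquery against the freshly built batch is what ClickHouse flags.
ALTER TABLE dbt_dev_staging.stg_landscape__hourly
    DELETE WHERE (x, y, z) IN (SELECT x, y, z FROM dbt_dev_staging.new_batch);
-- With allow_nondeterministic_mutations = 0 (the default) this fails with
-- Code: 36 (BAD_ARGUMENTS), even though dbt runs the delete and the insert sequentially.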

Are there any additional exceptions available in the ClickHouse server or query logs?

@the4thamigo-uk
Contributor Author

Sorry for the delay getting back, but this is an error from the ClickHouse log at the time of the initial error:

2023.10.27 16:05:42.038408 [ 619 ] {1b9a3f62-1d96-45c7-a9d3-d5b467da31d2} <Error> executeQuery: Code: 82. DB::Exception: There was an error on [chi-repl-04-23-replcluster-23-0-0:9000]: Code: 82. DB::Exception: Database dbt_dev already exists. (DATABASE_ALREADY_EXISTS) (version 23.8.1.2992 (official build)). (DATABASE_ALREADY_EXISTS) (version 23.8.1.2992 (official build)) (from 10.16.0.78:40268) (in query: CREATE DATABASE dbt_dev ON CLUSTER "replcluster-23" ), Stack trace (when copying this message, always include the lines below):

0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000c604e57 in /usr/bin/clickhouse
1. DB::DDLQueryStatusSource::generate() @ 0x00000000122944a3 in /usr/bin/clickhouse
2. DB::ISource::tryGenerate() @ 0x000000001315ab35 in /usr/bin/clickhouse
3. DB::ISource::work() @ 0x000000001315a686 in /usr/bin/clickhouse
4. DB::ExecutionThreadContext::executeTask() @ 0x000000001317205a in /usr/bin/clickhouse
5. DB::PipelineExecutor::executeStepImpl(unsigned long, std::atomic<bool>*) @ 0x0000000013168e90 in /usr/bin/clickhouse
6. DB::PipelineExecutor::execute(unsigned long, bool) @ 0x00000000131681c2 in /usr/bin/clickhouse
7. DB::CompletedPipelineExecutor::execute() @ 0x0000000013166a72 in /usr/bin/clickhouse
8. DB::executeQuery(DB::ReadBuffer&, DB::WriteBuffer&, bool, std::shared_ptr<DB::Context>, std::function<void (DB::QueryResultDetails const&)>, std::optional<DB::FormatSettings> const&) @ 0x00000000122a9f8c in /usr/bin/clickhouse
9. DB::HTTPHandler::processQuery(DB::HTTPServerRequest&, DB::HTMLForm&, DB::HTTPServerResponse&, DB::HTTPHandler::Output&, std::optional<DB::CurrentThread::QueryScope>&) @ 0x00000000130b2d4f in /usr/bin/clickhouse
10. DB::HTTPHandler::handleRequest(DB::HTTPServerRequest&, DB::HTTPServerResponse&) @ 0x00000000130b7629 in /usr/bin/clickhouse
11. DB::HTTPServerConnection::run() @ 0x0000000013127fd2 in /usr/bin/clickhouse
12. Poco::Net::TCPServerConnection::start() @ 0x0000000015b10374 in /usr/bin/clickhouse
13. Poco::Net::TCPServerDispatcher::run() @ 0x0000000015b11571 in /usr/bin/clickhouse
14. Poco::PooledThread::run() @ 0x0000000015c47da7 in /usr/bin/clickhouse
15. Poco::ThreadImpl::runnableEntry(void*) @ 0x0000000015c4607c in /usr/bin/clickhouse
16. ? @ 0x00007fa0c226c609 in ?
17. ? @ 0x00007fa0c2191133 in ?

2023.10.27 16:05:42.038758 [ 619 ] {1b9a3f62-1d96-45c7-a9d3-d5b467da31d2} <Error> DynamicQueryHandler: Code: 82. DB::Exception: There was an error on [chi-repl-04-23-replcluster-23-0-0:9000]: Code: 82. DB::Exception: Database dbt_dev already exists. (DATABASE_ALREADY_EXISTS) (version 23.8.1.2992 (official build)). (DATABASE_ALREADY_EXISTS), Stack trace (when copying this message, always include the lines below):

0. DB::Exception::Exception(DB::Exception::MessageMasked&&, int, bool) @ 0x000000000c604e57 in /usr/bin/clickhouse
1. DB::DDLQueryStatusSource::generate() @ 0x00000000122944a3 in /usr/bin/clickhouse
2. DB::ISource::tryGenerate() @ 0x000000001315ab35 in /usr/bin/clickhouse
3. DB::ISource::work() @ 0x000000001315a686 in /usr/bin/clickhouse
4. DB::ExecutionThreadContext::executeTask() @ 0x000000001317205a in /usr/bin/clickhouse
5. DB::PipelineExecutor::executeStepImpl(unsigned long, std::atomic<bool>*) @ 0x0000000013168e90 in /usr/bin/clickhouse
6. DB::PipelineExecutor::execute(unsigned long, bool) @ 0x00000000131681c2 in /usr/bin/clickhouse
7. DB::CompletedPipelineExecutor::execute() @ 0x0000000013166a72 in /usr/bin/clickhouse
8. DB::executeQuery(DB::ReadBuffer&, DB::WriteBuffer&, bool, std::shared_ptr<DB::Context>, std::function<void (DB::QueryResultDetails const&)>, std::optional<DB::FormatSettings> const&) @ 0x00000000122a9f8c in /usr/bin/clickhouse
9. DB::HTTPHandler::processQuery(DB::HTTPServerRequest&, DB::HTMLForm&, DB::HTTPServerResponse&, DB::HTTPHandler::Output&, std::optional<DB::CurrentThread::QueryScope>&) @ 0x00000000130b2d4f in /usr/bin/clickhouse
10. DB::HTTPHandler::handleRequest(DB::HTTPServerRequest&, DB::HTTPServerResponse&) @ 0x00000000130b7629 in /usr/bin/clickhouse
11. DB::HTTPServerConnection::run() @ 0x0000000013127fd2 in /usr/bin/clickhouse
12. Poco::Net::TCPServerConnection::start() @ 0x0000000015b10374 in /usr/bin/clickhouse
13. Poco::Net::TCPServerDispatcher::run() @ 0x0000000015b11571 in /usr/bin/clickhouse
14. Poco::PooledThread::run() @ 0x0000000015c47da7 in /usr/bin/clickhouse
15. Poco::ThreadImpl::runnableEntry(void*) @ 0x0000000015c4607c in /usr/bin/clickhouse
16. ? @ 0x00007fa0c226c609 in ?
17. ? @ 0x00007fa0c2191133 in ?
 (version 23.8.1.2992 (official build))

I should also mention that we are running against a cluster with one shard consisting of two replicas.

@the4thamigo-uk
Contributor Author

the4thamigo-uk commented Oct 27, 2023

For the non-deterministic error, I should also mention that this incremental materialization is based on a remote view to another ClickHouse instance. I'm not sure if this is relevant, though.

i.e. https://clickhouse.com/docs/en/sql-reference/table-functions/remote
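
For reference, a hedged sketch of what such a remote view looks like; the host, database, table, and credentials below are placeholders, not our actual setup:

-- Illustrative use of the remote() table function only; all names are placeholders.
CREATE VIEW stg_source__remote AS
SELECT *
FROM remote('other-clickhouse-host:9000', source_db.source_table, 'default', '');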

@genzgd
Contributor

genzgd commented Oct 27, 2023

Ah, the DATABASE_ALREADY_EXISTS error is a pretty obvious bug in the way the client handles creating a database on startup for clusters. If the database exists on one replica but not on the one that is checked, it runs CREATE DATABASE ON CLUSTER incorrectly. I'll have to give it some thought, but that should be a pretty easy fix.
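
One possible direction, sketched here only to illustrate the race (the cluster and database names are taken from the server log above; this is not necessarily how the eventual fix was implemented):

-- What the log shows was run, which fails as soon as any replica already has the database:
CREATE DATABASE dbt_dev ON CLUSTER "replcluster-23";
-- An idempotent variant that tolerates the database already existing on some replicas:
CREATE DATABASE IF NOT EXISTS dbt_dev ON CLUSTER "replcluster-23";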

The fact that you're doing something with the remote table function may in fact be contributing to the other weird error. I know it's unlikely, but if you could find a reproducible example it might be possible to track it down.

@the4thamigo-uk
Contributor Author

I'll try to find some time to create an example.

@supnobita

For those who have this issue like me: setting allow_nondeterministic_mutations = 1 in the user profile will make it work. https://clickhouse.com/docs/en/operations/settings/settings#allow_nondeterministic_mutations
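
A hedged sketch of two ways to apply that setting ('dbt_user' is a placeholder; adjust to however your users and profiles are managed):

-- Per session, only for the current connection:
SET allow_nondeterministic_mutations = 1;

-- Persistently for the dbt user, if you manage access control via SQL
-- (repeat on each node, or add ON CLUSTER if your access entities are not replicated):
ALTER USER dbt_user SETTINGS allow_nondeterministic_mutations = 1;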

@the4thamigo-uk
Contributor Author

I should also point out that this has recently been occurring on a single thread, so I think it is not related to multiple threads as first thought.

@genzgd
Contributor

genzgd commented Nov 24, 2023

If you are continuing to get this error:

ALTER UPDATE/ALTER DELETE statement with subquery may be nondeterministic, see allow_nondeterministic_mutations setting.

I believe @supnobita's suggestion should provide a valid workaround if you make sure the active profile has allow_nondeterministic_mutations = 1 enabled on all nodes. dbt-clickhouse tries to send that setting with every relevant request, but it may not be effective due to permissions or other issues. I'll look at the behavior here and try to make it more foolproof from the dbt-clickhouse side, but there's only so much the plugin can do if the appropriate setting is not enabled, since it's difficult for the client to validate the setting on all possible nodes.
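
If in doubt, one way to check what a given connection actually sees (run against each node with the dbt credentials) is something like:

-- Shows the effective value and whether the setting is read-only for this user:
SELECT name, value, readonly
FROM system.settings
WHERE name = 'allow_nondeterministic_mutations';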

Similarly, ensuring that the dbt database already exists on all nodes should also fix the initial create database error. For similar reasons this is difficult to ensure from dbt-clickhouse.
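
After pre-creating the database manually (e.g. with the IF NOT EXISTS ... ON CLUSTER form sketched above), one way to confirm it exists on every replica is a query along these lines (database and cluster names are taken from the log earlier in this thread):

-- Lists each replica of the cluster that has the dbt_dev database:
SELECT hostName() AS host, name
FROM clusterAllReplicas('replcluster-23', system.databases)
WHERE name = 'dbt_dev';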

@genzgd
Contributor

genzgd commented Nov 24, 2023

Just to be clear, the delete+insert incremental strategy involves what ClickHouse regards as a non-deterministic mutation. It's a consequence of how "lightweight deletes" are implemented.

@genzgd
Contributor

genzgd commented Dec 15, 2023

As of version 1.7.1, there are now some pre-checks in place to ensure the database exists on the entire cluster and that non-deterministic mutations are allowed in connection with the delete+insert strategy.

genzgd closed this as completed Dec 15, 2023