[exporter/clickhouse] Injection of projection into table schema via config #29443

Closed
fredthomsen opened this issue Nov 21, 2023 · 13 comments
Labels: documentation, enhancement, exporter/clickhouse, Stale

Comments

@fredthomsen
Contributor

fredthomsen commented Nov 21, 2023

Component(s)

exporter/clickhouse

Is your feature request related to a problem? Please describe.

I am using the clickhouseexporter to get data into ClickHouse, and multiple different services/applications read that data, but the current table schema may not allow a query to be done in a way that is scalable or performs well.

Describe the solution you'd like

Injection of projection in the schema DDL via configuration as follows:

clickhouse:
  endpoint: tcp://clickhouse:9000
  projections:
    gauge:
      app1_query_projection: |
        SELECT * 
        ORDER BY toUnixTimestamp64Nano(TimeUnix), Attributes['context.id'], ResourceAttributes['resource.name']

Describe alternatives you've considered

More customizable DDL statements for table schema creation in general.

Additional context

No response

@fredthomsen added the enhancement and needs triage labels Nov 21, 2023
Contributor

Pinging code owners:

See Adding Labels via Comments if you do not have permissions to add labels yourself.

@crobert-1
Member

Hello @fredthomsen, can you format your suggested configuration to make the indentation of options clear? I'm having a hard time following when it's one line.

@fredthomsen
Contributor Author

> Hello @fredthomsen, can you format your suggested configuration to make the indentation of options clear? I'm having a hard time following when it's one line.

Ok @crobert-1, yeah, that did look bad. Update made.

@crobert-1
Member

crobert-1 commented Dec 8, 2023

I'll have to defer to the code owners on design decisions and configuration options, but from documentation this sounds like a good idea to me. Thanks for the suggestion!

@crobert-1 removed the needs triage label Dec 8, 2023
@hanjm
Member

hanjm commented Dec 9, 2023

Projection optimization is closely tied to the specific query patterns used after the data is written. The collector only creates the tables it needs for writing at startup and does not modify the schema once it has been created.
Hence, I suggest managing projections on the ClickHouse tables directly as needed, rather than in the collector.

Adding some documentation on how to use projections and materialized views to optimize telemetry data querying would be worthwhile.
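
As a rough illustration of operating on the tables directly, the projection from the original request could be added with ClickHouse's `ALTER TABLE ... ADD PROJECTION`, followed by `MATERIALIZE PROJECTION` to build it for parts that already exist. The Python sketch below only renders those statements as strings. The helper name `projection_statements` and the table name `otel_metrics_gauge` are assumptions made for this example, not something the exporter provides.

```python
def projection_statements(table: str, name: str, select_body: str) -> list[str]:
    """Render the DDL for adding a projection to an existing table.

    Hypothetical helper for illustration only; run the resulting statements
    with whatever ClickHouse client or migration tool you already use.
    """
    # Join the lines of a multi-line YAML block scalar into a single line.
    body = " ".join(line.strip() for line in select_body.strip().splitlines())
    return [
        f"ALTER TABLE {table} ADD PROJECTION IF NOT EXISTS {name} ({body})",
        # Newly written parts include the projection; this backfills existing parts.
        f"ALTER TABLE {table} MATERIALIZE PROJECTION {name}",
    ]


if __name__ == "__main__":
    statements = projection_statements(
        table="otel_metrics_gauge",  # assumed table name
        name="app1_query_projection",
        select_body="""
            SELECT *
            ORDER BY toUnixTimestamp64Nano(TimeUnix), Attributes['context.id'], ResourceAttributes['resource.name']
        """,
    )
    for statement in statements:
        print(statement + ";")
```

Because the statements are applied outside the collector, they can be added or dropped later as query patterns change, without restarting or reconfiguring the exporter.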

@fredthomsen
Contributor Author

Yeah, I concede that knowing all the specific querying scenarios at deployment time is unlikely, although we had a few cases where we thought we could specify them up front.

Contributor

github-actions bot commented Feb 9, 2024

This issue has been inactive for 60 days. It will be closed in 60 days if there is no activity. To ping code owners by adding a component label, see Adding Labels via Comments, or if you are unsure of which component this issue relates to, please ping @open-telemetry/collector-contrib-triagers. If this issue is still relevant, please ping the code owners or leave a comment explaining why it is still relevant. Otherwise, please close it.


@github-actions github-actions bot added the Stale label Feb 9, 2024
@crobert-1 added the documentation label Feb 9, 2024
@github-actions github-actions bot removed the Stale label Feb 10, 2024
@RoryCrispin

My preferred implementation of this is to add a flag to the exporter that disables table creation. Then you can manage the tables yourself with whatever schema you like, using your preferred DB migrator setup.
Then just create the tables you need and set the table names in the OTel configuration.

We run this in prod: we insert into a table with the Null engine and build materialized views on top of it, so we can have different TTLs, indexes, and materialized columns (from LogAttributes and ResourceAttributes) for each application type.

I'll be happy to raise a PR for this.
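
The Null-engine pattern described above might look roughly like the following. This is a minimal Python sketch that only renders the DDL as strings; the table names, the reduced column set, and the `request.id` attribute are all assumptions made for the example, and a real ingest table would need to keep every column the exporter inserts.

```python
def null_engine_pipeline(
    ingest_table: str, app_table: str, service_name: str, ttl_days: int
) -> list[str]:
    """Render DDL for a Null-engine ingest table feeding one per-application table.

    All table and column choices here are illustrative assumptions.
    """
    columns = (
        "Timestamp DateTime64(9), "
        "ServiceName LowCardinality(String), "
        "Body String, "
        "LogAttributes Map(LowCardinality(String), String)"
    )
    return [
        # The Null engine discards inserted rows, but attached materialized views still see them.
        f"CREATE TABLE IF NOT EXISTS {ingest_table} ({columns}) ENGINE = Null",
        # Per-application storage with its own sort key, TTL, and an extracted column.
        f"CREATE TABLE IF NOT EXISTS {app_table} ({columns}, RequestId String) "
        f"ENGINE = MergeTree ORDER BY (ServiceName, Timestamp) "
        f"TTL toDateTime(Timestamp) + INTERVAL {ttl_days} DAY",
        # The view copies matching rows into the application table on every insert.
        f"CREATE MATERIALIZED VIEW IF NOT EXISTS {app_table}_mv TO {app_table} AS "
        f"SELECT Timestamp, ServiceName, Body, LogAttributes, "
        f"LogAttributes['request.id'] AS RequestId "
        f"FROM {ingest_table} WHERE ServiceName = '{service_name}'",
    ]


if __name__ == "__main__":
    for statement in null_engine_pipeline("otel_logs_null", "app1_logs", "app1", 7):
        print(statement + ";")
```

Each application type would get its own target table and view, which is where the differing TTLs, indexes, and materialized columns come from.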

@hanjm
Member

hanjm commented Feb 19, 2024

@RoryCrispin Looks good 👍

@RoryCrispin

Cc @SpencerTorres


@github-actions github-actions bot added the Stale label May 30, 2024
@Frapschen Frapschen removed the Stale label May 30, 2024
mx-psi added a commit that referenced this issue Jun 3, 2024
**Description:**

Adds `create_schema` boolean to config. This flag will disable the
automatic creation of the database and tables so that this can be
manually managed by the user.

**Link to tracking Issue:**

#29443
(related)

**Testing:**
Ran integration tests, and added a test verifying undefined, true, and
false values for the option in the config.

**Documentation:**

Added config option to README along with new "schema management" section
that contains some recommendations for production deployments as well as
requirements for maintaining compatibility.

Also added `examples/default_ddl/` which contains all of the default DDL
that the exporter runs. These are to be used as a starting point for
users managing their own schema.
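
To make the three tested cases concrete, here is a small Python sketch of how such a flag is typically interpreted. It is not the exporter's Go code. It assumes that leaving `create_schema` unset behaves like `true`, and the table name `otel_logs_null` is hypothetical.

```python
def should_create_schema(exporter_config: dict) -> bool:
    """Decide whether the database and tables should be created automatically.

    Assumption: an unset `create_schema` keeps automatic creation enabled.
    """
    return bool(exporter_config.get("create_schema", True))


# Exporter settings for a self-managed schema, written as a plain dict.
self_managed = {
    "endpoint": "tcp://clickhouse:9000",
    "create_schema": False,
    "logs_table_name": "otel_logs_null",  # hypothetical table created by your own migrations
}

if __name__ == "__main__":
    print(should_create_schema({}))                       # flag undefined
    print(should_create_schema({"create_schema": True}))  # flag set to true
    print(should_create_schema(self_managed))             # flag set to false
```

With the flag set to false, the tables have to exist before the collector starts, so the DDL in `examples/default_ddl/` is the natural starting point for a migration.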

---------

Co-authored-by: Pablo Baeyens

@github-actions github-actions bot added the Stale label Jul 30, 2024
@SpencerTorres
Member

@crobert-1 This can be closed; we solved this by adding a flag to toggle automatic table creation (#32282). In the latest version, users can create their own tables however they need.

@mx-psi mx-psi closed this as completed Jul 31, 2024