Add system command to show #25844

pauldix · 2025-01-16T00:17:36Z

We expose all information about the server and its state through system tables. The user can execute queries from the command line to get at this, but it would be a nice convenience to have a command to quickly show the most recent records in any system table.

Something like:

influxdb3 show system parquet_files --limit=10

Limit should default to 100 so it is optional. The help should display what are valid names for the system table we're querying. Under the covers the CLI should just execute the query for the user.

This would likely be good enough to close out #25792

The text was updated successfully, but these errors were encountered:

waynr · 2025-01-24T17:09:37Z

Okay I've started looking into how system tables are implemented to figure out how best to validate system table name arguments (ie the parquet_files argument in your example) and I think I can use some of the code in the system tables module to perform that validation, ie

influxdb/influxdb3_server/src/system_tables/mod.rs

Lines 31 to 41 in 061b62a

    
           pub const SYSTEM_SCHEMA_NAME: &str = "system"; 
        
           pub const TABLE_NAME_PREDICATE: &str = "table_name"; 
        
           pub(crate) const QUERIES_TABLE_NAME: &str = "queries"; 
        
           pub(crate) const LAST_CACHES_TABLE_NAME: &str = "last_caches"; 
        
           pub(crate) const DISTINCT_CACHES_TABLE_NAME: &str = "distinct_caches"; 
        
           pub(crate) const PARQUET_FILES_TABLE_NAME: &str = "parquet_files"; 
        
           const PROCESSING_ENGINE_PLUGINS_TABLE_NAME: &str = "processing_engine_plugins"; 
        
           const PROCESSING_ENGINE_TRIGGERS_TABLE_NAME: &str = "processing_engine_triggers";

But now I'm wondering -- should this subcommand be constructing an SQL query and reaching out to another node's FlightSQL endpoint or should it be constructing its own system tables internally based on available object store connection parameters? I think with the latter approach there would probably be a lot of missing information since my rough understanding is that a lot of system table info lives in-memory only (at least, that's how it was in IOx).

So the former approach is probably preferable, right? But in that case it's probably better not to worry about using the system table names in code to validate CLI table name arguments.

pauldix · 2025-01-24T17:59:04Z

The CLI should just be constructing a SQL query and running it through the front door. Can be FlightSQL or the HTTP API, whatever's easiest. What system tables exist can be established by querying the DB through the front door using SHOW TABLES, but you probably don't need to do that. The set of system tables exist in the code (i.e. their schema) so you should be able to pull that in the build.

The thing we'd want the user to be able to see, either through the CLI help or through a quick command is what system tables are available to query (which can be established through a query)

hiltontj · 2025-01-24T18:14:55Z

Currently, queries to a given system table go through the QueryExecutor::query_sql interface with queries like,

select * from system.parquet_files

The implementation of that is here in core.

For the CLI described in the issue, one way would be to compose the request to the /api/v3/query_sql endpoint, and that would just use the code mentioned above, or, a new HTTP endpoint could provide a different interface, e.g.,

GET /api/v3/system_tables/parquet_files?limit=10&table_name=foo[...]

the contents of which could be parsed in the API handler and used to construct a query and pass it in to the above interface.

I agree that an API to show available system tables would be nice, and we probably have the means to do that.

When a query comes in, the system table provider is constructed on the fly here:

influxdb/influxdb3_server/src/query_executor/mod.rs

Lines 451 to 458 in 061b62a

    
           let system_schema_provider = Arc::new(SystemSchemaProvider::AllSystemSchemaTables( 
        
               AllSystemSchemaTablesProvider::new( 
        
                   Arc::clone(&db_schema), 
        
                   Arc::clone(&self.query_log), 
        
                   Arc::clone(&self.write_buffer), 
        
                   Arc::clone(&self.sys_events_store), 
        
               ), 
        
           ));

We could have a new method on the QueryExecutor trait whose purpose is to show_system_tables that just constructs the provider to query its schema. There are similar methods dedicated, e.g., to show_databases, which is used directly by the GET /api/v3/configure/database API.

pauldix · 2025-01-24T18:30:17Z

I'd hold off on creating APIs for these things and instead lean on using the query API and its results. I want to keep the HTTP API surface area as small as possible, which is why we have system tables for everything in the DB. The system command on the CLI is really just a convenience function around constructing SQL queries and displaying their results.

waynr · 2025-01-24T18:59:32Z

I'd hold off on creating APIs for these things and instead lean on using the query API and its results.

Yeah this is what I was leaning toward myself -- it should be more resilient to code changes like the addition or removal of system tables.

As far as the subcommand structure goes, I was thinking of something a little different than what's described in this issue:

influxdb3 system <command> [args]

Where <command> would have the following variants:

list - lists available system tables
get <system-table-name> [get-args] - performs the default (probably SELECT * FROM <system-table-name> LIMIT 100 query operation for the named system table
summary [summary-args] - perform a series of default queries across several system tables to display them all with one command
- could introduce flags to generate different types of summaries for use in troubleshooting/support scenarios

pauldix · 2025-01-24T19:05:49Z

Your suggested list looks good to me

waynr · 2025-01-29T22:18:59Z

Closing this issue after addressing in #25912

pauldix added the v3 label Jan 16, 2025

waynr self-assigned this Jan 24, 2025

waynr mentioned this issue Jan 25, 2025

feat: add 'system' subcommand to display system tables #25912

Merged

waynr closed this as completed Jan 29, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add system command to show #25844

Add system command to show #25844

pauldix commented Jan 16, 2025

waynr commented Jan 24, 2025

pauldix commented Jan 24, 2025

hiltontj commented Jan 24, 2025

pauldix commented Jan 24, 2025

waynr commented Jan 24, 2025

pauldix commented Jan 24, 2025

waynr commented Jan 29, 2025

Add system command to show #25844

Add system command to show #25844

Comments

pauldix commented Jan 16, 2025

waynr commented Jan 24, 2025

pauldix commented Jan 24, 2025

hiltontj commented Jan 24, 2025

pauldix commented Jan 24, 2025

waynr commented Jan 24, 2025

pauldix commented Jan 24, 2025

waynr commented Jan 29, 2025