Describe Table with catalog name. #989

Merged (1 commit), Nov 1, 2022
86 changes: 86 additions & 0 deletions docs/user/general/identifiers.rst
@@ -176,3 +176,89 @@ Query delimited multiple indices seperated by ``,``::
|-------|
| 5 |
+-------+



Fully Qualified Table Names
===========================

Description
-----------
With the introduction of datasource catalogs alongside OpenSearch, fully qualified table names are required so that a table can be resolved to a catalog.

A fully qualified table name has the following format:
``<catalogName>.<schemaName>.<tableName>``

* catalogName: [Mandatory] The catalog name is mandatory when querying tables from catalogs other than the opensearch connector.

* schemaName: [Optional] A schema is a logical abstraction for a group of tables. Currently only ``default`` and ``information_schema`` are supported. Any other schema given in the fully qualified name is resolved as part of tableName.

* tableName: [Mandatory] The table name is always required.

The resolution algorithm is designed so that existing queries on opensearch keep working without a catalog name.
Queries on opensearch indices therefore don't need a fully qualified table name.

Table Name Resolution Algorithm
-------------------------------

A fully qualified name is split into parts on the ``.`` character.

The table name resolution algorithm works as follows.

1. Take the first part of the qualified name and resolve it to a catalog from the list of configured catalogs.
If it doesn't match any configured catalog name, the catalog defaults to the ``@opensearch`` catalog.

2. Take the first part of the remaining qualified name after capturing the catalog name.
If this part matches one of the schemas supported under the catalog, it becomes the schema name; otherwise the schema name resolves to the ``default`` schema.
Currently ``default`` and ``information_schema`` are the only schemas supported.

3. The remaining parts are combined to form the table name.

**Only table name identifiers support fully qualified names. Identifiers used for columns and other attributes don't require a catalog or schema prefix.**

Examples
--------
Assume [my_prometheus] is the only catalog configured other than the default opensearch engine.

1. ``my_prometheus.default.http_requests_total``

catalogName = ``my_prometheus`` [Is in the list of catalogs configured].

schemaName = ``default`` [Is in the list of schemas supported].

tableName = ``http_requests_total``.

2. ``logs.12.13.1``


catalogName = ``@opensearch`` [Resolves to the default @opensearch connector since [my_prometheus] is the only configured catalog name].

schemaName = ``default`` [No supported schema found, so it defaults to ``default``].

tableName = ``logs.12.13.1``.


3. ``my_prometheus.http_requests_total``


catalogName = ``my_prometheus`` [Is in the list of catalogs configured].

schemaName = ``default`` [No supported schema found, so it defaults to ``default``].

tableName = ``http_requests_total``.

4. ``prometheus.http_requests_total``

catalogName = ``@opensearch`` [Resolves to the default @opensearch connector since [my_prometheus] is the only configured catalog name].

schemaName = ``default`` [No supported schema found, so it defaults to ``default``].

tableName = ``prometheus.http_requests_total``.

5. ``prometheus.default.http_requests_total.1.2.3``

catalogName = ``@opensearch`` [Resolves to the default @opensearch connector since [my_prometheus] is the only configured catalog name].

schemaName = ``default`` [No supported schema found, so it defaults to ``default``].

tableName = ``prometheus.default.http_requests_total.1.2.3``.
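
The three resolution steps documented in this file can be sketched in a few lines of Java. This is only an illustration written from the documentation text, not the plugin's resolver: the class name `QualifiedTableNameSketch`, its fields, and the `resolve` method are hypothetical, and the guard that always leaves at least one part for the table name is an assumption that the docs do not state.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

/** Illustrative sketch of the documented resolution steps; not the plugin's resolver. */
final class QualifiedTableNameSketch {
  static final String DEFAULT_CATALOG = "@opensearch";
  static final String DEFAULT_SCHEMA = "default";
  static final Set<String> SUPPORTED_SCHEMAS =
      new HashSet<>(Arrays.asList("default", "information_schema"));

  final String catalog;
  final String schema;
  final String table;

  private QualifiedTableNameSketch(String catalog, String schema, String table) {
    this.catalog = catalog;
    this.schema = schema;
    this.table = table;
  }

  static QualifiedTableNameSketch resolve(String qualifiedName, Set<String> configuredCatalogs) {
    // Split the fully qualified name on the '.' character.
    List<String> parts = Arrays.asList(qualifiedName.split("\\."));
    int next = 0;

    // Step 1: the first part is the catalog only if it is a configured catalog.
    // Assumption: at least one part must remain for the table name.
    String catalog = DEFAULT_CATALOG;
    if (parts.size() - next > 1 && configuredCatalogs.contains(parts.get(next))) {
      catalog = parts.get(next++);
    }

    // Step 2: the next part is the schema only if it is a supported schema.
    String schema = DEFAULT_SCHEMA;
    if (parts.size() - next > 1 && SUPPORTED_SCHEMAS.contains(parts.get(next))) {
      schema = parts.get(next++);
    }

    // Step 3: everything that is left forms the table name.
    String table = String.join(".", parts.subList(next, parts.size()));
    return new QualifiedTableNameSketch(catalog, schema, table);
  }

  @Override
  public String toString() {
    return "catalog=" + catalog + ", schema=" + schema + ", table=" + table;
  }

  public static void main(String[] args) {
    Set<String> catalogs = new HashSet<>(Arrays.asList("my_prometheus"));
    System.out.println(resolve("my_prometheus.default.http_requests_total", catalogs));
    System.out.println(resolve("logs.12.13.1", catalogs));
  }
}
```

With `my_prometheus` as the only configured catalog, the sketch reproduces the five documented examples, including the case where an unknown first part (`prometheus`) causes the whole name to fall through to the `@opensearch` catalog as a single table name.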
28 changes: 26 additions & 2 deletions docs/user/ppl/cmd/describe.rst
@@ -16,9 +16,12 @@ Description

Syntax
============
-describe <index>
describe <catalog>.<schema>.<tablename>

* catalog: optional. If catalog is not provided, it resolves to the opensearch catalog.
* schema: optional. If schema is not provided, it resolves to the default schema.
* tablename: mandatory. The describe command must specify which table to query.

-* index: mandatory. describe command must specify which index to query from.


Example 1: Fetch all the metadata
@@ -63,3 +66,24 @@ PPL query::
| age |
+----------------+


Example 3: Fetch metadata for table in prometheus catalog
=========================================================

This example retrieves table info for the ``prometheus_http_requests_total`` metric in the ``my_prometheus`` catalog.

PPL query::

os> describe my_prometheus.prometheus_http_requests_total;
fetched rows / total rows = 7/7
+-----------------+----------------+--------------------------------+---------------+-------------+
| TABLE_CATALOG | TABLE_SCHEMA | TABLE_NAME | COLUMN_NAME | DATA_TYPE |
|-----------------+----------------+--------------------------------+---------------+-------------|
| my_prometheus | default | prometheus_http_requests_total | @labels | keyword |
| my_prometheus | default | prometheus_http_requests_total | handler | keyword |
| my_prometheus | default | prometheus_http_requests_total | code | keyword |
| my_prometheus | default | prometheus_http_requests_total | instance | keyword |
| my_prometheus | default | prometheus_http_requests_total | @timestamp | timestamp |
| my_prometheus | default | prometheus_http_requests_total | @value | double |
| my_prometheus | default | prometheus_http_requests_total | job | keyword |
+-----------------+----------------+--------------------------------+---------------+-------------+
@@ -15,6 +15,7 @@

import static org.opensearch.sql.legacy.TestsConstants.TEST_INDEX_DOG;
import static org.opensearch.sql.util.MatcherUtils.columnName;
import static org.opensearch.sql.util.MatcherUtils.rows;
import static org.opensearch.sql.util.MatcherUtils.verifyColumn;
import static org.opensearch.sql.util.MatcherUtils.verifyDataRows;

@@ -87,4 +88,26 @@ public void describeCommandWithoutIndexShouldFailToParse() throws IOException {
assertTrue(e.getMessage().contains("Failed to parse query due to offending symbol"));
}
}

@Test
public void testDescribeCommandWithPrometheusCatalog() throws IOException {
JSONObject result = executeQuery("describe my_prometheus.prometheus_http_requests_total");
verifyColumn(
result,
columnName("TABLE_CATALOG"),
columnName("TABLE_SCHEMA"),
columnName("TABLE_NAME"),
columnName("COLUMN_NAME"),
columnName("DATA_TYPE")
);
verifyDataRows(result,
rows("my_prometheus", "default", "prometheus_http_requests_total", "@labels", "keyword"),
rows("my_prometheus", "default", "prometheus_http_requests_total", "handler", "keyword"),
rows("my_prometheus", "default", "prometheus_http_requests_total", "code", "keyword"),
rows("my_prometheus", "default", "prometheus_http_requests_total", "instance", "keyword"),
rows("my_prometheus", "default", "prometheus_http_requests_total", "@value", "double"),
rows("my_prometheus", "default", "prometheus_http_requests_total", "@timestamp",
"timestamp"),
rows("my_prometheus", "default", "prometheus_http_requests_total", "job", "keyword"));
}
}
@@ -29,6 +29,7 @@

import com.google.common.collect.ImmutableList;
import com.google.common.collect.ImmutableMap;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Optional;
@@ -44,6 +45,7 @@
import org.opensearch.sql.ast.expression.Literal;
import org.opensearch.sql.ast.expression.Map;
import org.opensearch.sql.ast.expression.ParseMethod;
import org.opensearch.sql.ast.expression.QualifiedName;
import org.opensearch.sql.ast.expression.UnresolvedArgument;
import org.opensearch.sql.ast.expression.UnresolvedExpression;
import org.opensearch.sql.ast.tree.AD;
@@ -117,11 +119,19 @@ public UnresolvedPlan visitSearchFilterFrom(SearchFilterFromContext ctx) {

/**
* Describe command.
* The current logic distinguishes a table from the metadata about that table by adding
* MAPPING_ODFE_SYS_TABLE as a suffix.
* Even with catalog and schema names in a fully qualified table name,
* we do the same thing by appending MAPPING_ODFE_SYS_TABLE as a suffix to the last part
* of the qualified name.
*/
@Override
public UnresolvedPlan visitDescribeCommand(DescribeCommandContext ctx) {
final Relation table = (Relation) visitTableSourceClause(ctx.tableSourceClause());
-return new Relation(qualifiedName(mappingTable(table.getTableName())));
QualifiedName tableQualifiedName = table.getTableQualifiedName();
ArrayList<String> parts = new ArrayList<>(tableQualifiedName.getParts());
parts.set(parts.size() - 1, mappingTable(parts.get(parts.size() - 1)));
Comment on lines +132 to +133
Collaborator

Sorry, what does this mean? Are no UT changes required to cover this?

Member Author

Will make comments and also include UTs.

Collaborator

Should it be handled by the catalog resolver?

Member Author

@vmmusings, Oct 31, 2022

Yes, CatalogSchemaIdentifierNameResolver will handle this.
For the command

describe prometheus.requests_total

the code above converts the table name to

prometheus.requests_total.MAPPING_ODFE_SYS_TABLE

so that all the system tables are recognized the same way they are today.

Member Author

@vmmusings, Oct 31, 2022

Future design:
describe prometheus.requests_total --> source = prometheus.information_schema.columns

For information_schema, there will be a separate internal information_schema connector that can interact with all the storage engines to get column and table details.

The storage engine interface will gain more methods to expose tables, columns, and schemas.

return new Relation(new QualifiedName(parts));
}
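
The rewrite performed by `visitDescribeCommand` can be illustrated in isolation: copy the parts of the qualified name and replace only the last part with its mapping-table form, leaving any catalog and schema parts untouched. The following is a minimal sketch, not the project's code. The class `DescribeRewriteSketch` is hypothetical, and its `mappingTable` is a placeholder that follows the wording of the review comment above; the real helper's output format is not shown in this diff and may differ.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch of rewriting the last part of a qualified name; not the project's code. */
final class DescribeRewriteSketch {
  // Placeholder marker taken from the review comment; an assumption, not the real constant.
  static final String MAPPING_MARKER = "MAPPING_ODFE_SYS_TABLE";

  // Hypothetical stand-in for the project's mappingTable helper.
  static String mappingTable(String tableName) {
    return tableName + "." + MAPPING_MARKER;
  }

  // Returns a copy of the parts with only the last part rewritten; the input is not modified.
  static List<String> toMappingTableParts(List<String> parts) {
    List<String> copy = new ArrayList<>(parts);
    int last = copy.size() - 1;
    copy.set(last, mappingTable(copy.get(last)));
    return copy;
  }

  public static void main(String[] args) {
    List<String> parts = Arrays.asList("prometheus", "requests_total");
    System.out.println(String.join(".", toMappingTableParts(parts)));
  }
}
```

Working on a copy mirrors the diff's `new ArrayList<>(tableQualifiedName.getParts())`, which avoids mutating the list held by the original qualified name.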

/**
@@ -724,6 +724,14 @@ public void testDescribeCommandWithMultipleIndices() {
relation(mappingTable("t,u")));
}

@Test
public void testDescribeCommandWithFullyQualifiedTableName() {
assertEqual("describe prometheus.http_metric",
relation(qualifiedName("prometheus", mappingTable("http_metric"))));
assertEqual("describe prometheus.schema.http_metric",
relation(qualifiedName("prometheus", "schema", mappingTable("http_metric"))));
}

@Test
public void test_fitRCFADCommand_withoutDataFormat() {
assertEqual("source=t | AD shingle_size=10 time_decay=0.0001 time_field='timestamp' "
@@ -14,7 +14,6 @@
import java.util.List;
import java.util.Map;
import java.util.Objects;
-import okhttp3.HttpUrl;
import okhttp3.OkHttpClient;
import okhttp3.Request;
import okhttp3.Response;
@@ -82,7 +81,11 @@ public Map<String, List<MetricMetadata>> getAllMetrics() throws IOException {
private List<String> toListOfStrings(JSONArray array) {
List<String> result = new ArrayList<>();
for (int i = 0; i < array.length(); i++) {
-result.add(array.optString(i));
// __name__ is an internal Prometheus label that holds the metric name.
// Exclude it from the labels list because none of the operations need it.
if (!"__name__".equals(array.optString(i))) {
result.add(array.optString(i));
}
}
return result;
}
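
The label filtering added to `toListOfStrings` can be tried out without the JSON and HTTP plumbing. Below is a minimal sketch that uses a plain `List<String>` in place of the `JSONArray` from the diff; the class `LabelFilterSketch` and the method `withoutMetricNameLabel` are hypothetical names introduced only for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Illustrative sketch of dropping the internal __name__ label; not the client's code. */
final class LabelFilterSketch {
  // Internal Prometheus label that carries the metric name.
  static final String METRIC_NAME_LABEL = "__name__";

  // Returns the labels in their original order, without the __name__ entry.
  static List<String> withoutMetricNameLabel(List<String> labels) {
    List<String> result = new ArrayList<>();
    for (String label : labels) {
      if (!METRIC_NAME_LABEL.equals(label)) {
        result.add(label);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    System.out.println(withoutMetricNameLabel(Arrays.asList("__name__", "call", "code")));
  }
}
```

Putting the constant on the left side of `equals`, as the diff does with the string literal, keeps the comparison safe if a label value is ever null.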
@@ -106,7 +106,6 @@ void testGetLabel() {
mockWebServer.enqueue(mockResponse);
List<String> response = prometheusClient.getLabels(METRIC_NAME);
assertEquals(new ArrayList<String>() {{
-add("__name__");
add("call");
add("code");
}