From 12e73f2e9e42821eeb59204b4796184dbc5ed58d Mon Sep 17 00:00:00 2001
From: The Magician
Date: Wed, 30 Oct 2024 14:25:33 -0700
Subject: [PATCH] Clean up inconsistent formats in the google_bigquery_table
 reference documentation (#12185) (#20117)

[upstream:b443341984595832d2e1ee7718a1280403223e2e]

Signed-off-by: Modular Magician
---
 .changelog/12185.txt                        |   3 +
 website/docs/r/bigquery_table.html.markdown | 118 ++++++++++----------
 2 files changed, 62 insertions(+), 59 deletions(-)
 create mode 100644 .changelog/12185.txt

diff --git a/.changelog/12185.txt b/.changelog/12185.txt
new file mode 100644
index 00000000000..42b910df155
--- /dev/null
+++ b/.changelog/12185.txt
@@ -0,0 +1,3 @@
+```release-note:none
+
+```
\ No newline at end of file
diff --git a/website/docs/r/bigquery_table.html.markdown b/website/docs/r/bigquery_table.html.markdown
index 5fe16203254..75c460cc702 100644
--- a/website/docs/r/bigquery_table.html.markdown
+++ b/website/docs/r/bigquery_table.html.markdown
@@ -108,7 +108,7 @@ The following arguments are supported:
 
 * `friendly_name` - (Optional) A descriptive name for the table.
 
-* `max_staleness`: (Optional) The maximum staleness of data that could be
+* `max_staleness` - (Optional) The maximum staleness of data that could be
   returned when the table (or stale MV) is queried. Staleness encoded as a
   string encoding of [SQL IntervalValue
   type](https://cloud.google.com/bigquery/docs/reference/standard-sql/data-types#interval_type).
@@ -184,13 +184,13 @@ The following arguments are supported:
 
 The `external_data_configuration` block supports:
 
-* `autodetect` - (Required) - Let BigQuery try to autodetect the schema
+* `autodetect` - (Required) Let BigQuery try to autodetect the schema
   and format of the table.
 
-* `compression` (Optional) - The compression type of the data source.
+* `compression` - (Optional) The compression type of the data source.
   Valid values are "NONE" or "GZIP".
 
-* `connection_id` (Optional) - The connection specifying the credentials to be used to read
+* `connection_id` - (Optional) The connection specifying the credentials to be used to read
   external storage, such as Azure Blob, Cloud Storage, or S3. The
   `connection_id` can have the form `{{project}}.{{location}}.{{connection_id}}`
   or `projects/{{project}}/locations/{{location}}/connections/{{connection_id}}`.
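For illustration, the arguments above might be wired together as in the following sketch; the dataset, table, and bucket names are placeholders, and `source_uris` belongs to the same block but sits outside the hunks quoted here:

```hcl
resource "google_bigquery_table" "external_csv" {
  dataset_id = "my_dataset"   # placeholder dataset
  table_id   = "external_csv" # placeholder table name

  external_data_configuration {
    autodetect    = true                          # let BigQuery infer the schema and format
    source_format = "CSV"
    compression   = "GZIP"                        # "NONE" or "GZIP"
    source_uris   = ["gs://my-bucket/data/*.csv"] # placeholder bucket
  }
}
```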
@@ -199,40 +199,40 @@ The following arguments are supported:
   table schema must be specified using the top-level `schema` field
   [documented above](#schema).
 
-* `csv_options` (Optional) - Additional properties to set if
+* `csv_options` - (Optional) Additional properties to set if
   `source_format` is set to "CSV". Structure is [documented below](#nested_csv_options).
 
-* `bigtable_options` (Optional) - Additional properties to set if
+* `bigtable_options` - (Optional) Additional properties to set if
   `source_format` is set to "BIGTABLE". Structure is [documented below](#nested_bigtable_options).
 
-* `json_options` (Optional) - Additional properties to set if
+* `json_options` - (Optional) Additional properties to set if
   `source_format` is set to "JSON". Structure is [documented below](#nested_json_options).
 
-* `json_extension` (Optional) - Used to indicate that a JSON variant, rather than normal JSON, is being used as the sourceFormat. This should only be used in combination with the `JSON` source format. Valid values are: `GEOJSON`.
+* `json_extension` - (Optional) Used to indicate that a JSON variant, rather than normal JSON, is being used as the sourceFormat. This should only be used in combination with the `JSON` source format. Valid values are: `GEOJSON`.
 
-* `parquet_options` (Optional) - Additional properties to set if
+* `parquet_options` - (Optional) Additional properties to set if
   `source_format` is set to "PARQUET". Structure is [documented below](#nested_parquet_options).
 
-* `google_sheets_options` (Optional) - Additional options if
+* `google_sheets_options` - (Optional) Additional options if
   `source_format` is set to "GOOGLE_SHEETS". Structure is
   [documented below](#nested_google_sheets_options).
 
-* `hive_partitioning_options` (Optional) - When set, configures hive partitioning
+* `hive_partitioning_options` - (Optional) When set, configures hive partitioning
   support. Not all storage formats support hive partitioning -- requesting hive
   partitioning on an unsupported format will lead to an error, as will providing
   an invalid specification. Structure is [documented below](#nested_hive_partitioning_options).
 
-* `avro_options` (Optional) - Additional options if `source_format` is set to
+* `avro_options` - (Optional) Additional options if `source_format` is set to
   "AVRO". Structure is [documented below](#nested_avro_options).
 
-* `ignore_unknown_values` (Optional) - Indicates if BigQuery should
+* `ignore_unknown_values` - (Optional) Indicates if BigQuery should
   allow extra values that are not represented in the table schema.
   If true, the extra values are ignored. If false, records with extra columns
   are treated as bad records, and if there are too many bad records, an
   invalid error is returned in the job result. The default value is false.
 
-* `max_bad_records` (Optional) - The maximum number of bad records that
+* `max_bad_records` - (Optional) The maximum number of bad records that
   BigQuery can ignore when reading data.
 
 * `schema` - (Optional) A JSON schema for the external table. Schema is required
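A sketch of the requirement stated above, with the external table's schema supplied through the top-level `schema` field rather than inside `external_data_configuration` (all names are placeholders):

```hcl
resource "google_bigquery_table" "external_with_schema" {
  dataset_id = "my_dataset" # placeholder
  table_id   = "external_with_schema"

  # The external table's schema is declared at the top level,
  # not inside external_data_configuration.
  schema = jsonencode([
    { name = "id", type = "INTEGER", mode = "NULLABLE" },
    { name = "name", type = "STRING", mode = "NULLABLE" },
  ])

  external_data_configuration {
    autodetect    = false # schema is supplied explicitly above
    source_format = "CSV"
    source_uris   = ["gs://my-bucket/data/*.csv"] # placeholder
  }
}
```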
@@ -251,7 +251,7 @@ The following arguments are supported:
   table schema must be specified using the top-level `schema` field
   [documented above](#schema).
 
-* `source_format` (Optional) - The data format. Please see sourceFormat under
+* `source_format` - (Optional) The data format. Please see sourceFormat under
   [ExternalDataConfiguration](https://cloud.google.com/bigquery/docs/reference/rest/v2/tables#externaldataconfiguration)
   in Bigquery's public API documentation for supported formats. To use "GOOGLE_SHEETS"
   the `scopes` must include "https://www.googleapis.com/auth/drive.readonly".
@@ -271,7 +271,7 @@ The following arguments are supported:
 
 The `csv_options` block supports:
 
-* `quote` (Required) - The value that is used to quote data sections in a
+* `quote` - (Required) The value that is used to quote data sections in a
   CSV file. If your data does not contain quoted sections, set the property
   value to an empty string. If your data contains quoted newline characters,
   you must also set the `allow_quoted_newlines` property to true.
@@ -279,63 +279,63 @@ The following arguments are supported:
   limitations with Terraform default values, this value is required to be
   explicitly set.
 
-* `allow_jagged_rows` (Optional) - Indicates if BigQuery should accept rows
+* `allow_jagged_rows` - (Optional) Indicates if BigQuery should accept rows
   that are missing trailing optional columns.
 
-* `allow_quoted_newlines` (Optional) - Indicates if BigQuery should allow
+* `allow_quoted_newlines` - (Optional) Indicates if BigQuery should allow
   quoted data sections that contain newline characters in a CSV file. The
   default value is false.
 
-* `encoding` (Optional) - The character encoding of the data. The supported
+* `encoding` - (Optional) The character encoding of the data. The supported
   values are UTF-8 or ISO-8859-1.
 
-* `field_delimiter` (Optional) - The separator for fields in a CSV file.
+* `field_delimiter` - (Optional) The separator for fields in a CSV file.
 
-* `skip_leading_rows` (Optional) - The number of rows at the top of a CSV
+* `skip_leading_rows` - (Optional) The number of rows at the top of a CSV
   file that BigQuery will skip when reading the data.
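A minimal sketch of the `csv_options` block just described, wrapped in a resource; names and values are illustrative, and note that `quote` must always be set explicitly (an empty string means the data has no quoted sections):

```hcl
resource "google_bigquery_table" "csv_tuned" {
  dataset_id = "my_dataset" # placeholder
  table_id   = "csv_tuned"

  external_data_configuration {
    autodetect    = true
    source_format = "CSV"
    source_uris   = ["gs://my-bucket/data/*.csv"] # placeholder

    csv_options {
      quote                 = "\"" # required; use "" for data with no quoted sections
      allow_quoted_newlines = true
      allow_jagged_rows     = false
      field_delimiter       = ","
      skip_leading_rows     = 1
      encoding              = "UTF-8"
    }
  }
}
```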
 
 The `bigtable_options` block supports:
 
-* `column_family` (Optional) - A list of column families to expose in the table schema along with their types. This list restricts the column families that can be referenced in queries and specifies their value types. You can use this list to do type conversions - see the 'type' field for more details. If you leave this list empty, all column families are present in the table schema and their values are read as BYTES. During a query only the column families referenced in that query are read from Bigtable. Structure is [documented below](#nested_column_family).
-* `ignore_unspecified_column_families` (Optional) - If field is true, then the column families that are not specified in columnFamilies list are not exposed in the table schema. Otherwise, they are read with BYTES type values. The default value is false.
-* `read_rowkey_as_string` (Optional) - If field is true, then the rowkey column families will be read and converted to string. Otherwise they are read with BYTES type values and users need to manually cast them with CAST if necessary. The default value is false.
-* `output_column_families_as_json` (Optional) - If field is true, then each column family will be read as a single JSON column. Otherwise they are read as a repeated cell structure containing timestamp/value tuples. The default value is false.
+* `column_family` - (Optional) A list of column families to expose in the table schema along with their types. This list restricts the column families that can be referenced in queries and specifies their value types. You can use this list to do type conversions - see the 'type' field for more details. If you leave this list empty, all column families are present in the table schema and their values are read as BYTES. During a query only the column families referenced in that query are read from Bigtable. Structure is [documented below](#nested_column_family).
+* `ignore_unspecified_column_families` - (Optional) If field is true, then the column families that are not specified in columnFamilies list are not exposed in the table schema. Otherwise, they are read with BYTES type values. The default value is false.
+* `read_rowkey_as_string` - (Optional) If field is true, then the rowkey column families will be read and converted to string. Otherwise they are read with BYTES type values and users need to manually cast them with CAST if necessary. The default value is false.
+* `output_column_families_as_json` - (Optional) If field is true, then each column family will be read as a single JSON column. Otherwise they are read as a repeated cell structure containing timestamp/value tuples. The default value is false.
 
 The `column_family` block supports:
 
-* `column` (Optional) - A List of columns that should be exposed as individual fields as opposed to a list of (column name, value) pairs. All columns whose qualifier matches a qualifier in this list can be accessed as Other columns can be accessed as a list through column field. Structure is [documented below](#nested_column).
-* `family_id` (Optional) - Identifier of the column family.
-* `type` (Optional) - The type to convert the value in cells of this column family. The values are expected to be encoded using HBase Bytes.toBytes function when using the BINARY encoding value. Following BigQuery types are allowed (case-sensitive): "BYTES", "STRING", "INTEGER", "FLOAT", "BOOLEAN", "JSON". Default type is BYTES. This can be overridden for a specific column by listing that column in 'columns' and specifying a type for it.
-* `encoding` (Optional) - The encoding of the values when the type is not STRING. Acceptable encoding values are: TEXT - indicates values are alphanumeric text strings. BINARY - indicates values are encoded using HBase Bytes.toBytes family of functions. This can be overridden for a specific column by listing that column in 'columns' and specifying an encoding for it.
-* `only_read_latest` (Optional) - If this is set only the latest version of value are exposed for all columns in this column family. This can be overridden for a specific column by listing that column in 'columns' and specifying a different setting for that column.
+* `column` - (Optional) A List of columns that should be exposed as individual fields as opposed to a list of (column name, value) pairs. All columns whose qualifier matches a qualifier in this list can be accessed as Other columns can be accessed as a list through column field. Structure is [documented below](#nested_column).
+* `family_id` - (Optional) Identifier of the column family.
+* `type` - (Optional) The type to convert the value in cells of this column family. The values are expected to be encoded using HBase Bytes.toBytes function when using the BINARY encoding value. Following BigQuery types are allowed (case-sensitive): "BYTES", "STRING", "INTEGER", "FLOAT", "BOOLEAN", "JSON". Default type is BYTES. This can be overridden for a specific column by listing that column in 'columns' and specifying a type for it.
+* `encoding` - (Optional) The encoding of the values when the type is not STRING. Acceptable encoding values are: TEXT - indicates values are alphanumeric text strings. BINARY - indicates values are encoded using HBase Bytes.toBytes family of functions. This can be overridden for a specific column by listing that column in 'columns' and specifying an encoding for it.
+* `only_read_latest` - (Optional) If this is set only the latest version of value are exposed for all columns in this column family. This can be overridden for a specific column by listing that column in 'columns' and specifying a different setting for that column.
 
 The `column` block supports:
 
-* `qualifier_encoded` (Optional) - Qualifier of the column. Columns in the parent column family that has this exact qualifier are exposed as . field. If the qualifier is valid UTF-8 string, it can be specified in the qualifierString field. Otherwise, a base-64 encoded value must be set to qualifierEncoded. The column field name is the same as the column qualifier. However, if the qualifier is not a valid BigQuery field identifier i.e. does not match [a-zA-Z][a-zA-Z0-9_]*, a valid identifier must be provided as fieldName.
-* `qualifier_string` (Optional) - Qualifier string.
-* `field_name` (Optional) - If the qualifier is not a valid BigQuery field identifier i.e. does not match [a-zA-Z][a-zA-Z0-9_]*, a valid identifier must be provided as the column field name and is used as field name in queries.
-* `type` (Optional) - The type to convert the value in cells of this column. The values are expected to be encoded using HBase Bytes.toBytes function when using the BINARY encoding value. Following BigQuery types are allowed (case-sensitive): "BYTES", "STRING", "INTEGER", "FLOAT", "BOOLEAN", "JSON", Default type is "BYTES". 'type' can also be set at the column family level. However, the setting at this level takes precedence if 'type' is set at both levels.
-* `encoding` (Optional) - The encoding of the values when the type is not STRING. Acceptable encoding values are: TEXT - indicates values are alphanumeric text strings. BINARY - indicates values are encoded using HBase Bytes.toBytes family of functions. 'encoding' can also be set at the column family level. However, the setting at this level takes precedence if 'encoding' is set at both levels.
-* `only_read_latest` (Optional) - If this is set, only the latest version of value in this column are exposed. 'onlyReadLatest' can also be set at the column family level. However, the setting at this level takes precedence if 'onlyReadLatest' is set at both levels.
+* `qualifier_encoded` - (Optional) Qualifier of the column. Columns in the parent column family that has this exact qualifier are exposed as . field. If the qualifier is valid UTF-8 string, it can be specified in the qualifierString field. Otherwise, a base-64 encoded value must be set to qualifierEncoded. The column field name is the same as the column qualifier. However, if the qualifier is not a valid BigQuery field identifier i.e. does not match [a-zA-Z][a-zA-Z0-9_]*, a valid identifier must be provided as fieldName.
+* `qualifier_string` - (Optional) Qualifier string.
+* `field_name` - (Optional) If the qualifier is not a valid BigQuery field identifier i.e. does not match [a-zA-Z][a-zA-Z0-9_]*, a valid identifier must be provided as the column field name and is used as field name in queries.
+* `type` - (Optional) The type to convert the value in cells of this column. The values are expected to be encoded using HBase Bytes.toBytes function when using the BINARY encoding value. Following BigQuery types are allowed (case-sensitive): "BYTES", "STRING", "INTEGER", "FLOAT", "BOOLEAN", "JSON", Default type is "BYTES". 'type' can also be set at the column family level. However, the setting at this level takes precedence if 'type' is set at both levels.
+* `encoding` - (Optional) The encoding of the values when the type is not STRING. Acceptable encoding values are: TEXT - indicates values are alphanumeric text strings. BINARY - indicates values are encoded using HBase Bytes.toBytes family of functions. 'encoding' can also be set at the column family level. However, the setting at this level takes precedence if 'encoding' is set at both levels.
+* `only_read_latest` - (Optional) If this is set, only the latest version of value in this column are exposed. 'onlyReadLatest' can also be set at the column family level. However, the setting at this level takes precedence if 'onlyReadLatest' is set at both levels.
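Putting the `bigtable_options`, `column_family`, and `column` blocks together, a sketch; the project, instance, table, and family IDs are placeholders, and the source URI assumes BigQuery's documented Bigtable URI shape:

```hcl
resource "google_bigquery_table" "bigtable_backed" {
  dataset_id = "my_dataset" # placeholder
  table_id   = "bigtable_backed"

  external_data_configuration {
    autodetect    = false
    source_format = "BIGTABLE"
    source_uris = [
      # assumed URI shape: https://googleapis.com/bigtable/projects/<p>/instances/<i>/tables/<t>
      "https://googleapis.com/bigtable/projects/my-project/instances/my-instance/tables/my-table",
    ]

    bigtable_options {
      read_rowkey_as_string = true # expose the rowkey as STRING rather than BYTES

      column_family {
        family_id        = "stats"  # placeholder family
        type             = "STRING" # default type for columns in this family
        only_read_latest = true

        column {
          qualifier_string = "views"
          type             = "INTEGER" # column-level type takes precedence over the family-level one
        }
      }
    }
  }
}
```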
 
 The `json_options` block supports:
 
-* `encoding` (Optional) - The character encoding of the data. The supported values are UTF-8, UTF-16BE, UTF-16LE, UTF-32BE, and UTF-32LE. The default value is UTF-8.
+* `encoding` - (Optional) The character encoding of the data. The supported values are UTF-8, UTF-16BE, UTF-16LE, UTF-32BE, and UTF-32LE. The default value is UTF-8.
 
 The `google_sheets_options` block supports:
 
-* `range` (Optional) - Range of a sheet to query from. Only used when
+* `range` - (Optional) Range of a sheet to query from. Only used when
   non-empty. At least one of `range` or `skip_leading_rows` must be set.
   Typical format: "sheet_name!top_left_cell_id:bottom_right_cell_id"
   For example: "sheet1!A1:B20"
 
-* `skip_leading_rows` (Optional) - The number of rows at the top of the sheet
+* `skip_leading_rows` - (Optional) The number of rows at the top of the sheet
   that BigQuery will skip when reading the data. At least one of `range` or
   `skip_leading_rows` must be set.
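A sketch of `google_sheets_options` using the example range above; the spreadsheet URL is a placeholder, and the credentials in use must carry the drive scope mentioned earlier:

```hcl
resource "google_bigquery_table" "sheet_backed" {
  dataset_id = "my_dataset" # placeholder
  table_id   = "sheet_backed"

  external_data_configuration {
    autodetect    = true
    source_format = "GOOGLE_SHEETS"
    source_uris   = ["https://docs.google.com/spreadsheets/d/SPREADSHEET_ID"] # placeholder ID

    google_sheets_options {
      range             = "sheet1!A1:B20" # the example range from above
      skip_leading_rows = 1               # at least one of range/skip_leading_rows must be set
    }
  }
}
```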
 
 The `hive_partitioning_options` block supports:
 
-* `mode` (Optional) - When set, what mode of hive partitioning to use when
+* `mode` - (Optional) When set, what mode of hive partitioning to use when
   reading data. The following modes are supported.
     * AUTO: automatically infer partition key name(s) and type(s).
     * STRINGS: automatically infer partition key name(s). All types are
@@ -348,7 +348,7 @@ The following arguments are supported:
   require a partition filter that can be used for partition elimination to be
   specified.
 
-* `source_uri_prefix` (Optional) - When hive partition detection is requested,
+* `source_uri_prefix` - (Optional) When hive partition detection is requested,
   a common prefix for all source uris must be required. The prefix must end immediately
   before the partition key encoding begins. For example, consider files following
   this data layout. `gs://bucket/path_to_table/dt=2019-06-01/country=USA/id=7/file.avro`
   `gs://bucket/path_to_table/dt=2019-06-01/country=IN/id=2/file.avro` When hive
   partitioning is requested with either AUTO or STRINGS detection, the common prefix
   can be either of `gs://bucket/path_to_table` or `gs://bucket/path_to_table/`.
@@ -359,19 +359,19 @@ The following arguments are supported:
 
 The `avro_options` block supports:
 
-* `use_avro_logical_types` (Optional) - If is set to true, indicates whether
+* `use_avro_logical_types` - (Optional) If is set to true, indicates whether
   to interpret logical types as the corresponding BigQuery data type
   (for example, TIMESTAMP), instead of using the raw type (for example, INTEGER).
 
 The `parquet_options` block supports:
 
-* `enum_as_string` (Optional) - Indicates whether to infer Parquet ENUM logical type as STRING instead of BYTES by default.
+* `enum_as_string` - (Optional) Indicates whether to infer Parquet ENUM logical type as STRING instead of BYTES by default.
 
-* `enable_list_inference` (Optional) - Indicates whether to use schema inference specifically for Parquet LIST logical type.
+* `enable_list_inference` - (Optional) Indicates whether to use schema inference specifically for Parquet LIST logical type.
 
 The `time_partitioning` block supports:
 
-* `expiration_ms` -  (Optional) Number of milliseconds for which to keep the
+* `expiration_ms` - (Optional) Number of milliseconds for which to keep the
   storage for a partition.
 
 * `field` - (Optional) The field used to determine how to create a time-based
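For the `dt=.../country=.../id=...` layout shown above, hive partition detection might be configured as in this sketch; the bucket path reuses the doc's placeholder, and the prefix ends right before the first partition key:

```hcl
resource "google_bigquery_table" "hive_partitioned" {
  dataset_id = "my_dataset" # placeholder
  table_id   = "hive_partitioned"

  external_data_configuration {
    autodetect    = true
    source_format = "AVRO"
    source_uris   = ["gs://bucket/path_to_table/*"]

    hive_partitioning_options {
      mode              = "AUTO"                       # infer partition key names and types
      source_uri_prefix = "gs://bucket/path_to_table/" # ends immediately before dt=... begins
    }
  }
}
```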
@@ -443,26 +443,26 @@ The following arguments are supported:
 
 The `primary_key` block supports:
 
-* `columns`: (Required) The columns that are composed of the primary key constraint.
+* `columns` - (Required) The columns that are composed of the primary key constraint.
 
 The `foreign_keys` block supports:
 
-* `name`: (Optional) Set only if the foreign key constraint is named.
+* `name` - (Optional) Set only if the foreign key constraint is named.
 
-* `referenced_table`: (Required) The table that holds the primary key
+* `referenced_table` - (Required) The table that holds the primary key
   and is referenced by this foreign key.
   Structure is [documented below](#nested_referenced_table).
 
-* `column_references`: (Required) The pair of the foreign key column and primary key column.
+* `column_references` - (Required) The pair of the foreign key column and primary key column.
   Structure is [documented below](#nested_column_references).
 
 The `referenced_table` block supports:
 
-* `project_id`: (Required) The ID of the project containing this table.
+* `project_id` - (Required) The ID of the project containing this table.
 
-* `dataset_id`: (Required) The ID of the dataset containing this table.
+* `dataset_id` - (Required) The ID of the dataset containing this table.
 
-* `table_id`: (Required) The ID of the table. The ID must contain only
+* `table_id` - (Required) The ID of the table. The ID must contain only
   letters (a-z, A-Z), numbers (0-9), or underscores (_). The maximum length
   is 1,024 characters. Certain operations allow suffixing of the table ID with
   a partition decorator, such as
@@ -470,20 +470,20 @@ The following arguments are supported:
 
 The `column_references` block supports:
 
-* `referencing_column`: (Required) The column that composes the foreign key.
+* `referencing_column` - (Required) The column that composes the foreign key.
 
-* `referenced_column`: (Required) The column in the primary key that are
+* `referenced_column` - (Required) The column in the primary key that are
   referenced by the referencingColumn
 
 The `table_replication_info` block supports:
 
-* `source_project_id` (Required) - The ID of the source project.
+* `source_project_id` - (Required) The ID of the source project.
 
-* `source_dataset_id` (Required) - The ID of the source dataset.
+* `source_dataset_id` - (Required) The ID of the source dataset.
 
-* `source_table_id` (Required) - The ID of the source materialized view.
+* `source_table_id` - (Required) The ID of the source materialized view.
 
-* `replication_interval_ms` (Optional) - The interval at which the source
+* `replication_interval_ms` - (Optional) The interval at which the source
   materialized view is polled for updates. The default is 300000.
 
 The `biglake_configuration` block supports:
@@ -505,7 +505,7 @@ The following arguments are supported:
 In addition to the arguments listed above, the following computed attributes are
 exported:
 
-* `id` - an identifier for the resource with format `projects/{{project}}/datasets/{{dataset}}/tables/{{name}}`
+* `id` - An identifier for the resource with format `projects/{{project}}/datasets/{{dataset}}/tables/{{name}}`
 
 * `creation_time` - The time when this table was created, in milliseconds since
   the epoch.
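Finally, a sketch tying the `primary_key`, `foreign_keys`, `referenced_table`, and `column_references` blocks together; in the provider these nest under the resource's `table_constraints` block (not shown in the hunks above), and every ID here is a placeholder:

```hcl
resource "google_bigquery_table" "orders" {
  dataset_id = "my_dataset" # placeholder
  table_id   = "orders"

  schema = jsonencode([
    { name = "order_id", type = "INTEGER", mode = "REQUIRED" },
    { name = "customer_id", type = "INTEGER", mode = "NULLABLE" },
  ])

  table_constraints {
    primary_key {
      columns = ["order_id"]
    }

    foreign_keys {
      name = "fk_orders_customer" # optional constraint name

      referenced_table {
        project_id = "my-project" # placeholder
        dataset_id = "my_dataset"
        table_id   = "customers"
      }

      column_references {
        referencing_column = "customer_id"
        referenced_column  = "customer_id"
      }
    }
  }
}
```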