Document hive storage table properties in new section
rosewms authored and hashhar committed Feb 24, 2022
1 parent 7e99818 commit 26c8027
Showing 1 changed file with 110 additions and 4 deletions.
114 changes: 110 additions & 4 deletions docs/src/main/sphinx/connector/hive.rst
@@ -889,6 +889,8 @@ as Hive. For example, converting the string ``'foo'`` to a number,
or converting the string ``'1234'`` to a ``tinyint`` (which has a
maximum value of ``127``).

.. _hive_avro_schema:

Avro schema evolution
---------------------

@@ -1008,6 +1010,109 @@ Procedures
Flush Hive metadata cache entries connected with the selected partition.
The procedure requires named parameters to be passed

.. _hive_table_properties:

Table properties
----------------

Table properties supply or set metadata for the underlying tables. This
is key for :doc:`/sql/create-table-as` statements. Table properties are passed
to the connector using a :doc:`WITH </sql/create-table-as>` clause; the table
and column names in this schematic example are placeholders::

    CREATE TABLE tablename (textcolumn varchar)
    WITH (format = 'CSV',
          csv_escape = '"')

See the :ref:`hive_examples` for more information.
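
A :doc:`/sql/create-table-as` statement can combine several of the table
properties documented below. The following sketch is illustrative only; the
schema, table, and column names are placeholders::

    CREATE TABLE hive.web.page_views_orc
    WITH (
        format = 'ORC',
        partitioned_by = ARRAY['ds'],
        bucketed_by = ARRAY['user_id'],
        bucket_count = 50
    )
    AS
    SELECT user_id, url, ds
    FROM hive.web.page_views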

.. list-table:: Hive connector table properties
  :widths: 20, 60, 20
  :header-rows: 1

  * - Property name
    - Description
    - Default
  * - ``auto_purge``
    - Instructs the configured metastore to perform a purge when a table or
      partition is deleted, instead of a soft deletion using the trash.
    -
  * - ``avro_schema_url``
    - The URI pointing to :ref:`hive_avro_schema` for the table.
    -
  * - ``bucket_count``
    - The number of buckets to group data into. Only valid if used with
      ``bucketed_by``.
    - 0
  * - ``bucketed_by``
    - The bucketing columns for the storage table. Only valid if used with
      ``bucket_count``.
    - ``[]``
  * - ``bucketing_version``
    - Specifies which Hive bucketing version to use. Valid values are ``1``
      or ``2``.
    -
  * - ``csv_escape``
    - The CSV escape character. Requires CSV format.
    -
  * - ``csv_quote``
    - The CSV quote character. Requires CSV format.
    -
  * - ``csv_separator``
    - The CSV separator character. Requires CSV format.
    -
  * - ``external_location``
    - The URI for an external Hive table on S3, Azure Blob Storage, etc. See
      the :ref:`hive_examples` for more information.
    -
  * - ``format``
    - The table file format. Valid values include ``ORC``, ``PARQUET``,
      ``AVRO``, ``RCBINARY``, ``RCTEXT``, ``SEQUENCEFILE``, ``JSON``,
      ``TEXTFILE``, and ``CSV``. The catalog property ``hive.storage-format``
      sets the default value for this property.
    -
  * - ``null_format``
    - The serialization format for ``NULL`` values. Requires TextFile, RCText,
      or SequenceFile format.
    -
  * - ``orc_bloom_filter_columns``
    - Comma-separated list of columns to use for ORC bloom filters. It improves
      the performance of queries using range predicates when reading ORC files.
      Requires ORC format.
    - ``[]``
  * - ``orc_bloom_filter_fpp``
    - The ORC bloom filter false positive probability. Requires ORC format.
    - 0.05
  * - ``partitioned_by``
    - The partitioning columns for the storage table. The columns listed in the
      ``partitioned_by`` clause must be the last columns as defined in the DDL.
    - ``[]``
  * - ``skip_footer_line_count``
    - The number of footer lines to ignore when parsing the file for data.
      Requires TextFile or CSV format tables.
    -
  * - ``skip_header_line_count``
    - The number of header lines to ignore when parsing the file for data.
      Requires TextFile or CSV format tables.
    -
  * - ``sorted_by``
    - The column to sort by to determine bucketing for rows. Only valid if
      ``bucketed_by`` and ``bucket_count`` are specified as well.
    - ``[]``
  * - ``textfile_field_separator``
    - Allows the use of custom field separators, such as ``|``, for TextFile
      formatted tables.
    -
  * - ``textfile_field_separator_escape``
    - Allows the use of a custom escape character for TextFile formatted
      tables.
    -
  * - ``transactional``
    - Set this property to ``true`` to create an ORC ACID transactional table.
      Requires ORC format. This property may be shown as ``true`` for
      insert-only tables created using older versions of Hive.
    -

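For example, the following sketch combines several of the text-format
properties from the preceding table. The table and column names are
placeholders, and all columns use ``varchar`` to keep the example simple::

    CREATE TABLE hive.web.request_logs (
        request_time varchar,
        url varchar,
        user_agent varchar
    )
    WITH (
        format = 'CSV',
        csv_separator = '|',
        skip_header_line_count = 1
    )
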
.. _hive_special_columns:

Special columns
---------------

@@ -1038,11 +1143,10 @@ Retrieve all records that belong to files stored in the partition
    FROM hive.web.page_views
    WHERE "$partition" = 'ds=2016-08-09/country=US'

.. _hive_special_tables:

Special tables
--------------

The raw Hive table properties are available as a hidden table, containing a
separate column per table property, with a single row containing the property
@@ -1053,6 +1157,8 @@ You can inspect the property names and values with a simple query::

    SELECT * FROM hive.web."page_views$properties";

.. _hive_examples:

Examples
--------
