Overview for schema evolution support in Hive #19643

Open
14 of 18 tasks
findinpath opened this issue Nov 6, 2023 · 4 comments
Labels
hive Hive connector

Comments

@findinpath
Contributor

findinpath commented Nov 6, 2023

This issue acts as an umbrella (☂️) for the coercions that are already implemented, or are still planned, in the Hive connector. It gives an overview of the work done so far to support coercion scenarios.

Coercions

  1. cla-signed hive tests:hive
  2. cla-signed hive tests:hive
  3. cla-signed hive ready to merge release-notes tests:hive
    findinpath
  4. cla-signed hive tests:hive
  5. cla-signed hive tests:hive
  6. cla-signed hive release-notes tests:hive
    findinpath
  7. cla-signed docs hive tests:hive
  8. cla-signed docs hive tests:hive

To Date type

  1. cla-signed hive ready to merge release-notes tests:hive
    findinpath

To Binary type


To String/Varchar/Char

  1. cla-signed docs hive release-notes tests:hive
    findinpath
  2. cla-signed docs hive tests:hive
  3. cla-signed hive

To Decimal

  1. cla-signed docs hive tests:hive
    findinpath
@findinpath findinpath added the hive Hive connector label Nov 6, 2023
@findinpath findinpath changed the title from "Coercions for Hive Overview" to "Overview for schema evolution support in Hive" Nov 6, 2023
@findinpath
Contributor Author

@mosabua it would probably be helpful to provide an overview of the existing coercions within the Hive connector

I stumbled by chance on https://docs.dremio.com/current/reference/sql/data-types/coercions/

@mosabua
Member

mosabua commented Nov 8, 2023

We recently updated the schema evolution docs after @dain found that a bunch of entries were missing. We now have a table at https://trino.io/docs/current/connector/hive.html#schema-evolution

If there is more info to add it would be great to get a PR ...

@Praveen2112
Member

The list of conversions supported in Hive:

  /**
   * (Rules from Hive's PrimitiveObjectInspectorUtils conversion)
   *
   * To BOOLEAN, BYTE, SHORT, INT, LONG:
   *   Convert from (BOOLEAN, BYTE, SHORT, INT, LONG) with down cast if necessary.
   *   Convert from (FLOAT, DOUBLE) using type cast to long and down cast if necessary.
   *   Convert from DECIMAL from longValue and down cast if necessary.
   *   Convert from STRING using LazyLong.parseLong and down cast if necessary.
   *   Convert from (CHAR, VARCHAR) from Integer.parseLong and down cast if necessary.
   *   Convert from TIMESTAMP using timestamp getSeconds and down cast if necessary.
   *
   *   AnyIntegerFromAnyIntegerTreeReader (written)
   *   AnyIntegerFromFloatTreeReader (written)
   *   AnyIntegerFromDoubleTreeReader (written)
   *   AnyIntegerFromDecimalTreeReader (written)
   *   AnyIntegerFromStringGroupTreeReader (written)
   *   AnyIntegerFromTimestampTreeReader (written)
   *
   * To FLOAT/DOUBLE:
   *   Convert from (BOOLEAN, BYTE, SHORT, INT, LONG) using cast
   *   Convert from FLOAT using cast
   *   Convert from DECIMAL using getDouble
   *   Convert from (STRING, CHAR, VARCHAR) using Double.parseDouble
   *   Convert from TIMESTAMP using timestamp getDouble
   *
   *   FloatFromAnyIntegerTreeReader (existing)
   *   FloatFromDoubleTreeReader (written)
   *   FloatFromDecimalTreeReader (written)
   *   FloatFromStringGroupTreeReader (written)
   *
   *   DoubleFromAnyIntegerTreeReader (existing)
   *   DoubleFromFloatTreeReader (existing)
   *   DoubleFromDecimalTreeReader (written)
   *   DoubleFromStringGroupTreeReader (written)
   *
   * To DECIMAL:
   *   Convert from (BOOLEAN, BYTE, SHORT, INT, LONG) using to HiveDecimal.create()
   *   Convert from (FLOAT, DOUBLE) using to HiveDecimal.create(string value)
   *   Convert from (STRING, CHAR, VARCHAR) using HiveDecimal.create(string value)
   *   Convert from TIMESTAMP using HiveDecimal.create(string value of timestamp getDouble)
   *
   *   DecimalFromAnyIntegerTreeReader (existing)
   *   DecimalFromFloatTreeReader (existing)
   *   DecimalFromDoubleTreeReader (existing)
   *   DecimalFromStringGroupTreeReader (written)
   *
   * To STRING, CHAR, VARCHAR:
   *   Convert from (BYTE, SHORT, INT, LONG) using to string conversion
   *   Convert from BOOLEAN using boolean (True/False) conversion
   *   Convert from (FLOAT, DOUBLE) using to string conversion
   *   Convert from DECIMAL using HiveDecimal.toString
   *   Convert from CHAR by stripping pads
   *   Convert from VARCHAR with value
   *   Convert from TIMESTAMP using Timestamp.toString
   *   Convert from DATE using Date.toString
   *   Convert from BINARY using Text.decode
   *
   *   StringGroupFromAnyIntegerTreeReader (written)
   *   StringGroupFromBooleanTreeReader (written)
   *   StringGroupFromFloatTreeReader (written)
   *   StringGroupFromDoubleTreeReader (written)
   *   StringGroupFromDecimalTreeReader (written)
   *
   *   String from Char/Varchar conversion
   *   Char from String/Varchar conversion
   *   Varchar from String/Char conversion
   *
   *   StringGroupFromTimestampTreeReader (written)
   *   StringGroupFromDateTreeReader (written)
   *   StringGroupFromBinaryTreeReader *****
   *
   * To TIMESTAMP:
   *   Convert from (BOOLEAN, BYTE, SHORT, INT, LONG) using TimestampWritable.longToTimestamp
   *   Convert from (FLOAT, DOUBLE) using TimestampWritable.doubleToTimestamp
   *   Convert from DECIMAL using TimestampWritable.decimalToTimestamp
   *   Convert from (STRING, CHAR, VARCHAR) using string conversion
   *   Or, from DATE
   *
   *   TimestampFromAnyIntegerTreeReader (written)
   *   TimestampFromFloatTreeReader (written)
   *   TimestampFromDoubleTreeReader (written)
   *   TimestampFromDecimalTreeReader (written)
   *   TimestampFromStringGroupTreeReader (written)
   *   TimestampFromDateTreeReader
   *
   *
   * To DATE:
   *   Convert from (STRING, CHAR, VARCHAR) using string conversion.
   *   Or, from TIMESTAMP.
   *
   *  DateFromStringGroupTreeReader (written)
   *  DateFromTimestampTreeReader (written)
   *
   * To BINARY:
   *   Convert from (STRING, CHAR, VARCHAR) using getBinaryFromText
   *
   *  BinaryFromStringGroupTreeReader (written)
   *
   * (Notes from StructConverter)
   *
   * To STRUCT:
   *   Input must be data type STRUCT
   *   minFields = Math.min(numSourceFields, numTargetFields)
   *   Convert those fields
   *   Extra targetFields to NULL
   *
   * (Notes from ListConverter)
   *
   * To LIST:
   *   Input must be data type LIST
   *   Convert elements
   *
   * (Notes from MapConverter)
   *
   * To MAP:
   *   Input must be data type MAP
   *   Convert keys and values
   *
   * (Notes from UnionConverter)
   *
   * To UNION:
   *   Input must be data type UNION
   *   Convert value for tag
   */ 
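
To make the "To STRUCT" rule in the comment above concrete, here is a minimal Java sketch of how I read it: convert the first `min(numSourceFields, numTargetFields)` fields and fill any extra target fields with NULL. The names `StructCoercionSketch` and `convertStruct` are hypothetical and exist only for illustration; they are not taken from Hive, ORC, or Trino, and the per-field converters are stand-ins for whatever primitive coercion applies to each field.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Illustrative only: the class and method names are hypothetical,
// not part of Hive, ORC, or Trino.
final class StructCoercionSketch {
    private StructCoercionSketch() {}

    /**
     * Applies the "To STRUCT" rule quoted above: the first
     * min(numSourceFields, numTargetFields) fields are converted,
     * and any extra target fields are set to null.
     * Extra source fields beyond the target's field count are dropped.
     */
    static List<Object> convertStruct(
            List<Object> sourceFields,
            List<Function<Object, Object>> targetFieldConverters) {
        int numTargetFields = targetFieldConverters.size();
        int minFields = Math.min(sourceFields.size(), numTargetFields);

        List<Object> result = new ArrayList<>(numTargetFields);
        for (int i = 0; i < minFields; i++) {
            Object value = sourceFields.get(i);
            // A null source value stays null; otherwise apply the field's conversion.
            result.add(value == null ? null : targetFieldConverters.get(i).apply(value));
        }
        // Target fields with no matching source field become null.
        for (int i = minFields; i < numTargetFields; i++) {
            result.add(null);
        }
        return result;
    }

    public static void main(String[] args) {
        // Target struct has three fields; the stored data only has two.
        List<Function<Object, Object>> converters = new ArrayList<>();
        converters.add(v -> ((Integer) v).longValue()); // e.g. INT -> LONG
        converters.add(v -> v.toString());              // e.g. anything -> STRING
        converters.add(v -> v);                         // no conversion

        List<Object> source = Arrays.asList(1, "x");
        System.out.println(convertStruct(source, converters)); // prints [1, x, null]
    }
}
```

Fields are matched by position here because the quoted rule talks only about field counts; whether a given file format or reader matches by name instead is outside what this sketch covers.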

@dain
Member

dain commented Nov 10, 2023

Are there issues for all of the missing items on that list now?

Development

No branches or pull requests

4 participants