Function to_utf8 supports char type #20158

Merged on Jan 12, 2024 (2 commits)

Contributor

@borderlayout borderlayout commented Dec 18, 2023

Description

The to_utf8 function does not support the char type. Applying it to a char column in a Hive table fails with an error. This PR modifies to_utf8 to accept char values. The function matters because it is also used by the ranger-trino plugin for masking.

Additional context and related issues

In Hive, create a table and insert two rows:
CREATE TABLE char_table_test (
id INT,
name string,
description CHAR(120)
);
insert into char_table_test values(1, 'xiaoming', 'he is a student');
insert into char_table_test values(2, 'honghong', 'she is a teacher');

Before this change, calling to_utf8 on the char column fails because the function only accepts varchar:
SQL1:
select to_utf8(description) from char_table_test;
Query 20231218_125526_00001_ewnva failed: line 1:8: Unexpected parameters (char(120)) for function to_utf8. Expected: to_utf8(varchar(x))
select to_utf8(description) from char_table_test;

SQL2:
select length(from_utf8(to_utf8(description))) from char_table_test;
Query 20231218_125334_00000_ewnva failed: line 1:25: Unexpected parameters (char(120)) for function to_utf8. Expected: to_utf8(varchar(x))
select length(from_utf8(to_utf8(description))) from char_table_test

After this change, to_utf8 accepts the char type:

SQL1:
select to_utf8(description) from char_table_test;
_col0

68 65 20 69 73 20 61 20 73 74 75 64 65 6e 74 20+
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20+
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20+
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20+
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20+
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20+
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20+
20 20 20 20 20 20 20 20
73 68 65 20 69 73 20 61 20 74 65 61 63 68 65 72+
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20+
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20+
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20+
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20+
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20+
20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20+
20 20 20 20 20 20 20 20
(2 rows)

SQL2:
select from_utf8(to_utf8(description)) from char_table_test;
_col0

she is a teacher
he is a student

SQL3:
select length(from_utf8(to_utf8(description))) from char_table_test;
_col0

120
120
(2 rows)
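
The 120-byte results above are consistent with each char(120) value being padded with trailing spaces (0x20) up to its declared length before it is returned as bytes. The following is a minimal plain-JDK sketch of that behaviour, not the Trino implementation. `CharToUtf8Sketch` and `toUtf8Padded` are hypothetical names introduced only for illustration, and the sketch works on `String` rather than Trino's `Slice`.

```java
import java.nio.charset.StandardCharsets;

// Illustration only: hypothetical helper, not part of Trino.
final class CharToUtf8Sketch
{
    private CharToUtf8Sketch() {}

    // Pads the value with trailing spaces up to the declared char(n) length
    // (counted in code points), then encodes the padded text as UTF-8.
    static byte[] toUtf8Padded(String value, int declaredLength)
    {
        int codePoints = value.codePointCount(0, value.length());
        if (codePoints > declaredLength) {
            throw new IllegalArgumentException("value longer than char(" + declaredLength + ")");
        }
        StringBuilder padded = new StringBuilder(value);
        for (int i = codePoints; i < declaredLength; i++) {
            padded.append(' ');
        }
        return padded.toString().getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args)
    {
        byte[] bytes = toUtf8Padded("he is a student", 120);
        System.out.println(bytes.length); // 120
        System.out.printf("%02x %02x%n", bytes[0], bytes[15]); // 68 20
    }
}
```

For ASCII-only text the byte count equals the declared length, which matches the `length(...)` result of 120 shown above. With multi-byte characters the byte count would exceed the declared length.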

Release notes

(x) Release notes are required, with the following suggested text:

* Add support for `char(n)` values in {func}`to_utf8`. ({issue}`20158`)

@cla-bot cla-bot bot added the cla-signed label Dec 18, 2023
@wendigo wendigo requested a review from martint December 18, 2023 13:17
@mosabua
Member

mosabua commented Dec 18, 2023

Docs need update as well

@martint
Member

martint commented Dec 18, 2023

Syntax and semantics look good.

@SqlType(StandardTypes.VARBINARY)
public static Slice toUtf8(@LiteralParameter("x") long x, @SqlType("char(x)") Slice slice)
{
return Chars.padSpaces(slice, (int) x);
Contributor

toIntExact(x)

Contributor Author

OK, I have just updated the code.
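
For context on the review suggestion: assuming `toIntExact` refers to `java.lang.Math.toIntExact`, it differs from the `(int) x` cast in the hunk by throwing on overflow instead of silently truncating. The sketch below shows the difference with plain JDK calls. `NarrowingSketch`, `castLength`, and `exactLength` are hypothetical names used only for this illustration.

```java
// Illustration only: hypothetical class, not part of Trino.
final class NarrowingSketch
{
    private NarrowingSketch() {}

    // A plain narrowing cast keeps the low 32 bits and never fails.
    static int castLength(long x)
    {
        return (int) x;
    }

    // Math.toIntExact throws ArithmeticException if x does not fit in an int.
    static int exactLength(long x)
    {
        return Math.toIntExact(x);
    }

    public static void main(String[] args)
    {
        long tooLarge = Integer.MAX_VALUE + 1L;
        System.out.println(castLength(tooLarge)); // -2147483648
        try {
            exactLength(tooLarge);
        }
        catch (ArithmeticException e) {
            System.out.println("overflow detected");
        }
        System.out.println(exactLength(120L)); // 120
    }
}
```

For a declared length such as 120 both forms return the same value; the exact variant only changes behaviour when the value does not fit in an int.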

@wendigo
Contributor

wendigo commented Jan 12, 2024

@mosabua please squash these commits into a single one

@mosabua
Member

mosabua commented Jan 12, 2024

> @mosabua please squash these commits into a single one

I think @wendigo meant for @borderlayout to squash commits...

@martint martint merged commit 1f5d0ea into trinodb:master Jan 12, 2024
88 checks passed
@github-actions github-actions bot added this to the 437 milestone Jan 12, 2024