-
Notifications
You must be signed in to change notification settings - Fork 542
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Treating the Empty String as NULL since 0.3.2-patch6 #931
Comments
Thanks @mafiore. It sounds like empty string result in a null flag. Let me add a few more test cases to see if I can reproduce the issue - would be great if you can share a minimal test case. |
thanks for your quick reply @zhicwu. My test-case is build in Apache Hop, so this is no raw code to share. Up to 0.3.2-patch5 everthing worked fine. When we tried to change to a higher revision, the mess began. I ended up with a Simple Table like
I set the This leads to the described error. If I set the value to e.g. There must be a related change between |
Below test case works for me. Did I miss something? try (ClickHouseConnection conn = newConnection(new Properties());
Statement s = conn.createStatement()) {
s.execute("drop table if exists test_read_write_strings;"
+ "create table test_read_write_strings(id Int32, s1 String, s2 Nullable(String), s3 Array(Nullable(String)))engine=Memory");
try (PreparedStatement stmt = conn
.prepareStatement("insert into test_read_write_strings")) {
stmt.setInt(1, 1);
stmt.setString(2, "");
stmt.setString(3, "");
stmt.setObject(4, new String[] { "" });
stmt.addBatch();
stmt.setInt(1, 2);
stmt.setString(2, "");
stmt.setString(3, null);
stmt.setObject(4, new String[] { null });
stmt.addBatch();
int[] results = stmt.executeBatch();
Assert.assertEquals(results, new int[] { 1, 1 });
}
ResultSet rs = s.executeQuery("select * from test_read_write_strings order by id");
Assert.assertTrue(rs.next());
Assert.assertEquals(rs.getInt(1), 1);
Assert.assertEquals(rs.getString(2), "");
Assert.assertEquals(rs.getString(3), "");
Assert.assertEquals(rs.getObject(4), new String[] { "" });
Assert.assertTrue(rs.next());
Assert.assertEquals(rs.getInt(1), 2);
Assert.assertEquals(rs.getString(2), "");
Assert.assertEquals(rs.getString(3), null);
Assert.assertEquals(rs.getObject(4), new String[] { null });
Assert.assertFalse(rs.next());
} |
This might not related but what's the version of ClickHouse you're using? |
22.3.2 revision 54455. |
Hmm... I can only reproduce the issue when trying to write null to a non-nullable string column. Also I didn't see suspicious change in patch6. Could that be some feature in Hop to treat empty string as null? |
No there is absolutely nothing like that in Hop. What I do ist to filter out the nulls and set them explicitely to an empty String to avoid Nulls. to be wriiten to a non nullable column.I tried it in every direction. leave the filed null, set the field to empty String, set it to a defaut String "-" . The crazy is, that it works with the same setup since month, but the switch away from rev5 is breaking it. For the moment it is ok for us to stay at rev5, but this no longterm solution. |
Thanks @mafiore and sorry for the inconvenience at your end. In general, we should avoid writing null to a non-nullable column. However, it's not always true - for example, when we want to use default value of a column, we may pass null even we know the column is not nullable. This is also the reason why #935 didn't pass build check :) I think both cases can be addressed by adding a JDBC-specific option
Make sense? |
I agree with you @zhicwu but I don't tried write NULL to non nullable. I change NULL to an empty String which is not NULL at all and this had worked until rev5. Thank you ! |
I understand. It could be a breaking change I made without noticing. But with the new option, it can simplify your ETL work too, because you'll no longer need additional step to convert null to empty string. Have you tried insert_null_as_default before? It could be a server-side solution. |
@zhicwu this is a very cool helpful feature. The clickhouse-developers do a great job. I curious how it will be when the jdbc driver supports native connections (hot http based) |
@mafiore, did you mean clickhouse-tcp-client? I don't think it will help much, except native data format support(for parallel deserialization & serialization). If the concern is performance, you may watch #928 as I'll provide more updates in the next following days. To put in short, the JDBC driver is very slow. Java client and streaming make things better, but still far behind ClickHouse native client. |
@zhicwu It may be that JDBC ist slow compared to the native CH Client. But for our needs it is fast like a flash :-) |
Exactly. There'll be a |
Example:
|
hello @zhicwu Error inserting/updating row Why can't this not work, like it did before patch 6 ? I tried every combination and ended up with setting the value to "empty", which is not what I want. The empty String is a String with the length of 0, not NULL |
Sorry to hear about that @mafiore. Let's reopen the issue. Can you provide minimum pipeline for reproducing the issue? I'll then install Apache hop and debug from there. |
@mafiore, I tried on hop 2.0.0 and was not able to reproduce the issue. On a side note, it looks like they started to enhance clickhouse plugin by replacing clickhouse4j to official driver, so that nobody has to use generic database plugin to access ClickHouse :D |
@zhicwu Sorry, But I have a lack of time thees days.
Who do you think, has triggered this? :-) |
@zhicwu Hey there, unfortunately this bug still occurs for us. Independently from the value 2022/11/24 11:02:35 - Table output.0 - Error inserting row into table [Test] with values: [null], [null]
2022/11/24 11:02:35 - Table output.0 -
2022/11/24 11:02:35 - Table output.0 - Unexpected error inserting/updating row in part [insertRow add batch]
2022/11/24 11:02:35 - Table output.0 - at java.lang.Thread.run (Thread.java:829)
2022/11/24 11:02:35 - Table output.0 - at org.apache.hop.pipeline.transform.RunThread.run (RunThread.java:51)
2022/11/24 11:02:35 - Table output.0 - at org.apache.hop.pipeline.transforms.tableoutput.TableOutput.processRow (TableOutput.java:117)
2022/11/24 11:02:35 - Table output.0 - at org.apache.hop.pipeline.transforms.tableoutput.TableOutput.writeToTable (TableOutput.java:251)
2022/11/24 11:02:35 - Table output.0 - at org.apache.hop.core.database.Database.insertRow (Database.java:1097)
2022/11/24 11:02:35 - Table output.0 - at com.clickhouse.jdbc.internal.InputBasedPreparedStatement.addBatch (InputBasedPreparedStatement.java:313)
2022/11/24 11:02:35 - Table output.0 - at com.clickhouse.client.data.ClickHouseRowBinaryProcessor$MappedFunctions.serialize (ClickHouseRowBinaryProcessor.java:486)
2022/11/24 11:02:35 - Table output.0 - at com.clickhouse.client.data.ClickHouseRowBinaryProcessor$MappedFunctions.lambda$buildMappingsForDataTypes$65 (ClickHouseRowBinaryProcessor.java:337)
2022/11/24 11:02:35 - Table output.0 - at com.clickhouse.client.data.BinaryStreamUtils.writeString (BinaryStreamUtils.java:1667) One therory is that the empty string interferes with the check for a null character here: I created an example repo to demonstrate this behaviour if you want to check it out yourself: edit: |
Hi @uklft, thanks for sharing your findings. Just tried the example you provided using drop table if exists test_empty_string;
create table test_empty_string(id Int32, name String)engine=MergeTree() order by tuple(); To upgrade JDBC driver:
Did I miss anything?
What's the expectation? The option is only useful when you pass null values to PreparedStatement, but I'm not sure how I can do that in DBeaver. If you're talking about binding variables like Update: submitted https://issues.apache.org/jira/browse/HOP-4629 for upgrading JDBC driver and remove http-client jar, which might be the cause of the issue. |
Hey @zhicwu thanks for looking into it and creating an Issue for Hop! You are right, the parameter works as expected with patch11. Concerning DBeaver: I expected the parameter to work also on a direct |
Thanks @uklft for confirming the issue has been resolved. |
@zhicwu Sorry to bother you again, but we noticed that the original problem of this issue unfortunately is not resolved.
Version is |
Hi @uklft, just tried your repo again and it works for me. Did you update database configuration by setting |
Thanks for looking into it. If i add The problem is that the driver sends a |
I see. By default, Apache Hop treats empty string as null value, but you can change the behavior by setting variable |
Oh wow, that is the key. Thank you so much for pointing that out! |
today I updated clickhouse and now same python code that passed None data to clickhouse-connect and made Null values in tables, now makes empty string. |
I can reproduce this misbehaviour since version 0.3.2-patch6.
Up to version 0.3.2-patch5 everything worked fine.
I poked around an bit and found out that the jdbc-driver has problems with empty strings.
It seems that it handles empty strings like null values.
Before patch6 everything worked fine. This behavior still exists in patch9
This relates to #896
Unexpected error inserting/updating row in part [insertRow add batch]
...
com.clickhouse.client.data.BinaryStreamUtils.writeString (BinaryStreamUtils.java:1667)
If is replace the empty string value with any character the error is gone.
The text was updated successfully, but these errors were encountered: