Import data causes BufferOverflowException #3728

Closed
ljwh opened this issue Jan 26, 2024 · 0 comments · Fixed by #3729
Labels
bug Something isn't working

Comments

ljwh (Contributor) commented Jan 26, 2024

Bug Description
I am trying to import data from a Hive table into an online DB; an exception occurs when some string values are longer than 255 bytes:

Caused by: java.io.IOException: write row to openmldb failed on:  ... 
	at com._4paradigm.openmldb.spark.write.OpenmldbDataSingleWriter.write(OpenmldbDataSingleWriter.java:89)
	at com._4paradigm.openmldb.spark.write.OpenmldbDataSingleWriter.write(OpenmldbDataSingleWriter.java:39)
	at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.$anonfun$run$1(WriteToDataSourceV2Exec.scala:419)
	at org.apache.spark.util.Utils$.tryWithSafeFinallyAndFailureCallbacks(Utils.scala:1496)
	at org.apache.spark.sql.execution.datasources.v2.DataWritingSparkTask$.run(WriteToDataSourceV2Exec.scala:457)
	at org.apache.spark.sql.execution.datasources.v2.V2TableWriteExec.$anonfun$writeWithV2$2(WriteToDataSourceV2Exec.scala:358)
	at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:90)
	at org.apache.spark.scheduler.Task.run(Task.scala:131)
	at org.apache.spark.executor.Executor$TaskRunner.$anonfun$run$3(Executor.scala:506)
	at org.apache.spark.util.Utils$.tryWithSafeFinally(Utils.scala:1462)
	at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:509)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:750)
Caused by: java.nio.BufferOverflowException
	at java.nio.HeapByteBuffer.put(HeapByteBuffer.java:194)
	at java.nio.ByteBuffer.put(ByteBuffer.java:867)
	at com._4paradigm.openmldb.common.codec.FlexibleRowBuilder.build(FlexibleRowBuilder.java:385)
	at com._4paradigm.openmldb.sdk.impl.InsertPreparedStatementImpl.buildRow(InsertPreparedStatementImpl.java:302)
	at com._4paradigm.openmldb.sdk.impl.InsertPreparedStatementImpl.execute(InsertPreparedStatementImpl.java:317)
	at com._4paradigm.openmldb.spark.write.OpenmldbDataSingleWriter.write(OpenmldbDataSingleWriter.java:77)
	... 13 more

Expected Behavior
The data import succeeds.

Relation Case
None.

Steps to Reproduce

  1. Prepare data in which some string values are longer than 255 bytes and some are shorter.
  2. Import the data into the online DB (a repro sketch follows this list).
  3. The import fails with java.nio.BufferOverflowException.
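A minimal repro sketch through the Java SDK's PreparedStatement path, which is the same path as the stack trace above. This is a sketch under assumptions: the ZooKeeper address, database name, and table are placeholders, it uses the quickstart entry points (SqlClusterExecutor, getInsertPreparedStmt), and it assumes the row builder state is carried across executes, as described below.

    import java.sql.PreparedStatement;
    import java.util.Arrays;
    import com._4paradigm.openmldb.sdk.SdkOption;
    import com._4paradigm.openmldb.sdk.SqlExecutor;
    import com._4paradigm.openmldb.sdk.impl.SqlClusterExecutor;

    public class Repro {
        public static void main(String[] args) throws Exception {
            SdkOption option = new SdkOption();
            option.setZkCluster("127.0.0.1:2181"); // placeholder ZK address
            option.setZkPath("/openmldb");         // placeholder ZK path
            SqlExecutor executor = new SqlClusterExecutor(option);

            // t1 is a placeholder table with a single string column.
            PreparedStatement ps = executor.getInsertPreparedStmt(
                    "demo_db", "insert into t1 values (?)");

            // Row 1: a string longer than 255 bytes, which should force the
            // internal strAddrBuf to grow to 2-byte address entries.
            char[] big = new char[300];
            Arrays.fill(big, 'a');
            ps.setString(1, new String(big));
            ps.execute();

            // Row 2: a short string. If the builder is reused with the
            // enlarged strAddrBuf, this throws BufferOverflowException.
            ps.setString(1, "short");
            ps.execute();
        }
    }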

After digging into the code:

    // FlexibleRowBuilder.java
    int totalSize = strFieldStartOffset + strAddrLen + strTotalLen;
    // check whether totalSize is bigger than UINT8_MAX or UINT16_MAX ...
    int curStrAddrSize = CodecUtil.getAddrLength(totalSize);
    if (curStrAddrSize > strAddrSize) {
        // strAddrBuf is expanded when totalSize is bigger than UINT8_MAX (255)
        strAddrBuf = expandStrLenBuf(curStrAddrSize, settedStrCnt);
        strAddrSize = curStrAddrSize;
        totalSize = strFieldStartOffset + strAddrLen + strTotalLen;
    }

The private field strAddrBuf is expanded once totalSize exceeds UINT8_MAX (255) and is then reused for the following records, but its size is never reduced. A later, smaller record is therefore built with the enlarged strAddrBuf while its result buffer is sized for the smaller layout, which causes java.nio.BufferOverflowException.
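The failure pattern in isolation (a standalone sketch of the mechanism, not the actual FlexibleRowBuilder code): a scratch buffer grown for one record is reused for the next, while the destination buffer is sized for the smaller record.

    import java.nio.ByteBuffer;

    public class ReuseOverflowDemo {
        // A scratch buffer shared across rows, like strAddrBuf in the builder.
        static ByteBuffer scratch = ByteBuffer.allocate(1);

        static ByteBuffer buildRow(int addrBytes) {
            if (addrBytes > scratch.capacity()) {
                scratch = ByteBuffer.allocate(addrBytes); // grown, never shrunk
            }
            // The destination is sized for this row's layout, but the whole
            // scratch array is copied, and it may still carry the previous
            // row's larger capacity.
            ByteBuffer row = ByteBuffer.allocate(addrBytes);
            row.put(scratch.array());
            return row;
        }

        public static void main(String[] args) {
            buildRow(2); // long-string row: scratch grows to 2 bytes
            buildRow(1); // short row: throws java.nio.BufferOverflowException
        }
    }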

As a workaround, I currently shrink strAddrBuf back to its original size at the end of the result allocation, which resolves the problem.
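In terms of the demo above, the workaround amounts to shrinking the scratch buffer back after each row is built (a sketch of the idea only; the actual field names and reset point in FlexibleRowBuilder may differ):

    // Added to ReuseOverflowDemo: reset the shared scratch buffer to its
    // base size after a row's result buffer has been produced, so the next
    // row starts from the small layout again.
    static void resetScratch(int baseSize) {
        if (scratch.capacity() > baseSize) {
            scratch = ByteBuffer.allocate(baseSize);
        }
    }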
