load data issues and error msg improvements #3708

Open
vagetablechicken opened this issue Jan 17, 2024 · 1 comment

vagetablechicken commented Jan 17, 2024

  • CLI local load_mode issues (high priority)
    • error messages
    • inconsistent with cluster mode

All methods to load data

| method | desc | col convert failure | col set failure | row build failure | put failure | whole failure |
| --- | --- | --- | --- | --- | --- | --- |
| insert sql | sql to insert row in router | MakeDefault recursion, hard to print row, just print row idx | print row idx | print row idx and status msg | failed rows peek | |
| java (sdk & jdbc) prepared stmt | getInsertPreparedStmt, optimized insert (FlexibleRowBuilder) | - | SQLException col pos | SQLException no row hint | log status msg | - (executeBatch returns 0/1) |
| load data cluster | getInsertPreparedStmt, but many rows | - | same | same | log, no hint in exception | readable row in exception msg |
| load data local | sql_cluster_router.cc | translate col name, type, value | cvt and set | readable row in status msg | status msg | file & lineno with error msg |
| api server | JsonReader | parse is easy, json -> row will print hint (put-cvt col name, type, value; deployment-cvt col name, type, value; query.parameter-cvt col type, idx) | the same place | just one row | status msg | status msg |
| jdbc insert row (not recommended) | | | | | | |

Single-row insertion: report column-level failures?
Multi-row insertion: report the row idx, if the user can easily get the row from it.
Spark insertion: print the failed row in readable form, because the user can't easily get the row in the Spark way.
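
The three reporting rules above could be sketched as follows. This is an illustration of the policy only; `describe_failure`, its parameters, and the message wording are hypothetical and not taken from OpenMLDB.

```python
from typing import Optional, Sequence


def describe_failure(mode: str, row_idx: int, row: Sequence[object],
                     col_name: Optional[str] = None) -> str:
    """Build a failure hint whose detail depends on how the row was inserted.

    All names and message formats here are hypothetical.
    """
    if mode == "single":
        # One row: the user already knows the row, so point at the column.
        return f"column {col_name} failed"
    if mode == "multi":
        # Many rows from the user's own statement: the index is enough.
        return f"row[{row_idx}] failed"
    if mode == "spark":
        # The user cannot map an index back to a row, so print the row itself.
        readable = ",".join("" if v is None else str(v) for v in row)
        return f"failed on ({readable})"
    raise ValueError(f"unknown mode: {mode}")
```

The point of the split is that the hint should carry only as much as the user needs to locate the bad input in each path.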

TODO: openmldb-import should use a prepared stmt instead of getInsertRow.
Local mode: use a new csv library to support escaping, but the result may still differ from the cluster (Spark) style.
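
To illustrate why escape handling can differ between two CSV readers, here is a small sketch using Python's standard `csv` module (not the library OpenMLDB uses). The two sample lines are made up; which escaping style local mode or the Spark path actually expects is not stated in this issue.

```python
import csv
import io

# Style 1: quotes inside a quoted field are doubled ("").
doubled = '1,11,"a ""quoted"" value, with comma"\n'
# Style 2: quotes inside a quoted field are backslash-escaped (\").
backslash = '1,11,"a \\"quoted\\" value, with comma"\n'

# Naive splitting breaks the quoted field at the embedded comma.
naive = doubled.rstrip("\n").split(",")

# A CSV parser keeps the field whole, but only with the matching dialect.
row_doubled = next(csv.reader(io.StringIO(doubled)))
row_backslash = next(
    csv.reader(io.StringIO(backslash), escapechar="\\", doublequote=False)
)

print(len(naive))
print(row_doubled)
print(row_backslash)
```

Both parsed rows contain the same three fields, but each input needs its own reader settings, which is the kind of mismatch the note above warns about.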

vagetablechicken changed the title from "load data in local mode issues" to "load data issues and error msg improvements" on Jan 22, 2024
@vagetablechicken vagetablechicken self-assigned this Jan 22, 2024

vagetablechicken commented Jan 23, 2024

Test

insert sql

Error: [2000] Fail to get insert info--fail to parse row[1]: (2,invalid,,failed)

load data cluster

CSV load failures generate a CSV DataFrame with NULLs and won't break loading.
Failures when setting a row or putting a row are reported:
Caused by: java.io.IOException: write row to openmldb failed on -1,0,19025,date has time,

  • 19025 is a date, stored as the number of days elapsed; I don't convert it here, so you can use the other cols to find the row
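
If you do want to read such a value as a calendar date, a minimal sketch follows. It assumes the count is days since the Unix epoch (1970-01-01), which this issue does not state explicitly; `days_to_date` is a hypothetical helper.

```python
from datetime import date, timedelta


def days_to_date(days: int) -> date:
    """Convert a day count to a date, ASSUMING day 0 is 1970-01-01."""
    return date(1970, 1, 1) + timedelta(days=days)


print(days_to_date(19025))  # 2022-02-02 under the epoch assumption
```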

An internal error raises the exception "set xx failed. pos is ..."; if execute returns false, it just throws the exception below.

insert prepared stmt

Setting a col in the row failed: exception "set xx failed. pos is ..."
execute (put) failed: the Java log shows "execute insert failed on ..."

load data local mode

Error: [2000] file [/work/test/csv_data/insert_fail.csv] line [lineno=0: 1, 11, "not date", "csv row"] insert failed, translate failed on column c3(2) with value "not date"
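
To pull the file, line number, and failing column out of a local-mode message like the one above, a regex along these lines may help. It is a sketch derived only from the single sample message shown here, so the real format may vary; `parse_local_load_error` is a hypothetical helper name.

```python
import re
from typing import Dict, Optional

# Pattern derived from the one sample message above; other messages may differ.
_LOCAL_LOAD_ERROR = re.compile(
    r"file \[(?P<file>[^\]]+)\] "
    r"line \[lineno=(?P<lineno>\d+): (?P<row>.*)\] insert failed, "
    r"translate failed on column (?P<col>\w+)\((?P<idx>\d+)\) "
    r"with value (?P<value>.*)$"
)


def parse_local_load_error(msg: str) -> Optional[Dict[str, str]]:
    """Return the named parts of the message, or None if it does not match."""
    m = _LOCAL_LOAD_ERROR.search(msg)
    return m.groupdict() if m else None


sample = (
    'Error: [2000] file [/work/test/csv_data/insert_fail.csv] '
    'line [lineno=0: 1, 11, "not date", "csv row"] insert failed, '
    'translate failed on column c3(2) with value "not date"'
)
info = parse_local_load_error(sample)
print(info["file"], info["lineno"], info["col"], info["idx"], info["value"])
```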

vagetablechicken added a commit to vagetablechicken/fedb that referenced this issue Jan 25, 2024
4paradigm#3708 print failed column value if trans failed
print failed row when load data failed
apiserver print col/row or idx
vagetablechicken added a commit that referenced this issue Feb 6, 2024
#3708 print failed column value if trans failed
print failed row when load data failed
apiserver print col/row or idx