Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition in 0.9 #888

Closed
palday opened this issue Sep 8, 2021 · 7 comments · Fixed by #898
Closed

Race condition in 0.9 #888

palday opened this issue Sep 8, 2021 · 7 comments · Fixed by #898

Comments

@palday
Copy link

palday commented Sep 8, 2021

I will try to provide a nicer MWE later, but I ran into a problem while working on some proprietary data. The file in question parses without problem in CSV 0.8.3, but fails under the same options in 0.9. Adding ntasks=1 in CSV 0.9 solves the issue, which makes me suspect a race condition.

cc: @dmbates

@quinnj
Copy link
Member

quinnj commented Sep 8, 2021

Thanks for reporting; even if you can only share the error + stacktrace you're seeing, that can often be very helpful.

@palday
Copy link
Author

palday commented Sep 9, 2021

Filename passed as a string; kwargs:

comment="#", 
delim=';', 
missingstring=["NA",""],
downcast=true,
drop=[1,2],
decimal=','

Error + stacktrace:

TaskFailedException

nested task error: AssertionError: task_col.column isa Vector{UInt32}

Stacktrace:

[1] syncrefs!(#unused#::Type{String}, col::CSV.Column, task_col::CSV.Column, task_rows::Int64)

@ CSV ~/.julia/packages/CSV/eB98W/src/file.jl:474

[2] multithreadpostparse(ctx::CSV.Context, ntasks::Int64, pertaskcolumns::Vector{Vector{CSV.Column}}, rows::Vector{Int64}, rowchunkguess::Int64, finalrows::Int64, j::Int64, col::CSV.Column)

@ CSV ~/.julia/packages/CSV/eB98W/src/file.jl:425

[3] (::CSV.var"#26#31"{CSV.Context, Vector{Int64}, Vector{Vector{CSV.Column}}, Int64, Int64, CSV.Column, Int64, Int64})()

@ CSV ./threadingconstructs.jl:169

...and 1 more exception.

    sync_end(::Channel{Any})@task.jl:369
    macro expansion@task.jl:388[inlined]
    CSV.File(::CSV.Context, ::Bool)@file.jl:257
    File@file.jl:223[inlined]
    #File#[email protected]:219[inlined]
    top-level scope@Local: 2[inlined]

@quinnj
Copy link
Member

quinnj commented Sep 9, 2021

Thanks that's helpful. If there's anyway to share the file privately with me, that would be extremely helpful in terms of debugging this; happy to jump on a call/zoom to facilitate sharing if possible. Otherwise, I'll try to walk through the code and figure out how this is getting triggered. In the mean time, you can avoid the error by doing pool=false, which will avoid this code path entirely.

quinnj added a commit that referenced this issue Sep 12, 2021
Fixes #888. One of the core issues is that we weren't "remembering",
when a column's type got detected, whether its pool value was a generic
pool value (like `pool=true`), or provided per-column. This is unique to
pooling, because we don't/can't really apply the pooling until we know
the column type, so it's usually post-context.

The proposal here adds another field `columnspecificpool`, which will
remember this property. The other big issue is that the pooling
mechanics have shifted around a bit throughout the recent internals
refactoring, so we were a little inconsistent in how we were handling
things overall.
quinnj added a commit that referenced this issue Sep 13, 2021
Fixes #888. One of the core issues is that we weren't "remembering",
when a column's type got detected, whether its pool value was a generic
pool value (like `pool=true`), or provided per-column. This is unique to
pooling, because we don't/can't really apply the pooling until we know
the column type, so it's usually post-context.

The proposal here adds another field `columnspecificpool`, which will
remember this property. The other big issue is that the pooling
mechanics have shifted around a bit throughout the recent internals
refactoring, so we were a little inconsistent in how we were handling
things overall.
@quinnj
Copy link
Member

quinnj commented Sep 13, 2021

I could reproduce the issue on 0.9.1 and the proposed fix in #898 fixes it for me.

@palday
Copy link
Author

palday commented Sep 14, 2021

In the REPL the fix works, but it in Pluto it doesn't:

TaskFailedException

nested task error: TypeError: in typeassert, expected Dict{Union{Missing, String}, UInt32}, got a value of type Dict{Union{Missing, WeakRefStrings.String31}, UInt32}

Stacktrace:

[1] syncrefs!(#unused#::Type{String}, col::CSV.Column, task_col::CSV.Column, task_rows::Int64)

@ CSV ~/.julia/packages/CSV/h66c0/src/file.jl:476

[2] multithreadpostparse(ctx::CSV.Context, ntasks::Int64, pertaskcolumns::Vector{Vector{CSV.Column}}, rows::Vector{Int64}, rowchunkguess::Int64, finalrows::Int64, j::Int64, col::CSV.Column)

@ CSV ~/.julia/packages/CSV/h66c0/src/file.jl:422

[3] (::CSV.var"#26#31"{CSV.Context, Vector{Int64}, Vector{Vector{CSV.Column}}, Int64, Int64, CSV.Column, Int64, Int64})()

@ CSV ./threadingconstructs.jl:169

    sync_end(::Channel{Any})@task.jl:369
    macro expansion@task.jl:388[inlined]
    CSV.File(::CSV.Context, ::Bool)@file.jl:257
    File@file.jl:223[inlined]
    #File#[email protected]:219[inlined]
    top-level scope@Local: 2[inlined]

Note that Pluto runs the notebooks in a Distributed worker process, so you get slightly different scheduling. I think the problem here could be sidestepped by specifying the string type, but the auto detection threshold is what has the race condition.

@quinnj quinnj reopened this Sep 16, 2021
@quinnj
Copy link
Member

quinnj commented Sep 16, 2021

Thanks for the follow up; I'll try to dig in and see if there's still a promotion problem with the refpool sometimes.

@quinnj quinnj closed this as completed Sep 20, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants