Segmentation fault Julia 1.8.2, DataFrames v1.4.3 #3227

Closed
AndiMD opened this issue Nov 19, 2022 · 5 comments

AndiMD commented Nov 19, 2022

Dear all,

I have a long computation that results in a DataFrame like this:

558×4 DataFrame
 Row │ Pulse width (s)  Delta_Pr (µC/cm²)  Norm_Pr    sheetName 
     │ Float64          Float64            Any        Any       
─────┼──────────────────────────────────────────────────────────
   1 │          2.0e-8            1.62243  0.0370832  -1V
   2 │          3.0e-8            1.74437  0.0398703  -1V
   3 │          4.0e-8            1.49144  0.0340893  -1V
   4 │          5.0e-8            1.61675  0.0369534  -1V
   5 │          6.0e-8            1.97172  1          -1V

Column 3 contains only Float64, except the last element, which is an Int64.
Column 4 contains only String.
This specific DataFrame instance seems to be corrupted. Many operations, such as df = DataFrame(df) or transform(df, ...), consistently yield the following segfault:

df = DataFrame(df)
signal (11): Segmentation fault
in expression starting at REPL[53]:1
#copy#226 at /home/andi/.julia/packages/DataFrames/KKiZW/src/dataframe/dataframe.jl:803
copy##kw at /home/andi/.julia/packages/DataFrames/KKiZW/src/dataframe/dataframe.jl:799 [inlined]
#DataFrame#186 at /home/andi/.julia/packages/DataFrames/KKiZW/src/dataframe/dataframe.jl:249 [inlined]
DataFrame at /home/andi/.julia/packages/DataFrames/KKiZW/src/dataframe/dataframe.jl:249
unknown function (ip: 0x7f48700eb1e2)
_jl_invoke at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/gf.c:2367 [inlined]
ijl_apply_generic at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/gf.c:2549
jl_apply at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/julia.h:1839 [inlined]
do_call at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/interpreter.c:126
eval_value at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/interpreter.c:215
eval_stmt_value at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/interpreter.c:166 [inlined]
eval_body at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/interpreter.c:612
jl_interpret_toplevel_thunk at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/interpreter.c:750
jl_toplevel_eval_flex at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/toplevel.c:906
jl_toplevel_eval_flex at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/toplevel.c:850
eval_body at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/interpreter.c:556
jl_interpret_toplevel_thunk at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/interpreter.c:750
jl_toplevel_eval_flex at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/toplevel.c:906
ijl_toplevel_eval_in at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/toplevel.c:965
eval at ./boot.jl:368 [inlined]
eval_user_input at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/REPL/src/REPL.jl:151
repl_backend_loop at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/REPL/src/REPL.jl:247
start_repl_backend at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/REPL/src/REPL.jl:232
#run_repl#47 at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/REPL/src/REPL.jl:369
run_repl at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/REPL/src/REPL.jl:355
jfptr_run_repl_64841.clone_1 at /home/andi/programs/julia-1.8/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/gf.c:2367 [inlined]
ijl_apply_generic at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/gf.c:2549
#967 at ./client.jl:419
jfptr_YY.967_30403.clone_1 at /home/andi/programs/julia-1.8/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/gf.c:2367 [inlined]
ijl_apply_generic at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/gf.c:2549
jl_apply at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/julia.h:1839 [inlined]
jl_f__call_latest at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/builtins.c:774
#invokelatest#2 at ./essentials.jl:729 [inlined]
invokelatest at ./essentials.jl:726 [inlined]
run_main_repl at ./client.jl:404
exec_options at ./client.jl:318
_start at ./client.jl:522
jfptr__start_56736.clone_1 at /home/andi/programs/julia-1.8/lib/julia/sys.so (unknown line)
_jl_invoke at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/gf.c:2367 [inlined]
ijl_apply_generic at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/gf.c:2549
jl_apply at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/julia.h:1839 [inlined]
true_main at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/jlapi.c:575
jl_repl_entrypoint at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/src/jlapi.c:719
main at /cache/build/default-amdci4-6/julialang/julia-release-1-dot-8/cli/loader_exe.c:59
unknown function (ip: 0x7f48a4aedd8f)
__libc_start_main at /lib/x86_64-linux-gnu/libc.so.6 (unknown line)
unknown function (ip: 0x401098)
Allocations: 168385459 (Pool: 168340484; Big: 44975); GC: 61
Segmentation fault (core dumped)

The segfault does not occur when I reconstruct the DataFrame with CSV.write and CSV.read and then convert the last two columns to Any. With BSON, I can save the file, but loading fails.

julia> versioninfo()
Julia Version 1.8.2
Commit 36034abf260 (2022-09-29 15:21 UTC)
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 24 × AMD Ryzen 9 3900X 12-Core Processor
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-13.0.1 (ORCJIT, znver2)
  Threads: 12 on 24 virtual cores
Environment:
  LD_LIBRARY_PATH = :/usr/local/cuda-11.0/lib64
  JULIA_NUM_THREADS = 12
  JULIA_EDITOR = vim
]status
  [cbdf2221] AlgebraOfGraphics v0.6.12
  [fbb218c0] BSON v0.3.6
  [a134a8b2] BlackBoxOptim v0.6.2
  [13f3f980] CairoMakie v0.9.3
  [b0b7db55] ComponentArrays v0.13.4
  [a93c6f00] DataFrames v1.4.3
  [251d5f9e] EvalMetrics v0.2.1
  [e9467ef8] GLMakie v0.7.3
  [5903a43b] Infiltrator v1.6.3
  [a98d9a8b] Interpolations v0.14.6
  [429524aa] Optim v1.7.3
  [1fd47b50] QuadGK v2.6.0
  [276daf66] SpecialFunctions v2.1.7
  [fdbf4ff8] XLSX v0.8.4

Let me know if more details are required. Unfortunately I have not found a MWE yet.

Author

AndiMD commented Nov 19, 2022

The data in this DataFrame came from a BSON file. After regenerating the BSON file, the error disappeared.

AndiMD closed this as completed Nov 19, 2022

gnadt commented Dec 8, 2022

Seeing a similar issue using a BSON file, tested on DataFrames v1.4.1 and v1.4.4. The data can be loaded, but it fails when trying to convert a column to Symbol. There seems to be a compatibility issue with older DataFrames package versions. A workaround is DataFrames = "<1.4" in Project.toml.

DataFrames.convert.(Symbol,df[:,:col1])

signal (11): Segmentation fault: 11
in expression starting at REPL[10]:1
== at ./promotion.jl:477 [inlined]
isempty at ./dict.jl:709 [inlined]
isempty at ./abstractdict.jl:59 [inlined]
metadatakeys at /Users/gnadt/.julia/packages/DataFrames/bza1S/src/other/metadata.jl:155 [inlined]
_drop_table_nonnote_metadata! at /Users/gnadt/.julia/packages/DataFrames/bza1S/src/other/metadata.jl:749
_drop_all_nonnote_metadata! at /Users/gnadt/.julia/packages/DataFrames/bza1S/src/other/metadata.jl:759
insert_single_column! at /Users/gnadt/.julia/packages/DataFrames/bza1S/src/dataframe/dataframe.jl:653
setindex! at /Users/gnadt/.julia/packages/DataFrames/bza1S/src/dataframe/dataframe.jl:669
ijl_apply_generic at /Users/gnadt/.julia/juliaup/julia-1.8.2+0.x64/lib/julia/libjulia-internal.1.8.dylib (unknown line)
do_call at /Users/gnadt/.julia/juliaup/julia-1.8.2+0.x64/lib/julia/libjulia-internal.1.8.dylib (unknown line)
eval_body at /Users/gnadt/.julia/juliaup/julia-1.8.2+0.x64/lib/julia/libjulia-internal.1.8.dylib (unknown line)
jl_interpret_toplevel_thunk at /Users/gnadt/.julia/juliaup/julia-1.8.2+0.x64/lib/julia/libjulia-internal.1.8.dylib (unknown line)
jl_toplevel_eval_flex at /Users/gnadt/.julia/juliaup/julia-1.8.2+0.x64/lib/julia/libjulia-internal.1.8.dylib (unknown line)
jl_toplevel_eval_flex at /Users/gnadt/.julia/juliaup/julia-1.8.2+0.x64/lib/julia/libjulia-internal.1.8.dylib (unknown line)
ijl_toplevel_eval_in at /Users/gnadt/.julia/juliaup/julia-1.8.2+0.x64/lib/julia/libjulia-internal.1.8.dylib (unknown line)
eval at ./boot.jl:368 [inlined]
eval_user_input at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-x64-6.0/build/default-macmini-x64-6-0/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/REPL/src/REPL.jl:151
repl_backend_loop at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-x64-6.0/build/default-macmini-x64-6-0/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/REPL/src/REPL.jl:247
start_repl_backend at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-x64-6.0/build/default-macmini-x64-6-0/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/REPL/src/REPL.jl:232
#run_repl#47 at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-x64-6.0/build/default-macmini-x64-6-0/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/REPL/src/REPL.jl:369
run_repl at /Users/julia/.julia/scratchspaces/a66863c6-20e8-4ff4-8a62-49f30b1f605e/agent-cache/default-macmini-x64-6.0/build/default-macmini-x64-6-0/julialang/julia-release-1-dot-8/usr/share/julia/stdlib/v1.8/REPL/src/REPL.jl:355
jfptr_run_repl_65939.clone_1 at /Users/gnadt/.julia/juliaup/julia-1.8.2+0.x64/lib/julia/sys.dylib (unknown line)
ijl_apply_generic at /Users/gnadt/.julia/juliaup/julia-1.8.2+0.x64/lib/julia/libjulia-internal.1.8.dylib (unknown line)
#967 at ./client.jl:419
jfptr_YY.967_55066.clone_1 at /Users/gnadt/.julia/juliaup/julia-1.8.2+0.x64/lib/julia/sys.dylib (unknown line)
ijl_apply_generic at /Users/gnadt/.julia/juliaup/julia-1.8.2+0.x64/lib/julia/libjulia-internal.1.8.dylib (unknown line)
jl_f__call_latest at /Users/gnadt/.julia/juliaup/julia-1.8.2+0.x64/lib/julia/libjulia-internal.1.8.dylib (unknown line)
#invokelatest#2 at ./essentials.jl:729 [inlined]
invokelatest at ./essentials.jl:726 [inlined]
run_main_repl at ./client.jl:404
exec_options at ./client.jl:318
_start at ./client.jl:522
jfptr__start_36967.clone_1 at /Users/gnadt/.julia/juliaup/julia-1.8.2+0.x64/lib/julia/sys.dylib (unknown line)
ijl_apply_generic at /Users/gnadt/.julia/juliaup/julia-1.8.2+0.x64/lib/julia/libjulia-internal.1.8.dylib (unknown line)
true_main at /Users/gnadt/.julia/juliaup/julia-1.8.2+0.x64/lib/julia/libjulia-internal.1.8.dylib (unknown line)
jl_repl_entrypoint at /Users/gnadt/.julia/juliaup/julia-1.8.2+0.x64/lib/julia/libjulia-internal.1.8.dylib (unknown line)
Allocations: 9590141 (Pool: 9585805; Big: 4336); GC: 11

gnadt commented Dec 8, 2022

@AndiMD can you reopen, or why did you close it?

Member

bkamins commented Dec 8, 2022

I guess because it is expected:

There seems to be a compatibility issue with older DataFrames package versions.

See in https://github.com/JuliaData/DataFrames.jl/blob/main/NEWS.md:

DataFrame is now a mutable struct and has three new fields metadata, colmetadata, and allnotemetadata; this change makes DataFrame objects serialized under earlier versions of DataFrames.jl incompatible with version 1.4 (#3055)

You need to convert DataFrame to e.g. NamedTuple of vectors under old DataFrames.jl and load it back under DataFrames.jl 1.4. This change was unfortunately unavoidable because we added metadata.
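
The conversion described above could look roughly like the sketch below. It is only an illustration under stated assumptions: the small `df` is a stand-in for the real data, the file name `cols.jls` is hypothetical, Tables.jl is assumed to be available in the active environment, and the standard-library Serialization module is used as the storage format (another serializer such as BSON could be substituted, which is not tested here). Step 1 is meant to run under the old DataFrames.jl, where the existing object still loads; step 2 is meant to run under DataFrames.jl 1.4 or later, with the same Julia version in both steps.

```julia
using DataFrames, Tables, Serialization

# Step 1 (old DataFrames.jl, < 1.4): turn the DataFrame into plain column vectors.
# `df` is a stand-in for the DataFrame restored from the old file.
df = DataFrame("Pulse width (s)" => [2.0e-8, 3.0e-8],
               "sheetName" => ["-1V", "-1V"])

# NamedTuple of vectors: no DataFrame struct is stored, only names and columns.
cols = Tables.columntable(df)

path = joinpath(mktempdir(), "cols.jls")   # hypothetical file name
serialize(path, cols)

# Step 2 (DataFrames.jl >= 1.4): load the columns and build a fresh DataFrame,
# which is created with the new struct layout (including the metadata fields).
cols_loaded = deserialize(path)
df_new = DataFrame(cols_loaded)
```

Because only a NamedTuple of vectors is written to disk, the stored file does not depend on the internal field layout of the DataFrame type, which is what changed in 1.4.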

gnadt commented Dec 8, 2022

That clears it up! Thanks Bogumił
