This repository has been archived by the owner on Feb 18, 2024. It is now read-only.

Bitmap::mutable line 155 may Panic/segfault #309

Closed
ritchie46 opened this issue Aug 21, 2021 · 5 comments · Fixed by #311

@ritchie46
Collaborator

This panics in a debug build and segfaults in a release build.

The line numbers in the logs seem to be offset. I believe this is the offending line:

std::iter::repeat(0b11111111u8).take(required - existing),
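
If `required` can ever be smaller than `existing`, the subtraction on this line underflows. Below is a minimal sketch of a guarded version, assuming both values are `usize` byte counts. `padding_bytes` and `set_padding` are hypothetical helpers for illustration, not part of arrow2.

```rust
/// Hypothetical helper (not an arrow2 API): number of all-ones bytes to append.
/// Returns `None` when `required < existing` instead of underflowing.
fn padding_bytes(required: usize, existing: usize) -> Option<usize> {
    required.checked_sub(existing)
}

/// Builds the padding, treating an underflow as "nothing to append".
fn set_padding(required: usize, existing: usize) -> Vec<u8> {
    let n = padding_bytes(required, existing).unwrap_or(0);
    std::iter::repeat(0b1111_1111u8).take(n).collect()
}

fn main() {
    println!("{:?}", padding_bytes(82, 80));
    println!("{:?}", padding_bytes(82, 83));
    println!("{}", set_padding(82, 83).len());
}
```

Whether appending nothing is the right reaction depends on the caller: a `None` here may also indicate that the bitmap's bookkeeping is already inconsistent.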

Backtrace:

RUST_BACKTRACE=1 POLARS_MAX_THREADS=2 python -m user_guide.src.examples.projection_pushdown
thread '<unnamed>' panicked at 'attempt to subtract with overflow', /home/ritchie46/.cargo/git/checkouts/arrow2-8a2ad61d97265680/fa3b003/src/bitmap/mutable.rs:157:54
stack backtrace:
   0: rust_begin_unwind
             at /rustc/492723897e9b4db6701b3a75b72618d08a7d5319/library/std/src/panicking.rs:516:5
   1: core::panicking::panic_fmt
             at /rustc/492723897e9b4db6701b3a75b72618d08a7d5319/library/core/src/panicking.rs:93:14
   2: core::panicking::panic
             at /rustc/492723897e9b4db6701b3a75b72618d08a7d5319/library/core/src/panicking.rs:50:5
   3: arrow2::bitmap::mutable::MutableBitmap::extend_set
             at /home/ritchie46/.cargo/git/checkouts/arrow2-8a2ad61d97265680/fa3b003/src/bitmap/mutable.rs:157:54
   4: arrow2::bitmap::mutable::MutableBitmap::extend_constant
             at /home/ritchie46/.cargo/git/checkouts/arrow2-8a2ad61d97265680/fa3b003/src/bitmap/mutable.rs:188:13
   5: arrow2::array::growable::utils::build_extend_null_bits::{{closure}}
             at /home/ritchie46/.cargo/git/checkouts/arrow2-8a2ad61d97265680/fa3b003/src/array/growable/utils.rs:32:13
   6: <alloc::boxed::Box<F,A> as core::ops::function::Fn<Args>>::call
             at /rustc/492723897e9b4db6701b3a75b72618d08a7d5319/library/alloc/src/boxed.rs:1586:9
   7: <arrow2::array::growable::utf8::GrowableUtf8<O> as arrow2::array::growable::Growable>::extend
             at /home/ritchie46/.cargo/git/checkouts/arrow2-8a2ad61d97265680/fa3b003/src/array/growable/utf8.rs:66:9
   8: arrow2::compute::concat::concatenate
             at /home/ritchie46/.cargo/git/checkouts/arrow2-8a2ad61d97265680/fa3b003/src/compute/concat.rs:59:9
   9: <polars_core::chunked_array::ChunkedArray<polars_core::datatypes::Utf8Type> as polars_core::chunked_array::ops::chunkops::ChunkOps>::rechunk
             at /home/ritchie46/code/polars/polars/polars-core/src/chunked_array/ops/chunkops.rs:56:31
  10: <polars_core::series::implementations::SeriesWrap<polars_core::chunked_array::ChunkedArray<polars_core::datatypes::Utf8Type>> as polars_core::series::SeriesTrait>::rechunk
             at /home/ritchie46/code/polars/polars/polars-core/src/series/implementations/mod.rs:678:17
  11: polars_core::frame::DataFrame::as_single_chunk
             at /home/ritchie46/code/polars/polars/polars-core/src/frame/mod.rs:158:18
  12: <polars_io::csv::CsvReader<R> as polars_io::SerReader<R>>::finish
             at /home/ritchie46/code/polars/polars/polars-io/src/csv.rs:511:13
  13: <polars_lazy::physical_plan::executors::scan::CsvExec as polars_lazy::physical_plan::Executor>::execute
             at /home/ritchie46/code/polars/polars/polars-lazy/src/physical_plan/executors/scan.rs:143:18
  14: <polars_lazy::physical_plan::executors::join::JoinExec as polars_lazy::physical_plan::Executor>::execute
             at /home/ritchie46/code/polars/polars/polars-lazy/src/physical_plan/executors/join.rs:58:14
  15: <polars_lazy::physical_plan::executors::udf::UdfExec as polars_lazy::physical_plan::Executor>::execute
             at /home/ritchie46/code/polars/polars/polars-lazy/src/physical_plan/executors/udf.rs:12:18
  16: <polars_lazy::physical_plan::executors::udf::UdfExec as polars_lazy::physical_plan::Executor>::execute
             at /home/ritchie46/code/polars/polars/polars-lazy/src/physical_plan/executors/udf.rs:12:18
  17: polars_lazy::frame::LazyFrame::collect
             at /home/ritchie46/code/polars/polars/polars-lazy/src/frame.rs:615:19
  18: polars_lazy::frame::LazyFrame::fetch
             at /home/ritchie46/code/polars/polars/polars-lazy/src/frame.rs:482:19
  19: polars::lazy::dataframe::PyLazyFrame::fetch::{{closure}}
             at /home/ritchie46/code/polars/py-polars/src/lazy/dataframe.rs:202:38
  20: <std::panic::AssertUnwindSafe<F> as core::ops::function::FnOnce<()>>::call_once
             at /rustc/492723897e9b4db6701b3a75b72618d08a7d5319/library/std/src/panic.rs:347:9
  21: std::panicking::try::do_call
             at /rustc/492723897e9b4db6701b3a75b72618d08a7d5319/library/std/src/panicking.rs:401:40
  22: __rust_try
  23: std::panicking::try
             at /rustc/492723897e9b4db6701b3a75b72618d08a7d5319/library/std/src/panicking.rs:365:19
  24: std::panic::catch_unwind
             at /rustc/492723897e9b4db6701b3a75b72618d08a7d5319/library/std/src/panic.rs:434:14
  25: pyo3::python::Python::allow_threads
             at /home/ritchie46/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/pyo3-0.14.1/src/python.rs:268:22
  26: polars::lazy::dataframe::PyLazyFrame::fetch
             at /home/ritchie46/code/polars/py-polars/src/lazy/dataframe.rs:202:18
  27: polars::lazy::dataframe::__init8915052033671681482::__wrap::{{closure}}
             at /home/ritchie46/code/polars/py-polars/src/lazy/dataframe.rs:84:1
  28: pyo3::callback::handle_panic::{{closure}}
             at /home/ritchie46/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/pyo3-0.14.1/src/callback.rs:247:9
  29: std::panicking::try::do_call
             at /rustc/492723897e9b4db6701b3a75b72618d08a7d5319/library/std/src/panicking.rs:401:40
  30: __rust_try
  31: std::panicking::try
             at /rustc/492723897e9b4db6701b3a75b72618d08a7d5319/library/std/src/panicking.rs:365:19
  32: std::panic::catch_unwind
             at /rustc/492723897e9b4db6701b3a75b72618d08a7d5319/library/std/src/panic.rs:434:14
  33: pyo3::callback::handle_panic
             at /home/ritchie46/.cargo/registry/src/github.aaakk.us.kg-1ecc6299db9ec823/pyo3-0.14.1/src/callback.rs:245:24
  34: polars::lazy::dataframe::__init8915052033671681482::__wrap
             at /home/ritchie46/code/polars/py-polars/src/lazy/dataframe.rs:84:1
  35: method_vectorcall_VARARGS_KEYWORDS
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Objects/descrobject.c:332
  36: _PyObject_Vectorcall
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Include/cpython/abstract.h:127:11
  37: call_function
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:4963
  38: _PyEval_EvalFrameDefault
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:3486
  39: PyEval_EvalFrameEx
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:741:12
  40: _PyEval_EvalCodeWithName
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:4298
  41: _PyFunction_Vectorcall
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Objects/call.c:435:12
  42: _PyObject_Vectorcall
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Include/cpython/abstract.h:127:11
  43: call_function
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:4963
  44: _PyEval_EvalFrameDefault
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:3486
  45: PyEval_EvalFrameEx
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:741:12
  46: _PyEval_EvalCodeWithName
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:4298
  47: PyEval_EvalCodeEx
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:4327
  48: PyEval_EvalCode
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:718
  49: builtin_exec_impl
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/bltinmodule.c:1033
  50: builtin_exec
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/clinic/bltinmodule.c.h:396
  51: cfunction_vectorcall_FASTCALL
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Objects/methodobject.c:422:24
  52: PyVectorcall_Call
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Objects/call.c:199:24
  53: do_call_core
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:4983
  54: _PyEval_EvalFrameDefault
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:3559
  55: PyEval_EvalFrameEx
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:741:12
  56: _PyEval_EvalCodeWithName
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:4298
  57: _PyFunction_Vectorcall
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Objects/call.c:435:12
  58: _PyObject_Vectorcall
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Include/cpython/abstract.h:127:11
  59: call_function
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:4963
  60: _PyEval_EvalFrameDefault
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:3469
  61: PyEval_EvalFrameEx
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:741:12
  62: function_code_fastcall
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Objects/call.c:283:14
  63: _PyFunction_Vectorcall
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Objects/call.c:410:20
  64: _PyObject_Vectorcall
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Include/cpython/abstract.h:127:11
  65: call_function
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:4963
  66: _PyEval_EvalFrameDefault
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:3486
  67: PyEval_EvalFrameEx
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:741:12
  68: function_code_fastcall
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Objects/call.c:283:14
  69: _PyFunction_Vectorcall
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Objects/call.c:410:20
  70: _PyObject_Vectorcall
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Include/cpython/abstract.h:127:11
  71: call_function
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:4963
  72: _PyEval_EvalFrameDefault
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:3500
  73: PyEval_EvalFrameEx
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:741:12
  74: function_code_fastcall
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Objects/call.c:283:14
  75: _PyFunction_Vectorcall
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Objects/call.c:410:20
  76: _PyObject_Vectorcall
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Include/cpython/abstract.h:127:11
  77: call_function
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:4963
  78: _PyEval_EvalFrameDefault
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:3500
  79: PyEval_EvalFrameEx
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:741:12
  80: function_code_fastcall
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Objects/call.c:283:14
  81: _PyFunction_Vectorcall
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Objects/call.c:410:20
  82: _PyObject_Vectorcall
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Include/cpython/abstract.h:127:11
  83: _PyObject_FastCall
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Include/cpython/abstract.h:147:12
  84: object_vacall
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Objects/call.c:1186:14
  85: _PyObject_CallMethodIdObjArgs
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Objects/call.c:1244
  86: import_find_and_load
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/import.c:1698:11
  87: PyImport_ImportModuleLevelObject
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/import.c:1798:15
  88: import_name
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:5139
  89: _PyEval_EvalFrameDefault
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:2993
  90: PyEval_EvalFrameEx
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:741:12
  91: _PyEval_EvalCodeWithName
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:4298
  92: PyEval_EvalCodeEx
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:4327
  93: PyEval_EvalCode
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:718
  94: builtin_exec_impl
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/bltinmodule.c:1033
  95: builtin_exec
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/clinic/bltinmodule.c.h:396
  96: cfunction_vectorcall_FASTCALL
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Objects/methodobject.c:422:24
  97: _PyObject_Vectorcall
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Include/cpython/abstract.h:127:11
  98: call_function
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:4963
  99: _PyEval_EvalFrameDefault
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:3500
 100: PyEval_EvalFrameEx
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:741:12
 101: _PyEval_EvalCodeWithName
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:4298
 102: _PyFunction_Vectorcall
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Objects/call.c:435:12
 103: _PyObject_Vectorcall
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Include/cpython/abstract.h:127:11
 104: call_function
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:4963
 105: _PyEval_EvalFrameDefault
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:3500
 106: PyEval_EvalFrameEx
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:741:12
 107: _PyEval_EvalCodeWithName
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Python/ceval.c:4298
 108: _PyFunction_Vectorcall
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Objects/call.c:435:12
 109: PyVectorcall_Call
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Objects/call.c:199:24
 110: PyObject_Call
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Objects/call.c:227:16
 111: pymain_run_module
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Modules/main.c:308
 112: pymain_run_python
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Modules/main.c:606:21
 113: Py_RunMain
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Modules/main.c:691
 114: Py_BytesMain
             at /home/conda/feedstock_root/build_artifacts/python-split_1611614749976/work/Modules/main.c:1123
 115: __libc_start_main
 116: <unknown>
note: Some details are omitted, run with `RUST_BACKTRACE=full` for a verbose backtrace.
Traceback (most recent call last):
  File "/opt/miniconda3/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/miniconda3/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/ritchie46/code/polars-book/user_guide/src/examples/projection_pushdown/__main__.py", line 2, in <module>
    from .snippet import dataset, df1
  File "/home/ritchie46/code/polars-book/user_guide/src/examples/projection_pushdown/snippet.py", line 20, in <module>
    df1 = dataset.fetch(int(1e7))
  File "/opt/miniconda3/lib/python3.8/site-packages/polars/lazy/frame.py", line 321, in fetch
    return pl.eager.frame.wrap_df(ldf.fetch(n_rows))
pyo3_runtime.PanicException: attempt to subtract with overflow

@ritchie46 ritchie46 changed the title Bitmap::mutable line 155 may segfault Bitmap::mutable line 155 may Panic/segfault Aug 21, 2021
@jorgecarleitao jorgecarleitao self-assigned this Aug 21, 2021
@jorgecarleitao jorgecarleitao added the bug Something isn't working label Aug 21, 2021
@jorgecarleitao
Owner

Hey, thanks!

Do you have a repro for this? With the test in #310 I can reproduce a panic in debug that does not show up in release before its patch, but I am not sure that it is the fix for this issue specifically.

Interesting that it segfaults in release and not in debug: this should not happen in safe Rust code, and that whole block is safe. I will try to write a repro for the compiler team.
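
For context on the debug/release difference: integer overflow checks are on by default in debug builds and off by default in release builds, where `-` on a `usize` wraps around instead of panicking. The sketch below uses the explicit `wrapping_sub` and `checked_sub` methods so that it behaves the same in both profiles. The function names and the values 82 and 83 are illustrative.

```rust
/// What a plain `a - b` evaluates to when overflow checks are off
/// (the default for release builds): the result wraps around.
fn release_like_sub(a: usize, b: usize) -> usize {
    a.wrapping_sub(b)
}

/// What overflow checks detect: `None` means the subtraction would
/// underflow, which is the case where a debug build panics.
fn debug_like_sub(a: usize, b: usize) -> Option<usize> {
    a.checked_sub(b)
}

fn main() {
    println!("wrapping: {}", release_like_sub(82, 83));
    println!("checked:  {:?}", debug_like_sub(82, 83));
    println!("debug assertions enabled: {}", cfg!(debug_assertions));
}
```

A wrapped count is still a valid `usize`, so safe code that uses it as a length can go on to request an enormous allocation.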

@ritchie46
Collaborator Author

I will try to reproduce this in Rust. I was in a hurry. 🙈

I will also check if #310 fixes this.

@ritchie46
Collaborator Author

Do you have a repro for this?

Got it! Both the panic and the segfault.

I jammed a pub on MutableBitmap to make it easier to reproduce:

pub struct MutableBitmap {
    pub buffer: MutableBuffer<u8>,
    pub length: usize,
}

MWE

use arrow::bitmap::MutableBitmap;
use arrow::buffer::MutableBuffer;

fn main() {
    let buffer = [
        255u8,
        253,
        255,
        191,
        255,
        255,
        255,
        251,
        223,
        255,
        255,
        247,
        255,
        254,
        255,
        255,
        247,
        255,
        247,
        255,
        255,
        254,
        223,
        255,
        247,
        255,
        191,
        255,
        255,
        255,
        255,
        247,
        255,
        255,
        255,
        127,
        255,
        255,
        142,
        255,
        247,
        255,
        247,
        255,
        239,
        255,
        255,
        255,
        255,
        255,
        255,
        239,
        255,
        255,
        255,
        255,
        255,
        255,
        223,
        255,
        255,
        255,
        247,
        255,
        191,
        255,
        253,
        231,
        255,
        255,
        191,
        255,
        223,
        127,
        255,
        255,
        223,
        255,
        255,
        254,
        127,
        255,
        30,
    ];

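    // Note the mismatch: the buffer holds 83 bytes (664 bits),
    // while `length` claims only 640 bits (80 bytes).
    let buffer = MutableBuffer::from(buffer);
    let mut bitmap = MutableBitmap {
        buffer,
        length: 640,
    };
    // Panics in debug builds and crashes in release builds, as reported above.
    bitmap.extend_set(15);
}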
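
The sizes in this reproduction can be checked by hand. The sketch below assumes one bit per element, rounded up to whole bytes. `bytes_for_bits` is a hypothetical helper, and how `extend_set` actually derives its `required` and `existing` values is not shown in this thread.

```rust
/// Hypothetical helper: bytes needed to store `bits` bits, rounding up.
fn bytes_for_bits(bits: usize) -> usize {
    bits.saturating_add(7) / 8
}

fn main() {
    let buffer_len = 83; // number of bytes in the array above
    let length = 640; // bits the bitmap claims to hold
    let additional = 15; // argument passed to extend_set

    println!("bytes covered by length:   {}", bytes_for_bits(length));
    println!("bytes needed after extend: {}", bytes_for_bits(length + additional));
    println!("bytes actually in buffer:  {}", buffer_len);
}
```

Under these assumptions the buffer is three bytes longer than `length` implies, and the 82 bytes needed after the extend are fewer than the 83 bytes already present.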

@ritchie46
Collaborator Author

ritchie46 commented Aug 21, 2021

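And a segfault in safe Rust:

Oh, this is a core dump due to OOM:

memory allocation of 18446744073709551615 bytes failed
Aborted (core dumped)

// Nightly-only feature at the time; `std::hint::black_box` is stable since Rust 1.66.
#![feature(bench_black_box)]

fn main() {
    // black_box hides the constants from the compiler, so the underflow happens at run time.
    let n: usize = std::hint::black_box(82) - std::hint::black_box(83);
    std::hint::black_box(std::iter::repeat(0b11111111u8).take(n).collect::<Vec<_>>());
}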
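
The byte count in the abort message is `u64::MAX`, which is what `82 - 83` wraps to as a 64-bit `usize`. The small sketch below makes the wrapped count explicit and requests the memory with `Vec::try_reserve`, which reports failure as an error instead of aborting the process. `try_alloc` is a hypothetical helper name.

```rust
use std::collections::TryReserveError;

/// Hypothetical helper: try to reserve `n` bytes, returning an error
/// instead of aborting when the request cannot be satisfied.
fn try_alloc(n: usize) -> Result<Vec<u8>, TryReserveError> {
    let mut v: Vec<u8> = Vec::new();
    v.try_reserve(n)?;
    Ok(v)
}

fn main() {
    // Explicit wrapping subtraction: same value in debug and release builds.
    let n = 82usize.wrapping_sub(83);
    println!("requested bytes: {}", n);

    match try_alloc(n) {
        Ok(_) => println!("allocation succeeded"),
        Err(e) => println!("allocation failed: {}", e),
    }
}
```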

@ritchie46
Collaborator Author

It seems to be fixed if I compute existing from the buffer itself instead of the length state.

Could it be that the length state is updated incorrectly? I see that in MutableBitmap::extend_set it is updated twice.

here:

self.length += added;

and here:

self.length += additional;

Is that intentional?
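
Two updates to `length` are not necessarily a bug, as long as each one counts only the bits written in its own step. A toy sketch under that assumption follows. `ToyBitmap` is a hypothetical stand-in and not arrow2's `MutableBitmap`, and it sizes the new bytes from the bits still to be written rather than from a `required - existing` difference.

```rust
/// Toy stand-in for a growable bitmap (NOT arrow2's `MutableBitmap`).
/// Invariant: `buffer.len() == ceil(length / 8)`.
struct ToyBitmap {
    buffer: Vec<u8>,
    length: usize, // number of valid bits
}

impl ToyBitmap {
    fn new() -> Self {
        ToyBitmap { buffer: Vec::new(), length: 0 }
    }

    fn invariant_holds(&self) -> bool {
        self.buffer.len() == (self.length + 7) / 8
    }

    /// Appends `additional` set bits.
    /// Bits past `length` in the final byte are left set; this toy never reads them.
    fn extend_set(&mut self, additional: usize) {
        let mut remaining = additional;

        // Step 1: fill the free high bits of a partially used last byte.
        let offset = self.length % 8;
        if offset != 0 && remaining > 0 {
            let fill = remaining.min(8 - offset);
            let mask = (((1u16 << fill) - 1) as u8) << offset;
            if let Some(last) = self.buffer.last_mut() {
                *last |= mask;
            }
            self.length += fill; // first update: only the bits from step 1
            remaining -= fill;
        }

        // Step 2: append whole bytes for whatever is left.
        if remaining > 0 {
            let new_bytes = (remaining + 7) / 8;
            self.buffer.extend(std::iter::repeat(0xFFu8).take(new_bytes));
            self.length += remaining; // second update: only the bits from step 2
        }
    }
}

fn main() {
    let mut bitmap = ToyBitmap::new();
    bitmap.extend_set(640);
    bitmap.extend_set(15);
    println!("bits: {}, bytes: {}", bitmap.length, bitmap.buffer.len());
    assert!(bitmap.invariant_holds());
}
```

Because no byte count is obtained by subtraction, this version cannot underflow. It does rely on the invariant holding on entry, which the reproduction above deliberately breaks.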
