Support `unary_dyn_mut` in arth #3708

Ted-Jiang · 2023-02-13T14:46:50Z

Is your feature request related to a problem or challenge? Please describe what you are trying to do.
apache/datafusion@48732b4 all math compute kernel like add_dyn_scalar, multi_dyn_scalar...
I try to use use copy one write (mock func add_scalar_mut use

arrow-rs/arrow-arith/src/arity.rs

Line 59 in 26ea71c

pub fn unary_mut<I, F>(

)
run select t.a + 1 from t
got

warning: `datafusion` (bench "math_query_sql") generated 1 warning
    Finished bench [optimized] target(s) in 7m 26s
     Running benches/math_query_sql.rs (target/release/deps/math_query_sql-64399df00616f72b)
Benchmarking tess_add: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.4s, enable flat sampling, or reduce sample count to 60.
tess_add                time:   [1.2643 ms 1.2664 ms 1.2685 ms]
                        change: [-78.711% -78.544% -78.361%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 5 outliers among 100 measurements (5.00%)
  2 (2.00%) low severe
  1 (1.00%) high mild
  2 (2.00%) high severe

(venv) yangjiang@LM-SHC-15009782 arrow-datafusion %

Describe the solution you'd like

Add unary_dyn_mut which take ownership of array then support cow math operator.

Describe alternatives you've considered

Additional context

The text was updated successfully, but these errors were encountered:

tustvold · 2023-02-13T14:59:22Z

Could you provide a link to the benchmark you were running, when this was previously proposed the numbers were not nearly so flattering - #3134 (comment)

I'm mainly apprehensive as we already have issues with the amount of codegen for the arithmetic kernels 😅

viirya · 2023-02-13T17:50:59Z

Cannot it just use unary_mut? Looks like unary_dyn_mut simply calls unary_mut.

Ted-Jiang · 2023-02-14T02:23:45Z

Cannot it just use unary_mut? Looks like unary_dyn_mut simply calls unary_mut.

There is already a

arrow-rs/arrow-arith/src/arity.rs

Line 59 in d3fe797

pub fn unary_mut<I, F>(

it takes PrimitiveArray as input

Ted-Jiang · 2023-02-14T02:26:47Z

Could you provide a link to the benchmark you were running, when this was previously proposed the numbers were not nearly so flattering - #3134 (comment)

I'm mainly apprehensive as we already have issues with the amount of codegen for the arithmetic kernels 😅

I used unary_mut in datafusion like

pub(crate) fn add_scalar_cow<T>(
    array: PrimitiveArray<T>,
    scalar: T::Native,
) -> Result<PrimitiveArray<T>, ArrowError>
where
    T: ArrowNumericType,
    T::Native: ArrowNativeTypeOp,
{
    match unary_mut(array, |value| value.add_wrapping(scalar)) {
        Ok(array) => Ok(array),
        // Fall back need copy array
        Err(array) => add_scalar(&array, scalar),
    }
}

But there a lot code change during that time, like support dictionaryArray.... So i try to modify less api in datafusion, support unary_dyn_mut

Ted-Jiang · 2023-02-14T02:29:07Z

@tustvold The benchMark just add select f32 + 1 from t
in https://github.com/apache/arrow-datafusion/blob/96512ac05df40deb42632992cece656842d6debb/datafusion/core/benches/math_query_sql.rs#L107

I lost this branch in my local 😭

tustvold · 2023-02-14T09:05:45Z

I think it would help move this forward if there we could get a standalone benchmark in arrow-rs, the level of performance improvement makes me sceptical that there isn't something else going on - e.g checked to unchecked overflow or something

Ted-Jiang · 2023-02-14T09:39:48Z

I think it would help move this forward if there we could get a standalone benchmark in arrow-rs, the level of performance improvement makes me sceptical that there isn't something else going on - e.g checked to unchecked overflow or something

yeap, i prefer reimplement it in datafusion find the cause (I didn't notice before already bench in arrow-rs). By the way I think without performance gain, still need this api for user 😆

tustvold · 2023-02-14T11:05:06Z

By the way I think without performance gain, still need this api for user

What makes you say this, the memory savings are likely irrelevant, so I'm not sure what the advantage would be if not performance?

Ted-Jiang · 2023-02-14T13:55:02Z

By the way I think without performance gain, still need this api for user

What makes you say this, the memory savings are likely irrelevant, so I'm not sure what the advantage would be if not performance?

I mean even if without huge performance gain , it will still avoid re-allocate memory by manual.

Ted-Jiang · 2023-02-15T06:20:05Z

@tustvold
I did bench in https://github.com/apache/arrow-rs/pull/3709/files#diff-d31b0761ffe79b72672cd516aa8ef0792c004328f31994dc616e677e8eb3c50cR31-R59
Sadly same as you mentioned before only little improvement 😢 .

But I did a poc test in datafusion apache/datafusion#5285
got

warning: `datafusion-physical-expr` (lib) generated 28 warnings
    Finished bench [optimized] target(s) in 4m 06s
     Running benches/math_query_sql.rs (target/release/deps/math_query_sql-2162bfde908d94df)
add_cow                 time:   [4.5467 ms 4.6540 ms 4.7887 ms]
                        change: [-84.767% -82.599% -79.807%] (p = 0.00 < 0.05)
                        Performance has improved.
Found 8 outliers among 100 measurements (8.00%)
  5 (5.00%) high mild
  3 (3.00%) high severe

Ted-Jiang · 2023-02-15T06:27:50Z

One thing need mention, i do profile on https://github.com/apache/arrow-rs/pull/3709/files#diff-d31b0761ffe79b72672cd516aa8ef0792c004328f31994dc616e677e8eb3c50cR31-R59. I think they are do the same amount work:
add_scalar_dyn has 463 examples

add_scalar_dyn_mut has 292 examples

@tustvold I am not an expert about this 😭 , could your share your opinion 🆘

Ted-Jiang · 2023-02-15T06:28:39Z

And test with datafusion-cli





0 rows in set. Query took 0.018 seconds.
❯ select l_partkey from test limit 21;
+-----------+
| l_partkey |
+-----------+
| 155190    |
| 67310     |
| 63700     |
| 2132      |
| 24027     |
| 15635     |
| 106170    |
| 4297      |
| 19036     |
| 128449    |
| 29380     |
| 183095    |
| 62143     |
| 88035     |
| 108570    |
| 123927    |
| 37531     |
| 139636    |
| 182052    |
| 145243    |
| 94780     |
+-----------+
21 rows in set. Query took 0.039 seconds.
❯ select l_partkey+1 from test limit 21;
+---------------------------+
| test.l_partkey + Int64(1) |
+---------------------------+
| 155191                    |
| 67311                     |
| 63701                     |
| 2133                      |
| 24028                     |
| 15636                     |
| 106171                    |
| 4298                      |
| 19037                     |
| 128450                    |
| 29381                     |
| 183096                    |
| 62144                     |
| 88036                     |
| 108571                    |
| 123928                    |
| 37532                     |
| 139637                    |
| 182053                    |
| 145244                    |
| 94781                     |
+---------------------------+
21 rows in set. Query took 0.034 seconds.
❯

Seems not error during add

Dandandan · 2023-02-17T10:46:34Z

A difference with #3134 (comment) is that a batch size of 512 (in math_query_sql) vs 512000 (in the linked issue).

I would expect a larger difference to arise for simple arithmetic operations (like adding) when choosing a small batch size as the operation itself could be fast compared to the allocation. Also in the posted benchmark in the comments a allocation is done in the loop, whereas the query from @Ted-Jiang does a scalar addition.
Would be good if we have a benchmark to show the difference.

Ted-Jiang · 2023-02-21T11:09:56Z

@Dandandan Thanks for your info ❤️ I try to update https://github.com/apache/arrow-rs/pull/3709/files#diff-d31b0761ffe79b72672cd516aa8ef0792c004328f31994dc616e677e8eb3c50cR31-R59 to

const BATCH_SIZE: i32 = 512;
const BATCH_NUM: i32 = 1024 * 10;

Got

warning: `arrow` (bench "cow") generated 8 warnings
    Finished bench [optimized] target(s) in 0.34s
     Running benches/cow.rs (target/release/deps/cow-506a8d861bd3158e)
add                     time:   [8.5384 ms 8.5754 ms 8.6142 ms]
                        change: [-4.1072% -3.3312% -2.6109%] (p = 0.00 < 0.05)
                        Performance has improved.

Still little improvement found 😢 . Seems there lost something in my poc in datafusion.

tustvold · 2024-01-01T11:35:58Z

Closing this one for now, feel free to reopen if you wish to pursue this further

Ted-Jiang added the enhancement Any new improvement worthy of a entry in the changelog label Feb 13, 2023

Ted-Jiang mentioned this issue Feb 13, 2023

[feat] Support unary_dyn_mut in arrow-arth. #3709

Closed

tustvold closed this as not planned Won't fix, can't repro, duplicate, stale Jan 1, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Support `unary_dyn_mut` in arth #3708

Support `unary_dyn_mut` in arth #3708

Ted-Jiang commented Feb 13, 2023

tustvold commented Feb 13, 2023

viirya commented Feb 13, 2023

Ted-Jiang commented Feb 14, 2023 •

edited

Loading

Ted-Jiang commented Feb 14, 2023

Ted-Jiang commented Feb 14, 2023 •

edited

Loading

tustvold commented Feb 14, 2023

Ted-Jiang commented Feb 14, 2023

tustvold commented Feb 14, 2023

Ted-Jiang commented Feb 14, 2023

Ted-Jiang commented Feb 15, 2023

Ted-Jiang commented Feb 15, 2023 •

edited

Loading

Ted-Jiang commented Feb 15, 2023

Dandandan commented Feb 17, 2023

Ted-Jiang commented Feb 21, 2023

tustvold commented Jan 1, 2024

Support unary_dyn_mut in arth #3708

Support unary_dyn_mut in arth #3708

Comments

Ted-Jiang commented Feb 13, 2023

tustvold commented Feb 13, 2023

viirya commented Feb 13, 2023

Ted-Jiang commented Feb 14, 2023 • edited Loading

Ted-Jiang commented Feb 14, 2023

Ted-Jiang commented Feb 14, 2023 • edited Loading

tustvold commented Feb 14, 2023

Ted-Jiang commented Feb 14, 2023

tustvold commented Feb 14, 2023

Ted-Jiang commented Feb 14, 2023

Ted-Jiang commented Feb 15, 2023

Ted-Jiang commented Feb 15, 2023 • edited Loading

Ted-Jiang commented Feb 15, 2023

Dandandan commented Feb 17, 2023

Ted-Jiang commented Feb 21, 2023

tustvold commented Jan 1, 2024

Support `unary_dyn_mut` in arth #3708

Support `unary_dyn_mut` in arth #3708

Ted-Jiang commented Feb 14, 2023 •

edited

Loading

Ted-Jiang commented Feb 14, 2023 •

edited

Loading

Ted-Jiang commented Feb 15, 2023 •

edited

Loading