Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve Aggregate performance #487

Merged
merged 1 commit into from
Mar 21, 2022
Merged

Improve Aggregate performance #487

merged 1 commit into from
Mar 21, 2022

Conversation

Hoverbear
Copy link
Contributor

@Hoverbear Hoverbear commented Mar 17, 2022

Thanks to a report from @jamessewell we dug into some aggregate performance issues.

This hints that memory context swap calls like these should always be inline, and should improve performance in many cases.

Using an example of

pub struct RustMaxInternal;

#[pg_aggregate]
impl Aggregate for RustMaxInternal {
    type Args = i32;
    type State = Internal;
    type Finalize = i32;
    #[inline(always)]
    fn state(
        mut current: Self::State,
        value: Self::Args,
        _fcinfo: pg_sys::FunctionCallInfo,
    ) -> Self::State {
        let inner = unsafe { current.get_or_insert_default::<i32>() };

        if *inner < value {
            *inner = value;
        };
        current
    }
    #[inline(always)]
    fn combine(
        mut first: Self::State,
        mut second: Self::State,
        _fcinfo: pg_sys::FunctionCallInfo,
    ) -> Self::State {
        let first_inner = unsafe { first.get_or_insert_default::<i32>() };
        let second_inner = unsafe { second.get_or_insert_default::<i32>() };


        if *first_inner < *second_inner {
            *first_inner = *second_inner;
        };
        first
    }
    #[inline(always)]
    fn finalize(
        mut current: Self::State,
        _direct_arg: Self::OrderedSetArgs,
        _fcinfo: pg_sys::FunctionCallInfo,
    ) -> Self::Finalize {
        let inner = unsafe { current.get_or_insert_default::<i32>() };
        *inner
    }
}

Before:

$ cargo pgx run pg14 -p pgx-tests --features pg_test --release
    Stopping Postgres v14
building extension with features `pg_test pg14`
"cargo" "build" "--package" "pgx-tests" "--release" "--features" "pg_test pg14" "--no-default-features" "--message-format=json-render-diagnostics"
   Compiling pgx-tests v0.4.0 (/home/ana/git/zombodb/pgx/pgx-tests)
    Finished release [optimized] target(s) in 11.81s

installing extension
     Copying control file to /home/ana/.pgx/14.2/pgx-install/share/postgresql/extension/pgx_tests.control
     Copying shared library to /home/ana/.pgx/14.2/pgx-install/lib/postgresql/pgx_tests.so
 Discovering SQL entities
  Discovered 332 SQL entities: 32 schemas (2 unique), 282 functions, 11 types, 1 enums, 2 sqls, 0 ords, 0 hashes, 4 aggregates
     Writing SQL entities to /home/ana/.pgx/14.2/pgx-install/share/postgresql/extension/pgx_tests--1.0.sql
    Finished installing pgx_tests
    Starting Postgres v14 on port 28814
    Re-using existing database pgx_tests
psql (14.2)
Type "help" for help.

pgx_tests=# DROP EXTENSION pgx_tests; CREATE EXTENSION pgx_tests;
DROP EXTENSION
CREATE EXTENSION
pgx_tests=# \timing
Timing is on.
pgx_tests=# SELECT RustMaxInternal(value) FROM generate_series(0, 400000) as value;
 rustmaxinternal 
-----------------
          400000
(1 row)

Time: 131.944 ms
pgx_tests=# 

After:

pgx_tests=# SELECT RustMaxInternal(value) FROM generate_series(0, 400000) as value;
 rustmaxinternal 
-----------------
          400000
(1 row)

Time: 128.002 ms

A single run isn't really meaningful, but we think this hinting may yield some perf improvements where the optimizer does actually need hinting. Further, it does make sense to inline these calls, since they always happen, have small bodies, and it's performance sensitive bits of the code.

@Hoverbear Hoverbear self-assigned this Mar 17, 2022
@Hoverbear Hoverbear changed the title Improve Aggregate and Internal performance Improve Aggregate performance Mar 17, 2022
@Hoverbear Hoverbear marked this pull request as ready for review March 17, 2022 17:26
@Hoverbear Hoverbear merged commit 2302d96 into develop Mar 21, 2022
@Hoverbear Hoverbear deleted the aggregate-perf branch March 21, 2022 16:01
@Hoverbear Hoverbear mentioned this pull request Mar 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant