PSA: It is possible to use BenchmarkTools.BenchmarkGroup with Chairmarks #70

Closed
asinghvi17 opened this issue Mar 6, 2024 · 14 comments · Fixed by #150
Labels
documentation Improvements or additions to documentation

Comments

@asinghvi17

Simply replacing @benchmarkable with @be suffices, and you don't have to run tune! or run either!

Even running Statistics.median(suite) works - although any custom plotting utilities might need a couple of tweaks :)
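A minimal sketch of what this looks like in practice, assuming both packages are installed. The benchmark names and the short `seconds=0.05` budget are illustrative choices, not something from the original report:

```julia
using BenchmarkTools, Chairmarks, Statistics

# A BenchmarkGroup can hold arbitrary values, so it can store the
# results that `@be` returns directly.
suite = BenchmarkGroup()
suite["sort"] = @be rand(1000) sort seconds=0.05
suite["sum"]  = @be rand(1000) sum seconds=0.05

# No tune! or run step: each entry is already a set of measured samples.
# median maps over the leaves of the group.
med = median(suite)
```

Each leaf of `med` is a single Chairmarks sample rather than a BenchmarkTools estimate, which is why plotting code written against BenchmarkTools types may need adjusting.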

@LilithHafner
Owner

What? I had no idea. This is lovely :)

@MilesCranmer

Is there a way to take a BenchmarkGroup and "translate" it somehow? Or otherwise run a benchmark suite using Chairmarks instead of BenchmarkTools? Many projects have historically defined their benchmark suite in benchmarks/benchmarks.jl and use BenchmarkTools. So to measure performance over time it is perhaps not practical to replace all previous @benchmarkable (especially if there are setup=... in the macro).

I ask specifically in the context of AirspeedVelocity.jl: MilesCranmer/AirspeedVelocity.jl#35

@asinghvi17
Author

I mean they both have the same syntax, so an operation at all leaf nodes of that dict could easily reassign the @benchmarkables to the results of @be on that same Expr (if you can directly invoke a macro on an Expr).

@MilesCranmer

I think they have slightly different syntax, no?

  Positional argument disambiguation
  ≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡

  setup, teardown, and init are optional and are parsed with that precedence giving these possible forms:

  @be f
  @be setup f
  @be setup f teardown
  @be init setup f teardown

whereas @benchmarkable is

help?> @benchmarkable
  @benchmarkable <expr to benchmark> [setup=<setup expr>]


  Create a Benchmark instance for the given expression. @benchmarkable has similar syntax with @benchmark. See also @benchmark.
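To make the difference concrete, here is a rough side-by-side of the same measurement in both styles. This is a sketch for illustration only; the workload and the `seconds=0.1` budget are arbitrary choices:

```julia
using BenchmarkTools, Chairmarks

# BenchmarkTools: setup is a keyword argument that binds variables
# used by the benchmarked expression, and the benchmark runs later.
lazy = @benchmarkable sort(x) setup=(x = rand(1000))
trial = run(lazy; seconds=0.1)

# Chairmarks: setup is a positional argument whose return value is
# passed to the next stage, and the macro runs the benchmark right away.
result = @be rand(1000) sort seconds=0.1
```

So a mechanical translation has to turn variable bindings in `setup=` into a value that flows through the positional pipeline, which is more than a rename of the macro.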

@asinghvi17
Author

Ah that's true - I had not used setup much but I guess it could also be translated by way of Expr rewriting
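As a starting point for that kind of rewriting, the sketch below only pulls a quoted `@benchmarkable` call apart into its pieces; it does not attempt the translation to `@be`. The helper name `split_benchmarkable` is hypothetical, and it assumes every `name=value` argument is a keyword (so a benchmarked expression that is itself an assignment would be misread):

```julia
# Hypothetical helper: split a quoted macro call into the core
# expression and its `name=value` arguments (setup, teardown, ...).
function split_benchmarkable(ex::Expr)
    ex.head === :macrocall || throw(ArgumentError("expected a macro call"))
    parts = Dict{Symbol,Any}()
    for arg in ex.args[2:end]
        arg isa LineNumberNode && continue
        if arg isa Expr && arg.head === :(=) && arg.args[1] isa Symbol
            # keyword-style argument such as setup=(x = rand(100))
            parts[arg.args[1]] = arg.args[2]
        else
            # anything else is taken to be the expression to benchmark
            parts[:core] = arg
        end
    end
    return parts
end

# Quoting does not expand the macro, so BenchmarkTools is not needed here.
ex = :(@benchmarkable sort(x) setup=(x = rand(100)))
parts = split_benchmarkable(ex)
```

Turning `parts[:setup]` and `parts[:core]` into a positional Chairmarks call is the harder step, because the variables bound in setup would have to be passed along explicitly.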

@MilesCranmer

Wait, I'm a bit confused. @benchmarkable returns a benchmark (that you can then execute). Whereas @be appears to actually return the results of benchmarking the expression. Is that correct?

@asinghvi17
Author

Yep! Creating a benchmark suite with @be instead of @benchmarkable gives you the equivalent of that suite after run.

@MilesCranmer

MilesCranmer commented Mar 13, 2024

Oh, but isn't the whole point of @benchmarkable for it to be lazily evaluated, so you can tune! it? And @benchmark for eager evaluation?

@asinghvi17
Author

That's true, but if running in a non-interactive framework I don't think it really matters?

Even then, the performance difference is substantial enough that it's actually possible to do semi-interactive workflows with Chairmarks.

@MilesCranmer

I guess I just wouldn't consider @benchmarkable -> @be to be a complete solution (maybe this deserves a new issue with a feature request). For example, if I need a single benchmark result I would use @benchmark. But if I want a suite of benchmarks that I can store in my REPL, in between Revise.jl-ing my library, then the suite is something I would want to re-run. In principle it doesn't seem too bad to add a compatibility layer?
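One workaround that needs no compatibility layer is to store zero-argument functions and call them whenever fresh results are wanted. This is only a sketch: `bench_sort`, `bench_sum`, `lazy_suite` and `run_suite` are hypothetical names, not part of either package.

```julia
using BenchmarkTools, Chairmarks

# Each function runs its benchmark every time it is called.
bench_sort() = @be rand(1000) sort seconds=0.05
bench_sum()  = @be rand(1000) sum seconds=0.05

# The "suite" is a collection of thunks rather than of results.
lazy_suite = Dict{String,Function}("sort" => bench_sort, "sum" => bench_sum)

# Calling every thunk yields a BenchmarkGroup of fresh results.
function run_suite(lazy)
    results = BenchmarkGroup()
    for (name, thunk) in lazy
        results[name] = thunk()
    end
    return results
end

first_run = run_suite(lazy_suite)
# ... edit the library, let Revise.jl pick up the change ...
second_run = run_suite(lazy_suite)
```

This keeps the suite re-runnable from the REPL, though it does not help with existing suites already written with @benchmarkable.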

This is the @benchmarkable code. The logic doesn't seem too complicated: it just generates a function `samplefunc` that returns the recorded time, GC time, memory, and allocations for a particular expression.
"""
    @benchmarkable <expr to benchmark> [setup=<setup expr>]

Create a `Benchmark` instance for the given expression. `@benchmarkable`
has similar syntax with `@benchmark`. See also [`@benchmark`](@ref).
"""
macro benchmarkable(args...)
    core, setup, teardown, quote_vars, quote_vals, params = benchmarkable_parts(args)
    map!(esc, params, params)

    # extract any variable bindings shared between the core and setup expressions
    setup_vars = isa(setup, Expr) ? collectvars(setup) : []
    core_vars = isa(core, Expr) ? collectvars(core) : []
    out_vars = filter(var -> var in setup_vars, core_vars)

    # generate the benchmark definition
    return quote
        generate_benchmark_definition(
            $__module__,
            $(Expr(:quote, out_vars)),
            $(Expr(:quote, setup_vars)),
            $(Expr(:quote, quote_vars)),
            $(esc(Expr(:tuple, Expr.(:quote, quote_vals)...))),
            $(esc(Expr(:quote, core))),
            $(esc(Expr(:quote, setup))),
            $(esc(Expr(:quote, teardown))),
            Parameters($(params...)),
        )
    end
end

# `eval` an expression that forcibly defines the specified benchmark at
# top-level in order to allow transfer of locally-scoped variables into
# benchmark scope.
#
# The double-underscore-prefixed variable names are not particularly hygienic - it's
# possible for them to conflict with names used in the setup or teardown expressions.
# A more robust solution would be preferable.
function generate_benchmark_definition(
    eval_module, out_vars, setup_vars, quote_vars, quote_vals, core, setup, teardown, params
)
    @nospecialize
    corefunc = gensym("core")
    samplefunc = gensym("sample")
    type_vars = [gensym() for i in 1:(length(quote_vars) + length(setup_vars))]
    signature = Expr(:call, corefunc, quote_vars..., setup_vars...)
    signature_def = Expr(
        :where,
        Expr(
            :call,
            corefunc,
            [
                Expr(:(::), var, type) for
                (var, type) in zip([quote_vars; setup_vars], type_vars)
            ]...,
        ),
        type_vars...,
    )
    if length(out_vars) == 0
        invocation = signature
        core_body = core
    elseif length(out_vars) == 1
        returns = :(return $(out_vars[1]))
        invocation = :($(out_vars[1]) = $(signature))
        core_body = :($(core); $(returns))
    else
        returns = :(return $(Expr(:tuple, out_vars...)))
        invocation = :($(Expr(:tuple, out_vars...)) = $(signature))
        core_body = :($(core); $(returns))
    end
    @static if isdefined(Base, :donotdelete)
        invocation = :(
            let x = $invocation
                Base.donotdelete(x)
                x
            end
        )
    end
    return Core.eval(
        eval_module,
        quote
            @noinline $(signature_def) = begin
                $(core_body)
            end
            @noinline function $(samplefunc)(
                $(Expr(:tuple, quote_vars...)), __params::$BenchmarkTools.Parameters
            )
                $(setup)
                __evals = __params.evals
                __gc_start = Base.gc_num()
                __start_time = time_ns()
                __return_val = $(invocation)
                for __iter in 2:__evals
                    $(invocation)
                end
                __sample_time = time_ns() - __start_time
                __gcdiff = Base.GC_Diff(Base.gc_num(), __gc_start)
                $(teardown)
                __time = max((__sample_time / __evals) - __params.overhead, 0.001)
                __gctime = max((__gcdiff.total_time / __evals) - __params.overhead, 0.0)
                __memory = Int(Base.fld(__gcdiff.allocd, __evals))
                __allocs = Int(
                    Base.fld(
                        __gcdiff.malloc +
                        __gcdiff.realloc +
                        __gcdiff.poolalloc +
                        __gcdiff.bigalloc,
                        __evals,
                    ),
                )
                return __time, __gctime, __memory, __allocs, __return_val
            end
            $BenchmarkTools.Benchmark($(samplefunc), $(quote_vals), $(params))
        end,
    )
end

So perhaps a ext/ChairmarksBenchmarkToolsExt.jl could create a @benchmarkable that still uses benchmarkable_parts to extract the pieces, but uses Chairmarks instead for running the thing? I'm not sure how doable this is. Maybe @LilithHafner could share their thoughts.

@asinghvi17
Author

Hmm! Yeah that logic could probably be directly translated to Chairmarks somehow, probably by creating a new macro which stores an Expr object when run (the equivalent of @benchmarkable). It doesn't look like this is hijackable unless either we do major type piracy by overloading BenchmarkTools.generate_benchmark_definition directly, or offer a function Chairmarks.override_benchmarkable!() to @eval that code in.
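A rough sketch of the "new macro which stores an Expr" idea, under stated assumptions: `LazyBenchmark`, `@lazybe` and `run_lazy` are hypothetical names, nothing like them exists in Chairmarks, and the stored expressions are evaluated at the top level of the calling module, so local variables are not captured.

```julia
using Chairmarks

# Hypothetical container: the unevaluated macro arguments plus the
# module they should later be evaluated in.
struct LazyBenchmark
    mod::Module
    args::Vector{Any}
end

# Hypothetical macro: store the arguments instead of benchmarking them.
macro lazybe(args...)
    stored = QuoteNode(Any[args...])
    return :(LazyBenchmark($__module__, $stored))
end

# Build a call to Chairmarks.@be from the stored arguments and evaluate it.
function run_lazy(lb::LazyBenchmark)
    call = Expr(:macrocall, GlobalRef(Chairmarks, Symbol("@be")),
                LineNumberNode(0), lb.args...)
    return Core.eval(lb.mod, call)
end

lb = @lazybe rand(100) sort seconds=0.05
result = run_lazy(lb)   # can be called again after code changes
```

This sidesteps overloading anything in BenchmarkTools, at the cost of using Chairmarks' positional syntax rather than accepting existing @benchmarkable calls.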

@LilithHafner
Owner

This should be moved from an issue into the documentation

LilithHafner linked a pull request Nov 30, 2024 that will close this issue
@LilithHafner
Owner

This PSA is in the docs now. Issues with its behavior should be reported separately. Though as I mention in the docs, this isn't intentionally designed behavior so it may never be seamless.

@LilithHafner
Owner

Thanks again for reporting, @asinghvi17, and thanks for the qualifications and context, @MilesCranmer

3 participants