Optimizer control primitives #57

Merged: 7 commits, Sep 1, 2016

@rmartinho rmartinho commented Aug 8, 2016

This adds two primitives for a tiny bit of control over the optimizer.

keep_memory(p); prevents optimization of *p writes that precede it and of *p reads that follow it. This is the "escape" mentioned in #51.

keep_memory(); does the same for all memory. It is currently not available for MSVC (any help with that is appreciated). This is the "clobber" mentioned in #51.
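For readers unfamiliar with the "escape" and "clobber" tricks, here is a minimal sketch of how such primitives are commonly written with GCC/Clang inline assembly. The names `keep_memory_sketch` and `sum_with_barriers` are hypothetical, and this is not necessarily how the PR implements `keep_memory`; it only illustrates the idea under the assumption of a GCC-compatible compiler.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the two primitives; assumes GCC or Clang.
// "Escape": the empty asm statement takes p as an input and declares a
// memory clobber, so the compiler must assume *p may be read or written.
inline void keep_memory_sketch(void const* p) {
    asm volatile("" : : "g"(p) : "memory");
}

// "Clobber": no inputs, only the memory clobber, so the compiler must
// assume any memory may have been read or written here.
inline void keep_memory_sketch() {
    asm volatile("" : : : "memory");
}

// Hypothetical usage: keep the summation from being optimized away.
inline long sum_with_barriers(std::vector<int>& v) {
    keep_memory_sketch(v.data());   // earlier writes to v's buffer are kept
    long total = 0;
    for (int x : v) total += x;
    keep_memory_sketch(&total);     // the write to total counts as observed
    keep_memory_sketch();           // barrier over all memory
    return total;
}
```

The barriers emit no instructions of their own; they only constrain what the optimizer may assume, which is why there is no direct MSVC equivalent for the no-argument form.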

(*this)(chronometer(model, plan.iterations_per_sample));
detail::optimizer_barrier();

@arximboldi arximboldi Aug 9, 2016

I would suggest putting the optimizer_barrier around the function call to the concrete benchmark, instead of the whole measurement. That way we can also use the return value for the deoptimization, which would make the trick compatible with MSVC and with what the documentation suggests.

template <typename T>
void deoptimize_value(T&& x) { 
    keep_memory(&x); 
}

// Overload for callables that return a value: feed the result to the deoptimizer.
template <typename Fn, typename... Args>
auto invoke_deoptimized(Fn&& fn, Args&&... args) -> std::enable_if_t<!std::is_same<void, std::result_of_t<Fn(Args...)>>{}> {
    deoptimize_value(std::forward<Fn>(fn)(std::forward<Args>(args)...));
}

// Overload for callables that return void: there is no value to keep.
template <typename Fn, typename... Args>
auto invoke_deoptimized(Fn&& fn, Args&&... args) -> std::enable_if_t<std::is_same<void, std::result_of_t<Fn(Args...)>>{}> {
    std::forward<Fn>(fn)(std::forward<Args>(args)...);
    // maybe call optimizer_barrier() ??
}

And then in chronometer::measure:

        template <typename Fun>
        void measure(Fun&& fun, std::false_type) {
            measure([&fun](int) { return fun(); }); // added return !
        }

        template <typename Fun>
        void measure(Fun&& fun, std::true_type) {
            impl->start();
            for(int i = 0; i < k; ++i) 
                invoke_deoptimized(fun, i);
            impl->finish();
        }

I did not try the code and it is C++14, but you get the idea... What do you think?
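As a rough check that the suggestion hangs together, here is a self-contained C++14 sketch of the same idea. It is an illustration under stated assumptions, not the code that was merged: `keep_memory_standin` is a hypothetical GCC/Clang replacement for `keep_memory(p)`, `run_loop_sketch` is a hypothetical driver, and `decltype` with `std::declval` is swapped in for `std::result_of_t` so the snippet also compiles under newer standards.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for keep_memory(p); assumes GCC or Clang.
inline void keep_memory_standin(void const* p) {
    asm volatile("" : : "g"(p) : "memory");
}

template <typename T>
void deoptimize_value(T&& x) {
    keep_memory_standin(&x);  // x is a named reference, so &x is valid
}

// Result type of calling Fn with Args (replaces std::result_of_t here).
template <typename Fn, typename... Args>
using call_result_t = decltype(std::declval<Fn>()(std::declval<Args>()...));

// Non-void result: pass the returned value to the deoptimizer.
template <typename Fn, typename... Args>
auto invoke_deoptimized(Fn&& fn, Args&&... args)
    -> std::enable_if_t<!std::is_void<call_result_t<Fn, Args...>>::value> {
    deoptimize_value(std::forward<Fn>(fn)(std::forward<Args>(args)...));
}

// Void result: just make the call.
template <typename Fn, typename... Args>
auto invoke_deoptimized(Fn&& fn, Args&&... args)
    -> std::enable_if_t<std::is_void<call_result_t<Fn, Args...>>::value> {
    std::forward<Fn>(fn)(std::forward<Args>(args)...);
}

// Hypothetical driver mirroring the measure() loop: each iteration calls
// one value-returning and one void callable through the wrapper.
inline int run_loop_sketch(int k) {
    int calls = 0;
    for (int i = 0; i < k; ++i) {
        invoke_deoptimized([&calls](int n) { ++calls; return n * 2; }, i);
        invoke_deoptimized([&calls](int) { ++calls; }, i);
    }
    return calls;
}
```

Both overloads are selected purely by the callable's return type, so benchmark authors would not need to change how they write their functions.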

@arximboldi

Cool! In general I like the approach and the technique. The only thing I am not sure about is the best place and way to put the barrier.

@rmartinho rmartinho merged commit 4c4b970 into devel Sep 1, 2016