
Add developer documentation for benchmarking #11122

Merged: 12 commits into rapidsai:branch-22.10 on Aug 5, 2022

Conversation

@vyasr (Contributor) commented on Jun 17, 2022:

This PR documents best practices for writing cuDF Python benchmarks. It includes an overview of the fixtures that our benchmarking suite provides to all benchmarks and explains how best to make use of them. It also discusses the suite's main features (including easy comparison to pandas and running in CI) and what developers must do to maintain compatibility with those features.

A PR to incorporate the cudf_benchmarks repo into cudf proper is imminent, but this documentation PR can be reviewed (and merged) independently.

@vyasr added the labels 3 - Ready for Review, doc, Python, and improvement on Jun 17, 2022
@vyasr self-assigned this on Jun 17, 2022
The github-actions bot removed the Python label on Jun 17, 2022
@vyasr added the non-breaking label and removed the improvement label on Jun 17, 2022
@vyasr requested reviews from shwina and isVoid on Jun 24, 2022 at 00:27
rapids-bot bot pushed a commit that referenced this pull request Jun 27, 2022
This PR ports the benchmarks in https://github.com/vyasr/cudf_benchmarks, adding official benchmarks to the repository. The new benchmarks are designed from the ground up to make the best use of pytest, pytest-benchmark, and pytest-cases to simplify writing and maintaining benchmarks. Extended discussions of various previous design questions may be found on [the original repo](https://github.com/vyasr/cudf_benchmarks). Reviewers may also benefit from reviewing the companion PR creating documentation for how to write benchmarks, #11122.

Tests will not pass here until rapidsai/integration#492 is merged.

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - AJ Schmidt (https://github.com/ajschmidt8)
  - Bradley Dice (https://github.com/bdice)
  - Michael Wang (https://github.com/isVoid)
  - GALI PREM SAGAR (https://github.com/galipremsagar)
  - Matthew Roeschke (https://github.com/mroeschke)

URL: #11125
@wence- (Contributor) left a review:

Minor queries

(Five review threads on docs/cudf/source/developer_guide/benchmarking.md, since resolved.)
The codecov bot commented on Jul 18, 2022:

Codecov Report

❗ No coverage uploaded for pull request base (branch-22.10@acadcf2).
The diff coverage is n/a.

@@               Coverage Diff               @@
##             branch-22.10   #11122   +/-   ##
===============================================
  Coverage                ?   86.47%           
===============================================
  Files                   ?      144           
  Lines                   ?    22856           
  Branches                ?        0           
===============================================
  Hits                    ?    19765           
  Misses                  ?     3091           
  Partials                ?        0           


@wence- (Contributor) left a second review:

Some more minor comments, but overall it looks good, I think.

(Five review threads on docs/cudf/source/developer_guide/benchmarking.md, since resolved.)
@vyasr vyasr requested review from isVoid and wence- July 22, 2022 21:56
@vyasr (author) commented on Jul 22, 2022:

@shwina @wence- @isVoid thanks for your patience here. I think that I have addressed all the comments. I left open the few discussions that I thought still required some response. I also adapted what we discussed in #11180 (comment) into the discussion on how to handle parametrization in different scenarios.

Note that some parts of this document overlap significantly with discussions on testing. However, since I anticipate this PR being merged before #11199, I am fine with getting this done and then consolidating the work in that PR.


The following review thread discusses this excerpt from the benchmarking guide:

> When it comes to parametrizing tests, we have a number of options at our disposal.
> One option is fixtures, while a second is using `pytest.mark.parametrize`.
> A third option is provided by the [`pytest_cases`](https://smarie.github.io/python-pytest-cases/) `pytest` plugin.
A contributor asked:

Did we come to a conclusion about whether we want to advocate for the use of cases vs. straight parametrize/fixtures? It's not super clear from the thread. cc: @wence-

A contributor replied:
We concluded there's a hierarchy of complexity, which is illustrated with some guidance a little bit further down this paragraph. It's not fully cut-and-dried, which is why this guidance is soft rather than hard.

A contributor replied:
My fear is that unless it's cut-and-dried, most developers would rather not deal with complexity.

If I develop a feature and need to write a test or benchmark for it, I'm going to choose the path of least resistance, which could include:

  • writing as little code as possible
  • touching as few files as possible
  • learning as few new tools as possible

Another contributor commented:
Sorry to pile on this discussion further (and late!).

I think the current set of rules for choosing a method of parametrization is well-written and as concise as it can be.

But, it's quite mind-bending to think about the intersectionality of parameters for someone wanting to develop a feature and write incremental tests or benchmarks for it. Incorporating fixtures into this workflow is easy and familiar as one identifies reusable components between tests. Incorporating cases seems more difficult and really only possible in retrospect - or am I thinking of this all wrong?

A contributor replied:
I am not really sufficiently familiar with cases to say, but I think that is reasonable. Probably you would start out writing fixtures, and then, as they became more baroque, wonder if there were a better way (which I believe cases offer).

@vyasr (author) replied:
I don't think that anything that you've said is wrong or untrue, but I think you're classifying as a difference in kind something that is really only a difference in degree.

I would argue that incorporating fixtures effectively is also really only possible in retrospect. As our test suite clearly demonstrates, most people's starting point (at least historically) is to write an unparametrized test with some hardcoded objects, then abuse pytest.mark.parametrize to test as many possibilities as possible. Getting everyone to switch to using fixtures already requires some amount of either 1) foresight into what tests you want to write, or 2) refactoring tests in hindsight. Typically the latter occurs as part of PR review, although over the past year or two I think many developers have become more accustomed to thinking in terms of reusable fixtures and reaching for them first.

Cases are just one more step up the complexity ladder. Instead of just thinking about how you might write a bunch of tests that take the exact same arguments, to start off using cases you need to think about tests that might have only partial overlap in parametrization. Otherwise, you refactor tests in hindsight. The review process will train people in those best practices in the same way that we've improved our fixture usage.

> I think the current set of rules for choosing a method of parametrization is well-written and as concise as it can be.

Just to make sure that I understand: should I read this as saying that you're not yet convinced that we should use cases at all?

A contributor replied:
I did a poor job communicating what changes are required here.

I'm not unilaterally opposed to the use of cases, but I feel I need a better understanding of how much friction, if any, we would be adding to the process of developing benchmarks and - more importantly - tests, if we require them.

I think before moving forward with this recommendation, it'd be great to have a deeper discussion about cases offline. Perhaps with the rest of the team so we can get their feedback as well?

Another contributor commented:
FWIW, bringing my experience from other projects that heavily used parameterized fixtures (but not pytest-cases): I think there is a tradeoff between the generality (and lack of code repetition) that the more complex approaches bring and the ease of debugging (and extracting) a particular test when something goes wrong and you need to fix things. Although it is relatively easy to run a single failing test with `pytest test_file::test_name` for debugging purposes, I have often found that what I end up doing is pulling that test out into a single file that I can run with none of the pytest machinery in the way. If it then depends on a bunch of fixtures, this is painful because you have to find all of those and so forth.
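To make that workflow concrete, here is a minimal sketch of such an extracted repro (the file name, the data, and the cudf.concat call are all hypothetical); the point is just that every input is constructed inline, with no fixtures or pytest machinery involved:

# debug_repro.py: a failing parametrized test pulled out into a standalone script,
# with its fixture/case inputs reconstructed inline so it runs without pytest.
import cudf

left = cudf.DataFrame({"a": range(1_000), "b": range(1_000)})
right = cudf.DataFrame({"a": range(500)})

result = cudf.concat([left, right])
print(result.shape)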

@vyasr (author) replied:
I'll leave this conversation unresolved for now so that it's easy to find when we finally come back to this discussion.

@wence- (Contributor) left a review:

Trivial grammar nit, but otherwise looks good, thanks!

(One review thread on docs/cudf/source/developer_guide/benchmarking.md, since resolved.)
@bdice bdice changed the base branch from branch-22.08 to branch-22.10 August 2, 2022 21:43
@vyasr (author) commented on Aug 4, 2022:

In the interest of getting this PR merged sooner rather than later, I'm removing the discussion of cases for now until we can have a discussion and come to some consensus on how they should be used. I'm going to copy the exact relevant text out of the current document into this comment and then remove it from the doc so that we can merge and then revisit.

Discussion of cases

In the second case, fixtures are really functioning as parameters, which we discuss in the next section.

Parametrization: custom fixtures, pytest-cases and pytest.mark.parametrize

When it comes to parametrizing benchmarks, we have a number of options at our disposal.
One option is fixtures, while a second is using pytest.mark.parametrize.
A third option is provided by the pytest_cases pytest plugin.
Our benchmarks make extensive use of this plugin to handle complex parametrization.
Specifically, it provides some syntactic sugar around

import pytest


@pytest.mark.parametrize("num", [1, 2, 3])
def bench_foo(benchmark, num):
    # The benchmark fixture takes a callable, so wrap the expression in a lambda.
    benchmark(lambda: num * 2)

for when the parameters are nontrivial and require complex initialization.
This is common for benchmarks of functions accepting cuDF objects, such as cudf.concat.
With pytest_cases, the different cases are instead placed into separate functions and automatically made available.

# bench_foo_cases.py
def case_1():
    return 1


def case_2():
    return 2


def case_3():
    return 3


# bench_foo.py
import pytest_cases


@pytest_cases.parametrize_with_cases("num")
def bench_foo(benchmark, num):
    benchmark(lambda: num * 2)

pytest-cases allows developers to put complex initialization into named, documented functions.
That becomes especially valuable when benchmarking APIs whose performance can vary drastically based on parameters.
Additionally, cases, like fixtures, are lazily evaluated.
Initializing complex objects inside a pytest.mark.parametrize can dramatically slow down test collection,
or even lead to out of memory issues if too many complex cases are collected.
Using lazy case functions ensures that the associated objects are only created on an as-needed basis.
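
As a rough sketch of the difference (the benchmark and case names here are hypothetical, not taken from the cuDF suite): with pytest.mark.parametrize the parametrized objects are built when pytest collects the benchmarks, whereas a case function only runs when its benchmark actually executes.

import cudf
import pytest
import pytest_cases


# Eager: both DataFrames are constructed at collection time,
# even if these benchmarks are later deselected.
@pytest.mark.parametrize(
    "df",
    [cudf.DataFrame({"a": range(1_000)}), cudf.DataFrame({"a": range(1_000_000)})],
)
def bench_sum_eager(benchmark, df):
    benchmark(df.sum)


# Lazy: this case function is only called when bench_sum_lazy actually runs.
def case_large_frame():
    return cudf.DataFrame({"a": range(1_000_000)})


@pytest_cases.parametrize_with_cases("df", cases=".")
def bench_sum_lazy(benchmark, df):
    benchmark(df.sum)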

When writing cases, just as in writing custom fixtures, developers should make use of the config variables.
Cases should import the NUM_ROWS and/or NUM_COLS variables from the config module and use them to define data sizes.
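
For example (a sketch only: the case name is invented and the import path of the config module is an assumption), a case parametrized over the configured row counts might look like:

import cudf
import pytest_cases

from config import NUM_ROWS  # assumed import path for the benchmark config module


@pytest_cases.parametrize("nrows", NUM_ROWS)
def case_default_index_frame(nrows):
    # One case instance per configured row count, so data sizes stay centrally controlled.
    return cudf.DataFrame({"a": range(nrows)})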

Given the plethora of options for parametrization,
we codify here some best practices for how each should be employed.
In general, these approaches are applicable to parametrizations of different complexity.
For the purpose of this discussion,
we define a "simple parametrization" as parametrization using a list (possibly nested) of primitive objects.
Examples include a list of integers or a list of lists of strings.
This does not include e.g. cuDF or pandas objects.

With that in mind, here are some ground rules for how to parametrize.

Use pytest.mark.parametrize when:

  • One test must be run on many inputs and those inputs are simple to construct.

Use fixtures when:

  • One or more tests must be run on the same set of inputs,
    and all of those inputs can be constructed with simple parametrizations.
    In practice, that means that it is acceptable to use a fixture like this:
        @pytest.fixture(params=['a', 'b'])
        def foo(request):
            if request.param == 'a':
                ...  # Some complex initialization
            elif request.param == 'b':
                ...  # Some other complex initialization
    In other words, the construction of the fixture may be complex,
    as long as the parametrization of that construction is simple.

Use pytest_cases.parametrize_with_cases when:

  • One or more tests must be run on the same set of inputs,
    and at least one of those inputs requires complex parametrizations.
  • Given a set of cases, different tests need to run on different subsets with a nonempty intersection
    (see the sketch after this list).
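
A sketch of that last rule (the module and case names here are hypothetical, and the glob filter is one way pytest-cases can select a subset of cases by name):

# bench_concat_cases.py
import cudf


def case_two_frames_wide():
    return [cudf.DataFrame({f"c{i}": range(100) for i in range(20)})] * 2


def case_two_frames_narrow():
    return [cudf.DataFrame({"a": range(100)})] * 2


def case_many_frames_narrow():
    return [cudf.DataFrame({"a": range(100)})] * 16


# bench_concat.py
import cudf
import pytest_cases


# Runs on every case above.
@pytest_cases.parametrize_with_cases("objs")
def bench_concat(benchmark, objs):
    benchmark(cudf.concat, objs)


# Runs only on the "narrow" cases, so the two benchmarks share a
# nonempty subset of the same set of cases.
@pytest_cases.parametrize_with_cases("objs", glob="*narrow")
def bench_concat_ignore_index(benchmark, objs):
    benchmark(cudf.concat, objs, ignore_index=True)

In practice the shared cases would live in one cases module so that every benchmark drawing from them stays in sync as new cases are added.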

@vyasr vyasr requested a review from shwina August 4, 2022 21:59
rapids-bot bot pushed a commit that referenced this pull request Aug 4, 2022
This PR adds a primary developer guide for Python. It provides a more complete and informative landing page for new developers. When #11217, #11199, and #11122 are merged, they will all be linked from this page to provide a complete set of developer documentation.

There is one main point of discussion that I would like reviewer comments on, and that is the section on directory and file organization. How do we want that aspect of cuDF to look?

Authors:
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - Matthew Roeschke (https://github.com/mroeschke)
  - Lawrence Mitchell (https://github.com/wence-)
  - Ashwin Srinath (https://github.com/shwina)

URL: #11235
@vyasr (author) commented on Aug 5, 2022:

@gpucibot merge

@rapids-bot rapids-bot bot merged commit 493d96b into rapidsai:branch-22.10 Aug 5, 2022
@vyasr vyasr deleted the docs/python_benchmarking branch August 5, 2022 15:35