"Optimal" tests reordering - synthesis & strategy #5460

Open
smarie opened this issue Jun 18, 2019 · 1 comment
Labels: topic: collection (related to the collection phase), topic: fixtures (anything involving fixtures directly or indirectly), type: enhancement (new feature or API change, should be merged into features branch)

Comments

@smarie (Contributor) commented on Jun 18, 2019

Introduction

To quote #5054 (comment)

Fixture order due to parametrization is a "hot topic" in the sense that there has been some work done on it, but it is a non-trivial issue because fixture setup order might greatly affect test suite performance.

Currently, as you all know, pytest post-processes the order of collected test items in the pytest_collection_modifyitems hook. The purpose is to reach some kind of "optimality". There are currently many open tickets in pytest-dev about these ordering issues. My personal feeling is that we will not solve each of these problems separately, and that we need a single place to discuss what "optimal" means and what direction pytest will take on this topic.
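For reference, here is a tiny inspection sketch (not part of pytest itself, purely illustrative) that prints the final item order and the fixtures each item requires, after pytest's built-in reordering and any plugin reordering have run:

```python
# conftest.py -- inspection sketch only; run pytest with -s to see the output
import pytest

@pytest.hookimpl(hookwrapper=True)
def pytest_collection_modifyitems(session, config, items):
    yield  # let pytest's own fixture-based reordering (and other plugins) run first
    for item in items:
        fixtures = sorted(getattr(item, "fixturenames", []))
        print(f"{item.nodeid} -> {fixtures}")
```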

Current Status

1- What is "optimal"?

a) Current implementation

Even if PR #3551 makes its way towards solving the above issues (thanks @ceridwen!), a few other needs go beyond the current definition of "optimal":

b) Additional need 1: "priorities"

The first issue with the current approach is that, inside a given scope, the resulting order may be counter-intuitive, especially when there are multiple "best" orders. Some comments in related tickets (#2846 (comment), #2846 (comment)) disagree with the order that is currently produced.

A new ticket, #3393, was opened requesting an updated definition of "optimal": adding a "priority" argument. @Sup3rGeo proposed a plugin to handle this: pytest-param-priority.

My personal feeling is that "priority" is a very technical term that most users will not interpret correctly, whereas a notion of "setup/teardown cost", which users could express in seconds or in any other unit of their choice, would be easier to document and understand.
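To make the idea concrete, here is a minimal sketch of what declaring such costs could look like. Pytest has no notion of setup cost today; the FIXTURE_SETUP_COST mapping and its consumption by a reordering plugin are entirely hypothetical.

```python
import time
import pytest

# Hypothetical annotation: a reordering plugin could read this mapping and
# give the most expensive fixtures the fewest setup/teardown cycles.
FIXTURE_SETUP_COST = {"database": 30.0, "tmp_config": 0.01}  # in seconds

@pytest.fixture(scope="module", params=["sqlite", "postgres"])
def database(request):
    time.sleep(0.1)       # stands in for an expensive setup
    yield request.param   # teardown happens after the last test using this param
```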

c) Additional need 2: "constraints"

#4892 raises the question of "shared resources" between fixtures. Part of the OP's need can be addressed by giving the fixtures with the highest cost a "high priority", but the notion of a "shared resource" is still an additional need: two fixtures may require an "interlock" between their setup/teardown (one cannot be set up while the other is set up).
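For illustration only, a runtime workaround for such an interlock could look roughly like the sketch below; the ExclusiveSlot helper and the fixture names are hypothetical, and a proper solution would rather express this as a scheduling constraint known to the reordering algorithm.

```python
import pytest

class ExclusiveSlot:
    """At most one of the registered resources exists at any time."""
    def __init__(self):
        self._teardown = None

    def acquire(self, setup, teardown):
        if self._teardown is not None:
            self._teardown()      # interlock: release the other resource first
        resource = setup()
        self._teardown = teardown
        return resource

@pytest.fixture(scope="session")
def gpu_slot():
    return ExclusiveSlot()

@pytest.fixture(scope="module")
def model_a(gpu_slot):
    return gpu_slot.acquire(lambda: "model_a loaded", lambda: None)

@pytest.fixture(scope="module")
def model_b(gpu_slot):
    return gpu_slot.acquire(lambda: "model_b loaded", lambda: None)
```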

2- Other desirable features

a) Explicit ordering

pytest_reorder proposes an additional command-line option to reorder tests based on their node ids, or based on a custom regex matching order. This allows users to customize the order pretty much as they wish.

pytest-ordering proposes to reorder tests based on marks. I am not sure whether this also applies to fixtures.
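As an illustration of the general idea only (this is not how either plugin is actually implemented), node-id-based reordering can be sketched with the standard hook alone; the NODE_ID_ORDER patterns below are made up:

```python
# conftest.py -- illustrative sketch of explicit, pattern-based ordering
import re

NODE_ID_ORDER = [r"unit", r"integration", r"slow"]  # hypothetical patterns

def pytest_collection_modifyitems(session, config, items):
    def rank(item):
        for i, pattern in enumerate(NODE_ID_ORDER):
            if re.search(pattern, item.nodeid):
                return i
        return len(NODE_ID_ORDER)
    items.sort(key=rank)  # stable sort: original order is preserved within a rank
```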

b) Disabling order optimization

As this topic grows, it seems more and more appropriate to be able to disable any kind of order optimization, if only to understand where a given order comes from. I suggested in #5054, and implemented in pytest-cases, a command-line switch to skip all reordering done by pytest and plugins.
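For illustration (this is not the pytest-cases implementation, and the flag name is made up), such an opt-out switch could be sketched as follows: snapshot the collection order before anyone reorders it, then restore it afterwards.

```python
# conftest.py -- illustrative sketch of a "skip all reordering" switch
import pytest

def pytest_addoption(parser):
    parser.addoption("--keep-collection-order", action="store_true",
                     help="hypothetical flag: keep the raw collection order")

@pytest.hookimpl(hookwrapper=True)
def pytest_collection_modifyitems(session, config, items):
    original = list(items)   # snapshot before pytest and plugins reorder
    yield                    # let all (non-wrapper) reordering hooks run
    if config.getoption("keep_collection_order"):
        items[:] = original  # discard every reordering decision
```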

c) Readability / maintainability

To quote #3161 (comment)

"After spending some time staring at the reorder_items_atscope function, I still don't understand what the order it's supposed to produce is. I would assume that the correct order is to group all the fixtures of a given scope together and otherwise preserve the order in which they're processed. Are there more constraints than that?"

This raises the point about readability/maintainability of the chosen algorithm, whatever it is.
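For concreteness, here is a small example of the grouping behaviour the current heuristic aims to produce, as I understand it, for a module-scoped parametrized fixture (the orders in the comment reflect that understanding):

```python
import pytest

@pytest.fixture(scope="module", params=["a", "b"])
def server(request):
    yield request.param   # one setup/teardown per parameter per module, if grouping works

def test_one(server): pass
def test_two(server): pass

# Collected order:  test_one[a], test_one[b], test_two[a], test_two[b]  (4 setups if run as collected)
# Reordered:        test_one[a], test_two[a], test_one[b], test_two[b]  (2 setups)
```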

d) Support for parallelism

pytest-xdist allows users to parallelize tests. I expect that the "optimal" scheduling will therefore have to be completely rethought in the presence of parallelism.

Now what?

From here, the debate is open:

  • Is this entire topic in scope for pytest, or only part of it? If not, where is the best place to work on this topic?
  • At which point should we switch from "relying on heuristics", where the algorithm needs to be modified every time a new issue is discovered, to "relying on an optimization solver", where the problem formulation is the only thing that needs to be maintained? To me, this is probably the best way to properly handle several sets of constraints and to be able to add more later. It seems relatively easy to formulate the (MILP) optimization problem in the case where there is no parallelism; there are plenty of resources about it, for example this chapter 4. Then, once the mathematical problem is formulated, many MILP solvers are available in Python, as presented here. A minimal formulation sketch is given right after this list.
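To show what such a formulation could look like, here is a minimal sketch (not pytest code) using the PuLP library and a toy model with made-up data: each test requires a set of fixtures, a fixture is torn down as soon as the next test does not use it, and the objective is to minimize the total setup cost. Scopes, parallelism and interlocks are deliberately left out.

```python
import pulp

tests = {                                   # test name -> fixtures it requires (toy data)
    "test_a": {"db"},
    "test_b": {"db", "cache"},
    "test_c": {"cache"},
    "test_d": {"db"},
}
cost = {"db": 30.0, "cache": 5.0}           # setup cost per fixture (arbitrary units)
positions = range(len(tests))
fixtures = sorted(cost)

prob = pulp.LpProblem("test_reordering", pulp.LpMinimize)

# x[t, p] == 1  <=>  test t runs at position p
x = pulp.LpVariable.dicts("x", [(t, p) for t in tests for p in positions], cat="Binary")
# setup[f, p] is forced to 1 whenever fixture f must be (re)set up at position p
setup = pulp.LpVariable.dicts("setup", [(f, p) for f in fixtures for p in positions],
                              lowBound=0, upBound=1)

# each test gets exactly one position, and each position exactly one test
for t in tests:
    prob += pulp.lpSum(x[t, p] for p in positions) == 1
for p in positions:
    prob += pulp.lpSum(x[t, p] for t in tests) == 1

def active(f, p):
    # 1 if the test placed at position p uses fixture f (an expression, not a variable)
    return pulp.lpSum(x[t, p] for t in tests if f in tests[t])

# a setup is needed whenever a fixture becomes active at p without being active at p-1
for f in fixtures:
    prob += setup[f, 0] >= active(f, 0)
    for p in positions[1:]:
        prob += setup[f, p] >= active(f, p) - active(f, p - 1)

# objective: total setup cost over the whole run
prob += pulp.lpSum(cost[f] * setup[f, p] for f in fixtures for p in positions)

prob.solve(pulp.PULP_CBC_CMD(msg=False))
order = [t for p in positions for t in tests if x[t, p].value() > 0.5]
print(order)  # one optimal order, e.g. ['test_a', 'test_d', 'test_b', 'test_c']
```

Additional needs such as priorities, interlocks between fixtures, or scope nesting would then become extra constraints on the same variables, rather than changes to a hand-written heuristic.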

Your ideas?

@RonnyPfannschmidt (Member) commented:

I am definitely on board with making this a constraint solver.
I believe that for a sound implementation we need to destroy the test protocol hooks as currently designed (yay).

While we are at it, there should be a layer woven in to support communication about setup/teardown dependencies for xdist, as its current scheduling mechanisms leave much to be desired.

@Zac-HD added the topic: collection, topic: fixtures and type: enhancement labels on Jul 9, 2019.