Refactor TestResult into proper rusty result type #1827

yihuaf · 2023-04-19T17:19:08Z

This PR refactored the TestResult into a rusty std::Result<T, TestError>. In this way, the handling of TestResult can follow how rust handles error, such as using ?. We further use thiserror to generate proper TestError for the cases the TestResult was handling. For simplicity, we wrap TestError::Failed with anyhow::Error to maintain backward compatibility. In the future, we can, should, and will expand the TestError::Failed case to specific errors so the error handling can become more explicit.

Tag along, also enabled container exec test since it is implemented now.

Signed-off-by: yihuaf <[email protected]>

tsturzl · 2023-04-19T17:36:37Z

My memory escapes me a little here, but I thought the reasoning behind the idea of skipping tests was to not consider them failures if the system the tests are run on isn't capable of running that test. Is it appropriate to consider it an error type? To me it doesn't really seem like an error, though I could see benefits in the future if we ever want to include context as to why a certain test got skipped.

yihuaf · 2023-04-19T21:21:43Z

To me it doesn't really seem like an error, though I could see benefits in the future if we ever want to include context as to why a certain test got skipped.

Logically, yes, skip is a special case. It can be argued that skip should be considered as success or Ok. It can also be argued that skip as an exception that needs to be handled. It is an odd ball here. To me, first and foremost is to use the result type so all the typical rust error handling facilities can work, such as ? operator. Result is a binary case with Ok or Err. In this PR, for backward compatibility and to minimize the changes, I include the skip as a special case of Err that needs to be handled. As one of the case of TestErr, rust compiler would force the code to handle the skip case explicitly, which I believe is a good thing.

Also, I think the key thing here is not get hangup on the case of what is an error. Rather, my mental model is that the result type is a list of explicit Enum or state where the Ok state is specially handled. All other state are grouped together and handled.

Signed-off-by: yihuaf <[email protected]>

YJDoc2 · 2023-04-20T16:04:43Z

Hey, I'll take a look at this over weekend 👍

YJDoc2 · 2023-04-23T11:03:20Z

Hye @yihuaf , thanks a lot for putting in the time and making this PR. One thing is you have added a with() method to TestGroup impl, but I didn't see it being used anywhere. Apart from that the code seems fine, and as you haven't changes any core logic of tests, as long as it compiles and runs in CI, it should be good .

Unfortunately, I don't particularly get the reasoning/advantages behind this.

I think introduction of thiserror might be a good idea, but I didn't see changes in output of my failing tests.

Some points which makes me question the changes here :

Both the integration tests (binary) and test framework are not intended to be published at all. Which means we don't really have anything that needs to worry about interface ergonomics : as long as the original TestResult, or the one introduced by you serves our purpose of tests, anything/anyone else being /not-being able to use things such as ? with it is not really a concern.
Now if changing original TestResult enum would have made its usage more ergonomic for us or made the code of tests simpler, I could see its benefits, however , I don't really see that with these changes. We still need to wrap our errors (anyhow based) within TestError, and then that inside Err. Which actually has increased one level of nesting, as originally we only wrapped anyhow into TestResult.
Another thing is, this change would have been really welcome if TestResult was an internal type which, say, we returned from intermediate functions, and needed to bubble up. In such case this change would have added the excellent ergonomic of ? , so we didn't need to to a match and return at every place such intermediate functions were called. However , I don't think that is the case. In our usage, TestResult is almost exclusively the last return type of tests themselves which is consumed only by the TestManager, where it matches and prints out the result. Thus I don't think that the ? is particularly useful, and changing the existing to more rusty result is worth the changes.
One more reason for which I don't really like the changes here is that TestResult doesn't really fit in the two-value rust Result. As @tsturzl has mentioned, the reason not to use rust's Result in first place was that
1. The test that pass will always return (). There will not be any other type of value in Ok() and the Passed state is just an indicative state that marks the test has passed. Thus adding that <()> to method signatures is a bit redundant.
2. Skipped state is something completely different from Passed and Failed states. We cannot say a skipped test is failed, as it was skipped because the system is not capable of running it (not that it ran on the system, and produced incorrect result) . It is also different from Passed, as it never ran on the system in the first place to pass or fail.

Thus we need something that is capable of indicating these three different states. Putting these in rust's Result does not do any good, because we still have to do a match with three arms (Ok, err(skipped), err(failed) ); moreover now we must consider skipped as either failure (as in this PR) or passing , both of which are not really correct.

I might be missing something that you see, and maybe the changes you have proposed might have benefits which I do not see yet. However, I don't currently see tradeoff of changing to more rusty Result v/s the points I mentioned above being worth it.

Please feel free to correct me or indicate the reasons I am missing.
Thanks!

yihuaf · 2023-04-24T02:45:42Z

Both the integration tests (binary) and test framework are not intended to be published at all. Which means we don't really have anything that needs to worry about interface ergonomics : as long as the original TestResult, or the one introduced by you serves our purpose of tests, anything/anyone else being /not-being able to use things such as ? with it is not really a concern.

The main motivation for the change is that in lifecycle tests, the test itself calls other tests. Without using ?, each call needs to be handled as test pass or fail or skip, which can be quite cumbersome. The alternative is to refactor the operation of these tests as actual functions and only wrap the TestResult in the actual test calls. In this way, other tests can call these lifecycle functions without the need to deal with TestResult. Your assumption makes sense only when TestResult is handled once for each test cases at the end. However, in the case where test cases may call other test cases, this assumption won't hold.

Another thing is, this change would have been really welcome if TestResult was an internal type which, say, we returned from intermediate functions, and needed to bubble up. In such case this change would have added the excellent ergonomic of ? , so we didn't need to to a match and return at every place such intermediate functions were called.

Exactly. The key here is should we allow test cases to call other test cases?

The test that pass will always return ().

This is not true. Yes, most test case will return (), but for functions such as fn get_container_pid(project_path: &Path, id: &str) -> TestResult<i32>, the return type is simplified if the TestResult is a proper result type. This is also another example of functions calling other test cases and need to handle TestResult.

Skipped state is something completely different from Passed and Failed states.

I agree that skip is different from failed and pass, but this is all semantics in my opinion. Having a Err(Skipped) does not necessarily mean it is an actual error. It is just working around the rust language constructs. As you mentioned, the match statement simply becomes:

match result {
Ok(()) => {}
Err(Skipped) => {}
Err(Failed) => {}
}

Lastly, I kept TestError to be Skipped and Failed. As I mentioned, this is intended for this PR to maintain as much as the original behavior. Next, we can expand the Failed case to specific errors. For example, container already exist is marked simply as Failed today. When testing for duplicate containers, we actually want to check that container is failed because container already exist. However, the second container create call can fail for a number of reasons and our tests currently treats them all as success.

In the end, I don't feel strongly about this particular implementation, but I would like to at least go with this direction or refactor other test cases to avoid the test case calling other test case situation.

yihuaf · 2023-04-24T20:56:45Z

I've thought a bit more about this and I believe the core of the issue is test cases calling other test cases. I am now more inclined to refactor the tests such that we do not call other test cases within a test case directly. In addition, in this case, the skip result doesn't make sense. Skip should be decided before the actual test is run by checking for specific conditions. A skip test result should not be returned in the middle of a test since it doesn't make much sense.

@YJDoc2 Let me know what you think.

yihuaf · 2023-04-30T19:01:45Z

@YJDoc2 Let me what you think :) ?

I read the lifecycle test again and I think the issue I previously described is mostly focused within the lifecycle test. The other tests are much more simpler and straightforward. I think a better approach is to refactor the lifecycle tests to not use TestResult if a function call is used within the lifecycle test. We can decide what we should do with TestResult at a later point?

YJDoc2 · 2023-05-01T05:40:17Z

@yihuaf Apologies for late reply, I was a bit busy, and missed this 😓

So I agree with you that we should not be calling tests inside other tests as if util functions. So We should make the lifecycle functions "util" functions , which return normal Rust results, and other tests will call them as normal functions. The lifecycle tests themselves will also simply call these functions and convert their result into TestResult (at least for now).

I feel it would be a better approach to first do this refactoring and then maybe take another look at what this PR aims to do, maybe we will get a better idea of exactly what needs to be changed. Separating out lifecycle stuff as normal functions is definitely something that we should go ahead with 👍

On a separate note,

Skip should be decided before the actual test is run by checking for specific conditions. A skip test result should not be returned in the middle of a test since it doesn't make much sense.

I'm not sure exactly which test you are talking about, but the Skip was always intended to be returned at the very start, by the check function of ConditionTest trait . If your comment was in context of liifecycle functions returning TestResult which can potentially be skip, then with the above change, that should be mitigated. Correct me if I misunderstood anything.

Again, apologies for late reply, thanks for pinging me again!

yihuaf · 2023-05-01T15:52:18Z

Again, apologies for late reply, thanks for pinging me again!

No worries :)

I'm not sure exactly which test you are talking about, but the Skip was always intended to be returned at the very start, by the check function of ConditionTest trait . If your comment was in context of liifecycle functions returning TestResult which can potentially be skip, then with the above change, that should be mitigated.

You understood it correctly. I was talking in the context of test calling other test. I agree with your assessment on this.

Separating out lifecycle stuff as normal functions is definitely something that we should go ahead with 👍

I agree. I will close this PR and open a new PR for this change. I am glad we talked/sorted this out :)

yihuaf added 2 commits April 19, 2023 17:16

enable exec test

3c68c9e

Signed-off-by: yihuaf <[email protected]>

refactor TestResult to a proper rusty result

0b93392

Signed-off-by: yihuaf <[email protected]>

fix lint

32d0844

Signed-off-by: yihuaf <[email protected]>

yihuaf requested review from utam0k, YJDoc2 and a team April 19, 2023 21:31

yihuaf self-assigned this Apr 19, 2023

yihuaf closed this May 1, 2023

yihuaf deleted the yihuaf/refactor-tests branch May 1, 2023 15:52

yihuaf mentioned this pull request May 1, 2023

Refactor the lifecycle test #1868

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Refactor TestResult into proper rusty result type #1827

Refactor TestResult into proper rusty result type #1827

yihuaf commented Apr 19, 2023 •

edited

Loading

tsturzl commented Apr 19, 2023

yihuaf commented Apr 19, 2023 •

edited

Loading

YJDoc2 commented Apr 20, 2023

YJDoc2 commented Apr 23, 2023

yihuaf commented Apr 24, 2023 •

edited

Loading

yihuaf commented Apr 24, 2023

yihuaf commented Apr 30, 2023

YJDoc2 commented May 1, 2023

yihuaf commented May 1, 2023

Refactor TestResult into proper rusty result type #1827

Refactor TestResult into proper rusty result type #1827

Conversation

yihuaf commented Apr 19, 2023 • edited Loading

tsturzl commented Apr 19, 2023

yihuaf commented Apr 19, 2023 • edited Loading

YJDoc2 commented Apr 20, 2023

YJDoc2 commented Apr 23, 2023

yihuaf commented Apr 24, 2023 • edited Loading

yihuaf commented Apr 24, 2023

yihuaf commented Apr 30, 2023

YJDoc2 commented May 1, 2023

yihuaf commented May 1, 2023

yihuaf commented Apr 19, 2023 •

edited

Loading

yihuaf commented Apr 19, 2023 •

edited

Loading

yihuaf commented Apr 24, 2023 •

edited

Loading