Fuzz Shrinking feature #351
base: main
Does it also produce a crash log for failed invariants on assertions?
yes
Now, shrinking works with this
81df7ae
to
6256be2
Compare
wake/cli/test.py
Outdated
type=str,
help="Path to the shrink log file.",
is_flag=False,
flag_value=0,
why 0?
Yes, true. We use it as a flag as well, but it should be a string. I will change it to "-1".
wake/cli/test.py
Outdated
type=str,
help="Path of shrank file.",
is_flag=False,
flag_value=0,
same question, why 0?
same as above
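The option under review takes an optional value: given alone it acts as a flag, and given with a path it selects a file. A minimal sketch of that pattern follows, using the standard library's argparse as a stand-in rather than the click decorators in wake/cli/test.py. The option name `--shrink`, the `describe` helper, and the messages are hypothetical; only the "-1" sentinel comes from the reply above.

```python
import argparse

LATEST = "-1"  # sentinel meaning "flag given without a path" (value from the review reply)


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="demo")
    # nargs="?" makes the value optional; const is used when the flag appears alone.
    parser.add_argument(
        "--shrink",
        nargs="?",
        const=LATEST,
        default=None,
        help="Path to the shrink log file.",
    )
    return parser


def describe(argv):
    """Hypothetical helper that reports which mode the arguments select."""
    args = build_parser().parse_args(argv)
    if args.shrink is None:
        return "shrinking disabled"
    if args.shrink == LATEST:
        return "shrink latest crash log"
    return f"shrink {args.shrink}"
```

Keeping the sentinel a string means the option value has a single type whether or not a path is supplied.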
@@ -1462,6 +1462,9 @@ class Chain(ABC):

    tx_callback: Optional[Callable[[TransactionAbc], None]]

    def __deepcopy__(self, memo):
This may be tricky.
On one hand, I understand why it's needed (FuzzTest members will contain Chain transitively); on the other hand, snapshot + revert won't completely restore the state, just a subset.
For example, default accounts won't be restored.
We should either "back up" more attributes or find a better way.
Only data stored outside the FuzzTest class would not be saved.
We already store the random state in the data-collecting phase.
Chain and account information can be stored with chain.snapshot().
Even if the tester defines chains outside the FuzzTest class, the chain state is still stored,
since those chains are added to wake.testing.core.connected_chains, a global variable, and shrinking uses this array.
We can also add the missing Chain members to the snapshot function.
Yeah, I think a few attributes are missing from what the snapshot stores; once they are added, it should be good.
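To illustrate the trade-off in this thread, here is a minimal sketch of a `__deepcopy__` hook on a connection-like object. The thread does not show the body of the real `Chain.__deepcopy__`, so returning `self` is an assumption, and `FakeChain` and `FakeTest` are hypothetical stand-ins.

```python
import copy


class FakeChain:
    """Hypothetical stand-in for a chain connection that must not be duplicated."""

    def __init__(self):
        self.connection = object()  # pretend this is a live socket

    def __deepcopy__(self, memo):
        # Assumption: share the single connection instead of cloning it.
        memo[id(self)] = self
        return self


class FakeTest:
    """Hypothetical test object holding a chain plus plain Python state."""

    def __init__(self, chain):
        self.chain = chain
        self.balances = {"alice": 1}


test = FakeTest(FakeChain())
backup = copy.deepcopy(test)   # copies balances, shares the chain
test.balances["alice"] = 99    # later mutation does not touch the backup
```

With this approach the Python-side state is copied while the chain object is shared, so anything living on the chain object or on-chain has to be restored separately, which is the gap the reviewer points out.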
@@ -43,6 +43,7 @@ class JsonRpcCommunicator:
    _protocol: ProtocolAbc
    _request_id: int
    _connected: bool
    _interrupt_received: bool
can be removed, I think
yes, true
# from wake.development.transactions import Error
if type(e1) == Error and type(e2) == Error:
    # If it was the Error(TransactionRevertedError), compare message content.
    if e1.message != e2.message:
        return False
why must there be an extra check for Error? What about other exceptions of the same type but different values/members?
This is a transaction error.
If the transaction reverts with Error(), for example revert("Switch Pushed");, it reverts without a custom error, usually with a string, so we compare the message.
But if there is no message (a bare require()) and it appears in multiple places, the reverts cannot be distinguished.
Yes, but what if we are trying to reproduce NotOwner(0x742d35Cc6634C0532925a3b844Bc454e4438f44e) but encounter NotOwner(0x19E7E376E7C213B7E7e7e46cc70A5dD086DAff2A)?
Shouldn't we just compare with ==?
ah, I see the next comment
wake/testing/fuzzing/fuzz_shrink.py
Outdated
frame1 = None
for frame1 in tb1:
    if is_relative_to(
        Path(frame1.filename), Path.cwd()
    ) and not is_relative_to(
        Path(frame1.filename), Path().cwd() / "pytypes"
    ):
        break
frame2 = None
for frame2 in tb2:
    if is_relative_to(
        Path(frame2.filename), Path.cwd()
    ) and not is_relative_to(
        Path(frame2.filename), Path().cwd() / "pytypes"
    ):
        break
we are searching here for a frame with an exception that happened in our cwd but not in pytypes, correct?
so the comparison does not only take into account the exception data but also the location where it happened?
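The frame search described in this comment can be sketched with the standard library alone. This is a minimal illustration, not the code in fuzz_shrink.py: the helper names `first_user_frame` and `same_location` are hypothetical, and `Path.is_relative_to` (Python 3.9+) stands in for the `is_relative_to` helper used in the diff.

```python
import traceback
from pathlib import Path
from typing import Optional


def first_user_frame(
    exc: BaseException, root: Path, excluded: Path
) -> Optional[traceback.FrameSummary]:
    """Return the first traceback frame under `root` but not under `excluded`."""
    for frame in traceback.extract_tb(exc.__traceback__):
        path = Path(frame.filename).resolve()
        # Path.is_relative_to requires Python 3.9+.
        if path.is_relative_to(root) and not path.is_relative_to(excluded):
            return frame
    return None


def same_location(e1: BaseException, e2: BaseException, root: Path, excluded: Path) -> bool:
    """Compare two exceptions by where they surfaced in user code, not by their values."""
    f1 = first_user_frame(e1, root, excluded)
    f2 = first_user_frame(e2, root, excluded)
    if f1 is None or f2 is None:
        return False
    return (f1.filename, f1.lineno) == (f2.filename, f2.lineno)
```

In the setting of the diff, `root` would correspond to `Path.cwd()` and `excluded` to `Path.cwd() / "pytypes"`.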
Yes, exactly.
For a transaction error, we rely on the error type; if it is a transaction Error, we also rely on the message. Other error arguments are ignored.
For an error in Python, we rely on the file and line number in Python, so we do not care about the actual value.
There was a lot of consideration, since it depends on the definition of the "same error".
The purpose of shrinking is to create a minimal flow sequence that reproduces the same error.
The "same error" could mean different things:
1. The same error is emitted.
2. The same assertion in Python fails (at the same line in the Python test).
or
3. The same error with the same argument is emitted (and also in exactly the same flow).
4. The same assertion in Python fails with the same value (and also in exactly the same flow).
I decided to implement 1. and 2., since these conditions can significantly shorten the test.
For example, with 3. and 4., if the error was emitted with TransferError(nft_id=10), the test would have to create at least 10 NFTs before triggering the error. This would be redundant.
However, one possible issue is that when the balance is checked for each account, the shrunk result may show a mismatch for a different account:
@invariant(period=30)
def invariant_erc20_balances(self):
    for contract in self.erc20_balances:
        for acc in self.erc20_balances[contract]:
            assert contract.balanceOf(acc) == self.erc20_balances[contract][acc]  # <- error in same file and same line.
Got it. Then we should at least implement the same logic for Panic errors, as they behave the same as Error.
Still, I think we should implement a strict shrinking feature where we compare the errors exactly with ==.
Yes, that's true.
I already have an exact match for the flow number. I can extend this to error matching as well.
wake/wake/testing/fuzzing/fuzz_shrink.py
Line 36 in 6256be2
IGNORE_FLOW_INDEX = True # True if you accept it could reproduce same error earlier.
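The two matching policies discussed in this thread, loose (type, plus message for string reverts) and strict (exact equality), can be sketched as one function with a switch. The classes `Error` and `NotOwner` below are hypothetical stand-ins, not Wake's types, and the `strict` parameter is an assumed name for the proposed strict mode.

```python
class Error(Exception):
    """Hypothetical stand-in for a string revert such as revert("Switch Pushed")."""

    def __init__(self, message: str):
        super().__init__(message)
        self.message = message


class NotOwner(Exception):
    """Hypothetical custom error carrying one argument."""

    def __init__(self, caller: str):
        super().__init__(caller)
        self.caller = caller


def same_error(e1: Exception, e2: Exception, strict: bool = False) -> bool:
    if type(e1) is not type(e2):
        return False
    if strict:
        # Strict mode: every argument must match (stand-in for comparing with ==).
        return e1.args == e2.args
    if isinstance(e1, Error):
        # Loose mode: string reverts are told apart by their message.
        return e1.message == e2.message
    # Loose mode: custom errors match on type alone; arguments are ignored.
    return True
```

Under the loose policy, `NotOwner` with two different addresses counts as the same error, which allows shorter shrunk sequences; the strict policy rejects that match.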
wake/testing/fuzzing/fuzz_shrink.py
Outdated
set_sequence_initial_internal_state(
    pickle.dumps(
        random.getstate()
    )
)
why is this needed when we're not in the shrinking mode?
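For context on what the snippet captures: pickling `random.getstate()` turns the generator state into bytes that can later be restored with `random.setstate`, so the same random choices can be replayed. A minimal standard-library sketch follows; it does not show how Wake stores or consumes the state, and the seed and variable names are arbitrary.

```python
import pickle
import random

random.seed(1234)
saved = pickle.dumps(random.getstate())  # snapshot the generator state as bytes

first_run = [random.randint(0, 100) for _ in range(5)]

random.setstate(pickle.loads(saved))  # rewind to the snapshot
second_run = [random.randint(0, 100) for _ in range(5)]  # same draws as first_run
```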
wake/testing/fuzzing/fuzz_shrink.py
Outdated
def revert(self, python_instance: FuzzTest, chains: Tuple[Chain, ...]):
    assert self.chain_states != [], "Chain snapshot is missing"
    assert self._python_state is not None, "Python state snapshot is missing "
    assert self.flow_number is not None, "Flow number is missing"

    python_instance.__dict__.update(copy.deepcopy(self._python_state.__dict__))

    self._python_state = None
    for temp_chain, chain in zip(self.chain_states, chains):
        chain.revert(temp_chain)
    self.chain_states = []
As I understand it, we always revert just to create another snapshot right after the revert.
In the case of the chain snapshot, I'm afraid it's necessary to call it again, but I don't think we need to create the deepcopy snapshot again. Could it be optimized? Also, I don't understand why deepcopy is used in this revert function.
As far as I remember, Anvil creates a snapshot and returns the ID, but once the snapshot is used, it is removed, and reverting to that ID will fail.
But, the Python instance seems to be working with direct assignment.
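The revert-then-snapshot pattern follows from the behavior recalled in this reply, where a snapshot ID stops being valid once it has been reverted to. The sketch below models that with a hypothetical `FakeChain`; it makes no claim about Anvil's or Wake's actual API, and `revert_and_resnapshot` is an illustrative helper name.

```python
class FakeChain:
    """Hypothetical chain whose snapshot IDs are consumed on revert."""

    def __init__(self):
        self.state = 0
        self._snapshots = {}
        self._next_id = 0

    def snapshot(self) -> int:
        snap_id = self._next_id
        self._next_id += 1
        self._snapshots[snap_id] = self.state
        return snap_id

    def revert(self, snap_id: int) -> None:
        # pop() removes the snapshot, so a second revert to the same ID fails.
        self.state = self._snapshots.pop(snap_id)


def revert_and_resnapshot(chain: FakeChain, snap_id: int) -> int:
    chain.revert(snap_id)
    return chain.snapshot()  # fresh ID for the next attempt


chain = FakeChain()
snap = chain.snapshot()
for attempt in range(3):
    chain.state += 10  # run some flows
    snap = revert_and_resnapshot(chain, snap)
```

The Python-side state has no such single-use constraint, which is why one stored copy can be reused across attempts, as the reviewer suggests.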
wake/testing/fuzzing/fuzz_shrink.py
Outdated
with print_ignore():
    test_instance._flow_num = 0
    test_instance.pre_sequence()
    exception_content = None
    try:
        with redirect_stdout(open(os.devnull, 'w')), redirect_stderr(open(os.devnull, 'w')):
Isn't the redirection applied twice? Once in print_ignore and a second time directly here?
This was redundant. removed.
Based on my measurements, most of the time is spent in invariants - can be even 70%. Do you think it would be possible to skip the execution of invariants when trying to reproduce the exception when deciding whether to keep a flow or not? Of course, it brings up some problems to handle:
What do you think?
It could depend on the project and situation.
Having invariant
cons
It can select specific invariants (if the target error occurred in the ERC-20 balance check, run only that invariant). Whether to run a check could change depending on what shrinking removes.
I think a bit more thinking is required.
…o think about random state input
Fuzz Shrinking
1. Fuzz Test
To generate a crash log when the fuzz test is failing:
wake test tests/fuzz_test.py
2. Shrinking
To shrink the fuzz test using its latest failure:
wake test -SH
You can specify the test path; it verifies that the testing target is the same:
wake test tests/fuzz_test.py -SH
Or specify a crash log directly:
wake test -SH .wake/logs/crashes/20241010_035704.txt
3. Reproduce the Error from the Shrunk File
To reproduce the shrunk test:
wake test -SR
You can also specify the test here:
wake test tests/test_fuzz.py -SR
Alternatively, specify a shrunk data file:
wake test -SR .wake/logs/shrank/20241010_042322.bin
Shrinking phase
Shrinking tries to remove flows using two algorithms.
First, it removes multiple flows at once, which is faster; the progress of the removed flows is printed.
Second, it tries to remove flows one by one.
✅ remove flows by flow kind (takes O(n))
✅ remove flows by brute force (takes O(n^2))
✅ a flow that fails because a flow it depends on was removed, and that is not the target failing flow, can be removed (flows with failed preconditions and un-executed flows are removed in the brute-force shrink)
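The two phases above can be sketched as follows. This is a simplified illustration under stated assumptions, not the code in fuzz_shrink.py: a flow is represented only by its kind name, and `reproduces` is a hypothetical callback that replays a candidate sequence and reports whether the target error still occurs.

```python
from typing import Callable, List, Sequence

Flow = str  # hypothetical: a flow is identified by its kind name


def shrink(flows: Sequence[Flow], reproduces: Callable[[List[Flow]], bool]) -> List[Flow]:
    current = list(flows)

    # Phase 1: try dropping every flow of one kind at once.
    for kind in sorted(set(current)):
        candidate = [f for f in current if f != kind]
        if len(candidate) < len(current) and reproduces(candidate):
            current = candidate

    # Phase 2: try dropping the remaining flows one by one.
    i = 0
    while i < len(current):
        candidate = current[:i] + current[i + 1:]
        if reproduces(candidate):
            current = candidate  # keep the removal, retry the same index
        else:
            i += 1
    return current


def needs_two_mints_and_a_burn(seq: List[Flow]) -> bool:
    """Toy failure condition standing in for replaying the test."""
    return seq.count("mint") >= 2 and "burn" in seq


flows = ["mint", "transfer", "burn", "mint", "transfer", "mint"]
result = shrink(flows, needs_two_mints_and_a_burn)
```

In this toy run, phase 1 removes every `transfer` flow in one step and phase 2 removes one surplus `mint`, leaving `["burn", "mint", "mint"]`.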