-
Notifications
You must be signed in to change notification settings - Fork 2
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
docs: ✨ 📝 Add quick start guide to docs.
- Loading branch information
1 parent
3f30767
commit a59871c
Showing
1 changed file
with
119 additions
and
0 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,119 @@ | ||
# Quick Start Guide | ||
|
||
`hypothesis-torch` is a package of `hypothesis` strategies for various Pytorch structures (including tensors and modules). | ||
|
||
## What is `hypothesis` and property-based testing? | ||
In short, property-based testing is a testing methodology where the developer defined properties a unit of code should have, and the testing framework generates arbitrary inputs (following the developer's guidelines) to test the code against these properties. This is in contrast to example-based testing, where the developer provides specific inputs and expected outputs. This is exceptionally useful for (but by no means limited to!) [NP-class problems](https://en.wikipedia.org/wiki/NP_(complexity)) where solving is difficult, but checking a solution is easy. | ||
|
||
[Hypothesis](https://hypothesis.readthedocs.io/en/latest/) is a powerful property-based testing library for Python. It | ||
lacks built-in support for Pytorch tensors and modules, so this library provides strategies for generating them. | ||
|
||
The [`hypothesis` quick start guide](https://hypothesis.readthedocs.io/en/latest/quickstart.html) is a great place to start if you're new to property-based testing. | ||
|
||
## Installation | ||
`hypothesis-torch` can be installed via pip: | ||
```bash | ||
pip install hypothesis-torch | ||
``` | ||
|
||
Optionally, you can also install the `huggingface` extra to also install the `transformers` library: | ||
```bash | ||
pip install hypothesis-torch[huggingface] | ||
``` | ||
|
||
Strategies for generating Hugging Face transformer models are provided in the `hypothesis_torch.huggingface` module. If | ||
and only if `transformers` is installed when `hypothesis-torch` is imported, these strategies will be available from | ||
the root `hypothesis_torch` module. | ||
|
||
## An example with tensors | ||
|
||
Suppose we have written a function that takes two PyTorch tensors, and [projects](https://en.wikipedia.org/wiki/Vector_projection) the second one onto the first one (treating all tensors as vectors). | ||
|
||
We consequently want to define our function to have the following properties (which constitute one definition of a vector projection): | ||
1. The projection (output tensor) should all be parallel to the first input tensor. | ||
2. The second input tensor subtracted by the projection should be orthogonal to the first input tensor. | ||
|
||
We might use these properties to define a test for our function. We can use `hypothesis-torch` to generate random tensors to test our function against these properties. | ||
|
||
In the sample below, the two tensors will be in single precision (`torch.float32`) and have a shape of `(1, 2, 3)`, but these values could be changed to any valid values for the `dtype`, `shape`. | ||
|
||
```python | ||
import torch | ||
import hypothesis_torch | ||
from hypothesis import given | ||
from hypothesis import strategies as st | ||
|
||
def project(tensor1: torch.Tensor, tensor2: torch.Tensor) -> torch.Tensor: | ||
"""Project tensor2 onto tensor1.""" | ||
return tensor2.dot(tensor1) / tensor1.dot(tensor1) * tensor1 | ||
|
||
@given( | ||
tensor1=hypothesis_torch.tensor_strategy( | ||
dtype=torch.float32, | ||
shape=(1,2,3), | ||
), | ||
tensor2=hypothesis_torch.tensor_strategy( | ||
dtype=torch.float32, | ||
shape=(1,2,3), | ||
), | ||
) | ||
def test_projection(tensor1: torch.Tensor, tensor2: torch.Tensor): | ||
projection = project(tensor1, tensor2) | ||
# Property 1 | ||
# The projection (output tensor) should all be parallel to the first input tensor. | ||
# Two vectors are parallel if and only if their cosine similarity is 1. | ||
assert torch.allclose(torch.nn.functional.cosine_similarity(projection, tensor1), torch.tensor(1.0)) | ||
|
||
# Property 2 | ||
# The second input tensor subtracted by the projection should be orthogonal to the first input tensor. | ||
# Two vectors are orthogonal if and only if their dot product is 0. | ||
difference = tensor2 - projection | ||
assert torch.allclose(difference.dot(tensor1), torch.tensor(0.0)) | ||
``` | ||
|
||
Our test does not require any knowledge of the implementation of `project`, nor does it require any specific examples of input tensors. | ||
Instead, it generates random tensors and tests the function against the properties we defined. `hypothesis` will use the `tensor_strategy` | ||
to attempt to find a counterexample to our properties. If it finds one, it will shrink the input tensors to the smallest possible example | ||
that violates the properties. | ||
|
||
Very quickly, however, `hypothesis` will inform us that our function fails (will raise an division by zero error) for the example `torch.Tensor([[[0,0,0],[0,0,0]]])`. | ||
This should be expected, however, because the projection of any vector onto the zero vector is undefined. In this case, we should add a hypothesis assume to ensure that the first tensor is not the zero tensor. | ||
|
||
```python | ||
import torch | ||
import hypothesis_torch | ||
from hypothesis import given, assume | ||
from hypothesis import strategies as st | ||
|
||
def project(tensor1: torch.Tensor, tensor2: torch.Tensor) -> torch.Tensor: | ||
"""Project tensor2 onto tensor1.""" | ||
return tensor2.dot(tensor1) / tensor1.dot(tensor1) * tensor1 | ||
|
||
@given( | ||
tensor1=hypothesis_torch.tensor_strategy( | ||
dtype=torch.float32, | ||
shape=(1,2,3), | ||
), | ||
tensor2=hypothesis_torch.tensor_strategy( | ||
dtype=torch.float32, | ||
shape=(1,2,3), | ||
), | ||
) | ||
def test_projection(tensor1: torch.Tensor, tensor2: torch.Tensor): | ||
|
||
# Assume that the first tensor is not the zero tensor, because the projection of any vector onto the zero vector is undefined. | ||
# We have to use a small epsilon value because the values are floats and we can still have numerical instability close to 0 vectors. | ||
assume(tensor1.norm().item() > 1e-6) | ||
|
||
projection = project(tensor1, tensor2) | ||
# Property 1 | ||
# The projection (output tensor) should all be parallel to the first input tensor. | ||
# Two vectors are parallel if and only if their cosine similarity is 1. | ||
assert torch.allclose(torch.nn.functional.cosine_similarity(projection, tensor1), torch.tensor(1.0)) | ||
|
||
# Property 2 | ||
# The second input tensor subtracted by the projection should be orthogonal to the first input tensor. | ||
# Two vectors are orthogonal if and only if their dot product is 0. | ||
difference = tensor2 - projection | ||
assert torch.allclose(difference.dot(tensor1), torch.tensor(0.0)) | ||
``` |