Add Parameters to Benchmarks #103

Merged (5 commits, Mar 10, 2024)
10 changes: 5 additions & 5 deletions README.md
@@ -49,11 +49,11 @@ record = runner.run("__main__", params={"a": 2, "b": 10})
rep = nnbench.BenchmarkReporter()
rep.display(record) # ...and print the results to the terminal.

-# results in a table like the following:
-# name     function    date                   value    time_ns
-# -------  ----------  -------------------  -------  ---------
-# product  product     2024-03-07T10:14:21       20       1000
-# power    power       2024-03-07T10:14:21     1024        750
+# results in a table that looks like the following:
+# name     function    date                 parameters           value    time_ns
+# -------  ----------  -------------------  -----------------  -------  ---------
+# product  product     2024-03-08T18:03:48  {'a': 2, 'b': 10}       20       1000
+# power    power       2024-03-08T18:03:48  {'a': 2, 'b': 10}     1024        750
```

For a more realistic example of how to evaluate a trained model with a benchmark suite, check the [Quickstart](https://aai-institute.github.io/nnbench/latest/quickstart/).
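For reference, a minimal sketch of a benchmark file that would produce a table like the one above. It assumes the `@nnbench.benchmark` decorator for registering plain functions; the `product` and `power` bodies are illustrative and not taken from this PR.

```python
# benchmarks_example.py -- illustrative sketch, not part of this PR.
# Assumes nnbench exposes an @nnbench.benchmark decorator for plain functions.
import nnbench


@nnbench.benchmark
def product(a: int, b: int) -> int:
    return a * b


@nnbench.benchmark
def power(a: int, b: int) -> int:
    return a**b


if __name__ == "__main__":
    runner = nnbench.BenchmarkRunner()
    # With this PR, the resulting record gains a "parameters" column,
    # here {'a': 2, 'b': 10} for both benchmarks.
    record = runner.run("__main__", params={"a": 2, "b": 10})
    rep = nnbench.BenchmarkReporter()
    rep.display(record)
```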
2 changes: 2 additions & 0 deletions src/nnbench/runner.py
@@ -255,6 +255,7 @@ def run(
        results: list[dict[str, Any]] = []
        for benchmark in self.benchmarks:
            bmparams = {k: v for k, v in dparams.items() if k in benchmark.interface.names}
            bmdefaults = {k: v for (k, _, v) in benchmark.interface.variables}
            # TODO: Wrap this into an execution context
            res: dict[str, Any] = {
                "name": benchmark.name,
@@ -263,6 +264,7 @@
"date": datetime.now().isoformat(timespec="seconds"),
"error_occurred": False,
"error_message": "",
"parameters": {**bmdefaults, **bmparams},
Collaborator:
Does this ensure that the defaults are overwritten by the bmparams?

I think something like bmdefaults.update(bmparams) would be easier to read.

Collaborator (Author):
Yes, the entries are unpacked in order, with the latter overwriting the former in case of colliding keys.

dict.update does not return the merged dict, but just updates in place.

We could do dict(bmdefaults, **bmparams), though I don't think that is easier to read.
Alternatively, do an update on bmdefaults and then pass only that. That is an extra line, though, and we are left with a bmdefaults that no longer contains just the defaults.

Collaborator (@nicholasjng, Mar 8, 2024):

I think this does what it needs to do: For any variable, if the default is used, it's not in bmparams, so the update does nothing - and for every overridden default, the update() overwrites with the argument given in the params dict.

So the order should be "use defaults, then update with params", which is what I suggested (you can also write bmdefaults | bmparams).

Collaborator:

If you need a mental model, it helps to think about what happens once the params are passed into the function - its default keyword arguments are already there, and everything else gets overridden by the incoming parameters. So it is indeed bmdefaults.update(bmparams) we're looking for here (you can even use that as the dict value, since we do not explicitly use it again afterwards).

            }
            try:
                benchmark.setUp(**bmparams)
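As a plain-Python illustration of the merge semantics discussed in the review thread above (the example values are made up):

```python
# Dict merge semantics, independent of nnbench: later entries win on colliding keys.
bmdefaults = {"a": 1, "b": 10}  # defaults read off the benchmark interface
bmparams = {"a": 2}             # parameters passed explicitly to run()

merged_unpack = {**bmdefaults, **bmparams}  # what the diff above does
merged_union = bmdefaults | bmparams        # Python 3.9+ dict union, same result

copied = dict(bmdefaults)
copied.update(bmparams)  # update() mutates in place and returns None

assert merged_unpack == merged_union == copied == {"a": 2, "b": 10}
```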
6 changes: 6 additions & 0 deletions tests/benchmarks/parametrized.py
@@ -0,0 +1,6 @@
import nnbench


@nnbench.parametrize([{"a": 1}, {"a": 2}], tags=("parametrized",))
def double(a: int) -> int:
    return 2 * a
18 changes: 18 additions & 0 deletions tests/test_runner.py
@@ -71,3 +71,21 @@ def duplicate_context_provider() -> dict[str, str]:
        params={"x": 1, "y": 1},
        context=context_providers,
    )


def test_filter_benchmarks_on_params(testfolder: str) -> None:
    r = nnbench.BenchmarkRunner()
    results = r.run(testfolder, tags=("parametrized",))
    print(results)
    assert len(results.benchmarks) == 2
    assert (
        len(
            list(
                filter(
                    lambda bm: bm["parameters"]["a"] == 1,
                    results.benchmarks,
                )
            )
        )
        == 1
    )