This is a template you can use to develop tasks for the METR Task Standard.
Note: If you store your task code on GitHub, please set the repository to "private" so it does not end up in training data for future AI models.
- Implement your task in `my_task/my_task.py` (rename it with the name of your task)
- Write tests in `my_task/my_task_test.py` (rename it with the name of your task)
- Use the workbench to run your task and tests
- Have someone do a QA run and document it in `my_task/meta/qa`
- Finish documenting your task in `my_task/meta/summary.md`, `my_task/meta/detail.md`, and `my_task/meta/eval_info.json`
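
As a rough starting point for the task file, the sketch below shows what a minimal task family might look like. It assumes the `TaskFamily` interface described in the METR Task Standard (static methods `get_tasks`, `get_instructions`, and `score`). The `Task` fields, the `addition` task, and the `standard_version` value are illustrative placeholders, so check the Task Standard itself for the exact version string and the full set of supported methods.

```python
from typing import Optional, TypedDict


class Task(TypedDict):
    # Hypothetical per-task data; the fields are up to the task author.
    question: str
    answer: str


class TaskFamily:
    # Placeholder: set this to the Task Standard version you are targeting.
    standard_version = "0.3.0"

    @staticmethod
    def get_tasks() -> dict[str, Task]:
        # Map each task name to the data needed to set up and score it.
        return {"addition": {"question": "What is 2 + 2?", "answer": "4"}}

    @staticmethod
    def get_instructions(t: Task) -> str:
        # Text shown to the agent at the start of the run.
        return f"Answer the following question: {t['question']}"

    @staticmethod
    def score(t: Task, submission: str) -> Optional[float]:
        # 1.0 for an exact match after trimming whitespace, otherwise 0.0.
        return 1.0 if submission.strip() == t["answer"] else 0.0
```

A test in the accompanying test file could then call `TaskFamily.score` directly with a known-correct and a known-incorrect submission and check the returned values.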
If you run into technical issues or have questions about task development, you can email us at [email protected]