Backend of "La Leaderboard"

  • To evaluate the models in the requests dataset, run python3 -m main_eval_basic_queue
  • To evaluate a custom combination of models and tasks, run python3 -m main_eval_internal_queue, optionally passing the path to a JSON file listing the tasks to run (defaults to "internal_queue/tasks_todo.json"); a sketch of such a file follows this list
  • To evaluate the models in the cluster's queue managed by Slurm, run python3 -m main_eval_slurm_queue
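
The structure of the tasks file is not documented in this README. As a minimal sketch, assuming each entry pairs a model repo ID with the lm-evaluation-harness tasks to run against it (the model and task names below are purely illustrative), such a file could be generated like this:

```python
import json

# Hypothetical schema: the actual format is defined by main_eval_internal_queue.
# Each entry pairs a Hugging Face model repo ID with the harness tasks
# to evaluate it on.
tasks_todo = [
    {"model": "somosnlp/example-model", "tasks": ["belebele_spa_Latn", "xnli_es"]},
    {"model": "another-org/another-model", "tasks": ["paws_es"]},
]

with open("internal_queue/tasks_todo.json", "w") as f:
    json.dump(tasks_todo, f, indent=2)
```

Assuming the path is taken as a positional argument, a custom file would then be run with python3 -m main_eval_internal_queue path/to/custom_tasks.json.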

Notes:

  • Check which version of lm-evaluation-harness is pinned in the requirements file; evaluation results depend on the harness version
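
To confirm that the installed harness matches that pin, a quick runtime check (assuming the package is installed under its PyPI distribution name, lm_eval):

```python
from importlib.metadata import version

# Print the installed lm-evaluation-harness version; compare it against
# the pin in the requirements file before launching an evaluation run.
print(version("lm_eval"))
```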
