simplify paloma readme
AkshitaB committed Jan 31, 2024
1 parent 27ff07b commit 1d887fd
Showing 2 changed files with 27 additions and 17 deletions.
29 changes: 27 additions & 2 deletions paloma/README.md
@@ -16,7 +16,19 @@ So far the models evaluated by the benchmark are the 6 baseline 1B parameter mod
## Setup
Start by following the installation instructions for this repo in this [readme](../README.md).

-Then follow the instructions in this [readme](eval_data/README.md) to obtain and set up the evaluation data.
Then, download the PALOMA dataset from the HF hub:

```commandline
# Authenticate with your HuggingFace account (the dataset may require accepting its terms on the hub)
huggingface-cli login
# Make sure git-lfs is set up, then clone the dataset repository
git lfs install
git clone https://huggingface.co/datasets/allenai/paloma
```
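
If you prefer not to clone with git-lfs, recent versions of `huggingface_hub` also provide a direct download command (a sketch; check that your installed version supports these flags):

```commandline
huggingface-cli download allenai/paloma --repo-type dataset --local-dir paloma
```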

Finally, export the path to this data so that the pipeline can find it:

```commandline
export EVAL_DATA_PATH=/path/to/paloma
```

## Running evaluation
After following the setup instructions above, create an evaluation configuration based on our template [here](../configs/example_paloma_config.jsonnet). The template is designed to work with any model hosted on the HuggingFace hub: just specify the name of the model on the hub and any revisions (i.e., checkpoints) you want results for. Read the comments marked with the ❗ symbol in the configuration for details you may need to fill in. Finally, set `output_dir` to the directory where you want the job to write your results.
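
Once the configuration is filled in, launch the job as described in the top-level readme. As a rough sketch, assuming the pipeline is run with ai2-tango and that your edited copy of the template is saved as `configs/my_paloma_config.jsonnet` (a hypothetical name; the workspace path is likewise a placeholder):

```commandline
tango run configs/my_paloma_config.jsonnet --workspace /path/to/workspace
```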
@@ -41,4 +53,17 @@ Our approach for fixing the training data order requires the use of the same tra
We ask that submissions that do not investigate changes in vocabulary opt in to our standardized vocabulary to enable the greatest level of comparability. That vocabulary is available from the tokenizer hosted on HuggingFace hub as `allenai/gpt-neox-olmo-dolma-v1_5`.
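
To sanity-check that you are using this standardized vocabulary, you can load the tokenizer from the hub (a minimal sketch, assuming `transformers` is installed in your environment):

```commandline
python -c "from transformers import AutoTokenizer; tok = AutoTokenizer.from_pretrained('allenai/gpt-neox-olmo-dolma-v1_5'); print(len(tok))"
```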

## Making a submission
-At present we are building out an automatic submission process that will soon be available. Until then please reach out to us by emailing the first author of Paloma, if you would like to submit results to the benchmark.
At present we are building out an automatic submission process that will soon be available. Until then, please reach out to us by emailing `[email protected]` if you would like to submit results to the benchmark.

## Citation

```bibtex
@article{Magnusson2023PalomaAB,
title={Paloma: A Benchmark for Evaluating Language Model Fit},
author={Ian Magnusson and Akshita Bhagia and Valentin Hofmann and Luca Soldaini and A. Jha and Oyvind Tafjord and Dustin Schwenk and Pete Walsh and Yanai Elazar and Kyle Lo and Dirk Groeneveld and Iz Beltagy and Hanna Hajishirzi and Noah A. Smith and Kyle Richardson and Jesse Dodge},
journal={ArXiv},
year={2023},
volume={abs/2312.10523},
url={https://api.semanticscholar.org/CorpusID:266348815}
}
```
15 changes: 0 additions & 15 deletions paloma/eval_data/README.md

This file was deleted.
