Paloma release #19

Merged
merged 77 commits into from
Dec 13, 2023
Commits
905de7a
Merge remote-tracking branch 'origin/small-fixes' into perplexity-sui…
IanMagnusson Oct 27, 2023
4b867a0
Merge remote-tracking branch 'origin/token-ppls' into perplexity-suit…
IanMagnusson Oct 27, 2023
9e4c9fb
Merge remote-tracking branch 'origin/other-metrics-per-subdomain' int…
IanMagnusson Oct 27, 2023
382efdc
Merge remote-tracking branch 'origin/main' into perplexity-suite-paper
IanMagnusson Oct 30, 2023
6a6af5b
pythia 7b runs
IanMagnusson Oct 30, 2023
9a143fb
domla 1b runs
IanMagnusson Oct 30, 2023
9173088
fix the hf_olmo image
IanMagnusson Oct 31, 2023
7578d2c
add aws secrets
IanMagnusson Oct 31, 2023
0ed28b4
handle local tokenizer problem in hf olmo
IanMagnusson Oct 31, 2023
e8aed05
add feature for saving to file
IanMagnusson Oct 31, 2023
080e219
handle s3 auth with env vars
IanMagnusson Oct 31, 2023
c5d4a44
still fixing s3
IanMagnusson Oct 31, 2023
81cbe9d
split up sheets into different files
IanMagnusson Oct 31, 2023
98b1203
make json lines instead of one big json
IanMagnusson Oct 31, 2023
2f6b7de
passing arg by right name
IanMagnusson Oct 31, 2023
2d1fe30
save dolma 1b to file
IanMagnusson Oct 31, 2023
b625b85
pythia 1b
IanMagnusson Oct 31, 2023
6367feb
pythia 7b
IanMagnusson Oct 31, 2023
4d99b8b
initial results exploration
IanMagnusson Nov 1, 2023
8382500
dolma 7b
IanMagnusson Nov 1, 2023
c3ceb44
Merge branch 'perplexity-suite-paper' of github.com:allenai/ai2-llm-e…
IanMagnusson Nov 1, 2023
e33d37c
rp without save file yet
IanMagnusson Nov 1, 2023
285bc8c
Try with manually uploaded fixed files
IanMagnusson Nov 1, 2023
af67310
first line chart
IanMagnusson Nov 1, 2023
2345106
clean up
IanMagnusson Nov 1, 2023
2a2a242
now with win rate
IanMagnusson Nov 2, 2023
b6009d2
New ppl and win rate viz
IanMagnusson Nov 2, 2023
b5889e8
exclude fringe datasets
IanMagnusson Nov 2, 2023
d244801
subdomain bar charts
IanMagnusson Nov 2, 2023
8ae7255
add support for olmo models in s3
IanMagnusson Nov 2, 2023
aa4e49a
just add olmo to the path instead
IanMagnusson Nov 2, 2023
a2f36dc
fix figues labels
IanMagnusson Nov 2, 2023
7f967ae
subdomain line charts
IanMagnusson Nov 3, 2023
5f192ab
added new subdomains by tasks figures
IanMagnusson Nov 3, 2023
0b7b28b
Inital results over all models
IanMagnusson Nov 4, 2023
edee19e
clean up aggregation over subdomains tables
IanMagnusson Nov 6, 2023
25ec471
Add curves by macro subdomains
IanMagnusson Nov 7, 2023
9989734
Also add median over subdomains
IanMagnusson Nov 7, 2023
193fac9
subdomains by order of performance
IanMagnusson Nov 7, 2023
d8773b8
dolma7b 1T
IanMagnusson Nov 7, 2023
9e5f420
Merge branch 'perplexity-suite-paper' of github.com:allenai/ai2-llm-e…
IanMagnusson Nov 7, 2023
0c7bf7e
RP save results
IanMagnusson Nov 7, 2023
7af8bb7
domains by rank by task
IanMagnusson Nov 9, 2023
b9d04a5
fringe curves
IanMagnusson Nov 10, 2023
fc8d17a
add pmi filtered metrics
IanMagnusson Nov 10, 2023
b9d617a
track split on token counts
IanMagnusson Nov 11, 2023
efed63a
Merge branch 'perplexity-suite-paper' of github.com:allenai/ai2-llm-e…
IanMagnusson Nov 11, 2023
9b7a85b
fix pmi ppl
IanMagnusson Nov 11, 2023
ef62450
reweighting inside ppl
IanMagnusson Nov 11, 2023
8a83142
exclude ice and stack from "all" metrics
IanMagnusson Nov 12, 2023
1afd076
remove references to "subdomain" from figures
IanMagnusson Nov 12, 2023
e5a3f27
domain improvement inequality
IanMagnusson Nov 12, 2023
044d676
most and least improved domains
IanMagnusson Nov 12, 2023
634514d
ppl reduction over model size
IanMagnusson Nov 20, 2023
c3a0594
pmi ppl on twitter aae
IanMagnusson Nov 20, 2023
1c42981
add pythia 160m and standardize model size names
IanMagnusson Nov 21, 2023
ae550a9
pile lumi
IanMagnusson Nov 25, 2023
6a8b81c
Merge branch 'perplexity-suite-paper' of github.com:allenai/ai2-llm-e…
IanMagnusson Nov 25, 2023
2fb71b7
save code!
IanMagnusson Nov 28, 2023
b597465
Merge branch 'perplexity-suite-paper' of github.com:allenai/ai2-llm-e…
IanMagnusson Nov 28, 2023
7ebe22f
fix tokens seen on baselines
IanMagnusson Nov 30, 2023
8538d66
only change unsharded dirs
IanMagnusson Dec 5, 2023
a115353
Merge branch 'perplexity-suite-paper' of github.com:allenai/ai2-llm-e…
IanMagnusson Dec 5, 2023
d61a5db
doma 1b test eos fix
IanMagnusson Dec 5, 2023
c25ce9f
more tokens on the 7b
IanMagnusson Dec 5, 2023
a6dbc07
count non-embedding params
IanMagnusson Dec 5, 2023
f9ff98b
A script to get checkpoint that were made on lumi
IanMagnusson Dec 5, 2023
92d4828
roll back paper specifics
IanMagnusson Dec 5, 2023
3d2b22a
more paper specific rollback
IanMagnusson Dec 5, 2023
365edfc
Merge branch 'main' of github.com:allenai/ai2-llm-eval into paloma-re…
IanMagnusson Dec 5, 2023
b58c78a
minimal PPL inference
IanMagnusson Dec 5, 2023
c494108
style stuff
IanMagnusson Dec 5, 2023
c0002e5
changeloooooog
IanMagnusson Dec 5, 2023
c65dec9
remove local path
IanMagnusson Dec 5, 2023
9ed51d1
centralize documentation
IanMagnusson Dec 12, 2023
a1cbad4
Update README.md
AkshitaB Dec 13, 2023
62600b4
Update README.md
AkshitaB Dec 13, 2023
add support for olmo models in s3
IanMagnusson committed Nov 2, 2023

commit 8ae7255f479276d4ea737c3db7f92457c6b8c25c
2 changes: 1 addition & 1 deletion configs/dolma_7b_ppl_suite.jsonnet
@@ -48,5 +48,5 @@ local task_sets = [


{
-  steps: utils.create_fine_grained_pipeline(models, task_sets, gsheet)
+  steps: utils.create_fine_grained_pipeline(models, task_sets, gsheet, output_dir)
}
2 changes: 1 addition & 1 deletion configs/utils.libsonnet
@@ -41,7 +41,7 @@ local contains = function(main_string, sub_string)
std.length(std.findSubstr(sub_string, main_string)) > 0;

local is_olmo_model = function(model_config)
-  contains(std.get(model_config, "model_path"), "olmo");
+  contains(std.get(model_config, "model_path"), "olmo") || contains(std.get(model_config, "model_path"), "s3://ai2-llm/checkpoints");

// Model steps
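
For reviewers who want to sanity-check the widened `is_olmo_model` predicate outside Jsonnet, here is a minimal Python sketch of the same logic. It is an illustration, not code from this PR. The example paths and the constant name `OLMO_S3_MARKER` are hypothetical, and defaulting a missing `model_path` to an empty string is an assumption that may not match how the Jsonnet version behaves.

```python
# Python mirror of the Jsonnet predicate in configs/utils.libsonnet (illustrative only).
OLMO_S3_MARKER = "s3://ai2-llm/checkpoints"  # substring added by this commit


def contains(main_string: str, sub_string: str) -> bool:
    """True if sub_string occurs anywhere in main_string (case-sensitive)."""
    return sub_string in main_string


def is_olmo_model(model_config: dict) -> bool:
    """Treat a model as OLMo if its path mentions "olmo" or the S3 checkpoint location."""
    # Assumption: a missing model_path is treated as an empty string here.
    model_path = model_config.get("model_path", "")
    return contains(model_path, "olmo") or contains(model_path, OLMO_S3_MARKER)


if __name__ == "__main__":
    # Hypothetical paths, chosen only to exercise each branch.
    for path in (
        "some-org/olmo-example",
        "s3://ai2-llm/checkpoints/example-run/step1000",
        "EleutherAI/pythia-1b",
    ):
        print(path, is_olmo_model({"model_path": path}))
```

Note that the check is a substring match, not a prefix match, so any path containing the S3 string anywhere is classified as an OLMo model, and a path spelled "OLMo" in mixed case is not matched by the first clause.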