Reduce reproduction cost 96%, from $600 to $24, by releasing the instruct dataset only #3

Closed
MarkSchmidty opened this issue Mar 13, 2023 · 2 comments

@MarkSchmidty

The blog post says $500 was spent producing the dataset.
The blog post also says $100 was spent on 3xA100 80GB for 3 hours.
The market rate for 4xA100 is around $8 per hour. (See vast.ai for example)

If the dataset is provided for fine-tuning, then Alpaca could be reproduced for just about $24, and we would not have to wait for Facebook's response regarding sharing of the pre-trained model.
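For anyone checking the arithmetic, here is a minimal sketch using only the figures quoted above. It assumes the fine-tuning run takes the same 3 hours on the rented machine, which is an assumption rather than a measured result, and the constant and function names are made up for illustration.

```python
# Figures quoted in the comment above.
DATASET_COST_USD = 500          # spent producing the dataset, per the blog post
BLOG_TRAINING_COST_USD = 100    # spent on training, per the blog post
MARKET_RATE_USD_PER_HOUR = 8    # approximate rental rate quoted above
TRAINING_HOURS = 3              # assumption: same duration as the blog post's run


def reproduction_costs():
    """Return (full cost, training-only cost, percentage reduction)."""
    full = DATASET_COST_USD + BLOG_TRAINING_COST_USD
    training_only = MARKET_RATE_USD_PER_HOUR * TRAINING_HOURS
    reduction_pct = 100 * (full - training_only) / full
    return full, training_only, reduction_pct


full, training_only, reduction_pct = reproduction_costs()
print(f"${full} -> ${training_only} ({reduction_pct:.0f}% lower)")
```

The 96% figure only holds if the dataset cost drops out entirely, i.e. the released data is used as-is.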

@MarkSchmidty MarkSchmidty changed the title Reduce reproduction cost from $600 to $24 by releasing the instruct dataset only Reduce reproduction cost from 96%, from $600 to $24, by releasing the instruct dataset only Mar 13, 2023
@MarkSchmidty MarkSchmidty changed the title Reduce reproduction cost from 96%, from $600 to $24, by releasing the instruct dataset only Reduce reproduction cost 96%, from $600 to $24, by releasing the instruct dataset only Mar 13, 2023
@MarkSchmidty
Author

Or is alpaca_data.json the dataset?

In which case the reproduction cost is already $24, not $600.

@lxuechen
Collaborator

Hi Mark,

You're right, alpaca_data.json is our released dataset. We're also releasing the recipe for producing the dataset, so other researchers can build on this.

You're correct that, excluding the cost of reproducing the data, the cost of training the model is much lower.
