Tuning of model and dataset retrievers #314

neubig · 2023-09-01T13:02:00Z

Currently our model and dataset retrievers are not perfect, and it would be good to have a way to make them better.

One way to do so is to explicitly train the model/dataset retrievers as follows:

1. Retrieve multiple datasets (models) and run the prompt2model pipeline with each of them.
2. Take the resulting accuracy scores, and train the retriever so that it gives higher scores to datasets (models) that yield higher accuracy for the full pipeline.

This would result in a training objective that explicitly rewards retrieving datasets (models) that give high accuracy.
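As a rough illustration of such an objective, here is a minimal sketch of a listwise ranking loss. It is not the prompt2model implementation: the function names, the temperature, and the example accuracies are all hypothetical, and the retriever scores are updated directly as a stand-in for updating the retriever's parameters. The target distribution is a softmax over the pipeline accuracies, and the loss is the cross-entropy between that target and the softmax over retriever scores.

```python
import math
from typing import List


def softmax(values: List[float], temperature: float = 1.0) -> List[float]:
    """Numerically stable softmax over a list of floats."""
    scaled = [v / temperature for v in values]
    peak = max(scaled)
    exps = [math.exp(v - peak) for v in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def listwise_loss(
    retriever_scores: List[float],
    pipeline_accuracies: List[float],
    temperature: float = 0.1,
) -> float:
    """Cross-entropy between softmax(accuracies / T) and softmax(scores)."""
    target = softmax(pipeline_accuracies, temperature)
    predicted = softmax(retriever_scores)
    return -sum(t * math.log(p) for t, p in zip(target, predicted))


def loss_gradient(
    retriever_scores: List[float],
    pipeline_accuracies: List[float],
    temperature: float = 0.1,
) -> List[float]:
    """Gradient of listwise_loss w.r.t. each score, which is predicted - target."""
    target = softmax(pipeline_accuracies, temperature)
    predicted = softmax(retriever_scores)
    return [p - t for p, t in zip(predicted, target)]


def tune_scores(
    retriever_scores: List[float],
    pipeline_accuracies: List[float],
    lr: float = 0.5,
    steps: int = 200,
) -> List[float]:
    """Plain gradient descent on the scores (hypothetical stand-in for
    updating the parameters of a real retriever)."""
    scores = list(retriever_scores)
    for _ in range(steps):
        grad = loss_gradient(scores, pipeline_accuracies)
        scores = [s - lr * g for s, g in zip(scores, grad)]
    return scores


# Hypothetical example: the retriever initially prefers candidate 0,
# but candidate 1 gives the best accuracy when run through the full pipeline.
initial_scores = [2.0, 1.0, 0.0]
accuracies = [0.55, 0.80, 0.62]  # made-up numbers for illustration only
tuned_scores = tune_scores(initial_scores, accuracies)
ranking = sorted(range(len(tuned_scores)), key=lambda i: -tuned_scores[i])
```

After tuning, `ranking` orders the candidates by their pipeline accuracy rather than by the retriever's initial preference. A pairwise margin loss would be another reasonable choice; the listwise form is used here only because it needs no pair sampling.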

This would also be helpful for #285, as it would reduce the need for human intervention when selecting models.

zhaochenyang20 · 2023-09-12T01:56:46Z

Also, here is something related:

https://github.com/stanfordnlp/dspy

Vijay and I actually thought about using an LLM to automatically select columns and datasets, but doing so just by prompting a raw LLM is somewhat impractical. Now, with DSPy, it seems that we can achieve this.

neubig added the enhancement New feature or request label Sep 1, 2023
