An example recipe search engine built from a document-to-embedding model (a sentence transformer), an approximate nearest neighbors (ANN) index, and a query/document re-ranker.
It uses SBERT (Sentence-BERT) for the language models and Spotify's Annoy package for ANN indexing.
It comes with a simple Flask API and a frontend for searching recipes (styled with Bulma CSS).
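To make the retrieve-then-rerank flow concrete, here is a minimal, dependency-free sketch. It is not the code in this repository: the bag-of-words `embed` function stands in for the SBERT sentence transformer, the brute-force cosine search stands in for the Annoy index, and the token-overlap `rerank` stands in for the query/document re-ranker. The recipe strings and all function names are hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical toy corpus; the real project embeds rows of RAW_recipes.csv.
RECIPES = [
    "spicy chicken curry with rice",
    "chocolate chip cookies",
    "grilled chicken salad with lemon",
    "vegetable curry with coconut milk",
]


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def embed(text):
    """Stand-in for a sentence transformer: a sparse bag-of-words vector."""
    return Counter(tokenize(text))


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(query, doc_vectors, k):
    """Stage 1: nearest neighbours by cosine similarity.

    This is an exact brute-force scan; an ANN index approximates the same
    result much faster on large collections.
    """
    q = embed(query)
    ranked = sorted(
        range(len(doc_vectors)),
        key=lambda i: cosine(q, doc_vectors[i]),
        reverse=True,
    )
    return ranked[:k]


def rerank(query, candidates, docs):
    """Stage 2: re-score each (query, document) pair, here by Jaccard overlap."""
    q = set(tokenize(query))

    def score(i):
        d = set(tokenize(docs[i]))
        return len(q & d) / len(q | d) if q | d else 0.0

    return sorted(candidates, key=score, reverse=True)


def search(query, docs=RECIPES, k=3):
    """Embed the documents, fetch k candidates, then re-rank them."""
    doc_vectors = [embed(d) for d in docs]
    candidates = retrieve(query, doc_vectors, k)
    return [docs[i] for i in rerank(query, candidates, docs)]
```

The point of the two stages is cost: the first stage is cheap and narrows the collection to a handful of candidates, so the more expensive pairwise scoring only has to run on those.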
The repository uses Docker and docker-compose, so it should be easy to run.
First download the RAW_recipes.csv dataset from Kaggle:
Place the RAW_recipes.csv file into the "data" folder, creating the folder if it doesn't exist.
Then build the Python Docker image:
docker-compose build
Run the following script, which downloads the sentence models from SBERT, embeds the recipe documents, and builds the ANN index:
docker-compose run semantic_search python
Finally, run the API with:
docker-compose up
Then go to http://localhost:8080/ to open the search page.
Note that, to keep startup quick, the initial dataset is limited to just 1000 rows. You can change the NROWS variable in semantic_search/ to increase that number. The larger the number, the longer the ANN index takes to build.
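As an illustration of what a row limit like NROWS does, here is a minimal sketch using only the standard library. It is an assumption about the general idea, not the repository's actual loading code: `load_recipes` is a hypothetical helper, and the in-memory CSV with its column names is a stand-in for data/RAW_recipes.csv.

```python
import csv
import io
from itertools import islice

NROWS = 1000  # mirrors the limit described above; the real variable lives in the project


def load_recipes(csv_file, nrows=NROWS):
    """Read at most `nrows` data rows (header excluded) from an open CSV file."""
    reader = csv.DictReader(csv_file)
    return list(islice(reader, nrows))


# In-memory stand-in for the dataset; the column names are illustrative only.
sample = io.StringIO("name,minutes\npancakes,20\nomelette,10\ntoast,5\n")
rows = load_recipes(sample, nrows=2)
```

Only the rows that are loaded get embedded and indexed, which is why raising the limit lengthens the build step.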
Shout out to this example:
And this tutorial and explanation:
For the ML sentence transformer models: SBERT
And Hugging Face
Spotify, for the ANN index
And Bulma for the awesome CSS package: