A web app that generates brief captions for images. We use a merge model, similar to the Show and Tell architecture, to generate the captions. The model was trained on the Flickr8k dataset using Google Colab.
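For reference, below is a minimal sketch of such a merge model in Keras, in the spirit of the Brownlee tutorial listed in the references. The layer sizes, vocabulary size, and maximum caption length are illustrative assumptions, not necessarily the values used in this repo.

```python
# Minimal merge-model sketch (assumed sizes; the repo's actual model may differ).
from tensorflow.keras.layers import Input, Dense, Embedding, LSTM, Dropout, add
from tensorflow.keras.models import Model

vocab_size = 8763   # assumption: Flickr8k vocabulary size after cleaning
max_length = 34     # assumption: longest caption length in tokens

# Image branch: pre-extracted CNN features (e.g. 2048-d from InceptionV3).
inputs1 = Input(shape=(2048,))
fe1 = Dropout(0.5)(inputs1)
fe2 = Dense(256, activation="relu")(fe1)

# Text branch: the partially generated caption, encoded by an LSTM.
inputs2 = Input(shape=(max_length,))
se1 = Embedding(vocab_size, 256, mask_zero=True)(inputs2)
se2 = Dropout(0.5)(se1)
se3 = LSTM(256)(se2)

# Merge the two branches and predict the next word of the caption.
decoder1 = add([fe2, se3])
decoder2 = Dense(256, activation="relu")(decoder1)
outputs = Dense(vocab_size, activation="softmax")(decoder2)

model = Model(inputs=[inputs1, inputs2], outputs=outputs)
model.compile(loss="categorical_crossentropy", optimizer="adam")
```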
Below is a demo output of the model.
Basic requirements
- Python3
- Docker
Steps to run on your local system:
git clone https://github.com/jaykshirsagar05/captionify.git
cd captionify
docker-compose build
docker-compose up
Visit http://172.19.0.3:8501/ for the Streamlit app.
Visit http://127.0.0.1:8000/docs for the server side (FastAPI).
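For context, the server side can be a small FastAPI app that accepts an uploaded image and returns the generated caption. The sketch below is a hypothetical, simplified version (the endpoint name, helper function, and placeholder caption are assumptions), not the exact code in this repo.

```python
# Hypothetical, simplified FastAPI prediction endpoint (not the repo's exact code).
from fastapi import FastAPI, File, UploadFile

app = FastAPI()

def generate_caption(image_bytes: bytes) -> str:
    # Placeholder: the real app would extract CNN features from the image
    # and decode a caption with the trained merge model.
    return "a dog is running through the grass"

@app.post("/predict")
async def predict(file: UploadFile = File(...)):
    image_bytes = await file.read()
    return {"caption": generate_caption(image_bytes)}
```

The Streamlit front end can then post the uploaded image to this endpoint (e.g. with `requests.post`) and display the returned caption.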
NOTE: You need to update the path of the pre-trained model in the corresponding file before running.
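For example, if the model is loaded with Keras, the change is typically a single path constant; the path and filename below are placeholders, not the repo's actual values.

```python
# Placeholder path; point this at wherever you saved the trained model.
from tensorflow.keras.models import load_model

MODEL_PATH = "models/caption_model.h5"  # hypothetical filename
model = load_model(MODEL_PATH)
```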
This project is open source.
Anyone is welcome to contribute to this project.
- Brownlee, J. (2019, June 27). How to Develop a Deep Learning Photo Caption Generator from Scratch. Retrieved September 26, 2020, from https://machinelearningmastery.com/develop-a-deep-learning-caption-generation-model-in-python/
- Davidefiocco. (n.d.). davidefiocco/streamlit-fastapi-model-serving. Retrieved September 26, 2020, from https://github.com/davidefiocco/streamlit-fastapi-model-serving
- Xu, K., Ba, J. L., Kiros, R., Cho, K., Courville, A., Salakhutdinov, R., … Bengio, Y. (2016, April 19). Show, Attend and Tell: Neural Image Caption Generation with Visual Attention. arXiv. Retrieved September 26, 2020, from https://arxiv.org/pdf/1502.03044.pdf
The project is mostly complete, but the model's predictions are not always accurate. Further improvements are welcome!