Caption_Generation

Deep Learning Photo Caption Generator

Final project for the Computer Vision class at Illinois Institute of Technology, Fall semester 2018.

Developed with Benjamin Scialom.

The whole project and its code are based on the following tutorial: https://machinelearningmastery.com/develop-a-deep-learning-caption-generation-model-in-python/

We implemented our own feature-extraction model, based on the VGG16 architecture, so that we could compare its final scores with those of the tutorial's model. The NLP part is exactly the same as in the tutorial.
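To make the "VGG16 architecture" reference concrete, here is a minimal sketch that traces tensor shapes through the standard VGG16 convolutional stack in plain Python. It describes the published VGG16 layout only; our own model in /src may differ, and the names `VGG16_CONFIG` and `trace_vgg16_shapes` are hypothetical helpers introduced for this illustration.

```python
# Standard VGG16 convolutional configuration (an assumption about the
# reference network, not a copy of the model in /src).
# Integers are the output channels of a 3x3 "same"-padded convolution;
# "M" is a 2x2 max-pool with stride 2.
VGG16_CONFIG = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
                512, 512, 512, "M", 512, 512, 512, "M"]


def trace_vgg16_shapes(height=224, width=224, channels=3):
    """Return the (height, width, channels) shape after each conv/pool layer."""
    shapes = []
    for layer in VGG16_CONFIG:
        if layer == "M":
            # Max-pooling halves the spatial resolution.
            height, width = height // 2, width // 2
        else:
            # A "same"-padded convolution keeps the resolution and
            # only changes the number of channels.
            channels = layer
        shapes.append((height, width, channels))
    return shapes


shapes = trace_vgg16_shapes()
final_h, final_w, final_c = shapes[-1]
# Size of the vector obtained by flattening the last feature map.
flattened = final_h * final_w * final_c
print(shapes[-1], flattened)
```

In the standard network, this flattened feature map feeds two 4096-unit fully connected layers followed by a classifier. A feature extractor for captioning would typically drop the classifier and keep one of the earlier outputs as the image representation.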

We use Google Colab to accelerate data preparation and training.

Our files are organized as follows:

If you wish to run the project, please copy the "/data" folder into your drive and mount it locally.
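The mounting step can be sketched as follows, assuming the notebook runs in Google Colab and that the folder was copied to the top level of your Google Drive. The helper name `mount_and_locate_data`, the `data` folder name, and the `/content/drive/MyDrive` path are assumptions for illustration; adjust them to your own setup. Outside Colab, the function falls back to a local relative path.

```python
from pathlib import Path


def mount_and_locate_data(folder_name="data"):
    """Mount Google Drive when running in Colab and return the data folder path.

    Outside Colab (where google.colab is not importable), return a local
    relative path instead so the same code can be used on a workstation.
    """
    try:
        # This module is only available inside a Colab runtime.
        from google.colab import drive
    except ImportError:
        return Path(folder_name)

    # Prompts for authorization, then exposes Drive under /content/drive.
    drive.mount("/content/drive")
    # Assumed location: the folder sits at the top level of "My Drive".
    return Path("/content/drive/MyDrive") / folder_name


data_dir = mount_and_locate_data()
print(data_dir)
```

The returned path can then be used as the root for loading the dataset and the saved model weights.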

The /data folder also contains all the model weights.

The /doc folder contains both of our reports. For more details, please refer to the original tutorial or to these reports.

The /src folder contains all the source code.

The main model architecture is:

Our final results are:

We used the Flickr8K dataset, so we are required to cite it here: M. Hodosh, P. Young and J. Hockenmaier (2013) "Framing Image Description as a Ranking Task: Data, Models and Evaluation Metrics", Journal of Artificial Intelligence Research, Volume 47, pages 853-899. http://www.jair.org/papers/paper3994.html
