Lab Assignment 4
Submitted by:
Pavankumar Manchala (16), Team 5
Objectives:
The main objective of this lab is to implement a bottom-up attention model to generate captions for an image.
Technologies:
PyCharm – IDE used to run the Python files
Packages and resources used:
pandas, nltk, numpy, TensorFlow, and PIL, together with the MSCOCO images and captions dataset, the pretrained Inception_v4 model, and the BLEU score metric.
Introduction:
Caption generation is a challenging artificial intelligence problem that combines computer vision with natural language processing. It requires both image understanding and a language model: a good caption must capture the objects contained in an image, and it must also express how those objects relate to one another, their attributes, and the activities they are involved in.
Results:
We use the MSCOCO dataset to generate the image features file.
We use the pretrained Inception_v4 model, an image recognition network that achieves almost 70% accuracy.
Using the Inception script we generate a "features.npy" file that contains the extracted features for all images; these features are then loaded in the Python file that builds the attention model.
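For reference, below is a minimal sketch of this feature-extraction step. It assumes the MSCOCO images live in a local mscoco/train2014 directory and uses tf.keras's pretrained InceptionV3 as a stand-in, since Inception_v4 is not bundled with tf.keras; the actual lab script may differ.

```python
# Hedged sketch: extract one feature vector per image with a pretrained CNN
# and save the results to features.npy. The image directory is hypothetical,
# and InceptionV3 stands in for the Inception_v4 model used in the lab.
import os
import numpy as np
import tensorflow as tf
from PIL import Image

IMAGE_DIR = "mscoco/train2014"  # hypothetical location of the MSCOCO images

# Pretrained backbone without the classification head; global average pooling
# turns the final feature map into a single vector per image.
backbone = tf.keras.applications.InceptionV3(include_top=False,
                                             weights="imagenet",
                                             pooling="avg")

def image_to_vector(path):
    """Load one image, resize it to the network's input size, and extract features."""
    img = Image.open(path).convert("RGB").resize((299, 299))
    x = np.asarray(img, dtype=np.float32)[None, ...]          # shape (1, 299, 299, 3)
    x = tf.keras.applications.inception_v3.preprocess_input(x)
    return backbone.predict(x, verbose=0)[0]                  # shape (2048,)

features = {}
for fname in sorted(os.listdir(IMAGE_DIR)):
    if fname.lower().endswith(".jpg"):
        features[fname] = image_to_vector(os.path.join(IMAGE_DIR, fname))

# Store the filename -> feature-vector mapping for the captioning model.
np.save("features.npy", features)
```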
Generation of Vocabulary file:
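A minimal sketch of how such a vocabulary file can be built from the MSCOCO caption annotations is shown below; the annotation path, the minimum word count, and the output file name are assumptions, not the lab's exact settings.

```python
# Hedged sketch of building a vocabulary file from the MSCOCO caption annotations.
import json
from collections import Counter
import nltk

ANNOTATION_FILE = "mscoco/annotations/captions_train2014.json"  # hypothetical path
MIN_COUNT = 5                                                    # drop rare words

nltk.download("punkt", quiet=True)

with open(ANNOTATION_FILE) as f:
    annotations = json.load(f)["annotations"]

# Count word frequencies over all training captions.
counter = Counter()
for ann in annotations:
    counter.update(nltk.word_tokenize(ann["caption"].lower()))

# Keep frequent words plus the special tokens the caption decoder needs.
vocab = ["<pad>", "<start>", "<end>", "<unk>"] + \
        [w for w, c in counter.most_common() if c >= MIN_COUNT]

with open("vocabulary.txt", "w") as f:
    f.write("\n".join(vocab))

print("Vocabulary size:", len(vocab))
```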
Training the model and generating captions:
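The core of the captioning model is a soft attention layer that re-weights the image features at every decoding step. Below is a minimal Bahdanau-style sketch of such a layer; the layer sizes are illustrative, not the lab's exact configuration.

```python
# Hedged sketch of the soft (Bahdanau-style) attention layer used inside the
# caption decoder. Dimensions and names are assumptions for illustration.
import tensorflow as tf

class BahdanauAttention(tf.keras.layers.Layer):
    def __init__(self, units):
        super().__init__()
        self.W1 = tf.keras.layers.Dense(units)   # projects image features
        self.W2 = tf.keras.layers.Dense(units)   # projects the decoder hidden state
        self.V = tf.keras.layers.Dense(1)        # scores each image region

    def call(self, features, hidden):
        # features: (batch, regions, feature_dim); hidden: (batch, hidden_dim)
        hidden_with_time = tf.expand_dims(hidden, 1)
        scores = self.V(tf.nn.tanh(self.W1(features) + self.W2(hidden_with_time)))
        weights = tf.nn.softmax(scores, axis=1)               # attention over regions
        context = tf.reduce_sum(weights * features, axis=1)   # weighted feature vector
        return context, weights
```

During training the decoder is typically fed the ground-truth previous word (teacher forcing) and optimized with cross-entropy loss; at caption-generation time the attention context is recomputed for each newly predicted word until the end token is produced.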
Images with BLEU score results and generated captions:
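The BLEU scores compare each generated caption against the MSCOCO reference captions. A minimal sketch using NLTK's sentence-level BLEU is shown below; the captions in it are placeholders, not results from the lab.

```python
# Hedged sketch of scoring one generated caption against its reference
# captions with NLTK's sentence-level BLEU. Captions are placeholders.
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

references = [
    "a man riding a wave on top of a surfboard".split(),
    "a surfer rides a large wave in the ocean".split(),
]
candidate = "a man is surfing on a wave".split()

# Smoothing avoids zero scores when higher-order n-grams have no overlap.
smooth = SmoothingFunction().method1
score = sentence_bleu(references, candidate, smoothing_function=smooth)
print("BLEU score: %.4f" % score)
```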
References:
Show and Tell: A Neural Image Caption Generator – Oriol Vinyals, Alexander Toshev, Samy Bengio, Dumitru Erhan.