Towards Data Science



Show and Tell

  • The article walks through a PyTorch implementation of the deep learning model proposed in the paper "Show and Tell: A Neural Image Caption Generator".
  • Image captioning can be done by combining a CNN, which encodes the image, with an RNN, which generates the caption.
  • The paper proposes GoogLeNet as the CNN and an LSTM as the RNN.
  • In the article's PyTorch code, these two parts are implemented as the InceptionEncoder and LSTMDecoder classes.
  • The ShowAndTell class packages the encoder and decoder together, and can be used for training and inference.
  • The EMBED_DIM and LSTM_HIDDEN_DIM variables are both set to 512.
  • The encoder uses a pretrained GoogLeNet model, so transfer learning is applied.
  • The generate() method both processes the image features and generates the corresponding token sequence.
  • As a post-processing step, the token sequence returned by generate() must be converted back into words.
  • The article summarizes the model's workflow and explains each piece of the necessary code in order.
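
The points above can be sketched in a few lines of PyTorch. This is a minimal illustration under stated assumptions, not the article's actual code. TinyEncoder is a hypothetical small CNN standing in for the pretrained GoogLeNet encoder so that the example runs without downloading weights. The internals of LSTMDecoder and ShowAndTell, the greedy decoding in generate(), the toy VOCAB, and the tokens_to_caption helper are all assumptions made for illustration. Only the two dimension constants (512) and the class and method names come from the summary.

```python
import torch
import torch.nn as nn

EMBED_DIM = 512        # from the summary
LSTM_HIDDEN_DIM = 512  # from the summary
# Hypothetical toy vocabulary; a real one is built from the caption dataset.
VOCAB = ["<pad>", "<start>", "<end>", "a", "dog", "runs"]


class TinyEncoder(nn.Module):
    """Assumed stand-in for the encoder: a small CNN, not pretrained GoogLeNet."""

    def __init__(self, embed_dim):
        super().__init__()
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 8, kernel_size=3, stride=2, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.fc = nn.Linear(8, embed_dim)  # project image features to embed_dim

    def forward(self, images):
        return self.fc(self.cnn(images))  # shape: (batch, embed_dim)


class LSTMDecoder(nn.Module):
    """Assumed decoder internals: embedding, LSTM, and a vocabulary projection."""

    def __init__(self, vocab_size, embed_dim, hidden_dim):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        self.fc = nn.Linear(hidden_dim, vocab_size)

    def forward(self, features, captions):
        # The image feature is fed as the first LSTM input, then the caption tokens.
        inputs = torch.cat([features.unsqueeze(1), self.embed(captions)], dim=1)
        hidden, _ = self.lstm(inputs)
        return self.fc(hidden)  # shape: (batch, caption_len + 1, vocab_size)


class ShowAndTell(nn.Module):
    """Packages encoder and decoder for training (forward) and inference (generate)."""

    def __init__(self, vocab_size, embed_dim=EMBED_DIM, hidden_dim=LSTM_HIDDEN_DIM):
        super().__init__()
        self.encoder = TinyEncoder(embed_dim)
        self.decoder = LSTMDecoder(vocab_size, embed_dim, hidden_dim)

    def forward(self, images, captions):
        return self.decoder(self.encoder(images), captions)

    @torch.no_grad()
    def generate(self, image, start_id, end_id, max_len=20):
        # Greedy decoding for a single image of shape (1, 3, H, W).
        features = self.encoder(image)
        _, state = self.decoder.lstm(features.unsqueeze(1))  # image goes in first
        token = torch.tensor([[start_id]], device=image.device)
        token_ids = []
        for _ in range(max_len):
            out, state = self.decoder.lstm(self.decoder.embed(token), state)
            token = self.decoder.fc(out[:, -1]).argmax(dim=-1, keepdim=True)
            if token.item() == end_id:
                break
            token_ids.append(token.item())
        return token_ids


def tokens_to_caption(token_ids, vocab):
    # Post-processing: map each token id back to its word and join them.
    return " ".join(vocab[i] for i in token_ids)


model = ShowAndTell(vocab_size=len(VOCAB))
logits = model(torch.randn(2, 3, 64, 64), torch.randint(0, len(VOCAB), (2, 5)))
token_ids = model.generate(torch.randn(1, 3, 64, 64), start_id=1, end_id=2, max_len=5)
caption = tokens_to_caption(token_ids, VOCAB)
```

Because the model here is untrained, the generated caption is arbitrary. The sketch only shows how the pieces fit together: the encoder output has the same size as the word embeddings so it can be fed to the LSTM as its first input, and generate() returns token ids that still need the lookup step to become readable text.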
