Sequence to Sequence Learning with Neural Networks

Summary

  • Provides a general end-to-end approach to sequence learning.
  • Introduces a new class of "Encoder-Decoder" architectures for seq2seq learning.
  • A straightforward LSTM architecture can be used to solve general sequence-to-sequence problems.
    • The idea is to use one LSTM to read the input sequence and produce a fixed-dimensional "encoding" of it.
    • A second LSTM then generates the output sequence from this encoding; this decoder is essentially an RNN language model conditioned on the input sequence (see the sketch after this list).
    • LSTMs are used because they can handle long-range temporal dependencies and can map variable-length inputs to a fixed-dimensional vector representation.
    • The input is fed in reverse order, which introduces many short-term dependencies between source and target words, and a special end-of-sentence symbol marks where sequences end (a toy example follows the list).
  • This architecture achieved state-of-the-art performance on the WMT'14 English-to-French translation task, and the LSTM's quality did not degrade on long sentences.
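The core recipe, stated in code: one LSTM consumes the source sentence, its final hidden state serves as the fixed-dimensional encoding, and that state initializes a second LSTM that acts as a conditional language model over the target. A minimal PyTorch sketch of this idea; all sizes and names here are illustrative assumptions, not the paper's actual 4-layer, 1000-unit configuration:

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, src_vocab, tgt_vocab, emb_dim=256, hid_dim=512):
        super().__init__()
        self.src_emb = nn.Embedding(src_vocab, emb_dim)
        self.tgt_emb = nn.Embedding(tgt_vocab, emb_dim)
        # Encoder LSTM reads the (reversed) source sequence.
        self.encoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        # Decoder LSTM is an RNN language model over the target.
        self.decoder = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, tgt_vocab)

    def forward(self, src, tgt_in):
        # The encoder's final (h, c) is the fixed-dimensional "encoding".
        _, state = self.encoder(self.src_emb(src))
        # The decoder starts from that encoding and predicts the next
        # target token at every position (teacher forcing during training).
        dec_out, _ = self.decoder(self.tgt_emb(tgt_in), state)
        return self.out(dec_out)

model = Seq2Seq(src_vocab=10000, tgt_vocab=10000)
src = torch.randint(0, 10000, (2, 7))     # batch of 2 source sentences
tgt_in = torch.randint(0, 10000, (2, 9))  # shifted target inputs
logits = model(src, tgt_in)               # shape: (2, 9, tgt_vocab)
```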
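And a toy illustration of the two input-side tricks, reversal and the end-of-sentence symbol (token ids are made up):

```python
# Source tokens for a sentence "a b c"; EOS id is an assumption.
EOS = 0
src_tokens = [12, 47, 9]
src_input = src_tokens[::-1] + [EOS]  # fed to the encoder as "c b a <EOS>"
# Reversal places the first source words close to the first target words,
# creating short-term dependencies that ease optimization.
```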

Strengths

  • Introduces seq2seq learning with LSTMs to the deep learning community.
  • The ability to map a sentence to a fixed-dimensional vector makes it possible to visualize these representations and check whether the model "learns" semantic properties (a sketch follows this list).
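A hedged sketch of how one might inspect those sentence vectors, in the spirit of the paper's 2-D PCA plots of hidden states. It reuses the hypothetical Seq2Seq model and src batch from the sketch above:

```python
from sklearn.decomposition import PCA

with torch.no_grad():
    # Final hidden state of the encoder, one vector per sentence.
    _, (h, _) = model.encoder(model.src_emb(src))  # h: (1, batch, hid_dim)
    vectors = h.squeeze(0).numpy()                 # (batch, hid_dim)

# Project to 2-D for plotting; nearby points should correspond to
# semantically similar sentences if the encoding is meaningful.
coords = PCA(n_components=2).fit_transform(vectors)
```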

Weaknesses and Doubts

  • Interpretability of the encoder output: how should this "fixed-dimensional vector" be interpreted, and can it be treated as a sentence embedding?
  • There is no clear explanation for why reversing the input sequence helps.