Bidirectional LSTM (BiLSTM)

Bidirectional LSTM, or BiLSTM, is a sequence model that contains two LSTM layers, one processing the input in the forward direction and the other processing it in the backward direction. It is commonly used in NLP-related tasks. The intuition behind this approach is that, by processing data in both directions, the model can better capture the context around each position in the sequence (e.g. it sees both the preceding and the following words in a sentence).

To see why this matters, consider two statements: "Server, can you bring me this dish?" and "He crashed the server". The word "server" has a different meaning in each statement, and that meaning depends on the words that precede and follow it. A bidirectional LSTM captures this kind of relationship better than a unidirectional LSTM, which makes BiLSTM a suitable architecture for tasks like sentiment analysis, text classification, and machine translation.

Architecture

The architecture of a bidirectional LSTM comprises two unidirectional LSTMs that process the sequence in the forward and backward directions. It can be interpreted as two separate LSTM networks: one receives the sequence of tokens as it is, while the other receives it in reverse order. Each LSTM returns a probability vector as output, and the final output is the combination of these two probabilities. It can be represented as:

p_t = p_t^f + p_t^b

where,

  • p_t : Final probability vector of the network.
  • p_t^f : Probability vector from the forward LSTM network.
  • p_t^b : Probability vector from the backward LSTM network.

Bidirectional LSTM layer Architecture

Figure 1 describes the architecture of the BiLSTM layer, where X_i is the input token, Y_i is the output token, and F_i and B_i are the forward and backward LSTM nodes. The final output Y_i is the combination of the F_i and B_i LSTM nodes.
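To make the forward/backward combination concrete, here is a minimal sketch using the Keras Bidirectional wrapper from TensorFlow. The layer size and input shape are arbitrary illustrative choices; merge_mode="sum" adds the two directions' outputs as in the formula above, while the default "concat" stacks them.

```python
import tensorflow as tf

# A toy BiLSTM block: one LSTM reads the sequence left-to-right and a
# second copy reads it right-to-left; their outputs are then merged.
# merge_mode="sum" adds the forward and backward outputs, matching the
# combination described above ("concat", the default, stacks them instead).
bilstm = tf.keras.layers.Bidirectional(
    tf.keras.layers.LSTM(64, return_sequences=True),
    merge_mode="sum",
)

# Example input: a batch of 2 sequences, each 10 steps long with 8 features.
x = tf.random.normal((2, 10, 8))
y = bilstm(x)
print(y.shape)  # (2, 10, 64) because the two directions are summed
```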

Now, let us look at an implementation of a review system using BiLSTM layers in Python with the TensorFlow library. We will perform sentiment analysis on the IMDB movie review dataset, implement the network from scratch, and train it to identify whether a review is positive or negative.

Importing Libraries and Dataset

Python libraries make it very easy for us to handle the data and perform typical and complex tasks with a single line of code.
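The original imports and loading code are not reproduced above, so the following is only a sketch of what this step might look like, assuming the IMDB reviews bundled with Keras (tensorflow.keras.datasets.imdb). The vocabulary size and review length are illustrative choices, not values from the original article.

```python
from tensorflow.keras.datasets import imdb
from tensorflow.keras.preprocessing.sequence import pad_sequences

VOCAB_SIZE = 10000  # keep only the 10,000 most frequent words (illustrative)
MAX_LEN = 200       # pad/truncate every review to 200 tokens (illustrative)

# The Keras IMDB dataset arrives already tokenised as lists of word indices,
# with labels 1 (positive) and 0 (negative).
(x_train, y_train), (x_test, y_test) = imdb.load_data(num_words=VOCAB_SIZE)

# Pad/truncate so every review has the same length.
x_train = pad_sequences(x_train, maxlen=MAX_LEN)
x_test = pad_sequences(x_test, maxlen=MAX_LEN)

print(x_train.shape, y_train.shape)  # (25000, 200) (25000,)
```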

Model Architecture

...
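The section's original model code is omitted here, so the following is a hedged sketch of one reasonable BiLSTM architecture for this task: an Embedding layer feeding a Bidirectional LSTM, then a dense head with a sigmoid output for the binary positive/negative label. Layer sizes are illustrative, and VOCAB_SIZE and MAX_LEN come from the loading sketch above.

```python
from tensorflow.keras import layers, models

model = models.Sequential([
    # Map word indices to dense 128-dimensional vectors.
    layers.Embedding(input_dim=VOCAB_SIZE, output_dim=128),
    # BiLSTM layer: reads each review in both directions.
    layers.Bidirectional(layers.LSTM(64)),
    layers.Dense(64, activation="relu"),
    # Single sigmoid unit: estimated probability that the review is positive.
    layers.Dense(1, activation="sigmoid"),
])

model.compile(optimizer="adam",
              loss="binary_crossentropy",
              metrics=["accuracy"])

model.build(input_shape=(None, MAX_LEN))
model.summary()
```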

Model Training

...
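The training code itself is not reproduced above; a minimal sketch, continuing from the previous snippets, might look like the following. The epoch count, batch size, and validation split are illustrative values to tune, not the article's settings.

```python
history = model.fit(
    x_train, y_train,
    epochs=5,              # illustrative; more epochs may overfit this small model
    batch_size=64,
    validation_split=0.2,  # hold out 20% of training reviews for validation
)
```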

Model Evaluation

...
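Likewise, the evaluation code is omitted above; a short sketch of what it could look like, using the held-out test split and the usual 0.5 threshold on the sigmoid output:

```python
# Loss and accuracy on the unseen test reviews.
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
print(f"Test accuracy: {test_acc:.3f}")

# Score a single review: probabilities >= 0.5 are read as positive sentiment.
prob = model.predict(x_test[:1], verbose=0)[0, 0]
print("positive" if prob >= 0.5 else "negative", float(prob))
```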