Not like standard DNN, which assumes that inputs and outputs are independent of each other, the output of RNN is reliant on prior factors throughout the sequence. Even so, standard recurrent networks have the issue of vanishing gradients, which makes learning long facts sequences demanding. In the subsequent, we explore quite a few popular variants