Building a recurrent neural network (RNN) from scratch can be a daunting task, especially when things don’t go as planned. I recently embarked on a project to recreate neural networks from first principles, and I thought I’d share my experience building an RNN.
My RNN model was designed to predict the next token in a sequence, using the past 10 tokens as input. I used a simple architecture with an embedding layer, a recurrent layer, and two dense layers. However, things took a turn for the worse when I started training the model. The loss would decrease initially, but then suddenly shoot up to 25 and stay there. I was stumped.
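To make the shape flow concrete, here’s a rough sketch of that forward pass. The vocabulary size, dimensions, and variable names below are illustrative assumptions, not the actual values from my project:

```python
import numpy as np

# Illustrative sizes -- not the actual values from the project.
vocab_size, embed_dim, hidden_dim, seq_len = 5000, 64, 128, 10

# Embedding layer: look up a vector for each of the 10 input tokens.
E = np.random.randn(vocab_size, embed_dim) * 0.01
tokens = np.random.randint(0, vocab_size, size=seq_len)
x = E[tokens]                            # (10, 64)

# Recurrent layer: fold the sequence into a single hidden state.
Wx = np.random.randn(embed_dim, hidden_dim) * 0.01
Wh = np.random.randn(hidden_dim, hidden_dim) * 0.01
h = np.zeros(hidden_dim)
for t in range(seq_len):
    h = np.tanh(x[t] @ Wx + h @ Wh)      # (128,)

# Two dense layers: project the final state to next-token scores.
W1 = np.random.randn(hidden_dim, hidden_dim) * 0.01
W2 = np.random.randn(hidden_dim, vocab_size) * 0.01
logits = np.maximum(0, h @ W1) @ W2      # (5000,)
```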
After reviewing my code, I suspected the issue was in the backpropagation through my recurrent layer. A simplified sketch of that layer is below, covering the forward pass, the backward pass (backprop through time), and the weight update. But despite my best efforts, I couldn’t pinpoint the problem.
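In the sketch, the dimensions, learning rate, and clipping threshold are illustrative placeholders rather than the exact values I used:

```python
import numpy as np

class RecurrentLayer:
    """Plain tanh RNN layer trained with backprop through time (BPTT)."""

    def __init__(self, input_dim, hidden_dim, lr=1e-3):
        self.Wx = np.random.randn(input_dim, hidden_dim) * 0.01
        self.Wh = np.random.randn(hidden_dim, hidden_dim) * 0.01
        self.b = np.zeros(hidden_dim)
        self.lr = lr

    def forward(self, xs):
        """xs: (seq_len, input_dim) array. Returns the final hidden state."""
        self.xs = xs
        self.hs = [np.zeros_like(self.b)]       # h_{-1} = 0
        for x in xs:
            h = np.tanh(x @ self.Wx + self.hs[-1] @ self.Wh + self.b)
            self.hs.append(h)
        return self.hs[-1]

    def backward(self, dh):
        """dh: gradient of the loss w.r.t. the final hidden state."""
        dWx = np.zeros_like(self.Wx)
        dWh = np.zeros_like(self.Wh)
        db = np.zeros_like(self.b)
        dxs = np.zeros_like(self.xs)
        for t in reversed(range(len(self.xs))):
            # Backprop through tanh: d/dz tanh(z) = 1 - tanh(z)^2.
            dz = (1.0 - self.hs[t + 1] ** 2) * dh
            dWx += np.outer(self.xs[t], dz)     # grad w.r.t. input weights
            dWh += np.outer(self.hs[t], dz)     # grad w.r.t. recurrent weights
            db += dz
            dxs[t] = dz @ self.Wx.T             # grad passed to the embedding
            dh = dz @ self.Wh.T                 # grad carried to step t-1
        # Clip gradients -- exploding gradients through Wh are a common
        # cause of a loss that suddenly blows up mid-training.
        for g in (dWx, dWh, db):
            np.clip(g, -5.0, 5.0, out=g)
        self.dWx, self.dWh, self.db = dWx, dWh, db
        return dxs

    def update(self):
        """Plain SGD step on the accumulated gradients."""
        self.Wx -= self.lr * self.dWx
        self.Wh -= self.lr * self.dWh
        self.b -= self.lr * self.db
```

One thing worth noting: a loss that falls at first and then suddenly blows up is a classic symptom of exploding gradients flowing through the recurrent weight matrix, which is why the sketch clips the gradients before each update.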
If you’ve ever built an RNN from scratch, you know how frustrating it can be when things don’t work as expected. But it’s all part of the learning process, right? If you have any insights or suggestions, I’d love to hear them!