A New Optimizer for Noisy Deep RL: Introducing Ano

Hey there! As a student and independent researcher, I’ve been diving into the world of optimization in Deep Reinforcement Learning. I’m excited to share my first preprint with the community and get your feedback on both the method and the clarity of my writing.

The optimizer I’ve developed is called Ano, and its key idea is to decouple the magnitude of the gradient from the direction of the momentum. This aims to make training more stable and faster in noisy or highly non-convex environments, which are common in deep RL settings.

If you’re interested in learning more, you can check out my preprint and source code on Zenodo, or install Ano via pip. I’ve also set up a GitHub repository for experiments.

This is my first real research contribution, so I’d greatly appreciate any feedback, suggestions, or constructive criticism. Your input will help me refine Ano and make it more useful for the community.

Additionally, I’d love to make my preprint available on arXiv, but as I’m not affiliated with an institution, I need an endorsement to submit. If anyone feels comfortable endorsing it after reviewing the paper, it would mean a lot.

Thanks for reading and helping out!
