I recently came across a Reddit post that really caught my attention. The author, who built 5 agentic AI products in just 3 months, shared 10 hard lessons they learned along the way. As I read through their experiences, I realized that building agentic AI products isn’t as glamorous as it’s often made out to be. In fact, it’s a lot of hard work, trial and error, and human-in-the-loop feedback.
One of the biggest takeaways from the post was that feedback loops are essential, but they’re not as automated as we might think. In reality, it’s usually the developer manually reviewing outputs, spotting failure patterns, and tweaking prompts or retraining models. Reflection techniques like CRITIC and self-review can help, but they’re not a replacement for actual human QA.
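To make that concrete, here is a minimal sketch of what a reflection loop like CRITIC or self-review actually looks like in practice. The `llm` callable is a placeholder for whatever completion function you already use (the name, prompts, and the "no problems" check are all assumptions for illustration, not a specific vendor API), and the key point is the comment at the end: the revised draft still lands in a human review queue.

```python
from typing import Callable

def reflect_and_revise(llm: Callable[[str], str], task: str, max_rounds: int = 2) -> str:
    """Draft an answer, ask the model to critique it, then revise.

    This catches only the issues the model can articulate about its own
    output -- it narrows the gap, it does not close it.
    """
    draft = llm(f"Complete the following task:\n{task}")
    for _ in range(max_rounds):
        critique = llm(
            "List concrete problems with this answer (factual errors, "
            f"missing steps, unsupported claims):\n\nTask: {task}\n\nAnswer: {draft}"
        )
        if "no problems" in critique.lower():
            break
        draft = llm(
            "Revise the answer to address every point in the critique.\n\n"
            f"Task: {task}\n\nAnswer: {draft}\n\nCritique: {critique}"
        )
    return draft  # in practice: write to a human review queue, not straight to the user
```

Even with a loop like this running, someone still has to read the outputs that come out the other end, tag the failure patterns, and feed those observations back into the prompts.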
Another important point was that coding agents work well, but only in very narrow cases. They need clean inputs, structured tasks, and test cases to function properly. And even then, they can be fragile and prone to errors.
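One way to keep a coding agent inside that narrow lane is to gate its output behind test cases you wrote up front. The sketch below assumes the agent returns plain Python source for a single function; `fake_agent_output` stands in for the real agent call, and exec'ing model-generated code like this is only acceptable inside a sandbox.

```python
import traceback

def accept_if_tests_pass(code: str, tests: list) -> bool:
    """Execute generated code in an isolated namespace and run the tests.

    Any exception (syntax error, failed assertion) rejects the output,
    which then goes back to the agent or escalates to a human.
    """
    namespace: dict = {}
    try:
        exec(code, namespace)   # load the generated function(s)
        for test in tests:
            test(namespace)     # each test asserts against the namespace
        return True
    except Exception:
        traceback.print_exc()
        return False

# The tests define what "done" means before the agent runs.
def test_basic(ns):
    assert ns["slugify"]("Hello World") == "hello-world"

def test_strips_punctuation(ns):
    assert ns["slugify"]("Hi, there!") == "hi-there"

# Hand-written stand-in for agent output, just to show the flow:
fake_agent_output = '''
import re
def slugify(text):
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")
'''
print(accept_if_tests_pass(fake_agent_output, [test_basic, test_strips_punctuation]))
```

When the task can be pinned down this tightly, the agent is genuinely useful; when it can't, the fragility shows up immediately.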
The post also highlighted the limitations of AI evaluating AI (RLAIF) and the challenges of skill acquisition via self-play. Neither is as easy as it sounds, and both usually still need a human to verify the results.
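A common compromise is to let a judge model score outputs at scale but route a random sample to a person, so you notice when the judge and the humans stop agreeing. This is a rough sketch under that assumption; the generic `llm` callable, the 1-to-5 rubric, and the 10% sample rate are all illustrative choices, not anything prescribed in the original post.

```python
import random
from typing import Callable

def judge_outputs(llm: Callable[[str], str], examples, human_sample_rate: float = 0.1):
    """Score each (task, output) pair with a judge prompt and flag a
    random sample for human review so judge drift gets caught."""
    results = []
    for task, output in examples:
        verdict = llm(
            "Score this answer from 1 to 5 for correctness.\n"
            f"Task: {task}\nAnswer: {output}\nReply with just the number."
        )
        results.append({
            "task": task,
            "output": output,
            "judge_score": verdict.strip(),
            "flag_for_human": random.random() < human_sample_rate,
        })
    return results
```

The flagged items are the ones a person actually reads; if their judgments start diverging from the judge's scores, the automated evaluation is no longer telling you anything.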
What does this mean for those of us building agentic AI products? It means we need to be realistic about what’s possible and focus on building robust, reliable systems that can handle real-world scenarios. We need to scope tightly, evaluate constantly, and keep a human in the loop. And most importantly, we need to start small and focus on solving boring, repetitive problems first.
It’s not the most glamorous work, but it’s essential for building agentic AI products that actually work.