Have you ever worked with large language models (LLMs) and felt frustrated by their tendency to forget? I know I have. No matter how much information you pack into the prompt or how elaborate your pipeline is, anything that scrolls past the context window is simply gone. Fine-tuning is an option, but it’s not a real solution. It’s like putting a Band-Aid on a broken leg.
That’s why I’ve started exploring alternative approaches to memory management. Instead of retraining models, I’m experimenting with giving them ‘on-demand context.’ It’s early days, but I’m excited by the potential of this uncharted territory.
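To make that concrete, here’s roughly the shape of what I’ve been playing with. This is a minimal sketch, not a real implementation: the `OnDemandMemory` class, its `remember`/`recall` methods, and the word-overlap scoring are all names I made up for illustration, and a production version would swap the scoring for embeddings plus a vector index.

```python
import time
from dataclasses import dataclass, field


@dataclass
class MemoryEntry:
    text: str
    timestamp: float = field(default_factory=time.time)


class OnDemandMemory:
    """Hypothetical store that surfaces only relevant memories at call time."""

    def __init__(self) -> None:
        self.entries: list[MemoryEntry] = []

    def remember(self, text: str) -> None:
        self.entries.append(MemoryEntry(text))

    def recall(self, query: str, k: int = 3) -> list[str]:
        # Toy relevance score: word overlap between query and memory.
        # Assumption: a real system would rank with embeddings instead.
        q = set(query.lower().split())
        ranked = sorted(
            self.entries,
            key=lambda e: len(q & set(e.text.lower().split())),
            reverse=True,
        )
        return [e.text for e in ranked[:k]]


def build_prompt(memory: OnDemandMemory, user_message: str) -> str:
    # Inject only the recalled snippets, not the full history, so the
    # prompt stays small no matter how long the conversation has run.
    recalled = "\n".join(f"- {m}" for m in memory.recall(user_message))
    return f"Relevant context:\n{recalled}\n\nUser: {user_message}"


if __name__ == "__main__":
    memory = OnDemandMemory()
    memory.remember("User prefers answers in Python.")
    memory.remember("User is building a customer-support chatbot.")
    memory.remember("User dislikes verbose explanations.")
    print(build_prompt(memory, "How should I structure my Python chatbot?"))
```

The appeal, for me, is that nothing gets baked into the weights: the model sees exactly the context it needs for this turn and nothing else.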
I’m sure I’m not alone in this frustration. How do you handle memory in your projects? Do you think we’ll ever move beyond the current RAG/fine-tuning combo and find something more elegant?
The current state of LLM memory is like trying to hold water in our hands. We need a better way to retain context and make these models truly useful.