I’ve been noticing a disturbing trend in my data team lately. The pressure to adopt Large Language Models (LLMs) for every task is causing more harm than good. We’re sacrificing reliability and accuracy for the sake of innovation, and it’s taking a toll on our work.
We used to have deterministic solutions for specific tasks, like data extraction from PDFs and websites, fuzzy matching, and data categorization. These solutions may not have been flashy, but they were reliable and accurate. Now, we’re being forced to use LLMs that are more flexible but make many more mistakes.
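To make "deterministic" concrete, here is roughly the shape of one of those extraction rules. The invoice pattern and field name below are made-up examples rather than our actual rules, but the behavior is the point: the same input produces the same output on every run, and a miss comes back as a miss instead of a guess.

```python
import re

# Illustrative only: a fixed pattern for pulling an invoice total out of raw text.
# The "total due" format and field name are hypothetical, not our real extraction rules.
INVOICE_TOTAL = re.compile(r"total\s+due[:\s]+\$?(?P<amount>[\d,]+\.\d{2})", re.IGNORECASE)

def extract_total(text: str) -> str | None:
    """Return the matched amount, or None; no guessing, no invented values."""
    match = INVOICE_TOTAL.search(text)
    return match.group("amount") if match else None

print(extract_total("Invoice #1042\nTotal due: $1,234.56"))  # -> 1,234.56
print(extract_total("No structured total on this page"))     # -> None
```

When a rule like that misfires, the fix is a one-line change to the pattern, and it stays fixed.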
The extraction work is a good example. We built those rules case by case with regex, keyword lists, and stopwords. Writing them was tedious, but they worked, and when one broke we could see exactly why. Fuzzy matching and data categorization have gone the same way: we had fixed rules or supervised models trained for high-accuracy classification, and the LLM-based replacements are simply less precise.
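And here is the kind of fuzzy matching I mean, sketched with the standard library. The 0.85 threshold and the vendor names are illustrative, not our production settings; what matters is that the cutoff is explicit, chosen by us, and applied the same way every time.

```python
from difflib import SequenceMatcher

def best_match(name: str, candidates: list[str], threshold: float = 0.85) -> str | None:
    """Return the most similar candidate above the cutoff, or None if nothing qualifies."""
    normalized = name.strip().lower()
    score, match = max(
        (SequenceMatcher(None, normalized, c.strip().lower()).ratio(), c)
        for c in candidates
    )
    return match if score >= threshold else None

# Illustrative vendor names and threshold; the real tables and cutoffs were tuned per task.
print(best_match("Acme Corp.", ["ACME Corporation", "Apex Corp", "Acme Corp"]))  # -> Acme Corp
print(best_match("Globex", ["Acme Corp", "Apex Corp"]))                          # -> None
```

Whether it matched or declined to match, we could explain the decision with a single number.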
The problem is that the board is more interested in the perceived innovation of LLMs than in actual results. They want a magical black box that solves everything generically, without considering the specific needs of each task. And when things go wrong, my team will be held responsible, even though we never had a say in the decision.
It’s time to take a step back and re-evaluate our priorities. Are we really making progress with LLMs, or are we just chasing a trendy solution that doesn’t deliver?