The Unromantic Truth About Data Science: It's Mostly Data Cleaning

I’m sure I’m not the only one who got into data science thinking I’d be building cool models and working on cutting-edge stuff like NLP or computer vision. But the harsh reality is that most of our time is spent on data cleaning. I mean, who doesn’t love fixing nulls, merging CSVs, and chasing stakeholders for missing data? 😅

Don’t get me wrong, I still love the field. But sometimes it feels like 80% of the job is just prepping the data, 15% is explaining the results, and 5% is actually running models. It’s like, where’s the excitement in that?

When I first started out, I was so excited to dive into machine learning and AI. Now it feels like most of my day goes to just getting the data into a usable state. And don’t even get me started on stakeholders who think data magically appears out of thin air.

So, am I just being dramatic, or is data cleaning really taking over our jobs? Share your thoughts!
