Today’s the big day: GPT-5 is finally launching! But before we dive into all the excitement, I want to take a step back and look at some unsaturated language model evaluations, meaning benchmarks where the best current models still score well below the ceiling, that could put GPT-5 to the test.
Ahead of the launch, I compiled a list of these evals, and I’m curious to see how GPT-5 performs on them. They are tough enough to really push the limits of language models like GPT-5.
If you’re interested in seeing the full list, you can check it out here: https://rolandgao.github.io/blog/unsaturated_evals_before_gpt5. And if you’re wondering what it all means, don’t worry – I’ll break it down for you.
The goal of these evals is to see how well language models can generalize and adapt to new, unseen situations. It’s not just about getting the right answer; it’s about understanding the context and nuances of language.
So, will GPT-5 be able to crack these unsaturated evals? Only time will tell. But one thing’s for sure – it’ll be an exciting ride.