Maxproof
Posted by ilreb 5 days ago
Comments
Comment by daquisu 5 days ago
"Crudely, IMO gold medals are awarded to the highest-scoring 1/12 of contestants. However, because scores are integers up to 42 and there’s no provision for tiebreaking, it’s possible for a lot of contestants to be tied around the threshold. In that case, either all of them get a gold medal or none do, and the fraction of gold medalists might deviate substantially from 1/12. That’s what happened this year: 46 contestants all won a gold medal by scoring exactly 35 points.
In fact, bizarrely, 35 is the mode of the scores this year; the last time the modal score was a gold medal score was in 1994. And, of course, 35 is the same score claimed by AI systems from Google, OpenAI, and others."
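To make the tie effect concrete, here is a minimal Python sketch. The selection rule (pick the cutoff whose share of contestants is closest to 1/12) is an assumption for illustration, not the official IMO procedure, and the scores are made up; `gold_cutoff` is a hypothetical helper name.

```python
from collections import Counter
from fractions import Fraction


def gold_cutoff(scores, target=Fraction(1, 12)):
    """Return (cutoff, medalists) for the cutoff score whose share of
    contestants at or above it is closest to `target`.

    Assumed rule for illustration only; everyone tied at the cutoff
    gets a medal, since there is no tiebreaking.
    """
    n = len(scores)
    counts = Counter(scores)
    best = None
    at_or_above = 0
    # Walk distinct scores from highest to lowest, accumulating contestants.
    for score in sorted(counts, reverse=True):
        at_or_above += counts[score]
        gap = abs(Fraction(at_or_above, n) - target)
        # Strict comparison: on an exact tie in gap, keep the higher cutoff.
        if best is None or gap < best[0]:
            best = (gap, score, at_or_above)
    return best[1], best[2]


# Made-up field of 36 contestants: a strict 1/12 would mean 3 golds,
# but three contestants are tied at 35.
toy_scores = [42] + [35] * 3 + [20] * 32
print(gold_cutoff(toy_scores))
```

With this toy field the only options near the target are 1 gold (cutoff 42) or 4 golds (cutoff 35), so the sketch returns `(35, 4)`, i.e. 1/9 of contestants rather than 1/12.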
Comment by quibono 5 days ago
Comment by quietbritishjim 5 days ago
This is the part of the quote you're replying to.
You seemed to take "of course" as an implication that the contestants used LLMs, and that's why they got the same score as the LLMs.
I took it to mean: since this was the modal score, there seemed to be 35 points worth of significantly easier answers (relatively speaking) than the remaining points, so it's not a surprise that LLMs got the same easier bits right. (Though I doubt all contestants got their points on exactly the same answers.)
But it's certainly unclear what exactly the author meant.
Comment by daquisu 4 days ago
> We can also consider the IMO 2025 problems individually. In the Epoch AI newsletter, Greg Burnham combines a subjective analysis with Evan Chen’s MOHS ratings to argue that the first five problems at IMO 2025 were unusually easy and the sixth was unusually hard, so it’s not surprising that the first five problems were exactly the ones solved by these AIs. Though I’m not sure the MOHS scale is rigorous enough to make sense as the x-axis of a bar chart, it’s easy to corroborate the high-level story with the official IMO statistics. Based on average scores, this year’s Problem 6 was the fourth hardest and its Problem 3 was by far the easiest of all Problem 3s and 6s since 2000.
Section "6.3.1. Per-Problem Analysis" of the linked MaxProof paper shows the same behavior: 7/7 on the first 5 problems, 0/7 on the last problem.
Comment by dooglius 5 days ago
Comment by gus_massa 4 days ago
It's not rare at all. I can't find the 2026 results, but here are the 2025 ones.
https://www.imo-official.org/results/individual/year/2025/
The top of the table is full of 7s and the bottom is full of 0s, but in the middle there are a lot of intermediate scores. A pattern like "7 ? 0 7 ? 0" is not uncommon, because the 1st and 4th problems are usually the easiest and the 3rd and 6th the hardest. But there are a lot of other combinations due to stupid mistakes, lucky solutions, and different personal styles/preferences that make some problems easier or harder for each contestant.
Comment by dooglius 4 days ago
Comment by gus_massa 4 days ago
Comment by pfannl 5 days ago
Comment by thierrydamiba 5 days ago
Comment by rapsacnz 4 days ago
Comment by korbonits 5 days ago
Comment by minimaxir 5 days ago
Comment by uyuyuy 5 days ago
Comment by thatsgcasey 5 days ago