DeepMind AI Achieves Silver Medal Level at 2024 International Mathematical Olympiad
Google DeepMind's AI systems have reached a significant milestone, achieving silver medal-level performance at the 2024 International Mathematical Olympiad (IMO). The company's specialized models, AlphaProof and AlphaGeometry 2, solved four of the prestigious competition's six problems, demonstrating AI's growing capability for complex mathematical reasoning.

AlphaProof and AlphaGeometry 2
Google DeepMind developed two specialized AI systems to tackle complex mathematical problems. AlphaProof combines a pre-trained language model with the AlphaZero reinforcement learning algorithm, enabling it to state and prove results in algebra and number theory. AlphaGeometry 2, an enhanced version of its predecessor, focuses on geometry problems and was trained on a vast dataset of 100 million synthetic examples. This synthetic data generation approach helped overcome the scarcity of human-written training data, a common obstacle in building AI systems for mathematical reasoning.
Training Methodologies of AlphaProof and AlphaGeometry 2
AlphaProof and AlphaGeometry 2 rely on innovative training methodologies for their mathematical reasoning capabilities. AlphaProof uses a self-play approach, solving millions of problems across a range of difficulty levels and mathematical topics over several weeks. It generates candidate solutions and searches for proof steps in the Lean formal language, reinforcing its language model with each verified proof. AlphaGeometry 2 builds on a Gemini-based language model trained on the larger dataset of 100 million synthetic examples. To bridge the gap between natural and formal languages, researchers fine-tuned a Gemini model to translate natural-language problem statements into formal mathematical language, creating a vast library of formal problems for training. This approach enabled the systems to tackle a wide range of mathematical challenges.
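The train-on-verified-proofs loop described above can be sketched in miniature. This is a toy illustration, not DeepMind's implementation: `generate_candidate` and `verify` are hypothetical stand-ins for the language model and the Lean proof checker, and "reinforcement" is reduced to memoizing verified answers.

```python
import random

random.seed(0)  # make the toy sketch deterministic

def generate_candidate(problem, policy):
    # The model proposes a proof attempt; once a verified proof is
    # known, it is replayed (a crude stand-in for reinforcement).
    if problem in policy:
        return policy[problem]
    return random.choice(["valid_proof", "invalid_proof"])

def verify(problem, candidate):
    # Stand-in for formal verification in Lean: only a candidate
    # that actually proves the statement is accepted.
    return candidate == "valid_proof"

def train(problems, rounds=50):
    policy = {}
    for _ in range(rounds):
        for p in problems:
            cand = generate_candidate(p, policy)
            if verify(p, cand):
                policy[p] = cand  # only verified proofs update the model
    return policy

solved = train(["problem_1", "problem_2", "problem_3"])
print(len(solved))  # all three toy problems end up with verified proofs
```

The key design point this sketch preserves is that the learning signal comes only from proofs the checker accepts, so the model can never reinforce itself on a plausible-looking but invalid argument.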
Performance at the 2024 IMO
At the 2024 International Mathematical Olympiad, AlphaProof successfully solved two algebra problems and one number theory problem, while AlphaGeometry 2 solved one geometry problem. Their combined solutions earned a total of 28 points out of a possible 42, equivalent to a silver medal performance and just one point shy of the gold medal threshold. Notably, AlphaGeometry 2 solved its problem in just 19 seconds, showcasing its remarkable efficiency. The problems were manually translated into formal mathematical language for the AI systems, and the solutions took anywhere from a few minutes to three days to generate.
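To illustrate what "formal mathematical language" means here, below is a trivially simple statement and its machine-checked proof in Lean, the language the article mentions. This is not an IMO problem; competition statements require far more elaborate formalizations.

```lean
-- A toy formalized statement and its machine-checkable proof in Lean 4.
-- `Nat.add_comm` is the core library's proof that addition commutes.
theorem add_comm_nat (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```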
Significance of the Achievement
This milestone represents a significant leap forward in AI's ability to handle complex mathematical reasoning, a capability that has long been challenging for machines. The success of AlphaProof and AlphaGeometry 2 demonstrates that AI can now perform the high-level logical reasoning, abstraction, and hierarchical planning required to solve IMO problems. Particularly noteworthy is that the systems produced human-readable proofs and applied classical geometry rules, much as human competitors do. The achievement was acknowledged by expert mathematicians, including Fields Medalist Tim Gowers, who expressed surprise at AI's ability to find the "magic keys" that unlock complex problems. The systems' performance approaches that of human gold medalists: AlphaGeometry 2 solved 83% of historical IMO geometry problems from the past 25 years, a significant improvement over its predecessor's 53% success rate.
Future Potential of AI in Mathematics
The successful performance of AlphaProof and AlphaGeometry 2 at the IMO opens new possibilities for AI-assisted mathematical research and problem-solving. These systems could potentially help mathematicians discover new insights, solve open problems, and accelerate scientific discovery. At the same time, DeepMind researchers acknowledge that AI still lacks the creativity and problem-solving intuition of human mathematicians, suggesting further development is needed for AI to fully match human capabilities in mathematics. As these systems continue to evolve, they could become powerful computational tools, akin to calculators, assisting humans in formulating mathematical proofs and exploring complex hypotheses.