DeepMind AI Achieves Silver Medal Level at 2024 International Mathematical Olympiad
Google DeepMind's AI systems have reached a significant milestone, achieving silver medal-level performance at the 2024 International Mathematical Olympiad (IMO). The company's specialized models, AlphaProof and AlphaGeometry 2, solved four of the prestigious competition's six problems, demonstrating AI's growing capability for complex mathematical reasoning.

AlphaProof and AlphaGeometry 2
Google DeepMind developed two specialized AI systems to tackle complex mathematical problems. AlphaProof combines a pre-trained language model with the AlphaZero reinforcement learning algorithm, enabling it to state and prove results in algebra and number theory. AlphaGeometry 2, an enhanced version of its predecessor, focuses on geometry problems and was trained on a vast dataset of 100 million synthetic examples. This synthetic data generation approach helped overcome the scarcity of human-written training data, a common obstacle in building AI systems for mathematical reasoning.
Training Methodologies of AlphaProof and AlphaGeometry 2
AlphaProof and AlphaGeometry 2 rely on innovative training methodologies for their mathematical reasoning capabilities. AlphaProof uses a self-play approach, solving millions of problems across a range of difficulty levels and mathematical topics over several weeks. It generates candidate solutions and searches for proof steps in the Lean formal language, reinforcing its language model with each verified proof. AlphaGeometry 2 builds on a Gemini-based language model trained on the larger dataset of 100 million synthetic examples. To bridge the gap between natural and formal languages, researchers fine-tuned a Gemini model to translate natural-language problem statements into formal mathematical language, creating a vast library of formal problems for training. This approach enabled the systems to tackle a wide range of mathematical challenges.
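The train-on-verified-proofs loop described above can be sketched in miniature. This is a toy illustration, not DeepMind's implementation: `generate_candidate` and `verify` are hypothetical stand-ins for the language model and the Lean proof checker, and "reinforcement" is reduced to memoizing verified answers.

```python
import random

random.seed(0)  # make the toy sketch deterministic

def generate_candidate(problem, policy):
    # The model proposes a proof attempt; once a verified proof is
    # known, it is replayed (a crude stand-in for reinforcement).
    if problem in policy:
        return policy[problem]
    return random.choice(["valid_proof", "invalid_proof"])

def verify(problem, candidate):
    # Stand-in for formal verification in Lean: only a candidate
    # that actually proves the statement is accepted.
    return candidate == "valid_proof"

def train(problems, rounds=50):
    policy = {}
    for _ in range(rounds):
        for p in problems:
            cand = generate_candidate(p, policy)
            if verify(p, cand):
                policy[p] = cand  # only verified proofs update the model
    return policy

solved = train(["problem_1", "problem_2", "problem_3"])
print(len(solved))  # all three toy problems end up with verified proofs
```

The key design point this sketch preserves is that the learning signal comes only from proofs the checker accepts, so the model can never reinforce itself on a plausible-looking but invalid argument.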
Performance at the 2024 IMO
At the 2024 International Mathematical Olympiad, AlphaProof successfully solved two algebra problems and one number theory problem, while AlphaGeometry 2 solved one geometry problem. Their combined solutions earned a total of 28 points out of a possible 42, equivalent to a silver medal performance and just one point shy of the gold medal threshold. Notably, AlphaGeometry 2 solved its problem in just 19 seconds, showcasing its remarkable efficiency. The problems were manually translated into formal mathematical language for the AI systems, and the solutions took anywhere from a few minutes to three days to generate.
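To illustrate what "formal mathematical language" means here, below is a trivially simple statement and its machine-checked proof in Lean, the language the article mentions. This is not an IMO problem; competition statements require far more elaborate formalizations.

```lean
-- A toy formalized statement and its machine-checkable proof in Lean 4.
-- `Nat.add_comm` is the core library's proof that addition commutes.
theorem add_comm_nat (a b : Nat) : a + b = b + a :=
  Nat.add_comm a b
```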
Significance of the Achievement
This milestone represents a significant leap forward in AI's ability to handle complex mathematical reasoning, a capability that has long been challenging for machines. The success of AlphaProof and AlphaGeometry 2 demonstrates that AI can now perform the high-level logical reasoning, abstraction, and hierarchical planning required to solve IMO problems. Particularly noteworthy is that the systems produced human-readable proofs and applied classical geometry rules, much as human competitors do. The achievement was acknowledged by expert mathematicians, including Fields Medalist Tim Gowers, who expressed surprise at AI's ability to find the "magic keys" that unlock complex problems. The systems' performance approaches that of human gold medalists: AlphaGeometry 2 solved 83% of historical IMO geometry problems from the past 25 years, a significant improvement over its predecessor's 53% success rate.
Future Potential of AI in Mathematics
The successful performance of AlphaProof and AlphaGeometry 2 at the IMO opens new possibilities for AI-assisted mathematical research and problem-solving. These systems could potentially help mathematicians discover new insights, solve open problems, and accelerate scientific discovery. At the same time, DeepMind researchers acknowledge that AI still lacks the creativity and problem-solving intuition of human mathematicians, suggesting further development is needed for AI to fully match human capabilities in mathematics. As these systems continue to evolve, they could become powerful computational tools, akin to calculators, assisting humans in formulating mathematical proofs and exploring complex hypotheses.