AlphaGeometry scores almost as well as the best students on geometry questions from the International Mathematical Olympiad
By Alex Wilkins
17 January 2024
Geometrical problems involve proving facts about angles or lines in complicated shapes (Image: Google DeepMind)
An AI from Google DeepMind can solve some International Mathematical Olympiad (IMO) questions on geometry almost as well as the best human contestants.
“The results of AlphaGeometry are stunning and breathtaking,” says Gregor Dolinar, the IMO president. “It seems that AI will win the IMO gold medal much sooner than was thought even a few months ago.”
The IMO, aimed at secondary school students, is one of the most difficult maths competitions in the world. Answering questions correctly requires mathematical creativity that AI systems have long struggled with. GPT-4, for instance, which has shown remarkable reasoning ability in other domains, scores 0 per cent on IMO geometry questions, while even specialised AIs struggle to answer as well as average contestants.
This is partly down to the difficulty of the problems, but it is also because of a lack of training data. The competition has been run annually since 1959, and each edition consists of just six questions. Some of the most successful AI systems, however, require millions or billions of data points. Geometrical problems, which make up one or two of the six questions and involve proving facts about angles or lines in complicated shapes, are particularly difficult to translate to a computer-friendly format.
Thang Luong at Google DeepMind and his colleagues have bypassed this problem by creating a tool that can generate hundreds of millions of machine-readable geometrical proofs. When they trained an AI called AlphaGeometry using this data and tested it on 30 IMO geometry questions, it answered 25 of them correctly, compared with an estimated average of 25.9 for IMO gold medallists, based on their scores in the contest.