This AI computer can beat students at SAT geometry questions

Researchers have created an artificial intelligence system capable of answering geometry SAT questions as well as the average high school junior. 

Ann Hermes/The Christian Science Monitor
Students at Boston Collegiate Charter High School review for a math test in June 2014. Researchers have developed an artificial intelligence system that could score just as well as these students on the SAT.

In 2014, the average SAT test taker correctly answered 49 percent of the test's math questions. A new software program is now close to doing the same.

In a paper published Monday, researchers at the Allen Institute for Artificial Intelligence (AI2) and the University of Washington revealed that their artificial intelligence (AI) system, known as GeoSolver, or GeoS for short, is able to answer “unseen and unaltered” geometry problems on par with humans.  

According to a report released by the College Board, the average SAT math score in 2014 was 513. Though GeoS has only been tested on geometry questions, if the system’s accuracy were extrapolated to the full math section, GeoS would have scored a 500.

Using a combination of computer vision and natural language processing, GeoS can interpret diagrams and process text that it then feeds into a geometric solver that analyzes the input and selects the best multiple choice answer.

“Our method consists of two steps: interpreting a geometry question by deriving a logical expression that represents the meaning of the text and the diagram, and solving the geometry question by checking the satisfiability of the derived logical expression,” write the researchers in their most recent paper.
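To make the second step concrete, here is a toy sketch of choosing a multiple-choice answer by testing which option satisfies a set of constraints. It is not GeoS's actual code: the function names, the sample triangle question, and the representation of the "logical expression" as simple numeric constraints are all assumptions made for illustration.

```python
# Toy illustration only; NOT the GeoS implementation.
# Hypothetical question: in triangle ABC, angle A = 50 and angle B = 60.
# What is angle C?  (a) 60  (b) 70  (c) 80  (d) 90

def satisfies(constraints, assignment, tol=1e-6):
    """Return True if every constraint evaluates to (nearly) zero."""
    return all(abs(c(assignment)) <= tol for c in constraints)

def pick_answer(constraints, known, unknown, choices):
    """Try each answer choice and return the first one that satisfies all constraints."""
    for label, value in choices.items():
        assignment = {**known, unknown: value}
        if satisfies(constraints, assignment):
            return label
    return None  # no choice is consistent with the constraints

# Assumed stand-in for the derived logical expression:
# the interior angles of a triangle sum to 180 degrees.
constraints = [lambda v: v["A"] + v["B"] + v["C"] - 180.0]
known = {"A": 50.0, "B": 60.0}
choices = {"a": 60.0, "b": 70.0, "c": 80.0, "d": 90.0}

answer = pick_answer(constraints, known, "C", choices)
print(answer)
```

In this simplified setting the hard part, turning the question text and the diagram into constraints, is done by hand; that interpretation step is what the researchers describe as the first half of their method.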

The difficulty lies in analyzing the diagrams, which, in the context of geometry problems, usually provide information missing from the textual descriptions.

In alignment with AI2’s mission of AI research for the common good, all of GeoS’s source code and test data are available online. You can also watch a demo of how GeoS works.

While the test results may not seem that impressive on their own, it’s important to consider how much work it takes to achieve even these “average” results.

“Much of what we understand from text and graphics is not explicitly stated, and requires far more knowledge than we appreciate,” AI2 CEO Oren Etzioni said in a press release. “Creating a system to be able to successfully take these tests is challenging, and we are proud to achieve these unprecedented results.”

GeoS is, in fact, the first end-to-end system that can solve these types of plane geometry problems. But beyond that, GeoS signifies development in a machine's ability to reason, as opposed to merely perceiving input and recognizing patterns, areas in which AI systems such as Apple's Siri have advanced in recent years.

Many experts say we’re still years away from the biggest AI advances. “Although they can do some individual tasks as well as or even better than humans, technology still cannot approach the complex thinking that humans have,” says Microsoft’s Allison Linn.

But GeoS does represent real progress, even if it is a small step in the grand scheme of AI.

There’s no doubt that replicating human intelligence is complicated, but “we’re taking baby steps in that direction,” wrote Fortune's Derrick Harris.
