This AI computer can beat students at SAT geometry questions

Researchers have created an artificial intelligence system capable of answering geometry SAT questions as well as the average high school junior. 

Students at Boston Collegiate Charter High School review for a math test in June 2014. Researchers have developed an artificial intelligence system that could score just as well as these students on the SAT.

Ann Hermes/The Christian Science Monitor

September 22, 2015

In 2014, the average SAT test taker correctly answered 49 percent of the test's math questions. Now, a new software program comes close to doing the same.

In a paper published Monday, researchers at the Allen Institute for Artificial Intelligence (AI2) and the University of Washington revealed that their artificial intelligence (AI) system, known as GeoSolver, or GeoS for short, is able to answer “unseen and unaltered” geometry problems on par with humans.  

According to a report released by the College Board, the average SAT math score in 2014 was 513. Though GeoS has been tested only on geometry questions, if the system’s accuracy were extrapolated to the full math section, it would have scored a 500.

Using a combination of computer vision and natural language processing, GeoS interprets a problem's diagram and text, then feeds both into a geometric solver that analyzes the input and selects the best multiple-choice answer.

“Our method consists of two steps: interpreting a geometry question by deriving a logical expression that represents the meaning of the text and the diagram, and solving the geometry question by checking the satisfiability of the derived logical expression,” the researchers write in their paper.
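
To make that two-step recipe concrete, here is a minimal sketch in Python. It comes with loud assumptions: the logical form is hand-coded for a toy isosceles-triangle question (GeoS derives it automatically from the text and diagram), and satisfiability is checked by brute-force numeric search rather than by the authors' actual solver. The idea it illustrates is the same: a multiple-choice answer is selected if assuming it leaves the question's constraints satisfiable.

```python
# Hypothetical sketch of the two-step idea (not the authors' code):
# 1) the question's meaning is captured as logical constraints over variables,
# 2) each multiple-choice answer is tested for satisfiability against them.

# Step 1: hand-coded logical form for a toy question:
# "In triangle ABC, AB = AC and angle A is 40 degrees. What is angle B?"
def constraints(a, b, c):
    return [
        a + b + c == 180,  # angles of a triangle sum to 180 degrees
        b == c,            # AB = AC implies angle B = angle C
        a == 40,           # given: angle A = 40 degrees
    ]

choices = {"A": 40, "B": 60, "C": 70, "D": 80}

# Step 2: a choice is correct if fixing angle B to it leaves the
# constraint set satisfiable for some assignment of the other angles.
def satisfiable(b_value):
    for a in range(0, 181):
        c = 180 - a - b_value
        if c > 0 and all(constraints(a, b_value, c)):
            return True
    return False

for label, value in choices.items():
    if satisfiable(value):
        print(f"Answer: ({label}) angle B = {value} degrees")
```

Running the sketch prints choice (C), since only angle B = 70 keeps the whole constraint set satisfiable at once.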

The difficulty lies in analyzing the diagrams, which, in the context of geometry problems, usually provide information missing from the textual descriptions.
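
The paper's diagram interpreter is beyond the scope of a short example, but the rough idea of turning pixels into logical facts can be sketched. The snippet below is a hedged illustration that assumes OpenCV and a Hough transform; it is not GeoS's actual diagram parser, which is considerably more involved.

```python
# Toy, hypothetical sketch of diagram interpretation: detecting geometric
# primitives in an image so they can become literals in a logical form.
import numpy as np
import cv2

# Synthesize a tiny "diagram": a white canvas with one black line segment.
canvas = np.full((200, 200), 255, dtype=np.uint8)
cv2.line(canvas, (20, 180), (180, 20), color=0, thickness=2)

# Edge detection followed by a probabilistic Hough transform; each detected
# segment could become a literal such as line(A, B) in the logical form.
edges = cv2.Canny(canvas, 50, 150)
segments = cv2.HoughLinesP(edges, rho=1, theta=np.pi / 180, threshold=50,
                           minLineLength=30, maxLineGap=5)
if segments is not None:
    for x1, y1, x2, y2 in segments[:, 0]:
        print(f"line segment from ({x1}, {y1}) to ({x2}, {y2})")
```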

In alignment with AI2’s mission of AI research for the common good, all of GeoS’s source code and test data are available online. You can also watch a demo of how GeoS works.

While the test results may not seem that impressive on their own, it’s important to consider how much work it takes to achieve even these “average” results.

“Much of what we understand from text and graphics is not explicitly stated, and requires far more knowledge than we appreciate," AI2 CEO Oren Etzioni said in a press release. "Creating a system to be able to successfully take these tests is challenging, and we are proud to achieve these unprecedented results."

GeoS is, in fact, the first end-to-end system that can solve these types of plane geometry problems. But beyond that, GeoS signifies progress in a machine's ability to reason, as opposed to merely perceiving input and recognizing patterns, the areas where AI systems such as Apple's Siri have advanced in recent years.

Many experts say we’re still years away from the biggest AI advances. “Although they can do some individual tasks as well as or even better than humans, technology still cannot approach the complex thinking that humans have,” says Microsoft’s Allison Linn.

But GeoS does represent significant progress, even if it is a small step in the grand scheme of AI.

There’s no doubt that replicating human intelligence is complicated, but “we’re taking baby steps in that direction,” wrote Fortune's Derrick Harris.