PlaNet, Google's latest AI, has amazing accuracy with photo locations

Google has developed a deep-learning neural network program that beats well-traveled humans at guessing where a photo was taken.

Peter Power/Reuters/File
A neon Google sign stands in the foyer of Google's new Canadian engineering headquarters, Jan. 14, 2016. Google's new system can tell where a picture is taken more reliably than humans.

Thanks to Google, a new artificial intelligence system is outperforming humans in spotting the origins of images.

Google has unveiled a new system to identify where photos are taken. The task is simple when images contain famous landmarks or unique architecture, but the system goes beyond the overt to examine small clues hidden in the pixels.

The program, named PlaNet, uses a deep-learning neural network, which means the more images PlaNet sees, the smarter it gets.

The impressive part? Its geotagging abilities are already better than those of any other program or human, and it's still getting smarter.

"PlaNet is able to localize 3.6% of the images at street-level accuracy and 10.1% at city-level accuracy. 28.4% of the photos are correctly localized at country level and 48.0% at continent level," wrote the research team. 

That's still a long way from a reliable level of accuracy – but PlaNet already outperforms even the most well-traveled humans.

To compare PlaNet's accuracy with that of humans, the researchers matched their program against 10 well-traveled people in Geoguessr, a game that shows players a random street-view photo and asks them to identify where they believe it was taken.

PlaNet and its human challengers played 50 rounds in total.

"PlaNet won 28 of the 50 rounds with a median localization error of 1131.7 km, while the median human localization error was 2320.75 km," according to the paper. 
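For readers curious how a "median localization error" could be computed, here is a minimal sketch. It assumes each guess and each true location is a (latitude, longitude) pair in degrees and measures the gap with the haversine great-circle formula. The function names and the Earth radius constant are illustrative choices, not details taken from the paper.

```python
import math
from statistics import median

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumed constant


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def median_localization_error_km(guesses, truths):
    """Median distance between paired (lat, lon) guesses and true locations."""
    return median(haversine_km(*g, *t) for g, t in zip(guesses, truths))
```

The median is less sensitive than the mean to a handful of wildly wrong guesses, which may be one reason to report it for a game in which a single round can be off by half the globe.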

Other computer programs are tackling image location as well. Im2GPS has achieved high accuracy by relying on image retrieval to identify location. For example, if Im2GPS were trying to identify where a picture of a forest was taken, it would search through millions of forest photos from the internet. When it found one that looked almost identical, it would conclude the two were taken in the same place. With enough data, this method can achieve high accuracy, according to the paper.
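The retrieval idea can be sketched in a few lines. This is only an illustration of nearest-neighbor lookup, not Im2GPS itself: it assumes every photo has already been reduced to a numeric feature vector, and the function name, the feature vectors, and the use of Euclidean distance are all hypothetical.

```python
import math


def nearest_neighbor_location(query_features, database):
    """Return the (lat, lon) of the stored photo most similar to the query.

    database is a list of (feature_vector, (lat, lon)) pairs. Similarity is
    plain Euclidean distance between feature vectors (an assumption).
    """
    best = min(database, key=lambda entry: math.dist(query_features, entry[0]))
    return best[1]
```

The sketch also shows why such a method depends on the amount of data: the answer can only ever be a location that already appears in the database.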

The team behind PlaNet took a different approach:

"In contrast, we pose the problem as one of classification by subdividing the surface of the earth into thousands of multi-scale geographic cells, and train a deep network using millions of geotagged images."
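To make the classification framing concrete, here is a simplified sketch. It swaps the multi-scale cells the researchers describe for a uniform 10-degree latitude/longitude grid, and it assumes some trained classifier (not shown) has already produced one probability per cell. The grid size and all function names are illustrative assumptions.

```python
CELL_DEG = 10.0                # assumed uniform cell size, not multi-scale
COLS = int(360 / CELL_DEG)     # cells per row of longitude
ROWS = int(180 / CELL_DEG)     # cells per column of latitude


def cell_index(lat, lon):
    """Map a (lat, lon) in degrees to the class label of its grid cell."""
    row = min(int((lat + 90.0) // CELL_DEG), ROWS - 1)
    col = min(int((lon + 180.0) // CELL_DEG), COLS - 1)
    return row * COLS + col


def cell_center(index):
    """Return the (lat, lon) at the center of a grid cell."""
    row, col = divmod(index, COLS)
    return (-90.0 + (row + 0.5) * CELL_DEG, -180.0 + (col + 0.5) * CELL_DEG)


def predict_location(cell_probabilities):
    """Pick the most probable cell and report its center as the guess."""
    best = max(range(len(cell_probabilities)),
               key=cell_probabilities.__getitem__)
    return cell_center(best)
```

During training, each geotagged photo would be labeled with `cell_index(lat, lon)`. At prediction time the guess is only as precise as the cell, which suggests why finer cells in heavily photographed areas would help.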

The researchers trained the neural network using 29.7 million public photos from Google+. The neural network relies on clues and features from photos it has already seen to help identify the most likely whereabouts of a new image.

The program has some limitations. Because it depends on internet images, PlaNet is at a disadvantage when confronted with rural countrysides and other rarely photographed locales. The team also left out large swaths of the Earth, including oceans and the polar caps.

Tobias Weyand, the lead author on the project, noted that supplementing internet photos with satellite images resolved some of these weaknesses. PlaNet also focuses on landscapes and other factors besides landmarks, making it more accurate at identifying non-city images than other programs.

One more piece of good news for users interested in identifying locations on the go: PlaNet only needs 377 MB, so it could fit on smartphones.
