Thanks to Google, a new artificial intelligence system is outperforming humans in spotting the origins of images.
Google has unveiled a new system to identify where photos were taken. The task is simple when images contain famous landmarks or unique architecture, but the system goes beyond the overt to examine small clues hidden in the pixels.
The program, named PlaNet, uses a deep-learning neural network, which means the more images PlaNet sees, the smarter it gets.
The impressive part? Its geotagging abilities are already better than any other program or human, and it's still getting smarter.
"PlaNet is able to localize 3.6% of the images at street-level accuracy and 10.1% at city-level accuracy. 28.4% of the photos are correctly localized at country level and 48.0% at continent level," wrote the research team.
That's still a long way from a reliable level of accuracy – but PlaNet already outperforms even the most well-traveled humans.
To compare PlaNet to human accuracy, the researchers matched their program against 10 well-traveled people in Geoguessr, a game that shows players a random street-view photo and asks them to identify where it was taken.
PlaNet and its human challengers played 50 rounds in total.
"PlaNet won 28 of the 50 rounds with a median localization error of 1131.7 km, while the median human localization error was 2320.75 km," according to the paper.
Other computer programs are tackling image location as well. Im2GPS relies on image retrieval to identify location. For example, if Im2GPS were trying to identify where a picture of a forest was taken, it would search the millions of forest photos on the internet. When it found one that looked almost identical, it would conclude the two were taken in the same place. With enough data, this method can achieve high accuracy, according to the paper.
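For readers who want to see the retrieval idea in concrete terms, here is a minimal Python sketch. It is not Im2GPS's actual code. It assumes each photo has already been reduced to a short list of numbers (a hypothetical "feature vector") and that the database pairs each vector with a known latitude and longitude. The function name and the data layout are illustrative only.

```python
import math

def locate_by_retrieval(query_feature, database):
    """Return the (lat, lon) of the database photo most similar to the query.

    `database` is a list of (feature_vector, (lat, lon)) pairs. Both the
    feature vectors and this layout are assumptions made for illustration.
    """
    if not database:
        raise ValueError("database must contain at least one geotagged photo")
    # Similarity here is plain Euclidean distance between feature vectors.
    _, location = min(
        database,
        key=lambda entry: math.dist(query_feature, entry[0]),
    )
    return location

# Two made-up reference photos with made-up features and coordinates.
reference_photos = [
    ([0.0, 0.0], (10.0, 20.0)),
    ([1.0, 1.0], (30.0, 40.0)),
]
guess = locate_by_retrieval([0.9, 0.8], reference_photos)
```

The guess is simply the location of the closest match, which is why this style of method depends so heavily on having a similar photo already in the database.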
The team behind PlaNet took a different approach:
"In contrast, we pose the problem as one of classification by subdividing the surface of the earth into thousands of multi-scale geographic cells, and train a deep network using millions of geotagged images."
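To make the classification framing concrete, here is a rough Python sketch. It swaps the paper's multi-scale cells for a much simpler fixed grid of equal-sized latitude/longitude squares, so it shows the general idea rather than PlaNet's actual partitioning. The function names, the 10-degree cell size and the minimum-photo cutoff are all assumptions made for illustration.

```python
from collections import Counter

def cell_id(lat, lon, cell_deg=10.0):
    """Map a coordinate to a (row, col) cell on a fixed lat/lon grid."""
    # Clamp the row so that lat == 90 falls in the top row, not past it.
    row = min(int((lat + 90.0) // cell_deg), int(180.0 / cell_deg) - 1)
    # Wrap longitude so that lon == 180 lands in the same column as -180.
    col = int(((lon + 180.0) % 360.0) // cell_deg)
    return row, col

def cell_center(row, col, cell_deg=10.0):
    """Return the (lat, lon) at the middle of a grid cell."""
    return (-90.0 + (row + 0.5) * cell_deg, -180.0 + (col + 0.5) * cell_deg)

def build_classes(photo_coords, cell_deg=10.0, min_photos=2):
    """Keep only cells with enough training photos; each one is a class."""
    counts = Counter(cell_id(lat, lon, cell_deg) for lat, lon in photo_coords)
    return sorted(cell for cell, n in counts.items() if n >= min_photos)

def predict_location(scores, classes, cell_deg=10.0):
    """Turn per-class scores (e.g. from a network) into a coordinate guess."""
    best = max(range(len(classes)), key=scores.__getitem__)
    return cell_center(*classes[best], cell_deg)

# Made-up training coordinates: two near Paris, one near New York.
training = [(48.85, 2.35), (48.86, 2.29), (40.71, -74.0)]
classes = build_classes(training)
```

In this toy setup a neural network would output one score per cell in `classes`, and the guess is the center of the highest-scoring cell. Cells with too few photos never become classes, so the model cannot guess them at all.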
The researchers trained the neural network using 29.7 million public photos from Google+. The neural network relies on clues and features from photos it has already seen to help identify the most likely whereabouts of a new image.
The program has some limitations. Because it depends on internet images, PlaNet is at a disadvantage when confronted with rural countrysides and other rarely photographed locales. The team also left out large swaths of the Earth, including oceans and the polar caps.
Tobias Weyand, the lead author on the project, noted that supplementing internet photos with satellite images resolved some of these weaknesses. PlaNet also focuses on landscapes and other factors besides landmarks, making it more accurate at identifying non-city images than other programs.
One more piece of good news for users interested in identifying locations on the go: PlaNet needs only 377 MB, so it could fit on a smartphone.