How Google's neural network will improve YouTube

Researchers from Google unveiled software that analyzes videos using a neural network, a machine learning technology that some argue could lead to true artificial intelligence.

    A picture illustration shows a YouTube logo reflected in a person's eye, in the central Bosnian town of Zenica in June 2014. Google researchers last week unveiled software that automatically picks a high-quality thumbnail for a YouTube video using a machine learning technology called a deep neural network.
    Dado Ruvic/Reuters file

For many people who post videos on YouTube, an enticing thumbnail can determine whether viewers click on a video or scroll to the next one. But what if the video sharing site could pick the best image for each video automatically?

That was the question researchers from Google, which owns YouTube, attempted to answer recently by feeding thousands of high-quality images into a computer in order to train it to do what photographers would likely argue is a highly subjective task: select the best-quality image from each video on the site.

The company unveiled its automatic “thumbnailer” in a blog post last Thursday, explaining that the tool analyzes a user’s video at one frame per second, giving each frame a score. The software then selects the thumbnails with the highest quality scores and displays them.
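The sample-score-select loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Google's implementation: the function names, the `score` callback (standing in for the neural network's quality score) and the toy numeric "frames" are all hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def sample_one_per_second(frames: Sequence[float], fps: int) -> List[Tuple[int, float]]:
    """Keep the first frame of each second, returned as (second, frame) pairs."""
    return [(i // fps, frames[i]) for i in range(0, len(frames), fps)]


def pick_thumbnails(
    frames: Sequence[float],
    fps: int,
    score: Callable[[float], float],  # hypothetical stand-in for the trained network
    k: int = 3,
) -> List[Tuple[int, float]]:
    """Score one frame per second and return the k best as (second, score) pairs."""
    scored = [(second, score(frame)) for second, frame in sample_one_per_second(frames, fps)]
    # Highest score first; ties go to the earlier frame.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


# Toy input: each "frame" is just a number standing in for image data.
toy_frames = [0.2, 0.9, 0.7, 0.1, 0.5, 0.3, 0.8, 0.4]
best = pick_thumbnails(toy_frames, fps=2, score=lambda frame: frame, k=2)
print(best)
```

In a real system the frames would be decoded images and `score` would be a trained model; the selection step itself would look much the same.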


The technology builds on Google’s deep neural network of supercomputers that the company has been training to “think” and recognize images – such as identifying videos of cats on YouTube, even though the computer had no previous information about what a cat looked like.

Neural networks are one part of advances in so-called deep learning, a subset of machine learning that is designed to mimic higher-level thought and abstraction and may be one path toward developing true artificial intelligence, the Monitor reported in July.

But determining what makes a high-quality photo is an additional challenge, the researchers say.

“Unlike the task of identifying if a video contains your favorite animal, judging the visual quality of a video frame can be very subjective - people often have very different opinions and preferences when selecting frames as video thumbnails,” wrote Weilong Yang from Google’s Video Content Analysis team and Min-hsuan Tsai from the YouTube Creator team.

Part of the issue is that neural networks function in a different manner than the human brain, some researchers say. Like humans, neural networks are designed to learn “by example,” wrote Imperial College London researchers Christos Stergiou and Dimitrios Siganos, in a 2011 guide.

But to do this, they apply limited layers of computation to draw conclusions and perform specific tasks – such as pattern recognition – which differ from the “distributed, varied and compounded approach” used by human brains, Tufts University computer science professor Anselm Blumer told the Monitor in July.

This leads to an issue called “overfitting,” where “the network has memorized the training examples, but it has not learned to generalize to new situations,” a guide from the software developer Mathworks explains.

For Google’s network, this has led to some downright bizarre results when researchers fed images and video into the system, leading the computer to create new, artistic images of its own that often scarcely resembled the original. In July, researchers unveiled a series of hallucinatory images created by the software, such as horses sprouting dogs' heads and brightly colored glowing temples.

“A network like that is harder to train, and it’s much easier for it to come to false conclusions,” Professor Blumer told the Monitor.

The goal for machine learning researchers focused on neural networks is to introduce additional layers of abstraction into the process, better mimicking human brains in understanding concepts – like photographic composition or what may define an e-mail as spam – and allowing the computer to “learn” how to apply them.

In the case of YouTube’s thumbnail software, this training process appears to be succeeding.

In order to ensure the computer could distinguish high-quality images from low-quality ones, the Google researchers uploaded custom thumbnails created by YouTube users, which tended to be well framed and in focus, and designated these as high quality. They contrasted them with “low quality” images selected randomly from a sampling of videos.

This allowed the computer to “learn” about nuances of framing and composition and to favor images that emphasized a central character in the video – such as a music video performer or a family pet, two examples the researchers showed in their blog post.
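The labeling scheme the researchers describe, with user-made thumbnails as positive examples and randomly chosen frames as negatives, might be assembled roughly as follows. This is a minimal sketch, not the researchers' code: the function name, the string placeholders for images and the fixed random seed are assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple


def build_training_set(
    custom_thumbnails: Sequence[str],
    video_frames: Sequence[str],
    n_negatives: int,
    seed: int = 0,  # fixed seed so the sketch is reproducible
) -> List[Tuple[str, int]]:
    """Label custom thumbnails 1 ("high quality") and random frames 0 ("low quality")."""
    rng = random.Random(seed)
    positives = [(image, 1) for image in custom_thumbnails]
    # Negatives are drawn at random, without replacement, from ordinary video frames.
    negatives = [(image, 0) for image in rng.sample(list(video_frames), n_negatives)]
    examples = positives + negatives
    rng.shuffle(examples)  # mix the two classes before training
    return examples


# Placeholder file names stand in for real image data.
examples = build_training_set(
    custom_thumbnails=["thumb_a.jpg", "thumb_b.jpg"],
    video_frames=["frame_1.jpg", "frame_2.jpg", "frame_3.jpg", "frame_4.jpg"],
    n_negatives=2,
)
print(examples)
```

A classifier trained on pairs like these never sees an explicit rule about framing or focus; any such notion has to be inferred from the difference between the two groups.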

They also put the new images to a more subjective test, showing them to human subjects side by side with images from YouTube’s previous thumbnail software. People who looked at the two sets of images preferred the images selected by the neural network more than 65 percent of the time, the researchers wrote.
