
Voice assistants can't understand Pittsburghese

If you're from Pittsburgh, "Sorry, I didn't get that" may be a refrain you're used to hearing from Alexa. Voice assistants, trained on standard American English, often trip up on requests in regional dialects.

Jessie Wardarski/ Pittsburgh Post-Gazette/AP
The Alexa Echo Dot by Amazon sits in a basement on April 5 in Greensburg, Pa. Voice assistants trained on standard American English, devoid of accent markers, have difficulty understanding Pittsburgh and other regional speakers.

Don't ask Alexa where to "warsh" the car.

She's probably going to spit out her catch-all for queries that she can't compute: "Sorry, I didn't get that."

Amazon's Alexa isn't the only disembodied voice plagued by misunderstanding the quirks of Pittsburgh and other regional speakers: Apple's Siri, Google Assistant, and Samsung's Bixby are all pretty bad at picking up on what they're laying down.

This new class of software agents is typically trained on standard American English, which is largely devoid of accent markers.

"The more you hear voices follow certain speech patterns, the easier you find it to understand," said Amazon spokesman Nate Michel. "For voice assistants, this is no different. They work best when the accent is well represented in the training data."

But fear not. Dozens of engineers are working on the Alexa machine translation team at Amazon's SouthSide Works office in Pittsburgh, and the staff is growing – Amazon HQ2 or not. Plus, a team of Pittsburgh-based researchers is a finalist in the Alexa Prize, Amazon's competition to build a conversational chatbot.

Perhaps directions to the car warsh are on the way after all.

"People understand Pittsburghese because they live among people who speak it," Mr. Michel said. "With enough examples of Pittsburghese, Alexa's understanding of the dialect should improve."

Alexa, do you understand?

More people are tapping into voice assistants daily for help with recipes, weather forecasts, and news or just to play games.

During the third quarter of 2017, Amazon's smart speaker sales jumped to about 5 million units, up from roughly 900,000 for the same period in 2016, according to data from Strategy Analytics, a business consulting firm focusing on digital consumers.

Voice assistants range in color, shape, and size by manufacturer, but they all include a speaker and microphone so that the computer and human owner may interact. While currently limited in capability, voice assistants can help users to order items online, play a song, or set a timer.

If those interactions are going to work, the computer must understand human input.

Alan Black, a professor at Carnegie Mellon University's Language Technologies Institute, said voice assistants are fed thousands of audio samples from people with different voices. Those snippets are also transcribed into text, and the machine learns to make associations between sounds and words.
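The basic idea can be sketched in a few lines. This toy example – not Amazon's actual pipeline, and with informal spellings standing in for real audio features – pairs pronunciations with transcribed words, then matches new input to the closest pronunciation it has seen. A dialect form like "warsh" that never appeared in training matches only weakly:

```python
from difflib import SequenceMatcher

# Toy "training data": pronunciations (spelled out informally) paired
# with their transcribed words. Real systems learn from thousands of
# transcribed audio clips; these strings are purely illustrative.
training_pairs = {
    "wahsh": "wash",
    "dawntawn": "downtown",
    "creek": "creek",
}

def recognize(sound: str) -> tuple[str, float]:
    """Return the trained word whose pronunciation best matches the input,
    along with a 0-to-1 similarity score."""
    best_word, best_score = "", 0.0
    for known_sound, word in training_pairs.items():
        score = SequenceMatcher(None, sound, known_sound).ratio()
        if score > best_score:
            best_word, best_score = word, score
    return best_word, best_score

# A standard pronunciation matches closely; the Pittsburgh "warsh"
# scores lower, so a real system might reject it outright.
print(recognize("wash"))
print(recognize("warsh"))
```

In a real recognizer the matching is statistical rather than string-based, but the failure mode is the same: pronunciations absent from the training data score poorly.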

The voice data isn't very representative of regional dialects – making Alexa a bit hit-or-miss with Pittsburgh words like chipped ham (a type of ham lunch meat) and gumbands (rubber bands).

"When you say things like 'yinz' [second-person plural pronoun] or 'n'at' ['and that'] to Alexa and she doesn't understand you, you stop doing that," Professor Black said. "You're discouraged from doing that."

When asking where to "warsh" the car, for example, you might rephrase to ask where the nearest gas station is because those may have car wash stations.

Pittsburghers' dubious relationship with an extra possessive "s" at the end of store names – turning the grocery store "Aldi" into "Aldi's" – isn't usually enough to trip up a voice assistant. Alexa can also deal with dropped being verbs, so don't worry if "the carpet needs vacuumed." Hint: If nothing looks awry in that sentence, it should be written as "the carpet needs to be vacuumed."

Black, who has taught a course on Amazon Alexa, said the systems are getting better at interpreting accents that represent a large swath of the country like American English in the South (think: y'all).

But there are more than two dozen variations of American English in the United States, usually separated by geographic location. That's not including accents of people who learn English as a second or third language.

Pittsburghese – formally called Western Pennsylvania English – is the result of Scots-Irish, German, Polish, Ukrainian, and Croatian immigrants mixing together in the 18th and 19th centuries. Each provided loanwords, words adopted from a donor language without translation.

Some Scots-Irish words, for example, are "redd up" (to tidy up), "nebby" (nosy), "yinz," and "slippy," according to research by Barbara Johnstone, professor of linguistics and rhetoric at CMU, and Scott Kiesling, associate professor of linguistics at the University of Pittsburgh.

Alexa, let's chat

There's value in banter – whether it's small talk about the weather or something more involved. Humans tend to learn that skill relatively quickly. For voice assistants, it's a struggle.

Enter the Alexa Prize competition, a bid to create a chatbot that can sustain a conversation. If it works, you'll be able not only to talk to Alexa about the Pittsburgh Steelers but also to debate whether she thinks Le'Veon Bell will ever get a long-term contract.

CMU is building one of these agents as a finalist in the 2018 Alexa Prize contest.

The 13-member team is one of eight teams that have secured $250,000 in funding from Amazon to continue research. The grand prize is $500,000. If the team builds a chatbot that can sustain a conversation for 20 minutes, CMU will land a $1 million grant. Last year, winners were announced around November – and no team managed to get Alexa to babble for 20 minutes.

Right now, it's possible to ask Alexa questions that seem to elicit intelligent responses, such as if she enjoys "Star Wars." The fact that she may respond to a particular reference like that says more about the engineers who programmed the voice agent than the person talking to it.

This summer, Echo owners will test drive all eight Alexa chatbots, including the CMU prototype.

By saying, "Alexa, let's chat," users will be able to converse with the bots, score them, and leave comments.

There are other ways to interact with the voice assistant without giving up your "yinz."

Danielle Heberling, a software engineer from Portland, Ore., has been developing an Alexa "skill," or an app, that serves as a Pittsburghese dictionary for out-of-towners.

Originally from New Kensington, Pa., Ms. Heberling developed "hey yinz" as a way to experiment with the Alexa platform.

"I've moved all over the country," she said. "I would say some things that I didn't even realize I was saying, and people not from western Pennsylvania didn't understand."

The app translates what the user says, like a rudimentary version of a speech translation program.

So if you say "clicker," Alexa will return with "remote." If you say "crick," Alexa will say "creek."
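At its core, that kind of skill is a lookup table. A minimal sketch – the word list is drawn from examples in this article, not from Ms. Heberling's open-source code:

```python
# A minimal Pittsburghese-to-standard-English lookup, in the spirit of
# the "hey yinz" skill. Entries come from examples in the article.
PITTSBURGHESE = {
    "clicker": "remote",
    "crick": "creek",
    "gumbands": "rubber bands",
    "redd up": "tidy up",
    "nebby": "nosy",
    "yinz": "you all",
    "slippy": "slippery",
}

def translate(word: str) -> str:
    """Return the standard-English equivalent, or echo the word if unknown."""
    return PITTSBURGHESE.get(word.lower().strip(), word)

print(translate("clicker"))  # -> remote
print(translate("crick"))    # -> creek
```

A real Alexa skill wraps this lookup in an intent handler, but the hard part – as Ms. Heberling notes below with "dahntahn" – is getting the speech recognizer to hear the dialect word correctly in the first place.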

Still, Heberling said some words, like "dahntahn" (downtown), are tricky.

"It might be the way I phonetically coded it," she said. "I'm still trying to figure it out. I made it open source so people could try to mess with it."

Alexa, will HQ2 come here?

CMU's Language Technologies Institute has been gaining clout since the earliest days of voice assistants, not only building out a pipeline of graduates for the industry but also directly impacting today's agents.

Black said it's the largest program in the country, with at least 350 graduate students. He said that upon graduation about half will move on to companies like Amazon, Google, and Apple.

The institute's relationship with the tech industry dates further back than Alexa – CMU was instrumental in Apple commercializing its Siri voice assistant.

Black noted that in 2011, Menlo Park, Calif.-based research firm SRI International assembled a team of university researchers to build a software program that could filter and process noisy speech. The Defense Advanced Research Projects Agency put $13 million toward the effort.

By 2012, language technologists were in hot demand.

"I remember when there was only one employee at Amazon," said Black, reflecting on the Seattle online retailer's 2015 acquisition of Safaba, a developer of automated text translation services founded by two CMU graduates.

That group would lead the charge at the SouthSide Works office, where graduates continue to trickle in to work on the machine learning team for Alexa. Nationwide, more than 5,000 employees focus on Alexa.

Black believes that as long as Amazon remains focused on its voice assistant, the institute is vital in the Pittsburgh region's bid for the company's planned second headquarters.

In the meantime, don't hold your breath waiting for Alexa to suggest the best deli-fresh chipped ham. 

This article was reported by The Pittsburgh Post-Gazette.
