Google’s Gmail service suffered a widespread outage Friday that lasted just over an hour but that felt like days to some users of the free e-mail service, highlighting how reliant on the service a broad swath of the globe has become.
At 2:12 p.m. Eastern Standard Time, Google reported it was “investigating” a Gmail outage. By 3:23 p.m. the company reported on its Google Apps dashboard: “The problem with Gmail should be resolved. We apologize for the inconvenience and thank you for your patience and continued support.”
Still, the Gmail outage was magnified by its wide reach, reportedly affecting parts of Europe, the US, Canada, India, and elsewhere, according to TechCrunch, the technology reporting website. It also, predictably perhaps, sent the Twitter-sphere into overdrive with speculation and beefing about the outage.
“I have had nothing but TROUBLE with my Gmail accounts today!! I had so much to do now I have to work through the night,” tweeted Tina Graves, who is described on her Twitter page as an aspiring author.
“Oh, thank goodness, GMail is back,” another Twitterer opined. “I was nervous I’d miss out on my Bed, Bath, and Beyond newsletter.”
As Gmail outages go, this one was not huge – lasting 1 hour, 11 minutes, according to the company’s Google Apps dashboard webpage. Google experienced a widespread Gmail outage in 2009. On Sept. 23 of last year, a Gmail outage lasted more than four hours and delayed 29 percent of all e-mails sent during the work day, with 1.5 percent of e-mails delayed for hours. Gmail was reported last year to have at least 400 million users.
In that case, the problem was attributed to a rare double network failure of two independent systems that were supposed to back each other up. In Friday’s case, the company is still investigating the cause. Past cases have often involved internal upgrades that went awry, data industry experts say.
On at least four occasions, Gmail downtime has been traced to software updates in which bugs triggered unexpected consequences, reports Data Center Knowledge, an industry trend-watching website. In February 2009, a software update “overloaded some of Google’s European network infrastructure, causing cascading outages at its data centers in the region that took about an hour to get under control,” DCK reported. In September 2009, Google underestimated the impact of a software update on traffic flow between network equipment, overloading key routers – until other equipment was added to dilute the flow, the site said.
The timing of the outage was unfortunate for Google for another reason: Its system-reliability engineers were – at almost the same moment that Gmail went down – making themselves available to answer questions from the curious on the news site reddit.com, via chat, about how Google ensures its system reliability.
“Hello, reddit! We are the Google Site Reliability Engineering (SRE) team. Our previous AMA from almost exactly a year ago got some good questions, so we thought we’d come back and answer any questions about what we do, what it’s like to be an SRE, or anything else.”