RIM global outage caused by core switch failure
UPDATE: Vodafone Ireland has issued a statement in relation to the ongoing BlackBerry issues:
"BlackBerry subscribers may be experiencing ongoing issues with BlackBerry services in Ireland and other countries. Research in Motion (RIM) has identified the issue, is clearing the backlog of data and is working to restore services as quickly as possible."
BlackBerry service delays experienced by users around the world on Tuesday were caused by a core switch failure within the infrastructure of Research in Motion (RIM), the company said late Tuesday.
A RIM spokesman said service began returning to normal around 7pm GMT, although there would be further delays as data backlogs were cleared. It was the second outage or "delay," as RIM put it, in two days affecting users in numerous countries.
RIM’s system is designed to fail over to a back-up switch, but the failover system "did not function as previously tested," according to a statement issued by RIM at 10pm.
When the failover did not function, a backlog of data was generated. The company is working to clear that backlog.
"RIM has failed again at what plagued them in past outages, which is to provide a comprehensive disaster recovery solution," Ken Dulaney, an analyst at Gartner, said after the cause of the outage had been made public.
Dulaney said that while switches can fail, "there should be automatic ways in which the system recovers from this type of event. Any vendor who runs this type of mission critical service must constantly be reviewing disaster recovery solutions."
The latest problems occurred in two phases, with a 12-hour outage Monday morning affecting some BlackBerry users in Europe, the Middle East and Africa, according to RIM. That problem was fixed, the company said, without explaining the cause.
Then at about 3pm Tuesday, wireless carriers in the UK and Egypt reported outages that continued for hours.
RIM said an hour later that the delays affected some customers in South America, Europe, the Middle East, Africa and India, but didn’t immediately offer an update about the underlying problem.
Tweets and other reports blamed a server outage in Slough, UK, where RIM operates a data center, but the company would not comment on those reports. The Slough data center would serve much of Europe and the Middle East, analysts said. RIM also runs a data center near its headquarters in Waterloo, Ontario.
But a data center outage in the UK or Canada probably wouldn’t explain service problems in South American countries, such as Brazil, Chile and Argentina, analysts noted.
RIM doesn’t usually explain the cause of its outages and disruptions. In the past, those outages have lasted one or two days and affected only a certain region of a country or a portion of a continent, not several continents as happened Monday and Tuesday.
In March 2010, there was an outage in both North America and the UK on Wi-Fi-ready BlackBerry devices that were not connected to Wi-Fi. A more severe December 2009 outage in North America was related to a BlackBerry Messenger update.
International
Users on Blackberryforums.com also took note of the problems Monday. One contributor, MrTuck, reported that "50% of the population with BlackBerry devices across Europe, Middle East and Africa are unable use their Internet, BlackBery Messenger, Facebook Twitter, Email and other applications." He noted that calls and texts were working normally at the time.
Some UK reports said the Monday problem seemed to be related to BlackBerry Internet Service customers, who are mostly consumers and small businesses.
Dulaney said the problem Tuesday seemed to be related to both BIS and BlackBerry Enterprise Server, which is used by larger businesses with email and other functions routed through a server placed inside a corporation and its firewall for added management and security. An editor for IDG News Service based in Paris who uses a BES server in Boston reported he was not affected by the Monday outage. He was, however, affected by the Tuesday problems and could not receive email.
RIM has not said whether BlackBerry BIS or BES or both were affected in either Monday’s or Tuesday’s delays. RIM also didn’t explain what it meant by a "delay," since some users were able to tweet or post comments saying that they could not receive certain services at all.
IDG News Service