Companies Test Backup Plans, and Learn Some Lessons

By Michael Hickins, The Wall Street Journal
30 October 2012
URL: http://online.wsj.com/article/SB10001424052970204840504578089083861617790.html

Natural disasters of the past few years taught companies the importance of good backup systems—but Hurricane Sandy showed many that there is still a lot to learn.

The good news is that many businesses spared themselves a great deal of economic loss by keeping duplicates of their data, email systems and other applications far from their primary data centers. Employees were able to access those systems from their homes or elsewhere via software and mobile devices.

Still, many companies have struggled to cope with the effects of the storm. Some found that their critical systems weren’t in fact backed up, or that their duplicate systems weren’t located far enough apart to provide the desired diversity. Others learned their plans were hampered by the inability of their employees to access those systems via the Internet, or to travel to alternate locations that needed tending.

Alex Tabb, partner at financial and technology consulting firm Tabb Group, noted that financial institutions “invested a lot of money to create business continuity plans, and…technologically these institutions are entirely capable of running in the aftermath of a storm like Sandy.”

However, the main exchanges had to close because the sites were inaccessible and the institutions were concerned about the safety of the people involved, he said.

NYSE Euronext’s primary data center, a 400,000-square-foot, two-level edifice, is located in Mahwah, N.J. Though that is fairly close to its New York City offices, the company said the facility has been built to withstand high winds and flooding.

Another financial services firm, foreign exchange trading platform FXall, has a data center in New Jersey and another “out of region,” according to its technology chief, Steve Rubinow. The company’s trading systems are duplicated, meaning it can switch trading activities to an alternate location if the primary center goes down.

Mr. Rubinow said the goal for the business continuity plan is for “customers that aren’t in the affected area to be able to conduct their business.” FXall, which has customers around the world, has continued operating throughout the week without interruption.

Freshpair.com, a New York-based 50-person retailer of undergarments, has been able to maintain most of its operating capability. Most of the company’s applications, including email, tools for managing paid online search programs and business analytics software, are running in remote locations across the country.

But its software-based phone system, which includes a feature that lets employees route calls to their home phones or cellphones, isn’t duplicated at a remote location. Because its headquarters, located at the corner of Broadway and Houston Street, is inaccessible to employees, they can’t take customer service calls.

“That’s one of the key lessons learned—what happens if the office loses power. We’re prepared for every eventuality except for that,” said Matthew Butlein, the president of the company. Mr. Butlein said the company will explore the possibility of duplicating the phone software at another location.

Companies located outside New York City have also been affected by the storm-related outages. MailChimp, an email marketing and list management company in Atlanta, has email and Web servers running in multiple locations around the country, including Dallas, New York and Seattle. Each location serves as a backup for the others, but is also the primary center for its region of the country.

According to Joe Uhl, who manages technology infrastructure at MailChimp, two of the three New York locations belonging to the company that hosts its servers failed because of flooding and loss of power. But his company’s equipment happened to be at the third location. “We dodged a bullet there,” said Mr. Uhl.

If it hadn’t dodged that bullet, some of the company’s customers wouldn’t have been able to access the Web portals they use to create new email campaigns until those portals had been replicated at another center. “It wouldn’t have been seamless,” Mr. Uhl said, adding that the company is likely to explore improving its ability to do that type of replication.

FXall’s Mr. Rubinow said the company has rehearsed business continuity plans, but that the dry runs were no substitute for the real thing. “In concept [business continuity is] simple and straightforward. But you are never able to test these things in real life conditions.”

The firm also sent a small team of engineers to a hotel close to the New Jersey facility “in case someone needs to do some work,” Mr. Rubinow said.

Mr. Rubinow said company executives, meeting by phone over the last two days, have looked at ways to improve business continuity plans, so that “all of our checklists become more detailed.”

People are a double-edged sword in this context, said Mr. Rubinow. “In any computer system, the weakest element is the human one.” However, he added that in unexpected circumstances, there is no algorithm that will adapt as well as a person, and “that’s where the human’s ability to deal with uncertainty shines.”