Online stock exchange enhances performance, reliability

By Elisabeth Horwitt, Storage Networking World Online
23 June 2003

An online stock exchange is not the kind of computing environment that can greet an occasional system outage or dip in service levels with a shrug and a “better luck next time.” As a recent entry in this highly competitive business sector, Archipelago needed to deliver a high level of performance and reliability from the get-go, while keeping costs to a minimum, says CTO Steve Rubinow.

In its effort to fulfill that mandate, one of the firm’s major challenges was ensuring that applications received the storage capacity they needed, when they needed it, if not before, Rubinow notes. World events can cause sudden spikes in application activity and in storage demand. “Last year during the Enron crisis, the whole world went crazy,” Rubinow recalls. “We have to make sure there aren’t any bottlenecks in storage or network capacity to slow response time down.”

That’s why Archipelago decided to implement SAN-based virtualized storage resource management. “A lot of people in the industry talk about doing this, but we went ahead and did it,” he declares.

The legacy environment
The online stock exchange’s original installation consisted of direct-attached HP/Compaq and EMC storage subsystems. When an application or server group began to run out of storage, someone in IT would go out and buy another disk. As a result, the company was paying a heavy price not only for storage hardware but also for installing and managing all those additional disks, even as many storage devices remained underutilized.

Archipelago found out just how wasteful this storage allocation method was when EMC did an enterprise-wide capacity utilization analysis. “We found that across the EMC and StorageWorks installations, utilization was about 25% to 30%. That was pretty inefficient,” Rubinow comments.

Archipelago solved the problem by moving from direct-attached storage to a SAN-based storage utility. The new platform, which went into production in February, consists of a 128-port Brocade Fibre Channel switch connecting storage subsystems at the main data center in Chicago. In addition to the existing HP/Compaq and EMC storage arrays, an eight-terabyte Hitachi Lightning 9900 subsystem has been installed to run the exchange and related processes.

To ensure business continuity, another crucial priority for a stock exchange, an ATM link supports ongoing replication between a Hitachi Lightning 9900 residing at the Chicago data center and another 9900 at a backup facility in New York. Hitachi’s TrueCopy software manages the replication.

Veritas’s SANPoint Control platform provides real-time monitoring of storage capacity, notifying administrators of allocation levels so they can purchase storage on a just-in-time basis. Using the platform’s virtualization capabilities, administrators can set up new volumes and LUNs across heterogeneous subsystems, using available capacity much more efficiently, Rubinow explains.
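The just-in-time idea can be illustrated with a small sketch. It is not SANPoint Control’s interface or Archipelago’s configuration; the array names, capacities and the 80% alert threshold are all hypothetical, chosen only to show how per-array allocation levels might flag when to buy storage and how pooling capacity changes the overall utilization picture.

```python
from dataclasses import dataclass


@dataclass
class Array:
    """One storage subsystem. Names and sizes used below are illustrative only."""
    name: str
    capacity_tb: float
    allocated_tb: float

    @property
    def utilization(self) -> float:
        # Fraction of the array's capacity that is already allocated.
        return self.allocated_tb / self.capacity_tb


def arrays_needing_capacity(arrays, threshold=0.80):
    """Return names of arrays whose allocation meets or exceeds the threshold.

    The 0.80 default is an arbitrary assumption, not a figure from the article.
    """
    return [a.name for a in arrays if a.utilization >= threshold]


def pooled_utilization(arrays):
    """Utilization of all arrays treated as a single pool of capacity."""
    total_capacity = sum(a.capacity_tb for a in arrays)
    total_allocated = sum(a.allocated_tb for a in arrays)
    return total_allocated / total_capacity


# Hypothetical inventory: one nearly full array, one mostly idle.
arrays = [
    Array("array-a", capacity_tb=8.0, allocated_tb=7.0),
    Array("array-b", capacity_tb=4.0, allocated_tb=1.0),
]

print(arrays_needing_capacity(arrays))
print(f"{pooled_utilization(arrays):.0%}")
```

In this made-up inventory, the nearly full array would trigger a purchase if managed in isolation, while the pooled view shows that spare capacity still exists elsewhere, which is the inefficiency a shared storage utility is meant to remove.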

Major payback
The payback: Storage utilization levels have reached 85% on Archipelago’s Hitachi array. In addition, Veritas’s SANPoint Control provides real-time SAN monitoring capabilities. Using the platform’s policy-based management facilities, Archipelago can respond automatically to alerts and performance data. The platform allows administrators to do end-to-end monitoring and troubleshooting, from HBA to disk.

The migration went relatively smoothly, Rubinow reports. “The challenge was not the implementation itself, but the tight timeframe we needed to adhere to from a business standpoint while ensuring that services were not disrupted,” he states.

In addition, IT staff had to get up to speed fast on several new platforms: the Hitachi Lightning 9900, Veritas’s SANPoint Control, plus an entirely new Sun environment to run the trading system. “We brought all those in, and we were writing software at the same time, and installing systems in both New York and Chicago,” Rubinow says. Just the same, the installation was completed on schedule.

Rubinow’s group recognized that its leading-edge SRM strategy posed risks, at least in the short term. “Because we were using products from multiple vendors, problem resolution time could have increased significantly,” he notes. “And if the number of problems increased, operational costs could increase.”

But these risks paled beside the potential downside of not doing the project at all, which entailed losing a competitive edge. As he puts it, “If we don’t provide a consistent level of service to customers, they can switch to another provider with a few keystrokes.”

Rubinow’s group took several steps to minimize the risk of negative impact on end users and applications. The group designed clustered server platforms that can scale up in near real-time, in response to spikes in demand. Scaling techniques include extending the number of nodes in clusters to handle increasing trading loads, adding disks dynamically to databases as they grow, and adding Fibre Channel switch ports and disks to Hitachi cabinets as requirements dictate.

Through SANPoint Control, IT administrators monitor components throughout the trading day and respond quickly to events and fluctuating transaction levels.

The bottom line
The implementation of a SAN-based storage utility has enhanced Archipelago’s ability to scale, monitor and manage its storage infrastructure, which is critical for reacting quickly to customer demands.

Archipelago hopes to extend the capabilities of its new platform in several directions over the next year. For example, the IT group wants to exploit SANPoint Control’s ability to automatically allocate new capacity to applications on a dynamic as-needed basis, with little or no human intervention.

Also on Rubinow’s to-do list is integrating SANPoint Control with Archipelago’s network management platform, Hewlett-Packard’s OpenView. That way, network administrators could monitor end-to-end network connections across SANs, LANs and WANs.

“The more automated we can make monitoring and getting a picture of the environment, the easier it will be to stop problems, and the more efficient our staff can be,” Rubinow asserts. “This is key, because we’re pretty lean.” While Archipelago’s total IT department comprises 100 people, only about four to six of them are concerned with storage, and then only on a part-time basis.

The company is also discussing the possibility of installing a second backup site closer to Chicago, which would provide automated fail-over should the primary site go down. But right now, Archipelago is fine-tuning and getting comfortable with its new SAN-based, virtualized, heterogeneous storage resource management environment.

“We’ve implemented a lot of new technology in the last few months,” Rubinow comments. “We’ve got to learn to walk before we can run.”

Plus, you never know what will happen next in the world. Says Rubinow, “Someone mentioned to me five minutes ago, ‘If we capture Saddam, the stock market may go bonkers. We need to plan for this.’”