Doomsday predictions peppered the media when the Brookhaven National Laboratory (BNL) brought the Relativistic Heavy Ion Collider (RHIC) online two years ago. The collider, some said, could create a black hole that would absorb the earth. So far, said IT administrator Maurice Askinazi, RHIC only creates a black hole that absorbs all network storage capacity.
At the lab, scientists use RHIC to recreate the conditions of the Big Bang. The collider smashes gold ions into the smallest units of matter known to science at the highest energy levels man has achieved. Inside RHIC, two rings 2.25 miles in circumference intersect at four places. At these intersections, the gold particles from each ring collide. Energy released by the collisions melts the matter, creating quark-gluon plasma and, subsequently, a Big Bang-like "event" and lots of data.
Storing the creation of the universe is quite a challenge. Last year, for instance, the collider ran for two weeks and filled up 50T Bytes of disk storage.
"If we could afford a petabyte of disk storage, we'd buy it and could probably fill it up," said Askinazi, group leader of the Upton, N.Y.-based BNL Computing Facility.
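To put those figures in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the fill rate from last year's two-week run stays constant and uses decimal units (1 petabyte = 1,000T Bytes); both are simplifying assumptions for illustration, not figures from BNL, and the function names are hypothetical.

```python
# Rough estimate based only on the two figures quoted in the article.
# Assumptions (not from BNL): constant fill rate, 1 PB = 1000 TB.
RUN_DAYS = 14      # "two weeks"
RUN_TBYTES = 50    # disk storage filled during that run


def fill_rate_tb_per_day(tbytes, days):
    """Average amount of disk filled per day of running."""
    return tbytes / days


def days_to_fill(capacity_tbytes, rate_tb_per_day):
    """Days of continuous running needed to fill a given capacity."""
    return capacity_tbytes / rate_tb_per_day


rate = fill_rate_tb_per_day(RUN_TBYTES, RUN_DAYS)
petabyte_days = days_to_fill(1000, rate)

print(f"{rate:.2f} TB/day, about {petabyte_days:.0f} days to fill 1 PB")
```

Under those assumptions the collider fills roughly 3.6T Bytes a day, so a petabyte would last well under a year of continuous running.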
When Askinazi first came on board at the laboratory, he found an IBM RS/6000 server with an external SCSI disk handling the storage. "When they'd needed more storage they just daisy chained it," said Askinazi. "They had a table with a server on it and a bunch of JBOD daisy chains behind it."
Since hundreds of physicists rely upon the Computing Facility to protect their data, Askinazi wanted a storage system that was as cutting-edge as the engineering marvel it supports. A server with JBOD daisy chains didn't offer the scalability, reliability, or availability needed for accessing and analyzing huge volumes of data.
In his first storage system makeover, Askinazi brought in a rack-mounted server with RAID systems plugged into it. Using multiple RAID 5 systems with 2T Bytes of storage each would provide the availability needed, he reasoned. This system posed several problems, however. It didn't scale easily. Also, if there was a problem with the host bus adapter, or a cable, or even one of the controllers, the system would go down.
Next, Askinazi chose a server-based, direct-connect solution from Anaheim, Calif.-based MTI Technology Corp. It included a Fibre Channel switch between the server and the storage, two redundant controllers with failover capabilities, and good error-reporting tools.
Once again, Askinazi was disappointed with the results. If disks needed to be moved from one server to another, both servers had to be shut down and the host bus adapters had to be moved. "That is a killer," he said.
Askinazi decided that a Storage Area Network (SAN) solution was the next logical step. A SAN would allow for more portability of storage than direct connect, while maximizing the use of all the storage.
After looking at SAN solutions from Hitachi and EMC Corp., Askinazi brought in OSSI's SAN. Despite that vendor's promises, the SAN wouldn't work with BNL's Veritas software. Left with a gaping hole in his storage infrastructure, Askinazi turned to MTI. MTI quickly delivered and implemented a SAN based on its Vivant D100 direct-attach storage systems. The MTI option worked with Veritas and offered the performance needed.
"I've found that MTI is the most competitive with price but gives the same class of service as the big boys," Askinazi said.
In the current system, the raw data from detectors at the collision points in the RHIC is initially stored on a tape system from Louisville, Colo.-based StorageTek. One Intel-based Linux server farm is dedicated to processing raw event data from the detectors, and another handles scientists' analysis of reconstructed events.
Data regarding individual events is then moved in 1G Byte and 2G Byte files onto disks in the MTI Vivant D100 storage resources, where it can be analyzed and massaged by the scientists. The MTI D100 storage is served to the Linux farms through a group of NFS servers from Palo Alto, Calif.-based Sun Microsystems, backed by 80T Bytes of SAN-based RAID arrays.
Askinazi has pushed his systems up to 90% uptime. The MTI Vivant D100s include several redundant components and dual data I/O paths. Servers have multiple host bus adapters, and there are multiple paths both between the disks and the Brocade Fibre Channel switch and between the switch and the servers.
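Why redundant paths matter can be illustrated with a standard availability calculation. The Python sketch below is a simplified model, not a description of BNL's actual configuration: it assumes components fail independently, and the 99% per-component figure is a hypothetical placeholder rather than a measured value.

```python
from math import prod


def series(*avail):
    """Every component must work, so availabilities multiply."""
    return prod(avail)


def parallel(*avail):
    """Any one path is enough, so unavailabilities multiply."""
    return 1 - prod(1 - a for a in avail)


# Hypothetical per-component availability (not a BNL measurement).
A = 0.99

# One path: host bus adapter -> cable -> controller, all required.
single_path = series(A, A, A)

# Two independent copies of that path, either of which can carry I/O.
dual_path = parallel(single_path, single_path)

print(f"single path: {single_path:.4f}, dual path: {dual_path:.4f}")
```

In this model a single chain of three components is less available than any one of them, while duplicating the whole chain pushes availability back up, which is the reasoning behind multiple adapters and dual I/O paths.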
"With the SAN, if a user's file system is full, I can grow the file system without any server downtime at all," said Askinazi. Best of all, changes are transparent to the user. "One minute, the user's storage file is 100% full, the next it's only 50% full," he said. "I can easily double capacity without downtime."
BNL's scientists have been supportive of Askinazi's search for cutting-edge systems. "They're used to working with technologies no one has used before," Askinazi said. "They respect the fact that I'm working with solutions that are barely on most businesses' horizon. It's very satisfying."