We recently added four new racks to the Open Cloud Testbed. The racks are designed to support cloud computing, both clouds that provide on-demand VMs and clouds that support data intensive computing. Since there is not much information available describing how to put together these types of clouds, I thought I would share how we configured our racks.
These racks can be used as a basis for private clouds, hybrid clouds, or condo clouds.
In contrast to clouds designed around on-demand VMs, our racks are designed to support data intensive computing. We sometimes call these Raywulf clusters. Briefly, the goal is to make sure that there are enough spindles moving data in parallel, with enough cores to process the data being moved. (Our data intensive middleware is called Sector, Graywulf is already taken, and there are not many words left that rhyme with Beo-. Other suggestions are welcome; please use the comments below.)
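To make the spindle/core balance concrete, here is a rough back-of-the-envelope sketch. The ~70 MB/s sequential rate per SATA drive is an assumed figure for illustration, not a measurement from our racks:

```python
# Rough check of the spindle/core balance described above.
SPINDLES_PER_NODE = 4
CORES_PER_NODE = 4
MBPS_PER_SPINDLE = 70  # assumed sequential rate for a 1 TB SATA drive

node_bandwidth = SPINDLES_PER_NODE * MBPS_PER_SPINDLE  # MB/s per node
bandwidth_per_core = node_bandwidth / CORES_PER_NODE   # MB/s per core

print(f"per-node disk bandwidth: {node_bandwidth} MB/s")
print(f"disk bandwidth per core: {bandwidth_per_core:.0f} MB/s")
```

With one spindle per core, each core can stream data at roughly the rate of one disk, which is the balance the design aims for.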
Each rack costs about $85,000 (with standard discounts), consists of 32 nodes, and provides 124 cores with 496 GB of RAM and 124 TB of disk across 124 spindles; it consumes about 10.3 kW of power (excluding the power required for cooling).
With 3x replication, there is about 40 TB of usable storage available, which means that the cost to provide balanced long-term storage and compute power is about $2,000 per TB. So, for example, a single rack could be used as the basis for a private cloud that can manage and analyze approximately 40 TB of data. At the end of this note is some performance information about a single-rack system.
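As a quick sanity check, here is the storage arithmetic above in a few lines of Python (the raw capacity and replication factor are the figures quoted in the text):

```python
# Cost/capacity arithmetic for one rack, using the figures quoted above.
RACK_COST_USD = 85_000
RAW_STORAGE_TB = 124   # 31 compute/storage nodes x 4 x 1 TB drives
REPLICATION = 3        # 3x replication in the storage layer

usable_tb = RAW_STORAGE_TB / REPLICATION
cost_per_usable_tb = RACK_COST_USD / usable_tb

print(f"usable storage: {usable_tb:.1f} TB")              # about 40 TB
print(f"cost per usable TB: ${cost_per_usable_tb:,.0f}")  # about $2,000
```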
By way of comparison, there are specialized storage configurations, such as the pod designed by Backblaze, that provide 67 TB for about $8,000: roughly half the storage at a tenth of the cost. The difference is that Raywulf clusters are designed for data intensive computing using middleware such as Hadoop and Sector/Sphere, not just storage.

Each rack is a standard 42U computer rack and consists of a head node and 31 compute/storage nodes. We installed Debian GNU/Linux 5.0 as the operating system. Here is the configuration of the rack and of the compute/storage nodes:
- 31 compute/storage nodes (see below)
- 1 head node (see below)
- 2 Force10 S50N switches with two 10 Gbps uplinks, so that the inter-rack bandwidth is 20 Gbps
- 1 10GE module
- 2 optics and stacking modules
- 1 3Com Baseline 2250 switch to provide additional cat5 ports for the IPMI management interfaces
Each compute/storage node has the following configuration:

- Intel Xeon 5410 quad-core CPU with 16 GB of RAM
- SATA RAID controller
- 4 SATA 1 TB hard drives in a RAID-0 configuration
- 1 Gbps NIC
- IPMI management
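Multiplying the per-node configuration out over the 31 compute/storage nodes reproduces the rack-level totals quoted earlier:

```python
# Aggregate the per-node configuration over the 31 compute/storage
# nodes to reproduce the rack-level totals.
NODES = 31
CORES_PER_NODE = 4     # Intel Xeon 5410 quad core
RAM_GB_PER_NODE = 16
DISKS_PER_NODE = 4     # 1 TB SATA drives, one spindle each
DISK_TB = 1

print(NODES * CORES_PER_NODE)            # 124 cores
print(NODES * RAM_GB_PER_NODE)           # 496 GB of RAM
print(NODES * DISKS_PER_NODE * DISK_TB)  # 124 TB of disk, 124 spindles
```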
Benchmarks. We benchmarked the new racks using the Terasort benchmark, with version 0.20.1 of Hadoop and version 1.24a of Sector/Sphere. Replication was turned off in both Hadoop and Sector. All the racks were located within one data center. It is clear from these tests that the new versions of Hadoop and Sector/Sphere are both faster than the previous versions.
| Configuration | Sector/Sphere | Hadoop |
|---|---|---|
| 1 rack (32 nodes) | 28m 25s | 85m 49s |
| 2 racks (64 nodes) | 15m 20s | 37m 0s |
| 3 racks (96 nodes) | 10m 19s | 24m 14s |
| 4 racks (128 nodes) | 7m 56s | 17m 45s |
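The Terasort times above can be turned into scaling numbers. This sketch assumes the first time column is Sector/Sphere and the second is Hadoop; ideal speedup from 1 rack to 4 racks would be 4x:

```python
# Compute the measured speedup from 1 rack to 4 racks for both systems,
# using the Terasort times from the table above.
def to_seconds(minutes, seconds):
    return 60 * minutes + seconds

sphere = {1: to_seconds(28, 25), 4: to_seconds(7, 56)}
hadoop = {1: to_seconds(85, 49), 4: to_seconds(17, 45)}

sphere_speedup = sphere[1] / sphere[4]
hadoop_speedup = hadoop[1] / hadoop[4]

print(f"Sector/Sphere speedup, 1 -> 4 racks: {sphere_speedup:.2f}x")
print(f"Hadoop speedup, 1 -> 4 racks: {hadoop_speedup:.2f}x")
```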
The Raywulf clusters were designed by Michal Sabala and Yunhong Gu of the National Center for Data Mining at the University of Illinois at Chicago.
We are working on putting together more information about how to build a Raywulf cluster.
The photograph above of two racks from the Open Cloud Testbed was taken by Michal Sabala.