The WFU DEAC Cluster provides the critical infrastructure necessary for researchers to reliably upload research codes, perform large-scale computations, store their actively used results, and have confidence in the persistence of their data in the event of storage failures. A comprehensive list of these services can be found on the Services page.

Below is a list of the hardware resources currently in production within the WFU DEAC Cluster facility.

64-bit Hardware

29 IBM HS22 Blade systems: each containing two processors, two embedded gigabit ethernet adapters, and a 73GB 10K SAS disk. The HS22 blade is based on the Intel "Nehalem" processor line, providing significantly improved memory bandwidth through a non-uniform memory architecture (NUMA). The specific combinations of processor core number and memory configurations are:
14 "Quad Core, Infiniband" nodes:
Xeon E5530 at 2.40GHz, 8MB Cache, 1066MHz, 48GB RAM
15 "Quad Core, Ethernet" nodes:
Xeon E5530 at 2.40GHz, 8MB Cache, 1066MHz, 48GB RAM
73 IBM HS21XM Blade systems: each containing two processors, two embedded gigabit ethernet adapters, and a 73GB 10K SAS disk. The specific combinations of processor core number and memory configurations are:
24 "Quad core, Infiniband" nodes:
Xeon E5430 at 2.66GHz, 16GB RAM
49 "Quad core, Ethernet" nodes:
Xeon E5345 at 2.33GHz, 16GB RAM (28 nodes)
Xeon E5430 at 2.66GHz, 16GB RAM (14 nodes), 32GB RAM (7 nodes)
37 IBM HS21 Blade systems: each containing two processors, two embedded gigabit ethernet adapters, and a 73GB 10K SAS disk. The specific combinations of processor core number and memory configurations are:
37 "Dual core, Ethernet" nodes:
Xeon 5160 at 3.0GHz, 8GB RAM

Storage

60 TB for Research Data: The WFU DEAC Cluster utilizes multiple storage devices attached to a SAN (Storage Area Network); several disk-server infrastructure nodes export this space to the cluster. The specific storage devices, configurations, and primary functions are:
IBM DS4200 Storage Device: Primary function is the principal data store for home directories and actively used research data on the cluster.
13TB via 32 SATA drives, 500GB capacity, 7200 RPM
27TB via 48 SATA drives, 750GB capacity, 7200 RPM
IBM DS3400 Storage Device: Provides storage for the infrastructure systems and services as well as supports large data stores needed for research data.
5.4TB via 24 SAS drives, 300GB capacity, 10000 RPM (infrastructure, VMs)
20TB via 24 SATA drives, 1TB capacity, 7200 RPM (research data)

Network

Parallel Programs and Inter-processor Communication
Many parallel computation problems require substantial coordination traffic between processors and nodes to keep calculations accurate and consistent. While bandwidth can be a significant component of this inter-processor communication (IPC), the messages passed are typically quite small, making the latency incurred in creating and transmitting the data a critical performance factor.

Traditional gigabit Ethernet switches typically have latencies in the 50-80 microsecond range. For a CPU capable of 3 gigaflops, which translates to 3 floating point operations every nanosecond, a single 50 microsecond wait represents roughly 150,000 forgone operations, so processors can sit idle for extraordinary lengths of time waiting for data from a participating node.

The industry-standard high-speed interconnect technology available today is Infiniband (IB). Both the IB specification and measured performance of these adapters and switch technologies yield node-to-node message latencies of around 1-2 microseconds.
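
To make the latency discussion concrete, below is a minimal sketch of the kind of MPI "ping-pong" test commonly used to measure node-to-node message latency. The 8-byte payload, the iteration count, and the use of MPI itself are illustrative assumptions, not a DEAC-specific benchmark: rank 0 exchanges a small message with rank 1, and half the average round-trip time approximates the one-way latency.

    /* Minimal MPI "ping-pong" latency sketch (illustrative only: the 8-byte
     * payload, iteration count, and choice of MPI are assumptions, not a
     * DEAC-specific benchmark). Rank 0 exchanges a small message with rank 1;
     * half the average round-trip time approximates one-way message latency.
     * Run with at least two ranks, ideally one rank per node. */
    #include <mpi.h>
    #include <stdio.h>

    int main(int argc, char **argv)
    {
        MPI_Init(&argc, &argv);

        int rank;
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);

        const int iters = 1000;     /* assumed iteration count */
        char msg[8] = {0};          /* small payload, typical of IPC traffic */
        MPI_Status status;

        double start = MPI_Wtime();
        for (int i = 0; i < iters; i++) {
            if (rank == 0) {
                MPI_Send(msg, (int)sizeof(msg), MPI_CHAR, 1, 0, MPI_COMM_WORLD);
                MPI_Recv(msg, (int)sizeof(msg), MPI_CHAR, 1, 0, MPI_COMM_WORLD, &status);
            } else if (rank == 1) {
                MPI_Recv(msg, (int)sizeof(msg), MPI_CHAR, 0, 0, MPI_COMM_WORLD, &status);
                MPI_Send(msg, (int)sizeof(msg), MPI_CHAR, 0, 0, MPI_COMM_WORLD);
            }
        }
        double elapsed = MPI_Wtime() - start;

        if (rank == 0)
            printf("approx. one-way latency: %.2f microseconds\n",
                   elapsed / (2.0 * iters) * 1.0e6);

        MPI_Finalize();
        return 0;
    }

Launched across two nodes (for example, mpirun -np 2 with one rank placed on each node), the reported figure should fall near the Ethernet or Infiniband latencies quoted above, depending on which interconnect the two ranks communicate over.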

Voltaire Infiniband
24 IBM HS21XM blades in the cluster have been upfitted to use the Voltaire 4X DDR Expansion Card (CFFh) and connect via the IB pass-through module to a Voltaire ISR 9024D-M 24-port Infiniband DDR switch.
Cisco Topspin Infiniband
14 IBM HS22 blades in the cluster have been upfitted to use the Cisco 4X DDR Expansion Card (CFFh) and are connected internally via a Cisco 4X Infiniband 14-port switch module.
Cisco Core Ethernet Environment
The network design and connectivity available in the WFU DEAC Cluster follow the standard, best-practices approach of a "switch block" design: two gateway switches provide redundant connectivity to the WFU campus and security for the cluster through redundant firewall switch modules. The back-end connectivity for the computational nodes is driven by two key Cisco technologies, VSS and DFC, to ensure the availability of the bandwidth (raw speed) and throughput (packet rates) necessary for high-performance computing. In addition, the serviceability inherent in the VSS technology permits network maintenance concurrent with active computational research.

Looking forward, this recently implemented design change supports several capabilities for research that had not previously been possible:
  • 10 Gbps connectivity internally between compute node chassis
  • 10 Gbps network connectivity to NC-REN and other research networks
  • Dedicated optical connectivity to another site via NC-REN and NLR or I2 (DWDM)
