Specifications

The UWM Research Computing Service was established in 2009 and maintains a variety of computing resources for UWM researchers, including standalone servers, High Performance Computing (HPC) clusters for running parallel programs, and a High Throughput Computing (HTC) grid for running serial programs in parallel.

The parallel computing services currently consist of an HPC cluster for faculty research named "Avi", an HPC cluster for education named "Peregrine", and an HTC grid named "Meadows".
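
The difference between the two models matters when choosing where to run: a parallel (HPC) program starts many cooperating processes that communicate while they run, while an HTC workload consists of many independent serial runs. The following minimal MPI example in Python makes the HPC side concrete; it assumes the mpi4py package (or an equivalent MPI toolchain), which may or may not match the software installed on a particular cluster.

    # hello_mpi.py -- minimal MPI example (assumes the mpi4py package and an
    # MPI library; the clusters' actual MPI stacks may differ)
    from mpi4py import MPI

    comm = MPI.COMM_WORLD     # communicator spanning every process in the job
    rank = comm.Get_rank()    # this process's index within the job
    size = comm.Get_size()    # total number of cooperating processes

    print(f"Hello from rank {rank} of {size}")

    # A typical HPC pattern: each rank computes part of the problem, then the
    # partial results are combined with a collective operation.
    partial = rank + 1                    # stand-in for real per-rank work
    total = comm.reduce(partial, op=MPI.SUM, root=0)
    if rank == 0:
        print(f"Sum of 1..{size} across ranks: {total}")

Such a program is launched as a single job with a fixed number of ranks, for example with "mpiexec -n 8 python hello_mpi.py" under the cluster's resource manager.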

Unlike many grids, which employ the "Bring Your Own Binary" (BYOB) model, the Meadows compute hosts are preloaded with hundreds of open source scientific applications and libraries.

A new HPC research cluster named "Mortimer" will be available in late 2014 or early 2015; its specifications are listed below.

Avi specifications

  • 1136 total computing cores and 3616 GiB (3.5 TiB) RAM
  • 142 compute nodes (1136 cores total). Each node is a Dell PowerEdge R410 server with two quad-core Intel(R) Xeon(R) X5550 processors @ 2.67GHz
  • Most compute nodes have 24 GiB of RAM. A few "high-memory" nodes with 128 GiB of RAM are provided for programs that require large amounts of memory on a single node
  • One head node running the SLURM resource manager, a Dell PowerEdge R310 server with 6 Intel(R) Xeon(R) E5-2407 processors @ 2.20GHz and 32 GB of RAM. An identical backup node automatically takes over in the event of a head node failure
  • A primary IO node, a Dell PowerEdge R710 server, with two quad-core Intel(R) Xeon(R) E5520 processors @ 2.27GHz, 48 GiB of system memory and seven Dell PowerVault MD1000 3Gb/s SAS attached expansion units, serving nine shared RAID 60 and RAID 10 partitions of approximately 7 terabytes each over NFSv4
  • One high-speed I/O node, a Dell PowerEdge R720xd with two six-core Intel(R) Xeon(R) CPU E5-2620 0 @ 2.00GHz and 32 gigabytes of RAM, serving a single 10 terabyte RAID 6 partition over NFSv4
  • All compute and I/O nodes are linked by QLogic DDR InfiniBand (16Gb/s) and gigabit Ethernet networks
  • All nodes currently run CentOS Linux 6

Mortimer specifications

  • 608 total computing cores and 3904 GiB (3.8 TiB) RAM
  • 28 standard compute nodes, each with 16 cores and 48 GiB RAM (448 cores and 1344 GiB RAM total). Each node is a Dell PowerEdge R420 server with two 8-core Intel(R) Xeon(R) E5-2450 v2 processors @ 2.50GHz
  • 4 high-memory compute nodes, each with 24 cores and 256 GiB RAM (96 cores and 1024 GiB RAM total). Each high-memory node is a Dell PowerEdge R630 server with two 12-core Intel(R) Xeon(R) E5-2680 v3 processors @ 2.50GHz
  • 1 high-memory compute node with 32 cores, 768 GiB RAM, and a local 17 TiB RAID, with four 8-core Intel(R) Xeon(R) E5-2650 v2 processors @ 2.60GHz
  • 1 high-memory compute node with 32 cores, 768 GiB RAM, and a local 1 TiB RAID, with four Intel(R) Xeon(R) E5-2680 v3 processors @ 2.50GHz
  • One head node running the SLURM resource manager, a Dell PowerEdge R415 server with one six-core AMD Opteron(tm) 4133 processor and 16 GiB of RAM. An identical backup node automatically takes over in the event of a head node failure
  • 4 high-speed I/O nodes, each a Dell PowerEdge R720xd serving a single 19 TiB RAID over NFSv4. Each node can receive over 800 MiB/s from compute nodes over the InfiniBand network
  • 2 high-capacity I/O nodes, each a Dell PowerEdge R720xd serving a single 37 TiB RAID over NFSv4. Each node can receive over 700 MiB/s from compute nodes over the InfiniBand network
  • All compute and I/O nodes are linked by Mellanox FDR InfiniBand (56Gb/s) and gigabit Ethernet networks
  • All nodes currently run CentOS Linux 6

Peregrine specifications

  • 8 compute nodes (96 cores total). Each node is a Dell PowerEdge R415 rack-mount server with two six-core AMD Opteron 4180 2.6GHz processors and 32 GB of system memory
  • One head node, a Dell PowerEdge R415 server, with one 6-core AMD Opteron processor and 16 GB of system memory
  • The head node houses a 5 TB RAID 5 array, available to all compute nodes via NFS
  • All nodes are connected by a dedicated gigabit Ethernet network
  • Jobs are scheduled using the SLURM resource manager; a sketch of a typical submission follows this list
  • In addition, Peregrine is a submit node and manager for the UWM HTCondor grid, which provides access to idle cores on lab PCs and other machines around campus for use in embarrassingly parallel computing
  • All nodes run FreeBSD 9.2 and offer a wide variety of open source software installed via the FreeBSD ports system
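
The sketch below shows one way a job might be handed to SLURM: a batch script with #SBATCH directives is written out and passed to the sbatch command. It is illustrative only; the resource values are hypothetical, and actual limits, partitions, and local submission conventions are set by the Research Computing staff.

    # submit_sketch.py -- illustrative only: writes a SLURM batch script and
    # hands it to sbatch; the resource values below are hypothetical
    import subprocess
    import textwrap

    batch_script = textwrap.dedent("""\
        #!/bin/sh
        #SBATCH --job-name=hello-mpi
        #SBATCH --ntasks=8             # request 8 tasks (MPI ranks)
        #SBATCH --time=00:10:00        # wall-clock limit
        #SBATCH --output=hello-%j.out  # %j expands to the job ID

        mpiexec -n 8 python hello_mpi.py
        """)

    with open("hello.sbatch", "w") as f:
        f.write(batch_script)

    # sbatch prints a line such as "Submitted batch job 12345" on success.
    result = subprocess.run(["sbatch", "hello.sbatch"],
                            capture_output=True, text=True, check=True)
    print(result.stdout.strip())

In practice the batch script is usually written by hand and submitted directly with sbatch; wrapping the submission in Python as above is simply a compact way to show the whole round trip in one place.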

Meadows HTC Grid

  • Jobs are scheduled via HTCondor; a sketch of a grid-friendly serial job follows this list
  • 1 large pool of dedicated servers (formerly part of the LIGO grid for gravitational wave research) offering several hundred cores preloaded with hundreds of open source scientific applications and libraries
  • Access to virtual machines on idle UWM desktop machines, also preloaded with scientific software
  • All hosts run Unix-compatible operating systems (currently CentOS Linux and FreeBSD)
  • Unused processors on UW-Madison's CHTC grid are automatically used when the UWM grid is full
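
A job that fits the HTC model is a self-contained serial program: each copy reads its own inputs, computes, and writes its own outputs, with no communication between copies. A minimal sketch of such a worker follows; the script and file names are hypothetical, and the HTCondor submit description (not shown) would queue many copies, passing each one a different process number as its argument.

    # sweep_worker.py -- illustrative serial worker for an HTC parameter
    # sweep; the script and file names are hypothetical
    import json
    import random
    import sys

    def simulate(seed: int) -> dict:
        """Stand-in for one parameter setting of a real serial computation."""
        rng = random.Random(seed)
        samples = [rng.gauss(0.0, 1.0) for _ in range(100000)]
        return {"seed": seed, "mean": sum(samples) / len(samples)}

    if __name__ == "__main__":
        # HTCondor can pass each queued copy a different integer (for example
        # via the $(Process) macro), so every job handles its own piece.
        task_id = int(sys.argv[1])
        with open(f"result_{task_id}.json", "w") as f:
            json.dump(simulate(task_id), f)

Hundreds of such jobs can then run at once across the dedicated pool, the idle desktop virtual machines, and, when the local grid is full, the CHTC grid.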

SciProg Development and Education Server

  • Dell PowerEdge R420 server
  • Two 8-core Intel(R) Xeon(R) E5-2450 processors, for a total of 16 hyper-threaded cores (32 threads)
  • 64 GB RAM
  • 5.3 TB RAID storage
  • Hundreds of preloaded software packages, including compilers, scientific applications, and libraries