Garima Kochhar and Nishanth Dandapanthula
High Performance Computing Engineering
July 2012 | Version 1.0
Optimal BIOS settings for HPC with Dell PowerEdge 12th generation servers
This Dell technical white paper analyses the various BIOS options available in Dell PowerEdge 12th generation servers and provides recommendations for High Performance Computing workloads.
This document is for informational purposes only and may contain typographical errors and technical inaccuracies. The content is provided as is, without express or implied warranties of any kind.
© 2012 Dell Inc. All rights reserved. Dell and its affiliates cannot be responsible for errors or omissions in typography or photography. Dell, the Dell logo, and PowerEdge are trademarks of Dell Inc. Intel and Xeon are registered trademarks of Intel Corporation in the U.S. and other countries. Microsoft, Windows, and Windows Server are either trademarks or registered trademarks of Microsoft Corporation in the United States and/or other countries. Other trademarks and trade names may be used in this document to refer to either the entities claiming the marks and names or their products. Dell disclaims proprietary interest in the marks and names of others.
Contents
Executive summary
1. Introduction
2. Dell PowerEdge 12th generation servers and Intel Sandy Bridge-EP architecture
2.1. Intel Sandy Bridge architecture
3. Overview of BIOS options
3.1. System Profile
3.2. Turbo Boost
3.3. Node interleaving
3.4. Logical Processor
3.5. BIOS options specific to latency sensitive applications
4. Test bed and applications
5. Results and analysis
5.1. Idle power
5.2. System Profile
5.3. Turbo Boost
5.4. Node Interleaving
5.5. Logical Processor
5.6. C States, C1E on remote memory access
6. Comparison to Dell PowerEdge 11th generation servers
7. Conclusion
8. References
Appendix A – Summary of findings
Appendix B – Dell Deployment Toolkit to modify BIOS options from the command line
Tables
Table 1. Dell PowerEdge 12th generation server models
Table 2. Intel Sandy Bridge-based servers
Table 3. System Profile options
Table 4. Test bed details
Table 5. Benchmark and application details
Table 6. 11th and 12th generation cluster test bed details
Table 7. Recommended BIOS setting
Table 8. DTK syscfg options for changing BIOS settings
Figures
Figure 1. Sandy Bridge-EP architecture for a PowerEdge R620
Figure 2. Local, remote and interleaved memory bandwidth
Figure 3. Idle power usage across different System Profiles
Figure 4. Impact of power-based BIOS options on idle power
Figure 5. Performance and Energy Efficiency of System Profiles on applications
Figure 6. Performance and Energy Efficiency of Turbo Boost
Figure 7. Performance and Energy Efficiency of Node Interleaving
Figure 8. Performance and Energy Efficiency of Logical Processor
Figure 9. Impact of C States and C1E on remote memory access
Figure 10. 11th vs. 12th generation cluster – idle power comparison
Figure 11. 11th vs. 12th generation cluster – performance and energy efficiency comparison
Executive summary
The latest Dell PowerEdge 12th generation servers provide several BIOS options that can be tuned for performance and energy efficiency. In this technical white paper, the cluster-level impact of different BIOS options is quantified and presented for different types of high performance computing (HPC) workloads. The performance impact and power consumption of various BIOS settings and System Profiles are compared across several open source and commercial applications, and best practices are recommended from the measured results.
Comparing these results to the previously published study on Dell’s 11th generation servers, this document also presents the improvements achieved by Dell’s latest servers for HPC workloads.
1. Introduction
Dell PowerEdge 12th generation servers [1] include the Intel Xeon E5-2600 series processors based on the Intel microarchitecture codenamed Sandy Bridge. With the new processor and chipset technology, the new servers support PCI-Gen3 capable PCI slots, memory DIMM speeds up to 1600 MT/s, four memory channels per socket, and Intel QuickPath Interconnect (QPI) lanes running at 8.0 GT/s. Dell PowerEdge 12th generation servers also provide several processor-agnostic enhancements, including improved energy efficiency, support for more hard drives, support for PCI-E based Solid State Disks, a richer and simplified BIOS interface, and a choice of Network Daughter Cards [2].
High performance computing (HPC) clusters utilize several commodity servers interconnected with a high-speed network fabric to achieve supercomputer-like performance. Clusters have become the most popular supercomputer architecture over the last 10 years due to the advantage they provide in terms of price, performance, and simplicity, over other designs [3]. Dell’s dual-socket PowerEdge server line fits the requirements of the HPC cluster market and is a popular choice for building compute clusters.
This white paper focuses on the impact of the BIOS options available with the latest generation servers on HPC applications. It first introduces the servers used in this study and describes the Intel Sandy Bridge architecture. It quantifies the cluster-level impact of the BIOS options on performance and power consumption across a wide range of HPC applications. Based on measured results it provides guidelines for tuning the 12th generation BIOS for HPC. It also presents the improvements of the latest servers over the previous generation in terms of power and performance for various HPC domains.
The guidelines presented here apply to HPC workloads similar to those tested as part of this study. The recommendations in this document may not be appropriate for general enterprise workloads.
2. Dell PowerEdge 12th generation servers and Intel Sandy Bridge-EP architecture
Dell PowerEdge 12th generation servers feature a simplified BIOS interface that is different in look and feel from previous generations. This new interface is in accordance with the Unified Extensible Firmware Interface (UEFI) specification, but with the option to boot from legacy mode when desired. The same interface is now used to configure the BIOS, iDRAC, Dell PowerEdge RAID Controller (PERC), LOM, and other adapter settings. The 12th generation BIOS setup introduces a “System Profiles” menu that provides a single option to set a group of tuning parameters [4]. The BIOS options evaluated in this study are described in detail in Section 3.
In addition to the richer and simplified BIOS interface, the servers include several technology enhancements like support for PCI-E based Solid State Disks, a choice of Network Daughter Cards as opposed to fixed onboard LOMs, hot plug PCIe flash storage, and common power supply modules.
Enhancements to Dell’s iDRAC for systems management provide improved energy efficiencies and power savings over previous generations.
Dell’s latest server lineup includes many choices. For the two-socket space, Table 1 lists Intel Xeon E5-2600 based servers that are good candidates for HPC clusters.
All the server models in Table 1 are similar in system architecture and board design. Details of the architecture are presented in Section 2.1.
| Server model | Form factor |
|---|---|
| PowerEdge R620 | 1U Rack |
| PowerEdge R720 | 2U Rack |
| PowerEdge M620 | Half height Blade |
| PowerEdge C6220 | Shared Infrastructure system, 2U Rack with 4 servers |

Table 1. Dell PowerEdge 12th generation server models

The following features are common to the servers:
• Support for processors from the Intel Xeon E5-2600 series.
• 4 memory channels per socket.
• The number of DIMM slots per server varies by product line.
    o 3 DIMMs per channel for the PowerEdge R and M product line. Total of 12 DIMMs per socket, 24 DIMMs per server.
    o 2 DIMMs per channel for the PowerEdge C product. Total of 8 DIMMs per socket, 16 DIMMs per server.
• Support for memory speeds of 800 MT/s, 1066 MT/s, 1333 MT/s and 1600 MT/s.

The servers differ in:
• Form factor
• Number of hard drives supported
• Support for hot plug flash storage
• Number of onboard NICs
• Number and configuration of PCI slots
• PERC options for internal and external hard drives
• Support for GPGPU and other PCI cards
This study used the Dell PowerEdge M620 blade servers, but the recommendations contained here apply to the PowerEdge R and M server models that use the Intel Xeon E5-2600 series processors.
The PowerEdge C product line has a different BIOS interface from the standard PowerEdge products. The BIOS layout and the options exposed are different, and not all 12th generation features may apply. In general, however, the analysis and recommendations in this document will apply to the PowerEdge C6220 as well.
2.1. Intel Sandy Bridge architecture
The Intel microarchitecture codenamed Sandy Bridge is the latest tock in Intel’s tick-tock model of development [5]. It uses the same 32 nm process technology as its predecessor (Intel Xeon 5600 series, codenamed Westmere-EP) but introduces a whole new microarchitecture.
Like Westmere-EP, Sandy Bridge is also a NUMA-based architecture. Figure 1 shows a block diagram of the Sandy Bridge-EP architecture. Each processor socket has an integrated memory controller. A core’s access to the memory attached to its local memory controller is faster and has higher bandwidth than access to the memory attached to the remote socket’s memory controller.
[Figure 1: block diagram of two processors connected by QPI links at 8.0 GT/s. Each processor has four DDR3 memory channels with 3 DIMMs per channel, and PCI-Gen3 lanes (x8 and x16) to connections labeled NDC, Storage, Left, Center, and Right.]
Figure 1. Sandy Bridge-EP architecture for a PowerEdge R620
With Westmere-EP, each memory controller had three DDR3 memory channels; Sandy Bridge-EP increases that to four memory channels per controller. The maximum number of DIMMs per channel remains three. Sandy Bridge supports up to eight cores per socket as opposed to the six cores per socket on Westmere-EP.
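The effect of the extra memory channel on peak theoretical bandwidth can be estimated with simple arithmetic. The sketch below assumes a 64-bit (8-byte) DDR3 channel and uses the maximum DIMM speeds this paper quotes for each generation (1333 MT/s and 1600 MT/s). The function name is illustrative, and measured bandwidth will be lower than these theoretical peaks.

```python
# Peak theoretical memory bandwidth per socket.
# Assumption: each DDR3 channel moves 8 bytes (64 bits) per transfer.
BYTES_PER_TRANSFER = 8


def peak_bandwidth_gbs(channels: int, speed_mts: int) -> float:
    """Return the peak bandwidth of one socket in GB/s (10^9 bytes/s)."""
    return channels * speed_mts * BYTES_PER_TRANSFER / 1000.0


# Channel counts and maximum DIMM speeds as quoted in this paper.
westmere_ep = peak_bandwidth_gbs(channels=3, speed_mts=1333)
sandy_bridge_ep = peak_bandwidth_gbs(channels=4, speed_mts=1600)

print(f"Westmere-EP:     {westmere_ep:.1f} GB/s per socket")
print(f"Sandy Bridge-EP: {sandy_bridge_ep:.1f} GB/s per socket")
print(f"Ratio:           {sandy_bridge_ep / westmere_ep:.2f}x")
```

Under these assumptions the peak per-socket bandwidth rises from about 32 GB/s to 51.2 GB/s, roughly 1.6 times.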
The QPI links that connect the processors run at up to 8.0 GT/s with Sandy Bridge; the maximum speed with Westmere was 6.4 GT/s. Sandy Bridge supports up to two QPI links whereas Westmere supported only one. Additionally, Sandy Bridge-based processors can support DIMMs at speeds up to 1600 MT/s; Westmere’s limit was 1333 MT/s. Sandy Bridge-EP also has a larger L3 cache of up to 20 MB compared to Westmere-EP’s 12 MB L3 cache. Intel introduced Advanced Vector Extensions (AVX) [6] with its Sandy Bridge lineup. AVX doubles the peak number of floating-point operations per cycle compared to Westmere or Nehalem, which can provide a large performance boost for code that is vectorized to use it. A detailed explanation of AVX can be found in [6].
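The combined effect of more cores and doubled FLOPS/cycle on peak theoretical compute can be sketched as follows. The per-core figures of 4 and 8 double-precision FLOPS per cycle are the commonly quoted values for SSE and AVX respectively and are not stated in this paper; the 2.0 GHz clock is a hypothetical value chosen only to keep the arithmetic simple, and the function name is illustrative.

```python
# Peak theoretical double-precision GFLOPS for a server.
# Assumptions (not from this paper): 4 FLOPS/cycle/core with SSE on
# Westmere-EP, 8 FLOPS/cycle/core with AVX on Sandy Bridge-EP.
SSE_FLOPS_PER_CYCLE = 4
AVX_FLOPS_PER_CYCLE = 8


def peak_gflops(sockets: int, cores_per_socket: int,
                clock_ghz: float, flops_per_cycle: int) -> float:
    """Return sockets x cores x clock (GHz) x FLOPS per cycle."""
    return sockets * cores_per_socket * clock_ghz * flops_per_cycle


CLOCK_GHZ = 2.0  # hypothetical clock, identical for both generations

# Core counts per socket as quoted in this paper (six vs. eight).
westmere = peak_gflops(2, 6, CLOCK_GHZ, SSE_FLOPS_PER_CYCLE)
sandy_bridge = peak_gflops(2, 8, CLOCK_GHZ, AVX_FLOPS_PER_CYCLE)

print(f"Westmere-EP peak:     {westmere:.0f} GFLOPS")
print(f"Sandy Bridge-EP peak: {sandy_bridge:.0f} GFLOPS")
print(f"Ratio at equal clock: {sandy_bridge / westmere:.2f}x")
```

At equal clock speed the ratio is (8 cores x 8 FLOPS/cycle) / (6 cores x 4 FLOPS/cycle), or about 2.67. Real applications reach this peak only to the extent that their code is vectorized and not limited by memory bandwidth.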
Unlike Westmere, Sandy Bridge-based processors also include an integrated PCI controller. This makes access to the PCI slots non-uniform: access to slots that are directly connected to the local socket’s PCI controller is faster than access to slots connected to the remote socket’s PCI controller.
Also new to Sandy Bridge-based systems is PCI-Gen3 support. This benefits HPC, as the Mellanox FDR InfiniBand HCA can use this enhancement and run at Gen3 speeds.
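Because PCI access is non-uniform, it can help to know which socket a given adapter, such as an InfiniBand HCA, is attached to. As a minimal sketch, assuming a Linux host (the operating system is an assumption here, not something this section specifies), the kernel exposes a numa_node attribute for each PCI device under sysfs. The function name and its path argument below are illustrative.

```python
from pathlib import Path


def pci_numa_nodes(sysfs_root: str = "/sys/bus/pci/devices") -> dict:
    """Map each PCI device name to the NUMA node reported by sysfs.

    A value of -1 means the kernel reported no NUMA affinity for
    that device. Devices without a readable numa_node file are skipped.
    """
    nodes = {}
    root = Path(sysfs_root)
    if not root.is_dir():
        return nodes  # not a Linux host, or sysfs is not mounted
    for dev in sorted(root.iterdir()):
        try:
            nodes[dev.name] = int((dev / "numa_node").read_text().strip())
        except (OSError, ValueError):
            continue
    return nodes


if __name__ == "__main__":
    for name, node in pci_numa_nodes().items():
        print(f"{name}: NUMA node {node}")
```

A process that drives the adapter can then be bound to cores on the reported node so that its PCI traffic stays local to that socket.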
Sandy Bridge-based servers come in three architectures: Sandy Bridge-EP, Sandy Bridge-EN and Sandy Bridge-EP 4S. These architectures are compared in Table 2. Additionally, Sandy Bridge-EN processors operate at a lower wattage, with maximum Thermal Design Power (TDP) ranging from 50 W to 95 W. Sandy Bridge-EP processors have a maximum TDP of up to 135 W [7]. Other differences include the number of PCI lanes and number of QPI lanes.
| Architecture | Processor series | Dell PowerEdge server models | Memory channels | Max memory DIMMs per channel (DPC) |
|---|---|---|---|---|
| Sandy Bridge-EP | Intel Xeon E5-2600 | R620, R720, M620, C6220 | 4 channels per socket | 3 DPC |
| Sandy Bridge-EP 4S | Intel Xeon E5-4600 | R820, M820 | 4 channels per socket | 3 DPC |
| Sandy Bridge-EN | Intel Xeon E5-2400 | R420, M420 | 3 channels per socket | 2 DPC or 1 DPC depending on server model |

Table 2. Intel Sandy Bridge-based servers
This study focuses on the Sandy Bridge-EP based servers. A follow-on study will evaluate the other architectural variants.
3. Overview of BIOS options
Dell PowerEdge 12th generation servers provide numerous BIOS tunable features. The goal of this white paper, however, is to examine the impact of only those BIOS options that are relevant in the HPC domain. This section provides an overview of the specific BIOS settings evaluated in this study. Section 5 presents the impact of these options on performance and energy efficiency across a variety of HPC applications.
3.1. System Profile
With the latest mainstream PowerEdge server line (i.e., excluding the PowerEdge C products), several BIOS settings have been grouped into a common “System Profile” section. Predetermined values for Turbo mode, C States, C1E, Monitor/Mwait, CPU Power Management, Memory Speed, Memory Patrol Scrub rate, Memory Refresh Rate, and the Memory Operating Voltage can all be set by selecting a single System Profile value. If these options need to be manipulated independent of the presets available, a Custom System Profile can be configured.
The available System Profile options are:
• Performance Per Watt Optimized (DAPC)
• Performance Per Watt Optimized (OS)
• Performance Optimized
• Dense Configuration Optimized
• Custom
The preset BIOS settings for each of these System Profiles are described in Table 3. If a preset System Profile other than Custom is selected, the following sub-options are selected automatically and are not individually tunable. The Custom System Profile should be selected to tune each option in Table 3 individually.
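The relationship between a System Profile and its sub-options can be modelled as a simple lookup. The sketch below is illustrative only: the preset values are transcribed from Table 3, while the data layout and the function name resolve_profile are assumptions made for this example and do not correspond to any Dell tool or interface.

```python
# Sub-options controlled by a System Profile, in Table 3 order.
SETTINGS = (
    "CPU Power Management", "Memory Frequency", "Turbo Boost",
    "C States", "C1E", "Memory Patrol Scrub", "Memory Refresh Rate",
    "Memory Operating Voltage", "Monitor/Mwait",
)

# Preset values transcribed from Table 3 (one tuple per profile).
PRESETS = {
    "Performance Per Watt Optimized (DAPC)": (
        "System DBPM", "Max Performance", "Enabled", "Enabled",
        "Enabled", "Standard", "1x", "Auto", "Enabled"),
    "Performance Per Watt Optimized (OS)": (
        "OS DBPM", "Max Performance", "Enabled", "Enabled",
        "Enabled", "Standard", "1x", "Auto", "Enabled"),
    "Performance Optimized": (
        "Max Performance", "Max Performance", "Enabled", "Disabled",
        "Disabled", "Standard", "1x", "Auto", "Enabled"),
    "Dense Configuration Optimized": (
        "System DBPM", "Max Reliability", "Disabled", "Enabled",
        "Enabled", "Extended", "2x", "1.5V", "Enabled"),
}


def resolve_profile(profile: str, custom_values: dict = None) -> dict:
    """Return the sub-option values implied by a System Profile.

    Preset profiles fix every sub-option, so passing custom values
    with a preset is rejected. Only "Custom" accepts individual values.
    """
    if profile == "Custom":
        return dict(custom_values or {})
    if custom_values:
        raise ValueError("sub-options are tunable only with Custom")
    return dict(zip(SETTINGS, PRESETS[profile]))


print(resolve_profile("Performance Optimized")["C States"])
```

Rejecting overrides for preset profiles mirrors the behaviour described above, where sub-options become individually tunable only after the Custom profile is selected.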
| BIOS option | Performance Per Watt Optimized (DAPC) | Performance Per Watt Optimized (OS) | Performance Optimized | Dense Configuration Optimized | Custom |
|---|---|---|---|---|---|
| CPU Power Management | System DBPM | OS DBPM | Max Performance | System DBPM | System DBPM, Max Performance, or OS DBPM |
| Memory Frequency | Max Performance | Max Performance | Max Performance | Max Reliability | Max Performance, Max Reliability, 1600 MT/s, 1333 MT/s, 1067 MT/s, or 800 MT/s |
| Turbo Boost | Enabled | Enabled | Enabled | Disabled | Enabled or Disabled |
| C States | Enabled | Enabled | Disabled | Enabled | Enabled or Disabled |
| C1E | Enabled | Enabled | Disabled | Enabled | Enabled or Disabled |
| Memory Patrol Scrub | Standard | Standard | Standard | Extended | Extended, Standard, or Disabled |
| Memory Refresh Rate | 1x | 1x | 1x | 2x | 1x or 2x |
| Memory Operating Voltage | Auto | Auto | Auto | 1.5V | Auto or 1.5V |
| Monitor/Mwait | Enabled | Enabled | Enabled | Enabled | Enabled or Disabled |

Table 3. System Profile options

Details of each of these settings are provided in [4]. A quick overview is provided here.