Intel AT80615005760AB, BX80615E74830 User Manual

WHITE PAPER
Intel® Xeon® Processor E7 Family
Performance Brief: Model Number Characteristics and Impact to Performance
Intel® Xeon® Processor E7 Family Performance and Model Numbers
Intel® Xeon® Processor E7-8800/4800/2800 Product Families Characteristics and Impact to Performance
EXECUTIVE SUMMARY
Just as automobile manufacturers and other companies with multiple product lines use model names to tell their products apart, server processors have model numbers to help distinguish differences in features and delineate value. As your business grows, so does demand for your products and/or services, bringing additional customers, users, and transactions that strain your current IT infrastructure and back-end databases. The Intel® Xeon® brand helps customers select the appropriate product line and family stack as their demand justifies it (see note 1).
This paper focuses specifically on the Intel Xeon processor E7 family which is designed to be expandable and scalable for larger deployments of business- or mission-critical workloads such as on-line transaction processing, physical-to-virtual machine consolidation projects, business intelligence, customer relationship management (CRM), and enterprise resource planning (ERP) / line-of-business applications that generate revenue. The model numbers (see Figure 1) help differentiate the capabilities of the processors and in the case of the Intel Xeon processor E7 product family, the wayness or maximum number of processors (CPUs or sockets) in a node can be two, four, or eight (contrasted to the Intel Xeon processor E3 or E5 families, which support only one or two/four processors, respectively). Performance may scale as the number of processors installed (wayness) in a server is increased (up to 94% efficiency as published in this paper); but in a two-way server, regardless of the actual processor wayness capability, the throughput application performance would be expected to be the same.
MODEL NUMBERS AND SCALABILITY
For the Intel Xeon processor E7 family, processor models (also called SKUs) are available in three wayness configurations: two-, four-, or eight-socket native support (no third-party node controller required to connect the sockets together). Within a given Intel Xeon processor E7-xxxx SKU, the difference in wayness is irrelevant if populated in only a two-socket node, and corresponding performance differences are negligible. For example, the top-bin Intel Xeon processors E7-8870/E7-4870/E7-2870 all have the same socket type (8) and the same processor SKU (70), which indicates the same core frequency of 2.4 GHz, the same Intel® QuickPath Interconnect speed of 6.4 GT/s, the same last-level cache (LLC) of 30 MB, and the same number of cores at 10 per processor.
Figure 1 - 2012 Processor Numbering Example
So the only difference is the first digit of the product family number, which represents the wayness capability (2, 4, or 8) and indicates that the Intel Xeon processor E7-4xxx and E7-8xxx models can scale natively beyond 2 sockets (see Figure 2 below). It is common IT practice to buy “headroom” by purchasing a larger server but initially populating only a portion of the processor sockets for today’s level of requirements, allowing for future compute power expansion as the number of users, transactions, or problem fidelity increases. Ideally, with perfect scaling, you can double the number of users, for example, when doubling the processor compute power (assuming storage, memory, and I/O are scaled so as not to be the bottleneck). However, when any of these otherwise identical processors are populated in only 2 sockets, performance throughput should be expected to be the same.
Figure 2 - Intel® Xeon® Processor E7-8800/4800/2800 Product Family Numbering (see note 2)
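The numbering scheme described above can be illustrated with a minimal sketch. It assumes only what the text states: the first digit is the wayness, the second digit is the socket type, and the last two digits are the processor SKU. The names `E7Model`, `parse_e7_model`, and `same_two_socket_performance` are hypothetical helpers introduced for illustration and are not part of any Intel tool.

```python
import re
from typing import NamedTuple


class E7Model(NamedTuple):
    wayness: int      # first digit: maximum sockets natively supported (2, 4, or 8)
    socket_type: int  # second digit: socket type, as described in the text
    sku: int          # last two digits: processor SKU (for example 70)


def parse_e7_model(name: str) -> E7Model:
    """Split a model string such as 'E7-4870' into its three fields."""
    match = re.fullmatch(r"E7-(\d)(\d)(\d{2})", name.strip())
    if match is None:
        raise ValueError(f"not an E7-xxxx model number: {name!r}")
    wayness, socket_type, sku = (int(group) for group in match.groups())
    return E7Model(wayness, socket_type, sku)


def same_two_socket_performance(a: str, b: str) -> bool:
    """True when two models differ only in wayness (the paper's equivalence rule)."""
    model_a, model_b = parse_e7_model(a), parse_e7_model(b)
    return (model_a.socket_type, model_a.sku) == (model_b.socket_type, model_b.sku)


print(parse_e7_model("E7-8870"))
print(same_two_socket_performance("E7-2870", "E7-8870"))
```

Under this reading, E7-2870 and E7-8870 compare as equivalent in a 2-socket node, while E7-2870 and E7-2860 do not, because their SKU digits differ.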
PERFORMANCE IMPACT
For the purposes of demonstrating the impact of model numbers on performance, the top of the advanced capability levels of each product family is compared below (Intel Xeon processor E7-8870/4870/2870). Figure 3 below illustrates the options original equipment manufacturers (OEMs) have in designing an Intel Xeon processor E7 family-based server.
Figure 3 - Intel Xeon processor E7 family scalability to support 2- to 256-sockets (see note 3)
Looking at the first number in the Intel Xeon processor E7 family model (-8xxx, -4xxx, or -2xxx), which represents the number of processors natively supported in a server, the processors can scale to support an increased number of users, transactions, or throughput as additional sockets are tested in performance benchmarks. The typical example of this can be found in the SPECint*_rate_base2006 benchmark, which is fairly representative of typical integer-based, compute-intensive server applications and tests the number of users (typically matching the number of logical threads seen by the operating system, OS) simultaneously running a problem on a given server. The performance scaling is calculated by dividing the score of the server populated with the maximum number of processors by the score of a server populated with fewer (n-way) processors. So from 2- to 4- to 8-socket servers, perfect scaling would be four times, meaning that the number of users supported (or problems solved) in the 8-socket server is four times what a 2-socket server could support. The efficiency is measured by how close a scale-up server comes to that perfect scaling, which in this case is quite reasonable at up to 94% efficiency (see Figure 4 below).
Figure 4 - Scaling of supported users on multi-processor servers: 2-socket, 4-socket, and 8-socket servers (see note 4)
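As a worked example of the scaling calculation, the sketch below uses the three best published SPECint*_rate_base2006 baseline scores cited in note 4 (550, 1100, and 2070). The function name `scaling_efficiency` and the dictionary layout are assumptions made for illustration.

```python
# Best published SPECint*_rate_base2006 baseline scores from note 4,
# keyed by the number of populated sockets.
PUBLISHED_SCORES = {2: 550, 4: 1100, 8: 2070}


def scaling_efficiency(scores, base_sockets=2):
    """Return {sockets: (measured_scaling, perfect_scaling, efficiency)}."""
    base_score = scores[base_sockets]
    result = {}
    for sockets, score in sorted(scores.items()):
        measured = score / base_score       # how much faster than the base server
        perfect = sockets / base_sockets    # ideal speed-up for that socket count
        result[sockets] = (measured, perfect, measured / perfect)
    return result


for sockets, (measured, perfect, efficiency) in scaling_efficiency(PUBLISHED_SCORES).items():
    print(f"{sockets}-socket: {measured:.2f}x of a perfect {perfect:.0f}x ({efficiency:.1%})")
```

With these inputs the 8-socket result works out to roughly 3.76x against a perfect 4x, which is where the "up to 94%" efficiency figure comes from.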
The Intel Xeon processor E7-8870 can be populated in a 2-, 4-, or 8-socket server configuration. This is possible because the Intel® QuickPath Interconnect (Intel® QPI) lets the processors share resources by giving each component access to the others through the mainboard network. Similarly, the E7-4870 model supports 2- or 4-socket server configurations, but with the Intel Xeon processor E7-2870 only 2 sockets can be populated in a server node (though multiple 2S nodes can be joined together to form a larger single server image; see Figure 3 above).
The three processors noted above do not differ in any characteristic other than wayness capability. All three have the same number of available cores per socket, core frequency, Intel® QPI speed, and cache structure (see Figure 2 above). Therefore, in a 2-socket server configuration, the performance delta between the three will be only typical run-to-run variation, caused by a number of factors including manufacturing variances that may affect how long the processors run above their marked frequency using Intel® Turbo Boost Technology. SPEC* allows for up to 1.75% variation. This hypothesis was confirmed through testing in Intel internal labs: as seen in Figure 5 below, there is less than 0.5% difference in performance between the three processors when in the same two-socket server configuration (see Table 1 below for the complete list of equivalent processor SKUs).
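The size of the run-to-run variation can be checked with a short sketch using the three 2-socket scores reported in Figure 5 and the Appendix (552, 553, and 555). The paper does not say which measure of "difference" it uses, so the sketch computes two common ones; the function names are hypothetical.

```python
from statistics import mean

# 2-socket SPECint*_rate_base2006 baseline scores from Figure 5 / the Appendix.
TWO_SOCKET_SCORES = {"E7-2870": 552, "E7-4870": 553, "E7-8870": 555}

# Run-to-run variation that the text says SPEC* allows.
SPEC_ALLOWED_VARIATION = 0.0175


def relative_spread(scores):
    """Highest score minus lowest score, relative to the lowest."""
    values = list(scores.values())
    return (max(values) - min(values)) / min(values)


def max_deviation_from_mean(scores):
    """Largest distance of any score from the average, relative to the average."""
    values = list(scores.values())
    average = mean(values)
    return max(abs(value - average) for value in values) / average


print(f"spread, low to high:      {relative_spread(TWO_SOCKET_SCORES):.2%}")
print(f"max deviation from mean:  {max_deviation_from_mean(TWO_SOCKET_SCORES):.2%}")
print(f"within SPEC allowance:    {relative_spread(TWO_SOCKET_SCORES) < SPEC_ALLOWED_VARIATION}")
```

Both measures fall well inside the 1.75% allowance, which is consistent with treating the three processors as equivalent in a 2-socket node.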
Table 1 - Intel Xeon processor E7 family model numbers and wayness supported
Wayness            Intel® Xeon® Processor E7-8800/4800/2800 Product Family Equivalent Performance
2-Sockets Native   E7-2870 / E7-4870 / E7-8870
                   E7-2860 / E7-4860 / E7-8860
                   E7-2850 / E7-4850 / E7-8850
                   E7-2830 / E7-4830 / E7-8830
4-Sockets Native   E7-4870 / E7-8870
                   E7-4860 / E7-8860
                   E7-4850 / E7-8850
                   E7-4830 / E7-8830
8-Sockets Native   E7-8870
                   E7-8860
                   E7-8850
                   E7-8830
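The rows of Table 1 follow a simple rule: a model can be used in an n-socket native node when its wayness digit is at least n. The sketch below regenerates the table from that rule, assuming the four SKU levels listed in Table 1 (70, 60, 50, 30) and socket type 8; `models_for_node` is a hypothetical helper name.

```python
SKUS = (70, 60, 50, 30)      # SKU levels that appear in Table 1
WAYNESS_LEVELS = (2, 4, 8)   # first digit of the model number
SOCKET_TYPE = 8              # second digit, shared by all models in Table 1


def models_for_node(sockets: int) -> list[list[str]]:
    """For each SKU level, list the models that natively support `sockets` sockets."""
    return [
        [f"E7-{wayness}{SOCKET_TYPE}{sku}" for wayness in WAYNESS_LEVELS if wayness >= sockets]
        for sku in SKUS
    ]


for sockets in WAYNESS_LEVELS:
    print(f"{sockets}-Sockets Native")
    for group in models_for_node(sockets):
        print("    " + " / ".join(group))
```

Each printed group is a set of models the paper treats as performing the same when populated in a node of that size.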
CONCLUSION
Servers are very complex machines, especially in the “big iron” class where multi-processor configurations are the norm. The Intel Xeon processor E7 family is designed to support a multitude of shipping configurations, and the model numbering scheme is intended to clarify the wayness and feature choices that customers have. On current-generation Intel Xeon processor E7 family scalable platforms, performance throughput increases at up to 94% efficiency from 2 to 8 sockets. However, in a two-socket server configuration, there is no appreciable difference in performance regardless of the processor SKU chosen: Intel Xeon processors E7-8870/4870/2870 and the others shown in Table 1 above are equivalent.
Figure 5 - 2-socket Server Performance using Intel Xeon processors E7-8870/4870/2870 (scores shown: 552, 553, 555; see note 5)
NOTES / SOURCES
1. See http://www.intel.com/content/www/us/en/processors/processor-numbers.html for more information on the Intel
Xeon processor numbering.
2. See http://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/xeon-e7-8800-4800-2800-families-specification-update.pdf for more information on Intel Xeon processor E7 family identification information.
3. Additional Configurations via OEM-specific scaling technologies (up to 256-sockets)
4. Comparison based on best published Intel Xeon processor E7 results in a 2-, 4-, and 8-socket configuration using the
SPECint*_rate_base2006 integer throughput benchmark that is often used as a proxy for general server application performance.
a. 8-socket server: Hewlett-Packard ProLiant* DL980 G7 scoring 2070 baseline.
Source: http://www.spec.org/cpu2006/results/res2011q4/cpu2006-20110923-18595.html
b. 4-socket server: Cisco UCS* C460 M2 scoring 1100 baseline.
Source: http://www.spec.org/cpu2006/results/res2012q1/cpu2006-20111223-19278.html
c. 2-socket server: IBM System x* 3690 X5 scoring 550 baseline.
Source: http://www.spec.org/cpu2006/results/res2012q3/cpu2006-20120716-23707.html
5. Comparison based on Intel internal testing on Intel Xeon processor E7 family using SPECint*_rate_base2006 benchmark baseline scores. System Configuration: Intel® C606 Chipset based reference platform (see http://www.qsscit.com/en/01_product/02_detail.php?mid=27&sid=125&id=126&qs=50 for details) supporting two each Intel Xeon processors E7-8870, E7-4870, and E7-2870 populated in sockets 0 and 1 with 128 GB memory (32x 4 GB DR DDR3-1066 RDIMMs), Red Hat* Enterprise LINUX 6.2, Intel Compiler XE2012 (12.1) compiled binaries. Source: Intel internal TR#1326 October 2012. See Appendix for details.
APPENDIX
                  Base       Base       Base       Peak       Peak       Peak
Benchmarks      Copies   Run Time       Rate     Copies   Run Time       Rate
--------------- ------  ---------  ---------     ------  ---------  ---------
400.perlbench       40        923        423  *
400.perlbench       40        926        422  S
400.perlbench       40        921        424  S
401.bzip2           40       1236        312  S
401.bzip2           40       1235        313  *
401.bzip2           40       1233        313  S
403.gcc             40        750        429  *
403.gcc             40        755        426  S
403.gcc             40        747        431  S
429.mcf             40        471        775  *
429.mcf             40        472        774  S
429.mcf             40        470        776  S
445.gobmk           40        883        475  *
445.gobmk           40        883        475  S
445.gobmk           40        882        476  S
456.hmmer           40        553        675  *
456.hmmer           40        552        677  S
456.hmmer           40        559        667  S
458.sjeng           40       1065        454  S
458.sjeng           40       1064        455  *
458.sjeng           40       1063        455  S
462.libquantum      40        247       3360  S
462.libquantum      40        248       3340  S
462.libquantum      40        247       3350  *
464.h264ref         40       1314        674  S
464.h264ref         40       1368        647  S
464.h264ref         40       1368        647  *
471.omnetpp         40        799        313  S
471.omnetpp         40        799        313  *
471.omnetpp         40        799        313  S
473.astar           40        877        320  *
473.astar           40        878        320  S
473.astar           40        876        320  S
483.xalancbmk       40        478        577  S
483.xalancbmk       40        477        579  S
483.xalancbmk       40        478        578  *
==============================================================================
400.perlbench       40        923        423  *
401.bzip2           40       1235        313  *
403.gcc             40        750        429  *
429.mcf             40        471        775  *
445.gobmk           40        883        475  *
456.hmmer           40        553        675  *
458.sjeng           40       1064        455  *
462.libquantum      40        247       3350  *
464.h264ref         40       1368        647  *
471.omnetpp         40        799        313  *
473.astar           40        877        320  *
483.xalancbmk       40        478        578  *
SPECint(R)_rate_base2006                 552
SPECint_rate2006                     Not Run
Intel® Xeon® Processor E7-2870
                  Base       Base       Base       Peak       Peak       Peak
Benchmarks      Copies   Run Time       Rate     Copies   Run Time       Rate
--------------- ------  ---------  ---------     ------  ---------  ---------
400.perlbench       40        922        424  *
400.perlbench       40        924        423  S
400.perlbench       40        918        426  S
401.bzip2           40       1235        313  S
401.bzip2           40       1234        313  *
401.bzip2           40       1233        313  S
403.gcc             40        746        432  S
403.gcc             40        753        427  *
403.gcc             40        763        422  S
429.mcf             40        470        776  *
429.mcf             40        472        774  S
429.mcf             40        470        776  S
445.gobmk           40        883        475  *
445.gobmk           40        886        474  S
445.gobmk           40        882        476  S
456.hmmer           40        555        672  S
456.hmmer           40        551        677  *
456.hmmer           40        548        681  S
458.sjeng           40       1064        455  *
458.sjeng           40       1063        455  S
458.sjeng           40       1065        454  S
462.libquantum      40        247       3350  S
462.libquantum      40        247       3360  *
462.libquantum      40        247       3360  S
464.h264ref         40       1325        668  S
464.h264ref         40       1358        652  *
464.h264ref         40       1367        647  S
471.omnetpp         40        799        313  S
471.omnetpp         40        799        313  *
471.omnetpp         40        799        313  S
473.astar           40        879        319  S
473.astar           40        877        320  *
473.astar           40        875        321  S
483.xalancbmk       40        478        577  S
483.xalancbmk       40        477        579  S
483.xalancbmk       40        478        578  *
==============================================================================
400.perlbench       40        922        424  *
401.bzip2           40       1234        313  *
403.gcc             40        753        427  *
429.mcf             40        470        776  *
445.gobmk           40        883        475  *
456.hmmer           40        551        677  *
458.sjeng           40       1064        455  *
462.libquantum      40        247       3360  *
464.h264ref         40       1358        652  *
471.omnetpp         40        799        313  *
473.astar           40        877        320  *
483.xalancbmk       40        478        578  *
SPECint(R)_rate_base2006                 553
SPECint_rate2006                     Not Run
Intel® Xeon® Processor E7-4870
                  Base       Base       Base       Peak       Peak       Peak
Benchmarks      Copies   Run Time       Rate     Copies   Run Time       Rate
--------------- ------  ---------  ---------     ------  ---------  ---------
400.perlbench       40        925        423  S
400.perlbench       40        919        425  S
400.perlbench       40        920        425  *
401.bzip2           40       1233        313  S
401.bzip2           40       1231        313  S
401.bzip2           40       1232        313  *
403.gcc             40        747        431  *
403.gcc             40        747        431  S
403.gcc             40        756        426  S
429.mcf             40        470        777  S
429.mcf             40        471        774  S
429.mcf             40        471        775  *
445.gobmk           40        880        477  S
445.gobmk           40        882        475  *
445.gobmk           40        884        475  S
456.hmmer           40        554        674  *
456.hmmer           40        554        674  S
456.hmmer           40        555        673  S
458.sjeng           40       1066        454  S
458.sjeng           40       1063        455  S
458.sjeng           40       1065        455  *
462.libquantum      40        247       3350  S
462.libquantum      40        247       3360  S
462.libquantum      40        247       3350  *
464.h264ref         40       1314        674  *
464.h264ref         40       1313        674  S
464.h264ref         40       1335        663  S
471.omnetpp         40        798        313  *
471.omnetpp         40        798        313  S
471.omnetpp         40        798        313  S
473.astar           40        876        321  S
473.astar           40        879        319  S
473.astar           40        876        321  *
483.xalancbmk       40        477        579  S
483.xalancbmk       40        477        578  *
483.xalancbmk       40        480        575  S
==============================================================================
400.perlbench       40        920        425  *
401.bzip2           40       1232        313  *
403.gcc             40        747        431  *
429.mcf             40        471        775  *
445.gobmk           40        882        475  *
456.hmmer           40        554        674  *
458.sjeng           40       1065        455  *
462.libquantum      40        247       3350  *
464.h264ref         40       1314        674  *
471.omnetpp         40        798        313  *
473.astar           40        876        321  *
483.xalancbmk       40        477        578  *
SPECint(R)_rate_base2006                 555
SPECint_rate2006                     Not Run
Intel® Xeon® Processor E7-8870
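The overall SPECint(R)_rate_base2006 figure at the foot of each listing can be reproduced approximately from the selected rows (those marked *). SPEC CPU2006 reports the overall metric as the geometric mean of the per-benchmark rates; the sketch below applies that to the rounded rates printed above, so it should be read as an approximation of the published calculation. The dictionary and function names are hypothetical.

```python
from math import prod

# Selected (*) base rates from the three listings above, in benchmark order:
# perlbench, bzip2, gcc, mcf, gobmk, hmmer, sjeng, libquantum, h264ref,
# omnetpp, astar, xalancbmk.
SELECTED_BASE_RATES = {
    "E7-2870": [423, 313, 429, 775, 475, 675, 455, 3350, 647, 313, 320, 578],
    "E7-4870": [424, 313, 427, 776, 475, 677, 455, 3360, 652, 313, 320, 578],
    "E7-8870": [425, 313, 431, 775, 475, 674, 455, 3350, 674, 313, 321, 578],
}


def geometric_mean(values):
    """n-th root of the product of n positive values."""
    return prod(values) ** (1 / len(values))


for model, rates in SELECTED_BASE_RATES.items():
    print(f"{model}: {geometric_mean(rates):.1f}")
```

Rounded to whole numbers, the three results agree with the 552, 553, and 555 totals shown in the listings.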
Author
Frank Jensen is a Performance Engineer in Intel’s Data Center Marketing Group.
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors. Performance tests, such as SYSmark and MobileMark, are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products.
Intel does not control or audit the design or implementation of third party benchmarks or Web sites referenced in this document. Intel encourages all of its customers to visit the referenced Web sites or others where similar performance benchmarks are reported and confirm whether the referenced benchmarks are accurate and reflect performance of servers available for purchase.
This paper is for informational purposes only. THIS DOCUMENT IS PROVIDED “AS IS” WITH NO WARRANTIES WHATSOEVER, INCLUDING WARRANTY OF MERCHANTABILITY, NONINFRINGEMENT, FITNESS FOR ANY PARTICULAR PURPOSE OR ANY WARRANTY OTHERWISE ARISING OUT OF ANY PROPOSAL, SPECIFICATION OR SAMPLE. Intel disclaims all liability, including liability for infringement of any proprietary rights relating to use of information in this specification. No license, express or implied, by estoppel or otherwise, to any intellectual property rights is granted herein.
SPEC and the benchmark name SPECint are trademarks of the Standard Performance Evaluation Corporation. Benchmark results stated above reflect results published on http://www.spec.org as of October 22, 2012. For the latest SPECint_rate_base2006 benchmark results, visit http://www.spec.org/cgi-bin/osgresults?conf=rint2006
Intel's compilers may or may not optimize to the same degree for non-Intel microprocessors for optimizations that are not unique to Intel microprocessors. These optimizations include SSE2, SSE3, and SSSE3 instruction sets and other optimizations. Intel does not guarantee the availability, functionality, or effectiveness of any optimization on microprocessors not manufactured by Intel.
Microprocessor-dependent optimizations in this product are intended for use with Intel microprocessors. Certain optimizations not specific to Intel microarchitecture are reserved for Intel microprocessors. Please refer to the applicable product User and Reference Guides for more information regarding the specific instruction sets covered by this notice.
Notice revision #20110804
Copyright © 2012 Intel Corporation. All rights reserved. Intel, the Intel logo, Intel®, Xeon®, and the Xeon® logos are trademarks or registered trademarks of Intel Corporation or its subsidiaries in other countries. * Other names and brands may be claimed as the property of others. All timeframes, dates and products are subject to change without further notification.
Printed in USA 1112/FJ/xxx/PDF Please Recycle 328306-001EN