Compaq Evo W8000, Xeon, Evo Workstation W6000, Evo Workstation W8000 Supplementary Manual

White Paper
March 2002 16G1-0302A-WWEN
Analysis of Intel Xeon Processor Frequency Grades and Cache Sizes on Performance Benchmarks
Prepared by: Workstations Division, Compaq Computer Corporation
Contents
Introduction
Differences Between Intel Xeon Processors
Intel Xeon .18 Processors
Intel Xeon .13 Processors
Benchmark Analyses
Business Winstone
SYSmark 2001
Cadalyst
ProE 2001i2
Summary
Abstract: The refreshed Compaq Evo Workstations W6000 and W8000 will feature the leading-edge Intel Xeon .13 processor with Hyper-Threading Technology. The new processors are fabricated with the latest .13µ technology, have a 512KB L2 cache, support frequencies ranging from 1.8 GHz to greater than 2.6 GHz, and support multi-threaded execution.
The purpose of this paper is to study the performance benefits, in terms of processor frequency grades and larger cache sizes, of the new Intel Xeon .13 processors versus the previous Intel Xeon .18 processors. The comparison uses several industry-standard benchmarks.
Notice
The information in this publication is subject to change without notice and is provided “AS IS” WITHOUT WARRANTY OF ANY KIND. THE ENTIRE RISK ARISING OUT OF THE USE OF THIS INFORMATION REMAINS WITH RECIPIENT. IN NO EVENT SHALL COMPAQ BE LIABLE FOR ANY DIRECT, CONSEQUENTIAL, INCIDENTAL, SPECIAL, PUNITIVE, OR OTHER DAMAGES WHATSOEVER (INCLUDING, WITHOUT LIMITATION, DAMAGES FOR LOSS OF BUSINESS PROFITS, BUSINESS INTERRUPTION, OR LOSS OF BUSINESS INFORMATION), EVEN IF COMPAQ HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
The limited warranties for Compaq products are exclusively set forth in the documentation accompanying such products. Nothing herein should be construed as constituting a further or additional warranty.
This publication does not constitute an endorsement of the product or products that were tested. The configuration or configurations tested or described may or may not be the only available solution. This test is not a determination of product quality or correctness, nor does it ensure compliance with any federal, state or local requirements.
Compaq and Evo are trademarks of Compaq Information Technologies Group, L.P. in the U.S. and/or other countries.
Microsoft, Windows, and Windows NT are trademarks of Microsoft Corporation in the U.S. and/or other countries.
Intel, Pentium, and Xeon are trademarks of Intel Corporation in the U.S. and/or other countries.
All other product names mentioned herein may be trademarks of their respective companies.
©2002 Compaq Information Technologies Group, L.P.
Analysis of Intel Xeon Processor Frequency Grades and Cache Sizes on Performance Benchmarks
White Paper prepared by Workstations Division
First Edition (March 2002)
Document Number 16G1-0302A-WWEN
Introduction
Computer performance is highly dependent on key system features. Processor speed, memory bandwidth, graphics cards, and disk drives all play important roles in determining system performance. Highly computationally intensive tasks can benefit from higher frequency processors. A larger cache on the processors can reduce the number of memory accesses and increase system performance on applications that have small data sets (those that can reside on the processor cache). Applications that require very large files use many disk accesses and require more optimization in that area. Memory bandwidth is a crucial factor in getting the data quickly to the processor. This paper will focus mainly on system processor performance.
A good measure of performance is the amount of time it takes to execute a given application. Contrary to popular belief, clock frequency (MHz) and the number of instructions executed per clock (IPC) are not fair indexes of performance by themselves. True performance is a combination of both clock frequency (MHz) and IPC.
Performance = Frequency x IPC
This formula means that performance can be improved by increasing frequency, IPC, or both. Frequency is a function of both the manufacturing process and the micro-architecture. At any given clock frequency, the IPC is a function of the processor micro-architecture and the specific application being executed. Although it is not always feasible to improve both the frequency and the IPC, increasing one while holding the other close to the level of the prior generation can still provide a significantly higher level of performance.
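As a rough illustration of Performance = Frequency x IPC, the following Python sketch compares two configurations that differ only in clock frequency. The function name `relative_performance`, the IPC value, and the 2.2 GHz clock are assumptions chosen for illustration; they are not measurements from this paper.

```python
def relative_performance(freq_ghz: float, ipc: float) -> float:
    """Return frequency x IPC, i.e. billions of instructions per second."""
    return freq_ghz * ipc


# Hypothetical values for illustration only (not benchmark results).
baseline = relative_performance(freq_ghz=1.8, ipc=1.0)
faster_clock = relative_performance(freq_ghz=2.2, ipc=1.0)

# With IPC held constant, the speedup equals the frequency ratio.
speedup = faster_clock / baseline
print(f"Speedup from frequency alone: {speedup:.2f}x")  # prints 1.22x
```

In practice the IPC of a real application rarely stays exactly constant across processor generations, so the frequency ratio is best read as an upper-level estimate rather than a prediction.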
In addition to these two methods, it is also possible to increase performance by reducing the number of instructions that it takes to execute a specific task. Single Instruction, Multiple Data (SIMD) is a technique used to accomplish this. With the 128-bit Streaming SIMD Extensions (SSE), a single instruction operates on four single-precision floating-point values at once.
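The effect of SIMD on instruction count can be sketched with simple arithmetic. The Python example below counts how many arithmetic instructions are needed to process an array of single-precision values with and without 128-bit packed operations. The function names and the array size of 1000 are hypothetical, and the model deliberately ignores loads, stores, and loop overhead.

```python
import math

SSE_REGISTER_BITS = 128  # SSE register width, as stated in the text
FLOAT32_BITS = 32        # width of one single-precision value


def lanes_per_instruction() -> int:
    """Number of single-precision values one packed SSE instruction handles."""
    return SSE_REGISTER_BITS // FLOAT32_BITS


def arithmetic_instructions(n_elements: int, simd: bool) -> int:
    """Instructions needed to apply one arithmetic operation to every element."""
    per_instruction = lanes_per_instruction() if simd else 1
    return math.ceil(n_elements / per_instruction)


# Hypothetical workload: one operation on each of 1000 values.
scalar_count = arithmetic_instructions(1000, simd=False)
simd_count = arithmetic_instructions(1000, simd=True)
print(scalar_count, simd_count)  # prints 1000 250
```

Under these simplifying assumptions the packed version needs one quarter of the arithmetic instructions, which is where the performance gain comes from when frequency and IPC are unchanged.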
This analysis discusses the performance differences between different speeds of the previous Intel Xeon .18 processor (Foster) and the new Intel Xeon .13 processor (Prestonia). It also analyzes how the larger cache on the Xeon .13 affects system performance.
Differences Between Intel Xeon Processors
Intel Xeon .18 Processors
The Intel Xeon .18 processor (Foster) is based on the Intel NetBurst micro-architecture. It is built on the 0.18-micron process, has a 256KB L2 cache, and includes an Execution Trace Cache; together these facilitate high-speed calculations and memory accesses. The Execution Trace Cache caches decoded x86 instructions (micro-ops), removing the latency associated with the instruction decoder from the main execution loops. In addition, the Execution Trace Cache stores these micro-ops in the order of program execution flow, so that the results of branches in the code are integrated into the same cache line. This increases the instruction flow from the cache and makes better use of the overall cache storage space (12K micro-ops), since the cache no longer stores instructions that are branched over and never executed. The result is a high volume of instructions delivered to the processor’s execution units and a reduction in the overall time required to recover from mispredicted branches. The trace cache is a micro-architectural design that has a direct impact on the Intel Pentium 4 (P4) core attaining a higher IPC than the Intel Pentium 3 (P3). It also has a drawback: when the processor needs an instruction that is not already in the trace cache, it must rely on the much slower instruction decoders, and the NetBurst pipeline idles while it waits for them.
The Level 2 Advanced Transfer Cache is 256KB in size and delivers high data throughput between the Level 2 cache and the processor core. The Advanced Transfer Cache has a 256-bit (32-byte) interface that transfers data on each core clock. As a result, the processor can deliver a data transfer rate of core speed multiplied by 32 bytes, reported in GB/s. This helps keep the high-frequency execution units executing instructions rather than sitting idle.
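The transfer-rate relationship described above (core speed multiplied by 32 bytes) can be worked through in a few lines of Python. This is a minimal sketch of the peak theoretical rate only: the function name is hypothetical, the 2.0 GHz clock is an illustrative value rather than a tested configuration, and GB is taken to mean 10^9 bytes.

```python
L2_INTERFACE_BITS = 256  # Advanced Transfer Cache interface width, per the text


def l2_peak_bandwidth_gb_s(core_clock_ghz: float) -> float:
    """Peak L2-to-core transfer rate in GB/s: core clock x bytes per clock."""
    bytes_per_clock = L2_INTERFACE_BITS // 8  # 256 bits = 32 bytes
    return core_clock_ghz * bytes_per_clock


# Hypothetical clock speed for illustration.
print(f"{l2_peak_bandwidth_gb_s(2.0):.1f} GB/s")  # prints 64.0 GB/s
```

Because the interface width is fixed, the peak rate scales linearly with the core clock, so a higher frequency grade raises L2 throughput in the same proportion.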
Intel Xeon .13 Processors
The new Intel Xeon .13 processor features a 512KB L2 cache, double the 256KB cache of the Xeon .18 processor. The larger cache lowers the L2 miss rate compared with the 256KB cache. The size of the execution trace cache has not changed, nor have any of the other units of the P4 core, but the larger L2 cache will provide some performance increase for most applications, especially newer ones. This is evaluated in the following sections.
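One common textbook way to reason about why a lower miss rate helps is the average memory access time model, sketched below in Python. This model is not taken from this paper, and every number in the example (hit time, miss penalty, and both miss rates) is a hypothetical placeholder, not a measured value for either processor.

```python
def average_access_cycles(hit_cycles: float, miss_rate: float,
                          miss_penalty_cycles: float) -> float:
    """Average cycles per L2 access: hit time plus miss rate x miss penalty."""
    return hit_cycles + miss_rate * miss_penalty_cycles


# Hypothetical values for illustration only.
HIT_CYCLES = 10.0
MISS_PENALTY_CYCLES = 200.0

smaller_cache = average_access_cycles(HIT_CYCLES, 0.04, MISS_PENALTY_CYCLES)
larger_cache = average_access_cycles(HIT_CYCLES, 0.02, MISS_PENALTY_CYCLES)
print(f"{smaller_cache:.1f} vs {larger_cache:.1f} cycles per access")
```

The size of the benefit depends on how much of an application's working set fits in the additional cache, which is why the gain varies from one benchmark to another.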
The Xeon .13 processor is built with a 0.13-micron die shrink. The smaller transistors can switch faster and produce less heat than their older counterparts. This can result in higher clock speeds for these processors. All 0.13-micron CPUs use copper interconnects, which also aid in increasing clock speeds.
Benchmark Analyses
These analyses are based on running benchmarks that focus on real-world applications run by typical users running business applications, such as the following:
Microsoft Word
Microsoft Outlook
Users running Internet applications