Processor Frequency Grades
and Cache Sizes on
Performance Benchmarks
Abstract: The refreshed Compaq Evo Workstations W6000 and W8000
will feature the leading-edge Intel Xeon .13 processor with Hyper-Threading
Technology. The new processors are fabricated with the
latest .13µ technology, include a 512KB L2 cache, support
frequencies ranging from 1.8 GHz to greater than 2.6 GHz, and
support multi-threaded execution.
The purpose of this paper is to study the performance benefits of
higher processor frequency grades and larger cache sizes in the
new Intel Xeon .13 processors versus the previous Intel Xeon .18
processors. This will be done using different industry-standard
benchmarks.
Analysis of Intel Xeon Processor Frequency Grades and Cache Sizes on Performance Benchmarks White Paper 2
Notice
The information in this publication is subject to change without notice and is provided “AS IS” WITHOUT
WARRANTY OF ANY KIND. THE ENTIRE RISK ARISING OUT OF THE USE OF THIS
INFORMATION REMAINS WITH RECIPIENT. IN NO EVENT SHALL COMPAQ BE LIABLE FOR
ANY DIRECT, CONSEQUENTIAL, INCIDENTAL, SPECIAL, PUNITIVE, OR OTHER DAMAGES
WHATSOEVER (INCLUDING, WITHOUT LIMITATION, DAMAGES FOR LOSS OF BUSINESS
PROFITS, BUSINESS INTERRUPTION, OR LOSS OF BUSINESS INFORMATION), EVEN IF
COMPAQ HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
The limited warranties for Compaq products are exclusively set forth in the documentation accompanying
such products. Nothing herein should be construed as constituting a further or additional warranty.
This publication does not constitute an endorsement of the product or products that were tested. The
configuration or configurations tested or described may or may not be the only available solution. This test
is not a determination of product quality or correctness, nor does it ensure compliance with any federal,
state or local requirements.
Compaq and Evo are trademarks of Compaq Information Technologies Group, L.P. in the U.S. and/or other
countries.
Microsoft, Windows, and Windows NT are trademarks of Microsoft Corporation in the U.S. and/or other
countries.
White Paper prepared by Workstations Division
First Edition (March 2002)
Document Number 16G1-0302A-WWEN
Introduction
Computer performance is highly dependent on key system features. Processor speed, memory
bandwidth, graphics cards, and disk drives all play important roles in determining system
performance. Highly computationally intensive tasks can benefit from higher frequency
processors. A larger cache on the processors can reduce the number of memory accesses and
increase system performance on applications that have small data sets (those that can reside on
the processor cache). Applications that require very large files use many disk accesses and require
more optimization in that area. Memory bandwidth is a crucial factor in getting the data quickly
to the processor. This paper will focus mainly on system processor performance.
A good measure of performance is the amount of time it takes to execute a given application.
Contrary to popular belief, clock frequency (MHz) and the number of instructions executed per
clock (IPC) are not fair indexes of performance by themselves. True performance is a
combination of both clock frequency (MHz) and IPC.
Performance = Frequency x IPC
This formula means that performance can be improved by increasing frequency, IPC, or both.
Frequency is a function of both the manufacturing process and the micro-architecture. At any
given clock frequency, the IPC is a function of the processor micro-architecture and the specific
application being executed. Although it is not always feasible to improve both the frequency and
the IPC, increasing one while holding the other close to that of the prior generation still
provides a significantly higher level of performance.
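The relationship above can be sketched numerically. The following Python fragment is a minimal illustration, not a measurement: the IPC value is a hypothetical placeholder, and the two frequencies are simply the end points of the range quoted in the abstract.

```python
def performance(freq_ghz: float, ipc: float) -> float:
    """Return billions of instructions completed per second (frequency x IPC)."""
    return freq_ghz * ipc

# Hypothetical IPC, held constant to isolate the effect of frequency.
ASSUMED_IPC = 1.0

slow = performance(1.8, ASSUMED_IPC)
fast = performance(2.6, ASSUMED_IPC)

# With IPC unchanged, the speedup equals the frequency ratio.
speedup = fast / slow
print(f"Speedup from 1.8 GHz to 2.6 GHz at constant IPC: {speedup:.2f}x")
```

Because IPC depends on the application, a measured speedup may differ from this idealized frequency ratio.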
In addition to these two methods for increasing performance, it is also possible to increase
performance by reducing the number of instructions that it takes to execute a specific task. Single
Instruction, Multiple Data (SIMD) is a technique used to accomplish this: one instruction operates
on several data elements at once. On these processors it is provided by the Streaming SIMD
Extensions (SSE), which operate on 128-bit registers that each hold four single-precision
floating-point values.
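The effect of a lower instruction count can be sketched with the usual relation: execution time = instruction count / (IPC x frequency). The Python fragment below is a rough model under stated assumptions: the array length, IPC, and clock frequency are hypothetical, and it counts only the arithmetic instructions, ignoring loads, stores, and loop overhead.

```python
import math

SSE_REGISTER_BITS = 128  # width of an SSE register
FLOAT_BITS = 32          # size of one single-precision value
LANES = SSE_REGISTER_BITS // FLOAT_BITS  # values processed per SSE instruction

def add_instruction_count(n_elements: int, lanes: int = 1) -> int:
    """Number of add instructions needed to process n_elements values."""
    return math.ceil(n_elements / lanes)

def execution_time_s(instructions: int, ipc: float, freq_hz: float) -> float:
    """Execution time = instruction count / (IPC x frequency)."""
    return instructions / (ipc * freq_hz)

N = 1_000_000    # hypothetical array length
IPC = 1.0        # hypothetical, assumed equal for both versions
FREQ_HZ = 2.0e9  # hypothetical 2.0 GHz clock

scalar_time = execution_time_s(add_instruction_count(N), IPC, FREQ_HZ)
simd_time = execution_time_s(add_instruction_count(N, LANES), IPC, FREQ_HZ)
print(f"scalar: {scalar_time * 1e6:.1f} us, SSE: {simd_time * 1e6:.1f} us")
```

Under these assumptions the SSE version needs one quarter of the add instructions, so frequency and IPC can stay the same while execution time drops.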
This analysis will discuss the performance differences between different speeds of the previous
Intel Xeon .18 processor (Foster) and the new Intel Xeon .13 processor (Prestonia). This paper will
also analyze how the larger cache on the Xeon .13 affects system performance.
Differences Between Intel Xeon Processors
Intel Xeon .18 Processors
The Intel Xeon .18 processor (Foster) is based on the Intel NetBurst micro-architecture and is
built on the 0.18-micron process. It includes a 256KB L2 cache, which speeds up critical
calculations and memory accesses, and an Execution Trace Cache. The Execution Trace Cache
caches decoded x86 instructions (micro-ops), removing the latency associated with the instruction
decoder from the main execution loops. In addition, the Execution Trace Cache stores these
micro-ops in the path of program execution flow, where the results of branches in the code are
integrated into the same cache line. This increases the instruction flow from the cache and makes
better use of the overall cache storage space (12K micro-ops), since the cache no longer stores
instructions that are branched over and never executed. The result is a means to deliver a high
volume of instructions to the processor’s execution units and a reduction in the overall time
required to recover from branches that have been mispredicted. The trace cache is a
micro-architectural design that directly helps the Intel Pentium 4 (P4) core attain a higher
IPC than the Intel Pentium III (P3). However, it has a drawback too. When the processor needs
to fetch an instruction that is not already in the trace cache, it must rely on the much slower
instruction decoders, causing the NetBurst pipeline to idle while it waits on them.
The Level 2 Advanced Transfer Cache is 256KB in size and delivers a high data throughput
between the Level 2 cache and the processor core. The Advanced Transfer Cache consists of a
256-bit (32-byte) interface that transfers data on each core clock. As a result, the peak data
transfer rate is the core clock speed multiplied by 32 bytes, usually reported in GB/s. This
contributes to the processor's ability to keep the high-frequency execution units executing
instructions rather than sitting idle.
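The transfer-rate rule stated above (core speed multiplied by 32 bytes) can be worked out directly. This short Python sketch applies it to the two end points of the frequency range quoted in the abstract; it assumes 1 GB = 10^9 bytes and gives a theoretical peak, not a measured figure.

```python
L2_INTERFACE_BYTES = 32  # 256-bit interface, one transfer per core clock

def l2_peak_bandwidth_gb_s(core_ghz: float) -> float:
    """Peak L2-to-core transfer rate in GB/s (1 GB = 10**9 bytes)."""
    return core_ghz * L2_INTERFACE_BYTES

for ghz in (1.8, 2.6):
    print(f"{ghz:.1f} GHz core -> {l2_peak_bandwidth_gb_s(ghz):.1f} GB/s peak")
```

At 1.8 GHz the rule gives 57.6 GB/s, and at 2.6 GHz it gives 83.2 GB/s, so the L2 bandwidth scales with the core clock.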
Intel Xeon .13 Processors
The new Intel Xeon .13 processor features a 512KB L2 cache instead of the 256KB cache in the
Xeon .18 processor. Doubling the cache lowers the L2 miss rate, so fewer accesses have to go out
to main memory. Neither the size of the Execution Trace Cache nor any of the other units of the
P4 core has changed, but the larger L2 cache will provide some performance increase for most
applications, especially newer ones. This will be evaluated in the following
sections.
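One common way to reason about the benefit of a lower miss rate is the textbook average memory access time model: average access time = hit time + miss rate x miss penalty. This model is not taken from the measurements in this paper, and every number in the Python sketch below (hit time, miss penalty, and both miss rates) is a hypothetical placeholder chosen only to show the shape of the calculation.

```python
def average_access_time(hit_cycles: float, miss_rate: float,
                        miss_penalty_cycles: float) -> float:
    """Average memory access time = hit time + miss rate x miss penalty."""
    return hit_cycles + miss_rate * miss_penalty_cycles

# All values below are hypothetical placeholders, not measured data.
HIT_CYCLES = 7       # assumed L2 hit latency in core clocks
MISS_PENALTY = 200   # assumed cost of going to main memory, in core clocks

small_cache = average_access_time(HIT_CYCLES, 0.05, MISS_PENALTY)  # "256KB" case
large_cache = average_access_time(HIT_CYCLES, 0.03, MISS_PENALTY)  # "512KB" case

print(f"assumed 256KB L2: {small_cache:.1f} cycles per access")
print(f"assumed 512KB L2: {large_cache:.1f} cycles per access")
```

The actual reduction in miss rate depends on how much of an application's working set fits in the larger cache, which is why the benefit varies from benchmark to benchmark.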
The Xeon .13 processor is built on a 0.13-micron process, a die shrink from 0.18 micron. The smaller transistors can switch
faster and produce less heat than their older counterparts. This can result in higher clock speeds
for these processors. All 0.13-micron CPUs use copper interconnects, which also aid in increasing
clock speeds.
Benchmark Analyses
These analyses are based on benchmarks that focus on real-world applications run by
typical business users, such as the following:
• Microsoft Word
• Microsoft Outlook
• Internet applications