Performance Comparison of Pentium III Xeon Cache Options
First Edition (April 2001)
TC010401TB
Cache: A relatively small local memory that stores a copy of recently used instructions and data.
Coherency: Maintaining a consistent relationship among the contents of multiple caches, keeping each cache informed of any changes that affect its data.
TECHNOLOGY BRIEF

INTRODUCTION
Although for years processor speed has been considered the prime indicator of system
performance, cache size can also dramatically affect system performance in some application
environments. For example, the performance of memory-intensive applications ordinarily benefits
from increased cache memory. While some applications can take advantage of more cache to gain
better performance, others achieve their peak performance from a smaller cache and realize no
improvement from a larger one. Where a choice exists, selecting the optimum cache size for a
given environment gives customers the best value for their computing dollar.

This paper reports the results of five benchmark tests performed by Compaq laboratories to
determine the effect of cache size on system performance. These tests represent several types of
computing environments for which customers often purchase servers. The results of these tests
underscore the importance of understanding the relationship of the cache size to system
performance in specific computing environments and of weighing the trade-offs involved to
determine the best cache solution.
CACHE ARCHITECTURE

For those unfamiliar with the basic concepts, some background information on cache memory will
be helpful in understanding the discussion of the test results.
The Function of Cache Memory
Cache memory improves system performance by keeping a copy of recently used data in a small,
fast memory near the processor. Since cache memory typically runs at the same speed as the
processor, access to information stored in cache memory is much faster than access to the same
information stored in main memory. Caching is effective because most programs use the same
instructions and data repeatedly, and these repetitions allow a processor to run from its cache most
of the time. The more a processor can run from its cache, the more system performance increases.
In fact, to increase performance, processor designers today include multiple levels of cache:
typically, a small primary (level one, or L1) cache and a larger secondary (level two, or L2) cache.
When requesting data, the processor first accesses the L1 cache. If the requested data is not found
in the L1 cache, the processor then accesses the L2 cache before going to slower main memory.
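The lookup order described above (L1 first, then L2, then main memory) can be illustrated with a small simulation. This is only a sketch under stated assumptions: the `CacheLevel` class, the `read` function, the least-recently-used replacement policy, and the tiny capacities are all hypothetical choices made for illustration, and they do not describe how any Intel processor actually manages its caches.

```python
from collections import OrderedDict


class CacheLevel:
    """Hypothetical cache level holding at most `capacity` entries (LRU eviction)."""

    def __init__(self, name, capacity):
        self.name = name
        self.capacity = capacity
        self.lines = OrderedDict()

    def lookup(self, address):
        """Return True on a hit and mark the entry as most recently used."""
        if address in self.lines:
            self.lines.move_to_end(address)
            return True
        return False

    def fill(self, address, value):
        """Store a copy of the data, evicting the least recently used entry if full."""
        self.lines[address] = value
        self.lines.move_to_end(address)
        if len(self.lines) > self.capacity:
            self.lines.popitem(last=False)


def read(address, l1, l2, main_memory, stats):
    """Read one address: try L1, then L2, then main memory, counting who served it."""
    if l1.lookup(address):
        stats["L1"] += 1
        return l1.lines[address]
    if l2.lookup(address):
        stats["L2"] += 1
        value = l2.lines[address]
        l1.fill(address, value)  # keep a copy in the faster level
        return value
    stats["memory"] += 1
    value = main_memory[address]
    l2.fill(address, value)
    l1.fill(address, value)
    return value


# Illustrative sizes only: a 2-entry L1 and an 8-entry L2.
main_memory = {addr: addr * 10 for addr in range(16)}
l1 = CacheLevel("L1", capacity=2)
l2 = CacheLevel("L2", capacity=8)
stats = {"L1": 0, "L2": 0, "memory": 0}

# A program loop that reuses the same four addresses three times.
trace = [0, 1, 2, 3] * 3
values = [read(addr, l1, l2, main_memory, stats) for addr in trace]
print(stats)
```

In this trace the four reused addresses do not fit in the two-entry L1 but do fit in L2, so only the first pass goes to main memory and every later read is served by L2. That is the kind of workload for which a larger cache would be expected to help.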
Logically, larger caches should improve scalability and performance in multiprocessor systems
because each processor keeps more data in its local cache, reducing competition with other
processors in the system for access to resources. However, increasing the cache size also increases
the likelihood that another processor will need access to some of the data stored there. Managing
and maintaining coherency between the caches adds system bus traffic, and this overhead
increases with the use of more processors and larger caches.
The Pentium III Xeon Cache Architecture
Intel Pentium III Xeon processors have two levels of cache memory, a relatively small L1 cache
and a much larger L2 cache that varies in size. The L2 cache connects to the processor through a
64-bit dedicated, transaction-oriented bus that supports up to four concurrent cache accesses. The
earliest Pentium III processor (code-named Katmai) used an external L2 cache running at half the
processor frequency, but Pentium III Xeon processors use integrated caches that run at the full
speed of the processor bus.
1. For more information on the recent history of Intel processors, refer to The Intel Microprocessor Roadmap at http://www.compaq.com/support/techpubs/whitepapers/tc000808tb.html.
Normalized: A method to simplify comparison of a set of values, in which the lowest number divides into all the numbers. The lowest number is then shown as a “normalized” value of 1.0, while a number that is twice as much as the lowest number is shown as 2.0, and so on.
MB: megabyte
BENCHMARK TEST RESULTS
Benchmark tests simulate real-world application environments in a controlled and repeatable
fashion. Results of these standardized tests are certifiable by a standards body and therefore
provide a ready means for comparing the performance of different configurations or solutions from
different vendors. Several industry-standard benchmarks exist to evaluate server performance, each
emphasizing different applications or areas of interest. This paper summarizes the results of such
benchmark tests conducted by Compaq engineers to establish the effect of cache size on server
performance.
On-Line Transaction Processing Testing in a SQL Server Environment
The on-line transaction-processing (OLTP) test demonstrates how well a system’s throughput
responds to a transaction-processing load. The goal for this test was to compare cache size
performance rather than to obtain certifiable data, so only normalized data is presented here.
Figure 1 gives the normalized OLTP performance results for a Compaq ProLiant 8500 server using
Pentium III Xeon 700-MHz processors with L2 caches of 1 MB (lower curve) and 2 MB (upper
curve). The load placed on the system affects OLTP test performance, so tuning the load to
optimize the performance of the configuration under test is part of the usual process. In this case,
however, the load was kept constant so meaningful comparisons could be made.
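The normalization used for the results in this brief (the lowest value is divided into every value, so the lowest becomes 1.0) can be sketched as follows. The `normalize` helper and the raw throughput numbers are hypothetical and chosen only to make the arithmetic easy to follow. They are not measurements from the Compaq tests, whose raw data is not published here.

```python
def normalize(values):
    """Divide every value by the lowest one, so the lowest is shown as 1.0."""
    lowest = min(values)
    if lowest <= 0:
        raise ValueError("normalization requires positive values")
    return [value / lowest for value in values]


# Hypothetical raw throughput figures (transactions/second), for illustration only.
raw = [400.0, 800.0, 1000.0]
normalized = normalize(raw)
print(normalized)  # [1.0, 2.0, 2.5]
```

Normalized values keep the ratios between configurations while hiding the absolute throughput, which is why they suit a comparison that is not meant to be read as a certified result.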
2. The OLTP test results in this paper were loosely based on TPC benchmark testing. Refer to http://www.tpc.org to learn more about certified TPC testing and to see Compaq’s published results.

[Figure 1 - OLTP Performance Improvement Between 1-MB and 2-MB Caches in a ProLiant 8500 Server. Chart title: Relative Performance Scaling; x-axis: Number of Processors (2, 4, 6, 8); y-axis: Normalized Transactions/Second; series: 1-MB cache, 2-MB cache.]
Figure 1 shows that, in each configuration, processors with 2-MB caches provided better system
performance than the processors with 1-MB caches. In addition, it shows that both cache
configurations followed the same non-linear scalability pattern; scalability did not improve with
the larger cache. Table 1 summarizes the performance increases measured due to replacing 1-MB
caches with 2-MB caches. Even though the difference in absolute system performance increased
slightly with additional processors, that increase actually declined as a percentage of overall
performance. The larger cache still provides improved performance for the configurations with a
higher number of processors, but not as much improvement as it does for systems with fewer
processors.
Table 1 - OLTP Performance Gain in an 8-Processor System When 1-MB Caches Are Replaced by 2-MB Caches

Number of Processors    Performance Gain
2                       20.1%
4                       14.6%
6                       15.8%
8                       12.8%

Figure 2 demonstrates the difference larger caches can make in a four-processor OLTP
environment using a Compaq ProLiant ML570 server. A set of 2-MB caches yields a 16 percent
performance improvement over the 1-MB caches, and an even mix of 1-MB and 2-MB caches
yields a performance gain of about half as much. Because Compaq supports mixing of cache sizes
under certain conditions (see footnote 3), customers have another option for increasing system performance.

3. For details about Compaq support for mixing Intel processors in Compaq industry-standard servers, see http://www.compaq.com/support/techpubs/whitepapers/tc000703tb.html

[Figure 2 - Cache Performance in a Four-Processor System. x-axis: Number and Cache Size of Processors; y-axis: Normalized Transactions/Second. Data labels: 4 x 1MB = 1.00; 2 x 1MB + 2 x 2MB = 1.09; 4 x 2MB = 1.16. Annotation: performance gained from the larger cache.]
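The percentages quoted for Figure 2 follow directly from its normalized values. The short calculation below is a sketch that assumes the three data labels read from the figure (1.00, 1.09, 1.16) map to the three configurations in the order shown on its x-axis. The variable names are illustrative.

```python
# Normalized transactions/second as labeled in Figure 2 (assumed mapping).
results = {
    "4 x 1MB": 1.00,
    "2 x 1MB + 2 x 2MB": 1.09,
    "4 x 2MB": 1.16,
}

baseline = results["4 x 1MB"]

# Percentage gain of each configuration over the all-1-MB baseline.
gains = {
    config: round((value / baseline - 1) * 100, 1)
    for config, value in results.items()
}

# Share of the full 2-MB gain that the mixed configuration achieves.
mixed_share = gains["2 x 1MB + 2 x 2MB"] / gains["4 x 2MB"]

print(gains)        # {'4 x 1MB': 0.0, '2 x 1MB + 2 x 2MB': 9.0, '4 x 2MB': 16.0}
print(mixed_share)  # 0.5625
```

The mixed configuration delivers a 9 percent gain, a little over half of the 16 percent gain from four 2-MB caches, consistent with the "about half as much" description in the text.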