Xilinx PPC405 User Manual

Volume 2(a): PPC405 User Manual
Virtex-II Pro™ Platform FPGA Developer’s Kit
March 2002 Release
The shadow X shown above is a trademark of Xilinx, Inc.
"Xilinx" and the Xilinx logo are registered trademarks of Xilinx, Inc. Any rights not expressly granted herein are reserved.
CoolRunner, RocketChips, Rocket IP, Spartan, StateBENCH, StateCAD, Virtex, XACT, XC2064, XC3090, XC4005, XC5210 are registered Trademarks of Xilinx, Inc.
ACE Controller, ACE Flash, A.K.A. Speed, Alliance Series, AllianceCORE, Bencher, ChipScope, Configurable Logic Cell, CORE Generator, CoreLINX, Dual Block, EZTag, Fast CLK, Fast CONNECT, Fast FLASH, FastMap, Fast Zero Power, Foundation, Gigabit Speeds...and Beyond!, HardWire, HDL Bencher, IRL, J Drive, JBits, LCA, LogiBLOX, Logic Cell, LogiCORE, LogicProfessor, MicroBlaze, MicroVia, MultiLINX, NanoBlaze, PicoBlaze, PLUSASM, PowerGuide, PowerMaze, QPro, Real-PCI, Rocket I/O, SelectI/O, SelectRAM, SelectRAM+, Silicon Xpresso, Smartguide, Smart-IP, SmartSearch, SMARTswitch, System ACE, Testbench In A Minute, TrueMap, UIM, VectorMaze, VersaBlock, VersaRing, Virtex-II Pro, Wave Table, WebFITTER, WebPACK, WebPOWERED, XABEL, XACT-Floorplanner, XACT-Performance, XACTstep Advanced, XACTstep Foundry, XAM, XAPP, X-BLOX +, XC designated products, XChecker, XDM, XEPLD, Xilinx Foundation Series, Xilinx XDTV, Xinfo, XSI, XtremeDSP and ZERO+ are trademarks of Xilinx, Inc.
The Programmable Logic Company is a service mark of Xilinx, Inc.
The following are trademarks of International Business Machines Corporation in the United States, or other countries, or both: IBM IBM Logo PowerPC PowerPC Logo Blue Logic CoreConnect CodePack
All other trademarks are the property of their respective owners.
Xilinx does not assume any liability arising out of the application or use of any product described or shown herein; nor does it convey any license under its patents, copyrights, or maskwork rights or any rights of others. Xilinx reserves the right to make changes, at any time, in order to improve reliability, function or design and to supply the best product possible. Xilinx will not assume responsibility for the use of any circuitry described herein other than circuitry entirely embodied in its products. Xilinx provides any design, code, or information shown or described herein "as is." By providing the design, code, or information as one possible implementation of a feature, application, or standard, Xilinx makes no representation that such implementation is free from any claims of infringement. You are responsible for obtaining any rights you may require for your implementation. Xilinx expressly disclaims any warranty whatsoever with respect to the adequacy of any such implementation, including but not limited to any warranties or representations that the implementation is free from claims of infringement, as well as any implied warranties of merchantability or fitness for a particular purpose. Xilinx assumes no obligation to correct any errors contained herein or to advise any user of this text of any correction if such be made. Xilinx will not assume any liability for the accuracy or correctness of any engineering or software support or assistance provided to a user.
Xilinx products are not intended for use in life support appliances, devices, or systems. Use of a Xilinx product in such applications without the written consent of the appropriate Xilinx officer is prohibited.
Copyright 2002 Xilinx, Inc. All Rights Reserved.
Virtex-II Pro™ Platform FPGA Developer’s Kit www.xilinx.com March 2002 Release
1-800-255-7778

About This Book

Preface
This document is intended to serve as a stand-alone reference for application and system programmers of the PowerPC® 405D5 processor. It combines information from the following documents:
PowerPC 405 Embedded Processor Core User’s Manual, published by IBM Corporation (IBM order number SA14-2339-01).
The IBM PowerPC Embedded Environment Architectural Specifications for IBM PowerPC Embedded Controllers, published by IBM Corporation.
PowerPC Microprocessor Family: The Programming Environments, published by IBM Corporation (IBM order number G522-0290-01).
IBM PowerPC Embedded Processors Application Note: PowerPC 400 Series Caches: Programming and Coherency Issues.
IBM PowerPC Embedded Processors Application Note: PowerPC 40x Watch Dog Timer.
IBM PowerPC Embedded Processors Application Note: Programming Model Differences of the IBM PowerPC 400 Family and 600/700 Family Processors.

Document Organization

Chapter 1, Introduction to the PPC405, provides a general understanding of the PPC405 as an implementation of the PowerPC embedded-environment architecture. This chapter also contains an overview of the features supported by the PPC405.
Chapter 2, Operational Concepts, introduces the processor operating modes, execution model, synchronization, operand conventions, and instruction conventions.
Chapter 3, User Programming Model, describes the registers and instructions available to application software.
Chapter 4, PPC405 Privileged-Mode Programming Model, introduces the registers and instructions available to system software.
Chapter 5, Memory-System Management, describes the operation of the memory system, including caches. Real-mode storage control is also described in this chapter.
Chapter 6, Virtual-Memory Management, describes virtual-to-physical address translation as supported by the PPC405. Virtual-mode storage control is also described in this chapter.
Chapter 7, Exceptions and Interrupts, provides details of all exceptions recognized by the PPC405 and how software can use the interrupt mechanism to handle exceptions.
Chapter 8, Timer Resources, describes the timer registers and timer-interrupt controls available in the PPC405.
Chapter 9, Debugging, describes the debug resources available to software and hardware debuggers.
Chapter 10, Reset and Initialization, describes the state of the PPC405 following reset and the requirements for initializing the processor.
Chapter 11, Instruction Set, provides a detailed description of each instruction supported by the PPC405.
Appendix A, Register Summary, is a reference of all registers supported by the PPC405.
Appendix B, Instruction Summary, lists all instructions sorted by mnemonic, opcode, function, and form. Each entry for an instruction shows its complete encoding. General instruction-set information is also provided.
Appendix C, Simplified Mnemonics, lists the simplified mnemonics recognized by many PowerPC assemblers. These mnemonics provide a shorthand means of specifying frequently used instruction encodings and can greatly improve assembler code readability.
Appendix D, Programming Considerations, provides information on improving performance of software written for the PPC405.
Appendix E, PowerPC® 6xx/7xx Compatibility, describes the programming model differences between the PPC405 and PowerPC 6xx and 7xx series processors.
Appendix F, PowerPC® Book-E Compatibility, describes the programming model differences between the PPC405 and PowerPC Book-E processors.

Document Conventions

General Conventions

Table P-1 lists the general notational conventions used throughout this document.
Table P-1: General Notational Conventions
Convention Definition
mnemonic Instruction mnemonics are shown in lower-case bold.
. (period) Update. When used as a character in an instruction mnemonic, a period (.) means that the instruction updates the condition-register field.
! (exclamation) In instruction listings, an exclamation (!) indicates the start of a comment.
variable Variable items are shown in italic.
<optional> Optional items are shown in angle brackets.
ActiveLow An overbar indicates an active-low signal.
n A decimal number.
0xn A hexadecimal number.
0bn A binary number.
(rn) The contents of GPR rn.
(rA|0) The contents of the register rA, or 0 if the rA instruction field is 0.
cr_bit Used in simplified mnemonics to specify a CR-bit position (0 to 31) used as an operand.
cr_field Used in simplified mnemonics to specify a CR field (0 to 7) used as an operand.
OBJECTb A single bit in any object (a register, an instruction, an address, or a field) is shown as a subscripted number or name.
OBJECTb:b A range of bits in any object (a register, an instruction, an address, or a field).
OBJECTb,b, . . . A list of bits in any object (a register, an instruction, an address, or a field).
REGISTER[FIELD] Fields within any register are shown in square brackets.
REGISTER[FIELD, FIELD . . .] A list of fields in any register.
REGISTER[FIELD:FIELD] A range of fields in any register.

Instruction Fields

Table P-2 lists the instruction fields used in the various instruction formats. They are found in the instruction encodings and pseudocode, and are referred to throughout this document when describing instructions. The table includes the bit locations for the field within the instruction encoding.
Table P-2: Instruction Field Definitions
Field Location Description
AA 30 Absolute-address bit (branch instructions).
0 The immediate field represents an address relative to the current instruction address (CIA). The effective address (EA) of the branch is either the sum of the LI field sign-extended to 32 bits and the branch instruction address, or the sum of the BD field sign-extended to 32 bits and the branch instruction address.
1 The immediate field represents an absolute address. The EA of the branch is either the LI field or the BD field, sign-extended to 32 bits.
BD 16:29 An immediate field specifying a 14-bit signed two’s-complement branch displacement. This field is concatenated on the right with 0b00 and sign-extended to 32 bits.
BI 11:15 Specifies a bit in the CR used as a source for the condition of a conditional-branch instruction.
BO 6:10 Specifies options for conditional-branch instructions. See Conditional Branch Control, page 367.
crbA 11:15 Specifies a bit in the CR used as a source of a CR-logical instruction.
crbB 16:20 Specifies a bit in the CR used as a source of a CR-logical instruction.
crbD 6:10 Specifies a bit in the CR used as a destination of a CR-logical instruction.
crfD 6:8 Specifies a field in the CR used as a target in a compare or mcrf instruction.
crfS 11:13 Specifies a field in the CR used as a source in a mcrf instruction.
CRM 12:19 The field mask used to identify CR fields to be updated by the mtcrf instruction.
d 16:31 Specifies a 16-bit signed two’s-complement integer displacement for load/store instructions.
DCRF 11:20 A split field used to specify a device control register (DCR). The field is used to form the DCR number (DCRN).
E 16 A single-bit immediate field in the wrteei instruction specifying the value to be written to the MSR[EE] bit.
LI 6:29 An immediate field specifying a 24-bit signed two’s-complement branch displacement. This field is concatenated on the right with 0b00 and sign-extended to 32 bits.
LK 31 Link bit.
0 Do not update the link register (LR).
1 Update the LR with the address of the next instruction.
MB 21:25 Mask begin. Used in rotate-and-mask instructions to specify the beginning bit of a mask.
ME 26:30 Mask end. Used in rotate-and-mask instructions to specify the ending bit of a mask.
NB 16:20 Specifies the number of bytes to move in an immediate-string load or immediate-string store.
OE 21 Enables setting the OV and SO fields in the fixed-point exception register (XER) for extended arithmetic.
OPCD 0:5 Primary opcode. Primary opcodes, in decimal, appear in the instruction format diagrams presented with individual instructions. The OPCD field name does not appear in instruction descriptions.
rA 11:15 Specifies a GPR source operand and/or destination operand.
rB 16:20 Specifies a GPR source operand.
Rc 31 Record bit.
0 Instruction does not update the CR.
1 Instruction updates the CR to reflect the result of an operation.
See Condition Register (CR), page 361 for a further discussion of how the CR bits are set.
rD 6:10 Specifies a GPR destination operand.
rS 6:10 Specifies a GPR source operand.
SH 16:20 Specifies a shift amount.
SIMM 16:31 An immediate field used to specify a 16-bit signed-integer value.
SPRF 11:20 A split field used to specify a special purpose register (SPR). The field is used to form the SPR number (SPRN).
TBRF 11:20 A split field used to specify a time-base register (TBR). The field is used to form the TBR number (TBRN).
TO 6:10 Specifies the trap conditions, as defined in the tw and twi instruction descriptions.
UIMM 16:31 An immediate field used to specify a 16-bit unsigned-integer value.
XO 21:30 Extended opcode for instructions without an OE field. Extended opcodes, in decimal, appear in the instruction format diagrams presented with individual instructions. The XO field name does not appear in instruction descriptions.
XO 22:30 Extended opcode for instructions with an OE field. Extended opcodes, in decimal, appear in the instruction format diagrams presented with individual instructions. The XO field name does not appear in instruction descriptions.
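The bit locations in Table P-2 use big-endian bit numbering, so bit 0 is the most-significant bit of the 32-bit instruction word. The following Python sketch shows one way to pull fields out of an encoded instruction under that convention. It is an illustration only: the helper names (field, exts, sprn, branch_target) are hypothetical, the three example encodings were assembled by hand, and the half-swap used to form SPRN from SPRF is the usual PowerPC rule rather than something stated in this table.

```python
def field(word, start, end):
    """Return bits start..end of a 32-bit word, where bit 0 is the MSB."""
    width = end - start + 1
    shift = 31 - end
    return (word >> shift) & ((1 << width) - 1)


def exts(value, bits):
    """Sign-extend a two's-complement value that is 'bits' wide."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def sprn(word):
    """Form the SPR number from the split SPRF field (bits 11:20).

    Assumption: the two 5-bit halves of SPRF are swapped to form SPRN.
    """
    sprf = field(word, 11, 20)
    return ((sprf & 0x1F) << 5) | (sprf >> 5)


def branch_target(word, cia):
    """Target of an unconditional branch that uses the LI field.

    LI is concatenated on the right with 0b00 and sign-extended. With AA = 0
    the result is added to the current instruction address (CIA); with
    AA = 1 it is used as an absolute address.
    """
    displacement = exts(field(word, 6, 29) << 2, 26)
    aa = field(word, 30, 30)
    target = displacement if aa else cia + displacement
    return target & 0xFFFFFFFF


# Hand-assembled example encodings (assumptions for illustration).
ADDI = 0x38640001    # addi r3,r4,1
MFLR = 0x7C6802A6    # mfspr r3,LR (simplified mnemonic: mflr r3)
B_BACK = 0x4BFFFFF8  # b with a displacement of -8 bytes, AA = 0, LK = 0

print(field(ADDI, 0, 5))                   # OPCD
print(field(ADDI, 6, 10))                  # rD
print(field(ADDI, 11, 15))                 # rA
print(exts(field(ADDI, 16, 31), 16))       # SIMM
print(sprn(MFLR))                          # SPRN
print(hex(branch_target(B_BACK, 0x1000)))  # branch target
```

The same field helper applies to every row of the table, because each row is just a start bit and an end bit within the instruction word.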

Pseudocode Conventions

Table P-3 lists additional conventions used primarily in the pseudocode describing the operation of each instruction.
Table P-3: Pseudocode Conventions
Convention Definition
← Assignment
∧ AND logical operator
¬ NOT logical operator
∨ OR logical operator
⊕ Exclusive-OR (XOR) logical operator
+ Twos-complement addition
– Twos-complement subtraction, unary minus
× Multiplication
÷ Division yielding a quotient
% Remainder of an integer division. For example, (33 % 32) = 1.
|| Concatenation
=, ≠ Equal, not-equal relations
<, > Signed comparison relations
<ᵘ, >ᵘ Unsigned comparison relations
c0:3 A four-bit object used to store condition results in compare instructions.
ⁿb The bit or bit value b is replicated n times.
x Bit positions that are don’t-cares.
CEIL(n) Least integer ≥ n.
CIA Current instruction address. The 32-bit address of the instruction being described by a sequence of pseudocode. This address is used to set the next instruction address (NIA). Does not correspond to any architected register.
DCR(DCRN) A specific device control register, as indicated by DCRN.
DCRN The device control register number formed using the split DCRF field in a mfdcr or mtdcr instruction.
do Do loop. “to” and “by” clauses specify incrementing an iteration variable. “while” and “until” clauses specify terminating conditions. Indenting indicates the scope of a loop.
EA Effective address. The 32-bit address that specifies a location in main storage. Derived by applying indexing or indirect addressing rules to the specified operand.
EXTS(n) The result of extending n on the left with sign bits.
if...then...else... Conditional execution: if condition then a else b, where a and b represent one or more pseudocode statements. Indenting indicates the ranges of a and b. If b is null, the else does not appear.
instruction(EA) An instruction operating on a data-cache block or instruction-cache block associated with an EA.
leave Leave innermost do-loop or the do-loop specified by the leave statement.
MASK(MB,ME) Mask having 1s in positions MB through ME (wrapping if MB > ME) and 0s elsewhere.
MS(addr, n) The number of bytes represented by n at the location in main storage represented by addr.
NIA Next instruction address. The 32-bit address of the next instruction to be executed. In pseudocode, a successful branch is indicated by assigning a value to NIA. For instructions that do not branch, the NIA is CIA + 4.
RESERVE Reserve bit. Indicates whether a process has reserved a block of storage.
ROTL((RS),n) Rotate left. The contents of RS are rotated left the number of bits specified by n.
SPR(SPRN) A specific special-purpose register, as indicated by SPRN.
SPRN The special-purpose register number formed using the split SPRF field in a mfspr or mtspr instruction.
TBR(TBRN) A specific time-base register, as indicated by TBRN.
TBRN The time-base register number formed using the split TBRF field in a mftb instruction.
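Several of the Table P-3 conventions map directly onto small integer helpers. The Python sketch below is one possible reading of EXTS, ROTL, MASK(MB,ME), bit replication, and the (rA|0) operand convention from Table P-1, assuming 32-bit values and big-endian bit numbering (bit 0 is the most-significant bit). All function names are hypothetical, and the rotate-then-mask and displacement-addressing examples are illustrations rather than definitions taken from this preface.

```python
WORD_MASK = 0xFFFFFFFF


def exts(value, bits):
    """EXTS: extend a 'bits'-wide value on the left with its sign bit."""
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def rotl32(value, n):
    """ROTL: rotate a 32-bit value left by n bit positions."""
    n %= 32
    return ((value << n) | (value >> (32 - n))) & WORD_MASK


def mask(mb, me):
    """MASK(MB,ME): 1s in bit positions MB through ME, wrapping if MB > ME."""
    ones_from_mb = WORD_MASK >> mb                     # 1s in bits mb..31
    ones_up_to_me = (WORD_MASK << (31 - me)) & WORD_MASK  # 1s in bits 0..me
    if mb <= me:
        return ones_from_mb & ones_up_to_me
    return ones_from_mb | ones_up_to_me


def replicate(bit, n):
    """Replication: the bit value 'bit' repeated n times."""
    return (1 << n) - 1 if bit else 0


def ea_displacement(gprs, ra, d):
    """EA for displacement addressing, read as (rA|0) + EXTS(d).

    When the rA field is 0, the base is the value 0, not the contents of r0.
    """
    base = 0 if ra == 0 else gprs[ra]
    return (base + exts(d, 16)) & WORD_MASK


# Rotate-then-mask: bring the top byte of a word down to the low byte.
top_byte = rotl32(0x12345678, 8) & mask(24, 31)
print(hex(top_byte))

# (rA|0): r0 holds a nonzero value but contributes 0 as a base.
gprs = [0xDEADBEEF] + [0] * 31
gprs[4] = 0x1000
print(hex(ea_displacement(gprs, 4, 0xFFFC)))  # base r4, displacement -4
print(hex(ea_displacement(gprs, 0, 0x0010)))  # base is 0, not r0
```

Keeping every result masked to 32 bits mirrors the fixed register width; Python integers would otherwise grow without bound.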

Operator Precedence

Table P-4 lists the pseudocode operators and their associativity in descending order of precedence:
Table P-4: Operator Precedence
Operators Associativity
REGISTERb, REGISTER[FIELD], function evaluation Left to right
ⁿb Right to left
¬, – (unary minus) Right to left
×, ÷ Left to right
+, – Left to right
|| Left to right
=, ≠, <, >, <ᵘ, >ᵘ Left to right
∧, ⊕ Left to right
∨ Left to right
← None

Registers

Table P-5 lists the PPC405 registers and their descriptive names.
Table P-5: PPC405 Registers
Register Descriptive Name
CCR0 Core-configuration register 0
CR Condition register
CTR Count register
DACn Data-address compare n
DBCRn Debug-control register n
DBSR Debug-status register
DCCR Data-cache cacheability register
DCWR Data-cache write-through register
DEAR Data-error address register
DVCn Data-value compare n
ESR Exception-syndrome register
EVPR Exception-vector prefix register
GPR General-purpose register. Specific GPRs are identified using the notational convention rn (see below)
IACn Instruction-address compare n
ICCR Instruction-cache cacheability register
ICDBDR Instruction-cache debug-data register
LR Link register
MSR Machine-state register
PID Process ID
PIT Programmable-interval timer
PVR Processor-version register
rn Specifies GPR n (r15, for example)
SGR Storage-guarded register
SLER Storage little-endian register
SPRGn SPR general-purpose register n
SRRn Save/restore register n
SU0R Storage user-defined 0 register
TBL Time-base lower
TBU Time-base upper
TCR Timer-control register
TSR Timer-status register
USPRGn User SPR general-purpose register n
XER Fixed-point exception register
ZPR Zone-protection register

Terms

atomic access
A memory access that attempts to read from and write to the same address uninterrupted by other accesses to that address. The term refers to the fact that such transactions are indivisible.
big endian
A memory byte ordering where the address of an item corresponds to the most-significant byte.
Book-E
A version of the PowerPC architecture designed specifically for embedded applications.
cache block
Synonym for cacheline.
cacheline
A portion of a cache array that contains a copy of contiguous system-memory addresses. Cachelines are 32 bytes long and aligned on a 32-byte address.
clear
To write a bit value of 0.
cache set
Synonym for congruence class.
congruence class
A collection of cachelines with the same index.
dirty
An indication that cache information is more recent than the copy in memory.
doubleword
Eight bytes, or 64 bits.
effective address
The untranslated memory address as seen by a program.
exception
An abnormal event or condition that requires the processor’s attention. Exceptions can be caused by instruction execution or by an external device. The processor records the occurrence of an exception, and an exception often causes an interrupt to occur.
fill buffer
A buffer that receives and sends data and instructions between the processor and PLB. It is used when cache misses occur and when access to non-cacheable memory occurs.
flush
A cache or TLB operation that involves writing back a modified entry to memory, followed by an invalidation of the entry.
GB
Gigabyte, or one-billion bytes.
halfword
Two bytes, or 16 bits.
hit
For cache arrays and TLB arrays, an indication that requested information exists in the accessed array.
interrupt
The process of stopping the currently executing program so that an exception can be handled.
invalidate
A cache or TLB operation that causes an entry to be marked as invalid. An invalid entry can be subsequently replaced.
KB
Kilobyte, or one-thousand bytes.
line buffer
A buffer located in the cache array that can temporarily hold the contents of an entire cacheline. It is loaded with the contents of a cacheline when a cache hit occurs.
little endian
A memory byte ordering where the address of an item corresponds to the least-significant byte.
logical address
Synonym for effective address.
MB
Megabyte, or one-million bytes.
memory
Collectively, cache memory and system memory.
miss
For cache arrays and TLB arrays, an indication that requested information does not exist in the accessed array.
OEA
The PowerPC operating-environment architecture, which defines the memory-management model, supervisor-level registers and instructions, synchronization requirements, the exception model, and the time-base resources as seen by supervisor programs.
on chip
In system-on-chip implementations, this indicates on the same chip as the processor core, but external to the processor core.
pending
As applied to interrupts, this indicates that an exception occurred, but the interrupt is disabled. The interrupt occurs when it is later enabled.
physical address
The address used to access physically-implemented memory. This address can be translated from the effective address. When address translation is not used, this address is equal to the effective address.
PLB
Processor local bus.
privileged mode
The operating mode typically used by system software. Privileged operations are allowed and software can access all registers and memory.
process
A program (or portion of a program) and any data required for the program to run.
problem state
Synonym for user mode.
real address
Synonym for physical address.
scalar
Individual data objects and instructions. Scalars are of arbitrary size.
set
To write a bit value of 1.
sticky
A bit that can be set by software, but cleared only by the processor. Alternatively, a bit that can be cleared by software, but set only by the processor.
string
A sequence of consecutive bytes.
supervisor state
Synonym for privileged mode.
system memory
Physical memory installed in a computer system external to the processor core, such as RAM, ROM, and flash.
tag
As applied to caches, a set of address bits used to uniquely identify a specific cacheline within a congruence class. As applied to TLBs, a set of address bits used to uniquely identify a specific entry within the TLB.
UISA
The PowerPC user instruction-set architecture, which defines the base user-level instruction set, registers, data types, the memory model, the programming model, and the exception model as seen by user programs.
user mode
The operating mode typically used by application software. Privileged operations are not allowed in user mode, and software can access a restricted set of registers and memory.
VEA
The PowerPC virtual-environment architecture, which defines a multi-access memory model, the cache model, cache-control instructions, and the time-base resources as seen by user programs.
virtual address
An intermediate address used to translate an effective address into a physical address. It consists of a process ID and the effective address. It is only used when address translation is enabled.
word
Four bytes, or 32 bits.

Additional Reading

In addition to the source documents listed at the beginning of this preface, the following documents contain additional information of potential interest to readers of this manual:
The PowerPC Architecture: A Specification for a New Family of RISC Processors, IBM 5/1994. Published by Morgan Kaufmann Publishers, Inc. San Francisco (ASIN: 1558603166).
Book E: Enhanced PowerPC Architecture, IBM 3/2000.
The PowerPC Compiler Writers Guide, IBM 1/1996. Published by Warthman Associates, Palo Alto, CA (ISBN 0-9649654-0-2).
Optimizing PowerPC Code: Programming the PowerPC Chip in Assembly Language, by Gary Kacmarcik (ASIN: 0201408392).
PowerPC Programming Pocket Book, by Steve Heath (ISBN 0750621117).
Computer Architecture: A Quantitative Approach, by John L. Hennessy and David A. Patterson.

Chapter 1: Introduction to the PPC405

The PPC405 is a 32-bit implementation of the PowerPC® embedded-environment architecture that is derived from the PowerPC architecture. Specifically, the PPC405 is an embedded PowerPC 405D5 processor core.
The PowerPC architecture provides a software model that ensures compatibility between implementations of the PowerPC family of microprocessors. The PowerPC architecture defines parameters that guarantee compatible processor implementations at the application-program level, allowing broad flexibility in the development of derivative PowerPC implementations that meet specific market requirements.
This chapter provides an overview of the PowerPC architecture and an introduction to the features of the PPC405 core.

PowerPC Architecture Overview

The PowerPC architecture is a 64-bit architecture with a 32-bit subset. The material in this document only covers aspects of the 32-bit architecture implemented by the PPC405.
In general, the PowerPC architecture defines the following:
Instruction set
Programming model
Memory model
Exception model
Memory-management model
Time-keeping model
Instruction Set
The instruction set specifies the types of instructions (such as load/store, integer arithmetic, and branch instructions), the specific instructions, and the encoding used for the instructions. The instruction set definition also specifies the addressing modes used for accessing memory.
Programming Model
The programming model defines the register set and the memory conventions, including details regarding the bit and byte ordering, and the conventions for how data are stored.
Memory Model
The memory model defines the address-space size and how it is subdivided into pages. It also defines attributes for specifying memory-region cacheability, byte ordering (big-endian or little-endian), coherency, and protection.
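As a small illustration of the byte-ordering attribute, the Python sketch below stores the same 32-bit word in both orders using the standard struct module. The value and variable names are arbitrary choices for the example. In big-endian order the most-significant byte sits at the lowest address; in little-endian order the least-significant byte does.

```python
import struct

value = 0x12345678

# ">I" packs an unsigned 32-bit integer most-significant byte first.
big_endian = struct.pack(">I", value)
# "<I" packs the same integer least-significant byte first.
little_endian = struct.pack("<I", value)

# Offset 0 stands in for the lowest memory address of the word.
print(big_endian.hex(), hex(big_endian[0]))
print(little_endian.hex(), hex(little_endian[0]))

# Reading the bytes back with the matching order recovers the value.
assert struct.unpack(">I", big_endian)[0] == value
assert struct.unpack("<I", little_endian)[0] == value
```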
Exception Model
The exception model defines the set of exceptions and the conditions that can cause those exceptions. The model specifies exception characteristics, such as whether they are precise or imprecise, synchronous or asynchronous, and maskable or non-maskable. The model defines the exception vectors and a set of registers used when interrupts occur as a result of an exception. The model also provides memory space for implementation-specific exceptions.
Memory-Management Model
The memory-management model defines how memory is partitioned, configured, and protected. The model also specifies how memory translation is performed, defines special memory-control instructions, and specifies other memory-management characteristics.
Time-Keeping Model
The time-keeping model defines resources that permit the time of day to be determined and the resources and mechanisms required for supporting timer-related exceptions.

PowerPC Architecture Levels

The aspects of the PowerPC architecture listed above are defined at three levels. This layering provides flexibility by allowing degrees of software compatibility across a wide range of implementations. For example, an implementation such as an embedded controller can support the user instruction set, but not the memory management, exception, and cache models where it might be impractical to do so.
The three levels of the PowerPC architecture are defined in Table 1-1.
Table 1-1: Three Levels of PowerPC Architecture
User Instruction-Set Architecture (UISA)
Defines the architecture level to which user-level (sometimes referred to as problem state) software should conform
Defines the base user-level instruction set, user-level registers, data types, floating-point memory conventions, exception model as seen by user programs, memory model, and the programming model
Note: All PowerPC implementations adhere to the UISA.
Virtual Environment Architecture (VEA)
Defines additional user-level functionality that falls outside typical user-level software requirements
Describes the memory model for an environment in which multiple devices can access memory
Defines aspects of the cache model and cache-control instructions
Defines the time-base resources from a user-level perspective
Note: Implementations that conform to the VEA level are guaranteed to conform to the UISA level.
Operating Environment Architecture (OEA)
Defines supervisor-level resources typically required by an operating system
Defines the memory-management model, supervisor-level registers, synchronization requirements, and the exception model
Defines the time-base resources from a supervisor-level perspective
Note: Implementations that conform to the OEA level are guaranteed to conform to the UISA and VEA levels.
The PowerPC architecture requires that all PowerPC implementations adhere to the UISA, offering compatibility among all PowerPC application programs. However, different versions of the VEA and OEA are permitted.
Embedded applications written for the PPC405 are compatible with other PowerPC implementations. Privileged software generally is not compatible. The migration of privileged software from the PowerPC architecture to the PPC405 is in many cases straightforward because of the simplifications made by the PowerPC embedded-environment architecture. Software developers who are concerned with cross-compatibility of privileged software between the PPC405 and other PowerPC implementations should refer to Appendix E, PowerPC® 6xx/7xx Compatibility.
Latitude Within the PowerPC Architecture Levels
Although the PowerPC architecture defines parameters necessary to ensure compatibility among PowerPC processors, it also allows a wide range of options for individual implementations. These include the following:
Some resources are optional, such as certain registers, bits within registers, instructions, and exceptions.
Implementations can define additional privileged special-purpose registers (SPRs), exceptions, and instructions to meet special system requirements, such as power management in processors designed for very low-power operation.
Implementations can define many operating parameters. For example, the PowerPC architecture can define the possible conditions causing an alignment exception. A particular implementation can choose to solve the alignment problem without causing an exception.
Processors can implement any architectural resource or instruction with assistance from software (that is, they can trap and emulate) as long as the results (aside from performance) are identical to those specified by the architecture. In this case, a complete implementation requires both hardware and software.
Some parameters are defined at one level of the architecture and defined more specifically at another. For example, the UISA defines conditions that can cause an alignment exception and the OEA specifies the exception itself.
Features Not Defined by the PowerPC Architecture
Because flexibility is an important feature of the PowerPC architecture, many aspects of processor design (typically relating to the hardware implementation) are not defined, including the following:
System-Bus Interface
Although many implementations can share similar interfaces, the PowerPC architecture does not define individual signals or the bus protocol. For example, the OEA allows each implementation to specify the signal or signals that trigger a machine-check exception.
Cache Design
The PowerPC architecture does not define the size, structure, replacement algorithm, or mechanism used for maintaining cache coherency. The PowerPC architecture supports, but does not require, the use of separate instruction and data caches.
Execution Units
The PowerPC architecture is a RISC architecture, and as such has been designed to facilitate the design of processors that use pipelining and parallel execution units to maximize instruction throughput. However, the PowerPC architecture does not define the internal hardware details of an implementation. For example, one processor might implement two units dedicated to executing integer-arithmetic instructions and another might implement a single unit for executing all integer instructions.
Other Internal Microarchitecture Issues
The PowerPC architecture does not specify the execution unit responsible for executing a particular instruction. The architecture does not define details regarding the instruction-fetch mechanism, how instructions are decoded and dispatched, and how results are written to registers. Dispatch and write-back can occur in-order or out-of-order. Although the architecture specifies certain registers, such as the GPRs and FPRs, implementations can use register renaming or other schemes to reduce the impact of data dependencies and register contention.
Implementation-Specific Registers
Each implementation can have its own unique set of implementation registers that are not defined by the architecture.

PowerPC Embedded-Environment Architecture

The PowerPC embedded-environment architecture is optimized for embedded controllers. This architecture is a forerunner to the PowerPC Book-E architecture. The PowerPC embedded-environment architecture provides an alternative definition for certain features specified by the PowerPC VEA and OEA. Implementations that adhere to the PowerPC embedded-environment architecture also adhere to the PowerPC UISA. PowerPC embedded-environment processors are 32-bit-only implementations and thus do not include the special 64-bit extensions to the PowerPC UISA. Also, floating-point support can be provided either in hardware or software by PowerPC embedded-environment processors.
Figure 1-1 shows the relationship between the PowerPC embedded-environment architecture, the PowerPC architecture, and the PowerPC Book-E architecture.
[Figure not reproduced. Labels: UISA. PowerPC Architecture: 32-bit/64-bit modes; OEA with hashed paging, segments, and BATs. PowerPC Embedded-Environment Architecture: 32-bit only; VEA enhancements (true little-endian support, enhanced cache management); OEA enhancements (simplified memory management, software-managed TLB, variable page sizes, interrupt extensions (critical/non-critical, virtual-memory relocatable), timer extensions, debug extensions). PowerPC Book-E Architecture: 64-bit UISA extensions; synchronization using memory barriers.]
Figure 1-1: Relationship of PowerPC Architectures
The PowerPC embedded-environment architecture features:
Memory management optimized for embedded software environments.
Cache-management instructions for optimizing performance and memory control in complex applications that are graphically and numerically intensive.
Storage attributes for controlling memory-system behavior.
Special-purpose registers for controlling the use of debug resources, timer resources, interrupts, real-mode storage attributes, memory-management facilities, and other architected processor resources.
A device-control-register address space for managing on-chip peripherals such as memory controllers.
A dual-level interrupt structure and interrupt-control instructions.
Multiple timer resources.
Debug resources that enable hardware-debug and software-debug functions such as instruction breakpoints, data breakpoints, and program single-stepping.
Virtual Environment
The virtual environment defines architectural features that enable application programs to create or modify code, to manage storage coherency, and to optimize memory-access performance. It defines the cache and memory models, the timekeeping resources from a user perspective, and resources that are accessible in user mode but are primarily used by system-library routines. The following summarizes the virtual-environment features of the PowerPC embedded-environment architecture:
Storage model:
- Storage-control instructions as defined in the PowerPC virtual-environment architecture. These instructions are used to manage instruction caches and data caches, and for synchronizing and ordering instruction execution.
- Storage attributes for controlling memory-system behavior. These are: write-through, cacheability, memory coherence (optional), guarded, and endian.
- Operand-placement requirements and their effect on performance.
The time-base function as defined by the PowerPC virtual-environment architecture, for user-mode read access to the 64-bit time base.
Operating Environment
The operating environment describes features of the architecture that enable operating systems to allocate and manage storage, to handle errors encountered by application programs, to support I/O devices, and to provide operating-system services. It specifies the resources and mechanisms that require privileged access, including the memory-protection and address-translation mechanisms, the exception-handling model, and privileged timer resources. Table 1-2 summarizes the operating-environment features of the PowerPC embedded-environment architecture.
Table 1-2: Operating-Environment Features of the PowerPC Embedded-Environment Architecture
Register model:
- Privileged special-purpose registers (SPRs) and instructions for accessing those registers
- Device control registers (DCRs) and instructions for accessing those registers
Storage model:
- Privileged cache-management instructions
- Storage-attribute controls
- Address translation and memory protection
- Privileged TLB-management instructions
Exception model:
- Dual-level interrupt structure supporting various exception types
- Specification of interrupt priorities and masking
- Privileged SPRs for controlling and handling exceptions
- Interrupt-control instructions
- Specification of how partially executed instructions are handled when an interrupt occurs
Debug model:
- Privileged SPRs for controlling debug modes and debug events
- Specification for seven types of debug events
- Specification for allowing a debug event to cause a reset
- The ability of the debug mechanism to freeze the timer resources
Time-keeping model:
- 64-bit time base
- 32-bit decrementer (the programmable-interval timer)
- Three timer-event interrupts: programmable-interval timer (PIT), fixed-interval timer (FIT), and watchdog timer (WDT)
- Privileged SPRs for controlling the timer resources
- The ability to freeze the timer resources using the debug mechanism
Synchronization requirements:
- Requirements for special registers and the TLB
- Requirements for instruction fetch and for data access
- Specifications for context synchronization and execution synchronization
Reset and initialization requirements:
- Specification for two internal mechanisms that can cause a reset: debug-control register (DBCR) and timer-control register (TCR)
- Contents of processor resources after a reset
- The software-initialization requirements, including an initialization code example

PowerPC Book-E Architecture

The PowerPC Book-E architecture extends the capabilities introduced in the PowerPC embedded-environment architecture. Although the PPC405 is not a PowerPC Book-E implementation, many of the features available in the 32-bit subset of the PowerPC Book-E architecture are available in the PPC405. The PowerPC Book-E architecture and the PowerPC embedded-environment architecture differ in the following general ways:
64-bit addressing and 64-bit operands are available. Unlike 64-bit mode in the PowerPC UISA, 64-bit support in PowerPC Book-E architecture is non-modal and instead defines new 64-bit instructions and flags.
Real mode is eliminated, and the memory-management unit is active at all times. The elimination of real mode results in the elimination of real-mode storage-attribute registers.
Memory synchronization requirements are changed in the architecture and a memory-barrier instruction is introduced.
A small number of new instructions are added to the architecture and several instructions are removed.
Several SPR addresses and names are changed in the architecture, as are the assignment and meanings of some bits within certain SPRs.
Embedded applications written for the PPC405 are compatible with PowerPC Book-E implementations. Privileged software is, in general, not compatible, but the differences are relatively minor. Software developers who are concerned with cross-compatibility of privileged software between the PPC405 and PowerPC Book-E implementations should refer to Appendix F, PowerPC® Book-E Compatibility.
PPC405 Features
The PPC405 processor core is an implementation of the PowerPC embedded-environment architecture. The processor provides fixed-point embedded applications with high performance at low power consumption. It is compatible with the PowerPC UISA. Much of the PPC405 VEA and OEA support is also available in implementations of the PowerPC Book-E architecture. Key features of the PPC405 include:
A fixed-point execution unit fully compliant with the PowerPC UISA:
- 32-bit architecture, containing thirty-two 32-bit general purpose registers (GPRs).
PowerPC embedded-environment architecture extensions providing additional support for embedded-systems applications:
- True little-endian operation
- Flexible memory management
- Multiply-accumulate instructions for computationally intensive applications
- Enhanced debug capabilities
- 64-bit time base
- 3 timers: programmable interval timer (PIT), fixed interval timer (FIT), and watchdog timer (all are synchronous with the time base)
Performance-enhancing features, including:
- Static branch prediction
- Five-stage pipeline with single-cycle execution of most instructions, including loads and stores
- Multiply-accumulate instructions
- Hardware multiply/divide for faster integer arithmetic (4-cycle multiply, 35-cycle divide)
- Enhanced string and multiple-word handling
- Support for unaligned loads and unaligned stores to cache arrays, main memory, and on-chip memory (OCM)
- Minimized interrupt latency
Integrated instruction-cache:
- 16 KB, 2-way set associative
- Eight words (32 bytes) per cacheline
- Fetch line buffer
- Instruction-fetch hits are supplied from the fetch line buffer
- Programmable prefetch of next-sequential line into the fetch line buffer
- Programmable prefetch of non-cacheable instructions: full line (eight words) or half line (four words)
- Non-blocking during fetch line fills
Integrated data-cache:
- 16 KB, 2-way set associative
- Eight words (32 bytes) per cacheline
- Read and write line buffers
- Load and store hits are supplied from/to the line buffers
- Write-back and write-through support
- Programmable load and store cacheline allocation
- Operand forwarding during cacheline fills
- Non-blocking during cacheline fills and flushes
Support for on-chip memory (OCM) that can provide memory-access performance identical to a cache hit
Flexible memory management:
- Translation of the 4 GB logical-address space into the physical-address space
- Independent control over instruction translation and protection, and data translation and protection
- Page-level access control using the translation mechanism
- Software control over the page-replacement strategy
- Write-through, cacheability, user-defined 0, guarded, and endian (WIU0GE) storage-attribute control for each virtual-memory region
- WIU0GE storage-attribute control for thirty-two 128 MB regions in real mode
- Additional protection control using zones
Enhanced debug support with logical operators:
- Four instruction-address compares
- Two data-address compares
- Two data-value compares
- JTAG instruction for writing into the instruction cache
- Forward and backward instruction tracing
Advanced power management support

Privilege Modes

Software runs on the PPC405 in one of two privilege modes: privileged and user. The privilege modes supported by the PPC405 are described in Processor Operating Modes, page 343.
Privileged Mode
Privileged mode allows programs to access all registers and execute all instructions supported by the processor. Normally, the operating system and low-level device drivers operate in this mode.
User Mode
User mode restricts access to some registers and instructions. Normally, application programs operate in this mode.

Address Translation Modes

The PPC405 also supports two modes of address translation: real and virtual. Refer to Chapter 6, Virtual-Memory Management, for more information on address translation.
Real Mode
In real mode, programs address physical memory directly.
Virtual Mode
In virtual mode, programs address virtual memory and virtual-memory addresses are translated by the processor into physical-memory addresses. This allows programs to access much larger address spaces than might be implemented in the system.

Addressing Modes

Whether the PPC405 is running in real mode or virtual mode, data addressing is supported by the load and store instructions using one of the following addressing modes:
Register-indirect with immediate index: A base address is stored in a register, and a displacement from the base address is specified as an immediate value in the instruction.
Register-indirect with index: A base address is stored in a register, and a displacement from the base address is stored in a second register.
Register indirect: The data address is stored in a register.
Instructions that use the two indexed forms of addressing also allow for automatic updates to the base-address register. With these instruction forms, the new data address is calculated, used in the load or store data access, and stored in the base-address register.
The data-addressing modes are described in Operand-Address Calculation, page 378.
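The address arithmetic behind the three data-addressing modes and the update forms can be sketched in C. This is only an illustration: the `regs_t` type and the function names are hypothetical, and the sketch leaves out any special-case register encodings that the instruction set may define.

```c
#include <stdint.h>

/* Hypothetical register-file model; not a PPC405 programming interface. */
typedef struct {
    uint32_t gpr[32];
} regs_t;

/* Register-indirect with immediate index: base register plus a
   sign-extended immediate displacement. */
static uint32_t ea_immediate_index(const regs_t *r, unsigned ra, int16_t d)
{
    return r->gpr[ra] + (uint32_t)(int32_t)d;
}

/* Register-indirect with index: base register plus a second register. */
static uint32_t ea_index(const regs_t *r, unsigned ra, unsigned rb)
{
    return r->gpr[ra] + r->gpr[rb];
}

/* Register indirect: the register holds the data address itself. */
static uint32_t ea_register_indirect(const regs_t *r, unsigned ra)
{
    return r->gpr[ra];
}

/* Update form: the new address is calculated, used for the access,
   and written back to the base-address register. */
static uint32_t ea_immediate_index_update(regs_t *r, unsigned ra, int16_t d)
{
    uint32_t ea = ea_immediate_index(r, ra, d);
    r->gpr[ra] = ea;
    return ea;
}
```

The update form is what makes it cheap to walk an array: each access leaves the base register pointing at the element just accessed.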
With sequential-instruction execution, the next-instruction address is calculated by adding four bytes to the current-instruction address. In the case of branch instructions, however, the next-instruction address is determined using one of four branch-addressing modes:
Branch to relative: The next-instruction address is at a location relative to the current-instruction address.
Branch to absolute: The next-instruction address is at an absolute location in memory.
Branch to link register: The next-instruction address is stored in the link register.
Branch to count register: The next-instruction address is stored in the count register.
The branch-addressing modes are described in Branch-Target Address Calculation, page 372.
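A minimal C sketch of the next-instruction address calculation for sequential execution and the four branch-addressing modes follows. The function names are hypothetical. Clearing the two low-order bits of a register target is an assumption based on PowerPC instructions being word-aligned.

```c
#include <stdint.h>

/* Sequential execution: add four bytes to the current-instruction address. */
static uint32_t next_sequential(uint32_t cia)
{
    return cia + 4u;
}

/* Branch to relative: signed displacement from the current instruction. */
static uint32_t branch_relative(uint32_t cia, int32_t displacement)
{
    return cia + (uint32_t)displacement;
}

/* Branch to absolute: the target address is used as given. */
static uint32_t branch_absolute(uint32_t target)
{
    return target;
}

/* Branch to link register or count register: the register supplies the
   target (low two bits cleared here, assuming word-aligned instructions). */
static uint32_t branch_register(uint32_t reg_value)
{
    return reg_value & ~(uint32_t)3u;
}
```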

Data Types

PPC405 instructions support byte, halfword, and word operands. Multiple-word operands are supported by the load/store multiple instructions and byte strings are supported by the load/store string instructions. Integer data are either signed or unsigned, and signed data are represented using two's-complement format.
The address of a multi-byte operand is determined using the lowest memory address occupied by that operand. For example, if the four bytes in a word operand occupy addresses 4, 5, 6, and 7, the word address is 4. The PPC405 supports both big-endian (an operand's most-significant byte is at the lowest memory address) and little-endian (an operand's least-significant byte is at the lowest memory address) addressing.
See Operand Conventions, page 347, for more information on the supported data types and byte ordering.
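The two byte orderings can be illustrated with a short, portable C sketch that assembles a word from four consecutive bytes. The function names are hypothetical and the code does not depend on the byte order of the machine it runs on.

```c
#include <stdint.h>

/* Big-endian: the most-significant byte is at the lowest address. */
static uint32_t load_word_be(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Little-endian: the least-significant byte is at the lowest address. */
static uint32_t load_word_le(const uint8_t *p)
{
    return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
           ((uint32_t)p[1] << 8)  |  (uint32_t)p[0];
}
```

For the bytes 0x12, 0x34, 0x56, 0x78 stored at ascending addresses, the big-endian interpretation is 0x12345678 and the little-endian interpretation is 0x78563412.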

Register Set Summary

Figure 1-2 shows the registers contained in the PPC405. Descriptions of the registers are in the following sections.
User Registers
- General-Purpose Registers: r0 through r31
- Condition Register: CR
- Fixed-Point Exception Register: XER
- Link Register: LR
- Count Register: CTR
- User-SPR General-Purpose Registers: USPRG0
- SPR General-Purpose Registers (read only): SPRG4, SPRG5, SPRG6, SPRG7
- Time-Base Registers (read only): TBU, TBL
Privileged Registers
- Machine-State Register: MSR
- Core-Configuration Register: CCR0
- SPR General-Purpose Registers: SPRG0, SPRG1, SPRG2, SPRG3, SPRG4, SPRG5, SPRG6, SPRG7
- Exception-Handling Registers: EVPR, ESR, DEAR, SRR0, SRR1, SRR2, SRR3
- Memory-Management Registers: PID, ZPR
- Storage-Attribute Control Registers: DCCR, DCWR, ICCR, SGR, SLER, SU0R
- Debug Registers: DBSR, DBCR0, DBCR1, DAC1, DAC2, DVC1, DVC2, IAC1, IAC2, IAC3, IAC4, ICDBR
- Timer Registers: TCR, TSR, PIT
- Processor-Version Register: PVR
- Time-Base Registers: TBU, TBL
Figure 1-2: PPC405 Registers
General-Purpose Registers
The processor contains thirty-two 32-bit general-purpose registers (GPRs), identified as r0 through r31. The contents of the GPRs are read from memory using load instructions and written to memory using store instructions. Computational instructions often read operands from the GPRs and write their results in GPRs. Other instructions move data between the GPRs and other registers. GPRs can be accessed by all software. See General-Purpose Registers (GPRs), page 360, for more information.
Special-Purpose Registers
The processor contains a number of 32-bit special-purpose registers (SPRs). SPRs provide access to additional processor resources, such as the count register, the link register, debug resources, timers, interrupt registers, and others. Most SPRs are accessed only by privileged software, but a few, such as the count register and link register, are accessed by all software. See User Registers, page 359, and Privileged Registers, page 429 for more information.
Machine-State Register
The 32-bit machine-state register (MSR) contains fields that control the operating state of the processor. This register can be accessed only by privileged software. See Machine-State Register, page 431, for more information.
Condition Register
The 32-bit condition register (CR) contains eight 4-bit fields, CR0–CR7. The values in the CR fields can be used to control conditional branching. Arithmetic instructions can set CR0 and compare instructions can set any CR field. Additional instructions are provided to perform logical operations and tests on CR fields and bits within the fields. The CR can be accessed by all software. See Condition Register (CR), page 361, for more information.
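As an illustration of how the eight 4-bit fields are packed, the following C sketch extracts one field from a 32-bit CR image. It assumes the PowerPC bit-numbering convention in which CR0 occupies the four most-significant bits. The flag names (LT, GT, EQ, SO) come from the PowerPC UISA rather than from this section, and the function name is hypothetical.

```c
#include <stdint.h>

/* Bits within one 4-bit CR field, most-significant first (assumed
   from the PowerPC UISA): less than, greater than, equal, summary overflow. */
enum { CR_LT = 0x8, CR_GT = 0x4, CR_EQ = 0x2, CR_SO = 0x1 };

/* Return field CRn (n = 0..7) from a 32-bit condition-register image.
   CR0 is assumed to be the most-significant four bits. */
static unsigned cr_field(uint32_t cr, unsigned n)
{
    return (unsigned)((cr >> (28u - 4u * n)) & 0xFu);
}
```

A conditional branch that tests "equal" in CR0 would, in this model, check `cr_field(cr, 0) & CR_EQ`.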
Device Control Registers
The 32-bit device control registers (DCRs, not shown) are used to configure, control, and report status for various external devices that are not part of the PPC405 processor. Although the DCRs are not part of the PPC405 implementation, they are accessed using the mtdcr and mfdcr instructions. The DCRs can be accessed only by privileged software. See the PPC405 Processor Block Manual for more information on implementing DCRs.

PPC405 Organization

As shown in Figure 1-3, the PPC405 processor contains the following elements:
A 5-stage pipeline consisting of fetch, decode, execute, write-back, and load write-back stages
A virtual-memory-management unit that supports multiple page sizes and a variety of storage-protection attributes and access-control options
Separate instruction-cache and data-cache units
Debug support, including a JTAG interface
Three programmable timers
The following sections provide an overview of each element.
[Figure not reproduced. Blocks shown: cache units, comprising the instruction-cache unit and data-cache unit, each with a cache array and cache controller; PLB master read interface (instruction side); PLB master read interface and PLB master write interface (data side); instruction OCM and data OCM; MMU with instruction shadow-TLB (4-entry), unified TLB (64-entry), and data shadow-TLB (8-entry); CPU with fetch and decode logic, 3-element fetch queue, 32x32 GPR, and execute unit (ALU, MAC); timers and debug logic; external-interrupt controller interface; JTAG; instruction trace.]
Figure 1-3: PPC405 Organization
Central-Processing Unit
The PPC405 central-processing unit (CPU) implements a 5-stage instruction pipeline consisting of fetch, decode, execute, write-back, and load write-back stages.
The fetch and decode logic sends a steady flow of instructions to the execute unit. All instructions are decoded before they are forwarded to the execute unit. Instructions are queued in the fetch queue if execution stalls. The fetch queue consists of three elements: two prefetch buffers and a decode buffer. If the prefetch buffers are empty, instructions flow directly to the decode buffer.
Up to two branches are processed simultaneously by the fetch and decode logic. If a branch cannot be resolved prior to execution, the fetch and decode logic predicts how that branch is resolved, causing the processor to speculatively fetch instructions from the predicted path. Branches with negative-address displacements are predicted as taken, as are branches that do not test the condition register or count register. The default prediction can be overridden by software at assembly or compile time. This capability is described further in Branch Prediction, page 370.
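The default prediction rule can be written as a small C predicate. This is a sketch of the rule as stated above, not of the hardware. The `reverse_default` parameter is a hypothetical stand-in for the software override, whose actual encoding is covered in the Branch Prediction section.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if the branch is predicted taken.
   displacement:     signed branch displacement (negative = backward)
   tests_cr_or_ctr:  true if the branch tests the CR or the count register
   reverse_default:  hypothetical flag modelling the software override */
static bool predict_taken(int32_t displacement, bool tests_cr_or_ctr,
                          bool reverse_default)
{
    if (!tests_cr_or_ctr) {
        return true;                 /* nothing to test: predicted taken */
    }
    bool taken = displacement < 0;   /* backward branches predicted taken */
    return reverse_default ? !taken : taken;
}
```

The rule favors loops: a backward conditional branch at the bottom of a loop is predicted taken, which is correct on every iteration except the last.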
The PPC405 has a single-issue execute unit containing the general-purpose register file (GPR), arithmetic-logic unit (ALU), and the multiply-accumulate unit (MAC). The GPRs consist of thirty-two 32-bit registers that are accessed by the execute unit using three read ports and two write ports. During the decode stage, data is read out of the GPRs for use by the execute unit. During the write-back stage, results are written to the GPR. The use of five read/write ports on the GPRs allows the processor to execute load/store operations in parallel with ALU and MAC operations.
The execute unit supports all 32-bit PowerPC UISA integer instructions in hardware, and is compliant with the PowerPC embedded-environment architecture specification. Floating-point operations are not supported.
The MAC unit supports implementation-specific multiply-accumulate instructions and multiply-halfword instructions. MAC instructions operate on either signed or unsigned 16-bit operands, and they store their results in a 32-bit GPR. These instructions can produce results using either modulo arithmetic or saturating arithmetic. All MAC instructions have a single cycle throughput. See Multiply-Accumulate Instruction-Set Extensions, page 405 for more information.
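The difference between modulo and saturating accumulation can be shown with a C sketch for signed 16-bit operands. The function names are hypothetical and do not correspond to specific PPC405 mnemonics; the sketch only models the arithmetic, with the accumulator held in a 32-bit value.

```c
#include <stdint.h>

/* Modulo arithmetic: the 32-bit sum wraps around on overflow.
   The result is returned as the raw 32-bit register pattern. */
static uint32_t mac_modulo(int32_t acc, int16_t a, int16_t b)
{
    int32_t product = (int32_t)a * (int32_t)b;   /* always fits in 32 bits */
    return (uint32_t)acc + (uint32_t)product;
}

/* Saturating arithmetic: the sum is clamped to the signed 32-bit range. */
static int32_t mac_saturate(int32_t acc, int16_t a, int16_t b)
{
    int64_t sum = (int64_t)acc + (int64_t)a * (int64_t)b;
    if (sum > INT32_MAX) {
        return INT32_MAX;
    }
    if (sum < INT32_MIN) {
        return INT32_MIN;
    }
    return (int32_t)sum;
}
```

Saturation is usually preferred in signal-processing loops, where a clipped sample is less harmful than one whose sign flips when the accumulator wraps.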
Exception Handling Logic
Exceptions are divided into two classes: critical and noncritical. The PPC405 CPU services exceptions caused by error conditions, the internal timers, debug events, and the external interrupt controller (EIC) interface. Across the two classes, a total of 19 possible exceptions are supported, including the two provided by the EIC interface.
Each exception class has its own pair of save/restore registers. SRR0 and SRR1 are used for noncritical interrupts, and SRR2 and SRR3 are used for critical interrupts. The exception-return address and the machine state are written to these registers when an exception occurs, and they are automatically restored when an interrupt handler exits using the return-from-interrupt (rfi) or return-from-critical-interrupt (rfci) instruction. Use of separate save/restore registers allows the PPC405 to handle critical interrupts independently of noncritical interrupts.
See Chapter 7, Exceptions and Interrupts, for information on exception handling in the PPC405.
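The value of separate save/restore register pairs can be seen in a small C model. The `cpu_t` structure, the `pc` field, and the handler arguments are hypothetical. The sketch assumes that the first register of each pair (SRR0, SRR2) holds the return address and the second (SRR1, SRR3) holds the saved machine state.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical processor-state model for illustration only. */
typedef struct {
    uint32_t pc;            /* address of the next instruction */
    uint32_t msr;           /* machine state */
    uint32_t srr0, srr1;    /* noncritical save/restore pair */
    uint32_t srr2, srr3;    /* critical save/restore pair */
} cpu_t;

/* Save the return address and machine state, then enter the handler. */
static void take_interrupt(cpu_t *c, bool critical,
                           uint32_t handler, uint32_t handler_msr)
{
    if (critical) {
        c->srr2 = c->pc;
        c->srr3 = c->msr;
    } else {
        c->srr0 = c->pc;
        c->srr1 = c->msr;
    }
    c->pc = handler;
    c->msr = handler_msr;
}

/* Return from a noncritical interrupt. */
static void rfi(cpu_t *c)
{
    c->pc = c->srr0;
    c->msr = c->srr1;
}

/* Return from a critical interrupt. */
static void rfci(cpu_t *c)
{
    c->pc = c->srr2;
    c->msr = c->srr3;
}
```

Because a critical interrupt writes only SRR2 and SRR3, it can arrive while a noncritical handler is running without destroying the noncritical handler's return context in SRR0 and SRR1.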
Memory Management Unit
The PPC405 supports 4 GB of flat (non-segmented) address space. The memory-management unit (MMU) provides address translation, protection functions, and storage-attribute control for this address space. The MMU supports demand-paged virtual memory using multiple page sizes of 1 KB, 4 KB, 16 KB, 64 KB, 256 KB, 1 MB, 4 MB and 16 MB. Multiple page sizes can improve memory efficiency and minimize the number of TLB misses. When supported by system software, the MMU provides the following functions:
Translation of the 4 GB logical-address space into a physical-address space.
Independent enabling of instruction translation and protection from that of data translation and protection.
Page-level access control using the translation mechanism.
Software control over the page-replacement strategy.
Additional protection control using zones.
Storage attributes for cache policy and speculative memory-access control.
The translation look-aside buffer (TLB) is used to control memory translation and protection. Each one of its 64 entries specifies a page translation. It is fully associative, and can simultaneously hold translations for any combination of page sizes. To prevent TLB contention between data and instruction accesses, a 4-entry instruction and an 8-entry data shadow-TLB are maintained by the processor transparently to software.
Software manages the initialization and replacement of TLB entries. The PPC405 includes instructions for managing TLB entries by software running in privileged mode. This capability gives significant control to system software over the implementation of a page replacement strategy. For example, software can reduce the potential for TLB thrashing or delays associated with TLB-entry replacement by reserving a subset of TLB entries for globally accessible pages or critical pages.
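A fully associative lookup over entries with mixed page sizes can be sketched in C as follows. The entry layout and names are hypothetical, and the sketch omits process-ID matching, access control, and storage attributes. The eight supported page sizes form the series 1 KB × 4^n, so the sketch stores the exponent n (0 through 7) as an illustrative size code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TLB_ENTRIES 64

/* Hypothetical TLB entry; real entries hold more fields. */
typedef struct {
    bool     valid;
    uint32_t vpage;   /* virtual page base, aligned to the page size */
    uint32_t rpage;   /* real (physical) page base */
    unsigned size;    /* 0..7, page size = 1 KB * 4^size */
} tlb_entry_t;

/* 1 KB, 4 KB, 16 KB, 64 KB, 256 KB, 1 MB, 4 MB, 16 MB */
static uint32_t page_bytes(unsigned size)
{
    return (uint32_t)1024u << (2u * size);
}

/* Compare the address against every entry; on a hit, combine the real
   page base with the offset inside the page. Returns false on a miss,
   where system software would install a translation. */
static bool tlb_translate(const tlb_entry_t tlb[TLB_ENTRIES],
                          uint32_t ea, uint32_t *ra)
{
    for (size_t i = 0; i < TLB_ENTRIES; i++) {
        uint32_t offset_mask = page_bytes(tlb[i].size) - 1u;
        if (tlb[i].valid && (ea & ~offset_mask) == tlb[i].vpage) {
            *ra = tlb[i].rpage | (ea & offset_mask);
            return true;
        }
    }
    return false;
}
```

Because any entry can hold any page size, software is free to pin a few large-page entries for globally accessible or critical regions and rotate the remaining entries among small pages.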
Storage attributes are provided to control access of memory regions. When memory translation is enabled, storage attributes are maintained on a page basis and read from the TLB when a memory access occurs. When memory translation is disabled, storage attributes are maintained in storage-attribute control registers. A zone-protection register (ZPR) is provided to allow system software to override the TLB access controls without requiring the manipulation of individual TLB entries. For example, the ZPR can provide a simple method for denying read access to certain application programs.
Chapter 6, Virtual-Memory Management, describes these memory-management resources in detail.
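In real mode, the feature list earlier in this chapter states that storage attributes are controlled for thirty-two 128 MB regions. The following C sketch shows how an address might select one bit of a 32-bit storage-attribute control register. The function names are hypothetical, and the mapping of region 0 to the most-significant register bit is an assumption based on PowerPC bit numbering; Chapter 6 is the authority on the actual register layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* 4 GB / 32 regions = 128 MB per region, so the five most-significant
   address bits select the region (0..31). */
static unsigned real_mode_region(uint32_t address)
{
    return (unsigned)(address >> 27);
}

/* Assumption: region n is controlled by bit n of the register, counting
   from the most-significant bit. */
static bool region_attribute(uint32_t control_reg, uint32_t address)
{
    uint32_t mask = 0x80000000u >> real_mode_region(address);
    return (control_reg & mask) != 0;
}
```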
Instruction and Data Caches
The PPC405 accesses memory through the instruction-cache unit (ICU) and data-cache unit (DCU). Each cache unit includes a PLB-master interface, cache arrays, and a cache controller. Hits into the instruction cache and data cache appear to the CPU as single-cycle memory accesses. Cache misses are handled as requests over the PLB bus to another PLB device, such as an external-memory controller.
The PPC405 implements separate instruction-cache and data-cache arrays. Each is 16 KB in size, is two-way set-associative, and operates using 8-word (32 byte) cachelines. The caches are non-blocking, allowing the PPC405 to overlap instruction execution with reads over the PLB (when cache misses occur).
The cache controllers replace cachelines according to a least-recently used (LRU) replacement policy. When a cacheline fill occurs, the most-recently accessed line in the cache set is retained and the other line is replaced. The cache controller updates the LRU during a cacheline fill.
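The cache geometry and the two-way LRU policy can be modelled in a few lines of C. A 16 KB, two-way cache with 32-byte lines has 16384 / (32 × 2) = 256 sets. The structure and function names below are hypothetical, and the sketch updates the LRU state on hits as well as on fills so that the most-recently accessed line is the one retained.

```c
#include <stdbool.h>
#include <stdint.h>

enum {
    LINE_BYTES  = 32,
    WAYS        = 2,
    CACHE_BYTES = 16 * 1024,
    SETS        = CACHE_BYTES / (LINE_BYTES * WAYS)   /* 256 sets */
};

/* Hypothetical model of one cache set (tags only, no data). */
typedef struct {
    bool     valid[WAYS];
    uint32_t tag[WAYS];
    unsigned lru;           /* way that will be replaced next */
} cache_set_t;

static unsigned set_index(uint32_t addr)
{
    return (addr / LINE_BYTES) % SETS;
}

static uint32_t line_tag(uint32_t addr)
{
    return addr / (LINE_BYTES * SETS);
}

/* Returns true on a hit. On a miss, the least-recently used way is
   filled and the other way in the set is retained. */
static bool cache_access(cache_set_t cache[SETS], uint32_t addr)
{
    cache_set_t *s = &cache[set_index(addr)];
    uint32_t tag = line_tag(addr);

    for (unsigned w = 0; w < WAYS; w++) {
        if (s->valid[w] && s->tag[w] == tag) {
            s->lru = 1u - w;        /* the other way is now LRU */
            return true;
        }
    }
    unsigned victim = s->lru;
    s->valid[victim] = true;
    s->tag[victim] = tag;
    s->lru = 1u - victim;
    return false;
}
```

With only two ways, a single bit per set is enough to record which line was used least recently.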
The ICU supplies up to two instructions every cycle to the fetch and decode unit. The ICU can also forward instructions to the fetch and decode unit during a cacheline fill, minimizing execution stalls caused by instruction-cache misses. When the ICU is accessed, four instructions are read from the appropriate cacheline and placed temporarily in a line buffer. Subsequent ICU accesses check this line buffer for the requested instruction prior to accessing the cache array. This allows the ICU cache array to be accessed as little as once every four instructions, significantly reducing ICU power consumption.
The DCU can independently process load/store operations and cache-control instructions. The DCU can also dynamically reprioritize PLB requests to reduce the length of an execution stall. For example, if the DCU is busy with a low-priority request and a subsequent storage operation requested by the CPU is stalled, the DCU automatically increases the priority of the current (low-priority) request. The current request is thus finished sooner, allowing the DCU to process the stalled request sooner. The DCU can forward data to the execute unit during a cacheline fill, further minimizing execution stalls caused by data-cache misses.
Additional features allow programmers to tailor data-cache performance to a specific application. The DCU can function in write-back or write-through mode, as determined by the storage-control attributes. Loads and stores that do not allocate cachelines can also be specified. Inhibiting certain cacheline fills can reduce potential pipeline stalls and unwanted external-bus traffic.
See Chapter 5, Memory-System Management, for details on the operation and control of the PPC405 caches.
Timer Resources
The PPC405 contains a 64-bit time base and three timers. The time base is incremented synchronously using the CPU clock or an external clock source. The three timers are incremented synchronously with the time base. (See Chapter 8, Timer Resources, for more information on these features.) The three timers supported by the PPC405 are:
Programmable Interval Timer
Fixed Interval Timer
Watchdog Timer
Programmable Interval Timer
The programmable interval timer (PIT) is a 32-bit register that is decremented at the time-base increment frequency. The PIT register is loaded with a delay value. When the PIT count reaches 0, a PIT interrupt occurs. Optionally, the PIT can be programmed to automatically reload the last delay value and begin decrementing again.
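The PIT behavior described above can be modelled with a short C sketch. The `pit_t` structure and function names are hypothetical, and the sketch assumes that a PIT holding 0 is stopped until a new delay value is written.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software model of the PIT. */
typedef struct {
    uint32_t count;         /* current PIT value */
    uint32_t reload;        /* last delay value written */
    bool     auto_reload;   /* reload and continue after reaching 0 */
} pit_t;

/* Load a delay value into the PIT. */
static void pit_write(pit_t *p, uint32_t delay)
{
    p->count = delay;
    p->reload = delay;
}

/* Advance the PIT by one time-base increment.
   Returns true when the count reaches 0 and a PIT interrupt occurs. */
static bool pit_tick(pit_t *p)
{
    if (p->count == 0) {
        return false;               /* assumed: a PIT at 0 is stopped */
    }
    p->count--;
    if (p->count != 0) {
        return false;
    }
    if (p->auto_reload) {
        p->count = p->reload;       /* begin decrementing again */
    }
    return true;
}
```

With auto-reload enabled and a delay of N, this model raises an interrupt every N time-base increments, which is the usual basis for a periodic operating-system tick.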
Fixed Interval Timer
The fixed interval timer (FIT) causes an interrupt when a selected bit in the time-base register changes from 0 to 1. Programmers can select one of four predefined bits in the time-base for triggering a FIT interrupt.
Watchdog Timer
The watchdog timer causes a hardware reset when a selected bit in the time-base register changes from 0 to 1. Programmers can select one of four predefined bits in the time-base for triggering a reset, and the type of reset can be defined by the programmer.
Note: The time-base register alone does not cause interrupts to occur.
Debug
The PPC405 debug resources include special debug modes that support the various types of debugging used during hardware and software development. These are:
Internal-debug mode for use by ROM monitors and software debuggers
External-debug mode for use by JTAG debuggers
Debug-wait mode, which allows the servicing of interrupts while the processor appears to be stopped
Real-time trace mode, which supports event triggering for real-time tracing
Debug events are supported that allow developers to manage the debug process. Debug modes and debug events are controlled using debug registers in the processor. The debug registers are accessed either through software running on the processor or through the JTAG port. The JTAG port can also be used for board tests.
The debug modes, events, controls, and interfaces provide a powerful combination of debug resources for hardware and software development tools. Chapter 9, Debugging, describes these resources in detail.
PPC405 Interfaces
The PPC405 provides a set of interfaces that supports the attachment of cores and user logic. The software resources used to manage the PPC405 interfaces are described in the Core-Configuration Register, page 459. For information on the hardware operation, use, and electrical characteristics of these interfaces, refer to the PPC405 Processor Block Manual. The following interfaces are provided:
Processor local bus interface
Device control register interface
Clock and power management interface
JTAG port interface
On-chip interrupt controller interface
On-chip memory controller interface
Processor Local Bus
The processor local bus (PLB) interface provides a 32-bit address and three 64-bit data buses attached to the instruction-cache and data-cache units. Two of the 64-bit buses are attached to the data-cache unit, one supporting read operations and the other supporting write operations. The third 64-bit bus is attached to the instruction-cache unit to support instruction fetching.