The information contained herein is subject to change without notice.
The information contained herein is presented only as a guide for the applications of our products. No
responsibility is assumed by TOSHIBA for any infringements of patents or other rights of the third parties which may
result from its use. No license is granted by implication or otherwise under any patent or patent rights of TOSHIBA
or others.
The products described in this document contain components made in the United States and subject to export control
by the U.S. authorities. Diversion contrary to U.S. law is prohibited.
These TOSHIBA products are intended for use in general electronic equipment (office equipment, communication
equipment, measuring equipment, domestic electrification, etc.). Please make sure that you consult with us before you
use these TOSHIBA products in equipment which requires high quality and/or reliability, and in equipment which
could have a major impact on the welfare of human life (atomic energy control, airplane, spaceship, traffic signal,
combustion control, all types of safety devices, etc.). TOSHIBA cannot accept liability for any damage which may
occur if these TOSHIBA products are used in the above-mentioned equipment without prior consultation with TOSHIBA.
Architecture
Chapter 1 Introduction
1.1 Features
The R3900 Processor Core is a high-performance 32-bit microprocessor core developed by Toshiba based on
the R3000A RISC (Reduced Instruction Set Computer) microprocessor. The R3000A was developed by
MIPS Technologies, Inc.
Toshiba develops ASSPs (Application Specific Standard Products) using the R3900 Processor Core and
provides the R3900 as a processor core in Embedded Array or Cell-based ICs. The low power consumption
and high cost-performance ratio of this processor make it especially well-suited to embedded control
applications in products such as PDAs (Personal Digital Assistants) and game equipment.
− Data cache snoop function: Invalidation of data in the data cache to maintain cache memory
and main memory consistency on DMA transfer cycles
• Nonblocking load
− Execute the following instruction regardless of a cache miss caused by a preceding load
instruction
• DSP function
− Multiply/Add (32-bit x 32-bit + 64-bit) in one clock cycle.
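The Multiply/Add operation listed above (32-bit x 32-bit + 64-bit) can be sketched as follows. This is only an illustration of the arithmetic: the function names are hypothetical, and treating the operands as signed values and wrapping the result to 64 bits are assumptions not stated in this section.

```python
MASK64 = (1 << 64) - 1


def to_signed32(value):
    """Interpret the low 32 bits of value as a two's-complement integer."""
    value &= 0xFFFFFFFF
    return value - (1 << 32) if value & 0x80000000 else value


def multiply_add(acc, a, b):
    """Return (acc + a * b) wrapped to 64 bits; a and b are taken as signed 32-bit values."""
    product = to_signed32(a) * to_signed32(b)
    return (acc + product) & MASK64


# Accumulate a short sum of products, as a DSP-style loop would.
acc = 0
for sample, coeff in [(3, 4), (-2, 5), (7, 1)]:
    acc = multiply_add(acc, sample, coeff)
print(acc)  # 12 - 10 + 7 = 9
```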
1.1.2 Functions for embedded applications
• Small code size
− Branch Likely instruction: The branch delay slot accepts an instruction to be executed at the
branch target
− Hardware Interlock: Stall the pipeline at the load delay slot when the instruction in the slot
depends on the data to be loaded
• Real-time performance
− Cache Lock Function: Lock one set of the two-way set associative cache memory to keep data in
cache memory
• Debug support
− Breakpoint
− Single step execution
• Real-time debug system interface
1.1.3 Low power consumption
• Power Down mode
− Prepare for Reduced Frequency mode: Control the clock frequency of the R3900 Processor Core
with a clock generator
− Halt and Doze mode: Stop R3900 Processor Core operations
• Clock can be stopped
− Clock signal can be stopped at high state
1.1.4 Development environment for embedded arrays and cell-based ICs
• Compact core
• Easy-to-design peripheral circuits
− Single direction separate bus: Bus configuration suitable for core
− Built-in cache memory: No need to consider cache operation timing
• ASIC Process
• Sufficient Development Environment
1.2 Notation Used in This Manual
Mathematical notation
• Hexadecimal numbers are expressed as follows (example shown for decimal number 42)
0x2A
• A K(kilo)byte is 2^10 = 1,024 bytes, an M(mega)byte is 2^20 = 1,024 x 1,024 = 1,048,576 bytes, and a
G(giga)byte is 2^30 = 1,024 x 1,024 x 1,024 = 1,073,741,824 bytes.
Data notation
• Byte: 8 bits
• Halfword: 2 contiguous bytes (16 bits)
• Word: 4 contiguous bytes (32 bits)
• Doubleword: 8 contiguous bytes (64 bits)
Signal notation
• Low active signals are indicated by an asterisk (*) at the end of the signal name (e.g.: RESET*).
• Changing a signal to active level is to “assert” a signal, while changing it to a non-active level is to “de-
assert” the signal.
Chapter 2 Architecture
2.1 Overview
A block diagram of the R3900 Processor Core is shown in Figure 2-1. It includes the CPU core, an
instruction cache and a data cache. You can select an optimum data and instruction cache configuration for
your system from among a variety of possible configurations.
The CPU Core comprises the following blocks:
• CPU registers: General-purpose registers, HI/LO registers and program counter (PC).
• CP0 registers: Registers for system control coprocessor (CP0) functions.
• ALU/Shifter: Computational unit.
• MAC: Computational unit for multiply/add.
• Bus interface unit: Controls the bus interface between the CPU core and external circuits.
• Memory management unit: Direct segment mapping memory management unit.
[Figure: the R3900 Processor Core consists of the CPU core (CPU Register, CP0 Register, ALU/Shifter, MAC,
Bus Interface Unit, Memory Management Unit), the Instruction Cache and the Data Cache.]
Figure 2-1. Block Diagram of the R3900 Processor Core
2.2 Registers
2.2.1 CPU registers
The R3900 Processor Core has the following 32-bit registers.
• Thirty-two general-purpose registers
• A program counter (PC)
• HI/LO registers for storing the result of multiply and divide operations
The configuration of the registers is shown in Figure 2-2.
General-purpose registers (bits 31-0): r0, r1, r2, ..., r29, r30, r31
Multiply/Divide registers (bits 31-0): HI, LO
Program counter (bits 31-0): PC
Figure 2-2. R3900 Processor Core registers
The r0 and r31 registers have special functions.
• Register r0 always contains the value 0. It can be a target register of an instruction whose
operation result is not needed. Or, it can be a source register of an instruction that requires a value
of 0.
• Register r31 is the link register for the Jump And Link instruction. The address of the instruction
after the delay slot is placed in r31.
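The special behavior of r0 and r31 can be illustrated with a small register-file model. This is a minimal sketch: the `RegisterFile` class and `jump_and_link` function are hypothetical names, and the link address is computed as the address of the Jump And Link instruction plus 8, since every instruction is 32 bits (4 bytes) long and the delay slot occupies one instruction.

```python
class RegisterFile:
    """Toy model of the 32 general-purpose registers (illustrative only)."""

    def __init__(self):
        self._regs = [0] * 32

    def read(self, index):
        return self._regs[index]

    def write(self, index, value):
        # Writes to r0 are discarded, so r0 always reads as 0.
        if index != 0:
            self._regs[index] = value & 0xFFFFFFFF


def jump_and_link(regs, pc, target):
    """Model Jump And Link: r31 receives the address after the delay slot (pc + 8)."""
    regs.write(31, pc + 8)
    return target


regs = RegisterFile()
regs.write(0, 0x1234)  # result is not needed, so it is discarded
next_pc = jump_and_link(regs, 0x80001000, 0x80002000)
print(regs.read(0), hex(regs.read(31)), hex(next_pc))  # 0 0x80001008 0x80002000
```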
The R3900 Processor Core has the following three special registers that are used or modified
implicitly by certain instructions.
PC: Program counter
HI: High word of the multiply/divide registers
LO: Low word of the multiply/divide registers
The multiply/divide registers (HI, LO) store the double-word (64-bit) result of integer multiply
operations. In the case of integer divide operations, the quotient is stored in LO and the remainder in
HI.
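As an illustration of how results are split between HI and LO, the sketch below models the unsigned multiply and divide cases. The function names are hypothetical, only unsigned operands are modeled, and the handling of a zero divisor is an assumption made for the sketch (this section does not describe that case).

```python
MASK32 = 0xFFFFFFFF


def multu(rs, rt):
    """Unsigned multiply: return (HI, LO) holding the 64-bit product."""
    product = (rs & MASK32) * (rt & MASK32)
    return (product >> 32) & MASK32, product & MASK32


def divu(rs, rt):
    """Unsigned divide: return (HI, LO) = (remainder, quotient)."""
    rs &= MASK32
    rt &= MASK32
    if rt == 0:
        # Assumption for this sketch only: refuse to model division by zero.
        raise ValueError("divisor is zero; this sketch does not model that case")
    return rs % rt, rs // rt


print(multu(0xFFFFFFFF, 2))  # (1, 4294967294)
print(divu(17, 5))           # (2, 3)
```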
2.2.2 System control coprocessor (CP0) registers
The R3900 Processor Core can be connected to as many as three coprocessors, referred to as CP1,
CP2 and CP3. The R3900 also has built-in system control coprocessor (CP0) functions for exception
handling and for configuring the system. Figure 2-3 shows the functional breakdown of the CP0
registers.
<Exception Processing>
Status register
Cause register
EPC register
BadVAddr register
Config register †
PRId register
Cache register †
<Debugging>
Debug register †
DEPC register †
Figure 2-3. CP0 registers
† Additional R3900 Processor Core registers not present in the R3000A
Table 2-1 lists the CP0 registers built into the R3900 Processor Core. Some of these registers are reserved
for use by an external memory management unit.
Table 2-1. List of system control coprocessor (CP0) registers
No      Mnemonic       Description
0       -              (reserved) †
1       -              (reserved) †
2       -              (reserved) †
3       Config ††      Hardware configuration
4       -              (reserved) †
5       -              (reserved) †
6       -              (reserved) †
7       Cache ††       Cache lock function
8       BadVAddr       Last virtual address triggering error
9       -              (reserved) †
10      -              (reserved) †
11      -              (reserved) †
12      Status         Information on mode, interrupt enabled, diagnostic status
13      Cause          Indicates nature of last exception
14      EPC            Exception program counter
15      PRId           Processor revision ID
16      Debug †††      Debug exception control
17      DEPC †††       Program counter for debug exception
18-31   -              (reserved) †
† Reserved for external memory management unit, when direct segment mapping MMU is not used.
†† Additional R3900 Processor Core register not present in R3000A.
††† Additional R3900 Processor Core Debug register not present in R3000A.
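For software that needs to refer to these registers by number, the register numbers in Table 2-1 might be captured in a lookup table such as the one below. The dictionary and the `cp0_name` helper are hypothetical names introduced for illustration.

```python
# CP0 register numbers and mnemonics as listed in Table 2-1.
CP0_REGISTERS = {
    3: "Config",
    7: "Cache",
    8: "BadVAddr",
    12: "Status",
    13: "Cause",
    14: "EPC",
    15: "PRId",
    16: "Debug",
    17: "DEPC",
}


def cp0_name(number):
    """Return the mnemonic for a CP0 register number, or "(reserved)"."""
    if not 0 <= number <= 31:
        raise ValueError("CP0 register numbers range from 0 to 31")
    return CP0_REGISTERS.get(number, "(reserved)")


print(cp0_name(12), cp0_name(9))  # Status (reserved)
```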
2.3 Instruction Set Overview
All R3900 Processor Core instructions are 32 bits in length. There are three instruction formats: immediate
(I-type), jump (J-type) and register (R-type), as shown in Figure 2-4. Having just three instruction formats
simplifies instruction decoding. If more complex functions or addressing modes are required, the compiler can
synthesize them from combinations of these instructions.
I-type (immediate):  op (31-26) | rs (25-21) | rt (20-16) | immediate (15-0)
J-type (jump):       op (31-26) | target (25-0)
R-type (register):   op (31-26) | rs (25-21) | rt (20-16) | rd (15-11) | sa (10-6) | funct (5-0)
Figure 2-4. Instruction formats and subfield mnemonics
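Because the field boundaries are fixed, decoding reduces to shifts and masks. The sketch below extracts every subfield from a 32-bit instruction word using the bit positions of the three formats. The `decode` function is a hypothetical helper, the subfield names follow the usual MIPS convention, and the sample word is an arbitrary test value.

```python
def decode(instruction):
    """Split a 32-bit instruction word into the subfields of all three formats."""
    return {
        "op": (instruction >> 26) & 0x3F,    # bits 31-26, all formats
        "rs": (instruction >> 21) & 0x1F,    # bits 25-21, I-type and R-type
        "rt": (instruction >> 16) & 0x1F,    # bits 20-16, I-type and R-type
        "rd": (instruction >> 11) & 0x1F,    # bits 15-11, R-type
        "sa": (instruction >> 6) & 0x1F,     # bits 10-6, R-type
        "funct": instruction & 0x3F,         # bits 5-0, R-type
        "immediate": instruction & 0xFFFF,   # bits 15-0, I-type
        "target": instruction & 0x03FFFFFF,  # bits 25-0, J-type
    }


fields = decode(0x8D090004)
print(fields["op"], fields["rs"], fields["rt"], fields["immediate"])  # 35 8 9 4
```

Which of the returned fields are meaningful depends on the format selected by the op field; a real decoder would dispatch on op before using the others.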
The instruction set is classified as follows.
(1) Load/store
These instructions transfer data between memory and general registers. All instructions in this group
are I-type. “Base register + 16-bit signed immediate offset” is the only supported addressing mode.
(2) Computational
These instructions perform arithmetic, logical and shift operations on register values. The format can
be R-type (when both operands and the result are register values) or I-type (when one operand is 16-bit
immediate data).
(3) Jump/branch
These instructions change the program flow. A jump is always made to a 32-bit address contained in
a register (R-type format), or to a paged absolute address constructed by combining a 26-bit target
address with the upper 4 bits of the program counter (J-type format). In a branch instruction, the
target address is made up of the program counter value plus a 16-bit offset.
(4) Coprocessor
These instructions execute coprocessor operations. Each coprocessor has its own format for
computational instructions.
Note: Coprocessor load instruction LWCz and coprocessor store instruction SWCz are not
supported by the R3900 Processor Core. An attempt to execute either of these instructions
will trigger a Reserved Instruction exception.
(5) Coprocessor 0
These instructions are used for operations with system control coprocessor (CP0) registers, processor
memory management and exception handling.
Note: TLB (Translation Lookaside Buffer) instructions (TLBR, TLBWI, TLBWR and TLBP) are
not supported by the R3900 Processor Core. These instructions will be treated by the R3900
as NOP (no operation).
(6) Special
These instructions support system calls and breakpoint functions. The format is always R-type.
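The address calculations described for load/store, jump and branch instructions can be sketched as follows. The function names are hypothetical. Two details are assumptions based on the usual MIPS convention rather than on this section: the 26-bit target and the 16-bit branch offset count words (so they are shifted left by 2 bits), and the branch offset is sign-extended. The program counter value is passed in as a parameter, since this section does not say which instruction's address is used.

```python
MASK32 = 0xFFFFFFFF


def sign_extend16(value):
    """Sign-extend a 16-bit field to a Python int."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def load_store_address(base, offset16):
    """Effective address: base register + 16-bit signed immediate offset."""
    return (base + sign_extend16(offset16)) & MASK32


def jump_target(pc, target26):
    """J-type: upper 4 bits of the program counter combined with the 26-bit target.

    Assumption: the target counts words, so it is shifted left by 2 bits.
    """
    return (pc & 0xF0000000) | ((target26 & 0x03FFFFFF) << 2)


def branch_target(pc, offset16):
    """Branch: program counter value plus the 16-bit offset.

    Assumptions: the offset is signed and counts words (shifted left by 2 bits).
    """
    return (pc + (sign_extend16(offset16) << 2)) & MASK32


print(hex(load_store_address(0x1000, 0xFFFC)))  # 0xffc
print(hex(jump_target(0x80001004, 0x0000400)))  # 0x80001000
print(hex(branch_target(0x80001004, 0xFFFF)))   # 0x80001000
```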
The instruction set supported by all MIPS R-Series processors is listed in Table 2-2. Table 2-3 shows
extended instructions supported by the R3900 Processor Core, and Table 2-4 lists coprocessor 0 (CP0)
instructions.
Table 2-5 shows R3000A instructions not supported by the R3900 Processor Core.
Table 2-2. Instructions supported by MIPS R-Series processors (ISA)
Instruction   Description
Load/Store Instructions
LB            Load Byte
LBU           Load Byte Unsigned
LH            Load Halfword
LHU           Load Halfword Unsigned
LW            Load Word
LWL           Load Word Left
LWR           Load Word Right
SB            Store Byte
SH            Store Halfword
SW            Store Word
SWL           Store Word Left
SWR           Store Word Right
Computational Instructions
(ALU Immediate)
ADDI          Add Immediate
ADDIU         Add Immediate Unsigned
SLTI          Set on Less Than Immediate
SLTIU         Set on Less Than Immediate Unsigned
ANDI          AND Immediate
ORI           OR Immediate
XORI          XOR Immediate
LUI           Load Upper Immediate
(ALU 3-operand, register type)
ADD           Add
ADDU          Add Unsigned
SUB           Subtract
SUBU          Subtract Unsigned
SLT           Set on Less Than
SLTU          Set on Less Than Unsigned
AND           AND
OR            OR
XOR           XOR
NOR           NOR
Table 2-2 (cont.). Instructions supported by MIPS R-Series processors (ISA)
Instruction   Description
(Shift)
SLL           Shift Left Logical
SRL           Shift Right Logical
SRA           Shift Right Arithmetic
SLLV          Shift Left Logical Variable
SRLV          Shift Right Logical Variable
SRAV          Shift Right Arithmetic Variable
(Multiply/Divide)
MULT          Multiply
MULTU         Multiply Unsigned
DIV           Divide
DIVU          Divide Unsigned
MFHI          Move from HI
MTHI          Move to HI
MFLO          Move from LO
MTLO          Move to LO
Jump/Branch Instructions
J             Jump
JAL           Jump And Link
JR            Jump Register
JALR          Jump And Link Register
BEQ           Branch on Equal
BNE           Branch on Not Equal
BLEZ          Branch on Less than or Equal to Zero
BGTZ          Branch on Greater Than Zero
BLTZ          Branch on Less Than Zero
BGEZ          Branch on Greater than or Equal to Zero
BLTZAL        Branch on Less Than Zero And Link
BGEZAL        Branch on Greater than or Equal to Zero And Link
Coprocessor Instructions
MTCz          Move to Coprocessor z
MFCz          Move from Coprocessor z
CTCz          Move Control Word to Coprocessor z
CFCz          Move Control Word from Coprocessor z
COPz          Coprocessor Operation z
BCzT          Branch on Coprocessor z True
BCzF          Branch on Coprocessor z False
Table 2-3. Extended instructions supported by the R3900 Processor Core
Instruction   Description
Jump/Branch Instructions
BEQL          Branch on Equal Likely
BNEL          Branch on Not Equal Likely
BLEZL         Branch on Less than or Equal to Zero Likely
BGTZL         Branch on Greater Than Zero Likely
BLTZL         Branch on Less Than Zero Likely
BGEZL         Branch on Greater than or Equal to Zero Likely
BLTZALL       Branch on Less Than Zero And Link Likely
BGEZALL       Branch on Greater than or Equal to Zero And Link Likely
Coprocessor Instructions
BCzTL         Branch on Coprocessor z True Likely
BCzFL         Branch on Coprocessor z False Likely
Special Instruction
SDBBP         Software Debug Breakpoint
Table 2-4. CP0 instructions
Instruction   Description
CP0 Instructions
MTC0          Move to CP0
MFC0          Move from CP0
RFE           Restore from Exception
DERET         Debug Exception Return
CACHE         Cache Operation
Table 2-5. R3000A instructions not supported by the R3900
Instruction   Description                   Operation
Coprocessor Instructions
LWCz          Load Word from Coprocessor    Reserved Instruction Exception
SWCz          Store Word to Coprocessor     Reserved Instruction Exception
2.4 Data Formats and Addressing
This section explains how data is organized in R3900 registers and memory.
The R3900 uses the following data formats: 64-bit doubleword, 32-bit word, 16-bit halfword and 8-bit byte.
The byte order can be set to either big endian or little endian.
Figure 2-5 shows how bytes are ordered in words, and how words are ordered in multiple words, for both the
big-endian and little-endian formats.
Bit:             31-24   23-16   15-8    7-0     Word address
Higher address     8       9      10      11          8
                   4       5       6       7          4
Lower address      0       1       2       3          0
Byte 0 is the most significant byte (bits 31-24).
A word is addressed beginning with the most significant byte.
(a) Big endian
Bit:             31-24   23-16   15-8    7-0     Word address
Higher address    11      10       9       8          8
                   7       6       5       4          4
Lower address      3       2       1       0          0
Byte 0 is the least significant byte (bits 7-0).
A word is addressed beginning with the least significant byte.
(b) Little endian
Figure 2-5. Big endian and little endian formats
In this document, bit 0 is always the rightmost bit.
Byte addressing is used with the R3900 Processor Core, but there are alignment restrictions for halfword and
word access. Halfword accesses must be aligned on an even byte boundary (0, 2, 4...) and word accesses on a byte
boundary divisible by 4 (0, 4, 8...).
The address of multiple-byte data, as shown in Figure 2-5 above, begins at the most significant byte for the
big endian format and at the least significant byte for the little endian format.
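The two byte orders and the alignment rules can be checked with a short sketch. The helper names are hypothetical; `int.to_bytes` is used only to show which byte of a word lands at the lowest address in each format.

```python
def is_aligned(address, size):
    """Return True if an access of `size` bytes (2 or 4) is properly aligned."""
    return address % size == 0


def store_word(value, byte_order):
    """Return the 4 bytes of a word in increasing address order."""
    return (value & 0xFFFFFFFF).to_bytes(4, byte_order)


word = 0x11223344
print(store_word(word, "big").hex())       # 11223344: lowest address holds the most significant byte
print(store_word(word, "little").hex())    # 44332211: lowest address holds the least significant byte
print(is_aligned(6, 2), is_aligned(6, 4))  # True False
```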
There are special instructions (LWL, LWR, SWL, SWR) for accessing words not aligned on a word
boundary. They are used in pairs for addressing misaligned words, but involve an extra instruction cycle
which is wasted if used with properly aligned words. Figure 2-6 shows the byte arrangement when a
misaligned word is addressed at byte address 3 for the big and little endian formats.
Bit:             31-24   23-16   15-8    7-0
Higher address     4       5       6
Lower address                              3
(a) Big endian
Bit:             31-24   23-16   15-8    7-0
Higher address             6       5       4
Lower address      3
(b) Little endian
Figure 2-6. Byte addresses of a misaligned word
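To make the pairing of LWL and LWR concrete, the sketch below models a big-endian load of the misaligned word at byte address 3 (bytes 3 to 6). The function names are hypothetical, and the merge behavior shown is an assumption based on the conventional MIPS definition of these instructions in big-endian mode, not a definition taken from this manual.

```python
def lwl_big_endian(reg, memory, address):
    """Merge the bytes from `address` to the end of its word into the left part of reg."""
    k = address & 3
    count = 4 - k                                  # bytes taken from memory
    chunk = int.from_bytes(memory[address:address + count], "big")
    keep = (1 << (8 * k)) - 1                      # low k bytes of reg are kept
    return ((chunk << (8 * k)) | (reg & keep)) & 0xFFFFFFFF


def lwr_big_endian(reg, memory, address):
    """Merge the bytes from the start of the word up to `address` into the right part of reg."""
    k = address & 3
    count = k + 1                                  # bytes taken from memory
    base = address - k
    chunk = int.from_bytes(memory[base:base + count], "big")
    keep = 0xFFFFFFFF & ~((1 << (8 * count)) - 1)  # high bytes of reg are kept
    return (reg & keep) | chunk


memory = bytes(range(16))             # the byte at address n holds the value n
reg = lwl_big_endian(0, memory, 3)    # picks up byte 3
reg = lwr_big_endian(reg, memory, 6)  # picks up bytes 4, 5 and 6
print(hex(reg))                       # 0x3040506
```

With an aligned address, each function alone already loads the whole word, which is why using the pair on aligned data costs an unnecessary instruction.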
2.5 Pipeline Processing Overview
The R3900 Processor Core executes instructions in five pipeline stages (F: instruction fetch; D: decode; E:
execute; M: memory access; W: register write-back). Each pipeline stage is executed in one clock cycle.
When the pipeline is fully utilized, five instructions are executed at the same time resulting in an instruction
execution rate of one instruction per cycle.
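The overlap of the five stages can be visualized with a short sketch. It assumes an ideal pipeline with no stalls, in which case n instructions complete in n + 4 cycles; the function name is hypothetical.

```python
STAGES = "FDEMW"


def pipeline_diagram(num_instructions):
    """Return one row per instruction; each row starts one cycle after the previous one."""
    total_cycles = num_instructions + len(STAGES) - 1
    rows = []
    for i in range(num_instructions):
        row = [" "] * total_cycles
        for s, stage in enumerate(STAGES):
            row[i + s] = stage
        rows.append(" ".join(row).rstrip())
    return rows, total_cycles


rows, cycles = pipeline_diagram(5)
print("\n".join(rows))
print("cycles:", cycles)  # 5 instructions complete in 9 cycles
```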
With the R3900 Processor Core an instruction that immediately follows a load instruction can use the result of
that load instruction. Execution of the following instruction is delayed by hardware interlock until the result of
the load instruction becomes available. The instruction position immediately following the load instruction is
called the “load delay slot.”
In the case of branch instructions, a one-cycle delay is required to generate the branch target address. This
delayed cycle is referred to as the “branch delay slot.” An instruction placed immediately after a branch
instruction (in the branch delay slot) can be executed prior to the branch while the branch target address is
being generated.
The R3900 Processor Core provides a Branch Likely instruction whereby an instruction to be executed at the
branch target can be placed in the delay slot of the Branch Likely instruction and executed only if the
conditions of the branch instruction are met. If the conditions are not met, and the branch is not taken, the
instruction in the delay slot is treated as a NOP. This makes it possible to place an instruction that would
normally be executed at the branch target into the delay slot for quick execution (if the conditions of the
branch are met).
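The difference between an ordinary branch and a Branch Likely instruction lies only in how the delay slot is treated when the branch is not taken. The following sketch lists what is executed in each case; `run_branch` and the string labels are hypothetical and stand in for real instructions.

```python
def run_branch(taken, likely):
    """Return the sequence executed for a branch followed by its delay slot.

    taken:  whether the branch condition is met
    likely: True for a Branch Likely instruction, False for an ordinary branch
    """
    executed = ["branch"]
    if likely and not taken:
        executed.append("nop")         # delay slot instruction is treated as a NOP
    else:
        executed.append("delay slot")  # delay slot instruction executes
    executed.append("branch target" if taken else "fall through")
    return executed


for likely in (False, True):
    for taken in (False, True):
        print(likely, taken, run_branch(taken, likely))
```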
F D E M W
  F D E M W
    F D E M W
      F D E M W
        F D E M W
        ^ Current CPU cycle
Figure 2-7. Pipeline stages for execution of R3900 Processor Core instructions
2.6 Memory Management Unit (MMU)
2.6.1 R3900 Processor Core operating modes
The R3900 Processor Core has two operating modes, user mode and kernel mode. Normally the
processor operates in user mode. It switches to kernel mode if an exception is detected. Once in
kernel mode, it remains there until an RFE (Restore From Exception) instruction is executed.
(1) User mode
User mode makes available one of the two 2 Gbyte virtual address spaces (kuseg). In this
mode the most significant bit (MSB) of each kuseg address in the memory map is 0. Attempting to
access an address whose MSB is 1 while in user mode raises an Address Error exception.
(2) Kernel mode
Kernel mode makes available a second 2 Gbyte virtual address space (kseg), in addition to the
kuseg accessible in user mode. The MSB of each kseg address in the memory map is 1.
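The mode check described above depends only on the most significant address bit. A minimal sketch, with a hypothetical `AddressError` class standing in for the Address Error exception:

```python
class AddressError(Exception):
    """Stands in for the Address Error exception in this sketch."""


def check_access(virtual_address, user_mode):
    """Raise AddressError when user mode touches an address whose MSB is 1."""
    msb_set = bool(virtual_address & 0x80000000)
    if user_mode and msb_set:
        raise AddressError(hex(virtual_address))
    return "kseg" if msb_set else "kuseg"


print(check_access(0x00400000, user_mode=True))   # kuseg
print(check_access(0x80000000, user_mode=False))  # kseg
```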
2.6.2 Direct segment mapping
The R3900 Processor Core includes a direct segment mapping MMU. The following virtual address
spaces are available depending on the processor mode (Figure 2-8 shows the address mapping).
(1) User mode
One 2 Gbyte virtual address space (kuseg) is available. Virtual addresses from 0x0000 0000
to 0x7FFF FFFF are translated to physical addresses 0x4000 0000 to 0xBFFF FFFF,
respectively.
(2) Kernel mode
The kernel mode address space is treated as four virtual address segments. One of these is
the same as the kuseg space in user mode; the remaining three are the kernel segments kseg0,
kseg1 and kseg2.
(a) kuseg
This is the same as the virtual address space available in user mode. Address
translation is also the same as in user mode. The upper 16 Mbytes of kuseg are
reserved for on-chip resources and are not cacheable.
(b) kseg0
This is a 512 Mbyte segment spanning virtual addresses 0x8000 0000 to 0x9FFF
FFFF. Fixed mapping of this segment is made to physical addresses 0x0000 0000 to
0x1FFF FFFF, respectively. This area is cacheable.
(c) kseg1
This is a 512 Mbyte segment spanning virtual addresses 0xA000 0000 to 0xBFFF FFFF.
Fixed mapping of this segment is made to physical addresses 0x0000 0000 to 0x1FFF
FFFF, respectively. Unlike kseg0, this area is not cacheable.
(d) kseg2
This is a 1 Gbyte linear address space spanning virtual addresses 0xC000 0000 to 0xFFFF
FFFF. The upper 16 Mbytes of kseg2 are reserved for on-chip resources and are not
cacheable. Of this reserved area, 0xFF20 0000 to 0xFF3F FFFF is a 2 Mbyte
reserved area intended for use as a debugging monitor area and for testing.
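The fixed mappings described above can be summarized in a single translation function. This is a sketch, not the hardware's implementation: `translate` is a hypothetical name, and the treatment of kseg2 as mapping each virtual address to the identical physical address is an assumption, since the text above gives no translation for that segment.

```python
def translate(vaddr):
    """Map a 32-bit virtual address to (segment, physical address, cacheable)."""
    vaddr &= 0xFFFFFFFF
    if vaddr < 0x80000000:
        # kuseg: 0x0000 0000-0x7FFF FFFF -> 0x4000 0000-0xBFFF FFFF
        cacheable = vaddr < 0x7F000000  # upper 16 Mbytes are reserved, not cacheable
        return "kuseg", vaddr + 0x40000000, cacheable
    if vaddr < 0xA0000000:
        # kseg0: 0x8000 0000-0x9FFF FFFF -> 0x0000 0000-0x1FFF FFFF, cacheable
        return "kseg0", vaddr - 0x80000000, True
    if vaddr < 0xC0000000:
        # kseg1: 0xA000 0000-0xBFFF FFFF -> 0x0000 0000-0x1FFF FFFF, not cacheable
        return "kseg1", vaddr - 0xA0000000, False
    # kseg2: assumption (not stated in the text): physical address = virtual address
    cacheable = vaddr < 0xFF000000      # upper 16 Mbytes are reserved, not cacheable
    return "kseg2", vaddr, cacheable


for address in (0x00001000, 0x80001000, 0xA0001000, 0xFF200000):
    segment, paddr, cacheable = translate(address)
    print(hex(address), segment, hex(paddr), cacheable)
```

Note that kseg0 and kseg1 reach the same 512 Mbytes of physical memory, once through the cache and once bypassing it.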
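Virtual address space
0xFF00 0000 - 0xFFFF FFFF   16MB Kernel Reserved (upper part of kseg2)
0xC000 0000 - 0xFEFF FFFF   Kernel Cached (kseg2; 1024MB including the reserved area)
0xA000 0000 - 0xBFFF FFFF   Kernel Uncached (kseg1; 512MB)
0x8000 0000 - 0x9FFF FFFF   Kernel Cached (kseg0; 512MB)
0x7F00 0000 - 0x7FFF FFFF   16MB User Reserved (upper part of kuseg)
0x0000 0000 - 0x7EFF FFFF   Kernel/User Cached (kuseg; 2048MB including the reserved area)
Physical address space (from highest to lowest address)
Kernel Reserved
Kernel Cached Tasks (1024MB)
Kernel/User Cached Tasks (2048MB)
Inaccessible (512MB)
Kernel Boot and I/O, Cached/uncached (512MB)
Figure 2-8. Address mapping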