PMC RM7065A-300T, RM7065A-350T Datasheet


RM7065A™ Microprocessor with On-Chip Secondary Cache Data Sheet

Preliminary

RM7065A

RM7065A™ Microprocessor with On-Chip Secondary Cache

Data Sheet

Preliminary

Issue 2, June 2001

Proprietary and Confidential to PMC-Sierra, Inc and for its Customer’s Internal Use Document ID: PMC-2010145, Issue 2


Legal Information

use. In any event, you cannot reproduce any part of this document, in any form, without the express written consent of PMC-Sierra, Inc.

PMC-2010145 (P1)

Disclaimer

None of the information contained in this document constitutes an express or implied warranty by PMC-Sierra, Inc. as to the sufficiency, fitness or suitability for a particular purpose of any such information or the fitness, or suitability for a particular purpose, merchantability, performance, compatibility with other parts or systems, of any of the products of PMC-Sierra, Inc., or any portion thereof, referred to in this document. PMC-Sierra, Inc. expressly disclaims all representations and warranties of any kind regarding the contents or use of the information, including, but not limited to, express and implied warranties of accuracy, completeness, merchantability, fitness for a particular use, or non-infringement.

In no event will PMC-Sierra, Inc. be liable for any direct, indirect, special, incidental or consequential damages, including, but not limited to, lost profits, lost business or lost data resulting from any use of or reliance upon the information, whether or not PMC-Sierra, Inc. has been advised of the possibility of such damage.

Trademarks

RM7000A and Fast Packet Cache are trademarks of PMC-Sierra, Inc.

Patents

The technology discussed is protected by one or more of the following Patents: U.S. Patent Numbers 5,953,748 5,606,683 5,760,620. Relevant patent applications and other patents may also exist.

Contacting PMC-Sierra

PMC-Sierra, Inc. 105-8555 Baxter Place Burnaby, BC Canada V5A 4V7

Tel: (604) 415-6000 Fax: (604) 415-6200

Document Information: document@pmc-sierra.com
Corporate Information: info@pmc-sierra.com
Technical Support: apps@pmc-sierra.com
Web Site: http://www.pmc-sierra.com


Revision History

Issue No.   Issue Date   Details of Change
2           June 2001    Changed IP references to INT, page 34. Changed W7 pin name to SysClk.
1           April 2001   Applied PMC-Sierra template to existing MPD (QED) preliminary FrameMaker document. Updated Sections 4.33, 4.34, 4.38, 9, and 12. In the Pinout Table, changed all references from IP to Int. Changed QED references to PMC-Sierra or MIPS.


Document Conventions

The following conventions are used in this datasheet:

• All signal, pin, and bus names described in the text, such as ExtRqst*, are in boldface typeface.

• All bit and field names described in the text, such as Interrupt Mask, are in an italic-bold typeface.

• All instruction names, such as MFHI, are in sans serif typeface.


1 Features
2 Block Diagram
3 Description
4 Hardware Overview
    4.1 CPU Registers
    4.2 Superscalar Dispatch
    4.3 Pipeline
    4.4 Integer Unit
    4.5 ALU
    4.6 Integer Multiply/Divide
    4.7 Floating-Point Coprocessor
    4.8 Floating-Point Unit
    4.9 Floating-Point General Register File
    4.10 System Control Coprocessor (CP0)
    4.11 System Control Coprocessor Registers
    4.12 Virtual to Physical Address Mapping
    4.13 Joint TLB
    4.14 Instruction TLB
    4.15 Data TLB
    4.16 Cache Memory
    4.17 Instruction Cache
    4.18 Data Cache
    4.19 Secondary Cache
    4.20 Secondary Caching Protocols
    4.21 Cache Locking
    4.22 Cache Management
    4.23 Primary Write Buffer
    4.24 System Interface
    4.25 System Address/Data Bus
    4.26 System Command Bus
    4.27 Handshake Signals
    4.28 System Interface Operation
    4.29 Data Prefetch
    4.30 Enhanced Write Modes
    4.31 External Requests
    4.32 Test/Breakpoint Registers
    4.33 Performance Counters
    4.34 Interrupt Handling
    4.35 Standby Mode
    4.36 JTAG Interface
    4.37 Boot-Time Options
    4.38 Boot-Time Modes
5 Pin Descriptions
6 Absolute Maximum Ratings
7 Recommended Operating Conditions
8 DC Electrical Characteristics
9 Power Consumption
10 AC Electrical Characteristics
    10.1 Capacitive Load Deration
    10.2 Clock Parameters
    10.3 System Interface Parameters
    10.4 Boot-Time Interface Parameters
11 Timing Diagrams
    11.1 Clock Timing
12 Packaging Information
13 RM7065A Pinout
14 Ordering Information


List of Figures

Figure 1 Block Diagram
Figure 2 CPU Registers
Figure 3 Instruction Issue Paradigm
Figure 4 Pipeline
Figure 5 CP0 Registers
Figure 6 Kernel Mode Virtual Addressing (32-bit)
Figure 7 Typical Embedded System Block Diagram
Figure 8 Processor Block Read
Figure 9 Processor Block Write
Figure 10 Multiple Outstanding Reads
Figure 11 Clock Timing
Figure 12 Input Timing
Figure 13 Output Timing
Figure 14 Mechanical Diagram


List of Tables

Table 1 Instruction Issue Rules
Table 2 Dual Issue Instruction Classes
Table 3 ALU Operations
Table 4 Integer Multiply/Divide Operations
Table 5 Floating Point Latencies and Repeat Rates
Table 6 Cache Attributes
Table 7 Cache Locking Control
Table 8 Penalty Cycles
Table 9 Watch Control Register
Table 10 Performance Counter Control
Table 11 Cause Register
Table 12 Interrupt Control Register
Table 13 IPLLO Register
Table 14 IPLHI Register
Table 15 Interrupt Vector Spacing
Table 16 Boot Time Mode Stream
Table 17 System Interface
Table 18 Clock/Control Interface
Table 19 Interrupt Interface
Table 20 JTAG Interface
Table 21 Initialization Interface


1 Features

• Dual issue symmetric superscalar microprocessor with instruction prefetch optimized for system level price/performance
    • 300, 350 MHz operating frequency
    • >525 Dhrystone 2.1 MIPS @ 350 MHz

• High-performance system interface
    • 1000 MB per second peak throughput
    • 125 MHz max. freq., multiplexed address/data
    • Supports two outstanding reads with out-of-order return
    • Processor clock multipliers 2, 2.5, 3, 3.5, 4, 4.5, 5, 6, 7, 8, 9

• Integrated primary and secondary caches
    • All are 4-way set associative with 32 byte line size
    • 16 KB instruction, 16 KB data, 256 KB on-chip secondary
    • Per line cache locking in primaries and secondary
    • Fast Packet Cache™ increases system efficiency in networking applications

• High-performance floating-point unit, 800 MFLOPS maximum
    • Single cycle repeat rate for common single-precision operations and some double-precision operations
    • Single cycle repeat rate for single-precision combined multiply-add operations
    • Two cycle repeat rate for double-precision multiply and double-precision combined multiply-add operations

• MIPS IV superset instruction set architecture
    • Data PREFETCH instruction allows the processor to overlap cache miss latency and instruction execution
    • Single-cycle floating-point multiply-add

• Integrated memory management unit
    • Fully associative joint TLB (shared by I and D translations)
    • 64/48 dual entries map 128/96 pages
    • Variable page size

• Embedded application enhancements
    • Specialized DSP integer Multiply-Accumulate instructions (MAD/MADU) and three-operand multiply instruction (MUL)
    • I&D Test/Break-point (Watch) registers for emulation & debug
    • Performance counter for system and software tuning & debug
    • Fourteen fully prioritized vectored interrupts: 10 external, 2 internal, 2 software

• Fully static CMOS design with dynamic power down logic
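
Several of the figures in the feature list can be cross-checked with simple arithmetic. The sketch below is illustrative only: it assumes the 64-bit system interface width given in the Description section, one full-width transfer per bus clock at peak, and that the processor clock equals the bus clock times the selected multiplier. The function names are hypothetical and not part of any PMC-Sierra tool.

```python
# Back-of-the-envelope checks on the feature list (hypothetical helper names).
from fractions import Fraction

BUS_WIDTH_BYTES = 8      # assumption: 64-bit system interface
MAX_BUS_MHZ = 125        # maximum system interface frequency from the feature list
MULTIPLIERS = [Fraction(m) for m in
               ("2", "5/2", "3", "7/2", "4", "9/2", "5", "6", "7", "8", "9")]

def peak_throughput_mb_s(bus_mhz=MAX_BUS_MHZ, width_bytes=BUS_WIDTH_BYTES):
    """Peak MB/s, assuming one full-width transfer per bus clock."""
    return bus_mhz * width_bytes

def usable_multipliers(cpu_mhz, max_bus_mhz=MAX_BUS_MHZ):
    """Multipliers whose implied bus clock (cpu / multiplier) stays within the bus limit."""
    return [m for m in MULTIPLIERS if Fraction(cpu_mhz) / m <= max_bus_mhz]

def pages_mapped(dual_entries):
    """Each dual TLB entry maps two pages."""
    return 2 * dual_entries

print(peak_throughput_mb_s())                        # 1000
print([float(m) for m in usable_multipliers(350)])   # [3.0, 3.5, 4.0, 4.5, 5.0, 6.0, 7.0, 8.0, 9.0]
print(pages_mapped(64), pages_mapped(48))            # 128 96
```

Under these assumptions a 350 MHz part cannot use the 2 or 2.5 multipliers, since the implied bus clock would exceed 125 MHz.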


2 Block Diagram

Figure 1 Block Diagram

[Figure: block diagram showing the primary instruction cache and primary data cache (each 4-way set associative) with their tags (ITag, DTag) and TLBs (ITLB, DTLB); the 256 KB 4-way set associative secondary cache with its secondary tags; the Joint TLB and Coprocessor 0; the instruction dispatch unit, PC incrementer and branch PC adder; the integer register file with the F pipe (adder, shifter, logicals) and M pipe (adder, store align/shift, logicals) and the integer multiply, divide and multiply-add unit; the floating-point unit (load/align, packer/unpacker, comparator, multiply-add/add/sub/convert/divide/square root, multiplier array, floating-point control); the store, write, read, prefetch, pad and address buffers; system/memory control; and PLL/clocks.]


3 Description

PMC-Sierra’s RM7065A is a highly integrated symmetric superscalar microprocessor capable of issuing two instructions each processor cycle. It has two high-performance 64-bit integer units as well as a high-throughput, fully pipelined 64-bit floating point unit.

The RM7065A integrates 16 KB 4-way set associative instruction and data caches along with an integrated 256 KB 4-way set associative secondary cache. The primary data and secondary caches are write-back and non-blocking.
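
The cache sizes above, together with the 32 byte line size from the feature list, fix the number of sets in each cache. The following is a minimal sketch of that arithmetic, assuming textbook set-associative indexing (line offset in the low bits, set index above it). It does not model which address bits the hardware actually uses or whether a cache is virtually or physically indexed, and the helper names are hypothetical.

```python
# Cache geometry from size, associativity and line size (hypothetical helpers).
def cache_geometry(size_bytes, ways, line_bytes):
    """Return set count and field widths, assuming power-of-two parameters."""
    sets = size_bytes // (ways * line_bytes)
    offset_bits = line_bytes.bit_length() - 1   # log2 of the line size
    index_bits = sets.bit_length() - 1          # log2 of the number of sets
    return {"sets": sets, "offset_bits": offset_bits, "index_bits": index_bits}

def split_address(addr, geom):
    """Split an address into (tag, set index, byte offset) under the textbook layout."""
    offset = addr & ((1 << geom["offset_bits"]) - 1)
    index = (addr >> geom["offset_bits"]) & ((1 << geom["index_bits"]) - 1)
    tag = addr >> (geom["offset_bits"] + geom["index_bits"])
    return tag, index, offset

primary = cache_geometry(16 * 1024, ways=4, line_bytes=32)
secondary = cache_geometry(256 * 1024, ways=4, line_bytes=32)
print(primary)                         # {'sets': 128, 'offset_bits': 5, 'index_bits': 7}
print(secondary)                       # {'sets': 2048, 'offset_bits': 5, 'index_bits': 11}
print(split_address(0x1234, primary))  # (1, 17, 20)
```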

The memory management unit contains a 64/48-entry fully associative TLB. The 64-bit system interface supports multiple outstanding reads with out-of-order return, and interrupts are hardware prioritized and vectored.

The RM7065A ideally suits high-end embedded control applications such as internetworking, high-performance image manipulation, high-speed printing, and 3-D visualization. The RM7065A is also applicable to the low end workstation market where its balanced integer and floating-point performance provides outstanding price/performance.


4 Hardware Overview

The RM7065A offers a high level of integration targeted at high-performance embedded applications. The key elements of the RM7065A are described throughout this section.

4.1 CPU Registers

The RM7065A CPU contains 32 general purpose registers (GPR), two special purpose registers for integer multiplication and division, and a program counter; there are no condition code bits. Figure 2 shows the user visible state.

Figure 2 CPU Registers

[Figure: the general purpose registers r0 through r31, the HI and LO multiply/divide registers, and the program counter (PC), each 64 bits wide (bits 63 to 0).]

4.2 Superscalar Dispatch

The RM7065A incorporates a superscalar dispatch unit that allows it to issue up to two instructions per cycle. For purposes of instruction issue, the RM7065A defines four classes of instructions: integer, load/store, branches, and floating-point. There are two logical pipelines, the function, or F, pipeline and the memory, or M, pipeline. Note however that the M pipe can execute integer as well as memory type instructions.

Table 1 Instruction Issue Rules

F Pipe                                  M Pipe
one of:                                 one of:
integer, branch, floating-point,        integer, load/store
integer mul, div
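
The issue rules in Table 1 can be expressed as a small lookup. The sketch below is a simplified model, not the dispatch logic itself: it only checks whether two instruction classes can be split across the F and M pipes, and it ignores data dependencies and resource conflicts. The class labels and function name are chosen for illustration.

```python
# Which pipes accept each instruction class, following Table 1 (labels are illustrative).
PIPES = {
    "integer":         {"F", "M"},
    "load/store":      {"M"},
    "branch":          {"F"},
    "floating-point":  {"F"},
    "integer mul/div": {"F"},
}

def can_dual_issue(class_a, class_b):
    """True if one instruction can take the F pipe while the other takes the M pipe."""
    a, b = PIPES[class_a], PIPES[class_b]
    return ("F" in a and "M" in b) or ("M" in a and "F" in b)

print(can_dual_issue("integer", "integer"))            # True
print(can_dual_issue("floating-point", "load/store"))  # True
print(can_dual_issue("load/store", "load/store"))      # False
print(can_dual_issue("branch", "floating-point"))      # False
```

Two integer instructions pair because either pipe accepts them, while two load/store instructions both need the M pipe and so cannot issue together.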


Figure 3 is a simplification of the pipeline section and illustrates the basics of the instruction issue mechanism.

Figure 3 Instruction Issue Paradigm

[Figure: the instruction cache feeds the dispatch unit, which issues over the F Pipe IBus to the F pipe and over the M Pipe IBus to the M pipe.]

The figure illustrates that one F pipe instruction and one M pipe instruction can be issued concurrently but that two M pipe or two F pipe instructions cannot be issued. Table 2 specifies more completely the instructions within each class.

Table 2 Dual Issue Instruction Classes

Class             Instructions
integer           add, sub, or, xor, shift, etc.
load/store        lw, sw, ld, sd, ldc1, sdc1, mov, movc, fmov, etc.
floating-point    fadd, fsub, fmult, fmadd, fdiv, fcmp, fsqrt, etc.
branch            beq, bne, bCzT, bCzF, j, etc.

4.3 Pipeline

The logical length of both the F and M pipelines is five stages with state committing in the register write, or W, pipe stage. The physical length of the floating-point execution pipeline is actually seven stages but this is completely transparent to the user.

Figure 4 shows instruction execution within the RM7065A when instructions are issuing simultaneously down both pipelines. As illustrated in the figure, up to ten instructions can be executing simultaneously. This figure presents a somewhat simplistic view of the processor's operation since the out-of-order completion of loads, stores, and long latency floating-point operations can result in there being even more instructions in process than what is shown.
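
The figure of ten instructions follows from two instructions issuing per cycle into a five-stage logical pipeline. The sketch below illustrates that count under idealized assumptions: every cycle issues a full pair, there are no stalls, slips or NOP slots, and the single-letter stage names I, R, A, D, W are taken from the phase labels used in Figure 4. It is not a model of the actual hardware.

```python
# Idealized pipeline occupancy: two instructions issued per cycle, five stages.
STAGES = ["I", "R", "A", "D", "W"]   # five logical stages
ISSUE_WIDTH = 2                      # one F pipe plus one M pipe instruction per cycle

def in_flight(cycle, n_instructions):
    """Map instruction number to its stage at a given cycle, assuming no stalls or slips."""
    result = {}
    for i in range(n_instructions):
        issue_cycle = i // ISSUE_WIDTH      # pair k enters stage I at cycle k
        stage = cycle - issue_cycle
        if 0 <= stage < len(STAGES):
            result[i] = STAGES[stage]
    return result

snapshot = in_flight(cycle=4, n_instructions=20)
print(len(snapshot))   # 10
print(snapshot)        # instructions 0-1 in W, 2-3 in D, 4-5 in A, 6-7 in R, 8-9 in I
```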


Figure 4 Pipeline

[Figure: instructions I0 through I9 issue in pairs (I0/I1, I2/I3, I4/I5, I6/I7, I8/I9), each pair one cycle after the previous one. Every instruction passes through the phases 1I 2I 1R 2R 1A 2A 1D 2D 1W 2W.]

1I-1R:   Instruction cache access
2I:      Instruction virtual to physical address translation
2R:      Register file read, Bypass calculation, Instruction decode, Branch address calculation
1A:      Issue or slip decision, Branch decision
1A:      Data virtual address calculation
1A-2A:   Integer add, logical, shift
2A:      Store Align
2A-2D:   Data cache access and load align
1D:      Data virtual to physical address translation
2W:      Register file write

Note that instruction dependencies, resource conflicts, and branches may result in some of the instruction slots being occupied by NOPs.

4.4 Integer Unit

The RM7065A implements the MIPS IV Instruction Set Architecture. Additionally, the RM7065A includes two implementation specific instructions not found in the baseline MIPS IV ISA, but that are useful in the embedded market place. These instructions are integer multiply-accumulate (MAD) and three-operand integer multiply (MUL).

The RM7065A integer unit includes thirty-two general purpose 64-bit registers, the HI/LO result registers for two-operand integer multiply/divide operations, and the program counter, or PC. There are two separate execution units, one of which can execute function (F) type instructions and one which can execute memory (M) type instructions. Refer to Table 1 for the instruction issue rules.

Note that integer multiply/divide instructions, as well as their corresponding MFHI and MFLO instructions, can only be executed in the F type execution unit. Within each execution unit the operational characteristics are the same as on previous MIPS designs with single cycle ALU operations (add, sub, logical, shift), one cycle load delay, and an autonomous multiply/divide unit.

Register File

The RM7065A has thirty-two general purpose registers with register location 0 (r0) hard wired to a zero value. These registers are used for scalar integer operations and address calculation. In order to service the two integer execution units, the register file has four read ports and two write ports and is fully bypassed both within and between the two execution units to minimize operation latency in the pipeline.


4.5 ALU

The RM7065A has two complete integer ALUs each consisting of an integer adder/subtractor, a logic unit, and a shifter. Table 3 shows the functions performed by the ALUs for each execution unit. Each of these units is optimized to perform all operations in a single processor cycle.

Table 3 ALU Operations

Unit      F Pipe                             M Pipe
Adder     add, sub                           add, sub, data address add
Logic     logic, moves, zero shifts (nop)    logic, moves, zero shifts (nop)
Shifter   non zero shift                     non zero shift, store align

4.6 Integer Multiply/Divide

The RM7065A has a single dedicated integer multiply/divide unit optimized for high-speed multiply and multiply-accumulate operations. The multiply/divide unit resides in the F type execution unit. Table 4 shows the performance of the multiply/divide unit on each operation.

Table 4 Integer Multiply/Divide Operations

Opcode           Operand Size   Latency   Repeat Rate   Stall Cycles
MULT/U, MAD/U    16 bit         4         3             0
                 32 bit         5         4             0
MUL              16 bit         4         3             2
                 32 bit         5         4             3
DMULT, DMULTU    any            9         8             0
DIV, DIVU        any            36        36            0
DDIV, DDIVU      any            68        68            0

The baseline MIPS IV ISA specifies that the results of a multiply or divide operation be placed in the Hi and Lo registers. These values can then be transferred to the general purpose register file using the Move-from-Hi and Move-from-Lo (MFHI/MFLO) instructions.

In addition to the baseline MIPS IV integer multiply instructions, the RM7065A also implements the 3-operand multiply instruction, MUL. This instruction specifies that the multiply result go directly to the integer register file rather than the Lo register. The portion of the multiply that would have normally gone into the Hi register is discarded. For applications where it is known that the upper half of the multiply result is not required, using the MUL instruction eliminates the necessity of executing an explicit MFLO instruction.

The multiply-add instructions, MAD and MADU, multiply two operands and add the resulting product to the current contents of the Hi and Lo registers. The multiply-accumulate operation is the core primitive of almost all signal processing algorithms. Therefore, using the RM7065A eliminates the need for a separate DSP engine in many embedded applications.
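
The behaviour of MUL and MAD described above can be sketched in software. This is a simplified model under stated assumptions, not a reference for the instruction encoding or exact register effects: operands are treated as signed 32-bit values, Hi and Lo are treated as the upper and lower 32 bits of a 64-bit accumulator, and sign extension into the 64-bit registers is not modelled. All function names are hypothetical.

```python
# Simplified software model of MUL and MAD (assumptions noted in comments).
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF

def signed32(x):
    """Interpret the low 32 bits of x as a two's-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def mul(rs, rt):
    """MUL model: keep the low 32 bits of the product; the upper half is discarded."""
    return (signed32(rs) * signed32(rt)) & MASK32

def mad(hi, lo, rs, rt):
    """MAD model: add the signed product to the 64-bit value held in Hi:Lo (assumed layout)."""
    acc = ((hi & MASK32) << 32) | (lo & MASK32)
    acc = (acc + signed32(rs) * signed32(rt)) & MASK64
    return acc >> 32, acc & MASK32

def dot_product(xs, ys):
    """Accumulate a dot product in Hi:Lo, then read it back as a signed 64-bit value."""
    hi = lo = 0
    for x, y in zip(xs, ys):
        hi, lo = mad(hi, lo, x, y)
    acc = (hi << 32) | lo
    return acc - (1 << 64) if acc & (1 << 63) else acc

print(dot_product([1, 2, 3], [4, 5, 6]))   # 32
print(dot_product([-1], [5]))              # -5
print(hex(mul(0x10000, 0x10000)))          # 0x0 (the upper half of the product is lost)
```

The dot product loop shows why a multiply-accumulate primitive suits signal processing kernels: each tap costs one MAD with no intermediate moves out of Hi and Lo.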

4.7 Floating-Point Coprocessor

The RM7065A incorporates a high-performance fully pipelined floating-point coprocessor which includes a floating-point register file and autonomous execution units for multiply/add/convert and divide/square root. The floating-point coprocessor is a tightly coupled execution unit, decoding and executing instructions in parallel with, and in the case of floating-point loads and stores, in cooperation with the M pipe of the integer unit. The superscalar capabilities of the RM7065A allow floating-point computation instructions to issue concurrently with integer instructions.

4.8 Floating-Point Unit

The RM7065A floating-point execution unit supports single and double precision arithmetic, as specified in the IEEE Standard 754. The execution unit is broken into a separate divide/square root unit and a pipelined multiply/add unit. Overlap of divide/square root and multiply/add is supported.

The RM7065A maintains fully precise floating-point exceptions while allowing both overlapped and pipelined operations. Precise exceptions are extremely important in object-oriented programming environments and highly desirable for debugging in any environment.


Floating-point operations include:

• add

• subtract

• multiply

• divide

• square root

• reciprocal

• reciprocal square root

• conditional moves

• conversion between fixed-point and floating-point format

• conversion between floating-point formats

• floating-point compare

Table 5 gives the latencies of the floating-point instructions in internal processor cycles.


Table 5 Floating Point Latencies and Repeat Rates

Operation      Latency (single/double)   Repeat Rate (single/double)
fadd           4                         1
fsub           4                         1
fmult          4/5                       1/2
fmadd          4/5                       1/2
fmsub          4/5                       1/2
fdiv           21/36                     19/34
fsqrt          21/36                     19/34
frecip         21/36                     19/34
frsqrt         38/68                     36/66
fcvt.s.d       4                         1
fcvt.s.w       6                         3
fcvt.s.l       6                         3
fcvt.d.s       4                         1
fcvt.d.w       4                         1
fcvt.d.l       4                         1
fcvt.w.s       4                         1
fcvt.w.d       4                         1
fcvt.l.s       4                         1
fcvt.l.d       4                         1
fcmp           1                         1
fmov, fmovc    1                         1
fabs, fneg     1                         1
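
Latency and repeat rate together give a first-order estimate of how long a run of independent operations takes. The sketch below uses a few rows of Table 5 and assumes a simple model: a new operation of the same type can start every repeat-rate cycles, each result is available latency cycles after its operation starts, and nothing else stalls the pipeline. The dictionary and function names are hypothetical.

```python
# First-order timing estimate from Table 5 (a subset of rows; names are illustrative).
# Each entry is ((single latency, double latency), (single repeat, double repeat)).
FP_TIMING = {
    "fadd":  ((4, 4), (1, 1)),
    "fmult": ((4, 5), (1, 2)),
    "fmadd": ((4, 5), (1, 2)),
    "fdiv":  ((21, 36), (19, 34)),
}

def cycles_for_independent_ops(op, count, double=False):
    """Cycles until the last of `count` back-to-back independent operations completes."""
    if count <= 0:
        return 0
    latencies, repeats = FP_TIMING[op]
    idx = 1 if double else 0
    return latencies[idx] + (count - 1) * repeats[idx]

print(cycles_for_independent_ops("fmadd", 8))               # 11
print(cycles_for_independent_ops("fmadd", 8, double=True))  # 19
print(cycles_for_independent_ops("fdiv", 2))                # 40
```

Dependent operations would instead be spaced by the full latency, so this estimate is a lower bound for code with data dependencies.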

4.9 Floating-Point General Register File

The floating-point general register file (FGR) is made up of thirty-two 64-bit registers. With the floating-point load and store double instructions, LDC1 and SDC1, the floating-point unit can take advantage of the 64-bit wide data cache and issue a floating-point coprocessor load or store doubleword instruction in every cycle.

The floating-point control register file contains two registers: one for determining configuration and revision information for the coprocessor, and one for control and status information. These registers are primarily used for diagnostic software, exception handling, state saving and restoring, and control of rounding modes.

To support superscalar operations the FGR has four read ports and two write ports and is fully bypassed to minimize operation latency in the pipeline. Three of the read ports and one write port are used to support the combined multiply-add instruction while the fourth read and second write port allows for concurrent floating-point load or store and conditional move operations.

4.10 System Control Coprocessor (CP0)

The system control coprocessor (CP0) is responsible for the virtual memory sub-system, the exception control system, and the diagnostics capability of the processor.

For memory management support, the RM7065A CP0 is logically identical to the RM5200 Family. For interrupt exceptions and diagnostics, the RM7065A is a superset of the RM5200 Family, implementing additional features described in the following sections on Interrupts, Test/ Breakpoint registers, and Performance Counters.

The memory management unit controls the virtual memory system page mapping. It consists of an instruction address translation buffer (ITLB), a data address translation buffer (DTLB), a Joint TLB (JTLB), and coprocessor registers used by the virtual memory mapping sub-system.

4.11 System Control Coprocessor Registers

The RM7065A incorporates all CP0 registers internally. These registers provide the path through which the virtual memory system’s page mapping is examined and modified, exceptions are handled, and operating modes are controlled (kernel vs. user mode, interrupts enabled or disabled, cache features). In addition, the RM7065A includes registers to implement a real-time cycle counting facility, to aid in cache and system diagnostics, and to assist in data error detection.


To support the non-blocking caches and enhanced interrupt handling capabilities of the RM7065A, both the data and control register spaces of CP0 are supported. In the data register space, which is accessed using the MFC0 and MTC0 instructions, the RM7065A supports the same registers as found in the RM5200 Family. In the control space, which is accessed by the previously unused CTC0 and CFC0 instructions, the RM7065A supports five new registers. The first three of these new 32-bit registers support the enhanced interrupt handling capabilities: Interrupt Control, Interrupt Priority Level Lo (IPLLO), and Interrupt Priority Level Hi (IPLHI). These registers are described further in the section on interrupt handling. Two other registers, Imprecise Error 1 and Imprecise Error 2, have been added to help diagnose bus errors that occur on non-blocking memory references.

Figure 5 shows the CP0 registers.

Figure 5 CP0 Registers

(Register diagram not reproduced. It shows the TLB, with entries protected from TLBWR, together with the CP0 registers used for memory management and for exception processing. The registers named in the figure are listed below; a register number is given where one is legible in the figure.)

Registers: Index, Random, EntryLo0, EntryLo1, Context, PageMask, Wired, Info, BadVAddr, Count, EntryHi (10), Compare (11), Status (12), Cause (13), EPC (14), PRId (15), Config (16), LLAddr (17), Watch1 (18), Watch2 (19), XContext (20), Perf Ctr Cntrl (22), Watch Mask (24), Perf Counter (25), ECC (26), CacheErr (27), TagLo (28), TagHi (29), ErrorEPC (30)

Control Space Registers: IPLLO (18), IPLHI (19), IntControl (20), Imp Error 1 (26), Imp Error 2 (27)

4.12 Virtual to Physical Address Mapping

The RM7065A provides three modes of virtual addressing:

• user mode

• kernel mode

• supervisor mode

These modes allow system software to provide a secure environment for user processes. Bits in the CP0 Status register determine which virtual addressing mode is used. In user mode, the RM7065A provides a single, uniform virtual address space of 256 GB (2 GB in 32-bit mode).

When operating in the kernel mode, four distinct virtual address spaces, totalling 1024 GB (4 GB in 32-bit mode), are simultaneously available and are differentiated by the high-order bits of the virtual address.

The RM7065A processor also supports a supervisor mode in which the virtual address space is 256.5 GB (2.5 GB in 32-bit mode), divided into three regions based on the high-order bits of the virtual address. Figure 6 shows the address space layout for 32-bit operations.


Figure 6 Kernel Mode Virtual Addressing (32-bit)

Address range              Segment   Description                              Attributes
0xE0000000 - 0xFFFFFFFF    kseg3     Kernel virtual address space             Mapped, 0.5GB
0xC0000000 - 0xDFFFFFFF    ksseg     Supervisor virtual address space         Mapped, 0.5GB
0xA0000000 - 0xBFFFFFFF    kseg1     Uncached kernel physical address space   Unmapped, 0.5GB
0x80000000 - 0x9FFFFFFF    kseg0     Cached kernel physical address space     Unmapped, 0.5GB
0x00000000 - 0x7FFFFFFF    kuseg     User virtual address space               Mapped, 2.0GB

When the RM7065A is configured for 64-bit addressing, the virtual address space layout is an upward compatible extension of the 32-bit virtual address space layout.
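The 32-bit kernel mode layout of Figure 6 can be expressed as a small lookup table. The Python sketch below classifies a 32-bit virtual address by segment and reports whether the segment is mapped (translated through the TLB) or unmapped. The address ranges and segment names come from Figure 6; the function name classify is illustrative, and the 64-bit layout is not covered.

```python
# 32-bit kernel mode segments from Figure 6:
# (name, first address, last address, mapped through the TLB?)
SEGMENTS = [
    ("kuseg", 0x00000000, 0x7FFFFFFF, True),
    ("kseg0", 0x80000000, 0x9FFFFFFF, False),
    ("kseg1", 0xA0000000, 0xBFFFFFFF, False),
    ("ksseg", 0xC0000000, 0xDFFFFFFF, True),
    ("kseg3", 0xE0000000, 0xFFFFFFFF, True),
]


def classify(vaddr):
    """Return (segment name, mapped flag) for a 32-bit virtual address."""
    if not 0 <= vaddr <= 0xFFFFFFFF:
        raise ValueError("not a 32-bit address")
    for name, first, last, mapped in SEGMENTS:
        if first <= vaddr <= last:
            return name, mapped
    raise AssertionError("unreachable: the segments cover the whole 4 GB space")


if __name__ == "__main__":
    for addr in (0x00400000, 0x80001000, 0xA0000000, 0xC0000000, 0xFFFFFFFF):
        print(hex(addr), classify(addr))
```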

4.13 Joint TLB

For fast virtual-to-physical address translation, the RM7065A uses a large, fully associative TLB that maps virtual pages to their corresponding physical addresses. As indicated by its name, the JTLB is used for both instruction and data translations. The JTLB is organized as pairs of even/odd entries, and maps a virtual address and address space identifier (ASID) into the large, 64 GB physical address space. By default, the JTLB is configured as 48 pairs of even/odd entries. The optional 64 even/odd entry configuration is set at boot time.

Two mechanisms are provided to assist in controlling the amount of mapped space and the replacement characteristics of various memory regions. First, the page size can be configured, on a per-entry basis, to use page sizes in the range of 4 KB to 16 MB (in 4x multiples). The CP0 PageMask register is loaded with the desired page size of a mapping, and that size is stored into the TLB, along with the virtual address, when a new entry is written. Thus, operating systems can create special purpose maps; for example, an entire frame buffer can be memory mapped using only one TLB entry.
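The page size range described above can be enumerated directly. The following Python sketch lists the seven page sizes from 4 KB to 16 MB in 4x steps, picks the smallest page size that covers a buffer, and computes how much address space a JTLB configuration can map when every even/odd pair uses the same page size. The helper names and the 1024 x 768, 16 bits per pixel frame buffer are hypothetical examples; alignment requirements and the PageMask encoding are not modeled.

```python
KB = 1024
MB = 1024 * KB

# 4 KB to 16 MB in 4x multiples, as described in the text.
PAGE_SIZES = [4 * KB * 4 ** i for i in range(7)]


def smallest_page_covering(nbytes):
    """Return the smallest supported page size that holds nbytes, or None."""
    for size in PAGE_SIZES:
        if size >= nbytes:
            return size
    return None


def jtlb_reach(pairs, page_size):
    """Bytes mapped when every even/odd pair maps two pages of page_size."""
    return pairs * 2 * page_size


if __name__ == "__main__":
    frame_buffer = 1024 * 768 * 2          # hypothetical frame buffer size in bytes
    print([size // KB for size in PAGE_SIZES])
    print(smallest_page_covering(frame_buffer) // MB, "MB page covers the frame buffer")
    print(jtlb_reach(48, 4 * KB) // KB, "KB mapped by 48 pairs of 4 KB pages")
```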

The second mechanism controls the replacement algorithm when a TLB miss occurs. The RM7065A provides a random replacement algorithm to select a TLB entry to be written with a new mapping. However, the processor also provides a mechanism whereby a system specific number of mappings can be locked into the TLB, thereby avoiding random replacement. This mechanism uses the CP0 Wired register and allows the operating system to guarantee that certain pages are always mapped for performance reasons and to avoid a deadlock condition. This mechanism also facilitates the design of real-time systems by allowing deterministic access to critical software.
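A minimal sketch of random replacement with locked entries follows. It assumes the conventional MIPS arrangement in which entries with an index below the value held in Wired are never selected by random replacement; the data sheet states only that the Wired register locks a system-specific number of mappings, so the index range, the function name, and the use of a software random number generator are all assumptions made for illustration.

```python
import random


def pick_replacement_index(wired, entries, rng=random):
    """Pick a TLB entry index for refill, skipping the first `wired` entries.

    Assumption: indices 0 .. wired-1 are the locked entries.
    """
    if not 0 <= wired < entries:
        raise ValueError("wired must leave at least one replaceable entry")
    return rng.randrange(wired, entries)


if __name__ == "__main__":
    rng = random.Random(0)
    picks = [pick_replacement_index(8, 48, rng) for _ in range(10)]
    print(picks)  # every index is at least 8, so entries 0..7 stay mapped
```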

The JTLB also contains information that controls the cache coherency protocol for each page. Specifically, each page has attribute bits to determine whether the coherency algorithm is:

• uncached

• write-back

• write-through with write-allocate

• write-through without write-allocate

• write-back with secondary bypass

Note that both of the write-through protocols bypass the secondary cache since it does not support writes of less than a complete cache line.

These protocols are used for both code and data on the RM7065A with data using write-back or write-through depending on the application. The write-through modes support the same efficient frame buffer handling as the RM5200 Family.

4.14 Instruction TLB

The RM7065A uses a 4-entry instruction TLB (ITLB). The ITLB offers the following advantages:

• Minimizes contention for the JTLB

• Eliminates the critical path of translating through a large associative array

• Allows instruction address and data address translations to occur in parallel

• Saves power

Each ITLB entry maps a 4 KB page. The ITLB improves performance by allowing instruction address translation to occur in parallel with data address translation. When a miss occurs on an instruction address translation by the ITLB, the least-recently used ITLB entry is filled from the JTLB. The operation of the ITLB is completely transparent to the user.

4.15 Data TLB

The RM7065A uses a 4-entry data TLB (DTLB) for the same reasons cited above for the ITLB. Each DTLB entry maps a 4 KB page. The DTLB improves performance by allowing data address translation to occur in parallel with instruction address translation. When a miss occurs on a data address translation, the DTLB is filled from the JTLB. The DTLB refill is pseudo-LRU; the least recently used entry of the least recently used pair of entries is filled. The operation of the DTLB is completely transparent to the user.
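One way to read the pseudo-LRU refill rule is as a two-level tree over the four entries: one bit records which pair was used least recently, and one bit per pair records which entry of that pair was used least recently. The Python sketch below models that interpretation. It is an assumption about how the rule could be realized, not a description of the actual hardware, and the class and method names are illustrative.

```python
class PseudoLRU4:
    """Tree pseudo-LRU over four entries arranged as two pairs (0,1) and (2,3)."""

    def __init__(self):
        self.lru_pair = 0        # which pair (0 or 1) was used least recently
        self.lru_entry = [0, 0]  # per pair, which entry (0 or 1) was used least recently

    def touch(self, index):
        """Record a use of entry `index` (0..3)."""
        if not 0 <= index <= 3:
            raise ValueError("index must be 0..3")
        pair, entry = divmod(index, 2)
        self.lru_pair = 1 - pair          # the other pair is now the older one
        self.lru_entry[pair] = 1 - entry  # the sibling entry is now the older one

    def victim(self):
        """Return the entry to refill: LRU entry of the LRU pair."""
        pair = self.lru_pair
        return 2 * pair + self.lru_entry[pair]


if __name__ == "__main__":
    plru = PseudoLRU4()
    for used in (0, 2, 1, 3):
        plru.touch(used)
        print("used", used, "-> next victim", plru.victim())
```

The scheme needs only three state bits, which is why it only approximates true LRU ordering.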


4.16 Cache Memory

The RM7065A contains integrated primary instruction and data caches that support single cycle access, as well as a large unified secondary cache with a three cycle miss penalty from the primary caches. Each primary cache has a 64-bit read path and a 128-bit write path. Both caches can be accessed simultaneously. The primary caches provide the integer and floating-point units with an aggregate bandwidth of 5.6 GB per second at an internal clock frequency of 350 MHz. During an instruction or data primary cache refill, the secondary cache can provide a 64-bit datum every cycle following the initial three cycle latency for a peak bandwidth of 2.8 GB per second.
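The bandwidth figures quoted above follow from the read path width and the clock frequency. The short Python sketch below reproduces the arithmetic, taking 1 GB as 10^9 bytes; the function name is illustrative.

```python
def peak_bandwidth_bytes_per_s(bus_bits, clock_hz, ports=1):
    """Peak bytes per second for `ports` buses of `bus_bits` each, one transfer per cycle."""
    return ports * (bus_bits // 8) * clock_hz


CLOCK_HZ = 350_000_000

# Two primary caches, each with a 64-bit read path, accessed simultaneously.
primary = peak_bandwidth_bytes_per_s(64, CLOCK_HZ, ports=2)

# Secondary cache supplying one 64-bit datum per cycle during a refill.
secondary = peak_bandwidth_bytes_per_s(64, CLOCK_HZ)

if __name__ == "__main__":
    print(primary / 1e9, "GB/s aggregate from the primary caches")
    print(secondary / 1e9, "GB/s peak from the secondary cache")
```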

4.17 Instruction Cache

The RM7065A has an integrated 16 KB, four-way set associative instruction cache that is virtually indexed and physically tagged. The effective physical index eliminates the potential for virtual aliases in the cache.

The data array portion of the instruction cache is 64 bits wide and protected by word parity while the tag array holds a 24-bit physical address, 14 control bits, a valid bit, and a single parity bit.

By accessing 64 bits per cycle, the instruction cache is able to supply two instructions per cycle to the superscalar dispatch unit. For signal processing, graphics, and other numerical code sequences where a floating-point load or store and a floating-point computation instruction are being issued together in a loop, the entire bandwidth available from the instruction cache is consumed by instruction issue. For typical integer code mixes, where instruction dependencies and other resource constraints restrict the level of parallelism that can be achieved, the extra instruction cache bandwidth is used to fetch both the taken and non-taken branch paths to minimize the overall penalty for branches.


A 32-byte (eight instruction) line size is used to maximize the communication efficiency between the instruction cache and the secondary cache or memory system.

The RM7065A supports cache locking on a per line basis. The contents of each line of the cache can be locked by setting a bit in the Tag RAM. Locking the line prevents its contents from being overwritten by a subsequent cache miss. Refills occur only into unlocked cache lines. This mechanism allows the programmer to lock critical code into the cache, thereby guaranteeing deterministic behavior for the locked code sequence.

4.18 Data Cache

The RM7065A has an integrated 16 KB, four-way set associative data cache that is virtually indexed and physically tagged. Line size is 32 bytes (8 words). The effective physical index eliminates the potential for virtual aliases in the cache.
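The statement about the effective physical index can be checked from the cache geometry. In the Python sketch below, each way of a 16 KB, four-way cache spans 4 KB, which equals the smallest page size, so the 12 address bits that select a set and a byte within the line lie entirely inside the page offset and are identical in the virtual and physical address. The function name and returned field names are illustrative.

```python
def cache_geometry(size_bytes, ways, line_bytes):
    """Return bytes per way, number of sets, and address bits covering one way."""
    way_bytes = size_bytes // ways
    return {
        "way_bytes": way_bytes,
        "sets": way_bytes // line_bytes,
        # Bits needed to address every byte within one way (set index plus line offset).
        "index_bits": (way_bytes - 1).bit_length(),
    }


SMALLEST_PAGE = 4 * 1024

primary = cache_geometry(16 * 1024, 4, 32)
secondary = cache_geometry(256 * 1024, 4, 32)

if __name__ == "__main__":
    print(primary)    # a way is 4 KB, so index bits 11..0 fall within the page offset
    print(secondary)  # a way is 64 KB, so index bits 15..0 come from the physical address
    print(primary["way_bytes"] <= SMALLEST_PAGE)
```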

The data cache is non-blocking; that is, a miss in the data cache does not necessarily stall the processor pipeline. As long as no instruction is encountered which is dependent on the data reference which caused the miss, the pipeline continues to advance. Once there are two cache misses outstanding, the processor stalls if it encounters another load or store instruction.

The data array portion of the data cache is 64 bits wide and protected by byte parity while the tag array holds a 24-bit physical address, 3 control bits, a two-bit cache state field, and two parity bits.


The most commonly used write policy is write-back, which means that a store to a cache line does not immediately cause memory to be updated. This increases system performance by reducing bus traffic and eliminating the bottleneck of waiting for each store operation to finish before issuing a subsequent memory operation. Software can, however, select write-through on a per-page basis when appropriate, such as for frame buffers. Cache protocols supported for the data cache are as follows:

1. Uncached

Reads to addresses in a memory area identified as uncached do not access the cache. Writes to such addresses are written directly to main memory without updating the cache.

2. Write-back

Loads and instruction fetches first search the cache, reading the next memory hierarchy level only if the desired data is not cache resident. On data store operations, the cache is first searched to determine if the target address is cache resident. If it is resident, the cache contents are updated and the cache line is marked for later write-back. If the cache lookup misses, the target line is first brought into the cache, after which the write is performed as above.

3. Write-through with write allocate

Loads and instruction fetches first search the cache, reading from memory only if the desired data is not cache resident; write-through data is never cached in the secondary cache. On data store operations, the cache is first searched to determine if the target address is cache resident. If it is resident, the primary cache contents are updated and main memory is written, leaving the write-back bit of the cache line unchanged; no writes occur to the secondary cache. If the cache lookup misses, the target line is first brought into the cache, after which the write is performed as above.

4. Write-through without write allocate

Loads and instruction fetches first search the cache, reading from memory only if the desired data is not cache resident; write-through data is never cached in the secondary cache. On data store operations, the cache is first searched to determine if the target address is cache resident. If it is resident, the cache contents are updated and main memory is written, leaving the write-back bit of the cache line unchanged; no writes occur to the secondary cache. If the cache lookup misses, only main memory is written.

5. Fast Packet Cache™ (Write-back with secondary bypass)

Loads and instruction fetches first search the primary cache, reading from memory only if the desired data is not resident; the secondary cache is not searched. On data store operations, the primary cache is first searched to determine if the target address is resident. If it is resident, the cache contents are updated, and the cache line is marked for later write-back. If the cache lookup misses, the target line is first brought into the cache, after which the write is performed as above.
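The store behavior of the five protocols above can be summarized in a small decision function. The Python sketch below is a simplified model written from the descriptions in the list: for a given protocol and primary cache hit or miss, it reports whether the line is first brought into the cache, whether the primary cache is updated, whether the line is marked for later write-back, and whether main memory is written by the store. The policy strings and field names are illustrative, and load behavior and secondary cache effects are not modeled.

```python
from collections import namedtuple

StoreEffects = namedtuple(
    "StoreEffects", "fill_line update_primary mark_dirty write_memory"
)


def store_effects(policy, primary_hit):
    """Model what a data store does under each data cache protocol."""
    if policy == "uncached":
        # Written directly to main memory without updating the cache.
        return StoreEffects(False, False, False, True)
    if policy in ("write-back", "write-back-secondary-bypass"):
        # On a miss the line is filled first; the store then marks it for write-back.
        return StoreEffects(not primary_hit, True, True, False)
    if policy == "write-through-allocate":
        # Cache and memory are both written; the write-back bit is left unchanged.
        return StoreEffects(not primary_hit, True, False, True)
    if policy == "write-through-no-allocate":
        # On a miss only main memory is written.
        return StoreEffects(False, primary_hit, False, True)
    raise ValueError("unknown policy: " + policy)


if __name__ == "__main__":
    for name in ("uncached", "write-back", "write-through-allocate",
                 "write-through-no-allocate", "write-back-secondary-bypass"):
        print(name, store_effects(name, primary_hit=False))
```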

Associated with the data cache is the store buffer. When the RM7065A executes a STORE instruction, this single-entry buffer is written with the store data while the tag comparison is performed. If the tag matches, then the data is written into the data cache in the next cycle that the data cache is not accessed (the next non-load cycle). The store buffer allows the RM7065A to execute a store every processor cycle and to perform back-to-back stores without penalty. In the event of a store immediately followed by a load to the same address, a combined merge and cache write occurs such that no penalty is incurred.

4.19 Secondary Cache

The RM7065A has an integrated 256 KB, four-way set associative, block write-back secondary cache. The secondary cache has a 32-byte line size, a 64-bit bus width to match the system interface and primary cache bus widths, and is protected with doubleword parity. The secondary cache tag array holds a 20-bit physical address, 2 control bits, a three bit cache state field, and two parity bits.

By integrating a secondary cache, the RM7065A is able to decrease the latency of a primary cache miss without significantly increasing the number of pins and the amount of power required by the processor. From a technology point of view, integrating a secondary cache leverages CMOS technology by using silicon to build the structures that are most amenable to silicon technology: very dense, low power memory arrays rather than large, power hungry I/O buffers.

Further benefits of an integrated secondary cache are flexibility in the cache organization and management policies that are not practical with an external cache. Two previously mentioned examples are the 4-way associativity and write-back cache protocol.


A third management policy for which integration affords flexibility is cache hierarchy management. With multiple levels of cache, it is necessary to specify a policy for dealing with cases where two cache lines at level n of the hierarchy could possibly be sharing an entry in level n+1 of the hierarchy.

The RM7065A allows entries to be stored in the primary caches that do not necessarily have a corresponding entry in the secondary; the RM7065A does not force the primaries to be a subset of the secondary. For example, if primary cache line A is being filled and a cache line already exists in the secondary for primary cache line B at the location where primary A’s line would reside, then that secondary entry is replaced by an entry corresponding to primary cache line A and no action occurs in the primary for cache line B. This operation creates the aforementioned scenario where the primary cache line, which initially had a corresponding secondary entry, no longer has such an entry. Such a primary line is called an orphan. In general, cache lines at level n+1 of the hierarchy are called parents of level n’s children.

Another RM7065A cache management optimization occurs for the case of a secondary cache line replacement where the secondary line is dirty and has a corresponding dirty line in the primary. In this case, since it is permissible to leave the dirty line in the primary, it is not necessary to write the secondary line back to main memory. Taking this scenario one step further, a final optimization occurs when the aforementioned dirty primary line is replaced by another line and must be written back. In this case it is written directly to memory, bypassing the secondary cache.

4.20 Secondary Caching Protocols

Unlike the primary data cache, the secondary cache supports only uncached and block write-back. As noted earlier, cache lines managed with either of the write-through protocols are not placed in the secondary cache. A new caching attribute, write-back with secondary bypass, allows the secondary cache to be bypassed entirely. When this attribute is selected, the secondary cache is not filled on load misses and is not written on dirty write-backs from the primary cache.


The RM7065A cache attributes for the instruction, data, and internal secondary caches are summarized in Table 6.

Table 6 Cache Attributes

Attribute                 Instruction           Data                                 Secondary
Size                      16KB                  16KB                                 256KB
Associativity             4-way                 4-way                                4-way
Replacement Algorithm     cyclic                cyclic                               cyclic
Line size                 32 byte               32 byte                              32 byte
Index                     vAddr 11..0           vAddr 11..0                          pAddr 15..0
Tag                       pAddr 35..12          pAddr 35..12                         pAddr 35..16
Write policy              n.a.                  write-back, write-through            block write-back, bypass
read policy               n.a.                  non-blocking (2 outstanding)         non-blocking (data only, 2 outstanding)
read order                critical word first   critical word first                  critical word first
write order               NA                    sequential                           sequential
miss restart following:   complete line         first double (if waiting for data)   n.a.
Parity                    per word              per byte                             per doubleword

4.21 Cache Locking

The RM7065A allows critical code or data fragments to be locked into the primary and secondary caches. The user has complete control over the locking function. For instruction and data fragments in the primary caches, locking is accomplished by setting either or both of the cache lock enable bits and specifying the set in the CP0 ECC register, then executing either a load instruction for data, or a Fill_I cache operation for instructions.

Only sets A and B within each cache can be locked. Locking within the secondary works identically to the primaries using a separate secondary lock enable bit and the same set selection field. As with the primaries, only sets A and B can be locked. Table 7 summarizes the cache locking capabilities.
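As an illustration of the lock controls, the Python sketch below composes the lock-related bits of the CP0 ECC register from the bit positions listed in Table 7 (ECC[27], ECC[26], and ECC[25] as lock enables, ECC[28] as set select). It only builds the bit pattern: all other ECC register fields are left at zero as a simplifying assumption, and writing the register and issuing the activating load or Fill_I operation are not shown. The dictionary keys and the function name are illustrative.

```python
# Lock enable bit positions in the CP0 ECC register, from Table 7.
LOCK_ENABLE_BIT = {"primary_i": 27, "primary_d": 26, "secondary": 25}
SET_SELECT_BIT = 28  # 0 selects set A, 1 selects set B


def ecc_lock_bits(caches, cache_set):
    """Return the ECC bit pattern enabling locking for `caches` in set A or B."""
    if cache_set not in ("A", "B"):
        raise ValueError("only sets A and B can be locked")
    value = 0
    for cache in caches:
        value |= 1 << LOCK_ENABLE_BIT[cache]
    if cache_set == "B":
        value |= 1 << SET_SELECT_BIT
    return value


if __name__ == "__main__":
    print(hex(ecc_lock_bits(["primary_d"], "A")))
    print(hex(ecc_lock_bits(["primary_i", "secondary"], "B")))
```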

Table 7 Cache Locking Control

Cache       Lock Enable   Set Select                   Activate
Primary I   ECC[27]       ECC[28]=0→A, ECC[28]=1→B     Fill_I
Primary D   ECC[26]       ECC[28]=0→A, ECC[28]=1→B     Load/Store
Secondary   ECC[25]       ECC[28]=0→A, ECC[28]=1→B     Fill_I or Load/Store

4.22 Cache Management

To improve the performance of critical data movement operations in the embedded environment, the RM7065A significantly improves the speed of operation of certain critical cache management operations. In particular, the speed of the Hit-Writeback-Invalidate and Hit-Invalidate cache operations has been improved, in some cases by an order of magnitude, over that of other MIPS processors. For example, Table 8 compares the RM7065A with the R4000 processor.

Table 8 Penalty Cycles

Operation                  Condition   Penalty RM7065A   Penalty R4000
Hit-Writeback-Invalidate   Miss        0                 7
                           Hit-Clean   3                 12
                           Hit-Dirty   3+n               14+n
Hit-Invalidate             Miss        0                 7

For the Hit-Dirty case of Hit-Writeback-Invalidate in Table 8 above, if the writeback buffer is full from some previous cache eviction, then n is the number of cycles required to empty the writeback buffer. If the buffer is empty then n is zero.

The penalty value in Table 8 is the number of processor cycles beyond the one cycle required to issue the instruction that is required to implement the operation.
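To show how the penalties in Table 8 add up over a buffer, the Python sketch below totals the cycles for applying one cache operation to a number of lines, counting one issue cycle plus the penalty for each line. It assumes every line meets the same condition and that n is the same for each Hit-Dirty line, which is a simplification; the function and dictionary names are illustrative.

```python
# Penalty cycles from Table 8: (RM7065A, R4000). Hit-Dirty entries exclude n.
PENALTY = {
    ("Hit-Writeback-Invalidate", "Miss"): (0, 7),
    ("Hit-Writeback-Invalidate", "Hit-Clean"): (3, 12),
    ("Hit-Writeback-Invalidate", "Hit-Dirty"): (3, 14),
    ("Hit-Invalidate", "Miss"): (0, 7),
}
CPU_COLUMN = {"RM7065A": 0, "R4000": 1}


def total_cycles(operation, condition, lines, cpu="RM7065A", n=0):
    """Cycles to apply `operation` to `lines` cache lines: issue cycle plus penalty each."""
    penalty = PENALTY[(operation, condition)][CPU_COLUMN[cpu]]
    if condition == "Hit-Dirty":
        penalty += n  # cycles needed to empty a full writeback buffer
    return lines * (1 + penalty)


if __name__ == "__main__":
    for cpu in ("RM7065A", "R4000"):
        print(cpu, total_cycles("Hit-Writeback-Invalidate", "Hit-Clean", 128, cpu))
```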

4.23 Primary Write Buffer

Writes to secondary cache or external memory, whether cache miss write-backs or stores to uncached or write-through addresses, use the integrated primary write buffer. The write buffer holds up to four 64-bit address and data pairs. The entire buffer is used for a data cache write-back and allows the processor to proceed in parallel with memory update. For uncached and write-through stores, the write buffer significantly increases performance by decoupling the SysAD bus transfers from the instruction execution stream.


4.24 System Interface

The RM7065A provides a high-performance 64-bit system interface which is compatible with the RM5200 Family. As an enhancement to the SysAD bus interface, the RM7065A allows half-integral clock multipliers, thereby providing greater granularity when selecting pipeline and system interface frequencies.

The SysAD interface consists of a 64-bit Address/Data bus with 8 check bits and a 9-bit command bus. In addition, there are ten handshake signals and ten interrupt inputs. The interface is capable of transferring data between the processor and memory at a peak rate of 1000 MB/sec with a 125 MHz SysClock.
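The relationship between SysClock, the clock multiplier, and the peak SysAD transfer rate can be sketched as follows. The Python example checks that a multiplier is a whole or half-integral value, derives the pipeline clock, and reproduces the 1000 MB/sec figure. The function names are illustrative, and the 100 MHz SysClock with a 3.5 multiplier is a hypothetical combination chosen for the arithmetic; the set of multipliers actually supported is selected by boot-time mode bits that are not covered here.

```python
from fractions import Fraction


def pipeline_clock_hz(sysclock_hz, multiplier):
    """Pipeline clock for a whole or half-integral multiplier such as 2, 2.5 or 3."""
    m = Fraction(multiplier)
    if (m * 2).denominator != 1:
        raise ValueError("multiplier must be a whole or half-integral value")
    return sysclock_hz * m


def sysad_peak_bytes_per_s(sysclock_hz, bus_bits=64):
    """Peak SysAD rate: one bus-width transfer per SysClock cycle."""
    return sysclock_hz * bus_bits // 8


if __name__ == "__main__":
    print(sysad_peak_bytes_per_s(125_000_000) // 1_000_000, "MB/sec at 125 MHz")
    print(pipeline_clock_hz(100_000_000, 3.5) / 1e6, "MHz pipeline clock (hypothetical)")
```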

Figure 7 shows a typical embedded system using the RM7065A. This example shows a system with a bank of DRAMs and an interface ASIC which provides DRAM control as well as an I/O port.

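Figure 7 Typical Embedded System Block Diagram

(Block diagram not reproduced. It shows the RM7065A connected by the SysAD bus and SysCmd to a memory I/O controller with a PCI bus, together with DRAM and its address and control lines, a latch, and Flash/Boot ROM.)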

4.25 System Address/Data Bus

The 64-bit System Address Data (SysAD) bus is used to transfer addresses and data between the RM7065A and the rest of the system. It is protected with an 8-bit parity check bus, SysADC[7:0].

The system interface is configurable to allow easy interfacing to memory and I/O systems of varying frequencies. The data rate and the bus frequency at which the RM7065A transmits data to the system interface are programmable at boot time via mode control bits. In addition, the rate at which the processor receives data is fully controlled by the external device. Therefore, either a low cost interface requiring no read or write buffering, or a faster, high-performance interface can be designed to communicate with the RM7065A.

4.26 System Command Bus

The RM7065A interface has a 9-bit System Command bus, SysCmd[8:0]. The command bus indicates whether the SysAD bus carries address or data information on a per-clock basis. If the SysAD bus carries address, the SysCmd bus indicates the transaction type (for example, a read or write). If the SysAD bus carries data, then the SysCmd bus contains information about the data (for example, this is the last data word transmitted, or the data contains an error). The SysCmd bus is bidirectional to support both processor requests and external requests to the RM7065A. Processor requests are initiated by the RM7065A and responded to by an external device. External requests are issued by an external device and require the RM7065A to respond.


The RM7065A supports one- to eight-byte transfers as well as 32-byte block transfers on the SysAD bus. In the case of a sub-doubleword transfer, the 3 low-order address bits give the byte address of the transfer, and the SysCmd bus indicates the number of bytes being transferred.


4.27 Handshake Signals

There are ten handshake signals on the system interface. Two of these, RdRdy* and WrRdy*, are driven by an external device to indicate to the RM7065A whether it can accept a new read or write transaction. The RM7065A samples these signals before deasserting the address on read and write requests.

ExtRqst* and Release* are used to transfer control of the SysAD and SysCmd buses from the processor to an external device. When an external device requires control of the bus, it asserts ExtRqst*. The RM7065A responds by asserting Release* to release the system interface to slave state.

PRqst* and PAck* are used to transfer control of the SysAD and SysCmd buses from the external agent to the processor. These two pins have been added to the SysAD interface to support multiple outstanding reads and facilitate non-blocking caches. When the processor needs to reacquire control of the interface, it asserts PRqst*. The external device responds by asserting PAck* to return control of the interface to the processor.

RspSwap* is also a new pin and is used by the external agent to indicate to the processor when it is returning data out of order. For example, when there are two outstanding reads, the external agent asserts RspSwap* when it is going to return the data for the second read before it returns the data for the first read.
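The effect of RspSwap* on response ordering can be modeled in a few lines. In the Python sketch below, the outstanding reads are kept in issue order and the function returns the read that the next response data belongs to. This models only the ordering rule described above, not signal timing, and the function name and the string labels for the reads are illustrative.

```python
def response_target(outstanding_reads, rsp_swap):
    """Return the outstanding read (in issue order) that the next response belongs to."""
    if not outstanding_reads:
        raise ValueError("no read is outstanding")
    if len(outstanding_reads) > 2:
        raise ValueError("at most two reads can be outstanding")
    if rsp_swap:
        if len(outstanding_reads) != 2:
            raise ValueError("an out-of-order response needs two outstanding reads")
        return outstanding_reads[1]  # data for the second read is returned first
    return outstanding_reads[0]      # in-order response


if __name__ == "__main__":
    reads = ["first miss", "second miss"]
    print(response_target(reads, rsp_swap=False))
    print(response_target(reads, rsp_swap=True))
```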


RdType is another new pin on the interface that indicates whether a read is an instruction read or a data read. When asserted, the reference is an instruction read. When deasserted it is a data read. RdType is only valid during valid address cycles.

ValidOut* and ValidIn* are used by the RM7065A and the external device respectively to indicate that there is a valid command or data on the SysAD and SysCmd buses. The RM7065A asserts ValidOut* when it is driving these buses with a valid command or data, and the external device drives ValidIn* when it has control of the buses and is driving a valid command or data.

4.28 System Interface Operation

To support non-blocking caches and data prefetch instructions, the RM7065A allows two outstanding reads. An external device may respond to read requests in whatever order it chooses by using the response order indicator pin RspSwap*. No more than two read requests are submitted to the external device. Support for multiple outstanding reads can be enabled or disabled via a boot-time mode bit. Refer to Table 16 for a complete list of mode bits.

The RM7065A can issue read and write requests to an external device, while an external device can issue null and write requests to the RM7065A.

For processor reads, the RM7065A asserts ValidOut* and simultaneously drives the address and read command on the SysAD and SysCmd buses. If the system interface has RdRdy* asserted, then the processor tristates its drivers and releases the system interface to slave state by asserting Release*. The external device can then begin sending data to the RM7065A.

Figure 8 shows a processor block read request and the corresponding external agent read response.


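Figure 8 Processor Block Read

(Timing diagram not reproduced. Signals shown: SysClock, SysAD, SysCmd, ValidOut*, ValidIn*, RdRdy*, WrRdy*, Release*. SysAD carries Addr followed by Data0, Data1, Data2, and Data3; SysCmd carries Read followed by NData, NData, NData, and NEOD.)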

In Figure 8 the read latency is 4 cycles (ValidOut* to ValidIn*), and the response data pattern is DDxxDD. Figure 9 shows a processor block write where the processor was programmed with write-back data rate boot code 2, or DDxxDDxx.
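Data patterns such as DDxxDD can be turned into cycle counts mechanically. The Python sketch below counts the bus cycles needed to move a given number of doublewords, reading "D" as a cycle that carries data and "x" as an unused cycle, and treating the pattern as repeating if more data remains. That reading of the letters and the repetition are assumptions made for illustration, as is the function name; a 32-byte block corresponds to four doublewords on the 64-bit SysAD bus.

```python
from itertools import cycle


def cycles_for_transfer(pattern, doublewords):
    """Bus cycles from the first data slot until `doublewords` have been moved."""
    if "D" not in pattern:
        raise ValueError("pattern must contain at least one data cycle")
    if doublewords < 1:
        raise ValueError("doublewords must be at least 1")
    sent = 0
    for elapsed, slot in enumerate(cycle(pattern), start=1):
        if slot == "D":
            sent += 1
            if sent == doublewords:
                return elapsed


if __name__ == "__main__":
    for pattern in ("D", "DDxxDD", "DDxxDDxx", "Dx"):
        print(pattern, cycles_for_transfer(pattern, 4))
```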

Finally, Figure 10 shows a typical sequence resulting in two outstanding reads, as explained in the following sequence.

1. The processor issues a read.

2. The external agent takes control of the bus in preparation for returning data to the processor.

3. The processor encounters another internal cache miss and therefore asserts PRqst* in order to regain control of the bus.

4. The external agent pulses PAck*, returning control of the bus to the processor.

5. The processor issues a read for the second miss.

6. The RspSwap* pin is asserted to denote the out of order response. Not shown in the figure is the completion of the data transfer for the second miss, or any of the data transfer for the first miss.

7. The external agent retakes control of the bus and begins returning data (out of order) for the second miss to the processor.


Figure 9 Processor Block Write

SysClock

Preliminary

SysAD

SysCmd

ValidOut*

ValidIn*

RdRdy*

WrRdy*

Release*

Addr Data0 Data1 Data2 Data3

Write NData NData NData NEOD

Figure 10 Multiple Outstanding Reads

Master

SysClock

SysAD

SysCmd

RspSwap*

ValidOut*

ValidIn*

Processor

Addr

Read

Processor Processor

Data0

Data1 Data1

System

Processor

Addr

Read

System

Data0

NData

Data1

NData

Data0

Release*

PRqst*

PAck*

4.29 Data Prefetch

The RM7065A is the first PMC-Sierra design to support the MIPS IV integer data prefetch (PREF) and floating-point data prefetch (PREFX) instructions. These instructions are used by the compiler or by an assembly language programmer when it is known or suspected that an upcoming data reference is going to miss in the cache. By appropriately placing a prefetch instruction, the memory latency can be hidden under the execution of other instructions. In cases where the execution of a prefetch instruction would cause a memory management or address error exception, the prefetch is treated as a NOP.

The “Hint” field of the data prefetch instruction is used to specify the action taken by the instruction. The instruction can operate normally (that is, fetching data as if for a load operation) or it can allocate and fill a cache line with zeroes on a primary data cache miss.

4.30 Enhanced Write Modes

The RM7065A implements two enhancements to the original R4000 write mechanism: Write Reissue and Pipelined Writes. The original R4000 allowed a write on the SysAD bus every four SysClock cycles. Hence, for a non-block write, two out of every four cycles were wait states.

Pipelined write mode eliminates these two wait states by allowing the processor to drive a new write address onto the bus immediately after the previous data cycle. This allows for higher SysAD bus utilization. However, at high frequencies the processor may drive a subsequent write onto the bus prior to the time the external agent deasserts WrRdy*, indicating that it cannot accept another write cycle. This can cause the cycle to be aborted.

Write reissue mode is an enhancement to pipelined write mode and allows the processor to reissue aborted write cycles. If WrRdy* is deasserted during the issue phase of a write operation, the cycle is aborted by the processor and reissued at a later time.

In write reissue mode, a rate of one write every two bus cycles can be achieved. Pipelined writes have the same two bus cycle write repeat rate, but can issue one additional write following the deassertion of WrRdy*.
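The effect of the write repeat rate on peak non-block write throughput can be estimated with a short calculation. The sketch below is illustrative only: the function name is hypothetical, it assumes each non-block write moves one full 64-bit (8-byte) SysAD doubleword, and it ignores any stalls caused by WrRdy*.

```python
R4000_CYCLES_PER_WRITE = 4      # one write every four SysClock cycles
PIPELINED_CYCLES_PER_WRITE = 2  # pipelined and write reissue modes


def nonblock_write_rate(sysclock_hz: float, cycles_per_write: int,
                        bytes_per_write: int = 8) -> float:
    """Peak bytes per second for back-to-back non-block writes."""
    if cycles_per_write <= 0:
        raise ValueError("cycles_per_write must be positive")
    return sysclock_hz / cycles_per_write * bytes_per_write
```

At a 100 MHz SysClock, for example, the model gives 200 Mbytes/s for R4000 compatible writes and 400 Mbytes/s for the two-cycle repeat rate.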


4.31 External Requests

The RM7065A can respond to certain requests issued by an external device. These requests take one of two forms: Write requests and Null requests. An external device executes a write request when it wishes to update one of the processor's writable resources, such as the internal interrupt register. A null request is executed when the external device wishes the processor to reassert ownership of the processor external interface. Once the external device has acquired control of the processor interface via ExtRqst*, it can execute a null request after completing an independent transaction between itself and system memory in a system where memory is connected directly to the SysAD bus. Normally this transaction would be a DMA read or write from the I/O system.

4.32 Test/Breakpoint Registers

To facilitate hardware and software debugging, the RM7065A incorporates a pair of Test/Breakpoint, or Watch registers, called Watch1 and Watch2. Each Watch register can be separately enabled to watch for a load address, a store address, or an instruction address. All address comparisons are done on physical addresses. An associated register, Watch Mask, has also been added so that either or both of the Watch registers can compare against an address range rather than a specific address. The range granularity is limited to a power of two.

When enabled, a match of either Watch register results in an exception. If the Watch is enabled for a load or store address, then the exception is the Watch exception as defined for the R4000 by Cause exception code twenty-three. If the Watch is enabled for instruction addresses, then a newly defined Instruction Watch exception is taken and the Cause code is sixteen. The Watch register which caused the exception is indicated by Cause bits 26:25 (W2 and W1 in Table 11). Table 9 summarizes a Watch operation.


Table 9 Watch Control Register

Watch1, Watch2
Bits    63      62     61      60:36   35:2   1:0
Field   Store   Load   Instr   0       Addr   0

Watch Mask
Bits    31:2   1            0
Field   Mask   Mask Watch   Mask Watch

Note that the W1 and W2 bits of the Cause register indicate which Watch register caused a particular Watch exception.
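As an illustration of the Watch1/Watch2 layout in Table 9, the following Python sketch composes a 64-bit register value. It is not taken from the device documentation: the helper name `watch_value` is hypothetical, and it assumes that the Addr field holds physical address bits 35:2 in place.

```python
STORE_BIT = 1 << 63   # watch store addresses
LOAD_BIT = 1 << 62    # watch load addresses
INSTR_BIT = 1 << 61   # watch instruction addresses
ADDR_MASK = ((1 << 36) - 1) & ~0x3  # Addr field, bits 35:2


def watch_value(paddr: int, store: bool = False, load: bool = False,
                instr: bool = False) -> int:
    """Compose a Watch1/Watch2 value for a physical address."""
    value = paddr & ADDR_MASK  # bits 1:0 and 60:36 stay zero
    if store:
        value |= STORE_BIT
    if load:
        value |= LOAD_BIT
    if instr:
        value |= INSTR_BIT
    return value
```

For example, `watch_value(0x1000, store=True)` sets bit 63 and places 0x1000 in the Addr field.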

4.33 Performance Counters

To facilitate system tuning, the RM7065A implements a performance counter using two new CP0 registers, PerfCount and PerfControl. The PerfCount register is a 32-bit writable counter which causes an interrupt when bit 31 is set. The PerfControl register is a 32-bit register containing a 5-bit field which selects one of twenty-two event types as well as a handful of bits which control the overall counting function. Note that only one event type can be counted at a time and that counting can occur for user code, kernel code, or both. The event types and control bits are listed in Table 10.


Table 10 Performance Counter Control

PerfControl Field   Description

4:0                 Event Type
                    00: Clock cycles
                    01: Total instructions issued
                    02: Floating-point instructions issued
                    03: Integer instructions issued
                    04: Load instructions issued
                    05: Store instructions issued
                    06: Dual issued pairs
                    07: Branch prefetches
                    08: External Cache Misses
                    09: Stall cycles
                    0A: Secondary cache misses
                    0B: Instruction cache misses
                    0C: Data cache misses
                    0D: Data TLB misses
                    0E: Instruction TLB misses
                    0F: Joint TLB instruction misses
                    10: Joint TLB data misses
                    11: Branches taken
                    12: Branches issued
                    13: Secondary cache writebacks
                    14: Primary cache writebacks
                    15: Dcache miss stall cycles (cycles where both cache miss tokens taken and a third address is requested)
                    16: Cache misses
                    17: FP possible exception cycles
                    18: Slip cycles due to multiplier busy
                    19: Coprocessor 0 slip cycles
                    1A: Slip cycles due to pending non-blocking loads
                    1B: Write buffer full stall cycles
                    1C: Cache instruction stall cycles
                    1D: Multiplier stall cycles
                    1E: Stall cycles due to pending non-blocking loads - stall start of exception

7:5                 Reserved (must be zero)

8                   Count in Kernel Mode
                    0: Disable
                    1: Enable

9                   Count in User Mode
                    0: Disable
                    1: Enable

10                  Count Enable
                    0: Disable
                    1: Enable

31:11               Reserved (must be zero)
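The field layout of Table 10 can be expressed as a small Python helper that composes a PerfControl value. This is a sketch for illustration; the name `perf_control` and its argument names are hypothetical and do not come from any PMC-Sierra software.

```python
KERNEL_BIT = 1 << 8   # Count in Kernel Mode
USER_BIT = 1 << 9     # Count in User Mode
ENABLE_BIT = 1 << 10  # Count Enable
MAX_EVENT = 0x1E      # highest event type listed in Table 10


def perf_control(event: int, kernel: bool = False, user: bool = False,
                 enable: bool = True) -> int:
    """Compose a PerfControl value; reserved bits 7:5 and 31:11 stay zero."""
    if not 0 <= event <= MAX_EVENT:
        raise ValueError("event type is not listed in Table 10")
    value = event  # Event Type occupies bits 4:0
    if kernel:
        value |= KERNEL_BIT
    if user:
        value |= USER_BIT
    if enable:
        value |= ENABLE_BIT
    return value
```

Counting data cache misses (event 0C) in both kernel and user mode, for instance, gives `perf_control(0x0C, kernel=True, user=True) == 0x70C`.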

The performance counter interrupt only occurs when interrupts are enabled in the Status register, IE=1, and the Interrupt Mask bit 13 (IM[13]) of the coprocessor 0 interrupt control register is set.

Since the performance counter can be set up to count clock cycles, it can be used as either a second timer, or a watchdog interrupt. A watchdog interrupt can be used as an aid in debugging system or software “hangs.” Typically the software is set up to periodically update the count so that no interrupt occurs. When a hang occurs the interrupt ultimately triggers, thereby breaking free from the hang-up.
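A watchdog period can be turned into a PerfCount preload value with simple arithmetic, since the interrupt is raised when bit 31 becomes set. The Python sketch below is an assumption-laden illustration: it assumes the counter increments once per counted clock cycle, that the clock frequency passed in is the one being counted, and the function name is hypothetical.

```python
BIT31 = 0x8000_0000  # PerfCount value at which the interrupt is raised


def watchdog_preload(clock_hz: float, timeout_s: float) -> int:
    """Return the PerfCount preload so that bit 31 sets after timeout_s."""
    cycles = round(clock_hz * timeout_s)
    if not 0 < cycles <= BIT31:
        raise ValueError("timeout does not fit below bit 31")
    return BIT31 - cycles
```

Software would rewrite this preload value more often than once per timeout period so that the interrupt fires only when the software stops running.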

4.34 Interrupt Handling

In order to provide better real time interrupt handling, the RM7065A provides an extended set of hardware interrupts, each of which can be separately prioritized and separately vectored.

In addition to the standard six external interrupt pins, the RM7065A provides four more interrupt pins for a total of ten external interrupts.

As described above, the performance counter is also a hardware interrupt source using Int[13]. Historically in the MIPS architecture, interrupt 7 (Int[7]) was used as the timer interrupt. The RM7065A provides a separate interrupt, Int[12], for this purpose, thereby releasing Int[7] for use as a pure external interrupt.

All interrupts (Int[13:0]), the Performance Counter, and the Timer, have corresponding interrupt mask bits, IM[13..0], and interrupt pending bits, IP[13..0], in the Status, Interrupt Control, and Cause registers. The bit assignments for the Interrupt Control and Cause registers are shown in Table 11 and Table 12. The Status register has not changed from the RM5200 Family and is not shown.


The IV bit in the Cause register is the global enable bit for the enhanced interrupt features. If this bit is clear then interrupt operation is compatible with the RM5200 Family.

In the Interrupt Control register, the interrupt vector spacing is controlled by the Spacing field as described below. The Interrupt Mask field (IM[15:8]) contains the interrupt mask for interrupts eight through thirteen. IM[15:14] are reserved for future use.

The Timer Enable (TE) bit is used to gate the Timer Interrupt to the Cause register. If TE is set to "0", the Timer Interrupt is not gated to IP[12]. If TE is set to "1", the Timer Interrupt is gated to IP[12].

The setting for Mode Bit 11 is used to determine if the Timer Interrupt replaces the External Interrupt (Int[5]*) as an input to IP[7] in the Cause register. If Mode Bit 11 is set to "0", Int[5]* is gated to IP[7]. If Mode Bit 11 is set to "1", the Timer Interrupt is gated to IP[7].

In order to utilize both Int[5]* and the internal Timer Interrupt, Mode Bit 11 must be set to "0" and TE must be set to "1". In this case, the Timer Interrupt will use IP[12], and Int[5]* will use IP[7]. Refer to the logic diagram in the RM7000 User Manual for more information on the interrupt signals.
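The gating rules for the Timer Interrupt can be summarized as a small truth function. The Python sketch below simply restates the text for Mode Bit 11 and TE; the function name and the returned labels are hypothetical.

```python
def interrupt_routing(mode_bit_11: int, te: int) -> dict:
    """Return which source drives IP[7] and IP[12] in the Cause register."""
    return {
        # Mode Bit 11 selects between Int[5]* and the Timer for IP[7].
        "IP[7]": "Timer" if mode_bit_11 else "Int[5]*",
        # TE gates the Timer Interrupt to IP[12].
        "IP[12]": "Timer" if te else None,
    }
```

The combination recommended above, Mode Bit 11 = 0 with TE = 1, yields Int[5]* on IP[7] and the Timer on IP[12].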

The Interrupt Control register uses IM13 to enable the Performance Counter Control. Priority of the interrupts is set via two new coprocessor 0 registers called Interrupt Priority Level Lo (IPLLO) and Interrupt Priority Level Hi (IPLHI).


Table 11 Cause Register

Bits    31   30   29,28   27   26   25   24   23..8       7   6..2   0,1
Field   BD   0    CE      0    W2   W1   IV   IP[15..0]   0   EXC    0

Table 12 Interrupt Control Register

Bits    31..16   15..8       7    6..5   4..0
Field   0        IM[15..8]   TE   0      Spacing

Table 13 IPLLO Register

Bits    31..28   27..24   23..20   19..16   15..12   11..8   7..4   3..0
Field   IPL7     IPL6     IPL5     IPL4     IPL3     IPL2    IPL1   IPL0

Table 14 IPLHI Register

Bits    31..28   27..24   23..20   19..16   15..12   11..8   7..4   3..0
Field   0        0        IPL13    IPL12    IPL11    IPL10   IPL9   IPL8

In the IPLLO and IPLHI registers, each interrupt is represented by a four-bit field, thereby allowing each interrupt to be programmed with a priority level from 0 to 15 inclusive. The priorities can be set in any manner, including having all the priorities set exactly the same. Priority 0 is the highest level and priority 15 the lowest. The format of the priority level registers is shown in Table 13 and Table 14 above. The priority level registers are located in the coprocessor 0 control register space.
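The four-bit priority fields of Table 13 and Table 14 can be updated with ordinary mask-and-shift arithmetic. The following Python sketch is illustrative only; the function name `set_priority` is hypothetical, and the register values are plain integers rather than real coprocessor 0 accesses.

```python
def set_priority(ipllo: int, iplhi: int, interrupt: int, level: int):
    """Return updated (IPLLO, IPLHI) with the given interrupt's level set."""
    if not 0 <= interrupt <= 13:
        raise ValueError("interrupt must be 0..13")
    if not 0 <= level <= 0xF:
        raise ValueError("level must fit in a four-bit field")
    if interrupt < 8:
        shift = 4 * interrupt            # IPL0..IPL7 live in IPLLO
        ipllo = (ipllo & ~(0xF << shift)) | (level << shift)
    else:
        shift = 4 * (interrupt - 8)      # IPL8..IPL13 live in IPLHI
        iplhi = (iplhi & ~(0xF << shift)) | (level << shift)
    return ipllo, iplhi
```

For example, giving interrupt 3 priority level 5 writes 5 into IPLLO bits 15..12, while interrupt 13 maps to IPLHI bits 23..20.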

In addition to programmable priority levels, the RM7065A also permits the spacing between interrupt vectors to be programmed. For example, the minimum spacing between two adjacent vectors is 0x20 while the maximum is 0x200. This programmability allows the user to either set up the vectors as jumps to the actual interrupt routines or, if interrupt latency is paramount, to include the entire interrupt routine at one vector. Table 15 illustrates the complete set of vector spacing selections along with the coding as required in the Interrupt Control register bits 4:0.

In general, the active interrupt priority, combined with the spacing setting, generates a vector offset which is then added to the interrupt base address of 0x200 to generate the interrupt exception offset. This offset is then added to the exception base to produce the final interrupt vector address.

Table 15 Interrupt Vector Spacing

ICR[4..0]   Spacing
0x0         0x000
0x1         0x020
0x2         0x040
0x4         0x080
0x8         0x100
0x10        0x200
others      reserved
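The vector address calculation can be sketched in Python using the spacing codes of Table 15. The sketch rests on an assumption that the text does not state explicitly: that the vector offset is the active priority multiplied by the selected spacing. The function name and the exception base used in the example are hypothetical values chosen for illustration.

```python
INTERRUPT_BASE = 0x200  # interrupt base address added to the vector offset

# ICR[4..0] encoding -> spacing between adjacent vectors (Table 15)
SPACING = {0x0: 0x000, 0x1: 0x020, 0x2: 0x040,
           0x4: 0x080, 0x8: 0x100, 0x10: 0x200}


def interrupt_vector(exception_base: int, icr_spacing: int,
                     priority: int) -> int:
    """Return the interrupt vector address for the active priority."""
    if icr_spacing not in SPACING:
        raise ValueError("reserved ICR[4..0] encoding")
    if not 0 <= priority <= 15:
        raise ValueError("priority must be 0..15")
    # Assumption: vector offset = priority * spacing.
    vector_offset = priority * SPACING[icr_spacing]
    return exception_base + INTERRUPT_BASE + vector_offset
```

Under that assumption, priority 3 with a spacing of 0x20 and an example base of 0x80000000 lands at 0x80000260, and a spacing code of 0x0 places every interrupt at the same vector.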

4.35 Standby Mode

The RM7065A provides a means to reduce the amount of power consumed by the internal core when the CPU is not performing any useful operations. This state is known as Standby Mode.

Executing the WAIT instruction enables interrupts and causes the processor to enter Standby Mode. If the SysAD bus is currently idle when the WAIT instruction completes the W pipe stage, the internal processor clock stops, thereby freezing the pipeline. The phase lock loop (PLL), the internal timer/counter, and the "wake up" input pins (IP[9:0]*, NMI*, ExtRqst*, Reset*, and ColdReset*) continue to operate in their normal fashion.

If the SysAD bus is not idle when the WAIT instruction completes the W pipe stage, then the WAIT is treated as a NOP until the bus operation is completed. Once the processor is in Standby, any interrupt, including the internally generated timer interrupt, causes the processor to exit Standby and resume operation where it left off. The WAIT instruction is typically inserted in the idle loop of the operating system or real time executive.

4.36 JTAG Interface

The RM7065A interface supports JTAG boundary scan in conformance with IEEE 1149.1. The JTAG interface is useful for checking the integrity of the processor’s pin connections.

4.37 Boot-Time Options

The RM7065A operating modes are initialized at power-up by the boot-time mode control interface. The serial boot-time mode control interface operates at a very low frequency (SysClock divided by 256), allowing the initialization information to be kept in a low cost EPROM or system interface ASIC.

4.38 Boot-Time Modes

The boot-time serial mode stream is defined in Table 16. Bit 0 is presented to the processor as the first bit in the stream when VccOK is de-asserted. Bit 255 is the last bit transferred.


Table 16 Boot Time Mode Stream

Mode bit   Description

0          Reserved: must be zero

4:1        Write-back data rate
           0: DDDD
           1: DDxDDx
           2: DDxxDDxx
           3: DxDxDxDx
           4: DDxxxDDxxx
           5: DDxxxxDDxxxx
           6: DxxDxxDxxDxx
           7: DDxxxxxxDDxxxxxx
           8: DxxxDxxxDxxxDxxx
           9-15: reserved

7:5        SysClock to Pclock Multiplier (Mode bit 20 = 0 / Mode bit 20 = 1)
           0: Multiply by 2/x
           1: Multiply by 3/x
           2: Multiply by 4/x
           3: Multiply by 5/2.5
           4: Multiply by 6/x
           5: Multiply by 7/3.5
           6: Multiply by 8/x
           7: Multiply by 9/4.5

8          Specifies byte ordering. Logically ORed with BigEndian input signal.
           0: Little endian
           1: Big endian

10:9       Non-Block Write Control
           00: R4000 compatible non-block writes
           01: reserved
           10: pipelined non-block writes
           11: non-block write re-issue

11         Timer Interrupt Enable/Disable
           0: External Int[5]* gated to IP[7]
           1: Internal Timer Interrupt gated to IP[7]

12         Reserved: Must be zero

14:13      Output driver strength - 100% = fastest
           00: 67% strength
           01: 50% strength
           10: 100% strength
           11: 83% strength

15         Reserved: Must be zero

17:16      System configuration identifiers - software visible in processor Config[21..20] register

19:18      Reserved: Must be zero

20         Pclock to SysClock multipliers.
           0: Integer multipliers (2,3,4,5,6,7,8,9)
           1: Half integer multipliers (2.5,3.5,4.5)

23:21      Reserved: Must be zero

24         JTLB Size.
           0: 48 dual-entry
           1: 64 dual-entry

25         On-chip secondary cache control.
           0: Disable
           1: Enable

26         Enable two outstanding reads with out-of-order return
           0: Disable
           1: Enable

255:27     Reserved: Must be zero
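The mode stream of Table 16 can be modelled as a list of 256 bits with bit 0 first. The Python sketch below is an illustration, not a programming tool for the device: the field names are hypothetical, and it assumes that within a multi-bit field the lowest-numbered mode bit is the least significant bit of the value. The multiplier helper restates the 7:5 and 20 entries of the table.

```python
# name -> (lowest mode bit, width in bits), taken from Table 16
FIELDS = {
    "wb_data_rate": (1, 4), "clock_mult": (5, 3), "big_endian": (8, 1),
    "nonblock_write": (9, 2), "timer_to_ip7": (11, 1),
    "drive_strength": (13, 2), "config_id": (16, 2), "half_mult": (20, 1),
    "jtlb_64": (24, 1), "l2_enable": (25, 1), "two_reads": (26, 1),
}

HALF_MULTIPLIERS = {3: 2.5, 5: 3.5, 7: 4.5}  # codes valid when bit 20 = 1


def build_mode_stream(**settings: int) -> list:
    """Return 256 mode bits (index 0 is sent first); reserved bits are 0."""
    bits = [0] * 256
    for name, value in settings.items():
        lsb, width = FIELDS[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        for i in range(width):
            bits[lsb + i] = (value >> i) & 1
    return bits


def pclock_multiplier(code: int, half: bool = False) -> float:
    """Decode mode bits 7:5 together with mode bit 20."""
    if not 0 <= code <= 7:
        raise ValueError("multiplier code must be 0..7")
    if not half:
        return code + 2
    if code not in HALF_MULTIPLIERS:
        raise ValueError("no half-integer multiplier for this code")
    return HALF_MULTIPLIERS[code]
```

A call such as `build_mode_stream(wb_data_rate=2, clock_mult=3, drive_strength=0b10, l2_enable=1)` selects the DDxxDDxx write-back rate, a multiply-by-5 clock, 100% drive strength, and the on-chip secondary cache.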


5 Pin Descriptions

The following is a list of control, data, clock, interrupt, and miscellaneous pins of the RM7065A.

Table 17 System Interface

Pin Name Type Description

ExtRqst* Input External request

Signals that the system interface is submitting an external request.

Release* Output Release interface

Signals that the processor is releasing the system interface to slave state.

RdRdy* Input Read Ready

Signals that an external agent can now accept a processor read.

WrRdy* Input Write Ready

Signals that an external agent can now accept a processor write request.

ValidIn* Input Valid Input

Signals that an external agent is now driving a valid address or data on the SysAD bus and a valid command or data identifier on the SysCmd bus.

ValidOut* Output Valid output

Signals that the processor is now driving a valid address or data on the SysAD bus and a valid command or data identifier on the SysCmd bus.

PRqst* Output Processor Request

When asserted this signal requests that control of the system interface be returned to the processor. This is enabled by Mode Bit 26.

PAck* Input Processor Acknowledge

When asserted, in response to PRqst*, this signal indicates to the processor that it has been granted control of the system interface.

RspSwap* Input Response Swap

RspSwap* is used by the external agent to signal the processor when it is about to return a memory reference out of order; i.e., of two outstanding memory references, the data for the second reference is being returned ahead of the data for the first reference. Note that this signal works as a toggle; i.e., for each cycle that it is held asserted the order of return is reversed. By default, anytime the processor issues a second read it is assumed that the reads will be returned in order; i.e., no action is required if the reads are indeed returned in order. This is enabled by Mode Bit 26.

RdType Output Read Type

During the address cycle of a read request, RdType indicates whether the read request is an instruction read or a data read.

SysAD(63:0) Input/Output System address/data bus

A 64-bit address and data bus for communication between the processor and an external agent.

SysADC(7:0) Input/Output System address/data check bus

An 8-bit bus containing parity check bits for the SysAD bus during data cycles.

SysCmd(8:0) Input/Output System command/data identifier bus

A 9-bit bus for command and data identifier transmission between the processor and an external agent.

SysCmdP Input/Output System Command/Data Identifier Bus Parity

For the RM7065A, unused on input and zero on output.

Table 18 Clock/Control Interface

Pin Name Type Description

SysClock Input System clock

Master clock input used as the system interface reference clock. All output timings are relative to this input clock. Pipeline operation frequency is derived by multiplying this clock up by the factor selected during boot initialization

VccP Input Vcc for PLL

Quiet VccInt for the internal phase locked loop. Must be connected to VccInt through a filter circuit.

VssP Input Vss for PLL

Quiet Vss for the internal phase locked loop. Must be connected to VssInt through a filter circuit.


Table 19 Interrupt Interface

Pin Name Type Description

IP*(9:0) Input Interrupt

Ten general processor interrupts, bit-wise ORed with bits 9:0 of the interrupt register.

NMI* Input Non-maskable interrupt

Non-maskable interrupt, ORed with bit 15 of the interrupt register (bit 6 in R5000 compatibility mode).

Table 20 JTAG Interface

Pin Name Type Description

JTDI Input JTAG data in

JTAG serial data in.

JTCK Input JTAG clock input

JTAG serial clock input.

JTDO Output JTAG data out

JTAG serial data out.

JTMS Input JTAG command

JTAG command signal, signals that the incoming serial data is command data.


Table 21 Initialization Interface

Pin Name Type Description

BigEndian Input Big Endian / Little Endian Control

Allows the system to change the processor addressing mode without rewriting the mode ROM.

VccOK Input Vcc is OK

When asserted, this signal indicates to the RM7065A that the VccInt power supply has been above the recommended value for more than 100 milliseconds and will remain stable. The assertion of VccOK initiates the reading of the boot-time mode control serial stream.

ColdReset* Input Cold Reset

This signal must be asserted for a power on reset or a cold reset. ColdReset must be de-asserted synchronously with SysClock.

Reset* Input Reset

This signal must be asserted for any reset sequence. It may be asserted synchronously or asynchronously for a cold reset, or synchronously to initiate a warm reset. Reset must be de-asserted synchronously with SysClock.

ModeClock Output Boot Mode Clock

Serial boot-mode data clock output at the system clock frequency divided by two hundred and fifty six.

ModeIn Input Boot Mode Data In

Serial boot-mode data input.
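Because ModeClock runs at SysClock divided by 256 and the mode stream is 256 bits long, the time needed to shift in the whole stream follows directly. The Python sketch below assumes one mode bit is transferred per ModeClock period; the function name is hypothetical.

```python
MODE_STREAM_BITS = 256   # bits 0 through 255 of the mode stream
MODECLOCK_DIVISOR = 256  # ModeClock = SysClock / 256


def mode_stream_seconds(sysclock_hz: float) -> float:
    """Time to transfer the full boot-time mode stream."""
    if sysclock_hz <= 0:
        raise ValueError("sysclock_hz must be positive")
    return MODE_STREAM_BITS * MODECLOCK_DIVISOR / sysclock_hz
```

Under that assumption the stream takes 65,536 SysClock cycles, which is roughly 0.66 ms at a 100 MHz SysClock.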


6 Absolute Maximum Ratings

Symbol   Rating                                 Limits         Unit
VTERM    Terminal Voltage with respect to VSS   -0.5 to +3.9   V
TCASE    Operating Temperature                  0 to +85       °C
TSTG     Storage Temperature                    -55 to +125    °C
IIN      DC Input Current                       ±20            mA
IOUT     DC Output Current                      ±20            mA

Notes

1. Stresses greater than those listed under ABSOLUTE MAXIMUM RATINGS may cause permanent damage to the device. This is a stress rating only and functional operation of the device at these or any other conditions above those indicated in the operational sections of this specification is not implied. Exposure to absolute maximum rating conditions for extended periods may affect reliability.

2. VIN minimum = -2.0 V for pulse width less than 15 ns. VIN should not exceed 3.9 Volts.

3. When VIN < 0V or VIN > VccIO

4. Not more than one output should be shorted at a time. Duration of the short should not exceed 30 seconds.


7 Recommended Operating Conditions

CPU Speed       Temperature           Vss   VccInt          VccIO            VccP
300 - 350 MHz   0°C to +85°C (Case)   0V    1.65V ± 50 mV   3.3 V ± 150 mV   1.65V ± 50 mV
                                                            2.5 V ± 200 mV

Notes

1. VccIO should not exceed VccInt by greater than 2.0 V during the power-up sequence.

2. Applying a logic high state to any I/O pin before VccInt becomes stable is not recommended.

3. As specified in IEEE 1149.1 (JTAG), the JTMS pin must be held high during reset to avoid entering JTAG test mode. Refer to the RM7065A Family Users Manual, Appendix E.

4. VccP must be connected to VccInt through a passive filter circuit. See RM7000 Family User’s Manual for recommended circuit.


8 DC Electrical Characteristics

(VccIO = 3.15V - 3.45V)

Parameter   Minimum        Maximum        Conditions
VOL         -              0.2V           |IOUT| = 100 µA
VOH         VccIO - 0.2V   -              |IOUT| = 100 µA
VOL         -              0.4V           |IOUT| = 2 mA
VOH         2.4V           -              |IOUT| = 2 mA
VIL         -0.3V          0.8V
VIH         2.0V           VccIO + 0.3V
IIN         -              ±15 µA         VIN = 0
            -              ±15 µA         VIN = VccIO

(VccIO = 2.3V - 2.7V)

Parameter   Minimum        Maximum        Conditions
VOL         -              0.2V           |IOUT| = 100 µA
VOH         2.1V           -              |IOUT| = 100 µA
VOL         -              0.4V           |IOUT| = 1 mA
VOH         2.0V           -              |IOUT| = 1 mA
VOL         -              0.7V           |IOUT| = 2 mA
VOH         1.7V           -              |IOUT| = 2 mA
VIL         -0.3V          0.7V
VIH         1.7V           VccIO + 0.3V
IIN         -              ±15 µA         VIN = 0
            -              ±15 µA         VIN = VccIO


9 Power Consumption

Parameter: VccInt Power (mWatts)

                                               CPU Speed
Conditions                                     300 MHz Max   350 MHz Max   TBD MHz Max
standby                                        TBD           TBD           TBD
active, Maximum with no FPU operation          TBD           TBD           TBD
active, Maximum worst case instruction mix     TBD           TBD           TBD

Notes

1. Worst case supply voltage (maximum VccInt) with worst case temperature (maximum TCase).

2. Dhrystone 2.1 instruction mix.

3. I/O supply power is application dependent, but typically <20% of VccInt.


10 AC Electrical Characteristics

10.1 Capacitive Load Deration

Parameter     Symbol   Min   Max   Units
Load Derate   C        -     2     ns/25pF

10.2 Clock Parameters

                                                           CPU Speed
                                                           300 MHz       350 MHz       TBD MHz
Parameter                   Symbol      Test Conditions    Min    Max    Min    Max    Min   Max   Units
SysClock High               tSCHigh     Transition ≤ 5ns   3      -      3      -                  ns
SysClock Low                tSCLow      Transition ≤ 5ns   3      -      3      -                  ns
SysClock Frequency                                         33.3   100    33.3   117                MHz
SysClock Period             tSCP                           10     30     8.5    30                 ns
Clock Jitter for SysClock   tJitterIn                      -      ±150   -      ±150               ps
SysClock Rise Time          tSCRise                        -      2      -      2                  ns
SysClock Fall Time          tSCFall                        -      2      -      2                  ns
ModeClock Period            tModeCKP                       -      256    -      256                tSCP
JTAG Clock Period           tJTAGCKP                       4      -      4      -                  tSCP

Note

1. Operation of the RM7065A is only guaranteed with the Phase Lock Loop Enabled.


10.3 System Interface Parameters

                                                                CPU Speed
                                                                300 MHz      350 MHz      TBD MHz
Parameter (1)       Test Conditions                             Min   Max    Min   Max    Min   Max    Units
Data Output (2,3)   mode14..13 = 10 (fastest) (5,6)             1.0   TBD    TBD   TBD    TBD   TBD    ns
                    mode14..13 = 01 (slowest)
Data Setup          trise = see above table                     2.5          TBD          TBD          ns
Data Hold           tfall = see above table                     1.0          TBD          TBD          ns

Notes

1. Timings are measured from 0.425 x VccIO of clock to 0.425 x VccIO of signal for 3.3 V I/O. Timings are measured from 0.48 x VccIO of clock to 0.48 x VccIO of signal for 2.5 V I/O.

2. Capacitive load for all output timings is 50 pF.

3. Data Output timing applies to all signal pins whether tristate I/O or output only.

4. Setup and Hold parameters apply to all signal pins whether tristate I/O or input only.

5. Only mode 14:13 = 10 is tested and guaranteed.

6. Data shown is for 3.3 V I/O. For 2.5 V I/O derate all times by 0.5 ns.
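The load derating of Section 10.1 (2 ns per 25 pF) and the 0.5 ns adjustment for 2.5 V I/O can be combined into a rough output delay estimate. The Python sketch below is only an approximation under stated assumptions: derating is taken to scale linearly with load, to apply only to load beyond the 50 pF test load, and the function name and the delay used in the example are hypothetical, since the data output maxima are still listed as TBD.

```python
TEST_LOAD_PF = 50.0         # capacitive load used for output timings
DERATE_NS_PER_25PF = 2.0    # Section 10.1 load derate
IO_2V5_DERATE_NS = 0.5      # derating for 2.5 V I/O


def derated_output_delay(t_do_ns: float, load_pf: float,
                         io_2v5: bool = False) -> float:
    """Estimate output delay for a given load and I/O voltage."""
    extra_load = max(0.0, load_pf - TEST_LOAD_PF)
    delay = t_do_ns + extra_load / 25.0 * DERATE_NS_PER_25PF
    if io_2v5:
        delay += IO_2V5_DERATE_NS
    return delay
```

For a hypothetical 5.0 ns output delay, a 75 pF load adds 2 ns, and operation with 2.5 V I/O adds a further 0.5 ns.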

10.4 Boot-Time Interface Parameters

Parameter         Symbol   Min   Max   Units
Mode Data Setup   tDS      4     -     SysClock cycles
Mode Data Hold    tDH      0     -     SysClock cycles

11 Timing Diagrams

11.1 Clock Timing

Figure 11 Clock Timing

(Timing diagram. SysClock waveform annotated with the High, Low, Rise, and Fall times and the ±tJitterIn input jitter.)

System Interface Timing (SysAD, SysCmd, ValidIn*, ValidOut*, etc.)

Figure 12 Input Timing

(Timing diagram. Signals shown: SysClock, Data.)

Figure 13 Output Timing

(Timing diagram. Signals shown: SysClock, Data, annotated with tDOmin and tDOmax.)

12 Packaging Information

Figure 14 Mechanical Diagram

(Mechanical drawing with top view, bottom view, side view, Detail X, Detail Y, and seating plane. All dimensions in millimeters.)

Symbol   Min.    Nom.    Max.
A        -       -       1.70
A1       0.50    0.60    0.70
D        -       27.00   -
E        -       27.00   -
I        1.435 REF.
J        1.435 REF.
M        20 <PERIMETER>
aaa      0.20
bbb      0.25
b        0.60    0.75    0.90
c        0.80    0.90    1.00
e        1.27 TYP.

Notes

1. Package Dimensions conform to JEDEC Registration MO-149(BG-2X).

2. "e" represents the basic solder ball grid pitch.

3. "M" represents the maximum solder ball matrix size.

4. Dimension "b" is measured at the maximum solder ball diameter parallel to the primary datum "c".

5. The Primary datum "c" and the seating plane are defined by the spherical crowns of the solder balls.

6. All dimensions are in millimeters.

7. Dimensioning and tolerancing per ASME Y14.5M-1994.

8. After surface mount assembly, solder ball will have 0.15 mm (TYP) collapse in "A" dimension.

9. Substrate base material is copper.

10. Package top surface color shall be black.

11. Cavity depth maximum is 0.50 mm.

13 RM7065A Pinout

Pin Function Pin Function Pin Function Pin Function

A1 VccIO A2 VSS A3 VSS A4 Do Not Connect A5 SysAD[35] A6 VSS A7 SysAD[33] A8 SysAD[32] A9 VSS A10 SysADC[1] A11 Do Not Connect A12 VSS A13 SysADC[2] A14 SysAD[62] A15 VSS A16 SysAD[60] A17 Do Not Connect A18 VSS A19 VSS A20 VccIO B1 VSS B2 VccIO B3 VSS B4 VSS B5 Do Not Connect B6 SysAD[3] B7 SysAD[2] B8 SysAD[1] B9 SysADC[5] B10 SysADC[0] B11 SysADC[3] B12 SysADC[6] B13 Do Not Connect B14 SysAD[30] B15 SysAD[29] B16 Do Not Connect B17 VSS B18 VSS B19 VccIO B20 VSS C1 VSS C2 VSS C3 VccIO C4 Do Not Connect C5 Do Not Connect C6 Do Not Connect C7 SysAD[34] C8 VccInt C9 SysAD[0] C10 SysADC[4] C11 SysADC[7] C12 VccInt C13 SysAD[31] C14 SysAD[61] C15 VccInt C16 Do Not Connect C17 Do Not Connect C18 VccIO C19 VSS C20 VSS D1 Do not Connect D2 VSS D3 Do not Connect D4 VccIO D5 VccIO D6 Do Not Connect D7 VccInt D8 VccInt D9 VccIO D10 VccI nt D11 VccInt D12 VccIO D13 SysAD[63] D14 VccInt D15 SysAD[28] D16 VccIO D17 VccIO D18 Do Not Connect D19 VSS D20 Do Not Connect E1 SysAD[5] E2 Do Not Connect E3 VccInt E4 VccIO E17 VccIO E18 Do Not Connect E19 Do Not Connect E20 SysAD[59] F1 VSS F2 SysAD[36] F3 SysAD[4] F4 VccInt F17 VccInt F18 SysAD[27] F19 SysAD[58] F20 VSS G1 SysAD[38] G2 SysAD[6] G3 SysAD[37] G4 VccInt G17 VccInt G18 SysAD[26] G19 SysAD[57] G20 SysAD[25] H1 SysAD[7] H2 SysAD[39] H3 SysAD[40] H4 SysAD[8] H17 SysAD[24] H18 SysAD[56] H19 SysAD[55] H20 SysAD[23] J1 VSS J2 SysAD[9] J3 VccInt J4 VccIO J17 VccIO J18 SysAD[54] J19 SysAD[22] J20 VSS K1 SysAD[41] K2 SysAD[10] K3 SysAD[42] K4 SysAD[11] K17 SysAD[53] K18 SysAD[21] K19 SysAD[52] K20 SysAD[20] L1 SysAD[43] L2 SysAD[44] L3 SysAD[12] L4 VccInt L17 VccInt L18 SysAD[51] L19 SysAD[19] L20 SysAD[50] M1 VSS M2 SysAD[13] M3 SysAD[45] M4 VccIO M17 VccIO M18 SysAD[18] M19 SysAD[49] M20 VSS N1 SysAD[14] N2 SysAD[46] N3 VccInt N4 SysAD[47] N17 VccInt N18 SysAD[48] N19 SysAD[16] N20 SysAD[17]

RM7065A™ Microprocessor with On-Chip Secondary Cache Data Sheet

Preliminary

Proprietary and Confidential to PMC-Sierra, Inc and for its Customer’s Internal Use 50 Document ID: PMC-2010145, Issue 2

RM7065A™ Microprocessor with On-Chip Secondary Cache Data Sheet

Preliminary

Pin Function        Pin Function        Pin Function        Pin Function

P1  SysAD[15]       P2  RSPSWAPB        P3  PACKB           P4  VccInt
P17 ColdResetB      P18 VccOK           P19 BigEndian       P20 ResetB
R1  VSS             R2  Do Not Connect  R3  JTDI            R4  JTCK
R17 VccInt          R18 ExtrQSTB        R19 NMIB            R20 VSS
T1  PRQSTB          T2  JTDO            T3  VccIO           T4
T17 VccIO           T18 VccInt          T19 Int[9]*         T20 Int[8]*
U1  ModeClock       U2  VSS             U3  JTMS            U4  VccIO
U5  VccIO           U6  ValidInB        U7  VSSP            U8  VccInt
U9  VccIO           U10 VccInt          U11 VccInt          U12 VccIO
U13 SysCmd[7]       U14 VccInt          U15 Int[3]*         U16 VccIO
U17 VccIO           U18 Int[6]*         U19 VSS             U20 Int[7]*
V1  VSS             V2  VSS             V3  VccIO           V4  RDType
V5  RDRDYB          V6  VccP            V7  Do Not Connect  V8  VccInt
V9  Do Not Connect  V10 Do Not Connect  V11 VccInt          V12 SysCmd[3]
V13 SysCmd[6]       V14 VccInt          V15 Int[2]*         V16 Int[5]*
V17 Int[4]*         V18 VccIO           V19 VSS             V20 VSS
W1  VSS             W2  VccIO           W3  VSS             W4  VSS
W5  WRRDYB          W6  ReleaseB        W7  SysClk          W8  VccInt
W9  Do Not Connect  W10 Do Not Connect  W11 SysCmd[1]       W12 SysCmd[2]
W13 SysCmd[5]       W14 SysCmdP         W15 VccInt          W16 Int[1]*
W17 VSS             W18 VSS             W19 VccIO           W20 VSS
Y1  VccIO           Y2  VSS             Y3  VSS             Y4  ModeIn
Y5  ValidOutB       Y6  VSS             Y7  VccP            Y8  Do Not Connect
Y9  VSS             Y10 Do Not Connect  Y11 SysCmd[0]       Y12 VSS
Y13 SysCmd[4]       Y14 SysCmd[8]       Y15 VSS             Y16 Do Not Connect
Y17 Int[0]*         Y18 VSS             Y19 VSS             Y20 VccIO
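
The pin designators in the table follow a regular pattern: rows A to D and U to Y list columns 1 to 20, while rows E to T list only columns 1 to 4 and 17 to 20, and the row letters I, O, Q, S and X do not appear. The sketch below enumerates that pattern so a netlist or schematic check can confirm that a designator exists. It is an illustration derived only from the listing above; the names `ROWS`, `ball_names` and `is_listed` are hypothetical and are not part of any PMC-Sierra tool.

```python
# Row letters exactly as they appear in the pin table (I, O, Q, S and X are absent).
ROWS = "ABCDEFGHJKLMNPRTUVWY"

# Rows that the table lists with all twenty columns.
FULL_ROWS = set("ABCD") | set("UVWY")

# Columns that the table lists for the remaining rows (E through T).
EDGE_COLUMNS = [1, 2, 3, 4, 17, 18, 19, 20]


def ball_names():
    """Return every designator that appears in the pin table, in table order."""
    names = []
    for row in ROWS:
        columns = range(1, 21) if row in FULL_ROWS else EDGE_COLUMNS
        names.extend(f"{row}{col}" for col in columns)
    return names


def is_listed(ball):
    """True if the designator (for example "W7") appears in the pin table."""
    return ball.upper() in set(ball_names())
```

Under these assumptions `len(ball_names())` equals the number of entries in the table, which gives a quick sanity check on a transcribed pin list.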


14 Ordering Information

Part number format: RM7065A -123 T I

RM7065A  Device Type: A = 0.18 micron process geometry

-123     Device Maximum Speed

T        Package Type: T = TBGA

I        Temperature Grade: (blank) = commercial

Valid Combinations

RM7065A-300T
RM7065A-350T
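
As an illustration of how the ordering fields fit together, the following sketch splits a part number into the fields named above and checks it against the two valid combinations. The regular expression, the function name `decode_part_number` and the dictionary keys are assumptions made for this example; only the field meanings and the valid combinations come from the ordering information.

```python
import re

# The two orderable parts listed under "Valid Combinations".
VALID_COMBINATIONS = {"RM7065A-300T", "RM7065A-350T"}

# Assumed layout: device type, three-digit speed, package letter T,
# then an optional single-letter temperature grade.
PART_RE = re.compile(r"^RM7065A-(\d{3})T([A-Z]?)$")


def decode_part_number(part):
    """Split a part number into the fields of the ordering diagram."""
    part = part.strip()
    match = PART_RE.match(part)
    if match is None:
        raise ValueError(f"unrecognized part number: {part!r}")
    speed, grade = match.groups()
    return {
        "device_type": "A = 0.18 micron process geometry",
        "max_speed": speed,
        "package": "TBGA",
        # Only the blank grade is defined in the ordering information;
        # any other letter is passed through undecoded.
        "temperature_grade": "commercial" if grade == "" else grade,
        "valid_combination": part in VALID_COMBINATIONS,
    }
```

For example, `decode_part_number("RM7065A-350T")` reports a commercial-grade TBGA part that is one of the valid combinations.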


Legal Information

Revision History

Issue 2, June 2001: Changed IP references to INT, page 34. Changed W7 pin name to SysClk.

Issue 1, April 2001: Applied PMC-Sierra template to existing MPD (QED) preliminary FrameMaker

Document Conventions

• All instruction names, such as MFHI, are in sans serif typeface.

Table of Contents

List of Figures

List of Tables

1 Features

2 Block Diagram

Figure 1 Block Diagram

3 Description

4 Hardware Overview

4.1 CPU Registers

Figure 2 CP0 Registers

4.2 Superscalar Dispatch

Table 1 Instruction Issue Rules

one of: integer, branch, floating-point,

one of: integer, load/store
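
The two "one of" lists in Table 1 can be read as the instruction classes accepted by each of two issue slots. The sketch below checks whether a pair of classes can occupy both slots at once. It is a minimal illustration that uses only the classes visible in this extract (the first list appears to be cut off) and ignores every other constraint, such as data dependencies; the names `FIRST_SLOT`, `SECOND_SLOT` and `can_issue_together` are hypothetical.

```python
# Classes accepted by each issue slot, as far as they are visible in Table 1.
FIRST_SLOT = {"integer", "branch", "floating-point"}
SECOND_SLOT = {"integer", "load/store"}


def can_issue_together(class_a, class_b):
    """True if one instruction fits the first slot and the other the second."""
    return (class_a in FIRST_SLOT and class_b in SECOND_SLOT) or (
        class_b in FIRST_SLOT and class_a in SECOND_SLOT
    )
```

Under this reading two integer instructions can pair, as can a floating-point instruction and a load/store, while two load/store instructions cannot.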

Figure 3 Instruction Issue Paradigm

Table 2 Dual Issue Instruction Classes

add, sub, or, xor, shift, etc.

lw, sw, ld, sd, ldc1, sdc1, mov, movc, fmov, etc.

fadd, fsub, fmult, fmadd, fdiv, fcmp, fsqrt, etc.

beq, bne, bCzT, bCzF, j, etc.

4.3 Pipeline

Figure 4 Pipeline

4.4 Integer Unit

Register File

4.5 ALU

Table 3 ALU Operations

Adder    add, sub          add, sub, data address

Logic    logic, moves, zero shifts

Shifter  non zero shift    non zero shift, store

4.6 Integer Multiply/Divide

Table 4 Integer Multiply/Divide Operations

MULT/U, MAD/U

DMULT, DMULTU

DIV, DIVD    any     36  36  0

DDIV,

16 bit  4  3  0
32 bit  5  4  0
16 bit  4  3  2
32 bit  5  4  3

4.7 Floating-Point Coprocessor

4.8 Floating-Point Unit

Table 5 Floating Point Latencies and Repeat Rates

4.9 Floating-Point General Register File

4.10 System Control Coprocessor (CP0)

4.11 System Control Coprocessor Registers

Figure 5 CP0 Registers

4.12 Virtual to Physical Address Mapping

• user mode

• kernel mode

• supervisor mode

Figure 6 Kernel Mode Virtual Addressing (32-bit)

0xE0000000 - 0xFFFFFFFF   Kernel virtual address space (kseg3)             Mapped, 0.5GB
0xC0000000 - 0xDFFFFFFF   Supervisor virtual address space (ksseg)         Mapped, 0.5GB
0xA0000000 - 0xBFFFFFFF   Uncached kernel physical address space (kseg1)   Unmapped, 0.5GB
0x80000000 - 0x9FFFFFFF   Cached kernel physical address space (kseg0)     Unmapped, 0.5GB
0x00000000 - 0x7FFFFFFF   User virtual address space
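
The address boundaries in Figure 6 can be turned into a small lookup that reports which segment a 32-bit kernel-mode virtual address falls in. The sketch below is an illustration built only from the boundaries and labels shown in the figure; the names `KERNEL_MODE_SEGMENTS` and `classify` are hypothetical, and the mapping attribute of the user segment is left as `None` because this extract does not state it.

```python
# (low, high, description, mapping) for each region shown in Figure 6.
KERNEL_MODE_SEGMENTS = [
    (0xE0000000, 0xFFFFFFFF, "Kernel virtual address space (kseg3)", "Mapped"),
    (0xC0000000, 0xDFFFFFFF, "Supervisor virtual address space (ksseg)", "Mapped"),
    (0xA0000000, 0xBFFFFFFF, "Uncached kernel physical address space (kseg1)", "Unmapped"),
    (0x80000000, 0x9FFFFFFF, "Cached kernel physical address space (kseg0)", "Unmapped"),
    (0x00000000, 0x7FFFFFFF, "User virtual address space", None),
]


def classify(vaddr):
    """Return (description, mapping) for a 32-bit kernel-mode virtual address."""
    if not 0 <= vaddr <= 0xFFFFFFFF:
        raise ValueError(f"not a 32-bit address: {vaddr:#x}")
    for low, high, description, mapping in KERNEL_MODE_SEGMENTS:
        if low <= vaddr <= high:
            return description, mapping
    raise AssertionError("segment table does not cover the address space")
```

Each of the four upper segments spans 0x20000000 bytes, which matches the 0.5GB size given in the figure.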

4.13 Joint TLB

4.14 Instruction TLB

4.15 Data TLB

4.16 Cache Memory

4.17 Instruction Cache

4.18 Data Cache

1. Uncached Reads to addresses in a memory area identified as uncached do not access the cache. Writes to

2. Write-back Loads and instruction fetches first search the cache, reading the next memory hierarchy level

3. Write-through with write allocate Loads and instruction fetches first search the cache, reading from memory only if the desired