Datasheet ACT-7000SC-225F17M, ACT-7000SC-210F17T, ACT-7000SC-200F17C, ACT-7000SC Datasheet (ACT)

ACT 7000SC
64-Bit Superscalar Microprocessor
Features
Dual Issue symmetric superscalar microprocessor with instruction prefetch optimized for system level price/performance
150, 200, 210, 225 MHz operating frequency
Consult Factory for latest speeds
MIPS IV Superset Instruction Set Architecture
High performance interface (RM52xx compatible)
600 MB per second peak throughput
75 MHz max. freq., multiplexed address/data
Supports 1/2 clock multipliers (2, 2.5, 3, 3.5, 4, 4.5, 5, 6, 7, 8, 9)
IEEE 1149.1 JTAG (TAP) boundary scan
Integrated primary and secondary caches - all are 4-way set associative with 32 byte line size
16KB instruction
16KB data: non-blocking and write-back or write-through
256KB on-chip secondary: unified, non-blocking, block writeback
MIPS IV instruction set
Data PREFETCH instruction allows the processor to overlap cache
miss latency and instruction execution
Floating point combined multiply-add instruction increases
performance in signal processing and graphics applications
Conditional moves reduce branch frequency
Index address modes (register + register)
Embedded supply de-coupling capacitors and additional PLL filter components
Integrated memory management unit (ACT52xx compatible)
Fully associative joint TLB (shared by I and D translations)
48 dual entries map 96 pages
4 entry DTLB and 4 entry ITLB
Variable page size (4KB to 16MB in 4x increments)
Embedded application enhancements
Specialized DSP integer Multiply-Accumulate instruction, (MAD/MADU) and three-operand multiply instruction (MUL/U)
Per line cache locking in primaries and secondary
Bypass secondary cache option
I&D Test/Break-point (Watch) registers for emulation & debug
Performance counter for system and software tuning & debug
Ten fully prioritized vectored interrupts - 6 external, 2 internal, 2 software
Fast Hit-Writeback-Invalidate and Hit-Invalidate cache operations for efficient cache management
High-performance floating point unit - 600 M FLOPS maximum
Single cycle repeat rate for common single-precision operations and some double-precision operations
Single cycle repeat rate for single-precision combined multiply-add operations
Two cycle repeat rate for double-precision multiply and double-precision combined multiply-add operations
Fully static CMOS design with dynamic power down logic
Standby reduced power mode with WAIT instruction
4 watts typical @ 2.5V Int., 3.3V I/O, 200MHz
208-lead CQFP, cavity-up package (F17)
208-lead CQFP, inverted footprint (F24), with the same pin rotation as the commercial QED RM5261
BLOCK DIAGRAM
[Block diagram: primary instruction cache and primary data cache (each 4-way set associative) with ITag, DTag, ITLB and DTLB; on-chip 256K byte secondary cache (4-way set associative) with secondary tags for sets A through D; prefetch buffer and instruction dispatch unit; F pipe and M pipe integer datapaths (integer register file, adders, logicals, shifters, load aligner, integer multiply/divide/Madd); floating-point unit (register file, load/align, packer/unpacker, comparator, MultAdd/Add/Sub, Cvt/Div/Sqrt, multiplier array); joint TLB and coprocessor 0; program counter, PC incrementer and branch PC adder; store, write, read, address and pad buffers; PLL/clocks; and the system/memory, integer and floating-point control blocks.]
Aeroflex Circuit Technology – MIPS RISC Microprocessors © SCD7000SC REV B 7/30/01
DESCRIPTION
The ACT 7000SC is a highly integrated symmetric superscalar microprocessor capable of issuing two instructions each processor cycle. It has two high performance 64-bit integer units as well as a high throughput, fully pipelined 64-bit floating point unit. To keep its multiple execution units running efficiently, the ACT 7000SC integrates not only 16KB 4-way set associative instruction and data caches but also backs them up with an integrated 256KB 4-way set associative secondary cache. For maximum efficiency, the data and secondary caches are write-back and non-blocking. An RM52XX family compatible, operating system friendly memory management unit with a 64/48-entry fully associative TLB and a high-performance 64-bit system interface supporting hardware prioritized and vectored interrupts round out the main features of the processor.
The ACT 7000SC is ideally suited for high-end embedded control applications such as internetworking, high performance image manipulation, high speed printing, and 3-D visualization.
HARDWARE OVERVIEW
The ACT 7000SC offers a high-level of integration targeted at high-performance embedded applications. The key elements of the ACT 7000SC are briefly described below.
CPU Registers
Like all MIPS ISA processors, the ACT 7000SC CPU has a simple, clean user-visible state consisting of 32 general purpose registers, or GPRs, two special purpose registers for integer multiplication and division, and a program counter; there are no condition code bits. Figure 1 shows the user-visible state.
Superscalar Dispatch
The ACT 7000SC has an efficient symmetric superscalar dispatch unit which allows it to issue up to two instructions per cycle. For purposes of instruction issue, the ACT 7000SC defines four classes of instructions: integer, load/store, branches, and floating-point. There are two logical pipelines, the function, or F, pipeline and the memory, or M, pipeline. Note however that the M pipe can execute integer as well as memory type instructions.
Table 1 – Instruction Issue Rules

F Pipe                              M Pipe
one of:                             one of:
integer, branch, floating-point,    integer, load/store
integer mul, div

Figure 2 is a simplification of the pipeline section and illustrates the basics of the instruction issue mechanism.
[Figure: the thirty-two 64-bit general purpose registers (bits 63 to 0, ending with r29, r30 and r31), the 64-bit multiply/divide registers HI and LO, and the 64-bit program counter PC.]
Figure 1 – CPU Registers
Aeroflex Circuit Technology SCD7000SC REV B 7/30/01 Plainview NY (516) 694-6700
[Figure: the instruction cache feeds the dispatch unit, which drives the F pipe IBus and the M pipe IBus; these in turn feed the FP F pipe, FP M pipe, integer F pipe and integer M pipe.]
Figure 2 – Instruction Issue Paradigm
The figure illustrates that one F pipe instruction and one M pipe instruction can be issued concurrently but that two M pipe or two F pipe instructions cannot be issued. Table 2 specifies more completely the instructions within each class.

Table 2 – Dual Issue Instruction Classes

integer               load/store         floating-point        branch
add, sub, or, xor,    lw, sw, ld, sd,    fadd, fsub, fmult,    beq, bne,
shift, etc.           ldc1, sdc1,        fmadd, fdiv, fcmp,    bCzT, bCzF, j,
                      mov, movc,         fsqrt, etc.           etc.
                      fmov, etc.
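The issue rules of Table 1 can be sketched as a small lookup: a pair of instructions can issue together only if one can take the F pipe while the other takes the M pipe. This is a minimal sketch that looks at instruction classes only; it ignores the dependencies and resource conflicts that also limit dual issue. The class label "muldiv" for integer multiply/divide and the function name are illustrative assumptions, not names from this datasheet.

```python
# Pipes each instruction class may issue to, following Table 1.
# "muldiv" is a hypothetical label for integer multiply/divide.
ALLOWED_PIPES = {
    "integer": {"F", "M"},
    "branch": {"F"},
    "floating-point": {"F"},
    "muldiv": {"F"},
    "load/store": {"M"},
}


def can_dual_issue(first, second):
    """True if one instruction can use the F pipe while the other uses the M pipe."""
    a = ALLOWED_PIPES[first]
    b = ALLOWED_PIPES[second]
    return ("F" in a and "M" in b) or ("M" in a and "F" in b)


if __name__ == "__main__":
    for pair in [("integer", "integer"),
                 ("floating-point", "load/store"),
                 ("load/store", "load/store"),
                 ("branch", "floating-point")]:
        print(pair, can_dual_issue(*pair))
```

Two integer instructions can pair because the M pipe also executes integer instructions, whereas two load/store instructions or two floating-point instructions compete for the same pipe.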
The symmetric superscalar capability of the ACT 7000SC, in combination with its low latency integer execution units and high-throughput fully pipelined floating-point execution unit, provides unparalleled price/performance in computationally intensive embedded applications.
Pipeline
The logical length of both the F and M pipelines is five stages with state committing in the register write, or W, pipe stage. The physical length of the floating-point execution pipeline is actually seven stages but this is completely transparent to the user.
Figure 3 shows instruction execution within the ACT 7000SC when instructions are issuing simultaneously down both pipelines. As illustrated in the figure, up to ten instructions can be executing simultaneously. This figure presents a somewhat simplistic view of the processor's operation, however, since the out-of-order completion of loads, stores, and long latency floating-point operations can result in there being even more instructions in process than what is shown.
Note that instruction dependencies, resource conflicts, and branches result in some of the instruction slots being occupied by NOPs.

[Figure: instructions I0 through I9, each passing through the phases 1I 2I 1R 2R 1A 2A 1D 2D 1W 2W; the figure marks the length of one cycle.]
1I-1R:  Instruction cache access
2I:     Instruction virtual to physical address translation
2R:     Register file read, bypass calculation, instruction decode, branch address calculation
1A:     Issue or slip decision, branch decision
1A:     Data virtual address calculation
1A-2A:  Integer add, logical, shift
2A:     Store align
2A-2D:  Data cache access and load align
1D:     Data virtual to physical address translation
2W:     Register file write
Figure 3 – Pipeline

Integer Unit
Like the ACT 52xx family, the ACT 7000SC implements the MIPS IV Instruction Set Architecture, and is therefore fully upward compatible with applications that run on processors such as the R4650 and R4700 that implement the earlier generation MIPS III Instruction Set Architecture. Additionally, the ACT 7000SC includes two implementation specific instructions not found in the baseline MIPS IV ISA, but that are useful in the embedded market place. Described in detail in a later section of this datasheet, these instructions are integer multiply-accumulate and three-operand integer multiply.
The ACT 7000SC integer unit includes thirty-two general purpose 64-bit registers, the HI/LO result registers for the two-operand integer multiply/divide operations, and the program counter, or PC. There are two separate execution units, one of which can execute function, or F, type instructions and one which can execute memory, or M, type instructions. See above for a description of the instruction types and the issue rules. As a special case, integer multiply/divide instructions as well as their corresponding MFHI and MFLO instructions can only be executed in the F type execution unit. Within each execution unit the operational characteristics are the same as on previous QED designs with single cycle ALU operations (add, sub, logical, shift), one cycle load delay, and an autonomous multiply/divide unit.
Register File
The ACT 7000SC has thirty-two general purpose registers with register location 0 (r0) hard wired to a zero value. These registers are used for scalar integer operations and address calculation. In order to service the two integer execution units, the register file has four read ports and two write ports and is fully bypassed both within and between the two execution units to minimize operation latency in the pipeline.
ALU
The ACT 7000SC has two complete integer ALUs, each consisting of an integer adder/subtractor, a logic unit, and a shifter. Table 3 shows the functions performed by the ALUs for each execution unit. Each of these units is optimized to perform all operations in a single processor cycle.

Table 3 – ALU Operations

Unit      F Pipe                             M Pipe
Adder     add, sub                           add, sub, data address add
Logic     logic, moves, zero shifts (nop)    logic, moves, zero shifts (nop)
Shifter   non zero shift                     non zero shift, store align

Integer Multiply/Divide
The ACT 7000SC has a single dedicated integer multiply/divide unit optimized for high-speed multiply and multiply-accumulate operations. The multiply/divide unit resides in the F type execution unit. Table 4 shows the performance of the multiply/divide unit on each operation.

Table 4 – Integer Multiply/Divide Operations

Opcode           Operand Size   Latency   Repeat Rate   Stall Cycles
MULT/U, MAD/U    16 bit         4         3             0
                 32 bit         5         4             0
MUL              16 bit         4         3             2
                 32 bit         5         4             3
DMULT, DMULTU    any            9         8             0
DIV, DIVU        any            36        36            0
DDIV, DDIVU      any            68        68            0

The baseline MIPS IV ISA specifies that the results of a multiply or divide operation be placed in the Hi and Lo registers. These values can then be transferred to the general purpose register file using the Move-from-Hi and Move-from-Lo (MFHI/MFLO) instructions.
In addition to the baseline MIPS IV integer multiply instructions, the ACT 7000SC also implements the 3-operand multiply instruction, MUL. This instruction specifies that the multiply result go directly to the integer register file rather than the Lo register. The portion of the multiply that would have normally gone into the Hi register is discarded. For applications where it is known that the upper half of the multiply result is not required, using the MUL instruction eliminates the necessity of executing an explicit MFLO instruction.
Also included in the ACT 7000SC are the multiply-add instructions MAD/MADU. These instructions multiply two operands and add the resulting product to the current contents of the Hi and Lo registers. The multiply-accumulate operation is the core primitive of almost all signal processing algorithms, allowing the ACT 7000SC to eliminate the need for a separate DSP engine in many embedded applications.
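The effect of MAD and MUL described above can be illustrated with a small behavioral model. This is a sketch under stated assumptions rather than a definition of the instructions: it assumes signed 32-bit source operands and treats Hi and Lo as the upper and lower 32 bits of a 64-bit accumulator, which the text above does not spell out. The helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed32(x):
    """Interpret the low 32 bits of x as a two's complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mad(hi, lo, rs, rt):
    """Assumed MAD model: (Hi, Lo) <- (Hi, Lo) + rs * rt with a 64-bit accumulator."""
    acc = ((hi & MASK32) << 32) | (lo & MASK32)
    acc = (acc + to_signed32(rs) * to_signed32(rt)) & MASK64
    return (acc >> 32) & MASK32, acc & MASK32


def mul(rs, rt):
    """Assumed MUL model: keep the low half of the product, discard the upper half."""
    return (to_signed32(rs) * to_signed32(rt)) & MASK32


def dot_product(xs, ys):
    """Accumulate a dot product in Hi/Lo, one MAD per element pair."""
    hi = lo = 0
    for x, y in zip(xs, ys):
        hi, lo = mad(hi, lo, x, y)
    return hi, lo


if __name__ == "__main__":
    print(dot_product([1, 2, 3], [4, 5, 6]))
    print(hex(mul(0x10000, 0x10000)))
```

The dot product loop is the signal processing pattern the text refers to: the running sum stays in Hi/Lo, so no move instruction is needed until the final result is read out.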
By pipelining the multiply-accumulate function and dynamically determining the size of the input operands, the ACT 7000SC is able to maximize throughput while still using an area efficient implementation.
Floating-Point Coprocessor
The ACT 7000SC incorporates a high-performance fully pipelined floating-point coprocessor which includes a floating-point register file and autonomous execution units for multiply/add/convert and divide/square root. The floating-point coprocessor is a tightly coupled co-execution unit, decoding and executing instructions in parallel with, and in the case of floating-point loads and stores, in cooperation with the M pipe of the integer unit. As described earlier, the superscalar capabilities of the ACT 7000SC allow floating-point computation instructions to issue concurrently with integer instructions.
Floating-Point Unit
The ACT 7000SC floating-point execution unit supports single and double precision arithmetic, as specified in the IEEE Standard 754. The execution unit is broken into a separate divide/square root unit and a pipelined multiply/add unit. Overlap of divide/square root and multiply/add is supported.
The ACT 7000SC maintains fully precise floating-point exceptions while allowing both overlapped and pipelined operations. Precise exceptions are extremely important in object-oriented programming environments and highly desirable for debugging in any environment.
The floating-point unit's operation set includes floating-point add, subtract, multiply, multiply-add, divide, square root, reciprocal, reciprocal square root, conditional moves, conversion between fixed-point and floating-point format, conversion between floating-point formats, and floating-point compare. Table 5 gives the latencies of the floating-point instructions in internal processor cycles.

Table 5 – Floating Point Latencies and Repeat Rates

Operation      Latency           Repeat Rate
               single/double     single/double
fadd           4                 1
fsub           4                 1
fmult          4/5               1/2
fmadd          4/5               1/2
fmsub          4/5               1/2
fdiv           21/36             19/34
fsqrt          21/36             19/34
frecip         21/36             19/34
frsqrt         38/68             36/66
fcvt.s.d       4                 1
fcvt.s.w       6                 3
fcvt.s.l       6                 3
fcvt.d.s       4                 1
fcvt.d.w       4                 1
fcvt.d.l       4                 1
fcvt.w.s       4                 1
fcvt.w.d       4                 1
fcvt.l.s       4                 1
fcvt.l.d       4                 1
fcmp           1                 1
fmov, fmovc    1                 1
fabs, fneg     1                 1
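To show how latency and repeat rate combine, the sketch below estimates cycle counts for a run of identical floating-point operations using a few Table 5 entries. It is an idealized model under assumptions that are not stated in this datasheet: the repeat rate is taken as the number of cycles between starting successive operations of the same kind, a dependent operation is assumed to wait for the full latency of its predecessor, and all other stalls and issue restrictions are ignored. The function names are hypothetical.

```python
# (latency, repeat rate) in processor cycles, copied from Table 5.
FP_TIMING = {
    ("fadd", "single"): (4, 1),
    ("fmadd", "single"): (4, 1),
    ("fmadd", "double"): (5, 2),
    ("fdiv", "single"): (21, 19),
    ("fdiv", "double"): (36, 34),
}


def cycles_independent(op, precision, n):
    """n >= 1 independent operations: the pipeline overlaps them at the repeat rate."""
    latency, repeat = FP_TIMING[(op, precision)]
    return latency + (n - 1) * repeat


def cycles_dependent(op, precision, n):
    """n >= 1 operations where each consumes the previous result (assumed model)."""
    latency, _ = FP_TIMING[(op, precision)]
    return latency * n


if __name__ == "__main__":
    print(cycles_independent("fmadd", "single", 100))
    print(cycles_dependent("fmadd", "single", 100))
    print(cycles_independent("fmadd", "double", 100))
```

Under these assumptions, 100 independent single-precision multiply-adds take about a quarter of the cycles needed by 100 chained ones, which is why unrolling a reduction into several independent accumulators tends to pay off on a pipelined unit.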
Floating-Point General Register File
The floating-point general register file, FGR, is made up of thirty-two 64-bit registers. With the floating-point load and store double instructions, LDC1 and SDC1, the floating-point unit can take advantage of the 64-bit wide data cache and issue a floating-point coprocessor load or store double-word instruction in every cycle.
The floating-point control register file contains two registers; one for determining configuration and revision information for the coprocessor and one for control and status information. These registers are primarily used for diagnostic software, exception handling, state saving and restoring, and control of rounding modes.
To support superscalar operations, the FGR has four read ports and two write ports, and is fully bypassed to minimize operation latency in the pipeline. Three of the read ports and one write port are used to support the combined multiply-add instruction while the fourth read and second write port allows a concurrent floating-point load or store and conditional moves.
System Control Coprocessor (CP0)
The system control coprocessor (CP0) in the MIPS architecture is responsible for the virtual memory sub-system, the exception control system, and the diagnostics capability of the processor. In the MIPS architecture, the system control coprocessor (and thus the kernel software) is implementation dependent. For memory management, the ACT 7000SC CP0 is logically identical to that of the RM5200 Family and R5000. For interrupt exceptions and diagnostics, the ACT 7000SC is a superset of the RM5200 Family and R5000 implementing additional features described later in the sections on Interrupts, the Test/Breakpoint facility, and the Performance Counter facility.
The memory management unit controls the virtual memory system page mapping. It consists of an instruction address translation buffer, or ITLB, a data address translation buffer, or DTLB, a Joint TLB, or JTLB, and coprocessor registers used by the virtual memory mapping sub-system.
System Control Coprocessor Registers
The ACT 7000SC incorporates all system control coprocessor (CP0) registers internally. These registers provide the path through which the virtual memory system’s page mapping is examined and modified, exceptions are handled, and operating modes are controlled (kernel vs. user mode, interrupts enabled or disabled, cache features). In addition, the ACT 7000SC includes registers to implement a real-time cycle counting facility, to aid in cache and system diagnostics, and to assist in data error detection.
To support the non-blocking caches and enhanced interrupt handling capabilities of the ACT 7000SC, both the data and control register spaces of CP0 are supported by the ACT 7000SC. In the data register space, that is the space accessed using the MFC0 and MTC0 instructions, the ACT 7000SC supports the same registers as found in the RM5200, R4000 and R5000 families. In the control space, that is the space accessed by the previously unused CTC0 and CFC0 instructions, the ACT 7000SC supports five new registers. The first three of these new 32-bit registers support the enhanced interrupt handling capabilities and are the Interrupt Control, Interrupt Priority Level Lo (IPLLO), and Interrupt Priority Level Hi (IPLHI) registers. These registers are described further in the section on interrupt handling. The other two registers, Imprecise Error 1 and Imprecise Error 2, have been added to help diagnose bus errors which occur on non-blocking memory references.
Figure 4 shows the CP0 registers.
Virtual to Physical Address Mapping
The ACT 7000SC provides three modes of virtual addressing:
• user mode
• supervisor mode
• kernel mode
This mechanism is available to system software to provide a secure environment for user processes. Bits in the CP0 Status register determine which virtual addressing mode is used. In the user mode, the ACT 7000SC provides a single, uniform virtual address space of 256GB (2GB in 32-bit mode).
When operating in the kernel mode, four distinct virtual address spaces, totalling 1024GB (4GB in 32-bit mode), are simultaneously available and are differentiated by the high-order bits of the virtual address.
The ACT 7000SC processor also supports a supervisor mode in which the virtual address space is 256.5GB (2.5GB in 32-bit mode), divided into three regions based on the high-order bits of the virtual address. Figure 5 shows the address space layout for 32-bit operation.
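The 32-bit kernel mode layout of Figure 5 can be captured as a small lookup table that classifies a virtual address by segment. This is a minimal sketch: the segment boundaries and the mapped/unmapped attributes are taken from Figure 5, while the table and function names are illustrative assumptions.

```python
# (first address, last address, segment name, mapped through the TLB?)
SEGMENTS_32 = [
    (0x00000000, 0x7FFFFFFF, "kuseg", True),
    (0x80000000, 0x9FFFFFFF, "kseg0", False),
    (0xA0000000, 0xBFFFFFFF, "kseg1", False),
    (0xC0000000, 0xDFFFFFFF, "ksseg", True),
    (0xE0000000, 0xFFFFFFFF, "kseg3", True),
]


def classify(vaddr):
    """Return (segment name, mapped flag) for a 32-bit virtual address."""
    for first, last, name, mapped in SEGMENTS_32:
        if first <= vaddr <= last:
            return name, mapped
    raise ValueError("address is outside the 32-bit range")


def total_bytes():
    """Sum of all segment sizes; should equal the full 4GB kernel mode space."""
    return sum(last - first + 1 for first, last, _, _ in SEGMENTS_32)


if __name__ == "__main__":
    print(classify(0x80001000))
    print(classify(0xA0000000))
    print(total_bytes() == 4 * 1024**3)
```

The segment sizes add up to the 4GB quoted for kernel mode in 32-bit operation, and kuseg plus ksseg give the 2.5GB quoted for supervisor mode.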
[Figure: the CP0 registers and the TLB (entries 0 through 47, with a marked range of entries protected from TLBWR). The figure groups the registers into those used for memory management and those used for exception processing, and shows the control space registers separately. Register numbers are listed below.]

Register          Register number
Index             0
Random            1
EntryLo0          2
EntryLo1          3
Context           4
PageMask          5
Wired             6
Info              7
BadVAddr          8
Count             9
EntryHi           10
Compare           11
Status            12
Cause             13
EPC               14
PRId              15
Config            16
LLAddr            17
Watch1            18
Watch2            19
XContext          20
Perf Ctr Cntrl    22
Watch Mask        24
Perf Counter      25
ECC               26
CacheErr          27
TagLo             28
TagHi             29
ErrorEPC          30

Control Space Registers
IPLLO             18
IPLHI             19
IntControl        20
Imp Error 1       26
Imp Error 2       27

Figure 4 – CP0 Registers
Address range              Segment   Description                               Attributes
0xFFFFFFFF - 0xE0000000    kseg3     Kernel virtual address space              Mapped, 0.5GB
0xDFFFFFFF - 0xC0000000    ksseg     Supervisor virtual address space          Mapped, 0.5GB
0xBFFFFFFF - 0xA0000000    kseg1     Uncached kernel physical address space    Unmapped, 0.5GB
0x9FFFFFFF - 0x80000000    kseg0     Cached kernel physical address space      Unmapped, 0.5GB
0x7FFFFFFF - 0x00000000    kuseg     User virtual address space                Mapped, 2.0GB
Figure 5 – Kernel Mode Virtual Addressing (32-bit mode)

When the ACT 7000SC is configured for 64-bit addressing, the virtual address space layout is an upward compatible extension of the 32-bit virtual address space layout.
Joint TLB
For fast virtual-to-physical address translation, the ACT 7000SC uses a large, fully associative TLB that maps virtual pages to their corresponding physical addresses. As indicated by its name, the joint TLB (JTLB) is used for both instruction and data translations. The JTLB is organized as pairs of even/odd entries, and maps a virtual address and address space identifier into the large, 64GB physical address space. By default, the JTLB is configured as 48 pairs of even/odd entries. The 64 even/odd entry optional configuration is set at boot time.
Two mechanisms are provided to assist in controlling the amount of mapped space, and the replacement characteristics of various memory regions. First, the page size can be configured, on a per-entry basis, to use page sizes in the range of 4KB to 16MB (in 4X multiples). A CP0 register, PageMask, is loaded with the desired page size of a mapping, and that size is stored into the TLB along with the virtual address when a new entry is written. Thus, operating systems can create special purpose maps; for example, a typical frame buffer can be memory mapped using only one TLB entry.
The second mechanism controls the replacement algorithm when a TLB miss occurs. The ACT 7000SC provides a random replacement algorithm to select a TLB entry to be written with a new mapping; however, the processor also provides a mechanism whereby a system specific number of mappings can be locked into the TLB, thereby avoiding random replacement. This mechanism allows the operating system to guarantee that certain pages are always mapped for performance reasons and for deadlock avoidance. This mechanism also facilitates the design of real-time systems by allowing deterministic access to critical software.
The JTLB also contains information that controls the cache coherency protocol for each page. Specifically, each page has attribute bits to determine whether the coherency algorithm is: uncached, write-back, write-through with write-allocate, write-through without write-allocate, or write-back with secondary bypass. Note that both of the write-through protocols bypass the secondary cache since it does not support writes of less than a complete cache line.
These protocols are used for both code and data on the ACT 7000SC with data using write-back or write-through depending on the application. The write-through modes support the same efficient frame buffer handling as the RM5200 Family, R4700 and R5000.
Instruction TLB
The ACT 7000SC uses a 4-entry instruction TLB (ITLB) to minimize contention for the JTLB, to eliminate the critical path of translating through a large associative array, and to save power. Each ITLB entry maps a 4KB page. The ITLB improves performance by allowing instruction address translation to occur in parallel with data address translation. When a miss occurs on an instruction address translation by the ITLB, the least-recently used ITLB entry is filled from the JTLB. The operation of the ITLB is completely transparent to the user.
Data TLB
The ACT 7000SC uses a 4-entry data TLB (DTLB) for the same reasons cited above for the ITLB. Each DTLB entry maps a 4KB page. The DTLB improves performance by allowing data address translation to occur in parallel with instruction address translation. When a miss occurs on a data address translation by the DTLB, the DTLB is filled from the JTLB. The DTLB refill is pseudo-LRU: the least recently used entry of the least recently used pair of entries is filled. The operation of the DTLB is completely transparent to the user.
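The TLB parameters given above (page sizes from 4KB to 16MB in 4X multiples, JTLB entries organized as even/odd pairs, and 4-entry ITLB and DTLB with 4KB pages) determine how much address space can be mapped at once. The sketch below works through that arithmetic; the function names are illustrative assumptions, and the reach figures are simply pages multiplied by page size.

```python
KB = 1024
MB = 1024 * KB

# Page sizes from 4KB to 16MB in 4X multiples.
PAGE_SIZES = [4 * KB * 4**i for i in range(7)]


def jtlb_reach(pairs, page_size):
    """Bytes mapped when every even/odd pair maps two pages of the same size."""
    return pairs * 2 * page_size


def micro_tlb_reach(entries=4, page_size=4 * KB):
    """Bytes mapped by the 4-entry ITLB or DTLB, each entry mapping a 4KB page."""
    return entries * page_size


if __name__ == "__main__":
    print([size // KB for size in PAGE_SIZES])   # page sizes in KB
    print(jtlb_reach(48, 4 * KB) // KB)          # default JTLB, smallest pages, in KB
    print(jtlb_reach(48, 16 * MB) // MB)         # default JTLB, largest pages, in MB
    print(micro_tlb_reach() // KB)               # ITLB or DTLB reach in KB
```

With the default 48 pairs, the JTLB maps 96 pages, so its reach ranges from 384KB with 4KB pages up to 1536MB with 16MB pages, which is why a large page lets a frame buffer fit in a single entry.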
Cache Memory
In order to keep the ACT 7000SC's superscalar pipeline full and operating efficiently, the ACT 7000SC has integrated primary instruction and data caches with single cycle access as well as a large unified secondary cache with a three cycle miss penalty from the primaries. Each primary cache has a 64-bit read path, a 128-bit write path, and both caches can be accessed simultaneously. The primary caches provide the integer and floating-point units with an aggregate bandwidth of 3.6 GB per second at an internal clock frequency of 225 MHz. During an instruction or data primary cache refill, the secondary cache can provide a 64-bit datum every cycle following the initial three cycle latency for a peak bandwidth of 2.4 GB per second.
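The bandwidth figures above follow from bytes moved per cycle multiplied by clock frequency. The sketch below reproduces that arithmetic, assuming decimal units (1 GB = 1000 MB) and one transfer per clock; the function name is hypothetical. Note that with a 64-bit (8-byte) path the formula gives 2400 MB per second at a 300 MHz internal clock and 1800 MB per second at 225 MHz.

```python
def peak_mb_per_s(bytes_per_cycle, clock_mhz):
    """Peak bandwidth in decimal MB/s: bytes per cycle times millions of cycles per second."""
    return bytes_per_cycle * clock_mhz


if __name__ == "__main__":
    # Two primary caches, each with a 64-bit (8-byte) read path, at 225 MHz.
    print(peak_mb_per_s(2 * 8, 225))
    # Secondary cache supplying one 64-bit datum per cycle.
    print(peak_mb_per_s(8, 225))
    print(peak_mb_per_s(8, 300))
    # 64-bit system interface at its 75 MHz maximum frequency.
    print(peak_mb_per_s(8, 75))
```

The first and last results match the 3.6 GB per second primary cache figure and the 600 MB per second system interface figure quoted in this datasheet.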
Instruction Cache
The ACT 7000SC has an integrated 16KB, four-way set associative instruction cache and, even though instruction address translation is done in parallel with the cache access, the combination of 4-way set associativity and 16KB size results in a cache which is physically indexed and physically tagged. Since the effective physical index eliminates the potential for virtual aliases in the cache, it is possible that some operating system code can be simplified as compared with the RM5200 Family, R5000 and R4000 class processors.
The data array portion of the instruction cache is 64 bits wide and protected by word parity while the tag array holds a 24-bit physical address, 14 housekeeping bits, a valid bit, and a single bit of parity protection.
By accessing 64 bits per cycle, the instruction cache is able to supply two instructions per cycle to the superscalar dispatch unit. For signal processing, graphics, and other numerical code sequences where a floating-point load or store and a floating-point computation instruction are being issued together in a loop, the entire bandwidth available from the instruction cache will be consumed by instruction issue. For typical integer code mixes, where instruction dependencies and other resource constraints restrict the achievable parallelism, the extra instruction cache bandwidth is used to fetch both the taken and non-taken branch paths to minimize the overall penalty for branches. A 32-byte (eight instruction) line size is used to maximize the communication efficiency between the instruction cache and the secondary cache, or memory system.
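The reason a 16KB, 4-way cache with 32-byte lines behaves as physically indexed can be checked with a little arithmetic: each way spans 4KB, the smallest page size, so the index and line offset bits all lie inside the page offset, which address translation does not change. The sketch below computes the geometry; it assumes a 36-bit physical address (from the 64GB physical address space), and the function name is illustrative.

```python
def cache_geometry(size_bytes, ways, line_bytes, phys_addr_bits=36):
    """Derive sets, index/offset widths and tag width for a set associative cache."""
    way_bytes = size_bytes // ways
    sets = way_bytes // line_bytes
    offset_bits = line_bytes.bit_length() - 1   # log2 of the line size
    index_bits = sets.bit_length() - 1          # log2 of the number of sets
    return {
        "way_bytes": way_bytes,
        "sets": sets,
        "offset_bits": offset_bits,
        "index_bits": index_bits,
        "tag_bits": phys_addr_bits - index_bits - offset_bits,
    }


if __name__ == "__main__":
    primary = cache_geometry(16 * 1024, 4, 32)
    secondary = cache_geometry(256 * 1024, 4, 32)
    print(primary)
    print(secondary)
    # Index plus offset bits of a primary cache equal the 12-bit offset of a 4KB page.
    print(primary["index_bits"] + primary["offset_bits"] == 12)
```

The resulting tag widths, 24 bits for a primary cache and 20 bits for the secondary, agree with the physical address fields this datasheet lists for the tag arrays.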
The ACT 7000SC is the first MIPS RISC microprocessor to support cache locking on a per line basis. The contents of each line of the cache can be locked by setting a bit in the Tag. Locking the line prevents its contents from being overwritten by a subsequent cache miss. Refill will occur only into unlocked cache lines. This mechanism allows the programmer to lock critical code into the cache thereby guaranteeing deterministic behavior for the locked code sequence.
Data Cache
The ACT 7000SC has an integrated 16KB, four-way set associative data cache, and even though data address translation is done in parallel with the cache access, the combination of 4-way set associativity and 16KB size results in a cache which is physically indexed and physically tagged. Since the effective physical index eliminates the potential for virtual aliases in the cache, it is possible that some operating system code can be simplified compared to the RM5200 Family, R5000 and R4000 class processors.
The data cache is non-blocking; that is, a miss in the data cache will not necessarily stall the processor pipeline. As long as no instruction is encountered which is dependent on the data reference which caused the miss, the pipeline will continue to advance. Once there are two cache misses outstanding, the processor will stall if it encounters another load or store instruction. A 32-byte (eight word) line size is used to maximize the communication efficiency between the data cache and the secondary cache or memory system.
The data array portion of the data cache is 64 bits wide and protected by byte parity while the tag array holds a 24-bit physical address, 3 housekeeping bits, a two bit cache state field, and has two bits of parity protection.
The normal write policy is write-back, which means that a store to a cache line does not immediately cause memory to be updated. This increases system performance by reducing bus traffic and eliminating the bottleneck of waiting for each store operation to finish before issuing a subsequent memory operation. Software can, however, select write-through on a per-page basis when appropriate, such as for frame buffers. Cache protocols supported for the data cache are:
1. Uncached. Reads to addresses in a memory area identified as uncached will not access the cache. Writes to such addresses will be written directly to main memory without updating the cache.
2. Write-back. Loads and instruction fetches will first search the cache, reading the next memory hierarchy level only if the desired data is not cache resident. On data store operations, the cache is first searched to determine if the target address is cache resident. If it is resident, the cache contents will be updated, and the cache line marked for later write-back. If the cache lookup misses, the target line is first brought into the cache and then the write is performed as above.
3. Write-through with write allocate. Loads and instruction fetches will first search the cache, reading from memory only if the desired data is not cache resident; write-through data is never cached in the secondary cache. On data store operations, the cache is first searched to determine if the target address is cache resident. If it is resident, the primary cache contents will be updated and main memory will also be written leaving the write-back bit of the cache line unchanged; no writes will occur into the secondary. If the cache lookup misses, the target line is first brought into the cache and then the write is performed as above.
4. Write-through without write allocate. Loads and instruction fetches will first search the cache, reading from memory only if the desired data is not cache resident; write-through data is never cached in the secondary. On data store operations, the cache is first searched to determine if the target address is cache resident. If it is resident, the cache contents will be updated and main memory will also be written leaving the write-back bit of the cache line unchanged; no writes will occur into the secondary. If the cache lookup misses, then only main memory is written.
5. Write-back with secondary bypass. Loads and instruction fetches first search the primary cache, reading from memory only if the desired data is not resident; the secondary is not searched. On data store operations, the primary cache is first searched to determine if the target address is resident. If it is resident, the cache contents are updated, and the cache line marked for later write-back. If the cache lookup misses, the target line is first brought into the cache and then the write is performed as above.
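The store behavior of the five protocols listed above can be summarized as a small decision function. This is a descriptive sketch only: the policy strings and action labels are hypothetical names chosen for illustration, and the function records what the text says happens on a store hit or miss rather than modeling any hardware detail.

```python
def store_actions(policy, hit):
    """Return the ordered actions for a data store under one cache protocol."""
    if policy == "uncached":
        # Writes go directly to main memory without updating the cache.
        return ["write main memory"]
    if policy in ("write-back", "write-back-secondary-bypass"):
        actions = [] if hit else ["fill line into primary"]
        return actions + ["update primary", "mark line for write-back"]
    if policy == "write-through-allocate":
        actions = [] if hit else ["fill line into primary"]
        return actions + ["update primary", "write main memory"]
    if policy == "write-through-no-allocate":
        if hit:
            return ["update primary", "write main memory"]
        return ["write main memory"]
    raise ValueError("unknown policy: " + policy)


if __name__ == "__main__":
    for policy in ("uncached", "write-back", "write-through-allocate",
                   "write-through-no-allocate", "write-back-secondary-bypass"):
        for hit in (True, False):
            print(policy, "hit" if hit else "miss", store_actions(policy, hit))
```

For stores, the only difference between the two write-through protocols is the miss case: with write allocate the line is brought into the primary cache first, without it only main memory is written.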
Associated with the Data Cache is the store buffer. When the ACT 7000SC executes a STORE instruction, this single-entry buffer gets written with the store data while the tag comparison is performed. If the tag matches, then the data is written into the Data Cache in the next cycle that the Data Cache is not accessed (the next non-load cycle). The store buffer allows the ACT 7000SC to execute a store every processor cycle and to perform back-to-back stores without penalty. In the event of a store immediately followed by a load to the same address, a combined merge and cache write will occur such that no penalty is incurred.
Secondary Cache
The ACT 7000SC has an integrated 256KB, four-way set associative, block write-back secondary cache. The secondary has the same line size as the primaries (32 bytes), is logically 64 bits wide to match the system interface and primary widths, and is protected with doubleword parity. The secondary tag array holds a 20-bit physical address, 2 housekeeping bits, a three-bit cache state field, and two parity bits.
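The geometry quoted above can be checked with a little arithmetic. The following is a minimal sketch, not vendor code: the function name `cache_geometry` is hypothetical, and the 36-bit physical address width is an assumption taken from the pAddr 35..x ranges listed in Table 6.

```python
def cache_geometry(size_bytes, ways, line_bytes, paddr_bits=36):
    """Derive set count and address-field widths for a set-associative cache.

    paddr_bits=36 is an assumption based on the pAddr 35..x ranges in Table 6.
    """
    sets = size_bytes // (ways * line_bytes)
    offset_bits = (line_bytes - 1).bit_length()   # byte-within-line bits
    index_bits = (sets - 1).bit_length()          # set-select bits
    tag_bits = paddr_bits - offset_bits - index_bits
    return {
        "sets": sets,
        "offset_bits": offset_bits,
        "index_bits": index_bits,
        "tag_bits": tag_bits,
    }


# 256KB, 4-way, 32-byte lines (secondary) and 16KB, 4-way, 32-byte lines (primary)
secondary = cache_geometry(256 * 1024, ways=4, line_bytes=32)
primary = cache_geometry(16 * 1024, ways=4, line_bytes=32)

if __name__ == "__main__":
    print("secondary:", secondary)
    print("primary:  ", primary)
```

For the secondary this gives 2048 sets, 16 low-order bits for index plus line offset, and a 20-bit tag, which agrees with the 20-bit physical address held in the secondary tag array.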
By integrating a secondary cache, the ACT 7000SC is able to dramatically decrease the latency of a primary cache miss without dramatically increasing
the number of pins and the amount of power required by the processor. From a technology point of view, integrating a secondary cache maximally leverages CMOS semiconductor technology by using silicon to build the structures that are most amenable to silicon technology; silicon is being used to build very dense, low power memory arrays rather than large power hungry I/O buffers.
Further benefits of an integrated secondary are flexibility in the cache organization and management policies that are not practical with an external cache. Two previously mentioned examples are the 4-way associativity and write-back cache protocol.
A third management policy for which integration affords flexibility is cache hierarchy management. With multiple levels of cache, it is necessary to specify a policy for dealing with cases where two cache lines at level n of the hierarchy would, if possible, be sharing an entry in level n+1 of the hierarchy. The policy followed by the ACT 7000SC is motivated by the desire to get maximum cache utility and results in the ACT 7000SC allowing entries in the primaries which do not necessarily have a corresponding entry in the secondary; the ACT 7000SC does not force the primaries to be a subset of the secondary. For example, if primary cache line A is being filled and a cache line already exists in the secondary for primary cache line B at the location where primary A’s line would reside then that secondary entry will be replaced by an entry corresponding to primary cache line A and no action will occur in the primary for cache line B. This operation will create the aforementioned scenario where the primary cache line which initially had a corresponding secondary entry will no longer have such an entry. Such a primary line is called an orphan. In general, cache lines at level n+1 of the hierarchy are called parents of level n’s children.
Another ACT 7000SC cache management optimization occurs for the case of a secondary cache line replacement where the secondary line is dirty and has a corresponding dirty line in the primary. In this case, since it is permissible to leave the dirty line in the primary, it is not necessary to write the secondary line back to main memory. Taking this scenario one step further, a final optimization occurs when the aforementioned dirty primary line is replaced by another line and must be written back; in this case it is written directly to memory, bypassing the secondary cache.
Secondary Caching Protocols
Unlike the primary data cache, the secondary cache supports only uncached and block write-back. As noted earlier, cache lines managed with either of the write-through protocols will not be placed in the secondary cache. A new caching attribute, write-back with secondary bypass, allows the secondary to be bypassed entirely. When this attribute is selected, the secondary will not be filled on load misses and will not be written on dirty write-backs from the primary.
Table 6 – Cache Attributes
Attribute | Instruction | Data | Secondary
Size | 16KB | 16KB | 256KB
Associativity | 4-way | 4-way | 4-way
Replacement algorithm | cyclic | cyclic | cyclic
Line size | 32 byte | 32 byte | 32 byte
Index | vAddr 11..0 | vAddr 11..0 | pAddr 15..0
Tag | pAddr 35..12 | pAddr 35..12 | pAddr 35..16
Write policy | n.a. | write-back, write-through | block write-back, bypass
Read policy | n.a. | non-blocking (2 outstanding) | non-blocking (data only, 2 outstanding)
Read order | critical word first | critical word first | critical word first
Write order | n.a. | sequential | sequential
Miss restart following | complete line | first double (if waiting for data) | n.a.
Parity | per word | per byte | per doubleword
Cache Locking
The ACT 7000SC allows critical code or data fragments to be locked into the primary and secondary caches. The user has complete control over what locking is performed with cache line granularity. For instruction and data fragments in the primaries, locking is accomplished by setting either or both of the cache lock enable bits in the CP0 ECC register, specifying the set via a field in the CP0 ECC register, and then executing either a load instruction or a Fill_I cache operation for data or instructions respectively. Only two sets are lockable within each cache: set A and set B. Locking within the secondary works identically to the primaries using a separate secondary lock enable bit and the same set selection field. As with the primaries, only two sets are lockable: sets A and B. Table 7 summarizes the cache locking capabilities.
Table 7 – Cache Locking Control
Cache | Lock Enable | Set Select | Activate
Primary I | ECC[27] | ECC[28]=0 → A, ECC[28]=1 → B | Fill_I
Primary D | ECC[26] | ECC[28]=0 → A, ECC[28]=1 → B | Load/Store
Secondary | ECC[25] | ECC[28]=0 → A, ECC[28]=1 → B | Fill_I or Load/Store
Cache Management
To improve the performance of critical data movement operations in the embedded environment, the ACT 7000SC significantly improves the speed of operation of certain critical cache management operations as compared with the R5000 and R4000 families. In particular, the speed of the Hit-Writeback-Invalidate and Hit-Invalidate cache operations has been improved in some cases by an order of magnitude over that of the earlier families. Table 8 compares the ACT 7000SC with the R4000 and R5000 processors.
Table 8 – Penalty Cycle
Operation | Condition | ACT 7000SC Penalty | R4000/R5000 Penalty
Hit-Writeback-Invalidate | Miss | 0 | 7
Hit-Writeback-Invalidate | Hit-Clean | 3 | 12
Hit-Writeback-Invalidate | Hit-Dirty | 3+n | 14+n
Hit-Invalidate | Miss | 0 | 7
Hit-Invalidate | Hit | 2 | 9
For the Hit-Dirty case of Hit-Writeback-Invalidate, if the writeback buffer is full from some previous cache eviction then n is the number of cycles required to empty the write-back buffer. If the buffer is empty then n is zero.
The penalty value is the number of processor cycles beyond the one cycle required to issue the instruction that is required to implement the operation.
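As an illustration of how the penalty figures in Table 8 combine with the one issue cycle, the sketch below looks up a penalty and adds the write-back buffer drain time n for the Hit-Dirty case. The names `PENALTY`, `penalty_cycles`, and `total_cycles` are hypothetical helpers introduced only for this example; the numbers are those of Table 8.

```python
# (operation, condition) -> (ACT 7000SC penalty, R4000/R5000 penalty), from Table 8.
# For Hit-Dirty the table gives 3+n and 14+n; n is added in penalty_cycles().
PENALTY = {
    ("Hit-Writeback-Invalidate", "Miss"): (0, 7),
    ("Hit-Writeback-Invalidate", "Hit-Clean"): (3, 12),
    ("Hit-Writeback-Invalidate", "Hit-Dirty"): (3, 14),
    ("Hit-Invalidate", "Miss"): (0, 7),
    ("Hit-Invalidate", "Hit"): (2, 9),
}


def penalty_cycles(operation, condition, n=0, act7000=True):
    """Return the penalty in processor cycles.

    n is the number of cycles needed to empty the write-back buffer
    (zero if the buffer is already empty); it only applies to Hit-Dirty.
    """
    act, older = PENALTY[(operation, condition)]
    base = act if act7000 else older
    if condition == "Hit-Dirty":
        base += n
    return base


def total_cycles(operation, condition, n=0, act7000=True):
    """Penalty plus the one cycle required to issue the instruction."""
    return 1 + penalty_cycles(operation, condition, n, act7000)


if __name__ == "__main__":
    for key in PENALTY:
        print(key, total_cycles(*key), total_cycles(*key, act7000=False))
```

With an empty write-back buffer, a Hit-Dirty Hit-Writeback-Invalidate costs 4 cycles in total on the ACT 7000SC against 15 on the earlier families.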
Primary Write Buffer
Writes to secondary cache or external memory, whether cache miss write-backs or stores to uncached or write-through addresses, use the integrated primary write buffer. The write buffer holds up to four 64-bit address and data pairs. The entire buffer is used for a data cache write-back and allows the processor to proceed in parallel with memory update. For uncached and write-through stores, the write buffer significantly increases performance by decoupling the SysAD bus transfers from the instruction execution stream.
System Interface
The ACT 7000SC provides a high-performance 64-bit system interface which is compatible with the RM5200 Family and R5000. Unlike the R4000 and R5000 family processors, which provide only an integral multiplication factor between SysClock and the pipeline clock, the ACT 7000SC also allows half-integral multipliers, thereby providing greater granularity in the designer's choice of pipeline and system interface frequencies.
The interface consists of a 64-bit Address/Data bus with 8 check bits and a 9-bit command bus. In addition, there are six handshake signals and six interrupt inputs. The interface has a simple timing specification and is capable of transferring data between the processor and memory at a peak rate of 600 MB/sec with a 75 MHz SysClock.
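The 600 MB/sec peak figure follows directly from the bus width and the maximum SysClock. A minimal sketch of that arithmetic, assuming one 64-bit transfer on every SysClock cycle (the function name is hypothetical):

```python
def peak_mb_per_s(sysclock_mhz, bus_bits=64):
    """Peak transfer rate in MB/s, assuming one full-width transfer per SysClock cycle."""
    return sysclock_mhz * bus_bits // 8


if __name__ == "__main__":
    print(peak_mb_per_s(75))   # 64-bit SysAD at 75 MHz
```

Eight bytes per cycle at 75 MHz gives 600 MB/sec; any idle cycles in the data pattern reduce the sustained rate below this peak.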
Figure 6 shows a typical embedded system using the ACT 7000SC. This example shows a system with a bank of DRAMs, and an interface ASIC which provides DRAM control as well as an I/O port.
System Address/Data Bus
The 64-bit System Address Data (SysAD) bus is used to transfer addresses and data between the ACT 7000SC and the rest of the system. It is protected with an 8-bit parity check bus, SysADC.
The system interface is configurable to allow easy interfacing to memory and I/O systems of varying frequencies. The data rate and the bus frequency at which the ACT 7000SC transmits data to the system interface are programmable via boot time mode control bits. Also, the rate at which the processor receives data is fully controlled by the external device. Therefore, either a low cost interface requiring no read or write buffering or a faster, high-performance interface can be designed to communicate with the ACT 7000SC. Again, the system designer has the flexibility to make these price/performance trade-offs.
System Command Bus
The ACT 7000SC interface has a 9-bit System Command (SysCmd) bus. The command bus indicates whether the SysAD bus carries an address or data. If the SysAD bus carries an address, then the SysCmd bus also indicates what type of transaction is to take place (for example, a read or write). If the SysAD bus carries data, then the SysCmd bus also gives information about the data (for example, this is the last data word transmitted, or the data contains an error). The SysCmd bus is bidirectional to support both processor requests and external requests to the ACT 7000SC. Processor requests are initiated by the ACT 7000SC and responded to by an external device. External requests are issued by an external device and require the ACT 7000SC to respond.
The ACT 7000SC supports one to eight byte and 32-byte block transfers on the SysAD bus. In the case of a sub-double-word transfer, the 3 low-order address bits give the byte address of the transfer, and the SysCmd bus indicates the number of bytes being transferred.
Handshake Signals
There are six handshake signals on the system interface. Two of these, RdRdy* and WrRdy*, are used by an external device to indicate to the ACT 7000SC whether it can accept a new read or write transaction. The ACT 7000SC samples these signals before deasserting the address on read and write requests.
Figure 6 – Typical Embedded System Block Diagram (blocks: ACT 7000SC, Latch, DRAM, Flash/Boot ROM, Memory I/O Controller; signals: SysAD Bus, SysCmd, Address, Control, PCI Bus)
ExtRqst* and Release* are used to transfer control of the SysAD and SysCmd buses from the processor to an external device. When an external device needs to control the interface, it asserts ExtRqst*. The ACT 7000SC responds by asserting Release* to release the system interface to slave state.
ValidOut* and ValidIn* are used by the ACT 7000SC and the external device respectively to indicate that there is a valid command or data on the SysAD and SysCmd buses. The ACT 7000SC asserts ValidOut* when it is driving these buses with a valid command or data, and the external device drives ValidIn* when it has control of the buses and is driving a valid command or data.
System Interface Operation
The ACT 7000SC can issue read and write requests to an external device, while an external device can issue null and write requests to the ACT 7000SC.
For processor reads, the ACT 7000SC asserts ValidOut* and simultaneously drives the address and read command on the SysAD and SysCmd buses. If the system interface has RdRdy* asserted, then the processor tristates its drivers and releases the system interface to slave state by asserting Release*. The external device can then begin sending data to the ACT 7000SC.
Figure 7 shows a processor block read request and the external agent read response. The read latency is 4 cycles (ValidOut* to ValidIn*), and the response data pattern is DDxxDD. Figure 8 shows a processor block write where the processor was programmed with write-back data rate boot code 2, or DDxxDDxx.
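To make the data-pattern notation concrete, the sketch below counts how many bus cycles a 32-byte block (four doublewords) occupies under a given pattern, where D is a data cycle and x is an idle cycle. The function name is hypothetical, and the assumption that the pattern simply repeats until all doublewords have been transferred is ours, not a statement from the datasheet.

```python
def cycles_for_block(pattern, doublewords=4):
    """Count bus cycles needed to move `doublewords` data cycles under `pattern`.

    'D' marks a data cycle and 'x' an idle cycle; the pattern is assumed
    to repeat until every doubleword has been transferred.
    """
    if "D" not in pattern:
        raise ValueError("pattern must contain at least one data cycle")
    sent = 0
    cycle = 0
    while sent < doublewords:
        if pattern[cycle % len(pattern)] == "D":
            sent += 1
        cycle += 1
    return cycle


if __name__ == "__main__":
    for p in ("DDDD", "DDxDDx", "DDxxDDxx", "DxxxDxxxDxxxDxxx"):
        print(p, cycles_for_block(p))
```

For boot code 2 (DDxxDDxx) the four doublewords complete in six cycles, which is the DDxxDD shape quoted above.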
Data Prefetch
The ACT 7000SC is the first Aeroflex design to support the MIPS IV integer data prefetch (PREF) and floating-point data prefetch (PREFX) instructions. These instructions are used by the compiler or by an assembly language programmer when it is known or suspected that an upcoming data reference is going to miss in the cache. By appropriately placing a prefetch instruction, the memory latency can be hidden under the execution of other instructions. If the execution of a prefetch instruction would cause a memory management or address error exception, the prefetch is treated as a NOP.
The “Hint” field of the data prefetch instruction is used to specify the action taken by the instruction. The instruction can operate normally (that is, fetching data as if for a load operation) or it can allocate and fill a cache line with zeroes on a primary data cache miss.
Enhanced Write Modes
The ACT 7000SC implements two enhancements to the original R4000 write mechanism: Write Reissue and Pipelined Writes. In write reissue mode, a write rate of one write every two bus cycles can be achieved. A write issues if WrRdy* is asserted two cycles earlier and is still asserted during the issue cycle. If it is not still asserted, the last write will reissue. Pipelined writes have the same two bus cycle write repeat rate, but can issue one additional write following the deassertion of WrRdy*.
External Requests
The ACT 7000SC can respond to certain requests issued by an external device. These requests take one of two forms: Write requests and Null requests. An external device executes a write request when it wishes to update one of the processor's writable resources such as the internal interrupt register. A null request is executed when the external device wishes the processor to reassert ownership of the processor external interface. Typically a null request will be executed after an external device, that has acquired control of the processor interface via ExtRqst*, has completed an independent transaction between itself and system memory in a system where memory is connected directly to the SysAD bus. Normally this transaction would be a DMA read or write from the I/O system.
Figure 7 – Processor Block Read (timing diagram; signals: SysClock, SysAD, SysCmd, ValidOut*, ValidIn*, RdRdy*, WrRdy*, Release*; SysAD carries Addr followed by Data0 through Data3; SysCmd carries Read followed by NData identifiers and a final NEOD)
Figure 8 – Processor Block Write (timing diagram; signals: SysClock, SysAD, SysCmd, ValidOut*, ValidIn*, RdRdy*, WrRdy*, Release*; SysAD carries Addr followed by Data0 through Data3; SysCmd carries Write followed by NData identifiers and a final NEOD)
Test / Breakpoint Registers
To increase both observability and controllability of the processor thereby easing hardware and software debugging, a pair of Test/Break-point, or Watch, registers, Watch1 and Watch2, have been added to the ACT 7000SC. Each Watch register can be separately enabled to watch for a load address, a store address, or an instruction address. All address comparisons are done on physical addresses. An associated register, Watch Mask, has also been added so that either or both of the Watch registers can compare against an address range rather than a specific address. The range granularity is limited to a power of two.
When enabled, a match of either Watch register results in an exception. If the Watch is enabled for a load or store address then the exception is the Watch exception as defined for the R4000 with Cause exception code twenty-three. If the Watch is enabled for instruction addresses then a newly defined Instruction Watch exception is taken and the Cause code is sixteen. The Watch register which caused the exception is indicated by the W1 and W2 bits of the Cause register.
Table 9 summarizes a Watch operation.
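As a sketch of the Watch1/Watch2 layout given in Table 9 (bit 63 Store, bit 62 Load, bit 61 Instr, bits 35:2 Addr, all other bits zero), the helper below packs a physical address and enable flags into a 64-bit register value. The function name is hypothetical, and how the value is actually written to the CP0 register is outside the scope of this example.

```python
STORE_BIT = 1 << 63
LOAD_BIT = 1 << 62
INSTR_BIT = 1 << 61
ADDR_MASK = ((1 << 36) - 1) & ~0x3   # bits 35:2 hold the physical address


def encode_watch(paddr, store=False, load=False, instr=False):
    """Pack a Watch1/Watch2 value following the Table 9 bit layout."""
    if paddr < 0 or paddr >> 36:
        raise ValueError("physical address must fit in 36 bits")
    value = paddr & ADDR_MASK          # bits 1:0 are always zero
    if store:
        value |= STORE_BIT
    if load:
        value |= LOAD_BIT
    if instr:
        value |= INSTR_BIT
    return value


if __name__ == "__main__":
    # Watch stores to the word containing physical address 0x1000
    print(hex(encode_watch(0x1000, store=True)))
```

Because comparisons are made on physical addresses, the caller is assumed to have translated any virtual address before encoding it.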
Performance Counters
Like the Test/Break-point capability described above, the Performance Counter feature has been added to improve the observability and controllability of the processor thereby easing system debug and, especially in the case of the performance counters, easing system tuning.
The Performance Counter feature is implemented using two new CP0 registers, PerfCount and PerfControl. The PerfCount register is a 32-bit writable counter which causes an interrupt when bit 31 is set. The PerfControl register is a 32-bit register containing a five-bit field which selects the event type to be counted, as well as a handful of bits which control the overall counting function. Note that only one event type can be counted at a time and that counting can occur for user code, kernel code, or both. The event types and control bits are listed in Table 10.
Table 9 – Watch Control Register
Register | Bit: Field/Function
Watch1, 2 | 63: Store | 62: Load | 61: Instr | 60:36: 0 | 35:2: Addr | 1:0: 0
Watch Mask | 31:2: Mask | 1: Mask Watch 2 | 0: Mask Watch 1
Table 10 – Performance Counter Control
PerfControl Field | Description
4..0 | Event Type
  00: Clock cycles
  01: Total instructions issued
  02: Floating-point instructions issued
  03: Integer instructions issued
  04: Load instructions issued
  05: Store instructions issued
  06: Dual issued pairs
  07: Branch prefetches
  08: External Cache Misses
  09: Stall cycles
  0A: Secondary cache misses
  0B: Instruction cache misses
  0C: Data cache misses
  0D: Data TLB misses
  0E: Instruction TLB misses
  0F: Joint TLB instruction misses
  10: Joint TLB data misses
  11: Branches taken
  12: Branches issued
  13: Secondary cache writebacks
  14: Primary cache writebacks
  15: Dcache miss stall cycles (cycles where both cache miss tokens taken and a third address is requested)
  16: Cache misses
  17: FP possible exception cycles
  18: Slip cycles due to multiplier busy
  19: Coprocessor 0 slip cycles
  1A: Slip cycles due to pending non-blocking loads
  1B: Write buffer full stall cycles
  1C: Cache instruction stall cycles
  1D: Multiplier stall cycles
  1E: Stall cycles due to pending non-blocking loads - stall start of exception
7..5 | Reserved (must be zero)
8 | Count in Kernel Mode (0: Disable, 1: Enable)
9 | Count in User Mode (0: Disable, 1: Enable)
10 | Count Enable (0: Disable, 1: Enable)
31..11 | Reserved (must be zero)
The performance counter interrupt will only occur when interrupts are enabled in the Status register, IE=1, and Interrupt Mask bit 13 (IM[13]) of the coprocessor 0 interrupt control register is not set.
Since the performance counter can be set up to count clock cycles, it can be used as either a) a second timer or b) a watchdog interrupt. A watchdog interrupt can be used as an aid in debugging system or software “hangs.” Typically the software is set up to periodically update the count so that no interrupt will occur. When a hang occurs, the interrupt ultimately triggers, thereby breaking free from the hang-up.
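The PerfControl fields of Table 10 and the watchdog use described above can be sketched as follows. This is an illustration only: the helper names are hypothetical, and the preload calculation assumes that PerfCount increments by one per counted event and raises its interrupt once bit 31 becomes set.

```python
EVENT_CLOCK_CYCLES = 0x00
EVENT_DCACHE_MISSES = 0x0C


def perf_control(event, kernel=True, user=True, enable=True):
    """Build a PerfControl value: event in bits 4..0, control bits 8, 9 and 10."""
    if not 0 <= event <= 0x1E:
        raise ValueError("event type must be one of the codes 0x00..0x1E in Table 10")
    value = event
    value |= int(kernel) << 8    # Count in Kernel Mode
    value |= int(user) << 9      # Count in User Mode
    value |= int(enable) << 10   # Count Enable
    return value


def watchdog_preload(events_until_interrupt):
    """PerfCount value that reaches bit 31 after the given number of events.

    Assumes the counter counts up by one per event (an assumption of this sketch).
    """
    if not 0 < events_until_interrupt <= 0x8000_0000:
        raise ValueError("interval must be between 1 and 2**31 events")
    return 0x8000_0000 - events_until_interrupt


if __name__ == "__main__":
    print(hex(perf_control(EVENT_CLOCK_CYCLES)))
    print(hex(watchdog_preload(225_000_000)))   # about one second of cycles at 225 MHz
```

Software using the counter as a watchdog would rewrite PerfCount with the preload value before the interval expires, so the interrupt fires only when that periodic update stops.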
Interrupt Handling
In order to provide better real time interrupt handling, the ACT 7000SC provides an extended set of hardware interrupts each of which can be separately prioritized and separately vectored.
As described above, the performance counter is also a hardware interrupt source, IP[13]. Also, whereas the R4000 and R5000 family processors map the timer interrupt onto IP[7], the ACT 7000SC provides a separate interrupt, IP[12], for this purpose.
All of these interrupts, IP[13..0], the Performance Counter, and the Timer, have corresponding interrupt mask bits, IM[13..0], and interrupt pending bits, IP[13..0], in the Status, Interrupt Control, and Cause registers. The bit assignments for the Interrupt Control and Cause registers are shown in Table 11 and Table 12 below. The Status register has not changed from the RM5200 Family and R5000, and is not shown.
The IV bit in the Cause register is the global enable bit for the enhanced interrupt features. If this bit is clear then interrupt operation is compatible with the RM5200 Family and R5000. Although not related to the interrupt mechanism, note that the W1 and W2 bits indicate which Watch register caused a particular Watch exception.
In the Interrupt Control register, the interrupt vector spacing is controlled by the Spacing field as described below. The Interrupt Mask field (IM[15..8]) contains the interrupt mask for interrupts eight through thirteen. IM[15..14] are reserved for future use. The Timer Exclusive (TE) bit, if set, moves the Timer interrupt to IP[12]. If clear, the Timer interrupt will be ORed into IP[7] as on the R5000.
The Interrupt Control register uses IM13 to enable the Performance Counter Control.
Priority of the interrupts is set via two new coprocessor 0 registers called Interrupt Priority Level Lo, IPLLO, and Interrupt Priority Level Hi, IPLHI.
These two registers contain a four-bit field corresponding to each interrupt, thereby allowing each interrupt to be programmed with a priority level from 0 to 15 inclusive. The priorities can be set in any manner, including having all the priorities set exactly the same. Priority 0 is the highest level and priority 15 the lowest. The format of the priority level registers is shown in Table 13 and Table 14 below. The priority level registers are located in the coprocessor 0 control register space. For further details about the control space see the section describing coprocessor 0.
Table 11 – Cause Register
Bits | 31 | 30 | 29,28 | 27 | 26 | 25 | 24 | 23..8 | 7 | 6..2 | 1,0
Field | BD | 0 | CE | 0 | W2 | W1 | IV | IP[15..0] | 0 | EXC | 0
Table 12 – Interrupt Control Register
Bits | 31..16 | 15..8 | 7 | 6..5 | 4..0
Field | 0 | IM[15..8] | TE | 0 | Spacing
Table 13 – IPLLO Register
Bits | 31..28 | 27..24 | 23..20 | 19..16 | 15..12 | 11..8 | 7..4 | 3..0
Field | IPL7 | IPL6 | IPL5 | IPL4 | IPL3 | IPL2 | IPL1 | IPL0
Table 14 – IPLHI Register
Bits | 31..28 | 27..24 | 23..20 | 19..16 | 15..12 | 11..8 | 7..4 | 3..0
Field | 0 | 0 | IPL13 | IPL12 | IPL11 | IPL10 | IPL9 | IPL8
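The four-bit priority fields of Table 13 and Table 14 can be packed as in the sketch below. The function name `pack_ipl` is hypothetical; the field positions are those shown in the tables (IPL0 in bits 3..0 of IPLLO, IPL8 in bits 3..0 of IPLHI).

```python
def pack_ipl(levels):
    """Pack 14 priority levels (interrupts 0..13) into (IPLLO, IPLHI) values."""
    if len(levels) != 14:
        raise ValueError("expected one priority level for each of interrupts 0..13")
    ipllo = 0
    iplhi = 0
    for irq, level in enumerate(levels):
        if not 0 <= level <= 15:
            raise ValueError("each priority level is a four-bit value")
        if irq < 8:
            ipllo |= level << (4 * irq)          # IPL0..IPL7
        else:
            iplhi |= level << (4 * (irq - 8))    # IPL8..IPL13
    return ipllo, iplhi


if __name__ == "__main__":
    lo, hi = pack_ipl(list(range(14)))   # interrupt k gets priority k
    print(hex(lo), hex(hi))
```

Giving interrupt k the priority k, as in the usage line, makes interrupt 0 the highest priority, since priority 0 is the highest level.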
In addition to programmable priority levels, the ACT 7000SC also permits the spacing between interrupt vectors to be programmed. For example, the minimum spacing between two adjacent vectors is 0x20 while the maximum is 0x200. This programmability allows the user to either set up the vectors as jumps to the actual interrupt routines or, if interrupt latency is paramount, to include the entire interrupt routine at the vector. Table 15 illustrates the complete set of vector spacing selections along with the coding as required in the Interrupt Control register bits 4:0.
In general, the active interrupt priority combined with the spacing setting generates a vector offset which is then added to the interrupt base address of 0x200 to generate the interrupt exception offset. This offset is then added to the exception base to produce the final interrupt vector address.
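The address calculation described above can be sketched as follows. Two points are assumptions of this example rather than statements from the datasheet: the vector offset is taken to be the active priority level multiplied by the selected spacing, and the exception base value used in the usage line is a hypothetical placeholder. The spacing codes are those listed in Table 15.

```python
# ICR[4..0] code -> vector spacing in bytes (Table 15); other codes are reserved.
SPACING = {0x00: 0x000, 0x01: 0x020, 0x02: 0x040,
           0x04: 0x080, 0x08: 0x100, 0x10: 0x200}

INTERRUPT_BASE = 0x200


def interrupt_vector(exception_base, spacing_code, priority):
    """Return the interrupt vector address.

    Assumes vector offset = priority * spacing (an assumption of this sketch).
    """
    if spacing_code not in SPACING:
        raise ValueError("reserved spacing code")
    if not 0 <= priority <= 15:
        raise ValueError("priority level is a four-bit value")
    vector_offset = priority * SPACING[spacing_code]
    return exception_base + INTERRUPT_BASE + vector_offset


if __name__ == "__main__":
    # 0x80000000 is a placeholder exception base chosen for illustration
    print(hex(interrupt_vector(0x8000_0000, 0x01, 3)))
```

With a spacing code of 0x0 every priority level resolves to the same address, which corresponds to a single shared interrupt vector.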
Table 15 – Interrupt Vector Spacing
ICR[4..0] | Spacing
0x0 | 0x000
0x1 | 0x020
0x2 | 0x040
0x4 | 0x080
0x8 | 0x100
0x10 | 0x200
others | reserved
Standby Mode
The ACT 7000SC provides a means to reduce the amount of power consumed by the internal core when the CPU would not otherwise be performing any useful operations. This state is known as Standby Mode.
Executing the WAIT instruction enables interrupts and enters Standby Mode. When the WAIT instruction completes the W pipe stage, if the SysAD bus is currently idle, the internal processor clocks will stop, thereby freezing the pipeline. The phase lock loop, or PLL, internal timer/counter, and the “wake up” input pins: IP[5:0]*, NMI*, ExtRqst*, Reset*, and ColdReset* continue to operate in their normal fashion. If the SysAD bus is not idle when the WAIT instruction completes the W pipe stage, then the WAIT is treated as a NOP. Once the processor is in Standby, any interrupt, including the internally generated timer interrupt, will cause the processor to exit Standby and resume operation where it left off. The WAIT instruction is typically inserted in the idle loop of the operating system or real time executive.
JTAG Interface
The ACT 7000SC interface supports JTAG boundary scan in conformance with IEEE 1149.1. The JTAG interface is especially helpful for checking the integrity of the processor’s pin connections.
Boot-Time Options
Fundamental operational modes for the processor are initialized by the boot-time mode control interface. The boot-time mode control interface is a serial interface operating at a very low frequency (SysClock divided by 256). The low frequency operation allows the initialization information to be
kept in a low cost EPROM; alternatively the twenty or so bits could be generated by the system interface ASIC.
Immediately after the VccOK signal is asserted, the processor reads a serial bit stream of 256 bits to initialize all the fundamental operational modes.
ModeClock runs continuously from the assertion of VccOK.
Boot-Time Modes
The boot-time serial mode stream is defined in Table 16. Bit 0 is the bit presented to the processor when VccOK is deasserted; bit 255 is the last.
Table 16 – Boot Time Mode Stream
Mode bit | Description
0 | Reserved: Must be zero
4..1 | Write-back data rate
  0: DDDD
  1: DDxDDx
  2: DDxxDDxx
  3: DxDxDxDx
  4: DDxxxDDxxx
  5: DDxxxxDDxxxx
  6: DxxDxxDxxDxx
  7: DDxxxxxxDDxxxxxx
  8: DxxxDxxxDxxxDxxx
  9-15: Reserved
7..5 | SysClock to Pclock Multiplier (Mode bit 20 = 0 / Mode bit 20 = 1)
  0: Multiply by 2/x
  1: Multiply by 3/x
  2: Multiply by 4/x
  3: Multiply by 5/2.5
  4: Multiply by 6/x
  5: Multiply by 7/3.5
  6: Multiply by 8/x
  7: Multiply by 9/4.5
8 | Specifies byte ordering. Logically ORed with BigEndian input signal.
  0: Little endian
  1: Big endian
10..9 | Non-Block Write Control
  00: R4000 compatible non-block writes
  01: Reserved
  10: pipelined non-block writes
  11: non-block write re-issue
11 | Timer Interrupt Enable/Disable
  0: Enable the timer interrupt on IP[5]
  1: Disable the timer interrupt on IP[5]
12 | Reserved: Must be zero
14..13 | Output driver strength - 100% = fastest
  00: 67% strength
  01: 50% strength
  10: 100% strength
  11: 83% strength
15 | Reserved: Must be zero
17..16 | System configuration identifiers - software visible in processor Config[21..20] register
19..18 | Reserved: Must be zero
20 | Pclock to SysClock multipliers.
  0: Integer multipliers (2,3,4,5,6,7,8,9)
  1: Half integer multipliers (2.5,3.5,4.5)
21 | External Bus Width.
  0: 64-bit
  1: 32-bit
23..22 | Reserved: Must be zero
24 | JTLB Size.
  0: 48 dual-entry
  1: 64 dual-entry
25 | On-chip secondary cache control.
  0: Disable
  1: Enable
255..26 | Reserved: Must be zero
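The multiplier encoding in Table 16 (mode bits 7..5 qualified by mode bit 20) and the layout of the 256-bit stream can be sketched as below. The helper names are hypothetical, and placing the least significant bit of each field in the lowest-numbered mode bit is an assumption of this example, not something the table states.

```python
INTEGER_MULTIPLIERS = [2, 3, 4, 5, 6, 7, 8, 9]      # mode bit 20 = 0
HALF_MULTIPLIERS = {3: 2.5, 5: 3.5, 7: 4.5}         # mode bit 20 = 1; other codes are 'x'


def pclock_multiplier(code, half_integer=0):
    """Decode mode bits 7..5 (code) together with mode bit 20 (half_integer)."""
    if not 0 <= code <= 7:
        raise ValueError("multiplier code is a three-bit value")
    if half_integer == 0:
        return INTEGER_MULTIPLIERS[code]
    if code not in HALF_MULTIPLIERS:
        raise ValueError("no half-integer multiplier is defined for this code")
    return HALF_MULTIPLIERS[code]


def set_field(bits, high, low, value):
    """Write value into mode bits high..low.

    Assumption: the field's least significant bit goes in the lowest mode bit.
    """
    width = high - low + 1
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in the field")
    for i in range(width):
        bits[low + i] = (value >> i) & 1


def example_stream():
    """A 256-bit stream: data rate DDxxDDxx, multiply by 3, secondary cache enabled."""
    bits = [0] * 256               # reserved bits stay zero
    set_field(bits, 4, 1, 2)       # write-back data rate 2: DDxxDDxx
    set_field(bits, 7, 5, 1)       # SysClock to Pclock multiplier code 1: x3
    set_field(bits, 25, 25, 1)     # on-chip secondary cache enabled
    return bits


if __name__ == "__main__":
    print(75 * pclock_multiplier(1), "MHz pipeline clock from a 75 MHz SysClock")
    print(example_stream()[:26])
```

A 75 MHz SysClock with multiplier code 1 gives a 225 MHz pipeline clock, matching the fastest listed operating frequency.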
PLL Analog Power Filtering
The ACT 7000SC includes extra PLL Analog Power Filtering circuitry designed to provide low noise, temperature stable filtering for the VccP and VssP signals. The included circuitry consists of several passive components located at the closest possible point to the RM7000 die and is configured as shown in Figure 9.
Figure 9 – ACT 7000SC Including PLL Filter Circuit (schematic; labels: VccP, VssP, 64, 65, RM7000 Die; component values: 5, 5, .01 µF, 1000 pF)
Additional board level PLL filtering is also required. The recommended configuration is shown in Figure 10.
Figure 10 – Recommended Board Level PLL Filter circuit for the ACT 7000SC (schematic; labels: VccInt, VccP, VssInt, VssP, 64, 65; component values: 5, 5, 10 µF, .1 µF, 1000 pF)
Absolute Maximum Rating (Note 1)
Symbol | Parameter | Limits | Units
VTERM | Terminal Voltage with respect to VSS | -0.5 (Note 2) to +3.9 | V
TC | Case Operating Temperature | -55 to +125 | °C
TSTG | Storage Temperature | -65 to +150 | °C
IIN | DC Input Current | 20 (Note 3) | mA
IOUT | DC Output Current | 50 (Note 4) | mA
Note 1: Stresses greater than those listed under ABSOLUTE MAXIMUM RATINGS may cause permanent damage to the device. This is a stress rating only and functional operation of the device at these or any other conditions above those indicated in the operational sections of this specification is not implied. Exposure to absolute maximum rating conditions for extended periods may affect reliability.
Note 2: VIN minimum = -2.0V for pulse width less than 15ns. VIN should not exceed 3.9 Volts.
Note 3: When VIN < 0V or VIN > VCCIO.
Note 4: Not more than one output should be shorted at a time. Duration of the short should not exceed 30 seconds.
Recommended Operating Conditions
CPU Speed | Temperature | Vss | VccInt | VccIO | VccP
150 - 225 MHz | -55°C to +125°C (TC) | 0V | 2.5V ±5% | 3.3V ±5% | 2.5V ±5%
Note: VccIO should not exceed VccInt by greater than 1.2V during the power-up sequence.
Note: Applying a logic high state to any I/O pin before VccInt becomes stable is not recommended.
Note: As specified in IEEE 1149.1 (JTAG), the JTMS pin must be held low during reset to avoid entering JTAG test mode. Refer to the RM7000 Family Users Manual, Appendix E.
DC Electrical Characteristics
Parameter | Minimum | Maximum | Conditions
VOL | | 0.1V | |IOUT| = 20µA
VOH | VCCIO - 0.1V | | |IOUT| = 20µA
VOL | | 0.4V | |IOUT| = 4mA
VOH | 2.4V | | |IOUT| = 4mA
VIL | -0.5V | 0.2 x VCCIO |
VIH | 0.7 x VCCIO | VCCIO + 0.5V |
IIN | | ±20µA | VIN=0
IIN | | ±20µA | VIN=VCCIO
CIN | | 10pF |
COUT | | 10pF |
Power Consumption
VccInt Power (mWatts), by CPU Clock Speed
Parameter | Condition | 150 MHz Typ (1) / Max (2) | 200 MHz Typ (1) / Max (2) | 210 MHz Typ (1) / Max (2) | 225 MHz Typ (1) / Max (2)
Standby | No SysAD bus activity | 500 | 1000 | 1500 | 2000
Active | R4000 write protocol with no FPU operation (integer instruction only) | 2200 / 4400 | 2700 / 5400 | 2800 / 5600 | 3800 / 7600
Active | Write re-issue or pipelined writes with superscalar (integer and floating point instructions) | 2550 / 5100 | 3150 / 6300 | 3300 / 6600 | 4250 / 8500
Note 1: Typical integer instruction mix and cache miss rates with worst case supply voltage.
Note 2: Worst case instruction mix with worst case supply voltage.
Note: I/O supply power is application dependent, but typically <10% of VccInt.
AC Electrical Characteristics – Clock Parameters
Parameter | Symbol | Test Condition | 150 MHz | 200 MHz | 210 MHz | 225 MHz | Units
SysClock High | tSCHIGH | Transition < 5ns | 3 min | 3 min | 3 min | 3 min | ns
SysClock Low | tSCLOW | Transition < 5ns | 3 min | 3 min | 3 min | 3 min | ns
SysClock Frequency | | | 25 min, 75 max | 25 min, 75 max | 25 min, 70 max | 25 min, 75 max | MHz
SysClock Period | tSCP | | 40 max | 40 max | 40 max | 40 max | ns
Clock Jitter for SysClock | tJITTERIN | | ±200 max | ±200 max | ±150 max | ±150 max | ps
SysClock Rise Time | tSCRISE | | 2 max | 2 max | 2 max | 2 max | ns
SysClock Fall Time | tSCFALL | | 2 max | 2 max | 2 max | 2 max | ns
ModeClock Period | tMODECKP | | 256 min | 256 min | 256 min | 256 min | tSCP
JTAG Clock Period | tJTAGCKP | | 4 min | 4 min | 4 min | 4 min | tSCP
Note: Operation of the ACT 7000 is only guaranteed with the Phase Lock Loop enabled.
System Interface Parameters
Parameter | Sym | Test Conditions | 150MHz Min / Max | 200MHz Min / Max | 210MHz Min / Max | 225MHz Min / Max | Units
Data Output (Notes 2, 3) | tDO | mode14..13 = 10 (fastest) | 1.0 / 6.0 | 1.0 / 6.0 | 1.0 / 5.5 | 1.0 / 5.5 | ns
Data Output (Notes 2, 3) | tDO | mode14..13 = 11 | 1.0 / 6.5 | 1.0 / 6.5 | 1.0 / 6.0 | 1.0 / 6.0 | ns
Data Output (Notes 2, 3) | tDO | mode14..13 = 00 | 1.0 / 7.0 | 1.0 / 7.0 | 1.0 / 6.5 | 1.0 / 6.5 | ns
Data Output (Notes 2, 3) | tDO | mode14..13 = 01 (slowest) | 1.0 / 7.5 | 1.0 / 7.5 | 1.0 / 7.0 | 1.0 / 7.0 | ns
Data Setup (Note 4) | tDS | trise, tfall = see above table | 2.5 min | 2.5 min | 2.5 min | 2.5 min | ns
Data Hold (Note 4) | tDH | trise, tfall = see above table | 1.0 min | 1.0 min | 1.0 min | 1.0 min | ns
Notes:
1. Timings are measured from 1.5V of the clock to 1.5V of the signal.
2. Capacitive load for all output timings is 50pF.
3. Data Output timing applies to all signal pins whether tristate I/O or output only.
4. Setup and Hold parameters apply to all signal pins whether tristate I/O or input only.
Boot-Time Interface Parameters
Parameter | Symbol | Test Conditions | Min | Max | Units
Mode Data Setup | tDS | | 4 | | SysClock cycles
Mode Data Hold | tDH | | 0 | | SysClock cycles
Clock Timing
[Figure: SysClock Timing (SysClock waveform annotated with tSCRise, tSCFall, tSCP, and ±tJitterIn)]
System Interface Timing (SysAD, SysCmd, ValidIn*, ValidOut*, etc.)
[Figure: Input Timing (Data relative to SysClock, annotated with tDS and tDH)]
[Figure: Output Timing (Data relative to SysClock, annotated with tDO and tDO MIN)]
Pin Descriptions
The following is a list of control, data, clock, interrupt, and miscellaneous pins of the ACT 7000SC.
Pin Name Type Description
System interface:
ExtRqst* Input External request
Signals that the system interface is submitting an external request.
Release* Output Release interface
Signals that the processor is releasing the system interface to slave state.
RdRdy* Input Read Ready
Signals that an external agent can now accept a processor read.
WrRdy* Input Write Ready
Signals that an external agent can now accept a processor write request.
ValidIn* Input Valid Input
Signals that an external agent is now driving a valid address or data on the SysAD bus and a valid command or data identifier on the SysCmd bus.
ValidOut* Output Valid output
Signals that the processor is now driving a valid address or data on the SysAD bus and a valid command or data identifier on the SysCmd bus.
SysAD(63:0) Input/Output System address/data bus
A 64-bit address and data bus for communication between the processor and an external agent.
SysADC(7:0) Input/Output System address/data check bus
An 8-bit bus containing parity check bits for the SysAD bus during data cycles.
SysCmd(8:0) Input/Output System command/data identifier bus
A 9-bit bus for command and data identifier transmission between the processor and an external agent.
SysCmdP Input/Output System Command/Data Identifier Bus Parity
For the RM7000, unused on input and zero on output.
Clock/Control interface:
SysClock Input System clock
Master clock input used as the system interface reference clock. All output timings are relative to this input clock. Pipeline operation frequency is derived by multiplying this clock up by the factor selected during boot initialization.
VccP Input Vcc for PLL
Quiet VccInt for the internal phase locked loop. Must be connected to VccInt. See Figure 10 for additional PLL filtering information.
VssP Input Vss for PLL
Quiet Vss for the internal phase locked loop. Must be connected to Vss. See Figure 10 for additional PLL filtering information.
Interrupt interface:
Int*(5:0) Input Interrupt
Six general processor interrupts, bit-wise ORed with bits 5:0 of the interrupt register.
NMI* Input Non-maskable interrupt
Non-maskable interrupt, ORed with bit 15 of the interrupt register (bit 7 in R5000 compatibility mode).
JTAG interface:
JTDI Input JTAG data in
JTAG serial data in.
JTCK Input JTAG clock input
JTAG serial clock input.
JTDO Output JTAG data out
JTAG serial data out.
JTMS Input JTAG command
JTAG command signal, signals that the incoming serial data is command data.
Initialization interface:
BigEndian Input Big Endian / Little Endian Control
Allows the system to change the processor addressing mode without rewriting the mode ROM.
VccOK Input Vcc is OK
When asserted, this signal indicates to the ACT 7000 that the 2.5V power supply has been above 2.25V for more than 100 milliseconds and will remain stable. The assertion of VccOK initiates the reading of the boot-time mode control serial stream.
ColdReset* Input Cold Reset
This signal must be asserted for a power on reset or a cold reset. ColdReset must be de-asserted synchronously with SysClock.
Reset* Input Reset
This signal must be asserted for any reset sequence. It may be asserted synchronously or asynchronously for a cold reset, or synchronously to initiate a warm reset. Reset must be de-asserted synchronously with SysClock.
ModeClock Output Boot Mode Clock
Serial boot-mode data clock output at the system clock frequency divided by 256.
ModeIn Input Boot Mode Data In
Serial boot-mode data input.
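The SysClock description states that the pipeline frequency is SysClock multiplied by a factor selected at boot. A minimal sketch of that relationship follows, using the multiplier set and the 75 MHz interface limit from the Features list of this datasheet. How the factor is encoded in the boot-mode serial stream is not covered here, and the function names are illustrative assumptions.

```python
# Clock multipliers and interface limit as listed in the Features section.
MULTIPLIERS = (2, 2.5, 3, 3.5, 4, 4.5, 5, 6, 7, 8, 9)
MAX_SYSCLOCK_MHZ = 75.0


def pipeline_mhz(sysclock_mhz, multiplier):
    """Pipeline frequency for a given SysClock and boot-selected multiplier."""
    if multiplier not in MULTIPLIERS:
        raise ValueError(f"unsupported multiplier: {multiplier}")
    if not 0 < sysclock_mhz <= MAX_SYSCLOCK_MHZ:
        raise ValueError(f"SysClock out of range: {sysclock_mhz} MHz")
    return sysclock_mhz * multiplier


def valid_multipliers(sysclock_mhz, grade_mhz):
    """Multipliers that keep the pipeline at or below the part's speed grade."""
    return [m for m in MULTIPLIERS if sysclock_mhz * m <= grade_mhz]


if __name__ == "__main__":
    print(pipeline_mhz(75.0, 3))           # 75 MHz bus, x3
    print(valid_multipliers(50.0, 225))    # options for a 225 MHz part
```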
For additional detailed information regarding the operation of the Quantum Effect Devices (QED) RISCMark RM7000 64-Bit Superscalar Microprocessor, see the latest QED datasheet and user's guide (www.qedinc.com).
Package Information – "F17" – CQFP 208 Leads
[Package outline drawing: 208-lead CQFP, 52 leads per side (pins 1-52, 53-104, 105-156, 157-208), with Pin 1 chamfer and lead-form Detail "A".]
Body: 1.131 (28.727) SQ / 1.109 (28.169) SQ
Lead spacing: 51 spaces at .0197 (51 spaces at .50)
Units: Inches (Millimeters)
Note: Pin rotation is opposite of QED's PQUAD due to cavity-up construction.
Package Information – "F24" – Inverted CQFP 208 Leads
[Package outline drawing: 208-lead inverted CQFP, 52 leads per side (pins 1-52, 53-104, 105-156, 157-208), with Pin 1 chamfer and lead-form Detail "A".]
Body: 1.131 (28.727) SQ / 1.109 (28.169) SQ
Lead spacing: 51 spaces at .0197 (51 spaces at .50)
Units: Inches (Millimeters)
Note: Pin rotation is identical to QED's PQUAD due to cavity-down construction.
ACT 7000SC Microprocessor CQFP Pinouts – "F17" & "F24"
Pin #  Function    Pin #  Function    Pin #  Function    Pin #  Function
1      VccIO       53     NC          105    VccIO       157    NC
2      NC          54     NC          106    NMI*        158    NC
3      NC          55     NC          107    ExtRqst*    159    NC
4      VccIO       56     VccIO       108    Reset*      160    NC
5      Vss         57     Vss         109    ColdReset*  161    VccIO
6      SysAD4      58     ModeIn      110    VccOK       162    Vss
7      SysAD36     59     RdRdy*      111    BigEndian   163    SysAD28
8      SysAD5      60     WrRdy*      112    VccIO       164    SysAD60
9      SysAD37     61     ValidIn*    113    Vss         165    SysAD29
10     VccInt      62     ValidOut*   114    SysAD16     166    SysAD61
11     Vss         63     Release*    115    SysAD48     167    VccInt
12     SysAD6      64     VccP        116    VccInt      168    Vss
13     SysAD38     65     VssP        117    Vss         169    SysAD30
14     VccIO       66     SysClock    118    SysAD17     170    SysAD62
15     Vss         67     VccInt      119    SysAD49     171    VccIO
16     SysAD7      68     Vss         120    SysAD18     172    Vss
17     SysAD39     69     VccIO       121    SysAD50     173    SysAD31
18     SysAD8      70     Vss         122    VccIO       174    SysAD63
19     SysAD40     71     VccInt      123    Vss         175    SysADC2
20     VccInt      72     Vss         124    SysAD19     176    SysADC6
21     Vss         73     SysCmd0     125    SysAD51     177    VccInt
22     SysAD9      74     SysCmd1     126    VccInt      178    Vss
23     SysAD41     75     SysCmd2     127    Vss         179    SysADC3
24     VccIO       76     SysCmd3     128    SysAD20     180    SysADC7
25     Vss         77     VccIO       129    SysAD52     181    VccIO
26     SysAD10     78     Vss         130    SysAD21     182    Vss
27     SysAD42     79     SysCmd4     131    SysAD53     183    SysADC0
28     SysAD11     80     SysCmd5     132    VccIO       184    SysADC4
29     SysAD43     81     VccIO       133    Vss         185    VccInt
30     VccInt      82     Vss         134    SysAD22     186    Vss
31     Vss         83     SysCmd6     135    SysAD54     187    SysADC1
32     SysAD12     84     SysCmd7     136    VccInt      188    SysADC5
33     SysAD44     85     SysCmd8     137    Vss         189    SysAD0
34     VccIO       86     SysCmdP     138    SysAD23     190    SysAD32
35     Vss         87     VccInt      139    SysAD55     191    VccIO
36     SysAD13     88     Vss         140    SysAD24     192    Vss
37     SysAD45     89     VccInt      141    SysAD56     193    SysAD1
38     SysAD14     90     Vss         142    VccIO       194    SysAD33
39     SysAD46     91     VccIO       143    Vss         195    VccInt
40     VccInt      92     Vss         144    SysAD25     196    Vss
41     Vss         93     Int0*       145    SysAD57     197    SysAD2
42     SysAD15     94     Int1*       146    VccInt      198    SysAD34
43     SysAD47     95     Int2*       147    Vss         199    SysAD3
44     VccIO       96     Int3*       148    SysAD26     200    SysAD35
45     Vss         97     Int4*       149    SysAD58     201    VccIO
46     ModeClock   98     Int5*       150    SysAD27     202    Vss
47     JTDO        99     VccIO       151    SysAD59     203    NC
48     JTDI        100    Vss         152    VccIO       204    NC
49     JTCK        101    NC          153    Vss         205    NC
50     JTMS        102    NC          154    NC          206    NC
51     VccIO       103    NC          155    NC          207    VccIO
52     Vss         104    NC          156    Vss         208    Vss
Sample Ordering Information
Part Number Screening Speed (MHz) Package
ACT-7000SC-150F17I Industrial Temperature 150 208 Lead CQFP
ACT-7000SC-200F17C Commercial Temperature 200 208 Lead CQFP
ACT-7000SC-210F17T Military Temperature 210 208 Lead CQFP
ACT-7000SC-225F17M Military Screening 225 208 Lead CQFP
Part Number Breakdown
ACT– 7000 SC – 225 F17 M
ACT = Aeroflex Circuit Technology
7000 = Base Processor Type
SC = Cache Style
  SC = Secondary Cache
225 = Maximum Pipeline Freq.
  150 = 150MHz
  200 = 200MHz
  210 = 210MHz
  225 = 225MHz
  240 = 240MHz (Future Option)
  250 = 250MHz (Future Option)
  266 = 266MHz (Future Option)
F17 = Package Type & Size (Surface Mount Package)
  F17 = 1.120" SQ 208 Lead CQFP
  F24 = 1.120" SQ Inverted 208 Lead CQFP
M = Screening
  C = Commercial Temp, 0°C to +70°C
  I = Industrial Temp, -40°C to +85°C
  T = Military Temp, -55°C to +125°C
  M = Military Temp, -55°C to +125°C, Screened *
  Q = MIL-PRF-38534 Compliant/SMD if applicable
* Screened to the individual test methods of MIL-STD-883
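The breakdown above maps directly onto a small parser. The following is a sketch only, assuming part numbers are written with plain ASCII hyphens as in the Sample Ordering Information table (for example ACT-7000SC-225F17M). The regular expression, dictionary names, and returned field names are illustrative and are not an Aeroflex-defined format.

```python
import re

# Field layout: ACT-<base><cache>-<speed><package><screening>
PART_RE = re.compile(
    r"^ACT-(?P<base>\d{4})(?P<cache>SC)-"
    r"(?P<speed>\d{3})(?P<package>F17|F24)(?P<screening>[CITMQ])$"
)

PACKAGES = {
    "F17": '1.120" SQ 208 Lead CQFP',
    "F24": '1.120" SQ Inverted 208 Lead CQFP',
}

SCREENING = {
    "C": "Commercial Temp, 0°C to +70°C",
    "I": "Industrial Temp, -40°C to +85°C",
    "T": "Military Temp, -55°C to +125°C",
    "M": "Military Temp, -55°C to +125°C, Screened",
    "Q": "MIL-PRF-38534 Compliant/SMD if applicable",
}


def parse_part_number(part_number):
    """Split an ACT 7000SC part number into its documented fields."""
    match = PART_RE.match(part_number.strip().upper())
    if match is None:
        raise ValueError(f"unrecognized part number: {part_number!r}")
    return {
        "base": match["base"],
        "cache": match["cache"],
        "speed_mhz": int(match["speed"]),
        "package": match["package"],
        "package_desc": PACKAGES[match["package"]],
        "screening": match["screening"],
        "screening_desc": SCREENING[match["screening"]],
    }


if __name__ == "__main__":
    print(parse_part_number("ACT-7000SC-225F17M"))
```

The parser accepts any three-digit speed field rather than only the currently listed grades, so the future-option speeds in the breakdown would parse without changes.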
This document may, wholly or partially, be subject to change without notice. Aeroflex reserves the right to make changes to its products or specifications at any time without notice.
Aeroflex will not be held responsible for any damage to the user or any property that may result from accidents, misuse, or any other causes arising during operation of the user's unit.
Aeroflex does not assume any responsibility for use of any circuitry described other than the circuitry embodied in an Aeroflex product. The company makes no representations that the circuitry described herein is free from patent infringement or other rights of third parties, which may result from its use. No license is granted by implication or otherwise under any patent, patent rights, or other rights, of Aeroflex.
The QED logo and RISCMark are trademarks of Quantum Effect Devices, Inc.
MIPS is a registered trademark of MIPS Technologies, Inc. All other trademarks are the respective property of the trademark holders.
Aeroflex Circuit Technology 35 South Service Road Plainview New York 11803 www.aeroflex.com
Telephone: (516) 694-6700 FAX: (516) 694-6715
Toll Free Inquiries: (800) 843-1553
E-Mail: sales-mcm@aeroflex.com