Information in this document is provided in connection with Intel products. No license, express or implied, by
estoppel or otherwise, to any intellectual property rights is granted by this document. Except as provided in
Intel’s Terms and Conditions of Sale for such products, Intel assumes no liability whatsoever, and Intel
disclaims any express or implied warranty, relating to sale and/or use of Intel products including liability or
warranties relating to fitness for a particular purpose, merchantability, or infringement of any patent, copyright
or other intellectual property right. Intel products are not intended for use in medical, life saving, or life
sustaining applications. Intel may make changes to specifications and product descriptions at any time, without
notice. Contact your local Intel sales office or your distributor to obtain the latest specifications and before
placing your product order.
Intel retains the right to make changes to specifications and product descriptions at any time, without notice.
*Third party brands and names are the property of their respective owners.
Copies of documents which have an ordering number and are referenced in this document, or other Intel
literature, may be obtained from:
Intel Corporation
P.O. Box 7641
Mt. Prospect IL 60056-7641
or call 1-800-879-4683
2.4 Characteristic Curves
2.5 Test Load Circuit
2.6 Absolute Maximum Ratings
2.7 DC Characteristics
2.8 AC Specifications
2.9 Design Considerations
3.0 MECHANICAL DATA
3.3 Package Thermal Specification
4.0 WAVEFORMS
5.0 REVISION HISTORY
Figure 3. Multiple Register Sets Are Stored On-Chip
Figure 4. Connection Recommendations for Low Current Drive Network
Figure 5. Connection Recommendations for High Current Drive Network
Figure 6. Typical Supply Current vs. Case Temperature
Figure 7. Typical Current vs. Frequency (Room Temp)
Figure 8. Typical Current vs. Frequency (Hot Temp)
Figure 9. Worst-Case Voltage vs. Output Current on Open-Drain Pins
Figure 10. Capacitive Derating Curve
Figure 11. Test Load Circuit for Three-State Output Pins
Figure 12. Test Load Circuit for Open-Drain Output Pins
Figure 13. Drive Levels and Timing Relationships for 80960MC Signals
Figure 14. Timing Relationship of L-Bus Signals
Figure 15. System and Processor Clock Relationship
The 80960MC, a member of the i960 processor family, is ideally suited
for embedded applications. It includes a 512-byte instruction cache
and a built-in interrupt controller. The 80960MC has a large register
set, multiple parallel execution units and a high-bandwidth burst bus.
Using advanced RISC technology, this processor is capable of execution
rates in excess of 9.4 million instructions per second*. The 80960MC
is well-suited for a wide range of applications including non-impact
printers, I/O control and specialty instrumentation. The embedded
market includes applications as diverse as industrial automation,
avionics, image processing, graphics and networking. These types of
applications require high integration, low power consumption, quick
interrupt response times and high performance. Since time to market is
critical, embedded processors must be easy to use in both hardware and
software designs.
* Relative to Digital Equipment Corporation’s VAX-11/780* at 1 MIPS
All members of the i960 processor family share a common core
architecture which utilizes RISC technology so that, except for
special functions, the family members are object-code compatible. Each
new processor in the family adds its own special set of functions to
the core to satisfy the needs of a specific application or range of
applications in the embedded market.
The 80960MC includes an integrated Floating Point
Unit (FPU), a Memory Management Unit (MMU),
multitasking support, and multiprocessor support.
Two commercial members of the i960® family provide similar features:
the 80960KB processor with integrated FPU and the 80960KA without
floating-point.
[Figure: address space 0000 0000H to FFFF FFFFH; architecturally
defined data structures; fetch, load, store; instruction cache;
instruction stream; instruction execution; sixteen 32-bit global
registers (g0 to g15); sixteen 32-bit local registers (r0 to r15) with
register cache; four 80-bit floating-point registers; control
registers; processor state registers (instruction pointer, arithmetic
controls, process controls, trace controls)]
Figure 1. 80960MC Programming Environment
80960MC
1.1 Key Performance Features
The 80960 architecture is based on the most recent advances in
microprocessor technology and is grounded in Intel’s long experience
in the design and manufacture of embedded microprocessors. Many
features contribute to the 80960MC’s exceptional performance:
1. Large Register Set. Having a large number of registers reduces the
number of times that a processor needs to access memory. Modern
compilers can take advantage of this feature to optimize execution
speed. For maximum flexibility, the 80960MC provides thirty-two 32-bit
registers. (See Figure 1.)
2. Fast Instruction Execution. Simple functions make up the bulk of
instructions in most programs so that execution speed can be improved
by ensuring that these core instructions are executed as quickly as
possible. The most frequently executed instructions such as
register-register moves, add/subtract, logical operations and shifts
execute in one to two cycles. (Table 1 contains a list of
instructions.)
3. Load/Store Architecture. One way to improve execution speed is to
reduce the number of times that the processor must access memory to
perform an operation. As with other processors based on RISC
technology, the 80960MC has a Load/Store architecture. As such, only
the LOAD and STORE instructions reference memory; all other
instructions operate on registers. This type of architecture
simplifies instruction decoding and is used in combination with other
techniques to increase parallelism.
4. Simple Instruction Formats. All instructions in the 80960MC are 32
bits long and must be aligned on word boundaries. This alignment makes
it possible to eliminate the instruction alignment stage in the
pipeline. To simplify the instruction decoder, there are only five
instruction formats; each instruction uses only one format. (See
Figure 2.)
5. Overlapped Instruction Execution. Load operations allow execution
of subsequent instructions to continue before the data has been
returned from memory, so that these instructions can overlap the load.
The 80960MC manages this process transparently to software through the
use of a register scoreboard. Conditional instructions also make use
of a scoreboard so that subsequent unrelated instructions may be
executed while the conditional instruction is pending.
6. Integer Execution Optimization. When the result of an arithmetic
execution is used as an operand in a subsequent calculation, the value
is sent immediately to its destination register. At the same time, the
value is put on a bypass path to the ALU, thereby saving the time that
otherwise would be required to retrieve the value for the next
operation.
7. Bandwidth Optimizations. The 80960MC gets optimal use of its memory
bus bandwidth because the bus is tuned for use with the on-chip
instruction cache: instruction cache line size matches the maximum
burst size for instruction fetches. The 80960MC automatically fetches
four words in a burst and stores them directly in the cache. Due to
the size of the cache and the fact that it is continually filled in
anticipation of needed instructions in the program flow, the 80960MC
is relatively insensitive to memory wait states. The benefit is that
the 80960MC delivers outstanding performance even with a low cost
memory system.
8. Cache Bypass. When a cache miss occurs, the processor fetches the
needed instruction then sends it on to the instruction decoder at the
same time it updates the cache. Thus, no extra time is spent to load
and read the cache.
Table 1. 80960MC Instruction Set
Data Movement: Load; Store; Move; Load Address; Load Physical Address
Process Management: Schedule Process; Saves Process; Resume Process;
  Load Process Time; Modify Process Controls; Wait; Conditional Wait;
  Signal; Receive; Conditional Receive; Send; Send Service; Atomic Add;
  Atomic Modify
Floating Point: Log Natural; Exponent; Classify; Copy Real Extended;
  Compare
Logical: And; Not And; And Not; Or; Exclusive Or; Not Or; Or Not; Nor;
  Exclusive Nor; Not; Nand; Rotate
Comparison: Compare; Conditional Compare; Compare and Increment;
  Compare and Decrement
Branch: Unconditional Branch; Conditional Branch; Compare and Branch
Bit and Bit Field: Set Bit; Clear Bit; Not Bit; Check Bit; Alter Bit;
  Scan For Bit; Scan Over Bit; Extract; Modify
String: Move String; Move Quick String; Fill String; Compare String;
  Scan Byte for Equal
Conversion: Convert Real to Integer; Convert Integer to Real
Decimal: Move; Add with Carry; Subtract with Carry
Call/Return: Call; Call Extended; Call System; Return; Branch and Link
Arithmetic: Add; Subtract; Multiply; Divide; Remainder; Modulo; Shift
Fault: Conditional Fault; Synchronize Faults
Debug: Modify Trace Controls; Mark; Force Mark
Miscellaneous: Flush Local Registers; Inspect Access; Modify Arithmetic
  Controls; Test Condition Code
Control: Opcode | Displacement
Compare and Branch: Opcode | Reg/Lit | Reg | M | Displacement
Register to Register: Opcode | Reg | Reg/Lit | Modes | Ext’d Op | Reg/Lit
Memory Access-Short: Opcode | Reg | Base | M | X | Offset
Memory Access-Long: Opcode | Reg | Base | Mode | Scale | xx | Offset,
  followed by Displacement
Figure 2. Instruction Formats
1.1.1 Memory Space And Addressing Modes
The 80960MC allows each task (process) to address a logical memory
space of up to 4 Gbytes. Each task’s address space is divided into
four 1-Gbyte regions and each region can be mapped to physical
addresses by zero, one, or two levels of page tables. The region with
the highest addresses (Region 3) is common to all tasks.
In keeping with RISC design principles, the number of addressing modes
is minimal yet includes all those necessary to ensure efficient
execution of high-level languages such as Ada, C, and Fortran. Table 2
lists the memory addressing modes.
Table 2. Memory Addressing Modes
• 12-Bit Offset
• 32-Bit Offset
• Register-Indirect
• Register + 12-Bit Offset
• Register + 32-Bit Offset
• Register + (Index-Register x Scale-Factor)
• Register x Scale-Factor + 32-Bit Displacement
• Register + (Index-Register x Scale-Factor) + 32-Bit Displacement
• Scale-Factor is 1, 2, 4, 8 or 16
1.1.2 Data Types
The 80960MC recognizes the following data types:
Numeric:
• 8-, 16-, 32- and 64-bit ordinals
• 8-, 16-, 32- and 64-bit integers
• 32-, 64- and 80-bit real numbers
Non-Numeric:
• Bit
• Bit Field
• Triple Word (96 bits)
• Quad-Word (128 bits)
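The most general addressing mode in Table 2, Register + (Index-Register x Scale-Factor) + 32-Bit Displacement, can be illustrated with a short Python sketch. It is not taken from the datasheet: the function name `effective_address` and its parameters are hypothetical, and wrapping the result to 32 bits is an assumption based on the 4-Gbyte address space. The simpler modes fall out by leaving terms at their defaults.

```python
# Illustrative sketch only; names and wrap-around behaviour are assumptions.
VALID_SCALES = (1, 2, 4, 8, 16)   # scale factors listed in Table 2
ADDRESS_MASK = 0xFFFFFFFF         # assumption: addresses wrap within 4 Gbytes


def effective_address(base=0, index=0, scale=1, displacement=0):
    """Return base + index * scale + displacement, truncated to 32 bits."""
    if scale not in VALID_SCALES:
        raise ValueError(f"scale factor must be one of {VALID_SCALES}")
    return (base + index * scale + displacement) & ADDRESS_MASK


# Register + (Index-Register x Scale-Factor) + 32-Bit Displacement
address = effective_address(base=0x1000, index=3, scale=4, displacement=0x20)
print(hex(address))

# Register-Indirect is the same computation with only the base supplied.
indirect = effective_address(base=0x2000)
```

Treating every mode as a special case of one sum is a convenience of the sketch; the datasheet does not say how the hardware computes the address.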
1.1.3 Large Register Set
The 80960MC programming environment includes a large number of
registers. Thirty-six registers are available at any time; this
greatly reduces the number of memory accesses required to perform
algorithms, which leads to greater instruction processing speed.
Two types of general-purpose registers are available: local and
global. The 20 global registers consist of sixteen 32-bit registers
(G0 through G15) and four 80-bit registers (FP0 through FP3). These
registers perform the same function as the general-purpose registers
provided in other popular microprocessors. The term global refers to
the fact that these registers retain their contents across procedure
calls.
The local registers are procedure-specific. For each procedure call,
the 80960MC allocates 16 local registers (R0 through R15). Each local
register is 32 bits wide. Any register can also be used for
floating-point operations; the 80-bit floating-point registers are
provided for extended precision.
1.1.4 Multiple Register Sets
To further increase the efficiency of the register set, multiple sets
of local registers are stored on-chip (see Figure 3). This cache holds
up to four local register frames, which means that up to three
procedure calls can be made without having to access the procedure
stack resident in memory.
Although programs may have procedure calls nested many calls deep, a
program typically oscillates back and forth between only two to three
levels. As a result, with four stack frames in the cache, the
probability of having a free frame available on the cache when a call
is made is very high. Runs of representative C-language programs show
that 80% of the calls are handled without needing to access memory.
When four or more procedures are active and a new procedure is called,
the 80960MC moves the oldest local register set in the stack-frame
cache to a procedure stack in memory to make room for a new set of
registers. Global register G15 is the frame pointer (FP) to the
procedure stack.
Global registers are not exchanged on a procedure call, but retain
their contents, making them available to all procedures for fast
parameter passing.
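The behaviour of the four-frame local register cache can be modelled with a few lines of Python. This is a toy model, not a description of the hardware: the class `LocalRegisterCache` and its method names are hypothetical. Reloading a spilled frame only when a return reaches it is an assumption, since the text describes only the spill on call.

```python
from collections import deque


class LocalRegisterCache:
    """Toy model of the on-chip cache of local register frames."""

    def __init__(self, frames=4):
        self.frames = frames
        # The currently running procedure already owns one frame.
        self.on_chip = deque([self._new_frame()])  # oldest frame on the left
        self.spilled = []                          # frames saved to memory
        self.memory_accesses = 0                   # spills plus reloads

    @staticmethod
    def _new_frame():
        return [0] * 16  # sixteen 32-bit local registers, R0 through R15

    def call(self):
        """Allocate a frame for a new procedure, spilling the oldest if full."""
        if len(self.on_chip) == self.frames:
            self.spilled.append(self.on_chip.popleft())
            self.memory_accesses += 1
        self.on_chip.append(self._new_frame())

    def ret(self):
        """Release the current frame; reload the caller's frame if spilled."""
        if len(self.on_chip) + len(self.spilled) <= 1:
            raise RuntimeError("no caller to return to")
        self.on_chip.pop()
        if not self.on_chip:
            # Assumption: a spilled frame is reloaded when it is needed again.
            self.on_chip.append(self.spilled.pop())
            self.memory_accesses += 1


cache = LocalRegisterCache()
for _ in range(3):
    cache.call()          # three nested calls fit in the four frames
accesses_after_three_calls = cache.memory_accesses
cache.call()              # the fourth call forces the oldest frame out
accesses_after_four_calls = cache.memory_accesses
```

In this model the first three nested calls cause no memory traffic, which matches the claim in the text; the fourth call is the first to spill a frame.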
1.1.5 Instruction Cache
To further reduce memory accesses, the 80960MC includes a 512-byte
on-chip instruction cache. The instruction cache is based on the
concept of locality of reference; most programs are typically not
executed in a steady stream but consist of many branches, loops and
procedure calls that lead to jumping back and forth in the same small
section of code. Thus, by maintaining a block of instructions in
cache, the number of memory references required to read instructions
into the processor is greatly reduced.
To load the instruction cache, instructions are fetched in 16-byte
blocks; up to four instructions can be fetched at one time. An
efficient prefetch algorithm increases the probability that an
instruction is already in the cache when it is needed.
Code for small loops often fits entirely within the cache, leading to
an increase in processing speed since further memory references might
not be necessary until the program exits the loop. Similarly, when
calling short procedures, the code for the calling procedure is likely
to remain in the cache so it is there on the procedure’s return.
1.1.6 Register Scoreboarding
The instruction decoder is optimized in several ways. One optimization
method is the ability to overlap instructions by using register
scoreboarding.
Register scoreboarding occurs when a LOAD moves a variable from memory
into a register. When the instruction initiates, a scoreboard bit on
the target register is set. Once the register is loaded, the bit is
reset. In between, any reference to the register contents is
accompanied by a test of the scoreboard bit to ensure that the load
has completed before processing continues. Since the processor does
not need to wait for the LOAD to complete, it can execute additional
instructions placed between the LOAD and the instruction that uses the
register contents. Consider a sequence in which a LOAD is followed by
two unrelated instructions and then an ADD that uses the loaded
register.
In essence, the two unrelated instructions between LOAD and ADD are
executed “for free” (i.e., take no apparent time to execute) because
they are executed while the register is being loaded. Up to three load
instructions can be pending at one time with three corresponding
scoreboard bits set. By exploiting this feature, system programmers
and compiler writers have a useful tool for optimizing execution
speed.
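The effect of scoreboarding on instruction ordering can be shown with a small timing model in Python. Everything in it is an assumption made for illustration: the function `run`, the three-cycle load latency, the one-cycle cost of every other instruction, and the register names. It does not model the limit of three pending loads or any real 80960MC timing.

```python
def run(program, load_latency=3):
    """Return the cycle count for a program under a toy scoreboard model.

    Each instruction is (operation, destination, sources). A load marks its
    destination as busy until issue cycle + load_latency; an instruction that
    reads a busy register stalls until that register is ready.
    """
    ready_at = {}  # register -> cycle at which its pending load completes
    cycle = 0
    for operation, destination, sources in program:
        # Scoreboard test: wait for every source register to be ready.
        cycle = max([cycle] + [ready_at.get(reg, 0) for reg in sources])
        if operation == "load":
            ready_at[destination] = cycle + load_latency
        cycle += 1  # assumption: every instruction issues in one cycle
    return max([cycle] + list(ready_at.values()))


load = ("load", "r4", [])
unrelated_1 = ("mov", "r8", ["r9"])
unrelated_2 = ("mov", "r10", ["r11"])
add = ("add", "r6", ["r4", "r5"])

# Unrelated work placed between the LOAD and the ADD that needs r4.
overlapped = run([load, unrelated_1, unrelated_2, add])

# Same instructions, but the ADD immediately follows the LOAD and stalls.
stalled = run([load, add, unrelated_1, unrelated_2])

print(overlapped, stalled)
```

Under these assumptions the overlapped ordering finishes two cycles sooner than the stalled one, which is the "for free" effect described above.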
[Figure: register cache holding one of four local register sets]
Figure 3. Multiple Register Sets Are Stored On-Chip
1.1.7 Memory Management and Protection
The 80960MC is ideal for multitasking applications that require
software protection and a large address space. To ensure the highest
level of performance possible, the memory management unit (MMU) and
translation look-aside buffer (TLB) are contained on-chip.
The 80960MC supports a conventional form of demand-paged virtual
memory in which the address space is divided into 4-Kbyte pages.
Studies indicate that a 4-Kbyte page is the optimum size for a broad
range of applications.
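The arithmetic implied by four 1-Gbyte regions and 4-Kbyte pages can be sketched as follows. The function `split_address` is hypothetical and only divides a 32-bit logical address by the sizes quoted in the text. It makes no claim about the actual page table entry layout, which this section does not give.

```python
PAGE_SIZE = 4 * 1024        # 4-Kbyte pages
REGION_SIZE = 1 << 30       # four 1-Gbyte regions in a 4-Gbyte space
PAGES_PER_REGION = REGION_SIZE // PAGE_SIZE


def split_address(address):
    """Split a 32-bit logical address into (region, page in region, offset)."""
    if not 0 <= address < 1 << 32:
        raise ValueError("address must fit in 32 bits")
    region = address // REGION_SIZE
    page = (address % REGION_SIZE) // PAGE_SIZE
    offset = address % PAGE_SIZE
    return region, page, offset


# An address in Region 3, the region common to all tasks.
region, page, offset = split_address(0xC0001234)
print(region, page, hex(offset), PAGES_PER_REGION)
```

Each region therefore spans 262,144 pages, and the low 12 bits of an address select a byte within its page.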
Each page table entry includes a 2-bit page rights
field that specifies whether the page is a no-access,
read-only, or read-write page. This field is interpreted differently depending on whether the current
task (process) is executing in user or supervisor
mode, as shown below:
1.1.8 Floating-Point Arithmetic
In the 80960MC, floating-point arithmetic is an integral part of the
architecture. Having the floating-point unit integrated on-chip
provides two advantages. First, it improves the performance of the
chip for floating-point applications, since no additional bus overhead
is associated with floating-point calculations, thereby leaving more
time for other bus operations such as I/O. Second, the cost of using
floating-point operations is reduced because a separate coprocessor
chip is not required.
The 80960MC floating-point (real-number) data types include
single-precision (32-bit), double-precision (64-bit) and extended
precision (80-bit) floating-point numbers. Any registers may be used
to execute floating-point operations.
The processor provides hardware support for both
mandatory and recommended portions of IEEE
Standard 754 for floating-point arithmetic, including
all arithmetic, exponential, logarithmic and other
transcendental functions. Table 3 shows execution
times for some representative instructions.
Table 3. Sample Floating-Point Execution Times (µs) at 25 MHz
Function       32-Bit   64-Bit
Add            0.4      0.5
Subtract       0.4      0.5
Multiply       0.7      1.3
Divide         1.3      2.9
Square Root    3.7      3.9
Arctangent     10.1     13.1
Exponent       11.3     12.5
Sine           15.2     16.6
Cosine         15.2     16.6
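The times in Table 3 can be converted to approximate clock counts by multiplying by the 25 MHz clock rate. The Python sketch below does only that arithmetic; the names `TIMES_US` and `clocks` are hypothetical. Because the table values are rounded to 0.1 µs, the results are estimates rather than specified cycle counts.

```python
CLOCK_MHZ = 25  # clock rate stated in the Table 3 heading

# (32-bit, 64-bit) execution times in microseconds, copied from Table 3.
TIMES_US = {
    "Add": (0.4, 0.5),
    "Subtract": (0.4, 0.5),
    "Multiply": (0.7, 1.3),
    "Divide": (1.3, 2.9),
    "Square Root": (3.7, 3.9),
    "Arctangent": (10.1, 13.1),
    "Exponent": (11.3, 12.5),
    "Sine": (15.2, 16.6),
    "Cosine": (15.2, 16.6),
}


def clocks(time_us, clock_mhz=CLOCK_MHZ):
    """Approximate number of clocks for a time in microseconds."""
    return time_us * clock_mhz  # microseconds x MHz = clock periods


for name, (single, double) in TIMES_US.items():
    print(f"{name:12s} {clocks(single):7.1f} {clocks(double):7.1f}")
```

For example, a 0.4 µs 32-bit add corresponds to roughly 10 clocks at 25 MHz.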
1.1.9 Multitasking Support
Multitasking programs commonly involve the monitoring and control of
an external operation, such as the activities of a process controller
or the movements of a machine tool. These programs generally consist
of a number of processes that run independently of one another, but
share a common database or pass data among themselves.
The 80960MC offers several hardware functions designed to support
multitasking systems. One unique feature, called self-dispatching,
allows a processor to switch itself automatically among scheduled
tasks. When self-dispatching is used, all the operating system is
required to do is place the task in the scheduling queue.
When the processor becomes available, it dispatches the task from the
beginning of the queue and then executes it until it becomes blocked,
interrupted, or until its time-slice expires. It then returns the task
to the end of the queue (i.e., automatically reschedules it) and
dispatches the next ready task.
During these operations, no communication between the processor and
the operating system is necessary until the running task is complete
or an interrupt is issued.
1.1.10 Synchronization and Communication
The 80960MC also offers instructions to set up and test semaphores to
ensure that concurrent tasks remain synchronized and no data
inconsistency results. Special data structures, known as communication
ports, provide the means for exchanging parameters and data
structures. Transmission of information by means of communication
ports is asynchronous and automatically buffered by the processor.
Communication between tasks by means of ports can be carried out
independently of the operating system. Once the ports have been set up
by the programmer, the processor handles the message passing
automatically.
1.1.11 High Bandwidth Local Bus
The 80960MC CPU resides on a high-bandwidth address/data bus known as
the local bus (L-Bus). The L-Bus provides a direct communication path
between the processor and the memory and I/O subsystem interfaces. The
processor uses the L-Bus to fetch instructions, manipulate memory and
respond to interrupts. L-Bus features include:
• 32-bit multiplexed address/data path
• Four-word burst capability which allows transfers from 1 to 16 bytes
at a time
• High bandwidth reads and writes with 66.7 MBytes/s burst (at 25 MHz)
• Special signal to indicate whether a memory transaction can be cached
Table 4 defines L-bus signal names and functions; Table 5 defines
other component-support signals such as interrupt lines.
1.1.12 Multiple Processor Support
One means of increasing the processing power of a system is to run two
or more processors in parallel. Since microprocessors are not
generally designed to run in tandem with other processors, designing
such a system is usually difficult and costly.
The 80960MC solves this problem by offering a number of functions to
coordinate the actions of multiple processors. First, messages can be
passed between processors to initiate actions such as flushing a
cache, stopping or starting another processor, or preempting a task.
The messages are passed on the bus and allow multiple processors to
run together smoothly, with rare need to lock the bus or memory.
Second, a set of synchronization instructions helps maintain memory
coherency. These instructions permit several processors to modify
memory at the same time without inserting inaccuracies or ambiguities
into shared data structures.
The self-dispatching mechanism, in addition to being used in
single-processor systems, provides the means to increase the
performance of a system merely by adding processors. Each processor
can either work on the same pool of tasks (sharing the same queue with
other processors) or can be restricted to its own queue.
When processors perform system operations, they synchronize themselves
by using atomic operations and sending special messages between each
other. In theory, changing the number of processors in a system does
not require a software change. Software executes correctly regardless
of the number of processors in the system; systems with more
processors simply execute faster.
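The idea that several self-dispatching processors can share one scheduling queue can be sketched with a simple simulation. It is a deliberately simplified model, not the 80960MC mechanism: the function `run_tasks`, the abstract work units, the fixed time slice and the lockstep rounds are all assumptions. Blocking, interrupts and priorities are ignored.

```python
from collections import deque


def run_tasks(tasks, processors, time_slice=2):
    """Simulate processors dispatching from one shared queue.

    tasks maps a task name to its remaining work units. Returns the elapsed
    time and the order in which tasks finished.
    """
    queue = deque(tasks.items())
    finished = []
    elapsed = 0
    while queue:
        # Each available processor dispatches a task from the head of the queue.
        running = [queue.popleft() for _ in range(min(processors, len(queue)))]
        elapsed += time_slice  # simplification: all processors run in lockstep
        for name, remaining in running:
            remaining -= time_slice
            if remaining > 0:
                queue.append((name, remaining))  # back to the end of the queue
            else:
                finished.append(name)
    return elapsed, finished


work = {"a": 4, "b": 2, "c": 6}
one_cpu = run_tasks(work, processors=1)
two_cpus = run_tasks(work, processors=2)
three_cpus = run_tasks(work, processors=3)
print(one_cpu, two_cpus, three_cpus)
```

The same task list runs unchanged with one, two or three simulated processors and only the elapsed time differs, mirroring the claim that adding processors need not require a software change.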
1.1.13 Interrupt Handling
The 80960MC can be interrupted in two ways: by the activation of one
of four interrupt pins or by sending a message on the processor’s data
bus.
The 80960MC is unusual in that it automatically handles interrupts on
a priority basis and can keep track of pending interrupts through its
on-chip interrupt controller. Two of the interrupt pins can be
configured to provide 8259A-style handshaking for expansion beyond
four interrupt lines.
An interrupt message is made up of a vector number and an interrupt
priority. When the interrupt priority is greater than that of the
currently running task, the processor accepts the interrupt and uses
the vector as an index into the interrupt table. When the priority of
the interrupt message is below that of the current task, the processor
saves the information in a section of the interrupt table reserved for
pending interrupts.
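The accept-or-pend decision for an interrupt message can be sketched as below. The class `InterruptController`, its methods and the handler names are hypothetical. The `resume` step, which delivers a pending interrupt once the running priority has dropped, is an assumption about how pending entries are later serviced. The text does not say what happens when the priorities are equal; this sketch pends such a message.

```python
class InterruptController:
    """Toy model of priority-based interrupt acceptance."""

    def __init__(self, interrupt_table):
        self.interrupt_table = interrupt_table  # vector number -> handler
        self.pending = []                       # saved (priority, vector) pairs

    def post(self, vector, priority, current_priority):
        """Accept the interrupt if it outranks the running task, else save it."""
        if priority > current_priority:
            return self.interrupt_table[vector]  # vector indexes the table
        self.pending.append((priority, vector))
        return None

    def resume(self, current_priority):
        """Assumed behaviour: deliver the highest pending interrupt that now
        outranks the running task, if there is one."""
        eligible = [entry for entry in self.pending if entry[0] > current_priority]
        if not eligible:
            return None
        best = max(eligible)
        self.pending.remove(best)
        return self.interrupt_table[best[1]]


controller = InterruptController({10: "timer_handler", 20: "uart_handler"})
accepted = controller.post(vector=20, priority=5, current_priority=3)
deferred = controller.post(vector=10, priority=2, current_priority=3)
later = controller.resume(current_priority=1)
```

Here the priority-5 message is accepted at once, while the priority-2 message waits until the running priority falls below it.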
1.1.14 Debug Features
The 80960MC has built-in debug capabilities, including two types of
breakpoints and six trace modes. Debug features are controlled by two
internal 32-bit registers: the Process-Controls Word and the
Trace-Controls Word. By setting bits in these control words, a
software debug monitor can closely control how the processor responds
during program execution.
The 80960MC has both hardware and software breakpoints. It provides
two hardware breakpoint registers on-chip which, by using a special
command, can be set to any value. When the instruction pointer matches
either breakpoint register value, the breakpoint handling routine is
automatically called.
The 80960MC also provides software breakpoints through the use of two
instructions: MARK and FMARK. These can be placed at any point in a
program and cause the processor to halt execution at that point and
call the breakpoint handling routine. The breakpoint mechanism is easy
to use and provides a powerful debugging tool.
Tracing is available for instructions (single step
execution), calls and returns and branching. Each
trace type may be enabled separately by a special
debug instruction. In each case, the 80960MC
executes the instruction first and then calls a trace
handling routine (usually part of a software debug
monitor). Further program execution is halted until
the routine completes, at which time execution
resumes at the next instruction. The 80960MC’s
tracing mechanisms, implemented completely in
hardware, greatly simplify the task of software test
and debug.
1.1.15 Fault Detection
The 80960MC has an automatic mechanism to
handle faults. There are ten fault types, including
floating point, trace and arithmetic faults. When the
processor detects a fault, it automatically calls the
appropriate fault handling routine and saves the
current instruction pointer and necessary state information to make efficient recovery possible. The
processor posts diagnostic information on the type of
fault to a Fault Record. Like interrupt handling
routines, fault handling routines are usually written to
meet the needs of specific applications and are often
included as part of the operating system or kernel.
For each of the ten fault types, numerous subtypes provide specific
information about a fault. For example, a floating point fault may
have the subtype set to an Overflow or Zero-Divide fault. The fault
handler can use this specific information to respond correctly to the
fault.