Intel 80960MC User Manual

80960MC
EMBEDDED 32-BIT MICROPROCESSOR
WITH INTEGRATED FLOATING-POINT UNIT
AND MEMORY MANAGEMENT UNIT
Commercial
High-Performance Embedded Architecture
25 MHz
On-Chip Floating Point Unit
  - Supports IEEE 754 Floating Point Standard
  - Full Transcendental Support
  - Four 80-Bit Registers
  - 13.6 Million Whetstones/s (Single Precision) at 25 MHz
512-Byte On-Chip Instruction Cache
  - Direct Mapped
  - Parallel Load/Decode for Uncached Instructions
Multiple Register Sets
  - Sixteen Global 32-Bit Registers
  - Sixteen Local 32-Bit Registers
  - Four Local Register Sets Stored On-Chip (Sixteen 32-Bit Registers per Set)
  - Register Scoreboarding
On-Chip Memory Management Unit
  - 4 Gbyte Virtual Address Space per Task
  - 4 Kbyte Pages with Supervisor/User Protection
Built-in Interrupt Controller
  - 32 Priority Levels
  - 248 Vectors
  - Supports M8259A
  - 3.4 µs Latency @ 25 MHz
Easy to Use, High Bandwidth 32-Bit Bus
  - 66.7 Mbytes/s Burst
  - Up to 16 Bytes Transferred per Burst
Multitasking and Multiprocessor Support
  - Automatic Task Dispatching
  - Prioritized Task Queues
Advanced Package Technology
  - 132-Lead Ceramic Pin Grid Array
[Block diagram: instruction fetch unit, 512-byte instruction cache, instruction decoder, micro-instruction sequencer, micro-instruction ROM, 32-bit instruction execution unit, 80-bit FPU with four 80-bit FP registers, sixteen 32-bit global registers, 64- by 32-bit local register cache, MMU, and 32-bit bus control logic driving the 32-bit burst bus.]
Figure 1. The 80960MC Processor’s Highly Parallel Architecture
© INTEL CORPORATION, 2004 September, 2004 Order Number: 273123-002
Information in this document is provided in connection with Intel products. No license, express or implied, by estoppel or otherwise, to any intellectual property rights is granted by this document. Except as provided in Intel’s Terms and Conditions of Sale for such products, Intel assumes no liability whatsoever, and Intel disclaims any express or implied warranty, relating to sale and/or use of Intel products including liability or warranties relating to fitness for a particular purpose, merchantability, or infringement of any patent, copyright or other intellectual property right. Intel products are not intended for use in medical, life saving, or life sustaining applications. Intel may make changes to specifications and product descriptions at any time, without notice. Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.
Intel retains the right to make changes to specifications and product descriptions at any time, without notice. *Third party brands and names are the property of their respective owners. Copies of documents which have an ordering number and are referenced in this document, or other Intel
literature, may be obtained from: Intel Corporation
P.O. Box 7641 Mt. Prospect IL 60056-7641 or call 1-800-879-4683
Many documents are available for download from Intel’s website at http://www.intel.com Copyright © Intel Corporation 1997
1.0 THE i960® MC PROCESSOR
1.1 Key Performance Features
1.1.1 Memory Space And Addressing Modes
1.1.2 Data Types
1.1.3 Large Register Set
1.1.4 Multiple Register Sets
1.1.5 Instruction Cache
1.1.6 Register Scoreboarding
1.1.7 Memory Management and Protection
1.1.8 Floating-Point Arithmetic
1.1.9 Multitasking Support
1.1.10 Synchronization and Communication
1.1.11 High Bandwidth Local Bus
1.1.12 Multiple Processor Support
1.1.13 Interrupt Handling
1.1.14 Debug Features
1.1.15 Fault Detection
1.1.16 Inter-Agent Communications (IAC)
1.1.17 Built-in Testability
1.1.18 Compatibility with 80960K-Series
1.1.19 CHMOS
2.0 ELECTRICAL SPECIFICATIONS
2.1 Power and Grounding
2.2 Power Decoupling Recommendations
2.3 Connection Recommendations
2.4 Characteristic Curves
2.5 Test Load Circuit
2.6 Absolute Maximum Ratings
2.7 DC Characteristics
2.8 AC Specifications
2.9 Design Considerations
3.0 MECHANICAL DATA
3.1 Packaging
3.1.1 Pin Assignment
3.2 Pinout
3.3 Package Thermal Specification
4.0 WAVEFORMS
5.0 REVISION HISTORY
FIGURES
Figure 1. 80960MC Programming Environment
Figure 2. Instruction Formats
Figure 3. Multiple Register Sets Are Stored On-Chip
Figure 4. Connection Recommendations for Low Current Drive Network
Figure 5. Connection Recommendations for High Current Drive Network
Figure 6. Typical Supply Current vs. Case Temperature
Figure 7. Typical Current vs. Frequency (Room Temp)
Figure 8. Typical Current vs. Frequency (Hot Temp)
Figure 9. Worst-Case Voltage vs. Output Current on Open-Drain Pins
Figure 10. Capacitive Derating Curve
Figure 11. Test Load Circuit for Three-State Output Pins
Figure 12. Test Load Circuit for Open-Drain Output Pins
Figure 13. Drive Levels and Timing Relationships for 80960MC Signals
Figure 14. Timing Relationship of L-Bus Signals
Figure 15. System and Processor Clock Relationship
Figure 16. Processor Clock Pulse (CLK2)
Figure 17. RESET Signal Timing
Figure 18. HOLD Timing
Figure 19. 132-Lead Pin-Grid Array (PGA) Package
Figure 20. 80960MC PGA Pinout - View from Bottom (Pins Facing Up)
Figure 21. 80960MC PGA Pinout - View from Top (Pins Facing Down)
Figure 22. 25 MHz Maximum Allowable Ambient Temperature
Figure 23. Non-Burst Read and Write Transactions Without Wait States
Figure 24. Burst Read and Write Transaction Without Wait States
Figure 25. Burst Write Transaction with 2, 1, 1, 1 Wait States
Figure 26. Accesses Generated by Quad Word Read Bus Request, Misaligned Two Bytes from Quad Word Boundary (1, 0, 0, 0 Wait States)
Figure 27. Interrupt Acknowledge Transaction
Figure 28. Bus Exchange Transaction (PBM = Primary Bus Master, SBM = Secondary Bus Master)
TABLES
Table 1. 80960MC Instruction Set
Table 2. Memory Addressing Modes
Table 3. Sample Floating-Point Execution Times (µs) at 25 MHz
Table 4. 80960MC Pin Description: L-Bus Signals
Table 5. 80960MC Pin Description: Support Signals
Table 6. DC Characteristics
Table 7. 80960MC AC Characteristics (25 MHz)
Table 8. 80960MC PGA Pinout - In Pin Order
Table 9. 80960MC PGA Pinout - In Signal Order
Table 10. 80960MC PGA Package Thermal Characteristics
1.0 THE i960® MC PROCESSOR
The 80960MC, a member of Intel’s i960® 32-bit processor family, is ideally suited for embedded applications. It includes a 512-byte instruction cache and a built-in interrupt controller. The 80960MC has a large register set, multiple parallel execution units and a high-bandwidth burst bus. Using advanced RISC technology, this processor is capable of execution rates in excess of 9.4 million instructions per second*. The 80960MC is well-suited for a wide range of applications including non-impact printers, I/O control and specialty instrumentation. The embedded market includes applications as diverse as industrial automation, avionics, image processing, graphics and networking. These types of applications require high integration, low power consumption, quick interrupt response times and high performance. Since time to market is critical, embedded processors must be easy to use in both hardware and software designs.
* Relative to Digital Equipment Corporation’s VAX-11/780* at 1 MIPS.
All members of the i960 processor family share a common core architecture which utilizes RISC technology so that, except for special functions, the family members are object-code compatible. Each new processor in the family adds its own special set of functions to the core to satisfy the needs of a specific application or range of applications in the embedded market.
The 80960MC includes an integrated Floating Point Unit (FPU), a Memory Management Unit (MMU), multitasking support, and multiprocessor support. Two commercial members of the i960® family provide similar features: the 80960KB processor with integrated FPU and the 80960KA without floating-point.
[Diagram: the address space (0000 0000H to FFFF FFFFH) holding architecturally defined data structures and accessed by instruction fetch, load and store; the instruction cache feeding the instruction stream to instruction execution; and the processor state registers: sixteen 32-bit global registers (g0 through g15), sixteen 32-bit local registers (r0 through r15) backed by the register cache, four 80-bit floating-point registers, and the control registers (instruction pointer, arithmetic controls, process controls, trace controls).]
Figure 1. 80960MC Programming Environment
1.1 Key Performance Features
The 80960 architecture is based on the most recent advances in microprocessor technology and is grounded in Intel’s long experience in the design and manufacture of embedded microprocessors. Many features contribute to the 80960MC’s exceptional performance:
1. Large Register Set. Having a large number of registers reduces the number of times that a processor needs to access memory. Modern compilers can take advantage of this feature to optimize execution speed. For maximum flexibility, the 80960MC provides thirty-two 32-bit registers. (See Figure 1.)
2. Fast Instruction Execution. Simple functions make up the bulk of instructions in most programs so that execution speed can be improved by ensuring that these core instructions are executed as quickly as possible. The most frequently executed instructions such as register-register moves, add/subtract, logical operations and shifts execute in one to two cycles. (Table 1 contains a list of instructions.)
3. Load/Store Architecture. One way to improve execution speed is to reduce the number of times that the processor must access memory to perform an operation. As with other processors based on RISC technology, the 80960MC has a Load/Store architecture. As such, only the LOAD and STORE instructions reference memory; all other instructions operate on registers. This type of architecture simplifies instruction decoding and is used in combination with other techniques to increase parallelism.
4. Simple Instruction Formats. All instructions in the 80960MC are 32 bits long and must be aligned on word boundaries. This alignment makes it possible to eliminate the instruction alignment stage in the pipeline. To simplify the instruction decoder, there are only five instruction formats; each instruction uses only one format. (See Figure 2.)
5. Overlapped Instruction Execution. Load operations allow execution of subsequent instructions to continue before the data has been returned from memory, so that these instructions can overlap the load. The 80960MC manages this process transparently to software through the use of a register scoreboard. Conditional instructions also make use of a scoreboard so that subsequent unrelated instructions may be executed while the conditional instruction is pending.
6. Integer Execution Optimization. When the result of an arithmetic execution is used as an operand in a subsequent calculation, the value is sent immediately to its destination register. At the same time, the value is put on a bypass path to the ALU, thereby saving the time that otherwise would be required to retrieve the value for the next operation.
7. Bandwidth Optimizations. The 80960MC gets optimal use of its memory bus bandwidth because the bus is tuned for use with the on-chip instruction cache: instruction cache line size matches the maximum burst size for instruction fetches. The 80960MC automatically fetches four words in a burst and stores them directly in the cache. Due to the size of the cache and the fact that it is continually filled in anticipation of needed instructions in the program flow, the 80960MC is relatively insensitive to memory wait states. The benefit is that the 80960MC delivers outstanding performance even with a low cost memory system.
8. Cache Bypass. When a cache miss occurs, the processor fetches the needed instruction then sends it on to the instruction decoder at the same time it updates the cache. Thus, no extra time is spent to load and read the cache.
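The interaction between line size and burst size described in items 7 and 8 can be illustrated with a small model. The sketch below is not Intel's implementation: it assumes a direct-mapped 512-byte cache organized as 32 lines of 16 bytes (one four-word burst per line), uses the whole block number as the tag, and ignores prefetching and timing. The class and function names are hypothetical.

```python
LINE_BYTES = 16                        # one line = one four-word burst (from the text)
CACHE_BYTES = 512                      # on-chip instruction cache size (from the text)
NUM_LINES = CACHE_BYTES // LINE_BYTES  # 32 lines (assumed organization)


class DirectMappedICache:
    """Toy model that records which 16-byte memory block each line holds."""

    def __init__(self):
        self.tags = [None] * NUM_LINES
        self.hits = 0
        self.misses = 0

    def fetch(self, address):
        """Look up one instruction address and return True on a hit."""
        block = address // LINE_BYTES   # 16-byte block number in memory
        index = block % NUM_LINES       # direct mapped: exactly one candidate line
        if self.tags[index] == block:
            self.hits += 1
            return True
        # Miss: one four-word burst refills the whole line.
        self.tags[index] = block
        self.misses += 1
        return False


def run_loop(cache, start, length_bytes, iterations):
    """Fetch a loop body made of 4-byte instructions, several times over."""
    for _ in range(iterations):
        for address in range(start, start + length_bytes, 4):
            cache.fetch(address)


cache = DirectMappedICache()
run_loop(cache, start=0x1000, length_bytes=64, iterations=10)
print(cache.misses, cache.hits)
```

In this model a 64-byte loop costs four burst fetches on its first pass and none afterwards, which is the behavior the text attributes to small loops.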
Table 1. 80960MC Instruction Set
Data Movement: Load, Store, Move, Load Address, Load Physical Address
Process Management: Schedule Process, Saves Process, Resume Process, Load Process Time, Modify Process Controls, Wait, Conditional Wait, Signal, Receive, Conditional Receive, Send, Send Service, Atomic Add, Atomic Modify
Floating Point: Add, Subtract, Multiply, Divide, Remainder, Scale, Round, Square Root, Sine, Cosine, Tangent, Arctangent, Log, Log Binary, Log Natural, Exponent, Classify, Copy Real Extended, Compare
Logical: And, Not And, And Not, Or, Exclusive Or, Not Or, Or Not, Nor, Exclusive Nor, Not, Nand, Rotate
Comparison: Compare, Conditional Compare, Compare and Increment, Compare and Decrement
Branch: Unconditional Branch, Conditional Branch, Compare and Branch
Bit and Bit Field: Set Bit, Clear Bit, Not Bit, Check Bit, Alter Bit, Scan For Bit, Scan Over Bit, Extract, Modify
String: Move String, Move Quick String, Fill String, Compare String, Scan Byte for Equal
Conversion: Convert Real to Integer, Convert Integer to Real
Decimal: Move, Add with Carry, Subtract with Carry
Call/Return: Call, Call Extended, Call System, Return, Branch and Link
Arithmetic: Add, Subtract, Multiply, Divide, Remainder, Modulo, Shift
Fault: Conditional Fault, Synchronize Faults
Debug: Modify Trace Controls, Mark, Force Mark
Miscellaneous: Flush Local Registers, Inspect Access, Modify Arithmetic Controls, Test Condition Code
Control:              Opcode | Displacement
Compare and Branch:   Opcode | Reg/Lit | Reg | M | Displacement
Register to Register: Opcode | Reg | Reg/Lit | Modes | Ext’d Op | Reg/Lit
Memory Access-Short:  Opcode | Reg | Base | M | X | Offset
Memory Access-Long:   Opcode | Reg | Base | Mode | Scale | xx | Offset
                      Displacement
Figure 2. Instruction Formats
1.1.1 Memory Space And Addressing Modes
The 80960MC allows each task (process) to address a logical memory space of up to 4 Gbytes. Each task’s address space is divided into four 1 Gbyte regions and each region can be mapped to physical addresses by zero, one, or two levels of page tables. The region with the highest addresses (Region 3) is common to all tasks.
In keeping with RISC design principles, the number of addressing modes is minimal yet includes all those necessary to ensure efficient execution of high-level languages such as Ada, C, and Fortran.
Table 2 lists the memory addressing modes.
Table 2. Memory Addressing Modes
• 12-Bit Offset
• 32-Bit Offset
• Register-Indirect
• Register + 12-Bit Offset
• Register + 32-Bit Offset
• Register + (Index-Register x Scale-Factor)
• Register x Scale-Factor + 32-Bit Displacement
• Register + (Index-Register x Scale-Factor) + 32-Bit Displacement
• Scale-Factor is 1, 2, 4, 8 or 16
1.1.2 Data Types
The 80960MC recognizes the following data types:
Numeric:
• 8-, 16-, 32- and 64-bit ordinals
• 8-, 16-, 32- and 64-bit integers
• 32-, 64- and 80-bit real numbers
Non-Numeric:
• Bit
• Bit Field
• Triple-Word (96 bits)
• Quad-Word (128 bits)
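The addressing modes in Table 2 all combine some subset of a base register, a scaled index register and a constant offset or displacement. The following minimal sketch computes an effective address under that reading. The function name and its keyword parameters are hypothetical, and wrapping the result to 32 bits is an assumption based on the 4 Gbyte address space rather than something the text states.

```python
VALID_SCALES = (1, 2, 4, 8, 16)  # scale factors listed in Table 2


def effective_address(base=0, index=0, scale=1, displacement=0):
    """Combine the Table 2 components into one 32-bit address.

    Leaving a component at its default reproduces the simpler modes,
    e.g. only `displacement` gives the 12-bit or 32-bit offset mode and
    only `base` gives register-indirect.
    """
    if scale not in VALID_SCALES:
        raise ValueError("scale factor must be 1, 2, 4, 8 or 16")
    # Assumption: addresses wrap modulo 2**32.
    return (base + index * scale + displacement) & 0xFFFFFFFF


# Register + (Index-Register x Scale-Factor) + 32-Bit Displacement
address = effective_address(base=0x1000, index=3, scale=4, displacement=0x20)
print(hex(address))  # 0x102c
```

For example, indexing the fourth element of an array of 4-byte words that starts 0x20 bytes past a base pointer uses a scale factor of 4 and an index of 3.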
1.1.3 Large Register Set
The 80960MC programming environment includes a large number of registers. Thirty-six registers are available at any time (sixteen global, sixteen local and four floating-point); this greatly reduces the number of memory accesses required to perform algorithms, which leads to greater instruction processing speed.
Two types of general-purpose registers are available: local and global. The 20 global registers consist of sixteen 32-bit registers (G0 through G15) and four 80-bit registers (FP0 through FP3). These registers perform the same function as the general-purpose registers provided in other popular microprocessors. The term global refers to the fact that these registers retain their contents across procedure calls.
The local registers are procedure-specific. For each procedure call, the 80960MC allocates 16 local registers (R0 through R15). Each local register is 32 bits wide. Any register can also be used for floating-point operations; the 80-bit floating-point registers are provided for extended precision.
1.1.4 Multiple Register Sets
To further increase the efficiency of the register set, multiple sets of local registers are stored on-chip (see Figure 3). This cache holds up to four local register frames, which means that up to three procedure calls can be made without having to access the procedure stack resident in memory.
Although programs may have procedure calls nested many calls deep, a program typically oscillates back and forth between only two to three levels. As a result, with four stack frames in the cache, the probability of having a free frame available on the cache when a call is made is very high. Runs of representative C-language programs show that 80% of the calls are handled without needing to access memory.
When four or more procedures are active and a new procedure is called, the 80960MC moves the oldest local register set in the stack-frame cache to a procedure stack in memory to make room for a new set of registers. Global register G15 is the frame pointer (FP) to the procedure stack.
Global registers are not exchanged on a procedure call, but retain their contents, making them available to all procedures for fast parameter passing.
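The spill behavior described above can be sketched as a small simulation. This is an illustrative model only: the class name, the counters, and the policy of reloading a spilled frame when a return needs it are assumptions, not details taken from this document.

```python
from collections import deque

CACHE_FRAMES = 4  # local register sets held on-chip (from the text)


class LocalRegisterCache:
    """Tracks which call depths have their local registers on-chip."""

    def __init__(self):
        self.on_chip = deque([0])  # the initial procedure runs at depth 0
        self.in_memory = []        # depths spilled to the procedure stack
        self.spills = 0            # frames written out to memory
        self.fills = 0             # frames read back from memory

    def call(self):
        """Allocate a frame for a new procedure, spilling the oldest if full."""
        if len(self.on_chip) == CACHE_FRAMES:
            self.in_memory.append(self.on_chip.popleft())
            self.spills += 1
        self.on_chip.append(self.on_chip[-1] + 1)

    def ret(self):
        """Release the current frame; reload the caller's frame if it was spilled."""
        if len(self.on_chip) == 1 and not self.in_memory:
            raise RuntimeError("return from the outermost procedure")
        self.on_chip.pop()
        if not self.on_chip:
            # Assumed policy: fetch the caller's frame back on demand.
            self.on_chip.append(self.in_memory.pop())
            self.fills += 1


cache = LocalRegisterCache()
for _ in range(3):
    cache.call()
print(cache.spills)  # 0: three nested calls fit in the four on-chip frames
cache.call()
print(cache.spills)  # 1: the fourth call pushes the oldest frame to memory
```

A program that then oscillates between the two deepest levels performs further calls and returns without any additional spills or fills in this model, which matches the locality argument made in the text.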
1.1.5 Instruction Cache
To further reduce memory accesses, the 80960MC includes a 512-byte on-chip instruction cache. The instruction cache is based on the concept of locality of reference; most programs are typically not executed in a steady stream but consist of many branches, loops and procedure calls that lead to jumping back and forth in the same small section of code. Thus, by maintaining a block of instructions in cache, the number of memory references required to read instructions into the processor is greatly reduced.
To load the instruction cache, instructions are fetched in 16-byte blocks; up to four instructions can be fetched at one time. An efficient prefetch algorithm increases the probability that an instruction is already in the cache when it is needed.
Code for small loops often fits entirely within the cache, leading to an increase in processing speed since further memory references might not be necessary until the program exits the loop. Similarly, when calling short procedures, the code for the calling procedure is likely to remain in the cache so it is there on the procedure’s return.
1.1.6 Register Scoreboarding
The instruction decoder is optimized in several ways. One optimization method is the ability to overlap instructions by using register scoreboarding.
Register scoreboarding occurs when a LOAD moves a variable from memory into a register. When the instruction initiates, a scoreboard bit on the target register is set. Once the register is loaded, the bit is reset. In between, any reference to the register contents is accompanied by a test of the scoreboard bit to ensure that the load has completed before processing continues. Since the processor does not need to wait for the LOAD to complete, it can execute additional instructions placed between the LOAD and the instruction that uses the register contents, as shown in the following example:
    ld data_2, r4
    ld data_2, r5
    Unrelated instruction
    Unrelated instruction
    add r4, r5, r6
In essence, the two unrelated instructions between LOAD and ADD are executed “for free” (i.e., take no apparent time to execute) because they are executed while the register is being loaded. Up to three load instructions can be pending at one time with three corresponding scoreboard bits set. By exploiting this feature, system programmers and compiler writers have a useful tool for optimizing execution speed.
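The "for free" effect can be made concrete with a toy timing model. The sketch below is a simplification under stated assumptions: every instruction issues in one cycle, a load result arrives a fixed four cycles after issue, loads are fully pipelined, and the limit of three pending loads is not modeled. The latency value, the function name and the tuple encoding of instructions are all hypothetical.

```python
def total_cycles(program, load_latency=4):
    """Return the cycle count needed to issue every instruction in order.

    `program` is a list of (opcode, destination, sources) tuples. An
    instruction stalls until the scoreboard bit of each source register
    has cleared, i.e. until any pending load into it has completed.
    """
    ready_at = {}  # register -> cycle at which its scoreboard bit clears
    cycle = 0
    for opcode, destination, sources in program:
        issue = max([cycle] + [ready_at.get(reg, 0) for reg in sources])
        latency = load_latency if opcode == "ld" else 1
        ready_at[destination] = issue + latency
        cycle = issue + 1
    return cycle


# The example sequence: two loads, two unrelated instructions, then the add.
scheduled = [
    ("ld", "r4", []),
    ("ld", "r5", []),
    ("mov", "r8", ["r9"]),     # stands in for "Unrelated instruction"
    ("mov", "r10", ["r11"]),   # stands in for "Unrelated instruction"
    ("add", "r6", ["r4", "r5"]),
]
# Same work with the add placed directly after the loads.
unscheduled = [scheduled[0], scheduled[1], scheduled[4], scheduled[2], scheduled[3]]

print(total_cycles(scheduled), total_cycles(unscheduled))  # 6 8
```

Under these assumptions the scheduled order finishes two cycles sooner, exactly the time taken by the two unrelated instructions that now overlap the loads.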
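[Diagram: the register cache holds four local register sets; each local register set consists of sixteen 32-bit registers, R0 through R15 (bits 31 to 0).]
Figure 3. Multiple Register Sets Are Stored On-Chip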
1.1.7 Memory Management and Protection
The 80960MC is ideal for multitasking applications that require software protection and a large address space. To ensure the highest level of performance possible, the memory management unit (MMU) and translation look-aside buffer (TLB) are contained on-chip.
The 80960MC supports a conventional form of demand-paged virtual memory in which the address space is divided into 4-Kbyte pages. Studies indicate that a 4-Kbyte page is the optimum size for a broad range of applications.
Each page table entry includes a 2-bit page rights field that specifies whether the page is a no-access, read-only, or read-write page. This field is interpreted differently depending on whether the current task (process) is executing in user or supervisor mode, as shown below:
Rights  User        Supervisor
00      No Access   Read-Only
01      No Access   Read-Write
10      Read-Only   Read-Write
11      Read-Write  Read-Write
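The page rights table can be expressed as a simple lookup. The sketch below encodes the four rows exactly as listed; the function name, the mode strings and the boolean `write` flag are illustrative choices, not part of the architecture definition.

```python
# Page rights field -> permitted access in each execution mode (from the table).
PAGE_RIGHTS = {
    0b00: {"user": "none", "supervisor": "read-only"},
    0b01: {"user": "none", "supervisor": "read-write"},
    0b10: {"user": "read-only", "supervisor": "read-write"},
    0b11: {"user": "read-write", "supervisor": "read-write"},
}


def access_allowed(rights, mode, write):
    """Return True if an access is permitted.

    rights: the 2-bit page rights field (0 to 3)
    mode:   "user" or "supervisor"
    write:  True for a store, False for a load or instruction fetch
    """
    level = PAGE_RIGHTS[rights][mode]
    if level == "none":
        return False
    if write:
        return level == "read-write"
    return True


print(access_allowed(0b10, "user", write=False))       # True
print(access_allowed(0b10, "user", write=True))        # False
print(access_allowed(0b00, "supervisor", write=True))  # False
```

Note that supervisor mode always has at least read access, and that only rights value 00 prevents a supervisor write.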
1.1.8 Floating-Point Arithmetic
In the 80960MC, floating-point arithmetic is an integral part of the architecture. Having the floating-point unit integrated on-chip provides two advantages. First, it improves the performance of the chip for floating-point applications, since no additional bus overhead is associated with floating-point calculations, thereby leaving more time for other bus operations such as I/O. Second, the cost of using floating-point operations is reduced because a separate coprocessor chip is not required.
The 80960MC floating-point (real-number) data types include single-precision (32-bit), double-precision (64-bit) and extended precision (80-bit) floating-point numbers. Any registers may be used to execute floating-point operations.
The processor provides hardware support for both mandatory and recommended portions of IEEE Standard 754 for floating-point arithmetic, including all arithmetic, exponential, logarithmic and other transcendental functions. Table 3 shows execution times for some representative instructions.
Table 3. Sample Floating-Point Execution Times (µs) at 25 MHz
Function      32-Bit   64-Bit
Add           0.4      0.5
Subtract      0.4      0.5
Multiply      0.7      1.3
Divide        1.3      2.9
Square Root   3.7      3.9
Arctangent    10.1     13.1
Exponent      11.3     12.5
Sine          15.2     16.6
Cosine        15.2     16.6
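Because Table 3 quotes times in microseconds at a 25 MHz clock, each entry can be converted to an approximate clock count by multiplying by 25 (one cycle is 40 ns). The sketch below performs that conversion; the dictionary and function names are hypothetical, and the results are only as precise as the table's rounding to 0.1 µs.

```python
from decimal import Decimal

CLOCK_MHZ = 25  # clock frequency quoted in the table title

# Execution times in microseconds: (32-bit, 64-bit), copied from Table 3.
TIMES_US = {
    "Add": ("0.4", "0.5"),
    "Subtract": ("0.4", "0.5"),
    "Multiply": ("0.7", "1.3"),
    "Divide": ("1.3", "2.9"),
    "Square Root": ("3.7", "3.9"),
    "Arctangent": ("10.1", "13.1"),
    "Exponent": ("11.3", "12.5"),
    "Sine": ("15.2", "16.6"),
    "Cosine": ("15.2", "16.6"),
}


def cycles(function, bits):
    """Approximate clock cycles: microseconds times megahertz."""
    column = {32: 0, 64: 1}[bits]
    return Decimal(TIMES_US[function][column]) * CLOCK_MHZ


for name in TIMES_US:
    print(f"{name:12} {cycles(name, 32):>6} {cycles(name, 64):>6}")
```

For instance, a 32-bit add at 0.4 µs corresponds to about 10 clock cycles, and a 64-bit divide at 2.9 µs to about 72.5. Decimal arithmetic is used so that the products are exact rather than subject to binary floating-point rounding.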
1.1.9 Multitasking Support
Multita sking programs commonly involve the m oni­toring and control of an external operation, such as the activities of a process controller or the move­ments of a machine tool. These programs generally consis t of a nu mber of proces ses that run ind epen­dently of one another, but share a common database or pass data among themselves.
The 80960MC offers several hardware functions designed to support multitasking systems. One unique feature, called self-dispatching, allows a processor to switch itself automatically among scheduled tasks. When self-dispatching is used, all the operating system is required to do is place the task in the scheduling queue.
When the processor becomes available, it dispatches the task from the beginning of the queue and then executes it until it becomes blocked, interrupted, or until its time-slice expires. It then returns the task to the end of the queue (i.e., automatically reschedules it) and dispatches the next ready task. During these operations, no communication between the processor and the operating system is necessary until the running task is complete or an interrupt is issued.
1.1.10 Synchronization and Communication
The 80960MC also offers instructions to set up and test semaphores to ensure that concurrent tasks remain synchronized and no data inconsistency results. Special data structures, known as communication ports, provide the means for exchanging parameters and data structures. Transmission of information by means of communication ports is asynchronous and automatically buffered by the processor.
Communication between tasks by means of ports can be carried out independently of the operating system. Once the ports have been set up by the programmer, the processor handles the message passing automatically.
1.1.11 High Bandwidth Local Bus
The 80960MC CPU resides on a high-bandwidth address/data bus known as the local bus (L-Bus). The L-Bus provides a direct communication path between the processor and the memory and I/O subsystem interfaces. The processor uses the L-Bus to fetch instructions, manipulate memory and respond to interrupts. L-Bus features include:
• 32-bit multiplexed address/data path
• Four-word burst capability which allows transfers from 1 to 16 bytes at a time
• High bandwidth reads and writes with 66.7 MBytes/s burst (at 25 MHz)
• Special signal to indicate whether a memory transaction can be cached
Table 4 defines L-bus signal names and functions; Table 5 defines other component-support signals such as interrupt lines.
1.1.12 Multiple Processor Support
One means of increasing the processing power of a system is to run two or more processors in parallel. Since microprocessors are not generally designed to run in tandem with other processors, designing such a system is usually difficult and costly.
The 80960MC solves this problem by offering a number of functions to coordinate the actions of multiple processors. First, messages can be passed between processors to initiate actions such as flushing a cache, stopping or starting another processor, or preempting a task. The messages are passed on the bus and allow multiple processors to run together smoothly, with rare need to lock the bus or memory.
Second, a set of synchronization instructions helps maintain memory coherency. These instructions permit several processors to modify memory at the same time without inserting inaccuracies or ambiguities into shared data structures.
The self-dispatching mechanism, in addition to being used in single-processor systems, provides the means to increase the performance of a system merely by adding processors. Each processor can either work on the same pool of tasks (sharing the same queue with other processors) or can be restricted to its own queue.
When processors perform system operations, they synchronize themselves by using atomic operations and sending special messages between each other. In theory, changing the number of processors in a system does not require a software change. Software executes correctly regardless of the number of processors in the system; systems with more processors simply execute faster.
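The shared-queue behavior described above can be illustrated with a small simulation. This is only a sketch under simplifying assumptions, not the processor's actual dispatching logic: the function name `run_shared_queue`, the work-unit model, and the lock-step time slices are all hypothetical, and tasks here never block or get interrupted.

```python
from collections import deque


def run_shared_queue(tasks, n_processors, time_slice=1):
    """Simulate round-robin dispatch from one shared task queue.

    tasks: dict mapping a task name to its remaining work units
           (a hypothetical model, not an 80960MC data structure).
    Returns (ticks_elapsed, completion_order).
    """
    queue = deque(tasks.items())
    finished = []
    ticks = 0
    while queue:
        # Each available processor dispatches the task at the head of the queue.
        running = [queue.popleft() for _ in range(min(n_processors, len(queue)))]
        ticks += time_slice
        for name, remaining in running:
            remaining -= time_slice
            if remaining > 0:
                # Time slice expired: the task returns to the end of the queue.
                queue.append((name, remaining))
            else:
                finished.append(name)
    return ticks, finished


if __name__ == "__main__":
    workload = {"a": 2, "b": 2, "c": 2, "d": 2}
    for n in (1, 2, 4):
        print(n, run_shared_queue(workload, n))
```

The same `workload` is passed unchanged for one, two, or four simulated processors; only the elapsed tick count differs, which mirrors the claim that adding processors does not require a software change.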
1.1.13 Interrupt Handling
The 80960MC can be interrupted in two ways: by the activation of one of four interrupt pins or by sending a message on the processor's data bus.
The 80960MC is unusual in that it automatically handles interrupts on a priority basis and can keep track of pending interrupts through its on-chip interrupt controller. Two of the interrupt pins can be configured to provide 8259A-style handshaking for expansion beyond four interrupt lines.
An interrupt message is made up of a vector number and an interrupt priority. When the interrupt priority is greater than that of the currently running task, the processor accepts the interrupt and uses the vector as an index into the interrupt table. When the priority of the interrupt message is below that of the current task, the processor saves the information in a section of the interrupt table reserved for pending interrupts.
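The accept-or-pend decision described above can be sketched in a few lines. The function name `post_interrupt` and the list used for pending interrupts are hypothetical stand-ins for the interrupt table. The text does not say what happens when the two priorities are equal; this sketch assumes such an interrupt is left pending.

```python
def post_interrupt(current_priority, vector, priority, pending):
    """Return the vector to service now, or None if the interrupt is left pending.

    pending: a list standing in for the pending-interrupt section of the
             interrupt table (hypothetical representation).
    """
    if priority > current_priority:
        # Higher priority than the running task: accept, and use the
        # vector as an index into the interrupt table.
        return vector
    # Assumption: equal priority is treated like lower priority.
    pending.append((priority, vector))
    return None
```

A caller would service the returned vector immediately and revisit `pending` once the priority of the running task drops.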
1.1.14 Debug Features
The 80960MC has built-in debug capabilities, including two types of breakpoints and six trace modes. Debug features are controlled by two internal 32-bit registers: the Process-Controls Word and the Trace-Controls Word. By setting bits in these control words, a software debug monitor can closely control how the processor responds during program execution.
The 80960MC has both hardware and software breakpoints. It provides two hardware breakpoint registers on-chip which, by using a special command, can be set to any value. When the instruction pointer matches either breakpoint register value, the breakpoint handling routine is automatically called.
The 80960MC also provides software breakpoints through the use of two instructions: MARK and FMARK. These can be placed at any point in a program and cause the processor to halt execution at that point and call the breakpoint handling routine. The breakpoint mechanism is easy to use and provides a powerful debugging tool.
Tracing is available for instructions (single-step execution), calls and returns, and branching. Each trace type may be enabled separately by a special debug instruction. In each case, the 80960MC executes the instruction first and then calls a trace handling routine (usually part of a software debug monitor). Further program execution is halted until the routine completes, at which time execution resumes at the next instruction. The 80960MC's tracing mechanisms, implemented completely in hardware, greatly simplify the task of software test and debug.
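The ordering described above (execute the instruction, then call the trace handler, then resume at the next instruction) can be modeled as follows. This is a sketch only: the function `run_with_trace`, the instruction kinds, and the set-based enable mechanism are hypothetical and do not reflect the actual layout of the Trace-Controls Word.

```python
def run_with_trace(program, enabled, handler):
    """Walk a program and call `handler` for each enabled trace type.

    program: list of (kind, text) pairs, where kind is one of
             "plain", "call", "return" or "branch" (hypothetical model).
    enabled: set of trace types to report; "instruction" means single step.
    handler: callable taking (event, instruction_index, text).
    """
    for index, (kind, text) in enumerate(program):
        # The instruction itself would execute here, before any handler runs.
        events = []
        if "instruction" in enabled:
            events.append("instruction")
        if kind in enabled:
            events.append(kind)
        # Execution does not advance until every handler call has returned.
        for event in events:
            handler(event, index, text)
```

Passing `{"instruction"}` as `enabled` gives single-step behavior, while `{"call", "return"}` reports only procedure boundaries.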
1.1.15 Fault Detection
The 80960MC has an automatic mechanism to handle faults. There are ten fault types, including floating point, trace and arithmetic faults. When the processor detects a fault, it automatically calls the appropriate fault handling routine and saves the current instruction pointer and necessary state information to make efficient recovery possible. The processor posts diagnostic information on the type of fault to a Fault Record. Like interrupt handling routines, fault handling routines are usually written to meet the needs of specific applications and are often included as part of the operating system or kernel.
For each of the ten fault types, numerous subtypes provide specific information about a fault. For example, a floating point fault may have the subtype set to an Overflow or Zero-Divide fault. The fault handler can use this specific information to respond correctly to the fault.
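One way a fault handler might use the type and subtype is a two-level lookup, sketched below. The `FaultRecord` fields, the handler table, and the returned action strings are all hypothetical illustrations; they are not the processor's Fault Record layout or any prescribed recovery policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FaultRecord:
    """Hypothetical stand-in for the diagnostic information the processor posts."""
    fault_type: str   # e.g. "floating-point", "trace", "arithmetic"
    subtype: str      # e.g. "overflow", "zero-divide"
    ip: int           # saved instruction pointer


# Application-specific responses keyed by (type, subtype); purely illustrative.
HANDLERS = {
    ("floating-point", "overflow"): lambda rec: "substitute largest value",
    ("floating-point", "zero-divide"): lambda rec: "report divide by zero",
}


def handle_fault(rec):
    """Pick a response from the fault type and subtype, with a generic fallback."""
    handler = HANDLERS.get((rec.fault_type, rec.subtype))
    if handler is None:
        return f"unhandled {rec.fault_type} fault at {rec.ip:#x}"
    return handler(rec)
```

Keeping the table separate from the dispatch function reflects the point that fault handling routines are usually written per application.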