Information in this document is provided in connection with Intel products. No license, express or implied, by estoppel or otherwise, to any intellectual
property rights is granted by this document. Except as provided in Intel’s Terms and Conditions of Sale for such products, Intel assumes no liability
whatsoever, and Intel disclaims any express or implied warranty, relating to sale and/or use of Intel products including liability or warranties relating to
fitness for a particular purpose, merchantability, or infringement of any patent, copyright or other intellectual property right. Intel products are not
intended for use in medical, life saving, or life sustaining applications.
Intel may make changes to specifications and product descriptions at any time, without notice.
Designers must not rely on the absence or characteristics of any features or instructions marked "reserved" or "undefined." Intel reserves these for
future definition and shall have no responsibility whatsoever for conflicts or incompatibilities arising from future changes to them.
Current characterized errata for the i960® VH processor are available on request.
Contact your local Intel sales office or your distributor to obtain the latest specifications and before placing your product order.
Copies of documents which have an ordering number and are referenced in this document, or other Intel literature may be obtained by calling
1-800-548-4725 or by visiting Intel’s website at http://www.intel.com.
The i960® VH Processor (“80960VH”) integrates a high-performance 80960 “core” with
Peripheral Component Interconnect (PCI) functionality. This integrated processor addresses the
needs of embedded applications and helps reduce embedded system costs. As indicated in
Figure 1-1, the primary functional units include an i960 core processor, PCI-to-80960 Address
Translation Unit, Messaging Unit, Direct Memory Access (DMA) Controller, Memory Controller,
and I2C Bus Interface Unit.
The PCI Bus is an industry standard, high performance, low latency system bus that operates up to
132 Mbyte/sec.
Figure 1-1. i960® VH Processor Functional Block Diagram
[Block diagram: i960® JT Core Processor, Local Memory, Memory Controller, Two DMA Channels, Address Translation Unit, Message Unit, I2C Bus Interface Unit (I2C Serial Bus), Internal Local Bus Arbiter and Internal Primary PCI Arbiter, connected by the Local Bus and the PCI Bus.]
1.2 i960® VH Processor Features
The 80960VH combines the i960® JT processor with powerful new features to create an embedded
processor. This PCI device is fully compliant with the PCI Local Bus Specification, revision 2.1.
80960VH-specific features include:
i960® VH Processor Developer’s Manual
Introduction
• DMA Controller
• Address Translation Unit
• Messaging Unit
• Memory Controller
• I2C Bus Interface Unit
Because the 80960VH’s core processor is based upon the 80960JT, the two i960 family members
are object code compatible and can maintain a sustained execution rate of one instruction per clock
cycle. The 80960 local bus, a 32-bit multiplexed burst bus, is a high-speed interface to system
memory and I/O. A full complement of control signals simplifies the connection of the 80960VH
to external components. Physical and logical memory attributes are programmed via
memory-mapped control registers (MMRs), a feature not found on the i960 Kx, Sx or Cx
processors. Physical and logical configuration registers enable the processor to operate with all
combinations of bus width and data object alignment. See Section 1.3, “i960® Core Processor
Features (80960VH)” for more information.
The subsections that follow give a brief overview of each feature. Refer to the appropriate chapter
for full technical descriptions.
1.2.1 DMA Controller
The DMA Controller allows low-latency, high-throughput data transfers between PCI bus agents
and 80960 local memory. Two separate DMA channels accommodate data transfers for the
primary PCI bus. The DMA Controller supports chaining and unaligned data transfers. It is
programmable through the i960 core processor only, and functions in synchronous mode only. See
Chapter 20, “DMA Controller”.
1.2.2 Address Translation Unit
The Address Translation Unit (ATU) allows PCI transactions direct access to the 80960VH local
memory. The ATU supports transactions between PCI address space and 80960VH address space.
Address translation is controlled through programmable registers accessible from both the PCI
interface and the i960 core processor. Dual access to registers allows flexibility in mapping the two
address spaces. See Chapter 16, “Address Translation Unit”.
1.2.3 Messaging Unit
The Messaging Unit (MU) provides data transfer between the PCI system and the 80960VH. It
uses interrupts to notify each system when new data arrives. The MU has two messaging
mechanisms: Message Registers and Doorbell Registers. Each allows a host processor or external
PCI device and the 80960VH to communicate through message passing and interrupt generation.
See Chapter 17, “Messaging Unit”.
1.2.4 Memory Controller
The Memory Controller allows direct control of external memory systems, including DRAM,
SRAM, ROM and flash. It provides a direct connect interface to memory that typically does not
require external logic. It features programmable chip selects, a wait state generator and byte parity.
External memory can be configured as PCI addressable memory or private 80960VH memory. See
Chapter 15, “Memory Controller”.
1.2.5 I2C Bus Interface Unit
The I2C (Inter-Integrated Circuit) Bus Interface Unit allows the i960 core processor to serve as a
master and slave device residing on the I2C bus. The I2C unit uses a serial bus developed by Philips
Semiconductor consisting of a two-pin interface. The bus allows the 80960VH to interface to other
I2C peripherals and microcontrollers for system management functions. It requires a minimum of
hardware for an economical system to relay status and reliability information on the I/O subsystem
to an external device. See Chapter 21, “I2C Bus Interface Unit”. Also refer to the document I2C
Peripherals for Microcontrollers (Philips Semiconductor).
1.3 i960® Core Processor Features (80960VH)
The processing power of the 80960VH comes from the 80960JT processor core. The 80960JT is a
new, scalar implementation of the i960 core architecture. Figure 1-2 shows a block diagram of the
80960JT core processor.
Figure 1-2. 80960JT Core Processor Block Diagram
[Block diagram: PLL, Clocks and TAP Boundary Scan (P_CLK input); 16 Kbyte Two-Way Set Associative Instruction Cache; Instruction Sequencer; 8-Set Local Register Cache and Register File (128-bit path); three independent 32-bit SRC1, SRC2 and DEST buses; Execution and Address Generation Unit; Multiply/Divide Unit; Memory Interface Unit; 4 Kbyte Direct Mapped Data Cache; 1 Kbyte Data RAM; Programmable Interrupt Controller (Interrupt Port); two 32-bit Timers; Memory-Mapped Register Interface; Physical Region Configuration; Bus Control Unit with Bus Request Queues and 32-bit address/data buses.]
Factors that contribute to the 80960VH’s performance include:
• Single-clock execution of most instructions
• Independent Multiply/Divide Unit
• Efficient instruction pipeline minimizes pipeline break latency
• Register and resource scoreboarding allow overlapped instruction execution
• 128-bit register bus speeds local register caching
• 16 Kbyte two-way set-associative, integrated instruction cache
• 4 Kbyte direct-mapped, integrated data cache
• 1 Kbyte integrated data RAM delivers zero wait state program data
The i960 core processor operates out of its own 32-bit address space, which is independent of the
PCI address space. The 80960 local bus memory can be:
• Made visible to the PCI address space
• Kept private to the i960 core processor
• Allocated as a combination of the two
1.3.1 Burst Bus
A 32-bit high-performance bus controller interfaces the i960 core processor to external memory
and peripherals. The Bus Control Unit fetches instructions and transfers data on the 80960 local
bus at the rate of up to four 32-bit words per six clock cycles.
Note: DMA and ATU accesses are limited to 32-bit wide memory regions. Also, these units can burst up
to a 2 Kbyte boundary with no alignment restrictions.
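As a rough worked example of the burst rate quoted above (four 32-bit words per six clock cycles), the peak local-bus throughput can be estimated for a given bus clock. This is only a sketch: the function name and the 33 MHz clock are assumptions chosen for illustration and are not taken from this manual.

```python
def burst_bandwidth_bytes_per_sec(clock_hz, words=4, clocks=6, bytes_per_word=4):
    """Peak burst throughput when `words` 32-bit words move every `clocks` bus clocks."""
    return clock_hz * words * bytes_per_word / clocks


# Hypothetical 33 MHz local bus clock (an assumption, not a value from the text).
peak = burst_bandwidth_bytes_per_sec(33_000_000)
print(f"{peak / 1e6:.0f} Mbyte/sec")  # prints "88 Mbyte/sec"
```

Under that assumed clock, 16 bytes every six clocks works out to 88 Mbyte/sec; a different clock scales the result linearly.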
Users may configure the i960 core processor’s bus controller to match an application’s
fundamental memory organization. Physical bus width is programmable for up to eight regions. Data
caching is programmed through a group of logical memory templates and a defaults register. The
Bus Control Unit’s features include:
• Multiplexed external bus minimizes pin count
• 32-, 16- and 8-bit bus widths simplify I/O interfaces
• External ready control for address-to-data, data-to-data and data-to-next-address wait state
types
• Unaligned bus accesses performed transparently
• Three-deep load/store queue decouples the bus from the i960 core processor
For reliability, the 80960VH conducts an internal self test upon reset. Before executing its first
instruction, it performs a local bus confidence test by performing a checksum on the first words of
the Initialization Boot Record.
1.3.2 Timer Unit
As described in Chapter 19, “Timers”, the Timer Unit (TU) contains two independent 32-bit
timers that are capable of counting at software-defined clock rates and generating interrupts. Each
is programmed by use of the Timer Unit memory-mapped registers. The timers have a single-shot
mode and auto-reload capabilities for continuous operation. Each timer has an independent
interrupt request to the 80960VH’s interrupt controller.
1.3.3 Priority Interrupt Controller
Chapter 8, “Interrupts” explains how low interrupt latency is critical to many embedded
applications. As part of its highly flexible interrupt mechanism, the 80960VH exploits several
techniques to minimize latency:
• Interrupt vectors and interrupt handler routines can be reserved on-chip
• Register frames for high-priority interrupt handlers can be cached on-chip
• The interrupt stack can be placed in cacheable memory space
1.3.4 Faults and Debugging
The 80960VH employs a comprehensive fault model. The processor responds to faults by making
implicit calls to fault handling routines. Specific information collected for each fault allows the
fault handler to diagnose exceptions and recover appropriately.
The processor also has built-in debug capabilities. Via software, the 80960VH may be configured
to detect as many as seven different trace event types. Alternatively, mark and fmark instructions
can generate trace events explicitly in the instruction stream. Hardware breakpoint registers are
also available to trap on execution and data addresses. See Chapter 9, “Faults”.
1.3.5 On-Chip Cache and Data RAM
As discussed in Chapter 4, “Cache and On-Chip Data RAM”, memory subsystems often impose
substantial wait state penalties. The 80960VH integrates considerable storage resources on-chip to
decouple CPU execution from the external bus. The 80960VH includes a 16 Kbyte instruction
cache, a 4 Kbyte data cache and 1 Kbyte data RAM.
1.3.6 Local Register Cache
The 80960VH rapidly allocates and deallocates local register sets during context switches. The
processor needs to flush a register set to the stack only when it saves more than seven sets to its
local register cache.
1.3.7 Test Features
The 80960VH incorporates features that enhance the user’s ability to test both the processor and
the system to which it is attached. These features include ONCE (On-Circuit Emulation) mode and
IEEE Std. 1149.1 Boundary Scan (JTAG). See Chapter 22, “Test Features”.
One of the boundary scan instructions, HIGHZ, forces the processor to float all its output pins
(ONCE mode). ONCE mode can also be initiated at reset without using the boundary scan
mechanism.
ONCE mode is useful for board-level testing. This feature allows a mounted 80960VH to
electrically “remove” itself from a circuit board. This mode allows system-level testing where a
remote tester, such as an In-Circuit Emulator (ICE) system, can exercise the processor system. The
test logic does not interfere with component or system behavior and ensures that components
function correctly, and also that the connections between various components are correct.
The JTAG Boundary Scan feature is an alternative to conventional “bed-of-nails” testing. It can
examine connections that might otherwise be inaccessible to a test system.
1.3.8 Memory-Mapped Control Registers
The 80960VH is compliant with 80960 family architecture and has the added advantage of
memory-mapped, internal control registers not found on the 80960Kx, Sx or Cx processors. This
feature provides software an interface to easily read and modify internal control registers.
Each memory-mapped, 32-bit register is accessed via regular memory-format instructions. The
processor ensures that these accesses do not generate external bus cycles. See Chapter 15,
“Memory Controller”.
1.3.9 Instructions, Data Types and Memory Addressing Modes
As with all 80960 family processors, the 80960VH instruction set supports several different data
types and formats:
• Bit
• Bit fields
• Integer (8-, 16-, 32-, 64-bit)
• Ordinal (8-, 16-, 32-, 64-bit unsigned integers)
• Triple word (96 bits)
• Quad word (128 bits)
Several chapters describe the 80960VH instruction set, including:
• Chapter 3, Programming Environment
• Chapter 5, Instruction Set Overview
• Chapter 6, Instruction Set Reference
1.4 About This Document
The 80960VH combines Peripheral Component Interconnect (PCI) functionality with the
i960 core processor. As such, it is assumed that the reader has a working understanding of
PCI, the PCI Local Bus Specification, revision 2.1, and the i960 core processor.
1.4.1 Terminology
In this document, the following terms are used:
• 80960VH refers generically to the i960® VH processor.
• 80960 local bus refers to the 80960VH’s internal local bus, not the PCI local bus.
• Primary PCI bus is the 80960VH’s internal PCI bus that conforms to PCI SIG specifications.
• i960 core processor refers to the i960® JT processor that is integrated into the 80960VH.
• DWORD is a 32-bit data word.
• 80960 local memory is a memory subsystem on the 80960 processor local bus.
• Downstream — At or toward a PCI bus with a higher number (after configuration).
• Host processor — Processor located upstream from the i960 VH Processor.
• Local processor — i960 core processor within the i960 VH Processor.
• Upstream — At or toward a PCI bus with a lower number (after configuration).
1.4.2 Representing Numbers
Assume that all numbers are base 10 unless designated otherwise. In text, numbers in base 16 are
represented as “nnnH”, where the “H” signifies hexadecimal. In pseudocode descriptions,
hexadecimal numbers are represented in the form 0x1234ABCD. Binary numbers are not explicitly
identified and are assumed when bit operations or bit ranges are used.
1.4.3 Fields
A preserved field in a data structure is one that the processor does not use. Preserved fields can be
used by software; the processor does not modify such fields.
A reserved field is a field that may be used by an implementation. When the initial value of a
reserved field is supplied by software, this value must be zero. Software should not modify
reserved fields or depend on any values in reserved fields.
A read only field can be read to return the current value. Writes to read only fields are treated as
no-op operations and do not change the current value or result in an error condition.
A read/clear field can also be read to return the current value. A write to a read/clear field with the
data value of 0 causes no change to the field. A write to a read/clear field with a data value of 1
causes the field to be cleared (reset to the value of 0). For example, when a read/clear field has a
value of F0H, and a data value of 55H is written, the resultant field is A0H.
A read/set field can also be read to return the current value. A write to a read/set field with the data
value of 0 causes no change to the field. A write to a read/set field with a data value of 1 causes the
field to be set (set to the value of 1). For example, when a read/set field has a value of F0H, and a
data value of 55H is written, the resultant field is F5H.
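The read/clear and read/set write rules above apply bit by bit, so they can be modelled with simple mask operations. The following is a minimal sketch; the function names and the `width` parameter are illustrative assumptions, not part of any register interface described in this manual.

```python
def write_read_clear(field, data, width=8):
    """Write to a read/clear field: each 1 bit in `data` clears the matching bit."""
    mask = (1 << width) - 1
    return field & ~data & mask


def write_read_set(field, data, width=8):
    """Write to a read/set field: each 1 bit in `data` sets the matching bit."""
    mask = (1 << width) - 1
    return (field | data) & mask


# Reproduces the two examples given in the text.
print(f"{write_read_clear(0xF0, 0x55):02X}H")  # prints "A0H"
print(f"{write_read_set(0xF0, 0x55):02X}H")    # prints "F5H"
```

In both cases a data bit of 0 leaves the corresponding field bit unchanged, which is why writing 00H to either kind of field has no effect.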
1.4.4 Specifying Bit and Signal Values
The terms set and clear in this specification refer to bit values in register and data structures. When
a bit is set, its value is 1; when the bit is clear, its value is 0. Likewise, setting a bit means giving it
a value of 1 and clearing a bit means giving it a value of 0.
The terms assert and deassert refer to the logically active or inactive value of a signal or bit,
respectively.
1.4.5 Signal Name Conventions
All signal names use the PCI signal name convention of using the “#” symbol at the end of a signal
name to indicate that the signal’s active state occurs when it is at a low voltage. This includes
80960 processor-related signal names that normally use an overline. The absence of the “#” symbol
indicates that the signal’s active state occurs when it is at a high voltage level.
1.4.6 Solutions960® Program
Intel’s Solutions960® program features a wide variety of development tools that support the i960
processor family. Many of these tools are developed by partner companies; some are developed by
Intel, such as profile-driven optimizing compilers. For more information on these products, contact
your local Intel representative.
1.4.7 Intel Customer Literature and Telephone Support
Contact Intel Corporation for literature and technical assistance for the i960® VH processor.
Country          Literature                        Customer Support Number
United States    800-548-4725                      800-628-8686
Canada           800-468-8118 or 303-297-7763      800-628-8686
Europe           Contact local distributor         Contact local distributor
Australia        Contact local distributor         Contact local distributor
Israel           Contact local distributor         Contact local distributor
Japan            Contact local distributor         Contact local distributor
1.4.8 Related Documents
Intel documentation is available from your Intel Sales Representative or Intel Literature Sales. See
Section 1.4.7 for a complete listing of contact numbers for obtaining Intel literature.
PCI System Design Guide
I2C Peripherals for Microcontrollers
I2C Bus and How to Use It
I2C Peripherals for Microcontrollers
Data Sheet    Intel Order # 273179-001
, revision 2.1    PCI Special Interest Group 1-800-433-5177
, Revision 1.0    PCI Special Interest Group 1-800-433-5177
(Including Specifications)    Philips Semiconductor
(Including Fast Mode)    Signetics
Intel Order # 273174-001
Intel Order # 272483-002
Intel Order # 242016
Philips Semiconductor
1.4.9 Electronic Information
Intel’s documentation and other information is available from Intel’s website. See Table 1-2.
Table 1-2. Electronic Information
Intel’s World-Wide Web Home Page    http://www.intel.com/
Data Types and Memory Addressing Modes
2.1 Data Types
The instruction set references or produces several data lengths and formats. The i960® VH
processor supports the following data types:
• Integer (signed 8, 16 and 32 bits)
• Ordinal (unsigned integer 8, 16, and 32 bits)
• Long Word (64 bits)
• Triple Word (96 bits)
• Quad Word (128 bits)
• Bit Field
• Bit
Figure 2-1 illustrates the class, data type and length of each type supported by i960 processors.
Figure 2-1. Data Types and Ranges
Class                Data Type        Length       Range
Numeric (Integer)    Byte Integer     8 Bits       -2^7 to 2^7 - 1
                     Short Integer    16 Bits      -2^15 to 2^15 - 1
                     Integer          32 Bits      -2^31 to 2^31 - 1
Numeric (Ordinal)    Byte Ordinal     8 Bits       0 to 2^8 - 1
                     Short Ordinal    16 Bits      0 to 2^16 - 1
                     Ordinal          32 Bits      0 to 2^32 - 1
                     Long Ordinal     64 Bits      0 to 2^64 - 1
Non-Numeric          Bit              1 Bit        N/A
                     Bit Field        1-32 Bits
                     Long Word        64 Bits
                     Triple Word      96 Bits
                     Quad Word        128 Bits
[Diagram: bit layouts of Byte (8 bits), Short (16 bits), Word (32 bits), Long (64 bits), Triple Word (96 bits) and Quad Word (128 bits) operands, and of a Bit Field defined by its length and the position of its least significant bit.]
2.1.1 Word/Dword Notation
Data lengths, as described in the PCI Local Bus Specification Revision 2.1, differ from the
conventions used for the 80960 architecture. See also Table 2-1:
• In the PCI specification the term word refers to a 16-bit block of data.
• In this manual and other documentation relating to the 80960VH, the term word refers to a
32-bit block of data.
Table 2-1. 80960 and PCI Architecture Data Word Notation Differences
No. of Bits    PCI Architecture       80960 Architecture
16             word                   short word or half word
32             doubleword or dword    word
2.1.2 Integers
Integers are signed whole numbers that are stored and operated on in two’s complement format by
the integer instructions. Most integer instructions operate on 32-bit integers. Byte and short
integers are referenced by the byte and short classes of the load, store and compare instructions
only.
Integer load or store size (byte, short or word) determines how sign extension or data truncation is
performed when data is moved between registers and memory.
For instructions ldib (load integer byte) and ldis (load integer short), a byte or short word in
memory is considered a two’s complement value. The value is sign-extended and placed in the
32-bit register that is the destination of the load.
Example 2-1. Sign Extensions on Load Byte and Load Short
ldib    7AH is loaded into a register as 0000 007AH
        FAH is loaded into a register as FFFF FFFAH
ldis    05A5H is loaded into a register as 0000 05A5H
        85A5H is loaded into a register as FFFF 85A5H
For instructions stib (store integer byte) and stis (store integer short), a 32-bit two’s complement
number in a register is stored to memory as a byte or short word. When register data is too large to
be stored as a byte or short word, the value is truncated and the integer overflow condition is
signaled. When an overflow occurs, either an AC register flag is set or the
ARITHMETIC.INTEGER_OVERFLOW fault is generated, depending on the Integer Overflow
Mask bit (AC.om) in the AC register. Chapter 9, “Faults” describes the integer overflow fault.
For instructions ld (load word) and st (store word), data is moved directly between memory and a
register with no sign extension or data truncation.
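The sign-extension and truncation rules for integer loads and stores can be illustrated with a small software model. This is a sketch only: the function names are hypothetical, and treating "too large" as "does not survive a round trip through sign extension" is an assumption about how the overflow condition is detected.

```python
def sign_extend(value, bits):
    """Sign-extend a `bits`-wide two's complement value to 32 bits (ldib/ldis behavior)."""
    sign = 1 << (bits - 1)
    value &= (1 << bits) - 1
    return ((value ^ sign) - sign) & 0xFFFFFFFF


def store_integer(reg, bits):
    """Model of stib/stis: truncate a 32-bit register value and report overflow.

    Overflow is assumed to mean the truncated value no longer represents `reg`.
    """
    stored = reg & ((1 << bits) - 1)
    overflow = sign_extend(stored, bits) != (reg & 0xFFFFFFFF)
    return stored, overflow


# The four cases from Example 2-1.
for value, bits in [(0x7A, 8), (0xFA, 8), (0x05A5, 16), (0x85A5, 16)]:
    print(f"{value:X}H -> {sign_extend(value, bits):08X}H")
```

With this model, storing FFFF FFFAH as a byte yields FAH without overflow, while storing 0000 012CH as a byte yields 2CH and flags overflow.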
2.1.3 Ordinals
Ordinals or unsigned integer data types are stored and treated as positive binary values. Figure 2-1
shows the supported ordinal sizes.
The large number of instructions that perform logical, bit manipulation and unsigned arithmetic
operations reference 32-bit ordinal operands. When ordinals are used to represent Boolean values,
1 = TRUE and 0 = FALSE. Most extended arithmetic instructions reference the long ordinal data
type. Only load (ldob and ldos), store (stob and stos), and compare ordinal instructions reference
the byte and short ordinal data types.
Sign and sign extension are not considered when ordinal loads and stores are performed; however,
the values may be zero-extended or truncated. A short word or byte load to a register causes the
value loaded to be zero-extended to 32 bits. A short word or byte store to memory truncates an
ordinal value in a register to fit the destination memory. No overflow condition is signalled in this
case.
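For contrast with the integer case, ordinal loads and stores reduce to masking, with no sign handling and no overflow report. A minimal sketch, using hypothetical function names:

```python
def load_ordinal(mem_value, bits):
    """Model of ldob/ldos: the byte or short value is zero-extended to 32 bits."""
    return mem_value & ((1 << bits) - 1)


def store_ordinal(reg, bits):
    """Model of stob/stos: the register value is truncated; no overflow is signalled."""
    return reg & ((1 << bits) - 1)


print(f"{load_ordinal(0xFA, 8):08X}H")         # prints "000000FAH"
print(f"{store_ordinal(0x12345678, 16):04X}H")  # prints "5678H"
```

The same byte FAH that an integer byte load turns into FFFF FFFAH stays 0000 00FAH when loaded as an ordinal.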
2.1.4 Bits and Bit Fields
The processor provides several instructions that perform operations on individual bits or bit fields
within register operands. An individual bit is specified for a bit operation by giving its bit number
and register. Internal registers always follow little endian byte order; the least significant bit is bit 0
and the most significant bit is bit 31.
A bit field is any contiguous group of bits (up to 32 bits long) in a 32-bit register. Bit fields do not
span register boundaries. A bit field is defined by giving its length in bits (1-32) and the bit number
of its lowest numbered bit (0-31).
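The bit field definition above (a length of 1-32 bits plus the number of the lowest bit, with no spanning of register boundaries) can be sketched as a shift and mask. The function below is an illustrative model with a hypothetical name; it is not a description of any particular bit field instruction.

```python
def extract_bit_field(reg, bitpos, length):
    """Return the `length`-bit field of a 32-bit register whose lowest bit is `bitpos`."""
    if not (0 <= bitpos <= 31 and 1 <= length <= 32):
        raise ValueError("bitpos must be 0-31 and length 1-32")
    if bitpos + length > 32:
        raise ValueError("bit field must not span the register boundary")
    return (reg >> bitpos) & ((1 << length) - 1)


print(f"{extract_bit_field(0x12345678, 8, 8):02X}H")  # prints "56H"
```

A single bit is the special case `length == 1`, so bit 31 of a register is `extract_bit_field(reg, 31, 1)`.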
Loading and storing bit and bit-field data is normally performed using the ordinal load (ldo) and
store (sto) instructions. When an ldi instruction loads a bit or bit field value into a 32-bit register,
the processor appends sign extension bits. A byte or short store can signal an integer overflow
condition.
2.1.5 Triple and Quad Words
Triple and quad words refer to consecutive words in memory or in registers. Triple- and quad-word
load, store and move instructions use these data types to accomplish block movements. No data
manipulation (sign extension, zero extension or truncation) is performed in these instructions.
Triple- and quad-word data types can be considered a superset of the other data types described.
Data in each word subset of a quad word is likely to be the operand or result of an ordinal, integer,
bit or bit field instruction.
2.1.6 Register Data Alignment
Several instructions operate on multiple-word operands. For example, the load-long instruction
(ldl) loads two words from memory into two consecutive registers. The register for the less
significant word is specified in the instruction; the more significant word is automatically loaded
into the next higher-numbered register.
In cases where an instruction specifies a register number, and multiple, consecutive registers are
implied, the register number must be even if two registers are accessed (for example, g0, g2) and an
integral multiple of four if three or four registers are accessed (for example, g0, g4). When a
register reference for a source value is not properly aligned, the registers that the processor writes
to are undefined.
The 80960VH does not require data alignment in external memory; the processor hardware handles
unaligned memory accesses automatically. Optionally, user software can configure the processor
to generate a fault on unaligned memory accesses.
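The register alignment rule in this section (an even register number for two-register operands, a multiple of four for three- or four-register operands) can be expressed as a short check. This is a sketch with a hypothetical function name; it models only the numbering rule, not the processor's response to a misaligned reference.

```python
def register_aligned(reg_num, num_words):
    """Check the register-number alignment rule for a multiple-word operand."""
    if num_words == 1:
        return True                 # single registers have no alignment rule
    if num_words == 2:
        return reg_num % 2 == 0     # long operands: even register (g0, g2, ...)
    if num_words in (3, 4):
        return reg_num % 4 == 0     # triple/quad operands: multiple of four (g0, g4, ...)
    raise ValueError("operands span 1 to 4 registers")


print(register_aligned(2, 2), register_aligned(2, 4))  # prints "True False"
```

An assembler or code generator could apply such a check before emitting ldl, ldt or ldq style operands.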
2.1.7 Literals
The architecture defines a set of 32 literals that can be used as operands in many instructions. These
literals are ordinal (unsigned) values that range from 0 to 31 (5 bits). When a literal is used as an
operand, the processor expands it to 32 bits by adding leading zeros. If the instruction requires an
operand larger than 32 bits, then the processor zero-extends the value to the operand size. If a
literal is used in an instruction that requires integer operands, then the processor treats the literal as
a positive integer value.
2.2 Bit and Byte Ordering in Memory
All occurrences of numeric and non-numeric data types, except bits and bit fields, must start on a
byte boundary. Any data item occupying multiple bytes is stored as little endian.
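Little endian storage means the least significant byte of a multi-byte item occupies the lowest address. The sketch below uses Python's standard `struct` module to show the byte order of a 32-bit word; the value 12345678H is an arbitrary example.

```python
import struct

word = 0x12345678
stored = struct.pack("<I", word)   # "<I" = little endian, unsigned 32-bit

# Bytes listed from the lowest address upward.
print(stored.hex())                # prints "78563412"

# Reading the bytes back as little endian recovers the original word.
assert int.from_bytes(stored, "little") == word
```
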
2.3 Memory Addressing Modes
Nine modes are available for addressing operands in memory. Each addressing mode is used to
reference a byte location in the processor’s address space. Table 2-2 shows the memory addressing
modes and a brief description of each mode’s address elements and assembly code syntax.
Table 2-2. Memory Addressing Modes
Mode                                          Description                             Assembler Syntax       Format
Absolute offset                               offset (smaller than 4096)              exp                    MEMA
Absolute displacement                         displacement (larger than 4095)         exp                    MEMB
Register Indirect                             abase                                   (reg)                  MEMB
  with offset                                 abase + offset                          exp(reg)               MEMA
  with displacement                           abase + displacement                    exp(reg)               MEMB
  with index                                  abase + (index*scale)                   (reg)[reg*scale]       MEMB
  with index and displacement                 abase + (index*scale) + displacement    exp(reg)[reg*scale]    MEMB
Index with displacement                       (index*scale) + displacement            exp[reg*scale]         MEMB
Instruction pointer (IP) with displacement    IP + displacement + 8                   exp(IP)                MEMB
NOTE:
reg is register, exp is an expression or symbolic label, and IP is the Instruction Pointer.
See Table B-9 “MEM Format Instruction Encodings” for more on addressing modes.
For purposes of this memory addressing modes description, MEMA format instructions require
one word of memory and MEMB usually require two words and therefore consume twice the bus
bandwidth to read. Otherwise, both formats perform the same functions.
2.3.1 Absolute
Absolute addressing modes allow a memory location to be referenced directly as an offset from
address 0H. At the instruction encoding level, two absolute addressing modes are provided:
absolute offset and absolute displacement, depending on offset size.
• For the absolute offset addressing mode, the offset is an ordinal number ranging from 0 to
4095. The absolute offset addressing mode is encoded in the MEMA machine instruction
format.
• For the absolute displacement addressing mode, the offset value ranges from 0 to 2^32 - 1. The
absolute displacement addressing mode is encoded in the MEMB format.
Addressing modes and encoding instruction formats are described in Chapter 6, “Instruction Set
Reference”.
At the assembly language level, the two absolute addressing modes use the same syntax. Typically,
development tools allow absolute addresses to be specified through arithmetic expressions (for
example, x + 44) or symbolic labels. After evaluating an address specified with the absolute
addressing mode, the assembler converts the address into an offset or displacement and selects the
appropriate instruction encoding format and addressing mode.
2.3.2 Register Indirect
Register indirect addressing modes use a register’s 32-bit value as a base for address calculation.
The register value is referred to as the address base (designated “abase” in Table 2-2). Depending
on the addressing mode, an optional scaled index and offset can be added to this address base.
Register indirect addressing modes are useful for addressing elements of an array or record
structure. When addressing array elements, the abase value provides the address of the first array
element. An offset (or displacement) selects a particular array element.
In register-indirect-with-index addressing mode, the index is specified using a value contained in a
register. This index value is multiplied by a scale factor. Allowable factors are 1, 2, 4, 8 and 16.
The register-indirect-with-index addressing mode is encoded in the MEMB format.
The two versions of register-indirect-with-offset addressing mode at the instruction encoding level
are register-indirect-with-offset and register-indirect-with-displacement. As with absolute
addressing modes, the mode selected depends on the size of the offset from the base address.
At the assembly language level, the assembler allows the offset to be specified with an expression
or symbolic label, then evaluates the address to determine whether to use
register-indirect-with-offset (MEMA format) or register-indirect-with-displacement (MEMB
format) addressing mode.
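The size-based choice described above, which applies to both the absolute and the register-indirect-with-offset modes, can be sketched as a simple range test. The function name is hypothetical, and the sketch covers only the 0 to 4095 rule stated in this chapter; it is not a model of a real assembler.

```python
def select_format(offset):
    """Pick the encoding format for an evaluated offset, per the 0-4095 rule in the text."""
    # Offsets that fit in the 0..4095 range use the one-word MEMA format;
    # anything else is encoded as a displacement in the MEMB format.
    return "MEMA" if 0 <= offset <= 4095 else "MEMB"


for offset in (44, 4095, 4096):
    print(offset, select_format(offset))
```

Because MEMB instructions usually occupy two words, keeping frequently used offsets below 4096 keeps their encodings to one word.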
Register-indirect-with-index-and-displacement addressing mode adds both a scaled index and a
displacement to the address base. There is only one version of this addressing mode at the
instruction encoding level, and it is encoded in the MEMB instruction format.
2.3.3 Index with Displacement
A scaled index can also be used with a displacement alone. Again, the index is contained in a
register and multiplied by a scaling constant before displacement is added. This mode uses MEMB
format.
2.3.4 IP with Displacement
This addressing mode is used with load and store instructions to make them instruction pointer (IP)
relative. IP-with-displacement addressing mode references the next instruction’s address plus the
displacement plus a constant of 8. The constant is added because, in a typical processor
implementation, the address has incremented beyond the next instruction address at the time of
address calculation. The constant simplifies IP-with-displacement addressing mode
implementation. This mode uses MEMB format.
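As a small worked illustration, the IP-relative calculation can be written as a single expression. This sketch follows the form given in the comment of Example 2-2 (IP + displacement + 8); the function name and the 32-bit wrap-around are assumptions.

```python
def ip_relative_address(ip, displacement):
    """Model of IP-with-displacement: IP + displacement + 8, wrapped to 32 bits."""
    return (ip + displacement + 8) & 0xFFFFFFFF


# A displacement of 0x10 from an IP value of 0x4000
print(hex(ip_relative_address(0x4000, 0x10)))
```

A displacement of -8 therefore refers back to the IP value itself.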
2.3.5 Addressing Mode Examples
The following examples show how i960 processor addressing modes are encoded in assembly
language. Example 2-2 shows addressing mode mnemonics. Example 2-3 illustrates the usefulness
of scaled index and scaled index plus displacement addressing modes. In this example, a procedure
named array_op uses these addressing modes to fill two contiguous memory blocks separated by a
constant offset. A pointer to the top of the block is passed to the procedure in g0, the block size is
passed in g1 and the fill data in g2. Refer to Appendix A, “Machine-level Instruction Formats”.
Example 2-2. Addressing Mode Mnemonics
st      g4,xyz              # Absolute; word from g4 stored at memory
                            # location designated with label xyz.
ldob    (r3),r4             # Register indirect; ordinal byte from
                            # memory location given in r3 loaded
                            # into register r4 and zero extended.
stl     g6,xyz(g5)          # Register indirect with displacement;
                            # double word from g6,g7 stored at memory
                            # location xyz + g5.
ldq     (r8)[r9*4],r4       # Register indirect with index; quad-word
                            # beginning at memory location r8 + (r9
                            # scaled by 4) loaded into r4 through r7.
st      g3,xyz(g4)[g5*2]    # Register indirect with index and
                            # displacement; word in g3 stored to mem
                            # location g4 + xyz + (g5 scaled by 2).
ldis    xyz[r12*2],r13      # Index with displacement; load short
                            # integer at memory location xyz + (r12
                            # scaled by 2) into r13 and sign extended.
st      r4,xyz(IP)          # IP with displacement; store word in r4
                            # at memory location IP + xyz + 8.
Example 2-3. Scaled Index and Scaled Index Plus Displacement Addressing Modes
array_op:
        mov     g0,r4               # Pointer to array is copied to r4.
        subi    1,g1,r3             # Calculate index for the last array
        b       .I33                # element to be filled
.I34:
        st      g2,(r4)[r3*4]       # Fill element at index
        st      g2,0x30(r4)[r3*4]   # Fill element at index+constant offset
        subi    1,r3,r3             # Decrement index
.I33:
        cmpible 0,r3,.I34           # Store next array elements if
        ret                         # index is not negative
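To make the control flow of Example 2-3 easier to follow, the sketch below restates it in Python. It is a behavioral model only. The dictionary standing in for memory, the function signature and the sample values are assumptions; the index arithmetic and the constant offset of 0x30 follow the example.

```python
def array_op(memory, base, count, fill, offset=0x30):
    """Fill `count` words at `base` and `count` words at `base + offset`."""
    index = count - 1                                # subi 1,g1,r3
    while 0 <= index:                                # cmpible 0,r3,.I34
        memory[base + index * 4] = fill              # st g2,(r4)[r3*4]
        memory[base + offset + index * 4] = fill     # st g2,0x30(r4)[r3*4]
        index -= 1                                   # subi 1,r3,r3
    return memory


mem = array_op({}, base=0x8000, count=3, fill=0xDEADBEEF)
for address in sorted(mem):
    print(hex(address), hex(mem[address]))
```

With a 0x30 (48-byte) offset, the two blocks do not overlap as long as the block size passed in g1 is 12 words or fewer.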
Programming Environment
This chapter describes the i960® VH processor’s programming environment including global and
local registers, control registers, literals, processor-state registers and address space.
3.1 Overview
The i960 architecture defines a programming environment for program execution, data storage and
data manipulation. Figure 3-1 shows the programming environment elements that include a
4 Gbyte (2^32 byte) flat address space, an instruction cache, a data cache, global and local
general-purpose registers, a register cache, a set of literals, control registers and a set of processor
state registers.
The processor includes several architecturally-defined data structures located in memory as part of
the programming environment. These data structures handle procedure calls, interrupts and faults
and provide configuration information at initialization. These data structures are:
• interrupt stack
• local stack
• control table
• fault table
• system procedure table
• process control block
3.2 Registers and Literals as Instruction Operands
With the exception of a few special instructions, the 80960VH uses only simple load and store
instructions to access memory. All operations take place at the register level. The processor uses
16 global registers, 16 local registers and 32 literals (constants 0-31) as instruction operands.
The global register numbers are g0 through g15; local register numbers are r0 through r15. Several
of these registers are used for dedicated functions. For example, register r0 is the previous frame
pointer, often referred to as pfp. i960 processor compilers and assemblers recognize only the
instruction operands listed in Table 3-1. Throughout this manual, the registers’ descriptive names,
numbers, operands and acronyms are used interchangeably, as dictated by context.
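Figure 3-1. i960® VH Processor Programming Environment
[Block diagram. The address space (0000 0000H through FFFF FFFFH) holds the architecturally
defined data structures. Instructions are fetched through the instruction cache into the instruction
stream for instruction execution. Loads and stores move data between the address space and the
sixteen 32-bit global registers (g0 through g15) and sixteen 32-bit local registers (r0 through r15),
which are backed by the register cache. The diagram also shows the control registers and the
processor state registers: Instruction Pointer, Arithmetic Controls, Process Controls and Trace
Controls.]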
3.2.1 Global Registers
Global registers are general-purpose 32-bit data registers that provide temporary storage for a
program’s computational operands. These registers retain their contents across procedure
boundaries. As such, they provide a fast and efficient means of passing parameters between
procedures.
Table 3-1. Registers and Literals Used as Instruction Operands
Instruction Operand    Register Name (number)    Function           Acronym
g0 - g14               global (g0-g14)           general purpose
fp                     global (g15)              frame pointer      FP
The i960 architecture supplies 16 global registers, designated g0 through g15. Register g15 is
reserved for the current Frame Pointer (FP), which contains the address of the first byte in the
current (topmost) stack frame in internal memory. See Section 7.1, “Call and Return Mechanism”
on page 7-2 for a description of the FP and procedure stack.
After the processor is reset, register g0 contains the i960 core processor device identification and
stepping information. g0 retains this information until it is written over by the user program. The
i960 core processor device identification and stepping information is also stored in the
memory-mapped DEVICEID register located at FF00 8710H. In addition, the 80960VH device
identification and stepping information is stored in the memory-mapped register located at
0000 1710H.
3.2.2 Local Registers
The i960 architecture provides a separate set of 32-bit local data registers (r0 through r15) for each
active procedure. These registers provide storage for variables that are local to a procedure. Each
time a procedure is called, the processor allocates a new set of local registers and saves the calling
procedure’s local registers. When the application returns from the procedure, the local registers are
released for the next procedure call. The processor performs local register management; a program
need not explicitly save and restore these registers.
Registers r3 through r15 are general purpose registers. Registers r0 through r2 are reserved for
special functions: r0 contains the Previous Frame Pointer (PFP), r1 contains the Stack Pointer (SP)
and r2 contains the Return Instruction Pointer (RIP). These are discussed in Chapter 7, “Procedure
Calls”.
The processor does not always clear or initialize the set of local registers assigned to a new
procedure. Also, the processor does not initialize the local register save area in the newly created
stack frame for the procedure. User software should not rely on the initial values of local registers.
3.2.3 Register Scoreboarding
Register scoreboarding maintains register coherency by preventing parallel execution units from
accessing registers for which there is an outstanding operation. When an instruction that targets a
destination register or group of registers executes, the processor sets a register-scoreboard bit to
indicate that this register or group of registers is being used in an operation. If the instructions that
follow do not require data from registers already in use, then the processor can execute those
instructions before the prior instruction execution completes.
Software can use this feature to execute one or more single-cycle instructions concurrently with a
multi-cycle instruction (for example, multiply or divide). Example 3-1 shows a case where register
scoreboarding prevents a subsequent instruction from executing. It also illustrates overlapping
instructions that do not have register dependencies.
Example 3-1. Register Scoreboarding
muli    r4,r5,r6        # r6 is scoreboarded
addi    r6,r7,r8        # addi must wait for the previous multiply
.                       # to complete
.
.
muli    r4,r5,r10       # r10 is scoreboarded
and     r6,r7,r8        # and instruction is executed concurrently with
                        # multiply
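The two cases in Example 3-1 can be reproduced with a toy scoreboard model. The Python sketch below is a simplification for illustration, not a description of the 80960VH pipeline. It issues one instruction per cycle, and the latency values (4 cycles for muli, 1 cycle otherwise) are hypothetical placeholders.

```python
def issue_order(instructions):
    """Return (name, issue_cycle, stalled) for each instruction.

    Each instruction is (name, source_registers, destination_register, latency).
    A register stays scoreboarded until the instruction writing it completes.
    """
    pending = {}    # register -> cycle at which its pending result is available
    cycle = 0
    schedule = []
    for name, sources, dest, latency in instructions:
        # Wait until no source or destination register is scoreboarded.
        ready = max([pending.get(reg, 0) for reg in sources + [dest]] + [cycle])
        stalled = ready > cycle
        cycle = ready
        pending[dest] = cycle + latency
        schedule.append((name, cycle, stalled))
        cycle += 1
    return schedule


# First case: addi reads r6, which the multiply has scoreboarded.
print(issue_order([("muli", ["r4", "r5"], "r6", 4), ("addi", ["r6", "r7"], "r8", 1)]))
# Second case: the and instruction has no dependency on r10.
print(issue_order([("muli", ["r4", "r5"], "r10", 4), ("and", ["r6", "r7"], "r8", 1)]))
```

In the first sequence the model reports a stall for addi; in the second, the and instruction issues on the cycle after the multiply.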
3.2.4 Literals
The architecture defines a set of 32 literals that can be used as operands in many instructions. These
literals are ordinal (unsigned) values that range from 0 to 31 (5 bits). When a literal is used as an
operand, the processor expands it to 32 bits by adding leading zeros. If the instruction requires an
operand larger than 32 bits, then the processor zero-extends the value to the operand size. If a
literal is used in an instruction that requires integer operands, then the processor treats the literal as
a positive integer value.
3.2.5 Register and Literal Addressing and Alignment
Several instructions operate on multiple-word operands. For example, the load long instruction
(ldl) loads two words from memory into two consecutive registers. The register for the less
significant word is specified in the instruction. The more significant word is automatically loaded
into the next higher-numbered register.
In cases where an instruction specifies a register number and multiple consecutive registers are
implied, the register number must be even if two registers are accessed (for example, g0, g2) and an
integral multiple of 4 if three or four registers are accessed (for example, g0, g4). If a register
reference for a source value is not properly aligned, then the source value is undefined and an
OPERATION.INVALID_OPERAND fault is generated. If a register reference for a destination
value is not properly aligned, then the registers to which the processor writes and the values written
are undefined. The processor then generates an OPERATION.INVALID_OPERAND fault. The
assembly language code in Example 3-2 shows an example of correct and incorrect register
alignment.
Example 3-2. Register Alignment
movl    g3,g8       # Incorrect alignment - resulting value
.                   # in registers g8 and g9 is
.                   # unpredictable (non-aligned source)
.
movl    g4,g8       # Correct alignment
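The register alignment rule can be expressed as a simple check on the register number. The sketch below is illustrative only; the function names are hypothetical, and the check models the rule stated in the text rather than the processor's fault logic.

```python
def required_alignment(register_count):
    """Register-number multiple required for an operand of 1 to 4 registers."""
    if register_count == 1:
        return 1
    if register_count == 2:
        return 2            # register number must be even
    if register_count in (3, 4):
        return 4            # register number must be a multiple of 4
    raise ValueError("operands span 1 to 4 registers")


def is_register_aligned(register_number, register_count):
    """True if e.g. g<register_number> may start a multi-register operand."""
    return register_number % required_alignment(register_count) == 0


print(is_register_aligned(3, 2))   # movl with source g3
print(is_register_aligned(4, 2))   # movl with source g4
```

The two calls correspond to the source operands in Example 3-2: g3 fails the check for a two-register operand, g4 passes.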
Global registers, local registers and literals are used directly as instruction operands. Table 3-2 lists
instruction operands for each machine-level instruction format and the positions that can be filled
by each register or literal.
Table 3-2. Allowable Register Operands
                                                     Operand (1)
Instruction Encoding   Operand Field        Local Register   Global Register   Literal
REG                    src1                 X                X                 X
                       src2                 X                X                 X
                       src/dst (as src)     X                X                 X
                       src/dst (as dst)     X                X
                       src/dst (as both)    X                X
MEM                    src/dst              X                X
                       abase                X                X
                       index                X                X
COBR                   src1                 X                X                 X
                       src2                 X                X
                       dst                  X (2)            X (2)             X (2)
NOTES:
1. “X” denotes the register can be used as an operand in a particular instruction field.
2. The COBR destination operands apply only to TEST instructions.
3.3 Memory-Mapped Control Registers (MMRs)
The 80960VH gives software the interface to easily read and modify internal control registers.
Each of these registers is accessed as a memory-mapped register with a unique memory address.
There are two distinct sets of memory-mapped registers on the 80960VH. The first set exists in the
FF00 0000H through FFFF FFFFH address range and is used to control the i960 core processor
functions. The second set exists in the 0000 1000H through 0000 17FFH address range and is used
to control the 80960VH integrated peripherals. The processor ensures that accesses to MMRs do
not generate external bus cycles.
3.3.1 i960® Core Processor Function Memory-Mapped Registers
Portions of the 80960VH address space (addresses FF00 0000H through FFFF FFFFH) are
reserved for memory-mapped registers. These memory-mapped registers are accessed through
word-operand memory instructions (atmod, atadd, sysctl, ld and st instructions) only. Accesses
to this address space do not generate external bus cycles. The latency in accessing each of these
registers is one cycle.
Each register has an associated access mode (user and supervisor modes) and access type (read and
write accesses). Table C-2 and Table C-3 show all the memory-mapped registers.
The registers are partitioned into user and supervisor spaces based on their addresses. Addresses
FF00 0000H through FF00 7FFFH are allocated to user space memory-mapped registers;
addresses FF00 8000H through FFFF FFFFH are allocated to supervisor space registers.
3.3.1.1 Restrictions on Instructions that Access the i960® Core Processor Memory-Mapped Registers
The majority of memory-mapped registers can be accessed by both load (ld) and store (st)
instructions. However, some registers have restrictions on the types of accesses they allow. To
ensure correct operation, the access type restrictions for each register should be followed. The
access type columns of Table C-2 and Table C-3 indicate the allowed access types for each
register.
Unless otherwise indicated by its access type, the modification of a memory-mapped register by a
st instruction takes effect completely before the next instruction starts execution.
Some operations require an atomic read-modify-write sequence to a register, most notably IPND
and IMSK. The atmod and atadd instructions provide a special mechanism to quickly modify the
IPND and IMSK registers in an atomic manner on the 80960VH. Do not use these instructions on
any other memory-mapped registers.
The sysctl instruction can also modify the contents of a memory-mapped register atomically. In
addition, sysctl is the only method to read the breakpoint registers on the 80960VH; the
breakpoints cannot be read using a ld instruction.
At initialization the control table is automatically loaded into the on-chip control registers. This
action simplifies the user’s start-up code by providing a transparent setup of the processor’s
peripherals. See Chapter 12, “Initialization and System Requirements”.
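The read-modify-write behavior that atmod and atadd provide for IPND and IMSK can be sketched as follows. This Python model shows the data transformation only. It assumes the usual i960 definitions of the two instructions (atmod replaces only the masked bits, atadd adds to the word, and both return the previous value). The dictionary standing in for memory and the address used are hypothetical, and a single-threaded model cannot show the atomicity that the hardware provides.

```python
MASK32 = 0xFFFFFFFF


def atmod(memory, address, mask, value):
    """Replace the masked bits of the word at `address`; return the old word."""
    word_address = address & ~0x3          # operand is word aligned
    old = memory[word_address]
    memory[word_address] = ((value & mask) | (old & ~mask)) & MASK32
    return old


def atadd(memory, address, addend):
    """Add `addend` to the word at `address`; return the old word."""
    word_address = address & ~0x3
    old = memory[word_address]
    memory[word_address] = (old + addend) & MASK32
    return old


EXAMPLE_REGISTER = 0xFF008504              # hypothetical address, for illustration
regs = {EXAMPLE_REGISTER: 0x0000FFFF}
previous = atmod(regs, EXAMPLE_REGISTER, mask=0x000000F0, value=0)
print(hex(previous), hex(regs[EXAMPLE_REGISTER]))
```

Because only the masked bits change, software can clear or set individual bits without disturbing bits that hardware may be updating in the same register.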
3.3.1.2 Access Faults for i960® Core Processor MMRs
Memory-mapped registers are meant to be accessed only as aligned, word-size registers with
adherence to the appropriate access mode. Accessing these registers in any other way results in
faults or undefined operation. An access is performed using the following fault model:
1. The access must be a word-sized, word-aligned access; otherwise, the processor generates an
OPERATION.UNIMPLEMENTED fault.
2. If the access is a store in user mode to an implemented supervisor location, then a
TYPE.MISMATCH fault occurs. It is unpredictable whether a store to an unimplemented
supervisor location causes a fault.
3. If the access is neither of the above, then the access is attempted. Note that an MMR may
generate faults based on conditions specific to that MMR. (Example: trying to write the timer
registers in user mode when they have been allocated to supervisor mode only.)
4. When a store access to an MMR faults, the processor ensures that the store does not take
effect.
5. A load access of a reserved location returns an unpredictable value.
6. Avoid any store accesses to reserved locations. Such a store can result in undefined operation
of the processor if the location is in supervisor space.
Instruction fetches from the memory-mapped register space are not allowed and result in an
OPERATION.UNIMPLEMENTED fault.
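The first three steps of the fault model above can be read as a decision sequence. The Python sketch below encodes that sequence for illustration. The function name, its parameters and the returned strings "unpredictable" and "attempted" are assumptions; the address boundaries are the user and supervisor ranges given in Section 3.3.1. Register-specific faults (step 3) are not modeled.

```python
CORE_MMR_BASE = 0xFF000000         # start of core MMR space (user registers)
SUPERVISOR_MMR_BASE = 0xFF008000   # start of supervisor space registers
CORE_MMR_TOP = 0xFFFFFFFF


def core_mmr_access(address, size, is_store, supervisor_mode, implemented=True):
    """Classify a data access to the core MMR space.

    Returns a fault name, "unpredictable", or "attempted".
    """
    if not CORE_MMR_BASE <= address <= CORE_MMR_TOP:
        raise ValueError("address is outside the core MMR space")
    # Step 1: only word-sized, word-aligned accesses are allowed.
    if size != 4 or address % 4 != 0:
        return "OPERATION.UNIMPLEMENTED"
    # Step 2: user-mode store to a supervisor location.
    if is_store and not supervisor_mode and address >= SUPERVISOR_MMR_BASE:
        return "TYPE.MISMATCH" if implemented else "unpredictable"
    # Step 3: otherwise the access is attempted.
    return "attempted"


print(core_mmr_access(0xFF008710, 4, is_store=True, supervisor_mode=False))
```

When a store faults, step 4 guarantees that the register is left unchanged, so a caller of such a model would skip the write whenever a fault name is returned.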
The Peripheral Memory-Mapped Register (PMMR) interface gives software the ability to read and
modify internal control registers. Each of these 32-bit registers is accessed as a memory-mapped
register with a unique memory address, using regular memory-format instructions from the i960
core processor. See Appendix C, “Memory-Mapped Registers”.
The memory-mapped registers discussed in this chapter are specific to the 80960VH only. They
support the DMA controller, memory controller, PCI and peripheral interrupt controller, messaging
unit, internal arbitration unit, PCI address translation unit and I²C bus interface unit. This manual
provides chapters that fully describe each of these peripherals.
The PMMR interface (addresses 0000 1000H through 0000 17FFH) provides full accessibility
from the primary ATU and the i960 core processor.
3.3.2.1 Accessing The Peripheral Memory-Mapped Registers
The PMMR interface is a slave device connected to the 80960 internal bus. This interface accepts
data transactions that appear on the 80960 internal bus from the Primary ATU and the i960 core
processor. The PMMR interface allows these devices to perform read, write, or read-modify-write
transactions.
The PMMR interface does not support multi-word burst accesses from any bus master. The PMMR
interface supports 32-bit bus width transactions only. Because of this, PMCON0:1 must be
configured as a 32-bit memory region for accesses that originate from the i960 core processor.
The PMMR interface is byte addressable. For PMMR reads, all accesses are promoted to word
accesses and all data bytes are returned. The byte enables generated by the bus masters when
performing PMMR write cycles indicate which data bytes are valid on the 80960 internal bus.
However, there may be requirements from the individual units that interface to the PMMR. For
example, when configuring the DMA channel’s control register, a full 32-bit write must be
performed to configure and restart the DMA channel. These restrictions are highlighted in the
chapters describing the integrated peripheral units.
The PMMR interface supports the 80960 internal bus atomic operations from the i960 core
processor. The i960 core processor provides atmod (atomic modify) and atadd (atomic add)
instructions for atomic accesses to memory. When the 80960 processor executes an atmod or
atadd instruction, the LOCK# signal is asserted. The 80960 internal bus is not granted to any other
bus master until the LOCK# signal is deasserted. This prevents other bus masters from accessing
the PMMR interface during a locked operation.
All PMMR transactions are allowed from the i960 core processor operating in either user mode or
supervisor mode. In addition, the PMMR does not provide any access fault to the i960 core
processor.
The following PMMR registers have read/write access from the 80960 internal bus (for the ATU):
• Vendor ID register
• Device ID register
• Revision ID register
• Class Code register
• Header Type register
For accesses through PCI configuration cycles, access is specified in the register definition located
in the appropriate chapter.
For PCI configuration read transactions, the PMMR returns a zero value for reserved registers. For
PCI configuration write transactions, the PMMR discards the data. For all other accesses, reading
or writing a reserved register is undefined. See Table C-2 and Table C-3 for register memory
locations.
3.4 Architecturally Defined Data Structures
The architecture defines a set of data structures including stacks, interfaces to system procedures,
interrupt handling procedures and fault handling procedures. Table 3-3 defines the data structures
and references other sections of this manual where detailed information can be found.
The 80960VH defines two initialization data structures: the Initialization Boot Record (IBR) and
the Process Control Block (PRCB). These structures provide initialization data and pointers to
other data structures in memory. When the processor is initialized, these pointers are read from the
initialization data structures and cached for internal use.
Pointers to the system procedure table, interrupt table, interrupt stack, fault table and control table
are specified in the processor control block. Supervisor stack location is specified in the system
procedure table. User stack location is specified in the user’s startup code. Of these structures, only
the system procedure table, fault table, control table and initialization data structures may be in
ROM; the interrupt table and stacks must be in RAM. The interrupt table must be located in RAM
to allow posting of software interrupts.
Table 3-3. Data Structure Descriptions
Structure (reference) and Description

User and Supervisor Stacks (Section 7.6, “User and Supervisor Stacks” on page 7-16)
    The processor uses these stacks when executing application code.

Interrupt Stack (Section 8.1.5, “Interrupt Stack And Interrupt Record” on page 8-5)
    A separate interrupt stack is provided to ensure that interrupt handling
    does not interfere with application programs.

System Procedure Table (Section 3.7, “User-Supervisor Protection Model” on page 3-17;
Section 7.5, “System Calls” on page 7-13)
    Contains pointers to system procedures. Application code uses the
    system call instruction (calls) to access system procedures through
    this table. A system supervisor call switches execution mode from
    user mode to supervisor mode. When the processor switches modes,
    it also switches to the supervisor stack.

Interrupt Table (Section 8.1.4, “Interrupt Table” on page 8-3)
    The interrupt table contains vectors (pointers) to interrupt handling
    procedures. When an interrupt is serviced, a particular interrupt table
    entry is specified.

Fault Table (Section 9.3, “Fault Table” on page 9-4)
    Contains pointers to fault handling procedures. When the processor
    detects a fault, it selects a particular entry in the fault table. The
    architecture does not require a separate fault handling stack. Instead,
    a fault handling procedure uses the supervisor stack, user stack or
    interrupt stack, depending on the processor execution mode in which
    the fault occurred and the type of call made to the fault handling
    procedure.

Control Table (Section 12.4.4, “Control Table” on page 12-19)
    Contains on-chip control register values. Control table values are
    moved to on-chip registers at initialization or with sysctl.

3.5 Memory Address Space
The 80960VH’s local address space is byte-addressable with addresses running contiguously from
0 to 2^32 - 1. Some memory space is reserved or assigned special functions as shown in Figure 3-2.
Figure 3-2. Local Memory Address Space
Address Range                 Contents
0000 0000H                    NMI Vector (Internal Data RAM)
0000 0004H - 0000 003FH       Optional Interrupt Vectors (Internal Data RAM)
0000 0040H - 0000 03FFH       Available for Data (Internal Data RAM)
0000 0400H - 0000 0FFFH       i960® VH Processor Reserved
0000 1000H - 0000 17FFH       Peripheral Memory-mapped Registers
0000 1800H - 0000 1FFFH       i960® VH Processor Reserved
0000 2000H - FEFF FF2FH       Code/Data, Architecturally Defined Data Structures (External Memory)
FEFF FF30H - FEFF FF5FH       Initialization Boot Record (IBR)
FEFF FF60H - FEFF FFFFH       Reserved Memory
FF00 0000H - FFFF FFFFH       i960® Core Processor Memory-Mapped Register Space
                              (Reserved Address Space)
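For quick reference, the address ranges of Figure 3-2 can be captured in a small lookup routine. The Python sketch below is an illustration that assumes the region boundaries as read from the figure. The region labels are shortened, and the list and function names are hypothetical.

```python
# (first address, last address, region), boundaries as read from Figure 3-2
REGIONS = [
    (0x00000000, 0x000003FF, "internal data RAM"),
    (0x00000400, 0x00000FFF, "reserved"),
    (0x00001000, 0x000017FF, "peripheral memory-mapped registers"),
    (0x00001800, 0x00001FFF, "reserved"),
    (0x00002000, 0xFEFFFF2F, "external memory"),
    (0xFEFFFF30, 0xFEFFFF5F, "initialization boot record"),
    (0xFEFFFF60, 0xFEFFFFFF, "reserved memory"),
    (0xFF000000, 0xFFFFFFFF, "core memory-mapped registers"),
]


def classify(address):
    """Return the name of the region that contains a 32-bit address."""
    for first, last, name in REGIONS:
        if first <= address <= last:
            return name
    raise ValueError("address is not a 32-bit value")


print(classify(0x00001710))   # 80960VH device ID register
print(classify(0xFF008710))   # core processor DEVICEID register
```

The two sample addresses are the device identification registers mentioned in Section 3.2.1; they fall in the peripheral and core memory-mapped register regions respectively.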
Physical addresses can be mapped to read-write memory, read-only memory and memory-mapped
I/O. The architecture does not define a dedicated, addressable I/O space. There are no subdivisions
of the address space such as segments. For memory management, an external memory
management unit (MMU) may subdivide memory into pages or restrict access to certain areas of
memory to protect a kernel’s code, data and stack. However, the processor views this address space
as linear.
An address in memory is a 32-bit value in the range 0H to FFFF FFFFH. Depending on the
instruction, an address can reference in memory a single byte, short word (2 bytes), word (4 bytes),
double word (8 bytes), triple word (12 bytes) or quad word (16 bytes). Refer to load and store
instruction descriptions in Chapter 6, “Instruction Set Reference” for multiple-byte addressing
information.
3.5.1 Memory Requirements
The architecture requires that external memory have the following properties:
• Memory must be byte-addressable.
• Physical memory must not be mapped to reserved addresses that are specifically used by the
processor implementation.
• Memory must guarantee indivisible access (read or write) for addresses that fall within 16-byte
boundaries.
• Memory must guarantee atomic access for addresses that fall within 16-byte boundaries.
The latter two capabilities, indivisible and atomic access, are required only when multiple
processors or other external agents, such as DMA or graphics controllers, share a common
memory.
indivisible access    Guarantees that a processor, reading or writing a set of memory
                      locations, completes the operation before another processor or external
                      agent can read or write the same location. The processor requires
                      indivisible access within an aligned 16-byte block of memory.
atomic access         A read-modify-write operation. Here the external memory system must
                      guarantee that once a processor begins a read-modify-write operation on
                      an aligned, 16-byte block of memory it is allowed to complete the
                      operation before another processor or external agent can access the
                      same location. An atomic memory system can be implemented by using
                      the LOCK# signal to qualify hold requests from external bus agents. The
                      processor asserts LOCK# for the duration of an atomic memory
                      operation.
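Whether an access falls within a single aligned 16-byte block, and is therefore covered by the indivisibility requirement, can be checked with integer division. The following Python sketch is a worked illustration; the function name is hypothetical.

```python
def within_aligned_16_byte_block(address, size):
    """True if bytes address .. address+size-1 lie in one aligned 16-byte block."""
    if size <= 0:
        raise ValueError("size must be positive")
    return address // 16 == (address + size - 1) // 16


print(within_aligned_16_byte_block(0x1000, 16))   # aligned quad word
print(within_aligned_16_byte_block(0x1008, 16))   # quad word crossing a boundary
```

An aligned quad word occupies exactly one block, while the same access started 8 bytes later spans two blocks.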
The upper 16 Mbytes of the address space (addresses FF00 0000H through FFFF FFFFH) and
addresses 0000 1000H through 0000 17FFH are reserved for implementation-specific functions.
80960VH programs cannot use this address space except for accesses to memory-mapped registers.
The processor does not generate any external bus cycles to this memory. As shown in Figure 3-2,
part of the initialization boot record is located just below the 80960VH’s reserved memory.
The 80960VH requires some special consideration when using the lower 1 Kbyte of address space
(addresses 0000 0000H through 0000 03FFH). Loads and stores directed to these addresses access
internal memory; instruction fetches from these addresses are not allowed by the processor. See
Section 4.1, “Internal Data RAM” on page 4-1. No external bus cycles are generated to this
address space.
3.5.2 Data and Instruction Alignment in the Address Space
Instructions, program data and architecturally defined data structures can be placed anywhere in
non-reserved address space while adhering to these alignment requirements:
• Align instructions on word boundaries.
• Align all architecturally defined data structures on the boundaries specified in Table 3-4.
• Align instruction operands for the atomic instructions (atadd, atmod) to word boundaries in
memory.
The 80960VH can perform unaligned load or store accesses. The processor handles a non-aligned
load or store request by:
• Automatically servicing the non-aligned memory access with microcode assistance as described
in Section 13.4.2, “Bus Transactions Across Region Boundaries” on page 13-5.
• Generating an OPERATION.UNALIGNED fault after the access completes, if directed to do so.
The method of handling faults is selected at initialization based on the value of the Fault
Configuration Word in the Process Control Block. See Section 12.4.2, “Process Control Block –
PRCB” on page 12-15.
Table 3-4. Alignment of Data Structures in the Address Space
Data Structure               Alignment Boundary
System Procedure Table       4 byte
Interrupt Table              4 byte
Fault Table                  4 byte
Control Table                16 byte
User Stack                   16 byte
Supervisor Stack             16 byte
Interrupt Stack              16 byte
Process Control Block        16 byte
Initialization Boot Record   Fixed at FEFF FF30H
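A linker script or start-up check can verify the placements in Table 3-4 mechanically. The Python sketch below illustrates such a check. The dictionary and function names are hypothetical; the alignment values and the fixed IBR address come from the table.

```python
ALIGNMENT_BYTES = {
    "System Procedure Table": 4,
    "Interrupt Table": 4,
    "Fault Table": 4,
    "Control Table": 16,
    "User Stack": 16,
    "Supervisor Stack": 16,
    "Interrupt Stack": 16,
    "Process Control Block": 16,
}
IBR_ADDRESS = 0xFEFFFF30   # the Initialization Boot Record location is fixed


def is_valid_placement(structure, address):
    """True if `address` satisfies the Table 3-4 requirement for `structure`."""
    if structure == "Initialization Boot Record":
        return address == IBR_ADDRESS
    return address % ALIGNMENT_BYTES[structure] == 0


print(is_valid_placement("Control Table", 0x00002010))
print(is_valid_placement("Control Table", 0x00002004))
```

A control table at 0000 2010H passes the 16-byte check; one at 0000 2004H does not, although the same address would be acceptable for the fault table.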
3.5.3 Byte, Word and Bit Addressing
The processor provides instructions for moving data blocks of various lengths from memory to
registers (ld) and from registers to memory (st). Supported sizes for blocks are bytes, short words
(2 bytes), words (4 bytes), double words, triple words and quad words. For example, stl (store
long) stores an 8-byte (double word) data block in memory.
The most efficient way to move data blocks longer than 16 bytes is to move them in quad-word
increments, using quad-word instructions ldq and stq.
When a data block is stored in memory, the block’s least significant byte is stored at a base
memory address and the more significant bytes are stored at successively higher byte addresses.
This method of ordering bytes in memory is referred to as “little endian” ordering.
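The byte ordering described above can be demonstrated with a short Python sketch. It is an illustration only; the dictionary standing in for byte-addressable memory and the function names are assumptions.

```python
def store_little_endian(memory, address, value, size):
    """Store `value` as `size` bytes, least significant byte at `address`."""
    for offset, byte in enumerate(value.to_bytes(size, "little")):
        memory[address + offset] = byte


def load_little_endian(memory, address, size):
    """Load `size` bytes starting at `address` as an unsigned value."""
    raw = bytes(memory[address + offset] for offset in range(size))
    return int.from_bytes(raw, "little")


mem = {}
store_little_endian(mem, 0x100, 0x11223344, 4)
print([hex(mem[0x100 + i]) for i in range(4)])
```

For the word 1122 3344H stored at address 100H, byte 44H lands at 100H and byte 11H at 103H.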
When loading a byte, short word or word from memory to a register, the block’s least significant
bit is always loaded in register bit 0. When loading double words, triple words and quad words, the
least significant word is stored in the base register. The more significant words are then stored at
successively higher-numbered registers. Individual bits can be addressed only in data that resides
in a register: bit 0 in a register is the least significant bit, bit 31 is the most significant bit.
3.5.4 Internal Data RAM
The 80960VH has 1 Kbyte of on-chip data RAM. Only data accesses are allowed in this region.
Portions of the data RAM can also be reserved for functions such as caching interrupt vectors. The
internal RAM is fully described in Chapter 4, “Cache and On-Chip Data RAM”.
3.5.5 Instruction Cache
The instruction cache enhances performance by reducing the number of instruction fetches from
external memory. The cache provides fast execution of cached code and loops of code in the cache
and also provides more bus bandwidth for data operations in external memory. The 80960VH
instruction cache is a 16-Kbyte, two-way set associative cache, organized in two sets of four-word
lines.
3.5.6 Data Cache
The data cache on the 80960VH is a write-through 4-Kbyte direct-mapped cache. For more
information, see Chapter 4, “Cache and On-Chip Data RAM”.
3.6 Processor-State Registers
The architecture defines four 32-bit registers that contain status and control information:
• Instruction Pointer (IP) register
• Arithmetic Controls (AC) register
• Process Controls (PC) register
• Trace Controls (TC) register
3.6.1 Instruction Pointer (IP) Register
The IP register contains the address of the instruction currently being executed. This address is
32 bits long; however, since instructions are required to be aligned on word boundaries in memory,
the IP’s two least-significant bits are always 0 (zero).
All i960 processor instructions are either one or two words long. The IP gives the address of the
lowest-order byte of the first word of the instruction.
The IP register cannot be read directly. However, the IP-with-displacement addressing mode lets
software use the IP as an offset into the address space. This addressing mode can also be used with
the lda (load address) instruction to read the current IP value.
When a break occurs in the instruction stream due to an interrupt, procedure call or fault, the
processor stores the IP of the next instruction to be executed in local register r2, which is usually
referred to as the return IP or RIP register. Refer to Chapter 7, “Procedure Calls” for further
discussion.
3.6.2 Arithmetic Controls Register – AC
The AC register (Figure 3-3) contains condition code flags, integer overflow flag, mask bit and a bit
that controls faulting on imprecise faults. Unused AC register bits are reserved.
Figure 3-3. Arithmetic Controls Register – AC
[Register diagram, bits 31 through 0. Fields shown:
No-Imprecise-Faults Bit - AC.nif: (0) Some Faults are Imprecise, (1) All Faults are Precise
Integer Overflow Mask Bit - AC.om: (0) No Mask, (1) Mask
Integer-Overflow Flag - AC.of: (0) No Overflow, (1) Overflow
Condition Code Bits - AC.cc
Reserved bits (Initialize to 0)]
3.6.2.1 Initializing and Modifying the AC Register
At initialization, the AC register is loaded from the Initial AC image field in the Process Control
Block. Set reserved bits to 0 in the AC Register Initial Image. Refer to Chapter 12, “Initialization
and System Requirements”.
After initialization, software must not modify or depend on the AC register’s initial image in the
PRCB. Software can use the modify arithmetic controls (modac) instruction to examine and/or
modify any of the register bits. This instruction provides a mask operand that lets user software
limit access to the register’s specific bits or groups of bits, such as the reserved bits.
The processor automatically saves and restores the AC register when it services an interrupt or
handles a fault. The processor saves the current AC register state in an interrupt record or fault
record, then restores the register upon returning from the interrupt or fault handler.
3.6.2.2 Condition Code (AC.cc)
The processor sets the AC register’s condition code flags (bits 0-2) to indicate the results of certain
instructions, such as compare instructions. Other instructions, such as conditional branch
instructions, examine these flags and perform functions as dictated by the state of the condition
code flags. Once the processor sets the condition code flags, the flags remain unchanged until
another instruction executes that modifies the field.
Condition code flags show true/false conditions, inequalities (greater than, equal or less than
conditions) or carry and overflow conditions for the extended arithmetic instructions. To show true
or false conditions, the processor sets the flags as shown in Table 3-5. To show equality and
inequalities, the processor sets the condition code flags as shown in Table 3-6.
Table 3-5. Condition Codes for True or False Conditions
Condition Code    Condition
010₂              true
000₂              false
Table 3-6. Condition Codes for Equality and Inequality Conditions
Condition Code    Condition
000₂              unordered
001₂              greater than
010₂              equal
100₂              less than
The term unordered is used when comparing floating point numbers. The 80960VH does not
implement on-chip floating point processing.
To show carry out and overflow, the processor sets the condition code flags as shown in Table 3-7.
Table 3-7. Condition Codes for Carry Out and Overflow
Condition Code    Condition
01X₂              carry out
0X1₂              overflow
Certain instructions, such as the branch-if instructions, use a 3-bit mask to evaluate the condition
code flags. For example, the branch-if-greater-or-equal instruction (bge) uses a mask of 011₂ to
determine if the condition code is set to either greater-than or equal. Conditional instructions use
similar masks for the remaining conditions such as: greater-or-equal (011₂), less-or-equal (110₂)
and not-equal (101₂). The mask is part of the instruction opcode; the instruction performs a bitwise
AND of the mask and condition code.
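The mask test described above can be sketched in C. This is only an illustration of the bitwise AND between an opcode mask and AC.cc; the function and constant names are hypothetical, and any special handling of an all-zero mask is not covered by the text above, so it is not modeled here.

```c
#include <stdbool.h>
#include <stdint.h>

/* Condition code values from Table 3-6 (AC.cc, bits 2-0). */
enum { CC_UNORDERED = 0x0, CC_GREATER = 0x1, CC_EQUAL = 0x2, CC_LESS = 0x4 };

/* Masks quoted in the text: greater-or-equal, less-or-equal, not-equal. */
enum { MASK_GE = 0x3, MASK_LE = 0x6, MASK_NE = 0x5 };

/* Hypothetical helper: the condition is treated as met when the bitwise AND
 * of the 3-bit opcode mask and the condition code is non-zero. */
bool cc_condition_met(uint8_t mask, uint8_t ac_cc)
{
    return ((mask & ac_cc) & 0x7u) != 0;
}
```

For example, cc_condition_met(MASK_GE, CC_EQUAL) is true, while cc_condition_met(MASK_GE, CC_LESS) is false.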
The AC register integer overflow flag (bit 8) and integer overflow mask bit (bit 12) are used in
conjunction with the ARITHMETIC.INTEGER_OVERFLOW fault. The mask bit disables fault
generation. When the fault is masked and integer overflow is encountered, the processor sets the
integer overflow flag instead of generating a fault. If the fault is not masked, then the fault is
allowed to occur and the flag is not set.
Once the processor sets this flag, the flag remains set until the application software clears it. Refer
to the discussion of the ARITHMETIC.INTEGER_OVERFLOW fault in Chapter 9, “Faults” for
more information about the integer overflow mask bit and flag.
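The interaction between the mask bit and the flag can be summarized with a small C model. It is a sketch of the behavior described above, not of the hardware; the macro and function names are hypothetical, and only the bit positions (8 and 12) come from the text.

```c
#include <stdbool.h>
#include <stdint.h>

#define AC_OF (1u << 8)   /* integer overflow flag, bit 8 */
#define AC_OM (1u << 12)  /* integer overflow mask, bit 12 */

/* Models what happens when an integer overflow is encountered.
 * Returns true when an ARITHMETIC.INTEGER_OVERFLOW fault would be generated.
 * When the fault is masked, the sticky flag is set instead and no fault occurs. */
bool on_integer_overflow(uint32_t *ac)
{
    if (*ac & AC_OM) {
        *ac |= AC_OF;     /* flag stays set until software clears it */
        return false;
    }
    return true;          /* fault occurs; flag is left unchanged */
}
```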
The no imprecise faults (AC.nif) bit (bit 15) determines whether or not faults are allowed to be
imprecise. If set, then all faults are required to be precise; if clear, then certain faults can be
imprecise. See Section 9.9, “Precise and Imprecise Faults” on page 9-16 for more information.
3.6.3 Process Controls Register – PC
The PC register (Table 3-4) is used to control processor activity and show the processor’s current
state. The PC register execution mode flag (bit 1) indicates that the processor is operating in either
user mode (0) or supervisor mode (1). The processor automatically sets this flag on a system call
when a switch from user mode to supervisor mode occurs and it clears the flag on a return from
supervisor mode. (User and supervisor modes are described in Section 3.7, “User-Supervisor
Protection Model”.)
The PC register state flag (bit 13) indicates the processor state: executing (0) or interrupted (1). If the
processor is servicing an interrupt, then its state is interrupted. Otherwise, the processor’s state is
executing.
While in the interrupted state, the processor can receive and handle additional interrupts. When
nested interrupts occur, the processor remains in the interrupted state until all interrupts are
handled, then switches back to the executing state on the return from the initial interrupt procedure.
The PC register priority field (bits 16 through 20) indicates the processor’s current executing or
interrupted priority. The architecture defines a mechanism for prioritizing execution of code,
servicing interrupts and servicing other implementation-dependent tasks or events. This
mechanism defines 32 priority levels, ranging from 0 (the lowest priority level) to 31 (the highest).
The priority field always reflects the current priority of the processor. Software can change this
priority by use of the modpc instruction.
The processor uses the priority field to determine whether to service an interrupt immediately or to
post the interrupt. The processor compares the priority of a requested interrupt with the current
process priority. When the interrupt priority is greater than the current process priority or equal to
31, the interrupt is serviced; otherwise it is posted. When an interrupt is serviced, the process
priority field is automatically changed to reflect interrupt priority. See Chapter 8, “Interrupts”.
The PC register trace enable bit (bit 0) and trace fault pending flag (bit 10) control the tracing
function. The trace enable bit determines whether trace faults are globally enabled (1) or globally
disabled (0). The trace fault pending flag indicates that a trace event has been detected (1) or not
detected (0). The tracing functions are further described in Chapter 10, “Tracing and Debugging”.
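The service-or-post decision described above reduces to a single comparison. The following C sketch assumes only what the text states (priority field in PC bits 16 through 20, priority 31 always serviced); the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Extracts the 5-bit priority field (bits 16 through 20) from a PC register image. */
unsigned pc_priority_field(uint32_t pc)
{
    return (pc >> 16) & 0x1Fu;
}

/* Returns true when a requested interrupt would be serviced immediately,
 * false when it would be posted. */
bool interrupt_serviced_now(unsigned int_priority, unsigned process_priority)
{
    assert(int_priority <= 31 && process_priority <= 31);
    return int_priority > process_priority || int_priority == 31;
}
```

Note that an interrupt at priority 31 is serviced even when the process priority is already 31, whereas an interrupt at any other priority equal to the process priority is posted.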
3.6.3.1 Initializing and Modifying the PC Register
Any of the following three methods can be used to change bits in the PC register:
• Modify process controls instruction (modpc)
• Alter the saved process controls prior to a return from an interrupt handler
• Alter the saved process controls prior to a return from a fault handler
The modpc instruction reads and modifies the PC register directly. A TYPE.MISMATCH fault
results if software executes modpc in user mode with a non-zero mask. As with modac, modpc
provides a mask operand that can be used to limit access to specific bits or groups of bits in the
register. In user mode, software can use modpc to read the current PC register.
In the latter two methods, the interrupt or fault handler changes process controls in the interrupt or
fault record that is saved on the stack. Upon return from the interrupt or fault handler, the modified
process controls are copied into the PC register. The processor must be in supervisor mode prior to
return for modified process controls to be copied into the PC register.
When process controls are changed as described above, the processor recognizes the changes
immediately except for one situation: if modpc is used to change the trace enable bit, then the
processor may not recognize the change before the next four non-branch instructions are executed.
After initialization (hardware reset), the process controls reflect the following conditions:
• priority = 31
• execution mode = supervisor
• trace enable = disabled
• state = interrupted
• no trace fault pending
When the processor is reinitialized with a sysctl reinitialize message, the PC register is not
changed.
Software should not use modpc to modify execution mode or trace fault state flags except under
special circumstances, such as in initialization code. Normally, execution mode is changed through
the call and return mechanism. See Section 6.2.43, “modpc” on page 6-72 for more details.
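The masked-update behavior of modpc can be illustrated with a C model. This is a sketch under stated assumptions: the update rule new = (src AND mask) OR (old AND NOT mask) is the conventional reading of a mask operand and is assumed here rather than quoted from this section, and the function and macro names are hypothetical. The authoritative definition is the modpc entry in Chapter 6.

```c
#include <stdbool.h>
#include <stdint.h>

#define PC_TE            (1u << 0)      /* trace enable, bit 0 */
#define PC_EM            (1u << 1)      /* execution mode, bit 1: 1 = supervisor */
#define PC_PRIORITY_MASK (0x1Fu << 16)  /* priority field, bits 16 through 20 */

/* Models modpc on a PC register image.
 * Returns false where a TYPE.MISMATCH fault would result (user mode with a
 * non-zero mask); in that case nothing is changed. Otherwise the previous PC
 * value is returned through *old and only the bits selected by mask are
 * replaced from src (assumed update rule). */
bool modpc_model(uint32_t *pc, uint32_t mask, uint32_t src, uint32_t *old)
{
    bool supervisor = (*pc & PC_EM) != 0;
    if (mask != 0 && !supervisor)
        return false;                      /* TYPE.MISMATCH */
    *old = *pc;                            /* a zero mask makes this a pure read */
    *pc = (src & mask) | (*pc & ~mask);
    return true;
}
```

With a zero mask the model only reads the register, which matches the statement that user-mode software can use modpc to read the current PC register.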
3.6.4 Trace Controls (TC) Register
The TC register, in conjunction with the PC register, controls processor tracing facilities. It
contains trace mode enable bits and trace event flags that are used to enable specific tracing modes
and record trace events, respectively. Trace controls are described in Chapter 10, “Tracing and
Debugging”.
3.7 User-Supervisor Protection Model
The processor can be in either of two execution modes: user or supervisor. The capability of a
separate user and supervisor execution mode creates a code and data protection mechanism
referred to as the user-supervisor protection model. This mechanism allows code, data and stack
for a kernel (or system executive) to reside in the same address space as code, data and stack for the
application. The mechanism restricts access to all or parts of the kernel by the application code.
This protection mechanism prevents application software from inadvertently altering the kernel.
3.7.1 Supervisor Mode Resources
Supervisor mode is a privileged mode that provides several additional capabilities over user mode.
• When the processor switches to supervisor mode, it also switches to the supervisor stack.
Switching to the supervisor stack helps maintain a kernel’s integrity. For example, it allows
access to system debugging software or a system monitor, even if an application’s program
destroys its own stack.
• In supervisor mode, the processor is allowed access to a set of supervisor-only functions and
instructions. For example, the processor uses supervisor mode to handle interrupts and trace
faults. Operations that can modify interrupt controller behavior or reconfigure bus controller
characteristics can be performed only in supervisor mode. These functions include
modification of control registers and internal data RAM that is dedicated to interrupt
controllers. A fault is generated if supervisor-only operations are attempted while the
processor is in user mode.
The PC register execution mode flag specifies processor execution mode. The processor
automatically sets and clears this flag when it switches between the two execution modes.
• intctl (global interrupt enable and disable)
• Protected internal data RAM or Supervisor MMR space write
• intdis (global interrupt disable)
Note that all of these instructions return a TYPE.MISMATCH fault if executed in user mode.
3.7.2 Using the User-Supervisor Protection Model
A program switches from user mode to supervisor mode by making a system-supervisor call (also
referred to as a supervisor call). A system-supervisor call is a call executed with the call-system
instruction (calls). With calls, the IP for the called procedure comes from the system procedure
table. An entry in the system procedure table can specify an execution mode switch to supervisor
mode when the called procedure is executed. calls and the system procedure table thus provide a
tightly controlled interface to procedures that can execute in supervisor mode. Once the processor
switches to supervisor mode, it remains in that mode until a return is performed to the procedure
that caused the original mode switch.
Interrupts and faults can cause the processor to switch from user to supervisor mode. When the
processor handles an interrupt, it automatically switches to supervisor mode. However, it does not
switch to the supervisor stack. Instead, it switches to the interrupt stack. Fault table entries
determine if a particular fault transitions the processor from user to supervisor mode.
If an application does not require a user-supervisor protection mechanism, then the processor can
always execute in supervisor mode. At initialization, the processor is placed in supervisor mode
prior to executing the first instruction of the application code. The processor then remains in
supervisor mode indefinitely, as long as no action is taken to change execution mode to user mode.
The processor does not need a user stack in this case.
Cache and On-Chip Data RAM
This chapter describes the structure and user configuration of all forms of on-chip storage,
including caches (data, local register and instruction) and data RAM.
4.1 Internal Data RAM
Internal data RAM is mapped to the lower 1 Kbyte (0 to 03FFH) of the address space. Loads and
stores with target addresses in internal data RAM operate directly on the internal data RAM; no
external bus activity is generated. Data RAM allows time-critical data storage and retrieval without
dependence on external bus performance. Only data accesses are allowed to the internal data RAM;
instructions cannot be fetched from the internal data RAM. Instruction fetches directed to the data
RAM cause an OPERATION.UNIMPLEMENTED fault to occur.
Internal data RAM locations are never cached in the data cache. Logical Memory Template bits
controlling caching are ignored for data RAM accesses.
Some internal data RAM locations are reserved for functions other than general data storage. The
first 64 bytes of data RAM may be used to cache interrupt vectors, which reduces latency for these
interrupts. The word at location 0000H is always reserved for the cached NMI vector. With the
exception of the cached NMI vector, other reserved portions of the data RAM can be used for data
storage when the alternate function is not used. All locations of the internal data RAM can be read
in both supervisor and user mode.
The first 64 bytes (0000H to 003FH) of internal RAM are always user-mode write-protected. This
portion of data RAM can be read while executing in user or supervisor mode; however, it can be
modified only in supervisor mode. This area can also be write-protected from supervisor mode
writes by setting the BCON.sirp bit. See Section 13.3.1, “Bus Control Register – BCON” on
page 13-4. Protecting this portion of the data RAM from user and supervisor writes preserves the
interrupt vectors that may be cached there. See Section 8.5.2.1, “Vector Caching Option” on
page 8-35.
Figure 4-1. Internal Data RAM and Register Cache
[Figure: internal data RAM map. 0000 0000H: NMI vector. 0000 0004H to 0000 003FH: Optional
Interrupt Vectors. 0000 0040H to 0000 03FFH: Available for Data.]
The remainder of the internal data RAM can always be written from supervisor mode. User mode
write protection is optionally selected for the rest of the data RAM (40H to 3FFH) by setting the
Bus Control Register RAM protection bit (BCON.irp). Writes to internal data RAM locations
while they are protected generate a TYPE.MISMATCH fault. See Section 13.3.1, “Bus Control
Register – BCON” on page 13-4 for the format of the BCON register.
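The write-protection rules for internal data RAM can be collected into one predicate. The C sketch below encodes only the rules stated in this section; the function name and the boolean parameters standing in for the BCON.irp and BCON.sirp bits are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns true when a write to internal data RAM address addr (0 to 03FFH)
 * is permitted, false when it would generate a TYPE.MISMATCH fault.
 *   supervisor : processor is in supervisor mode
 *   bcon_irp   : BCON.irp set (user-mode write protection for 40H to 3FFH)
 *   bcon_sirp  : BCON.sirp set (supervisor write protection for 0H to 3FH) */
bool data_ram_write_allowed(uint32_t addr, bool supervisor,
                            bool bcon_irp, bool bcon_sirp)
{
    assert(addr <= 0x3FFu);
    if (addr < 0x40u)                    /* first 64 bytes: never writable by user code */
        return supervisor && !bcon_sirp;
    return supervisor || !bcon_irp;      /* remainder: supervisor can always write */
}
```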
New versions of i960 processor compilers take advantage of internal data RAM. Profiling
compilers, such as those offered by Intel, can allocate the most frequently used variables into this
RAM.
4.2 Local Register Cache
The i960® VH processor provides fast storage of local registers for call and return operations by
using an internal local register cache (also known as a stack frame cache). Up to eight local register
sets can be contained in the cache before sets must be saved in external memory. The register set is
all the local registers (i.e., r0 through r15). The processor uses a 128-bit wide bus to store local
register sets quickly to the register cache. An integrated procedure call mechanism saves the
current local register set when a call is executed. A local register set is saved into a frame in the
local register cache, one frame per register set. When the eighth frame is saved, the oldest set of
local registers is flushed to the procedure stack in external memory, which frees one frame.
Section 7.1.4, “Caching Local Register Sets” on page 7-6 and Section 7.1.5, “Mapping Local
Registers to the Procedure Stack” on page 7-10 further discuss the relationship between the internal
register cache and the external procedure stack.
The branch-and-link (bal and balx) instructions do not cause the local registers to be stored.
The entire internal register cache contents can be copied to the external procedure stack through the
flushreg instruction. Section 6.2.30, “flushreg” on page 6-50 explains the instruction itself and
Section 7.2, “Modifying the PFP Register” on page 7-10 offers a practical example when flushreg
must be used.
To decrease interrupt latency, software can reserve a number of frames in the local register cache
solely for high priority interrupts (interrupted state and process priority greater than or equal to 28).
The remaining frames in the cache can be used by all code, including high-priority interrupts.
When a frame is reserved for high-priority interrupts, the local registers of the code interrupted by
a high-priority interrupt can be saved to the local register cache without causing a frame flush to
memory, providing that the local register cache is not already full. Thus, the register allocation for
the implicit interrupt call does not incur the latency of a frame flush.
Software can reserve frames for high-priority interrupt code by writing bits 10 through 8 of the
register cache configuration word in the PRCB. This value indicates the number of free frames
within the register cache that can be used by high-priority interrupts only. Any attempt by
non-critical code to reduce the number of free frames below this value results in a frame flush to
external memory. The free frame check is performed only when a frame is pushed, which occurs
only for an implicit or explicit call. The following pseudo-code illustrates the operation of the
register cache when a frame is pushed.
Example 4-1. Register Cache Operation
frames_for_non_critical = 7- RCW[11:8];
if (interrupt_request)
else if (number_of_frames = (frames_for_non_critical + 1) &&
(PC.priority < 28 || PC.state != interrupted) ) {
The valid range for the number of reserved free frames is 0 to 7. Setting the value to 0 reserves no
frames for exclusive use by high-priority interrupts. Setting the value to 1 reserves 1 frame for
high-priority interrupts and 6 frames to be shared by all code. Setting the value to 7 causes the
register cache to become disabled for non-critical code. If the number of reserved high-priority
frames exceeds the allocated size of the register cache, then the entire cache is reserved for
high-priority interrupts. In that case, all low-priority interrupts and procedure calls cause frame
spills to external memory.
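The reserved-frame behavior described in this section can be modeled in C as a frame counter. This is a sketch under stated assumptions, not a reproduction of Example 4-1: it combines the rule that saving the eighth frame flushes the oldest frame with the non-critical check quoted in the pseudo-code. The struct and function names are hypothetical, and the caller is assumed to pass the PC priority and state in effect when the frame is pushed.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    unsigned frames;   /* frames currently held in the register cache */
    unsigned flushes;  /* frames flushed to external memory so far */
} regcache_t;

/* Models one frame push (implicit or explicit call).
 *   reserved    : number of frames reserved for high-priority interrupts (0 to 7)
 *   pc_priority : PC priority field at the time of the push
 *   interrupted : PC state flag at the time of the push */
void regcache_push(regcache_t *rc, unsigned reserved,
                   unsigned pc_priority, bool interrupted)
{
    assert(reserved <= 7);
    unsigned frames_for_non_critical = 7 - reserved;
    bool critical = interrupted && pc_priority >= 28;

    rc->frames++;
    if (rc->frames == 8 ||
        (!critical && rc->frames == frames_for_non_critical + 1)) {
        rc->frames--;      /* oldest frame goes to the procedure stack */
        rc->flushes++;
    }
}
```

With one reserved frame, six non-critical pushes fit in the cache, the seventh causes a flush, and a following high-priority interrupt push is absorbed without a flush.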
4.3 Instruction Cache
The 80960VH features a 16-Kbyte, 2-way set-associative instruction cache (I-cache) organized in
lines of four 32-bit words. The cache provides fast execution of cached code and loops of code and
provides more bus bandwidth for data operations in external memory. To optimize cache updates
when branches or interrupts are executed, each word in the line has a separate valid bit. When
requested instructions are found in the cache, the instruction fetch time is one cycle for up to four
words. A mechanism to load and lock critical code within a way of the cache is provided along
with a mechanism to disable the cache. The cache is managed through the icctl or sysctl
instruction. The sysctl instruction supports the instruction cache to maintain compatibility with
other i960 processor software. Using icctl is the preferred and more versatile method for
controlling the instruction cache on the 80960VH.
Cache misses cause the processor to issue a double-word or a quad-word fetch, based on the
location of the Instruction Pointer:
• If the IP is at word 0 or word 1 of a 16-byte block, a four-word fetch is initiated.
• If the IP is at word 2 or word 3 of a 16-byte block, a two-word fetch is initiated.
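The fetch-size rule in the two bullets above depends only on address bits 3 and 2 of the IP. A minimal C sketch, with a hypothetical function name:

```c
#include <stdint.h>

/* Returns the number of words fetched on an instruction cache miss:
 * four when the IP points at word 0 or 1 of its 16-byte block,
 * two when it points at word 2 or 3. */
unsigned icache_miss_fetch_words(uint32_t ip)
{
    unsigned word_in_block = (ip >> 2) & 0x3u;  /* IP bits 3:2 */
    return (word_in_block < 2) ? 4u : 2u;
}
```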
4.3.1 Enabling and Disabling the Instruction Cache
Enabling the instruction cache is controlled on reset or initialization by the instruction cache
configuration word in the Process Control Block (PRCB); see Table 12-8 “Process Control Block
Configuration Words” on page 12-17. When bit 16 in the instruction cache configuration word is
set, the instruction cache is disabled and all instruction fetches are directed to external memory.
Disabling the instruction cache is useful for tracing execution in a software debug environment.
The instruction cache remains disabled until one of three operations is performed:
• icctl is issued with the enable instruction cache operation (preferred method)
• sysctl is issued with the configure-instruction-cache message type and cache configuration
mode other than disable cache (provides compatibility with other i960 processors; not the
preferred method for 80960VH).
• The processor is reinitialized with a new value in the instruction cache configuration word
4.3.2 Operation While the Instruction Cache Is Disabled
Disabling the instruction cache does not disable instruction buffering that may occur in the
instruction fetch unit. A four-word instruction buffer is always enabled, even when the cache is
disabled.
There is one tag and four word-valid bits associated with the buffer. Because there is only one tag
for the buffer, any “miss” within the buffer causes the following:
• All four words of the buffer are invalidated.
• A new tag value for the required instruction is loaded.
• The required instruction(s) are fetched from external memory.
Depending on the alignment of the “missed” instruction, either two or four words of instructions
are fetched and only the valid bits corresponding to the fetched words are set in the buffer. No
external instruction fetches are generated until there is a “miss” within the buffer, even in the
presence of forward and backward branches.
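The single-tag buffer behavior can be sketched as a small C model. Two points are assumptions rather than statements from this section: the two-or-four-word choice is taken to follow the same word-position rule given for instruction cache misses, and a two-word fetch is taken to fill words 2 and 3 of the 16-byte block. The type and function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t tag;    /* address of the buffered 16-byte block, shifted right by 4 */
    uint8_t  valid;  /* bit n set means word n of the block is valid */
} ibuf_t;

/* Models one instruction fetch through the four-word buffer.
 * Returns true on a buffer hit (no external fetch), false on a miss.
 * On a miss all words are invalidated, the tag is reloaded, and only the
 * fetched words are marked valid. */
bool ibuf_fetch(ibuf_t *b, uint32_t ip)
{
    uint32_t tag = ip >> 4;
    unsigned word = (ip >> 2) & 0x3u;

    if (b->tag == tag && (b->valid & (1u << word)))
        return true;

    b->valid = 0;                                  /* invalidate all four words */
    b->tag = tag;                                  /* load the new tag */
    b->valid = (word < 2) ? 0xFu : 0xCu;           /* 4-word or 2-word fetch (assumed) */
    return false;
}
```

In this model a tight loop contained in the valid words of one block keeps hitting the buffer, which is consistent with the statement that branches alone do not generate external fetches.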
4.3.3 Loading and Locking Instructions in the Instruction Cache
The processor can be directed to load a block of instructions into the cache and then lock out all
normal updates to the cache. This cache load-and-lock mechanism is provided to minimize latency
on program control transfers to key operations such as interrupt service routines. The block size
that can be loaded and locked on the 80960VH is one way of the cache.
An icctl or sysctl instruction is issued with a configure-instruction-cache message type to select
the load-and-lock mechanism. When the lock option is selected, the processor loads the cache
starting at an address specified as an operand to the instruction.
4.3.4 Instruction Cache Visibility
Instruction cache status can be determined by issuing icctl with an instruction-cache status
message. To facilitate debugging, the instruction cache contents, instructions, tags and valid bits
can be written to memory. This is done by issuing icctl with the store cache operation.
4.3.5 Instruction Cache Coherency
The 80960VH does not snoop the bus to prevent instruction cache incoherency. The cache does not
detect modification to program memory by loads, stores or actions of other bus masters. Several
situations may require program memory modification, such as uploading code at initialization or
loading from a backplane bus or a disk drive.
The application program is responsible for synchronizing its own code modification and cache
invalidation. In general, a program must ensure that modified code space is not accessed until
modification and cache-invalidate are completed. To achieve cache coherency, instruction cache
contents should be invalidated after code modification is complete. icctl invalidates the instruction
cache for the 80960VH. Alternately, i960 processor legacy software can use sysctl.
4.4 Data Cache
The 80960VH features a 4-Kbyte, direct-mapped cache that enhances performance by reducing the
number of data load and store accesses to external memory. The cache is write-through and
write-allocate. It has a line size of 4 words and each line in the cache has a valid bit. To reduce
fetch latency on cache misses, each word within a line also has a valid bit. Caches are managed
through the dcctl instruction.
User settings in the memory region configuration registers LMCON0-1 and DLMCON determine
the data accesses that are cacheable or non-cacheable based on memory region.
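The stated geometry (4 Kbytes, direct-mapped, four-word lines) implies 256 lines of 16 bytes each. The C sketch below shows one way an address could be split for such a cache. The bit assignment (word select in bits 3:2, line index in bits 11:4, tag in bits 31:12) is the conventional layout for this geometry and is an assumption here, not a statement from this manual; all names are hypothetical.

```c
#include <stdint.h>

enum {
    DCACHE_BYTES      = 4096,
    DCACHE_LINE_BYTES = 16,                               /* four 32-bit words */
    DCACHE_LINES      = DCACHE_BYTES / DCACHE_LINE_BYTES  /* 256 lines */
};

typedef struct {
    uint32_t tag;    /* assumed: address bits 31:12 */
    unsigned index;  /* assumed: address bits 11:4, selects one of 256 lines */
    unsigned word;   /* address bits 3:2, selects the word and its valid bit */
} dcache_addr_t;

/* Splits a byte address into the fields a direct-mapped lookup would use. */
dcache_addr_t dcache_split(uint32_t addr)
{
    dcache_addr_t a;
    a.word  = (addr >> 2) & 0x3u;
    a.index = (addr >> 4) & (uint32_t)(DCACHE_LINES - 1);
    a.tag   = addr >> 12;
    return a;
}
```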
4.4.1 Enabling and Disabling the Data Cache
To cache data, two conditions must be met:
1. The data cache must be enabled. A dcctl instruction issued with an enable data cache message
enables the cache. On reset or initialization, the data cache is always disabled and all valid bits
are set to zero.
2. Data caching for a location must be enabled by the corresponding logical memory template, or
by the default logical memory template if no other template applies. See Section 13.2,
“Programming the Physical Memory Attributes (Pmcon Registers)” on page 13-3 for more
details on logical memory templates.
When the data cache is disabled, all data fetches are directed to external memory. Disabling the
data cache is useful for debugging or monitoring a system. To disable the data cache, issue a dcctl
with a disable data cache message. The enable and disable status of the data cache and various
attributes of the cache can be determined by a dcctl issued with a data-cache status message.
4.4.2 Multi-Word Data Accesses that Partially Hit the Data Cache
The following applies only when data caching is enabled for an access.
For a multi-word load access (ldl, ldt, ldq) in which none of the requested words hit the data cache,
an external bus transaction is started to acquire all the words of the access.
For a multi-word load access that partially hits the data cache, the processor may either:
• Load or reload all words of the access (even those that hit) from the external bus.
• Load only missing words from the external bus and interleave them with words found in the
data cache.
The multi-word alignment determines which of the above methods is used:
• Naturally aligned multi-word accesses cause all words to be reloaded.
• An unaligned multi-word access causes only missing words to be loaded.
When any words (Table 4-1) accessed with ldl, ldt, or ldq miss the data cache, every word
accessed by that load instruction is updated in the cache.
Table 4-1. Load Instruction Updates
Load Instruction    Number of Updated Words
ldq                 4 words
ldt                 3 words
ldl                 2 words
In each case, the external bus accesses used to acquire the data may consist of none, one, or several
burst accesses based on the alignment of the data and the bus-width of the memory region that
contains the data. See Chapter 13, “Core Processor Local Bus Configuration” for more details.
A multi-word load access that completely hits in the data cache does not cause external bus
accesses.
For a multi-word store access (stl, stt, stq) an external bus transaction is started to write all words
of the access regardless if any or all words of the access hit the data cache. External bus accesses
used to write the data may consist of either one or several burst accesses based on data alignment
and the bus-width of the memory region that receives the data. The cache is also updated
accordingly as described earlier in this chapter.
4.4.3 Data Cache Fill Policy
The 80960VH always uses a “natural” fill policy for cacheable loads. The processor fetches only
the amount of data that is requested by a load (i.e., a word, long word, etc.) on a data cache miss.
Exceptions are byte and short-word accesses, which are always promoted to words. This allows a
complete word to be brought into the cache and marked valid. When the data cache is disabled and
loads are done from a cacheable region, promotions from bytes and short words still take place.
4.4.4 Data Cache Write Policy
The write policy determines what happens on cacheable writes (stores). The 80960VH always uses
a write-through policy. Stores are always seen on the external bus, thus maintaining coherency
between the data cache and external memory.
The 80960VH always uses a write-allocate policy for data. For a cacheable location, data is always
written to the data cache regardless of whether the access is a hit or miss. The following cases are
relevant to consider:
1. In the case of a hit for a word or multi-word store, the appropriate line and word(s) are updated
with the data.
2. In the case of a miss for a word or multi-word store, a tag and cache line are allocated, if
needed, and the appropriate valid bits, line, and word(s) are updated.
3. In the case of byte or short-word data that hits a valid word in the cache, both the word in
cache and external memory are updated with the data; the cache word remains valid.
4. In the case of byte or short-word data that falls within a valid line but misses because the
appropriate word is invalid, both the word and external memory are updated with the data;
however, the cache word remains invalid.
5. In the case of byte or short-word data that does not fall within a valid line, the external
memory is updated with the data. For data writes less than a word, the data cache is not
updated; the tags and valid bits are not changed.
A byte or short word is always invalid in the data cache since valid bits only apply to words.
For cacheable stores that are equal to or greater than a word in length, cache tags and appropriate
valid bits are updated whenever data is written into the cache. Consider a word store that misses as
an example. The tag is always updated and its valid bit is set. The appropriate valid bit for that
word is always set and the other three valid bits are always cleared. If the word store hits the cache,
the tag bits remain unchanged. The valid bit for the stored word is set; all other valid bits are
unchanged.
Cacheable stores that are less than a word in length are handled differently. Byte and short-word
stores that hit the cache (i.e., are contained in valid words within valid cache lines) do not change
the tag and valid bits. The processor writes the data into the cache and external memory as usual. A
byte or short-word store to an invalid word within a valid cache line leaves the word’s valid bit
cleared because the rest of the word is still invalid. In these two cases the processor simultaneously
writes the data into the cache and the external memory.
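The tag and valid-bit bookkeeping for cacheable stores can be sketched in C for a single cache line. The sketch tracks only the tag and valid bits, not the data, and it covers byte, short-word and single-word stores; multi-word stores are not modeled. The tag position (address bits 31:12) is assumed from the 4-Kbyte direct-mapped geometry, and all names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef struct {
    uint32_t tag;         /* assumed: address bits 31:12 */
    bool     tag_valid;   /* line valid bit */
    uint8_t  word_valid;  /* bit n set means word n of the line is valid */
} dline_t;

/* Updates the bookkeeping of the line selected by addr for one cacheable store.
 * size_bytes is 1 (byte), 2 (short word) or 4 (word). */
void dcache_store_update(dline_t *line, uint32_t addr, unsigned size_bytes)
{
    uint32_t tag = addr >> 12;
    uint8_t  bit = (uint8_t)(1u << ((addr >> 2) & 0x3u));
    bool tag_hit = line->tag_valid && line->tag == tag;

    if (size_bytes < 4)
        return;                    /* byte and short-word stores never change tag or valid bits */

    if (!tag_hit) {                /* word store miss: allocate the line */
        line->tag = tag;
        line->tag_valid = true;
        line->word_valid = 0;      /* the other three valid bits are cleared */
    }
    line->word_valid |= bit;       /* the stored word becomes valid */
}
```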
4.4.5 Data Cache Coherency and Non-Cacheable Accesses
The 80960VH ensures that the data cache is always kept coherent with accesses that it initiates and
performs. The most visible application of this requirement concerns non-cacheable accesses
discussed below. However, the processor does not provide data cache coherency for accesses on
the external bus that it did not initiate. Software is responsible for maintaining coherency in a
multi-processor environment.
An access is defined as non-cacheable when any of the following is true:
1. The access falls into an address range mapped by an enabled LMCON or DLMCON and the
data-caching enabled bit in the matching LMCON is clear.
2. The entire data cache is disabled.
3. The access is a read operation of the read-modify-write sequence performed by an
atmod or atadd instruction.
4. The access is an implicit read access to the interrupt table to post or deliver a software
interrupt.
If the memory location targeted by an atmod or atadd instruction is currently in the data cache, it
is invalidated.
If the address for a non-cacheable store matches a tag (“tag hit”), the corresponding cache line is
marked invalid. This is because the word is not actually updated with the value of the store. This
behavior ensures that the data cache never contains stale data in a single-processor system. A
simple case illustrates the necessity of this behavior: a read of data previously stored by a
non-cacheable access must return the new value of the data, not the value in the cache. Because the
processor invalidates the appropriate word in the cache line on a store hit when the cache is
disabled, coherency can be maintained when the data cache is enabled and disabled dynamically.
Data loads or stores invalidate the corresponding lines of the cache even when data caching is
disabled. This behavior further ensures that the cache does not contain stale data.
4.4.6 External I/O and Bus Masters and Cache Coherency
The 80960VH implements a single processor coherency mechanism. There is no hardware
mechanism, such as bus snooping, to support multiprocessing. If another bus master can change
shared memory, then there is no guarantee that the data cache contains the most recent data. The
user must manage such data coherency issues in software.
A suggested practice is to program the LMCON0-1 registers so that I/O regions are non-cacheable.
Partitioning the system in this fashion eliminates I/O as a source of coherency problems. See
Section 13.2, “Programming the Physical Memory Attributes (PMCON Registers)” on page 13-3 for
more information on this subject.
4.4.7 Data Cache Visibility
Data cache status can be determined by a dcctl instruction issued with a data-cache status message.
Data cache contents, data, tags and valid bits can be written to memory as an aid for debugging.
This operation is accomplished by a dcctl instruction issued with the dump cache operand. See
Section 6.2.23, “dcctl” on page 6-37 for more information.
Instruction Set Overview
This chapter provides an overview of the i960® microprocessor family’s instruction set and i960®
VH processor-specific instruction set extensions. Also discussed are the assembly-language and
instruction-encoding formats, various instruction groups and each group’s instructions.
Chapter 6, “Instruction Set Reference” describes each instruction, including assembly language
syntax, the action taken when the instruction executes and examples of how to use the
instruction.
5.1 Instruction Formats
80960VH instructions may be described in two formats: assembly language and instruction
encoding. The following subsections briefly describe these formats.
5.1.1 Assembly Language Format
Throughout this manual, instructions are referred to by their assembly language mnemonics. For
example, the add ordinal instruction is referred to as addo. Examples use Intel 80960 assembly
language syntax which consists of the instruction mnemonic followed by zero to three operands,
separated by commas. In the following assembly language statement example for addo, ordinal
operands in global registers g5 and g9 are added together, and the result is stored in g7:
addo g5, g9, g7        # g7 = g9 + g5
In the assembly language listings in this chapter, registers are denoted as:
g    global register
r    local register
#    pound sign precedes a comment
All numbers used as literals or in address expressions are assumed to be decimal. Hexadecimal
numbers are denoted with a “0x” prefix (e.g., 0xffff0012). Several assembly language instruction
statement examples follow. Additional assembly language examples are given in Section 2.3.5,
“Addressing Mode Examples” on page 2-6.
subi r3, r5, r6        # r6 = r5 - r3
setbit 13, g4, g5      # g5 = g4 with bit 13 set
lda 0xfab3, r12        # r12 = 0xfab3
ld (r4), g3            # g3 = memory location that r4 points to
st g10, (r6)[r7*2]     # memory location that r6+2*r7 points to = g10
5.1.2 Instruction Encoding Formats
All instructions are encoded in one 32-bit machine language instruction (an opword), which
must be word aligned in memory. An opword’s most significant eight bits contain the opcode field.
The opcode field determines the instruction to be performed and how the remainder of the machine
language instruction is interpreted. Instructions are encoded in opwords in one of four formats (see
Figure 5-1). For more information on instruction formats, see Appendix A.
Format   Description
REG      Most instructions are encoded in this format. Used primarily for instructions
         which perform register-to-register operations.
COBR     An encoding optimization which combines compare and branch operations
         into one opword. Other compare and branch operations are also provided as
         REG and CTRL format instructions.
CTRL     For branches and calls that do not depend on registers for address
         calculation.
MEM      Used for referencing an operand which is a memory address. Load and store
         instructions, and some branch and call instructions, use this format. MEM
         format has two encodings: MEMA or MEMB. Usage depends upon the
         addressing mode selected. MEMB-formatted addressing modes use the word
         in memory immediately following the instruction opword as a 32-bit constant.
         MEMA format uses one word and MEMB uses two words.
Figure 5-1. Machine-Level Instruction Formats
[Figure: 32-bit opword layouts, bit 31 down to bit 0, for the REG, COBR, CTRL, MEMA and MEMB
formats. Fields shown: OPCODE, src1, src2, src/dst, displacement, Address Base, Offset, Scale and
Index.]
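As a minimal sketch of the two facts stated in Section 5.1.2 (the opcode field occupies the most significant eight bits of the opword, and opwords are word aligned), the following Python helpers show the corresponding bit arithmetic. The function names are hypothetical, and the sample opword used to exercise them is an arbitrary value, not a real instruction encoding. Any further opcode bits a format may carry elsewhere in the opword are not modeled.

```python
def opcode_field(opword):
    """Return the opcode field: the most significant eight bits of an opword."""
    return (opword >> 24) & 0xFF


def is_word_aligned(address):
    """Opwords must be word aligned, i.e. placed at a multiple of four bytes."""
    return address % 4 == 0
```

For example, opcode_field(0xAB123456) yields 0xAB, and is_word_aligned(0x1002) is False.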
5.1.3 Instruction Operands
This section identifies and describes operands that can be used with the instruction formats.
Format   Operand(s)                 Description
REG      src1, src2, src/dst        src1 and src2 can be global registers, local registers or
                                    literals. src/dst is either a global or a local register.
CTRL     displacement               CTRL format is used for branch and call instructions.
                                    displacement value indicates the target instruction of the
                                    branch or call.
COBR     src1, src2, displacement   src1, src2 indicate values to be compared; displacement
                                    indicates branch target. src1 can specify a global register,
                                    local register or a literal. src2 can specify a global or local
                                    register.
MEM      src/dst, efa               Specifies source or destination register and an effective
                                    address (efa) formed by using the processor’s addressing
                                    modes as described in Section 2.3, “Memory Addressing
                                    Modes” on page 2-4. Registers specified in a MEM format
                                    instruction must be either a global or local register.
5.2 Instruction Groups
The i960 processor instruction set can be categorized into the functional groups shown in
Table 5-2. The actual number of instructions is greater than those shown in this list because, for
some operations, several unique instructions are provided to handle various operand sizes, data
types or branch conditions. The following sections provide an overview of the instructions in each
group. For detailed information about each instruction, refer to Chapter 6, “Instruction Set
Reference”.
Table 5-2. i960® VH Processor Instruction Set
Group                     Instructions
Data Movement             Load, Store, Move, *Conditional Select, Load Address
Arithmetic                Add, Subtract, Multiply, Divide, Remainder, Modulo, Shift,
                          Extended Shift, Extended Multiply, Extended Divide, Add with Carry,
                          Subtract with Carry, *Conditional Add, *Conditional Subtract, Rotate
Logical                   And, Not And, And Not, Or, Exclusive Or, Not Or, Or Not, Nor,
                          Exclusive Nor, Not, Nand
Bit, Bit Field and Byte   Set Bit, Clear Bit, Not Bit, Alter Bit, Scan For Bit, Span Over Bit,
                          Extract, Modify, Scan Byte for Equal, *Byte Swap
Comparison                Compare, Conditional Compare, Compare and Increment,
                          Compare and Decrement, Test Condition Code, Check Bit
Branch                    Unconditional Branch, Conditional Branch, Compare and Branch
Call/Return               Call, Call Extended, Call System, Return, Branch and Link
Fault                     Conditional Fault, Synchronize Faults
Debug                     Modify Trace Controls, Mark, Force Mark
Processor Management      Flush Local Registers, Modify Arithmetic Controls,
                          Modify Process Controls, *Halt, System Control, *Cache Control,
                          *Interrupt Control
Atomic                    Atomic Add, Atomic Modify
* Denotes newer instructions that are NOT available on 80960CA/CF, 80960KA/KB and 80960SA/SB implementations.
5.2.1 Data Movement
These instructions are used to move data from memory to global and local registers, from global
and local registers to memory, and between local and global registers.
Rules for register alignment must be followed when using load, store and move instructions that
move 8, 12 or 16 bytes at a time. See Section 3.5, “Memory Address Space” on page 3-9 for
alignment requirements for code portability across implementations.
5.2.1.1 Load and Store Instructions
Load instructions copy bytes or words from memory to local or global registers or to a group of
registers. Each load instruction has a corresponding store instruction that copies bytes or words to
memory from a selected local or global register or group of registers. All load and store instructions
use the MEM format.
ld      load word              st      store word
ldob    load ordinal byte      stob    store ordinal byte
ldos    load ordinal short     stos    store ordinal short
ldib    load integer byte      stib    store integer byte
ldis    load integer short     stis    store integer short
ldl     load long              stl     store long
ldt     load triple            stt     store triple
ldq     load quad              stq     store quad
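As a rough illustration of how the byte and short forms listed above treat data (ordinal loads zero-extend, integer loads sign-extend, ordinal stores truncate silently, and integer stores can overflow), the following Python sketch models the value conversions only. The function names are hypothetical; registers are represented as plain Python integers, and the overflow result is returned as a flag rather than raised as a fault or recorded in the AC register.

```python
def load_ordinal(raw, bits):
    """ldob/ldos model: zero-extend a byte (8) or short (16) to 32 bits."""
    return raw & ((1 << bits) - 1)


def load_integer(raw, bits):
    """ldib/ldis model: sign-extend a byte or short, returned as a signed int."""
    value = raw & ((1 << bits) - 1)
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value


def store_ordinal(register, bits):
    """stob/stos model: truncate silently, no fault on lost bits."""
    return register & ((1 << bits) - 1)


def store_integer(register, bits):
    """stib/stis model: return (stored bits, overflow condition)."""
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    overflow = not (low <= register <= high)
    return register & ((1 << bits) - 1), overflow
```

The same memory byte 0xFF therefore loads as 255 through load_ordinal and as -1 through load_integer.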
ld copies 4 bytes from memory into a register; ldl copies 8 bytes; ldt copies 12 bytes into
successive registers; ldq copies 16 bytes into successive registers.
st copies 4 bytes from a register into memory; stl copies 8 bytes; stt copies 12 bytes from
successive registers; stq copies 16 bytes from successive registers.
For ld, ldob, ldos, ldib and ldis, the instruction specifies a memory address and register; the
memory address value is copied into the register. The processor automatically extends byte and
short (half-word) operands to 32 bits according to data type. Ordinals are zero-extended; integers
are sign-extended.
For st, stob, stos, stib and stis, the instruction specifies a memory address and register; the
register value is copied into memory. For byte and short instructions, the processor automatically
reformats the source register’s 32-bit value for the shorter memory location. For stib and stis, this
reformatting can cause integer overflow when the register value is too large for the shorter memory
location. When integer overflow occurs, either an integer-overflow fault is generated or the
integer-overflow flag in the AC register is set, depending on the integer-overflow mask bit setting
in the AC register.
For stob and stos, the processor truncates the register value and does not create a fault when
truncation results in the loss of significant bits.
5.2.1.2 Move
Move instructions copy data from a local or global register or group of registers to another register
or group of registers. These instructions use the REG format.
mov     move word
movl    move long word
movt    move triple word
movq    move quad word
5.2.1.3 Load Address
The Load Address instruction (lda) computes an effective address in the address space from an
operand presented in one of the addressing modes. lda is commonly used to load a constant into a
register. This instruction uses the MEM format and can operate upon local or global registers.
On the 80960VH, lda is useful for performing simple arithmetic operations. The processor’s
parallelism allows lda to execute in the same clock as another arithmetic or logical operation.
5.2.2 Select Conditional
Given the proper condition code bit settings in the Arithmetic Controls register, these instructions
move one of two pieces of data from its source to the specified destination.
selno    Select Based on Unordered
selg     Select Based on Greater
sele     Select Based on Equal
selge    Select Based on Greater or Equal
sell     Select Based on Less
selne    Select Based on Not Equal
selle    Select Based on Less or Equal
selo     Select Based on Ordered
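The selection rule can be sketched as a comparison between a per-instruction mask and the condition code. The Python model below is illustrative only, and several of its details are assumptions that this section does not state: the 3-bit mask value assigned to each mnemonic, the rule that the condition holds when the mask shares a bit with the condition code (or both are zero), and the choice of src2 as the value selected when the condition holds. All names are hypothetical.

```python
# Condition-code masks; assumed encodings, not taken from this section.
SELNO, SELG, SELE, SELGE = 0b000, 0b001, 0b010, 0b011
SELL, SELNE, SELLE, SELO = 0b100, 0b101, 0b110, 0b111


def select(mask, cc, src1, src2):
    """Return src2 when the condition holds, otherwise src1 (assumed order)."""
    condition_true = (mask & cc) != 0 or mask == cc
    return src2 if condition_true else src1
```

Under these assumptions, select(SELE, 0b010, a, b) returns b, while the same call with a condition code of 0b001 returns a.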
5.2.3 Arithmetic
Table 5-3 lists arithmetic operations and data types for which the 80960VH provides instructions.
“X” in this table indicates that the microprocessor provides an instruction for the specified
operation and data type. All arithmetic operations are carried out on operands in registers or
literals. Refer to Section 5.2.11, “Atomic Instructions” on page 5-15 for instructions which handle
specific requirements for in-place memory operations.
All arithmetic instructions use the REG format and can operate on local or global registers. The
following subsections describe arithmetic instructions for ordinal and integer data types.
divi and divo generate a zero-divide fault when the
The difference between the remainder and modulo instructions lies in the sign of the result. For
remi and remo, the result has the same sign as the dividend; for modi, the result has the same sign
as the divisor.
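The sign rule is easiest to see with a negative operand. The following Python sketch models only the arithmetic result: remi and modi here are hypothetical helper functions taking named dividend and divisor arguments, not a statement of the instructions' operand order, and zero divisors are not handled.

```python
def remi(dividend, divisor):
    """Remainder model: the result takes the sign of the dividend."""
    magnitude = abs(dividend) % abs(divisor)
    return -magnitude if dividend < 0 else magnitude


def modi(dividend, divisor):
    """Modulo model: the result takes the sign of the divisor.

    Python's % operator already follows the divisor's sign.
    """
    return dividend % divisor
```

For a dividend of -7 and a divisor of 3, the remainder model gives -1 and the modulo model gives 2. When both operands have the same sign the two results agree.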
5.2.3.3 Shift, Rotate and Extended Shift
These shift instructions shift an operand a specified number of bits left or right:
shlo      shift left ordinal
shro      shift right ordinal
shli      shift left integer
shri      shift right integer
shrdi     shift right dividing integer
rotate    rotate left
eshro     extended shift right ordinal
Except for rotate, these instructions discard bits shifted beyond the register boundary.
shlo shifts zeros in from the least significant bit; shro shifts zeros in from the most significant bit.
These instructions are equivalent to mulo and divo by the power of 2, respectively.
shli shifts zeros in from the least significant bit. When the shift operation results in an overflow, an
integer-overflow fault is generated (when enabled). The destination register is written with the
source shifted as much as possible without overflow and an integer-overflow fault is signaled.
shri performs a conventional arithmetic shift right operation by extending the sign bit. However,
when this instruction is used to divide a negative integer operand by the power of 2, it may produce
an incorrect quotient. (Discarding the bits shifted out has the effect of rounding the result toward
negative infinity.)
shrdi is provided for dividing integers by the power of 2. With this instruction, 1 is added to the
result when the bits shifted out are non-zero and the operand is negative, which produces the
correct result for negative operands.
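The difference between shri and shrdi can be sketched with signed Python integers, whose >> operator is also an arithmetic shift. The function names below are hypothetical models of the result value only; shift counts are assumed to be in the range 0 to 31, and fault behavior is not modeled.

```python
def shri(value, count):
    """Arithmetic shift right: discarded bits round toward negative infinity."""
    return value >> count


def shrdi(value, count):
    """Shift right dividing integer.

    Adds 1 when the operand is negative and non-zero bits were shifted out,
    so the result matches integer division rounded toward zero.
    """
    result = value >> count
    shifted_out = value & ((1 << count) - 1)
    if value < 0 and shifted_out != 0:
        result += 1
    return result
```

Shifting -7 right by one position gives -4 with the shri model but -3 with the shrdi model, which is the quotient of -7 divided by 2 rounded toward zero.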
shli and shrdi are equivalent to muli and divi by the power of 2, respectively, except in cases
where an overflow error occurs.
rotate rotates operand bits to the left (toward higher significance) by a specified number of bits.
Bits shifted beyond the register’s left boundary (bit 31) appear at the right boundary (bit 0).
The eshro instruction performs an ordinal right shift of a source register pair (64 bits) by as much
as 32 bits and stores the result in a single (32-bit) register. This instruction is equivalent to an
extended divide by a power of 2, which produces no remainder. The instruction is also the
equivalent of a 64-bit extract of 32 bits.
5.2.3.4 Extended Arithmetic
These instructions support extended-precision arithmetic (i.e., arithmetic operations on operands
greater than one word in length):
addc    add ordinal with carry
subc    subtract ordinal with carry
emul    extended multiply
ediv    extended divide
addc adds two word operands (literals or contained in registers) plus the AC Register condition
code bit 1 (used here as a carry bit). When the result has a carry, bit 1 of the condition code is set;
otherwise, it is cleared. This instruction’s description in Chapter 6, “Instruction Set Reference”
gives an example of how this instruction can be used to add two long-word (64-bit) operands
together.
subc is similar to addc, except it is used to subtract extended-precision values. Although addc and
subc treat their operands as ordinals, the instructions also set bit 0 of the condition codes when the
operation would have resulted in an integer overflow condition. This facilitates a software
implementation of extended integer arithmetic.
emul multiplies two ordinals (each contained in a register), producing a long ordinal result (stored
in two registers). ediv divides a long ordinal by an ordinal, producing an ordinal quotient and an
ordinal remainder (stored in two adjacent registers).
5.2.4 Logical
These instructions perform bitwise Boolean operations on the specified operands:
and       src2 AND src1
notand    (NOT src2) AND src1
andnot    src2 AND (NOT src1)
xor       src2 XOR src1
or        src2 OR src1
nor       NOT (src2 OR src1)
xnor      src2 XNOR src1
not       NOT src1
notor     (NOT src2) OR src1
ornot     src2 OR (NOT src1)
nand      NOT (src2 AND src1)
All logical instructions use the REG format and can operate on literals or local or global registers.
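The table above maps directly onto bitwise expressions. The following Python sketch is a minimal model of the 32-bit results, assuming operands are supplied as unsigned Python integers; the LOGICAL_OPS table and logical function are hypothetical names used for illustration.

```python
MASK32 = 0xFFFFFFFF

# Each entry follows the src1/src2 expression given in the table above.
LOGICAL_OPS = {
    "and":    lambda src1, src2: src2 & src1,
    "notand": lambda src1, src2: ~src2 & src1,
    "andnot": lambda src1, src2: src2 & ~src1,
    "xor":    lambda src1, src2: src2 ^ src1,
    "or":     lambda src1, src2: src2 | src1,
    "nor":    lambda src1, src2: ~(src2 | src1),
    "xnor":   lambda src1, src2: ~(src2 ^ src1),
    "not":    lambda src1, src2: ~src1,
    "notor":  lambda src1, src2: ~src2 | src1,
    "ornot":  lambda src1, src2: src2 | ~src1,
    "nand":   lambda src1, src2: ~(src2 & src1),
}


def logical(op, src1, src2=0):
    """Evaluate one logical operation and truncate the result to 32 bits."""
    return LOGICAL_OPS[op](src1, src2) & MASK32
```

Note that notand and andnot differ only in which operand is complemented, so swapping the operands of one produces the other.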
5.2.5 Bit, Bit Field and Byte Operations
These perform operations on a specified bit or bit field in an ordinal operand. All Bit, Bit Field and
Byte instructions use the REG format and can operate on literals or local or global registers.
5.2.5.1 Bit Operations
These instructions operate on a specified bit:
setbit      set bit
clrbit      clear bit
notbit      invert bit
alterbit    alter bit
scanbit     scan for bit
spanbit     span over bit
setbit, clrbit and notbit set, clear or complement (toggle) a specified bit in an ordinal.
alterbit alters the state of a specified bit in an ordinal according to the condition code. When the
condition code is 010₂, the bit is set; when the condition code is 000₂, the bit is cleared.
chkbit, described in Section 5.2.6, “Comparison” on page 5-10, can be used to check the value of
an individual bit in an ordinal.
scanbit and spanbit find the most significant set bit or clear bit, respectively, in an ordinal.
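A minimal Python sketch of these bit operations follows, modeling the 32-bit result only. The function names mirror the mnemonics for readability but are hypothetical helpers; bit positions are assumed to be in the range 0 to 31, and returning None when no matching bit exists is a modeling choice, not a description of what the processor writes to the destination in that case.

```python
MASK32 = 0xFFFFFFFF


def setbit(bitpos, src):
    """Return src with the selected bit set."""
    return (src | (1 << bitpos)) & MASK32


def clrbit(bitpos, src):
    """Return src with the selected bit cleared."""
    return src & ~(1 << bitpos) & MASK32


def notbit(bitpos, src):
    """Return src with the selected bit toggled."""
    return (src ^ (1 << bitpos)) & MASK32


def scanbit(src):
    """Index of the most significant set bit, or None when no bit is set."""
    src &= MASK32
    return src.bit_length() - 1 if src else None


def spanbit(src):
    """Index of the most significant clear bit, or None when no bit is clear."""
    return scanbit(~src & MASK32)
```

For instance, setbit(13, 0) yields 0x2000, matching the setbit 13, g4, g5 example in Section 5.1.1 when g4 holds zero.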
5.2.5.2 Bit Field Operations
The two bit field instructions are extract and modify.
extract converts a specified bit field, taken from an ordinal value, into an ordinal value. In essence,
this instruction shifts right a bit field in a register and fills in the bits to the left of the bit field with
zeros. (eshro also provides the equivalent of a 64-bit extract of 32 bits.)
modify copies bits from one register into another register. Only masked bits in the destination
register are modified. modify is equivalent to a bit field move.
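Both bit field operations reduce to a shift and a mask. The Python sketch below models the result values only; the parameter names and their order are assumptions chosen for readability rather than the instructions' documented operand order, and field positions and lengths are assumed to stay within a 32-bit word.

```python
MASK32 = 0xFFFFFFFF


def extract(bitpos, length, src):
    """Shift the field down to bit 0 and zero the bits to its left."""
    return (src >> bitpos) & ((1 << length) - 1)


def modify(mask, src, dst):
    """Copy the bits of src selected by mask into dst; other dst bits are kept."""
    return ((src & mask) | (dst & ~mask)) & MASK32
```

As a usage example, extract(4, 8, 0x12345678) returns 0x67, and modify(0x0000FF00, 0xAAAAAAAA, 0x55555555) returns 0x5555AA55.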
5.2.5.3 Byte Operations
scanbyte performs a byte-by-byte comparison of two ordinals to determine when any two
corresponding bytes are equal. The condition code is set based on the results of the comparison.
scanbyte uses the REG format and can specify literals or local or global registers as arguments.
bswap alters the order of bytes in a word, reversing its “endianness.”
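The two byte operations can be sketched in Python as follows. These are hypothetical helper functions that model the outcome only: scanbyte here returns a Boolean instead of setting a condition code, since the specific condition code values are not given in this section.

```python
MASK32 = 0xFFFFFFFF


def scanbyte(src1, src2):
    """True when any pair of corresponding bytes in the two words is equal."""
    bytes1 = (src1 & MASK32).to_bytes(4, "little")
    bytes2 = (src2 & MASK32).to_bytes(4, "little")
    return any(a == b for a, b in zip(bytes1, bytes2))


def bswap(src):
    """Reverse the order of the four bytes in a word."""
    return int.from_bytes((src & MASK32).to_bytes(4, "little"), "big")
```

Applying bswap twice returns the original word, and scanbyte reports a match only when equal bytes occupy the same position in both words.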
5.2.6 Comparison
The processor provides several types of instructions for comparing two operands, as described in
the following subsections.
5.2.6.1 Compare and Conditional Compare
These instructions compare two operands then set the condition code bits in the AC register
according to the results of the comparison:
cmpi       Compare Integer
cmpib      Compare Integer Byte
cmpis      Compare Integer Short
cmpo       Compare Ordinal
concmpi    Conditional Compare Integer
concmpo    Conditional Compare Ordinal
chkbit     Check Bit
These all use the REG format and can specify literals or local or global registers. The condition
code bits are set to indicate whether one operand is less than, equal to, or greater than the other
operand. See Section 3.6.2, “Arithmetic Controls Register – AC” on page 3-13 for a description of
the condition codes for conditional operations.
cmpi and cmpo simply compare the two operands and set the condition code bits accordingly.
concmpi and concmpo first check the status of condition code bit 2:
• When not set, the operands are compared as with cmpi and cmpo.
• When set, no comparison is performed and the condition code flags are not changed.
The conditional-compare instructions are provided specifically to optimize two-sided range
comparisons to check for the condition when A is between B and C (B ≤ A ≤ C). Here, a compare
instruction (cmpi or cmpo) checks one side of the range (A ≥ B) and a conditional compare
instruction (concmpi or concmpo) checks the other side (A ≤ C) according to the result of the first
comparison. The condition codes following the conditional comparison directly reflect the results
of both comparison operations. Therefore, only one conditional branch instruction is required to act
upon the range check; otherwise, two branches would be needed.
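The two-sided range check can be sketched as a pair of condition-code updates. The Python model below is illustrative and rests on assumptions not stated in this section: the condition code values 100₂ (less), 010₂ (equal) and 001₂ (greater), and a conditional compare that records 010₂ when its first operand is less than or equal to its second and 001₂ otherwise. All function names are hypothetical.

```python
LESS, EQUAL, GREATER = 0b100, 0b010, 0b001  # assumed condition code values


def compare(src1, src2):
    """Plain compare model: condition code describes src1 relative to src2."""
    if src1 < src2:
        return LESS
    if src1 == src2:
        return EQUAL
    return GREATER


def conditional_compare(cc, src1, src2):
    """Conditional compare model: skipped entirely when bit 2 of cc is set."""
    if cc & 0b100:
        return cc
    return EQUAL if src1 <= src2 else GREATER  # assumed result encoding


def in_range(a, b, c):
    """Check B <= A <= C with one compare and one conditional compare."""
    cc = compare(a, b)                  # bit 2 set when A < B
    cc = conditional_compare(cc, a, c)  # checks A <= C only when A >= B
    return cc == EQUAL                  # a single test decides the outcome
```

Under these assumptions a single equality test on the final condition code distinguishes the in-range case, which corresponds to the single conditional branch described above.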
chkbit checks a specified bit in a register and sets the condition code flags according to the bit
state. The condition code is set to 010₂ when the bit is set, and 000₂ when the bit is not set.
5.2.6.2 Compare and Increment or Decrement
These instructions compare two operands, set the condition code bits according to the compare
results, then increment or decrement one of the operands:
cmpinci    compare and increment integer
cmpinco    compare and increment ordinal
cmpdeci    compare and decrement integer
cmpdeco    compare and decrement ordinal
These all use the REG format and can specify literals or local or global registers. They are an
architectural performance optimization which allows two register operations (e.g., compare and
add) to execute in a single cycle. The intended use of these instructions is at the end of iterative
loops.
5.2.6.3 Test Condition Codes
These test instructions allow the state of the condition code flags to be tested:
teste     test for equal
testne    test for not equal
testl     test for less
testle    test for less or equal
testg     test for greater
testge    test for greater or equal
testo     test for ordered
testno    test for unordered
When the condition code matches the instruction-specified condition, a TRUE (0000 0001H) is
stored in a destination register; otherwise, a FALSE (0000 0000H) is stored. All use the COBR
format and can operate on local and global registers.
5.2.7 Branch
Branch instructions allow program flow direction to be changed by explicitly modifying the IP.
The processor provides three branch instruction types:
• unconditional branch
• conditional branch
• compare and branch
Most branch instructions specify the target IP by specifying a signed displacement to be added to
the current IP. Other branch instructions specify the target IP’s memory address, using one of the
processor’s addressing modes. This latter group of instructions is called extended addressing
instructions (e.g., branch extended, branch-and-link extended).
5.2.7.1 Unconditional Branch
These instructions are used for unconditional branching:
b       Branch
bx      Branch Extended
bal     Branch and Link
balx    Branch and Link Extended
b and bal use the CTRL format. bx and balx use the MEM format and can specify local or global
registers as operands. b and bx cause program execution to jump to the specified target IP. These
two instructions perform the same function; however, their determination of the target IP differs.
The target IP of a b instruction is specified at link time as a relative displacement from the current
IP. The target IP of the bx instruction is the absolute address resulting from the instruction’s use of
a memory-addressing mode during execution.
bal and balx store the next instruction’s address in a specified register, then jump to the specified
target IP. (For bal, the RIP is automatically stored in register g14; for balx, the RIP location is
specified with an instruction operand.) As described in Section 7.9, “Branch-and-Link” on
page 7-18, branch and link instructions provide a method of performing procedure calls that do not
use the processor’s integrated call/return mechanism. Here, the saved instruction address is used as
a return IP. Branch and link is generally used to call leaf procedures (that is, procedures that do not
call other procedures).
bx and balx can make use of any memory-addressing mode.
5.2.7.2 Conditional Branch
With conditional branch (BRANCH IF) instructions, the processor checks the AC register condition
code flags. When these flags match the value specified with the instruction, the processor jumps to
the target IP. These instructions use the displacement-plus-ip method of specifying the target IP:
be      branch if equal/true
bne     branch if not equal
bl      branch if less
ble     branch if less or equal
bg      branch if greater
bge     branch if greater or equal
bo      branch if ordered
bno     branch if unordered/false
All use the CTRL format. bo and bno are used with real numbers. bno can also be used with the
result of a chkbit or scanbit instruction. Refer to Section 3.6.2.2, “Condition Code (AC.cc)” on
page 3-14 for a discussion of the condition code for conditional operations.
5.2.7.3 Compare and Branch
These instructions compare two operands then branch according to the comparison result. Three
instruction subtypes are compare integer, compare ordinal and branch on bit:
cmpibe     compare integer and branch if equal
cmpibne    compare integer and branch if not equal
cmpibl     compare integer and branch if less
cmpible    compare integer and branch if less or equal
cmpibg     compare integer and branch if greater
cmpibge    compare integer and branch if greater or equal
cmpibo     compare integer and branch if ordered
cmpibno    compare integer and branch if unordered
cmpobe     compare ordinal and branch if equal
cmpobne    compare ordinal and branch if not equal
cmpobl     compare ordinal and branch if less
cmpoble    compare ordinal and branch if less or equal
cmpobg     compare ordinal and branch if greater
cmpobge    compare ordinal and branch if greater or equal
bbs        check bit and branch if set
bbc        check bit and branch if clear
All use the COBR machine instruction format and can specify literals, local or global registers as
operands. With compare ordinal and branch (cmpob*) and compare integer and branch
(cmpib*) instructions, two operands are compared and the condition code bits are set as described
in Section 5.2.6, “Comparison” on page 5-10. A conditional branch is then executed as with the
conditional branch (BRANCH IF) instructions.
With check bit and branch instructions (bbs, bbc), one operand specifies a bit to be checked in the
second operand. The condition code flags are set according to the state of the specified bit: 010₂
(true) when the bit is set and 000₂ (false) when the bit is clear. A conditional branch is then
executed according to condition code bit settings.
These instructions can be used to optimize execution performance time. When it is not possible to
separate adjacent compare and branch instructions from other unrelated instructions, replacing two
instructions with a single compare and branch instruction increases performance.
5.2.8Call/Retur n
The 80960VH offers an on-chip call/return mechanis m for maki ng pr oce dure calls. Refer to
Sectio n7.1 , “Call an d R eturn Mechan ism” on page7- 2 . The following instructions support this
mechanism:
callcall
callxcall extended
callscal l system
retreturn
call and ret use the CTRL ma chi ne -ins tr uct ion f orm at . callx uses the MEM format and can speci fy
local or global registers.
call and callx make local calls to procedures. A local call is a call that do es not require a switc h to
another stack.
The target procedure of a call is determined at link time and is encoded in the opword as a signed
displacement relative to the call IP.
address calculated at run time using any one of the addressi ng mo des . For both instruc tions, a new
set of local regi sters and a n ew stack fram e are alloc ated for the called pr o cedure.
call and callx differ only in the method of s pecifying the target procedure ’s address.
calls uses the REG format and can specif y local or global registe rs.
callx specifies the target procedure as an absolute 32-bit
calls is used to make calls to sys tem procedures — procedures tha t provide a kernel or
system - executive servic e. This instruction opera tes similarly to
target-procedure address from the system procedure table. An index number included as an
operand in the instruction provide s an entry point into the procedure table.
Depending on the type of entry being pointed to in the system procedure tab le,
either a system-supervisor call or a system-local call to be executed. A system-supervisor call is a
call to a system procedure that switches the processor to supervisor mode and switches to the
supervisor st ack. A syst em-loca l call is a call to a system procedure that does not cause an
execution mode or stack change. Supervisor mode is described throughout Chapter7, “Procedure
Calls”.
call and callx, except that it gets its
calls can cause
ret performs a return from a called procedure to the calling procedure (the procedure that made the
call). ret obtains its target IP (return IP) from linkage information that was saved for the calling
procedure. ret is used to return from all calls, including local and supervisor calls, and from
implicit calls to interrupt and fault handlers.
5.2.9 Faults
Generally, the processor generates faults automatically as the result of certain operations. Fault
handling procedures are then invoked to handle various fault types without explicit intervention by
the currently running program. These conditional fault instructions permit a program to explicitly
generate a fault according to the state of the condition code flags. All use the CTRL format.
faulte     fault if equal
faultne    fault if not equal
faultl     fault if less
faultle    fault if less or equal
faultg     fault if greater
faultge    fault if greater or equal
faulto     fault if ordered
faultno    fault if unordered
syncf ensures that any faults that occur during the execution of prior instructions occur before the
instruction that follows the syncf. syncf uses the REG format and requires no operands.
5.2.10 Debug
The processor supports debugging and monitoring of program activity through the use of trace
events. The following instructions support these debugging and monitoring tools:
modpc    modify process controls
modtc    modify trace controls
mark     mark
fmark    force mark
These all use the REG format. Trace functions are controlled with bits in the Trace Control (TC)
register which enable or disable various types of tracing. Other TC register flags indicate when an
enabled trace event is detected. Refer to Chapter 10, “Tracing and Debugging”.
modtc permits trace controls to be modified. mark causes a breakpoint trace event to be generated
when breakpoint trace mode is enabled. fmark generates a breakpoint trace independent of the state
of the breakpoint trace mode bits.
Other instructions that are helpful in debugging include modpc and sysctl. modpc can
enable/disable trace fault generation. The sysctl instruction also provides control over breakpoint
trace event generation. This instruction is used, in part, to load and control the 80960VH’s
breakpoint registers.
5.2.11 Atomic Instructions
Atomic instructions perform an atomic read-modify-write operation on operands in memory. An
atomic operation is one in which other memory operations are forced to occur before or after, but
not during, the accesses that comprise the atomic operation. These instructions are required to
enable synchronization between interrupt handlers and background tasks in any system. They are
also particularly useful in systems where several agents (processors, coprocessors or external
logic) have access to the same system memory for communication.
The atomic instructions are atomic add (atadd) and atomic modify (atmod). atadd causes an
operand to be added to the value in the specified memory location. atmod causes bits in the
specified memory location to be modified under control of a mask. Both instructions use the REG
format and can specify literals or local or global registers as operands.
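As a host-side illustration of the read-modify-write behavior that atadd and atmod provide, the following C11 sketch models both operations with <stdatomic.h>. It is not 80960VH code. The function names atadd_model and atmod_model are hypothetical, and returning the previous memory value is an assumption of the sketch, since this overview does not state what the instructions write back.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Model of atadd: add `value` to the word at `addr` as one indivisible
 * read-modify-write. Returning the old value is an assumption of this sketch. */
uint32_t atadd_model(_Atomic uint32_t *addr, uint32_t value)
{
    return atomic_fetch_add(addr, value);
}

/* Model of atmod: replace only the bits selected by `mask` with the
 * corresponding bits of `value`; unselected bits keep their memory value. */
uint32_t atmod_model(_Atomic uint32_t *addr, uint32_t value, uint32_t mask)
{
    uint32_t old = atomic_load(addr);
    uint32_t desired;

    do {
        desired = (value & mask) | (old & ~mask);
        /* On failure, `old` is refreshed with the current memory value
         * and the merge is recomputed. */
    } while (!atomic_compare_exchange_weak(addr, &old, desired));

    return old;
}
```

In this model an interrupt handler and a background task can both update a shared flag word through atmod_model without losing each other's bits, which is the synchronization case the text describes.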
5.2.12 Processor Management
These instructions control processor-related functions:
modpc       Modify the Process Controls register
flushreg    Flush cached local register sets to memory
modac       Modify the Arithmetic Controls register
All use the REG format and can specify literals or local or global registers.
modpc provides a method of reading and modifying PC register contents. Only programs
operating in supervisor mode may modify the PC register; however, any program may read it.
The processor provides a flush local registers instruction (flushreg) to save the contents of the
cached local registers to the stack. The flush local registers instruction automatically stores the
contents of all the local register sets, except the current set, in the register save area of their
associated stack frames.
The modify arithmetic controls instruction (modac) allows the AC register contents to be copied to
a register and/or modified under the control of a mask. The AC register cannot be explicitly
addressed with any other instruction; however, it is implicitly accessed by instructions that use the
condition codes or set the integer overflow flag.
sysctl is used to configure the interrupt controller, breakpoint registers and instruction cache. It
also permits software to signal an interrupt or cause a processor reset and reinitialization. sysctl
may be executed only by programs operating in supervisor mode.
intctl, inten and intdis are used to enable and disable interrupts and to determine current interrupt
enable status.
5.3 Performance Optimization
Performance optimization is categorized into two sections: instruction optimizations and
miscellaneous optimizations.
5.3.1 Instruction Optimizations
Instruction optimizations are broken down by the instruction classification.
5.3.1.1 Load / Store Execution Model
Because the 80960VH has a 32-bit external data bus, multiple word accesses require multiple
cycles. The processor uses microcode to sequence the multi-word accesses. Because the microcode
can ensure that aligned multi-words are bursted together on the external bus, software should not
substitute multiple single-word instructions for one multi-word instruction for data that is not likely
to be in cache (i.e., one ldq provides better bus performance than four ld instructions).
Once a load is issued, the processor attempts to execute other instructions while the load is
outstanding. It is important to note that when the load misses the data cache, the processor does not
stall the issuing of subsequent instructions (other than stores) that do not depend on the load.
Software should avoid following a load with an instruction that depends on the result of the load.
For a load that hits the data cache, a one-cycle stall occurs when the instruction immediately after
the load requires the data. When the load fails to hit the data cache, the instruction depending on
the load is stalled until the outstanding load request is resolved.
Multiple, back-to-back load instructions do not stall the processor until the bus queue becomes full.
The processor delays issuing a store instruction until all previously-issued load instructions
complete. This happens regardless of whether the store is dependent on the load. This ordering
between loads and stores ensures that the return data from a previous cache-read miss does not
overwrite the cache line updated by a subsequent store.
5.3.1.2 Compare Operations
Byte and short word data is more efficiently compared using the new byte and short compare
instructions (cmpob, cmpib, cmpos, cmpis), rather than shifting the data and using a word
compare instruction.
5.3.1.3 Microcoded Instructions
While the majority of instructions on the 80960VH are single cycle and are executed directly by
processor hardware, some require microcode emulation. Entry into a microcode routine requires
two cycles. Exit from microcode typically requires two cycles. For some routines, one cycle of the
exit process can execute in parallel with another instruction, thus saving one cycle of execution
time.
5.3.1.4 Multiply-Divide Unit Instructions
The Multiply-Divide Unit (MDU) performs a number of multi-cycle arithmetic operations. These
can range from 2 cycles for a 16-bit x 32-bit mulo, 4 cycles for a 32-bit x 32-bit mulo, to 30+ cycles
for an ediv.
Once issued, these MDU instructions are executed in parallel with other non-MDU instructions
that do not depend on the result of the MDU operation. Attempting to issue another MDU
instruction while a current MDU instruction is executing stalls the processor until the first one
completes.
5.3.1.5 Multi-Cycle Register Operations
A few register operations can also take multiple cycles. The following instructions are performed
in microcode:
• bswap    • extract    • eshro      • modify     • movl      • movt
• movq     • shrdi      • scanbit    • spanbit    • testno    • testo
• testl    • testle     • teste      • testne     • testg     • testge
On the 80960VH, test<cc> dst is microcoded and takes many more cycles than SEL<cc> 0,1,dst,
which is executed in one cycle directly by processor hardware.
Multi-register move operation execution time can be decreased at the expense of cache utilization
and code density by using mov the appropriate number of times instead of movl, movt and movq.
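The substitution of SEL<cc> 0,1,dst for test<cc> dst can be checked with a small host-side model. The sketch below is written under stated assumptions: test<cc> is taken to store 1 in dst when its condition holds and 0 otherwise, SEL<cc> src1,src2,dst is taken to store src2 when the condition holds and src1 otherwise, and the condition rule is the one given for ADD<cc> in Section 6.2.1. All function names are hypothetical.

```c
#include <stdint.h>

/* Condition rule as described for the conditional instructions: a mask of
 * 000 (unordered) is satisfied when the condition code is 0; any other mask
 * is satisfied when it shares at least one set bit with the condition code. */
int cond_satisfied(uint32_t mask, uint32_t cc)
{
    mask &= 7u;
    cc &= 7u;
    return (mask == 0u) ? (cc == 0u) : ((mask & cc) != 0u);
}

/* Assumed test<cc> behavior: dst receives 1 when the condition holds, else 0. */
uint32_t test_cc_model(uint32_t mask, uint32_t cc)
{
    return cond_satisfied(mask, cc) ? 1u : 0u;
}

/* Assumed SEL<cc> src1,src2,dst behavior: dst receives src2 when the
 * condition holds, else src1. */
uint32_t sel_cc_model(uint32_t mask, uint32_t cc, uint32_t src1, uint32_t src2)
{
    return cond_satisfied(mask, cc) ? src2 : src1;
}

/* Returns 1 when SEL<cc> 0,1,dst matches test<cc> dst for every combination
 * of mask and condition code, 0 otherwise. */
int sel_replaces_test(void)
{
    for (uint32_t mask = 0; mask < 8; mask++)
        for (uint32_t cc = 0; cc < 8; cc++)
            if (sel_cc_model(mask, cc, 0u, 1u) != test_cc_model(mask, cc))
                return 0;
    return 1;
}
```

Under these assumptions the two forms produce the same dst value in every case, so the choice between them is purely a matter of execution time.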
5.3.1.6 Simple Control Transfer
There is no branch look-ahead or branch prediction mechanism on the 80960VH. Simple branch
instructions take one cycle to execute, and one more cycle is needed to fetch the target instruction if
the branch is actually taken.
b, bal, bno, bo, bl, ble, be, bne, bg, bge
One mode of the bx (branch-extended) instruction, bx (base), is also a simple branch and takes one
cycle to execute and one cycle to fetch the target.
As a result, a bal (g14) or bx (g14) sequence provides a two-cycle call and return mechanism for
efficient leaf procedure implementation.
Compare-and-branch instructions have been optimized on the 80960VH. They require two cycles
to execute, and one more cycle to fetch the target instruction if the branch is actually taken. The
instructions are:
5.3.1.7 Memory Instructions
The 80960VH provides efficient support for naturally aligned byte, short, and word accesses that
use one of six optimized addressing modes. These accesses require only one to two cycles to
execute; additional cycles are needed for a load to return its data.
The byte, short and word memory instructions are:
ldob, ldib, ldos, ldis, ld, lda
stob, stib, stos, stis, st
The remainder of accesses require multiple cycles to execute. These include:
• Unaligned short, and word accesses
• Byte, short, and word accesses that do not use one of the 6 optimized addressing modes
• Multi-word accesses
The multi-word accesses are:
ldl, ldt, ldq, stl, stt, stq
5.3.1.8 Unaligned Memory Accesses
Unaligned memory accesses are performed by microcode. Microcode sequences the access into
smaller aligned pieces and does any merging of data that is needed. As a result, these accesses are
not as efficient as aligned accesses. In addition, no bursting on the external bus is performed for
these accesses. Whenever possible, unaligned accesses should be avoided.
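To illustrate why such accesses cost more, the sketch below splits a word load at an arbitrary byte offset into naturally aligned byte, short and word pieces and merges them. It is a host-side model only. The function name, the order in which pieces are chosen, the little-endian byte order and the flat uint8_t array standing in for memory are all assumptions; the actual microcode sequence is not described in this section.

```c
#include <stdint.h>

/* Load a 32-bit word starting at byte offset `off` of `mem` using only
 * naturally aligned pieces, then merge them (little-endian assumed).
 * `accesses`, when non-NULL, receives the number of pieces used. */
uint32_t load_word_by_pieces(const uint8_t *mem, uint32_t off, unsigned *accesses)
{
    uint32_t result = 0;
    unsigned done = 0;   /* bytes merged so far */
    unsigned count = 0;  /* aligned pieces issued */

    while (done < 4) {
        uint32_t addr = off + done;
        unsigned remaining = 4 - done;
        unsigned size;
        uint32_t piece = 0;

        /* Pick the largest naturally aligned piece that still fits. */
        if (addr % 4 == 0 && remaining >= 4)
            size = 4;
        else if (addr % 2 == 0 && remaining >= 2)
            size = 2;
        else
            size = 1;

        for (unsigned i = 0; i < size; i++)
            piece |= (uint32_t)mem[addr + i] << (8 * i);

        result |= piece << (8 * done);  /* merge into its byte lanes */
        done += size;
        count++;
    }

    if (accesses)
        *accesses = count;
    return result;
}
```

In this model an aligned word needs one access, a word at offset 2 needs two, and a word at an odd offset needs three, which matches the qualitative point that unaligned accesses are slower and are not bursted.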
5.3.2 Miscellaneous Optimizations
5.3.2.1 Masking of Integer Overflow
The i960 core architecture inserts an implicit syncf before performing a call operation or
delivering an interrupt so that a fault handler can be dispatched first, when necessary. syncf can
require a number of cycles to complete when a multi-cycle integer-multiply (muli) or
integer-divide (divi) instruction is issued previously and integer-overflow faults are unmasked
(allowed to occur). Call performance and interrupt latency can be improved by masking
integer-overflow faults (AC.om = 1), which allows the implicit syncf to complete more quickly.
5.3.2.2 Avoid Using PFP, SP, R3 As Destinations for MDU Instructions
When performing a call operation or delivering an interrupt, the processor typically attempts to
push the first four local registers (pfp, sp, rip, and r3) onto the local register cache as early as
possible. Because of register-interlock, this operation is stalled until previous instructions return
their results to these registers. In most cases, this is not a problem; however, in the case of
multi-cycle instructions (divo, divi, ediv, modi, remo, and remi), the processor could be stalled
for many cycles waiting for the result and unable to proceed to the next step of call processing or
interrupt delivery.
Call performance and interrupt latency can be improved by avoiding the first four registers as the
destination for a MDU instruction. Generally, registers pfp, sp, and rip should be avoided; they are
used for procedure linking.
5.3.2.3 Use Global Registers (g0 - g14) As Destinations for MDU Instructions
Using the same rationale as in the previous item, call processing and interrupt performance are
improved even further by using global registers (g0-g14) as the destination for multi-cycle MDU
instructions. This is because there is no dependency between g0-g14 and implicit or explicit call
operations (i.e., global registers are not pushed onto the local register cache).
5.3.2.4 Execute in Imprecise Fault Mode
Significant performance improvement is possible by allowing imprecise faults (AC.nif = 0). In
precise fault mode (AC.nif = 1), the processor does not issue a new instruction until the previous
one completes. This ensures that a fault from the previous instruction is delivered before the next
instruction can begin execution. Imprecise fault mode allows new instructions to be issued before
previous ones complete, thus increasing the instruction issue rate. Many applications can tolerate
the imprecise fault reporting for the performance gain. A syncf can be used in imprecise fault
mode to isolate faults at desired points of execution when necessary.
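The two AC register settings recommended in Sections 5.3.2.1 and 5.3.2.4 (AC.om = 1 and AC.nif = 0) can be expressed as a single masked update of the kind modac performs. The sketch below is a host-side model only. The bit positions (AC.om at bit 12, AC.nif at bit 15) are assumptions that should be checked against the AC register definition, and the function names are hypothetical.

```c
#include <stdint.h>

/* Assumed AC register bit positions; verify against the AC register layout. */
#define AC_OM   (1u << 12)  /* integer-overflow mask bit */
#define AC_NIF  (1u << 15)  /* no-imprecise-faults bit */

/* Masked update in the style of modac: only bits set in `mask` are taken
 * from `src`; all other bits of `ac` are preserved. */
uint32_t ac_update(uint32_t ac, uint32_t mask, uint32_t src)
{
    return (src & mask) | (ac & ~mask);
}

/* Apply both recommendations: mask integer-overflow faults (om = 1) and
 * allow imprecise faults (nif = 0), leaving every other AC bit untouched. */
uint32_t ac_tune_for_speed(uint32_t ac)
{
    return ac_update(ac, AC_OM | AC_NIF, AC_OM);
}
```

On the processor, the same mask and source values would be the operands of a modac instruction. Masking integer-overflow faults changes what the program observes on overflow, so the setting is only appropriate where those faults are not relied upon.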
5.3.3 Cache Control
The following instructions provide instruction and data cache control functions.
icctl    Instruction cache control
dcctl    Data cache control
icctl and dcctl provide cache control functions including: enabling, disabling, loading and locking
(instruction cache only), invalidating, getting status and storing cache information out to memory.
6 Instruction Set Reference
This chapter provides detailed information about each instruction available to the i960® VH
processor. Instructions are listed alphabetically by assembly language mnemonic. Format and
notation used in this chapter are defined in Section 6.1, “Notation” on page 6-1.
Information in this chapter is oriented toward programmers who write assembly language code for
the 80960VH. Information provided for each instruction includes:
• Alphabetic listing of all instructions
• Assembly language mnemonic, name and format
• Description of the instruction’s operation
• Related instructions
• Faults that can occur during execution
• Action (or algorithm) and other side effects of executing an instruction
• Assembly language example
• Opcode and instruction encoding format
Additional information about the instruction set can be found in the following chapters and
appendices in this manual:
• Chapter 5, “Instruction Set Overview” - Summarizes the instruction set by group and describes
the assembly language instruction format.
• Appendix A, “Machine-level Instruction Formats” - Describes instruction set opword
encodings.
• Appendix B, “Opcodes and Execution Times” - A quick-reference listing of instruction
encodings assists debugging with a logic analyzer.
6.1 Notation
In general, notation in this chapter is consistent with usage throughout the manual; however, there
are a few exceptions. Read the following subsections to understand notations that are specific to
this chapter.
6.1.1 Alphabetic Reference
Instructions are listed alphabetically by assembly language mnemonic. When several instructions
are related and fall together alphabetically, they are described as a group on a single page.
The instruction’s assembly language mnemonic is shown in bold at the top of the page (for
example, subc). Occasionally, it is not practical to list all mnemonics at the page top. In these
cases, the name of the instruction group is shown in capital letters (for example, BRANCH<cc> or
FAULT<cc>).
The 80960VH-specific extensions to the i960 microprocessor instruction set are indicated in the
header text for each such instruction. This type of notation is also used to indicate new core
architecture instructions. Sections describing new core instructions provide notes as to which
i960-series processors do not implement these instructions.
Generally, instruction set extensions are not portable to other i960 processor implementations.
Further, new core instructions are not typically portable to earlier i960 processor family
implementations such as the i960 Kx microprocessors.
6.1.2 Mnemonic
The Mnemonic section gives the mnemonic (in boldface type) and instruction name for each
instruction covered on the page, for example:
subi    Subtract Integer
This name is the actual assembly language instruction name recognized by assemblers.
6.1.3 Format
The Format section gives the instruction’s assembly language format and allowable operand types.
Format is given in two or three lines. The following is a two-line format example:
sub*    src1       src2       dst
        reg/lit    reg/lit    reg
The first line gives the assembly language mnemonic (boldface type) and operands (italics). When
the format is used for two or more instructions, an abbreviated form of the mnemonic is used. An *
(asterisk) at the end of the mnemonic indicates a variable: in the above example, sub* is either
subi or subo. Capital letters indicate an instruction class. For example, ADD<cc> refers to the
class of conditional add instructions (for example, addio, addig, addoo, addog).
Operand names are designed to describe operand function (for example, src, len, mask).
The second line shows allowable entries for each operand. Notation is as follows:
reg     Global (g0 ... g15) or local (r0 ... r15) register
lit     Literal of the range 0 ... 31
disp    Signed displacement of range (-2²² ... 2²² - 1)
mem     Address defined with the full range of addressing modes
In some cases, a third line is added to show register or memory location contents. For example, it
may be useful to know that a register is to contain an address. The notation used in this line is as
follows:
addr    Address
efa     Effective Address
6.1.4 Description
The Description section is a narrative description of the instruction’s function and operands. It also
gives programming hints when appropriate.
6.1.5 Action
The Action section gives an algorithm written in a "C-like" pseudo-code that describes direct
effects and possible side effects of executing an instruction. Algorithms document the instruction’s
net effect on the programming environment; they do not necessarily describe how the processor
actually implements the instruction. The following is an example of the action algorithm for the
alterbit instruction:
    if ((AC.cc & 010₂) == 0)
        dst = src2 & ~(2**(src1%32));
    else
        dst = src2 | 2**(src1%32);
Table 6-1 defines each abbreviation used in the instruction reference pseudo-code. The
pseudo-code has been written to comply as closely as possible with standard C programming
language notation.
Table 6-1. Pseudo-Code Symbol Definitions
=         Assignment
==, !=    Comparison: equal, not equal
<, >      Less than, greater than
<=, >=    Less than or equal to, greater than or equal to
<<, >>    Logical Shift
**        Exponentiation
&, &&     Bitwise AND, logical AND
|, ||     Bitwise OR, logical OR
^         Bitwise XOR
~         One’s Complement
%         Modulo
+, -      Addition, Subtraction
*         Multiplication (Integer or Ordinal)
/         Division (Integer or Ordinal)
#         Comment delimiter
Table 6-2. Faults Applicable to All Instructions
Fault Type   Subtype            Description
OPERATION    UNIMPLEMENTED      An attempt to execute any instruction fetched from internal data RAM
                                or a memory-mapped region causes an operation unimplemented
                                fault.
TRACE        MARK               A Mark Trace Event is signaled after completion of an instruction for
                                which there is a hardware breakpoint condition match. A Trace fault
                                is generated when PC.mk is set.
TRACE        INSTRUCTION        An Instruction Trace Event is signaled after instruction completion. A
                                Trace fault is generated when both PC.te and TC.i=1.
Table 6-3. Common Faulting Conditions
Fault Type   Subtype            Description
OPERATION    UNALIGNED          Any instruction that causes an unaligned memory access causes
                                an operation unaligned fault when unaligned faults are not masked in
                                the fault configuration word in the Processor Control Block (PRCB).
OPERATION    INVALID_OPCODE     This fault is generated when the processor attempts to execute an
                                instruction containing an undefined opcode or addressing mode.
OPERATION    INVALID_OPERAND    This fault is caused by a non-defined operand in a supervisor mode
                                only instruction or by an operand reference to an unaligned long-,
                                triple- or quad-register group.
OPERATION    UNIMPLEMENTED      This fault can occur due to an attempt to perform a non-word or
                                unaligned access to a memory-mapped region or when attempting
                                to fetch instructions from MMR space or internal data RAM.
TYPE         MISMATCH           Any instruction that attempts to write to supervisor protected
                                internal data RAM or a memory-mapped register in supervisor
                                space while not in supervisor mode causes a TYPE.MISMATCH
                                fault. This fault is also generated for any non-supervisor mode
                                reference to an SFR.
6.1.6 Faults
The Faults section lists faults that can be signaled as a direct result of instruction execution.
Table 6-2 shows the possible faulting conditions that are common to the entire instruction set and
could directly result from any instruction. These fault types are not included in the instruction
reference. Table 6-3 shows the possible faulting conditions that are common to large subsets of the
instruction set. When an instruction can generate a fault, it is noted in that instruction’s Faults
section. In these sections, “Standard” refers to the faults shown in Table 6-2 and Table 6-3.
6.1.7 Example
The Example section gives an assembly language example of an application of the instruction.
6.1.8 Opcode and Instruction Format
The Opcode and Instruction Format section gives the opcode and instruction format for each
instruction, for example:
subi    593H    REG
The opcode is given in hexadecimal format. The format is one of four possible formats: REG,
COBR, CTRL and MEM. Refer to Appendix A, “Machine-level Instruction Formats,” for more
information on the formats.
6.1.9 See Also
The See Also section gives the mnemonics of related instructions which are also alphabetically
listed in this chapter.
6.1.10 Side Effects
This section indicates whether the instruction causes changes to the condition code bits in the
Arithmetic Controls.
6.1.11 Notes
This section provides additional information about an instruction such as whether it is implemented
in other i960 processor families.
6.2 Instructions
The processor’s instructions are arranged alphabetically by instruction or instruction group.
6.2.1 ADD<cc>
Mnemonic:    addono    Add Ordinal if Unordered
             addog     Add Ordinal if Greater
             addoe     Add Ordinal if Equal
             addoge    Add Ordinal if Greater or Equal
             addol     Add Ordinal if Less
             addone    Add Ordinal if Not Equal
             addole    Add Ordinal if Less or Equal
             addoo     Add Ordinal if Ordered
             addino    Add Integer if Unordered
             addig     Add Integer if Greater
             addie     Add Integer if Equal
             addige    Add Integer if Greater or Equal
             addil     Add Integer if Less
             addine    Add Integer if Not Equal
             addile    Add Integer if Less or Equal
             addio     Add Integer if Ordered
Format:      add*      src1,      src2,      dst
                       reg/lit    reg/lit    reg
Description: Conditionally adds src2 and src1 values and stores the result in dst based on
the AC register condition code. For the Unordered variants, the values are added
when the condition code is 0; for all other variants, they are added when the
logical AND of the condition code and the mask part of the opcode is not 0. The
sum is placed in the destination. Otherwise the destination is left unchanged.
Table 6-4 shows the condition code mask for each instruction. The mask is in
opcode bits 4-6.
Table 6-4. Condition Code Mask Descriptions
Instruction       Mask    Condition
addono, addino    000₂    Unordered
addog, addig      001₂    Greater
addoe, addie      010₂    Equal
addoge, addige    011₂    Greater or equal
addol, addil      100₂    Less
addone, addine    101₂    Not equal
addole, addile    110₂    Less or equal
Notes:       This class of core instructions is not implemented on 80960Cx, Kx and Sx
processors.
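The condition rule in the ADD<cc> description can be modeled on a host as shown below. This is a sketch, not processor code: the names are hypothetical, the enumeration covers only the mask values listed in Table 6-4 above, and the model ignores any fault an integer variant might raise. The old dst value is passed in so that the "destination is left unchanged" case is visible.

```c
#include <stdint.h>

/* Mask values as listed in Table 6-4 (opcode bits 4-6). */
enum addcc_mask {
    ADDCC_UNORDERED        = 0,  /* 000 */
    ADDCC_GREATER          = 1,  /* 001 */
    ADDCC_EQUAL            = 2,  /* 010 */
    ADDCC_GREATER_OR_EQUAL = 3,  /* 011 */
    ADDCC_LESS             = 4,  /* 100 */
    ADDCC_NOT_EQUAL        = 5,  /* 101 */
    ADDCC_LESS_OR_EQUAL    = 6   /* 110 */
};

/* Returns the value dst holds after a conditional add. The sum wraps
 * modulo 2^32; overflow faults of the integer variants are not modeled. */
uint32_t addcc_model(uint32_t mask, uint32_t cc,
                     uint32_t src1, uint32_t src2, uint32_t dst)
{
    int take;

    mask &= 7u;
    cc &= 7u;
    /* Unordered: taken when cc is 0. Otherwise: taken when mask and cc
     * share at least one set bit. */
    take = (mask == 0u) ? (cc == 0u) : ((mask & cc) != 0u);

    return take ? (src2 + src1) : dst;
}
```

For example, with the condition code left at equal (010₂) by a preceding compare, the addoe case of the model updates dst, while the addog case leaves it unchanged.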
6.2.2 addc
Mnemonic:    addc    Add Ordinal With Carry
Format:      addc    src1,      src2,      dst
                     reg/lit    reg/lit    reg
Description: Adds src2 and src1 values and condition code bit 1 (used here as a carry-in)
and stores the result in dst. If ordinal addition results in a carry out, then
condition code bit 1 is set; otherwise, bit 1 is cleared. If integer addition
results in an overflow, then condition code bit 0 is set; otherwise, bit 0 is
cleared. Regardless of addition results, condition code bit 2 is always set to 0.
addc can be used for ordinal or integer arithmetic. addc does not distinguish
between ordinal and integer source operands. Instead, the processor evaluates
the result for both data types and sets condition code bits 0 and 1 accordingly.
An integer overflow fault is never signaled with this instruction.
Action:      dst = (src1 + src2 + AC.cc[1])[31:0];
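The carry and overflow behavior in the addc description can be modeled on a host as follows. This is a sketch under stated assumptions: the type and function names are hypothetical, and the condition code is returned as a plain 3-bit value rather than written to an AC register.

```c
#include <stdint.h>

/* Result of the model: the 32-bit sum and the new 3-bit condition code. */
typedef struct {
    uint32_t dst;
    uint32_t cc;
} addc_result;

addc_result addc_model(uint32_t src1, uint32_t src2, uint32_t cc_in)
{
    uint32_t carry_in = (cc_in >> 1) & 1u;           /* condition code bit 1 */
    uint64_t wide = (uint64_t)src1 + src2 + carry_in;
    uint32_t dst = (uint32_t)wide;                   /* low 32 bits of the sum */
    uint32_t carry_out = (uint32_t)(wide >> 32) & 1u;
    /* Integer overflow: both operands have the same sign and the result's
     * sign differs from it. */
    uint32_t overflow = ((~(src1 ^ src2) & (src1 ^ dst)) >> 31) & 1u;
    /* Bit 1 = carry out, bit 0 = integer overflow, bit 2 always 0. */
    addc_result r = { dst, (carry_out << 1) | overflow };
    return r;
}
```

Chaining two calls adds 64-bit quantities: the low words are added with the carry-in bit clear, and the condition code from that step feeds the addition of the high words.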
6.2.4 alterbit
Mnemonic:    alterbit    Alter Bit
Format:      alterbit    bitpos,    src,       dst
                         reg/lit    reg/lit    reg
Description: Copies src value to dst with one bit altered. bitpos operand specifies bit to be
changed; condition code determines the value to which the bit is set. If the
condition code is X1X₂ (bit 1 = 1), the selected bit is set; otherwise, it is
cleared. Typically this instruction is used to set the bitpos bit in the targ
register if the result of a compare instruction is the equal condition code
(010₂).
Action:      if ((AC.cc & 010₂) == 0)
                 dst = src & ~(2**(bitpos%32));
             else
                 dst = src | 2**(bitpos%32);
Faults:      STANDARD    Refer to Section 6.1.6, “Faults” on page 6-4.
Example:     # Assume AC.cc = 010₂.
             alterbit 24, g4, g9    # g9 = g4, with bit 24 set.
Opcode:      alterbit    58FH    REG
See Also:    chkbit, clrbit, notbit, setbit
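The alterbit action translates directly into C, as the host-side sketch below shows. The function name is hypothetical, and the condition code is passed as an argument instead of being read from the AC register.

```c
#include <stdint.h>

/* Copy src with one bit altered: the bit selected by bitpos (modulo 32) is
 * set when condition code bit 1 is 1 and cleared otherwise. */
uint32_t alterbit_model(uint32_t bitpos, uint32_t src, uint32_t cc)
{
    uint32_t bit = 1u << (bitpos % 32);   /* 2**(bitpos%32) */

    if ((cc & 2u) == 0u)                  /* AC.cc & 010 (binary) */
        return src & ~bit;
    return src | bit;
}
```

With cc equal to 010₂ and bitpos 24, the model returns src with bit 24 set, which corresponds to the Example line above.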
6.2.5 and, andnot
Mnemonic:    and       And
             andnot    And Not
Format:      and       src1,      src2,      dst
                       reg/lit    reg/lit    reg
             andnot    src1,      src2,      dst
                       reg/lit    reg/lit    reg
Description: Performs a bitwise AND (and) or AND NOT (andnot) operation on src2 and
src1 values and stores result in dst. Note in the action expressions below, src2
operand comes first, so that with andnot the expression is evaluated as:
{src2 and not (src1)}
rather than
{src1 and not (src2)}.
Action:      and:
                 dst = src2 & src1;
             andnot:
                 dst = src2 & ~src1;
Faults:      STANDARD    Refer to Section 6.1.6, “Faults” on page 6-4.
Example:     and 0x7, g8, g2        # Put lower 3 bits of g8 in g2.
             andnot 0x7, r12, r9    # Copy r12 to r9 with lower
                                    # three bits cleared.
Opcode:      and       581H    REG
             andnot    582H    REG
See Also:    nand, nor, not, notand, notor, or, ornot, xnor, xor
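Because the operand order of andnot is easy to reverse by mistake, the host-side sketch below mirrors the two action expressions. The function names are hypothetical; the parameters follow the assembly operand order (src1 first, src2 second).

```c
#include <stdint.h>

/* and src1, src2, dst: dst = src2 & src1 */
uint32_t and_model(uint32_t src1, uint32_t src2)
{
    return src2 & src1;
}

/* andnot src1, src2, dst: dst = src2 & ~src1 (src1 is the inverted operand) */
uint32_t andnot_model(uint32_t src1, uint32_t src2)
{
    return src2 & ~src1;
}
```

Swapping the arguments of andnot_model changes the result, whereas and_model is symmetric. For a register value of 0x1234ABCD, andnot_model(0x7, value) clears the low three bits, matching the Example line above.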