Motorola DSP96002 User Manual

DSP96002
32-BIT
DIGITAL SIGNAL PROCESSOR
USER’S MANUAL
Motorola, Inc. Semiconductor Products Sector DSP Division 6501 William Cannon Drive, West Austin, Texas 78735-8598
SECTION 1

DSP96002 INTRODUCTION

This manual describes the first member of a family of dual-port IEEE floating point programmable CMOS processors. The family concept defines a core as the Data ALU, Address Generation Unit, Program Controller and associated Instruction Set. The On-Chip Program Memory, Data Memories and Peripherals support many numerically intensive applications and minimize system size and power dissipation; however, they are not considered part of the core.
The first family member is the DSP96002. The main characteristics of the DSP96002 are support of IEEE 754 Single Precision (8 bit Exponent and 24 bit Mantissa) and Single Extended Precision (11 bit Exponent and 32 bit Mantissa) Floating-Point and 32 bit signed and unsigned fixed point arithmetic, coupled with two identical external memory expansion ports. Its features are listed below.
DSP96002 Features
IEEE 754 Standard SP (32-bit) and SEP (44-bit) Arithmetic
16.5 Million Instructions per Second (MIPS) with a 33 MHz clock
49.5 Million Floating Point Operations per Second (MFLOPS) peak with a 33 MHz clock
Single-Cycle 32 x 32 Bit Parallel Multiplier
Highly Parallel Instruction Set with Unique DSP Addressing Modes
Nested Hardware Do Loops
Fast Auto-Return Interrupts
2 Independent On-Chip 512 x 32 Bit Data RAMs
2 Independent On-Chip 1024 x 32 Bit Data ROMs
Off-Chip Expansion to 2 x 2^32 32-Bit Words of Data Memory
On-Chip 1,024 x 32 Bit Program RAM
On-Chip 64 x 32 Bit Bootstrap ROM
Off-Chip Expansion to 2^32 32-Bit Words of Program Memory
Two Identical External Memory Expansion Ports
Two 32-Bit Parallel Host MPU/DMA Interfaces
On-Chip Two-Channel DMA Controller
On-Chip Emulator
SECTION 2

SIGNAL DESCRIPTION AND BUS OPERATION

2.1 PINOUT
The functional signal groups of the DSP96002 are shown in Figure 2-2, and are described in the following sections. A pin allocation summary is shown in Figure 2-1. Specific pinout and timing information is available in the DSP96002 Technical Data Sheet (DSP96002/D).
2.1.1 Package
The DSP96002 is available in a 223 pin PGA package. There are 176 signal pins (including 5 spares), 17 power pins and 30 ground pins. All packaging information is available in the data sheet.
2.1.2 Interrupt And Mode Control (4 Pins)
R̅E̅S̅E̅T̅ (Reset) - active low, Schmitt trigger input. R̅E̅S̅E̅T̅ is internally synchronized to the input clock (CLK). When asserted, the chip is placed in the reset state and the internal phase generator is reset. The Schmitt trigger input allows a slowly rising input (such as a capacitor charging) to reliably reset the chip. If R̅E̅S̅E̅T̅ is deasserted synchronous to the input clock (CLK), exact startup timing is guaranteed, allowing multiple processors to start up synchronously and operate together in "lock-step". When the R̅E̅S̅E̅T̅ pin is deasserted, the initial chip operating mode is latched from the MODA, MODB and MODC pins.
MODA/I̅R̅Q̅A̅ (Mode Select A/External Interrupt Request A) - active low input, internally synchronized to the input clock (CLK). MODA/I̅R̅Q̅A̅ selects the initial chip operating mode during hardware reset and becomes a level sensitive or negative edge triggered, maskable interrupt request input during normal instruction processing. MODA, MODB and MODC select one of 8 initial chip operating modes, latched into the operating mode register (OMR) when the R̅E̅S̅E̅T̅ pin is deasserted. If I̅R̅Q̅A̅ is asserted synchronous to the input clock (CLK), multiple processors can be resynchronized using the WAIT instruction and asserting I̅R̅Q̅A̅ to exit the wait state. If the processor is in the STOP standby state and I̅R̅Q̅A̅ is asserted, the processor will exit the STOP state.
CPU Pins                          Pins
Reset and IRQs                       4
Clock Input                          1
OnCE Port                            4
CPU Spare                            1
Quiet Power                          4
Quiet Ground                         4
CPU Subtotal                        18

Power/Ground Planes               Pins
Package Noisy Power Plane            2
Package Noisy Ground Plane           5
Package Quiet Power Plane            1
Package Quiet Ground Plane           1
Power/Ground Plane Subtotal          9

Port A/B Pins          Each Port  Both Ports
Data Bus                      32          64
Address Bus                   32          64
Data Power                     2           4
Data Ground                    4           8
Address Power                  2           4
Address Ground                 4           8
Addr/Data Subtotal            76         152

Port A/B Pins          Each Port  Both Ports
Bus Control Signals           17          34
Bus Control Spare              2           4
Bus Control Power              1           2
Bus Control Ground             2           4
Control Subtotal              22          44

Pinout Summary                    Pins
CPU Pins                            18
Package Power/Ground Planes          9
Port A/B Pins
  Data and Address                 152
  Bus Control                       44
TOTALS                             223

Figure 2-1. DSP96002 Functional Group Pin Allocation
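The subtotals in Figure 2-1 can be cross-checked with a few lines of arithmetic. The sketch below simply re-adds the per-group pin counts quoted in the figure; the dictionary and variable names are illustrative and do not come from the manual.

```python
# Pin counts copied from Figure 2-1; the names below are illustrative only.
cpu_pins = {
    "Reset and IRQs": 4, "Clock Input": 1, "OnCE Port": 4,
    "CPU Spare": 1, "Quiet Power": 4, "Quiet Ground": 4,
}
plane_pins = {
    "Noisy Power Plane": 2, "Noisy Ground Plane": 5,
    "Quiet Power Plane": 1, "Quiet Ground Plane": 1,
}
# Per-port counts (Port A and Port B are identical).
port_addr_data = {
    "Data Bus": 32, "Address Bus": 32, "Data Power": 2,
    "Data Ground": 4, "Address Power": 2, "Address Ground": 4,
}
port_control = {
    "Bus Control Signals": 17, "Bus Control Spare": 2,
    "Bus Control Power": 1, "Bus Control Ground": 2,
}

cpu_subtotal = sum(cpu_pins.values())
plane_subtotal = sum(plane_pins.values())
addr_data_both_ports = 2 * sum(port_addr_data.values())
control_both_ports = 2 * sum(port_control.values())
total_pins = (cpu_subtotal + plane_subtotal
              + addr_data_both_ports + control_both_ports)

print(cpu_subtotal, plane_subtotal, addr_data_both_ports,
      control_both_ports, total_pins)
```

The computed total should match the 223 pin package described in paragraph 2.1.1.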
MODB/I̅R̅Q̅B̅ (Mode Select B/External Interrupt Request B) - active low input, internally synchronized to the input clock (CLK). MODB/I̅R̅Q̅B̅ selects the initial chip operating mode during hardware reset and becomes a level sensitive or negative edge triggered, maskable interrupt request input during normal instruction processing. MODA, MODB and MODC select one of 8 initial chip operating modes, latched into the operating mode register (OMR) when the R̅E̅S̅E̅T̅ pin is deasserted. If I̅R̅Q̅B̅ is asserted synchronous to the input clock (CLK), multiple processors can be resynchronized using the WAIT instruction and asserting I̅R̅Q̅B̅ to exit the wait state.
MODC/I̅R̅Q̅C̅ (Mode Select C/External Interrupt Request C) - active low input, internally synchronized to the input clock (CLK). MODC/I̅R̅Q̅C̅ selects the initial chip operating mode during hardware reset and becomes a level sensitive or negative edge triggered, maskable interrupt request input during normal instruction processing. MODA, MODB and MODC select one of 8 initial chip operating modes, latched into the operating mode register (OMR) when the R̅E̅S̅E̅T̅ pin is deasserted. If I̅R̅Q̅C̅ is asserted synchronous to the input clock (CLK), multiple processors can be resynchronized using the WAIT instruction and asserting I̅R̅Q̅C̅ to exit the wait state.

[Figure 2-2. DSP96002 Functional Signal Groups (223 pins). Groups shown: Address Bus A and B (aA0-aA31, bA0-bA31; 2 VCC and 4 VSS per port); Data Bus A and B (aD0-aD31, bD0-bD31; 2 VCC and 4 VSS per port); Port A and Port B Bus Control (S1, S0, R/W̅, B̅S̅, B̅L̅, T̅T̅, T̅S̅, T̅A̅, A̅E̅, D̅E̅, H̅S̅, H̅A̅, H̅R̅, B̅R̅, B̅G̅, B̅B̅, B̅A̅, each prefixed with a or b; 2 NC, 1 VCC and 2 VSS per port); Interrupt and Mode Control (MODA/I̅R̅Q̅A̅, MODB/I̅R̅Q̅B̅, MODC/I̅R̅Q̅C̅, R̅E̅S̅E̅T̅); OnCE On-Chip Emulation Port (DSO, DSI/OS0, DSCK/OS1, D̅R̅); Clock Input (CLK, NC); Noisy Power Plane (2 VCC, 5 VSS); Quiet Power (4 VCC, 4 VSS); Quiet Power Plane (1 VCC, 1 VSS).]

OnCE is a trademark of Motorola Inc.
2.1.3 Power and Clock (39 Pins)
CLK (Clock Input) - active high input, high frequency processor clock. The CLK frequency is twice the instruction rate. An internal phase generator divides CLK into four phases (t0, t1, t2 and t3), which together form the basic instruction execution cycle. Additional tw phases are optionally generated to insert wait states (WS) into instruction execution. A wait state is formed by pairing a t2 and a tw phase. CLK should be continuous with a 46-54% duty cycle.
[Timing diagram: CLK phases for a no wait state instruction (t0 t1 t2 t3) followed by a two wait state instruction (t0 t1 t2 tw t2 tw t2 t3), with each wait state (WS) marked.]
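The phase sequences in the timing diagram above can be generated mechanically. The following is a minimal sketch, assuming that every wait state contributes exactly one t2/tw pair between t1 and the final t2, t3 phases, as the diagram suggests; the function names are hypothetical and not part of any Motorola tool.

```python
def phase_sequence(wait_states: int) -> list[str]:
    """Return the clock phases of one instruction cycle.

    Assumption: each wait state adds one (t2, tw) pair, matching the
    two wait state example t0 t1 t2 tw t2 tw t2 t3 in the diagram.
    """
    if wait_states < 0:
        raise ValueError("wait_states must be non-negative")
    return ["t0", "t1"] + ["t2", "tw"] * wait_states + ["t2", "t3"]


def instruction_rate_hz(clk_hz: float) -> float:
    """CLK frequency is twice the instruction rate (no wait states)."""
    return clk_hz / 2


print(phase_sequence(0))
print(phase_sequence(2))
print(instruction_rate_hz(33e6))
```

With a 33 MHz clock the second function reproduces the 16.5 million instructions per second quoted in the feature list.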
Quiet VCC (4) (Power) - isolated power for the CPU logic. Must be tied to all other chip power pins externally. User must provide adequate external decoupling capacitors.
Quiet VSS (4) (Ground) - isolated ground for the CPU logic. Must be tied to all other chip ground pins externally. User must provide adequate external decoupling capacitors.
Address Bus VCC (4) (Power) - isolated power for sections of address bus I/O drivers. Must be tied to all other chip power pins externally. User must provide adequate external decoupling capacitors.
Address Bus VSS (8) (Ground) - isolated ground for sections of address bus I/O drivers. Must be tied to all other chip ground pins externally. User must provide adequate external decoupling capacitors.
Data Bus VCC (4) (Power) - isolated power for sections of data bus I/O drivers. Must be tied to all other chip power pins externally. User must provide adequate external decoupling capacitors.
Data Bus VSS (8) (Ground) - isolated ground for sections of data bus I/O drivers. Must be tied to all other chip ground pins externally. User must provide adequate external decoupling capacitors.
Bus Control VCC (2) (Power) - isolated power for the bus control I/O drivers. Must be tied to all other chip power pins externally. User must provide adequate external decoupling capacitors.
Bus Control VSS (4) (Ground) - isolated ground for the bus control I/O drivers. Must be tied to all other chip ground pins externally. User must provide adequate external decoupling capacitors.
2.1.4 On-chip Emulator Interface (OnCE) (4 Pins)
D̅R̅ (Debug Request) - The debug enable input provides a means of entering the debug mode of operation from the external command controller. When asserted, this pin causes the DSP96002 to finish the current instruction being executed, save the instruction pipeline information, enter the debug mode and wait for commands to be entered from the debug serial input line.
DSCK/OS1 (Debug Serial Clock/Chip Status 1) - The DSCK/OS1 pin, when configured as an input, is the pin through which the serial clock is supplied to the OnCE. The serial clock provides pulses required to shift data into and out of the OnCE serial port. When configured as an output (not in Debug Mode), this pin, in conjunction with the OS0 pin, provides information about the chip status.
DSI/OS0 (Debug Serial Input/Chip Status 0) - The DSI/OS0 pin, when configured as an input, is the pin through which serial data or commands are provided to the OnCE controller. The data received on the DSI pin will be recognized only when the DSP96002 has entered the debug mode of operation. When configured as an output (not in Debug Mode), this pin, in conjunction with the OS1 pin, provides information about the chip status.
DSO (Debug Serial Output) - The debug serial output provides the data contained in one of the OnCE controller registers as specified by the last command received from the external command controller. When a trace or breakpoint occurs, this line will be asserted for one T cycle to indicate that the chip has entered the debug mode and is waiting for commands.
2.1.5 Port A and Port B (162 Pins)
Port A and Port B are identical in pinout and function. The following pin descriptions apply to both ports. Each port may be a bus master and each port has a host interface which can be accessed on demand.
The pins are specified for a 50 pF load and two external TTL loads. Derating curves will be provided specifying performance up to 250 pF capacitive loads.
A0-A31 (Address Bus) - three-state, active high outputs when a bus master. When not a bus master, A2-A5 are active high inputs, A0-A1 and A6-A31 are three-stated. As inputs, A2-A5 may change asynchronously relative to the input clock (CLK). A2-A5 are host interface address inputs which are used to select the host interface register. When a bus master, A0-A31 specify the address for external program and data memory accesses. If there is no external bus activity, A0-A31 remain at their previous values. When a bus master, the Address Enable (A̅E̅) input acts as an output enable control for A0-A31. When a bus master, A0-A31 are stable whenever the transfer strobe T̅S̅ is asserted and may change only when T̅S̅ is deasserted. A0-A31 are three-stated during hardware reset.
D0-D31 (Data Bus) - three-state, active high, bidirectional input/outputs when a bus master or not a bus master. The Data Enable (D̅E̅) input acts as an output enable control for D0-D31. As a bus master, the data lines are controlled by the CPU instruction execution or the DMA controller. D0-D31 are also the Host Interface data lines. If there is no external bus activity, D0-D31 are three-stated. D0-D31 are also three-stated during hardware reset.
S1,S0 (Space Select) - three-state, active low outputs when a bus master, three-stated when not a bus master. Timing is the same as the address lines A0-A31. S1 and S0 are three-stated during hardware reset.
These signals can be viewed in different ways, depending on how the external memories are mapped. They support the trend toward splitting memory spaces among ports and mapping multiple memory spaces into the same physical memory locations. Several examples are given in Figure 2-3. The encoding S1:S0=11 may be used to place external memories in their low power standby mode.

S1  S0  MEMORY SPACE
1   1   No access
1   0   P access
0   1   X access
0   0   Y access
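The S1:S0 encoding lends itself to a simple lookup, which is how external decode logic or a bus monitor might classify each access. The sketch below is illustrative only; the table contents come from the encoding above, while the function and constant names are assumptions.

```python
# (S1, S0) pin levels -> memory space, per the Space Select encoding.
MEMORY_SPACE = {
    (1, 1): "No access",
    (1, 0): "P access",
    (0, 1): "X access",
    (0, 0): "Y access",
}


def decode_space(s1: int, s0: int) -> str:
    """Classify a bus access from the S1 and S0 pin levels (0 or 1)."""
    return MEMORY_SPACE[(s1, s0)]


def memories_in_standby(s1: int, s0: int) -> bool:
    """S1:S0 = 11 may be used to place external memories in standby."""
    return (s1, s0) == (1, 1)


print(decode_space(1, 0))
print(memories_in_standby(1, 1))
```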
R/W̅ (Read/Write) - three-state, active low output when a bus master, active low input when not a bus master. Bus master timing is the same as the DSP96002 address lines, giving an "early write" signal for DRAM interfacing. R/W̅ is high for a read access and is low for a write access. The R/W̅ pin is also the Host Interface read/write input. As an input, R/W̅ may change asynchronously relative to the input clock. R/W̅ goes high if the external bus is not used during an instruction cycle. R/W̅ is three-stated during hardware reset.

[Figure 2-3. Program and Data Memory Select Encoding: a table listing the S1 function and the S0 function (expressed in terms of P̅S̅, D̅S̅ and X/Y select signals) for each external memory mapping: P only; X only; Y only; X and Y mapped as 1 or 2 spaces; P and X mapped as 2 spaces; P and Y mapped as 1 space; P, X, and Y mapped as 1 space. The individual table entries are not recoverable from this copy.]
B̅S̅ (Bus Strobe) - three-state, active low output when a bus master, three-stated when not a bus master. Asserted at the start of a bus cycle (providing an "early bus start" signal for DRAM interfacing) and deasserted at the end of the bus cycle. The early negation provides an "early bus end" signal useful for external bus control. If the external bus is not used during an instruction cycle, B̅S̅ remains deasserted until the next external bus cycle. B̅S̅ is three-stated during hardware reset.
T̅T̅ (Transfer Type) - three-state, active low output when a bus master, three-stated when not a bus master. When a bus master, T̅T̅ is controlled by an on-chip page circuit (see Section seven). T̅T̅ is asserted when a fast access memory mode (page, static column, nibble or serial shift register) is detected. If the external bus is not used during an instruction cycle or a fault is detected by the page circuit during an external access, T̅T̅ remains deasserted. The parameters of the page circuit fault detection are user programmable. T̅T̅ is three-stated during hardware reset.
T̅S̅ (Transfer Strobe) - three-state, active low output when a bus master, active low input when not a bus master. When a bus master, T̅S̅ is asserted to indicate that the address lines A0-A31, S1, S0, B̅S̅, B̅L̅ and R/W̅ are stable and that a bus read or bus write transfer is taking place. During a read cycle, input data is latched inside the DSP96002 on the rising edge of T̅S̅. During a write cycle, output data is placed on the data bus after T̅S̅ is asserted. Therefore T̅S̅ can be used as an output enable control for external data bus buffers if they are present. If the external bus is not used during an instruction cycle, T̅S̅ remains deasserted until the next external bus cycle. An external flip-flop can delay T̅S̅ if required for slow devices or more address decoding time. The T̅S̅ pin is also the Host Interface transfer strobe input used to enable the data bus output drivers during host read operations and to latch data inside the Host Interface during host write operations. As an input, T̅S̅ may change asynchronously relative to the input clock. Write data is latched inside the Host Interface on the rising edge of T̅S̅. T̅S̅ is three-stated during hardware reset.
When a bus master, the combination of B̅S̅ and T̅S̅ can be decoded externally to determine the status of the current bus cycle and to generate hardware strobes useful for latching address and data signals. The encoding is shown in Figure 2-4.

[Timing diagram: CLK, B̅S̅, T̅S̅, A̅S̅ and D̅S̅ waveforms for a no wait state bus cycle (t0 t1 t2 t3) followed by a two wait state bus cycle (t0 t1 t2 tw t2 tw t2 t3).]

B̅S̅  T̅S̅  Bus Status   Strobe Generation Application
1   1   Idle
0   1   Cycle Start  Address Strobe (A̅S̅)
0   0   Wait
1   0   Cycle End    Data Strobe (D̅S̅)

Figure 2-4. Bus Status Encoding
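As an illustration of the external decoding described above, the following sketch maps the two strobe levels to the bus status and the strobe application listed in Figure 2-4. It models only the table lookup, not the timing of any derived strobe; all names are hypothetical.

```python
# (BS, TS) pin levels -> (bus status, strobe application or None),
# following the Figure 2-4 encoding. Both inputs are active low.
BUS_STATUS = {
    (1, 1): ("Idle", None),
    (0, 1): ("Cycle Start", "Address Strobe"),
    (0, 0): ("Wait", None),
    (1, 0): ("Cycle End", "Data Strobe"),
}


def decode_bus_status(bs: int, ts: int) -> tuple:
    """Return (status, strobe application) for the given pin levels."""
    return BUS_STATUS[(bs, ts)]


# Walk through the four states in the order a bus cycle visits them.
for levels in [(1, 1), (0, 1), (0, 0), (1, 0)]:
    print(levels, decode_bus_status(*levels))
```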
T̅A̅ (Transfer Acknowledge) - active low input. If the DSP96002 is the bus master and there is no external bus activity, or if the DSP96002 is not the bus master, the T̅A̅ input is ignored by the core. The T̅A̅ input is a synchronous "DTACK" function which can extend an external bus cycle indefinitely. T̅A̅ must be asserted and deasserted synchronous to the input clock (CLK) for proper operation. T̅A̅ is sampled on the falling edge of the input clock (CLK). Any number of wait states (0, 1, 2, ..., infinity) may be inserted by keeping T̅A̅ deasserted. In typical operation, T̅A̅ is deasserted at the start of a bus cycle, is asserted to enable completion of the bus cycle and is deasserted before the next bus cycle. The current bus cycle completes one clock period after T̅A̅ is asserted synchronous to CLK. The number of wait states is determined by the T̅A̅ input or by the Bus Control Register (BCR), whichever is longer. The BCR can be used to set the minimum number of wait states in external bus cycles. If T̅A̅ is tied low (asserted) and no wait states are specified in the BCR register, zero wait states will be inserted into external bus cycles.
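The "whichever is longer" rule amounts to taking a maximum. The sketch below assumes the wait states requested through the transfer acknowledge input have already been counted by some external means, and combines them with the BCR setting; it also derives the number of clock phases in the bus cycle from the four base phases plus one t2/tw pair per wait state described in paragraph 2.1.3. The function names are illustrative.

```python
def effective_wait_states(bcr_wait_states: int, ta_wait_states: int) -> int:
    """Wait states actually inserted: the longer of the BCR count and
    the count imposed by holding the transfer acknowledge deasserted."""
    return max(bcr_wait_states, ta_wait_states)


def bus_cycle_phases(wait_states: int) -> int:
    """Four base phases plus two phases (t2 and tw) per wait state."""
    return 4 + 2 * wait_states


# Transfer acknowledge tied low (0 extra) with no BCR wait states.
print(effective_wait_states(0, 0))
# BCR sets a minimum of 2; the external device asks for only 1.
print(effective_wait_states(2, 1))
print(bus_cycle_phases(effective_wait_states(2, 1)))
```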
A̅E̅ (Address Enable) - active low input, must be asserted and deasserted synchronous to the input clock (CLK) for proper operation. If a bus master, A̅E̅ is asserted to enable the A0-A31 address output drivers. If A̅E̅ is deasserted, the address output drivers are three-stated. If not a bus master, the address output drivers are three-stated regardless of whether A̅E̅ is asserted or deasserted. The function of A̅E̅ is to allow multiplexed bus systems to be implemented. Examples are a multiplexed address/data bus such as the NuBus used in the Macintosh II or a multiplexed address1/address2 bus used with dual port memories such as dynamic VRAMs. Note that there must be at least one undriven CLK period between enables for multiplexed buses to allow one bus to three-state before another bus is enabled. External control is responsible for this timing. For non-multiplexed systems, A̅E̅ should be tied low.
D̅E̅ (Data Enable) - active low input, must be asserted and deasserted synchronous to the input clock (CLK) for proper operation. If a bus master or the Host interface is being read, D̅E̅ is asserted to enable the D0-D31 data bus output drivers. If D̅E̅ is deasserted, the data bus output drivers are three-stated. If not a bus master, the data bus output drivers are three-stated regardless of whether D̅E̅ is asserted or deasserted. Read-only bus cycles may be performed even though D̅E̅ is deasserted. The function of D̅E̅ is to allow multiplexed bus systems to be implemented. Examples are a multiplexed address/data bus such as the NuBus used in the Macintosh II or a multiplexed data1/data2 bus used for long word transfers with one 32 bit wide memory. Note that there must be at least one undriven CLK period between enables for multiplexed buses to allow one bus to three-state before another bus is enabled. External control is responsible for this timing. For non-multiplexed systems, D̅E̅ should be tied low.
H̅S̅ (Host Select) - active low input, may change asynchronously to the input clock. H̅S̅ is asserted low to enable selection of the Host Interface functions by the address lines A2-A5. If T̅S̅ is asserted when H̅S̅ is asserted, a data transfer will take place with the Host Interface. Note that both H̅S̅ and H̅A̅ must be tied high to disable the Host Interface. When H̅A̅ is asserted, H̅S̅ is ignored.
H̅A̅ (Host Acknowledge) - active low input, may change asynchronously to the input clock. H̅A̅ is used to acknowledge either an interrupt request or a DMA request to the host interface. When the host interface is not in DMA mode, asserting T̅S̅ when H̅A̅ and H̅R̅ are asserted will enable the contents of the host interface interrupt vector register (IVR) onto the data bus outputs D0-D31. This provides an interrupt acknowledge capability compatible with MC68000 family processors.
If the host interface is in DMA mode, H̅A̅ is used as a DMA transfer acknowledge input and it is asserted by an external device to transfer data between the Host Interface registers and an external device. In DMA read mode, H̅A̅ is asserted to read the Host Interface RX register on the data bus outputs D0-D31. In DMA write mode, H̅A̅ is asserted to strobe external data into the Host Interface TX register. Write data is latched into the TX register on the rising edge of H̅A̅.

NuBus is a trademark of Texas Instruments, Inc. Macintosh II is a trademark of Apple Computer, Inc.
H̅R̅ (Host Request) - active low output, never three-stated. The host request H̅R̅ is asserted to indicate that the host interface is requesting service - either an interrupt request or a DMA request - from an external device.
The H̅R̅ output may be connected to interrupt request input I̅R̅Q̅A̅, I̅R̅Q̅B̅, or I̅R̅Q̅C̅ of another DSP96002. The DSP96002 on-chip DMA Controller channel can select the interrupt request input as a DMA transfer request input.
B̅R̅ (Bus Request) - active low output, never three-stated. B̅R̅ is asserted when the CPU or DMA is requesting bus mastership. B̅R̅ is deasserted when the CPU or DMA no longer needs the bus. B̅R̅ may be asserted or deasserted independent of whether the DSP96002 is a bus master or a bus slave. Bus "parking" allows B̅R̅ to be deasserted even though the DSP96002 is the bus master. See the description of bus "parking" in the B̅A̅ pin description. The RH bit in the Bus Control Register (see Section seven) allows B̅R̅ to be asserted under software control even though the CPU or DMA does not need the bus. B̅R̅ is typically sent to an external bus arbitrator which controls the priority, parking and tenure of each DSP96002 on the same external bus. B̅R̅ is only affected by CPU or DMA requests for the external bus, never for the internal bus. During hardware reset, B̅R̅ is deasserted and the arbitration is reset to the bus slave state.
B̅G̅ (Bus Grant) - active low input. B̅G̅ must be asserted/deasserted synchronous to the input clock (CLK) for proper operation. B̅G̅ is asserted by an external bus arbitration circuit when the DSP96002 may become the next bus master. When B̅G̅ is asserted, the DSP96002 must wait until B̅B̅ is deasserted before taking bus mastership. When B̅G̅ is deasserted, bus mastership is typically given up at the end of the current bus cycle. This may occur in the middle of an instruction which requires more than one external bus cycle for execution. Note that indivisible read-modify-write instructions (BSET, BCLR, BCHG) will not give up bus mastership until the end of the current instruction. B̅G̅ is ignored during hardware reset.
B̅A̅ (Bus Acknowledge) - Open drain, active low output. When deasserting B̅A̅, the DSP96002 drives B̅A̅ high during half a CLK cycle and then disables the active pull-up. In this way, only a weak external pull-up resistor is required to hold the line high. B̅A̅ may be directly connected to B̅B̅ in order to obtain the same functionality as the MC68040 B̅B̅ pin. When B̅G̅ is asserted, the DSP96002 becomes the pending bus master. It waits until B̅B̅ is negated by the previous bus master, indicating that the previous bus master is off the bus. The pending bus master asserts B̅A̅ to become the current bus master. B̅A̅ is asserted when the CPU or DMA has taken the bus and is the bus master. While B̅A̅ is asserted, the DSP96002 is the owner of the bus (the bus master). When B̅A̅ is negated, the DSP96002 is a bus slave. B̅A̅ may be used as a three-state enable control for external address, data and bus control signal buffers. B̅A̅ is three-stated during hardware reset.
Note that a current bus master may keep B̅A̅ asserted after ceasing bus activity, regardless of whether B̅R̅ is asserted or deasserted. This is called "bus parking" and allows the current bus master to use the bus repeatedly without re-arbitration until some other device wants the bus.
The current bus master keeps B̅A̅ asserted during indivisible read-modify-write bus cycles, regardless of whether B̅G̅ has been deasserted by the external bus arbitration unit. This form of "bus locking" allows the current bus master to perform atomic operations on shared variables in multitasking and multiprocessor systems. Current instructions which perform indivisible read-modify-write bus cycles are BCLR, BCHG and BSET.
B̅B̅ (Bus Busy) - active low input, must be asserted and deasserted synchronous to the input clock (CLK) for proper operation. B̅B̅ is deasserted when there is no bus master on the external bus. In multiple DSP96002 systems, all B̅B̅ inputs are tied together and are driven by the logical AND of all B̅A̅ outputs. B̅B̅ is asserted by a pending bus master (directly or indirectly by B̅A̅ assertion) to indicate that it is now the current bus master. B̅B̅ is deasserted by the current bus master (directly or indirectly by B̅A̅ negation) to indicate that it is off the bus and is no longer the bus master. The pending bus master monitors the B̅B̅ signal until it is deasserted. Then the pending bus master asserts B̅A̅ to become the current bus master, which asserts B̅B̅ directly or indirectly.
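Because the arbitration signals are active low, the "logical AND of all bus acknowledge outputs" means the shared bus busy line goes low (asserted) as soon as any one processor asserts its acknowledge. The following minimal sketch models pin levels as 0 (asserted) and 1 (deasserted) and shows the condition under which a pending bus master may take the bus; the function names are hypothetical and the model ignores clock synchronization.

```python
def bus_busy_level(bus_ack_levels: list) -> int:
    """Shared bus busy pin level: logical AND of all bus acknowledge
    pin levels. 0 means asserted (some processor owns the bus)."""
    return int(all(bus_ack_levels))


def may_take_bus(bus_grant_level: int, bus_busy_level_now: int) -> bool:
    """A pending master (bus grant asserted, level 0) may assert its
    bus acknowledge only once bus busy is deasserted (level 1)."""
    return bus_grant_level == 0 and bus_busy_level_now == 1


# Three processors, none owns the bus: bus busy is deasserted.
print(bus_busy_level([1, 1, 1]))
# Second processor asserts its acknowledge: bus busy is asserted.
print(bus_busy_level([1, 0, 1]))
# Granted processor must wait while the bus is still busy.
print(may_take_bus(0, bus_busy_level([1, 0, 1])))
```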
B̅L̅ (Bus Lock) - active low output, never three-stated. Asserted at the start of an external indivisible Read-Modify-Write (RMW) bus cycle (providing an "early bus start" signal for DRAM interfacing) and deasserted at the end of the write bus cycle. B̅L̅ remains asserted between the read and write bus cycles of the RMW bus sequence. B̅L̅ can be used to indicate that special memory timing (such as RMW timing for DRAMs) may be used or to "resource lock" an external multi-port memory for secure semaphore updates. The early negation provides an "early bus end" signal useful for external bus control. If the external bus is not used during an instruction cycle, B̅L̅ remains deasserted until the next external indivisible RMW bus cycle. B̅L̅ also remains deasserted if the external bus cycle is not an indivisible RMW bus cycle or if there is an internal RMW bus cycle. The only instructions which automatically assert B̅L̅ are a BSET, BCLR or BCHG instruction which accesses external memory. B̅L̅ can also be asserted by setting the LH bit in the BCR register (see Section seven). B̅L̅ is deasserted during hardware reset.
2.1.6 Reserved Pins
There are 5 spare pins reserved for future use.
2.2 BUS OPERATION
The external bus timing is defined by the operation of the Address Bus, Data Bus and Bus Control pins described in paragraph 2.1.5. The DSP96002 external ports are designed to interface with a wide variety of memory and peripheral devices, high speed static RAMs, dynamic RAMs and video RAMs as well as slower memory devices. External bus timing is controlled by the T̅A̅ control signal and by the Bus Control Registers (BCR) which are described in Section seven. The BCR and T̅A̅ control the timing of the bus interface signals. Insertion of wait states is controlled by the BCR to provide constant bus access timing, and by T̅A̅ to provide dynamic bus access timing. The number of wait states is determined by the T̅A̅ input or by the BCR, whichever is longer.
2.2.1 Synchronous Bus Operation
A synchronous external bus cycle consists of at least 4 internal clock phases. See the DSP96002 Technical Data Sheet (DSP96002/D) for the specification of the internal clock phases. Each synchronous external memory access requires the following procedure:
1. The external memory address is defined by the Address Bus A0-A31 and the Memory Reference Select signals S1 and S0. These signals change in the first phase of the external bus cycle. The Memory Reference Select signals have the same timing as the Address Bus and may be used as additional address lines. The Address and Memory Reference signals are also used to generate chip select signals for the appropriate memory chips. These chip select signals change the memory chips from low power standby mode to active mode and begin the read access time. This allows slower memories to be used since the chip select signals are address-based rather than read or write enable-based.
2. When the Address and Memory Reference signals are stable, the data transfer is enabled by the Transfer Strobe T̅S̅ signal. T̅S̅ is asserted to "qualify" the Address and Memory Reference signals as stable and to perform the read or write data transfer. T̅S̅ is asserted in the second phase of the bus cycle.
3. Wait states are inserted into the bus cycle controlled by a wait state counter or by T̅A̅, whichever is longer. The wait state counter is loaded from the Bus Control Register. If the wait state number determined by these two factors is zero, no wait states are inserted into the bus cycle and T̅S̅ is deasserted in the fourth phase. If the wait state number determined is W, then W wait states are inserted into the instruction cycle. Each wait state introduces one Tc delay.
4. When the Transfer Strobe T̅S̅ is deasserted at the end of a bus cycle, the data is latched in the destination device. At the end of a read cycle, the DSP96002 latches the data internally. At the end of a write cycle, the external memory latches the data. The Address signals remain stable until the first phase of the next external bus cycle to minimize power dissipation. The Memory Reference signals S1 and S0 are deasserted during periods of no bus activity and the data signals are three-stated.
2.2.1.1 Static RAM Support
Static RAM devices can be easily interfaced to the DSP96002 bus timing. There are two basic techniques: C̅S̅ controlled writes and W̅E̅ controlled writes.
2.2.1.1.1 C̅S̅ Controlled Writes
This form of static interface uses the memory chip select (C̅S̅) as the write strobe. The DSP96002 R/W̅ signal is used as an early read/write direction indication. Proper data buffer enable control on RAMs without a separate output enable (O̅E̅) input must use this form to avoid multiple data buffers colliding on the data bus. The interface schematic is shown in Figure 2-5.

[Figure 2-5. C̅S̅ Controlled Writes Interface to Static RAM: schematic showing the DSP96002 T̅S̅ and R/W̅ signals and the static RAM C̅S̅ and W̅E̅ inputs.]

The disadvantage of this technique is that access time is measured from T̅S̅ instead of from the address or B̅S̅. Hence faster memories are required.
2.2.1.1.2 W̅E̅ Controlled Writes
This form of static interface uses the memory write enable (W̅E̅) as the write strobe. The DSP96002 R/W̅ signal is used to form a late read/write indication by gating it with T̅S̅. This form is the one used by the 56000/1 bus interface. Proper data buffer enable control requires a separate output enable (O̅E̅) input on the memory to avoid multiple data buffers colliding on the data bus. The interface schematic is shown in Figure 2-6.

[Figure 2-6. W̅E̅ Controlled Writes Interface To Static RAM: schematic showing the DSP96002 S1 or S0 and R/W̅ signals and the static RAM C̅S̅, O̅E̅ and W̅E̅ inputs.]

The advantage of this technique is that access time is measured from S1, S0 or addresses instead of T̅S̅. Hence slower memories can be used. The disadvantage of this technique is that the write data hold will be shortened because the W̅E̅ signal is delayed by the OR gate.
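The gating mentioned above can be illustrated with a truth table. The sketch below assumes that the OR gate combines the read/write and transfer strobe pin levels directly, so that the memory write enable goes low only while a write is in progress and the transfer strobe is asserted; the exact wiring is given by Figure 2-6, and the function name here is hypothetical.

```python
def write_enable_level(rw_level: int, ts_level: int) -> int:
    """Memory write enable pin level from an OR gate (assumed wiring).

    rw_level: 1 = read, 0 = write.
    ts_level: 0 = transfer strobe asserted, 1 = deasserted.
    The result is 0 (write enable asserted) only when both are 0.
    """
    return rw_level | ts_level


# Print the full truth table for the assumed gate.
for rw in (0, 1):
    for ts in (0, 1):
        print(rw, ts, write_enable_level(rw, ts))
```

In this model the write enable follows the transfer strobe during writes, which is why it behaves as a late write indication.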
3.6.2 Dynamic RAM and Video RAM Support
Modern dynamic memory (DRAM) and video memory (VRAM) are becoming the preferred choice for a wide variety of computing systems based on:
1. Cost per bit due to dynamic storage cell density.
2. Packaging density due to multiplexed address and control pins.
3. Improved performance relative to static RAMs due to fast access modes (page, static column, nibble and serial shift (VRAM)).
4. Commodity pricing due to high volume production.
The Port A/B bus control signals are designed for efficient interface to DRAM/VRAM devices in both random read/write cycles and fast access modes such as those listed above. The bus control signal timing is specified relative to the external clock (CLK) to enable synchronous control by an external state machine. An on-chip page circuit controls the —T–T pin, indicating to the external state machine when a slow or fast access is being made. The page circuit operation and programming are described in Section 7.
4.11 BUS HANDSHAKE AND ARBITRATION
Bus transactions are governed by a single bus master. Bus arbitration determines which device becomes the bus master. The arbitration logic implementation is system dependent, but must result in at most one device becoming the bus master (even if multiple devices request bus ownership). The arbitration signals permit simple implementation of a variety of bus arbitration schemes (e.g. fairness, priority, etc.). External logic must be provided by the system designer to implement the arbitration scheme.
4.11.1 Bus Arbitration Signals
Four signals are provided for bus arbitration. Three of them are local arbitration signals and one is a system arbitration signal. The local arbitration signals run between a potential bus master and the arbitration logic. The local signals are —B–R, —B–G, and —B–A; —B–B is a system arbitration signal. These signals are described below.
—B–R Bus Request - Asserted by the requesting device to indicate that it wants to use the bus, and is held asserted until it no longer needs the bus. This includes time when it is the bus master as well as when it is not the bus master.
—B–G Bus Grant - Asserted by the bus arbitration controller to signal the requesting device that it is the bus master elect. —B–G is valid only when the bus is not busy (Bus Busy signal described below).
—B–A Bus Acknowledge - Asserted by the device (bus master) that received the bus ownership from the bus arbitration controller. The master holds —B–A asserted for the duration of its bus possession. —B–A indicates whether the device is a bus master or a bus slave. When asserted, —B–A indicates that the device is the bus master. —B–A may be used as a three-state enable control for external address, data and bus control signal buffers.
—B–B Bus Busy - The system arbitration signal —B–B is monitored by all potential bus masters and is derived from the local bus signal —B–A. This signal controls the hand-over of bus ownership by the bus master at the end of bus possession. Typically —B–B is the wired-OR of all bus acknowledgments. —B–B is asserted if the Bus Acknowledge signal is asserted by the bus master.
4.11.2 The Arbitration Protocol
The bus is arbitrated by a central bus arbitrator, using individual request/grant lines to each bus master. The arbitration protocol can operate in parallel with bus transfer activity so that the bus hand-over can be made without much performance penalty.
The arbitration sequence occurs as follows:
1. All candidates for bus ownership assert their respective —B–R signals as soon as they need the bus.
2. The arbitration logic designates a bus master-elect by asserting the —B–G signal for that device.
3. The master-elect tests —B–B to ensure that the previous master has relinquished the bus. If —B–B is deasserted, then the master-elect asserts —B–A, which designates the device as the new bus master. If a higher priority bus request occurs before the —B–B signal is deasserted, then the arbitration logic may replace the current master-elect with the higher priority candidate. However, no more than one —B–G signal may be asserted at any time.
4. The new bus master begins its bus transfers after the assertion of —B–A.
5. The arbitration logic may signal the current bus master to relinquish the bus at any time by deasserting —B–G. A DSP96002 bus master releases its ownership (deasserts —B–A) after completing the current external bus access. If an instruction is executing a Read-Modify-Write external access, a DSP96002 master asserts the —B–L signal and will only relinquish the bus (and deassert —B–L) after completing the entire Read-Modify-Write sequence. When the current bus master deasserts —B–A, the —B–B signal must also be deasserted because the next bus master-elect has received its —B–G signal and is waiting for —B–B to be deasserted before claiming ownership.
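The sequence above can be sketched in software as a minimal model. This is an illustration only, not the arbitration logic of any particular system: the fixed-priority policy (device 0 highest), the function names `arbitrate` and `step`, and the use of True for "asserted" (the real pins are active low) are all assumptions made for the example.

```python
def arbitrate(requests):
    """Hypothetical fixed-priority arbiter: index 0 has the highest priority.

    requests[i] is True while device i asserts its bus request.
    Returns a grant list in which at most one entry is True.
    """
    grants = [False] * len(requests)
    for device, requesting in enumerate(requests):
        if requesting:
            grants[device] = True
            break
    return grants


def step(requests, acks):
    """Advance the handshake by one step and return the new acknowledge list.

    acks[i] is True while device i owns the bus (holds bus acknowledge).
    Bus busy is modeled as the OR of all acknowledges, sampled before
    this step's changes take effect.
    """
    grants = arbitrate(requests)
    bus_busy = any(acks)
    new_acks = []
    for device in range(len(requests)):
        if acks[device]:
            # The current master keeps the bus only while it is still granted
            # (its bus access is assumed to complete within this step).
            new_acks.append(grants[device])
        else:
            # A master-elect claims the bus only when the bus is not busy.
            new_acks.append(grants[device] and not bus_busy)
    return new_acks
```

In this model a higher priority request takes two steps to win the bus: one for the current master to release it, and one for the new master-elect to see the bus idle and claim it.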
The DSP96002 has two control bits and one status bit, located in the Bus Control Registers (see Section 7), to permit software control of the —B–R and —B–L signals, and to verify when the chip is the bus master. If the RH bit in the BCR register is cleared, the DSP96002 asserts its —B–R signal only as long as requests for bus transfers are pending or being attempted. If the RH bit is set, —B–R will remain asserted. If the LH bit in the BCR register is cleared, the DSP96002 asserts its —B–L signal only during a read-modify-write bus access. If the LH bit is set, —B–L will remain asserted.
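The effect of the RH and LH bits can be summarized as two small predicates. This is a sketch under stated assumptions: the function and parameter names are hypothetical, and True means "asserted" regardless of the electrical polarity of the pins.

```python
def bus_request_asserted(rh, transfers_pending):
    """Bus request is held asserted when RH is set; otherwise it follows
    pending or in-progress external transfers."""
    return rh or transfers_pending


def bus_lock_asserted(lh, rmw_in_progress):
    """Bus lock is held asserted when LH is set; otherwise it is asserted
    only during a read-modify-write bus access."""
    return lh or rmw_in_progress
```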
5.16.1 Arbitration Scheme
The bus arbitration scheme is implementation dependent. The diagram in Figure 2-7 illustrates a common method of implementing the bus arbitration scheme. The arbitration logic determines the device priorities and assigns bus ownership depending on those priorities.
An implementation of a bus arbitration scheme may hold —B–G asserted, for example, to the current bus owner if none of the other devices are requesting the bus. As a consequence, the current bus master may keep —B–A asserted after ceasing bus activity, regardless of whether —B–R is asserted or deasserted. This situation is called "bus parking" and allows the current bus master to use the bus repeatedly without re-arbitration until some other device requests the bus.
Figure 2-7. Bus Arbitration Scheme
5.16.2 Bus Handshake Unit
The bus handshake unit in the DSP96002 is implemented within a finite state machine. It consists of two external outputs (—B–R, —B–A), two external inputs (—B–G, —B–B) and three internal inputs (ext_acc_req, end_of_sequence, RH) (see Figure 2-8). The ext_acc_req signal is asserted when one or more requests for external bus access are pending, and remains asserted as long as the transfers are being executed. The end_of_sequence signal is asserted at the last bus cycle of the current sequence.
Figure 2-8. Bus Handshake Unit
Figure 2-9. Bus Handshake State Diagram
States shown in Figure 2-9: IDLE (X), —B–R = —R–H; REQUEST_BUS (Y), —B–R = 0, —B–A = 1; ACTIVE_MASTER (Z), —B–R = 0, —B–A = 0; PARKING_MASTER (W), —B–R = —R–H. Arcs YX and YW are marked illegal and arc WY is marked non-existent.
Likewise, when executing the read part of a RMW access, the end_of_sequence signal is deasserted. This signal is used to give up bus ownership if —B–G is deasserted during bus transfers. The state machine which controls the bus handshake is illustrated in Figure 2-9.
The transition arcs are labeled by two letters which denote its source and destination states. The equations of the transition arcs are described as follows:
XX = ^ext_acc_req & ^( ^—B–G & —B–B )
XY = ext_acc_req & ^( ^—B–G & —B–B )
XZ = ext_acc_req & ( ^—B–G & —B–B )
XW = ^ext_acc_req & ( ^—B–G & —B–B )
YX = ^ext_acc_req & ^( ^—B–G & —B–B ) (note 1)
YY = ext_acc_req & ^( ^—B–G & —B–B )
YZ = ext_acc_req & ( ^—B–G & —B–B )
YW = ^ext_acc_req & ( ^—B–G & —B–B ) (note 1)
ZX = ^ext_acc_req & —B–G
ZY = ext_acc_req & —D—B–G & end_of_sequence (note 3)
ZZ = ^end_of_sequence v ( ext_acc_req & ^—D—B–G ) (note 3)
ZW = ^ext_acc_req & ^—B–G
WX = ^ext_acc_req & —B–G
WY = NON-EXISTENT ARC (note 2)
WZ = ext_acc_req
WW = ^ext_acc_req & ^—B–G
Notes:
1. Illegal arcs in DSP96002 since once the request of the bus is pending, it will not be canceled before the execution of the access.
2. Non-existent arc since if ext_acc_req arrives together with the negation of —B–G, the device becomes active master and begins its bus transfers.
3. —D—B–G is —B–G delayed by one phase. This is done to provide a response to the ext_acc_req signal when it is asserted at the same phase together with —B–G negation.
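The arc equations can be exercised with a small next-state function. The following is a sketch, not the chip's implementation: the names `State` and `next_state` are hypothetical, the grant, delayed grant and busy inputs are passed as pin levels (True = high = deasserted, matching the equations), and one assumption is added where the printed equations overlap: in the ACTIVE_MASTER state, a deasserted end_of_sequence is given priority so that the master always stays on the bus until the sequence ends.

```python
from enum import Enum


class State(Enum):
    IDLE = "X"
    REQUEST_BUS = "Y"
    ACTIVE_MASTER = "Z"
    PARKING_MASTER = "W"


def next_state(state, ext_acc_req, end_of_sequence, bg_n, dbg_n, bb_n):
    """Return the next handshake state.

    bg_n, dbg_n and bb_n are the pin levels of bus grant, bus grant
    delayed by one phase, and bus busy (True = high = deasserted).
    """
    granted_and_free = (not bg_n) and bb_n  # the term ( ^BG & BB )

    if state in (State.IDLE, State.REQUEST_BUS):
        if ext_acc_req:
            # Arcs XZ/YZ when granted and the bus is free, else XY/YY.
            return State.ACTIVE_MASTER if granted_and_free else State.REQUEST_BUS
        if state is State.REQUEST_BUS:
            # Arcs YX and YW are illegal: a pending request is never canceled.
            raise ValueError("ext_acc_req deasserted in REQUEST_BUS")
        # Arcs XW / XX.
        return State.PARKING_MASTER if granted_and_free else State.IDLE

    if state is State.ACTIVE_MASTER:
        if not end_of_sequence:
            return State.ACTIVE_MASTER  # first term of ZZ (assumed priority)
        if ext_acc_req:
            # ZY when the delayed grant is removed, otherwise ZZ.
            return State.REQUEST_BUS if dbg_n else State.ACTIVE_MASTER
        # ZX when the grant is removed, otherwise ZW (bus parking).
        return State.IDLE if bg_n else State.PARKING_MASTER

    # PARKING_MASTER: WZ on a new request, else WX / WW.
    if ext_acc_req:
        return State.ACTIVE_MASTER
    return State.IDLE if bg_n else State.PARKING_MASTER
```

For example, a device in IDLE with a pending request, grant asserted and the bus not busy moves directly to ACTIVE_MASTER, while a master whose grant is removed in the middle of a sequence stays in ACTIVE_MASTER until end_of_sequence is asserted.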
5.16.3 Bus Arbitration Example Cases
5.16.3.1 Case 1 – Normal
The device requesting mastership asserts —B–R; the arbiter asserts the requesting device’s —B–G, and —B–B is deasserted, indicating the bus is not busy. The requesting device will assert —B–A.
5.16.3.2 Case 2 – Bus Busy
The device requesting mastership asserts —B–R; the arbiter responds by asserting the requesting device’s —B–G; however, the bus is busy because —B–B is asserted. The requesting device will not assert —B–A until —B–B is deasserted.
5.16.3.3 Case 3 – Low Priority
The device requesting mastership asserts —B–R; the arbiter withholds asserting the requesting device’s —B–G because a higher priority device requested the bus. —B–A of the requesting device will not be asserted.
5.16.3.4 Case 4 – Default
A device does not request the bus and is in the idle state rather than the bus parking state; the arbiter, by design (i.e., default), asserts —B–G. —B–A will remain deasserted.
5.16.3.5 Case 5 – Bus Lock during RMW
If the device requesting mastership asserts —B–R and the arbiter asserts the requesting device’s —B–G and —B–B is deasserted, then the requesting device will assert —B–A. If a read-modify-write (RMW) instruction which accesses external memory is being executed, and the bus arbiter deasserts —B–G, then —B–A will remain asserted until the entire RMW instruction completes execution. —B–A will then be deasserted, thereby relinquishing the bus. Note that during external RMW instruction execution, —B–L is asserted. In general, the —B–L signal can be used to ensure that a multiport memory can only be written by one master at a time. That is, referring to Figure 2-10, —B–L can be input from DSP #1 to the memory controller, which prevents —T–A from being asserted by the controller (thereby suspending the memory access by DSP #2) until DSP #1 completes its RMW access.
Figure 2-10. Bus Lock During RMW
5.16.3.6 Case 6 – Bus Park
The device requesting mastership asserts —B–R; the arbiter asserts the requesting device’s —B–G and —B–B is deasserted, indicating the bus is not busy; the requesting device will assert —B–A. When the requesting device no longer requires the bus it will deassert —B–R; if the bus arbiter leaves —B–G asserted because other requests are not pending, then —B–A will remain asserted. This condition is called bus parking and eliminates the need for the last bus master to rearbitrate for the bus during its next external access.
SECTION 3

CHIP ARCHITECTURE

3.1 INTRODUCTION
The DSP96002 architecture is a 32-bit highly-parallel multiple-bus IEEE floating-point processor. The architecture is designed to accommodate various IC family members with different memory and on-chip peripheral requirements while maintaining a standard programmable core. The overall chip architecture is presented and detailed block diagrams of the Data ALU and Address Generation Unit (AGU) core architecture are described.
3.2 DSP96002 BLOCK DIAGRAM
The major components of the DSP96002 are
Data Buses
Address Buses
Data ALU
Address Generation Unit
X Data Memory
Y Data Memory
Program Control and System Stack
Program Memory
Port A and Port B External Bus Interfaces
Internal Bus Switch and Bit Manipulation Unit
I/O Interfaces
An overall block diagram of the DSP96002 architecture is shown in Figure 3-1.
3.2.1 Data Buses
Data movement on the chip occurs over five bidirectional 32-bit buses: the X Data Bus (XDB), Y Data Bus (YDB), Global Data Bus (GDB), DMA Data Bus (DDB) and Program Data Bus (PDB). The X and Y data buses may also be treated by certain instructions as one 64-bit data bus by concatenation of XDB and YDB. Data transfers between the Data ALU and the X Data Memory and Y Data Memory occur over the X Data Bus and Y Data Bus. These are kept local on the chip to maximize speed and minimize power. The direct memory access data transfers occur over the DMA Data Bus. Program memory data transfers and instruction fetches occur over the Program Data Bus. All other data transfers occur over the Global Data Bus.
Figure 3-1. DSP96002 Block Diagram
3.2.2 Address Buses
Addresses are specified for internal X Data Memory and Y Data Memory on two unidirectional 32-bit buses, X Address Bus (XAB) and Y Address Bus (YAB). Internal address bus sizes depend on the amount of internal memory implemented. External memory spaces for each port, A and B, are addressed via a single 32-bit unidirectional address bus driven by a three input multiplexer that can select the X Address Bus (XAB), the Y Address Bus (YAB) or the Program Address Bus (PAB). On-chip peripherals and the DMA Controller are memory mapped in the internal X memory space. When zero wait state external memory is used, one instruction cycle is needed for each external memory access.
The XAB, YAB and PAB are dual access buses in the sense that one instruction cycle contains two slots: one slot is dedicated to on-chip DMA transfers and the other is used for core transfers.
3.2.3 Data ALU
The Data ALU performs all of the arithmetic and logical operations on data operands. The Data ALU consists of ten 96-bit general purpose registers, a 32-bit barrel shifter, a 32-bit adder, and a 32-bit parallel multiplier. Data ALU registers may be read or written over the XDB and YDB as 32 or 64-bit operands. The Data ALU is capable of multiplication, addition, subtraction, format conversion, shifting and logical operations in one instruction cycle. Data ALU source operands may be 32 or 96-bits and originate from the general purpose register file. Data ALU results are always stored in one of the general purpose registers. Floating-point Data ALU operations always have a 96-bit result. Integer (fixed-point) Data ALU operations have a 32 or 64-bit result.
The Data ALU fully implements the IEEE Standard 754 for binary floating-point arithmetic. The operations are supported in three data formats: 32-bit two’s-complement fixed-point, 32-bit unsigned-magnitude fixed-point and 44-bit IEEE single extended precision floating-point. All the floating-point computations are performed using the single extended precision format and the results are automatically rounded to single precision or single extended precision numbers as programmed. All four IEEE rounding modes (round to zero, round to nearest, round to plus infinity and round to minus infinity) are supported for all floating-point operations and conversions. The IEEE gradual underflow with denormalized numbers is supported by the IEEE mode. In the IEEE mode, if input operand(s) or output result(s) are denormalized numbers, additional instruction cycles are required to process these numbers per the IEEE standard. A "Flush to Zero" mode is also provided which forces all floating point result underflows to zero (all denormalized input operands are considered as being zero). The Flush to Zero mode never requires any additional instruction cycles.
Refer to Section 3.3 for a detailed description of the Data ALU architecture.
3.2.4 AGU
The AGU performs all of the address storage and effective address calculations necessary to address data operands in memory and it is used by both the core and the on-chip DMA Controller. The AGU operates in parallel with other chip resources to minimize address generation overhead. The AGU contains eight Address Registers (R0-R7), eight Offset Registers (N0-N7), and eight Modifier Registers (M0-M7). The Address Registers are 32-bit registers which may contain any address or data. Each Address Register may be accessed for output to the XAB, YAB, and PAB. The modifier and offset registers are 32-bit registers which are normally used to control updating of the address registers.
AGU registers may be read or written over the Global Data Bus as 32-bit operands. The AGU can generate two 32-bit addresses every instruction cycle, one each for any two of the XAB, YAB or PAB. The AGU can directly address 4,294,967,296 locations on the XAB and 4,294,967,296 locations on the YAB, a total capability of 8,589,934,592 32-bit data words. Refer to Section 3.4 for a detailed description of the AGU architecture.
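The address-space figures quoted above follow directly from the 32-bit address width. As a quick check (the variable names are chosen for this example only):

```python
ADDRESS_BITS = 32

# Each 32-bit address bus (XAB or YAB) can select 2**32 locations.
locations_per_bus = 2 ** ADDRESS_BITS

# X and Y data spaces together.
total_data_words = 2 * locations_per_bus

print(f"{locations_per_bus:,}")  # 4,294,967,296
print(f"{total_data_words:,}")   # 8,589,934,592
```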
3.2.5 X Data Memory
The X Data Memory may contain both data RAM and ROM. The X Data RAM is a 32-bit wide internal memory and occupies the lowest 512 locations in X Memory Space. The X Data ROM is also a 32-bit wide internal memory and occupies 1024 locations in X Memory Space. Addresses are received from the XAB and data transfers occur on the XDB. The X memory is a dual-access memory in the sense that it may be accessed twice during a cycle: once by the core and once by the DMA. X memory may be expanded off chip.
3.2.6 Y Data Memory
The Y Data Memory may contain both data RAM and ROM. The Y Data RAM is a 32-bit wide internal memory and occupies the lowest 512 locations in Y Memory Space. The Y Data ROM is also a 32-bit wide internal memory and occupies 1024 locations in Y Memory Space. Addresses are received from the YAB and data transfers occur on the YDB. The Y memory is a dual-access memory in the sense that it may be accessed twice during a cycle: once by the core and once by the DMA. Y memory may be expanded off chip.
3.2.7 Program Control and System Stack
The Program Control logic performs instruction prefetch, instruction decoding and exception processing. A 32-bit program counter (PC) register can address 4,294,967,296 locations in Program Memory Space.
The System Stack is a separate internal RAM which stores the PC and the status register (SR) for subroutine calls and long interrupts. The stack will also store the loop counter (LC) and the loop address register (LA) in addition to the PC and SR registers for program looping. The System Stack is in Stack Memory Space and its address is always inherent and implied by the current instruction. The stack RAM is 64-bits wide and 15 locations "deep". When a subroutine call or long interrupt occurs, the contents of the PC and SR registers are stored (pushed) on the "top" location in the System Stack. When a return from subroutine occurs, the contents of the "top" location in the System Stack are copied (pulled) to the PC. When a return from interrupt occurs, the contents of the "top" location in the System Stack are copied (pulled) to the PC and SR.
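The push and pull behavior described above can be modeled in a few lines. This is a sketch under stated assumptions: the class and method names are hypothetical, each stack location is modeled as a (PC, SR) pair, and the exact PC value saved by the hardware (for example, the return address) is left to the caller.

```python
class SystemStack:
    DEPTH = 15  # locations, each wide enough for a PC/SR pair

    def __init__(self):
        self._entries = []

    def push(self, pc, sr):
        if len(self._entries) >= self.DEPTH:
            raise OverflowError("system stack is full")
        self._entries.append((pc, sr))

    def pull(self):
        if not self._entries:
            raise IndexError("system stack is empty")
        return self._entries.pop()


class Core:
    def __init__(self):
        self.pc = 0
        self.sr = 0
        self.stack = SystemStack()

    def call_or_long_interrupt(self, target):
        # PC and SR are both pushed on the top location.
        self.stack.push(self.pc, self.sr)
        self.pc = target

    def return_from_subroutine(self):
        # Only the PC is restored; the SR keeps its current value.
        self.pc, _ = self.stack.pull()

    def return_from_interrupt(self):
        # Both the PC and the SR are restored.
        self.pc, self.sr = self.stack.pull()
```

The difference between the two returns is the point of the sketch: a subroutine may deliberately change the status register for its caller, while an interrupt routine must leave the interrupted context untouched.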
An interrupt will cause the processor to enter the exception processing state. Upon entering this state, the current instruction in decode will execute normally, unless it is the first word of a two-word instruction, in which case it will be aborted, and re-fetched at the completion of exception processing. The next two fetch addresses are supplied by the interrupt controller. During these fetches the PC is not updated.
If one of the words fetched by the interrupt controller is a jump to subroutine, a long interrupt routine is formed, and a context switch is performed using the stack. If neither interrupt instruction word causes a change of control flow, then the two interrupt instructions fetched constitute a fast interrupt routine. In this case, the stack is not used, and interrupt service concludes with the execution of the instructions contained within the two words. Fetching then resumes using the PC. The fast interrupt routine provides minimum overhead exception processing. This mechanism is commonly used to move data between memory and an I/O device.
For more details on the behavior of interrupts, see Section 8.
The system stack is also used to implement no-overhead hardware program loops. When a program loop is initiated with the execution of a DO instruction, the following events occur:
the current 32-bit loop counter (LC) and 32-bit loop address register (LA) are pushed onto the system stack to allow nested loops.
the LC and LA registers are initialized with values specified in the DO instruction.
the address of the first instruction in the program loop and the current status register contents are transferred onto the system stack.
the loop flag bit in the status register is set.
The loop flag bit is set when a program loop is in progress and enables the end of loop detection (comparison between the PC and LA registers, discussed below). The loop flag bit is pulled from the system stack when a loop is terminated and indicates if the terminated loop was a nested loop.
A program loop begins execution after the DO instruction and continues until the program address fetched equals the loop address register contents (last address of program loop). The contents of the loop counter are then tested for one. If the loop counter is not one, the loop counter is decremented and the top location in the stack RAM is read (but not pulled) into the PC to return to the start of the loop. If the loop counter is one, the program loop is terminated by incrementing the PC, reading the previous loop flag bit from the top location in the stack into the status register, purging the stack (pulling the top location and discarding the contents) and pulling the LA and LC registers off the stack and restoring the respective registers. When terminating a loop the loop flag, LA and LC registers as well as the system stack pointer are restored.
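The DO loop mechanism described above can be followed step by step in a small behavioral model. It is only a sketch: the class name `LoopHardware` and its methods are hypothetical, the DO instruction is assumed to occupy one program word, and of the status register only the loop flag is modeled.

```python
class LoopHardware:
    STACK_DEPTH = 15

    def __init__(self):
        self.pc = 0
        self.lc = 0            # loop counter
        self.la = 0            # loop address (last address of the loop)
        self.loop_flag = False
        self.stack = []        # each entry models one stack location

    def _push(self, first, second):
        if len(self.stack) >= self.STACK_DEPTH:
            raise OverflowError("system stack is full")
        self.stack.append((first, second))

    def do(self, count, last_address):
        """Model a DO instruction located at the current PC."""
        self._push(self.lc, self.la)             # save the outer loop registers
        self.lc, self.la = count, last_address   # load the new loop parameters
        self._push(self.pc + 1, self.loop_flag)  # loop start address and status
        self.loop_flag = True
        self.pc += 1

    def fetch_next(self):
        """Move past the instruction at the PC, applying end-of-loop detection."""
        if self.loop_flag and self.pc == self.la:
            if self.lc != 1:
                self.lc -= 1
                self.pc = self.stack[-1][0]      # top of stack is read, not pulled
                return
            # Loop counter is one: terminate the loop.
            self.pc += 1
            _, self.loop_flag = self.stack.pop()  # restore previous loop flag, purge
            self.lc, self.la = self.stack.pop()   # restore outer LC and LA
            return
        self.pc += 1
```

Running a three-pass loop over addresses 11 and 12 with this model visits 11, 12 three times and then continues at address 13 with the stack empty again.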
3.2.8 Program Memory
The Program Memory consists of a 1,024 location by 32-bit RAM. Addresses are received from the program control logic (usually the PC). The Program Memory may contain instructions, constants, and data tables which are fixed at assembly time. The Program Memory is a dual-access memory in the sense that it may be accessed twice during a cycle: once by the core and once by the DMA. Program Memory may be expanded off-chip. Program RAM may be written to download instructions. The bootstrap ROM also appears in Program Memory space during the bootstrap mode. See Section 9.
3.2.9 External Bus Interfaces
The DSP96002 has two identical external bus interfaces. Each bus interface has a 32-bit wide address bus and a 32-bit wide data bus, and may be used to access external Data Memory, Program Memory or I/O devices. Separate select lines control access to the memory spaces. A Port Select control register permits assigning sections of each memory space to each external bus interface port. Refer to Section 2 and Section 9 for a detailed description of the external bus interface.
3.2.10 Internal Bus Switch and Bit Manipulation Unit
The Internal Bus Switch performs data transfers from one internal bus to another.
The Bit Manipulation Unit performs bit manipulation operations on memory and register operands on the XDB, YDB, and GDB.
3.2.11 I/O Interfaces
The on-chip I/O interfaces are intended to minimize system chip count and "glue" logic in many DSP96002 applications. Each I/O interface has its own control, status and data registers and is treated as memory-mapped I/O by the DSP96002. Each interface has several dedicated interrupt vector addresses and control bits to enable/disable interrupts. This minimizes the overhead associated with servicing the device since each interrupt source has its own service routine.
The DSP96002 provides the following I/O interfaces: two identical 32-bit parallel Host MPU/DMA Interface peripherals, one connected to External Bus Interface A and the other to External Bus Interface B, and a two-channel DMA Controller.
3.2.11.1 Host Interfaces
The DSP96002 provides a Host MPU/DMA Interface for each of its external bus interface ports. Each Host Interface (HI) is an 8-, 16-, 24- or 32-bit wide parallel port which may be connected directly to the data bus of a host processor. The host processor may be any of a number of popular microcomputers or microprocessors, another DSP96002 or DMA hardware. The HI appears as a memory mapped peripheral occupying 16 words in the host processor address space. Separate transmit and receive data registers are double-buffered to allow the DSP96002 and host processor to efficiently transfer data at high speed. Host processor communication with the HI is accomplished using standard Host processor data move instructions and addressing modes. Handshake flags are provided for polled or interrupt-driven data transfers.
3.2.11.2 DMA Controller
The DMA Controller performs all the address storage and effective address calculations necessary to address the DMA source and destination operands. The DMA controller operates in parallel with other chip resources to minimize data or program transfer overhead. The DMA controller contains one Source Address Register, one Source Offset Register, one Source Modifier Register, one Destination Address Register, one Destination Offset Register and one Destination Modifier Register for each channel.
In addition there are two control registers per channel. The Transfer Count down counter, decremented after each transfer, contains the number of DMA transfers remaining to be done. The DMA Control/Status Register controls the DMA activities and contains the DMA status. All DMA registers are mapped into the X memory space. The AGU is shared by the DMA for the source and destination address calculations. The DMA addressing modes are: linear, bit reversed and modulo. For more details see Section 7.5.
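The three addressing modes named above can be illustrated with generic address-update functions. This sketch shows the general techniques only; it does not reproduce how the DSP96002 encodes the modes in its modifier registers, and all function names and parameters are assumptions made for the example.

```python
def linear_update(address, offset, bits=32):
    """Linear addressing: add the offset, wrapping at the address width."""
    return (address + offset) & ((1 << bits) - 1)


def modulo_update(address, offset, base, modulus):
    """Modulo addressing: stay inside the circular buffer
    [base, base + modulus)."""
    return base + (address - base + offset) % modulus


def bit_reversed_update(address, offset, bits):
    """Bit-reversed addressing via reverse-carry addition on the low
    `bits` bits: reverse both operands, add, then reverse the sum."""
    def reverse(value):
        result = 0
        for _ in range(bits):
            result = (result << 1) | (value & 1)
            value >>= 1
        return result

    mask = (1 << bits) - 1
    total = (reverse(address & mask) + reverse(offset & mask)) & mask
    return reverse(total)
```

With `bits=3` and an offset of 4 (half the buffer length), repeated bit-reversed updates starting from 0 step through 4, 2, 6, 1, 5, 3, 7, which is the access order used to unscramble an 8-point FFT.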
3.3 DATA ALU BLOCK DIAGRAM
The major components of the Data ALU are
Data ALU Register File
Multiply Unit
Adder Unit
Logic Unit
Format Converter
Divide and Square Root Unit
Controller and Arbitrator
A block diagram of the Data ALU architecture is shown in Figure 3-2.
D0, D1, D2, D3, D4, D5, D6, D7, D8 and D9 are 96-bit registers which serve as the Data ALU general purpose register file. Every register is divided into three portions: high, middle, and low, each 32-bits wide. The registers may be treated as ten 96-bit registers Dn (Dn.H:Dn.M:Dn.L), n=0,1,..,9 for floating-point source and/or destination operands. These floating point registers receive inputs from the Multiplier, the Adder, and the Subtracter and supply a source data register of the same form. Most Data ALU floating-point operations specify the 96-bit registers as source and/or destination operands. However, D8 and D9 are never destinations of a Data ALU operation.
The data is stored in the registers in double precision floating-point format. Each register may be read or written over the XDB or YDB as a floating-point operand. A format conversion is automatically performed when a Dn register is written with an operand of a different floating-point format. This can occur when writing Dn from the XDB or YDB as a result of a single precision floating-point MOVE. If a single precision operand is written to a floating point data register, the middle portion of the data register is written with the mantissa portion of the word operand, the low portion is zeroed and the high portion is written with the exponent portion of the word operand.
Figure 3-2. Data ALU Block Diagram Data ALU Register File (D0-D9)
The registers may also be treated as thirty 32-bit registers Dn.H, Dn.M, Dn.L, n=0,1,..,9. Each register may be read or written over the XDB or YDB as a word operand. When an individual 32-bit register is written over the XDB or YDB, no format conversion takes place and only the designated register is affected. The low portion of the registers, Dn.L, is used as source and/or destination for most integer operations. In this case the integer registers supply an operand for the Multiplier and the Adder/Subtracter while receiving an input from the Multiplier and the Adder/subtracter. Note that in the case of integer multiplication the result will be 64-bits wide and will be stored in both middle and low portions of the destination register.
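The 64-bit integer product can be illustrated with a short sketch. The assumption here, not stated explicitly in the text, is that the upper 32 bits of the product land in the middle portion and the lower 32 bits in the low portion; the function name is also hypothetical.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def integer_multiply(a, b):
    """Multiply two signed 32-bit integers and return the 64-bit
    two's-complement product as (middle, low) 32-bit words."""
    product = (a * b) & MASK64
    middle = (product >> 32) & MASK32  # assumed: upper half goes to Dn.M
    low = product & MASK32             # lower half goes to Dn.L
    return middle, low
```

For example, `integer_multiply(0x10000, 0x10000)` overflows 32 bits, so the result appears as a 1 in the middle word and 0 in the low word.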
3.3.1 Multiply Unit
The Multiplier is one of the two arithmetic processing units of the Data ALU and performs all the floating-point multiplications as well as signed/unsigned fixed-point (integer) multiplications on the data operands.