Motorola, Inc.
Semiconductor Products Sector
DSP Division
6501 William Cannon Drive, West
Austin, Texas 78735-8598
SECTION 1
DSP96002 INTRODUCTION
This manual describes the first member of a family of dual-port IEEE floating point programmable CMOS
processors. The family concept defines a core as the Data ALU, Address Generation Unit, Program Controller and associated Instruction Set. The On-Chip Program Memory, Data Memories and Peripherals support many numerically intensive applications and minimize system size and power dissipation; however,
they are not considered part of the core.
The first family member is the DSP96002. The main characteristics of the DSP96002 are support of IEEE
754 Single Precision (8 bit Exponent and 24 bit Mantissa) and Single Extended Precision (11 bit Exponent
and 32 bit Mantissa) Floating-Point and 32 bit signed and unsigned fixed point arithmetic, coupled with two
identical external memory expansion ports. Its features are listed below.
DSP96002 Features
•IEEE 754 Standard SP (32-bit) and SEP (44-bit) Arithmetic
•16.5 Million Instructions per Second (MIPS) with a 33 MHz clock
•49.5 Million Floating-Point Operations per Second (MFLOPS) peak with a 33 MHz clock
•Single-Cycle 32 x 32 Bit Parallel Multiplier
•Highly Parallel Instruction Set with Unique DSP Addressing Modes
•Nested Hardware Do Loops
•Fast Auto-Return Interrupts
•2 Independent On-Chip 512 x 32 Bit Data RAMs
•2 Independent On-Chip 1024 x 32 Bit Data ROMs
•Off-Chip Expansion to 2 x 2^32 32-Bit Words of Data Memory
•On-Chip 1,024 x 32 Bit Program RAM
•On-Chip 64 x 32 Bit Bootstrap ROM
•Off-Chip Expansion to 2^32 32-Bit Words of Program Memory
•Two Identical External Memory Expansion Ports
•Two 32-Bit Parallel Host MPU/DMA Interfaces
•On-Chip Two-Channel DMA Controller
•On-Chip Emulator
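The two throughput figures in the feature list follow from the clock rate. The sketch below takes the instruction rate as half the clock frequency, as the CLK pin description states, and infers a peak of three floating-point operations per instruction from the ratio of the two listed figures (49.5 / 16.5). That factor of three is an inference, not a quoted specification, and the function names are illustrative only.

```python
from fractions import Fraction

def mips(clock_mhz):
    """Instruction rate in millions per second: one instruction per two CLK periods."""
    return Fraction(clock_mhz) / 2

def peak_mflops(clock_mhz, flops_per_instruction=3):
    """Peak floating-point rate, assuming up to three floating-point operations
    per instruction (inferred from the listed 49.5 / 16.5 ratio)."""
    return mips(clock_mhz) * flops_per_instruction

print(float(mips(33)))         # 16.5
print(float(peak_mflops(33)))  # 49.5
```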
MOTOROLADSP96002 USER’S MANUAL1 - 1
SECTION 2
SIGNAL DESCRIPTION AND BUS OPERATION
2.1 PINOUT
The functional signal groups of the DSP96002 are shown in Figure 2-2, and are described in the following
sections. A pin allocation summary is shown in Figure 2-1. Specific pinout and timing information is available in the DSP96002 Technical Data Sheet (DSP96002/D).
2.1.1 Package
The DSP96002 is available in a 223 pin PGA package. There are 176 signal pins (including 5 spares), 17
power pins and 30 ground pins. All packaging information is available in the data sheet.
2.1.2 Interrupt And Mode Control (4 Pins)
RESET (Reset) - active low, Schmitt trigger input. RESET is internally synchronized to the input clock (CLK). When asserted, the chip is placed in the reset state and the internal phase generator is reset. The Schmitt trigger input allows a slowly rising input (such as a capacitor charging) to reliably reset the chip. If RESET is deasserted synchronous to the input clock (CLK), exact startup timing is guaranteed, allowing multiple processors to start up synchronously and operate together in "lock-step". When the RESET pin is deasserted, the initial chip operating mode is latched from the MODA, MODB and MODC pins.
MODA/IRQA (Mode Select A/External Interrupt Request A) - active low input, internally synchronized to the input clock (CLK). MODA/IRQA selects the initial chip operating mode during hardware reset and becomes a level sensitive or negative edge triggered, maskable interrupt request input during normal instruction processing. MODA, MODB and MODC select one of 8 initial chip operating modes, latched into the operating mode register (OMR) when the RESET pin is deasserted. If IRQA is asserted synchronous to the input clock (CLK), multiple processors can be resynchronized using the WAIT instruction and asserting IRQA to exit the wait state. If the processor is in the STOP standby state and IRQA is asserted, the processor will exit the STOP state.
CPU Pins                        Pins
Reset and IRQs                     4
Clock Input                        1
OnCE Port                          4
CPU Spare                          1
Quiet Power                        4
Quiet Ground                       4
CPU Subtotal                      18

Power/Ground Planes             Pins
Package Noisy Power Plane          2
Package Noisy Ground Plane         5
Package Quiet Power Plane          1
Package Quiet Ground Plane         1
Power/Ground Plane Subtotal        9

Port A/B                   Each Port   Both Ports
Data Bus                          32           64
Address Bus                       32           64
Data Power                         2            4
Data Ground                        4            8
Address Power                      2            4
Address Ground                     4            8
Addr/Data Subtotal                76          152

Port A/B                   Each Port   Both Ports
Bus Control Signals               17           34
Bus Control Spare                  2            4
Bus Control Power                  1            2
Bus Control Ground                 2            4
Control Subtotal                  22           44

Pinout Summary                  Pins
CPU Pins                          18
Package Power/Ground Planes        9
Port A/B Data and Address        152
Port A/B Bus Control              44
TOTAL                            223
Figure 2-1. DSP96002 Functional Group Pin Allocation
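As a cross-check of Figure 2-1 against the package description in paragraph 2.1.1, the sketch below re-adds the figure's rows. The grouping of rows into signal, power and ground pins is my own reading of the row names, and the variable names are illustrative.

```python
# Pin counts transcribed from Figure 2-1; per-port rows are doubled for Ports A and B.
cpu = {"reset_irq": 4, "clock": 1, "once": 4, "spare": 1,
       "quiet_power": 4, "quiet_ground": 4}
planes = {"noisy_power": 2, "noisy_ground": 5, "quiet_power": 1, "quiet_ground": 1}
addr_data = {"data": 32, "address": 32, "data_power": 2, "data_ground": 4,
             "addr_power": 2, "addr_ground": 4}
control = {"signals": 17, "spare": 2, "power": 1, "ground": 2}

def both_ports(group):
    """Total for the two identical ports."""
    return 2 * sum(group.values())

total = sum(cpu.values()) + sum(planes.values()) + both_ports(addr_data) + both_ports(control)

# Assumed classification of rows into signal, power and ground pins.
signal = (cpu["reset_irq"] + cpu["clock"] + cpu["once"] + cpu["spare"]
          + 2 * (addr_data["data"] + addr_data["address"])
          + 2 * (control["signals"] + control["spare"]))
power = (cpu["quiet_power"] + planes["noisy_power"] + planes["quiet_power"]
         + 2 * (addr_data["data_power"] + addr_data["addr_power"] + control["power"]))
ground = (cpu["quiet_ground"] + planes["noisy_ground"] + planes["quiet_ground"]
          + 2 * (addr_data["data_ground"] + addr_data["addr_ground"] + control["ground"]))

print(total, signal, power, ground)  # 223 176 17 30
```

The totals agree with the 223 pin package and its 176 signal, 17 power and 30 ground pins.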
MODB/IRQB (Mode Select B/External Interrupt Request B) - active low input, internally synchronized to the input clock (CLK). MODB/IRQB selects the initial chip operating mode during hardware reset and becomes a level sensitive or negative edge triggered, maskable interrupt request input during normal instruction processing. MODA, MODB and MODC select one of 8 initial chip operating modes, latched into the operating mode register (OMR) when the RESET pin is deasserted. If IRQB is asserted synchronous to the input clock (CLK), multiple processors can be resynchronized using the WAIT instruction and asserting IRQB to exit the wait state.
MODC/IRQC (Mode Select C/External Interrupt Request C) - active low input, internally synchronized to the input clock (CLK). MODC/IRQC selects the initial chip operating mode during hardware reset and becomes a level sensitive or negative edge triggered, maskable interrupt request input during normal instruction processing. MODA, MODB and MODC select one of 8 initial chip operating modes, latched into the operating mode register (OMR) when the RESET pin is deasserted. If IRQC is asserted synchronous to the input clock (CLK), multiple processors can be resynchronized using the WAIT instruction and asserting IRQC to exit the wait state.
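Three mode pins give 2^3 = 8 combinations, which is where the "one of 8 initial chip operating modes" comes from. The sketch below packs the three pin levels into a mode number. The bit order (MODC most significant, MODA least significant) is an assumption, since this section does not state it, and the function name is hypothetical.

```python
def operating_mode(modc, modb, moda):
    """Combine the three mode pin levels (0 or 1) into a mode number 0..7.

    Assumption: MODC is the most significant bit and MODA the least
    significant; the pin descriptions above do not specify the order.
    """
    for level in (modc, modb, moda):
        if level not in (0, 1):
            raise ValueError("pin levels must be 0 or 1")
    return (modc << 2) | (modb << 1) | moda

# Enumerate every pin combination to confirm there are exactly eight modes.
modes = sorted({operating_mode(c, b, a)
                for c in (0, 1) for b in (0, 1) for a in (0, 1)})
print(modes)  # [0, 1, 2, 3, 4, 5, 6, 7]
```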
2.1.3 Power and Clock (39 Pins)
CLK (Clock Input) - active high input, high frequency processor clock. Frequency is twice the instruction rate. An internal phase generator divides CLK into four phases (t0, t1, t2 and t3), which make up the basic instruction execution cycle. Additional tw phases are optionally generated to insert wait states (WS) into instruction execution. A wait state is formed by pairing a t2 and tw phase. CLK should be continuous with a 46-54% duty cycle.
[Timing diagram: CLK phases for a no wait state instruction (t0 t1 t2 t3) and a two wait state instruction (t0 t1 t2 tw t2 tw t2 t3, with the two WS periods marked)]
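The phase pattern in the timing diagram can be generated mechanically. The sketch below builds the phase list for a given number of wait states by inserting one (t2, tw) pair per wait state ahead of the final t2 and t3 phases, which reproduces the two wait state sequence shown. Counting two phases per CLK period is an assumption derived from "frequency is twice the instruction rate" with four phases per instruction; the function names are illustrative.

```python
def phase_sequence(wait_states=0):
    """Phase names for one instruction cycle with the given number of wait states.

    Each wait state contributes a (t2, tw) pair before the closing t2, t3 phases.
    """
    if wait_states < 0:
        raise ValueError("wait_states must be non-negative")
    return ["t0", "t1"] + ["t2", "tw"] * wait_states + ["t2", "t3"]

def clk_periods(wait_states=0):
    """CLK periods per instruction, assuming two phases per CLK period."""
    return len(phase_sequence(wait_states)) // 2

print(phase_sequence(0))  # ['t0', 't1', 't2', 't3']
print(phase_sequence(2))  # ['t0', 't1', 't2', 'tw', 't2', 'tw', 't2', 't3']
print(clk_periods(2))     # 4
```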
Quiet VCC (4) (Power) - isolated power for the CPU logic. Must be tied to all other chip power pins externally. User must provide adequate external decoupling capacitors.
Quiet VSS (4) (Ground) - isolated ground for the CPU logic. Must be tied to all other chip ground pins externally. User must provide adequate external decoupling capacitors.
Address Bus VCC (4) (Power) - isolated power for sections of address bus I/O drivers. Must be tied to all other chip power pins externally. User must provide adequate external decoupling capacitors.
Address Bus VSS (8) (Ground) - isolated ground for sections of address bus I/O drivers. Must be tied to all other chip ground pins externally. User must provide adequate external decoupling capacitors.
Data Bus VCC (4) (Power) - isolated power for sections of data bus I/O drivers. Must be tied to all other chip power pins externally. User must provide adequate external decoupling capacitors.
Data Bus VSS (8) (Ground) - isolated ground for sections of data bus I/O drivers. Must be tied to all other chip ground pins externally. User must provide adequate external decoupling capacitors.
Bus Control VCC (2) (Power) - isolated power for the bus control I/O drivers. Must be tied to all other chip power pins externally. User must provide adequate external decoupling capacitors.
Bus Control VSS (4) (Ground) - isolated ground for the bus control I/O drivers. Must be tied to all other chip ground pins externally. User must provide adequate external decoupling capacitors.
2.1.4 On-chip Emulator Interface (OnCE) (4 Pins)
DR (Debug Request) - The debug enable input provides a means of entering the debug mode of operation from the external command controller. This pin, when asserted, causes the DSP96002 to finish the current instruction being executed, save the instruction pipeline information, enter the debug mode and wait for commands to be entered from the debug serial input line.
DSCK/OS1 (Debug Serial Clock/Chip Status 1) - The DSCK/OS1 pin, when configured as an input, is the pin through which the serial clock is supplied to the OnCE. The serial clock provides pulses required to shift data into and out of the OnCE serial port. When configured as an output (not in Debug Mode), this pin, in conjunction with the OS0 pin, provides information about the chip status.
DSI/OS0 (Debug Serial Input/Chip Status 0) - The DSI/OS0 pin, when configured as an input, is the pin through which serial data or commands are provided to the OnCE controller. The data received on the DSI pin will be recognized only when the DSP96002 has entered the debug mode of operation. When configured as an output (not in Debug Mode), this pin, in conjunction with the OS1 pin, provides information about the chip status.
DSO (Debug Serial Output) - The debug serial output provides the data contained in one of the OnCE controller registers as specified by the last command received from the external command controller. When a trace or breakpoint occurs this line will be asserted for one T cycle to indicate that the chip has entered the debug mode and is waiting for commands.
2.1.5 Port A and Port B (162 Pins)
Port A and Port B are identical in pinout and function. The following pin descriptions apply to both ports.
Each port may be a bus master and each port has a host interface which can be accessed on demand.
The pins are specified for a 50 pF load and two external TTL loads. Derating curves will be provided specifying performance up to 250 pF capacitive loads.
A0-A31 (Address Bus) - three-state, active high outputs when a bus master. When not a bus master, A2-A5 are active high inputs, A0-A1 and A6-A31 are three-stated. As inputs, A2-A5 may change asynchronous relative to the input clock (CLK). A2-A5 are host interface address inputs which are used to select the host interface register. When a bus master, A0-A31 specify the address for external program and data memory accesses. If there is no external bus activity, A0-A31 remain at their previous values. When a bus master, the Address Enable (AE) input acts as an output enable control for A0-A31. When a bus master, A0-A31 are stable whenever the transfer strobe TS is asserted and may change only when TS is deasserted. A0-A31 are three-stated during hardware reset.
D0-D31 (Data Bus) - three-state, active high, bidirectional input/outputs when a bus master or not a bus master. The Data Enable (DE) input acts as an output enable control for D0-D31. As a bus master, the data lines are controlled by the CPU instruction execution or the DMA controller. D0-D31 are also the Host Interface data lines. If there is no external bus activity, D0-D31 are three-stated. D0-D31 are also three-stated during hardware reset.
S1, S0 (Space Select) - three-state, active low outputs when a bus master, three-stated when not a bus master. Timing is the same as the address lines A0-A31. S1 and S0 are three-stated during hardware reset.
These signals can be viewed in different ways, depending on how the external memories are mapped. They support the trend toward splitting memory spaces among ports and mapping multiple memory spaces into the same physical memory locations. Several examples are given in Figure 2-3. The encoding S1:S0=11 may be used to place external memories in their low power standby mode.
S1  S0  MEMORY SPACE
1   1   No access
1   0   P access
0   1   X access
0   0   Y access
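The S1:S0 encoding is a two-bit lookup. A minimal sketch follows, with the table transcribed from the encoding above; the function names are illustrative and not part of any Motorola tool.

```python
# Space select encoding, keyed by the (S1, S0) pin levels.
SPACE_SELECT = {
    (1, 1): "No access",
    (1, 0): "P access",
    (0, 1): "X access",
    (0, 0): "Y access",
}

def decode_space(s1, s0):
    """Return the memory space selected by the S1 and S0 levels."""
    return SPACE_SELECT[(s1, s0)]

def memory_may_standby(s1, s0):
    """True for S1:S0 = 11, the encoding that may be used to place
    external memories in low power standby."""
    return (s1, s0) == (1, 1)

print(decode_space(1, 0))        # P access
print(memory_may_standby(1, 1))  # True
```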
[Figure 2-3. Program and Data Memory Select Encoding: table giving the S1 function and S0 function for each external memory mapping (P only; X only; Y only; X and Y mapped as 1 or 2 spaces; P and X mapped as 2 spaces; P and Y mapped as 1 space; P, X, and Y mapped as 1 space)]
R/W (Read/Write) - three-state, active low output when a bus master, active low input when not a bus master. Bus master timing is the same as the DSP96002 address lines, giving an "early write" signal for DRAM interfacing. R/W is high for a read access and is low for a write access. The R/W pin is also the Host Interface read/write input. As an input, R/W may change asynchronous relative to the input clock. R/W goes high if the external bus is not used during an instruction cycle. R/W is three-stated during hardware reset.
BS (Bus Strobe) - three-state, active low output when a bus master, three-stated when not a bus master. Asserted at the start of a bus cycle (providing an "early bus start" signal for DRAM interfacing) and deasserted at the end of the bus cycle. The early negation provides an "early bus end" signal useful for external bus control. If the external bus is not used during an instruction cycle, BS remains deasserted until the next external bus cycle. BS is three-stated during hardware reset.
TT (Transfer Type) - three-state, active low output when a bus master, three-stated when not a bus master. When a bus master, TT is controlled by an on-chip page circuit (see Section seven). TT is asserted when a fast access memory mode (page, static column, nibble or serial shift register) is detected. If the external bus is not used during an instruction cycle or a fault is detected by the page circuit during an external access, TT remains deasserted. The parameters of the page circuit fault detection are user programmable. TT is three-stated during hardware reset.
TS (Transfer Strobe) - three-state, active low output when a bus master, active low input when not a bus master. When a bus master, TS is asserted to indicate that the address lines A0-A31, S1, S0, BS, BL and R/W are stable and that a bus read or bus write transfer is taking place. During a read cycle, input data is latched inside the DSP96002 on the rising edge of TS. During a write cycle, output data is placed on the data bus after TS is asserted. Therefore TS can be used as an output enable control for external data bus buffers if they are present. If the external bus is not used during an instruction cycle, TS remains deasserted until the next external bus cycle. An external flip-flop can delay TS if required for slow devices or more address decoding time. The TS pin is also the Host Interface transfer strobe input used to enable the data bus output drivers during host read operations and to latch data inside the Host Interface during host write operations. As an input, TS may change asynchronous relative to the input clock. Write data is latched inside the Host Interface on the rising edge of TS. TS is three-stated during hardware reset.
When a bus master, the combination of BS and TS can be decoded externally to determine the status of the current bus cycle and to generate hardware strobes useful for latching address and data signals. The encoding is shown in Figure 2-4.
TA (Transfer Acknowledge) - active low input. If the DSP96002 is the bus master and there is no external bus activity, or if the DSP96002 is not the bus master, the TA input is ignored by the core. The TA input is a synchronous "DTACK" function which can extend an external bus cycle indefinitely. TA must be asserted and deasserted synchronous to the input clock (CLK) for proper operation. TA is sampled on the falling edge of the input clock (CLK). Any number of wait states (0, 1, 2, ..., infinity) may be inserted by keeping TA deasserted. In typical operation, TA is deasserted at the start of a bus cycle, is asserted to enable completion of the bus cycle and is deasserted before the next bus cycle. The current bus cycle completes one clock period after TA is asserted synchronous to CLK. The number of wait states is determined by the TA input or by the Bus Control Register (BCR), whichever is longer. The BCR can be used to set the minimum number of wait states in external bus cycles. If TA is tied low (asserted) and no wait states are specified in the BCR, zero wait states will be inserted into external bus cycles.
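The "whichever is longer" rule amounts to taking a maximum. The sketch below models it with two plain integers: the wait state count programmed in the BCR and the number of wait states the external device requests by holding TA deasserted. Both parameter names are illustrative, and the model ignores the exact sampling edges.

```python
def effective_wait_states(bcr_wait_states, ta_wait_states):
    """Wait states actually inserted into an external bus cycle.

    bcr_wait_states: minimum count programmed in the Bus Control Register.
    ta_wait_states:  count requested by the external device through TA
                     (0 when TA is tied low).
    """
    if bcr_wait_states < 0 or ta_wait_states < 0:
        raise ValueError("wait state counts must be non-negative")
    return max(bcr_wait_states, ta_wait_states)

print(effective_wait_states(0, 0))  # 0  (TA tied low, nothing in the BCR)
print(effective_wait_states(2, 0))  # 2  (BCR sets the minimum)
print(effective_wait_states(1, 5))  # 5  (slow device stretches the cycle)
```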
AE (Address Enable) - active low input, must be asserted and deasserted synchronous to the input clock (CLK) for proper operation. If a bus master, AE is asserted to enable the A0-A31 address output drivers. If AE is deasserted, the address output drivers are three-stated. If not a bus master, the address output drivers are three-stated regardless of whether AE is asserted or deasserted. The function of AE is to allow multiplexed bus systems to be implemented. Examples are a multiplexed address/data bus such as the NuBus used in the Macintosh II or a multiplexed address1/address2 bus used with dual port memories such as dynamic VRAMs. Note that there must be at least one undriven CLK period between enables for multiplexed buses to allow one bus to three-state before another bus is enabled. External control is responsible for this timing. For non-multiplexed systems, AE should be tied low.
DE (Data Enable) - active low input, must be asserted and deasserted synchronous to the input clock (CLK) for proper operation. If a bus master or the Host interface is being read, DE is asserted to enable the D0-D31 data bus output drivers. If DE is deasserted, the data bus output drivers are three-stated. If not a bus master, the data bus output drivers are three-stated regardless of whether DE is asserted or deasserted. Read-only bus cycles may be performed even though DE is deasserted. The function of DE is to allow multiplexed bus systems to be implemented. Examples are a multiplexed address/data bus such as the NuBus used in the Macintosh II or a multiplexed data1/data2 bus used for long word transfers with one 32 bit wide memory. Note that there must be at least one undriven CLK period between enables for multiplexed buses to allow one bus to three-state before another bus is enabled. External control is responsible for this timing. For non-multiplexed systems, DE should be tied low.
HS (Host Select) - active low input, may change asynchronous to the input clock. HS is asserted low to enable selection of the Host Interface functions by the address lines A2-A5. If TS is asserted when HS is asserted, a data transfer will take place with the Host Interface. Note that both HS and HA must be tied high to disable the Host Interface. When HA is asserted, HS is ignored.
HA (Host Acknowledge) - active low input, may change asynchronous to the input clock. HA is used to acknowledge either an interrupt request or a DMA request to the host interface. When the host interface is not in DMA mode, asserting TS when HA and HR are asserted will enable the contents of the host interface interrupt vector register (IVR) onto the data bus outputs D0-D31. This provides an interrupt acknowledge capability compatible with MC68000 family processors.
If the host interface is in DMA mode, HA is used as a DMA transfer acknowledge input and it is asserted by an external device to transfer data between the Host Interface registers and an external device. In DMA read mode, HA is asserted to read the Host Interface RX register on the data bus outputs D0-D31. In DMA write mode, HA is asserted to strobe external data into the Host Interface TX register. Write data is latched into the TX register on the rising edge of HA.
NuBus is a trademark of Texas Instruments, Inc.
Macintosh II is a trademark of Apple Computer, Inc.
HR (Host Request) - active low output, never three-stated. The host request HR is asserted to indicate that the host interface is requesting service - either an interrupt request or a DMA request - from an external device.
The HR output may be connected to interrupt request input IRQA, IRQB, or IRQC of another DSP96002. The DSP96002 on-chip DMA Controller channel can select the interrupt request input as a DMA transfer request input.
BR (Bus Request) - active low output, never three-stated. BR is asserted when the CPU or DMA is requesting bus mastership. BR is deasserted when the CPU or DMA no longer needs the bus. BR may be asserted or deasserted independent of whether the DSP96002 is a bus master or a bus slave. Bus "parking" allows BR to be deasserted even though the DSP96002 is the bus master. See the description of bus "parking" in the BA pin description. The RH bit in the Bus Control Register (see Section seven) allows BR to be asserted under software control even though the CPU or DMA does not need the bus. BR is typically sent to an external bus arbitrator which controls the priority, parking and tenure of each DSP96002 on the same external bus. BR is only affected by CPU or DMA requests for the external bus, never for the internal bus. During hardware reset, BR is deasserted and the arbitration is reset to the bus slave state.
BG (Bus Grant) - active low input. BG must be asserted/deasserted synchronous to the input clock (CLK) for proper operation. BG is asserted by an external bus arbitration circuit when the DSP96002 may become the next bus master. When BG is asserted, the DSP96002 must wait until BB is deasserted before taking bus mastership. When BG is deasserted, bus mastership is typically given up at the end of the current bus cycle. This may occur in the middle of an instruction which requires more than one external bus cycle for execution. Note that indivisible read-modify-write instructions (BSET, BCLR, BCHG) will not give up bus mastership until the end of the current instruction. BG is ignored during hardware reset.
BA (Bus Acknowledge) - Open drain, active low output. When deasserting BA, the DSP96002 drives BA high during half a CLK cycle and then disables the active pull-up. In this way, only a weak external pull-up resistor is required to hold the line high. BA may be directly connected to BB in order to obtain the same functionality as the MC68040 BB pin. When BG is asserted, the DSP96002 becomes the pending bus master. It waits until BB is negated by the previous bus master, indicating that the previous bus master is off the bus. The pending bus master asserts BA to become the current bus master. BA is asserted when the CPU or DMA has taken the bus and is the bus master. While BA is asserted, the DSP96002 is the owner of the bus (the bus master). When BA is negated, the DSP96002 is a bus slave. BA may be used as a three-state enable control for external address, data and bus control signal buffers. BA is three-stated during hardware reset.
Note that a current bus master may keep BA asserted after ceasing bus activity, regardless of whether BR is asserted or deasserted. This is called "bus parking" and allows the current bus master to use the bus repeatedly without re-arbitration until some other device wants the bus.
The current bus master keeps BA asserted during indivisible read-modify-write bus cycles, regardless of whether BG has been deasserted by the external bus arbitration unit. This form of "bus locking" allows the current bus master to perform atomic operations on shared variables in multitasking and multiprocessor systems. Current instructions which perform indivisible read-modify-write bus cycles are BCLR, BCHG and BSET.
BB (Bus Busy) - active low input, must be asserted and deasserted synchronous to the input clock (CLK) for proper operation. BB is deasserted when there is no bus master on the external bus. In multiple DSP96002 systems, all BB inputs are tied together and are driven by the logical AND of all BA outputs. BB is asserted by a pending bus master (directly or indirectly by BA assertion) to indicate that it is now the current bus master. BB is deasserted by the current bus master (directly or indirectly by BA negation) to indicate that it is off the bus and is no longer the bus master. The pending bus master monitors the BB signal until it is deasserted. Then the pending bus master asserts BA to become the current bus master, which asserts BB directly or indirectly.
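Because BA and BB are active low, the "logical AND of all BA outputs" pulls the shared BB line low whenever any one device asserts its BA. The sketch below shows the same relationship in terms of electrical levels and in terms of assertion; pin levels are modelled as 0 (low) and 1 (high), and the function names are illustrative.

```python
def bus_busy_level(ba_levels):
    """Electrical level of the shared BB line (0 = asserted, 1 = deasserted),
    formed as the logical AND of all BA output levels."""
    return int(all(level == 1 for level in ba_levels))

def bus_busy_asserted(ba_levels):
    """The same information in assertion terms: BB is asserted when any BA is."""
    return any(level == 0 for level in ba_levels)

print(bus_busy_level([1, 1, 1]))     # 1  (no bus master, BB deasserted)
print(bus_busy_level([1, 0, 1]))     # 0  (second device holds the bus)
print(bus_busy_asserted([1, 0, 1]))  # True
```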
BL (Bus Lock) - active low output, never three-stated. Asserted at the start of an external indivisible Read-Modify-Write (RMW) bus cycle (providing an "early bus start" signal for DRAM interfacing) and deasserted at the end of the write bus cycle. BL remains asserted between the read and write bus cycles of the RMW bus sequence. BL can be used to indicate that special memory timing (such as RMW timing for DRAMs) may be used or to "resource lock" an external multi-port memory for secure semaphore updates. The early negation provides an "early bus end" signal useful for external bus control. If the external bus is not used during an instruction cycle, BL remains deasserted until the next external indivisible RMW bus cycle. BL also remains deasserted if the external bus cycle is not an indivisible RMW bus cycle or if there is an internal RMW bus cycle. The only instructions which automatically assert BL are a BSET, BCLR or BCHG instruction which accesses external memory. BL can also be asserted by setting the LH bit in the BCR (see Section seven). BL is deasserted during hardware reset.
2.1.6 Reserved Pins
There are 5 spare pins reserved for future use.
2.2 BUS OPERATION
The external bus timing is defined by the operation of the Address Bus, Data Bus and Bus Control pins described in paragraph 2.1.5. The DSP96002 external ports are designed to interface with a wide variety of memory and peripheral devices, including high speed static RAMs, dynamic RAMs and video RAMs as well as slower memory devices. External bus timing is controlled by the TA control signal and by the Bus Control Registers (BCR), which are described in Section seven. Insertion of wait states is controlled by the BCR to provide constant bus access timing, and by TA to provide dynamic bus access timing. The number of wait states is determined by the TA input or by the BCR, whichever is longer.
2.2.1 Synchronous Bus Operation
A synchronous external bus cycle consists of at least 4 internal clock phases. See the DSP96002 Technical Data Sheet (DSP96002/D) for the specification of the internal clock phases. Each synchronous external memory access requires the following procedure:
1. The external memory address is defined by the Address Bus A0-A31 and the Memory Reference Select signals S1 and S0. These signals change in the first phase of the external bus cycle. The Memory Reference Select signals have the same timing as the Address Bus and may be used as additional address lines. The Address and Memory Reference signals are also used to generate chip select signals for the appropriate memory chips. These chip select signals change the memory chips from low power standby mode to active mode and begin the read access time. This allows slower memories to be used since the chip select signals are address-based rather than read or write enable-based.
2. When the Address and Memory Reference signals are stable, the data transfer is enabled by the Transfer Strobe TS signal. TS is asserted to "qualify" the Address and Memory Reference signals as stable and to perform the read or write data transfer. TS is asserted in the second phase of the bus cycle.
3. Wait states are inserted into the bus cycle controlled by a wait state counter or by TA, whichever is longer. The wait state counter is loaded from the Bus Control Register. If the wait state number determined by these two factors is zero, no wait states are inserted into the bus cycle and TS is deasserted in the fourth phase. If the wait state number determined is W, then W wait states are inserted into the instruction cycle. Each wait state introduces one Tc delay.
4. When the Transfer Strobe TS is deasserted at the end of a bus cycle, the data is latched in the destination device. At the end of a read cycle, the DSP96002 latches the data internally. At the end of a write cycle, the external memory latches the data. The Address signals remain stable until the first phase of the next external bus cycle to minimize power dissipation. The Memory Reference signals S1 and S0 are deasserted during periods of no bus activity and the data signals are three-stated.
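The four steps above can be sketched as a per-phase timeline of the TS signal. This is a simplified model under stated assumptions: each wait state adds two phases (one t2/tw pair, as in the CLK description), the extra phases fall between the second and the last phase, and TS stays asserted throughout them. The function name and the tuple layout are illustrative.

```python
def bus_cycle(wait_states=0):
    """Return (phase_number, ts_asserted) pairs for one synchronous bus cycle.

    Phase 1: address and S1/S0 change, TS still deasserted.
    Phase 2 onward: TS asserted (transfer qualified).
    Last phase: TS deasserted, data latched in the destination device.
    """
    if wait_states < 0:
        raise ValueError("wait_states must be non-negative")
    n_phases = 4 + 2 * wait_states  # assumption: two phases per wait state
    timeline = []
    for phase in range(1, n_phases + 1):
        ts_asserted = 1 < phase < n_phases
        timeline.append((phase, ts_asserted))
    return timeline

print(bus_cycle(0))  # [(1, False), (2, True), (3, True), (4, False)]
print(len(bus_cycle(2)))  # 8
```

With zero wait states the model reproduces the text: TS is asserted in the second phase and deasserted in the fourth.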
2.2.2 Static RAM Support
Static RAM devices can be easily interfaced to the DSP96002 bus timing. There are two basic techniques - CS controlled writes and WE controlled writes.
2.2.2.1 CS Controlled Writes
This form of static interface uses the memory chip select (CS) as the write strobe. The DSP96002 R/W signal is used as an early read/write direction indication. Proper data buffer enable control on RAMs without a separate output enable (OE) input must use this form to avoid multiple data buffers colliding on the data bus. The interface schematic is shown in Figure 2-5.
[Figure 2-5. CS Controlled Writes Interface to Static RAM: schematic showing the DSP96002 signals (TS, R/W) and the static RAM inputs (CS, WE)]
The disadvantage of this technique is that access time is measured from TS instead of from the address or BS. Hence faster memories are required.
2.2.2.2 WE Controlled Writes
This form of static interface uses the memory write enable (WE) as the write strobe. The DSP96002 R/W signal is used to form a late read/write indication by gating it with TS. This form is the one used by the 56000/1 bus interface. Proper data buffer enable control requires a separate output enable (OE) input on the memory to avoid multiple data buffers colliding on the data bus. The interface schematic is shown in Figure 2-6.
[Figure 2-6. WE Controlled Writes Interface To Static RAM: schematic showing the DSP96002 signals (including S1 or S0 and R/W) and the static RAM inputs (including CS and OE)]
The advantage of this technique is that access time is measured from S1, S0 or addresses instead of TS. Hence slower memories can be used. The disadvantage of this technique is that the write data hold time will be shortened because the WE signal is delayed by the OR gate.
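The gating described above can be illustrated with a truth table. The sketch assumes the OR gate combines the R/W and TS pin levels directly, so that the memory WE input goes low only while R/W is low (write) and TS is low (asserted). The exact wiring of Figure 2-6 is not reproduced here, so treat this as one plausible reading; the function name is illustrative.

```python
def write_enable_level(rw_level, ts_level):
    """Level of the memory WE input (0 = asserted) formed by OR-ing the
    R/W level (0 = write) with the TS level (0 = asserted).

    Assumption: a plain two-input OR gate, as suggested by the text.
    """
    if rw_level not in (0, 1) or ts_level not in (0, 1):
        raise ValueError("pin levels must be 0 or 1")
    return rw_level | ts_level

# Print the full truth table: only a write with TS asserted drives WE low.
for rw in (0, 1):
    for ts in (0, 1):
        print(rw, ts, write_enable_level(rw, ts))
```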
2.2.3 Dynamic RAM and Video RAM Support
Modern dynamic memory (DRAM) and video memory (VRAM) are becoming the preferred choice for a wide variety of computing systems based on:
1. Cost per bit due to dynamic storage cell density.
2. Packaging density due to multiplexed address and control pins.
3. Improved performance relative to static RAMs due to fast access modes (page, static column, nibble and serial shift (VRAM)).
4. Commodity pricing due to high volume production.
The Port A/B bus control signals are designed for efficient interface to DRAM/VRAM devices in both random read/write cycles and fast access modes such as those listed above. The bus control signal timing is specified relative to the external clock (CLK) to enable synchronous control by an external state machine. An on-chip page circuit controls the TT pin, indicating to the external state machine when a slow or fast access is being made. The page circuit operation and programming is described in Section seven.
2.3 BUS HANDSHAKE AND ARBITRATION
Bus transactions are governed by a single bus master. Bus arbitration determines which device becomes
the bus master. The arbitration logic implementation is system dependent, but must result in at most one
device becoming the bus master (even if multiple devices request bus ownership). The arbitration signals
permit simple implementation of a variety of bus arbitration schemes (e.g. fairness, priority, etc.). External
logic must be provided by the system designer to implement the arbitration scheme.
4.11.1 Bus Arbitration Signals
Four signals are provided for bus arbitration. Three of them are local arbitration signals and
one is a system arbitration signal. The local arbitration signals run between a potential bus master and the
arbitration logic. The local signals are B̅R̅, B̅G̅, and B̅A̅; B̅B̅ is a system arbitration signal.
These signals are described below.
B̅R̅ Bus Request - Asserted by the requesting device to indicate that it wants to use the bus,
and held asserted until the device no longer needs the bus. This includes time when it is the
bus master as well as when it is not the bus master.
B̅G̅ Bus Grant - Asserted by the bus arbitration controller to signal the requesting device that
it is the bus master elect. B̅G̅ is valid only when the bus is not busy (see the Bus Busy signal
described below).
B̅A̅ Bus Acknowledge - Asserted by the device (bus master) that received the bus ownership
from the bus arbitration controller. The master holds B̅A̅ asserted for the duration of its bus possession. B̅A̅ indicates whether the device is a bus master or a bus
slave. When asserted, B̅A̅ indicates that the device is the bus master. B̅A̅ may
be used as a three-state enable control for external address, data and bus control signal
buffers.
B̅B̅ Bus Busy - The system arbitration signal B̅B̅ is monitored by all potential bus masters
and is derived from the local bus signal B̅A̅. This signal controls the hand-over of
bus ownership by the bus master at the end of bus possession. Typically B̅B̅ is the
wired-OR of all bus acknowledgments. B̅B̅ is asserted if the Bus Acknowledge signal
is asserted by the bus master.
4.11.2 The Arbitration Protocol
The bus is arbitrated by a central bus arbitrator, using individual request/grant lines to each bus master.
The arbitration protocol can operate in parallel with bus transfer activity so that the bus hand-over can be
made without much performance penalty.
The arbitration sequence occurs as follows:
1. All candidates for bus ownership assert their respective B̅R̅ signals as soon as they need
the bus.
2. The arbitration logic designates a bus master-elect by asserting the B̅G̅ signal for that device.
3. The master-elect tests B̅B̅ to ensure that the previous master has relinquished the bus.
If B̅B̅ is deasserted, then the master-elect asserts B̅A̅, which designates the device as
the new bus master. If a higher priority bus request occurs before the B̅B̅ signal is
deasserted, then the arbitration logic may replace the current master-elect with the higher
priority candidate. However, at most one B̅G̅ signal may be asserted at any one time.
4. The new bus master begins its bus transfers after the assertion of B̅A̅.
5. The arbitration logic may signal the current bus master to relinquish the bus by deasserting
B̅G̅ at any time. A DSP96002 bus master releases its ownership (deasserts B̅A̅) after
completing the current external bus access. If an instruction is executing a Read-Modify-Write
external access, a DSP96002 master asserts the B̅L̅ signal and will only relinquish
the bus (and deassert B̅L̅) after completing the entire Read-Modify-Write sequence.
When the current bus master deasserts B̅A̅, the B̅B̅ signal must also be deasserted
because the next bus master-elect has received its B̅G̅ signal and is waiting for B̅B̅ to
be deasserted before claiming ownership.
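The sequence above can be sketched as a small simulation. This is a minimal illustration, not the manual's arbitration logic: the fixed-priority policy, the device names and the helper functions arbitrate, bus_busy and step are all assumptions, and a master that loses its grant is assumed to finish its current access within the same step.

```python
def arbitrate(requests, priority):
    """Return the set of devices granted the bus (never more than one).

    requests : dict of device name -> True if its bus request is asserted
    priority : list of device names, highest priority first (assumed policy)
    """
    for dev in priority:
        if requests.get(dev, False):
            return {dev}
    return set()


def bus_busy(acks):
    """Bus busy modelled as the wired-OR of all bus acknowledge signals."""
    return any(acks.values())


def step(requests, acks, priority):
    """Advance one arbitration step and return (grants, new_acks)."""
    grants = arbitrate(requests, priority)
    new_acks = dict(acks)
    # A current master that has lost its grant releases the bus.
    for dev, owns in acks.items():
        if owns and dev not in grants:
            new_acks[dev] = False
    # The master-elect claims the bus only if it was not busy.
    for dev in grants:
        if not bus_busy(acks):
            new_acks[dev] = True
    return grants, new_acks


priority = ["dsp_a", "dsp_b"]
acks = {"dsp_a": False, "dsp_b": False}
requests = {"dsp_a": False, "dsp_b": True}
grants, acks = step(requests, acks, priority)  # dsp_b becomes bus master
requests["dsp_a"] = True
grants, acks = step(requests, acks, priority)  # dsp_b loses its grant and releases
grants, acks = step(requests, acks, priority)  # dsp_a claims the free bus
print(grants, acks)
```

The hand-over takes two steps because the master-elect must first observe the bus as not busy, which mirrors step 3 of the sequence.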
The DSP96002 has two control bits and one status bit, located in the Bus Control Registers (see Section 7),
to permit software control of the B̅R̅ and B̅L̅ signals, and to verify when the chip is the bus master.
If the RH bit in the BCR is cleared, the DSP96002 asserts its B̅R̅ signal only as long as requests
for bus transfers are pending or being attempted. If the RH bit is set, B̅R̅ will remain asserted. If the
LH bit in the BCR is cleared, the DSP96002 asserts its B̅L̅ signal only during a read-modify-write
bus access. If the LH bit is set, B̅L̅ will remain asserted.
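The effect of the RH and LH bits can be summarized in two one-line predicates. This is a sketch under stated assumptions: True means "asserted", and the function and parameter names are hypothetical rather than register or signal names from the manual.

```python
def bus_request_asserted(rh: bool, transfers_pending: bool) -> bool:
    """Bus request is held while RH is set, otherwise only while transfers are pending."""
    return rh or transfers_pending


def bus_lock_asserted(lh: bool, in_rmw_access: bool) -> bool:
    """Bus lock is held while LH is set, otherwise only during a read-modify-write access."""
    return lh or in_rmw_access


print(bus_request_asserted(rh=False, transfers_pending=True))
print(bus_lock_asserted(lh=True, in_rmw_access=False))
```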
5.16.1 Arbitration Scheme
The bus arbitration scheme is implementation dependent. The diagram in Figure 2-7 illustrates a common
method of implementing the bus arbitration scheme. The arbitration logic determines the device priorities
and assigns bus ownership depending on those priorities.
An implementation of a bus arbitration scheme may hold B̅G̅ asserted, for example, to the current bus
owner if none of the other devices are requesting the bus. As a consequence, the current bus master may
keep B̅A̅ asserted after ceasing bus activity, regardless of whether B̅R̅ is asserted or deasserted.
This situation is called "bus parking" and allows the current bus master to use the bus repeatedly without
re-arbitration until some other device requests the bus.
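One way an arbiter could implement parking is to leave the grant with the current owner whenever no request is pending. The following is a minimal sketch, assuming a fixed-priority policy; the function name arbitrate_with_parking and the device names are hypothetical.

```python
def arbitrate_with_parking(requests, priority, current_owner):
    """Return the device that receives the grant, or None.

    requests      : dict of device name -> True if its bus request is asserted
    priority      : list of device names, highest priority first (assumed policy)
    current_owner : device currently holding the bus, or None
    """
    for dev in priority:
        if requests.get(dev, False):
            return dev
    # Nobody is requesting: park the grant on the current owner.
    return current_owner


idle = {"dsp_a": False, "dsp_b": False}
print(arbitrate_with_parking(idle, ["dsp_a", "dsp_b"], "dsp_b"))
```

With no requests pending the grant stays with dsp_b, so dsp_b can start its next external access without re-arbitration.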
Figure 2-7. Bus Arbitration Scheme (diagram: two DSP96002 devices connected to the arbitration logic)
5.16.2 Bus Handshake Unit
The bus handshake unit in the DSP96002 is implemented within a finite state machine. It consists of two
external outputs (B̅R̅, B̅A̅), two external inputs (B̅G̅, B̅B̅) and three internal inputs
(ext_acc_req, end_of_sequence, RH) (see Figure 2-8). The ext_acc_req signal is asserted when one or
more requests for external bus access are pending, and remains asserted as long as the transfers are
being executed. The end_of_sequence signal is asserted at the last bus cycle of the current sequence.
Figure 2-8. Bus Handshake Unit (diagram: inputs ext_acc_req, end_of_sequence, Request Hold (RH), B̅G̅ and B̅B̅; outputs B̅R̅ and B̅A̅)
Figure 2-9. Bus Handshake State Diagram
States: IDLE (X): B̅R̅ = R̅H̅. REQUEST_BUS (Y): B̅R̅ = 0, B̅A̅ = 1. ACTIVE_MASTER (Z): B̅R̅ = 0, B̅A̅ = 0. PARKING_MASTER (W): B̅R̅ = R̅H̅.
Arcs: XX, XY, XZ, XW, YY, YZ, YX (illegal), YW (illegal), ZZ, ZY, ZX, ZW, WW, WX, WZ, WY (non-existent).
Likewise, when executing the read part of an RMW access, the end_of_sequence signal is deasserted.
This signal is used to give up bus ownership if B̅G̅ is deasserted during bus transfers. The state machine
which controls the bus handshake is illustrated in Figure 2-9.
Each transition arc is labeled by two letters which denote its source and destination states. The equations
of the transition arcs are described as follows:
WW = ^ext_acc_req & ^B̅G̅
Notes: 1. Illegal arcs in the DSP96002: once a bus request is pending, it will not be canceled
before the access is executed.
2. Non-existent arc: if ext_acc_req arrives together with the negation of B̅G̅, the device
becomes active master and begins its bus transfers.
3. D̅B̅G̅ is B̅G̅ delayed by one phase. This is done to provide a response to the
ext_acc_req signal when it is asserted in the same phase as the negation of B̅G̅.
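The four handshake states can be sketched as a next-state function. This is an illustrative model only: the state names and letters come from Figure 2-9, and the PARKING_MASTER self-loop follows the WW equation, but every other transition condition below is an assumption. The one-phase delay of the grant and arcs such as XW are not modelled, and all inputs are booleans where True means "asserted".

```python
from enum import Enum


class State(Enum):
    IDLE = "X"
    REQUEST_BUS = "Y"
    ACTIVE_MASTER = "Z"
    PARKING_MASTER = "W"


def next_state(state, ext_acc_req, end_of_sequence, bg, bb):
    """Return the next handshake state (assumed conditions, see lead-in)."""
    can_claim = bg and not bb  # granted and the bus is not busy
    if state is State.IDLE:
        if ext_acc_req:
            return State.ACTIVE_MASTER if can_claim else State.REQUEST_BUS
        return State.IDLE
    if state is State.REQUEST_BUS:
        # A pending request is never cancelled (arcs YX and YW are illegal).
        return State.ACTIVE_MASTER if can_claim else State.REQUEST_BUS
    if state is State.ACTIVE_MASTER:
        if not bg and end_of_sequence:
            # Grant withdrawn: give up the bus once the sequence is done.
            return State.REQUEST_BUS if ext_acc_req else State.IDLE
        if bg and not ext_acc_req:
            return State.PARKING_MASTER
        return State.ACTIVE_MASTER
    # PARKING_MASTER: a new request makes the device active at once (no WY arc).
    if ext_acc_req:
        return State.ACTIVE_MASTER
    # WW: stay parked while no request is pending and the grant is asserted.
    return State.PARKING_MASTER if bg else State.IDLE


state = State.IDLE
state = next_state(state, ext_acc_req=True, end_of_sequence=False, bg=True, bb=True)
print(state)  # bus busy: the request stays pending
state = next_state(state, ext_acc_req=True, end_of_sequence=False, bg=True, bb=False)
print(state)  # bus free: the device becomes active master
```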
5.16.3 Bus Arbitration Example Cases
5.16.3.1 Case 1 – Normal
The device requesting mastership asserts B̅R̅; the arbiter asserts the requesting device's B̅G̅, and
B̅B̅ is deasserted, indicating that the bus is not busy. The requesting device will assert B̅A̅.
5.16.3.2 Case 2 – Bus Busy
The device requesting mastership asserts B̅R̅; the arbiter responds by asserting the requesting device's B̅G̅; however, the bus is busy because B̅B̅ is asserted. The requesting device will not assert
B̅A̅ until B̅B̅ is deasserted.
5.16.3.3 Case 3 – Low Priority
The device requesting mastership asserts B̅R̅; the arbiter withholds the requesting device's
B̅G̅ because a higher priority device has requested the bus. B̅A̅ of the requesting device will not be
asserted.
5.16.3.4 Case 4 – Default
If a device does not request the bus and is in the idle state rather than the bus parking state, and the
arbiter, by design (i.e., by default), asserts B̅G̅, then B̅A̅ will remain deasserted.
5.16.3.5 Case 5 – Bus Lock during RMW
If the device requesting mastership asserts B̅R̅, the arbiter asserts the requesting device's B̅G̅,
and B̅B̅ is deasserted, then the requesting device will assert B̅A̅. If a read-modify-write (RMW) instruction which accesses external memory is being executed, and the bus arbiter deasserts B̅G̅, then
B̅A̅ will remain asserted until the entire RMW instruction completes execution. B̅A̅ will then be deasserted, thereby relinquishing the bus. Note that during external RMW instruction execution, B̅L̅ is asserted. In general, the B̅L̅ signal can be used to ensure that a multiport memory can only be written by one
master at a time. That is, referring to Figure 2-10, B̅L̅ can be input from DSP #1 to the memory controller,
which prevents T̅A̅ from being asserted by the controller (thereby suspending the memory access by
DSP #2) until DSP #1 completes its RMW access.
Figure 2-10. Bus Lock During RMW (diagram: DSP96002 #1, performing an RMW access, drives B̅L̅ to the dual port memory controller, which drives T̅A̅ to DSP96002 #2)
5.16.3.6 Case 6 – Bus Park
The device requesting mastership asserts B̅R̅; the arbiter asserts the requesting device's B̅G̅, and
B̅B̅ is deasserted, indicating that the bus is not busy. The requesting device will assert B̅A̅. When the
requesting device no longer requires the bus it will deassert B̅R̅; if the bus arbiter leaves B̅G̅ asserted because no other requests are pending, then B̅A̅ will remain asserted. This condition is called bus
parking and eliminates the need for the last bus master to rearbitrate for the bus during its next external
access.
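The six example cases can be condensed into one predicate that reports whether a device holds its bus acknowledge asserted. This is a simplified sketch: the function bus_acknowledge and its parameters are hypothetical, True means "asserted", and a master that loses its grant outside an RMW access is assumed to release the bus immediately rather than after the current access.

```python
def bus_acknowledge(requesting, granted, busy, owns_bus=False, rmw_in_progress=False):
    """Return True if the device asserts bus acknowledge."""
    if owns_bus:
        # A master keeps the bus while granted (parking) or until its RMW completes (lock).
        return granted or rmw_in_progress
    # A non-master claims the bus only when requesting, granted, and the bus is free.
    return requesting and granted and not busy


cases = {
    "1 normal":       bus_acknowledge(True, True, False),
    "2 bus busy":     bus_acknowledge(True, True, True),
    "3 low priority": bus_acknowledge(True, False, False),
    "4 default":      bus_acknowledge(False, True, False),
    "5 bus lock":     bus_acknowledge(True, False, True, owns_bus=True, rmw_in_progress=True),
    "6 bus park":     bus_acknowledge(False, True, True, owns_bus=True),
}
for name, asserted in cases.items():
    print(name, asserted)
```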
SECTION 3
CHIP ARCHITECTURE
3.1 INTRODUCTION
The DSP96002 is a 32-bit, highly parallel, multiple-bus IEEE floating-point processor. The architecture is designed to accommodate various IC family members with different memory and on-chip peripheral requirements while maintaining a standard programmable core. The overall chip architecture is
presented and detailed block diagrams of the Data ALU and Address Generation Unit (AGU) core architecture are described.
3.2 DSP96002 BLOCK DIAGRAM
The major components of the DSP96002 are
• Data Buses
• Address Buses
• Data ALU
• Address Generation Unit
• X Data Memory
• Y Data Memory
• Program Control and System Stack
• Program Memory
• Port A and Port B External Bus Interfaces
• Internal Bus Switch and Bit Manipulation Unit
• I/O Interfaces
An overall block diagram of the DSP96002 architecture is shown in Figure 3-1.
3.2.1 Data Buses
Data movement on the chip occurs over five bidirectional 32-bit buses: the X Data Bus (XDB), Y Data Bus
(YDB), Global Data Bus (GDB), DMA Data Bus (DDB) and Program Data Bus (PDB). The X and Y
data buses may also be treated by certain instructions as one 64-bit data bus by concatenation of XDB and
YDB. Data transfers between the Data ALU and the X Data Memory and Y Data Memory occur over the X
Data Bus and Y Data Bus. These are kept local on the chip to maximize speed and minimize power. The
direct memory access data transfers occur over the DMA Data Bus. Program memory data transfers and
instruction fetches occur over the Program Data Bus. All other data transfers occur over the Global Data
Bus.
Figure 3-1. DSP96002 Block Diagram
3.2.2 Address Buses
Addresses are specified for internal X Data Memory and Y Data Memory on two unidirectional 32-bit buses,
X Address Bus (XAB) and Y Address Bus (YAB). Internal address bus sizes depend on the amount of internal memory implemented. External memory spaces for each port, A and B, are addressed via a single
32-bit unidirectional address bus driven by a three input multiplexer that can select the X Address Bus
(XAB), the Y Address Bus (YAB) or the Program Address Bus (PAB). On-chip peripherals and the DMA
Controller are memory mapped in the internal X memory space. When zero wait state external memory is
used, one instruction cycle is needed for each external memory access.
The XAB, YAB and PAB are dual access buses in the sense that one instruction cycle contains two slots:
one slot is dedicated to on-chip DMA transfers and the other is used for core transfers.
3.2.3 Data ALU
The Data ALU performs all of the arithmetic and logical operations on data operands. The Data ALU consists of ten 96-bit general purpose registers, a 32-bit barrel shifter, a 32-bit adder, and a 32-bit parallel multiplier. Data ALU registers may be read or written over the XDB and YDB as 32 or 64-bit operands. The
Data ALU is capable of multiplication, addition, subtraction, format conversion, shifting and logical operations in one instruction cycle. Data ALU source operands may be 32 or 96-bits and originate from the general purpose register file. Data ALU results are always stored in one of the general purpose registers. Floating-point Data ALU operations always have a 96-bit result. Integer (fixed-point) Data ALU operations have
a 32 or 64-bit result.
The Data ALU fully implements the IEEE Standard 754 for binary floating-point arithmetic. The operations
are supported in three data formats: 32-bit two’s-complement fixed-point, 32-bit unsigned-magnitude fixed-point and 44-bit IEEE single extended precision floating-point. All the floating-point computations are performed using the single extended precision format and the results are automatically rounded to single precision or single extended precision numbers as programmed. All four IEEE rounding modes (round to zero,
round to nearest, round to plus infinity and round to minus infinity) are supported for all floating-point operations and conversions. The IEEE gradual underflow with denormalized numbers is supported by the IEEE
mode. In the IEEE mode, if input operand(s) or output result(s) are denormalized numbers, additional instruction cycles are required to process these numbers per the IEEE standard. A "Flush to Zero" mode is
also provided which forces all floating point result underflows to zero (all denormalized input operands are
considered as being zero). The Flush to Zero mode never requires any additional instruction cycles.
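The effect of a flush-to-zero policy can be illustrated on ordinary IEEE single precision values. This is a sketch, not the DSP96002 datapath: it only inspects the 32-bit single precision encoding, the function names are hypothetical, and keeping the sign of the flushed zero is an assumption.

```python
import math
import struct


def is_denormal_sp(x: float) -> bool:
    """True if x, rounded to IEEE single precision, is a denormalized number."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    return exponent == 0 and fraction != 0


def flush_to_zero_sp(x: float) -> float:
    """Replace a denormalized single precision value by zero (sign kept, by assumption)."""
    if is_denormal_sp(x):
        return math.copysign(0.0, x)
    return x


print(is_denormal_sp(1e-40), flush_to_zero_sp(1e-40))
print(is_denormal_sp(1.5), flush_to_zero_sp(1.5))
```

The value 1e-40 lies below the smallest normalized single precision number, so it is flushed, while 1.5 passes through unchanged.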
Refer to Section 3.3 for a detailed description of the Data ALU architecture.
3.2.4 AGU
The AGU performs all of the address storage and effective address calculations necessary to address data
operands in memory and it is used by both the core and the on-chip DMA Controller. The AGU operates in
parallel with other chip resources to minimize address generation overhead. The AGU contains eight Address Registers (R0-R7), eight Offset Registers (N0-N7), and eight Modifier Registers (M0-M7). The Address Registers are 32-bit registers which may contain any address or data. Each Address Register may
be accessed for output to the XAB, YAB, and PAB. The modifier and offset registers are 32-bit registers
which are normally used to control updating of the address registers.
AGU registers may be read or written over the Global Data Bus as 32-bit operands. The AGU can generate
two 32-bit addresses every instruction cycle, one on each of any two of the XAB, YAB and PAB. The AGU can directly address 4,294,967,296 locations on the XAB and 4,294,967,296 locations on the YAB, a total capability of 8,589,934,592 32-bit data words. Refer to Section 3.4 for a detailed description of the AGU architecture.
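The address space figures quoted above follow directly from the 32-bit address width, as this short check shows (the variable names are illustrative).

```python
ADDRESS_BITS = 32

per_bus = 2 ** ADDRESS_BITS   # locations addressable on one address bus
total = 2 * per_bus           # X and Y data spaces combined

print(f"{per_bus:,} locations per bus, {total:,} data words in total")
```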
3.2.5 X Data Memory
The X Data Memory may contain both data RAM and ROM. The X Data RAM is a 32-bit wide internal memory and occupies the lowest 512 locations in X Memory Space. The X Data ROM is also a 32-bit wide internal memory and occupies 1024 locations in X Memory Space. Addresses are received from the XAB
and data transfers occur on the XDB. The X memory is a dual-access memory in the sense that it may be
accessed twice during a cycle: once by the core and once by the DMA. X memory may be expanded off
chip.
3.2.6 Y Data Memory
The Y Data Memory may contain both data RAM and ROM. The Y Data RAM is a 32-bit wide internal memory and occupies the lowest 512 locations in Y Memory Space. The Y Data ROM is also a 32-bit wide internal memory and occupies 1024 locations in Y Memory Space. Addresses are received from the YAB
and data transfers occur on the YDB. The Y memory is dual-access memory in the sense that it may be
accessed twice during a cycle: once by the core and once by the DMA. Y memory may be expanded off
chip.
3.2.7 Program Control and System Stack
The Program Control logic performs instruction prefetch, instruction decoding and exception processing. A
32-bit program counter (PC) register can address 4,294,967,296 locations in Program Memory Space.
The System Stack is a separate internal RAM which stores the PC and the status register (SR) for subroutine calls and long interrupts. The stack will also store the loop counter (LC) and the loop address register
(LA) in addition to the PC and SR registers for program looping. The System Stack is in Stack Memory
Space and its address is always inherent and implied by the current instruction. The stack RAM is 64-bits
wide and 15 locations "deep". When a subroutine call or long interrupt occurs, the contents of the PC and
SR registers are stored (pushed) on the "top" location in the System Stack. When a return from subroutine
occurs, the contents of the "top" location in the System Stack are copied (pulled) to the PC. When a return
from interrupt occurs, the contents of the "top" location in the System Stack are copied (pulled) to the PC
and SR.
An interrupt will cause the processor to enter the exception processing state. Upon entering this state, the
current instruction in decode will execute normally, unless it is the first word of a two-word instruction, in
which case it will be aborted, and re-fetched at the completion of exception processing. The next two fetch
addresses are supplied by the interrupt controller. During these fetches the PC is not updated.
If one of the words fetched by the interrupt controller is a jump to subroutine, a long interrupt routine is
formed, and a context switch is performed using the stack. If neither interrupt instruction word causes a
change of control flow, then the two interrupt instructions fetched constitute a fast interrupt routine. In this
case, the stack is not used, and interrupt service concludes with the execution of the instructions contained
within the two words. Fetching then resumes using the PC. The fast interrupt routine provides minimum
overhead exception processing. This mechanism is commonly used to move data between memory and
an I/O device.
For more details on the behavior of interrupts, see Section 8.
The system stack is also used to implement no-overhead hardware program loops. When a program loop
is initiated with the execution of a DO instruction, the following events occur:
• the current 32-bit loop counter (LC) and 32-bit loop address register (LA) are pushed onto the
system stack to allow nested loops.
• the LC and LA registers are initialized with values specified in the DO instruction.
• the address of the first instruction in the program loop and the current status register contents
are transferred onto the system stack.
• the loop flag bit in the status register is set.
The loop flag bit is set when a program loop is in progress and enables the end of loop detection (comparison between the PC and LA registers, discussed below). The loop flag bit is pulled from the system stack
when a loop is terminated and indicates if the terminated loop was a nested loop.
A program loop begins execution after the DO instruction and continues until the program address fetched
equals the loop address register contents (last address of program loop). The contents of the loop counter
are then tested for one. If the loop counter is not one, the loop counter is decremented and the top location
in the stack RAM is read (but not pulled) into the PC to return to the start of the loop. If the loop counter is
one, the program loop is terminated by incrementing the PC, reading the previous loop flag bit from the top
location in the stack into the status register, purging the stack (pulling the top location and discarding the
contents) and pulling the LA and LC registers off the stack and restoring the respective registers. When
terminating a loop the loop flag, LA and LC registers as well as the system stack pointer are restored.
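The loop mechanism described above can be sketched as a small model of the stack, LA, LC and loop flag. This is an illustration under simplifying assumptions: the class LoopHardware and the helper run_loop are hypothetical, instruction fetch is not pipelined, every instruction is one word, and only the loop flag of the status register is modelled.

```python
class LoopHardware:
    def __init__(self):
        self.stack = []         # system stack (list end = top)
        self.pc = 0
        self.loop_flag = False  # loop flag bit of the status register
        self.la = 0             # loop address register
        self.lc = 0             # loop counter

    def do(self, count, first_addr, last_addr):
        """Start a hardware loop over first_addr..last_addr."""
        self.stack.append((self.la, self.lc))            # save outer loop state
        self.la, self.lc = last_addr, count
        self.stack.append((first_addr, self.loop_flag))  # loop start and old flag
        self.loop_flag = True
        self.pc = first_addr

    def advance(self):
        """Move to the next fetch address after the instruction at pc."""
        if self.loop_flag and self.pc == self.la:
            if self.lc != 1:
                self.lc -= 1
                self.pc = self.stack[-1][0]              # read, but do not pull
            else:
                self.pc += 1
                _, self.loop_flag = self.stack.pop()     # restore previous flag
                self.la, self.lc = self.stack.pop()      # restore LA and LC
        else:
            self.pc += 1


def run_loop(count, first_addr, last_addr):
    """Run one non-nested loop and return the fetched addresses and the model."""
    hw = LoopHardware()
    hw.do(count, first_addr, last_addr)
    trace = []
    while hw.loop_flag:
        trace.append(hw.pc)
        hw.advance()
    return trace, hw


trace, hw = run_loop(3, 100, 101)
print(trace, hw.pc)
```

Because the outer LA, LC and loop flag are saved on the stack, calling do again inside a running loop nests naturally, and terminating the inner loop restores the outer loop's state.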
3.2.8 Program Memory
The Program Memory consists of a 1,024 location by 32-bit RAM. Addresses are received from the program control logic (usually the PC). The Program Memory may contain instructions, constants, and data
tables which are fixed at assembly time. The Program Memory is a dual-access memory in the sense that
it may be accessed twice during a cycle: once by the core and once by the DMA. Program Memory may
be expanded off-chip. Program RAM may be written to download instructions. The bootstrap ROM also appears in Program Memory space during the bootstrap mode. See Section 9.
3.2.9 External Bus Interfaces
The DSP96002 has two identical external bus interfaces. Each bus interface has a 32-bit wide address bus
and a 32-bit wide data bus, and may be used to access external Data Memory, Program Memory or I/O
devices. Separate select lines control access to the memory spaces. A Port Select control register permits
assigning sections of each memory space to each external bus interface port. Refer to Section 2 and Section 9 for a detailed description of the external bus interface.
3.2.10 Internal Bus Switch and Bit Manipulation Unit
The Internal Bus Switch performs data transfers from one internal bus to another.
The Bit Manipulation Unit performs bit manipulation operations on memory and register operands on the
XDB, YDB, and GDB.
3.2.11 I/O Interfaces
The on-chip I/O interfaces are intended to minimize system chip count and "glue" logic in many DSP96002
applications. Each I/O interface has its own control, status and data registers and is treated as memory-mapped I/O by the DSP96002. Each interface has several dedicated interrupt vector addresses and control
bits to enable/disable interrupts. This minimizes the overhead associated with servicing the device since
each interrupt source has its own service routine.
The DSP96002 provides the following I/O interfaces: two identical 32-bit parallel Host MPU/DMA Interface
peripherals, one connected to External Bus Interface A and the other to External Bus Interface B, and
a two-channel DMA Controller.
3.2.11.1 Host Interfaces
The DSP96002 provides a Host MPU/DMA Interface for each of its external bus interface ports. Each Host
Interface (HI) is an 8-, 16-, 24- or 32-bit wide parallel port which may be connected directly to the data bus
of a host processor. The host processor may be any of a number of popular microcomputers or microprocessors,
another DSP96002 or DMA hardware. The HI appears as a memory mapped peripheral occupying 16 words in the host processor address space. Separate transmit and receive data registers are double-buffered to allow the DSP96002 and host processor to efficiently transfer data at high speed. Host processor communication with the HI is accomplished using standard host processor data move instructions and
addressing modes. Handshake flags are provided for polled or interrupt-driven data transfers.
3.2.11.2 DMA Controller
The DMA Controller performs all the address storage and effective address calculations necessary to address the DMA source and destination operands. The DMA controller operates in parallel with other chip
resources to minimize data or program transfer overhead. The DMA controller contains one Source Address Register, one Source Offset Register, one Source Modifier Register, one Destination Address Register, one Destination Offset Register and one Destination Modifier Register for each channel.
In addition there are two control registers per channel. The Transfer Count down counter, decremented after each transfer, contains the number of DMA transfers remaining to be done. The DMA Control/Status
Register controls the DMA activities and contains the DMA status. All DMA registers are mapped into the
X memory space. The AGU is shared by the DMA for the source and destination address calculations. The
DMA addressing modes are: linear, bit reversed and modulo. For more details see Section 7.5.
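The three addressing modes named above can be illustrated with generic address update functions. This is a sketch of the general techniques only: the function names are hypothetical, and the way the DSP96002 encodes these modes in its offset and modifier registers, including any alignment constraints on modulo buffers, is not modelled.

```python
def linear_update(addr, offset):
    """Linear addressing: add the offset, wrapping at 32 bits."""
    return (addr + offset) & 0xFFFFFFFF


def modulo_update(addr, offset, base, size):
    """Modulo addressing: step through a circular buffer of `size` words at `base`."""
    return base + (addr - base + offset) % size


def bit_reverse(value, bits):
    """Reverse the lowest `bits` bits of value."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def bit_reversed_sequence(bits):
    """Bit-reversed addressing order for a buffer of 2**bits words, as used for FFT data."""
    return [bit_reverse(i, bits) for i in range(1 << bits)]


print(linear_update(0x100, 4))
print(modulo_update(7, 2, base=0, size=8))
print(bit_reversed_sequence(3))
```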
3.3 DATA ALU BLOCK DIAGRAM
The major components of the Data ALU are
• Data ALU Register File
• Multiply Unit
• Adder Unit
• Logic Unit
• Format Converter
• Divide and Square Root Unit
• Controller and Arbitrator
A block diagram of the Data ALU architecture is shown in Figure 3-2.
D0, D1, D2, D3, D4, D5, D6, D7, D8 and D9 are 96-bit registers which serve as the Data ALU general pur-
pose register file. Every register is divided into three portions: high, middle, and low, each 32-bits wide. The
registers may be treated as ten 96-bit registers Dn (Dn.H:Dn.M:Dn.L), n=0,1,..,9 for floating-point source
and/or destination operands. These floating point registers receive inputs from the Multiplier, the Adder,
and the Subtracter and supply a source data register of the same form. Most Data ALU floating-point operations specify the 96-bit registers as source and/or destination operands. However, D8 and D9 are never
destinations of a Data ALU operation.
The data is stored in the registers in double precision floating-point format. Each register may be read or
written over the XDB or YDB as a floating-point operand. A format conversion is automatically performed
when a Dn register is written with an operand of a different floating-point format. This can occur when writing Dn from the XDB or YDB as a result of a single precision floating-point MOVE. If a single precision operand is written to a floating point data register, the middle portion of the data register is written with the
mantissa portion of the word operand, the low portion is zeroed and the high portion is written with the exponent portion of the word operand.
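The single precision load described above can be sketched as follows. The exact bit layout is an assumption made for illustration and is not stated in this section: the sign is placed in bit 31 of the high portion, the exponent is rebiased from 127 to 1023 in the low 11 bits of the high portion, and the 24-bit mantissa, including its leading one, is left-justified in the middle portion. The function name is hypothetical and only normalized numbers are handled.

```python
import struct


def load_single_precision(x):
    """Return assumed (high, middle, low) register portions for a single precision value."""
    bits = struct.unpack(">I", struct.pack(">f", x))[0]
    sign = bits >> 31
    exponent = (bits >> 23) & 0xFF
    fraction = bits & 0x7FFFFF
    if exponent in (0, 0xFF):
        raise ValueError("this sketch handles normalized numbers only")
    high = (sign << 31) | (exponent - 127 + 1023)  # sign and rebiased exponent (assumed layout)
    middle = ((1 << 23) | fraction) << 8           # mantissa with explicit leading one
    low = 0                                        # low portion is zeroed
    return high, middle, low


high, middle, low = load_single_precision(1.5)
print(f"{high:08X} {middle:08X} {low:08X}")
```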
Figure 3-2. Data ALU Block Diagram
Data ALU Register File (D0-D9)
The registers may also be treated as thirty 32-bit registers Dn.H, Dn.M, Dn.L, n=0,1,..,9. Each register may
be read or written over the XDB or YDB as a word operand. When an individual 32-bit register is written
over the XDB or YDB, no format conversion takes place and only the designated register is affected. The
low portion of the registers, Dn.L, is used as source and/or destination for most integer operations. In this
case the integer registers supply an operand for the Multiplier and the Adder/Subtracter while receiving an
input from the Multiplier and the Adder/subtracter. Note that in the case of integer multiplication the result
will be 64-bits wide and will be stored in both middle and low portions of the destination register.
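The split of a 64-bit integer product across two 32-bit register portions can be sketched as below. Placing the more significant half in the middle portion is an assumption, as is the function name; negative operands are represented in two's complement.

```python
def integer_multiply(a, b):
    """Return the 64-bit product as assumed (middle, low) 32-bit portions."""
    product = (a * b) & 0xFFFFFFFFFFFFFFFF  # two's complement, 64 bits
    middle = product >> 32                  # more significant half (assumed)
    low = product & 0xFFFFFFFF              # less significant half
    return middle, low


middle, low = integer_multiply(0xFFFFFFFF, 2)
print(f"{middle:08X}:{low:08X}")
```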
3.3.1 Multiply Unit
The Multiplier is one of the two arithmetic processing units of the Data ALU and performs all the floating-point multiplications as well as signed/unsigned fixed-point (integer) multiplications on the data operands.