SGS Thomson Microelectronics D950CORE Datasheet

4 September 1997
This is preliminary information on a new product in development or undergoing evaluation. Details are subject to change without notice.
D950-CORE
16-Bit Fixed Point Digital Signal Processor (DSP) Core
PRELIMINARY DATA
Performance
66 Mips - 15ns instruction cycle time
Memory Organization
Two 64k x 16-bit data memory spaces
One 64k x 16-bit program memory space
2 stacks in data memory spaces
Fast and Flexible Buses
Two 16-bit address 16-bit data non-multiplexed data buses
One 16-bit address 16-bit data non-multiplexed instruction bus
Data Calculation Unit
16 x 16-bit parallel multiplier
40-bit barrel shifter unit
40-bit ALU
Two 40-bit extended precision accumulators
Fractional and integer arithmetic with support for floating point and multi-precision
16-bit bit manipulation unit (BMU)
Address Calculation Unit
Two address calculation units with modulo
and bit-reverse capability
2 x 16-bit address registers
4 x 16-bit index registers
2 x 16-bit base and maximum address
registers for modulo addressing
Program Control Unit
16-bit program counter
3 Hardware Loop Capabilities
Power Consumption
Single 3.3V power supply
Low-power standby mode
Electrical Characteristics
Operating frequency down to DC
Channels
General purpose 8-bit I/O port
Dedicated hardware for Emulation and Test, IEEE 1149.1 (JTAG) interface compatible
Peripherals and Memory
Macrocells for peripherals such as the bus
switch unit, interrupt controller and DMA controller
Standard cells library, I/O library
Memory generators for RAM and ROM
Development Tools
JTAG PC board with graphic windowed high
level source debugger for AS-DSP emulation
Complete crash-barrier chain (assembler /
simulator / linker) running on PC and SUN,
Complete GNU chain (assembler / simulator /
linker / C compiler / C debugger) for SUN
VHDL model (SYNOPSYS & MENTOR)
[Block diagram: Data Calculation Unit, Address Calculation Unit and Program Control Unit, connected to data memory through the XD/XA and YD/YA buses and to program memory through the ID/IA buses (16 bits each), with CLKIN, output clocks, VDD/VSS, control, test and emulation, and P0/P7 port connections.]
Table of Contents
1 INTRODUCTION
2 PIN DESCRIPTION
3 FUNCTIONAL OVERVIEW
4 BLOCK DESCRIPTION
  4.1 DATA CALCULATION UNIT (DCU)
    4.1.1 Introduction
    4.1.2 Registers
    4.1.3 Multiplier
    4.1.4 Barrel Shifter Unit (BSU)
    4.1.5 Arithmetic and Logic Unit (ALU)
    4.1.6 Bit Manipulation Unit (BMU)
  4.2 ADDRESS CALCULATION UNIT (ACU)
    4.2.1 Introduction
    4.2.2 Registers
    4.2.3 Addressing modes
  4.3 PROGRAM CONTROL UNIT (PCU)
    4.3.1 Introduction
    4.3.2 Registers
    4.3.3 Instruction pipeline
    4.3.4 Interrupt Sources
    4.3.5 Loop Controller
    4.3.6 Sequence control
    4.3.7 Halting program execution
    4.3.8 Memory Moves with Wait States
  4.4 GENERAL PURPOSE P-PORT
    4.4.1 Introduction
    4.4.2 Registers
  4.5 COMMON CONTROL REGISTERS
    4.5.1 STA: Status register
    4.5.2 CCR: Condition Code Register
5 SOFTWARE ARCHITECTURE
  5.1 INTRODUCTION
  5.2 REGISTER LIST
  5.3 CONDITION LIST
  5.4 INSTRUCTION SET
    5.4.1 Assignment Instructions
    5.4.2 ALU Instructions
    5.4.3 Bit Manipulation Instructions
    5.4.4 Program Control Instructions
    5.4.5 Conditional Assignment Instruction
    5.4.6 Loop Control Instructions
    5.4.7 Co-processor Instructions
    5.4.8 Stack Instructions
  5.5 INSTRUCTION CYCLE AND WORD COUNT
6 ELECTRICAL SPECIFICATIONS
  6.1 DC ABSOLUTE MAXIMUM RATINGS
  6.2 DC ELECTRICAL CHARACTERISTICS (CORE LEVEL)
  6.3 AC CHARACTERISTICS
    6.3.1 Bus AC Electrical Characteristics (for X, Y and I buses)
    6.3.2 Control I/O Electrical Characteristics
    6.3.3 Hardware Reset
    6.3.4 Wait States
    6.3.5 Interrupt
    6.3.6 HOLD
    6.3.7 JUMP on Port Condition
7 ANNEX - HARDWARE PERIPHERAL LIBRARY
  7.1 CO-PROCESSOR
  7.2 BUS SWITCH UNIT (BSU)
    7.2.1 Introduction
    7.2.2 I/O interface
    7.2.3 Operation
    7.2.4 BSU control registers
  7.3 INTERRUPT CONTROLLER
    7.3.1 Introduction
    7.3.2 I/O interface
    7.3.3 Interrupt Controller Peripheral Registers
  7.4 DMA CONTROLLER
    7.4.1 Introduction
    7.4.2 I/O interface
    7.4.3 Operation
    7.4.4 DMA Peripheral Registers
  7.5 EMULATION AND TEST UNIT (EMU)
    7.5.1 Introduction
    7.5.2 Registers
8 APPENDIX
  8.1 MEMORY MAPPING (Y-MEMORY SPACE)
    8.1.1 General mapping
    8.1.2 Registers Related to the D950-CORE
    8.1.3 Registers related to the interrupt controller
    8.1.4 Registers related to the DMA controller
    8.1.5 Registers related to the Bus Switch Unit
D950-Core
1 INTRODUCTION
The D950-Core is a general purpose programmable 16-bit fixed point Digital Signal Processor Core, designed for multimedia, telecom and datacom applications. The D950-Core is a core product, used in combination with standard or custom peripherals from the standard cell library. The peripherals are implemented around the core on the same silicon die, for application specific DSP silicon chip design.
The main blocks of the D950-Core include an arithmetic data calculation unit, a program control unit and an address calculation unit, able to manage up to 64k (program) and 128k (data) x 16-bit memory spaces. Standard peripherals from the macrocell library include an Emulation Unit, a Bus Switch Unit, an Interrupt Controller, a DMA Controller, a Timer and a Synchronous Serial Port. Memory can be added for programs or data and dedicated memory cells can be generated by use of RAM and ROM memory generators. The development of application specific peripherals is simplified by using the standard cells library.
A set of high level hardware and software development tools and a complete design package give the user substantial advantages in the form of a high-performance design environment, rapid prototyping, first time silicon success and built-in test strategies for a global solution in AS-DSP development.
Figure 1.1 shows an architecture example for an AS-DSP used for audio decoding (Dolby AC3, MPEG).
Figure 1.1 AS-DSP Architecture Example
[Figure: AS-DSP built around the D950-CORE. Blocks shown: data memory, program memory, on-chip memories, peripherals A to D, DMA controller (channels 0 to 3), interrupt controller, bus switch unit, EMU and TAP, interconnected by the X-bus, I-bus and Y-bus.]
2 PIN DESCRIPTION
The following tables detail the D950-Core pin set. There is one table for each group of pins. The tables detail the pin name, type and a short description of the pin function. A diagram of the D950-Core I/O interface is contained at the end of the section.
Table 2.1 DATA BUSES (70 PINS)
Pin Name    Type  Description
XD0-XD15    I/O   X data bus. Hi-Z during cycles with no X-bus exchange.
XA0-XA15    O     X address bus. Hi-Z when in Hold.
XRD         O     X-bus read strobe. Active low. Hi-Z when in Hold.
XWR         O     X-bus write strobe. Active low. Hi-Z when in Hold.
XBS         O     X-bus strobe. Active low. Hi-Z when in Hold. Asserted low at the beginning of a valid X-bus cycle.
YD0-YD15    I/O   Y data bus. Hi-Z during cycles with no Y-bus exchange.
YA0-YA15    O     Y address bus. Hi-Z when in Hold.
YRD         O     Y-bus read strobe. Active low. Hi-Z when in Hold.
YWR         O     Y-bus write strobe. Active low. Hi-Z when in Hold.
YBS         O     Y-bus strobe. Active low. Hi-Z when in Hold. Asserted low at the beginning of a valid Y-bus cycle.
Table 2.2 PROGRAM BUS (35 PINS)
Pin Name    Type  Description
ID0-ID15    I/O   Instruction data bus. Hi-Z during cycles with no I-bus exchange.
IA0-IA15    O     Instruction address bus. Hi-Z when in Hold.
IRD         O     I-bus read strobe. Active low. Hi-Z when in Hold.
IWR         O     I-bus write strobe. Active low. Hi-Z when in Hold.
IBS         O     I-bus strobe. Active low. Hi-Z when in Hold. Asserted low at the beginning of a valid I-bus cycle.

Table 2.3 BUS CONTROL (3 PINS)
Pin Name    Type  Description
DTACK       I     Data transfer acknowledge. Active low. Sampled on CLKIN rising edge. Controls bus cycle extension by insertion of wait-states.
HOLD        I     Hold bus request signal. Active low. Asserted by a peripheral (DMA controller) requiring bus mastership. Halts program execution and tri-states buses.
HOLDACK     O     Hold Acknowledge output. Active low. Indicates that all buses are in Hi-Z.

Table 2.4 GENERAL PURPOSE P-PORT (9 PINS)
Pin Name    Type  Description
P0-P7       I/O   8-bit bidirectional parallel port. Each pin can be individually programmed as input or output and as level or falling edge sensitive input conditions for test by branch and conditional instructions.
P_EN        O     Direction of Port.

Table 2.5 CLOCK (4 PINS)
Pin Name    Type  Description
CLKIN       I     Clock input.
CLK_EMU     I     Emulation Clock input.
DMA_CLK     O     DMA Clock output.
BSU_CLK     O     BSU Clock output.
Table 2.6 CONTROL (13 PINS)
Pin Name    Type  Description
IT          I     Maskable Interrupt Request Input. Falling edge sensitive.
ITACK       O     Maskable Interrupt Request Acknowledge. Active low. Asserted low at the beginning of Interrupt servicing.
EOI         O     End of maskable Interrupt routine output. Active low. Asserted low at the end of current interrupt request processing.
LP          I     Low power. Falling edge sensitive. Stops the processor after execution of the currently decoded instruction and enters low-power standby state (in this state, the clock generator is stopped except for INCYCLE).
LPACK       O     Low power Acknowledge. Active low. Asserted low at the end of execution of the last instruction following detection of LP falling edge or at the end of LP or STOP instruction.
RESET       I     Reset input. Active low. Initializes the processor to the RESET state and the clock generator. Forces Program Counter value to reset address and execution of NOP instruction.
MODE        I     Mode input select. Forces reset address to 0x0000 (resp. 0xFC00) when low (resp. high).
VCI         O     Valid co-processor instruction decoded. Asserted high while decoding a co-processor dedicated instruction. Indicates that the co-processor instruction will be executed at the following instruction cycle.
IRD_WR      O     Indicates program memory RD/WR cycle during execution of Read or Write Program memory instruction.
INCYCLE     O     Instruction cycle. Asserted high at the beginning of cycle.
RESET_OUT   O     Hardware and Software Reset Output.
STACKX      O     X Stack read/write instruction.
STACKY      O     Y Stack read/write instruction.
Table 2.7 EMULATION (9 PINS)
Pin Name    Type  Description
ERQ         I     Emulator Halt Request. Active low. Halts program execution and enters emulation mode.
IDLE        O     Output flag indicating if the processor is halted or executing an instruction in Emulation mode.
HALTACK     O     Halt Acknowledge. Active high. Asserted high when the processor is halted and under control of the emulator.
SNAP        O     Snapshot output. Active high. Asserted high when executing an instruction in Snapshot mode.
HALT        I     Halt program execution request.
EMI         I     Single Instruction Execute Command.
MCI         O     Multicycle instruction flag.
IDLE        O     Execution of emulation instruction/Halted.
FNOP        O     Forced NOP instruction flag.

Table 2.8 TAP CONTROLLER INTERFACE (10 PINS)
Pin Name    Type  Description
TE          I     Test Enable.
TEST        I     Test Scan Mode.
TI_ACU      I     Test Input for ACU.
TO_ACU      O     Test Output for ACU.
TI_PCU      I     Test Input for PCU.
TO_PCU      O     Test Output for PCU.
TI_DCU      I     Test Input for DCU.
TO_DCU      O     Test Output for DCU.
TI_CORE     I     Scan Chain input.
TO_CORE     O     Scan Chain output.

Table 2.9 SUPPLY (2 PINS)
Pin Name    Type  Description
VDD         I     Positive Supply.
VSS         I     Ground pin.
Figure 2.1 D950-Core I/O Interface
[Figure: D950-CORE signal groups. Program bus: IA and ID (16 bits each), IRD / IWR / IBS. X-bus: XA and XD (16 bits each), XRD / XWR / XBS. Y-bus: YA and YD (16 bits each), YRD / YWR / YBS. P-port: P0-P7 (8 bits), P_EN. Clock: CLKIN, CLK_EMU, DMA_CLK, BSU_CLK. Bus control: DTACK / HOLD, HOLDACK. Control: inputs IT, LP, RESET, MODE; outputs ITACK, EOI, LPACK, VCI, IRD_WR, INCYCLE, RESET_OUT, STACKX, STACKY. Test and emulation: ERQ, HALT, EMI, HALTACK, SNAP, IDLE, MCI, FNOP, TE, TEST, TI_ACU, TI_DCU, TI_PCU, TI_CORE, TO_ACU, TO_DCU, TO_PCU, TO_CORE.]
3 FUNCTIONAL OVERVIEW
The D950-CORE is composed of three main units:
Data Calculation Unit (DCU)
Address Calculation Unit (ACU)
Program Control Unit (PCU)
These units are organized in a Harvard architecture around three bidirectional 16-bit buses, two for data and one for instructions. Each of these buses is paired with a dedicated unidirectional 16-bit address bus (XA/YA/IA).
An 8-bit general purpose parallel port (P0-P7) can be configured as input or output. A test condition is attached to each bit to test external events. Each of these functional blocks is discussed in detail in Section 4, “BLOCK DESCRIPTION”.
Control of the chip is performed through interface pins related to interrupt, low-power mode, reset and miscellaneous functions.
Clock input is provided on the CLKIN pin which is buffered to the output clocks.
Figure 3.1 Block Diagram
[Figure: Data Calculation Unit, Address Calculation Unit and Program Control Unit, connected to data memory through the XD/XA and YD/YA buses and to program memory through the ID/IA buses (16 bits each), with CLKIN, output clocks, VDD/VSS, control, test and emulation, and P0/P7 port connections.]
Data buses (XD/YD and XA/YA) are provided externally. Data memories (RAM, ROM) and peripheral registers are to be mapped in these address spaces.
Instruction bus (ID/IA) gives access to program memory (RAM, ROM). Each bus has its own control interface.

Table 3.1 Data/Instruction Bus and Corresponding Address Bus
Data / Instruction Buses      Corresponding Address Bus
XD  Bidirectional 16-bit      XA  Unidirectional 16-bit
YD  Bidirectional 16-bit      YA  Unidirectional 16-bit
ID  Bidirectional 16-bit      IA  Unidirectional 16-bit

Depending on the calculation mode, the D950-Core DCU computes operands which can be considered as 16 or 32-bit, signed or unsigned. It includes a 16 x 16-bit parallel multiplier able to implement MAC-based functions in one cycle per MAC. A 40-bit arithmetic and logic unit, including an 8-bit extension for arithmetic operations, implements a wide range of arithmetic and logic functions. A 40-bit barrel shifter unit and a bit manipulation unit are included.
Tables 3.2 and 3.3 illustrate the different types of word length and word format available for manipulation.

Table 3.2 Summary of Possible Word Lengths
Bit fields                  Word
0                           1-bit word
7..0                        8-bit word
15..0                       16-bit word, signed / unsigned
31..16, 15..0               32-bit word, signed / unsigned
39..32, 31..16, 15..0       40-bit word, signed / unsigned

Table 3.3 Summary of Possible Word Formats
Format                  Minimum     Maximum
fractional signed       -1          +0.999969481
fractional unsigned     0           +0.99996948
integer signed          -32768      +32767
integer unsigned        0           +65535
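
The signed rows of Table 3.3 follow from reading the same 16-bit pattern in different ways. The following is a minimal sketch, assuming the signed fractional format places the binary point directly after the sign bit (the common Q15 convention, which this section does not name explicitly). The function names are hypothetical and the unsigned fractional row is not modelled.

```python
def to_signed16(raw):
    """Interpret a 16-bit pattern as a two's complement integer."""
    raw &= 0xFFFF
    return raw - 0x10000 if raw & 0x8000 else raw


def to_signed_fraction(raw):
    """Interpret a 16-bit pattern as a signed fraction.

    Assumption: Q15 scaling, i.e. value = integer / 2**15.
    """
    return to_signed16(raw) / 2 ** 15


# Extremes of the signed integer and signed fractional formats.
integer_range = (to_signed16(0x8000), to_signed16(0x7FFF))
fraction_range = (to_signed_fraction(0x8000), to_signed_fraction(0x7FFF))
```

Under this assumption the largest signed fraction is 1 - 2^-15, which is just below +1.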
4 BLOCK DESCRIPTION
4.1 Data Calculation Unit (DCU)
4.1.1 Introduction
The D950-Core DCU includes the following main components:
Register file - containing 16 data registers
4 Control Registers:
DCU0CR: Register
BSC: Barrel Shifter Control
PSC: Product Shift Control
CCR: ALU Flags
Multiplier - 16x16-bit signed/unsigned fractional/integer parallel multiplier.
BSU - 40-bit Barrel Shifter Unit with a maximum right or left shift value of 32.
ALU - 40-bit Arithmetic and Logic Unit implementing a wide range of arithmetic and logic functions with an 8-bit extension for arithmetic operations.
BMU - 16-bit Bit Manipulation Unit implementing bit operations on internal registers and/or on 16-bit data RAM with an 8/16-bit mask.
Figure 4.1 D950-Core Data Calculation Unit
[Figure: input registers L0, L1, R0 and R1 loaded from XD and YD; 16 x 16 signed/unsigned multiplier with programmable rounding; product register (16-bit PH, 16-bit PL); 40-bit barrel shifter with B.S.C. and P.S.C. control; 40-bit ALU; accumulators A0 (8-bit A0E, 16-bit A0H, 16-bit A0L) and A1 (8-bit A1E, 16-bit A1H, 16-bit A1L); C.C.R. and STA.]
4.1.2 Registers
There are two types of registers: data registers and control registers. All registers are directly addressed. Registers can be read or written through the X and Y buses. All of the DCU parts (multiplier, BSU, ALU, BMU) operate on these registers.
Data registers
L0 / L1: 2 x 16-bit input Left registers.
R0 / R1: 2 x 16-bit input Right registers.
A0 / A1: 2 x 40-bit Accumulators, each made of the concatenation of an 8-bit extension A0E (resp. A1E), a 16-bit MSB A0H (resp. A1H) and a 16-bit LSB A0L (resp. A1L). These registers are dedicated to extended precision calculations, in order to provide up to 240 dB of dynamic range.
P: 32-bit multiplier result register made of the concatenation of PH (MSB) and PL (LSB) 16-bit registers.
Table 4.1 Data Register Structure
Register  Bits 39..32  Bits 31..16  Bits 15..0  Description
L         -            L1           L0          32-bit Input Left
R         -            R1           R0          32-bit Input Right
A0        A0E          A0H          A0L         40-bit Accumulator 0
A1        A1E          A1H          A1L         40-bit Accumulator 1
P         -            PH           PL          32-bit Multiplier Result
L         -            L1           0           16-bit Input Left
L         -            L0           0           16-bit Input Left
R         -            R1           0           16-bit Input Right
R         -            R0           0           16-bit Input Right
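
The accumulator layout (8-bit extension, 16-bit MSB, 16-bit LSB) and the 240 dB figure can be illustrated with a short sketch. The function names are hypothetical, and the dynamic range is computed with the usual 20 * log10(2^bits) rule of thumb, which is an assumption about how the quoted figure was derived.

```python
import math


def pack_accumulator(ext, high, low):
    """Concatenate an 8-bit extension, a 16-bit MSB and a 16-bit LSB into 40 bits."""
    return ((ext & 0xFF) << 32) | ((high & 0xFFFF) << 16) | (low & 0xFFFF)


def unpack_accumulator(acc):
    """Split a 40-bit value into (extension, high word, low word)."""
    return (acc >> 32) & 0xFF, (acc >> 16) & 0xFFFF, acc & 0xFFFF


def dynamic_range_db(bits):
    """Dynamic range of a word of the given width, in dB (assumed 20*log10(2**bits))."""
    return 20 * math.log10(2 ** bits)


acc = pack_accumulator(0xFF, 0x8000, 0x0001)   # A0E=0xFF, A0H=0x8000, A0L=0x0001
fields = unpack_accumulator(acc)               # back to the three fields
range_40_bit = dynamic_range_db(40)            # slightly above 240 dB
```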
Control registers
CCR: Bits 0 to 12 are dedicated to the DCU (see Section 4.5.2).
BSC: 6-bit Barrel Shifter Control register. The BSC contains a 6-bit signed shift value (2’s complement). If the value is positive (resp. negative), all shifts using the BSC contents will provide a left (resp. right) shift. After reset, the BSC value is 0.
PSC: 6-bit Product Shift Control register. The PSC contains a 6-bit signed shift value. If the value is positive (resp. negative) there will be a left (resp. right) shift on the P-register. After reset, the PSC value is 0.
DCU0CR: Bits 0 to 7 are copied from bits 0 to 7 of the STA register. Bit 10 is used for clearing the lower part (bits 0 to 15) and sign extending bits 32 to 39 of the accumulator when its higher part (bits 16 to 31) is loaded.
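
As an illustration of the 6-bit signed shift value held in BSC and PSC, the sketch below decodes a raw 6-bit field into a direction and an amount. The function name and the returned tuple are hypothetical; treating a value of 0 as a left shift by zero (no shift) is a choice made for this sketch only.

```python
def decode_shift_control(raw):
    """Decode a 6-bit two's complement shift value into (direction, amount).

    Positive values select a left shift, negative values a right shift.
    """
    raw &= 0x3F
    value = raw - 0x40 if raw & 0x20 else raw
    if value >= 0:
        return "left", value
    return "right", -value


left_by_3 = decode_shift_control(0b000011)    # +3
right_by_3 = decode_shift_control(0b111101)   # -3 in 6-bit two's complement
```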
4.1.3 Multiplier
The D950-Core multiplier performs 16 x 16-bit multiplications with the following implementations (see SL and SR bits of STA register):

SL  LL  SR  LR  Multiplication
0   X   0   X   Unsigned L-source x Unsigned R-source
1   0   0   X   Signed L-source x Unsigned R-source
0   X   1   0   Unsigned L-source x Signed R-source
1   0   1   0   Signed L-source x Signed R-source
SL  1   SR  X   Unsigned L0 x Unsigned R-source (depending on SR-bit), or Signed/Unsigned L-source x Signed/Unsigned R-source
SL  X   SR  1   Signed/Unsigned L-source (depending on SL-bit) x Unsigned R0, or Signed/Unsigned L-source x Signed/Unsigned R-source

The 16 or 32-bit operands are provided by a subset of the register file, stored in L1/L0 and R1/R0, and accessed through the X and Y buses.
The multiplication is performed in one single instruction cycle and the result is loaded in the 32-bit P register. The product can be either integer or fractional (see I-bit of STA register). Rounding of the product is explicitly defined by the multiplication instructions (see Section 5.4.2).

I   Product
0   Fractional L-source x Fractional R-source
1   Integer L-source x Integer R-source
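
The first four rows of the signedness table can be mimicked in software as follows. This is only a sketch: the flags `left_signed` and `right_signed` stand in for the SL and SR bits, the LL/LR cases are not modelled, only integer mode (I = 1) is shown, and all names are hypothetical.

```python
def as_operand(raw, signed):
    """Interpret a 16-bit pattern as a signed or unsigned integer operand."""
    raw &= 0xFFFF
    if signed and raw & 0x8000:
        return raw - 0x10000
    return raw


def multiply_integer(left_raw, right_raw, left_signed, right_signed):
    """Integer-mode 16 x 16 product, returned as a 32-bit pattern."""
    product = as_operand(left_raw, left_signed) * as_operand(right_raw, right_signed)
    return product & 0xFFFFFFFF


# The same bit patterns give different products depending on signedness.
signed_unsigned = multiply_integer(0xFFFF, 0x0002, True, False)     # -1 * 2
unsigned_unsigned = multiply_integer(0xFFFF, 0x0002, False, False)  # 65535 * 2
```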
4.1.4 Barrel Shifter Unit (BSU)
The D950-Core BSU provides a complete set of shifting functions:
Arithmetic shift: 40-bit input (either a 32-bit operand sign extended to 40-bit, or a 40-bit accumulator), providing a valid result.
  Right: shifts the 40-bit input data to the right, the upper part is sign extended.
  Left: shifts the 40-bit input data to the left, the upper part is fed by 0.
Logical shift: provides a 32-bit result which is loaded into a 40-bit accumulator, the 8-bit extension of which is reset.
  Right: shifts the 32-bit input data to the right, the upper part is fed by 0.
  Left: shifts the 32-bit input data to the left, the upper part is fed by 0.
Rotation:
  Right: rotates the input data to the right (only through the BSC register).
  Left: rotates the 32-bit input data to the left.
  Left with TST: rotates the 33-bit data made of the concatenation of TST-bit of CCR with the 32-bit input data to the left (the LSB of the 32-bit input is fed by TST-bit, the MSB of the 32-bit input feeds the TST-bit of CCR).
[Figures: for each shift and rotation, the 8-bit EXT/sign, 16-bit MSB and 16-bit LSB fields are shown together with the TST bit and the 0 fed into the shifted data.]
When using a pure shift instruction, the TST bit of the CCR is fed by the last bit shifted out. The shift value provided to the BSU is a signed value which may be provided in three different ways:
By the instruction (shift defined in the instruction: see Section 5.4.2).
By the BSC register (shift defined in the ALU code: if BSC contains a positive (resp. negative) value, all shifts using BSC content will provide a left (resp. right) shift).
By the PSC register (shift defined in the MAC instruction: if PSC contains a positive (resp. negative) value, all shifts using PSC content will provide a left (resp. right) shift).
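
The signed shift value and the TST behaviour can be sketched as below. This is an illustrative model only: the function names are hypothetical, a shift of 0 is assumed to leave TST unchanged, and the rotation is shown for a single bit position because this section does not specify multi-bit rotations through TST.

```python
MASK40 = (1 << 40) - 1


def to_signed40(pattern):
    """Interpret a 40-bit pattern as a two's complement integer."""
    pattern &= MASK40
    return pattern - (1 << 40) if pattern & (1 << 39) else pattern


def arithmetic_shift(acc, shift, tst):
    """Arithmetic shift of a 40-bit pattern by a signed amount.

    Positive shift = left, negative shift = right (sign extended).
    Returns (result, tst), where tst is the last bit shifted out.
    """
    value = to_signed40(acc)
    if shift == 0:
        return acc & MASK40, tst          # assumption: TST unchanged
    if shift > 0:
        last_out = ((acc & MASK40) >> (40 - shift)) & 1
        return (value << shift) & MASK40, last_out
    amount = -shift
    last_out = (value >> (amount - 1)) & 1
    return (value >> amount) & MASK40, last_out


def rotate_left_with_tst(word32, tst):
    """Rotate the 33-bit quantity {TST, 32-bit word} left by one position."""
    word32 &= 0xFFFFFFFF
    new_tst = (word32 >> 31) & 1              # MSB feeds TST
    result = ((word32 << 1) | (tst & 1)) & 0xFFFFFFFF  # TST feeds LSB
    return result, new_tst
```

For example, shifting the pattern 0x0000000003 right by one position leaves 1 in the accumulator and 1 in TST, since the bit shifted out was set.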
4.1.5 Arithmetic and Logic Unit (ALU)
The D950-Core ALU is 40-bit wide and implements about sixty ALU functions. It includes an 8-bit extension for arithmetical operations.
The calculation mode is controlled by both the instruction and the corresponding bits of the STA register (see Section 4.5.1). The ALU has two inputs (see Figure 4.2), the left (always the output of the BSU) and the right (fed by the registers making up the register file).
For logical operations, the ALU is fed with 32-bit wide operands, 0-extended to 40-bits. Then, the ALU generates a 40-bit result which is always stored in A0 or A1 (A0E and A1E extension registers being reset).
For arithmetical operations, the ALU is fed with 40-bit wide operands.
If the operand is an accumulator, the entire 40-bit register (A0E/A0H/A0L or A1E/A1H/A1L) feeds the 40-bit ALU.
If not, the 32-bit register is considered as sign extended to a 40-bit format. The extended ALU then generates a 40-bit wide result which is always stored in A0E/A0H/A0L or A1E/A1H/A1L.
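
The two operand-widening rules described above (0-extension for logical operations, sign extension for arithmetical operations) can be sketched as follows. The function names are hypothetical.

```python
def zero_extend_32_to_40(word32):
    """Widen a 32-bit operand for a logical operation: extension bits are 0."""
    return word32 & 0xFFFFFFFF


def sign_extend_32_to_40(word32):
    """Widen a 32-bit operand for an arithmetical operation: bit 31 is copied into bits 32..39."""
    word32 &= 0xFFFFFFFF
    extension = 0xFF if word32 & 0x80000000 else 0x00
    return (extension << 32) | word32


logical_operand = zero_extend_32_to_40(0x80000000)     # extension bits stay 0
arithmetic_operand = sign_extend_32_to_40(0x80000000)  # extension bits become 0xFF
```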
Figure 4.2 D950-Core ALU Operations
[Figure: the 40-bit A.L.U. takes one 40-bit input from the BSU and one from the register file (8-bit extension plus two 16-bit words), and writes A0 or A1 and the CCR.]
The ALU output is always made to one of the two accumulators and the CCR (with the exception of particular ALU codes which affect only the CCR or an accumulator). The ALU operations can be partitioned into three different groups (see Section 5.4.2), depending on the number of operands the operation requires: three, two or one.
Specific ALU codes (see Section 5.4.2) are used to implement a non-restoring conditional add/subtract division algorithm. The division can be signed or unsigned. The dividend must be a 32-bit operand sign extended to 40-bit and located in the 40-bit accumulator. The divisor must be a 16-bit operand located in R0 or R1 (LR-bit of STA register must be low).
In order to obtain a valid result, the absolute value of the dividend must be strictly smaller than the absolute value of the divisor (considering that the operands are in a fractional format).
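
The general idea of a non-restoring conditional add/subtract division can be sketched in software. The code below is a textbook unsigned version, not the D950-Core ALU code sequence: the function name, the `bits` parameter and the final remainder correction are assumptions of this sketch. It enforces the same precondition as above, that the dividend is strictly smaller than the divisor.

```python
def nonrestoring_divide(dividend, divisor, bits=16):
    """Unsigned non-restoring division producing a `bits`-bit fractional quotient.

    Returns (quotient, remainder) with
    dividend * 2**bits == quotient * divisor + remainder.
    """
    if divisor <= 0 or not 0 <= dividend < divisor:
        raise ValueError("requires 0 <= dividend < divisor")
    remainder = dividend
    quotient = 0
    for _ in range(bits):
        # Conditional add/subtract: the sign of the partial remainder
        # selects the operation, and no restoring step is performed.
        if remainder >= 0:
            remainder = 2 * remainder - divisor
        else:
            remainder = 2 * remainder + divisor
        quotient = (quotient << 1) | (1 if remainder >= 0 else 0)
    if remainder < 0:
        remainder += divisor  # single correction after the last step
    return quotient, remainder


quotient, remainder = nonrestoring_divide(1, 3, bits=4)
```

Each iteration yields one quotient bit, so a 16-bit quotient takes 16 conditional add/subtract steps in this model.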
Special features are implemented in the D950-Core to process multi-precision data (see DMULT instruction for double-precision MAC operations).
Two overflow preventions exist in the D950-Core (see SAT and ES bits of STA register):
1: For the multiplier, when multiplying 0x8000 by 0x8000 in signed/signed fractional mode, the saturation block forces the multiplier result to 0x7FFFFFFF.
2: For the ALU, when the result overflows. Provided one of the two optional saturation modes (32-bit saturation or 40-bit saturation) has been selected, the accumulator destination is set to plus or minus the maximum value.
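
Both overflow preventions can be modelled with a few lines. This is a sketch under stated assumptions: the fractional product is assumed to be the integer product doubled (Q15 x Q15 giving Q31, the usual fixed-point convention), the negative saturation limit is assumed to be the most negative two's complement value, and the function names are hypothetical.

```python
def multiply_fractional_signed(a, b):
    """Signed fractional product of two values in -32768..32767, as a 32-bit pattern.

    Assumption: the product is doubled to realign the binary point.
    Only -1.0 x -1.0 (0x8000 x 0x8000) can overflow, and it is forced
    to 0x7FFFFFFF.
    """
    product = (a * b) << 1
    if product > 0x7FFFFFFF:
        product = 0x7FFFFFFF
    return product & 0xFFFFFFFF


def saturate(value, bits):
    """Clamp a signed result to the range of a `bits`-wide two's complement word."""
    maximum = (1 << (bits - 1)) - 1
    minimum = -(1 << (bits - 1))
    return max(minimum, min(maximum, value))


corner_case = multiply_fractional_signed(-32768, -32768)  # 0x8000 x 0x8000
clamped_32 = saturate(2 ** 31, 32)                        # 32-bit saturation
clamped_40 = saturate(-(2 ** 39) - 1, 40)                 # 40-bit saturation
```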
Two rounding operations are enabled in the D950-Core (see RND-bit of STA register):
1: The multiplier result stored in P register explicitly defined by the instruction. A two’s complement rounding is performed on the result which is stored in the 16-bit PH register (see Section 5.4.2).
2: The 40-bit accumulator (either two’s complement or convergent rounding) provided by ALU operation (see RND-bit of STA register).
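
The difference between the two rounding styles shows up only when the discarded low word is exactly one half. The sketch below rounds a 32-bit value to its upper 16 bits, assuming the usual definitions: two's complement rounding adds half an LSB and truncates, while convergent rounding sends ties to the nearest even value. The function names are hypothetical and saturation of the result is not modelled.

```python
def round_twos_complement(value32):
    """Round to the upper 16 bits by adding 0x8000 and truncating."""
    return (value32 + 0x8000) >> 16


def round_convergent(value32):
    """Round to the upper 16 bits, sending exact ties to the even neighbour."""
    high = value32 >> 16
    low = value32 & 0xFFFF
    if low > 0x8000 or (low == 0x8000 and high & 1):
        high += 1
    return high


# 2.5 in units of the upper word: the two styles disagree on the tie.
tie = 0x00028000
rounded_up = round_twos_complement(tie)
rounded_even = round_convergent(tie)
```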
4.1.6 Bit Manipulation Unit (BMU)
The BMU allows bit manipulation operations on 16-bit data sources, accessed in 3 different modes: direct, indirect and register addressing, through dedicated instructions.
An 8-bit mask is applied to enable the following operations on a bit-per-bit basis:
TSTL: bit test low.
TSTH: bit test high.
TSTHSET: bit test high and set.
TSTLCLR: bit test low and reset.
ALU operation groups by number of operands (see Section 4.1.5):
ALU Code     Number of Sources         Number of Destinations
3 operands   2                         1
2 operands   1                         1
1 operand    1 (source=destination)    1 (source=destination)
Figure 4.3 D950-Core Bit Manipulation Unit
[Figure: blocks shown are INTERNAL REGISTER, RAM, EXTENSION, MASK and BIT MANIPULATION UNIT, with 16-bit data paths, an 8-bit mask input, and outputs TST and Processed Data.]
This 8-bit mask is extended to a 16-bit mask in three ways:
8-bit value on MSBs, 0x00 on LSBs,
0x00 on MSBs, 8-bit value on LSBs,
8-bit value on MSBs, 8-bit value on LSBs. (In this case, the mask value is the same on MSB and LSB.)
For registers with a length less than 16-bit (A0E/A1E, BSC, PSC), the signed value is sign-extended to a 16-bit signed value before being tested.
Figure 4.4 Extension of an 8-bit Mask to 16-bit Mask
[Figure: three 16-bit layouts over bits 15..8 and 7..0: MASK and 0, 0 and MASK, MASK and MASK.]
Figure 4.5 Sign Extension to a 16-bit Signed Value
[Figure: the sign bit S of the short register is replicated up to bit 15.]
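
The mask extension and sign extension rules of Section 4.1.6 can be illustrated as follows. This is a sketch only: the mode names "high", "low" and "both" and the function names are hypothetical labels for the three extension cases, and the test operations themselves (TSTL, TSTH, TSTHSET, TSTLCLR) are not modelled.

```python
def extend_mask(mask8, mode):
    """Extend an 8-bit mask to 16 bits in one of the three documented ways."""
    mask8 &= 0xFF
    if mode == "high":      # 8-bit value on MSBs, 0x00 on LSBs
        return mask8 << 8
    if mode == "low":       # 0x00 on MSBs, 8-bit value on LSBs
        return mask8
    if mode == "both":      # same 8-bit value on MSBs and LSBs
        return (mask8 << 8) | mask8
    raise ValueError("mode must be 'high', 'low' or 'both'")


def sign_extend_to_16(value, width):
    """Sign-extend a `width`-bit register value (width < 16) to a 16-bit pattern."""
    field_mask = (1 << width) - 1
    value &= field_mask
    if value & (1 << (width - 1)):
        value |= 0xFFFF & ~field_mask
    return value


mask = extend_mask(0xA5, "both")
bsc_as_16_bit = sign_extend_to_16(0x3D, 6)   # a 6-bit register holding -3
```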
4.2 Address Calculation Unit (ACU)
4.2.1 Introduction
The D950-Core ACU includes two identical address generators (one for each data memory space), each containing:
2 x 16-bit address registers
4 x 16-bit index registers
Adder for address register update
2 x 16-bit base and maximum address registers for modulo addressing. There is dedicated logic for address comparison and calculation.
Figure 4.6 D950-Core Address C alculation Unit
In addition to these two address generators, the D950-Core ACU includes:
16-bit Stack Pointer (SPX) register for the X-memory space mapped stack
16-bit Stack Pointer (SPY) register for the Y-memory space mapped stack
6 bits of STA register for addressing modes
4.2.2 Registers
The D950-Core ACU includes two types of registers: data registers and control registers.
Data registers:
The following registers are directly addressed by instructions:
2 x 16-bit pointer registers and 4 x 16-bit index registers are dedicated for each data memory space:
AX0/AX1 (pointer), IX0/IX1/IX2/IX3 (index) for X-memory space,
AY0/AY1 (pointer), IY0/IY1/IY2/IY3 (index) for Y-memory space.
In addition to these registers, 16-bit SP registers address the stacks located in the X and Y-memory spaces.
The following four registers are mapped in Y-memory space:
2 x 16-bit base and maximum address registers are dedicated for each Data memory space:
BX (Base), MX (Maximum) for X-memory space,
BY (Base), MY (Maximum) for Y-memory space.
Control Register:
STA: Bits 8 to 13 are dedicated to the ACU (see Section 4.5.1). Index register values are 16-bit signed.
4.2.3 Addressing modes
The D950-Core provides the following addressing modes:
Addressing Modes   Type
DIRECT
INDIRECT           LINEAR POST-INCREMENT
                   MODULO POST-INCREMENT
                   BIT-REVERSE POST-INCREMENT
                   INDEXED
IMMEDIATE
STACK
Direct addressing
Memory direct addressing instructions require one extension word to provide the memory address, and are executed in two cycles. They are used for data moves between memory and direct addressable registers.
Registers addressable by the instruction code include:
16 for DCU (L1/L0/R1/R0/A1E/A1H/A1L/A0E/A0H/A0L/PH/PL/BSC/PSC/STA/CCR),
13 for ACU (AX0/AX1/IX0/IX1/IX2/IX3/AY0/AY1/IY0/IY1/IY2/IY3/SPX),
3 for PCU (LS/LC/LE).
Figure 4.7 Direct Addressing
Indirect addressing
See RX1, RX0, MY1, MY0, MX1 and MX0 bits of the STA register. The instruction specifies the address register (AX0, AX1, AY0, AY1) of the operand to process, and the address calculation to be performed, according to STA register content.
At the end of the instruction, the new address register (AXi / AYi) contains the previously selected address (AXi / AYi), post-incremented by the corresponding index registers (IXi / IYi).
Four types of indirect addressing modes are implemented:
1: Linear addressing with post-modification.
Address modification is done using the normal 16-bit 2's complement linear arithmetic.
2: Modulo addressing with post-modification.
This mode can be selected individually for the AX0, AX1, AY0 and AY1 registers (see MX0, MX1, MY0, MY1 bits of STA register).
BX / MX: 16-bit Base / Maximum address registers for AX0 / AX1,
BY / MY: 16-bit Base / Maximum address registers for AY0 / AY1.
Base and maximum addresses can be defined to any value, provided that:
the maximum address is greater than the base address,
the starting address is initialized within the base/maximum address range,
the index absolute value is less than or equal to the maximum address minus the base address.
3: Bit reverse addressing (on X-memory space only) with post-increment.
This mode can be selected for AX0, AX1 (see RX0, RX1 bits of STA register). It generates the bit-reversed address for a 2^k-point FFT implementation (index value = 2^(k-1)).
4: Indirect indexed addressing. The address of the operand is the sum of the contents of the address register (AXi, AYi, SPX or SPY) and the contents of the selected index register (IXi or IYi). This addition occurs before the operand is accessed and therefore requires an extra instruction cycle. The contents of the selected address and index registers are unchanged.
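The linear, modulo and indexed address calculations listed above can be sketched in C as follows. This is a behavioural illustration under stated assumptions, not the ACU implementation: the function names are hypothetical, and the wrap rule used for modulo addressing (subtract or add the buffer length, maximum minus base plus one, when the new address leaves the range) is the conventional circular-buffer rule, assumed here because this section does not spell it out.

```c
#include <stdint.h>

/* Linear post-modification: the current address is used for the access,
   then the signed index is added with 16-bit wrap-around. */
uint16_t linear_post_modify(uint16_t *addr, int16_t index)
{
    uint16_t used = *addr;
    *addr = (uint16_t)(*addr + (uint16_t)index);
    return used;
}

/* Modulo post-modification inside the circular buffer [base, max].
   Assumes base < max, base <= *addr <= max and |index| <= max - base,
   so a single correction is always enough. */
uint16_t modulo_post_modify(uint16_t *addr, int16_t index,
                            uint16_t base, uint16_t max)
{
    uint16_t used = *addr;
    int32_t length = (int32_t)max - (int32_t)base + 1;
    int32_t next = (int32_t)*addr + index;
    if (next > (int32_t)max)
        next -= length;           /* ran past the top: wrap to the bottom */
    else if (next < (int32_t)base)
        next += length;           /* ran below the base: wrap to the top */
    *addr = (uint16_t)next;
    return used;
}

/* Indexed addressing: the sum is formed before the access and neither
   the address register nor the index register is modified. */
uint16_t indexed_address(uint16_t addr, int16_t index)
{
    return (uint16_t)(addr + (uint16_t)index);
}
```

With base 0x0200 and maximum 0x0207, an address register at 0x0206 with an index of 3 is used as 0x0206 and then wraps to 0x0201. The three conditions on base, maximum and index are what guarantee that one correction step keeps the address inside the buffer.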
Figures 4.8 and 4.9 show the schematics for indirect addressing with and without post modification.
Figure 4.8 Indirect Addressing with Post-Modification
Figure 4.9 Indirect Indexed Addressing without Post-Modification
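For the bit-reverse mode, the usual way to generate FFT addresses is reverse-carry addition: the index is added to the address with the carry propagating from the most significant bit towards the least significant bit. The C sketch below models this by bit-reversing both operands, adding them normally and reversing the result. It assumes that the D950-Core follows this conventional scheme, that the index is half the FFT length, and that the buffer starts on a multiple of the FFT length; the function names are hypothetical.

```c
#include <stdint.h>

/* Reverse the bit order of a 16-bit word (bit 0 <-> bit 15, and so on). */
uint16_t bitrev16(uint16_t x)
{
    uint16_t r = 0;
    for (int i = 0; i < 16; i++) {
        r = (uint16_t)((r << 1) | (x & 1u));
        x >>= 1;
    }
    return r;
}

/* Reverse-carry addition: an ordinary addition performed on the
   bit-reversed operands, so the carry travels from MSB to LSB. */
uint16_t reverse_carry_add(uint16_t addr, uint16_t index)
{
    return bitrev16((uint16_t)(bitrev16(addr) + bitrev16(index)));
}
```

For an 8-point FFT stored at 0x0100 with an index of 4, repeated calls visit the offsets 0, 4, 2, 6, 1, 5, 3, 7 and then return to the start of the buffer.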
Immediate Data Addressing
This mode allows direct register loading. If the data is 16 bits long (see LR and LL bits of STA register), this mode requires one instruction extension word to store the data.
Immediate short data addressing is possible on 6-bit data, without instruction extension. If AXi, AYi or STA are concerned, the 6 LSBs are loaded from the instruction and the MSBs are unchanged. For all other registers, the MSBs are filled with 0.
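The effect of a 6-bit short immediate load on the destination register can be sketched as follows. The function name and the `keeps_msbs` flag are hypothetical; the flag stands for "the destination is AXi, AYi or STA".

```c
#include <stdint.h>
#include <stdbool.h>

/* Returns the new 16-bit register value after a short immediate load.
   keeps_msbs: true for AXi, AYi and STA, false for all other registers. */
uint16_t load_short_immediate(uint16_t reg, uint8_t imm6, bool keeps_msbs)
{
    uint16_t low = (uint16_t)(imm6 & 0x3Fu);     /* only 6 bits are encoded */
    if (keeps_msbs)
        return (uint16_t)((reg & 0xFFC0u) | low); /* upper 10 bits preserved */
    return low;                                   /* upper 10 bits set to 0 */
}
```

Loading 0x2A into an address register that holds 0x1234 gives 0x122A, whereas the same load into any other register gives 0x002A.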
Figure 4.10 Immediate Addressing
Stack operation addressing
A 16-bit Stack Pointer register is available for each data memory space: SPX for X-memory space and SPY for Y-memory space. Each can be initialized to any value, provided it points to a memory area dedicated to the stack. The stack size is limited to the available memory. No provision is made to detect stack overflows or underflows. After reset, the SP registers are not initialized.
The following addressing modes are possible with the 16-bit SP registers:
For the X and Y stack pointer registers: PUSH (SP pre-decrement) or POP (SP post-increment) for register-to-stack move, memory-to-stack move and immediate value-to-stack move. Double PUSH and double POP: in this operation, the PUSH or POP is performed simultaneously on the X and Y stack pointer registers. This is used for context switching.
For the X stack pointer register only: indirect indexed addressing for register-to-stack move.
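The PUSH and POP pointer updates described above can be modelled in C as follows. The function names and the use of plain arrays for the X and Y memory spaces are assumptions made for the illustration. As in the text, nothing checks for overflow or underflow, and the stack pointers must be initialized by software.

```c
#include <stdint.h>

/* PUSH: pre-decrement the stack pointer, then store the value. */
void stack_push(uint16_t *mem, uint16_t *sp, uint16_t value)
{
    *sp = (uint16_t)(*sp - 1u);
    mem[*sp] = value;
}

/* POP: read the value, then post-increment the stack pointer. */
uint16_t stack_pop(const uint16_t *mem, uint16_t *sp)
{
    uint16_t value = mem[*sp];
    *sp = (uint16_t)(*sp + 1u);
    return value;
}

/* Double PUSH: one word onto each of the X and Y stacks. On the core
   both moves happen in the same operation; here they are sequential. */
void stack_push_xy(uint16_t *xmem, uint16_t *spx, uint16_t vx,
                   uint16_t *ymem, uint16_t *spy, uint16_t vy)
{
    stack_push(xmem, spx, vx);
    stack_push(ymem, spy, vy);
}

/* Double POP: one word from each of the X and Y stacks. */
void stack_pop_xy(const uint16_t *xmem, uint16_t *spx, uint16_t *vx,
                  const uint16_t *ymem, uint16_t *spy, uint16_t *vy)
{
    *vx = stack_pop(xmem, spx);
    *vy = stack_pop(ymem, spy);
}
```

Because PUSH pre-decrements and POP post-increments, the stack grows towards lower addresses and the stack pointer always addresses the last word pushed.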
4.3 Program Control Unit (PCU)
4.3.1 Introduction
The D950-Core PCU includes the following components:
16-bit Program Counter (PC)
9 x 16-bit Loop registers (3 x LS, 3 x LE, 3 x LC)
Branch and Hardware Loops control logic including CCR and PORT condition decoding
2 bits of STA register for interrupt control
2 bits of CCR for loop management
Reset, Hold and Low-Power operation control logic
Stack control logic for automatic PC save and restore in Subroutine Calls and Interrupts. (The Stack is implemented in a user-defined dedicated X-RAM area. The Stack pointer and its control logic are included in the ACU, see Section 4.2.1.)
P-Port
Figure 4.11 D950-Core Program Control Unit
4.3.2 Registers
Data registers
LS0 / LS1 / LS2: 3 x 16-bit Loop Start address registers,
LE0 / LE1 / LE2: 3 x 16-bit Loop End address registers,
LC0 / LC1 / LC2: 3 x 16-bit Loop Count registers.
All these registers are addressed directly by the LSP instruction (see Section 4.3.5). After reset, LSP = 0 (no hardware loop is selected).
Control registers
STA: Bits 14 and 15 are dedicated to PCU (see Section 4.5.1).
CCR: Bits 14 and 15 are dedicated to PCU (see Section 4.5.2).
4.3.3 Instruction pipeline
Instruction execution is performed in a 3-stage pipeline: fetch/decode/execute. While instruction n is executed, instruction n+1 is decoded and instruction n+2 is fetched.
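The overlap between the three stages can be made concrete with a small C model that reports which instruction occupies each stage in a given cycle. It is a simplification that assumes a straight-line sequence of one-word, one-cycle instructions with no program flow change; the type and function names are hypothetical.

```c
/* Index of the instruction in each stage, or -1 if the stage is empty. */
typedef struct {
    int fetch;
    int decode;
    int execute;
} pipeline_state_t;

/* Stage occupancy at `cycle` (0-based) for a program of n_instructions
   instructions numbered 0 .. n_instructions - 1. */
pipeline_state_t pipeline_at(int cycle, int n_instructions)
{
    pipeline_state_t s;
    s.fetch   = (cycle >= 0 && cycle     < n_instructions) ? cycle     : -1;
    s.decode  = (cycle >= 1 && cycle - 1 < n_instructions) ? cycle - 1 : -1;
    s.execute = (cycle >= 2 && cycle - 2 < n_instructions) ? cycle - 2 : -1;
    return s;
}
```

In cycle 2 of a five-instruction program, instruction 0 is executed while instruction 1 is decoded and instruction 2 is fetched, which matches the n, n+1, n+2 relationship stated above.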
The instruction cycle period is twice the CLKIN period. According to the number of words used, D950-Core instructions can be of two types:
One word instruction: Inside this group, most D950-Core instructions are one cycle instructions (all arithmetic and logic instructions except instructions performing double precision multiplication and bit manipulations). Some instructions are multiple cycle instructions. Instructions causing a program flow change (JUMP, CALL, RTS, RTI, SWI, RESET, BREAK, CONTINUE) are executed in two or three cycles.
Instructions with extension words: As one program memory word is fetched at each cycle, if an instruction needs extension words, they are fetched during the cycles following the first fetch.
4.3.4 Interrupt Sources
The D950-Core includes three interrupt sources. The following table orders the interrupt sources from highest to lowest priority.
Table 4.2 Interrupt Sources and Priority
Sources                  Priority
RESET  Non-Maskable      Highest
SWI    Non-Maskable
INT    Maskable          Lowest
RESET
Non-maskable (internally vectorized), either hardware or software (see Table 6.3.3 “Hardware Reset”).
In hardware, when a low level is applied to the RESET input, the CLOCK generator is re-synchronized, the PC is reset, execution of NOP instructions is forced and control registers are initialized.
In order to get a valid reset, a low level must be applied for a minimum of ten CLKIN cycles (i.e. five D950-Core cycles).
In software, the RESET instruction is a 3-cycle instruction having the same effects as a hardware reset, except that the CLOCK generator is not re-synchronized.
The reset address is 0x0000. By setting the MODE pin to 1, the alternate reset address 0xFC00 is selected.
INT
Maskable external interrupt: EI and IPE bits of STA register (see Table 6.3.5 “Interrupt”).
Start of Interrupt: External interrupt is disabled on reset and is enabled by setting EI-bit to 1. As soon as an IT falling edge is memorized and recognized by the PCU at the beginning of an instruction cycle, IPE-bit is set. Provided IT has been previously enabled, the ITACK signal is asserted low to acknowledge the interrupt. ITACK stays at the low state for one cycle, allowing the interrupt vector to be provided by the controller on Y-bus. Then IPE-bit is reset. Interrupt start processing requires three cycles to read the interrupt vector and to fetch the corresponding instruction. Meanwhile, the CCR register, STA and the return address are automatically saved onto the stack, located in X-memory space.
Return from Interrupt: Return from the interrupt is performed by the RTI instruction, a 3-cycle instruction during which the return address, STA register and CCR are retrieved from the stack. The EOI signal is then asserted low, allowing the controller to arbitrate pending interrupt requests and to issue, if required, the next interrupt request to the D950-Core.
An interrupt request that is recognized while decoding or executing a delayed branch instruction is not acknowledged until all operations related to the branch have been completed.
In addition to this external interrupt source, a powerful interrupt controller is available as a peripheral of the D950-Core (see Section 7.3).
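The priority order of Table 4.2 can be expressed as a simple selection function. The C sketch below is illustrative only: the enum, the function name and the idea of sampling all three requests at once are assumptions, while the ordering (RESET, then SWI, then INT) and the masking of INT by the EI bit come from the text.

```c
#include <stdbool.h>

/* Hypothetical identifiers for the three sources, in priority order. */
typedef enum {
    SRC_NONE  = -1,
    SRC_RESET = 0,   /* non-maskable, highest priority */
    SRC_SWI   = 1,   /* non-maskable */
    SRC_INT   = 2    /* maskable by the EI bit of STA, lowest priority */
} irq_source_t;

/* Pick the source to serve among the pending requests. */
irq_source_t select_source(bool reset_req, bool swi_req,
                           bool int_req, bool ei_bit)
{
    if (reset_req)
        return SRC_RESET;
    if (swi_req)
        return SRC_SWI;
    if (int_req && ei_bit)
        return SRC_INT;
    return SRC_NONE;
}
```

An external interrupt request is therefore served only when EI is set and neither RESET nor SWI is pending.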