ADSP-21161 SHARC® Processor

Hardware Reference

Analog Devices, Inc. One Technology Way Norwood, Mass. 02062-9106

Revision 4.0, February 2005

Part Number

82-001944-01

Printed in the USA.

Disclaimer

Analog Devices, Inc. reserves the right to change this product without prior notice. Information furnished by Analog Devices is believed to be accurate and reliable. However, no responsibility is assumed by Analog Devices for its use; nor for any infringement of patents or other rights of third parties which may result from its use. No license is granted by implication or otherwise under the patent rights of Analog Devices, Inc.

Trademark and Service Mark Notice

The Analog Devices logo, EZ-KIT Lite, SHARC, the SHARC logo and VisualDSP++ are registered trademarks of Analog Devices, Inc.

All other brand and product names are trademarks or service marks of their respective owners.

CONTENTS

INTRODUCTION

Design Advantages ........................................................................ 1-1

Architecture Overview ................................................................... 1-5

Processor Core ......................................................................... 1-5

Processing Elements ............................................................ 1-6

Program Sequence Control .................................................. 1-7

Processor Internal Buses .................................................... 1-10

Processor Peripherals .............................................................. 1-11

Dual-Ported Internal Memory (SRAM) ............................. 1-11

External Port ..................................................................... 1-12

I/O Processor .................................................................... 1-14

JTAG Port ............................................................................. 1-16

Differences From Previous SHARC Processors ............................. 1-16

Processor Core Enhancements ................................................ 1-17

Processor Internal Bus Enhancements ..................................... 1-17

Memory Organization Enhancements .................................... 1-18

External Port Enhancements .................................................. 1-18

Host Interface Enhancements ............................................ 1-18

Multiprocessor Interface Enhancements ............................. 1-19

I/O Architecture Enhancements .............................................. 1-19

DMA Controller Enhancements ........................................ 1-19

Link Port Enhancements ................................................... 1-19

Instruction Set Enhancements ............................................... 1-20

For More Information About Analog Products ............................. 1-21

For Technical or Customer Support ............................................. 1-22

What’s New in This Manual ....................................................... 1-22

Related Documents .................................................................... 1-23

Conventions ............................................................................... 1-24

PROCESSING ELEMENTS

Setting Computational Modes ...................................................... 2-4

32-Bit (Normal Word) Floating-Point Format .......................... 2-4

40-Bit Floating-Point Format .................................................. 2-5

16-Bit (Short Word) Floating-Point Format ............................. 2-6

32-Bit Fixed-Point Format ....................................................... 2-6

Rounding Mode ...................................................................... 2-7

Using Computational Status ......................................................... 2-8

Arithmetic Logic Unit (ALU) ........................................................ 2-9

ALU Operation ....................................................................... 2-9

ALU Saturation ..................................................................... 2-10

ALU Status Flags ................................................................... 2-11

ALU Instruction Summary .................................................... 2-12

Multiply-Accumulator (Multiplier) ........................................... 2-15

Multiplier Operation ............................................................. 2-15

Multiplier (Fixed-Point) Result Register ................................. 2-16

Multiplier Status Flags ........................................................... 2-19

Multiplier Instruction Summary ............................................ 2-20

Barrel-Shifter (Shifter) ................................................................. 2-23

Shifter Operation .................................................................. 2-23

Shifter Status Flags ................................................................ 2-27

Shifter Instruction Summary .................................................. 2-28

Data Register File ........................................................................ 2-30

Alternate (Secondary) Data Registers ........................................... 2-32

Multifunction Computations ...................................................... 2-34

Secondary Processing Element (PEy) ............................................ 2-37

Dual Compute Units Sets ...................................................... 2-39

Dual Register Files ................................................................. 2-42

Dual Alternate Registers ........................................................ 2-43

SIMD (Computational) Operations ....................................... 2-43

SIMD And Status Flags ......................................................... 2-46

PROGRAM SEQUENCER

Instruction Pipeline ...................................................................... 3-7

Instruction Cache ......................................................................... 3-8

Using the Cache .................................................................... 3-11

Optimizing Cache Usage ....................................................... 3-11

Branches and Sequencing ............................................................ 3-13

Conditional Branches ............................................................ 3-15

Delayed Branches .................................................................. 3-15

Restrictions and Limitations When Using Delayed Branches .......................................................... 3-19

Loops and Sequencing ................................................................ 3-22

Restrictions on Ending Loops ................................................ 3-25

Restrictions on Short Loops .................................................. 3-26

Loop Address Stack ............................................................... 3-29

Loop Counter Stack .............................................................. 3-30

Interrupts and Sequencing .......................................................... 3-34

Sensing Interrupts ................................................................. 3-40

Masking Interrupts ............................................................... 3-41

Latching Interrupts ............................................................... 3-42

Stacking Status During Interrupts .......................................... 3-44

Nesting Interrupts ................................................................. 3-45

Reusing Interrupts ................................................................ 3-47

Interrupting IDLE ................................................................ 3-48

Multiprocessing Interrupts .................................................... 3-49

Timer and Sequencing ................................................................ 3-50

Stacks and Sequencing ................................................................ 3-52

Conditional Sequencing .............................................................. 3-53

SIMD Mode and Sequencing ...................................................... 3-57

Conditional Compute Operations ......................................... 3-58

Conditional Branches and Loops ........................................... 3-59

Conditional Data Moves ....................................................... 3-59

Case 1: Complementary Register Pair Data Move .............. 3-60

Case 2: Uncomplemented-to-Complementary

Case 3: Complementary Register => Uncomplementary

Case 4: Data Move Involves External Memory or IOP Memory Space ........................................................ 3-65

Conditional DAG Operations ................................................ 3-66

DATA ADDRESS GENERATOR

Setting DAG Modes ...................................................................... 4-2

Circular Buffering Mode .......................................................... 4-4

Broadcast Loading Mode ......................................................... 4-5

Alternate (Secondary) DAG Registers ....................................... 4-6

Bit-Reverse Addressing Mode .................................................... 4-8

Using DAG Status ........................................................................ 4-8

DAG Operations ........................................................................... 4-9

Addressing With DAGs ......................................................... 4-10

Addressing Circular Buffers ................................................... 4-12

Modifying DAG Registers ...................................................... 4-17

Addressing in SISD and SIMD Modes ................................... 4-18

DAGs, Registers, and Memory .................................................... 4-18

DAG Register-to-Bus Alignment ............................................ 4-19

DAG Register Transfer Restrictions ........................................ 4-21

DAG Instruction Summary ......................................................... 4-23

MEMORY

Internal Memory .......................................................................... 5-2

External Memory .......................................................................... 5-2

Processor Architecture .................................................................. 5-4

Off-Chip Memory and Peripherals Interface .................................. 5-6

Buses ............................................................................................ 5-7

Internal Address and Data Buses .............................................. 5-7

Internal Data Bus Exchange .................................................. 5-10

ADSP-21161 Memory Map ........................................................ 5-16

Internal Memory ................................................................... 5-16

Multiprocessor Memory ........................................................ 5-19

External Memory .................................................................. 5-22

Shadow Write FIFO .............................................................. 5-24

Memory Organization and Word Size .................................... 5-25

Placing 32-Bit Words and 48-Bit Words ............................ 5-25

Mixing 32-Bit and 48-Bit Words ....................................... 5-26

Restrictions on Mixing 32-Bit and 48-Bit Words ............... 5-28

48-Bit Word Allocation .................................................... 5-31

Setting Data Access Modes .......................................................... 5-32

SYSCON Register Control Bits ............................................. 5-32

Mode 1 Register Control Bits ................................................ 5-34

Mode 2 Register Control Bits ................................................ 5-34

Wait Register Control Bits ..................................................... 5-34

Using Boot Memory .............................................................. 5-35

Reading From Boot Memory ............................................. 5-35

Writing to Boot Memory ................................................... 5-36

Internal Interrupt Vector Table .............................................. 5-37

Internal Memory Data Width ................................................ 5-37

Memory Bank Size ................................................................ 5-38

External Bus Priority ............................................................. 5-39

Secondary Processor Element (PEy) ........................................ 5-39

Broadcast Register Loads ....................................................... 5-40

Illegal I/O Processor Register Access ....................................... 5-41

Unaligned 64-Bit Memory Access .......................................... 5-41

External Bank X Access Mode ................................................ 5-42

External Bank X Waitstates .................................................... 5-45

Using Memory Access Status ....................................................... 5-46

Accessing Memory ...................................................................... 5-46

Access Word Size ................................................................... 5-47

Long Word (64-Bit) Accesses ............................................. 5-48

Instruction Word (48-Bit) and Extended-Precision Normal Word (40-Bit) Accesses ...................................... 5-50

Normal Word (32-Bit) Accesses ......................................... 5-50

Short Word (16-Bit) Accesses ............................................ 5-51

SISD, SIMD, and Broadcast Load Modes ............................... 5-51

Single and Dual Data Accesses ............................................... 5-52

Data Access Options .............................................................. 5-52

Short Word Addressing of Single Data in SISD Mode ........ 5-54

Short Word Addressing of Single Data in SIMD Mode ....... 5-56

Short Word Addressing of Dual-Data in SISD Mode ......... 5-58

Short Word Addressing of Dual-Data in SIMD Mode ....... 5-60

32-Bit Normal Word Addressing of Single Data in SISD Mode ................................................................... 5-62

32-Bit Normal Word Addressing of Single Data in SIMD Mode .................................................................. 5-64

32-Bit Normal Word Addressing of Dual Data in SISD Mode ................................................................... 5-66

32-Bit Normal Word Addressing of Dual Data in SIMD Mode .................................................................. 5-68

Extended-Precision Normal Word Addressing of Single Data .................................................................... 5-70

Extended-Precision Normal Word Addressing of Dual Data in SISD Mode ....................................................... 5-72

Extended-Precision Normal Word Addressing of Dual Data in SIMD Mode ..................................................... 5-74

Long Word Addressing of Single Data ............................... 5-76

Long Word Addressing of Dual Data in SISD Mode .......... 5-78

Long Word Addressing of Dual Data in SIMD Mode ........ 5-80

Mixed Word Width Addressing of Dual Data in SISD Mode ................................................................... 5-82

Mixed Word Width Addressing of Dual Data in SIMD Mode .................................................................. 5-84

Broadcast Load Access ...................................................... 5-86

Shadow Write FIFO Considerations in SIMD Mode .............. 5-95

Arranging Data in Memory ....................................................... 5-100

Executing Instructions From External Memory .......................... 5-101

32- to 48-Bit Packing Address Generation Scheme ............... 5-109

Total Program Size (32- to 48-Bit Packing) ...................... 5-110

16- to 48-Bit Packing Address Generation Scheme ............... 5-111

Total Program Size (16- to 48-Bit Packing) ...................... 5-111

8- to 48-Bit Packing Address Generation Scheme ................. 5-112

Total Program Size (8- to 48-Bit Packing) ........................ 5-113

No Packing (48- to 48-Bit) Address Generation Scheme ....... 5-113

I/O PROCESSOR

DMA Channel Allocation and Priorities ...................................... 6-16

DMA Interrupt Vector Locations ................................................. 6-18

Booting Modes ........................................................................... 6-20

DMA Controller Operation ........................................................ 6-20

Managing DMA Channel Priority .......................................... 6-22

Chaining DMA Processes ...................................................... 6-25

Transfer Control Block (TCB) Chain Loading ................... 6-26

Setting Up and Starting the Chain ..................................... 6-28

Inserting a TCB in an Active Chain ................................... 6-28

External Port DMA ..................................................................... 6-29

External Port Registers ........................................................... 6-30

External Port FIFO Buffers .................................................... 6-33

External Port DMA Data Packing .......................................... 6-34

32-Bit Bus Downloading ................................................... 6-37

16-Bit Bus Downloading ................................................... 6-38

8-Bit Bus Downloading ..................................................... 6-39

Boot Memory DMA Mode .................................................... 6-42

External Port Buffer Modes ................................................... 6-42

External Port Channel Priority Modes ................................... 6-43

External Port Channel Transfer Modes ................................... 6-46

External Port Channel Handshake Modes .............................. 6-47

Master Mode .................................................................... 6-50

Paced Master Mode .......................................................... 6-54

Slave Mode ....................................................................... 6-55

Handshake Mode ............................................................. 6-57

DMA Handshake Idle Cycle .................................................. 6-64

External-Handshake Mode ................................................ 6-66

Setting Up External Port DMA .............................................. 6-68

Bootloading Through the External Port ................................ 6-70

Host Processor Booting ..................................................... 6-72

PROM Booting ................................................................ 6-74

External Port DMA Programming Examples .......................... 6-76

Link Port DMA .......................................................................... 6-81

Link Port Registers ................................................................ 6-81

Link Port Buffer Modes ......................................................... 6-83

Link Port Channel Priority Modes ......................................... 6-83

Link Port Channel Transfer Modes ........................................ 6-85

Setting Up Link Port DMA ................................................... 6-86

Bootloading Through the Link Port ..................................... 6-88

Link Port DMA Programming Examples ................................ 6-90

Serial Port DMA ......................................................................... 6-95

Serial Port Registers ............................................................... 6-96

Serial Port Buffer Modes ........................................................ 6-97

Serial Port Channel Priority Modes ........................................ 6-99

Serial Port Channel Transfer Modes ....................................... 6-99

Setting Up Serial Port DMA ................................................ 6-100

SPORT DMA Programming Examples ................................. 6-102

SPI Port DMA .......................................................................... 6-108

SPI Port Registers ................................................................ 6-108

SPI Port Buffer .................................................................... 6-109

SPI DMA Channel Priority .................................................. 6-112

Setting Up SPI Port DMA .................................................... 6-112

Bootloading Through the SPI Port ....................................... 6-113

SPI Port DMA Programming Examples ................................ 6-116

Using I/O Processor Status ........................................................ 6-121

External Port Status ............................................................. 6-127

Link Port Status .................................................................. 6-131

Serial Port Status ................................................................. 6-135

SPI Port Status .................................................................... 6-137

Optimizing DMA Throughput .................................................. 6-139

Internal Memory DMA ....................................................... 6-139

External Memory DMA ....................................................... 6-140

System-Level Considerations ................................................ 6-144

EXTERNAL PORT

Setting External Port Modes .......................................................... 7-3

External Memory Interface ........................................................... 7-3

Banked External Memory ........................................................ 7-9

Boot Memory ....................................................................... 7-10

Idle Cycle ......................................................................... 7-10

Data Hold Cycle ............................................................... 7-12

Multiprocessor Memory Space Waitstates and Acknowledge ................................................................. 7-12

Timing External Memory Accesses ......................................... 7-13

Asynchronous Mode Interface Timing ............................... 7-14

Synchronous Mode Interface Timing ................................ 7-18

Synchronous Burst Mode Interface Timing ....................... 7-26

Using External SBSRAM ....................................................... 7-36

SBSRAM Restrictions ........................................................... 7-41

Host Processor Interface ............................................................. 7-42

Acquiring the Bus ................................................................. 7-44

Asynchronous Transfers ......................................................... 7-48

Host Transfer Timing ............................................................ 7-51

Host Interface Deadlock Resolution With SBTS .................... 7-54

Slave Reads and Writes .......................................................... 7-55

IOP Shadow Registers ....................................................... 7-55

Instruction Transfers ......................................................... 7-56

Slave Write Latency .......................................................... 7-56

Slave Reads ....................................................................... 7-57

Broadcast Writes .................................................................... 7-57

Data Transfers Through the EPBx Buffers .............................. 7-58

DMA Transfers ...................................................................... 7-58

Host Data Packing ................................................................. 7-59

Packing Mode Variations For Host Accesses ........................... 7-61

IOP Register Host Accesses ............................................... 7-62

LINK Port Buffer Access ................................................... 7-63

EPBx Buffer Accesses ........................................................ 7-64

8- to 32-Bit Data Packing .................................................. 7-66

16- to 32-Bit Packing ........................................................ 7-69

48-Bit Instruction Packing ................................................ 7-74

Host Interface Status ............................................................. 7-76

Interprocessor Messages and Vector Interrupts ........................ 7-76

Message Passing (MSGRx) ................................................ 7-77

Host Vector Interrupts (VIRPT) ........................................ 7-78

System Bus Interfacing .......................................................... 7-78

Access to the Processor Bus – Slave Processor ..................... 7-79

Access to the System Bus – Master Processor ...................... 7-79

Processor Core Access to System Bus ................................. 7-82

Deadlock Resolution ......................................................... 7-82

DMA Access to System Bus ............................................... 7-84

Multiprocessing With Local Memory ................................. 7-85

ADSP-21161 to Microprocessor Interface .......................... 7-85

Multiprocessor (MP) Interface .................................................... 7-87

Multiprocessing System Architectures .................................... 7-90

Data Flow Multiprocessing ............................................... 7-90

Cluster Multiprocessing .................................................... 7-91

Multiprocessor Bus Arbitration .............................................. 7-93

Bus Arbitration Protocol ................................................... 7-95

Bus Arbitration Priority (RPBA) ....................................... 7-98

Bus Mastership Timeout ................................................. 7-101

Priority Access ................................................................ 7-103

Bus Synchronization After Reset .......................................... 7-105

Booting Another Processor .................................................. 7-108

Multiprocessor Writes and Reads ......................................... 7-109

Instruction Transfers ....................................................... 7-110

Bus Lock and Semaphores ................................................... 7-110

Multiprocessor Interface Status ....................................... 7-112

SDRAM INTERFACE

SDRAM Pin Connections ............................................................. 8-7

SDRAM Timing Specifications ..................................................... 8-8

SDRAM Control Register (SDCTL) ............................................. 8-9

SDRAM Configuration for Runtime ........................................... 8-10

Setting the Refresh Counter Value (SDRDIV) ....................... 8-13

Setting the SDRAM Clock Enables ........................................ 8-14

Setting the Number of SDRAM Banks (SDBN) ..................... 8-15

Setting the External Memory Bank (SDEMx) ........................ 8-16

Setting the SDRAM Buffering Option (SDBUF) .................... 8-16

Selecting the CAS Latency Value (SDCL) ............................... 8-17

Selecting the SDRAM Page Size (SDPGS) .............................. 8-18

Setting the SDRAM Power-Up Mode (SDPM) ....................... 8-19

Starting the SDRAM Power-Up Sequence (SDPSS) ................ 8-19

Starting Self-Refresh Mode (SDSRF) ..................................... 8-20

Selecting the Active Command Delay (SDTRAS) ................... 8-20

Selecting the Precharge Delay (SDTRP) ................................. 8-21

Selecting the RAS-to-CAS Delay (SDTRCD) ......................... 8-21

SDRAM Controller Standard Operation ...................................... 8-22

Understanding DAG and DMA Operation ............................. 8-22

Multiprocessing Operation .................................................... 8-24

Accessing SDRAM ................................................................ 8-25

Address Mapping for SDRAM ........................................... 8-27

Understanding DQM Operation ............................................ 8-29

Executing a Parallel Refresh Command During Host Control ...................................................................... 8-29

Powering Up After Reset ........................................................ 8-30

Entering and Exiting Self-Refresh Mode ................................. 8-31

SDRAM Controller Commands .................................................. 8-31

Bank Activate (ACT) Command ............................................ 8-32

Mode Register Set (MRS) ...................................................... 8-32

Precharge Command (PRE) ................................................... 8-33

Read/Write Command ........................................................... 8-34

Read Commands ............................................................... 8-34

Write Commands ............................................................. 8-36

DMA Transfers ................................................................. 8-37

Refresh (REF) Command ...................................................... 8-37

Setting the Delay Between Refresh Commands .................. 8-37

Understanding Multiprocessing Operation ........................ 8-38

Self Refresh Command (SREF) .............................................. 8-39

Programming Example .......................................................... 8-40

LINK PORTS

Link Port to Link Buffer Assignment ............................................. 9-3

Link Port DMA Channels ............................................................. 9-4

Link Port Booting ......................................................................... 9-5

Setting Link Port Modes ............................................................... 9-5

Link Port Control Register (LCTL) Bit Descriptions ................ 9-7

Link Data Path and Compatibility Modes ................................ 9-9

Using Link Port Handshake Signals ............................................. 9-10

Using Link Buffers ...................................................................... 9-12

Core Processor Access To Link Buffers ................................... 9-13

Host Processor Access To Link Buffers ................................... 9-14

Using Link Port DMA ................................................................ 9-16

Using Link Port Interrupts .......................................................... 9-17

Link Port Interrupts With DMA Enabled .............................. 9-18

Link Port Interrupts With DMA Disabled ............................. 9-19

Link Port Service Request Interrupts (LSRQ) ......................... 9-19

Detecting Errors on Link Transmissions ...................................... 9-22

Link Port Programming Examples .......................................... 9-23

Using Token Passing With Link Ports .......................................... 9-27

Designing Link Port Systems ....................................................... 9-30

Terminations for Link Transmission Lines .............................. 9-30

Peripheral I/O Using Link Ports ............................................. 9-31

Data Flow Multiprocessing With Link Ports ........................... 9-33

SERIAL PORTS

Serial Port Pins ........................................................................... 10-3

SPORT Interrupts ...................................................................... 10-7

SPORT Reset .............................................................................. 10-8

SPORT Control Registers and Data Buffers ................................. 10-9

Serial Port Control Registers (SPCTLx) ................................ 10-14

Transmit and Receive Data Buffers ....................................... 10-30

Clock and Frame Sync Frequencies (DIV) ............................ 10-33

Data Word Formats ................................................................... 10-35

Word Length ....................................................................... 10-36

Endian Format .................................................................... 10-36

Data Packing and Unpacking ............................................... 10-37

Data Type ....................................................................... 10-37

Companding ....................................................................... 10-39

Clock Signal Options ................................................................ 10-40

Frame Sync Options .................................................................. 10-41

Framed Versus Unframed ..................................................... 10-41

Internal Versus External Frame Syncs ................................... 10-42

Active Low Versus Active High Frame Syncs ........................ 10-43

Sampling Edge for Data and Frame Syncs ............................ 10-43

Early Versus Late Frame Syncs ............................................. 10-44

Data-Independent Transmit Frame Sync .............................. 10-45

SPORT Loopback .................................................................... 10-46

SPORT Operation Modes ......................................................... 10-47

I2S Mode ............................................................................ 10-48

Setting Internal Serial Clock and Frame Sync Rates ......... 10-49

I2S Control Bits ............................................................. 10-49

Setting Word Length (SLEN) .......................................... 10-49

Selecting Transmit Receive Channel Order (L_FIRST) .... 10-49

Selecting the Frame Sync Options (FS_BOTH) ............... 10-50

Enabling SPORT Master Mode (MSTR) ......................... 10-50

Enabling SPORT DMA (SDEN) .................................... 10-51

Multichannel Operation ...................................................... 10-52

Frame Syncs in Multichannel Mode ................................ 10-54

Multichannel Control Bits in SPCTL .............................. 10-55

Channel Selection Registers ............................................ 10-57

Transferring Data to Memory ................................................... 10-58

DMA Block Transfers .......................................................... 10-59

Setting Up DMA on SPORT Channels ........................... 10-60

SPORT DMA Parameter Registers ....................................... 10-61

SPORT DMA Chaining ................................................. 10-65

Single-Word Transfers .......................................................... 10-65

SPORT Pin/Line Terminations .................................................. 10-66

SPORT Programming Examples ................................................ 10-67

SERIAL PERIPHERAL INTERFACE (SPI)

Functional Description ............................................................... 11-2

SPI Interface Signals ................................................................... 11-3

SPICLK ................................................................................ 11-3

SPIDS ................................................................................... 11-4

FLAG ................................................................................... 11-5

MOSI ................................................................................... 11-6

MISO ................................................................................... 11-6

SPI Interrupts ............................................................................. 11-8

SPI IOP Registers ....................................................................... 11-9

SPI Control Register (SPICTL) .............................................. 11-9

Baud Rate Example ......................................................... 11-14

Seamless Operation ......................................................... 11-15

SPI Status Register (SPISTAT) ............................................. 11-15

SPI Transmit Data Buffer (SPITX) ....................................... 11-20

SPI Receive Data Buffer (SPIRX) ......................................... 11-20

SPI Shift Registers ............................................................... 11-21

SPI Data Word Formats ............................................................ 11-21

SPI Word Packing ............................................................... 11-24

SPI Operation Modes ................................................................ 11-24

Master Mode Operation ...................................................... 11-25

Interrupt and DMA Driven Transfers .............................. 11-26

Core Driven Transfers ..................................................... 11-26

Automatic Slave Selection ............................................... 11-26

User Controlled Slave Selection ....................................... 11-27

Slave Mode Operation ......................................................... 11-28

Error Signals and Flags ............................................................. 11-29

Multi-Master Error (MME) ................................................. 11-30

Transmission Error (TXE) ................................................... 11-30

Reception Error (RBSY) ...................................................... 11-31

SPI/Link Port DMA ................................................................. 11-32

DMA Operation in SPI Master Mode .................................. 11-32

DMA Operation in Slave Mode ........................................... 11-33

SPI Booting .............................................................................. 11-34

32-Bit SPI Host Boot .......................................................... 11-38

16-Bit SPI Host Boot .......................................................... 11-39

8-Bit SPI Host Boot ............................................................ 11-41

Multiprocessor SPI Port Booting ..................................... 11-42

SPI Programming Example ....................................................... 11-44

JTAG TEST-EMULATION PORT

JTAG Test Access Port ................................................................ 12-3

Instruction Register .................................................................... 12-4

EMUPMD Shift Register ...................................................... 12-5

EMUPX Shift Register .......................................................... 12-6

EMU64PX Shift Register ...................................................... 12-7

EMUPC Shift Register .......................................................... 12-7

EMUCTL Shift Register ........................................................ 12-8

EMUSTAT Shift Register .................................................... 12-11

BRKSTAT Shift Register ..................................................... 12-12

MEMTST Shift Register ...................................................... 12-13

PSx, DMx, IOx, and EPx (Breakpoint) Registers .................. 12-13

EMUN Register .................................................................. 12-16

EMUCLK and EMUCLK2 Registers ................................... 12-16

EMUIDLE Instruction ........................................................ 12-17

In Circuit Signal Analyzer (ICSA) Function ......................... 12-17

Boundary Register ..................................................................... 12-17

Device Identification Register .................................................... 12-28

Built-In Self-Test Operation (BIST) .......................................... 12-28

Private Instructions ................................................................... 12-28

References ................................................................................. 12-29

SYSTEM DESIGN

Pin Descriptions ......................................................................... 13-2

Input Synchronization Delay ............................................... 13-18

Pin States At Reset ............................................................... 13-19

Pull-Up and Pull-Down Resistors ......................................... 13-22

Clock Derivation ................................................................. 13-24

Timing Specifications ...................................................... 13-25

RESET and CLKIN ............................................................ 13-28

Reset Generators ................................................................. 13-31

Interrupt and Timer Pins .................................................... 13-33

Core-Based Flag Pins ........................................................... 13-34

Flag Inputs ..................................................................... 13-34

Flag Outputs .................................................................. 13-34

Programmable I/O Flags ..................................................... 13-35

Example #1: Configuring FLGx as Output Flags ............. 13-37

Example #2: Configuring FLGx as Input Flags ................ 13-38

System Design Considerations for Flags ............................... 13-38

Example #3: Programming 2:1 Clock Ratio ..................... 13-40

Example #4: Programming 3:1 Clock Ratio ..................... 13-40

Example #5: Programming 4:1 Clock Ratio ..................... 13-40

JTAG Interface Pins ............................................................ 13-41

Dual-Voltage Power-up Sequencing ........................................... 13-41

PLL Start-Up (Revisions 1.0/1.1) ........................................ 13-44

Power On Reset (POR) Circuit ....................................... 13-44

PLL CLKIN Enable Circuit ............................................ 13-46

PLL Start-Up (Revision 1.2) ................................................ 13-48

Designing For JTAG Emulation ................................................ 13-49

Target Board Connector ...................................................... 13-50

Layout Requirements ................................................................ 13-54

Power Sequence for Emulation .................................................. 13-56

Additional JTAG Emulator References ...................................... 13-56

Pod Specifications ..................................................................... 13-56

JTAG Pod Connector .......................................................... 13-57

3.3 V Pod Logic .................................................................. 13-58

2.5 V Pod Logic .................................................................. 13-59

Conditioning Input Signals ....................................................... 13-60

Link Port Input Filter Circuits ............................................. 13-60

RESET Input Hysteresis ...................................................... 13-61

Designing For High Frequency Operation ................................. 13-62

Clock Specifications and Jitter ............................................. 13-63

Clock Distribution .............................................................. 13-63

Point-to-Point Connections ................................................. 13-65

Signal Integrity .................................................................... 13-67

Other Recommendations and Suggestions ............................ 13-68

Decoupling Capacitors and Ground Planes .......................... 13-69

Oscilloscope Probes ............................................................. 13-70

Recommended Reading ....................................................... 13-71

Booting Single and Multiple Processors ..................................... 13-71

Multiprocessor Host Booting ............................................... 13-73

Multiprocessor EPROM Booting ......................................... 13-73

Booting From a Single EPROM ...................................... 13-73

Sequential Booting .......................................................... 13-74

Multiprocessor Link Port Booting ........................................ 13-75

Multiprocessor Booting From External Memory ................... 13-75

Data Delays, Latencies, and Throughput ................................... 13-76

Execution Stalls ................................................................... 13-77

DAG Stalls .......................................................................... 13-77

Memory Stalls ..................................................................... 13-77

IOP Register Stalls .............................................................. 13-78

DMA Stalls ......................................................................... 13-78

Link Port and Serial Port Stalls ............................................ 13-78

REGISTERS

Control and Status System Registers .............................................. A-2

Mode Control 1 Register (MODE1) ........................................ A-3

Mode Mask Register (MMASK) .............................................. A-8

Mode Control 2 Register (MODE2) ...................................... A-10

Arithmetic Status Registers (ASTATx and ASTATy) ............... A-13

Sticky Status Registers (STKYx and STKYy) .......................... A-18

User-Defined Status Registers (USTATx) ............................... A-22

Processing Element Registers ....................................................... A-23

Data File Data Registers (Rx, Fx, Sx) ..................................... A-23

Multiplier Results Registers (MRFx, MRBx) .......................... A-24

Program Memory Bus Exchange Register (PX) ....................... A-25

Program Sequencer Registers ....................................................... A-25

Interrupt Latch Register (IRPTL) .......................................... A-27

Interrupt Mask Register (IMASK) ......................................... A-31

Interrupt Mask Pointer Register (IMASKP) ........................... A-32

Link Port Interrupt Register (LIRPTL) .................................. A-34

Flag Value Register (FLAGS) ................................................. A-37

IOFLAG Value Register ........................................................ A-38

Program Counter Register (PC) ............................................. A-41

Program Counter Stack Register (PCSTK) ............................ A-44

Program Counter Stack Pointer Register (PCSTKP) .............. A-44

Fetch Address Register (FADDR) .......................................... A-44

Decode Address Register (DADDR) ...................................... A-44

Loop Address Stack Register (LADDR) ................................. A-45

Current Loop Counter Register (CURLCNTR) .................... A-45

Loop Counter Register (LCNTR) ......................................... A-45

Timer Period Register (TPERIOD) ....................................... A-46

Timer Count Register (TCOUNT) ....................................... A-46

Data Address Generator Registers ............................................... A-46

Index Registers (Ix) ............................................................... A-47

Modify Registers (Mx) .......................................................... A-47

Length and Base Registers (Lx,Bx) ........................................ A-47

I/O Processor Registers ............................................................... A-47

System Configuration Register (SYSCON) ............................ A-60

Vector Interrupt Address Register (VIRPT) ........................... A-63

External Memory Waitstate and Access Mode Register (WAIT) ............................................................................. A-65

System Status Register (SYSTAT) .......................................... A-69

SDRDIV Register (SDRDIV) ............................................... A-72

SDRAM Control Register (SDCTL) ..................................... A-73

External Port DMA Buffer Registers (EPBx) .......................... A-76

Message Registers (MSGRx) ................................................. A-77

PC Shadow Register (PC_SHDW) ........................................ A-77

MODE2 Shadow Register (MODE2_SHDW) ...................... A-78

Bus Time-Out Maximum Register (BMAX) ........................... A-79

Bus (Time-Out) Counter Register (BCNT) ........................... A-79

External Port DMA Control Registers (DMACx) ................... A-80

Internal Memory DMA Index Registers (IIx) ......................... A-87

Internal Memory DMA Modifier Registers (IMx) .................. A-87

Internal Memory DMA Count Registers (Cx) ........................ A-87

Chain Pointer For Next DMA TCB Registers (CPx) .............. A-88

General Purpose DMA Registers (GPx) ................................. A-89

External Memory DMA Index Registers (EIEPx) ................... A-89

External Memory DMA Modifier Registers (EMEPx) ............ A-89

External Memory DMA Count Registers (ECEPx) ................. A-90

DMA Channel Status Register (DMASTAT) .......................... A-90

Link Port Buffer Registers (LBUFx) ....................................... A-92

Link Port Buffer Control Register (LCTL) ............................. A-92

Link Port Service Request & Mask Register (LSRQ) ............... A-98

Serial Port Registers ............................................................. A-100

SPORT Serial Control Registers (SPCTLx) ..................... A-100

SPORT Multichannel Control Registers (SPxyMCTL) .... A-109

SPORT Transmit Buffer Registers (TXx) ......................... A-111

SPORT Receive Buffer Registers (RXx) ........................... A-111

SPORT Divisor Registers (DIVx) .................................... A-112

SPORT Count Registers (CNTx) .................................... A-113

SPORT Transmit Select Registers (MT2CSx and MT3CSx) .................................................................... A-113

SPORT Transmit Compand Registers (MT2CCSx and MT3CCSx) ................................................................. A-113

SPORT Receive Select Registers ..................................... A-114

SPORT Receive Compand Registers ............................... A-114

Serial Peripheral Interface Registers ........................................... A-114

SPI Port Status Register ...................................................... A-115

SPI Control Register (SPICTL) ........................................... A-117

SPI Receive Buffer Register (SPIRX) ................................... A-120

SPI Transmit Buffer Register (SPITX) ................................. A-121

INTERRUPT VECTOR ADDRESSES

NUMERIC FORMATS

IEEE Single-Precision Floating-Point Data ................................... C-1

Extended-Precision Floating-Point ................................................ C-3

Short Word Floating-Point Format ............................................... C-4

Packing for Floating-Point Data ................................................... C-4

Fixed-Point Formats ..................................................................... C-6

GLOSSARY

INDEX

1 INTRODUCTION

Thank you for purchasing the Analog Devices SHARC® digital signal processor (DSP).

Design Advantages

The ADSP-21161 processor is a high-performance 32-bit processor used for medical imaging, communications, military, audio, test equipment, 3D graphics, speech recognition, motor control, imaging, and other applications. This processor builds on the ADSP-21000 Family processor core to form a complete system-on-a-chip, adding a dual-ported on-chip SRAM, integrated I/O peripherals, and an additional processing element for Single-Instruction-Multiple-Data (SIMD) support.

The SHARC architecture balances a high performance processor core with high performance buses (PM, DM, IO). In the core, every instruction can execute in a single cycle. The buses and instruction cache provide rapid, unimpeded data flow to the core to maintain the execution rate.

Figure 1-1 shows a detailed block diagram of the processor, which illustrates the following architectural features.

• Two processing elements (PEx and PEy), each containing 32-bit IEEE floating-point computation units (multiplier, ALU, and shifter) and a data register file

• Program sequencer with related instruction cache, interval timer, and Data Address Generators (DAG1 and DAG2)

• Dual-ported SRAM

• External port for interfacing to off-chip memory such as SDRAM, peripherals, hosts, and multiprocessor systems

• Input/Output (IO) processor with integrated DMA controller, SPI-compatible port, serial ports, and link ports for point-to-point multiprocessor communications

• JTAG Test Access Port for emulation

Figure 1-1. ADSP-21161 SHARC Block Diagram

(Block diagram not reproduced. It shows the core processor, with DAG1, DAG2, the program sequencer, a 32 x 48-bit instruction cache, the timer, and the PEx and PEy processing elements, each with a 16 x 40-bit data register file, multiplier, barrel shifter, and ALU; the dual-ported SRAM, organized as two independent dual-ported blocks; the external port; the I/O processor with its memory-mapped IOP registers, DMA controller, serial ports, and link ports; and the JTAG test and emulation port.)

Figure 1-1 also shows the three on-chip buses of the ADSP-21161 processor: the Program Memory (PM) bus, Data Memory (DM) bus, and Input/Output (IO) bus. The PM bus provides access to either instructions or data. During a single cycle, these buses let the processor access two data operands from memory, access an instruction (from the cache), and perform a DMA transfer.

The buses connect to the ADSP-21161 processor external port, which provides the processor interface to external memory, memory-mapped I/O, a host processor, and additional multiprocessing ADSP-21161 processors. The external port performs bus arbitration and supplies control signals to shared, global memory and I/O devices.

Figure 1-2 illustrates a typical single-processor system.

The ADSP-21161 processor includes extensive support for multiprocessor systems as well. For more information, see “Multiprocessor (MP) Interface” on page 7-87.

Further, the ADSP-21161 processor addresses the five central requirements for DSPs:

• Fast, flexible arithmetic computation units

• Unconstrained data flow to and from the computation units

• Extended precision and dynamic range in the computation units

• Dual address generators with circular buffering support

• Efficient program sequencing

Fast, Flexible Arithmetic. The ADSP-21000 Family processors execute all instructions in a single cycle. They provide fast cycle times and a complete set of arithmetic operations. The processor is IEEE floating-point compatible and allows either interrupt on arithmetic exception or latched status exception handling.

Figure 1-2. Typical Single Processor System

(System diagram not reproduced. It shows the ADSP-21161 connected to a clock source, optional link devices, optional serial devices, and an optional SPI-compatible device, and, over the external port, to a boot EPROM, optional SDRAM, optional memory and peripherals, an optional DMA device, and an optional host processor interface.)

Unconstrained Data Flow. The ADSP-21161 processor has a Super Harvard Architecture combined with a 10-port data register file. In every cycle, the processor can write or read two operands to or from the register file, supply two operands to the ALU, supply two operands to the multiplier, and receive three results from the ALU and multiplier. The processor’s 48-bit orthogonal instruction word supports parallel data transfers and arithmetic operations in the same instruction.
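As a rough illustration of why two operand fetches per cycle matter, consider one output sample of an FIR filter written in C. Each loop iteration needs one coefficient and one data sample, which maps naturally onto one program memory data access and one data memory access per multiply-accumulate. The function name `fir_sample` and the array layout are assumptions made for this sketch; they are not taken from this manual.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: compute one FIR output sample.
 * coeffs: filter coefficients (on a SHARC these could be read over the PM bus)
 * delay:  most recent input samples (these could be read over the DM bus)
 * Each iteration consumes two operands and performs one multiply-accumulate. */
float fir_sample(const float *coeffs, const float *delay, size_t taps)
{
    assert(coeffs != NULL && delay != NULL);
    float acc = 0.0f;
    for (size_t i = 0; i < taps; ++i) {
        acc += coeffs[i] * delay[i];   /* two operand reads, one multiply, one add */
    }
    return acc;
}
```

With coefficients {1, 2, 3} and samples {4, 5, 6}, the function returns 32. How a compiler or hand-written assembly maps this loop onto the buses depends on where the two arrays are placed in memory.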

40-Bit Extended Precision. The processor handles 32-bit IEEE floating-point format, 32-bit integer and fractional formats (twos-complement and unsigned), and extended-precision 40-bit floating-point format. The processors carry extended precision throughout their computation units, limiting intermediate data truncation errors.
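The fractional formats mentioned above can be made concrete with a short C sketch. It assumes the common conventions that a signed 32-bit fraction is in 1.31 format (value = integer / 2^31) and an unsigned 32-bit fraction is in 0.32 format (value = integer / 2^32). The function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Interpret a 32-bit twos-complement word as a signed fraction (1.31 format):
 * the value is the integer divided by 2^31, giving a range of [-1.0, 1.0). */
double frac32_to_double(int32_t word)
{
    return (double)word / 2147483648.0;   /* 2^31 */
}

/* Interpret a 32-bit word as an unsigned fraction (0.32 format):
 * the value is the integer divided by 2^32, giving a range of [0.0, 1.0). */
double ufrac32_to_double(uint32_t word)
{
    return (double)word / 4294967296.0;   /* 2^32 */
}

/* Convert a value in [-1.0, 1.0) to 1.31 format, truncating toward zero. */
int32_t double_to_frac32(double x)
{
    assert(x >= -1.0 && x < 1.0);
    return (int32_t)(x * 2147483648.0);
}
```

For example, the word 0x40000000 represents 0.5 as a signed fraction, and 0x80000000 represents 0.5 as an unsigned fraction.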

Dual Address Generators. The processor has two Data Address Generators (DAGs) that provide immediate or indirect (pre- and post-modify) addressing. Modulus, bit-reverse, and broadcast operations are supported with no constraints on data buffer placement.
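Modulus (circular buffer) addressing with post-modify can be sketched in C as follows. This is a behavioral model under stated assumptions, not a register-accurate description of the DAGs: the structure `CircPtr` and the function `circ_post_modify` are hypothetical names, and the sketch uses a general modulo operation so that any modify value wraps correctly.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of one circular-buffer pointer.
 * The field names echo the base, length, and index roles
 * but are not a register-accurate description of the DAGs. */
typedef struct {
    uint32_t base;    /* first address of the buffer */
    uint32_t length;  /* number of locations in the buffer */
    uint32_t index;   /* current address, base <= index < base + length */
} CircPtr;

/* Return the current address, then advance the index by 'modify'
 * (post-modify), wrapping so that it stays inside the buffer. */
uint32_t circ_post_modify(CircPtr *p, int32_t modify)
{
    assert(p->length > 0);
    uint32_t addr = p->index;
    int64_t len = (int64_t)p->length;
    int64_t offset = (int64_t)p->index - (int64_t)p->base + modify;
    offset %= len;                 /* C remainder may be negative */
    if (offset < 0) {
        offset += len;             /* wrap backward steps into the buffer */
    }
    p->index = p->base + (uint32_t)offset;
    return addr;
}
```

Stepping a pointer through a four-location buffer at address 100 with a modify value of 1 visits 100, 101, 102, 103, and then wraps back to 100 without any explicit bounds check in the calling code.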

Efficient Program Sequencing. In addition to zero-overhead loops, the processor supports single-cycle setup and exit for loops. Loops are both nestable (six levels in hardware) and interruptible. The processor supports both delayed and non-delayed branches.

Architecture Overview

The ADSP-21161 processor forms a complete system-on-a-chip, integrating a large, high-speed SRAM and I/O peripherals supported by a dedicated I/O bus. The following sections summarize the features of each functional block in the ADSP-21161 processor SHARC architecture, which appears in Figure 1-1 on page 1-2. With each summary, a cross-reference points to the sections where the features are described in greater detail.

Processor Core

The processor core of the ADSP-21161 processor consists of two processing elements (each with three computation units and a data register file), a program sequencer, two data address generators, a timer, and an instruction cache. All digital signal processing occurs in the processor core.

Processing Elements

The processor core contains two processing elements (PEx and PEy). Each element contains a data register file and three independent computation units: an ALU, a multiplier with a fixed-point accumulator, and a shifter. To meet a wide variety of processing needs, the computation units process data in three formats: 32-bit fixed-point, 32-bit floating-point, and 40-bit floating-point.

The floating-point operations are single-precision IEEE-compatible. The 32-bit floating-point format is the standard IEEE format, whereas the 40-bit extended-precision format has eight additional Least Significant Bits (LSBs) of mantissa for greater accuracy.
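The relationship between the two floating-point formats can be sketched in C by holding a 40-bit word in the low 40 bits of a `uint64_t`. The sketch assumes that `float` is a 32-bit IEEE single-precision type on the host, and it narrows by simple truncation; the function names are hypothetical, and any rounding that hardware may apply is deliberately left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Widen an IEEE single-precision value to a 40-bit word held in the low
 * 40 bits of a uint64_t. The eight new mantissa LSBs are zero. */
uint64_t float32_to_ext40(float f)
{
    uint32_t bits;
    memcpy(&bits, &f, sizeof bits);      /* portable reinterpretation of the bits */
    return (uint64_t)bits << 8;
}

/* Narrow a 40-bit word back to single precision by discarding the eight
 * extra mantissa LSBs (truncation; a rounding step is deliberately omitted). */
float ext40_to_float32(uint64_t ext)
{
    assert((ext >> 40) == 0);            /* only the low 40 bits may be set */
    uint32_t bits = (uint32_t)(ext >> 8);
    float f;
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

Because the sign and exponent fields are unchanged, widening 1.0 (0x3F800000) gives the 40-bit word 0x3F80000000, and the extra eight bits only refine the mantissa.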

The ALU performs a set of arithmetic and logic operations on both fixed-point and floating-point formats. The multiplier performs floating-point or fixed-point multiplication and fixed-point multiply/add or multiply/subtract operations. The shifter performs logical and arithmetic shifts, bit manipulation, field deposit and extraction, and exponent derivation operations on 32-bit operands. These computation units perform single-cycle operations; there is no computation pipeline. All units are connected in parallel, rather than serially. The output of any unit may serve as the input of any unit on the next cycle. In a multifunction computation, the ALU and multiplier perform independent, simultaneous operations.
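Field extraction and deposit on a 32-bit operand can be illustrated generically in C. The sketch below shows the concept only; the function names, argument order, and the choice to zero all bits outside a deposited field are assumptions and do not describe the exact semantics of the shifter instructions.

```c
#include <assert.h>
#include <stdint.h>

/* Build a mask of 'len' ones in the low bits (len in 1..32). */
static uint32_t low_mask(unsigned len)
{
    return (len == 32) ? 0xFFFFFFFFu : ((1u << len) - 1u);
}

/* Extract 'len' bits starting at bit 'pos' and return them right-justified. */
uint32_t field_extract(uint32_t word, unsigned pos, unsigned len)
{
    assert(len >= 1 && pos < 32 && len <= 32 - pos);
    return (word >> pos) & low_mask(len);
}

/* Deposit the low 'len' bits of 'field' at bit 'pos'; other bits are zero. */
uint32_t field_deposit(uint32_t field, unsigned pos, unsigned len)
{
    assert(len >= 1 && pos < 32 && len <= 32 - pos);
    return (field & low_mask(len)) << pos;
}
```

For example, extracting an 8-bit field at bit 8 from 0xABCD1234 yields 0x12, and depositing the 4-bit value 0x5 at bit 4 yields 0x50.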

Each processing element has a general-purpose data register file that transfers data between the computation units and the data buses and stores intermediate results. A register file has two sets (primary and secondary) of sixteen registers each, for fast context switching. All of the registers are 40 bits wide. The register file, combined with the core processor’s Super Harvard architecture, allows unconstrained data flow between computation units and internal memory.

Primary Processing Element (PEx). PEx processes all computational instructions whether the processor is in Single-Instruction, Single-Data (SISD) or Single-Instruction, Multiple-Data (SIMD) mode. This element corresponds to the computational units and register file in previous ADSP-21000 family DSPs.

Secondary Processing Element (PEy). PEy processes each computational instruction in lock-step with PEx, but only processes these instructions when the processor is in SIMD mode. Because many operations are influenced by this mode, more information on SIMD is available in multiple locations:

• For information on PEy operations, see “Processing Elements” on page 2-1

• For information on data addressing in SIMD mode, see “Addressing in SISD and SIMD Modes” on page 4-18

• For information on data accesses in SIMD mode, see “SISD, SIMD, and Broadcast Load Modes” on page 5-51

• For information on multiprocessing in SIMD mode, see “Multiprocessor (MP) Interface” on page 7-87

• For information on SIMD programming, see the ADSP-21160 SHARC DSP Instruction Set Reference

Program Sequence Control

Internal controls for ADSP-21161 processor program execution come from four functional blocks: program sequencer, data address generators, timer, and instruction cache. Two dedicated address generators and a program sequencer supply addresses for memory accesses. Together the sequencer and data address generators allow computational operations to execute with maximum efficiency since the computation units can be devoted exclusively to processing data. With its instruction cache, the ADSP-21161 processor can simultaneously fetch an instruction from the cache and access two data operands from memory. The data address generators implement circular data buffers in hardware.

Program Sequencer. The program sequencer supplies instruction addresses to program memory. It controls loop iterations and evaluates conditional instructions. With an internal loop counter and loop stack, the ADSP-21161 processor executes looped code with zero overhead. No explicit jump instructions are required to loop or to decrement and test the counter.

The ADSP-21161 processor achieves its fast execution rate by means of pipelined fetch, decode, and execute cycles. If external memories are used, they are allowed more time to complete an access than if there were no decode cycle.

Data Address Generators. The Data Address Generators (DAGs) provide memory addresses when data is transferred between memory and registers. Dual data address generators enable the processor to output simultaneous addresses for two operand reads or writes. DAG1 supplies 32-bit addresses to data memory. DAG2 supplies 32-bit addresses to program memory for program memory data accesses.

Each DAG keeps track of up to eight address pointers, eight modifiers and eight length values. A pointer used for indirect addressing can be modified by a value in a specified register, either before (pre-modify) or after (post-modify) the access. A length value may be associated with each pointer to perform automatic modulo addressing for circular data buffers; the circular buffers can be located at arbitrary boundaries in memory. Each DAG register has a secondary register that can be activated for fast context switching.

Circular buffers allow efficient implementation of delay lines and other data structures required in digital signal processing, and are commonly used in digital filters and Fourier transforms. The DAGs automatically handle address pointer wraparound, reducing overhead, increasing performance, and simplifying implementation.
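The pointer wraparound that the DAGs perform can be modelled in C. The following is a minimal sketch, not the DAG hardware's actual logic: the `dag_ptr` structure, its `base` field, and the function `dag_post_modify` are hypothetical names introduced for illustration, and the sketch assumes the magnitude of the modifier is smaller than the buffer length.

```c
#include <stdint.h>

/* Hypothetical software model of one DAG pointer with a circular buffer. */
typedef struct {
    uint32_t base;    /* first address of the buffer (assumed field) */
    uint32_t length;  /* number of locations; 0 means no wraparound */
    uint32_t index;   /* current address pointer */
} dag_ptr;

/* Post-modify access: return the address to use now, then advance the
   pointer by `modify`, wrapping it back into [base, base + length). */
static uint32_t dag_post_modify(dag_ptr *p, int32_t modify)
{
    uint32_t addr = p->index;
    int64_t next = (int64_t)p->index + modify;

    if (p->length != 0) {
        int64_t lo = p->base;
        int64_t hi = lo + p->length;
        if (next >= hi)
            next -= p->length;
        else if (next < lo)
            next += p->length;
    }
    p->index = (uint32_t)next;
    return addr;
}
```

Stepping a four-location buffer with a modifier of 1 returns to the first location after four accesses, which is the behavior a delay line relies on.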

Interrupts. The ADSP-21161 processor has four external hardware interrupts: three general-purpose interrupts, IRQ2-0, and a special interrupt for reset. The processor also has internally generated interrupts for the timer, DMA controller operations, circular buffer overflow, stack overflows, arithmetic exceptions, multiprocessor vector interrupts, and user-defined software interrupts.

For the general-purpose external interrupts and the internal timer interrupt, the ADSP-21161 processor automatically stacks the arithmetic status and mode (MODE1) registers in parallel with the interrupt servicing, allowing fifteen nesting levels of very fast service for these interrupts.

Context Switch. Many of the processor’s registers have secondary registers that can be activated during interrupt servicing for a fast context switch. The data registers in the register file, the DAG registers, and the multiplier result register all have secondary registers. The primary registers are active at reset, while the secondary registers are activated by control bits in a mode control register.
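The idea of switching between primary and secondary registers can be illustrated with a small C model. This is a sketch under stated assumptions, not a description of the hardware: the `reg_file` structure, the `use_secondary` flag (standing in for a mode control bit), and the helper functions are all hypothetical, and only a sixteen-entry data register file is modelled.

```c
#include <stdint.h>

#define NUM_REGS 16

/* Hypothetical model: two register sets and a select flag standing in for
   the mode control bit that activates the secondary set. */
typedef struct {
    uint64_t set[2][NUM_REGS]; /* each entry holds one 40-bit register */
    int use_secondary;         /* 0 = primary (state at reset), 1 = secondary */
} reg_file;

static uint64_t *reg_ptr(reg_file *rf, unsigned n)
{
    return &rf->set[rf->use_secondary ? 1 : 0][n % NUM_REGS];
}

static void reg_write(reg_file *rf, unsigned n, uint64_t v)
{
    *reg_ptr(rf, n) = v & 0xFFFFFFFFFFULL; /* keep the low 40 bits */
}

static uint64_t reg_read(reg_file *rf, unsigned n)
{
    return *reg_ptr(rf, n);
}
```

Because the switch only changes which set is selected, no register contents are copied; an interrupt routine in this model can use the secondary set and leave the primary values untouched.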

Timer. The programmable interval timer provides periodic interrupt generation. When enabled, the timer decrements a 32-bit count register every cycle. When this count register reaches zero, the ADSP-21161 processor generates an interrupt and asserts its timer expired output. The count register is automatically reloaded from a 32-bit period register and the count resumes immediately.
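The decrement-and-reload behavior can be sketched as a cycle-by-cycle model in C. The names `timer_model` and `timer_tick` are hypothetical, and the sketch assumes that the interrupt and the reload occur on the cycle after the count reaches zero; the exact hardware timing should be taken from the timer documentation rather than from this model.

```c
#include <stdint.h>
#include <stdbool.h>

typedef struct {
    uint32_t count;   /* models the 32-bit count register */
    uint32_t period;  /* models the 32-bit period register */
} timer_model;

/* Advance the model by one cycle. Returns true on the cycle in which the
   modelled timer raises its interrupt. */
static bool timer_tick(timer_model *t)
{
    if (t->count == 0) {
        t->count = t->period;  /* automatic reload from the period value */
        return true;
    }
    t->count--;
    return false;
}
```

Under this assumption, a model started with count and period both equal to N raises an interrupt every N + 1 calls to `timer_tick`.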

Instruction Cache. The program sequencer includes a 32-word instruction cache that enables three-bus operation for fetching an instruction and two data values. The cache is selective; only instructions whose fetches conflict with program memory data accesses are cached. This caching allows full-speed execution of core, looped operations such as digital filter multiply-accumulates and FFT butterfly processing.

Processor Internal Buses

The processor core has six buses: PM address, PM data, DM address, DM data, IO address, and IO data. Due to the processor’s Super Harvard Architecture, data memory stores data operands, while program memory can store both instructions and data. This architecture allows dual data fetches when the instruction is supplied by the cache.

Bus Capacities. The PM address bus and DM address bus transfer the addresses for instructions and data. The PM data bus and DM data bus transfer the data or instructions from each type of memory. The PM address bus is 32 bits wide, allowing access of up to 62 Mwords for non-SDRAM and 254 Mwords for SDRAM banks of mixed instructions and data. The PM data bus is 64 bits wide to accommodate the 48-bit instructions and 32-bit data.

The DM address bus is 32 bits wide allowing direct access of up to 4G words of data. The DM data bus is 64 bits wide. The DM data bus provides a path for the contents of any register in the processor to be transferred to any other register or to any data memory location in a single cycle. The data memory address comes from one of two sources: an absolute value specified in the instruction code (direct addressing) or the output of a data address generator (indirect addressing).

The IO address and IO data buses let the IO processor access internal memory for DMA without delaying the processor core. The IO address bus is 18 bits wide, and the IO data bus is 64 bits wide.

Data Transfers. Nearly every register in the processor core is classified as a Universal Register (UREG). Instructions allow transferring data between any two universal registers or between a universal register and memory. This support includes transfers between control registers, status registers, and data registers in the register file. The PM bus connect (PX) registers permit data to be passed between the 64-bit PM data bus and the 64-bit DM data bus, or between the 40-bit register file and the PM data bus. These registers contain hardware to handle the data width difference. For more information, see “Processing Element Registers” on page A-23.

Processor Peripherals

The term processor peripherals refers to everything outside the processor core. The ADSP-21161 processor peripherals include internal memory, external port, I/O processor, JTAG port, and any external devices that connect to the processor.

Dual-Ported Internal Memory (SRAM)

The ADSP-21161 processor contains 1 megabit of on-chip SRAM, organized as two blocks of 0.5 Mbits. Each block can be configured for different combinations of code and data storage. Each memory block is dual-ported for single-cycle, independent accesses by the core processor and I/O processor or DMA controller. The dual-ported memory and separate on-chip buses allow two data transfers from the core and one from I/O, all in a single cycle.

All of the memory can be accessed as 16-, 32-, 48-, or 64-bit words. On the ADSP-21161 processor, the memory can be configured as a maximum of 32K words of 32-bit data, 64K words of 16-bit data, 21.25K words of 48-bit instructions (and 40-bit data), or combinations of different word sizes up to 1.0 Mbit.

The processor supports a 16-bit floating-point storage format, which effectively doubles the amount of data that may be stored on chip. Conversion between the 32-bit floating-point and 16-bit floating-point formats completes in a single instruction.

While each memory block can store combinations of code and data, accesses are most efficient when one block stores data, using the DM bus for transfers, and the other block stores instructions and data, using the PM bus for transfers. Using the DM bus and PM bus in this way, with one dedicated to each memory block, assures single-cycle execution with two data transfers. In this case, the instruction must be available in the cache. The processor uses its external port to maintain single-cycle execution when one of the data operands is transferred to or from off-chip.

External Port

The ADSP-21161 processor external port provides the processor interface to off-chip memory and peripherals. The 254 Mword off-chip address space is included in the unified address space of the ADSP-21161 processor. The separate on-chip buses—for PM address, PM data, DM address, DM data, IO address, and IO data—multiplex at the external port to create an external system bus with a single 24-bit address bus and a single 32-bit data bus. The ADSP-21161 processor on-chip DMA controller automatically packs external data into the appropriate word width during transfers.

The ADSP-21161 processor supports instruction packing modes to execute from 48-, 32-, 16-, and 8-bit wide memories. With the link ports disabled, the additional link port pins can be used to execute 48-bit wide instructions. The ADSP-21161 processor also includes 32- to 48-bit, 16- to 48-bit, and 8- to 48-bit execution packing for executing instructions directly from 32-bit, 16-bit, or 8-bit wide external memories. External SDRAM, SRAM, or SBSRAM can be 8, 16, or 32 bits wide for DMA transfers to or from external memory.
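The effect of execution packing, assembling one 48-bit instruction from several narrower memory words, can be illustrated in C. This is only a sketch: the function names are hypothetical, and the ordering used here (least significant part first) is an assumption made for the example, not the ordering defined by the processor.

```c
#include <stdint.h>

/* Assemble one 48-bit instruction from three 16-bit memory words.
   Assumption: words[0] holds the least significant 16 bits. */
static uint64_t pack_16_to_48(const uint16_t words[3])
{
    return (uint64_t)words[0]
         | ((uint64_t)words[1] << 16)
         | ((uint64_t)words[2] << 32);
}

/* Same idea for 8-bit memory: six bytes per instruction.
   Assumption: bytes[0] holds the least significant 8 bits. */
static uint64_t pack_8_to_48(const uint8_t bytes[6])
{
    uint64_t insn = 0;
    for (int i = 0; i < 6; i++)
        insn |= (uint64_t)bytes[i] << (8 * i);
    return insn;
}
```

The sketch shows why narrower memories cost more accesses per instruction: three reads from 16-bit memory or six reads from 8-bit memory are needed before one instruction is complete.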

On-chip decoding of high-order address lines generates memory bank select signals for addressing external memory devices. The ADSP-21161 processor provides programmable memory waitstates and external memory acknowledge controls for interfacing to peripherals with variable access, hold, and disable time requirements.

SDRAM Interface. The ADSP-21161 processor integrated on-chip SDRAM controller transfers data to and from synchronous DRAM (SDRAM) at the core clock frequency or one-half the core clock frequency. The synchronous approach, coupled with the core clock frequency, supports data transfer at a high throughput—up to 400 Mbytes/second for 32-bit transfers and 600 Mbytes/second for 48-bit transfers.
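The quoted peak figures follow from one transfer per clock cycle. The sketch below reproduces the arithmetic in C; the function name is hypothetical, and the 100 MHz clock used in the usage note is an assumption chosen because it reproduces the 400 and 600 Mbytes/second figures, not a value stated in this passage.

```c
#include <stdint.h>

/* Peak transfer rate in bytes per second, assuming one transfer of
   `bits_per_transfer` bits on every clock cycle. */
static uint64_t peak_bytes_per_second(uint64_t clock_hz,
                                      unsigned bits_per_transfer)
{
    return clock_hz * (bits_per_transfer / 8u);
}
```

With an assumed clock of 100,000,000 Hz, 32-bit transfers give 400,000,000 bytes per second and 48-bit transfers give 600,000,000 bytes per second. Running the interface at one-half the clock frequency halves both results.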

The SDRAM interface provides a glueless interface with standard SDRAMs (16 Mbits, 64 Mbits, 128 Mbits, and 256 Mbits) and includes options to support additional buffers between the ADSP-21161 processor and SDRAM. The SDRAM interface is extremely flexible and can connect SDRAMs to any one of the ADSP-21161 processor’s four external memory banks, with up to all four banks mapped to SDRAM.

Systems with several SDRAM devices connected in parallel may require buffering to meet overall system timing requirements. The ADSP-21161 processor supports pipelining of the address and control signals to enable such buffering between itself and multiple SDRAM devices.

Host Processor Interface. The ADSP-21161 processor host interface allows easy connection to standard 8-bit, 16-bit, and 32-bit microprocessor buses, with little additional hardware required. The interface supports asynchronous and synchronous transfers at speeds up to half the internal core clock rate of the ADSP-21161 processor. The host interface operates through the ADSP-21161 processor external port and maps into the unified address space. Four channels of DMA are available for the host interface; code and data transfers occur with low software overhead. The host can directly read and write the IOP register space of the ADSP-21161 processor and can access the DMA channel setup and mailbox registers. The host can also perform DMA transfers to and from the internal memory of the processor. Vector interrupt support provides for efficient execution of host commands.

Multiprocessor System Interface. The ADSP-21161 processor offers powerful features tailored to multiprocessing systems. The unified address space allows direct interprocessor accesses of each ADSP-21161 processor’s internal IOP registers. Distributed bus arbitration logic on the processor allows simple, glueless connection of systems containing up to six ADSP-21161 processors and a host processor. Master processor changeover incurs only one cycle of overhead. Bus arbitration handles either fixed or rotating priority. Processor bus lock allows indivisible read-modify-write sequences for semaphores. A vector interrupt capability is provided for interprocessor commands.

I/O Processor

The ADSP-21161 processor Input/Output Processor (IOP) includes four serial ports, two link ports, a SPI-compatible port, and a DMA controller. One of the processes that the IO processor automates is booting. The processor can boot from the external port (with data from an 8-bit EPROM or a host processor) or a link port. Alternatively, a no-boot mode lets the processor start by executing instructions from external memory without booting.

Serial Ports. The ADSP-21161 processor features four synchronous serial ports that provide an inexpensive interface to a wide variety of digital and mixed-signal peripheral devices. The serial ports can operate at up to half the processor core clock rate. Programmable data direction provides greater flexibility for serial communications. Serial port data can automatically transfer to and from on-chip memory using DMA. Each of the serial ports offers a TDM multichannel mode (up to 128 channels) and supports μ-law or A-law companding. I2S support is also provided with the ADSP-21161 processor.

The serial ports can operate with little-endian or big-endian transmission formats, with word lengths from 3 to 32 bits. The serial ports offer selectable synchronization and transmit modes. Serial port clocks and frame syncs can be internally or externally generated.

Link Ports. The ADSP-21161 processor features two 8-bit link ports that provide additional I/O capabilities. Link port I/O is especially useful for point-to-point interprocessor communication in multiprocessing systems. The link ports can operate independently and simultaneously. The data packs into 32-bit or 48-bit words, which the processor core can directly read or the IO processor can DMA-transfer to on-chip memory. Clock and acknowledge handshaking signals control link port transfers. Transfers are programmable as either transmit or receive.

Serial Peripheral (Compatible) Interface. The ADSP-21161 processor Serial Peripheral Interface (SPI) is an industry standard synchronous serial link that enables the ADSP-21161 processor SPI-compatible port to communicate with other SPI-compatible devices. SPI is a 4-wire interface consisting of two data pins, one device select pin, and one clock pin. It is a full-duplex synchronous serial interface, supporting both master and slave modes. It can operate in a multi-master environment by interfacing with up to four other SPI-compatible devices, either acting as a master or slave device. The ADSP-21161 processor SPI-compatible peripheral implementation also supports programmable baud rate and clock phase/polarities, and the use of open drain drivers to support the multi-master scenario to avoid data contention.

DMA Controller. The ADSP-21161 processor on-chip DMA controller allows zero-overhead data transfers without processor intervention. The DMA controller operates independently and invisibly to the processor core, allowing DMA operations to occur while the core is simultaneously executing its program. Both code and data can be downloaded to the ADSP-21161 processor using DMA transfers.

DMA transfers can occur between the ADSP-21161 processor internal memory and external memory, external peripherals, or a host processor. DMA transfers between external memory and external peripheral devices are another option. External bus packing to 8-, 16-, 32-, 48-, or 64-bit words is automatically performed during DMA transfers.

Fourteen channels of DMA are available on the ADSP-21161 processor: two over the link ports (shared with SPI), eight over the serial ports, and four over the processor’s external port. The external port DMA channels serve for host processor, other ADSP-21161 processors, memory, or I/O transfers.

JTAG Port

The JTAG port on the ADSP-21161 processor supports the IEEE 1149.1 Joint Test Action Group (JTAG) standard for system test. This standard defines a method for serially scanning the I/O status of each component in a system. Emulators use the JTAG port to monitor and control the processor during emulation. Emulators using this port provide full-speed emulation with access to inspect and modify memory, registers, and processor stacks. JTAG-based emulation is non-intrusive and does not affect target system loading or timing.

Differences From Previous SHARC Processors

This section identifies differences between the ADSP-21161 processor and previous SHARC processors: ADSP-21160, ADSP-21060, ADSP-21061, ADSP-21062, and ADSP-21065. The ADSP-21161 processor preserves much of the ADSP-2106x architecture and is comparable to the ADSP-21160 with extended performance and functionality. For background information on SHARC and the ADSP-2106x Family processors, see the ADSP-2106x SHARC User’s Manual or the ADSP-21065L SHARC Technical Reference.

Processor Core Enhancements

Computational bandwidth on the ADSP-21161 processor is significantly greater than that on the ADSP-2106x DSPs. The increase comes from raising the operational frequency and adding another processing element: ALU, shifter, multiplier, and register file. The new processing element lets the processor process multiple data streams in parallel (SIMD mode).

Like the ADSP-21160, the program sequencer on the ADSP-21161 processor differs from the ADSP-2106x family, having several enhancements: new interrupt vector table definitions, SIMD mode stack and conditional execution model, and instruction decodes associated with new instructions. Interrupt vectors have been added that detect illegal memory accesses. Link port interrupt control has moved to a new register to support the additional DMA channels. Also, mode stack and mode mask support has been added to improve context switch time.

As with the ADSP-21160, the data address generators on the ADSP-21161 processor differ from the ADSP-2106x in that DAG2 (for the PM bus) has the same addressing capability as DAG1 (for the DM bus). The DAG registers move 64 bits per cycle. Additionally, the DAGs support the new memory map and Long Word transfer capability. Circular buffering on the ADSP-21161 processor can be quickly disabled on interrupts and restored on the return. Data “broadcast”, from one memory location to both data register files, is determined by appropriate index register usage.

Processor Internal Bus Enhancements

The PM, DM, and IO data buses on the ADSP-21161 processor are wider than those on the ADSP-2106x processors, having increased to 64 bits. Additional multiplexing and control logic on the ADSP-21161 processor enable 16-, 32-, or 64-bit wide moves between both register files and memory. The ADSP-21161 processor is capable of broadcasting a single memory location to each of the register files in parallel. Also, the ADSP-21161 processor permits register contents to be exchanged between the two processing elements’ register files in a single cycle.

Memory Organization Enhancements

The ADSP-21161 processor memory map differs from that of the ADSP-2106x and is similar in organization to that of the ADSP-21160. The system memory map on the ADSP-21161 processor supports double-word transfers each cycle, reflects extended internal memory capacity for derivative designs, and works with an updated control register for SIMD support.

External Port Enhancements

The ADSP-21161 processor external port differs from that of the ADSP-2106x DSPs. The data bus on the ADSP-21161 processor is 32 bits wide. A new packing mode permits DMA for instructions and data to and from 8-bit external memory. The ADSP-21161 processor has a new synchronous interface that improves local bus switching frequency. Also, burst support on the ADSP-21161 processor improves bus usage.

Host Interface Enhancements

The ADSP-21161 processor host interface differs from that of the ADSP-2106x DSPs. It is 32 bits wide and supports 8-, 16-, and 32-bit hosts. Although the ADSP-21161 processor supports the ADSP-2106x asynchronous host interface protocols, the ADSP-21161 processor also provides new synchronous interface protocols for maximum throughput.

The host/local bus deadlock resolution function on the ADSP-21161 processor is extended to the DMA controller. With this function the host (or bridge) logic forces the local bus to wait until the host completes its operation.

Multiprocessor Interface Enhancements

The ADSP-21161 processor multiprocessor system interface supports greater throughput than the ADSP-2106x DSPs. The throughput between ADSP-21161 processors in a multiprocessing application increases due to new shared bus transfer protocols, shared bus cycle time improvements due to synchronous interface, and improvements in link port throughput. The external port supports glueless multiprocessing, with distributed arbitration for up to six ADSP-21161 processors.

IO Architecture Enhancements

The IO processor on the ADSP-21161 processor provides much greater throughput than the ADSP-2106x DSPs. This section describes how the link ports and DMA controller differ on the ADSP-21161 processor.

DMA Controller Enhancements

The ADSP-21161 processor DMA controller supports 14 channels compared to 10 on the ADSP-2106x DSPs. New packing modes support the 64-bit internal busing. To resolve potential deadlock scenarios, the ADSP-21161 processor DMA controller relinquishes the local bus in a similar fashion to the processor core when host logic asserts both HBR and SBTS.

Link Port Enhancements

The ADSP-21161 processor’s two link ports provide greater throughput than those of the ADSP-2106x DSPs. The link port data bus on the ADSP-21161 processor is 8 bits wide (versus 4 bits on the ADSP-2106x DSPs). Link port clock control on the ADSP-21161 processor supports a wider frequency range.

Instruction Set Enhancements

The ADSP-21161 processor provides source code compatibility with the previous SHARC family members, to the application assembly source code level. All instructions, control registers, and system resources available in the ADSP-2106x core programming model are available in the ADSP-21161 processor. Instructions, control registers, or other facilities required to support the new feature set of the ADSP-2116x core include the following:

• Code compatible to the ADSP-21160 SIMD core

• Supersets of the ADSP-2106x programming model

• Reserved facilities in the ADSP-2106x programming model

• Symbol name changes from the ADSP-2106x and ADSP-21161 processor programming models

These name changes can be managed through re-assembly using the ADSP-21161 processor development tools to apply the ADSP-21161 processor symbol definitions header file and linker description file. While these changes have no direct impact on existing core applications, system and I/O processor initialization code and control code do require modifications.

Although the porting of source code written for the ADSP-2106x family to the ADSP-21161 processor has been simplified, code changes are required to take full advantage of the new ADSP-21161 processor features. For more information, see the ADSP-21160 SHARC DSP Instruction Set Reference.

For More Information About Analog Products

Analog Devices is online on the internet at http://www.analog.com. Our web pages provide information on the company and products, including access to technical information and documentation, product overviews, and product announcements.

Additional information may be obtained about Analog Devices and its products in any of the following ways:

• Visit our World Wide Web site at www.analog.com

• FAX questions or requests for information to 1(781)461-3010.

• Access the Computer Products Division File Transfer Protocol (FTP) site at ftp ftp.analog.com or ftp 137.71.23.21 or ftp://ftp.analog.com

For Technical or Customer Support

Our Customer Support group can be reached in the following ways:

• E-mail questions to dsp.support@analog.com (hardware support), dsptools.support@analog.com (software support) or dsp.europe@analog.com (European customer support).

• Contact your local ADI sales office or an authorized ADI distributor

• Send questions by mail to:

Analog Devices, Inc.
DSP Division
One Technology Way
P.O. Box 9106
Norwood, MA 02062-9106
USA

What’s New in This Manual

The fourth edition of the ADSP-21161 SHARC Processor Hardware Reference is revised based on the published document errata.

Related Documents

For more information about Analog Devices DSPs and development products, see the following documents:

• ADSP-21161 SHARC DSP Microcomputer Data Sheet

• ADSP-21160 SHARC DSP Instruction Set Reference

• Getting Started Guide for VisualDSP++ & ADSP-21xxx Family DSPs

• VisualDSP++ User's Guide for ADSP-21xxx Family DSPs

• C/C++ Compiler & Library Manual for ADSP-21xxx Family DSPs

• Assembler Manual for ADSP-21xxx Family DSPs

• Linker & Utilities Manual for ADSP-21xxx Family DSPs

All the manuals are included in the software distribution CD-ROM. To access these manuals, use the Help Topics command in the VisualDSP++ environment’s Help menu and select the Online Manuals book. From this Help topic, you can open any of the manuals, which are in Adobe Acrobat PDF format.

Conventions

The following are conventions that apply to all chapters. Note that additional conventions, which apply only to specific chapters, appear throughout this document.

Table 1-1. Notation Conventions

Example                            Description

Close command (File menu)          Titles in reference sections indicate the location of an item within the VisualDSP++ environment’s menu system (for example, the Close command appears on the File menu).

{this | that}                      Alternative items in syntax descriptions appear within curly brackets and separated by vertical bars; read the example as this or that. One or the other is required.

[this | that]                      Optional items in syntax descriptions appear within brackets and separated by vertical bars; read the example as an optional this or that.

[this,…]                           Optional item lists in syntax descriptions appear within brackets delimited by commas and terminated with an ellipsis; read the example as an optional comma-separated list of this.

.SECTION                           Commands, directives, keywords, and feature names are in text with letter gothic font.

filename                           Non-keyword placeholders appear in text with italic style format.

Note: For correct operation, ...   A Note: provides supplementary information on a related topic. In the online version of this book, the word Note appears instead of this symbol.

Warning: Injury to device users may result if ...   A Warning: identifies conditions or inappropriate usage of the product that could lead to conditions that are potentially hazardous for device users. In the online version of this book, the word Warning appears instead of this symbol.

2 PROCESSING ELEMENTS

The processor’s Processing Elements (PEx and PEy) perform numeric processing for digital signal processing algorithms. Each processing element contains a data register file and three computation units: an arithmetic/logic unit (ALU), a multiplier, and a shifter. Computational instructions for these elements include both fixed-point and floating-point operations, and each computational instruction can execute in a single cycle.

The computational units in a processing element handle different types of operations. The ALU performs arithmetic and logic operations on fixed-point and floating-point data. The multiplier does floating-point and fixed-point multiplication and executes fixed-point multiply/add and multiply/subtract operations. The shifter completes logical shifts, arithmetic shifts, bit manipulation, field deposit, and field extraction operations on 32-bit operands. The shifter can also derive exponents.
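Field extraction and field deposit on 32-bit operands can be illustrated with generic bit manipulation in C. This is a sketch of the general technique only: the function names are hypothetical, the sketch assumes `pos` is less than 32, and the details of the shifter’s own instructions (operand encoding, sign extension options, status flags) are not modelled.

```c
#include <stdint.h>

/* Build a mask of `len` low-order ones (len of 32 or more gives all ones). */
static uint32_t low_mask(unsigned len)
{
    return (len >= 32) ? 0xFFFFFFFFu : ((1u << len) - 1u);
}

/* Extract `len` bits of `x` starting at bit `pos`, right-justified. */
static uint32_t field_extract(uint32_t x, unsigned pos, unsigned len)
{
    return (x >> pos) & low_mask(len);
}

/* Deposit the low `len` bits of `field` into a zeroed word at bit `pos`. */
static uint32_t field_deposit(uint32_t field, unsigned pos, unsigned len)
{
    return (field & low_mask(len)) << pos;
}
```

For example, extracting an 8-bit field at bit 8 of 0xABCD1234 yields 0x12, and depositing the low 8 bits of 0x1FF at bit 4 yields 0xFF0.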

Data flow paths through the computational units are arranged in parallel, as shown in Figure 2-1. The output of any computation unit may serve as the input of any computation unit on the next instruction cycle. Data moving in and out of the computational units goes through a 10-port register file, consisting of sixteen primary registers and sixteen alternate registers. Two ports on the register file connect to the PM and DM data buses, allowing data transfer between the computational units and memory (and anything else) connected to these buses.

The processor’s assembly language provides access to the data register files in both processing elements. The syntax lets programs move data to and from these registers and specify a computation’s data format at the same time with naming conventions for the registers. For information on the data register names, see “Data Register File” on page 2-30.

Figure 2-1 provides a graphical guide to the other topics in this chapter. First, a description of the MODE1 register shows how to set rounding, data format, and other modes for the processing elements. Next, an examination of each computational unit provides details on operation and a summary of computational instructions. Outside the computational units, details on register files and data buses identify how to flow data for computations. Finally, details on the processor’s advanced parallelism reveal how to take advantage of multifunction instructions and SIMD mode.

Figure 2-1. Computation Units (block diagram; labels: MODE1 register, PM data bus, DM data bus, register file R0 to R15 (16 × 40-bit), multiplier with MRF0, MRF1, and MRF2, shifter, ALU, ASTATx and STKYx, output to program sequencer)

Setting Computational Modes

The MODE1 register controls the operating mode of the processing elements. Table A-2 on page A-3 lists all the bits in MODE1. The following bits in MODE1 control computational modes.

• Floating-point data format. Bit 16 (RND32) directs the computational units to round floating-point data to 32 bits (if 1) or round to 40 bits (if 0).

• Rounding mode. Bit 15 (TRUNC) directs the computational units to round results with round-to-zero (if 1) or round-to-nearest (if 0).

• ALU saturation. Bit 13 (ALUSAT) directs the computational units to saturate results on positive or negative fixed-point overflows (if 1) or return unsaturated results (if 0).

• Short word sign extension. Bit 14 (SSE) directs the computational units to sign extend short-word, 16-bit data (if 1) or zero-fill the upper 16 bits (if 0).

• Secondary processor element (PEy). Bit 21 (PEYEN) enables computations in PEy (SIMD mode) if 1 or disables PEy (SISD mode) if 0.
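The bit assignments listed above can be modeled on a host machine as a small lookup. The following Python sketch is illustrative only: the helper names `decode_mode1` and `set_mode1_bit` are hypothetical, and only the bit positions come from the list above.

```python
# Bit positions come from the MODE1 list above; the helper names are hypothetical.
MODE1_BITS = {
    "ALUSAT": 13,  # 1 = saturate fixed-point ALU overflows
    "SSE": 14,     # 1 = sign extend short-word (16-bit) data
    "TRUNC": 15,   # 1 = round-to-zero, 0 = round-to-nearest
    "RND32": 16,   # 1 = round floating-point data to 32 bits, 0 = 40 bits
    "PEYEN": 21,   # 1 = SIMD mode (PEy enabled), 0 = SISD mode
}

def decode_mode1(value):
    """Return a dict mapping each computational-mode bit name to True or False."""
    return {name: bool((value >> bit) & 1) for name, bit in MODE1_BITS.items()}

def set_mode1_bit(value, name, enable):
    """Return a copy of a MODE1 value with one computational-mode bit set or cleared."""
    mask = 1 << MODE1_BITS[name]
    return (value | mask) if enable else (value & ~mask)

# Example: a value with only RND32 and PEYEN set.
modes = decode_mode1((1 << 16) | (1 << 21))
```

This models only the five computational-mode bits; MODE1 contains other bits (see Table A-2) that the sketch ignores.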

32-Bit (Normal Word) Floating-Point Format

In the default mode of the processor (RND32 bit=1), the multiplier and ALU support a single-precision floating-point format, which is specified in the IEEE 754/854 standard. For more information on this standard, see “Numeric Formats” on page C-1. This format is IEEE 754/854 compatible for single-precision floating-point operations in all respects except that:

• The processor does not provide inexact flags.

• NAN (“Not-A-Number”) inputs generate an invalid exception and return a quiet NAN (all 1s).

• Denormal operands flush to zero when input to a computation unit and do not generate an underflow exception. Any denormal or underflow result from an arithmetic operation flushes to zero and generates an underflow exception.

• The processor supports round to nearest and round toward zero modes, but does not support round to +Infinity and round to -Infinity.

IEEE Single-precision floating-point data uses a 23-bit mantissa with an 8-bit exponent plus sign bit. In this case, the computation unit sets the eight LSBs of floating-point inputs to zeros before performing the operation. The mantissa of a result rounds to 23 bits (not including the hidden bit), and the 8 LSBs of the 40-bit result clear to zeros to form a 32-bit number, which is equivalent to the IEEE standard result.

In fixed-point to floating-point conversion, the rounding boundary is always 40 bits even if the RND32 bit is set.

40-Bit Floating-Point Format

When in extended precision mode (RND32 bit=0), the processor supports a 40-bit extended precision floating-point mode, which has eight additional LSBs of the mantissa and is compliant with the 754/854 standards; however, results in this format are more precise than the IEEE single-precision standard specifies. Extended-precision floating-point data uses a 31-bit mantissa with an 8-bit exponent plus sign bit.


16-Bit (Short Word) Floating-Point Format

The processor supports a 16-bit floating-point storage format and provides instructions that convert the data for 40-bit computations. The 16-bit floating-point format uses an 11-bit mantissa with a 4-bit exponent plus sign bit. The 16-bit data goes into bits 23 through 8 of a data register. Two shifter instructions, Fpack and Funpack, perform the packing and unpacking conversions between 32-bit floating-point words and 16-bit floating-point words. The Fpack instruction converts a 32-bit IEEE floating-point number in a data register into a 16-bit floating-point number. Funpack converts a 16-bit floating-point number in a data register into a 32-bit IEEE floating-point number. Each instruction executes in a single cycle.

When 16-bit data is written to bits 23 through 8 of a data register, the processor automatically extends the data into a 32-bit integer (bits 39 through 8). If the SSE bit in MODE1 is set (1), the processor sign extends the upper 16 bits. If the SSE bit is cleared (0), the processor zeros the upper 16 bits.
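The effect of the SSE bit on a short word can be sketched in Python. This is a host-side model under stated assumptions, not processor code: the function name `extend_short_word` is hypothetical, and the result is shown as the 32-bit integer field (bits 39 through 8 of the data register).

```python
def extend_short_word(value16, sse):
    """Model extension of a 16-bit short word into the 32-bit integer field.

    sse=True  : sign extend (upper 16 bits copy bit 15 of the short word)
    sse=False : zero-fill the upper 16 bits
    """
    value16 &= 0xFFFF
    if sse and (value16 & 0x8000):
        return 0xFFFF0000 | value16
    return value16

negative_sse = extend_short_word(0x8001, sse=True)    # 0xFFFF8001
negative_zero = extend_short_word(0x8001, sse=False)  # 0x00008001
```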

The 16-bit floating-point format supports gradual underflow. This method sacrifices precision for dynamic range. When packing a number that would have underflowed, the exponent clears to zero and the mantissa (including “hidden” 1) right-shifts the appropriate amount. The packed result is a denormal, which can be unpacked into a normal IEEE floating-point number.

32-Bit Fixed-Point Format

The processor always represents fixed-point numbers in 32 bits, occupying the 32 MSBs in 40-bit data registers. Fixed-point data may be fractional or integer numbers and unsigned or twos-complement. Each computational unit has its own limitations on how these formats may be mixed for a given operation. All computational units read the upper 32 bits of data (inputs, operands) from the 40-bit registers (ignoring the 8 LSBs) and write results to the upper 32 bits (zeroing the 8 LSBs).

Rounding Mode

The TRUNC bit in the MODE1 register determines the rounding mode for all ALU operations, all floating-point multiplies, and fixed-point multiplies of fractional data. The processor supports two modes of rounding: round-toward-zero and round-toward-nearest. The rounding modes comply with the IEEE 754 standard and have the following definitions.

• Round-Toward-Zero (TRUNC bit=1). If the result before rounding is not exactly representable in the destination format, the rounded result is the number that is nearer to zero. This definition is equivalent to truncation.

• Round-Toward-Nearest (TRUNC bit=0). If the result before rounding is not exactly representable in the destination format, the rounded result is the number that is nearer to the result before rounding. If the result before rounding is exactly halfway between two numbers in the destination format (differing by an LSB), the rounded result is the number that has an LSB equal to zero.

Statistically, rounding up occurs as often as rounding down, so there is no large sample bias. Because the maximum floating-point value is one LSB less than the value that represents Infinity, a result that is halfway between the maximum floating-point value and Infinity rounds to Infinity in this mode.
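The two rounding definitions above can be illustrated with a short Python sketch that discards low-order bits from a non-negative magnitude, such as a mantissa. The function name `drop_low_bits` is hypothetical, and the model covers only the arithmetic of the definitions, not the processor's internal datapath.

```python
def drop_low_bits(magnitude, nbits, trunc):
    """Discard nbits low-order bits of a non-negative magnitude.

    trunc=True  : round toward zero (plain truncation)
    trunc=False : round to nearest; an exact tie goes to the value whose LSB is 0
    """
    kept = magnitude >> nbits
    if trunc or nbits == 0:
        return kept
    remainder = magnitude & ((1 << nbits) - 1)
    half = 1 << (nbits - 1)
    if remainder > half or (remainder == half and (kept & 1)):
        kept += 1
    return kept

tie_odd = drop_low_bits(0b10111000, 4, trunc=False)    # tie, 1011 is odd, rounds up to 1100
tie_even = drop_low_bits(0b10101000, 4, trunc=False)   # tie, 1010 is even, stays 1010
truncated = drop_low_bits(0b10111000, 4, trunc=True)   # 1011
```

The tie cases show why the mode carries no large-sample bias: half of the ties round up and half round down.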

Though these rounding modes comply with standards set for floating-point data, they also apply for fixed-point multiplier operations on fractional data. The same two rounding modes are supported, but only the round-to-nearest operation is actually performed by the multiplier. Using its local result register for fixed-point operations, the multiplier rounds-to-zero by reading only the upper bits of the result and discarding the lower bits.

Using Computational Status

The multiplier and ALU each provide exception information when executing floating-point operations. Each unit updates overflow, underflow, and invalid operation flags in the processing element’s arithmetic status (ASTATx and ASTATy) register and sticky status (STKYx and STKYy) register. An underflow, overflow, or invalid operation from any unit also generates a maskable interrupt. There are three ways to use floating-point exceptions from computations in program sequencing:

• Interrupts. Enable interrupts and use an interrupt service routine to handle the exception condition immediately. This method is appropriate if it is important to correct all exceptions as they occur.

• ASTATx and ASTATy registers. Use conditional instructions to test the exception flags in the ASTATx or ASTATy register after the instruction executes. This method permits monitoring each instruction’s outcome.

• STKYx and STKYy registers. Use the Bit Tst instruction to examine exception flags in the STKY register after a series of operations. If any flags are set, some of the results are incorrect. This method is useful when exception handling is not critical.

More information on ASTAT and STKY status appears in the sections that describe the computational units. For summaries relating instructions and status bits, see Table 2-1, Table 2-2, Table 2-4, Table 2-6, and Table 2-7.

Arithmetic Logic Unit (ALU)

The ALU performs arithmetic operations on fixed-point or floating-point data and logical operations on fixed-point data. ALU fixed-point instructions operate on 32-bit fixed-point operands and output 32-bit fixed-point results. ALU floating-point instructions operate on 32-bit or 40-bit floating-point operands and output 32-bit or 40-bit floating-point results. ALU instructions include:

• Floating-point addition, subtraction, add/subtract, average

• Fixed-point addition, subtraction, add/subtract, average

• Floating-point manipulation: binary log, scale, mantissa

• Fixed-point add with carry, subtract with borrow, increment, decrement

• Logical And, Or, Xor, Not

• Functions: Abs, pass, min, max, clip, compare

• Format conversion

• Reciprocal and reciprocal square root primitives

ALU Operation

ALU instructions take one or two inputs: X input and Y input. These inputs (also known as operands) can be any data registers in the register file. Most ALU operations return one result; in add/subtract operations, the ALU operation returns two results, and in compare operations, the ALU operation returns no result (only flags are updated). ALU results can be returned to any location in the register file.


The processor transfers input operands from the register file during the first half of the cycle and transfers results to the register file during the second half of the cycle. With this arrangement, the ALU can read and write the same register file location in a single cycle. If the ALU operation is fixed-point, the inputs are treated as 32-bit fixed-point operands. The ALU transfers the upper 32 bits from the source location in the register file. For fixed-point operations, the result(s) are always 32-bit fixed-point values. Some floating-point operations (Logb, Mant and Fix) can also yield fixed-point results.

The processor transfers fixed-point results to the upper 32 bits of the data register and clears the lower eight bits of the register. The format of fixed-point operands and results depends on the operation. In most arithmetic operations, there is no need to distinguish between integer and fractional formats. Fixed-point inputs to operations such as scaling a floating-point value are treated as integers. For purposes of determining status such as overflow, fixed-point arithmetic operands and results are treated as twos-complement numbers.

ALU Saturation

When the ALUSAT bit is set (1) in the MODE1 register, the ALU is in saturation mode. In this mode, all positive fixed-point overflows return the maximum positive fixed-point number (0x7FFF FFFF), and all negative overflows return the maximum negative number (0x8000 0000).

When the ALUSAT bit is cleared (0) in the MODE1 register, fixed-point results that overflow are not saturated; the upper 32 bits of the result are returned unaltered.

The ALU overflow flag reflects the ALU result before saturation.
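The saturation behavior described above can be sketched for a fixed-point addition. The Python model below is an illustration under stated assumptions: the function names are hypothetical, operands are treated as 32-bit twos-complement values, and an unsaturated overflow is modeled as the low 32 bits of the sum.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(x):
    """Interpret a 32-bit pattern as a twos-complement integer."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x

def alu_add(rx, ry, alusat):
    """Model Rn = Rx + Ry; returns (32-bit result pattern, overflow flag).

    The overflow flag is computed before saturation, so it is set
    whether or not the result is saturated.
    """
    total = to_signed32(rx) + to_signed32(ry)
    overflow = total > 0x7FFFFFFF or total < -0x80000000
    if overflow and alusat:
        result = 0x7FFFFFFF if total > 0 else 0x80000000
    else:
        result = total & MASK32  # assumption: unsaturated result wraps to 32 bits
    return result, overflow

saturated = alu_add(0x7FFFFFFF, 1, alusat=True)   # (0x7FFFFFFF, True)
wrapped = alu_add(0x7FFFFFFF, 1, alusat=False)    # (0x80000000, True)
```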


ALU Status Flags

ALU operations update seven status flags in the processing element’s Arithmetic Status (ASTATx and ASTATy) register. Table A-4 on page A-18 lists all the bits in these registers. The following bits in ASTATx or ASTATy flag ALU status (a 1 indicates the condition) for the most recent ALU operation:

• ALU result zero or floating-point underflow. Bit 0 (AZ)

• ALU overflow. Bit 1 (AV)

• ALU result negative. Bit 2 (AN)

• ALU fixed-point carry. Bit 3 (AC)

• ALU X input sign for Abs, Mant operations. Bit 4 (AS)

• ALU floating-point invalid operation. Bit 5 (AI)

• Last ALU operation was a floating-point operation. Bit 10 (AF)

• Compare Accumulation register results of last 8 compare operations. Bits 31-24 (CACC)

ALU operations also update four “sticky” status flags in the processing element’s Sticky status (STKYx and STKYy) register. “Sticky Status Registers (STKYx and STKYy)” on page A-18 lists all the bits in these registers. The following bits in STKYx or STKYy flag ALU status (a 1 indicates the condition). Once set, a sticky flag remains high until explicitly cleared:

• ALU floating-point underflow. Bit 0 (AUS)

• ALU floating-point overflow. Bit 1 (AVS)

• ALU fixed-point overflow. Bit 2 (AOS)

• ALU floating-point invalid operation. Bit 5 (AIS)


Flag update occurs at the end of the cycle in which the status is generated and is available on the next cycle. If a program writes the arithmetic status register or sticky status register explicitly in the same cycle that the ALU is performing an operation, the explicit write to the status register supersedes any flag update from the ALU operation.
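The difference between an arithmetic status flag (which reflects the most recent operation) and a sticky flag (which stays set until cleared) can be modeled in a few lines of Python. The class `AluStatus` and its methods are hypothetical; only the AV (ASTAT bit 1) and AOS (STKY bit 2) positions come from the lists in this section, and the model tracks fixed-point overflow only.

```python
AV = 1 << 1    # ASTAT bit 1: ALU overflow, most recent operation
AOS = 1 << 2   # STKY bit 2: ALU fixed-point overflow, sticky

class AluStatus:
    """Minimal model of one ASTAT flag and its sticky counterpart."""

    def __init__(self):
        self.astat = 0
        self.stky = 0

    def record_fixed_point_op(self, overflowed):
        """Update flags as a fixed-point ALU operation would."""
        if overflowed:
            self.astat |= AV
            self.stky |= AOS   # sticky flag can only be set here, never cleared
        else:
            self.astat &= ~AV  # ASTAT follows the latest operation

    def clear_sticky(self, mask):
        """Model an explicit program write that clears sticky bits."""
        self.stky &= ~mask

status = AluStatus()
status.record_fixed_point_op(overflowed=True)
status.record_fixed_point_op(overflowed=False)
# AV is now clear, but AOS still records the earlier overflow.
```

Checking the sticky flag once after a series of operations corresponds to the third method listed under "Using Computational Status".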

ALU Instruction Summary

Table 2-1 and Table 2-2 list the ALU instructions and how they relate to ASTATx,y and STKYx,y flags. For more information on assembly language syntax, see the ADSP-21160 SHARC DSP Instruction Set Reference. In these tables, note the meaning of the following symbols.

• Rn, Rx, Ry indicate any register file location; treated as fixed-point

• Fn, Fx, Fy indicate any register file location; treated as floating-point

• * indicates the flag may be set or cleared, depending on results of instruction

• ** indicates the flag may be set (but not cleared), depending on results of instruction

• – indicates no effect


Table 2-1. Fixed-Point ALU Instruction Summary

ASTATx,y Status Flags: AZ, AV, AN, AC, AS, AI, AF, CACC. STKYx,y Status Flags: AUS, AVS, AOS, AIS.

Instruction
Fixed-Point:            AZ  AV  AN  AC  AS  AI  AF  CACC  AUS  AVS  AOS  AIS
Rn = Rx + Ry            *   *   *   *   0   0   0   –     –    –    **   –
Rn = Rx – Ry            *   *   *   *   0   0   0   –     –    –    **   –
Rn = Rx + Ry + CI       *   *   *   *   0   0   0   –     –    –    **   –
Rn = Rx – Ry + CI – 1   *   *   *   *   0   0   0   –     –    –    **   –
Rn = (Rx + Ry)/2        *   0   *   *   0   0   0   –     –    –    –    –
COMP(Rx, Ry)            *   0   *   0   0   0   0   *     –    –    –    –
COMPU(Rx, Ry)           *   0   *   0   0   0   0   *     –    –    –    –
Rn = Rx + CI            *   *   *   *   0   0   0   –     –    –    **   –
Rn = Rx + CI – 1        *   *   *   *   0   0   0   –     –    –    **   –
Rn = Rx + 1             *   *   *   *   0   0   0   –     –    –    **   –
Rn = Rx – 1             *   *   *   *   0   0   0   –     –    –    **   –
Rn = –Rx                *   *   *   *   0   0   0   –     –    –    **   –
Rn = ABS Rx             *   *   0   0   *   0   0   –     –    –    **   –
Rn = PASS Rx            *   0   *   0   0   0   0   –     –    –    –    –
Rn = Rx AND Ry          *   0   *   0   0   0   0   –     –    –    –    –
Rn = Rx OR Ry           *   0   *   0   0   0   0   –     –    –    –    –
Rn = Rx XOR Ry          *   0   *   0   0   0   0   –     –    –    –    –
Rn = NOT Rx             *   0   *   0   0   0   0   –     –    –    –    –
Rn = MIN(Rx, Ry)        *   0   *   0   0   0   0   –     –    –    –    –
Rn = MAX(Rx, Ry)        *   0   *   0   0   0   0   –     –    –    –    –
Rn = CLIP Rx BY Ry      *   0   *   0   0   0   0   –     –    –    –    –


Table 2-2. Floating-Point ALU Instruction Summary

ASTATx,y Status Flags: AZ, AV, AN, AC, AS, AI, AF, CACC. STKYx,y Status Flags: AUS, AVS, AOS, AIS.

Instruction
Floating-Point:         AZ  AV  AN  AC  AS  AI  AF  CACC  AUS  AVS  AOS  AIS
Fn = Fx + Fy            *   *   *   0   0   *   1   –     **   **   –    **
Fn = Fx – Fy            *   *   *   0   0   *   1   –     **   **   –    **
Fn = ABS (Fx + Fy)      *   *   0   0   0   *   1   –     **   **   –    **
Fn = ABS (Fx – Fy)      *   *   0   0   0   *   1   –     **   **   –    **
Fn = (Fx + Fy)/2        *   0   *   0   0   *   1   –     **   –    –    **
COMP(Fx, Fy)            *   0   *   0   0   *   1   *     –    –    –    **
Fn = –Fx                *   *   *   0   0   *   1   –     –    **   –    **
Fn = ABS Fx             *   *   0   0   *   *   1   –     –    **   –    **
Fn = PASS Fx            *   0   *   0   0   *   1   –     –    –    –    **
Fn = RND Fx             *   *   *   0   0   *   1   –     –    **   –    **
Fn = SCALB Fx BY Ry     *   *   *   0   0   *   1   –     **   **   –    **
Rn = MANT Fx            *   *   0   0   *   *   1   –     –    **   –    **
Rn = LOGB Fx            *   *   *   0   0   *   1   –     –    **   –    **
Rn = FIX Fx BY Ry       *   *   *   0   0   *   1   –     **   **   –    **
Rn = FIX Fx             *   *   *   0   0   *   1   –     **   **   –    **
Fn = FLOAT Rx BY Ry     *   *   *   0   0   0   1   –     **   **   –    –
Fn = FLOAT Rx           *   0   *   0   0   0   1   –     –    –    –    –
Fn = RECIPS Fx          *   *   *   0   0   *   1   –     **   **   –    **
Fn = RSQRTS Fx          *   *   *   0   0   *   1   –     –    **   –    **
Fn = Fx COPYSIGN Fy     *   0   *   0   0   *   1   –     –    –    –    **
Fn = MIN(Fx, Fy)        *   0   *   0   0   *   1   –     –    –    –    **
Fn = MAX(Fx, Fy)        *   0   *   0   0   *   1   –     –    –    –    **
Fn = CLIP Fx BY Fy      *   0   *   0   0   *   1   –     –    –    –    **


Multiply—Accumulator (Multiplier)

The multiplier performs fixed-point or floating-point multiplication and fixed-point multiply/accumulate operations. Fixed-point multiply/accumulates are available with either cumulative addition or cumulative subtraction. Multiplier floating-point instructions operate on 32-bit or 40-bit floating-point operands and output 32-bit or 40-bit floating-point results. Multiplier fixed-point instructions operate on 32-bit fixed-point data and produce 80-bit results. Inputs are treated as fractional or integer, unsigned or twos-complement. Multiplier instructions include:

• Floating-point multiplication

• Fixed-point multiplication

• Fixed-point multiply/accumulate with addition, rounding optional

• Fixed-point multiply/accumulate with subtraction, rounding optional

• Rounding result register

• Saturating result register

• Clearing result register

Multiplier Operation

The multiplier takes two inputs: X input and Y input. These inputs (also known as operands) can be any data registers in the register file. The multiplier can accumulate fixed-point results in the local Multiplier Result (MRF) registers or write results back to the register file. The results in MRF can also be rounded or saturated in separate operations. Floating-point multiplies yield floating-point results, which the multiplier always writes directly to the register file.


The multiplier transfers input operands during the first half of the cycle and transfers results during the second half of the cycle. With this arrangement, the multiplier can read and write the same register file location in a single cycle.

For fixed-point multiplies, the multiplier reads the inputs from the upper 32 bits of the data registers. Fixed-point operands may be either both in integer format or both in fractional format. The format of the result matches the format of the inputs. Each fixed-point operand may be either an unsigned or a twos-complement number. If both inputs are fractional and signed, the multiplier automatically shifts the result left one bit to remove the redundant sign bit. The register name(s) within the multiplier instruction specify input data type(s)—Fx for floating-point and Rx for fixed-point.

Multiplier (Fixed-Point) Result Register

Fixed-point operations place 80-bit results in the multiplier’s foreground MRF register or background MRB register, depending on which is active. For more information on selecting the result register, see “Alternate (Secondary) Data Registers” on page 2-32.

The location of a result in the MRF register’s 80-bit field depends on whether the result is in fractional or integer format, as shown in Figure 2-2. If the result is sent directly to a data register, the 32-bit result with the same format as the input data is transferred, using bits 63-32 for a fractional result or bits 31-0 for an integer result. The eight LSBs of the 40-bit register file location are zero-filled.
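The result placement described above can be checked with a small Python model of a signed fixed-point multiply. This is a sketch under stated assumptions: the function names are hypothetical, both operands are signed, and the one-bit left shift is applied only for the signed fractional case as the text describes.

```python
MASK80 = (1 << 80) - 1

def to_signed32(x):
    """Interpret a 32-bit pattern as a twos-complement integer."""
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x

def multiply_signed(rx, ry, fractional):
    """Return the 80-bit result pattern for a signed fixed-point multiply."""
    product = to_signed32(rx) * to_signed32(ry)
    if fractional:
        product <<= 1  # remove the redundant sign bit of a signed fractional product
    return product & MASK80

def result_to_register(mr, fractional):
    """Select the 32 bits sent to a data register: 63-32 fractional, 31-0 integer."""
    return (mr >> 32) & 0xFFFFFFFF if fractional else mr & 0xFFFFFFFF

# 0.5 * 0.5 in 1.31 fractional format gives 0.25 (0x20000000).
quarter = result_to_register(multiply_signed(0x40000000, 0x40000000, True), True)
# 3 * 4 in integer format gives 12.
twelve = result_to_register(multiply_signed(3, 4, False), False)
```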


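Figure 2-2. Multiplier Fixed-Point Result Placement (diagram: the 80-bit MRF register, bits 79-0, is divided into MRF2, MRF1, and MRF0; a fractional result occupies bits 63-32 with overflow bits above it and underflow bits below it, and an integer result occupies bits 31-0 with overflow bits above it)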

Fractional results can be rounded-to-nearest before being sent to the register file. If rounding is not specified, discarding bits 31-0 effectively truncates a fractional result (rounds to zero). For more information on rounding, see “Rounding Mode” on page 2-7.

The MRF register is divided into MRF2, MRF1, and MRF0 registers, which can be individually read from or written to the register file. Each of these registers has the same format. When data is read from MRF2, it is sign-extended to 32 bits as shown in Figure 2-3. The processor zero fills the eight LSBs of the 40-bit register file location when data is read from MRF2, MRF1, or MRF0 to the register file. When the processor writes data into MRF2, MRF1, or MRF0 from the 32 MSBs of a register file location, the eight LSBs are ignored. Data written to MRF1 is sign-extended to MRF2, repeating the MSB of MRF1 in the 16 bits of MRF2. Data written to MRF0 is not sign-extended.
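The two sign-extension rules above (on a write to MRF1 and on a read from MRF2) can be sketched as follows. The Python function names are hypothetical, and the values are shown as the 32-bit field that maps to bits 39-8 of a register file location.

```python
def write_mrf1(value32):
    """Model a write to MRF1: returns (MRF2, MRF1) after the write.

    The MSB of MRF1 is repeated in all 16 bits of MRF2.
    """
    mrf1 = value32 & 0xFFFFFFFF
    mrf2 = 0xFFFF if mrf1 & 0x80000000 else 0x0000
    return mrf2, mrf1

def read_mrf2(mrf2):
    """Model a read of the 16-bit MRF2: sign-extended to 32 bits."""
    mrf2 &= 0xFFFF
    return (0xFFFF0000 | mrf2) if mrf2 & 0x8000 else mrf2

negative_write = write_mrf1(0x80000000)  # (0xFFFF, 0x80000000)
positive_write = write_mrf1(0x12345678)  # (0x0000, 0x12345678)
```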


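Figure 2-3. MR Transfer Formats (diagram: in a transfer to the register file, the 16-bit MRF2 is sign-extended to 32 bits, MRF1 and MRF0 each transfer as 32 bits, and the eight LSBs of the 40-bit register file location are zeros)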

In addition to multiplication, fixed-point operations include accumulation, rounding and saturation of fixed-point data. There are three MRF register operations: clear, round, and saturate.

The clear operation (MRF=0) resets the specified MRF register to zero. Often, it is best to perform this operation at the start of a multiply/accumulate operation to remove results left over from the previous operation.

The rounding operation (MRF=Rnd MRF) applies only to fractional results, so integer results are not affected. This operation rounds the 80-bit MRF value to nearest at bit 32, that is, at the MRF1-MRF0 boundary. Rounding of a fixed-point result occurs either as part of a multiply or multiply/accumulate operation or as an explicit operation on the MRF register. The rounded result in MRF1 can be sent either to the register file or back to the same MRF register. To round a fractional result to zero (truncation) instead of to nearest, a program would transfer the unrounded result from MRF1, discarding the lower 32 bits in MRF0.

The saturate operation (MRF=Sat MRF) sets MRF to a maximum value if the MRF value has overflowed. Overflow occurs when the MRF value is greater than the maximum value for the data format (unsigned or twos-complement and integer or fractional) as specified in the saturate instruction.


The six possible maximum values appear in Table 2-3. The result from MRF saturation can be sent either to the register file or back to the same MRF register.

Table 2-3. Fixed-Point Format Maximum Values (For Saturation)

Maximum Number (Hexadecimal)            MRF2   MRF1        MRF0
2’s complement fractional (positive)    0000   7FFF FFFF   FFFF FFFF
2’s complement fractional (negative)    FFFF   8000 0000   0000 0000
2’s complement integer (positive)       0000   0000 0000   7FFF FFFF
2’s complement integer (negative)       FFFF   FFFF FFFF   8000 0000
Unsigned fractional number              0000   FFFF FFFF   FFFF FFFF
Unsigned integer number                 0000   0000 0000   FFFF FFFF
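The six limits in Table 2-3 can be reproduced by clamping an 80-bit value to the range of the selected format. The following Python sketch is a model under stated assumptions: `saturate_mrf` is a hypothetical name, a signed format treats the whole 80-bit MRF value as twos-complement, and an unsigned format treats it as a non-negative 80-bit value.

```python
MASK80 = (1 << 80) - 1

def to_signed80(x):
    """Interpret an 80-bit pattern as a twos-complement integer."""
    x &= MASK80
    return x - (1 << 80) if x >> 79 else x

def saturate_mrf(mr, signed, fractional):
    """Clamp an 80-bit MRF pattern to the limits listed in Table 2-3."""
    width = 64 if fractional else 32  # fractional results end at bit 63, integer at bit 31
    if signed:
        value = to_signed80(mr)
        hi = (1 << (width - 1)) - 1
        lo = -(1 << (width - 1))
    else:
        value = mr & MASK80
        hi = (1 << width) - 1
        lo = 0
    return max(lo, min(hi, value)) & MASK80

# A value too large for the signed fractional format saturates to 0000 7FFF FFFF FFFF FFFF.
max_frac = saturate_mrf(1 << 70, signed=True, fractional=True)
# A value too negative for the signed integer format saturates to FFFF FFFF FFFF 8000 0000.
min_int = saturate_mrf((-(1 << 40)) & MASK80, signed=True, fractional=False)
```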

Multiplier Status Flags

Multiplier operations update four status flags in the processing element’s arithmetic status register (ASTATx and ASTATy). Table A-5 on page A-19 lists all the bits in these registers. The following bits in ASTATx or ASTATy flag multiplier status (a 1 indicates the condition) for the most recent multiplier operation.

• Multiplier result negative. Bit 6 (MN)

• Multiplier overflow. Bit 7 (MV)

• Multiplier underflow. Bit 8 (MU)

• Multiplier floating-point invalid operation. Bit 9 (MI)

Multiplier operations also update four “sticky” status flags in the processing element’s Sticky status (STKYx and STKYy) register. Table A-5 on page A-19 lists all the bits in these registers. The following bits in STKYx or STKYy flag multiplier status (a 1 indicates the condition). Once set, a sticky flag remains high until explicitly cleared:

• Multiplier fixed-point overflow. Bit 6 (MOS)

• Multiplier floating-point overflow. Bit 7 (MVS)

• Multiplier underflow. Bit 8 (MUS)

• Multiplier floating-point invalid operation. Bit 9 (MIS)

Flag update occurs at the end of the cycle in which the status is generated and is available on the next cycle. If a program writes the arithmetic status register or sticky register explicitly in the same cycle that the multiplier is performing an operation, the explicit write to ASTAT or STKY supersedes any flag update from the multiplier operation.

Multiplier Instruction Summary

Table 2-4 and Table 2-6 list the multiplier instructions and how they relate to ASTATx,y and STKYx,y flags. For more information on assembly language syntax, see the ADSP-21160 SHARC DSP Instruction Set Reference. In these tables, note the meaning of the following symbols.

• Rn, Rx, Ry indicate any register file location; treated as fixed-point

• Fn, Fx, Fy indicate any register file location; treated as floating-point

• * indicates the flag may be set or cleared, depending on results of instruction

• ** indicates the flag may be set (but not cleared), depending on results of instruction

• – indicates no effect

• The Input Mods column indicates the types of optional modifiers that you can apply to the instruction’s inputs. For a list of modifiers, see Table 2-5.

Table 2-4. Fixed-Point Multiplier Instruction Summary

ASTATx,y Flags: MU, MN, MV, MI. STKYx,y Flags: MUS, MOS, MVS, MIS. For Input Mods, see Table 2-5.

Instruction             Input
Fixed-Point:            Mods  MU  MN  MV  MI  MUS  MOS  MVS  MIS
Rn = Rx * Ry            1     *   *   *   0   –    **   –    –
MRF = Rx * Ry           1     *   *   *   0   –    **   –    –
MRB = Rx * Ry           1     *   *   *   0   –    **   –    –
Rn = MRF + Rx * Ry      1     *   *   *   0   –    **   –    –
Rn = MRB + Rx * Ry      1     *   *   *   0   –    **   –    –
MRF = MRF + Rx * Ry     1     *   *   *   0   –    **   –    –
MRB = MRB + Rx * Ry     1     *   *   *   0   –    **   –    –
Rn = MRF – Rx * Ry      1     *   *   *   0   –    **   –    –
Rn = MRB – Rx * Ry      1     *   *   *   0   –    **   –    –
MRF = MRF – Rx * Ry     1     *   *   *   0   –    **   –    –
MRB = MRB – Rx * Ry     1     *   *   *   0   –    **   –    –
Rn = SAT MRF            2     *   *   *   0   –    **   –    –
Rn = SAT MRB            2     *   *   *   0   –    **   –    –
MRF = SAT MRF           2     *   *   *   0   –    **   –    –
MRB = SAT MRB           2     *   *   *   0   –    **   –    –
Rn = RND MRF            3     *   *   *   0   –    **   –    –
Rn = RND MRB            3     *   *   *   0   –    **   –    –
MRF = RND MRF           3     *   *   *   0   –    **   –    –
MRB = RND MRB           3     *   *   *   0   –    **   –    –
MRF = 0                 –     –   –   –   –   –    –    –    –
MRB = 0                 –     –   –   –   –   –    –    –    –
MRxF = Rn               –     –   –   –   –   –    –    –    –
MRxB = Rn               –     –   –   –   –   –    –    –    –
Rn = MRxF               –     –   –   –   –   –    –    –    –
Rn = MRxB               –     –   –   –   –   –    –    –    –

Table 2-5. Input Modifiers For Fixed-Point Multiplier Instruction

Input Mods from Table 2-4    Input Mods: Options For Fixed-Point Multiplier Instructions
1                            (SSF), (SSI), (SSFR), (SUF), (SUI), (SUFR), (USF), (USI), (USFR), (UUF), (UUI), or (UUFR)
2                            (SF), (SI), (UF), or (UI)
3                            (SF) or (UF)

Note the meaning of the following symbols in this table: S = Signed input, U = Unsigned input, I = Integer input(s), F = Fractional input(s), FR = Fractional inputs, Rounded output.

Note that (SF) is the default format for 1-input operations, and (SSF) is the default format for 2-input operations.

Table 2-6. Floating-Point Multiplier Instruction Summary

ASTATx,y Flags: MU, MN, MV, MI. STKYx,y Flags: MUS, MOS, MVS, MIS.

Instruction
Floating-Point:         MU  MN  MV  MI  MUS  MOS  MVS  MIS
Fn = Fx * Fy            *   *   *   *   **   –    **   **


Barrel-Shifter (Shifter)

The shifter performs bit-wise operations on 32-bit fixed-point operands. Shifter operations include:

• Shifts and rotates from off-scale left to off-scale right

• Bit manipulation operations, including bit set, clear, toggle, and test

• Bit field manipulation operations, including extract and deposit

• Fixed-point/floating-point conversion operations, including exponent extract, number of leading 1s or 0s

Shifter Operation

The shifter takes from one to three inputs: X-input, Y-input, and Z-input. The inputs (also known as operands) can be any register in the register file. Within a shifter instruction, the inputs serve as follows.

• The X-input provides data that is operated on

• The Y-input specifies shift magnitudes, bit field lengths or bit positions

• The Z-input provides data that is operated on and updated

In the following example, Rx is the X-input, Ry is the Y-input, and Rn is the Z-input. The shifter returns one output (Rn) to the register file.

Rn = Rn OR LSHIFT Rx BY Ry;
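The roles of the three inputs in this instruction can be modeled in Python. The sketch below is illustrative and rests on assumptions not stated in this section: the shift value is read as a signed 8-bit number from the low 8 bits of the 32-bit Y-input, positive values shift left, and negative values shift right (logical). The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF

def shf8(ry):
    """Signed 8-bit shift value taken from the low 8 bits of the 32-bit Y-input."""
    v = ry & 0xFF
    return v - 256 if v & 0x80 else v

def lshift(rx, ry):
    """Model Rn = LSHIFT Rx BY Ry (assumed: positive = left, negative = right)."""
    n = shf8(ry)
    rx &= MASK32
    if n >= 0:
        return (rx << n) & MASK32  # bits shifted past the MSB are lost
    return rx >> -n

def or_lshift(rn, rx, ry):
    """Model Rn = Rn OR LSHIFT Rx BY Ry: Rn is both the Z-input and the result."""
    return (rn & MASK32) | lshift(rx, ry)

merged = or_lshift(0x0000000F, 0x00000001, 8)  # 0x0000010F
```

The model omits the SZ, SV, and SS status flags that the shifter also updates.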

As shown in Figure 2-4, the shifter fetches input operands from the upper 32 bits of a register file location (bits 39-8) or from an immediate value in the instruction. The shifter transfers operands during the first half of the cycle and transfers the result to the upper 32 bits of a register (with the eight LSBs zero-filled) during the second half of the cycle. With this arrangement, the shifter can read and write the same register file location in a single cycle.

The X-input and Z-input are always 32-bit fixed-point values. The Y-input is a 32-bit fixed-point value or an 8-bit field (shf8), positioned in the register file. These inputs appear in Figure 2-4.

Some shifter operations produce 8-bit or 6-bit results. As shown in Figure 2-5, the shifter places these results in either the shf8 field or the bit6 field and sign-extends the results to 32 bits. The shifter always returns a 32-bit result.

Figure 2-4. Register File Fields for Shifter Instructions (diagram: the 32-bit Y-input or result occupies bits 39-8 of the register file location; the 8-bit Y-input or result, shf8, occupies bits 15-8)

The shifter supports bit field deposit and bit field extract instructions for manipulating groups of bits within an input. The Y-input for bit field instructions specifies two 6-bit values: bit6 and len6, which are positioned in the Ry register as shown in Figure 2-5. The shifter interprets bit6 and len6 as positive integers. Bit6 is the starting bit position for the deposit or extract, and len6 is the bit field length, which specifies how many bits are deposited or extracted.

Figure 2-5. Register File Fields for FDEP, FEXT Instructions (diagram: positions of the len6 and bit6 fields within the Ry register)

Field deposit (Fdep) instructions take a group of bits from the input register (starting at the LSB of the 32-bit integer field) and deposit the bits as directed anywhere within the result register. The bit6 value specifies the starting bit position for the deposit. Figure 2-7 shows how the inputs, bit6 and len6, work in a field deposit instruction (Rn=Fdep Rx By Ry).

Figure 2-6 shows bit placement for the field deposit instruction R0 = FDEP R1 BY R2;

Field extract (Fext) instructions extract a group of bits as directed from anywhere within the input register and place them in the result register (aligned with the LSB of the 32-bit integer field). The bit6 value specifies the starting bit position for the extract. Figure 2-8 shows bit placement for the following field extract instruction

R3 = FEXT R4 BY R5;
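The deposit and extract operations can be sketched in Python on the 32-bit integer field. This is a model under stated assumptions: the function names are hypothetical, the positions of len6 and bit6 inside the Y-input are inferred from the deposit example value 0x0000 0210 (len6 = 8, bit6 = 16), and bits outside the deposited field are assumed to be zero in the result.

```python
MASK32 = 0xFFFFFFFF

def split_ry(ry):
    """Return (len6, bit6) from the 32-bit Y-input.

    Assumed layout, inferred from 0x210 -> len6 = 8, bit6 = 16:
    bit6 in bits 5-0 and len6 in bits 11-6 of the 32-bit field.
    """
    return (ry >> 6) & 0x3F, ry & 0x3F

def fdep(rx, ry):
    """Model Rn = FDEP Rx BY Ry: take len6 bits from the LSB end of Rx, place them at bit6."""
    len6, bit6 = split_ry(ry)
    field = rx & ((1 << len6) - 1)
    return (field << bit6) & MASK32  # assumption: bits outside the field are zero

def fext(rx, ry):
    """Model Rn = FEXT Rx BY Ry: take len6 bits of Rx starting at bit6, align them to the LSB."""
    len6, bit6 = split_ry(ry)
    return ((rx & MASK32) >> bit6) & ((1 << len6) - 1)

deposited = fdep(0x000000FF, 0x00000210)  # 0x00FF0000, matching the deposit example
extracted = fext(deposited, 0x00000210)   # 0x000000FF, the original field
```

Extracting with the same Ry value that was used for the deposit recovers the original field, which is a convenient check when packing several fields into one word.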


Figure 2-6. Bit Field Deposit Example (diagram for R0 = FDEP R1 BY R2: R2 = 0x0000 0210 00, so len6 = 8 and bit6 = 16; R1 = 0x0000 00FF 00; result R0 = 0x00FF 0000 00)

Figure 2-7. Bit Field Deposit Instruction (diagram: Ry determines the length of the bit field to take from Rx and the starting position for the deposit in Rn. len6 = number of bits to take from Rx, starting from the LSB of the 32-bit field. bit6 = starting bit position for the deposit, referenced from the LSB of the 32-bit field.)


Figure 2-8. Bit Field Extract Example (diagram for R3 = FEXT R4 BY R5 with len6 = 8 and bit6 = 23; register values shown: 0x0000 0217 00, 0x8710 0000 00, and 0x0000 000F 00)

Shifter Status Flags

Shifter operations update three status flags in the processing element’s arithmetic status register (ASTATx and ASTATy). Table A-4 on page A-13 lists all the bits in these registers. The following bits in ASTATx or ASTATy indicate shifter status (a 1 indicates the condition) for the most recent shifter operation:

• Shifter overflow of bits to left of MSB. Bit 11 (SV)

• Shifter result zero. Bit 12 (SZ)

• Shifter input sign for exponent extract only. Bit 13 (SS)

Flag update occurs at the end of the cycle in which the status is generated and is available on the next cycle. If a program writes the arithmetic status register explicitly in the same cycle that the shifter is performing an operation, the explicit write to ASTAT supersedes any flag update caused by the shift operation.
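The bit positions listed above can be turned into a small decoder. This Python sketch uses only the three bit numbers given in this section (SV = 11, SZ = 12, SS = 13); the function name shifter_flags is hypothetical, and the value passed in stands for a copy of ASTATx or ASTATy.

```python
SV_BIT = 11  # shifter overflow of bits to left of MSB
SZ_BIT = 12  # shifter result zero
SS_BIT = 13  # shifter input sign (exponent extract only)

def shifter_flags(astat: int) -> dict:
    """Return the three shifter status flags from an ASTATx/ASTATy value."""
    return {
        "SV": bool((astat >> SV_BIT) & 1),
        "SZ": bool((astat >> SZ_BIT) & 1),
        "SS": bool((astat >> SS_BIT) & 1),
    }

print(shifter_flags(1 << SZ_BIT))
```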


Shifter Instruction Summary

Table 2-7 lists the Shifter instructions and how they relate to ASTATx,y flags. For more information on assembly language syntax, see the ADSP-21160 SHARC DSP Instruction Set Reference. In these tables, note the meaning of the following symbols:

• Rn, Rx, Ry indicate any register file location; bit fields used depend on instruction

• Fn, Fx indicate any register file location; floating-point word

• * indicates the flag may be set or cleared, depending on data

Table 2-7. Shifter Instruction Summary

Instruction                                 ASTATx,y Flags
                                            SZ   SV   SS
Rn = LSHIFT Rx BY Ry                        *    *    0
Rn = LSHIFT Rx BY <data8>                   *    *    0
Rn = Rn OR LSHIFT Rx BY Ry                  *    *    0
Rn = Rn OR LSHIFT Rx BY <data8>             *    *    0
Rn = ASHIFT Rx BY Ry                        *    *    0
Rn = ASHIFT Rx BY <data8>                   *    *    0
Rn = Rn OR ASHIFT Rx BY Ry                  *    *    0
Rn = Rn OR ASHIFT Rx BY <data8>             *    *    0
Rn = ROT Rx BY Ry                           *    0    0
Rn = ROT Rx BY <data8>                      *    0    0
Rn = BCLR Rx BY Ry                          *    *    0
Rn = BCLR Rx BY <data8>                     *    *    0
Rn = BSET Rx BY Ry                          *    *    0
Rn = BSET Rx BY <data8>                     *    *    0
Rn = BTGL Rx BY Ry                          *    *    0
Rn = BTGL Rx BY <data8>                     *    *    0
BTST Rx BY Ry                               *    *    0
BTST Rx BY <data8>                          *    *    0
Rn = FDEP Rx BY Ry                          *    *    0
Rn = FDEP Rx BY <bit6>:<len6>               *    *    0
Rn = Rn OR FDEP Rx BY Ry                    *    *    0
Rn = Rn OR FDEP Rx BY <bit6>:<len6>         *    *    0
Rn = FDEP Rx BY Ry (SE)                     *    *    0
Rn = FDEP Rx BY <bit6>:<len6> (SE)          *    *    0
Rn = Rn OR FDEP Rx BY Ry (SE)               *    *    0
Rn = Rn OR FDEP Rx BY <bit6>:<len6> (SE)    *    *    0
Rn = FEXT Rx BY Ry                          *    *    0
Rn = FEXT Rx BY <bit6>:<len6>               *    *    0
Rn = FEXT Rx BY Ry (SE)                     *    *    0
Rn = FEXT Rx BY <bit6>:<len6> (SE)          *    *    0
Rn = EXP Rx (EX)                            *    0    *
Rn = EXP Rx                                 *    0    *
Rn = LEFTZ Rx                               *    *    0
Rn = LEFTO Rx                               *    *    0
Rn = FPACK Fx                               0    *    0
Fn = FUNPACK Rx                             0    0    0


Data Register File

Each of the processor’s processing elements has a data register file: a set of data registers that transfer data between the data buses and the computation units. These registers also provide local storage for operands and results.

The two register files each consist of 16 primary registers and 16 alternate (secondary) registers. All of the data registers are 40 bits wide. Within these registers, 32-bit data is always left-justified. If an operation specifies a 32-bit data transfer to these 40-bit registers, the eight LSBs are ignored on register reads, and the eight LSBs are cleared to zeros on writes.
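The left-justified placement of 32-bit data in a 40-bit register can be illustrated with plain integer arithmetic. In this Python sketch the names write32 and read32 are hypothetical; the model only captures the two rules stated above (the eight LSBs are cleared on 32-bit writes and ignored on 32-bit reads).

```python
def write32(value32: int) -> int:
    """Place a 32-bit value in a 40-bit register: left-justified, eight LSBs cleared."""
    return (value32 & 0xFFFFFFFF) << 8

def read32(reg40: int) -> int:
    """Read the 32-bit value from a 40-bit register, ignoring the eight LSBs."""
    return (reg40 >> 8) & 0xFFFFFFFF

reg = write32(0x12345678)
print(hex(reg), hex(read32(reg)))
```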

Program memory data accesses and data memory accesses to/from the register file(s) occur on the PM data bus and DM data bus, respectively. One PM data bus access for each processing element and/or one DM data bus access for each processing element can occur in one cycle. Transfers between the register files and the DM or PM data buses can move up to 64 bits of valid data on each bus.

If an operation specifies the same register file location as both an input and output, the read occurs in the first half of the cycle and the write in the second half. With this arrangement, the processor uses the old data as the operand, before updating the location with the new result data. If writes to the same location take place in the same cycle, only the write with higher precedence actually occurs. The processor determines precedence for the write operation from the source of the data; from highest to lowest, the precedence is:

1. Data memory or universal register

2. Program memory

3. PEx ALU

4. PEy ALU


5. PEx Multiplier

6. PEy Multiplier

7. PEx Shifter

8. PEy Shifter
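The precedence list above can be read as a simple lookup. The following Python sketch is a toy model, not a description of the hardware logic: resolve_write is a hypothetical name, and the source labels are taken verbatim from the list.

```python
# Highest precedence first, as listed in this section.
WRITE_PRECEDENCE = [
    "Data memory or universal register",
    "Program memory",
    "PEx ALU",
    "PEy ALU",
    "PEx Multiplier",
    "PEy Multiplier",
    "PEx Shifter",
    "PEy Shifter",
]

def resolve_write(writes: dict) -> tuple:
    """Given {source: value} writes to one location in one cycle, return the winner."""
    for source in WRITE_PRECEDENCE:
        if source in writes:
            return source, writes[source]
    raise ValueError("no recognized write source")

print(resolve_write({"PEx Shifter": 1, "PEx ALU": 2}))
```

In this example the ALU result is kept and the shifter result is discarded, because the ALU appears earlier in the list.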

The data register file in Figure 2-1 on page 2-3 lists register names of R0 through R15 within PEx’s register file. When a program refers to these registers as R0 through R15, the computational units treat the registers’ contents as fixed-point data. To perform floating-point computations, refer to these registers as F0 through F15. For example, the following instructions refer to the same registers, but direct the computational units to perform different operations:

F0=F1 * F2; /*floating-point multiply*/

R0=R1 * R2; /*fixed-point multiply*/

The F and R prefixes on register names do not affect the 32-bit or 40-bit data transfer; the naming convention only determines how the ALU, multiplier, and shifter treat the data.
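To see why the prefix matters only to the computation units, consider one 32-bit pattern read two ways. This Python sketch assumes the IEEE 754 single-precision layout for the floating-point view and a signed two's-complement integer for the fixed-point view (one of several possible fixed-point interpretations). The helper names are hypothetical.

```python
import struct

def as_fixed(bits32: int) -> int:
    """Interpret a 32-bit pattern as a signed two's-complement integer (R view)."""
    bits32 &= 0xFFFFFFFF
    return bits32 - (1 << 32) if bits32 & 0x80000000 else bits32

def as_float(bits32: int) -> float:
    """Interpret the same 32-bit pattern as an IEEE 754 single (F view, assumed)."""
    return struct.unpack(">f", struct.pack(">I", bits32 & 0xFFFFFFFF))[0]

pattern = 0x40000000
print(as_fixed(pattern), as_float(pattern))
```

The stored bits are identical in both calls; only the interpretation differs.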

To maintain compatibility with code written for previous SHARC DSPs, the assembly syntax accommodates references to PEx data registers and PEy data registers. Code may only refer to the PEy data registers (S0 through S15) for data move instructions. The rules for using register names are as follows.

• R0 through R15 and F0 through F15 always refer to PEx registers for data move and computational instructions, whether the processor is in SISD or SIMD mode

• R0 through R15 and F0 through F15 refer to both PEx and PEy registers for computational instructions in SIMD mode

• S0 through S15 always refer to PEy registers for data move instructions, whether the processor is in SISD or SIMD mode

For more information on SISD and SIMD computational operations, see “Alternate (Secondary) Data Registers” on page 2-32. For more information on ADSP-21161 processor assembly language, see the ADSP-21160 SHARC DSP Instruction Set Reference.

Alternate (Secondary) Data Registers

Each register file has an alternate register set. To facilitate fast context switching, the processor includes alternate register sets for data, results, and data address generator registers. Bits in the MODE1 register control when alternate registers become accessible. While inaccessible, the contents of alternate registers are not affected by processor operations. Note that there is a one cycle latency between writing to MODE1 and being able to access an alternate register set. The alternate register sets for data and results are described in this section. For more information on alternate data address generator registers, see the DAG “Alternate (Secondary) Data Registers” on page 2-32.

Bits in the MODE1 register can activate independent alternate data register sets: the lower half (R0-R7 and S0-S7) and the upper half (R8-R15 and S8-S15). To share data between contexts, a program places the data to be shared in one half of either the current processing element’s register file or the opposite processing element’s register file and activates the alternate register set of the other half. For information on how to activate alternate data registers, see the description on page 2-33.

Each multiplier has a primary or foreground (MRF) register and alternate or background (MRB) results register. A bit in the MODE1 register selects which result register receives the result from the multiplier operation, swapping which register is the current MRF or MRB. This swapping facilitates context switching. Unlike other registers that have alternates, both MRF and MRB are accessible at the same time. All fixed-point multiplies can accumulate results in either MRF or MRB, without regard to the state of the MODE1 register. With this arrangement, code can use the result registers as primary and alternate accumulators, or code can use these registers as two parallel accumulators. This feature facilitates complex math.
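One way to picture the "two parallel accumulators" usage is a complex multiply-accumulate, with the real part kept in one accumulator and the imaginary part in the other. The Python sketch below is only an arithmetic illustration: the variables mrf and mrb stand in for the two result registers, the function name complex_mac is hypothetical, and no register widths, rounding, or saturation are modeled.

```python
def complex_mac(xs, ys):
    """Accumulate sum(x * y) for complex pairs given as (real, imag) tuples."""
    mrf = 0  # stands in for MRF: accumulates the real part
    mrb = 0  # stands in for MRB: accumulates the imaginary part
    for (a, b), (c, d) in zip(xs, ys):
        mrf += a * c - b * d   # real part of (a + bi)(c + di)
        mrb += a * d + b * c   # imaginary part of (a + bi)(c + di)
    return mrf, mrb

print(complex_mac([(1, 2)], [(3, 4)]))
```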

The MODE1 register controls the access to alternate registers. Table A-2 on page A-3 lists all the bits in MODE1. The following bits in MODE1 control alternate registers (a 1 enables the alternate set).

• Secondary registers for computation unit results. Bit 2 (SRCU)

• Secondary registers for hi register file, R8-R15 and S8-S15. Bit 7 (SRRFH)

• Secondary registers for lo register file, R0-R7 and S0-S7. Bit 10 (SRRFL)

The following example demonstrates how code should handle the one cycle of latency from the instruction setting the bit in MODE1 to when the alternate registers may be accessed. Note that it is possible to use any instruction that does not access the switching register file instead of an NOP instruction.

BIT SET MODE1 SRRFL; /* activate alternate reg. file */

NOP; /* wait for access to alternates */

R0=7;
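The MODE1 bit positions and the one-cycle latency described in this section can be captured in a toy model. In this Python sketch the class Mode1Model and its methods are hypothetical; the masks use only the bit numbers given above, and a call to step() represents one instruction cycle passing (the NOP in the assembly example).

```python
SRCU = 1 << 2    # secondary registers for computation unit results
SRRFH = 1 << 7   # secondary registers for hi register file (R8-R15, S8-S15)
SRRFL = 1 << 10  # secondary registers for lo register file (R0-R7, S0-S7)

class Mode1Model:
    """Toy model: a MODE1 write becomes effective one cycle later."""
    def __init__(self):
        self.written = 0     # value most recently written to MODE1
        self.effective = 0   # value the register file currently obeys

    def write(self, value: int) -> None:
        self.written = value

    def step(self) -> None:
        """Advance one cycle; the pending MODE1 value takes effect."""
        self.effective = self.written

    def alt_low_active(self) -> bool:
        return bool(self.effective & SRRFL)

m = Mode1Model()
m.write(SRRFL)               # BIT SET MODE1 SRRFL;
before = m.alt_low_active()  # alternate set not yet accessible
m.step()                     # NOP;
after = m.alt_low_active()   # alternate set now accessible
print(before, after)
```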


Multifunction Computations

Using the many parallel data paths within its computational units, the processor supports multiple-parallel (multifunction) computations. These instructions complete in a single cycle, and they combine parallel operation of the multiplier and the ALU or dual ALU functions. The multiple operations perform the same as if they were in corresponding single-function computations. Multifunction computations also handle flags in the same way as the single-function computations, except that in the dual add/subtract computation the ALU flags from the two operations are ORed together.
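The ORing of flags in the dual add/subtract computation can be sketched as follows. This Python model is an assumption-laden illustration: it uses 32-bit wraparound arithmetic (no saturation), and the flag names "zero" and "negative" are generic placeholders rather than the processor's ALU flag names.

```python
MASK32 = 0xFFFFFFFF

def dual_add_sub(rx: int, ry: int):
    """Model Ra = Rx + Ry, Rs = Rx - Ry with the two flag sets ORed together."""
    ra = (rx + ry) & MASK32
    rs = (rx - ry) & MASK32
    flags = {
        "zero": (ra == 0) or (rs == 0),                  # set if either result is zero
        "negative": bool(ra >> 31) or bool(rs >> 31),    # set if either result has MSB set
    }
    return ra, rs, flags

print(dual_add_sub(5, 5))
```

In the example the sum is nonzero but the difference is zero, so the combined zero flag is set.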

To work with the available data paths, the computation units constrain which data registers may hold the four input operands for multifunction computations. These constraints limit which registers may hold the X-input and Y-input for the ALU and multiplier.

Figure 2-9 shows a computational unit and indicates which registers may serve as X-inputs and Y-inputs for the ALU and multiplier. For example, the X-input to the ALU can only be R8, R9, R10 or R11. Note that the shifter is gray in Figure 2-9 to indicate that there are no shifter multifunction operations.


Figure 2-9. Input Registers for Multifunction Computations (ALU and Multiplier) (the figure shows the 16 × 40-bit register file, R0 through R15, feeding the X and Y inputs of the multiplier and ALU, along with MRF0, MRF1, MRF2, ASTATx, STKYx, MODE1, the PM and DM data buses, and the path to the program sequencer; the shifter is faded, indicating that it is not available for multifunction instructions)

Table 2-8 through Table 2-11 list the multifunction computations. For more information on assembly language syntax, see the ADSP-21160 SHARC DSP Instruction Set Reference. In these tables, note the meaning of the following symbols.

• Rm, Ra, Rs, Rx, Ry indicate any register file location; fixed-point

• Fm, Fa, Fs, Fx, Fy indicate any register file location; floating-point

• R3-0 indicates data file registers R3, R2, R1, or R0, and F3-0 indicates data file registers F3, F2, F1, or F0

• R7-4 indicates data file registers R7, R6, R5, or R4, and F7-4 indicates data file registers F7, F6, F5, or F4

• R11-8 indicates data file registers R11, R10, R9, or R8, and F11-8 indicates data file registers F11, F10, F9, or F8

• R15-12 indicates data file registers R15, R14, R13, or R12, and F15-12 indicates data file registers F15, F14, F13, or F12

• SSFR indicates the X-input is signed, Y-input is signed, use Fractional inputs, and Rounded-to-nearest output

• SSF indicates the X-input is signed, Y-input is signed, use Fractional input

Table 2-8. Dual Add And Subtract

Ra = Rx + Ry, Rs = Rx – Ry

Fa = Fx + Fy, Fs = Fx – Fy

Table 2-9. Fixed-Point Multiply and Add, Subtract, Or Average

(Any combination of left and right column)

Rm=R3-0 * R7-4 (SSFR),           Ra=R11-8 + R15-12
MRF=MRF + R3-0 * R7-4 (SSF),     Ra=R11-8 – R15-12
Rm=MRF + R3-0 * R7-4 (SSFR),     Ra=(R11-8 + R15-12)/2
MRF=MRF – R3-0 * R7-4 (SSF),
Rm=MRF – R3-0 * R7-4 (SSFR),

Table 2-10. Floating-Point Multiply And ALU Operation

Fm=F3-0 * F7-4, Fa=F11-8 + F15-12

Fm=F3-0 * F7-4, Fa=F11-8 – F15-12

Fm=F3-0 * F7-4, Fa=FLOAT R11-8 by R15-12

Fm=F3-0 * F7-4, Ra=FIX F11-8 by R15-12


Fm=F3-0 * F7-4, Fa=(F11-8 + F15-12)/2

Fm=F3-0 * F7-4, Fa=ABS F11-8

Fm=F3-0 * F7-4, Fa=MAX (F11-8, F15-12)

Fm=F3-0 * F7-4, Fa=MIN (F11-8, F15-12)

Table 2-11. Multiply With Dual Add and Subtract

Rm = R3-0 * R7-4 (SSFR), Ra = R11-8 + R15-12, Rs = R11-8 – R15-12

Fm = F3-0 * F7-4, Fa = F11-8 + F15-12, Fs = F11-8 – F15-12
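The operand constraints in the tables above (multiplier inputs from R3-0 and R7-4, ALU inputs from R11-8 and R15-12) lend themselves to a simple check. The following Python sketch uses a hypothetical function, check_multifunction, that takes register numbers and reports which operand positions violate the constraints. It illustrates the rule only and is not an assembler feature.

```python
# Allowed register numbers for each operand position, per the tables above.
ALLOWED = {
    "mul_x": range(0, 4),    # R3-0 / F3-0
    "mul_y": range(4, 8),    # R7-4 / F7-4
    "alu_x": range(8, 12),   # R11-8 / F11-8
    "alu_y": range(12, 16),  # R15-12 / F15-12
}

def check_multifunction(mul_x: int, mul_y: int, alu_x: int, alu_y: int) -> list:
    """Return the operand positions whose register number is out of range."""
    operands = {"mul_x": mul_x, "mul_y": mul_y, "alu_x": alu_x, "alu_y": alu_y}
    return [name for name, reg in operands.items() if reg not in ALLOWED[name]]

print(check_multifunction(1, 5, 9, 13))   # all operands allowed
print(check_multifunction(4, 5, 9, 13))   # R4 is not a valid multiplier X-input
```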

Another type of multifunction operation is also available on the processor, combining transfers between the results and data registers and transfers between memory and data registers. Like other multifunction instructions, these parallel operations complete in a single cycle. For example, the processor can perform the following multiply and parallel read of data memory:

MRF=MRF-R5*R0, R6=DM(I1,M2);

Or, the processor can perform the following result register transfer and parallel read:

R5=MR1F, R6=DM(I1,M2);

Secondary Processing Element (PEy)

The ADSP-21161 processor contains two sets of computation units and associated register files. As shown in Figure 2-10, these two Processing Elements (PEx and PEy) support Single Instruction, Multiple Data (SIMD) operation.


Figure 2-10. Block Diagram Showing Secondary Execution Complex (the program sequencer sends the same instruction to both elements; PEx and PEy each contain a 16 x 40-bit data register file, multiplier, barrel shifter, and ALU, and connect to the PM data bus, DM data bus, and bus connect (PX); different data goes to each element)

The MODE1 register controls the operating mode of the processing elements. Table A-2 on page A-3 lists all the bits in MODE1. The PEYEN bit (bit 21) in the MODE1 register enables or disables the PEy processing element. When PEYEN is cleared (0), the ADSP-21161 processor operates in Single-Instruction-Single-Data (SISD) mode, using only PEx; this is the mode in which ADSP-2106x family DSPs operate. When the PEYEN bit is set (1), the ADSP-21161 processor operates in SIMD mode, using the PEx and PEy processing elements. There is a one cycle delay after PEYEN is set or cleared, before the change to or from SIMD mode takes effect.


To support SIMD, the processor performs the following parallel operations.

• Dispatches a single instruction to both processing elements’ computation units

• Loads two sets of data from memory, one for each processing element

• Executes the same instruction simultaneously in both processing elements

• Stores data results from the dual executions to memory

Using the information here and in the ADSP-21160 SHARC DSP Instruction Set Reference, it is possible through SIMD mode’s parallelism to double performance over similar algorithms running in SISD (ADSP-2106x processor compatible) mode.

The two processing elements are symmetrical, and each contains the following functional blocks.

• ALU

• Multiplier primary and alternate result registers

• Shifter

• Data register file and alternate register file
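The SIMD dispatch described in this section (one instruction, two data sets) can be sketched with two register arrays. In this Python model, execute is a hypothetical function, each register file is a 16-entry list indexed by register number, and PEYEN is the bit 21 mask given earlier. The one-cycle mode-change delay is not modeled.

```python
PEYEN = 1 << 21  # MODE1 bit that enables the PEy processing element

def execute(op, dst, src_a, src_b, pex, pey, mode1):
    """Apply one two-operand computation; duplicate it on PEy when PEYEN is set."""
    pex[dst] = op(pex[src_a], pex[src_b])          # PEx always executes
    if mode1 & PEYEN:
        pey[dst] = op(pey[src_a], pey[src_b])      # same instruction, PEy's own data

pex = [0, 2, 3] + [0] * 13     # PEx registers: index 1 = 2, index 2 = 3
pey = [0, 10, 20] + [0] * 13   # PEy registers: index 1 = 10, index 2 = 20
execute(lambda x, y: x * y, 0, 1, 2, pex, pey, PEYEN)
print(pex[0], pey[0])
```

With PEYEN set, both elements compute a product from their own operands; with PEYEN cleared, only the PEx list changes.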

Dual Compute Units Sets

The computation units (ALU, Multiplier, and Shifter) in PEx and PEy are identical. The data bus connections for the dual computation units permit asymmetric data moves to, from, and between the two processing elements. Identical instructions execute on the PEx and PEy computational units; the difference is the data. The data registers for PEy operations are identified (implicitly) from the PEx registers in the instruction. This implicit relation between PEx and PEy data registers corresponds to complementary register pairs in Table 2-12. Any universal registers that don’t appear in Table 2-12 have the same identities in both PEx and PEy. When a computation in SIMD mode refers to a register in the PEx column, the corresponding computation in PEy refers to the complementary register in the PEy column.

Table 2-12. SIMD Mode Complementary Register Pairs

PEx PEy

R0 S0

R1 S1

R2 S2

R3 S3

R4 S4

R5 S5

R6 S6

R7 S7

R8 S8

R9 S9

R10 S10

R11 S11

R12 S12

R13 S13

R14 S14

R15 S15

ASTATx ASTATy

STKYx STKYy


Table 2-13 lists the multiplier result SIMD mode complementary register pairs. These multiplier result registers are not universal registers (UREGs) and cannot be accessed directly. These registers can be written and read with the following multiplier operations:

MRxF/B = Rn; Rn = MRxF/B;

Table 2-13. Multiplier Result SIMD Mode Complementary Register Pairs

PEx PEy

MRF0 MSF0

MRF1 MSF1

MRF2 MSF2

MRB0 MSB0

MRB1 MSB1

MRB2 MSB2

Table 2-14. Other Complementary Register Pairs

USTAT1 USTAT2

USTAT3 USTAT4

PX1 PX2
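The complementary pairs in the tables above amount to a name lookup. The Python sketch below builds that lookup from the table entries; the function name complement is hypothetical, and the R15/S15 pair is included on the assumption that the data registers pair up across the full R0-R15 and S0-S15 ranges. Registers without a listed complement map to themselves, reflecting the statement that such registers have the same identity in both elements.

```python
def build_pairs() -> dict:
    pairs = {f"R{i}": f"S{i}" for i in range(16)}        # data registers (R15/S15 assumed)
    pairs.update({"ASTATx": "ASTATy", "STKYx": "STKYy"})  # status registers
    for bank in ("F", "B"):                               # multiplier result registers
        pairs.update({f"MR{bank}{i}": f"MS{bank}{i}" for i in range(3)})
    pairs.update({"USTAT1": "USTAT2", "USTAT3": "USTAT4", "PX1": "PX2"})
    return pairs

PAIRS = build_pairs()

def complement(name: str) -> str:
    """Return the PEy-side register implied by a PEx-side register name."""
    return PAIRS.get(name, name)   # no listed complement: same identity in both PEs

print(complement("R3"), complement("MRB1"), complement("I0"))
```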


Dual Register Files

The two 16-entry data register files (one in each PE) and their operand and result busing and porting are identical. The same is true for each 16-entry alternate register file. The transfer direction, source and destination registers, and data bus usage depend on the following conditions:

• Computational mode:

– Is PEy enabled (PEYEN bit=1 in MODE1 register)

– Is the data register file in PEx (R0-R15, F0-F15) or PEy (S0-S15)

– Is the instruction a data register swap between the processing elements

• Data addressing mode:

– What is the state of the Internal Memory Data Width (IMDW) bits in the System Configuration (SYSCON) register

– Is Broadcast write enabled (BDCST1,9 bits in MODE1 register)

– What is the type of address: long, normal, or short word

– Is Long Word override (LW) specified in the instruction

– What are the states of instruction fields for DAG1 or DAG2

• Program sequencing (conditional logic):

– What is the outcome of the instruction’s condition comparison on each processing element

For information on SIMD issues that relate to computational modes, see “SIMD (Computational) Operations” on page 2-43. For information on SIMD issues relating to data addressing, see “Addressing in SISD and SIMD Modes” on page 4-18. For information on SIMD issues relating to program sequencing, see “SIMD Mode and Sequencing” on page 3-57.


Dual Alternate Registers

Both register files consist of a primary set of 16 by 40-bit registers and an alternate set of 16 by 40-bit registers. Context switching between the two sets of registers occurs in parallel between the two processing elements. For more information, see “Alternate (Secondary) Data Registers” on page 2-32.

SIMD (Computational) Operations

In SIMD mode, the dual processing elements execute the same instruction, but operate on different data. To support SIMD operation, the elements support a variety of dual data move features.

The processor supports unidirectional and bidirectional register-to-register transfers with the conditional compute and move instruction. All four combinations of inter-register file and intra-register file transfers (PEx ↔ PEx, PEx ↔ PEy, PEy ↔ PEx, and PEy ↔ PEy) are possible in both SISD (unidirectional) and SIMD (bidirectional) modes.

In SISD mode (PEYEN bit=0), the register-to-register transfers are unidirectional, meaning that an operation performed on one processing element is not duplicated on the other processing element. The SISD transfer uses a source register and a destination register, and either register can be in either element’s data register file. For a summary of unidirectional transfers, see the upper half of Table 2-15. Note that in SISD mode a condition for an instruction only tests in the PEx element and applies to the entire instruction.

In SIMD mode (PEYEN bit=1), the register-to-register transfers are bidirectional, meaning that an operation performed on one element is duplicated in parallel on the other element. The instruction uses two source registers (one from each element’s register file) and two destination registers (one from each element’s register file). For a summary of bidirectional transfers, see the lower half of Table 2-15. Note that in SIMD mode a condition for an instruction tests in both the PEx and PEy elements, dividing control of the explicit and implicit transfers as detailed in Table 2-15.

Bidirectional register-to-register transfers in SIMD mode are allowed between a data register and DAG, control, or status registers. When the DAG, control, or status register is a source of the transfer, the destination can be a data register. This SIMD transfer duplicates the contents of the source register in a data register in both processing elements.

Careful programming is required when a DAG, control, or status register is a destination of a transfer from a data register. If the destination register has a complement (for example ASTATx and ASTATy), the SIMD transfer moves the contents of the explicit data register into the explicit destination and moves the contents of the implicit data register into the implicit destination (the complement). If the destination register has no complement (for example, I0), only the explicit transfer occurs.

Even if the code uses a conditional operation to select whether the transfer occurs, only the explicit transfer can take place if the destination register has no complement.

In the case where a DAG, control, or status register is both source and destination, the data move operation executes the same as if SIMD mode were disabled.

In both SISD and SIMD modes, the processor supports bidirectional register-to-register swaps. The swap always occurs between one register in each processing element’s data register file.

Register swaps use the special swap operator, <->. A register-to-register swap occurs when registers in different processing elements exchange values; for example R0 <-> S1. Only single, 40-bit register-to-register swaps are supported; double register operations are not.
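The rule for SIMD transfers from a data register into a DAG, control, or status register (explicit transfer always, implicit transfer only when the destination has a complement) can be sketched as below. The function simd_ureg_write and the complements argument are hypothetical; the sketch returns the set of register updates rather than modeling any real register state.

```python
def simd_ureg_write(dest, explicit_value, implicit_value, complements):
    """Return {register: value} updates for a SIMD move into a non-data register."""
    updates = {dest: explicit_value}                     # explicit transfer always occurs
    if dest in complements:
        updates[complements[dest]] = implicit_value      # implicit transfer needs a complement
    return updates

complements = {"ASTATx": "ASTATy", "STKYx": "STKYy"}
print(simd_ureg_write("ASTATx", 1, 2, complements))   # both halves transfer
print(simd_ureg_write("I0", 1, 2, complements))       # only the explicit transfer
```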


When they are unconditional, register-to-register swaps operate the same in SISD mode and SIMD mode. If a condition is added to the instruction in SISD mode, the condition tests only in the PEx element and controls the entire operation. If a condition is added in SIMD mode, the condition tests in both the PEx and PEy elements and controls the halves of the operation as detailed in Table 2-15.

Table 2-15. Register-To-Register Move Summary (SISD Versus SIMD)

Mode  Instruction                        Explicit Transfer               Implicit Transfer
SISD  IF condition compute, Rx = Ry;     Rx loaded from Ry               None
      IF condition compute, Rx = Sy;     Rx loaded from Sy               None
      IF condition compute, Sx = Ry;     Sx loaded from Ry               None
      IF condition compute, Sx = Sy;     Sx loaded from Sy               None
      IF condition compute, Rx <-> Sy;   Rx swaps to Sy, Sy swaps to Rx  None
SIMD  IF condition compute, Rx = Ry;     Rx loaded from Ry               Sx loaded from Sy
      IF condition compute, Rx = Sy;     Rx loaded from Sy               Sx loaded from Ry
      IF condition compute, Sx = Ry;     Sx loaded from Ry               Rx loaded from Sy
      IF condition compute, Sx = Sy;     Sx loaded from Sy               Rx loaded from Ry
      IF condition compute, Rx <-> Sy;   Rx swaps to Sy                  Sy swaps to Rx

1 In SISD mode, the condition applies to the entire operation and is tested only against PEx’s flags. When the condition tests true, the entire operation occurs.

2 In SIMD mode, the condition applies separately to the explicit and implicit transfers. Where the condition tests true (PEx for the explicit and PEy for the implicit), the operation occurs in that processing element.

3 Register-to-register transfers (R0=S0) and register swaps (R0<->S0) do not cause a PMD bus conflict. These operations use only the DMD bus and a hidden 16-bit bus to do the two register moves.
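The move rows of Table 2-15 can be modeled compactly. In this Python sketch, reg_move is a hypothetical function operating on a dictionary of register names; cond_x and cond_y stand for the outcome of the instruction's condition in PEx and PEy. Swaps, the compute portion of the instruction, and bus usage are not modeled.

```python
def complement(name: str) -> str:
    """Map Rn to Sn and Sn to Rn (data registers only)."""
    return ("S" if name[0] == "R" else "R") + name[1:]

def reg_move(regs: dict, dst: str, src: str, simd: bool,
             cond_x: bool = True, cond_y: bool = True) -> dict:
    """Return the register state after 'IF condition dst = src'."""
    out = dict(regs)                      # all reads use the old values
    if cond_x:
        out[dst] = regs[src]              # explicit transfer, gated by PEx's condition
    if simd and cond_y:
        out[complement(dst)] = regs[complement(src)]   # implicit transfer, gated by PEy
    return out

regs = {"R0": 1, "R1": 2, "S0": 3, "S1": 4}
print(reg_move(regs, "R0", "S1", simd=False))
print(reg_move(regs, "R0", "S1", simd=True))
```

In SISD mode only R0 is loaded (from S1). In SIMD mode the same instruction also loads S0 from R1, matching the "Rx = Sy" row of the table.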

SIMD conditional instructions with the same destination registers do not produce predictable transfers. For example, the instruction IF EQ R4 = R14 – R15, S4 = R6; may not work as expected, because in SIMD mode both the computation and the move write R4 and S4. This kind of usage is prohibited.


SIMD And Status Flags

When the processor is in SIMD mode (PEYEN bit=1), computations on both processing elements generate status flags, producing a logical ORing of the exception status test on each processing element. If one of the four fixed-point or floating-point exceptions is enabled, an exception condition on either or both processing elements generates an exception interrupt. Interrupt service routines must determine which of the processing elements encountered the exception. Note that returning from a floating-point interrupt does not automatically clear the STKY state. Code must clear the STKY bits in both processing elements’ sticky status (STKYx and STKYy) registers as part of the exception service routine. For more information, see “Interrupts and Sequencing” on page 3-34.
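The two duties of the exception service routine described above (find which element raised the exception, then clear the sticky bits in both registers) can be sketched as follows. The function names are hypothetical, and mask is a placeholder for whichever STKY bit corresponds to the enabled exception; the actual bit positions are given in the register descriptions and are not assumed here.

```python
def exception_sources(stkyx: int, stkyy: int, mask: int) -> list:
    """Report which processing elements have the sticky bit(s) in mask set."""
    sources = []
    if stkyx & mask:
        sources.append("PEx")
    if stkyy & mask:
        sources.append("PEy")
    return sources

def clear_sticky(stkyx: int, stkyy: int, mask: int) -> tuple:
    """Clear the masked sticky bits in both registers, as the routine must do."""
    return stkyx & ~mask, stkyy & ~mask

MASK = 0b10   # hypothetical bit position, for illustration only
print(exception_sources(0b10, 0b00, MASK))
print(clear_sticky(0b11, 0b10, MASK))
```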
