Freescale Semiconductor SC140 DSP Core, StarCore SC140 Reference Manual

SC140 DSP Core
Reference Manual
Revision 4.1, September 2005
This document contains information on a new product. Specifications and information herein are subject to change without notice.
(c) Freescale Semiconductor, Inc. 2005. All rights reserved.
LICENSOR is defined as Freescale Semiconductor, Inc. LICENSOR reserves the right to make changes without further notice to any products included and covered hereby. LICENSOR makes no warranty, representation or guarantee regarding the suitability of its products for any particular purpose, nor does LICENSOR assume any liability arising out of the application or use of any product or circuit, and specifically disclaims any and all liability, including without limitation incidental, consequential, reliance, exemplary, or any other similar such damages, by way of illustration but not limitation, such as, loss of profits and loss of business opportunity. "Typical" parameters which may be provided in LICENSOR data sheets and/or specifications can and do vary in different applications and actual performance may vary over time. All operating parameters, including "Typicals" must be validated for each customer application by customer’s technical experts. LICENSOR does not convey any license under its patent rights nor the rights of others. LICENSOR products are not designed, intended, or authorized for use as components in systems intended for surgical implant into the body, or other applications intended to support life, or for any other application in which the failure of the LICENSOR product could create a situation where personal injury or death may occur. Should Buyer purchase or use LICENSOR products for any such unintended or unauthorized application, Buyer shall indemnify and hold LICENSOR and its officers, employees, subsidiaries, affiliates, and distributors harmless against all claims, cost, damages, and expenses, and reasonable attorney fees arising out of, directly or indirectly, any claim of personal injury or death associated with such unintended or unauthorized use, even if such claim alleges that LICENSOR was negligent regarding the design or manufacture of the part.
Freescale and the Freescale logo are registered trademarks of Freescale Semiconductor, Inc. Freescale, Inc. is an Equal Opportunity/Affirmative Action Employer.
All other tradenames, trademarks, and registered trademarks are the property of their respective owners.
Table of Contents
About This Book
Audience. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .xxi
Organization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxi
Abbreviations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxii
Revision History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . xxiv
Chapter 1
Introduction
1.1 Target Markets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .1-1
1.2 Architectural Differentiation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .1-2
1.3 Core Architecture Features. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .1-3
1.3.1 Typical System-On-Chip Configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .1-4
1.3.2 Variable Length Execution Set (VLES) Software Model . . . . . . . . . . . . . . . .1-5
Chapter 2
Core Architecture
2.1 Architecture Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-1
2.1.1 Data Arithmetic Logic Unit (DALU) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-2
2.1.1.1 Data Register File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-3
2.1.1.2 Multiply-Accumulate (MAC) Unit. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-3
2.1.1.3 Bit-Field Unit (BFU) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-3
2.1.1.4 Shifter/Limiters. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-3
2.1.2 Address Generation Unit (AGU) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-3
2.1.2.1 Stack Pointer Registers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-4
2.1.2.2 Bit Mask Unit (BMU). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-4
2.1.3 Program Sequencer Unit (PSEQ) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-5
2.1.4 Enhanced On-Chip Emulator (EOnCE) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-5
2.1.5 Instruction Set Accelerator Plug-in (ISAP) Interface. . . . . . . . . . . . . . . . . . . .2-5
2.1.6 Memory Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-5
2.2 DALU . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-6
2.2.1 DALU Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-6
2.2.1.1 Data Registers (D0–D15) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-8
2.2.1.2 Multiply-Accumulate (MAC) Unit. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-10
2.2.1.3 Bit-Field Unit (BFU) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-12
2.2.1.4 Data Shifter/Limiter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-13
2.2.1.5 Scaling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-14
2.2.1.6 Limiting . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-14
2.2.1.7 Scaling and Arithmetic Saturation Mode Interactions . . . . . . . . . . . . . . .2-16
2.2.2 DALU Arithmetic and Rounding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-17
2.2.2.1 Data Representation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-17
2.2.2.2 Data Formats. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-18
2.2.2.3 Multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-20
2.2.2.4 Division. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-20
2.2.2.5 Unsigned Arithmetic. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-20
2.2.2.6 Rounding Modes. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-21
2.2.2.7 Arithmetic Saturation Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-25
2.2.2.8 Multi-Precision Arithmetic Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-26
2.2.2.9 Viterbi Decoding Support. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-30
2.3 Address Generation Unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-31
2.3.1 AGU Architecture. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-31
2.3.2 AGU Programming Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-34
2.3.2.1 Address Registers (R0–R15) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-35
2.3.2.2 Stack Pointer Registers (NSP, ESP) . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-35
2.3.2.3 Offset Registers (N0–N3). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-36
2.3.2.4 Base Address Registers (B0–B7) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-36
2.3.2.5 Modifier Registers (M0–M3) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-36
2.3.2.6 Modifier Control Register (MCTL) . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-37
2.3.3 Addressing Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-38
2.3.3.1 Register Direct Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-38
2.3.3.2 Address Register Indirect Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-38
2.3.3.3 PC Relative Mode. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-40
2.3.3.4 Special Addressing Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-41
2.3.3.5 Memory Access Width . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-42
2.3.3.6 Memory Access Misalignment. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-42
2.3.3.7 Addressing Modes Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-43
2.3.4 Address Modifier Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-45
2.3.4.1 Linear Addressing Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-45
2.3.4.2 Reverse-carry Addressing Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-45
2.3.4.3 Modulo Addressing Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-45
2.3.4.4 Multiple Wrap-Around Modulo Addressing Mode . . . . . . . . . . . . . . . . .2-47
2.3.5 Arithmetic Instructions on Address Registers . . . . . . . . . . . . . . . . . . . . . . . .2-48
2.3.6 Bit Mask Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-49
2.3.6.1 Bit Mask Test and Set (Semaphore Support) Instruction . . . . . . . . . . . . .2-50
2.3.6.2 Semaphore Hardware Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . .2-51
2.3.7 Move Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-51
2.4 Memory Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-55
2.4.1 SC140 Endian Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-56
2.4.1.1 SC140 Bus Structure. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-56
2.4.1.2 Memory Organization. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-57
2.4.1.3 Data Moves. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-58
2.4.1.4 Multi-Register Moves. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-60
2.4.1.5 Instruction Word Transfers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .2-62
2.4.1.6 Memory Access Behavior in Big/Little Endian Modes . . . . . . . . . . . . . .2-64
Chapter 3
Control Registers
3.1 Core Control Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-1
3.1.1 Status Register (SR) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-1
3.1.2 Exception and Mode Register (EMR) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-7
3.1.2.1 Clearing EMR Bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-10
3.2 PLL and Clock Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .3-10
Chapter 4
Emulation and Debug (EOnCE)
4.1 Debugging System . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-1
4.2 Overview of the Combined JTAG and EOnCE Interface. . . . . . . . . . . . . . . . . . . .4-2
4.2.1 Cascading Multiple SC140 EOnCE Modules in a SoC . . . . . . . . . . . . . . . . . .4-2
4.2.2 JTAG Scan Paths . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-3
4.2.3 Activating the EOnCE Through the JTAG Port. . . . . . . . . . . . . . . . . . . . . . . .4-6
4.2.4 Enabling the EOnCE Module . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-6
4.2.5 DEBUG_REQUEST and ENABLE_EONCE Commands. . . . . . . . . . . . . . . .4-7
4.2.6 Reading/Writing EOnCE Registers Through JTAG. . . . . . . . . . . . . . . . . . . . .4-7
4.3 Main Capabilities of the EOnCE Module. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-10
4.3.1 EOnCE Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-10
4.3.2 EOnCE Dedicated Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-11
4.3.3 Debug State. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-11
4.3.4 Debug Exception. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-12
4.3.5 Executing an Instruction while in Debug State . . . . . . . . . . . . . . . . . . . . . . .4-12
4.3.6 Software Downloading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-12
4.3.7 EOnCE Events . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-14
4.3.8 EOnCE Actions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-15
4.3.9 Event and Action Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-15
4.4 EOnCE Enabling and Power Considerations . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-16
4.5 EOnCE Module Internal Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-16
4.5.1 EOnCE Controller. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-16
4.5.2 Event Counter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-18
4.5.3 Event Detection Unit (EDU). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-20
4.5.3.1 Address Event Detection Channel (EDCA) . . . . . . . . . . . . . . . . . . . . . . .4-22
4.5.3.2 Data Event Detection Channel (EDCD). . . . . . . . . . . . . . . . . . . . . . . . . .4-24
4.5.3.3 Optional External Event Detection Address Channels. . . . . . . . . . . . . . .4-25
4.5.4 Event Selector (ES). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-25
4.5.5 Trace Unit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-26
4.5.5.1 Change of Flow and Interrupt Tracing . . . . . . . . . . . . . . . . . . . . . . . . . . .4-28
4.5.5.2 Writing to the Trace Buffer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-29
4.5.5.3 Reading the Trace Buffer (TB_BUFF). . . . . . . . . . . . . . . . . . . . . . . . . . .4-29
4.5.5.4 Trace Unit Programming Model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-29
4.6 EOnCE Register Addressing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-30
4.6.1 Reading or Writing EOnCE Registers Using Core Software . . . . . . . . . . . . .4-33
4.6.2 Real-Time JTAG Access . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-33
4.6.3 Real-Time Data Transfer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-34
4.6.4 General EOnCE Register Issues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-34
4.7 EOnCE Controller Registers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-36
4.7.1 EOnCE Command Register (ECR). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-36
4.7.2 EOnCE Status Register (ESR) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-37
4.7.3 EOnCE Monitor and Control Register (EMCR). . . . . . . . . . . . . . . . . . . . . . .4-41
4.7.4 EOnCE Receive Register (ERCV) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-43
4.7.5 EOnCE Transmit Register (ETRSMT). . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-43
4.7.6 EE Signals. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-44
4.7.6.1 EE Signals as Outputs. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-44
4.7.6.2 EE Signals as Inputs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-45
4.7.6.3 EE Signals Control Register (EE_CTRL) . . . . . . . . . . . . . . . . . . . . . . . .4-45
4.7.7 Core Command Register (CORE_CMD) . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-48
4.7.8 PC of the Exception Execution Set (PC_EXCP) . . . . . . . . . . . . . . . . . . . . . .4-49
4.7.9 PC of the Next Execution Set (PC_NEXT) . . . . . . . . . . . . . . . . . . . . . . . . . .4-49
4.7.10 PC of Last Execution Set (PC_LAST) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-49
4.7.11 PC Breakpoint Detection Register (PC_DETECT) . . . . . . . . . . . . . . . . . . . .4-49
4.8 Event Counter Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-50
4.8.1 Event Counter Control Register (ECNT_CTRL) . . . . . . . . . . . . . . . . . . . . . .4-50
4.8.2 Event Counter Value Register (ECNT_VAL) . . . . . . . . . . . . . . . . . . . . . . . .4-52
4.8.3 Extension Counter Value Register (ECNT_EXT) . . . . . . . . . . . . . . . . . . . . .4-53
4.8.4 EC Signals. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-53
4.9 Event Detection Unit (EDU) Channels and Registers . . . . . . . . . . . . . . . . . . . . .4-54
4.9.1 Address Event Detection Channel (EDCA) . . . . . . . . . . . . . . . . . . . . . . . . . .4-54
4.9.1.1 EDCA Control Registers (EDCAi_CTRL). . . . . . . . . . . . . . . . . . . . . . . .4-54
4.9.1.2 EDCA Reference Value Registers A and B (EDCAi_REFA, EDCAi_REFB) . . . . . . . . . . . .4-57
4.9.1.3 EDCA Mask Register (EDCAi_MASK) . . . . . . . . . . . . . . . . . . . . . . . . .4-57
4.9.2 Data Event Detection Channel (EDCD). . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-58
4.9.2.1 EDCD Control Register (EDCD_CTRL) . . . . . . . . . . . . . . . . . . . . . . . . .4-58
4.9.2.2 EDCD Reference Value Register (EDCD_REF) . . . . . . . . . . . . . . . . . . .4-61
4.9.2.3 EDCD Mask Register (EDCD_MASK) . . . . . . . . . . . . . . . . . . . . . . . . . .4-61
4.10 Event Selector (ES) Registers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-61
4.10.1 Event Selector Control Register (ESEL_CTRL) . . . . . . . . . . . . . . . . . . . . . .4-61
4.10.2 Event Selector Mask Debug State Register (ESEL_DM) . . . . . . . . . . . . . . .4-63
4.10.3 Event Selector Mask Debug Exception Register (ESEL_DI) . . . . . . . . . . . . . . . .4-64
4.10.4 Event Selector Mask Enable Trace Register (ESEL_ETB) . . . . . . . . . . . . . .4-64
4.10.5 Event Selector Mask Disable Trace Register (ESEL_DTB) . . . . . . . . . . . . .4-65
4.11 Trace Unit Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-65
4.11.1 Trace Buffer Control Register (TB_CTRL) . . . . . . . . . . . . . . . . . . . . . . . . . .4-65
4.11.2 Trace Buffer Read Pointer Register (TB_RD) . . . . . . . . . . . . . . . . . . . . . . . .4-69
4.11.3 Trace Buffer Write Pointer Register (TB_WR) . . . . . . . . . . . . . . . . . . . . . . .4-69
4.11.4 Trace Buffer Register (TB_BUFF). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4-69
Chapter 5
Program Control
5.1 Pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-1
5.1.1 Instruction Pipeline Stages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-2
5.1.1.1 Instruction Pre-Fetch and Fetch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-4
5.1.1.2 Instruction Dispatch . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-4
5.1.1.3 Address Generation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-4
5.1.1.4 Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-5
5.2 Instruction Grouping. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-5
5.2.1 Grouping Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-6
5.2.1.1 Serial Grouping. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-7
5.2.1.2 Prefix Grouping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-7
5.2.2 Prefix Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-8
5.2.2.1 Two-Word Prefix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-8
5.2.2.2 One-Word Low Register Prefix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-9
5.2.3 Conditional Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-9
5.2.4 Prefix Selection Algorithm. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-10
5.2.5 Instruction Reordering Within an Execution Set . . . . . . . . . . . . . . . . . . . . . .5-12
5.3 Instruction Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-14
5.3.1 Sequential Instruction Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-15
5.3.1.1 DALU Instruction Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-16
5.3.1.2 Move Instruction Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-16
5.3.1.3 Bit Mask Instruction Timing. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-16
5.3.2 Change-Of-Flow Instruction Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-17
5.3.2.1 Direct, PC-Relative, and Conditional COF . . . . . . . . . . . . . . . . . . . . . . .5-18
5.3.2.2 Delayed COF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-19
5.3.2.3 COF Execution Cycles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-19
5.3.3 Memory Access Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-21
5.3.3.1 Memory Access Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-22
5.3.3.2 Implicit Push/Pop Memory Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-24
5.3.3.3 Memory Stall Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-24
5.4 Hardware Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-25
5.4.1 Loop Programming Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-25
5.4.1.1 Loop Start Address Registers (SAn). . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-25
5.4.1.2 Loop Counter Registers (LCn) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-26
5.4.1.3 Status Register (SR) Loop Flag Bits. . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-26
5.4.2 Loop Notation and Encoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-26
5.4.3 Loop Initiation and Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-27
5.4.4 Loop Nesting. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-28
5.4.5 Loop Iteration and Termination . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-28
5.4.6 Loop Control Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-29
5.4.7 Loop Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-32
5.5 Stack Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-32
5.5.1 SC140 Single Stack Memory Use . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-32
5.5.2 SC140 Dual Stack Memory Use. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-33
5.5.3 Stack Support Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-34
5.5.4 Shadow Stack Pointer Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-35
5.5.5 Fast Return from Subroutines. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-36
5.6 Working Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-37
5.6.1 Normal Working Mode. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-37
5.6.2 Exception Working Mode. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-37
5.6.3 Typical Working Mode Usage Scenarios. . . . . . . . . . . . . . . . . . . . . . . . . . . .5-38
5.6.3.1 Dual-stack RTOS. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-38
5.6.3.2 Single-stack RTOS . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-39
5.6.4 Working Mode Transitions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-39
5.6.4.1 From Exception to Normal mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-39
5.6.4.2 From Normal to Exception mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-39
5.7 Processing States. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-41
5.7.1 Processing State Change Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-41
5.7.2 Processing State Transitions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-42
5.7.3 Execution State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-43
5.7.4 Reset Processing State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-43
5.7.5 Debug State. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-44
5.7.6 Wait Processing State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-44
5.7.7 Stop Processing State . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-45
5.8 Exception Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-46
5.8.1 Interrupt Vector Address . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-48
5.8.1.1 Vector Base Address Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-48
5.8.1.2 Programming Exception Routine Addresses . . . . . . . . . . . . . . . . . . . . . .5-48
5.8.2 Return From Exception Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-49
5.8.3 Maskable Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-50
5.8.3.1 Interrupt Priority Level. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-50
5.8.3.2 Controlling All Interrupt Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-50
5.8.4 Non-Maskable Interrupts (NMI). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-50
5.8.5 Internal Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-50
5.8.5.1 Illegal Exception. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-51
5.8.5.2 DALU Overflow. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-52
5.8.5.3 TRAP Exception. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-52
5.8.5.4 Debug Exception. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-52
5.8.6 Exception Interface to the Pipeline. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-52
5.8.6.1 Exception Routine Fetch. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-52
5.8.6.2 Exception Mode Execution. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-53
5.8.7 Exception Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .5-53
Chapter 6
Instruction Set Accelerator Plug-In
6.1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-57
6.2 ISAP - SC140 Schematic Connection . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-58
6.2.1 Single ISAP. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-58
6.2.2 Multiple ISAP. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-59
6.3 ISAP instructions and instruction encoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-60
6.4 ISAP Memory Access. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-60
6.5 ISAP-core register transfers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-61
6.6 Immediate Data Transfer to ISAP registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-62
6.7 Core Assembly Syntax with an ISAP. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-63
6.7.1 Identification of ISAP instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-63
6.7.1.1 Working with One ISAP. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-63
6.7.1.2 Working with Multiple ISAPs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-64
6.7.2 An Example of the Definition Flexibility of an ISAP . . . . . . . . . . . . . . . . . .6-65
6.7.3 Conditional Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-66
6.8 Programming Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-67
6.8.1 ISAP Functions that Interact With the Core. . . . . . . . . . . . . . . . . . . . . . . . . .6-67
6.8.2 Grouping rules for explicit ISAP instructions . . . . . . . . . . . . . . . . . . . . . . . .6-68
6.8.3 Rules for implicit AGU instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-68
6.8.4 Sequencing rules for T bit update. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6-69
Chapter 7
Programming Rules
7.1 VLES Sequencing Semantics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-1
7.2 VLES Grouping Semantics. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-1
7.3 SC140 Pipeline Exposure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-3
7.4 Programming Rule Notation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-3
7.4.1 Grouping Rules. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-3
7.4.1.1 Prefix Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-3
7.4.1.2 Conditional Subgroups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-3
7.4.1.3 Assembler Reordering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-3
7.4.2 Sequencing Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-4
7.4.2.1 Cycle Counts. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-4
7.4.2.2 Conditional Execution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-4
7.4.2.3 Simulator Execution Counts. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-4
7.4.3 Register Read/Write . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-4
7.4.3.1 Register Names. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-4
7.4.3.2 B Register Aliasing. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-5
7.4.4 Status Bit Updates. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-5
7.4.5 Instruction Words . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-5
7.4.6 MOVE-like Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-5
7.4.6.1 Address/Data Operands . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-5
7.4.7 AGU Arithmetic Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-6
7.4.8 Change-Of-Flow Destinations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-6
7.4.8.1 COF Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-6
7.4.9 Delayed COF Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-6
7.4.9.1 Delay Slot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-6
7.4.10 Hardware Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-7
7.4.10.1 Enabled Loop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-7
7.4.10.2 Enveloping Loop. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-7
7.5 Static Programming Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-7
7.5.1 Hardware Loop Detection. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-7
7.5.2 General Grouping Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-8
7.5.3 Prefix Grouping Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-11
7.5.4 AGU Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-16
7.5.5 Delayed COF Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-19
7.5.6 Status Bit Rules. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-22
7.5.7 Loop Nesting Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-28
7.5.8 Loop LA Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-31
7.5.9 Loop Sequencing Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-33
7.5.10 Loop COF Rules. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-36
7.5.11 General Looping Rules. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-40
7.6 Dynamic Programming Rules. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-41
7.6.1 AGU Dynamic Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-41
7.6.2 Memory Access Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-42
7.6.3 RAS Rules. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-43
7.6.4 Loop Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-43
7.6.5 Rule Detection Across COF Boundaries . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-44
7.6.5.1 Cycle-Based COF Rules. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-44
7.6.5.2 VLES-Based COF Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-45
7.6.6 Rule Detection Across Exception Boundaries . . . . . . . . . . . . . . . . . . . . . . . .7-46
7.7 Programming Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-48
7.7.1 Rules Not Detected Across COF Boundaries. . . . . . . . . . . . . . . . . . . . . . . . .7-49
7.7.2 Good Programming Practices. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-49
7.7.2.1 Source Code Practices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-49
7.7.2.2 Binary Code Practices. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-50
7.7.2.3 Software Development Practices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-51
7.8 LPMARK Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-51
7.8.1 LPMARK Instruction Type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-51
7.8.2 Static Programming Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-52
7.8.2.1 General Grouping Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-52
7.8.2.2 Prefix Grouping Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-52
7.8.3 Dynamic Programming Rules. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-52
7.8.3.1 LPMARK Notation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-52
7.8.3.2 Loop Nesting Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-53
7.8.3.3 Loop LA Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-53
7.8.3.4 Loop Sequencing Rules . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-55
7.8.3.5 Loop COF Rules. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-56
7.8.3.6 General Looping Rules. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-59
7.8.3.7 Rule Detection Across Exception Boundaries . . . . . . . . . . . . . . . . . . . . .7-59
7.8.4 LPMARK Programming Guidelines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-59
7.9 NOP Definition. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-60
7.9.1 Grouping Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .7-61
Appendix A
SC140 DSP Core Instruction Set
A.1 Introduction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-1
A.1.1 Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-2
A.1.1.1 Brackets as ISAP indicators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-4
A.1.1.2 Brackets as address indicators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-4
A.1.2 Addressing Mode Notation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-5
A.1.3 Data Representation in Memory for the Examples. . . . . . . . . . . . . . . . . . . . . A-6
A.1.4 Encoding Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-6
A.1.5 Prefix Word Encoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-7
A.1.5.1 One-Word Low Register Prefix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-8
A.1.5.2 Two-Word Prefix . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-9
A.1.6 Instruction Types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-12
A.1.6.1 Instruction Sub-types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-12
A.2 Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-19
A.2.1 Instruction Definition Layout . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-19
Appendix B
StarCore Registry
B.1 Using the StarCore Registry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-1
List of Figures
1-1 Block Diagram of a Typical SoC Configuration with the SC140 Core . . . . . . . 1-5
2-1 Block Diagram of the SC140 Core. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-2
2-2 DALU Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-6
2-3 DALU Data Representations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-18
2-4 Fractional and Integer Multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-20
2-5 Convergent Rounding (No Scaling) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-22
2-6 Two’s Complement Rounding (No Scaling) . . . . . . . . . . . . . . . . . . . . . . . . . . 2-24
2-7 DMAC Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-26
2-8 Fractional Double-Precision Multiplication . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-27
2-9 Fractional Mixed-Precision Multiplication. . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-28
2-10 Signed Integer Double-Precision Multiplication . . . . . . . . . . . . . . . . . . . . . . . 2-29
2-11 Unsigned Integer Double-Precision Multiplication . . . . . . . . . . . . . . . . . . . . . 2-30
2-12 AGU Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-32
2-13 AGU Programming Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-34
2-14 Modifier Control Register (MCTL) Format . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-37
2-15 Modulo Addressing Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-46
2-16 Integer Move Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-53
2-17 Fractional Move Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-54
2-18 Bit Allocation in MOVE.L D0.e:D1.e . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-55
2-19 Endian Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-56
2-20 Basic Connection between SC140 Core and Memory . . . . . . . . . . . . . . . . . . . 2-57
2-21 Memory Organization of Big and Little Endian Mode. . . . . . . . . . . . . . . . . . . 2-57
2-22 Data Transfer in Big and Little Endian Modes. . . . . . . . . . . . . . . . . . . . . . . . . 2-59
2-23 Multi-Register Transfer in Big and Little Endian Modes. . . . . . . . . . . . . . . . . 2-61
2-24 Program Memory Organization in Big and Little Endian Modes . . . . . . . . . . 2-62
2-25 Instruction Moves in Big and Little Endian Modes . . . . . . . . . . . . . . . . . . . . . 2-63
3-1 Status Register -SR. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-2
3-2 Exception and Mode Register (EMR) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-7
4-1 JTAG and EOnCE Multi-core Interconnection . . . . . . . . . . . . . . . . . . . . . . . . . 4-3
4-2 TAP Controller State Machine . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-5
4-3 Cascading Multiple EOnCE Modules. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-7
4-4 Reading and Writing EOnCE Registers Via JTAG . . . . . . . . . . . . . . . . . . . . . . 4-8
4-5 Accessing EOnCE registers through JTAG . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-9
4-6 Typical Debugging System. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-10
4-7 Software Downloading . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-13
4-8 EOnCE Controller Block Diagram. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-17
4-9 Event Counter Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-19
4-10 Event Detection Unit Block Diagram. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-21
4-11 EDCA Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-22
4-12 EDCD Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-24
4-13 Event Selector Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-26
4-14 Trace Unit Block Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-28
4-15 EOnCE Command Register (ECR). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-36
4-16 EOnCE Status Register (ESR) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-38
4-17 EOnCE Monitor and Control Register (EMCR). . . . . . . . . . . . . . . . . . . . . . . . 4-41
4-18 EE Signals Control Register (EE_CTRL) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-45
4-19 Injected Instruction Format. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-48
4-20 Event Counter Register (ECNT_CTRL). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-51
4-21 EDCA Control Register (EDCAi_CTRL) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-54
4-22 EDCD Control Register (EDCD_CTRL). . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-58
4-23 Event Selector Control Register (ESEL_CTRL) . . . . . . . . . . . . . . . . . . . . . . . 4-62
4-24 Event Selector Mask Debug State (ESEL_DM). . . . . . . . . . . . . . . . . . . . . . . . 4-63
4-25 Event Selector Mask Debug Exception (ESEL_DI). . . . . . . . . . . . . . . . . . . . . 4-64
4-26 Event Selector Mask Enable Trace (ESEL_ETB) . . . . . . . . . . . . . . . . . . . . . . 4-64
4-27 Event Selector Mask Disable Trace (ESEL_DTB). . . . . . . . . . . . . . . . . . . . . . 4-65
4-28 Trace Buffer Control Register (TB_CTRL) . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-67
5-1 Instruction Pipeline Stages . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-2
5-2 Instruction Grouping Methods . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-6
5-3 Low Register Prefix Selection Algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-11
5-4 Hardware Loop Programming Model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-25
5-5 Loop Nesting. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-28
5-6 SC140 Memory Use with a Single Stack Pointer. . . . . . . . . . . . . . . . . . . . . . . 5-32
5-7 SC140 Memory Use with Dual Stack Pointers. . . . . . . . . . . . . . . . . . . . . . . . . 5-33
5-8 Working mode Transitions - Unprotected Dual-stack RTOS. . . . . . . . . . . . . . 5-38
5-9 Working mode Transitions - Unprotected Single-stack RTOS . . . . . . . . . . . . 5-39
5-10 Core State Diagram. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-42
5-11 Core-PIC Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-47
5-12 Flowchart for Exception Timing. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-55
6-1 Core to Single ISAP Connection Schematic. . . . . . . . . . . . . . . . . . . . . . . . . . . 6-58
6-2 Core to Multiple ISAP Connection Schematic. . . . . . . . . . . . . . . . . . . . . . . . . 6-59
List of Tables
2-1 DALU Programming Model. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-7
2-2 Write to Data Registers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-9
2-3 Read from Data Registers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-9
2-4 Data Registers Access Width . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-10
2-5 DALU Arithmetic Instructions (MAC) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-10
2-6 DALU Logical Instructions (BFU). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-13
2-7 Scaling Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-14
2-8 Ln Bit Calculation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-15
2-9 Limiting Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-16
2-10 Scaling and Limiting Interactions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-16
2-11 Saturation and Rounding Interactions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-17
2-12 Two’s Complement Word Representations . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-19
2-13 Rounding Position in Relation to Scaling Mode . . . . . . . . . . . . . . . . . . . . . . . 2-21
2-14 Arithmetic Saturation Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-25
2-15 Fractional Signed and Unsigned Two’s Complement Multiplication . . . . . . . 2-26
2-16 Integer Signed and Unsigned Two’s Complement Multiplication. . . . . . . . . . 2-28
2-17 Address Modifier (AM) Bits. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-37
2-18 Access Width Support for Address and Register Update Calculations . . . . . . 2-42
2-19 Memory Address Alignment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-43
2-20 Addressing Modes Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-43
2-21 Modulo Register Values for Modulo Addressing Mode . . . . . . . . . . . . . . . . . 2-47
2-22 Modulo Register Values for Wrap-Around Modulo Addressing Mode. . . . . . 2-48
2-23 AGU Arithmetic Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-48
2-24 AGU Bit Mask Instructions (BMU) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-50
2-25 AGU Move Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-52
2-26 Data Representation in Memory. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2-58
2-27 Move Instructions in Big and Little Endian Modes . . . . . . . . . . . . . . . . . . . . . 2-64
2-28 Stack Support Instructions in Big and Little Endian Modes . . . . . . . . . . . . . . 2-67
2-29 Bit Mask Instructions in Big and Little Endian Modes . . . . . . . . . . . . . . . . . . 2-67
2-31 Control Instructions in Big and Little Endian Modes. . . . . . . . . . . . . . . . . . . . 2-68
2-30 Non-Loop Change-of-Flow Instructions in Big and Little Endian Modes. . . . 2-68
3-1 Status Register Description. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-2
3-2 EMR Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-8
4-1 JTAG Interface Signal Descriptions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-2
4-2 JTAG Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-3
4-3 JTAG Scan Paths . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-5
4-4 EOnCE Event Types. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-14
4-5 EOnCE Event and Action Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-15
4-6 EOnCE Controller Register Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-17
4-7 Event Counter Register Set. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-19
4-8 EDCA Register Set. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-23
4-9 EDCD Register Set. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-24
4-10 Event Selector Register Set. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-26
4-11 Trace Buffer Register Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-30
4-12 EOnCE Register Addressing Offsets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-31
4-13 ECR Description. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-36
4-14 ESR Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-38
4-15 EMCR Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-41
4-16 EE_CTRL Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-46
4-17 Length Control Bits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-48
4-18 ECNT_CTRL Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-51
4-19 EDCA_CTRL Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-54
4-20 EDCD_CTRL Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-58
4-21 ESEL_CTRL Description. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-62
4-22 Allowed tracing mode combinations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-66
4-23 TB_CTRL Description . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4-67
5-1 Pipeline Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-3
5-2 Pipeline Stages Overview. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-3
5-3 Prefix Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-9
5-4 Conditional IFc Syntax. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-9
5-5 Instruction Categories Timing Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-15
5-6 Non-Loop Change-of-Flow Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-17
5-7 Loop Change-Of-Flow Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-18
5-8 Number of Cycles Needed by Change-of-Flow Instructions . . . . . . . . . . . . . . 5-20
5-9 LPMARKA and LPMARKB Bits in Short and Long Loops . . . . . . . . . . . . . . 5-27
5-10 Loop Control Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-29
5-11 Stack Push/Pop Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-34
5-12 Even and Odd Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-34
5-13 Stack Memory Map . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-35
5-14 Stack Move Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-35
5-15 Working Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-37
5-16 Processing State Change Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-41
5-17 Processing State Transitions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-43
5-18 Exit Wait Processing State due to an Interrupt or NMI . . . . . . . . . . . . . . . . . . 5-45
5-19 Exception Vector Address Table . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-49
5-20 Exception Pipeline . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-53
5-21 Pipeline Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-56
6-1 ISAP Encoding Fields. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-60
A-1 Instruction Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-2
A-2 Operations Syntax. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-3
A-3 Register Abbreviations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-3
A-4 Assembler Syntax . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-4
A-5 Addressing Mode Notation for the EA Operand . . . . . . . . . . . . . . . . . . . . . . . . A-5
A-6 Addressing Mode Notation for the ea Operand . . . . . . . . . . . . . . . . . . . . . . . . . A-5
A-7 DALU Arithmetic Instructions (MAC) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-13
A-8 DALU Logical Instructions (BFU). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-14
A-9 AGU Arithmetic Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-15
A-10 AGU Move Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-15
A-11 AGU Stack Support Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-16
A-12 AGU Bit-Mask Instructions (BMU). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-17
A-13 AGU Non-Loop Change-of-Flow Instructions. . . . . . . . . . . . . . . . . . . . . . . . . A-17
A-14 AGU Loop Control (Including Loop COF) Instructions . . . . . . . . . . . . . . . . . A-18
A-15 AGU Program Control Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-18
A-16 Prefix Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-18
A-17 Combinations of LPMARKx Use. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . A-221
B-1 SCID Assignments . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . B-2
List of Examples
3-1 Clearing an EMR Bit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3-10
5-1 Four SC140 Instructions in an Execution Set. . . . . . . . . . . . . . . . . . . . . . . . . . . 5-5
5-2 Grouping Six SC140 Instructions in an Execution Set. . . . . . . . . . . . . . . . . . . . 5-5
5-3 Execution Set with Three One-word and Two Two-word Instructions . . . . . . 5-13
5-4 Conditional VLES Having Two Subgroups . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-13
5-5 Set of 2 Two-word Instructions Requiring a NOP . . . . . . . . . . . . . . . . . . . . . . 5-13
5-6 Delayed Change-of-Flow and Its Delay Slot . . . . . . . . . . . . . . . . . . . . . . . . . . 5-17
5-7 Subroutine Call Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-20
5-8 Parallel Execution of Two Move Instructions . . . . . . . . . . . . . . . . . . . . . . . . . 5-23
5-9 Execution Set Containing a Bit Mask and a Move Instruction. . . . . . . . . . . . . 5-23
5-10 Execution Set Containing One Bit Mask Instruction . . . . . . . . . . . . . . . . . . . . 5-23
5-11 Execution Set Containing a Bit Mask and a Pop Instruction . . . . . . . . . . . . . . 5-24
5-12 Long Loop. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-30
5-13 Long Loop Disassembly. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-30
5-14 Short Loop, Two Execution Sets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-30
5-15 Short Loop, One Execution Set . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-31
5-16 Nested Loop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-31
5-17 Basic Exception Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5-53
6-1 ISAP memory access . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-61
6-2 ISAP-Core register transfers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-62
6-3 ISAP-Core register transfers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-62
6-4 Single ISAP coding. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-63
6-5 Multiple ISAP coding. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-65
6-6 Conditional Execution Example. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-66
6-7 Conditional Execution Example. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6-66
6-8 MOVE rules with an implicit MOVE instruction from ISAP . . . . . . . . . . . . . 6-68
7-1 B Register Aliasing. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-5
7-2 Delayed COF Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-6
7-3 VLES Word Count Exceeds Eight . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-8
7-4 Too Many AGU Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-8
7-5 Duplicate PC Destinations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-9
7-6 Duplicate Address Pointer Register Destinations. . . . . . . . . . . . . . . . . . . . . . . . 7-9
7-7 Duplicate Stack Pointer Destinations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-9
7-8 Duplicate Register Destinations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-10
7-9 Duplicate SR/EMR Register Destinations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-10
7-10 Duplicate Status Bit Destinations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-10
7-11 Dual Stack Pointer Destination Exception . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-10
7-12 Mutually Exclusive Register Destination Exception . . . . . . . . . . . . . . . . . . . . 7-11
7-13 Mutually Exclusive Status Bit Destination Exception . . . . . . . . . . . . . . . . . . . 7-11
7-14 Multiple C, S and DOVF Status Bit Destination Exception. . . . . . . . . . . . . . . 7-11
7-15 DALU Register Use Exceeds Four Times . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-11
7-16 VLES Extension Words Exceed Two. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-12
7-17 Two-Word Instructions Exceed Two . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-12
7-18 VLES Has Mutually Exclusive Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . 7-13
7-19 RTE Uses Both AAU . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-13
7-20 Data Source Use of Nn and Mn Registers . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-14
7-21 IFc Having Two Subgroups . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-14
7-22 IFA Subgroup Must Be Last Instructions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-14
7-23 Core AGU instructions on same VLES as ISAP instructions . . . . . . . . . . . . . 7-15
7-24 ISAP instructions in same IFc group . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-15
7-25 MCTL Write to R0-R7 Use . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-16
7-26 Rn, Nn, Mn Write to AGU Use . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-17
7-27 Rn or Nn Write to MOVE-like Use . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-18
7-28 LCn Write to MOVE-like Use . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-18
7-29 NMID Update to EMR Read . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-19
7-30 Instructions in a Delay Slot. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-19
7-31 Instructions in a RTED Delay Slot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-20
7-32 RTE/D with SR Updates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-20
7-33 PC Read in a Return Delay Slot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-21
7-34 SR Write with a Subroutine Call . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-21
7-35 SR Write in BSRD or JSRD Delay Slot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-21
7-36 SP Use in Return Delay Slots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-21
7-37 SR Read in a CONTD Delay Slot. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-22
7-38 EMR Use in Return Delay Slots. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-22
7-39 T Bit Update to IFT/IFF AGU Use. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-22
7-40 T Bit Update by ISAP and COF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-23
7-41 T Bit Update by ISAP and MOVET/MOVEF . . . . . . . . . . . . . . . . . . . . . . . . . 7-23
7-42 T Bit Update by ISAP and IFT/IFF . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-23
7-43 SR Write to SR Status Bit Use . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-25
7-44 SR Write to SR Status Bit Update . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-26
7-45 DOVF Update to SR Read or Write . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-27
7-46 DOVF Update grouped with Move-like SR updates . . . . . . . . . . . . . . . . . . . . 7-27
7-47 Status Bit Update with SR Read. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-28
7-48 Nested Loops with the Same LA . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-28
7-49 Nested Loops with Ordered Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-29
7-50 Nested DOENn/DOENSHn Instructions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-29
7-51 DOENn instruction following DOENSHn Instruction. . . . . . . . . . . . . . . . . . . 7-30
7-52 LOOPEND between DOEN and LOOPEND. . . . . . . . . . . . . . . . . . . . . . . . . . 7-30
7-53 Changing a loop type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-30
7-54 Instructions at the End of Long Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-31
7-55 LCn Write at the End of Long Loop n . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-31
7-56 Instructions in Short Loops. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-32
7-57 Short Loop LA at the End of a Long Loop. . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-32
7-58 LCn Write to SKIPLS Instruction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-33
7-59 LCn Write at the End of Long Loop n . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-33
7-60 LCn Write at the Start of Short Loop n . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-34
7-61 LCn Write to CONT/D Instruction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-34
7-62 SAn Write at the End of Long Loop n . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-35
7-63 SAn Write to CONT/D Instruction. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-35
7-64 LCn Read at the Start of Short Loop n . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-35
7-65 COF Destination to Loop Delay Slots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-36
7-66 COF Instructions at LA-2 of a Long Loop . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-36
7-67 Bc/Jc at SA-1 of a Short Loop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-36
7-68 Bc/Jc at LA-3 of a Long Loop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-37
7-69 Loop COF Destination in the Same Loop . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-38
7-70 Loop COF at End of Nested Long Loops. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-39
7-71 Subroutine Call to End of Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-39
7-72 Delayed COF at LA-3 of a Long Loop. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-40
7-73 Delayed COF at SA-1 of a Short Loop. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-40
7-74 SR Read to LA of Any Long Loop. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-40
7-75 SR Read to SA of Any Short Loop. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-40
7-76 Enabling Short and Long Loops. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-41
7-77 Bn, Mn Write to AGU Use. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-41
7-78 Multiple Memory Writes to the Same Location. . . . . . . . . . . . . . . . . . . . . . . . 7-42
7-79 Pre-Calculated Memory Accesses to the Same Location. . . . . . . . . . . . . . . . . 7-42
7-80 Memory Write to Stack in a Return Delay Slot . . . . . . . . . . . . . . . . . . . . . . . . 7-42
7-81 Illegal use of RAS value. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-43
7-82 SR.2 Across a COF Boundary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-44
7-83 A.2 from a Delay Slot to a COF Destination . . . . . . . . . . . . . . . . . . . . . . . . . . 7-44
7-84 Set condition during a COF, and use it at the destination (T.1) . . . . . . . . . . . . 7-45
7-85 EMR access at the start of an exception . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-46
7-86 MCTL Write to R0-R7 Use . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-47
7-87 Invalid COF Destination Cannot be Detected . . . . . . . . . . . . . . . . . . . . . . . . . 7-48
7-88 COF Destination in the Middle of a VLES. . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-48
7-89 COF Destination in a Delay Slot . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-48
7-90 LFn Enabled During Loop Body n . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-49
7-91 LFn Enabled at LPA or LPB. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-53
7-92 Instructions at the End of Long Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-53
7-93 Active LCn Write at the End of Long Loops . . . . . . . . . . . . . . . . . . . . . . . . . . 7-54
7-94 Instructions in Short Loops. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-54
7-95 Active LCn Write at the Start of a Loop. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-55
7-96 Active SAn Write at the End of Long Loops . . . . . . . . . . . . . . . . . . . . . . . . . . 7-55
7-97 Active LCn Read at the Start of a Loop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-56
7-98 COF Instructions at LPB of a Long Loop. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-57
7-99 Bc/Jc at the Start of a Loop. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-57
7-100 Loop COF at End of Nested Long Loops. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-58
7-101 Subroutine Call to End of Loops . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-58
7-102 Delay Slot at LPA or LPB of a Loop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-59
7-103 SR Read to LPA or LPB of a Loop . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-59
7-104 COF Destination to Loop Delay Slots . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7-60
About This Book
This manual provides reference information for the StarCore SC140 digital signal processor (DSP) core. Specifically, this book describes the instruction set architecture and programming model for the SC140 core as well as corresponding register details, debug capabilities, and programming rules.
An appendix provides a detailed instruction reference for the SC140 instruction set, describing the operation, mnemonics, instruction fields, and encoding for each instruction. Instruction examples are also provided.
The resulting system-on-chip devices designed around the SC140 core will usually include additional functional blocks such as on-chip memory, an external memory interface, peripheral accelerators, and coprocessor devices. The specification of these functional blocks is customer-specific as well as application-specific. Therefore, this information is not covered in this manual.
Audience
This manual is intended for systems software developers, hardware designers, and application developers.
Organization
This book is organized into seven chapters and two appendices as follows:
Chapter 1, “Introduction”, describes key features of the SC140 architecture. This chapter also
illustrates a typical system using the SC140 core.
Chapter 2, “Core Architecture”, describes the main functional blocks and data paths of the SC140 core.
Chapter 3, “Control Registers”, details the core’s control registers.
Chapter 4, “Emulation and Debug (EOnCE)”, describes the hardware debug capabilities of the core.
Chapter 5, “Program Control”, details program control features such as the pipeline, instruction
grouping, instruction timing, hardware loops, stack support, processing states, protection model, and exception processing.
Chapter 6, “Instruction Set Accelerator Plug-In”, describes how the SC140 core and the software developer can work with an Instruction Set Accelerator Plug-In.
Chapter 7, “Programming Rules”, details the VLES semantics, static programming rules, dynamic
programming rules, and programming guidelines for correct code construction.
Appendix A, “SC140 DSP Core Instruction Set,” references the SC140 instruction set.
Appendix B, “StarCore Registry,” shows how to access the core version.
Abbreviations
The abbreviations used in this manual are listed below:
Table 1. Abbreviations
Abbreviation Description
AAU       Address arithmetic unit
ADM       Application development module
AGU       Address generation unit
ALU       Arithmetic logic unit
Bn        AGU base address register n
BFU       Bit-field unit
BMU       Bit mask unit
DALU      Data arithmetic and logic unit
DSP       Digital signal processor
ECR       EOnCE control register
EDU       Event detection unit, with respect to the EOnCE
EE        EOnCE event pins
EMCR      EOnCE monitor and control register
EMR       Exception and mode register
EOnCE     Enhanced on-chip emulator
ERCV      EOnCE receive register
ES        Event selector, with respect to the EOnCE
ESP       Exception mode stack pointer
ESR       EOnCE status register
ETRSMT    EOnCE transmit register
EXT       Extension portion of a data register
FC        Fetch counter
FIFO      First-in first-out
FFT       Fast Fourier transform
HP        High portion of a data register
IPL       Interrupt priority level
ISAP      Instruction Set Accelerator Plug-in
ISR       Interrupt service routine
JTAG      Joint test action group
LA        Last address
LCn       Loop counter register n
Ln        Limit tag bit n
LP        Low portion of a data register
LSB       Least significant bits
LSP       Least significant portion
Mn        AGU modifier register
MAC       Multiply-accumulate
MCTL      Modifier control register
MIPS      Million instructions per second
MMACS     Million multiply and accumulate operations per second
MSB       Most significant bits
MSP       Most significant portion
Nn        AGU offset register n
NMI       Non-maskable interrupt
NSP       Normal mode stack pointer
OS        Operating system
PAB       Program address bus
PAG       Program address generator
PC        Program counter register
PCU       Program control unit
PDB       Program data bus
PDU       Program dispatch unit
PIC       Programmable interrupt controller
PLL       Phase locked loop
PSEQ      Program sequencer unit
Rn        AGU address register n
RAS       Return address register
RTOS      Real-time operating system
SAn       Start address register n
SF        Signed fractional
SI        Signed integer
SM        Saturation mode
SoC       System-on-chip
SP        Stack pointer
SR        Status register
T         True bit
UI        Unsigned integer
VBA       Interrupt vector base address register
VLES      Variable length execution set instruction grouping
XABA      Data memory address bus A
XABB      Data memory address bus B
XDBA      Data memory data bus A
XDBB      Data memory data bus B
Revision History
Table 2. Revision History
Revision Date Description
4.0 31 Aug, 2004 Fourth release of SC140
4.1 20 Sep, 2005 Misc. corrections (restored missing IADDNC.W instruction)
Chapter 1
Introduction
The StarCore SC140 digital signal processing (DSP) core, a new member of the SC100 architecture, addresses key market needs of next-generation DSP applications. It is especially suited for wireline and wireless communications, including infrastructure and subscriber communications. It is a flexible programmable DSP core that enables the emergence of computationally intensive communication applications by providing exceptional performance, low power consumption, efficient compilability, and compact code density. The SC140 core efficiently deploys a variable-length execution set (VLES) execution model, which maximizes parallelism by allowing multiple address generation and data arithmetic logic units to execute multiple instructions in a single clock cycle.
This chapter describes key features of the SC140 core architecture.
1.1 Target Markets
The design of the SC140 architecture aims to provide a DSP software platform that fulfills the constantly increasing computational requirements of DSP applications due to:
New communication standards and services
Wideband channels and data rates
New user interfaces and media
Currently, software-configurable wireless terminals are already required to accommodate multiple air interfaces and frequency bands for cellular phones, PCs, paging devices, cordless phones, wireless LAN systems, and modems. In addition, multiple voice, messaging, internet, and video services must also be supported. These terminals must be flexible and upgradable so that they can be personalized for each user (such as permitting the dynamic download of applets). Finally, these terminals must be able to process baseband data using software to implement a range of functions previously carried out by hardware.
Target markets for the SC140 architecture include:
Wireless software configurable handset terminals (radios)
Third generation wireless handset systems with wideband data services
Wireless and wireline base stations as well as the corresponding infrastructure
Speech coding, synthesis, and voice recognition
Wireless internet and multimedia
Network and data communication
1.2 Architectural Differentiation
The SC140 architecture differentiates itself in the market with the following capabilities:
High-level Abstraction of the Application Software
— DSP applications and kernels can currently be developed in the C programming language. An
optimizing compiler generates parallel instructions while maintaining a high code density.
— An orthogonal instruction set and programming model along with single data space and byte
addressability enable the compiler to generate efficient code.
— Hardware supported integer and fractional data types enable application developers to choose
their own style of code development, or to use coding techniques derived from an application-specific standard.
Scalable Performance
— The number of execution units is independent of the instruction set, and can be tailored to the
application’s performance requirement. The SC140 contains four arithmetic logic units (ALUs) and two address arithmetic units (AAUs).
— A high frequency of operation is achieved at low voltage, providing four million multiply and
accumulate (MAC) operations per second (4 MMACS) for each megahertz of clock frequency.
— Support exists for application-specific accelerators, providing a performance boost and
reduction in power consumption.
High Code Density for Minimized Cost
— 16-bit wide instruction encoding.
— A rich and orthogonal instruction set, major portions of which focus on control code that can often occupy most of the application code.
— Variable length execution set (VLES) for DSP kernel operations.
Improved Support for Multi-tasking Applications
— Dual stack pointer support in HW.
Optimized Power Management Control
— Very low power consumption.
— Low voltage operation.
— Power saving modes.
Efficient Memory and I/O Interface
— Very large on-chip zero-wait state static random access memory (SRAM) capability.
— Support for slower on-chip memory via wait-states.
— 32-bit address space for both program and data (byte-addressable).
— Unified data and program memory space.
— Decoupled external memory timing with independent clock.
Core Organization and Design
— Supports flexible system-on-a-chip (SoC) configurations.
— Portable across fabrication lines and foundries.
1.3 Core Architecture Features
The SC140 core consists of the following:
Data arithmetic logic unit (DALU) that contains four instances of an arithmetic logic unit (ALU) and
a data register file
Address generation unit (AGU) that contains two address arithmetic units (AAU) and an address
register file
Program sequencer and control unit (PSEQ)
Key features of the SC140 core include the following:
Up to four million multiply-accumulate (MAC) operations per second (4 MMACS) for each
megahertz of clock frequency
Up to 10 RISC MIPS (million instruction words per second) for each megahertz of clock frequency
(a MAC operation is counted as two RISC instructions)
Four ALUs comprising MAC and bit-field units
A true (16 × 16) + 40 to 40-bit MAC unit in each ALU
A true 40-bit parallel barrel shifter in each ALU
Sixteen 40-bit data registers for fractional and integer data operand storage
Sixteen 32-bit address registers, eight of which can be used as 32-bit base address registers
Four address offset registers and four modulo address registers
Hardware support for fractional and integer data types
Up to six instructions executed in a single clock cycle
Very rich 16-bit wide orthogonal instruction set
Support for application specific instruction set enhancements with an interface to an ISAP
(Instruction Set Accelerator Plug-in)
VLES execution model
Two AAUs with integer arithmetic capabilities
A bit mask unit (BMU) for bit and bit-field logic operations
Unique DSP addressing modes
32-bit unified data and program address space
Zero-overhead hardware loops with up to four levels of nesting
Byte-addressable data memory
Position independent code utilizing change-of-flow instructions that are relative to the
program counter (PC)
Enhanced on-chip emulation (EOnCE) module with real-time debug capabilities
Low power wait standby mode
Very low power complementary metal-oxide semiconductor (CMOS) design
Fully static logic
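The per-megahertz throughput figures in the list above can be turned into peak rates for a given clock with simple arithmetic. The sketch below is illustrative only: the 300 MHz clock is a hypothetical value, and deriving the 10 RISC MIPS figure as four MACs counted twice plus two AGU instructions is one plausible reading of the feature list, not a statement taken from this manual.

```python
# Peak-rate arithmetic based on the feature list (assumptions noted inline).
ALUS = 4              # four ALUs, one MAC per ALU per cycle (from the text)
AAUS = 2              # two AAUs, one AGU instruction each per cycle (from the text)
RISC_OPS_PER_MAC = 2  # "a MAC operation is counted as two RISC instructions"


def peak_rates(clock_mhz):
    """Return (MMACS, RISC MIPS) for a clock frequency given in MHz."""
    mmacs = ALUS * clock_mhz
    # Assumed derivation: 4 MACs * 2 + 2 AGU instructions = 10 per cycle.
    risc_mips = (ALUS * RISC_OPS_PER_MAC + AAUS) * clock_mhz
    return mmacs, risc_mips


# Hypothetical 300 MHz core clock.
print(peak_rates(300))  # (1200, 3000)
```

These are upper bounds; sustained rates depend on how many execution sets actually fill all six units.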
1.3.1 Typical System-On-Chip Configuration
The SC140 is a high-performance general-purpose fixed-point DSP core, allowing it to support many system-on-chip (SoC) configurations. A library of modules containing memories, peripherals, accelerators, and other processor cores makes it possible for a variety of highly integrated and cost-effective SoC devices to be built around the SC140. Figure 1-1 shows a block diagram of a typical SoC chip made up of the SC140 core and associated SoC components (described below). In a typical system the SC140 core is enveloped in a platform that includes the core and supporting zero wait-state memories. This platform is integrated as a unit in the SoC. Although not indicated in this configuration, an SoC can contain more than one SC140 core platform.
An on-platform instruction set accelerator plug-in can be used as part of the SC140 core platform to provide additional instructions for unique application solutions such as video processing, which require specific arithmetic instructions in addition to the main instruction set.
SC140 DSP core platform — Includes the DSP core and the immediate supporting blocks that typically run at the full core frequency. The DSP platform typically includes:
— SC140 DSP core
— Instruction Set Accelerator Plug-in (ISAP) - for expanding the instruction set with application-specific instructions.
— L1 caches - data and instruction caches, operating with zero wait states in case of a cache hit
— Unified M1 memory - supporting both program and data, and hence connected to both the program and data buses of the core. The M1 memory operates with no wait states. It could be either RAM or ROM, or a mix of both. The RAM, depending on its size, may be connected as a slave to an external DMA.
— Programmable interrupt controller (PIC)
— Interfaces - translate the core data and program fetch requests to the bus protocol supported by the system, usually at a reduced frequency.
DSP Expansion Area — This area includes the functional units that interface between the core and
the DSP application, most importantly the functions that send and receive data from external input/output sources, under the control of the software running on the DSP core. In addition, this area includes accelerators that execute portions of the application, in order to boost performance and decrease power consumption. This area is application-specific and may or may not include various functional units such as:
— Synchronous serial interface
— Serial communication interface
— Viterbi accelerator
— Filter coprocessors
System Expansion Area — This area includes the SoC functional units that are not tightly coupled
with the DSP core. Typically it may include other processors with their support platform as well. This area is application-specific, and may include various functional units such as:
— External memory interface
— Direct memory access (DMA) controller
— L2 Cache controller for either data or program
— Chip-level Interrupt control unit
— On-chip Level 2 (M2) memory expansion modules
— Other processor cores with their supporting platforms
Figure 1-1. Block Diagram of a Typical SoC Configuration with the SC140 Core
1.3.2 Variable Length Execution Set (VLES) Software Model
The VLES software model is the instruction grouping used by the SC140 to address the requirements of DSP kernels. Using an orthogonal compiler-friendly instruction set, this model maintains a compact code density for applications.
All SC140 instruction words are 16 bits wide. Most instructions are encoded with one word. Each SC140 instruction encodes an atomic (lowest-level) operation. For example, MAC and store (move) instructions are encoded in 16 bits. Since atomic operations need fewer bits to encode, the 16-bit instruction set becomes fully orthogonal and very rich in the functionality it supports.
In order to execute signal processing kernels, a set of SC140 instructions can be grouped to be executed in parallel. The PSEQ performs this automatically with up to four DALU instructions and two AGU instructions executed at the same time.
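The grouping limit described above (at most four DALU instructions and two AGU instructions per execution set) can be expressed as a small check. This is a minimal sketch, not the PSEQ's actual grouping logic: the unit tags and the function name are hypothetical, and the further grouping rules of Chapter 7 (word counts, extension words, register conflicts) are not modeled.

```python
# Hypothetical model: each instruction is represented only by the unit
# that executes it, "DALU" or "AGU".
MAX_DALU_PER_SET = 4  # four ALUs in the DALU (from the text)
MAX_AGU_PER_SET = 2   # two AAUs in the AGU (from the text)


def fits_in_one_execution_set(units):
    """Return True if the tagged instructions respect the per-unit limits."""
    dalu = sum(1 for u in units if u == "DALU")
    agu = sum(1 for u in units if u == "AGU")
    # Reject any tag other than "DALU" or "AGU".
    if dalu + agu != len(units):
        return False
    return dalu <= MAX_DALU_PER_SET and agu <= MAX_AGU_PER_SET


print(fits_in_one_execution_set(["DALU"] * 4 + ["AGU"] * 2))  # True
print(fits_in_one_execution_set(["AGU"] * 3))                 # False
```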
Chapter 2
Core Architecture
This chapter provides an overview of the SC140 core architecture. It describes the main functional blocks and data paths of the core.
2.1 Architecture Overview
The SC140 core provides the following main functional units:
Data arithmetic and logic unit (DALU)
Address generation unit (AGU)
Program sequencer unit (PSEQ)
To provide data exchange between the core and the other on-chip blocks, the following buses are implemented:
Two data memory buses (address and data pairs: XABA and XDBA, XABB and XDBB) that are
used for all data transfers between the core and memory.
Program data and address buses (PDB and PAB) for carrying program words from the memory to
the core.
Special buses to support tightly coupled external user-definable instruction set accelerators.
A block diagram of the SC140 core is shown in Figure 2-1.
Figure 2-1. Block Diagram of the SC140 Core
2.1.1 Data Arithmetic Logic Unit (DALU)
The DALU performs arithmetic and logical operations on data operands in the SC140 core. The components of the DALU are as follows:
A register file of sixteen 40-bit registers
Four parallel ALUs, each ALU containing a multiply-accumulate (MAC) unit and
a bit-field unit (BFU)
Eight data bus shifter/limiters
All the MAC units and BFUs can access all the DALU registers. Each register is partitioned into three portions: two 16-bit registers (low and high portion of the register) and one 8-bit register (extension portion). Accesses to or from these registers can be in widths of 8 bits, 16 bits, 32 bits, or 40 bits, depending on the instruction.
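The three-way partition of a data register can be illustrated with a few shifts and masks. The sketch below assumes the extension portion occupies bits 39:32, the high portion bits 31:16, and the low portion bits 15:0, which matches the EXT:HP:LP ordering used in this chapter; the function names are hypothetical.

```python
def split_register(value40):
    """Split a 40-bit register value into (EXT, HP, LP) portions."""
    value40 &= (1 << 40) - 1          # keep only 40 bits
    ext = (value40 >> 32) & 0xFF      # 8-bit extension portion (assumed bits 39:32)
    hp = (value40 >> 16) & 0xFFFF     # 16-bit high portion (assumed bits 31:16)
    lp = value40 & 0xFFFF             # 16-bit low portion (assumed bits 15:0)
    return ext, hp, lp


def join_register(ext, hp, lp):
    """Rebuild the 40-bit value from its three portions."""
    return ((ext & 0xFF) << 32) | ((hp & 0xFFFF) << 16) | (lp & 0xFFFF)


ext, hp, lp = split_register(0x123456789A)
print(hex(ext), hex(hp), hex(lp))  # 0x12 0x3456 0x789a
```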
The two data buses between the DALU register file and the memory are each 64 bits wide. This enables a very high data transfer speed between memory and registers by allowing two data moves in parallel, each up to 64 bits in width. The move instructions vary in access width from 8 bits to 64 bits, and can transfer multiple words within the 64 bit constraint. With every MOVE instruction affecting the memory, one of four signals to the memory interface is asserted, defining the access width.
MOVE.B loads or stores bytes (8-bit).
MOVE.W or MOVE.F loads or stores integer or fractional words (16-bit).
MOVE.2W, MOVE.2F or MOVE.L loads or stores two integers, two fractions and long words
respectively (32-bit).
MOVE.4W or MOVE.4F loads or stores four integers or four fractions, respectively (64-bit).
MOVE.2L loads or stores two long words (64-bit).
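The access widths listed above can be collected into a lookup table, which also makes the "one of four signals" statement concrete: only four distinct widths occur. This is a sketch; the table and helper names are hypothetical, and only the mnemonics and widths come from the list.

```python
# Access width in bits for each MOVE variant named in the text.
MOVE_ACCESS_BITS = {
    "MOVE.B": 8,
    "MOVE.W": 16, "MOVE.F": 16,
    "MOVE.2W": 32, "MOVE.2F": 32, "MOVE.L": 32,
    "MOVE.4W": 64, "MOVE.4F": 64, "MOVE.2L": 64,
}

DATA_BUSES = 2        # XDBA and XDBB
BUS_WIDTH_BITS = 64   # each data bus is 64 bits wide


def access_bytes(mnemonic):
    """Return the number of bytes moved by one access of this MOVE variant."""
    return MOVE_ACCESS_BITS[mnemonic] // 8


# The four distinct access widths, one per memory-interface signal.
print(sorted(set(MOVE_ACCESS_BITS.values())))  # [8, 16, 32, 64]
# Peak data transfer per cycle when both buses carry a 64-bit move.
print(DATA_BUSES * BUS_WIDTH_BITS // 8)        # 16
```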
2.1.1.1 Data Register File
The DALU registers can be read or written over the data buses (XDBA and XDBB). A DALU register can be the source for up to four simultaneous instructions, but simultaneous writes of a destination register are illegal. The source operands for DALU arithmetic instructions usually originate from DALU registers. The destination of every arithmetic operation is a DALU register, and each such destination can be used as a source operand for the operation immediately following, without any time penalty.
2.1.1.2 Multiply-Accumulate (MAC) Unit
The MAC unit is the main arithmetic processing unit of the SC140 core and performs the arithmetic operations. The MAC unit has a 40-bit input and outputs one 40-bit result in the form of [Extension:High Portion:Low Portion] (EXT:HP:LP).
The multiplier executes 16-bit by 16-bit fractional or integer multiplication between two’s complement signed, unsigned, or mixed operands (16-bit multiplier and multiplicand). The 32-bit product is right-justified, sign-extended, and may be added to the 40-bit contents of one of the 16 data registers.
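The multiply-accumulate data flow can be sketched for the signed integer case. This is a simplified model under stated assumptions: both operands are treated as signed two's complement, the result wraps modulo 2^40 rather than saturating, and fractional, unsigned, and mixed modes are not modeled. The helper names are hypothetical.

```python
MASK40 = (1 << 40) - 1


def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def mac_integer(acc40, x16, y16):
    """Return acc + x * y as a 40-bit pattern (signed integer mode, no saturation)."""
    # A 16 x 16 signed product always fits in 32 bits; Python ints keep the sign,
    # which plays the role of sign-extending the product to 40 bits.
    product = to_signed(x16, 16) * to_signed(y16, 16)
    return (to_signed(acc40, 40) + product) & MASK40


print(mac_integer(10, 3, 4))            # 22
print(hex(mac_integer(0, 0xFFFF, 2)))   # 0xfffffffffe, that is -2 in 40 bits
```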
2.1.1.3 Bit-Field Unit (BFU)
The BFU contains a 40-bit parallel bidirectional shifter with a 40-bit input and a 40-bit output, a mask generation unit, and a logic unit. The BFU is used in the following operations:
Multi-bit left/right shift (arithmetic or logical)
One-bit rotate (right or left)
Bit-field insert and extract
Count leading bits (ones or zeros)
Logical operations
Sign or zero extension operations
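Several of the operations listed above can be modeled with generic bit manipulation on a 40-bit value. The sketch below shows bit-field extract, bit-field insert, and count leading zeros; it illustrates the operations in general and does not reproduce the operand formats or flag behavior of the corresponding SC140 instructions, which are defined in Appendix A. All function names are hypothetical.

```python
WIDTH = 40
MASK = (1 << WIDTH) - 1


def extract_field(value, offset, width):
    """Return `width` bits of value starting at bit `offset`, right-justified."""
    return (value >> offset) & ((1 << width) - 1)


def insert_field(value, field, offset, width):
    """Return value with `width` bits at `offset` replaced by the low bits of field."""
    field_mask = ((1 << width) - 1) << offset
    return (value & ~field_mask & MASK) | ((field << offset) & field_mask)


def count_leading_zeros(value):
    """Count zero bits above the most significant set bit of a 40-bit value."""
    return WIDTH - (value & MASK).bit_length()


print(hex(extract_field(0xABCD, 4, 8)))    # 0xbc
print(hex(insert_field(0, 0xF, 4, 4)))     # 0xf0
print(count_leading_zeros(1))              # 39
```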
2.1.1.4 Shifter/Limiters
Eight shifter/limiters provide scaling and limiting on 32-bit transfers from the data register file to memory. Scaling up or down by one bit is programmable as is limiting to the maximum values provided in 32 bits. For more detailed information, see Section 2.2.1.4, “Data Shifter/Limiter,” Section 2.2.1.5, “Scaling,” and
Section 2.2.1.6, “Limiting.”
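The effect of scaling and limiting on a transfer can be sketched as follows. This is an assumption-laden illustration, not the behavior specified in Section 2.2.1: it assumes a signed 40-bit source, a scale of one bit up or down applied before limiting, and clamping to the signed 32-bit range. The function and parameter names are hypothetical.

```python
INT32_MAX = (1 << 31) - 1
INT32_MIN = -(1 << 31)


def to_signed40(value):
    """Interpret the low 40 bits of value as a two's complement number."""
    value &= (1 << 40) - 1
    return value - (1 << 40) if value & (1 << 39) else value


def scale_and_limit(value40, scale=0):
    """Scale by one bit (scale=+1 up, -1 down, 0 none), then clamp to 32 bits."""
    v = to_signed40(value40)
    if scale > 0:
        v <<= 1
    elif scale < 0:
        v >>= 1  # arithmetic shift: Python preserves the sign
    v = max(INT32_MIN, min(INT32_MAX, v))
    return v & 0xFFFFFFFF  # 32-bit pattern written to memory


print(hex(scale_and_limit(0x0080000000)))  # 0x7fffffff (positive overflow limited)
print(hex(scale_and_limit(0xFF00000000)))  # 0x80000000 (negative overflow limited)
```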
2.1.2 Address Generation Unit (AGU)
The AGU contains address registers and performs address calculations using integer arithmetic necessary to address data operands in memory. It implements four types of arithmetic: linear, modulo, multiple wrap-around modulo, and reverse-carry. The AGU operates in parallel with other core resources to minimize address generation overhead.
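Two of the arithmetic types named above, modulo and reverse-carry, are easiest to see on small examples. The sketch below uses the generic textbook definitions: modulo addressing wraps an address inside a buffer of a given size, and reverse-carry addition propagates carries from the most significant bit toward the least significant bit, which yields the bit-reversed ordering used by FFTs. It does not model the SC140 register encodings or the multiple wrap-around variant, and all names are hypothetical.

```python
def modulo_update(addr, offset, base, modulus):
    """Advance addr by offset, wrapping inside [base, base + modulus)."""
    return base + (addr - base + offset) % modulus


def reverse_carry_add(addr, offset, bits):
    """Add offset to addr with carries propagating from MSB to LSB."""
    def rev(x):
        # Reverse the low `bits` bits of x.
        return int(format(x & ((1 << bits) - 1), f"0{bits}b")[::-1], 2)
    return rev(rev(addr) + rev(offset))


# Circular buffer of 8 bytes at a hypothetical base address 0x1000.
print(hex(modulo_update(0x1006, 4, 0x1000, 8)))  # 0x1002

# Bit-reversed walk over 8 elements: offset is half the table size.
addr, order = 0, []
for _ in range(8):
    order.append(addr)
    addr = reverse_carry_add(addr, 4, 3)
print(order)  # [0, 4, 2, 6, 1, 5, 3, 7]
```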
The AGU in the SC140 core has two address arithmetic units (AAU) to allow two address generation operations at every clock cycle. The AAU has access to:
Sixteen 32-bit address registers (R0–R15), of which R8–R15 can also be used as base address
registers for modulo addressing.
Four 32-bit offset registers (N0–N3).
Four 32-bit modulo registers (M0–M3).
The two AAUs are identical. Each contains:
A 32-bit full adder, used for offset calculations.
A second 32-bit full adder, used for modulo calculations.
Each AAU can update one address register in the address register file in one instruction cycle.
The AGU also contains a 32-bit modifier control register (MCTL). This control register is used to specify the addressing mode of the R registers: linear, reverse-carry, modulo, or multiple wrap-around modulo. When modulo addressing mode is selected, the MCTL register is used to specify which of the four modulo registers is assigned to a specific R register.
Explicit instructions in the SC140 instruction set are used to execute arithmetic operations on the address pointers. This capability can also be used for general data arithmetic. In addition, the AGU generates change-of-flow program addresses and updates the stack pointers as needed.
2.1.2.1 Stack Pointer Registers
Two special registers with special addressing modes are used for software stacks. These are the Normal mode stack pointer (NSP) and the Exception mode stack pointer (ESP). Both the ESP and the NSP are 32-bit read/write address registers with pre-decrement and post-increment updates. Both are offset with immediate values to allow random access to a software stack.
The ESP is used by stack instructions when the SC140 is in the Exception mode of operation, which is entered when exceptions occur. The NSP is used in Normal mode when there are no exceptions. The existence of two stack pointers enables separate allocation of stack space by the operating system and each application task, which optimizes memory use in multi-tasking systems.
2.1.2.2 Bit Mask Unit (BMU)
The BMU provides an easy way of setting, clearing, inverting, or testing a selected, but not necessarily adjacent, group of bits in a register or memory location.
The BMU supports a set of bit mask instructions that operate on:
All AGU pointers (R0–R15)
All DALU registers (D0–D15)
All control registers (EMR, VBA, SR, MCTL)
Memory locations
Only a single bit mask instruction is allowed in any single execution set since only one execution unit exists for these instructions.
A subgroup of the bit mask instructions (BMTSET) provides hardware support of semaphoring, providing one instruction for read-modify-write.
2.1.3 Program Sequencer Unit (PSEQ)
The PSEQ performs instruction fetch, instruction dispatch, hardware loop control, and exception processing. The PSEQ controls the different processing states of the SC140 core. The PSEQ consists of three hardware blocks:
Program dispatch unit (PDU): Responsible for detecting the execution set out of a one or two fetch set, and dispatching the execution set’s various instructions to their appropriate execution units where they are decoded.
Program control unit (PCU): Responsible for controlling the sequence of the program flow.
Program address generator (PAG): Responsible for generating the program counter (PC) for instruction fetch operations, including hardware looping.
The PSEQ implements its functions using the following registers:
PC—Program counter register
SR—Status register
SA0-3—Four start address registers (SA0–SA3)
LC0-3—Four loop counter registers (LC0–LC3)
EMR—Exception and mode register
VBA—Interrupt vector base address register
2.1.4 Enhanced On-Chip Emulator (EOnCE)
The EOnCE module provides a non-intrusive means of interacting with the SC140 core and its peripherals so that a user can examine registers, memory, or on-chip peripherals as well as define various breakpoints and read the trace-FIFO. The EOnCE module greatly aids the development of hardware and software on the SC140 core processor. It interfaces with the debugging system through the on-chip JTAG TAP controller pins. Refer to Chapter 4, “Emulation and Debug (EOnCE),” for details.
2.1.5 Instruction Set Accelerator Plug-in (ISAP) Interface
A user-defined instruction set accelerator plug-in (ISAP) module provides a means of enhancing the SC140 basic instruction set with additional instructions. These additional instructions are executed in an external module connected to the core. The new instructions are added to the SC140 Assembler and Compiler via intrinsic libraries making application-specific or general-purpose functions available to the user. A 25-bit instruction bus from the SC140 core to the ISAP enables the definition and support of a very rich instruction set. The ISAP is also connected to the two 64-bit data buses, providing a large data bandwidth to the main memory system.
2.1.6 Memory Interface
The SC140 core uses a unified memory space. Each address can contain either program information or data. The exact memory configuration is customizable for each chip containing an SC140 core. Memory space typically consists of on-chip RAM and ROM that can be expanded off-chip. The memory system must support two parallel data accesses. However, it may issue stalls due to its specific implementation. Refer to Section 2.4, “Memory Interface,” for further details.
Both internal and external memory configurations are specific to each member of the SC140 family.
2.2 DALU
This section describes the architecture and operation of the DALU, the block where most of the arithmetic and logical operations are performed on data operands. In addition, this section details the arithmetic and rounding operations performed by the DALU as well as its programming model.
2.2.1 DALU Architecture
The DALU performs most of the arithmetic and logical operations on data operands in the SC140 core. The data registers can be read from or written to memory over the XDBA and the XDBB as 8-bit, 16-bit, or 32-bit operands. The 64-bit wide data buses, XDBA and XDBB, support the transfer of several operands in a single access. The source operands for the DALU, which may be 16, 32, or 40 bits, originate either from data registers or from immediate data. The results of all DALU operations are stored in the data registers.
All DALU operations are performed in one clock cycle. Up to four parallel arithmetic operations can be performed in each cycle, one in each ALU. The destination of every arithmetic operation can be used as a source operand for the operation immediately following without any time penalty.
The components of the DALU are as follows:
A register file of sixteen 40-bit registers
Four parallel ALUs, each containing a MAC unit and a BFU with a 40-bit barrel shifter
Eight data bus shifter/limiters that allow scaling and limiting of up to four 32-bit operands transferred over each of the XDBA and XDBB buses in a single cycle
Figure 2-2 shows the architecture of the DALU.
Figure 2-2. DALU Architecture
[Block diagram: Memory Data Bus 1 (XDBA) and Memory Data Bus 2 (XDBB), each 64 bits wide, connect through the eight shifter/limiters to the data registers D0–D15, which exchange 40-bit operands with the four ALUs.]
The DALU programming model is shown in Table 2-1. Register D0 refers to the entire 40-bit register, whereas D0.e, D0.h, and D0.l refer to the extension, high portion, and low portion of the D0 register, respectively. In addition, one limit tag bit is associated with each data register. L0–L15 are concatenated to D0–D15, respectively.
Table 2-1. DALU Programming Model
LIMIT  EXT    HP     LP
L0     D0.e   D0.h   D0.l
L1     D1.e   D1.h   D1.l
L2     D2.e   D2.h   D2.l
L3     D3.e   D3.h   D3.l
L4     D4.e   D4.h   D4.l
L5     D5.e   D5.h   D5.l
L6     D6.e   D6.h   D6.l
L7     D7.e   D7.h   D7.l
L8     D8.e   D8.h   D8.l
L9     D9.e   D9.h   D9.l
L10    D10.e  D10.h  D10.l
L11    D11.e  D11.h  D11.l
L12    D12.e  D12.h  D12.l
L13    D13.e  D13.h  D13.l
L14    D14.e  D14.h  D14.l
L15    D15.e  D15.h  D15.l
2.2.1.1 Data Registers (D0–D15)
In this section, the D0–D15 data registers are referred to as Dn. They can be used as:
Source operands
Destination operands
Accumulators
The registers can serve as input buffer registers between XDBA or XDBB and the ALUs. The registers are used as DALU source operands, allowing new operands to be loaded for the next instruction while the register contents are used by the current arithmetic instruction.
Each data register Dn has a limit tag bit (Ln) which is used to signify whether the extension portion of the register is in use. The limit tag bit Ln is coupled to the extension portion Dn.e, which forms a 9-bit operand for the purpose of storing these bits to memory. See Section 2.2.1.6, “Limiting,” for further details.
The data registers can be accessed over XDBA and XDBB with three data widths:
A long-word access, writing or reading 32-bit operands
A word access, writing or reading 16-bit operands
A byte access, writing or reading 8-bit operands
For move instructions of fractional data, the transfer of a Dn register to memory over XDBA and XDBB is protected against overflow by substituting a limiting constant for the data that is being transferred. The content of Dn is not affected should limiting occur. Only the value transferred over XDBA or XDBB is limited. This process is commonly referred to as transfer saturation and should not be confused with the arithmetic saturation mode as described in Section 2.2.2.7, “Arithmetic Saturation Mode.”
Limiting is performed after the contents of the register have been shifted according to the scaling mode. Shifting and limiting are performed only for MOVES instructions when a fractional operand is specified as the source for a data move over XDBA or XDBB. When an integer operand is specified as the source for a data move, shifting and limiting are not performed.
Automatic sign extension (or zero extension) of the data values into the 40-bit registers is provided when an operand is transferred from memory to a data register. If a fractional word operand is to be written to a data register, the high portion (HP) of the register is written with the word operand. The low portion (LP) is zero-filled. The EXT portion is sign-extended from the HP, and the limit tag bit (Ln) is cleared.
When an integer word operand is to be written to a data register, the LP portion of the register is written with the word operand. The HP and EXT portions are either zero-extended or sign-extended from the LP. Long-word operands are written into the HP:LP portions of the register. The EXT portion is zero-extended or sign-extended, and the limit tag bit (Ln) is cleared.
When a byte operand is to be written to a data register, the lower eight bits of the LP (Dn.l[7:0]) are written with the byte operand. The upper eight bits of the LP (Dn.l[15:8]), the high portion, and the EXT are either zero-extended or sign-extended from the LP lower byte. The limit tag bit (Ln) is cleared.
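The register-write rules in the preceding paragraphs can be sketched as follows. This is a minimal model, assuming a data register is represented as a 40-bit integer laid out as EXT[39:32]:HP[31:16]:LP[15:0]; the function names are illustrative and do not correspond to instruction mnemonics. The limit tag bit is not modeled because every one of these writes clears it.

```python
MASK40 = (1 << 40) - 1


def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign


def write_fractional_word(word):
    """HP <- operand, LP <- zero, EXT <- sign extension of HP."""
    return (sign_extend(word, 16) << 16) & MASK40


def write_integer(operand, bits, signed=True):
    """Byte (8), word (16) or long (32) integer write.

    The operand lands in the least significant bits; everything above it
    is sign-extended or zero-extended.
    """
    value = sign_extend(operand, bits) if signed else operand & ((1 << bits) - 1)
    return value & MASK40


def fields(reg40):
    """Split a 40-bit register image into (EXT, HP, LP)."""
    return (reg40 >> 32) & 0xFF, (reg40 >> 16) & 0xFFFF, reg40 & 0xFFFF
```

For example, writing the fractional word $8000 gives EXT:HP:LP = $FF:$8000:$0000, while writing the same bit pattern as a signed integer word gives $FF:$FFFF:$8000.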
A special case of the MOVE.L instruction is used for reading from or writing to the EXT portion of a data register. Six variations of this instruction save (restore) the extension bits and Ln bit of data registers to (from) memory. One of the variations writes to memory the Ln bit and extension bits of an even and an odd pair of registers. Another variation reads bits 8:0 from memory to the extension bits and the Ln bit of an even register. Another variation reads bits 24:16 to the extension bits and the Ln bit of an odd register. Memory writes are done from the even/odd pair of registers. Memory reads are done to a single register. An extension saved to memory from an even-numbered register must be restored to an even register, and an extension saved from an odd-numbered register must be restored to an odd register.
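The long-word layout used when extensions are saved can be sketched as below. The packing follows the bit positions stated above (bits 8:0 for the even register, bits 24:16 for the odd register) together with the {7 zero bits, Ln, Dn.e} ordering given for this access type in Table 2-3; the function names are hypothetical.

```python
def pack_extensions(ln_even, ext_even, ln_odd, ext_odd):
    """Build the 32-bit image written for an even/odd register pair.

    Low word  = {7 zero bits, Ln,   Dn.e}    -> bits 8:0
    High word = {7 zero bits, Ln+1, Dn+1.e}  -> bits 24:16
    """
    low = ((ln_even & 1) << 8) | (ext_even & 0xFF)
    high = ((ln_odd & 1) << 8) | (ext_odd & 0xFF)
    return (high << 16) | low


def unpack_even(long_word):
    """Return (Ln, Dn.e) for an even register from bits 8:0."""
    field = long_word & 0x1FF
    return field >> 8, field & 0xFF


def unpack_odd(long_word):
    """Return (Ln, Dn.e) for an odd register from bits 24:16."""
    field = (long_word >> 16) & 0x1FF
    return field >> 8, field & 0xFF
```

Because the even and odd fields sit in different halves of the long word, restoring a saved extension to a register of the other parity would read the wrong field, which is why the pairing rule above matters.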
All move instructions are described in detail in Appendix A, “SC140 DSP Core Instruction Set.” Table 2-2 summarizes the various types of data bus write access to the data registers.
Note: When an unsigned long operand is written to a data register, Dn.e is zero-extended.
Table 2-3 summarizes the various types of data bus read accesses from the data registers.
Note: A fractional word or fractional long word can be written to memory with or without limiting and shifting. See MOVE.F and MOVES.F in Appendix A, “SC140 DSP Core Instruction Set.”
The register file architecture and the 64-bit wide data buses XDBA and XDBB support wide data transfers between the memory and the data registers. Up to four 16-bit words or two 32-bit long words can be transferred between the register file and the memory in a single move operation on each data bus, XDBA or XDBB.
Table 2-4 summarizes the various data widths for data moves from/to the data register file.
Table 2-2. Write to Data Registers
Operand Type         Ln             Dn.e                         Dn.h                        Dn.l
Fractional Word      Zero-extended  Sign-extended                Operand                     Zero-filled
Integer Byte         Zero-extended  Zero-extended/Sign-extended  Zero-filled/Sign-extended   Upper byte: Sign-extended/zero-extended; Lower byte: Operand
Integer Word         Zero-extended  Zero-extended/Sign-extended  Zero-filled/Sign-extended   Operand
Long                 Zero-extended  Zero-extended/Sign-extended  Operand                     Operand
2 Extensions - Long  Operand        Operand                      Unchanged                   Unchanged
Table 2-3. Read from Data Registers
Operand Type         Memory Data Bus.h                        Memory Data Bus.l                   Limiting/Scaling
Fractional Word      -                                        Dn.h                                Yes/No (See Note)
Fractional Long      Dn.h                                     Dn.l                                Yes/No (See Note)
Integer Word         -                                        Dn.l                                No
Integer Long         Dn.h                                     Dn.l                                No
Integer Byte         -                                        Low byte: Dn.l[7:0]                 No
2 Extensions - Long  EXT word: {7 zero bits, Ln+1, Dn+1.e}    EXT word: {7 zero bits, Ln, Dn.e}   No
2.2.1.2 Multiply-Accumulate (MAC) Unit
The MAC unit is the arithmetic part of the ALU containing both a multiplier and an adder. It also performs other operations such as rounding, saturation, comparisons, and shifting. Inputs to the MAC unit are from data registers or from immediate data programmed into the instruction. As many as three operands may be inputs. The destination for MAC instructions is always a data register in the 40-bit form EXT:HP:LP. The multiplier executes 16 by 16 parallel multiplication of two’s complement data, signed or unsigned, fractional or integer. The multiplier output can be accumulated with 40-bit data in a destination register. A detailed description of each multiplication operation is given in Section 2.2.2.3, “Multiplication.” The adder executes addition and subtraction of two 40-bit operands. All MAC instructions are executed in one clock cycle.
Table 2-5 lists the arithmetic instructions that are executed in the MAC unit. A more detailed description of each instruction is given in Appendix A, “SC140 DSP Core Instruction Set.”
Table 2-4. Data Registers Access Width
Operand Type   Data Width (Bits)
Byte           8
Word           16
Long           32
Two word       32
Four byte      32
Two long word  64
Four word      64
Table 2-5. DALU Arithmetic Instructions (MAC)
Instruction  Description
ABS          Absolute value
ADC          Add long with carry
ADD          Add
ADD2         Add two words
ADDNC.W      Add without changing the carry bit in the SR
ADR          Add and round
ASL          Arithmetic shift left by one bit
ASR          Arithmetic shift right by one bit
CLR          Clear
CMPEQ        Compare for equal
CMPGT        Compare for greater than
CMPHI        Compare for higher (unsigned)
DECEQ        Decrement a data register and set T (the true bit) if zero
DECGE        Decrement a data register and set T if greater than or equal to zero
DIV          Divide iteration
DMACSS       Multiply signed by signed and accumulate with data register right-shifted by word size
DMACSU       Multiply signed by unsigned and accumulate with data register right-shifted by word size
IADDNC.W     40-bit non-saturating add integers with immediate, no carry update
IMAC         Multiply-accumulate integers
IMACLHUU     Multiply-accumulate unsigned integers: first source from low portion, second from high portion
IMACUS       Multiply-accumulate unsigned integer and signed integer
IMPY.W       Multiply integer
IMPYHLUU     Multiply unsigned integer and unsigned integer: first source from high portion, second from low portion
IMPYSU       Multiply signed integer and unsigned integer
IMPYUU       Multiply unsigned integer and unsigned integer
INC          Increment a data register
INC.F        Increment a data register (as fractional data)
MAC          Multiply-accumulate signed fractions
MACR         Multiply-accumulate signed fractions and round
MACSU        Multiply-accumulate signed fraction and unsigned fraction
MACUS        Multiply-accumulate unsigned fraction and signed fraction
MACUU        Multiply-accumulate unsigned fraction and unsigned fraction
MAX          Transfer maximum signed value
MAX2         Transfer two 16-bit maximum signed values
MAX2VIT      Transfer two 16-bit maximum signed values, update Viterbi flags
MAXM         Transfer maximum magnitude value
MIN          Transfer minimum signed value
MPY          Multiply signed fractions
MPYR         Multiply signed fractions and round
MPYSU        Multiply signed fraction and unsigned fraction
MPYUS        Multiply unsigned fraction and signed fraction
MPYUU        Multiply unsigned fraction and unsigned fraction
2.2.1.3 Bit-Field Unit (BFU)
The BFU is the logic part of the ALU. It contains a 40-bit parallel bidirectional shifter (with a 40-bit input and a 40-bit output), a mask generation unit, and a logic unit. The BFU is used in the following operations:
Multi-bit left/right shift (arithmetic or logical)
One-bit rotate (right or left)
Bit-field insert and extract
Count leading bits (ones or zeros)
Logical operations
Sign or zero extension operations
Table 2-6 lists the instructions which are executed in the BFU. A more detailed description of each instruction is given in Appendix A, “SC140 DSP Core Instruction Set.”
Table 2-5. DALU Arithmetic Instructions (MAC) (Continued)
Instruction  Description
NEG          Negate
RND          Round
SAT.F        Saturate fractional value in data register to fit in high portion
SAT.L        Saturate value in data register to fit in 32 bits
SBC          Subtract long with carry
SBR          Subtract and round
SUB          Subtract
SUB2         Subtract two words
SUBL         Shift left and subtract
SUBNC.W      Subtract with no carry bit generation
TFR          Transfer data register to a data register
TFRF         Transfer data register to a data register if T bit is false
TFRT         Transfer data register to a data register if T bit is true
TSTEQ        Test for equal to zero
TSTEQ.L      32-bit compare for equal to zero
TSTGE        Test for greater than or equal to zero
TSTGT        Test for greater than zero
2.2.1.4 Data Shifter/Limiter
The data shifters/limiters provide special post-processing on data written from a Dn register to the XDBA or XDBB buses. There are eight independent shifters/limiters, four for the XDBA bus and four for the XDBB bus, allowing transfers to memory of up to four words per MOVES instruction with scaling and limiting. Each consists of a shifter for scaling followed by a limiter. Note that arithmetic saturation from DALU operations is a different function. Saturation occurs in the DALU before data is written to a destination register.
Table 2-6. DALU Logical Instructions (BFU)
Instruction  Description
AND          Logical AND
ASLL         Multi-bit arithmetic shift left
ASLW         Word arithmetic shift left (16-bit shift)
ASRR         Multi-bit arithmetic shift right
ASRW         Word arithmetic shift right (16-bit shift)
CLB          Count leading bits (ones or zeros)
EOR          Bit-wise exclusive OR
EXTRACT      Extract signed bit-field
EXTRACTU     Extract unsigned bit-field
INSERT       Insert bit-field
LSLL         Multi-bit logical shift left
LSR          Logical shift right by one bit
LSRR         Multi-bit logical shift right
LSRW         Word logical shift right (16-bit shift)
NOT          One’s complement (inversion)
OR           Bit-wise inclusive OR
ROL          Rotate one bit left through the carry bit
ROR          Rotate one bit right through the carry bit
SXT.B        Sign extend byte
SXT.L        Sign extend long
SXT.W        Sign extend word
ZXT.B        Zero extend byte
ZXT.L        Zero extend long
ZXT.W        Zero extend word
2.2.1.5 Scaling
The data shifters in the shifter/limiter unit can perform the following data shift operations:
Scale up—Shift data one bit to the left
Scale down—Shift data one bit to the right
No scaling—Pass the data unshifted
The eight shifters permit direct dynamic scaling of fixed-point data without additional program steps. For example, this permits straightforward block floating-point implementation of Fast Fourier Transforms (FFTs).
Scaling occurs if programmed in the scaling mode bits S0 and S1 (bits 4 and 5 in the SR). Scaling of operands only occurs with the MOVES.F, MOVES.2F, MOVES.4F, and MOVES.L instructions, moving data from a DALU register (or registers) to memory. The data in the register is not changed, only the data that is transferred. The scaling mode also affects the Ln bit calculation and the rounding function for a set of DALU instructions. Scaling is disabled when the arithmetic saturation mode is set. See Section 3.1.1, “Status Register (SR),” and below for further details. An example of scaling is provided in Table 2-7.
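The scaling step of a fractional word move can be sketched as follows. This is a minimal model under stated assumptions: the register is a 40-bit integer, scale down is taken to be an arithmetic (sign-preserving) shift, limiting is left out, and the S1:S0 encoding is the one listed in Table 2-8. The function names are illustrative only.

```python
MASK40 = (1 << 40) - 1


def scaling_mode(sr):
    """Decode S1:S0 (SR bits 5 and 4) into a scaling mode name."""
    code = (sr >> 4) & 0x3
    modes = {0b00: "none", 0b01: "down", 0b10: "up"}
    if code not in modes:
        # The S1:S0 = 11 combination is not listed in Table 2-8.
        raise ValueError("unlisted scaling mode")
    return modes[code]


def moves_f_scaled(reg40, mode):
    """Return the 16-bit word a scaled fractional move would transfer.

    Only the transferred value is shifted; the register image is untouched.
    Limiting is intentionally not modeled here.
    """
    value = reg40 & MASK40
    if value & (1 << 39):          # reinterpret as signed 40-bit
        value -= 1 << 40
    if mode == "down":
        value >>= 1                # assumed arithmetic shift right
    elif mode == "up":
        value <<= 1
    return (value >> 16) & 0xFFFF  # high portion after scaling
```

With D0 holding $00 0200 0000, the sketch yields $0100 for scale down and $0400 for scale up, the same values Table 2-7 shows being written to memory.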
2.2.1.6 Limiting
The limiting capability is enabled only for the MOVES.F, MOVES.2F, MOVES.4F, and MOVES.L instructions, and not for any other fractional moves such as MOVE.F. These instructions move data from DALU register(s) to memory. The limiting operation takes place in two steps: first, calculating the Ln bit when a previous ALU instruction wrote to a register, and second, transferring the data from that register with a MOVES instruction. The transferred data is limited if the Ln bit is set.
2.2.1.6.1 Calculating the Ln Bit
The Ln bit can be affected by ALU instructions which are capable of using the extension portion of a data register. The only use of the Ln bit is to set up or prepare for a subsequent MOVES instruction. The Ln bit is calculated based on the effective extension bits shown in Table 2-8. These are the bits to the left of the implied decimal point after scaling. If the bits are not all zeros or all ones, the extension is effectively in use and the Ln bit will be set. The Ln bit is cleared as data is written to a DALU register if the defining bits below are all zeros or all ones.
Table 2-7. Scaling Example
Instruction           Memory/Register  New Value   Comments
move.w #$0030,r0      r0               $0000 0030  R0 initialized for first memory write
moveu.w #$0200,d0.h   d0               $0200 0000  D0 written
bmset #$10,sr.l       sr               $0000 0010  Scale down set in SR
moves.f d0,(r0)+      $0030            $0100       Memory written with scaled down value
move.l #$00e40020,sr  sr               $00e4 0020  Scale up set in SR
moves.f d0,(r0)       $0032            $0400       Memory written with scaled up value
The Ln bit is calculated (and set or cleared) for the following saturable instructions: ABS, ADC, ADR, ADD, ADDNC, ASL, ASR, DIV, INC, MAC, MACR, MPY, MPYR, NEG, RND, SBC, SBR, SUB, SUBL, SUBNC, and TFRx. The Ln bit is cleared if arithmetic saturation mode is set, except for these instructions: ADC, DIV, SBC, TFR, TFRT, and TFRF. For the latter six, the Ln bit calculation is done, even if arithmetic saturation mode is set. However, no scaling is considered in the Ln bit calculation if the arithmetic saturation mode is set, even if a scaling mode bit is set.
The Ln bit is always cleared as a result of the execution of one of the following instructions: CLR, DECEQ, DECGE, MAX, MAXM, MIN, ADD2, SUB2, MAX2, MAX2VIT, DMACSU, DMACSS, MACSU, MACUU, MACUS, MPYSU, MPYUU, MPYUS, IADDNC, SAT, all integer multiplication operations, all BFU operations (as listed in Table 2-6), and all MOVE instructions except for the specialized MOVE instruction that restores (pops from the stack) the extension and Ln bits from memory. If the result of one of these instructions must be limited by a following move operation, a TFR Dn,Dn instruction should be executed after the original instruction in order to validate the Ln bit before the value is written to memory using a MOVES.x operation.
2.2.1.6.2 Limiting with the MOVES Instructions
The second stage of limiting occurs with the execution of a MOVES instruction. A limited value is substituted for the transferred data if the Ln bit of that register was set. The data in the register is not changed, only the data transferred.
Having four limiters for each bus allows eight operands to be limited independently in the same instruction cycle. The four data limiters per bus can also be combined to form two 32-bit data limiters per bus for long-word operands.
If limiting occurs, the data limiter substitutes a limited data value having maximum magnitude (saturated) and the same sign as the 40-bit source register content:
$7FFF for 16-bit positive numbers
$7FFF FFFF for 32-bit positive numbers
$8000 for 16-bit negative numbers
$8000 0000 for 32-bit negative numbers
This substitution process is sometimes called transfer saturation. The value in the register is not shifted or limited, and can be reused by subsequent instructions. If the arithmetic saturation mode is set in the SR, scaling is not considered in the calculation of the Ln bit. An example of limiting is provided in Table 2-9.
Table 2-8. Ln Bit Calculation
S1  S0  Scaling Mode  Bits Defining the Ln Bit Calculation
0   0   No Scaling    Bits 39, 38, ..., 32, 31
0   1   Scale Down    Bits 39, 38, ..., 33, 32
1   0   Scale Up      Bits 39, 38, ..., 31, 30
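The two limiting steps described in this section, calculating Ln from the bit ranges in Table 2-8 and then substituting a limit constant on a MOVES transfer, can be sketched as below. This is an illustrative model only: the register is a 40-bit integer, arithmetic saturation mode is assumed to be off, only the 16-bit fractional word case is shown, and the function names are hypothetical.

```python
MASK40 = (1 << 40) - 1

# Lowest bit included in the Ln calculation for each scaling mode (Table 2-8).
LN_LOW_BIT = {"none": 31, "down": 32, "up": 30}


def ln_bit(reg40, mode="none"):
    """Return 1 if the defining bits are neither all zeros nor all ones."""
    low = LN_LOW_BIT[mode]
    field = (reg40 & MASK40) >> low
    all_ones = (1 << (40 - low)) - 1
    return 0 if field in (0, all_ones) else 1


def moves_f(reg40, ln, mode="none"):
    """Word transferred by a limiting fractional move.

    If Ln is set, a limit constant with the sign of the 40-bit register is
    substituted; otherwise the (scaled) high portion is transferred.
    """
    if ln:
        return 0x8000 if (reg40 >> 39) & 1 else 0x7FFF
    value = reg40 & MASK40
    if value & (1 << 39):
        value -= 1 << 40
    if mode == "down":
        value >>= 1
    elif mode == "up":
        value <<= 1
    return (value >> 16) & 0xFFFF
```

Adding $7FFF 0000 to itself gives $00 FFFE 0000. Bits 39 to 31 are then neither all zeros nor all ones, so Ln is set and the limiting move transfers $7FFF instead of $FFFE.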
Note that in the unusual case where arithmetic saturation mode is set between a DALU instruction and a subsequent MOVES instruction, scaling with the MOVES instruction is inhibited. However, limiting will occur if the Ln bit is already set.
2.2.1.7 Scaling and Arithmetic Saturation Mode Interactions
Table 2-10 shows the scaling and limiting operations for the four possible cases of scaling/no scaling with arithmetic saturation mode on/off. Note that selecting both scaling and arithmetic saturation is not a normal mode of operation for the core. The “Special Six” instructions referred to in Table 2-10 and Table 2-11 are ADC, DIV, SBC, TFR, TFRT, and TFRF.
Table 2-9. Limiting Example
Instruction          Memory/Register  New Value         Comments
move.w #$0020,r0     r0               $0000 0020        R0 holds the address for the first move to memory
moveu.w #$7fff,d0.h  d0               $7fff 0000        d0.h set with the most positive 2’s complement number
moveu.w #$7fff,d1.h  d1               $7fff 0000        d1.h set with the most positive 2’s complement number
add d0,d1,d3         d3               $1:00:fffe 0000   L3 bit set from overflow
move.f d3,(r0)+      $0020            $fffe             No limiting from the move instruction
moves.f d3,(r0)      $0022            $7fff             Limiting occurs with the moves instruction

Table 2-10. Scaling and Limiting Interactions
Scaling   Arithmetic  Ln Bit Calculation                                                    Limiting with   Scaling with
Selected  Saturation  Saturable DALU            Special Six               Other DALU        MOVES           MOVES
          Mode        Instructions              Instructions              Instructions      Instructions    Instructions
                                                                                            (see note)
None      Off         Calculated, no scaling    Calculated, no scaling    Cleared           Yes             No
Up/down   Off         Calculated, with scaling  Calculated, with scaling  Cleared           Yes             Yes
None      On          Cleared                   Calculated, no scaling    Cleared           Yes             No
Up/down   On          Cleared                   Calculated, no scaling    Cleared           Yes             No
Note: Limiting will occur if the Ln bit is set.
The following table (Table 2-11) shows the arithmetic saturation and rounding operations for the four possible cases of scaling, no scaling, and arithmetic saturation mode on/off.
2.2.2 DALU Arithmetic and Rounding
The following paragraphs describe the DALU data representation, rounding modes, and arithmetic methods.
2.2.2.1 Data Representation
The SC140 core uses either a fractional or integer two’s complement data representation for all DALU operations. The main difference between fractional and integer representations is the location of the decimal (or binary) point. For fractional arithmetic, the decimal (or binary) point is always located immediately to the right of the most significant bit of the high portion. For integer values, it is always located immediately to the right of the least significant bit (LSB) of the value. Figure 2-3 shows the location of the decimal (binary) point, the bit weighting, and the operand alignment for the different fractional and integer representations supported on the SC140 architecture.
Table 2-11. Saturation and Rounding Interactions
Scaling   Arithmetic  Arithmetic Saturation                                                   Rounding
Selected  Saturation  Saturable DALU                                Special Six
          Mode        Instructions                                  Instructions
None      Off         None                                          None          Rounding with no scaling
Up/down   Off         None                                          None          Rounding with scaling considered
None      On          Saturation can occur                          None          Rounding with no scaling
Up/down   On          Saturation can occur, no scaling considered   None          Rounding with no scaling
Figure 2-3. DALU Data Representations
2.2.2.2 Data Formats
Three types of two’s complement data formats are supported by the SC140 core:
Signed fractional (SF)
Signed integer (SI)
Unsigned integer (UI)
The ranges for each of these formats, described below, apply to all data stored in memory as well as data stored in the data registers. The extension associated with each register allows word growth so that the most positive fractional number that can be represented in a register is almost 256.0 with the most negative fractional number being exactly -256.0. When the register extension is in use, the data contained in the register cannot be stored exactly in memory or in other registers in a single move. In these cases, the storage error can be minimized by limiting the data to the most positive or most negative number consistent with the size of the destination, the sign of the register and the MSB of the extension.
2.2.2.2.1 Signed Fractional
In this format, without extension bits 39-32, the N-bit operand is represented using the 1.[N-1] bit format (1 sign bit, N-1 fractional bits). Signed fractional numbers lie in the following range:
\[ -1.0 \le SF \le +1.0 - 2^{-(N-1)} \]
For words and long-word signed fractions, the most negative number that can be represented is exactly –1.0, of which the internal representation is $8000 and $8000 0000, respectively. The most positive word is $7FFF or 1.0 – 2^{-15}, and the most positive long word is $7FFF FFFF or 1.0 – 2^{-31}.
If the extension bits are in use, the most positive number is 256 – 2^{-31}, represented by $7F FFFF FFFF, and the most negative number is –256, represented by $80 0000 0000.
[Figure 2-3 panels. Signed Fractional Two’s Complement Representations: a 16-bit word operand (D0.h–D15.h or 16-bit memory) with bit weights from 2^0 down to 2^-15, and the 40-bit registers D0–D15 with bit weights from 2^8 down to 2^-31. Signed Integer Two’s Complement Representations: a 16-bit word operand (D0.l–D15.l or 16-bit memory) with bit weights from 2^15 down to 2^0, and the 40-bit registers D0–D15 with bit weights from 2^39 down to 2^0.]
2.2.2.2.2 Signed Integer
This format is used when processing data as integers. Using this format, the N-bit operand is represented using the N.0 bit format (N integer bits). Signed integer numbers lie in the following range:
\[ -2^{N-1} \le SI \le 2^{N-1} - 1 \]
For words and long-word signed integers, the most negative word that can be represented is -32768 ($8000) and the most negative long word is -2147483648 ($8000 0000). The most positive word is 32767 ($7FFF) and the most positive long word is 2147483647 ($7FFF FFFF).
If the extension bits are in use, N becomes 40, and the most positive number is 2^{39} – 1, represented by $7F FFFF FFFF. The most negative number is –2^{39}, represented by $80 0000 0000.
2.2.2.2.3 Unsigned Integer
Unsigned integer numbers may be thought of as positive only. The unsigned numbers have nearly twice the magnitude of a signed number of the same length. Unsigned integer numbers lie in the following range:
\[ 0 \le UI \le 2^{N} - 1 \]
The binary word is interpreted as having a binary point immediately to the right of the LSB. The most positive 16-bit unsigned integer is 65535 ($FFFF). The most positive 32-bit unsigned integer is 2^{32} – 1 ($FFFF FFFF). The smallest unsigned number is zero ($0000). If the extension bits are in use, the range is from zero to +2^{40} – 1.
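The three interpretations of the same bit pattern can be checked with a short sketch. The helper names are illustrative, and exact fractions are used so that values such as 1.0 – 2^{-15} compare without floating-point error. The `frac_bits` parameter is an assumption introduced here to cover the 40-bit register case, where the binary point stays to the right of bit 31.

```python
from fractions import Fraction


def as_signed_integer(pattern, bits=16):
    """Two's complement integer value of an N-bit pattern."""
    pattern &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (pattern ^ sign) - sign


def as_unsigned_integer(pattern, bits=16):
    """Unsigned integer value of an N-bit pattern."""
    return pattern & ((1 << bits) - 1)


def as_signed_fraction(pattern, bits=16, frac_bits=None):
    """Signed fractional value.

    By default the format is 1.[N-1]; for a 40-bit register with the
    extension in use, pass bits=40 and frac_bits=31.
    """
    if frac_bits is None:
        frac_bits = bits - 1
    return Fraction(as_signed_integer(pattern, bits), 1 << frac_bits)
```

For instance, $FFFF reads as –2^{-15} as a signed fraction, –1 as a signed integer, and 65535 as an unsigned integer.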
Table 2-12. Two’s Complement Word Representations
Signed Fractional      Signed Integer       Unsigned Integer
$7FFF  1.0 – 2^-15     $7FFF  2^15 – 1      $FFFF  2^16 – 1
...                    ...                  $FFFE  2^16 – 2
$0001  2^-15           $0001  +1            ...
$0000  0               $0000  0             ...
$FFFF  –2^-15          $FFFF  –1            ...
...                    ...                  $0001  1
$8000  –1.0            $8000  –2^15         $0000  0
2.2.2.3 Multiplication
Most of the operations are performed identically in fractional and integer arithmetic. However, the multiplication operation is not the same for integer and fractional arithmetic. As illustrated in Figure 2-4, fractional and integer multiplication differ by a 1-bit shift. Any binary multiplication of two N-bit signed numbers gives a signed result that is 2N-1 bits in length. This 2N-1 bit result must then be correctly placed into a field of 2N-bits to correctly fit into the on-chip registers. For correct fractional multiplication, an extra 0-bit is placed at the LSB to give a 2N-bit result. For correct integer multiplication, an extra sign bit is placed at the MSB to give a 2N-bit result.
The MPY, MAC, MPYR, and MACR instructions perform fractional multiplication and fractional multiply-accumulation. The IMPY and the IMAC instructions perform integer multiplication.
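The 1-bit difference between the two kinds of product can be illustrated with a short Python sketch for N = 16. It is a minimal illustration rather than a model of the DALU; the helper names `to_signed`, `impy16`, and `mpy16` are hypothetical and are not SC140 mnemonics.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def impy16(a, b):
    """Integer multiply: the 2N-1 bit product is used as is (sign-extended)."""
    return to_signed(a, 16) * to_signed(b, 16)


def mpy16(a, b):
    """Fractional multiply: the product is shifted left once, so a zero fills the LSB."""
    return (to_signed(a, 16) * to_signed(b, 16)) << 1


# $4000 is 16384 as an integer and 0.5 as a 16-bit fraction.
print(hex(impy16(0x4000, 0x4000)))  # integer product, 16384 * 16384
print(hex(mpy16(0x4000, 0x4000)))   # fractional product, 0.25 as a 32-bit fraction
```

The extra left shift keeps the binary point of the fractional result directly after the sign bit, which is why the same bit patterns give results that differ by a factor of two.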
2.2.2.4 Division
Fractional division of both positive and signed values is supported using the DIV instruction. The dividend (numerator) is a 32-bit fraction and the divisor (denominator) is a 16-bit fraction. For a detailed description of the DIV instruction, see Appendix A, “SC140 DSP Core Instruction Set.”
2.2.2.5 Unsigned Arithmetic
Unsigned arithmetic can be performed on the SC140 core architecture. Most of the unsigned arithmetic instructions are performed the same as the signed instructions. However, some operations require special hardware and are implemented as separate instructions.
2.2.2.5.1 Unsigned Multiplication
Unsigned multiplication (MPYUU, MACUU) and mixed unsigned-signed multiplication (MPYSU, MACSU) are used to support double precision, as described in Section 2.2.2.8, “Multi-Precision Arithmetic Support.” These instructions can be used for unsigned arithmetic multiplication.
Figure 2-4. Fractional and Integer Multiplication
[Figure: signed multiplication of N × N bits gives a 2N – 1 bit product. Integer: the product is placed in the 2N-bit field with sign extension at the MSB. Fractional: the product is placed in the 2N-bit field with a zero fill at the LSB. S, HP, and LP mark the sign bit, high portion, and low portion of each result.]
2.2.2.5.2 Unsigned Comparison
When performing an unsigned comparison, the condition code computation is different from signed comparisons. The most significant bit of the unsigned operand has a positive weight, while in signed representation it has a negative weight. Special instructions are implemented to support unsigned comparison such as CMPHI (compare greater).
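The effect of the MSB weight can be seen by comparing the same two 16-bit patterns under both interpretations. The following Python sketch is illustrative only; `signed16` is a hypothetical helper and the snippet does not model the condition code logic.

```python
def signed16(x):
    """Two's complement value of a 16-bit pattern (MSB weighs -2^15)."""
    x &= 0xFFFF
    return x - 0x10000 if x & 0x8000 else x


a, b = 0x8000, 0x7FFF

# Signed view: $8000 is the most negative word, so it is not greater than $7FFF.
signed_greater = signed16(a) > signed16(b)

# Unsigned view: the MSB weighs +2^15, so $8000 is greater than $7FFF.
unsigned_greater = (a & 0xFFFF) > (b & 0xFFFF)

print(signed_greater, unsigned_greater)
```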
2.2.2.6 Rounding Modes
The SC140 DALU performs rounding of the full register to single precision if requested in the instruction. The high portion of the register is rounded according to the contents of the low portion of the register. Then the low portion is cleared. The boundary between the low portion and the high portion is determined by the scaling mode bits (S0 and S1) in the SR. Two types of rounding are implemented, convergent rounding and two’s complement rounding. The type of rounding is selected by the rounding mode (RM) bit in the SR.
Table 2-13 shows the boundary between the high portion and the low portion depending on scaling. The scaling adjustment is disabled if arithmetic saturation mode is selected.
2.2.2.6.1 Convergent Rounding
Convergent rounding (also called round-to-nearest even number) is the default rounding mode. It is selected when the rounding mode (RM) bit in the SR is cleared. The traditional rounding method rounds up any value greater than one-half, and rounds down any value less than one-half. However, the question arises as to which way one-half should be rounded. If it is always rounded one way, the results are eventually biased in that direction. Convergent rounding, however, removes the bias by rounding down if the high portion is even (LSB = 0) and rounding up if the high portion is odd (LSB = 1).
For no scaling, the higher portion (HP) of the register is bits 39:16; the low portion (LP) is bits 15:0. The HP is incremented by one if the LP was > 1/2, or if the LP = 1/2 and bit 16 was 1 (odd). The HP is left alone if the LP was < 1/2, or if LP = 1/2 and bit 16 was 0 (even). After rounding, the LP is cleared. If scaling down is selected, the HP is bits 39:17 and the LP is bits 16:0. If scaling up is selected, the HP is bits 39:15 and the LP is bits 14:0.
Table 2-13. Rounding Position in Relation to Scaling Mode
S1   S0   Scaling Mode   High Portion   Low Portion
0    0    No Scaling     39–16          15–0
0    1    Scale Down     39–17          16–0
1    0    Scale Up       39–15          14–0
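The convergent rounding rule and the boundaries of Table 2-13 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the accumulator is passed as an unsigned 40-bit pattern, overflow out of bit 39 simply wraps, and the names `LP_BITS` and `round_convergent` are hypothetical.

```python
# Width of the low portion for each scaling mode (Table 2-13).
LP_BITS = {"none": 16, "down": 17, "up": 15}


def round_convergent(acc, scaling="none"):
    """Round a 40-bit pattern to its high portion; ties go to the even value."""
    n = LP_BITS[scaling]
    half = 1 << (n - 1)
    lp = acc & ((1 << n) - 1)   # low portion
    hp = acc >> n               # high portion
    if lp > half or (lp == half and hp & 1):
        hp += 1                 # round up
    return (hp << n) & 0xFFFFFFFFFF  # low portion cleared


# A tie with an even high portion rounds down; with an odd one it rounds up.
print(hex(round_convergent(0x0000048000)))
print(hex(round_convergent(0x0000058000)))
```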
Figure 2-5 shows the four cases for rounding a number in the Dn.h register. If scaling is set in the SR, the rounding position is updated to reflect the alignment of the result when it is put on the data bus. However, the contents of the register are not scaled.
Figure 2-5. Convergent Rounding (No Scaling)
Case I: If D0.l < $8000 (1/2), then round down (add nothing)
Case II: If D0.l > $8000 (1/2), then round up (add 1 to D0.h)
Case III: If D0.l = $8000 (1/2), and the LSB of D0.h = 0, then round down (add nothing)
Case IV: If D0.l = $8000 (1/2), and the LSB of D0.h = 1, then round up (add 1 to D0.h)
[Figure: before-and-after register diagrams for each case, showing D0.e (bits 39–32), D0.h (bits 31–16), and D0.l (bits 15–0). The example bit patterns are listed below; X marks a don't-care bit.]
Case   D0.h bits 19–16 before   D0.l before   Added to D0.h   D0.h bits 19–16 after   D0.l after*
I      0100                     011X...X      0               0100                    000...0
II     0100                     1110X...X     1               0101                    000...0
III    0100                     1000...0      0               0100                    000...0
IV     0101                     1000...0      1               0110                    000...0
*D0.l is always cleared, performed during RND, MPYR, and MACR.
2.2.2.6.2 Two’s Complement Rounding
When two’s complement rounding is selected by setting the rounding mode (RM) bit in the SR, all values greater than or equal to one-half are rounded up, and all values less than one-half are rounded down. Therefore, a small positive bias is introduced.
For no scaling, the higher portion (HP) of the register is bits 39:16; the low portion (LP) is bits 15:0. The HP is incremented by one if the LP was ≥ 1/2. The HP is left alone if the LP was < 1/2. After rounding, the LP is cleared. If scaling down is selected, the HP is bits 39:17 and the LP is bits 16:0. If scaling up is selected, the HP is bits 39:15 and the LP is bits 14:0.
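Two's complement rounding is equivalent to adding one-half of the high portion's LSB and then clearing the low portion. The Python sketch below shows this under the same assumptions as before: an unsigned 40-bit pattern, wrap-around on overflow, and a hypothetical function name `round_twos_complement`.

```python
def round_twos_complement(acc, lp_bits=16):
    """Round a 40-bit pattern to its high portion; one-half always rounds up."""
    half = 1 << (lp_bits - 1)
    rounded = ((acc + half) >> lp_bits) << lp_bits  # add 1/2, clear low portion
    return rounded & 0xFFFFFFFFFF


# Both ties round up, regardless of whether the high portion is even or odd.
print(hex(round_twos_complement(0x0000048000)))
print(hex(round_twos_complement(0x0000058000)))
```

Because the tie case never rounds down, a long series of such roundings accumulates the small positive bias mentioned above.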
Figure 2-6 shows the four cases for rounding a number in the Dn.h register. If scaling is set in the SR, the rounding position is updated to reflect the alignment of the result when it is transferred to the data bus. However, the contents of the register are not scaled.
Figure 2-6. Two’s Complement Rounding (No Scaling)
Case I: If D0.l < $8000 (1/2), then round down (add nothing)
Case II: If D0.l > $8000 (1/2), then round up (add 1 to D0.h)
Case III: If D0.l = $8000 (1/2), and the LSB of D0.h = 0, then round up (add 1 to D0.h)
Case IV: If D0.l = $8000 (1/2), and the LSB of D0.h = 1, then round up (add 1 to D0.h)
[Figure: before-and-after register diagrams for each case, showing D0.e (bits 39–32), D0.h (bits 31–16), and D0.l (bits 15–0). The example bit patterns are listed below; X marks a don't-care bit.]
Case   D0.h bits 19–16 before   D0.l before   Added to D0.h   D0.h bits 19–16 after   D0.l after*
I      0100                     011X...X      0               0100                    000...0
II     0100                     1110X...X     1               0101                    000...0
III    0100                     1000...0      1               0101                    000...0
IV     0101                     1000...0      1               0110                    000...0
*D0.l is always cleared, performed during RND, MPYR, and MACR.
2.2.2.7 Arithmetic Saturation Mode
By setting the arithmetic saturation mode (SM) bit in the SR, the arithmetic unit’s result is limited to 32 bits (high portion and low portion). The dynamic range of the DALU is therefore reduced to 32 bits. The purpose of this bit is to provide a saturation mode for algorithms that do not recognize or cannot take advantage of the extension bits.
Arithmetic saturation operates by checking whether bits 39–31 of a relevant DALU instruction result are all ones or all zeros. If they are not, and if bit 39 is one, the result receives the negative saturation constant $FF 8000 0000. If bit 39 is zero, the result receives the positive saturation constant $00 7FFF FFFF. If saturation occurs, the DOVF bit in the EMR register is set (see footnote 1).
The calculation for saturation is not affected by the scaling mode. In the same way, the rounding of the saturation constant during execution of MPYR, MACR and RND instructions is independent of the scaling mode: $00 7FFF FFFF is rounded to $00 7FFF 0000 and $FF 8000 0000 is unchanged.
The instructions that are affected by arithmetic saturation mode are: MAC, MPY, MACR, MPYR, SUB, ADD, NEG, ABS, RND, INC, ADR, SBR, SUBL, ASR, SUBNC, ADDNC, and ASL.
When the arithmetic saturation mode is set, for most of the instructions, the scaling mode bits are ignored for the calculation of the Ln bit, and the Ln bit cannot be set. For instructions ADC, DIV, SBC, TFR, TFRT, and TFRF, however, the arithmetic saturation mode is ignored, and the Ln bit will be calculated. These six are dependent on arithmetic saturation mode to the extent that scaling is not considered in the Ln bit calculation if arithmetic saturation mode is on. See Section 2.2.1.7, “Scaling and Arithmetic Saturation Mode Interactions,” on page 2-16 for more information.
The arithmetic saturation mode is always disabled during the execution of the following instructions: TFR, TFRT, TFRF, MAX, MAXM, MIN, ADD2, SUB2, DIV, SBC, ADC, MAX2, MAX2VIT, DMACSU, DMACSS, MACSU, MACUS, MACUU, MPYSU, MPYUU, MPYUS, IADDNC, CMPHI, DECEQ, DECGE, all integer multiplication operations, and all BFU operations as described in Table 2-6 on page 2-13. If the result of these instructions should be saturated, a SAT.L Dn instruction must be executed following the original instruction.
If the arithmetic saturation mode is set and data saturation occurs, the sticky data overflow bit (DOVF) in the EMR is set to signify that the arithmetic result before saturation cannot be represented in 32 bits. Note that if arithmetic saturation mode is not set, the DOVF bit is set when overflow from 40 bits occurs. Table 2-14 provides an example of the arithmetic saturation mode.
1. In case of a 40-bit overflow which takes place in conjunction with arithmetic saturation, the constant being chosen is undefined, and it can be either the negative or positive constant.
Table 2-14. Arithmetic Saturation Example
Instruction             Memory/Register   New Value         Comments
bmset #$0004,sr.l       sr                $00e4 0004        Arithmetic saturation mode set
moveu.w #$7fff,d0.h     d0                $7fff 0000        d0.h set with the most positive 2’s complement number
moveu.w #$7fff,d1.h     d1                $7fff 0000        d1.h set with the most positive 2’s complement number
add d0,d1,d3            d3                $0:00:7fff ffff   Max positive constant loaded in D3. L3 bit not set from overflow
                        emr               $0000 0004        DALU overflow bit set
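The saturation check on bits 39–31 can be sketched in Python and applied to the values of Table 2-14. This is a minimal sketch, assuming the result is available as a 40-bit pattern without a 40-bit overflow; the names `saturate32`, `POS_SAT`, and `NEG_SAT` are hypothetical.

```python
POS_SAT = 0x007FFFFFFF  # $00 7FFF FFFF
NEG_SAT = 0xFF80000000  # $FF 8000 0000


def saturate32(result40):
    """Return (value, dovf) for a 40-bit result under arithmetic saturation mode."""
    result40 &= 0xFFFFFFFFFF
    top = result40 >> 31            # bits 39..31 (nine bits)
    if top in (0x000, 0x1FF):       # all zeros or all ones: fits in 32 bits
        return result40, False
    if result40 & (1 << 39):        # negative result
        return NEG_SAT, True
    return POS_SAT, True            # positive result


# add d0,d1,d3 with d0 = d1 = $00 7FFF 0000, as in Table 2-14
d0 = d1 = 0x007FFF0000
value, dovf = saturate32(d0 + d1)
print(hex(value), dovf)
```

The raw sum is $00 FFFE 0000, in which bit 31 differs from bits 39–32, so the positive constant is selected and the overflow flag is reported.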
2.2.2.8 Multi-Precision Arithmetic Support
The SC140 DALU supports multi-precision arithmetic for fractional and integer operations.
2.2.2.8.1 Fractional Multi-Precision Arithmetic
A set of DALU instructions is provided for fractional multi-precision multiplications. When these instructions are used, the multiplier accepts some combinations of two’s complement signed and unsigned formats. Table 2-15 lists these instructions.
Figure 2-7 shows how the DMAC instruction is implemented.
Figure 2-7. DMAC Implementation
[Figure: two 16-bit operands feed the multiplier. The accumulator register passes through a shifter (>> 16) and is added to the product in a 40-bit accumulate.]
Table 2-15. Fractional Signed and Unsigned Two’s Complement Multiplication
Instruction    Description
MPYSU/MACSU    Fractional multiplication and multiply-accumulate with signed × unsigned operands
MPYUS/MACUS    Fractional multiplication and multiply-accumulate with unsigned × signed operands
MPYUU/MACUU    Fractional multiplication and multiply-accumulate with unsigned × unsigned operands
DMACSS         Fractional multiplication with signed × signed operands and 16-bit arithmetic right shift of the accumulator before accumulation
DMACSU         Fractional multiplication with signed × unsigned operands and 16-bit arithmetic right shift of the accumulator before accumulation
Figure 2-8 illustrates the use of these instructions in the case of a double-precision multiplication of 32-bit x 32-bit operands. The “Unsigned x Unsigned” operation is used to multiply or multiply-accumulate the unsigned low portion of one double-precision number with the unsigned low portion of the other double-precision number. The “Signed x Unsigned” and “Unsigned x Signed” operations are used to multiply or multiply-accumulate the signed high portion of one double-precision number with the unsigned low portion of the other double-precision number. The “Signed x Signed” operation is used to multiply or multiply-accumulate the two signed high portions of two signed double-precision numbers. The TFRx instructions in parentheses are optional instructions that are used only in case all 64 bits of the result are needed. Otherwise, the result is truncated to a 32-bit fraction.
Figure 2-8. Fractional Double-Precision Multiplication
[Figure: the 32-bit operands D0.h:D0.l and D1.h:D1.l are multiplied as four partial products, which are summed with sign extension into a 64-bit result held in D2.e:D2.h:D2.l:D4.l:D3.l.]
Instruction                    Partial Product   Operation
mpyuu D0,D1,D2 (tfr D2,D3)     D1.l × D0.l       Unsigned × Unsigned
dmacsu D0,D1,D2                D0.h × D1.l       Signed × Unsigned
macus D0,D1,D2 (tfr D2,D4)     D1.h × D0.l       Unsigned × Signed
dmacss D0,D1,D2                D1.h × D0.h       Signed × Signed
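The accumulation scheme of the four partial products can be checked with a Python sketch. It is an arithmetic illustration under stated assumptions, not an instruction-accurate model: Python integers are unbounded, so the 40-bit accumulator limit is ignored, the comments map each step to an instruction only by analogy, and the names `split32` and `frac_mul_32x32` are hypothetical.

```python
def split32(x):
    """Split a 32-bit pattern into a signed high word and an unsigned low word."""
    x &= 0xFFFFFFFF
    hi = x >> 16
    if hi & 0x8000:
        hi -= 0x10000
    return hi, x & 0xFFFF


def frac_mul_32x32(a, b):
    """64-bit fractional product of two 32-bit fractions, built from 16-bit products."""
    ah, al = split32(a)
    bh, bl = split32(b)
    acc = (al * bl) << 1                   # unsigned x unsigned (mpyuu)
    low = acc & 0xFFFF                     # optional transfer: result bits 15:0
    acc = (acc >> 16) + ((ah * bl) << 1)   # shift accumulator, signed x unsigned (dmacsu)
    acc += (al * bh) << 1                  # unsigned x signed (macus)
    mid = acc & 0xFFFF                     # optional transfer: result bits 31:16
    acc = (acc >> 16) + ((ah * bh) << 1)   # shift accumulator, signed x signed (dmacss)
    return (acc << 32) + (mid << 16) + low


# 0.5 * 0.25 = 0.125, which is 2^60 as a 64-bit fraction
print(hex(frac_mul_32x32(0x40000000, 0x20000000)))
```

The 16-bit right shift of the accumulator before each higher-order accumulation is what aligns the partial products, which is the role of the DMAC instructions in Table 2-15.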
Figure 2-9 illustrates the use of the fractional multiplication and multiply-accumulate instructions in the case of a mixed-precision multiplication of 16-bit by 32-bit signed operands. The “Signed x Unsigned” operation is used to multiply the signed high portion of the single-precision number with the unsigned low portion of the double-precision number. The “Signed x Signed” DMAC operation is used to multiply-accumulate the two signed high portions of the two signed operands. The TFRx instruction in parentheses is an optional instruction that is used only in case all 48 bits of the result are needed. Otherwise, the result is truncated to a 32-bit fraction.
Figure 2-9. Fractional Mixed-Precision Multiplication
[Figure: the 16-bit operand D0.h and the 32-bit operand D1.h:D1.l are multiplied as two partial products, which are summed with sign extension into a 48-bit result held in D2.e:D2.h:D2.l:D3.l.]
Instruction                    Partial Product   Operation
mpysu D0,D1,D2 (tfr D2,D3)     D0.h × D1.l       Signed × Unsigned
dmacss D0,D1,D2                D1.h × D0.h       Signed × Signed
2.2.2.8.2 Integer Multi-Precision Arithmetic
A set of DALU operations is provided for integer multi-precision multiplications. When these instructions are used, the multiplier accepts some combinations of two’s complement signed and unsigned formats. Both signed and unsigned multi-precision multiplication are supported. Table 2-16 lists these instructions.
Table 2-16. Integer Signed and Unsigned Two’s Complement Multiplication
Instruction      Description
IMPYSU/IMACSU    Integer multiplication and multiply-accumulate with signed x unsigned operands
IMPYUU           Integer multiplication with unsigned x unsigned operands
IMPYHLUU         Integer multiply unsigned x unsigned: first source from high portion, second from low portion
IMACLHUU         Integer multiply-accumulate unsigned x unsigned: first source from low portion, second from high portion
Figure 2-10 illustrates the use of these instructions in the case of a signed integer double-precision multiplication of 32-bit by 32-bit signed operands. In this example, only a 32-bit result is generated; the most significant 32 bits are shifted out. The “Unsigned x Unsigned” operation is used to multiply or multiply-accumulate the unsigned low portion of one double-precision number with the unsigned low portion of the other double-precision number. The “Signed x Unsigned” and “Unsigned x Signed” operations are used to multiply or multiply-accumulate the signed high portion of one double-precision number with the unsigned low portion of the other double-precision number.
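Figure 2-10. Signed Integer Double-Precision Multiplication
[Figure: the 32-bit operands D0.h:D0.l and D1.h:D1.l are multiplied as three partial products. The two cross products are accumulated in D3 and shifted left by one word, so that D3.l is zero, and the low product in D2 is then added, giving a 32-bit result in D3.h:D3.l.]
Instruction         Partial Product   Operation
impyuu D0,D1,D2     D1.l × D0.l       Unsigned × Unsigned
impysu D0,D1,D3     D0.h × D1.l       Signed × Unsigned
imacus D0,D1,D3     D1.h × D0.l       Unsigned × Signed
aslw D3
add D2,D3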
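The reason three partial products are enough for a 32-bit integer result can be checked with a Python sketch. It is an arithmetic illustration only; the function name `imul_32x32_low` is hypothetical, and the comments map steps to instructions only by analogy. The high words are treated as unsigned here, which is an assumption that is safe for this case: the sign of a high word only changes bits above bit 31, and those bits are shifted out.

```python
def imul_32x32_low(a, b):
    """Low 32 bits of a 32 x 32 integer product, built from 16-bit products."""
    al, ah = a & 0xFFFF, (a >> 16) & 0xFFFF
    bl, bh = b & 0xFFFF, (b >> 16) & 0xFFFF
    d2 = al * bl                     # low x low (impyuu)
    d3 = ah * bl + bh * al           # two cross products (multiply, then multiply-accumulate)
    d3 = (d3 << 16) & 0xFFFFFFFF     # shift left by one word (aslw)
    return (d2 + d3) & 0xFFFFFFFF    # add the low product (add)


# -3 * 5 = -15, which is $FFFFFFF1 as a 32-bit two's complement value
print(hex(imul_32x32_low(0xFFFFFFFD, 0x00000005)))
```

The high × high product is never computed because it only contributes to bits 32 and above.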
Figure 2-11 illustrates the use of these instructions in the case of an unsigned integer double-precision multiplication of 32-bit by 32-bit unsigned operands. In this example, only a 32-bit result is generated. The most significant 32 bits are shifted out. All multiplications are of the “Unsigned x Unsigned” type using different combinations of high and low portions.
Figure 2-11. Unsigned Integer Double-Precision Multiplication
[Figure: the 32-bit operands D0.h:D0.l and D1.h:D1.l are multiplied as three partial products. The two cross products are accumulated in D3 and shifted left by one word, so that D3.l is zero, and the low product in D2 is then added, giving a 32-bit result in D3.h:D3.l.]
Instruction           Partial Product   Operation
impyuu d0,d1,d2       D1.l × D0.l       Unsigned × Unsigned
impyhluu d0,d1,d3     D0.h × D1.l       Unsigned × Unsigned
imaclhuu d0,d1,d3     D1.h × D0.l       Unsigned × Unsigned
aslw d3
add d2,d3
2.2.2.9 Viterbi Decoding Support
A set of DALU and AGU operations is provided for Viterbi decoding kernels. A special MAX2VIT operation is defined. This instruction functions as a regular MAX2 instruction and is used to transfer two 16-bit maximum signed values. In addition, the MAX2VIT instruction updates two Viterbi flags (VFs) which reside in the status register as described in Section 3.1.1, “Status Register (SR),” on page 3-1. Complementary AGU move operations are provided (VSL instructions). For a full description of the Viterbi instructions, see Appendix A, “Viterbi Shift Left Move (AGU) VSL,” on page A-422.
2.3 Address Generation Unit
The AGU is one of the execution units in the SC140 core. The AGU performs effective address calculations using the integer arithmetic necessary to address data operands in memory. It also contains the registers used to generate the addresses. The AGU implements four types of arithmetic: linear, modulo, multiple wrap-around modulo, and reverse-carry. It operates in parallel with other chip resources to minimize address generation overhead. The AGU also generates change-of-flow program addresses and updates the stack pointer (SP) whenever needed.
2.3.1 AGU Architecture
The major components of the AGU are listed below:
Eight low bank address registers (R0–R7)
Eight high bank address registers (R8–R15), or alternatively, eight base address registers (B0–B7)
Two stack pointers (NSP, ESP), only one of which is active at a time (SP)
Four offset registers (N0–N3)
Four modifier registers (M0–M3)
A modifier control register (MCTL)
Two address arithmetic units (AAU)
One bit mask unit (BMU)
In this section, the registers are referred to as:
Rn for any of the R0–R15 address registers
Bn for any of the B0–B7 base address registers
Ni for any of the N0–N3 offset registers
Mj for any of the M0–M3 modifier registers
All the Rn, Bn, SP, Ni, and Mj registers are referred to as AGU registers. All of the AGU registers are 32 bits wide.
Figure 2-12 shows a block diagram of the AGU.
All sixteen address registers (R0–R15) as well as the NSP or ESP are used for generating addresses in the register indirect addressing modes. All four offset registers (N0–N3) can be used by all sixteen address registers. The four modifier registers (M0–M3) can only be used by the low bank of eight address registers (R0–R7).
The base address (Bn) registers are uniquely associated with the low bank of Rn registers such that B0 is used with R0, B1 with R1, and so on.
The BMU is used to perform bit mask operations such as setting, clearing, changing, or testing bits in a destination according to an immediate mask operand. Data is loaded into the BMU over the data memory buses XDBA or XDBB. The result is written back over XDBA or XDBB to the destinations in the next cycle. All bit mask instructions are typically executed in two cycles and work on 16-bit data. This data can be a memory location or a portion (high or low) of a register. For more information, see Section 2.3.6, “Bit Mask Instructions.”
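The four kinds of bit mask operation on 16-bit data can be sketched in Python as follows. The function names are hypothetical and are not SC140 mnemonics, and the test shown (all masked bits set) is only one assumed flavor of a bit test.

```python
def set_bits(data, mask):
    """Set every bit of the 16-bit data that is 1 in the mask."""
    return (data | mask) & 0xFFFF


def clear_bits(data, mask):
    """Clear every bit of the 16-bit data that is 1 in the mask."""
    return data & ~mask & 0xFFFF


def change_bits(data, mask):
    """Invert every bit of the 16-bit data that is 1 in the mask."""
    return (data ^ mask) & 0xFFFF


def all_bits_set(data, mask):
    """Test whether every masked bit is 1 (assumed test semantics)."""
    return (data & mask) == mask


# Clearing with the mask $004f, applied to a word of all ones
print(hex(clear_bits(0xFFFF, 0x004F)))
```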
Figure 2-12. AGU Block Diagram
[Figure: block diagram showing the address registers R0–R7 and R8/B0–R15/B7, the offset registers N0–N3, the modifier registers M0–M3, MCTL, NSP, and ESP; the address arithmetic units (AAU) with the program counter (PC) address and the 32-bit address buses PAB, XABA, and XABB; and the bit mask unit (BMU), connected to the 64-bit memory data buses XDBA (Memory Data Bus 1) and XDBB (Memory Data Bus 2).]
During every instruction cycle, the two AAUs can generate one 32-bit program memory address on the PAB (in case of change of flow) or two 32-bit data memory addresses (one on each of the XABA and XABB). Each AAU can generate an address to access a byte, a 16-bit word, a 32-bit long word, or a 64-bit two-word long operand in memory to feed into the DALU in a single cycle.
Each AAU can update one address register during one instruction cycle. The modifier control register (MCTL) specifies the type of arithmetic to be used in the address register update calculation. The address arithmetic instructions provide arithmetic operations for address calculations or for general purpose calculations.
The two AAUs are identical. Each contains a 32-bit full adder, called an offset adder, which can perform the following:
Add or subtract two AGU registers
Add an immediate value
Increment or decrement an AGU register
Add the PC
Add with reverse-carry
The offset adder can also perform compare or test operations as well as arithmetic and logical shifts. The offset values added in this adder can be pre-shifted left by 1, 2, or 3 bits according to the access width. In reverse-carry mode, the carry propagates in the opposite direction.
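Reverse-carry addition can be illustrated by bit-reversing both operands, adding them normally, and bit-reversing the sum. The Python sketch below uses a 3-bit width for readability, which is an assumption made for the example; the names `bit_reverse` and `reverse_carry_add` are hypothetical.

```python
def bit_reverse(value, width):
    """Reverse the order of the low `width` bits of value."""
    result = 0
    for _ in range(width):
        result = (result << 1) | (value & 1)
        value >>= 1
    return result


def reverse_carry_add(a, b, width):
    """Add a and b with the carry propagating from the MSB toward the LSB."""
    total = bit_reverse(a, width) + bit_reverse(b, width)
    return bit_reverse(total & ((1 << width) - 1), width)


# Stepping an 8-entry index by half the buffer size visits bit-reversed order.
sequence = []
index = 0
for _ in range(8):
    sequence.append(index)
    index = reverse_carry_add(index, 4, 3)
print(sequence)
```

This access pattern is the one typically needed to reorder the data of a radix-2 FFT.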
A second full adder, called a modulo adder, adds the summed result of the first full adder to a modulo value, M or minus M, where M is stored in the selected modifier register. In modulo mode, a modulo comparator tests whether the result is inside the buffer by comparing the results to the B register, choosing the correct result from the offset adder or the modulo adder.
For more information, see Section 2.3.5, “Arithmetic Instructions on Address Registers.”
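The cooperation of the offset adder, the modulo adder, and the modulo comparator can be sketched in Python. This is a minimal sketch of regular (single wrap-around) modulo addressing, assuming the offset magnitude does not exceed M; the function name `modulo_update` and its parameter names are hypothetical.

```python
def modulo_update(r, offset, base, modulus):
    """Update address r by offset inside the buffer [base, base + modulus - 1]."""
    candidate = r + offset              # offset adder result
    if candidate >= base + modulus:     # above the upper boundary B + M - 1
        return candidate - modulus      # modulo adder result (minus M)
    if candidate < base:                # below the lower boundary B
        return candidate + modulus      # modulo adder result (plus M)
    return candidate                    # still inside the buffer


# An 8-byte buffer starting at $1000: stepping by 4 from $1006 wraps to $1002.
print(hex(modulo_update(0x1006, 4, 0x1000, 8)))
```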
2.3.2 AGU Programming Model
The programming model of the AGU is shown in Figure 2-13. The address registers can be programmed for linear addressing, modulo addressing (regular or multiple wrap-around), and reverse-carry addressing. Automatic updating of address registers is available when using address register indirect addressing.
Figure 2-13. AGU Programming Model
[Figure: register map, with each register shown as bits 31–0. Address registers: R0–R7 and SP (NSP, ESP). Address registers / base address registers: R8/B0–R15/B7. Offset, modifier, and MCTL registers: N0–N3, M0–M3, and MCTL.]
2.3.2.1 Address Registers (R0–R15)
The sixteen 32-bit address registers R0–R15 can contain addresses or general-purpose data. These are 32-bit read/write registers. The 32-bit address in a selected address register is used in calculating the effective address of an operand. The contents of an address register can point directly to data, or can be used as an index.
The sixteen address registers R0–R15 are composed of two separate banks, a low bank (R0–R7) and a high bank (R8–R15). The high bank can be used alternatively as a base address register bank (B0–B7). Each address register Rn of the high bank can serve as an address register on condition that the corresponding B(n-8) register is not used. Both Rn and B(n-8) are mapped to the same physical register. For example, R8 is available only if R0 is not being used in modulo addressing since this requires the base address register B0. Use of both Rn and B(n-8) notations as source and destination of move-like instructions is permitted, regardless of the use of the physical register as Base modulo or as a pointer. For example:
    MOVE.L #ADDRESS, B0
    ...
    MOVE.W (R8), D0
See Section 2.3.2.6, “Modifier Control Register (MCTL),” for further information. The high bank of registers can only be used as pointers in the linear mode of addressing since the other modes of addressing are only encoded for the low bank in the MCTL register.
In addition, an address register can be post-updated according to the addressing mode selected. If an address register is updated, one of the modifier registers (Mj) can be used to specify the type of update arithmetic. Offset registers (Ni) are used for post-incrementing and indexing by offset.
The address register modification can be performed by either of the two AAUs. Most addressing modes modify the selected address register in a read-modify-write fashion. The address register is read, its contents are modified by the associated modulo arithmetic unit, and the register is written with the appropriate output of the AAU. The form of address register modification performed by the address arithmetic unit is controlled by the contents of the offset and modifier registers described in the following sections.
2.3.2.2 Stack Pointer Registers (NSP, ESP)
The SC140 core has two stack pointer registers: the normal stack pointer (NSP) and the exception stack pointer (ESP). These 32-bit registers are used implicitly in all PUSH and POP instructions. Only one stack pointer is active at one time according to the mode:
In Normal working mode, the NSP is used.
In Exception working mode, the ESP is used.
The EXP bit in the status register (SR) determines the active working mode. The active stack pointer (SP) is used explicitly for memory references when used with the address register indirect modes. The stack pointers point to the next unoccupied location in the stacks. They are post-incremented on all the implicit PUSH operations and pre-decremented on all the implicit POP operations.
Note: Both stack pointer registers must be initialized explicitly by the programmer after reset.
2.3.2.2.1 Shadow Stack Pointer Registers
Both stack pointers have shadow registers which contain a decremented value of the stack pointers. When the shadow register is not valid, the POP instruction is executed in two cycles. The first cycle is used to decrement the stack pointer. When the shadow register is valid, the POP instruction is executed in only one cycle.
When an SP is written by the AAU register transfer (TFRA), its shadow register automatically becomes invalid. When a PUSH/POP instruction is executed, the shadow register of the active SP becomes valid. As a result, during consecutive POPs, even in the worst case, only the first POP requires an additional cycle.
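The validity rules for the shadow register can be summarized as a small state model. The Python sketch below tracks only the valid flag and the POP cycle count stated above; the class name `StackPointerModel` is hypothetical, and the initial invalid state is an assumption, since the text does not describe the state after reset.

```python
class StackPointerModel:
    """Tracks whether the shadow register of one stack pointer is valid."""

    def __init__(self):
        self.shadow_valid = False   # assumed initial state

    def tfra_write(self):
        """Writing the SP with TFRA invalidates its shadow register."""
        self.shadow_valid = False

    def push(self):
        """A PUSH leaves the shadow register valid."""
        self.shadow_valid = True

    def pop(self):
        """Return the POP cycle count: 2 if the shadow was invalid, else 1."""
        cycles = 1 if self.shadow_valid else 2
        self.shadow_valid = True
        return cycles


sp = StackPointerModel()
sp.tfra_write()
pop_cycles = [sp.pop() for _ in range(3)]
print(pop_cycles)   # only the first POP after the TFRA pays the extra cycle
```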
2.3.2.2.2 Initializing ESP
ESP should be initialized using the AAU register transfer (TFRA) instruction. This guarantees a valid ESP value even if execution of this instruction is interrupted by an exception. The TFRA instruction is considered an address arithmetic operation. The ESP is updated at the address generation pipeline stage, avoiding pipeline conflicts.
2.3.2.3 Offset Registers (N0–N3)
The four 32-bit read/write offset registers N0–N3 can contain offset values used to increment or decrement address registers in address register update calculations. These registers can also be used for 32-bit general purpose storage. For example, the contents of an offset register can specify the offset into a table or the base of the table for indexed addressing, or can be used to step through a table at a specified rate (for example, five locations per step for waveform generation). Each address register can be used with each offset register. For example, R0 can be used with N0, N1, N2, or N3 for offset address calculations. The signed value in an offset register is pre-shifted to the left by 0, 1, 2, or 3 bits to align to the access width.
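The pre-shift of the offset can be sketched in Python as follows. This is a minimal illustration assuming linear arithmetic and a byte-addressed result truncated to 32 bits; the names `SHIFT_FOR_WIDTH` and `indexed_address` are hypothetical.

```python
# Left shift applied to the offset for each access width in bytes.
SHIFT_FOR_WIDTH = {1: 0, 2: 1, 4: 2, 8: 3}


def indexed_address(rn, ni, access_bytes):
    """Address formed from Rn plus the signed offset Ni scaled to the access width."""
    return (rn + (ni << SHIFT_FOR_WIDTH[access_bytes])) & 0xFFFFFFFF


# Element 5 of a table of 16-bit words that starts at $1000
print(hex(indexed_address(0x1000, 5, 2)))
```

Scaling the offset in this way lets the same offset value count elements rather than bytes, whatever the access width.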
2.3.2.4 Base Address Registers (B0–B7)
The eight 32-bit read/write base address registers B0–B7 are used in modulo calculations. Each B register is associated with an R register (B0 with R0, and so on). When activating the modulo addressing mode, the B register contains the lower boundary value of the modulo buffer. The upper boundary of the modulo buffer is calculated by B+M-1, where M is the modifier register associated with the R register by MCTL.
When not used for modulo addressing, these registers can be used as high bank address registers (R8–R15). Both Rn and B(n-8) share the same physical register. For example, if R0 is not programmed for modulo addressing, the base address register B0 can serve as an additional address register R8.
2.3.2.5 Modifier Registers (M0–M3)
The four 32-bit read/write modifier registers M0–M3 can contain the value of the modulus modifier. These registers can also be used for general-purpose storage. When activating the modulo arithmetic, the contents of Mj specify the modulus. Each low address register can be used with each modifier register as programmed in the MCTL register.
2.3.2.6 Modifier Control Register (MCTL)
The MCTL register is a 32-bit read/write register. This control register is used to program the address mode (AM) for each of the eight low address registers (R0–R7). The addressing mode of the high address register file (R8–R15) cannot be programmed and functions in linear addressing mode only. The format of MCTL is shown in Figure 2-14.
Figure 2-14. Modifier Control Register (MCTL) Format
Bits 31–28    Bits 27–24    Bits 23–20    Bits 19–16
R7 AM[3:0]    R6 AM[3:0]    R5 AM[3:0]    R4 AM[3:0]
Bits 15–12    Bits 11–8     Bits 7–4      Bits 3–0
R3 AM[3:0]    R2 AM[3:0]    R1 AM[3:0]    R0 AM[3:0]
The AM bits (AM3, AM2, AM1, AM0) associated with each address register (R0–R7) reflect the address modifier mode of this address register as shown in Table 2-17. Each of the Rn registers can use M0, M1, M2, or M3 as their associated modulo register either in modulo addressing mode, or in multiple wrap-around modulo addressing mode. When activating the modulo addressing mode, the corresponding B register is used to define the lower boundary value (B0 with R0, and so on). The linear or the reverse-carry addressing modes can also be used, freeing the B register to be used as an additional linear address register.
The high bank of the address register file (R8–R15) can only be used in linear addressing mode. Each Rn (n = 8:15) is available only if the corresponding B(n-8) register is not used since both Rn and B(n-8) are mapped to the same physical register.
MCTL is initialized to zero at reset, setting a default linear mode for all Rn registers. All other AM field combinations are reserved and should not be used.
Table 2-17. Address Modifier (AM) Bits
AM3  AM2  AM1  AM0   Address Modifier Modes
0    0    0    0     Linear addressing
0    0    0    1     Reverse-carry addressing
1    0    0    0     Modulo addressing, M0 used
1    0    0    1     Modulo addressing, M1 used
1    0    1    0     Modulo addressing, M2 used
1    0    1    1     Modulo addressing, M3 used
1    1    0    0     Multiple wrap-around modulo addressing, M0 used
1    1    0    1     Multiple wrap-around modulo addressing, M1 used
1    1    1    0     Multiple wrap-around modulo addressing, M2 used
1    1    1    1     Multiple wrap-around modulo addressing, M3 used
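The MCTL layout and the AM encodings of Table 2-17 can be combined into a small decoder. The Python sketch below is illustrative; the names `AM_MODES` and `decode_mctl` and the mode strings are hypothetical.

```python
# AM[3:0] encodings from Table 2-17; anything else is reserved.
AM_MODES = {0b0000: "linear", 0b0001: "reverse-carry"}
for j in range(4):
    AM_MODES[0b1000 | j] = f"modulo, M{j}"
    AM_MODES[0b1100 | j] = f"multiple wrap-around modulo, M{j}"


def decode_mctl(mctl):
    """Return the addressing mode of R0-R7 for a 32-bit MCTL value."""
    modes = {}
    for n in range(8):
        am = (mctl >> (4 * n)) & 0xF   # Rn uses bits 4n+3 down to 4n
        modes[f"R{n}"] = AM_MODES.get(am, "reserved")
    return modes


# R0 in modulo mode with M0, R1 in multiple wrap-around mode with M1
modes = decode_mctl(0x000000D8)
print(modes["R0"], "|", modes["R1"], "|", modes["R2"])
```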
2.3.3 Addressing Modes
The SC140 core provides four types of addressing modes:
Register direct
Address register indirect
PC relative
Special
The addressing modes are related to where the operands are to be found and how the address calculations are to be made. These modes are described in the following sections:
2.3.3.1 Register Direct Modes
The register direct addressing modes specify that the operand is in one or more of the DALU registers, AGU registers, or control registers, and are classified as register references.
Data or Control Register Direct — The operand is in one, two, or four DALU registers as specified
in a portion of the data bus movement field in the instruction. An example is: mac d4,d5,d6, which uses data registers d4, d5, and d6 as sources for the multiply-accumulate operation. This addressing mode is also used to specify a control register operand for special instructions.
Address Register Direct — The operand is in one of the twenty-seven AGU registers (R0–R7,
R8–R15/B0–B7, N0–N3, M0–M3, MCTL, N/ESP) specified by a field in the instruction. An example is addl1a r0,r1, which performs a 1-bit arithmetic left shift on the data in R0, and adds the result to the data in R1.
2.3.3.2 Address Register Indirect Modes
The address register indirect modes specify that the address register is used to point to a memory location. The term indirect is used because the register contents are not the operand itself, but rather the operand address. These addressing modes specify that an operand is in a memory location and specify the effective address of that operand. These references are classified as memory references. The term “index” refers to an offset stored in a register. The term “displacement” refers to an offset from an immediate in the instruction.
No Update, (Rn) — The operand address is in the address register. The contents of the address
register are unchanged by executing the instruction. For R0-R7, the contents of the modifier control register (MCTL) are ignored. An example is: bmclr.w #$004f,(r4). A word is read from the memory location whose address is in r4, operated on, and written back to the same location. The address in r4 is unchanged.
Post-increment, (Rn)+ — The operand address is in the address register. After the operand address
is used, it is incremented by the access width (1, 2, 4, or 8 bytes) and stored in the same address register. The access width is the number of bytes used by the active instruction on the memory data bus. Incrementing the operand address by the access width places the next available byte address in the register. The type of arithmetic used for updating R0-R7 is determined by programming the MCTL register. An example is: move.f (r3)+,d2. The data in the location identified by the value in r3 is moved to data register d2. Then the value in r3 is incremented by two.
Post-decrement, (Rn)- — The operand address is in the address register. After the operand address
is used, it is decremented by the access width (1, 2, 4, or 8 bytes) and stored in the same address register. The type of arithmetic used for updating R0-R7 is determined by programming the MCTL register. An example is: move.l (r3)-,d2. In this case, the value in r3 is decremented by four after the move has taken place.
Post-increment by Offset Ni, (Rn) + Ni — The operand address is in the address register. After the
operand address is used, it is incremented or decremented by an amount determined by the signed contents of the Ni register pre-shifted to the left by 0, 1, 2, or 3 bits according to the access width. The result is stored in the same address register. The type of arithmetic used for updating R0-R7 is determined by programming the MCTL register. The contents of the Ni register are unchanged. An example is: move.w d3,(r2)+n3. The access width is two, so the increment is twice the value in the n3 register.
Indexed By Offset N0, (Rn + N0) — The operand address is the sum of the contents of the address
register and the signed contents of the N0 register, pre-shifted to the left by 0, 1, 2, or 3 bits according to the access width. The type of arithmetic used for updating R0-R7 is determined by programming the MCTL register. The contents of the Rn and N0 registers are unchanged. For example: move.b d6,(r3+n0). The access width is one, so the contents of the n0 register are used directly to modify the address before the move is done.
Note that only the N0 offset register can be used in this addressing mode.
Indexed by Address Register Rm, (Rn + Rm) — The operand address is the sum of the contents
of the address register Rn and the contents of the address register Rm, pre-shifted to the left by 0, 1, 2, or 3 bits according to the access width. The type of arithmetic used for updating R0-R7 is determined by programming the MCTL register. The contents of the Rn and Rm registers are unchanged. An example is: move.l (r0+r2),d6. Here, the access width is four, so the value in r2 is shifted left two bits before adding to the address in r0.
Note that only address registers (R0–R7) can be used as Rm.
Short Displacement, (Rn + x) — The operand address is the sum of the contents of the address
register Rn and a short displacement x that occupies three bits in the instruction word. The displacement (unsigned) is first shifted to the left by 0, 1, 2, or 3 bits according to the access width. It is then zero-extended to 32 bits and added to Rn to obtain the operand address. Thus, the displacement can range from [0] to [+7] bytes, words, long words, or two long words according to the access width. The contents of the Rn register are unchanged. The type of arithmetic used for updating R0-R7 is determined by programming the MCTL register. An example is: move.l d4,(r3+$1c). The access width is four, and the displacement encoded in the instruction is seven (4 x 7 = 28 = $1c).
Word Displacement, (Rn + xxxx) — The operand address is the sum of the contents of the address
register Rn and an immediate displacement. The displacement is a signed 15-bit word that requires a second instruction word. It is sign-extended to 32 bits and then added to Rn to obtain the operand address. Thus, the displacement can range from [-16,384] to [+16,383] bytes, [-8192] to [+8191] words, [-4096] to [+4095] long words, or [-2048] to [+2047] two long words according to the access width. The contents of the Rn register are unchanged. The type of arithmetic used for updating R0-R7 is determined by programming the MCTL register.
SP Short Displacement, (SP – xx) — The instruction word contains a 5-bit or 6-bit short unsigned
immediate index field. This field is first shifted to the left by 1 or 2 bits according to the access width, then zero-extended to form a 32-bit offset and subtracted from the active stack pointer (NSP in Normal mode, ESP in Exception mode) to obtain the operand address. Thus, the displacement can range from [0] to [31/63] words or long words according to the access width. The contents of the active SP register are unchanged. The type of arithmetic used is always linear. An example is: move.w #$ffff,(sp–$3e). The encoded displacement is 31, the maximum value of five bits, and the actual displacement is 62 ($3e), since the access width is two.
SP Word Displacement, (SP + xxxx) — The operand address is the sum of the contents of the active
stack pointer (SP) and an immediate displacement. The displacement is a signed 15-bit word that requires a second instruction word. It is sign-extended to 32 bits and added to the active stack pointer (NSP in Normal mode, ESP in Exception mode) to obtain the operand address. Thus, the displacement can range from [-16,384] to [+16,383] bytes, [-8192] to [+8191] words, [-4096] to [+4095] long words, or [-2048] to [+2047] two long words according to the access width. The contents of the active SP register are unchanged. The type of arithmetic used is always linear. An example is: move.l (sp+$2000),d2.e. Here, the positive value $2000 is added to the active stack pointer before the memory access.
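The address register indirect calculations described above can be sketched as follows for the linear modifier mode only. This is a minimal illustrative model in Python, not SC140 code; the function names are hypothetical, and modulo or reverse-carry updates selected through MCTL are deliberately left out.

```python
# Illustrative model of address register indirect modes in linear mode.
MASK32 = 0xFFFFFFFF
SHIFT = {1: 0, 2: 1, 4: 2, 8: 3}  # access width in bytes -> left shift


def post_increment(rn, width):
    """(Rn)+ : return (operand address, updated Rn)."""
    return rn, (rn + width) & MASK32


def post_decrement(rn, width):
    """(Rn)- : return (operand address, updated Rn)."""
    return rn, (rn - width) & MASK32


def post_increment_by_offset(rn, ni, width):
    """(Rn)+Ni : the signed Ni is scaled by the access width."""
    return rn, (rn + (ni << SHIFT[width])) & MASK32


def indexed(rn, index, width):
    """(Rn+N0) or (Rn+Rm) : address only, registers are unchanged."""
    return (rn + (index << SHIFT[width])) & MASK32


def short_displacement(rn, field, width):
    """(Rn+x) : 3-bit unsigned instruction field scaled by the access width."""
    if not 0 <= field <= 7:
        raise ValueError("the short displacement field is 3 bits wide")
    return (rn + (field << SHIFT[width])) & MASK32
```

With an access width of four and an encoded field of seven, `short_displacement` adds 28 ($1c) to Rn, which matches the move.l d4,(r3+$1c) example.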
2.3.3.3 PC Relative Mode
The PC relative addressing mode is used to calculate the program destination of change-of-flow instructions such as branches (BRA). In the PC relative addressing mode, the instruction encoding contains a signed displacement operand. The operand address is obtained by left-shifting (multiplying by two) the displacement and adding the result to the value of the program counter (PC). The operand is left-shifted because the addresses of the program instructions are word-aligned, and memory addressing is in units of bytes. The arithmetic used is always linear. For example, consider bra _label2. Assume that PC=$0010 and that _label2 is at location $0020. The encoded displacement will be ($0020 – $0010)/2 = $0008. The number of bits occupied by the displacement in the instruction differs with the different kinds of PC relative instructions. In all cases, the displacement is first sign-extended to 32 bits, then multiplied by two, and added to the PC to obtain the operand address.
In the one-word conditional branch instructions, the displacement occupies 8 bits of the instruction word and can range from [-256] to [254] bytes. In the one-word unconditional branch instructions, the displacement occupies 10 bits of the instruction word and can range from [-1024] to [1022] bytes. In the two-word branch instructions, the displacement occupies 20 bits and can range from [-1,048,576] to [1,048,574] bytes. In the DOSETUP instruction, the displacement occupies 16 bits of the instruction. The displacement for the start address (SA) can range from [-65,536] to [65,534] bytes.
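The encoding rule above (field = byte distance divided by two, sign-extended and doubled when decoded) can be checked with a short sketch. The Python below is illustrative only; `encode_displacement`, `decode_target`, and `byte_reach` are hypothetical names, and the reach is expressed in bytes, that is, the encoded field multiplied by two.

```python
# Illustrative model of PC relative displacement encoding.
def encode_displacement(pc, target, bits):
    """Return the unsigned bit pattern of the displacement field."""
    delta = target - pc
    if delta % 2:
        raise ValueError("PC and target must differ by an even byte count")
    field = delta // 2
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= field <= high:
        raise ValueError("target is out of range for this field width")
    return field & ((1 << bits) - 1)


def decode_target(pc, field, bits):
    """Sign-extend the field, multiply by two, and add it to the PC."""
    if field & (1 << (bits - 1)):
        field -= 1 << bits
    return (pc + 2 * field) & 0xFFFFFFFF


def byte_reach(bits):
    """Return (lowest, highest) byte offset reachable with a bits-wide field."""
    return -(1 << bits), (1 << bits) - 2
```

For the bra _label2 example, `encode_displacement(0x10, 0x20, 8)` yields $08, and `byte_reach` gives the reach of the 8-bit, 10-bit, 16-bit, and 20-bit fields.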
2.3.3.4 Special Addressing Modes
The special addressing modes do not use an address register when specifying an effective address. They use either an immediate value that is included in the instruction (as the data value itself or as the address of the data value), or a register that is implicitly referenced by the instruction.
Immediate Short Data — A 5-bit, 6-bit, or 7-bit operand is part of the instruction operation word.
The 5-bit zero-extended operand is used for DALU and AGU arithmetic instructions. The 6-bit zero-extended operand is used for DALU instructions to move short immediate data to an LCn register. The 7-bit sign-extended operand is used for immediate moves to a register. This reference is classified as a program reference. An example is: doen2 #$3f. The value $3f, 63, is loaded to loop counter 2.
Immediate Word Data — This addressing mode requires a one-word instruction extension. The
immediate data is a 16-bit operand. This reference is classified as a program reference. An example is: doen2 #$40. The value 64 is loaded to loop counter 2. The value exceeds the 6-bit limit for immediate short data, so an extra word is needed for the encoding.
Immediate Long Data — This addressing mode requires a two-word instruction extension. The
immediate data is a 32-bit operand. This reference is classified as a program reference. An example is: move.l #$f00d0d01,n0. The 32-bit unsigned value is moved to the general register n0.
Absolute Word Address — This addressing mode requires a one-word instruction extension. The
operand address occupies 16 bits in the instruction operation words, and is zero-extended to form a 32-bit address. This reference is classified as a memory reference. An example is: move.w ($8a20),d0.
Absolute Long Address — This addressing mode requires a two-word instruction extension. A
32-bit address is contained in the instruction words. This reference is classified as a memory reference. An example is: move.w ($34008a20),d0.
Absolute Jump Address — The operand occupies 32 bits in the instruction operation words. It
requires a two-word instruction extension. This reference is classified as a program reference. An example is: jmp lbl4, where the instruction is encoded with the program memory address of lbl4.
Implicit Reference — Some instructions make implicit reference to the PC, normal or exception
stack, loop registers (SA0, SA1, SA2, SA3, LC0, LC1, LC2, LC3), or status register (SR). These registers are implied by the instruction, and their use is defined by the individual instruction descriptions. An example is: tfra osp,r2, which transfers the 32-bit contents of the other (non-active) stack pointer to address register R2.
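The choice between the short and word immediate forms in the doen2 examples follows from the field widths. The Python sketch below is illustrative only; the helper names are hypothetical, and treating the 16-bit immediate word as unsigned for a loop count is an assumption made for this example.

```python
# Illustrative range checks for immediate operand fields.
def fits_unsigned(value, bits):
    """True if value fits in a zero-extended field of the given width."""
    return 0 <= value < (1 << bits)


def fits_signed(value, bits):
    """True if value fits in a sign-extended field of the given width."""
    return -(1 << (bits - 1)) <= value < (1 << (bits - 1))


def loop_count_extension_words(count):
    """Return 0 if the 6-bit short form suffices, 1 if a word extension is needed.

    Assumption: the 16-bit immediate word is interpreted as unsigned here.
    """
    if fits_unsigned(count, 6):
        return 0
    if fits_unsigned(count, 16):
        return 1
    raise ValueError("count does not fit in a 16-bit immediate")
```

With this model, a loop count of $3f needs no extension word, while $40 needs one, as in the two doen2 examples.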
2.3.3.5 Memory Access Width
The SC140 core supports variable width access to data memory. With every memory access, the core sends one of four signals to the memory interface to designate whether the access width is 8 bits, 16 bits, 32 bits, or 64 bits wide. The access width is determined by the type of MOVE instruction being used. For example, MOVE.B is used for byte access and MOVE.W is used for word access. MOVE.L, MOVE.2F, and MOVE.2W are used for long-word access, and MOVE.2L, MOVE.4F, and MOVE.4W are used for two long-word access.
The memory addresses are always in units of bytes. For example, addresses for two-word MOVE operations to/from memory are available in multiples of four in order to best align the data with the byte addressing.
Address calculations and register update calculations are performed according to the memory access width as shown in Table 2-18.
2.3.3.6 Memory Access Misalignment
Each access to the memory generated by the core should be aligned according to the access type. If the alignment rule is violated, erroneous data may be fetched from the memory. In addition, an exception may be generated to identify that an unaligned access occurred. For more information, see Section 5.8, “Exception Processing,” on page 5-46.
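The alignment rule amounts to requiring that the byte address be a multiple of the access width. A minimal illustrative check in Python (the name `is_aligned` is hypothetical and is not part of any SC140 tool) might look like this:

```python
# Illustrative alignment check: the address must be a multiple of the
# access width (1, 2, 4, or 8 bytes).
def is_aligned(address, width):
    if width not in (1, 2, 4, 8):
        raise ValueError("access width must be 1, 2, 4, or 8 bytes")
    return address % width == 0
```

For example, address $8a20 is acceptable for byte, word, long-word, and two long-word accesses, while $8a22 is acceptable only for byte and word accesses.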
Table 2-18. Access Width Support for Address and Register Update Calculations

                                                                        Memory Access Width
Addressing Mode                       Calculation                       Byte   Word   Long   Two Long
Post-increment (Rn)+,                 Rn register post-increment or     1      2      4      8
Post-decrement (Rn)-                  post-decrement by
Post-increment by Offset (Rn)+Ni      Rn register post-increment by     Ni*1   Ni*2   Ni*4   Ni*8
Indexed by Offset N0 (Rn + N0)        Actual address offset             N0     2*N0   4*N0   8*N0
Indexed by Address Register Rm        Actual address offset             Rm     2*Rm   4*Rm   8*Rm
(Rn + Rm)
Short Displacement (Rn + x)           Actual address displacement       x      x      x      x
Word Displacement (Rn + xxxx)         Actual address displacement       xxxx   xxxx   xxxx   xxxx
SP update in Push/Pop                 SP post-increment or              8      8      8      8
                                      pre-decrement by
SP Short Displacement (SP - xx)       Actual address displacement       NA     xx     xx     NA
SP Word Displacement                  Actual address displacement       xxxx   xxxx   xxxx   xxxx
Table 2-19 summarizes the memory address alignment rule for each type of memory access.
Table 2-19. Memory Address Alignment
Access Type            Aligned Address
Byte access            Any address
Word access            Multiple of 2
Long-word access       Multiple of 4
Two long-word access   Multiple of 8
2.3.3.7 Addressing Modes Summary
Table 2-20 provides a summary of the addressing modes described in the previous sections. The Operand Reference columns are abbreviated as follows:
S = Software Stack Reference in data memory (uses NSP or ESP according to mode)
C = Program Control Unit Register Reference
D = DALU Register Reference
A = AGU Register Reference
P = Program Memory Reference
X = Data Memory Reference
Table 2-20. Addressing Modes Summary

Addressing Mode                            R0-R7 Uses   Operand Reference   Assembler Syntax
                                           MCTL         (S C D A P X)

Register Direct
  Data or Control Register                              √√                  Dn
                                                                            Dn Dm
                                                                            Dn Dm Di Dj
                                                                            MCTL
                                                                            SR, EMR, VBA
                                                                            LC0, LC1, LC2, LC3
                                                                            SA0, SA1, SA2, SA3
  Address Register (Rn)                                                     Rn
  Address Modifier Register (Mj)                                            Mj
  Base Address Register (Bn)                                                Bn
  Address Offset Register (Ni)                                              Ni
  Stack Pointer                                                             SP

Address Register Indirect
  No Update, (Rn)                          No                               (Rn)
  Post-increment, (Rn)+                    Yes                              (Rn)+
  Post-decrement, (Rn)–                    Yes                              (Rn)–
  Post-increment by Offset Ni, (Rn)+Ni     Yes                              (Rn) + Ni
  Indexed by Offset N0, (Rn+N0)            Yes                              (Rn + N0)
  Indexed by Address Register Rm, (Rn+Rm)  Yes                              (Rn + Rm)
  Short Displacement, (Rn+x)               Yes                              (Rn + x)
  Word Displacement, (Rn+xxxx)             Yes                              (Rn + xxxx)
  SP Short Displacement, (SP-xx)                        √√                  (SP - xx)
  SP Word Displacement, (SP+xxxx)                       √√                  (SP + xxxx)

PC Relative
  PC Relative with Displacement                                             #xx (8 bits)
                                                                            #xxx (10 bits)
                                                                            #xxxx (16 bits)
                                                                            #xxxxx (20 bits)

Special
  Immediate Short Data                                                      #xx (5, 6, or 7 bits)
  Immediate Word Data                                                       #xxxx (16 bits)
  Immediate Long Data                                                       #xxxxxxxx (32 bits)
  Absolute Word Address                                                     xxxx (16 bits)
  Absolute Long Address                                                     xxxxxxxx (32 bits)
  Absolute Jump Address                                                     xxxxxxxx (32 bits)
  Implicit Reference                                    √√

Note: The “—” that appears in the “R0-R7 Uses MCTL” heading means that it is not applicable for that addressing mode.
2.3.4 Address Modifier Modes
The AAU supports linear, reverse-carry, modulo, and multiple wrap-around modulo arithmetic types for address register indirect modes operating on R0-R7. These arithmetic types allow the easy creation of data structures in memory for First-In/First-Out (FIFO) queues, delay lines, circular buffers, stacks, and reverse-carry Fast Fourier Transform (FFT) buffers.
Data is manipulated by updating address registers (Rn) used as pointers rather than moving large blocks of data. The contents of the modifier control register MCTL define the type of arithmetic to be performed for address calculations. For modulo arithmetic, the address modifier register Mj specifies the modulus. Each of the address register lower banks (R0–R7) can be used with any of the modifier registers (M0–M3) as programmed in the MCTL register.
2.3.4.1 Linear Addressing Mode
Linear addressing is useful for general-purpose addressing such as stacks. In linear addressing mode, the address is calculated using standard binary arithmetic. The entire memory space is addressable. Linear addressing mode is selected by setting the AM3–0 bits to 0000 in the MCTL register. This is the default state.
2.3.4.2 Reverse-carry Addressing Mode
Reverse-carry addressing is useful for 2^k-point FFT addressing. This mode is selected for R0-R7 by setting the AM3-0 bits to 0001 in the MCTL register. Address modification is performed in the hardware by propagating the carry from each pair of added bits in the reverse direction (from the MSB end toward the LSB end). For the +Ni addressing mode, reverse-carry is equivalent to:
Bit-reversing the contents of Rn (redefining the MSB as the LSB, the next MSB as bit 1, and so on)
Shifting the offset value in Ni left by 0, 1, 2, or 3 according to the access width
Bit-reversing the shifted Ni
Adding normally
Bit-reversing the result
This address modification is useful for addressing the twiddle factors in 2^k-point FFT addressing as well as to unscramble 2^k-point FFT data. The range of values for Ni is 0 to 2^32 - 1, which allows reverse-carry addressing for FFTs up to 4,294,967,296 points.
Note: To achieve correct reverse-carry accessing for access widths of 2, 4, or 8, the last 1, 2, or 3 least significant bits (respectively) of the address calculation result are forced to zero.
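The five equivalent steps listed above, together with the note on forcing the least significant bits to zero, can be sketched in Python. This is an illustrative software model of the described behavior, not a description of the hardware implementation; `bitrev32` and `reverse_carry_update` are hypothetical names.

```python
# Illustrative model of the (Rn)+Ni update in reverse-carry mode.
MASK32 = 0xFFFFFFFF
SHIFT = {1: 0, 2: 1, 4: 2, 8: 3}  # access width in bytes -> left shift


def bitrev32(x):
    """Reverse the order of the 32 bits of x."""
    return int(format(x & MASK32, "032b")[::-1], 2)


def reverse_carry_update(rn, ni, width=1):
    """Return the updated Rn after a reverse-carry add of Ni."""
    offset = (ni << SHIFT[width]) & MASK32            # shift Ni by the access width
    total = (bitrev32(rn) + bitrev32(offset)) & MASK32  # add in the reversed domain
    result = bitrev32(total)                          # reverse the sum back
    return result & ~(width - 1) & MASK32             # force the low bits to zero


def sequence(start, ni, width, count):
    """Return the first count addresses generated from start."""
    addresses, rn = [], start
    for _ in range(count):
        addresses.append(rn)
        rn = reverse_carry_update(rn, ni, width)
    return addresses
```

Under this model, `sequence(0, 4, 1, 8)` visits the byte offsets 0, 4, 2, 6, 1, 5, 3, 7, the bit-reversed order of an 8-point buffer.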
2.3.4.3 Modulo Addressing Mode
Modulo address modification is useful for creating circular buffers for FIFO queues, delay lines, and sample buffers up to 2^31 bytes long.
Modulo addressing is selected by writing the AM3-0 bits of the MCTL register (as shown in Table 2-17) as well as writing the desired modulus to the corresponding Mj register. Address modification is performed in modulo M, where M ranges from 1 to 2^31 - 1. Modulo M arithmetic causes the address register values to remain within an address range of size M, thus defining a buffer with a lower and an upper address boundary.
Each base address register (Bn register) is associated with an Rn register (B0 with R0, and so on). Each register Rn has one Mj register assigned to it by encoding in the MCTL. The lower boundary value of the buffer resides in the Bn register, and the upper boundary is calculated as Bn+Mj-1. Mj must be smaller than 2^31 - 1 (Mj < 2^31 - 1).
The modulo addressing definition, using a base register (Bn) and a modulo register (Mj), enables the programmer to locate the modulo buffer at any address. The buffer start address is only required to be aligned to the access width.
The address pointer Rn is not required to start at the lower address boundary, nor to end on the upper address boundary. Rn can initially point anywhere (aligned to its access width) within the defined modulo address range, Bn ≤ Rn < Bn + Mj. Assuming the (Rn)+ indirect addressing mode, if the address register pointer increments past the upper boundary of the buffer (base address + Mj-1), it wraps around through the base address (lower boundary). Alternatively, assuming the (Rn)- indirect addressing mode, if the address decrements past the lower boundary (base address), it wraps around through the base address + Mj-1 (upper boundary).
The following constraints apply:
1. For proper modulo addressing, if an offset Ni is used in the address calculation, the 32-bit
absolute effective value |Ni| must be less than or equal to Mj, where “effective” means the programmed Ni is multiplied by the access width. For example, move.w (r0)+n0,d0 translates to the restriction 2*n0 ≤ Mj, and move.l (r0)+,d0 translates to 4 ≤ Mj. If effective Ni > Mj, the result of the address calculation is undefined. Multiple wrap-around modulo addressing supports the situation of effective Ni greater than Mj.
2. Mj must be aligned to the access width used. For example, if the buffer is used with a
MOVE.2L instruction, Mj must be aligned to 8 (be a multiple of 8). If the modulus is less than the access width, the data accessed as well as the address calculations are undefined.
3. When Bn is used as a base address register, the use of R(n+8) as a pointer is illegal since this
is the same physical register.
Modulo addressing is illustrated in Figure 2-15. Addresses will be kept within the eleven addresses shown. For the instruction move.w (r0+$000e),d0, the access is made from $26 (38) if the base address is $20, the modulus is $c, and r0 is $24. The operation is 36 + 14 = 50, which wraps to 38 in modulus 12 with base address 32 (50 – 44 + 32 = 38).
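Figure 2-15. Modulo Addressing Example
[Figure: modulo buffer with lower boundary $0020 = B (32), upper boundary label $002c = B + M – 1 (44), and M = 12; addresses 36 and 38 are marked.]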
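The wrap-around in the example above can be reproduced with a small model. The Python sketch below is illustrative only and assumes a single boundary correction per address calculation, consistent with the constraint that the effective offset must not exceed the modulus; `modulo_wrap` and `modulo_address` are hypothetical names.

```python
# Illustrative model of modulo addressing with base B and modulus M.
def modulo_wrap(address, base, modulus):
    """Apply one wrap so that base <= address < base + modulus."""
    if address >= base + modulus:
        return address - modulus
    if address < base:
        return address + modulus
    return address


def modulo_address(rn, offset, base, modulus):
    """Return the address obtained by adding a signed byte offset to Rn."""
    return modulo_wrap(rn + offset, base, modulus)
```

With base $20, modulus $c, and r0 = $24, `modulo_address(0x24, 0x0E, 0x20, 0x0C)` returns $26, matching the move.w (r0+$000e),d0 example.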
Table 2-21 describes the modulo register values and the corresponding address calculation.
2.3.4.4 Multiple Wrap-Around Modulo Addressing Mode
Multiple wrap-around addressing is useful for decimation, interpolation, and waveform generation. The multiple wrap-around capability can be used for argument reduction. In multiple wrap-around modulo addressing mode, the modulus M is a power of 2 in the range of 2^1 to 2^31. The value M-1 is stored in the modifier register (Mj). The B registers B0 to B7 are not used for multiple wrap-around modulo addressing; therefore, their corresponding R8–R15 registers can be used for linear addressing.
The lower and upper boundaries are derived from the contents of Mj. The lower boundary (base address) value has zeros in the k LSBs, where M = 2^k, and therefore must be a multiple of M. The Rn register involved in the memory access is used to set the MSBs of the base address. The base address is set so that the initial value in the Rn register is within the lower and upper boundaries. The upper boundary is the lower boundary plus the modulo size minus one (base address + M–1).
The size of the modulo buffer must be aligned to (be a multiple of) the access width. If the modulus is less than the access width, the data accessed as well as the address calculations are undefined.
If an offset Ni is used in the address calculations, it is not required to be less than or equal to M for proper modulo addressing. The multiple wrap-around modulo addressing mode supports unlimited boundary wraps.
When using the (Rn)+ and (Rn)- addressing modes with a modulus 2^k ≥ 8, there is no functional difference between the multiple wrap-around and normal modulo modes since the address can only be wrapped around once.
As an example, consider the instruction move.w (r0 + $0042),d0. If the mctl is set to $000c, and m0 is set to $000f, then the modulus M is 16. If r0 is initially $24 (36), the lower boundary is $20 (32) and the upper boundary is $2f (47). The memory access is done from address $26 (38), calculated by 36 + 66 = 102, 102 – 48 = 54, 54 – 3 x 16 = 6, 6 + 32 = 38.
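Because the modulus is a power of two and the base address is a multiple of it, the calculation in this example reduces to masking. The Python sketch below is an illustrative model under that reading of the text; `multi_wrap_address` is a hypothetical name.

```python
# Illustrative model of multiple wrap-around modulo addressing.
# Mj holds M - 1, where the modulus M is a power of two.
def multi_wrap_address(rn, offset, mj):
    """Return the wrapped address for Rn plus a signed byte offset."""
    if mj < 1 or (mj + 1) & mj:
        raise ValueError("Mj + 1 must be a power of two, at least 2")
    base = rn & ~mj                      # MSBs of Rn give the lower boundary
    low_bits = (rn + offset) & mj        # any number of wraps collapses here
    return (base | low_bits) & 0xFFFFFFFF
```

With r0 = $24, an offset of $42, and m0 = $f, the model returns $26, as in the worked example.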
Table 2-21. Modulo Register Values for Modulo Addressing Mode
Modifier Mj    Address Calculation Arithmetic
$0000 0000     Unused
$0000 0001     Modulo 1
$0000 0002     Modulo 2
...
$7FFF FFFE     Modulo 2^31 - 2
$7FFF FFFF     Modulo 2^31 - 1
Table 2-22 describes the modulo register Mj values and the corresponding multiple wrap-around address calculation.
2.3.5 Arithmetic Instructions on Address Registers
The SC140 core provides arithmetic instructions on the address registers (R0–R15), offset registers (N0–N3), the stack pointer (SP), and the program counter (PC).
Address modification modes can affect the arithmetic results stored in R0-R7 using instructions ADDA, SUBA, ADDL1A, or ADDL2A. In addition, an address calculation that increments or decrements address register R0-R7 is affected by the modifier mode. When updating R0-R7 in modulo addressing mode, the modulo registers hold the modulus.
Table 2-23 lists the arithmetic instructions that are executed in the AGU unit. A more detailed description of the operations is provided in Appendix A, “SC140 DSP Core Instruction Set.”
Table 2-22. Modulo Register Values for Wrap-Around Modulo Addressing Mode
Modifier Mj    Address Calculation Arithmetic
$0000 0001     Multiple Wrap-around Modulo 2
$0000 0003     Multiple Wrap-around Modulo 4
$0000 0007     Multiple Wrap-around Modulo 8
...
$7FFF FFFF     Multiple Wrap-around Modulo 2^31
$FFFF FFFF     Linear
Table 2-23. AGU Arithmetic Instructions
Instruction   Description
ADDA          AGU Add (affected by the modifier mode)
ADDL2A        AGU Add with 2-bit left shift of source operand (affected by the modifier mode)
ADDL1A        AGU Add with 1-bit left shift of source operand (affected by the modifier mode)
ASL2A         AGU Arithmetic shift left by 2 bits (32-bit)
ASLA          AGU Arithmetic shift left (32-bit)
ASRA          AGU Arithmetic shift right (32-bit)
CMPEQA        AGU Compare for equal
CMPGTA        AGU Compare for greater than
CMPHIA        AGU Compare for higher (unsigned)
DECA          AGU Decrement register (affected by the modifier mode)
DECEQA        AGU Decrement and set T if result is zero
DECGEA        AGU Decrement and set T if result is equal to or greater than zero
INCA          AGU Increment register (affected by the modifier mode)
LSRA          AGU Logical shift right (32-bit)
SUBA          AGU Subtract (affected by the modifier mode)
SXTA.B        AGU Sign-extend byte
SXTA.W        AGU Sign-extend word
TFRA          AGU Register transfer
TSTEQA        AGU Test for equal to zero
TSTEQA.W      AGU Test for equal to zero on lower 16 bits
TSTGEA        AGU Test for greater than or equal to zero
TSTGTA        AGU Test for greater than zero
ZXTA.B        AGU Zero-extend byte
ZXTA.W        AGU Zero-extend word
2.3.6 Bit Mask Instructions
The SC140 core provides bit mask instructions on all address registers (R0–R15), all DALU registers (D0–D15), all control registers (EMR, VBA, SR, MCTL), and all memory locations.
Bit mask instructions provide an easy way of setting, clearing, inverting, or testing a selected but not necessarily adjacent group of bits in a register or memory location.
All bit mask instructions work on 16-bit data. This data can be the contents of a memory location or a portion (high or low) of a register.
Only a single bit mask instruction is allowed in one execution set since only one execution unit exists for these instructions. A subgroup of the bit mask instructions (BMTSET) supports hardware semaphores. For more information, see Section 2.3.6.1, “Bit Mask Test and Set (Semaphore Support) Instruction.”
Table 2-24 lists the bit mask instructions that are executed in the BMU.
2.3.6.1 Bit Mask Test and Set (Semaphore Support) Instruction
The bit mask test and set instruction (BMTSET) provides support for hardware semaphores. A semaphore is a signal which can be set to indicate whether a program resource can be accessed or not. The destination of this instruction can be a register or a memory location in either internal or external memory. If the semaphore indicates that the resource is available, the T bit has the value 0. If the semaphore indicates that the resource is not available (T = 1), a jump can be made to skip the resource code.
This instruction performs the following tasks:
1. Reads the destination register, tests the data, and sets the T bit, if every bit that has the value
1 in the mask is 1 in the destination.
2. Writes back to the destination a word with ones for the masked bits, and the original
destination bits for the unmasked bits.
3. Sets the T bit if the set (write) failed.
Normally, the BMTSET consists of three indivisible operations: read, update the T bit, and write. A set (write) failed condition occurs if the destination failed to be written indivisibly from the previous read operation of that BMTSET instruction. The memory subsystem signals a write failure to the core if a memory access that is initiated by another master source intervenes between the read and the write accesses of the BMTSET operation. As a result of the non-exclusive write indication, the T bit is set, signalling that the resource may not be available, thereby avoiding a hazard condition.
Table 2-24. AGU Bit Mask Instructions (BMU)
Instruction   Description
AND.W         Logical AND on a 16-bit operand
BMCHG         Bit mask change
              Inverts every bit in the destination (register or memory) that has the value 1 in the mask.
BMCLR         Bit mask clear
              Clears every bit in the destination (register or memory) that has the value 1 in the mask.
BMSET         Bit mask set
              Sets every bit position in the destination (register or memory) that has the value 1 in the mask.
BMTSET        Bit mask test (if set) and set
              Sets the T bit if every bit that has the value 1 in the mask is 1 in the destination (register or memory). Sets (writes) every bit in the destination (register or memory) that has the value 1 in the mask, and sets the T bit if the set (write) failed. See Section 2.3.6.1, “Bit Mask Test and Set (Semaphore Support) Instruction.”
BMTSTC        Bit mask test if clear
              Sets the T bit if every bit position that has the value 1 in the mask is 0 in an operand.
BMTSTS        Bit mask test if set
              Sets the T bit if every bit position that has the value 1 in the mask is 1 in an operand.
EOR.W         Logical exclusive OR on a 16-bit operand
NOT.W         Binary inversion of a 16-bit operand
OR.W          Logical OR on a 16-bit operand
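The mask semantics of the set, clear, change, and test instructions can be expressed on 16-bit values as follows. This Python sketch is an illustrative model of the descriptions in Table 2-24 only; the lowercase function names are hypothetical, and the test functions return the value that the T bit would take.

```python
# Illustrative 16-bit model of the bit mask operations in Table 2-24.
WORD = 0xFFFF


def bmset(dest, mask):
    """Set every destination bit whose mask bit is 1."""
    return (dest | mask) & WORD


def bmclr(dest, mask):
    """Clear every destination bit whose mask bit is 1."""
    return dest & ~mask & WORD


def bmchg(dest, mask):
    """Invert every destination bit whose mask bit is 1."""
    return (dest ^ mask) & WORD


def bmtsts(operand, mask):
    """T is set if every masked bit is 1 in the operand."""
    return (operand & mask & WORD) == (mask & WORD)


def bmtstc(operand, mask):
    """T is set if every masked bit is 0 in the operand."""
    return (operand & mask & WORD) == 0
```

For instance, applying `bmclr` with the mask $004f to the word $00ff leaves $00b0.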
2.3.6.1.1 Example of Normal Usage of the Semaphoring Mechanism
The following sequence accesses a resource controlled by a semaphore.
label : BMTSET.W #mask,(R0)
JT label
Normally, the mask enables only one bit. In this case, the memory destination pointed to by (R0) is read, and the enabled bit is tested. The enabled bit is then set, and the memory destination is written back.
The T bit is set if the enabled bit was originally 1 (meaning that the semaphore was occupied), or if the write-back failed. A T bit value of TRUE indicates to the conditional jump that the attempt to obtain the resource has failed, and that the jump should be taken. The T bit is cleared if the enabled bit was originally zero and the write-back succeeded. This means that the semaphore was not allocated. Therefore, the resource was available, and the instruction was successful in setting the semaphore exclusively. A successful allocation write-back results.
When the destination is a register, the write is always successful.
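The outcome of a single BMTSET attempt can be summarized in a small model. The Python sketch below is illustrative only: `bmtset` is a hypothetical function, and the `write_ok` parameter stands in for the write result reported by the memory subsystem, which this sketch does not simulate.

```python
# Illustrative model of one BMTSET attempt on a 16-bit destination.
def bmtset(dest, mask, write_ok=True):
    """Return (word written back, T bit).

    T is set if every masked bit was already 1 in the destination,
    or if the write-back was reported as failed (write_ok is False).
    """
    already_set = (dest & mask) == mask
    written = (dest | mask) & 0xFFFF
    return written, (already_set or not write_ok)
```

A free semaphore (`bmtset(0x0000, 0x0001)`) yields T = False, so the JT in the sequence above falls through. An occupied semaphore or a failed write yields T = True, and the loop retries.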
2.3.6.2 Semaphore Hardware Implementation
During the address phase of the read and write accesses associated with the BMTSET instruction, an output of the core is asserted. This assertion indicates that the read and the following write are part of a read-modify-write sequence.
During the data phase of the write access, a core input provides the core with the result of the access (de-asserted = write failed).
2.3.7 Move Instructions
The SC140 instruction set supports various types of move instructions which differ in the following properties:
Access width: Byte (8 bits), word (16 bits), long-word (32 bits), and two long words (64 bits)
Data type: Signed integer, unsigned integer, fractional (with or without limiting)
Multi-register moves: Some move operations split data between two or four registers
Addressing mode: For example, absolute, relative to an address pointer (with various offset and post-update options), and relative to the stack pointer
The move instructions transfer data over the XDBA and XDBB data buses. Move instructions do not affect the status register, with the exception of the sticky scaling bit when a DALU register is read.
Table 2-25 lists the move instructions. The suffix just before the period in the MOVE nomenclature indicates the following:
None = Signed
U = Unsigned
S = Scaling and limiting (saturation) enabled
The suffix just after the period in the MOVE nomenclature indicates the following:
B = Byte
W = Integer word (16 bits)
L = Long word (32 bits)
F = Fractional word (16 bits)
A 2 or a 4 may precede the last suffix to indicate a move of two or four such operands (for example, MOVE.2W or MOVE.4F).
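The naming convention above can be decoded mechanically. The following Python sketch is illustrative only: decode_move and the keys of the returned dictionary are hypothetical names, and the function does not check that a given combination actually exists in Table 2-25.

```python
WIDTH_BITS = {"B": 8, "W": 16, "L": 32, "F": 16}
MODES = {
    "MOVE": "signed",
    "MOVEU": "unsigned",
    "MOVES": "scaling and limiting",
}


def decode_move(mnemonic):
    """Split a MOVE mnemonic into the properties encoded by its suffixes."""
    opcode, _, suffix = mnemonic.upper().partition(".")
    if opcode not in MODES or not suffix:
        raise ValueError("not a MOVE mnemonic with a width suffix: " + mnemonic)
    count = 1
    if suffix[0] in "24":  # optional 2 or 4 before the width letter
        count = int(suffix[0])
        suffix = suffix[1:]
    if suffix not in WIDTH_BITS:
        raise ValueError("unknown width suffix in " + mnemonic)
    return {
        "mode": MODES[opcode],
        "count": count,
        "element_bits": WIDTH_BITS[suffix],
        "fractional": suffix == "F",
        "access_bits": count * WIDTH_BITS[suffix],
    }
```

For example, decode_move("MOVES.4F") describes four 16-bit fractional words moved with scaling and limiting in a single 64-bit access.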
Table 2-25. AGU Move Instructions
Instruction   Description
MOVE.2F       Move two fractional words from memory to a register pair
MOVE.2L       Move two longs to/from a register pair
MOVE.2W       Move two integer words to/from memory and a register pair
MOVE.4F       Move four fractional words from memory to a register quad
MOVE.4W       Move four integer words to/from memory and a register quad
MOVE.B        Move byte to/from memory
MOVE.F        Move fractional word to/from memory
MOVE.L        Move long
MOVE.W        Move integer word to/from memory, or immediate to register or memory
MOVEc         Conditional move between address registers
MOVES.2F      Move two fractional words to memory with scaling and limiting enabled
MOVES.4F      Move four fractional words to memory with scaling and limiting enabled
MOVES.F       Move fractional word to memory with scaling and limiting enabled
MOVES.L       Move long to memory with scaling and limiting enabled
MOVEU.B       Move unsigned byte from memory
MOVEU.L       Move unsigned long from immediate
MOVEU.W       Move unsigned integer word from memory or from immediate
VSL.2F        Viterbi shift left: specialized move to support Viterbi kernel
VSL.2W        Viterbi shift left: specialized move to support Viterbi kernel
VSL.4F        Viterbi shift left: specialized move to support Viterbi kernel
VSL.4W        Viterbi shift left: specialized move to support Viterbi kernel
Integer moves from memory (byte, word, long, two long) are right-aligned in the destination register, and by default are sign-extended to the left. Unsigned moves are marked with “U” (for example, MOVEU.B), and zero extended in the destination register. A schematic representation of integer moves from memory into a 40-bit register is shown in Figure 2-16. Moves from registers to memory use the appropriate portion from the source register. Moves to registers of less than 40 bits behave the same as in Figure 2-16 up to their bit length.
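The right alignment and extension rules for integer moves can be sketched as follows. This Python model is illustrative only; load_integer and REG_BITS are hypothetical names, and the function covers a single 40-bit destination register.

```python
REG_BITS = 40  # width of a data register, per the text above


def load_integer(raw, width_bits, signed=True):
    """Right-align a memory operand in a 40-bit register and extend it."""
    mask = (1 << width_bits) - 1
    raw &= mask
    if signed and raw >> (width_bits - 1):
        # Sign extension: copy the operand's top bit into all upper bits.
        raw |= ((1 << REG_BITS) - 1) & ~mask
    # Unsigned moves (the "U" forms) leave the upper bits at zero.
    return raw
```

For example, the byte $80 becomes $FFFFFFFF80 with a signed move such as MOVE.B and $0000000080 with MOVEU.B.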
Figure 2-16. Integer Move Instructions
Layout of each 40-bit destination register (bits 39:0) shown in the figure:
MOVE.B (signed byte move)         bits 39:8 sign extension, bits 7:0 data
MOVEU.B (unsigned byte move)      bits 39:8 zero extension, bits 7:0 data
MOVE.W (signed word move)         bits 39:16 sign extension, bits 15:0 data
MOVEU.W (unsigned word move)      bits 39:16 zero extension, bits 15:0 data
MOVE.L (signed long move)         bits 39:32 sign extension, bits 31:0 data
MOVEU.L (unsigned long move)      bits 39:32 zero extension, bits 31:0 data
MOVE.2W (signed two word move)    two registers, each laid out as for MOVE.W
MOVE.4W (signed four-word move)   four registers, each laid out as for MOVE.W
MOVE.2L (signed two long move)    two registers, each laid out as for MOVE.L
Fractional moves are supported only to DALU registers. Moves from memory are put in the high portion of the data register, sign-extended to the extension, and zero-filled in the low portion. MOVE.L and MOVE.2L may also be considered fractional moves since alignment in the destination register is the same for integer long moves and fractional long moves. A schematic representation of fractional moves from memory to 40-bit data registers is shown in Figure 2-17.
Figure 2-17. Fractional Move Instructions
The four instructions MOVES.F, MOVES.2F, MOVES.4F, and MOVES.L move data from data registers to the memory with scaling and limiting. The first three operate on 16-bit data. The MOVES.L instruction performs 32-bit scaling and limiting before the move.
For all moves on the SC140, the syntax requires that the source of the data be specified first followed by the destination (SRC, DST). The source and destination are separated by a comma with no spaces either before or after the comma.
Multi-register move instructions originate or update several registers. Registers that are accessed as part of the same move instruction are specified with a colon separator. For example, a MOVE.4F from a memory location pointed to by R0 to the registers D0, D1, D2, and D3 is written as:
MOVE.4F (R0),D0:D1:D2:D3
In this case, let the address in R0 be noted as A0. The fractional word in location A0 then goes to D0, the word in A0 + 2 goes to D1, the word in A0 + 4 goes to D2, and the word in A0 + 6 goes to D3. The addresses increment by two since the addressing unit is always a byte. Moves to or from more than one register are treated according to the same principle.
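The address assignment for a multi-register move can be written out as a small Python sketch. It is illustrative only; register_addresses is a hypothetical helper, and the base address used in the usage note is arbitrary.

```python
def register_addresses(base, registers, element_bytes):
    """Map each register of a colon-separated list to its byte address."""
    names = registers.split(":")
    # Addresses are byte addresses, so each register advances by the
    # operand size in bytes (2 for a word, 4 for a long word).
    return {name: base + index * element_bytes
            for index, name in enumerate(names)}
```

For MOVE.4F (R0),D0:D1:D2:D3 with R0 = $1000, this gives D0 at $1000, D1 at $1002, D2 at $1004, and D3 at $1006.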
A special MOVE.L instruction supports moving data to and from data register extensions (Dn.e). In order to support full saving and restoring of the machine state, extension moves also include the limit bit Ln of the register, and are therefore nine bits wide. In one case of the MOVE.L instruction, two extensions belonging to two consecutive data registers are moved concurrently from the registers to the memory as part of a 32-bit access.
Figure 2-17 layout (40-bit data register, bits 39:0), for each destination register of MOVE.F (fractional move), MOVE.2F (fractional double move), and MOVE.4F (fractional quad-move):
bits 39:32   sign extension
bits 31:16   fractional word from memory
bits 15:0    zero-fill
The extension bits of the even data register occupy bits 0 to 8 (bit 8 is the limit bit). The extension bits of the odd register occupy bits 16 to 24 (bit 24 is the limit bit) as described in Figure 2-18.
Figure 2-18. Bit Allocation in MOVE.L D0.e:D1.e
Moves from memory to an extension are only to single registers. However, they are also 32 bits wide and implicitly assume the bit allocation described above according to the register number (odd or even). For example, move.l $4F42,d3.E is the instruction for moving bits 24:16 from the memory location addressed by $4F42 to the limit bit and extension bits of the odd register d3. See Appendix A, “Move Long Word (AGU) MOVE.L,” for more information about the moves to and from extension registers.
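The bit allocation of the extension long word can be sketched in Python. The sketch is illustrative only: pack_extensions and unpack_extension are hypothetical names, the extension is taken as 8 bits wide (register bits 39:32), and the bits outside the two 9-bit fields are written as zero.

```python
def pack_extensions(even_ext, even_limit, odd_ext, odd_limit):
    """Build the 32-bit long word for a move of two extensions."""
    return ((even_ext & 0xFF)             # bits 7:0   even extension
            | ((even_limit & 1) << 8)     # bit 8      even limit bit
            | ((odd_ext & 0xFF) << 16)    # bits 23:16 odd extension
            | ((odd_limit & 1) << 24))    # bit 24     odd limit bit


def unpack_extension(long_word, register_number):
    """Return (extension, limit_bit) for a move from memory to one Dn.e."""
    shift = 16 if register_number % 2 else 0  # odd registers use bits 24:16
    field = (long_word >> shift) & 0x1FF      # nine bits: limit + extension
    return field & 0xFF, field >> 8
```

A move to an odd register such as d3 therefore reads only bits 24:16 of the long word, and a move to an even register reads only bits 8:0.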
2.4 Memory Interface
The SC140 core interfaces to memory via the following:
32-bit program memory address bus (PAB) and 128-bit program memory data bus (PDB)
32-bit data memory address bus A (XABA) and 64-bit data memory data bus A (XDBA)
32-bit data memory address bus B (XABB) and 64-bit data memory data bus B (XDBB)
Control signals such as read and write access strobes as well as access width control
The SC140 does not specify a memory subsystem architecture, only the minimum requirements for correct execution of SC140 code. Listed below are requirements for all memory designs that interface with the SC140 core.
The SC140 core supports only unified memory designs. Memory is regarded as a single space. There is no distinction between program memory locations and data memory locations. Each memory location possesses a unique address that is accessible from either the program or data buses. From the core’s perspective, there is only one memory address “a,” which can hold either data or program information.
Data must be byte-addressable and accessible by the two data memory buses.
All data access widths used by the SC140 core must be supported by the memory: byte (8 bits), word (16 bits), long word (32 bits), and double-long word or four-word (64 bits). One of four control signals indicates to the memory which access width is needed for each access.
Multi-byte memory accesses must support both endian modes.
Memory must resolve access ordering on a cycle-by-cycle basis. All accesses on a given cycle must be completed before proceeding to accesses in the next cycle. Note that an access conflict may occur when there are multiple requests to access the same memory module in the same cycle. An access conflict is resolved by a stall cycle (per conflict), which serializes the multiple requests.
Multiple access rules in a given cycle are as follows:
- Multiple read or write accesses to different memory locations execute without any predetermined sequence.
- In cases where multiple accesses to the same memory location occur, the access sequence is program fetch, data read, and data write.
- If two write operations access the same byte in memory in the same cycle, the operation is illegal and the result is undefined. The same byte may be written by different but overlapping words or long words. The memory subsystem should be able to detect these cases and issue an imprecise interrupt to the core. The use of this interrupt is optional. Refer to Section 5.3.3.2, “Implicit Push/Pop Memory Timing,” on page 5-24 for more details.
Accesses to non-existent memory locations are illegal and the result is undefined. The memory subsystem can issue an imprecise interrupt to the core. The use of this interrupt is optional.
2.4.1 SC140 Endian Support
The term “little endian” is defined as a computer architecture such that given a multi-byte operand representation, bytes at lower addresses have lower numeric significance. Each word is stored little end first. In little endian mode, the MOVE.W D0,(R0) instruction (for example) stores bits 7–0 of D0 into address (R0), and bits 15–8 into address (R0 + 1).
In “big endian” architectures, the most significant byte has the lowest address, and each word is stored big end first. In big endian mode, the MOVE.W D0,(R0) instruction stores bits 15–8 of D0 into address (R0), and bits 7–0 into address (R0 + 1).
The SC140 supports both big and little endian architectures through the big endian memory (BEM) mode bit in the EMR. This bit samples a core input signal when exiting the reset state, and cannot be changed during normal operation.
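The two byte orders can be compared with a short Python sketch. It is illustrative only: store is a hypothetical helper, memory is modeled as a dictionary from byte address to byte value, and the big_endian argument stands in for the BEM mode bit.

```python
def store(memory, address, value, size_bytes, big_endian):
    """Store the low size_bytes bytes of a register value at address."""
    value &= (1 << (8 * size_bytes)) - 1  # keep only the bits being moved
    order = "big" if big_endian else "little"
    for offset, byte in enumerate(value.to_bytes(size_bytes, order)):
        memory[address + offset] = byte
```

For a MOVE.W of $1234, little endian mode puts $34 at (R0) and $12 at (R0 + 1), and big endian mode puts $12 at (R0) and $34 at (R0 + 1).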
Figure 2-19 shows an example of how data is transferred from a register to memory in the two endian modes.
Figure 2-19. Endian Example
2.4.1.1 SC140 Bus Structure
The entire memory space of the SC140 core is unified. The memory supports two parallel 64-bit data accesses and one 128-bit program fetch. All can occur in parallel.
The two data buses that connect between the core and the memory are each 64 bits wide. Instructions such as load to registers and store to memory utilize the bus according to the application requirement. Different versions of the instructions are used for different bandwidths such that:
MOVE.B loads or stores bytes (8 bits).
MOVE.W and MOVE.F load or store integer or fractional words (16 bits).
MOVE.2W, MOVE.2F, and MOVE.L load or store double-integers, double-fractions, and long words respectively (32 bits).
MOVE.4W, MOVE.4F, and MOVE.2L load or store quad-integers, quad-fractions, and double-long words respectively (64 bits).
Figure 2-20 shows the data buses between the SC140 core and the memory.
Figure 2-20. Basic Connection between SC140 Core and Memory
2.4.1.2 Memory Organization
Different types of data are stored differently in memory for each of the two endian modes. However, the data retains the same meaning. For example, 64 bits of data can be represented by any of the following:
Eight 8-bit bytes
Four 16-bit numbers
Two 32-bit numbers
Figure 2-21 shows how data is organized in memory in the two endian modes. Each data unit is a byte made of two hexadecimal numbers.
Figure 2-21. Memory Organization of Big and Little Endian Mode
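Figure 2-20 contents: the SC140 core connects to the unified memory space through the 128-bit PDB bus, the 64-bit XDBA bus, and the 64-bit XDBB bus.
Figure 2-21 contents (byte values in hexadecimal, one 64-bit row per line; -- marks a byte for which no value is shown):
Little Endian (byte offsets 7 6 5 4 3 2 1 0)
Row 0         -- -- 0f 0e 0d 0c 0b 0a
Row 8         07 08 05 06 03 04 01 02
Row 16 ($10)  cc dd ee ff 11 22 33 44
Big Endian (byte offsets 0 1 2 3 4 5 6 7)
Row 0         0a 0b 0c 0d 0e 0f -- --
Row 8         01 02 03 04 05 06 07 08
Row 16 ($10)  11 22 33 44 cc dd ee ff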
Table 2-26 describes the data representation for each 64-bit row in Figure 2-21.
2.4.1.3 Data Moves
Data moves are executed by moving core registers to and from memory over one of the data buses (XDBA or XDBB).
The data registers can be accessed with three types of data:
Long type access, writing or reading 32-bit operands.
Word type access, writing or reading 16-bit operands.
Byte type access, writing or reading 8-bit operands.
Figure 2-22 shows an example of data transfer in big and little endian modes.
Table 2-26. Data Representation in Memory
Representation Type    Value
Eight 8-bit bytes      A0 = $0a, A1 = $0b, A2 = $0c, A3 = $0d, A4 = $0e, A5 = $0f
Four 16-bit numbers    A8 = $0102, A10 = $0304, A12 = $0506, A14 = $0708
Two 32-bit numbers     A16 = $11223344, A20 = $ccddeeff
Figure 2-22. Data Transfer in Big and Little Endian Modes
For single-register moves, assuming an equivalent memory map in big and little endian modes, the byte organization on the buses is identical in both modes. However, the memory subsystem must route the data bus bytes to different memory addresses for each supported endian mode.
Figure 2-22 contents (the memory contents are those of Figure 2-21; the data bus contents are the same in big and little endian modes):
Instruction        Data Bus Contents
MOVE.B (A0),D0     xxxx xxxx xxxx xx0a
MOVE.B (A2),D0     xxxx xxxx xxxx xx0c
MOVE.W (A8),D0     xxxx xxxx xxxx 0102
MOVE.L (A16),D0    xxxx xxxx 1122 3344
2.4.1.4 Multi-Register Moves
For accesses involving more than one register, such as with MOVE.2W or MOVE.4F instructions, the SC140 ensures that data originating from a specific register reaches the same address in memory in both little and big endian modes (and the other way round). The memory system does not distinguish between MOVE.L and MOVE.2W transfers that have the same data width. Memory treats them both like a long word transfer. If the data bus were the same for both endian modes in a two-register transfer, the data from the two registers would end up in different addresses. To correct for this, the byte order on the buses for multi-register transfers is adjusted for the little endian mode. The memory also does not distinguish between transfers of four words or two long words. It treats them both like a string of eight bytes. The bus structure for the little endian mode corrects for both cases to ensure that register data is stored at the same address for both modes.
As an example of the problem that arises if a correction is not made, consider the following case: The instruction move.2w d0:d1,(a8) transfers two integer words from data registers d0 and d1 to memory at address a8. For d0 = $0102 and d1 = $0304, the data bus would be $01020304, and the memory would be accessed for a width of 32 bits. For big endian mode, the memory would look like:
Address  Data
a8       01
a9       02
a10      03
a11      04
For little endian mode, the memory would be accessed for a width of 32 bits (like a long word), and then it would write the data little end first such that the memory would look like:
Address  Data
a8       04
a9       03
a10      02
a11      01
Note that the data word from d0, $0102, is at a different address for the two modes. If the data bus were modified by the core to $03040102, then the memory for little endian mode would look like:
Address  Data
a8       02
a9       01
a10      04
a11      03
This is the desired result. This effect is achieved in little endian mode through logic in the core, which modifies the data on the data bus to the memory for both reads and writes.
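The correction can be checked with a short Python sketch. It is illustrative only: bus_value and memory_bytes are hypothetical names, and the sketch models the register-order reversal on the bus in little endian mode and the byte order in which memory stores a long word, not the actual core logic.

```python
def bus_value(words, little_endian):
    """Combine 16-bit register words (first register first) for the bus."""
    # In little endian mode the core reverses the register order on the bus.
    ordered = list(reversed(words)) if little_endian else list(words)
    value = 0
    for word in ordered:
        value = (value << 16) | (word & 0xFFFF)
    return value


def memory_bytes(value, size_bytes, little_endian):
    """Bytes in ascending address order, as memory stores a bus value."""
    return list(value.to_bytes(size_bytes, "little" if little_endian else "big"))
```

With d0 = $0102 and d1 = $0304, the little endian bus value is $03040102, and memory then holds 02, 01, 04, 03 at a8 through a11, so each register's word sits at the same address as in big endian mode.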
Figure 2-23 shows examples of multi-register data transfers in big and little endian modes.
Figure 2-23. Multi-Register Transfer in Big and Little Endian Modes
Note: The only exceptions to the behavior described above are the VSL instructions. These instructions cause source data words from the core to be written to different memory locations in big and little endian modes. For more information about the VSL instructions, refer to Table 2-27 on page 2-64, and Appendix A, “Viterbi Shift Left Move (AGU) VSL,” on page A-422.
Figure 2-23 contents (the memory contents are those of Figure 2-21):
Instruction                     Registers                                      Big Endian Data Bus    Little Endian Data Bus
(a) MOVE.2W (A8),D0:D1          D0 = 0102, D1 = 0304                           xxxx xxxx 0102 0304    xxxx xxxx 0304 0102
(b) MOVE.4W (A8),D0:D1:D2:D3    D0 = 0102, D1 = 0304, D2 = 0506, D3 = 0708     0102 0304 0506 0708    0708 0506 0304 0102
(c) MOVE.2L (A16),D0:D1         D0 = 11223344, D1 = ccddeeff                   1122 3344 ccdd eeff    ccdd eeff 1122 3344
2.4.1.5 Instruction Word Transfers
Instruction words are transferred to the core from memory over the program data bus (PDB) to special instruction registers in the program dispatch unit (PDU).
The instruction registers can be accessed only with aligned access of 128-bit width (8 instruction words). Figure 2-24 shows the program memory organization in big and little endian modes. Note that program data consists of a series of 16-bit instructions. In this example the assembler determines the instructions to be:
word address $00 instruction $a0b0
word address $02 instruction $c0d0
word address $04 instruction $e0f0
word address $06 instruction $a1b1
word address $08 instruction $c1d1
word address $0a instruction $e1f1
word address $0c instruction $a2b2
word address $0e instruction $c2d2
word address $10 instruction $e2f2
.....
These are to be placed in memory as shown in the following figure.
Figure 2-24. Program Memory Organization in Big and Little Endian Modes
The assembler outputs a byte stream to the loader and therefore corrects for the byte address reversal inside each 16-bit instruction to achieve the memory results above.
Big Endian Assembler Output     Little Endian Assembler Output
byte address $00 data $a0       byte address $00 data $b0
byte address $01 data $b0       byte address $01 data $a0
byte address $02 data $c0       byte address $02 data $d0
byte address $03 data $d0       byte address $03 data $c0
byte address $04 data $e0       byte address $04 data $f0
.....                           .....
Figure 2-24 contents (16-bit instruction words, one 64-bit row per line):
Little Endian (byte offsets 7 6 5 4 3 2 1 0)
Row 0         a1b1 e0f0 c0d0 a0b0
Row 8         c2d2 a2b2 e1f1 c1d1
Row 16 ($10)  e3f3 c3d3 a3b3 e2f2
Big Endian (byte offsets 0 1 2 3 4 5 6 7)
Row 0         a0b0 c0d0 e0f0 a1b1
Row 8         c1d1 e1f1 a2b2 c2d2
Row 16 ($10)  e2f2 a3b3 c3d3 e3f3
Figure 2-25 shows the memory accesses to the same memory area by both program fetches as well as data accesses in big and little endian modes.
Figure 2-25. Instruction Moves in Big and Little Endian Modes
The Program Bus contents always appear as eight 16-bit little endian packed instructions, the memory system performing a word (instruction) reversal in the case of big endian (program bus only).
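The resulting program bus contents can be reproduced with a short Python sketch. It is illustrative only: program_bus is a hypothetical name, and the function models the result seen by the core (the instruction at the lowest address in the least significant 16 bits) rather than the memory system logic that produces it.

```python
def program_bus(fetch_bytes, big_endian):
    """Build the 128-bit program bus value from 16 bytes of memory."""
    bus = 0
    for index in range(8):
        pair = fetch_bytes[2 * index:2 * index + 2]
        # Each 16-bit instruction is stored in the memory's own byte order.
        word = int.from_bytes(pair, "big" if big_endian else "little")
        # The instruction at the lowest address occupies bits 15:0.
        bus |= word << (16 * index)
    return bus
```

For the instruction sequence $a0b0, $c0d0, ..., $c2d2 above, both endian modes yield the bus value c2d2_a2b2_e1f1_c1d1_a1b1_e0f0_c0d0_a0b0.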
Figure 2-25 contents (the memory contents are those of Figure 2-24):
Instruction                 Little Endian Data Bus    Big Endian Data Bus
MOVE.4W from address $00    a1b1_e0f0_c0d0_a0b0       a0b0_c0d0_e0f0_a1b1
MOVE.L from address $08     xxxx_xxxx_e1f1_c1d1       xxxx_xxxx_c1d1_e1f1
MOVE.W from address $08     xxxx_xxxx_xxxx_c1d1       xxxx_xxxx_xxxx_c1d1
MOVE.B from address $08     xxxx_xxxx_xxxx_xxd1       xxxx_xxxx_xxxx_xxc1
FETCH (always 128-bit aligned) from address A0, program bus contents (for both endian cases): c2d2_a2b2_e1f1_c1d1_a1b1_e0f0_c0d0_a0b0. In big endian mode, the memory system changes big endian to little.
2.4.1.6 Memory Access Behavior in Big/Little Endian Modes
Table 2-27 shows the representation of the move instructions in big and little endian modes. In the examples shown in this table, it is assumed that R0 points to address A0. Each alphanumeric A–H represents one byte. Also, the memory contents may not exactly equal the register contents. For example, in VSL instructions, the memory word (16 bits) is the register word shifted left by one bit. See Appendix A for more detailed information.
Table 2-27. Move Instructions in Big and Little Endian Modes
Each entry lists the instructions, an example, the register operands, and the memory contents in big endian and little endian modes.

MOVE.B, MOVEU.B
Example: MOVE.B D0,(R0)
Register operands: D0 bits 7:0 = A
Big endian: A0 = A
Little endian: A0 = A

MOVE.W, MOVEU.W
Example: MOVE.W D0,(R0)
Register operands: D0 bits 15:0 = AB
Big endian: A0 = A, A1 = B
Little endian: A0 = B, A1 = A

MOVE.2W
Example: MOVE.2W D0:D1,(R0)
Register operands: D0 bits 15:0 = AB, D1 bits 15:0 = CD
Big endian: A0 = A, A1 = B, A2 = C, A3 = D
Little endian: A0 = B, A1 = A, A2 = D, A3 = C

MOVE.4W
Example: MOVE.4W D0:D1:D2:D3,(R0)
Register operands: D0 bits 15:0 = AB, D1 bits 15:0 = CD, D2 bits 15:0 = EF, D3 bits 15:0 = GH
Big endian: A0 = A, A1 = B, A2 = C, A3 = D, A4 = E, A5 = F, A6 = G, A7 = H
Little endian: A0 = B, A1 = A, A2 = D, A3 = C, A4 = F, A5 = E, A6 = H, A7 = G

MOVE.L, MOVEU.L, MOVES.L
Example: MOVE.L D0,(R0)
Register operands: D0 bits 31:0 = ABCD
Big endian: A0 = A, A1 = B, A2 = C, A3 = D
Little endian: A0 = D, A1 = C, A2 = B, A3 = A

MOVE.L (Extension)
Example: MOVE.L D0.E:D1.E,(A0)
Register operands: D0 extension = A with limit bit L0, D1 extension = B with limit bit L1
Big endian: A0 = L1, A1 = B1, A2 = L0, A3 = A1
Little endian: A0 = A1, A1 = L0, A2 = B1, A3 = L1

MOVE.2L
Example: MOVE.2L D0:D1,(R0)
Register operands: D0 bits 31:0 = ABCD, D1 bits 31:0 = EFGH
Big endian: A0 = A, A1 = B, A2 = C, A3 = D, A4 = E, A5 = F, A6 = G, A7 = H
Little endian: A0 = D, A1 = C, A2 = B, A3 = A, A4 = H, A5 = G, A6 = F, A7 = E

MOVE.F, MOVES.F
Example: MOVE.F D0,(R0)
Register operands: D0 bits 31:16 = AB
Big endian: A0 = A, A1 = B
Little endian: A0 = B, A1 = A

MOVE.2F, MOVES.2F
Example: MOVE.2F D0:D1,(R0)
Register operands: D0 bits 31:16 = AB, D1 bits 31:16 = CD
Big endian: A0 = A, A1 = B, A2 = C, A3 = D
Little endian: A0 = B, A1 = A, A2 = D, A3 = C

MOVE.4F, MOVES.4F
Example: MOVE.4F D0:D1:D2:D3,(R0)
Register operands: D0 bits 31:16 = AB, D1 bits 31:16 = CD, D2 bits 31:16 = EF, D3 bits 31:16 = GH
Big endian: A0 = A, A1 = B, A2 = C, A3 = D, A4 = E, A5 = F, A6 = G, A7 = H
Little endian: A0 = B, A1 = A, A2 = D, A3 = C, A4 = F, A5 = E, A6 = H, A7 = G

VSL.4W
Example: VSL.4W D2:D6:D1:D3,(R0)+N0
Register operands: D2 bits 15:0 = AB, D6 bits 15:0 = CD, Note 1 data = EF, Note 2 data = GH
Big endian: A0 = C, A1 = D, A2 = A, A3 = B, A4 = G, A5 = H, A6 = E, A7 = F
Little endian: A0 = B, A1 = A, A2 = D, A3 = C, A4 = F, A5 = E, A6 = H, A7 = G

VSL.4F
Example: VSL.4F D2:D6:D1:D3,(R0)+N0
Register operands: D2 bits 31:16 = AB, D6 bits 31:16 = CD, Note 3 data = EF, Note 4 data = GH
Big endian: A0 = C, A1 = D, A2 = A, A3 = B, A4 = G, A5 = H, A6 = E, A7 = F
Little endian: A0 = B, A1 = A, A2 = D, A3 = C, A4 = F, A5 = E, A6 = H, A7 = G

VSL.2W
Example: VSL.2W D1:D3,(R0)+N0
Register operands: Note 1 data = AB, Note 2 data = CD
Big endian: A0 = C, A1 = D, A2 = A, A3 = B
Little endian: A0 = B, A1 = A, A2 = D, A3 = C

VSL.2F
Example: VSL.2F D1:D3,(R0)+N0
Register operands: Note 3 data = AB, Note 4 data = CD
Big endian: A0 = C, A1 = D, A2 = A, A3 = B
Little endian: A0 = B, A1 = A, A2 = D, A3 = C

Notes:
1. Data selected according to VF0 bit in SR, selects D3.L<<1 if VF0=1, D1.L<<1 if VF0=0
2. Data selected according to VF2 bit in SR, selects D3.L<<1 if VF2=1, D1.L<<1 if VF2=0
3. Data selected according to VF1 bit in SR, selects D3.H<<1 if VF1=1, D1.H<<1 if VF1=0
4. Data selected according to VF3 bit in SR, selects D3.H<<1 if VF3=1, D1.H<<1 if VF3=0
Table 2-28 shows the representation of the stack support instructions in big and little endian modes. In the examples shown in this table, it is assumed that the stack access is to address A0. The stack instructions treat the register data like a 32-bit long word move.
Table 2-29 shows the representation of the bit mask instructions in big and little endian modes.
Table 2-28. Stack Support Instructions in Big and Little Endian Modes

Single POP, POPN, PUSH, PUSHN
Example: PUSH D0
Register operands: D0 bits 31:0 = ABCD
Big endian: A0 = A, A1 = B, A2 = C, A3 = D
Little endian: A0 = D, A1 = C, A2 = B, A3 = A

Double POP, POPN, PUSH, PUSHN
Example: PUSH D0 PUSH D1
Register operands: D0 bits 31:0 = ABCD, D1 bits 31:0 = EFGH
Big endian: A0 = A, A1 = B, A2 = C, A3 = D, A4 = E, A5 = F, A6 = G, A7 = H
Little endian: A0 = D, A1 = C, A2 = B, A3 = A, A4 = H, A5 = G, A6 = F, A7 = E

Table 2-29. Bit Mask Instructions in Big and Little Endian Modes

BMCHG.W, BMCLR.W, BMSET.W, BMTSTS.W, BMTSTC.W, BMTSET.W, NOT.W, AND.W, OR.W, EOR.W
Example: BMSET.W #$1234,(A0)
Register operands: Data bits 15:0 = AB, Mask = $1234
Big endian: A0 = A, A1 = B
Little endian: A0 = B, A1 = A
Table 2-30 shows the representation of the change-of-flow instructions in big and little endian modes. In this table, it is assumed that the stack access is to address A0. This shows how the contents of the PC and SR are transferred to/from memory like 32-bit long words.
Table 2-31 shows the representation of the control instructions in big and little endian modes. In this table, it is assumed that the stack access is to address A0.
Table 2-30. Non-Loop Change-of-Flow Instructions in Big and Little Endian Modes

BSR, BSRD, JSR, JSRD, RTE, RTED
Register operands: PC bits 31:0 = ABCD, SR bits 31:0 = EFGH
Big endian: A0 = A, A1 = B, A2 = C, A3 = D, A4 = E, A5 = F, A6 = G, A7 = H
Little endian: A0 = D, A1 = C, A2 = B, A3 = A, A4 = H, A5 = G, A6 = F, A7 = E

RTS, RTSD, RTSTK, RTSTKD
Register operands: PC bits 31:0 = ABCD
Big endian: A0 = A, A1 = B, A2 = C, A3 = D
Little endian: A0 = D, A1 = C, A2 = B, A3 = A

Table 2-31. Control Instructions in Big and Little Endian Modes

TRAP, ILLEGAL, Interrupt Service
Register operands: PC bits 31:0 = ABCD, SR bits 31:0 = EFGH
Big endian: A0 = A, A1 = B, A2 = C, A3 = D, A4 = E, A5 = F, A6 = G, A7 = H
Little endian: A0 = D, A1 = C, A2 = B, A3 = A, A4 = H, A5 = G, A6 = F, A7 = E