Texas Instruments and its subsidiaries (TI) reserve the right to make changes to their products
or to discontinue any product or service without notice, and advise customers to obtain the latest
version of relevant information to verify, before placing orders, that information being relied on
is current and complete. All products are sold subject to the terms and conditions of sale supplied
at the time of order acknowledgment, including those pertaining to warranty, patent infringement,
and limitation of liability.
TI warrants performance of its products to the specifications applicable at the time of sale in
accordance with TI’s standard warranty. Testing and other quality control techniques are utilized
to the extent TI deems necessary to support this warranty. Specific testing of all parameters of
each device is not necessarily performed, except those mandated by government requirements.
Customers are responsible for their applications using TI components.
In order to minimize risks associated with the customer’s applications, adequate design and
operating safeguards must be provided by the customer to minimize inherent or procedural
hazards.
TI assumes no liability for applications assistance or customer product design. TI does not
warrant or represent that any license, either express or implied, is granted under any patent right,
copyright, mask work right, or other intellectual property right of TI covering or relating to any
combination, machine, or process in which such products or services might be or are used. TI’s
publication of information regarding any third party’s products or services does not constitute TI’s
approval, license, warranty or endorsement thereof.
Reproduction of information in TI data books or data sheets is permissible only if reproduction
is without alteration and is accompanied by all associated warranties, conditions, limitations and
notices. Representation or reproduction of this information with alteration voids all warranties
provided for an associated TI product or service, is an unfair and deceptive business practice,
and TI is not responsible nor liable for any such use.
Resale of TI’s products or services with statements different from or beyond the parameters stated
by TI for that product or service voids all express and any implied warranties for the associated
TI product or service, is an unfair and deceptive business practice, and TI is not responsible nor
liable for any such use.
Also see: Standard Terms and Conditions of Sale for Semiconductor Products.
www.ti.com/sc/docs/stdterms.htm
Mailing Address:
Texas Instruments
Post Office Box 655303
Dallas, Texas 75265
Copyright 2001, Texas Instruments Incorporated
About This Manual
This manual describes ways to optimize C and assembly code for the
TMS320C55x DSPs and recommends ways to write TMS320C55x code for
specific applications.
Notational Conventions
This document uses the following conventions.
- The device number TMS320C55x is often abbreviated as C55x.
- Program listings, program examples, and interactive displays are shown
in a special typeface similar to a typewriter’s. Examples use a bold
version of the special typeface for emphasis; interactive displays use a
bold version of the special typeface to distinguish commands that you
enter from items that the system displays (such as prompts, command
output, error messages, etc.).
Here is an example of a system prompt and a command that you might
enter:
C: csr −a /user/ti/simuboard/utilities
- In syntax descriptions, the instruction, command, or directive is in a bold
typeface font and parameters are in an italic typeface. Portions of a syntax
that are in bold should be entered as shown; portions of a syntax that are
in italics describe the type of information that should be entered. Here is
an example of a directive syntax:
.asect “section name”,address
.asect is the directive. This directive has two parameters, indicated by section
name and address. When you use .asect, the first parameter must be
an actual section name, enclosed in double quotes; the second parameter
must be an address.
- Some directives can have a varying number of parameters. For example,
the .byte directive can have up to 100 parameters. The syntax for this directive is:
.byte value1 [, ... , valuen]
This syntax shows that .byte must have at least one value parameter, but
you have the option of supplying additional value parameters, separated
by commas.
- In most cases, hexadecimal numbers are shown with the suffix h. For example,
the following number is a hexadecimal 40 (decimal 64):
40h
Similarly, binary numbers usually are shown with the suffix b. For example,
the following number is the decimal number 4 shown in binary form:
0100b
- Bits are sometimes referenced with the following notation:
Notation         Description                     Example
Register(n−m)    Bits n through m of Register    AC0(15−0) represents the 16 least significant bits of the register AC0.
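The Register(n−m) notation can be mirrored in C as a shift followed by a mask. The following is a minimal sketch, not part of the C55x tools: the function name reg_bits is hypothetical, and a 64-bit container is assumed so that a 40-bit accumulator value such as AC0 fits.

```c
#include <assert.h>
#include <stdint.h>

/* Return bits n..m (n >= m) of a register value, right-aligned.
 * Models the Register(n-m) notation; reg_bits is a hypothetical name.
 * A 64-bit container is used so that 40-bit accumulator values fit. */
uint64_t reg_bits(uint64_t reg, unsigned n, unsigned m)
{
    assert(n >= m && n < 64);
    unsigned width = n - m + 1;
    /* Avoid an undefined 64-bit shift when the whole word is requested. */
    uint64_t mask = (width == 64) ? ~UINT64_C(0)
                                  : ((UINT64_C(1) << width) - 1);
    return (reg >> m) & mask;
}
```

With this helper, AC0(15−0) corresponds to reg_bits(ac0, 15, 0), and the hexadecimal value 40h is written 0x40 in C.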
Related Documentation From Texas Instruments
The following books describe the TMS320C55x devices and related support
tools. To obtain a copy of any of these TI documents, call the Texas
Instruments Literature Response Center at (800) 477-8924. When ordering,
please identify the book by its title and literature number.
TMS320C55x Technical Overview (literature number SPRU393). This overview
is an introduction to the TMS320C55x digital signal processor
(DSP). The TMS320C55x is the latest generation of fixed-point DSPs in
the TMS320C5000 DSP platform. Like the previous generations, this
processor is optimized for high performance and low-power operation.
This book describes the CPU architecture, low-power enhancements,
and embedded emulation features of the TMS320C55x.
TMS320C55x DSP CPU Reference Guide (literature number SPRU371)
describes the architecture, registers, and operation of the CPU.
TMS320C55x DSP Mnemonic Instruction Set Reference Guide (literature
number SPRU374) describes the mnemonic instructions individually. It
also includes a summary of the instruction set, a list of the instruction
opcodes, and a cross-reference to the algebraic instruction set.
TMS320C55x DSP Algebraic Instruction Set Reference Guide (literature
number SPRU375) describes the algebraic instructions individually. It
also includes a summary of the instruction set, a list of the instruction
opcodes, and a cross-reference to the mnemonic instruction set.
TMS320C55x Optimizing C Compiler User’s Guide (literature number
SPRU281) describes the C55x C compiler. This C compiler accepts
ANSI standard C source code and produces assembly language source
code for TMS320C55x devices.
TMS320C55x Assembly Language Tools User’s Guide (literature number
SPRU280) describes the assembly language tools (assembler, linker,
and other tools used to develop assembly language code), assembler
directives, macros, common object file format, and symbolic debugging
directives for TMS320C55x devices.
TMS320C55x DSP Library Programmer’s Reference (literature number
SPRU422) describes the optimized DSP Function Library for C programmers on the TMS320C55x DSP.
The CPU, the registers, and the instruction sets are also described in online
documentation contained in Code Composer Studio.
Trademarks
Code Composer Studio, TMS320C54x, C54x, TMS320C55x, and C55x are
trademarks of Texas Instruments.
This chapter lists some of the key features of the TMS320C55x (C55x) DSP
architecture and shows a recommended process for creating code that runs
efficiently.
1.1 TMS320C55x Architecture
The TMS320C55x device is a fixed-point digital signal processor (DSP). The
main block of the DSP is the central processing unit (CPU), which has the following characteristics:
- A unified program/data memory map. In program space, the map contains
16M bytes that are accessible at 24-bit addresses. In data space, the map
contains 8M words that are accessible at 23-bit addresses.
- An input/output (I/O) space of 64K words for communication with peripherals.
- Software stacks that support 16-bit and 32-bit push and pop operations.
You can use these stacks for data storage and retrieval. The CPU uses
these stacks for automatic context saving (in response to a call or interrupt) and restoring (when returning to the calling or interrupted code sequence).
- A large number of data and address buses, to provide a high level of parallelism.
One 32-bit data bus and one 24-bit address bus support instruction
fetching. Three 16-bit data buses and three 24-bit address buses are used
to transport data to the CPU. Two 16-bit data buses and two 24-bit address
buses are used to transport data from the CPU.
- An instruction buffer and a separate fetch mechanism, so that instruction
fetching is decoupled from other CPU activities.
- The following computation blocks: one 40-bit arithmetic logic unit (ALU),
one 16-bit ALU, one 40-bit shifter, and two multiply-and-accumulate units
(MACs). In a single cycle, each MAC can perform a 17-bit by 17-bit multiplication (fractional or integer) and a 40-bit addition or subtraction with optional 32-/40-bit saturation.
- An instruction pipeline that is protected. The pipeline protection mechanism
inserts delay cycles as necessary to prevent read operations and
write operations from happening out of the intended order.
- Data address generation units that support linear, circular, and bit-reverse
addressing.
- Interrupt-control logic that can block (or mask) certain interrupts known as
the maskable interrupts.
- A TMS320C54x-compatible mode to support code originally written for a
TMS320C54x DSP.
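To make the circular and bit-reverse addressing modes in the list above more concrete, the index arithmetic behind them can be modeled in C. This is a sketch of the arithmetic only, under the assumption of a zero-based buffer index; it does not describe how the C55x address generation units are configured, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Circular addressing model: advance an index by step and wrap it
 * inside a buffer of `size` elements. */
unsigned circ_next(unsigned index, unsigned step, unsigned size)
{
    assert(size > 0 && index < size);
    return (index + step) % size;
}

/* Bit-reverse addressing model: reverse the low `bits` bits of index,
 * which gives the access order used by radix-2 FFTs of length 2^bits. */
unsigned bit_reverse(unsigned index, unsigned bits)
{
    unsigned reversed = 0;
    for (unsigned i = 0; i < bits; i++) {
        reversed = (reversed << 1) | (index & 1u);
        index >>= 1;
    }
    return reversed;
}
```

For an 8-element buffer, circ_next(6, 3, 8) wraps to index 1, and bit_reverse(1, 3) maps index 001b to 100b.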
1.2 Code Development Flow for Best Performance
The following flow chart shows how to achieve the best performance and code-generation efficiency from your code. After the chart, there is a table that describes the phases of the flow.
Figure 1−1. Code Development Flow
[Flow chart in four steps. Step 1, Write C Code: write C code, compile, profile. Step 2, Optimize C Code: optimize C code, compile, profile, and repeat while more C optimization is possible. Step 3, Write Assembly Code: identify time-critical portions of C code, write them in assembly code, profile. Step 4, Optimize Assembly Code: optimize assembly code, profile. After each profile, if the code is efficient enough, you are done; otherwise, continue to the next step.]
Step  Goal
1     Write C Code: You can develop your code in C using the ANSI-compliant
      C55x C compiler without any knowledge of the C55x DSP. Use Code Composer
      Studio to identify any inefficient areas that you might have in your C
      code. After making your code functional, you can improve its performance
      by selecting higher-level optimization compiler options. If your code is
      still not as efficient as you would like it to be, proceed to step 2.
2     Optimize C Code: Explore potential modifications to your C code to
      achieve better performance. Some of the techniques you can apply include
      (see Chapter 3):
      - Use specific types (register, volatile, const).
      - Modify the C code to better suit the C55x architecture.
      - Use an ETSI intrinsic when applicable.
      - Use C55x compiler intrinsics.
      After modifying your code, use the C55x profiling tools again, to check
      its performance. If your code is still not as efficient as you would
      like it to be, proceed to step 3.
3     Write Assembly Code: Identify the time-critical portions of your C code
      and rewrite them as C-callable assembly-language functions. Again,
      profile your code, and if it is still not as efficient as you would
      like it to be, proceed to step 4.
4     Optimize Assembly Code: After making your assembly code functional, try
      to optimize the assembly-language functions by using some of the
      techniques described in Chapter 4, Optimizing Your Assembly Code. The
      techniques include:
      - Place instructions in parallel.
      - Rewrite or reorganize code to avoid pipeline protection delays.
      - Minimize stalls in instruction fetching.
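Step 2 mentions ETSI intrinsics. As background, the ETSI basic operators define saturating fractional arithmetic such as L_mult and L_mac. The following is a portable reference sketch of that arithmetic as commonly specified, offered as an assumption-laden model rather than the C55x compiler's implementation; the ref_ prefixed names are hypothetical and chosen to avoid clashing with any real intrinsic.

```c
#include <assert.h>
#include <stdint.h>

/* Saturating 32-bit addition: clamp instead of wrapping on overflow. */
int32_t ref_L_add(int32_t a, int32_t b)
{
    int64_t sum = (int64_t)a + (int64_t)b;
    if (sum > INT32_MAX) return INT32_MAX;
    if (sum < INT32_MIN) return INT32_MIN;
    return (int32_t)sum;
}

/* Fractional multiply (Q15 x Q15 -> Q31): the 32-bit product is doubled.
 * The only overflow case, -32768 * -32768, saturates to INT32_MAX. */
int32_t ref_L_mult(int16_t a, int16_t b)
{
    int32_t product = (int32_t)a * (int32_t)b;
    return (product == 0x40000000) ? INT32_MAX : product * 2;
}

/* Multiply-accumulate built from the two operations above. */
int32_t ref_L_mac(int32_t acc, int16_t a, int16_t b)
{
    return ref_L_add(acc, ref_L_mult(a, b));
}
```

A model like this can serve as a bit-exact check on a host machine before the same loop is rewritten with the target compiler's intrinsics.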
Chapter 2
This tutorial walks you through the code development flow introduced in Chapter 1, and introduces you to basic concepts of TMS320C55x (C55x) DSP programming. It uses step-by-step instructions and code examples to show you
how to use the software development tools integrated under Code Composer
Studio (CCS).
Installing CCS before beginning the tutorial allows you to edit, build, and debug
DSP target programs. For more information about CCS features, see the CCS
Tutorial. You can access the CCS Tutorial within CCS by choosing
Help→Tutorial.
The examples in this tutorial use instructions from the mnemonic instruction
set, but the concepts apply equally for the algebraic instruction set.
This tutorial presents a simple assembly code example that adds four numbers together (y = x0 + x3 + x1 + x2). This example helps you become familiar
with the basics of C55x programming.
After completing the tutorial, you should know:
- The four common C55x addressing modes and when to use them.
- The basic C55x tools required to develop and test your software.
This tutorial does not replace the information presented in other C55x documentation and is not intended to cover all the topics required to program the
C55x efficiently.
Refer to the related documentation listed in the preface of this book for more
information about programming the C55x DSP. Much of this information has
been consolidated as part of the C55x Code Composer Studio online help.
For your convenience, all the files required to run this example can be downloaded with the TMS320C55x Programmer’s Guide (SPRU376) from
http://www.ti.com/sc/docs/schome.htm. The examples in this chapter can be
found in the 55xprgug_srccode\tutor directory.
2.2 Writing Assembly Code
Writing your assembly code involves the following steps:
- Allocate sections for code, constants, and variables.
- Initialize the processor mode.
- Set up addressing modes and add the following values: x0 + x1 + x2 + x3.
The following rules should be considered when writing C55x assembly code:
- Labels
The first character of a label must be a letter or an underscore ( _ ) followed by a letter, and the label must begin in the first column of the text file. Labels
can contain up to 32 alphanumeric characters.
- Comments
When preceded by a semicolon ( ; ), a comment may begin in any column.
When preceded by an asterisk ( * ), a comment must begin in the first
column.
The final assembly code product of this tutorial is displayed in Example 2−1,
Final Assembly Code of tutor.asm. This code performs the addition of the elements in vector x. Sections of this code are highlighted in the three steps used
to create this example.
For more information about assembly syntax, see the TMS320C55x Assembly
Language Tools User’s Guide (SPRU280).
Example 2−1. Final Assembly Code of tutor.asm
* Step 1: Section allocation
* ------
        .def x,y,init
x       .usect "vars",4         ; reserve 4 uninitialized 16-bit locations for x
y       .usect "vars",1         ; reserve 1 uninitialized 16-bit location for y

        .sect "table"           ; create initialized section "table" to
init    .int 1,2,3,4            ; contain initialization values for x

        .text                   ; create code section (default is .text)
        .def start              ; define label to the start of the code
start
* Step 2: Processor mode initialization
* ------
        BCLR C54CM              ; set processor to '55x native mode instead of
                                ; '54x compatibility mode (reset value)
        BCLR AR0LC              ; set AR0 register in linear mode
        BCLR AR6LC              ; set AR6 register in linear mode

* Step 3a: Copy initialization values to vector x using indirect addressing
* -------
copy
        AMOV #x, XAR0           ; XAR0 pointing to variable x
        AMOV #init, XAR6        ; XAR6 pointing to initialization table

        MOV *AR6+, *AR0+        ; copy starts from "init" to "x"
        MOV *AR6+, *AR0+
        MOV *AR6+, *AR0+
        MOV *AR6, *AR0

* Step 3b: Add values of vector x elements using direct addressing
* -------
add
        AMOV #x, XDP            ; XDP pointing to variable x
        .dp x                   ; and the assembler is notified

        MOV @x, AC0             ; AC0 = x0
        ADD @(x+3), AC0         ; AC0 = x0 + x3
        ADD @(x+1), AC0         ; AC0 = x0 + x3 + x1
        ADD @(x+2), AC0         ; AC0 = x0 + x3 + x1 + x2

* Step 3c: Write the result to y using absolute addressing
* -------
        MOV AC0, *(#y)

end
        NOP
        B end
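For readers who think in C first, the data flow of tutor.asm can be modeled on a host machine. This is a rough sketch under the assumption that the three sections map onto ordinary C objects; the function name tutor_model is hypothetical, and the comments only indicate which register or section each C object stands in for.

```c
#include <assert.h>
#include <stdint.h>

static const int16_t init[4] = {1, 2, 3, 4};  /* stands in for section "table" */
static int16_t x[4];                          /* stands in for x in "vars"     */
static int16_t y;                             /* stands in for y in "vars"     */

/* Host-side model of Steps 3a to 3c; returns the value written to y. */
int16_t tutor_model(void)
{
    const int16_t *src = init;   /* plays the role of AR6 */
    int16_t *dst = x;            /* plays the role of AR0 */

    /* Step 3a: copy the initialization values through two pointers. */
    for (int i = 0; i < 4; i++)
        *dst++ = *src++;

    /* Step 3b: accumulate in the order y = x0 + x3 + x1 + x2. */
    int32_t ac0 = x[0];
    ac0 += x[3];
    ac0 += x[1];
    ac0 += x[2];

    /* Step 3c: store the result. */
    y = (int16_t)ac0;
    return y;
}
```

Running the model gives a reference value to compare against the contents of y when the assembly version is executed under the debugger.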
2.2.1 Allocate Sections for Code, Constants, and Variables
The first step in writing this assembly code is to allocate memory space for the
different sections of your program.
Sections are modules consisting of code, constants, or variables needed to
successfully run your application. These modules are defined in the source file
using assembler directives. The following basic assembler directives are used
to create sections and initialize values in the example code.
- .sect “section_name” creates an initialized named section for code/data.
Initialized sections are sections defining their initial values.
- .usect “section_name”, size creates an uninitialized named section for data.
Uninitialized sections declare only their size in 16-bit words, but do not define their initial values.
- .int value reserves a 16-bit word in memory and defines the initialization
value.
- .def symbol makes a symbol global, known to external files, and indicates
that the symbol is defined in the current file. External files can access the
symbol by using the .ref directive. A symbol can be a label or a variable.
As shown in Example 2−2 and Figure 2−1, the example file tutor.asm contains
three sections:
- vars, containing five uninitialized memory locations
  - The first four are reserved for vector x (the input vector to add).
  - The last location, y, will be used to store the result of the addition.
- table, to hold the initialization values for x. The init label points to the
beginning of section table.
- .text, which contains the assembly code
Example 2−2 shows the partial assembly code used for allocating sections.
Example 2−2. Partial Assembly Code of tutor.asm (Step 1)
* Step 1: Section allocation
* ------
        .def x, y, init
x       .usect "vars", 4        ; reserve 4 uninitialized 16-bit locations for x
y       .usect "vars", 1        ; reserve 1 uninitialized 16-bit location for y
        .sect "table"           ; create initialized section "table" to
init    .int 1, 2, 3, 4         ; contain initialization values for x
        .text                   ; create code section (default is .text)
        .def start              ; define label to the start of the code
start
Note: The algebraic instructions code example for Partial Assembly Code of tutor.asm (Step 1) is shown in Example B−1 on
page B-2.
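Figure 2−1. Section Allocation
[Figure: memory layout of the three sections. Labels x and y mark the locations in vars, init marks the values 1, 2, 3, 4 in table, and start marks the code in .text.]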
2.2.2 Processor Mode Initialization
The second step is to make sure the status registers (ST0_55, ST1_55,
ST2_55, and ST3_55) are set to configure your processor. You will either need
to set these values or use the default values. Default values are placed in the
registers after processor reset. You can locate the default register values after
reset in the TMS320C55x DSP CPU Reference Guide (SPRU371).
As shown in Example 2−3:
- The AR0 and AR6 registers are set to linear addressing (instead of circular
addressing) using bit addressing mode to modify the status register bits.
- The processor has been set in C55x native mode instead of C54x-compatible
mode.
Example 2−3. Partial Assembly Code of tutor.asm (Step 2)
* Step 2: Processor mode initialization
* ------
        BCLR C54CM              ; set processor to '55x native mode instead of
                                ; '54x compatibility mode (reset value)
        BCLR AR0LC              ; set AR0 register in linear mode
        BCLR AR6LC              ; set AR6 register in linear mode
Note: The algebraic instructions code example for Partial Assembly Code of tutor.asm (Step 2) is shown in Example B−2 on
page B-2.
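The effect of a bit-clear on a status register can be pictured in C as masking out a single bit of a 16-bit value. The sketch below is a host-side model only. The bit positions shown are assumptions made for illustration and should be checked against the TMS320C55x DSP CPU Reference Guide (SPRU371); the function name bclr is borrowed from the mnemonic but is otherwise hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions, for illustration only; verify in SPRU371. */
#define C54CM_BIT 5u   /* assumed position in ST1_55 */
#define AR0LC_BIT 0u   /* assumed position in ST2_55 */
#define AR6LC_BIT 6u   /* assumed position in ST2_55 */

/* Model of a bit-clear: return reg with the selected bit set to 0
 * and every other bit unchanged. */
uint16_t bclr(uint16_t reg, unsigned bit)
{
    assert(bit < 16);
    return (uint16_t)(reg & ~(1u << bit));
}
```

In this model, clearing AR0LC and AR6LC leaves the other bits of the register value untouched, which is why bit addressing is convenient for mode setup.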
2.2.3 Setting up Addressing Modes
Four of the most common C55x addressing modes are used in this code:
- ARn indirect addressing (identified by *), in which you use auxiliary registers
(ARn) as pointers.
- DP direct addressing (identified by @), which provides positive-offset
addressing from a base address specified by the DP register. The offset is
calculated by the assembler and defined by a 7-bit value embedded in the
instruction.
- k23 absolute addressing (identified by #), which allows you to specify the
entire 23-bit data address with a label.
- Bit addressing (identified by the bit instruction), which allows you to modify
a single bit of a memory location or MMR register.
For further details on these addressing modes, refer to the TMS320C55x DSP CPU Reference Guide (SPRU371). Example 2−4 demonstrates the use of the
addressing modes discussed in this section.
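The 7-bit limit on DP direct addressing can be illustrated with a small calculation: an operand is reachable only if it lies 0 to 127 words above the base. The following is a simplified sketch of that check, assuming word addresses and ignoring the other rules the assembler and CPU apply; the function name dp_offset is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Simplified model of the offset embedded in a DP direct access.
 * Returns the offset (0..127) of addr above dp_base, or -1 when the
 * operand cannot be reached with a positive 7-bit offset. */
int dp_offset(uint32_t dp_base, uint32_t addr)
{
    if (addr < dp_base || addr - dp_base > 0x7Fu)
        return -1;
    return (int)(addr - dp_base);
}
```

With the base pointing at x, the operands x, x+1, x+2, and x+3 give offsets 0 through 3, all well inside the 7-bit range.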
In Step 3a, initialization values from the table section are copied to vector x (the
vector to perform the addition) using indirect addressing. Figure 2−2 illustrates
the structure of the extended auxiliary registers (XARn). The XARn register is
used only during register initialization. Subsequent operations use ARn because only the lower 16 bits are affected (ARn operations are restricted to a
64k main data page). AR6 is used to hold the address of table, and AR0 is used
to hold the address of x.
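The restriction to a 64K main data page can be modeled by treating a 23-bit extended pointer as a 7-bit page field above a 16-bit ARn field, where a post-increment touches only the low field. This is a sketch under that assumption; the function name xar_post_inc is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a post-increment on ARn inside a 23-bit XARn value:
 * the upper 7 bits (main data page) are preserved and the lower
 * 16 bits wrap around within the page. */
uint32_t xar_post_inc(uint32_t xar)
{
    uint32_t page = xar & 0x7F0000u;          /* upper 7 bits, unchanged */
    uint32_t low  = (xar + 1u) & 0xFFFFu;     /* ARn, wraps at 64K       */
    return page | low;
}
```

The wraparound case shows why the extended register is loaded once during initialization: incrementing past FFFFh does not carry into the page field.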
In Step 3b, direct addressing is used to add the four values. Notice that the
XDP register was initialized to point to variable x. The .dp assembler directive
is used to define the value of XDP, so the correct offset can be computed by
the assembler at assembly time.
Finally, in Step 3c, the result is stored in y using absolute addressing. Absolute addressing provides an easy way to access a memory location
without having to make XDP changes, but at the expense of an increased code
size.