Change History
The following changes have been made to this book.
Date            Issue   Change
November 2000   A       Release 1.1
November 2001   B       Release 1.2
Proprietary Notice
Words and logos marked with ® or ™ are registered trademarks or trademarks owned by ARM Limited. Other
brands and names mentioned herein may be the trademarks of their respective owners.
Neither the whole nor any part of the information contained in, or the product described in, this document
may be adapted or reproduced in any material form except with the prior written permission of the copyright
holder.
The product described in this document is subject to continuous developments and improvements. All
particulars of the product and its use contained in this document are given by ARM in good faith. However,
all warranties implied or expressed, including but not limited to implied warranties of merchantability, or
fitness for purpose, are excluded.
This document is intended only to assist the reader in the use of the product. ARM Limited shall not be liable
for any loss or damage arising from the use of any information in this document, or any error or omission in
such information, or any incorrect use of the product.
This book provides tutorial and reference information for the ADS assemblers (armasm,
the free-standing assembler, and inline assemblers in the C and C++ compilers). It
describes the command-line options to the assembler, the pseudo-instructions and
directives available to assembly language programmers, and the ARM, Thumb®, and
Vector Floating-point (VFP) instruction sets.
This book is written for all developers who are producing applications using ADS. It
assumes that you are an experienced software developer and that you are familiar with
the ARM development tools as described in ADS Getting Started.
This book is organized into the following chapters:
Chapter 1 Introduction
Read this chapter for an introduction to the ADS version 1.2 assemblers
and assembly language.
Chapter 2 Writing ARM and Thumb Assembly Language
Read this chapter for tutorial information to help you use the ARM
assemblers and assembly language.
Chapter 3 Assembler Reference
Read this chapter for reference material about the syntax and structure of
the language provided by the ARM assemblers.
Chapter 4 ARM Instruction Reference
Read this chapter for reference material on the ARM instruction set.
Chapter 5 Thumb Instruction Reference
Read this chapter for reference material on the Thumb instruction set.
Chapter 6 Vector Floating-point Programming
Read this chapter for reference material on the VFP instruction set, and
other VFP-specific assembly language information.
Chapter 7 Directives Reference
Read this chapter for reference material on the assembler directives
available in the ARM assembler.
The following typographical conventions are used in this book:
monospace
Denotes text that can be entered at the keyboard, such as commands, file
and program names, and source code.
monospace (with underlining)
Denotes a permitted abbreviation for a command or option. The
underlined text can be entered instead of the full command or option
name.
monospace italic
Denotes arguments to commands and functions where the argument is to
be replaced by a specific value.
monospace bold
Denotes language keywords when used outside example code.
italic
Highlights important notes, introduces special terminology, denotes
internal cross-references, and citations.
bold
Highlights interface elements, such as menu names. Also used for
emphasis in descriptive lists, where appropriate, and for ARM processor
signal names.
Further reading
This section lists publications from both ARM Limited and third parties that provide
additional information on developing code for the ARM family of processors.
ARM periodically provides updates and corrections to its documentation. See
http://www.arm.com for current errata sheets and addenda.
See also the ARM Frequently Asked Questions list at:
http://www.arm.com/DevSupp/Sales+Support/faq.html
ARM publications
This book contains reference information that is specific to development tools supplied
with ADS. Other publications included in the suite are:
•ADS Installation and License Management Guide (ARM DUI 0139)
•an optimizing inline assembler built into the C and C++ compilers.
The language that these assemblers take as input is basically the same. However, there
are limitations on what features of the language you can use in the inline assemblers.
Refer to the Mixing C, C++, and Assembly Language chapter in ADS Developer Guide
for further information on the inline assemblers.
This chapter gives a basic, practical understanding of how to write ARM and Thumb
assembly language modules. It also gives information on the facilities provided by the
ARM assembler (armasm).
This chapter does not provide a detailed description of the ARM, Thumb, or VFP
instruction sets. This information can be found in Chapter 4 ARM Instruction
Reference, Chapter 5 Thumb Instruction Reference, and Chapter 6 Vector
Floating-point Programming. Further information can be found in ARM Architecture
Reference Manual.
2.1.1 Code examples
There are a number of code examples in this chapter. Many of them are supplied in the
examples\asm directory of the ADS.
Follow these steps to build, link, and execute an assembly language file:
1. Type armasm -g filename.s at the command prompt to assemble the file and
generate debug tables.
2. Type armlink filename.o -o filename to link the object file and generate an ELF
executable image.
3. Type armsd filename to load the image file into the debugger.
4. Type go at the armsd: prompt to execute it.
5. Type quit at the armsd: prompt to return to the command line.
To see how the assembler converts the source code, enter:
fromelf -text/c filename.o
or run the module in AXD with interleaving on.
See:
•AXD and armsd Debuggers Guide for details on armsd, and AXD.
This section gives a brief overview of the ARM architecture.
ARM processors are typical of RISC processors in that they implement a load/store
architecture. Only load and store instructions can access memory. Data processing
instructions operate on register contents only.
2.2.1 Architecture versions
The information and examples in this book assume that you are using a processor that
implements ARM architecture v3 or above. See ARM Architecture Reference Manual
for details of the various architecture versions.
All these processors have a 32-bit addressing range.
2.2.2 ARM and Thumb state
ARM architecture versions v4T and above define a 16-bit instruction set called the
Thumb instruction set. The functionality of the Thumb instruction set is a subset of the
functionality of the 32-bit ARM instruction set. Refer to Thumb instruction set overview
on page 2-9 for more information.
Writing ARM and Thumb Assembly Language
A processor that is executing Thumb instructions is operating in Thumb state. A
processor that is executing ARM instructions is operating in ARM state.
A processor in ARM state cannot execute Thumb instructions, and a processor in
Thumb state cannot execute ARM instructions. You must ensure that the processor
never receives instructions of the wrong instruction set for the current state.
Each instruction set includes instructions to change processor state.
You must also switch the assembler mode to produce the correct opcodes using CODE16
and CODE32 directives. Refer to CODE16 and CODE32 on page 7-54 for details.
ARM processors always start executing code in ARM state.
ARM processors support up to seven processor modes, depending on the architecture
version. These are:
•User
•FIQ - Fast Interrupt Request
•IRQ - Interrupt Request
•Supervisor
•Abort
•Undefined
•System (ARM architecture v4 and above).
All modes except User mode are referred to as privileged modes.
Applications that require task protection usually execute in User mode. Some
embedded applications might run entirely in Supervisor or System modes.
Modes other than User mode are entered to service exceptions, or to access privileged
resources. Refer to the Handling Processor Exceptions chapter in ADS Developer Guide, and ARM Architecture Reference Manual for more information.
2.2.4 Registers
ARM processors have 37 registers. The registers are arranged in partially overlapping
banks. There is a different register bank for each processor mode. The banked registers
give rapid context switching for dealing with processor exceptions and privileged
operations. Refer to ARM Architecture Reference Manual for a detailed description of
how registers are banked.
The following registers are available in ARM architecture v3 and above:
•30 general-purpose, 32-bit registers
•The program counter (pc) on page 2-5
•The Current Program Status Register (CPSR) on page 2-5
•Five Saved Program Status Registers (SPSRs) on page 2-5.
30 general-purpose, 32-bit registers
Fifteen general-purpose registers are visible at any one time, depending on the current
processor mode, as r0, r1, ..., r13, r14.
By convention, r13 is used as a stack pointer (sp) in ARM assembly language. The C
and C++ compilers always use r13 as the stack pointer.
In User mode, r14 is used as a link register (lr) to store the return address when a
subroutine call is made. It can also be used as a general-purpose register if the return
address is stored on the stack.
In the exception handling modes, r14 holds the return address for the exception, or a
subroutine return address if subroutine calls are executed within an exception. r14 can
be used as a general-purpose register if the return address is stored on the stack.
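As a minimal sketch of storing the return address on the stack, the following module saves lr on entry to a subroutine that makes a further call. The names NestedCall, outer, and inner are hypothetical, and the sketch assumes that the debugger or startup environment has already set up a valid full descending stack in r13.

```armasm
        AREA    NestedCall, CODE, READONLY  ; hypothetical section name
        ENTRY
start
        MOV     r0, #1
        BL      outer               ; lr = return address in start
stop
        MOV     r0, #0x18           ; angel_SWIreason_ReportException
        LDR     r1, =0x20026        ; ADP_Stopped_ApplicationExit
        SWI     0x123456            ; ARM semihosting SWI
outer
        STMFD   sp!, {lr}           ; save return address on the stack
        BL      inner               ; this call overwrites lr
        LDMFD   sp!, {pc}           ; pop the saved return address into pc
inner
        ADD     r0, r0, #1          ; leaf routine, lr still valid
        MOV     pc, lr              ; return through lr
        END
```

Between the STMFD and LDMFD instructions, r14 is free to be overwritten because the original return address is held on the stack.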
The program counter (pc)
The program counter is accessed as r15 (or pc). It is incremented by one word (four
bytes) for each instruction in ARM state, or by two bytes in Thumb state. Branch
instructions load the destination address into the program counter. You can also load the
program counter directly using data operation instructions. For example, to return from
a subroutine, you can copy the link register into the program counter using:
MOV pc,lr
During execution, r15 does not contain the address of the currently executing
instruction. The address of the currently executing instruction is typically pc–8 for
ARM, or pc–4 for Thumb.
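The offset between pc and the executing instruction can be observed with a short sketch such as the following. The section name PcOffset and the label here are hypothetical, and the result assumes a processor that behaves as described above in ARM state.

```armasm
        AREA    PcOffset, CODE, READONLY    ; hypothetical section name
        ENTRY
here
        MOV     r0, pc              ; r0 = value read from pc at 'here'
        ADR     r1, here            ; r1 = address of 'here'
        SUB     r0, r0, r1          ; r0 = difference, expected to be 8
stop
        MOV     r0, #0x18           ; angel_SWIreason_ReportException
        LDR     r1, =0x20026        ; ADP_Stopped_ApplicationExit
        SWI     0x123456            ; ARM semihosting SWI
        END
```

Stepping through this in a debugger and inspecting r0 after the SUB instruction shows the difference between the pc value read and the address of the instruction that read it.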
The Current Program Status Register (CPSR)
The CPSR holds:
•copies of the Arithmetic Logic Unit (ALU) status flags
•the current processor mode
•interrupt disable flags.
The ALU status flags in the CPSR are used to determine whether conditional
instructions are executed or not. Refer to Conditional execution on page 2-20 for more
information.
On Thumb-capable processors, the CPSR also holds the current processor state (ARM
or Thumb).
On ARM architecture v5TE, the CPSR also holds the Q flag (see The ALU status flags
on page 2-20).
Five Saved Program Status Registers (SPSRs)
The SPSRs are used to store the CPSR when an exception is taken. One SPSR is
accessible in each of the exception-handling modes. User mode and System mode do
not have an SPSR because they are not exception handling modes. Refer to the
Handling Processor Exceptions chapter in ADS Developer Guide for more information.
All ARM instructions are 32 bits long. Instructions are stored word-aligned, so the least
significant two bits of instruction addresses are always zero in ARM state. Some
instructions use the least significant bit to determine whether the code being branched
to is Thumb code or ARM code.
See Chapter 4 ARM Instruction Reference for detailed information on the syntax of the
ARM instruction set.
ARM instructions can be classified into a number of functional groups:
•Branch instructions
•Data processing instructions
•Single register load and store instructions on page 2-7
•Multiple register load and store instructions on page 2-7
•Status register access instructions on page 2-7
•Semaphore instructions on page 2-7
•Coprocessor instructions on page 2-7.
Branch instructions
These instructions are used to:
•branch backwards to form loops
•branch forward in conditional structures
•branch to subroutines
•change the processor from ARM state to Thumb state.
Data processing instructions
These instructions operate on the general-purpose registers. They can perform
operations such as addition, subtraction, or bitwise logic on the contents of two registers
and place the result in a third register. They can also operate on the value in a single
register, or on a value in a register and a constant supplied within the instruction (an
immediate value).
Long multiply instructions (unavailable in some architectures) give a 64-bit result in
two registers.
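The operand forms described above can be sketched as follows. The section name DataProc and the chosen values are hypothetical, and the UMULL instruction assumes a processor that implements the long multiply instructions.

```armasm
        AREA    DataProc, CODE, READONLY    ; hypothetical section name
        ENTRY
start
        MOV     r1, #6
        MOV     r2, #3
        ADD     r0, r1, r2          ; two registers in, third register out: r0 = 9
        SUB     r3, r1, #1          ; register and immediate value: r3 = 5
        MVN     r4, r2              ; single register operand: r4 = NOT r2
        UMULL   r5, r6, r1, r2      ; long multiply: r6:r5 = r1 * r2 (64-bit result)
stop
        MOV     r0, #0x18           ; angel_SWIreason_ReportException
        LDR     r1, =0x20026        ; ADP_Stopped_ApplicationExit
        SWI     0x123456            ; ARM semihosting SWI
        END
```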
Single register load and store instructions
These instructions load or store the value of a single register from or to memory. They
can load or store a 32-bit word or an 8-bit unsigned byte. In ARM architecture v4 and
above they can also load or store a 16-bit unsigned halfword, or load and sign extend a
16-bit halfword or an 8-bit byte.
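A short sketch of these transfer sizes is given below. The section names and the label buffer are hypothetical. The values in the comments assume a little-endian memory system, and the halfword and signed loads assume ARM architecture v4 or above.

```armasm
        AREA    LoadStore, CODE, READONLY   ; hypothetical section name
        ENTRY
start
        LDR     r4, =buffer         ; r4 = address of buffer
        LDR     r0, [r4]            ; 32-bit word: r0 = 0x80FF7F01
        LDRB    r1, [r4, #1]        ; unsigned byte: r1 = 0x7F
        LDRH    r2, [r4, #2]        ; unsigned halfword: r2 = 0x80FF
        LDRSB   r3, [r4, #3]        ; sign-extended byte: r3 = 0xFFFFFF80
        STR     r0, [r4, #4]        ; store word to the second location
stop
        MOV     r0, #0x18           ; angel_SWIreason_ReportException
        LDR     r1, =0x20026        ; ADP_Stopped_ApplicationExit
        SWI     0x123456            ; ARM semihosting SWI

        AREA    LoadStoreData, DATA, READWRITE  ; hypothetical data section
buffer  DCD     0x80FF7F01
        DCD     0
        END
```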
Multiple register load and store instructions
These instructions load or store any subset of the general-purpose registers from or to
memory. Refer to Load and store multiple register instructions on page 2-39 for a
detailed description of these instructions.
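As an illustration, the following sketch copies four words with one load multiple and one store multiple instruction. The section names and the labels src and dst are hypothetical.

```armasm
        AREA    BlockCopy, CODE, READONLY   ; hypothetical section name
        ENTRY
start
        ADR     r0, src             ; r0 = address of source words
        LDR     r1, =dst            ; r1 = address of destination
        LDMIA   r0!, {r2-r5}        ; load four words, r0 advances by 16
        STMIA   r1!, {r2-r5}        ; store four words, r1 advances by 16
stop
        MOV     r0, #0x18           ; angel_SWIreason_ReportException
        LDR     r1, =0x20026        ; ADP_Stopped_ApplicationExit
        SWI     0x123456            ; ARM semihosting SWI
src     DCD     1, 2, 3, 4

        AREA    BlockCopyData, DATA, READWRITE  ; hypothetical data section
dst     SPACE   16                  ; room for four words
        END
```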
Status register access instructions
These instructions move the contents of the CPSR or an SPSR to or from a
general-purpose register.
Semaphore instructions
These instructions load and alter a memory semaphore.
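A minimal sketch of a lock built on the SWP instruction follows. The section names and the label lock are hypothetical, and the convention that 0 means free and 1 means taken is an assumption made for this illustration only.

```armasm
        AREA    Semaphore, CODE, READONLY   ; hypothetical section name
        ENTRY
start
        LDR     r1, =lock           ; r1 = address of the semaphore
        MOV     r0, #1              ; value that marks the lock as taken
spin    SWP     r2, r0, [r1]        ; r2 = old value, [r1] = 1, in one operation
        CMP     r2, #0              ; was the lock free?
        BNE     spin                ; no: try again
        ; ... critical region would go here ...
        MOV     r2, #0
        STR     r2, [r1]            ; release the lock
stop
        MOV     r0, #0x18           ; angel_SWIreason_ReportException
        LDR     r1, =0x20026        ; ADP_Stopped_ApplicationExit
        SWI     0x123456            ; ARM semihosting SWI

        AREA    SemaphoreData, DATA, READWRITE  ; hypothetical data section
lock    DCD     0                   ; 0 = free (assumed convention)
        END
```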
Coprocessor instructions
These instructions support a general way to extend the ARM architecture.
The following general points apply to ARM instructions:
•Conditional execution
•Register access
•Access to the inline barrel shifter.
Conditional execution
Almost all ARM instructions can be executed conditionally on the value of the ALU
status flags in the CPSR. You do not need to use branches to skip conditional
instructions, although it can be better to do so when a series of instructions depend on
the same condition.
You can specify whether a data processing instruction sets the state of these flags or not.
You can use the flags set by one instruction to control execution of other instructions
even if there are many instructions in between.
Refer to Conditional execution on page 2-20 for a detailed description.
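The following sketch shows a flag-setting instruction controlling a later instruction, with an unrelated instruction in between. The section name CondExec and the values are hypothetical.

```armasm
        AREA    CondExec, CODE, READONLY    ; hypothetical section name
        ENTRY
start
        MVN     r0, #6              ; r0 = -7
        MOV     r2, #0
        CMP     r0, #0              ; set the flags from r0 - 0
        ADD     r2, r2, #1          ; no S suffix, so the flags are preserved
        RSBLT   r0, r0, #0          ; executed only if r0 < 0: r0 = 0 - r0 = 7
stop
        MOV     r0, #0x18           ; angel_SWIreason_ReportException
        LDR     r1, =0x20026        ; ADP_Stopped_ApplicationExit
        SWI     0x123456            ; ARM semihosting SWI
        END
```

No branch is needed to skip the RSB instruction when r0 is not negative. The LT condition suffix alone determines whether it executes.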
Register access
In ARM state, all instructions can access r0 to r14, and most also allow access to r15
(pc). The MRS and MSR instructions can move the contents of the CPSR and SPSRs to a
general-purpose register, where they can be manipulated by normal data processing
operations. Refer to MRS on page 4-73 and MSR on page 4-74 for more information.
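A brief sketch of this read, modify, and write sequence is shown below. The section name StatusAccess is hypothetical. The sketch assumes that the code runs in a privileged mode, because writes to the control field of the CPSR have no effect in User mode.

```armasm
        AREA    StatusAccess, CODE, READONLY ; hypothetical section name
        ENTRY
start
        MRS     r0, CPSR            ; copy the CPSR into r0
        AND     r1, r0, #0x1F       ; r1 = mode bits M[4:0]
        ORR     r0, r0, #0x80       ; set the I bit to disable IRQ
        MSR     CPSR_c, r0          ; write back the control field only
stop
        MOV     r0, #0x18           ; angel_SWIreason_ReportException
        LDR     r1, =0x20026        ; ADP_Stopped_ApplicationExit
        SWI     0x123456            ; ARM semihosting SWI
        END
```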
Access to the inline barrel shifter
The ARM arithmetic logic unit has a 32-bit barrel shifter that is capable of shift and
rotate operations. The second operand to all ARM data-processing and single register
data-transfer instructions can be shifted, before the data-processing or data-transfer is
executed, as part of the instruction. This supports, but is not limited to:
•scaled addressing
•multiplication by a constant
•constructing constants.
Refer to Loading constants into registers on page 2-25 for more information on using
the barrel-shifter to generate constants.
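The three uses listed above can be sketched as follows. The section name Shifter, the label table, and the values are hypothetical.

```armasm
        AREA    Shifter, CODE, READONLY     ; hypothetical section name
        ENTRY
start
        MOV     r1, #3
        ADD     r0, r1, r1, LSL #2  ; multiply by a constant: r0 = r1 * 5 = 15
        RSB     r2, r1, r1, LSL #3  ; r2 = (r1 * 8) - r1 = r1 * 7 = 21
        ADR     r4, table           ; r4 = base address of table
        MOV     r5, #2              ; r5 = word index
        LDR     r3, [r4, r5, LSL #2] ; scaled addressing: r3 = table[2] = 30
        MOV     r6, #0xFF000000     ; constant built from 0xFF rotated
stop
        MOV     r0, #0x18           ; angel_SWIreason_ReportException
        LDR     r1, =0x20026        ; ADP_Stopped_ApplicationExit
        SWI     0x123456            ; ARM semihosting SWI
table   DCD     10, 20, 30, 40
        END
```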
The functionality of the Thumb instruction set is almost exactly a subset of the
functionality of the ARM instruction set. The instruction set is optimized for production
by a C or C++ compiler.
All Thumb instructions are 16 bits long and are stored halfword-aligned in memory.
Because of this, the least significant bit of the address of an instruction is always zero
in Thumb state. Some instructions use the least significant bit to determine whether the
code being branched to is Thumb code or ARM code.
All Thumb data processing instructions:
•operate on full 32-bit values in registers
•use full 32-bit addresses for data access and for instruction fetches.
Refer to Chapter 5 Thumb Instruction Reference for detailed information on the syntax
of the Thumb instruction set, and how Thumb instructions differ from their ARM
counterparts.
2.2.8 Thumb instruction capabilities
The following general points apply to Thumb instructions:
•Conditional execution
•Register access
•Access to the barrel shifter on page 2-10.
Conditional execution
The conditional branch instruction is the only Thumb instruction that can be executed
conditionally on the value of the ALU status flags in the CPSR. All data processing
instructions update these flags, except when one or more high registers are specified as
operands to the MOV or ADD instructions. In these cases the flags cannot be updated.
You cannot have any data processing instructions between an instruction that sets a
condition and a conditional branch that depends on it. Use a conditional branch over any
instruction that you wish to be conditional.
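A conditional branch over a single instruction can be sketched in Thumb code as follows. The section name ThumbCond and the label skip are hypothetical, and the ARM header follows the same pattern as the Thumb example supplied with ADS.

```armasm
        AREA    ThumbCond, CODE, READONLY   ; hypothetical section name
        ENTRY
        CODE32                      ; processor starts in ARM state
header  ADR     r0, start + 1       ; set bit 0 to select Thumb state
        BX      r0
        CODE16                      ; subsequent instructions are Thumb
start
        MOV     r0, #5
        MOV     r1, #9
        CMP     r0, r1              ; set the flags
        BGE     skip                ; branch over the next instruction if r0 >= r1
        MOV     r0, r1              ; reached only if r0 < r1, so r0 = max(r0, r1)
skip
        MOV     r0, #0x18           ; angel_SWIreason_ReportException
        LDR     r1, =0x20026        ; ADP_Stopped_ApplicationExit
        SWI     0xAB                ; Thumb semihosting SWI
        END
```

The conditional branch immediately follows the CMP instruction, so no flag-setting data processing instruction comes between them.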
Register access
In Thumb state, most instructions can access only r0 to r7. These are referred to as the
low registers.
Registers r8 to r15 are limited access registers. In Thumb state these are referred to as
high registers. They can be used, for example, as fast temporary storage.
Refer to Chapter 5 Thumb Instruction Reference for a complete list of the Thumb data
processing instructions that can access the high registers.
Access to the barrel shifter
In Thumb state you can use the barrel shifter only in a separate operation, using an
LSL, LSR, ASR, or ROR instruction.
2.2.9 Differences between Thumb and ARM instruction sets
The general differences between the Thumb instruction set and the ARM instruction set
are dealt with under the following headings:
•Branch instructions
•Data processing instructions
•Single register load and store instructions on page 2-11
•Multiple register load and store instructions on page 2-11.
There are no Thumb coprocessor instructions, no Thumb semaphore instructions, and
no Thumb instructions to access the CPSR or SPSR.
Branch instructions
These instructions are used to:
•branch backwards to form loops
•branch forward in conditional structures
•branch to subroutines
•change the processor from Thumb state to ARM state.
Program-relative branches, particularly conditional branches, are more limited in range
than in ARM code, and branches to subroutines can only be unconditional.
Data processing instructions
These operate on the general-purpose registers. In many cases, the result of the
operation must be put in one of the operand registers, not in a third register. There are
fewer data processing operations available than in ARM state. They have limited access
to registers r8 to r15.
The ALU status flags in the CPSR are always updated by these instructions except when
MOV or ADD instructions access registers r8 to r15. Thumb data processing instructions
that access registers r8 to r15 cannot update the flags.
Instructions, pseudo-instructions, and directives must be preceded by white space, such
as a space or a tab, even if there is no label.
All three sections of the source line are optional. You can use blank lines to make your
code more readable.
Case rules
Instruction mnemonics, directives, and symbolic register names can be written in
uppercase or lowercase, but not mixed.
Line length
To make source files easier to read, a long line of source can be split onto several lines
by placing a backslash character ( \ ) at the end of the line. The backslash must not be
followed by any other characters (including spaces and tabs). The backslash/end-of-line
sequence is treated by the assembler as white space.
Note
Do not use the backslash/end-of-line sequence within quoted strings.
The exact limit on the length of lines, including any extensions using backslashes,
depends on the contents of the line, but is generally between 128 and 255 characters.
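As a sketch of line splitting, the following module spreads one DCD directive over two source lines. The section name LineSplit and the label table are hypothetical. Each backslash is the last character on its line.

```armasm
        AREA    LineSplit, CODE, READONLY   ; hypothetical section name
        ENTRY
start
        ADR     r0, table           ; r0 = address of the table
        LDR     r1, [r0, #28]       ; r1 = eighth word of the table
stop
        MOV     r0, #0x18           ; angel_SWIreason_ReportException
        LDR     r1, =0x20026        ; ADP_Stopped_ApplicationExit
        SWI     0x123456            ; ARM semihosting SWI
; The next two source lines form a single DCD directive
table   DCD     1, 2, 3, 4, \
                5, 6, 7, 8
        END
```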
Labels are symbols that represent addresses. The address given by a label is calculated
during assembly.
The assembler calculates the address of a label relative to the origin of the section where
the label is defined. A reference to a label within the same section can use the program
counter plus or minus an offset. This is called program-relative addressing.
Labels can be defined in a map. See Describing data structures with MAP and FIELD directives on page 2-51. You can place the origin of the map in a specified register at
runtime, and references to the label use the specified register plus an offset. This is
called register-relative addressing.
Addresses of labels in other sections are calculated at link time, when the linker has
allocated specific locations in memory for each section.
Local labels
Local labels are a subclass of label. A local label begins with a number in the range
0-99. Unlike other labels, a local label can be defined many times. Local labels are
useful when you are generating labels with a macro. When the assembler finds a
reference to a local label, it links it to a nearby instance of the local label.
The scope of local labels is limited by the AREA directive. You can use the ROUT directive
to limit the scope more tightly.
Refer to Local labels on page 3-16 for details of:
•the syntax of local label declarations
•how the assembler associates references to local labels with their labels.
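A minimal sketch of a local label inside a ROUT scope follows. The section name LocalLabels and the routine name countdown are hypothetical. The reference %B1 asks the assembler to search backwards for local label 1.

```armasm
        AREA    LocalLabels, CODE, READONLY ; hypothetical section name
        ENTRY
start
        MOV     r0, #3
        BL      countdown           ; call the routine below
stop
        MOV     r0, #0x18           ; angel_SWIreason_ReportException
        LDR     r1, =0x20026        ; ADP_Stopped_ApplicationExit
        SWI     0x123456            ; ARM semihosting SWI
countdown ROUT                      ; start a new local label scope
1       SUBS    r0, r0, #1          ; local label 1: decrement and set flags
        BNE     %B1                 ; branch backwards to local label 1
        MOV     pc, lr              ; return when r0 reaches zero
        END
```

Because the label 1 is local, another routine in the same source file can define its own label 1 without a clash.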
Comments
The first semicolon on a line marks the beginning of a comment, except where the
semicolon appears inside a string constant. The end of the line is the end of the
comment. A comment alone is a valid line. All comments are ignored by the assembler.
Constants can be numeric, boolean, character or string:
Numbers
Numeric constants are accepted in three forms:
•Decimal, for example, 123
•Hexadecimal, for example, 0x7B
•n_xxx where:
n is a base between 2 and 9
xxx is a number in that base.
Boolean
The Boolean constants TRUE and FALSE must be written as {TRUE} and {FALSE}.
Characters
Character constants consist of opening and closing single quotes,
enclosing either a single character or an escaped character, using the
standard C escape characters.
Strings
Strings consist of opening and closing double quotes, enclosing
characters and spaces. If double quotes or dollar signs are used within a
string as literal text characters, they must be represented by a pair of the
appropriate character. For example, you must use $$ if you require a
single $ in the string. The standard C escape sequences can be used within
string constants.
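The constant forms can be sketched in one small module. The section names, the variable debug, and the label message are hypothetical.

```armasm
        GBLL    debug               ; hypothetical logical variable
debug   SETL    {TRUE}              ; Boolean constant in braces

        AREA    Constants, CODE, READONLY   ; hypothetical section name
        ENTRY
start
        MOV     r0, #123            ; decimal
        MOV     r1, #0x7B           ; hexadecimal, same value as 123
        MOV     r2, #2_1111011      ; base 2, same value as 123
        MOV     r3, #8_173          ; base 8, same value as 123
        MOV     r4, #'A'            ; character constant
stop
        MOV     r0, #0x18           ; angel_SWIreason_ReportException
        LDR     r1, =0x20026        ; ADP_Stopped_ApplicationExit
        SWI     0x123456            ; ARM semihosting SWI

        AREA    ConstantsData, DATA, READONLY   ; hypothetical data section
message DCB     "Price: $$5", 0     ; $$ gives a single $ in the string
        END
```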
Example 2-1 illustrates some of the core constituents of an assembly language module.
The example is written in ARM assembly language. It is supplied as armex.s in the
examples\asm subdirectory of ADS. Refer to Code examples on page 2-2 for instructions
on how to assemble, link, and execute the example.
The constituent parts of this example are described in more detail in the following
sections.
Example 2-1
        AREA    ARMex, CODE, READONLY
                                    ; Name this block of code ARMex
        ENTRY                       ; Mark first instruction to execute
start
        MOV     r0, #10             ; Set up parameters
        MOV     r1, #3
        ADD     r0, r0, r1          ; r0 = r0 + r1
stop
        MOV     r0, #0x18           ; angel_SWIreason_ReportException
        LDR     r1, =0x20026        ; ADP_Stopped_ApplicationExit
        SWI     0x123456            ; ARM semihosting SWI
        END                         ; Mark end of file
ELF sections and the AREA directive
ELF sections are independent, named, indivisible sequences of code or data. A single
code section is the minimum required to produce an application.
The output of an assembly or compilation can include:
•One or more code sections. These are usually read-only sections.
•One or more data sections. These are usually read-write sections. They may be
zero initialized (ZI).
The linker places each section in a program image according to section placement rules.
Sections that are adjacent in source files are not necessarily adjacent in the application
image. Refer to the Linker chapter in ADS Linker and Utilities Guide for more
information on how the linker places sections.
In an ARM assembly language source file, the start of a section is marked by the AREA
directive. This directive names the section and sets its attributes. The attributes are
placed after the name, separated by commas. Refer to AREA on page 7-52 for a detailed
description of the syntax of the AREA directive.
You can choose any name for your sections. However, names starting with any
nonalphabetic character must be enclosed in bars, or an AREA name missing error is
generated. For example: |1_DataArea|.
Example 2-1 on page 2-15 defines a single section called ARMex that contains code and
is marked as being READONLY.
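As a sketch of a module with more than one section, the following source defines a read-only code section and a read-write data section whose name needs bars. The section name MainCode and the label counter are hypothetical.

```armasm
        AREA    MainCode, CODE, READONLY    ; hypothetical code section
        ENTRY
start
        LDR     r0, =counter        ; address resolved when the image is linked
        LDR     r1, [r0]
        ADD     r1, r1, #1
        STR     r1, [r0]            ; write back to the read-write section
stop
        MOV     r0, #0x18           ; angel_SWIreason_ReportException
        LDR     r1, =0x20026        ; ADP_Stopped_ApplicationExit
        SWI     0x123456            ; ARM semihosting SWI

        AREA    |1_DataArea|, DATA, READWRITE ; name starts with a digit
counter DCD     0
        END
```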
The ENTRY directive
The ENTRY directive marks the first instruction to be executed. In applications containing
C code, an entry point is also contained within the C library initialization code.
Initialization code and exception handlers also contain entry points.
Application execution
The application code in Example 2-1 on page 2-15 begins executing at the label start,
where it loads the decimal values 10 and 3 into registers r0 and r1. These registers are
added together and the result placed in r0.
Application termination
After executing the main code, the application terminates by returning control to the
debugger. This is done using the ARM semihosting SWI (0x123456 by default), with
the following parameters:
•r0 equal to angel_SWIreason_ReportException (0x18)
•r1 equal to ADP_Stopped_ApplicationExit (0x20026).
Refer to the Semihosting SWIs chapter in ADS Debug Target Guide for additional
information.
The END directive
This directive instructs the assembler to stop processing this source file. Every assembly
To call subroutines, use a branch and link instruction. The syntax is:
        BL      destination
where destination is usually the label on the first instruction of the subroutine.
destination can also be a program-relative or register-relative expression. Refer to B
and BL on page 4-58 for further information.
The BL instruction:
•places the return address in the link register (lr)
•sets pc to the address of the subroutine.
After the subroutine code is executed you can use a MOV pc,lr instruction to return. By
convention, registers r0 to r3 are used to pass parameters to subroutines, and to pass
results back to the callers.
Note
Calls between separately assembled or compiled modules must comply with the
restrictions and conventions defined by the procedure call standard. Refer to the Using the Procedure Call Standard chapter in ADS Developer Guide for more information.
Example 2-2 shows a subroutine that adds the values of its two parameters and returns
a result in r0. It is supplied as subrout.s in the examples\asm subdirectory of ADS. Refer
to Code examples on page 2-2 for instructions on how to assemble, link, and execute the
example.
Example 2-2
        AREA    subrout, CODE, READONLY ; Name this block of code
        ENTRY                       ; Mark first instruction to execute
start   MOV     r0, #10             ; Set up parameters
        MOV     r1, #3
        BL      doadd               ; Call subroutine
stop    MOV     r0, #0x18           ; angel_SWIreason_ReportException
        LDR     r1, =0x20026        ; ADP_Stopped_ApplicationExit
        SWI     0x123456            ; ARM semihosting SWI
doadd   ADD     r0, r0, r1          ; Subroutine code
        MOV     pc, lr              ; Return from subroutine
        END                         ; Mark end of file
Example 2-3 illustrates some of the core constituents of a Thumb assembly language
module. It is based on subrout.s. It is supplied as thumbsub.s in the examples\asm
subdirectory of the ADS. Refer to Code examples on page 2-2 for instructions on how
to assemble, link, and execute the example.
Example 2-3
        AREA    ThumbSub, CODE, READONLY ; Name this block of code
        ENTRY                       ; Mark first instruction to execute
        CODE32                      ; Subsequent instructions are ARM
header  ADR     r0, start + 1       ; Processor starts in ARM state,
        BX      r0                  ; so small ARM code header used
                                    ; to call Thumb main program
        CODE16                      ; Subsequent instructions are Thumb
start
        MOV     r0, #10             ; Set up parameters
        MOV     r1, #3
        BL      doadd               ; Call subroutine
stop
        MOV     r0, #0x18           ; angel_SWIreason_ReportException
        LDR     r1, =0x20026        ; ADP_Stopped_ApplicationExit
        SWI     0xAB                ; Thumb semihosting SWI
doadd
        ADD     r0, r0, r1          ; Subroutine code
        MOV     pc, lr              ; Return from subroutine
        END                         ; Mark end of file
CODE32 and CODE16 directives
These directives instruct the assembler to assemble subsequent instructions as ARM
(CODE32) or Thumb (CODE16) instructions. They do not assemble to an instruction to
change the processor state at runtime. They only change the assembler state.
The ARM assembler, armasm, starts in ARM mode by default. You can use the -16 option
in the command line if you want it to start in Thumb mode.
BX instruction
This instruction is a branch that can change processor state at runtime. The least
significant bit of the target address specifies whether it is an ARM instruction (clear) or
a Thumb instruction (set). In this example, this bit is set in the ADR pseudo-instruction.
In ARM state, each data processing instruction has an option to update ALU status flags
in the Current Program Status Register (CPSR) according to the result of the operation.
Add an S suffix to an ARM data processing instruction to make it update the ALU status
flags in the CPSR.
Do not use the S suffix with CMP, CMN, TST, or TEQ. These comparison instructions always
update the flags. This is their only effect.
In Thumb state, there is no option. All data processing instructions update the ALU
status flags in the CPSR, except when one or more high registers are used in MOV and ADD
instructions. MOV and ADD cannot update the status flags in these cases.
Almost every ARM instruction can be executed conditionally on the state of the ALU
status flags in the CPSR. Refer to Table 2-1 on page 2-21 for a list of the suffixes to add
to instructions to make them conditional.
In ARM state, you can:
•update the ALU status flags in the CPSR on the result of a data operation
•execute several other data operations without updating the flags
•execute following instructions or not, according to the state of the flags updated
in the first operation.
In Thumb state, most data operations always update the flags, and conditional execution
can only be achieved using the conditional branch instruction (B). The suffixes for this
instruction are the same as in ARM state. No other instruction can be conditional.
2.5.1 The ALU status flags
The CPSR contains the following ALU status flags:
N Set when the result of the operation was Negative.
Z Set when the result of the operation was Zero.
C Set when the operation resulted in a Carry.
V Set when the operation caused oVerflow.
Q ARM architecture v5E only. Sticky flag (see The Q flag on page 4-5).
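The effect of the S suffix on the flags can be sketched with a simple counted loop in ARM state. The section name SuffixDemo and the label loop are hypothetical.

```armasm
        AREA    SuffixDemo, CODE, READONLY  ; hypothetical section name
        ENTRY
start
        MOV     r0, #5              ; loop counter
        MOV     r1, #0              ; running total
loop
        ADD     r1, r1, r0          ; no S suffix: flags are not changed
        SUBS    r0, r0, #1          ; S suffix: Z is set when r0 reaches 0
        BNE     loop                ; repeat while Z is clear
                                    ; here r1 = 5 + 4 + 3 + 2 + 1 = 15
stop
        MOV     r0, #0x18           ; angel_SWIreason_ReportException
        LDR     r1, =0x20026        ; ADP_Stopped_ApplicationExit
        SWI     0x123456            ; ARM semihosting SWI
        END
```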
A carry occurs if the result of an addition is greater than or equal to 2^32, if the result of
a subtraction is positive, or as the result of an inline barrel shifter operation in a move
or logical instruction.
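One common use of the carry from an addition is multi-word arithmetic. The following sketch adds two 64-bit values held in register pairs. The section name Add64 and the register assignments are hypothetical.

```armasm
        AREA    Add64, CODE, READONLY       ; hypothetical section name
        ENTRY
start
        MVN     r0, #0              ; low word of first value = 0xFFFFFFFF
        MOV     r1, #0              ; high word of first value
        MOV     r2, #1              ; low word of second value
        MOV     r3, #0              ; high word of second value
        ADDS    r4, r0, r2          ; low words: r4 = 0, C set (sum reached 2^32)
        ADC     r5, r1, r3          ; high words plus carry: r5 = 1
stop
        MOV     r0, #0x18           ; angel_SWIreason_ReportException
        LDR     r1, =0x20026        ; ADP_Stopped_ApplicationExit
        SWI     0x123456            ; ARM semihosting SWI
        END
```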
Overflow occurs if the result of an add, subtract, or compare is greater than or equal to
You can use conditional execution of ARM instructions to reduce the number of branch
instructions in your code. This improves code density.
Branch instructions are also expensive in processor cycles. On ARM processors without
branch prediction hardware, it typically takes three processor cycles to refill the
processor pipeline each time a branch is taken.
Some ARM processors, for example ARM10™ and StrongARM®, have branch
prediction hardware. In systems using these processors, the pipeline only needs to be
flushed and refilled when there is a misprediction.
2.5.4 Example of the use of conditional execution
This example uses two implementations of Euclid’s Greatest Common Divisor (gcd)
algorithm. It demonstrates how you can use conditional execution to improve code
density and execution speed. The detailed analysis of execution speed only applies to
an ARM7™ processor. The code density calculations apply to all ARM processors.
In C the algorithm can be expressed as:
int gcd(int a, int b)
{
    while (a != b)
    {
        if (a > b)
            a = a - b;
        else
            b = b - a;
    }
    return a;
}
You can implement the gcd function with conditional execution of branches only, in the
following way:
gcd     CMP     r0, r1
        BEQ     end
        BLT     less
        SUB     r0, r0, r1
        B       gcd
less
        SUB     r1, r1, r0
        B       gcd
end
Because of the number of branches, the code is seven instructions long. Every time a
branch is taken, the processor must refill the pipeline and continue from the new
location. The other instructions and non-executed branches use a single cycle each.
By using the conditional execution feature of the ARM instruction set, you can
implement the gcd function in only four instructions:
In addition to improving code size, this code executes faster in most cases. Table 2-2
and Table 2-3 on page 2-24 show the number of cycles used by each implementation for
the case where r0 equals 1 and r1 equals 2. In this case, replacing branches with
conditional execution of all instructions saves three cycles.
The conditional version of the code executes in the same number of cycles for any case
where r0 equals r1. In all other cases, the conditional version of the code executes in
fewer cycles.
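The cycle comparison can be checked with a small model. The following Python sketch is not assembler output. It assumes the costs quoted earlier (three cycles for a taken branch, one cycle for every other instruction, executed or not) and it assumes that the four-instruction version is the sequence CMP, SUBGT, SUBLT, BNE. The function names are hypothetical, and both inputs must be positive.

```python
# Cycle costs quoted in the text for a processor without branch prediction.
TAKEN_BRANCH = 3   # pipeline refill after a taken branch
OTHER = 1          # any other instruction, including a branch that is not taken


def gcd_branches_only(r0, r1):
    """Seven-instruction version: only the branches are conditional."""
    cycles = 0
    while True:
        cycles += OTHER                        # gcd   CMP r0, r1
        if r0 == r1:
            return r0, cycles + TAKEN_BRANCH   #       BEQ end (taken)
        cycles += OTHER                        #       BEQ end (not taken)
        if r0 < r1:
            cycles += TAKEN_BRANCH             #       BLT less (taken)
            r1 -= r0                           # less  SUB r1, r1, r0
        else:
            cycles += OTHER                    #       BLT less (not taken)
            r0 -= r1                           #       SUB r0, r0, r1
        cycles += OTHER + TAKEN_BRANCH         #       the SUB, then B gcd


def gcd_conditional(r0, r1):
    """Assumed four-instruction version: CMP, SUBGT, SUBLT, BNE."""
    cycles = 0
    while True:
        gt, lt = r0 > r1, r0 < r1              # CMP r0, r1 sets the flags once
        cycles += 3 * OTHER                    # CMP, SUBGT, SUBLT: one cycle each
        if gt:
            r0 -= r1                           # SUBGT r0, r0, r1
        if lt:
            r1 -= r0                           # SUBLT r1, r1, r0
        if not (gt or lt):
            return r0, cycles + OTHER          # BNE gcd (not taken)
        cycles += TAKEN_BRANCH                 # BNE gcd (taken)
```

Under these assumptions the model gives 13 cycles for the branch-only version and 10 cycles for the conditional version when r0 is 1 and r1 is 2, and 4 cycles for both when r0 equals r1.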
Because B is the only Thumb instruction that can be executed conditionally, the gcd
algorithm must be written with conditional branches in Thumb code.
Like the ARM conditional branch implementation, the Thumb code requires seven
instructions. However, because Thumb instructions are only 16 bits long, the overall
code size is 14 bytes, compared to 16 bytes for the smaller ARM implementation.
In addition, on a system using 16-bit memory the Thumb version runs faster than the
second ARM implementation because only one memory access is required for each
Thumb instruction, whereas each ARM instruction requires two fetches.
Branch prediction and caches
To optimize code for execution speed you need detailed knowledge of the instruction
timings, branch prediction logic, and cache behavior of your target system. Refer to
ARM Architecture Reference Manual and the technical reference manuals for individual
processors for full information.
You cannot load an arbitrary 32-bit immediate constant into a register in a single
instruction without performing a data load from memory. This is because ARM
instructions are only 32 bits long.
Thumb instructions have a similar limitation.
You can load any 32-bit value into a register with a data load, but there are more direct
and efficient ways to load many commonly-used constants. You can also include many
commonly-used constants directly as operands within data-processing instructions,
without a separate load operation at all.
The following sections describe:
•how to use the MOV and MVN instructions to load a range of immediate values, see
Direct loading with MOV and MVN on page 2-26
•how to use the LDR pseudo-instruction to load any 32-bit constant, see Loading
with LDR Rd, =const on page 2-27
•how to load floating-point constants, see Loading floating-point constants on
page 2-29.
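2.6.1 Direct loading with MOV and MVN
In ARM state, you can use the MOV and MVN instructions to load a range of constant
values directly into a register:
•MOV can load any 8-bit constant value, giving a range of 0x0 to 0xFF (0-255).
It can also rotate these values right by any even number of bit positions. Table 2-4
shows the range of values that this provides.
•MVN can load the bitwise complement of these values. The numerical values are
-(n+1), where n are the values given in Table 2-4.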
You do not need to calculate the necessary rotation. The assembler performs the
calculation for you.
You do not need to decide whether to use MOV or MVN. The assembler uses whichever is
appropriate. This is useful if the value is an assembly-time variable.
If you write an instruction with a constant that cannot be constructed, the assembler
reports the error:
Immediate n out of range for this operation.
The range of values shown in Table 2-4 can also be used as one of the operands in
data-processing operations. You cannot use their bitwise complements as operands, and
you cannot use them as operands in multiplication operations.
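The rule for which constants can be constructed can be sketched in Python. This is a model of the rule as stated above (an 8-bit value rotated right by an even number of bits, or the bitwise complement of such a value), not the assembler's own code. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def rotate_left32(value, bits):
    """Rotate a 32-bit value left; this undoes a rotate right by the same amount."""
    bits %= 32
    return ((value << bits) | (value >> (32 - bits))) & MASK32


def arm_immediate(value):
    """Return (imm8, rotation) if value is an 8-bit constant rotated right
    by an even number of bits, or None if no such encoding exists."""
    value &= MASK32
    for rotation in range(0, 32, 2):
        imm8 = rotate_left32(value, rotation)
        if imm8 <= 0xFF:
            return imm8, rotation
    return None


def mov_or_mvn(value):
    """Pick "MOV", "MVN", or None when neither can construct the value."""
    value &= MASK32
    if arm_immediate(value) is not None:
        return "MOV"
    if arm_immediate(~value & MASK32) is not None:
        return "MVN"
    return None
```

For example, 0xFF000000 is 0xFF rotated right by 8 bits, so MOV can load it. 0x102 would need an odd rotation, so neither MOV nor MVN can load it.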
In Thumb state you can use the MOV instruction to load constants in the range 0-255. You
cannot generate constants outside this range because:
•The Thumb MOV instruction does not provide inline access to the barrel shifter.
Constants cannot be right-rotated as they can in ARM state.
•The Thumb MVN instruction can act only on registers and not on constant values.
Bitwise complements cannot be directly loaded as they can in ARM state.
If you attempt to use a MOV instruction with a value outside the range 0-255, the
assembler reports the error:
Immediate n out of range for this operation.
2.6.2 Loading with LDR Rd, =const
The LDR Rd,=const pseudo-instruction can construct any 32-bit numeric constant in a
single instruction. Use this pseudo-instruction to generate constants that are out of range
of the MOV and MVN instructions.
The LDR pseudo-instruction generates the most efficient code for a specific constant:
•If the constant can be constructed with a MOV or MVN instruction, the assembler
generates the appropriate instruction.
•If the constant cannot be constructed with a MOV or MVN instruction, the assembler:
- places the value in a literal pool (a portion of memory embedded in the code
to hold constant values)
- generates an LDR instruction with a program-relative address that reads the
constant from the literal pool.
For example:
LDR rn, [pc, #offset to literal pool]
; load register n with one word
; from the address [pc + offset]
You must ensure that there is a literal pool within range of the LDR instruction
generated by the assembler. Refer to Placing literal pools on page 2-28 for more
information.
Refer to LDR ARM pseudo-instruction on page 4-82 for a description of the syntax of
the LDR pseudo-instruction.
The assembler places a literal pool at the end of each section. These are defined by the
AREA directive at the start of the following section, or by the END directive at the end of
the assembly. The END directive at the end of an included file does not signal the end of
a section.
In large sections the default literal pool can be out of range of one or more LDR
instructions. The offset from the pc to the constant must be:
•less than 4KB in ARM state, but can be in either direction
•forward and less than 1KB in Thumb state.
When an LDR Rd,=const pseudo-instruction requires the constant to be placed in a literal
pool, the assembler:
•Checks if the constant is available and addressable in any previous literal pools.
If so, it addresses the existing constant.
•Attempts to place the constant in the next literal pool if it is not already available.
If the next literal pool is out of range, the assembler generates an error message. In this
case you must use the LTORG directive to place an additional literal pool in the code. Place
the LTORG directive after the failed LDR pseudo-instruction, and within 4KB (ARM) or
1KB (Thumb). Refer to LTORG on page 7-14 for a detailed description.
You must place literal pools where the processor does not attempt to execute them as
instructions. Place them after unconditional branch instructions, or after the return
instruction at the end of a subroutine.
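The range rules and the two placement steps can be sketched in Python. This is a simplified model, not the assembler's algorithm. It assumes that you pass in the pc value the LDR instruction sees, and the function names and example addresses are hypothetical.

```python
ARM_RANGE = 4 * 1024     # offset must be less than 4KB, in either direction
THUMB_RANGE = 1 * 1024   # offset must be forward and less than 1KB


def pool_reachable(pc, literal_address, thumb=False):
    """True if a pc-relative LDR at this pc value can reach the literal."""
    offset = literal_address - pc
    if thumb:
        return 0 <= offset < THUMB_RANGE
    return -ARM_RANGE < offset < ARM_RANGE


def place_literal(pc, previous_copies, next_pool, thumb=False):
    """Reuse an addressable earlier copy of the constant if there is one,
    otherwise try the next literal pool. Returns the address used, or None
    where the assembler would report an error."""
    for address in previous_copies:
        if pool_reachable(pc, address, thumb):
            return address
    if pool_reachable(pc, next_pool, thumb):
        return next_pool
    return None
```

A result of None corresponds to the case where you must add an LTORG directive closer to the failed LDR pseudo-instruction.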
Example 2-5 shows how this works in practice. It is supplied as loadcon.s in the
examples\asm subdirectory of the ADS. The instructions listed as comments are the
ARM instructions that are generated by the assembler. Refer to Code examples on
page 2-2 for instructions on how to assemble, link, and execute the example.
Example 2-5
AREA Loadcon, CODE, READONLY
ENTRY ; Mark first instruction to execute
start BL func1 ; Branch to first subroutine
BL func2 ; Branch to second subroutine
stop MOV r0, #0x18 ; angel_SWIreason_ReportException
LDR r1, =0x20026 ; ADP_Stopped_ApplicationExit
SWI 0x123456 ; ARM semihosting SWI
func1
LDR r0, =42 ; => MOV R0, #42
LDR r1, =0x55555555 ; => LDR R1, [PC, #offset to
; Literal Pool 1]
LDR r2, =0xFFFFFFFF ; => MVN R2, #0
MOV pc, lr
LTORG ; Literal Pool 1 contains
; literal 0x55555555
func2
LDR r3, =0x55555555 ; => LDR R3, [PC, #offset to
; Literal Pool 1]
; LDR r4, =0x66666666 ; If this is uncommented it
; fails, because Literal Pool 2
; is out of reach
MOV pc, lr
LargeTable
SPACE 4200 ; Starting at the current location,
; clears a 4200 byte area of memory
; to zero
END ; Literal Pool 2 is empty
2.6.3 Loading floating-point constants
You can load any single-precision or double-precision floating-point constant in a single
instruction, using the FLD pseudo-instruction.
Refer to FLD pseudo-instruction on page 6-38 for details.
It is often necessary to load an address into a register. You might need to load the address
of a variable, a string constant, or the start location of a jump table.
Addresses are normally expressed as offsets from the current pc or other register.
This section describes two methods for loading an address into a register:
•load the register directly, see Direct loading with ADR and ADRL.
•load the address from a literal pool, see Loading addresses with LDR Rd, = label
on page 2-35.
2.7.1 Direct loading with ADR and ADRL
The ADR and ADRL pseudo-instructions enable you to generate an address, within a certain
range, without performing a data load. ADR and ADRL accept either of the following:
•A program-relative expression, which is a label with an optional offset, where the
address of the label is relative to the current pc.
•A register-relative expression, which is a label with an optional offset, where the
address of the label is relative to an address held in a specified general-purpose
register. Refer to Describing data structures with MAP and FIELD directives on
page 2-51 for information on specifying register-relative expressions.
The assembler converts an ADR rn,label pseudo-instruction by generating:
•a single ADD or SUB instruction that loads the address, if it is in range
•an error message if the address cannot be reached in a single instruction.
The offset range is ±255 bytes for an offset to a non word-aligned address, and ±1020
bytes (255 words) for an offset to a word-aligned address. (For Thumb, the address must
be word aligned, and the offset must be positive.)
The assembler converts an ADRL rn,label pseudo-instruction by generating:
•two data-processing instructions that load the address, if it is in range
•an error message if the address cannot be constructed in two instructions.
The range of an ADRL pseudo-instruction is ±64KB for a non word-aligned address and
±256KB for a word-aligned address. (There is no ADRL pseudo-instruction for Thumb.)
ADRL assembles to two instructions, if successful. The assembler generates two
instructions even if the address could be loaded in a single instruction.
Refer to Loading addresses with LDR Rd, = label on page 2-35 for information on
loading addresses that are outside the range of the ADRL pseudo-instruction.
Note
The label used with ADR or ADRL must be within the same code section. The assembler
faults references to labels that are out of range in the same section. The linker faults
references to labels that are out of range in other code sections.
In Thumb state, ADR can generate word-aligned addresses only.
ADRL is not available in Thumb code. Use it only in ARM code.
Example 2-6 shows the type of code generated by the assembler when assembling ADR
and ADRL pseudo-instructions. It is supplied as adrlabel.s in the examples\asm
subdirectory of the ADS. Refer to Code examples on page 2-2 for instructions on how
to assemble, link, and execute the example.
The instructions listed in the comments are the ARM instructions generated by the
assembler.
Example 2-6
AREA adrlabel, CODE,READONLY
ENTRY ; Mark first instruction to execute
Start
BL func ; Branch to subroutine
stop MOV r0, #0x18 ; angel_SWIreason_ReportException
LDR r1, =0x20026 ; ADP_Stopped_ApplicationExit
SWI 0x123456 ; ARM semihosting SWI
LTORG ; Create a literal pool
func ADR r0, Start ; => SUB r0, PC, #offset to Start
ADR r1, DataArea ; => ADD r1, PC, #offset to DataArea
; ADR r2, DataArea+4300 ; This would fail because the offset
; cannot be expressed by operand2
; of an ADD
ADRL r2, DataArea+4300 ; => ADD r2, PC, #offset1
; ADD r2, r2, #offset2
MOV pc, lr ; Return
DataArea SPACE 8000 ; Starting at the current location,
; clears a 8000 byte area of memory
; to zero
END
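The ADR and ADRL ranges quoted in this section can be expressed as a simple check. The Python sketch below models only those quoted ranges. It assumes that the offset is measured from the word-aligned pc value the instruction uses, and it treats ±64KB and ±256KB as magnitudes below 64 × 1024 and 256 × 1024 bytes. The function names are hypothetical.

```python
KB = 1024


def adr_reaches(offset, thumb=False):
    """True if a single ADR can generate an address at this pc-relative offset."""
    word_aligned = offset % 4 == 0
    if thumb:
        # Thumb: word-aligned addresses and positive (forward) offsets only.
        return word_aligned and 0 <= offset <= 1020
    limit = 1020 if word_aligned else 255
    return abs(offset) <= limit


def adrl_reaches(offset):
    """True if the two-instruction ADRL (ARM state only) can reach the offset."""
    limit = 256 * KB if offset % 4 == 0 else 64 * KB
    return abs(offset) < limit
```

An offset of 4300 bytes, like the one to DataArea+4300 in Example 2-6, is outside the ADR range but inside the ADRL range.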
AREA Jump, CODE, READONLY ; Name this block of code
CODE32 ; Following code is ARM code
num EQU 2 ; Number of entries in jump table
ENTRY ; Mark first instruction to execute
start ; First instruction to call
MOV r0, #0 ; Set up the three parameters
MOV r1, #3
MOV r2, #2
BL arithfunc ; Call the function
stop MOV r0, #0x18 ; angel_SWIreason_ReportException
LDR r1, =0x20026 ; ADP_Stopped_ApplicationExit
SWI 0x123456 ; ARM semihosting SWI
arithfunc ; Label the function
CMP r0, #num ; Treat function code as unsigned integer
MOVHS pc, lr ; If code is >= num then simply return
ADR r3, JumpTable ; Load address of jump table
LDR pc, [r3,r0,LSL#2] ; Jump to the appropriate routine
JumpTable
DCD DoAdd
DCD DoSub
DoAdd ADD r0, r1, r2 ; Operation 0
MOV pc, lr ; Return
DoSub SUB r0, r1, r2 ; Operation 1
MOV pc, lr ; Return
END ; Mark the end of this file
The LDR Rd,= pseudo-instruction can load any 32-bit constant into a register. See
Loading with LDR Rd, =const on page 2-27. It also accepts program-relative
expressions such as labels, and labels with offsets.
The assembler converts an LDR r0,=label pseudo-instruction by:
•Placing the address of label in a literal pool (a portion of memory embedded in
the code to hold constant values).
•Generating a program-relative LDR instruction that reads the address from the
literal pool, for example:
LDR rn, [pc, #offset to literal pool]
; load register n with one word
; from the address [pc + offset]
You must ensure that there is a literal pool within range. Refer to Placing literal
pools on page 2-28 for more information.
Unlike the ADR and ADRL pseudo-instructions, you can use LDR with labels that are outside
the current section. If the label is outside the current section, the assembler places a
relocation directive in the object code when the source file is assembled. The relocation
directive instructs the linker to resolve the address at link time. The address remains
valid wherever the linker places the section containing the LDR and the literal pool.
Example 2-9 shows how this works. It is supplied as ldrlabel.s in the examples\asm
subdirectory of the ADS. Refer to Code examples on page 2-2 for instructions on how
to assemble, link, and execute the example.
The instructions listed in the comments are the ARM instructions that are generated by
the assembler.
Example 2-9
AREA LDRlabel, CODE,READONLY
ENTRY ; Mark first instruction to execute
start
BL func1 ; Branch to first subroutine
BL func2 ; Branch to second subroutine
stop MOV r0, #0x18 ; angel_SWIreason_ReportException
LDR r1, =0x20026 ; ADP_Stopped_ApplicationExit
SWI 0x123456 ; ARM semihosting SWI
func1
LDR r0, =start ; => LDR R0,[PC, #offset into
; Literal Pool 1]
LDR r1, =Darea + 12 ; => LDR R1,[PC, #offset into
; Literal Pool 1]
LDR r2, =Darea + 6000 ; => LDR R2, [PC, #offset into
; Literal Pool 1]
MOV pc,lr ; Return
LTORG ; Literal Pool 1
func2
LDR r3, =Darea + 6000 ; => LDR r3, [PC, #offset into
; Literal Pool 1]
; (sharing with previous literal)
; LDR r4, =Darea + 6004 ; If uncommented produces an error
; as Literal Pool 2 is out of range
MOV pc, lr ; Return
Darea SPACE 8000 ; Starting at the current location,
; clears a 8000 byte area of memory
; to zero
END ; Literal Pool 2 is out of range of
; the LDR instructions above
The ARM and Thumb instruction sets include instructions that load and store multiple
registers to and from memory.
Multiple register transfer instructions provide an efficient way of moving the contents
of several registers to and from memory. They are most often used for block copy and
for stack operations at subroutine entry and exit. The advantages of using a multiple
register transfer instruction instead of a series of single data transfer instructions
include:
•Smaller code size.
•A single instruction fetch overhead, rather than many instruction fetches.
•On uncached ARM processors, the first word of data transferred by a load or store
multiple is always a nonsequential memory cycle, but all subsequent words
transferred can be sequential memory cycles. Sequential memory cycles are faster
in most systems.
Note
The lowest numbered register is transferred to or from the lowest memory address
accessed, and the highest numbered register to or from the highest address accessed.
The order of the registers in the register list in the instructions makes no difference.
Use the -checkreglist assembler command line option to check that registers in register
lists are specified in increasing order. Refer to Command syntax on page 3-2 for further
information.
The load (or store) multiple instruction loads (stores) any subset of the 16
general-purpose registers from (to) memory, using a single instruction.
Syntax
The syntax of the LDM instructions is:
LDM{cond}address-mode Rn{!},reg-list{^}
where:
cond
is an optional condition code. Refer to Conditional execution on
page 2-20 for more information.
address-mode
specifies the addressing mode of the instruction. Refer to LDM and STM
addressing modes on page 2-41 for details.
Rn
is the base register for the load operation. The address stored in this
register is the starting address for the load operation. Do not specify r15
(pc) as the base register.
!
specifies base register write back. If this is specified, the address in the
base register is updated after the transfer. It is decremented or
incremented by one word for each register in the register list.
reg-list
is a comma-delimited list of symbolic register names and register ranges
enclosed in braces. There must be at least one register in the list. Register
ranges are specified with a dash. For example:
{r0,r1,r4-r6,pc}
Do not specify writeback if the base register Rn is in reg-list.
^
You must not use this option in User or System mode. For details of its
use in privileged modes, see the Handling Processor Exceptions chapter
in ADS Developer Guide and LDM and STM on page 4-18.
The syntax of the STM instruction corresponds exactly, except for some details in the
operation of the ^ option.
Usage
See Implementing stacks with LDM and STM on page 2-42 and Block copy with LDM
and STM on page 2-44.
2.8.2 LDM and STM addressing modes
There are four different addressing modes. The base register can be incremented or
decremented by one word for each register in the operation, and the increment or
decrement can occur before or after the operation. The suffixes for these options are:
IA    Increment after.
IB    Increment before.
DA    Decrement after.
DB    Decrement before.
There are alternative addressing mode suffixes that are easier to use for stack operations.
See Implementing stacks with LDM and STM on page 2-42.
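The effect of the four addressing modes can be sketched as a Python model that works out which address each register uses and what value is written back to the base register. It is an illustration of the description above, not a processor model. The function name is hypothetical, and registers are given as register numbers.

```python
WORD = 4


def transfer_plan(mode, base, registers):
    """Return ({register: address}, written_back_base) for an LDM or STM.

    mode is one of "IA", "IB", "DA", "DB".
    """
    regs = sorted(set(registers))      # the order in the list makes no difference
    size = WORD * len(regs)
    if mode == "IA":
        lowest, new_base = base, base + size
    elif mode == "IB":
        lowest, new_base = base + WORD, base + size
    elif mode == "DA":
        lowest, new_base = base - size + WORD, base - size
    elif mode == "DB":
        lowest, new_base = base - size, base - size
    else:
        raise ValueError("unknown addressing mode: " + mode)
    # The lowest numbered register always uses the lowest address.
    addresses = {reg: lowest + WORD * i for i, reg in enumerate(regs)}
    return addresses, new_base
```

For example, a DB transfer of r0, r1, and r5 from a base of 0x400 uses addresses 0x3F4, 0x3F8, and 0x3FC, and writes back 0x3F4, whatever order the registers are listed in.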
The load and store multiple instructions can update the base register. For stack
operations, the base register is usually the stack pointer, r13. This means that you can
use load and store multiple instructions to implement push and pop operations for any
number of registers in a single instruction.
The load and store multiple instructions can be used with several types of stack:
Descending or ascending
The stack grows downwards, starting with a high address and progressing
to a lower one (a descending stack), or upwards, starting from a low
address and progressing to a higher address (an ascending stack).
Full or empty
The stack pointer can either point to the last item in the stack (a full
stack), or the next free space on the stack (an empty stack).
To make it easier for the programmer, stack-oriented suffixes can be used instead of the
increment or decrement and before or after suffixes. Refer to Table 2-5 for a list of
stack-oriented suffixes.
Table 2-5 Suffixes for load and store multiple instructions
Stack type          Push             Pop
Full descending     STMFD (STMDB)    LDMFD (LDMIA)
Full ascending      STMFA (STMIB)    LDMFA (LDMDA)
Empty descending    STMED (STMDA)    LDMED (LDMIB)
Empty ascending     STMEA (STMIA)    LDMEA (LDMDB)
For example:
STMFD r13!, {r0-r5} ; Push onto a Full Descending Stack
LDMFD r13!, {r0-r5} ; Pop from a Full Descending Stack.
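The pairing of push and pop suffixes in Table 2-5 can be checked with a small Python model. The sketch below assumes a dictionary as a stand-in for memory and always uses base register write back. The names push, pop, and addresses are hypothetical helpers, not assembler features.

```python
WORD = 4

# Table 2-5: stack-oriented suffix -> equivalent addressing-mode suffix.
PUSH = {"FD": "DB", "FA": "IB", "ED": "DA", "EA": "IA"}   # STMxx
POP = {"FD": "IA", "FA": "DA", "ED": "IB", "EA": "DB"}    # LDMxx


def addresses(mode, base, count):
    """Ascending transfer addresses and the written-back base register."""
    size = WORD * count
    first = {"IA": base, "IB": base + WORD,
             "DA": base - size + WORD, "DB": base - size}[mode]
    new_base = base + size if mode[0] == "I" else base - size
    return [first + WORD * i for i in range(count)], new_base


def push(stack_type, sp, memory, values):
    """Store values (lowest register first) and return the new stack pointer."""
    slots, sp = addresses(PUSH[stack_type], sp, len(values))
    memory.update(zip(slots, values))
    return sp


def pop(stack_type, sp, memory, count):
    """Load count values and return them with the new stack pointer."""
    slots, sp = addresses(POP[stack_type], sp, count)
    return [memory[a] for a in slots], sp
```

For each of the four stack types, a pop that follows a push of the same number of registers returns the same values and restores the original stack pointer.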
Note
The ARM-Thumb Procedure Call Standard (ATPCS), and ARM and Thumb C and C++
compilers always use a full descending stack.
Stack operations are very useful at subroutine entry and exit. At the start of a subroutine,
any working registers required can be stored on the stack, and at exit they can be popped
off again.
In addition, if the link register is pushed onto the stack at entry, additional subroutine
calls can safely be made without causing the return address to be lost. If you do this, you
can also return from a subroutine by popping the pc off the stack at exit, instead of
popping lr and then moving that value into the pc. For example:
subroutine STMFD sp!, {r5-r7,lr} ; Push work registers and lr
; code
BL somewhere_else
; code
LDMFD sp!, {r5-r7,pc} ; Pop work registers and pc
Note
Use this with care in mixed ARM and Thumb systems. In ARM architecture v4T
systems, you cannot change state by popping directly into the program counter.
In ARM architecture v5T and above, you can change state in this way.
See the Interworking ARM and Thumb chapter in ADS Developer Guide for further
information on mixing ARM and Thumb.
Example 2-11 is an ARM code routine that copies a set of words from a source location
to a destination by copying a single word at a time. It is supplied as word.s in the
examples\asm subdirectory of the ADS. Refer to Code examples on page 2-2 for
instructions on how to assemble, link, and execute the example.
Example 2-11 Block copy
AREA Word, CODE, READONLY ; name this block of code
num EQU 20 ; set number of words to be copied
ENTRY ; mark the first instruction to call
start
LDR r0, =src ; r0 = pointer to source block
LDR r1, =dst ; r1 = pointer to destination block
MOV r2, #num ; r2 = number of words to copy
wordcopy LDR r3, [r0], #4 ; load a word from the source and
STR r3, [r1], #4 ; store it to the destination
SUBS r2, r2, #1 ; decrement the counter
BNE wordcopy ; ... copy more
stop MOV r0, #0x18 ; angel_SWIreason_ReportException
LDR r1, =0x20026 ; ADP_Stopped_ApplicationExit
SWI 0x123456 ; ARM semihosting SWI
AREA BlockData, DATA, READWRITE
src DCD 1,2,3,4,5,6,7,8,1,2,3,4,5,6,7,8,1,2,3,4
dst DCD 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
END
This module can be made more efficient by using LDM and STM for as much of the copying
as possible. Eight is a sensible number of words to transfer at a time, given the number
of registers that the ARM has. The number of eight-word multiples in the block to be
copied can be found (if r2 = number of words to be copied) using:
MOVS r3, r2, LSR #3 ; number of eight word multiples
This value can be used to control the number of iterations through a loop that copies
eight words per iteration. When there are fewer than eight words left, the number of words
left can be found (assuming that r2 has not been corrupted) using:
ANDS r2, r2, #7
Example 2-12 on page 2-45 lists the block copy module rewritten to use LDM and STM for
copying.
AREA Block, CODE, READONLY ; name this block of code
num EQU 20 ; set number of words to be copied
ENTRY ; mark the first instruction to call
start
LDR r0, =src ; r0 = pointer to source block
LDR r1, =dst ; r1 = pointer to destination block
MOV r2, #num ; r2 = number of words to copy
MOV sp, #0x400 ; Set up stack pointer (r13)
blockcopy MOVS r3,r2, LSR #3 ; Number of eight word multiples
BEQ copywords ; Less than eight words to move?
STMFD sp!, {r4-r11} ; Save some working registers
octcopy LDMIA r0!, {r4-r11} ; Load 8 words from the source
STMIA r1!, {r4-r11} ; and put them at the destination
SUBS r3, r3, #1 ; Decrement the counter
BNE octcopy ; ... copy more
LDMFD sp!, {r4-r11} ; Don't need these now - restore
; originals
copywords ANDS r2, r2, #7 ; Number of odd words to copy
BEQ stop ; No words left to copy?
wordcopy LDR r3, [r0], #4 ; Load a word from the source and
STR r3, [r1], #4 ; store it to the destination
SUBS r2, r2, #1 ; Decrement the counter
BNE wordcopy ; ... copy more
stop MOV r0, #0x18 ; angel_SWIreason_ReportException
LDR r1, =0x20026 ; ADP_Stopped_ApplicationExit
SWI 0x123456 ; ARM semihosting SWI
AREA BlockData, DATA, READWRITE
src DCD 1,2,3,4,5,6,7,8,1,2,3,4,5,6,7,8,1,2,3,4
dst DCD 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
END
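The way the routine splits the work into eight-word groups and leftover words can be sketched in Python. This models only the arithmetic and the copy order. It uses a list in place of memory, and the function name is hypothetical.

```python
def block_copy(src, count):
    """Copy count words from src: eight at a time, then one at a time."""
    dst = []
    position = 0
    eights = count >> 3                    # MOVS r3, r2, LSR #3
    for _ in range(eights):
        dst.extend(src[position:position + 8])   # LDMIA r0!, then STMIA r1!
        position += 8
    rest = count & 7                       # ANDS r2, r2, #7
    for _ in range(rest):
        dst.append(src[position])          # LDR r3, [r0], #4 then STR r3, [r1], #4
        position += 1
    return dst, eights, rest
```

With the 20 words used in the example, the eight-word loop runs twice and the single-word loop copies the remaining four words.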
The Thumb instruction set contains two pairs of multiple-register transfer instructions:
•LDM and STM for block memory transfers
•PUSH and POP for stack operations.
LDM and STM
These instructions can be used to load or store any subset of the low registers from or
to memory. The base register is always updated at the end of the multiple register
transfer instruction. You must specify the ! character. The only valid suffix for these
instructions is IA.
Examples of these instructions are:
LDMIA r1!, {r0,r2-r7}
STMIA r4!, {r0-r3}
PUSH and POP
These instructions can be used to push any subset of the low registers and (optionally)
the link register onto the stack, and to pop any subset of the low registers and
(optionally) the pc off the stack. The base address of the stack is held in r13. Examples
of these instructions are:
PUSH {r0-r3}
POP {r0-r3}
PUSH {r4-r7,lr}
POP {r4-r7,pc}
The optional addition of the lr or pc to the register list provides support for subroutine
entry and exit.
The stack is always full descending.
Thumb-state block copy example
The block copy example, Example 2-11 on page 2-44, can be converted into Thumb
instructions (see Example 2-13 on page 2-47).
Because the Thumb LDM and STM instructions can access only the low registers, the
number of words copied per iteration is reduced from eight to four. In addition, the LDM
and STM instructions can be used to carry out the single word at a time copy, because they
update the base pointer after each access. If LDR and STR were used for this, separate
ADD instructions would be required to update each base pointer.
AREA Tblock, CODE, READONLY ; Name this block of code
num EQU 20 ; Set number of words to be copied
ENTRY ; Mark first instruction to execute
header ; The first instruction to call
MOV sp, #0x400 ; Set up stack pointer (r13)
ADR r0, start + 1 ; Processor starts in ARM state,
BX r0 ; so small ARM code header used
; to call Thumb main program
CODE16 ; Subsequent instructions are Thumb
start
LDR r0, =src ; r0 =pointer to source block
LDR r1, =dst ; r1 =pointer to destination block
MOV r2, #num ; r2 =number of words to copy
blockcopy
LSR r3,r2, #2 ; Number of four word multiples
BEQ copywords ; Less than four words to move?
PUSH {r4-r7} ; Save some working registers
quadcopy
LDMIA r0!, {r4-r7} ; Load 4 words from the source
STMIA r1!, {r4-r7} ; and put them at the destination
SUB r3, #1 ; Decrement the counter
BNE quadcopy ; ... copy more
POP {r4-r7} ; Don't need these now-restore originals
copywords
MOV r3, #3 ; Bottom two bits represent number
AND r2, r3 ; ...of odd words left to copy
BEQ stop ; No words left to copy?
wordcopy
LDMIA r0!, {r3} ; load a word from the source and
STMIA r1!, {r3} ; store it to the destination
SUB r2, #1 ; Decrement the counter
BNE wordcopy ; ... copy more
stop MOV r0, #0x18 ; angel_SWIreason_ReportException
LDR r1, =0x20026 ; ADP_Stopped_ApplicationExit
SWI 0xAB ; Thumb semihosting SWI
AREA BlockData, DATA, READWRITE
src DCD 1,2,3,4,5,6,7,8,1,2,3,4,5,6,7,8,1,2,3,4
dst DCD 0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
END
A macro definition is a block of code enclosed between MACRO and MEND directives. It
defines a name that can be used instead of repeating the whole block of code. This has
two main uses:
•to make it easier to follow the logic of the source code, by replacing a block of
code with a single, meaningful name
•to avoid repeating a block of code several times.
Refer to MACRO and MEND on page 7-27 for more details.
2.9.1 Test-and-branch macro example
A test-and-branch operation requires two ARM instructions to implement.
You can define a macro such as this:
MACRO
$label TestAndBranch $dest, $reg, $cc
$label CMP $reg, #0
B$cc $dest
MEND
The line after the MACRO directive is the macro prototype statement. The macro prototype
statement defines the name (TestAndBranch) you use to invoke the macro. It also defines
parameters ($label, $dest, $reg, and $cc). You must give values to the parameters when
you invoke the macro. The assembler substitutes the values you give into the code.
This macro can be invoked as follows:
test TestAndBranch NonZero, r0, NE
...
...
NonZero
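The substitution step can be illustrated with a short Python sketch. It is a simplified model of parameter substitution for this one macro, not a description of how the assembler is implemented. The expand function and the TEMPLATE list are hypothetical.

```python
import re

# Body of the TestAndBranch macro, one string per source line.
TEMPLATE = [
    "$label CMP $reg, #0",
    "       B$cc $dest",
]


def expand(template, **values):
    """Replace each $name in the template with the value supplied for it."""
    def substitute(match):
        return values[match.group(1)]
    return [re.sub(r"\$(\w+)", substitute, line) for line in template]


expanded = expand(TEMPLATE, label="test", dest="NonZero", reg="r0", cc="NE")
```

For the invocation shown above, the model produces a CMP of r0 against zero at the label test, followed by BNE NonZero.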
Example 2-14 shows a macro that performs an unsigned integer division. It takes four
parameters:
$Bot
The register that holds the divisor.
$Top
The register that holds the dividend before the instructions are executed.
After the instructions are executed, it holds the remainder.
$Div
The register where the quotient of the division is placed. It can be null
("") if only the remainder is required.
$Temp
A temporary register used during the calculation.
MACRO
$Lab DivMod $Div,$Top,$Bot,$Temp
ASSERT $Top <> $Bot ; Produce an error message if the
ASSERT $Top <> $Temp ; registers supplied are
ASSERT $Bot <> $Temp ; not all different
IF "$Div" <> ""
ASSERT $Div <> $Top ; These three only matter if $Div
ASSERT $Div <> $Bot ; is not null ("")
ASSERT $Div <> $Temp ;
ENDIF
$Lab
MOV $Temp, $Bot ; Put divisor in $Temp
CMP $Temp, $Top, LSR #1 ; double it until
90 MOVLS $Temp, $Temp, LSL #1 ; 2 * $Temp > $Top
CMP $Temp, $Top, LSR #1
BLS %b90 ; The b means search backwards
IF "$Div" <> "" ; Omit next instruction if $Div is null
MOV $Div, #0 ; Initialize quotient
ENDIF
91 CMP $Top, $Temp ; Can we subtract $Temp?
SUBCS $Top, $Top,$Temp ; If we can, do so
IF "$Div" <> "" ; Omit next instruction if $Div is null
ADC $Div, $Div, $Div ; Double $Div
ENDIF
MOV $Temp, $Temp, LSR #1 ; Halve $Temp,
CMP $Temp, $Bot ; and loop until
BHS %b91 ; less than divisor
MEND
The macro checks that no two parameters use the same register. It also optimizes the
code produced if only the remainder is required.
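The shift-and-subtract algorithm that the macro implements can be followed more easily in a high-level model. The Python sketch below mirrors the macro instruction by instruction for 32-bit unsigned values. It assumes a nonzero divisor, because the doubling loop does not terminate for zero. The function name is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def divmod_model(top, bot):
    """Return (quotient, remainder) for unsigned 32-bit top / bot, bot != 0."""
    temp = bot                                  # MOV   $Temp, $Bot
    lower_or_same = temp <= (top >> 1)          # CMP   $Temp, $Top, LSR #1
    while True:
        if lower_or_same:
            temp = (temp << 1) & MASK32         # MOVLS $Temp, $Temp, LSL #1
        lower_or_same = temp <= (top >> 1)      # CMP   $Temp, $Top, LSR #1
        if not lower_or_same:                   # BLS   %b90
            break
    div = 0                                     # MOV   $Div, #0
    while True:
        carry = top >= temp                     # CMP   $Top, $Temp
        if carry:
            top -= temp                         # SUBCS $Top, $Top, $Temp
        div = ((div << 1) | int(carry)) & MASK32    # ADC $Div, $Div, $Div
        temp >>= 1                              # MOV   $Temp, $Temp, LSR #1
        if temp < bot:                          # BHS   %b91 loops while $Temp >= $Bot
            break
    return div, top
```

The ADC instruction doubles the quotient and adds the carry from the comparison, so each pass through the second loop produces one quotient bit.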
To avoid multiple definitions of labels if DivMod is used more than once in the assembler
source, the macro uses local labels (90, 91). Refer to Local labels on page 2-13 for more
information.
Example 2-15 shows the code that this macro produces if it is invoked as follows:
ratio DivMod r0,r5,r4,r2
Example 2-15
ASSERT r5 <> r4 ; Produce an error if the
ASSERT r5 <> r2 ; registers supplied are
ASSERT r4 <> r2 ; not all different
ASSERT r0 <> r5 ; These three only matter if $Div
ASSERT r0 <> r4 ; is not null ("")
ASSERT r0 <> r2 ;
ratio
MOV r2, r4 ; Put divisor in $Temp
CMP r2, r5, LSR #1 ; double it until
90 MOVLS r2, r2, LSL #1 ; 2 * r2 > r5
CMP r2, r5, LSR #1
BLS %b90 ; The b means search backwards
MOV r0, #0 ; Initialize quotient
91 CMP r5, r2 ; Can we subtract r2?
SUBCS r5, r5, r2 ; If we can, do so
ADC r0, r0, r0 ; Double r0
MOV r2, r2, LSR #1 ; Halve r2,
CMP r2, r4 ; and loop until
BHS %b91 ; less than divisor
2.10 Describing data structures with MAP and FIELD directives
You can use the MAP and FIELD directives to describe data structures. These directives are
always used together.
Data structures defined using MAP and FIELD:
•are easily maintainable
•can be used to describe multiple instances of the same structure
•make it easy to access data efficiently.
The MAP directive specifies the base address of the data structure. Refer to MAP on
page 7-15 for further information.
The FIELD directive specifies the amount of memory required for a data item, and can
give the data item a label. It is repeated for each data item in the structure. Refer to
FIELD on page 7-16 for further information.
Note
No space in memory is allocated when a map is defined. Use define constant directives
(for example, DCD) to allocate space in memory.
To access data more than 4KB away from the current instruction, you can use a
register-relative instruction, such as:
LDR r4,[r9,#offset]
offset is limited to less than 4096 (4KB), so r9 must already contain a value within 4KB
of the address of the data.
Example 2-16
MAP 0
consta FIELD 4 ; consta uses four bytes, located at offset 0
constb FIELD 4 ; constb uses four bytes, located at offset 4
x FIELD 8 ; x uses eight bytes, located at offset 8
y FIELD 8 ; y uses eight bytes, located at offset 16
string FIELD 256 ; string is up to 256 bytes long, starting at offset 24
Using the map in Example 2-16, you can access the data structure using the following
instructions:
MOV r9,#4096
LDR r4,[r9,#constb]
The labels are relative to the start of the data structure. The register used to hold the start
address of the map (r9 in this case) is called the base register.
There are likely to be many LDR or STR instructions accessing data in this data structure.
This map does not contain the location of the data structure. The location of the
structure is determined by the value loaded into the base register at runtime.
The same map can be used to describe many instances of the data structure. These can
be located anywhere in memory.
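The way a relative map assigns offsets without allocating memory can be modeled in a few lines. The Python sketch below is an illustration under stated assumptions, not a description of how the assembler is implemented. The StorageMap class and the address_of helper are hypothetical names. The field sizes are those of Example 2-16.

```python
class StorageMap:
    """Hypothetical model of MAP and FIELD: a location counter, no allocation."""

    def __init__(self, base=0):           # corresponds to: MAP base
        self.counter = base
        self.offsets = {}

    def field(self, label, size):         # corresponds to: label FIELD size
        self.offsets[label] = self.counter
        self.counter += size
        return self.offsets[label]


m = StorageMap(0)
for label, size in [("consta", 4), ("constb", 4), ("x", 8), ("y", 8), ("string", 256)]:
    m.field(label, size)


def address_of(base_register_value, label):
    """Address reached by LDR rd,[rb,#label] for a given base register value."""
    return base_register_value + m.offsets[label]
```

Because the map holds only offsets, the same labels serve any instance of the structure: address_of(4096, "constb") and address_of(0x8000, "constb") both use the offset 4, with different base register values.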
There are restrictions on what addresses can be loaded into a register using the MOV
instruction. Refer to Loading addresses into registers on page 2-30 for details of how to
load arbitrary addresses.
Note
r9 is the static base register (sb) in the ARM-Thumb Procedure Call Standard. Refer to
the Using the Procedure Call Standard chapter in ADS Developer Guide for further
information.
In many cases, you can use the same register as the base register every time you access
a data structure. You can include the name of the register in the base address of the map.
Example 2-17 shows such a register-based map. The labels defined in the map include
the register.
        MAP     0,r9
consta  FIELD   4       ; consta uses four bytes, located at offset 0 (from r9)
constb  FIELD   4       ; constb uses four bytes, located at offset 4
x       FIELD   8       ; x uses eight bytes, located at offset 8
y       FIELD   8       ; y uses eight bytes, located at offset 16
string  FIELD   256     ; string is up to 256 bytes long, starting at offset 24
Using the map in Example 2-17, you can access the data structure wherever it is:
You can use the program counter (r15) as the base register for a map. In this case, each
STM or LDM instruction must be within 4KB of the data item it addresses, because the
offset is limited to 4KB. The data structure must be in the same section as the
instructions, because otherwise there is no guarantee that the data items will be within
range after linking.
Example 2-18 shows a program fragment with such a map. It includes a directive which
allocates space in memory for the data structure, and an instruction which accesses it.
Example 2-18
datastruc SPACE 280     ; reserves 280 bytes of memory for datastruc
        MAP     datastruc
consta  FIELD   4
constb  FIELD   4
x       FIELD   8
y       FIELD   8
string  FIELD   256
code    LDR     r2,constb       ; => LDR r2,[pc,offset]
In this case, there is no need to load the base register before loading the data as the
program counter already holds the correct address. (This is not actually the same as the
address of the LDR instruction, because of pipelining in the processor. However, the
directive with an operand of 0 to label a location within a
structure. The location is labeled, but the location counter is not incremented.
The size of the data structure defined in Example 2-19 depends on the values of
MaxStrLen and ArrayLen. If these values are too large, the structure overruns the end of
available memory.
Example 2-19 uses:
•an EQU directive to define the end of available memory
•a FIELD directive with an operand of 0 to label the end of the data structure.
An ASSERT directive checks that the end of the data structure does not overrun the
available memory.
Example 2-19
StartOfData     EQU     0x1000
EndOfData       EQU     0x2000
                MAP     StartOfData
Integer         FIELD   4
Integer2        FIELD   4
String          FIELD   MaxStrLen
Array           FIELD   ArrayLen*8
BitMask         FIELD   4
EndOfUsedData   FIELD   0
                ASSERT  EndOfUsedData <= EndOfData
You are likely to have problems if you include some character variables in the data
structure, as in Example 2-20. This is because a lot of words are misaligned.
Example 2-20
StartOfData     EQU     0x1000
EndOfData       EQU     0x2000
                MAP     StartOfData
Char            FIELD   1
Char2           FIELD   1
Char3           FIELD   1
Integer         FIELD   4       ; alignment = 3
Integer2        FIELD   4
String          FIELD   MaxStrLen
Array           FIELD   ArrayLen*8
BitMask         FIELD   4
EndOfUsedData   FIELD   0
                ASSERT  EndOfUsedData <= EndOfData
You cannot use the ALIGN directive, because the ALIGN directive aligns the current
location within memory. MAP and FIELD directives do not allocate any memory for the
structures they define.
You could insert a dummy FIELD 1 after Char3 FIELD 1. However, this makes
maintenance difficult if you change the number of character variables. You must
recalculate the right amount of padding each time.
Example 2-21 on page 2-57 shows a better way of adjusting the padding. The example
uses a FIELD directive with a 0 operand to label the end of the character data. A second
FIELD directive inserts the correct amount of padding based on the value of the label. An
:AND: operator is used to calculate the correct value.
The (-EndOfChars):AND:3 expression calculates the correct amount of padding:
0 if EndOfChars is 0 mod 4;
3 if EndOfChars is 1 mod 4;
2 if EndOfChars is 2 mod 4;
1 if EndOfChars is 3 mod 4.
This automatically adjusts the amount of padding used whenever character variables are
added or removed.
StartOfData     EQU     0x1000
EndOfData       EQU     0x2000
                MAP     StartOfData
Char            FIELD   1
Char2           FIELD   1
Char3           FIELD   1
EndOfChars      FIELD   0
Padding         FIELD   (-EndOfChars):AND:3
Integer         FIELD   4
Integer2        FIELD   4
String          FIELD   MaxStrLen
Array           FIELD   ArrayLen*8
BitMask         FIELD   4
EndOfUsedData   FIELD   0
                ASSERT  EndOfUsedData <= EndOfData
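The (-EndOfChars):AND:3 expression can be checked numerically. The short Python sketch below is an illustration only. The padding function is a hypothetical name, and Python's & operator on a negated integer is assumed to stand in for the assembler's 32-bit :AND:.

```python
def padding(end_of_chars):
    """Bytes needed to reach the next multiple of 4: (-EndOfChars):AND:3."""
    return (-end_of_chars) & 3


# Every address in a small range ends up word-aligned once padding is added.
for end in range(0x1000, 0x1008):
    pad = padding(end)
    assert 0 <= pad <= 3 and (end + pad) % 4 == 0
```

With three character fields starting at 0x1000, EndOfChars is 0x1003 and padding(0x1003) is 1, so Integer lands on the word boundary 0x1004.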
2.10.6 Using register-based MAP and FIELD directives
Register-based MAP and FIELD directives define register-based symbols. There are two
main uses for register-based symbols:
•defining structures similar to C structures
•gaining faster access to memory sections described by non register-based MAP and
FIELD directives.
Defining register-based symbols
Register-based symbols can be very useful, but you must be careful when using them.
As a general rule, use them only in the following ways:
•As the location for a load or store instruction to load from or store to. If Location
is a register-based symbol based on the register Rb and with numeric offset, the
assembler automatically translates, for example, LDR Rn,Location into
LDR Rn,[Rb,#offset].
•In an ADR or ADRL instruction, ADR Rn,Location is converted by the assembler into
ADD Rn,Rb,#offset.
•Adding an ordinary numeric expression to a register-based symbol to get another
register-based symbol.
•Subtracting an ordinary numeric expression from a register-based symbol to get
another register-based symbol.
•Subtracting a register-based symbol from another register-based symbol to get an
ordinary numeric expression. Do not do this unless the two register-based
symbols are based on the same register. Otherwise, you have a combination of
two registers and a numeric value. This results in an assembler error.
•As the operand of a :BASE: or :INDEX: operator. These operators are mainly of
use in macros.
Other uses usually result in assembler error messages. For example, if you write
LDR Rn,=Location, where Location is register-based, you are asking the assembler to
load Rn from a memory location that always has the current value of the register Rb
plus offset in it. It cannot do this, because there is no such memory location.
Similarly, if you write ADD Rd,Rn,#expression, where expression is register-based, you
are asking for a single ADD instruction that adds both the base register of the expression
and its offset to Rn. Again, the assembler cannot do this. You must use two ADD
instructions instead.
If you use the same technique for a section of memory containing memory-mapped I/O
(or whose absolute addresses must not change for other reasons), you must take care to
keep the code maintainable.
One method is to add comments to the code warning maintainers to take care when
modifying the definitions. A better method is to use definitions of the absolute addresses
to control the register-based definitions.
Using MAP offset,reg followed by label FIELD 0 makes label into a register-based
symbol with register part reg and numeric part offset. Example 2-25 shows this.
Example 2-25
StartOfIOArea   EQU     0x1000000
SendFlag_Abs    EQU     0x1000000
SendData_Abs    EQU     0x1000004
RcvFlag_Abs     EQU     0x1000008
RcvData_Abs     EQU     0x100000C
IOAreaBase      RN      r11
                MAP     (SendFlag_Abs-StartOfIOArea),IOAreaBase
SendFlag        FIELD   0
                MAP     (SendData_Abs-StartOfIOArea),IOAreaBase
SendData        FIELD   0
                MAP     (RcvFlag_Abs-StartOfIOArea),IOAreaBase
RcvFlag         FIELD   0
                MAP     (RcvData_Abs-StartOfIOArea),IOAreaBase
RcvData         FIELD   0
Load the base address with
locations to be accessed with statements like
Sometimes you need to operate on two structures of the same type at the same time. For
example, if you want the equivalent of the pseudo-code:
newloc.x = oldloc.x + (value in r0);
newloc.y = oldloc.y + (value in r1);
newloc.z = oldloc.z + (value in r2);
The base register needs to point alternately to the oldloc structure and to the newloc
structure. Repeatedly changing the base register would be inefficient. Instead, use a
non register-based map, and set up two pointers in two different registers as in
Example 2-26.
Example 2-26
        MAP     0       ; Non-register based relative map used twice, for
Pointx  FIELD   4       ; old and new data at oldloc and newloc
Pointy  FIELD   4       ; oldloc and newloc are labels for
Pointz  FIELD   4       ; memory allocated in other sections
2.10.8 Avoiding problems with MAP and FIELD directives
Using MAP and FIELD directives can help you to produce maintainable data structures.
However, this is only true if the order the elements are placed in memory is not
important to either the programmer or the program.
You can have problems if you load or store multiple elements of a structure in a single
instruction. These problems arise in operations such as:
•loading several single-byte elements into one register
•using a store multiple or load multiple instruction (STM and LDM) to store or load
multiple words from or to multiple registers.
These operations require the data elements in the structure to be contiguous in memory,
and to be in a specific order. If the order of the elements is changed, or a new element
is added, the program is broken in a way that cannot be detected by the assembler.
There are several methods for avoiding problems such as this.
Example 2-27 shows a sample structure.
Example 2-27
MiscBase        RN      r10
                MAP     0,MiscBase
MiscStart       FIELD   0
Misc_a          FIELD   1
Misc_b          FIELD   1
Misc_c          FIELD   1
Misc_d          FIELD   1
MiscEndOfChars  FIELD   0
MiscPadding     FIELD   (-:INDEX:MiscEndOfChars) :AND: 3
Misc_I          FIELD   4
Misc_J          FIELD   4
Misc_K          FIELD   4
Misc_data       FIELD   4*20
MiscEnd         FIELD   0
MiscLen         EQU     MiscEnd-MiscStart
There is no problem in using LDM and STM instructions for accessing single data elements
that are larger than a word (for example, arrays). An example of this is the 20-word
element Misc_data.
Example 2-27 on page 2-64 loads the first six items in the array Misc_data. The array is
a single element and therefore covers contiguous memory locations. No one is likely to
want to split it into separate arrays in the future.
However, for loading Misc_I, Misc_J, and Misc_K into registers r0, r1, and r2 the
following code works, but might cause problems in the future:
ArrayBase       RN      r9
                ADR     ArrayBase, Misc_I
                LDMIA   ArrayBase, {r0-r2}
Problems arise if the order of Misc_I, Misc_J, and Misc_K is changed, or if a new element
Misc_New is added in the middle. Either of these small changes breaks the code.
If these elements are accessed separately elsewhere, you must not amalgamate them
into a single array element. In this case, you must amend the code. The first remedy is
to comment the structure to prevent changes affecting this section:
Misc_I FIELD 4 ; ==} Do not split/reorder
Misc_J FIELD 4 ; } these 3 elements, STM
Misc_K FIELD 4 ; ==} and LDM instructions used.
If the code is strongly commented, no deliberate changes are likely to be made that
affect the workings of the program. Unfortunately, mistakes can occur. A second
method of catching these problems is to add ASSERT directives just before the STM and
LDM instructions to check that the labels are consecutive and in the correct order:
ArrayBase       RN      R9
                ; Check that the structure elements
                ; are correctly ordered for LDM
                ASSERT  (((Misc_J-Misc_I) = 4) :LAND: ((Misc_K-Misc_J) = 4))
                ADR     ArrayBase, Misc_I
                LDMIA   ArrayBase, {r0-r2}
This ASSERT directive stops assembly at this point if the structure is not in the correct
order to be loaded with an LDM. Remember that the element with the lowest address is
always loaded from, or stored to, the lowest numbered register.
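The condition tested by the ASSERT directive can be restated as a small check on field offsets. The Python sketch below is an illustration only. The consecutive_words function and both dictionaries are hypothetical. The offsets in layout are the ones Example 2-27 produces when the four character fields need no padding.

```python
def consecutive_words(offsets, labels, word=4):
    """True if each label lies exactly one word after the previous one."""
    return all(offsets[b] - offsets[a] == word
               for a, b in zip(labels, labels[1:]))


# Offsets from MiscBase as laid out in Example 2-27.
layout = {"Misc_I": 4, "Misc_J": 8, "Misc_K": 12}
# The same fields after someone swaps Misc_J and Misc_K.
reordered = {"Misc_I": 4, "Misc_K": 8, "Misc_J": 12}
```

consecutive_words(layout, ["Misc_I", "Misc_J", "Misc_K"]) holds, so the LDMIA loads r0, r1, and r2 as intended. The same call on reordered fails, which is the case the ASSERT directive is there to catch.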
You must use frame directives to describe the way that your code uses the stack if you
want to be able to do either of the following:
•debug your application using stack unwinding
•use either flat or call-graph profiling.
Refer to Frame description directives on page 7-33 for details of these directives.
The assembler uses these directives to insert DWARF2 debug frame information into
the object file in ELF format that it produces. This information is required by the
debuggers for stack unwinding and for profiling. Refer to the Using the Procedure Call Standard chapter in ADS Developer Guide for further information about stack
unwinding.
Frame directives do not affect the code produced by
instructs the assembler to interpret instructions as Thumb instructions.
This is equivalent to a CODE16 directive at the head of the source file.
instructs the assembler to interpret instructions as ARM instructions.
This is the default.
-apcs [none| [/qualifier[/qualifier[...]]]]
specifies whether you are using the ARM/Thumb Procedure Call
Standard (ATPCS). It can also specify some attributes of code sections. See ADS Developer Guide for more information about the ATPCS.
/none
specifies that inputfile does not use ATPCS. ATPCS registers
are not set up. Qualifiers are not allowed.
Note
ATPCS qualifiers do not affect the code produced by the assembler. They
are an assertion by the programmer that the code in inputfile complies
with a particular variant of ATPCS. They cause attributes to be set in the
object file produced by the assembler. The linker uses these attributes to
check compatibility of files, and to select appropriate library variants.
Values for qualifier are:
/interwork
specifies that the code in inputfile is suitable for
ARM/Thumb interworking. See ADS Developer Guide for information on interworking.
/nointerwork
specifies that the code in inputfile is not suitable
for ARM/Thumb interworking. This is the default.
does not carry out software stack-limit checking. This is the default.
/swstna
specifies that the code in inputfile is compatible
both with code which carries out stack-limit
checking, and with code that does not carry out
stack-limit checking.
instructs the assembler to assemble code suitable for a big-endian ARM.
The default is -littleend.
instructs the assembler to assemble code suitable for a little-endian ARM.
-checkreglist
instructs the assembler to check RLIST, LDM, and STM register lists to ensure
that all registers are provided in increasing register number order. A
warning is given if registers are not listed in order.
-cpu cpu
sets the target CPU. Some instructions produce either errors or warnings
if assembled for the wrong target CPU (see also the -unsafe assembler
option). Valid values for cpu are architecture names such as 3, 4T, or
5TE, or part numbers such as ARM7TDMI®. See ARM Architecture Reference
Manual for information about the architectures. The default is
ARM7TDMI.
-depend dependfile
instructs the assembler to save source file dependency lists to
dependfile.
instructs the assembler to write source file dependency lists to
stdout.
-md
instructs the assembler to write source file dependency lists to
inputfile.d.
-errors errorfile
instructs the assembler to output error messages to errorfile.
-fpu name
this option selects the target floating-point unit (FPU) architecture. If you
specify this option it overrides any implicit FPU set by the -cpu option.
Floating-point instructions produce either errors or warnings if
assembled for the wrong target FPU.
The assembler sets a build attribute corresponding to name in the object
file. The linker determines compatibility between object files, and
selection of libraries, accordingly.
Valid options are:
none
Selects no floating-point option. This makes your assembled
object file compatible with any other object file.
vfp
This is a synonym for -fpu vfpv1.
vfpv1
Selects hardware vector floating-point unit conforming to
architecture VFPv1.
vfpv2
Selects hardware vector floating-point unit conforming to
architecture VFPv2.
fpa
Selects hardware Floating Point Accelerator.
softvfp+vfp
Selects hardware Vector Floating Point unit.
To armasm, this is identical to -fpu vfpv1. See the C and C++
Compilers chapter in ADS Compilers and Libraries Guide for
details of the effect on software library selection at link time.
softvfp
Selects software floating-point library (FPLib) with
pure-endian doubles. This is the default if no -fpu option is
specified.
softfpa
Selects software floating-point library with mixed-endian
doubles.
-g
instructs the assembler to generate DWARF2 debug tables. For
backwards compatibility, the following command line option is
permitted, but not required:
instructs the assembler to display a summary of the assembler
command-line options.
-i dir [,dir]…
adds directories to the source file search path so that arguments to
GET, INCLUDE, or INCBIN directives do not need to be fully qualified (see GET or
INCLUDE on page 7-61).
-keep
instructs the assembler to keep local labels in the symbol table of the
object file, for use by the debugger (see KEEP on page 7-64).
-list [listingfile] [options]
instructs the assembler to output a detailed listing of the assembly
language produced by the assembler to listingfile. If - is given as
listingfile, listing is sent to stdout. If no listingfile is given, listing is
sent to inputfile.lst.
Use the following command-line options to control the behavior of
-list:
-noterse
turns the terse flag off. When this option is on, lines skipped
due to conditional assembly do not appear in the listing. If the
terse option is off, these lines do appear in the listing. The
default is on.
-width
sets the listing page width. The default is 79 characters.
-length
sets the listing page length. Length zero means an unpaged
listing. The default is 66 lines.
-xref
instructs the assembler to list cross-referencing information on
symbols, including where they were defined and where they
were used, both inside and outside macros. The default is off.
-maxcache n
sets the maximum source cache size to n. The default is 8MB.
-memaccess attributes
Specifies memory access attributes of the target memory system. The
default is to allow aligned loads and saves of bytes, halfwords and words.
attributes modify the default. They can be any one of the following:
Allow unaligned LDRs.
Disallow halfword loads.
Disallow halfword stores.
Disallow halfword loads and stores.
-nocache
turns off source caching. By default the assembler caches source files on
the first pass and reads them from memory on the second pass.
-noesc
instructs the assembler to ignore C-style escaped special characters, such
as \n and \t.
-noregs
instructs the assembler not to predefine register names. See Predefined register and coprocessor names on page 3-9 for a list of predefined
register names.
-nowarn
turns off warning messages.
-o filename
names the output object file. If this option is not specified, the assembler
uses the second command-line argument that is not a valid command-line
option as the name of the output file. If there is no such argument, the
assembler creates an object filename of the form inputfilename.o.
-predefine "directive"
instructs the assembler to pre-execute one of the SET directives. You must
enclose directive in quotes. See SETA, SETL, and SETS on page 7-7.
The assembler executes a corresponding GBLL, GBLS, or GBLA directive to
define the variable before setting its value.
The variable name is case-sensitive.
Note
The command line interface of your system might require you to enter
special character combinations, such as \", to include strings in
directive. Alternatively, you can use -via file to include a -predefine
argument. The command line interface does not alter arguments from
-via files.
-split_ldm
This option instructs the assembler to fault LDM and STM instructions if the
maximum number of registers transferred exceeds:
•five, for all STMs, and for LDMs that do not load the PC
•four, for LDMs that load the PC.
Avoiding large multiple register transfers can reduce interrupt latency on
ARM systems that:
•do not have a cache or a write buffer (for example, a cacheless
ARM7TDMI)
Avoiding large multiple register transfers increases code size and
decreases performance slightly.
Avoiding large multiple register transfers has no significant benefit for
cached systems or processors with a write buffer.
Avoiding large multiple register transfers also has no benefit for systems
without zero wait-state memory, or for systems with slow peripheral
devices. Interrupt latency in such systems is determined by the number of
cycles required for the slowest memory or peripheral access. This is
typically much greater than the latency introduced by multiple register
transfers.
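The register-count limits for -split_ldm can be expressed as a simple predicate. The Python sketch below is an illustration of the rule as stated above, not the assembler's own check. The function name and the way register lists are passed in are assumptions made for the example.

```python
def split_ldm_faults(mnemonic, registers):
    """Return True if the transfer exceeds the -split_ldm limits.

    mnemonic  -- for example "LDMIA" or "STMFD" (only the LDM/STM prefix matters)
    registers -- iterable of register names such as "r0", "lr", "pc"
    """
    regs = {r.lower() for r in registers}
    is_load = mnemonic.upper().startswith("LDM")
    loads_pc = is_load and ("pc" in regs or "r15" in regs)
    limit = 4 if loads_pc else 5      # four for LDMs that load the PC, else five
    return len(regs) > limit
```

Under this rule a five-register STM is accepted, but an LDM of r4 to r7 plus pc transfers five registers including the PC and is faulted.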
-unsafe
allows assembly of a file containing instructions that are not available on
the specified architecture and processor. It changes corresponding error
messages to warning messages. It also suppresses warnings about
operator precedence (see Binary operators on page 3-28).
-via file
instructs the assembler to open file and read in command-line arguments
to the assembler. For further information see the Via File Syntax appendix
in ADS Compilers and Libraries Guide.
inputfile
specifies the input file for the assembler. Input files must be ARM or
Thumb assembly language source files.
All three sections of the source line are optional.
Instructions cannot start in the first column. They must be preceded by white space even
if there is no preceding symbol.
You can write directives in all upper case, as in this manual. Alternatively, you can write
directives in all lower case. You must not write a directive in mixed upper and lower
case.
You can use blank lines to make your code more readable.
symbol is usually a label (see Labels on page 3-15). In instructions and
pseudo-instructions it is always a label. In some directives it is a symbol for a variable
or a constant. The description of the directive makes this clear in each case.
symbol must begin in the first column and cannot contain any whitespace character such
as a space or a tab (see Symbol naming rules on page 3-12).
You can use symbols to represent variables, addresses, and numeric constants. Symbols
representing addresses are also called labels. See:
•Variables on page 3-13
•Numeric constants on page 3-13
•Labels on page 3-15
•Local labels on page 3-16.
3.5.1 Symbol naming rules
The following general rules apply to symbol names:
•You can use uppercase letters, lowercase letters, numeric characters, or the
underscore character in symbol names.
•Do not use numeric characters for the first character of symbol names, except in
local labels (see Local labels on page 3-16).
•Symbol names are case-sensitive.
•All characters in the symbol name are significant.
•Symbol names must be unique within their scope.
•Symbols must not use built-in variable names or predefined symbol names (see
Predefined register and coprocessor names on page 3-9 and Built-in variables on
page 3-10).
•Symbols must not use the same name as instruction mnemonics or directives. If
you use the same name as an instruction mnemonic or directive, use double bars
to delimit the symbol name. For example:
||ASSERT||
The bars are not part of the symbol.
If you need to use a wider range of characters in symbols, for example, when working
with compilers, use single bars to delimit the symbol name. For example:
|.text|
The bars are not part of the symbol. You cannot use bars, semicolons, or newlines within
the bars.
The value of a variable can be changed as assembly proceeds. Variables are of three
types:
•numeric
•logical
•string.
The type of a variable cannot be changed.
The range of possible values of a numeric variable is the same as the range of possible
values of a numeric constant or numeric expression (see Numeric constants and Numeric expressions on page 3-20).
The possible values of a logical variable are {TRUE} or {FALSE} (see Logical expressions
on page 3-23).
The range of possible values of a string variable is the same as the range of values of a
string expression (see String expressions on page 3-19).
Use the GBLA, GBLL, GBLS, LCLA, LCLL, and LCLS directives to declare
variables, and assign values to them using the SETA, SETL, and SETS directives. See:
•GBLA, GBLL, and GBLS on page 7-4
•LCLA, LCLL, and LCLS on page 7-6
•SETA, SETL, and SETS on page 7-7.
3.5.3 Numeric constants
Numeric constants are 32-bit integers. You can set them using unsigned numbers in the
range 0 to 2^32 – 1, or signed numbers in the range –2^31 to 2^31 – 1. However, the
assembler makes no distinction between –n and 2^32 – n. Relational operators such as >=
use the unsigned interpretation. This means that 0 > –1 is {FALSE}.
Use the EQU directive to define constants (see EQU on page 7-57). You cannot change
the value of a numeric constant after you define it.
See also Numeric expressions on page 3-20 and Numeric literals on page 3-21.
You can use a string variable for a whole line of assembly language, or any part of a line.
Use the variable with a $ prefix in the places where the value is to be substituted for the
variable. The dollar character instructs the assembler to substitute the string into the
source code line before checking the syntax of the line.
Numeric and logical variables can also be substituted. The current value of the variable
is converted to a hexadecimal string (or T or F for logical variables) before substitution.
Use a dot to mark the end of the variable name if the following character would be
permissible in a symbol name (see Symbol naming rules on page 3-12). You must set
the contents of the variable before you can use it.
If you need a $ that you do not want to be substituted, use $$. This is converted to a single
$.
You can include a variable with a $ prefix in a string. Substitution occurs in the same
way as anywhere else.
Substitution does not occur within vertical bars, except that vertical bars within double
quotes do not affect substitution.
Examples
        ; straightforward substitution
        GBLS    add4ff
        ;
add4ff  SETS    "ADD r4,r4,#0xFF"       ; set up add4ff
        $add4ff.00                      ; invoke add4ff
        ; this produces
        ADD     r4,r4,#0xFF00
        ; elaborate substitution
        GBLS    s1
        GBLS    s2
        GBLS    fixup
        GBLA    count
        ;
count   SETA    14
s1      SETS    "a$$b$count"    ; s1 now has value a$b0000000E
s2      SETS    "abc"
fixup   SETS    "|xy$s2.z|"     ; fixup now has value |xyabcz|
|C$$code| MOV   r4,#16          ; but the label here is C$$code
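The substitution rules can be tried out with a small model. The Python sketch below is an approximation for illustration, not the assembler's algorithm. The substitute function and its regular expression are hypothetical. It handles $name, the optional terminating dot, $$, and the conversion of numeric values to eight hexadecimal digits (as the a$b0000000E result above suggests) and of logical values to T or F. It does not model the rule that suppresses substitution inside vertical bars.

```python
import re

# "$$" or "$name" optionally followed by a dot that ends the name.
_TOKEN = re.compile(r"\$\$|\$([A-Za-z_][A-Za-z0-9_]*)\.?")


def substitute(line, variables):
    """Expand $variables in a source line; raises KeyError if one is unset."""
    def expand(match):
        name = match.group(1)
        if name is None:                  # "$$" becomes a single "$"
            return "$"
        value = variables[name]
        if isinstance(value, bool):       # logical variable
            return "T" if value else "F"
        if isinstance(value, int):        # numeric variable, 8 hex digits
            return "%08X" % (value & 0xFFFFFFFF)
        return value                      # string variable
    return _TOKEN.sub(expand, line)
```

With variables {"add4ff": "ADD r4,r4,#0xFF", "count": 14}, substitute("$add4ff.00", ...) gives ADD r4,r4,#0xFF00 and substitute("a$$b$count", ...) gives a$b0000000E, matching the examples above.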
Labels are symbols representing the addresses in memory of instructions or data. They
can be program-relative, register-relative, or absolute.
Program-relative labels
These represent the program counter, plus or minus a numeric constant. Use them as
targets for branch instructions, or to access small items of data embedded in code
sections. You can define program-relative labels using a label on an instruction or on
one of the data definition directives. See:
•DCB on page 7-18
•DCD and DCDU on page 7-19
•DCFD and DCFDU on page 7-21
•DCFS and DCFSU on page 7-22
•DCI on page 7-23
•DCQ and DCQU on page 7-24
•DCW and DCWU on page 7-25.
Register-relative labels
These represent a named register plus a numeric constant. They are most often used to
access data in data sections. You can define them with a storage map. You can use the
EQU
directive to define additional register-relative labels, based on labels defined in
storage maps. See:
•MAP on page 7-15
•SPACE on page 7-17
•DCDO on page 7-20
•EQU on page 7-57.
Absolute addresses
These are numeric constants. They are integers in the range 0 to 2^32 – 1.
A local label is a number in the range 0-99, optionally followed by a name. The same
number can be used for more than one local label in an ELF section.
Local labels are typically used for loops and conditional code within a routine, or for
small subroutines that are only used locally. They are particularly useful in macros (see
MACRO and MEND on page 7-27).
Use the ROUT directive to limit the scope of local labels (see ROUT on page 7-68). A
reference to a local label refers to a matching label within the same scope. If there is no
matching label within the scope in either direction, the assembler generates an error
message and the assembly fails.
You can use the same number for more than one local label even within the same scope.
By default, the assembler links a local label reference to:
•the most recent local label of the same number, if there is one within the scope
•the next following local label of the same number, if there is not a preceding one
within the scope.
Use the optional parameters to modify this search pattern if required.
Syntax
The syntax of a local label is:
n{routname}
The syntax of a reference to a local label is:
%{F|B}{A|T}n{routname}
where:
n
is the number of the local label.
routname
is the name of the current scope.
%
introduces the reference.
F
instructs the assembler to search forwards only.
B
instructs the assembler to search backwards only.
A
instructs the assembler to search all macro levels.
T
instructs the assembler to look at this macro level only.
If neither F nor B is specified, the assembler searches backwards first, then forwards.
If neither A nor T is specified, the assembler searches all macros from the current level to
the top level, but does not search lower level macros.
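The search order for a local label reference can be sketched as a lookup over the labels in one scope. The Python model below is illustrative only. The resolve function and the (line, number) representation are assumptions. Macro levels (the A and T options) are not modeled, and a label on the same line as the reference is assumed to count as preceding it.

```python
def resolve(labels, ref_line, n, direction=None):
    """Find the line of the local label that a reference %{F|B}n refers to.

    labels    -- list of (line_number, label_number) pairs within one ROUT scope
    ref_line  -- line number of the instruction containing the reference
    direction -- "F", "B", or None for the default search
    """
    before = [ln for ln, num in labels if num == n and ln <= ref_line]
    after = [ln for ln, num in labels if num == n and ln > ref_line]
    nearest_before = [max(before)] if before else []
    nearest_after = [min(after)] if after else []
    if direction == "B":
        candidates = nearest_before
    elif direction == "F":
        candidates = nearest_after
    else:                                  # default: backwards first, then forwards
        candidates = nearest_before + nearest_after
    if not candidates:
        raise LookupError("no matching local label in scope")
    return candidates[0]
```

With labels [(10, 90), (20, 91), (30, 90)], a reference to 90 on line 15 resolves to line 10 by default and to line 30 with F.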
String expressions consist of combinations of string literals, string variables, string
manipulation operators, and parentheses. See:
•String literals
•Variables on page 3-13
•Unary operators on page 3-26
•String manipulation operators on page 3-28
•SETA, SETL, and SETS on page 7-7.
Characters that cannot be placed in string literals can be placed in string expressions
using the :CHR: unary operator. Any ASCII character from 0 to 255 is allowed.
The value of a string expression cannot exceed 512 characters in length. It can be of zero
length.
Example
improb  SETS    "literal":CC:(strvar2:LEFT:4)
        ; sets the variable improb to the value "literal"
        ; with the left-most four characters of the
        ; contents of string variable strvar2 appended
3.6.2 String literals
String literals consist of a series of characters contained between double quote
characters. The length of a string literal is restricted by the length of the input line (see
Format of source lines on page 3-8).
To include a double quote character or a dollar character in a string, use two of the
character.
C string escape sequences are also allowed, unless -noesc is specified (see Command
syntax on page 3-2).
Examples
abc SETS "this string contains only one "" double quote"
def SETS "this string contains only one $$ dollar symbol"
Numeric expressions consist of combinations of numeric constants, numeric variables,
ordinary numeric literals, binary operators, and parentheses. See:
•Numeric constants on page 3-13
•Variables on page 3-13
•Numeric literals on page 3-21
•Binary operators on page 3-28
•SETA, SETL, and SETS on page 7-7.
Numeric expressions can contain register-relative or program-relative expressions if the
overall expression evaluates to a value that does not include a register or the program
counter.
Numeric expressions evaluate to 32-bit integers. You can interpret them as unsigned
numbers in the range 0 to 2^32 – 1, or signed numbers in the range –2^31 to 2^31 – 1.
However, the assembler makes no distinction between –n and 2^32 – n. Relational
operators such as >= use the unsigned interpretation. This means that 0 > –1 is {FALSE}.
Example
a       SETA    256*256         ; 256*256 is a numeric expression
        MOV     r1,#(a*22)      ; (a*22) is a numeric expression
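The effect of treating every numeric expression as an unsigned 32-bit value can be checked with a short model. The Python sketch below is an illustration only. The helper names are hypothetical, and masking with 0xFFFFFFFF is assumed to stand in for the assembler's 32-bit arithmetic.

```python
MASK32 = 0xFFFFFFFF


def as_u32(value):
    """Reduce a Python integer to its 32-bit unsigned representation."""
    return value & MASK32


def unsigned_gt(a, b):
    """Model of the > operator under the unsigned interpretation."""
    return as_u32(a) > as_u32(b)
```

as_u32(-1) is 0xFFFFFFFF, the same value as 2^32 – 1, so unsigned_gt(0, -1) is false even though 0 is greater than –1 as a signed comparison.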
Numeric literals can take any of the following forms:
•decimal-digits
•0xhexadecimal-digits
•&hexadecimal-digits
•n_base-n-digits
•'character'
where
decimal-digits
is a sequence of characters using only the digits 0 to 9.
hexadecimal-digits
is a sequence of characters using only the digits 0 to 9 and the
letters A to F or a to f.
n_
is a single digit between 2 and 9 inclusive, followed by an
underscore character.
base-n-digits
is a sequence of characters using only the digits 0 to (n – 1)
character
is any single character except a single quote. Use \' if you require
a single quote. In this case the value of the numeric literal is the
numeric code of the character.
You must not use any other characters. The sequence of characters must evaluate to an
integer in the range 0 to 2^32 – 1 (except in DCQ and DCQU directives, where the range is 0
to 2^64 – 1).
Examples
a       SETA    34906
addr    DCD     0xA10E
        LDR     r4,=&1000000F
        DCD     2_11001010
c3      SETA    8_74007
        DCQ     0x0123456789abcdef
        LDR     r1,='A'         ; pseudo-instruction loading 65 into r1
        ADD     r3,r2,#'\''     ; add 39 to contents of r2, result to r3
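The literal forms listed above can be decoded with a small parser, which is a convenient way to check what value a literal such as 8_74007 denotes. The Python sketch below is an illustration, not the assembler's parser. The function names are hypothetical, and the limit_bits parameter is an assumption used to model the wider range allowed in DCQ and DCQU.

```python
_DIGITS = "0123456789abcdef"


def _to_int(digits, base):
    """Convert digits in the given base, rejecting any other character."""
    if not digits or any(ch not in _DIGITS[:base] for ch in digits.lower()):
        raise ValueError("invalid digits for base %d: %r" % (base, digits))
    return int(digits, base)


def parse_numeric_literal(text, limit_bits=32):
    """Return the value of a numeric literal written in one of the listed forms."""
    if len(text) >= 3 and text[0] == "'" and text[-1] == "'":
        body = text[1:-1]
        if body == "\\'":                  # \' stands for a single quote
            return ord("'")
        if len(body) == 1 and body != "'":
            return ord(body)
        raise ValueError("invalid character literal: %r" % text)
    if text.startswith("0x"):
        value = _to_int(text[2:], 16)
    elif text.startswith("&"):
        value = _to_int(text[1:], 16)
    elif len(text) > 2 and text[0] in "23456789" and text[1] == "_":
        value = _to_int(text[2:], int(text[0]))
    else:
        value = _to_int(text, 10)
    if value >= 2 ** limit_bits:
        raise ValueError("literal out of range: %r" % text)
    return value
```

For the examples above, parse_numeric_literal("2_11001010") is 202 and parse_numeric_literal("8_74007") is 30727. The DCQ operand needs limit_bits=64.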
Floating-point literals can take any of the following forms:
{-}digits E{-}digits
{-}{digits}.digits{E{-}digits}
0xhexdigits
&hexdigits
digits
are sequences of characters using only the digits 0 to 9. You can write E
in uppercase or lowercase. These forms correspond to normal
floating-point notation.
hexdigits
are sequences of characters using only the digits 0 to 9 and the letters
A to F or a to f. These forms correspond to the internal representation of
the numbers in the computer. Use these forms to enter infinities and
NaNs, or if you want to be sure of the exact bit patterns you are using.
The range for single-precision floating point values is:
•maximum 3.40282347e+38
•minimum 1.17549435e–38.
The range for double-precision floating point values is:
•maximum 1.79769313486231571e+308
•minimum 2.22507385850720138e–308.
Examples
DCFD 1E308,-4E-100
DCFS 1.0
DCFD 3.725e15
LDFS 0x7FC00000 ; Quiet NaN
LDFD &FFF0000000000000 ; Minus infinity
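The hexadecimal forms give the bit pattern of the floating-point value directly. The Python sketch below, which is an illustration and not part of ADS, uses the standard struct module to show which IEEE 754 values the patterns in the examples denote. The two function names are hypothetical.

```python
import math
import struct


def single_from_bits(bits):
    """Interpret a 32-bit pattern as an IEEE 754 single-precision value."""
    return struct.unpack(">f", struct.pack(">I", bits))[0]


def double_from_bits(bits):
    """Interpret a 64-bit pattern as an IEEE 754 double-precision value."""
    return struct.unpack(">d", struct.pack(">Q", bits))[0]
```

single_from_bits(0x7FC00000) is a NaN and double_from_bits(0xFFF0000000000000) is minus infinity, as the comments in the examples state. The patterns 0x7F7FFFFF and 0x00800000 give the single-precision maximum and minimum quoted above.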