Introduction to the ARM® Processor
Using Intel FPGA Toolchain
For Quartus Prime 16.1
1 Introduction
This tutorial presents an introduction to the ARM® Cortex-A9 processor, which is a processor implemented as a
hardware block in Intel’s Cyclone® V SoC FPGA devices. The tutorial is intended for a user who wishes to use an
ARM-based system on Intel’s DE1-SoC board.
A full description of ARM processors is provided in the ARM Architecture Reference Manual, which is available on
the ARM Holdings web site.
Contents:
• Overview of ARM Cortex-A9 Processor Features
• Register Structure
• Instruction Sets
• Accessing Memory and I/O Devices
• Addressing Modes
• ARM Instructions
• Assembler Directives
• Example Program
• Operating Modes
• Banked Registers
• Exception Processing
• Input/Output Operations
Intel Corporation - FPGA University Program
November 2016
[Register diagram: the general-purpose registers R0 to R15, each spanning bits 31 to 0, with R13 labelled SP (stack pointer), R14 labelled LR (link register) and R15 labelled PC (program counter); and the status register CPSR, with the condition code flags N, Z, C and V in bits 31 to 28, the interrupt disable bits I and F in bits 7 and 6, the ARM or Thumb operation bit T in bit 5, and the processor mode in bits 4 to 0.]
2 Overview of ARM Cortex-A9 Processor Features
The ARM Cortex-A9 processor has a largely Reduced Instruction Set Computer (RISC) architecture. Its arithmetic
and logic operations are performed on operands in the general-purpose registers. The data is moved between the
memory and these registers by means of Load and Store instructions.
The word-length of the processor is 32 bits. Data byte addresses in a 32-bit word are assigned in little-endian style,
in which the lower byte addresses are used for the less significant bytes (the rightmost bytes) of the word.
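The little-endian byte assignment can be illustrated with a short Python sketch. It is not part of the ARM toolchain; the word value is an arbitrary example, and the standard `struct` module is used only to lay the bytes out in little-endian order.

```python
import struct

word = 0x12345678  # arbitrary example value, not taken from the tutorial

# '<I' packs an unsigned 32-bit integer in little-endian byte order,
# so the least significant byte lands at the lowest address.
layout = struct.pack("<I", word)

for offset, byte in enumerate(layout):
    print(f"byte address base+{offset}: 0x{byte:02X}")

# Reading the four bytes back in the same order recovers the word.
recovered = int.from_bytes(layout, "little")
```

The byte at the lowest address (base+0) is 0x78, the least significant byte of the word, and the byte at base+3 is 0x12.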
3 Register Structure
All registers in the ARM Cortex-A9 processor are 32 bits long. There are 15 general-purpose registers, R0 to R14,
a Program Counter, R15, and a Current Program Status Register, CPSR, as shown in Figure 1. All general-purpose
registers can be used in the same way. However, software programs usually treat two of them in a special way.
Register R13 is used as a Stack Pointer. Register R14 is used as a Link Register in subroutine linkage. In assembly-language programs, the registers R15, R14 and R13 can also be referred to by using the acronyms PC, LR and SP,
respectively. In assembly-language programs, the register names can be written in either upper or lower case. Thus,
R1, R2, PC, LR and SP are equivalent to r1, r2, pc, lr and sp.
Figure 1. ARM register structure.
The CPSR register has the following contents:
• Condition Code flags which are set based on the results of a previous operation. Most ARM instructions can
be executed conditionally based on the values of these flags:
– Negative (N) - set to 1 if the result is negative; otherwise, cleared to 0.
– Zero (Z) - set to 1 if the result is 0; otherwise, cleared to 0.
– Carry (C) - set to 1 if a carry-out results from the operation; otherwise, cleared to 0.
– Overflow (V) - set to 1 if arithmetic overflow occurs; otherwise, cleared to 0.
• Interrupt-disable bits, I and F, where
– I = 1 disables the IRQ interrupts
– F = 1 disables FIQ interrupts
• Thumb bit, where
– T = 0 indicates ARM execution
– T = 1 indicates Thumb execution
• Processor mode bits which identify the mode in which the processor is operating, as explained in Section 9.
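As a rough illustration of the CPSR layout, the following Python sketch splits a 32-bit status value into the fields listed above. The bit positions used (N, Z, C, V in bits 31 to 28; I, F, T in bits 7, 6, 5; mode in bits 4 to 0) follow the standard ARM layout; the function name `decode_cpsr` and the sample value are assumptions made for this example.

```python
def decode_cpsr(cpsr):
    """Split a 32-bit CPSR value into its named fields."""
    return {
        "N": (cpsr >> 31) & 1,   # negative flag
        "Z": (cpsr >> 30) & 1,   # zero flag
        "C": (cpsr >> 29) & 1,   # carry flag
        "V": (cpsr >> 28) & 1,   # overflow flag
        "I": (cpsr >> 7) & 1,    # 1 disables IRQ interrupts
        "F": (cpsr >> 6) & 1,    # 1 disables FIQ interrupts
        "T": (cpsr >> 5) & 1,    # 0 = ARM execution, 1 = Thumb execution
        "mode": cpsr & 0x1F,     # processor mode bits
    }

# Sample value chosen for illustration only.
fields = decode_cpsr(0x600000D3)
print(fields)
```

For the sample value, Z and C are set, both interrupt types are disabled, the processor is in ARM execution (T = 0), and the mode bits are 0b10011.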
For some registers, there are duplicate registers, called banked registers, for saving the contents of primary registers
when various types of interrupts occur, as discussed in Section 10.
4 Instruction Sets
The ARM Cortex-A9 processor can execute instructions in three different instruction sets, known as ARM, Thumb
and Thumb-2.
The ARM set is the most powerful. All instructions are 32 bits long. The instructions are stored in memory in
word-aligned manner.
The Thumb set is a smaller version, where the instructions are provided in a format that uses only 16 bits. This
usually results in smaller memory requirements, which can be useful in embedded applications.
The Thumb-2 set includes both 16- and 32-bit instructions. Its functionality is almost identical to that of the ARM
instruction set.
In this tutorial we will deal only with the ARM instruction set. We should note that there exists a Unified Assembler Language (UAL), which provides a common syntax for ARM and Thumb instructions. It supersedes the previous
versions of both the ARM and Thumb assembler languages. We will use UAL in this tutorial.
5 Accessing Memory and I/O Devices
Any input/output devices that can be accessed by the ARM processor are memory mapped and can be accessed
as memory locations. Data accesses to memory locations and I/O interfaces are performed by means of Load and
Store instructions, which cause data to be transferred between the memory and general-purpose registers. The ARM
processor issues 32-bit addresses. The memory space is byte-addressable. Instructions can read and write words (32
bits), halfwords (16 bits), or bytes (8 bits) of data.
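The byte-addressable memory space with word, halfword and byte accesses can be modelled in a few lines of Python. This is only a sketch: the class name `Memory`, the addresses and the data values are hypothetical, and a memory-mapped I/O register would be reached through the same load and store operations at its assigned address.

```python
class Memory:
    """Byte-addressable, little-endian memory; untouched bytes read as 0."""

    def __init__(self):
        self.cells = {}  # maps a 32-bit byte address to a byte value

    def store(self, address, value, size):
        """Store the low `size` bytes (1, 2 or 4) of value at address."""
        for i in range(size):
            self.cells[(address + i) & 0xFFFFFFFF] = (value >> (8 * i)) & 0xFF

    def load(self, address, size):
        """Load `size` bytes (1, 2 or 4) starting at address."""
        return sum(self.cells.get((address + i) & 0xFFFFFFFF, 0) << (8 * i)
                   for i in range(size))


mem = Memory()
mem.store(0x1000, 0xCAFEF00D, 4)   # word store
byte0 = mem.load(0x1000, 1)        # 0x0D
half0 = mem.load(0x1000, 2)        # 0xF00D
half1 = mem.load(0x1002, 2)        # 0xCAFE
mem.store(0x1001, 0xAB, 1)         # byte store into the middle of the word
word = mem.load(0x1000, 4)         # 0xCAFEAB0D
```

The byte store changes only the byte at address 0x1001, which shows that each byte of a word has its own address.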
5.1 Addressing Modes for Load and Store Instructions
The Load and Store instructions are the only type of instructions that can access memory locations. Load instructions
copy the contents of a memory location specified by an addressing mode into a destination register, which is a
general-purpose register, Rd. Store instructions copy the contents of a general-purpose register, Rd, into a memory
location specified by an addressing mode.
An addressing mode provides the information needed to determine the address of the desired memory location.
There are different ways of specifying the required address. All addressing modes involve one or two general-purpose registers, plus some additional information. One register is referred to as the base register, Rn. If a second
register is used, it is referred to as the index register, Rm. The memory address is determined by adding the contents
of the base register and a value that is either given as a signed 12-bit offset directly in the instruction or as a magnitude
in the index register. The magnitude in Rm can be scaled by shifting it either left or right a number of bit-positions
specified in the instruction.
There are three primary addressing modes provided:
• Offset mode – the address is determined by adding the contents of a base register and an offset that is either
given directly in the instruction or in an index register.
• Pre-indexed mode – the address is determined in the same way as in the Offset mode; subsequently, this
address replaces the contents of the base register used.
• Post-indexed mode – the address is the contents of a base register; subsequently, the base register is loaded
with a new address that is determined in the same way as in the Offset mode.
These addressing modes are fully specified in Table 1, which indicates how the address generation is performed.
The table also gives the required Assembler syntax.
When an index register is specified, its contents are interpreted as a magnitude which can be either added to or
subtracted from a base register. This magnitude can first be shifted left or right by specifying LSL #k or LSR #k,
respectively, where k is an integer from 1 to 31. Shifting operations are discussed further in Section 6.7.
Since the Program Counter, R15, can be treated as a general-purpose register, it can be used in the Offset addressing
mode as a base register, Rn. This makes it possible to access memory locations in terms of their distance relative to
the current address in R15. This mode is often referred to as the Relative addressing mode.
offset = a signed number given in the instruction
shift = direction #integer, where direction is LSL for left shift or LSR for right shift, and integer is a 5-bit unsigned number specifying the shift amount
±Rm = the magnitude in register Rm that is added to or subtracted from the contents of base register Rn
Consider the Load instruction, LDR, which loads a 32-bit operand into a register. The instruction
LDR R2, [R6, #-8]
loads R2 from the address in R6 minus 8. The instruction
LDR R2, [R6, #0x200]
loads R2 from the address in R6 plus the hexadecimal number 0x200. The instruction
LDR R2, [R6, -R8]
loads R2 from the address obtained by subtracting the contents of R8 from the contents of R6.
The Pre-indexed mode is illustrated in
LDR R2, [R6, R8, LSL #4]!
which loads R2 from the location whose address is determined by shifting the contents of R8 to the left by 4 bit positions (which is equivalent to multiplying by 16) and adding the result to the contents of R6. Subsequently, the
generated address is loaded into R6.
An example of Post-indexed mode is
LDR R2, [R6], #20
where R6 contains the address of the location from which an operand is loaded into R2. Subsequently, the contents
of R6 are modified by adding to them the offset value 20.
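The address generation in the examples above can be traced with a small Python model. It is a sketch only: the helper names and the starting register values (R6 = 0x1000, R8 = 0x10) are assumptions chosen for illustration, and only the address computation is modelled, not the data transfer.

```python
MASK = 0xFFFFFFFF  # addresses are 32 bits wide

def lsl(value, k):
    """Logical shift left by k bit positions, truncated to 32 bits."""
    return (value << k) & MASK

def offset_mode(regs, rn, offset):
    """[Rn, offset]: address = Rn + offset; Rn is unchanged."""
    return (regs[rn] + offset) & MASK

def pre_indexed(regs, rn, offset):
    """[Rn, offset]!: address = Rn + offset; the address then replaces Rn."""
    address = (regs[rn] + offset) & MASK
    regs[rn] = address
    return address

def post_indexed(regs, rn, offset):
    """[Rn], offset: address = Rn; Rn is then updated to Rn + offset."""
    address = regs[rn]
    regs[rn] = (regs[rn] + offset) & MASK
    return address

regs = {"R6": 0x1000, "R8": 0x10}   # assumed starting values

a1 = offset_mode(regs, "R6", -8)                   # LDR R2, [R6, #-8]
a2 = offset_mode(regs, "R6", 0x200)                # LDR R2, [R6, #0x200]
a3 = offset_mode(regs, "R6", -regs["R8"])          # LDR R2, [R6, -R8]
a4 = pre_indexed(regs, "R6", lsl(regs["R8"], 4))   # LDR R2, [R6, R8, LSL #4]!
r6_after_pre = regs["R6"]
a5 = post_indexed(regs, "R6", 20)                  # LDR R2, [R6], #20
r6_after_post = regs["R6"]
```

With these starting values the five accesses use addresses 0xFF8, 0x1200, 0xFF0, 0x1100 and 0x1100. R6 is unchanged by the first three instructions, becomes 0x1100 after the pre-indexed access, and becomes 0x1114 after the post-indexed access.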
Relative addressing can be used simply by specifying the address label associated with the desired memory location.
For example, if MEMLOC is the desired location, then the instruction
LDR R2, MEMLOC
will load the contents of memory location MEMLOC into register R2. The assembler will determine the immediate
offset as the difference between the address MEMLOC and the contents of the updated Program Counter. It will
generate the instruction
LDR R2, [R15, #offset]
This offset takes into account the fact that when the instruction is to be executed, the Program Counter will already
be incremented by 8, because the ARM processor will already have fetched the next instruction (due to pipelined
execution).
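The offset calculation described here can be written out as a short Python sketch. The function name and the addresses are hypothetical; the only fact taken from the text is that the Program Counter is 8 bytes ahead of the instruction being executed.

```python
def pc_relative_offset(instruction_address, target_address):
    """Offset the assembler would place in LDR Rd, [R15, #offset]."""
    pc_when_executed = instruction_address + 8  # PC has already advanced by 8
    return target_address - pc_when_executed

# Hypothetical layout: the LDR instruction is at 0x100.
forward = pc_relative_offset(0x100, 0x120)    # MEMLOC after the instruction
backward = pc_relative_offset(0x100, 0xF0)    # MEMLOC before the instruction
print(forward, backward)
```

In this layout the forward reference needs an offset of 24 and the backward reference an offset of -24, even though the two locations are 32 and 16 bytes away from the instruction itself.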
5.2 Format for Load and Store Instructions
The format for Load and Store instructions is shown in Figure 2. The operation code (OP-code) is provided in bits
27 to 20. The register Rd, which is used as the destination in load instructions or as the source in store instructions,
is identified by bits 15 to 12. The base register, Rn, is identified by bits 19 to 16. Bits 11 to 0 may contain a signed
12-bit offset or identify an index register. If an index register is used, its number, m, is given in the low-order four
bits of the instruction.
Observe, in Figure 2, that the high-order four bits denote a condition for the instruction. In ARM processors, most
instructions can be executed conditionally, as explained in Section 6.11.
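The field boundaries just described can be checked with a small Python decoder. This is a sketch: the function name is an assumption, and the sample word 0xE5962200 was chosen to be consistent with the instruction LDR R2, [R6, #0x200], so treat it as an illustrative value rather than assembler output.

```python
def decode_load_store(word):
    """Extract the fields of a 32-bit Load/Store instruction word."""
    return {
        "condition": (word >> 28) & 0xF,   # bits 31 to 28
        "opcode": (word >> 20) & 0xFF,     # bits 27 to 20
        "rn": (word >> 16) & 0xF,          # bits 19 to 16, base register
        "rd": (word >> 12) & 0xF,          # bits 15 to 12, destination/source
        "offset_or_rm": word & 0xFFF,      # bits 11 to 0
    }

# Illustrative instruction word (an assumption, see the note above).
fields = decode_load_store(0xE5962200)
print({name: hex(value) for name, value in fields.items()})

# If an index register is used, its number is in the low-order four bits.
rm = fields["offset_or_rm"] & 0xF
```

For the sample word the decoder reports base register 6, destination register 2 and an offset field of 0x200.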
Bits 31-28: Condition | Bits 27-20: OP code | Bits 19-16: Rn | Bits 15-12: Rd | Bits 11-0: Offset or Rm
Figure 2. Format for Load and Store instructions.
6 ARM Instructions
ARM instructions are 32 bits long. In addition to machine instructions that are executed directly by the processor,
the ARM instruction set includes a number of pseudo-instructions that can be used in assembly language programs.
The Assembler replaces each pseudo-instruction by one or more machine instructions.
This section briefly discusses the main features of the ARM instruction set. For a complete description of the instruction set, including the details of how each instruction is encoded, the reader should consult the ARM Architecture Reference Manual.
6.1 Load and Store Instructions
Load and store instructions are used to move data between memory (and I/O interfaces) and the general-purpose
registers. The LDR (Load Register) instruction, illustrated in the previous section, loads a 32-bit operand into a
register. The corresponding Store instruction is STR (Store Register). For example,
STR R2, [R4]
copies the contents of R2 into the memory location at the address that is found in register R4.
There are also load and store instructions that use operands that are only 8 or 16 bits long. They are referred to as
Load/Store Byte and Load/Store Halfword instructions, respectively. Such load instructions are:
• LDRB (Load Register Byte)
• LDRSB (Load Register Signed Byte)
• LDRH (Load Register Halfword)
• LDRSH (Load Register Signed Halfword)
When a shorter operand is loaded into a 32-bit register, its value has to be adjusted to fit into the register. This is
done by zero-extending the 8- or 16-bit value to 32 bits in the LDRB and LDRH instructions. In the LDRSB and
LDRSH instructions the operand is sign-extended.
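The difference between zero-extension and sign-extension can be seen in a short Python sketch. The helper names and the operand values are assumptions for this example; results are shown as unsigned 32-bit register contents.

```python
def zero_extend(value, bits):
    """Keep the low `bits` bits; the upper bits of the register are 0."""
    return value & ((1 << bits) - 1)

def sign_extend(value, bits):
    """Copy the operand's top bit into all upper bits of a 32-bit register."""
    value &= (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    return ((value ^ sign_bit) - sign_bit) & 0xFFFFFFFF

ldrb = zero_extend(0x80, 8)        # as LDRB would load it:  0x00000080
ldrsb = sign_extend(0x80, 8)       # as LDRSB would load it: 0xFFFFFF80
ldrh = zero_extend(0x8001, 16)     # as LDRH would load it:  0x00008001
ldrsh = sign_extend(0x8001, 16)    # as LDRSH would load it: 0xFFFF8001
positive = sign_extend(0x7F, 8)    # top bit clear, so the value is unchanged
```

The two forms differ only when the most significant bit of the short operand is 1, which is the case in which the operand represents a negative signed number.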
The corresponding Store instructions are:
• STRB (Store Register Byte)
• STRH (Store Register Halfword)
The STRB instruction stores the low byte of register Rd into the memory byte specified by the address. The STRH
instruction stores the low halfword of register Rd.
6.1.1 Loading and Storing Multiple Registers
There are two instructions that allow loading of data into multiple registers, LDM (Load Multiple), and storing the
contents of multiple registers into memory, STM (Store Multiple). The memory operands must be in successive
word locations. These instructions are useful for two main purposes:
• transferring blocks of data between memory and processor registers, and
• saving data in registers on a stack, and then later restoring the registers from the stack
The address of the first word in memory is given in the base register, Rn. Upon transferring the last word of data,
the contents of Rn can be updated with the last address by specifying the Pre-indexed (!) addressing mode.
An instruction must specify the registers involved in the transfer. The registers must be listed in the assembly-language instruction in a field enclosed by braces, but they do not have to be contiguous. A range of registers is
specified by listing the first and the last registers in the range, separated by a hyphen (-). In the resulting machine
instruction, each register is identified by setting a corresponding bit in the field comprising the low-order 16 bits.
Registers are always stored by STM in the order from largest-to-smallest register-index (R15, R14, R13, ..., R0),
and loaded by LDM in the order from the smallest-to-largest register-index (R0, R1, R2, ..., R15).
The instruction must also indicate the direction in which memory addresses are computed. For block transfers there
are four possibilities for determining the addresses of consecutive data words. The address can be incremented or
decremented by 4 either before or after each data item is accessed. The desired action is specified by appending a
suffix to the OP-code mnemonic in the assembly-language instruction. The four suffixes are:
• IA – Increment After
• IB – Increment Before
• DA – Decrement After
• DB – Decrement Before
For example, the instruction
LDMIA R3!, {R4, R6-R8, R10}
will load registers R4, R6, R7, R8 and R10. If the starting address in R3 is 1000, then the data loaded into the
registers will be from addresses 1000, 1004, 1008, 1012 and 1016, respectively. Because the Pre-indexed mode is
specified, the final contents of R3 will be 1020.
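The address sequence for the four suffixes can be sketched in Python. The function name and return format are assumptions for this example; the model relies on the ARM rule that the lowest-numbered register in the list corresponds to the lowest memory address, and it computes addresses only, without moving data.

```python
def block_transfer_addresses(base, registers, suffix, writeback=True):
    """Return ({register number: address}, final base) for LDM/STM."""
    n = len(registers)
    ordered = sorted(registers)  # lowest register <-> lowest address
    if suffix == "IA":           # Increment After
        first, final = base, base + 4 * n
    elif suffix == "IB":         # Increment Before
        first, final = base + 4, base + 4 * n
    elif suffix == "DA":         # Decrement After
        first, final = base - 4 * n + 4, base - 4 * n
    elif suffix == "DB":         # Decrement Before
        first, final = base - 4 * n, base - 4 * n
    else:
        raise ValueError(f"unknown suffix: {suffix}")
    addresses = {reg: first + 4 * i for i, reg in enumerate(ordered)}
    return addresses, (final if writeback else base)

# LDMIA R3!, {R4, R6-R8, R10} with R3 = 1000, as in the example above.
ia_addresses, ia_final = block_transfer_addresses(1000, [4, 6, 7, 8, 10], "IA")
# The same register list with the DB suffix, for comparison.
db_addresses, db_final = block_transfer_addresses(1000, [4, 6, 7, 8, 10], "DB")
print(ia_addresses, ia_final)
print(db_addresses, db_final)
```

The IA case reproduces the addresses 1000 to 1016 and the final base value 1020 given in the text. With DB, the same five registers occupy addresses 980 to 996 and the base ends at 980.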
The LDM and STM instructions are very useful in the context of subroutines, where they can be used to save
the contents of registers on the stack. For this purpose, there exist pseudo-instructions PUSH and POP, which are
actually implemented as particular forms of STM and LDM instructions. In these instructions the Stack Pointer, SP,
is the base register, which is always updated. The SP is decremented by 4 before each transfer in PUSH instructions,
and it is incremented by 4 after each transfer in POP instructions. For example, the instruction
PUSH {R1, R3-R5}
places the contents of registers R5, R4, R3 and R1 onto the stack. The equivalent Store Multiple instruction is
STMDB SP!, {R1, R3-R5}
The instruction
POP {R1, R3-R5}
restores the contents of these registers from the stack. The equivalent Load Multiple instruction would be
LDMIA SP!, {R1, R3-R5}
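The effect of this PUSH and POP pair on the stack can be followed with a small Python model. It is a sketch under stated assumptions: registers are keyed by number, memory is a dictionary of word addresses, and the initial SP value and register contents are invented for the example.

```python
def push(regs, memory, reg_list):
    """Model PUSH {reg_list}, that is, STMDB SP!, {reg_list}."""
    for reg in sorted(reg_list, reverse=True):  # highest-numbered register first
        regs["SP"] -= 4                         # decrement before each transfer
        memory[regs["SP"]] = regs[reg]

def pop(regs, memory, reg_list):
    """Model POP {reg_list}, that is, LDMIA SP!, {reg_list}."""
    for reg in sorted(reg_list):                # lowest-numbered register first
        regs[reg] = memory[regs["SP"]]
        regs["SP"] += 4                         # increment after each transfer

# Assumed initial state, for illustration only.
regs = {1: 0x11, 3: 0x33, 4: 0x44, 5: 0x55, "SP": 0x2000}
memory = {}

push(regs, memory, [1, 3, 4, 5])
sp_after_push = regs["SP"]
stack_image = dict(memory)

for reg in (1, 3, 4, 5):
    regs[reg] = 0                               # a subroutine overwrites them

pop(regs, memory, [1, 3, 4, 5])
sp_after_pop = regs["SP"]
```

After the push, R5 is at the highest stack address (0x1FFC) and R1 is at the new top of the stack (0x1FF0). After the pop, the four registers and SP hold their original values again.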
6.2 Data Processing Instructions
A variety of ARM instructions are provided for the processing of data, including instructions that perform shifting,
arithmetic operations, logical operations, and data transfer between registers.
6.3 Flexible Operands
A number of data processing instructions have the general form
OP Rd, Rn, Operand2
where Rd is the destination register, Rn is the first operand, and Operand2 is the second operand. A considerable
amount of flexibility is provided by Operand2. It can be an immediate constant, as in
OP Rd, Rn, #value
This instruction performs the operation OP using the contents of Rn and the constant value, and places the result
into Rd. For example, if OP is the addition instruction ADD, then
ADD R0, R1, #1