To provide the most up-to-date information, the revision of our documents on the World Wide Web will be
the most current. Your printed copy may be an earlier revision. To verify you have the latest information
available, refer to:
http://www.freescale.com
The following revision history table summarizes changes contained in this document.
Revision History
Revision Number   Date           Summary of Changes
3.0               April, 2002    Incorporated information covering HCS12 Family of 16-bit MCUs throughout the book.
4.0               March, 2006    Reformatted to Freescale publication standards.
Section 1. Introduction
1.1 Introduction
This manual describes the features and operation of the core (central
processing unit, or CPU, and development support functions) used in all
HCS12 microcontrollers. For reference, information is provided for the
M68HC12.
1.2 Features
The CPU12 is a high-speed, 16-bit processing unit that has a programming
model identical to that of the industry standard M68HC11 central processor
unit (CPU). The CPU12 instruction set is a proper superset of the M68HC11
instruction set, and M68HC11 source code is accepted by CPU12
assemblers with no changes.
• Full 16-bit data paths support efficient arithmetic operation and
high-speed math execution
• Supports instructions with odd byte counts, including many
single-byte instructions. This allows much more efficient use of ROM
space.
• An instruction queue buffers program information so the CPU has
immediate access to at least three bytes of machine code at the start
of every instruction.
• Extensive set of indexed addressing capabilities, including:
– Using the stack pointer as an indexing register in all indexed
operations
– Using the program counter as an indexing register in all but auto
increment/decrement mode
– Accumulator offsets using A, B, or D accumulators
– Automatic index predecrement, preincrement, postdecrement,
and postincrement (by –8 to +8)
S12CPUV2 Reference Manual, Rev. 4.0
Freescale Semiconductor15
1.3 Symbols and Notation
The symbols and notation shown here are used throughout the manual.
More specialized notation that applies only to the instruction glossary or
instruction set summary is described at the beginning of those sections.
1.3.1 Abbreviations for System Resources
A — Accumulator A
B — Accumulator B
D — Double accumulator D (A : B)
X — Index register X
Y — Index register Y
SP — Stack pointer
PC — Program counter
CCR — Condition code register
    S — STOP instruction control bit
    X — Non-maskable interrupt control bit
    H — Half-carry status bit
    I — Maskable interrupt control bit
    N — Negative status bit
    Z — Zero status bit
    V — Two’s complement overflow status bit
    C — Carry/Borrow status bit
1.3.2 Memory and Addressing
M — 8-bit memory location pointed to by the effective
address of the instruction
M : M+1 — 16-bit memory location. Consists of the contents of the
location pointed to by the effective address
concatenated with the contents of the location at the
next higher memory address. The most significant byte
is at location M.
M~M+3, M(Y)~M(Y+3) — 32-bit memory location. Consists of the contents of the
effective address of the instruction concatenated with
the contents of the next three higher memory locations.
The most significant byte is at location M or M(Y).
M(X) — Memory locations pointed to by index register X
M(SP) — Memory locations pointed to by the stack pointer
M(Y+3) — Memory locations pointed to by index register Y plus 3
PPAGE — Program overlay page (bank) number for extended
memory (>64 Kbytes).
Page — Program overlay page
XH — High-order byte
XL — Low-order byte
( ) — Content of register or memory location
$ — Hexadecimal value
% — Binary value
1.3.3 Operators
+ — Addition
– — Subtraction
• — Logical AND
+ — Logical OR (inclusive)
⊕ — Logical exclusive OR
× — Multiplication
÷ — Division
M̄ — Negation. One’s complement (invert each bit of M)
: — Concatenate
Example: A : B means the 16-bit value formed by concatenating
8-bit accumulator A with 8-bit accumulator B.
A is in the high-order position.
⇒ — Transfer
Example: (A) ⇒ M means the content of accumulator A is
transferred to memory location M.
⇔ — Exchange
Example: D ⇔ X means exchange the contents of D with
those of X.
1.3.4 Definitions
Logic level 1 is the voltage that corresponds to the true (1) state.
Logic level 0 is the voltage that corresponds to the false (0) state.
Set refers specifically to establishing logic level 1 on a bit or bits.
Cleared refers specifically to establishing logic level 0 on a bit or bits.
Asserted means that a signal is in its active logic state. An active low signal
changes from logic level 1 to logic level 0 when asserted, and an active
high signal changes from logic level 0 to logic level 1.
Negated means that an asserted signal changes logic state. An active low
signal changes from logic level 0 to logic level 1 when negated, and an
active high signal changes from logic level 1 to logic level 0.
ADDR is the mnemonic for address bus.
DATA is the mnemonic for data bus.
LSB means least significant bit or bits.
MSB means most significant bit or bits.
LSW means least significant word or words.
MSW means most significant word or words.
A specific bit location within a range is referred to by mnemonic and
number. For example, A7 is bit 7 of accumulator A.
A range of bit locations is referred to by mnemonic and the numbers that
define the range. For example, DATA[15:8] form the high byte of the data
bus.
Reference Manual — S12CPUV2
Section 2. Overview
2.1 Introduction
This section describes the CPU12 programming model, register set, the
data types used, and basic memory organization.
2.2 Programming Model
The CPU12 programming model, shown in Figure 2-1, is the same as that
of the M68HC11 CPU. The CPU has two 8-bit general-purpose
accumulators (A and B) that can be concatenated into a single 16-bit
accumulator (D) for certain instructions. It also has:
• Two index registers (X and Y)
• 16-bit stack pointer (SP)
• 16-bit program counter (PC)
• 8-bit condition code register (CCR)
 7      A      0 7      B      0    8-BIT ACCUMULATORS A AND B
15             D               0    OR 16-BIT DOUBLE ACCUMULATOR D
15             IX              0    INDEX REGISTER X
15             IY              0    INDEX REGISTER Y
15             SP              0    STACK POINTER
15             PC              0    PROGRAM COUNTER
                 S X H I N Z V C    CONDITION CODE REGISTER
Figure 2-1. Programming Model
2.2.1 Accumulators
General-purpose 8-bit accumulators A and B are used to hold operands and
results of operations. Some instructions treat the combination of these two
8-bit accumulators (A : B) as a 16-bit double accumulator (D).
Most operations can use accumulator A or B interchangeably. However,
there are a few exceptions. Add, subtract, and compare instructions
involving both A and B (ABA, SBA, and CBA) only operate in one direction,
so it is important to make certain the correct operand is in the correct
accumulator. The decimal adjust accumulator A (DAA) instruction is used
after binary-coded decimal (BCD) arithmetic operations. There is no
equivalent instruction to adjust accumulator B.
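As an illustration only, the relationship between A, B, and D can be modeled in a few lines of Python. The helper names (make_d, split_d, sba) are hypothetical and are not part of any CPU12 tool; the sketch assumes register values are held as plain integers.

```python
def make_d(a, b):
    """Concatenate 8-bit A (high byte) and 8-bit B (low byte) into 16-bit D."""
    return ((a & 0xFF) << 8) | (b & 0xFF)


def split_d(d):
    """Split 16-bit D back into the pair (A, B)."""
    return (d >> 8) & 0xFF, d & 0xFF


def sba(a, b):
    """Model of SBA: subtract B from A, result goes to A (8-bit wraparound)."""
    return (a - b) & 0xFF


print(hex(make_d(0x12, 0x34)))   # A is the high-order byte of D
print(split_d(0x1234))
print(hex(sba(0x10, 0x01)), hex(sba(0x01, 0x10)))
```

Swapping the two arguments of sba gives a different result, which is the one-direction behavior noted above for instructions that involve both accumulators.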
2.2.2 Index Registers
16-bit index registers X and Y are used for indexed addressing. In the
indexed addressing modes, the contents of an index register are added to
5-bit, 9-bit, or 16-bit constants or to the content of an accumulator to form
the effective address of the instruction operand. The second index register
is especially useful for moves and in cases where operands from two
separate tables are used in a calculation.
2.2.3 Stack Pointer
The CPU12 supports an automatic program stack. The stack is used to save
system context during subroutine calls and interrupts and can also be used
for temporary data storage. The stack can be located anywhere in the
standard 64-Kbyte address space and can grow to any size up to the total
amount of memory available in the system.
The stack pointer (SP) holds the 16-bit address of the last stack location
used. Normally, the SP is initialized by one of the first instructions in an
application program. The stack grows downward from the address pointed
to by the SP. Each time a byte is pushed onto the stack, the stack pointer is
automatically decremented, and each time a byte is pulled from the stack,
the stack pointer is automatically incremented.
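The push and pull behavior described above can be sketched as a small Python model. The class and method names are hypothetical; the model assumes a flat 64-Kbyte memory and follows the statement that SP holds the address of the last stack location used, so a push decrements SP before storing.

```python
class Stack:
    """Toy model of a downward-growing byte stack in a 64-Kbyte space."""

    def __init__(self, sp):
        self.mem = bytearray(0x10000)
        self.sp = sp & 0xFFFF

    def push8(self, value):
        # SP points at the last used location, so decrement first, then store.
        self.sp = (self.sp - 1) & 0xFFFF
        self.mem[self.sp] = value & 0xFF

    def pull8(self):
        # Read the last used location, then increment SP.
        value = self.mem[self.sp]
        self.sp = (self.sp + 1) & 0xFFFF
        return value


stack = Stack(0x0C00)      # 0x0C00 is an arbitrary example address
stack.push8(0xAA)
stack.push8(0xBB)
print(hex(stack.sp), hex(stack.pull8()), hex(stack.pull8()), hex(stack.sp))
```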
When a subroutine is called, the address of the instruction following the
calling instruction is automatically calculated and pushed onto the stack.
Normally, a return-from-subroutine (RTS) or a return-from-call (RTC)
instruction is executed at the end of a subroutine. The return instruction
loads the program counter with the previously stacked return address and
execution continues at that address.
When an interrupt occurs, the current instruction finishes execution. The
address of the next instruction is calculated and pushed onto the stack, all
the CPU registers are pushed onto the stack, the program counter is loaded
with the address pointed to by the interrupt vector, and execution continues
at that address. The stacked registers are referred to as an interrupt stack
frame. The CPU12 stack frame is the same as that of the M68HC11.
NOTE: These instructions can be interrupted, and they resume execution once the
interrupt has been serviced:
• REV (fuzzy logic rule evaluation)
• REVW (fuzzy logic rule evaluation (weighted))
• WAV (weighted average)
2.2.4 Program Counter
The program counter (PC) is a 16-bit register that holds the address of the
next instruction to be executed. It is automatically incremented each time an
instruction is fetched.
2.2.5 Condition Code Register
The condition code register (CCR), named for its five status indicators,
contains:
• Five status indicators
• Two interrupt masking bits
• STOP instruction control bit
The status bits reflect the results of CPU operation as it executes
instructions. The five flags are:
• Half carry (H)
• Negative (N)
• Zero (Z)
• Overflow (V)
• Carry/borrow (C)
The half-carry flag is used only for BCD arithmetic operations. The N, Z, V,
and C status bits allow for branching based on the results of a previous
operation.
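To make the five status flags concrete, the following Python sketch computes them for an 8-bit addition. It is a model under stated assumptions, not a description of the CPU12 internals: the function name is hypothetical, and the bit positions (S X H I N Z V C from bit 7 down to bit 0) are taken from the register layout in Figure 2-1.

```python
# Assumed CCR bit masks, bit 7 down to bit 0: S X H I N Z V C
S, X, H, I, N, Z, V, C = 0x80, 0x40, 0x20, 0x10, 0x08, 0x04, 0x02, 0x01


def add8_flags(a, b, carry_in=0):
    """Return (result, status bits) for an 8-bit add; S, X, and I are not modeled."""
    a &= 0xFF
    b &= 0xFF
    total = a + b + carry_in
    result = total & 0xFF
    ccr = 0
    if (a & 0x0F) + (b & 0x0F) + carry_in > 0x0F:
        ccr |= H                      # carry out of bit 3
    if result & 0x80:
        ccr |= N                      # MSB of the result
    if result == 0:
        ccr |= Z                      # all result bits are 0
    if ~(a ^ b) & (a ^ result) & 0x80:
        ccr |= V                      # operands share a sign, result differs
    if total > 0xFF:
        ccr |= C                      # carry out of bit 7
    return result, ccr


print(add8_flags(0x7F, 0x01))   # signed overflow case
print(add8_flags(0xFF, 0x01))   # unsigned carry case
```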
In some architectures, only a few instructions affect condition codes, so that
multiple instructions must be executed in order to load and test a variable.
Since most CPU12 instructions automatically update condition codes, it is
rarely necessary to execute an extra instruction for this purpose. The
challenge in using the CPU12 lies in finding instructions that do not alter the
condition codes. The most important of these instructions are pushes, pulls,
transfers, and exchanges.
It is always a good idea to refer to an instruction set summary (see
Appendix A. Instruction Reference) to check which condition codes are
affected by a particular instruction.
The following paragraphs describe normal uses of the condition codes.
There are other, more specialized uses. For instance, the C status bit is
used to enable weighted fuzzy logic rule evaluation. Specialized usages are
described in the relevant portions of this manual and in Section 6.
Instruction Glossary.
2.2.5.1 S Control Bit
Clearing the S bit enables the STOP instruction. Execution of a STOP
instruction normally causes the on-chip oscillator to stop. This may be
undesirable in some applications. If the CPU encounters a STOP instruction
while the S bit is set, it is treated like a no-operation (NOP) instruction and
continues to the next instruction. Reset sets the S bit.
2.2.5.2 X Mask Bit
The XIRQ input is an updated version of the NMI input found on earlier
generations of MCUs. Non-maskable interrupts are typically used to deal
with major system failures, such as loss of power. However, enabling
non-maskable interrupts before a system is fully powered and initialized can
lead to spurious interrupts. The X bit provides a mechanism for enabling
non-maskable interrupts after a system is stable.
By default, the X bit is set to 1 during reset. As long as the X bit remains set,
interrupt service requests made via the XIRQ pin are not recognized. An
instruction must clear the X bit to enable non-maskable interrupt service
requests made via the XIRQ pin. Once the X bit has been cleared to 0,
software cannot reset it to 1 by writing to the CCR. The X bit is not affected
by maskable interrupts.
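The one-way nature of the X bit can be sketched as follows. This is a simplified model of an ordinary software write to the CCR only; it does not model interrupt entry or RTI. The function name is hypothetical, and the X bit position (bit 6) is an assumption taken from Figure 2-1.

```python
X_MASK = 0x40  # assumed position of the X bit in the CCR (bit 6)


def write_ccr(current, requested):
    """Model a software write to the CCR: X may be cleared but never set again."""
    current &= 0xFF
    new = requested & 0xFF
    if not current & X_MASK:
        new &= ~X_MASK & 0xFF      # X already cleared, so it stays cleared
    return new


print(hex(write_ccr(0xD0, 0x90)))  # X was set, write clears it
print(hex(write_ccr(0x90, 0xD0)))  # X was clear, attempt to set it is ignored
```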
When an XIRQ interrupt occurs after non-maskable interrupts are enabled,
both the X bit and the I bit are set automatically to prevent other interrupts
from being recognized during the interrupt service routine. The mask bits
are set after the registers are stacked, but before the interrupt vector is
fetched.
Normally, a return-from-interrupt (RTI) instruction at the end of the interrupt
service routine restores register values that were present before the
interrupt occurred. Since the CCR is stacked before the X bit is set, the RTI
normally clears the X bit, and thus re-enables non-maskable interrupts.
While it is possible to manipulate the stacked value of X so that X is set after
an RTI, there is no software method to reset X (and disable XIRQ) once X
has been cleared.
2.2.5.3 H Status Bit
The H bit indicates a carry from accumulator A bit 3 during an addition
operation. The DAA instruction uses the value of the H bit to adjust a result
in accumulator A to correct BCD format. H is updated only by the add
accumulator A to accumulator B (ABA), add without carry (ADD), and add
with carry (ADC) instructions.
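The role of the half carry in BCD arithmetic can be illustrated with the textbook decimal-adjust procedure. The sketch below is an assumption about how such an adjustment is conventionally done, not a statement of the DAA implementation in the CPU12; the function name add_bcd is hypothetical.

```python
def add_bcd(a, b):
    """Add two packed-BCD bytes; return (adjusted result, carry out)."""
    total = (a & 0xFF) + (b & 0xFF)
    result = total & 0xFF
    carry = total > 0xFF
    half = (a & 0x0F) + (b & 0x0F) > 0x0F    # H: carry out of bit 3
    adjust = 0
    if half or (result & 0x0F) > 0x09:
        adjust |= 0x06                       # fix the low digit
    if carry or result > 0x99:
        adjust |= 0x60                       # fix the high digit
        carry = True
    return (result + adjust) & 0xFF, int(carry)


print([hex(v) for v in add_bcd(0x19, 0x28)])   # 19 + 28 in BCD
print([hex(v) for v in add_bcd(0x99, 0x01)])   # 99 + 1 in BCD
```

Without the half-carry information, a sum such as $19 + $28 = $41 would look like a valid BCD value even though the low digit overflowed.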
2.2.5.4 I Mask Bit
The I bit enables and disables maskable interrupt sources. By default, the I
bit is set to 1 during reset. An instruction must clear the I bit to enable
maskable interrupts. While the I bit is set, maskable interrupts can become
pending and are remembered, but operation continues uninterrupted until
the I bit is cleared.
When an interrupt occurs after interrupts are enabled, the I bit is
automatically set to prevent other maskable interrupts during the interrupt
service routine. The I bit is set after the registers are stacked, but before the
first instruction in the interrupt service routine is executed.
Normally, an RTI instruction at the end of the interrupt service routine
restores register values that were present before the interrupt occurred.
Since the CCR is stacked before the I bit is set, the RTI normally clears the
I bit, and thus re-enables interrupts. Interrupts can be re-enabled by clearing
the I bit within the service routine.
2.2.5.5 N Status Bit
The N bit shows the state of the MSB of the result. N is most commonly used
in two’s complement arithmetic, where the MSB of a negative number is 1
and the MSB of a positive number is 0, but it has other uses. For instance,
if the MSB of a register or memory location is used as a status flag, the user
can test status by loading an accumulator.
2.2.5.6 Z Status Bit
The Z bit is set when all the bits of the result are 0s. Compare instructions
perform an internal implied subtraction, and the condition codes, including
Z, reflect the results of that subtraction. The increment index register X
(INX), decrement index register X (DEX), increment index register Y (INY),
and decrement index register Y (DEY) instructions affect the Z bit and no
other condition flags. These operations can only determine = (equal) and ≠
(not equal).
2.2.5.7 V Status Bit
The V bit is set when two’s complement overflow occurs as a result of an
operation.
2.2.5.8 C Status Bit
The C bit is set when a carry occurs during addition or a borrow occurs
during subtraction. The C bit also acts as an error flag for multiply and divide
operations. Shift and rotate instructions operate through the C bit to facilitate
multiple-word shifts.
2.3 Data Types
The CPU12 uses these types of data:
• Bits
• 5-bit signed integers
• 8-bit signed and unsigned integers
• 8-bit, 2-digit binary-coded decimal numbers
• 9-bit signed integers
• 16-bit signed and unsigned integers
• 16-bit effective addresses
• 32-bit signed and unsigned integers
Negative integers are represented in two’s complement form.
Five-bit and 9-bit signed integers are used only as offsets for indexed
addressing modes.
Sixteen-bit effective addresses are formed during addressing mode
computations.
Thirty-two-bit integer dividends are used by extended division instructions.
Extended multiply and extended multiply-and-accumulate instructions
produce 32-bit products.
2.4 Memory Organization
The standard CPU12 address space is 64 Kbytes. Some M68HC12 devices
support a paged memory expansion scheme that increases the standard
space by means of predefined windows in address space. The CPU12 has
special instructions that support use of expanded memory.
Eight-bit values can be stored at any odd or even byte address in available
memory.
Sixteen-bit values are stored in memory as two consecutive bytes; the high
byte occupies the lowest address, but need not be aligned to an even
boundary.
Thirty-two-bit values are stored in memory as four consecutive bytes; the
high byte occupies the lowest address, but need not be aligned to an even
boundary.
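The byte ordering described above (high byte at the lowest address, no alignment requirement) can be sketched in Python. The helper names are hypothetical, and memory is modeled as a plain bytearray.

```python
def store16(mem, addr, value):
    """Store a 16-bit value with the high byte at the lowest address."""
    mem[addr:addr + 2] = (value & 0xFFFF).to_bytes(2, "big")


def load16(mem, addr):
    return int.from_bytes(mem[addr:addr + 2], "big")


def store32(mem, addr, value):
    """Store a 32-bit value as four consecutive bytes, high byte first."""
    mem[addr:addr + 4] = (value & 0xFFFFFFFF).to_bytes(4, "big")


def load32(mem, addr):
    return int.from_bytes(mem[addr:addr + 4], "big")


mem = bytearray(16)
store16(mem, 1, 0x1234)          # odd address: no alignment needed
store32(mem, 5, 0x89ABCDEF)
print(mem.hex())
```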
All input/output (I/O) and all on-chip peripherals are memory-mapped. No
special instruction syntax is required to access these addresses. On-chip
registers and memory typically are grouped in blocks which can be
relocated within the standard 64-Kbyte address space. Refer to device
documentation for specific information.
2.5 Instruction Queue
The CPU12 uses an instruction queue to buffer program information. The
mechanism is called a queue rather than a pipeline because a typical
pipelined CPU executes more than one instruction at the same time, while
the CPU12 always finishes executing an instruction before beginning to
execute another. Refer to Section 4. Instruction Queue for more
information.
Section 3. Addressing Modes
3.1 Introduction
Addressing modes determine how the central processor unit (CPU)
accesses memory locations to be operated upon. This section discusses
the various modes and how they are used.
3.2 Mode Summary
Addressing modes are an implicit part of CPU12 instructions. Refer to
Appendix A. Instruction Reference for the modes used by each
instruction. All CPU12 addressing modes are shown in Table 3-1.
The CPU12 uses all M68HC11 modes as well as new forms of indexed
addressing. Differences between M68HC11 and M68HC12 indexed modes
are described in 3.9 Indexed Addressing Modes. Instructions that use
more than one mode are discussed in 3.10 Instructions Using Multiple
Modes.
3.3 Effective Address
Each addressing mode except inherent mode generates a 16-bit effective
address which is used during the memory reference portion of the
instruction. Effective address computations do not require extra execution
cycles.
Table 3-1. M68HC12 Addressing Mode Summary
Addressing Mode            Source Format           Abbreviation   Description
Immediate                  INST #opr8i             IMM            Operand is included in instruction stream
                           or                                     8- or 16-bit size implied by context
                           INST #opr16i
Direct                     INST opr8a              DIR            Operand is the lower 8 bits of an address
                                                                  in the range $0000–$00FF
Relative                   INST rel8               REL            An 8-bit or 16-bit relative offset from the current pc
                           or                                     is supplied in the instruction
                           INST rel16
Indexed                    INST oprx5,xysp         IDX            5-bit signed constant offset
(5-bit offset)                                                    from X, Y, SP, or PC
Indexed                    INST oprx3,–xys         IDX            Auto pre-decrement x, y, or sp by 1 ~ 8
(pre-decrement)
Indexed                    INST oprx3,+xys         IDX            Auto pre-increment x, y, or sp by 1 ~ 8
(pre-increment)
Indexed                    INST oprx3,xys–         IDX            Auto post-decrement x, y, or sp by 1 ~ 8
(post-decrement)
Indexed                    INST oprx3,xys+         IDX            Auto post-increment x, y, or sp by 1 ~ 8
(post-increment)
Indexed                    INST abd,xysp           IDX            Indexed with 8-bit (A or B) or 16-bit (D)
(accumulator offset)                                              accumulator offset from X, Y, SP, or PC
Indexed                    INST oprx9,xysp         IDX1           9-bit signed constant offset from X, Y, SP, or PC
(9-bit offset)                                                    (lower 8 bits of offset in one extension byte)
Indexed                    INST oprx16,xysp        IDX2           16-bit constant offset from X, Y, SP, or PC
(16-bit offset)                                                   (16-bit offset in two extension bytes)
Indexed-Indirect           INST [oprx16,xysp]      [IDX2]         Pointer to operand is found at...
(16-bit offset)                                                   16-bit constant offset from X, Y, SP, or PC
                                                                  (16-bit offset in two extension bytes)
Indexed-Indirect           INST [D,xysp]           [D,IDX]        Pointer to operand is found at...
(D accumulator offset)                                            X, Y, SP, or PC plus the value in D
3.4 Inherent Addressing Mode
Instructions that use this addressing mode either have no operands or all
operands are in internal CPU registers. In either case, the CPU does not
need to access any memory locations to complete the instruction.
Examples:
NOP        ;this instruction has no operands
INX        ;operand is a CPU register
3.5 Immediate Addressing Mode
Operands for immediate mode instructions are included in the instruction
stream and are fetched into the instruction queue one 16-bit word at a time
during normal program fetch cycles. Since program data is read into the
instruction queue several cycles before it is needed, when an immediate
addressing mode operand is called for by an instruction, it is already present
in the instruction queue.
The pound symbol (#) is used to indicate an immediate addressing mode
operand. One common programming error is to accidentally omit the #
symbol. This causes the assembler to misinterpret the expression that
follows it as an address rather than explicitly provided data. For example,
LDAA #$55 means to load the immediate value $55 into the A accumulator,
while LDAA $55 means to load the value from address $0055 into the A
accumulator. Without the # symbol, the instruction is erroneously
interpreted as a direct addressing mode instruction.
Examples:
LDAA #$55
LDX  #$1234
LDY  #$67
These are common examples of 8-bit and 16-bit immediate addressing
modes. The size of the immediate operand is implied by the instruction
context. In the third example, the instruction implies a 16-bit immediate
value but only an 8-bit value is supplied. In this case the assembler will
generate the 16-bit value $0067 because the CPU expects a 16-bit value in
the instruction stream.
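The way operand size follows from the instruction can be sketched as a tiny assembler fragment in Python. This is illustrative only: the function and table names are hypothetical, and the opcode values ($86, $CE, $CD) are stated from memory of the CPU12 opcode map and should be checked against Appendix A before being relied on.

```python
# mnemonic -> (assumed opcode, immediate operand size in bytes)
IMMEDIATE = {
    "LDAA": (0x86, 1),   # 8-bit accumulator, 8-bit operand
    "LDX": (0xCE, 2),    # 16-bit index register, 16-bit operand
    "LDY": (0xCD, 2),
}


def assemble_immediate(mnemonic, value):
    """Emit opcode plus an operand whose width is implied by the instruction."""
    opcode, size = IMMEDIATE[mnemonic]
    if not 0 <= value < (1 << (8 * size)):
        raise ValueError("operand does not fit the instruction's operand size")
    return bytes([opcode]) + value.to_bytes(size, "big")


print(assemble_immediate("LDAA", 0x55).hex())
print(assemble_immediate("LDX", 0x1234).hex())
print(assemble_immediate("LDY", 0x67).hex())    # operand widened to $0067
```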
Example:
BRSET FOO,#$03,THERE
In this example, extended addressing mode is used to access the operand
FOO, immediate addressing mode is used to access the mask value $03,
and relative addressing mode is used to identify the destination address of
a branch in case the branch-taken conditions are met. BRSET is listed as
an extended mode instruction even though immediate and relative modes
are also used.
3.6 Direct Addressing Mode
This addressing mode is sometimes called zero-page addressing because
it is used to access operands in the address range $0000 through $00FF.
Since these addresses always begin with $00, only the eight low-order bits
of the address need to be included in the instruction, which saves program
space and execution time. A system can be optimized by placing the most
commonly accessed data in this area of memory. The eight low-order bits of
the operand address are supplied with the instruction, and the eight
high-order bits of the address are assumed to be 0.
Example:
LDAA $55
This is a basic example of direct addressing. The value $55 is taken to be
the low-order half of an address in the range $0000 through $00FF. The
high order half of the address is assumed to be 0. During execution of this
instruction, the CPU combines the value $55 from the instruction with the
assumed value of $00 to form the address $0055, which is then used to
access the data to be loaded into accumulator A.
Example:
LDX $20
In this example, the value $20 is combined with the assumed value of $00
to form the address $0020. Since the LDX instruction requires a 16-bit
value, a 16-bit word of data is read from addresses $0020 and $0021. After
execution of this instruction, the X index register will have the value from
address $0020 in its high-order half and the value from address $0021 in its
low-order half.
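The two direct-mode examples above can be modeled as follows. The function names are hypothetical and memory is a plain bytearray; the sketch only shows how the 8-bit operand becomes a 16-bit address and how a 16-bit load reads two consecutive bytes.

```python
def direct_address(operand8):
    """Form the effective address: high byte assumed $00, low byte from the instruction."""
    return 0x0000 | (operand8 & 0xFF)


def ldaa_direct(mem, operand8):
    """8-bit load from the direct page."""
    return mem[direct_address(operand8)]


def ldx_direct(mem, operand8):
    """16-bit load: high-order half from EA, low-order half from EA + 1."""
    ea = direct_address(operand8)
    return (mem[ea] << 8) | mem[ea + 1]


mem = bytearray(0x10000)
mem[0x0055] = 0xA5                    # example data, chosen arbitrarily
mem[0x0020], mem[0x0021] = 0x12, 0x34
print(hex(ldaa_direct(mem, 0x55)), hex(ldx_direct(mem, 0x20)))
```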
3.7 Extended Addressing Mode
In this addressing mode, the full 16-bit address of the memory location to be
operated on is provided in the instruction. This addressing mode can be
used to access any location in the 64-Kbyte memory map.
Example:
LDAA $F03B
This is a basic example of extended addressing. The value from address
$F03B is loaded into the A accumulator.
3.8 Relative Addressing Mode
The relative addressing mode is used only by branch instructions. Short and
long conditional branch instructions use relative addressing mode
exclusively, but branching versions of bit manipulation instructions (branch
if bits set (BRSET) and branch if bits cleared (BRCLR)) use multiple
addressing modes, including relative mode. Refer to
3.10 Instructions Using Multiple Modes for more information.
Short branch instructions consist of an 8-bit opcode and a signed 8-bit offset
contained in the byte that follows the opcode. Long branch instructions
consist of an 8-bit prebyte, an 8-bit opcode, and a signed 16-bit offset
contained in the two bytes that follow the opcode.
Each conditional branch instruction tests certain status bits in the condition
code register. If the bits are in a specified state, the offset is added to the
address of the next memory location after the offset to form an effective
address, and execution continues at that address. If the bits are not in the
specified state, execution continues with the instruction immediately
following the branch instruction.
Bit-condition branches test whether bits in a memory byte are in a specific
state. Various addressing modes can be used to access the memory
location. An 8-bit mask operand is used to test the bits. If each bit in memory
that corresponds to a 1 in the mask is either set (BRSET) or clear (BRCLR),
an 8-bit offset is added to the address of the next memory location after the
offset to form an effective address, and execution continues at that address.
If any bit in memory that corresponds to a 1 in the mask is not in the
specified state, execution continues with the instruction immediately
following the branch instruction.
8-bit, 9-bit, and 16-bit offsets are signed two’s complement numbers to
support branching upward and downward in memory. The numeric range of
short branch offset values is $80 (–128) to $7F (127). Loop primitive
instructions support a 9-bit offset which allows a range of $100 (–256) to
$0FF (255). The numeric range of long branch offset values is $8000
(–32,768) to $7FFF (32,767). If the offset is 0, the CPU executes the
instruction immediately following the branch instruction, regardless of the
test involved.
Since the offset is at the end of a branch instruction, using a negative offset
value can cause the program counter (PC) to point to the opcode and initiate
a loop. For instance, a branch always (BRA) instruction consists of two
bytes, so using an offset of $FE sets up an infinite loop; the same is true of
a long branch always (LBRA) instruction with an offset of $FFFC.
An offset that points to the opcode can cause a bit-condition branch to
repeat execution until the specified bit condition is satisfied. Since
bit-condition branches can consist of four, five, or six bytes depending on
the addressing mode used to access the byte in memory, the offset value
that sets up a loop can vary. For instance, using an offset of $FC with a
BRCLR that accesses memory using an 8-bit indexed postbyte sets up a
loop that executes until all the bits in the specified memory byte that
correspond to 1s in the mask byte are cleared.
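The branch-target arithmetic in this section can be checked with a short Python sketch. The function names and the example address $4000 are hypothetical; the instruction lengths used (2 bytes for BRA, 4 bytes for LBRA, 4 bytes for a BRCLR with an 8-bit indexed postbyte) follow the text above.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    if value & (1 << (bits - 1)):
        value -= 1 << bits
    return value


def branch_target(instr_addr, instr_len, offset, offset_bits=8):
    """Offset is added to the address of the location that follows the offset."""
    next_addr = instr_addr + instr_len
    return (next_addr + sign_extend(offset, offset_bits)) & 0xFFFF


print(hex(branch_target(0x4000, 2, 0xFE)))         # BRA with $FE
print(hex(branch_target(0x4000, 4, 0xFFFC, 16)))   # LBRA with $FFFC
print(hex(branch_target(0x4000, 4, 0xFC)))         # 4-byte BRCLR with $FC
```

In all three cases the target equals the address of the instruction itself, which is why these offsets set up a loop.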
3.9 Indexed Addressing Modes
The CPU12 uses redefined versions of M68HC11 indexed modes that
reduce execution time and eliminate code size penalties for using the Y
index register. In most cases, CPU12 code size for indexed operations is the
same or is smaller than that for the M68HC11. Execution time is shorter in
all cases. Execution time improvements are due to both a reduced number
of cycles for all indexed instructions and to faster system clock speed.
The indexed addressing scheme uses a postbyte plus zero, one, or two
extension bytes after the instruction opcode. The postbyte and extensions
do the following tasks:
1. Specify which index register is used
2. Determine whether a value in an accumulator is used as an offset
3. Enable automatic pre- or post-increment or pre- or post-decrement
4. Specify size of increment or decrement
5. Specify use of 5-, 9-, or 16-bit signed offsets
This approach eliminates the differences between X and Y register use
while dramatically enhancing the indexed addressing capabilities.
Major advantages of the CPU12 indexed addressing scheme are:
• The stack pointer can be used as an index register in all indexed
operations.
• The program counter can be used as an index register in all but
autoincrement and autodecrement modes.
• A, B, or D accumulators can be used for accumulator offsets.
• Automatic pre- or post-increment or pre- or post-decrement by –8 to
+8
• A choice of 5-, 9-, or 16-bit signed constant offsets
• Use of two new indexed-indirect modes:
– Indexed-indirect mode with 16-bit offset
– Indexed-indirect mode with accumulator D offset
Table 3-2 is a summary of indexed addressing mode capabilities and a
description of postbyte encoding. The postbyte is noted as xb in instruction
descriptions. Detailed descriptions of the indexed addressing mode
variations follow the table.
All indexed addressing modes use a 16-bit CPU register and additional
information to create an effective address. In most cases the effective
address specifies the memory location affected by the operation. In some
variations of indexed addressing, the effective address specifies the
location of a value that points to the memory location affected by the
operation.
Table 3-2. Summary of Indexed Operations
Postbyte     Source Code      Comments
Code (xb)    Syntax           rr; 00 = X, 01 = Y, 10 = SP, 11 = PC
rr0nnnnn     ,r               5-bit constant offset n = –16 to +15
             n,r              r can specify X, Y, SP, or PC
             –n,r
111rr0zs     n,r              Constant offset (9- or 16-bit signed)
             –n,r             z-  0 = 9-bit with sign in LSB of postbyte(s)   –256 ≤ n ≤ 255
                                  1 = 16-bit                                  –32,768 ≤ n ≤ 65,535
                              if z = s = 1, 16-bit offset indexed-indirect (see below)
                              r can specify X, Y, SP, or PC
111rr011     [n,r]            16-bit offset indexed-indirect
                              rr can specify X, Y, SP, or PC                  –32,768 ≤ n ≤ 65,535
rr1pnnnn     n,–r  n,+r       Auto predecrement, preincrement, postdecrement, or postincrement;
             n,r–             p = pre-(0) or post-(1), n = –8 to –1, +1 to +8
             n,r+             r can specify X, Y, or SP (PC not a valid choice)
                              +8 = 0111
                              …
                              +1 = 0000
                              –1 = 1111
                              …
                              –8 = 1000
111rr1aa     A,r              Accumulator offset (unsigned 8-bit or 16-bit)
             B,r              aa- 00 = A
             D,r                  01 = B
                                  10 = D (16-bit)
                                  11 = see accumulator D offset indexed-indirect
                              r can specify X, Y, SP, or PC
111rr111     [D,r]            Accumulator D offset indexed-indirect
                              r can specify X, Y, SP, or PC
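The postbyte patterns in Table 3-2 can be exercised with a small Python decoder. This is a sketch built only from the bit patterns listed in the table; the function name, the returned dictionary layout, and the mode labels are hypothetical.

```python
REGS = ("X", "Y", "SP", "PC")

# Low three bits of a 111rr... postbyte (0zs and 1aa fields from Table 3-2)
EXTENDED_MODES = {
    0b000: "9-bit offset, sign bit 0",
    0b001: "9-bit offset, sign bit 1",
    0b010: "16-bit offset",
    0b011: "16-bit offset indexed-indirect",
    0b100: "A offset",
    0b101: "B offset",
    0b110: "D offset",
    0b111: "D offset indexed-indirect",
}


def decode_postbyte(xb):
    xb &= 0xFF
    if not xb & 0x20:                          # rr0nnnnn
        n = xb & 0x1F
        if n & 0x10:
            n -= 0x20                          # 5-bit two's complement
        return {"mode": "5-bit offset", "reg": REGS[xb >> 6], "n": n}
    if (xb >> 6) != 0b11:                      # rr1pnnnn, PC is not allowed
        nnnn = xb & 0x0F
        n = nnnn + 1 if nnnn < 8 else nnnn - 16
        when = "post" if xb & 0x10 else "pre"
        return {"mode": when, "reg": REGS[xb >> 6], "n": n}
    return {"mode": EXTENDED_MODES[xb & 0b111], "reg": REGS[(xb >> 3) & 0b11]}


print(decode_postbyte(0x58))   # 5-bit offset of -8 from Y
print(decode_postbyte(0x30))   # post-increment X by 1
print(decode_postbyte(0xE9))   # 9-bit negative offset from Y
```

The order of the tests matters: a postbyte that starts with 111 would otherwise look like an auto increment or decrement on PC, which the table excludes.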
Indexed addressing mode instructions use a postbyte to specify index
registers (X and Y), stack pointer (SP), or program counter (PC) as the base
index register and to further classify the way the effective address is formed.
A special group of instructions causes this calculated effective address to be
loaded into an index register for further calculations:
• Load stack pointer with effective address (LEAS)
• Load X with effective address (LEAX)
• Load Y with effective address (LEAY)
3.9.1 5-Bit Constant Offset Indexed Addressing
This indexed addressing mode uses a 5-bit signed offset which is included
in the instruction postbyte. This short offset is added to the base index
register (X, Y, SP, or PC) to form the effective address of the memory
location that will be affected by the instruction. This gives a range of –16
through +15 from the value in the base index register. Although other
indexed addressing modes allow 9- or 16-bit offsets, those modes also
require additional extension bytes in the instruction for this extra
information. The majority of indexed instructions in real programs use
offsets that fit in the shortest 5-bit form of indexed addressing.
Examples:
LDAA 0,X
STAB –8,Y
For these examples, assume X has a value of $1000 and Y has a value of
$2000 before execution. The 5-bit constant offset mode does not change
the value in the index register, so X will still be $1000 and Y will still be $2000
after execution of these instructions. In the first example, A will be loaded
with the value from address $1000. In the second example, the value from
the B accumulator will be stored at address $1FF8 ($2000 – $8).
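The two examples above can be checked with a small model. The following is a minimal Python sketch, not part of the CPU12 tool set: the helper names `postbyte_5bit` and `effective_address` are hypothetical, and the bit layout is the rr0nnnnn form from the indexed operations summary.

```python
# Register select codes (rr) from the indexed operations summary.
RR = {"X": 0b00, "Y": 0b01, "SP": 0b10, "PC": 0b11}

def postbyte_5bit(offset, reg):
    """Build the rr0nnnnn postbyte for a 5-bit signed offset (-16..+15)."""
    if not -16 <= offset <= 15:
        raise ValueError("offset does not fit in 5 bits")
    # Bit 5 stays 0; the low five bits hold the two's-complement offset.
    return (RR[reg] << 6) | (offset & 0x1F)

def effective_address(base, offset):
    """Add the signed offset to the base register, wrapping at 16 bits."""
    return (base + offset) & 0xFFFF

# LDAA 0,X with X = $1000 and STAB -8,Y with Y = $2000
print(f"LDAA 0,X   postbyte ${postbyte_5bit(0, 'X'):02X}  EA ${effective_address(0x1000, 0):04X}")
print(f"STAB -8,Y  postbyte ${postbyte_5bit(-8, 'Y'):02X}  EA ${effective_address(0x2000, -8):04X}")
```

The base register value is passed in and never written back, which mirrors the statement that this mode leaves X and Y unchanged.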
3.9.2 9-Bit Constant Offset Indexed Addressing
This indexed addressing mode uses a 9-bit signed offset which is added to
the base index register (X, Y, SP, or PC) to form the effective address of the
memory location affected by the instruction. This gives a range of –256
through +255 from the value in the base index register. The most significant
bit (sign bit) of the offset is included in the instruction postbyte and the
remaining eight bits are provided as an extension byte after the instruction
postbyte in the instruction flow.
Examples:
LDAA $FF,X
LDAB –20,Y
For these examples, assume X is $1000 and Y is $2000 before execution of
these instructions.
NOTE: These instructions do not alter the index registers so they will still be $1000
and $2000, respectively, after the instructions.
The first instruction will load A with the value from address $10FF and the
second instruction will load B with the value from address $1FEC.
This variation of the indexed addressing mode in the CPU12 is similar to the
M68HC11 indexed addressing mode, but is functionally enhanced. The
M68HC11 CPU provides for unsigned 8-bit constant offset indexing from X
or Y, and use of Y requires an extra instruction byte and thus, an extra
execution cycle. The 9-bit signed offset used in the CPU12 covers the same
range of positive offsets as the M68HC11, and adds negative offset
capability. The CPU12 can use X, Y, SP, or PC as the base index register.
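How the 9-bit offset is split between the postbyte and the extension byte can be sketched as follows. This is an illustrative Python model under the 111rr0zs layout (z = 0, s = sign bit); the function names `encode_9bit` and `decode_9bit` are hypothetical.

```python
RR = {"X": 0b00, "Y": 0b01, "SP": 0b10, "PC": 0b11}

def encode_9bit(offset, reg):
    """Return (postbyte, extension byte) for a 9-bit signed offset (-256..+255)."""
    if not -256 <= offset <= 255:
        raise ValueError("offset does not fit in 9 bits")
    sign = 1 if offset < 0 else 0
    postbyte = 0b11100000 | (RR[reg] << 3) | sign  # 111rr0zs with z = 0
    return postbyte, offset & 0xFF                 # low eight bits follow the postbyte

def decode_9bit(postbyte, extension):
    """Recover the signed offset from the sign bit and the extension byte."""
    sign = postbyte & 0x01
    return extension - 256 * sign

for text, offset, reg, base in (("LDAA $FF,X", 0xFF, "X", 0x1000),
                                ("LDAB -20,Y", -20, "Y", 0x2000)):
    pb, ext = encode_9bit(offset, reg)
    ea = (base + decode_9bit(pb, ext)) & 0xFFFF
    print(f"{text}: postbyte ${pb:02X}, extension ${ext:02X}, EA ${ea:04X}")
```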
3.9.3 16-Bit Constant Offset Indexed Addressing
This indexed addressing mode uses a 16-bit offset which is added to the
base index register (X, Y, SP, or PC) to form the effective address of the
memory location affected by the instruction. This allows access to any
address in the 64-Kbyte address space. Since the address bus and the
offset are both 16 bits, it does not matter whether the offset value is
considered to be a signed or an unsigned value ($FFFF may be thought of
as +65,535 or as –1). The 16-bit offset is provided as two extension bytes
after the instruction postbyte in the instruction flow.
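The claim that signedness does not matter for a 16-bit offset follows from modulo-65,536 arithmetic. A minimal Python sketch (the function name `ea_16bit` and the base value $1000 are illustrative assumptions):

```python
def ea_16bit(base, offset):
    """Effective address for 16-bit offset indexing; the sum wraps at 64 Kbytes."""
    return (base + offset) & 0xFFFF

base = 0x1000
as_unsigned = ea_16bit(base, 0xFFFF)  # offset read as +65,535
as_signed = ea_16bit(base, -1)        # same bit pattern read as -1
print(f"${as_unsigned:04X} ${as_signed:04X}")
```

Both interpretations land on the same address because the carry out of bit 15 is discarded.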
3.9.4 16-Bit Constant Indirect Indexed Addressing
This indexed addressing mode adds a 16-bit instruction-supplied offset to
the base index register to form the address of a memory location that
contains a pointer to the memory location affected by the instruction. The
instruction itself does not point to the address of the memory location to be
acted upon, but rather to the location of a pointer to the address to be acted
on. The square brackets distinguish this addressing mode from 16-bit
constant offset indexing.
Example:
LDAA [10,X]
In this example, X holds the base address of a table of pointers. Assume
that X has an initial value of $1000, and that the value $2000 is stored at
addresses $100A and $100B. The instruction first adds the value 10 to the
value in X to form the address $100A. Next, an address pointer ($2000) is
fetched from memory at $100A. Then, the value stored in location $2000 is
read and loaded into the A accumulator.
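The two-step access in this example can be traced with a small memory model. The sketch below is illustrative Python: memory is a plain dictionary, the helper names are hypothetical, and the data byte $5A stored at $2000 is an assumed value chosen only so the result is visible.

```python
def read_word(mem, addr):
    """Read a 16-bit word, high byte at the lower address."""
    return (mem[addr & 0xFFFF] << 8) | mem[(addr + 1) & 0xFFFF]

def ldaa_indexed_indirect(mem, base, offset):
    """Model LDAA [offset,base]: fetch a pointer, then load through it."""
    pointer_address = (base + offset) & 0xFFFF   # $1000 + 10 = $100A
    target = read_word(mem, pointer_address)     # pointer stored at $100A:$100B
    return mem[target]                           # operand at the pointed-to address

memory = {0x100A: 0x20, 0x100B: 0x00,  # pointer value $2000
          0x2000: 0x5A}                # assumed data byte
a = ldaa_indexed_indirect(memory, 0x1000, 10)
print(f"A = ${a:02X}")
```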
3.9.5 Auto Pre/Post Decrement/Increment Indexed Addressing
This indexed addressing mode provides four ways to automatically change
the value in a base index register as a part of instruction execution. The
index register can be incremented or decremented by an integer value
either before or after indexing takes place. The base index register may be
X, Y, or SP. (Auto-modify modes would not make sense on PC.)
Pre-decrement and pre-increment versions of the addressing mode adjust
the value of the index register before accessing the memory location
affected by the instruction — the index register retains the changed value
after the instruction executes. Post-decrement and post-increment versions
of the addressing mode use the initial value in the index register to access
the memory location affected by the instruction, then change the value of the
index register.
The CPU12 allows the index register to be incremented or decremented by
any integer value in the ranges –8 through –1 or 1 through 8. The value
need not be related to the size of the operand for the current instruction.
These instructions can be used to incorporate an index adjustment into an
existing instruction rather than using an additional instruction and increasing
execution time. This addressing mode is also used to perform operations on
a series of data structures in memory.
When an LEAS, LEAX, or LEAY instruction is executed using this
addressing mode, and the operation modifies the index register that is being
loaded, the final value in the register is the value that would have been used
to access a memory operand. (Premodification is seen in the result but
postmodification is not.)
Examples:
STAA 1,–SP ;equivalent to PSHA
STX 2,–SP ;equivalent to PSHX
LDX 2,SP+ ;equivalent to PULX
LDAA 1,SP+ ;equivalent to PULA
For a “last-used” type of stack like the CPU12 stack, these four examples
are equivalent to common push and pull instructions.
For a “next-available” stack like the M68HC11 stack, push A onto stack
(PSHA) is equivalent to store accumulator A (STAA) 1,SP– and pull A from
stack (PULA) is equivalent to load accumulator A (LDAA) 1,+SP. However,
in the M68HC11, 16-bit operations like push register X onto stack (PSHX)
and pull register X from stack (PULX) require multiple instructions to
decrement the SP by one, then store X, then decrement SP by one again.
In the STAA 1,–SP example, the stack pointer is pre-decremented by one
and then A is stored to the address contained in the stack pointer. Similarly
the LDX 2,SP+ first loads X from the address in the stack pointer, then
post-increments SP by two.
Example:
MOVW 2,X+,4,+Y
This example demonstrates how to work with data structures larger than
bytes and words. With this instruction in a program loop, it is possible to
move words of data from a list having one word per entry into a second table
that has four bytes per table element. In this example the source pointer is
updated after the data is read from memory (post-increment) while the
destination pointer is updated before it is used to access memory
(pre-increment).
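The pre/post behavior and the rr1pnnnn encoding can be modeled together. The following Python sketch is illustrative only: `auto_postbyte` and `auto_index` are hypothetical helper names, and the register values used in the example are assumptions.

```python
RR = {"X": 0b00, "Y": 0b01, "SP": 0b10}  # PC is not valid in this mode

def auto_postbyte(n, reg, post):
    """Build rr1pnnnn: n is the adjustment (-8..-1 or +1..+8), post selects p = 1."""
    if n == 0 or not -8 <= n <= 8:
        raise ValueError("adjustment must be -8..-1 or +1..+8")
    nnnn = (n - 1) if n > 0 else (n & 0x0F)  # +1 = 0000 ... +8 = 0111, -1 = 1111 ... -8 = 1000
    return (RR[reg] << 6) | 0x20 | (int(post) << 4) | nnnn

def auto_index(regs, reg, n, post):
    """Apply the adjustment to regs[reg] and return the effective address used."""
    old = regs[reg]
    new = (old + n) & 0xFFFF
    regs[reg] = new
    return old if post else new  # post modes access memory with the initial value

# MOVW 2,X+,4,+Y with assumed starting values for X and Y
regs = {"X": 0x1000, "Y": 0x2000, "SP": 0x0C00}
source = auto_index(regs, "X", +2, post=True)         # read at X, then X += 2
destination = auto_index(regs, "Y", +4, post=False)   # Y += 4, then write at Y
print(f"source ${source:04X}, destination ${destination:04X}, X ${regs['X']:04X}, Y ${regs['Y']:04X}")
```

With these helpers, `auto_postbyte(-1, "SP", post=False)` gives the postbyte for 1,–SP and `auto_postbyte(2, "SP", post=True)` gives the postbyte for 2,SP+.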
3.9.6 Accumulator Offset Indexed Addressing
In this indexed addressing mode, the effective address is the sum of the
values in the base index register and an unsigned offset in one of the
accumulators. The value in the index register itself is not changed. The
index register can be X, Y, SP, or PC and the accumulator can be either of
the 8-bit accumulators (A or B) or the 16-bit D accumulator.
Example:
LDAA B,X
This instruction internally adds B to X to form the address from which A will
be loaded. B and X are not changed by this instruction. This example is
similar to the following 2-instruction combination in an M68HC11.
Examples:
ABX
LDAA 0,X
However, this 2-instruction sequence alters the index register. If this
sequence was part of a loop where B changed on each pass, the index
register would have to be reloaded with the reference value on each loop
pass. The use of LDAA B,X is more efficient in the CPU12.
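The difference between the two approaches shows up in a table-lookup loop. The sketch below is an illustrative Python model: the helper names, the table contents, and the base address $1000 are assumptions, and the postbyte follows the 111rr1aa layout.

```python
RR = {"X": 0b00, "Y": 0b01, "SP": 0b10, "PC": 0b11}
AA = {"A": 0b00, "B": 0b01, "D": 0b10}

def acc_postbyte(acc, reg):
    """Build the 111rr1aa postbyte for accumulator offset indexing."""
    return 0b11100100 | (RR[reg] << 3) | AA[acc]

def acc_offset_ea(base, acc_value):
    """Base plus unsigned accumulator value; the base register is not modified."""
    return (base + acc_value) & 0xFFFF

# Assumed 4-entry byte table at $1000; X holds the table base for the whole loop.
memory = {0x1000 + i: value for i, value in enumerate((0x11, 0x22, 0x33, 0x44))}
x = 0x1000
loaded = []
for b in range(4):                                  # B changes on each pass
    loaded.append(memory[acc_offset_ea(x, b)])      # LDAA B,X
print([f"${v:02X}" for v in loaded], f"X still ${x:04X}")
```

Because X keeps its reference value, no reload is needed between passes, which is the efficiency gain described above.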
3.9.7 Accumulator D Indirect Indexed Addressing
This indexed addressing mode adds the value in the D accumulator to the
value in the base index register to form the address of a memory location
that contains a pointer to the memory location affected by the instruction.
The instruction operand does not point to the address of the memory
location to be acted upon, but rather to the location of a pointer to the
address to be acted upon. The square brackets distinguish this addressing
mode from D accumulator offset indexing.
Example:
JMP [D,PC]
GO1 DC.W PLACE1
GO2 DC.W PLACE2
GO3 DC.W PLACE3
This example is a computed GOTO. The values beginning at GO1 are
addresses of potential destinations of the jump (JMP) instruction. At the time
the JMP [D,PC] instruction is executed, PC points to the address GO1, and
D holds one of the values $0000, $0002, or $0004 (determined by the
program some time before the JMP).
Assume that the value in D is $0002. The JMP instruction adds the values
in D and PC to form the address of GO2. Next the CPU reads the address
PLACE2 from memory at GO2 and jumps to PLACE2. The locations of
PLACE1 through PLACE3 were known at the time of program assembly but
the destination of the JMP depends upon the value in D computed during
program execution.
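The computed GOTO can be traced with a short model. This Python sketch is illustrative: the numeric addresses assigned to GO1 and to PLACE1 through PLACE3 are assumed values, and the helper names are hypothetical.

```python
def read_word(mem, addr):
    """Read a 16-bit word, high byte at the lower address."""
    return (mem[addr & 0xFFFF] << 8) | mem[(addr + 1) & 0xFFFF]

def jmp_d_pc_indirect(mem, pc, d):
    """Model JMP [D,PC]: add D to PC, fetch the pointer there, return the new PC."""
    return read_word(mem, (pc + d) & 0xFFFF)

GO1 = 0x4000                                  # assumed address of the pointer table
PLACES = (0x5000, 0x5100, 0x5200)             # assumed PLACE1, PLACE2, PLACE3

memory = {}
for index, place in enumerate(PLACES):        # lay out GO1, GO2, GO3 as 16-bit words
    memory[GO1 + 2 * index] = place >> 8
    memory[GO1 + 2 * index + 1] = place & 0xFF

new_pc = jmp_d_pc_indirect(memory, pc=GO1, d=0x0002)
print(f"D = $0002 jumps to ${new_pc:04X}")
```

D steps in multiples of two because each table entry is a 16-bit address.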
3.10 Instructions Using Multiple Modes
Several CPU12 instructions use more than one addressing mode in the
course of execution.
3.10.1 Move Instructions
Move instructions use separate addressing modes to access the source and
destination of a move. There are move variations for all practical
combinations of immediate, extended, and indexed addressing modes.
The only combinations of addressing modes that are not allowed are those
with an immediate mode destination (the operand of an immediate mode
instruction is data, not an address). For indexed moves, the reference index
register may be X, Y, SP, or PC.
Move instructions do not support indirect modes, 9-bit, or 16-bit offset
modes requiring extra extension bytes. There are special considerations
when using PC-relative addressing with move instructions. The original
M68HC12 implemented the instruction queue slightly differently than the
newer HCS12. In the older M68HC12 implementation, the CPU did not
maintain a pointer to the start of the instruction after the current instruction
(what the user thinks of as the PC value during execution). This caused an
offset for PC-relative move instructions.
PC-relative addressing uses the address of the location immediately
following the last byte of object code for the current instruction as a
reference point. The CPU12 normally corrects for queue offset and for
instruction alignment so that queue operation is transparent to the user.
However, in the original M68HC12, move instructions pose three special
problems:
•Some moves use an indexed source and an indexed destination.
•Some moves have object code that is too long to fit in the queue all at
one time, so the PC value changes during execution.
•All moves do not have the indexed postbyte as the last byte of object
code.
These cases are not handled by automatic queue pointer maintenance, but
it is still possible to use PC-relative indexing with move instructions by
providing for PC offsets in source code.
Table 3-3 shows PC offsets from the location immediately following the
current instruction by addressing mode.
Table 3-3. PC Offsets for MOVE Instructions (M68HC12 Only)
MOVE Instruction   Addressing Modes   Offset Value
MOVB               IMM ⇒ IDX          +1
                   EXT ⇒ IDX          +2
                   IDX ⇒ EXT          –2
                   IDX ⇒ IDX          –1 for first operand
                                      +1 for second operand
MOVW               IMM ⇒ IDX          +2
                   EXT ⇒ IDX          +2
                   IDX ⇒ EXT          –2
                   IDX ⇒ IDX          –1 for first operand
                                      +1 for second operand
Example:
1000 18 09 C2 20 00 MOVB $2000,2,PC
Moves a byte of data from $2000 to $1009
The expected location of the PC = $1005. The offset = +2.
[1005 + 2 (for 2,PC) + 2 (for correction) = 1009]
$18 is the page pre-byte, 09 is the MOVB opcode for ext-idx, C2 is the
indexed postbyte for 2,PC (without correction).
The Freescale MCUasm assembler produces corrected object code for
PC-relative moves (18 09 C0 20 00 for the example shown).
NOTE: Instead of assembling the 2,PC as C2, the correction has been applied to
make it C0. Check whether an assembler makes the correction before using
PC-relative moves.
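The correction an assembler applies for the original M68HC12 can be sketched as follows. This Python model is an assumption-laden illustration: it covers only the moves with a single indexed operand, it generalizes the sign convention from the one worked example above (the hardware result is the coded offset plus the Table 3-3 value), and all function names are hypothetical.

```python
# Offset values for moves with one indexed operand, copied from Table 3-3.
M68HC12_MOVE_OFFSET = {
    ("MOVB", "IMM", "IDX"): +1, ("MOVB", "EXT", "IDX"): +2, ("MOVB", "IDX", "EXT"): -2,
    ("MOVW", "IMM", "IDX"): +2, ("MOVW", "EXT", "IDX"): +2, ("MOVW", "IDX", "EXT"): -2,
}

def m68hc12_address(next_pc, coded_offset, form):
    """Address actually accessed on the original M68HC12 for a coded n,PC operand."""
    return (next_pc + coded_offset + M68HC12_MOVE_OFFSET[form]) & 0xFFFF

def corrected_offset(intended_offset, form):
    """Offset an assembler would code so the M68HC12 accesses the intended address."""
    return intended_offset - M68HC12_MOVE_OFFSET[form]

def pc_postbyte_5bit(offset):
    """rr0nnnnn postbyte with rr = 11 (PC) and a 5-bit signed offset."""
    if not -16 <= offset <= 15:
        raise ValueError("offset does not fit in 5 bits")
    return 0xC0 | (offset & 0x1F)

form = ("MOVB", "EXT", "IDX")
next_pc = 0x1005                                      # instruction at $1000 is five bytes long
uncorrected = m68hc12_address(next_pc, 2, form)       # coded as C2
coded = corrected_offset(2, form)
corrected = m68hc12_address(next_pc, coded, form)     # coded as C0
print(f"uncorrected ${uncorrected:04X}, corrected postbyte ${pc_postbyte_5bit(coded):02X}, "
      f"corrected target ${corrected:04X}")
```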
On the newer HCS12, the instruction queue was implemented such that an
internal pointer, to the start of the next instruction, is always available. On
the HCS12, PC-relative move instructions work as expected without any
offset adjustment. Although this is different from the original M68HC12, it is
unlikely to be a problem because PC-relative indexing is rarely, if ever, used
with move instructions.
3.10.2 Bit Manipulation Instructions
Bit manipulation instructions use either a combination of two or a
combination of three addressing modes.
The clear bits in memory (BCLR) and set bits in memory (BSET) instructions
use an 8-bit mask to determine which bits in a memory byte are to be
changed. The mask must be supplied with the instruction as an immediate
mode value. The memory location to be modified can be specified by means
of direct, extended, or indexed addressing modes.
The branch if bits cleared (BRCLR) and branch if bits set (BRSET)
instructions use an 8-bit mask to test the states of bits in a memory byte. The
mask is supplied with the instruction as an immediate mode value. The
memory location to be tested is specified by means of direct, extended, or
indexed addressing modes. Relative addressing mode is used to determine
the branch address. A signed 8-bit offset must be supplied with the
instruction.
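The mask operations described above reduce to simple Boolean expressions. The following Python sketch is a minimal model; the function names are hypothetical and the byte and mask values are assumptions chosen for illustration.

```python
def bset(byte, mask):
    """BSET: set every bit of the memory byte that is 1 in the mask."""
    return (byte | mask) & 0xFF

def bclr(byte, mask):
    """BCLR: clear every bit of the memory byte that is 1 in the mask."""
    return byte & ~mask & 0xFF

def brset_taken(byte, mask):
    """BRSET branches when all bits selected by the mask are set."""
    return (~byte & mask & 0xFF) == 0

def brclr_taken(byte, mask):
    """BRCLR branches when all bits selected by the mask are clear."""
    return (byte & mask) == 0

value = 0b0101_0000
print(f"BSET -> {bset(value, 0b0000_0011):08b}")
print(f"BCLR -> {bclr(value, 0b0100_0000):08b}")
print("BRSET taken:", brset_taken(value, 0b0101_0000))
print("BRCLR taken:", brclr_taken(value, 0b0000_1111))
```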
3.11 Addressing More than 64 Kbytes
Some M68HC12 devices incorporate hardware that supports addressing a
larger memory space than the standard 64 Kbytes. The expanded memory
system uses fast on-chip logic to implement a transparent bank-switching
scheme.
Increased code efficiency is the greatest advantage of using a switching
scheme instead of a large linear address space. In systems with large linear
address spaces, instructions require more bits of information to address a
memory location, and CPU overhead is greater. Other advantages include
the ability to change the size of system memory and the ability to use
various types of external memory.
However, the add-on bank switching schemes used in other
microcontrollers have known weaknesses. These include the cost of
external glue logic, increased programming overhead to change banks, and
the need to disable interrupts while banks are switched.
The M68HC12 system requires no external glue logic. Bank switching
overhead is reduced by implementing control logic in the MCU. Interrupts do
not need to be disabled during switching because switching tasks are
incorporated in special instructions that greatly simplify program access to
extended memory.
MCUs with expanded memory treat the 16 Kbytes of memory space from
$8000 to $BFFF as a program memory window. Expanded-memory
architecture includes an 8-bit program page register (PPAGE), which allows
up to 256 16-Kbyte program memory pages to be switched into and out of
the program memory window. This provides for up to 4 Megabytes of paged
program memory.
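The 4-Megabyte figure follows directly from the page count and page size. The Python sketch below checks that arithmetic and shows one way to number bytes in paged memory. The linear numbering in `paged_offset` (page number times page size plus offset into the window) is an illustrative assumption, not a statement about how any particular device maps pages onto physical memory.

```python
PAGE_SIZE = 16 * 1024        # 16-Kbyte window
PAGE_COUNT = 256             # 8-bit PPAGE register
WINDOW_START = 0x8000
WINDOW_END = 0xBFFF

TOTAL_PAGED_BYTES = PAGE_COUNT * PAGE_SIZE   # 4 Megabytes

def paged_offset(ppage, address):
    """Number a byte in paged memory as page * 16K + offset into the window (assumed layout)."""
    if not 0 <= ppage < PAGE_COUNT:
        raise ValueError("PPAGE is an 8-bit value")
    if not WINDOW_START <= address <= WINDOW_END:
        raise ValueError("address is outside the program memory window")
    return ppage * PAGE_SIZE + (address - WINDOW_START)

print(TOTAL_PAGED_BYTES // (1024 * 1024), "Megabytes of paged program memory")
print(hex(paged_offset(0x3F, 0x8123)))
```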
The CPU12 instruction set includes call subroutine in expanded memory
(CALL) and return from call (RTC) instructions, which greatly simplify the
use of expanded memory space. These instructions also execute correctly
on devices that do not have expanded-memory addressing capability, thus
providing for portable code.
The CALL instruction is similar to the jump-to-subroutine (JSR) instruction.
When CALL is executed, the current value in PPAGE is pushed onto the
stack with a return address, and a new instruction-supplied value is written
to PPAGE. This value selects the page the called subroutine resides upon
and can be considered part of the effective address. For all addressing
mode variations except indexed indirect modes, the new page value is
provided by an immediate operand in the instruction. For indexed indirect
variations of CALL, a pointer specifies memory locations where the new
page value and the address of the called subroutine are stored. Use of
indirect addressing for both the page value and the address within the page
frees the program from keeping track of explicit values for either address.
The RTC instruction restores the saved program page value and the return
address from the stack. This causes execution to resume at the next
instruction after the original CALL instruction.
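The stacking performed by CALL and undone by RTC can be modeled in a few lines. This Python sketch is illustrative: the class `PagedCpu` is hypothetical, and the byte order on the stack (return address first, then PPAGE at the lowest address) is an assumption of the model. The addresses and page numbers in the example are also assumed values.

```python
class PagedCpu:
    """Toy model of PC, SP, PPAGE, and a byte-wide stack."""

    def __init__(self, sp, ppage):
        self.pc, self.sp, self.ppage = 0, sp, ppage
        self.memory = {}

    def _push(self, value):
        self.sp = (self.sp - 1) & 0xFFFF
        self.memory[self.sp] = value & 0xFF

    def _pull(self):
        value = self.memory[self.sp]
        self.sp = (self.sp + 1) & 0xFFFF
        return value

    def call(self, return_address, target, page):
        """Stack the return address and current PPAGE, then switch page and jump."""
        self._push(return_address)        # low byte
        self._push(return_address >> 8)   # high byte
        self._push(self.ppage)            # saved page (assumed to be stacked last)
        self.ppage = page
        self.pc = target

    def rtc(self):
        """Restore PPAGE and the return address saved by call()."""
        self.ppage = self._pull()
        high, low = self._pull(), self._pull()
        self.pc = (high << 8) | low

cpu = PagedCpu(sp=0x0C00, ppage=0x01)
cpu.call(return_address=0x8104, target=0x9000, page=0x05)
print(f"in subroutine: PC ${cpu.pc:04X}, PPAGE ${cpu.ppage:02X}, SP ${cpu.sp:04X}")
cpu.rtc()
print(f"after RTC:     PC ${cpu.pc:04X}, PPAGE ${cpu.ppage:02X}, SP ${cpu.sp:04X}")
```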
Section 4. Instruction Queue
4.1 Introduction
The CPU12 uses an instruction queue to increase execution speed.
This section describes queue operation during normal program execution
and changes in execution flow. These concepts augment the descriptions of
instructions and cycle-by-cycle instruction execution in subsequent
sections, but it is important to note that queue operation is automatic, and
generally transparent to the user.
The material in this section is general. Section 6. Instruction Glossary
contains detailed information concerning cycle-by-cycle execution of each
instruction. Section 8. Instruction Queue contains detailed information
about tracking queue operation and instruction execution.
4.2 Queue Description
The fetching mechanism in the CPU12 is best described as a queue rather
than as a pipeline. Queue logic fetches program information and positions it
for execution, but instructions are executed sequentially. A typical pipelined
central processor unit (CPU) can execute more than one instruction at the
same time, but interactions between the prefetch and execution
mechanisms can make tracking and debugging difficult. The CPU12 thus
gains the advantages of independent fetches, yet maintains a
straightforward relationship between bus and execution cycles.
Each instruction refills the queue by fetching the same number of bytes that
the instruction uses. Program information is fetched in aligned 16-bit words.
Each program fetch (P) indicates that two bytes need to be replaced in the
instruction queue. Each optional fetch (O) indicates that only one byte needs
to be replaced. For example, an instruction composed of five bytes does two
program fetches and one optional fetch. If the first byte of the five-byte
instruction was even-aligned, the optional fetch is converted into a free
cycle. If the first byte was odd-aligned, the optional fetch is executed as a
program fetch.
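The refill rule can be expressed as a short calculation. This Python sketch is a minimal model of the rule as stated above; the function name `refill_fetches` and the example addresses are assumptions, and it counts cycles only, without modeling their order within an instruction.

```python
def refill_fetches(length, start_address):
    """Return (program word fetches, free cycles) needed to refill the queue.

    Each pair of bytes costs one program fetch (P). A leftover odd byte costs one
    optional fetch (O), which becomes a free cycle when the instruction starts on
    an even address and a program fetch when it starts on an odd address.
    """
    program_fetches = length // 2
    free_cycles = 0
    if length % 2:
        if start_address % 2:
            program_fetches += 1
        else:
            free_cycles += 1
    return program_fetches, free_cycles

print(refill_fetches(5, 0x1000))  # five-byte instruction, even-aligned
print(refill_fetches(5, 0x1001))  # five-byte instruction, odd-aligned
```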
Two external pins, IPIPE[1:0], provide time-multiplexed information about
data movement in the queue and instruction execution. Decoding and use
of these signals is discussed in Section 8. Instruction Queue.
4.2.1 Original M68HC12 Queue Implementation
There are two 16-bit queue stages and one 16-bit buffer. Program
information is fetched in aligned 16-bit words. Unless buffering is required,
program information is first queued into stage 1, then advanced to stage 2
for execution.
At least two words of program information are available to the CPU when
execution begins. The first byte of object code is in either the even or odd
half of the word in stage 2, and at least two more bytes of object code are in
the queue.
The buffer is used when a program word arrives before the queue can
advance. This occurs during execution of single-byte and odd-aligned
instructions. For instance, the queue cannot advance after an aligned,
single-byte instruction is executed, because the first byte of the next
instruction is also in stage 2. In these cases, information is latched into the
buffer until the queue can advance.
4.2.2 HCS12 Queue Implementation
There are three 16-bit stages in the instruction queue. Instructions enter the
queue at stage 1 and shift out of stage 3 as the CPU executes instructions
and fetches new ones into stage 1. Each byte in the queue is selectable. An
opcode prediction algorithm determines the location of the next opcode in
the instruction queue.
4.3 Data Movement in the Queue
All queue operations are combinations of four basic queue movement
cycles. Descriptions of each of these cycles follow. Queue movement
cycles are only one factor in instruction execution time and should not be
confused with bus cycles.
4.3.1 No Movement
There is no data movement in the instruction queue during the cycle. This
occurs during execution of instructions that must perform a number of
internal operations, such as division instructions.
4.3.2 Latch Data from Bus (Applies Only to the M68HC12 Queue Implementation)
All instructions initiate fetches to refill the queue as execution proceeds.
However, a number of conditions, including instruction alignment and the
length of previous instructions, affect when the queue advances. If the
queue is not ready to advance when fetched information arrives, the
information is latched into the buffer. Later, when the queue does advance,
stage 1 is refilled from the buffer. If more than one latch cycle occurs before
the queue advances, the buffer is filled on the first latch event and
subsequent latch events are ignored until the queue advances.
4.3.3 Advance and Load from Data Bus
The content of queue is advanced by one stage, and stage 1 is loaded with
a word of program information from the data bus. The information was
requested two bus cycles earlier but has only become available this cycle,
due to access delay.
4.3.4 Advance and Load from Buffer (Applies Only to M68HC12 Queue Implementation)
The content of queue stage 1 advances to stage 2, and stage 1 is loaded
with a word of program information from the buffer. The information in the
buffer was latched from the data bus during a previous cycle because the
queue was not ready to advance when it arrived.
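The four movement cycles of the original M68HC12 queue can be modeled with two stages and a buffer. The Python sketch below is an illustrative model only: the class name `M68HC12Queue`, its method names, and the word values are hypothetical, and timing is not represented.

```python
class M68HC12Queue:
    """Two 16-bit stages plus one 16-bit buffer, holding one word each."""

    def __init__(self):
        self.stage1 = None
        self.stage2 = None
        self.buffer = None

    def no_movement(self):
        """No data moves in the queue during this cycle."""

    def latch(self, word):
        """Latch data from bus: only the first latch before an advance is kept."""
        if self.buffer is None:
            self.buffer = word

    def advance_from_bus(self, word):
        """Advance one stage and load stage 1 from the data bus."""
        self.stage2 = self.stage1
        self.stage1 = word

    def advance_from_buffer(self):
        """Advance one stage and load stage 1 from the buffer."""
        if self.buffer is None:
            raise RuntimeError("buffer is empty")
        self.stage2 = self.stage1
        self.stage1 = self.buffer
        self.buffer = None

queue = M68HC12Queue()
queue.advance_from_bus(0x1809)
queue.advance_from_bus(0xC220)
queue.latch(0x00A7)            # queue cannot advance yet, so the word is buffered
queue.latch(0xFFFF)            # later latch events are ignored
queue.advance_from_buffer()
print(hex(queue.stage2), hex(queue.stage1), queue.buffer)
```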
4.4 Changes in Execution Flow
During normal instruction execution, queue operations proceed as a
continuous sequence of queue movement cycles. However, situations arise
which call for changes in flow. These changes are categorized as resets,
interrupts, subroutine calls, conditional branches, and jumps. Generally
speaking, resets and interrupts are considered to be related to events
outside the current program context that require special processing, while
subroutine calls, branches, and jumps are considered to be elements of
program structure.
During design, great care is taken to assure that the mechanism that
increases instruction throughput during normal program execution does not
cause bottlenecks during changes of program flow, but internal queue
operation is largely transparent to the user. The following information is
provided to enhance subsequent descriptions of instruction execution.
4.4.1 Exceptions
Exceptions are events that require processing outside the normal flow of
instruction execution. The CPU12 has five types of exceptions:
•Reset (including COP, clock monitor, and pin)
•Unimplemented opcode trap
•Software interrupt instruction
•X-bit interrupts
•I-bit interrupts
All exceptions use the same microcode, but the CPU follows different
execution paths for each type of exception.
CPU12 exception handling is designed to minimize the effect of queue
operation on context switching. Thus, an exception vector fetch is the first
part of exception processing, and fetches to refill the queue from the
address pointed to by the vector are interleaved with the stacking operations
that preserve context, so that program access time does not delay the
switch. Refer to Section 7. Exception Processing for detailed information.
4.4.2 Subroutines
The CPU12 can branch to (BSR), jump to (JSR), or call (CALL) subroutines.
BSR and JSR are used to access subroutines in the normal 64-Kbyte
address space. The CALL instruction is intended for use in MCUs with
expanded memory capability.
BSR uses relative addressing mode to generate the effective address of the
subroutine, while JSR can use various other addressing modes. Both
instructions calculate a return address, stack the address, then perform
three program word fetches to refill the queue.
Subroutines in the normal 64-Kbyte address space are terminated with a
return-from-subroutine (RTS) instruction. RTS unstacks the return address,
then performs three program word fetches from that address to refill the
queue.
CALL is similar to JSR. MCUs with expanded memory treat 16 Kbytes of
addresses from $8000 to $BFFF as a memory window. An 8-bit PPAGE
register switches memory pages into and out of the window. When CALL is
executed, a return address is calculated, then it and the current PPAGE
value are stacked, and a new instruction-supplied value is written to
PPAGE. The subroutine address is calculated, then three program word
fetches are made from that address to refill the instruction queue.
The return-from-call (RTC) instruction is used to terminate subroutines in
expanded memory. RTC unstacks the PPAGE value and the return
address, then performs three program word fetches from that address to
refill the queue.
CALL and RTC execute correctly in the normal 64-Kbyte address space,
thus providing for portable code. However, since extra execution cycles are
required, routinely substituting CALL/RTC for JSR/RTS is not
recommended.
4.4.3 Branches
Branch instructions cause execution flow to change when specific
pre-conditions exist. The CPU12 instruction set includes:
•Short conditional branches
•Long conditional branches
•Bit-condition branches
Types and conditions of branch instructions are described in
5.19 Branch Instructions. All branch instructions affect the queue similarly,
but there are differences in overall cycle counts between the various types.
Loop primitive instructions are a special type of branch instruction used to
implement counter-based loops.
Branch instructions have two execution cases:
•The branch condition is satisfied, and a change of flow takes place.
•The branch condition is not satisfied, and no change of flow occurs.
4.4.3.1 Short Branches
The “not-taken” case for short branches is simple. Since the instruction
consists of a single word containing both an opcode and an 8-bit offset, the
queue advances, another program word is fetched, and execution continues
with the next instruction.
The “taken” case for short branches requires that the queue be refilled so
that execution can continue at a new address. First, the effective address of
the destination is calculated using the relative offset in the instruction. Then,
the address is loaded into the program counter, and the CPU performs three
program word fetches at the new address to refill the instruction queue.
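The destination calculation for a taken short branch can be sketched as follows. This Python model is illustrative: the function names are hypothetical, the branch address is an assumed value, and the offset is taken relative to the address following the two-byte branch instruction.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)

def short_branch_target(branch_address, rel8):
    """Destination of a taken short branch: next instruction address plus signed offset."""
    next_pc = branch_address + 2            # opcode byte plus 8-bit offset byte
    return (next_pc + sign_extend(rel8, 8)) & 0xFFFF

print(f"${short_branch_target(0x1000, 0x10):04X}")  # forward branch
print(f"${short_branch_target(0x1000, 0xFE):04X}")  # offset of -2 branches back to itself
```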
4.4.3.2 Long Branches
The “not-taken” case for all long branches requires three cycles, while the
“taken” case requires four cycles. This is due to differences in the amount of
program information needed to fill the queue.
Long branch instructions begin with a $18 prebyte which indicates that the
opcode is on page 2 of the opcode map. The CPU12 treats the prebyte as
a special one-byte instruction. If the prebyte is not aligned, the first cycle is
used to perform a program word access; if the prebyte is aligned, the first
cycle is used to perform a free cycle. The first cycle for the prebyte is
executed whether or not the branch is taken.
The first cycle of the branch instruction is an optional cycle. Optional cycles
make the effects of byte-sized and misaligned instructions consistent with
those of aligned word-length instructions. Program information is always
fetched as aligned 16-bit words. When an instruction has an odd number of
bytes, and the first byte is not aligned with an even byte boundary, the
optional cycle makes an additional program word access that maintains
queue order. In all other cases, the optional cycle is a free cycle.
In the “not-taken” case, the queue must advance so that execution can
continue with the next instruction. Two cycles are used to refill the queue.
Alignment determines how the second of these cycles is used.
In the “taken” case, the effective address of the branch is calculated using
the 16-bit relative offset contained in the second word of the instruction. This
address is loaded into the program counter, then the CPU performs three
program word fetches at the new address.
4.4.3.3 Bit Condition Branches
Bit condition branch instructions read a location in memory, and branch if
the bits in that location are in a certain state. These instructions can use
direct, extended, or indexed addressing modes. Indexed operations require
varying amounts of information to determine the effective address, so
instruction length varies according to the mode used, which in turn affects
the amount of program information fetched. To shorten execution time,
these branches perform one program word fetch in anticipation of the
“taken” case. The data from this fetch is ignored in the “not-taken” case. If
the branch is taken, the CPU performs three program word fetches at the new
address to fill the instruction queue.
4.4.3.4 Loop Primitives
The loop primitive instructions test a counter value in a register or
accumulator and branch to an address specified by a 9-bit relative offset
contained in the instruction if a specified condition is met. There are
auto-increment and auto-decrement versions of these instructions. The test
and increment/decrement operations are performed on internal CPU
registers, and require no additional program information. To shorten
execution time, these branches perform one program word fetch in
anticipation of the “taken” case. The data from this fetch is ignored if the
branch is not taken, and the CPU does one program fetch and one optional
fetch to refill the queue (see note 1). If the branch is taken, the CPU finishes
refilling the queue with two additional program word fetches at the new address.
4.4.4 Jumps
Jump (JMP) is the simplest change of flow instruction. JMP can use
extended or indexed addressing. Indexed operations require varying
amounts of information to determine the effective address, so instruction
length varies according to the mode used, which in turn affects the amount
of program information fetched. All forms of JMP perform three program
word fetches at the new address to refill the instruction queue.
1. In the original M68HC12, both of these cycles are implemented as program word
fetches.
Section 5. Instruction Set Overview
5.1 Introduction
This section contains general information about the central processor unit
(CPU12) instruction set. It is organized into instruction categories grouped
by function.
5.2 Instruction Set Description
CPU12 instructions are a superset of the M68HC11 instruction set. Code
written for an M68HC11 can be reassembled and run on a CPU12 with no
changes. The CPU12 provides expanded functionality and increased code
efficiency. There are two implementations of the CPU12, the original
M68HC12 and the newer HCS12. Both implementations have the same
instruction set, although there are small differences in cycle-by-cycle access
details (the order of some bus cycles changed to accommodate differences
in the way the instruction queue was implemented). These minor
differences are transparent for most users.
In the M68HC12 and HCS12 architecture, all memory and input/output (I/O)
are mapped in a common 64-Kbyte address space (memory-mapped I/O).
This allows the same set of instructions to be used to access memory, I/O,
and control registers. General-purpose load, store, transfer, exchange, and
move instructions facilitate movement of data to and from memory and
peripherals.
The CPU12 has a full set of 8-bit and 16-bit mathematical instructions.
There are instructions for signed and unsigned arithmetic, division, and
multiplication with 8-bit, 16-bit, and some larger operands.
Special arithmetic and logic instructions aid stacking operations, indexing,
binary-coded decimal (BCD) calculation, and condition code register
manipulation. There are also dedicated instructions for multiply and
accumulate operations, table interpolation, and specialized fuzzy logic
operations that involve mathematical calculations.
Refer to Section 6. Instruction Glossary for detailed information about
individual instructions. Appendix A. Instruction Reference contains
quick-reference material, including an opcode map and postbyte encoding
for indexed addressing, transfer/exchange instructions, and loop primitive
instructions.
5.3 Load and Store Instructions
Load instructions copy memory content into an accumulator or register.
Memory content is not changed by the operation. Load instructions (but not
LEA_ instructions) affect condition code bits so no separate test instructions
are needed to check the loaded values for negative or 0 conditions.
Store instructions copy the content of a CPU register to memory.
Register/accumulator content is not changed by the operation. Store
instructions automatically update the N and Z condition code bits, which can
eliminate the need for a separate test instruction in some programs.
Table 5-1 is a summary of load and store instructions.
Table 5-1. Load and Store Instructions
Mnemonic  Function                        Operation
Load Instructions
LDAA      Load A                          (M) ⇒ A
LDAB      Load B                          (M) ⇒ B
LDD       Load D                          (M : M + 1) ⇒ (A:B)
LDS       Load SP                         (M : M + 1) ⇒ SPH:SPL
LDX       Load index register X           (M : M + 1) ⇒ XH:XL
LDY       Load index register Y           (M : M + 1) ⇒ YH:YL
LEAS      Load effective address into SP  Effective address ⇒ SP
LEAX      Load effective address into X   Effective address ⇒ X
LEAY      Load effective address into Y   Effective address ⇒ Y
Store Instructions
STAA      Store A                         (A) ⇒ M
STAB      Store B                         (B) ⇒ M
STD       Store D                         (A) ⇒ M, (B) ⇒ M + 1
STS       Store SP                        (SPH:SPL) ⇒ M : M + 1
STX       Store X                         (XH:XL) ⇒ M : M + 1
STY       Store Y                         (YH:YL) ⇒ M : M + 1
5.4 Transfer and Exchange Instructions
Transfer instructions copy the content of a register or accumulator into
another register or accumulator. Source content is not changed by the
operation. Transfer register to register (TFR) is a universal transfer
instruction, but other mnemonics are accepted for compatibility with the
M68HC11. The transfer A to B (TAB) and transfer B to A (TBA) instructions
affect the N, Z, and V condition code bits in the same way as M68HC11
instructions. The TFR instruction does not affect the condition code bits.
The sign extend 8-bit operand (SEX) instruction is a special case of the
universal transfer instruction that is used to sign extend 8-bit two’s
complement numbers so that they can be used in 16-bit operations. The
8-bit number is copied from accumulator A, accumulator B, or the condition
code register to accumulator D, the X index register, the Y index register, or
the stack pointer. All the bits in the upper byte of the 16-bit result are given
the value of the most-significant bit (MSB) of the 8-bit number.
Exchange instructions exchange the contents of pairs of registers or
accumulators. When the first operand in an EXG instruction is 8 bits and the
second operand is 16 bits, a zero-extend operation is performed on the 8-bit
register as it is copied into the 16-bit register.
Section 6. Instruction Glossary contains information concerning other
transfers and exchanges between 8- and 16-bit registers.
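The difference between the sign extension performed by SEX and the zero extension performed by the 8-bit to 16-bit EXG case can be sketched in a few lines of Python. This is only an illustrative model of the register arithmetic described above; the function names are hypothetical and are not part of any CPU12 tool.

```python
def sign_extend_8_to_16(value: int) -> int:
    """Model SEX: every bit of the upper byte takes the value of bit 7."""
    value &= 0xFF
    return (value | 0xFF00) if (value & 0x80) else value


def zero_extend_8_to_16(value: int) -> int:
    """Model the 8-bit to 16-bit EXG case: the upper byte is filled with zeros."""
    return value & 0xFF


if __name__ == "__main__":
    for byte in (0x7F, 0x80):
        print(f"${byte:02X}: sign ${sign_extend_8_to_16(byte):04X}, "
              f"zero ${zero_extend_8_to_16(byte):04X}")
```

The two results differ only when bit 7 of the source byte is set, which is why the choice matters for signed two's complement values.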
Table 5-2 is a summary of transfer and exchange instructions.
Table 5-2. Transfer and Exchange Instructions
Mnemonic  Function                       Operation
Transfer Instructions
TAB       Transfer A to B                (A) ⇒ B
TAP       Transfer A to CCR              (A) ⇒ CCR
TBA       Transfer B to A                (B) ⇒ A
TFR       Transfer register to register  (A, B, CCR, D, X, Y, or SP) ⇒ A, B, CCR, D, X, Y, or SP
TPA       Transfer CCR to A              (CCR) ⇒ A
TSX       Transfer SP to X               (SP) ⇒ X
TSY       Transfer SP to Y               (SP) ⇒ Y
TXS       Transfer X to SP               (X) ⇒ SP
TYS       Transfer Y to SP               (Y) ⇒ SP
Exchange Instructions
EXG       Exchange register to register  (A, B, CCR, D, X, Y, or SP) ⇔ (A, B, CCR, D, X, Y, or SP)
XGDX      Exchange D with X              (D) ⇔ (X)
XGDY      Exchange D with Y              (D) ⇔ (Y)
Sign Extension Instruction
SEX       Sign extend 8-bit operand      Sign-extended (A, B, or CCR) ⇒ D, X, Y, or SP
5.5 Move Instructions
Move instructions move (copy) data bytes or words from a source
(M₁ or M : M + 1₁) to a destination (M₂ or M : M + 1₂) in memory. Six
combinations of immediate, extended, and indexed addressing are allowed
to specify source and destination addresses (IMM ⇒ EXT,
IMM ⇒ IDX, EXT ⇒ EXT, EXT ⇒ IDX, IDX ⇒ EXT, IDX ⇒ IDX). Addressing
mode combinations with immediate for the destination would not be useful.
Table 5-3 shows byte and word move instructions.
Table 5-3. Move Instructions
Mnemonic  Function            Operation
MOVB      Move byte (8-bit)   (M₁) ⇒ M₂
MOVW      Move word (16-bit)  (M : M + 1₁) ⇒ M : M + 1₂
5.6 Addition and Subtraction Instructions
Signed and unsigned 8- and 16-bit addition can be performed between
registers or between registers and memory. Special instructions support
index calculation. Instructions that add the carry bit in the condition code
register (CCR) facilitate multiple precision computation.
Signed and unsigned 8- and 16-bit subtraction can be performed between
registers or between registers and memory. Special instructions support
index calculation. Instructions that subtract the carry bit in the CCR facilitate
multiple precision computation. Refer to Table 5-4 for addition and
subtraction instructions.
Load effective address (LEAS, LEAX, and LEAY) instructions could also be
considered as specialized addition and subtraction instructions. See 5.25
Pointer and Index Calculation Instructions for more information.
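The role of the carry bit in multiple precision addition can be illustrated with a small Python model. It is a sketch only: `add8` stands in for an 8-bit add (ADDA/ADDB when `carry_in` is 0, ADCA/ADCB otherwise), and both function names are hypothetical.

```python
def add8(a: int, b: int, carry_in: int = 0):
    """8-bit add returning (result, carry_out), like ADDA or ADCA."""
    total = (a & 0xFF) + (b & 0xFF) + carry_in
    return total & 0xFF, total >> 8


def add16_bytewise(x: int, y: int):
    """Add two 16-bit values one byte at a time, chaining the carry."""
    low, carry = add8(x & 0xFF, y & 0xFF)          # low bytes: add without carry
    high, carry = add8(x >> 8, y >> 8, carry)      # high bytes: add with carry
    return (high << 8) | low, carry


if __name__ == "__main__":
    result, carry = add16_bytewise(0x12FF, 0x0001)
    print(f"${result:04X} carry={carry}")
```

The same chaining extends to any operand length: only the least significant add ignores the incoming carry.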
Table 5-4. Addition and Subtraction Instructions
Mnemonic  Function                      Operation
Addition Instructions
ABA       Add B to A                    (A) + (B) ⇒ A
ABX       Add B to X                    (B) + (X) ⇒ X
ABY       Add B to Y                    (B) + (Y) ⇒ Y
ADCA      Add with carry to A           (A) + (M) + C ⇒ A
ADCB      Add with carry to B           (B) + (M) + C ⇒ B
ADDA      Add without carry to A        (A) + (M) ⇒ A
ADDB      Add without carry to B        (B) + (M) ⇒ B
ADDD      Add to D                      (A:B) + (M : M + 1) ⇒ A : B
Subtraction Instructions
SBA       Subtract B from A             (A) – (B) ⇒ A
SBCA      Subtract with borrow from A   (A) – (M) – C ⇒ A
SBCB      Subtract with borrow from B   (B) – (M) – C ⇒ B
SUBA      Subtract memory from A        (A) – (M) ⇒ A
SUBB      Subtract memory from B        (B) – (M) ⇒ B
SUBD      Subtract memory from D (A:B)  (D) – (M : M + 1) ⇒ D
5.7 Binary-Coded Decimal Instructions
To add binary-coded decimal (BCD) operands, use addition instructions that
set the half-carry bit in the CCR, then adjust the result with the decimal
adjust A (DAA) instruction. Table 5-5 is a summary of instructions that can
be used to perform BCD operations.
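The add-then-adjust sequence can be modeled in Python as follows. This is a sketch under stated assumptions: `adda` models an 8-bit add that also produces the half-carry (H) bit, and `daa` applies the commonly documented decimal correction rule (add $06 and/or $60 depending on H, C, and the nibble values). The exact DAA behavior is defined in Section 6. Instruction Glossary; all names here are hypothetical.

```python
def adda(a: int, m: int):
    """8-bit add returning (result, C, H); H is the carry out of bit 3."""
    total = (a & 0xFF) + (m & 0xFF)
    half = int(((a & 0x0F) + (m & 0x0F)) > 0x0F)
    return total & 0xFF, int(total > 0xFF), half


def daa(a: int, c: int, h: int):
    """Assumed decimal-adjust rule: correct each nibble that left BCD range."""
    low, high = a & 0x0F, a >> 4
    correction = 0
    if h or low > 9:
        correction |= 0x06
    if c or high > 9 or (high >= 9 and low > 9):
        correction |= 0x60
        c = 1
    return (a + correction) & 0xFF, c


def bcd_add(a: int, m: int):
    """Add two packed BCD bytes; returns (packed BCD result, decimal carry)."""
    return daa(*adda(a, m))


if __name__ == "__main__":
    result, carry = bcd_add(0x38, 0x45)
    print(f"${result:02X} carry={carry}")
```

Because the adjustment depends on H and C from the preceding addition, the model passes those flags along explicitly rather than recomputing them.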
Table 5-5. BCD Instructions
Mnemonic  Function             Operation
ABA       Add B to A           (A) + (B) ⇒ A
ADCA      Add with carry to A  (A) + (M) + C ⇒ A
ADCB(1)   Add with carry to B  (B) + (M) + C ⇒ B
ADDA      Add memory to A      (A) + (M) ⇒ A
ADDB(1)   Add memory to B      (B) + (M) ⇒ B
DAA       Decimal adjust A     (A)₁₀
1. These instructions are not normally used for BCD operations because, although they affect H
correctly, they do not leave the result in the correct accumulator (A) to be used with the DAA
instruction. Thus additional steps would be needed to adjust the result to correct BCD form.
5.8 Decrement and Increment Instructions
The decrement and increment instructions are optimized 8- and 16-bit
addition and subtraction operations. They are generally used to implement
counters. Because they do not affect the carry bit in the CCR, they are
particularly well suited for loop counters in multiple-precision computation
routines. Refer to 5.20 Loop Primitive Instructions for information
concerning automatic counter branches. Table 5-6 is a summary of
decrement and increment instructions.
Table 5-6. Decrement and Increment Instructions
Mnemonic  Function          Operation
Decrement Instructions
DEC       Decrement memory  (M) – $01 ⇒ M
DECA      Decrement A       (A) – $01 ⇒ A
DECB      Decrement B       (B) – $01 ⇒ B
DES       Decrement SP      (SP) – $0001 ⇒ SP
DEX       Decrement X       (X) – $0001 ⇒ X
DEY       Decrement Y       (Y) – $0001 ⇒ Y
Increment Instructions
INC       Increment memory  (M) + $01 ⇒ M
INCA      Increment A       (A) + $01 ⇒ A
INCB      Increment B       (B) + $01 ⇒ B
INS       Increment SP      (SP) + $0001 ⇒ SP
INX       Increment X       (X) + $0001 ⇒ X
INY       Increment Y       (Y) + $0001 ⇒ Y
5.9 Compare and Test Instructions
Compare and test instructions perform subtraction between a pair of
registers or between a register and memory. The result is not stored, but
condition codes are set by the operation. These instructions are generally
used to establish conditions for branch instructions. In this architecture,
most instructions update condition code bits automatically, so it is often
unnecessary to include separate test or compare instructions. Table 5-7 is
a summary of compare and test instructions.
Table 5-7. Compare and Test Instructions
Mnemonic  Function                        Operation
Compare Instructions
CBA       Compare A to B                  (A) – (B)
CMPA      Compare A to memory             (A) – (M)
CMPB      Compare B to memory             (B) – (M)
CPD       Compare D to memory (16-bit)    (A : B) – (M : M + 1)
CPS       Compare SP to memory (16-bit)   (SP) – (M : M + 1)
CPX       Compare X to memory (16-bit)    (X) – (M : M + 1)
CPY       Compare Y to memory (16-bit)    (Y) – (M : M + 1)
Test Instructions
TST       Test memory for zero or minus   (M) – $00
TSTA      Test A for zero or minus        (A) – $00
TSTB      Test B for zero or minus        (B) – $00
5.10 Boolean Logic Instructions
The Boolean logic instructions perform a logic operation between an 8-bit
accumulator or the CCR and a memory value. AND, OR, and exclusive OR
functions are supported. Table 5-8 summarizes logic instructions.
Table 5-8. Boolean Logic Instructions
Mnemonic  Function                     Operation
ANDA      AND A with memory            (A) • (M) ⇒ A
ANDB      AND B with memory            (B) • (M) ⇒ B
EORA      Exclusive OR A with memory   (A) ⊕ (M) ⇒ A
EORB      Exclusive OR B with memory   (B) ⊕ (M) ⇒ B
ORAA      OR A with memory             (A) + (M) ⇒ A
ORAB      OR B with memory             (B) + (M) ⇒ B
5.11 Clear, Complement, and Negate Instructions
Each of the clear, complement, and negate instructions performs a specific
binary operation on a value in an accumulator or in memory. Clear
operations clear the value to 0, complement operations replace the value
with its one’s complement, and negate operations replace the value with its
two’s complement. Table 5-9 is a summary of clear, complement, and
negate instructions.
Table 5-9. Clear, Complement, and Negate Instructions
Mnemonic  Function                 Operation
CLC       Clear C bit in CCR       0 ⇒ C
CLI       Clear I bit in CCR       0 ⇒ I
CLR       Clear memory             $00 ⇒ M
CLRA      Clear A                  $00 ⇒ A
CLRB      Clear B                  $00 ⇒ B
CLV       Clear V bit in CCR       0 ⇒ V
COM       One’s complement memory  $FF – (M) ⇒ M or (M̅) ⇒ M
COMA      One’s complement A       $FF – (A) ⇒ A or (A̅) ⇒ A
COMB      One’s complement B       $FF – (B) ⇒ B or (B̅) ⇒ B
NEG       Two’s complement memory  $00 – (M) ⇒ M or (M̅) + 1 ⇒ M
NEGA      Two’s complement A       $00 – (A) ⇒ A or (A̅) + 1 ⇒ A
NEGB      Two’s complement B       $00 – (B) ⇒ B or (B̅) + 1 ⇒ B
5.12 Multiplication and Division Instructions
There are instructions for signed and unsigned 8- and 16-bit multiplication.
Eight-bit multiplication operations have a 16-bit product. Sixteen-bit
multiplication operations have 32-bit products.
Integer and fractional division instructions have 16-bit dividend, divisor,
quotient, and remainder. Extended division instructions use a 32-bit
dividend and a 16-bit divisor to produce a 16-bit quotient and a 16-bit
remainder.
Table 5-10 is a summary of multiplication and division instructions.
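The register pairing used by the extended multiply and divide instructions can be sketched in Python. The model below follows the operand sizes stated above (16 × 16 ⇒ 32 for EMUL, 32 ÷ 16 ⇒ 16-bit quotient and remainder for EDIV). Raising exceptions for a zero divisor or an oversized quotient is a modeling choice made here, not a description of how the hardware reports those cases; the function names are hypothetical.

```python
def emul(d: int, y: int):
    """Unsigned 16 x 16 multiply; returns (Y, D) holding the 32-bit product."""
    product = (d & 0xFFFF) * (y & 0xFFFF)
    return product >> 16, product & 0xFFFF


def ediv(y: int, d: int, x: int):
    """Unsigned 32 / 16 divide of Y:D by X; returns (quotient, remainder)."""
    divisor = x & 0xFFFF
    if divisor == 0:
        raise ZeroDivisionError("divisor in X is zero")
    dividend = ((y & 0xFFFF) << 16) | (d & 0xFFFF)
    quotient, remainder = divmod(dividend, divisor)
    if quotient > 0xFFFF:
        raise OverflowError("quotient does not fit in 16 bits")
    return quotient, remainder


if __name__ == "__main__":
    high, low = emul(0x1234, 0x0100)
    print(f"Y:D = ${high:04X}:${low:04X}")
    print(ediv(high, low, 0x0100))
```

Because EMUL leaves its product in Y:D and EDIV takes its dividend from Y:D, the two can be chained without moving data between registers.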
Table 5-10. Multiplication and Division Instructions
Mnemonic  Function                               Operation
Multiplication Instructions
EMUL      16 by 16 multiply (unsigned)           (D) × (Y) ⇒ Y : D
EMULS     16 by 16 multiply (signed)             (D) × (Y) ⇒ Y : D
MUL       8 by 8 multiply (unsigned)             (A) × (B) ⇒ A : B
Division Instructions
EDIV      32 by 16 divide (unsigned)             (Y : D) ÷ (X) ⇒ Y; Remainder ⇒ D
EDIVS     32 by 16 divide (signed)               (Y : D) ÷ (X) ⇒ Y; Remainder ⇒ D
FDIV      16 by 16 fractional divide             (D) ÷ (X) ⇒ X; Remainder ⇒ D
IDIV      16 by 16 integer divide (unsigned)     (D) ÷ (X) ⇒ X; Remainder ⇒ D
IDIVS     16 by 16 integer divide (signed)       (D) ÷ (X) ⇒ X; Remainder ⇒ D
5.13 Bit Test and Manipulation Instructions
The bit test and manipulation operations use a mask value to test or change
the value of individual bits in an accumulator or in memory. Bit test A (BITA)
and bit test B (BITB) provide a convenient means of testing bits without
altering the value of either operand. Table 5-11 is a summary of bit test and
manipulation instructions.
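The mask operations can be modeled in Python as shown below. This is an illustrative sketch: the function names mirror the mnemonics but are hypothetical helpers, and the flag dictionary returned by `bita` only covers N and Z.

```python
def bset(memory_byte: int, mask: int) -> int:
    """Set every bit of the byte that corresponds to a one in the mask."""
    return (memory_byte | mask) & 0xFF


def bclr(memory_byte: int, mask: int) -> int:
    """Clear every bit of the byte that corresponds to a one in the mask."""
    return memory_byte & ~mask & 0xFF


def bita(accumulator: int, memory_byte: int) -> dict:
    """AND the operands for flag purposes only; neither operand changes."""
    result = accumulator & memory_byte & 0xFF
    return {"N": result >> 7, "Z": int(result == 0)}


if __name__ == "__main__":
    print(f"${bset(0x01, 0x80):02X}")
    print(f"${bclr(0xFF, 0x0F):02X}")
    print(bita(0xF0, 0x0F))
```

Note that BCLR ANDs the memory byte with the complement of the mask, so a one in the mask always selects a bit to act on, for both BSET and BCLR.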
Table 5-11. Bit Test and Manipulation Instructions
Mnemonic  Function              Operation
BCLR      Clear bits in memory  (M) • (m̅m̅) ⇒ M
BITA      Bit test A            (A) • (M)
BITB      Bit test B            (B) • (M)
BSET      Set bits in memory    (M) + (mm) ⇒ M
5.14 Shift and Rotate Instructions
There are shifts and rotates for all accumulators and for memory bytes. All
pass the shifted-out bit through the C status bit to facilitate multiple-byte
operations. Because logical and arithmetic left shifts are identical, there are
no separate logical left shift operations. Logic shift left (LSL) mnemonics are
assembled as arithmetic shift left memory (ASL) operations. Table 5-12
shows shift and rotate instructions.
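Passing the shifted-out bit through C is what makes multiple-byte shifts possible. The Python sketch below models a 16-bit left shift built from an 8-bit arithmetic shift of the low byte followed by a rotate-through-carry of the high byte; the function names are hypothetical.

```python
def asl8(value: int):
    """Shift left; bit 7 goes to C, 0 enters bit 0. Returns (result, C)."""
    return (value << 1) & 0xFF, (value >> 7) & 1


def rol8(value: int, carry_in: int):
    """Rotate left through carry: C enters bit 0, bit 7 goes to C."""
    return ((value << 1) | carry_in) & 0xFF, (value >> 7) & 1


def shift_left_16(high: int, low: int):
    """Shift a 2-byte value left by one bit using the carry as the link."""
    low, carry = asl8(low)
    high, carry = rol8(high, carry)
    return high, low, carry


if __name__ == "__main__":
    high, low, carry = shift_left_16(0x40, 0x80)
    print(f"${high:02X}{low:02X} C={carry}")
```

Longer operands follow the same pattern: one shift on the least significant byte, then a rotate on each more significant byte in turn.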
Table 5-12. Shift and Rotate Instructions
Mnemonic  Function                           Operation
Logical Shifts
LSL       Logic shift left memory            C ← b7 ... b0 ← 0
LSLA      Logic shift left A                 C ← b7 ... b0 ← 0
LSLB      Logic shift left B                 C ← b7 ... b0 ← 0
LSLD      Logic shift left D                 C ← b7 (A) b0 ← b7 (B) b0 ← 0
LSR       Logic shift right memory           0 → b7 ... b0 → C
LSRA      Logic shift right A                0 → b7 ... b0 → C
LSRB      Logic shift right B                0 → b7 ... b0 → C
LSRD      Logic shift right D                0 → b7 (A) b0 → b7 (B) b0 → C
Arithmetic Shifts
ASL       Arithmetic shift left memory       C ← b7 ... b0 ← 0
ASLA      Arithmetic shift left A            C ← b7 ... b0 ← 0
ASLB      Arithmetic shift left B            C ← b7 ... b0 ← 0
ASLD      Arithmetic shift left D            C ← b7 (A) b0 ← b7 (B) b0 ← 0
ASR       Arithmetic shift right memory      b7 → b7 ... b0 → C
ASRA      Arithmetic shift right A           b7 → b7 ... b0 → C
ASRB      Arithmetic shift right B           b7 → b7 ... b0 → C
Rotates
ROL       Rotate left memory through carry   C ← b7 ... b0 ← C
ROLA      Rotate left A through carry        C ← b7 ... b0 ← C
ROLB      Rotate left B through carry        C ← b7 ... b0 ← C
ROR       Rotate right memory through carry  C → b7 ... b0 → C
RORA      Rotate right A through carry       C → b7 ... b0 → C
RORB      Rotate right B through carry       C → b7 ... b0 → C
5.15 Fuzzy Logic Instructions
The CPU12 instruction set includes instructions that support efficient
processing of fuzzy logic operations. The descriptions of fuzzy logic
instructions given here are functional overviews. Table 5-13 summarizes
the fuzzy logic instructions. Refer to Section 9. Fuzzy Logic Support for
detailed discussion.
5.15.1 Fuzzy Logic Membership Instruction
The membership function (MEM) instruction is used during the fuzzification
process. During fuzzification, current system input values are compared
against stored input membership functions to determine the degree to which
each label of each system input is true. This is accomplished by finding the
y value for the current input on a trapezoidal membership function for each
label of each system input. The MEM instruction performs this calculation
for one label of one system input. To perform the complete fuzzification task
for a system, several MEM instructions must be executed, usually in a
program loop structure.
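The grade calculation performed for one label can be sketched in Python using the formula given for MEM in Table 5-13. This is a simplified model: it implements only that formula, with P1 and P2 as the base intercept points and S1 and S2 as the slopes, and it does not attempt to model any special cases described in Section 9. The function name and the sample values are hypothetical.

```python
def mem_grade(a: int, p1: int, p2: int, s1: int, s2: int) -> int:
    """Grade of membership for crisp input `a` on a trapezoid (P1, P2, S1, S2)."""
    if a < p1 or a > p2:
        return 0
    return min((a - p1) * s1, (p2 - a) * s2, 0xFF)


def fuzzify(a: int, labels):
    """Apply the calculation once per label, as a loop of MEM instructions would."""
    return [mem_grade(a, *label) for label in labels]


if __name__ == "__main__":
    # Two hypothetical labels for one system input.
    labels = [(0x00, 0x50, 0x08, 0x08), (0x20, 0x80, 0x08, 0x04)]
    print([f"${grade:02X}" for grade in fuzzify(0x22, labels)])
```

The loop in `fuzzify` corresponds to the program loop mentioned above: one membership calculation per label of each system input.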
5.15.2 Fuzzy Logic Rule Evaluation Instructions
The MIN-MAX rule evaluation (REV and REVW) instructions perform
MIN-MAX rule evaluations that are central elements of a fuzzy logic
inference program. Fuzzy input values are processed using a list of rules
from the knowledge base to produce a list of fuzzy outputs. The REV
instruction treats all rules as equally important. The REVW instruction
allows each rule to have a separate weighting factor. The two rule
evaluation instructions also differ in the way rules are encoded into the
knowledge base. Because they require a number of cycles to execute, rule
evaluation instructions can be interrupted. Once the interrupt has been
serviced, instruction execution resumes at the point the interrupt occurred.
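An unweighted MIN-MAX evaluation can be sketched in Python as follows. The rule list layout assumed here (input offsets, $FE, output offsets, $FE before the next rule, and $FF at the end) is a simplification inferred from the REV summary in Table 5-13; consult Section 9 for the exact encoding. The function name is hypothetical, and fuzzy outputs are expected to be cleared before evaluation.

```python
def rev(rule_list, fuzzy: bytearray) -> bytearray:
    """Evaluate unweighted rules over a byte array of fuzzy inputs and outputs."""
    truth = 0xFF                 # running MIN of the current rule's inputs
    reading_inputs = True
    for entry in rule_list:
        if entry == 0xFF:        # end of the rule list
            break
        if entry == 0xFE:        # separator: switch between inputs and outputs
            reading_inputs = not reading_inputs
            if reading_inputs:
                truth = 0xFF     # a new rule starts
            continue
        if reading_inputs:
            truth = min(truth, fuzzy[entry])
        else:
            fuzzy[entry] = max(fuzzy[entry], truth)
    return fuzzy


if __name__ == "__main__":
    # Offsets 0 to 2 hold fuzzy inputs, offsets 3 and 4 hold cleared outputs.
    values = bytearray([0x40, 0x80, 0xC0, 0x00, 0x00])
    rules = [0, 1, 0xFE, 3, 0xFE, 1, 2, 0xFE, 3, 4, 0xFF]
    print(list(rev(rules, values)))
```

Each output keeps the largest truth value that any rule assigned to it, which is the MAX half of the MIN-MAX method.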
5.15.3 Fuzzy Logic Weighted Average Instruction
The weighted average (WAV) instruction computes a sum-of-products and
a sum-of-weights used for defuzzification. To be usable, the fuzzy outputs
produced by rule evaluation must be defuzzified to produce a single output
value which represents the combined effect of all of the fuzzy outputs. Fuzzy
outputs correspond to the labels of a system output and each is defined by
a membership function in the knowledge base. The CPU12 typically uses
singletons for output membership functions rather than the trapezoidal
shapes used for inputs. As with inputs, the x-axis represents the range of
possible values for a system output. Singleton membership functions
consist of the x-axis position for a label of the system output. Fuzzy outputs
correspond to the y-axis height of the corresponding output membership
function. The WAV instruction calculates the numerator and denominator
sums for a weighted average of the fuzzy outputs. Because WAV requires
a number of cycles to execute, it can be interrupted. The WAVR
pseudo-instruction causes execution to resume at the point where it was
interrupted.
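The two sums produced by WAV, and the division that typically follows, can be sketched in Python. This is an arithmetic model only: `singletons` are the x-axis positions of the output labels, `fuzzy_outputs` are their heights, and returning 0 for an all-zero denominator is a choice made for this sketch. The function names are hypothetical.

```python
def wav(singletons, fuzzy_outputs):
    """Return (sum of products, sum of weights) for a weighted average."""
    numerator = sum(s * f for s, f in zip(singletons, fuzzy_outputs))
    denominator = sum(fuzzy_outputs)
    return numerator, denominator


def defuzzify(singletons, fuzzy_outputs) -> int:
    """Divide the numerator by the denominator, as EDIV would after WAV."""
    numerator, denominator = wav(singletons, fuzzy_outputs)
    return numerator // denominator if denominator else 0


if __name__ == "__main__":
    print(hex(defuzzify([0x20, 0x80], [0x40, 0xC0])))
```

With 8-bit singletons and 8-bit fuzzy outputs the numerator can exceed 16 bits, which is consistent with the 32-bit dividend that EDIV accepts.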
Table 5-13. Fuzzy Logic Instructions
Mnemonic  Function / Operation
MEM       Membership function
          µ (grade) ⇒ M(Y)
          (X) + 4 ⇒ X; (Y) + 1 ⇒ Y; A unchanged
          if (A) < P1 or (A) > P2, then µ = 0, else
          µ = MIN [((A) – P1) × S1, (P2 – (A)) × S2, $FF]
          where:
          A = current crisp input value
          X points to a 4-byte data structure that describes a trapezoidal
          membership function as base intercept points and slopes (P1, P2, S1, S2)
          Y points at fuzzy input (RAM location)
REV       MIN-MAX rule evaluation
          Find smallest rule input (MIN)
          Store to rule outputs unless fuzzy output is larger (MAX)
          Rules are unweighted
          Each rule input is an 8-bit offset from a base address in Y
          Each rule output is an 8-bit offset from a base address in Y
          $FE separates rule inputs from rule outputs
          $FF terminates the rule list
          REV can be interrupted
REVW      MIN-MAX rule evaluation
          Find smallest rule input (MIN)
          Multiply by a rule weighting factor (optional)
          Store to rule outputs unless fuzzy output is larger (MAX)
          Each rule input is the 16-bit address of a fuzzy input
          Each rule output is the 16-bit address of a fuzzy output
          Address $FFFE separates rule inputs from rule outputs
          $FFFF terminates the rule list
          Weights are 8-bit values in a separate table
          REVW can be interrupted
WAV       Calculates numerator (sum of products) and denominator (sum of weights)
          for weighted average calculation
          Results are placed in correct registers for EDIV immediately after WAV
          Σ (i = 1 to B) Sᵢ × Fᵢ ⇒ Y:D
          Σ (i = 1 to B) Fᵢ ⇒ X
WAVR      Resumes execution of interrupted WAV instruction
          Recover immediate results from stack rather than initializing them to 0.
5.16 Maximum and Minimum Instructions
The maximum (MAX) and minimum (MIN) instructions are used to make
comparisons between an accumulator and a memory location. These
instructions can be used for linear programming operations, such as
simplex-method optimization, or for fuzzification.
MAX and MIN instructions use accumulator A to perform 8-bit comparisons,
while EMAX and EMIN instructions use accumulator D to perform 16-bit
comparisons. The result (maximum or minimum value) can be stored in the
accumulator (EMAXD, EMIND, MAXA, MINA) or the memory address
(EMAXM, EMINM, MAXM, MINM).
Table 5-14 is a summary of minimum and maximum instructions.
Table 5-14. Minimum and Maximum Instructions
Mnemonic  Function                                                 Operation
Minimum Instructions
EMIND     MIN of two unsigned 16-bit values result to accumulator  MIN ((D), (M : M + 1)) ⇒ D
EMINM     MIN of two unsigned 16-bit values result to memory       MIN ((D), (M : M + 1)) ⇒ M : M + 1
MINA      MIN of two unsigned 8-bit values result to accumulator   MIN ((A), (M)) ⇒ A
MINM      MIN of two unsigned 8-bit values result to memory        MIN ((A), (M)) ⇒ M
Maximum Instructions
EMAXD     MAX of two unsigned 16-bit values result to accumulator  MAX ((D), (M : M + 1)) ⇒ D
EMAXM     MAX of two unsigned 16-bit values result to memory       MAX ((D), (M : M + 1)) ⇒ M : M + 1
MAXA      MAX of two unsigned 8-bit values result to accumulator   MAX ((A), (M)) ⇒ A
MAXM      MAX of two unsigned 8-bit values result to memory        MAX ((A), (M)) ⇒ M
5.17 Multiply and Accumulate Instruction
The multiply and accumulate (EMACS) instruction multiplies two 16-bit
operands stored in memory and accumulates the 32-bit result in a third
memory location. EMACS can be used to implement simple digital filters
and defuzzification routines that use 16-bit operands. The WAV instruction
incorporates an 8- to 16-bit multiply and accumulate operation that obtains
a numerator for the weighted average calculation. The EMACS instruction
can automate this portion of the averaging operation when 16-bit operands
are used. Table 5-15 shows the EMACS instruction.
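The arithmetic performed by one EMACS step can be sketched in Python: two 16-bit operands are interpreted as signed values, multiplied, and added to a 32-bit accumulator. The model ignores condition codes and memory addressing, and the function names are hypothetical.

```python
def to_signed16(value: int) -> int:
    """Interpret a 16-bit pattern as a two's complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def emacs(accumulator32: int, operand_x: int, operand_y: int) -> int:
    """Signed 16 x 16 multiply, accumulated into a 32-bit value."""
    product = to_signed16(operand_x) * to_signed16(operand_y)
    return (accumulator32 + product) & 0xFFFFFFFF


def dot_product(samples, coefficients) -> int:
    """A simple filter tap sum built from repeated multiply-accumulate steps."""
    accumulator = 0
    for sample, coefficient in zip(samples, coefficients):
        accumulator = emacs(accumulator, sample, coefficient)
    return accumulator


if __name__ == "__main__":
    print(hex(dot_product([0xFFFF, 0x0003], [0x0002, 0x0004])))
```

The loop in `dot_product` is the pattern a simple digital filter would follow, with one multiply-accumulate per tap.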
Table 5-15. Multiply and Accumulate Instructions
Mnemonic  Function                          Operation
EMACS     Multiply and accumulate (signed)  ((M(X):M(X+1)) × (M(Y):M(Y+1)))
          16 bit by 16 bit ⇒ 32 bit         + (M ~ M + 3) ⇒ M ~ M + 3
5.18 Table Interpolation Instructions
The table interpolation instructions (TBL and ETBL) interpolate values from
tables stored in memory. Any function that can be represented as a series
of linear equations can be represented by a table of appropriate size.
Interpolation can be used for many purposes, including tabular fuzzy logic
membership functions. TBL uses 8-bit table entries and returns an 8-bit
result; ETBL uses 16-bit table entries and returns a 16-bit result. Use of
indexed addressing mode provides great flexibility in structuring tables.
Consider each of the successive values stored in a table to be y-values for
the endpoint of a line segment. The value in the B accumulator before
instruction execution begins represents the change in x from the beginning
of the line segment to the lookup point divided by total change in x from the
beginning to the end of the line segment. B is treated as an 8-bit binary
fraction with radix point left of the MSB, so each line segment is effectively
divided into 256 smaller segments. During instruction execution, the change
in y between the beginning and end of the segment (a signed byte for TBL
or a signed word for ETBL) is multiplied by the content of the B accumulator
to obtain an intermediate delta-y term. The result (stored in the A
accumulator by TBL, and in the D accumulator by ETBL) is the y-value of
the beginning point plus the signed intermediate delta-y value. Table 5-16
shows the table interpolation instructions.
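The 8-bit interpolation can be sketched in Python. The model treats B as a fraction B/256 of the distance between two adjacent table entries, as described above. How the hardware rounds the intermediate delta-y term is not stated in this overview, so the floor division used here is an assumption; the function name is hypothetical.

```python
def tbl(y_start: int, y_end: int, b: int) -> int:
    """Interpolate between two 8-bit table entries using the fraction B/256."""
    delta_y = y_end - y_start                  # signed change across the segment
    intermediate = (delta_y * (b & 0xFF)) // 256   # assumed rounding: floor
    return (y_start + intermediate) & 0xFF


if __name__ == "__main__":
    # Halfway (B = $80) between table entries 10 and 20.
    print(tbl(10, 20, 0x80))
```

Because B can never represent a fraction of 1.0, the result reaches the end point of a segment only when the lookup moves on to the next table entry.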
Table 5-16. Table Interpolation Instructions
Mnemonic  Function / Operation
ETBL      16-bit table lookup and interpolate
          (no indirect addressing modes allowed)
          (M : M + 1) + [(B) × ((M + 2 : M + 3) – (M : M + 1))] ⇒ D
          Initialize B, and index before ETBL.
          <ea> points to the first table entry (M : M + 1)
          B is fractional part of lookup value
TBL       8-bit table lookup and interpolate
          (no indirect addressing modes allowed)
          (M) + [(B) × ((M + 1) – (M))] ⇒ A
          Initialize B, and index before TBL.
          <ea> points to the first 8-bit table entry (M)
          B is fractional part of lookup value.
5.19 Branch Instructions
Branch instructions cause a sequence to change when specific conditions
exist. The CPU12 uses three kinds of branch instructions. These are short
branches, long branches, and bit condition branches.
Branch instructions can also be classified by the type of condition that must
be satisfied in order for a branch to be taken. Some instructions belong to
more than one classification. For example:
•Unary branch instructions always execute.
•Simple branches are taken when a specific bit in the condition code
register is in a specific state as a result of a previous operation.
•Unsigned branches are taken when comparison or test of unsigned
quantities results in a specific combination of condition code register
bits.
•Signed branches are taken when comparison or test of signed
quantities results in a specific combination of condition code register
bits.
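The difference between unsigned and signed branches can be illustrated by modeling an 8-bit compare (R – M) and evaluating the branch equations from Table 5-17 on the resulting flags. This Python sketch is illustrative only; the function names are hypothetical.

```python
def compare8(r: int, m: int):
    """Flags (N, Z, V, C) produced by the 8-bit subtraction R - M."""
    r &= 0xFF
    m &= 0xFF
    difference = (r - m) & 0xFF
    n = difference >> 7
    z = int(difference == 0)
    v = (((r ^ m) & (r ^ difference)) >> 7) & 1   # signed overflow
    c = int(r < m)                                # borrow
    return n, z, v, c


def bhi_taken(n: int, z: int, v: int, c: int) -> bool:
    """Unsigned 'higher': C + Z = 0."""
    return (c | z) == 0


def bgt_taken(n: int, z: int, v: int, c: int) -> bool:
    """Signed 'greater than': Z + (N xor V) = 0."""
    return (z | (n ^ v)) == 0


if __name__ == "__main__":
    flags = compare8(0xFF, 0x01)   # 255 versus 1 unsigned, -1 versus 1 signed
    print(bhi_taken(*flags), bgt_taken(*flags))
```

The same flags give opposite answers for the two branch types whenever the operands differ in sign, which is why the branch must match the interpretation of the data.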
5.19.1 Short Branch Instructions
Short branch instructions operate this way: When a specified condition is
met, a signed 8-bit offset is added to the value in the program counter.
Program execution continues at the new address.
The numeric range of short branch offset values is $80 (–128) to $7F (127)
from the address of the next memory location after the offset value.
Table 5-17 is a summary of the short branch instructions.
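The destination of a short branch can be computed as shown in the Python sketch below. It assumes a two-byte instruction (opcode followed by the offset byte), so the offset is applied to the address of the opcode plus 2; the function name is hypothetical.

```python
def short_branch_target(opcode_address: int, offset_byte: int) -> int:
    """Destination of a taken short branch within the 64-Kbyte map."""
    offset_byte &= 0xFF
    signed_offset = offset_byte - 0x100 if offset_byte & 0x80 else offset_byte
    next_address = opcode_address + 2      # location after the offset byte
    return (next_address + signed_offset) & 0xFFFF


if __name__ == "__main__":
    for offset in (0x80, 0xFE, 0x7F):
        print(f"offset ${offset:02X} -> ${short_branch_target(0x1000, offset):04X}")
```

An offset of $FE (–2) branches back to the branch instruction itself under this assumption, which is a quick way to check the base address used for the calculation.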
Table 5-17. Short Branch Instructions
Mnemonic  Function                         Relation  Equation or Operation
Unary Branches
BRA       Branch always                              1 = 1
BRN       Branch never                               1 = 0
Simple Branches
BCC       Branch if carry clear                      C = 0
BCS       Branch if carry set                        C = 1
BEQ       Branch if equal                            Z = 1
BMI       Branch if minus                            N = 1
BNE       Branch if not equal                        Z = 0
BPL       Branch if plus                             N = 0
BVC       Branch if overflow clear                   V = 0
BVS       Branch if overflow set                     V = 1
Unsigned Branches
BHI       Branch if higher                 R > M     C + Z = 0
BHS       Branch if higher or same         R ≥ M     C = 0
BLO       Branch if lower                  R < M     C = 1
BLS       Branch if lower or same          R ≤ M     C + Z = 1
Signed Branches
BGE       Branch if greater than or equal  R ≥ M     N ⊕ V = 0
BGT       Branch if greater than           R > M     Z + (N ⊕ V) = 0
BLE       Branch if less than or equal     R ≤ M     Z + (N ⊕ V) = 1
BLT       Branch if less than              R < M     N ⊕ V = 1
5.19.2 Long Branch Instructions
Long branch instructions operate this way: When a specified condition is
met, a signed 16-bit offset is added to the value in the program counter.
Program execution continues at the new address. Long branches are used
when large displacements between decision-making steps are necessary.
The numeric range of long branch offset values is $8000 (–32,768) to $7FFF
(32,767) from the address of the next memory location after the offset value.
This permits branching from any location in the standard 64-Kbyte address
map to any other location in the 64-Kbyte map.
Table 5-18 is a summary of the long branch instructions.
Table 5-18. Long Branch Instructions
Mnemonic  Function                              Equation or Operation
Unary Branches
LBRA      Long branch always                    1 = 1
LBRN      Long branch never                     1 = 0
Simple Branches
LBCC      Long branch if carry clear            C = 0
LBCS      Long branch if carry set              C = 1
LBEQ      Long branch if equal                  Z = 1
LBMI      Long branch if minus                  N = 1
LBNE      Long branch if not equal              Z = 0
LBPL      Long branch if plus                   N = 0
LBVC      Long branch if overflow clear         V = 0
LBVS      Long branch if overflow set           V = 1
Unsigned Branches
LBHI      Long branch if higher                 C + Z = 0
LBHS      Long branch if higher or same         C = 0
LBLO      Long branch if lower                  C = 1
LBLS      Long branch if lower or same          C + Z = 1
Signed Branches
LBGE      Long branch if greater than or equal  N ⊕ V = 0
LBGT      Long branch if greater than           Z + (N ⊕ V) = 0
LBLE      Long branch if less than or equal     Z + (N ⊕ V) = 1
LBLT      Long branch if less than              N ⊕ V = 1
5.19.3 Bit Condition Branch Instructions
The bit condition branches are taken when bits in a memory byte are in a
specific state. A mask operand is used to test the location. If all bits in that
location that correspond to ones in the mask are set (BRSET) or cleared
(BRCLR), the branch is taken.
The numeric range of 8-bit offset values is $80 (–128) to $7F (127)
from the address of the next memory location after the offset value.
Table 5-19 is a summary of bit condition branches.
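The two bit conditions can be written out in Python to make the role of the mask explicit. This is a sketch of the test only, not of the branch itself; the function names are hypothetical.

```python
def brclr_taken(memory_byte: int, mask: int) -> bool:
    """True when every bit selected by the mask is 0 in the memory byte."""
    return (memory_byte & mask & 0xFF) == 0


def brset_taken(memory_byte: int, mask: int) -> bool:
    """True when every bit selected by the mask is 1 in the memory byte."""
    return (~memory_byte & mask & 0xFF) == 0


if __name__ == "__main__":
    status = 0b1010_0001
    print(brset_taken(status, 0b1000_0001))   # both selected bits are set
    print(brclr_taken(status, 0b0100_0010))   # both selected bits are clear
    print(brset_taken(status, 0b1100_0000))   # bit 6 is clear, so not taken
```

BRSET tests the complement of the memory byte against the mask, so a single zero among the selected bits is enough to prevent the branch.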
Table 5-19. Bit Condition Branch Instructions
Mnemonic  Function                      Equation or Operation
BRCLR     Branch if selected bits clear (M) • (mm) = 0
BRSET     Branch if selected bits set   (M̅) • (mm) = 0
5.20 Loop Primitive Instructions
The loop primitives can also be thought of as counter branches. The
instructions test a counter value in a register or accumulator (A, B, D, X, Y,
or SP) for zero or non-zero value as a branch condition. There are
predecrement, preincrement, and test-only versions of these instructions.
The numeric range of 9-bit offset values is $100 (–256) to $0FF (255)
from the address of the next memory location after the offset value.
Table 5-20 is a summary of loop primitive branches.
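The behavior of a decrement-and-branch loop primitive can be modeled in Python. The sketch below follows the DBNE description (decrement the counter, branch if the result is not zero) for an 8-bit counter and counts how many times a loop body ending in DBNE would run; the function names are hypothetical.

```python
def dbne(counter: int, bits: int = 8):
    """Decrement the counter; return (new counter, branch taken)."""
    counter = (counter - 1) & ((1 << bits) - 1)
    return counter, counter != 0


def loop_passes(initial_count: int, bits: int = 8) -> int:
    """Number of passes through a loop body that ends with DBNE."""
    counter, passes = initial_count, 0
    while True:
        passes += 1                      # the loop body executes here
        counter, taken = dbne(counter, bits)
        if not taken:
            return passes


if __name__ == "__main__":
    print(loop_passes(3))
    print(loop_passes(0))
```

Because the counter is decremented before it is tested, an initial value of 0 in an 8-bit counter wraps to $FF and the loop runs 256 times in this model.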
Table 5-20. Loop Primitive Instructions
Mnemonic  Function / Equation or Operation
DBEQ      Decrement counter and branch if = 0
          (counter = A, B, D, X, Y, or SP)
          (counter) – 1 ⇒ counter
          If (counter) = 0, then branch;
          else continue to next instruction
DBNE      Decrement counter and branch if ≠ 0
          (counter = A, B, D, X, Y, or SP)
          (counter) – 1 ⇒ counter
          If (counter) not = 0, then branch;
          else continue to next instruction
IBEQ      Increment counter and branch if = 0
          (counter = A, B, D, X, Y, or SP)
          (counter) + 1 ⇒ counter
          If (counter) = 0, then branch;
          else continue to next instruction
IBNE      Increment counter and branch if ≠ 0
          (counter = A, B, D, X, Y, or SP)
          (counter) + 1 ⇒ counter
          If (counter) not = 0, then branch;
          else continue to next instruction
TBEQ      Test counter and branch if = 0
          (counter = A, B, D, X, Y, or SP)
          If (counter) = 0, then branch;
          else continue to next instruction
TBNE      Test counter and branch if ≠ 0
          (counter = A, B, D, X, Y, or SP)
          If (counter) not = 0, then branch;
          else continue to next instruction
5.21 Jump and Subroutine Instructions
Jump (JMP) instructions cause immediate changes in sequence. The JMP
instruction loads the PC with an address in the 64-Kbyte memory map, and
program execution continues at that address. The address can be provided
as an absolute 16-bit address or determined by various forms of indexed
addressing.
Subroutine instructions optimize the process of transferring control to a
code segment that performs a particular task. A short branch (BSR), a jump
to subroutine (JSR), or an expanded-memory call (CALL) can be used to
initiate subroutines. There is no LBSR instruction, but a PC-relative JSR
performs the same function. A return address is stacked, then execution
begins at the subroutine address. Subroutines in the normal 64-Kbyte
address space are terminated with a return-from-subroutine (RTS)
instruction. RTS unstacks the return address so that execution resumes
with the instruction after BSR or JSR.
The call subroutine in expanded memory (CALL) instruction is intended for
use with expanded memory. CALL stacks the value in the PPAGE register
and the return address, then writes a new value to PPAGE to select the
memory page where the subroutine resides. The page value is an
immediate operand in all addressing modes except indexed indirect modes;
in these modes, an operand points to locations in memory where the new
page value and subroutine address are stored. The return from call (RTC)
instruction is used to terminate subroutines in expanded memory. RTC
unstacks the PPAGE value and the return address so that execution
resumes with the next instruction after CALL. For software compatibility,
CALL and RTC execute correctly on devices that do not have expanded
addressing capability. Table 5-21 summarizes the jump and subroutine
instructions.
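The stacking order used by CALL and RTC can be sketched with a small Python model of SP, PC, PPAGE, and memory. It follows the description above (return address stacked first, then the old PPAGE value) and is not a cycle-accurate model. The class and function names, and the initial stack pointer value, are hypothetical.

```python
class Cpu:
    """Minimal register and memory model, just enough for CALL and RTC."""

    def __init__(self, sp: int = 0x0C00):
        self.memory = bytearray(0x10000)
        self.sp = sp
        self.pc = 0
        self.ppage = 0


def call(cpu: Cpu, return_address: int, page: int, subroutine_address: int) -> None:
    cpu.sp = (cpu.sp - 2) & 0xFFFF
    cpu.memory[cpu.sp] = return_address >> 8          # RTNH
    cpu.memory[cpu.sp + 1] = return_address & 0xFF    # RTNL
    cpu.sp = (cpu.sp - 1) & 0xFFFF
    cpu.memory[cpu.sp] = cpu.ppage                    # old page value
    cpu.ppage = page
    cpu.pc = subroutine_address


def rtc(cpu: Cpu) -> None:
    cpu.ppage = cpu.memory[cpu.sp]
    cpu.sp = (cpu.sp + 1) & 0xFFFF
    cpu.pc = (cpu.memory[cpu.sp] << 8) | cpu.memory[cpu.sp + 1]
    cpu.sp = (cpu.sp + 2) & 0xFFFF


if __name__ == "__main__":
    cpu = Cpu()
    cpu.ppage = 0x02
    call(cpu, return_address=0x4123, page=0x07, subroutine_address=0x8000)
    rtc(cpu)
    print(f"PC=${cpu.pc:04X} PPAGE=${cpu.ppage:02X} SP=${cpu.sp:04X}")
```

A CALL and RTC pair uses three bytes of stack, one more than a JSR and RTS pair, because the previous PPAGE value is saved along with the return address.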
Table 5-21. Jump and Subroutine Instructions
Mnemonic  Function                  Operation
BSR       Branch to subroutine      SP – 2 ⇒ SP
                                    RTNH : RTNL ⇒ M(SP) : M(SP+1)
                                    Subroutine address ⇒ PC
CALL      Call subroutine           SP – 2 ⇒ SP
          in expanded memory        RTNH : RTNL ⇒ M(SP) : M(SP+1)
                                    SP – 1 ⇒ SP
                                    (PPAGE) ⇒ M(SP)
                                    Page ⇒ PPAGE
                                    Subroutine address ⇒ PC
JMP       Jump                      Address ⇒ PC
JSR       Jump to subroutine        SP – 2 ⇒ SP
                                    RTNH : RTNL ⇒ M(SP) : M(SP+1)
                                    Subroutine address ⇒ PC
RTC       Return from call          M(SP) ⇒ PPAGE
                                    SP + 1 ⇒ SP
                                    M(SP) : M(SP+1) ⇒ PCH : PCL
                                    SP + 2 ⇒ SP
RTS       Return from subroutine    M(SP) : M(SP+1) ⇒ PCH : PCL
                                    SP + 2 ⇒ SP
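The stack effects listed in Table 5-21 can be traced with a small software model. The Python sketch below is illustrative only: the Cpu class, its field names, and the starting stack pointer are assumptions made for this example, and the model ignores how the page value and subroutine address are fetched from the instruction stream.

```python
class Cpu:
    """Hypothetical model of the registers touched by JSR/RTS and CALL/RTC."""

    def __init__(self, sp=0x0C00, ppage=0x00):
        self.mem = bytearray(0x10000)  # flat 64-Kbyte address space
        self.sp = sp
        self.pc = 0
        self.ppage = ppage

    def _push16(self, value):
        # SP - 2 => SP; high byte to M(SP), low byte to M(SP+1)
        self.sp = (self.sp - 2) & 0xFFFF
        self.mem[self.sp] = (value >> 8) & 0xFF
        self.mem[(self.sp + 1) & 0xFFFF] = value & 0xFF

    def _pull16(self):
        # M(SP):M(SP+1) => value; SP + 2 => SP
        value = (self.mem[self.sp] << 8) | self.mem[(self.sp + 1) & 0xFFFF]
        self.sp = (self.sp + 2) & 0xFFFF
        return value

    def jsr(self, return_address, target):
        self._push16(return_address)
        self.pc = target

    def rts(self):
        self.pc = self._pull16()

    def call(self, return_address, page, target):
        self._push16(return_address)
        self.sp = (self.sp - 1) & 0xFFFF
        self.mem[self.sp] = self.ppage   # old PPAGE is stacked last
        self.ppage = page
        self.pc = target

    def rtc(self):
        self.ppage = self.mem[self.sp]   # PPAGE is unstacked first
        self.sp = (self.sp + 1) & 0xFFFF
        self.pc = self._pull16()
```

In this model a CALL leaves SP three bytes lower than before, and the matching RTC restores SP, PPAGE, and PC, which mirrors the CALL and RTC rows of the table.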
5.22 Interrupt Instructions
Interrupt instructions handle transfer of control to a routine that performs a
critical task. Software interrupts are a type of exception. Section 7.
Exception Processing covers interrupt exception processing in detail.
The software interrupt (SWI) instruction initiates synchronous exception
processing. First, the return PC value is stacked. After CPU context is
stacked, execution continues at the address pointed to by the SWI vector.
Execution of the SWI instruction causes an interrupt without an interrupt
service request. SWI is not inhibited by global mask bits I and X in the CCR,
and execution of SWI sets the I mask bit. Once an SWI interrupt begins,
maskable interrupts are inhibited until the I bit in the CCR is cleared. This
typically occurs when a return from interrupt (RTI) instruction at the end of
the SWI service routine restores context.
The CPU12 uses a variation of the software interrupt for unimplemented
opcode trapping. There are opcodes in all 256 positions in the page 1
opcode map, but only 54 of the 256 positions on page 2 of the opcode map
are used. If the CPU attempts to execute one of the unimplemented
opcodes on page 2, an opcode trap interrupt occurs. Traps are essentially
interrupts that share the $FFF8:$FFF9 interrupt vector.
The RTI instruction is used to terminate all exception handlers, including
interrupt service routines. RTI first restores the CCR, B:A, X, Y, and the
return address from the stack. If no other interrupt is pending, normal
execution resumes with the instruction following the last instruction that
executed prior to the interrupt.
Table 5-22 is a summary of interrupt instructions.
Table 5-22. Interrupt Instructions
Mnemonic  Function              Operation
RTI       Return                (M(SP)) ⇒ CCR; (SP) + $0001 ⇒ SP
          from interrupt        (M(SP) : M(SP+1)) ⇒ B : A; (SP) + $0002 ⇒ SP
                                (M(SP) : M(SP+1)) ⇒ XH : XL; (SP) + $0004 ⇒ SP
                                (M(SP) : M(SP+1)) ⇒ PCH : PCL; (SP) + $0002 ⇒ SP
                                (M(SP) : M(SP+1)) ⇒ YH : YL; (SP) + $0004 ⇒ SP
SWI       Software interrupt    SP – 2 ⇒ SP; RTNH : RTNL ⇒ M(SP) : M(SP+1)
                                SP – 2 ⇒ SP; YH : YL ⇒ M(SP) : M(SP+1)
                                SP – 2 ⇒ SP; XH : XL ⇒ M(SP) : M(SP+1)
                                SP – 2 ⇒ SP; B : A ⇒ M(SP) : M(SP+1)
                                SP – 1 ⇒ SP; CCR ⇒ M(SP)
TRAP      Unimplemented         SP – 2 ⇒ SP; RTNH : RTNL ⇒ M(SP) : M(SP+1)
          opcode interrupt      SP – 2 ⇒ SP; YH : YL ⇒ M(SP) : M(SP+1)
                                SP – 2 ⇒ SP; XH : XL ⇒ M(SP) : M(SP+1)
                                SP – 2 ⇒ SP; B : A ⇒ M(SP) : M(SP+1)
                                SP – 1 ⇒ SP; CCR ⇒ M(SP)
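The stacking sequence shown for SWI and TRAP produces a nine-byte frame. The following Python sketch builds and reads back such a frame; the function names and the dictionary of register values are assumptions made for illustration. For simplicity, unstack_context reads the nine bytes in ascending address order instead of following the individual SP adjustments listed for RTI, and vector fetch and the setting of the I bit are not modeled.

```python
def stack_context(mem, sp, regs):
    """Stack RTN, Y, X, B:A, then CCR, and return the new SP."""

    def push8(sp, value):
        sp = (sp - 1) & 0xFFFF
        mem[sp] = value & 0xFF
        return sp

    def push16(sp, value):
        sp = (sp - 2) & 0xFFFF
        mem[sp] = (value >> 8) & 0xFF            # high byte at M(SP)
        mem[(sp + 1) & 0xFFFF] = value & 0xFF    # low byte at M(SP+1)
        return sp

    sp = push16(sp, regs["pc"])                      # RTNH : RTNL
    sp = push16(sp, regs["y"])                       # YH : YL
    sp = push16(sp, regs["x"])                       # XH : XL
    sp = push16(sp, (regs["b"] << 8) | regs["a"])    # B at M(SP), A at M(SP+1)
    sp = push8(sp, regs["ccr"])
    return sp


def unstack_context(mem, sp):
    """Read the frame back; return the register values and the restored SP."""

    def byte(offset):
        return mem[(sp + offset) & 0xFFFF]

    def word(offset):
        return (byte(offset) << 8) | byte(offset + 1)

    regs = {
        "ccr": byte(0),
        "b": byte(1),
        "a": byte(2),
        "x": word(3),
        "y": word(5),
        "pc": word(7),
    }
    return regs, (sp + 9) & 0xFFFF
```

Stacking from SP = $0C00 in this model leaves SP at $0BF7 with the CCR image at the lowest address of the frame.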
5.23 Index Manipulation Instructions
The index manipulation instructions perform 8- and 16-bit operations on the
three index registers and accumulators, other registers, or memory, as
shown in Table 5-23.
Table 5-23. Index Manipulation Instructions
Mnemonic  Function                         Operation
Addition Instructions
ABX       Add B to X                       (B) + (X) ⇒ X
ABY       Add B to Y                       (B) + (Y) ⇒ Y
Compare Instructions
CPS       Compare SP to memory             (SP) – (M : M + 1)
CPX       Compare X to memory              (X) – (M : M + 1)
CPY       Compare Y to memory              (Y) – (M : M + 1)
Load Instructions
LDS       Load SP from memory              (M : M + 1) ⇒ SP
LDX       Load X from memory               (M : M + 1) ⇒ X
LDY       Load Y from memory               (M : M + 1) ⇒ Y
LEAS      Load effective address into SP   Effective address ⇒ SP
LEAX      Load effective address into X    Effective address ⇒ X
LEAY      Load effective address into Y    Effective address ⇒ Y
Store Instructions
STS       Store SP in memory               (SP) ⇒ M : M + 1
STX       Store X in memory                (X) ⇒ M : M + 1
STY       Store Y in memory                (Y) ⇒ M : M + 1
Transfer Instructions
TFR       Transfer register to register    (A, B, CCR, D, X, Y, or SP)
                                           ⇒ A, B, CCR, D, X, Y, or SP
TSX       Transfer SP to X                 (SP) ⇒ X
TSY       Transfer SP to Y                 (SP) ⇒ Y
TXS       Transfer X to SP                 (X) ⇒ SP
TYS       Transfer Y to SP                 (Y) ⇒ SP
Exchange Instructions
EXG       Exchange register to register    (A, B, CCR, D, X, Y, or SP)
                                           ⇔ (A, B, CCR, D, X, Y, or SP)
XGDX      Exchange D with X                (D) ⇔ (X)
XGDY      Exchange D with Y                (D) ⇔ (Y)
5.24 Stacking Instructions
The two types of stacking instructions are shown in Table 5-24. Stack
pointer instructions use specialized forms of mathematical and data transfer
instructions to perform stack pointer manipulation. Stack operation
instructions save information on and retrieve information from the system
stack.
Table 5-24. Stacking Instructions
Mnemonic  Function                 Operation
Stack Pointer Instructions
CPS       Compare SP to memory     (SP) – (M : M + 1)
DES       Decrement SP             (SP) – 1 ⇒ SP
INS       Increment SP             (SP) + 1 ⇒ SP
LDS       Load SP                  (M : M + 1) ⇒ SP
LEAS      Load effective address   Effective address ⇒ SP
          into SP
STS       Store SP                 (SP) ⇒ M : M + 1
TSX       Transfer SP to X         (SP) ⇒ X
TSY       Transfer SP to Y         (SP) ⇒ Y
TXS       Transfer X to SP         (X) ⇒ SP
TYS       Transfer Y to SP         (Y) ⇒ SP
Stack Operation Instructions
PSHA      Push A                   (SP) – 1 ⇒ SP; (A) ⇒ M(SP)
PSHB      Push B                   (SP) – 1 ⇒ SP; (B) ⇒ M(SP)
PSHC      Push CCR                 (SP) – 1 ⇒ SP; (CCR) ⇒ M(SP)
PSHD      Push D                   (SP) – 2 ⇒ SP; (A : B) ⇒ M(SP) : M(SP+1)
PSHX      Push X                   (SP) – 2 ⇒ SP; (X) ⇒ M(SP) : M(SP+1)
PSHY      Push Y                   (SP) – 2 ⇒ SP; (Y) ⇒ M(SP) : M(SP+1)
PULA      Pull A                   (M(SP)) ⇒ A; (SP) + 1 ⇒ SP
PULB      Pull B                   (M(SP)) ⇒ B; (SP) + 1 ⇒ SP
PULC      Pull CCR                 (M(SP)) ⇒ CCR; (SP) + 1 ⇒ SP
PULD      Pull D                   (M(SP) : M(SP+1)) ⇒ A : B; (SP) + 2 ⇒ SP
PULX      Pull X                   (M(SP) : M(SP+1)) ⇒ X; (SP) + 2 ⇒ SP
PULY      Pull Y                   (M(SP) : M(SP+1)) ⇒ Y; (SP) + 2 ⇒ SP
5.25 Pointer and Index Calculation Instructions
The load effective address instructions allow 5-, 8-, or 16-bit constants or the
contents of 8-bit accumulators A and B or 16-bit accumulator D to be added
to the contents of the X and Y index registers, or to the SP.
Table 5-25 is a summary of pointer and index instructions.
Table 5-25. Pointer and Index Calculation Instructions
Mnemonic  Function                                 Operation
LEAS      Load result of indexed addressing mode   r ± constant ⇒ SP or
          effective address calculation            (r) + (accumulator) ⇒ SP
          into stack pointer                       r = X, Y, SP, or PC
LEAX      Load result of indexed addressing mode   r ± constant ⇒ X or
          effective address calculation            (r) + (accumulator) ⇒ X
          into X index register                    r = X, Y, SP, or PC
LEAY      Load result of indexed addressing mode   r ± constant ⇒ Y or
          effective address calculation            (r) + (accumulator) ⇒ Y
          into Y index register                    r = X, Y, SP, or PC
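The calculation in Table 5-25 amounts to a 16-bit addition whose result is written to a register instead of being used to access memory. A minimal Python sketch follows; the function name lea is an assumption, and the sketch assumes that the result wraps within the 64-Kbyte address space.

```python
def lea(base, offset):
    """Return base + offset as a 16-bit effective address.

    base is the value of X, Y, SP, or PC; offset is a constant or the
    value of an accumulator. The result is truncated to 16 bits.
    """
    return (base + offset) & 0xFFFF
```

For example, lea(0x0C00, -4) returns 0x0BFC, which is the value a form such as LEAS –4,SP would load into SP when reserving four bytes on the stack.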
5.26 Condition Code Instructions
Condition code instructions are special forms of mathematical and data
transfer instructions that can be used to change the condition code register.
Table 5-26 shows instructions that can be used to manipulate the CCR.
Table 5-26. Condition Code Instructions
Mnemonic  Function                      Operation
ANDCC     Logical AND CCR with memory   (CCR) • (M) ⇒ CCR
CLC       Clear C bit                   0 ⇒ C
CLI       Clear I bit                   0 ⇒ I
CLV       Clear V bit                   0 ⇒ V
ORCC      Logical OR CCR with memory    (CCR) + (M) ⇒ CCR
PSHC      Push CCR onto stack           (SP) – 1 ⇒ SP; (CCR) ⇒ M(SP)
PULC      Pull CCR from stack           (M(SP)) ⇒ CCR; (SP) + 1 ⇒ SP
SEC       Set C bit                     1 ⇒ C
SEI       Set I bit                     1 ⇒ I
SEV       Set V bit                     1 ⇒ V
TAP       Transfer A to CCR             (A) ⇒ CCR
TPA       Transfer CCR to A             (CCR) ⇒ A
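The ANDCC and ORCC operations in Table 5-26 are plain bitwise mask operations on the CCR. The Python sketch below shows the mask arithmetic, assuming the bit order S X H I N Z V C from bit 7 down to bit 0. The names CCR_BITS, andcc, and orcc are illustrative, and the sketch does not model any hardware restrictions on changing individual bits.

```python
# Assumed bit positions, most significant bit first: S X H I N Z V C
CCR_BITS = {
    "S": 0x80, "X": 0x40, "H": 0x20, "I": 0x10,
    "N": 0x08, "Z": 0x04, "V": 0x02, "C": 0x01,
}


def andcc(ccr, mask):
    """(CCR) AND (M) => CCR: bits that are 0 in mask are cleared."""
    return ccr & mask & 0xFF


def orcc(ccr, mask):
    """(CCR) OR (M) => CCR: bits that are 1 in mask are set."""
    return (ccr | mask) & 0xFF
```

Clearing the C bit corresponds to andcc(ccr, 0xFE), and setting the I bit corresponds to orcc(ccr, 0x10).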
5.27 Stop and Wait Instructions
As shown in Table 5-27, two instructions put the CPU12 in an inactive state
that reduces power consumption.
The stop instruction (STOP) stacks a return address and the contents of
CPU registers and accumulators, then halts all system clocks.
The wait instruction (WAI) stacks a return address and the contents of CPU
registers and accumulators, then waits for an interrupt service request;
however, system clock signals continue to run.
Both STOP and WAI require that either an interrupt or a reset exception
occur before normal execution of instructions resumes. Although both
instructions require the same number of clock cycles to resume normal
program execution after an interrupt service request is made, restarting
after a STOP requires extra time for the oscillator to reach operating speed.
Table 5-27. Stop and Wait Instructions
Mnemonic  Function             Operation
STOP      Stop                 SP – 2 ⇒ SP; RTNH : RTNL ⇒ M(SP) : M(SP+1)
                               SP – 2 ⇒ SP; YH : YL ⇒ M(SP) : M(SP+1)
                               SP – 2 ⇒ SP; XH : XL ⇒ M(SP) : M(SP+1)
                               SP – 2 ⇒ SP; B : A ⇒ M(SP) : M(SP+1)
                               SP – 1 ⇒ SP; CCR ⇒ M(SP)
                               Stop CPU clocks
WAI       Wait for interrupt   SP – 2 ⇒ SP; RTNH : RTNL ⇒ M(SP) : M(SP+1)
                               SP – 2 ⇒ SP; YH : YL ⇒ M(SP) : M(SP+1)
                               SP – 2 ⇒ SP; XH : XL ⇒ M(SP) : M(SP+1)
                               SP – 2 ⇒ SP; B : A ⇒ M(SP) : M(SP+1)
                               SP – 1 ⇒ SP; CCR ⇒ M(SP)
5.28 Background Mode and Null Operations
Background debug mode (BDM) is a special CPU12 operating mode that is
used for system development and debugging. Executing enter background
debug mode (BGND) when BDM is enabled puts the CPU12 in this mode.
For complete information, refer to Section 8. Instruction Queue.
Null operations are often used to replace other instructions during software
debugging. Replacing conditional branch instructions with branch never
(BRN), for instance, permits testing a decision-making routine by disabling
the conditional branch without disturbing the offset value.
Null operations can also be used in software delay programs to consume
execution time without disturbing the contents of other CPU registers or
memory.
Table 5-28 shows the BGND and null operation (NOP) instructions.
Table 5-28. Background Mode and Null Operation Instructions
Mnemonic  Function                      Operation
BGND      Enter background debug mode   If BDM enabled, enter BDM;
                                        else resume normal processing
BRN       Branch never                  Does not branch
LBRN      Long branch never             Does not branch
NOP       Null operation                —
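Replacing a conditional branch with BRN changes only the opcode byte, so the offset byte that follows stays in place. The Python sketch below patches a short branch in a copy of the object code. The opcode values $27 (BEQ) and $21 (BRN) are taken from the standard opcode map and should be checked against the glossary entries; the function name is an assumption.

```python
BEQ_OPCODE = 0x27  # short branch if equal (assumed opcode value)
BRN_OPCODE = 0x21  # short branch never (assumed opcode value)


def disable_short_branch(code, index):
    """Return a copy of code with the opcode at index replaced by BRN.

    The relative offset byte at index + 1 is left untouched, so the
    original branch can be restored later by writing its opcode back.
    """
    patched = bytearray(code)
    patched[index] = BRN_OPCODE
    return bytes(patched)
```

Applied to the two bytes 27 05, the function returns 21 05: the branch is never taken, but the offset $05 is preserved.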
Section 6. Instruction Glossary
6.1 Introduction
This section is a comprehensive reference to the CPU12 instruction set.
6.2 Glossary Information
The glossary contains an entry for each assembler mnemonic, in alphabetic
order. Figure 6-1 is a representation of a glossary page.
Each entry contains symbolic and textual descriptions of operation,
information concerning the effect of operation on status bits in the condition
code register, and a table that describes assembler syntax, address mode
variations, and cycle-by-cycle execution of the instruction.
6.3 Condition Code Changes
The following special characters are used to describe the effects of
instruction execution on the status bits in the condition code register.
– — Status bit not affected by operation
0 — Status bit cleared by operation
1 — Status bit set by operation
∆ — Status bit affected by operation
⇓ — Status bit may be cleared or remain set, but is not set
by operation.
⇑ — Status bit may be set or remain cleared, but is not
cleared by operation.
? — Status bit may be changed by operation, but the final
state is not defined.
! — Status bit used for a special purpose
6.4 Object Code Notation
The digits 0 to 9 and the uppercase letters A to F are used to express
hexadecimal values. Pairs of lowercase letters represent the 8-bit values as
described here.
dd — 8-bit direct address $0000 to $00FF; high byte
assumed to be $00
ee — High-order byte of a 16-bit constant offset for indexed
addressing
eb — Exchange/transfer post-byte
ff — Low-order eight bits of a 9-bit signed constant offset
for indexed addressing, or low-order byte of a 16-bit
constant offset for indexed addressing
hh — High-order byte of a 16-bit extended address
ii — 8-bit immediate data value
jj — High-order byte of a 16-bit immediate data value
kk — Low-order byte of a 16-bit immediate data value
lb — Loop primitive (DBNE) post-byte
ll — Low-order byte of a 16-bit extended address
mm — 8-bit immediate mask value for bit manipulation
instructions; set bits indicate bits to be affected
pg — Program overlay page (bank) number used in CALL
instruction
qq — High-order byte of a 16-bit relative offset for long
branches
tn — Trap number $30–$39 or $40–$FF
rr — Signed relative offset $80 (–128) to $7F (+127)
offset relative to the byte following the relative offset
byte, or low-order byte of a 16-bit relative offset for
long branches
xb — Indexed addressing post-byte
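As an illustration of this notation, the Python sketch below assembles the byte sequences for an opcode followed by a 16-bit extended address (hh ll) or by an 8-bit direct address (dd). The function names are assumptions, and the opcode is passed in by the caller because the sketch does not contain an opcode table.

```python
def encode_extended(opcode, address):
    """Return opcode, hh, ll for a 16-bit extended address."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("extended address must fit in 16 bits")
    hh = (address >> 8) & 0xFF   # high-order byte
    ll = address & 0xFF          # low-order byte
    return bytes([opcode, hh, ll])


def encode_direct(opcode, address):
    """Return opcode, dd for a direct address in $0000 to $00FF."""
    if not 0 <= address <= 0x00FF:
        raise ValueError("direct address must be in $0000 to $00FF")
    return bytes([opcode, address])
```

The same high-byte, low-byte split applies to the jj kk pair of a 16-bit immediate value.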
6.5 Source Forms
The glossary pages provide only essential information about assembler
source forms. Assemblers generally support a number of assembler
directives, allow definition of program labels, and have special conventions
for comments. For complete information about writing source files for a
particular assembler, refer to the documentation provided by the assembler
vendor.
Assemblers are typically flexible about the use of spaces and tabs. Often,
any number of spaces or tabs can be used where a single space is shown
on the glossary pages. Spaces and tabs are also normally allowed before
and after commas. When program labels are used, there must also be at
least one tab or space before all instruction mnemonics. This required
space is not apparent in the source forms.
Everything in the source forms columns, except expressions in italic
characters, is literal information which must appear in the assembly source
file exactly as shown. The initial 3- to 5-letter mnemonic is always a literal
expression. All commas, pound signs (#), parentheses, square brackets ( [
or ] ), plus signs (+), minus signs (–), and the register designation D (as in
[D,... ), are literal characters.
Groups of italic characters in the columns represent variable information to
be supplied by the programmer. These groups can include any
alphanumeric character or the underscore character, but cannot include a
space or comma. For example, the groups xysp and oprx0_xysp are both
valid, but the two groups oprx0 xysp are not valid because there is a space
between them. Permitted syntax is described here.
The definition of a legal label or expression varies from assembler to
assembler. Assemblers also vary in the way CPU registers are specified.
Refer to assembler documentation for detailed information. Recommended
register designators are a, A, b, B, ccr, CCR, d, D, x, X, y, Y, sp, SP, pc, and
PC.
abc — Any one legal register designator for accumulators A or
B or the CCR
abcdxys — Any one legal register designator for accumulators A or
B, the CCR, the double accumulator D, index registers X
or Y, or the SP. Some assemblers may accept t2, T2, t3,
or T3 codes in certain cases of transfer and exchange
instructions, but these forms are intended for Freescale
use only.
abd — Any one legal register designator for accumulators A or
B or the double accumulator D
abdxys — Any one legal register designator for accumulators A or
B, the double accumulator D, index register X or Y, or the
SP
dxys — Any one legal register designation for the double
accumulator D, index registers X or Y, or the SP
msk8 — Any label or expression that evaluates to an 8-bit value.
Some assemblers require a # symbol before this value.
opr8i — Any label or expression that evaluates to an 8-bit
immediate value
opr16i — Any label or expression that evaluates to a 16-bit
immediate value
opr8a — Any label or expression that evaluates to an 8-bit value.
The instruction treats this 8-bit value as the low-order 8
bits of an address in the direct page of the 64-Kbyte
address space ($00xx).
opr16a — Any label or expression that evaluates to a 16-bit value.
The instruction treats this value as an address in the
64-Kbyte address space.
oprx0_xysp — This word breaks down into one of the following
alternative forms that assemble to an 8-bit indexed
addressing postbyte code. These forms generate the
same object code except for the value of the postbyte
code, which is designated as xb in the object code
columns of the glossary pages. As with the source
forms, treat all commas, plus signs, and minus signs as
literal syntax elements. The italicized words used in
these forms are included in this key.
oprx3 — Any label or expression that evaluates to a value in the
range +1 to +8
oprx5 — Any label or expression that evaluates to a 5-bit value in
the range –16 to +15
oprx9 — Any label or expression that evaluates to a 9-bit value in
the range –256 to +255
oprx16 — Any label or expression that evaluates to a 16-bit value.
Since the CPU12 has a 16-bit address bus, this can be
either a signed or an unsigned value.
page — Any label or expression that evaluates to an 8-bit value.
The CPU12 recognizes up to an 8-bit page value for
memory expansion but not all MCUs that include the
CPU12 implement all of these bits. It is the
programmer’s responsibility to limit the page value to
legal values for the intended MCU system. Some
assemblers require a # symbol before this value.
rel8 — Any label or expression that refers to an address that is
within –128 to +127 locations from the next address after
the last byte of object code for the current instruction.
The assembler will calculate the 8-bit signed offset and
include it in the object code for this instruction.
rel9 — Any label or expression that refers to an address that is
within –256 to +255 locations from the next address after
the last byte of object code for the current instruction.
The assembler will calculate the 9-bit signed offset and
include it in the object code for this instruction. The sign
bit for this 9-bit value is encoded by the assembler as a
bit in the looping postbyte (lb) of one of the loop control
instructions DBEQ, DBNE, IBEQ, IBNE, TBEQ, or
TBNE. The remaining eight bits of the offset are included
as an extra byte of object code.
rel16 — Any label or expression that refers to an address
anywhere in the 64-Kbyte address space. The
assembler will calculate the 16-bit signed offset between
this address and the next address after the last byte of
object code for this instruction and include it in the object
code for this instruction.
trapnum — Any label or expression that evaluates to an 8-bit
number in the range $30–$39 or $40–$FF. Used for
TRAP instruction.
xys — Any one legal register designation for index registers X
or Y or the SP
xysp — Any one legal register designation for index registers X
or Y, the SP, or the PC. The reference point for
PC-relative instructions is the next address after the last
byte of object code for the current instruction.
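The rel8, rel9, and rel16 definitions all measure the offset from the next address after the last byte of the current instruction. The Python sketch below shows how an assembler might compute the encoded values; the function names are assumptions, and the position of the rel9 sign bit within the lb postbyte is deliberately left out.

```python
def signed_offset(next_address, target, bits):
    """Return target - next_address, checked against a signed range."""
    offset = target - next_address
    low, high = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    if not low <= offset <= high:
        raise ValueError(f"offset {offset} does not fit in {bits} signed bits")
    return offset


def encode_rel8(next_address, target):
    """Return the rr byte for a short branch (two's complement)."""
    return signed_offset(next_address, target, 8) & 0xFF


def encode_rel9(next_address, target):
    """Return (sign bit for the lb postbyte, low-order eight bits)."""
    offset = signed_offset(next_address, target, 9)
    return (1 if offset < 0 else 0), offset & 0xFF


def encode_rel16(next_address, target):
    """Return (qq, rr); the offset wraps within the 64-Kbyte space."""
    offset = (target - next_address) & 0xFFFF
    return offset >> 8, offset & 0xFF
```

A short branch whose following instruction starts at $4002 and whose target is $4000 has an offset of –2, so encode_rel8 returns $FE.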
6.6 Cycle-by-Cycle Execution
This information is found in the tables at the bottom of each instruction
glossary page. Entries show how many bytes of information are accessed
from different areas of memory during the course of instruction execution.
With this information and knowledge of the type and speed of memory in the
system, a user can determine the execution time for any instruction in any
system.
A single-letter code in the column represents a single CPU cycle. Uppercase
letters indicate 16-bit access cycles. There are cycle codes for each
addressing mode variation of each instruction. Simply count code letters to
determine the execution time of an instruction in a best-case system. An
example of a best-case system is a single-chip 16-bit system with no 16-bit
off-boundary data accesses to any locations other than on-chip RAM.
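Counting code letters can be expressed directly in code. The Python sketch below counts the letters of one access detail string and converts the count to time for a given bus clock; the function names and the 8-MHz clock used in the example are assumptions, and no stretching is modeled.

```python
def best_case_cycles(access_detail):
    """Count cycle-code letters; each letter stands for one CPU cycle."""
    return sum(1 for ch in access_detail if ch.isalpha())


def best_case_time_ns(access_detail, bus_clock_hz):
    """Return the best-case execution time in nanoseconds."""
    return best_case_cycles(access_detail) * 1e9 / bus_clock_hz
```

An access detail of OO counts as two cycles, which takes 250 ns at a hypothetical 8-MHz bus clock.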
Many conditions can cause one or more instruction cycles to be stretched,
but the CPU is not aware of the stretch delays because the clock to the CPU
is temporarily stopped during these delays.
The following paragraphs explain the cycle code letters used and note
conditions that can cause each type of cycle to be stretched.
f — Free cycle. This indicates a cycle where the CPU
does not require use of the system buses. An f cycle
is always one cycle of the system bus clock. These
cycles can be used by a queue controller or the
background debug system to perform single cycle
accesses without disturbing the CPU.
g — Read 8-bit PPAGE register. These cycles are used
only with the CALL instruction to read the current
value of the PPAGE register and are not visible on the
external bus. Since the PPAGE register is an internal
8-bit register, these cycles are never stretched.
I — Read indirect pointer. Indexed indirect instructions
use this 16-bit pointer from memory to address the
operand for the instruction. These are always 16-bit
reads but they can be either aligned or misaligned.
These cycles are extended to two bus cycles if the
MCU is operating with an 8-bit external data bus and
the corresponding data is stored in external memory.
There can be additional stretching when the address
space is assigned to a chip-select circuit programmed
for slow memory. These cycles are also stretched if
they correspond to misaligned access to a memory
that is not designed for single-cycle misaligned
access.
i — Read indirect PPAGE value. These cycles are only
used with indexed indirect versions of the CALL
instruction, where the 8-bit value for the memory
expansion page register of the CALL destination is
fetched from an indirect memory location. These
cycles are stretched only when controlled by a
chip-select circuit that is programmed for slow
memory.
n — Write 8-bit PPAGE register. These cycles are used
only with the CALL and RTC instructions to write the
destination value of the PPAGE register and are not
visible on the external bus. Since the PPAGE register
is an internal 8-bit register, these cycles are never
stretched.
O — Optional cycle. Program information is always fetched
as aligned 16-bit words. When an instruction consists
of an odd number of bytes, and the first byte is
misaligned, an O cycle is used to make an additional
program word access (P) cycle that maintains queue
order. In all other cases, the O cycle appears as a free
(f) cycle. The $18 prebyte for page two opcodes is
treated as a special 1-byte instruction. If the prebyte is
misaligned, the O cycle is used as a program word
access for the prebyte; if the prebyte is aligned, the O
cycle appears as a free cycle. If the remainder of the
instruction consists of an odd number of bytes,
another O cycle is required some time before the
instruction is completed. If the O cycle for the prebyte
is treated as a P cycle, any subsequent O cycle in the
same instruction is treated as an f cycle; if the O cycle
for the prebyte is treated as an f cycle, any
subsequent O cycle in the same instruction is treated
as a P cycle. Optional cycles used for program word
accesses can be extended to two bus cycles if the
MCU is operating with an 8-bit external data bus and
the program is stored in external memory. There can
be additional stretching when the address space is
assigned to a chip-select circuit programmed for slow
memory. Optional cycles used as free cycles are
never stretched.
P — Program word access. Program information is fetched
as aligned 16-bit words. These cycles are extended to
two bus cycles if the MCU is operating with an 8-bit
external data bus and the program is stored
externally. There can be additional stretching when
the address space is assigned to a chip-select circuit
programmed for slow memory.
r — 8-bit data read. These cycles are stretched only when
controlled by a chip-select circuit programmed for
slow memory.
R — 16-bit data read. These cycles are extended to two
bus cycles if the MCU is operating with an 8-bit
external data bus and the corresponding data is
stored in external memory. There can be additional
stretching when the address space is assigned to a
chip-select circuit programmed for slow memory.
These cycles are also stretched if they correspond to
misaligned accesses to memory that is not designed
for single-cycle misaligned access.
s — Stack 8-bit data. These cycles are stretched only
when controlled by a chip-select circuit programmed
for slow memory.
S — Stack 16-bit data. These cycles are extended to two
bus cycles if the MCU is operating with an 8-bit
external data bus and the SP is pointing to external
memory. There can be additional stretching if the
address space is assigned to a chip-select circuit
programmed for slow memory. These cycles are also
stretched if they correspond to misaligned accesses
to a memory that is not designed for single cycle
misaligned access. The internal RAM is designed to
allow single cycle misaligned word access.
w — 8-bit data write. These cycles are stretched only when
controlled by a chip-select circuit programmed for
slow memory.
W — 16-bit data write. These cycles are extended to two
bus cycles if the MCU is operating with an 8-bit
external data bus and the corresponding data is
stored in external memory. There can be additional
stretching when the address space is assigned to a
chip-select circuit programmed for slow memory.
These cycles are also stretched if they correspond to
misaligned access to a memory that is not designed
for single-cycle misaligned access.
u — Unstack 8-bit data. These cycles are stretched only
when controlled by a chip-select circuit programmed
for slow memory.
U — Unstack 16-bit data. These cycles are extended to two
bus cycles if the MCU is operating with an 8-bit
external data bus and the SP is pointing to external
memory. There can be additional stretching when the
address space is assigned to a chip-select circuit
programmed for slow memory. These cycles are also
stretched if they correspond to misaligned accesses
to a memory that is not designed for single-cycle
misaligned access. The internal RAM is designed to
allow single-cycle misaligned word access.
V — Vector fetch. Vectors are always aligned 16-bit words.
These cycles are extended to two bus cycles if the
MCU is operating with an 8-bit external data bus and
the program is stored in external memory. There can
be additional stretching when the address space is
assigned to a chip-select circuit programmed for slow
memory.
t — 8-bit conditional read. These cycles are either data
read cycles or unused cycles, depending on the data
and flow of the REVW instruction. These cycles are
stretched only when controlled by a chip-select circuit
programmed for slow memory.
T — 16-bit conditional read. These cycles are either data
read cycles or free cycles, depending on the data and
flow of the REV or REVW instruction. These cycles
are extended to two bus cycles if the MCU is operating
with an 8-bit external data bus and the corresponding
data is stored in external memory. There can be
additional stretching when the address space is
assigned to a chip-select circuit programmed for slow
memory. These cycles are also stretched if they
correspond to misaligned accesses to a memory that
is not designed for single-cycle misaligned access.
x — 8-bit conditional write. These cycles are either data
write cycles or free cycles, depending on the data and
flow of the REV or REVW instruction. These cycles
are only stretched when controlled by a chip-select
circuit programmed for slow memory.
PPP/P — Short branches require three cycles if taken, one cycle
if not taken. Since the instruction consists of a single
word containing both an opcode and an 8-bit offset,
the not-taken case is simple — the queue advances,
another program word fetch is made, and execution
continues with the next instruction. The taken case
requires that the queue be refilled so that execution
can continue at a new address. First, the effective
address of the destination is determined, then the
CPU performs three program word fetches from that
address.
OPPP/OPO — Long branches require four cycles if taken, three
cycles if not taken. Optional cycles are required
because all long branches are page two opcodes, and
thus include the $18 prebyte. The CPU12 treats the
prebyte as a special 1-byte instruction. If the prebyte
is misaligned, the optional cycle is used to perform a
program word access; if the prebyte is aligned, the
optional cycle is used to perform a free cycle. As a
result, both the taken and not-taken cases use one
optional cycle for the prebyte. In the not-taken case,
the queue must advance so that execution can
continue with the next instruction, and another
optional cycle is required to maintain the queue. The
taken case requires that the queue be refilled so that
execution can continue at a new address. First, the
effective address of the destination is determined,
then the CPU performs three program word fetches
from that address.
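The rules for optional cycles in long branches can be sketched as a small function. The Python code below resolves the OPPP and OPO cycle strings into actual P and f cycles from the address of the $18 prebyte. It assumes that an aligned byte is one at an even address, and the function name is hypothetical.

```python
def resolve_long_branch(prebyte_address, taken):
    """Resolve OPPP (taken) or OPO (not taken) into P and f cycles."""
    # A misaligned prebyte turns the first O cycle into a program
    # word access; an aligned prebyte turns it into a free cycle.
    first = "P" if prebyte_address % 2 == 1 else "f"
    if taken:
        # The queue is refilled with three program word fetches.
        return first + "PPP"
    # Not taken: the second O cycle is the opposite of the first.
    second = "f" if first == "P" else "P"
    return first + "P" + second
```

For a not-taken long branch whose prebyte is at an even address, the result is fPP; at an odd address, it is PPf. Either way, the cycle count is three, as stated for the not-taken case.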
6.7 Glossary
This subsection contains an entry for each assembler mnemonic, in
alphabetic order.
ABA            Add Accumulator B to Accumulator A            ABA
Operation: (A) + (B) ⇒ A
Description: Adds the content of accumulator B to the content of accumulator A and
places the result in A. The content of B is not changed. This instruction
affects the H status bit so it is suitable for use in BCD arithmetic operations.
See DAA instruction for additional information.
CCR Details:
S X H I N Z V C
– – ∆ – ∆ ∆ ∆ ∆
H: A3 • B3 + B3 • R̄3 + R̄3 • A3
   Set if there was a carry from bit 3; cleared otherwise
N: Set if MSB of result is set; cleared otherwise
Z: Set if result is $00; cleared otherwise
V: A7 • B7 • R̄7 + Ā7 • B̄7 • R7
   Set if a two’s complement overflow resulted from the operation;
   cleared otherwise
C: A7 • B7 + B7 • R̄7 + R̄7 • A7
   Set if there was a carry from the MSB of the result; cleared
   otherwise
Source Form   Address Mode   Object Code   Access Detail
                                           HCS12   M68HC12
ABA           INH            18 06         OO      OO
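The condition code equations for ABA can be checked with a short model. The Python sketch below evaluates H, N, Z, V, and C from the operand and result bits. It assumes the complemented result terms that the usual carry and overflow equations require for an 8-bit addition; the function name and the flag dictionary are illustrative.

```python
def aba(a, b):
    """Return (result, flags) for (A) + (B) => A on 8-bit operands."""
    r = (a + b) & 0xFF
    a3, b3, r3 = (a >> 3) & 1, (b >> 3) & 1, (r >> 3) & 1
    a7, b7, r7 = (a >> 7) & 1, (b >> 7) & 1, (r >> 7) & 1
    not_r3, not_r7 = r3 ^ 1, r7 ^ 1
    flags = {
        # Carry out of bit 3 (half carry, used for BCD adjustment)
        "H": (a3 & b3) | (b3 & not_r3) | (not_r3 & a3),
        "N": r7,
        "Z": int(r == 0),
        # Operands share a sign and the result has the other sign
        "V": (a7 & b7 & not_r7) | ((a7 ^ 1) & (b7 ^ 1) & r7),
        # Carry out of bit 7
        "C": (a7 & b7) | (b7 & not_r7) | (not_r7 & a7),
    }
    return r, flags
```

For example, adding $7F and $01 gives $80 with N, V, and H set and C clear, while adding $80 and $80 gives $00 with Z, V, and C set.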