SGS Thomson Microelectronics ST20C1ISM Datasheet

1/205
ST20-C1 Core
Instruction Set
Reference Manual
72-TRN-274-01 July1997
Contents
2/205
Contents
1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4
1.1 ST20-C1 features . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .4
1.2 Manual structure. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . .5
2 Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6
2.1 Instruction listings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .6
2.2 Instruction definitions . . . . . . . . . . . . . . . . . . . . . . . . . . . ..............8
2.3 Operators used in the definitions .. . . . . . . . . . . . . . . . . . . . . . ........11
2.4 Data structures and constants . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .13
3 Architecture . . . . . . . . ...........................................16
3.1 Values. . . . . . . . . . . . . . ......................................16
3.2 Memory. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .......18
3.3 Registers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .20
3.4 Instruction encoding . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .24
4 Using ST20-C1 instructions ......................................30
4.1 Manipulating the evaluation stack. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .30
4.2 Loading and storing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .31
4.3 Expression evaluation.. . . . . . . . . ..............................33
4.4 Arithmetic . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .35
4.5 Forming addresses . . . . . . . . . . . . . . . . . .........................41
4.6 Comparisons and jumps . . . ...................................43
4.7 Evaluation of boolean expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .45
4.8 Bitwise logic and bit operations. . . . . . . ..........................47
4.9 Shifting and byte swapping. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .48
4.10 Function and procedure calls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .48
4.11 Peripherals and I/O. . . . . . . . . . ................................52
4.12 Status register. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .55
5 Multiply accumulate . . . . . .......................................56
5.1 Data formats. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .56
5.2 mac and umac . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ..........56
5.3 Short multiply accumulate loop. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .56
5.4 Biquad IIR filter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .58
5.5 Data vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .59
5.6 Scaling . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . .59
5.7 Data formats. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .65
3/205
Contents
6 Exceptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .70
6.1 Exception levels . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .71
6.2 Exception vector table. . . .....................................72
6.3 Exception control block and the saved state. . . ....................73
6.4 Initial exception handler state . .................................74
6.5 Restrictions on exception handlers. . . . . . . . . . . . . . . . . . . . . . . . . . . . . .75
6.6 Interrupts. . . . . . . . . . . . . . . . . . . . . . . . . . ........................75
6.7 Traps. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .76
6.8 Setting up the exception handler . . . . . . . . . . . . ...................76
7 Multi-tasking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .78
7.1 Processes . . . . . . . . . . . . . . . . . . . . .............................78
7.2 Descheduled processes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .79
7.3 Queues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .80
7.4 Timeslicing ................................................81
7.5 Inactive processes . . . . . . . . . . ................................82
7.6 Descheduled process state. . . . . . . . . . . . . . .. .. . . . . . . . . .. .. . . . . . .82
7.7 Initializing multi-tasking. . . . . .. . . . . . . . . . . . . . . ..................83
7.8 Scheduling kernels . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. .. . . . . . . . . . .84
7.9 Semaphores . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ...........84
7.10 Sleep. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .85
8 Instruction Set Reference. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .87
Appendices . ...........................................168
A Constants and data structures. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .169
B Instruction set summary. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .174
C Compiling for the ST20-C1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .178
D Glossary. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .187
1.1 ST20-C1 features
4/205
1 Introduction
This manual provides a summary and reference to the ST20 architecture and instruc­tion set for the ST20-C1 core.
ST20 is a technology for building successful embedded VLSI designs. ST20 devices comprise a collection of VLSI macro-cells connected through a high-performance on­chip bus. This architecture allowsthe easy construction of both general purpose (e.g. ST20-MC1 micro-controller) and application specific devices (e.g. ST20-TPx digital set top boxfamily).
The ST20 macro-cell library includes CPU micro-cores, on-chip memories and a wide range of digital and analogue I/O devices. SGS-THOMSON offers a range of ST20 CPU micro-cores, allowing the best cost vs. performance trade-off to be achieved in each application area. This manualdescribes the ST20-C1 CPU micro-core.
ST20 devices are available from SGS-THOMSON and licensed second source vendors.
1.1 ST20-C1 features
The ST20-C1 has the followingfeatures:
It is implemented as a 2-waysuperscalar,3-stage pipeline,with an internal 16­word register cache. This architecture can sustain 4 instructions in progress, with a maximum of 2 instructions completing per cycle.
It uses a variable length instruction coding scheme based on 8-bit units which gives excellent static and dynamic code size. Instructions take between 1 and 8 units to code, with an average of 1.25 units (10 bits) per instruction.
It provides flexible prioritized vectored interrupt capabilities. The worst case interrupt latency is 0.5 microseconds (at 33 Mhz operatingfrequency).
It provides extensive instruction level support for 16-bit digital signal process­ing (DSP) algorithms.
It is particularly suitable for low power and battery-powered applications, with low core operating power,and sophisticated power management facilities.
It provides extensivereal-time debugging capability through the optional ST20 diagnostic controller unit (DCU) macro-cell, which supports fully non-intrusive breakpoints, watchpoints and code tracing.
It has a flexibleand powerful built-in hardware scheduler.This is a light-weight
real-time operating system
(RTOS) directly implemented in the microcode of the ST20-C1 processor. The hardware scheduler can be customized and pro­vides support for software schedulers.
It provides a built-in user-programmable32-bit input/output register providing system control and communication capability directly from the CPU.
5/205
1 Introduction
1.2 Manual structure
The manual is divided into the following chapters:
1 This introduction chapter,which explains the structure of the book; 2 A notation chapter (Chapter 2) which explains the layoutand notation conven-
tions used in the instruction definitionsand elsewhere;
3 An architecture chapter (Chapter 3), which explainsthe structure of the ST20-
C1 core, the registers, memory addressing, the format of the instructions and the exception handling and process models;
4 Four chapters on using the instructions and howthe instructions can be used
to achievecertain useful outcomes: Chapter 4 on the generalinstructions; Chapter 5 on multiply-accumulate;Chapter 6 on interrupts and traps; and Chapter 7 on processes and support formulti-tasking.
5 An alphabetical listing of the instructions, one to a page (Chapter 8). Descrip-
tions and formal definitionsare presented in astandard format with the instruc­tion mnemonic and full name of the instruction at the top of the page.The notation used is explained in detail in Chapter 2.
In addition there are appendices listing constants and structures, covering issues related to compiling for a ST20-C1 core and listing the instruction set plus a glossary for ST20-C1 terminology.
2.1 Instruction listings
6/205
2 Notation
This chapter describes the notation used throughout this manual, including the meaning of the instruction listings and the meanings and valuesof constants.
2.1 Instruction listings
The instructions are listed in alphabetical order, one to a page. Descriptions are presented in a standard format with the instruction mnemonic and full name of the instruction at the top of the page,followedby these categories of information:
Code: the instruction code;
Description: a brief summary of the purpose and behavior of the instruction;
Definition: a more formal and complete description of the instruction, using
the notation described belowin section 2.2;
Status Register: a list of errors and other changes to the Status Register which can occur;
Comments: a list of other important features of the instruction;
See also: cross referencesare providedto other instructions with related func-
tions.
These categories are explained in more detail below, using the
and
instruction as an
example.
2.1.1 Instruction name
The header at the top of each page shows the instruction mnemonic and, on the right, the full name of the instruction. For primary instructions the mnemonic is followed by ‘n’ to indicate the operand to the instruction; the same notation is used in the descrip­tion to show how the operand is used. An explanation of the primary and secondary instruction formats is givenin section 3.4.
2.1.2 Code
The code of the instruction is the value that would appear in memory to represent the instruction.
For secondary instructions the instruction ‘operation code’ is shown as the memory code — the actual bytes, including any prefixes, which are stored in memory. The value is given as a sequence of bytesin hexadecimal, decoded left to right. The codes are stored in memory in ‘little-endian’ format, with the first byte at the lowest address.
For example,the entry for the
and
instruction is:
Code:F9
This means that the hexadecimalbyte value F9 would appear in memory foran
and
.
7/205
2 Notation
For primary instructions the code stored in memory is determined partly by the value of the operand to the instruction. In this case the op-code is shown as ‘Functionx’ wherexis the function code in the last byte of the instruction. For example,
adc(add
constant
) is shownas
Code: Function 8
This means that
adc 1
would appear in memory as the hexadecimal byte value 81. For
an operandnin the range 0 to 15,
adc n
would appear in memory as 8n.
2.1.3 Description
The description section providesan indication of the purpose of the instruction as well as a summary of the behavior. This may include details of the use of registers, whose initial values may be used as parameters and into which results may be stored.
Forexample, the
and
instruction contains the followingdescription:
Description: Bitwise AND of Areg and Breg.
2.1.4 Definition
The definition section provides a formal description of the behavior of the instruction. The behavior is defined in terms of its effect on the state of the processor, i.e. the changes in the values in registers and memory before and after the instruction has executed.
The effects of the instruction on registers,etc. are givenas statements of the following form:
register′←
expression involving registers, etc.
memory_location′←
expression involving registers, etc.
Primed names (e.g. Areg) represent values after instruction execution, while names without primes represent values when the instruction execution starts. For example, Areg represents the value in Areg before the execution of the instruction while Areg represents the value in Areg afterwards. So the example above states that after the instruction has been executed the register or memory location on the left hand side holds the value of the expressionon the right hand side.
Only the changed registers and memory locations are given on the left hand side of the statements. If the new value of a register or memory location is not giventhen the value is unchanged by the instruction.
The description is written with the main function of the instruction stated first. For example the main function of the
add
instruction is to put the sum of Areg and Breg into Areg). This is followed by the other effects of the instruction, such as rotating the stack. There is no temporal ordering implied by the order in which the statements are written.
2.2 Instruction definitions
8/205
For example,the
and
instruction contains the followingdescription:
Definition:
Areg′←Breg Areg Breg′←Creg
Creg′←Areg
This says that the integer stack is rotated and Areg is assigned the bitwise ANDof the values that were initially in Breg and Areg. After the instruction has executed Breg contains the value that was originally in Creg, and Creg has the value that was in Areg.
The notation is described more fully in section 2.2.
2.1.5 Status Register
This section of the instruction definitions lists any changes to bits of the Status register which can occur. The Status register is described in more detail in section
3.3.2.
2.1.6 Comments
This section is used for listing other information about the instructions that may be of interest. This includes an indication of the type of the instruction:
“Primary instruction” — indicates one of the 13 functions which maybe directly encoded with an operand in a single byte instruction.
“Secondary instruction” — indicates an instruction which isencoded using
opr
.
An explanation of the primary and secondary instruction formats is given in section
3.4. The Comments section also describes any situations where the operation of the
instruction is undefinedor invalidand anylimits to the parameter values. For example,the only comment listed for the
and
instruction is:
Comments:
Secondary instruction.
This says that
and
is a secondaryinstruction.
2.2 Instruction definitions
The following sections give a full description of the notation used in the formal defini­tion section of the instruction descriptions.
9/205
2 Notation
2.2.1 The process state The process state consists of the registers (Areg, Breg, Creg, Iptr, Tdesc, Wptr, and
Status), and the contents of memory. A description of the meanings and uses of the
registers and special memory locations and data structures is given in section 3.3.
2.2.2 General
The instruction descriptions are not intended to describe the waythe instructions are implemented, but only their effect on the state of the processor.So, for example, the result of
mul
is shown in terms of an intermediate result calculated to infiniteprecision,
although no such intermediate result is used in the implementation. Comments (in
italics
) are used to both clarify the description and to describe actions
or values that cannot easily be represented by the notation used here; e.g.
take
timeslice trap
. Some of these actions and values are described in more detail in other
chapters. An ellipsis is used to show a range of values; e.g. ‘i = 0..31’ means that i has values
from 0 to 31, inclusive. Subscripts are used to indicate particular bits in a word; e.g. Aregiforbit i of Areg;and
Areg
0..7
for the least significantbyte of Areg. Note that bit 0 is the least significant bit
in a word,and bit 31 is the most significantbit. Except for Iptr, certain reserved words of memory,and taking exceptions or switching
processes, if the description does not mention the state of a register or memory location after the instruction, then the value will not be changed bythe instruction.
Iptr is assigned the address of the next instruction in the code
before
the instruction execution starts. The Iptr is included in the description only when there are additional effects of the instruction (e.g. in the
jump
instruction). In these cases the address of
the next instruction is indicated bythe comment ‘
next instruction
’.
2.2.3 Undefined values
Some instructions in some circumstances leave the contents of a register or memory location in an undefined state. This means that the value of the location may be changed by the instruction, but the new value cannot be easily defined, or is not a meaningful result of the instruction. For example, when division by zero is attempted, Breg and Creg become undefined, i.e. they do not contain any meaningful data. An undefinedvalue is represented by the name
undefined
.
The values of registers which become undefined as a result of executing an instruc­tion are implementation dependent and are not guaranteed to be the same on different members or revisions of the ST20 family of processors.
2.2.4 Data types
The instruction set includes operationson three sizesof data: 8, 16 and 32-bit objects. 8-bit and 16-bit data can represent signed or unsigned integers and 32-bit data can
2.2 Instruction definitions
10/205
represent addresses,signed or unsigned integers. Generally the arithmetic in signed. In some cases it is clear from the context (e.g. from the operators used) whether a particular object represents a signed or unsigned number. A subscripted label is added (e.g. Areg
unsigned
) to clarify where necessary.
2.2.5 Representing memory
The memory is represented by arrays of each data type. These are indexedby a value representing a byte address. Access to the three data types is represented in the instruction descriptions in the following way:
byte[
address
]references a byte in memory at the givenaddress
sixteen[
address
]references a 16-bit half wordin memory
word[
address
]references a 32-bit word in memory
For all of these, the state of the machine referencedis that
before
the instruction if the
function is used without a prime (e.g. word[
address
]), and that
after
the instruction if
the function is used with a prime (e.g. word[
address
]).
For example, writing a value given by an expression,
expr
, to the word in memory at
address
addr
is represented by:
word[
addr
]
expr
and reading a word from a memory location is achievedby:
register′←word[
addr
]
Writing to memory in any of thesewayswill update the contents of memory,and these updates will be consistently visible to the other representations of the memory. For example,writing a byteat address 0 will modify the least significantbyte of the word at address 0.
Data alignment
Generally, word and half word data items have restrictions on their alignment in memory. Byte values can be accessed at any byte address, i.e. they are byte aligned. 16-bit objects can only be accessed at even byte addresses, i.e. the least significant bit of the address must be 0. 32-bit objects must be word aligned, i.e. the 2 least significantbits of the address must be zero.
Address calculation
An address identifiesa particular bytein memory.Addresses are frequently calculated from a base address and an offset. Fordifferent instructions the offset maybe givenin units of bytes or words depending on the data type being accessed. In order to calculate the address of the data, a word offset must be converted to a byte offset before being added to the base address. This is done by multiplying the offset by the number of bytes per word, i.e. 4.
As there are many accesses to memory at word offsets, a shorthand notation is used to represent the calculation of a word address. The notation
register@x
is used to
11/205
2 Notation
represent an address which is offset byxwords (4xbytes) from the address in
register
. For example, in the specification of
load non-local
there is:
Areg′←word[Areg @ n]
Here, Areg is loaded with the contents of the word that is n words from the address pointed to by Areg, i.e. the word at address Areg + 4n.
In all cases, if the given base address has the correct alignment then anyoffset used will also give a correctly aligned address.
2.3 Operators used in the definitions
A full list of the operators used in the instruction definitions is given in Table 2.1. Unless otherwise stated, all arithmetic is signed.
Modulo operators
Arithmetic is done using
modulo
arithmetic — i.e. there is no checkingfor errors and, if the calculation overflows,the result ‘wraps around’ the range of values representable in the word length of the processor — e.g. adding 1 to the address at the top of the
Symbol Meaning
Unchecked (modulo) integer arithmetic
+
×
/
rem
Signed integer add, subtract, multiply,divide and remainder.If the computation overflows the result of the operation is truncated to the word length. If a divide or remainder by zero occurs the result of the operation is undefined.No errors are signalled. The operator ‘’ is also used as a monadic operator.
Signed comparison operators
< >
≤ ≥
=
Comparisons of signed integer values: ‘less than’, ‘greater than’, ‘less than or equal’, ‘greater than or equal’, ‘equal’ and ‘not equal’.
Bitwise operators
>> <<
>>
arith
‘Not’, ‘and’, ‘or’,‘exclusiveor’, logical left and right shift and arithmetic right shift operations on bits in words.
Boolean operators
not
and
or
Boolean combination in conditionals.
Table2.1 Operators used in the instruction descriptions
2.3 Operators used in the definitions
12/205
address map produces the address of the byte at the bottom of the address map. These operators are represented by the symbols ‘+’, ‘’, etc.
Error conditions
Any errors that can occur in instructions which are defined in terms of the modulo operators are indicated explicitly in the instruction description. For example the
add
instruction indicates the cases that can cause overflowor underflow,independently of the actual addition:
if (sum > MostPos) {
Areg′←sum 2
BitsPerWord
Status
underflow
clear
Status
overflow
set
} else if (sum < MostNeg) {
Areg′←sum + 2
BitsPerWord
Status
underflow
set
Status
overflow
clear
}
else
{
Areg′←sum Status
underflow
clear
Status
overflow
clear
}
...
2.3.1 Functions Type conversions
The following notation is used to indicate the type cast of x to a 16-bit integer:
int16 (x)
If x is too large or too small to fitinto a 16-bit integer then the result of the instruction is undefined.
Double word splitting
Where a calculation is performed using a 48-bit or 64-bit value,the value may be split into two words. The function low_word returns the least significant word and the function high_word returns the most significant word.
13/205
2 Notation
2.3.2 Conditions to instructions
In many cases, the action of an instruction depends on the current state of the processor. In these cases the conditions are shownby an if clause; this can take one of the following forms:
if
condition
statement
if
condition
statement
else
statement
if
condition
statement
else if
condition
statement
else
statement
These conditions can be nested. Braces, {}, are used to group statements which are dependent on a condition. For example, thecj(
conditional jump
) instruction contains
the following lines:
if (Areg = 0)
Iptr′←
next instruction
+ n
else
{
Iptr′←
next instruction
Areg′←Breg Breg′←Creg Creg′← Areg
}
This says that if the value in Areg is zero, then the jump is taken (the instruction operand, n, is added to the instruction pointer), otherwise the stack is popped and executioncontinues with the next instruction.
2.4 Data structures and constants
A number of data structures have been defined in this manual. Each comprises a number of data slots that are referenced by name in the text and the instruction descriptions.
These data structures are listed in the tables in Appendix A. Each table gives the name of each slot in the structure and the word offsets from the base address of the structure. A slot in a data structure is identified using the offset notation described in section 2.2.5:
word[
base_address@word_offset
]
2.4 Data structures and constants
14/205
For example,the back pointer of a semaphore structure at address sem would be:
word[sem @ s.Back]
In addition, several constants are used to identify fixed values for the ST20-C1 processor. All the constants are listed in Appendix A.
Product identity value
This is the value returned by the
ldprodid
instruction. For specific product ids in the
ST20 family refer to SGS-THOMSON.
15/205
2 Notation
3.1 Values
16/205
3 Architecture
This chapter describes the general architectural features of the ST20-C1 core which are relevant to more than one instruction or group of instructions. Interrupts and traps are described in Chapter 6 and support for multi-tasking is described in Chapter 7. Other features which are related to specifictasks are described in Chapter 4. A full list of constants and data structures is given in Appendix A.
The ST20-C1 instruction set covers:
control flow
arithmetic and logical operations
bit fieldmanipulations
shifting and byte-swapping
register manipulations
memory access with various addressing modes and data sizes
task scheduling
direct input/output
3.1 Values
The ST20-C1 core supports data objects of different sizes, either signed or unsigned. The sizes directly supported are bytes (8-bit), half words (16-bit), words (32-bit) and multiple words (64-bit, 96-bit etc.). Bytes, half-words and words may be loaded and stored. Arithmetic operationsare providedfor signed words and multiple words. A half word is called a
sixteen
in the instruction names.
The most negativeinteger (0x80000000) is known as
MostNeg
and the most positive
(0x7FFFFFFF) as
MostPos
.
Boolean objects, taking one of the values
trueorfalse
, are also used bysome instruc-
tions.
False
is represented by the value 0 and
true
has the value 1. Section 4.7
describes how other valuesmay be implemented for language compilation. Severaldata structures are defined in this manual. Each comprises a number of data
words (sometimes called
slots
) that are referenced by name in the text and the instruction descriptions and addressed as offsets from the base of the data structure. A full list of these data structures and other constants is given in Appendix A.
3.1.1 Ordering of information
The ST20 is
little-endian
- i.e. less significantdata is always held in lower addresses. This applies to bits in bytes,bytes in words and words in memory.Hence, in a word of data representing an integer, one byte is more significant than another if its byte selector is larger.
Figure 3.1 shows the ordering of bytes in words for the ST20.
17/205
3 Architecture
Figure 3.1 Bytes and bits in words
Forexample, the most significantbit of a word is bit 31, and the most significantbyte is byte 3, consisting of bits 24 to 31. This ordering is compatible with Intel processors, but not Motorola or SPARC.
For compatibility with other devices, a
swap32
instruction is provided to reverse the
order of byteswithin a word.
3.1.2 Signed integers and sign extension
A signed object is stored in twos-complement format. A signed value may be repre­sented by an object of any size. Most commonly a signed integer is represented bya single word, but as explained,it may be stored, for example,in a 64-bit object, a 16-bit object, or an 8-bit object. In each of these formats, all the bits within the object contain useful information.
The length of the object that stores a signed valuecan be increased, so that the object size is increased without changing the value that is represented. This operation is known as
sign extension
. All the extra bits that are allocated for the larger object, are meaningful to the value of the signed integer; they must therefore be set to the appro­priate value. The value for all these extra bits is the same as the value of the most significantbit - i.e. the sign bit - of the smaller object. The ST20-C1 providesinstruc­tions that sign extendbyte and half-word objects to words.
The example shown in Figure 3.2 shows how the value -10 is stored in a 32-bit register, either as an 8-bit object or as a 32-bit object. In this case, bits 31 to 8 are meaningful for the 32-bit object but not for the 8-bit object. These bits are set to 1 in the 32-bit object.
3
210
Bytes in a word
Most significant
Least
significant
Bits in a word
Most significant
Least
significant
31
0
3.2 Memory
18/205
Figure 3.2 Storing a signed integer in different length objects
3.2 Memory
The ST20 processor is a 32-bit word machine, with byte addressing and a 4 Gbyte address space. This section explainshow data is arranged in that address space.The address of an object is the address of the base,i.e. the byte with the lowest address.
3.2.1 Word address and byte selector
A machine address, or pointer, is a single word of data which identifies a byte in memory - i.e. a byte address. It comprises two parts, a word address and a byte selector.The byte selector occupies the two least significantbits of the word; the word address the thirty most significant bits.
An address is treated asa signed value,the range of which starts at the most negative integer and continues, through zero, to the most positive integer. This enables the standard arithmetic and comparison functions to be used on pointer values in the same way that theyare used on numerical values.
Certain values can never be used as pointers because they represent reserved addresses at the bottom of memory space. They are reserved for use by the processor and initialization. A full list of names and values of constants used in this manual is givenin Appendix A.
In particular, the null process pointer (known as
NotProcess
) has the value
MostNeg
,
since zero could be a valid process address.
3.2.2 Alignment
A data object is said to be
word-aligned
if it is at an address with a byte selector of
zero,i.e. the fulladdress of the object is divisibleby 4. Similarly,a data object is said to
11110110these bit values not related to integer value
07831bit position
11110110
07831bit position
11 1...
signed integervalue (-10) storedas an 8-bitobject (byte)
signed integer value(-10) stored as a32-bit object(word)
19/205
3 Architecture
be
half-word-aligned
if it is at an address with an even byte selector, i.e. the full
address of the object is divisible by 2. Word objects, including addresses, are normally stored word-aligned in memory.This
is usually desirable to make the best use of any 32-bit wide memory. Also most instructions that involve fetching data from or storing data into memory, use word aligned addresses and load or store four contiguous bytes.
However, there are some instructions that can manipulate part of a word. A half-word object is normally half-word-aligned, so it can be stored either in the least significant 16 bits of a word or in the most significant 16 bits. A data item that is represented in two contiguous words is called a
double word
object and is normally word-aligned.
3.2.3 Ordering of information in memory
Data is stored in memory using the little-endian rule. Objects consisting of more than one byte are stored in consecutive bytes, with the least significant byte at the lowest address and the most significant at the highest address.
Figure 3.3 shows the ordering of bytes in words in memory. If X is a word-aligned address then the word at X consists of the bytes at addresses X to X+3, where the byte at X is the least significantbyte and thebyte at X+3 is the most significant byte of the word.
Figure 3.3 Bytes in wordsin memory
3.2.4 Work space
The ST20-C1 uses a stack-based data structure in memory to hold the local working data of a program, called the work space. The work space is a word-aligned collection of 32-bit words pointed to by the work space pointer register (Wptr).
The programmer’s model isthat all local data is held in the work space, i.e.in memory, and must be brought into the evaluation stack to be operated on, and then written back from the evaluationstack to the work space.
X+7 X+6 X+5 X+4 X+3 X+2 X+1 X+0
Memory
(bytes)
X+3 X+2 X+1 X+0
MSB LSB
31 24 23 16 15 8 7 0
X+7 X+6 X+5 X+4
MSB LSB
31 24 23 16 15 8 7 0
32-bit words
X
is a word-aligned byte address
X+n
is the bytenbytes past
X
3.3 Registers
20/205
An implementation of the ST20-C1 core may include a
register cache
. This providesa mechanism to accelerate access to local work space without changing the programmer’smodel of how the work space operates or impacting either the excellent code density or low interrupt latency associated with a stack-based instruction set.
3.3 Registers
This section introduces the ST20-C1 core registers that are visible to the programmer. Seven registers, known as process state registers, define the local state of the executing process. These registers are preserved through exceptions. One other register is provided for performing input/output, and is not preserved through excep­tions. All registers are 32-bit. Each instruction explicitly refers to specific registers, as described in the instruction definitions.
The state of an executing process at any instant is defined by the contents of the machine registers listed in Table 3.1. The registers are illustrated in Figure 3.4 and described in the rest of this section.
Figure 3.4 Register set
Register Description
Areg Evaluation stack register A Breg Evaluation stack register B Creg Evaluation stack register C Iptr Instruction pointer register, pointing to the next instruction to be executed Status Status register Wptr Work space pointer, pointing to the stack of the currently executingprocess Tdesc Task descriptor IOreg Input and output register
Table 3.1 Processor registers
Iptr
Areg
Breg
Creg
IOReg
Tdesc
Wptr
Evaluation Stack
Memory
Program Code
Local Program Data
offset
base
Status
Task Descriptor
Instruction Pointer
Workspace Pointer
ST20-C1 Core
Task Control Block
21/205
3 Architecture
3.3.1 Evaluation stack
The registers Areg, Breg and Creg are organized as a three register evaluationstack, with Areg at the top. The evaluation stack is used for expression evaluation and to hold operands and results of instructions. Generally, instructions maypop values from or push values onto the evaluation stack or both, and do not address individual evalu­ation stack registers.
Pushing a value onto the stack means that the value initially in Breg is pushed into Creg, the value in Areg is pushed into Breg and the new value is put in Areg. Popping a value from the stack means that a value is taken from Areg, the value initially in Breg is popped into Areg, and the value in Creg is popped into Breg.The value left in Creg varies between instructions, but is generally the value initially in the Areg. These actions are illustrated in Figure 3.5 and Figure 3.6.
Figure 3.5 Pushing a value x onto the evaluationstack
Figure 3.6 Poppinga valuefrom the evaluation stack
3.3.2 Status register
The status register contains status bits which describe the current state of the executing process and any errors which may have been detected. Initially the status register is set to the value given in Table7.3.
The contents of the status register are summarizedin Table 3.2 and described in more detail in the following paragraphs. Generally the status register is local except for the
Before
Areg Breg
Creg
After
a b c
a b
x
Before
Areg Breg
Creg
After
a b c
c a
b
3.3 Registers
22/205
global interrupt enable and timeslice enable, which are global and carried from one process to another across a context switch.
The mac_count, mac_buffer, mac_scale and mac_mode fields are used by the multiply-accumulate instructions to hold initialization data which must be saved when an exception occurs. See Chapter 5 for details of multiply accu­mulation.
The global_interrupt_enable bit enables external interrupts. Interrupts remain enabled or disabled until explicitly disabled or enabledagain. This bit is global and is maintained when a process is descheduled.
The local_interrupt_enable enables external interrupts. Clearing this bit dis- ables external interrupts until the current process is descheduled. This is needed when a process delegates part of its processing to a peripheral and then deschedules until completion, as describedin section 4.11.3.
Overflow, underflow and carry bits relating to arithmetic state are kept in the status word.
The ST20-C1 maintains “sticky”bits in the status word which indicate whether an overflow or underflow has occurred. This allows a complete expression to be evaluated before testing whether an overflow has occurred. Overflow and underflow are chosen as they apply both to addition as well as multiply as opposed to a more traditional method of replicating two bits out of the carry
Bit numbers Full name Meaning when set or meaning of value
0-7 mac_count Multiply-accumulate number of steps.
8-10 mac_buffer Multiply-accumulate data buffersize code.
11 - 12 mac_scale Multiply-accumulate scaling code.
13 mac_mode Multiply-accumulate accumulator format code. 14 global_interrupt_enable Enable external interrupts until explicitlydisabled. 15 local_interrupt_enable Enable external interrupts. Clearing this bit disables inter-
rupts until the current process is descheduled.
16 overflow An arithmetic operation gave a positive overflow. 17 underflow An arithmetic operation gave a negative overflow. 18 carry An arithmetic operation produced a carry. 19 user_mode A user process is executing. 20 interrupt_mode An interrupt handler is executingor trapped. 21 trap_mode A trap handler is executing. 22 sleep The processor is due to go to sleep. 23 reserved Reserved. 24 start_next_task The CPU must start executing a new process. 25 timeslice_enable Timeslicing is enabled.
26 - 31 timeslice_count Timeslice counter.
Table 3.2 Status register bits
23/205
3 Architecture
chain. In addition, they allow for saturated arithmetic to be implemented rela­tively easily.
A (non-sticky) carry bit is provided to allow efficient implementation of long addition and subtraction. The carry bit is only manipulated by the
addc
and
subc
instructions allowing the other add instructions to be used in address for­mation of multi-word values where carry propagation is required so that the carry is not lost in the address formation evaluations.
The user_mode bit indicates when the machine is handling a user process, i.e. a process which is not an exception handler. The interrupt_mode bit indi­cates when the machine is handling an interrupt, or a trap from an interrupt handler and the trap_mode bit indicates when the machine is executinga trap handler.An operating system mayneed to distinguish betweenmodes to allow it to perform scheduling activities from a trap handler. These bits are also required to enable the
eret
instruction to determine whether a signal to the
interrupt controller is required.
The sleep bit indicates that the CPU is due to go to sleep, i.e. to turn off its clocks and go into lowpower mode. This bit is set when the CPU detects there is no user process to execute and is cleared when the CPU goes to sleep.
The start_next_task bit when set causes the processor to attempt to run the next process from the scheduling queue.
The timeslice_enable bit and timeslice_count field are used for timeslicing, as described in section 7.4.
The instructions to use the status register are described in section 4.12.
3.3.3 The work space pointer
All programs need somewhere to store local working data, e.g. local variables in the application code. In the ST20 architecture, this local storage is termed the
work space
of the program. The Wptr register is the local work space pointer,which holds the address of the stack
of the executing process. The stack is downward pointing, so space is allocated by moving the Wptr to a lower address. This address is word aligned and therefore has the two least significant bits set to zero. When a process is descheduled, the Wptr is stored as part of the process descriptor block, which is pointed to by Tdesc.
The Wptr is used as a base for addressing
local
variables. A word offset from the
Wptr is the operand for the instructions
ldl
(load local),
stl
(store local) and
ldlp
(load
local pointer). The ST20-C1 simplifiesthe normal stack scheme by decoupling the load/store action
from the pointer update:
Load-local and store-local instructions access values in the work space with addresses relative to the Wptr, but do not change the valueof Wptr.
Separate instructions (
ajw,gajw
) are provided to update the work space
3.4 Instruction encoding
24/205
pointer by any amount in one step without needing a series of increments or decrements.
On calling a function or procedure,the Wptr is normally decreased to a loweraddress to allocate space for the parameters and local variables of the function. This is performed using the instruction
ajw
. The Wptr is returned to its initial value before
returning from the function to free the local workspace.
3.3.4 The task descriptor
The task descriptor Tdesc points to the process descriptor block for the currently executing process. The value held in the Tdesc becomes the process identifierwhen the process is not executing.
The process descriptor block is a block of memory whose contents depend on the state of the process. It will generally hold the saved Wptr and Iptr for the process, and may hold a link to the next process if the process is in a queue of waiting processes. The process descriptor block is described in section 7.2.
3.3.5 IO register
The bits of the IOreg are mapped to external connections on the ST20-C1 core.They may be used to signal to, or read signals from, peripherals on or off chip. The
io
instruction is used to read and write to the IOreg and is described in section 4.11. The IOreg is global, and remains unchanged by any context switch. The bits of the IOreg are definedin Table 3.3.
In some ST20 variants, some bits of the IO register may be reserved for system use. The reserved bits will be the most significant bits of the appropriate half word. The number of any such bits is given in the data sheet for each variant.
3.4 Instruction encoding
The ST20-C1 is a zero-address machine. Instruction operands are alwaysimplicit and no bits are needed in the instruction representation to carry address or operand location information. This results in very short instructions and exceptionallyhigh code density.
The instruction encoding is designed so that the mostcommonly executed instructions occupy the least number of bytes. This reduces the size of the code, which saves memory and reduces the memory bandwidth needed for instruction fetching. This section describes the encoding mechanism.
A sequence of single byte
instruction components
is used to encode an instruction.
The ST20 interprets this sequence at the instruction fetch stage of execution. Most
Bits Purpose
0-15 Output data 16-31 Input data
Table 3.3 IOreg bits
25/205
3 Architecture
programmers, working at the level of microprocessor assembly language or high-level language, need not be aware of the existence of instruction components and do not generally need to consider the encoding.
This section has been included to provide a background. Appendix C discusses consequential issues which need to be considered in order to implement a code generator.
3.4.1 An instruction component
Each instruction component is one byte long, and is divided into two 4-bit parts. The four most significant bits of the byte form a
function code
, and the four least significant
bits are used to build an
instruction data value
as shown in Figure 3.7.
Figure 3.7 Instruction format
This representation provides for sixteen function code values (one for each function), each with a data field ranging from 0 to 15.
Instructions that specify the instruction directly in the function code are called
primary
instructions
or
functions
. There are 13 primary instructions, and the other three possible function code values are used to build larger data values and other instruc­tions. Two function code values,
pfix
and
nfix
, are used to extend the instruction data
value by prefixing. One function code
operate(opr
) is used to specify an instruction
indirectly using the
instruction data value.opr
is used to implement
secondaryinstruc-
tionsoroperations
.
3.4.2 The instruction data value and prefixing
The data fieldof an instruction component is used to create an instruction data value. Primary instructions interpret the instruction data value as the operand of the instruc­tion. Secondaryinstructions interpret it as theoperation code for the instruction itself.
The instruction data value is a signed integer that is represented as a 32-bit word. For each new instruction sequence, the initial value of this integer is zero.Since there are only 4 bits in the data field of a single instruction component, it is only possible for most instruction components to initially assign an instruction data value in the range 0 to 15. Prefix components are used to extend the range of the instruction data value.
mnemonic name
pfix
n
prefix
nfix
n
negative prefix
Table 3.4 Prefixinginstruction components
function code data
0347
3.4 Instruction encoding
26/205
One or more prefixing components may be needed to create the full instruction data value. The prefixes are shown in Table3.4 and explained below.
All instruction components initially load the four data bits into the least significant four bits of the instruction data value.
pfix
loads its four data bits into the instruction data value,and then shifts this value up four places. Consequently,a sequence of one or more prefixes can be used to extend the data valueof the following instruction to any positive value. Instruction data values in the range 16 to 255 can be represented using one
pfix
.
nfix
is similar,except that it complements all 32 bits of the instruction data value before shifting it up, thus changing the sign of the instruction data value. Consequently, a sequence of one or more
pfix
es with one
nfix
can be used to extend the data value of a following instruction to any negative value. Instruction data values in the range -256 to -1 can be represented using one
nfix
.
When the processor encounters an instruction component other than
pfix
or
nfix
,it loads the 4-bit data field into the instruction data value. The instruction encoding is now complete and the instruction can be executed. The instruction data valueis then cleared so that the processor is ready to fetch the next instruction component, by building a new instruction data value.
For example, to load the constant 0x11, the instruction
ldc
0x11 is encoded with the
sequence:
pfix1;ldc
1
The instruction
ldc
0x2A68 is encoded with the sequence:
pfix2;pfixA;pfix6;ldc
8
The instruction
ldc
-1 is encoded with the sequence:
nfix0;ldc
F
3.4.3 Primary Instructions
Research has shown that computers spend most time executing a small number of instructions such as:
instructions to load and store from a small number of ‘local’ variables;
instructions to add and compare with small constants; and
instructions to jump to or call other parts of the program.
For efficiency, in the ST20 these are encoded directly as primary instructions using the function field of an instruction component.
Thirteen of the instruction components are used to encode the most important opera­tions performed by any computer executing a high levellanguage. These are used (in conjunction with zero or more prefixes) to implement the primary instructions. Primary instructions interpret the instruction data value as an operand for the instruction. The mnemonic for a primary instruction always includes this operand, shown in this manual asn.
27/205
3 Architecture
The mnemonics and names for the primary instructions are listed in Table 3.5.
3.4.4 Secondary instructions
The ST20 encodes all other instructions, known as
secondary instructions,
indirectly
using the instruction data value.
The function code
opr
causes the instruction data value to be interpreted as the operation code of the instruction to be executed. This selects an operation to be performed on the values held in the evaluation stack, so that a further 16 operations can be encoded in a single byte instruction. The
pfix
instruction component can be used to extend the instruction data value, allowing any number of operations to be encoded.
Secondary instructions do not have an operand specified by the encoding, because the instruction data value has been used to specify the operation.
To ensure that programs are represented as compactly as possible,the operations are encoded in such a way that the most frequently used secondary instructions are represented without using prefix instructions.
Forexample, the instruction
add
is encoded by:
opr
4
The instruction
and
is encoded by:
opr
F9
mnemonic name
adc n
add constant
ajw n
adjust work space
fcall n
function call
cj n
conditional jump
eqc n
equals constant
jn
jump
ldc n
load constant
ldl n
load local
ldlp n
load local pointer
ldnl n
load non-local
ldnlp n
load non-local pointer
stl n
store local
stnl n
store non-local
Table 3.5 Primary instructions
mnemonic name
opr n
operate
Table 3.6 Operate instruction
3.4 Instruction encoding
28/205
which is in turn encoded with the sequence:
pfixF;opr
9
3.4.5 Summary of encoding
The encoding mechanism has important consequences.
It produces very compact code.
It simplifies language compilation, by providing a completely uniform way of
allowing a primary instruction to take an operand of any size up to the proces­sor word-length.
It allows these operands to be represented in a form independent of the word­length of the processor.
It enables any number of secondary instructions to be implemented.
To aid clarity and brevity,prefix sequences and the use of
opr
are not explicitly shown in this guide. Each instruction is represented bya mnemonic, and for primary instruc­tions an item of data, which stands for the appropriate instruction component sequence. Hence the examples above would be just shown as:
ldc 17,add
, and
and
. Where appropriate, an expressionmaybe placed in a code sequence to represent the code needed to evaluatethat expression.
29/205
3 Architecture
4.1 Manipulating the evaluation stack
30/205
4 Using ST20-C1 instructions
This chapter describes the purpose for which the sequential instructions are intended, except for the multiply-accumulate instructions, which are described in Chapter 5. These instructions are described in the context of their intended use. Some instruc­tions are designed for use in a particular sequence of instructions, so this chapter describes those sequences. Instructions for exceptions are described in Chapter 6 and multi-tasking instructions are described in Chapter 7.
The architecture of the ST20-C1, including the registers and memory arrangement, is described in Chapter 3.
4.1 Manipulating the evaluation stack
The evaluation stack consists of the registers Areg, Breg and Creg. The general action of the evaluationstack is described in section 3.3.1.
Instructions are provided for shuffling and re-ordering the values on the evaluation stack, as listed in Table4.1.
rot
pops the value from Areg off the evaluationstack and rotates it into Creg, and
arot
pushes the value from Creg onto the stack.
rev
swaps the Areg and Breg, and
dup
pushes a copy of Areg onto the stack. Table 4.2 shows how each of these affects the evaluation stack. Each row shows the
contents of the evaluation stack after one of these instructions is executedif the initial values of the Areg, Breg and Creg are a, b and c respectively.
Many instructions leave the initial Areg in Creg. This value may be restored into the Areg by using
arot
.
Mnemonic Name
rot
rotate stack
arot
anti-rotate stack
dup
duplicate stack
rev
reversestack
Table 4.1 Evaluation stack manipulation instructions
Instruction Areg Breg Creg
rot
bca
arot
cab
rev
bac
dup
aab
Table 4.2 Evaluation stack manipulation
Loading...
+ 175 hidden pages