Specifications and information in this document are subject to change
without notice and do not represent a commitment on the part of
Hyperstone AG. Hyperstone AG reserves the right to make changes to
improve functioning. Although the information in this document has been
carefully reviewed, Hyperstone AG does not assume any liability arising
out of the use of the product or circuit described herein.
Hyperstone AG does not authorize the use of the Hyperstone
microprocessor in life support applications wherein a failure or
malfunction of the microprocessor may directly threaten life or cause
injury. The user of the Hyperstone microprocessor in life support
applications assumes all risks of such use and indemnifies Hyperstone
AG against all damages.
No part of this manual may be reproduced or transmitted in any form or
by any means, electronic or mechanical, including photo-copying and
recording, for any purpose without the permission of Hyperstone AG.
Hyperstone is a registered trademark of Hyperstone AG.
For further information please contact:
Hyperstone AG
Line-Eid-Strasse 3
D-78467 Konstanz
Germany
The hyperstone E1-32XS and hyperstone E1-16XS microprocessors represent a further
development of the hyperstone E1-32X and hyperstone E1-16X processors. By using a
0.25 µm CMOS process, the maximum clock rate of the processor has been increased
further. Because they are pin-compatible with their predecessors, these new processors can
be used as direct replacements in existing designs, provided that the changed supply
voltage requirements are taken into account.
This combination of a high-performance RISC microprocessor with an additional powerful
DSP instruction set and on-chip microcontroller functions offers a high throughput. The
speed is obtained by an optimized combination of the following features:
❒ The most recent stack frames are kept in a register stack, thereby reducing data memory
accesses to a minimum by keeping almost all local data in registers.
❒ Pipelined memory access allows overlapping of memory accesses with execution.
❒ 16 KByte fully static On-Chip Memory (IRAM).
❒ On-chip Instruction Cache omits instruction fetch in inner loops and provides prefetch.
❒ Variable-length instructions of 16, 32 or 48 bits provide a large, powerful instruction
set, thereby reducing the number of instructions to be executed.
❒ Primarily used 16-bit instructions halve the memory bandwidth required for instruction
fetch in comparison to conventional RISC architectures with fixed-length 32-bit
instructions, and also yield better code economy than conventional CISC
architectures.
❒ Orthogonal instruction set.
❒ Most instructions execute in one cycle.
❒ Pipelined DSP instructions.
❒ Parallel execution of ALU, DSP, and load/store instructions.
❒ Single-cycle half-word multiply-accumulate operation.
❒ Fast Call and Return by parameter passing via registers.
❒ An instruction pipeline depth of just two stages — decode/execute — provides
branching without insertion of wait cycles in combination with Delayed Branch
instructions.
❒ Range and pointer checks are performed without speed penalty. These checks therefore no
longer need to be turned off, which provides higher runtime reliability.
❒ Separate address and data buses provide a throughput of one 32-bit word each cycle.
The features noted above help to reduce the number of idle wait cycles to a bare
minimum. The processor is designed to sustain its execution rate with a standard SDRAM
memory.
The low power consumption is of advantage for mobile (portable) applications or in
temperature-sensitive environments.
Page 1-2 Architecture
Most of the transistors are used for the on-chip memory, the instruction cache, the register
stack and the multiplier, whereas only a small number is required for the control logic.
Due to their low manufacturing costs, the hyperstone E1-32XS and E1-16XS
microprocessors are very well suited for embedded systems applications requiring high
performance and lowest cost. To simplify board design as well as to reduce system costs,
the hyperstone E1-32XS and E1-16XS already come with integrated peripherals, such as a
timer and memory and bus control logic. Therefore, complete systems using hyperstone
microprocessors can be designed with a minimum of external components. No glue logic is
necessary to connect any kind of memory or I/O. The processors are even suitable for
systems where up to now microprocessors with 16-bit architecture have been used for cost
reasons. Their improved performance compared to conventional microcontrollers can be used
to replace many external peripherals, such as graphics controllers or DSPs, with software.
Using the hyperstone E1-32XS as a core for ASICs or ASSPs is particularly advantageous
because the total gate count of the entire logic is just about 35k gates without the
on-chip memory.
The software development tools include an optimizing C compiler, assembler, source-level
debugger with profiler as well as a runtime kernel and a sophisticated DSP library. Using
the runtime kernel, up to 31 tasks, each with its own virtual timer, can be developed
independently of each other. The synchronization of these tasks is effected almost
automatically by the runtime kernel. To the system designer, it appears as if up to 31
hyperstone microprocessors were available, to which programs can be allocated
accordingly. Real-time debugging of multiple tasks is supported. The DSP library provides
a powerful set of DSP-related functions and takes care of all parallelism between ALU,
DSP, and Load/Store unit.
The following description gives a brief architectural overview:
1.1.1. Compatibility:
❒ Pin compatible to hyperstone E1-32X and hyperstone E1-16X.
1.1.2. Phase-Locked Loop:
❒ An internal phase-locked loop circuit (PLL) provides clock rate multiplication by a
factor of ½, 1, 2, 4, or 8. For example, an external crystal of only 12 MHz is required to
achieve an internal clock rate of 96 MHz.
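The relation between crystal frequency and internal clock rate can be sketched as follows. This is only an illustration of the arithmetic stated above; the function name and the table of factors are hypothetical and do not correspond to any Hyperstone tool or register encoding.

```python
from fractions import Fraction

# Multiplication factors listed in the text above (1/2, 1, 2, 4, 8).
PLL_FACTORS = (Fraction(1, 2), Fraction(1), Fraction(2), Fraction(4), Fraction(8))


def internal_clock_mhz(crystal_mhz, factor):
    """Return the internal clock rate for a crystal frequency and PLL factor.

    Hypothetical helper: it only multiplies, it does not model the TPR register.
    """
    factor = Fraction(factor)
    if factor not in PLL_FACTORS:
        raise ValueError("PLL factor must be one of 1/2, 1, 2, 4 or 8")
    return crystal_mhz * factor


# The example from the text: a 12 MHz crystal with a factor of 8.
print(internal_clock_mhz(12, 8))  # 96
```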
1.1.3. Registers:
❒ 32 global and 64 local registers of 32 bits each
❒ 16 global and up to 16 local registers are addressable directly
1.1.4. Flags:
❒ Zero(Z), negative(N), carry(C) and overflow(V) flag
❒ Interrupt-mode, interrupt-lock, trace-mode, trace-pending, supervisor state, cache-mode
and high global flag
E1-32XS User’s Manual Page 1-3
1.1.5. Register Data Types:
❒ Unsigned integer, signed integer, single or double signed short, signed complex short,
single or double 16-bit fixed-point, bit string, IEEE-754 floating-point, each either 32 or
64 bits
1.1.6. External Memory:
❒ Address space of 4 Gbytes, divided into five areas
❒ Separate I/O address space
❒ Load/Store architecture
❒ Pipelined memory and I/O accesses
❒ High-order data located and addressed at lower address (big endian)
❒ Instructions and double-word data may cross DRAM page boundaries
1.1.7. On-Chip Memory (IRAM):
❒ 16 KByte fully static internal memory, 32 bit wide, single cycle access
1.1.8. Memory Data Types:
❒ Unsigned and signed byte (8 bit)
❒ Unsigned and signed half-word (16 bit), located on half-word boundary
❒ Undedicated word (32 bit), located on word boundary
❒ Undedicated double-word (64 bit), located on word boundary
1.1.9. Runtime Stack:
❒ Runtime Stack is divided into memory part and register part
❒ Register part is implemented by the 64 local registers holding the most recent stack
frame(s)
❒ Current stack frame (maximum 16 registers) is always kept in register part of the stack
❒ Data transfer between memory and register part of the stack is automatic
❒ Upper stack bound is guarded
1.1.10. Instruction Cache:
❒ An on-chip Instruction Cache reduces instruction memory access substantially
1.1.11. Instructions General:
❒ Variable-length instructions of one, two or three half-words halve required memory
bandwidth
❒ Pipeline depth of only two stages, assures immediate refill after branches
❒ Register instructions of type "source operator destination ⇒ destination" or
"source operator immediate ⇒ destination"
❒ All register bits participate in an operation
❒ Immediate operands of 5, 16 and 32 bits, zero- or sign-expanded
❒ Large address displacement of up to 28 bits
❒ Two sets of signed arithmetical instructions: instructions set or clear either only the
overflow flag or trap additionally to a Range Error routine on overflow
❒ DSP instructions operate on 16-bit integer, real and complex fixed-point data and 32-bit
integer data into 32-bit and 64-bit hardware accumulators
1.1.12. Instruction Summary:
❒ Memory instructions pipelined to a depth of two stages, trap on address register equal to
zero (check for invalid pointers)
❒ Memory address modes: register address, register post-increment, register + displacement
(including PC relative), register post-increment by displacement (next
address), absolute, stack address, I/O absolute and I/O displacement
❒ Load, all data types, bytes and half-words right adjusted and zero- or sign-expanded,
execution proceeds after Load until data is needed
❒ Store, all data types, trap when range of signed byte or half-word is exceeded
❒ Move, Move immediate, Move double-word
❒ Logical instructions AND, AND not, OR, XOR, NOT, AND not immediate, OR
immediate, XOR immediate
❒ Mask source and immediate ⇒ destination
❒ Add unsigned/signed, Add signed with trap on overflow, Add with carry
❒ Add unsigned/signed immediate, Add signed immediate with trap on overflow
❒ Sum source + immediate ⇒ destination, unsigned/signed and signed with trap on
overflow
❒ Subtract unsigned/signed, Subtract signed with trap on overflow, Subtract with carry
❒ Negate unsigned/signed, Negate signed with trap on overflow
❒ Multiply word ∗ word ⇒ low-order word unsigned or signed, Multiply word ∗ word ⇒
double-word unsigned and signed
❒ Divide double-word by word ⇒ quotient and remainder, unsigned and signed
❒ Shift left unsigned/signed, single and double-word, by constant and by content of
register
❒ Shift right unsigned and signed, single and double-word, by constant and by content of
register
❒ Rotate left single word by content of register
❒ Index Move, move an index value scaled by 1, 2, 4 or 8, optionally with bounds check
❒ Check a value for an upper bound specified in a register or check for zero
❒ Compare unsigned/signed, Compare unsigned/signed immediate
❒ Compare bits, Compare bits immediate, Compare any byte zero
❒ Test number of leading zeros
❒ Set Conditional, save conditions in a register
❒ Branch unconditional and conditional (12 conditions)
❒ Delayed Branch unconditional and conditional (12 conditions)
❒ Call subprogram, unconditional and on overflow
❒ Trap to supervisor subprogram, unconditional and conditional (11 conditions)
❒ Frame, structure a new stack frame, include parameters in frame addressing, set frame
length, restore reserve frame length and check for upper stack bound
❒ Return from subprogram, restore program counter, status register and return-frame
❒ Software instructions, call an associated subprogram and pass a source operand and the
address of a destination operand to it
❒ DSP Multiply instructions:
signed and/or unsigned multiplication ⇒ single and double word product
❒ DSP Multiply-Accumulate instructions:
signed multiply-add and multiply-subtract ⇒ single and double word product sum and
difference
❒ DSP Half-word Multiply-Accumulate instructions:
signed multiply-add operating on four half-word operands ⇒ single and double word
product sum
❒ DSP Complex Half-word Multiply instruction:
signed complex half-word multiplication ⇒ real and imaginary single word product
signed complex half-word multiply-add ⇒ real and imaginary single word product sum
❒ DSP Add and Subtract instructions:
signed half-word add and subtract with and without fixed-point adjustment ⇒ single
word sum and difference
❒ Floating-point instructions are architecturally fully integrated, they are executed as
Software instructions by the present version. Floating-point Add, Subtract, Multiply,
Divide, Compare and Compare unordered for single and double-precision, and Convert
single ⇔ double are provided.
1.1.13. Exceptions:
❒ Pointer, Privilege, Frame and Range Error, Extended Overflow, Parity Error, Interrupt
and Trace mode exception
❒ Watchdog function
❒ Error-causing instructions can be identified by backtracking, thus allowing a very
detailed error analysis
1.1.14. Timer:
❒ Two multi-functional timers
1.1.15. Bus Interface:
❒ Separate address bus of 26 (E1-32XS) or 22 (E1-16XS) bits and data bus of up to 32
(E1-32XS) or 16 bits (E1-16XS) provide a throughput of up to four or two bytes at each
clock cycle, respectively.
❒ Data bus width of 32, 16 or 8 bits, individually selectable for each external memory
area.
❒ 8-bit, 16-bit, and 32-bit boot width selectable via two external pins.
❒ Configurable I/O pins
❒ Internal generation of all memory and I/O control signals
❒ Wait pin function for I/O accesses to peripheral devices.
❒ Wait pin function for memory accesses to address space MEM2.
❒ On-chip DRAM controller supporting Fast-Page-Mode DRAMs, EDO DRAMs, and
synchronous DRAMs (SDRAM).
❒ Up to seven vectored interrupts.
❒ Control function for CLKOUT pin.
1.1.16. Power Management:
❒ Lower power supply current in power-down mode.
❒ Clock-Off function to further reduce power dissipation (Sleep Mode).
❒ PLL settings controllable by software, providing a “gear-up” or “gear-down” within one
clock cycle.
1.2. Block Diagram
Figure 1.1: Block Diagram
(Blocks shown: Register Set with 64 local and 26 global registers, ALU, Barrel Shifter,
DSP Execution Unit, Instruction Cache with Instruction Cache Control, Instruction Decode,
Execution Control, Instruction Prefetch Control, Load Decode, Store Data Pipeline, Memory
Address Pipeline, Bus Interface Control, 16 KByte SRAM, Internal Timer, Watchdog, PLL,
Power and Reset Control, and I/O and Interrupt Control. External connections: data bus of
32 (16) bits with data bus parity, address bus of 26 (22) bits, and control bus.)
1.3. Global Register Set
The architecture provides 32 global registers of 32 bits each. These are:
G0        Program Counter PC
G1        Status Register SR
G2        Floating-Point Exception Register FER
G3..G15   General purpose registers
G16..G17  Reserved
G18       Stack Pointer SP
G19       Upper Stack Bound UB
G20       Bus Control Register BCR (see section 6. Bus Interface)
G21       Timer Prescaler Register TPR (see section 5. Timer and CPU Clock Modes)
G22       Timer Compare Register TCR (see section 5. Timer and CPU Clock Modes)
G23       Timer Register TR (see section 5. Timer and CPU Clock Modes)
G24       Watchdog Compare Register WCR (see section 6. Bus Interface)
G25       Input Status Register ISR (see section 6. Bus Interface)
G26       Function Control Register FCR (see section 6. Bus Interface)
G27       Memory Control Register MCR (see section 6. Bus Interface)
G28..G31  Reserved
Registers G0..G15 can be addressed directly by the register code (0..15) of an instruction.
Registers G18..G27 can be addressed only by a MOV or MOVI instruction with the high
global flag H set to 1.
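How a 4-bit register code selects a global register can be sketched as below. The sketch assumes only what this manual states: codes 0..15 denote G0..G15, and with the high global flag H set (see section 1.3.2) the same codes denote G16..G31 instead. The function name is hypothetical, and the restriction to MOV and MOVI as well as the privilege check are not modeled.

```python
def effective_global_register(register_code, h_flag):
    """Return the number of the global register selected by a register code.

    Hypothetical model: with H set, codes 0..15 address G16..G31.
    """
    if not 0 <= register_code <= 15:
        raise ValueError("register code must be in the range 0..15")
    return register_code + 16 if h_flag else register_code


# Denoting G2 with H set addresses G18 (the stack pointer SP).
print(effective_global_register(2, True))   # 18
# Without H, the same code addresses G2 (the FER).
print(effective_global_register(2, False))  # 2
```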
Figure 1.2: Global Register Set (32-bit registers G0..G31, assigned as listed above)
1.3.1. Program Counter PC
G0 is the program counter PC. It is updated to the address of the next instruction through
instruction execution. Besides this implicit updating, the PC can also be addressed like a
regular source or destination register. When the PC is referenced as an operand, the value
supplied is the address of the first byte after the instruction which references it, except
when referenced by a delay instruction with a preceding delayed branch taken (see section
3.26. Delayed Branch Instructions).
Placing a result in the PC has the effect of a branch taken. Bit zero of the PC is always
zero, regardless of any value placed in the PC.
1.3.2. Status Register SR
G1 is the status register SR. Its content is updated by instruction execution. Besides this
implicit updating, the SR can also be addressed like a regular register. When addressed as
source or destination operand, all 32 bits are used as an operand. However, only bits 15..0
of a result can be placed in bits 15..0 of the SR, bits 31..16 of the result are discarded and
bits 31..16 of the SR remain unchanged. The full content of the SR is replaced only by the
Return Instruction. A result placed in the SR overrules any setting or clearing of the
condition flags as a result of an instruction.
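The effect of placing a result in the SR can be sketched as a simple mask operation. This is a minimal model of the rule above (only bits 15..0 are replaced, bits 31..16 stay unchanged); the function name is hypothetical, and privilege checks and the restrictions on individual bits such as the reserved bit and the I flag are deliberately left out.

```python
def write_sr(old_sr, result):
    """Model a register write to SR: only the low 16 bits of the result are taken.

    Hypothetical helper; it does not model Privilege Error traps.
    """
    return (old_sr & 0xFFFF0000) | (result & 0x0000FFFF)


old_sr = 0x12340000
new_sr = write_sr(old_sr, 0xABCD00FF)
print(hex(new_sr))  # 0x123400ff
```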
Figure 1.3: Status Register SR (bits 31..16)
Bits 31..25  FP   Frame Pointer
Bits 24..21  FL   Frame Length
Bits 20..19  ILC  Instruction-Length Code
Bit 18       S    Supervisor State Flag
Bit 17       P    Trace Pending Flag
Bit 16       T    Trace-Mode Flag
Figure 1.4: Status Register SR (bits 15..0)
Bit 15       L    Interrupt-Lock Flag
Bits 14..13  FRM  Floating-Point Rounding Mode
Bits 12..8   FTE  Floating-Point Trap Enable
Bit 7        I    Interrupt-Mode Flag
Bit 6             Reserved
Bit 5        H    High Global Flag
Bit 4        M    Cache-Mode Flag
Bit 3        V    Overflow Flag
Bit 2        N    Negative Flag
Bit 1        Z    Zero Flag
Bit 0        C    Carry Flag
The status register SR contains the following status information:
C Bit zero is the carry condition flag C. In general, when set it indicates that the
unsigned integer range has been exceeded. On add operations, it indicates a
carry out of bit 31 of the result. On subtract operations, it indicates a borrow
(inverse carry) into bit 31 of the result.
Z Bit one is the zero condition flag Z. When set, it indicates that all 32 or 64
result bits are equal to zero regardless of any carry, borrow or overflow.
N Bit two is the negative condition flag N. On compare instructions, it indicates
the arithmetically correct (true) sign of the result regardless of an overflow. On all
other instructions, it is derived from result bit 31, which is the true sign bit
when no overflow occurs. In the case of overflow, result bit 31 and N reflect
the inversion of the true sign.
V Bit three is the overflow condition flag V. In general, when set it indicates a
signed overflow.
M Bit four is the cache-mode flag M. Besides being set or cleared under program
control, it is also automatically cleared by a Frame instruction and by any
branch taken except a delayed branch. See section 1.9. Instruction Cache for
details.
H Bit five is the high global flag H. When H is set, denotation of G0..G15 addresses
G16..G31 instead. Thus, the registers G18..G27 may be addressed by denoting
G2..G11 respectively.
The H flag is effective only in the first cycle of the next instruction after it was
set; then it is cleared automatically.
Only the MOV or MOVI instruction issued as the next instruction can be used
to copy the content of a local register or an immediate value to one of the high
global registers. The MOV instruction can also be used to copy the content of a
high global register (except the BCR, TPR, FCR and MCR registers, which are
write-only) to a local register. With all other instructions, the result may be
invalid.
If one of the high global registers is addressed as the destination register in user
state (S = 0), the condition flags are undefined, the destination register remains
unchanged and a trap to Privilege Error occurs.
Reserved Bit six is reserved for future use. It must always be zero.
I Bit seven is the interrupt-mode flag I. It is set automatically on interrupt entry
and reset to its old value by a Return instruction. The I flag is used by the
operating system; it must never be changed by any user program, regardless of
user or supervisor state.
FTE Bits 12..8 are the floating-point trap enable flags (see section 3.33.2.
Floating-Point Instructions).
FRM Bits 14..13 are the floating-point rounding modes (see section 3.33.2.
Floating-Point Instructions).
L Bit 15 is the interrupt-lock flag L. When the L flag is one, all Interrupt, Parity
Error and Extended Overflow exceptions are inhibited regardless of individual mode
bits. The state of the L flag is effective immediately after any instruction
which changed it. The L flag is set to one by any exception.
The L flag can be cleared or kept set in any privilege state (user or supervisor)
or on return to any privilege state. Changing the L flag from zero to one is
permitted only in supervisor state or on return from supervisor to supervisor
state. A trap to Privilege Error occurs if the L flag is set under program
control from zero to one in user state or on return to user state.
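The general meaning of the condition flags C, Z, N and V described above can be illustrated for a 32-bit addition. This is a minimal sketch of the flag definitions only, assuming a plain add of two 32-bit operands; the function name is hypothetical, and the sketch does not claim to reproduce the exact behavior of any particular E1-32XS instruction.

```python
MASK32 = 0xFFFFFFFF


def add32_flags(a, b):
    """Add two 32-bit values and derive C, Z, N and V as defined in the text.

    Hypothetical model of the flag definitions, not of a specific instruction.
    """
    a &= MASK32
    b &= MASK32
    total = a + b
    result = total & MASK32
    carry = total >> 32                      # carry out of bit 31
    zero = int(result == 0)                  # all 32 result bits are zero
    negative = result >> 31                  # derived from result bit 31
    # Signed overflow: both operands have the same sign and the result differs.
    overflow = ((~(a ^ b) & (a ^ result)) >> 31) & 1
    return result, {"C": carry, "Z": zero, "N": negative, "V": overflow}


# Largest positive value plus one: signed overflow, N shows the inverted sign.
print(add32_flags(0x7FFFFFFF, 1))
# All ones plus one: unsigned range exceeded, result is zero.
print(add32_flags(0xFFFFFFFF, 1))
```

In the first case the true sign of the sum is positive, yet N is one, which matches the statement that on overflow N reflects the inversion of the true sign.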
The following status information cannot be changed by addressing the SR:
T Bit 16 is the trace-mode flag T. When both the T flag and the trace pending
flag P are one, a trace exception occurs after every instruction except after a
Delayed Branch instruction. The T flag is cleared by any exception.
Note: The T flag can only be changed in the saved return SR and is then
effective after execution of a Return instruction.
P Bit 17 is the trace pending flag P. It is automatically set to one by all
instructions except by the Return instruction, which restores the P flag from bit
17 of the saved return SR.
Since for a Trace exception both the P and the T flag must be one, the P flag
determines whether a trace exception occurs (P = 1) or does not occur (P = 0)
immediately after a Return instruction which restored the T flag to one.
Note: The P flag can only be changed in the saved SR. No program except the
trace exception handler should affect the saved P flag. The trace exception
handler must clear the saved P flag to prevent a trace exception on return, in
order to avoid tracing the same instruction in an endless loop.
S Bit 18 is the supervisor state flag S (see section 1.5. Privilege States). It is set
to one by any exception.
ILC Bits 20 and 19 represent the instruction-length code ILC. It is updated by
instruction execution. The ILC holds (in general) the length of the last
instruction: ILC values of one, two or three represent an instruction length of
one, two or three half-words respectively. After a branch taken, the ILC is
invalid. The Return instruction clears the ILC.
Note: Since a Return instruction following an exception clears the ILC, a
program must not rely on the current value of the ILC.
FL Bits 24..21 represent the frame length FL. The FL holds the number of usable
local registers (maximum 16) assigned to the current stack frame.
FL = 0 is always interpreted as FL = 16.
FP Bits 31..25 represent the frame pointer FP. The least significant six bits of the
FP point to the beginning of the current stack frame in the local register set,
that is, they point to L0.
The FP contains bits 8..2 of the address at which the content of L0 would be
stored if pushed onto the memory part of the stack.
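The bit assignments given in this section can be collected in a small decoder. The sketch below assumes only the bit positions stated above; the function name and the dictionary keys "usable_registers" and "L0_index" are hypothetical and chosen for illustration.

```python
def decode_sr(sr):
    """Split a 32-bit SR value into the fields described in section 1.3.2."""
    def field(high, low):
        # Extract bits high..low (inclusive) of sr.
        return (sr >> low) & ((1 << (high - low + 1)) - 1)

    fl = field(24, 21)
    fp = field(31, 25)
    return {
        "C": field(0, 0), "Z": field(1, 1), "N": field(2, 2), "V": field(3, 3),
        "M": field(4, 4), "H": field(5, 5), "I": field(7, 7),
        "FTE": field(12, 8), "FRM": field(14, 13), "L": field(15, 15),
        "T": field(16, 16), "P": field(17, 17), "S": field(18, 18),
        "ILC": field(20, 19),
        "FL": fl,
        "usable_registers": fl if fl != 0 else 16,  # FL = 0 means FL = 16
        "FP": fp,
        "L0_index": fp & 0x3F,  # least significant six bits of FP point to L0
    }


# Example value: FP = 37, FL = 6, ILC = 1, S = 1, L = 1, V = 1, Z = 1.
sr = (37 << 25) | (6 << 21) | (1 << 19) | (1 << 18) | (1 << 15) | 0b1010
fields = decode_sr(sr)
print(fields["FP"], fields["FL"], fields["ILC"], fields["S"], fields["L"])
```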
1.3.3. Floating-Point Exception Register FER
G2 is the floating-point exception register. Only bits 12..8 and 4..0 may be changed by a
user program, all other bits must remain unchanged.
Figure 1.5: Floating-Point Exception Register
(Fields labeled in the figure: Reserved, Floating-Point Actual Exceptions, Reserved for
Operating System, and Floating-Point Accrued Exceptions.)
1.3.4. Stack Pointer SP
G18 is the stack pointer SP. The SP contains the top address + 4 of the memory part of the
stack, that is the address of the first free memory location in which the first local register
would be saved by a push operation (see section 3.29. Frame Instruction for details). Stack
growth is from low to high address.
When the SP is set, bits one and zero must always be zero. The SP can be addressed only
via the high global flag H being set. Copying an operand to the SP is a privileged
operation.
1.3.5. Upper Stack Bound UB
G19 is the upper stack bound UB. The UB contains the address beyond the highest legal
memory stack location. It is used by the Frame instruction to inhibit stack overflow.
When the UB is set, bits one and zero must always be zero. The UB can be addressed only
via the high global flag H being set. Copying an operand to the UB is a privileged
operation.
1.3.6. Bus Control Register BCR
G20 is the write-only bus control register BCR. Its content defines the options possible for
bus cycle, parity and refresh control. The BCR can be addressed only via the high global
flag H being set. Copying an operand to the BCR is a privileged operation. The BCR
register is described in detail in the bus interface description in section 6.
1.3.7. Timer Prescaler Register TPR
G21 is the write-only timer prescaler register TPR. It adapts the timer clock to different
processor clock frequencies and controls the processor clock generation by the PLL circuit.
The TPR can be addressed only via the high global flag H being set. Copying an operand
to the TPR is a privileged operation. The TPR is described in the timer description in
section 5. Updating TPR with amended values for the PLL multiplication factor in
conjunction with corresponding updated values for the timer prescaler changes the internal
clock frequency while keeping all timer settings correct.
1.3.8. Timer Compare Register TCR
G22 is the timer compare register TCR. Its content is compared continuously with the
content of the timer register TR. The TCR can be addressed only via the high global flag H
being set. Copying an operand to the TCR is a privileged operation. The TCR is described
in the timer description in section 5.
1.3.9. Timer Register TR
G23 is the timer register TR. Its content is incremented by one on each time unit. The TR
can be addressed only via the high global flag H being set. Copying an operand to the TR
is a privileged operation. The TR is described in the timer description in section 5.
1.3.10. Watchdog Compare Register WCR
G24 is the watchdog compare register WCR. The WCR can be addressed only via the high
global flag H being set. Copying an operand to the WCR is a privileged operation. The
WCR is described in the bus interface description in section 6.
1.3.11. Input Status Register ISR
G25 is the read-only input status register ISR. The ISR can be addressed only via the high
global flag H being set. The ISR is described in the bus interface description in section 6.
1.3.12. Function Control Register FCR
G26 is the write-only function control register FCR. The FCR can be addressed only via
the high global flag H being set. Copying an operand to the FCR is a privileged operation.
The FCR is described in the bus interface description in section 6.
1.3.13. Memory Control Register MCR
G27 is the write-only memory control register MCR. The MCR can be addressed only via
the high global flag H being set. Copying an operand to the MCR is a privileged operation.
The MCR is described in the bus interface description in section 6.
1.4. Local Register Set
The architecture provides a set of 64 local registers of 32 bits each. The local registers
0..63 represent the register part of the stack, containing the most recent stack frame(s).
Figure 1.6: Local Register Set 0..63 (32-bit registers; the figure marks Local Register L0
through Local Register L15 within the set)
The local registers can be addressed by the register code (0..15) of an instruction as
L0..L15 only relative to the frame pointer FP; they can also be addressed absolutely as part
of the stack in the stack address mode (see section 3.1.1. Address Modes).
The absolute local register address is calculated from the register code as:
absolute local register address := (FP + register code) modulo 64
That is, only the least significant six bits of the sum FP + register code are used and thus,
the absolute local register addresses for L0..L15 wrap around modulo 64.
The absolute local register addresses for FP + register code + 1 or FP + FL + offset are
calculated accordingly.
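The wrap-around addressing of the local registers can be sketched as follows. The sketch assumes only what is stated above (the least significant six bits of FP + register code select one of the 64 local registers); the function name is hypothetical.

```python
def absolute_local_register(fp, register_code):
    """Return the absolute local register address (0..63) for L0..L15.

    Hypothetical helper: keeps the least significant six bits of FP + code.
    """
    if not 0 <= register_code <= 15:
        raise ValueError("register code must be in the range 0..15")
    return (fp + register_code) & 0x3F


# With FP pointing at register 60, L5 wraps around to absolute register 1.
print(absolute_local_register(60, 5))  # 1
```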
1.5. Privilege States
The architecture provides two privilege states, determined by the supervisor state flag S:
user state (S = 0) and supervisor state (S = 1).
The runtime kernel hyRTK is executed in the higher privileged supervisor state, thereby
restricting access to all sensitive data to a highly reliable system program. The following
operations are also privileged; they may be executed only in supervisor state or on return
from supervisor to supervisor state:
❒ Copying an operand to any of the high global registers
❒ Changing the interrupt-lock flag L from zero to one
❒ Returning through a Return instruction to supervisor state
Any illegal attempt causes a trap to Privilege Error.
The S flag is also saved in bit zero of the saved return PC by the Call, Trap and Software
instructions and by an exception. A Return instruction restores it from this bit position to
the S flag in bit position 18 of the SR (thereby overwriting the bit 18 returned from the
saved return SR).
If a Return instruction attempts a return from user to supervisor state, a trap to Privilege
Error occurs (S = 1 is saved).
Returning from supervisor to user state is achieved by clearing the S flag in bit zero of the
saved return PC before return. Switching from user to supervisor state is only possible by
executing a Trap instruction or by exception processing through one of the 64 supervisor
subprogram entries (see section 2.4. Entry Tables).
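Because bit zero of the PC is always zero, that bit position of the saved return PC is free to carry the S flag. The following sketch illustrates this packing and unpacking under the assumptions stated in this section; the function names are hypothetical, and the Privilege Error check on a return from user to supervisor state is not modeled.

```python
MASK32 = 0xFFFFFFFF


def save_return_pc(pc, s_flag):
    """Combine the PC and the S flag into a saved return PC (S in bit zero)."""
    return (pc & ~1 & MASK32) | (1 if s_flag else 0)


def restore_from_saved_pc(saved_pc):
    """Return (pc, s_flag) as a Return instruction would restore them."""
    return saved_pc & ~1 & MASK32, saved_pc & 1


saved = save_return_pc(0x00001000, True)
print(hex(saved))                    # 0x1001
print(restore_from_saved_pc(saved))  # (4096, 1)

# Clearing bit zero of the saved return PC selects user state on return.
print(restore_from_saved_pc(saved & ~1))  # (4096, 0)
```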
1.6. Register Data Types
Figure 1.7: Register Data Types
Data type                                      Register(s)  Layout (bit 31 down to bit 0)
Bitstring                                      n            32 bits
Double-Word Bitstring                          n, n+1       high-order 32 bits; low-order 32 bits
Unsigned Integer                               n            32-bit magnitude
Unsigned Double-Word Integer                   n, n+1       high-order 32-bit magnitude; low-order 32-bit magnitude
Signed Integer, Two's Complement               n            S, 31-bit magnitude
Signed Double-Word Integer, Two's Complement   n, n+1       S, high-order 31-bit magnitude; low-order 32-bit magnitude
Two Signed Shorts                              n            two signed 16-bit values in bits 31..16 and 15..0
Complex Signed Short                           n            real part and imaginary part
Single Precision Floating-Point Number         n            S, 8-bit exponent, 23-bit fraction
Double Precision Floating-Point Number         n, n+1       S, 11-bit exponent, high-order 20-bit fraction; low-order 32-bit fraction
S = sign bit, MSB = most significant bit, LSB = least significant bit
1.7. Memory Organization
The architecture provides a memory address space in the range of 0..2^32 - 1
(0..4 294 967 295) 8-bit bytes. Memory is implied to be organized as 32-bit words. The
following memory data types are available (see figure 1.8):
❒ Byte unsigned (unsigned 8-bit integer, bit string or character)
❒ Byte signed (signed 8-bit integer, two's complement)
❒ Half-word unsigned (unsigned 16-bit integer or bit string)
❒ Half-word signed (signed 16-bit integer, two's complement)
❒ Word (32-bit undedicated word)
❒ Double-word (64-bit undedicated double-word)
Besides the memory address space, a separate I/O address space is provided. In the I/O
address space, only word and double-word data types are available.
Words and double-words must be located at word boundaries, that is, their most significant
byte must be located at an address whose two least significant bits are zero. Half-words
must be located at half-word boundaries, their most significant byte being located at an
address whose least significant bit is zero. Bytes may be located at any address.
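The boundary rules above can be expressed as a simple alignment check. This is a minimal sketch; the table and the function name are hypothetical. Note that, as stated above, double-words only need word alignment.

```python
# Required address alignment in bytes for each memory data type.
ALIGNMENT = {
    "byte": 1,         # may be located at any address
    "half-word": 2,    # least significant address bit must be zero
    "word": 4,         # two least significant address bits must be zero
    "double-word": 4,  # located at word boundaries, like words
}


def is_aligned(address, data_type):
    """Return True if the address satisfies the boundary rule for the type."""
    return address % ALIGNMENT[data_type] == 0


print(is_aligned(0x1002, "half-word"))    # True
print(is_aligned(0x1002, "word"))         # False
print(is_aligned(0x1004, "double-word"))  # True
```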
The variable-length instructions are located as contiguous sequences of one, two or three
half-words at half-word boundaries.
Memory- and I/O-accesses are pipelined to a depth of two words.
Note: All data is located high to low order at addresses ascending from low to high, that is,
the high order part of all data is located at the lower address (big-endian).
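The big-endian byte order can be illustrated with Python's standard struct module, where the ">" prefix selects big-endian packing. The memory model (a bytearray) and the helper names are hypothetical and serve only to show where each byte of a word ends up.

```python
import struct


def store_word(memory, address, value):
    """Store a 32-bit word big-endian at a word-aligned address."""
    if address % 4:
        raise ValueError("word address must be a multiple of 4")
    memory[address:address + 4] = struct.pack(">I", value & 0xFFFFFFFF)


def load_half_word_unsigned(memory, address):
    """Load an unsigned 16-bit half-word from a half-word-aligned address."""
    if address % 2:
        raise ValueError("half-word address must be a multiple of 2")
    return struct.unpack_from(">H", memory, address)[0]


memory = bytearray(8)
store_word(memory, 4, 0x12345678)
# The high-order byte 0x12 is located at the lowest address.
print(memory[4:8].hex())                        # 12345678
print(hex(load_half_word_unsigned(memory, 4)))  # 0x1234
print(hex(load_half_word_unsigned(memory, 6)))  # 0x5678
```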
Figure 1.8 shows the location of data and instructions in memory relative to a binary
address n = ...xxx00 (x = 0 or 1). The memory organization is big-endian.
For all data types, the most significant bit is located at the higher and the least
significant bit at the lower bit position.
1.8. Stack
A runtime stack, called stack here, holds generations of local variables in last-in-first-out
order. A generation of local variables, called stack frame or activation record, is created
upon subprogram entry and released upon subprogram return.
The runtime stack provided by the architecture is divided into a memory part and a register
part. The register part of the stack, implemented by a set of 64 local registers organized as
a circular buffer, holds the most recent stack frame(s). The current stack frame is always
kept in the register part of the stack. The frame pointer FP points to the beginning of the
current stack frame (addressed as register L0). The frame length FL indicates the number
of registers (maximum 16) assigned to the current stack frame. The stack grows from low
to high address. It is guarded by the upper stack bound UB.
The stack is maintained as follows:
❒ A Call, Trap or Software instruction increments the FP and sets FL to six, thus creating
a new stack frame with a length of six registers (including the return PC and the return
SR).
❒ An exception increments the FP by the value of FL and then sets FL to two.
❒ A Frame instruction restructures a stack frame to include (optionally) passed parameters
by decrementing the FP and by resetting the FL to the desired length, and restores a
reserve of 10 local registers for the next subprogram call. If the required number of
registers + 10 do not fit in the register part of the stack, the contents of the differential
(required + 10 - available) number of local registers are pushed onto the memory part of
the stack. A trap to Frame Error occurs after the push operation when the old value of
the stack pointer SP exceeded the upper stack bound UB.
❒ A Return instruction releases the current stack frame and restores the preceding stack
frame. If the restored stack frame is not fully contained in the register part of the stack,
the content of the missing part of the stack frame is pulled from the memory part of the
stack.
For more details see the descriptions of the specific instructions.
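The maintenance rules listed above can be sketched as a small Python model that tracks only FP and FL. This is a simplified illustration under stated assumptions, not a description of the hardware: the class and function names are hypothetical, FP is kept as an unbounded integer, register contents and the memory part of the stack are omitted, the previous FP and FL are kept in a Python list standing in for the saved return SR, and the Call increment is taken to be the destination register number, as in the worked example of this section. Trap and Software instructions are not modelled separately.

```python
REGISTER_RESERVE = 10  # reserve restored by a Frame instruction (from the text)
MAX_FRAME_LENGTH = 16  # maximum frame length FL (from the text)


class RegisterStackModel:
    """Tracks FP and FL only; an illustrative model, not the real hardware."""

    def __init__(self, fp: int = 0, fl: int = 2):
        self.fp = fp
        self.fl = fl
        self._saved = []  # assumption: stands in for the saved return SR

    def call(self, dest: int) -> None:
        """Call: start a new frame of length six at the destination register."""
        self._saved.append((self.fp, self.fl))
        self.fp += dest
        self.fl = 6

    def exception(self) -> None:
        """Exception: FP is incremented by FL, then FL is set to two."""
        self._saved.append((self.fp, self.fl))
        self.fp += self.fl
        self.fl = 2

    def frame(self, length: int, decrement: int) -> None:
        """Frame: include passed parameters and set the final frame length."""
        if length > MAX_FRAME_LENGTH:
            raise ValueError("frame length exceeds the maximum of 16")
        self.fp -= decrement
        self.fl = length

    def ret(self) -> None:
        """Return: release the current frame and restore the preceding one."""
        self.fp, self.fl = self._saved.pop()


def registers_to_push(required: int, available: int) -> int:
    """Local registers a Frame instruction pushes to the memory part:
    the differential (required + 10 - available), or none if everything fits."""
    return max(0, required + REGISTER_RESERVE - available)
```

With 15 registers available, a Frame instruction requiring 11 registers would push registers_to_push(11, 15) = 6 registers in this model; with 30 available, nothing is pushed.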
When the number of local registers required for a stack frame exceeds its maximum length
of 16 (in rare cases), a second runtime stack in memory may be used. This second stack is
also required to hold local record or array data.
The stack is used by routines in user or supervisor state, that is, supervisor stack frames are
appended to user stack frames, and thus, parameters can be passed between user and
supervisor state. A small stack space must be reserved above UB. UB can then be set to a
higher value by the Frame Error handler to free stack space for error handling.
Because the complete stack management is accomplished automatically by the hardware,
programming the stack handling instructions is easy and does not require any knowledge
of the internal working of the stack.
The following example demonstrates how the Call, Frame and Return instructions are
applied to achieve the stack behavior of the register part of the stack shown in figures
1.9 and 1.10.
A currently activated function A has a frame length of FL = 13. Registers L0..L6 are to be
retained through a subsequent call, registers L7..L12 are temporaries. A call to function B
needs 2 parameters to be passed. The parameters are placed by function A in registers L7
and L8 before calling B. The Call instruction addresses L9 as destination for the return PC
and return SR register pair to be used by function B on return to function A.
On entry of function B, the new frame of B has an implicit length of FL = 6. It starts
physically at the former register L9 of frame A. However, since the frame pointer FP has
been incremented by 9 by the Call instruction, this register location is now being addressed
as L0 of frame B. The passed parameters cannot be addressed because they are located
below the new register L0 of frame B. To make them addressable, a Frame instruction
decrements the frame pointer FP by 2. Then, parameters 1 and 2 passed to B can be
addressed as registers L0 and L1 respectively. Note that the return PC must now be
addressed as L2!
The Frame instruction in B also specifies the new, complete frame length FL = 11
(including the passed parameters as well as the return PC and return SR pair). In addition,
a new reserve of 10 registers for subsequent function calls and traps is provided in the
register stack. A possible overflow of the register stack is checked and handled
automatically by the Frame instruction. A program need not and must not pay attention to
register stack overflow.
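The renaming of registers in this example can be followed with a little index arithmetic. The Python sketch below is an illustration under one assumption that goes beyond the text: the circular buffer of 64 local registers is modelled as addition modulo 64. The function name physical_index and the starting value of FP are hypothetical.

```python
LOCAL_REGISTERS = 64  # size of the register part of the stack (from the text)


def physical_index(fp: int, n: int) -> int:
    """Physical slot of local register L<n> for frame pointer `fp`.

    Assumption: the circular buffer is modelled as arithmetic modulo 64.
    """
    return (fp + n) % LOCAL_REGISTERS


fp_a = 60              # arbitrary FP of frame A, chosen to show wrap-around
fp_b_entry = fp_a + 9  # the Call with destination L9 increments FP by 9
fp_b = fp_b_entry - 2  # the Frame instruction in B decrements FP by 2

# Parameters placed by A in L7 and L8 become L0 and L1 of frame B.
param1_slot = physical_index(fp_a, 7)
param2_slot = physical_index(fp_a, 8)
# The return PC saved at L9 of frame A is L0 on entry to B and L2 after Frame.
return_pc_slot = physical_index(fp_a, 9)
```

In this model the same physical slot is reached as L9 of frame A, as L0 of frame B on entry, and as L2 of frame B after the Frame instruction.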
At the end of function B, a Return instruction returns control to function A and restores
frame A. A possible underflow of the register stack is also handled automatically; thus,
frame A is always completely restored, regardless of whether it was wholly or partly
pushed into the memory part of the stack before (in the case when B called other functions).
In the present example with the frame length of FL = 13, any suitable destination register
up to L13 could be specified in the Call instruction. The parameters to be passed to the
function B would then be placed in L11 and L12. It is even possible to append a new frame
to a frame with a length of FL = 16 (coded as FL = 0 in the status register SR): the
destination register in the Call instruction is then coded as L0, but interpreted as the
register past L15.
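The coding of a frame length of 16 as FL = 0 can be sketched as a pair of conversion helpers. This Python example is illustrative only: the function names are hypothetical, and both the 4-bit width of the FL field and the lower bound of 1 for a frame length are assumptions inferred from the statement that 16 is coded as 0.

```python
def encode_fl(length: int) -> int:
    """Encode a frame length of 1..16 into the FL field (16 is coded as 0).

    Assumption: FL is a 4-bit field and lengths below 1 do not occur.
    """
    if not 1 <= length <= 16:
        raise ValueError("frame length must be in 1..16")
    return length % 16


def decode_fl(field: int) -> int:
    """Decode the FL field back into a frame length (0 means 16)."""
    if not 0 <= field <= 15:
        raise ValueError("FL field must be in 0..15")
    return 16 if field == 0 else field
```

According to the text, the destination register of a Call appended to a frame of length 16 follows the same convention: it is coded as L0 but interpreted as the register past L15.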
See also sections 3.27. Call instruction, 3.29. Frame instruction and 3.30. Return
instruction for further details.
Note: With an average frame length of 8 registers, about 7 to 8 Frame instructions can
follow a pulling Return instruction before a push occurs, and 7 to 8 Return instructions
can follow a pushing Frame instruction before a pull occurs. Thus, the built-in hysteresis
makes pushing and pulling a rare event in regular programs.
Program Example:
A:      FRAME   L13, L3         ; set frame length FL = 13, decrement FP by 3
        :                       ; parameters passed to A can be addressed
        :                       ; in L0, L1, L2
        :
        :
        code of function A
        :
        :
        MOV     L7, L5          ; copy L5 to L7 for use as parameter1
        MOVI    L8, 4           ; set L8 = 4 for use as parameter2
        CALL    L9, 0, B        ; call function B,
        :                       ; save return PC, return SR in L9, L10
        :
        :
        MOVI    L0, 20          ; set L0 = 20 as return parameter for caller
        RET     PC, L3          ; return to function calling A,
                                ; restore frame of caller
B:      FRAME   L11, L2         ; set frame length FL = 11, decrement FP by 2
        :                       ; passed parameter1 can now be addressed in L0
        :                       ; passed parameter2 can now be addressed in L1
        :
        :
        code of function B
        :
        :
        RET     PC, L2          ; return to function A, frame A is restored by
                                ; copying return PC and return SR in L2 and L3
                                ; of frame B to PC and SR