Rainbow Electronics AT89C2051 User Manual

Using the AT89C2051 Microcontroller as a Virtual Machine

It is often cited that what differentiates an embedded microcontroller from other general purpose computing devices is its integration into a larger electrical or electro-mechanical system. While this is generally true, the fact remains that processors of widely differing capability and architecture are employed in this regard.

Unfortunately, this broad explanation defines nothing; we are still left to contend with everything from full-blown embedded PCs to the smallest self-contained single-chip microcontrollers. Within this expansive realm, conventional wisdom may lead to the conclusion that the smallest microcontrollers are only appropriate for driving small-scale applications with very limited processing requirements. While this is unquestionably the case in many instances, a class of applications exists that mandates a relatively high level of program complexity within severely constrained space limitations. Faced with such a seeming paradox, engineers often feel they have no choice but to adopt a less than optimal design strategy using a larger microcontroller than originally intended.

The problem, of course, is one of limited resources. Functional complexity implies a non-trivial program, and the greater the functional complexity the larger the program. Even as the capability of small single-chip microcontrollers continuously inches upwards, application requirements seem to grow at a commensurate rate. Trying to hit such a moving target is difficult at best.

The economy of using a microcontroller with just enough processing power for a given application is a potent incentive to find just the right fit. Of course, this only works when the system requirements are thoroughly understood and clearly defined. Since such a design normally has little reserve capacity, it is usually hard pressed to handle features beyond those originally specified. Should additional capabilities eventually become a necessity, the result could be a system that runs out of steam and an engineer who runs out of options. Such are the perils of designing on the edge.

Atmel’s AT89C2051 offers capabilities that far exceed those of competing devices of similar size. This opens up potential design opportunities that were simply unattainable with previously available parts. Housed in a 20-pin package, Atmel’s miniature microcontroller retains all the major features of the 8051 architecture. Furthermore, the AT89C2051 includes all of the 8051’s “special” pins including the external interrupts, UART transmit and receive lines, and the external timer controls. Even though the AT89C2051 significantly ups the processing ante, it would seem that there are limits to what you can accomplish with any single-chip microcontroller.

This dilemma is nothing new. The traditional way of dealing with such limitations has been to operate the microcontroller in external memory mode. Common sense would indicate the hopelessness of applying such an approach to the AT89C2051. After all, the AT89C2051 is truly a single-chip design that does not even possess an external bus structure. It turns out that the situation is not hopeless at all.

AT89C2051 Flash Microcontroller

Application Note


Processor Simulation

The concept of microprocessor simulation is widely used and well understood. Simulation is often used for development purposes where a PC program models a specific processor’s architecture and interprets and executes its binary instruction set. Using this technique enables one to develop, test, and debug algorithms that will ultimately be combined into a larger program. Such a program will eventually run on a standalone microprocessor or microcontroller. Using simulation early in the design cycle is attractive because it allows you to start developing code long before the actual target hardware is available.

Processor simulation has also been applied to simulate entire computing systems. In this context, existing application programs, in their native binary format, have been coerced to run on various computers powered by completely different processors. For obvious reasons, the performance resulting from such an approach often proves to be disappointing. This does not necessarily have to be the case if the implementation is designed for a specific purpose. Factors affecting performance efficiency include the host processor’s strengths and limitations, the specific types of operations that are to be simulated, and, to an extent, the language the original program is written in.

Virtual Processor Simulation

Many developmental simulators have been produced that emulate the functions of popular processors and microcontrollers using standard desktop computers. The same principles can be utilized at the other end of the spectrum; there are cases where running a simulation on a small microcontroller can be put to an advantage. In this case, however, the benefit is not derived from simulating a known processor, but one that offers inherent advantages tailored to solving the specific problem at hand. The implication, of course, points to the design of a virtual processor. The idea is based on the premise of using a real processor to implement a virtual device specifically designed to suit the special needs of a particular application. In other words, designing the tool set for a particular job.

The fact is that adopting such a methodology can ultimately result in an architecture that can be pressed to serve as an efficient vehicle for a number of specialized tasks. Details including the fundamental architecture, instruction set, and memory model can be approached with total freedom. But can such an approach provide the level of performance demanded by embedded applications?

Efficiency and Overhead

To illustrate that efficiency is a subjective matter, consider what happens when a typical C program is compiled to run on an 8051 processor. It’s inconceivable that, on such an architecture, any C statement will compile down to a single corresponding 8051 instruction. A single C statement invariably results in the execution of multiple instruction steps. It follows that, given an efficient simulated instruction set, the simulation overhead might account for a very small percentage of the overall execution time.

The key behind making this premise work is to devise an instruction set and processor architecture that’s conducive to performing the types of operations that a C compiler naturally generates. In such an implementation, the contrived instruction set essentially amounts to an intermediate language. The op codes merely serve as a vehicle for succinctly conveying the compiler’s directives to the target processor for execution.

The target processor, while performing the functions of a simulator, interprets the intermediate instructions to perform the functions specified in the original high level language source statements. The resulting efficiency can be quite tolerable since the bulk of the instructions would execute regardless of whether they were emitted directly by the compiler or invoked by the simulation kernel.
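The interpretation step described above can be pictured as a small fetch, decode, and execute loop. The following is only a sketch: the op code values and the three-instruction subset are hypothetical and are not the C-FLEA encoding.

```c
#include <stdint.h>
#include <stddef.h>

/* Hypothetical op codes for illustration only; not the real encoding. */
enum { OP_HALT = 0, OP_LOAD_IMM = 1, OP_ADD_IMM = 2, OP_MUL_IMM = 3 };

/* Run a tiny program. Every instruction except OP_HALT is one op code byte
 * followed by a 16-bit operand stored low byte first.
 * Returns the final accumulator value. */
uint16_t vm_run(const uint8_t *prog, size_t len)
{
    uint16_t acc = 0;
    size_t pc = 0;

    while (pc < len) {
        uint8_t op = prog[pc++];                      /* fetch */
        uint16_t operand;

        if (op == OP_HALT || pc + 2 > len)
            break;
        operand = (uint16_t)(prog[pc] | (prog[pc + 1] << 8));
        pc += 2;

        switch (op) {                                 /* decode and execute */
        case OP_LOAD_IMM: acc = operand; break;
        case OP_ADD_IMM:  acc = (uint16_t)(acc + operand); break;
        case OP_MUL_IMM:  acc = (uint16_t)((uint32_t)acc * operand); break;
        default:          return acc;                 /* unknown op code: stop */
        }
    }
    return acc;
}
```

The dispatch cost is a fetch and a switch per instruction; the work inside each case is the part that would run whether the compiler emitted it natively or the kernel invoked it.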

It turns out the performance penalty of such an approach is, to a great extent, dependent on the way the program memory itself is implemented. Since the AT89C2051 has no external bus structure it makes sense to use a serial bus to access the program memory. Using I2C for this purpose provides the required flexibility along with reasonable throughput.

Selecting I2C as a memory bus presents the potential of choosing from a wide variety of EEPROM memory devices. The most favorable configuration is Atmel’s AT24C64 that offers 8K bytes of storage in an 8-pin package. Utilizing extended 16-bit addressing, the AT24C64 provides linear access to the entire internal memory array. And although a lot of functionality can be crammed into a single chip, additional devices can easily be added in 8K increments to handle very complex applications. Up to eight AT24C64s can simultaneously reside on the I2C bus providing a full 64K of storage while using just two wires.

Of course, serial memory access does come at a cost. In this case the expense comes in the form of access time. To an extent, this is moderated by the fact that the AT24C64 can operate at a 400 kHz clock rate (standard I2C is specified at a maximum of 100 kHz). Remember however, that I2C can exact a significant performance penalty because a substantial percentage of its bandwidth can be consumed for control functions.
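One way to picture eight 8K devices forming a flat 64K space is to let the top three bits of a 16-bit address select the chip and the low 13 bits select the byte within it. The sketch below assumes chip n is strapped with A2..A0 = n, which is a wiring choice for illustration rather than something this note specifies; the names are hypothetical.

```c
#include <stdint.h>

/* Where a 16-bit virtual address lands in a bank of AT24C64 devices. */
typedef struct {
    uint8_t  device_write;  /* I2C device byte 1010 A2 A1 A0 0 */
    uint8_t  device_read;   /* I2C device byte 1010 A2 A1 A0 1 */
    uint16_t word_address;  /* 0..8191 within the selected chip */
} eeprom_location;

/* Assumption: chip n has its A2..A0 pins wired to the binary value n. */
eeprom_location eeprom_locate(uint16_t virtual_address)
{
    eeprom_location loc;
    uint8_t chip = (uint8_t)(virtual_address >> 13);        /* top 3 bits */

    loc.device_write = (uint8_t)(0xA0u | (uint8_t)(chip << 1));
    loc.device_read  = (uint8_t)(loc.device_write | 1u);
    loc.word_address = (uint16_t)(virtual_address & 0x1FFFu);
    return loc;
}
```

With this mapping, address 0x2005 falls at word 5 of the second chip, so a sequential fetch only needs a new device address when it crosses an 8K boundary.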


The greatest overhead burden that I2C imposes involves the transfer of addressing information. For every random read or write, a 16-bit address must be transmitted along with the extra overhead necessary to coordinate bus control for both the addressing phase and the data manipulation phase. Under such conditions, actual data movement could be swamped by the requisite overhead resulting in unacceptable performance degradation. Fortunately, I2C provides a means of eliminating much of this wasteful activity.

The AT24C64, like all other I2C memory devices, contains an internal auto-increment address generator. Using this feature, once addressability is established, data can be continually streamed in a sequential fashion. As each byte is read and acknowledged the internal address generator increments in preparation for the next byte transfer. The AT24C64 sets the maximum speed limit at 400 kHz but I2C does not impose a lower limit. Effectively, the minimum frequency can drop all the way to DC. As a result, it’s acceptable to suspend a sequential transfer for as long as necessary.

Utilizing these features, communications can be sped up considerably. The ramifications are particularly significant when the memory is used to store an executable program. For example, once an address is written into the AT24C64, data can be fetched in a continual stream until the program branches or, if multiple AT24C64s are used, until it becomes necessary to cross into the next chip. At these points it’s necessary to explicitly reload the internal address generator. Normally, however, the majority of the accesses will be sequential, resulting in greatly reduced overhead.
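A rough count of bus clocks shows why sequential access matters. The sketch below assumes nine clocks per transferred byte (eight bits plus acknowledge) and ignores START and STOP conditions; a random read is taken as a device address, two word-address bytes, a second device address, and one data byte. The function names are hypothetical.

```c
#include <stdint.h>

enum { CLOCKS_PER_BYTE = 9 };   /* 8 data bits plus the acknowledge bit */

/* Clocks for one random read: device address (write), two word-address
 * bytes, device address (read), one data byte. START/STOP are ignored. */
uint32_t random_read_clocks(void)
{
    return 5u * CLOCKS_PER_BYTE;
}

/* Clocks to fetch n bytes as one random read followed by n-1 sequential reads. */
uint32_t sequential_read_clocks(uint32_t n)
{
    if (n == 0)
        return 0;
    return random_read_clocks() + (n - 1u) * CLOCKS_PER_BYTE;
}

/* Transfer time in nanoseconds at the given SCL frequency in hertz. */
uint64_t clocks_to_ns(uint32_t clocks, uint32_t scl_hz)
{
    return ((uint64_t)clocks * 1000000000u) / scl_hz;
}
```

Under these assumptions, 16 bytes fetched as independent random reads cost 16 x 45 = 720 clocks, while the same 16 bytes streamed sequentially cost 180 clocks, or 450 microseconds at 400 kHz.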

Processor Simulators and Language Interpreters

It’s important to note the distinction between language specific interpreters that implement a defined language such as BASIC, and a processor simulator that interprets a low level binary instruction set. A tokenized BASIC interpreter, while quite efficient in executing the commands that are explicitly implemented as part of the language, is strictly confined to what the language supports. The inherent efficiency of an interpreted language comes at the expense of flexibility.

In contrast, a processor simulator that deals with a true binary instruction set enjoys total freedom in combining these basic op codes into larger functional entities in almost limitless permutations. Just like a real processor, a simulated processor can utilize its instruction set for standard and custom C library functions, floating point libraries, device drivers, etc.


The Virtual Machine — An Imaginary Processor

The processor to be described is imaginary in the sense that its architecture and instruction set are original and unique. Realize, however, that this is not just a toy or an intellectual diversion; from an implementation standpoint it is quite real. The fundamental concept has been successfully ported to a variety of processor architectures. A version exists that runs on a personal computer that is suitable for demonstration and development purposes. The most promising small-system port has been to the AT89C2051 due to the microcontroller’s standard processing core and integrated peripheral set. The basic 8K Virtual Machine is schematically depicted in Figure 1. The circuit’s simplicity reveals that this is primarily a software implementation: the definitive soft machine.

This imaginary processor, the product of Dunfield Development Systems, has served in various applications providing reliable solutions to real world problems where a standard configuration was not necessary, optimal, or practical. That this Virtual Machine also goes by the name “C-FLEA” affirms its optimization for efficiently rendering the output of a C language code generator.

The prime currency of a processor is time. Viewed in this context, the expense of complexity can prove unacceptably burdensome. Taking this into consideration, the Virtual Machine, based on a simple 16-bit architecture that incorporates only four registers, is the epitome of simplicity. This register set comprises an accumulator, index register, stack pointer, and program counter. Appendix A provides detailed information about the Virtual Machine architecture and instruction set. Refer to Table 1 for a description of the fundamental resource set.

Although the Virtual Machine performs all operations to 16-bit precision, the needs of many embedded systems resolve to 8 bits. To accommodate this common denominator, the Virtual Machine stores data in little endian format (low byte first), which allows a given variable’s base address to refer to either an 8-bit or 16-bit quantity. Interestingly, the architecture provides no user accessible flags. When invoking a compare instruction, internal flags persist only long enough to accommodate the ensuing branch instruction or the intervening compare modifiers (which are described later).
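The benefit of low-byte-first storage can be shown with two small loads from the same base address. This is a sketch with hypothetical function names; it assumes the memory array covers every address that is read.

```c
#include <stdint.h>

/* Load an 8-bit value from a variable's base address, zero-filled to 16 bits. */
uint16_t load_byte(const uint8_t *mem, uint16_t addr)
{
    return mem[addr];
}

/* Load a 16-bit value stored low byte first from the same base address. */
uint16_t load_word(const uint8_t *mem, uint16_t addr)
{
    return (uint16_t)(mem[addr] | (mem[(uint16_t)(addr + 1u)] << 8));
}
```

If a variable holds 0x0034, both loads return the same value from the same address, so code that only needs the low 8 bits never has to adjust the pointer.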

This spartan register set is made workable by the inclusion of a variety of addressing modes that excel at the types of stack manipulations that are central to the canonical C implementation. The Virtual Machine’s memory access instructions, detailed in Table 2, include the following addressing modes: immediate (8 or 16 bit), direct, indirect through index register (with or without offset), indirect through stack with offset, top of stack, and indirect through top of stack (remove or leave value on stack).


The bulk of the virtual instruction set is presented in Table 3. These instructions include memory access instructions, arithmetic instructions, and logical instructions. In keeping with the previously established proposition, most can return either bytes or words.

Since the compare instructions are designed to only determine equality, the instruction set is augmented by a set of special compare modifiers. Using these, nuances of relative (signed and unsigned) magnitude can be coerced from the basic compare instructions. These modifiers are described in Table 4.
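One plausible way for a kernel to support the modifiers is to remember the operands of the most recent compare and let a following modifier overwrite the accumulator. The sketch below is an assumption about the mechanism, not a description of the actual kernel; the structure, the function names, and the operand order (accumulator on the left) are all hypothetical.

```c
#include <stdint.h>

/* Hypothetical kernel state: the operands of the most recent CMP. */
typedef struct {
    uint16_t acc;
    uint16_t last_left;    /* ACC value before the compare */
    uint16_t last_right;   /* operand it was compared with */
} vm_state;

/* CMP: ACC = 1 if equal, else 0; operands are kept for a modifier. */
void vm_cmp(vm_state *vm, uint16_t operand)
{
    vm->last_left = vm->acc;
    vm->last_right = operand;
    vm->acc = (vm->last_left == vm->last_right) ? 1u : 0u;
}

/* LT: signed comparison, treating both operands as 16-bit two's complement. */
void vm_lt(vm_state *vm)
{
    vm->acc = ((int16_t)vm->last_left < (int16_t)vm->last_right) ? 1u : 0u;
}

/* ULT: unsigned comparison of the same saved operands. */
void vm_ult(vm_state *vm)
{
    vm->acc = (vm->last_left < vm->last_right) ? 1u : 0u;
}
```

Comparing 0xFFFF with 1 illustrates the difference: the signed modifier reports "less than" because 0xFFFF reads as -1, while the unsigned modifier does not.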

Program branching is supported using the relatively conventional set of conditional and unconditional jump instructions shown in Table 5. Versions are provided for both near and far destination targets to enhance code efficiency. Note the inclusion of the SWITCH instruction which proves especially useful since the “normal” compare instructions destroy the contents of the accumulator when returning the result of the compare operation.
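The switch table format given in the note to Table 5 (addr1, value1, addr2, value2, ... 0, default addr) suggests a simple linear search that never disturbs the value being tested. The sketch below models the table as an array of 16-bit words, which is an assumption about entry size; the function name is hypothetical.

```c
#include <stdint.h>

/* Walk a switch table laid out as addr1, value1, addr2, value2, ..., 0,
 * default_addr and return the jump target for `value`. An address of 0
 * marks the end of the case list. */
uint16_t switch_target(const uint16_t *table, uint16_t value)
{
    while (table[0] != 0) {
        if (table[1] == value)
            return table[0];
        table += 2;
    }
    return table[1];   /* the word after the 0 terminator is the default */
}
```

Because the search keeps the tested value intact across every case, no reload is needed between comparisons, which is the inconvenience the plain compare instructions would impose.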

Table 6 presents the stack manipulation set. Included are common functions such as CALL, RETurn, and PUSH. Conspicuously absent is an explicit POP instruction. The corresponding functionality is provided by the various addressing modes that, by default, manipulate the top of the stack. For instance, POP A is synonymous with LD S+. Additional instructions are included to facilitate stack frame creation and destruction that is a necessary function of the C language implementation.
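The relationship between PUSHA, LD S+, ALLOC, and FREE can be sketched as follows. The sketch assumes the stack pointer is decremented before the store (the appendix says only that it decrements on a push), uses a 256-byte memory instead of the full 64K, and does no bounds checking; all names are hypothetical.

```c
#include <stdint.h>

typedef struct {
    uint16_t acc;
    uint16_t sp;
    uint8_t  mem[256];   /* small memory for the sketch; the real space is 64K */
} vm_stack;

/* PUSHA: make room for 16 bits, then store ACC low byte first. */
void push_acc(vm_stack *vm)
{
    vm->sp = (uint16_t)(vm->sp - 2u);
    vm->mem[vm->sp]      = (uint8_t)(vm->acc & 0xFFu);
    vm->mem[vm->sp + 1u] = (uint8_t)(vm->acc >> 8);
}

/* LD S+: load ACC from the top of stack and remove it (acts as POP A). */
void load_pop(vm_stack *vm)
{
    vm->acc = (uint16_t)(vm->mem[vm->sp] | (vm->mem[vm->sp + 1u] << 8));
    vm->sp = (uint16_t)(vm->sp + 2u);
}

/* ALLOC n: reserve n bytes of local storage on the stack. */
void alloc_frame(vm_stack *vm, uint8_t n)
{
    vm->sp = (uint16_t)(vm->sp - n);
}

/* FREE n: release n bytes of local storage. */
void free_frame(vm_stack *vm, uint8_t n)
{
    vm->sp = (uint16_t)(vm->sp + n);
}
```

A C function prologue would typically push its arguments, ALLOC its locals, and reach both through the stack-offset addressing mode before a matching FREE on exit.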

Finally, the virtual instruction set is rounded out with a number of miscellaneous instructions shown in Table 7. For the most part, these perform standard functions that should be self explanatory. The input/output instructions are special in that they offer an implementation specific avenue for establishing certain peripheral functions as instructions. Remember that, even though the virtual instruction set offers the programmer total freedom to construct any kind of computational sequence, all I/O operations are dependent on the support coded into the Virtual Machine kernel. Essentially, the simulation kernel is the software embodiment of a microprocessor architecture. Naturally, the goal is to provide a general purpose engine capable of serving in a wide variety of real embedded systems.

A significant number of op codes remain unassigned and are available for future use.

Initial Program Loader

While not actually part of the Virtual Machine, the simulation kernel contains a built-in program loader utility. This operates serially and is invoked following a system reset by a sequence of special commands from a utility program running on the host computer. In addition to transferring the load image to the Virtual Processor, the PC program provides a number of features which include a simulator (that can hook into the target’s logical and physical I/O subsystem) and a console window for performing user I/O to the target system. Since the Virtual Machine’s code generator emits a standard Intel HEX file format, the use of the PC utility program is optional.
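Because the load image is standard Intel HEX, any loader only has to decode records of the form :LLAAAATT followed by LL data bytes and a checksum. The sketch below parses a single record; it is not the kernel's loader, the function names are hypothetical, and the caller is assumed to have stripped any trailing line ending.

```c
#include <stdint.h>
#include <stddef.h>
#include <string.h>

/* Convert one hexadecimal digit to its value, or -1 if it is not a digit. */
static int hex_nibble(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Parse one record ":LLAAAATT<data>CC". On success returns the data length
 * and fills *address, *type and data[]; returns -1 on a malformed record
 * or a checksum mismatch. data must have room for 255 bytes. */
int ihex_parse(const char *line, uint16_t *address, uint8_t *type, uint8_t *data)
{
    uint8_t bytes[5 + 255];
    size_t n = strlen(line);
    size_t count, i;
    uint8_t sum = 0;

    if (n < 11 || line[0] != ':' || (n - 1) % 2 != 0)
        return -1;
    count = (n - 1) / 2;
    if (count > sizeof bytes)
        return -1;
    for (i = 0; i < count; i++) {
        int hi = hex_nibble(line[1 + 2 * i]);
        int lo = hex_nibble(line[2 + 2 * i]);
        if (hi < 0 || lo < 0)
            return -1;
        bytes[i] = (uint8_t)((hi << 4) | lo);
        sum = (uint8_t)(sum + bytes[i]);      /* all bytes must sum to zero */
    }
    if ((size_t)bytes[0] != count - 5 || sum != 0)
        return -1;
    *address = (uint16_t)((bytes[1] << 8) | bytes[2]);
    *type = bytes[3];
    memcpy(data, &bytes[4], bytes[0]);
    return bytes[0];
}
```

A data record (type 00) yields bytes destined for the serial EEPROM at the given address, and a type 01 record marks the end of the file.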

In principle, there is no reason why an AT24C64 cannot be programmed externally using a standard device programmer just as you would program an EPROM for use in a typical embedded computer. Although workable, this approach would, at the least, prove cumbersome throughout the development cycle. The difficulty of this approach would be exacerbated in a system using multiple memory chips. Obviously, it would be completely unworkable in the event a Virtual Machine computer was rendered as a surface mount assembly.


Figure 1. 8K Virtual Machine

[Schematic not reproduced. Parts labeled in the figure: AT89C2051 (RST, XTAL1, XTAL2, P1.0 through P1.7, RXD/P3.0, TXD/P3.1, INT0/P3.2, INT1/P3.3, T0/P3.4, T1/P3.5, P3.7, GND), AT24C64 (A0, A1, A2, SCL, SDA), 14.7456 MHz crystal, 22 pF, 10 µF, 0.1 µF, 10K, two 4.7K resistors, and a 1N914 diode.]


Virtual Machine I/O

The Virtual Machine handles physical I/O (as well as virtual I/O) through the use of input/output instructions. It is natural to reserve certain I/O addresses for on-chip functions such as serial I/O and for access to the AT89C2051’s on-chip parallel I/O ports. Additional I/O addresses are assigned to second level functions such as serial port configuration and direct I/O bit set and clear functions. The bit manipulation functions are important when an on-chip parallel port is simultaneously used for both input and output.

Consider the ramifications of performing a standard read/modify/write operation on such a port. Normally this would be accomplished by reading the port via an IN instruction, performing a logical operation on the value, and writing the modified data back to the port using OUT. Should an input pin be externally pulling low while the port was being read, the unfortunate outcome of this exercise would be to render that line permanently jammed low and unusable for any further input!
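The hazard can be reproduced with a simplified model of a quasi-bidirectional port, in which a pin reads high only when its latch bit is 1 and nothing external pulls it low. This is an illustrative model, not the kernel's I/O code, and all names are hypothetical.

```c
#include <stdint.h>

/* Simplified port model: latch holds what the program last wrote;
 * external_low has a 1 for every pin currently pulled low from outside. */
typedef struct {
    uint8_t latch;
    uint8_t external_low;
} port_model;

/* Read the pins, as the virtual IN instruction would. */
uint8_t port_in(const port_model *p)
{
    return (uint8_t)(p->latch & (uint8_t)~p->external_low);
}

/* Write the latch, as the virtual OUT instruction would. */
void port_out(port_model *p, uint8_t value)
{
    p->latch = value;
}

/* Hazardous: read the pins, modify, write back. Any input pin that was
 * low at the moment of the read gets a 0 written into its latch. */
void set_bit_rmw(port_model *p, uint8_t mask)
{
    port_out(p, (uint8_t)(port_in(p) | mask));
}

/* Safer: a dedicated bit-set function that touches only the latch. */
void set_bit_direct(port_model *p, uint8_t mask)
{
    p->latch = (uint8_t)(p->latch | mask);
}
```

In the model, setting bit 7 with the read/modify/write version while bit 0 is pulled low leaves bit 0 reading low even after the external source releases it, whereas the direct version leaves the input usable.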

Additional virtual input/output devices are provided for functions such as time-of-day clock, general system timing, pulse width modulation, and pulse accumulation. These are implemented as background interrupt service routines and are accessed as simple input/output devices.

Serviceable as the basic I/O resource set may be, it’s often necessary to provide ancillary I/O functions external to the processor. The Virtual Machine accomplishes this transparently by passing any undefined I/O addresses to the external peripheral trap. This handler uses a secondary I2C bus to implement an auxiliary external peripheral/memory channel.

Here, the instruction’s I/O address is taken as the I2C slave address. For output operations data is passed via the low byte of the virtual accumulator. Input functions return data in the low byte of the virtual accumulator. In both cases the accumulator’s high byte is utilized to convey completion status and can be interrogated to determine the outcome of the requested operation. The result code reflects the status of the data link transfer and either indicates valid completion or fault status. Should a fault be reported it could be the result of a peripheral in busy status, a device that is not present, or a legitimate peripheral malfunction.
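The split of the accumulator into a status byte and a data byte can be expressed with a few helpers. The status code values below are hypothetical; the note says only that the high byte reports valid completion or a fault.

```c
#include <stdint.h>

/* Hypothetical status codes for illustration. */
enum { IO_OK = 0x00, IO_FAULT = 0x01 };

/* Build the 16-bit accumulator value returned by an external IN:
 * completion status in the high byte, data in the low byte. */
uint16_t io_result(uint8_t status, uint8_t data)
{
    return (uint16_t)(((uint16_t)status << 8) | data);
}

/* Extract the completion status from the accumulator. */
uint8_t io_status(uint16_t acc)
{
    return (uint8_t)(acc >> 8);
}

/* Extract the data byte from the accumulator. */
uint8_t io_data(uint16_t acc)
{
    return (uint8_t)(acc & 0xFFu);
}
```

A virtual program would check the status byte after each external transfer before trusting the data byte, since a busy or absent peripheral produces a fault rather than data.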

Virtual Machine Assembly

To clarify the relationship between the Virtual Machine kernel, a virtual assembly language library function, and a virtual C application program, an example is in order. This will also serve to illustrate how easily communication to the outside world can be orchestrated in such an environment.

The program depicted in Listing 1 is a library function that supports console I/O using a special I2C user I/O module. (This is the same module that was detailed in the application note “A Framework for Peripheral Expansion.”) The user I/O module contains a standard 20 x 4 LCD, 4 x 4 keypad, and beeper. These are supported using two I2C-to-parallel port expanders. The underlying premise is that, once the data transport mechanism is hidden, the I2C ports can be used just like any conventional I/O ports. In this case the concealment is complete since the I2C driver is written in the AT89C2051’s native instruction set and is therefore completely invisible and inaccessible to a virtual program running on the Virtual Machine. Reading and writing to I2C devices now becomes strictly a matter of IN and OUT.

Looking again to Listing 1 reveals how virtual instructions can be combined to generate a useful program. Far from being constrictive, the virtual instruction set yields an economy of expression while retaining a great deal of flexibility. The limited number of registers does, however, require a reliance on the stack for parameter passing and for holding intermediate results. This shouldn’t be surprising considering the fact that the Virtual Machine is primarily designed as a C engine. Anyone familiar with the way a C compiler utilizes the stack frame should have little trouble adapting these concepts to writing efficient assembler programs.

Virtual Machine Compilation

Not much can be said about the compilation process for the Virtual Machine. This is truly a virtue since, after all, the primary purpose of a language compiler is to insulate the programmer from the complexities of a particular processor. To those experienced with C compilers for 8051 processors, the most notable omission here is the absence of the multiplicity of libraries for the various memory models that are so necessary when working with a native 8051. Recall that the Virtual Machine supports a single, eminently reasonable, flat 64K memory space.

Listing 2 reveals that there is nothing special and, more importantly, that there are no artificial limitations inherent in a C program written for the Virtual Machine. This program implements a simple calculator function that uses the I2C user I/O module as the system console device and utilizes the long math functions from the Virtual Machine math library. The actual functionality behind this module is secondary. What is more important is that it looks like a C program, behaves like a C program, and can be abused like a C program. In short, it can be coerced to do the things you need a typical embedded program to do.


Pint Sized Computer

Although tiny by any scale of measure, the Virtual Machine behaves the way you would expect any self respecting processor to behave, virtual or not. More to the point, the Virtual Machine in actuality is a fully functional computer system. You would be hard pressed to find a smaller, fully functional computer with comparable capability that adequately supports the C programming language.

Using surface mount manufacturing techniques, a fully operational computer can be constructed to fit into an area the size of a postage stamp. The Virtual Machine’s large program memory space, combined with its secondary I2C memory/peripheral bus, makes the architecture suitable for handling a number of relatively ambitious embedded projects. Its minuscule size allows it to be placed anywhere.

Sources

If you are interested in experimenting with the Virtual Machine concept, a fully operational PC based Virtual Machine simulator, C compiler with libraries, and assembler are available for downloading from the Dunfield Development Systems bulletin board at (613) 256-6289. For availability of the Virtual Machine processor, development system, and support software contact Mid-Tech Computing Devices USA; P.O. Box 218; Stafford, CT 06075 (860) 684-2442.

To obtain the code for Listing 1 and Listing 2, download it from Atmel’s Web Site or BBS.


Appendix A: Virtual Machine Architecture

Table 1. Fundamental Resource Set

ACC     16-bit accumulator; 8-bit accesses are auto zero-filled
INDEX   16-bit addressing register; cannot be manipulated as 8 bits
SP      16-bit stack pointer
PC      16-bit program counter

Table 2. General Addressing Modes

Syntax  Coding      Description
#n      x0 ii(ii)   Immediate (8 or 16-bit operand)
aaaa    x1 dd dd    Direct memory address
I       x2          Indirect (through INDEX register) no offset
n,I     x3 oo       Indirect (through INDEX register) with 8-bit offset
n,S     x4 oo       Indirect (through SP) with 8-bit offset
S+      x5          On Top of Stack (remove)
[S+]    x6          Indirect through TOS (remove)
[S]     x7          Indirect through TOS (leave on stack)

Notes: 1. Addressing mode is in lower 3 bits of op code.

2. Mode S+ always pops 16 bits from stack. Only 16-bit values can be pushed.

3. Modes [S+] and [S] will always use a 16-bit address on the top of the stack but the final target can be 8 or 16 bits.
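Since the addressing mode occupies the lower 3 bits of the op code, decoding it and determining how many operand bytes follow is a matter of a mask and a small lookup. The sketch below follows the Coding column of Table 2; the enumerator and function names are hypothetical, and the `wide` flag stands in for whatever the kernel uses to tell an 8-bit immediate from a 16-bit one.

```c
#include <stdint.h>

/* Addressing modes in the order of the x0..x7 codings of Table 2. */
typedef enum {
    MODE_IMMEDIATE = 0,   /* #n    */
    MODE_DIRECT,          /* aaaa  */
    MODE_INDEX,           /* I     */
    MODE_INDEX_OFFSET,    /* n,I   */
    MODE_STACK_OFFSET,    /* n,S   */
    MODE_TOS_POP,         /* S+    */
    MODE_IND_TOS_POP,     /* [S+]  */
    MODE_IND_TOS_KEEP     /* [S]   */
} addr_mode;

/* The mode is carried in the lower 3 bits of the op code. */
addr_mode decode_mode(uint8_t opcode)
{
    return (addr_mode)(opcode & 0x07u);
}

/* Operand bytes that follow the op code. `wide` selects a 16-bit
 * immediate and only matters for MODE_IMMEDIATE. */
unsigned operand_bytes(addr_mode mode, int wide)
{
    switch (mode) {
    case MODE_IMMEDIATE:    return wide ? 2u : 1u;
    case MODE_DIRECT:       return 2u;
    case MODE_INDEX_OFFSET:
    case MODE_STACK_OFFSET: return 1u;
    default:                return 0u;   /* I, S+, [S+], [S] */
    }
}
```

Knowing the operand length up front lets the kernel keep streaming bytes sequentially from the serial EEPROM without re-sending an address.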


Table 3. Memory Addressing Instructions

Name    Description (Unused Address Modes)
LD      Load ACC 16 bits
LDB     Load ACC 8 bits
ADD     Add 16 bits
ADDB    Add 8 bits
SUB     Subtract 16 bits
SUBB    Subtract 8 bits
MUL     Multiply by 16 bits
MULB    Multiply by 8 bits
DIV     Divide by 16 bits
DIVB    Divide by 8 bits
AND     And 16 bits
ANDB    And 8 bits
OR      OR 16 bits
ORB     OR 8 bits
XOR     XOR 16 bits
XORB    XOR 8 bits
CMP     Compare 16 bits (ACC = 1 if equal)
CMPB    Compare 8 bits
LDI     Load INDEX (16 bits only)
LEAI    Load INDEX with address (00, 05)
ST      Store ACC 16 bits (00, 05)
STB     Store ACC 8 bits (00, 05)
STI     Store INDEX (16 bits only) (00, 05)
SHR     Shift right (8-bit count only)
SHL     Shift left (8-bit count only)

Notes: 1. ACC always contains 16 valid bits. All operations are performed in 16-bit precision. 8-bit operands are zero-filled when they are fetched.

2. SP decrements when data is pushed.

3. Data is stored in little endian format.

4. There are no user accessible flags. In the case of CMP, internal flags are maintained only long enough to accommodate the LT-UGE instructions.


Table 4. Compare Modifiers

Name    Description
LT      ACC = 1 if less than (signed)
LE      ACC = 1 if less than or equal (signed)
GT      ACC = 1 if greater than (signed)
GE      ACC = 1 if greater than or equal (signed)
ULT     ACC = 1 if lower than (unsigned)
ULE     ACC = 1 if lower than or same (unsigned)
UGT     ACC = 1 if higher than (unsigned)
UGE     ACC = 1 if higher than or same (unsigned)

Notes: 1. These instructions must immediately follow a CMP instruction.

2. NOT instruction is used to implement explicit NE.

Table 5. Jump Instructions

Name    Description
JMP     Long jump (16-bit absolute)
JZ      Long jump if ACC=0 (16-bit absolute)
JNZ     Long jump if ACC!=0 (16-bit absolute)
SJMP    Short jump (8-bit PC offset)
SJZ     Short jump if ACC=0 (8-bit PC offset)
SJNZ    Short jump if ACC!=0 (8-bit PC offset)
IJMP    Indirect jump (Address in ACC)
SWITCH  Jump through switch table (ACC=value, INDEX=table)

Note: 1. Switch table format: addr1, value1, addr2, value2, ... 0, default addr

Table 6. Stack Manipulation Instructions

Name    Description
CALL    Call subroutine (16-bit absolute address)
RET     Return from subroutine
ALLOC   Allocate space on stack (8-bit value)
FREE    Release space on stack (8-bit value)
PUSHA   Push ACC on stack
PUSHI   Push INDEX on stack
TAS     Copy ACC to SP
TSA     Copy SP to ACC

Note: 1. Explicit POP instructions are not required since various addressing modes use and remove the top item on stack.


Table 7. Miscellaneous Instructions

Name    Description
CLR     Zero ACC
COM     Complement ACC (ACC = ACC XOR FFFF)
NEG     Negate ACC (ACC = 0 - ACC)
NOT     ACC = 1 if ACC was 0, else ACC = 0
INC     Increment ACC
DEC     Decrement ACC
TAI     Copy ACC to INDEX
TIA     Copy INDEX to ACC
ADAI    Add ACC to INDEX
ALT     Get alternate result from MUL/DIV
OUT     Output byte in ACC to PORT
IN      Read byte from PORT
SYS     System interface function

Note: 1. ALT obtains the remainder after DIV and obtains the high word after a multiply. This instruction must be executed immediately after the MUL or DIV.
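The role of ALT can be sketched by having MUL and DIV park their second result where a following ALT can retrieve it. The holding field, the function names, the use of unsigned arithmetic, and the handling of a zero divisor are all assumptions of this sketch; the note does not describe them.

```c
#include <stdint.h>

typedef struct {
    uint16_t acc;
    uint16_t alt;   /* hypothetical holding place for the alternate result */
} vm_alu;

/* MUL: low word of the product goes to ACC, high word is kept for ALT. */
void vm_mul(vm_alu *vm, uint16_t operand)
{
    uint32_t product = (uint32_t)vm->acc * operand;
    vm->acc = (uint16_t)(product & 0xFFFFu);
    vm->alt = (uint16_t)(product >> 16);
}

/* DIV: quotient goes to ACC, remainder is kept for ALT.
 * Returns 0 on success, -1 if the divisor is zero (state is unchanged). */
int vm_div(vm_alu *vm, uint16_t operand)
{
    if (operand == 0)
        return -1;
    vm->alt = (uint16_t)(vm->acc % operand);
    vm->acc = (uint16_t)(vm->acc / operand);
    return 0;
}

/* ALT: replace ACC with the alternate result of the preceding MUL or DIV. */
void vm_alt(vm_alu *vm)
{
    vm->acc = vm->alt;
}
```

This is how a C compiler targeting the machine could implement the % operator (DIV followed by ALT) or build 32-bit products from 16-bit multiplies.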
