Using Software Overlays with the
ADSP-219x and VisualDSP++ 2.0
(Some assembly required…)
Last modified: February 6, 2002
Introduction
This application note defines software overlays and discusses their uses. Overlay management techniques are covered first at an abstract level and then in further detail. Code examples are provided to help illustrate these topics.
What are Software Overlays?
As DSP software applications have grown more complex, system memory requirements have increased as a result. Because of this, an application may actually exceed the internal memory size of a particular DSP. This is where a software overlay system comes into play.
Also, in many cost-sensitive applications, it’s to the hardware designer’s advantage to design a system with the least expensive DSP (which typically means less on-chip memory is available). Since the cost of bulk memory such as SRAMs and EPROMs is small compared to the cost of a DSP, it is sometimes more cost-efficient to have code and/or data reside in cheaper external memory. This is another scenario in which software overlays can be implemented in a system design.
Soft overlays are a “many to one” memory mapping system. Several overlays can ‘live’ (or be stored) in unique locations in external memory, but they ‘run’ (or execute from) a common location in the internal memory of the DSP. Soft overlays are not physically present in the DSP’s internal SRAM at all times; rather they are transferred/fetched into internal memory from external memory dynamically at runtime.
Say, for example, that your software system has 10 functions, all of which comprise a total of 120k words of Program Memory (PM), but your DSP only has a maximum of 32k PM locations. What do you do? With software overlays, you can fetch the desired function at runtime into the DSP’s internal memory and then execute this function. Accessing code and/or data overlays dynamically gives you greater flexibility with your DSP’s internal memory requirements.
Figure 1: Simple Memory Overlay Example
Figure 1 demonstrates the concept of memory overlays. In this figure there are two memory spaces: the internal memory of the DSP, and external memory. For this example, the external memory is partitioned into six overlays, comprising three functions and three data buffers. The internal memory contains the main program code, an overlay manager function, and two segments reserved for execution of overlay program instructions and data. From this example, we also see a “many to one” mapping, where Program Memory overlays 1, 2, and 3 map to the same overlay ‘run’ space. (Likewise, data overlays 4, 5, and 6 map to the Data Memory overlay ‘run’ space.)
Copyright 2001, Analog Devices, Inc. All rights reserved. Analog Devices assumes no responsibility for customer product design or the use or application of customers’ products or for any infringements of patents or rights of others which may result from Analog Devices assistance. All trademarks and logos are property of their respective holders. Information furnished by Analog Devices Applications and Development Tools Engineers is believed to be accurate and reliable, however no responsibility is assumed by Analog Devices regarding the technical accuracy of the content provided in all Analog Devices’ Engineer-to-Engineer Notes.
In this example, overlays 1, 2, and 3 share the same runtime location within the DSP’s internal memory. If the main program calls the function ‘Function_1’, the overlay manager will be invoked to load overlay 1 into the memory segment within the DSP’s internal memory where it has been designated to run. If the function ‘Function_3’ is then requested by the main program, the overlay manager will again be invoked to load overlay 3 into the same run-time memory segment, overwriting overlay 1. We will cover what an overlay manager is, and what its role is in a soft overlay system, in more detail later on in this EE note.
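The “many to one” mapping described above can be modeled on a host machine in a few lines of C. This is only an illustrative sketch and not DSP code: the names `external_memory`, `pm_run_space`, and `load_overlay`, the overlay size, and the data values are all assumptions made for the illustration, and `memcpy` stands in for the transfer that a real overlay manager would perform.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_OVERLAYS   3   /* PM overlays 1..3, as in Figure 1            */
#define OVERLAY_WORDS  4   /* hypothetical overlay size, in words         */

/* Hypothetical 'live' space: every overlay is stored at its own location. */
static uint32_t external_memory[NUM_OVERLAYS][OVERLAY_WORDS] = {
    {0x11, 0x12, 0x13, 0x14},   /* overlay 1 */
    {0x21, 0x22, 0x23, 0x24},   /* overlay 2 */
    {0x31, 0x32, 0x33, 0x34},   /* overlay 3 */
};

/* Hypothetical 'run' space: one internal segment shared by all overlays. */
static uint32_t pm_run_space[OVERLAY_WORDS];

/* Copy overlay 'id' (1-based) from its live space into the shared run
   space and return the run address. memcpy is a stand-in for the DMA. */
static const uint32_t *load_overlay(unsigned id)
{
    assert(id >= 1 && id <= NUM_OVERLAYS);
    memcpy(pm_run_space, external_memory[id - 1], sizeof pm_run_space);
    return pm_run_space;
}
```

Calling `load_overlay(1)` and then `load_overlay(3)` returns the same address both times; only the contents of the run space change, which is the essence of the overlay scheme.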
Software Overlays for the ADSP-2191M
Software overlays are a very important software
feature that can take advantage of the internal
DSP memory resources and I/O bandwidth of
the 2191’s external memory interface (EMI).
The ADSP-2191M has 32k words of Program
Memory (PM) and 32k words of Data Memory
(DM). Currently, there are also two additional variants in the 219x family: the ADSP-2195 and ADSP-2196. The 2195 contains 16k words each of PM and DM, while the 2196 contains 8k words each of PM and DM. Because some software applications may require more memory than is available on-chip, software overlays become increasingly important.
Another point to mention here is that the EMI of
the 219x runs at a slower rate than the DSP’s
core. Therefore, executing code or fetching data
from external memory will have an impact on
overall system performance. The attractive
feature of software overlays (for this case) is
that you can execute code and access data via
the DSP’s core, while simultaneously fetching
and loading the desired overlay into internal
memory in the background via the direct-memory access (DMA) controller of the ADSP-219x. (For more detailed information on EMI throughput on the 219x family, please refer to Table 7-10, page 7-26, of the “ADSP-219x/2191 DSP Hardware Reference”.)
What comprises a Soft Overlay?
Soft overlays have only a few attributes: an overlay ID#, ‘live’ address, ‘run’ address, ‘live’ size, and lastly, a ‘run’ size. Before explaining what these terms mean, let’s first talk about the two places where an overlay will exist in a system.
There are two terms associated with soft overlays: ‘live’ space and ‘run’ space. ‘Live’ space is the address range in external memory where the overlay resides. ‘Run’ space is the address range of the DSP’s internal memory where the overlay resides at runtime. (For code overlays, the ‘run’ address is the target address that the caller of the overlay will ‘jump’ or ‘call’ to in your program code. For data overlays, the ‘run’ address is the first location of the buffer or data type.)
So getting back to the overlay’s attributes, the
‘run’ address is the address in the DSP’s internal
memory where the code overlay will be
executed from or where the data overlay will
reside. The ‘live’ address is where in external
memory the overlay will reside. One important
point to mention here is that the ADSP-219x
family supports up to 16M words of addressing
(0x010000 – 0xfeffff) via its EMI; therefore the ‘run’ and ‘live’ addresses are 24-bit address fields.
EE-152 Page 2
Technical Notes on using Analog Devices’ DSP components and development tools
The ‘run’ and ‘live’ size attributes define the size of the overlay module in words; for the ADSP-219x both of these values are the same. (An important note here is that the run and live sizes must not exceed 64k words, whether for a PM overlay or a DM overlay. This is because the EMI of the ADSP-219x will not automatically cross page boundaries.)
The last attribute is the overlay ID#. This is a unique integer value which gets assigned to each overlay by the VisualDSP linker (linker.exe). (The first overlay gets assigned an ID# of 1, the second gets assigned an ID# of 2, etc.)
All of these overlay attributes are linker-generated constants which will be used by our
overlay manager. (We’ll cover this in much
more detail later on in this application note.)
So you can see from the overlay attributes that
soft overlays can reside at whatever internal
memory ‘run’ space that you define; more than
one overlay can map to a specific ‘run’ space.
For more complex overlay managers and
systems, a single overlay can map to more than
one ‘run’ space also; we’ll cover this in more
detail in the “Advanced Topics” section of this
application note.
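To make the five attributes concrete, the following host-side C sketch collects them in one record and checks the constraints stated in this section. The type name `overlay_attr`, the function `overlay_attr_valid`, and the macro names are hypothetical and chosen only for illustration; the numeric limits are the ones quoted in the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Limits quoted in the text of this section. */
#define EXT_ADDR_FIRST     0x010000u  /* first external address via the EMI   */
#define EXT_ADDR_LAST      0xFEFFFFu  /* last external address via the EMI    */
#define MAX_OVERLAY_WORDS  0x10000u   /* 64k words: EMI does not cross pages  */

/* Hypothetical record holding the five attributes of one overlay. */
typedef struct {
    uint16_t id;         /* overlay ID#, assigned by the linker from 1 up  */
    uint32_t live_addr;  /* 24-bit address in external 'live' space        */
    uint32_t run_addr;   /* 24-bit address in internal 'run' space         */
    uint32_t live_size;  /* size in words                                  */
    uint32_t run_size;   /* size in words; equals live_size on ADSP-219x   */
} overlay_attr;

/* Return true if the record respects the constraints described above. */
static bool overlay_attr_valid(const overlay_attr *ov)
{
    assert(ov != NULL);
    if (ov->id == 0)
        return false;                       /* IDs start at 1            */
    if (ov->live_addr < EXT_ADDR_FIRST || ov->live_addr > EXT_ADDR_LAST)
        return false;                       /* outside external range    */
    if (ov->run_size != ov->live_size)
        return false;                       /* sizes must match          */
    if (ov->live_size > MAX_OVERLAY_WORDS)
        return false;                       /* larger than one 64k page  */
    return true;
}
```

A check of this kind would normally run on the host at build time, since the real attribute values are only known once the linker has produced them.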
What is an Overlay Manager?
An overlay manager is responsible for
controlling the fetching and loading of an
overlay module into internal memory. For code
overlay modules (or functions), the overlay
manager is also responsible for telling the main
program (or the ‘caller’ of the overlay function)
the correct target address to ‘jump’ to after the
overlay has been loaded. Also, an overlay
manager is responsible for any housekeeping or
additional memory management required by the
main program or calling function.
A simple model of an overlay manager would
perform the following tasks:
• Identify the desired overlay module by
getting the ID# of the overlay.
• Assign the appropriate live/run addresses
and sizes to the DMA engine to properly
load the overlay into internal ‘run’ space
from external ‘live’ space.
• Invalidate and flush the instruction
cache. (This is very important because
we don’t want the overlay manager
“polluting” the cache when we return
back to our main program.)
• Return back to the main program or the
‘calling’ function of the overlay.
A more elaborate overlay manager would
perform all of the above tasks as well as these
additional tasks listed below:
• Perform a context save/restore of all of
the registers used by the overlay manager
(via a software stack located in the
DSP’s internal Data Memory).
• Check to see if the requested overlay is
already present in its ‘run’ space. If so,
then jump to the target ‘run’ address of
the overlay (if the desired overlay is a
code overlay module), or return to the
calling function or main program (if the
desired overlay is a data overlay
module).
• Perform any memory management
“housekeeping” tasks before returning to
the main program or calling function.
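The task lists above can be sketched as a small host-side C model. This is a simplified illustration under stated assumptions, not the overlay manager developed in this EE note: `overlay_desc`, `run_space`, `resident_id`, `load_count`, and `overlay_manager` are hypothetical names, `memcpy` replaces the DMA transfer, and the cache invalidation and context save/restore steps appear only as comments.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define RUN_WORDS 8   /* hypothetical size of the shared run segment */

typedef struct {
    const uint32_t *live;   /* where the overlay is stored ('live' space) */
    uint32_t        words;  /* overlay size in words                      */
} overlay_desc;

static uint32_t run_space[RUN_WORDS];  /* shared internal 'run' segment    */
static unsigned resident_id = 0;       /* 0 means nothing is loaded yet    */
static unsigned load_count  = 0;       /* number of transfers performed    */

/* Return the run address of overlay 'id' (1-based), loading the overlay
   first unless it already occupies the run space. */
static uint32_t *overlay_manager(const overlay_desc *table, unsigned count,
                                 unsigned id)
{
    assert(id >= 1 && id <= count);
    /* A fuller manager would save the registers it uses here. */
    const overlay_desc *ov = &table[id - 1];
    assert(ov->words <= RUN_WORDS);

    if (resident_id != id) {
        /* Stand-in for programming the DMA engine and waiting for it. */
        memcpy(run_space, ov->live, ov->words * sizeof run_space[0]);
        resident_id = id;
        load_count++;
        /* A real manager would invalidate the instruction cache here. */
    }
    /* ...and restore the saved registers before leaving. */
    return run_space;   /* the caller 'jumps' here for a code overlay */
}
```

The `resident_id` check corresponds to the “already present” test of the more elaborate manager: a second request for the same overlay returns immediately without another transfer.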
An overlay manager should be written in an
optimized manner to ensure that the minimum
number of instruction cycles is required to
execute it. The overlay manager is responsible
for managing the DSP’s internal memory only; just like an Interrupt Subroutine (ISR), you want to spend as few cycles as possible in the overlay manager, and the majority of the processor’s time running actual DSP code.
Remember that we’re developing code for a
real-time system here! Another obvious point to
mention here is that the overlay manager code
should reside in the DSP’s internal memory, not
in an overlay run segment where it could get
overwritten. The overlay manager also should
not reside in external memory, since the
latencies due to executing code through the EMI
would incur too much system overhead.
VisualDSP Support for Overlays
The VisualDSP development tools generate overlay constants “automagically” which can be used by your overlay manager to configure the DMA parameter registers to load in the desired overlay module. Also, the VisualDSP linker automatically redirects overlay function calls to a jump table, called a PLIT (or Procedure Linkage Table; please refer to the section “What is a PLIT?” of this document for more information), which is used to set up the overlay ID# and overlay run address parameters which are passed from the PLIT to your overlay manager. Basically, the PLIT is just a function containing some user-generated assembly instructions that are used to set up the call to the overlay manager. We will explain the operation of the PLIT and where it is located later in this EE note.
The linker description file (LDF) is where the
user defines the memory architecture of their
system. It is within the LDF that you define both
internal and external memory segments.
Specifically for overlays, the LDF is where you
define the ‘live’ and ‘run’ memory segments for
each overlay module or file. (A complete
explanation of the LDF is beyond the scope of
this EE note. Here we only wish to illustrate the
LDF concepts that apply specifically to software
overlays. For more information on LDFs, please
refer to chapter 2 of the “Linker and Utilities
Manual for ADSP-21xx DSPs”.)
Figure 2: PM overlay ‘live’ address vs. ‘run’ address example
Figure 2 shows a simple software system where
there are three PM overlays defined in their own
individual ‘live’ segments in external memory.
All three of these overlays ‘run’ in the same
memory segment within the internal memory of
the DSP. Let’s look at an excerpt of what our
LDF would look like for this example:
The first thing to notice is that each overlay
declaration in the LDF has an input and an
output section. The inputs to the overlay are
declared within the scope of the overlay
definition in the LDF via the curly braces. The
main thing to be aware of is the use of the
“INPUT_SECTIONS” LDF macro which tells
the linker that this specific section from the
specified input file is to be used as an input for
this overlay segment.
The output of the overlay is defined using the
redirect input symbol “>”; this redirection tells
the linker where in memory to place this overlay
(‘live’ space). For our example in Figure 2, the
first overlay declaration links the object file
“Function_1.doj” (which is the output file
generated after assembling the source file
“Function_1.asm”) into the overlay ‘live’ space
named ‘mem_pm_ovl1_live_space’.
The overlay run space from Figure 2 is defined at the last line of this excerpt. Therefore, all three of the overlays declared in this section of the LDF run in the overlay ‘run’ memory segment named “mem_pm_ovl_run_space”. The overlay ‘live’ and ‘run’ address segments are defined earlier in the MEMORY{} section of the LDF.
Linker Generated Overlay Constants
As mentioned earlier in this EE note, soft
overlays have the following attributes; an
overlay ID#, ‘live’ address, ‘run’ address, ‘live’
size, and lastly, a ‘run’ size. For each program
memory overlay segment, the linker will
generate the following constants, (where N is
the ID# of the overlay):
_ov_startaddress_N
_ov_word_run_size_N
_ov_word_live_size_N
_ov_runtimestartaddress_N
Example 2: Linker Generated Overlay Constants Example
The first constant, “_ov_startaddress_N”,
represents the ‘live’ or external address where
the overlay resides. The second and third
constants represent the ‘run’ and ‘live’ sizes of
the desired overlay. For the ADSP-2191, these
two values are (and must be) the same. The
reason for this is simple; the hardware logic of
the 2191’s EMI takes care of all of the data
packing for you. So regardless of the external
data bus width configuration (8-bits or 16-bits),
or the memory access type (24-bit PM or 16-bit
DM), the internal and external memory word
sizes are the same as far as the ADSP-2191 is
concerned. The last overlay constant represents
the ‘run’ address where the overlay will reside
in the DSP’s internal memory.
These linker generated overlay constants can be
stored in arrays that can be accessed at runtime
by your overlay manager to facilitate the loading
of these overlays into internal memory. For
example, if you had a system with two code
overlays, you would declare the overlay constant
arrays in a fashion like what is shown in the
following example:
From this example we see how these linker
generated overlay constants are arranged for use
by the overlay manager. The overlay constants
are first declared as external data types to the
scope of this file via the “.extern” assembler
directive. This is necessary because these
constants are generated by the linker in the
overlay output file (.ovl) and are referenced
from that file.
One very important point to mention here concerns how the overlay ‘live’ address constants are defined. Since the ADSP-2191 has a 22-bit external address bus, and since the live space for the overlays in our system is declared to reside in external memory, the linker generated constants for the overlay ‘live’ addresses must be stored in a 24-bit PM buffer, and initialized using the “/init24” assembler directive.
(For this linker generated constants example,
we’ve also declared all of these constant arrays
as globals since the declaration of these arrays
exists outside of the file which contains the
overlay manager code, which references these
arrays.)
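For readers more comfortable with C, the arrangement of per-overlay constant arrays described above can be pictured with the following host-side sketch for a system with two code overlays. It is not the assembly declaration used on the DSP. The array name `runWordSize` follows the overlay manager code in this EE note; the other array names, the helper functions, and every numeric value are placeholders invented for the illustration, since the real values are produced by the linker.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CODE_OVERLAYS 2

/* Placeholder values standing in for the linker-generated constants
   _ov_startaddress_N, _ov_runtimestartaddress_N, _ov_word_run_size_N
   and _ov_word_live_size_N, for N = 1 and N = 2. */
static const uint32_t liveAddress[NUM_CODE_OVERLAYS]  = { 0x020000u, 0x020400u };
static const uint32_t runAddress[NUM_CODE_OVERLAYS]   = { 0x004000u, 0x004000u };
static const uint16_t runWordSize[NUM_CODE_OVERLAYS]  = { 0x0100u, 0x0080u };
static const uint16_t liveWordSize[NUM_CODE_OVERLAYS] = { 0x0100u, 0x0080u };

/* Overlay IDs start at 1 while array indices start at 0. */
static uint32_t overlay_live_address(unsigned id)
{
    assert(id >= 1 && id <= NUM_CODE_OVERLAYS);
    return liveAddress[id - 1];
}

static uint32_t overlay_run_address(unsigned id)
{
    assert(id >= 1 && id <= NUM_CODE_OVERLAYS);
    return runAddress[id - 1];
}

/* On the ADSP-2191 the run and live sizes of an overlay are the same. */
static int overlay_sizes_match(unsigned id)
{
    assert(id >= 1 && id <= NUM_CODE_OVERLAYS);
    return runWordSize[id - 1] == liveWordSize[id - 1];
}
```

Keeping one array per attribute, rather than one record per overlay, mirrors the layout an assembly overlay manager can index with a single modify register.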
What is a PLIT?
As defined by the “VisualDSP Linker and
Utilities Manual for ADSP-21xx DSPs”, a PLIT
is a template of instructions from which the
linker generates code to set up the information
necessary to support the DSP program’s overlay
manager. Every branch instruction that
references a global label defined in an overlay is
replaced by a call to this generated code. For
each overlay routine in the program, the linker
builds and stores a list of PLIT instances
according to that template, as it builds its
executable.
The code for the PLIT is written by the
programmer in the reserved PLIT section of the
linker description file (LDF) in the project. A
simple PLIT merely copies the ‘run’ address of
the called symbol that resides in the overlay and
the overlay ID# into user-defined registers.
Below is a simple PLIT example taken from an
arbitrary overlay LDF:
From the above example, we see that the
registers ax0 and ay0 are used to fetch the
overlay ID# and ‘run’ address, respectively.
A more practical example would be where the
registers used by the PLIT function are saved off
by the PLIT before consuming these registers to
properly set up the overlay manager. Here is a
simple example below:
Aside from containing user-defined code, the
PLIT is also a section in the DSP’s Program
Memory where the linker generates a jump table
containing references to all of the overlay
function labels. This jump table we refer to as a
PLIT table. The next section explains how the
linker adds additional code to facilitate the
actual call and/or loading of the overlay function
to allow the program sequencer to begin
execution of the overlay code, and how the PLIT
table is invoked and comes into play during
program execution.
Calling a function that resides in an overlay (in order to invoke the overlay manager to load the overlay code into the DSP’s internal memory) is implemented in your source code in the same manner that you would call an ordinary (i.e. non-overlay) function. For example,
call my_function;
The difference between a non-overlay function
call and an overlay function call is that the
linker actually replaces the function call with a
call to the PLIT entry for the desired overlay
function. For example, an overlay function call
from assembly like the following:
call function_1;
actually gets replaced by the following call to the PLIT:
call .plt_function_1;
Each overlay module that we declared in our
LDF gets its own unique copy of the PLIT entry
that was defined in our LDF. For example, let’s
say in our system we have three code overlays
declared, and we have a simple PLIT declared in
our LDF as shown in Example 4 above. Then
the corresponding PLIT table for our three
overlays would look like the following:
Therefore, the total memory size of your PLIT
table in the DSP’s Program Memory will be the
number of code overlays in your system
multiplied by the number of assembly
instructions contained in the PLIT{} declaration
in your LDF.
Looking at Example 6 (and referring back to our
PLIT declaration from Example 4, where the
register ax0 is defined as containing the overlay
ID number), we can see that each overlay has its
own unique ID number; ‘function_1’ has the value ‘0x0001’ as its overlay ID number, ‘function_2’ has an ID of ‘0x0002’, etc.
We also see that all three of these overlays share
the same run address, which is passed as a
parameter to the overlay manager in the register
ay0. Lastly, the jump instruction for each PLIT
table entry is a jump to the overlay manager
itself.
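The behavior of the PLIT table and its size can be pictured with a short host-side C model. This is a sketch under assumptions: it supposes a three-instruction PLIT (load ax0, load ay0, jump to the overlay manager), and the names `plit_regs`, `plit_table`, `call_through_plit`, `plit_table_words`, and the run address `0x4000` are hypothetical placeholders rather than values taken from the linker output.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_CODE_OVERLAYS  3
#define PLIT_INSTRUCTIONS  3        /* assumed: load ax0, load ay0, jump */
#define OVL_RUN_ADDRESS    0x4000u  /* placeholder shared run address    */

typedef struct {
    uint16_t ax0;   /* overlay ID# handed to the overlay manager     */
    uint16_t ay0;   /* overlay 'run' address handed to the manager   */
} plit_regs;

/* One PLIT entry per overlay function, all built from the same template. */
static const plit_regs plit_table[NUM_CODE_OVERLAYS] = {
    { 0x0001u, OVL_RUN_ADDRESS },   /* entry for function_1 */
    { 0x0002u, OVL_RUN_ADDRESS },   /* entry for function_2 */
    { 0x0003u, OVL_RUN_ADDRESS },   /* entry for function_3 */
};

/* Model of branching through PLIT entry 'entry' (0-based): the registers
   are set up, after which control would jump to the overlay manager. */
static plit_regs call_through_plit(unsigned entry)
{
    assert(entry < NUM_CODE_OVERLAYS);
    return plit_table[entry];
}

/* PLIT table size in PM words: entries times instructions per entry. */
static unsigned plit_table_words(void)
{
    return NUM_CODE_OVERLAYS * PLIT_INSTRUCTIONS;
}
```

With three overlays and a three-instruction template the table occupies nine PM words, so every instruction added to the PLIT template costs one word per overlay.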
Simple Overlay Manager Example
Example 7 below contains example assembly
source code showing one implementation of a
‘simple’ overlay manager. This overlay manager
example is broken up into six different sections;
we’ll discuss each section in detail to explain
how this simple overlay manager example
works.
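Overlay_Manager:
    ar = ax0 - 1;                    /* ax0 = overlay ID# from the PLIT; ID# - 1 is the array index */
    m7 = ar;                         /* m7 indexes the linker generated constant arrays */
    dm(curr_PM_ovly_ID) = ar;        /* remember the requested PM overlay (stored as ID# - 1) */
    dm(WR_DMA_DESC_BLOCK+2) = ay0;   /* ay0 = 'run' address; goes to the write DMA descriptor address field */
Get_Overlay_Run_Size:
    i7 = runWordSize;                /* base address of the run size array */
    ax0 = dm(i7 + m7);               /* run size of the requested overlay */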
Example 7: Simple Overlay Manager
Let’s take a look at this code section by section
to determine what the overlay manager is
actually doing. (Keep in mind that this is a
simple overlay manager example; we’ll build on
this example and explain more overlay manager
concepts in further detail later on in this EE
note.)
The section “Overlay_Manager” is the address
label where the overlay manager gets invoked
via a “jump Overlay_Manager;” instruction
which is from the PLIT table.
Before we execute the first line of code from
this overlay manager example, there is an
important point to mention here first; the
register ax0 contains the overlay ID# of the
desired overlay that we wish to fetch. This point
is important because we need to subtract 1 from
the overlay ID# in order to use the register m7
as an index to the array of the linker generated
overlay constants.
Think of this in the same way you would an array in C. The first element in a C array is at index zero, even though it’s still the first array element. (Forgetting this offset is commonly known as an “off by one” error when indexing arrays in C; the same applies here to our overlay constants and the corresponding ID value.)
Since the register ay0 already contains the ‘run
address’ of the desired overlay (via the PLIT
assignment, “ay0 = PLIT_SYMBOL_ADDRESS;”),
we simply assign this value to the address field
of our Write DMA Descriptor Block. (For more
information on the configuration and use of
DMA Descriptors, please refer to the sections
titled “Descriptor-Based DMA Transfers” and
“Code Example: Internal Memory DMA”, on
pages 6-4 and 6-33, respectively, of the “ADSP-219x/2191 DSP Hardware Reference”.)
The next section of our overlay manager,
labeled “Get_Overlay_Run_Size”, is where we
again use the m7 register as an offset into the
‘runWordSize’ array, which contains the ‘run’
size of the overlay we wish to fetch. One
important point to mention here is that the
overlay run size actually contains an additional
word, which is the overlay ID number itself.
Because of this, we must subtract one from the
actual run size of the overlay stored in the
‘runWordSize’ array. This is performed in the
instruction “ar = ar – 1;”. Once we get the
proper size for the overlay, we then place this
value into the MEMDMA read and write
descriptors at the last two instructions of this
section.
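The two “subtract one” steps discussed so far (ID# to array index, and stored run size to actual word count) are easy to confuse, so here they are side by side in a host-side C sketch. The function name `dma_word_count` and the size values are hypothetical; the comments only indicate which assembly instruction each line loosely corresponds to.

```c
#include <assert.h>
#include <stdint.h>

#define NUM_OVERLAYS 3

/* Placeholder run sizes. As described above, each stored size includes
   one extra word, which holds the overlay ID number. */
static const uint16_t runWordSize[NUM_OVERLAYS] = { 0x0101u, 0x0081u, 0x0041u };

/* Return the number of words to place in the DMA descriptors for the
   overlay with the given (1-based) ID. */
static uint16_t dma_word_count(uint16_t overlay_id)
{
    assert(overlay_id >= 1 && overlay_id <= NUM_OVERLAYS);
    uint16_t index = (uint16_t)(overlay_id - 1);  /* ar = ax0 - 1; m7 = ar; */
    uint16_t size  = runWordSize[index];          /* ax0 = dm(i7 + m7);     */
    return (uint16_t)(size - 1);                  /* ar = ar - 1;           */
}
```

For example, with the placeholder table above, overlay 1 has a stored size of 0x0101 words and a transfer count of 0x0100 words.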
In the next section, labeled Parse_Live_Address,
we take the 24-bit ‘live’ address of the overlay
and break it up into its respective 8-bit page
value and 16-bit address offset. We implement
this by taking the 16 MSBs of the 24-bit ‘live’
address value into the AR register, using AR as
an input to the shifter register SR0, and then
shifting this value 8 bit positions to the left,
filling up the 8 LSBs of the SR1 register. We
then “OR” the remaining 8-bits of the ‘live’
address that were stored in the PX register from
the 24-bit PM fetch, to get the proper 16-bit
address offset value. Figure 3 below shows the
operation of this procedure to properly parse the
live address into the appropriate page and offset
values.
Figure 3: Parsing Overlay Live Address Example
(Remember that because the ADSP-2191’s
opcodes are always 24-bits in length, the
complete 24-bit address cannot be fully
contained in a 16-bit register, nor can it be fully
contained in an opcode. Therefore all external
memory addresses are broken up into an 8-bit
page value and a 16-bit address offset.)
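The page/offset split can be checked on a host machine with the following C sketch, which mimics the register steps described above: the 16 MSBs of the 24-bit PM word land in AR, the 8 LSBs land in PX, AR is shifted left by 8 into the 32-bit shifter result, and PX is ORed into the low half. The names `ext_addr` and `parse_live_address` are hypothetical, and the local variables merely model the DSP registers.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint8_t  page;    /* 8-bit external memory page             */
    uint16_t offset;  /* 16-bit address offset within that page */
} ext_addr;

/* Split a 24-bit 'live' address the way the register sequence does. */
static ext_addr parse_live_address(uint32_t live)
{
    assert(live <= 0xFFFFFFu);                   /* must fit in 24 bits   */

    uint16_t ar = (uint16_t)(live >> 8);         /* 16 MSBs of the PM word */
    uint8_t  px = (uint8_t)(live & 0xFFu);       /* 8 LSBs, held in PX     */

    uint32_t sr  = (uint32_t)ar << 8;            /* shift left by 8 bits   */
    uint16_t sr1 = (uint16_t)(sr >> 16);         /* upper half of result   */
    uint16_t sr0 = (uint16_t)(sr & 0xFFFFu);     /* lower half of result   */

    ext_addr out;
    out.page   = (uint8_t)(sr1 & 0xFFu);         /* page in 8 LSBs of SR1  */
    out.offset = (uint16_t)(sr0 | px);           /* OR in the PX bits      */
    return out;
}
```

For a live address of 0x123456 this yields page 0x12 and offset 0x3456, the same result as taking the top 8 bits and the bottom 16 bits of the address directly.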
The next section labeled “DMA_Config” is
where we initialize the remainder of the DMA
Descriptors for the read and write DMA
channels to kick off the DMA that will fetch the
overlay from external memory and place it into
the proper internal memory addresses. After
kicking off the DMA, we simply sit at the
“idle;” instruction and wait until we get a MEMDMA
interrupt.
After servicing the interrupt, we return to
execution of the overlay manager at the next
instruction, which is in the section labeled,
“Jump_To_Overlay”. In this section, we simply
load the ‘run’ address of the overlay that we’ve
just DMA’ed into internal memory into the I7
register, and then begin execution of the overlay
via the “jump (i7);” instruction.
Overlay Manager Optimizations
The simple overlay manager example we just
dissected is what its name implies; a “simple”
overlay manager. In this section, we’ll talk about
optimization techniques to make our overlay
manager more robust for our system design.
The first thing to mention is that our overlay manager example didn’t perform a context save and restore of the registers that it corrupted. You have two options: you can either create and manage a software stack (used primarily by your overlay manager), or you can use the secondary computational and DAG registers. But, as with many things in life, there are trade-offs between these two implementations.
Using the secondary registers is the fastest
method for context switching, but since most
real-time systems use the secondary registers for
interrupt subroutines (ISRs) to minimize their