Technical notes on using Analog Devices DSPs, processors and development tools
Contact our technical support at dsp.support@analog.com and at dsptools.support@analog.com
Or visit our on-line resources http://www.analog.com/ee-notes and http://www.analog.com/processors
Code Overlays on the ADSP-2126x SHARC® Family of DSPs
Contributed by Brian M. Rev 1 – February 17, 2004
Introduction
The overlay facilities in the VisualDSP++®
development tools provide an efficient method of
managing DSP internal memory. Similar to
previous generations of SHARC® family DSPs,
you can use code overlays to exploit the
relatively small internal memory blocks (2 Mbits
total) found in third-generation SHARC (ADSP-2126x) DSPs.
This article, which extends previous EE-Notes, focuses on the overlay management
changes that are necessary to support third-generation SHARC DSPs. Before continuing
with this article, be sure you are familiar with the
concepts explained in these earlier documents:
• Using Memory Overlays (EE-66) [1]: This
document introduces the use of overlays on
SHARC DSPs. It includes the concepts and
details of using code overlays from
external RAM.
• Using Code Overlays from ROM on the
ADSP-21161N EZ-KIT Lite™ (EE-180) [2]:
This document extends the concepts
presented in EE-66 to include the use of code
overlays from less expensive ROM parts.
Because the ADSP-2126x DSP memory and core
clock speed now runs at double the previous rate,
its I/O Processor can no longer access 48-bit
addresses. This means that overlays must be
copied to the equivalent 32-bit location. Since
the beginning or end of a 48-bit word can lie in
the middle of a 32-bit word, special measures are
needed to ensure proper transfers.
This document presents three projects, each with
an overlay manager that demonstrates a method
to handle this new behavior:
• A simple overlay manager that uses external
SRAM to store the overlays
• A project that stores overlays in a parallel
EEPROM or Flash
• A project that stores overlays in an SPI Flash
A Quick Introduction to Overlays
When using the overlay facilities provided by
VisualDSP++, the linker changes a call to a
function located in an overlay section into a call
to an overlay manager. The overlay manager
then transfers the desired code from an external
memory location (the live address) to an area in
internal memory set aside for overlays (the run
address) and jumps to the applicable location to
execute the code. (See EE-66 for a thorough
discussion of overlays on SHARC DSPs.)
When overlays are stored in external RAM, the
overlay live space is initialized at boot time in
the same manner as internal memory (by the boot
loader kernel). However, when ROM or Flash
devices are used instead of RAM, it is easier to
use the loader to generate a boot image to burn
into the memory (see EE-180). When this
method is used, an initialization routine is run to
determine where the overlays live, and how
many loader sections comprise the entire
overlay.
This document discusses how to extend the
previous examples to deal with issues that are
specific to ADSP-2126x DSPs.
Copyright 2004, Analog Devices, Inc. All rights reserved. Analog Devices assumes no responsibility for customer product design or the use or application of
customers’ products or for any infringements of patents or rights of others which may result from Analog Devices assistance. All trademarks and logos are property
of their respective holders. Information furnished by Analog Devices applications and development tools engineers is believed to be accurate and reliable, however
no responsibility is assumed by Analog Devices regarding technical accuracy and topicality of the content provided in Analog Devices’ Engineer-to-Engineer Notes.
Basic Overlay Example
The overlay sections provided in the attached
projects are intended for use on the ADSP-21262
EZ-KIT Lite evaluation system. Each overlay has
one function that sets a unique subset of the 8
LEDs on the board to indicate the overlay that
has just been executed.
Overlays from External RAM
The external memory support provided in
VisualDSP++ for the ADSP-2126x family of
DSPs is different from the support provided for
the ADSP-21161 and ADSP-21065L DSPs
discussed in EE-180 and EE-66, respectively.
When mapping the overlays to external memory,
use the PACKING() command to place the
overlay data properly in external memory.
Figures 1 and 2 show the necessary PACKING()
command for 8- and 16-bit external SRAM
modules, respectively.
These PACKING() commands must appear in
the OVERLAY_INPUT() section of the LDF file
to instruct the linker how to place the data in the
executable file. (See Using External Memory with ADSP-2126x SHARC DSPs (EE-220) [3] for
a discussion of external memory on ADSP-2126x family DSPs.)
Another change from previous SHARC DSPs is
the lack of 48-bit DMA transfers on ADSP-2126x DSPs. This change greatly affects the
overlay manager. All transfers that use the
Parallel Port or SPI are limited to 32-bit words;
therefore, the 48-bit run address must be
translated into an equivalent 32-bit address. This
is accomplished by multiplying the lower 16 bits
of the 48-bit address by 1.5 and leaving the upper
bits, which indicate the word space, unchanged.
Figure 3 shows the assembly code that translates
a 48-bit address into its 32-bit equivalent.
r4=0x80100; /* 48-bit address */
r2=3; /* Use R2 as a multiplier */
/* Extract lower bits from R4 into R3
(bits that need to be translated) */
r3=fext r4 by 0:16;
/* Place remaining bits in R0 (these
bits indicate word space) */
r0=r4-r3;
/* Multiply lower bits by 3 */
r3=r3*r2 (ssi);
/* Divide the result by 2 */
r3=lshift r3 by -1;
/* Insert the new lower bits */
r0=r0 or fdep r3 by 0:16;
/* Finished! R0=0x80180 */
Figure 3. Translating a 48-bit address to the 32-bit
address required by the overlay manager
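The translation in Figure 3 can also be sketched in C, which may be convenient for checking addresses on a host before writing the assembly. This is only an illustrative model, not part of the supplied projects: the function name addr48_to_addr32 is hypothetical, and the sketch assumes (as Figure 3 does) that only the lower 16 bits of the address are scaled. The assertion is an added defensive check; the fdep instruction in Figure 3 would silently truncate instead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical host-side model of Figure 3: translate a 48-bit
 * (instruction word) address into the equivalent 32-bit address. */
uint32_t addr48_to_addr32(uint32_t addr48)
{
    uint32_t offset = addr48 & 0xFFFFu;   /* r3 = fext r4 by 0:16   */
    uint32_t space  = addr48 - offset;    /* r0 = r4 - r3           */

    offset = (offset * 3u) >> 1;          /* multiply by 3, divide by 2 */

    /* Added check: the scaled offset must still fit in 16 bits. */
    assert(offset <= 0xFFFFu);

    return space | offset;                /* r0 = r0 or fdep r3 by 0:16 */
}
```

With the value used in Figure 3, addr48_to_addr32(0x80100) yields 0x80180. An odd 48-bit address rounds down, so the result is the 32-bit word that contains the start of the 48-bit word.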
It is also necessary to transform the number of
48-bit words (run word size) to the equivalent
number of 32-bit words. For a DMA transaction
to complete correctly, the DMA must be set up to
transfer a whole number of 32-bit words.
Therefore, when translating from 48-bit to 32-bit
lengths, round the calculation up.
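The length conversion can be sketched the same way. The helper below is a hypothetical illustration (its name is not taken from the supplied projects); it simply multiplies the 48-bit word count by 1.5 and rounds up so that the DMA count is a whole number of 32-bit words.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of 32-bit words needed to hold
 * words48 48-bit words, rounded up to a whole 32-bit word. */
uint32_t words48_to_words32(uint32_t words48)
{
    /* Guard against overflow of the intermediate product. */
    assert(words48 <= (UINT32_MAX - 1u) / 3u);

    return (words48 * 3u + 1u) / 2u;
}
```

For example, three 48-bit words (144 bits) need five 32-bit words (160 bits), while four 48-bit words (192 bits) fit exactly in six.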
Code Overlays on the ADSP-2126x SHARC® Family of DSPs (EE-230) Page 2 of 14
Each 48-bit word ends or begins in the middle of
the equivalent 32-bit address. If the 48-bit run
address is odd, the first 16 bits of the equivalent
32-bit address must be saved before fetching the
overlay and restored after the transfer is
complete, since only the final half of the 32-bit
word pertains to the run address. Similarly, if the
run space ends on an even 48-bit address, the
final 16 bits of the equivalent 32-bit address
must be saved and restored in the same manner.
In the examples included with this document, the
starting and ending run addresses are checked for
each overlay, and problem addresses are stored
into a 1-location stack and restored as described
above.
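The boundary checks described above amount to two parity tests. The following C sketch is a hypothetical model, not the code from the supplied projects. It assumes that "ends on an even 48-bit address" refers to the last 48-bit address occupied by the overlay; the type and function names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Which 16-bit halves at the edges of the 32-bit transfer belong to
 * neighboring code and must be saved before the DMA and restored after. */
typedef struct {
    bool save_first_half; /* first 16 bits of the first 32-bit word */
    bool save_last_half;  /* final 16 bits of the last 32-bit word  */
} boundary_fixups;

boundary_fixups overlay_boundary_fixups(uint32_t run_start48,
                                        uint32_t run_words48)
{
    boundary_fixups fix;
    uint32_t run_end48;

    assert(run_words48 > 0u);

    /* Assumption: the end address is the last 48-bit word occupied. */
    run_end48 = run_start48 + run_words48 - 1u;

    fix.save_first_half = (run_start48 & 1u) != 0u; /* odd start */
    fix.save_last_half  = (run_end48 & 1u) == 0u;   /* even end  */
    return fix;
}
```

An overlay that starts on an even address and ends on an odd address needs neither fix-up, because both edges already coincide with 32-bit word boundaries.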
When the run address is odd, the starting live
address must be decreased by 1 or 2 locations,
respectively, depending on whether the external
memory is 8- or 16-bit. This aligns the location
of the 48-bit run and live addresses, since the 32-bit run address that will be accessed must always
lie on a 32-bit boundary.
Since ADSP-2126x DSPs support 8- and 16-bit
external memories, the provided overlay
manager automatically determines the width of
the external memory being used. This is
accomplished by comparing the run word size
and live word size.
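The width detection follows the PHYS_WORD_SIZE macro that appears in the ovly_mgr.asm listing in the Appendix. A C rendering is sketched below for illustration. It assumes that one 48-bit run word occupies six locations in 8-bit memory and three locations in 16-bit memory, so the ratio of live size to run size identifies the width; the function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical C equivalent of PHYS_WORD_SIZE(run_size, live_size):
 * returns the external memory width in bits (8 or 16). */
unsigned external_width_bits(uint32_t run_size, uint32_t live_size)
{
    /* The live image can never be smaller than the run image. */
    assert(run_size != 0u && live_size >= run_size);

    return 48u / (live_size / run_size);
}
```

For an overlay of 10 run words, a live size of 60 indicates 8-bit memory and a live size of 30 indicates 16-bit memory.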
Overlays from ROM/Flash
Using overlays from ROM/Flash is very similar
to using them from RAM. Set up the
OVERLAY_INPUT() section of the LDF file as
if the overlays are to reside in external memory.
However, map the overlays to a dummy live
address in the external memory map (between
0x2000000 and 0x2FFFFFF) that is not being
occupied by a physical memory device.
(Remember that the actual live address is a location
in the ROM – this dummy live address is used
only as a search string to locate each overlay in
the ROM, because this information is not available at
link time.) It does not matter whether the
memory is set up as 8- or 16-bit memory, as the
loader generates the proper format from either
memory type.
When using an image generated by the loader,
tags are inserted before data sections to identify
different sections in the boot stream. The dummy
live addresses precede each overlay as it sits in
the boot-ROM, along with a data-type tag and
word count. Before calling an overlay section,
the overlay manager must parse the boot stream,
looking for these dummy live addresses, saving
their locations in the ROM and the number of boot sections for the entire address range devoted to
the overlay section. The overlay sections are
tagged (as shown in Table 1) with a normal
external memory type at an address in the
overlay range specified in the LDF file.
Memory Type    Tag for Zero Init    Tag for Regular
8-bit          0x7                  0x9
16-bit         0x8                  0xA
Table 1. Loader tags for ADSP-2126x DSPs
As described in EE-180, the overlay init
subroutine parses the boot stream to find the
locations of the overlay sections, and the overlay
section info subroutine determines how many
loader sections comprise each overlay section.
The provided code assumes that the overlays live
in the same device used to boot the DSP.
Therefore, the overlay init routine begins parsing
the loader file at the first byte after the kernel
(location 0x600).
The overlay init routine supplied with these
examples assumes that the live space for all of
the overlays is contiguous. The routine looks for
tags that reside between the live start address of
the first overlay and the live end address of the
final overlay. It automatically uses the addresses
supplied by the linker, but must be updated to
account for the number of overlays being used.
When using the overlay manager, the overlay init
and overlay section info routines automatically
provide a whole number of 32-bit words, and a
valid live address. The same calculation to obtain
the run address described above must be
performed when running from ROM, as must the
checks for odd and even 48-bit problems.
However, if the overlay starts on an odd address,
the beginning of each loader section in the
overlay to be fetched will fall in the middle of a
32-bit word; therefore, apply the save and restore
scheme for the first 16 bits of the word to each
loader section that is fetched.
Finally, as in the loader kernel, the overlay
manager automatically initializes the correct
number of words when a zero init section is
detected, rather than transferring the zeroes.
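The tag handling described in this section can be sketched in C as follows. This is a simplified, hypothetical model rather than the supplied overlay init routine. It uses the tag values from Table 1 and the contiguous live range assumption stated above, but the type and function names are invented, and the layout of the boot stream itself is deliberately not modeled.

```c
#include <stdbool.h>
#include <stdint.h>

typedef enum {
    SECTION_OTHER,      /* not an external-memory section         */
    SECTION_ZERO_INIT,  /* run space is zero-filled, no transfer  */
    SECTION_REGULAR     /* data must be fetched from the ROM      */
} section_kind;

typedef struct {
    section_kind kind;
    unsigned width_bits; /* 8 or 16; 0 when kind is SECTION_OTHER */
} ext_section_info;

/* Classify a loader tag according to Table 1. */
ext_section_info classify_ext_tag(uint32_t tag)
{
    ext_section_info info = { SECTION_OTHER, 0u };

    switch (tag) {
    case 0x7u: info.kind = SECTION_ZERO_INIT; info.width_bits = 8u;  break;
    case 0x8u: info.kind = SECTION_ZERO_INIT; info.width_bits = 16u; break;
    case 0x9u: info.kind = SECTION_REGULAR;   info.width_bits = 8u;  break;
    case 0xAu: info.kind = SECTION_REGULAR;   info.width_bits = 16u; break;
    default:   break;
    }
    return info;
}

/* True when a section address lies between the live start address of the
 * first overlay and the live end address of the final overlay. */
bool in_overlay_live_range(uint32_t addr, uint32_t live_start,
                           uint32_t live_end)
{
    return addr >= live_start && addr <= live_end;
}
```

A section is treated as part of an overlay only when its tag is one of the four external memory tags and its address falls inside the dummy live range. A zero init section then needs only its word count, since no data follows it in the boot stream.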
Parallel ROM/Flash Overlay Manager
The overlay manager provided for a parallel
ROM/Flash uses the Parallel Port DMA registers
to store the internal run address. However, the
address saved in the IIPP register must be in 32-bit normal word space, so the highest bits of the
address are not stored in the register. The starting
addresses to save and restore are taken from this
register, and therefore, the upper bits must be
masked in. This example assumes that all of the
overlays run in Block 0 (addresses 0x80000-0x87FFF), that is, the mask is always assumed to
be 0x80000.
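Masking in the upper bits is a single OR operation. The C sketch below illustrates the idea under the stated Block 0 assumption. The function name and the choice to clear everything above the low 19 bits before applying the mask are assumptions made for this example, not details taken from the supplied code or the register definition.

```c
#include <stdint.h>

#define BLOCK0_MASK 0x80000u /* Block 0 base assumed by the example */

/* Hypothetical helper: rebuild the full internal address from the
 * value read back from the IIPP register. */
uint32_t restore_run_address(uint32_t iipp_value)
{
    /* Assumption: only the bits below the mask are meaningful. */
    return BLOCK0_MASK | (iipp_value & (BLOCK0_MASK - 1u));
}
```

If overlays were also run from another memory block, a fixed mask would no longer be sufficient and the block would have to be tracked separately.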
The read_data_bytes() routine used in the
overlay manager and overlay init routine
assumes that the device being used is byte-wide,
since this is the format that must be used to boot
the DSP.
SPI ROM/Flash Overlay Manager
The basic concepts used to store overlays in an
SPI Flash are identical to storing them in a
Parallel Flash as described above. However, a
few simple changes are necessary to support the
way the SPI flash is addressed.
As described in Programming an ST M25P80
SPI Flash with ADSP-21262 SHARC DSPs (EE-231) [4], an SPI flash must be programmed in a
least significant bit first (LSBF) format to be
bootable. However, the commands sent to an SPI
Flash must be sent in most significant bit first
format. Therefore, when accessing overlays
stored in an SPI Flash, bit-reverse the read
command and address that is sent to the Flash to
read the overlay data stored in LSBF format. The
read_data_bytes() routine used to access
the SPI Flash is taken from the SPI Flash
programmer provided in EE-231.
Since the SPI Flash requires an extra 32-bit
transfer (rather than direct address lines), one
extra 32-bit word (corresponding to the 1-byte
read command and 24-bit address) is transferred
into the run space for each access to the SPI
Flash. When transferring the overlay sections,
this word overwrites the word directly preceding
the section to be written. The affected word is
saved and restored in the same manner as the
words that fall on an odd 48-bit boundary.
Conclusion
Although the concepts of code overlays are
described in detail in EE-66 and EE-180, this
document extends these concepts to the ADSP-2126x DSPs. The main difference in using
overlays on ADSP-2126x DSPs stems from their
lack of 48-bit DMA support, but this is easily
worked around.
In addition to using overlays from external RAM
and ROM or Flash devices via the Parallel Port
(as has been described for previous members of
the SHARC family), DSP overlays can be stored
in an SPI Flash device. This document provides a
framework for using overlays from SPI Flash,
supporting ADSP-2126x DSPs. This same
framework can also be extended to support the
ADSP-21161.
Appendix
ovly_mgr.asm
/* The OVLY_MGR.ASM file is the overlay manager. When a symbol */
/* residing in overlay is referenced, the overlay manager loads */
/* the overlay code and begins execution. (This overlay manager */
/* does not check to see if the overlay is already in internal */
/* memory.) A DMA transfer is performed to load in the memory */
/* overlay. */
#include <def21262.h>
.SECTION/DM dm_data;
/* The following constants are defined by the linker. */
/* These constants contain the word size, live location */
/* and run location of the overlay functions. */
/* Placing the linker constants in a structure so the overlay */
/* manager can use the appropriate constant based on the */
/* overlay id. */
#define PHYS_WORD_SIZE(run_size,live_size) (48 / (live_size/run_size))
.import "OverlayStruct.h";