Engineer To Engineer Note EE-206
a
Technical Notes on using Analog Devices' DSP components and development tools
Contact our technical support by phone: (800) ANALOG-D or e-mail: dsp.support@analog.com
Or visit our on-line resources http://www.analog.com/dsp and http://www.analog.com/dsp/EZAnswers
ADSP-BF535 Blackfin® Processor PCI Interface Performance
Contributed by Jorge Manguane September 3, 2003
Introduction
This Engineer to Engineer Note will briefly
discuss the performance of the ADSP-BF535
Blackfin® Processor PCI interface. In general,
maximum PCI performance is achieved during
burst transfers, where a single address phase
(one clock cycle at the beginning of the
transaction where the address and transfer type
are output on the AD bus and on the C/BE lines,
respectively) is followed by multiple data
phases. It is therefore easy to see how single
data transfers may yield lower throughput. We
will look at the ramifications of each of these
transfer types, namely, burst and single word
transfers.
To validate this analysis the Eagle-35 and the
Hawk-35 boards were used as a system host and
PCI device, respectively. Both boards have the
ADSP-BF535 Blackfin® Processor as the main
processor and are sold and supported by
Momentum Data Systems, MDS. Also, a
VMETRO PCI bus analyzer/exerciser was used
to both monitor the PCI traffic along with its
metrics and to exercise the bus.
It should be noted that a PCI-to-PCI bridge is
used on the Hawk-35 board to interface the
BF535 PCI core interface to the PCI bus signals.
The main purpose of this bridge is to perform
automatic voltage detection and translation in
order to allow the board to be plugged into either
a 3.3 Volt or 5 Volt system.
data directly to/from internal (L2 memory) or
external memory spaces without involving the
processor core.
Burst transfers
When the BF535 is the bus master, burst
transfers are accomplished through Direct
Memory Access (DMA), in particular through
the processor’s Memory DMA (MemDMA)
engine. For PCI transfers, MemDMA allows
memory to memory DMA transfers between PCI
memory space and either internal L2 or SDRAM
memory.
The MemDMA engine’s burst size is 8 words,
which means that after every 8 words transferred
the PCI interface issues the next PCI address.
This situation is illustrated in the PCI trace
shown in Figure 1, where every address phase
(Start) is followed by eight data words. A new
Start (address phase) transfer follows, and the
process goes on until all data has been
transferred. Not every transaction (address
phase, data phase) will be the same (i.e., 8 data
words for each address phase) because at some
point the target may issue ”retries.” These retries
will force the PCI bus master to reissue a new
address. Figure 2 shows the PCI bus statistics
including the Data Transfers per Transaction ,
which indicate an average number of data words
transferred during each burst access.
On the BF535, a dedicated bus is available on
chip to allow an external bus master to transfer
Copyright 2003, Analog Devices, Inc. All rights reserved. Analog Devices assumes no responsibility for customer product design or the use or application of
customers’ products or for any infringements of patents or rights of others which may result from Analog Devices assistance. All trademarks and logos are property
of their respective holders. Information furnished by Analog Devices Applications and Development Tools Engineers is believed to be accurate and reliable, however
no responsibility is assumed by Analog Devices regarding technical accuracy and topicality of the content provided in Analog Devices’ Engineer-to-Engineer Notes.
Figure 1. Eagle-35 Burst write trace
slave). The target memory accessed was the onchip ADSP-BF535 L2 memory. This will be the
case for the remainder of the traces shown in this
application note. Accesses to SDRAM yielded a
slight (insignificant) decrease in throughput –
refer to the metrics in T . As it can be seen
Figure 2. Eagle-35 Burst Write statistics
from , a
high throughput of 89.12 MB/s was achieved, as
well as an efficiency of 72.83%. Here the
efficiency is measured as a ratio of the
percentage of data transfers against the
percentage of total transactions.
The processors’ system clocks, SCLK, on both
boards were set to 131 MHz, with the core clocks
running at 262 MHz. Since the maximum burst
length on the BF535 is 8, SCLK is a major factor
that influences PCI throughput. As an example,
when SCLK was set to 120 MHz, the observed
throughput was 80 MB/s. Because the
MemDMA engine operates in the SCLK domain,
it is apparent that a higher SCLK allows more
data transferred per unit time. Note, however,
that 133 MHz is the maximum frequency to
which SCLK can be set.
able 1
a
Figure 2. Eagle-35 Burst Write statistics
These metrics were obtained from a DMA write
transfer between the Eagle-35 board (as the bus
master) and the Hawk-35 board (as the bus
When the BF535 is the target (slave) of a burst
transfer, the initiator’s burst size will determine
the amount of data transferred per transaction. As
an example, the burst length of the VMETRO’s
PCI exerciser can be specified to an arbitrary
length. Burst length also influences throughput.
As the burst length increases, the number of
transactions needed to complete the transfer
decreases. Figure 3 shows the Hawk-35 being
accessed by the VMETRO’s PCI exerciser (read
access), and Figure 4 shows the corresponding
bus statistics. Here the burst length was set to
equal the number of bytes transferred.
ADSP-BF535 Blackfin® Processor PCI Interface Performance (EE-206) Page 2 of 5