AN-525
APPLICATION NOTE
One Technology Way • P.O. Box 9106 • Norwood, MA 02062-9106 • 781/329-4700 • World Wide Web Site: http://www.analog.com
ADV601 Video Codec Design Considerations
by David Starr
OVERVIEW
This application note is for hardware and software designers starting an ADV601 design. Using this note and
the information in the ADV601 Video Codec data sheet,
you can do the following:
• Design ADV601-based video compression hardware.
• Write software drivers and hardware diagnostic programs.
• Integrate your hardware into the PCI bus and your software into Windows® 95.
The design examples in this application note refer to the
ADV601-based Videolab demonstration board, but you
can apply the techniques used in these examples to any
ADV601-based design. The software source code and
hardware schematics mentioned in this note are available on the Analog Devices computer products FTP site,
whose Uniform Resource Locator (URL) is:
ftp://ftp.analog.com/pub/dsp/adv601/
VCLK (VIDEO CLOCK) FREQUENCY FOR SQUARE AND
NONSQUARE PIXELS
The ADV601 uses the VCLK signal for internal processing,
DRAM timing and strobing in video data. The ADV601’s
internal PLL multiplies VCLK up to generate the DRAM
/CAS and /RAS timing. Use only the clock frequencies
listed on the data sheet under “Clock Pins,” even in
non-real-time applications. You must set the mode control register bits P/N (PAL/NTSC) and SPE (Square Pixel
Enable) to match the selected VCLK frequency. For instance, if VCLK is 29.5 MHz, then set both P/N and SPE
equal to one for the ADV601 to function properly. If you
intend to switch square pixel enable on and off, you must
also vary the clock frequency to match. Pulse-to-pulse jitter on VCLK should be less than 1 ns. The part is designed
to function with VCLK phase locked to the horizontal
sync. There is enough tolerance in the clock circuit to
track the horizontal timing variations caused by tape
speed variations (flutter and wow) on consumer grade
VHS video cassette recorders (VCRs).
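The rule that P/N and SPE must track VCLK can be captured in a small lookup. The sketch below is illustrative only: the 29.5 MHz row comes from this note, while the 27 MHz and 24.5454 MHz rows are the customary CCIR-601 and NTSC square-pixel clocks and are assumptions to be checked against the “Clock Pins” table in the data sheet. The function name and the P/N = 0 polarity for NTSC are likewise assumptions.

```python
# Hypothetical helper: choose VCLK and the P/N and SPE mode bits together,
# so that the three can never disagree.
# Only the PAL square-pixel row (29.5 MHz, P/N = 1, SPE = 1) is taken from
# this note; the other rows are assumptions to verify against the data sheet.
VCLK_MHZ = {
    ("NTSC", False): 27.0,      # CCIR-601 sampling (assumed)
    ("PAL", False): 27.0,       # CCIR-601 sampling (assumed)
    ("NTSC", True): 24.5454,    # NTSC square pixel (assumed)
    ("PAL", True): 29.5,        # PAL square pixel (from this note)
}


def video_mode(standard, square_pixel):
    """Return (pn_bit, spe_bit, vclk_mhz) for a video standard."""
    if standard not in ("NTSC", "PAL"):
        raise ValueError("standard must be 'NTSC' or 'PAL'")
    pn_bit = 1 if standard == "PAL" else 0   # assumed polarity: 1 = PAL
    spe_bit = 1 if square_pixel else 0
    return pn_bit, spe_bit, VCLK_MHZ[(standard, bool(square_pixel))]


print(video_mode("PAL", True))   # the example given in this note
```

Keeping the clock frequency and both bits in one table means a driver that switches square pixel mode on or off changes all three together.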
COMPRESSED VIDEO DATA INTERFACE DESIGN ISSUES
The compressed video data bus must support a high
data rate. Raw video comes into the part at 12 to
14 Megapixels/sec. Video will come out of the part just
as fast at low compression ratios. The actual compression ratio
can vary from its programmed value, raising or lowering the compressed
data rate; it is the increase that causes difficulty. A compressed video bus that is too slow will
cause the ADV601’s internal FIFO to overflow on encode or underflow on decode, resulting in lost frames on capture and torn frames
on playback. Difficulty may occur if the compressed
video bus is slower than 5 Megabytes/sec. The Analog
Devices evaluation board uses a Bus Master PCI bus interface capable of 16 Megabytes/sec.
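The bandwidth argument above can be checked with a little arithmetic. The sketch below assumes 2 bytes per pixel (4:2:2 sampling), which this note does not state, and treats the compression ratio as a simple divisor; the function names are hypothetical. Only the 12 to 14 Megapixels/sec input rate and the 5 and 16 Megabytes/sec bus figures come from the text.

```python
BYTES_PER_PIXEL = 2   # assumption: 4:2:2 video, 16 bits per pixel


def compressed_rate_mbytes(pixel_rate_mpix, compression_ratio):
    """Average compressed data rate in Megabytes/sec."""
    if compression_ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return pixel_rate_mpix * BYTES_PER_PIXEL / compression_ratio


def bus_headroom(bus_mbytes_per_sec, pixel_rate_mpix, compression_ratio):
    """Bus bandwidth divided by the average compressed rate.

    Values well above 1 leave margin for busy scenes; values near or
    below 1 mean the FIFO will eventually overflow or underflow.
    """
    return bus_mbytes_per_sec / compressed_rate_mbytes(
        pixel_rate_mpix, compression_ratio)


for ratio in (2, 4, 10, 50):
    rate = compressed_rate_mbytes(13.5, ratio)
    print(f"{ratio:3d}:1 -> {rate:5.2f} Mbytes/sec average, "
          f"headroom on a 5 Mbytes/sec bus: {bus_headroom(5, 13.5, ratio):.2f}")
```

Because the instantaneous ratio can fall below the programmed value on busy scenes, it is prudent to evaluate the headroom at a lower ratio than the nominal one.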
Many applications capture and play back video to/from
hard disk. In this case the disk is the limiting factor in system throughput. However, if the disk and the ADV601 reside on the same bus (for example, a PCI bus system),
bus bandwidth may also be a factor. If the video goes
from the ADV601 card to main memory, and then from
main memory to disk, bus traffic is double what it would
be if the video went directly from the ADV601 to the disk
controller with no halfway stop in main memory. Burst
mode, where the hardware acquires the bus, asserts one
address and transfers a block of data, will give best performance. The bus hardware may not be fast enough if it
must acquire the bus and assert an address for each
word transfer.
Figure 1. Video Signal Flow (blocks shown: video input, video output, bidirectional raw video bus, ADV601 with wavelet transform and Huffman coder, DRAM, bidirectional compressed video bus)
Windows is a registered trademark of Microsoft Corporation.
FIFO STATUS SIGNALS FIFO_SRQ, FIFO_STP, FIFO_ERR
FIFO_STP
FIFO_STP is a combined FULL and EMPTY pin. On encode it signals EMPTY; on decode it signals FULL. When asserted, it
means stop moving data into or out of the FIFO.
FIFO_STP is asserted quite late, and it can be difficult for
hardware to see the FIFO_STP signal in time to halt the
next FIFO transfer. In that case, an extra read will return
invalid data, and an extra write will overwrite a word already
inside the FIFO.
FIFO_SRQ
FIFO_SRQ is a combined NEARLY FULL and NEARLY
EMPTY bit. On encode it signals NEARLY FULL, and
on decode it signals NEARLY EMPTY. NEARLY (the service request trigger point) is programmed by the FIFO
Control Register over the range 32 to 480 long words.
FIFO_SRQ is easier to use for data transfer control than
FIFO_STP, because there is no penalty for moving one or
two words after FIFO_SRQ goes away. FIFO_SRQ will go
away at least 32 reads or writes before FULL or EMPTY
occurs. The size of each data transfer can be controlled
by programming NEARLY. Setting NEARLY to half full
(256 words) will cause the hardware to move at least 256
words for each service request. This can be advantageous if there is significant overhead required to set up
each bus transfer. Overhead might be arbitrating for the
bus, entering host interrupt service or asserting the data
address.
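The relationship between NEARLY and the guaranteed transfer size can be sketched as follows. The 512-word FIFO depth is inferred from “half full (256 words)”; treating NEARLY as a fill level, and the function name, are assumptions for illustration rather than the register definition in the data sheet.

```python
FIFO_DEPTH = 512                   # inferred from "half full (256 words)"
NEARLY_MIN, NEARLY_MAX = 32, 480   # programmable range given in this note


def guaranteed_burst(mode, nearly):
    """Words the host can move per service request without reaching
    EMPTY (encode) or FULL (decode).

    Assumes FIFO_SRQ asserts at a fill level of `nearly` words:
    at or above it on encode, at or below it on decode.
    """
    if not NEARLY_MIN <= nearly <= NEARLY_MAX:
        raise ValueError("NEARLY must be between 32 and 480 words")
    if mode == "encode":       # host reads; at least `nearly` words waiting
        return nearly
    if mode == "decode":       # host writes; at least this much free space
        return FIFO_DEPTH - nearly
    raise ValueError("mode must be 'encode' or 'decode'")


print(guaranteed_burst("encode", 256), guaranteed_burst("decode", 256))
```

Under these assumptions the extremes of the range (32 and 480) always leave a burst of at least 32 words, which matches the 32-word margin described above.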
FIFO_SRQ can recur very rapidly. The host and the
ADV601 are racing each other through the FIFO. It is possible for the host to transfer a single word that clears the
FIFO service request and, on the very next VCLK, for the
ADV601 to transfer a word that sets the FIFO service
request again. FIFO_SRQ is asynchronous to the host
port, so take care not to violate the setup and hold time
requirements of the host port hardware.
FIFO_ERR
FIFO_ERR is a combined EMPTY and FULL pin. On
decode it signals EMPTY and on encode it signals FULL.
This is the reverse of FIFO_STP. When FIFO_ERR is asserted, the host
is falling behind.
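The three status signals can be summarized as functions of FIFO fill level and direction. This is a behavioral sketch for reasoning about driver logic, not a description of the silicon: the 512-word depth is inferred from the text, and the exact comparison points (for example, whether NEARLY FULL asserts at or just above the threshold) are assumptions.

```python
from collections import namedtuple

FIFO_DEPTH = 512   # inferred from "half full (256 words)" in this note

FifoStatus = namedtuple("FifoStatus", "srq stp err")


def fifo_status(mode, level, nearly):
    """Model FIFO_SRQ, FIFO_STP and FIFO_ERR for a given fill level."""
    if not 0 <= level <= FIFO_DEPTH:
        raise ValueError("level out of range")
    empty = level == 0
    full = level == FIFO_DEPTH
    if mode == "encode":
        # The ADV601 fills the FIFO and the host drains it.
        return FifoStatus(srq=level >= nearly,   # NEARLY FULL
                          stp=empty,             # nothing left to read
                          err=full)              # host fell behind
    if mode == "decode":
        # The host fills the FIFO and the ADV601 drains it.
        return FifoStatus(srq=level <= nearly,   # NEARLY EMPTY
                          stp=full,              # no room to write
                          err=empty)             # host fell behind
    raise ValueError("mode must be 'encode' or 'decode'")


print(fifo_status("encode", 300, 256))
print(fifo_status("decode", 512, 256))
```

In both directions FIFO_STP marks the condition the host itself causes by moving data too eagerly, while FIFO_ERR marks the condition caused by the host moving data too slowly.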
BIN WIDTH CALCULATION BASICS
Off-chip computation, either by the host or a dedicated
DSP, is required to control the compression ratio during
encode. The Wavelet transformer output is 16 bits wide.
To increase the compression ratio, some low order bits
must be discarded before the run length and Huffman
coders. This increases the length of the zero runs leading
to more data compression. The ADV601’s adaptive
quantizer discards low order bits by multiplying every
sample in the bin by a user-specified fraction, called the
reciprocal bin width. On playback, the sub-bands are
restored to proper size by multiplication by a user-specified coefficient called the bin width. Each of the 42
sub-bands has its own Bin Width Register and Reciprocal Bin Width
Register. The bin widths are embedded in the compressed data stream during the encoding process. On
decode, the ADV601 extracts the bin widths from the
compressed data stream and multiplies each sample by
the bin width to bring it back up to proper size. Bin Width
Registers are of concern on encode only; nothing need be
put in the registers for decode.
Computation of a Bin Width Register value is straightforward:
take the reciprocal of the value in the corresponding Reciprocal Bin Width Register. Remember that the Reciprocal Bin
Width Registers are scaled 6.10 (6 integer bits, 10 fractional bits) and the Bin Width Registers are scaled 8.8, and scale your reciprocal calculation
accordingly.
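The reciprocal and its rescaling can be sketched in integer arithmetic. The sketch assumes both registers hold unsigned 16-bit values and reads 6.10 as 6 integer bits with 10 fractional bits and 8.8 as 8 integer bits with 8 fractional bits; the function name and the saturation behavior are illustrative choices, not taken from the data sheet.

```python
RBW_FRAC_BITS = 10    # reciprocal bin width register, 6.10 format
BW_FRAC_BITS = 8      # bin width register, 8.8 format
REG_MAX = 0xFFFF      # assumption: unsigned 16-bit registers


def bin_width_from_reciprocal(rbw_reg):
    """Return the 8.8 bin width register value for a 6.10 reciprocal."""
    if not 0 < rbw_reg <= REG_MAX:
        raise ValueError("reciprocal bin width must be 1..0xFFFF")
    # bw = 1 / (rbw_reg / 2**10), then scaled by 2**8, so
    #   bw_reg = 2**(10 + 8) / rbw_reg, rounded to nearest.
    numerator = 1 << (RBW_FRAC_BITS + BW_FRAC_BITS)
    bw_reg = (numerator + rbw_reg // 2) // rbw_reg
    return min(bw_reg, REG_MAX)   # saturate rather than wrap


print(hex(bin_width_from_reciprocal(0x0400)))  # reciprocal of 1.0
print(hex(bin_width_from_reciprocal(0x0200)))  # reciprocal of 0.5
```

Working in integers avoids floating-point support in a driver or DSP, and saturating keeps a very small reciprocal from wrapping around to a small bin width.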
The number of bits required to encode an image varies
with the busyness of the image. A plain solid black field
will encode very compactly since there is no high frequency energy in the picture. The higher sub-bands are
all zero everywhere. On the other hand, something like a
close-up of a plaid shirt has a lot of high frequency energy and will call for more bits to encode. As the picture
gets busier, you need to use a smaller fraction in the reciprocal Bin Width Registers.
At the end of each field, the ADV601 supplies the bin
width computer with the sum of the squares of the samples in each sub-band as a measure of the busyness. These (and a few
other numbers) are referred to as “statistics.” As the sum
of the squares gets larger, the reciprocal bin widths need
to get smaller.
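One way to picture this inverse relationship is a mapping in which the reciprocal bin width falls as the RMS amplitude of the sub-band rises. The mapping below is purely hypothetical and is not the algorithm used in the Analog Devices software: `target_rms`, the sample count argument and the clamp limits are invented for illustration, and the result is expressed in the 6.10 register format mentioned in this note.

```python
import math

RBW_ONE = 1 << 10     # 1.0 in 6.10 fixed point


def reciprocal_bin_width(sum_of_squares, sample_count, target_rms=4.0):
    """Map a sub-band's sum of squares to a 6.10 reciprocal bin width.

    Hypothetical rule: leave quiet sub-bands alone (reciprocal 1.0) and
    scale busier ones down so their RMS lands near target_rms.
    """
    if sample_count <= 0 or sum_of_squares < 0:
        raise ValueError("invalid statistics")
    rms = math.sqrt(sum_of_squares / sample_count)
    if rms <= target_rms:
        return RBW_ONE
    rbw = target_rms / rms                  # smaller as the band gets busier
    return max(1, round(rbw * RBW_ONE))     # never return zero


quiet = reciprocal_bin_width(sum_of_squares=1_000, sample_count=1_000)
busy = reciprocal_bin_width(sum_of_squares=1_600_000, sample_count=1_000)
print(quiet, busy)
```

A real rate controller would also weight the sub-bands differently and feed back the achieved bit rate; this sketch only shows the direction of the adjustment.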
This bin width computation works best if done quickly.
The ADV601 will present the statistics just as vertical retrace is beginning. The bin width computer needs to read
all the statistics, compute 42 reciprocal bin widths and 42
bin widths, and write the new setting back into the
ADV601 before the next field starts. The next field starts in 20
horizontal line times, or about 1.2 milliseconds. The computation needs to be repeated once per field, or
about every 16 milliseconds. The computation load will be
about 1.2 milliseconds every 16 milliseconds, or about 7.5%. This
assumes that the bin width calculation is actually completed within the 1.2 millisecond deadline. If not, the
ADV601 will use the existing bin width setting on the new
field. Since one field is much like another field, no great
harm is done.
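The timing budget can be reproduced from the video line and field rates. The sketch below uses nominal NTSC timing (15,734.26 lines/sec and 59.94 fields/sec), which this note rounds to 1.2 and 16 milliseconds; the choice of NTSC and the function name are assumptions.

```python
def bin_width_budget(line_rate_hz, field_rate_hz, blanking_lines=20):
    """Return (deadline_ms, field_period_ms, worst_case_load).

    worst_case_load is the fraction of each field period consumed if the
    computation uses the entire window before the next field starts.
    """
    deadline_ms = blanking_lines / line_rate_hz * 1000.0
    field_period_ms = 1000.0 / field_rate_hz
    return deadline_ms, field_period_ms, deadline_ms / field_period_ms


# Nominal NTSC rates (assumed; the note does not name the standard).
deadline, period, load = bin_width_budget(15734.26, 59.94)
print(f"deadline {deadline:.2f} ms, field {period:.2f} ms, load {load:.1%}")
```

Passing other line and field rates, or a different number of blanking lines, gives the corresponding budget for another video standard.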
DIAGNOSTICS AND DEBUGGING STRATEGY
In testing out a new design it is important to get simple
things working before testing more complex features.