Xilinx is providing this product documentation, hereinafter “Information,” to you “AS IS” with no warranty of any kind, express or implied.
Xilinx makes no representation that the Information, or any particular implementation thereof, is free from any claims of infringement. You
are responsible for obtaining any rights you may require for any implementation based on the Information. All specifications are subject to
change without notice.
XILINX EXPRESSLY DISCLAIMS ANY WARRANTY WHATSOEVER WITH RESPECT TO THE ADEQUACY OF THE INFORMATION OR
ANY IMPLEMENTATION BASED THEREON, INCLUDING BUT NOT LIMITED TO ANY WARRANTIES OR REPRESENTATIONS THAT
THIS IMPLEMENTATION IS FREE FROM CLAIMS OF INFRINGEMENT AND ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR
FITNESS FOR A PARTICULAR PURPOSE.
Except as stated herein, none of the Information may be copied, reproduced, distributed, republished, downloaded, displayed, posted, or
transmitted in any form or by any means including, but not limited to, electronic, mechanical, photocopying, recording, or otherwise, without
the prior written consent of Xilinx.
About This Guide
The LogiCORE™ IP Video Scaler v4.0 User Guide provides information about generating the
Video Scaler core, customizing and simulating the core using the provided example
design, and running the design files through implementation using the Xilinx tools.
Guide Contents
This manual contains the following chapters:
• Chapter 1, Introduction introduces the Xilinx Video Scaler core and provides related information, including recommended design experience, additional resources, technical support, and submitting feedback to Xilinx.
• Chapter 2, Overview illustrates examples of video scaler applications.
• Chapter 3, Implementation elaborates on the internal structure of the core and describes interfacing.
• Chapter 4, Video I/O Interface and Timing describes how to drive the input timing signals so the scaler can be operated correctly. It also describes the data output signals and their relation to the output data.
• Chapter 5, Scaler Architectures describes the Single-engine for sequential YC processing, the Dual-engine for parallel YC processing, and the Triple-engine for parallel RGB/4:4:4 processing.
• Chapter 6, Control Interface discusses the three control interface options available to the user in CORE Generator™ software: EDK pCore, GPP, and Constant.
• Chapter 7, Scaler Aperture explains how to define the scaler aperture using the appropriate dynamic control registers.
• Chapter 8, Coefficients describes the coefficients used by both the Vertical and Horizontal filter portions of the scaler, in terms of number, range, formatting, and download procedures.
• Chapter 9, Performance emphasizes the importance of available clock rate and provides some worst-case conversion examples.
• Appendix A, Use Cases illustrates two likely usage scenarios for the video scaler.
• Appendix B, Programmer Guide provides a description of how to program and control the data flow for the video scaler hardware pCore.
• Appendix C, System Level Design provides an example design extracted from a known, working EDK project, including other Video IP blocks.
Additional Resources
To find additional documentation, see the Xilinx website.
To search the Answer Database of silicon, software, and IP questions and answers, or to
create a technical support WebCase, see the Xilinx website at:
http://www.xilinx.com/support/mysupport.htm
Conventions
This document uses the following conventions. An example illustrates each convention.

Typographical
The following typographical conventions are used in this document:

Courier font: Messages, prompts, and program files that the system displays. Example: speed grade: - 100
Courier bold: Literal commands that you enter in a syntactical statement. Example: ngdbuild design_name
Helvetica bold: Commands that you select from a menu (Example: File > Open) and keyboard shortcuts (Example: Ctrl+C)
Italic font: Variables in a syntax statement for which you must supply values (Example: ngdbuild design_name); references to other manuals (Example: See the User Guide for more information.); and emphasis in text (Example: If a wire is drawn so that it overlaps the pin of a symbol, the two nets are not connected.)
Dark Shading: Items that are not supported or reserved. Example: This feature is not supported
Square brackets [ ]: An optional entry or parameter. However, in bus specifications, such as bus[7:0], they are required. Example: ngdbuild [option_name] design_name
Braces { }: A list of items from which you must choose one or more. Example: lowpwr ={on|off}
Vertical bar |: Separates items in a list of choices. Example: lowpwr ={on|off}
Angle brackets < >: User-defined variable or in code samples. Example: <directory name>
Vertical ellipsis: Repetitive material that has been omitted. Example:
    IOB #1: Name = QOUT’
    IOB #2: Name = CLKIN’
    . . .
Horizontal ellipsis ( . . . ): Repetitive material that has been omitted. Example: allow block block_name loc1 loc2 ... locn;
Notations: The prefix ‘0x’ or the suffix ‘h’ indicates hexadecimal notation (Example: A read of address 0x00112975 returned 45524943h.); an ‘_n’ suffix means the signal is active low (Example: usr_teof_n is active low.)

Online Document
The following conventions are used in this document:

Blue text: Cross-reference link to a location in the current document. Examples: See Chapter 3, Basic Architecture for details. See Additional Resources, page 12, for details.
Blue, underlined text: Hyperlink to a website (URL). Example: Go to www.xilinx.com for the latest speed files.
Introduction
This chapter introduces the Video Scaler core and provides related information, including
recommended design experience, additional resources, technical support, and submitting
feedback to Xilinx. See the Video Scaler product page at www.xilinx.com/products/ipcenter/EF-DI-VID-SCALER.htm.
About the Core
The Video Scaler core is a Xilinx CORE Generator™ IP core, included in the latest IP Update on the Xilinx IP Center. For detailed information about the core, see the Video Scaler product page.
Recommended Experience
Although the Video Scaler core is a fully verified solution, the challenge associated with implementing a complete design varies depending on the configuration and functionality of the application. For best results, previous experience building high-performance, pipelined FPGA designs using Xilinx implementation software and UCF is recommended. Contact your local Xilinx representative for a closer review and estimate of your specific requirements.
Additional Core Resources
For detailed information about video scaler technology and updates to the Video Scaler
core, see the following:
Documentation
From the Video Scaler product page:
• Video Scaler Data Sheet
• Video Scaler Release Notes
Technical Support
For technical support, visit www.xilinx.com/support. Questions are routed to a team of
engineers with expertise using the Video Scaler core.
Xilinx will provide technical support for use of this product as described in the
LogiCORE™ IP Video Scaler User Guide. Xilinx cannot guarantee timing, functionality, or
support of this product for designs that do not follow these guidelines.
Providing Feedback
Xilinx welcomes comments and suggestions about the Video Scaler core and the
documentation supplied with the core.
Core
For comments or suggestions about the Video Scaler core, submit a WebCase from www.xilinx.com/support. Be sure to include the following information:
• Product name
• Core version number
• Explanation of your comments

Documentation
For comments or suggestions about this document, submit a WebCase from www.xilinx.com/support. Be sure to include the following information:
• Document title
• Document number
• Page number(s) to which your comments refer
• Explanation of your comments
Nomenclature
The following are defined for the purposes of this document:
Table 1-1: Nomenclature

Scaler Aperture: The input data rectangle used to create the output data rectangle.

Filter Aperture: The group of contributory data used in a filter to generate one particular output. The number of elements in this group of data is the number of taps. We define the filter aperture size using the num_h_taps and num_v_taps parameters.

Coefficient Phase: Each tap is multiplied by a coefficient to make its contribution to the output pixel. The coefficients used are selected from a "phase" of num_x_taps coefficients. The phase selection is dependent upon the position of the output pixel in the input sampling grid space. For each dimension of the filter, each coefficient phase consists of num_h_taps or num_v_taps coefficients.

Channel: For scaler purposes, all monochromatic video streams (for example Y, Cb, Cr, R, G, and B) are considered separate channels.

Coefficient Phase Index: An index that selects the coefficient phase applied to one filter aperture in a FIR. For an n-tap filter, this index points to n coefficients.
Coefficient Bank: A group of coefficients that will be applied to one video component (Y or C) in one dimension (H or V) for a conversion of one frame. It includes all phases. For an n-tap, m-phase filter, a coefficient bank comprises n × m values. Each tap may be multiplied by any one of the m coefficients assigned to it, selected by the phase index, which is applied to all taps.

Coefficient Set: A group of four coefficient banks (VY, VC, HY, HC). One full set
should be written into the scaler before use.
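The sizes implied by these definitions can be illustrated with a short calculation. The tap and phase counts below are hypothetical example values chosen only for illustration; the actual values are core parameters (see Chapter 8, Coefficients):

    # Hypothetical example values, not values mandated by the core.
    num_h_taps, num_v_taps, num_phases = 8, 4, 16

    v_bank = num_v_taps * num_phases           # one vertical bank (VY or VC): 64 coefficients
    h_bank = num_h_taps * num_phases           # one horizontal bank (HY or HC): 128 coefficients
    coefficient_set = 2 * v_bank + 2 * h_bank  # full set (VY + VC + HY + HC): 384 coefficients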
Overview
Video scaling is the process of converting an input color image of dimensions Xin pixels by Yin lines to an output color image of dimensions Xout pixels by Yout lines.

Within predefined limits, the Xilinx Video Scaler supports the modification of the Xin, Yin, Xout, and Yout input parameters during run-time on a frame basis. Furthermore, you may also dynamically crop a selected subject area from the input image prior to scaling that area. This dynamic combination lends itself well to applications that require shrink and zoom functionality.
The Xilinx Video Scaler supports real-time video inputs and memory interface inputs (that
is, a frame buffer). When connected to a real-time input source, the input clock and
horizontal and vertical (H/V) timing signals come directly from the input video stream. In
the case of a memory interface, standard memory handshaking signals may be used in
place of the H/V timing signals.
While maintaining image quality is usually of primary interest, it is subjective and heavily
dependent upon the end application. Moreover, image quality comes at a price in terms of
FPGA resources. Hence, while the core structure and architecture of the scaler are maintained for all applications, flexibility is made paramount so that users from all applications can use this IP.
Implementation
This section elaborates on the internal structure of the core and describes interfacing.
Basic Architecture
The Xilinx Video Scaler LogiCORE™ IP converts a specified rectangular area of an input
digital video image from the original sampling grid to a desired target sampling grid
(Figure 3-1).
Figure 3-1: High Level View of the Functionality
The input image must be provided in raster scan format (left to right and top to bottom).
The valid outputs will also be given in this order.
The Xilinx Video Scaler makes few assumptions regarding the origin or the destination of
the video data. The input could be fed in real-time from a live video feed, or it could be
read from an external memory. The output could feed directly to another processing stage
in real time, but also could feed an external frame buffer (for example, for a VGA controller,
or a Picture-in-Picture controller). Whatever the configuration, you must assess, given the
clock-frequency available, how much time is available for scaling, and define:
1. Whether to source the scaler using live video or an input-side frame buffer, and
2. Whether the scaler feeds out directly to the next stage or to an output-side frame buffer.
When using a live video input source, you have no control over the video timing signals.
Hence, the specific requirements must allow for this. For example, when up-scaling by a
factor of 2, two lines must be output for every input line. The scaler core clock-rate (‘clk’)
must allow for this, especially considering the architectural specifics within the scaler that
take advantage of the high speed features of the FPGA to allow for resource sharing.
Feeding data from an input frame buffer is more costly, but it allows you to read the required data as needed while still having one "frame" period in which to process it.
Some observations (not exclusively true for all conversions):
• Generally, when up-scaling, or dealing with high definition (HD) rates, it is simplest to use an input-side frame buffer. This does depend upon the available clock rates.
• When down-scaling, it is often the case that the input-side frame buffer is not required, because for every input line the scaler is required to generate a maximum of one valid output line.
• Generally, the output data does not conform to any standard. It is therefore not possible to feed the output directly to a display driver. Usually, a frame buffer is ultimately required to smooth the output data over an output frame period. The output video stream is described later.
I/O Buffering, Clock Domains
Figure 3-2 shows the top level buffering, indicating the different clock domains, and the
scope of the control state-machines.
Figure 3-2: Simplified Top Level Block Diagram, Indicating Clock Domains
To support the many possibilities of input and output configurations, and to take
advantage of the fast FPGA fabric, the scaler core uses a separate clock domain from that
used in controlling data I/O. More information about how to calculate the minimum required operational clock frequency is given in Chapter 9, Performance. It is also possible to read the output of the scaler using a third clock domain. These clock domains are isolated from each other using asynchronous line buffers, as shown in Figure 3-2. The control state-machines monitor the I/O line buffers. They also monitor the current input and output line numbers.
Video I/O Interface and Timing
CORE Generator™ software provides two interface options for provision of the video data
into the video scaler core.
1. Live – a standard format video signal, along with synchronization signals, is driven directly into the core.
2. Memory – an internal memory arbiter is included in the core, so the active video area may be accessed from an external memory block.
Data Source: Live Video
Input Data and Timing Signals
• General Input Handshaking Principles
• Hblank_in Input
• Vblank_in Input
• Frame_rst Signal
• Active_video_in Input
General Input Handshaking Principles
The input data is written into an internal double-buffered line buffer. Availability of space
for one entire line of data is indicated by a high level on the line_request output. One
line of data, of a length up to max_samples_in_per_line, may be written to this buffer
without the need for further arbitration. Following the first valid pixel-write operation to
this line buffer, the line_request output will be driven low by the scaler. This signal
may rise a few (> 3) clock cycles later to indicate availability of the other half of the double
buffer. The number of clock cycles is dependent on the current conversion.
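The handshake described above can be summarized as a simple behavioral model. The following Python sketch is illustrative only; the scaler object and its clock() method are hypothetical stand-ins for the RTL interface and are not part of the core deliverables:

    def write_one_line(scaler, pixels, max_samples_in_per_line):
        """Write one video line into the input line buffer, as described above."""
        assert len(pixels) <= max_samples_in_per_line
        while not scaler.line_request:        # wait for space for one entire line
            scaler.clock()                    # idle video_in_clk cycle
        for p in pixels:
            scaler.active_video_in = 1        # write-enable for valid data
            scaler.video_data_in = p
            scaler.clock()                    # line_request drops after the first valid write
        scaler.active_video_in = 0            # end of the active line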
Valid video data is written into the input line buffer using active_video_in as a write-enable. This is shown in Figure 4-1 for the 8-bit 4:2:2 case. The active_video_in signal must remain in a high state for the duration of the active input line.
Figure 4-1: Scaler 8-bit 4:2:2 Input Timing
The scaler is capable of accepting and delivering 4:4:4 (e.g., RGB), 4:2:2, and 4:2:0 chroma
formats. It will not convert between chroma formats. For delivery of 4:4:4 video data, a
third channel would be added to this diagram, and the three channels would be either R,
G, and B or Y, Cb, and Cr. The I/O chroma format must therefore be specified. In terms of bandwidth, 4:2:0 is essentially the same as 4:2:2 horizontally, but is half the bandwidth vertically. Different signaling is required for the delivery of the YC 4:2:2 and YC 4:2:0 chroma systems. The
luma (Y) input is a full bandwidth 8-bit input on video_data_in[7:0]. The chroma for
both 4:2:0 and 4:2:2 is also a full-bandwidth input on
video_data_in[(data_width*2)-1:data_width], but Cb and Cr are interleaved
on a pixel basis, as shown in Figure 4-1 for the 8-bit case. An additional input
active_chroma_in is required in the 4:2:0 case. This must be asserted high on all lines
for 4:2:2, but only for alternate lines for 4:2:0, as shown in Figure 4-2.
Figure 4-2: active_chroma_in and video_data_in timing for 4:2:0 (chroma valid on alternate lines only)
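For the 8-bit 4:2:2 case, the bus packing described above can be illustrated with a short sketch (a hypothetical helper written for illustration; it is not part of the core or its drivers):

    def pack_422_sample(i, y, cb, cr):
        """Return the 16-bit video_data_in value for pixel i of one line.
        Luma occupies bits [7:0]; Cb and Cr are interleaved per pixel on bits [15:8],
        with chroma co-sited on even pixels (Cb0, Cr0, Cb2, Cr2, ...)."""
        chroma = cb[i] if i % 2 == 0 else cr[i - 1]
        return (chroma << 8) | y[i]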
When running the scaler using Live Mode, you are likely to derive the active_video_in
from timing signals such as horizontal sync or embedded flags like EAV and SAV. In this
case, you will have calculated that the line-rate at the input, often defined by the input
video format, is sufficiently low that the host system will never need to wait for the
line_request signal to be asserted.
However, in contrast, you may calculate that this is not possible, and that the scaler must
hold off the input data. The line_request flag deasserted state should be used to hold
off the write-operation for a new line. Since it is impossible to hold off a live video feed, the
data must be fed (directly or indirectly) from a frame buffer, and the appropriate external
control provided (Memory Mode).
Hblank_in Input
The horizontal blanking input signal hblank_in is generally used as a line-based reset. It
must be provided to the scaler core in the same clock domain as the video data
(video_in_clk).
The hblank_in signal is used to perform the following operations:
• Reset an internal input pixel counter.
• Reset the internal input-side line buffer write-address pointer.
• Increment the input line counter (rising edge of hblank_in).
• Decode the input line count during the active data period to open and close an internal processing "window."
• Decode the input line count to create a delayed internal frame-based reset signal (frame_rst) during vblank_in. The line number is specified in the CORE Generator GUI (Frame Reset Line Number).
The timing of hblank_in must satisfy the following criteria:
• It must be low for the active-data duration of the input line.
• It must be high for a period greater than or equal to 100 video_in_clk cycles, once per line. This allows the scaler time to handle inherent line-based latency in the filters.
• It must be low for a period greater than or equal to 32 video_in_clk cycles, once per line.
The hblank_in input must be tied to the horizontal blanking signal provided with the
input video stream. Also, you may choose to use the inverse of hblank_in to create the
active_video_in signal (see the Active_video_in Input section).
Vblank_in Input
The vertical blanking input signal vblank_in is generally used as a frame-based reset. It
must be provided into the scaler core on the same clock domain as the video data
(video_in_clk).
The vblank_in signal is used to perform the following operations:
• Reset the input line counter (both edges).
• Generate the internal frame-based reset signal (frame_rst) during vertical blanking.
In Live Video mode, Frame Reset Line Number must be set to a value that is lower than
the number of line periods for which vblank_in remains high between frames. To
characterize this further, hblank_in must transition high a larger number of times than Frame Reset Line Number while vblank_in is high.
The vblank_in input must be tied to the vertical blanking signal provided with the input
video stream.
Frame_rst Signal
To maximize robustness of the scaler core, it is preferable to reset internal state-machines,
FIFOs and other processes once per frame. Owing to inherent multi-line period latency in
the system, it is not possible to use the vbank_in for this purpose. During vblank_in, hblank_in must continue to be active (as per most video formats). Frame_rst is
generated when the number of hblank_in pulses equals Frame Reset Line Number
specified in the CORE Generator/EDK GUI. Figure 4-3 is a screen shot from simulation,
showing the relationship between vblank_in, hblank_in and Frame_rst. The line
count shown is an internal counter included in this image for clarity. To achieve the case
illustrated, enter the value 22 into the CORE Generator GUI or pCore GUI.
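Informally, the relationship between vblank_in, hblank_in, and Frame_rst can be modeled by counting hblank_in rising edges while vblank_in is high. The following is a behavioral sketch only, not the core's implementation; the inputs are per-clock sampled values:

    def frame_rst_model(vblank_in, hblank_in, frame_reset_line_number):
        """Return a per-cycle frame_rst approximation from sampled blanking inputs."""
        frame_rst, count, prev_hb = [], 0, 0
        for vb, hb in zip(vblank_in, hblank_in):
            if not vb:
                count = 0                        # count only during vertical blanking
            elif hb and not prev_hb:
                count += 1                       # rising edge of hblank_in
            frame_rst.append(int(vb and count == frame_reset_line_number))
            prev_hb = hb
        return frame_rst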
The Frame_rst signal is used to perform the following operations:
• Trigger the transfer of coefficients from the coefficient FIFO to the coefficient stores, if and only if a full set of coefficients exists in the FIFO.
• Trigger the transfer of control register values from the scaler core pins to internal "active" registers, ready for use during the next frame. Setting bit 1 of the Control register to 0 prevents this transfer from happening.
• Reset the read- and write-pointers of the input and output line buffers.
• Reset the internal state-machine to indicate the next input line as the top line in a frame.
Active_video_in Input
The active_video_in signal is generally used as an input data validation signal. It must
be provided into the scaler core on the same clock domain as the video data
(video_in_clk).
The timing of active_video_in must satisfy the following criteria:
• The first low-to-high transition will coincide with the first active data value for the current line.
• This signal must be low when hblank_in is high.
• Following the transition from low to high, active_video_in must not transition low during the active period of the current line. Following a high-to-low transition, a pulse on the hblank_in signal must occur as described previously in the Hblank_in Input section.
• For each line, while hblank_in = 0, the active_video_in signal must remain high for at least ApertureEndPixel+1 cycles. For example, to scale an entire 720p image, set ApertureStartPixel = 0 and ApertureEndPixel = 1279. If hblank_in is driven high before this has occurred, the line will not be acknowledged by the scaler. This parameter is provided as an input to the scaler by the user.
You may choose to use the inverse of hblank_in to create the active_video_in signal.
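The active-period rule above can be checked with a small calculation, using the 720p figures from the text. The helper below is hypothetical and exists only to make the arithmetic explicit:

    def line_is_acknowledged(active_cycles, aperture_end_pixel):
        """A line is only accepted if active_video_in stays high long enough."""
        return active_cycles >= aperture_end_pixel + 1

    # Full-width 720p: ApertureStartPixel = 0, ApertureEndPixel = 1279,
    # so each line must supply at least 1280 active video_in_clk cycles.
    assert line_is_acknowledged(1280, 1279)
    assert not line_is_acknowledged(1279, 1279)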
Data Source: Memory
This mode is primarily intended for use with a memory controller with rectangular access
capability such as the VFBC port on the MPMC. The VFBC port must be configured to
provide the amount of data that the scaler is expecting for each frame. The port must
contain sufficient buffering for at least one horizontal line of the input video rectangle.
When this video interface mode has been selected in CORE Generator, hblank_in, vblank_in, and active_video_in timing signals are not required. Also, the video data
must be fed into the scaler core via the rd_data port instead of the video_data_in port.
The rd_almost_empty signal must be asserted when the port has less than one line
available in the buffer.
When rd_almost_empty is low and the scaler is ready to accept a new line of input data,
it asserts the rd_re signal high. This signal will remain high for the duration of one line
period (determined by aperture_start_pixel and aperture_end_pixel). The first
(left-most) valid data pixel must be driven onto the rd_data port one clock cycle after
rd_re has been asserted. See Figure 4-4.
Figure 4-4: Interface Timing for Memory Source Mode
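The source side of this handshake can be pictured with the following behavioral sketch. It is illustrative only; the port object and its clock() method are hypothetical, and a real system would use the memory controller (for example, the VFBC) rather than code like this:

    def serve_scaler_read(port, line_pixels):
        """Supply one line to the scaler over the rd_* interface described above."""
        port.rd_almost_empty = 0 if line_pixels else 1   # low only when a full line is buffered
        if port.rd_re:                                   # scaler requests a new line
            port.clock()                                 # data starts one cycle after rd_re rises
            for p in line_pixels:                        # left-most pixel first
                port.rd_data = p
                port.clock()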
It is important for the scaler core to have a concept of frame synchronization so that top-edge filtering may be performed cleanly. For this purpose, you must also supply a vertical
synchronization pulse vsync_in once per frame, before the input of the top line. Only the
rising edge of vsync_in is used internally. It should be provided in the video_in_clk
domain.
In this mode, cropping is not possible within the scaler itself as in Live Video mode.
aperture_start_pixel and aperture_start_line must be set to 0. Cropping can
be achieved using memory offsets. The first pixel and line provided to the scaler will
always be included in the horizontal and vertical apertures.
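As an illustration, cropping in this mode reduces to offsetting the read address into the frame buffer. The calculation below uses generic, hypothetical names and is not a documented VFBC or MPMC API:

    def cropped_read_address(base_addr, stride_bytes, bytes_per_pixel, crop_x, crop_y):
        """Byte address of the top-left pixel of the cropped window."""
        return base_addr + crop_y * stride_bytes + crop_x * bytes_per_pixel

    # Example: crop starting at pixel (64, 32) of an 8-bit 4:2:2 frame (2 bytes per pixel)
    # stored with a 4096-byte line stride:
    start = cropped_read_address(0x10000000, 4096, 2, 64, 32)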
Output Data and Timing Signals
Although driving the scaler input using a direct standard video feed is supported, the
equivalent cannot be said for the scaler output. Because of the bursty nature of the vertical
filter portion of the scaling operation, the required size of the output buffering would be
prohibitive. This would be more aptly targeted to an external memory interface, which is
beyond the scope of this LogiCORE™ IP. However, you may decide that your system can directly handle the bursty data output from the scaler, provided valid data is indicated by the core. Consequently, simple handshaking is achieved using the video_out_we and video_out_almost_full signals.
When a line of data becomes available in the output buffer and the video_out_almost_full flag is low, the video_out_we flag is asserted.
The video_out_almost_full input is provided to throttle the output from the scaler.
When this is asserted high for a number of line periods, the line_request signal will be
deasserted due to back-pressure through the scaler. If video_out_almost_full is low
at the start of an output line, the entire line will be delivered. The target must de-assert
video_out_almost_full when it is ready to accept the entire line.
Upon completion of the final line requested according to the output_v_size parameter,
the scaler will send a pulse of six video_out_clk cycles on the output_frame_done
signal.
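A downstream consumer of the bursty output might therefore behave as in the following sketch (hypothetical objects, not RTL or a supplied driver): it throttles the scaler with video_out_almost_full, captures data qualified by video_out_we, and stops on output_frame_done.

    def capture_one_frame(scaler, sink, sink_high_watermark):
        """Collect one output frame, applying back-pressure via video_out_almost_full."""
        while True:
            scaler.video_out_almost_full = int(len(sink) > sink_high_watermark)
            scaler.clock()                        # one video_out_clk cycle
            if scaler.video_out_we:               # data is valid only while we is high
                sink.append(scaler.video_data_out)
            if scaler.output_frame_done:          # pulses after the final output line
                return sink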
For 4:2:0 outputs, the valid chroma data output will be accompanied by a high level on the
chroma_out signal as shown in Figure 4-6.
Scaler Architectures
The scaler supports the following possible arrangements of the internal filters:
• Option 1: Single-engine for sequential YC processing
• Option 2: Dual-engine for parallel YC processing
• Option 3: Triple-engine for parallel RGB/4:4:4 processing
When using RGB/4:4:4, only Option 3 can be used. Selecting Option 1 or Option 2 significantly affects the trade-off between throughput and resource usage. These three options are described in detail in this chapter.
Architecture Descriptions
Single-Engine for Sequential YC Processing
This is the most complex of the three options because Y, Cr, and Cb operations are
multiplexed through the same filter engine kernel.
One entire line of one channel (for example luma) is processed before the single-scaler
engine is dedicated to another channel of the same video line. The input buffering
arrangement allows for the channels to be separated on a line-basis. The internal data path
bit widths are shown in Figure 5-1, as implemented for a 4:2:2 or 4:2:0 scaler. DataWidth
may be set to 8, 10, or 12 bits.
Figure 5-1: Internal Data Path Bitwidths for Single-Engine YC Mode
The scaler module is flanked by buffers that are large enough to contain one line of data,
double buffered.
At the input, the line buffer size is determined by the parameter
max_samples_in_per_line. At the output, the line-buffer size is determined by the
parameter max_samples_out_per_line. These line buffers enable line-based
arbitration, and avoid pixel-based handshaking issues between the input and the scaler
core. The input line buffer also serves as the “most recent” vertical tap (that is, the lowest
in the image) in the vertical filter.
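A rough storage estimate for these double-buffered line buffers is sketched below. This is an approximation under stated assumptions; the core's actual memory organization may differ:

    def line_buffer_bits(max_samples_per_line, sample_width_bits, double_buffered=True):
        """Approximate storage for one I/O line buffer."""
        depth = max_samples_per_line * (2 if double_buffered else 1)
        return depth * sample_width_bits

    # 8-bit 4:2:2 input path (2*DataWidth = 16 bits per sample), 1920 samples per line:
    input_buffer_bits = line_buffer_bits(1920, 16)    # 61,440 bits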
4:2:0 Special Requirements
When operating with 4:2:0, it is also important to include the following restriction: when scaling 4:2:0, the vertical scale factor applied at the vsf input must not be less than (2^20) * 144/1080. This restriction has been included because Direct Mode 4:2:0 requires additional input buffering to align the chroma vertical aperture with the correct luma
vertical aperture. In a later release of the video scaler, this restriction will be removed.
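As a worked example of this limit, assuming vsf is programmed as an integer with the 2^20 scaling implied by the expression above (consult the control register descriptions for the authoritative encoding):

    import math

    min_vsf = (2 ** 20) * 144 / 1080       # 139810.13...
    min_vsf_reg = math.ceil(min_vsf)       # smallest legal integer setting: 139811
    # If vsf encodes the input-to-output line ratio (Yin/Yout), this corresponds to
    # a maximum vertical up-scaling ratio of 1080/144 = 7.5 for 4:2:0 content.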
Dual-Engine for Parallel YC Processing
For this architecture, separate engines are used to process Luma and Chroma channels in
parallel as shown in Figure 5-2.
Figure 5-2: Internal Data Path Bitwidths for Dual-Engine YC Mode
For the Chroma channel, Cr and Cb are processed sequentially. Due to overheads in
completing each component, the chroma channel operations for each line require slightly
more time than the Luma operation. It is worth noting also that the Y and C operations do
not work in synchrony.
Triple-Engine for RGB/4:4:4 Processing
For this architecture, separate engines are used to process the three channels in parallel, as
shown in Figure 5-3.
Figure 5-3: Internal Data Path Bitwidths for Triple-Engine RGB/4:4:4 Architecture
For this case, all three channels are processed in synchrony.