Xilinx LogiCORE IP Video Scaler v4.0 User Manual

LogiCORE™ IP Video Scaler v4.0
User Guide
UG805 March 1, 2011
Xilinx is providing this product documentation, hereinafter “Information,” to you “AS IS” with no warranty of any kind, express or implied. Xilinx makes no representation that the Information, or any particular implementation thereof, is free from any claims of infringement. You are responsible for obtaining any rights you may require for any implementation based on the Information. All specifications are subject to change without notice.
XILINX EXPRESSLY DISCLAIMS ANY WARRANTY WHATSOEVER WITH RESPECT TO THE ADEQUACY OF THE INFORMATION OR ANY IMPLEMENTATION BASED THEREON, INCLUDING BUT NOT LIMITED TO ANY WARRANTIES OR REPRESENTATIONS THAT THIS IMPLEMENTATION IS FREE FROM CLAIMS OF INFRINGEMENT AND ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
Except as stated herein, none of the Information may be copied, reproduced, distributed, republished, downloaded, displayed, posted, or transmitted in any form or by any means including, but not limited to, electronic, mechanical, photocopying, recording, or otherwise, without the prior written consent of Xilinx.
© Copyright 2009-2011 Xilinx, Inc. XILINX, the Xilinx logo, Artix, ISE, Kintex, Spartan, Virtex, and other designated brands included herein are trademarks of Xilinx in the United States and other countries. MATLAB is a registered trademark of The MathWorks, Inc. All other trademarks are the property of their respective owners.
Revision History
The following table shows the revision history for this document.
Date Version Revision
04/24/09 1.0 Initial Xilinx release.
09/16/09 2.0 Updated for core release version 2.0.
04/19/10 2.1 Updated for core release version 2.1.
09/21/10 3.0 Updated for core release version 3.0.
03/01/11 4.0 Updated for core release version 4.0.
Video Scaler v4.0 User Guide www.xilinx.com UG805 March 1, 2011
Table of Contents
Revision History . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
Schedule of Figures. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
Schedule of Tables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
Preface: About This Guide
Guide Contents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
Additional Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Typographical. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
Online Document . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
Chapter 1: Introduction
About the Core. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Recommended Experience. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Additional Core Resources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Technical Support . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
Providing Feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Core . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Documentation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Nomenclature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Chapter 2: Overview
Chapter 3: Implementation
Basic Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
I/O Buffering, Clock Domains . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
Chapter 4: Video I/O Interface and Timing
Data Source: Live Video. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Input Data and Timing Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
General Input Handshaking Principles. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
Hblank_in Input . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Vblank_in Input . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Frame_rst Signal . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
Active_video_in Input . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Data Source: Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Output Data and Timing Signals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Chapter 5: Scaler Architectures
Architecture Descriptions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
Single-Engine for Sequential YC Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
4:2:0 Special Requirements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Dual-Engine for Parallel YC Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
Triple-Engine for RGB/4:4:4 Processing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
GUI Operation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Chapter 6: Control Interface
Control Values . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
Constant (Fixed) Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
General Purpose Processor (GPP) Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
Coefficient Delivery for GPP Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
EDK pCore Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
Parameter Modification in CORE Generator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Scaler Software Driver. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Coefficient Delivery for EDK pCore Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Interrupts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
Chapter 7: Scaler Aperture
Input Aperture Definition. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Cropping . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
Chapter 8: Coefficients
Coefficient Table. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
Coefficient Interface . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
Examples of Coefficient Set Generation and Loading . . . . . . . . . . . . . . . . . . . . . . . . 44
Example 1: Num_h_taps = num_v_taps = 8; max_phases = 4 . . . . . . . . . . . . . . . . . . . 44
Example 2: Num_h_taps = num_v_taps = 8; max_phases = 5, 6, 7 or 8; num_h_phases = num_v_phases = 4 . . . . . . . . . . . . . . . 47
Example 3: Num_h_taps = 9; num_v_taps = 7; max_phases = num_h_phases = num_v_phases = 4 . . . . . . . . . . . . . . . . . . . . . . . . . 49
Coefficient Preloading Using a .coe File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
Generating .coe Files . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
Extracting Coefficients From xscaler_coefs.c File . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
Format for .coe Files. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
Coefficient Readback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
Chapter 9: Performance
Live Video Mode. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
Memory Mode . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
Appendix A: Use Cases
Typical Uses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
Appendix B: Programmer Guide
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
Conventions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
Register Definitions. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
Filter Coefficient Calculations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
Video Scaler Flow Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
System Timing Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
Proposed API function calls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
L0 API Function Calls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
L1 API Function Calls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
L2 API Function Calls . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
Example Settings. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
Pass Thru . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
Down Sample by 2 in Horizontal and Vertical . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
Appendix C: System Level Design
Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
Example System General Configuration.. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
Control Buses . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
VDMA0 Configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
VDMA1 Configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
VDMA2 Configuration. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
Video Scaler Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
MPMC Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
Scaler READ-port . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
Scaler WRITE-port . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
Cropping from Memory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
OSD Configuration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
EDK MHS File Text . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
Schedule of Figures
Chapter 1: Introduction
Chapter 2: Overview
Chapter 3: Implementation
Figure 3-1: High Level View of the Functionality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
Figure 3-2: Simplified Top Level Block Diagram, Indicating Clock-domains . . . . . . . . 22
Chapter 4: Video I/O Interface and Timing
Figure 4-1: Scaler 8-bit 4:2:2 Input Timing . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Figure 4-2: Scaler 8-bit 4:2:0 Input Chroma Validation. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
Figure 4-3: VBlank, HBlank, Frame_rst, LineCount Screenshot, with Frame Reset Line Number = 22 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
Figure 4-4: Interface Timing for Memory Source Mode. . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
Figure 4-5: Scaler Output Timing (8-bits YC4:2:2/4:2:0) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Figure 4-6: Scaler 4:2:0 Output Validation (8-bits). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
Chapter 5: Scaler Architectures
Figure 5-1: Internal Data Path Bitwidths for Single-Engine YC Mode . . . . . . . . . . . . . . . 29
Figure 5-2: Internal Data Path Bitwidths for Dual-Engine YC Mode . . . . . . . . . . . . . . . . 30
Figure 5-3: Internal Data Path Bitwidths for Triple-Engine RGB/4:4:4 Architecture . . . 30
Figure 5-4: Auto Select in GUI. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
Figure 5-5: CORE Generator GUI Information Tab . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Chapter 6: Control Interface
Figure 6-1: Typical EDK-based System Showing Interrupt Structure. . . . . . . . . . . . . . . . 38
Chapter 7: Scaler Aperture
Figure 7-1: Hblank_in at Falling Edge of VBlank_in . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
Figure 7-2: Active_video_in in Relation to First Active Sample . . . . . . . . . . . . . . . . . . . . . 40
Figure 7-3: Cropping from the Input Image . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
Chapter 8: Coefficients
Figure 8-1: Coefficient Write-Format on coef_data_in(31:0) . . . . . . . . . . . . . . . . . . . . . . . . 42
Figure 8-2: Coefficient Loading Mechanism, Including External FIFO . . . . . . . . . . . . . . 43
Figure 8-3: Coefficient Loading Procedure – One Phase (8-tap filter shown) . . . . . . . . . 44
Chapter 9: Performance
Appendix A: Use Cases
Figure A-1: Format Down-scaling. Example 720p to 640x480, HSF = 2^20 x 1280/640; VSF = 2^20 x 720/480 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
Figure A-2: Format Up-scaling. Example 640x480 to 720p, HSF = 2^20 x 640/1280; VSF = 2^20 x 480/720 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
Figure A-3: Zoom (Up-scaling), HSF = 2^20 x 480/1280; VSF = 2^20 x 270/720 . . . . . . . . . . . . 72
Figure A-4: Shrink (Down-scaling). Example for Picture-in-Picture (PinP), HSF = 2^20 x 1280/480; VSF = 2^20 x 720/270 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
Figure A-5: Zoom (Up-scaling) reading from External Memory, HSF = 2^20 x 480/1280; VSF = 2^20 x 270/720 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
Appendix B: Programmer Guide
Figure B-0: Video Scaler Flow Chart. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
Figure B-0: System Timing Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
Appendix C: System Level Design
Figure C-1: Simplified System Diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
Schedule of Tables
Chapter 1: Introduction
Table 1-1: Nomenclature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
Chapter 2: Overview
Chapter 3: Implementation
Chapter 4: Video I/O Interface and Timing
Chapter 5: Scaler Architectures
Chapter 6: Control Interface
Chapter 7: Scaler Aperture
Chapter 8: Coefficients
Table 8-1: Example 1 Decimal Coefficients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
Table 8-2: Example 1 Normalized Integer Coefficients. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Table 8-3: Example 1 Coefficient Set Download Format . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
Table 8-4: Example 2 Coefficient Set Download Format . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
Table 8-5: Example 9-Tap Coefficients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49
Table 8-6: Example 7-Tap Coefficients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
Table 8-7: Example 3 Coefficient Set Download Format . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
Table 8-8: Coefficient “Binning” in SW Driver (xscaler_coefs.c) . . . . . . . . . . . . . . . . . . . . 52
Table 8-9: Ordering of Coefficients in .coe File for Different Coefficient Sharing Options . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
Table 8-10: .coe File Example 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55
Table 8-11: .coe File Example 2 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
Table 8-12: .coe File Example 3 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
Chapter 9: Performance
Table 9-1: Target Maximum Clock Frequencies . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
Table 9-2: Throughput Calculations for Different Chroma Formats . . . . . . . . . . . . . . . . . 63
Appendix A: Use Cases
Appendix B: Programmer Guide
Table B-1: Video Scaler Registers Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
Table B-2: control Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
Table B-3: reserved Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
Table B-4: status Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
Table B-5: status_done Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
Table B-6: horizontal_shrink_factor Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
Table B-7: vsf Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
Table B-8: aperture_horz Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
Table B-9: aperture_vert Register. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
Table B-10: output_size Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
Table B-11: num_phases Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
Table B-12: coeff_sets Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
Table B-13: start_hpa_y Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
Table B-14: start_vpa_y Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Table B-15: start_hpa_c Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Table B-16: start_vpa_c Register. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Table B-17: Coefficient_write_set_address Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
Table B-18: coef_values Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
Table B-19: Coefficient Set and Bank Read Address Register. . . . . . . . . . . . . . . . . . . . . . . 82
Table B-20: Coefficient Phase and Tap Read Address Register. . . . . . . . . . . . . . . . . . . . . . 83
Table B-21: Coefficient Memory Readback Output Register. . . . . . . . . . . . . . . . . . . . . . . . 83
Table B-22: Version Register. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
Table B-23: Software Reset Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
Table B-24: Global Interrupt Enable Register. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
Table B-25: Interrupt Status Register . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
Table B-26: Interrupt Enable Register. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
Table B-27: Pass Through Register Settings . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
Table B-28: Down Sample Register Settings. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
Appendix C: System Level Design
Preface: About This Guide
The LogiCORE™ IP Video Scaler v4.0 User Guide provides information about generating the Video Scaler core, customizing and simulating the core using the provided example design, and running the design files through implementation using the Xilinx tools.
Guide Contents
This manual contains the following chapters:
Chapter 1, Introduction introduces the Xilinx Video Scaler core and provides related information, including recommended design experience, additional resources, technical support, and submitting feedback to Xilinx.
Chapter 2, Overview illustrates examples of video scaler applications.
Chapter 3, Implementation elaborates on the internal structure in the core and describes interfacing.
Chapter 4, Video I/O Interface and Timing describes how to drive the input timing signals so the scaler can be operated correctly. It also describes the data output signals and their relation to the output data.
Chapter 5, Scaler Architectures describes the Single-Engine architecture for sequential YC processing, the Dual-Engine architecture for parallel YC processing, and the Triple-Engine architecture for parallel RGB/4:4:4 processing.
Chapter 6, Control Interface discusses the three control interface options available to the user in CORE Generator™ software: EDK pCore, GPP and Constant.
Chapter 7, Scaler Aperture explains how to define the scaler aperture using the appropriate dynamic control registers.
Chapter 8, Coefficients describes the coefficients used by both the Vertical and Horizontal filter portions of the scaler, in terms of number, range, formatting and download procedures.
Chapter 9, Performance emphasizes the importance of available clock rate and provides some worst-case conversion examples.
Appendix A, Use Cases illustrates two likely usage scenarios for the video scaler.
Appendix B, Programmer Guide provides a description of how to program and control the data flow for the video scaler hardware pCore.
Appendix C, System Level Design provides an example design extracted from a known, working EDK project, including other Video IP blocks.
Additional Resources
To find additional documentation, see the Xilinx website at:
http://www.xilinx.com/support/documentation/index.htm
To search the Answer Database of silicon, software, and IP questions and answers, or to create a technical support WebCase, see the Xilinx website at:
http://www.xilinx.com/support/mysupport.htm
Conventions
This document uses the following conventions. An example illustrates each convention.
Typographical
The following typographical conventions are used in this document:
Convention Meaning or Use Example
Courier font: Messages, prompts, and program files that the system displays. Example: speed grade: - 100
Courier bold: Literal commands that you enter in a syntactical statement. Example: ngdbuild design_name
Helvetica bold: Commands that you select from a menu. Example: File → Open
Helvetica bold: Keyboard shortcuts. Example: Ctrl+C
Italic font: Variables in a syntax statement for which you must supply values. Example: ngdbuild design_name
Italic font: References to other manuals. Example: See the User Guide for more information.
Italic font: Emphasis in text. Example: If a wire is drawn so that it overlaps the pin of a symbol, the two nets are not connected.
Dark Shading: Items that are not supported or reserved. Example: This feature is not supported
Square brackets [ ]: An optional entry or parameter. However, in bus specifications, such as bus[7:0], they are required. Example: ngdbuild [option_name] design_name
Braces { }: A list of items from which you must choose one or more. Example: lowpwr ={on|off}
Vertical bar |: Separates items in a list of choices. Example: lowpwr ={on|off}
Angle brackets < >: User-defined variable or in code samples. Example: <directory name>
Vertical ellipsis: Repetitive material that has been omitted. Example: IOB #1: Name = QOUT’ IOB #2: Name = CLKIN’ . . .
Horizontal ellipsis . . .: Repetitive material that has been omitted. Example: allow block block_name loc1 loc2 ... locn;
Notations: The prefix ‘0x’ or the suffix ‘h’ indicates hexadecimal notation. Example: A read of address 0x00112975 returned 45524943h.
Notations: An ‘_n’ means the signal is active low. Example: usr_teof_n is active low.
Online Document
The following conventions are used in this document:
Convention Meaning or Use Example
Blue text: Cross-reference link to a location in the current document. Example: See Additional Resources, page 12, for details. See Chapter 3, Basic Architecture for details.
Blue, underlined text: Hyperlink to a website (URL). Example: Go to www.xilinx.com for the latest speed files.
Chapter 1: Introduction
This chapter introduces the Video Scaler core and provides related information, including recommended design experience, additional resources, technical support, and submitting feedback to Xilinx.
About the Core
The Video Scaler core is a Xilinx CORE Generator™ IP core, included in the latest IP Update on the Xilinx IP Center. For detailed information about the core, see the Video Scaler product page at www.xilinx.com/products/ipcenter/EF-DI-VID-SCALER.htm.
Recommended Experience
Although the Video Scaler core is a fully verified solution, the challenge associated with implementing a complete design varies depending on the configuration and functionality of the application. For best results, previous experience building high performance, pipelined FPGA designs using Xilinx implementation software and UCF is recommended.
Contact your local Xilinx representative for a closer review and estimation for your specific requirements.
Additional Core Resources
For detailed information about video scaler technology and updates to the Video Scaler core, see the following:
Documentation
From the Video Scaler product page:
Video Scaler Data Sheet
Video Scaler Release Notes
Technical Support
For technical support, visit www.xilinx.com/support. Questions are routed to a team of engineers with expertise using the Video Scaler core.
Xilinx will provide technical support for use of this product as described in the LogiCORE™ IP Video Scaler User Guide. Xilinx cannot guarantee timing, functionality, or support of this product for designs that do not follow these guidelines.
Providing Feedback
Xilinx welcomes comments and suggestions about the Video Scaler core and the documentation supplied with the core.
Core
For comments or suggestions about the Video Scaler core, submit a WebCase from www.xilinx.com/support. Be sure to include the following information:
• Product name
• Core version number
• Explanation of your comments
Documentation
For comments or suggestions about this document, submit a WebCase from www.xilinx.com/support. Be sure to include the following information:
• Document title
• Document number
• Page number(s) to which your comments refer
• Explanation of your comments
Nomenclature
The following are defined for the purposes of this document:
Table 1-1: Nomenclature
Term                      Definition
Scaler Aperture           The input data rectangle used to create the output data rectangle.
Filter Aperture           The group of contributory data used in a filter to generate one particular output. The number of elements in this group of data is the number of taps. The filter aperture size is defined using the num_h_taps and num_v_taps parameters.
Coefficient Phase         Each tap is multiplied by a coefficient to make its contribution to the output pixel. The coefficients used are selected from a “phase” of num_x_taps coefficients. The phase selection is dependent upon the position of the output pixel in the input sampling grid space. For each dimension of the filter, each coefficient phase consists of num_h_taps or num_v_taps coefficients.
Channel                   For scaler purposes, all monochromatic video streams (for example Y, Cb, Cr, R, G, B) are considered separate channels.
Coefficient Phase Index   An index that selects the coefficient phase applied to one filter aperture in a FIR. For an n-tap filter, this index points to n coefficients.
Coefficient Bank          A group of coefficients that will be applied to one video component (Y or C) in one dimension (H or V) for a conversion of one frame. It includes all phases. For an n-tap, m-phase filter, a coefficient bank comprises n x m values. Each tap may be multiplied by any one of m coefficients assigned to it, selected by the phase index, which is applied to all taps.
Coefficient Set           A group of four coefficient banks (VY, VC, HY, HC). One full set should be written into the scaler before use.
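As a quick illustration of the bank and set definitions above, the following sketch counts coefficients. The function names are hypothetical, and it assumes the luma and chroma banks in a given dimension share the same tap and phase counts. It follows the nomenclature only and says nothing about how the core lays out its coefficient memory.

```python
def bank_size(num_taps: int, num_phases: int) -> int:
    """Coefficients in one bank: n taps x m phases, per Table 1-1."""
    return num_taps * num_phases


def set_size(num_h_taps: int, num_v_taps: int,
             num_h_phases: int, num_v_phases: int) -> int:
    """Coefficients in one full set of four banks (VY, VC, HY, HC).

    Assumption: Y and C banks in the same dimension have the same size.
    """
    vertical = bank_size(num_v_taps, num_v_phases)      # VY and VC
    horizontal = bank_size(num_h_taps, num_h_phases)    # HY and HC
    return 2 * vertical + 2 * horizontal
```

For example, a hypothetical 4-tap, 16-phase filter in both dimensions gives 64 coefficients per bank.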
Chapter 2
Overview
Video scaling is the process of converting an input color image of dimensions Xin pixels by Yin lines to an output color image of dimensions Xout pixels by Yout lines.
Within predefined limits, the Xilinx Video Scaler supports the modification of the Xin, Yin, Xout, Yout input parameters during run-time on a frame basis. Furthermore, you may also dynamically crop a selected subject area from the input image prior to scaling that area. This dynamic combination lends itself well to applications that require shrink and zoom functionality.
The Xilinx Video Scaler supports real-time video inputs and memory interface inputs (that is, a frame buffer). When connected to a real-time input source, the input clock and horizontal and vertical (H/V) timing signals come directly from the input video stream. In the case of a memory interface, standard memory handshaking signals may be used in place of the H/V timing signals.
While maintaining image quality is usually of primary interest, it is subjective and heavily dependent upon the end application. Moreover, image quality comes at a price in terms of FPGA resources. Hence, while the core structure and architecture of the scaler is maintained for all applications, flexibility is made paramount to enable users from all applications to use this IP.
Chapter 3
Implementation
This section elaborates on the internal structure in the core, and describes interfacing.
Basic Architecture
The Xilinx Video Scaler LogiCORE™ IP converts a specified rectangular area of an input digital video image from the original sampling grid to a desired target sampling grid (Figure 3-1).
Figure 3-1: High Level View of the Functionality
[Figure content: Video Rectangle In (dimensions Xin x Yin) passes through the Video Scaler to produce Video Rectangle Out (dimensions Xout x Yout).]
The input image must be provided in raster scan format (left to right and top to bottom). The valid outputs will also be given in this order.
The Xilinx Video Scaler makes few assumptions regarding the origin or the destination of the video data. The input could be fed in real-time from a live video feed, or it could be read from an external memory. The output could feed directly to another processing stage in real time, but also could feed an external frame buffer (for example, for a VGA controller, or a Picture-in-Picture controller). Whatever the configuration, you must assess, given the clock-frequency available, how much time is available for scaling, and define:
1. Whether to source the scaler using live video or an input-side frame buffer, and
2. Whether the scaler feeds out directly to the next stage or to an output-side frame buffer.
When using a live video input source, you have no control over the video timing signals. Hence, the specific requirements must allow for this. For example, when up-scaling by a factor of 2, two lines must be output for every input line. The scaler core clock-rate (‘clk’) must allow for this, especially considering the architectural specifics within the scaler that take advantage of the high speed features of the FPGA to allow for resource sharing.
Feeding data from an input frame buffer is more costly, but it allows you to read the required data as needed while still having one “frame” period in which to process it.
Some observations (not exclusively true for all conversions):
• Generally, when up-scaling, or dealing with high definition (HD) rates, it is simplest to use an input-side frame buffer. This does depend upon the available clock rates.
• When down-scaling, it is often the case that the input-side frame buffer is not required, because for every input line the scaler is required to generate a maximum of one valid output line.
• Generally, the output data does not conform to any standard. It is therefore not possible to feed the output directly to a display driver. Usually, a frame buffer is ultimately required to smooth the output data over an output frame period. The output video stream is described later.
I/O Buffering, Clock Domains
Figure 3-2 shows the top level buffering, indicating the different clock domains, and the scope of the control state-machines.
Figure 3-2: Simplified Top Level Block Diagram, Indicating Clock-domains
[Figure content: clocks video_in_clk, clk, and video_out_clk; blocks Async Input Line Buffer, Scaler Module, Async Output Line Buffers, and Control State Machines.]
To support the many possibilities of input and output configurations, and to take advantage of the fast FPGA fabric, the scaler core uses a separate clock domain from that used in controlling data I/O. Chapter 9, “Performance,” explains how to calculate the minimum required operational clock frequency. It is also possible to read the output of the scaler using a third clock domain. These clock domains are isolated from each other using asynchronous line buffers as shown in Figure 3-2. The control state-machines monitor the I/O line buffers. They also monitor the current input and output line numbers.
Chapter 4
Video I/O Interface and Timing
CORE Generator™ software provides two interface options for provision of the video data into the video scaler core.
1. Live – standard format video signal, along with synchronization signals to be driven directly into the core.
2. Memory – an internal memory arbiter is included in the core, so the active video area may be accessed from an external memory block.
Data Source: Live Video
Input Data and Timing Signals
• General Input Handshaking Principles
• Hblank_in Input
• Vblank_in Input
• Frame_rst Signal
• Active_video_in Input
General Input Handshaking Principles
The input data is written into an internal double-buffered line buffer. Availability of space for one entire line of data is indicated by a high level on the line_request output. One line of data, of a length up to max_samples_in_per_line, may be written to this buffer without the need for further arbitration. Following the first valid pixel-write operation to this line buffer, the line_request output will be driven low by the scaler. This signal may rise a few (> 3) clock cycles later to indicate availability of the other half of the double buffer. The number of clock cycles is dependent on the current conversion.
Valid video data is written into the input line buffer using active_video_in as a write-enable. This is shown in Figure 4-1 for the 8-bit 4:2:2 case. The active_video_in signal must remain in a high state for the duration of the active input line.
Figure 4-1: Scaler 8-bit 4:2:2 Input Timing
[Figure content: video_in_clk, line_request, active_video_in, video_data_in(7:0) (Luma) carrying Y0, Y1, ... Ysize-1, and video_data_in(15:8) (Chroma) carrying Cb0, Cr0, ... Crsize-2.]
The scaler is capable of accepting and delivering 4:4:4 (e.g., RGB), 4:2:2, and 4:2:0 chroma formats. It will not convert between chroma formats. For delivery of 4:4:4 video data, a third channel would be added to this diagram, and the three channels would be either R, G, and B or Y, Cb, and Cr. In terms of bandwidth, 4:2:0 is essentially the same as 4:2:2 horizontally, but has half the bandwidth vertically, so different signaling is required for the delivery of the YC 4:2:2 and YC 4:2:0 chroma systems. The luma (Y) input is a full-bandwidth input on video_data_in[data_width-1:0] (video_data_in[7:0] in the 8-bit case). The chroma for both 4:2:0 and 4:2:2 is also a full-bandwidth input on video_data_in[(data_width*2)-1:data_width], but Cb and Cr are interleaved on a pixel basis, as shown in Figure 4-1 for the 8-bit case. An additional input, active_chroma_in, is required in the 4:2:0 case. This must be asserted high on all lines for 4:2:2, but only on alternate lines for 4:2:0, as shown in Figure 4-2.
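The bus packing described above can be sketched as a small software model, for example when building test vectors. The function names are hypothetical. The sketch assumes Cb travels with even-numbered pixels and Cr with odd-numbered pixels (as the Cb0, Cr0 ordering in Figure 4-1 suggests) and that 4:2:0 chroma is valid on lines 1, 3, and so on (as in Figure 4-2).

```python
def pack_422_line(luma, cb, cr, data_width=8):
    """Build video_data_in words for one line of YC data.

    luma has one sample per pixel; cb and cr have one sample per pixel pair.
    Luma occupies bits [data_width-1:0]; chroma occupies bits
    [2*data_width-1:data_width], with Cb on even pixels and Cr on odd pixels.
    """
    if len(luma) % 2 or len(cb) != len(luma) // 2 or len(cr) != len(cb):
        raise ValueError("expected an even pixel count and half-rate chroma")
    mask = (1 << data_width) - 1
    words = []
    for n, y in enumerate(luma):
        c = cb[n // 2] if n % 2 == 0 else cr[n // 2]
        words.append(((c & mask) << data_width) | (y & mask))
    return words


def active_chroma_in(line_number, chroma_format):
    """Level of active_chroma_in for a 1-based input line number."""
    if chroma_format == "4:2:2":
        return True                      # asserted on every line
    if chroma_format == "4:2:0":
        return line_number % 2 == 1      # assumption: lines 1, 3, 5, ...
    raise ValueError("unsupported chroma format")
```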
Figure 4-2: Scaler 8-bit 4:2:0 Input Chroma Validation
[Figure content: chroma_in marks the chroma on video_data_in(15:8) as valid on lines 1 and 3 and not valid (N/V) on lines 2 and 4.]
When running the scaler using Live Mode, you are likely to derive the active_video_in from timing signals such as horizontal sync or embedded flags like EAV and SAV. In this case, you will have calculated that the line-rate at the input, often defined by the input video format, is sufficiently low that the host system will never need to wait for the line_request signal to be asserted.
Alternatively, you may calculate that this is not possible, and that the scaler must hold off the input data. The deasserted state of the line_request flag should then be used to hold off the write-operation for a new line. Since it is impossible to hold off a live video feed, the data must be fed (directly or indirectly) from a frame buffer, and the appropriate external control provided (Memory Mode).
Hblank_in Input
The horizontal blanking input signal hblank_in is generally used as a line-based reset. It must be provided to the scaler core in the same clock domain as the video data (video_in_clk).
The hblank_in signal is used to perform the following operations:
• Reset an internal input pixel counter.
• Reset the internal input side line buffer write-address pointer.
• Increment the input line counter (rising edge of hblank_in).
• Decode the input line count during active data period to open and close an internal processing “window.”
• Decode the input line count to create a delayed internal frame-based reset signal (frame_rst) during vblank_in. The line-number is specified in the CORE Generator GUI (Frame Reset Line Number).
The timing of hblank_in must satisfy the following criteria:
• It must be low for the active-data duration of the input line.
• It must be high for a period greater than or equal to 100 video_in_clk-cycles in duration, once per line. This allows the scaler time to handle inherent line-based latency in the filters.
• It must be low for a period greater than or equal to 32 video_in_clk-cycles in duration, once per line.
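The two duration criteria above lend themselves to a simple testbench-style check. The sketch below is illustrative only: the function name is hypothetical, it operates on a list of hblank_in samples (one per video_in_clk cycle) covering one line period, and it does not verify that hblank_in is low throughout the active-data period.

```python
def check_hblank_timing(hblank, min_high=100, min_low=32):
    """Check the hblank_in duration criteria for one line of samples.

    Returns True when the longest high run is at least min_high cycles and
    the longest low run is at least min_low cycles.
    """
    longest = {0: 0, 1: 0}
    run_value, run_length = None, 0
    for sample in hblank:
        value = 1 if sample else 0
        if value == run_value:
            run_length += 1
        else:
            run_value, run_length = value, 1
        longest[value] = max(longest[value], run_length)
    return longest[1] >= min_high and longest[0] >= min_low
```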
The hblank_in input must be tied to the horizontal blanking signal provided with the input video stream. Also, you may choose to use the inverse of hblank_in to create the active_video_in signal (see the Active_video_in Input section).
Vblank_in Input
The vertical blanking input signal vblank_in is generally used as a frame-based reset. It must be provided into the scaler core on the same clock domain as the video data (video_in_clk).
The vblank_in signal is used to perform the following operations:
• Reset input line counter (both edges).
• Generate internal frame-based reset signal (frame_rst) during vertical blanking.
In Live Video mode, Frame Reset Line Number must be set to a value that is lower than the number of line periods for which vblank_in remains high between frames. To characterize this further, hblank_in must transition high a larger number of times than Frame Reset Line Number while vblank_in is high.
The vblank_in input must be tied to the vertical blanking signal provided with the input video stream.
Frame_rst Signal
To maximize robustness of the scaler core, it is preferable to reset internal state-machines, FIFOs and other processes once per frame. Owing to inherent multi-line period latency in the system, it is not possible to use vblank_in for this purpose. During vblank_in, hblank_in must continue to be active (as per most video formats). Frame_rst is generated when the number of hblank_in pulses equals the Frame Reset Line Number specified in the CORE Generator/EDK GUI. Figure 4-3 is a screen shot from simulation, showing the relationship between vblank_in, hblank_in and Frame_rst. The line count shown is an internal counter included in this image for clarity. To achieve the case illustrated, enter the value 22 into the CORE Generator GUI or pCore GUI.
Figure 4-3: VBlank, HBlank, Frame_rst, LineCount Screenshot, with Frame Reset Line Number = 22
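The relationship between vblank_in, hblank_in, and Frame_rst can be approximated with a behavioral model such as the one below. This is a sketch based only on the description in this chapter, not the core's actual logic: the function name is hypothetical, and the ordering of the counter reset and increment when both inputs change on the same sample is an assumption.

```python
def frame_rst_cycles(vblank, hblank, frame_reset_line_number):
    """Return the sample indices at which Frame_rst would be raised.

    vblank and hblank are equal-length sequences of 0/1 samples taken on
    video_in_clk. The line counter is cleared on either edge of vblank_in
    and incremented on each rising edge of hblank_in.
    """
    cycles = []
    line_count = 0
    prev_v, prev_h = 0, 0
    for i, (v, h) in enumerate(zip(vblank, hblank)):
        if v != prev_v:
            line_count = 0               # assumption: reset applied first
        if h and not prev_h:
            line_count += 1
            if v and line_count == frame_reset_line_number:
                cycles.append(i)
        prev_v, prev_h = v, h
    return cycles
```

If Frame Reset Line Number exceeds the number of hblank_in rising edges seen while vblank_in is high, the model never raises Frame_rst, which mirrors the constraint stated in the Vblank_in Input section.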
The Frame_rst signal is used to perform the following operations:
• Trigger the transfer of coefficients from the coefficient FIFO to the coefficient stores if and only if a full set of coefficients exists in the FIFO.
• Trigger the transfer of control register values from the scaler core pins to internal “active” registers, ready for use during the next frame. Setting bit 1 of the Control register to 0 prevents this transfer from happening.
• Reset read- and write-pointers of input and output line buffers.
• Reset internal state-machine to indicate next input line as the top line in a frame.
Active_video_in Input
The active_video_in signal is generally used as an input data validation signal. It must be provided into the scaler core on the same clock domain as the video data (video_in_clk).
The timing of active_video_in must satisfy the following criteria:
• The first low-to-high transition must coincide with the first active data value for the current line.
• This signal must be low when hblank_in is high.
• Following the transition from low to high, active_video_in must not transition low during the active period of the current line. Following a high-to-low transition, a pulse on the hblank_in signal must occur as described previously in the Hblank_in Input section.
• For each line, while hblank_in = 0, the active_video_in signal must remain high for at least ApertureEndPixel+1 cycles. If hblank_in is driven high before this has occurred, the line will not be acknowledged by the scaler. ApertureEndPixel is provided as an input to the scaler by the user. For example, to scale an entire 720P image, set ApertureStartPixel = 0 and ApertureEndPixel = 1279.
You may choose to use the inverse of hblank_in to create the active_video_in signal.
Data Source: Memory
This mode is primarily intended for use with a memory controller with rectangular access capability such as the VFBC port on the MPMC. The VFBC port must be configured to provide the amount of data that the scaler is expecting for each frame. The port must contain sufficient buffering for at least one horizontal line of the input video rectangle.
When this video interface mode has been selected in CORE Generator, hblank_in, vblank_in, and active_video_in timing signals are not required. Also, the video data must be fed into the scaler core via the rd_data port instead of the video_data_in port.
The rd_almost_empty signal must be asserted when the port has less than one line available in the buffer.
When rd_almost_empty is low and the scaler is ready to accept a new line of input data, it asserts the rd_re signal high. This signal will remain high for the duration of one line period (determined by aperture_start_pixel and aperture_end_pixel). The first (left-most) valid data pixel must be driven onto the rd_data port one clock cycle after rd_re has been asserted. See Figure 4-4.
Figure 4-4: Interface Timing for Memory Source Mode
It is important for the scaler core to have a concept of frame synchronization so that top-edge filtering may be performed cleanly. For this purpose, you must also supply a vertical synchronization pulse vsync_in once per frame, before the input of the top line. Only the rising edge of vsync_in is used internally. It should be provided in the video_in_clk domain.
Unlike Live Video mode, this mode does not support cropping within the scaler itself: aperture_start_pixel and aperture_start_line must be set to 0. Cropping can instead be achieved using memory offsets. The first pixel and line provided to the scaler will always be included in the horizontal and vertical apertures.
Output Data and Timing Signals
Although driving the scaler input using a direct standard video feed is supported, the equivalent cannot be said for the scaler output. Because of the bursty nature of the vertical filter portion of the scaling operation, the required size of the output buffering would be prohibitive. This would be more aptly targeted to an external memory interface, which is beyond the scope of this LogiCORE™ IP. However, you may decide that your system can directly handle the bursty data output from the scaler, provided valid data is indicated by the core. Consequently, simple handshaking is achieved using the video_out_we and video_out_almost_full signals.
When a line of data becomes available in the output buffer, and the video_out_almost_full flag is low, the video_out_we flag is asserted as shown in Figure 4-5, and data is driven out.
Figure 4-5: Scaler Output Timing (8-bits YC4:2:2/4:2:0)
The video_out_almost_full input is provided to throttle the output from the scaler. When this is asserted high for a number of line periods, the line_request signal will be deasserted due to back-pressure through the scaler. If video_out_almost_full is low at the start of an output line, the entire line will be delivered. The target must de-assert video_out_almost_full when it is ready to accept the entire line.
Upon completion of the final line requested according to the output_v_size parameter, the scaler will send a pulse of six video_out_clk cycles on the output_frame_done signal.
For 4:2:0 outputs, the valid chroma data output will be accompanied by a high level on the chroma_out signal as shown in Figure 4-6.
Figure 4-6: Scaler 4:2:0 Output Validation (8-bits)
[Figure content: chroma_out marks the chroma on video_data_out(15:8) as valid on lines 1 and 3 and not valid (N/V) on lines 2 and 4.]
Chapter 5
Scaler Architectures
The scaler supports the following possible arrangements of the internal filters.
• Option 1: Single-engine for sequential YC processing
• Option 2: Dual-engine for parallel YC processing
• Option 3: Triple-engine for parallel RGB/4:4:4 processing
When using RGB/4:4:4, only Option 3 can be used. The choice between Option 1 and Option 2 trades throughput against resource usage. These three options are described in detail in this chapter.
Architecture Descriptions
Single-Engine for Sequential YC Processing
This is the most complex of the three options because Y, Cr, and Cb operations are multiplexed through the same filter engine kernel.
One entire line of one channel (for example luma) is processed before the single-scaler engine is dedicated to another channel of the same video line. The input buffering arrangement allows for the channels to be separated on a line-basis. The internal data path bit widths are shown in Figure 5-1, as implemented for a 4:2:2 or 4:2:0 scaler. DataWidth may be set to 8, 10, or 12 bits.
The scaler module is flanked by buffers that are large enough to contain one line of data, double buffered.
At the input, the line buffer size is determined by the parameter max_samples_in_per_line. At the output, the line-buffer size is determined by the parameter max_samples_out_per_line. These line buffers enable line-based arbitration, and avoid pixel-based handshaking issues between the input and the scaler core. The input line buffer also serves as the “most recent” vertical tap (that is, the lowest in the image) in the vertical filter.
Figure 5-1: Internal Data Path Bitwidths for Single-Engine YC Mode
[Figure content: Input Line Buffer, Scaler, Output Line Buffer (Y), and Output Line Buffer (Cb/Cr), with bus widths labeled 2*DataWidth and 1*DataWidth.]
4:2:0 Special Requirements
When scaling 4:2:0, the vertical scale factor applied at the vsf input must not be less than $2^{20} \times 144/1080$. This restriction has been included because Direct Mode 4:2:0 requires additional input buffering to align the chroma vertical aperture with the correct luma vertical aperture. In a later release of the video scaler, this restriction will be removed.
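The limit above works out to a vsf register value of roughly 139810, which corresponds to a vertical up-scale of 1080/144 = 7.5. A minimal check, using a hypothetical helper name, might look like this:

```python
# Lower bound on vsf for 4:2:0, in the same 2^20-scaled units as vsf itself.
MIN_VSF_420 = (1 << 20) * 144 / 1080   # about 139810.13


def vsf_allowed_for_420(vsf: int) -> bool:
    """True when an integer vsf value meets the 4:2:0 restriction."""
    return vsf >= MIN_VSF_420
```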
Dual-Engine for Parallel YC Processing
For this architecture, separate engines are used to process Luma and Chroma channels in parallel as shown in Figure 5-2.
Figure 5-2: Internal Data Path Bitwidths for Dual-Engine YC Mode
[Figure content: video_data_in (2*DataWidth) feeds a Luma (Y) Input Line Buffer and a Chroma (Cr/Cb) Input Line Buffer; each drives its own Scaler Engine and Output Line Buffer at 1*DataWidth, and the two paths combine on video_data_out (2*DataWidth).]
For the Chroma channel, Cr and Cb are processed sequentially. Due to overheads in completing each component, the chroma channel operations for each line require slightly more time than the Luma operation. It is worth noting also that the Y and C operations do not work in synchrony.
Triple-Engine for RGB/4:4:4 Processing
For this architecture, separate engines are used to process the three channels in parallel, as shown in Figure 5-3.
Figure 5-3: Internal Data Path Bitwidths for Triple-Engine RGB/4:4:4 Architecture
[Figure content: video_data_in (3*DataWidth) feeds Ch1, Ch2, and Ch3 Input Line Buffers; each drives its own Scaler Engine and Output Line Buffer at 1*DataWidth, and the three paths combine on video_data_out (3*DataWidth).]
For this case, all three channels are processed in synchrony.
GUI Operation
When the chroma format is specified as 4:4:4, the triple-engine parallel architecture is always selected. Otherwise, selection between the YC Sequential or Parallel options may be achieved automatically (YC Filter Configuration = Auto Select) or manually in the CORE Generator GUI or the EDK GUI (see Figure 5-4).
The primary goal of selecting the correct architecture is to optimize resource usage, for a given worst case operational scenario. When Auto Select is selected, the GUI tries to establish what the user's worst case is from the following input parameters:
• Input maximum rectangle size
• Output maximum rectangle size
• Target clock frequency
• Desired frame rate
Figure 5-4: Auto Select in GUI
The pseudo-code calculation made by the GUI for the Auto Select option is as follows:
  OverheadMultiplier := 1.15;
  max_pixels := max(MaxHSizeIn, MaxHSizeOut);
  max_lines := max(MaxVSizeIn, MaxVSizeOut);
  max_frame_cycles := max_pixels * max_lines * OverheadMultiplier;
  MaxFrameRateOneComponent := (TgtFMax * 1000000) / max_frame_cycles;
  if (TgtFrameRate <= MaxFrameRateOneComponent / 2) then
    Use Single engine
  else
    Use Dual engine
  end if;
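For experimenting with the Auto Select rule outside the GUI, the pseudo-code can be transcribed directly. The function and argument names below are hypothetical; the arithmetic follows the pseudo-code above, with the target clock frequency given in MHz.

```python
def select_yc_engine(max_h_in, max_v_in, max_h_out, max_v_out,
                     tgt_fmax_mhz, tgt_frame_rate, overhead=1.15):
    """Return "single" or "dual" following the Auto Select pseudo-code."""
    max_pixels = max(max_h_in, max_h_out)
    max_lines = max(max_v_in, max_v_out)
    # Cycles needed to process one component of the largest frame,
    # including the 15% default overhead.
    max_frame_cycles = max_pixels * max_lines * overhead
    max_rate_one_component = (tgt_fmax_mhz * 1_000_000) / max_frame_cycles
    if tgt_frame_rate <= max_rate_one_component / 2:
        return "single"
    return "dual"
```

With a hypothetical 150 MHz target clock and 60 frames per second, a 1920x1080 worst case selects the dual engine, while a 720x480 worst case fits in the single engine.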
The Information tab (see Figure 5-5) in the CORE Generator GUI (not available in the EDK GUI) shows the estimated maximum achievable frame rate given the above information, using a calculation similar to the one above. You are advised to check this value, and may elect to override the automatic selection in either direction. This may be advisable in cases where, for example, a higher overhead per frame than 15% is needed. This overhead is intended as a general way of representing inactive periods in a frame such as blanking, but also includes filter flushing time, state-machine initialization, etc.
Figure 5-5: CORE Generator GUI Information Tab
Chapter 6
Control Interface
There are three control interface options available in CORE Generator™ software: EDK pCore, GPP or Constant. The interface types differ primarily in the method of delivery of the user-defined control values and filter coefficients. These values are listed in the video scaler data sheet DS840 in Table 2, under “Dynamic Control Register Interface.”
Control Values
There follows a brief description of the function of the control values.
In GPP mode and pCore mode, these values are provided as dynamic inputs, and may be changed during runtime. The user inputs become active once per frame after completion of an output frame, using an internal active value capture register.
For the pCore version of the core, CORE Generator software provides the GPP core placed in a wrapper which allows you to parameterize the scaler core in EDK. The ports are driven by registers that sit on the AXI4-Lite interface. The address is decoded in the wrapper. A MicroBlaze™ processor software driver is provided in source-code form to drive these ports. Typical usage of the pCore is shown in Figure 6-1.
aperture_start_pixel, aperture_end_pixel, aperture_start_line, aperture_end_line
These parameters define the size and location of the input rectangle. They are explained in detail in Chapter 7, “Scaler Aperture.”
output_h_size, output_v_size
These two parameters define the size of the output rectangle. They do not determine anything about the target video format. You must determine what to do with the scaled rectangle that emerges from the scaler core.
hsf, vsf
These are the horizontal and vertical shrink-factors that must be supplied by the user. They should be supplied as integers, and can typically be calculated as follows:
$$\mathrm{hsf} = \mathrm{round}\left(\frac{\mathrm{aperture\_end\_pixel} - \mathrm{aperture\_start\_pixel} + 1}{\mathrm{output\_h\_size}} \times 2^{20}\right)$$
and
$$\mathrm{vsf} = \mathrm{round}\left(\frac{\mathrm{aperture\_end\_line} - \mathrm{aperture\_start\_line} + 1}{\mathrm{output\_v\_size}} \times 2^{20}\right)$$
Hence, up-scaling is achieved using a shrink-factor value less than one. Down-scaling is achieved with a shrink-factor greater than one.
You may wish to work this calculation backwards: for a desired scale-factor, you may wish to calculate the output size or the input size. This is application-dependent. Smooth zoom/shrink applications may take advantage of this approach, coupled with the start-phase controls described below.
The allowed range of values on these parameters is 1/12 to 12: (0x015555 to 0xC00000).
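The shrink-factor is the ratio of input aperture size to output size, scaled by 2^20 and rounded to an integer, within the allowed range stated above. The following sketch shows that arithmetic. The function and constant names are hypothetical, and the choice to round halves upward is an assumption, since this guide does not state the rounding mode.

```python
SF_ONE = 1 << 20      # register value representing a shrink-factor of 1.0
SF_MIN = 0x015555     # 1/12
SF_MAX = 0xC00000     # 12


def shrink_factor(aperture_start, aperture_end, output_size):
    """Compute hsf or vsf from an aperture (inclusive bounds) and output size."""
    input_size = aperture_end - aperture_start + 1
    # Integer round-half-up of input_size * 2^20 / output_size.
    sf = (2 * input_size * SF_ONE + output_size) // (2 * output_size)
    if not SF_MIN <= sf <= SF_MAX:
        raise ValueError("shrink-factor outside the range 1/12 to 12")
    return sf
```

For example, up-scaling a 1280-pixel line to 1920 pixels gives hsf = 0x0AAAAB (less than one), while down-scaling 1920 pixels to 1280 gives hsf = 0x180000 (greater than one).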
num_h_phases, num_v_phases
Although you must specify the maximum number of phases (max_phases) that the core supports in the CORE Generator GUI, it is not necessary to run the core with a filter that has that many phases. Under some scaling conditions, you may want a large number of phases, but under others you may need only a few, or even only one. Non power-of-two numbers of phases are supported.
coef_wr_addr, h_coeff_set, v_coeff_set
In GPP and pCore interfaces, you may load coefficients. The scaler can store up to max_coef_sets coefficient sets internally. coef_wr_addr selects the location of the set to which you intend to write. The set may subsequently be used by controlling the h_coeff_set and v_coeff_set values.
start_hpa_y, start_hpa_c, start_vpa_y, start_vpa_c
These are the start-phase controls. Internally to the core, the scaler accumulates the 24-bit shrink-factor (hsf, vsf) to determine phase and filter aperture. These four values allow you to preset the fractional part of the accumulations horizontally (hpa) and vertically (vpa) for luma (y) and chroma (c).
When dealing with 4:2:2, luma and chroma are always vertically cosited. Hence the start_vpa_c value is ignored.
Usage of these parameters is important for scaling interlaced formats cleanly. On successive input fields, the start_vpa_y value needs to be modified.
Also, when the desired result is a smooth shrink or zoom over a period of time, you may get better results by changing these parameters for each frame.
The allowed range of values on these parameters is -0.99 to 0.99: (0x100001 to 0x0FFFFF). The default value for these parameters is 0.
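The quoted endpoints (0x100001 and 0x0FFFFF) are consistent with a 21-bit two's-complement value carrying 20 fractional bits. That interpretation is an inference from the range above, not something this guide states outright, so treat the following encoder as a sketch under that assumption. The function name is hypothetical.

```python
FRAC_BITS = 20
WORD_BITS = 21   # assumption: one sign bit plus 20 fractional bits


def encode_start_phase(phase: float) -> int:
    """Encode a fractional start phase as a two's-complement register value."""
    raw = round(phase * (1 << FRAC_BITS))
    limit = (1 << FRAC_BITS) - 1
    if not -limit <= raw <= limit:
        raise ValueError("start phase must lie strictly between -1 and 1")
    return raw & ((1 << WORD_BITS) - 1)
```

Under this encoding, a start phase of 0.5 becomes 0x080000 and a start phase of -0.5 becomes 0x180000.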
control
The control register contains only two active bits. The default value for the control register during continuous operation is “0x3.”
• Bit 0 is a general purpose enable, activated and deactivated on a vblank_in basis. A value of 0 disables the scaler output.
• Bit 1 enables values on the other register inputs to become internally active on a vblank_in basis. A value of 0 prevents the active internal values from being changed.
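The two control bits can be composed as shown in the sketch below. The constant and function names are hypothetical and are not taken from the supplied software driver.

```python
CONTROL_ENABLE = 1 << 0       # bit 0: general purpose enable
CONTROL_REG_UPDATE = 1 << 1   # bit 1: allow register values to become active


def control_value(enable: bool = True, update_registers: bool = True) -> int:
    """Compose the control register value from its two active bits."""
    value = 0
    if enable:
        value |= CONTROL_ENABLE
    if update_registers:
        value |= CONTROL_REG_UPDATE
    return value
```

One possible use, given the bit 1 behavior described above, is to write `control_value(update_registers=False)` before changing several related registers and then restore the default 0x3, so that the new values are captured together.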
Constant (Fixed) Mode
When using this mode, the values are fixed at compile time. The user system does not need to drive any of the parameters. The CORE Generator GUI prompts you to specify:
• coefficient file (.coe)
• hsf
• vsf
• aperture_start_pixel
• aperture_end_pixel
• aperture_start_line
• aperture_end_line
• output_h_size
• output_v_size
• num_h_phases
• num_v_phases
Constant mode has the following restrictions:
• A single coefficient set must be specified using a .coe file; this is the only way to populate the coefficient memory.
• Coefficients may not be written to the core; the coef_wr_addr control is disabled.
• You may not specify h_coeff_set or v_coeff_set; there is only one set of coefficients.
• You may not specify start_hpa_y, start_hpa_c, start_vpa_y, start_vpa_c; they are set internally to zero.
• The control register is always set to “0x00000003,” fixing the scaler in active mode.
General Purpose Processor (GPP) Interface
This interface type exposes all control ports to the user. You are responsible for driving these ports. Xilinx recommends that GPP mode be used only by experienced scaler users.
Figure 6-1 indicates how the EDK pCore is effectively a wrapper around the GPP mode core. This should be considered as an example of how you may choose to wrap the GPP mode core to suit any processor.
In GPP mode, the control values may be changed during runtime – the user input control values become active once per frame after completion of an output frame, using an internal active value capture register.
Coefficient Delivery for GPP Interface
In this mode, you must supply all coefficients to the core. See Chapter 8, “Coefficients,” for all details regarding coefficient loading in GPP mode.
EDK pCore Interface
In contrast to GPP Mode and Constant Mode control interfaces, when you select this control interface option in CORE Generator, no netlist is created. Instead, a database is generated containing the necessary files for use in an EDK project. This database includes:
<component_name>
  -> drivers -> scaler_v3_01_a
       -> data
            scaler_v2_1_0.mdd
            scaler_v2_1_0.tcl
       -> example
            example.c
       -> src
            Makefile
            xscaler.c
            xscaler.h
            xscaler_coefs.c
            xscaler_g.c
            xscaler_hw.h
            xscaler_intr.c
            xscaler_sinit.c
  -> pcores -> axi_scaler_v4_00_a
       -> data
            scaler_v2_1_0.mpd
            scaler_v2_1_0.pao
       -> hdl -> vhdl
            CoefsFIFO.vhd
            coefs.vhd
            CoefRAM.vhd
            CoefMemBlk.vhd
            HeartBeater.vhd
            HPhaseAccumulator.vhd
            HWT.vhd
            ImageXLib_arch.vhd
            ImageXLib_utils.vhd
            MemXLib_arch.vhd
            MemXLib_utils.vhd
            Scaler.vhd
            Scaler_RTI.vhd
            Scaler_wrap0.vhd
            Scaler_wrap0_core.vhd
            ScalerExternalSM.vhd
            syncgen_core.vhd
            user_logic.vhd
            v_scaler_v4_0.vhd
            xscaler.vhd
            YCCheckSum.vhd
For use in an EDK project:
1. Copy the /drivers/scaler_v3_01_a sub-directory from the CORE Generator database to the /drivers directory in your EDK project repository.
2. Copy the /pcores/axi_scaler_v4_00_a sub-directory from the CORE Generator database to the /pcores directory in your EDK project repository.
All VHDL files are encrypted. Do not attempt to modify these files.
Parameter Modification in CORE Generator
When "EDK pCore" is selected in the CORE Generator GUI, all parameters are greyed-out. The user must use the EDK GUI to parameterize the core.
Scaler Software Driver
All files provided by the CORE Generator software under the drivers directory are tested software drivers for the video scaler. They are unencrypted C code that you may adapt for your own environment. The drivers are intended for a memory-mapped system. The register map for the scaler registers is given in Appendix B, “Programmer Guide.”
Coefficient Delivery for EDK pCore Interface
Delivery of coefficients to the hardware core is achieved exactly as described for the GPP Interface (see Chapter 8, “Coefficients,” for full details). However, the pCore wrapper and software driver shield you from the details described there.
Interrupts
There are seven interrupts:
1. intr_output_frame_done – Issued once per complete output frame.
2. intr_reg_update_done – Issued during Vertical blanking when the register values have been transferred to the active registers.
3. intr_input_error – Issued if active_video_in is asserted before the scaler is ready to receive a new line.
4. intr_output_error – Issued if frame period completes before full output frame has been delivered.
5. intr_coef_wr_error – Issued if coefficient is written into coefficient FIFO when the FIFO is not ready.
6. intr_coef_fifo_rdy – High when the coefficient FIFO is ready to receive a coefficient for the current set; stays low once a full set has been written into FIFO; sent high during Vertical blanking.
7. intr_coef_mem_rdbk_rdy – Sent low after CoefMemRdEn (control register bit (3)) is written low. Two frames after CoefMemRdEn is written high, this signal is driven high again.
In GPP mode, all seven interrupts are active.
In Constant mode, only intr_input_error, intr_output_error and intr_output_frame_done are active.
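The availability of interrupts per mode can be summarized as a lookup. The Python sketch below simply restates the two sentences above as data; the function name active_interrupts and the mode strings are hypothetical, and no bit positions are implied (these are defined in Appendix B).

```python
ALL_INTERRUPTS = (
    "intr_output_frame_done",
    "intr_reg_update_done",
    "intr_input_error",
    "intr_output_error",
    "intr_coef_wr_error",
    "intr_coef_fifo_rdy",
    "intr_coef_mem_rdbk_rdy",
)

CONSTANT_MODE_INTERRUPTS = (
    "intr_input_error",
    "intr_output_error",
    "intr_output_frame_done",
)


def active_interrupts(mode):
    """Return the interrupt names that are active for 'gpp' or 'constant' mode."""
    if mode == "gpp":
        return ALL_INTERRUPTS
    if mode == "constant":
        return CONSTANT_MODE_INTERRUPTS
    raise ValueError("unknown mode: " + mode)
```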
Inside the pCore wrapper, an Interrupt Controller (Xilinx Interrupt Control LogiCORE™ (DS516)) collates these interrupts into one interrupt on the AXI4-Lite bus. The microprocessor must then read the interrupt status registers to establish the nature of the interrupt. The interrupt registers are defined in Appendix B, “Programmer Guide.” A generic n-peripheral system is shown in Figure 6-1. It shows the intended usage of interrupts in an EDK-based system. It also shows how the Xilinx Interrupt Controller is used internally to the pCore along with the scaler in GPP mode.
(Figure 6-1 block diagram labels: MicroBlaze, Interrupt Controller, Peripheral, Peripheral n, Interrupts, AXI Lite, Video Scaler pCore, Interrupt Controller, Video Scaler GPP.)
Figure 6-1: Typical EDK-based System Showing Interrupt Structure
Chapter 7
Scaler Aperture
This section explains how to define the scaler aperture using the appropriate dynamic control registers. The aperture is defined relative to the input timing signals.
Input Aperture Definition
It is vital to understand how to specify the scaler aperture properly. The scaler aperture is defined as the input data rectangle used to create the output data rectangle. The input values aperture_start_line, aperture_end_line, aperture_start_pixel and aperture_end_pixel need to be driven correctly.
To scale from a rectangle of size 1280x720, they should be set as follows:
aperture_start_pixel 0
aperture_end_pixel 1279
aperture_start_line 0
aperture_end_line 719
It is also important to understand how “line 0” and “pixel 0” are defined to ensure that these values are entered correctly. Line 0 is defined as the first active line following a rising edge in active_video_in. An internal line counter is decoded to signal internally that the current line is indeed line 0. This line counter is reset on a falling edge of vblank_in. It increments on a rising edge of hblank_in.
One situation that needs to be avoided is the counter effectively starting at 1 instead of 0. This will cause no video output. The correct relationship between input hblank_in and vblank_in to avoid this situation is shown in Figure 7-1. The falling edge of vblank_in occurs while hblank_in is still high.
Figure 7-1: Hblank_in at Falling Edge of VBlank_in
Pixel 0 is defined as the first active pixel after the rising edge of active_video_in. This is indicated in Figure 7-2. The value 128 is used as the default value in video_data_in during blanking. In this example, the first pixel in the horizontal scaler aperture is the first active pixel in the input line.
Figure 7-2: Active_video_in in Relation to First Active Sample
Cropping
When using “Live” mode, you may choose to select a small portion of the input image. To achieve this, set the aperture_start_line, aperture_end_line, aperture_start_pixel and aperture_end_pixel according to your requirements.
For example, from an input which is 720P, you may want to scale from a rectangle of size 80x60, starting at (pixel, line) = (20, 32). Set the following:
aperture_start_pixel 20
aperture_end_pixel 99
aperture_start_line 32
aperture_end_line 91
Figure 7-3 shows the opening of an internal processing window signal (t_verticalwindow) with the preceding cropping settings. A similar operation occurs in the horizontal domain. A useful developer note is that if the largest input rectangle is cropped from the input, then this size may be used in deciding the max_pixels_in_per_line parameter. This may save block RAM usage in some cases.
Figure 7-3: Cropping from the Input Image
When using “Memory” mode, cropping must be achieved by selecting the appropriate rectangular area from memory. aperture_start_pixel and aperture_start_line must be set to zero.
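Both aperture examples in this chapter follow the same arithmetic: each end value is the start value plus the rectangle size, minus one. The following Python sketch makes that explicit; the helper name aperture and its argument names are hypothetical and are not part of the scaler driver.

```python
def aperture(start_pixel, start_line, width, height):
    """Return the four aperture register values for a width x height rectangle."""
    if width < 1 or height < 1:
        raise ValueError("width and height must be at least 1")
    return {
        "aperture_start_pixel": start_pixel,
        "aperture_end_pixel": start_pixel + width - 1,
        "aperture_start_line": start_line,
        "aperture_end_line": start_line + height - 1,
    }


full_720p = aperture(0, 0, 1280, 720)   # full-frame example
crop = aperture(20, 32, 80, 60)         # cropping example
```

The end values are inclusive, which is why a 1280-pixel-wide aperture starting at pixel 0 ends at pixel 1279 rather than 1280.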
Chapter 8
Coefficients
This section describes the coefficients used by both the Vertical and Horizontal filter portions of the scaler, in terms of number, range, formatting and download procedures.
Coefficient Table
One single size-configurable, block RAM-based, Dual Port RAM block stores all H and V coefficients combined, and holds different coefficients for luma and chroma as desired.
This coefficient store may be populated with active coefficients as follows:
•Using the Coefficient Interface (see Coefficient Interface).
•By preloading using a .coe file.
Coefficients that are preloaded using a .coe file remain in this memory until they are overwritten with coefficients loaded by the Coefficient Interface. Because coefficients may not be written to the core in Constant mode, preloaded coefficients cannot be overwritten in that mode. Preloading with coefficients allows the user an easy way of initializing the scaler from power-up.
When using pCore or GPP interfaces, you may want more than one coefficient set from which to choose. For example, it may be necessary to select different filter responses for different shrink factors. This is often true when down-scaling by different factors to eliminate aliasing artifacts. The user may load (or preload using a .coe file) multiple coefficient sets.
The number of phases for each set may also vary, dependent upon the nature of the conversion, and how you have elected to generate and partition the coefficients. The maximum number of phases per set defines the size of the memory required to store them, and this may have an impact on resource usage. Careful selection of the parameters max_phases and max_coef_sets is paramount if optimal resource usage is important.
Each coefficient set is allocated an amount of space equal to 2^max_phases. Max_phases is a fixed parameter that is defined at compile time. However, it is not necessary for every set to have that many phases. The number of phases for each set may be different, provided you indicate how many phases there are in the current set being used, by setting the input register values num_h_phases and num_v_phases accordingly. Without setting these correctly, invalid coefficients will be selected by the phase accumulators.
Horizontal filter coefficients are stored in the lower half of the coefficient memory. Vertical filter coefficients are stored in the upper half of the coefficient memory. For each of the H and V sectors, luma coefficients occupy the lower half and chroma coefficients occupy the upper half. This method simplifies internal addressing. When the chroma format is set to 4:4:4, one set of coefficients will be shared between all three channels (i.e., R, G, and B will be scaled identically).
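The half-and-half partitioning of the coefficient memory can be expressed as a small offset calculation. This Python sketch is illustrative only: the function name bank_offset and the region_size argument are hypothetical, it assumes separate H/V and Y/C coefficients, and it models relative offsets within one region rather than the core's actual internal addresses.

```python
def bank_offset(region_size, vertical, chroma):
    """Relative start offset of a coefficient bank inside a region of region_size entries."""
    if region_size % 4:
        raise ValueError("region_size must be divisible by 4")
    half = region_size // 2       # H occupies the lower half, V the upper half
    quarter = region_size // 4    # within each half: luma lower, chroma upper
    return (half if vertical else 0) + (quarter if chroma else 0)


offsets = {
    "HY": bank_offset(1024, vertical=False, chroma=False),
    "HC": bank_offset(1024, vertical=False, chroma=True),
    "VY": bank_offset(1024, vertical=True, chroma=False),
    "VC": bank_offset(1024, vertical=True, chroma=True),
}
```

With this layout the bank is selected by two bits (H or V, then Y or C), which is one way to read the statement that the method simplifies internal addressing.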
(Figure 8-1 field labels: bits 31:16, Valid - Coefficient n+1; bits 15:0, Valid - Coefficient n; 16-bit Coefficients.)
If the user specifies in the CORE Generator or EDK GUI that the Luma and Chroma filters share common coefficients, then there is no coefficient memory space available for chroma coefficients. In this case, the user must not load chroma coefficients using the Coefficient interface, and must not specify chroma coefficients in the .coe file.
Similarly, if the user has specified in the CORE Generator or EDK GUI that the Horizontal and Vertical filters share common coefficients, then there is no coefficient memory space available for Vertical coefficients. In this case, the user must not load Vertical coefficients using the Coefficient interface, and must not specify Vertical coefficients in the .coe file.
Note: This option is only available if the number of horizontal taps is equal to the number of vertical taps.
Coefficient Interface
The scaler uses only one set of coefficients per frame period. To change to a different set of stored coefficients for the next frame, use the h_coeff_set and v_coeff_set dynamic register inputs.
You may load new coefficients into a different location in the coefficient store during some frame period before they are required. You may load a maximum of one coefficient set (including all of HY, HC, VY, VC components) per frame period. Subsequently, this coefficient set may be selected for use by controlling h_coeff_set and v_coeff_set.
Filter Coefficients may be loaded into the coefficient memory using the coefficient memory interface. This comprises:
coef_data_in(31:0) 32-bit coefficient input bus
coef_wr_en Coefficient write-enable
coef_set_wr_addr(3:0) Coefficient set write address
The 32-bit input word always holds two coefficients. The scaler supports 16-bit coefficient bit-widths. The word format is shown in Figure 8-1.
Figure 8-1: Coefficient Write-Format on coef_data_in(31:0)
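The word format can be illustrated with a short packing routine. The Python sketch below is a minimal example under stated assumptions: the function name pack_pair is hypothetical, and coefficients are assumed to be signed 16-bit values held in two's-complement form, as the hexadecimal values in the examples later in this chapter suggest.

```python
def pack_pair(coef_n, coef_n_plus_1=0):
    """Pack two signed 16-bit coefficients into one 32-bit coef_data_in word."""
    for c in (coef_n, coef_n_plus_1):
        if not -0x8000 <= c <= 0x7FFF:
            raise ValueError("coefficient out of 16-bit signed range")
    low = coef_n & 0xFFFF            # bits 15:0 hold coefficient n
    high = coef_n_plus_1 & 0xFFFF    # bits 31:16 hold coefficient n+1
    return (high << 16) | low


word = pack_pair(-1041, 1420)   # -1041 is 0xFBEF and 1420 is 0x058C as 16-bit values
```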
An address-multiplexer is used to support the coefficient write interface as shown in Figure 8-2. The coefficient write-address is multiplexed with the coefficient read-address for the vertical filter to create the address for Port A on the dual-port coefficient RAM. Consequently, coefficients must be loaded into the coefficient stores when no active video scaling is occurring. It is only possible, therefore, to load the coefficients during the vertical blanking period. Since this would be an impossible burden on a processor, an external block RAM FIFO has been provided to which you load your coefficients during one frame period, as shown in Figure 8-2. Following a latency period after the positive transition of vblank_in, any new coefficient set is streamed into the internal coefficient store for use by the filter in the next frame.
(Figure 8-2 block diagram labels: vblank_in, Coefficient Load Control SM, Coefficient Load FIFO, Coefficient Store, coef_data_in(31:0), coef_set_wr_addr(3:0), coef_wr_en, Coefficient Write Address, Coefficients to filters, Video Scaler, Port A, Operational Read Address (V Filter), Operational Read Address (H Filter), Port B.)
Figure 8-2: Coefficient Loading Mechanism, Including External FIFO
A waveform indicating the coefficient loading process is shown in Figure 8-3.
The coefficient memory interface is an asynchronous interface. A high level on the coef_wr_en signal is used to capture the coefficients delivered on coef_data_in as shown in Figure 8-3. An internal state-machine detects the 3rd ‘clk’ period when coef_wr_en is stable and high. At this point, the data is registered into the FIFO. Xilinx recommends that the high coef_wr_en pulse be no less than the equivalent of 6 ‘clk’ periods in duration. It is required that it also be low for a period no less than 6 ‘clk’ periods between write operations.
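The pulse-width guidance above can be checked in a testbench or software model before driving real hardware. The Python sketch below is one possible check, not part of the core deliverables: the function name check_wr_en is hypothetical, and it assumes coef_wr_en has been sampled once per 'clk' period into a list of 0/1 values.

```python
def check_wr_en(samples, min_high=6, min_low=6):
    """Check a per-clk list of coef_wr_en samples (0/1) against the pulse-width guidance.

    Returns True when every high pulse lasts at least min_high clks and every
    low gap between two high pulses lasts at least min_low clks.
    """
    runs = []  # list of [level, length] run-length pairs
    for s in samples:
        if runs and runs[-1][0] == s:
            runs[-1][1] += 1
        else:
            runs.append([s, 1])
    for i, (level, length) in enumerate(runs):
        if level == 1 and length < min_high:
            return False
        # Only gaps that sit between two pulses are constrained.
        if level == 0 and 0 < i < len(runs) - 1 and length < min_low:
            return False
    return True


good = [0] * 2 + [1] * 6 + [0] * 6 + [1] * 6 + [0] * 2
bad = [0] * 2 + [1] * 6 + [0] * 3 + [1] * 6 + [0] * 2
```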
The guidelines are as follows:
•The address coef_set_addr for all coefficients in one set must be written via the normal register interface.
•coef_data_in delivers two coefficients per 32-bit word. The lower word (bits 15:0) always holds the coefficient that will be applied to the latest tap (that is, spatially speaking, the right-most or lowest). The word format is shown in Figure 8-1.
•All coefficients for one phase must be loaded sequentially via coef_data_in, starting with coef 0 and coef 1 [coef 0 is applied to the newest (right-most or lowest) input sample in the current filter aperture]. See Figure 8-3. For an odd number of coefficients, the final upper 16 bits is ignored.
•All phases must be loaded sequentially starting at phase 0, and ending at phase (max_phases-1). This must always be observed, even if a particular set of coefficients has fewer active phases than max_phases.
•For RGB/4:4:4, when not sharing coefficients across H and V operations, for each dimension, one bank of coefficients must be loaded into the FIFO before they can be streamed into the coefficient memory. When sharing coefficients across H and V operations, it is only necessary to write coefficients for the H operation. This process is permitted to take as much time as desired by the user system. This means that worst case, for a 12H-tap x 12V-tap 64-phase filter, you need to write 6 times per phase. If the user has specified separate H and V coefficients, this is a total of 768 write operations per set.
•For YC4:2:2 or YC4:2:0, when not sharing coefficients across H and V operations or across Y and C operations, one bank of luma (Y) and chroma (C) coefficients must be loaded into the FIFO for each dimension before they can be streamed into the coefficient memory. When sharing coefficients across H and V operations, it is only necessary to write coefficients for the H operation. Also, when sharing coefficients across Y and C operations, it is only necessary to write coefficients for the Y operation. This process is permitted to take as much time as desired by the user system. This means that worst case, for a 12H-tap x 12V-tap 64-phase filter, you need to write 6 times per phase. If the user has specified separate H and V coefficients and separate Y and C coefficients, this is a total of 1536 write operations per set.
•Writing a new address to coef_set_addr resets the internal state-machine that oversees the coefficient loading procedure. An error condition will be asserted if the loading procedure comes up less than 2 x max_phases*Max(num_h_taps, num_v_taps) when coef_set_addr is updated.
(Figure 8-3 waveform labels: coef_data_in carrying Coefs 0, 1; Coefs 2, 3; Coefs 4, 5; Coefs 6, 7; coef_wr_en.)
Figure 8-3: Coefficient Loading Procedure – One Phase (8-tap filter shown)
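The write-operation totals quoted in the guidelines follow from a simple product. The Python sketch below reproduces that arithmetic; the function name writes_per_set and its keyword arguments are hypothetical, and it assumes max_phases is already an integer power of 2.

```python
import math


def writes_per_set(max_phases, num_h_taps, num_v_taps,
                   separate_hv=True, separate_yc=False):
    """Number of 32-bit coef_data_in writes needed to load one coefficient set."""
    taps = max(num_h_taps, num_v_taps)
    words_per_phase = math.ceil(taps / 2)   # two 16-bit coefficients per word
    banks = (2 if separate_hv else 1) * (2 if separate_yc else 1)
    return max_phases * words_per_phase * banks


rgb_total = writes_per_set(64, 12, 12, separate_hv=True, separate_yc=False)
yc_total = writes_per_set(64, 12, 12, separate_hv=True, separate_yc=True)
```

For the 12-tap, 64-phase case this gives 6 writes per phase and 384 writes per bank, so two banks (H and V) require 768 writes and four banks (HY, HC, VY, VC) require 1536.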
Examples of Coefficient Set Generation and Loading
As mentioned, when data is fed in raster format, coefficient 0 is applied to the lowest tap in the aperture for the Vertical filter or to the right-most tap in the Horizontal filter. Following are a few examples of how to generate some coefficients and translate them into the correct format for downloading to the scaler.
Example 1: Num_h_taps = num_v_taps = 8; max_phases = 4
Table 8-1 shows a set of coefficients drawn from a sinc function.
Table 8-1: Example 1 Decimal Coefficients
Phase Tap 0 Tap 1 Tap 2 Tap 3 Tap 4 Tap 5 Tap 6 Tap 7
0 0.0000 0.0000 0.0000 0.0000 1.0000 0.0000 0.0000 0.0000
1 -0.0600 0.0818 -0.1286 0.3001 0.9003 -0.1801 0.1000 -0.0693
2 -0.0909 0.1273 -0.2122 0.6366 0.6366 -0.2122 0.1273 -0.0909
3 -0.0693 0.1000 -0.1801 0.9003 0.3001 -0.1286 0.0818 -0.0600
In this example, a 32-point 1-D sinc function has been sub-sampled to generate four phases of eight coefficients each. Sub-sampling in this way usually results in phases whose component coefficients rarely sum to 1.0 – this will cause image distortion. The example MATLAB® m-code that follows shows how to normalize the phases to unity and how to express them as the 16-bit integers required by the hardware. For this process, coef_width = 16. Note that this is only pseudo code. Generation of actual coefficients is beyond the scope of this document. Refer to Answer Record 35262 and Filter Coefficient Calculations for more information on coefficient generation for the video scaler.
% Parameters for Example 1
num_taps = 8;
num_phases = 4;
coef_width = 16;
% Subsample a Sinc function, and create 2D array
% (sinc is provided by the Signal Processing Toolbox)
x = -(num_taps/2):1/num_phases:((num_taps/2)-1/num_phases);
coefs_2d = reshape(sinc(x), num_phases, num_taps)
format long
% Normalize each phase individually
for i=1:num_phases
    sum_phase = sum(coefs_2d(i,:));
    for j=1:num_taps
        norm_phases(i, j) = coefs_2d(i, j)/sum_phase;
    end
    % Check - Normalized values should sum to 1 in each phase
    norm_sum_phase = sum(norm_phases(i,:))
end
% Translate real to integer values with precision defined by coef_width
int_phases = round(((2^(coef_width-2))*norm_phases))
This generates the 2D array of integer values shown (in hexadecimal form) in Table 8-2.
Table 8-2: Example 1 Normalized Integer Coefficients
Phase Tap 0 Tap 1 Tap 2 Tap 3 Tap 4 Tap 5 Tap 6 Tap 7
0 0x0000 0x0000 0x0000 0x0000 0x4000 0x0000 0x0000 0x0000
1 0xFBEF 0x058C 0xF749 0x1457 0x3D04 0xF3CC 0x06C8 0xFB4E
2 0xF9AF 0x08D8 0xF143 0x2C36 0x2C36 0xF143 0x08D8 0xF9AF
3 0xFB4E 0x06C8 0xF3CC 0x3D04 0x1457 0xF749 0x058C 0xFBEF
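For readers without MATLAB, the same normalization and quantization steps can be sketched in Python using only the standard library. This is an illustrative equivalent of the m-code above, not a Xilinx-supplied tool: the function names are hypothetical, and Python's round differs from MATLAB's only for exact .5 values, which should not arise for the phases checked here.

```python
import math


def sinc(x):
    """Normalized sinc, sin(pi*x)/(pi*x), matching the MATLAB sinc definition."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def integer_phases(num_taps=8, num_phases=4, coef_width=16):
    """Return num_phases rows of num_taps coefficients as two's-complement words."""
    scale = 2 ** (coef_width - 2)
    mask = (1 << coef_width) - 1
    rows = []
    for p in range(num_phases):
        # Tap t of phase p samples the sinc at -(num_taps/2) + t + p/num_phases.
        raw = [sinc(-(num_taps / 2) + t + p / num_phases) for t in range(num_taps)]
        total = sum(raw)                      # normalize each phase to unity gain
        rows.append([round(scale * c / total) & mask for c in raw])
    return rows


table = integer_phases()
```

Printing each row with format(value, "04X") gives the hexadecimal form used in Table 8-2.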
It remains to format these values for the scaler.
The 16-bit coefficients must be coupled into 32-bit values for delivery to the HW. The resulting coefficient file for download is shown in Table 8-3.
The coefficients must be downloaded in the following order:
1. Horizontal Luma (always required)
2. Horizontal Chroma (required if not sharing Y and C coefficients)
3. Vertical Luma (required if not sharing H and V coefficients)
4. Vertical Chroma (required if not sharing H and V coefficients, and also not sharing Y and C coefficients)
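The pairing and ordering rules above can be combined into a short routine that builds the complete download sequence. The Python sketch below is a minimal illustration using the Example 1 integer coefficients; the function names pack_bank and download_sequence are hypothetical, and a bank passed as None stands for a bank that is shared and therefore not downloaded.

```python
def pack_bank(phases):
    """Pack one bank (a list of phases, each a list of 16-bit words) into 32-bit words."""
    words = []
    for taps in phases:
        padded = taps + [0] * (len(taps) % 2)    # odd tap count: upper half unused
        for i in range(0, len(padded), 2):
            words.append((padded[i + 1] << 16) | padded[i])
    return words


def download_sequence(h_luma, h_chroma=None, v_luma=None, v_chroma=None):
    """Concatenate the banks in the required order, skipping shared (None) banks."""
    sequence = []
    for bank in (h_luma, h_chroma, v_luma, v_chroma):
        if bank is not None:
            sequence.extend(pack_bank(bank))
    return sequence


example1 = [
    [0x0000, 0x0000, 0x0000, 0x0000, 0x4000, 0x0000, 0x0000, 0x0000],
    [0xFBEF, 0x058C, 0xF749, 0x1457, 0x3D04, 0xF3CC, 0x06C8, 0xFB4E],
    [0xF9AF, 0x08D8, 0xF143, 0x2C36, 0x2C36, 0xF143, 0x08D8, 0xF9AF],
    [0xFB4E, 0x06C8, 0xF3CC, 0x3D04, 0x1457, 0xF749, 0x058C, 0xFBEF],
]
words = download_sequence(example1, example1, example1, example1)
```

Here the same four-phase set is used for all four banks, so the sequence contains 64 words, with load sequence number k corresponding to words[k - 1].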
Table 8-3: Example 1 Coefficient Set Download Format
In the Calculation columns, Ph = Phase # and T = Tap #.
Horizontal Filter Coefficients for Luma (left columns) and Horizontal Filter Coefficients for Chroma (right columns)
Load Sequence Number  Value  Calculation  Load Sequence Number  Value  Calculation
Phase 0
1  0x00000000  (Ph0 T1 << 16) | Ph0 T0  17  0x00000000  (Ph0 T1 << 16) | Ph0 T0
2  0x00000000  (Ph0 T3 << 16) | Ph0 T2  18  0x00000000  (Ph0 T3 << 16) | Ph0 T2
3  0x00004000  (Ph0 T5 << 16) | Ph0 T4  19  0x00004000  (Ph0 T5 << 16) | Ph0 T4
4  0x00000000  (Ph0 T7 << 16) | Ph0 T6  20  0x00000000  (Ph0 T7 << 16) | Ph0 T6
Phase 1
5  0x058CFBEF  (Ph1 T1 << 16) | Ph1 T0  21  0x058CFBEF  (Ph1 T1 << 16) | Ph1 T0
6  0x1457F749  (Ph1 T3 << 16) | Ph1 T2  22  0x1457F749  (Ph1 T3 << 16) | Ph1 T2
7  0xF3CC3D04  (Ph1 T5 << 16) | Ph1 T4  23  0xF3CC3D04  (Ph1 T5 << 16) | Ph1 T4
8  0xFB4E06C8  (Ph1 T7 << 16) | Ph1 T6  24  0xFB4E06C8  (Ph1 T7 << 16) | Ph1 T6
Phase 2
9  0x08D8F9AF  (Ph2 T1 << 16) | Ph2 T0  25  0x08D8F9AF  (Ph2 T1 << 16) | Ph2 T0
10  0x2C36F143  (Ph2 T3 << 16) | Ph2 T2  26  0x2C36F143  (Ph2 T3 << 16) | Ph2 T2
11  0xF1432C36  (Ph2 T5 << 16) | Ph2 T4  27  0xF1432C36  (Ph2 T5 << 16) | Ph2 T4
12  0xF9AF08D8  (Ph2 T7 << 16) | Ph2 T6  28  0xF9AF08D8  (Ph2 T7 << 16) | Ph2 T6
Phase 3
13  0x06C8FB4E  (Ph3 T1 << 16) | Ph3 T0  29  0x06C8FB4E  (Ph3 T1 << 16) | Ph3 T0
14  0x3D04F3CC  (Ph3 T3 << 16) | Ph3 T2  30  0x3D04F3CC  (Ph3 T3 << 16) | Ph3 T2
15  0xF7491457  (Ph3 T5 << 16) | Ph3 T4  31  0xF7491457  (Ph3 T5 << 16) | Ph3 T4
16  0xFBEF058C  (Ph3 T7 << 16) | Ph3 T6  32  0xFBEF058C  (Ph3 T7 << 16) | Ph3 T6
Vertical Filter Coefficients for Luma (left columns) and Vertical Filter Coefficients for Chroma (right columns)
Load Sequence Number  Value  Calculation  Load Sequence Number  Value  Calculation
Phase 0
33  0x00000000  (Ph0 T1 << 16) | Ph0 T0  49  0x00000000  (Ph0 T1 << 16) | Ph0 T0
34  0x00000000  (Ph0 T3 << 16) | Ph0 T2  50  0x00000000  (Ph0 T3 << 16) | Ph0 T2
35  0x00004000  (Ph0 T5 << 16) | Ph0 T4  51  0x00004000  (Ph0 T5 << 16) | Ph0 T4
36  0x00000000  (Ph0 T7 << 16) | Ph0 T6  52  0x00000000  (Ph0 T7 << 16) | Ph0 T6
Phase 1
37  0x058CFBEF  (Ph1 T1 << 16) | Ph1 T0  53  0x058CFBEF  (Ph1 T1 << 16) | Ph1 T0
38  0x1457F749  (Ph1 T3 << 16) | Ph1 T2  54  0x1457F749  (Ph1 T3 << 16) | Ph1 T2
39  0xF3CC3D04  (Ph1 T5 << 16) | Ph1 T4  55  0xF3CC3D04  (Ph1 T5 << 16) | Ph1 T4
40  0xFB4E06C8  (Ph1 T7 << 16) | Ph1 T6  56  0xFB4E06C8  (Ph1 T7 << 16) | Ph1 T6
Phase 2
41  0x08D8F9AF  (Ph2 T1 << 16) | Ph2 T0  57  0x08D8F9AF  (Ph2 T1 << 16) | Ph2 T0
42  0x2C36F143  (Ph2 T3 << 16) | Ph2 T2  58  0x2C36F143  (Ph2 T3 << 16) | Ph2 T2
43  0xF1432C36  (Ph2 T5 << 16) | Ph2 T4  59  0xF1432C36  (Ph2 T5 << 16) | Ph2 T4
44  0xF9AF08D8  (Ph2 T7 << 16) | Ph2 T6  60  0xF9AF08D8  (Ph2 T7 << 16) | Ph2 T6
Phase 3
45  0x06C8FB4E  (Ph3 T1 << 16) | Ph3 T0  61  0x06C8FB4E  (Ph3 T1 << 16) | Ph3 T0
46  0x3D04F3CC  (Ph3 T3 << 16) | Ph3 T2  62  0x3D04F3CC  (Ph3 T3 << 16) | Ph3 T2
47  0xF7491457  (Ph3 T5 << 16) | Ph3 T4  63  0xF7491457  (Ph3 T5 << 16) | Ph3 T4
48  0xFBEF058C  (Ph3 T7 << 16) | Ph3 T6  64  0xFBEF058C  (Ph3 T7 << 16) | Ph3 T6
Example 2: Num_h_taps = num_v_taps = 8; max_phases = 5, 6, 7 or 8; num_h_phases = num_v_phases = 4
If the max_phases parameter is greater than the number of phases in the set being loaded, load default coefficients into the unused locations. Example 2 is an extended version of Example 1 to show this. Table 8-4 shows the same 4-phase coefficient set loaded into the scaler when num_h_phases = 4, num_v_phases = 4 and max_phases is greater than 4 (max_phases = 5, 6, 7 or 8, num_h_taps = 8, num_v_taps = 8).
Note that:
1. If max_phases is not equal to an integer power of 2, then the number of phases to be loaded is rounded up to the next integer power of 2. See Example 2 (Table 8-4). Unused phases should be loaded with zeros.
2. The number of values loaded per phase is not rounded to the nearest power of 2. See Example 3 (Table 8-7).
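The rounding rule in note 1 can be sketched as follows. This Python example is illustrative only; the function names padded_phase_count and pad_phases are hypothetical, and the all-zero rows stand for the dummy coefficients loaded into unused phases.

```python
def padded_phase_count(max_phases):
    """Round max_phases up to the next integer power of 2."""
    if max_phases < 1:
        raise ValueError("max_phases must be at least 1")
    return 1 << (max_phases - 1).bit_length()


def pad_phases(phases, max_phases):
    """Append all-zero phases so the bank holds padded_phase_count(max_phases) phases."""
    num_taps = len(phases[0])
    missing = padded_phase_count(max_phases) - len(phases)
    if missing < 0:
        raise ValueError("more phases supplied than the padded count allows")
    return phases + [[0] * num_taps for _ in range(missing)]


four_phases = [[0] * 8 for _ in range(4)]   # placeholder for a 4-phase, 8-tap set
bank = pad_phases(four_phases, max_phases=5)
words_per_bank = len(bank) * 8 // 2         # 8 taps give 4 words per phase
```

With max_phases set to 5, 6, 7 or 8, each bank holds 8 phases and therefore 32 download words.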
Table 8-4: Example 2 Coefficient Set Download Format
In the Calculation columns, Ph = Phase # and T = Tap #.
Horizontal Filter Coefficients for Luma (left columns) and Horizontal Filter Coefficients for Chroma (right columns)
Load Sequence Number  Value  Calculation  Load Sequence Number  Value  Calculation
Phase 0
1  0x00000000  (Ph0 T1 << 16) | Ph0 T0  33  0x00000000  (Ph0 T1 << 16) | Ph0 T0
2  0x00000000  (Ph0 T3 << 16) | Ph0 T2  34  0x00000000  (Ph0 T3 << 16) | Ph0 T2
3  0x00004000  (Ph0 T5 << 16) | Ph0 T4  35  0x00004000  (Ph0 T5 << 16) | Ph0 T4
4  0x00000000  (Ph0 T7 << 16) | Ph0 T6  36  0x00000000  (Ph0 T7 << 16) | Ph0 T6
Phase 1
5  0x058CFBEF  (Ph1 T1 << 16) | Ph1 T0  37  0x058CFBEF  (Ph1 T1 << 16) | Ph1 T0
6  0x1457F749  (Ph1 T3 << 16) | Ph1 T2  38  0x1457F749  (Ph1 T3 << 16) | Ph1 T2
7  0xF3CC3D04  (Ph1 T5 << 16) | Ph1 T4  39  0xF3CC3D04  (Ph1 T5 << 16) | Ph1 T4
8  0xFB4E06C8  (Ph1 T7 << 16) | Ph1 T6  40  0xFB4E06C8  (Ph1 T7 << 16) | Ph1 T6
Phase 2
9  0x08D8F9AF  (Ph2 T1 << 16) | Ph2 T0  41  0x08D8F9AF  (Ph2 T1 << 16) | Ph2 T0
10  0x2C36F143  (Ph2 T3 << 16) | Ph2 T2  42  0x2C36F143  (Ph2 T3 << 16) | Ph2 T2
11  0xF1432C36  (Ph2 T5 << 16) | Ph2 T4  43  0xF1432C36  (Ph2 T5 << 16) | Ph2 T4
12  0xF9AF08D8  (Ph2 T7 << 16) | Ph2 T6  44  0xF9AF08D8  (Ph2 T7 << 16) | Ph2 T6
Phase 3
13  0x06C8FB4E  (Ph3 T1 << 16) | Ph3 T0  45  0x06C8FB4E  (Ph3 T1 << 16) | Ph3 T0
14  0x3D04F3CC  (Ph3 T3 << 16) | Ph3 T2  46  0x3D04F3CC  (Ph3 T3 << 16) | Ph3 T2
15  0xF7491457  (Ph3 T5 << 16) | Ph3 T4  47  0xF7491457  (Ph3 T5 << 16) | Ph3 T4
16  0xFBEF058C  (Ph3 T7 << 16) | Ph3 T6  48  0xFBEF058C  (Ph3 T7 << 16) | Ph3 T6
Phase 4
17  0x00000000  N/A Dummy coef  49  0x00000000  N/A Dummy coef
18  0x00000000  N/A Dummy coef  50  0x00000000  N/A Dummy coef
19  0x00000000  N/A Dummy coef  51  0x00000000  N/A Dummy coef
20  0x00000000  N/A Dummy coef  52  0x00000000  N/A Dummy coef
Phase 5
21  0x00000000  N/A Dummy coef  53  0x00000000  N/A Dummy coef
22  0x00000000  N/A Dummy coef  54  0x00000000  N/A Dummy coef
23  0x00000000  N/A Dummy coef  55  0x00000000  N/A Dummy coef
24  0x00000000  N/A Dummy coef  56  0x00000000  N/A Dummy coef
Phase 6
25  0x00000000  N/A Dummy coef  57  0x00000000  N/A Dummy coef
26  0x00000000  N/A Dummy coef  58  0x00000000  N/A Dummy coef
27  0x00000000  N/A Dummy coef  59  0x00000000  N/A Dummy coef
28  0x00000000  N/A Dummy coef  60  0x00000000  N/A Dummy coef
Phase 7
29  0x00000000  N/A Dummy coef  61  0x00000000  N/A Dummy coef
30  0x00000000  N/A Dummy coef  62  0x00000000  N/A Dummy coef
31  0x00000000  N/A Dummy coef  63  0x00000000  N/A Dummy coef
32  0x00000000  N/A Dummy coef  64  0x00000000  N/A Dummy coef
Vertical Filter Coefficients for Luma (left columns) and Vertical Filter Coefficients for Chroma (right columns)
Addr  Value  Calculation  Addr  Value  Calculation
Phase 0
65  0x00000000  (Ph0 T1 << 16) | Ph0 T0  97  0x00000000  (Ph0 T1 << 16) | Ph0 T0
66  0x00000000  (Ph0 T3 << 16) | Ph0 T2  98  0x00000000  (Ph0 T3 << 16) | Ph0 T2
67  0x00004000  (Ph0 T5 << 16) | Ph0 T4  99  0x00004000  (Ph0 T5 << 16) | Ph0 T4
68  0x00000000  (Ph0 T7 << 16) | Ph0 T6  100  0x00000000  (Ph0 T7 << 16) | Ph0 T6
Phase 1
69  0x058CFBEF  (Ph1 T1 << 16) | Ph1 T0  101  0x058CFBEF  (Ph1 T1 << 16) | Ph1 T0
70  0x1457F749  (Ph1 T3 << 16) | Ph1 T2  102  0x1457F749  (Ph1 T3 << 16) | Ph1 T2
71  0xF3CC3D04  (Ph1 T5 << 16) | Ph1 T4  103  0xF3CC3D04  (Ph1 T5 << 16) | Ph1 T4
72  0xFB4E06C8  (Ph1 T7 << 16) | Ph1 T6  104  0xFB4E06C8  (Ph1 T7 << 16) | Ph1 T6
Phase 2
73  0x08D8F9AF  (Ph2 T1 << 16) | Ph2 T0  105  0x08D8F9AF  (Ph2 T1 << 16) | Ph2 T0
74  0x2C36F143  (Ph2 T3 << 16) | Ph2 T2  106  0x2C36F143  (Ph2 T3 << 16) | Ph2 T2
75  0xF1432C36  (Ph2 T5 << 16) | Ph2 T4  107  0xF1432C36  (Ph2 T5 << 16) | Ph2 T4
76  0xF9AF08D8  (Ph2 T7 << 16) | Ph2 T6  108  0xF9AF08D8  (Ph2 T7 << 16) | Ph2 T6
Phase 3
77  0x06C8FB4E  (Ph3 T1 << 16) | Ph3 T0  109  0x06C8FB4E  (Ph3 T1 << 16) | Ph3 T0
78  0x3D04F3CC  (Ph3 T3 << 16) | Ph3 T2  110  0x3D04F3CC  (Ph3 T3 << 16) | Ph3 T2
79  0xF7491457  (Ph3 T5 << 16) | Ph3 T4  111  0xF7491457  (Ph3 T5 << 16) | Ph3 T4
80  0xFBEF058C  (Ph3 T7 << 16) | Ph3 T6  112  0xFBEF058C  (Ph3 T7 << 16) | Ph3 T6
Phase 4
81  0x00000000  N/A Dummy coef  113  0x00000000  N/A Dummy coef
82  0x00000000  N/A Dummy coef  114  0x00000000  N/A Dummy coef
83  0x00000000  N/A Dummy coef  115  0x00000000  N/A Dummy coef
84  0x00000000  N/A Dummy coef  116  0x00000000  N/A Dummy coef
Phase 5
85  0x00000000  N/A Dummy coef  117  0x00000000  N/A Dummy coef
86  0x00000000  N/A Dummy coef  118  0x00000000  N/A Dummy coef
87  0x00000000  N/A Dummy coef  119  0x00000000  N/A Dummy coef
88  0x00000000  N/A Dummy coef  120  0x00000000  N/A Dummy coef
Phase 6
89  0x00000000  N/A Dummy coef  121  0x00000000  N/A Dummy coef
90  0x00000000  N/A Dummy coef  122  0x00000000  N/A Dummy coef
91  0x00000000  N/A Dummy coef  123  0x00000000  N/A Dummy coef
92  0x00000000  N/A Dummy coef  124  0x00000000  N/A Dummy coef
Phase 7
93  0x00000000  N/A Dummy coef  125  0x00000000  N/A Dummy coef
94  0x00000000  N/A Dummy coef  126  0x00000000  N/A Dummy coef
95  0x00000000  N/A Dummy coef  127  0x00000000  N/A Dummy coef
96  0x00000000  N/A Dummy coef  128  0x00000000  N/A Dummy coef
Example 3: Num_h_taps = 9; num_v_taps = 7; max_phases = num_h_phases = num_v_phases = 4
Now consider the case where the number of taps in the Horizontal dimension is different to that in the Vertical dimension. For this case, when loading the coefficients for the dimension for which the number of taps is smaller, each phase of coefficients must be padded with zeros up to the larger number of taps.
Example coefficients are shown in hexadecimal form in Table 8-5 (horizontal) and Table 8-6 (vertical).
Table 8-5: Example 9-Tap Coefficients
Phase Tap 0 Tap 1 Tap 2 Tap 3 Tap 4 Tap 5 Tap 6 Tap 7 Tap 8
0 0x0000 0x0000 0x0000 0x0000 0x4000 0x0000 0x0000 0x0000 0x0000
1 0xFFB1 0x0123 0x047C 0x10C6 0x3A26 0xF5F0 0x037D 0xFF0A 0x0046
2 0xFF84 0x01D1 0xF865 0x2490 0x2A42 0xF3D0 0x0490 0xFEB4 0x0060
3 0xFF9E 0x017E 0xF93F 0x3619 0x14D7 0xF846 0x0312 0xFF1B 0x0043
Table 8-6: Example 7-Tap Coefficients
Phase Tap 0 Tap 1 Tap 2 Tap 3 Tap 4 Tap 5 Tap 6
0 0x0000 0x0000 0x0000 0x4000 0x0000 0x0000 0x0000
1 0x006D 0xFD69 0x0F04 0x3A81 0xF6FE 0x0204 0xFFA4
2 0x00B2 0xFB85 0x2160 0x2B58 0xF4E0 0x02B0 0xFF81
3 0x0097 0xFBE1 0x332B 0x1627 0xF8B1 0x01DF 0xFFA5
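The zero-padding rule for unequal tap counts can be sketched for a single phase. The Python example below is illustrative only: the function name pack_phase is hypothetical, and it uses Phase 1 of the 7-tap vertical coefficients and Phase 0 of the 9-tap horizontal coefficients from this example as inputs.

```python
def pack_phase(taps, num_h_taps, num_v_taps):
    """Zero-pad one phase to max(num_h_taps, num_v_taps) taps and pack it into 32-bit words."""
    width = max(num_h_taps, num_v_taps)
    padded = list(taps) + [0] * (width - len(taps))
    if len(padded) % 2:
        padded.append(0)              # odd count: final upper 16 bits unused
    return [(padded[i + 1] << 16) | padded[i] for i in range(0, len(padded), 2)]


v_phase1 = [0x006D, 0xFD69, 0x0F04, 0x3A81, 0xF6FE, 0x0204, 0xFFA4]
v_words = pack_phase(v_phase1, num_h_taps=9, num_v_taps=7)
h_phase0 = [0, 0, 0, 0, 0x4000, 0, 0, 0, 0]
h_words = pack_phase(h_phase0, num_h_taps=9, num_v_taps=7)
```

Both dimensions produce five words per phase, so the 7-tap vertical phase ends with one word holding only tap 6 followed by one all-zero dummy word.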
The resulting coefficient file for download is shown in Table 8-7.
Table 8-7: Example 3 Coefficient Set Download Format
Horizontal Filter Coefficients for Luma (load sequence numbers 1 to 20) and for Chroma (load sequence numbers 21 to 40)
Ph = Phase #, T = Tap #
Phase Luma Load Sequence Number Value Calculation Chroma Load Sequence Number Value Calculation
0 1 0x00000000 (Ph0 T1 << 16) | Ph0 T0 21 0x00000000 (Ph0 T1 << 16) | Ph0 T0
0 2 0x00000000 (Ph0 T3 << 16) | Ph0 T2 22 0x00000000 (Ph0 T3 << 16) | Ph0 T2
0 3 0x00004000 (Ph0 T5 << 16) | Ph0 T4 23 0x00004000 (Ph0 T5 << 16) | Ph0 T4
0 4 0x00000000 (Ph0 T7 << 16) | Ph0 T6 24 0x00000000 (Ph0 T7 << 16) | Ph0 T6
0 5 0x00000000 (0 << 16) | Ph0 T8 25 0x00000000 (0 << 16) | Ph0 T8
1 6 0x0123FFB1 (Ph1 T1 << 16) | Ph1 T0 26 0x0123FFB1 (Ph1 T1 << 16) | Ph1 T0
1 7 0x10C6047C (Ph1 T3 << 16) | Ph1 T2 27 0x10C6047C (Ph1 T3 << 16) | Ph1 T2
1 8 0xF5F03A26 (Ph1 T5 << 16) | Ph1 T4 28 0xF5F03A26 (Ph1 T5 << 16) | Ph1 T4
1 9 0xFF0A037D (Ph1 T7 << 16) | Ph1 T6 29 0xFF0A037D (Ph1 T7 << 16) | Ph1 T6
1 10 0x00000046 (0 << 16) | Ph1 T8 30 0x00000046 (0 << 16) | Ph1 T8
2 11 0x01D1FF84 (Ph2 T1 << 16) | Ph2 T0 31 0x01D1FF84 (Ph2 T1 << 16) | Ph2 T0
2 12 0x2490F865 (Ph2 T3 << 16) | Ph2 T2 32 0x2490F865 (Ph2 T3 << 16) | Ph2 T2
2 13 0xF3D02A42 (Ph2 T5 << 16) | Ph2 T4 33 0xF3D02A42 (Ph2 T5 << 16) | Ph2 T4
2 14 0xFEB40490 (Ph2 T7 << 16) | Ph2 T6 34 0xFEB40490 (Ph2 T7 << 16) | Ph2 T6
2 15 0x00000060 (0 << 16) | Ph2 T8 35 0x00000060 (0 << 16) | Ph2 T8
3 16 0x017EFF9E (Ph3 T1 << 16) | Ph3 T0 36 0x017EFF9E (Ph3 T1 << 16) | Ph3 T0
3 17 0x3619F93F (Ph3 T3 << 16) | Ph3 T2 37 0x3619F93F (Ph3 T3 << 16) | Ph3 T2
3 18 0xF84614D7 (Ph3 T5 << 16) | Ph3 T4 38 0xF84614D7 (Ph3 T5 << 16) | Ph3 T4
3 19 0xFF1B0312 (Ph3 T7 << 16) | Ph3 T6 39 0xFF1B0312 (Ph3 T7 << 16) | Ph3 T6
3 20 0x00000043 (0 << 16) | Ph3 T8 40 0x00000043 (0 << 16) | Ph3 T8
Table 8-7: Example 3 Coefficient Set Download Format (Cont’d)
Vertical Filter Coefficients for Luma (load sequence numbers 41 to 60) and for Chroma (load sequence numbers 61 to 80)
Ph = Phase #, T = Tap #
Phase Luma Load Sequence Number Value Calculation Chroma Load Sequence Number Value Calculation
0 41 0x00000000 (Ph0 T1 << 16) | Ph0 T0 61 0x00000000 (Ph0 T1 << 16) | Ph0 T0
0 42 0x40000000 (Ph0 T3 << 16) | Ph0 T2 62 0x40000000 (Ph0 T3 << 16) | Ph0 T2
0 43 0x00000000 (Ph0 T5 << 16) | Ph0 T4 63 0x00000000 (Ph0 T5 << 16) | Ph0 T4
0 44 0x00000000 (0 << 16) | Ph0 T6 64 0x00000000 (0 << 16) | Ph0 T6
0 45 0x00000000 N/A dummy coef 65 0x00000000 N/A dummy coef
1 46 0xFD69006D (Ph1 T1 << 16) | Ph1 T0 66 0xFD69006D (Ph1 T1 << 16) | Ph1 T0
1 47 0x3A810F04 (Ph1 T3 << 16) | Ph1 T2 67 0x3A810F04 (Ph1 T3 << 16) | Ph1 T2
1 48 0x0204F6FE (Ph1 T5 << 16) | Ph1 T4 68 0x0204F6FE (Ph1 T5 << 16) | Ph1 T4
1 49 0x0000FFA4 (0 << 16) | Ph1 T6 69 0x0000FFA4 (0 << 16) | Ph1 T6
1 50 0x00000000 N/A dummy coef 70 0x00000000 N/A dummy coef
2 51 0xFB8500B2 (Ph2 T1 << 16) | Ph2 T0 71 0xFB8500B2 (Ph2 T1 << 16) | Ph2 T0
2 52 0x2B582160 (Ph2 T3 << 16) | Ph2 T2 72 0x2B582160 (Ph2 T3 << 16) | Ph2 T2
2 53 0x02B0F4E0 (Ph2 T5 << 16) | Ph2 T4 73 0x02B0F4E0 (Ph2 T5 << 16) | Ph2 T4
2 54 0x0000FF81 (0 << 16) | Ph2 T6 74 0x0000FF81 (0 << 16) | Ph2 T6
2 55 0x00000000 N/A dummy coef 75 0x00000000 N/A dummy coef
3 56 0xFBE10097 (Ph3 T1 << 16) | Ph3 T0 76 0xFBE10097 (Ph3 T1 << 16) | Ph3 T0
3 57 0x1627332B (Ph3 T3 << 16) | Ph3 T2 77 0x1627332B (Ph3 T3 << 16) | Ph3 T2
3 58 0x01DFF8B1 (Ph3 T5 << 16) | Ph3 T4 78 0x01DFF8B1 (Ph3 T5 << 16) | Ph3 T4
3 59 0x0000FFA5 (0 << 16) | Ph3 T6 79 0x0000FFA5 (0 << 16) | Ph3 T6
3 60 0x00000000 N/A dummy coef 80 0x00000000 N/A dummy coef
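The packing rule used in Table 8-7 can be sketched in a few lines of Python. This is an illustration only, not part of the delivered driver: the function name pack_phase and the words_per_phase argument are hypothetical. The sketch assumes that each 32-bit load word carries two 16-bit coefficients (even tap in the low half, odd tap in the high half) and that every phase is padded with zero words to the same length, as the dummy entries for the 7-tap vertical filter suggest.

```python
def pack_phase(taps, words_per_phase):
    """Pack one phase of 16-bit coefficients into 32-bit load words.

    taps            -- coefficient values for one phase, tap 0 first
    words_per_phase -- number of 32-bit words loaded per phase
                       (hypothetical parameter; 5 in the Table 8-7 example)
    """
    words = []
    for i in range(0, len(taps), 2):
        low = taps[i] & 0xFFFF
        # An odd tap count leaves the upper half of the last word at zero.
        high = (taps[i + 1] & 0xFFFF) if i + 1 < len(taps) else 0
        words.append((high << 16) | low)
    # Pad with dummy zero words so every phase occupies the same space.
    words += [0] * (words_per_phase - len(words))
    return words


# Phase 1 of the 9-tap horizontal filter (Table 8-5)
horizontal_phase1 = [0xFFB1, 0x0123, 0x047C, 0x10C6, 0x3A26,
                     0xF5F0, 0x037D, 0xFF0A, 0x0046]
# Phase 1 of the 7-tap vertical filter (Table 8-6)
vertical_phase1 = [0x006D, 0xFD69, 0x0F04, 0x3A81, 0xF6FE, 0x0204, 0xFFA4]

h_words = pack_phase(horizontal_phase1, 5)
v_words = pack_phase(vertical_phase1, 5)
print([f"0x{w:08X}" for w in h_words])
print([f"0x{w:08X}" for w in v_words])
```

The printed words can be compared against load sequence numbers 6 to 10 and 46 to 50 in Table 8-7.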
Coefficient Preloading Using a .coe File
To preload the scaler with coefficients (mandatory when in Constant mode), you must specify, using the CORE Generator GUI or the EDK GUI, a .coe file that contains the coefficients you want to use. It is important that the .coe file specified is in the correct format. The coefficients specified in the .coe file become hard-coded into the hardware during synthesis.
Generating .coe Files
Generating .coe files can be accomplished by either extracting coefficients from a file provided with the core (refer to the next section) or developing your own set of coefficients. Developing your own coefficients is a very complex and subjective operation, and is beyond the scope of this document. Refer to Answer Record 35262 Coefficient Calculations for more information on generating video scaler coefficients.
Extracting Coefficients From xscaler_coefs.c File
The pCore version of the video scaler includes a software driver. The coefficients are included in this driver in the xscaler_coefs.c file. The pCore version of the core can be generated by selecting "EDK pCore" in the CORE Generator GUI. Coefficients from this file can be extracted manually; however, it is important to know the format of this file.
All coefficients required for any conversion are provided with the SW Driver. The filename is xscaler_coefs.c. You may modify this file, and the driver code that reads the coefficients from it, as you see fit.
The file defines 19 “bins” of coefficients. You must select which bin to use according to your application. In the delivered driver, the file xscaler.c includes a function called XScaler_CoeffBinOffset, which assesses the scaling requirements specified by you (for example, input/output rectangle sizes) and calculates which bin of coefficients is required. In this driver, the bins have been allocated as per Table 8-8. This function may be used independently for all Horizontal, Vertical, Luma, and Chroma filter operations.
Table 8-8: Coefficient “Binning” in SW Driver (xscaler_coefs.c)
Bin #: 1
SF = input_size/output_size: SF<1
Comments: All up-scaling cases
Bin #: 1+Ceil((output_size*16)/input_size) (bins 2 to 17). For example: down-scaling 1920 to 1440, use bin 13; down-scaling 1080 to 1000, use bin 16; down-scaling 1080 to 144, use bin 4.
SF = input_size/output_size: 1<SF<16 (all down-scaling cases)
Comments: General down-scaling coefficients. Down-scaling filter coefficients include anti-aliasing characteristics that differ according to scale factor.
Bin #: 18
SF = input_size/output_size: N/A
Comments: Unity coefficient in center tap
Bin #: 19
SF = input_size/output_size: 1920/1280 (1080/720)
Comments: Example user-specific case for HD down-scaling conversion
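The bin allocation in Table 8-8 can be illustrated with a short Python sketch. It is not the code of XScaler_CoeffBinOffset; the function name coeff_bin is hypothetical, and the choice of bin 18 for equal input and output sizes is an assumption based on the "unity coefficient" description. The range limit 1<SF<16 is not checked, and the user-specific bin 19 is not modeled.

```python
def coeff_bin(input_size, output_size):
    """Return the coefficient bin number suggested by Table 8-8."""
    if output_size > input_size:
        # SF < 1: all up-scaling cases share bin 1.
        return 1
    if output_size == input_size:
        # Assumption: a 1:1 conversion uses the unity-coefficient bin.
        return 18
    # Down-scaling: 1 + Ceil((output_size*16)/input_size), integer ceiling.
    return 1 + -(-(output_size * 16) // input_size)


print(coeff_bin(1920, 1440))  # worked example from Table 8-8
print(coeff_bin(1080, 1000))
print(coeff_bin(1080, 144))
```

The three calls reproduce the worked examples listed in the table (bins 13, 16 and 4).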
Within each “bin,” four further levels of granularity can be observed. In order of decreasing granularity size, these levels are:
• Number of taps defined
• Number of phases defined
• Phase number (one line in file)
• Tap number (one element of each line), newest (right-most or lowest) first
For example, the first set of coefficients, defined for two taps and two phases, is given as:
// bin # 1; num_taps = 2; num_phases = 2
1018, 15366,
8192, 8192
The second set of coefficients, defined for two taps and three phases, is given immediately afterwards as:
/* bin # 1; num_taps = 2; num_phases = 3 */
1018, 15366,
5852, 10532,
10532, 5852,
And so forth.
Format for .coe Files
The guidelines for creating a .coe file are as follows:
Coefficients may be specified in either 16-bit binary form or signed decimal form.
First line of a 16-bit binary file must be memory_initialization_radix=2;
First line of a signed decimal file must be memory_initialization_radix=10;
Second line of all .coe files must be memory_initialization_vector=
All coefficient entries must end with a comma (",") except the final entry which must end with a semicolon ";".
Final entry must have a carriage return at the end after the semicolon.
All coefficient sets must be listed consecutively, starting with set 0.
All sets in the file must be of equal size in terms of the number of coefficient entries.
Number of coefficient entries in all sets depends upon:
• Max_coef_sets
• Max_phases
• Max_taps (=max(num_h_taps, num_v_taps))
• User setting for "Separate Y/C coefficients"
• User setting for “Chroma_format”
• User setting for "Separate H/V coefficients"
The simplest method is to specify an intermediate value num_banks:
num_banks=4;
if (Separate H/V coefficients = 0) then
num_banks := num_banks/2;
end;
if (Separate Y/C coefficients = 0) or (chroma_format=4:4:4) then
num_banks := num_banks/2;
end;
Consequently, the number of entries in the .coe file can be defined as:
num_coefs_in_coe_file = max_coef_sets x num_banks x max_phases x max_taps
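As a cross-check, the num_banks rule and the entry-count formula above can be written as a small Python function. The function and argument names are illustrative and do not correspond to any tool option names. The three calls use the specifications of the .coe examples in Table 8-10, Table 8-11 and Table 8-12.

```python
def num_coefs_in_coe_file(max_coef_sets, max_phases, num_h_taps, num_v_taps,
                          separate_hv, separate_yc, chroma_format="4:2:2"):
    """Number of coefficient entries expected in the .coe file."""
    num_banks = 4
    if not separate_hv:
        num_banks //= 2
    if not separate_yc or chroma_format == "4:4:4":
        num_banks //= 2
    max_taps = max(num_h_taps, num_v_taps)
    return max_coef_sets * num_banks * max_phases * max_taps


# Example 1: 12 taps, 4 phases, 1 set, shared H/V and Y/C coefficients
example1 = num_coefs_in_coe_file(1, 4, 12, 12, False, False)
# Example 2: 12 taps, 4 phases, 2 sets, separate H/V and Y/C coefficients
example2 = num_coefs_in_coe_file(2, 4, 12, 12, True, True)
# Example 3: 4 H taps, 3 V taps, 4 phases, 1 set, separate H/V only
example3 = num_coefs_in_coe_file(1, 4, 4, 3, True, False)
print(example1, example2, example3)
```

Adding the two header lines gives the last coefficient line number of each example file (50, 386 and 34).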
Within each set, coefficient banks must be specified in the following order:
Table 8-9: Ordering of Coefficients in .coe File for Different Coefficient Sharing Options
Separate Y/C Coefficients Separate H/V Coefficients Bank Order in .coe File
True True HY, HC, VY, VC
True False Y, C
False True H, V
False False Single set only
Within each bank, all phases must be listed consecutively, starting with phase 0, followed by phase 1, etc.
The number of phases specified (per bank) in the .coe file must be equal to Max_Phases, even for filters that use fewer phases. Set all coefficients in unused phases to 0 (decimal) or 0000000000000000 (16b binary).
Within each phase, all coefficients must be listed consecutively. The first specified coefficient for any phase represents the value applied to the newest (rightmost or lowest) tap in the aperture.
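The formatting guidelines above can be sketched as a small generator. This is a minimal illustration, assuming the coefficients are already in the required set, bank, phase and tap order. The function name format_coe is hypothetical, and a line feed is used as the line terminator.

```python
def format_coe(coefs, radix=10):
    """Return the text of a .coe file for a flat list of signed 16-bit values."""
    if radix not in (2, 10):
        raise ValueError("radix must be 2 (16-bit binary) or 10 (signed decimal)")
    lines = [f"memory_initialization_radix={radix};",
             "memory_initialization_vector="]
    last = len(coefs) - 1
    for i, value in enumerate(coefs):
        if not -32768 <= value <= 32767:
            raise ValueError(f"coefficient {value} does not fit in 16 bits")
        # Binary form is the 16-bit two's-complement pattern of the value.
        text = format(value & 0xFFFF, "016b") if radix == 2 else str(value)
        # Every entry ends with a comma except the final one (semicolon).
        lines.append(text + (";" if i == last else ","))
    # The final entry must be followed by a line terminator.
    return "\n".join(lines) + "\n"


print(format_coe([0, 162, 0, -1069]), end="")
print(format_coe([0, 162, 0, -1069], radix=2), end="")
```

Unused phases still have to be present in the list as zeros, so the length of coefs should equal num_coefs_in_coe_file.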
Table 8-10 shows an example of a .coe file with the following specification:
num_h_taps = num_v_taps = 12;
max_phases = 4;
max_coef_sets = 1;
Separate H/V Coefficients = False;
Separate Y/C Coefficients = False;
Both signed decimal and 16-bit binary forms are shown.
Table 8-10: .coe File Example 1
Phase Tap File line-number Line text (signed decimal form) Line text (16-bit binary form)
N/A N/A 1 memory_initialization_radix=10; memory_initialization_radix=2;
2 memory_initialization_vector= memory_initialization_vector=
0 0 3 0, 0000000000000000,
0 1 4 162, 0000000010100010,
0 2 5 0, 0000000000000000,
0 3 6 -1069, 1111101111010011,
0 4 7 0, 0000000000000000,
0 5 8 5199, 0001010001001111,
0 6 9 8167, 0001111111100111,
0 7 10 4457, 0001000101101001,
0 8 11 0, 0000000000000000,
0 9 12 -616, 1111110110011000,
0 10 13 0, 0000000000000000,
0 11 14 85, 0000000001010101,
1 0 15 28, 0000000000011100,
1 1 16 155, 0000000010011011,
1 2 17 -186, 1111111101000110,
1 3 18 -1062, 1111101111011010,
1 4 19 960, 0000001111000000,
1 5 20 6311, 0001100010100111,
1 6 21 7842, 0001111010100010,
1 7 22 3246, 0000110010101110,
1 8 23 -538, 1111110111100110,
1 9 24 -518, 1111110111111010,
1 10 25 72, 0000000001001000,
1 11 26 73, 0000000001001001,
2 0 27 53, 0000000000110101,
2 1 28 125, 0000000001111101,
2 2 29 -366, 1111111010010010,
2 3 30 -890, 1111110010000110,
2 4 31 2060, 0000100000001100,
2 5 32 7209, 0001110000101001,
2 6 33 7209, 0001110000101001,
2 7 34 2060, 0000100000001100,
2 8 35 -890, 1111110010000110,
2 9 36 -366, 1111111010010010,
2 10 37 125, 0000000001111101,
2 11 38 53, 0000000000110101,
3 0 39 73, 0000000001001001,
3 1 40 72, 0000000001001000,
3 2 41 -518, 1111110111111010,
3 3 42 -538, 1111110111100110,
3 4 43 3246, 0000110010101110,
3 5 44 7842, 0001111010100010,
3 6 45 6311, 0001100010100111,
3 7 46 960, 0000001111000000,
3 8 47 -1062, 1111101111011010,
3 9 48 -186, 1111111101000110,
3 10 49 155, 0000000010011011,
3 11 50 28; 0000000000011100;
- - 51
Table 8-11 shows an example of a .coe file with the following specification:
num_h_taps = 12, num_v_taps = 12;
max_phases = 4;
max_coef_sets = 2;
Separate H/V Coefficients = True;
Separate Y/C Coefficients = True;
Only signed decimal form is shown. For clarity, the same coefficient values have been used for each bank. Be aware that these are not realistic coefficients. Also note that this list uses ellipses to show continuation and does not include a complete set of coefficients.
Table 8-11: .coe File Example 2
Set Bank Phase Tap File line-number Line Text
N/A 1 memory_initialization_radix=10;
2 memory_initialization_vector=
0 0 (HY) 0 0 3 0,
0 0 (HY) 0 1 4 162,
0 0 (HY) 0 2 5 0,
0 0 (HY) 0 3 6 -1069,
0 0 (HY) 0 … … …
0 0 (HY) 1 0 15 28,
0 0 (HY) 1 1 16 155,
0 0 (HY) 1 2 17 -186,
0 0 (HY) … … … …
0 0 (HY) 3 0 39 73,
0 0 (HY) 3 1 40 72,
0 0 (HY) 3 … … …
0 0 (HY) 3 11 50 28,
0 1 (HC) 0 0 51 0,
0 1 (HC) 0 1 52 162,
0 1 (HC) 0 2 53 0,
0 …… … …
0 1 (HC) 3 0 87 73,
0 1 (HC) 3 1 88 72,
0 1 (HC) 3 … … …
0 1 (HC) 3 11 98 28,
0 2 (VY) 0 0 99 0,
0 2 (VY) 0 1 100 162,
0 2 (VY) 0 2 101 0,
0 …… … …
0 2 (VY) 3 0 135 73,
0 2 (VY) 3 1 136 72,
0 2 (VY) 3 … … …
0 2 (VY) 3 11 146 28,
0 3 (VC) 0 0 147 0,
0 3 (VC) 0 1 148 162,
0 3 (VC) 0 2 149 0,
0 …… … …
0 3 (VC) 3 0 183 73,
0 3 (VC) 3 1 184 72,
0 3 (VC) 3 … … …
0 3 (VC) 3 11 194 28,
1 0 (HY) 0 0 195 0,
1 0 (HY) 0 1 196 162,
1 0 (HY) 0 2 197 0,
1 0 (HY) … … … …
1 0 (HY) 3 11 242 28,
1 1 (HC) 0 0 243 0,
1 …… … …
1 2 (VY) 0 0 291 0,
1 …… … …
1 3 (VC) 3 0 375 73,
1 3 (VC) 3 1 376 72,
1 3 (VC) 3 … … …
1 3 (VC) 3 11 386 28;
- - - - 387 “”
Table 8-12 shows an example of a .coe file with the following specification:
num_h_taps = 4, num_v_taps = 3;
max_phases = 4;
max_coef_sets = 1;
Separate H/V Coefficients = True;
Separate Y/C Coefficients = False;
Only signed decimal form is shown.
Table 8-12: .coe File Example 3
Bank Phase Tap File line-number Line Text Notes
N/A 1 memory_initialization_radix=10;
2 memory_initialization_vector=
0 (H) 0 0 3 -104,
0 (H) 0 1 4 1018,
0 (H) 0 2 5 15364,
0 (H) 0 3 6 106,
0 (H) 1 0 7 -240,
0 (H) 1 1 8 4793,
0 (H) 1 2 9 12022,
0 (H) 1 3 10 -191,
0 (H) 2 0 11 -282,
0 (H) 2 1 12 8474,
0 (H) 2 2 13 8474,
0 (H) 2 3 14 -282,
0 (H) 3 0 15 -191,
0 (H) 3 1 16 12022,
0 (H) 3 2 17 4793,
0 (H) 3 3 18 -240,
1 (V) 0 0 19 86,
1 (V) 0 1 20 16212,
1 (V) 0 2 21 86,
1 (V) - - 22 0, Padding value
1 (V) 1 0 23 512,
1 (V) 1 1 24 16068,
1 (V) 1 2 25 -197,
1 (V) - - 26 0, Padding value
1 (V) 2 0 27 1243,
1 (V) 2 1 28 15539,
1 (V) 2 2 29 -398,
1 (V) - - 30 0, Padding value
1 (V) 3 0 31 2829,
1 (V) 3 1 32 14099,
1 (V) 3 2 33 -544,
1 (V) - - 34 0; Padding value
- - - 35
Coefficient Readback
For coefficient verification purposes, a feature of the video scaler allows the user to read back coefficients in the active coefficient memory.
Dedicated connections are included to facilitate this feature:
coef_set_bank_rd_addr(11:8): Coefficient set read-address
coef_set_bank_rd_addr(1:0): Coefficient bank read-address. 00=HY, 01=HC, 10=VY, 11=VC
coef_mem_rd_addr(13:8): Coefficient phase read-address
coef_mem_rd_addr(3:0): Coefficient tap read-address
coef_mem_output(15:0): Coefficient readback output
intr_coef_mem_rdbk_rdy: Output flag indicating that the specified coefficient bank is ready for reading
Before changing the set and bank read address, set bit 3 of the control register to 0. Then use coef_set_bank_rd_addr to provide the set number and bank number of the coefficients to be read back, and activate the new bank of coefficients by setting bit 3 of the control register to 1. A FIFO is then populated with that bank of coefficients. Once the intr_coef_mem_rdbk_rdy interrupt has gone high, use coef_mem_rd_addr to provide the phase and tap number of the coefficient to be read from that bank. The coefficient appears at coef_mem_output three clk cycles later.
Reading back coefficients does not cause image distortion, and may be executed during normal operation.
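The bit-field layout of the two read-address ports can be illustrated as follows. This sketch only shows how the fields listed above combine into one value. The helper names and the BANKS dictionary are hypothetical, and no register access or control-register handshaking is modeled.

```python
# Bank encoding from the coef_set_bank_rd_addr(1:0) description
BANKS = {"HY": 0b00, "HC": 0b01, "VY": 0b10, "VC": 0b11}


def set_bank_rd_addr(coef_set, bank):
    """Value for coef_set_bank_rd_addr: set in bits 11:8, bank in bits 1:0."""
    return ((coef_set & 0xF) << 8) | (bank & 0x3)


def mem_rd_addr(phase, tap):
    """Value for coef_mem_rd_addr: phase in bits 13:8, tap in bits 3:0."""
    return ((phase & 0x3F) << 8) | (tap & 0xF)


# Read set 1, vertical luma bank, phase 3, tap 11
print(hex(set_bank_rd_addr(1, BANKS["VY"])))
print(hex(mem_rd_addr(3, 11)))
```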
Chapter 9
Performance
The target maximum clock frequencies for all scaler input clocks are shown in Table 9-1.
Table 9-1: Target Maximum Clock Frequencies
Family Speed grade FMax (MHz)
Virtex-5 -1 225
Virtex-5 -2 250
Virtex-5 -3 275
Virtex-6 -1 250
Virtex-6 -2 280
Spartan-6 -2 150
Spartan-6 -3 160
Spartan-3A DSP -4 150
Spartan-3A DSP -5 160
It is very important to ensure that the clock rate available supports worst-case conversions. This chapter includes detailed information and examples for worst-case scenarios.
Every user of the Xilinx Video Scaler should have a worst-case scenario in mind. The factors that may contribute to this scenario include:
Maximum line length to be handled in the system (into and out from the scaler)
Maximum number of lines per frame (in and out)
Maximum frame refresh rate
Chroma format (4:4:4, 4:2:2, or 4:2:0)
Clock FMax (depends upon the selected device)
These factors may contribute to decisions made for configuring the scaler and its supporting system. For example, the user may decide to use the scaler in its dual-engine parallel Y/C configuration to achieve the desired scale factor and frame rate. Using a dual-engine scaler allows the scaler to process more data per clock cycle at the cost of increased resource usage. The user may also elect to change speed grade or even device family, depending on the findings.
The size of the scaler implementation is determined by the number of taps and number of phases in the filter and the number of engines. The number of taps and number of phases do not impact the clock frequency.
How do you establish whether or not the scaler will meet the application requirements? The approach taken is to calculate the minimum clock frequency required to make the intended conversions possible.
Definitions:
Subject Image: The area of the active image that is driven into the scaler. This may or may not be the entire image, dependent upon your requirements. It is of dimensions (SubjWidth x SubjHeight).
Active Image: The entire active input image, some or all of which will include the Subject Image, and is of dimensions (ActWidth x ActHeight).
FPix: The input sample rate.
F'clk': The 'clk' frequency. Data is read from the internal input line buffer, processed and written to the internal output buffer using the system clock.
FLineIn: The input Line Rate – could be driven by input rate or scaler LineReq rate. FLineIn must represent the maximum burst frequency of the input lines. For example, 720P exhibits an FLineIn of 45 kHz.
FFrameIn: The fixed frame refresh rate (Hz) – same for both input and output.
To make the calculations according to the previous definitions and assumptions, it is necessary to distinguish between the following cases:
Live Video mode: An input video stream feeds directly into the scaler. The user may not hold off the input stream. The system must be able to cope with the constant flow of video data.
Memory mode: The user may control the input feed using back-pressure/handshaking by implementing an input frame buffer.
The example cases that follow illustrate how to calculate the clock frequencies required to sustain the throughput needed for given usage scenarios.
Live Video Mode
If no input frame buffer is used, and the timing of the input video format drives the scaler, then the number of 'clk' cycles available per H period becomes important. FLineIn is a predetermined frequency in this case, often (but not necessarily) defined according to a known broadcast video format (for example 1080i/60, 720P, CCIR601 etc.).
The critical factors may be summarized as follows:
ProcessingOverheadPerComponent – The number of extraneous cycles needed by the scaler to complete the generation of one component of the output line, in addition to the actual processing cycles. This is required due to filter latency and State-Machine initialization. For all cases in this document, this has been approximated as 50 cycles per component per line.
CyclesPerOutputLine – This is the number of cycles the scaler requires to generate one output line, of multiple components. The final calculation depends upon the chroma format and the filter configuration (YC4:2:2 only), and can be summarized as:
For 4:4:4:
CyclesPerOutputLine = Max(output_h_size,SubjWidth) + ProcessingOverheadPerComponent
For 4:2:2 dual-engine:
CyclesPerOutputLine = Max(output_h_size,SubjWidth) + 2*ProcessingOverheadPerComponent
For 4:2:2 single-engine:
CyclesPerOutputLine = 2*Max(output_h_size,SubjWidth) + 3*ProcessingOverheadPerComponent
For 4:2:0:
CyclesPerOutputLine = 2*Max(output_h_size,SubjWidth) + 3*ProcessingOverheadPerComponent
For more details on the above estimations, continue reading. Otherwise, skip to the MaxVHoldsPerInputAperture bullet below.
The general calculation is:
CyclesPerOutputLine=(CompsPerEngine*Max(output_h_size,SubjWidth))+ OverHeadMult*ProcessingOverheadPerComponent
The CompsPerEngine and OverHeadMult values can be extracted from Table 9-2.
Table 9-2: Throughput Calculations for Different Chroma Formats
Chroma Format NumEngines CompsPerEngine OverHeadMult
4:4:4 (e.g., RGB) 3 1 1
4:2:2 High performance 2 1 2
4:2:2 Standard performance 1 2 3
4:2:0 1 2 3
NumEngines
This is the number of engines used in the implementation. For the YC4:2:2 case, a higher number of engines uses more resources - particularly BRAM and DSP48.
CompsPerEngine
This is the largest number of full h-resolution components to be processed by this instance of the scaler. When using YC, each chroma component constitutes 0.5 in this respect.
OverHeadMult
For each component processed by a single engine, the ProcessingOverheadPerComponent overhead factor must be included in the equation. The number of times this overhead needs to be factored in depends upon the number of components processed by the worst-case engine.
CyclesRequiredPerOutputLine = Max(output_h_size,SubjWidth) + ProcessingOverheadPerComponent
We modify this to include the chroma components. The YC case is shown in this example.
CyclesRequiredPerOutputLine = 2*Max(output_h_size,SubjWidth) + 3*ProcessingOverheadPerComponent
MaxVHoldsPerInputAperture – This is the maximum number of times the vertical aperture needs to be 'held' (especially up-scaling):
MaxVHoldsPerInputAperture = CEIL(Vertical scaling ratio)
where
vertical scaling ratio = output_v_size/input_v_size
Given the preceding information, it is now necessary to calculate how many cycles it will take to generate the worst-case number of output lines for any vertical aperture:
MaxClksTakenPerVAperture – This is the number of cycles it will take to generate MaxVHoldsPerInputAperture lines.
MaxClksTakenPerVAperture = CyclesRequiredPerOutputLine x MaxVHoldsPerInputAperture
It is then necessary to decide the minimum 'clk' frequency required to achieve your goals according to this calculation:
MinF'clk' = FLineIn x MaxClksTakenPerVAperture
Also useful is the reciprocal relationship that defines the number of 'clk' cycles available before the next line is written into the input line buffer, for a predefined 'clk' frequency:
ClksAvailablePerLine = F'clk'/FLineIn
Within this number of cycles, all output lines that require the use of the current vertical filter aperture must be completely generated. If MaxClksTakenPerVAperture < ClksAvailablePerLine, then the desired conversion is possible using the current clock frequency, without the use of an input frame buffer.
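The Live Video mode calculation above can be collected into one short Python sketch. The function name, argument names and configuration keys are illustrative only. The CompsPerEngine and OverHeadMult values come from Table 9-2, and the 50-cycle overhead is the approximation used in this chapter.

```python
# (CompsPerEngine, OverHeadMult) per configuration, from Table 9-2
THROUGHPUT = {
    "4:4:4": (1, 1),
    "4:2:2 dual": (1, 2),
    "4:2:2 single": (2, 3),
    "4:2:0": (2, 3),
}
PROCESSING_OVERHEAD_PER_COMPONENT = 50  # cycles, approximation used in this guide


def min_clk_live(f_line_in, subj_width, output_h_size,
                 input_v_size, output_v_size, config):
    """Minimum 'clk' frequency (Hz) for Live Video mode."""
    comps, mult = THROUGHPUT[config]
    cycles_per_output_line = (comps * max(output_h_size, subj_width)
                              + mult * PROCESSING_OVERHEAD_PER_COMPONENT)
    # MaxVHoldsPerInputAperture = CEIL(output_v_size / input_v_size)
    max_v_holds = -(-output_v_size // input_v_size)
    max_clks_per_v_aperture = cycles_per_output_line * max_v_holds
    return f_line_in * max_clks_per_v_aperture


# 1080i/60 pass-through, single engine
print(min_clk_live(33750, 1920, 1920, 540, 540, "4:2:2 single") / 1e6)
# 640x480 to 800x600, single engine
print(min_clk_live(30000, 640, 800, 480, 600, "4:2:2 single") / 1e6)
```

The result is compared against the FMax of the target device in Table 9-1: the conversion is feasible without an input frame buffer only if the returned frequency is below that FMax.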
Some examples follow. They are estimates only, and are subject to change.
Example 1: The Unity Case
1080i/60 YC4:2:2 'passthrough'
Vertical scaling ratio = 1.00
Horizontal scaling ratio = 1.00
FLineIn = 33750
Single-engine implementation
CyclesRequiredPerOutputLine = 2*1920 + 150 (approximately)
MaxVHoldsPerInputAperture = round_up(540/540) = 1
MaxClksTakenPerVAperture = 3990 * 1 = 3990
MinF'clk' = 33750*3990 = 134.66 MHz
Shrink-factor inputs:
hsf = 2^20 x (1/1.0) = 0x100000
vsf = 2^20 x (1/1.0) = 0x100000
This case is possible with no input buffer using Spartan-3A DSP because the MinF'clk' is less than the core Fmax, as shown in Table 9-1.
Example 2: Up-scaling 640x480 60 Hz YC4:2:2 to 800x600
Assuming 30 kHz line rate
Vertical scale ratio = 1.25
Horizontal scale ratio = 1.25
FLineIn = 30000
Single-engine implementation
CyclesRequiredPerOutputLine = 2*800 + 150 (approximately)
MaxVHoldsPerInputAperture = round_up(600/480) = 2
MaxClksTakenPerVAperture = 1750 * 2 = 3500
MinF'clk' = 30000*3500 = 105 MHz
Shrink-factor inputs:
hsf = 2^20 x (1/1.25) = 0x0CCCCC
vsf = 2^20 x (1/1.25) = 0x0CCCCC
This case is easily possible with no input buffer, in Spartan-3A DSP.
Example 3: Up-scaling 640x480 60 Hz YC4:2:2 to 1920x1080p60
Assuming 30 kHz line rate
Vertical scale ratio = 2.25
Horizontal scale ratio = 3.0
FLineIn = 30000
Single-engine implementation
CyclesRequiredPerOutputLine = 2*1920 + 150 (approximately)
MaxVHoldsPerInputAperture = round_up(1080/480) = 3
MaxClksTakenPerVAperture = 3990 * 3 = 11970
MinF'clk' = 30000*11970 = 359.1 MHz
Shrink-factor inputs:
hsf = 2^20 x (1/3.0) = 0x055555
vsf = 2^20 x (1/2.25) = 0x071C71
Without an input frame buffer, this conversion will not work in any device currently available.
Example 4: Up-scaling 640x480 60 Hz YC4:2:2 to 1920x1080p60
Assuming 30 kHz line rate
Vertical scale ratio = 2.25
Horizontal scale ratio = 3.0
FLineIn = 30000
Dual-engine implementation
CyclesPerOutputLine = 1*1920 + 2*50 (approximately)
MaxVHoldsPerInputAperture = round_up(1080/480) = 3
MaxClksTakenPerVAperture = 2020 * 3 = 6060
MinF'clk' = 30000*6060 = 181.8 MHz
Shrink-factor inputs:
hsf = 2^20 x (1/3.0) = 0x055555
vsf = 2^20 x (1/2.25) = 0x071C71
For a dual-engine implementation, without an input frame buffer, this conversion will work in devices that support this clock frequency.
Example 5: Down-scaling 800x600 60 Hz YC4:2:2 to 640x480
Assuming 30 kHz line rate
Vertical scale ratio = 0.8
Horizontal scale ratio = 0.8
FLineIn = 30000
Single-engine implementation
CyclesRequiredPerOutputLine = 2*800 + 150 (approximately)
MaxVHoldsPerInputAperture = round_up(480/600) = 1
MaxClksTakenPerVAperture = 1750 * 1 = 1750
MinF'clk' = 30000*1750 = 52.5 MHz
Shrink-factor inputs:
hsf = 2^20 x (1/0.8) = 0x140000
vsf = 2^20 x (1/0.8) = 0x140000
This conversion will work in any of the supported devices and speed grades.
Example 6: Down-scaling 1080P60 YC4:2:2 to 720P/60
67.5 kHz line rate
Vertical scale ratio = 0.6667
Horizontal scale ratio = 0.6667
FLineIn = 67500
Single-engine implementation
CyclesPerOutputLine = 2*1920 + 3*50 (approximately)
MaxVHoldsPerInputAperture = round_up(720/1080) = 1
MaxClksTakenPerVAperture = 3990 * 1 = 3990
MinF'clk' = 67500*3990 = 269.32 MHz
Shrink-factor inputs:
hsf = 2^20 x (1/0.6667) = 0x180000
vsf = 2^20 x (1/0.6667) = 0x180000
When using a single engine, this conversion will not work with or without frame buffers (see below - Memory mode) unless using higher speed-grade Virtex-5 or Virtex-6 devices.
Example 7: Down-scaling 1080P60 YC4:2:2 to 720P/60
67.5 kHz line rate
Vertical scale ratio = 0.6667
Horizontal scale ratio = 0.6667
FLineIn = 67500
Dual-engine implementation
CyclesPerOutputLine = 1*1920 + 2*50 (approximately)
MaxVHoldsPerInputAperture = round_up(720/1080) = 1
MaxClksTakenPerVAperture = 2020 * 1 = 2020
MinF'clk' = 67500*2020 = 136.35 MHz
Shrink-factor inputs:
hsf = 2^20 x (1/0.6667) = 0x180000
vsf = 2^20 x (1/0.6667) = 0x180000
This conversion will work in any of the supported devices and speed grades.
Example 8: Down-scaling 720P/60 YC4:2:2 to 640x480
45 kHz line rate
Vertical scale ratio = 0.6667
Horizontal scale ratio = 0.5
FLineIn = 45000
Single-engine implementation
CyclesRequiredPerOutputLine = 2*1280 + 150 (approximately)
MaxVHoldsPerInputAperture = round_up(480/720) = 1
MaxClksTakenPerVAperture = 2710 * 1 = 2710
MinF'clk' = 45000*2710 = 121.95 MHz
Shrink-factor inputs:
hsf = 2^20 x (1/0.5) = 0x200000
vsf = 2^20 x (1/0.6667) = 0x180000
This conversion will work in any of the supported devices and speed grades.
Example 9: Converting 720P/60 YC4:2:2 to 1080i/60 (1920x540)
45 kHz line rate
Vertical scale ratio = 0.75
Horizontal scale ratio = 1.5
FLineIn = 45000
Single-engine implementation
CyclesRequiredPerOutputLine = 2*1920 + 150 (approximately)
MaxVHoldsPerInputAperture = round_up(540/720) = 1
MaxClksTakenPerVAperture = 3990 * 1 = 3990
MinF'clk' = 45000*3990 = 179.55 MHz
Shrink-factor inputs:
hsf = 2^20 x (1/1.5) = 0x0AAAAA
vsf = 2^20 x (1/0.75) = 0x155555
This conversion will work in Virtex-5, but not in Spartan-3A DSP, since the MinF'clk' is greater than the Spartan-3A DSP Fmax but less than the Virtex-5 Fmax, as shown in Table 9-1.
Memory Mode
Using an input frame buffer allows you to stretch the processing time over the entire frame period (utilizing the available blanking periods). New input lines may be provided as the internal phase-accumulator dictates, instead of the input timing signals.
The critical factors may be summarized as follows:
ProcessingOverheadPerLine – The number of extraneous cycles needed by the scaler to complete the generation of one output line, in addition to the actual processing cycles. This is required due to filter latency and State-Machine initialization. For all cases in this document, this has been approximated as 50 cycles per component per line.
FrameProcessingOverhead – The number of extraneous cycles needed by the scaler to complete the generation of one output frame, in addition to the actual processing cycles. This is required mainly due to vertical filter latency. For all cases in this document, this has been generally approximated as 10000 cycles per frame.
CyclesPerOutputFrame – This is the number of cycles the scaler requires to generate one output frame, of multiple components. The final calculation depends upon the chroma format (and, for YC4:2:2 only, the filter configuration), and can be summarized as:
For 4:4:4:
CyclesPerOutputFrame = Max [ (output_h_size + ProcessingOverheadPerLine)*output_v_size, (input_h_size + ProcessingOverheadPerLine)*input_v_size ] + FrameProcessingOverhead
For 4:2:2 dual-engine:
CyclesPerOutputFrame = Max [ (output_h_size + (ProcessingOverheadPerLine*2))*output_v_size, (input_h_size + (ProcessingOverheadPerLine*2))*input_v_size ] + FrameProcessingOverhead
For 4:2:2 single-engine:
CyclesPerOutputFrame = Max [ ((output_h_size*2) + (ProcessingOverheadPerLine*3))*output_v_size, ((input_h_size*2) + (ProcessingOverheadPerLine*3))*input_v_size ] + FrameProcessingOverhead
For 4:2:0:
CyclesPerOutputFrame = Max [ ((output_h_size*2) + (ProcessingOverheadPerLine*3))*output_v_size, ((input_h_size*2) + (ProcessingOverheadPerLine*3))*input_v_size ] + FrameProcessingOverhead
It is then necessary to decide the minimum 'clk' frequency according to this calculation:
MinF'clk' = FFrameIn x CyclesPerOutputFrame
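The Memory mode formulas above follow one pattern, which can be sketched in Python as shown below. The function name, argument names and configuration keys are illustrative only. The per-line multipliers mirror the four chroma-format cases listed above, and the overhead constants are the approximations stated in this chapter.

```python
# (multiplier on h_size, multiplier on ProcessingOverheadPerLine)
LINE_FACTORS = {
    "4:4:4": (1, 1),
    "4:2:2 dual": (1, 2),
    "4:2:2 single": (2, 3),
    "4:2:0": (2, 3),
}
PROCESSING_OVERHEAD_PER_LINE = 50    # cycles, approximation used in this guide
FRAME_PROCESSING_OVERHEAD = 10000    # cycles, approximation used in this guide


def min_clk_memory(f_frame_in, input_h_size, input_v_size,
                   output_h_size, output_v_size, config):
    """Minimum 'clk' frequency (Hz) for Memory mode."""
    size_mult, overhead_mult = LINE_FACTORS[config]
    overhead = overhead_mult * PROCESSING_OVERHEAD_PER_LINE
    output_cycles = (size_mult * output_h_size + overhead) * output_v_size
    input_cycles = (size_mult * input_h_size + overhead) * input_v_size
    cycles_per_output_frame = (max(output_cycles, input_cycles)
                               + FRAME_PROCESSING_OVERHEAD)
    return f_frame_in * cycles_per_output_frame


# 720P (1280x720) to 1080i/60 (1920x540), single engine, 60 frames per second
print(min_clk_memory(60, 1280, 720, 1920, 540, "4:2:2 single") / 1e6)
```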
Example 10: Converting 720P YC4:2:2 to 1080i/60 (1920x540)
Vertical scale ratio = 0.75
Horizontal scale ratio = 1.5
FFrameIn = 60
CyclesPerOutputFrame = (1920*2 + 150)*540 + 10000 (approximately) = 2164600
MinF'clk' = 60 x 2164600 = 129.87 MHz
Shrink-factor inputs:
hsf = 2^20 x (1/1.5) = 0x0AAAAA
vsf = 2^20 x (1/0.75) = 0x155555
This conversion is allowed in Spartan-3A DSP.
Note: Example 9 showed that the same conversion with no frame buffer is not possible in Spartan-3A DSP.
Appendix A
Use Cases
Typical Uses
Some scenarios for scaler usage are shown in Figure A-1 through Figure A-5. In particular, usage of the following dynamic parameter values is illustrated:
aperture_start_line
aperture_end_line
aperture_start_pixel
aperture_end_pixel
output_h_size
output_v_size
hsf
vsf
These values are very significant, and their usage is referred to throughout this document.
Figure A-1: Format Down-scaling. Example 720p to 640x480, HSF = 2^20 x 1280/640; VSF = 2^20 x 720/480
(Figure parameters: 1280x720 input; aperture_start_pixel = 0, aperture_end_pixel = 1279, aperture_start_line = 0, aperture_end_line = 719; output_h_size = 640, output_v_size = 480)
Figure A-2: Format Up-scaling. Example 640x480 to 720p, HSF = 2^20 x 640/1280; VSF = 2^20 x 480/720
(Figure parameters: 640x480 input; aperture_start_pixel = 0, aperture_end_pixel = 639, aperture_start_line = 0, aperture_end_line = 479; output_h_size = 1280, output_v_size = 720)
Figure A-3: Zoom (Up-scaling), HSF = 2^20 x 480/1280; VSF = 2^20 x 270/720
(Figure parameters: 1280x720 input, 480x270 aperture; aperture_start_pixel = 750, aperture_end_pixel = 1229, aperture_start_line = 420, aperture_end_line = 689; output_h_size = 1280, output_v_size = 720)
Figure A-4: Shrink (Down-scaling). Example for Picture-in-Picture (PinP), HSF = 2^20 x 1280/480; VSF = 2^20 x 720/270
(Figure parameters: 1280x720 input; aperture_start_pixel = 0, aperture_end_pixel = 1279, aperture_start_line = 0, aperture_end_line = 719; output_h_size = 480, output_v_size = 270)
Figure A-5: Zoom (Up-scaling) reading from External Memory, HSF = 2^20 x 480/1280; VSF = 2^20 x 270/720
(Figure parameters: 480x270 input; aperture_start_pixel = 0, aperture_end_pixel = 479, aperture_start_line = 0, aperture_end_line = 269; output_h_size = 1280, output_v_size = 720)
Video Scaler v4.0 User Guide www.xilinx.com 73
UG805 March 1, 2011
Appendix A: Use Cases
74 www.xilinx.com Video Scaler v4.0 User Guide
UG805 March 1, 2011
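The shrink factors in these use cases all follow the same pattern: 2^20 times the aperture size divided by the output size. A minimal C sketch of that calculation is given here; the function name is hypothetical, and truncation toward zero is an assumption chosen because it reproduces the hexadecimal values quoted in Example 10.

```c
#include <stdint.h>

#define SF_FRAC_BITS 20u  /* shrink factors carry 20 fractional bits */

/* Shrink factor = 2^20 * aperture_size / output_size, truncated.
 * aperture_start and aperture_end are inclusive, as in the figures. */
static uint32_t shrink_factor(uint32_t aperture_start, uint32_t aperture_end,
                              uint32_t output_size)
{
    uint64_t input_size = (uint64_t)aperture_end - aperture_start + 1u;
    return (uint32_t)((input_size << SF_FRAC_BITS) / output_size);
}
```

For the zoom case with aperture pixels 750 to 1229 and a 1280-pixel output, the aperture is 480 pixels wide and the result is 0.375 x 2^20.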
Appendix B
Programmer Guide
Introduction
This appendix provides a description of how to program and control the data flow for the video scaler hardware pCore. The information is sufficient for the development of a software driver (API) for use in application software for applications such as video conferencing and video analytics.
Note: A software driver is provided with the pCore so that you do not have to develop a software API as described here.
Conventions
Reserved locations in the registers will be ignored by the hardware and can be written by software with any value. Therefore the software does not need to zero or mask bits.
Unused coefficients should be set to zero. The number of taps is a compile time parameter for the IP core and needs to be known by the programmer to be able to load the coefficient tables correctly.
Register Definitions
Note: All registers default to 0x00000000 on power-up or software reset.
Table B-1: Video Scaler Registers Overview
Address Name Read/Write Description
0x0000 control R/W General control register
0x0004 status R General readable status register
0x0008 status_error R General readable status register for errors
0x000c status_done R/W General read register for status done
0x0010 horz_shrink_factor R/W Horizontal Shrink Factor
0x0014 vert_shrink_factor R/W Vertical Shrink Factor
0x0018 aperture_horz R/W aperture_start_pixel: Location of first subject pixel in input line, relative to first active pixel in that line. aperture_end_pixel: Location of final subject pixel in input line, relative to first active pixel in that line
0x001c aperture_vert R/W aperture_start_line: Location of first subject line in input image, relative to first active line in that image. aperture_end_line: Location of final subject line in input image, relative to first active line in that image
0x0020 output_size R/W output_h_size: Width of output image (pixels). output_v_size: Height of output image (lines)
0x0024 num_phases R/W num_h_phases: Number of phases of coefficients in current horizontal filter set. num_v_phases: Number of phases of coefficients in current vertical filter set
0x0028 coeff_sets R/W hcoeffset: Active coefficient set to use in horizontal filter operation. vcoeffset: Active coefficient set to use in vertical filter operation
0x002c start_hpa_y R/W Fractional value used to initialize horizontal accumulator at rectangle left edge for luma
0x0030 start_hpa_c R/W Fractional value used to initialize horizontal accumulator at rectangle left edge for chroma
0x0034 start_vpa_y R/W Fractional value used to initialize vertical accumulator at rectangle top edge for luma
0x0038 start_vpa_c R/W Fractional value used to initialize vertical accumulator at rectangle top edge for chroma
0x003c coef_write_set_addr R/W Coefficient set write address to indicate which coefficient bank to write
0x0040 coef_values W Coefficient values to write
0x0044 coef_set_bank_rd_addr R/W Set and bank number to be read
0x0048 coef_mem_rd_addr R/W Phase and tap number to be read
0x004c coef_mem_output R Coefficient readback output
0x00F0 Version Register R Hardware version information
0x0100 Software_Reset W Writing a SOFT_RESET value to this register resets the software registers and the Video Scaler IP core. The SOFT_RESET value is determined by EDK.
0x021C GIER R/W Global Interrupt Enable Register
0x0220 ISR R/W Interrupt Status Register; Read to determine the source of the interrupt, write to clear the interrupt
0x0228 IER R/W Interrupt Enable Register; 0 to mask out an interrupt, 1 to enable an interrupt
Table B-2: control Register
0x0000 control R/W
Name Bits Description
Reserved 31:3 Reserved
Timing_Gen_Enable 2 Timing generator enable. This bit enables the timing generator signals (vblank, hblank, active video) to pass through to the signals on the video scaler core.
Reg_Update_Enable 1 Register update enable. This bit tells the IP core to take new values at the next frame vblank rising edge. The registers that use this bit are 0x0010 through 0x0038. Usage: This bit is cleared when the next vblank of the IP core occurs.
Enable 0 Enable the Video Scaler core on the next video frame.
Table B-3: status Register
0x0004 status R
Name Bits Description
Reserved 31:1 Reserved
Coef_write_rdy 0 If this bit is '1', the coefficients can be written into the core. Check this bit at the beginning of a coefficient transfer.
Table B-4: status_error Register
0x0008 status_error R
Name Bits Description
Error_Code3 31:24 Error codes to be defined
Error_Code2 23:16 Error codes to be defined
Error_Code1 15:8 Error codes to be defined
Error_Code0 7:0 Error codes to be defined
Table B-5: status_done Register
0x000c status_done R/W
Name Bits Description
Reserved 31:24 Reserved
Reserved 23:16 Reserved
Reserved 15:8 Reserved
Reserved 7:1 Reserved
Done 0 Done bit. Can be polled by software to detect the end of the video scaler operation. Usage: This bit is cleared when any value is written to the register.
Table B-6: horz_shrink_factor Register
0x0010 horz_shrink_factor R/W
Name Bits Description
Reserved 31:24 Reserved
hsf_int 23:20 Horizontal Shrink Factor integer
hsf_frac 19:0 Horizontal Shrink Factor fractional
Table B-7: vert_shrink_factor Register
0x0014 vert_shrink_factor R/W
Name Bits Description
Reserved 31:24 Reserved
vsf_int 23:20 Vertical Shrink Factor integer
vsf_frac 19:0 Vertical Shrink Factor fractional
Table B-8: aperture_horz Register
0x0018 aperture_horz R/W
Name Bits Description
Reserved 31:29 Reserved
aperture_end_pixel 28:16 Location of last pixel in line
Reserved 15:13 Reserved
aperture_start_pixel 12:0 Location of first pixel in line
Table B-9: aperture_vert Register
0x001c aperture_vert R/W
Name Bits Description
Reserved 31:29 Reserved
aperture_end_line 28:16 Location of last line in active video
Reserved 15:13 Reserved
aperture_start_line 12:0 Location of first line in active video
Table B-10: output_size Register
0x0020 output_size R/W
Name Bits Description
Reserved 31:29 Reserved
output_v_size 28:16 Number of lines in output image
Reserved 15:13 Reserved
output_h_size 12:0 Number of pixels in output image
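The aperture and output size registers share one layout: a 13-bit field in bits 28:16 and a 13-bit field in bits 12:0. As a rough sketch (the helper name is hypothetical and not part of the supplied driver), both fields can be packed into a register word like this:

```c
#include <stdint.h>

#define FIELD13_MASK 0x1FFFu  /* 13-bit field, per Tables B-8 to B-10 */

/* Pack two 13-bit fields: 'hi' into bits 28:16 and 'lo' into bits 12:0.
 * Values wider than 13 bits are masked off. */
static uint32_t pack_hi_lo13(uint32_t hi, uint32_t lo)
{
    return ((hi & FIELD13_MASK) << 16) | (lo & FIELD13_MASK);
}
```

For a full 1280 x 720 input, aperture_horz would be pack_hi_lo13(1279, 0) and output_size for a 720-line, 1280-pixel output would be pack_hi_lo13(720, 1280).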
Table B-11: num_phases Register
0x0024 num_phases R/W
Name Bits Description
Reserved 31:15 Reserved
num_v_phases 14:8 Number of vertical phases
Reserved 7 Reserved
num_h_phases 6:0 Number of horizontal phases
Table B-12: coeff_sets Register
0x0028 coeff_sets R/W
Name Bits Description
Reserved 31:8 Reserved
vcoeffset 7:4 Active vertical coefficient set
hcoeffset 3:0 Active horizontal coefficient set
Table B-13: start_hpa_y Register
0x002c start_hpa_y R/W
Name Bits Description
Reserved 31:21 Reserved
start_hpa_y 20:0 Fractional value used to initialize horizontal accumulator for luma
Table B-14: start_hpa_c Register
0x0030 start_hpa_c R/W
Name Bits Description
Reserved 31:21 Reserved
start_hpa_c 20:0 Fractional value used to initialize horizontal accumulator for chroma
Table B-15: start_vpa_y Register
0x0034 start_vpa_y R/W
Name Bits Description
Reserved 31:21 Reserved
start_vpa_y 20:0 Fractional value used to initialize vertical accumulator for luma
Table B-16: start_vpa_c Register
0x0038 start_vpa_c R/W
Name Bits Description
Reserved 31:21 Reserved
start_vpa_c 20:0 Fractional value used to initialize vertical accumulator for chroma
Table B-17: Coefficient_write_set_address Register
0x003c coef_write_set_addr R/W
Name Bits Description
Reserved 31:4 Reserved
coef_write_set_addr 3:0 Coefficient bank to write, address
Table B-18: coef_values Register
0x0040 coef_values W
Name Bits Description
coef_value_N+1 31:16 Coefficient value N+1, where N is the index for the coefficient set. Usage: Each write to this register increments an internal counter by 2 to generate a coefficient set internal to the video scaler. LSB aligned for coefficients less than 16 bits.
coef_value_N 15:0 Coefficient value N, where N is the index for the coefficient set. Usage: Each write to this register increments an internal counter by 2 to generate a coefficient set internal to the video scaler. LSB aligned for coefficients less than 16 bits.
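Because each write to coef_values carries two coefficients, software has to pair them up before writing. The following C sketch shows one way to do that; the function name is hypothetical, and padding an odd-length list with zero is an assumption based on the convention that unused coefficients are set to zero.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack 16-bit coefficients two per 32-bit word for coef_values:
 * coefficient N in bits 15:0, coefficient N+1 in bits 31:16.
 * An odd trailing coefficient is paired with zero (assumption).
 * 'words' must have room for (count + 1) / 2 entries.
 * Returns the number of words produced. */
static size_t pack_coef_words(const int16_t *coefs, size_t count,
                              uint32_t *words)
{
    size_t n = 0;
    for (size_t i = 0; i < count; i += 2) {
        uint32_t lo = (uint16_t)coefs[i];
        uint32_t hi = (i + 1 < count) ? (uint16_t)coefs[i + 1] : 0u;
        words[n++] = (hi << 16) | lo;
    }
    return n;
}
```

Each resulting word would then be written to offset 0x0040 in order, after coef_write_set_addr has been set and Coef_write_rdy has been checked.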
Table B-19: Coefficient Set and Bank Read Address Register
0x0044 coef_set_bank_rd_addr R/W
Name Bits Description
Coeff Readback Set 11:8 Coefficient set to be read from the scaler
Coeff Readback Bank 1:0 Coefficient bank to be read from the scaler: 00=HY; 01=HC; 10=VY; 11=VC
Table B-20: Coefficient Phase and Tap Read Address Register
0x0048 coef_mem_rd_addr R/W
Name Bits Description
Coeff Readback Phase 13:8 Coefficient phase to be read from the scaler
Coeff Readback Tap 3:0 Coefficient tap to be read from the scaler
Table B-21: Coefficient Memory Readback Output Register
0x004c coef_mem_output R
Name Bits Description
Coeff Readback Output 15:0 Coefficient readout from the scaler
Table B-22: Version Register
0x00F0 Version R
Name Bits Description
HW Version 31:0 Hard-coded hardware version register
Table B-23: Software Reset Register
0x0100 Software_Reset W
Name Bits Description
Soft_Reset_Value 31:0 Soft reset to reset the registers and IP core. The data value is provided by the EDK Create Peripheral utility.
Table B-24: Global Interrupt Enable Register
0x021C GIER R/W
Name Bits Description
Reserved 31:1 Reserved
GIER 0 Global Interrupt Enable Register. Active High
Table B-25: Interrupt Status Register
0x0220 ISR R/W
Name Bits Description
Reserved 31:7 Reserved
intr_coef_mem_rdbk_rdy 6 Level sensitive: Output flag indicating that the specified coefficient bank is ready for reading.
intr_reg_update_done 5 Level sensitive: issued during Vertical blanking when the register values have been transferred to the active registers.
intr_coef_wr_error 4 Rising edge sensitive: issued if coefficient is written into coefficient FIFO when the FIFO is not ready.
intr_output_error 3 Rising edge sensitive: issued if frame period completes before full output frame has been delivered.
intr_input_error 2 Rising edge sensitive: issued if active_video_in is asserted before the scaler is ready to receive a new line.
intr_coef_fifo_rdy 1 Level sensitive: issued when the coefficient FIFO is ready to receive a coefficient for the current set. Stays low once a full set has been written into FIFO. Sent high during Vertical blanking.
intr_output_frame_done 0 Rising edge sensitive: issued once per complete output frame.
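An interrupt handler typically reads the ISR, separates error conditions from normal events, and writes to the register to clear the interrupt. The bit masks below follow the bit positions in the Interrupt Status Register table; the macro and function names are hypothetical and are not taken from the supplied driver.

```c
#include <stdint.h>

/* ISR/IER bit masks, by bit position (names are illustrative). */
#define INTR_OUTPUT_FRAME_DONE  (1u << 0)
#define INTR_COEF_FIFO_RDY      (1u << 1)
#define INTR_INPUT_ERROR        (1u << 2)
#define INTR_OUTPUT_ERROR       (1u << 3)
#define INTR_COEF_WR_ERROR      (1u << 4)
#define INTR_REG_UPDATE_DONE    (1u << 5)
#define INTR_COEF_MEM_RDBK_RDY  (1u << 6)

/* The three rising-edge error sources grouped together. */
#define INTR_ERROR_MASK \
    (INTR_INPUT_ERROR | INTR_OUTPUT_ERROR | INTR_COEF_WR_ERROR)

/* Return only the error bits that are set in an ISR value. */
static uint32_t isr_errors(uint32_t isr)
{
    return isr & INTR_ERROR_MASK;
}

/* Return 1 if the output-frame-done bit is set, otherwise 0. */
static int isr_frame_done(uint32_t isr)
{
    return (isr & INTR_OUTPUT_FRAME_DONE) != 0u;
}
```

The same masks can be written to the IER to enable the corresponding interrupts.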
Table B-26: Interrupt Enable Register
0x0228 IER R/W
Name Bits Description
Reserved 31:7 Reserved
intr_coef_mem_rdbk_rdy 6 Mask or Enable interrupt for intr_coef_mem_rdbk_rdy
intr_reg_update_done 5 Mask or Enable interrupt for intr_reg_update_done
intr_coef_wr_error 4 Mask or Enable interrupt for intr_coef_wr_error
intr_output_error 3 Mask or Enable interrupt for intr_output_error
intr_input_error 2 Mask or Enable interrupt for intr_input_error
intr_coef_fifo_rdy 1 Mask or Enable interrupt for intr_coef_fifo_rdy
intr_output_frame_done 0 Mask or Enable interrupt for intr_output_frame_done
Filter Coefficient Calculations
The values for the filter coefficients can be calculated with any standard digital filter tool. MATLAB® software provides a tool box for establishing the filter coefficients once the cutoff frequency is known from the scale factor. Sharp cutoff frequencies are generally not desired in image processing because of the ringing artifacts generated at sharp transitions. Additionally, allowing some amount of aliasing can be subjectively preferred in side-by-side comparisons. The MATLAB software FIR1 function can be used as a starting point for deriving coefficient values.
Xilinx provides a C-Model that generates coefficients. Contact Xilinx support for information on how to obtain this C-Model. Refer to the Video Scaler Product Page for information about accessing the C-Model.
Video Scaler Flow Diagram
[Flow chart blocks: Start Scaling; Initialize Registers; Set Load Coef Bank; Load Coefs; Set Active Coef Bank; Enable Video Scaler (Control bit 0 = 1); New Scale Factors? (Y: Initialize Registers HSF, VSF, Output_h/v; Set Active Coef Bank); New Coef Bank? (Y: Set Load Coef Bank; Load Coefs); Done? (Y: Disable Scaler, Control bit 0 = 0); Stop Scaling]
Figure B-0: Video Scaler Flow Chart
System Timing Diagram
Figure B-0: System Timing Diagram
Proposed API function calls
The following functions are proposed for the L0, L1, and L2 APIs.
L0 API Function Calls
#define XScaler_Enable(InstancePtr)
#define XScaler_Disable(InstancePtr)
#define XScaler_Reset(InstancePtr)
#define XScaler_GetStatus(InstancePtr)
#define XScaler_CheckDone(InstancePtr)
#define XScaler_SetHoriShrinkFactor(InstancePtr, Integer, Fractional)
#define XScaler_GetHoriShrinkFactor(InstancePtr)
#define XScaler_SetVertShrinkFactor(InstancePtr, Integer, Fractional)
#define XScaler_GetVertShrinkFactor(InstancePtr)
#define XScaler_SetHoriAperture(InstancePtr, FirstPixel, LastPixel)
#define XScaler_GetHoriAperture(InstancePtr)
#define XScaler_SetVertAperture(InstancePtr, FirstLine, LastLine)
#define XScaler_GetVertAperture(InstancePtr)
#define XScaler_SetOutputSize(InstancePtr, Lines, Pixels)
#define XScaler_GetOutputSize(InstancePtr)
#define XScaler_SetNumPhases(InstancePtr, Vert, Hori)
#define XScaler_GetNumPhases(InstancePtr)
#define XScaler_SetCoeffSet(InstancePtr, Vert, Hori)
#define XScaler_GetCoeffSet(InstancePtr)
#define XScaler_SetHoriAccuLuma(InstancePtr, Fraction)
#define XScaler_GetHoriAccuLuma(InstancePtr)
#define XScaler_SetVertAccuLuma(InstancePtr, Fraction)
#define XScaler_GetVertAccuLuma(InstancePtr)
#define XScaler_SetHoriAccuChroma(InstancePtr, Fraction)
#define XScaler_GetHoriAccuChroma(InstancePtr)
#define XScaler_SetVertAccuChroma(InstancePtr, Fraction)
#define XScaler_GetVertAccuChroma(InstancePtr)
#define XScaler_SetWriteCoeffBankAddr(InstancePtr, Address)
#define XScaler_GetWriteCoeffBankAddr(InstancePtr)
#define XScaler_SetCoefValue(InstancePtr, NPlus1, N)
#define XScaler_GetCoefValue(InstancePtr)
L1 API Function Calls
#define XScaler_CalcCoeffs(coeffs, scale, taps, phases, coeff_precision)
software function, no registers written
#define XScaler_WriteCoeffValues(InstancePtr, coeffs, coeff_bank)
sets coef_write_set_addr and writes consecutively coef_values
#define XScaler_CalcScaleFactors(InstancePtr, hsf, vsf, input_h, input_v, output_h, output_v)
software function, no registers written
#define XScaler_SetActiveCoeffBank(InstancePtr, coeff_bank)
sets active register coeff_sets
#define XScaler_SetScalerValues(InstancePtr, reg_data_structure)
This is the main video scaler function call, used on a per-frame basis when the shrink factor changes every frame, such as in zooming applications.
The mandatory registers that need to change for a new shrink factor are:
horz_shrink_factor
vert_shrink_factor
output_size
Optionally, these registers may also need to be modified depending on the input resolution and user preference:
aperture_horz
aperture_vert
num_phases
coeff_sets
start_hpa_y
start_vpa_y
start_hpa_c
start_vpa_c
L2 API Function Calls
#define XScaler_Zoom(InstancePtr, zoom_factor_h, zoom_factor_v, starting_aperture_h, ending_aperture_h, starting_aperture_v, ending_aperture_v, output_h, output_v, num_of_frames)
In a zoom operation the input image size is changing on a frame basis and the output resolution is fixed.
Calls XScaler_CalcScaleFactors and XScaler_SetScalerValues every frame to perform the zoom function. Before beginning the zoom operation, you must preload the coefficient banks you want to use for the duration and decide when to transition to a new coefficient bank. For example, with 4 coefficient banks and 200 frames, switch banks every 50 frames.
#define XScaler_DownSize(InstancePtr, downsize_factor_h, downsize_factor_v, num_of_frames)
In a downsize operation, the input image size is not changing on a frame basis and the output resolution is changing.
Calls XScaler_CalcScaleFactors and XScaler_SetScalerValues every frame to perform the downsize function. Before beginning the downsize operation, you must preload the coefficient banks you want to use for the duration and decide when to transition to a new coefficient bank. For example, with 4 coefficient banks and 200 frames, switch banks every 50 frames.
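The bank-switching schedule mentioned for the zoom and downsize operations can be sketched as a small helper. The function name and the even split of frames across banks are assumptions for illustration; an application may prefer to switch banks at scale-factor thresholds instead.

```c
#include <stdint.h>

/* Choose which preloaded coefficient bank to use for a given frame,
 * dividing num_frames evenly across num_banks (rounding up).
 * Assumes num_banks > 0 and num_frames > 0. Frames past the end of
 * the schedule stay on the last bank. */
static uint32_t coeff_bank_for_frame(uint32_t frame, uint32_t num_frames,
                                     uint32_t num_banks)
{
    uint32_t frames_per_bank = (num_frames + num_banks - 1u) / num_banks;
    uint32_t bank = frame / frames_per_bank;
    return (bank < num_banks) ? bank : num_banks - 1u;
}
```

With 200 frames and 4 banks, frames 0 to 49 use bank 0, frames 50 to 99 use bank 1, and so on. The returned value is what an L2 function would pass to XScaler_SetActiveCoeffBank.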
Example Settings
The following examples illustrate settings for different scale factors.
Pass Thru
Table B-27 is an example of pass thru of a 1280 x 720 resolution image.
Table B-27: Pass Through Register Settings
Address Name Decimal Value
0x0000 control 7
0x0010 hsf 1048576
0x0014 vsf 1048576
0x0018 aperture_start_pixel 0
0x0018 aperture_end_pixel 1279
0x001c aperture_start_line 0
0x001c aperture_end_line 719
0x0020 Output_h_size 1280
0x0020 Output_v_size 720
0x0024 num_h_phases 4
0x0024 num_v_phases 4
0x0028 h_coeff_set 0
0x0028 v_coeff_set 0
0x002c start_hpa_y 0
0x0030 start_hpa_c 0
0x0034 start_vpa_y 0
0x0038 start_vpa_c 0
0x003c Coef_set_write_addr 0
0x0040 Coef_values See Chapter 8, Coefficients
Down Sample by 2 in Horizontal and Vertical
Table B-28 is an example of scaling down a 1280 x 720 resolution image by a factor of 2 horizontally and vertically to 640 x 360.
Table B-28: Down Sample Register Settings
Address Name Decimal Value
0x0000 control 7
0x0010 hsf 2097152
0x0014 vsf 2097152
0x0018 aperture_start_pixel 0
0x0018 aperture_end_pixel 1279
0x001c aperture_start_line 0
0x001c aperture_end_line 719
0x0020 Output_h_size 640
0x0020 Output_v_size 360
0x0024 num_h_phases 4
0x0024 num_v_phases 4
0x0028 h_coeff_set 0
0x0028 v_coeff_set 0
0x002c start_hpa_y 0
0x0030 start_hpa_c 0
0x0034 start_vpa_y 0
0x0038 start_vpa_c 0
0x003c Coef_set_write_addr 0
0x0040 Coef_values See Chapter 8, Coefficients
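The example settings can be turned into a register write sequence. The sketch below is illustrative only: reg_write_fn, shadow_write, and program_full_frame_scale are hypothetical names, the write order is an assumption, coefficient loading is omitted, and the shadow array merely stands in for real bus access. The values written correspond to the rows of the example tables, with control = 7 setting Enable, Reg_Update_Enable, and Timing_Gen_Enable.

```c
#include <stdint.h>

/* Register offsets from Table B-1. */
#define REG_CONTROL        0x0000u
#define REG_HSF            0x0010u
#define REG_VSF            0x0014u
#define REG_APERTURE_HORZ  0x0018u
#define REG_APERTURE_VERT  0x001cu
#define REG_OUTPUT_SIZE    0x0020u
#define REG_NUM_PHASES     0x0024u
#define REG_COEFF_SETS     0x0028u

/* Hypothetical bus-write hook; a real system supplies its own. */
typedef void (*reg_write_fn)(void *ctx, uint32_t offset, uint32_t value);

/* Stand-in writer: stores values in a shadow array at offset / 4. */
static void shadow_write(void *ctx, uint32_t offset, uint32_t value)
{
    ((uint32_t *)ctx)[offset / 4u] = value;
}

/* Scale the full input frame (aperture starts at 0,0) to out_w x out_h
 * using coefficient set 0 and 4 phases, as in the example tables. */
static void program_full_frame_scale(reg_write_fn wr, void *ctx,
                                     uint32_t in_w, uint32_t in_h,
                                     uint32_t out_w, uint32_t out_h)
{
    wr(ctx, REG_HSF, (uint32_t)(((uint64_t)in_w << 20) / out_w));
    wr(ctx, REG_VSF, (uint32_t)(((uint64_t)in_h << 20) / out_h));
    wr(ctx, REG_APERTURE_HORZ, (in_w - 1u) << 16);  /* end pixel, start 0 */
    wr(ctx, REG_APERTURE_VERT, (in_h - 1u) << 16);  /* end line, start 0 */
    wr(ctx, REG_OUTPUT_SIZE, (out_h << 16) | out_w);
    wr(ctx, REG_NUM_PHASES, (4u << 8) | 4u);        /* v phases, h phases */
    wr(ctx, REG_COEFF_SETS, 0u);
    for (uint32_t off = 0x002cu; off <= 0x0038u; off += 4u)
        wr(ctx, off, 0u);                           /* start_hpa/vpa = 0 */
    wr(ctx, REG_CONTROL, 0x7u);                     /* enable bits 2:0 */
}
```

Calling program_full_frame_scale(shadow_write, regs, 1280, 720, 640, 360) with a 16-entry regs array reproduces the down-sample settings; passing 1280 and 720 for the output reproduces the pass-through settings.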
Appendix C
System Level Design
Introduction
This appendix provides an example system that includes the video scaler core. Important system level aspects when designing with the video scaler are highlighted, including:
Video scaler usage with the VDMA/VFBC/MPMC or other memory interface/controller
Inclusion of the video scaler in an EDK project
Typical usage of video scaler in conjunction with other cores
System level distribution of video timing and genlock signals
Example System General Configuration
The system input and output is expected to be no larger than 720P (1280Hx720V), with a maximum pixel frequency of 74.25 MHz, with equivalent clocks.
MicroBlaze controls scale factors according to user input
The system can upscale or downscale
When down scaling, the full input image is scaled down and placed in the center of a black 720P background and displayed
When upscaling, the center of the 720P input image is cropped from memory and upscaled to 720P, and displayed as a full 720P image on the output
Operational clock frequencies are derived from the input clock
Figure C-1 shows a typical example of the video scaler in memory mode incorporated into a larger system. Here are the essential details:
• The Multiport Memory Controller (MPMC) represents the memory access point for multiple IP blocks. The MPMC ports are configured as Video Frame Buffer Controllers (VFBC), which allow the user to access data in rectangular fashion, making it simple to store frames of data, and access portions of any frame. This configuration is useful for cropping an area in preparation for upscaling (for example). See the MPMC Data Sheet for more information.
• The Video Direct Memory Access (VDMA) blocks simplify the VFBC interface, and act as a SW-controllable processor peripheral. See the VDMA Data Sheet for more information.
• The Timebase Controller is a SW-configurable timing detector and generator block, which generates timing signals for distribution around the system. See the Timing Controller Data Sheet for more information.
• The On-Screen Display (OSD) block aligns the data read from memory with the timing signals and presents it as a standard-format video data stream. It also alpha-blends multiple layers of information (e.g. text, other video data). See the OSD Data Sheet for more information.
Figure C-1: Simplified System Diagram
Control Buses
In this example, MicroBlaze is configured to use the PLB v4.6. The VDMAs sit on the PLB bus directly. The Video Scaler, Timing Controller, and OSD use AXI4-Lite. The PLB-to-AXI bridge facilitates the transition between PLB and AXI buses.
VDMA0 Configuration
VDMA0 is used unidirectionally, for writing input data into the memory. Normally, this should be configured as a write-only core (C_DMA_TYPE = 0). However, it is currently configured as a bidirectional core (C_DMA_TYPE = 2) to work around an issue in the VDMA design. The read side of this core is not connected, except for the read-side clock.
The system operates using a Genlock mechanism. A rotational 5-frame buffer is defined in the external memory. Using the Genlock bus vdma_0_XIL_WD_MGENLOCK, VDMA0 communicates to VDMA1 which of the five frame locations is being written, to avoid R/W collisions.
VDMA0, in the MHS file text given below, is sourced from an engineering test-pattern generator (not included in the MHS file below). This generates a VDMA write bus that connects directly to the VDMA write port.
VDMA1 Configuration
VDMA1 is bidirectional, used for reading the original frames from memory, and writing the scaled frame back to memory.
The system operates using a Genlock mechanism. A second rotational 5-frame buffer is defined in the external memory. VDMA1 communicates to VDMA2 which frame it is writing to, using the Genlock bus vdma_1_XIL_WD_MGENLOCK.
VDMA1, in the MHS file text below, interfaces with the video scaler via a VDMA read bus (scaler input) and VDMA write bus (scaler output).
VDMA2 Configuration
VDMA2 is unidirectional, and is configured that way. It is used for reading the scaled frame from memory in order to display it. It is a Genlock slave to VDMA1.
Video Scaler Configuration
The video scaler is configured as follows:
single-engine 4:2:2
11Hx11V-taps
64 phases
shared YC coefficients
Its core uses a 148.5 MHz derivative of the 74.25 MHz input clock.
MPMC Configuration
The MPMC is configured to have three VFBC ports. Each port includes a FIFO. The FIFOs are configured to be 2048 pixels in length. This is especially important for VDMA1, which handles video data to/from the video scaler. The video scaler arbitrates on a line-by-line basis. It does this by analyzing the status of the rd_almost_empty and wd_almost_full flags on the VDMA buses before reading or writing one line, but never analyzes these flags once a line-read or line-write operation has commenced. This is described in detail in the main text of this user guide. The guidelines for this port are described in the following two sections.
Scaler READ-port
For the port that feeds data into the video scaler, ensure that there is a FIFO of a size equal to or greater than the maximum line length anticipated to be scaled by the scaler. Ideally, set this to the next power of 2 above the maximum input line length.
For this example, the max line length is 1280, so the FIFO has been set to 2048 pixels.
For systems like the VFBC, which have a FIXED threshold for the ALMOST full/empty flags, set this value to the maximum input line-length. This ensures that the rd_almost_empty flag will not be driven low until an entire line of video data is in the FIFO, ready for the scaler to accept.
Scaler WRITE-port
For the port that feeds from the video scaler out to the memory, ensure that there is a FIFO of a size equal to or greater than the maximum line length anticipated to be output by the scaler. Ideally, set this to the next power of 2 above the maximum output line length. For this example, the max line length is 1280, so the FIFO has been set to 2048 pixels.
For systems like the VFBC, which have a FIXED threshold for the ALMOST full/empty flags, set this value to the maximum output line-length. This ensures that the wd_almost_full flag will not be driven low until there is sufficient space in the FIFO for an entire line of video data.
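The FIFO sizing guideline for both ports (the next power of 2 above the maximum line length) can be expressed as a small helper. This is only a sketch with a hypothetical function name; it interprets "above" strictly, so a line length that is already a power of 2 maps to the next one up.

```c
#include <stdint.h>

/* Smallest power of 2 strictly greater than max_line_length.
 * Returns 0 if the result would not fit in 32 bits. */
static uint32_t fifo_depth_for_line(uint32_t max_line_length)
{
    if (max_line_length >= 0x80000000u)
        return 0u;
    uint32_t depth = 1u;
    while (depth <= max_line_length)
        depth <<= 1;
    return depth;
}
```

For the 1280-pixel lines in this example system, the helper gives 2048, matching the configured FIFO depth.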
Cropping from Memory
Controlling the VDMA dynamically (e.g., from MicroBlaze or other processor) allows you to request any rectangle from anywhere in the image in memory, and change the position and dimensions of this rectangle on a frame-by-frame basis. One complication of doing this with the VFBC is that the FIFO almost full/empty thresholds are FIXED at compile-time. According to the guidelines above, it is necessary to set the thresholds to the maximum line length. Yet, when cropping from memory, you will be requesting a rectangle of a smaller width than the maximum line length. Consequently, the final lines may not be read from memory correctly, resulting in some distortion at the bottom of the image.
To work around this issue, it is necessary, and safe, to request more lines than you want to scale. This keeps the FIFO topped up with data. This can be achieved by setting the VDMA Read Vsize register (address offset 0x28) to a number greater than you want. See the VDMA Data Sheet for more information. The scaler register settings should not be set differently to your desired values.
OSD Configuration
The OSD is configured for two layers. The first layer is video data read from VDMA2. The second layer is text overlay.
EDK MHS File Text
The following is an example EDK MHS file insert for the system described.
Note: This is NOT a complete design, but provides some idea as to the construction of a video scaler system in EDK.
BEGIN vdma
 PARAMETER INSTANCE = vdma_0
 PARAMETER HW_VER = 1.01.a
 PARAMETER C_MPMC_BASEADDR = 0x10000000
 PARAMETER C_MPMC_HIGHADDR = 0x1fffffff
 PARAMETER C_GEN_RESET = 1
 PARAMETER C_DATA_WIDTH = 16
 PARAMETER C_NUM_FSTORES = 5
 PARAMETER C_CROP_ENABLE = 1
 PARAMETER C_DMA_TYPE = 2
 PARAMETER C_BASEADDR = 0xcb480000
 PARAMETER C_HIGHADDR = 0xcb48ffff
 BUS_INTERFACE SPLB = mb_plb
 BUS_INTERFACE XIL_VFBC = vdma_0_XIL_VFBC
 BUS_INTERFACE XIL_WD_VDMA = tpg_0_XIL_VDMA_TPG_OUT
 PORT IP2INTC_Irpt = vdma_0_IP2INTC_Irpt
 PORT m_wd_frame_ptr_out = vdma_0_XIL_WD_MGENLOCK
 PORT vdma_wcmd_clk = vid_in_clk
 PORT vdma_wd_clk = vid_in_clk
 PORT vdma_rcmd_clk = vid_in_clk
 PORT vdma_rd_clk = vid_in_clk
END
BEGIN timebase
 PARAMETER INSTANCE = timebase_1
 PARAMETER HW_VER = 3.00.a
 PARAMETER C_BASEADDR = 0xc3800000
 PARAMETER C_HIGHADDR = 0xc380ffff
 PARAMETER C_MAX_LINES = 1024
 PARAMETER C_INTERCONNECT_S_AXI_MASTERS = plbv46_axi_bridge_0.M_AXI
 BUS_INTERFACE S_AXI = axi_interconnect_0
 BUS_INTERFACE XSVI_OUT = timebase_1_XSVI_OUT
 PORT ce = net_vcc
 PORT video_clk_in = vid_in_clk
 PORT fsync_o = timebase_1_fsync
 PORT IP2INTC_Irpt = timebase_1_IP2INTC_Irpt
 PORT S_AXI_ACLK = clk_100_0000MHzMMCM0
END
BEGIN vdma PARAMETER INSTANCE = vdma_1 PARAMETER HW_VER = 1.01.a PARAMETER C_MPMC_BASEADDR = 0x10000000 PARAMETER C_MPMC_HIGHADDR = 0x1FFFFFFF PARAMETER C_DATA_WIDTH = 16 PARAMETER C_NUM_FSTORES = 5 PARAMETER C_DMA_TYPE = 2 PARAMETER C_CROP_ENABLE = 1 PARAMETER C_BASEADDR = 0xcb460000 PARAMETER C_HIGHADDR = 0xcb46ffff BUS_INTERFACE SPLB = mb_plb BUS_INTERFACE XIL_VFBC = vdma_1_XIL_VFBC BUS_INTERFACE XIL_RD_VDMA = scaler_0_XIL_VDMA_SCALER_IN BUS_INTERFACE XIL_WD_VDMA = scaler_0_XIL_VDMA_SCALER_OUT BUS_INTERFACE XIL_RD_SGENLOCK1 = vdma_0_XIL_WD_MGENLOCK BUS_INTERFACE XIL_WD_MGENLOCK = vdma_1_XIL_WD_MGENLOCK PORT IP2INTC_Irpt = vdma_1_IP2INTC_Irpt END
BEGIN axi_scaler
 PARAMETER INSTANCE = scaler_0
 PARAMETER HW_VER = 4.00.a
 PARAMETER C_SEPARATE_YC_COEFS = 0
 PARAMETER C_MAX_SAMPLES_OUT_PER_LINE = 1280
 PARAMETER C_MAX_PHASES = 64
 PARAMETER C_INIT_COEF_SOURCE = 1
 PARAMETER C_YC_FILTER_CONFIG = 1
 PARAMETER C_BASEADDR = 0xc3400000
 PARAMETER C_HIGHADDR = 0xc340ffff
 PARAMETER C_NUMBER_OF_H_TAPS = 11
 PARAMETER C_NUMBER_OF_V_TAPS = 11
 PARAMETER C_MAX_COEF_SETS = 16
 PARAMETER C_SEPARATE_HV_COEFS = 1
 PARAMETER C_INTERCONNECT_S_AXI_MASTERS = plbv46_axi_bridge_0.M_AXI
 BUS_INTERFACE XIL_VDMA_SCALER_IN = scaler_0_XIL_VDMA_SCALER_IN
 BUS_INTERFACE XIL_VDMA_SCALER_OUT = scaler_0_XIL_VDMA_SCALER_OUT
 BUS_INTERFACE S_AXI = axi_interconnect_0
 PORT S_AXI_ACLK = clk_100_0000MHzMMCM0
 PORT clk = vid_in_clkx2
 PORT video_in_clk = vid_in_clk
 PORT video_out_clk = vid_in_clk
 PORT debug = xscaler_0_LEDsOut
 PORT IP2INTC_Irpt = scaler_0_IP2INTC_Irpt
 PORT vsync_i = timebase_1_XSVI_OUT_vsync
END
BEGIN vdma
 PARAMETER INSTANCE = vdma_2
 PARAMETER HW_VER = 1.01.a
 PARAMETER C_MPMC_BASEADDR = 0x10000000
 PARAMETER C_MPMC_HIGHADDR = 0x1fffffff
 PARAMETER C_USE_FSYNC = 1
 PARAMETER C_DMA_TYPE = 1
 PARAMETER C_GEN_RESET = 1
 PARAMETER C_DATA_WIDTH = 16
 PARAMETER C_NUM_FSTORES = 5
 PARAMETER C_BASEADDR = 0xcb420000
 PARAMETER C_HIGHADDR = 0xcb42ffff
 PARAMETER C_CROP_ENABLE = 1
 BUS_INTERFACE SPLB = mb_plb
 BUS_INTERFACE XIL_RD_SGENLOCK1 = vdma_1_XIL_WD_MGENLOCK
 BUS_INTERFACE XIL_VFBC = vdma_2_XIL_VFBC
 BUS_INTERFACE XIL_RD_VDMA = osd_0_XIL_RD0_VFBC
 PORT fsync = timebase_1_fsync
 PORT IP2INTC_Irpt = vdma_2_IP2INTC_Irpt
END
BEGIN axi_osd
 PARAMETER INSTANCE = osd_0
 PARAMETER HW_VER = 2.00.a
 PARAMETER C_LAYER1_TYPE = 1
 PARAMETER C_NUM_LAYERS = 2
 PARAMETER C_LAYER1_IMEM_SIZE = 96
 PARAMETER C_NUM_DATA_CHANNELS = 2
 PARAMETER C_ALPHA_CHANNEL_EN = 0
 PARAMETER C_LAYER2_TYPE = 2
 PARAMETER C_BASEADDR = 0xc3a00000
 PARAMETER C_HIGHADDR = 0xc3a0ffff
 PARAMETER C_INTERCONNECT_S_AXI_MASTERS = plbv46_axi_bridge_0.M_AXI
 PARAMETER C_OUTPUT_MODE = 1
 BUS_INTERFACE XSVI_IN = timebase_1_XSVI_OUT
 BUS_INTERFACE XSVI_OUT = osd_0_XSVI_OUT
 BUS_INTERFACE XIL_RD0_VFBC = osd_0_XIL_RD0_VFBC
 BUS_INTERFACE S_AXI = axi_interconnect_0
 PORT S_AXI_ACLK = clk_100_0000MHzMMCM0
 PORT clk = vid_in_clk
 PORT IP2INTC_Irpt = osd_0_IP2INTC_Irpt
END
BEGIN microblaze
 PARAMETER INSTANCE = microblaze_0
 PARAMETER HW_VER = 7.30.b
 PARAMETER C_DEBUG_ENABLED = 1
 PARAMETER C_ICACHE_BASEADDR = 0x10000000
 PARAMETER C_ICACHE_HIGHADDR = 0x1fffffff
 PARAMETER C_CACHE_BYTE_SIZE = 16384
 PARAMETER C_ICACHE_ALWAYS_USED = 1
 PARAMETER C_DCACHE_BASEADDR = 0x10000000
 PARAMETER C_DCACHE_HIGHADDR = 0x1fffffff
 PARAMETER C_DCACHE_BYTE_SIZE = 16384
 PARAMETER C_DCACHE_ALWAYS_USED = 1
 PARAMETER C_USE_ICACHE = 1
 PARAMETER C_USE_DCACHE = 1
 PARAMETER C_USE_BARREL = 1
 PARAMETER C_DPLB_BUS_EXCEPTION = 1
 PARAMETER C_IPLB_BUS_EXCEPTION = 1
 PARAMETER C_ILL_OPCODE_EXCEPTION = 1
 PARAMETER C_UNALIGNED_EXCEPTIONS = 1
 PARAMETER C_OPCODE_0x0_ILLEGAL = 1
 PARAMETER C_USE_HW_MUL = 2
 PARAMETER C_USE_DIV = 1
 PARAMETER C_DIV_ZERO_EXCEPTION = 1
 PARAMETER C_ICACHE_LINE_LEN = 8
 PARAMETER C_USE_MMU = 3
 PARAMETER C_MMU_ZONES = 2
 PARAMETER C_PVR = 2
 BUS_INTERFACE DPLB = mb_plb
 BUS_INTERFACE IPLB = mb_plb
 BUS_INTERFACE DXCL = microblaze_0_DXCL
 BUS_INTERFACE IXCL = microblaze_0_IXCL
 BUS_INTERFACE DEBUG = microblaze_0_mdm_bus
 BUS_INTERFACE DLMB = dlmb
 BUS_INTERFACE ILMB = ilmb
 PORT MB_RESET = mb_reset
 PORT INTERRUPT = xps_intc_0_Irq
END
BEGIN plb_v46
 PARAMETER INSTANCE = mb_plb
 PARAMETER HW_VER = 1.05.a
 PORT PLB_Clk = clk_100_0000MHzMMCM0
 PORT SYS_Rst = sys_bus_reset
END
BEGIN lmb_v10
 PARAMETER INSTANCE = ilmb
 PARAMETER HW_VER = 1.00.a
 PORT LMB_Clk = clk_100_0000MHzMMCM0
 PORT SYS_Rst = sys_bus_reset
END
BEGIN lmb_v10
 PARAMETER INSTANCE = dlmb
 PARAMETER HW_VER = 1.00.a
 PORT LMB_Clk = clk_100_0000MHzMMCM0
 PORT SYS_Rst = sys_bus_reset
END
BEGIN lmb_bram_if_cntlr
 PARAMETER INSTANCE = dlmb_cntlr
 PARAMETER HW_VER = 2.10.b
 PARAMETER C_BASEADDR = 0x00000000
 PARAMETER C_HIGHADDR = 0x00001fff
 BUS_INTERFACE SLMB = dlmb
 BUS_INTERFACE BRAM_PORT = dlmb_port
END
BEGIN lmb_bram_if_cntlr
 PARAMETER INSTANCE = ilmb_cntlr
 PARAMETER HW_VER = 2.10.b
 PARAMETER C_BASEADDR = 0x00000000
 PARAMETER C_HIGHADDR = 0x00001fff
 BUS_INTERFACE SLMB = ilmb
 BUS_INTERFACE BRAM_PORT = ilmb_port
END
BEGIN bram_block
 PARAMETER INSTANCE = lmb_bram
 PARAMETER HW_VER = 1.00.a
 BUS_INTERFACE PORTA = ilmb_port
 BUS_INTERFACE PORTB = dlmb_port
END
BEGIN axi_uartlite
 PARAMETER INSTANCE = RS232_Uart_1
 PARAMETER C_BAUDRATE = 9600
 PARAMETER C_DATA_BITS = 8
 PARAMETER C_USE_PARITY = 0
 PARAMETER C_ODD_PARITY = 0
 PARAMETER HW_VER = 1.01.a
 PARAMETER C_BASEADDR = 0x83000000
 PARAMETER C_HIGHADDR = 0x8300ffff
 PARAMETER C_INTERCONNECT_S_AXI_MASTERS = plbv46_axi_bridge_0.M_AXI
 BUS_INTERFACE S_AXI = axi_interconnect_0
 PORT RX = fpga_0_RS232_Uart_1_RX_pin
 PORT TX = fpga_0_RS232_Uart_1_TX_pin
 PORT Interrupt = RS232_Uart_1_Interrupt
 PORT S_AXI_ACLK = clk_100_0000MHzMMCM0
END
BEGIN mpmc
 PARAMETER INSTANCE = DDR3_SDRAM
 PARAMETER HW_VER = 6.03.a
 PARAMETER C_NUM_PORTS = 6
 PARAMETER C_MMCM_EXT_LOC = MMCM_ADV_X0Y9
 PARAMETER C_MEM_TYPE = DDR3
 PARAMETER C_MEM_PARTNO = MT4JSF6464HY-1G1
 PARAMETER C_MEM_ODT_TYPE = 1
 PARAMETER C_MEM_REG_DIMM = 0
 PARAMETER C_MEM_CLK_WIDTH = 1
 PARAMETER C_MEM_CE_WIDTH = 1
 PARAMETER C_MEM_CS_N_WIDTH = 1
 PARAMETER C_MEM_DATA_WIDTH = 32
 PARAMETER C_MEM_NDQS_COL0 = 3
 PARAMETER C_MEM_NDQS_COL1 = 1
 PARAMETER C_MEM_DQS_LOC_COL0 = 0x000000000000000000000000000000020100
 PARAMETER C_MEM_DQS_LOC_COL1 = 0x000000000000000000000000000000000003
 PARAMETER C_IODELAY_GRP = DDR3_SDRAM
 PARAMETER C_MPMC_CLK0_PERIOD_PS = 5000
 PARAMETER C_ARB0_ALGO = CUSTOM
 PARAMETER C_ARB0_NUM_SLOTS = 2
 PARAMETER C_ARB0_SLOT0 = 425310
 PARAMETER C_ARB0_SLOT1 = 423510
 # PIM0 (XCL)
 PARAMETER C_PIM0_BASETYPE = 1
 PARAMETER C_XCL0_B_IN_USE = 1
 # PIM1 (Video Input)
 PARAMETER C_PIM1_BASETYPE = 6
 PARAMETER C_PIM1_DATA_WIDTH = 64
 PARAMETER C_PI1_RD_FIFO_TYPE = DISABLED
 PARAMETER C_PI1_WR_FIFO_TYPE = SRL
 PARAMETER C_VFBC1_RDWD_DATA_WIDTH = 16
 PARAMETER C_VFBC1_RDWD_FIFO_DEPTH = 2048
 PARAMETER C_VFBC1_RD_AEMPTY_WD_AFULL_COUNT = 20
 # PIM2 (Scaler IO)
 PARAMETER C_PIM2_DATA_WIDTH = 64
 PARAMETER C_PIM2_BASETYPE = 6
 PARAMETER C_VFBC2_RDWD_DATA_WIDTH = 16
 PARAMETER C_VFBC2_RDWD_FIFO_DEPTH = 2048
 PARAMETER C_PI2_WR_FIFO_TYPE = SRL
 PARAMETER C_PI2_RD_FIFO_TYPE = SRL
 PARAMETER C_VFBC2_RD_AEMPTY_WD_AFULL_COUNT = 20
 # PIM3 (OSD1 - Scaled Video Output)
 PARAMETER C_PIM3_BASETYPE = 6
 PARAMETER C_PIM3_DATA_WIDTH = 64
 PARAMETER C_PI3_RD_FIFO_TYPE = SRL
 PARAMETER C_PI3_WR_FIFO_TYPE = DISABLED
 PARAMETER C_VFBC3_RDWD_DATA_WIDTH = 16
 PARAMETER C_VFBC3_RDWD_FIFO_DEPTH = 2048
 PARAMETER C_VFBC3_RD_AEMPTY_WD_AFULL_COUNT = 20
 # DDR3 Parameters
 PARAMETER C_MPMC_BASEADDR = 0x10000000
 PARAMETER C_MPMC_HIGHADDR = 0x1FFFFFFF
 BUS_INTERFACE XCL0 = microblaze_0_IXCL
 BUS_INTERFACE XCL0_B = microblaze_0_DXCL
 BUS_INTERFACE VFBC1 = vdma_0_XIL_VFBC
 BUS_INTERFACE VFBC2 = vdma_1_XIL_VFBC
 BUS_INTERFACE VFBC3 = vdma_2_XIL_VFBC
 PORT MPMC_Clk0 = clk_200_0000MHzMMCM0
 PORT MPMC_Clk_200MHz = clk_200_0000MHzMMCM0
 PORT MPMC_Rst = sys_periph_reset
 PORT MPMC_Clk_Mem = clk_400_0000MHzMMCM0
 PORT MPMC_Clk_Rd_Base = clk_400_0000MHzMMCM0_nobuf_varphase
 PORT MPMC_DCM_PSEN = MPMC_DCM_PSEN
 PORT MPMC_DCM_PSINCDEC = MPMC_DCM_PSINCDEC
 PORT MPMC_DCM_PSDONE = MPMC_DCM_PSDONE
 PORT DDR3_Clk = fpga_0_DDR3_SDRAM_DDR3_Clk_pin
 PORT DDR3_Clk_n = fpga_0_DDR3_SDRAM_DDR3_Clk_n_pin
 PORT DDR3_CE = fpga_0_DDR3_SDRAM_DDR3_CE_pin
 PORT DDR3_CS_n = fpga_0_DDR3_SDRAM_DDR3_CS_n_pin
 PORT DDR3_ODT = fpga_0_DDR3_SDRAM_DDR3_ODT_pin
 PORT DDR3_RAS_n = fpga_0_DDR3_SDRAM_DDR3_RAS_n_pin
 PORT DDR3_CAS_n = fpga_0_DDR3_SDRAM_DDR3_CAS_n_pin
 PORT DDR3_WE_n = fpga_0_DDR3_SDRAM_DDR3_WE_n_pin