Xilinx LogiCORE IP Video Scaler v4.0 User Manual

LogiCORE™ IP Video Scaler v4.0

User Guide

UG805 March 1, 2011

Xilinx is providing this product documentation, hereinafter “Information,” to you “AS IS” with no warranty of any kind, express or implied. Xilinx makes no representation that the Information, or any particular implementation thereof, is free from any claims of infringement. You are responsible for obtaining any rights you may require for any implementation based on the Information. All specifications are subject to change without notice.

XILINX EXPRESSLY DISCLAIMS ANY WARRANTY WHATSOEVER WITH RESPECT TO THE ADEQUACY OF THE INFORMATION OR ANY IMPLEMENTATION BASED THEREON, INCLUDING BUT NOT LIMITED TO ANY WARRANTIES OR REPRESENTATIONS THAT THIS IMPLEMENTATION IS FREE FROM CLAIMS OF INFRINGEMENT AND ANY IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Except as stated herein, none of the Information may be copied, reproduced, distributed, republished, downloaded, displayed, posted, or transmitted in any form or by any means including, but not limited to, electronic, mechanical, photocopying, recording, or otherwise, without the prior written consent of Xilinx.

© Copyright 2009-2011 Xilinx, Inc. XILINX, the Xilinx logo, Artix, ISE, Kintex, Spartan, Virtex, and other designated brands included herein are trademarks of Xilinx in the United States and other countries. MATLAB is a registered trademark of The MathWorks, Inc. All other trademarks are the property of their respective owners.

Revision History

The following table shows the revision history for this document.

Date Version Revision

04/24/09 1.0 Initial Xilinx release.

09/16/09 2.0 Updated for core release version 2.0.

04/19/10 2.1 Updated for core release version 2.1.

09/21/10 3.0 Updated for core release version 3.0.

03/01/11 4.0 Updated for core release version 4.0.

Video Scaler v4.0 User Guide www.xilinx.com UG805 March 1, 2011

Table of Contents

Revision History
Schedule of Figures
Schedule of Tables

Preface: About This Guide
Guide Contents
Additional Resources
Conventions
Typographical
Online Document

Chapter 1: Introduction
About the Core
Recommended Experience
Additional Core Resources
Documentation
Technical Support
Providing Feedback
Core
Documentation
Nomenclature

Chapter 2: Overview

Chapter 3: Implementation
Basic Architecture
I/O Buffering, Clock Domains

Chapter 4: Video I/O Interface and Timing
Data Source: Live Video
Input Data and Timing Signals
General Input Handshaking Principles
Hblank_in Input
Vblank_in Input
Frame_rst Signal
Active_video_in Input
Data Source: Memory
Output Data and Timing Signals

Chapter 5: Scaler Architectures
Architecture Descriptions
Single-Engine for Sequential YC Processing
4:2:0 Special Requirements
Dual-Engine for Parallel YC Processing
Triple-Engine for RGB/4:4:4 Processing
GUI Operation

Chapter 6: Control Interface
Control Values
Constant (Fixed) Mode
General Purpose Processor (GPP) Interface
Coefficient Delivery for GPP Interface
EDK pCore Interface
Parameter Modification in CORE Generator
Scaler Software Driver
Coefficient Delivery for EDK pCore Interface
Interrupts

Chapter 7: Scaler Aperture
Input Aperture Definition
Cropping

Chapter 8: Coefficients
Coefficient Table
Coefficient Interface
Examples of Coefficient Set Generation and Loading
Example 1: Num_h_taps = num_v_taps = 8; max_phases = 4
Example 2: Num_h_taps = num_v_taps = 8; max_phases = 5, 6, 7 or 8; num_h_phases = num_v_phases = 4
Example 3: Num_h_taps = 9; num_v_taps = 7; max_phases = num_h_phases = num_v_phases = 4
Coefficient Preloading Using a .coe File
Generating .coe Files
Extracting Coefficients From xscaler_coefs.c File
Format for .coe Files
Coefficient Readback

Chapter 9: Performance
Live Video Mode
Memory Mode

Appendix A: Use Cases
Typical Uses

Appendix B: Programmer Guide
Introduction
Conventions
Register Definitions
Filter Coefficient Calculations
Video Scaler Flow Diagram
System Timing Diagram
Proposed API function calls
L0 API Function Calls
L1 API Function Calls
L2 API Function Calls
Example Settings
Pass Thru
Down Sample by 2 in Horizontal and Vertical

Appendix C: System Level Design
Introduction
Example System General Configuration
Control Buses
VDMA0 Configuration
VDMA1 Configuration
VDMA2 Configuration
Video Scaler Configuration
MPMC Configuration
Scaler READ-port
Scaler WRITE-port
Cropping from Memory
OSD Configuration
EDK MHS File Text

Schedule of Figures

Chapter 1: Introduction

Chapter 2: Overview

Chapter 3: Implementation
Figure 3-1: High Level View of the Functionality
Figure 3-2: Simplified Top Level Block Diagram, Indicating Clock-domains

Chapter 4: Video I/O Interface and Timing
Figure 4-1: Scaler 8-bit 4:2:2 Input Timing
Figure 4-2: Scaler 8-bit 4:2:0 Input Chroma Validation
Figure 4-3: VBlank, HBlank, Frame_rst, LineCount Screenshot, with Frame Reset Line Number = 22
Figure 4-4: Interface Timing for Memory Source Mode
Figure 4-5: Scaler Output Timing (8-bits YC4:2:2/4:2:0)
Figure 4-6: Scaler 4:2:0 Output Validation (8-bits)

Chapter 5: Scaler Architectures
Figure 5-1: Internal Data Path Bitwidths for Single-Engine YC Mode
Figure 5-2: Internal Data Path Bitwidths for Dual-Engine YC Mode
Figure 5-3: Internal Data Path Bitwidths for Triple-Engine RGB/4:4:4 Architecture
Figure 5-4: Auto Select in GUI
Figure 5-5: CORE Generator GUI Information Tab

Chapter 6: Control Interface
Figure 6-1: Typical EDK-based System Showing Interrupt Structure

Chapter 7: Scaler Aperture
Figure 7-1: Hblank_in at Falling Edge of VBlank_in
Figure 7-2: Active_video_in in Relation to First Active Sample
Figure 7-3: Cropping from the Input Image

Chapter 8: Coefficients
Figure 8-1: Coefficient Write-Format on coef_data_in(31:0)
Figure 8-2: Coefficient Loading Mechanism, Including External FIFO
Figure 8-3: Coefficient Loading Procedure – One Phase (8-tap filter shown)

Chapter 9: Performance

Appendix A: Use Cases
Figure A-1: Format Down-scaling. Example 720p to 640x480, HSF = 2^20 x 1280/640; VSF = 2^20 x 720/480
Figure A-2: Format Up-scaling. Example 640x480 to 720p, HSF = 2^20 x 640/1280; VSF = 2^20 x 480/720
Figure A-3: Zoom (Up-scaling), HSF = 2^20 x 480/1280; VSF = 2^20 x 270/720
Figure A-4: Shrink (Down-scaling). Example for Picture-in-Picture (PinP), HSF = 2^20 x 1280/480; VSF = 2^20 x 720/270
Figure A-5: Zoom (Up-scaling) reading from External Memory, HSF = 2^20 x 480/1280; VSF = 2^20 x 270/720

Appendix B: Programmer Guide
Figure B-0: Video Scaler Flow Chart
Figure B-0: System Timing Diagram

Appendix C: System Level Design
Figure C-1: Simplified System Diagram

Schedule of Tables

Chapter 1: Introduction
Table 1-1: Nomenclature

Chapter 2: Overview

Chapter 3: Implementation

Chapter 4: Video I/O Interface and Timing

Chapter 5: Scaler Architectures

Chapter 6: Control Interface

Chapter 7: Scaler Aperture

Chapter 8: Coefficients
Table 8-1: Example 1 Decimal Coefficients
Table 8-2: Example 1 Normalized Integer Coefficients
Table 8-3: Example 1 Coefficient Set Download Format
Table 8-4: Example 2 Coefficient Set Download Format
Table 8-5: Example 9-Tap Coefficients
Table 8-6: Example 7-Tap Coefficients
Table 8-7: Example 3 Coefficient Set Download Format
Table 8-8: Coefficient “Binning” in SW Driver (xscaler_coefs.c)
Table 8-9: Ordering of Coefficients in .coe File for Different Coefficient Sharing Options
Table 8-10: .coe File Example 1
Table 8-11: .coe File Example 2
Table 8-12: .coe File Example 3

Chapter 9: Performance
Table 9-1: Target Maximum Clock Frequencies
Table 9-2: Throughput Calculations for Different Chroma Formats

Appendix A: Use Cases

Appendix B: Programmer Guide
Table B-1: Video Scaler Registers Overview
Table B-2: control Register
Table B-3: reserved Register
Table B-4: status Register
Table B-5: status_done Register
Table B-6: horizontal_shrink_factor Register
Table B-7: vsf Register
Table B-8: aperture_horz Register
Table B-9: aperture_vert Register
Table B-10: output_size Register
Table B-11: num_phases Register
Table B-12: coeff_sets Register
Table B-13: start_hpa_y Register
Table B-14: start_vpa_y Register
Table B-15: start_hpa_c Register
Table B-16: start_vpa_c Register
Table B-17: Coefficient_write_set_address Register
Table B-18: coef_values Register
Table B-19: Coefficient Set and Bank Read Address Register
Table B-20: Coefficient Phase and Tap Read Address Register
Table B-21: Coefficient Memory Readback Output Register
Table B-22: Version Register
Table B-23: Software Reset Register
Table B-24: Global Interrupt Enable Register
Table B-25: Interrupt Status Register
Table B-26: Interrupt Enable Register
Table B-27: Pass Through Register Settings
Table B-28: Down Sample Register Settings

Appendix C: System Level Design

Preface: About This Guide

The LogiCORE™ IP Video Scaler v4.0 User Guide provides information about generating the Video Scaler core, customizing and simulating the core using the provided example design, and running the design files through implementation using the Xilinx tools.

Guide Contents

This manual contains the following chapters:

• Chapter 1, Introduction introduces the Xilinx Video Scaler core and provides related information, including recommended design experience, additional resources, technical support, and submitting feedback to Xilinx.

• Chapter 2, Overview illustrates examples of video scaler applications.

• Chapter 3, Implementation elaborates on the internal structure in the core and describes interfacing.

• Chapter 4, Video I/O Interface and Timing describes how to drive the input timing signals so the scaler can be operated correctly. It also describes the data output signals and their relation to the output data.

• Chapter 5, Scaler Architectures describes the Single-Engine architecture for sequential YC processing, the Dual-Engine architecture for parallel YC processing, and the Triple-Engine architecture for parallel RGB/4:4:4 processing.

• Chapter 6, Control Interface discusses the three control interface options available to the user in CORE Generator™ software: EDK pCore, GPP and Constant.

• Chapter 7, Scaler Aperture explains how to define the scaler aperture using the appropriate dynamic control registers.

• Chapter 8, Coefficients describes the coefficients used by both the Vertical and Horizontal filter portions of the scaler, in terms of number, range, formatting and download procedures.

• Chapter 9, Performance emphasizes the importance of available clock rate and provides some worst-case conversion examples.

• Appendix A, Use Cases illustrates two likely usage scenarios for the video scaler.

• Appendix B, Programmer Guide provides a description of how to program and control the data flow for the video scaler hardware pCore.

• Appendix C, System Level Design provides an example design extracted from a known, working EDK project, including other Video IP blocks.


Additional Resources

To find additional documentation, see the Xilinx website at:

http://www.xilinx.com/support/documentation/index.htm

To search the Answer Database of silicon, software, and IP questions and answers, or to create a technical support WebCase, see the Xilinx website at:

http://www.xilinx.com/support/mysupport.htm

Conventions

This document uses the following conventions. An example illustrates each convention.

Typographical

The following typographical conventions are used in this document:

Convention, Meaning or Use, Example

Courier font
    Meaning or Use: Messages, prompts, and program files that the system displays
    Example: speed grade: - 100

Courier bold
    Meaning or Use: Literal commands that you enter in a syntactical statement
    Example: ngdbuild design_name

Helvetica bold
    Meaning or Use: Commands that you select from a menu
    Example: File → Open
    Meaning or Use: Keyboard shortcuts
    Example: Ctrl+C

Italic font
    Meaning or Use: Variables in a syntax statement for which you must supply values
    Example: ngdbuild design_name
    Meaning or Use: References to other manuals
    Example: See the User Guide for more information.
    Meaning or Use: Emphasis in text
    Example: If a wire is drawn so that it overlaps the pin of a symbol, the two nets are not connected.

Dark Shading
    Meaning or Use: Items that are not supported or reserved
    Example: This feature is not supported

Square brackets [ ]
    Meaning or Use: An optional entry or parameter. However, in bus specifications, such as bus[7:0], they are required.
    Example: ngdbuild [option_name] design_name

Braces { }
    Meaning or Use: A list of items from which you must choose one or more
    Example: lowpwr ={on|off}

Vertical bar |
    Meaning or Use: Separates items in a list of choices

Angle brackets < >
    Meaning or Use: User-defined variable or in code samples

Vertical ellipsis
    Meaning or Use: Repetitive material that has been omitted
    Example:
        IOB #1: Name = QOUT’
        IOB #2: Name = CLKIN’
        .
        .
        .

Horizontal ellipsis . . .
    Meaning or Use: Repetitive material that has been omitted
    Example: allow block block_name loc1 loc2 ... locn;

Notations
    Meaning or Use: The prefix ‘0x’ or the suffix ‘h’ indicates hexadecimal notation
    Example: A read of address 0x00112975 returned 45524943h.
    Meaning or Use: An ‘_n’ means the signal is active low
    Example: usr_teof_n is active low.

Online Document

The following conventions are used in this document:

Convention, Meaning or Use, Example

Blue text
    Meaning or Use: Cross-reference link to a location in the current document
    Example: See Additional Resources, page 12, for details.
    Example: See Chapter 3, Basic Architecture for details.

Blue, underlined text
    Meaning or Use: Hyperlink to a website (URL)
    Example: Go to www.xilinx.com for the latest speed files.

Chapter 1: Introduction

This chapter introduces the Video Scaler core and provides related information, including recommended design experience, additional resources, technical support, and submitting feedback to Xilinx.

About the Core

The Video Scaler core is a Xilinx CORE Generator™ IP core, included in the latest IP Update on the Xilinx IP Center. For detailed information about the core, see the Video Scaler product page: www.xilinx.com/products/ipcenter/EF-DI-VID-SCALER.htm

Recommended Experience

Although the Video Scaler core is a fully verified solution, the challenge associated with implementing a complete design varies depending on the configuration and functionality of the application. For best results, previous experience building high performance, pipelined FPGA designs using Xilinx implementation software and UCF is recommended.

Contact your local Xilinx representative for a closer review and estimation for your specific requirements.

Additional Core Resources

For detailed information about video scaler technology and updates to the Video Scaler core, see the following:

Documentation

From the Video Scaler product page:

• Video Scaler Data Sheet

• Video Scaler Release Notes

Technical Support

For technical support, visit www.xilinx.com/support. Questions are routed to a team of engineers with expertise using the Video Scaler core.

Xilinx will provide technical support for use of this product as described in the LogiCORE™ IP Video Scaler User Guide. Xilinx cannot guarantee timing, functionality, or support of this product for designs that do not follow these guidelines.


Providing Feedback

Xilinx welcomes comments and suggestions about the Video Scaler core and the documentation supplied with the core.

Core

For comments or suggestions about the Video Scaler core, submit a WebCase from www.xilinx.com/support. Be sure to include the following information:

• Product name

• Core version number

• Explanation of your comments

Documentation

For comments or suggestions about this document, submit a WebCase from www.xilinx.com/support. Be sure to include the following information:

• Document title

• Document number

• Page number(s) to which your comments refer

• Explanation of your comments

Nomenclature

The following are defined for the purposes of this document:

Table 1-1: Nomenclature

Term: Definition

Scaler Aperture: The input data rectangle used to create the output data rectangle.

Filter Aperture: The group of contributory data used in a filter to generate one particular output. The number of elements in this group of data is the number of taps. We define the filter aperture size using the num_h_taps and num_v_taps parameters.

Coefficient Phase: Each tap is multiplied by a coefficient to make its contribution to the output pixel. The coefficients used are selected from a “phase” of num_x_taps coefficients. The phase selection is dependent upon the position of the output pixel in the input sampling grid space. For each dimension of the filter, each coefficient phase consists of num_h_taps or num_v_taps coefficients.

Channel: For scaler purposes, all monochromatic video streams, for example Y, Cb, Cr, R, G, B, are all considered separate channels.

Coefficient Phase Index: An index given that selects the coefficient phase applied to one filter aperture in a FIR. For an n-tap filter, this index points to n coefficients.

Coefficient Bank: A group of coefficients that will be applied to one video component (Y or C) in one dimension (H or V) for a conversion of one frame. It includes all phases. For an n-tap, m-phase filter, a coefficient bank comprises n x m values. Each tap may be multiplied by any one of m coefficients assigned to it, selected by the phase index, which is applied to all taps.

Coefficient Set: A group of four coefficient banks (VY, VC, HY, HC). One full set should be written into the scaler before use.
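The bank and set definitions in Table 1-1 imply a simple count of coefficient values. The following is a minimal sketch of that arithmetic, assuming only the definitions above; the function names are hypothetical, and the result is a logical count that says nothing about how the core lays the values out in memory.

```python
def bank_size(num_taps: int, num_phases: int) -> int:
    """Values in one coefficient bank: one coefficient per tap per phase (n x m)."""
    return num_taps * num_phases


def set_size(num_h_taps: int, num_v_taps: int,
             num_h_phases: int, num_v_phases: int) -> int:
    """Values in one coefficient set, that is, four banks: VY, VC, HY, HC."""
    vertical_bank = bank_size(num_v_taps, num_v_phases)      # used for VY and VC
    horizontal_bank = bank_size(num_h_taps, num_h_phases)    # used for HY and HC
    return 2 * vertical_bank + 2 * horizontal_bank


if __name__ == "__main__":
    # Illustrative configuration only: 8 horizontal taps, 4 vertical taps.
    print(set_size(num_h_taps=8, num_v_taps=4, num_h_phases=32, num_v_phases=16))
```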


Chapter 2

Overview

Video scaling is the process of converting an input color image of dimensions Xin pixels by Yin lines to an output color image of dimensions Xout pixels by Yout lines.

Within predefined limits, the Xilinx Video Scaler supports the modification of the Xin, Yin, Xout, Yout input parameters during run-time on a frame basis. Furthermore, you may also dynamically crop selected subject area from the input image prior to scaling that area. This dynamic combination lends itself well to applications that require shrink and zoom functionality.

The Xilinx Video Scaler supports real-time video inputs and memory interface inputs (that is, a frame buffer). When connected to a real-time input source, the input clock and horizontal and vertical (H/V) timing signals come directly from the input video stream. In the case of a memory interface, standard memory handshaking signals may be used in place of the H/V timing signals.

While maintaining image quality is usually of primary interest, it is subjective and heavily dependent upon the end application. Moreover, image quality comes at a price in terms of FPGA resources. Hence, while the core structure and architecture of the scaler is maintained for all applications, flexibility is made paramount to enable users from all applications to use this IP.


Chapter 3

Implementation

This section elaborates on the internal structure in the core, and describes interfacing.

Basic Architecture

The Xilinx Video Scaler LogiCORE™ IP converts a specified rectangular area of an input digital video image from the original sampling grid to a desired target sampling grid (Figure 3-1).

Figure 3-1: High Level View of the Functionality (Video Rectangle In, dimensions Xin x Yin; Video Scaler; Video Rectangle Out, dimensions Xout x Yout)

The input image must be provided in raster scan format (left to right and top to bottom). The valid outputs will also be given in this order.

The Xilinx Video Scaler makes few assumptions regarding the origin or the destination of the video data. The input could be fed in real-time from a live video feed, or it could be read from an external memory. The output could feed directly to another processing stage in real time, but also could feed an external frame buffer (for example, for a VGA controller, or a Picture-in-Picture controller). Whatever the configuration, you must assess, given the clock-frequency available, how much time is available for scaling, and define:

1. Whether to source the scaler using live video or an input-side frame buffer, and

2. Whether the scaler feeds out directly to the next stage or to an output-side frame buffer.

When using a live video input source, you have no control over the video timing signals. Hence, the specific requirements must allow for this. For example, when up-scaling by a factor of 2, two lines must be output for every input line. The scaler core clock-rate (‘clk’) must allow for this, especially considering the architectural specifics within the scaler that take advantage of the high speed features of the FPGA to allow for resource sharing.

Feeding data from an input frame buffer is more costly, but it allows you to read the required data as needed while still having one “frame” period in which to process it.


Some observations (not exclusively true for all conversions):

• Generally, when up-scaling, or dealing with high definition (HD) rates, it is simplest to use an input-side frame buffer. This does depend upon the available clock rates.

• When down-scaling, it is often the case that the input-side frame buffer is not required, because for every input line the scaler is required to generate a maximum of one valid output line.

• Generally, the output data does not conform to any standard. It is therefore not possible to feed the output directly to a display driver. Usually, a frame buffer is ultimately required to smooth the output data over an output frame period. The output video stream is described later.

I/O Buffering, Clock Domains

Figure 3-2 shows the top level buffering, indicating the different clock domains, and the scope of the control state-machines.

Figure 3-2: Simplified Top Level Block Diagram, Indicating Clock-domains

To support the many possibilities of input and output configurations, and to take advantage of the fast FPGA fabric, the scaler core uses a separate clock domain from that used in controlling data I/O. More information is given in Chapter 9, “Performance,” about how to calculate the minimum required operational clock frequency. It is also possible to read the output of the scaler using a third clock domain. These clock domains are isolated from each other using asynchronous line buffers as shown in Figure 3-2. The control state-machines monitor the I/O line buffers. They also monitor the current input and output line numbers.

Chapter 4

Video I/O Interface and Timing

CORE Generator™ software provides two interface options for provision of the video data into the video scaler core.

1. Live – standard format video signal, along with synchronization signals to be driven directly into the core.

2. Memory – an internal memory arbiter is included in the core, so the active video area may be accessed from an external memory block.

Data Source: Live Video

Input Data and Timing Signals

• General Input Handshaking Principles

• Hblank_in Input

• Vblank_in Input

• Frame_rst Signal

• Active_video_in Input

General Input Handshaking Principles

The input data is written into an internal double-buffered line buffer. Availability of space for one entire line of data is indicated by a high level on the line_request output. One line of data, of a length up to max_samples_in_per_line, may be written to this buffer without the need for further arbitration. Following the first valid pixel-write operation to this line buffer, the line_request output will be driven low by the scaler. This signal may rise a few (> 3) clock cycles later to indicate availability of the other half of the double buffer. The number of clock cycles is dependent on the current conversion.


Valid video data is written into the input line buffer using active_video_in as a write enable. This is shown in Figure 4-1 for the 8-bit 4:2:2 case. The active_video_in signal must remain in a high state for the duration of the active input line.

Figure 4-1: Scaler 8-bit 4:2:2 Input Timing (signals shown: video_in_clk, line_request, active_video_in, video_data_in (7:0) luma carrying Yn, Yn+1, Yn+2, Yn+3, and video_data_in (15:8) chroma carrying Cbn, Crn, Cbn+2, Crn+2)

The scaler is capable of accepting and delivering 4:4:4 (for example, RGB), 4:2:2, and 4:2:0 chroma formats. It does not convert between chroma formats. For delivery of 4:4:4 video data, a third channel would be added to this diagram, and the three channels would be either R, G, and B or Y, Cb, and Cr.

The I/O format for the YC cases is as follows. In terms of bandwidth, 4:2:0 is essentially the same as 4:2:2 horizontally, but has half the bandwidth vertically. Different signaling is required for the delivery of the YC 4:2:2 and YC 4:2:0 chroma systems. The luma (Y) input is a full-bandwidth input, on video_data_in[7:0] in the 8-bit case. The chroma for both 4:2:0 and 4:2:2 is also a full-bandwidth input on video_data_in[(data_width*2)-1:data_width], but Cb and Cr are interleaved on a pixel basis, as shown in Figure 4-1 for the 8-bit case. An additional input, active_chroma_in, is required in the 4:2:0 case. It must be asserted high on all lines for 4:2:2, but only on alternate lines for 4:2:0, as shown in Figure 4-2.
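The YC input format described above can be illustrated with a small host-side model. This is a sketch only: the function names are hypothetical, the Cb-then-Cr ordering and the choice of the first line as a chroma-valid line are read from Figures 4-1 and 4-2, and the code models word packing rather than signal timing.

```python
def pack_422_line(luma, cb, cr, data_width=8):
    """Pack one YC line into video_data_in words.

    luma holds one sample per pixel; cb and cr hold one sample per pixel pair.
    Luma occupies bits [data_width-1:0]; chroma occupies
    bits [(data_width*2)-1:data_width], with Cb on even pixels and Cr on odd.
    """
    if len(luma) % 2 or len(cb) != len(luma) // 2 or len(cr) != len(cb):
        raise ValueError("expected an even pixel count and one Cb/Cr pair per two pixels")
    mask = (1 << data_width) - 1
    words = []
    for x, y in enumerate(luma):
        chroma = cb[x // 2] if x % 2 == 0 else cr[x // 2]
        words.append(((chroma & mask) << data_width) | (y & mask))
    return words


def active_chroma_in(line_index, chroma_format):
    """Level of active_chroma_in for a zero-based line index (top line = 0)."""
    if chroma_format == "4:2:2":
        return True                    # chroma is valid on every line
    if chroma_format == "4:2:0":
        return line_index % 2 == 0     # chroma is valid on alternate lines only
    raise ValueError("this sketch covers the YC formats only")


if __name__ == "__main__":
    print([hex(w) for w in pack_422_line([0x10, 0x20], [0xAA], [0xBB])])
```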


When running the scaler using Live Mode, you are likely to derive the active_video_in from timing signals such as horizontal sync or embedded flags like EAV and SAV. In this case, you will have calculated that the line-rate at the input, often defined by the input video format, is sufficiently low that the host system will never need to wait for the line_request signal to be asserted.

However, you may instead calculate that this is not possible, and that the scaler must hold off the input data. In that case, the write operation for a new line must be held off while line_request is deasserted. Since it is impossible to hold off a live video feed, the data must then be fed (directly or indirectly) from a frame buffer, with the appropriate external control provided (Memory Mode).

Figure 4-2: Scaler 8-bit 4:2:0 Input Chroma Validation (chroma valid on lines 1 and 3; not valid on lines 2 and 4)

Hblank_in Input

The horizontal blanking input signal hblank_in is generally used as a line-based reset. It must be provided to the scaler core in the same clock domain as the video data (video_in_clk).

The hblank_in signal is used to perform the following operations:

• Reset an internal input pixel counter.

• Reset the internal input side line buffer write-address pointer.

• Increment the input line counter (rising edge of hblank_in).

• Decode the input line count during active data period to open and close an internal processing “window.”

• Decode the input line count to create a delayed internal frame-based reset signal (frame_rst) during vblank_in. The line number is specified in the CORE Generator GUI (Frame Reset Line Number).

The timing of hblank_in must satisfy the following criteria:

• It must be low for the active-data duration of the input line.

• It must be high for at least 100 video_in_clk cycles, once per line. This allows the scaler time to handle inherent line-based latency in the filters.

• It must be low for at least 32 video_in_clk cycles, once per line.

The hblank_in input must be tied to the horizontal blanking signal provided with the input video stream. Also, you may choose to use the inverse of hblank_in to create the active_video_in signal (see the Active_video_in Input section).
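The hblank_in timing criteria listed in this section can be checked against a candidate video format before integration. The following is a minimal sketch with hypothetical names; the 720p figures in the usage lines (1650 total and 1280 active clocks per line) are a common example format, not values taken from this guide.

```python
MIN_HBLANK_HIGH_CYCLES = 100   # minimum high time per line, in video_in_clk cycles
MIN_HBLANK_LOW_CYCLES = 32     # minimum low time per line, in video_in_clk cycles


def check_hblank_timing(high_cycles, low_cycles, active_cycles):
    """Return a list of violated criteria (empty if the line timing is acceptable)."""
    problems = []
    if high_cycles < MIN_HBLANK_HIGH_CYCLES:
        problems.append("hblank_in high for fewer than 100 video_in_clk cycles")
    if low_cycles < MIN_HBLANK_LOW_CYCLES:
        problems.append("hblank_in low for fewer than 32 video_in_clk cycles")
    if low_cycles < active_cycles:
        problems.append("hblank_in is not low for the whole active-data duration")
    return problems


if __name__ == "__main__":
    # Example: 1280 active pixels and 370 blanking clocks per line.
    print(check_hblank_timing(high_cycles=370, low_cycles=1280, active_cycles=1280))
```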

Vblank_in Input

The vertical blanking input signal vblank_in is generally used as a frame-based reset. It must be provided to the scaler core in the same clock domain as the video data (video_in_clk).

The vblank_in signal is used to perform the following operations:

• Reset input line counter (both edges).

• Generate internal frame-based reset signal (frame_rst) during vertical blanking.

In Live Video mode, Frame Reset Line Number must be set to a value that is lower than the number of line periods for which vblank_in remains high between frames. To characterize this further, hblank_in must transition high a larger number of times than Frame Reset Line Number while vblank_in is high.

The vblank_in input must be tied to the vertical blanking signal provided with the input video stream.

Frame_rst Signal

To maximize robustness of the scaler core, it is preferable to reset internal state-machines, FIFOs and other processes once per frame. Owing to inherent multi-line period latency in the system, it is not possible to use vblank_in for this purpose. During vblank_in, hblank_in must continue to be active (as per most video formats). Frame_rst is generated when the number of hblank_in pulses equals the Frame Reset Line Number specified in the CORE Generator/EDK GUI. Figure 4-3 is a screen shot from simulation, showing the relationship between vblank_in, hblank_in and Frame_rst. The line count shown is an internal counter included in this image for clarity. To achieve the case illustrated, enter the value 22 into the CORE Generator GUI or pCore GUI.

Figure 4-3: VBlank, HBlank, Frame_rst, LineCount Screenshot, with Frame Reset Line Number = 22

The Frame_rst signal is used to perform the following operations:

• Trigger the transfer of coefficients from the coefficient FIFO to the coefficient stores if and only if a full set of coefficients exists in the FIFO.

• Trigger the transfer of control register values from the scaler core pins to internal “active” registers, ready for use during the next frame. Setting bit 1 of the Control register to 0 prevents this transfer from happening.

• Reset read- and write-pointers of input and output line buffers.

• Reset internal state-machine to indicate next input line as the top line in a frame.
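The generation of Frame_rst described in this section can be modeled behaviorally. The sketch below is an assumption-laden illustration, not the core's implementation: it takes per-clock samples of vblank_in and hblank_in, resets a line counter on either edge of vblank_in, increments it on each rising edge of hblank_in, and reports the sample index at which the count reaches Frame Reset Line Number while vblank_in is high. The function name is hypothetical, and the model is not cycle-accurate.

```python
def frame_rst_index(vblank_in, hblank_in, frame_reset_line_number):
    """Return the sample index at which frame_rst would fire, or None if it never does."""
    line_count = 0
    prev_v = prev_h = 0
    for i, (v, h) in enumerate(zip(vblank_in, hblank_in)):
        if v != prev_v:
            line_count = 0                 # vblank_in resets the line counter on both edges
        if h and not prev_h:
            line_count += 1                # rising edge of hblank_in counts one line
            if v and line_count == frame_reset_line_number:
                return i
        prev_v, prev_h = v, h
    return None


if __name__ == "__main__":
    # Toy timing: 4 clocks per line, vblank_in high for 5 lines, then 2 active lines.
    vblank = [1] * 20 + [0] * 8
    hblank = [1, 0, 0, 0] * 7
    print(frame_rst_index(vblank, hblank, 3))   # fires during vertical blanking
    print(frame_rst_index(vblank, hblank, 6))   # never fires: blanking is too short
```

In the second call the requested line number exceeds the number of hblank_in pulses seen during vblank_in, which is the situation the Frame Reset Line Number rule in the Vblank_in Input section is meant to prevent.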

Active_video_in Input

The active_video_in signal is generally used as an input data validation signal. It must be provided to the scaler core in the same clock domain as the video data (video_in_clk).

The timing of active_video_in must satisfy the following criteria:

• The first low-to-high transition will coincide with the first active data value for the current line.

• This signal must be low when hblank_in is high.

• Following the transition from low to high, active_video_in must not transition low during the active period of the current line. Following a high-to-low transition, a pulse on the hblank_in signal must occur as described previously in the Hblank_in Input section.

• For each line, while hblank_in = 0, the active_video_in signal must remain high for at least ApertureEndPixel+1 cycles. For example, to scale an entire 720P image, set ApertureStartPixel = 0, ApertureEndPixel = 1279.

If hblank_in is driven high before active_video_in has been high for that many cycles, the line will not be acknowledged by the scaler. ApertureEndPixel is provided as an input to the scaler by the user.

You may choose to use the inverse of hblank_in to create the active_video_in signal.
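The active_video_in criteria above lend themselves to a simple per-line check in a testbench or a capture-analysis script. This is a minimal sketch under stated assumptions: the function name is hypothetical, and the inputs are per-clock 0/1 samples of hblank_in and active_video_in covering one line period.

```python
def check_active_video_line(hblank_in, active_video_in, aperture_end_pixel):
    """Return a list of violated criteria for one line (empty if the line is acceptable)."""
    if len(hblank_in) != len(active_video_in):
        raise ValueError("sample lists must have equal length")
    problems = []
    if any(h and a for h, a in zip(hblank_in, active_video_in)):
        problems.append("active_video_in is high while hblank_in is high")
    # Count low-to-high transitions; more than one means the signal dropped mid-line.
    previous = [0] + list(active_video_in[:-1])
    rising_edges = sum(1 for prev, cur in zip(previous, active_video_in) if cur and not prev)
    if rising_edges > 1:
        problems.append("active_video_in went low and high again within the line")
    if sum(active_video_in) < aperture_end_pixel + 1:
        problems.append("active_video_in is high for fewer than ApertureEndPixel+1 cycles")
    return problems


if __name__ == "__main__":
    hblank = [1] * 4 + [0] * 8
    active = [1 - h for h in hblank]   # inverse of hblank_in, as the text suggests
    print(check_active_video_line(hblank, active, aperture_end_pixel=7))
```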


Data Source: Memory

This mode is primarily intended for use with a memory controller with rectangular access capability such as the VFBC port on the MPMC. The VFBC port must be configured to provide the amount of data that the scaler is expecting for each frame. The port must contain sufficient buffering for at least one horizontal line of the input video rectangle.

When this video interface mode has been selected in CORE Generator, hblank_in, vblank_in, and active_video_in timing signals are not required. Also, the video data must be fed into the scaler core via the rd_data port instead of the video_data_in port.

The rd_almost_empty signal must be asserted when the port has less than one line available in the buffer.

When rd_almost_empty is low and the scaler is ready to accept a new line of input data, it asserts the rd_re signal high. This signal will remain high for the duration of one line period (determined by aperture_start_pixel and aperture_end_pixel). The first (left-most) valid data pixel must be driven onto the rd_data port one clock cycle after rd_re has been asserted. See Figure 4-4.

Figure 4-4: Interface Timing for Memory Source Mode

It is important for the scaler core to have a concept of frame synchronization so that top-edge filtering may be performed cleanly. For this purpose, you must also supply a vertical synchronization pulse vsync_in once per frame, before the input of the top line. Only the rising edge of vsync_in is used internally. It should be provided in the video_in_clk domain.

In this mode, cropping is not possible within the scaler itself as in Live Video mode. aperture_start_pixel and aperture_start_line must be set to 0. Cropping can be achieved using memory offsets. The first pixel and line provided to the scaler will always be included in the horizontal and vertical apertures.
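Cropping through memory offsets, as mentioned above, amounts to pointing the memory read at the top-left pixel of the crop window and setting the scaler aperture to start at zero. The sketch below illustrates the arithmetic only; the function name and the parameters base_address, stride_bytes, and bytes_per_pixel are hypothetical and depend on the memory controller and frame-buffer layout in use.

```python
def crop_read_window(base_address, stride_bytes, bytes_per_pixel,
                     crop_x, crop_y, crop_width, crop_height):
    """Compute a rectangular read window and the matching scaler aperture values."""
    start_address = base_address + crop_y * stride_bytes + crop_x * bytes_per_pixel
    return {
        # Values for the (hypothetical) rectangular memory read:
        "start_address": start_address,
        "line_bytes": crop_width * bytes_per_pixel,
        "lines": crop_height,
        "stride_bytes": stride_bytes,
        # Scaler aperture: the start values must be 0 in Memory mode.
        "aperture_start_pixel": 0,
        "aperture_end_pixel": crop_width - 1,
        "aperture_start_line": 0,
        "aperture_end_line": crop_height - 1,
    }


if __name__ == "__main__":
    # Example: crop a 1280x720 window out of a 1920-pixel-wide, 2-bytes-per-pixel buffer.
    window = crop_read_window(0x10000000, 1920 * 2, 2, 320, 180, 1280, 720)
    print(hex(window["start_address"]), window["aperture_end_pixel"])
```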


Output Data and Timing Signals

Although driving the scaler input using a direct standard video feed is supported, the same is not true of the scaler output. Because of the bursty nature of the vertical filter portion of the scaling operation, the output buffering required to deliver a standard video stream would be prohibitively large. Such buffering is better suited to an external memory interface, which is beyond the scope of this LogiCORE™ IP. However, you may decide that your system can directly handle the bursty data output from the scaler, provided valid data is indicated by the core. Consequently, simple handshaking is achieved using the video_out_we and video_out_almost_full signals.

When a line of data becomes available in the output buffer, and the video_out_almost_full flag is low, the video_out_we flag is asserted as shown in Figure 4-5, and data is driven out.

Figure 4-5: Scaler Output Timing (8-bits YC4:2:2/4:2:0)

The video_out_almost_full input is provided to throttle the output from the scaler. When this is asserted high for a number of line periods, the line_request signal will be deasserted due to back-pressure through the scaler. If video_out_almost_full is low at the start of an output line, the entire line will be delivered. The target must de-assert video_out_almost_full when it is ready to accept the entire line.

Upon completion of the final line requested according to the output_v_size parameter, the scaler will send a pulse of six video_out_clk cycles on the output_frame_done signal.
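Because an entire line is delivered once it has started, the target should only deassert video_out_almost_full when it has room for a full line. The following is a minimal sketch of a target-side buffer model built on that rule; the class and its attributes are hypothetical and stand in for whatever FIFO or memory interface receives the scaler output.

```python
class LineSink:
    """Toy model of a target buffer that throttles the scaler one line at a time."""

    def __init__(self, capacity, line_length):
        self.capacity = capacity        # total samples the buffer can hold
        self.line_length = line_length  # samples per output line (output_h_size)
        self.fill = 0

    @property
    def video_out_almost_full(self):
        # Asserted whenever there is not enough free space for one whole line.
        return (self.capacity - self.fill) < self.line_length

    def write_line(self):
        """Accept one complete line from the scaler."""
        if self.video_out_almost_full:
            raise RuntimeError("line started while video_out_almost_full was high")
        self.fill += self.line_length

    def drain(self, samples):
        """Remove samples on the read side of the buffer."""
        self.fill = max(0, self.fill - samples)


if __name__ == "__main__":
    sink = LineSink(capacity=3000, line_length=1280)
    sink.write_line()
    sink.write_line()
    print(sink.video_out_almost_full)   # no room for a third line yet
    sink.drain(1000)
    print(sink.video_out_almost_full)   # room for a whole line again
```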

For 4:2:0 outputs, the valid chroma data output will be accompanied by a high level on the chroma_out signal as shown in Figure 4-6.

Figure 4-6: Scaler 4:2:0 Output Validation (8-bits) (chroma_out high on lines 1 and 3, where chroma is valid; low on lines 2 and 4)

Chapter 5

Scaler Architectures

The scaler supports the following possible arrangements of the internal filters.

• Option 1: Single-engine for sequential YC processing

• Option 2: Dual-engine for parallel YC processing

• Option 3: Triple-engine for parallel RGB/4:4:4 processing

When using RGB/4:4:4, only Option 3 can be used. The choice between Option 1 and Option 2 trades throughput against resource usage. These three options are described in detail in this chapter.

Architecture Descriptions

Single-Engine for Sequential YC Processing

This is the most complex of the three options because Y, Cr, and Cb operations are multiplexed through the same filter engine kernel.

One entire line of one channel (for example luma) is processed before the single-scaler engine is dedicated to another channel of the same video line. The input buffering arrangement allows for the channels to be separated on a line-basis. The internal data path bit widths are shown in Figure 5-1, as implemented for a 4:2:2 or 4:2:0 scaler. DataWidth may be set to 8, 10, or 12 bits.

Figure 5-1: Internal Data Path Bitwidths for Single-Engine YC Mode (blocks: Input Line Buffer, Scaler, Output Line Buffer (Y), Output Line Buffer (Cb/Cr); data paths are labeled 2*DataWidth or 1*DataWidth)

The scaler module is flanked by buffers that are large enough to contain one line of data, double buffered.

At the input, the line buffer size is determined by the parameter max_samples_in_per_line. At the output, the line-buffer size is determined by the parameter max_samples_out_per_line. These line buffers enable line-based arbitration, and avoid pixel-based handshaking issues between the input and the scaler core. The input line buffer also serves as the “most recent” vertical tap (that is, the lowest in the image) in the vertical filter.


4:2:0 Special Requirements

When operating with 4:2:0, the following restriction also applies: the vertical scale factor applied at the vsf input must not be less than (2^20)*144/1080. This restriction has been included because Direct Mode 4:2:0 requires additional input buffering to align the chroma vertical aperture with the correct luma vertical aperture. In a later release of the video scaler, this restriction will be removed.

Dual-Engine for Parallel YC Processing

For this architecture, separate engines are used to process Luma and Chroma channels in parallel as shown in Figure 5-2.

Figure 5-2: Internal Data Path Bitwidths for Dual-Engine YC Mode (blocks: Luma (Y) Input Line Buffer, Chroma (Cr/Cb) Input Line Buffer, Scaler Engine (Y), Scaler Engine (C), Output Line Buffer (Y), Output Line Buffer (C); video_data_in and video_data_out are 2*DataWidth wide, and each internal path is 1*DataWidth)

For the Chroma channel, Cr and Cb are processed sequentially. Due to overheads in completing each component, the chroma channel operations for each line require slightly more time than the Luma operation. It is worth noting also that the Y and C operations do not work in synchrony.

Triple-Engine for RGB/4:4:4 Processing

For this architecture, separate engines are used to process the three channels in parallel, as shown in Figure 5-3.

Figure 5-3: Internal Data Path Bitwidths for Triple-Engine RGB/4:4:4 Architecture (per channel Ch1, Ch2, Ch3: Input Line Buffer, Scaler Engine, Output Line Buffer; video_data_in and video_data_out are 3*DataWidth wide, and each channel path is 1*DataWidth)

For this case, all three channels are processed in synchrony.


GUI Operation

When the chroma format is specified as 4:4:4, the triple-engine parallel architecture is always selected. Otherwise, selection between the YC Sequential or Parallel options may be achieved automatically (YC Filter Configuration = Auto Select) or manually in the CORE Generator GUI or the EDK GUI (see Figure 5-4).

The primary goal of selecting the correct architecture is to optimize resource usage, for a given worst case operational scenario. When Auto Select is selected, the GUI tries to establish what the user's worst case is from the following input parameters:

• Input maximum rectangle size

• Output maximum rectangle size

• Target Clock-frequency

• Desired Frame rate

Figure 5-4: Auto Select in GUI

The pseudo-code calculation made by the GUI for the Auto Select option is as follows:

    OverheadMultiplier := 1.15;
    max_pixels := max(MaxHSizeIn, MaxHSizeOut);
    max_lines := max(MaxVSizeIn, MaxVSizeOut);
    max_frame_cycles := max_pixels * max_lines * OverheadMultiplier;
    MaxFrameRateOneComponent := (TgtFMax * 1000000) / max_frame_cycles;

    if (TgtFrameRate <= MaxFrameRateOneComponent / 2) then
      Use Single engine
    else
      Use Dual engine
    end if;
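The Auto Select pseudo-code can be transcribed into a runnable form to explore configurations before opening the GUI. The sketch below follows the pseudo-code directly; the function and parameter names are hypothetical, TgtFMax is taken to be in MHz as the factor of 1000000 suggests, and the result is only an estimate of what the GUI would choose.

```python
OVERHEAD_MULTIPLIER = 1.15   # 15% per-frame overhead, as in the pseudo-code


def max_frame_rate_one_component(max_h_in, max_v_in, max_h_out, max_v_out,
                                 tgt_fmax_mhz, overhead=OVERHEAD_MULTIPLIER):
    """Estimated frames per second when one engine processes one component."""
    max_pixels = max(max_h_in, max_h_out)
    max_lines = max(max_v_in, max_v_out)
    max_frame_cycles = max_pixels * max_lines * overhead
    return (tgt_fmax_mhz * 1_000_000) / max_frame_cycles


def select_yc_engine(max_h_in, max_v_in, max_h_out, max_v_out,
                     tgt_fmax_mhz, tgt_frame_rate):
    """Return "single" or "dual" following the Auto Select rule."""
    rate = max_frame_rate_one_component(max_h_in, max_v_in,
                                        max_h_out, max_v_out, tgt_fmax_mhz)
    # A single engine handles Y and C sequentially, so it needs twice the headroom.
    return "single" if tgt_frame_rate <= rate / 2 else "dual"


if __name__ == "__main__":
    # Example: 1280x720 in, 1920x1080 out, 150 MHz core clock.
    print(select_yc_engine(1280, 720, 1920, 1080, 150, 30))
    print(select_yc_engine(1280, 720, 1920, 1080, 150, 60))
```

Passing a larger overhead value to max_frame_rate_one_component is one way to see how the estimate moves when more than 15% of each frame is inactive.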


The Information tab (see Figure 5-5) in the CORE Generator GUI (not available in the EDK GUI) shows the estimated maximum achievable frame rate given the above information, using a calculation similar to the one above. Check this value; you may elect to override the automatic selection and force one architecture or the other. This may be advisable in cases where, for example, a higher overhead per frame than 15% is needed. This overhead is intended as a general way of representing inactive periods in a frame such as blanking, but also includes filter flushing time, state-machine initialization, and so on.

Figure 5-5: CORE Generator GUI Information Tab

Chapter 6

Control Interface

There are three control interface options available in CORE Generator™ software: EDK pCore, GPP or Constant. The interface types differ primarily in the method of delivery of the user-defined control values and filter coefficients. These values are listed in the video scaler data sheet DS840 in Table 2, under “Dynamic Control Register Interface.”

Control Values

There follows a brief description of the function of the control values.

In GPP mode and pCore mode, these values are provided as dynamic inputs, and may be changed during runtime – the user inputs become active once per frame after completion of an output frame, using an internal active value capture register.

For the pCore version of the core, CORE Generator software provides the GPP core placed in a wrapper which allows you to parameterize the scaler core in EDK. The ports are driven by registers that sit on the AXI4-Lite interface. The address is decoded in the wrapper. A MicroBlaze™ processor software driver is provided in source-code form to drive these ports. Typical usage of the pCore is shown in Figure 6-1.

• aperture_start_pixel, aperture_end_pixel, aperture_start_line, aperture_end_line

These parameters define the size and location of the input rectangle. They are explained in detail in Chapter 7, “Scaler Aperture.”

• output_h_size, output_v_size

These two parameters define the size of the output rectangle. They do not determine anything about the target video format. You must determine what to do with the scaled rectangle that emerges from the scaler core.

• hsf, vsf

These are the horizontal and vertical shrink-factors that must be supplied by the user. They should be supplied as integers, and can typically be calculated as follows:

$$\mathrm{hsf} = \mathrm{round}\left(\left[\frac{\mathrm{aperture\_end\_pixel} - \mathrm{aperture\_start\_pixel} + 1}{\mathrm{output\_h\_size}}\right] \times 2^{20}\right)$$

and

$$\mathrm{vsf} = \mathrm{round}\left(\left[\frac{\mathrm{aperture\_end\_line} - \mathrm{aperture\_start\_line} + 1}{\mathrm{output\_v\_size}}\right] \times 2^{20}\right)$$


Hence, up-scaling is achieved using a shrink-factor value less than one. Down-scaling is achieved with a shrink-factor greater than one.

You may wish to work this calculation backwards. For a desired scale-factor, you may wish to calculate the output size or the input size. This is application-dependent. Smooth zoom/shrink applications may take advantage of this approach, coupled with the start-phase controls described below.

The allowed range of values on these parameters is 1/12 to 12: (0x015555 to 0xC00000).
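The quoted range (1/12 to 12 mapping to 0x015555 to 0xC00000) implies that a shrink-factor of one is represented as 2^20. The following is a minimal sketch of the hsf/vsf calculation on that basis; the function name is hypothetical, the input size is assumed to be aperture end minus aperture start plus one, and rounding is done to the nearest integer with ties rounded up.

```python
SHRINK_FACTOR_ONE = 1 << 20   # register value representing a shrink-factor of 1.0
MIN_SHRINK_FACTOR = 0x015555  # 1/12
MAX_SHRINK_FACTOR = 0xC00000  # 12


def shrink_factor(aperture_start, aperture_end, output_size):
    """Compute hsf (pixels) or vsf (lines) as an integer register value."""
    input_size = aperture_end - aperture_start + 1
    # Integer round-to-nearest of (input_size / output_size) * 2**20.
    value = (2 * input_size * SHRINK_FACTOR_ONE + output_size) // (2 * output_size)
    if not MIN_SHRINK_FACTOR <= value <= MAX_SHRINK_FACTOR:
        raise ValueError("shrink-factor outside the allowed range of 1/12 to 12")
    return value


if __name__ == "__main__":
    # Up-scaling 1280 pixels to 1920 gives a value below 2**20.
    print(hex(shrink_factor(0, 1279, 1920)))
    # Down-scaling 1080 lines to 720 gives a value above 2**20.
    print(hex(shrink_factor(0, 1079, 720)))
```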

• num_h_phases, num_v_phases

Although you must specify the maximum number of phases (max_phases) that the core supports in the CORE Generator GUI, it is not necessary to run the core with a filter that has that many phases. Under some scaling conditions, you may want a large number of phases, but under others you may need only a few, or even only one. Non power-of-two numbers of phases are supported.

• coef_wr_addr, h_coeff_set, v_coeff_set

In GPP and pCore interfaces, you may load coefficients. The scaler can store up to max_coef_sets coefficient sets internally. coef_wr_addr selects the location of the set to which you intend to write. The set may subsequently be used by controlling the h_coeff_set and v_coeff_set values.

• start_hpa_y, start_hpa_c, start_vpa_y, start_vpa_c

These are the start-phase controls. Internally to the core, the scaler accumulates the 24-bit shrink-factor (hsf, vsf) to determine phase and filter aperture. These four values allow you to preset the fractional part of the accumulations horizontally (hpa) and vertically (vpa) for luma (y) and chroma (c).

When dealing with 4:2:2, luma and chroma are always vertically cosited. Hence the start_vpa_c value is ignored.

Usage of these parameters is important for scaling interlaced formats cleanly. On successive input fields, the start_vpa_y value needs to be modified.

Also, when the desired result is a smooth shrink or zoom over a period of time, you may get better results by changing these parameters for each frame.

The allowed range of values on these parameters is -0.99 to 0.99: (0x100001 to 0x0FFFFF). The default value for these parameters is 0.
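The endpoints quoted for the start-phase range (0x100001 for about -0.99 and 0x0FFFFF for about 0.99) are consistent with a 21-bit two's-complement value carrying 20 fractional bits. That interpretation is an inference from the endpoints, not something this guide states, so the sketch below should be treated as an assumption to verify against the data sheet. The function name is hypothetical.

```python
PHASE_FRACTION_BITS = 20   # assumed: same fractional precision as hsf/vsf
PHASE_WORD_BITS = 21       # assumed: one sign bit plus 20 fractional bits


def encode_start_phase(phase):
    """Encode a fractional start phase (about -0.99 to 0.99) as a register value."""
    scaled = round(phase * (1 << PHASE_FRACTION_BITS))
    limit = (1 << PHASE_FRACTION_BITS) - 1
    if not -limit <= scaled <= limit:
        raise ValueError("start phase outside the allowed range")
    # Two's-complement wrap into the assumed 21-bit field.
    return scaled & ((1 << PHASE_WORD_BITS) - 1)


if __name__ == "__main__":
    print(hex(encode_start_phase(0.5)))
    print(hex(encode_start_phase(-0.5)))
```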

• control

The control register contains only two active bits. The default value for the control register during continuous operation is “0x3.”

• bit 0 is a general purpose enable. Activated/deactivated on a vblank_in basis, a value of 0 disables the scaler output.

• bit 1 enables values on the other register inputs to become internally active on a vblank_in basis. A value of 0 prevents the active internal values from being changed.
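The two control bits can be composed with a small helper. This is a sketch with hypothetical constant and function names; only the bit positions and the default value of 0x3 come from the description above.

```python
CONTROL_ENABLE = 1 << 0            # bit 0: general purpose enable
CONTROL_REGISTER_UPDATE = 1 << 1   # bit 1: allow register values to become active


def control_value(enable, update_registers):
    """Build the control register value from the two active bits."""
    value = 0
    if enable:
        value |= CONTROL_ENABLE
    if update_registers:
        value |= CONTROL_REGISTER_UPDATE
    return value


if __name__ == "__main__":
    # Hold the active values steady while several registers are rewritten,
    # then release them so they are captured together.
    print(hex(control_value(enable=True, update_registers=False)))
    print(hex(control_value(enable=True, update_registers=True)))
```

Clearing bit 1 while several registers are rewritten, and setting it again afterwards, is one way to keep a partially updated set of values from becoming active.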


Constant (Fixed) Mode

When using this mode, the values are fixed at compile time. The user system does not need to drive any of the parameters. The CORE Generator GUI prompts you to specify:

• coefficient file (.coe)

• hsf

• vsf

• aperture_start_pixel

• aperture_end_pixel

• aperture_start_line

• aperture_end_line

• output_h_size

• output_v_size

• num_h_phases

• num_v_phases

Constant mode has the following restrictions:

• A single coefficient set must be specified using a .coe file; this is the only way to populate the coefficient memory.

• Coefficients may not be written to the core; the coef_wr_addr control is disabled.

• You may not specify h_coeff_set or v_coeff_set; there is only one set of coefficients.

• You may not specify start_hpa_y, start_hpa_c, start_vpa_y, start_vpa_c; they are set internally to zero.

• The control register is always set to “0x00000003,” fixing the scaler in active mode.

General Purpose Processor (GPP) Interface

This interface type exposes all control ports to the user. You are responsible for driving these ports. Xilinx recommends that GPP mode be used only by experienced scaler users.

Figure 6-1 indicates how the EDK pCore is effectively a wrapper around the GPP mode core. This should be considered as an example of how you may choose to wrap the GPP mode core to suit any processor.

In GPP mode, the control values may be changed during runtime – the user input control values become active once per frame after completion of an output frame, using an internal active value capture register.

Coefficient Delivery for GPP Interface

In this mode, you must supply all coefficients to the core. See Chapter 8, “Coefficients,” for all details regarding coefficient loading in GPP mode.


EDK pCore Interface

In contrast to GPP Mode and Constant Mode control interfaces, when you select this control interface option in CORE Generator, no netlist is created. Instead, a database is generated containing the necessary files for use in an EDK project. This database includes:

<component_name>
  -> drivers -> scaler_v3_01_a
       -> data
            scaler_v2_1_0.mdd
            scaler_v2_1_0.tcl
       -> example
            example.c
       -> src
            Makefile
            xscaler.c
            xscaler.h
            xscaler_coefs.c
            xscaler_g.c
            xscaler_hw.h
            xscaler_intr.c
            xscaler_sinit.c
  -> pcores -> axi_scaler_v4_00_a
       -> data
            scaler_v2_1_0.mpd
            scaler_v2_1_0.pao
       -> hdl -> vhdl
            CoefsFIFO.vhd
            coefs.vhd
            CoefRAM.vhd
            CoefMemBlk.vhd
            HeartBeater.vhd
            HPhaseAccumulator.vhd
            HWT.vhd
            ImageXLib_arch.vhd
            ImageXLib_utils.vhd
            MemXLib_arch.vhd
            MemXLib_utils.vhd
            Scaler.vhd
            Scaler_RTI.vhd
            Scaler_wrap0.vhd
            Scaler_wrap0_core.vhd
            ScalerExternalSM.vhd
            syncgen_core.vhd
            user_logic.vhd
            v_scaler_v4_0.vhd
            xscaler.vhd
            YCCheckSum.vhd

For use in an EDK project:

1. Copy the /drivers/scaler_v3_01_a sub-directory from the CORE Generator database to the /drivers directory in your EDK project repository.

2. Copy the /pcores/axi_scaler_v4_00_a sub-directory from the CORE Generator database to the /pcores directory in your EDK project repository.

All VHDL files are encrypted. Do not attempt to modify these files.

Parameter Modification in CORE Generator

When "EDK pCore" is selected in the CORE Generator GUI, all parameters are greyed-out. The user must use the EDK GUI to parameterize the core.

Scaler Software Driver

All files provided by CORE Generator software under the drivers directory are tested software drivers for the video scaler. They are unencrypted C code, which you may adapt for your own environment. The drivers are intended for a memory-mapped system. The register map for the scaler registers is given in Appendix B, “Programmer Guide.”

Coefficient Delivery for EDK pCore Interface

Delivery of coefficients to the hardware core is achieved exactly as described for the GPP Interface (see Chapter 8, “Coefficients,” for full details). However, the pCore wrapper and software driver hide this detail from you.

Interrupts

There are seven interrupts:

1. intr_output_frame_done – Issued once per complete output frame.

2. intr_reg_update_done – Issued during Vertical blanking when the register values have been transferred to the active registers.

3. intr_input_error – Issued if active_video_in is asserted before the scaler is ready to receive a new line.

4. intr_output_error – Issued if frame period completes before full output frame has been delivered.

5. intr_coef_wr_error – Issued if coefficient is written into coefficient FIFO when the FIFO is not ready.

6. intr_coef_fifo_rdy – High when the coefficient FIFO is ready to receive a coefficient for the current set; stays low once a full set has been written into FIFO; sent high during Vertical blanking.

7. intr_coef_mem_rdbk_rdy - Sent low after CoefMemRdEn (control register bit (3)) is written low. Two frames after CoefMemRdEn is written high, this signal is driven high again.

In GPP mode, all seven interrupts are active.

In Constant mode, only intr_input_error, intr_output_error and intr_output_frame_done are active.
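The mode-dependent interrupt availability described above can be captured in a small lookup, shown here as a Python sketch. The interrupt names are taken from the list above; the mode strings and the function name are hypothetical, and no register bit positions are implied (those are defined in Appendix B).

```python
# Interrupt names in the order listed in this section.
INTERRUPTS = (
    "intr_output_frame_done",
    "intr_reg_update_done",
    "intr_input_error",
    "intr_output_error",
    "intr_coef_wr_error",
    "intr_coef_fifo_rdy",
    "intr_coef_mem_rdbk_rdy",
)

# Constant mode has no register or coefficient writes, so only these remain.
CONSTANT_MODE_ACTIVE = {
    "intr_input_error",
    "intr_output_error",
    "intr_output_frame_done",
}


def active_interrupts(mode):
    """Return the interrupt names that are active for 'gpp' or 'constant'."""
    if mode == "gpp":
        return list(INTERRUPTS)
    if mode == "constant":
        return [name for name in INTERRUPTS if name in CONSTANT_MODE_ACTIVE]
    raise ValueError("unknown mode: " + repr(mode))
```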


Inside the pCore wrapper, an Interrupt Controller (Xilinx Interrupt Control LogiCORE™ (DS516)) collates these interrupts into one interrupt on the AXI4-Lite bus. The microprocessor must then read the interrupt status registers to establish the nature of the interrupt. The interrupt registers are defined in Appendix B, “Programmer Guide.” A generic n-peripheral system is shown in Figure 6-1. It shows the intended usage of interrupts in an EDK-based system. It also shows how the Xilinx Interrupt Controller is used internally to the pCore along with the scaler in GPP mode.

Figure 6-1: Typical EDK-based System Showing Interrupt Structure

[Figure 6-1 block labels: MicroBlaze; Interrupt Controller; Peripheral ... Peripheral n; Interrupts; AXI Lite; Video Scaler pCore; Interrupt Controller; Video Scaler GPP.]


Chapter 7: Scaler Aperture

This section explains how to define the scaler aperture using the appropriate dynamic control registers. The aperture is defined relative to the input timing signals.

Input Aperture Definition

It is vital to understand how to specify the scaler aperture properly. The scaler aperture is defined as the input data rectangle used to create the output data rectangle. The input values aperture_start_line, aperture_end_line, aperture_start_pixel and aperture_end_pixel need to be driven correctly.

To scale from a rectangle of size 1280x720, they should be set as follows:

aperture_start_pixel 0

aperture_end_pixel 1279

aperture_start_line 0

aperture_end_line 719

It is also important to understand how “line 0” and “pixel 0” are defined to ensure that these values are entered correctly. Line 0 is defined as the first active line following a rising edge in active_video_in. An internal line counter is decoded to signal internally that the current line is indeed line 0. This line counter is reset on a falling edge of vblank_in. It increments on a rising edge of hblank_in.

One situation that needs to be avoided is the counter effectively starting at 1 instead of 0. This will cause no video output. The correct relationship between input hblank_in and vblank_in to avoid this situation is shown in Figure 7-1. The falling edge of vblank_in occurs while hblank_in is still high.

Figure 7-1: Hblank_in at Falling Edge of VBlank_in


Pixel 0 is defined as the first active pixel after the rising edge of active_video_in. This is indicated in Figure 7-2. The value 128 is used as the default value in video_data_in during blanking. In this example, the first pixel in the horizontal scaler aperture is the first active pixel in the input line.


Figure 7-2: Active_video_in in Relation to First Active Sample

Cropping

When using “Live” mode, you may choose to select a small portion of the input image. To achieve this, set the aperture_start_line, aperture_end_line, aperture_start_pixel and aperture_end_pixel according to your requirements.

For example, from an input which is 720P, you may want to scale from a rectangle of size 80x60, starting at (pixel, line) = (20, 32). Set the following:

aperture_start_pixel 20

aperture_end_pixel 99

aperture_start_line 32

aperture_end_line 91

Figure 7-3 shows the opening of an internal processing window signal (t_verticalwindow) with the preceding cropping settings. A similar operation occurs in the horizontal domain. A useful developer note is that if the largest input rectangle is cropped from the input, then this size may be used in deciding the max_pixels_in_per_line parameter. This may save block RAM usage in some cases.

Figure 7-3: Cropping from the Input Image
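The aperture settings in the two examples above follow from the origin and size of the input rectangle, because the end values are inclusive. A minimal Python sketch of that arithmetic follows; the function name and the dictionary return type are illustrative choices, not part of the scaler driver.

```python
def aperture_from_rect(start_pixel, start_line, width, height):
    """Return the four aperture register values for an input rectangle.

    The end values are inclusive, so a rectangle of `width` pixels that
    starts at `start_pixel` ends at start_pixel + width - 1.
    """
    if width < 1 or height < 1 or start_pixel < 0 or start_line < 0:
        raise ValueError("rectangle must have positive size and a non-negative origin")
    return {
        "aperture_start_pixel": start_pixel,
        "aperture_end_pixel": start_pixel + width - 1,
        "aperture_start_line": start_line,
        "aperture_end_line": start_line + height - 1,
    }
```

For example, aperture_from_rect(20, 32, 80, 60) reproduces the cropping values listed above, and aperture_from_rect(0, 0, 1280, 720) reproduces the full 1280x720 settings.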

When using “Memory” mode, cropping must be achieved by selecting the appropriate rectangular area from memory. aperture_start_pixel and aperture_start_line must be set to zero.


Chapter 8: Coefficients

This section describes the coefficients used by both the Vertical and Horizontal filter portions of the scaler, in terms of number, range, formatting and download procedures.

Coefficient Table

One single size-configurable, block RAM-based, Dual Port RAM block stores all H and V coefficients combined, and holds different coefficients for luma and chroma as desired.

This coefficient store may be populated with active coefficients as follows:

• Using the Coefficient Interface (see Coefficient Interface).

• By preloading using a .coe file.

Coefficients that are preloaded using a .coe file remain in this memory until they are overwritten with coefficients loaded by the Coefficient Interface. Overwriting is not possible when using Constant mode, because coefficients cannot be written to the core in that mode. Preloading with coefficients gives you an easy way of initializing the scaler from power-up.

When using pCore or GPP interfaces, you may want more than one coefficient set from which to choose. For example, it may be necessary to select different filter responses for different shrink factors. This is often true when down-scaling by different factors to eliminate aliasing artifacts. The user may load (or preload using a .coe file) multiple coefficient sets.

The number of phases for each set may also vary, dependent upon the nature of the conversion, and how you have elected to generate and partition the coefficients. The maximum number of phases per set defines the size of the memory required to store them, and this may have an impact on resource usage. Careful selection of the parameters max_phases and max_coef_sets is paramount if optimal resource usage is important.

Each coefficient set is allocated an amount of space equal to 2^max_phases. Max_phases is a fixed parameter that is defined at compile time. However, it is not necessary for every set to have that many phases. The number of phases for each set may be different, provided you indicate how many phases there are in the current set being used, by setting the input register values num_h_phases and num_v_phases accordingly. Without setting these correctly, invalid coefficients will be selected by the phase accumulators.

Horizontal filter coefficients are stored in the lower half of the coefficient memory. Vertical filter coefficients are stored in the upper half of the coefficient memory. For each of the H and V sectors, luma coefficients occupy the lower half and chroma coefficients occupy the upper half. This method simplifies internal addressing. When the chroma format is set to 4:4:4, one set of coefficients will be shared between all three channels (i.e., R, G, and B will be scaled identically).



If the user specifies in the CORE Generator or EDK GUI that the Luma and Chroma filters share common coefficients, then there is no coefficient memory space available for chroma coefficients. In this case, the user must not load chroma coefficients using the Coefficient interface, and must not specify chroma coefficients in the .coe file.

Similarly, if the user has specified in the CORE Generator or EDK GUI that the Horizontal and Vertical filters share common coefficients, then there is no coefficient memory space available for Vertical coefficients. In this case, the user must not load Vertical coefficients using the Coefficient interface, and must not specify Vertical coefficients in the .coe file.

Note: This option is only available if the number of horizontal taps is equal to the number of vertical taps.

Coefficient Interface

The scaler uses only one set of coefficients per frame period. To change to a different set of stored coefficients for the next frame, use the h_coeff_set and v_coeff_set dynamic register inputs.

You may load new coefficients into a different location in the coefficient store during some frame period before they are required. You may load a maximum of one coefficient set (including all of HY, HC, VY, VC components) per frame period. Subsequently, this coefficient set may be selected for use by controlling h_coeff_set and v_coeff_set.

Filter Coefficients may be loaded into the coefficient memory using the coefficient memory interface. This comprises:

coef_data_in(31:0) 32-bit coefficient input bus

coef_wr_en Coefficient write-enable

coef_set_wr_addr(3:0) Coefficient set write address

The 32-bit input word always holds two coefficients. The scaler supports 16-bit coefficient bit-widths. The word format is shown in Figure 8-1.

Figure 8-1: Coefficient Write-Format on coef_data_in(31:0)

[Figure 8-1 layout: bits 31:16 hold Coefficient n+1; bits 15:0 hold Coefficient n (16-bit coefficients).]


An address-multiplexer is used to support the coefficient write interface as shown in Figure 8-2. The coefficient write-address is multiplexed with the coefficient read-address for the vertical filter to create the address for Port A on the dual-port coefficient RAM. Consequently, coefficients must be loaded into the coefficient stores when no active video scaling is occurring. It is only possible, therefore, to load the coefficients during the vertical blanking period. Since this would be an impossible burden on a processor, an external block RAM FIFO has been provided to which you load your coefficients during one frame period, as shown in Figure 8-2. Following a latency period after the positive transition of vblank_in, any new coefficient set is streamed into the internal coefficient store for use by the filter in the next frame.

Figure 8-2: Coefficient Loading Mechanism, Including External FIFO

[Figure 8-2 block labels: vblank_in; coef_data_in(31:0); coef_set_wr_addr(3:0); coef_wr_en; Coefficient Load Control SM; Coefficient Load FIFO; Coefficient Write Address; Video Scaler; Coefficient Store; Port A; Operational Read Address (V Filter); Operational Read Address (H Filter); Port B; Coefficients to filters.]

A waveform indicating the coefficient loading process is shown in Figure 8-3.

The coefficient memory interface is an asynchronous interface. A high level on the coef_wr_en signal is used to capture the coefficients delivered on coef_data_in as shown in Figure 8-3. An internal state-machine detects the 3rd ‘clk’ period when coef_wr_en is stable and high. At this point, the data is registered into the FIFO. Xilinx recommends that the high coef_wr_en pulse be no less than the equivalent of 6 ‘clk’ periods in duration. It is required that it also be low for a period no less than 6 ‘clk’ periods between write operations.

The guidelines are as follows:

• The address coef_set_addr for all coefficients in one set must be written via the normal register interface.

• coef_data_in delivers two coefficients per 32-bit word. The lower word (bits 15:0) always holds the coefficient that will be applied to the latest tap (that is, spatially speaking, the right-most or lowest). The word format is shown in Figure 8-1.

• All coefficients for one phase must be loaded sequentially via coef_data_in, starting with coef 0 and coef 1 [coef 0 is applied to the newest (right-most or lowest) input sample in the current filter aperture]. See Figure 8-3. For an odd number of coefficients, the final upper 16 bits is ignored.

• All phases must be loaded sequentially starting at phase 0, and ending at phase (max_phases-1). This must always be observed, even if a particular set of coefficients has fewer active phases than max_phases.

• For RGB/4:4:4, when not sharing coefficients across H and V operations, for each dimension, one bank of coefficients must be loaded into the FIFO before they can be streamed into the coefficient memory. When sharing coefficients across H and V operations, it is only necessary to write coefficients for the H operation. This process is permitted to take as much time as desired by the user system. This means that worst case, for a 12H-tap x 12V-tap 64-phase filter, you need to write 6 times per phase. If the user has specified separate H and V coefficients, this is a total of 768 write operations per set.

• For YC4:2:2 or YC4:2:0, when not sharing coefficients across H and V operations or across Y and C operations, one bank of luma (Y) and chroma (C) coefficients must be loaded into the FIFO for each dimension before they can be streamed into the coefficient memory. When sharing coefficients across H and V operations, it is only necessary to write coefficients for the H operation. Also, when sharing coefficients across Y and C operations, it is only necessary to write coefficients for the Y operation. This process is permitted to take as much time as desired by the user system. This means that worst case, for a 12H-tap x 12V-tap 64-phase filter, you need to write 6 times per phase. If the user has specified separate H and V coefficients and separate Y and C coefficients, this is a total of 1536 write operations per set.

• Writing a new address to coef_set_addr resets the internal state-machine that oversees the coefficient loading procedure. An error condition is asserted if, at the moment coef_set_addr is updated, the loading procedure has fallen short of 2 x max_phases x Max(num_h_taps, num_v_taps).
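The write counts quoted in the guidelines above (768 and 1536) can be reproduced with a short calculation. The Python sketch below is illustrative only: the function names are hypothetical, and it takes the larger of the two tap counts and rounds the phase count up to a power of two, as described in the examples later in this chapter.

```python
def next_power_of_two(n):
    """Smallest power of two that is greater than or equal to n."""
    p = 1
    while p < n:
        p *= 2
    return p


def writes_per_set(num_h_taps, num_v_taps, max_phases, separate_hv, separate_yc):
    """Number of 32-bit writes on coef_data_in needed for one coefficient set.

    For RGB/4:4:4 pass separate_yc=False, because one bank serves all channels.
    """
    # Two 16-bit coefficients per word; an odd tap count still needs a final word.
    words_per_phase = (max(num_h_taps, num_v_taps) + 1) // 2
    phases = next_power_of_two(max_phases)
    banks = (2 if separate_hv else 1) * (2 if separate_yc else 1)
    return words_per_phase * phases * banks
```

For the worst case above, writes_per_set(12, 12, 64, True, False) gives the RGB figure and writes_per_set(12, 12, 64, True, True) gives the YC figure.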

Figure 8-3: Coefficient Loading Procedure – One Phase (8-tap filter shown)

[Figure 8-3 waveform labels: coef_data_in carrying Coefs 0,1; Coefs 2,3; Coefs 4,5; Coefs 6,7; coef_wr_en.]

Examples of Coefficient Set Generation and Loading

As mentioned, when data is fed in raster format, coefficient 0 is applied to the lowest tap in the aperture for the Vertical filter, or to the right-most tap for the Horizontal filter. Following are a few examples of how to generate some coefficients and translate them into the correct format for downloading to the scaler.

Example 1: Num_h_taps = num_v_taps = 8; max_phases = 4

Table 8-1 shows a set of coefficients drawn from a sinc function.

Table 8-1: Example 1 Decimal Coefficients

Phase Tap 0 Tap 1 Tap 2 Tap 3 Tap 4 Tap 5 Tap 6 Tap 7

0 0.0000 0.0000 0.0000 0.0000 1.0000 0.0000 0.0000 0.0000

1 -0.0600 0.0818 -0.1286 0.3001 0.9003 -0.1801 0.1000 -0.0693

2 -0.0909 0.1273 -0.2122 0.6366 0.6366 -0.2122 0.1273 -0.0909

3 -0.0693 0.1000 -0.1801 0.9003 0.3001 -0.1286 0.0818 -0.0600

In this example, a 32-point 1-D sinc function has been sub-sampled to generate four phases of eight coefficients each. Sub-sampling in this way usually results in phases whose component coefficients rarely sum to 1.0, which causes image distortion. The example MATLAB m-code that follows shows how to normalize the phases to unity and how to express them as the 16-bit integers required by the hardware. For this process, coef_width = 16. Note that this is only pseudo code. Generation of actual coefficients is beyond the scope of this document. Refer to Answer Record 35262 and Filter Coefficient Calculations for more information on coefficient generation for the video scaler.

% Example 1 parameters
num_taps = 8;
num_phases = 4;
coef_width = 16;

% Subsample a Sinc function, and create 2D array
% (sinc is provided by the Signal Processing Toolbox)
x = -(num_taps/2):1/num_phases:((num_taps/2)-1/num_phases);
coefs_2d = reshape(sinc(x), num_phases, num_taps)
format long

% Normalize each phase individually
norm_phases = zeros(num_phases, num_taps);
for i = 1:num_phases
    sum_phase = sum(coefs_2d(i,:));
    for j = 1:num_taps
        norm_phases(i, j) = coefs_2d(i, j)/sum_phase;
    end
    % Check - Normalized values should sum to 1 in each phase
    norm_sum_phase = sum(norm_phases(i,:))
end

% Translate real to integer values with precision defined by coef_width
int_phases = round(((2^(coef_width-2))*norm_phases))

This generates the 2D array of integer values shown (in hexadecimal form) in Table 8-2.

Table 8-2: Example 1 Normalized Integer Coefficients

Phase Tap 0 Tap 1 Tap 2 Tap 3 Tap 4 Tap 5 Tap 6 Tap 7

0 0x0000 0x0000 0x0000 0x0000 0x4000 0x0000 0x0000 0x0000

1 0xFBEF 0x058C 0xF749 0x1457 0x3D04 0xF3CC 0x06C8 0xFB4E

2 0xF9AF 0x08D8 0xF143 0x2C36 0x2C36 0xF143 0x08D8 0xF9AF

3 0xFB4E 0x06C8 0xF3CC 0x3D04 0x1457 0xF749 0x058C 0xFBEF
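For readers without MATLAB, the same normalization and integer conversion can be sketched in Python using only the standard library. This is a port under stated assumptions, not a Xilinx utility: the function names are hypothetical, negative values are expressed as 16-bit two's-complement codes to match the hexadecimal form of Table 8-2, and Python's round differs from MATLAB's only for exact .5 ties.

```python
import math


def sinc(x):
    """Normalized sinc, sin(pi*x)/(pi*x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def generate_coefficients(num_taps=8, num_phases=4, coef_width=16):
    """Return a [phase][tap] table of coef_width-bit two's-complement codes."""
    scale = 1 << (coef_width - 2)        # 1.0 maps to 0x4000 when coef_width = 16
    mask = (1 << coef_width) - 1
    table = []
    for phase in range(num_phases):
        # Tap t of this phase samples the sinc at -(num_taps/2) + t + phase/num_phases,
        # which matches the column-major reshape in the m-code.
        taps = [sinc(-num_taps / 2 + t + phase / num_phases) for t in range(num_taps)]
        total = sum(taps)                # normalize so each phase sums to 1.0
        table.append([round(scale * c / total) & mask for c in taps])
    return table
```

Printing each entry with format(value, "04X") gives rows that can be compared directly with Table 8-2.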

It remains to format these values for the scaler.

The 16-bit coefficients must be coupled into 32-bit values for delivery to the HW. The resulting coefficient file for download is shown in Table 8-3.

The coefficients must be downloaded in the following order:

1. Horizontal Luma (always required)

2. Horizontal Chroma (required if not sharing Y and C coefficients)

3. Vertical Luma (required if not sharing H and V coefficients)

4. Vertical Chroma (required if not sharing H and V coefficients, and also not sharing Y and C coefficients)
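The pairing of 16-bit coefficients into 32-bit words and the bank ordering listed above can be sketched in Python as follows. The function names are hypothetical; the data is the Example 1 set from Table 8-2, and a bank passed as None stands for a bank that is omitted because coefficients are shared.

```python
EXAMPLE_1 = [
    [0x0000, 0x0000, 0x0000, 0x0000, 0x4000, 0x0000, 0x0000, 0x0000],
    [0xFBEF, 0x058C, 0xF749, 0x1457, 0x3D04, 0xF3CC, 0x06C8, 0xFB4E],
    [0xF9AF, 0x08D8, 0xF143, 0x2C36, 0x2C36, 0xF143, 0x08D8, 0xF9AF],
    [0xFB4E, 0x06C8, 0xF3CC, 0x3D04, 0x1457, 0xF749, 0x058C, 0xFBEF],
]


def pack_phase(taps):
    """Pack one phase, two taps per word: (T[n+1] << 16) | T[n]."""
    # For an odd tap count the final upper half-word is ignored; fill it with 0.
    padded = list(taps) + [0] * (len(taps) % 2)
    return [((padded[i + 1] & 0xFFFF) << 16) | (padded[i] & 0xFFFF)
            for i in range(0, len(padded), 2)]


def download_sequence(h_luma, h_chroma=None, v_luma=None, v_chroma=None):
    """Concatenate the banks in the required order, skipping shared (None) banks."""
    words = []
    for bank in (h_luma, h_chroma, v_luma, v_chroma):
        if bank is not None:
            for phase in bank:
                words.extend(pack_phase(phase))
    return words
```

Calling download_sequence(EXAMPLE_1, EXAMPLE_1, EXAMPLE_1, EXAMPLE_1) yields 64 words; the first 16 are the horizontal luma values in load-sequence order.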

Table 8-3: Example 1 Coefficient Set Download Format

(Seq = Load Sequence Number; Ph = Phase #; T = Tap #)

Horizontal Filter Coefficients
       Luma                                         Chroma
Phase  Seq  Value       Calculation                 Seq  Value       Calculation
0      1    0x00000000  (Ph0 T1 << 16) | Ph0 T0     17   0x00000000  (Ph0 T1 << 16) | Ph0 T0
0      2    0x00000000  (Ph0 T3 << 16) | Ph0 T2     18   0x00000000  (Ph0 T3 << 16) | Ph0 T2
0      3    0x00004000  (Ph0 T5 << 16) | Ph0 T4     19   0x00004000  (Ph0 T5 << 16) | Ph0 T4
0      4    0x00000000  (Ph0 T7 << 16) | Ph0 T6     20   0x00000000  (Ph0 T7 << 16) | Ph0 T6
1      5    0x058CFBEF  (Ph1 T1 << 16) | Ph1 T0     21   0x058CFBEF  (Ph1 T1 << 16) | Ph1 T0
1      6    0x1457F749  (Ph1 T3 << 16) | Ph1 T2     22   0x1457F749  (Ph1 T3 << 16) | Ph1 T2
1      7    0xF3CC3D04  (Ph1 T5 << 16) | Ph1 T4     23   0xF3CC3D04  (Ph1 T5 << 16) | Ph1 T4
1      8    0xFB4E06C8  (Ph1 T7 << 16) | Ph1 T6     24   0xFB4E06C8  (Ph1 T7 << 16) | Ph1 T6
2      9    0x08D8F9AF  (Ph2 T1 << 16) | Ph2 T0     25   0x08D8F9AF  (Ph2 T1 << 16) | Ph2 T0
2      10   0x2C36F143  (Ph2 T3 << 16) | Ph2 T2     26   0x2C36F143  (Ph2 T3 << 16) | Ph2 T2
2      11   0xF1432C36  (Ph2 T5 << 16) | Ph2 T4     27   0xF1432C36  (Ph2 T5 << 16) | Ph2 T4
2      12   0xF9AF08D8  (Ph2 T7 << 16) | Ph2 T6     28   0xF9AF08D8  (Ph2 T7 << 16) | Ph2 T6
3      13   0x06C8FB4E  (Ph3 T1 << 16) | Ph3 T0     29   0x06C8FB4E  (Ph3 T1 << 16) | Ph3 T0
3      14   0x3D04F3CC  (Ph3 T3 << 16) | Ph3 T2     30   0x3D04F3CC  (Ph3 T3 << 16) | Ph3 T2
3      15   0xF7491457  (Ph3 T5 << 16) | Ph3 T4     31   0xF7491457  (Ph3 T5 << 16) | Ph3 T4
3      16   0xFBEF058C  (Ph3 T7 << 16) | Ph3 T6     32   0xFBEF058C  (Ph3 T7 << 16) | Ph3 T6

Vertical Filter Coefficients
       Luma                                         Chroma
Phase  Seq  Value       Calculation                 Seq  Value       Calculation
0      33   0x00000000  (Ph0 T1 << 16) | Ph0 T0     49   0x00000000  (Ph0 T1 << 16) | Ph0 T0
0      34   0x00000000  (Ph0 T3 << 16) | Ph0 T2     50   0x00000000  (Ph0 T3 << 16) | Ph0 T2
0      35   0x00004000  (Ph0 T5 << 16) | Ph0 T4     51   0x00004000  (Ph0 T5 << 16) | Ph0 T4
0      36   0x00000000  (Ph0 T7 << 16) | Ph0 T6     52   0x00000000  (Ph0 T7 << 16) | Ph0 T6
1      37   0x058CFBEF  (Ph1 T1 << 16) | Ph1 T0     53   0x058CFBEF  (Ph1 T1 << 16) | Ph1 T0
1      38   0x1457F749  (Ph1 T3 << 16) | Ph1 T2     54   0x1457F749  (Ph1 T3 << 16) | Ph1 T2
1      39   0xF3CC3D04  (Ph1 T5 << 16) | Ph1 T4     55   0xF3CC3D04  (Ph1 T5 << 16) | Ph1 T4
1      40   0xFB4E06C8  (Ph1 T7 << 16) | Ph1 T6     56   0xFB4E06C8  (Ph1 T7 << 16) | Ph1 T6
2      41   0x08D8F9AF  (Ph2 T1 << 16) | Ph2 T0     57   0x08D8F9AF  (Ph2 T1 << 16) | Ph2 T0
2      42   0x2C36F143  (Ph2 T3 << 16) | Ph2 T2     58   0x2C36F143  (Ph2 T3 << 16) | Ph2 T2
2      43   0xF1432C36  (Ph2 T5 << 16) | Ph2 T4     59   0xF1432C36  (Ph2 T5 << 16) | Ph2 T4
2      44   0xF9AF08D8  (Ph2 T7 << 16) | Ph2 T6     60   0xF9AF08D8  (Ph2 T7 << 16) | Ph2 T6
3      45   0x06C8FB4E  (Ph3 T1 << 16) | Ph3 T0     61   0x06C8FB4E  (Ph3 T1 << 16) | Ph3 T0
3      46   0x3D04F3CC  (Ph3 T3 << 16) | Ph3 T2     62   0x3D04F3CC  (Ph3 T3 << 16) | Ph3 T2
3      47   0xF7491457  (Ph3 T5 << 16) | Ph3 T4     63   0xF7491457  (Ph3 T5 << 16) | Ph3 T4
3      48   0xFBEF058C  (Ph3 T7 << 16) | Ph3 T6     64   0xFBEF058C  (Ph3 T7 << 16) | Ph3 T6


Example 2: Num_h_taps = num_v_taps = 8; max_phases = 5, 6, 7 or 8; num_h_phases = num_v_phases = 4

If the max_phases parameter is greater than the number of phases in the set being loaded, load default coefficients into the unused locations. Example 2 is an extended version of Example 1 to show this. Table 8-4 shows the same 4-phase coefficient set loaded into the scaler when num_h_phases = 4, num_v_phases = 4 and max_phases is greater than 4 (max_phases = 5, 6, 7 or 8, num_h_taps = 8, num_v_taps = 8).

Note that:

1. If max_phases is not equal to an integer power of 2, then the number of phases to be loaded is rounded up to the next integer power of 2. See Example 2 (Table 8-4). Unused phases should be loaded with zeros.

2. The number of values loaded per phase is not rounded to the nearest power of 2. See Example 3 (Table 8-7).
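The phase padding rule in note 1 can be sketched in Python as follows. The function names are hypothetical, and a bank is represented as a list of phases, each a list of tap values.

```python
def phases_to_load(max_phases):
    """Round max_phases up to the next integer power of 2."""
    n = 1
    while n < max_phases:
        n *= 2
    return n


def pad_bank(phases, max_phases):
    """Append all-zero dummy phases so the bank holds phases_to_load(max_phases)."""
    total = phases_to_load(max_phases)
    if len(phases) > total:
        raise ValueError("more phases supplied than the core can store")
    num_taps = len(phases[0])
    dummy = [[0] * num_taps for _ in range(total - len(phases))]
    return [list(p) for p in phases] + dummy
```

For Example 2, a 4-phase set with max_phases = 5, 6, 7 or 8 is padded to 8 phases, so phases 4 to 7 are loaded as zeros.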

Table 8-4: Example 2 Coefficient Set Download Format

(Seq = Load Sequence Number; Ph = Phase #; T = Tap #)

Horizontal Filter Coefficients
       Luma                                         Chroma
Phase  Seq  Value       Calculation                 Seq  Value       Calculation
0      1    0x00000000  (Ph0 T1 << 16) | Ph0 T0     33   0x00000000  (Ph0 T1 << 16) | Ph0 T0
0      2    0x00000000  (Ph0 T3 << 16) | Ph0 T2     34   0x00000000  (Ph0 T3 << 16) | Ph0 T2
0      3    0x00004000  (Ph0 T5 << 16) | Ph0 T4     35   0x00004000  (Ph0 T5 << 16) | Ph0 T4
0      4    0x00000000  (Ph0 T7 << 16) | Ph0 T6     36   0x00000000  (Ph0 T7 << 16) | Ph0 T6
1      5    0x058CFBEF  (Ph1 T1 << 16) | Ph1 T0     37   0x058CFBEF  (Ph1 T1 << 16) | Ph1 T0
1      6    0x1457F749  (Ph1 T3 << 16) | Ph1 T2     38   0x1457F749  (Ph1 T3 << 16) | Ph1 T2
1      7    0xF3CC3D04  (Ph1 T5 << 16) | Ph1 T4     39   0xF3CC3D04  (Ph1 T5 << 16) | Ph1 T4
1      8    0xFB4E06C8  (Ph1 T7 << 16) | Ph1 T6     40   0xFB4E06C8  (Ph1 T7 << 16) | Ph1 T6
2      9    0x08D8F9AF  (Ph2 T1 << 16) | Ph2 T0     41   0x08D8F9AF  (Ph2 T1 << 16) | Ph2 T0
2      10   0x2C36F143  (Ph2 T3 << 16) | Ph2 T2     42   0x2C36F143  (Ph2 T3 << 16) | Ph2 T2
2      11   0xF1432C36  (Ph2 T5 << 16) | Ph2 T4     43   0xF1432C36  (Ph2 T5 << 16) | Ph2 T4
2      12   0xF9AF08D8  (Ph2 T7 << 16) | Ph2 T6     44   0xF9AF08D8  (Ph2 T7 << 16) | Ph2 T6
3      13   0x06C8FB4E  (Ph3 T1 << 16) | Ph3 T0     45   0x06C8FB4E  (Ph3 T1 << 16) | Ph3 T0
3      14   0x3D04F3CC  (Ph3 T3 << 16) | Ph3 T2     46   0x3D04F3CC  (Ph3 T3 << 16) | Ph3 T2
3      15   0xF7491457  (Ph3 T5 << 16) | Ph3 T4     47   0xF7491457  (Ph3 T5 << 16) | Ph3 T4
3      16   0xFBEF058C  (Ph3 T7 << 16) | Ph3 T6     48   0xFBEF058C  (Ph3 T7 << 16) | Ph3 T6
4      17   0x00000000  N/A (dummy coef)            49   0x00000000  N/A (dummy coef)
4      18   0x00000000  N/A (dummy coef)            50   0x00000000  N/A (dummy coef)
4      19   0x00000000  N/A (dummy coef)            51   0x00000000  N/A (dummy coef)
4      20   0x00000000  N/A (dummy coef)            52   0x00000000  N/A (dummy coef)
5      21   0x00000000  N/A (dummy coef)            53   0x00000000  N/A (dummy coef)
5      22   0x00000000  N/A (dummy coef)            54   0x00000000  N/A (dummy coef)
5      23   0x00000000  N/A (dummy coef)            55   0x00000000  N/A (dummy coef)
5      24   0x00000000  N/A (dummy coef)            56   0x00000000  N/A (dummy coef)
6      25   0x00000000  N/A (dummy coef)            57   0x00000000  N/A (dummy coef)
6      26   0x00000000  N/A (dummy coef)            58   0x00000000  N/A (dummy coef)
6      27   0x00000000  N/A (dummy coef)            59   0x00000000  N/A (dummy coef)
6      28   0x00000000  N/A (dummy coef)            60   0x00000000  N/A (dummy coef)
7      29   0x00000000  N/A (dummy coef)            61   0x00000000  N/A (dummy coef)
7      30   0x00000000  N/A (dummy coef)            62   0x00000000  N/A (dummy coef)
7      31   0x00000000  N/A (dummy coef)            63   0x00000000  N/A (dummy coef)
7      32   0x00000000  N/A (dummy coef)            64   0x00000000  N/A (dummy coef)

Vertical Filter Coefficients
       Luma                                         Chroma
Phase  Seq  Value       Calculation                 Seq  Value       Calculation
0      65   0x00000000  (Ph0 T1 << 16) | Ph0 T0     97   0x00000000  (Ph0 T1 << 16) | Ph0 T0
0      66   0x00000000  (Ph0 T3 << 16) | Ph0 T2     98   0x00000000  (Ph0 T3 << 16) | Ph0 T2
0      67   0x00004000  (Ph0 T5 << 16) | Ph0 T4     99   0x00004000  (Ph0 T5 << 16) | Ph0 T4
0      68   0x00000000  (Ph0 T7 << 16) | Ph0 T6     100  0x00000000  (Ph0 T7 << 16) | Ph0 T6
1      69   0x058CFBEF  (Ph1 T1 << 16) | Ph1 T0     101  0x058CFBEF  (Ph1 T1 << 16) | Ph1 T0
1      70   0x1457F749  (Ph1 T3 << 16) | Ph1 T2     102  0x1457F749  (Ph1 T3 << 16) | Ph1 T2
1      71   0xF3CC3D04  (Ph1 T5 << 16) | Ph1 T4     103  0xF3CC3D04  (Ph1 T5 << 16) | Ph1 T4
1      72   0xFB4E06C8  (Ph1 T7 << 16) | Ph1 T6     104  0xFB4E06C8  (Ph1 T7 << 16) | Ph1 T6
2      73   0x08D8F9AF  (Ph2 T1 << 16) | Ph2 T0     105  0x08D8F9AF  (Ph2 T1 << 16) | Ph2 T0
2      74   0x2C36F143  (Ph2 T3 << 16) | Ph2 T2     106  0x2C36F143  (Ph2 T3 << 16) | Ph2 T2
2      75   0xF1432C36  (Ph2 T5 << 16) | Ph2 T4     107  0xF1432C36  (Ph2 T5 << 16) | Ph2 T4
2      76   0xF9AF08D8  (Ph2 T7 << 16) | Ph2 T6     108  0xF9AF08D8  (Ph2 T7 << 16) | Ph2 T6
3      77   0x06C8FB4E  (Ph3 T1 << 16) | Ph3 T0     109  0x06C8FB4E  (Ph3 T1 << 16) | Ph3 T0
3      78   0x3D04F3CC  (Ph3 T3 << 16) | Ph3 T2     110  0x3D04F3CC  (Ph3 T3 << 16) | Ph3 T2
3      79   0xF7491457  (Ph3 T5 << 16) | Ph3 T4     111  0xF7491457  (Ph3 T5 << 16) | Ph3 T4
3      80   0xFBEF058C  (Ph3 T7 << 16) | Ph3 T6     112  0xFBEF058C  (Ph3 T7 << 16) | Ph3 T6
4      81   0x00000000  N/A (dummy coef)            113  0x00000000  N/A (dummy coef)
4      82   0x00000000  N/A (dummy coef)            114  0x00000000  N/A (dummy coef)
4      83   0x00000000  N/A (dummy coef)            115  0x00000000  N/A (dummy coef)
4      84   0x00000000  N/A (dummy coef)            116  0x00000000  N/A (dummy coef)
5      85   0x00000000  N/A (dummy coef)            117  0x00000000  N/A (dummy coef)
5      86   0x00000000  N/A (dummy coef)            118  0x00000000  N/A (dummy coef)
5      87   0x00000000  N/A (dummy coef)            119  0x00000000  N/A (dummy coef)
5      88   0x00000000  N/A (dummy coef)            120  0x00000000  N/A (dummy coef)
6      89   0x00000000  N/A (dummy coef)            121  0x00000000  N/A (dummy coef)
6      90   0x00000000  N/A (dummy coef)            122  0x00000000  N/A (dummy coef)
6      91   0x00000000  N/A (dummy coef)            123  0x00000000  N/A (dummy coef)
6      92   0x00000000  N/A (dummy coef)            124  0x00000000  N/A (dummy coef)
7      93   0x00000000  N/A (dummy coef)            125  0x00000000  N/A (dummy coef)
7      94   0x00000000  N/A (dummy coef)            126  0x00000000  N/A (dummy coef)
7      95   0x00000000  N/A (dummy coef)            127  0x00000000  N/A (dummy coef)
7      96   0x00000000  N/A (dummy coef)            128  0x00000000  N/A (dummy coef)

Example 3: Num_h_taps = 9; num_v_taps = 7; max_phases = num_h_phases = num_v_phases = 4

Now consider the case where the number of taps in the Horizontal dimension is different to that in the Vertical dimension. For this case, when loading the coefficients for the dimension for which the number of taps is smaller, each phase of coefficients must be padded with zeros up to the larger number of taps.

Example coefficients are shown in hexadecimal form in Table 8-5 (horizontal) and Table 8-6 (vertical).

Table 8-5: Example 9-Tap Coefficients

Phase Tap 0 Tap 1 Tap 2 Tap 3 Tap 4 Tap 5 Tap 6 Tap 7 Tap 8

0 0x0000 0x0000 0x0000 0x0000 0x4000 0x0000 0x0000 0x0000 0x0000

1 0xFFB1 0x0123 0x047C 0x10C6 0x3A26 0xF5F0 0x037D 0xFF0A 0x0046

2 0xFF84 0x01D1 0xF865 0x2490 0x2A42 0xF3D0 0x0490 0xFEB4 0x0060

3 0xFF9E 0x017E 0xF93F 0x3619 0x14D7 0xF846 0x0312 0xFF1B 0x0043


Table 8-6: Example 7-Tap Coefficients

Phase Tap 0 Tap 1 Tap 2 Tap 3 Tap 4 Tap 5 Tap 6

0 0x0000 0x0000 0x0000 0x4000 0x0000 0x0000 0x0000

1 0x006D 0xFD69 0x0F04 0x3A81 0xF6FE 0x0204 0xFFA4

2 0x00B2 0xFB85 0x2160 0x2B58 0xF4E0 0x02B0 0xFF81

3 0x0097 0xFBE1 0x332B 0x1627 0xF8B1 0x01DF 0xFFA5
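The tap padding described for Example 3 can be sketched in Python as follows. The function names are hypothetical, and the sketch assumes that the zero padding is appended after the last tap of the shorter filter; this section states only that each phase is padded up to the larger number of taps. The data is phase 1 of Table 8-5 and Table 8-6.

```python
TABLE_8_5_PHASE_1 = [0xFFB1, 0x0123, 0x047C, 0x10C6, 0x3A26,
                     0xF5F0, 0x037D, 0xFF0A, 0x0046]          # 9 horizontal taps
TABLE_8_6_PHASE_1 = [0x006D, 0xFD69, 0x0F04, 0x3A81,
                     0xF6FE, 0x0204, 0xFFA4]                  # 7 vertical taps


def pad_taps(taps, num_h_taps, num_v_taps):
    """Pad one phase with zeros up to the larger of the two tap counts.

    Assumption: the zeros are appended after the last tap.
    """
    width = max(num_h_taps, num_v_taps)
    return list(taps) + [0] * (width - len(taps))


def pack_words(taps):
    """Pair taps into 32-bit words, (T[n+1] << 16) | T[n], zero-filling an odd end."""
    padded = list(taps) + [0] * (len(taps) % 2)
    return [(padded[i + 1] << 16) | padded[i] for i in range(0, len(padded), 2)]
```

Both filters then load five words per phase: the 9-tap phase needs no padding, and the 7-tap phase gains two zero taps before packing.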

The resulting coefficient file for download is shown in Table 8-7.

Table 8-7: Example 3 Coefficient Set Download Format

Horizontal Filter Coefficients for Luma (left) and for Chroma (right). Ph = Phase #, T = Tap #

Load Sequence Number, Value, Calculation (Luma) Load Sequence Number, Value, Calculation (Chroma)

Phase 0

1 0x00000000 (Ph0 T1 << 16) | Ph0 T0 21 0x00000000 (Ph0 T1 << 16) | Ph0 T0

2 0x00000000 (Ph0 T3 << 16) | Ph0 T2 22 0x00000000 (Ph0 T3 << 16) | Ph0 T2

3 0x00004000 (Ph0 T5 << 16) | Ph0 T4 23 0x00004000 (Ph0 T5 << 16) | Ph0 T4

4 0x00000000 (Ph0 T7 << 16) | Ph0 T6 24 0x00000000 (Ph0 T7 << 16) | Ph0 T6

5 0x00000000 (0 << 16) | Ph0 T8 25 0x00000000 (0 << 16) | Ph0 T8

Phase 1

6 0x0123FFB1 (Ph1 T1 << 16) | Ph1 T0 26 0x0123FFB1 (Ph1 T1 << 16) | Ph1 T0

7 0x10C6047C (Ph1 T3 << 16) | Ph1 T2 27 0x10C6047C (Ph1 T3 << 16) | Ph1 T2

8 0xF5F03A26 (Ph1 T5 << 16) | Ph1 T4 28 0xF5F03A26 (Ph1 T5 << 16) | Ph1 T4

9 0xFF0A037D (Ph1 T7 << 16) | Ph1 T6 29 0xFF0A037D (Ph1 T7 << 16) | Ph1 T6

10 0x00000046 (0 << 16) | Ph1 T8 30 0x00000046 (0 << 16) | Ph1 T8

Phase 2

11 0x01D1FF84 (Ph2 T1 << 16) | Ph2 T0 31 0x01D1FF84 (Ph2 T1 << 16) | Ph2 T0

12 0x2490F865 (Ph2 T3 << 16) | Ph2 T2 32 0x2490F865 (Ph2 T3 << 16) | Ph2 T2

13 0xF3D02A42 (Ph2 T5 << 16) | Ph2 T4 33 0xF3D02A42 (Ph2 T5 << 16) | Ph2 T4

14 0xFEB40490 (Ph2 T7 << 16) | Ph2 T6 34 0xFEB40490 (Ph2 T7 << 16) | Ph2 T6

15 0x00000060 (0 << 16) | Ph2 T8 35 0x00000060 (0 << 16) | Ph2 T8

Phase 3

16 0x017EFF9E (Ph3 T1 << 16) | Ph3 T0 36 0x017EFF9E (Ph3 T1 << 16) | Ph3 T0

17 0x3619F93F (Ph3 T3 << 16) | Ph3 T2 37 0x3619F93F (Ph3 T3 << 16) | Ph3 T2

18 0xF84614D7 (Ph3 T5 << 16) | Ph3 T4 38 0xF84614D7 (Ph3 T5 << 16) | Ph3 T4

19 0xFF1B0312 (Ph3 T7 << 16) | Ph3 T6 39 0xFF1B0312 (Ph3 T7 << 16) | Ph3 T6

20 0x00000043 (0 << 16) | Ph3 T8 40 0x00000043 (0 << 16) | Ph3 T8


Table 8-7: Example 3 Coefficient Set Download Format (Cont’d)

Vertical Filter Coefficients for Luma (left) and for Chroma (right). Ph = Phase #, T = Tap #

Load Sequence Number, Value, Calculation (Luma) Load Sequence Number, Value, Calculation (Chroma)

Phase 0

41 0x00000000 (Ph0 T1 << 16) | Ph0 T0 61 0x00000000 (Ph0 T1 << 16) | Ph0 T0

42 0x40000000 (Ph0 T3 << 16) | Ph0 T2 62 0x40000000 (Ph0 T3 << 16) | Ph0 T2

43 0x00000000 (Ph0 T5 << 16) | Ph0 T4 63 0x00000000 (Ph0 T5 << 16) | Ph0 T4

44 0x00000000 (0 << 16) | Ph0 T6 64 0x00000000 (0 << 16) | Ph0 T6

45 0x00000000 N/A dummy coef 65 0x00000000 N/A dummy coef

Phase 1

46 0xFD69006D (Ph1 T1 << 16) | Ph1 T0 66 0xFD69006D (Ph1 T1 << 16) | Ph1 T0

47 0x3A810F04 (Ph1 T3 << 16) | Ph1 T2 67 0x3A810F04 (Ph1 T3 << 16) | Ph1 T2

48 0x0204F6FE (Ph1 T5 << 16) | Ph1 T4 68 0x0204F6FE (Ph1 T5 << 16) | Ph1 T4

49 0x0000FFA4 (0 << 16) | Ph1 T6 69 0x0000FFA4 (0 << 16) | Ph1 T6

50 0x00000000 N/A dummy coef 70 0x00000000 N/A dummy coef

Phase 2

51 0xFB8500B2 (Ph2 T1 << 16) | Ph2 T0 71 0xFB8500B2 (Ph2 T1 << 16) | Ph2 T0

52 0x2B582160 (Ph2 T3 << 16) | Ph2 T2 72 0x2B582160 (Ph2 T3 << 16) | Ph2 T2

53 0x02B0F4E0 (Ph2 T5 << 16) | Ph2 T4 73 0x02B0F4E0 (Ph2 T5 << 16) | Ph2 T4

54 0x0000FF81 (0 << 16) | Ph2 T6 74 0x0000FF81 (0 << 16) | Ph2 T6

55 0x00000000 N/A dummy coef 75 0x00000000 N/A dummy coef

Phase 3

56 0xFBE10097 (Ph3 T1 << 16) | Ph3 T0 76 0xFBE10097 (Ph3 T1 << 16) | Ph3 T0

57 0x1627332B (Ph3 T3 << 16) | Ph3 T2 77 0x1627332B (Ph3 T3 << 16) | Ph3 T2

58 0x01DFF8B1 (Ph3 T5 << 16) | Ph3 T4 78 0x01DFF8B1 (Ph3 T5 << 16) | Ph3 T4

59 0x0000FFA5 (0 << 16) | Ph3 T6 79 0x0000FFA5 (0 << 16) | Ph3 T6

60 0x00000000 N/A dummy coef 80 0x00000000 N/A dummy coef
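The packing used in the Example 3 download format can be sketched in a few lines of Python. This is only an illustration of the rule read from the tables (two 16-bit coefficients per 32-bit word, lower-numbered tap in the low half, zero padding up to the larger tap count); the function name pack_phase and its arguments are hypothetical and are not part of the delivered driver.

```python
def pack_phase(taps, max_taps):
    """Pack one phase of 16-bit coefficients into 32-bit load words.

    taps     -- coefficient values for this phase, tap 0 first
    max_taps -- max(num_h_taps, num_v_taps); shorter phases are zero-padded
    """
    # Keep only the low 16 bits so negative values become two's complement.
    padded = [t & 0xFFFF for t in taps] + [0] * (max_taps - len(taps))
    # An odd tap count leaves the upper half of the last word at zero.
    if len(padded) % 2:
        padded.append(0)
    # Odd-numbered tap goes in bits 31:16, even-numbered tap in bits 15:0.
    return [(padded[i + 1] << 16) | padded[i] for i in range(0, len(padded), 2)]


# Phase 1 of the 9-tap horizontal and 7-tap vertical example filters.
h_phase1 = [0xFFB1, 0x0123, 0x047C, 0x10C6, 0x3A26, 0xF5F0, 0x037D, 0xFF0A, 0x0046]
v_phase1 = [0x006D, 0xFD69, 0x0F04, 0x3A81, 0xF6FE, 0x0204, 0xFFA4]
h_words = pack_phase(h_phase1, 9)
v_words = pack_phase(v_phase1, 9)
print([f"0x{w:08X}" for w in h_words])
print([f"0x{w:08X}" for w in v_words])
```

With max_taps = 9, each phase occupies five load words, so the 7-tap vertical phase ends with an all-zero dummy word, as in load sequence numbers 45, 50, 55, and 60.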


Coefficient Preloading Using a .coe File

To preload the scaler with coefficients (mandatory when in Constant mode), you must specify, using the CORE Generator GUI or the EDK GUI, a .coe file that contains the coefficients you want to use. It is important that the .coe file specified is in the correct format. The coefficients specified in the .coe file become hard-coded into the hardware during synthesis.

Generating .coe Files

Generating .coe files can be accomplished by either extracting coefficients from a file provided with the core (refer to the next section) or developing your own set of coefficients. Developing your own coefficients is a very complex and subjective operation, and is beyond the scope of this document. Refer to Answer Record 35262 Coefficient Calculations for more information on generating video scaler coefficients.

Extracting Coefficients From xscaler_coefs.c File

The pCore version of the video scaler includes a software driver. The coefficients are included in this driver in the xscaler_coefs.c file. The pCore version of the core can be generated by selecting "EDK pCore" in the CORE Generator GUI. Coefficients from this file can be extracted manually; however, it is important to know the format of this file.


All coefficients required for any conversion are provided with the SW Driver. The filename is xscaler_coefs.c. You may modify this file, and the driver code that reads the coefficients from it, as you see fit.

The file defines 19 “bins” of coefficients. You must select which bin to use according to your application. In the delivered driver, the file xscaler.c includes a function called XScaler_CoeffBinOffset, which assesses the scaling requirements you specify (for example, input/output rectangle sizes) and calculates which bin of coefficients is required. In this driver, the bins have been allocated as per Table 8-8. This function may be used independently for all Horizontal, Vertical, Luma, and Chroma filter operations.

Table 8-8: Coefficient “Binning” in SW Driver (xscaler_coefs.c)

Bin # SF=input_size/output_size Comments

1 SF<1 All up-scaling cases

1+Ceil((output_size*16)/input_size) (bins 2 to 17) 1<SF<16 (All down-scaling cases) General down-scaling coefficients. Down-scaling filter coefficients include anti-aliasing characteristics that differ according to scale-factor.

For example:

• Down-scaling 1920 to 1440: Use bin 13

• Down-scaling 1080 to 1000: Use bin 16

• Down-scaling 1080 to 144: Use bin 4

18 N/A Unity coefficient in center tap

19 1920/1280 (1080/720) Example user-specific case for HD down-scaling conversion
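As a rough illustration of the binning rule in Table 8-8, the following Python sketch computes a bin number from an input and output size. It is not the driver's XScaler_CoeffBinOffset function; the name coeff_bin is hypothetical, mapping the unity case (SF = 1) to bin 18 is an assumption, and the user-specific bin 19 is not handled.

```python
def coeff_bin(input_size, output_size):
    """Return the coefficient bin suggested by Table 8-8 (illustrative only)."""
    if output_size > input_size:
        return 1                      # SF < 1: all up-scaling cases
    if output_size == input_size:
        return 18                     # assumption: unity scaling uses the unity bin
    # Down-scaling: 1 + Ceil((output_size * 16) / input_size), using integer math.
    return 1 + -(-(output_size * 16) // input_size)


for in_size, out_size in [(1920, 1440), (1080, 1000), (1080, 144), (640, 1280)]:
    print(in_size, "->", out_size, ": bin", coeff_bin(in_size, out_size))
```

Integer ceiling division is used here so that exact ratios such as 1920 to 1440 are not disturbed by floating-point rounding.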


Within each “bin,” four further levels of granularity can be observed. In order of decreasing size of granularity, these levels are:

• Number of taps defined

• Number of phases defined

• Phase number (one line in file)

• Tap number (one element of each line), newest (right-most or lowest) first

For example, the first set of coefficients, defined for two taps and two phases, is given as:

// bin # 1; num_taps = 2; num_phases = 2

1018, 15366,

8192, 8192

The second set of coefficients, defined for two taps and three phases, is given immediately afterwards as:

/* bin # 1; num_taps = 2; num_phases = 3 */

1018, 15366,

5852, 10532,

10532, 5852,

And so forth.

Format for .coe Files

The guidelines for creating a .coe file are as follows:

• Coefficients may be specified in either 16-bit binary form or signed decimal form.

• First line of a 16-bit binary file must be memory_initialization_radix=2;

• First line of a signed decimal file must be memory_initialization_radix=10;

• Second line of all .coe files must be memory_initialization_vector=

• All coefficient entries must end with a comma (",") except the final entry which must end with a semicolon ";".

• Final entry must have a carriage return at the end after the semicolon.

• All coefficient sets must be listed consecutively, starting with set 0.

• All sets in the file must be of equal size in terms of the number of coefficient entries.

• Number of coefficient entries in all sets depends upon:

  • Max_coef_sets

  • Max_phases

  • Max_taps (=max(num_h_taps, num_v_taps))

  • User setting for "Separate Y/C coefficients"

  • User setting for "Chroma_format"

  • User setting for "Separate H/V coefficients"

The simplest method is to specify an intermediate value num_banks:

num_banks=4;

if (Separate H/V coefficients = 0) then

num_banks := num_banks/2;

end;

if (Separate Y/C coefficients = 0) or (chroma_format=4:4:4) then

num_banks := num_banks/2;

end;

Consequently, the number of entries in the .coe file can be defined as:

num_coefs_in_coe_file = max_coef_sets x num_banks x max_phases x max_taps

• Within each set, coefficient banks must be specified in the following order:

Table 8-9: Ordering of Coefficients in .coe File for Different Coefficient Sharing Options

Separate Y/C Coefficients Separate H/V Coefficients Bank Order in .coe File

True True HY, HC, VY, VC

True False Y, C

False True H, V

False False Single set only

• Within each bank, all phases must be listed consecutively, starting with phase 0, followed by phase 1, etc.

• The number of phases specified (per bank) in the .coe file must be equal to Max_Phases, even for filters that use fewer phases. Set all coefficients in unused phases to 0 (decimal) or 0000000000000000 (16b binary).

• Within each phase, all coefficients must be listed consecutively. The first specified coefficient for any phase represents the value applied to the newest (rightmost or lowest) tap in the aperture.
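The guidelines above can be sketched as a small Python generator for a signed decimal .coe file. This is a minimal illustration under stated assumptions, not a Xilinx tool: the function names and the nested-list input layout (sets, then banks, then phases, then taps) are hypothetical, while the radix and vector lines, the comma and semicolon rules, and the zero padding follow the list above.

```python
def num_banks(separate_hv, separate_yc, chroma_444=False):
    """Intermediate value num_banks from the pseudo-code above."""
    banks = 4
    if not separate_hv:
        banks //= 2
    if not separate_yc or chroma_444:
        banks //= 2
    return banks


def num_coefs_in_coe_file(max_coef_sets, banks, max_phases, max_taps):
    return max_coef_sets * banks * max_phases * max_taps


def coe_text(sets, max_phases, max_taps):
    """sets[s][b][p] is the list of signed decimal taps for one phase."""
    entries = []
    for banks in sets:
        for phases in banks:
            # Unused phases are emitted as all-zero rows.
            rows = list(phases) + [[]] * (max_phases - len(phases))
            for taps in rows:
                # Shorter filters are padded with zeros up to max_taps.
                entries += list(taps) + [0] * (max_taps - len(taps))
    lines = ["memory_initialization_radix=10;", "memory_initialization_vector="]
    lines += [f"{v}," for v in entries[:-1]] + [f"{entries[-1]};"]
    return "\n".join(lines) + "\n"   # final entry is followed by a line ending


text = coe_text([[[[86, 16212, 86]]]], max_phases=2, max_taps=4)
print(text)
```

The entry counts this gives for the three examples that follow are 48, 384, and 32, which match the file line-numbers shown in Tables 8-10 through 8-12 (two header lines plus the entries).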

Table 8-10 shows an example of a .coe file with the following specification:

num_h_taps = num_v_taps = 12;

max_phases = 4;

max_coef_sets = 1;

Separate H/V Coefficients = False;

Separate Y/C Coefficients = False;


Both signed decimal and 16-bit binary forms are shown.

Table 8-10: .coe File Example 1

Phase Tap File line-number Line text (signed decimal form) Line text (16-bit binary form)

N/A N/A 1 memory_initialization_radix=10; memory_initialization_radix=2;

2 memory_initialization_vector= memory_initialization_vector=

0 0 3 0, 0000000000000000,

0 1 4 162, 0000000010100010,

0 2 5 0, 0000000000000000,

0 3 6 -1069, 1111101111010011,

0 4 7 0, 0000000000000000,

0 5 8 5199, 0001010001001111,

0 6 9 8167, 0001111111100111,

0 7 10 4457, 0001000101101001,

0 8 11 0, 0000000000000000,

0 9 12 -616, 1111110110011000,

0 10 13 0, 0000000000000000,

0 11 14 85, 0000000001010101,

1 0 15 28, 0000000000011100,

1 1 16 155, 0000000010011011,

1 2 17 -186, 1111111101000110,

1 3 18 -1062, 1111101111001010,

1 4 19 960, 0000001111000000,

1 5 20 6311, 0001100010100111,

1 6 21 7842, 0001111010100010,

1 7 22 3246, 0000110010101110,

1 8 23 -538, 1111110111100110,

1 9 24 -518, 1111110111111010,

1 10 25 72, 0000000001001000,

1 11 26 73, 0000000001001001,

2 0 27 53, 0000000000110101,

2 1 28 125, 0000000001111101,

2 2 29 -366, 1111111010010010,

2 3 30 -890, 1111110010000110,

2 4 31 2060, 0000100000001100,

2 5 32 7209, 0001110000101001,


2 6 33 7209, 0001110000101001,

2 7 34 2060, 0000100000001100,

2 8 35 -890, 1111110010000110,

2 9 36 -366, 1111111010010010,

2 10 37 125, 0000000001111101,

2 11 38 53, 0000000000110101,

3 0 39 73, 0000000001001001,

3 1 40 72, 0000000001001000,

3 2 41 -518, 1111110111111010,

3 3 42 -538, 1111110111100110,

3 4 43 3246, 0000110010101110,

3 5 44 7842, 0001111010100010,

3 6 45 6311, 0001100010100111,

3 7 46 960, 0000001111000000,

3 8 47 -1062, 1111101111001010,

3 9 48 -186, 1111111101000110,

3 10 49 155, 0000000010011011,

3 11 50 28; 0000000000011100;

- - 51 “” “”
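The two columns of Table 8-10 are related by 16-bit two's complement encoding. As a small hedged sketch (the helper name to_bin16 is hypothetical), a signed decimal coefficient can be converted to the 16-bit binary form as follows:

```python
def to_bin16(value):
    """Return the 16-bit two's complement binary string for a signed value."""
    if not -32768 <= value <= 32767:
        raise ValueError("coefficient does not fit in 16 bits")
    # Masking with 0xFFFF maps negative values to their two's complement form.
    return format(value & 0xFFFF, "016b")


for coef in (0, 162, -1069, 5199, -616, 85):
    print(f"{coef:6d} -> {to_bin16(coef)}")
```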

Table 8-11 shows an example of a .coe file with the following specification:

num_h_taps = 12, num_v_taps = 12;

max_phases = 4;

max_coef_sets = 2;

Separate H/V Coefficients = True;

Separate Y/C Coefficients = True;


Just signed decimal form is shown. For clarity's sake, the same coefficient values have been used for each bank. Be aware that these are not realistic coefficients. Also note that this list includes ellipses to show continuation, and that it does not include a complete set of coefficients.

Table 8-11: .coe File Example 2

Set Bank Phase Tap File line-number Line Text

N/A 1 memory_initialization_radix=10;

2 memory_initialization_vector=

0 0 (HY) 0 0 3 0,

0 0 (HY) 0 1 4 162,

0 0 (HY) 0 2 5 0,

0 0 (HY) 0 3 6 -1069,

0 0 (HY) 0 … … …

0 0 (HY) 1 0 15 28,

0 0 (HY) 1 1 16 155,

0 0 (HY) 1 2 17 -186,

0 0 (HY) … … … …

0 0 (HY) 3 0 39 73,

0 0 (HY) 3 1 40 72,

0 0 (HY) 3 … … …

0 0 (HY) 3 11 50 28,

0 1 (HC) 0 0 51 0,

0 1 (HC) 0 1 52 162,

0 1 (HC) 0 2 53 0,

0 …… … … …

0 1 (HC) 3 0 87 73,

0 1 (HC) 3 1 88 72,

0 1 (HC) 3 … … …

0 1 (HC) 3 11 98 28,

0 2 (VY) 0 0 99 0,

0 2 (VY) 0 1 100 162,

0 2 (VY) 0 2 101 0,

0 …… … … …

0 2 (VY) 3 0 135 73,

0 2 (VY) 3 1 136 72,

0 2 (VY) 3 … … …


0 2 (VY) 3 11 146 28,

0 3 (VC) 0 0 147 0,

0 3 (VC) 0 1 148 162,

0 3 (VC) 0 2 149 0,

0 …… … … …

0 3 (VC) 3 0 183 73,

0 3 (VC) 3 1 184 72,

0 3 (VC) 3 … … …

0 3 (VC) 3 11 194 28,

1 0 (HY) 0 0 195 0,

1 0 (HY) 0 1 196 162,

1 0 (HY) 0 2 197 0,

1 0 (HY) … … … …

1 0 (HY) 3 11 242 28,

1 1 (HC) 0 0 243 0,

1 …… … … …

1 2 (VY) 0 0 291 0,

1 …… … … …

1 3 (VC) 3 0 375 73,

1 3 (VC) 3 1 376 72,

1 3 (VC) 3 … … …

1 3 (VC) 3 11 386 28;

- - - - 387 “”

Table 8-12 shows an example of a .coe file with the following specification:

num_h_taps = 4, num_v_taps = 3;

max_phases = 4;

max_coef_sets = 1;

Separate H/V Coefficients = True;

Separate Y/C Coefficients = False;

Just signed decimal form is shown.

Table 8-12: .coe File Example 3

Bank Phase Tap File line-number Line Text Notes

N/A 1 memory_initialization_radix=10;

2 memory_initialization_vector=

0 (H) 0 0 3 -104,

0 (H) 0 1 4 1018,

0 (H) 0 2 5 15364,

0 (H) 0 3 6 106,

0 (H) 1 0 7 -240,

0 (H) 1 1 8 4793,

0 (H) 1 2 9 12022,

0 (H) 1 3 10 -191,

0 (H) 2 0 11 -282,

0 (H) 2 1 12 8474,

0 (H) 2 2 13 8474,

0 (H) 2 3 14 -282,

0 (H) 3 0 15 -191,

0 (H) 3 1 16 12022,

0 (H) 3 2 17 4793,

0 (H) 3 3 18 -240,

1 (V) 0 0 19 86,

1 (V) 0 1 20 16212,

1 (V) 0 2 21 86,

1 (V) - - 22 0, Padding value

1 (V) 1 0 23 512,

1 (V) 1 1 24 16068,

1 (V) 1 2 25 -197,

1 (V) - - 26 0, Padding value

1 (V) 2 0 27 1243,

1 (V) 2 1 28 15539,

1 (V) 2 2 29 -398,

1 (V) - - 30 0, Padding value

1 (V) 3 0 31 2829,


1 (V) 3 1 32 14099,

1 (V) 3 2 33 -544,

1 (V) - - 34 0; Padding value

- - - 35 “”

Coefficient Readback

For coefficient verification purposes, a feature of the video scaler allows the user to read back coefficients in the active coefficient memory.

Dedicated connections are included to facilitate this feature:

• coef_set_bank_rd_addr(11:8): Coefficient set read-address

• coef_set_bank_rd_addr(1:0): Coefficient bank read-address. 00=HY, 01=HC, 10=VY, 11=VC

• coef_mem_rd_addr(13:8): Coefficient phase read-address

• coef_mem_rd_addr(3:0): Coefficient tap read-address

• coef_mem_output(15:0): Coefficient readback output

• intr_coef_mem_rdbk_rdy: Output flag indicating that the specified coefficient bank is ready for reading

Before changing the set and bank read address, set bit 3 of the control register to 0. Using coef_set_bank_rd_addr, provide the set number and bank number of the coefficients to be read back. Then activate the new bank of coefficients by setting bit 3 of the control register to 1. A FIFO is then populated with that bank of coefficients. Once the intr_coef_mem_rdbk_rdy interrupt has gone high, use coef_mem_rd_addr to provide the phase and tap number of the coefficient to be read from that bank. The coefficient appears at coef_mem_output three clk cycles later.

Reading back coefficients does not cause image distortion, and may be executed during normal operation.
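The address fields listed above can be composed as in the following Python sketch. The bit positions are taken from the port descriptions in this section; the helper names are hypothetical, and how the resulting values are driven onto the ports (or written through a register interface) is outside the scope of this sketch.

```python
BANKS = {"HY": 0b00, "HC": 0b01, "VY": 0b10, "VC": 0b11}


def set_bank_rd_addr(coef_set, bank):
    """Build coef_set_bank_rd_addr: set in bits 11:8, bank in bits 1:0."""
    if not 0 <= coef_set <= 0xF:
        raise ValueError("set number must fit in 4 bits")
    return (coef_set << 8) | BANKS[bank]


def mem_rd_addr(phase, tap):
    """Build coef_mem_rd_addr: phase in bits 13:8, tap in bits 3:0."""
    if not 0 <= phase <= 0x3F:
        raise ValueError("phase number must fit in 6 bits")
    if not 0 <= tap <= 0xF:
        raise ValueError("tap number must fit in 4 bits")
    return (phase << 8) | tap


# Select set 1, bank VY, then address phase 3, tap 11 within that bank.
print(hex(set_bank_rd_addr(1, "VY")), hex(mem_rd_addr(3, 11)))
```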

Chapter 9

Performance

The target maximum clock frequencies for all scaler input clocks are shown in Table 9-1.

Table 9-1: Target Maximum Clock Frequencies

Family Speed grade FMax (MHz)

Virtex-5 -1 225

Virtex-5 -2 250

Virtex-5 -3 275

Virtex-6 -1 250

Virtex-6 -2 280

Spartan-6 -2 150

Spartan-6 -3 160

Spartan-3A DSP -4 150

Spartan-3A DSP -5 160

It is very important to ensure that the clock rate available supports worst-case conversions. This chapter includes detailed information and examples for worst-case scenarios.

Every user of the Xilinx Video Scaler should have a worst-case scenario in mind. The factors that may contribute to this scenario include:

• Maximum line length to be handled in the system (into and out from the scaler)

• Maximum number of lines per frame (in and out)

• Maximum frame refresh rate

• Chroma format (4:4:4, 4:2:2, or 4:2:0)

• Clock FMax (depends upon the selected device)

These factors may contribute to decisions made for configuring the scaler and its supporting system. For example, you may decide to use the scaler in its dual-engine parallel Y/C configuration to achieve the desired scale factor and frame rate. Using a dual-engine scaler allows the scaler to process more data per clock cycle at the cost of increased resource usage. You may also elect to change speed grade or even device family, depending upon your findings.

The size of the scaler implementation is determined by the number of taps and number of phases in the filter and the number of engines. The number of taps and number of phases do not impact the clock frequency.


How do you establish whether or not the scaler will meet the application requirements? The approach taken is to calculate the minimum clock frequency required to make the intended conversions possible.

Definitions:

Subject Image: The area of the active image that is driven into the scaler. This may or may not be the entire image, dependent upon your requirements. It is of dimensions (SubjWidth x SubjHeight).

Active Image: The entire active input image, some or all of which constitutes the Subject Image. It is of dimensions (ActWidth x ActHeight).

FPix: The input sample rate.

F'clk': The 'clk' frequency. Data is read from the internal input line buffer, processed, and written to the internal output buffer using the system clock.

FLineIn: The input Line Rate, which could be driven by the input rate or by the scaler LineReq rate. FLineIn must represent the maximum burst frequency of the input lines. For example, 720P exhibits an FLineIn of 45 kHz.

FFrameIn: The fixed frame refresh rate (Hz), which is the same for both input and output.

To make the calculations according to the previous definitions and assumptions, it is necessary to distinguish between the following cases:

• Live Video mode: An input video stream feeds directly into the scaler.

  • The user may not hold off the input stream.

  • The system must be able to cope with the constant flow of video data.

• Memory mode: The user may control the input feed using back-pressure/handshaking by implementing an input frame buffer.

The following example cases illustrate how to calculate the clock frequencies required to sustain the throughput for given usage scenarios.

Live Video Mode

If no input frame buffer is used, and the timing of the input video format drives the scaler, then the number of 'clk' cycles available per H period becomes important. FLineIn is a predetermined frequency in this case, often (but not necessarily) defined according to a known broadcast video format (for example 1080i/60, 720P, CCIR601 etc.).

The critical factors may be summarized as follows:

• ProcessingOverheadPerComponent – The number of extraneous cycles needed by the scaler to complete the generation of one component of the output line, in addition to the actual processing cycles. This is required due to filter latency and State-Machine initialization. For all cases in this document, this has been approximated as 50 cycles per component per line.


• CyclesPerOutputLine – This is the number of cycles the scaler requires to generate one output line, of multiple components. The final calculation depends upon the chroma format and the filter configuration (YC4:2:2 only), and can be summarized as:

For 4:4:4:

CyclesPerOutputLine = Max(output_h_size,SubjWidth) + ProcessingOverheadPerComponent

For 4:2:2 dual-engine:

CyclesPerOutputLine = Max(output_h_size,SubjWidth) + 2*ProcessingOverheadPerComponent

For 4:2:2 single-engine:

CyclesPerOutputLine = 2*Max(output_h_size,SubjWidth) + 3*ProcessingOverheadPerComponent

For 4:2:0:

CyclesPerOutputLine = 2*Max(output_h_size,SubjWidth) + 3*ProcessingOverheadPerComponent

For more details on the above estimations, continue reading. Otherwise, skip to the MaxVHoldsPerInputAperture bullet below.

The general calculation is:

CyclesPerOutputLine=(CompsPerEngine*Max(output_h_size,SubjWidth))+ OverHeadMult*ProcessingOverheadPerComponent

The CompsPerEngine and OverHeadMult values can be extracted from Table 9-2.

Table 9-2: Throughput Calculations for Different Chroma Formats

Chroma Format NumEngines CompsPerEngine OverHeadMult

4:4:4 (e.g., RGB) 3 1 1

4:2:2 High performance 2 1 2

4:2:2 Standard performance 1 2 3

4:2:0 1 2 3

NumEngines

This is the number of engines used in the implementation. For the YC4:2:2 case, a higher number of engines uses more resources - particularly BRAM and DSP48.

CompsPerEngine

This is the largest number of full h-resolution components to be processed by this instance of the scaler. When using YC, each chroma component constitutes 0.5 in this respect.

OverHeadMult

For each component processed by a single engine, the ProcessingOverheadPerComponent overhead factor must be included in the equation. The number of times this overhead needs to be factored in depends upon the number of components processed by the worst-case engine.

CyclesRequiredPerOutputLine = Max(output_h_size,SubjWidth) + ProcessingOverheadPerComponent

We modify this to include the chroma components. The YC case is shown in this example.

CyclesRequiredPerOutputLine = 2*Max(output_h_size,SubjWidth) + 3*ProcessingOverheadPerComponent

• MaxVHoldsPerInputAperture – This is the maximum number of times the vertical aperture needs to be 'held' (especially up-scaling):

MaxVHoldsPerInputAperture = CEIL(Vertical scaling ratio)

where

vertical scaling ratio = output_v_size/input_v_size

Given the preceding information, it is now necessary to calculate how many cycles it will take to generate the worst-case number of output lines for any vertical aperture:

• MaxClksTakenPerVAperture – This is the number of cycles it will take to generate MaxVHoldsPerInputAperture lines.

MaxClksTakenPerVAperture = CyclesRequiredPerOutputLine x MaxVHoldsPerInputAperture

It is then necessary to decide the minimum 'clk' frequency required to achieve your goals according to this calculation:

MinF'clk' = FLineIn x MaxClksTakenPerVAperture

Also useful is the reciprocal relationship that defines the number of 'clk' cycles available before the next line is written into the input line buffer, for a predefined 'clk' frequency:

ClksAvailablePerLine = F'clk'/FLineIn

Within this number of cycles, all output lines that require the use of the current vertical filter aperture must be completely generated. If MaxClksTakenPerVAperture < ClksAvailablePerLine, then the desired conversion is possible using the current clock frequency, without the use of an input frame buffer.
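The Live Video mode calculation above can be collected into one short Python sketch. It uses the CompsPerEngine and OverHeadMult values from Table 9-2 and the approximate overhead of 50 cycles per component per line; the function name, argument names, and configuration labels are hypothetical, and the result is an estimate only.

```python
# (CompsPerEngine, OverHeadMult) per configuration, from Table 9-2.
CHROMA_CONFIG = {
    "4:4:4": (1, 1),
    "4:2:2 dual": (1, 2),
    "4:2:2 single": (2, 3),
    "4:2:0": (2, 3),
}


def min_clk_live(f_line_in, subj_width, input_v_size,
                 output_h_size, output_v_size, config, overhead=50):
    """Estimate MinF'clk' in Hz for Live Video mode."""
    comps, mult = CHROMA_CONFIG[config]
    cycles_per_line = comps * max(output_h_size, subj_width) + mult * overhead
    # MaxVHoldsPerInputAperture = CEIL(output_v_size / input_v_size)
    holds = -(-output_v_size // input_v_size)
    return f_line_in * cycles_per_line * holds


# 1080i/60 pass-through, single engine (the unity case).
unity = min_clk_live(33750, 1920, 540, 1920, 540, "4:2:2 single")
print(f"{unity / 1e6:.2f} MHz")
```

Comparing the returned value with the FMax for the selected device and speed grade in Table 9-1 indicates whether the conversion is feasible without an input frame buffer.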

Some examples follow. They are estimates only, and are subject to change.

Example 1: The Unity Case

1080i/60 YC4:2:2 'passthrough'
Vertical scaling ratio = 1.00
Horizontal scaling ratio = 1.00
FLineIn = 33750
Single-engine implementation

CyclesRequiredPerOutputLine = 2*1920 + 150 (approximately)
MaxVHoldsPerInputAperture = round_up(540/540) = 1
MaxClksTakenPerVAperture = 3990 * 1 = 3990
MinF'clk' = 33750*3990 = 134.66 MHz

Shrink-factor inputs:

hsf = 2^20 x (1/1.0) = 0x100000
vsf = 2^20 x (1/1.0) = 0x100000

This case is possible with no input buffer using Spartan-3A DSP because MinF'clk' is less than the core FMax, as shown in Table 9-1.
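The shrink-factor inputs quoted in these examples follow the form 2^20 x (input size / output size). A minimal Python sketch is given below; the function name is hypothetical, and truncating (rather than rounding) the fractional part is an assumption chosen because it reproduces the hexadecimal values quoted in this chapter.

```python
def shrink_factor(input_size, output_size):
    """Return hsf or vsf as 2^20 * input_size / output_size, truncated."""
    return (input_size << 20) // output_size


for in_size, out_size in [(1920, 1920), (640, 800), (800, 640), (1280, 640)]:
    print(f"{in_size} -> {out_size}: 0x{shrink_factor(in_size, out_size):06X}")
```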

Example 2: Up-scaling 640x480 60 Hz YC4:2:2 to 800x600

Assuming 30 kHz line rate
Vertical scale ratio = 1.25
Horizontal scale ratio = 1.25
FLineIn = 30000
Single-engine implementation

CyclesRequiredPerOutputLine = 2*800 + 150 (approximately)
MaxVHoldsPerInputAperture = round_up(600/480) = 2
MaxClksTakenPerVAperture = 1750 * 2 = 3500
MinF'clk' = 30000*3500 = 105 MHz

Shrink-factor inputs:

hsf = 2^20 x (1/1.25) = 0x0CCCCC
vsf = 2^20 x (1/1.25) = 0x0CCCCC

This case is easily possible with no input buffer, in Spartan-3A DSP.

Example 3: Up-scaling 640x480 60 Hz YC4:2:2 to 1920x1080p60

Assuming 30 kHz line rate
Vertical scale ratio = 2.25
Horizontal scale ratio = 3.0
FLineIn = 30000
Single-engine implementation

CyclesRequiredPerOutputLine = 2*1920 + 150 (approximately)
MaxVHoldsPerInputAperture = round_up(1080/480) = 3
MaxClksTakenPerVAperture = 3990 * 3 = 11970
MinF'clk' = 30000*11970 = 359.1 MHz

Shrink-factor inputs:

hsf = 2^20 x (1/3.0) = 0x055555
vsf = 2^20 x (1/2.25) = 0x071C71

Without an input frame buffer, this conversion will not work in any device currently available.

Example 4: Up-scaling 640x480 60 Hz YC4:2:2 to 1920x1080p60

Assuming 30 kHz line rate
Vertical scale ratio = 2.25
Horizontal scale ratio = 3.0
FLineIn = 30000
Dual-engine implementation

CyclesPerOutputLine = 1*1920 + 2*50 (approximately)
MaxVHoldsPerInputAperture = round_up(1080/480) = 3
MaxClksTakenPerVAperture = 2020 * 3 = 6060
MinF'clk' = 30000*6060 = 181.8 MHz

Shrink-factor inputs:

hsf = 2^20 x (1/3.0) = 0x055555
vsf = 2^20 x (1/2.25) = 0x071C71

For a dual-engine implementation, without an input frame buffer, this conversion will work in devices that support this clock frequency.


Example 5: Down-scaling 800x600 60Hz YC4:2:2 to 640x480

Assuming 30 kHz line rate
Vertical scale ratio = 0.8
Horizontal scale ratio = 0.8
FLineIn = 30000
Single-engine implementation

CyclesRequiredPerOutputLine = 2*800 + 150 (approximately)
MaxVHoldsPerInputAperture = round_up(480/600) = 1
MaxClksTakenPerVAperture = 1750 * 1 = 1750
MinF'clk' = 30000*1750 = 52.5 MHz

Shrink-factor inputs:

hsf = 2^20 x (1/0.8) = 0x140000
vsf = 2^20 x (1/0.8) = 0x140000

This conversion will work in any of the supported devices and speed grades.

Example 6: Down-scaling 1080P60 YC4:2:2 to 720P/60

67.5 kHz line rate
Vertical scale ratio = 0.6667
Horizontal scale ratio = 0.6667
FLineIn = 67500
Single-engine implementation

CyclesPerOutputLine = 2*1920 + 3*50 (approximately)
MaxVHoldsPerInputAperture = round_up(720/1080) = 1
MaxClksTakenPerVAperture = 3990 * 1 = 3990
MinF'clk' = 67500*3990 = 269.32 MHz

Shrink-factor inputs:

hsf = 2^20 x (1/0.6667) = 0x180000
vsf = 2^20 x (1/0.6667) = 0x180000

When using a single engine, this conversion will not work, with or without frame buffers (see Memory Mode below), unless using higher speed-grade Virtex-5 or Virtex-6 devices.

Example 7: Down-scaling 1080P60 YC4:2:2 to 720P/60

67.5 kHz line rate
Vertical scale ratio = 0.6667
Horizontal scale ratio = 0.6667
FLineIn = 67500
Dual-engine implementation

CyclesPerOutputLine = 1*1920 + 2*50 (approximately)
MaxVHoldsPerInputAperture = round_up(720/1080) = 1
MaxClksTakenPerVAperture = 2020 * 1 = 2020
MinF'clk' = 67500*2020 = 136.35 MHz

Shrink-factor inputs:

hsf = 2^20 x (1/0.6667) = 0x180000
vsf = 2^20 x (1/0.6667) = 0x180000

This conversion will work in any of the supported devices and speed grades.


Example 8: Down-scaling 720P/60 YC4:2:2 to 640x480

45 kHz line rate
Vertical scale ratio = 0.6667
Horizontal scale ratio = 0.5
FLineIn = 45000
Single-engine implementation

CyclesRequiredPerOutputLine = 2*1280 + 150 (approximately)
MaxVHoldsPerInputAperture = round_up(480/720) = 1
MaxClksTakenPerVAperture = 2710 * 1 = 2710
MinF'clk' = 45000*2710 = 121.95 MHz

Shrink-factor inputs:

hsf = 2^20 x (1/0.5) = 0x200000
vsf = 2^20 x (1/0.6667) = 0x180000

This conversion will work in any of the supported devices and speed grades.

Example 9: Converting 720P/60 YC4:2:2 to 1080i/60 (1920x540)

45 kHz line rate
Vertical scale ratio = 0.75
Horizontal scale ratio = 1.5
FLineIn = 45000
Single-engine implementation

CyclesRequiredPerOutputLine = 2*1920 + 150 (approximately)
MaxVHoldsPerInputAperture = round_up(540/720) = 1
MaxClksTakenPerVAperture = 3990 * 1 = 3990
MinF'clk' = 45000*3990 = 179.55 MHz

Shrink-factor inputs:

hsf = 2^20 x (1/1.5) = 0x0AAAAA
vsf = 2^20 x (1/0.75) = 0x155555

This conversion will work in Virtex-5, but not in Spartan-3A DSP, because MinF'clk' is greater than the Spartan-3A DSP FMax but less than the Virtex-5 FMax, as shown in Table 9-1.

Memory Mode

Using an input frame buffer allows you to stretch the processing time over the entire frame period (utilizing the available blanking periods). New input lines may be provided as the internal phase-accumulator dictates, instead of the input timing signals.

The critical factors may be summarized as follows:

• ProcessingOverheadPerLine – The number of extraneous cycles needed by the scaler to complete the generation of one output line, in addition to the actual processing cycles. This is required due to filter latency and State-Machine initialization. For all cases in this document, this has been approximated as 50 cycles per component per line.

• FrameProcessingOverhead – The number of extraneous cycles needed by the scaler to complete the generation of one output frame, in addition to the actual processing cycles. This is required mainly due to vertical filter latency. For all cases in this document, this has been generally approximated as 10000 cycles per frame.


• CyclesPerOutputFrame – This is the number of cycles the scaler requires to generate one output frame, of multiple components. The final calculation depends upon the chroma format (and, for YC4:2:2 only, the filter configuration), and can be summarized as:

For 4:4:4:

CyclesPerOutputFrame = Max [ (output_h_size + ProcessingOverheadPerLine)*output_v_size, (input_h_size + ProcessingOverheadPerLine)*input_v_size ] + FrameProcessingOverhead

For 4:2:2 dual-engine:

CyclesPerOutputFrame = Max [ (output_h_size + (ProcessingOverheadPerLine*2))*output_v_size, (input_h_size + (ProcessingOverheadPerLine*2))*input_v_size ] + FrameProcessingOverhead

For 4:2:2 single-engine:

CyclesPerOutputFrame = Max [ ((output_h_size*2) + (ProcessingOverheadPerLine*3))*output_v_size, ((input_h_size*2) + (ProcessingOverheadPerLine*3))*input_v_size ] + FrameProcessingOverhead

For 4:2:0:

CyclesPerOutputFrame = Max [ ((output_h_size*2) + (ProcessingOverheadPerLine*3))*output_v_size, ((input_h_size*2) + (ProcessingOverheadPerLine*3))*input_v_size ] + FrameProcessingOverhead

It is then necessary to decide the minimum 'clk' frequency according to this calculation:

MinF'clk' = FFrameIn x CyclesPerOutputFrame
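The formulas above can be checked with a short C sketch. It is a minimal illustration, assuming the approximate overhead values quoted in this chapter (50 cycles per component per line, 10000 cycles per frame); the enum and function names are hypothetical and are not part of the delivered driver.

```c
#include <assert.h>
#include <stdint.h>

/* Approximations quoted in this chapter. */
#define PROCESSING_OVERHEAD_PER_LINE 50u     /* cycles per component per line */
#define FRAME_PROCESSING_OVERHEAD    10000u  /* cycles per frame */

/* Chroma format / engine configuration (hypothetical names). */
typedef enum {
    FMT_444,
    FMT_422_DUAL_ENGINE,
    FMT_422_SINGLE_ENGINE,
    FMT_420
} scaler_format_t;

static uint64_t max_u64(uint64_t a, uint64_t b) { return a > b ? a : b; }

/* CyclesPerOutputFrame for the given format, following the formulas above. */
uint64_t cycles_per_output_frame(scaler_format_t fmt,
                                 uint32_t in_h, uint32_t in_v,
                                 uint32_t out_h, uint32_t out_v)
{
    uint64_t pixel_mult, overhead_mult;

    switch (fmt) {
    case FMT_444:               pixel_mult = 1; overhead_mult = 1; break;
    case FMT_422_DUAL_ENGINE:   pixel_mult = 1; overhead_mult = 2; break;
    case FMT_422_SINGLE_ENGINE: pixel_mult = 2; overhead_mult = 3; break;
    case FMT_420:               pixel_mult = 2; overhead_mult = 3; break;
    default: assert(0 && "unknown format"); return 0;
    }

    uint64_t overhead   = overhead_mult * PROCESSING_OVERHEAD_PER_LINE;
    uint64_t out_cycles = (pixel_mult * out_h + overhead) * out_v;
    uint64_t in_cycles  = (pixel_mult * in_h  + overhead) * in_v;

    return max_u64(out_cycles, in_cycles) + FRAME_PROCESSING_OVERHEAD;
}

/* Minimum 'clk' frequency in Hz: FFrameIn x CyclesPerOutputFrame. */
uint64_t min_clk_hz(uint32_t frames_per_second, uint64_t cycles_per_frame)
{
    return (uint64_t)frames_per_second * cycles_per_frame;
}
```

With the Example 10 inputs (1280x720 in, 1920x540 out, single-engine 4:2:2, 60 frames per second), the sketch gives 2164600 cycles per frame and a minimum clock of 129876000 Hz.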


Example 10: Converting 720P YC4:2:2 to 1080i/60 (1920x540)

Vertical scale ratio = 0.75

Horizontal scale ratio = 1.5

FFrameIn = 60

CyclesPerOutputFrame = (1920*2 + 150)*540 + 10000 (approximately) = 2164600

MinF'clk' = 60 x 2164600 = 129.88 MHz

Shrink-factor inputs:

hsf = 2^20 x (1/1.5) = 0x0AAAAA

vsf = 2^20 x (1/0.75) = 0x155555

This conversion is allowed in Spartan-3A DSP.

Note: Example 9 showed that the same conversion with no frame buffer is not possible in Spartan-3A DSP.


Appendix A: Use Cases

Typical Uses

Some scenarios for scaler usage are shown in Figure A-1 through Figure A-5. In particular, usage of the following dynamic parameter values is illustrated:

• aperture_start_line

• aperture_end_line

• aperture_start_pixel

• aperture_end_pixel

• output_h_size

• output_v_size

• hsf

• vsf

These values are very significant, and their usage is referred to throughout this document.

Figure A-1: Format Down-scaling. Example 720p to 640x480, HSF = 2^20 x 1280/640; VSF = 2^20 x 720/480

Input image 1280x720: aperture_start_pixel = 0, aperture_end_pixel = 1279, aperture_start_line = 0, aperture_end_line = 719. Output image: output_h_size = 640, output_v_size = 480.

Figure A-2: Format Up-scaling. Example 640x480 to 720p, HSF = 2^20 x 640/1280; VSF = 2^20 x 480/720

Input image 640x480: aperture_start_pixel = 0, aperture_end_pixel = 639, aperture_start_line = 0, aperture_end_line = 479. Output image: output_h_size = 1280, output_v_size = 720.

Figure A-3: Zoom (Up-scaling), HSF = 2^20 x 480/1280; VSF = 2^20 x 270/720

Input image 1280x720, 480x270 aperture: aperture_start_pixel = 750, aperture_end_pixel = 1229, aperture_start_line = 420, aperture_end_line = 689. Output image: output_h_size = 1280, output_v_size = 720.

Figure A-4: Shrink (Down-scaling). Example for Picture-in-Picture (PinP), HSF = 2^20 x 1280/480; VSF = 2^20 x 720/270

Input image 1280x720: aperture_start_pixel = 0, aperture_end_pixel = 1279, aperture_start_line = 0, aperture_end_line = 719. Output image: output_h_size = 480, output_v_size = 270.

Figure A-5: Zoom (Up-scaling) reading from External Memory, HSF = 2^20 x 480/1280; VSF = 2^20 x 270/720

Input image 1280x720, 480x270 aperture: aperture_start_pixel = 0, aperture_end_pixel = 479, aperture_start_line = 0, aperture_end_line = 269. Output image: output_h_size = 1280, output_v_size = 720.
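The shrink factors in the figure captions all follow the same pattern: 2^20 multiplied by the aperture size and divided by the output size. The following C sketch illustrates that arithmetic. The function name is hypothetical, and truncating the quotient (rather than rounding) is an assumption chosen because it reproduces the 0x0AAAAA and 0x155555 values of Example 10.

```c
#include <assert.h>
#include <stdint.h>

#define SHRINK_FRAC_BITS 20u  /* shrink factors carry 20 fractional bits */

/*
 * Shrink factor = 2^20 * (aperture size) / (output size), truncated.
 * The aperture size is (end - start + 1) because both ends are inclusive.
 */
uint32_t shrink_factor(uint32_t aperture_start, uint32_t aperture_end,
                       uint32_t output_size)
{
    assert(aperture_end >= aperture_start);
    assert(output_size > 0);

    uint64_t input_size = (uint64_t)aperture_end - aperture_start + 1u;
    return (uint32_t)((input_size << SHRINK_FRAC_BITS) / output_size);
}
```

For the zoom case of Figure A-3, shrink_factor(750, 1229, 1280) and shrink_factor(420, 689, 720) both evaluate to 393216, which is 2^20 x 0.375.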


Appendix B: Programmer Guide

Introduction

This appendix describes how to program and control the data flow for the video scaler hardware pCore. The information is sufficient for developing a software driver (API) for use in application software such as video conferencing and video analytics.

Note: A software driver is provided with the pCore so that you do not have to develop a software API as described here.

Conventions

Reserved locations in the registers will be ignored by the hardware and can be written by software with any value. Therefore the software does not need to zero or mask bits.

Unused coefficients should be set to zero. The number of taps is a compile time parameter for the IP core and needs to be known by the programmer to be able to load the coefficient tables correctly.

Note: All registers default to 0x00000000 on power-up or software reset.

Table B-1: Video Scaler Registers Overview

Address Name Read/Write Description

0x0000 control R/W General control register

0x0004 status R General readable status register

0x0008 status_error R General readable status register for errors

0x000c status_done R/W General read register for status done

0x0010 horz_shrink_factor R/W Horizontal Shrink Factor

0x0014 vert_shrink_factor R/W Vertical Shrink Factor

0x0018 aperture_horz R/W aperture_start_pixel: Location of first subject pixel in input line, relative to first active pixel in that line. aperture_end_pixel: Location of final subject pixel in input line, relative to first active pixel in that line.

0x001c aperture_vert R/W aperture_start_line: Location of first subject line in input image, relative to first active line in that image. aperture_end_line: Location of final subject line in input image, relative to first active line in that image.

0x0020 output_size R/W output_h_size: Width of output image (pixels). output_v_size: Height of output image (lines).

0x0024 num_phases R/W num_h_phases: Number of phases of coefficients in current horizontal filter set. num_v_phases: Number of phases of coefficients in current vertical filter set.

0x0028 coeff_sets R/W hcoeffset: Active coefficient set to use in horizontal filter operation. vcoeffset: Active coefficient set to use in vertical filter operation.

0x002c start_hpa_y R/W Fractional value used to initialize horizontal accumulator at rectangle left edge for luma

0x0030 start_hpa_c R/W Fractional value used to initialize horizontal accumulator at rectangle left edge for chroma

0x0034 start_vpa_y R/W Fractional value used to initialize vertical accumulator at rectangle top edge for luma

0x0038 start_vpa_c R/W Fractional value used to initialize vertical accumulator at rectangle top edge for chroma

0x003c coef_write_set_addr R/W Coefficient set write address to indicate which coefficient bank to write

0x0040 coef_values W Coefficient values to write

0x0044 coef_set_bank_rd_addr R/W Set and bank number to be read

0x0048 coef_mem_rd_addr R/W Phase and tap number to be read

0x004c coef_mem_output R Coefficient readback output

0x00F0 Version Register R Hardware version information

0x0100 Software_Reset W Writing a SOFT_RESET value to this register resets the software registers and the Video Scaler IP core. The SOFT_RESET value is determined by EDK.

0x021C GIER R/W Global Interrupt Enable Register

0x0220 ISR R/W Interrupt Status Register; Read to determine the source of the interrupt, write to clear the interrupt

0x0228 IER R/W Interrupt Enable Register; 0 to mask out an interrupt, 1 to enable an interrupt

Table B-2: control Register

0x0000 control R/W

Name Bits Description

Reserved 31:3 Reserved

Timing_Gen_Enable 2 Timing Generator enabled into video scaler. This bit enables the timing generator signals (vblank, hblank, active video) to go through to the signals on the video scaler core.

Reg_Update_Enable 1 ... edge. The registers that utilize this bit are 0x0010 through 0x0038. Usage: This bit is cleared when the next vblank of the IP core occurs.

Enable 0 Enable the Video Scaler core on the next video frame.

Table B-3: status Register

0x0004 status R/W

Name Bits Description

Reserved 31:1 Reserved

Coef_write_rdy 0 If this bit is '1' then the coefficients can be written into the core. Check at the beginning of a coefficient transfer.

Table B-4: status_error Register

0x0008 status_error R

Name Bits Description

Error_Code3 31:24 Error codes to be defined

Error_Code2 23:16 Error codes to be defined

Error_Code1 15:8 Error codes to be defined

Error_Code0 7:0 Error codes to be defined

Table B-5: status_done Register

0x000c status_done R/W

Name Bits Description

Reserved 31:24 Reserved

Reserved 23:16 Reserved

Reserved 15:8 Reserved

Reserved 7:1 Reserved

Done 0 Done bit can be polled by software for the end of video scaler operation. Usage: This bit is cleared when any value is written to the register.

Table B-6: horizontal_shrink_factor Register

0x0010 horz_shrink_factor R/W

Name Bits Description

Reserved 31:24 Reserved

hsf_int 23:20 Horizontal Shrink Factor integer

hsf_frac 19:0 Horizontal Shrink Factor fractional

Table B-7: vsf Register

0x0014 vert_shrink_factor R/W

Name Bits Description

Reserved 31:24 Reserved

vsf_int 23:20 Vertical Shrink Factor integer

vsf_frac 19:0 Vertical Shrink Factor fractional

Table B-8: aperture_horz Register

0x0018 aperture_horz R/W

Name Bits Description

Reserved 31:26 Reserved

aperture_end_pixel 28:16 Location of last pixel in line

Reserved 15:11 Reserved

aperture_start_pixel 12:0 Location of first pixel in line

Table B-9: aperture_vert Register

0x001c aperture_vert R/W

Name Bits Description

Reserved 31:27 Reserved

aperture_end_line 28:16 Location of last line in active video

Reserved 15:11 Reserved

aperture_start_line 12:0 Location of first line in active video

Table B-10: output_size Register

0x0020 output_size R/W

Name Bits Description

Reserved 31:27 Reserved

output_v_size 28:16 Number of lines in output image

Reserved 15:11 Reserved

output_h_size 12:0 Number of pixels in output image
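The register layouts above can be turned into 32-bit register words with simple shifts. The C sketch below is illustrative only: the helper names are hypothetical, and it assumes the field positions as listed in the Bits columns (integer part at 23:20 and fraction at 19:0 for the shrink factors; upper field at 28:16 and lower field at 12:0 for the aperture and output size registers).

```c
#include <assert.h>
#include <stdint.h>

/* Pack a register with one 13-bit field at [28:16] and one at [12:0]. */
static uint32_t pack_high_low(uint32_t high, uint32_t low)
{
    assert(high <= 0x1FFFu && low <= 0x1FFFu);
    return (high << 16) | low;
}

/* horz_shrink_factor / vert_shrink_factor: integer [23:20], fraction [19:0]. */
uint32_t pack_shrink_factor(uint32_t integer, uint32_t fraction)
{
    assert(integer <= 0xFu && fraction <= 0xFFFFFu);
    return (integer << 20) | fraction;
}

/* aperture_horz / aperture_vert: end at [28:16], start at [12:0]. */
uint32_t pack_aperture(uint32_t start, uint32_t end)
{
    assert(end >= start);
    return pack_high_low(end, start);
}

/* output_size: output_v_size at [28:16], output_h_size at [12:0]. */
uint32_t pack_output_size(uint32_t h_size, uint32_t v_size)
{
    return pack_high_low(v_size, h_size);
}
```

For example, pack_shrink_factor(1, 0) yields 1048576 (a shrink factor of 1.0), and pack_output_size(1280, 720) yields 0x02D00500.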

Table B-11: num_phases Register

0x0024 num_phases R/W

Name Bits Description

Reserved 31:15 Reserved

num_v_phases 14:8 Number of vertical phases

Reserved 7 Reserved

num_h_phases 6:0 Number of horizontal phases

Table B-12: coeff_sets Register

0x0028 coeff_sets R/W

Name Bits Description

Reserved 31:28 Reserved

vcoeffset 7:4 Active vertical coefficient set

hcoeffset 3:0 Active horizontal coefficient set

Table B-13: start_hpa_y Register

0x002c start_hpa_y R/W

Name Bits Description

Reserved 31:21 Reserved

start_hpa_y 20:0 Fractional value used to initialize horizontal accumulator for luma

Table B-14: start_hpa_c Register

0x0030 start_hpa_c R/W

Name Bits Description

Reserved 31:21 Reserved

start_hpa_c 20:0 Fractional value used to initialize horizontal accumulator for chroma

Table B-15: start_vpa_y Register

0x0034 start_vpa_y R/W

Name Bits Description

Reserved 31:21 Reserved

start_vpa_y 20:0 Fractional value used to initialize vertical accumulator for luma

Table B-16: start_vpa_c Register

0x0038 start_vpa_c R/W

Name Bits Description

Reserved 31:21 Reserved

start_vpa_c 20:0 Fractional value used to initialize vertical accumulator for chroma

Table B-17: Coefficient_write_set_address Register

0x003c coef_write_set_addr R/W

Name Bits Description

Reserved 31:4 Reserved

coef_write_set_addr 3:0 Address of the coefficient bank to write

Table B-18: coef_values Register

0x0040 coef_values W

Name Bits Description

coef_value_N+1 31:16 Coefficient value N+1 where N is index for the coefficient set. Usage: Each write to this register increments an internal counter by 2 to generate a coefficient set internal to the video scaler. LSB aligned for coefficients less than 16 bits.

coef_value_N 15:0 Coefficient value N where N is index for the coefficient set. Usage: Each write to this register increments an internal counter by 2 to generate a coefficient set internal to the video scaler. LSB aligned for coefficients less than 16 bits.
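Because each coef_values write carries two coefficients, a driver has to pair them up before writing. The following C sketch shows one way to do that. The function names are hypothetical; treating coefficients as signed 16-bit values and padding an odd-length list with a zero (in line with the convention that unused coefficients are set to zero) are assumptions. The required ordering of taps and phases is covered in Chapter 8 and is not modeled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One coef_values word: coefficient N in [15:0], coefficient N+1 in [31:16]. */
uint32_t pack_coef_pair(int16_t coef_n, int16_t coef_n_plus_1)
{
    return ((uint32_t)(uint16_t)coef_n_plus_1 << 16) | (uint16_t)coef_n;
}

/*
 * Pack 'count' coefficients into register words, two per word.
 * An odd trailing coefficient is paired with zero.
 * Returns the number of words produced.
 */
size_t pack_coefs(const int16_t *coefs, size_t count,
                  uint32_t *words, size_t max_words)
{
    size_t n_words = (count + 1u) / 2u;
    assert(n_words <= max_words);

    for (size_t i = 0; i < n_words; i++) {
        int16_t lo = coefs[2u * i];
        int16_t hi = (2u * i + 1u < count) ? coefs[2u * i + 1u] : 0;
        words[i] = pack_coef_pair(lo, hi);
    }
    return n_words;
}
```

Each resulting word would then be written to the coef_values register in order, after the target bank has been selected through coef_write_set_addr.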

Table B-19: Coefficient Set and Bank Read Address Register

0x0044 coef_set_bank_rd_addr R/W

Name Bits Description

Coeff Readback Set 11:8 Coefficient set to be read from the scaler

Coeff Readback Bank 1:0 Coefficient bank to be read from scaler: 00=HY; 01=HC; 10=VY; 11=VC

Table B-20: Coefficient Phase and Tap Read Address Register

0x0048 coef_mem_rd_addr R/W

Name Bits Description

Coeff Readback Phase 13:8 Coefficient phase to be read from the scaler

Coeff Readback Tap 3:0 Coefficient tap to be read from scaler

Table B-21: Coefficient Memory Readback Output Register

0x004c coef_mem_output R

Name Bits Description

Coeff Readback Output 15:0 Coefficient readout from the scaler

Table B-22: Version Register

0x00F0 Version R

Name Bits Description

HW Version 31:0 Hard-coded hardware version register

Table B-23: Software Reset Register

0x0100 Software_Reset W

Name Bits Description

Soft_Reset_Value 31:0 Soft reset value that resets the registers and the IP core. The data value is provided by the EDK Create Peripheral utility.

Table B-24: Global Interrupt Enable Register

0x021C GIER R/W

Name Bits Description

Reserved 31:1 Reserved

GIER 0 Global Interrupt Enable Register. Active High

Table B-25: Interrupt Status Register

0x0220 ISR R/W

Name Bits Description

Reserved 31:7 Reserved

intr_coef_mem_rdbk_rdy 6 Level sensitive: Output flag indicating that the specified coefficient bank is ready for reading.

intr_reg_update_done 5 Level sensitive: issued during Vertical blanking when the register values have been transferred to the active registers.

intr_coef_wr_error 4 Rising edge sensitive: issued if coefficient is written into coefficient FIFO when the FIFO is not ready.

intr_output_error 3 Rising edge sensitive: issued if frame period completes before full output frame has been delivered.

intr_input_error 2 Rising edge sensitive: issued if active_video_in is asserted before the scaler is ready to receive a new line.

intr_coef_fifo_rdy 1 Level sensitive: issued when the coefficient FIFO is ready to receive a coefficient for the current set. Stays low once a full set has been written into FIFO. Sent high during Vertical blanking.

intr_output_frame_done 0 Rising edge sensitive: issued once per complete output frame.
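A driver typically names the interrupt bits and filters the status register by the enable register before acting on it. The C sketch below illustrates that pattern. The macro and function names are hypothetical, and the positions of bits 6, 5, and 0 are taken from the Interrupt Enable Register description, on the assumption that ISR and IER share one bit layout.

```c
#include <assert.h>
#include <stdint.h>

/* Interrupt bit positions shared by ISR (0x0220) and IER (0x0228). */
#define INTR_OUTPUT_FRAME_DONE  (1u << 0)
#define INTR_COEF_FIFO_RDY      (1u << 1)
#define INTR_INPUT_ERROR        (1u << 2)
#define INTR_OUTPUT_ERROR       (1u << 3)
#define INTR_COEF_WR_ERROR      (1u << 4)
#define INTR_REG_UPDATE_DONE    (1u << 5)
#define INTR_COEF_MEM_RDBK_RDY  (1u << 6)

#define INTR_ALL_MASK    0x7Fu
#define INTR_ERROR_MASK  (INTR_INPUT_ERROR | INTR_OUTPUT_ERROR | INTR_COEF_WR_ERROR)

/* Interrupts that are both pending (ISR) and enabled (IER). */
uint32_t pending_enabled(uint32_t isr, uint32_t ier)
{
    return isr & ier & INTR_ALL_MASK;
}

/* Nonzero if any enabled error interrupt is pending. */
int has_error(uint32_t isr, uint32_t ier)
{
    return (pending_enabled(isr, ier) & INTR_ERROR_MASK) != 0;
}
```

An interrupt handler built this way would read ISR and IER, call pending_enabled(), service each set bit, and then write to ISR to clear the interrupt.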

Table B-26: Interrupt Enable Register

0x0228 IER R/W

Name Bits Description

Reserved 31:7 Reserved

intr_coef_mem_rdbk_rdy 6 Mask or Enable interrupt for intr_coef_mem_rdbk_rdy

intr_reg_update_done 5 Mask or Enable interrupt for intr_reg_update_done

intr_coef_wr_error 4 Mask or Enable interrupt for intr_coef_wr_error

intr_output_error 3 Mask or Enable interrupt for intr_output_error

intr_input_error 2 Mask or Enable interrupt for intr_input_error

intr_coef_fifo_rdy 1 Mask or Enable interrupt for intr_coef_fifo_rdy

intr_output_frame_done 0 Mask or Enable interrupt for intr_output_frame_done

Filter Coefficient Calculations

The values for the filter coefficients can be calculated with any standard digital filter tool. MATLAB® software provides a toolbox for establishing the filter coefficients once the cutoff frequency is known from the scale factor. Note that sharp cutoff frequencies are generally not desired in image processing because of the ringing (artifacts) generated at sharp transitions. Additionally, allowing some amount of aliasing can be subjectively preferred in side-by-side comparisons. The MATLAB software FIR1 function can be used as a starting point for deriving coefficient values.

Xilinx provides a C-Model that generates coefficients. Contact Xilinx support for information on how to obtain this C-Model. Refer to the Video Scaler Product Page for information about accessing the C-Model.

Video Scaler Flow Diagram

Figure B-0: Video Scaler Flow Chart

Flow chart boxes, in reading order: Start Scaling; Initialize Registers; Set Load Coef Bank; Load Coefs; Set Active Coef Bank; Enable Video Scaler (Control bit 0 = 1); Done?; New Scale Factors; Initialize Registers (HSF, VSF, Output_h/v); Set Active Coef Bank; New Coef Bank?; Set Load Coef Bank; Load Coefs; Disable Scaler (Control bit 0 = 0); Stop Scaling.
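The start-up path of the flow chart (initialize registers, select the active coefficient bank, enable the scaler, poll for done) can be sketched in C as follows. This is an illustration only: reg_write, reg_read, and scaler_settings_t are hypothetical, access through a memory-mapped pointer is an assumption, coefficient loading is omitted, and the final control value of 7 simply mirrors the example settings later in this appendix.

```c
#include <assert.h>
#include <stdint.h>

/* Register byte offsets from Table B-1. */
#define REG_CONTROL        0x0000u
#define REG_STATUS_DONE    0x000cu
#define REG_HSF            0x0010u
#define REG_VSF            0x0014u
#define REG_APERTURE_HORZ  0x0018u
#define REG_APERTURE_VERT  0x001cu
#define REG_OUTPUT_SIZE    0x0020u
#define REG_COEFF_SETS     0x0028u

/* control register bits from Table B-2. */
#define CTRL_ENABLE             (1u << 0)
#define CTRL_REG_UPDATE_ENABLE  (1u << 1)
#define CTRL_TIMING_GEN_ENABLE  (1u << 2)

/* Hypothetical accessors: 'base' points at the scaler register block. */
static void reg_write(volatile uint32_t *base, uint32_t offset, uint32_t value)
{
    base[offset / 4u] = value;
}

static uint32_t reg_read(const volatile uint32_t *base, uint32_t offset)
{
    return base[offset / 4u];
}

/* Hypothetical container for already-packed register words. */
typedef struct {
    uint32_t hsf, vsf;
    uint32_t aperture_horz, aperture_vert;
    uint32_t output_size;
    uint32_t coeff_sets;
} scaler_settings_t;

/* Initialize registers, select the active coefficient sets, then enable. */
void scaler_start(volatile uint32_t *base, const scaler_settings_t *s)
{
    reg_write(base, REG_CONTROL, 0u);                 /* Disable Scaler */
    reg_write(base, REG_HSF, s->hsf);                 /* Initialize Registers */
    reg_write(base, REG_VSF, s->vsf);
    reg_write(base, REG_APERTURE_HORZ, s->aperture_horz);
    reg_write(base, REG_APERTURE_VERT, s->aperture_vert);
    reg_write(base, REG_OUTPUT_SIZE, s->output_size);
    reg_write(base, REG_COEFF_SETS, s->coeff_sets);   /* Set Active Coef Bank */
    reg_write(base, REG_CONTROL,                      /* Enable Video Scaler */
              CTRL_TIMING_GEN_ENABLE | CTRL_REG_UPDATE_ENABLE | CTRL_ENABLE);
}

/* Poll the Done bit (status_done bit 0). */
int scaler_is_done(const volatile uint32_t *base)
{
    return (reg_read(base, REG_STATUS_DONE) & 1u) != 0;
}
```

On real hardware, base would be the peripheral base address assigned in the EDK project; in a unit test it can point at an ordinary array, as the accessors only index 32-bit words.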

System Timing Diagram

Figure B-0: System Timing Diagram

Proposed API function calls

The following functions are proposed for the L0, L1, and L2 APIs.

L0 API Function Calls

#define XScaler_Enable(InstancePtr)

#define XScaler_Disable(InstancePtr)

#define XScaler_Reset(InstancePtr)

#define XScaler_GetStatus(InstancePtr)

#define XScaler_CheckDone(InstancePtr)

#define XScaler_SetHoriShrinkFactor(InstancePtr, Integer, Fractional)

#define XScaler_GetHoriShrinkFactor(InstancePtr)

#define XScaler_SetVertShrinkFactor(InstancePtr, Integer, Fractional)

#define XScaler_GetVertShrinkFactor(InstancePtr)

#define XScaler_SetHoriAperture(InstancePtr, FirstPixel, LastPixel)

#define XScaler_GetHoriAperture(InstancePtr)

#define XScaler_SetVertAperture(InstancePtr, FirstLine, LastLine)

#define XScaler_GetVertAperture(InstancePtr)

#define XScaler_SetOutputSize(InstancePtr, Lines, Pixels)

#define XScaler_GetOutputSize(InstancePtr)

#define XScaler_SetNumPhases(InstancePtr, Vert, Hori)

#define XScaler_GetNumPhases(InstancePtr)

#define XScaler_SetCoeffSet(InstancePtr, Vert, Hori)

#define XScaler_GetCoeffSet(InstancePtr)

#define XScaler_SetHoriAccuLuma(InstancePtr, Fraction)

#define XScaler_GetHoriAccuLuma(InstancePtr)

#define XScaler_SetVertAccuLuma(InstancePtr, Fraction)

#define XScaler_GetVertAccuLuma(InstancePtr)

#define XScaler_SetHoriAccuChroma(InstancePtr, Fraction)

#define XScaler_GetHoriAccuChroma(InstancePtr)

#define XScaler_SetVertAccuChroma(InstancePtr, Fraction)

#define XScaler_GetVertAccuChroma(InstancePtr)

#define XScaler_SetWriteCoeffBankAddr(InstancePtr, Address)

#define XScaler_GetWriteCoeffBankAddr(InstancePtr)

#define XScaler_SetCoefValue(InstancePtr, NPlus1, N)

#define XScaler_GetCoefValue(InstancePtr)


L1 API Function Calls

#define XScaler_CalcCoeffs(coeffs, scale, taps, phases, coeff_precision)

• software function, no registers written

#define XScaler_WriteCoeffValues(InstancePtr, coeffs, coeff_bank)

• sets coef_write_set_addr and writes consecutively coef_values

#define XScaler_CalcScaleFactors(InstancePtr, hsf, vsf, input_h, input_v, output_h, output_v)

• software function, no registers written

#define XScaler_SetActiveCoeffBank(InstancePtr, coeff_bank)

• sets active register coeff_sets

#define XScaler_SetScalerValues(InstancePtr, reg_data_structure)

• This is the main video scaler function call, used on a per-frame basis when the shrink factor changes every frame, as in zooming applications.

The mandatory registers that need to change for a new shrink factor are:

• horz_shrink_factor

• vert_shrink_factor

• output_size

Optionally these registers may also need to be modified depending on the input resolution and user preference:

• aperture_horz

• aperture_vert

• num_phases

• coeff_sets

• start_hpa_y

• start_vpa_y

• start_hpa_c

• start_vpa_c

L2 API Function Calls

#define XScaler_Zoom(InstancePtr, zoom_factor_h, zoom_factor_v, starting_aperture_h, ending_aperture_h, starting_aperture_v, ending_aperture_v, output_h, output_v, num_of_frames)

• In a zoom operation the input image size is changing on a frame basis and the output resolution is fixed.

• Calls XScaler_CalcScaleFactors and XScaler_SetScalerValues every frame to perform the zoom function. Before beginning the zoom operation, you must preload the coefficient banks you want to use for the duration and decide when to transition to a new coefficient bank; for example, with 4 coefficient banks over 200 frames, switch banks every 50 frames.

#define XScaler_DownSize(InstancePtr, downsize_factor_h, downsize_factor_v, num_of_frames)

• In a downsize operation, the input image size is not changing on a frame basis and the output resolution is changing.

• Calls XScaler_CalcScaleFactors and XScaler_SetScalerValues every frame to perform the downsize function. Before beginning the downsize operation, you must preload the coefficient banks you want to use for the duration and decide when to transition to a new coefficient bank; for example, with 4 coefficient banks over 200 frames, switch banks every 50 frames.
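The per-frame bookkeeping behind a zoom or downsize sequence can be sketched as below. The helper names are hypothetical. Moving each aperture edge linearly between its starting and ending value, and switching coefficient banks at equal frame intervals, are assumptions made for illustration; the proposed API does not prescribe either.

```c
#include <assert.h>
#include <stdint.h>

/* Linear interpolation of one aperture coordinate across the sequence. */
uint32_t zoom_coord(uint32_t first, uint32_t last,
                    uint32_t frame, uint32_t num_frames)
{
    assert(num_frames > 1 && frame < num_frames);

    int64_t delta = (int64_t)last - (int64_t)first;
    return (uint32_t)((int64_t)first +
                      delta * (int64_t)frame / (int64_t)(num_frames - 1u));
}

/* Coefficient bank to use for a frame when banks switch at equal intervals. */
uint32_t zoom_coef_bank(uint32_t frame, uint32_t num_frames, uint32_t num_banks)
{
    assert(num_banks > 0 && num_frames >= num_banks && frame < num_frames);

    uint32_t frames_per_bank = num_frames / num_banks;
    uint32_t bank = frame / frames_per_bank;
    return (bank < num_banks) ? bank : num_banks - 1u;
}
```

With 4 banks over 200 frames, zoom_coef_bank returns 0 for frames 0 to 49, 1 for frames 50 to 99, and so on, matching the example of switching banks every 50 frames. The interpolated aperture values for each frame would then be fed into the scale factor calculation.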

Example Settings

The following examples illustrate settings for different scale factors.

Pass Through

Table B-27 is an example of pass-through of a 1280 x 720 resolution image.

Table B-27: Pass Through Register Settings

Address Name Decimal Value

0x0000 control 07

0x0010 hsf 1048576

0x0014 vsf 1048576

0x0018 aperture_start_pixel 0

0x0018 aperture_end_pixel 1279

0x001c aperture_start_line 0

0x001c aperture_end_line 719

0x0020 Output_h_size 1280

0x0020 Output_v_size 720

0x0024 num_h_phases 4

0x0024 num_v_phases 4

0x0028 h_coeff_set 0

0x0028 v_coeff_set 0

0x002c start_hpa_y 0

0x0030 start_hpa_c 0

0x0034 start_vpa_y 0

0x0038 start_vpa_c 0

0x003c Coef_set_write_addr 0

0x0040 Coef_values See Chapter 8, Coefficients

Down Sample by 2 in Horizontal and Vertical

Table B-28 is an example of scaling down a 1280 x 720 resolution image by a factor of 2 horizontally and vertically to 640 x 360.

Table B-28: Down Sample Register Settings

Address Name Decimal Value

0x0000 control 07

0x0010 hsf 2097152

0x0014 vsf 2097152

0x0018 aperture_start_pixel 0

0x0018 aperture_end_pixel 1279

0x001c aperture_start_line 0

0x001c aperture_end_line 719

0x0020 Output_h_size 640

0x0020 Output_v_size 360


0x0024 num_h_phases 4

0x0024 num_v_phases 4

0x0028 h_coeff_set 0

0x0028 v_coeff_set 0

0x002c start_hpa_y 0

0x0030 start_hpa_c 0

0x0034 start_vpa_y 0

0x0038 start_vpa_c 0

0x003c Coef_set_write_addr 0

0x0040 Coef_values See Chapter 8, Coefficients

Appendix C: System Level Design

Introduction

This appendix provides an example system that includes the video scaler core. Important system-level aspects of designing with the video scaler are highlighted, including:

• Video scaler usage with the VDMA/VFBC/MPMC or other memory interface/controller

• Inclusion of the video scaler in an EDK project

• Typical usage of video scaler in conjunction with other cores

• System level distribution of video timing and genlock signals

Example System General Configuration

The system input and output is expected to be no larger than 720P (1280Hx720V), with a maximum pixel frequency of 74.25 MHz, with equivalent clocks.

• MicroBlaze controls scale factors according to user input

• The system can upscale or downscale

• When down scaling, the full input image is scaled down and placed in the center of a black 720P background and displayed

• When upscaling, the center of the 720P input image is cropped from memory and upscaled to 720P, and displayed as a full 720P image on the output

• Operational clock frequencies are derived from the input clock

Figure C-1 shows a typical example of the video scaler in memory mode incorporated into a larger system. Here are the essential details:

• The Multiport Memory Controller (MPMC) represents the memory access point for multiple IP blocks.

• The MPMC ports are configured as Video Frame Buffer Controllers (VFBC), which allow the user to access data in rectangular fashion, making it simple to store frames of data, and access portions of any frame. This configuration is useful, for example, for cropping an area in preparation for upscaling. See the MPMC Data Sheet for more information.

• The Video Direct Memory Access (VDMA) blocks simplify the VFBC interface, and act as a SW-controllable processor peripheral. See the VDMA Data Sheet for more information.

• The Timebase Controller is a SW-configurable timing detector and generator block, which generates timing signals for distribution around the system. See the Timing Controller Data Sheet for more information.

• The On-Screen Display (OSD) block aligns the data read from memory with the timing signals and presents it as a standard-format video data stream. It also alpha-blends multiple layers of information (e.g. text, other video data). See the OSD Data Sheet for more information.

Figure C-1: Simplified System Diagram

Control Buses

In this example, MicroBlaze is configured to use the PLB v4.6. The VDMAs sit on the PLB bus directly. The Video Scaler, Timing Controller, and OSD use AXI4-Lite. The PLB-to-AXI bridge facilitates the transition between PLB and AXI buses.

VDMA0 Configuration

VDMA0 is used unidirectionally, for writing input data into the memory. Normally, it would be configured as a write-only core (C_DMA_TYPE = 0). In this case, however, it is currently configured as a bidirectional core (C_DMA_TYPE = 2) to work around an issue in the VDMA design. The read side of this core is not connected, except for the read-side clock.

The system operates using a Genlock mechanism. A rotational 5-frame buffer is defined in the external memory. Using the Genlock bus vdma_0_XIL_WD_MGENLOCK, VDMA0 communicates to VDMA1 which of the five frame locations is being written, to avoid R/W collisions.

VDMA0, in the MHS file text given below, is sourced from an engineering test-pattern generator (not included in the MHS file below). This generates a VDMA write bus that connects directly to the VDMA write port.

VDMA1 Configuration

VDMA1 is bidirectional, used for reading the original frames from memory, and writing the scaled frame back to memory.

The system operates using a Genlock mechanism. A second rotational 5-frame buffer is defined in the external memory. VDMA1 communicates to VDMA2 which frame it is writing to, using the Genlock bus vdma_1_XIL_WD_MGENLOCK.

VDMA1, in the MHS file text below, interfaces with the video scaler via a VDMA read bus (scaler input) and VDMA write bus (scaler output).

VDMA2 Configuration

VDMA2 is unidirectional, and is configured that way. It is used for reading the scaled frame from memory in order to display it. It is a Genlock slave to VDMA1.

Video Scaler Configuration

The video scaler is configured as follows:

• single-engine 4:2:2

• 11Hx11V-taps

• 64 phases

• shared YC coefficients


Its core uses a 148.5 MHz derivative of the 74.25 MHz input clock.

MPMC Configuration

The MPMC is configured to have three VFBC ports. Each port includes a FIFO. The FIFOs are configured to be 2048 pixels in length. This is especially important for VDMA1, which handles video data to/from the video scaler. The video scaler arbitrates on a line-by-line basis. It does this by analyzing the status of the rd_almost_empty and wd_almost_full flags on the VDMA buses before reading or writing one line, but it never analyzes these flags once a line-read or line-write operation has commenced. This is described in detail in the main text of this user guide. The guidelines for this port are described in the following two sections.

Scaler READ-port

• For the port that feeds data into the video scaler, ensure that there is a FIFO of a size equal to or greater than the maximum line length anticipated to be scaled by the scaler. Ideally, set this to the next power of 2 above the maximum input line length. For this example, the max line length is 1280, so the FIFO has been set to 2048 pixels.

• For systems like the VFBC, which have a FIXED threshold for the ALMOST full/empty flags, set this value to the maximum input line-length. This ensures that the rd_almost_empty flag will not be driven low until an entire line of video data is in the FIFO, ready for the scaler to accept.

Scaler WRITE-port

• For the port that feeds from the video scaler out to the memory, ensure that there is a FIFO of a size equal to or greater than the maximum line length anticipated to be output by the scaler. Ideally, set this to the next power of 2 above the maximum output line length. For this example, the max line length is 1280, so the FIFO has been set to 2048 pixels.

• For systems like the VFBC, which have a FIXED threshold for the ALMOST full/empty flags, set this value to the maximum output line-length. This ensures that the wd_almost_full flag will not be driven low until there is sufficient space in the FIFO for an entire line of video data.
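The FIFO sizing guideline for both ports amounts to rounding the maximum line length up to a power of 2. A minimal C sketch follows; the function name is hypothetical, and returning the line length itself when it is already a power of 2 is an assumption, based on the stated requirement that the FIFO be equal to or greater than the line length.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest power of two that is >= max_line_length (in pixels). */
uint32_t fifo_depth_for_line(uint32_t max_line_length)
{
    assert(max_line_length > 0 && max_line_length <= 0x80000000u);

    uint32_t depth = 1u;
    while (depth < max_line_length) {
        depth <<= 1;
    }
    return depth;
}
```

For the 1280-pixel lines of this example system, fifo_depth_for_line(1280) returns 2048, which matches the configured FIFO length.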

Cropping from Memory

Controlling the VDMA dynamically (e.g., from MicroBlaze or other processor) allows you to request any rectangle from anywhere in the image in memory, and change the position and dimensions of this rectangle on a frame-by-frame basis. One complication of doing this with the VFBC is that the FIFO almost full/empty thresholds are FIXED at compile-time. According to the guidelines above, it is necessary to set the thresholds to the maximum line length. Yet, when cropping from memory, you will be requesting a rectangle of a smaller width than the maximum line length. Consequently, the final lines may not be read from memory correctly, resulting in some distortion at the bottom of the image.

To work around this issue, it is necessary, and safe, to request more lines than you want to scale. This keeps the FIFO topped up with data. This can be achieved by setting the VDMA Read Vsize register (address offset 0x28) to a value greater than the number of lines you want. See the VDMA Data Sheet for more information. The scaler register settings should not differ from your desired values.

OSD Configuration

The OSD is configured for two layers. The first layer is video data read from VDMA2. The second layer is text overlay.

EDK MHS File Text

The following is an example EDK MHS file insert for the system described.

Note: This is NOT a complete design, but provides some idea as to the construction of a video scaler system in EDK.

BEGIN vdma
 PARAMETER INSTANCE = vdma_0
 PARAMETER HW_VER = 1.01.a
 PARAMETER C_MPMC_BASEADDR = 0x10000000
 PARAMETER C_MPMC_HIGHADDR = 0x1fffffff
 PARAMETER C_GEN_RESET = 1
 PARAMETER C_DATA_WIDTH = 16
 PARAMETER C_NUM_FSTORES = 5
 PARAMETER C_CROP_ENABLE = 1
 PARAMETER C_DMA_TYPE = 2
 PARAMETER C_BASEADDR = 0xcb480000
 PARAMETER C_HIGHADDR = 0xcb48ffff
 BUS_INTERFACE SPLB = mb_plb
 BUS_INTERFACE XIL_VFBC = vdma_0_XIL_VFBC
 BUS_INTERFACE XIL_WD_VDMA = tpg_0_XIL_VDMA_TPG_OUT
 PORT IP2INTC_Irpt = vdma_0_IP2INTC_Irpt
 PORT m_wd_frame_ptr_out = vdma_0_XIL_WD_MGENLOCK
 PORT vdma_wcmd_clk = vid_in_clk
 PORT vdma_wd_clk = vid_in_clk
 PORT vdma_rcmd_clk = vid_in_clk
 PORT vdma_rd_clk = vid_in_clk
END

BEGIN timebase
 PARAMETER INSTANCE = timebase_1
 PARAMETER HW_VER = 3.00.a
 PARAMETER C_BASEADDR = 0xc3800000
 PARAMETER C_HIGHADDR = 0xc380ffff
 PARAMETER C_MAX_LINES = 1024
 PARAMETER C_INTERCONNECT_S_AXI_MASTERS = plbv46_axi_bridge_0.M_AXI
 BUS_INTERFACE S_AXI = axi_interconnect_0
 BUS_INTERFACE XSVI_OUT = timebase_1_XSVI_OUT
 PORT ce = net_vcc
 PORT video_clk_in = vid_in_clk
 PORT fsync_o = timebase_1_fsync
 PORT IP2INTC_Irpt = timebase_1_IP2INTC_Irpt
 PORT S_AXI_ACLK = clk_100_0000MHzMMCM0
END

BEGIN vdma
 PARAMETER INSTANCE = vdma_1
 PARAMETER HW_VER = 1.01.a
 PARAMETER C_MPMC_BASEADDR = 0x10000000
 PARAMETER C_MPMC_HIGHADDR = 0x1FFFFFFF
 PARAMETER C_DATA_WIDTH = 16
 PARAMETER C_NUM_FSTORES = 5
 PARAMETER C_DMA_TYPE = 2
 PARAMETER C_CROP_ENABLE = 1
 PARAMETER C_BASEADDR = 0xcb460000
 PARAMETER C_HIGHADDR = 0xcb46ffff
 BUS_INTERFACE SPLB = mb_plb
 BUS_INTERFACE XIL_VFBC = vdma_1_XIL_VFBC
 BUS_INTERFACE XIL_RD_VDMA = scaler_0_XIL_VDMA_SCALER_IN
 BUS_INTERFACE XIL_WD_VDMA = scaler_0_XIL_VDMA_SCALER_OUT
 BUS_INTERFACE XIL_RD_SGENLOCK1 = vdma_0_XIL_WD_MGENLOCK
 BUS_INTERFACE XIL_WD_MGENLOCK = vdma_1_XIL_WD_MGENLOCK
 PORT IP2INTC_Irpt = vdma_1_IP2INTC_Irpt
END


BEGIN axi_scaler
 PARAMETER INSTANCE = scaler_0
 PARAMETER HW_VER = 4.00.a
 PARAMETER C_SEPARATE_YC_COEFS = 0
 PARAMETER C_MAX_SAMPLES_OUT_PER_LINE = 1280
 PARAMETER C_MAX_PHASES = 64
 PARAMETER C_INIT_COEF_SOURCE = 1
 PARAMETER C_YC_FILTER_CONFIG = 1
 PARAMETER C_BASEADDR = 0xc3400000
 PARAMETER C_HIGHADDR = 0xc340ffff
 PARAMETER C_NUMBER_OF_H_TAPS = 11
 PARAMETER C_NUMBER_OF_V_TAPS = 11
 PARAMETER C_MAX_COEF_SETS = 16
 PARAMETER C_SEPARATE_HV_COEFS = 1
 PARAMETER C_INTERCONNECT_S_AXI_MASTERS = plbv46_axi_bridge_0.M_AXI
 BUS_INTERFACE XIL_VDMA_SCALER_IN = scaler_0_XIL_VDMA_SCALER_IN
 BUS_INTERFACE XIL_VDMA_SCALER_OUT = scaler_0_XIL_VDMA_SCALER_OUT
 BUS_INTERFACE S_AXI = axi_interconnect_0
 PORT S_AXI_ACLK = clk_100_0000MHzMMCM0
 PORT clk = vid_in_clkx2
 PORT video_in_clk = vid_in_clk
 PORT video_out_clk = vid_in_clk
 PORT debug = xscaler_0_LEDsOut
 PORT IP2INTC_Irpt = scaler_0_IP2INTC_Irpt
 PORT vsync_i = timebase_1_XSVI_OUT_vsync
END

BEGIN vdma
 PARAMETER INSTANCE = vdma_2
 PARAMETER HW_VER = 1.01.a
 PARAMETER C_MPMC_BASEADDR = 0x10000000
 PARAMETER C_MPMC_HIGHADDR = 0x1fffffff
 PARAMETER C_USE_FSYNC = 1
 PARAMETER C_DMA_TYPE = 1
 PARAMETER C_GEN_RESET = 1
 PARAMETER C_DATA_WIDTH = 16
 PARAMETER C_NUM_FSTORES = 5
 PARAMETER C_BASEADDR = 0xcb420000
 PARAMETER C_HIGHADDR = 0xcb42ffff
 PARAMETER C_CROP_ENABLE = 1
 BUS_INTERFACE SPLB = mb_plb
 BUS_INTERFACE XIL_RD_SGENLOCK1 = vdma_1_XIL_WD_MGENLOCK
 BUS_INTERFACE XIL_VFBC = vdma_2_XIL_VFBC
 BUS_INTERFACE XIL_RD_VDMA = osd_0_XIL_RD0_VFBC
 PORT fsync = timebase_1_fsync
 PORT IP2INTC_Irpt = vdma_2_IP2INTC_Irpt
END

BEGIN axi_osd
 PARAMETER INSTANCE = osd_0
 PARAMETER HW_VER = 2.00.a
 PARAMETER C_LAYER1_TYPE = 1
 PARAMETER C_NUM_LAYERS = 2
 PARAMETER C_LAYER1_IMEM_SIZE = 96
 PARAMETER C_NUM_DATA_CHANNELS = 2
 PARAMETER C_ALPHA_CHANNEL_EN = 0
 PARAMETER C_LAYER2_TYPE = 2
 PARAMETER C_BASEADDR = 0xc3a00000
 PARAMETER C_HIGHADDR = 0xc3a0ffff
 PARAMETER C_INTERCONNECT_S_AXI_MASTERS = plbv46_axi_bridge_0.M_AXI
 PARAMETER C_OUTPUT_MODE = 1
 BUS_INTERFACE XSVI_IN = timebase_1_XSVI_OUT
 BUS_INTERFACE XSVI_OUT = osd_0_XSVI_OUT
 BUS_INTERFACE XIL_RD0_VFBC = osd_0_XIL_RD0_VFBC
 BUS_INTERFACE S_AXI = axi_interconnect_0
 PORT S_AXI_ACLK = clk_100_0000MHzMMCM0
 PORT clk = vid_in_clk
 PORT IP2INTC_Irpt = osd_0_IP2INTC_Irpt
END

BEGIN microblaze
 PARAMETER INSTANCE = microblaze_0
 PARAMETER HW_VER = 7.30.b
 PARAMETER C_DEBUG_ENABLED = 1
 PARAMETER C_ICACHE_BASEADDR = 0x10000000
 PARAMETER C_ICACHE_HIGHADDR = 0x1fffffff
 PARAMETER C_CACHE_BYTE_SIZE = 16384
 PARAMETER C_ICACHE_ALWAYS_USED = 1
 PARAMETER C_DCACHE_BASEADDR = 0x10000000
 PARAMETER C_DCACHE_HIGHADDR = 0x1fffffff
 PARAMETER C_DCACHE_BYTE_SIZE = 16384
 PARAMETER C_DCACHE_ALWAYS_USED = 1
 PARAMETER C_USE_ICACHE = 1
 PARAMETER C_USE_DCACHE = 1
 PARAMETER C_USE_BARREL = 1
 PARAMETER C_DPLB_BUS_EXCEPTION = 1
 PARAMETER C_IPLB_BUS_EXCEPTION = 1
 PARAMETER C_ILL_OPCODE_EXCEPTION = 1
 PARAMETER C_UNALIGNED_EXCEPTIONS = 1
 PARAMETER C_OPCODE_0x0_ILLEGAL = 1
 PARAMETER C_USE_HW_MUL = 2
 PARAMETER C_USE_DIV = 1
 PARAMETER C_DIV_ZERO_EXCEPTION = 1
 PARAMETER C_ICACHE_LINE_LEN = 8
 PARAMETER C_USE_MMU = 3
 PARAMETER C_MMU_ZONES = 2
 PARAMETER C_PVR = 2
 BUS_INTERFACE DPLB = mb_plb
 BUS_INTERFACE IPLB = mb_plb
 BUS_INTERFACE DXCL = microblaze_0_DXCL
 BUS_INTERFACE IXCL = microblaze_0_IXCL
 BUS_INTERFACE DEBUG = microblaze_0_mdm_bus
 BUS_INTERFACE DLMB = dlmb
 BUS_INTERFACE ILMB = ilmb
 PORT MB_RESET = mb_reset
 PORT INTERRUPT = xps_intc_0_Irq
END

BEGIN plb_v46
 PARAMETER INSTANCE = mb_plb
 PARAMETER HW_VER = 1.05.a
 PORT PLB_Clk = clk_100_0000MHzMMCM0
 PORT SYS_Rst = sys_bus_reset
END

BEGIN lmb_v10
 PARAMETER INSTANCE = ilmb
 PARAMETER HW_VER = 1.00.a
 PORT LMB_Clk = clk_100_0000MHzMMCM0
 PORT SYS_Rst = sys_bus_reset
END

BEGIN lmb_v10
 PARAMETER INSTANCE = dlmb
 PARAMETER HW_VER = 1.00.a
 PORT LMB_Clk = clk_100_0000MHzMMCM0
 PORT SYS_Rst = sys_bus_reset
END

BEGIN lmb_bram_if_cntlr
 PARAMETER INSTANCE = dlmb_cntlr
 PARAMETER HW_VER = 2.10.b
 PARAMETER C_BASEADDR = 0x00000000
 PARAMETER C_HIGHADDR = 0x00001fff
 BUS_INTERFACE SLMB = dlmb
 BUS_INTERFACE BRAM_PORT = dlmb_port
END


BEGIN lmb_bram_if_cntlr
 PARAMETER INSTANCE = ilmb_cntlr
 PARAMETER HW_VER = 2.10.b
 PARAMETER C_BASEADDR = 0x00000000
 PARAMETER C_HIGHADDR = 0x00001fff
 BUS_INTERFACE SLMB = ilmb
 BUS_INTERFACE BRAM_PORT = ilmb_port
END

BEGIN bram_block
 PARAMETER INSTANCE = lmb_bram
 PARAMETER HW_VER = 1.00.a
 BUS_INTERFACE PORTA = ilmb_port
 BUS_INTERFACE PORTB = dlmb_port
END

BEGIN axi_uartlite
 PARAMETER INSTANCE = RS232_Uart_1
 PARAMETER C_BAUDRATE = 9600
 PARAMETER C_DATA_BITS = 8
 PARAMETER C_USE_PARITY = 0
 PARAMETER C_ODD_PARITY = 0
 PARAMETER HW_VER = 1.01.a
 PARAMETER C_BASEADDR = 0x83000000
 PARAMETER C_HIGHADDR = 0x8300ffff
 PARAMETER C_INTERCONNECT_S_AXI_MASTERS = plbv46_axi_bridge_0.M_AXI
 BUS_INTERFACE S_AXI = axi_interconnect_0
 PORT RX = fpga_0_RS232_Uart_1_RX_pin
 PORT TX = fpga_0_RS232_Uart_1_TX_pin
 PORT Interrupt = RS232_Uart_1_Interrupt
 PORT S_AXI_ACLK = clk_100_0000MHzMMCM0
END

BEGIN mpmc
 PARAMETER INSTANCE = DDR3_SDRAM
 PARAMETER HW_VER = 6.03.a
 PARAMETER C_NUM_PORTS = 6
 PARAMETER C_MMCM_EXT_LOC = MMCM_ADV_X0Y9
 PARAMETER C_MEM_TYPE = DDR3
 PARAMETER C_MEM_PARTNO = MT4JSF6464HY-1G1
 PARAMETER C_MEM_ODT_TYPE = 1
 PARAMETER C_MEM_REG_DIMM = 0
 PARAMETER C_MEM_CLK_WIDTH = 1
 PARAMETER C_MEM_CE_WIDTH = 1
 PARAMETER C_MEM_CS_N_WIDTH = 1
 PARAMETER C_MEM_DATA_WIDTH = 32
 PARAMETER C_MEM_NDQS_COL0 = 3
 PARAMETER C_MEM_NDQS_COL1 = 1
 PARAMETER C_MEM_DQS_LOC_COL0 = 0x000000000000000000000000000000020100
 PARAMETER C_MEM_DQS_LOC_COL1 = 0x000000000000000000000000000000000003
 PARAMETER C_IODELAY_GRP = DDR3_SDRAM
 PARAMETER C_MPMC_CLK0_PERIOD_PS = 5000
 PARAMETER C_ARB0_ALGO = CUSTOM
 PARAMETER C_ARB0_NUM_SLOTS = 2
 PARAMETER C_ARB0_SLOT0 = 425310
 PARAMETER C_ARB0_SLOT1 = 423510
# PIM0 (XCL)
 PARAMETER C_PIM0_BASETYPE = 1
 PARAMETER C_XCL0_B_IN_USE = 1
# PIM1 (Video Input)
 PARAMETER C_PIM1_BASETYPE = 6
 PARAMETER C_PIM1_DATA_WIDTH = 64
 PARAMETER C_PI1_RD_FIFO_TYPE = DISABLED
 PARAMETER C_PI1_WR_FIFO_TYPE = SRL
 PARAMETER C_VFBC1_RDWD_DATA_WIDTH = 16
 PARAMETER C_VFBC1_RDWD_FIFO_DEPTH = 2048
 PARAMETER C_VFBC1_RD_AEMPTY_WD_AFULL_COUNT = 20
# PIM2 (Scaler IO)
 PARAMETER C_PIM2_DATA_WIDTH = 64
 PARAMETER C_PIM2_BASETYPE = 6
 PARAMETER C_VFBC2_RDWD_DATA_WIDTH = 16
 PARAMETER C_VFBC2_RDWD_FIFO_DEPTH = 2048
 PARAMETER C_PI2_WR_FIFO_TYPE = SRL
 PARAMETER C_PI2_RD_FIFO_TYPE = SRL
 PARAMETER C_VFBC2_RD_AEMPTY_WD_AFULL_COUNT = 20
# PIM3 (OSD1 - Scaled Video Output)
 PARAMETER C_PIM3_BASETYPE = 6
 PARAMETER C_PIM3_DATA_WIDTH = 64
 PARAMETER C_PI3_RD_FIFO_TYPE = SRL
 PARAMETER C_PI3_WR_FIFO_TYPE = DISABLED
 PARAMETER C_VFBC3_RDWD_DATA_WIDTH = 16
 PARAMETER C_VFBC3_RDWD_FIFO_DEPTH = 2048
 PARAMETER C_VFBC3_RD_AEMPTY_WD_AFULL_COUNT = 20
# DDR3 Parameters
 PARAMETER C_MPMC_BASEADDR = 0x10000000
 PARAMETER C_MPMC_HIGHADDR = 0x1FFFFFFF
 BUS_INTERFACE XCL0 = microblaze_0_IXCL
 BUS_INTERFACE XCL0_B = microblaze_0_DXCL
 BUS_INTERFACE VFBC1 = vdma_0_XIL_VFBC
 BUS_INTERFACE VFBC2 = vdma_1_XIL_VFBC
 BUS_INTERFACE VFBC3 = vdma_2_XIL_VFBC
 PORT MPMC_Clk0 = clk_200_0000MHzMMCM0
 PORT MPMC_Clk_200MHz = clk_200_0000MHzMMCM0
 PORT MPMC_Rst = sys_periph_reset
 PORT MPMC_Clk_Mem = clk_400_0000MHzMMCM0
 PORT MPMC_Clk_Rd_Base = clk_400_0000MHzMMCM0_nobuf_varphase
 PORT MPMC_DCM_PSEN = MPMC_DCM_PSEN
 PORT MPMC_DCM_PSINCDEC = MPMC_DCM_PSINCDEC
 PORT MPMC_DCM_PSDONE = MPMC_DCM_PSDONE
 PORT DDR3_Clk = fpga_0_DDR3_SDRAM_DDR3_Clk_pin
 PORT DDR3_Clk_n = fpga_0_DDR3_SDRAM_DDR3_Clk_n_pin
 PORT DDR3_CE = fpga_0_DDR3_SDRAM_DDR3_CE_pin
 PORT DDR3_CS_n = fpga_0_DDR3_SDRAM_DDR3_CS_n_pin
 PORT DDR3_ODT = fpga_0_DDR3_SDRAM_DDR3_ODT_pin
 PORT DDR3_RAS_n = fpga_0_DDR3_SDRAM_DDR3_RAS_n_pin
 PORT DDR3_CAS_n = fpga_0_DDR3_SDRAM_DDR3_CAS_n_pin
 PORT DDR3_WE_n = fpga_0_DDR3_SDRAM_DDR3_WE_n_pin
