MITEL PDSP16488AA0, PDSP16488AAC, PDSP16488AB0, PDSP16488AC, PDSP16488ACBR Datasheet

PDSP16488A
Single Chip 2D Convolver with Integral Line Delays
Advance Information
DS3713 - 6.4 December 1997
Supersedes version in 1996 Media IC Handbook, HB4599-1.0, and the PDSP16488A MA data sheet, DS3742
The PDSP16488A is a fully integrated, application specific, image processing device. It performs a two dimensional convolution between the pixels within a video window and a set of stored coefficients. An internal multiplier accumulator array can be multi-cycled at double or quadruple the pixel clock rate. This then gives the window size options listed in Table 1.
An internal 32kbit RAM can be configured to provide either four or eight line delays. The length of each delay can be programmed to the user's requirement, up to a maximum of 1024 pixels per line. The line delays are arranged in two groups, which may be internally connected in series or may be configured to accept separate pixel inputs. This allows interlaced video or frame to frame operations to be supported.
The 8-bit coefficients are also stored internally and can be downloaded from a host computer or from an EPROM. No additional logic is required to support the EPROM and a single device can support up to 16 convolvers.
The PDSP16488A contains an expansion adder and delay network which allows several devices to be cascaded. Convolvers with larger windows can then be fabricated as shown in Table 2.
Intermediate 32-bit precision is provided to avoid any danger of overflow, but the final result will not normally occupy all bits. The PDSP16488A thus provides a gain control block in the output path, which allows the user to align the result to the most significant end of the 32-bit word.
Table 1 Single PDSP16488A configurations

Pixel size   Window size (width × depth)   Maximum pixel rate (MHz)   Line delays
8            4 × 4                         20                         4 × 1024
8            8 × 4                         20                         4 × 1024
8            8 × 8                         10                         8 × 512
16           4 × 4                         20                         4 × 512
16           8 × 4                         10                         4 × 512

Table 2 PDSP16488As needed to implement typical window sizes

Max. pixel    Pixel   No. of PDSP16488As for N × N window size
rate (MHz)    size    3×3    5×5    7×7    9×9    11×11   15×15   23×23
10            8       1      1      1      4      4       4       9
10            16      1      2      2      -      -       -       -
20            8       1      2      2      6      6       8       -
20            16      1      4      4      -      -       -       -
40            8       1      4*     4*     -      -       -       -
40            16      2      -      -      -      -       -       -

*Maximum rate is limited to 30MHz by line store expansion delays
ORDERING INFORMATION
Commercial (0°C to +70°C)
PDSP16488A / C0 / AC (PGA)
Industrial (-40°C to +85°C)
PDSP16488A / B0 / AC (PGA)
PDSP16488A / B0 / GC (QFP)
Military (-55°C to +125°C)
PDSP16488A / A0 / AC (PGA)
PDSP16488A / A0 / GC (QFP)
PDSP16488A / MA / ACBR (PGA) MIL-STD-883 Class B*
PDSP16488A / MA / GCPR (QFP) MIL-STD-883 Class B*
*See Notes following Static Electrical Characteristics Table
Note: PDSP16488A devices are not guaranteed to cascade with PDSP16488 devices. Mitel Semiconductor do not recommend that PDSP16488A be mixed with PDSP16488 devices in a single equipment design. The PDSP16488A requires external pullup resistors in EPROM Mode (see Static Electrical Characteristics).
FEATURES
The PDSP16488A is a replacement for the PDSP16488 (see Note under Ordering Information)
8 or 16-bit Pixels with rates up to 40 MHz
Window Sizes up to 8×8 with a Single Device
Eight Internal Line Delays
Supports Interlace and Frame-to-Frame Operations
Coefficients Supplied from an EPROM or Remote Host
Expandable in both X and Y for Larger Windows
Gain Control and Pixel Output Manipulation
84-pin PGA or 132-pin QFP Package Options
Fig. 1 Typical stand-alone real time system (block diagram: composite video feeds a sync extract block, a pixel clock generator and an ADC; the ADC output drives IP7:0 directly and L7:0 through an optional field delay; an EPROM connects to the PDSP16488A address/data pins; outputs are D15:0 data and the delayed sync on DELOP)
Table 3 Signal descriptions

Signal    Type            Description
IP7:0     Input           Pixel data input to the first line delay (most significant byte in 16-bit mode).
L7:0      I/O             Pixel data input to the second group of line delays (least significant byte in 16-bit mode). Alternatively an output from the last line delay when the appropriate mode bit is set.
BYPASS    Input           The first line delay in the first group is bypassed when this input is high. No internal pullup resistor.
HRES      Input           Resets the line delay address pointers when high. Normally the composite sync signal in real time applications. In non real time systems it defines a frame store update period, when low.
X15:0     Dual function   Address/data connections from a Master or Single device to the external coefficient source, with X15 defining EPROM or Host support. Otherwise they provide the expansion data input.
D15:0     Output          Signed 16-bit scaled data or multiplexed 32-bit intermediate data. During intermediate transfers the most significant half is valid when the clock is low, and the least significant half when clock is high.
PC1       Output          During programming a Master device outputs a timing strobe on this pin. This is passed down the chain in a multiple device system, using the PC0 input on the next device.
PC0       Input           This pin is used in conjunction with PC1 in multiple device systems. It terminates the write strobe from a Master device which is EPROM supported.
DELOP     Output          This output provides a version of the HRES input which has been delayed by an amount defined by the user.
DS        I/O             The data strobe from a host computer, active low. This pin will be an output from an EPROM supported Master device which provides strobes to the remaining devices.
CE        Input           An active low enable which is internally gated with R/W and DS to perform reads or writes to the internal registers. In a Single or Master device, which is supported from an EPROM, the bottom 72 addresses are always used and CE is not needed. CE can then be used to initiate a new register load sequence after the power on load sequence.
R/W       Input           Read / not write line from the host CPU. When an EPROM is used this pin should be tied low.
PROG      I/O             This pin is normally an input which signifies that registers are to be changed or examined. It is, however, an output from an EPROM supported Single or Master device indicating to the rest of the system that registers are being updated.
CLK       Input           Clock. All events are triggered on the rising edge of CLK, except the latching of least significant expansion inputs. Internally the clock can be multiplied by two or four in order to increase the effective number of multipliers.
BIN       Output          This output indicates the result from the internal comparison. A high value indicates that the pixel was greater than the internal threshold. The output is only valid from the last device in a chain.
OVR       Output          When high this output indicates that there has been a gain control overflow.
RES       Input           Active low power on reset signal.
SINGLE    Input           Tied to ground to indicate a Single device system. Internal pullup resistor.
MASTER    Input           Tied to ground to indicate the Master device in a multiple device system. Must be left open circuit in a Single device system. Internal pullup resistor.
OEN       Input           Output enable signal. Active low.
CS3:0     Outputs         Four address bits from a Master specifying one of sixteen devices in a multiple device system. Must be externally decoded to provide chip enables for the additional devices.
F1:0      Outputs         These bits indicate the field selection given by the gain control auto select logic. The same coding as that used for Control Register bits C5:4 is used.
VDD       Power           +5V supply. All VDD pins must be connected.
GND       Power           0V supply. All GND pins must be connected.
Fig. 2 Functional block diagram
Fig. 3a Pin connections for 84 I/O pin grid array package - AC84 (Power) (bottom view)
Fig. 3b Pin connections for 132 I/O ceramic power flatpack - GC132 (Power) (top view)
Fig. 3 Pin connection diagrams (not to scale). See Table 3 for signal descriptions and Tables 4 and 5 for pinouts.
Pin   Signal     Pin   Signal     Pin   Signal     Pin   Signal
1     N/C        34    N/C        67    N/C        100   N/C
2     D0         35    X2         68    IP1        101   VDD
3     OEN        36    X3         69    GND        102   F0
4     BIN        37    X4         70    IP2        103   D15
5     PC1        38    N/C        71    N/C        104   N/C
6     VDD        39    X5         72    VDD        105   D14
7     GND        40    GND        73    IP3        106   D13
8     OVR        41    X6         74    VDD        107   GND
9     N/C        42    X7         75    IP4        108   D12
10    HRES       43    N/C        76    GND        109   GND
11    R/W        44    X8         77    IP5        110   VDD
12    CE         45    X9         78    GND        111   VDD
13    N/C        46    VDD        79    IP6        112   D11
14    N/C        47    VDD        80    VDD        113   D10
15    GND        48    VDD        81    IP7        114   D9
16    N/C        49    X10        82    VDD        115   GND
17    DS         50    MASTER     83    N/C        116   CLK
18    GND        51    N/C        84    L7         117   CLK
19    VDD        52    X11        85    GND        118   CLK
20    PROG       53    X12        86    L6         119   GND
21    GND        54    SINGLE     87    GND        120   GND
22    CS3        55    GND        88    L5         121   D8
23    CS2        56    GND        89    VDD        122   VDD
24    CS1        57    N/C        90    L4         123   D7
25    CS0        58    X13        91    VDD        124   D6
26    VDD        59    X14        92    L3         125   D5
27    RES        60    N/C        93    VDD        126   D4
28    PC0        61    X15        94    L2         127   GND
29    N/C        62    VDD        95    GND        128   D3
30    DELOP      63    BYPASS     96    L1         129   N/C
31    X0         64    IP0        97    F1         130   D2
32    X1         65    VDD        98    L0         131   D1
33    N/C        66    N/C        99    N/C        132   N/C

Table 5 Pin connections for GC132 (power) package. See Fig 3b.
Pin   Signal     Pin   Signal     Pin   Signal     Pin   Signal     Pin   Signal
A1    L0         L2    IP2        M10   X5         E12   HRES       B6    D10
B1    F1         M1    IP1        N11   X4         D13   OVR        A5    D11
C2    L1         N1    IP0        M11   X3         D12   PC1        B5    N/C
C1    L2         N2    BYPASS     N12   X2         C13   BIN        A4    D12
D2    L3         M3    X15        N13   X1         C12   OEN        B4    D13
D1    N/C        N3    X14        M13   X0         B13   D0         A3    D14
E2    L4         M4    X13        L12   DELOP      A13   D1         B3    D15
E1    L5         N4    N/C        L13   PC0        A12   D2         A2    F0
F2    L6         M5    SINGLE     K12   RES        B11   D3         F1    VDD1
G2    L7         N5    X12        K13   CS0        A11   D4         N6    VDD2
G1    IP7        M6    X11        J12   CS1        B10   D5         F13   VDD3
H2    N/C        M7    MASTER     J13   CS2        A10   D6         A6    VDD4
J1    IP6        N7    X10        H12   CS3        B9    D7         H1    GND1
J2    IP5        M8    X9         G12   PROG       A9    D8         N8    GND2
K1    IP4        N9    X8         G13   DS         B8    CLK        H13   GND3
K2    N/C        M9    X7         F12   CE         B7    N/C        A8    GND4
L1    IP3        N10   X6         E13   R/W        A7    D9

Table 4 Pin connections for AC84 (power) package. See Fig. 3a.
BASIC OPERATION
The PDSP16488A convolver performs a weighted sum of all the pixels within an N×N two dimensional window. Each pixel value is multiplied by a signed coefficient, or weight, and the products are summed together. In practice positive weights would be used to produce averaging effects, with various distribution laws, and negative weights would be used for edge enhancement. The window is moved continuously over the video frame, and for real time operation a new result must be obtained for every pixel clock. In most applications odd sized windows will be used, resulting in a centre pixel whose value is modified by the surrounding pixels.
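The weighted sum described above can be illustrated in software. The following is a minimal sketch only, not a model of the device pipeline: the function name `convolve_window`, the list-of-lists image format and the example weights are assumptions made for illustration, and the mapping of coefficients to window positions is chosen arbitrarily.

```python
def convolve_window(image, coeffs):
    """Weighted sum over an N x N window at every position where the window fits."""
    n = len(coeffs)
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(rows - n + 1):
        out_row = []
        for c in range(cols - n + 1):
            acc = 0
            # Multiply each pixel in the window by its weight and accumulate.
            for i in range(n):
                for j in range(n):
                    acc += image[r + i][c + j] * coeffs[i][j]
            out_row.append(acc)
        out.append(out_row)
    return out


# Illustrative 3 x 3 weights (hypothetical values, not taken from the data sheet).
edge = [[-1, -1, -1],
        [-1,  8, -1],
        [-1, -1, -1]]
average = [[1, 1, 1],
           [1, 1, 1],
           [1, 1, 1]]

flat = [[10] * 4 for _ in range(4)]      # uniform 4 x 4 test image
edge_result = convolve_window(flat, edge)
average_result = convolve_window(flat, average)
```

On a uniform image the edge-enhancement weights sum to zero, so every output is zero, while the averaging weights simply scale the pixel value by the number of taps.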
Output Accuracy
With 8-bit pixels, and an 8×8 window, it is possible for the accumulated sum to grow to 22 bits within a single device. With 16-bit pixels, and an 8×4 window (the maximum possible), the sum can grow to 29 bits. The PDSP16488A actually allows for word growth up to 32 bits, and thus allows several devices to be cascaded without any danger of overflow. Since coefficients can be negative, the final result is a 32-bit signed two's complement number.
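The 22-bit and 29-bit figures follow from simple word-growth arithmetic: each product needs as many bits as the pixel and the 8-bit coefficient together, and summing T products adds about log2(T) further bits. The sketch below reproduces that arithmetic; the helper name `accumulator_bits` is hypothetical.

```python
def accumulator_bits(pixel_bits, coeff_bits, taps):
    """Worst-case width of a sum of `taps` products of the given operand widths."""
    product_bits = pixel_bits + coeff_bits
    growth_bits = (taps - 1).bit_length()   # ceil(log2(taps)) for taps >= 1
    return product_bits + growth_bits


single_8bit = accumulator_bits(8, 8, 8 * 8)     # 8-bit pixels, 8 x 8 window
single_16bit = accumulator_bits(16, 8, 8 * 4)   # 16-bit pixels, 8 x 4 window
```

With a 32-bit accumulator this leaves 10 spare bits in the 8-bit case and 3 in the 16-bit case for sums across cascaded devices.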
In a particular application the desired output will lie somewhere within these 32 bits, the actual position being dependent on the coefficient values used. This causes problems in physically choosing which output pins to connect to the rest of the system. To overcome this problem the PDSP16488A contains a gain control block, which allows the final result to be aligned to the most significant end of the 32-bit internal result. The provision of the gain control block, rather than a simple shifter, allows the gain to be defined more accurately.
The sixteen most significant bits of the adjusted result are available on output pins D15:0, which include a sign bit.
Output Saturation
If the output from the convolver is driving a display, negative pixels will give erroneous results. An option is thus provided (register bits C7:6) that forces all negative results to zero, which are then interpreted as black by the display. At the same time positive results, which overflow the gain control, are forced to saturate at the most positive number, i.e. peak white. In this mode the output sign bit is always zero, and should not be connected to a D-A converter.
A separate option forces both negative and positive overflows to saturate at their respective maximum values, but in-scale negative results remain valid. A gain control overflow warning flag (OVR) is also available, which can be used in a host CPU supported system to change the gain parameters if overflows are not acceptable.
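The two saturation options can be sketched as simple clamps. This is an illustration under stated assumptions: the output is taken to be a 16-bit two's complement value, the input `value` is the unclamped result as a Python integer, and the mode names `"display"` and `"symmetric"` are hypothetical labels, not register encodings.

```python
INT16_MAX = 0x7FFF    # most positive 16-bit value (peak white)
INT16_MIN = -0x8000   # most negative 16-bit value


def saturate(value, mode):
    """Apply one of the two output saturation options to an unclamped result."""
    if mode == "display":
        # Negative results are forced to zero; positive overflows clamp to maximum.
        if value < 0:
            return 0
        return min(value, INT16_MAX)
    if mode == "symmetric":
        # Overflows clamp at either extreme; in-scale negative results pass through.
        return max(INT16_MIN, min(value, INT16_MAX))
    raise ValueError("unknown mode: " + repr(mode))
```

For example, `saturate(-5, "display")` gives 0 whereas `saturate(-5, "symmetric")` leaves the value at -5.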
Binary Output
The PDSP16488A contains a 16-bit arithmetic comparator which allows the output from the gain control block to be compared with a previously programmed value. An output flag allows the user to determine if the result was above or below a value contained within an internal register.
Multiplier Array
The PDSP16488A contains sixteen 8×8 multipliers each producing a 16-bit result. Internally the pixel clock supplied by the user can be multiplied by two or four, which together with the proprietary architecture, allows each multiplier to be used several times within a pixel clock period. This increases the effective number of multipliers, which are available to the user, from 16 to 32 or 64 respectively. This architecture produces a very efficient utilization of chip area, and allows the line delays to be accommodated on the same device.
The sixteen multipliers are arranged in a 4 deep by 4 wide array, resulting in effective arrays of 4 by 8 or 8 by 8 with the multi-cycling options. The multiplier array can also be configured to handle 16-bit signed pixels; the effective number of available multipliers is then halved.
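The relationship between clock multiplication, pixel size and the effective number of multipliers can be written down directly. The helper below is a hypothetical illustration of that arithmetic only; its name and argument checks are not part of the device interface.

```python
PHYSICAL_MULTIPLIERS = 16   # 4 deep by 4 wide array


def effective_multipliers(clock_multiplier, pixel_bits=8):
    """Effective multiplier count for a given multi-cycling factor and pixel size."""
    if clock_multiplier not in (1, 2, 4):
        raise ValueError("clock multiplier must be 1, 2 or 4")
    if pixel_bits not in (8, 16):
        raise ValueError("pixel size must be 8 or 16 bits")
    count = PHYSICAL_MULTIPLIERS * clock_multiplier
    if pixel_bits == 16:
        count //= 2         # 16-bit pixels halve the available multipliers
    return count
```

With 16-bit pixels and the clock multiplied by four this gives 32, which matches the 8×4 maximum window quoted for 16-bit operation.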
Line Delay Operation
Internal RAM is arranged in two separate groups, and can be configured to provide line delays to match the chosen size of the convolver. When a four deep arrangement is used, with 8-bit pixels, four line delays are available, and each can be programmed to contain up to 1024 pixels. In an eight deep array, or if 16-bit pixels are needed, each line can contain up to 512 pixels. Fig. 4 illustrates the options available.
The first line delay in one of the groups can optionally be switched in or out under the control of an input pin. It is used to delay the pixel input when data is obtained from another convolver in a multiple device system, or it is used to support interlaced video.
Signals L7:0 may be used as pixel inputs or outputs. They are configured as inputs at power-on to avoid possible bus conflicts, but by setting a mode control bit can become outputs. They can then be used to drive another device when multiple PDSP16488As are required.
Interlaced Video
When using real time interlaced video, a picture or frame is composed from two fields, with odd lines in one field and even lines in the other. An external field delay is thus required to gather information from adjacent lines, and the convolver needs two input buses. The bus providing the delayed pixels has an extra internal line delay. This is only used in the field containing the upper line in any pair of lines, and must be bypassed in the other field. It ensures that data from the previous field always corresponds to the line above the present active line, and avoids the need to change the position of the coefficients from one field to the next.
Fig. 5 shows the translation from physical to internal line positions, for single device interlaced systems. Line N is the line presently being convolved, which is either one or two lines previous to the line presently being produced.
When windows requiring four or more lines are to be implemented, the first line delay, in the group supplied from the L7:0 pins, must always be bypassed. This bypass option is controlled by register B, bit 7 and is not affected by the BYPASS input pin. The coefficients must be loaded into the locations shown, which match the translated line positions, with unused coefficients, shown shaded, loaded with zeros.
Defining the Length of the Line Delay
Fig. 5 defines the maximum line lengths available in each of the window size options. The actual line lengths can be defined in one of three ways, to support both real time applications, taking pixels directly from a camera, and also use in systems supported by a frame store. In the former case the line delays must be referenced to video synchronization pulses. In the latter case the line lengths are well defined, and the horizontal flyback ‘dead times’ will have been removed.
Fig. 4 Line delay configurations
To support real time applications an option is provided in which the length of the line delay is defined by the number of clocks obtained while the HRES input is low. HRES would normally be composite sync when the convolver is directly attached to an NTSC or PAL video input.
The line delay is achieved by reading the previous contents of a RAM-based line store, and then writing new information to the same address. When HRES is high, write operations are inhibited, and the address counter is reset. During an active line the counter is incremented by the pixel clock. If the maximum count is reached before the end of a line, then write operations are terminated and wraparound effects avoided.
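The read-then-write behaviour of a single line store can be mimicked in a few lines. This is a behavioural sketch under assumptions, not a timing model: the class name `LineDelay` is hypothetical, the RAM is assumed to start at zero, and returning `None` once the maximum count is reached is an arbitrary choice, since the text only states that writes are terminated.

```python
class LineDelay:
    """One RAM-based line store: read the old pixel, then write the new one."""

    def __init__(self, max_length=1024):
        self.ram = [0] * max_length
        self.addr = 0

    def clock(self, pixel, hres):
        if hres:
            # HRES high: writes are inhibited and the address counter is reset.
            self.addr = 0
            return None
        if self.addr >= len(self.ram):
            # Maximum count reached: stop writing so the store cannot wrap around.
            return None
        delayed = self.ram[self.addr]   # pixel stored one line earlier
        self.ram[self.addr] = pixel     # overwrite with the current pixel
        self.addr += 1
        return delayed


delay = LineDelay(max_length=8)
first_line = [delay.clock(p, hres=False) for p in (1, 2, 3)]
delay.clock(0, hres=True)               # sync pulse between lines
second_line = [delay.clock(p, hres=False) for p in (4, 5, 6)]
```

While the second line is clocked in, the pixels of the first line come out in the same positions, which is the one-line delay the convolver needs.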
The rising edge of HRES, marking the end of a line, is normally asynchronous to the pixel clock, and it is possible for an additional pixel to be stored on some lines. This has no effect on the convolver operation, and will not cause a cumulative shift in the pixel position from line to line.
An alternative means of defining the line length is, however, provided when an exact number of pixels is needed. HRES going low then starts the delay operation for every line, but it ceases when the 10-bit value contained in two registers is reached. This method can avoid the need to store blank pixels at the end of a line before HRES goes high. With this method the line must contain an even number of pixels but the value loaded into the control registers, defining the line length, must be one less than the even number required.
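The rule for the programmed line length (an even pixel count, loaded as that count minus one into a 10-bit value) can be checked with a small helper. The function name is hypothetical, and how the ten bits are divided between the two registers is not modelled here.

```python
def line_length_register_value(pixels_per_line):
    """Return the 10-bit value to program for an even number of pixels per line."""
    if pixels_per_line < 2 or pixels_per_line % 2 != 0:
        raise ValueError("line must contain an even number of pixels")
    value = pixels_per_line - 1          # register holds one less than the length
    if value > 0x3FF:
        raise ValueError("value does not fit in 10 bits")
    return value
```

A 512-pixel line is therefore programmed as 511, and the 1024-pixel maximum as 1023.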
In an image processing system, the pixel clock is often re-synchronized, or even inhibited, during blanking or sync. The next line is then started with a precise time interval from the end of sync (falling edge of HRES) to the first pixel clock edge. This avoids any visible pixel jitter at the beginning of the line, which would otherwise be present since pixel clock is asynchronous with respect to video sync pulses.
When using the PDSP16488A the pixel clock should not be inhibited, or re-synchronized, until the delayed version of the HRES input goes active. This is present on the DELOP output pin. This will ensure that no pixels on the right hand edge are lost due to the internal pipeline delay. If the pixel clock is a continuous signal, the user must ensure that the HRES high to low transition meets the timing requirements defined in Fig. 10. The HRES rising edge at the end of a line need not be synchronized.
When pixels are read or written to a frame store, an alternative line delay configuration is needed. Within the frame store lines would be stored in contiguous locations, with no gaps caused by the flyback period between the lines. This method of use makes the HRES defined line delay operation difficult to use, and an alternative mode of operation is provided. The HRES input is then driven by a system-provided signal, which defines a complete frame store update period. It is not a line defining signal. The high to low transition of this signal will initiate the line store update sequence and allow the internal address pointers to increment. These pointers will be synchronously reset at the end of a line, when they reach the pre-programmed value. They will then immediately start a new operation using address zero. The actual line delay must be pre-loaded into two control registers as described previously.
Write operations back to the frame store must allow for the total pipeline delay. This can be achieved by inhibiting write operations until DELOP goes low. Write operations then continue until it goes back high. The PDSP16488A assumes that data is valid when a clock signal is applied, and that it also meets the set up and hold requirements given in Fig. 10. If data is not valid due, for example, to a frame store DRAM refresh cycle, then the user must externally inhibit the clock. The clock supplied to the convolver will in this mode be a signal which defines a frame store cycle time.
The use of the convolver in a line scan system is similar to its use with a frame store. These systems have no flyback period, and the address counter must be synchronously reset at the end of the line and then allowed to continue.
Fig. 5 Line delay allocations in SINGLE device interlaced systems

Coefficient locations for each translated line position:

8×8 window
LINE N-3   C24  C25  C26  C27  C28  C29  C30  C31
LINE N-2   C56  C57  C58  C59  C60  C61  C62  C63
LINE N-1   C16  C17  C18  C19  C20  C21  C22  C23
LINE N     C48  C49  C50  C51  C52  C53  C54  C55
LINE N+1   C8   C9   C10  C11  C12  C13  C14  C15
LINE N+2   C40  C41  C42  C43  C44  C45  C46  C47
LINE N+3   C0   C1   C2   C3   C4   C5   C6   C7
LINE N+4   C32  C33  C34  C35  C36  C37  C38  C39

5×5 window
LINE N-2   C48  C49  C50  C51  C52
LINE N-1   C8   C9   C10  C11  C12
LINE N     C40  C41  C42  C43  C44
LINE N+1   C0   C1   C2   C3   C4
LINE N+2   C32  C33  C34  C35  C36

3×3 window
LINE N-1   C4   C5   C6
LINE N     C8   C9   C10
LINE N+1   C0   C1   C2
Gain Control Block
This block is provided as an aid to locating the bits of interest in the 32-bit internal result. The magnitude of the largest convolved output will depend on the size of the window, and the coefficient values used. The function of the gain control block is then to produce an output, which is accurate to 16 bits, and which is aligned to the most significant end of this 32-bit word. The sixteen most significant bits of the word are available on D15:0 and the largest number need only have one sign bit if the gain control is correctly adjusted.
Fig. 6 indicates the mechanism employed with the required function implemented in two steps. Two mode control bits, register C, bits 5:4, allow one of four 20 bit fields to be selected from the final 32-bit value. These four fields are positioned with the first at the most significant end, and then at four bit displacements down to the least significant end.
By setting an enabling bit, register C, bit 0, the field selection can optionally be done automatically. This feature should only be used in the real time operating mode, when HRES defines video lines. Internal logic examines the most significant 13, 9, or 5 bits from the 32-bit result, and makes a field selection dependent on which group does not contain identical sign bits. If fewer than five sign bits are obtained, the logic will select the field containing the most significant 20 bits. The selection is indicated by F1:0.
The automatic field selection is particularly useful when a fixed scene is being processed. The selection is reset when any internal register is updated (i.e. PROG has been low) and is then held high for ten further occurrences of the HRES input. This allows the internal multiplier/accumulator array to be completely flushed before a field selection is made. As convolver outputs of greater magnitude are produced the field selection logic will respond by selecting a more significant field. The most significant field found necessary remains selected until PROG again goes low. Even if the automatic field selection is not enabled, F1:0 will still indicate which field would have been selected. These are coded in the same way as register C, bits 5:4.
Having chosen a field, either manually or automatically, it is then multiplied by a 4-bit unsigned integer. This is contained within the user-programmed gain control register, and the multiplication will produce a 24-bit result. The middle 16 bits of this result contain the required output bits. The gain control multiplier can overflow into the unused most significant four bits if the parameters are chosen wrongly. This condition is flagged by pin OVR.
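The two steps of the gain control block, field selection followed by multiplication, can be sketched as bit manipulation. This is one reading of the description under stated assumptions: the field index used here (0 for bits 31:12 down to 3 for bits 19:0) is a hypothetical numbering and not the register C or F1:0 coding, the automatic selection is interpreted as choosing the least significant field that still contains the sign, and overflow is taken to mean that the product no longer fits in 20 signed bits. All function names are illustrative.

```python
def to_signed(value, bits):
    """Interpret the low `bits` bits of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def select_field(result32, field):
    """Extract a 20-bit field; field 0 is bits 31:12, field 3 is bits 19:0."""
    shift = 12 - 4 * field
    return to_signed(result32 >> shift, 20)


def auto_field(result32):
    """Pick the least significant field whose top bit is still a sign bit."""
    word = result32 & 0xFFFFFFFF
    for field, top_bits in ((3, 13), (2, 9), (1, 5)):
        top = word >> (32 - top_bits)
        if top == 0 or top == (1 << top_bits) - 1:   # all identical sign bits
            return field
    return 0                                         # fewer than five sign bits


def gain_control(result32, field, gain):
    """Multiply the chosen field by a 4-bit unsigned gain; return (out16, ovr)."""
    product = select_field(result32, field) * (gain & 0xF)   # fits in 24 bits
    out16 = to_signed(product >> 4, 16)                      # middle 16 bits
    ovr = not (-(1 << 19) <= product < (1 << 19))            # spilled into top 4 bits
    return out16, ovr
```

For a result of 0x00012340 the automatic selection picks the least significant field, and a gain of 1 then places 0x1234 on the output; a gain of 8 on the same field pushes the product beyond 20 bits and raises the overflow flag.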
By setting appropriate mode control bits, further manipulation of the gain control output is possible. One option, register C, bits 7:6 = 11, allows all negative outputs to be forced to zero, and at the same time positive gain control overflows will saturate at the maximum positive number. Register C, bits 7:6 = 10 will saturate positive and negative overflows at their respective maximum values, but otherwise leaves them unchanged. Occasional overflows can be tolerated in some systems, and this option prevents any gross errors.
Fig. 6 Gain control block
Expansion
Multiple devices can be connected in cascade in order to obtain window sizes larger than those provided by a single PDSP16488A. This requires an additional adder in each device which is fed from expansion data inputs. This adder is not used by a Single device or the first device in a cascaded system, and is enabled or disabled by register B, bit 4.
The first device in the cascaded system must be designated as a Master device by tying MASTER low. Its expansion input bus is then used as the source of data for the coefficient and control registers in all devices in the system.
In order to reduce the pin count required for 32-bit buses, both expansion in and data out are time-multiplexed with the phases of the pixel clock. When the clock is high the least significant half will be valid, and when the clock is low the most significant half will be valid.
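The time-multiplexing of a 32-bit word over a 16-bit bus can be pictured as splitting the word into two halves keyed to the clock phase. The sketch below is illustrative only; the function names and the dictionary keys `clk_low` and `clk_high` are assumptions, and no setup or hold timing is modelled.

```python
def mux_halves(word32):
    """Split a 32-bit word into the halves presented on each clock phase."""
    word = word32 & 0xFFFFFFFF
    return {
        "clk_low": word >> 16,       # most significant half, valid while clock is low
        "clk_high": word & 0xFFFF,   # least significant half, valid while clock is high
    }


def demux_halves(clk_low, clk_high):
    """Rebuild the signed 32-bit word from the two 16-bit halves."""
    word = ((clk_low & 0xFFFF) << 16) | (clk_high & 0xFFFF)
    return word - (1 << 32) if word & 0x80000000 else word


halves = mux_halves(-123456)
restored = demux_halves(halves["clk_low"], halves["clk_high"])
```

Recombining the two halves recovers the original signed intermediate sum, which is what the expansion adder in the next device relies on.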
In practice this multiplexing is only possible with pixel clocks up to 20MHz. Above this frequency the multiplexing must be inhibited by setting register A, bit 7. The intermediate data accuracy will then be reduced, since only the lower 16 bits of the internal 32-bit intermediate sum are available on the D15:0 output pins. In such systems the coefficients must be scaled down in order to keep the intermediate and final results down to 16 bits. The final device should not use the gain control block but instead should simply output the non-multiplexed 16-bit result. The OVR flag and pixel saturation options will not be available.
Pixel Input and Output Delays
In a real time system, when line delays are referenced to video sync pulses present on the HRES input, the first pixel from the last line delay does not appear on the L7:0 pins until the fifth active pixel clock edge after HRES has gone low. This is illustrated in Fig. 8. In a vertically expanded system, this output provides the input to the first line delays in the vertically displaced devices. The internal logic is thus designed to always expect this five clock delay. Compensation must thus be applied to the devices which are directly connected to the video source, such that the first pixel is not valid until the fifth clock rising edge.
For this reason the PDSP16488A contains an optional four clock pipeline delay on each of the pixel data inputs, as shown in Fig. 7. When the delay is used the first pixel in a video line must be available on the input pins after the first pixel clock edge. This would be so if the device were connected to an A-D converter, since that would introduce a one pixel pipeline delay. If the system introduces any further external pipeline delays, then the internal delay should be bypassed, and the user should ensure that the first pixel is valid after the fifth clock edge.
The use of this four clock delay is controlled by register B, bit 3. This delay is in addition to the delays which are provided to support expansion in both the X and Y directions, which are controlled by register D, bits 3:2. Both delays are in fact simply added together in the device, but are separately defined since they add delays for different system reasons.
Fig. 7 Multi-device delay paths (diagram annotations: S is the partial window width; for the Nth PDSP16488A in the row the delay is D = 4+S(N-1), defined by register D, bits 3:2; register D, bit 0 is 0 if S = 4, or 1 if S = 8; register B, bit 3 selects the four clock input delay)
Delay Compensation for Large Windows
A large window is composed of several partial windows each of which is implemented in an individual device. If necessary the partial window must be padded with zero coefficients to become one of the standard sizes. When constructing a large window it is necessary to delay the expansion data inputs in order to compensate for growth in the horizontal direction. Delays in the partial sums are also necessary to compensate for the total pipeline delay needed to produce the previous complete horizontal stripe.
Within each device in a horizontal stripe, apart from the first, the expansion input must be delayed by the width of the partial window, before it is added to the internal sum. Since partial windows can only be 4 or 8 pixels wide, a delay of 4 or 8 pixel clocks is needed. There is, however, an in-built delay of 4 pixels in the inter-device connection, and the PDSP16488A thus only needs an option to delay the expansion input by an additional four pixels.
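The arithmetic behind the optional extra delay is simply the partial window width minus the delay already present in the inter-device connection. The helper below is a hypothetical illustration of that subtraction and does not represent a register encoding.

```python
INTERCONNECT_DELAY = 4   # in-built delay of the inter-device connection, in pixel clocks


def extra_expansion_delay(partial_width):
    """Additional expansion input delay needed for a 4 or 8 pixel wide partial window."""
    if partial_width not in (4, 8):
        raise ValueError("partial windows can only be 4 or 8 pixels wide")
    return partial_width - INTERCONNECT_DELAY
```

A 4-pixel-wide partial window needs no additional delay, while an 8-pixel-wide one needs the optional four pixel delay switched in.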