MITEL PDSP16488AMA Datasheet

PDSP16488A MA

Single Chip 2D Convolver with Integral Line Delays

Supersedes January 1997 version, DS3742 - 3.1. DS3742 - 4.0, January 2000

The PDSP16488A is a fully integrated, application specific, image processing device. It performs a two dimensional convolution between the pixels within a video window and a set of stored coefficients. An internal multiplier accumulator array can be multi-cycled at double or quadruple the pixel clock rate. This then gives the window size options listed in Table 1.

An internal 32k bit RAM can be configured to provide either four or eight line delays. The length of each delay can be programmed to the user's requirement, up to a maximum of 1024 pixels per line. The line delays are arranged in two groups, which may be internally connected in series or may be configured to accept separate pixel inputs. This allows interlaced video or frame to frame operations to be supported.

The 8 bit coefficients are also stored internally and can be downloaded from a host computer or from an EPROM. No additional logic is required to support the EPROM and a single device can support up to 16 convolvers.

The PDSP16488A contains an expansion adder and delay network which allows several devices to be cascaded. Convolvers with larger windows can then be fabricated as shown in Table 2.

Intermediate 32 bit precision is provided to avoid any danger of overflow, but the final result will not normally occupy all bits. The PDSP16488A thus provides a multiplier in the output path, which allows the user to align the result to the most significant end of the 32 bit word.

FEATURES

■ The PDSP16488A is a fully compatible replacement for the PDSP16488

■ 8 or 16 bit pixels with rates up to 40 MHz

■ Window sizes up to 8 x 8 with a single device

■ Eight internal line delays

■ Supports interlace and frame to frame operations

■ Coefficients supplied from an EPROM or remote host

■ Expandable in both X and Y for larger windows

■ Gain control and pixel output manipulation

■ 132 pin QFP

Rev A B C D

Date MAR 1993 JUL 1996 JAN 1997

NOTE

Polyimide is used as an inter-layer dielectric and as glassivation.

Polymeric material is also used for die attach which, according to the requirement in paragraph 1.2.1.b. (2), precludes categorising this device as fully compliant. In every other respect this device has been manufactured and screened in full accordance with the requirements of Mil-Std 883 (latest revision).

Data Size   Window Size (Width x Depth)   Max Pixel Rate   Line Delays
8           4 x 4                         40MHz            4x1024
8           8 x 4                         20MHz            4x1024
8           8 x 8                         10MHz            8x512
16          4 x 4                         20MHz            4x512
16          8 x 4                         10MHz            4x512

Table 1 Single Device Configurations

Columns: Pixel Size, Max Pixel Rate (10MHz, 20MHz, 40MHz)
Rows: Window size (3x3, 5x5, 7x7, 9x9, 11x11, 15x15, 23x23)

* Maximum rate is limited to 30 MHz by line store expansion delays

Table 2 Devices needed to implement typical window sizes

CHANGE NOTIFICATION

The change notification requirements of MIL-PRF-38535 will be implemented on this device type. Known customers will be notified of any changes since the last buy when ordering further parts if significant changes have been made.

Fig. 1 Typical, Stand Alone, Real Time System

Blocks shown: pixel clock generator, sync extract, A/D converter, optional field store, EPROM, PDSP16488A convolver, power on reset. Outputs shown: delayed sync, output data.


Fig. 2 Functional Block Diagram

Blocks shown: control, multi purpose data bus (X15:0), control registers, delay, line delays, coefficient store (64), 8 x 8 array of MACs, clock, adder, scaler, comparator, mux. Pins shown: PROG, MASTER, SINGLE, DELOP, CE, DS, R/W, PC0, PC1, RES, CS3:0, IP7:0, L7:0, BYPASS, BIN, overflow, D15:0 data out, OEN.

AC PACKAGE

PIN NO  FUNCTION   PIN NO  FUNCTION   PIN NO  FUNCTION   PIN NO  FUNCTION
A1      L0         M3      X15        K12     RES        B9      D7
B1      F1         N3      X14        K13     CS0        A9      D8
C2      L1         M4      X13        J12     CS1        B8      CLK
C1      L2         N4      SPARE      J13     CS2        B7      SPARE
D2      L3         M5      SINGLE     H12     CS3        A7      D9
D1      SPARE      N5      X12        G12     PROG       B6      D10
E2      L4         M6      X11        G13     DS         A5      D11
E1      L5         M7      MASTER     F12     CE         B5      SPARE
F2      L6         N7      X10        E13     R/W        A4      D12
G2      L7         M8      X9         E12     HRES       B4      D13
G1      IP7        N9      X8         D13     OV         A3      D14
H2      SPARE      M9      X7         D12     PC1        B3      D15
J1      IP6        N10     X6         C13     BIN        A2      F0
J2      IP5        M10     X5         C12     OEN        F1      VDD
K1      IP4        N11     X4         B13     D0         N6      VDD
K2      SPARE      M11     X3         A13     D1         F13     VDD
L1      IP3        N12     X2         A12     D2         A6      VDD
L2      IP2        N13     X1         B11     D3         H1      GND
M1      IP1        M13     X0         A11     D4         N8      GND
N1      IP0        L12     DELOP      B10     D5         H13     GND
N2      BYPASS     L13     PC0        A10     D6         A8      GND

Pin out Table (84 pin PGA - AC84)


NAME (TYPE): DESCRIPTION

IP7:0 (INPUT): Pixel data input to the first line delay. [Most significant byte in 16 bit mode]

L7:0 (I/O): Pixel data input to the second group of line delays. [Least significant byte in 16 bit mode]. Alternatively an output from the last line delay when the appropriate mode bit is set.

BYPASS (INPUT): The first line delay in the first group is bypassed when this input is active (high). No internal pull up.

HRES (INPUT): Resets the line delay address pointers when high. Normally the composite sync signal in real time applications. In non real time systems it defines a frame store update period, when low.

X15:0 (DUAL FUNCTION): Address/data connections from a MASTER or SINGLE device to the external coefficient source, with X15 defining EPROM or Host support. Otherwise they provide the expansion data input.

D15:0 (OUTPUT): Signed 16 bit scaled data or multiplexed 32 bit intermediate data. During intermediate transfers the most significant half is valid when the clock is low, and the least significant half when clock is high.

PC1 (OUTPUT): During programming a MASTER device outputs a timing strobe on this pin. This is passed down the chain in a multiple device system, using the PC0 input on the next device.

PC0 (INPUT): This pin is used in conjunction with PC1 in multiple device systems. It terminates the write strobe from a MASTER device which is EPROM supported.

DELOP (OUTPUT): This output provides a version of the HRES input which has been delayed by an amount defined by the user.

DS (I/O): The data strobe from a host computer. Active low. This pin will be an output from an EPROM supported MASTER device which provides strobes to the remaining devices.

CE (INPUT): An active low enable which is internally gated with R/W and DS to perform reads or writes to the internal registers. In a SINGLE or MASTER device, which is supported from an EPROM, the bottom 72 addresses are always used and CE is not needed. CE can then be used to initiate a new register load sequence after the power on load sequence.

R/W (INPUT): Read / not write line from the host CPU. When an EPROM is used this pin should be tied low.

PROG (I/O): This pin is normally an input which signifies that registers are to be changed or examined. It is, however, an output from an EPROM supported SINGLE or MASTER device indicating to the rest of the system that registers are being updated.

CLK (INPUT): Clock. All events are triggered on the rising edge of the clock, except the latching of least significant expansion inputs. Internally the clock can be multiplied by two or four in order to increase the effective number of multipliers.

BIN (OUTPUT): This output indicates the result from the internal comparison. A high value indicates that the pixel was greater than the internal threshold. The output is only valid from the last device in a chain.

OV (OUTPUT): When high this output indicates that there has been a gain control overflow.

RES (INPUT): Active low power on reset signal.

SINGLE (INPUT): Tied to ground to indicate a SINGLE device system. Internal pull up resistor.

MASTER (INPUT): Tied to ground to indicate the MASTER device in a multiple device system. Must be left open circuit in a SINGLE device system. Internal pull up.

OEN (INPUT): Output enable signal. Active low.

CS3:0 (OUTPUTS): Four address bits from a MASTER specifying one of sixteen devices in a multiple device system. Must be externally decoded to provide chip enables for the additional devices.

F1:0 (OUTPUTS): These bits indicate the field selection given by the auto select logic. The same coding as that used for Control Register bits C5:4 is used.

VCC / GND (SUPPLY): Four Power and ground pairs. All must be connected.


BASIC OPERATION

The PDSP16488A convolver performs a weighted sum of all the pixels within an N x N two dimensional window. Each pixel value is multiplied by a signed coefficient, or weight, and the products are summed together. In practice positive weights would be used to produce averaging effects, with various distribution laws, and negative weights would be used for edge enhancement. The window is moved continuously over the video frame, and for real time operation a new result must be obtained for every pixel clock. In most applications odd sized windows will be used, resulting in a centre pixel whose value is modified by the surrounding pixels.
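As an illustration of the weighted sum described above, the following Python sketch computes it in software for a small frame. It is a behavioural model only and says nothing about the device's internal pipeline. The function names and the 3 x 3 weights are assumptions chosen for the example, and the device's own coefficient ordering is not modelled.

```python
def convolve_window(window, coeffs):
    """Weighted sum of an N x N pixel window with signed coefficients."""
    return sum(p * c
               for row_p, row_c in zip(window, coeffs)
               for p, c in zip(row_p, row_c))


def convolve_frame(frame, coeffs):
    """Slide the window over the frame, one result per window position."""
    n = len(coeffs)
    rows, cols = len(frame), len(frame[0])
    out = []
    for y in range(rows - n + 1):
        out_row = []
        for x in range(cols - n + 1):
            window = [frame[y + dy][x:x + n] for dy in range(n)]
            out_row.append(convolve_window(window, coeffs))
        out.append(out_row)
    return out


# Illustrative edge enhancement weights (hypothetical values, sum is zero).
EDGE = [[-1, -1, -1],
        [-1,  8, -1],
        [-1, -1, -1]]

flat = [[10] * 4 for _ in range(4)]
print(convolve_frame(flat, EDGE))  # -> [[0, 0], [0, 0]]
```

A flat region gives zero output with these weights because they sum to zero, which is the behaviour expected of an edge enhancing kernel.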

OUTPUT ACCURACY

With 8 bit pixels, and an 8 x 8 window, it is possible for the accumulated sum to grow to 22 bits within a single device. With 16 bit pixels, and an 8 x 4 window (the maximum possible), the sum can grow to 29 bits. The PDSP16488A actually allows for word growth up to 32 bits, and thus allows several devices to be cascaded without any danger of overflow. Since coefficients can be negative, the final result is a 32 bit signed two's complement number.
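The bit counts quoted above follow from the product width plus the growth caused by summing many products. A small sketch of that arithmetic, assuming 8 bit coefficients and a helper name (accumulator_bits) invented for this example:

```python
def accumulator_bits(pixel_bits, coeff_bits, taps):
    """Bits needed to hold the sum of `taps` products without overflow."""
    product_bits = pixel_bits + coeff_bits
    growth_bits = (taps - 1).bit_length()  # ceil(log2(taps)) for taps >= 1
    return product_bits + growth_bits


print(accumulator_bits(8, 8, 8 * 8))   # -> 22 (8 bit pixels, 8 x 8 window)
print(accumulator_bits(16, 8, 8 * 4))  # -> 29 (16 bit pixels, 8 x 4 window)
```

Both results leave headroom within the 32 bit internal word, which is what allows partial sums from several devices to be added together.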

In a particular application the desired output will lie somewhere within these 32 bits, the actual position being dependent on the coefficient values used. This causes problems in physically choosing which output pins to connect to the rest of the system. To overcome this problem the PDSP16488A contains an output multiplier, or gain control, which allows the final result to be aligned to the most significant end of the 32 bit internal result. The provision of a multiplier, rather than a simple shifter, allows the gain to be defined more accurately.

The sixteen most significant bits of the adjusted result are available on output pins, and contain a sign bit.

MULTIPLIER ARRAY

The PDSP16488A contains sixteen 8x8 multipliers each producing a 16 bit result. Internally the pixel clock supplied by the user can be multiplied by two or four, which together with the proprietary architecture, allows each multiplier to be used several times within a pixel clock period. This increases the effective number of multipliers, which are available to the user, from 16 to 32 or 64 respectively. This architecture produces a very efficient utilization of chip area, and allows the line delays to be accommodated on the same device.

The sixteen multipliers are arranged in a 4 deep by 4 wide array, resulting in effective arrays of 4 by 8 or 8 by 8 with the multi-cycling options. The multiplier array can also be configured to handle 16 bit signed pixels; the effective number of available multipliers is then halved.

LINE DELAY OPERATION

Internal RAM is arranged in two separate groups, and can be configured to provide line delays to match the chosen size of the convolver. When a four deep arrangement is used, with 8 bit pixels, four line delays are available, and each can be programmed to contain up to 1024 pixels. In an eight deep array, or if 16 bit pixels are needed, each line can contain up to 512 pixels. Figure 4 illustrates the options available.

The first line delay in one of the groups can optionally be switched in or out under the control of an input pin. It is used to delay the pixel input when data is obtained from another convolver in a multiple device system, or it is used to support interlaced video.

Signals L7:0 may be used as pixel inputs or outputs. They are configured as inputs at power-on to avoid possible bus conflicts, but can become outputs when a mode control bit is set. They can then be used to drive another device when multiple PDSP16488As are required.

OUTPUT SATURATION

If the output from the convolver is driving a display, negative pixels will give erroneous results. An option is thus provided which forces all negative results to zero, which are then interpreted as black by the display. At the same time positive results, which overflow the gain control, are forced to saturate at the most positive number, i.e. peak white. In this mode the output sign bit is always zero, and should not be connected to an A/D converter.

A separate option forces both negative and positive overflows to saturate at their respective maximum values, but in-scale negative results remain valid. A gain control overflow warning flag is also available, which can be used in a host CPU supported system to change the gain parameters if overflows are not acceptable.
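The two saturation options can be sketched as a pair of clamping functions. This is a behavioural illustration only: the function names are hypothetical, and the input is assumed to be the gain control result held as an ordinary integer before it is limited to the 16 bit output range.

```python
INT16_MAX = 32767   # most positive 16 bit value, i.e. peak white
INT16_MIN = -32768  # most negative 16 bit value


def saturate_for_display(result):
    """Negative results forced to zero, positive overflows to peak white."""
    if result < 0:
        return 0
    return min(result, INT16_MAX)


def saturate_signed(result):
    """Overflows clamp at either extreme; in-scale values pass unchanged."""
    return max(INT16_MIN, min(result, INT16_MAX))


for value in (-40000, -5, 100, 40000):
    print(value, saturate_for_display(value), saturate_signed(value))
```

In the display option the returned value is never negative, which matches the statement that the output sign bit is always zero in that mode.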

BINARY OUTPUT

The PDSP16488A contains a 16 bit arithmetic comparator which allows the output from the gain control to be compared with a previously programmed value. An output flag allows the user to determine if the result was above or below a value contained within an internal register.

INTERLACED VIDEO

When using real time interlaced video, a picture or frame is composed from two fields, with odd lines in one field and even lines in the other. An external field delay is thus required to gather information from adjacent lines, and the convolver needs two input busses. The bus providing the delayed pixels has an extra internal line delay. This is only used in the field containing the upper line in any pair of lines, and must be bypassed in the other field. It ensures that data from the previous field always corresponds to the line above the present active line, and avoids the need to change the position of the coefficients from one field to the next.

Figure 3 shows the translation from physical to internal line positions, for single device interlaced systems. Line N is the line presently being convolved, which is either one or two lines previous to the line presently being produced.

When windows requiring four or more lines are to be implemented, the first line delay, in the group supplied from the L7:0 pins, must always be by-passed. This by-pass option is controlled by Register B, bit 7 and is not affected by the BYPASS input pin. The coefficients must be loaded into the locations shown, which match the translated line positions, with unused coefficients, shown shaded, loaded with zeros.

Figure 3. Line Delay Allocations in Single Device Interlaced Systems

Panels shown: 3 x 3 window (4 x 4 or 8 x 4 array, 1024 pixel delays), 5 x 5 window and 8 x 8 window (8 x 8 array, 512 pixel delays, first L7:0 delay by-passed with Register B, bit 7 set). Each panel shows the video input and field delay feeding IP7:0 and L7:0, the coefficient positions allocated to lines N-3 to N+4, and the line shift of the output in every field.

Fig. 4. Line Delay Configurations

Panels shown: line delay groups fed from IP7:0 and L7:0, with the BYPASS option, for the 4 x 4 and 8 x 4 arrays (1024 pixel delays) and the 8 x 8 array (512 pixel delays).

DEFINING THE LENGTH OF THE LINE DELAY

Figure 4 defines the maximum line lengths available in each of the window size options. The actual line lengths can be defined in one of three ways, to support both real time applications, taking pixels directly from a camera, and also use in systems supported by a frame store. In the former case the line delays must be referenced to video synchronization pulses. In the latter case the line lengths are well defined, and the horizontal flyback 'dead times' will have been removed.

To support real time applications an option is provided in which the length of the line delay is defined by the number of clocks obtained whilst an input pin ( HRES ) is in-active. HRES would normally be composite sync when the convolver is directly attached to an NTSC or PAL video camera.

Conceptually, the line delay is achieved by reading the previous contents of a RAM based line store, and then writing new information to the same address. When HRES is active write operations are inhibited, and the address counter is reset. During an active line the counter is incremented by the pixel clock. If the maximum count is reached before the end of a line, then write operations are terminated and wrap-around effects avoided.
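The read-then-write behaviour described above can be modelled in a few lines. This is a conceptual sketch, not the device's implementation: the class and method names are invented, and returning None once the maximum count is reached is an assumption made purely so the model has a defined result.

```python
class LineDelay:
    """RAM based line store: read the old pixel, then write the new one."""

    def __init__(self, max_length=1024):
        self.ram = [0] * max_length
        self.addr = 0

    def reset(self):
        """Model HRES active: the address counter is reset."""
        self.addr = 0

    def clock(self, pixel):
        """One pixel clock during an active line."""
        if self.addr >= len(self.ram):
            # Maximum count reached: writes stop, so nothing wraps around.
            return None
        delayed = self.ram[self.addr]   # pixel stored one line earlier
        self.ram[self.addr] = pixel     # overwrite with the current pixel
        self.addr += 1
        return delayed


delay = LineDelay(max_length=8)
line_1 = [delay.clock(p) for p in (1, 2, 3)]
delay.reset()
line_2 = [delay.clock(p) for p in (4, 5, 6)]
print(line_1, line_2)  # -> [0, 0, 0] [1, 2, 3]
```

The second line reads back exactly what the first line wrote, which is the one line delay the convolver needs to assemble a vertical window.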

The active going edge of HRES, marking the end of a line, is normally asynchronous to the pixel clock, and it is possible for an additional pixel to be stored on some lines. This has no effect on the convolver operation, and will not cause a cumulative shift in the pixel position from line to line.

An alternative means of defining the line length is, however, provided when an exact number of pixels is needed. HRES going in-active then starts the delay operation for every line, but it ceases when the 10 bit value contained in two registers is reached. This method can avoid the need to store blank pixels at the end of a line before sync goes active. With this method the line must contain an even number of pixels, but the value loaded into the control registers defining the line length, must be one less than the even number needed.
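The rule that the programmed value is one less than an even pixel count can be captured in a small helper. The register addresses 07 and 08 are the Pixels / Line LSB and MSB entries of Table 3. How the 10 bit value is split between them (low 8 bits in the LSB register, upper 2 bits in the low bits of the MSB register) is an assumption of this sketch, as is the function name.

```python
PIXELS_PER_LINE_LSB = 0x07  # address from Table 3
PIXELS_PER_LINE_MSB = 0x08  # address from Table 3


def line_length_registers(pixels_per_line):
    """Return {address: byte} for an exact, even line length."""
    if pixels_per_line % 2 != 0:
        raise ValueError("the line must contain an even number of pixels")
    if not 2 <= pixels_per_line <= 1024:
        raise ValueError("line length must fit the 10 bit register value")
    value = pixels_per_line - 1  # programmed value is one less
    return {
        PIXELS_PER_LINE_LSB: value & 0xFF,         # assumed: low 8 bits
        PIXELS_PER_LINE_MSB: (value >> 8) & 0x03,  # assumed: upper 2 bits
    }


print(line_length_registers(512))  # -> {7: 255, 8: 1}
```

The helper only checks the 10 bit range. The 512 pixel limit that applies to eight deep arrays and 16 bit pixels would need a separate check based on the chosen configuration.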

In an image processing system, the pixel clock is often re-synchronized, or even inhibited, during blanking or sync. The next line is then started with a precise time interval from the end of sync to the first pixel clock edge. This avoids any visible pixel jitter at the beginning of the line, which would otherwise be present since pixel clock is asynchronous with respect to video sync pulses.

When using the PDSP16488A the pixel clock should not be inhibited, or re-synchronized, until the delayed version of the HRES input goes active. This is present on the DELOP output pin. This will ensure that no pixels on the right hand edge are lost due to the internal pipeline delay.

If the pixel clock is a continuous signal, the user must ensure that the HRES in-active transition meets the timing requirements defined in Figure 10. The active going edge at the end of a line need not be synchronized.

When pixels are read/written to a frame store, an alternative line delay configuration is needed. Within the frame store lines would be stored in contiguous locations, with no gaps caused by the flyback period between the lines. This method of use makes the HRES defined line delay operation difficult to use, and an alternative mode of operation is provided. The HRES input is then driven by a system provided signal, which defines a complete frame store update period. It is not a line defining signal. The high to low transition of this signal will initiate the line store update sequence and allow the internal address pointers to increment. These pointers will be synchronously reset at the end of a line, when they reach the pre-programmed value. They will then immediately start a new operation using address zero. The actual line delay must be pre-loaded into two control registers as described previously.

Write operations back to the frame store must allow for the total pipeline delay. This can be achieved by inhibiting write operations until the delayed version of HRES goes low at the DELOP output pin. Write operations then continue until it goes back high. The PDSP16488A assumes that data is valid when a clock signal is applied, and that it also meets the set up and hold requirements given in Figure 10. If data is not valid, due for example to a frame store DRAM refresh cycle, then the user must externally inhibit the clock. The clock supplied to the convolver will in this mode be a signal which defines a frame store cycle time.

The use of the convolver in a line scan system is similar to its use with a frame store. These systems have no flyback period, and the address counter must be synchronously reset at the end of the line and then allowed to continue.

GAIN CONTROL

The gain control is provided as an aid to locating the bits of interest in the 32 bit internal result. The magnitude of the largest convolved output will depend on the size of the window, and the coefficient values used. The function of the gain control is then to produce an output, which is accurate to 16 bits, and which is aligned to the most significant end of this 32 bit word. The sixteen most significant bits of the word are available on output pins, and the largest number need only have one sign bit if the gain control is correctly adjusted.

Figure 5 indicates the mechanism employed, with the required function implemented in two steps. Two mode control bits allow one of four 20 bit fields to be selected from the final 32 bit value. These four fields are positioned with the first at the most significant end, and then at four bit displacements down to the least significant end.

By setting an enabling bit, the field selection can optionally be done automatically. This feature should only be used in the real time operating mode, when HRES defines video lines. Internal logic examines the most significant 13, 9, or 5 bits from the 32 bit result, and makes a field selection dependent on which group does not contain identical sign bits. If fewer than five sign bits are obtained, the logic will select the field containing the most significant 20 bits.

The automatic selection is particularly useful when a fixed scene is being processed. The selection is reset when any internal register is updated (i.e. PROG has been active) and is then held in-active for ten further occurrences of the HRES input. This allows the internal multiplier/accumulator array to be completely flushed before a field selection is made. As convolver outputs of greater magnitude are produced the field selection logic will respond by selecting a more significant field. The most significant field found necessary remains selected until PROG again goes active. Even if the automatic field selection is not enabled, two outputs, F1:0, will still indicate which field would have been selected. These are coded in the same way as Register C, bits 5:4.
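The automatic selection rule can be sketched by counting how many leading bits of the 32 bit result equal the sign bit. The field numbering used here (0 for bits 31:12 down to 3 for bits 19:0) is an assumption for illustration and is not the coding of Register C, bits 5:4, which is not given in this section. The ten line hold-off after PROG is also not modelled.

```python
def leading_sign_bits(value32):
    """Count the top bits of a 32 bit word that equal its sign bit."""
    word = value32 & 0xFFFFFFFF
    sign = word >> 31
    count = 0
    for bit in range(31, -1, -1):
        if (word >> bit) & 1 != sign:
            break
        count += 1
    return count


def auto_field(value32):
    """Pick the least significant 20 bit field that still holds the value."""
    n = leading_sign_bits(value32)
    if n >= 13:
        return 3  # bits 19:0
    if n >= 9:
        return 2  # bits 23:4
    if n >= 5:
        return 1  # bits 27:8
    return 0      # bits 31:12, the most significant field


def track_selection(values):
    """The most significant field found necessary remains selected."""
    selected = 3
    for value in values:
        selected = min(selected, auto_field(value))
    return selected


print(track_selection([1000, 1 << 24, 5]))  # -> 1
```

The thresholds 13, 9 and 5 correspond to bits 31:19, 31:23 and 31:27 being identical, which is the condition for the value to fit in the field whose sign bit is bit 19, 23 or 27 respectively.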

Having chosen a field, either manually or automatically, it is then multiplied by a 4 bit unsigned integer. This is contained within a user programmed register, and the multiplication will produce a 24 bit result. The middle 16 bits of this result contain the required output bits. The gain control multiplier can overflow into the unused most significant four bits if the parameters are chosen wrongly. This condition is indicated by an overflow flag.
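A minimal sketch of the two step gain control follows, under stated assumptions: fields are numbered 0 (bits 31:12) to 3 (bits 19:0) for convenience rather than by the register coding, the "middle 16 bits" are taken to be bits 19:4 of the 24 bit product, and overflow is flagged when the product does not fit in those bits as a signed value. All function names are invented for the example.

```python
def to_signed(value, bits):
    """Interpret the low `bits` of value as a two's complement number."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def select_field(sum32, field):
    """Extract one of four 20 bit fields at 4 bit displacements."""
    if field not in (0, 1, 2, 3):
        raise ValueError("field must be 0..3")
    shift = 12 - 4 * field  # field 0 -> bits 31:12, field 3 -> bits 19:0
    return to_signed(sum32 >> shift, 20)


def gain_control(sum32, field, gain):
    """Return (16 bit output, overflow flag) for a 4 bit unsigned gain."""
    if not 0 <= gain <= 15:
        raise ValueError("gain is a 4 bit unsigned integer")
    product = select_field(sum32, field) * gain  # fits in 24 signed bits
    scaled = product >> 4                        # keep bits 23:4
    overflow = not -32768 <= scaled <= 32767     # top four bits in use
    return to_signed(scaled, 16), overflow


print(gain_control(1000, 3, 8))      # -> (500, False)
print(gain_control(0x7FFFF, 3, 15))  # overflow flag is True
```

Under this reading a gain of 8 halves the selected field and a gain of 15 leaves it almost unchanged, which shows how the multiplier gives finer control than a shift alone.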

By setting appropriate mode control bits, further manipulation of the gain control output is possible. One option allows all negative outputs to be forced to zero, and at the same time positive gain control overflows will saturate at the maximum positive number. A different option will saturate positive and negative overflows at their respective maximum values, but otherwise leaves them unchanged. Occasional overflows can be tolerated in some systems, and this option prevents any gross errors.

EXPANSION

Multiple devices can be connected in cascade in order to fabricate window sizes larger than those provided by a single device. This requires an additional adder in each device which is fed from expansion data inputs. This adder is not used by a single device or the first device in a cascaded system, and can be disabled by a mode control bit.

The first device in the cascaded system must be designated as a MASTER device by tying an input pin low. Its expansion input bus is then used as the source of data for the coefficient and control registers in all devices in the system.

In order to reduce the pin count required for 32 bit busses, both expansion in and data out are time multiplexed with the phases of the pixel clock. When the clock is high the least significant half will be valid, and when the clock is low the most significant half will be valid.
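The time multiplexing of a 32 bit word over a 16 bit bus can be illustrated as follows. The dictionary keys and function names are hypothetical; only the assignment of halves to clock phases is taken from the text.

```python
def multiplex(word32):
    """Split a 32 bit word into the halves valid on each clock phase."""
    word = word32 & 0xFFFFFFFF
    return {
        "clock_low": word >> 16,      # most significant half
        "clock_high": word & 0xFFFF,  # least significant half
    }


def demultiplex(ms_half, ls_half):
    """Rebuild the signed 32 bit value from the two 16 bit halves."""
    word = ((ms_half & 0xFFFF) << 16) | (ls_half & 0xFFFF)
    return word - (1 << 32) if word & 0x80000000 else word


halves = multiplex(-123456)
print(demultiplex(halves["clock_low"], halves["clock_high"]))  # -> -123456
```

A receiver therefore needs to latch on both clock phases to recover the full intermediate precision.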

In practice this multiplexing is only possible with pixel clocks up to 20MHz. Above these frequencies the multiplexing must be inhibited by setting a Mode Control bit ( Register A, Bit 7 ). The intermediate data accuracy will then be reduced, since only the lower 16 bits of the internal 32 bit intermediate sum are available on the output pins. In such systems the coefficients must be scaled down in order to keep the intermediate and final results down to 16 bits. The final device should not use the gain control, and instead should simply output the non-multiplexed 16 bit result. The overflow flag and pixel saturation options will not be available.

PIXEL INPUT AND OUTPUT DELAYS

In a real time system, when line delays are referenced to video sync pulses present on the HRES input, the first pixel from the last line delay does not appear on the L7:0 pins until the fifth active pixel clock edge after HRES has gone low. This is illustrated in Figure 7. In a vertically expanded system, this output provides the input to the first line delays in the vertically displaced devices. The internal logic is thus designed to always expect this five clock delay. Compensation must thus be applied to the devices which are directly connected to the video source, such that the first pixel is not valid until the fifth clock edge.

For this reason the PDSP16488A contains an optional four clock pipeline delay on each of the pixel data inputs. When the delay is used the first pixel in a video line must be available on the input pins after the first pixel clock edge. This would be so if the device were connected to an A/D converter, since that would introduce a one pixel pipeline delay. If the system introduces any further external pipeline delays, then the internal delay should be bypassed, and the user should ensure that the first pixel is valid after the fifth clock edge.

The use of this four clock delay is controlled by Bit 3, in Control Register B. This delay is in addition to the delays which are provided to support expansion in both the X and Y directions, and are controlled by Register D, Bits 3:2. Both delays are in fact simply added together in the device, but are provided for conceptually different reasons.

Fig. 5. Gain Control Operation

Fig. 6. Multi-Device Delay Paths

Labels shown: input, output, line delays, 4 clock delay (B3 = 1 or B3 = 0), partial window WIDTH = S, Nth device in the row, D = 4+S(N-1) defined by D3:2, expansion delay of 0 if S = 4 or 4 if S = 8 (D0 = 0 or 1).

DELAY COMPENSATION FOR LARGE WINDOWS

A large window is composed of several partial windows each of which is implemented in an individual device. If necessary the partial window must be padded with zero coefficients to become one of the standard sizes. When constructing a large window it is necessary to delay the expansion data inputs in order to compensate for growth in the horizontal direction. Delays in the partial sums are also necessary to compensate for the total pipeline delay needed to produce the previous complete horizontal stripe.

Within each device in a horizontal stripe, apart from the first, the expansion input must be delayed by the width of the partial window, before it is added to the internal sum. Since partial windows can only be 4 or 8 pixels wide, a delay of 4 or 8 pixel clocks is needed. There is, however, an in-built delay of 4 pixels in the inter device connection, and the PDSP16488A thus only needs an option to delay the expansion input by an additional four pixels.

The data from the last device in a horizontal row of convolvers feeds the expansion input of the first device in the next row. This is shown in Figure 6. With this arrangement, the position of the partial window as illustrated, is the inverse of its vertical position on a normal TV screen. Thus the top, left hand, device corresponds to the bottom, left hand, portion of the complete window.

The output from the last device in the row is delayed with respect to the original data input by an amount given by the formula:

DELAY = 4 + (N - 1) x S

where N is the number of devices in a row and S is the partial window width, i.e. 4 or 8.
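The delay formula, together with the additional expansion delay needed inside each device, can be checked with a short sketch. The function names are invented for this example. The extra delay is the partial window width minus the 4 pixel delay already built into the inter device connection.

```python
def row_output_delay(devices_in_row, partial_width):
    """DELAY = 4 + (N - 1) x S, in pixel clocks."""
    if partial_width not in (4, 8):
        raise ValueError("partial windows are 4 or 8 pixels wide")
    if devices_in_row < 1:
        raise ValueError("a row contains at least one device")
    return 4 + (devices_in_row - 1) * partial_width


def extra_expansion_delay(partial_width):
    """Additional expansion input delay beyond the in-built 4 pixels."""
    if partial_width not in (4, 8):
        raise ValueError("partial windows are 4 or 8 pixels wide")
    return partial_width - 4


print(row_output_delay(2, 8))    # -> 12
print(extra_expansion_delay(8))  # -> 4
```

For example, a row of two 8 pixel wide devices delays its output by 12 pixel clocks relative to the original data input.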


The internal convolver sums, in each of the devices in the next row, must be delayed by this amount before they are added to results from the previous row. This is more conveniently achieved by delaying data going into the line stores. The required cumulative delay with respect to the first horizontal stripe is then automatically obtained when more than two rows of devices are needed.

Two bits in Control Register D are used to define one of four delay options. These delays have been selected to support systems needing from two to eight devices and are described in the applications section.

COEFFICIENTS

Sixty-four coefficients are stored internally and must be initially loaded from an external source. Table 3 gives the coefficient addresses within a device, with coefficient C0 specified by the least significant address and C63 by the most significant address. Table 5 shows the physical window position within the device which is allocated to each coefficient in the various modes of operation. Horizontally the coefficient positions correspond to the convolution process as if it were conceptually observed on a viewing screen, i.e. the left hand pixel is multiplied with C0. In the vertical direction the lines of coefficients are inverted with respect to a visual screen, i.e. the line starting with C0 is actually at the bottom of the visualized window.

The coefficients may be provided from a Host CPU using conventional addressing, a read/write line, data strobe, and a chip enable. Alternatively, in stand alone systems, an EPROM may be used. A single EPROM can support up to 16 devices with no additional hardware.

When windows are to be fabricated which are smaller than the maximum size that the device will provide in the required configuration, then the areas which are not to be used must contain zero coefficients. The pipeline delay will then be that of a completely filled window.

TOTAL PIPELINE DELAY

The total pipeline delay is dependent on the device configuration and the number of devices in the system. Table 4 gives the delays obtained with the various single device configurations when the gain control is used. These delays are the internal processing delays and do not include the delays needed to move a given size window completely into a field of interest. When multiple devices are needed, additional delays are produced which must be calculated for the particular application. These delays are discussed in the applications section.

The PDSP16488A contains facilities for outputting a delayed version of HRES to match any processing delay. Control register bits allow this delay to be selected from any value between 29 and 92 pixel clocks.

Function             Hex. Addr
Mode Reg A           00
Mode Reg B           01
Mode Reg C           02
Mode Reg D           03
Comparator LSB       04
Comparator MSB       05
Scale Value          06
Pixels / Line LSB    07
Pixels / Line MSB    08
C0 - C15             40 - 4F
C16 - C31            50 - 5F
C32 - C47            60 - 6F
C48 - C63            70 - 7F
Unused               09 - 3F

Table 3 Internal Register Addressing

Data Size   Window Size   Pipeline Delay
8           4x4           34
8           8x4           30
8           8x8           26
16          4x4           28
16          8x4           26

Table 4 Pipeline delays
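The addressing in Table 3 maps directly onto a small lookup, which may be convenient when preparing a host download or an EPROM image. The constant and function names are invented for this sketch, and encoding each signed 8 bit coefficient as a two's complement byte is an assumption.

```python
REGISTER_ADDRESSES = {
    "MODE_REG_A": 0x00,
    "MODE_REG_B": 0x01,
    "MODE_REG_C": 0x02,
    "MODE_REG_D": 0x03,
    "COMPARATOR_LSB": 0x04,
    "COMPARATOR_MSB": 0x05,
    "SCALE_VALUE": 0x06,
    "PIXELS_PER_LINE_LSB": 0x07,
    "PIXELS_PER_LINE_MSB": 0x08,
}
COEFFICIENT_BASE = 0x40  # C0 is at 40 hex, C63 at 7F hex


def coefficient_address(index):
    """Address of coefficient C<index> within a device."""
    if not 0 <= index <= 63:
        raise ValueError("coefficients are numbered C0 to C63")
    return COEFFICIENT_BASE + index


def coefficient_byte(value):
    """Assumed encoding: signed 8 bit value as a two's complement byte."""
    if not -128 <= value <= 127:
        raise ValueError("coefficients are 8 bit signed values")
    return value & 0xFF


def coefficient_writes(coefficients):
    """Return (address, byte) pairs for C0, C1, ... in order."""
    return [(coefficient_address(i), coefficient_byte(c))
            for i, c in enumerate(coefficients)]


print(coefficient_writes([1, -1]))  # -> [(64, 1), (65, 255)]
```

Unused window positions would simply be written with zero, as required when a window smaller than the configured maximum is implemented.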

Fig. 7 Pixel Input Delays

Timing diagram of HRES [SYNC] and CLOCK over the active line period. Labels shown: set up time, first pixel valid [B3 set], first pixel from line store valid, asynchronous back edge, last 2 pixels internally stored, line store writes inhibited.
