Single Chip 2D Convolver with Integral Line Delays
Supersedes January 1997 version, DS3742 - 3.1
DS3742 - 4.0 January 2000
The PDSP16488A is a fully integrated, application specific, image processing device. It performs a two dimensional
convolution between the pixels within a video window and a
set of stored coefficients. An internal multiplier accumulator
array can be multi-cycled at double or quadruple the pixel
clock rate. This then gives the window size options listed in
Table 1.
An internal 32k bit RAM can be configured to provide
either four or eight line delays. The length of each delay can
be programmed to the user's requirement, up to a maximum of
1024 pixels per line. The line delays are arranged in two
groups, which may be internally connected in series or may be
configured to accept separate pixel inputs. This allows interlaced video or frame to frame operations to be supported.
The 8 bit coefficients are also stored internally and can
be downloaded from a host computer or from an EPROM. No
additional logic is required to support the EPROM, and a single
EPROM can support up to 16 convolvers.
The PDSP16488A contains an expansion adder and
delay network which allows several devices to be cascaded.
Convolvers with larger windows can then be fabricated as
shown in Table 2.
Intermediate 32 bit precision is provided to avoid any
danger of overflow, but the final result will not normally occupy
all bits. The PDSP16488A thus provides a multiplier in the
output path, which allows the user to align the result to the
most significant end of the 32 bit word.
FEATURES
■The PDSP16488A is a fully compatible replacement
for the PDSP16488
■8 or 16 bit pixels with rates up to 40 MHz
■Window sizes up to 8 x 8 with a single device
■Eight internal line delays
■Supports interlace and frame to frame operations
■Coefficients supplied from an EPROM or remote host
■Expandable in both X and Y for larger windows
■Gain control and pixel output manipulation
■132 pin QFP
Rev A B C D
Date MAR 1993 JUL 1996 JAN 1997
NOTE
Polyimide is used as an inter-layer dielectric and as
glassivation.
Polymeric material is also used for die attach which according
to the requirement in paragraph 1.2.1.b. (2) precludes
categorising this device as fully compliant. In every other
respect this device has been manufactured and screened in full
accordance with the requirements of Mil-Std 883 (latest revision).
Table 1 Single Device Configurations
Data Size   Window Size (Width X Depth)   Max Pixel Rate   Line Delays
8           4 x 4                         40MHz            4x1024
8           8 x 4                         20MHz            4x1024
8           8 x 8                         10MHz            8x512
16          4 x 4                         20MHz            4x512
16          8 x 4                         10MHz            4x512

Table 2 Devices needed to implement typical window sizes
Max Pixel Rate   Pixel Size   Window size: 3x3  5x5  7x7  9x9  11x11  15x15  23x23
10MHz            8
10MHz            16
20MHz            8
20MHz            16
40MHz            8
40MHz            16
[Device counts are not recoverable cell by cell. Values as extracted, in reading order: 1 1 1 2 1 2 1 4 1 4 2, 1 4 2 2 6 4 4 4 6, 4 8]
* Maximum rate is limited to 30 MHz by line store expansion delays
CHANGE NOTIFICATION
The change notification requirements of MIL-PRF-38535 will
be implemented on this device type. Known customers will be
notified of any changes since the last buy when ordering further
parts if significant changes have been made.
IP7:0    Pixel data input to the first line delay. [most significant byte in 16 bit mode]
L7:0     Pixel data input to the second group of line delays. [least significant byte in 16 bit mode]. Alternatively an output from the last line delay when the appropriate mode bit is set.
BYPASS   The first line delay in the first group is bypassed when this input is active (high). No internal pull up.
HRES     Resets the line delay address pointers when high. Normally the composite sync signal in real time applications. In non real time systems it defines a frame store update period, when low.
X15:0    Address/data connections from a MASTER or SINGLE device to the external coefficient source, with X15 defining EPROM or Host support. Otherwise they provide the expansion data input.
D15:0    Signed 16 bit scaled data or multiplexed 32 bit intermediate data. During intermediate transfers the most significant half is valid when the clock is low, and the least significant half when the clock is high.
PC1      During programming a MASTER device outputs a timing strobe on this pin. This is passed down the chain in a multiple device system, using the PC0 input on the next device.
PC0      This pin is used in conjunction with PC1 in multiple device systems. It terminates the write strobe from a MASTER device which is EPROM supported.
DELOP    This output provides a version of the HRES input which has been delayed by an amount defined by the user.
DS       The data strobe from a host computer. Active low. This pin will be an output from an EPROM supported MASTER device which provides strobes to the remaining devices.
CE       An active low enable which is internally gated with R/W and DS to perform reads or writes to the internal registers. In a SINGLE or MASTER device, which is supported from an EPROM, the bottom 72 addresses are always used and CE is not needed. CE can then be used to initiate a new register load sequence after the power on load sequence.
R/W        INPUT    Read / not write line from the host CPU. When an EPROM is used this pin should be tied low.
PROG       I/O      This pin is normally an input which signifies that registers are to be changed or examined. It is, however, an output from an EPROM supported SINGLE or MASTER device indicating to the rest of the system that registers are being updated.
CLK        INPUT    Clock. All events are triggered on the rising edge of the clock, except the latching of least significant expansion inputs. Internally the clock can be multiplied by two or four in order to increase the effective number of multipliers.
BIN        OUTPUT   This output indicates the result from the internal comparison. A high value indicates that the pixel was greater than the internal threshold. The output is only valid from the last device in a chain.
OV         OUTPUT   When high this output indicates that there has been a gain control overflow.
RES        INPUT    Active low power on reset signal.
SINGLE     INPUT    Tied to ground to indicate a SINGLE device system. Internal pull up resistor.
MASTER     INPUT    Tied to ground to indicate the MASTER device in a multiple device system. Must be left open circuit in a SINGLE device system. Internal pull up.
OEN        INPUT    Output enable signal. Active low.
CS3:0      OUTPUTS  Four address bits from a MASTER specifying one of sixteen devices in a multiple device system. Must be externally decoded to provide chip enables for the additional devices.
F1:0       OUTPUTS  These bits indicate the field selection given by the auto select logic. The same coding as that used for Control Register bits C5:4 is used.
VCC / GND  SUPPLY   Four power and ground pairs. All must be connected.
PDSP16488A MA
BASIC OPERATION
The PDSP16488A convolver performs a weighted
sum of all the pixels within an N x N two dimensional window.
Each pixel value is multiplied by a signed coefficient, or weight,
and the products are summed together. In practice positive
weights would be used to produce averaging effects, with
various distribution laws, and negative weights would be used
for edge enhancement. The window is moved continuously
over the video frame, and for real time operation a new result
must be obtained for every pixel clock. In most applications
odd sized windows will be used, resulting in a centre pixel
whose value is modified by the surrounding pixels.
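The weighted sum described above can be sketched in software. This is a functional illustration only: the function names and the two example kernels are hypothetical, and the sketch does not model the device's line delays, coefficient ordering or frame edge handling.

```python
def convolve_window(window, coefficients):
    """Weighted sum of one N x N window: sum of pixel * coefficient."""
    total = 0
    for pixel_row, coeff_row in zip(window, coefficients):
        for pixel, coeff in zip(pixel_row, coeff_row):
            total += pixel * coeff
    return total


def convolve_frame(frame, coefficients):
    """Move the window over the frame, giving one result per position."""
    n = len(coefficients)
    rows, cols = len(frame), len(frame[0])
    out = []
    for r in range(rows - n + 1):
        out_row = []
        for c in range(cols - n + 1):
            window = [frame[r + i][c:c + n] for i in range(n)]
            out_row.append(convolve_window(window, coefficients))
        out.append(out_row)
    return out


# Illustrative kernels: all positive weights average the window,
# negative weights around a positive centre enhance edges.
AVERAGE = [[1, 1, 1], [1, 1, 1], [1, 1, 1]]
ENHANCE = [[-1, -1, -1], [-1, 9, -1], [-1, -1, -1]]
```

On a uniform frame the averaging kernel returns nine times the pixel value, while the edge enhancing kernel returns the pixel value unchanged, since its weights sum to one.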
OUTPUT ACCURACY
With 8 bit pixels, and an 8 x 8 window, it is possible for
the accumulated sum to grow to 22 bits within a single device.
With 16 bit pixels, and an 8 x 4 window ( the maximum
possible ), the sum can grow to 29 bits. The PDSP16488A
actually allows for word growth up to 32 bits, and thus allows
several devices to be cascaded without any danger of overflow. Since coefficients can be negative, the final result is a 32
bit signed two's complement number.
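The word growth figures quoted above follow from the product width plus the number of bits needed to sum all the products. A minimal check, assuming 8 bit coefficients throughout (the function name is illustrative):

```python
def accumulator_bits(pixel_bits, coeff_bits, width, depth):
    """Bits needed for the sum of width * depth products."""
    product_bits = pixel_bits + coeff_bits
    # Adding n products can grow the word by ceil(log2(n)) bits.
    growth_bits = (width * depth - 1).bit_length()
    return product_bits + growth_bits


print(accumulator_bits(8, 8, 8, 8))    # 8 bit pixels, 8 x 8 window
print(accumulator_bits(16, 8, 8, 4))   # 16 bit pixels, 8 x 4 window
```

Both results are below 32, which is the headroom that allows partial sums from several devices to be added together.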
In a particular application the desired output will lie
somewhere within these 32 bits, the actual position being
dependent on the coefficient values used. This causes problems in physically choosing which output pins to connect to the
rest of the system. To overcome this problem the
PDSP16488A contains an output multiplier, or gain control,
which allows the final result to be aligned to the most significant end of the 32 bit internal result. The provision of a
multiplier, rather than a simple shifter, allows the gain to be
defined more accurately.
The sixteen most significant bits of the adjusted result are
available on output pins, and contain a sign bit.
MULTIPLIER ARRAY
The PDSP16488A contains sixteen 8x8 multipliers
each producing a 16 bit result. Internally the pixel clock
supplied by the user can be multiplied by two or four, which
together with the proprietary architecture, allows each multiplier to be used several times within a pixel clock period. This
increases the effective number of multipliers, which are available to the user, from 16 to 32 or 64 respectively. This
architecture produces a very efficient utilization of chip area,
and allows the line delays to be accommodated on the same
device.
The sixteen multipliers are arranged in a 4 deep by 4
wide array, resulting in effective arrays of 4 by 8 or 8 by 8 with
the multi-cycling options. The multiplier array can also be
configured to handle 16 bit signed pixels; the effective number
of available multipliers is then halved.
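The relationship between the clock multiplication and the effective array size can be expressed as a small calculation. This is a sketch of the arithmetic in the text, not of the internal architecture, and the function name is an assumption.

```python
PHYSICAL_MULTIPLIERS = 16  # 4 deep by 4 wide array of 8x8 multipliers


def effective_multipliers(clock_multiple, pixel_bits=8):
    """Effective multipliers available per pixel clock period."""
    if clock_multiple not in (1, 2, 4):
        raise ValueError("clock multiple must be 1, 2 or 4")
    if pixel_bits not in (8, 16):
        raise ValueError("pixels must be 8 or 16 bits")
    effective = PHYSICAL_MULTIPLIERS * clock_multiple
    if pixel_bits == 16:
        effective //= 2  # 16 bit pixels halve the available multipliers
    return effective
```

For example, an 8 x 8 window with 8 bit pixels needs 64 effective multipliers and therefore the quadrupled clock, while an 8 x 4 window with 16 bit pixels needs 32 and also uses the quadrupled clock.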
LINE DELAY OPERATION
Internal RAM is arranged in two separate groups, and
can be configured to provide line delays to match the chosen
size of the convolver. When a four deep arrangement is used,
with 8 bit pixels, four line delays are available, and each can
be programmed to contain up to 1024 pixels. In an eight deep
array, or if 16 bit pixels are needed, each line can contain up
to 512 pixels. Figure 4 illustrates the options available.
The first line delay in one of the groups can optionally
be switched in or out under the control of an input pin. It is used
to delay the pixel input when data is obtained from another
convolver in a multiple device system, or it is used to support
interlaced video.
Signals L7:0 may be used as pixel inputs or outputs.
They are configured as inputs at power-on to avoid possible
bus conflicts, but by setting a mode control bit can become
outputs. They can then be used to drive another device when
multiple PDSP16488A's are required.
OUTPUT SATURATION
If the output from the convolver is driving a display,
negative pixels will give erroneous results. An option is thus
provided which forces all negative results to zero, which are
then interpreted as black by the display. At the same time
positive results, which overflow the gain control, are forced to
saturate at the most positive number, ie peak white. In this
mode the output sign bit is always zero, and should not be
connected to a D/A converter.
A separate option forces both negative and positive
overflows to saturate at their respective maximum values, but
in scale negative results remain valid. A gain control overflow
warning flag is also available, which can be used in a host CPU
supported system to change the gain parameters if overflows
are not acceptable.
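The two saturation options can be modelled on a 16 bit signed output as follows. The mode names are hypothetical labels rather than register settings, and the sketch assumes that the most negative saturation value is the two's complement minimum.

```python
OUT_MAX = 0x7FFF    # most positive 16 bit value, peak white
OUT_MIN = -0x8000   # assumed most negative 16 bit value


def saturate(value, mode):
    """Apply one of the two output saturation options to a result."""
    if mode == "display":
        # Negative results forced to zero (black), positive
        # overflows forced to peak white.
        return 0 if value < 0 else min(value, OUT_MAX)
    if mode == "signed":
        # Overflows saturate, in scale negative results stay valid.
        return max(OUT_MIN, min(value, OUT_MAX))
    raise ValueError("unknown saturation mode")
```

In the "display" mode the returned value is never negative, which corresponds to the output sign bit always being zero.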
BINARY OUTPUT
The PDSP16488A contains a 16 bit arithmetic comparator which allows the output from the gain control to be
compared with a previously programmed value. An output
flag allows the user to determine if the result was above or
below a value contained within an internal register.
INTERLACED VIDEO
When using real time interlaced video, a picture or
frame is composed from two fields, with odd lines in one field
and even lines in the other. An external field delay is thus
required to gather information from adjacent lines, and the
convolver needs two input busses. The bus providing the
delayed pixels has an extra internal line delay. This is only
used in the field containing the upper line in any pair of lines,
and must be bypassed in the other field. It ensures that data
from the previous field always corresponds to the line above
the present active line, and avoids the need to change the
position of the coefficients from one field to the next.
Figure 3 shows the translation from physical to internal
line positions, for single device interlaced systems. Line N is
the line presently being convolved, which is either one or two
lines previous to the line presently being produced.
When windows requiring four or more lines are to be
implemented, the first line delay, in the group supplied from
the L7:0 pins, must always be by-passed. This by-pass option
is controlled by Register B, bit 7 and is not affected by the
BYPASS input pin. The coefficients must be loaded into the
locations shown, which match the translated line positions,
with unused coefficients, shown shaded, loaded with zeros.
[Figure 3: three panels, for 3 x 3, 5 x 5 and 8 x 8 windows. Each shows the video lines around LINE N, the video input and external field delay connected to the IP7:0 and L7:0 inputs, the internal line delays (1024 pixels each for the 4 x 4 or 8 x 4 array, 512 pixels each for the 8 x 8 array) and the coefficient locations to be loaded. In the 5 x 5 and 8 x 8 panels the line delay marked * is by-passed (Register B, bit 7 is set). The output is shifted by one or two lines in every field.]
Figure 3. Line Delay Allocations in Single Device Interlaced Systems
[Figure 4: block diagrams of the line delay configurations. The 8 x 8 array is shown with 512 pixel line delays and the 4 x 4 or 8 x 4 array with 1024 pixel line delays, fed from the IP7:0 and L7:0 inputs with the BYPASS control. Configurations with 16 bit data paths are also shown.]
Fig. 4. Line Delay Configurations
DEFINING THE LENGTH OF THE LINE DELAY
Figure 4 defines the maximum line lengths available in
each of the window size options. The actual line lengths can
be defined in one of three ways, to support both real time
applications, taking pixels directly from a camera, and also
use in systems supported by a frame store. In the former case
the line delays must be referenced to video synchronization
pulses. In the latter case the line lengths are well defined, and
the horizontal flyback 'dead times' will have been removed.
To support real time applications an option is provided
in which the length of the line delay is defined by the number
of clocks obtained whilst an input pin ( HRES ) is in-active.
HRES would normally be composite sync when the convolver
is directly attached to an NTSC or PAL video camera.
Conceptually, the line delay is achieved by reading the
previous contents of a RAM based line store, and then writing
new information to the same address. When HRES is active
write operations are inhibited, and the address counter is
reset. During an active line the counter is incremented by the
pixel clock. If the maximum count is reached before the end of
a line, then write operations are terminated and wrap-around
effects avoided.
The active going edge of HRES, marking the end of a
line, is normally asynchronous to the pixel clock, and it is
possible for an additional pixel to be stored on some lines. This
has no effect on the convolver operation, and will not cause a
cumulative shift in the pixel position from line to line.
An alternative means of defining the line length is,
however, provided when an exact number of pixels is needed.
HRES going in-active then starts the delay operation for every
line, but it ceases when the 10 bit value contained in two
registers is reached. This method can avoid the need to store
blank pixels at the end of a line before sync goes active. With
this method the line must contain an even number of pixels,
but the value loaded into the control registers defining the line
length, must be one less than the even number needed.
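The register value for an exact line length can be worked out as below. The addresses are those listed for Pixels / Line LSB and MSB in Table 3. The split of the 10 bit value (low eight bits in the LSB register, top two bits in the low bits of the MSB register) is an assumption for illustration, as is the function name.

```python
PIXELS_PER_LINE_LSB_ADDR = 0x07
PIXELS_PER_LINE_MSB_ADDR = 0x08


def line_length_registers(pixels_per_line, max_pixels=1024):
    """Register contents for an exact, even line length."""
    if pixels_per_line % 2 != 0:
        raise ValueError("line must contain an even number of pixels")
    if not 2 <= pixels_per_line <= max_pixels:
        raise ValueError("line length out of range")
    value = pixels_per_line - 1  # one less than the even number needed
    return {
        PIXELS_PER_LINE_LSB_ADDR: value & 0xFF,
        PIXELS_PER_LINE_MSB_ADDR: (value >> 8) & 0x03,  # assumed layout
    }
```

The `max_pixels` argument should be set to 512 for the eight deep or 16 bit configurations, where each line delay is limited to 512 pixels.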
In an image processing system, the pixel clock is often
re-synchronized, or even inhibited, during blanking or sync.
The next line is then started with a precise time interval from
the end of sync to the first pixel clock edge. This avoids any
visible pixel jitter at the beginning of the line, which would
otherwise be present since pixel clock is asynchronous with
respect to video sync pulses.
When using the PDSP16488A the pixel clock should
not be inhibited, or re-synchronized, until the delayed version
of the HRES input goes active. This is present on the DELOP
output pin. This will ensure that no pixels on the right hand
edge are lost due to the internal pipeline delay.
If the pixel clock is a continuous signal, the user must
ensure that the HRES in-active transition meets the timing
requirements defined in Figure 10. The active going edge at
the end of a line need not be synchronized.
When pixels are read/written to a frame store, an
alternative line delay configuration is needed. Within the
frame store lines would be stored in contiguous locations,
with no gaps caused by the flyback period between the lines.
This method of use makes the HRES defined line delay
operation difficult to use, and an alternative mode of operation
is provided. The HRES input is then driven by a system
provided signal, which defines a complete frame store update
period. It is not a line defining signal. The high to low transition
of this signal will initiate the line store update sequence and
allow the internal address pointers to increment. These pointers will be synchronously reset at the end of a line, when they
reach the pre-programmed value. They will then immediately
start a new operation using address zero. The actual line delay
must be pre-loaded into two control registers as described
previously.
Write operations back to the frame store must allow for
the total pipeline delay. This can be achieved by inhibiting
write operations until the delayed version of HRES goes low
at the DELOP output pin. Write operations then continue until
it goes back high. The PDSP16488A assumes that data is
valid when a clock signal is applied, and that it also meets the
set up and hold requirements given in Figure 10. If data is not
valid, due for example to a frame store DRAM refresh cycle,
then the user must externally inhibit the clock. The clock
supplied to the convolver will in this mode be a signal which
defines a frame store cycle time.
The use of the convolver in a line scan system is similar
to its use with a frame store. These systems have no flyback
period, and the address counter must be synchronously reset
at the end of the line and then allowed to continue.
GAIN CONTROL
The gain control is provided as an aid to locating the
bits of interest in the 32 bit internal result. The magnitude of the
largest convolved output will depend on the size of the
window, and the coefficient values used. The function of the
gain control is then to produce an output, which is accurate to
16 bits, and which is aligned to the most significant end of this
32 bit word. The sixteen most significant bits of the word are
available on output pins, and the largest number need only
have one sign bit if the gain control is correctly adjusted.
Figure 5 indicates the mechanism employed with the
required function implemented in two steps. Two mode control
bits allow one of four 20 bit fields to be selected from the final
32 bit value. These four fields are positioned with the first at
the most significant end, and then at four bit displacements
down to the least significant end.
By setting an enabling bit, the field selection can
optionally be done automatically. This feature should only be
used in the real time operating mode, when HRES defines
video lines. Internal logic examines the most significant 13, 9,
or 5 bits from the 32 bit result, and makes a field selection
dependent on which group does not contain identical sign bits.
If less than five sign bits are obtained, the logic will select the
field containing the most significant 20 bits.
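One reading of the automatic selection rule is sketched below: the logic picks the least significant 20 bit field whose sign bit is still a true sign extension of the 32 bit result. The field index used here (0 for the most significant field, 3 for the least significant) is a hypothetical numbering and is not the coding of Register C, bits 5:4.

```python
def top_bits_identical(value32, count):
    """True if the top `count` bits of the 32 bit word are all equal."""
    top = (value32 & 0xFFFFFFFF) >> (32 - count)
    return top == 0 or top == (1 << count) - 1


def auto_select_field(value32):
    """Pick a 20 bit field index, 0 = bits 31:12 ... 3 = bits 19:0."""
    if top_bits_identical(value32, 13):
        return 3  # result fits in bits 19:0
    if top_bits_identical(value32, 9):
        return 2  # result fits in bits 23:4
    if top_bits_identical(value32, 5):
        return 1  # result fits in bits 27:8
    return 0      # fewer than five sign bits: most significant field
```

The sketch evaluates a single result. It does not model the selection being held at the most significant field found necessary until PROG next goes active.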
The automatic selection is particularly useful when a
fixed scene is being processed. The selection is reset when
any internal register is updated ( ie PROG has been active )
and is then held in-active for ten further occurrences of the
HRES input. This allows the internal multiplier/ accumulator
array to be completely flushed before a field selection is made.
As convolver outputs of greater magnitude are produced the
field selection logic will respond by selecting a more significant
field. The most significant field found necessary remains
selected until PROG again goes active. Even if the automatic
field selection is not enabled, two outputs, F1:0, will still
indicate which field would have been selected. These are
coded in the same way as Register C, bits 5:4.
Having chosen a field, either manually or automatically, it is then multiplied by a 4 bit unsigned integer. This is
contained within a user programmed register, and the multiplication will produce a 24 bit result. The middle 16 bits of this
result contain the required output bits. The gain control multiplier can overflow into the unused most significant four bits if
the parameters are chosen wrongly. This condition is indicated by an overflow flag.
By setting appropriate mode control bits, further manipulation of the gain control output is possible. One option
allows all negative outputs to be forced to zero, and at the
same time positive gain control overflows will saturate at the
maximum positive number. A different option will saturate
positive and negative overflows at their respective maximum
values, but otherwise leaves them unchanged. Occasional
overflows can be tolerated in some systems, and this option
prevents any gross errors.
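The two step gain control described in this section (field selection, then multiplication by a 4 bit gain) can be sketched as follows. The field index (0 for bits 31:12 down to 3 for bits 19:0) is a hypothetical numbering rather than the register coding, and the function names are illustrative. The output is returned as the wrapped 16 bit value together with the overflow flag, before any saturation option is applied.

```python
def select_field(value32, field):
    """Extract one of four signed 20 bit fields from the 32 bit result."""
    if field not in (0, 1, 2, 3):
        raise ValueError("field must be 0 to 3")
    shift = 12 - 4 * field  # fields sit at four bit displacements
    raw = ((value32 & 0xFFFFFFFF) >> shift) & 0xFFFFF
    return raw - (1 << 20) if raw & 0x80000 else raw


def gain_stage(value32, field, gain):
    """Return (16 bit signed output, overflow flag)."""
    if not 0 <= gain <= 15:
        raise ValueError("gain is a 4 bit unsigned integer")
    product = select_field(value32, field) * gain  # 24 bit signed product
    scaled = product >> 4                          # drop the lowest 4 bits
    overflow = not -0x8000 <= scaled <= 0x7FFF     # spills into top 4 bits
    output = ((scaled + 0x8000) & 0xFFFF) - 0x8000  # middle 16 bits
    return output, overflow
```

With this arrangement the overall scaling of the selected field is gain / 16, so a multiplier gives finer adjustment than the four bit steps available from field selection alone.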
EXPANSION
Multiple devices can be connected in cascade in order
to fabricate window sizes larger than those provided by a
single device. This requires an additional adder in each device
which is fed from expansion data inputs. This adder is not
used by a single device or the first device in a cascaded
system, and can be disabled by a mode control bit.
The first device in the cascaded system must be
designated as a MASTER device by tying an input pin low. Its
expansion input bus is then used as the source of data for the
coefficient and control registers in all devices in the system.
In order to reduce the pin count required for 32 bit
busses, both expansion in and data out are time multiplexed
with the phases of the pixel clock. When the clock is high the
least significant half will be valid, and when the clock is low the
most significant half will be valid.
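The time multiplexing of a 32 bit value over a 16 bit bus can be illustrated as below. The function names and dictionary keys are hypothetical. Only the ordering of the halves with respect to the clock phase is taken from the text.

```python
def split_halves(value32):
    """Split a signed 32 bit value into the two 16 bit bus phases."""
    word = value32 & 0xFFFFFFFF
    return {
        "clock_low": word >> 16,      # most significant half
        "clock_high": word & 0xFFFF,  # least significant half
    }


def join_halves(ms_half, ls_half):
    """Rebuild the signed 32 bit value from the two halves."""
    word = ((ms_half & 0xFFFF) << 16) | (ls_half & 0xFFFF)
    return word - (1 << 32) if word & 0x80000000 else word
```

A receiving device, or a host reading intermediate data, would capture one half on each clock phase and recombine them in this way.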
In practice this multiplexing is only possible with pixel
clocks up to 20MHz. Above these frequencies the multiplexing
must be inhibited by setting a Mode Control bit ( Register A,
Bit 7 ). The intermediate data accuracy will then be reduced,
since only the lower 16 bits of the internal 32 bit intermediate
sum are available on the output pins. In such systems the
coefficients must be scaled down in order to keep the
intermediate and final results down to 16 bits. The final device
should not use the gain control, and instead should simply
output the non-multiplexed 16 bit result. The overflow flag and
pixel saturation options will not be available.
PIXEL INPUT AND OUTPUT DELAYS
In a real time system, when line delays are referenced
to video sync pulses present on the HRES input, the first pixel
from the last line delay does not appear on the L7:0 pins until
the fifth active pixel clock edge after HRES has gone low. This
is illustrated in Figure 7. In a vertically expanded system, this
output provides the input to the first line delays in the vertically
displaced devices. The internal logic is thus designed to
always expect this five clock delay. Compensation must thus
be applied to the devices which are directly connected to the
video source, such that the first pixel is not valid until the fifth
clock edge.
For this reason the PDSP16488A contains an optional
four clock pipeline delay on each of the pixel data inputs.
When the delay is used the first pixel in a video line must be
available on the input pins after the first pixel clock edge. This
would be so if the device were connected to an A/D converter,
since that would introduce a one pixel pipeline delay. If the
system introduces any further external pipeline delays, then
the internal delay should be bypassed, and the user should
ensure that the first pixel is valid after the fifth clock edge.
The use of this four clock delay is controlled by Bit 3,
in Control Register B. This delay is in addition to the delays
which are provided to support expansion in both the X and Y
directions, and are controlled by Register D, Bits 3:2. Both
delays are in fact simply added together in the device, but are
provided for conceptually different reasons.
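[Figure 5: block diagram. The 32 bit result from the expansion adder feeds a MUX which selects one of four 20 bit fields. The selected field is multiplied by the 4 bit value in the GAIN REGISTER to give 24 bits, of which 16 bits pass through the SATURATE LOGIC to D15:0.]
Fig. 5. Gain Control Operation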
[Figure 6: block diagram of the delay paths from INPUT to OUTPUT in a multi-device system. Labels recovered from the diagram: the 4 clock input delay selected by B3 (B3 = 1 or B3 = 0); the D delays defined by D3:2, with D3:2 = 00 for zero delays and D = 4+S(N-1) for the N th device in the row; the line delays; the partial window WIDTH = S; and the expansion input delay selected by D0 (D0 = 0 or 1, giving 0 delays if S = 4 and 4 delays if S = 8).]
Fig. 6. Multi-Device Delay Paths
DELAY COMPENSATION FOR LARGE WINDOWS
A large window is composed of several partial windows
each of which is implemented in an individual device. If
necessary the partial window must be padded with zero
coefficients to become one of the standard sizes. When
constructing a large window it is necessary to delay the
expansion data inputs in order to compensate for growth in the
horizontal direction. Delays in the partial sums are also
necessary to compensate for the total pipeline delay needed
to produce the previous complete horizontal stripe.
Within each device in a horizontal stripe, apart from
the first, the expansion input must be delayed by the width of
the partial window, before it is added to the internal sum. Since
partial windows can only be 4 or 8 pixels wide, a delay of 4 or
8 pixel clocks is needed. There is, however, an in-built delay
of 4 pixels in the inter device connection, and the
PDSP16488A thus only needs an option to delay the
expansion input by an additional four pixels.
The data from the last device in a horizontal row of
convolvers feeds the expansion input of the first device in the
next row. This is shown in Figure 6. With this arrangement, the
position of the partial window as illustrated, is the inverse of
its vertical position on a normal TV screen. Thus the top, left
hand, device corresponds to the bottom, left hand, portion of
the complete window.
The output from the last device in the row is delayed
with respect to the original data input by an amount given by
the formula:
DELAY = 4 + (N - 1) x S, where N is the number of devices in
a row and S is the partial window width, ie 4 or 8.
The internal convolver sums, in each of the devices in
the next row, must be delayed by this amount before they are
added to results from the previous row. This is more conveniently achieved by delaying data going into the line stores. The
required cumulative delay with respect to the first horizontal
stripe is then automatically obtained when more than two rows
of devices are needed.
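The row delay formula, and the additional expansion input delay within a row, can be checked numerically. The function names are illustrative. The second function reflects the statement that four of the required pixel delays are already built into the inter device connection.

```python
def row_output_delay(devices_in_row, partial_width):
    """DELAY = 4 + (N - 1) x S for the last device in a row."""
    if partial_width not in (4, 8):
        raise ValueError("partial window width must be 4 or 8")
    if devices_in_row < 1:
        raise ValueError("a row needs at least one device")
    return 4 + (devices_in_row - 1) * partial_width


def extra_expansion_delay(partial_width):
    """Additional expansion input delay beyond the built-in 4 pixels."""
    if partial_width not in (4, 8):
        raise ValueError("partial window width must be 4 or 8")
    return partial_width - 4
```

For example, a row of two devices with 8 pixel wide partial windows gives an output delay of 12 pixel clocks, and each device after the first in that row needs the additional four pixel expansion delay.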
Two bits in Control Register D are used to define one
of four delay options. These delays have been selected to
support systems needing from two to eight devices and are
described in the applications section.
COEFFICIENTS
Sixty-four coefficients are stored internally and must
be initially loaded from an external source. Table 3 gives the
coefficient addresses within a device, with coefficient C0
specified by the least significant address and C63 by the most
significant address. Table 5 shows the physical window position within the device which is allocated to each coefficient in
the various modes of operation. Horizontally the coefficient
positions correspond to the convolution process as if it were
conceptually observed on a viewing screen, ie the left hand
pixel is multiplied with C0. In the vertical direction the lines of
coefficients are inverted with respect to a visual screen, ie the
line starting with C0 is actually at the bottom of the visualized
window.
The coefficients may be provided from a Host CPU
using conventional addressing, a read/write line, data strobe,
and a chip enable. Alternatively, in stand alone systems, an
EPROM may be used. A single EPROM can support up to 16
devices with no additional hardware.
When windows are to be fabricated which are smaller
than the maximum size that the device will provide in the
required configuration, then the areas which are not to be used
must contain zero coefficients. The pipeline delay will then be
that of a completely filled window.
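A host loading coefficients needs the register address for each one. The sketch below uses the address range given in Table 3 (C0 at hex 40 through C63 at hex 7F). The function names are hypothetical, and the two's complement encoding of a signed 8 bit coefficient is an assumption based on the two's complement output format.

```python
COEFF_BASE_ADDR = 0x40  # address of C0 in Table 3
COEFF_COUNT = 64


def coefficient_address(index):
    """Internal register address of coefficient C<index>."""
    if not 0 <= index < COEFF_COUNT:
        raise ValueError("coefficient index must be 0 to 63")
    return COEFF_BASE_ADDR + index


def pack_coefficient(value):
    """Encode a signed coefficient as an 8 bit two's complement byte."""
    if not -128 <= value <= 127:
        raise ValueError("coefficient must fit in 8 signed bits")
    return value & 0xFF


def zero_filled_coefficients():
    """Address to byte map with every coefficient set to zero."""
    return {coefficient_address(i): 0 for i in range(COEFF_COUNT)}
```

Starting from `zero_filled_coefficients()` and overwriting only the positions in use keeps the unused areas of a smaller window at zero, as required above.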
Table 3 Internal Register Addressing
Function             Hex. Addr
Mode Reg A           00
Mode Reg B           01
Mode Reg C           02
Mode Reg D           03
Comparator LSB       04
Comparator MSB       05
Scale Value          06
Pixels / Line LSB    07
Pixels / Line MSB    08
C0 - C15             40 - 4F
C16 - C31            50 - 5F
C32 - C47            60 - 6F
C48 - C63            70 - 7F
Unused               09 - 3F

TOTAL PIPELINE DELAY
The total pipeline delay is dependent on the device
configuration and the number of devices in the system. Table
4 gives the delays obtained with the various single device
configurations when the gain control is used. These delays
are the internal processing delays and do not include the
delays needed to move a given size window completely into
a field of interest. When multiple devices are needed, additional delays are produced which must be calculated for the
particular application. These delays are discussed in the
applications section.
The PDSP16488A contains facilities for outputting a
delayed version of HRES to match any processing delay.
Control register bits allow this delay to be selected from any
value between 29 and 92 pixel clocks.

Table 4 Pipeline delays
Data size   Window Size   Pipeline Delay
8           4x4           34
8           8x4           30
8           8x8           26
16          4x4           28
16          8x4           26
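[Figure 7: timing diagram of HRES [SYNC] and CLOCK over the active line period. It shows the asynchronous back edge of HRES, the period in which line store writes are inhibited, the last 2 pixels being internally stored, the set up time, the first pixel valid at the input [B3 set] and the first pixel from the line store valid.]
Fig.7 Pixel Input Delays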