Introduction
Frame Buffer Design Example
Simplified 3D-RAM Block Diagram
3D-RAM Functional Blocks
Block, Page, and Page Group
DRAM Banks and Basic DRAM Operations
Pixel Buffer
Video Buffers
Global Bus
Pixel ALU Basics
Introduction
1280 x 1024 x 8 Organization
1280 x 1024 x 32 Single Buffered Organization
1280 x 1024 x 32 Double Buffered Organization with Z
640 x 512 x 8 Double Buffered Organization with Z
Electrical Specifications
Absolute Maximum Ratings
Testing Conditions
DC Specifications
AC Specifications
Pixel ALU Timing Parameters
DRAM Timing Parameters
Video Buffer Timing Parameters
Boundary-Scan Timing Parameters
Elements
Bit Ordering of Elements
Access Page
Duplicate Page
Precharge Bank
Read Block
Masked Write Block
Unmasked Write Block
Video Transfer
Video Cycle
Data Read
Stateless Initial Data Write
Stateless Normal Data Write
Replace Dirty Tag
OR Dirty Tag
Write Plane Mask Register
through 7.9. Note that both the “-10” and
“-12” speed grades now have the same
values for Video Buffer timing parameters.
• Chapter 8
• All figure numbers and table numbers
were corrected.
• The speed grade “-13” was replaced by
the speed grade “-12” in Tables 8.2
through 8.13. Note that both the “-10”
and “-12” speed grades now have the
same values for Video Buffer timing
parameters.
• Chapter 9
• The mnemonic for the tracking label was
corrected to show the “-12” speed grade.
• Table 9.43 on p. 157 now shows the
correct values for the parameters L and I2.
• Chapter 10
• The paragraph Boundary-Scan Register
on p. 169 now shows both bits 1 and 0 of
the PASS_IN pins.
• Figure 10.4 on p. 170 now more correctly
reflects the scan chain described on p. 171.
Rev. 1.00
• Chapter 2
• Tracking label mnemonic on p. 22 was
corrected to show the speed grade “10A”.
• Chapter 3
• Wording of note on bit fields on p. 60 corrected to clarify meaning.
• Chapter 5
• Tables 5.2 on p. 88, 5.4 on p. 90, 5.6 on p.
92, 5.8 on p. 94, 5.10 on p. 96, and 5.12
on p. 98 deleted.
Revision History
MITSUBISHI ELECTRONIC DEVICE GROUP
3D-RAM (M5M410092B), Rev. 1.03
• Chapter 7
• The speed grade “-10A” was added to all
tables.
• Entries in Table 7.4 on pp. 117 and 118,
Table 7.5 on p. 119, Table 7.6 on p. 120,
and Table 7.7 on p. 121 were corrected.
• Chapter 8
• The speed grade “-10A” was added to all
tables.
• Entries in Table 8.3 on p. 131, Table 8.5
on p. 135, Table 8.6 on p. 135, Table 8.7
on p. 137, and Table 8.0 on p. 140 were
corrected.
• Chapter 9
• Tracking label mnemonic on p. 22 was
corrected to show the speed grade “10A”.
Rev. 1.02
• Chapter 2
• Tracking label and pinout diagrams are
updated to reflect the new 5-character
manufacturing code.
• Chapter 3
• The stateless mode of the Color Depth
Select register in Table 3.14 is corrected.
• Description of prohibited Write Control
Register operation sequence is added.
• Chapter 4
• Description of prohibited Video Transfer
operation sequence is added.
• Chapter 7
• The value of Icc<VID> in Table 7.3 is
corrected to reflect the improvement of
VID_CLK cycle time from 14.0 ns down
to 12.0 ns.
• Minor editorial corrections in Table 7.4 are
done; no parameter values are changed.
• Chapter 8
• Minor editorial corrections in Table 8.3 are
done; no parameter values are changed.
• Chapter 9
• Tracking label and pinout diagrams are
updated to reflect the new 5-character
manufacturing code.
Rev. 1.03
• Table of contents is added
• Chapter 3
• Figure 3.28 and the corresponding paragraph are corrected.
• Chapter 4
• Figure 4.6 and the corresponding paragraph are corrected
• Chapter 9
• The thermal resistance values in Tables
9.2 and 9.3 are updated.
1 Overview of 3D-RAM and Its Functional Blocks
Introduction
One of the traditional bottlenecks of 3D graphics
hardware has been the rate at which pixels can be
rendered into a frame buffer using conventional
DRAM or VRAM. The 3D-RAM emerged from a
complete rethinking of frame buffer technology
and produces an order of magnitude increase in
rendering performance. The essence of the
3D-RAM architecture is: (1) an optimized array
architecture that minimizes the average memory
cycle time when rendering and (2) selective
on-chip logic that converts the interface with the
rendering controller from a read-modify-write
mode to a write-mostly mode. In addition to the
performance boost, the new architecture also
significantly reduces the system chip count. In
1994 Mitsubishi pioneered the introduction of the
first member of the 3D-RAM family of products.
This databook specifies all the features and
operations of the third-generation product of the
3D-RAM family, which further elevates the performance
of 3D-RAM based 3D graphics systems. All
references to 3D-RAM mean the product
M5M410092B, unless otherwise specifically
designated.
The factors responsible for the dramatic overall
performance improvement include:
• Flexible dual Video Buffer supporting
85-Hz CRT refresh
• Write Mostly Interface
• On-chip ALU
• Four ROP units supporting 16 raster
operations on byte data
• Four Blend units blending the old pixel
value with new information
• On-chip hardware acceleration for all
OpenGL blending modes
• On-chip hardware acceleration for all
OpenGL stencil modes
• One 32-bit Match Comparator and one
32-bit Magnitude Comparator
• Concurrent operations of DRAM, Pixel
Buffer, ALU and Video Buffer
• 32-bit synchronous high-bandwidth data
bus interface with rendering controller
• Blending operations in both (8, 8, 8, 8)
and (4, 4, 4, 4) color modes
• One additional PASS_IN pin for flexible
bit plane organization
(NEW)
(NEW)
(NEW)
• New Memory Architecture
• 10-Mbit DRAM array supporting 1280 x
1024 x 8 frame buffer
• Four independent, interleaved DRAM
banks
• 2048-bit SRAM Pixel Buffer as the cache
between DRAM and ALU
• Built-in tile-oriented memory addressing
for rendering and scan line-oriented
memory addressing for video refresh
• 256-bit global bus connecting DRAM
banks and Pixel Buffer

Frame Buffer Design Example
Figure 1.1 is a simple frame buffer design
example showing a 1280 x 1024 x 32 single
buffered configuration. The rendering controller
writes pixel data across the 128-bit bus to the four
3D-RAMs. The controller commands most of the
3D-RAM operations, including ALU functions,
Pixel Buffer addressing, and DRAM operations.
The controller can also command video display by
setting up the RAMDAC and requesting video
transfers from 3D-RAMs.

With the 128-bit pixel data bus shown in
Figure 1.1, four pixels can be moved across the
bus in one cycle. There are two ways to organize
the 3D-RAMs: (1) Each 3D-RAM holds one of the
8-bit color components (R, G, B, or α) for all
1280 x 1024 pixels; (2) Each 3D-RAM holds all 32
bits of a pixel value for 320 x 1024 pixels, allowing
fast scrolling in the vertical direction and
interleaving four 3D-RAMs in the horizontal
direction.

If the width of the data bus from the rendering
controller to 3D-RAM is reduced to 64 bits, then
two pixels are transferred in one cycle. Similarly, a
32-bit data bus can transfer only one pixel at a
time.

Chapter 6 provides more examples of frame
buffer organizations using 3D-RAMs, such as
1280 x 1024 x 8, 320 x 1024 x 32, etc.

Figure 1.1 1280 x 1024 x 32 frame buffer consisting of four 3D-RAMs, shown together with a rendering controller and a RAMDAC
(Diagram labels: System Interface; Rendering Controller; Address & Control; four 3D-RAMs, each with a 32-bit Pixel Data bus and a 16-bit Video Data bus; Video Control; RAMDAC; Monitor.)
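The bus-width arithmetic in this example can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the modulo-4 column assignment used for organization (2) is an assumption, since the text states only that the four 3D-RAMs are interleaved in the horizontal direction.

```python
PIXEL_BITS = 32  # bits per pixel in the 1280 x 1024 x 32 example


def pixels_per_cycle(bus_width_bits):
    """Whole 32-bit pixels that cross the pixel data bus in one cycle."""
    return bus_width_bits // PIXEL_BITS


def chip_for_column(x, num_chips=4):
    """Assumed interleave for organization (2): column x lives in chip x mod 4."""
    return x % num_chips


for width in (128, 64, 32):
    print(width, "bit bus:", pixels_per_cycle(width), "pixel(s) per cycle")

# Each chip then holds 1280 / 4 = 320 columns of 1024 pixels.
columns_per_chip = 1280 // 4
```

Under organization (1) the same 128-bit transfer would instead carry one color component to each chip; only the per-chip contents change, not the pixel count per cycle.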
Simplified 3D-RAM Block Diagram
The 3D-RAM block diagram is shown in
Figure 1.2. The DRAM array is partitioned into
four independent banks of 2.5 Mbits each.
Together, these four banks can support a screen
resolution of 1280 x 1024 x 8. The independent
banks can be interleaved to facilitate almost
uninterrupted frame buffer update and, at the
same time, can transfer pixel data to the dual
Video Buffer for screen refresh. Data from the
DRAM banks is transferred over the 256-bit
Global Bus to the triple-ported Pixel Buffer. The
Pixel Buffer consists of eight blocks, each of which
is 256 bits and is updated in a single transfer on
the Global Bus. Hence, the memory size of the
Pixel Buffer is 2 Kbits. The ALU uses two of the
Pixel Buffer ports to read and write data in the
same clock cycle. Each Video Buffer is 80 x 8 bits
and is loaded in a single DRAM operation. One
Video Buffer can be loaded while the other is
sending out video data.

Figure 1.2 Simplified 3D-RAM block diagram
(Diagram labels: DRAM Banks A, B, C, and D, 2.5 Mbits each, with 640-bit paths to Video Buffer I and Video Buffer II; 16-bit Video Data output; 256-bit Global Bus; Pixel Buffer, 2 Kbits; 32-bit paths between Pixel Buffer and ALU; 32-bit Pixel Data.)
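The capacity figures quoted for Figure 1.2 can be cross-checked with simple arithmetic. The sketch below assumes that 1 Mbit means 2^20 bits; the variable names are illustrative only.

```python
MBIT = 1 << 20                   # assumption: 1 Mbit = 1,048,576 bits

BANKS = 4
BANK_BITS = 5 * MBIT // 2        # 2.5 Mbits per bank
dram_bits = BANKS * BANK_BITS    # total DRAM array

frame_bits = 1280 * 1024 * 8     # one 1280 x 1024 x 8 frame

pixel_buffer_bits = 8 * 256      # eight 256-bit blocks
video_buffer_bits = 80 * 8       # one Video Buffer, 80 x 8 bits

print(dram_bits, frame_bits)     # the four banks hold exactly one frame
print(pixel_buffer_bits)         # 2 Kbits
print(video_buffer_bits)         # matches the 640-bit path to each Video Buffer
```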
3D-RAM Functional Blocks
The 3D-RAM has five major functional blocks:
DRAM banks, Video Buffers, Pixel Buffer, Global
Bus, and Pixel ALU. The following sections
provide a quick overview of each of these
functional blocks. Chapter 3 describes details of
the Pixel ALU operations, Chapter 4 presents
specifics of the DRAM operations, and Chapter 5
provides examples of parallelism between the
Pixel ALU operations and the DRAM operations.
Now, to give readers a better grasp of these
functional blocks, we first describe the memory
units on which these functional blocks operate.
Block, Page, and Page Group
A word has 32 bits and is the unit of data
operations within the Pixel ALU and between the
Pixel ALU and Pixel Buffer. When the Pixel ALU
accesses the Pixel Buffer, both a block address
and a word within that block must be specified.
Since there are eight blocks in the
Pixel Buffer and eight words in a block, the upper
three bits of the input pins PALU_A designate the
block, and the lower three bits select the word.
The data in a word is directly mapped to
PALU_DQ[31:0] in corresponding order. That is, bit
0 of the word is mapped to PALU_DQ0, bit 1 to
PALU_DQ1, and so on.

Although an ALU write operation operates on one
word at a time, each of the four bytes in a word
may be individually masked. The mapping is also
direct and linear: byte 0 is PALU_DQ[7:0], byte 1
PALU_DQ[15:8], byte 2 PALU_DQ[23:16], and byte
3 PALU_DQ[31:24].

A block has 256 bits and is the unit of memory
operations between a DRAM bank and the Pixel
Buffer over the Global Bus. The input pins
DRAM_A select a block from the Pixel Buffer and
a block from a page of a DRAM bank. The DRAM
operations on block data are Unmasked Write
Block (UWB), Masked Write Block (MWB), and
Read Block (RDB). These operations are
described in detail on page 44, “Description of
DRAM Operations.”
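The word addressing and byte mapping described above can be illustrated with a short Python sketch. The function names are hypothetical, and the byte-enable argument (bit i set means byte i is written) is an assumed polarity used only for illustration; the actual masking controls are specified in later chapters.

```python
def decode_palu_a(palu_a):
    """Split a 6-bit PALU_A value into (block, word): upper 3 bits, lower 3 bits."""
    if not 0 <= palu_a < 64:
        raise ValueError("PALU_A is a 6-bit address")
    return palu_a >> 3, palu_a & 0b111


def byte_lane(word, byte_index):
    """Byte i of a 32-bit word occupies PALU_DQ[8*i+7 : 8*i]."""
    return (word >> (8 * byte_index)) & 0xFF


def masked_word_write(old_word, new_word, byte_enable):
    """Replace only the enabled bytes (assumed: bit i of byte_enable enables byte i)."""
    result = old_word
    for i in range(4):
        if (byte_enable >> i) & 1:
            lane = 0xFF << (8 * i)
            result = (result & ~lane) | (new_word & lane)
    return result & 0xFFFFFFFF


block, word = decode_palu_a(0b101011)   # block 5, word 3
```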
A page in a DRAM bank is organized into 10 x 4
blocks. Since a block has 256 bits, a page has
10,240 bits. There are four DRAM banks in a
3D-RAM chip, and the pages with the same page
address in all four DRAM banks compose a
page group. Therefore, a page group has 20 x 8
blocks.
Note that in Figure 1.3 the block and page are
purposely drawn as rectangular shapes. The user
may relate these to a tiled frame buffer memory
organization. For example, if the display resolution
is 1280 x 1024 x 8, then a (32-bit) word contains
four pixels. Since a block may be viewed as
having 2 x 4 words, it contains 8 x 4 pixels. A page
is organized into 10 x 4 blocks, so it contains 80 x
16 pixels, and a page group holds 160 x 32 pixels.
Finally, a screen is composed of 8 x 32 page
groups. The advantage of such a frame buffer
memory organization is the minimization of page
miss penalty. 3D objects frequently occupy
portions of multiple scan lines. Since in this case a
page contains 80 x 16 pixels instead of 1,280 x 1
pixels (10,240 bits either way), page misses are reduced. When an object
extends beyond a page boundary, bank
interleaving allows hidden precharge and
uninterrupted memory access. Details of the
various frame buffer memory organizations using
3D-RAMs are discussed in Chapter 6.
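The tiling described above for a 1280 x 1024 x 8 display can be sketched as a coordinate decomposition. This is a minimal illustration under stated assumptions: the function name and the returned field names are hypothetical, and the sketch does not say which DRAM bank holds which page of a page group, since that assignment is not given in this section.

```python
def locate_pixel(x, y):
    """Decompose screen coordinates into tile coordinates (column, row) at each level."""
    if not (0 <= x < 1280 and 0 <= y < 1024):
        raise ValueError("pixel outside the 1280 x 1024 screen")
    return {
        "page_group": (x // 160, y // 32),                   # screen = 8 x 32 page groups
        "page_in_group": ((x % 160) // 80, (y % 32) // 16),  # group = 2 x 2 pages
        "block_in_page": ((x % 80) // 8, (y % 16) // 4),     # page = 10 x 4 blocks
        "pixel_in_block": (x % 8, y % 4),                    # block = 8 x 4 pixels
    }


print(locate_pixel(1279, 1023))
```

A small triangle a few pixels high and wide therefore tends to fall inside a single 80 x 16 page, whereas with a 1,280 x 1 page every scan line it touches would be a different page.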
On the other hand, to support screen refresh, the
Video Buffer must output pixel data one scan line
at a time. The internal organization of a page also
allows data to be transferred from a page to the
Video Buffer one scan line at a time; each of the
sixteen scan lines in a page is 80 bytes long. See
the section “Video Buffers” on page 7 for a summary and the section
“Video Transfer (VDX)” on page 46 for full details.
Figure 1.3 Relations and addressing scheme of blocks and words in the Pixel Buffer and in the DRAM page
(Diagram labels: address bits [8:0], with fields selecting a block in the height direction from a DRAM page, a block in the width direction from a DRAM page, and one of eight blocks in the Pixel Buffer; PALU_A [5..0], with fields selecting one of eight blocks from the Pixel Buffer and one of eight words from the selected block; Word 0 in Block 0 with byte lanes 7:0, 15:8, 23:16, and 31:24.)
DRAM Banks and Basic DRAM Operations
The 3D-RAM contains four independent DRAM
banks which can be interleaved to facilitate hidden
precharge or access in one bank while screen
refresh is being performed in another bank. Each
DRAM bank has 256 pages with 10,240 bits per
page for a total storage of 2,621,440 bits. An
additional 257th page can be accessed for special
functions or used to hold off-screen data. A row
decoder takes 9-bit page address signals to
generate 257 word lines, one for each page. The
word lines select which page is connected to the
sense amplifiers. The sense amplifiers read and
write the page selected by the row decoder.
Because the sense amplifiers retain data after the
read/write operations, they function like a
direct-mapped level-two pixel cache. (The Pixel Buffer,
which is discussed on page 7, functions as a
level-one pixel cache in a frame buffer with
3D-RAMs.)
During an Access Page (ACP) operation, the row
decoder selects a page by activating its word line.
Activating the word line of a particular page
transfers the bit charges of that page to the sense
amplifiers. The sense amplifiers amplify the
charges. After the sensing and amplification are
completed, the sense amplifiers are ready to
interface with the Global Bus or Video Buffer. In a way,
ACP may be viewed as a “write cache” operation
on the sense amplifiers as a level-two pixel cache.
Because the activated word line remains
connected to the sense amplifiers after the ACP
operation until the subsequent Precharge Bank
operation, when a block of the sense amplifiers is
updated by a block write operation (UWB or
MWB), the corresponding block in the DRAM
array is also updated. Therefore, the sense
amplifiers function as a “write-through” cache, and
no write back to the DRAM array is necessary.
Alternatively, the data in the sense amplifiers can
be written to any page in the same bank at this
time, simply by selecting a word line without first
equalizing the sense amplifiers. This function is
called Duplicate Page (DUP). A typical application
of this function is copying from the 257th page to
one of the 256 normal pages—all 10,240 bits at a
time—for fast area fill.
When the sense amplifiers in a DRAM bank
complete the read/write operations with the
Global Bus or Video Buffer, a Precharge Bank
(PRE) operation usually follows. A Precharge
Bank cycle simply deactivates the selected word
line corresponding to the current page and
equalizes the sense amplifiers. The PRE
operation may be viewed as the close of a page
access or as the preparation for the subsequent
page access. The DRAM bank must be
precharged prior to accessing a new page.
Figure 1.4 DRAM bank consisting of row decoder, address latch, DRAM array, and sense amplifiers
(Diagram labels: Row Decoder; Latch; DRAM array, 257 pages, 10,240 bits/page; Sense amplifiers.)
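The interplay of Access Page, block writes, Duplicate Page, and Precharge Bank described in this section can be mimicked with a toy behavioral model. This is a sketch under stated assumptions, not a description of the circuit: the class and method names are hypothetical, timing is ignored, block masking is not modeled, and the model assumes that after a DUP the newly selected page is the one that remains open.

```python
class DramBank:
    """Toy model of one DRAM bank; each page is held as a Python int."""

    PAGES = 257            # 256 normal pages plus the additional 257th page
    BLOCKS_PER_PAGE = 40   # 10 x 4 blocks of 256 bits = 10,240 bits

    def __init__(self):
        self.array = [0] * self.PAGES
        self.sense_amps = 0
        self.open_page = None          # None means the bank is precharged

    def access_page(self, page):       # ACP
        if self.open_page is not None:
            raise RuntimeError("bank must be precharged before a new ACP")
        self.open_page = page
        self.sense_amps = self.array[page]

    def write_block(self, block, data):    # UWB, without masking
        if self.open_page is None:
            raise RuntimeError("no page is open")
        if not 0 <= block < self.BLOCKS_PER_PAGE:
            raise ValueError("block index out of range")
        shift = 256 * block
        field = ((1 << 256) - 1) << shift
        self.sense_amps = (self.sense_amps & ~field) | ((data << shift) & field)
        # Write-through: the word line is still active, so the array follows.
        self.array[self.open_page] = self.sense_amps

    def duplicate_page(self, page):    # DUP
        if self.open_page is None:
            raise RuntimeError("no page is open")
        self.array[page] = self.sense_amps
        self.open_page = page          # assumption: the new page stays open

    def precharge(self):               # PRE
        self.open_page = None
        self.sense_amps = 0            # equalized; 0 is only a placeholder


# Fast area fill: prepare a pattern in the 257th page, then copy it.
bank = DramBank()
bank.access_page(256)
bank.write_block(1, 0xAB)
bank.duplicate_page(3)
bank.precharge()
```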
Pixel Buffer
The Pixel Buffer is a 2048-bit SRAM organized
into eight 256-bit blocks, as seen in Figure 1.3,
and functions as a level-one write-back pixel
cache. It has a 256-bit read/write port, a 32-bit
read port, and a 32-bit write port. Referring to
Figure 1.6, the 256-bit read/write port is
connected to the Global Bus via a Write Buffer,
and the two 32-bit ports are connected to the Pixel
ALU and the pixel data pins. All three ports can be
used simultaneously as long as the same memory
cell is not accessed. If the two 32-bit ports access
the same cell, the write operation will be
successful but the read data will be undefined.
A 1-bit Dirty Tag is assigned to each byte of data
in the Pixel Buffer. Therefore, each block in the
Pixel Buffer is associated with a 32-bit Dirty Tag in
the dual-port Dirty Tag RAM. When a block is
transferred from the sense amplifiers to the Pixel
Buffer through the 256-bit port, the corresponding
32-bit Dirty Tag is cleared. When a block is
transferred from the Pixel Buffer to a DRAM bank,
the Dirty Tag determines which bytes are actually
written. This feature can save as much as 50% of
the power consumed by a 256-bit block write
operation without the Dirty Tag.
The cache set associativity is determined external
to the 3D-RAM, thereby permitting optimal cache
design tailored to the particular graphics system.
Video Buffers
Each video buffer receives 640 bits of data at a time
from one of the two DRAM banks connected to it.
(The reader is reminded of the 3D-RAM block
diagram in Figure 1.2.) Sixteen bits of data are
shifted out onto the video data pins every video
clock cycle at a 14-ns rate. It takes 40 video clocks
to shift all data out of a video buffer. The video
counter counts modulo 40 and toggles the buffer
select line when the count wraps around to 0.
These two video buffers can be alternated to
provide a seamless stream of video data.
Figure 1.5 Video transfer from a DRAM page to the Video Buffer
(Diagram labels: a DRAM page drawn as sixteen scan lines, 0 to 15, of 80 bytes each; 640-bit transfer into the Video Buffer, 40 x 16 bits; 16-bit Video Data Out; DRAM_A [8..0], with fields marked as selecting one of the sixteen 80-byte scan lines from the page, other functions, and ignored.)
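The modulo-40 video counter and the alternation of the two buffers can be sketched as a small simulation. The function name and the returned trace format are hypothetical; the sketch models only the counter and the buffer select line, not the data path or the loading of the idle buffer.

```python
WORD_BITS = 16
BUFFER_BITS = 640
WORDS_PER_BUFFER = BUFFER_BITS // WORD_BITS   # 40 video clocks per buffer
VIDEO_CLOCK_NS = 14                           # rate quoted in this section


def simulate_video_counter(clocks):
    """Return (buffer_select, count) for each video clock."""
    count, select = 0, 0
    trace = []
    for _ in range(clocks):
        trace.append((select, count))
        count = (count + 1) % WORDS_PER_BUFFER
        if count == 0:          # wrap-around toggles the buffer select line
            select ^= 1
    return trace


trace = simulate_video_counter(81)
drain_time_ns = WORDS_PER_BUFFER * VIDEO_CLOCK_NS   # time to empty one buffer
```

While one buffer is being shifted out, the rendering controller has the whole drain time of that buffer to load the other one, which is what makes the video stream seamless.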
Global Bus
The Global Bus connects the Pixel Buffer to the
sense amplifiers of all four DRAM banks. The
Global Bus consists of 256 data lines. Referring to
Figure 1.6, during a transfer from the Pixel Buffer
to DRAM, the 256 bits are conditionally written
depending on the 32-bit Dirty Tag and the 32-bit
Plane Mask. When a data block is transferred
from the Pixel Buffer to the sense amplifiers, the
Dirty Tag and Plane Mask control which bits of the
sense amplifiers are changed via the Write Buffer.

Note that all read/write operations are viewed
from the perspective of the rendering controller. In
other words, a read operation across the Global
Bus always means a read by the Pixel ALU; that
is, data is transferred from a DRAM bank into the
Pixel Buffer. Similarly, a write operation across the
Global Bus means data is updated from the Pixel
Buffer to a DRAM bank. This is also specifically
noted in Figure 1.6 by the signals Global Bus
Write Block Enable and Global Bus Read Block
Enable.

Figure 1.6 Tri-port Pixel Buffer, Global Bus and dual-port Dirty Tag RAM
(Diagram labels: 256-bit Global Bus to DRAM sense amplifiers; Write Buffer with Write Enable Logic; Global Bus Write Block Enable, Pixel Buffer to DRAM; Global Bus Read Block Enable, DRAM to Pixel Buffer; 32-bit Plane Mask; Dirty Tag RAM, 8 blocks x 32 bits; Pixel Buffer, 8 blocks x 256 bits, with a 256-bit read/write port, a 32-bit write port from the Pixel ALU, and a 32-bit read port to the Pixel ALU; Block Address from DRAM_A [8:6]; Block Address from PALU_A [5:3]; Word Address from PALU_A [2:0].)
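The conditional write from the Pixel Buffer to the sense amplifiers can be sketched as follows. Several details here are assumptions made only for illustration and are not confirmed by this section: that Dirty Tag bit i corresponds to byte i of the block, that the 32-bit Plane Mask is applied identically to each of the eight 32-bit words, and that a 1 bit means "write". The function name is hypothetical.

```python
def masked_write_block(sense_block, pixel_block, dirty_tag, plane_mask):
    """Merge a 256-bit Pixel Buffer block into a 256-bit sense-amplifier block."""
    result = sense_block
    for byte in range(32):                       # 32 bytes per 256-bit block
        if not (dirty_tag >> byte) & 1:
            continue                             # clean byte: leave untouched
        lane_mask = (plane_mask >> (8 * (byte % 4))) & 0xFF
        shift = 8 * byte
        old = (sense_block >> shift) & 0xFF
        new = (pixel_block >> shift) & 0xFF
        merged = (old & ~lane_mask & 0xFF) | (new & lane_mask)
        result = (result & ~(0xFF << shift)) | (merged << shift)
    return result


ALL_ONES = (1 << 256) - 1
# Only byte 0 is dirty, and only its low four bit planes are enabled.
print(hex(masked_write_block(0, ALL_ONES, 0b1, 0x0000000F)))
```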
Pixel ALU Basics
The Pixel ALU consists of four 8-bit ROP/Blend
units, which may be independently programmed to
perform either a raster operation or a blending
function, one 32-bit Match Compare unit, and one
32-bit Magnitude Compare unit. The two Compare
units are also commonly referred to as the Dual
Compare units. The motivation for including the
Pixel ALU on chip is to convert the interface from
a read-modify-write interface to a write-mostly
interface. This logic integration with memory
arrays greatly improves rendering throughput by
avoiding time-consuming reads and direction
changes on the data bus.

The ROP/Blend units and the Dual Compare units
are highly pipelined. Page 11 contains a brief
discussion of the ALU pipeline. The output of a
ROP/Blend unit is conditionally written to the Pixel
Buffer, depending on the comparison results from
the on-chip Dual Compare units and from the Dual
Compare units of the preceding 3D-RAM chips.
For example, for a 1280 x 1024 x 32 double-buffered
graphics system with 32-bit Z buffer,
there are effectively 96 bits per pixel. In this case,
eight 3D-RAMs are used as color chips and four
as Z chips. The Pixel ALUs of the Z chips perform
magnitude comparisons and feed the comparison
results via their PASS_OUT pins to the
corresponding color chips. It is important to note
that due to the pipelining, the color chips do not
wait for the magnitude comparison results from
the Z chips; rather, the results of the ROP/
blending operations and comparison operations
on the color chips, and the results of the
magnitude comparison on the Z chips all are
presented to the Pixel Buffer of the color chips in
the same clock cycle. In this sense, the rendering
controller can accomplish a pixel blending
operation with Z compare and window ID compare
all in a single clock cycle. Furthermore, because
of the pipelining and the tri-ported architecture of
the Pixel Buffer, the read and write operations
may be performed on the Pixel Buffer of the
3D-RAM during the same clock cycle.

Figure 1.7 Pixel ALU (Pipeline stages are not shown.)
(Diagram labels: PALU_DX [3:0]; PALU_DQ [31:0]; Constant Register; Pixel Buffer with ALU Read Port and ALU Write Port; PASS_IN; PASS_OUT; ROP/Blend Units 0 to 3, each fed by Old Data, Input Data, and Constant Data for its own byte, with byte 3 of the Input Data and the extension bits also routed to the units; Dual Compare Unit fed by 32-bit Old Data, Input Data, and Constant Data.)
ROP/Blend Units
The ROP/Blend units can be configured as either
a ROP unit or a Blend unit by setting a register bit.
Each ROP unit can perform all 16 standard ROP
functions. These functions are listed in Chapter 3.
One of the operands of the ROP functions is the
old data from the Pixel Buffer, and the other
operand may be either the data from the primary
I/O pins or the data from an internal register (called
the Constant register). For the blending operation,
the general equation is as follows:
Write data to Pixel Buffer
    = New Term + (Old Data x Old Fraction)
    = (New Data x New Fraction) + (Old Data x Old Fraction)
The 3D-RAM Blend units accomplish what is
called destination blending in a single MCLK
cycle, that is, the addition and the second
multiplication in the above equation. In this case,
the rendering controller must perform the
multiplication of New Data with New Fraction (i.e.,
the source blending) and present the result as the
New Term to 3D-RAM. In addition, 3D-RAM can
also accomplish the full blending by taking two
MCLK cycles, with a loop back mechanism.
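The split between source blending (done by the rendering controller) and destination blending (done by the Blend unit) can be sketched numerically. This is an illustration only: the function names are hypothetical, and the use of floating-point fractions in the range 0.0 to 1.0, rounding to the nearest integer, and clamping to 255 are all assumptions; the actual operand formats are specified in Chapter 3.

```python
def destination_blend(new_term, old_data, old_fraction):
    """One-cycle case: New Term + (Old Data x Old Fraction), on 8-bit values."""
    value = new_term + round(old_data * old_fraction)
    return min(value, 255)          # assumed clamp to the 8-bit range


def full_blend(new_data, new_fraction, old_data, old_fraction):
    """Two-step case: form the New Term first, then blend with the old data."""
    new_term = round(new_data * new_fraction)      # source blending
    return destination_blend(new_term, old_data, old_fraction)


# Controller pre-multiplies: New Term = 200 x 0.25 = 50, then the chip adds 100 x 0.75.
print(destination_blend(50, 100, 0.75))
print(full_blend(200, 0.25, 100, 0.75))
```

Both calls compute the same pixel; the difference is only where the first multiplication is carried out.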
Dual Compare Unit
Physically, the Dual Compare units consist of one
32-bit Match Compare unit and one 32-bit
Magnitude Compare unit. Both Match Compare
and Magnitude Compare are done in parallel. One
of the sources is always the old data from the
Pixel Buffer. The other source is independently
selectable between the data from the PALU_DQ
pins and the data from the Constant source
register. There are also two mask registers,
namely Match Mask and Magnitude Mask, that
define which bits of the 32-bit words will be
compared and which will be “don’t care.”
One application of the Match Compare unit is
Window ID comparison, and the Magnitude
Compare unit is typically used in the depth
comparison of a Z-buffer algorithm for hidden
surface removal. When these Compare units are
used together, the system can achieve hidden
surface removal for only a specific window on the
display in one cycle. Furthermore, since the data
to be written into the Pixel Buffer always comes
through the ROP/Blend units, a system with
3D-RAM can achieve a pixel update with a raster
or blending operation specifically on only the new
objects in the selected window that are closer to
the viewer than the existing objects in the frame
buffer.
The results of both Match Compare and
Magnitude Compare operations are logically
ANDed together to generate the PASS_OUT signal.
The PASS_IN signal (fed from another 3D-RAM
chip) and the internally generated PASS_OUT
signal are then logically ANDed together to
produce a Write Enable signal to the Pixel Buffer.
Thus, the PASS_IN and PASS_OUT pins offer
hardware support for display resolutions where
multiple 3D-RAM chips are required, such as in
the cases of 1280 x 1024 x 32 (single color buffer
plus Z buffer) and 1280 x 1024 x 96 (double color
buffer plus Z buffer).
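The compare-and-enable logic described in this section can be sketched as Boolean functions. The sketch rests on labeled assumptions: a mask bit of 1 means "compare this bit", the Magnitude Compare condition is taken to be "new less than old" (one possible depth test), and the example word layout with a window ID in the top byte and depth in the lower 24 bits is hypothetical. All function names are illustrative.

```python
def match_compare(old, new, match_mask):
    """True when old and new agree on every bit selected by the mask."""
    return (old & match_mask) == (new & match_mask)


def magnitude_compare(old, new, magnitude_mask):
    """Assumed test: the new masked value is smaller (nearer) than the old one."""
    return (new & magnitude_mask) < (old & magnitude_mask)


def pass_out(old, new, match_mask, magnitude_mask):
    """Both compare results are ANDed to form PASS_OUT."""
    return match_compare(old, new, match_mask) and \
        magnitude_compare(old, new, magnitude_mask)


def write_enable(pass_in, pass_out_value):
    """PASS_IN from another chip ANDed with the local PASS_OUT."""
    return pass_in and pass_out_value


ID_MASK, Z_MASK = 0xFF000000, 0x00FFFFFF     # hypothetical word layout
old_word = 0x12000500                        # window 0x12, depth 0x500
new_word = 0x12000400                        # same window, nearer depth
print(write_enable(True, pass_out(old_word, new_word, ID_MASK, Z_MASK)))
```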
Pipelining
The 3D-RAM Pixel ALU pipeline is designed so
that read and write operations can be performed
with minimal delay. This is achieved by having all
operations conform to a uniform 7-stage pipeline.
Figure 1.8 is an example that illustrates the
efficiency afforded by the pipeline flow of Pixel
ALU read/write operations. A pipeline stage
begins with a rising edge of MCLK and ends
before the next rising edge of MCLK. (In 3D-RAM,
all references to MCLK are relative to the rising
edge except for some boundary scan test
operations.) For clarity, separate stage counts are
provided for the first read and first write operations
and are labeled as R1 through R4 and W1
through W7, respectively. The Read A operation
is asserted for two cycles; Read A is first
presented in Stage R1 and latched into the
3D-RAM by Clock 1 in Stage R2. Data A is piped
out by Clock 2 in Stage R3 and becomes stable
for sampling in Stage R4. Between Read B and
WC (Write C), two single-cycle NOPs are inserted
to guarantee an idle cycle for the data bus to turn
around. On the other hand, a read operation can
immediately follow a write operation, as shown by
Read G following WF. To allow maximum
bandwidth for the rendering controller, a write
operation may be started every cycle. In this
example, we start with the WC operation. The
address and write instruction are presented in
Stage W1 and latched into the 3D-RAM by Clock
7 in Stage W2; Data C and WD are presented in
Stage W2 and latched into the 3D-RAM by Clock
8 in Stage W3. Then, after three cycles for internal
processing, the valid PASS_OUT Pass C is piped
out by Clock 11 in Stage W6. The actual updating
of the Pixel Buffer takes place in Stage W7. Thus,
n consecutive write operations take only 7 + n - 1
= n + 6 cycles to complete, including all internal
activities. It is important to point out that the
effective write cycle time from the perspective of
the rendering controller interface is only n + 1
cycles for n consecutive write operations, as
shown by WC through WF.
Figure 1.8 Example of Pixel Port read/write operations that satisfy the pipeline flow
(Timing diagram labels: MCLK, clocks 0 to 15; control inputs PALU_A, PALU_OP, PALU_BE, PALU_WE, and PALU_EN carrying Read A, Read B, NOP, WC, WD, WE, WF, and Read G; PALU_DQ and PALU_DX carrying data A through G; PASS_OUT carrying Pass C through Pass F; HIT; stage labels R1 to R4 and W1 to W8.)
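The cycle counts quoted for consecutive writes follow directly from the 7-stage pipeline. The short sketch below restates that arithmetic; the function names are hypothetical.

```python
PIPELINE_STAGES = 7


def total_write_cycles(n):
    """Cycles until the last of n back-to-back writes has updated the Pixel Buffer."""
    if n < 1:
        raise ValueError("need at least one write")
    return PIPELINE_STAGES + n - 1          # = n + 6


def interface_write_cycles(n):
    """Cycles the rendering controller's interface is occupied by n writes."""
    if n < 1:
        raise ValueError("need at least one write")
    return n + 1


# The four writes WC through WF of Figure 1.8:
print(total_write_cycles(4), interface_write_cycles(4))
```

For long bursts the interface cost per write approaches one cycle, while the fixed six-cycle latency is paid only once.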
The Picking Logic
From the user’s viewpoint, a common experience
of the picking function in 2D computer graphics
may be using the mouse and the associated
cursor to select an icon on the display screen,
resulting in the selected icon being highlighted in a
different color. This is a basic function in
interactive computer graphics, and 3D-RAM
provides the Picking Logic and the HIT pin to
support this picking function for selection of
objects in a 3D scene.

A picking function may involve redrawing the
objects into the frame buffer and returning a list of
objects that intersect with some predefined
selection volume. When the user uses multiple
3D-RAMs in a frame buffer design to determine
whether pixel data is successfully written by any Stateful
Write operation (see “Pixel Data Operations” on
page 40) during the redraw process, the
comparison result on the PASS_OUT pin from
each chip must be logically ANDed. If this logical
operation is left to off-chip glue logic between the
3D-RAM frame buffer and the rendering controller,
excessive delay is unavoidable in this critical
timing path. If the rendering controller is to
perform this logical operation, extra pins must be
provided by both the 3D-RAM and the rendering
controller, while delay is still significant. The
Picking Logic brings the glue logic on chip and
provides an open-drain HIT pin to interface with
the rendering controller.

A block diagram of the Picking Logic is shown in
Figure 1.9. Initially, the Picking Logic should be
enabled and the HIT flag should be cleared, which
is done by writing to byte 3 of the Compare
Control Register. The HIT pin will be set to high
(i.e., not driven low by 3D-RAM) after seven
cycles (corresponding to the Pipeline Stage 8). In
the figure below, this is indicated by the number 8
in the square box above the HIT pin label. This
design of the pipeline flow for the HIT flag and the
HIT pin prevents an incorrect HIT value from the
Stateful Data Write operations before the Picking
Logic is enabled. A sequence of Stateful Data
Write operations may be issued immediately after
the register writing. A low value on the HIT pin
means that at least one of the Stateful Data Writes
passed the on-chip and off-chip comparison tests
and the pixel data was written to the Pixel Buffer.
If the HIT pin is high, none of the Stateful Data
Writes passed and no pixel is updated. See
Figure 8.6, “Picking Logic Timing,” for an
illustration of the operations described in this
section.
Figure 1.9 Block diagram of the Picking Logic
(Diagram labels: Compare Control Register bits D24 to D27; Stateful_WE; PASS_IN; PASS_OUT; Pick Enable; Set HIT Flag; HIT Flag; HIT pin, open drain; pipeline stage markers 7 and 8.)
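The behavior of the HIT flag and the open-drain HIT pin can be sketched with a small behavioral model. This is an illustration under stated assumptions: the class and method names are hypothetical, pipeline delays are ignored, and the exact bit assignments in byte 3 of the Compare Control Register are not modeled. The wired connection in `hit_line` reflects ordinary open-drain behavior, where a shared line is low if any connected device pulls it low.

```python
class PickingLogic:
    """Toy model of the Picking Logic in one 3D-RAM chip."""

    def __init__(self):
        self.pick_enable = False
        self.hit_flag = False

    def write_compare_control(self, pick_enable, clear_hit=True):
        """Stand-in for the register write that enables picking and clears HIT."""
        self.pick_enable = pick_enable
        if clear_hit:
            self.hit_flag = False

    def stateful_write(self, pass_in, pass_out):
        """Return True if the pixel is written; latch the HIT flag if so."""
        written = pass_in and pass_out
        if written and self.pick_enable:
            self.hit_flag = True        # sticky until cleared by a register write
        return written

    def drives_hit_low(self):
        return self.pick_enable and self.hit_flag


def hit_line(chips):
    """Open-drain line shared by several chips: 0 if any chip pulls it low."""
    return 0 if any(chip.drives_hit_low() for chip in chips) else 1


chips = [PickingLogic(), PickingLogic()]
for chip in chips:
    chip.write_compare_control(pick_enable=True)
chips[0].stateful_write(pass_in=True, pass_out=False)   # fails the compare
chips[1].stateful_write(pass_in=True, pass_out=True)    # pixel is written
print(hit_line(chips))                                   # low: something was hit
```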
2 Pin Descriptions and Pinouts
This chapter describes the 3D-RAM pins. Unless
otherwise specified, all signals comply with the
Low Voltage TTL (LVTTL) standard. The
functional block diagram in Figure 2.1 shows all
I/O signals on the external pins. The master clock
MCLK synchronizes all operations of the Pixel
ALU Control and DRAM Control. The Video
Control specifies the video interface. The Test
Access Port is used for the JTAG (Joint Test
Action Group) boundary scan. The following
sections describe each signal in detail.
Common Pins
These signals are common to several sections of
the 3D-RAM.
Table 3.1 Common control signals

Signal Name   Pin Count   I/O   Description
MCLK          1           I     Master clock
RESET         1           I     Reset
Total         2

RESET
The RESET pin is an active low asynchronous signal used for power up
and restart initialization. During power-up, the RESET signal should
be held low for at least 500µs after stable VDD, so that the internal
power supply can be stabilized. After the RESET signal goes high, nine
idle cycles must elapse before the internal registers can be reset to
default values. The power-up reset procedure is illustrated in
Figure 8.1. When the RESET signal is asserted low during normal
operations, a restart reset sequence begins. The restart reset
includes resetting registers in nine idle cycles and initializing
DRAM array as in the power-up reset. The restart reset sequence is
shown in Figure 8.3. In DRAM array initialization, the Access Page
(ACP) operation should be performed on one page for every DRAM bank,
followed by the Precharge Bank (PRE) operation for every bank.
Figure 8.3 shows two approaches to initializing the DRAM array.
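
The DRAM array initialization described above can be sketched as a generated list of operations. This is a minimal sketch under stated assumptions: the function name and tuple layout are hypothetical, page 0 is an arbitrary choice of "one page", and the ordering shown (all ACP operations first, then all PRE operations) is only one reading of the text. Timing constraints between operations are not modeled.

```python
BANKS = ("A", "B", "C", "D")


def dram_init_sequence(page=0):
    """Return (operation, bank, page) tuples for DRAM array initialization.

    Assumed ordering: Access Page (ACP) on one page of every bank,
    followed by Precharge Bank (PRE) for every bank.
    """
    ops = [("ACP", bank, page) for bank in BANKS]
    # PRE does not take a page address in this sketch, so page is None.
    ops += [("PRE", bank, None) for bank in BANKS]
    return ops


for op in dram_init_sequence():
    print(op)
```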
MCLK
The master clock MCLK is used for timing
synchronization of internal circuitry. All external timing parameters, except video output
operation and boundary scan, are specified
with respect to the MCLK rising edge.
Figure 2.1 3D-RAM functional block diagram with external pins
[Figure not reproduced. Blocks shown: DRAM Banks A to D, Video
Buffers I and II, Global Bus, SRAM Pixel Buffer, Pixel ALU, Pixel ALU
Control, DRAM Control, Video Control, and Test Access Port, with the
external pins MCLK, RESET, PALU_*, PASS_IN, PASS_OUT, HIT, DRAM_*,
VID_*, and SCAN_*.]
Pixel ALU Interface
These signals control the Pixel ALU and Pixel
Buffer.
Table 3.2 Pixel ALU control signals

Signal Name   Pin Count   I/O   Description
PALU_EN       2           I     Enable Pixel ALU operation starting next cycle
PALU_WE       1           I     Pixel ALU write enable
PALU_OP       3           I     Pixel ALU opcode
PALU_A        6           I     Read/Write address
PALU_BE       4           I     Byte write or output enable
PALU_DQ       32          I/O   Data pins
PALU_DX       4           I     Data extension pins for blending
PASS_OUT      1           O     Compare output (special signal level, see Table 7.2)
PASS_IN       2           I     Compare input (special signal level, see Table 7.2)
HIT           1           O     Picking Logic flag output (open-drain, see Table 7.2)
Total         56

PALU_EN[1:0]
The PALU_EN[1:0] pins must be “11” to start a Pixel ALU operation.
If either PALU_EN pin is “0”, then all other Pixel ALU pins are
ignored.

PALU_WE
The PALU_WE indicates a write operation when high (“1”) and a read
operation when low (“0”).

PALU_OP[2:0]
The PALU_OP[2:0] pins, together with PALU_WE, specify the operation
to be performed. See Table 3.4 for the Pixel ALU operation encoding.

PALU_A[5:0]
The PALU_A[5:0] pins provide an address for the specified operation.

PALU_BE[3:0]
The PALU_BE[3:0] pins apply to all read and write operations,
including register writes and Dirty Tag writes. If PALU_WE is low
(“0”), indicating a read, the PALU_BE pins are per byte output
enables. If PALU_WE is high (“1”), indicating a write, the PALU_BE
pins are per byte write enables. PALU_BE0 controls PALU_DQ[7:0];
PALU_BE1 controls PALU_DQ[15:8]; PALU_BE2 controls PALU_DQ[23:16];
and PALU_BE3 controls PALU_DQ[31:24].

PALU_DQ[31:0]
Data is read from or written to the PALU_DQ[31:0] pins. The write
address of Pixel Buffer may be input from PALU_DQ[29:24] in some
modes of operation. See “An Application of the Write Address
Control Register” on page 62.
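
The byte-lane mapping of the PALU_BE pins can be illustrated with a short sketch. The function names are hypothetical, and two things are assumed rather than taken from the text: that PALU_BE0 corresponds to bit 0 of the 4-bit value passed in, and that a “1” on a PALU_BE pin enables its byte lane.

```python
def byte_enable_mask(palu_be):
    """Expand a 4-bit PALU_BE value into a 32-bit mask over PALU_DQ[31:0].

    Assumes bit n of `palu_be` is PALU_BEn and that 1 means enabled.
    """
    if not 0 <= palu_be <= 0xF:
        raise ValueError("PALU_BE is a 4-bit value")
    mask = 0
    for lane in range(4):
        if palu_be & (1 << lane):
            # PALU_BEn controls PALU_DQ[8n+7 : 8n].
            mask |= 0xFF << (8 * lane)
    return mask


def masked_write(old, new, palu_be):
    """Model a write in which only the enabled byte lanes are updated."""
    mask = byte_enable_mask(palu_be)
    return (old & ~mask & 0xFFFFFFFF) | (new & mask)


print(hex(byte_enable_mask(0b0101)))
print(hex(masked_write(0x11223344, 0xAABBCCDD, 0b1001)))
```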

PALU_DX[3:0]
Extra high-order bits of PALU_DQ data are provided by PALU_DX[3:0].
PALU_DX0 is associated with PALU_DQ[7:0]; PALU_DX1 is associated
with PALU_DQ[15:8]; PALU_DX2 is for PALU_DQ[23:16]; and PALU_DX3 is
for PALU_DQ[31:24].

PASS_OUT
The comparison result of the Dual Compare unit is output on the
PASS_OUT pin. PASS_OUT is low (“0”) only when the Pixel ALU
operation during the fifth stage of the Pixel ALU pipeline is a
Stateful Initial/Normal Data Write operation (see “Pixel Data
Operations” on page 40) and when either match comparison or
magnitude comparison fails. Otherwise, PASS_OUT is high (“1”),
indicating that either the Pixel ALU operation is not a Stateful
Initial/Normal Data Write or both comparison tests passed during
the Stateful Initial/Normal Data Write.

PASS_IN[1:0]
When the PASS_IN[1:0] pins are high (“11”) and the internal
comparison test also passes (PASS_OUT is high (“1”)), data is
written to the Pixel Buffer if the Pixel ALU operation is a
Stateful Normal/Initial Data Write. Each of the PASS_IN[1:0] pins
may be individually masked by the PASS_INs Select register bits
0 and 8, PINS[0, 8], respectively.

HIT
The HIT pin is an open-drain, active low output. This pin reflects
the internal status of the HIT flag. See “Compare Control Register
(CCR[31:0])” on page 36 for a detailed description.
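
The write condition for a Stateful Data Write can be sketched as a simple predicate. The function and parameter names are hypothetical. The text does not state the polarity or exact effect of the PINS mask bits, so this sketch assumes that a masked PASS_IN pin is simply treated as if it were high.

```python
def stateful_write_enabled(pass_in, pass_out, masked=(False, False)):
    """Return True when a Stateful Data Write would update the Pixel Buffer.

    pass_in  -- levels on (PASS_IN0, PASS_IN1), each 0 or 1
    pass_out -- internal comparison result, 1 when both tests pass
    masked   -- per-pin mask flags (assumption: a masked pin is ignored)
    """
    external_ok = all(
        level == 1 or skip for level, skip in zip(pass_in, masked)
    )
    return external_ok and pass_out == 1


print(stateful_write_enabled((1, 1), 1))
print(stateful_write_enabled((1, 0), 1))
print(stateful_write_enabled((1, 0), 1, masked=(False, True)))
```

In a multi-chip arrangement the PASS_IN pins would typically carry the comparison results of other devices, which is why both the external and the internal tests must pass before the pixel is written.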
DRAM Control
These signals command operations on the four
DRAM banks, Global Bus and Video Buffer.
Table 3.3 DRAM control signals

Signal Name   Pin Count   I/O   Description
DRAM_EN       1           I     Enable DRAM operation at next cycle
DRAM_OP       3           I     DRAM opcode
DRAM_BS       2           I     DRAM bank select
DRAM_A        9           I     Address for page, block, and video line
Total         15

DRAM_EN
When DRAM_EN is high (“1”) at the rising edge of MCLK, a DRAM
operation is initiated at the next clock cycle. Only the selected
DRAM bank is enabled.

DRAM_OP[2:0]
The DRAM Opcode DRAM_OP[2:0] specifies the DRAM operation. See
Table 4.1 for the DRAM operation encoding.

DRAM_BS[1:0]
DRAM_BS[1:0] is used to select one out of four banks. The selection
codes are: “00” for Bank A, “01” for Bank B, “10” for Bank C, and
“11” for Bank D.
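
The bank selection codes listed above map directly onto a lookup table. The names below are hypothetical helpers for illustration; only the four code values come from the text.

```python
# DRAM_BS[1:0] selection codes as given in the text.
BANK_SELECT = {"A": 0b00, "B": 0b01, "C": 0b10, "D": 0b11}


def dram_bs(bank):
    """Return the 2-bit DRAM_BS code for a bank letter."""
    return BANK_SELECT[bank]


def bank_name(code):
    """Return the bank letter for a 2-bit DRAM_BS code."""
    if not 0 <= code <= 0b11:
        raise ValueError("DRAM_BS is a 2-bit value")
    return "ABCD"[code]


print(format(dram_bs("C"), "02b"))
print(bank_name(0b11))
```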