Nvidia GeForce GTX 200 GPU Technical Brief

Technical Brief
NVIDIA GeForce® GTX 200 GPU Architectural Overview
Second-Generation Unified GPU Architecture for Visual Computing
Table of Contents
Introduction
GeForce GTX 200 Architectural Design Goals and Key Capabilities
Architectural Design Goals
Gaming Beyond: Dynamic 3D Realism
Gaming Beyond: Extreme HD
Gaming Beyond: SLI
Beyond Gaming: High-Performance Visual Computing and Professional Computation
GeForce GTX 200 GPU Architecture
More Processor Cores
Graphics Processing Architecture
Parallel Computing Architecture
SIMT Architecture
Greater Number of Threads in Flight
Larger Register File
Improved Dual Issue
Double Precision Support
Improved Texturing Performance
Higher Shader to Texture Ratio
ROP Improvements
1 GB Framebuffer
Geometry Shading and Stream Out
512-bit Memory Interface
Power Management Enhancements
Additional Pipeline and Architecture Enhancements
Summary
Appendix A: Retrospective
Appendix B: Figure 1 References
2 May, 2008 | TB-04044-001_v01
Figures
Figure 1: Realistic warrior from NVIDIA “Medusa” demo
Figure 2: Far Cry 2 – Extreme HD Dynamic Beauty! (Ubisoft)
Figure 3: Significant Speedup Using GPU
Figure 4: GeForce GTX 280 GPU Graphics Processing Architecture
Figure 5: GeForce GTX 280 GPU Parallel Computing Architecture
Figure 6: TPC (Thread Processing Cluster)
Figure 7: Local Register File 2× versus 1×
Figure 8: Geometry Shading Performance
Tables
Table 1: Number of GPU Processing Cores
Table 2: GeForce 8800 GTX vs GeForce GTX 280
Table 3: Maximum Number of Threads
Table 4: Theoretical vs Measured Texture Filtering Rates
Introduction
In this technical brief we introduce NVIDIA’s new GeForce® GTX 200 GPU family, the first GPUs to implement NVIDIA’s second-generation unified graphics and computing architecture. The high-end, enthusiast-class GeForce GTX 280 GPU and performance-oriented GeForce GTX 260 GPU are the first members of the GeForce GTX 200 GPU family and deliver the ultimate visual computing and extreme high-definition (HD) gaming experience.
We’ll begin by describing architectural design goals and key features, and then dive into the technical implementation of the GeForce GTX 200 GPUs. We assume you have a basic understanding of first-generation NVIDIA unified GPU architecture, including unified shader design, scalar processing cores, decoupled texture and math units, and other architectural features. If you are not well versed in NVIDIA unified GPU architecture, we suggest you first read the Technical Brief titled NVIDIA GeForce 8800 GPU Architecture Overview. You can also refer to Appendix A for a historical retrospective.
GeForce GTX 200 Architectural Design Goals and Key Capabilities
GeForce GTX 200 GPUs are massively multithreaded, many-core, visual computing processors that incorporate both a second-generation unified graphics architecture and an enhanced high-performance, parallel-computing architecture.
Two overarching themes drove GeForce GTX 200 architectural design and are represented by two key phrases: “Beyond Gaming” and “Gaming Beyond.”
Beyond Gaming means the GPU has evolved beyond being used primarily for 3D games and driving standard PC display capabilities. More and more, GPUs are accelerating non-gaming, computationally-intensive applications for both professionals and consumers.
Gaming Beyond means that the GeForce GTX 200 GPUs enable amazing new gaming effects and dynamic realism, delivering much higher levels of scene and character detail, more natural character motion, and very accurate and convincing physics effects.
The GeForce GTX 200 GPUs are designed to be fully compliant with Microsoft DirectX 10 and OpenGL 2.1.
Architectural Design Goals
NVIDIA engineers specified the following design goals for the GeForce GTX 200 GPUs:
Design a processor with up to twice the performance of GeForce 8800 GTX
Rebalance the architecture for future games that use more complex shaders and more memory
Improve architectural efficiency per watt and per square millimeter
Improve performance for DirectX 10 features such as geometry shading and stream out
Provide significantly enhanced computation ability for high-performance CUDA applications and GPU physics
Deliver improved power management capability, including a substantial reduction in idle power
GeForce GTX 200 GPUs enable major new graphics and compute capabilities, providing the most realistic 3D graphics effects ever rendered by GPUs to date, while also providing nearly a teraflop of computational power.
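As a rough check on the "nearly a teraflop" figure, peak single-precision throughput can be estimated as processor cores × processor clock × floating-point operations per core per clock. The sketch below is illustrative only: the values used (240 processor cores, a 1296 MHz processor clock, and 3 operations per clock from a dual-issued multiply-add plus multiply) are assumptions taken from the published GeForce GTX 280 specification rather than from this section, and the function name `peak_gflops` is hypothetical.

```python
def peak_gflops(cores: int, shader_clock_mhz: float, flops_per_clock: int) -> float:
    """Theoretical peak throughput in GFLOPS (an upper bound, not a measurement)."""
    return cores * shader_clock_mhz * flops_per_clock / 1000.0


# Assumed GeForce GTX 280 figures (not stated in this section).
GTX_280_CORES = 240
GTX_280_SHADER_CLOCK_MHZ = 1296.0
FLOPS_PER_CLOCK = 3  # multiply-add (2 flops) + multiply (1 flop), dual issued

gtx_280_peak = peak_gflops(GTX_280_CORES, GTX_280_SHADER_CLOCK_MHZ, FLOPS_PER_CLOCK)
print(f"{gtx_280_peak:.1f} GFLOPS")
```

Under these assumptions the estimate falls a little short of 1000 GFLOPS, which is consistent with the "nearly a teraflop" wording. Real applications would typically achieve less than this theoretical peak.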
Gaming Beyond: Dynamic 3D Realism
While prior-generation GPUs could deliver real-time images that appeared true-to-life in many cases, frame rates could drop to unplayable levels in complex scenes with significant animation, numerous physical effects, and multiple characters. The combination of the sheer shader processing power of GeForce GTX 200 GPUs and NVIDIA’s new PhysX technology facilitates many new high-end graphics effects, including:
Convincing facial and character animation
Multiple ultra-high polygon characters in complex environments
Advanced volumetric effects (smoke, fog, mist, etc.)
Fluid and cloth simulation
Fully simulated physical effects such as live debris, explosions, and fires
Physical weather effects such as accumulating snow and water, sand storms, soaking, drying, dampening, overheating, and freezing
Better lighting for dramatic and spectacular effect, including ambient occlusion, global illumination, soft shadows, color bleeding, indirect lighting, and accurate reflections
Figure 1: Realistic warrior from NVIDIA “Medusa” demo
Gaming Beyond: Extreme HD
GeForce GTX 200 GPUs provide 50-100% more performance than prior-generation GPUs, permitting increased frame rates and higher visual quality settings at extreme resolutions, resulting in a truly cinematic gaming experience.
Figure 2: Far Cry 2 – Extreme HD Dynamic Beauty! (Ubisoft)
Support for the new DisplayPort interface allows resolutions beyond 2560 × 1600, and 10-bit color support permits up to a billion different colors on screen (driver, display, and application support are also required). Note that prior-generation GPUs included internal 10-bit processing, but could only output 8-bit component colors (RGB). GeForce GTX 200 GPUs permit both 10-bit internal processing and 10-bit color output.
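The "billion different colors" figure follows directly from the bit depth: with three color components (RGB) at 10 bits each, a pixel can take 2^30 distinct values, compared with 2^24 at 8 bits per component. A minimal sketch of that arithmetic follows; the function name `displayable_colors` is hypothetical and chosen only for illustration.

```python
def displayable_colors(bits_per_component: int, components: int = 3) -> int:
    """Number of distinct colors for a pixel with the given per-component bit depth."""
    return 2 ** (bits_per_component * components)


colors_8bit = displayable_colors(8)    # 8-bit RGB output
colors_10bit = displayable_colors(10)  # 10-bit RGB output

print(f"8-bit:  {colors_8bit:,}")
print(f"10-bit: {colors_10bit:,}")
print(f"ratio:  {colors_10bit // colors_8bit}x")
```

Each extra bit per component doubles the number of levels for that component, so moving from 8 to 10 bits across three components multiplies the palette by 4 × 4 × 4 = 64.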
Gaming Beyond: SLI
NVIDIA’s SLI® technology is the industry’s leading multi-GPU technology, giving you an easy, low-cost, high-impact performance upgrade. PC gaming simply doesn’t get any faster or more realistic than running GeForce GTX 200 GPU-based boards in SLI mode on the latest nForce® motherboards.
Two flavors of SLI are supported by the initial GeForce GTX 200 GPUs:
Standard SLI (two GPU boards), which typically boosts supported game performance by 60-90% and permits higher quality settings
3-way SLI, which provides even higher frame rates and permits higher quality settings for the ultimate experience in PC gaming when connected to a high-end, high-resolution monitor
GeForce GTX 200 GPUs process and display complex DirectX 10 and OpenGL game environments with amazing graphics effects and high frame rates at extreme, high-definition resolutions.