NVIDIA TCST4MATX-PB Product Data Sheet

NVIDIA T4 TENSOR CORE GPU
Powering Scale-Out AI Training and Inference
Supercharge any server with the NVIDIA® T4 GPU, the world’s most performant scale-out accelerator. Its low-profile, 70-watt (W) design is powered by NVIDIA Turing™ Tensor Cores, which deliver revolutionary multi-precision performance to accelerate a wide range of modern applications. The small PCIe form factor is optimized for scale-out servers and purpose-built to deliver state-of-the-art AI.
[Figure: Inference Performance. Speedup over a CPU baseline for ResNet-50, DeepSpeech2, and GNMT; plotted values are 21X, 27X, and 36X. Comparisons made of one NVIDIA T4 GPU versus servers with dual-socket Xeon Gold 6140 CPU.]
[Figure: Training Performance. 9.3X speedup over a CPU baseline (1X). Comparison made of dual NVIDIA T4 GPUs versus servers with dual-socket Xeon Gold 6140 CPU.]
SPECIFICATIONS
GPU Architecture: NVIDIA Turing
NVIDIA Turing Tensor Cores: 320
NVIDIA CUDA® Cores: 2,560
Single-Precision: 8.1 TFLOPS
Mixed-Precision (FP16/FP32): 65 TFLOPS
INT8: 130 TOPS
INT4: 260 TOPS
GPU Memory: 16 GB GDDR6, 300 GB/s
ECC: Yes
Interconnect Bandwidth: 32 GB/s
System Interface: x16 PCIe Gen3
Form Factor: Low-Profile PCIe
Thermal Solution: Passive
Compute APIs: CUDA, NVIDIA TensorRT™, ONNX
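The peak figures in the specifications can be combined into a few derived ratios. The Python sketch below is illustrative only: it uses the peak values listed in this datasheet and the 70 W board power from the product description. The names `T4_PEAK`, `peak_per_watt`, `ratio_to_fp32`, and `pcie_gen3_gb_per_s` are hypothetical helpers invented for this example, and the PCIe parameters (8 GT/s per lane, 128b/130b encoding) are assumptions taken from the PCIe 3.0 specification rather than from this datasheet. Peak throughput per watt is an upper bound, not a measured efficiency.

```python
# Peak figures copied from the specification table; 70 W is the board power
# quoted in the product description. The dictionary keys are illustrative names.
T4_PEAK = {
    "FP32 (TFLOPS)": 8.1,
    "FP16/FP32 mixed (TFLOPS)": 65.0,
    "INT8 (TOPS)": 130.0,
    "INT4 (TOPS)": 260.0,
}
BOARD_POWER_W = 70.0


def peak_per_watt(peak, power_w):
    """Divide each peak throughput figure by the board power."""
    return {name: value / power_w for name, value in peak.items()}


def ratio_to_fp32(peak):
    """Express each peak figure as a multiple of the FP32 figure."""
    base = peak["FP32 (TFLOPS)"]
    return {name: value / base for name, value in peak.items()}


def pcie_gen3_gb_per_s(lanes, bidirectional=True):
    """Usable PCIe Gen3 bandwidth in GB/s.

    Assumes the standard PCIe 3.0 parameters (8 GT/s per lane, 128b/130b
    encoding); these come from the PCIe specification, not this datasheet.
    """
    bytes_per_lane = 8e9 * (128 / 130) / 8  # bytes per second, one direction
    total = bytes_per_lane * lanes
    return (2 * total if bidirectional else total) / 1e9


if __name__ == "__main__":
    for name, value in peak_per_watt(T4_PEAK, BOARD_POWER_W).items():
        print(f"{name}: {value:.2f} per watt")
    for name, value in ratio_to_fp32(T4_PEAK).items():
        print(f"{name}: {value:.1f}x FP32 peak")
    print(f"x16 PCIe Gen3: {pcie_gen3_gb_per_s(16):.1f} GB/s both directions")
```

Under these assumptions, the x16 result is about 31.5 GB/s counting both directions, which is in line with the interconnect bandwidth listed in the specifications.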
NVIDIA T4 | DATASHEET | DEC18
Scale-Out Performance Driving Data Center Acceleration
The small-form-factor, 70-watt (W) design optimizes T4 for scale-out servers, providing 50X higher energy efficiency than CPUs and drastically reducing operational costs. In the last two years, NVIDIA’s Inference Platform has increased efficiency by over 10X, and it remains the most energy-efficient solution for distributed AI training and inference.
Turing Tensor Core technology with multi-precision computing for AI powers breakthrough performance at FP32, FP16, INT8, and INT4 precisions. It delivers up to 9.3X higher performance than CPUs on training and up to 36X on inference.
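To make the precision ladder concrete, here is a minimal Python sketch of what moving from FP32 to FP16 or INT8 does to a handful of values. It is a generic illustration using only the standard library, not a description of how Turing Tensor Cores or TensorRT handle reduced precision. The helper names `to_fp16`, `quantize_int8`, and `dequantize` are hypothetical, and the symmetric per-tensor scaling scheme is an assumption chosen for simplicity.

```python
import struct


def to_fp16(x):
    """Round a Python float to the nearest IEEE 754 half-precision value."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


def quantize_int8(values):
    """Symmetric per-tensor quantization to integers in [-127, 127].

    Returns the integer codes and the scale needed to map them back.
    """
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs else 1.0
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Map integer codes back to approximate floating-point values."""
    return [c * scale for c in codes]


if __name__ == "__main__":
    weights = [0.1, -0.5, 0.25, 1.0]
    print("FP16:", [to_fp16(w) for w in weights])
    codes, scale = quantize_int8(weights)
    print("INT8 codes:", codes, "scale:", scale)
    print("Dequantized:", dequantize(codes, scale))
```

Lower precisions trade a small, bounded rounding error (at most half a quantization step in this scheme) for the higher peak throughput listed at each precision.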
The NVIDIA T4 data center GPU is the ideal universal accelerator for distributed computing environments. Revolutionary multi-precision performance accelerates deep learning and machine learning training and inference, video transcoding, and virtual desktops. T4 supports all AI frameworks and network types, delivering dramatic performance and efficiency that maximize the utility of at-scale deployments.
To learn more about the NVIDIA T4, visit www.nvidia.com/T4
© 2018 NVIDIA Corporation. All rights reserved. NVIDIA, the NVIDIA logo, NVIDIA Turing, CUDA, and TensorRT are trademarks and/or registered trademarks of NVIDIA Corporation in the U.S. and other countries. All other trademarks and copyrights are the property of their respective owners. DEC18