Mercury 7410 User Manual

500 MHz PowerPC 7410 Daughtercard
AltiVec Parallel Vector Technology
Scalable from Two to Hundreds of Processors
1K CFFT in 20 µs on Each Processor
125 MHz Memory System with Prefetch and ECC
Advanced DMA Engine for Chained Submatrix Moves
267 MB/s RACE++ Switch Fabric Interconnect
Fast L2 Cache (250 MHz)
Embedded computing reaches a new level of performance with RACE++® Series PowerPC® 7410 daughtercards from Mercury Computer Systems. Each PowerPC 7410 daughtercard contains two 500 MHz MPC7410 microprocessors with AltiVec™ technology. These unique microprocessors combine a modern superscalar RISC architecture with an AltiVec parallel vector execution unit.
The AltiVec vector processing unit revolutionizes the performance of computationally intensive applications such as image and signal processing. Each vector unit can operate in parallel on up to four floating-point numbers or up to sixteen 8-bit integers. This dramatically accelerates vector arithmetic and provides greater application performance on smaller, less power-hungry processors.
AltiVec technology also represents a leap in simplifying the programming required to achieve high performance. Whereas previous DSP-based systems required handcrafted assembly language code for optimal performance, easy-to-use extensions to the C language provide a direct mapping to AltiVec instructions. This permits developers to program more productively in a higher-level language, even for critical sections of code.
Optimized Performance
To keep the processor fed with ample data, increased emphasis is placed on the memory system and the communications fabric that deliver data to the processor. Each compute node on the 500 MHz PowerPC 7410 daughtercard has a dedicated 267 MB/s fabric interface and a maximum memory speed of 125 MHz. By maximizing the performance of the memory and the fabric interface to the processor, Mercury has optimized RACE++ compute nodes for processing continuous streams of data.
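The balance between compute and data delivery can be checked with simple arithmetic. The sketch below uses only the 267 MB/s fabric rate and the 20 µs 1K CFFT figure quoted in this document. It assumes single-precision complex samples (8 bytes each) and 1 MB = 10^6 bytes, neither of which is stated here, so treat the result as a rough estimate rather than a measured value.

```python
# Figures quoted in this document.
FABRIC_BYTES_PER_S = 267e6   # 267 MB/s (assumption: 1 MB = 10**6 bytes)
CFFT_POINTS = 1024           # "1K CFFT"
CFFT_TIME_S = 20e-6          # 20 microseconds per 1K CFFT on each processor

# Assumption: single-precision complex samples (4-byte real + 4-byte imaginary).
BYTES_PER_SAMPLE = 8


def transfer_time_s(num_bytes, bytes_per_s=FABRIC_BYTES_PER_S):
    """Time to move num_bytes at the peak fabric rate, ignoring overheads."""
    return num_bytes / bytes_per_s


input_bytes = CFFT_POINTS * BYTES_PER_SAMPLE
input_time_s = transfer_time_s(input_bytes)

print(f"input block: {input_bytes} bytes")
print(f"fabric transfer: {input_time_s * 1e6:.1f} us, compute: {CFFT_TIME_S * 1e6:.0f} us")
```

Under these assumptions, delivering one input block over the fabric takes longer than transforming it, which is why the memory system and fabric interface receive as much attention as the processor.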
AltiVec in RACE++ Computers
The computational power of RACE++ Series systems is built from compute nodes composed of processors, memory, and interfaces to the RACE++ interconnect. PowerPC 7410 daughtercards each contain two compute nodes.
Each compute node (CN) consists of an MPC7410 microprocessor with AltiVec technology, level 2 (L2) cache, synchronous DRAM (SDRAM), and a Mercury-designed ASIC. This CN ASIC contains architectural advancements that enhance concurrency between arithmetic and I/O operations.
PowerPC 7410 Daughtercard Architecture
With the huge increase in processing performance brought by AltiVec, most applications are no longer CPU-limited.
Mercury can configure systems with hundreds of compute nodes, communicating over the second-generation RACE++ switch fabric interconnect. Merging RACE++ and AltiVec technology provides embedded computers with unprecedented computational power.
AltiVec Vector Processing Unit
The AltiVec vector processing unit operates on 128 bits of data concurrently with the other PowerPC execution units. AltiVec instructions may be interleaved with other PowerPC instructions without any penalty such as a context switch. The 128-bit wide execution unit can be used to operate on four floating-point numbers, four 32-bit integers, eight 16-bit integers, or sixteen 8-bit integers simultaneously.
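The element counts above follow directly from dividing the 128-bit register width by the element width. A minimal sketch of that arithmetic (the type names in the table are illustrative labels, not AltiVec identifiers):

```python
VECTOR_BITS = 128  # width of the AltiVec execution unit, from the text


def lanes(element_bits):
    """Number of elements of the given width that fit in one vector."""
    if element_bits <= 0 or VECTOR_BITS % element_bits != 0:
        raise ValueError("element width must divide the vector width")
    return VECTOR_BITS // element_bits


# Illustrative labels for the four element types named in the text.
lane_counts = {
    "float32": lanes(32),
    "int32": lanes(32),
    "int16": lanes(16),
    "int8": lanes(8),
}
print(lane_counts)
```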
AltiVec instructions are carried out by one of two AltiVec sub-units. The vector arithmetic logic unit handles the vector fixed-point and vector floating-point operations. Two floating-point operations per element are possible in a single cycle with the vector multiply-add instruction and the vector negative multiply-subtract instruction.
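The two fused instructions can be modeled element by element. The sketch below is a plain Python model, not AltiVec code; the function names are borrowed from the AltiVec C operations `vec_madd` and `vec_nmsub`. The peak-rate figure assumes one vector multiply-add completes every cycle on four floating-point lanes, which is a theoretical upper bound rather than a measured rate.

```python
def vec_madd(a, b, c):
    """Element-wise a*b + c: one multiply and one add per lane."""
    return [x * y + z for x, y, z in zip(a, b, c)]


def vec_nmsub(a, b, c):
    """Element-wise -(a*b - c): negative multiply-subtract."""
    return [-(x * y - z) for x, y, z in zip(a, b, c)]


CLOCK_HZ = 500e6      # 500 MHz, from the text
FLOAT_LANES = 4       # four floating-point numbers per 128-bit vector
FLOPS_PER_LANE = 2    # multiply + add

# Assumption: one vector multiply-add issued every cycle.
peak_flops = CLOCK_HZ * FLOAT_LANES * FLOPS_PER_LANE

a = [1.0, 2.0, 3.0, 4.0]
b = [2.0, 2.0, 2.0, 2.0]
c = [0.5, 0.5, 0.5, 0.5]
print(vec_madd(a, b, c), vec_nmsub(a, b, c), peak_flops / 1e9, "GFLOPS peak")
```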
The Permute sub-unit incorporates a crossbar network to perform 16 individual byte moves in a single cycle. This capability can be used for simple tasks such as converting the "endian-ness" of data or for more complicated tasks such as byte interleaving, dynamic address alignment, or accelerating small look-up tables.
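Both uses mentioned above, endian conversion and small look-up tables, reduce to the same operation: a control vector selects, byte by byte, which source byte lands in each output position. The following is a Python model of that idea, assuming the control byte's low five bits index into the 32 bytes formed by two 16-byte sources; the helper name `vperm` and the `SWAP32` constant are illustrative.

```python
import struct


def vperm(a, b, control):
    """Model of a 16-byte permute: each control byte selects one byte of a+b."""
    if not (len(a) == len(b) == len(control) == 16):
        raise ValueError("all three operands must be 16 bytes long")
    source = bytes(a) + bytes(b)
    return bytes(source[c & 0x1F] for c in control)


# Control vector that reverses the bytes inside each 32-bit word.
SWAP32 = bytes(word * 4 + (3 - i) for word in range(4) for i in range(4))

data = bytes(range(16))
swapped = vperm(data, data, SWAP32)

# Reading the swapped bytes as big-endian words gives the same values
# as reading the original bytes as little-endian words.
little = struct.unpack("<4I", data)
big = struct.unpack(">4I", swapped)
print(little == big)
```

A 32-entry look-up table works the same way: place the table in the two source operands and pass the indices as the control vector.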
PowerPC RISC Architecture
In addition to the AltiVec execution unit, the MPC7410 contains a floating-point unit and two integer units that can operate concurrently with the AltiVec unit. Data and instructions are fed through two on-chip, 32-Kbyte, eight-way set-associative caches that enhance performance of both vector and scalar code.
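As a sketch of what "32-Kbyte, eight-way set-associative" implies for address mapping, the snippet below derives the number of sets and the set an address falls into. The 32-byte line size is an assumption, since this document does not state it; with a different line size the set count changes accordingly.

```python
CACHE_BYTES = 32 * 1024   # 32 Kbytes, from the text
WAYS = 8                  # eight-way set-associative, from the text
LINE_BYTES = 32           # assumption: line size is not given in this document

num_sets = CACHE_BYTES // (WAYS * LINE_BYTES)


def set_index(address):
    """Set that a byte address maps to under the assumed line size."""
    return (address // LINE_BYTES) % num_sets


# Addresses that are num_sets * LINE_BYTES apart compete for the same set;
# up to WAYS of them can be resident at once.
print(num_sets, set_index(0), set_index(num_sets * LINE_BYTES))
```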
Each PowerPC 7410 CN also includes a fully pipelined backside L2 cache operating at 250 MHz. This high-performance cache system provides quick access to data previously loaded from memory but too large to fit into the on-chip cache.
Compute Node ASIC
The CN ASIC, included in each compute node, acts as both a memory controller and a network interface to the RACE++ switch fabric interconnect. The CN ASIC includes an enhanced DMA controller, a high-performance memory system with error checking and correcting, metering logic, and a RACE++ interface. By combining memory control and network interface into a single chip, Mercury's compute node provides the highest performance with the lowest power consumption and highest reliability.
High-Performance Memory System
Mercury's high-performance memory subsystem allows the memory to reach the intrinsic limits of its performance capability with:
125-MHz Synchronous DRAM
Prefetch Buffers: bring sequential data to the ASIC ahead of explicit requests from the processor. These prefetch buffers greatly improve the performance of the CN in vector operations such as those used in DSP applications.
FIFO Buffers: efficiently overlap accesses to SDRAM from the local processor and the RACEway interconnect.
The PowerPC CN contains error-correcting circuitry for improved data integrity. One-bit errors are corrected on the fly, and multi-bit errors generate an interrupt error condition.
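The behavior described here, correcting single-bit errors and flagging multi-bit errors, matches what a single-error-correcting, double-error-detecting (SECDED) code provides. This document does not say which code the CN ASIC uses, so the following is only a toy illustration on 4 data bits, using an extended Hamming code; the function names and status strings are hypothetical.

```python
def encode(nibble):
    """Encode 4 data bits as an 8-bit word: Hamming(7,4) plus overall parity."""
    d = [(nibble >> i) & 1 for i in range(4)]
    word = [0] * 8                      # index 0 holds the overall parity bit
    word[3], word[5], word[6], word[7] = d
    word[1] = word[3] ^ word[5] ^ word[7]
    word[2] = word[3] ^ word[6] ^ word[7]
    word[4] = word[5] ^ word[6] ^ word[7]
    word[0] = sum(word[1:]) % 2
    return word


def decode(word):
    """Return (status, nibble); nibble is None when the error is uncorrectable."""
    word = list(word)
    syndrome = 0
    for pos in range(1, 8):
        if word[pos]:
            syndrome ^= pos             # XOR of the positions of all set bits
    parity_ok = sum(word) % 2 == 0
    if syndrome == 0 and parity_ok:
        status = "ok"
    elif not parity_ok:
        word[syndrome] ^= 1             # single-bit error at index `syndrome`
        status = "corrected"
    else:
        return "uncorrectable", None    # even parity but nonzero syndrome
    nibble = word[3] | (word[5] << 1) | (word[6] << 2) | (word[7] << 3)
    return status, nibble


def flip(word, *positions):
    """Return a copy of word with the given bit positions inverted."""
    out = list(word)
    for p in positions:
        out[p] ^= 1
    return out


print(decode(encode(0b1011)))
print(decode(flip(encode(0b1011), 5)))
print(decode(flip(encode(0b1011), 2, 6)))
```

In hardware terms, the "corrected" case corresponds to the on-the-fly fix and the "uncorrectable" case to the condition that would raise the interrupt.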
Enhanced DMA Controller
Each CN has an advanced DMA controller to support RACEway transfers at 267 MB/s with chaining and striding.
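Chaining and striding together are what make submatrix moves possible: a stride skips from one row of the submatrix to the next, and a chain links several such moves so they run without processor intervention. The sketch below models that idea in software. The `Descriptor` fields and `run_chain` function are hypothetical and do not reflect the CN ASIC's actual descriptor format, which this document does not describe.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Descriptor:
    src: int                 # start offset in source memory
    dst: int                 # start offset in destination memory
    run: int                 # contiguous elements copied per row
    rows: int                # number of rows to copy
    src_stride: int          # distance between row starts in the source
    dst_stride: int          # distance between row starts in the destination
    next: Optional["Descriptor"] = None   # next move in the chain, if any


def run_chain(head, src_mem, dst_mem):
    """Execute every descriptor in the chain; return the element count moved."""
    moved = 0
    desc = head
    while desc is not None:
        for r in range(desc.rows):
            s = desc.src + r * desc.src_stride
            d = desc.dst + r * desc.dst_stride
            dst_mem[d:d + desc.run] = src_mem[s:s + desc.run]
            moved += desc.run
        desc = desc.next
    return moved


# A 4x4 row-major matrix holding 0..15; pull out two 2x2 submatrices.
matrix = list(range(16))
packed = [0] * 8
second = Descriptor(src=10, dst=4, run=2, rows=2, src_stride=4, dst_stride=2)
first = Descriptor(src=5, dst=0, run=2, rows=2, src_stride=4, dst_stride=2, next=second)
moved = run_chain(first, matrix, packed)
print(moved, packed)
```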
MPC7410 Data and Instruction Flow
Compute Node ASIC Architecture