Channel decoding of voice and low bit-rate data channels found in third generation (3G) cellular standards
requires decoding of convolutionally encoded data. The Viterbi-decoder coprocessor 2 (VCP2) provided in
the TCI648x/9x devices performs Viterbi decoding for IS2000 and 3GPP wireless standards and
forward-error correction for 2G and 3G wireless systems. Combined with Texas Instruments (TI) DSPs, the
VCP2 coprocessor offers a cost-effective solution. The VCP2 supports 762 12.2 Kbps 3G AMR channels when
running at 333 MHz. This document describes the operation and programming of the VCP2.
Notational Conventions
This document uses the following conventions.
• Hexadecimal numbers are shown with the suffix h. For example, the following number is 40
  hexadecimal (decimal 64): 40h.
• Registers in this document are shown in figures and described in tables.
  – Each register figure shows a rectangle divided into fields that represent the fields of the register.
    Each field is labeled with its bit name, its beginning and ending bit numbers above, and its
    read/write properties below. A legend explains the notation used for the properties.
  – Reserved bits in a register figure designate bits that are reserved for future device expansion.
• The term "word" describes a 32-bit value.
Preface
SPRUE09E–May 2006–Revised December 2009
Read This First
Related Documentation From Texas Instruments
The following documents describe the C6000™ devices and related support tools. Copies of these
documents are available on the Internet at www.ti.com. Tip: Enter the literature number in the search box
provided at www.ti.com.
SPRU189 — TMS320C6000 DSP CPU and Instruction Set Reference Guide. Describes the CPU
architecture, pipeline, instruction set, and interrupts for the TMS320C6000 digital signal processors
(DSPs).
SPRU198 — TMS320C6000 Programmer's Guide. Describes ways to optimize C and assembly code for
the TMS320C6000™ DSPs and includes application program examples.
SPRU301 — TMS320C6000 Code Composer Studio Tutorial. Introduces the Code Composer Studio™
integrated development environment and software tools.
SPRU321 — Code Composer Studio Application Programming Interface Reference Guide.
Describes the Code Composer Studio™ application programming interface (API), which allows you
to program custom plug-ins for Code Composer.
SPRU871 — TMS320C64x+ Megamodule Reference Guide. Describes the TMS320C64x+ digital signal
processor (DSP) megamodule. Included is a discussion on the internal direct memory access
(IDMA) controller, the interrupt controller, the power-down controller, memory protection, bandwidth
management, and the memory and cache.
C6000, TMS320C6000, Code Composer Studio are trademarks of Texas Instruments.
All other trademarks are the property of their respective owners.
Channel decoding of voice and low bit-rate data channels found in cellular standards such as 2.5G, 3G,
and WiMAX requires the decoding of convolutionally encoded data. The Viterbi-decoder coprocessor 2
(VCP2) provided in the TCI648x/9x devices performs Viterbi decoding for IS2000 and 3GPP wireless
standards. The VCP2 coprocessor also performs forward-error correction for 2G and 3G wireless systems.
Combined with Texas Instruments (TI) DSPs, the VCP2 coprocessor offers a cost-effective solution.
The VCP2 supports 762 12.2 Kbps 3G AMR channels when running at 333 MHz.
1 Features
The VCP2 provides:
• High flexibility:
  – Variable constraint length, K = 5, 6, 7, 8, or 9
  – User-supplied code coefficients
  – Code rates (1/2, 1/3, or 1/4)
  – Configurable trace back settings (convergence distance, frame structure)
  – Branch metrics calculation and depuncturing done in software by the DSP
• System and development cost optimization:
  – The VCP2 releases DSP resources for other processing
  – Reduces board space and power consumption by performing on-chip decoding
  – Communication between the DSP and the VCP2 is performed through the high-performance
    EDMA3 engine
  – Uses its own optimized working memories
  – Provides debug capabilities during frame processing
  – Libraries are provided for reduced development time
User's Guide
TMS320TCI648x/9x Viterbi-Decoder Coprocessor 2
A convolutional code is generated by passing the information sequence to be transmitted through a linear
finite-state shift register. The VCP2 is able to decode only a subset of those codes, known as
single-shift-register, nonrecursive convolutional codes (an example is given in Figure 1). Important
parameters for this type of code are:
• The constraint length K (length of the delay line; the VCP2 supports K values from 5 to 9).
• The rate R, given by R = k/n, where k is the number of information bits needed to produce n output bits,
  also known as codewords (the VCP2 supports codes with rates 1/2, 1/3, and 1/4).
• The generator polynomials Gn, which describe how the outputs are generated from the inputs.
Figure 1. Convolutional Encoder Example Block Diagram
NOTE: K = 3, R = k/n = 1/3, G0 = (100)_8, G1 = (101)_8, G2 = (111)_8. 0/000 means input is 0, output0 is
0, output1 is 0, output2 is 0. There are 2^(K-1) states and 2^k incoming branches per state.
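As an illustration only, the example encoder of Figure 1 can be sketched in C. This is not VCP2 code: the names conv_encode_bit, parity, and enc_taps are hypothetical, and the sketch assumes each generator's three digits are tap bits read as (current input, first delay, second delay).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sketch of the Figure 1 encoder: K = 3, R = 1/3.
 * Assumption: each generator is a 3-bit tap mask, MSB = current input. */
enum { ENC_K = 3, ENC_N = 3 };
static const uint8_t enc_taps[ENC_N] = { 0x4 /* 100 */, 0x5 /* 101 */, 0x7 /* 111 */ };

/* XOR of all bits in v. */
static unsigned parity(unsigned v)
{
    unsigned p = 0;
    while (v) { p ^= v & 1u; v >>= 1; }
    return p;
}

/* Encodes one input bit. *state holds the K-1 previous inputs, newest in
 * the upper bit. Returns the codeword with output0 in bit 2, output1 in
 * bit 1, and output2 in bit 0. */
static unsigned conv_encode_bit(unsigned *state, unsigned bit)
{
    unsigned reg, out = 0;
    int i;
    assert(bit <= 1u && *state < (1u << (ENC_K - 1)));
    reg = (bit << (ENC_K - 1)) | *state;   /* input followed by delay line */
    for (i = 0; i < ENC_N; i++)
        out = (out << 1) | parity(reg & enc_taps[i]);
    *state = reg >> 1;                     /* shift the delay line by one */
    return out;
}
```

Starting from state 0, an input of 0 yields the codeword 000 and leaves the state unchanged, which matches the 0/000 label in the note. The state variable takes 2^(K-1) = 4 distinct values.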
From the parameters, we can derive a trellis diagram providing a useful representation of the code, but
whose complexity grows exponentially with the constraint length K. Figure 2 shows the trellis diagram of
the code from Figure 1. The fact that there is a limited number of possible transitions from one state to
another makes the code powerful and will be used in the decoding process.
As a maximum-likelihood sequence estimation (MLSE) decoder, the Viterbi decoder identifies the code
sequence with the highest probability of matching the transmitted sequence based on the received
sequence.
The Viterbi algorithm is composed of a metric update and a traceback routine. The metric update performs
a forward recursion in the trellis over a finite number of symbol periods, where probabilities are
accumulated (the VCP2 accumulates on 13 bits) for each individual state based on the current input
symbol (branch metric information). The accumulated metrics are known as path metrics or state metrics.
Once a path through the trellis is identified, the traceback routine performs a backward recursion in the
trellis and outputs hard decisions or soft decisions.
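The core of the metric update is an add-compare-select (ACS) step performed for every state. The sketch below is a generic software illustration, not the VCP2 hardware implementation: the names acs and acs_result are hypothetical, it assumes that a larger metric means a more likely path, and it does not model the 13-bit accumulation width.

```c
#include <assert.h>
#include <stdint.h>

/* Result of one add-compare-select step for a single trellis state. */
typedef struct {
    int32_t  metric;    /* surviving path metric */
    unsigned decision;  /* 0: branch from predecessor 0 survived, 1: predecessor 1 */
} acs_result;

/* Hypothetical helper. Assumption: larger metric = more likely path.
 * pm0/pm1 are the predecessor path metrics, bm0/bm1 the branch metrics
 * of the two incoming branches. Ties keep predecessor 0. */
static acs_result acs(int32_t pm0, int32_t bm0, int32_t pm1, int32_t bm1)
{
    acs_result r;
    int32_t cand0 = pm0 + bm0;               /* add */
    int32_t cand1 = pm1 + bm1;
    r.decision = (cand1 > cand0) ? 1u : 0u;  /* compare */
    r.metric = r.decision ? cand1 : cand0;   /* select */
    return r;
}
```

The decision bit stored for each state and symbol period is what a traceback routine would follow backward through the trellis.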
The DSP controls the operation of the VCP2 (Figure 3) using memory-mapped registers. The DSP
typically sends and receives data using synchronized EDMA3 transfers through the EDMA3 bus. The
VCP2 sends two synchronization events to the EDMA3: a receive event (VCPREVT) and a transmit event
(VCPXEVT). The VCP2 input data corresponds to the branch metrics and the output data to the hard
decisions or soft decisions.
Figure 3. VCP2 Block Diagram
The branch metrics (BM) are calculated by the DSP and stored in the DSP memory subsystem as 8-bit
signed values. Per symbol interval T, for a rate R = k/n and a constraint length K, there are a total of
2^(K-1+k) branches in the trellis. For rate 1/n codes, only 2^(n-1) branch metrics need to be computed per
symbol period and passed to the VCP2. Moreover, n soft inputs are required to calculate 1 branch metric.
Assuming BPSK modulated bits (0 → 1, 1 → -1), the branch metrics are calculated as follows:
• Rate 1/2: there are 2 branch metrics per symbol period
  – BM0(t) = r0(t) + r1(t)
  – BM1(t) = r0(t) - r1(t)
  where r(t) is the received codeword at time t (2 symbols; r0(t) is the symbol corresponding to the
  encoder upper branch, see Figure 1).
• Rate 1/3: there are 4 branch metrics per symbol period
  – BM0(t) = r0(t) + r1(t) + r2(t)
  where r(t) is the received codeword (3 symbols; r0(t) is the symbol corresponding to the encoder upper
  branch, see Figure 1).
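A minimal C sketch of these calculations follows. The function names are hypothetical. The rate 1/2 formulas and BM0 for rate 1/3 come from the list above; the sign patterns used for BM1 to BM3 at rate 1/3 are an assumption that extends the rate 1/2 pattern, and the saturation to 8 bits is a safety guard added for the sketch.

```c
#include <assert.h>
#include <stdint.h>

/* Saturates to the 8-bit signed range in which branch metrics are stored. */
static int8_t sat8(int v)
{
    if (v > 127) return 127;
    if (v < -128) return -128;
    return (int8_t)v;
}

/* Rate 1/2: BM0 = r0 + r1, BM1 = r0 - r1. */
static void bm_rate_1_2(int8_t r0, int8_t r1, int8_t bm[2])
{
    bm[0] = sat8(r0 + r1);
    bm[1] = sat8(r0 - r1);
}

/* Rate 1/3: BM0 = r0 + r1 + r2 as stated in the text.
 * ASSUMPTION: BM1..BM3 use the remaining sign combinations of r1 and r2. */
static void bm_rate_1_3(int8_t r0, int8_t r1, int8_t r2, int8_t bm[4])
{
    bm[0] = sat8(r0 + r1 + r2);
    bm[1] = sat8(r0 + r1 - r2);
    bm[2] = sat8(r0 - r1 + r2);
    bm[3] = sat8(r0 - r1 - r2);
}
```

Each call consumes the n soft inputs of one symbol period and produces the 2^(n-1) branch metrics for that period.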
The data must be sent to the VCP2 as described in Table 1, Table 2, and Table 3 for rates 1/2, 1/3, and
1/4, respectively (the base address must be double-word aligned).
The branch metrics can be saved in the DSP memory subsystem in either their native format or packed in
words (user implementation). When working in big-endian mode, the VCP2 endian mode register
(VCPEND) indicates if the data is 32-bit word packed or native 8-bit format and the VCP2 will handle the
endianness byte swapping accordingly (see Section 7).
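For the packed-word option, the rate 1/2 layout of Table 1 can be sketched as follows. The helper pack_bm_rate_1_2 is hypothetical (user implementation) and assumes the branch metrics are held in time order, with bm[2*t] = BM0 and bm[2*t + 1] = BM1 of symbol period t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: packs rate 1/2 branch metrics into 32-bit words.
 * Each word holds two symbol periods, earliest metric in the LSB, as in
 * Table 1. n_sym must be even. Returns the number of words written. */
static size_t pack_bm_rate_1_2(const int8_t *bm, size_t n_sym, uint32_t *words)
{
    size_t w, n_words = n_sym / 2;
    assert(n_sym % 2 == 0);
    for (w = 0; w < n_words; w++) {
        const int8_t *p = &bm[4 * w];
        words[w] = (uint32_t)(uint8_t)p[0]           /* BM0, even period (LSB) */
                 | (uint32_t)(uint8_t)p[1] << 8      /* BM1, even period */
                 | (uint32_t)(uint8_t)p[2] << 16     /* BM0, odd period */
                 | (uint32_t)(uint8_t)p[3] << 24;    /* BM1, odd period (MSB) */
    }
    return n_words;
}
```

The sketch only builds the word values; byte ordering in memory and the VCPEND setting still depend on the endian mode of the device.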
Table 1. Branch Metrics for Rate 1/2

Address (hex)   Data (MSB to LSB)
Base            BM1(t=T)    BM0(t=T)    BM1(t=0)    BM0(t=0)
Base + 4h       BM1(t=3T)   BM0(t=3T)   BM1(t=2T)   BM0(t=2T)
Base + 8h       ...

Table 2. Branch Metrics for Rate 1/3

Address (hex)   Data (MSB to LSB)
Base            BM3(t=0)    BM2(t=0)    BM1(t=0)    BM0(t=0)
Base + 4h       BM3(t=T)    BM2(t=T)    BM1(t=T)    BM0(t=T)
Base + 8h       ...

Table 3. Branch Metrics for Rate 1/4

Address (hex)   Data (MSB to LSB)
Base            BM3(t=0)    BM2(t=0)    BM1(t=0)    BM0(t=0)
Base + 4h       BM7(t=0)    BM6(t=0)    BM5(t=0)    BM4(t=0)
Base + 8h       BM3(t=T)    BM2(t=T)    BM1(t=T)    BM0(t=T)
Base + Ch       BM7(t=T)    BM6(t=T)    BM5(t=T)    BM4(t=T)
Base + 10h      ...

The state metric accumulation resolution is 13 bits on the VCP2. Consequently, full 8-bit dynamic range is
available for branch metrics on the TCI648x/9x VCP2, for all constraint lengths and all code rates.
4.2 Soft Input Dynamic Ranges
The VCP2 implementation implies that the soft inputs need to be quantized so that the branch metrics
satisfy the following bound B1 (branch metrics upper bound - absolute value):
2^(C - 1) - 1 ≥ (2 × (K - 1) + 2) × B1
K is the constraint length and C determines the truncation of state metrics that can be performed without
loss of decoding performance.
The VCP2 is designed with C = 13. The branch metrics can have a maximum dynamic range of 7 + 1 sign
bits [-128; +127]. This gives another branch metrics upper bound B2 ≤ 128.
So for a given constraint length, min (B1, B2) gives the final branch metrics maximum bound B.
To satisfy B in the branch metrics calculation, the soft input values, delivered as 8-bit-signed equalized
values, are linearly scaled with the following formula where 1/n is the rate.
Scaled = min (B1, B2)/n × SoftValue/128
Example
For K = 9, B1 ≤ 227.5 and the branch metrics range B2 is [-128; +127], so the branch metrics need to be
in the [-128; +127] range.
For rate 1/3, 128/3 ≈ 42, so the soft inputs need to be scaled by a factor of 0.333333 and saturated within
the range [-42; +42].
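The bound and the scaling formula of this section can be checked numerically with a short C sketch. The function names are hypothetical, and truncating toward zero when converting to integers is an assumption chosen to reproduce the [-42; +42] range of the example.

```c
#include <assert.h>

/* Final branch metrics bound B = min(B1, B2), where B1 follows from
 * 2^(C-1) - 1 >= (2*(K-1) + 2) * B1 and B2 = 128. */
static double bm_bound(int K, int C)
{
    double b1;
    assert(K >= 1 && C >= 2 && C <= 31);
    b1 = (double)((1L << (C - 1)) - 1) / (double)(2 * (K - 1) + 2);
    return b1 < 128.0 ? b1 : 128.0;
}

/* Scaled = min(B1, B2)/n * SoftValue/128, for an 8-bit signed soft value
 * and a rate 1/n code. ASSUMPTION: results are truncated toward zero and
 * saturated to +/- floor(min(B1, B2)/n). */
static int scale_soft(int soft, int K, int C, int n)
{
    double per_input = bm_bound(K, C) / n;
    int limit = (int)per_input;
    int scaled = (int)(per_input * soft / 128.0);
    if (scaled > limit) scaled = limit;
    if (scaled < -limit) scaled = -limit;
    return scaled;
}
```

With K = 9, C = 13, and n = 3, the bound is 128 (B1 alone would be 227.5) and every scaled soft input stays within [-42; +42].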
Table 4 summarizes the calculations for the different constraint lengths and rates:
The VCP2 can be configured to generate either hard decisions (one bit per decision) or soft decisions
(8-bit value per decision). Ordering of the VCP2 decisions depends on the OUT_ORDER field of VCPIC3
and, if the DSP is set to work in big-endian mode and the results are soft decisions, on the SD field of
VCPEND (see the VCP2 endian mode register, Section 6.3). The decisions buffer start address must be
double-word aligned and the buffer size must be a multiple of 8 bytes.
The soft decisions in the VCP2 are initially computed with the path metrics as 13-bit values. The results
are then clipped to 8-bit signed integer values before being stored in the traceback soft decision memory.
The VCP2 contains several memory-mapped registers accessible by the CPU, the IDMA, the QDMA, and
the EDMA3. A configuration-bus access is faster than an EDMA3-bus access for isolated accesses
(typically when accessing control registers). EDMA3-bus accesses are used for EDMA3 transfers and
provide maximum throughput to/from the VCP2. The registers are listed in Table 5. For the memory map
and full register addresses, see the device-specific data manual.
The contents of the branch metric and traceback decision memories are not accessible. The DSP can
regard these memories as FIFOs, meaning you do not have to perform any indexing on the addresses.
• Data Transfer Alignment: Normal (non-emulation) mode data transfers to/from the VCP2
  must be aligned on a double-word (64-bit) boundary. Alignment can be forced in C using
  the 'DATA_ALIGN' pragma. Non-alignment results in data transfer failure.
  Example:
    #pragma DATA_ALIGN(configIc, 8)  // Should be double-word aligned
    VCP_ConfigIc configIc;           // VCP Input Configuration Reg
• Data Transfer Size: Normal (non-emulation) mode data transfers to/from the VCP2 must
  be of a length that is an 8-byte (double-word) multiple.
• Emulation mode transfers are performed on 32-bit boundaries and are 4 bytes in length.
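The alignment and size rules for normal-mode transfers can be checked in software before a transfer is set up. The following is a portable sketch with hypothetical helper names; it is not part of any TI library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rounds a byte count up to the next multiple of 8 (one double word). */
static size_t round_up_dword(size_t n_bytes)
{
    return (n_bytes + 7u) & ~(size_t)7u;
}

/* Returns 1 if the buffer start address is double-word aligned and the
 * length is a multiple of 8 bytes, 0 otherwise. */
static int is_dword_transfer_ok(const void *buf, size_t n_bytes)
{
    return ((uintptr_t)buf % 8u == 0) && (n_bytes % 8u == 0);
}
```

A buffer that holds, for example, 13 bytes of decisions would be allocated as round_up_dword(13) = 16 bytes so that the transfer length satisfies the double-word rule.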
The VCP2 peripheral identification register (VCPPID) is a constant register that contains the ID and ID
revision number for the peripheral. The PID stores version information used to identify the peripheral. All
bits within this register are read-only (writes have no effect), meaning that the values within this register
should be hard-coded with the appropriate values and must not change from their reset state.
The VCPPID register is shown in Figure 4 and described in Table 7.
Figure 4. VCP2 Peripheral ID Register (VCPPID)
TCI648x DSP
Bits    31-24      23-16    15-8     7-0
Field   Reserved   TYPE     CLASS    REV
Reset   R-0        R-0x01   R-0x11   R-rev

TCI649x DSP
Bits    31-30    29-28      27-16     15-11     10-8        7-6      5-0
Field   SCHEME   Reserved   PID       RTL       MAJOR       CUSTOM   MINOR
Reset   R-1      R-0        R-0x80A   R-<rtl>   R-<major>   R-       R-<minor>
LEGEND: R/W = Read/Write; R = Read only; -n = value after reset
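As a sketch, a VCPPID value read on a TCI648x device can be split into its fields using the bit positions of Figure 4 (TYPE in bits 23-16, CLASS in bits 15-8, REV in bits 7-0). The type and function names are hypothetical, and the register address is deliberately left out because it must come from the device-specific data manual.

```c
#include <assert.h>
#include <stdint.h>

/* Fields of the TCI648x VCPPID layout. */
typedef struct {
    uint8_t type;    /* bits 23-16, peripheral type */
    uint8_t pclass;  /* bits 15-8, peripheral class */
    uint8_t rev;     /* bits 7-0, peripheral revision */
} vcppid_648x;

/* Hypothetical helper: decodes a raw 32-bit VCPPID value. */
static vcppid_648x vcppid_decode_648x(uint32_t reg)
{
    vcppid_648x f;
    f.type   = (uint8_t)((reg >> 16) & 0xFFu);
    f.pclass = (uint8_t)((reg >> 8) & 0xFFu);
    f.rev    = (uint8_t)(reg & 0xFFu);
    return f;
}
```

For a TCI648x VCP2, the decoded type is expected to be 01h and the class 11h, while the revision depends on the silicon.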
Table 7. VCP2 Peripheral ID Register (VCPPID) Field Descriptions
Bit     Field      Value   Description
TCI648x DSP
31-24   Reserved   0       Reserved
23-16   TYPE       01h     Peripheral Type. Identifies the type of the peripheral.
15-8    CLASS      11h     Peripheral Class. Identifies the class.
7-0     REV        <rev>   Peripheral Revision. Identifies the revision level of the specific instance of
                           the peripheral. This value should begin at 0x01 and be incremented each time
                           the design is revised.