Cirrus Logic CS48LV13 User Manual

[CS48LV12 block diagram, image not reproduced. Labels recovered from the figure: Codec; SoC or Application Processor; Near End and Far End; Codec Port and Host Port carrying I2S raw PCM and I2S clean PCM; one voice path with PGA, Noise Reduction, Acoustic Echo Canceller, Residual Echo Suppressor (NLP), Spectrally Matched Comfort Noise, Dynamic FlexEQ, ALC, and PGA; a second voice path with PGA, Comfort Noise, Automatic Volume Control, Automatic Level Control, Dynamic FlexEQ, Noise Reduction, and PGA; Full-Duplex Control + Voice Activity Detection + Double-Talk Detection; Media Processor; a label reading Optional; pins GND, VD, VL, Debug, MCLK, CLK, RESET, BUSY, INT, and SPI/I2C.]
CS48LV12/13
Ultralow Power HD Voice Processors,
Featuring SoundClear® Technology
Overview of Features
• ASR Enhance™ automatic speech recognition (ASR) preprocessing for increased ASR accuracy in noisy environments
• SoundClear Voice™ noise reduction, echo cancellation, and voice enhancement
• RAPID2™ GUI-based diagnostic and tuning tool for ease of design-in
• Media postprocessing support
—Integrated Cirrus Logic playback enhancement for speakers and headphones
—Optional Dolby® and DTS® playback enhancement
• Voice activity detector (VAD) enables always-on speech recognition
• TrulyHandsfree™ voice control by Sensory, Inc. supported
• Powerful 130-MHz dual-MAC 32-bit DSP core
• Ultralow power consumption (core typically <8 mW @ 1 V during narrowband call)
• I2S, I2C, SPI™ digital connectivity
1. Use of TrulyHandsfree-, Dolby-, or DTS-supported features requires the existence and proof of a valid license agreement with the corresponding company to be able to use or distribute its technology in any finished end-user or ready-to-use final product.
http://www.cirrus.com
CS48LV12 Block Diagram
Copyright Cirrus Logic, Inc. 2014
(All Rights Reserved)
DS1057F1
FEB ‘14
[CS48LV13 block diagram, image not reproduced. Labels recovered from the figure: Codec; Host; Near End and Far End; Codec Port and Host Port carrying I2S raw PCM and I2S clean PCM; one voice path with PGA, Noise Reduction, Acoustic Echo Canceller, Residual Echo Suppressor (NLP), Spectrally Matched Comfort Noise, Dynamic FlexEQ, ALC, and PGA; a second voice path with PGA, Comfort Noise, Automatic Volume Control, Automatic Level Control, Dynamic FlexEQ, Noise Reduction, and PGA; Full-Duplex Control + Voice Activity Detection + Double-Talk Detection; Advanced Media Processor; SoundClear ASR Enhance™; PLL/ClkMgr; pins GND, VD, VL, BUSY, RESET, and SPI/I2C. Optional Voice Processing Features: Voice Activity Detector, SoundClear ASR Enhance™, Sensory Inc. TrulyHandsfree™ Voice Control. Optional Postprocessing Features: Dolby Laboratories, Inc. Audio Postprocessing Algorithms, DTS, Inc. Audio Postprocessing Algorithms.]
CS48LV13 Block Diagram
SoundClear Voice™ Features
• Flexible and tunable, enabling freedom in product ID, transducer placement and selection
• Robust proprietary algorithms assure consistent performance across diverse sound environments and off-axis
• Ambient-aware technologies constantly compensate for changing noise types and levels and varying product positioning
• HD voice/wideband and narrowband support
• Supports handset, tablet, laptop, and speakerphone single- and multiple-microphone configurations
• Conference room–grade AEC plus nonlinear residual echo suppressor for superior full-duplex speakerphone operation without echo
• Tx and Rx noise elimination
• Automatic volume control for both Tx and Rx compensates for variations in talker level, proximity, and orientation
• Ambient-aware volume control compensates for variation in Rx voice level and near-end noise level
• Tx and Rx comfort noise generators provide natural, smooth transitions between single-talk Rx, single-talk Tx, double-talk, and silent states
• Tx spectrally matched comfort-noise generator samples and synthesizes ambient noise for more transparent talk state transitions
• Tx and Rx parametric EQ simplifies carrier and industry compliance, achieving natural sound and compensating for transducer limitations
—Up to four concurrent Tx and four concurrent Rx filters
—Eight different filter types can be combined to achieve exact requirements
—Each filter has tunable frequency, gain, and Q or bandwidth
• Ambient-aware dynamic parametric EQ for Tx or Rx enables automatic real-time tuning for improved intelligibility and to compensate for transducer characteristics; responds to
—Tx or Rx stream amplitude
—Near-end ambient noise level
—User controls
• Rx compander for optimal speaker output level
• Mixed and mismatched microphone compensation
• Automatic calibration for up to ±6-dB sensitivity variation
• Compensation for microphone phase differences
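The parametric EQ bullets above describe filters with tunable frequency, gain, and Q. As an illustration only, the sketch below computes one such filter as a standard peaking biquad using the widely published Audio EQ Cookbook formulas. This is a generic textbook technique, not the SoundClear implementation, and the function names and the 16-kHz wideband sample rate are assumptions made for the example.

```python
import cmath
import math


def peaking_eq_coeffs(f0_hz, gain_db, q, fs_hz):
    """Return normalized (b, a) coefficients for one peaking biquad.

    Hypothetical helper; uses the Audio EQ Cookbook peaking-filter formulas.
    """
    a_lin = 10.0 ** (gain_db / 40.0)          # square root of the linear peak gain
    w0 = 2.0 * math.pi * f0_hz / fs_hz        # center frequency in radians/sample
    alpha = math.sin(w0) / (2.0 * q)
    a0 = 1.0 + alpha / a_lin                  # used to normalize so that a[0] == 1
    b = [(1.0 + alpha * a_lin) / a0,
         -2.0 * math.cos(w0) / a0,
         (1.0 - alpha * a_lin) / a0]
    a = [1.0,
         -2.0 * math.cos(w0) / a0,
         (1.0 - alpha / a_lin) / a0]
    return b, a


def gain_db_at(b, a, f_hz, fs_hz):
    """Evaluate the filter's magnitude response in dB at one frequency."""
    z1 = cmath.exp(-1j * 2.0 * math.pi * f_hz / fs_hz)   # z^-1 on the unit circle
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20.0 * math.log10(abs(num / den))


# Example: +6 dB boost at 1 kHz, Q = 1, assumed 16-kHz sample rate.
b, a = peaking_eq_coeffs(1000.0, 6.0, 1.0, 16000.0)
```

With these formulas the response equals the requested gain at the center frequency and returns to 0 dB at DC, which is why a small number of such sections can shape a transducer response without disturbing the rest of the band.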
Cirrus Logic Speech Features (CS48LV13 Only)
• ASR Enhance preprocessor for ASR engines
—Improves speech command recognition of ASR engines in noisy environments
—Can be applied with VAD described below and optional TrulyHandsfree voice control and/or cloud-based ASR
Optional Speech Features Supported (CS48LV13 Only)
• Sensory TrulyHandsfree voice control
• Complete always-on local ASR solution
—Ultralow power always-on Cirrus Logic voice detection (VAD)
—ASR Enhance preprocessor increases ASR accuracy in noisy conditions
—TrulyHandsfree voice trigger and voice command
Audio Playback Features
• Full complement of Cirrus Logic audio-processing algorithms preintegrated to enhance mono or stereo playback over speakers or headphones
—Virtual surround
—Bass enhancement
—Bass virtualization
—Parametric EQ
—Multiband compressor
• Graphical interface for selection and tuning of algorithms
Optional Audio Playback Features Supported (CS48LV13 Only)
• Dolby® postprocessing (enhancement and virtualization)
• DTS® postprocessing (enhancement and virtualization)
• Headphone and speaker playback support
Applications
The CS48LV12/13 processors provide a complete voice, audio playback, and speech preprocessing solution for smartphone, tablet, laptop, headphone/headset, and speaker/speakerphone applications. They are optimized for devices where pristine voice quality and echo-free, full-duplex communication are required, especially under conditions of adverse noise and where space and power are limited.
General Description
The CS48LV12 and CS48LV13 ultralow power voice processors feature Cirrus Logic’s patented SoundClear® technology to provide a new standard in HD Voice quality performance, functionality, and cost effectiveness. These ICs provide a total voice processing solution for handset and hands-free communications that delivers best-in-class noise reduction, echo cancellation, and speech recognition. The CS48LV12 and CS48LV13 can enable advanced features including always-on voice trigger, command recognition, ASR preprocessing, and audio enhancement. Innovative single- and multi-mic algorithms with intelligent speech tracking and noise elimination assure optimal user experience in the most challenging and dynamic noise environments and deliver superior performance despite varying speech levels, talker distance, or product orientation.
The CS48LV12 and CS48LV13 feature an integrated media processor with built-in virtual surround, bass enhancement, bass synthesis, multi-band compression, and parametric EQ algorithms to enrich music playback through wireless speakers and headphones. All are tunable through a simple GUI. In addition, the CS48LV13 provides the option of adding a Cirrus Logic proprietary Voice Activity Detector for always-on ASR capability and integrated TrulyHandsfree™ Voice Control. Also available is Cirrus Logic’s ASR Enhance™ specialized preprocessor to enhance the accuracy of any ASR (Automatic Speech Recognition) engine under noisy conditions. An expanded menu of third-party media playback algorithms from Dolby and DTS can also be integrated. Powerful real-time diagnostic and tuning tools combined with specialized labs and a global applications support network assure ease of design, optimal performance, and achievement of network and industry compliance.

1 Documentation

This document describes the CS48LV12 and CS48LV13 HD voice processors. When evaluating or designing a system around the CS48LV12/13 processors, use this document in conjunction with the documents listed in Table 1-1.
Table 1-1. CS48LV12/13 Related Documentation
Document Name | Description
CS48LV12/13 Data Sheet | This document
CS48L10 Hardware User’s Manual | Includes detailed system design information including typical connection diagrams and boot procedures applicable for CS48L10/L11/LV12/LV13
CRD48L10 4in4out Board Manual | Manual for development and evaluation board for CS48L10/L11/LV12/LV13
Micro-condenser User’s Guide | Instructional manual for using Micro-condenser for creating microcode and flash image for embedded systems applications
AN344 | Firmware User’s Manual for CS48L10/L11/LV12/LV13
AN344CBE | Applications note for Cirrus Bass Enhancement (CBE) Module
AN344CBV | Applications note for Cirrus Bass Virtualization (CBV) Module
AN344CVT | Applications note for Cirrus Virtualization Technology (CVT)
AN344EQ | Applications note for Cirrus Equalization (EQ) Module
AN344TC | Applications note for Tone Control Post-processor Module
RAPID2 User’s Guide | Instructional manual for using the RAPID2 tool for voice processing diagnostics and tuning
DSPComposer User’s Manual for CS48LV12/13 | Manual for using the CS48LV12/13 version of DSP composer tool for post-processing configuration and tuning
The primary scope of this document is to provide the hardware specifications of the CS48LV12/13 family of devices. These include hardware functionality, characteristic data, pinout, and packaging information. The intended audience includes system PCB designers, MCU programmers, and quality-control engineers.

2 Overview

The CS48LV12 and CS48LV13 products are based on Cirrus Logic 32-bit fixed-point DSPs, which feature the ultralow power, tiny footprint, high performance, and low cost required by today's mobile voice communication products. For ease of implementation and high computational efficiency, each product includes an embedded software package highly optimized for the DSP. Each product is designed to support multiple modes and configurations that match an array of product use models for smartphones, tablets, and mobile computing devices as well as a variety of consumer and automotive products with hands-free communication features.
The CS48LV12 incorporates Cirrus Logic SoundClear technology to perform all voice processing functions typically required in handset and hands-free products, including noise reduction, echo cancellation, and a comprehensive set of voice enhancement capabilities. SoundClear technology uses proprietary algorithms to decipher spatial and spectral characteristics of both the Rx (far-end) and Tx (near-end) digital voice streams and categorize various types of noise and speech, removing noise and competing talkers while automatically adjusting for SPL changes, changes in product position and orientation, and ongoing environmental changes. Other SoundClear modules monitor talk status (single Tx, single Rx, silence, double-talk), cancel echo, suppress residual echo, and inject comfort noise if required to achieve natural, consistent, full-duplex, echo-free conversation in both handset and hands-free modes.
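The four talk states named above follow directly from whether speech is present on each stream. The sketch below is a minimal illustration, assuming two hypothetical Boolean detector outputs (tx_speech for the near end, rx_speech for the far end); how the SoundClear modules actually derive and smooth these decisions is not described in this document.

```python
from enum import Enum


class TalkState(Enum):
    SILENCE = "silence"
    SINGLE_TX = "single Tx"
    SINGLE_RX = "single Rx"
    DOUBLE_TALK = "double-talk"


def classify_talk_state(tx_speech, rx_speech):
    """Map per-stream speech flags (assumed inputs) to one of four talk states."""
    if tx_speech and rx_speech:
        return TalkState.DOUBLE_TALK
    if tx_speech:
        return TalkState.SINGLE_TX
    if rx_speech:
        return TalkState.SINGLE_RX
    return TalkState.SILENCE


# Example: far-end talker only.
state = classify_talk_state(tx_speech=False, rx_speech=True)
```

In a real system these flags would typically be held over short time windows so that the state does not flicker between words; that smoothing is omitted here.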
The CS48LV12 also includes integrated media processing capabilities to enhance audio playback over internal speakers or attached devices such as speakers and headphones.
The CS48LV13 includes all the voice and media processing capabilities of the CS48LV12 as well as ASR preprocessing (ASR Enhance) to remove noise that limits ASR accuracy and reliability. It also includes a specialized Voice Activity Detector (VAD) that enables the CS48LV13 to remain in a very low power "always-on" state until the VAD detects human speech in the proximity of a microphone. This feature is typically used in conjunction with a local ASR solution such as the CS48LV13's optional Sensory TrulyHandsfree voice control.
The CS48LV13 also supports several optional features requiring third-party licenses:
• Sensory, Inc. TrulyHandsfree voice control
• Dolby Laboratories, Inc. media playback enhancement algorithms
• DTS, Inc. media playback enhancement algorithms

A key feature of both products that enables ease of implementation, quick time to market, and performance optimized to a particular ID is the RAPID2 diagnostic and tuning tool. This Microsoft Windows®-based tool provides GUI-based monitoring and control of all critical SoundClear parameters as well as system-level measurements and statistics. RAPID2 tool features are described in Section 3.16.

2.1 Licensing

Licenses are required for any third-party audio-processing algorithms, including but not limited to Sensory, Inc. TrulyHandsfree™ and Dolby and DTS postprocessing solutions provided for the CS48LV12/13. A royalty-free Cirrus Logic license is also required to distribute products containing the CS48LV12 or CS48LV13 embedded software packages required for the functionality described in this data sheet. Contact your local Cirrus Logic sales representative for more information.

3 Functional Description

Figure 3-1. CS48LV12 Block Diagram

Figure 3-2. CS48LV13 Block Diagram

3.1 Cirrus Logic 32-bit DSP Core

The core is a high-performance, 32-bit, fixed-point DSP that is capable of performing two multiply-and-accumulate (MAC) operations per clock cycle. The core has eight 72-bit accumulators, four X- and four Y-data registers, and 12 index registers. It can operate at up to 130 MHz, depending on mode and concurrency requirements, but it may also operate at low speed to support specialized low-power modes, such as always-on voice wake.
The DSP core is coupled to a flexible DMA engine. The DMA engine can move data between peripherals, such as the multichannel serial audio port, and any DSP core memory without the intervention of the DSP. The DMA engine off-loads data-move instructions from the DSP core, leaving more MIPS available for signal-processing instructions.
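The figures above imply a peak arithmetic budget that can be worked out directly. The sketch below is back-of-envelope arithmetic only; it assumes full-precision 64-bit products of two signed 32-bit words summed in a signed 72-bit accumulator, and it uses the conventional telephony sample rates of 8 kHz (narrowband) and 16 kHz (wideband), none of which are stated in this section.

```python
CLOCK_HZ = 130_000_000   # maximum core clock stated above
MACS_PER_CYCLE = 2       # dual-MAC core
WORD_BITS = 32           # data word width
ACC_BITS = 72            # accumulator width

# Peak MAC rate at full clock: 260 million MACs per second.
PEAK_MACS_PER_SECOND = CLOCK_HZ * MACS_PER_CYCLE

# Bits of headroom above a full 64-bit product (assumption: products are
# accumulated at full precision).
GUARD_BITS = ACC_BITS - 2 * WORD_BITS


def macs_per_sample(sample_rate_hz):
    """Peak MAC budget available for each audio sample period."""
    return PEAK_MACS_PER_SECOND // sample_rate_hz


def fits_in_accumulator(num_products):
    """True if this many worst-case signed products can be summed without overflow."""
    largest_product = (-(2 ** (WORD_BITS - 1))) ** 2   # (-2^31)^2 = 2^62
    acc_max = 2 ** (ACC_BITS - 1) - 1                  # signed 72-bit maximum
    return num_products * largest_product <= acc_max


narrowband_budget = macs_per_sample(8_000)    # assumed 8-kHz narrowband rate
wideband_budget = macs_per_sample(16_000)     # assumed 16-kHz wideband rate
```

Under these assumptions the eight guard bits allow at least 2^8 = 256 worst-case products to be accumulated before overflow is possible, which is what makes long filter sums practical without intermediate scaling.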

3.2 Processing Groups

Providing consistent high-quality Tx and Rx voice streams in constantly changing environments requires a complex data flow with constant interaction between various functional modules. While the actual data flow is more complex and not linear, the architecture can be approximated as a set of in-line processing groups or chains that operate in different modes depending on the current use model as follows.
CS48LV12:
1. Tx Voice DSP Chain
2. Rx Voice DSP Chain
3. Voice DSP Control and Detection
4. Audio Playback DSP Chain
The CS48LV13 includes two additional processing groups:
5. Speech DSP Chain
6. Advanced Audio Playback DSP Chain
Cirrus Logic provides two specialized tools for controlling and tuning the various processing groups. For voice- and speech-related processing groups, the RAPID2 tool provides real-time analysis and tuning of all parameters. For audio-playback chains, a specialized version of the DSP Composer tool is used for real-time control and tuning.
Each group may operate in more than one mode. Typically, smartphones have three or more operating modes:
a) Handset mode using processing Groups 1, 2, and 3
b) Speakerphone mode using processing Groups 1, 2, and 3
c) Media playback mode using processing Group 4 or 6
d) ASR mode using processing Group 5
Each mode may have tuning variations; for example, handset mode may include default tuning (using integrated microphones and receiver), pass-through tuning for BT accessories that perform their own voice processing, and wired-headset tuning.
Tablets may have a single mode, similar to Speakerphone Mode, or may have multiple modes using different microphones and processing, depending on their orientation and desired use model, such as portrait, landscape, handheld, on stand, personal or group. Similarly, other applications may have a single or multiple modes using one or more processing groups.
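The relationship between operating modes and processing groups listed above can be captured in a small lookup table. The sketch below is illustrative only; the dictionary and function names are hypothetical and do not correspond to any Cirrus Logic API. Group numbers follow the lists in this section, with Groups 5 and 6 present only on the CS48LV13.

```python
# Processing groups as numbered in this section.
GROUPS = {
    1: "Tx Voice DSP Chain",
    2: "Rx Voice DSP Chain",
    3: "Voice DSP Control and Detection",
    4: "Audio Playback DSP Chain",
    5: "Speech DSP Chain",
    6: "Advanced Audio Playback DSP Chain",
}

# Groups present on each device.
DEVICE_GROUPS = {
    "CS48LV12": {1, 2, 3, 4},
    "CS48LV13": {1, 2, 3, 4, 5, 6},
}

# Each mode lists its alternative group sets (media playback uses Group 4 or 6).
MODE_OPTIONS = {
    "handset": [{1, 2, 3}],
    "speakerphone": [{1, 2, 3}],
    "media_playback": [{4}, {6}],
    "asr": [{5}],
}


def available_options(mode, device):
    """Return the group sets for a mode that the given device can run."""
    present = DEVICE_GROUPS[device]
    return [sorted(option) for option in MODE_OPTIONS[mode] if option <= present]


# Example: which groups can a CS48LV13 use for media playback?
lv13_media = available_options("media_playback", "CS48LV13")
```

Handset and speakerphone modes use the same groups and differ only in tuning, so a real configuration table would also carry a tuning profile per mode; that detail is left out here.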

3.3 Tx Voice DSP Chain

The Tx Voice DSP Chain accepts raw PCM voice data from one or two microphones and uses this data, along with any incorporated spatial information, to remove undesired noise and competing speech while preserving voice integrity. It also includes AEC and residual echo suppression functions to remove echo. In combination with the Voice DSP Control and Detection and Rx Voice DSP Chain groups, it manages full-duplex operation. A number of additional voice processing blocks are included to provide a natural, intelligible, and consistent PCM voice stream.
Fig. 3-3 is a simplified diagram of the Tx Voice Chain. The start of the chain is fed by one or two voice PCM streams originating from the voice microphones and arriving on one of the CS48LV12/13’s two I2S DAI inputs (typically DAI_2). The output of the chain, Tx Out, is transmitted out of one of the two I2S DAO outputs (typically DAO_1), typically to a host processor, applications processor, or system on a chip (SoC), which then sends the stream to a digital baseband or other network processor.
Figure 3-3. Tx Voice DSP Chain

3.4 Programmable Gain Amplifiers (PGAs)

One set of PGAs controls the level of the input streams from the mics, and another PGA at the output of the chain controls the level of the output stream (Tx Out). The RAPID2 tool includes PGA level meters with clipping detectors that can be used to adjust PGA gain levels to maintain maximum SNR without danger of clipping.
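The trade-off described above, maximizing level without clipping, can be illustrated with a simple digital gain stage. The sketch below is a generic example and not the device's PGA implementation; it assumes 16-bit signed PCM samples, and the function names are hypothetical.

```python
import math

FULL_SCALE = 32767  # assumed 16-bit signed PCM positive full scale


def apply_pga(samples, gain_db):
    """Apply a gain in dB and report how many samples had to be clipped."""
    gain = 10.0 ** (gain_db / 20.0)      # convert dB to a linear amplitude factor
    out = []
    clipped = 0
    for s in samples:
        v = int(round(s * gain))
        if v > FULL_SCALE:
            v = FULL_SCALE
            clipped += 1
        elif v < -FULL_SCALE - 1:
            v = -FULL_SCALE - 1
            clipped += 1
        out.append(v)
    return out, clipped


def max_safe_gain_db(samples):
    """Largest gain in dB that keeps the observed peak at or below full scale."""
    peak = max(abs(s) for s in samples)
    return 20.0 * math.log10(FULL_SCALE / peak)


# Example: a block whose peak is half of full scale has about 6 dB of headroom.
block = [16384, -12000, 8000]
headroom_db = max_safe_gain_db(block)
boosted, clip_count = apply_pga(block, headroom_db - 0.01)
```

A level meter with a clip counter, as in this sketch, gives the same kind of feedback described for the RAPID2 meters: raise the gain until the clip count becomes nonzero, then back off.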

3.5 Noise Reduction

SoundClear Voice technology uses a variety of innovative techniques and algorithms to distinguish between desired speech, undesired speech (competing talkers), and nonspeech, and then suppresses all but desired speech.
In a two-mic configuration, SoundClear Voice technology uses signature analysis techniques to distinguish and eliminate noise and to preserve the vocal quality of speech. Also, a proprietary beamforming technology analyzes the aural space around the user and classifies sounds based on direction of arrival and proximity. It also makes real-time voice-tracking