SI/SD Voice Recognizer, Recorder/Player, and Speech Synthesizer
GENERAL DESCRIPTION
The MSM6679A-110 Voice Recognition Processor (VRP) is a slave-mode device that performs
five functions: speaker-independent (SI) voice recognition, speaker-dependent (SD) voice
recognition, solid-state sound recording, sound playback, and speech synthesis. The highly
integrated device also provides an on-chip memory controller, Flash memory interface, analog
data conversion, Oki speech synthesizer interface, and pulse width modulation (PWM) sound
output.
For SI recognition, the MSM6679A-110 contains a vocabulary template in external memory.
Pretrained SI vocabularies eliminate the need for laborious training, as usually required by SD
products. The memory requirements depend on the size of the vocabulary. The MSM6679A-110 can tolerate background noise, while providing high recognition accuracy. In its designated
operating environment, the device achieves a typical recognition accuracy of >95% (using an
Oki-defined test procedure).
For SD recognition, the MSM6679A-110 stores SD vocabulary templates, as defined by the user,
in external SRAM. The MSM6679A-110 can create SD vocabularies of up to 61 words each, with
each word using approximately 50 bytes.
In addition to providing voice recognition capabilities, the MSM6679A-110 integrates a solid-state recorder/player, speech synthesis functions, and a tone generator. ADPCM recording/
playback provides high quality sound and efficient memory utilization. The MSM6679A-110 can
respond to spoken commands, verbally or with tones, via an on-chip speech synthesizer and
tone generator. For larger speech-synthesis requirements, the MSM6679A-110 also provides a
glueless MSM665x control interface for off-chip speech synthesis.
The MSM6679A-110 can interface to any application or personal computer via a serial interface
through an open, device-independent serial mode API (SMAPI). To accelerate code development,
Oki supplies an evaluation kit, and assembly and C language programs for this product.
FEATURES
• SI recognition
- Up to 20 - 25 words in each vocabulary
- Multiple vocabulary support
• SD recognition
- Up to 61 words in each vocabulary
- Multiple vocabulary support
• Speech synthesis
- Up to 2.3-sec internal and 27.6-sec external
speech synthesis on-chip; sample looping
and concatenation allow even longer
phrases.
- On-chip controller for MSM665x speech
synthesizer
- Standard beep tone outputs
- Pulse code modulation (PCM) and
adaptive differential pulse code
modulation (ADPCM) voice or sound-effect output
• Speech capture and playback
- 28-kbps ADPCM speech compression
• Serial ASCII command interface
• 6944-Hz audio input sample rate for record
and playback
• 10-kHz sample rate for voice recognition
• 200-msec recognition latency
• Flexible memory mapping for EPROM,
FLASH, and SRAM
• 32-MHz operation
• Packages: 84-pin PLCC (QFJ84-P-S115) or
100-pin TQFP (TQFP100-P-1414-0.50-K)
13/3N/C+
14/4N/C+
15/5N/C
16/6N/C
17/7N/C
18/8N/C
19/9RXD1InputSerial Port Receive. This is the receive data line for serial port.
-/11N/C(not connected)Reserved. This pin is reserved for future use and should be left open.
-/14N/C(not connected)Reserved. This pin is reserved for future use and should be left open.
N/C
Signal TypeDescription
(not connected)Reserved. This pin is reserved for future use and should be left open.
Analog Input. These ten inputs are tied together and serve as the
Analog input
Reference voltage
Input
(do not connect) Reserved. These pins are reserved for future use and must be left open.
analog input. Signal conditioning, via a bandpass filter and gain circuit,
is required before this input.
Analog Ground. This pin provides an analog ground point, allowing
independent grounding of the analog and digital circuitry. Separate
grounds reduce the impact of digital switching noise on analog
sampling accuracy.
Analog Reference Voltage. The MSM6679A-110's on-chip A/D
converter uses this analog reference voltage when converting an
analog signal into digital samples
Reserved. These pins are reserved for future use and must be tied to
VDD.
Oscillator 0/External Clock. When the MSM6679A-110 uses a crystal
oscillator, this input is the oscillator input pin. The pin is then
connected to one side of a crystal and load capacitor. When used with
an external clock, the external clock is applied to this input.
Oscillator 1. When the MSM6679A-110 uses a crystal oscillator, this
output is the oscillator output pin. The pin is then connected to one
side of a crystal and load capacitor. When used with an external clock,
this output is left unconnected.
Memory Address Latch Enable. An external memory latch is controlled
by this signal, the address latch enable output.
-/36N/C(not connected)Reserved. This pin is reserved for future use and should be left open.
-/38N/C(not connected)Reserved. This pin is reserved for future use and should be left open.
Signal TypeDescription
ROM Read. This is a strobe signal for direct connection to an external
Output
Bidirectional I/O
OutputsMemory Address Bus. These are the upper eight address pins.
ROM's READ input. When asserted LOW, this signal indicates that the
MSM6679A-110 is ready to read data from the ROM.
RAM Write. This is a strobe signal for direct connection to an external
RAM's WR input. When asserted LOW, this signal indicates that the
MSM6679A-110 is ready to write data to RAM.
RAM Read. This is a strobe signal for direct connection to an external
RAM's RD input. When asserted LOW, this signal indicates that the
MSM6679A-110 is ready to read data from RAM.
MSM665x Reset. This pin provides a reset signal for an external
speech synthesis engine.
Flash Bank Control (Extended Segments). This is the control signal for
flash memory banking.
MSM665x Next Address Request. This pin signals to the
MSM6679A-110 that the external speech synthesis engine is ready for
another command.
Reserved. These pins are reserved for future use and must be left open.
Reserved. These pins are reserved for future use and should be left open.
Voice Out. This pin is the PWM output for speech synthesis, voice
sample playback, and voice prompts. An external integrator must be
used to convert this to an analog signal.
Memory Address/Data Bus. These are multiplexed address/data lines
for the eight data bits and the lower eight address bits (the upper eight
address bits are not multiplexed).
49/45
50/46A15
51/47N/C–InputReserved. This pin is reserved for future use and must be tied to GND.
52/48N/C
53/49N/C
-/50,51N/C(not connected)
54/52N/C
55/53N/C
56/54A15FLIPOutput
57/55STROBEOutput
58/56ROMPAGE0
59/57ROMPAGE1
60/58N/C(do not connect)Reserved. This pin is reserved for future use and must be left open.
61/60BUSYInput
62/61SIOutput
63/62SDOutput
64/63GNDDigital GroundGround.
65/65N/C(do not connect)Reserved. This pin is reserved for future use and must be left open.
66/66LOADPGMOutput
67/67RAMPAGE0
68/68RAMPAGE1
69/69N/C
70/70N/C
71/71N/C(do not connect)
72/72N/C
73/73N/C
74/74N/C
-/75,76N/C(not connected)
Pin Name
A14
-/59N/C(not connected)Reserved. This pin is reserved for future use and should be left open.
-/64N/C(not connected)Reserved. This pin is reserved for future use and should be left open.
Signal TypeDescription
OutputsMemory Address Bus. These are the upper eight address pins.
(do not connect)
(do not connect)
Outputs
Output
Reserved. These pins are reserved for future use and must be left open.
Reserved. These pins are reserved for future use and should be left open.
Reserved. These pins are reserved for future use and must be left open.
Memory Address A15 Flip. This signal inverts the A15 address signal
for 32-Kbyte bank switching on the local memory bus.
MSM665x Strobe. This output provides the LOAD signal for an external
speech synthesizer.
ROM Page Select. These signals select one of four 64-Kbyte ROM
pages.
MSM665x Busy. When using an external MSM665x device, this pin
monitors the MSM665x BUSY signal and connects directly to the
MSM665x BUSY signal output.
MSM665x Serial Clock. This MSM6679A-110 output connects to the
MSM665x SI input. The SI pin is the MSM665x serial clock input pin.
MSM665x Serial Data. This MSM6679A-110 output connects to the
MSM665x SD input. The SD pin is the MSM665x serial data input pin.
Load Program. This signal allows the MSM6679A-110 to write data to
program memory. When asserted low, this signal should set the
program memory in write mode.
RAM Page Select. These signals support selection of one out of four
RAM pages. Each page is 64kbytes in size.
Reserved. These pins are reserved for future use and must be left open.
Reserved. These pins are reserved for future use and should be left open.
75/77
76/78N/C
77/79N/C(do not connect)
78/80N/C
79/81N/C
80/82N/C+InputReserved. This pin is reserved for future use and must be tied to VDD.
81/83RESInput
82/84EAInput
83/85VDD
84/87AVDD
-/100N/C(not connected)Reserved. This pin is reserved for future use and should be left open.
Pin Name
N/C
-/86N/C(not connected)Reserved. This pin is reserved for future use and should be left open.
-/88N/C(not connected)Reserved. This pin is reserved for future use and should be left open.
Signal TypeDescription
Reserved. These pins are reserved for future use and must be left open.
MSM6679A-110 Reset. External logic should assert this power-on
reset signal LOW when power is applied to the MSM6679A-110.
External ROM Address Select. This control signal enables external
ROM execution. This signal is usually connected to ROMPAGE1 and a
pullup resistor.
Absolute Maximum Ratings
Parameter                      Symbol   Conditions                Value                 Unit
Digital power supply voltage   VDD      GND = AGND = 0 V          –0.3 to +7.0          V
Input voltage                  VI       GND = AGND = 0 V          –0.3 to VDD +0.3      V
Output voltage                 VO       GND = AGND = 0 V          –0.3 to VDD +0.3      V
Analog power voltage           AVDD     GND = AGND = 0 V          –0.3 to VDD +0.3      V
Analog reference voltage       VREF     GND = AGND = 0 V          –0.3 to AVDD +0.3     V
Analog input voltage           VAI      GND = AGND = 0 V          –0.3 to VREF          V
Power dissipation              PD       Ta = 85˚C, per package    1300 max.             mW
                                        Ta = 85˚C, per pin        50 max.               mW
Storage temperature            TSTG     —                         –50 to +150           ˚C
1. Permanent device damage may occur if ABSOLUTE MAXIMUM RATINGS are exceeded.
Functional operation should be restricted to the conditions as detailed elsewhere in this
data sheet. Exposure to absolute maximum rating conditions for extended periods may
affect device reliability.
Operating Conditions
Digital power supply voltage
Analog power supply voltage
Analog reference voltage
Analog input voltage
Storage holding voltage
Operating frequency
Ambient temperature
The MSM6679A-110 performs both SI and SD recognition. SI vocabularies are embedded in the
MSM6679A-110. For SD recognition, each recognized phrase must be enrolled in the MSM6679A-110’s vocabulary by creating a composite template from multiple recordings of the same phrase.
Then the composite template is stored in SRAM or FLASH memory. During both SI and SD
recognition, the MSM6679A-110 performs the following steps:
1. After external band-pass filtering, the MSM6679A-110 converts the analog signal to PCM
samples.
2. The MSM6679A-110 extracts significant features from the sample data by frequency and
time-domain analysis.
3. The MSM6679A-110 compares the analyzed input with the reference data for each signal,
weighing the significance of similarities according to control software parameters. A score
(expressed as distance) is generated for each phrase.
4. The vocabulary phrase that achieves the highest score (or lowest distance) is judged to match
the input phrase, assuming that the score exceeds a predetermined threshold.
5. Via a special command, the MSM6679A-110 can also return the scores of the input against all
defined vocabulary phrases for SI or SD recognition. This feature allows external host
software to select the next best match, if the closest match is not contextually logical.
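Steps 4 and 5 above can be sketched on the host side as follows. This is only an illustrative sketch: the function names, the dictionary of distances, and the threshold value are assumptions made for the example, not part of the MSM6679A-110 command set. It treats a lower distance as a better match and rejects any candidate whose distance exceeds the threshold.

```python
from typing import Dict, List, Optional, Set, Tuple


def rank_matches(distances: Dict[str, int]) -> List[Tuple[str, int]]:
    """Order vocabulary phrases from best (lowest distance) to worst."""
    return sorted(distances.items(), key=lambda item: item[1])


def best_match(distances: Dict[str, int], max_distance: int) -> Optional[str]:
    """Return the closest phrase, or None if it fails the threshold."""
    ranked = rank_matches(distances)
    if not ranked:
        return None
    phrase, distance = ranked[0]
    return phrase if distance <= max_distance else None


def best_allowed(distances: Dict[str, int], max_distance: int,
                 allowed: Set[str]) -> Optional[str]:
    """Return the closest phrase that is also valid in the current context."""
    for phrase, distance in rank_matches(distances):
        if distance > max_distance:
            break  # every remaining candidate is even further away
        if phrase in allowed:
            return phrase
    return None


# Hypothetical distances returned for one utterance.
scores = {"stop": 120, "top": 95, "go": 400}
print(best_match(scores, 200))                    # closest phrase overall
print(best_allowed(scores, 200, {"stop", "go"}))  # next best that fits context
```

Here the host falls back to the second-ranked phrase when the closest one is not allowed in the current dialog state.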
SI Recognition
Oki supplies the MSM6679A-110 with predefined SI vocabularies which Oki builds from
hundreds of utterances by a wide variety of speakers. SI vocabularies are limited to 25 words or
less, which allows the MSM6679A-110 to achieve a net accuracy of >95%, even in noisy
conditions.
SI vocabularies are grouped into sub-vocabularies of ≤15 words, to maintain the highest
accuracy. Similar words in any one sub-vocabulary can cause substitution errors.
Oki Semiconductor’s standard cellular vocabulary is intended for an automotive environment
with a far-talk microphone. This vocabulary may work adequately in other conditions, such as
an office or outside, but recognition performance may be degraded.
SI vocabulary generation starts with collecting reference utterances from ≥400 speakers with:
• An equal mixture of males and females
• Accents from all regions of the country of intended use
• ~15% non-native speakers.
The samples should be generated from a randomly-ordered list, with each word spoken twice
and with a dummy word at the beginning and end. There must be >2 sec between each sample
for accurate data processing. To provide the audio fidelity required for high-quality recognition
training, a DAT recorder, together with the microphone that will be used in the final application,
is required. To ensure data integrity, data is submitted to Oki after collecting samples from the
first 20 speakers for initial screening. If acceptable, then the remaining collection may proceed.
If substitution errors are possible, collection of spare words during initial collection is
recommended. For example, alternate words to “Stop” and “Top” could be “Halt” and “First.”
Collections should contain a wide variety of the background sound conditions that will exist
during actual usage. For example, if the collection is for use in an automobile, conditions such
as vehicle speed, road conditions, various window opening positions, heater or AC blower
speeds and radio volumes should be varied during the collection. The signal-to-noise ratio
should be maintained at ≥20 dB.
To achieve high accuracy rates, phrase selection, data collection, background initialization
strategy, and control software need careful consideration. There are no published standards for
recognition accuracy.
Oki defines accuracy by:
Accuracy = 100% - E_RATE
E_RATE = E_SUB + 1/2 E_REJ
with the following definitions:
Parameters for Recognition Accuracy
Name                      Condition                                                         Symbol
Substitution Error        Most critical type of error, e.g., say "Five", recognize "Nine"   E_SUB
Rejection Error           Word not recognized, opportunity for operator to repeat           E_REJ
Gap Error                 Word spoken before recognizer ready                               E_GAP
Time-Out Error            Word length is too long                                           E_TME
Spurious Response Error   Sound or invalid word classified as a valid word                  E_SPU
                          (i.e., drop handset or speak wrong word)
A typical target accuracy of 97% is achieved with a 3% E_RATE, composed of a 1.5% E_SUB rate and
a 3% E_REJ rate.
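The accuracy definition can be checked with a short calculation. The sketch below simply restates the formula given in this section (rejection errors count at half weight); the function names are chosen for illustration only.

```python
def error_rate(e_sub: float, e_rej: float) -> float:
    """Combined error rate in percent: substitutions plus half the rejections."""
    return e_sub + 0.5 * e_rej


def accuracy(e_sub: float, e_rej: float) -> float:
    """Recognition accuracy in percent, per the definition in this section."""
    return 100.0 - error_rate(e_sub, e_rej)


# Typical target: 1.5% substitution errors and 3% rejection errors.
print(error_rate(1.5, 3.0))  # 3.0
print(accuracy(1.5, 3.0))    # 97.0
```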
SD Recognition
In SD recognition mode, the MSM6679A-110 can be trained to recognize up to 61 words. The
MSM6679A-110 can support multiple speakers by switching vocabularies, but only one speaker’s
vocabulary should be active at one time.
The end user enrolls a phrase in the MSM6679A-110’s vocabulary by recording the phrase three
times or more. The host Micro Controller Unit (MCU) controls the number of times each phrase
is enrolled. Generally, higher recognition accuracy is achieved with each additional enrollment.
The word set is made more robust by pronouncing each phrase slightly differently during initial
enrollment.
In addition to enrollment training, adaptive template updating can drive the accuracy towards
100%. The host MCU updates templates by first asking the speaker to confirm a recognized
phrase with a “yes” or “no” response, and subsequently updating the template for corresponding
words. The use of name tags (see next paragraph) facilitates this process.
Name Tag Recording
To facilitate SD recognition, the MSM6679A-110 supports recording and playback of name tags.
Name tags are used to confirm correct responses in SD recognition. For example, in a phone
dialer application, the user associates a “name” (which is recorded into memory) with a phone
number. The MSM6679A-110 then plays back the name tag so that the user can verify that the
recognized phrase is the correct one.
The VRP stores name tags in memory using an ADPCM compression algorithm at a speech data rate
of 28 kbps. The length of a name tag is controlled with a command from the user's host MCU
program. The maximum number of name tags possible is 61, but the actual number is dependent
upon record time and memory available. See the section on memory interface for more detail.
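To see how record time drives memory use, the following sketch estimates the storage needed for one name tag. It assumes that "28 kbps" means 28,000 bits per second and uses the record-time relation listed in the response summary (record time = tm × 14 msec). The function names are illustrative, and any per-tag overhead such as pointers or headers is not modeled.

```python
ADPCM_BITS_PER_SECOND = 28_000  # assumption: "28 kbps" taken as 28,000 bits/s
RECORD_TICK_MS = 14             # response summary: record time = tm * 14 msec


def record_time_ms(tm: int) -> int:
    """Record time in milliseconds for a given tm operand."""
    return tm * RECORD_TICK_MS


def name_tag_bytes(duration_ms: int) -> int:
    """Estimated ADPCM storage in bytes, rounded up to a whole byte."""
    bits_times_1000 = duration_ms * ADPCM_BITS_PER_SECOND
    return -(-bits_times_1000 // 8000)  # ceiling division, integers only


duration = record_time_ms(100)        # 1400 ms
print(duration, name_tag_bytes(duration))
```

Under these assumptions a 1.4-second name tag needs about 4900 bytes, which shows why the achievable number of name tags depends on record time and available memory.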
A critical item for high-accuracy speech recognition is correct design of the audio input circuit.
A circuit with appropriate gain and frequency responses must be placed between the microphone
and MSM6679A-110’s A/D input. Oki recommends input gain and a band pass filter with the
following characteristics:
• Four pole Chebyshev high-pass filter, 3 dB point at 225 Hz
• Dual-pole low-pass filter, 3 dB point at 4250 Hz
• Midband gain of 46 dB at 1000 Hz
The above gain and filter characteristics are obtained by using a rail-to-rail quad CMOS op-amp
and one-half supply rail splitter to bias the input signal at 2.5 V nominal.
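As a rough feel for the recommended midband gain, the sketch below converts decibels to a linear factor. It assumes the 46 dB figure is a voltage gain (20·log10 convention); the microphone level used in the example is hypothetical.

```python
def db_to_voltage_gain(gain_db: float) -> float:
    """Convert a voltage gain in dB to a linear ratio (20*log10 convention)."""
    return 10 ** (gain_db / 20)


midband_gain = db_to_voltage_gain(46)  # roughly 200x
mic_level_mv = 5.0                     # hypothetical microphone amplitude
print(round(midband_gain, 1), round(mic_level_mv * midband_gain), "mV")
```

A gain of about 200 means that a signal of a few millivolts at the microphone reaches roughly a volt at the A/D input, centered on the 2.5 V bias point.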
The MSM6679A-110 uses multiple analog inputs to improve sampling quality. An on-chip
analog-to-digital (A/D) conversion unit transforms the analog signal to a digital data stream.
Audio Output Interface
The MSM6679A-110 also provides the VOICEOUT1 PWM output. The MSM6679A-110 uses
ADPCM to generate voice or sound-effect output. ADPCM represents an improvement over
conventional PCM techniques in that it adaptively changes the quantizer step (scale factor) to suit
the waveform being encoded. The result is more efficient memory usage with no loss of quality.
Careful selection of the components for internal and external output filters and amplifiers is
recommended, because an incorrect choice impairs the original sound quality. The same care
applies to:
• Careful separation of analog and digital lines
• Grounding of analog lines at both ends
• Adequate separation from high-speed digital circuits to avoid distortion
Memory Interface
The memory control section manages RAM and/or ROM devices in two 64-Kbyte memory
spaces, in conjunction with internal memory for voice templates and working memory. Some
versions work with no external memory, some have some external RAM, some use only external
EPROM, and some use external memory in conjunction with both internal ROM and RAM. The
MSM6679A-110 requires a minimum of 32 Kbytes SRAM and 16 Kbytes ROM.
The following table shows vocabulary sizes and playback facilities for various configurations.
1. Phrase chaining features usually permit much longer overall playback durations; not
including external speech synthesizer.
2. SD recognition vocabularies are volatile in these configurations.
3. Per download. Vocabulary swapping by host permits unlimited vocabulary size.
The MSM6679A-110 supports up to 64 Kbytes of RAM per bank, and up to 64 Kbytes of ROM per
bank in separate memory spaces. The 8-bit data bus is multiplexed with the lower eight address
bits; the upper eight address bits are not multiplexed.
To demultiplex the address and data bits during all read and write cycles, the MSM6679A-110
requires an external octal latch, such as the 74H373. The MSM6679A-110’s Address Latch Enable
(ALE) signal controls the octal latch.
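The address demultiplexing described above can be pictured with a small model. This is a conceptual sketch only: it splits a 16-bit address into the upper byte, which appears on the dedicated address pins, and the lower byte, which shares the address/data bus and is held by the external latch. Signal timing and ALE polarity are not modeled, and the function name is illustrative.

```python
from typing import Tuple


def split_address(address: int) -> Tuple[int, int]:
    """Return (upper byte on dedicated pins, lower byte held by the latch)."""
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits (one 64-Kbyte space)")
    return address >> 8, address & 0xFF


upper, lower = split_address(0x4A00)
print(f"upper={upper:02X}h latched lower={lower:02X}h")
```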
For accessing the ROM and RAM address spaces, the MSM6679A-110 provides the separate
Write RAM (WRRAM), Read RAM (RDRAM), and ROM Read (ROMRD) signals. The RDRAM
and ROMRD signals connect directly to Output Enable (OE) control signal inputs on the RAM
and ROM, respectively. The WRRAM signal connects directly to the Write Enable (WE) control
signal input on the RAM.
The following diagrams show the memory maps for the MSM6679A-110. In all MSM6679A-110
memory maps, the DL data memory space must be in RAM. The DH data memory space and PH
program memory space can be implemented in ROM, EPROM, FLASH, RAM, or PROM.
In standalone applications, flash memory can be used for recording and subsequent playback of
voice prompts (e.g., the user’s name) and user sounds (e.g., DTMF dial tones, etc.).
Figure 10 shows the configuration for writing to flash memory used when writing SD templates
or when flash is used for data memory.
The MSM6679A-110 is capable of interfacing to the MSM665x family of Oki ROM, OTP, or
external EPROM speech synthesizers, allowing for up to 260 seconds of high-quality voice and
sound effects. The following table indicates the speech capabilities of the MSM665x family.
2. Longer speech patterns can be created by chaining and repeating existing speech samples.
3. Via external ROM only (no on-chip ROM available).
4. One-Time-Programmable (OTP) version of MSM6654. See the MSM66P54 data sheet for
more information.
5. One-Time-Programmable (OTP) version of MSM6656. See the MSM66P56 data sheet for
more information.
The MSM665x interface consists of the following signals:
• BUSY - Asserted LOW during MSM665x device playback. The MSM6679A-110 F50Bh and
F10100xxh commands select this signal for MSM665x command polling.
• NAR - Next Address Request status signal. By default, the MSM6679A-110 uses this signal to
poll commands to the MSM665x. The F51Bh, F480h, and F440h commands select NAR for
polling.
• SI - Serial Input Clock.
• SD - Serial Data Out.
• STROBE - Initiates speech synthesis.
• RESOUT - Initializes device when asserted LOW. The MSM6679A-110 F480h command
generates this signal.
Serial Interface
The MSM6679A-110 supplies a serial interface suitable for connection to an RS-232C serial port
buffer or equivalent. The serial interface uses one MSM6679A-110 input (RXD) and one
MSM6679A-110 output (TXD). The interface operates at 9600 Baud with:
• 8 data bits
• 1 start bit
• 1 stop bit
• No parity
• No handshake
A host processor sends serial ASCII commands to the MSM6679A-110 and receives serial ASCII
responses based on voice input responses.
This section describes the slave-mode Applications Protocol Interface (API) between a host MCU
and the MSM6679A-110. The slave-mode API offers the following features:
• Direct slave-mode control of voice recognition, sound recording and playback, and sound
synthesis
• Serial port interfaces
• Simple procedures for downloading and uploading data
• ASCII format
• Comprehensive return codes and error reporting
The host MCU selects the active speech recognition vocabulary, speech responses, and controls
all actions required to implement an interactive voice response system. The MSM6679A-110
performs speech recognition, based on the vocabulary selected by the host, and returns digital
codes representing the most probable match of the current utterance to an individual utterance
in the selected vocabulary. The MSM6679A-110 can also respond with “name tags.” Name tags
can be fixed words, phrases or sound effects, or can be words, phrases or sound effects that have
been interactively recorded by the user.
The API supports serial interface. The MSM6679A-110 returns each response using the same
interface through which the most recent message was received. The user can thus connect and
use both interfaces.
For all messages, the serial interface represents each 8-bit value with two hexadecimal digits
coded in ASCII. When downloading and uploading data, the MSM6679A-110 uses a stream of
8-bit binary values.
The serial-mode interface uses a 9600-baud UART with 1 start bit, 8 data bits, and 1 stop bit. There
is no parity or handshaking. Serial-interface messages are of variable length, but consist of an
even number of bytes. The serial interface echoes all received ASCII characters immediately back
to the host MCU.
Messages are of variable length. All messages consist of an even number of bytes. Opcodes
consist of exactly four bytes, with values between F000h and FEFEh. Operand bytes may take
values from 0000h to FFFFh. The MSM6679A-110 issues a return code for many of the host
commands. The return code generally consists of the same opcode, followed by data indicating
success or failure of the operation.
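The message format described above (each 8-bit value sent as two ASCII hexadecimal digits, with a four-character opcode followed by optional operands) can be sketched as a small host-side encoder. The function names are illustrative, and the use of uppercase hex digits is an assumption; this section does not state which case the device expects. Serial port access itself is left out.

```python
from typing import Sequence


def encode_message(opcode: int, operands: Sequence[int] = ()) -> bytes:
    """Encode an opcode and 16-bit operands as ASCII hexadecimal characters."""
    if not 0xF000 <= opcode <= 0xFEFE:
        raise ValueError("opcode must lie between F000h and FEFEh")
    for word in operands:
        if not 0x0000 <= word <= 0xFFFF:
            raise ValueError("operands must lie between 0000h and FFFFh")
    words = [opcode, *operands]
    # Four ASCII characters per 16-bit word, so the length is always even.
    return "".join(f"{word:04X}" for word in words).encode("ascii")


def decode_values(ascii_hex: bytes) -> bytes:
    """Turn received ASCII hex characters back into 8-bit values."""
    return bytes.fromhex(ascii_hex.decode("ascii"))


message = encode_message(0xF102, [0x8000])  # set SP/SI origin to 8000h
print(message)                               # b'F1028000'
print(decode_values(message).hex())          # 'f1028000'
```

Because the serial interface echoes every received character, a host would normally read back and discard the echo before parsing the return code.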
Opcodes are organized into the following categories:
• Purge
• Set parameter
• Initialize
• Recognize
• Speak
• Request
• Record
• SD recognition control
The following tables summarize available opcodes and provide detailed descriptions of the
opcode functions.
F2xx mod 80
F2xx mod 40
F2xx mod 20
F2xx mod 10
F2xx mod 8
F2xx mod 4
F2xx mod 2
F2xx mod 1
F300
F301 to F33F
F340
F341
F342
F343
F344
F351
F361
F371
F401 to F43D
F441 to F47C
F47E
F47F
F480
F481 - F4FF
F50B
F51B
FE03 to FEFE
F500
F501
F510
F520
F522
F513
Set SP/SI origin to xxxx.
Set SD origin.
Set triggering origin.
Set IRQ level to IRQ x.
Set SD SP table to table x.
Select triggering table.
Set ISA mode.
Initialize background estimation.
Wait for F3h command after each response.
Beep after each triggered utterance
Reserved
Set speech response level to default.
Send acknowledge after each speech output response.
Only detect triggers.
Initialize SD parameter table and name tags.
Stop listening (recognition).
Start SI recognition.
Start SD recognition.
Sort SD recognition distances, return index to utterance with
least distance.
Update SD enrollment.
Request recognition parameter upload to host.
Sort SD recognition distances, return index and distance to
utterance with least distance
Sort SD recognition distances, return all distances.
Sort SD recognition distances, return minimum and
maximum energy values.
Sort SD recognition distances, return all energy values and
distances.
Play back name tag from external memory.
Play back sound from internal memory.
Play 50-ms beep.
Pause for 0.2 sec.
Initialize MSM665x IC, set MSM665x busy mode OFF, select
FLASH SI recognition.
Play back one of 127 phrases in external MSM665x device.
Set MSM665x busy mode ON.
Set 6654 NAR mode
Set output volume (03h = minimum, FEh = maximum).
Status request.
Select last FLASH bank for SI recognition.
Select download RAM bank for speaker independent/signal
processing (SI/SP) template area.
Select buffer RAM bank for SI/SP.
Copy download RAM bank to buffer RAM bank
Save download RAM bank templates in first FLASH.
(8000 - F2FF)
Default (Hex)
—
8000
4A00
F100
0005
F123
0101, 0202...
Disabled.
Disabled.
Disabled.
Disabled.
Disabled.
Enabled.
Enabled.
Disabled.
Load from first
FLASH.
Save name tag pointers in last FLASH (5480-56FF→FD80-FFFF)
Set record volume high.
Set record volume normal (default).
Record name tag 01h - 3Dh.
Set SD pointer to segment xxh.
Search for SD utterance xxh.
Enroll SD utterance selected by search command (F9xx).
Erase utterance from SD vocabulary.
Clear SDR table (4A00 - 547B)
Default (Hex)
—
—
—
—
—
3136
—
3330
—
—
—
—
F509
0051
0000
01FF
—
—
—
—
—
F50F
F50F
—
—
—
—
—
—
Response Summary
Command
F101h 00 tm
F102h AdH AdL
Result after
Parameter Set
F103h AdH AdL
F104h AdH AdL
F11Xh
F12Xh
F280h
F240h
F220h
Initialization
Acknowledgment
F210h
F208h
F204h
F202h
F201h
Speech Ack F400hSpeech acknowledgment.
OperandsDescription
Record time = tm*14 msec.
High and low bytes of SP/SI origin address.
High and low bytes of SD origin address.
High and low bytes of triggering origin address.
IRQ Xh selected.
SP table Xh selected.
MSM6679A-110 ready.
Operation complete.
Operations complete; MSM6679A-110 disabled (vocabulary 0).
MSM6679A-110 waiting for start command.
MSM6679A-110 waiting for end trigger.
MSM6679A-110 processing recognition.
Download/upload in progress.
Download/upload complete.
Select/jump complete.
Speak output in progress.
Aborting SI listen mode.
Utt = utterance ID.
Utterance ID, high/low byte of distance to utterance 1...utterance N.
Utterance ID, high/low byte of min. and max. energy value,
Utterance ID, high/low byte of distance to utterance 1...utterance N,
high/low byte of minimum energy value, high/low byte of
maximum energy value.
F63Ah
F63Bh
F63Ch
F63Dh
F63Eh
F63Fh
F700h
F73Eh
F73Fh
F740h
Trigger detection code (see init command).
Rejection: utterance too loud.
Rejection: utterance too long.
Rejection: utterance begins too soon.
Rejection: bad signal/noise ratio.
Rejection: reason uncertain.
Aborting SD Listen mode. After SD utterance search: not found.
Rejection.
Sort completed. After SD utterance search: empty.
Rejection: MSM6679A-110 SD memory full/empty. After SD
1. Sample data overrun issued when real-time SP in Listen mode cannot keep up with
incoming samples, i.e., if the A/D signal input routine overwrites a sample data buffer
before it is fully processed.
2. This acknowledge is sent only if Init command 1111 0010 xxxx x1xx (F2 xxxx x1xx) is set
to enable acknowledgments.
3. These messages are sent in response to a request command (F5XYh) from the host.
4. Upload/download in progress, acknowledging load request immediately before data
transfer. If in response to an N-byte download request, the MSM6679A-110 then receives
N bytes (if N is even, or N+1 if N is odd) of data from the host. If N is odd and N+1 bytes
are received, only N bytes are written to MSM6679A-110 memory. If in response to an
upload, the MSM6679A-110 then sends N bytes (if N is even, or N+1 if N is odd) of data
to the host.
5. If an utterance was recognized, XYh is the utterance identity or class number, and
additional parameters may be appended, if requested in the SI Recog (F3XYh with X=0...3)
command. Otherwise, XYh indicates various results as detailed.
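The even-length rule in note 4 can be sketched from the host's point of view. The helper names are illustrative, and the value of the pad byte is an assumption; the note only says that the extra byte of an odd-length download is not written to memory.

```python
def transfer_length(n: int) -> int:
    """Bytes actually moved over the link for an N-byte upload or download."""
    if n < 0:
        raise ValueError("byte count cannot be negative")
    return n if n % 2 == 0 else n + 1


def pad_download(payload: bytes, pad: int = 0x00) -> bytes:
    """Append one pad byte to an odd-length payload before sending it."""
    if len(payload) % 2 == 0:
        return payload
    return payload + bytes([pad])


data = bytes([0x11, 0x22, 0x33])   # N = 3 (odd)
print(transfer_length(len(data)))  # 4 bytes are sent, 3 are stored
print(pad_download(data).hex())
```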
Purge MSM6679A-110 Input Stack. This command clears the
MSM6679A-110 input stack of commands that are waiting to
be executed. Commands already in progress, such as a
pending MSM6654 poll action, are not affected. It does not
affect the MSM6679A-110 output stack.
Return Values: None

Set SP/SI Recognition Origin. Prior to SD or SI recognition,
address pointers must be set to point at the SP or SI
recognition parameter tables. This command sets the starting
address of SP and SI recognition parameter tables.
This address is the location of the first word of a header that
contains pointers to one or more individual SP/SI tables.
XXYYh = high (XXh) and low (YYh) bytes of requested
address. The MSM6679A-110 uses and returns an even
address outside the MSM6679A-110 work space that is as
near as possible to the requested address.
Leave this parameter at its default value unless you are using
an Oki custom SI vocabulary and are instructed to alter SP/SI
recognition origin.
Default SP/SI origin: 8000h
Return Values: F102h XXYYh = high (XXh) and
low (YYh) bytes of resultant
address.
If a valid header is not found at
the resultant address, the
MSM6679A-110 immediately
sends response code:
F802h = Invalid SP/SI header.

Set SD Recognition Origin. This command sets the SD
origin address at the starting address of the current SD
recognition parameter table. This command may be used to
select among multiple RAM-resident SD vocabulary tables.
XXYYh = high (XXh) and low (YYh) bytes of requested
address. The MSM6679A-110 uses and returns an even
address outside the MSM6679A-110 work space that is as
near as possible to the requested address.
Leave this parameter at its default value unless you are using
an Oki custom vocabulary and are instructed to alter SD
recognition origin.
The table length is 0A7Ch bytes.
Default SD origin: 4A00h
Return Values: F103h XXYYh = high (XXh) and
low (YYh) bytes of resultant address.

Set Triggering Origin. This command sets the starting
address of triggering parameter tables.
This address is the location of the first word of a section of
data memory containing one or more contiguous triggering
parameter tables.
XXYYh = high (XXh) and low (YYh) bytes of requested
address. The MSM6679A-110 uses and returns an even
address outside the MSM6679A-110 work space that is as
near as possible to the requested address.
Leave this parameter at its default value unless you are using
an Oki custom SI vocabulary and are instructed to alter
triggering origin.
Default triggering origin: F100h
Return Values: F104h XXYYh = high (XXh) and
low (YYh) bytes of resultant
address.
Set IRQ Level. This command requests direction of host
interrupts to IRQ Y. The MSM6679A-110 then selects IRQ Z,
where Z is the nearest legal value to Y. Legal IRQ values are
any from the set {5 (default), A, B, C}.
Command: F11Yh
Default IRQ level: 5
Return Values: F11Zh = IRQ Z selected.

Set SD Recognition SP table. This command sets the SP
parameter table number to be used in processing speech
input during SD Recognition. The MSM6679A-110 selects SP
table number Z, where Z is the nearest valid value to Y. By
default, the MSM6679A-110 selects SP table 3 until this
command is issued. This command selects SP parameters
only, and does not select among multiple RAM-resident SD
vocabulary tables, which can be independently selected by the
Set SD Origin command (F103h).
After setting the table number and returning the resultant
value, the MSM6679A-110 checks the validity of the SP
header. If the header is invalid, an error message is returned.
Set this value to (NSI +1), where NSI is the number of SI
subvocabularies.
Command: F12Yh
Default SP table: 3
Return Values: F12Zh = SP table Z selected.
If the SP header is invalid, a
second message follows:
F802h = Invalid SP header.

Select Triggering Table. This command selects triggering
table TN for use with SP table VN. Valid values for VN and TN
are between 01h and 0Fh.
Leave this parameter at its default value unless you are using
an Oki custom SI vocabulary and are instructed to alter the
triggering table.
Command: F130h VN TN

Set ISA Mode. This command sets the port configuration for
the ISA bus.
Command: F440h
Return Values: None. Default is off.
After power-on, the MSM6679A-110 is in the mode that an F20C command would select.
This mode may not be optimal for most applications, so work out the conditions your application needs and
build a suitable F2 command value for it.
When changing an individual condition, take care not to set or reset other bits unintentionally: the
conditions selected are based on the XXh value of the last F2 command issued.
Background Noise Initialization. When set to 1, the MSM6679A-110 starts a 500-ms background noise
initialization. When set to 0, the MSM6679A-110 does not perform background noise
initialization.
Bit pattern: 1xxx xxxx. Default: Cleared.
Return values: F2XY = Initialization acknowledge. [1]
F501 = Background initialization complete.
The MSM6679A-110 requires this command prior to recognition
for noise vector subtraction during the utterance sampling period.
Use the background initialization command whenever there is a
change in the background noise level. For example, sample the
noise signature in a vehicle at rest and moving at 35 MPH with its
windows rolled down. The quality of a phone line connection can
also vary from call to call.
The host MCU must implement a strategy as to when to issue a
background initialization command. In a vehicle, the host MCU
could monitor the vehicle speed, fan speed, radio volume, etc.
Alternatively, the host MCU could issue this command each time a
new recognition session starts or a new line connection is
established. However, the 0.5-sec sample period could degrade
system responsiveness if used too frequently. A zero in this bit
location during the F2XXh command will not cause an
initialization. The F505h command causes the same initialization
sequence.
Wait for Recognition Command/Auto Restart SI Recognition.
When set to 1, the MSM6679A-110 waits for a recognition
command after each response. When set to 0, the MSM6679A-110
auto-restarts SI recognition after each response.
Bit pattern: x1xx xxxx. Default: Cleared.
Return value: F2XY = Initialization acknowledge. [1]
This bit should be set to 1 when an action is to be taken
immediately after an utterance. Auto-restart recognition is the
desired mode during digit string recognition, automated tape
testing of digits, or in demonstrations where continuous
recognition is desired.
Beep After Each Voice Trigger. When set to 1, the MSM6679A-110
beeps after each voice trigger. When set to 0, the MSM6679A-110
does not beep after each voice trigger. These beeps do not cause a
F400h message to be issued to the host MCU.
When set to 1, the MSM6679A-110 beep can help a user avoid
speaking before the MSM6679A-110 is ready. This mode is
normally used with a digits vocabulary to pace the user and
confirm each utterance reception.
Instead of using beeps, an external MSM665x speech synthesizer
can repeat digits as they are recognized. However, some users find
the number repetition annoying. Therefore, firmware could repeat
digits during initial usage and switch to beep mode later. Typically,
performance improves with time as users learn to speak with the
correct enunciation and volume. The MSM6679A-110 in this case
trains the user. Note that the host MCU can also make the
MSM6679A-110 beep with the F47Eh command.
Set Output Volume. When set to 1, VOICEOUT1 sound output level
is set to half of full volume (80h). When set to 0, voice output level
is unaffected.
MSM6679A-110 sound output volume can also be set at any level
on a continuous scale from 00h to FEh (low to high) with the
FEXXh command. The MSM665x speech synthesizer has four
discrete sound output volumes, corresponding to 0h - 20h, 21h - 40h, 41h - 80h, and 81h - FEh.
Send Response Code After Sound Output. When set to 1, the
MSM6679A-110 issues an acknowledge response (F400h) when
sound output is completed. When set to 0, the MSM6679A-110
does not issue an acknowledge response when speech response is
completed. Automatic beeps after voice triggers do not cause an
F400h command to be issued.
Trigger Detection Only. When set to 1, the MSM6679A-110 does
not sort SI vocabularies for the best match, instead returning
F63Ah code when an utterance has been detected. When set to 0,
normal recognition is performed.
When this bit is set to 1, the host MCU can use the F343h
command to upload the recognition parameter vector, so that the
host can perform independent processing.
Clear SD Recognition and Name Tag RAM. When set to 1, the
MSM6679A-110 initializes the SD parameter table. When set to 0,
existing SD parameters are preserved.
After this bit is set to 1, all SD training and name tag pointers are
erased. Use this command to start training for a new user. If the
old name tags are to be retained, the F50Ch command can recall
old name tags from FLASH.
To set up for a blank SD and name tag table at the next power-on,
issue the command sequence F201h F507h.
Return value for each of the five settings above (Beep After Each Voice Trigger
through Clear SD Recognition and Name Tag RAM):
F2XY = Initialization acknowledge. [1]
1. See the Response Summary table earlier in this section for a complete description of the
XY codes in initialization acknowledgment messages.
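Because every F2XXh command replaces all of the condition bits at once, a host typically keeps its own copy of the last XXh value and changes one bit at a time. The sketch below illustrates that bookkeeping. The constant and function names are hypothetical. Bits 7 and 6 follow the bit patterns given in the table (1xxx xxxx and x1xx xxxx); bit 0 is an assumption inferred from the F201h example in the Clear SD entry. The remaining bit positions are not given in this section and are left out.

```python
# Bit masks for the XXh byte of an F2XXh command.
BACKGROUND_INIT = 0x80    # from bit pattern 1xxx xxxx
WAIT_FOR_COMMAND = 0x40   # from bit pattern x1xx xxxx
CLEAR_SD_TABLE = 0x01     # ASSUMPTION: inferred from the F201h example

POWER_ON_FLAGS = 0x0C     # power-on mode corresponds to an F20C command


def with_flag(flags: int, mask: int, enabled: bool) -> int:
    """Return flags with one condition changed and all others preserved."""
    return (flags | mask) if enabled else (flags & ~mask & 0xFF)


def f2_command(flags: int) -> str:
    """Format the 16-bit F2XXh command word as four hex digits."""
    if not 0x00 <= flags <= 0xFF:
        raise ValueError("XXh must fit in one byte")
    return f"F2{flags:02X}"


if __name__ == "__main__":
    flags = with_flag(POWER_ON_FLAGS, WAIT_FOR_COMMAND, True)
    print(f2_command(flags))
```

Starting from the power-on value and modifying a copy keeps unrelated conditions from being cleared by accident.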
MSM6679A-110 to exit SI or SD Listen mode,
whichever was active.
Start SI Listen Mode. For all the following
opcodes, the MSM6679A-110 performs SI
recognition on incoming utterances, using SI
vocabulary Y. The vocabulary Y is one of
15 sets, thus Y = 1h ~ Fh.
F30Yh
F31Yh
F32Yh
F33Yh
Start SD Listen Mode. When an utterance is
captured, it is analyzed and converted to a
"recognition parameter vector." The host may
then command the MSM6679A-110 to use this
vector in various ways (e.g., Sort, Update, or
Recognition Vector Upload).
SD Recognition Sort. These commands sort
the distances between the recognition
parameter vector and the reference vectors for
the utterances in the current SD vocabulary.
F341h
F344h
F351
F361h
Return recognized phrase using
vocabulary number Y.
Return recognized phrase and
distance table for vocab Y.
Return recognized phrase and energy
value for vocab Y.
Return recognized phrase, distance
table, and energy value for vocab Y.
Return recognized phrase for vocab
Y. This command can be issued
several times to yield first, second,
third best, etc.
Return recognized phrase and
distance for the current vocabulary.
Return recognized phrase and
distance table for vocab Y.
Return recognized phrase and energy
value for vocab Y.
Update SD Recognition Enrollment. This
command updates enrollment on utterance
Utt, immediately after a "F7h Utt" response to
the Sort SD Distances command (F341h).
Alternatively, the utterance to be updated can
be selected by the SD Search command
(F9XYh).
This command uses the recognition parameter
vector from the most recently captured
utterance, and does not start SD Listen mode.
Generally, update should be performed only if
the correct utterance identity is confirmed by the
user.
Return recognized phrase, distance
table, and energy value for vocab Y.
Utterance ID, high and low byte of
distance to utterance 1...distance to
utterance N, high and low byte of
minimum and maximum energy value.
Recognition Vector Upload. Request
recognition parameter vector upload to host.
Success: F743h NH NL V1H V1L... VNH VNL, where
NH/NL = high/low bytes of N, N = Length of recognition
parameter vector V, V1H/V1L = high/low bytes of first
element of V, VNH/VNL = high/low bytes of Nth element.
Failure: F743h 00 00
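A host-side parser for the F743h response might look like the following. This is a minimal sketch that assumes the response has already been collected into a byte string and that N counts 16-bit elements (as the V1H/V1L ... VNH/VNL layout suggests). The function name is hypothetical, and the elements are treated as unsigned because the text does not say otherwise.

```python
from typing import List, Optional


def parse_vector_upload(msg: bytes) -> Optional[List[int]]:
    """Parse 'F7 43 NH NL V1H V1L ... VNH VNL'.

    Returns the list of 16-bit elements, or None for the failure
    response 'F7 43 00 00'.
    """
    if len(msg) < 4 or msg[0] != 0xF7 or msg[1] != 0x43:
        raise ValueError("not an F743h response")
    n = (msg[2] << 8) | msg[3]          # NH/NL: number of elements
    if n == 0:
        return None                     # failure: F743h 00 00
    body = msg[4:4 + 2 * n]
    if len(body) != 2 * n:
        raise ValueError("response is shorter than its length field")
    # Each element is sent high byte first.
    return [(body[i] << 8) | body[i + 1] for i in range(0, 2 * n, 2)]


if __name__ == "__main__":
    print(parse_vector_upload(bytes.fromhex("f743000201020304")))
```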
Speak

Opcode: F401h ~ F43Dh
Action: Speak Phrase from External Memory. This
command causes the MSM6679A-110 to play
back a name tag from external memory. If no
sound is defined for a selected index, the
MSM6679A-110 plays a beep. See the Record
commands for information on creating name
tags.
Return Value: F400h. If enabled, this value is returned upon
completion of playback.

Opcode: F441h ~ F450h
Action: Speak Phrase from Low Internal Memory. If no
sound is defined for a selected index, the
MSM6679A-110 plays a beep. The default
phrases supplied with the MSM6679A-110 in
the smaller low playback memory area are
listed below.
F441h = Drip.
F442h = Buzzer.
F443h = Dial tone.
F444h = Bonk.
Return Value: F400h. If enabled, this value is returned upon
completion of playback.
Speak Phrase from High Internal/External
Memory. If no sound is defined for a selected
index, the MSM6679A-110 plays a beep. The
default phrases supplied with the MSM6679A-110 in the larger upper playback memory area
are listed below.
F451h
F452h
F451h ~
F47Ch
F47D——
F47Eh
F47Fh
F480hNone.
F481h F4FFh
F50BhNone.
F453h
F454h
F455h
F456h
F457h
F458h
F459h
F45Ah
F45Bh
F45Ch
Reserved. This command is reserved for future
use.
Beep. This causes the MSM6679A-110 to beep
for 50 ms.
Pause. This command can be issued while the
MSM6679A-110 is performing sound output
and is then put in the MSM6679A-110
command stack for subsequent processing.
When this command is executed, sound output
pauses for 0.2 sec.
The pause command is useful for word
spacing.
Set MSM6654 Mode. This command causes
the MSM6679A-110 to initialize
the external MSM665x device, also clearing the
device from BUSY mode.
Playback Sound from MSM665x Device. This
command causes the MSM6679A-110 to issue
a speak command to the MSM665x slave
device.
The value is passed on the MSM665x device as
01h - 07Fh. The actual phrase is determined by
the vocabulary programmed into the MSM665x
device. Up to 127 external phrases are
supported.
If enabled, this value is returned upon
completion of playback.
If enabled, this value is returned upon
completion of playback.
If enabled, this value is returned upon
completion of playback.
If enabled, this value is returned upon
completion of playback.
If NAR is set, the F400h response is
sent when the MSM665x device is ready
for another command. If busy mode is
selected, the F400h response is
returned when the sound is finished.
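The F4xxh opcode ranges described in the Speak table can be summarized in code. This is a sketch for orientation only: the function name and the kind strings are hypothetical, and the mapping of F481h ~ F4FFh onto MSM665x phrase numbers 01h ~ 7Fh (low byte minus 80h) is an assumption based on the stated value range.

```python
from typing import Optional, Tuple


def classify_speak_opcode(opcode: int) -> Tuple[str, Optional[int]]:
    """Map an F4xxh opcode to (kind, parameter) per the Speak table ranges."""
    if opcode >> 8 != 0xF4:
        raise ValueError("not an F4xxh opcode")
    low = opcode & 0xFF
    if 0x01 <= low <= 0x3D:
        return ("name tag from external memory", low)
    if 0x41 <= low <= 0x50:
        return ("phrase from low internal memory", low)
    if 0x51 <= low <= 0x7C:
        return ("phrase from high internal/external memory", low)
    if low == 0x7D:
        return ("reserved", None)
    if low == 0x7E:
        return ("beep", None)
    if low == 0x7F:
        return ("pause", None)
    if low == 0x80:
        return ("set MSM6654 mode", None)
    if low >= 0x81:
        # ASSUMPTION: phrase number passed to the MSM665x = low byte - 80h.
        return ("MSM665x phrase", low - 0x80)
    return ("not listed in the Speak table", None)


if __name__ == "__main__":
    for op in (0xF401, 0xF442, 0xF47E, 0xF49F):
        print(f"{op:04X}h ->", classify_speak_opcode(op))
```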
Opcode: F51Bh
Action: Set 6654 NAR mode. This command, which is
the complement of the F50B command, sets up
the handshaking to the attached 6654 speech
synthesizer to use the NAR. This setup uses
the 6654's double buffer feature to eliminate
any gap between two consecutive phrases.
Return Value: None.

Opcode: FEXYh
Action: Set Output Level. This command sets the
speech output level to one of 255 values as
follows:
FE03 = Set minimum output level.
FE80h = Set output level half way (default).
FEFEh = Set maximum output level.
Return Value: None.
Request

Opcode: F500h
Action: Status Request. This command causes the
MSM6679A-110 to return a 2-byte value
indicating its current status.
Return Value:
F500h = MSM6679A-110 ready.
F520h = MSM6679A-110 disabled.
F540h = MSM6679A-110 waiting for start.
F560h = MSM6679A-110 waiting for end.
F580h = MSM6679A-110 processing.
F5A0h = Download/upload in progress.
F5C0h = Download/upload complete.
F5E0h = Select/jump complete.

Opcode: F501h
Action: Select last FLASH bank for SI recognition.

Opcode: F510h
Action: Select download RAM bank for SI/SP template
area. This command enables the download
RAM bank in the upper 32 K of data memory
for SI recognition.
Return Value: No return value.

Opcode: F520h
Action: Select buffer RAM bank for SI/SP. This
command enables the buffer RAM bank in the
upper 32 K of data memory for SI recognition.
Return Value: No return value.

Opcode: F522h
Action: Copy download RAM bank to buffer RAM bank.
This command copies the download RAM bank
to the buffer RAM bank. The copied address
range is (8000-FFFF).
Return Value: F501h = Copy is complete.

Opcode: F513h
Action: Save download RAM bank templates in first
FLASH. Save the download RAM SI/SP area
(8000-F2FF) to the same address range in the
first FLASH.
Return Value: F501h = Save is complete.

Action: Get download RAM bank templates from
first FLASH. Recall the download RAM SI/SP
template (8000 - FFFF) from the same address
range in the first FLASH.

Opcode: F515h
Action: Save download RAM bank templates in last
FLASH. Save the download RAM bank SI/SP
template area (8000 - F2FF) to the same
address range in the last FLASH.
Return Value: F501h = Save is complete.

Opcode: F516h
Action: Get download RAM bank templates from last
FLASH. Recall the download RAM bank SI/SP
template area (8000 - FFFF) from the same
address range in the last FLASH.
Return Value: F501h = Save is complete.
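The eight status values returned by the Status Request command (F500h) lend themselves to a lookup table on the host. The sketch below is illustrative; the dictionary and function names are hypothetical, and only the codes and their meanings come from the table.

```python
STATUS_CODES = {
    0xF500: "ready",
    0xF520: "disabled",
    0xF540: "waiting for start",
    0xF560: "waiting for end",
    0xF580: "processing",
    0xF5A0: "download/upload in progress",
    0xF5C0: "download/upload complete",
    0xF5E0: "select/jump complete",
}


def decode_status(word: int) -> str:
    """Translate the 2-byte status value into a readable description."""
    return STATUS_CODES.get(word, f"unrecognized status {word:04X}h")


if __name__ == "__main__":
    print(decode_status(0xF580))
```

Returning a descriptive string for unknown values, rather than raising, lets a host log unexpected responses without interrupting a session.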
Download/Upload.

Opcode: F502h
Full syntax: F5 02 00 Ctl AdH AdL NH NL [Dt1... DtN [Dt(N+1)]]
Action: This command requests data transfer to/from data
or external program memory. The control
parameter (Ctl) controls the direction of the
transfer (i.e., download vs. upload) and specifies
which of six 64-Kbyte memory segments (i.e., four
data segments and two external program
segments) is to be accessed. This command does
not work with internal program memory. It is not
possible to download to external program memory
while running in external program memory. The
address and length parameters (AdH AdL NH NL)
specify the starting address and length of the
transfer in bytes. Since the MSM6679A-110 can
only perform download/upload transfers within
one 32-Kbyte block in one Download/Upload
command, the address and length parameters
must not specify a transfer that violates a 32-Kbyte
address boundary. If this restriction is violated, the
download/upload request will be denied.
Ctl(7) = 0 for download, Ctl(7) = 1 for upload
Ctl(6) = 0 for data RAM, Ctl(6) = 1 for program RAM/ROM
If Ctl(6)=0 then Ctl(1-0) = Seg: Data segment selection
If Ctl(6)=1 and Ctl(1-0) = x0, then external program
segment 0 is used.
If Ctl(6)=1 and Ctl(1-0) = x1, then external program
segment 1 is used.
AdH AdL = high, low bytes of starting address.
NH NL = high, low bytes of N
N = Number of bytes to be downloaded or
uploaded (maximum 07FFCh)
Dt1... DtN = Download data. Note (here and in
upload response) that data are 8-bit binary
values, even if using the serial interface.
Dt(N+1). If N is odd, an extra byte is appended
to the data so that the total number of bytes in
the message remains even.
Return Value: Immediately after receiving parameter NL, the
MSM6679A-110 responds with a message to indicate
acceptance or denial of the transfer request. Acceptance
is indicated by F5A0h.
Denial is indicated by a F8XYh.
At the end of an accepted transfer, the MSM6679A-110
responds with a message to confirm or deny valid
completion of the transfer. Valid completion is indicated
by F5C0h.
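The Ctl encoding, the 07FFCh length limit, and the 32-Kbyte boundary rule can be checked on the host before a request is sent. The following sketch builds the eight header values of the F502h command and pads odd-length download data. The function names are hypothetical, the pad byte value (00h) is an assumption since the text does not specify it, and how the header is encoded on the wire depends on the interface in use.

```python
from typing import List

MAX_TRANSFER = 0x7FFC   # maximum N per command
BLOCK_SIZE = 0x8000     # transfers must stay inside one 32-Kbyte block


def transfer_header(*, upload: bool, program: bool, segment: int,
                    address: int, length: int) -> List[int]:
    """Return [F5, 02, 00, Ctl, AdH, AdL, NH, NL] for a transfer request."""
    if program and segment not in (0, 1):
        raise ValueError("external program segment must be 0 or 1")
    if not program and not 0 <= segment <= 3:
        raise ValueError("data segment must be 0..3")
    if not 0 <= address <= 0xFFFF:
        raise ValueError("address must fit in 16 bits")
    if not 0 <= length <= MAX_TRANSFER:
        raise ValueError("length exceeds 07FFCh")
    if length and (address + length - 1) // BLOCK_SIZE != address // BLOCK_SIZE:
        raise ValueError("transfer crosses a 32-Kbyte boundary")
    ctl = (0x80 if upload else 0) | (0x40 if program else 0) | segment
    return [0xF5, 0x02, 0x00, ctl,
            address >> 8, address & 0xFF, length >> 8, length & 0xFF]


def pad_download_data(data: bytes, pad: int = 0x00) -> bytes:
    """Append Dt(N+1) when N is odd. ASSUMPTION: the pad value is 00h."""
    return data + bytes([pad]) if len(data) % 2 else data


if __name__ == "__main__":
    hdr = transfer_header(upload=False, program=False, segment=0,
                          address=0x5000, length=0x0100)
    print(" ".join(f"{b:02X}" for b in hdr))
```

The example in the main block corresponds to a 256-byte download to data segment 0 at location 5000h.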
Select/Jump.

Opcode: F503h Ctl Seg
Action: This command selects a new data segment, or jumps to a new program segment.
Ctl(7)=0 is used to first select a new data segment. Ctl(7)=1 then jumps to that program segment.
Ctl(7)=0:
Seg(7)=0: Upper 32-Kbyte block of selected segment is accessed normally.
Seg(7)=1: Access lower 32-Kbyte block of selected segment in upper 32 Kbytes of
data space.
Seg(6~2): Reserved.
Seg(1~0): Data segment selection.
Ctl(7)=1:
Seg(7)=0: Jump to selected external program segment.
Seg(7)=1: Jump to internal program segment.
Seg(6~1): Reserved.
Seg(0): If Seg(7) = 1, not used.
If Seg(7) = 0 and Seg(0) = 0: external program segment 0.
If Seg(7) = 0 and Seg(0) = 1: external program segment 1.
Return Value: F8XYh = Failure, with XY(2) = 1.

Opcode: F504h
Action: Retrieve MSM6679A-110 Firmware Revision Number.
Return Value: XXXX = Four-digit ASCII number.
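The Ctl and Seg fields of the F503h command can be assembled as shown below. This is a sketch under stated assumptions: the helper names are hypothetical, all reserved bits and all Ctl bits other than Ctl(7) are assumed to be sent as zero, and the result is a list of byte values rather than a wire-ready message.

```python
from typing import List


def select_data_segment(segment: int, map_lower_block: bool = False) -> List[int]:
    """F503h with Ctl(7)=0: select a data segment (Seg(1~0)).

    map_lower_block sets Seg(7), which maps the lower 32-Kbyte block of the
    selected segment into the upper 32 Kbytes of data space.
    """
    if not 0 <= segment <= 3:
        raise ValueError("data segment must be 0..3")
    seg = (0x80 if map_lower_block else 0x00) | segment
    return [0xF5, 0x03, 0x00, seg]


def jump_external(segment: int) -> List[int]:
    """F503h with Ctl(7)=1, Seg(7)=0: jump to external program segment 0 or 1."""
    if segment not in (0, 1):
        raise ValueError("external program segment must be 0 or 1")
    return [0xF5, 0x03, 0x80, segment]


def jump_internal() -> List[int]:
    """F503h with Ctl(7)=1, Seg(7)=1: jump to the internal program segment."""
    return [0xF5, 0x03, 0x80, 0x80]


if __name__ == "__main__":
    print(" ".join(f"{b:02X}" for b in jump_external(0)))
```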
Opcode: F505h
Action: Initialize in Background. Background noise
initialization is performed for 500 ms.
The MSM6679A-110 requires this command
prior to recognition for noise vector subtraction
during the utterance sampling period. Use the
background initialization command whenever
there is a change in the background noise level.
For example, sample the noise signature in a
vehicle at rest and moving at 35 MPH with its
windows rolled down. The quality of a phone
line connection can also vary from call to call.
The host MCU must implement a strategy as to
when to issue a background initialization
command. In a vehicle, the host MCU could
monitor the vehicle speed, fan speed, radio
volume, etc. Alternatively, the host MCU could
issue this command each time a new
recognition session starts or a new line
connection is established.
However, the 0.5-sec sample period could
degrade system responsiveness if used too
frequently. A zero in the background noise
initialization bit of the F2XXh command will not
cause an initialization.
The F2XXh command can also be used to
perform background noise initialization.
Return Value: F501h = Initialization is complete.

Opcode: F506h
Action: Retrieve Vocabulary and Trigger Table Revision
Number.
Return Value: XXXX = Four-digit ASCII number.

Opcode: F507h
Action: Save SDR templates in last FLASH. Save the
download RAM bank SD template area.
Saves 2684 bytes from the address set by the
F103 command to the address range F300-FD7F in the last FLASH. The default is (4A00-547B→F300-FD7F).
Return Value: F501h = Save is complete.

Opcode: F508h
Action: Get SDR templates from last FLASH. Get the
download RAM bank SD template area.
Copies 2684 bytes to the address set by the
F103 command from the address range F300-FD7B in the last FLASH. The default is (F300-FD7B→4A00-547B).
Return Value: No return value.

F509h——
Opcode: F101h 00XXh
Action: Set Name Tag Length, Set MSM665x Busy
Mode ON. Name tag record length is set by
XXh, with XXh defining record length in 14-ms
intervals.
The maximum record length of FFh yields a
recording interval of 3.57 sec.
The default value is 1.2 sec.

Opcode: F105 xxxx
Action: Set Name Tag Record Origin. This command
sets the beginning address for recording name
tags.
XXXX = 128 byte blocks from 0000 to 02FF.
The reset default is 0000.
This is only effective before an F50A command
since new recordings start after the end of the
previous recording. The F50A command uses
this number to calculate the first address.
Return Value: F105 BAAA, where B is the bank number (0,1,2), and
AAA is the bank address/16 (800 - FF8).

Opcode: F106 xxxx
Action: Set Name Tag Record End. This command sets
the ending address for recording name tags.
XXXX = 128 byte blocks from 0000 to 02FF.
The reset default is 01FF.
Return Value: F106 BAAA, where B is the bank number (0,1,2), and
AAA is the bank address/16 (800 - FF8).

Opcode: F50Ah
Action: Clear Name Tag Table.
Return Value: F501h = Name tag table cleared.

Opcode: F50Ch
Action: Recall name tag pointers from first FLASH.
Copy the first FLASH name tag pointers (FD80-FFFF) to the working name tag pointer table.
The default is (FD80-FFFF→5480-56FF).
Return Value: F501h = Saved name tag table recalled.

Opcode: F51Ch
Action: Recall name tag pointers from last FLASH.
Copy the last FLASH name tag pointers (FD80-FFFF) to the working name tag pointer table.
The default is (FD80-FFFF→5480-56FF).
Return Value: F501h = Name tag pointers recalled.

Opcode: F50Dh
Action: Save name tag pointers in first FLASH. Save
the working name tag pointer table to the first
FLASH name tag pointers. The default is (5480-56FD→FD80-FFFD).
Return Value: F501h = Name tag table saved.

Opcode: F51Dh
Action: Save name tag pointers in last FLASH. Save the
working name tag pointer table to the last
FLASH name tag pointers. The default is (5480-56FD→FD80-FFFD).
Return Value: F501h = Name tag pointers saved.

Opcode: F50Eh
Action: Set Record Volume HIGH.
Return Value: ——

Opcode: F50Fh
Action: Set Record Volume to Normal. This is the
default setting.
Return Value: ——

Opcode: FA00h
Action: Reserved. This command is reserved for future
use.
Return Value: ——

Opcode: FA01h ~ FA3Dh
Action: Record Name Tag.

Action: Reserved. These commands are reserved for
future use.
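The 14-ms interval arithmetic used by the Set Name Tag Length command can be sketched as follows. The function names are hypothetical, and clamping out-of-range requests to 00h..FFh is an illustrative choice rather than documented device behavior.

```python
INTERVAL_MS = 14  # one unit of XXh in the F101h 00XXh command


def name_tag_length_command(seconds: float) -> str:
    """Return the 'F101 00XX' command text for a requested record length."""
    units = round(seconds * 1000 / INTERVAL_MS)
    units = max(0x00, min(0xFF, units))   # clamp to one byte (assumption)
    return f"F101 {units:04X}"


def record_seconds(units: int) -> float:
    """Convert an XXh value back to a recording interval in seconds."""
    if not 0x00 <= units <= 0xFF:
        raise ValueError("XXh must fit in one byte")
    return units * INTERVAL_MS / 1000


if __name__ == "__main__":
    print(name_tag_length_command(1.0))
    print(record_seconds(0xFF))
```

A request for 1 second yields XXh = 47h (71 intervals, about 0.99 sec), and FFh corresponds to the 3.57-sec maximum stated in the table.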
ActionReturn Value
——
SD Recognition Control
Opcode
Recognition performance is largely a function of how well the enrollment data represents subsequent tokens of the
enrolled utterances, and performance generally improves steadily with each additional enrollment pass. For most
applications, three initial enrollment passes are recommended. Subsequent reference updating can be performed
with the SD Recognize Update command (F342).
Opcode: F521h
Action: Clear SDR table. This command initializes a
blank SD template table. The 2684-byte area
from the address set by the F103 command
(the working SDR table) is set to zeros. The
SDR tables in the FLASH banks are not affected. The default is (4A00 - 547B).
Return Value: F501h = SDR table is cleared.

Opcode: F6XYh
Action: Set SD Segment Pointer. This command sets
the SD segment pointer to XY00h, i.e., sets the
starting address of the current SD recognition
parameter table to XY00h. Issuing this
command is equivalent to issuing the Set SD
Origin command, F103h XY00h. (For further
details of operation, please refer to the
description of that command.)
Return Value: No return value.

Opcode: F9XYh
Action: Search for SD Utterance XY. This is the first
step in adding an utterance to the vocabulary,
or in replacing an existing one. The SD
vocabulary memory is searched for utt. no.
XYh. If it is not found and if sufficient SD
memory exists, the MSM6679A prepares to
add utterance number XYh to the vocabulary.
Return Value:
F700h = Utterance number not found.
F740h = Utterance number found.
F73Fh = Memory full.

Opcode: FB00h
Action: Enroll SD Utterance. This command starts
MSM6679A SD Listen mode, then uses the
next captured utterance to start or update
training of the reference data for SD utterance
number XY specified in the most recent Search
command (F9XYh). The user must be
prompted to say the utterance prior to issuing
this command.
If the utterance was previously enrolled, a
training update is performed; if not, the
reference data is initialized. Each utterance in
the SD vocabulary must be enrolled at least
once before it can be recognized.

Opcode: FC00h
Action: Erase utterance from SD vocabulary. This
command erases the reference parameters for
utterance number XYh from the SD vocabulary,
where XYh is the utterance number retained
from the previous Search command (F9XYh).
Return Value: F740h = Operation complete.
All messages to the MSM6679A (except downloads and uploads) are echoed, but replies from the
MSM6679A to the host are not echoed by the host. This arrangement facilitates manual
communication with the MSM6679A using standard terminals. The following table illustrates
the range of MSM6679A functions.
CommentAction
Initialize MSM6679AHost initializes MSM6679A.
MSM6679A acknowledges.
Install new software
version.
Upload software for
verification of transfer.
Run new software.Host commands jump
Load trigger tables at
5000h.
Set new triggering origin. Host requests
Download new SD
vocabulary.
Host requests download
to program segment 40,
starting at location 0,
of 32 Kbytes (7FFCh).
MSM6679A accepts request.
Host sends 32 Kbytes.
(~34 sec at 9600 baud).
MSM6679A indicates download complete.
Host requests upload
from program segment 0,
starting at location 0,
of 32 Kbytes (7FFCh).
MSM6679A accepts request.
MSM6679A sends 32 Kbytes.
MSM6679A indicates upload complete.
to external program segment 0.
MSM6679A begins running new load.
Host requests download
to data segment 0,
starting at location 5000h,
of 256 bytes (0100h).
MSM6679A accepts request.
Host sends 256 bytes
(~0.25 sec at 9600 baud).
MSM6679A indicates download complete.
Set triggering origin to 5000h.
MSM6679A sets triggering origin
and sends confirming response.
Host requests download
to data segment 0,
starting at location 6000h,
of 4 Kbytes (1000h).
MSM6679A accepts request.
Host sends 4 Kbytes
(~4.3 sec at 9600 baud)
MSM6679A indicates download complete.
Action                                         Voice Input   Host Command   MSM6679A Response
                                                             F480           F480
Host sets recording length to 1 sec.                         F101 0047      F101 0047
MSM6679A signals operation complete.                                        F101 0047
Host clears name tag table.                                  F50A           F50A
MSM6679A signals operation complete.                                        F501
Host sets record gain to max. level.                         F50E           F50E
Start recording tag one.                       "Jane Doe"    FA01           FA01
MSM6679A signals name tag recording complete.                               FA00
Save name tags to FLASH.                                     F50D           F50D
Name tags saved.                                                            F501
Name tag playback.
Host sets volume to max. level.                              FEFF           FEFF
Host commands play back name tag 1.                          F401           F401, "Jane Doe"
MSM6679A signals playback OK.                                               F400
Sound playback.
Host sets output volume to mid point.                        FE80           FE80
Play MSM6679A internal sound 1.                              F442           F442, "bzzzz"
Play back sound from MSM6654.                                F49F           F49F, "Completed"
The information contained herein can change without notice owing to product and/or technical
improvements.
Please make sure before using the product that the information you are referring to is up-to-date.
The outline of action and examples of application circuits described herein have been chosen as
an explanation of the standard action and performance of the product. When you actually plan
to use the product, please ensure that the outside conditions are reflected in the actual circuit and
assembly designs.
OKI assumes no responsibility or liability whatsoever for any failure or unusual or unexpected
operation resulting from misuse, neglect, improper installation, repair, alteration or accident,
improper handling, or unusual physical or electrical stress including, but not limited to,
exposure to parameters outside the specified maximum ratings or operation outside the
specified operating range.
Neither indemnity against nor license of a third party’s industrial and intellectual property
right, etc. is granted by us in connection with the use of product and/or the information and
drawings contained herein. No responsibility is assumed by us for any infringement of a third
party’s right which may result from the use thereof.
When designing your product, please use our product below the specified maximum ratings and
within the specified operating ranges, including but not limited to operating voltage, power
dissipation, and operating temperature.
The products listed in this document are intended for use in general electronics equipment for
commercial applications (e.g., office automation, communication equipment, measurement
equipment, consumer electronics, etc.). These products are not authorized for use in any system
or application that requires special or enhanced quality and reliability characteristics nor in any
system or application where the failure of such system or application may result in the loss or
damage of property or death or injury to humans. Such applications include, but are not limited
to: traffic control, automotive, safety, aerospace, nuclear power control, and medical, including
life support and maintenance.
Certain parts in this document may need governmental approval before they can be exported to
certain countries. The purchaser assumes the responsibility of determining the legality of export
of these parts and will take appropriate and necessary steps, at their own expense, for export to
another country.
Copyright 1997 OKI SEMICONDUCTOR
OKI Semiconductor reserves the right to make changes in specifications at anytime and without
notice. This information furnished by OKI Semiconductor in this publication is believed to be
accurate and reliable. However, no responsibility is assumed by OKI Semiconductor for its use;
nor for any infringements of patents or other rights of third parties resulting from its use. No
license is granted under any patents or patent rights of OKI.