AN2551
Application note
Configuring the STR91xFA MCU for optimum
CPU performance
Introduction
The STR91xFA series of Flash MCUs is based on an ARM966E CPU core that executes
code directly from its internal Flash memory at up to 96 MHz. To accommodate a
variety of applications and power management schemes, firmware can choose among many
configuration settings when it initializes the STR91xFA at start-up. This
application note outlines the steps needed to configure the STR91xFA for
applications that must run at full speed, with the highest performance from the
CPU core and the highest data-movement bandwidth.
The related software is packed into a zip file associated with this application note and
available from http://www.st.com/.
May 2007 Rev 1
Contents
1 About STR91xFA
2 Configuring STR91xFA for best performance
2.1 System clock configuration
2.2 Wait states insertion
2.3 Buffered writes
2.4 Pre-Fetch Queue (PFQ), Branch Cache (BC) accelerator
2.5 Instruction-set mode
3 Implementation
4 Revision history
1 About STR91xFA
The STR91xFA family of MCUs combines a powerful 32-bit ARM966E RISC processor core,
dual-bank Flash memory of up to 544 Kbytes, and up to 96 Kbytes of SRAM for data or code
storage. It includes a rich peripheral set to form an ideal embedded controller for a wide
variety of applications.
This microcontroller has a custom memory accelerator, consisting of a Pre-Fetch Queue (PFQ)
and a Branch Cache (BC) coupled with the Instruction Tightly Coupled Memory (I-TCM) of the
CPU core, which accelerates the Flash memory system and reduces interrupt latency.
The job of the PFQ is to keep the ARM core continuously fed with instructions. It
performs asynchronous pre-fetch cycles to the Flash memory during idle bus cycles to keep
the queue full. When instruction addresses are sequential, the burst Flash
memory supplies the CPU core with 32-bit instructions at a continuous rate of 96 MHz.
However, when instruction addresses are non-sequential, such as when a branch is taken
during code execution, the PFQ must be flushed and refilled, which stalls the CPU
core. The role of the BC is to minimize these stalls. The BC
stores the first eight 32-bit instructions of each of the previous 15 branch destinations
taken by the firmware. Whenever one of these 15 branches is taken again, the BC
immediately supplies the first eight instructions to the PFQ, which significantly reduces the
time the CPU is stalled. While the CPU consumes these first eight instructions,
the PFQ has time to refill and is ready to supply instructions at 96 MHz again from
the ninth instruction onward.
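The hit and miss behavior of the BC described above can be illustrated with a small host-side model in C. This is only a sketch for reasoning about branch patterns, not a description of the actual hardware: the names bc_model_t, bc_init and bc_branch are hypothetical, the round-robin replacement of the oldest entry is an assumption (the replacement policy is not stated here), and the model counts hits and misses without modeling stall timing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define BC_ENTRIES 15u  /* branch destinations remembered (from the text) */
#define BC_WORDS    8u  /* 32-bit instructions held per destination (from the text) */

/* Hypothetical model: tracks only which destinations are cached. */
typedef struct {
    uint32_t target[BC_ENTRIES]; /* remembered branch destination addresses */
    bool     valid[BC_ENTRIES];  /* true once a slot has been filled */
    size_t   next;               /* ASSUMPTION: round-robin replacement slot */
    unsigned hits;               /* branches served from the model */
    unsigned misses;             /* branches that would stall for a PFQ refill */
} bc_model_t;

/* Clear all entries and counters. */
void bc_init(bc_model_t *bc)
{
    assert(bc != NULL);
    *bc = (bc_model_t){0};
}

/* Record a taken branch to address dest.
 * Returns true on a hit (the first BC_WORDS instructions would come from
 * the BC) and false on a miss (the destination is stored for next time). */
bool bc_branch(bc_model_t *bc, uint32_t dest)
{
    assert(bc != NULL);
    for (size_t i = 0; i < BC_ENTRIES; i++) {
        if (bc->valid[i] && bc->target[i] == dest) {
            bc->hits++;
            return true;
        }
    }
    /* Miss: overwrite the slot chosen by the assumed round-robin policy. */
    bc->misses++;
    bc->target[bc->next] = dest;
    bc->valid[bc->next] = true;
    bc->next = (bc->next + 1u) % BC_ENTRIES;
    return false;
}
```

In this model, a loop whose taken branches target no more than 15 distinct destinations misses only on the first pass and hits on every later pass. A working set of more than 15 destinations causes entries to be replaced before they are reused.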
To take full advantage of the memory accelerator, the system configuration registers
of the STR91xFA must be initialized correctly.
For more information about the STR91xFA, refer to http://www.st.com.