Any Source Code (software and/or firmware) is owned by Cypress Semiconductor Corporation (Cypress) and is protected by
and subject to worldwide patent protection (United States and foreign), United States copyright laws and international treaty
provisions. Cypress hereby grants to licensee a personal, non-exclusive, non-transferable license to copy, use, modify, create
derivative works of, and compile the Cypress Source Code and derivative works for the sole purpose of creating custom software and or firmware in support of licensee product to be used only in conjunction with a Cypress integrated circuit as specified in the applicable agreement. Any reproduction, modification, translation, compilation, or representation of this Source
Code except as specified above is prohibited without the express written permission of Cypress.
Disclaimer: CYPRESS MAKES NO WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, WITH REGARD TO THIS MATERIAL, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A
PARTICULAR PURPOSE. Cypress reserves the right to make changes without further notice to the materials described
herein. Cypress does not assume any liability arising out of the application or use of any product or circuit described herein.
Cypress does not authorize its products for use as critical components in life-support systems where a malfunction or failure
may reasonably be expected to result in significant injury to the user. The inclusion of Cypress’ product in a life-support systems application implies that the manufacturer assumes all risk of such use and in doing so indemnifies Cypress against all
charges.
Use may be limited by and subject to the applicable Cypress software license agreement.
All trademarks or registered trademarks referenced herein are property of the respective corporations.
13.1.4 Cy Control Center
14. GPIF™ II Designer
FX3 Programmers Manual, Doc. # 001-64707 Rev. *C
1. Introduction
Cypress EZ-USB® FX3™ is the next-generation USB 3.0 peripheral controller providing highly
integrated and flexible features that enable developers to add USB 3.0 functionality to any system.
Figure 1-1. EZ USB FX3 System Diagram
EZ-USB FX3 has a fully configurable, parallel, general programmable interface called GPIF™ II,
which can connect to any processor, ASIC, DSP, image sensor, or FPGA. It has an integrated PHY
and controller along with a 32-bit microcontroller (ARM926EJ-S) for powerful data processing and
for building custom applications. It has an interport DMA architecture that enables data transfers of
greater than 400 MBps.
The FX3 is a fully compliant USB 3.0 and USB 2.0 peripheral. An integrated USB 2.0 OTG controller
enables applications that need dual role usage scenarios. It has 512 KB of on-chip SRAM for code
and data. It supports serial peripherals such as UART, SPI, I2C, and I2S that enable communicating
to on-board peripherals; for example, the I2C interface is typically connected to an EEPROM.
GPIF II is an enhanced version of the GPIF in FX2LP™, Cypress’s flagship USB 2.0 product. It
provides easy and glueless connectivity to popular industry interfaces such as asynchronous and
synchronous Slave FIFO, asynchronous SRAM, asynchronous and synchronous Address Data
Multiplexed interface, parallel ATA, and so on. The GPIF II controller on the FX3 device supports a
total of 256 states. It can be used to implement multiple disjointed state machines.
The FX3 comes with the easy-to-use EZ-USB tools providing a complete solution for fast application
development. The software development kit includes application examples to accelerate time to
market.
The FX3 is fully compliant with the USB 3.0 V1.0 Specification and is also backward compatible with
USB 2.0. It is also compliant with the Battery Charging Specification V1.1 and the USB 2.0 OTG
Specification.
1.1 Chapter Overview
The following chapters describe each component of the Programmers Manual in greater detail.
Introduction to USB presents an overview of the USB standard.
FX3 Overview presents a hardware overview of the FX3 system.
FX3 Software provides an overview of the SDK that is provided with the FX3.
FX3 Firmware provides a brief description of each programmable firmware block. This
includes the system boot and initialization, USB, GPIF II, serial interfaces, DMA, power
management, and debug infrastructure.
FX3 APIs provides the description of the APIs for USB, GPIF II, serial interfaces, DMA,
RTOS, and debug.
FX3 Application Examples presents code examples, which illustrate the API usage and
the firmware framework.
FX3 Application Structure describes the FX3 application framework and usage model for
FX3 APIs.
FX3 Serial Peripheral Register Access describes the register-based access from
an application processor when the FX3 device is configured for PP mode slave operation.
FX3 Boot Image Format describes the FX3 image (img) format as required by
the FX3 boot-loader.
FX3 Development Tools describes the available options for the firmware development
environment, including JTAG-based debugging.
FX3 Host Software describes the Cypress generic USB 3.0 WDF driver, the
convenience APIs, and the USB control center.
GPIF™ II Designer provides a guide to the GPIF II Designer tool.
1.2 Document Revision History
Table 1-1. Revision History
Revision   Date         Origin   Description
**         05/10/2011   SHRS     New user guide
*A         07/14/2011   SHRS     FX3 Programmers Manual update for beta release.
*B         03/27/2012   SHRS     FX3 Programmers Manual update for FX3 SDK 1.1 release.
*C         08/10/2012   SHRS     FX3 Programmers Manual update for SDK 1.2 release.
2. Introduction to USB
The universal serial bus (USB) has gained wide acceptance as the connection method of choice for
PC peripherals. Equally successful in the Windows and Macintosh worlds, USB has delivered on its
promises of easy attachment, an end to configuration hassles, and true plug-and-play operation. The
USB is the most successful PC peripheral interconnect ever. In 2006 alone, over 2 billion USB
devices were shipped and there are over 6 billion USB products in the installed base today.
2.1 USB 2.0 System Basics
A USB system is an asynchronous serial communication 'host-centric' design, consisting of a single
host and a myriad of devices and downstream hubs connected in a tiered-star topology. The USB
2.0 Specification supports the low-speed, full-speed, and high-speed data rates. It employs a
half-duplex two-wire signaling featuring unidirectional data flow with negotiated directional bus
transitions.
2.1.1 Host, Devices, and Hubs
The USB system has one master: the host computer. Devices implement specific functions and
transfer data to and from the host (for example, mice, keyboards, and thumb drives). The host owns
the bus and is responsible for detecting a device as well as initiating and managing transfers
between various devices. Hubs are devices that have one upstream port and multiple downstream
ports and connect multiple devices to the host, creating a tiered topology. Associated with a host is
the host controller that manages the communication between the host and various devices. Every
host controller has a root hub associated with it. A maximum of 127 devices may be connected to a
host controller with not more than seven tiers (including root hubs). Because the host is always the
bus master, the USB direction OUT refers to the direction from the host to the device, and IN refers
to the device to host direction.
2.1.2 Signaling Rates
USB 2.0 supports the following signaling rates:
■ A low-speed rate of 1.5 Mbit/s is defined by USB 1.0.
■ A full-speed rate of 12 Mbit/s is the basic USB data rate defined by USB 1.1. All USB hubs
support full speed.
■ A high-speed (USB 2.0) rate of 480 Mbit/s was introduced in 2001. All high-speed devices are
capable of falling back to full-speed operation if necessary; they are backward compatible.
2.1.3 Layers of Communication Flow
A layered communication model view is adopted to describe the USB system because of its
complexity and generic nature. The components that make up the layers are presented here.
USB data transfer can occur between the host software and a logical entity on the device called an
endpoint through a logical channel called pipe. A USB device can have up to 32 active pipes, 16 for
data transfers to the host, and 16 from it. An interface is a collection of endpoints working together to
implement a specific function.
2.1.3.2 Descriptors
USB devices describe themselves to the host using a chain of information (bytes) known as
descriptors. Descriptors contain information such as the function the device implements, the
manufacturer of the device, number of endpoints, and class specific information. The first two bytes
of any descriptor specify the length and type of descriptor respectively.
All devices generally have the following descriptors.
■ Device descriptors
■ Configuration descriptors
■ Interface descriptors
■ Endpoint descriptors
■ String descriptors
A device descriptor specifies the Product ID (PID) and Vendor ID (VID) of the device as well as the
USB revision that the device complies with. Among other information listed are the number of
configurations and the maximum packet size for endpoint 0. The host system looks at the VID
and PID to load the appropriate device drivers. A USB device can have only one device descriptor
associated with it.
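As an illustration of the fields named above, the following is a minimal sketch of decoding a device descriptor on the host side. It assumes the standard 18-byte little-endian layout from the USB 2.0 specification; the function name, the tuple type, and the example bytes (including the VID and PID values) are hypothetical and not part of the FX3 SDK.

```python
import struct
from typing import NamedTuple

# Standard device descriptor: 18 bytes, multi-byte fields little-endian.
_DEVICE_DESC = struct.Struct("<BBHBBBBHHHBBBB")
DESC_TYPE_DEVICE = 0x01


class DeviceDescriptor(NamedTuple):
    bLength: int
    bDescriptorType: int
    bcdUSB: int              # USB revision the device complies with (BCD)
    bDeviceClass: int
    bDeviceSubClass: int
    bDeviceProtocol: int
    bMaxPacketSize0: int     # maximum packet size for endpoint 0
    idVendor: int            # VID
    idProduct: int           # PID
    bcdDevice: int
    iManufacturer: int
    iProduct: int
    iSerialNumber: int
    bNumConfigurations: int


def parse_device_descriptor(raw: bytes) -> DeviceDescriptor:
    """Decode a device descriptor and check its length and type bytes."""
    if len(raw) < _DEVICE_DESC.size:
        raise ValueError("device descriptor must be at least 18 bytes")
    desc = DeviceDescriptor(*_DEVICE_DESC.unpack_from(raw))
    if desc.bLength != _DEVICE_DESC.size or desc.bDescriptorType != DESC_TYPE_DEVICE:
        raise ValueError("not a device descriptor")
    return desc


# Hypothetical example bytes; the VID/PID values are illustrative only.
EXAMPLE_DEVICE = bytes([
    0x12, 0x01,        # bLength = 18, bDescriptorType = DEVICE
    0x00, 0x02,        # bcdUSB = 0x0200
    0x00, 0x00, 0x00,  # class, subclass, protocol
    0x40,              # bMaxPacketSize0 = 64
    0x34, 0x12,        # idVendor = 0x1234
    0x78, 0x56,        # idProduct = 0x5678
    0x00, 0x01,        # bcdDevice
    0x01, 0x02, 0x03,  # string descriptor indices
    0x01,              # bNumConfigurations
])
```

A host driver would typically use `idVendor` and `idProduct` from the result to select which driver to bind.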
The configuration descriptor contains information such as the device's remote wake up feature,
number of interfaces that can exist for the configuration, and the maximum power a particular
configuration uses. Only one configuration of a device can be active at any time.
Each function of the device has an interface descriptor associated with it. An interface descriptor
specifies the number of endpoints associated with that interface and other alternate settings.
Functions that fall under a predefined category are indicated using the interface class code and
subclass code fields. This enables the host to load standard device drivers associated with that function.
More than one interface can be active at any time.
The endpoint descriptor specifies the type of transfer, direction, polling interval, and maximum
packet size for each endpoint. Endpoint 0 is an exception; it does not have any descriptor and is
always configured to be a control endpoint.
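Because the first two bytes of every descriptor give its length and type, a host can walk a concatenated descriptor chain without knowing each layout in advance. The sketch below shows one way to do that and to decode an endpoint descriptor. The type codes and bit positions follow the USB 2.0 specification; the function names and the example configuration bytes are hypothetical.

```python
from typing import Dict, Iterator, Tuple

DESCRIPTOR_TYPES = {1: "DEVICE", 2: "CONFIGURATION", 3: "STRING",
                    4: "INTERFACE", 5: "ENDPOINT"}
TRANSFER_TYPES = ("control", "isochronous", "bulk", "interrupt")


def walk_descriptors(blob: bytes) -> Iterator[Tuple[int, bytes]]:
    """Yield (type, raw bytes) for each descriptor in a concatenated chain."""
    offset = 0
    while offset < len(blob):
        length = blob[offset]                      # byte 0: descriptor length
        if length < 2 or offset + length > len(blob):
            raise ValueError(f"malformed descriptor at offset {offset}")
        yield blob[offset + 1], blob[offset:offset + length]  # byte 1: type
        offset += length


def decode_endpoint(desc: bytes) -> Dict[str, object]:
    """Extract direction, transfer type, packet size, and polling interval."""
    if len(desc) < 7 or desc[1] != 5:
        raise ValueError("not an endpoint descriptor")
    address, attributes = desc[2], desc[3]
    return {
        "number": address & 0x0F,
        "direction": "IN" if address & 0x80 else "OUT",
        "transfer_type": TRANSFER_TYPES[attributes & 0x03],
        "max_packet_size": int.from_bytes(desc[4:6], "little") & 0x07FF,
        "interval": desc[6],
    }


# Hypothetical configuration: one vendor-class interface, two bulk endpoints.
EXAMPLE_CONFIG = bytes([
    0x09, 0x02, 0x20, 0x00, 0x01, 0x01, 0x00, 0x80, 0x32,  # configuration
    0x09, 0x04, 0x00, 0x00, 0x02, 0xFF, 0x00, 0x00, 0x00,  # interface 0
    0x07, 0x05, 0x81, 0x02, 0x00, 0x02, 0x00,              # EP1 IN, bulk, 512
    0x07, 0x05, 0x01, 0x02, 0x00, 0x02, 0x00,              # EP1 OUT, bulk, 512
])
```

Unknown descriptor types (for example, class-specific ones) are still skipped correctly because only the length byte is needed to advance.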
2.1.3.3 Transfer Types
USB defines four transfer types through its pipes. These match the requirements of different data
types that need to be delivered over the bus.
Bulk data is 'bursty,' traveling in packets of 8, 16, 32, or 64 bytes at full speed or 512 bytes at high
speed. Bulk data has guaranteed accuracy, due to an automatic retry mechanism for erroneous
data. The host schedules bulk packets when there is available bus time. Bulk transfers are typically
used for printer, scanner, modem data, and storage devices. Bulk data has built-in flow control
provided by handshake packets.
Interrupt data is similar to bulk data; it can have packet sizes of 1 through 64 bytes at full-speed or up
to 1024 bytes at high-speed. Interrupt endpoints have an associated polling interval that ensures
they are polled (receive an IN token) by the host on a regular basis.
Isochronous data is time-critical and used to stream data such as audio and video. An isochronous
packet may contain up to 1023 bytes at full-speed, or up to 1024 bytes at high-speed. Time of
delivery is the most important requirement for isochronous data. In every USB frame, a certain
amount of USB bandwidth is allocated to isochronous transfers. To lighten the overhead,
isochronous transfers have no handshake and no retries; error detection is limited to a 16-bit CRC.
Control transfers configure and send commands to a device. Because they are so important, they
employ the most extensive USB error checking. The host reserves a portion of each USB frame for
control transfers.
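The packet-size limits quoted above can be captured in a small check, for example when validating endpoint descriptors. This is a sketch only: the function and constant names are hypothetical, and control transfers are deliberately left out because the text above does not state their packet sizes.

```python
FULL_SPEED = "full"
HIGH_SPEED = "high"


def is_valid_max_packet_size(transfer_type: str, speed: str, size: int) -> bool:
    """Check a packet size against the USB 2.0 limits described in the text."""
    if speed not in (FULL_SPEED, HIGH_SPEED):
        raise ValueError(f"unsupported speed: {speed!r}")
    if transfer_type == "bulk":
        # 8, 16, 32, or 64 bytes at full speed; 512 bytes at high speed.
        return size in (8, 16, 32, 64) if speed == FULL_SPEED else size == 512
    if transfer_type == "interrupt":
        # 1 through 64 bytes at full speed; up to 1024 bytes at high speed.
        return 1 <= size <= (64 if speed == FULL_SPEED else 1024)
    if transfer_type == "isochronous":
        # Up to 1023 bytes at full speed; up to 1024 bytes at high speed.
        return 0 <= size <= (1023 if speed == FULL_SPEED else 1024)
    raise ValueError(f"unsupported transfer type: {transfer_type!r}")
```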
2.1.3.4 Protocol Layer
The function of the protocol layer is to understand the type of transfer, create the necessary packet
IDs and headers, packetize long data and generate CRCs, and pass them on to the link layer. Protocol
level decisions such as packet retry are also handled in this layer.
All communication over USB happens in the form of packets. Every USB packet contains a Packet
ID (PID). These PIDs fall into one of four categories, which are listed here.
PID Type    PID Name
Token       IN, OUT, SOF, SETUP
Data        DATA0, DATA1, DATA2, MDATA
Handshake   ACK, NAK, STALL, NYET
Special     PRE, ERR, SPLIT, PING
The PIDs DATA2, MDATA, NYET, ERR, SPLIT, and PING are additions introduced in the USB 2.0 specification.
Figure 2-1. USB Packets
A regular payload data transfer requires at least three packets: Token, Data, and Ack. Figure 2-1
illustrates a USB OUT transfer. Host traffic is shown in solid shading, while device traffic is shown
cross-hatched. Packet 1 is an OUT token, indicated by the OUT PID. The OUT token signifies that
data from the host is about to be transmitted over the bus. Packet 2 contains data, as indicated by
the DATA1 PID. Packet 3 is a handshake packet, sent by the device using the ACK (acknowledge)
PID to signify to the host that the device received the data error-free. Continuing with Figure 2-1, a
second transaction begins with another OUT token 4, followed by more data 5, this time using the
DATA0 PID. Finally, the device again indicates success by transmitting the ACK PID in a handshake
packet 6.
SETUP tokens are unique to CONTROL transfers. They preface eight bytes of data from which the
peripheral decodes host device requests. At full-speed, start of frame (SOF) tokens occur once per
millisecond. At high speed, each frame contains eight SOF tokens, each denoting a 125-µs
microframe.
Four handshake PIDs indicate the status of a USB transfer: ACK (Acknowledge) means 'success';
the data is received error-free. NAK (Negative Acknowledge) means 'busy, try again.' It is tempting to
assume that NAK means 'error,' but it does not; a USB device indicates an error by not responding.
STALL means that something is wrong (probably as a result of miscommunication or lack of
cooperation between the host and device software). A device sends the STALL handshake to
indicate that it does not understand a device request, that something went wrong on the peripheral
end, or that the host tried to access a resource that was not there. It is similar to HALT, but better,
because USB provides a way to recover from a stall. NYET (Not Yet) has the same meaning as ACK
- the data was received error-free - but also indicates that the endpoint is not yet ready to receive
another OUT transfer. NYET PIDs occur only in high-speed mode. A PRE (Preamble) PID precedes
a low-speed (1.5 Mbps) USB transmission.
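The handshake semantics above can be summarized as a decision function on the host side. The sketch below is illustrative only: the action strings are hypothetical labels rather than SDK or specification names, and `None` stands for the case in which the device does not respond at all.

```python
from typing import Optional


def host_action_after_out(handshake: Optional[str], high_speed: bool) -> str:
    """Map the device's response to an OUT data packet to a host action."""
    if handshake is None:
        return "retry"                    # no response is how an error is signaled
    if handshake == "ACK":
        return "send_next"                # data received error-free
    if handshake == "NAK":
        return "retry_later"              # device busy; this is not an error
    if handshake == "STALL":
        return "recover_from_stall"       # host software must clear the condition
    if handshake == "NYET":
        if not high_speed:
            raise ValueError("NYET occurs only in high-speed mode")
        return "send_next_when_ready"     # data accepted, endpoint not ready yet
    raise ValueError(f"unknown handshake PID: {handshake!r}")
```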
One notable feature of the USB 2.0 protocol is the data toggle mechanism. There are two DATA
PIDs (DATA0 and DATA1) in Figure 2-1. As mentioned previously, the ACK handshake is an
indication to the host that the peripheral received data without error (the CRC portion of the packet
is used to detect errors). However, the handshake packet can get garbled during transmission. To
detect this, each side (host and device) maintains a 'data toggle' bit, which is toggled between data
packet transfers. The state of this internal toggle bit is compared with the PID that arrives with the
data, either DATA0 or DATA1. When sending data, the host or device sends alternating DATA0 and
DATA1 PIDs. By comparing the received Data PID with the state of its own internal toggle bit, the
receiver can detect a corrupted handshake packet.
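The following sketch simulates the data toggle mechanism for the case in which an ACK is lost. It is a simplified model under stated assumptions: the class names are hypothetical, the bus is modeled as a direct method call, and a lost ACK is injected with the `ack_lost` flag.

```python
from typing import List


class ToggleReceiver:
    """Receiving side: accepts a packet only if its PID matches the toggle bit."""

    def __init__(self) -> None:
        self.expected = 0                # 0 means DATA0, 1 means DATA1
        self.accepted: List[bytes] = []

    def receive(self, pid: int, payload: bytes) -> str:
        if pid == self.expected:
            self.accepted.append(payload)
            self.expected ^= 1           # toggle after a good packet
        # A mismatched PID means the previous ACK was lost: discard the
        # duplicate, but ACK again so that the sender can move on.
        return "ACK"


class ToggleSender:
    """Sending side: toggles only after it actually sees the ACK."""

    def __init__(self) -> None:
        self.toggle = 0

    def send(self, receiver: ToggleReceiver, payload: bytes,
             ack_lost: bool = False) -> bool:
        handshake = receiver.receive(self.toggle, payload)
        if handshake == "ACK" and not ack_lost:
            self.toggle ^= 1
            return True
        return False                     # caller retries with the same PID
```

If the second of three packets is sent once with `ack_lost=True` and then retried, the receiver ends up with each payload exactly once, because the retried packet arrives with a PID that no longer matches the receiver's toggle bit.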
The PING protocol was introduced in the USB 2.0 specification to avoid wasting bus bandwidth
under certain circumstances. When operating at full speed, every OUT transfer sends the OUT data,
even when the device is busy and cannot accept the data. Such unsuccessful repetitive bulk data
transfers resulted in significant wastage of bus bandwidth. Because this could get worse at high
speed, the issue was remedied by using the new 'Ping' PID. The host first sends a short PING token
to an OUT endpoint, asking if there is room for OUT data in the peripheral device. Only when the
PING is answered by an ACK does the host send the OUT token and data.
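The PING flow described above can be sketched as a small simulation. The endpoint model, its buffer count, and the `max_pings` limit are assumptions made for illustration; a real host controller handles this in hardware.

```python
from typing import List, Tuple


class OutEndpoint:
    """Hypothetical high-speed bulk OUT endpoint with a fixed number of buffers."""

    def __init__(self, free_buffers: int) -> None:
        self.free_buffers = free_buffers
        self.received: List[bytes] = []

    def ping(self) -> str:
        # A PING costs only a short token; no payload crosses the bus.
        return "ACK" if self.free_buffers > 0 else "NAK"

    def out(self, data: bytes) -> str:
        if self.free_buffers == 0:
            return "NAK"
        self.received.append(data)
        self.free_buffers -= 1
        # NYET: this packet was accepted, but there is no room for another.
        return "ACK" if self.free_buffers > 0 else "NYET"


def send_out_with_ping(endpoint: OutEndpoint, data: bytes,
                       max_pings: int = 4) -> Tuple[str, int]:
    """Send the payload only after a PING is ACKed; return (handshake, pings)."""
    for pings_sent in range(1, max_pings + 1):
        if endpoint.ping() == "ACK":
            return endpoint.out(data), pings_sent
    return "NAK", max_pings              # payload was never put on the bus
```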
The protocols for the interrupt, bulk, isochronous, and control transfers are illustrated in the following
figures.
The link layer performs additional tasks to increase the reliability of the data transfer. This includes
byte ordering, line level framing, and so on.
The physical layer, more commonly known as the electrical interface of USB 2.0, consists of circuits to
serialize and de-serialize data, pre- and post-equalization circuits, and circuits to drive and detect
differential signals on the D+ and D– lines. All error handling is done at the protocol layer and there
is no discernible low level link layer to manage errors.
2.1.4 Device Detection and Enumeration
One of the most important advantages of USB over other contemporary communication systems is its
plug-and-play capability. A change in termination at the USB port indicates that a USB device is
connected.
When a USB device is first connected to a USB host, the host tries to learn about the device from its
descriptors; this process is called enumeration. The host goes through the following sign-on
sequence:
1. The host sends a Get Descriptor-Device request to address zero (all USB devices must respond
to address zero when first attached).
2. The device responds to the request by sending ID data back to the host to identify itself.
3. The host sends a Set Address request, which assigns a unique address to the just-attached
device so it may be distinguished from the other devices connected to the bus.
4. The host sends more Get Descriptor requests, asking for additional device information. From
this, it learns everything else about the device such as number of endpoints, power
requirements, required bus bandwidth, and what driver to load.
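The requests in steps 1 and 3 are carried as eight-byte SETUP data. The sketch below builds those bytes using the standard request codes from the USB 2.0 specification (GET_DESCRIPTOR is 6, SET_ADDRESS is 5); the helper function names are hypothetical.

```python
import struct

GET_DESCRIPTOR = 0x06
SET_ADDRESS = 0x05
DESC_TYPE_DEVICE = 0x01


def setup_packet(bm_request_type: int, b_request: int, w_value: int,
                 w_index: int, w_length: int) -> bytes:
    """Pack the five SETUP fields into eight little-endian bytes."""
    return struct.pack("<BBHHH", bm_request_type, b_request,
                       w_value, w_index, w_length)


def get_device_descriptor(length: int = 18) -> bytes:
    """Step 1: Get Descriptor (Device), sent to address zero."""
    # bmRequestType 0x80: device-to-host, standard request, device recipient.
    # wValue carries the descriptor type in its high byte.
    return setup_packet(0x80, GET_DESCRIPTOR, DESC_TYPE_DEVICE << 8, 0, length)


def set_address(address: int) -> bytes:
    """Step 3: Set Address, assigning a unique bus address to the device."""
    if not 1 <= address <= 127:
        raise ValueError("USB device addresses are 1 through 127")
    return setup_packet(0x00, SET_ADDRESS, address, 0, 0)
```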
All high-speed devices begin the enumeration process in full-speed mode; devices switch to
high-speed operation only after the host and device have agreed to operate at high speed. The
high-speed negotiation process occurs during USB reset, via the 'Chirp' protocol.
Because the FX2 configuration is 'soft', a single chip can take on the identities of multiple distinct
USB devices. When first plugged into USB, the FX2 enumerates automatically and downloads
firmware and USB descriptor tables over the USB cable. A soft disconnect is then triggered, after
which the FX2 enumerates again, this time as a device defined by the downloaded information. This
patented two-step process, called ReNumeration™, happens instantly when the device is plugged
in, with no hint that the initial download step had occurred.
2.1.5 Power Management
Power management refers to the part of the USB Specification that spells out how power is allocated
to the devices connected downstream and how different communication layers operate to make best
use of the available bus power under different circumstances.
USB 2.0 supports both self and bus powered devices. Devices indicate this through their descriptors.
Devices, irrespective of their power requirements and capabilities are configured in their low power
state unless the software instructs the host to configure the device in its high power state. Low power
devices can draw up to 100 mA of current and high power devices can draw a maximum of 500 mA.
The USB host can 'suspend' a device to put it into a power-down mode. A 3 ms 'J' state (Differential
'1' indicated by D+ high D– low) on the USB bus triggers the host to issue a suspend request and
enter into a low power state. USB devices are required to enter a low power state in response to this
request.
When necessary, the device or the host issues a Resume. A Resume signal is initiated by driving a
'K' state on the USB bus, requesting that the host or device be taken out of its low power 'suspended'
mode. A USB device can only signal a resume if it has reported (through its Configuration
Descriptor) that it is 'remote wakeup capable', and only if the host has enabled remote wakeup from
that device.
This suspend-resume mechanism minimizes power consumed when activity on the USB bus is
absent.
2.1.6 Device Classes
In an attempt to simplify the development of new devices, commonly used device functions were
identified and nominal drivers were developed to support these devices. The host uses the
information in the class code, subclass code, and protocol code of the device and interface
descriptors to identify if built-in drivers can be loaded to communicate with the device attached. The
human interface device (HID) class and mass storage class (MSC) are some of the commonly used
device classes.
The HID class refers to interactive devices such as mice, keyboards, and joysticks. This interface
uses control and interrupt transfer types to transfer data because data transfer speeds are not critical.
Data is sent or received using HID reports. Either the device or the interface descriptor contains the
HID class code.
The MSC class is primarily intended to transfer data to storage devices. This interface primarily uses
the bulk transfer type to transfer data. At least two bulk endpoints, one for each direction, are
necessary. The MSC class uses the SCSI transparent command set to read or write sectors of data
on the disk drive.
Details about other classes can be found at the USB Implementers Forum website http://www.usb.org.
2.2 USB 3.0: Differences and Enhancements over USB 2.0
2.2.1 USB 3.0 Motivation
USB 3.0 is the next stage of USB technology. Its primary goal is to provide the same ease of use,
flexibility, and hot-plug functionality but at a much higher data rate. Another major goal of USB 3.0 is
power management. This is important for "Sync and Go" applications that need to trade off features
for battery life.
The USB 3.0 interface consists of a physical SuperSpeed bus in addition to the physical USB 2.0
bus. The USB 3.0 standard defines a dual simplex signaling mechanism at a rate of 5 Gbits/s.
Inspired by the PCI Express and the OSI 7-layer architecture, the USB 3.0 protocol is also
abstracted into different layers as illustrated in the following sections.
In this document, USB 3.0 implicitly refers to the SuperSpeed portion of USB 3.0.
Figure 2-6. USB Protocol Layers
2.2.2 Protocol Layer
USB 3.0 SuperSpeed inherits the data transfer types from its predecessor, retaining the model of
pipes, endpoints, and packets. Nonetheless, the type of packets used and some protocols associated
with the bulk, control, isochronous, and interrupt transfers have undergone some changes and
enhancements. These are discussed in the sections to follow.
Link Management packets (LMPs) are sent between links to communicate link level issues such as link
configuration and status and hence travel predominantly between the link layers of the host and the
device. For example, the U2 Inactivity Timeout LMP is used to define the timeout from the U1 state to
the U2 state. The structure of an LMP is shown here.
Figure 2-7. Link Management Packet Structure
Transaction packets reproduce the functionality provided by the Token and Handshake packets and
travel between the host and endpoints in the device. They do not carry any data but form the core of
the protocol.
For example, the ACK packet is used to acknowledge a packet received. The structure of a
transaction packet is shown in Figure 2-8.
Data packets actually carry data. These are made up of two parts: a data header and the actual data.
The structure of a data packet is shown in Figure 2-9.
Isochronous Time Stamp packets contain timestamps and are broadcast by the host to all active
devices. Devices use timestamps to synchronize with the host. These do not have any routing
information. The structure of an ITP is shown in Figure 2-10.
Figure 2-9. Example Data Packet
Figure 2-10. ITP Structure
OUT transfers are initiated by the host by sending a data packet on the downstream bus. The data
packet contains the device routing address and the endpoint number. If the transaction is not an
isochronous transaction, then, on receiving the data packet, the device launches an
acknowledgement packet, which also contains the next packet number in the sequence. This
process continues until all the packets are transmitted unless an endpoint responds with an error
during the transaction.
IN transfers are initiated by the host by sending an acknowledge packet to the device containing
the device address, the endpoint address, and the number of packets that the host expects.
The device then starts sending the data packets to the host. The response from the host
acknowledges the previous transfer while initiating the next transfer from the device.
One important modification in the USB 3.0 specification is unicasting in place of broadcasting.
Packets in USB 2.0 were broadcast to all devices. This required every connected device to
decode the packet address to check if the packet was targeted at it. Devices had to wake up to any
USB activity regardless of whether they took part in the transfer. This resulted in higher idle power.
USB 3.0 packets (except ITP) are unicast to the target. Necessary routing information for hubs is
built into the packet.
Another significant modification introduced in USB 3.0 relates to interrupt transfers. In USB 2.0,
interrupt transfers were issued by the host every service interval regardless of whether or not the
device was ready for transfers. However, SuperSpeed interrupt endpoints may send an ERDY or
NRDY in return for an interrupt transfer or request from the host. If the device returns an ERDY, the
host continues to interrupt the device endpoint every service interval. If the device returns NRDY,
the host stops interrupt requests or transfers to the endpoint until the device asynchronously (not
initiated by the host) notifies ERDY.
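To make the effect of ERDY/NRDY concrete, the sketch below counts how many interrupt requests a host issues over a series of service intervals. It is a simplified model, not the specification's state machine: device readiness is supplied as one boolean per interval, and the asynchronous ERDY is assumed to arrive in the first interval in which the device becomes ready again.

```python
from typing import Sequence


def superspeed_interrupt_requests(ready_per_interval: Sequence[bool]) -> int:
    """Count host requests when polling stops on NRDY and resumes on ERDY."""
    polling = True
    requests = 0
    for ready in ready_per_interval:
        if not polling:
            if not ready:
                continue             # host stays quiet; the link can remain idle
            polling = True           # device sent an asynchronous ERDY
        requests += 1                # host issues the interrupt request
        if not ready:
            polling = False          # device answered NRDY
    return requests


def usb2_interrupt_requests(ready_per_interval: Sequence[bool]) -> int:
    """USB 2.0 behavior: the host polls every service interval regardless."""
    return len(ready_per_interval)
```

For a device that is ready in only the first and last two of six intervals, this model issues four requests instead of six.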
One of the biggest advantages the dual simplex bus architecture provides the USB 3.0 protocol with
is the ability to launch multiple packets in one direction without waiting for an acknowledge packet
from the other side, which on a half duplex bus would otherwise cause bus contention. This ability is
exploited to form a new protocol that dictates that packets be sent with a packet number, so that any
missing or unfavorable acknowledgements that come after a long latency can be used to trigger the
retransmission of the missed packet identified by the packet number. The number of burst packets
that can be sent (without waiting for acknowledge) is communicated before the transfer.
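The burst idea can be sketched as follows. This is a simplified model under explicit assumptions: the receiver discards packets that arrive after a gap, the acknowledgement carries the next expected packet number, and losses are injected through the hypothetical `drop_once` parameter. The USB 3.0 specification's actual sequence number width and retry rules are not modeled.

```python
from typing import Iterable, List, Sequence, Tuple


def burst_transfer(payloads: Sequence[bytes], burst_size: int,
                   drop_once: Iterable[int] = ()) -> Tuple[List[bytes], int]:
    """Send payloads in bursts; return (received data, packets transmitted)."""
    if burst_size < 1:
        raise ValueError("burst_size must be at least 1")
    pending_drops = set(drop_once)       # packet numbers lost on first attempt
    received: List[bytes] = []
    transmissions = 0
    next_seq = 0                         # next packet number the receiver expects
    while next_seq < len(payloads):
        # Launch up to burst_size packets back to back, without waiting
        # for an acknowledgement after each one.
        for seq in range(next_seq, min(next_seq + burst_size, len(payloads))):
            transmissions += 1
            if seq in pending_drops:
                pending_drops.discard(seq)   # lost on the wire this time
                continue
            if seq == len(received):         # in order: accept
                received.append(payloads[seq])
            # Otherwise the packet follows a gap and is discarded.
        # The acknowledgement names the next expected packet number, which
        # tells the sender where to resume.
        next_seq = len(received)
    return received, transmissions
```

With six payloads, a burst size of four, and packet 2 lost once, the first burst delivers packets 0 and 1, and the second burst resends from packet 2, for eight transmissions in total.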
Another notable feature of USB 3.0 is the stream protocol available for bulk transfers. Normal bulk
(OUT) transfers transfer a single stream of data to an endpoint in the device. Typically, each stream
of data is sourced from a buffer (FIFO) in the transmitter to another buffer (FIFO) in the receiver. The
stream protocol allows the transmitter to associate a stream ID (1-65536) with the current stream
transfer or request. The receiver of the stream or request sources or sinks the data to or from the
appropriate FIFO. This multiplexing of the streams mimics a pipe that can dynamically
shift its ends. Streams make it possible to realize an out-of-order execution model required for
command queuing. The concept of streams enables more powerful mass storage protocols. A typical
communication link consists of a command OUT pipe, an IN and OUT pipe (with multiple data
streams), and a status pipe. The host can queue commands, that is, issue a new command without
waiting for completion of a prior one, tagging each command with a Stream ID.
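On the receiving side, stream multiplexing amounts to routing each packet to the buffer selected by its stream ID. A minimal sketch, in which the function name and the packet representation (a pair of stream ID and payload) are assumptions:

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def demultiplex_streams(packets: Iterable[Tuple[int, bytes]]) -> Dict[int, bytes]:
    """Sink interleaved (stream ID, payload) packets into per-stream buffers."""
    buffers: Dict[int, bytearray] = defaultdict(bytearray)
    for stream_id, payload in packets:
        buffers[stream_id] += payload    # one FIFO per stream ID
    return {stream_id: bytes(buf) for stream_id, buf in buffers.items()}
```

Because each payload lands in the buffer for its own stream, data for several queued commands can arrive interleaved on the same endpoint and still be reassembled per command.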
Because of the manner in which the USB 3.0 power management is defined, nonactive links (hubs,
devices) may take longer to get activated on seeing bus activity. Isochronous transfers that activate
the links take longer to reach the destination and may violate the service interval requirement. The
Isochronous-PING protocol circumvents this issue. The host sends a PING transfer before an
isochronous transaction. A PING RESPONSE indicates that all links in the path are active (or have
been activated). The host can then send or request an isochronous data packet. USB 2.0
isochronous devices cannot enter a low-power bus state in between service intervals.
2.2.3 Link Layer
The link layer maintains link connectivity and ensures data integrity between link partners by
implementing error detection. The link layer ensures reliable data delivery by framing packet headers
at the transmitting end and detecting link level errors at the receiving end. The link layer also
implements protocols for flow control and participates in power management. The link layer provides
an interface to the protocol layer for pass through of messages between the protocol layers. Link
partners communicate using link commands.
2.2.4 Physical Layer
The two pairs of differential lines, one for OUT transfers and another for IN transfers, define the
physical connection between a USB 3.0 SuperSpeed host and the device. The physical layer
accepts one byte at a time, scrambles the bits (a procedure that is known to reduce EMI emissions),
converts it to 10 bits, serializes the bits, and transmits data differentially over a pair of wires. The
clock data recovery circuit helps to recover data at the receiving end. The LFPS (Low Frequency
Periodic Signaling) block is used for initialization and power management when the bus is IDLE.
Detection of SuperSpeed devices is done by looking at the line terminations, similar to USB 2.0
devices.
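Because each byte is converted to 10 bits before serialization, only eight of every ten line bits carry data. The sketch below works out the resulting raw byte rate for the 5 Gbits/s SuperSpeed signaling rate mentioned earlier. It ignores packet headers, link commands, and all other protocol overhead, so real throughput is lower; the function name is hypothetical.

```python
def raw_byte_rate(line_rate_bits_per_s: int, data_bits: int = 8,
                  symbol_bits: int = 10) -> int:
    """Bytes per second left after byte-to-symbol encoding, per direction."""
    data_bits_per_s = line_rate_bits_per_s * data_bits // symbol_bits
    return data_bits_per_s // 8


# 5 Gbits/s line rate -> 4 Gbits/s of data -> 500,000,000 bytes/s each way.
SUPERSPEED_RAW_BYTES_PER_S = raw_byte_rate(5_000_000_000)
```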
USB 3.0 provides enhanced power management capabilities to address the needs of
battery-powered portable applications. Two "Idle" modes (denoted as U1 and U2) are defined in
addition to the "Suspend" mode (denoted as U3) of the USB 2.0 standard.
The U2 state provides higher power savings than U1 by allowing more analog circuitry (such as
clock generation circuits) to be quiesced. This results in a longer transition time from U2 to active
state. The Suspend state (U3) consumes the least power and again requires a longer time to wake
up the system.
The Idle modes may be entered due to inactivity on a downstream port for a programmable period of
time or may be initiated by the device, based on scheduling information received from the host. Such
information is indicated by the host to the device using the flags "Packet pending," "End of burst,"
and "Last packet." Based on these flags, the device may decide to enter an Idle mode without having
to wait for inactivity on the bus. When a link is in one of these Idle states, communication may take
place via low-frequency periodic signaling (LFPS), which consumes significantly lower power than
SuperSpeed signaling. In fact, the Idle mode can be exited with an LFPS transmission from either
the host or device.
The USB 3.0 standard also introduces the "Function Suspend" feature, which enables the power
management of the individual functions of a composite device. This provides the flexibility of
suspending certain functions of a composite device, while other functions remain active.
Additional power saving is achieved via a latency tolerance messaging (LTM) mechanism implemented by USB 3.0. A device may inform the host of the maximum delay it can tolerate from the time
it reports an ERDY status to the time it receives a response. The host may factor in this latency tolerance to manage system power.
Thus, power efficiency is embedded into all levels of a USB 3.0 system, including the link layer, protocol layer, and PHY. A USB 3.0 system requires more power while active. However, because of its higher data
rate and various power-efficiency features, it remains active for shorter periods. A SuperSpeed data
transfer could cost up to 50 percent less power than a Hi-Speed transfer. This is crucial to the battery
life of mobile handset devices such as cellular phones.
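The trade-off between higher active power and shorter active time can be sketched as a simple energy calculation. This is only an illustration: the function name and all input values are hypothetical, and no power or throughput figures are taken from the USB specifications.

```c
#include <assert.h>

/* Energy in millijoules = active power (mW) x active time (s).
 * The active time is the payload size divided by the effective throughput.
 * All inputs are supplied by the caller; none come from the USB specification. */
double transfer_energy_mj(double active_power_mw,
                          double payload_bytes,
                          double throughput_bytes_per_s)
{
    assert(throughput_bytes_per_s > 0.0);
    double active_seconds = payload_bytes / throughput_bytes_per_s;
    return active_power_mw * active_seconds;
}
```

With placeholder inputs, a link that draws twice the power but moves the same payload four times as fast spends half the energy on the transfer.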
2.3 Reference Documents
Some of this chapter’s contents have been sourced from the following documents:
■ Universal Serial Bus 3.0 Specification, Revision 1.0
■ Universal Serial Bus Specification, Revision 2.0
■ On-The-Go Supplement to the USB 2.0 Specification, Revision 1.3
FX3 is a full-featured, general-purpose integrated USB 3.0 SuperSpeed controller with a built-in flexible
interface (GPIF II), which is designed to interface to any processor, enabling customers to add
USB 3.0 to any system.
The logic block diagram shows the main functional blocks of FX3. The integrated USB 3.0 PHY and
controller, along with a 32-bit processor, make FX3 powerful for data processing and building custom
applications. An integrated USB 2.0 OTG controller enables applications that need dual-role usage
scenarios. A fully configurable, parallel General Programmable Interface (GPIF II) provides
connection to any processor, ASIC, DSP, or FPGA. There is 512 KB of on-chip SRAM for code and
data. There are also low performance peripherals such as UART, SPI, I2C, and I2S to communicate
with onboard peripherals such as EEPROM. The CPU manages the data transfer between the USB,
GPIF II, I2S, SPI, and UART interfaces through firmware and internal DMA interfaces.
3.1 CPU
FX3 is powered by the ARM926EJ-S, a 32-bit advanced processor core licensed from ARM that is
capable of executing 220 MIPS [Wikipedia] at 200 MHz, the compute power needed to perform MP3
encoding, decryption, and header processing at USB 3.0 rates for the USB Video Class.
The Harvard architecture based processor accesses instruction and data memory over separate
dedicated 32-bit industry standard AHB buses. Separate instruction and data caches are built into
the core to facilitate low latency access to frequently used areas of code and data memory. In
addition, the two tightly coupled memories (TCM) (one each for data and instruction) associated with
the core provide a guaranteed low latency memory (without cache hit or miss uncertainties).
The ARM926EJ-S CPU contains a full Memory Management Unit (MMU) with virtual to physical address
translation. FX3 contains an 8 KB instruction cache and an 8 KB data cache. The ARM926EJ-S caches are
4-way set associative with 32-byte cache lines, so each 8 KB cache holds 256 lines arranged as 64 sets of
four lines (one line per way).
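As a rough check of the cache arithmetic above, the following sketch derives the line count, set count, and address bit split from the cache size, associativity, and line width. The type and function names are illustrative only and are not part of any FX3 or ARM API.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t lines;        /* total cache lines */
    uint32_t sets;         /* lines divided by the number of ways */
    uint32_t offset_bits;  /* address bits selecting a byte within a line */
    uint32_t index_bits;   /* address bits selecting a set */
    uint32_t tag_bits;     /* remaining bits of a 32-bit address */
} cache_geometry_t;

/* Integer log2 for power-of-two inputs. */
static uint32_t log2_u32(uint32_t v)
{
    uint32_t n = 0;
    while (v > 1) {
        v >>= 1;
        n++;
    }
    return n;
}

cache_geometry_t cache_geometry(uint32_t size_bytes, uint32_t ways,
                                uint32_t line_bytes)
{
    assert(ways > 0 && line_bytes > 0);
    assert(size_bytes % (ways * line_bytes) == 0);

    cache_geometry_t g;
    g.lines = size_bytes / line_bytes;
    g.sets = g.lines / ways;
    g.offset_bits = log2_u32(line_bytes);
    g.index_bits = log2_u32(g.sets);
    g.tag_bits = 32u - g.index_bits - g.offset_bits;
    return g;
}
```

Calling `cache_geometry(8192, 4, 32)` yields 256 lines in 64 sets, with 5 offset bits, 6 index bits, and 21 tag bits.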
Interrupts vectored into one of the FIQ or IRQ request lines provide a method to generate interrupt
exceptions to the core.
Built-in logic provides integrated on-chip JTAG-based debug support for the processor core.
Figure 3-1. Key CPU Features
3.2 Interconnect Fabric
The Advanced Microcontroller Bus Architecture - Advanced High Performance Bus (AMBA AHB)
interconnect forms the central nervous system of FX3. This fabric allows easy integration of
processors, on-chip memories, and other peripherals using low power macro cell functions while
providing a high-bandwidth communication link between the elements that are involved in the majority of
transfers. This multi-master, high-bandwidth interconnect has the following components:
■ AHB bus master(s) that can initiate read and write operations by providing an address and
control information. At any given instant, a bus can have at most one owner. When multiple
masters demand bus ownership, the AHB arbiter block decides the winner.
■ AHB bus slave(s) that respond to read or write operations within a given address-space range.
The bus slave signals back to the active master the success, failure, or waiting of the data
transfer. An AHB decoder is used to decode the address of each transfer and provide a select
signal for the slave that is involved in the transfer.
■ AHB bridges in the system to translate traffic of different frequency, bus width, and burst size.
These blocks are essential in linking the buses
■ AHB Slave/Master interfaces: These macro cells connect peripherals, memories, and other
To allow implementation of an AHB system without the use of tri-state drivers and to facilitate
concurrent read/write operations, separate read and write data buses are required. The minimum
data bus width is specified as 32 bits, but the bus width can be increased for realizing higher
bandwidths.
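The role of the AHB decoder described above can be sketched as an address-range lookup that returns which slave to select. This is a minimal software model, not the hardware implementation; the window table, type, and function names are hypothetical, and no real FX3 address map is implied.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint32_t base;  /* first address owned by the slave */
    uint32_t size;  /* size of the slave's address window in bytes */
} ahb_slave_window_t;

/* Returns the index of the slave whose window contains addr,
 * or -1 if no slave claims the address. */
int ahb_decode(const ahb_slave_window_t *map, size_t count, uint32_t addr)
{
    assert(map != NULL || count == 0);
    for (size_t i = 0; i < count; i++) {
        /* Unsigned subtraction wraps when addr < base, so a single
         * comparison checks both the lower and the upper bound. */
        if ((uint32_t)(addr - map[i].base) < map[i].size) {
            return (int)i;
        }
    }
    return -1;
}
```

In hardware the same decision is made combinationally and drives one select signal per slave; the loop here only models the outcome.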
3.3 Memory
In addition to the ARM core's tightly coupled instruction and data memories, a 512 KB general
purpose internal system memory is available in FX3. The system SRAM is implemented using 64- or
128-bit wide SRAM banks, which run at full CPU clock frequency. Each bank may be built up from
narrower SRAM instances for implementation-specific reasons. A Cypress-proprietary high-performance
memory controller translates a stream of AHB read and write requests into
accesses to the SRAM memory array. This controller also manages power and clock gating for the
memory array. The memory controller is capable of achieving 100% utilization of the SRAM array
(meaning one read or one write at full width each cycle). CPU accesses to SRAM are done 64 or 128 bits
at a time and then multiplexed/demultiplexed as two or four 32-bit accesses on the CPU AHB bus. The
controller does not support concurrent accesses to multiple different banks in memory.
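The multiplexing of a wide SRAM word onto the 32-bit CPU bus can be illustrated with a small model that picks the 32-bit lane addressed by the CPU. The lane ordering (lowest address in lane 0) and the function name are assumptions made for this sketch; the actual controller behavior is not documented here.

```c
#include <assert.h>
#include <stdint.h>

/* Selects the 32-bit lane of a 64- or 128-bit SRAM word that a CPU byte
 * address refers to. wide_word holds the SRAM word as 2 or 4 lanes, with
 * lane 0 at the lowest address (an assumption of this sketch). */
uint32_t sram_lane_read(const uint32_t *wide_word, unsigned width_bits,
                        uint32_t byte_addr)
{
    assert(wide_word != 0);
    assert(width_bits == 64 || width_bits == 128);

    unsigned lanes = width_bits / 32;               /* 2 or 4 lanes */
    unsigned lane = (byte_addr >> 2) & (lanes - 1); /* word index in SRAM word */
    return wide_word[lane];
}
```

Reading four consecutive 32-bit words from a 128-bit bank therefore touches the same SRAM word four times, once per lane.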
The 512 KB system memory can be broadly divided into three areas. The first few entries of the memory are
used to store DMA instructions (also known as descriptors). The DMA hardware logic executes
instructions from these locations. The last 16 KB of the system memory shadows the translation table
necessary for cache operations. The remaining area can be used as user code area and/or user
data area and/or DMA buffer area.
Note 1 entry = 4 words
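The three-way split of system memory can be worked through numerically. In the sketch below, the base address and the number of descriptor entries are hypothetical parameters supplied by the caller; only the 512 KB total, the 16 KB translation table area, and the four-word entry size come from the text above, with a word taken to be 32 bits.

```c
#include <assert.h>
#include <stdint.h>

#define SYSMEM_SIZE_BYTES  (512u * 1024u)
#define TRANSLATION_BYTES  (16u * 1024u)
#define DESCRIPTOR_BYTES   (4u * 4u)   /* 1 entry = 4 words of 32 bits */

typedef struct {
    uint32_t general_start;      /* first byte after the descriptor entries */
    uint32_t general_size;       /* bytes left for code, data, DMA buffers */
    uint32_t translation_start;  /* start of the last 16 KB */
} sysmem_layout_t;

sysmem_layout_t sysmem_layout(uint32_t base, uint32_t descriptor_count)
{
    assert(descriptor_count <=
           (SYSMEM_SIZE_BYTES - TRANSLATION_BYTES) / DESCRIPTOR_BYTES);

    sysmem_layout_t l;
    l.general_start = base + descriptor_count * DESCRIPTOR_BYTES;
    l.translation_start = base + SYSMEM_SIZE_BYTES - TRANSLATION_BYTES;
    l.general_size = l.translation_start - l.general_start;
    return l;
}
```

For example, reserving 512 descriptor entries consumes 8 KB, leaving 488 KB between the descriptor area and the translation table area.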
3.4 Interrupts
Interrupt exceptions are facilitated using the FIQ and IRQ lines of the ARM9 processor. The ISR
branch instruction for each of these interrupts is provided in the 32-byte exception table located at
the beginning of the ITCM.
The embedded PL192 vectored interrupt controller (VIC) licensed from ARM provides a hardware-based
interrupt management system that handles interrupt vectoring, priority, masking, and timing,
providing a real-time interrupt status. The PL192 VIC supports 32 active-high interrupt sources, and the
ISR location of each source can be programmed into the VIC. Each interrupt can be assigned one of the 15
programmable priority levels; equal-priority interrupts are further prioritized based on the interrupt
number. While each interrupt pin has corresponding mask and enable bits, interrupts with a
particular priority level can all be masked out at the same time if desired. Each of the 32 interrupts
can be vectored to one of the active-low FIQ or IRQ outputs of the VIC, which are directly connected to the
corresponding lines of the ARM9 CPU. PL192 also supports daisy-chained interrupts, a feature that
is not enabled in FX3.
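The prioritization rules described above can be modeled in software as follows. This is a behavioral sketch, not the PL192 register interface: the function and parameter names are hypothetical, and it assumes that a numerically lower priority value is more urgent and that the lower interrupt number wins a tie.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define VIC_SOURCES 32

/* Returns the interrupt number that should be serviced next, or -1.
 * pending/enabled: one bit per interrupt source.
 * priority[n]:     priority level of source n (must fit the 16-bit mask).
 * masked_levels:   bit L set means every source at level L is masked. */
int vic_select(uint32_t pending, uint32_t enabled,
               const uint8_t priority[VIC_SOURCES], uint16_t masked_levels)
{
    assert(priority != NULL);

    int winner = -1;
    uint32_t active = pending & enabled;

    for (int n = 0; n < VIC_SOURCES; n++) {
        if (!(active & (UINT32_C(1) << n))) {
            continue;                       /* not pending or not enabled */
        }
        assert(priority[n] < 16);
        if (masked_levels & (1u << priority[n])) {
            continue;                       /* whole priority level masked */
        }
        /* Strict comparison keeps the lower interrupt number on a tie. */
        if (winner < 0 || priority[n] < priority[winner]) {
            winner = n;
        }
    }
    return winner;
}
```

The per-source enable bits and the per-level mask are independent filters, which mirrors the two masking mechanisms mentioned in the text.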
Note Other exceptions include reset, software exception, data abort, and pre-fetch abort.
When both FIQ and IRQ interrupt inputs assert, the CPU jumps to the FIQ entry of the exception
table. The FIQ handler is usually placed immediately after the table, saving a branch. The FIQ mode
uses dedicated FIQ banked registers. When the IRQ line alone asserts, the CPU jumps to the IRQ handler.
The IRQ handler saves the workspace on stack, reads the address of the ISR from the VIC, and
jumps to the actual ISR.
In general, high priority, low latency interrupts are vectored into FIQ while the IRQ line is reserved for
general interrupts. Re-entrant interrupts can be supported with additional firmware.
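The entry selection and the IRQ dispatch sequence described above can be sketched in portable C. This is a host-side model under stated assumptions: `vic_model_t` stands in for the VIC register that holds the active ISR address, `example_isr` is a placeholder handler, and real firmware would perform the context save in assembly.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*isr_t)(void);

typedef enum { ENTRY_NONE, ENTRY_FIQ, ENTRY_IRQ } exception_entry_t;

/* FIQ takes precedence when both request lines are asserted. */
exception_entry_t select_exception_entry(int fiq_asserted, int irq_asserted)
{
    if (fiq_asserted) {
        return ENTRY_FIQ;
    }
    if (irq_asserted) {
        return ENTRY_IRQ;
    }
    return ENTRY_NONE;
}

/* Hypothetical stand-in for the VIC register holding the active ISR address. */
typedef struct {
    isr_t vect_address;
} vic_model_t;

/* Placeholder ISR used only to make the model observable. */
int example_isr_calls = 0;
void example_isr(void)
{
    example_isr_calls++;
}

void irq_handler(const vic_model_t *vic)
{
    assert(vic != NULL);
    /* Real firmware would first save the interrupted workspace on the stack. */
    isr_t isr = vic->vect_address;  /* read the ISR address from the VIC */
    if (isr != NULL) {
        isr();                      /* jump to the actual ISR */
    }
}
```

Because the common IRQ handler fetches the ISR address at run time, adding an interrupt source only requires programming its ISR location into the VIC.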
3.5 JTAG Debugger Interface
Debug support is implemented by using the ARM9EJ-S core embedded within the ARM926EJ-S
processor. The ARM9EJ-S core has hardware that eases debugging at the lowest level. The debug
extensions make it possible to stall the core's program execution, examine the internal state of the core and the
memory system, and then resume program execution.
The ARM debugging environment has three components: a debug-host resident program (RealView
debugger), a debug communication channel (JTAG), and a target (EmbeddedICE-RT). The two
JTAG-style scan chains (Scan1 and Scan2) enable debugging and programming of the
EmbeddedICE-RT block.
Scan Chain 1 is used to debug the ARM9EJ-S core when it has entered the debug state. The scan
chain can be used to inject instructions into the ARM pipeline and also read or write core registers
without having to use the external data bus. Scan Chain 2 enables access to the EmbeddedICE
registers. The boundary scan interface includes a state machine controller called the TAP controller
that controls the action of scan chains using the JTAG serial protocol.
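The TAP controller mentioned above is the standard 16-state JTAG state machine, which advances on each TCK cycle according to the TMS input. The sketch below models those standard transitions; the enum and function names are illustrative and not taken from any ARM or Cypress header.

```c
#include <assert.h>

typedef enum {
    TAP_TEST_LOGIC_RESET, TAP_RUN_TEST_IDLE,
    TAP_SELECT_DR_SCAN, TAP_CAPTURE_DR, TAP_SHIFT_DR, TAP_EXIT1_DR,
    TAP_PAUSE_DR, TAP_EXIT2_DR, TAP_UPDATE_DR,
    TAP_SELECT_IR_SCAN, TAP_CAPTURE_IR, TAP_SHIFT_IR, TAP_EXIT1_IR,
    TAP_PAUSE_IR, TAP_EXIT2_IR, TAP_UPDATE_IR,
    TAP_STATE_COUNT
} tap_state_t;

/* Next state indexed by [current state][TMS]. */
static const tap_state_t tap_next[TAP_STATE_COUNT][2] = {
    [TAP_TEST_LOGIC_RESET] = { TAP_RUN_TEST_IDLE, TAP_TEST_LOGIC_RESET },
    [TAP_RUN_TEST_IDLE]    = { TAP_RUN_TEST_IDLE, TAP_SELECT_DR_SCAN },
    [TAP_SELECT_DR_SCAN]   = { TAP_CAPTURE_DR,    TAP_SELECT_IR_SCAN },
    [TAP_CAPTURE_DR]       = { TAP_SHIFT_DR,      TAP_EXIT1_DR },
    [TAP_SHIFT_DR]         = { TAP_SHIFT_DR,      TAP_EXIT1_DR },
    [TAP_EXIT1_DR]         = { TAP_PAUSE_DR,      TAP_UPDATE_DR },
    [TAP_PAUSE_DR]         = { TAP_PAUSE_DR,      TAP_EXIT2_DR },
    [TAP_EXIT2_DR]         = { TAP_SHIFT_DR,      TAP_UPDATE_DR },
    [TAP_UPDATE_DR]        = { TAP_RUN_TEST_IDLE, TAP_SELECT_DR_SCAN },
    [TAP_SELECT_IR_SCAN]   = { TAP_CAPTURE_IR,    TAP_TEST_LOGIC_RESET },
    [TAP_CAPTURE_IR]       = { TAP_SHIFT_IR,      TAP_EXIT1_IR },
    [TAP_SHIFT_IR]         = { TAP_SHIFT_IR,      TAP_EXIT1_IR },
    [TAP_EXIT1_IR]         = { TAP_PAUSE_IR,      TAP_UPDATE_IR },
    [TAP_PAUSE_IR]         = { TAP_PAUSE_IR,      TAP_EXIT2_IR },
    [TAP_EXIT2_IR]         = { TAP_SHIFT_IR,      TAP_UPDATE_IR },
    [TAP_UPDATE_IR]        = { TAP_RUN_TEST_IDLE, TAP_SELECT_DR_SCAN },
};

/* Advances the TAP state by one TCK cycle for the given TMS level. */
tap_state_t tap_step(tap_state_t state, int tms)
{
    assert((unsigned)state < (unsigned)TAP_STATE_COUNT);
    return tap_next[state][tms ? 1 : 0];
}
```

A useful property of this state machine is that holding TMS high for five TCK cycles returns the controller to Test-Logic-Reset from any state, which is how debug probes resynchronize.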
The ARM9EJ-S EmbeddedICE-RT logic provides integrated on-chip debug support for the
ARM9EJ-S core. The EmbeddedICE-RT logic comprises two real time watchpoint units, two
independent registers, the Debug Control Register and the Debug Status Register, and the debug
communication channel. A watchpoint unit can either be configured to monitor data accesses
(commonly called watchpoints) or monitor instruction fetches (commonly called breakpoints).
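The comparison performed by a watchpoint unit can be sketched as a masked address match qualified by the access type. This is a simplified model under stated assumptions: the structure layout is hypothetical, a mask bit of 1 is taken to mean "ignore this address bit", and data value or control signal matching is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef enum { ACCESS_INSTRUCTION_FETCH, ACCESS_DATA } access_kind_t;

typedef struct {
    uint32_t address_value;  /* address to compare against */
    uint32_t address_mask;   /* assumption: 1 = ignore this address bit */
    access_kind_t kind;      /* fetches (breakpoint) or data (watchpoint) */
    int enabled;
} watchpoint_unit_t;

/* Returns 1 when the access matches the unit's configuration, else 0. */
int watchpoint_hit(const watchpoint_unit_t *wp, uint32_t addr,
                   access_kind_t kind)
{
    assert(wp != NULL);
    if (!wp->enabled || wp->kind != kind) {
        return 0;
    }
    /* Compare only the address bits that are not masked out. */
    return ((addr ^ wp->address_value) & ~wp->address_mask) == 0;
}
```

With only two units available, masking low address bits lets a single unit cover an aligned range rather than one exact address.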
The EmbeddedICE-RT logic interacts with the external logic (logic outside the CPU subsystem)
using the debug interface. In addition, it can be programmed (for example, setting a breakpoint)
using the JTAG based TAP controller. The debug interface signals not only communicate the debug
status of the core to the external logic but also provide a means for the external logic to raise
breakpoints if needed (disabled in FX3 by default).
ARM9EJ-S supports two debug modes: Halt mode and Monitor mode. In halt mode debug, a watchpoint or breakpoint request forces the core into debug state. The internal state of the core can then
be examined and instructions inserted into its pipeline using the TAP controller without using the
external bus thus leaving the rest of the system unaltered. The core can then be forced to resume
normal operation. Alternatively, the EmbeddedICE-RT logic can be configured in monitor mode, where
watchpoints or breakpoints generate Data or Pre-fetch Aborts respectively. This enables the debug
monitor system to debug the ARM while enabling critical fast interrupt requests to be serviced.
3.6 Peripherals
3.6.1 I2S
FX3 is capable of functioning as a master mode transmitter over its Inter-IC Sound (I2S)
interface. When integrated with an audio device, the I2S bus only handles audio data, while the other
signals, such as sub-coding and control, are transferred separately.
The I2S block can be configured to support different audio bus widths, endianness, number of
channels, and data rate. By default, the interface is protocol standard big endian (most significant bit
first); nevertheless, the block's endianness can be reversed. FX3 also supports the left justified and
right justified variants of the protocol. When the block is enabled in left justified mode, the left
channel audio sample is sent first on the SDA line.
In the mono mode, the 'left data' is sent to both channels on the receiver (WordSelect=Left and
WordSelect=Right). Supported audio sample widths include 8, 16, 18, 24, and 32 bits. In the variable
SCK (Serial Clock) mode, WS (WordSelect) toggles every Nth edge of SCK, where N is the bus
width chosen. In fixed SCK mode, however, WS toggles every 32nd SCK edge. In this mode,
the audio sample is zero padded to 32 bits. FX3 supports word-at-a-time (SGL_LEFT_DATA,
SGL_RIGHT_DATA) I2S operations for small transfers and DMA-based I2S operations for larger
transfers. The Serial Clock can be derived from the internal clock architecture of FX3 or supplied
externally using a crystal oscillator. Typical frequencies for WS include 8, 16, 32, 44.1, 48, 96, and
192 kHz.
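The relationship between WS, the sample width, and SCK in the two clocking modes can be worked out as follows. The sketch assumes that "every Nth edge" means N full SCK periods per channel slot and that one WS period covers a left and a right slot; the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef enum { I2S_SCK_VARIABLE, I2S_SCK_FIXED } i2s_sck_mode_t;

static int i2s_width_supported(uint32_t bits)
{
    return bits == 8 || bits == 16 || bits == 18 || bits == 24 || bits == 32;
}

/* SCK periods per channel slot: the sample width in variable mode,
 * always 32 in fixed mode. */
uint32_t i2s_slot_bits(i2s_sck_mode_t mode, uint32_t sample_bits)
{
    assert(i2s_width_supported(sample_bits));
    return (mode == I2S_SCK_FIXED) ? 32u : sample_bits;
}

/* SCK frequency: two slots (left and right) per WS period. */
uint32_t i2s_sck_hz(i2s_sck_mode_t mode, uint32_t sample_bits, uint32_t ws_hz)
{
    return 2u * i2s_slot_bits(mode, sample_bits) * ws_hz;
}

/* Number of zero bits added to each sample. */
uint32_t i2s_pad_bits(i2s_sck_mode_t mode, uint32_t sample_bits)
{
    return i2s_slot_bits(mode, sample_bits) - sample_bits;
}
```

Under these assumptions, 16-bit audio at a 48 kHz WS rate needs a 1.536 MHz SCK in variable mode and a 3.072 MHz SCK in fixed mode, where each sample carries 16 padding bits.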
Two special modes of operation, Mute and Pause, are supported. When Mute is held asserted, DMA
data is ignored and zeros are transmitted instead. When paused, DMA data flow into the block is
stopped and zeros are transmitted over the interface.
FX3 Overview
3.6.2 I2C
Figure 3-7. I2C Block Diagram (the FX3 I2C master and another I2C master share the SCL and SDA lines with I2C Slave 1 and I2C Slave 2; resistors R1 and R2 connect the lines to VDD)
FX3 is capable of functioning as a master transceiver and supports 100 kHz, 400 kHz, and 1 MHz
operation. The I2C block operates in big endian mode (most significant bit first) and supports both
7-bit and 10-bit slave addressing. Similar to I2S, this block supports both single and burst (DMA)
data transfers.
Slow devices on its I2C bus can work with FX3's I2C using the clock stretching based flow control.
FX3 can function in multi-master bus environments as it is capable of carrying out negotiations with
other masters on the bus using SDA based arbitration. Additionally, FX3 supports the repeated start
feature to communicate to multiple slave devices on the bus without losing ownership of the bus in
between (see the stop last and start first feature in the following sections).
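SDA-based arbitration relies on the open-drain bus: the line reads low if any master drives it low, so a master that sends a 1 but reads back a 0 has lost and must back off. The following sketch models two masters transmitting at the same time; the function name and return convention are illustrative and do not describe FX3 registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Simulates two masters transmitting simultaneously, most significant bit
 * first. Returns 0 if master A wins, 1 if master B wins, and -1 if both
 * sent identical data so that neither lost arbitration. */
int i2c_arbitrate(const uint8_t *a, const uint8_t *b, size_t len)
{
    assert(a != NULL && b != NULL);

    for (size_t i = 0; i < len; i++) {
        for (int bit = 7; bit >= 0; bit--) {
            int sda_a = (a[i] >> bit) & 1;
            int sda_b = (b[i] >> bit) & 1;
            int bus = sda_a & sda_b;   /* open-drain lines act as wired-AND */

            if (sda_a != bus) {
                return 1;              /* A sent 1 but saw 0: A backs off */
            }
            if (sda_b != bus) {
                return 0;              /* B sent 1 but saw 0: B backs off */
            }
        }
    }
    return -1;
}
```

The winning master never notices the contention, because every bit it sent matched what it read back, so its transfer continues undisturbed.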
Combined format communication is supported, which allows the user to load multiple bytes of data
(including slave chip address phases) into special registers called the preamble. The user can
choose to place start (repeated) or stop bits between the bytes and can also define the master's
behavior on receiving either a NAK or ACK for bytes in the preamble. In applications such as
EEPROM reads, this greatly reduces firmware complexity and execution time by packing initial
communication bytes into a transaction header with the ability to abort the header transaction on
receiving NAK exceptions in the middle of an operation.
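To make the preamble idea concrete, the sketch below packs the header of an EEPROM random read into a small structure. Everything here is hypothetical: the structure layout, field names, and capacity are invented for illustration and are not the FX3 SDK API, and the EEPROM is assumed to use a 7-bit slave address and a 16-bit memory address.

```c
#include <assert.h>
#include <stdint.h>

#define PREAMBLE_MAX_BYTES 8   /* hypothetical capacity */

typedef struct {
    uint8_t  bytes[PREAMBLE_MAX_BYTES];
    uint8_t  length;
    uint16_t start_after_mask;  /* bit i set: repeated start after byte i */
    uint16_t stop_after_mask;   /* bit i set: stop after byte i */
} i2c_preamble_t;

/* Builds the header for a random read: address the device for writing,
 * send the 16-bit memory address, then re-address it for reading. */
i2c_preamble_t eeprom_read_preamble(uint8_t slave_addr7, uint16_t mem_addr)
{
    assert(slave_addr7 < 0x80);

    i2c_preamble_t p = { { 0 }, 0, 0, 0 };
    p.bytes[0] = (uint8_t)(slave_addr7 << 1);          /* address + write */
    p.bytes[1] = (uint8_t)(mem_addr >> 8);             /* memory address high */
    p.bytes[2] = (uint8_t)(mem_addr & 0xFFu);          /* memory address low */
    p.bytes[3] = (uint8_t)((slave_addr7 << 1) | 1u);   /* address + read */
    p.length = 4;
    p.start_after_mask = (uint16_t)(1u << 2);  /* repeated start after byte 2 */
    return p;
}
```

Once such a header is loaded, the hardware can issue all four bytes and the repeated start on its own, and firmware only has to handle the data phase that follows.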
In addition, the preamble repeat feature available in FX3 simplifies firmware and saves time in
situations such as ACK monitoring from the EEPROM to check completion of a previously
issued program operation. In this case, FX3's I2C can be programmed to repeat a single byte
preamble containing the EEPROM's I2C address until the device responds with an ACK.
By programming the burst read count value for this block, burst reads from the slave (an EEPROM, for
example) can be performed with no firmware intervention. In this case, the FX3 master receiver
sends an ACK response for all bytes received as long as the burst read counter does not expire. When
the last byte of the burst is received, FX3's I2C block signals a NAK followed by a stop bit forcing the