R4100 Series, and EEPROM are trademarks of NEC Corporation.
Micro Wire is a trademark of National Semiconductor Corp.
iAPX is a trademark of Intel Corp.
DEC VAX is a trademark of Digital Equipment Corp.
UNIX is a registered trademark in the United States and other countries, licensed exclusively through
X/Open Company, Ltd.
Ethernet is a trademark of Xerox Corp.
MIPS is a trademark of MIPS Technologies, Inc.
• The information contained in this document is being issued in advance of the production cycle for the
device. The parameters for the device may change before final production or NEC Corporation, at its own
discretion, may withdraw the device prior to its production.
• Not all devices/types available in every country. Please check with local NEC representative for availability
and additional information.
• No part of this document may be copied or reproduced in any form or by any means without the prior written
consent of NEC Corporation. NEC Corporation assumes no responsibility for any errors which may appear in
this document.
• NEC Corporation does not assume any liability for infringement of patents, copyrights or other intellectual property
rights of third parties by or arising from use of a device described herein or any other liability arising from use
of such device. No license, either express, implied or otherwise, is granted under any patents, copyrights or other
intellectual property rights of NEC Corporation or others.
• Descriptions of circuits, software, and other related information in this document are provided for illustrative
purposes in semiconductor product operation and application examples. The incorporation of these circuits,
software, and information in the design of the customer's equipment shall be done under the full responsibility
of the customer. NEC Corporation assumes no responsibility for any losses incurred by the customer or third
parties arising from the use of these circuits, software, and information.
• While NEC Corporation has been making continuous effort to enhance the reliability of its semiconductor devices,
the possibility of defects cannot be eliminated entirely. To minimize risks of damage or injury to persons or
property arising from a defect in an NEC semiconductor device, customers must incorporate sufficient safety
measures in their design, such as redundancy, fire-containment, and anti-failure features.
• NEC devices are classified into the following three quality grades:
"Standard", "Special", and "Specific". The Specific quality grade applies only to devices developed based on a
customer designated "quality assurance program" for a specific application. The recommended applications of
a device depend on its quality grade, as indicated below. Customers must check the quality grade of each device
before using it in a particular application.
Standard: Computers, office equipment, communications equipment, test and measurement equipment,
audio and visual equipment, home electronic appliances, machine tools, personal electronic
equipment and industrial robots
Special: Transportation equipment (automobiles, trains, ships, etc.), traffic control systems, anti-disaster
systems, anti-crime systems, safety equipment and medical equipment (not specifically designed
for life support)
Specific: Aircraft, aerospace equipment, submersible repeaters, nuclear reactor control systems, life
support systems or medical equipment for life support, etc.
The quality grade of NEC devices is "Standard" unless otherwise specified in NEC's Data Sheets or Data Books.
If customers intend to use NEC devices for applications other than those specified for Standard quality grade,
they should contact an NEC sales representative in advance.
Preliminary User’s Manual S15543EJ1V0UM
M5D 98. 12
PREFACE
Readers  This manual is intended for engineers who need to be familiar with the capability of
         the µPD98502 in order to develop application systems based on it.
Purpose  The purpose of this manual is to help users understand the hardware capabilities
         (listed below) of the µPD98502.
Configuration  This manual consists of the following chapters:
• Introduction
• VR4120A CPU
• System controller
• ATM cell processor
• Ethernet controller
• USB controller
• PCI controller
• UART
• Timer
• Micro Wire
Guidance  Readers of this manual should already have a general knowledge of electronics, logic
          circuits, and microcomputers.
To gain an overall understanding of the function of the µPD98502:
→ Read through all the chapters, in sequence.
To check the electrical characteristics of the µPD98502:
→ Refer to the separate data sheet.
Notation  This manual uses the following conventions:
Data bit significance: High-order bits on the left side;
                       low-order bits on the right side
Active low: XXXX_B (Pin and signal names are suffixed with _B.)
Note: Explanation of an indicated part of text
Caution: Information requiring the user’s special attention
Remark: Supplementary information
Numerical value: Binary ... xxxx or xxxxB
                 Decimal ... xxxx
                 Hexadecimal ... xxxxH
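The numerical-value convention can be made concrete with a small parser. The following C sketch is illustrative only and is not taken from the µPD98502 documentation: parse_manual_number is a hypothetical helper name, and it assumes that a literal without a suffix is decimal (the convention also allows unsuffixed binary, which cannot be distinguished from decimal without context).

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Parse a literal written in the manual's notation:
 *   "1010B" -> binary, "F000H" -> hexadecimal, "128" -> decimal.
 * Returns 0 and stores the value in *out on success, -1 on malformed input.
 * ASSUMPTION: an unsuffixed literal is treated as decimal.
 * parse_manual_number is a hypothetical name used for illustration. */
int parse_manual_number(const char *text, unsigned long *out)
{
    size_t len = strlen(text);
    int base = 10;
    char digits[40];
    char *end;

    if (len == 0 || len >= sizeof digits)
        return -1;

    /* The last character selects the radix. */
    char suffix = (char)toupper((unsigned char)text[len - 1]);
    if (suffix == 'H') {
        base = 16;
        len--;
    } else if (suffix == 'B') {
        base = 2;
        len--;
    }
    if (len == 0)
        return -1;

    memcpy(digits, text, len);
    digits[len] = '\0';

    /* Reject leading signs or whitespace that strtoul would accept. */
    if (!isxdigit((unsigned char)digits[0]))
        return -1;

    *out = strtoul(digits, &end, base);
    return (*end == '\0') ? 0 : -1;
}
```

With this sketch, an offset such as F000H parses to 0xF000, and a string containing a digit that is invalid for its radix is rejected rather than silently truncated.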
Related Document  Use this manual in combination with the following document.
The related documents indicated in this publication may include preliminary versions.
However, preliminary versions are not marked as such.
1.7.14 I.C. – open .....................................................................................................................................48
5.1.1 Features ......................................................................................................................................277
5.1.2 Block diagram of Ethernet controller block ..................................................................................277
1-1   Examples of the µPD98502 System Configuration ..... 24
1-2   Block Diagram of the µPD98502 ..... 25
1-3   Block Diagram of VR4120A RISC Processor ..... 26
1-4   Block Diagram of IBUS ..... 27
1-5   Block Diagram of System Controller ..... 28
1-6   Block Diagram of ATM Cell Processor ..... 29
1-7   Block Diagram of Ethernet Controller ..... 30
1-8   Block Diagram of USB Controller ..... 31
1-9   Block Diagram of PCI Bus controller ..... 32
1-12  Interrupt Signal Connection ..... 55
1-13  Block Diagram of Clock Control Unit ..... 56
2-8   MIPS III ISA CPU Instruction Formats ..... 66
2-9   Pipeline Stages (MIPS III Instruction Mode) ..... 84
2-10  Instruction Execution in the Pipeline ..... 85
2-21  Data Cache Miss Stall ..... 97
2-23  Load Data Interlock ..... 98
2-32  CP0 Registers and TLB ..... 117
2-33  Format of a TLB Entry ..... 118
2-49  Count Register Format ..... 132
2-50  Compare Register Format ..... 133
2-56  WatchHi Register Format ..... 139
2-57  XContext Register Format ..... 140
2-64  Soft Reset and NMI Exception Handling ..... 164
2-65  Logical Hierarchy of Memory ..... 168
2-67  Instruction Cache Line Format ..... 170
2-68  Data Cache Line Format ..... 170
2-69  Cache Data and Tag Organization ..... 171
2-70  Data Cache State Diagram ..... 173
      µPD98502 Physical Address Space ..... 116
LIST OF FIGURES (3/5)
Figure No.  Title  Page
2-71  Instruction Cache State Diagram ..... 173
2-72  Data Check Flow on Instruction Fetch ..... 174
2-73  Data Check Flow on Load Operations ..... 174
2-74  Data Check Flow on Store Operations ..... 175
2-75  Data Check Flow on Index_Invalidate Operations ..... 175
2-76  Data Check Flow on Index_Writeback_Invalidate Operations ..... 176
2-77  Data Check Flow on Index_Load_Tag Operations ..... 176
2-78  Data Check Flow on Index_Store_Tag Operations ..... 177
2-79  Data Check Flow on Create_Dirty Operations ..... 177
2-80  Data Check Flow on Hit_Invalidate Operations ..... 178
2-81  Data Check Flow on Hit_Writeback_Invalidate Operations ..... 178
2-82  Data Check Flow on Fill Operations ..... 179
2-83  Data Check Flow on Hit_Writeback Operations ..... 179
2-89  Masking of Interrupt Request Signals ..... 184
3-1   Bit and Byte Order of Endian Modes ..... 227
3-2   Half-word Data Array Example ..... 227
3-3   Word Data Array Example ..... 228
4-1   Block Diagram of ATM Cell Processor ..... 230
4-2   AAL-5 Sublayer and ATM Layer ..... 232
4-3   AAL-5 Sublayer and ATM Layer ..... 233
4-9   Tx Buffer Elements ..... 248
4-12  Rx Pool Structure ..... 251
4-13  Rx Pool Descriptor/Rx Buffer Directory/Rx Buffer Descriptor/Rx Link Pointer ..... 252
4-14  Rx Pool Descriptor ..... 253
4-15  Rx Buffer Descriptor/Link Pointer ..... 254
4-16  Transfer of F/W ..... 255
4-17  Instruction RAM and Instruction Cache ..... 256
4-25  Structure of the Transmit Queue ..... 265
4-26  Packet Info Structure ..... 265
4-29  Raw Cell with CRC-10 ..... 269
4-31  LLC Encapsulation Format ..... 270
4-33  Raw Cell Data Format ..... 273
5-1   Block Diagram of Ethernet Controller ..... 278
5-2   Tx FIFO Control Mechanism ..... 295
5-3   Rx FIFO Control Mechanism ..... 297
5-4   Buffer Structure for Ethernet Block ..... 300
6-9   Transmit Status Register ..... 340
6-11  Transmit Indication Format ..... 343
6-12  Division of Data into USB Packets ..... 344
6-25  Example of Buffers Including Corrupted Data ..... 361
6-26  Receive Indication Format ..... 362
6-29  Remote Wake Up Sequence ..... 366
6-30  Allowable Skew for SOF ..... 367
6-31  Data Flow in Loopback Mode ..... 368
6-32  Example of Connection ..... 369
7-2   Posted Write Transaction from Internal Bus to PCI ..... 372
7-3   Non Posted Write Transaction from Internal Bus to PCI ..... 373
7-4   Delayed Read Transaction from Internal Bus to PCI ..... 374
7-5   Non Delayed Read Transaction from Internal Bus to PCI ..... 375
7-6   Posted Write Transaction from PCI to Internal bus ..... 377
7-7   Non Posted Write Transaction from PCI to Internal bus ..... 378
7-8   Delayed Read Transaction from PCI to Internal bus ..... 379
7-9   Non Delayed Read Transaction from PCI to Internal bus ..... 380
7-10  The Sequence of the Transition by Issues from PCI-Host ..... 384
7-11  The Sequence of the Transition by PME ..... 385
7-12  The Content of P_PCAR Register for Type0 Configuration Cycle ..... 386
7-13  The Content of P_PCAR Register for Type1 Configuration Cycle ..... 386
7-14  An Example How to Connect AD [31:16] Signal Line to IDSEL Port ..... 388
7-15  Address Stepping for IDSEL ..... 388
7-16  Arbitration in Alternating Mode ..... 389
7-17  Arbitration in Rotating Mode ..... 389
A-1   VR4120A Opcode Bit Encoding ..... 588
LIST OF TABLES (1/2)
Table No.  Title  Page
2-1   System Control Coprocessor (CP0) Register Definitions ..... 64
2-2   Number of Delay Slot Cycles Necessary for Load and Store Instructions ..... 67
2-3   Byte Specification Related to Load and Store Instructions ..... 68
2-8   Three-Operand Type Instruction ..... 72
2-9   Three-Operand Type Instruction (Extended ISA) ..... 73
2-25  Description of Pipeline Exception ..... 95
2-26  VR Series Supported Instructions ..... 100
2-27  Comparison of useg and xuseg ..... 107
2-28  32-bit and 64-bit Supervisor Mode Segments ..... 109
2-33  Mask Values and Page Sizes ..... 121
3-10  SDRAM Word Order for Instruction-Cache Line-Fill ..... 217
3-11  Endian Translation Table for the data swap mode (IBUS master) ..... 221
3-12  Endian Translation Table for the data swap mode (IBUS slave) ..... 222
4-1   List of Tx Packet Attribute ..... 249
4-2   List of Rx Pool Attributes ..... 253
5-2   MAC Control Register Map ..... 279
5-4   DMA and FIFO Management Registers Map ..... 283
5-5   Interrupt and Configuration Registers Map ..... 284
5-6   Attribute for Transmit Descriptor ..... 301
5-7   Attribute for Receive Descriptor ..... 302
7-1   Device Number Decode Table ..... 387
8-1   Correspondence between Baud Rates and Divisors ..... 417
10-1  EEPROM Initial Data ..... 428
10-2  EEPROM Command List ..... 428
Core    Offset        Length  Name       Access by  Description
                      (Byte)             VR4120A
ATM     F000H         4       A_GMR      R/W        General Mode Register
ATM     F004H         4       A_GSR      R          General Status Register
ATM     F008H         4       A_IMR      R/W        Interrupt Mask Register
ATM     F00CH         4       A_RQU      R          Receive Queue Underrunning
ATM     F010H         4       A_RQA      R          Receive Queue Alert
ATM     F014H         -       N/A        -          Reserved for future use
ATM     F018H         4       A_VER      R          Version Number
ATM     F01CH         -       N/A        -          Reserved for future use
ATM     F020H         4       A_CMR      R/W        Command Register
ATM     F024H         -       N/A        -          Reserved for future use
ATM     F028H         4       A_CER      R/W        Command Extension Register
ATM     F02CH-F04CH   -       N/A        -          Reserved for future use
ATM     F050H         4       A_MSA0     R/W        Mailbox0 Start Address
ATM     F054H         4       A_MSA1     R/W        Mailbox1 Start Address
ATM     F058H         4       A_MSA2     R/W        Mailbox2 Start Address
ATM     F05CH         4       A_MSA3     R/W        Mailbox3 Start Address
ATM     F060H         4       A_MBA0     R/W        Mailbox0 Bottom Address
ATM     F064H         4       A_MBA1     R/W        Mailbox1 Bottom Address
ATM     F068H         4       A_MBA2     R/W        Mailbox2 Bottom Address
ATM     F06CH         4       A_MBA3     R/W        Mailbox3 Bottom Address
ATM     F070H         4       A_MTA0     R/W        Mailbox0 Tail Address
ATM     F074H         4       A_MTA1     R/W        Mailbox1 Tail Address
ATM     F078H         4       A_MTA2     R/W        Mailbox2 Tail Address
ATM     F07CH         4       A_MTA3     R/W        Mailbox3 Tail Address
ATM     F080H         4       A_MWA0     R/W        Mailbox0 Write Address
ATM     F084H         4       A_MWA1     R/W        Mailbox1 Write Address
ATM     F088H         4       A_MWA2     R/W        Mailbox2 Write Address
ATM     F08CH         4       A_MWA3     R/W        Mailbox3 Write Address
ATM     F090H         4       A_RCC      R          Valid Receiving Cell Counter
ATM     F094H         4       A_TCC      R          Valid Transmitting Cell Counter
ATM     F098H         4       A_RUEC     R          Receive Unprovisioned VPI/VCI Error Cell Counter
ATM     F09CH         4       A_RIDC     R          Receiving Internal Discarded Cell Counter
ATM     F0A0H-F0AFH   -       N/A        -          Reserved for future use
ATM     F0B0H-F0B3H   -       N/A        -          Reserved for future use
ATM     F0B4H-F0BCH   -       N/A        -          Reserved for future use
ATM     F0C0H         4       A_T1R      R/W        T1 Timer Register
ATM     F0C4H         -       N/A        -          Reserved for future use
ATM     F0C8H         4       A_TSR      R/W        Time Stamp Register
ATM     F200H-F2FFH   -       N/A        -          Cannot be accessed from the VR4120A RISC core.
ATM     F300H         4       A_IBBAR    R/W        IBUS Base Address Register
ATM     F304H         4       A_INBAR    R/W        Instruction Base Address Register
ATM     F308H-F31FH   -       N/A        -          Reserved for future use
ATM     F320H         4       A_UMCMD    R/W        UTOPIA Management Interface Command Register
ATM     F324H-F3FFH   -       N/A        -          Reserved for future use
ATM     F400H-F4FFH   -       N/A        -          Cannot be accessed from the VR4120A RISC core.
ATM     F500H-FFFFH   -       N/A        -          Reserved for future use
PCI     000H          4       P_PLBA     R/W        PCI Lower Base Address
PCI     008H          4       P_IBBA     R/W        Internal bus Base Address
PCI     00CH          4       N/A        -          Reserved for future use
PCI     010H          4       P_VERR     R          Version Register
PCI     014H          4       P_PCAR     R/W        PCI Configuration Address Register
PCI     018H          4       P_PCDR     R/W        PCI Configuration Data Register
PCI     01CH          4       P_IGSR     R          Internal bus General Status Register
PCI     020H          4       P_IIMR     R/W        Internal bus Interrupt Mask Register
PCI     024H          4       P_PGSR     R/W        PCI General Status Register
PCI     028H          4       P_PIMR     R/W        PCI Interrupt Mask Register
PCI     02CH          4       N/A        -          Reserved for future use
PCI     030H          4       P_HMCR     R/W        Host Mode Control Register
PCI     034H-03CH     4       N/A        -          Reserved for future use
PCI     040H          4       P_PWCD     R/W        Power Consumption Data Register
PCI     044H          4       P_PWDD     R/W        Power Dissipation Data Register
CHAPTER 1 INTRODUCTION
Core    Offset        Length  Name       Access by  Description
                      (Byte)             VR4120A
PCI     048H-04CH     4       N/A        -          Reserved for future use
PCI     050H          4       P_BCNT     R/W        Bridge Control Register
PCI     054H          4       P_PPCR     R/W        Power Control Register
PCI     058H          4       P_SWRR     W          Software Reset Register
PCI     05CH          4       P_PTMR     R/W        Retry Timer Register
PCI     060H-0FFH     4       N/A        -          Reserved for future use
PCI     100H-1FFH     4       P_CONFIG   (*)        Configuration Registers.
* Some registers are R/W. Other registers are read-only.
Ether   00H           4       En_MACC1   R/W        MAC configuration register 1
Ether   04H           4       En_MACC2   R/W        MAC configuration register 2
Ether   08H           4       En_IPGT    R/W        Back-to-Back IPG register
Ether   0CH           4       En_IPGR    R/W        Non Back-to-Back IPG register
Ether   10H           4       En_CLRT    R/W        Collision register
Ether   14H           4       En_LMAX    R/W        Max packet length register
Ether   18H-1CH       -       N/A        -          Reserved for future use
Ether   20H           4       En_RETX    R/W        Retry count register
Ether   24H-50H       -       N/A        -          Reserved for future use
Ether   54H           4       En_LSA2    R/W        Station Address register 2
Ether   58H           4       En_LSA1    R/W        Station Address register 1
Ether   5CH           4       En_PTVR    R          Pause timer value read register
Ether   60H           -       N/A        -          Reserved for future use
Ether   64H           4       En_VLTP    R/W        VLAN type register
Ether   80H           4       En_MIIC    R/W        MII configuration register
Ether   84H-90H       -       N/A        -          Reserved for future use
Ether   94H           4       En_MCMD    W          MII command register
Ether   98H           4       En_MADR    R/W        MII address register
Ether   9CH           4       En_MWTD    R/W        MII write data register
Ether   A0H           4       En_MRDD    R          MII read data register
Ether   A4H           4       En_MIND    R          MII indicator register
Ether   A8H-C4H       -       N/A        -          Reserved for future use
Ether   CCH           4       En_HT1     R/W        Hash table register 1
Ether   D0H           4       En_HT2     R/W        Hash table register 2
Ether   D4H-D8H       -       N/A        -          Reserved for future use
Ether   DCH           4       En_CAR1    R/W        Carry register 1
Ether   E0H           4       En_CAR2    R/W        Carry register 2
Ether   E4H-12CH      -       N/A        -          Reserved for future use
Ether   130H          4       En_CAM1    R/W        Carry mask register 1
Ether   134H          4       En_CAM2    R/W        Carry mask register 2
Ether   138H-13CH     -       N/A        -          Reserved for future use
Ether   140H          4       En_RBYT    R/W        Receive Byte Counter
Ether   144H          4       En_RPKT    R/W        Receive Packet Counter
Ether   148H          4       En_RFCS    R/W        Receive FCS Error Counter
Ether   14CH          4       En_RMCA    R/W        Receive Multicast Packet Counter
Ether   150H          4       En_RBCA    R/W        Receive Broadcast Packet Counter
Ether   154H          4       En_RXCF    R/W        Receive Control Frame Packet Counter
Ether   158H          4       En_RXPF    R/W        Receive PAUSE Frame Packet Counter
Ether   15CH          4       En_RXUO    R/W        Receive Unknown OP code Counter
Ether   160H          4       En_RALN    R/W        Receive Alignment Error Counter
Ether   164H          4       En_RFLR    R/W        Receive Frame Length Out of Range Counter
Ether   168H          4       En_RCDE    R/W        Receive Code Error Counter
Ether   16CH          4       En_RFCR    R/W        Receive False Carrier Counter
Ether   170H          4       En_RUND    R/W        Receive Undersize Packet Counter
Ether   174H          4       En_ROVR    R/W        Receive Oversize Packet Counter
Ether   178H          4       En_RFRG    R/W        Receive Error Undersize Packet Counter
Ether   17CH          4       En_RJBR    R/W        Receive Error Oversize Packet Counter
Ether   180H          4       En_R64     R/W        Receive 64 Byte Frame Counter
Ether   184H          4       En_R127    R/W        Receive 65 to 127 Byte Frame Counter
Ether   188H          4       En_R255    R/W        Receive 128 to 255 Byte Frame Counter
Ether   18CH          4       En_R511    R/W        Receive 256 to 511 Byte Frame Counter
Ether   190H          4       En_R1K     R/W        Receive 512 to 1023 Byte Frame Counter
Ether   194H          4       En_RMAX    R/W        Receive Over 1023 Byte Frame Counter
Ether   198H          4       En_RVBT    R/W        Receive Valid Byte Counter
Ether   1C0H          4       En_TBYT    R/W        Transmit Byte Counter
Ether   1C4H          4       En_TPCT    R/W        Transmit Packet Counter
Ether   1C8H          4       En_TFCS    R/W        Transmit CRC Error Packet Counter
Ether   1CCH          4       En_TMCA    R/W        Transmit Multicast Packet Counter
Ether   1D0H          4       En_TBCA    R/W        Transmit Broadcast Packet Counter
Ether   1D4H          4       En_TUCA    R/W        Transmit Unicast Packet Counter
Ether   1D8H          4       En_TXPF    R/W        Transmit PAUSE control Frame Counter
Ether   1DCH          4       En_TDFR    R/W        Transmit Single Deferral Packet Counter
Ether   1E0H          4       En_TXDF    R/W        Transmit Excessive Deferral Packet Counter
Ether   1E4H          4       En_TSCL    R/W        Transmit Single Collision Packet Counter
Ether   1E8H          4       En_TMCL    R/W        Transmit Multiple collision Packet Counter
Ether   1ECH          4       En_TLCL    R/W        Transmit Late Collision Packet Counter
Ether   1F0H          4       En_TXCL    R/W        Transmit Excessive Collision Packet Counter
Ether   1F4H          4       En_TNCL    R/W        Transmit Total Collision Counter
Ether   1F8H          4       En_TCSE    R/W        Transmit Carrier Sense Error Counter
Ether   1FCH          4       En_TIME    R/W        Transmit Internal MAC Error Counter
Ether   200H          4       En_TXCR    R/W        Transmit Configuration Register
Ether   204H          4       En_TXFCR   R/W        Transmit FIFO Control Register
Ether   208H          4       En_TXDTR   W          Transmit Data Register
Ether   20CH          4       En_TXSR    R          Transmit Status Register
Ether   210H          4       N/A        -          Reserved for future use
Ether   214H          4       En_TXDPR   R/W        Transmit Descriptor Register
Ether   218H          4       En_RXCR    R/W        Receive Configuration Register
Ether   21CH          4       En_RXFCR   R/W        Receive FIFO Control Register
Ether   220H          4       En_RXDTR   R          Receive Data Register
Ether   224H          4       En_RXSR    R          Receive Status Register
Ether   228H          4       N/A        -          Reserved for future use
Ether   22CH          4       En_RXDPR   R/W        Receive Descriptor Register
Ether   230H          4       En_RXPDR   R/W        Receive Pool Descriptor Register
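The receive frame-size counters at offsets 180H to 194H partition frames by length. The C sketch below only illustrates that partition, using the length ranges given in the register descriptions. The function name rx_size_counter is hypothetical, and two points are assumptions because the table does not state them: whether the counted length includes the FCS, and whether frames shorter than 64 bytes are handled only by the undersize counters (the sketch returns NULL for them).

```c
#include <stddef.h>

/* Map a received frame length in bytes to the name of the size-bucket
 * counter that the register table associates with that length range.
 * ASSUMPTION: frames shorter than 64 bytes belong to no size bucket here.
 * rx_size_counter is a hypothetical helper, not a device API. */
const char *rx_size_counter(unsigned frame_len)
{
    if (frame_len < 64)    return NULL;       /* below the smallest bucket */
    if (frame_len == 64)   return "En_R64";   /* 64 bytes */
    if (frame_len <= 127)  return "En_R127";  /* 65 to 127 bytes */
    if (frame_len <= 255)  return "En_R255";  /* 128 to 255 bytes */
    if (frame_len <= 511)  return "En_R511";  /* 256 to 511 bytes */
    if (frame_len <= 1023) return "En_R1K";   /* 512 to 1023 bytes */
    return "En_RMAX";                         /* over 1023 bytes */
}
```

A host-side statistics tool could use such a mapping to label the six counters when it reports a frame-length histogram.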
SYSCNT  00H           4       S_GMR      R/W        General Mode Register
SYSCNT  04H           4       S_GSR      R          General Status Register
SYSCNT  08H           4       S_ISR      RC         Interrupt Status Register
SYSCNT  0CH           4       S_IMR      W          Interrupt Mask Register
SYSCNT  10H           4       S_NSR      R          NMI Status Register
SYSCNT  14H           4       S_NER      R/W        NMI Enable Register
SYSCNT  18H           4       S_VER      R          Version Register
SYSCNT  1CH           4       S_IOR      R/W        IO Port Register
SYSCNT  20H-2FH       -       N/A        -          Reserved for future use
SYSCNT  30H           4       S_WRCR     W          Warm Reset Control Register
SYSCNT  34H           4       S_WRSR     R          Warm Reset Status Register
SYSCNT  38H           4       S_PWCR     W          Power Control Register
SYSCNT  3CH           4       S_PWSR     R          Power Control Status Register
SYSCNT  40H-48H       -       N/A        -          Reserved for future use
SYSCNT  4CH           4       S_ITCNTR   R/W        IBUS Timeout Timer Control Register
SYSCNT  50H           4       S_ITSETR   R/W        IBUS Timeout Timer Set Register
SYSCNT  54H-7FH       -       N/A        -          Reserved for future use
SYSCNT  80H           4       UARTDLL    R/W        UART, Divisor Latch LSB Register [DLAB=1]
SYSCNT  80H           4       UARTRBR    R          UART, Receiver Buffer Register [DLAB=0,READ]
SYSCNT  80H           4       UARTTHR    W          UART, Transmitter Holding Register [DLAB=0,WRITE]
SYSCNT  84H           4       UARTDLM    R/W        UART, Divisor Latch MSB Register [DLAB=1]
SYSCNT  84H           4       UARTIER    R/W        UART, Interrupt Enable Register [DLAB=0]
SYSCNT  88H           4       UARTFCR    W          UART, FIFO control Register [WRITE]
SYSCNT  88H           4       UARTIIR    R          UART, Interrupt ID Register [READ]
SYSCNT  8CH           4       UARTLCR    R/W        UART, Line control Register
SYSCNT  90H           4       UARTMCR    R/W        UART, Modem Control Register
SYSCNT  94H           4       UARTLSR    R/W        UART, Line status Register
SYSCNT  98H           4       UARTMSR    R/W        UART, Modem Status Register
SYSCNT  9CH           4       UARTSCR    R/W        UART, Scratch Register
SYSCNT  A0H           4       DSUCNTR    R/W        DSU Control Register
SYSCNT  A4H           4       DSUSETR    R/W        DSU Dead Time Set Register
SYSCNT  A8H           4       DSUCLRR    W          DSU Clear Register
SYSCNT  ACH           4       DSUTIMR    R/W        DSU Elapsed Time Register
SYSCNT  B0H           4       TMMR       R/W        Timer Mode Register
SYSCNT  B4H           4       TM0CSR     R/W        Timer CH0 Count Set Register
SYSCNT  B8H           4       TM1CSR     R/W        Timer CH1 Count Set Register
SYSCNT  BCH           4       TM0CCR     R          Timer CH0 Current Count Register
SYSCNT  C0H           4       TM1CCR     R          Timer CH1 Current Count Register
SYSCNT  C4H-CFH       -       N/A        -          Reserved for future use
SYSCNT  D0H           4       ECCR       W          EEPROM Command Control Register
SYSCNT  D4H           4       ERDR       R          EEPROM Read Data Register
SYSCNT  D8H       4  MACAR1   R    MAC Address Register 1
SYSCNT  DCH       4  MACAR2   R    MAC Address Register 2
SYSCNT  E0H       4  MACAR3   R    MAC Address Register 3
SYSCNT  E4H-FFH   -  N/A      -    Reserved for future use
SYSCNT  100H      4  RMMDR    R/W  Boot ROM Mode Register
SYSCNT  104H      4  RMATR    R/W  Boot ROM Access Timing Register
SYSCNT  108H      4  SDMDR    R/W  SDRAM Mode Register
SYSCNT  10CH      4  SDTSR    R/W  SDRAM Type Selection Register
SYSCNT  110H      4  SDPTR    R/W  SDRAM Precharge Timing Register
SYSCNT  114H      4  SDRMR    R/W  SDRAM Precharge Mode Register
SYSCNT  118H      4  SDRCR    R    SDRAM Precharge Timer Count Register
SYSCNT  11CH      4  SDRMR    R/W  SDRAM Refresh Mode Register
SYSCNT  120H      4  SDRCR    R    SDRAM Refresh Timer Count Register
SYSCNT  124H      4  MBCR     R/W  Memory Bus Control Register
SYSCNT  128H-FFFH -  N/A      -    Reserved for future use
USB     00H       4  U_GMR    R/W  USB General Mode Register
USB     04H       4  U_VER    R    USB Frame number/Version Register
USB     08H       -  N/A      -    Reserved for future use
USB     0CH       -  N/A      -    Reserved for future use
USB     10H       4  U_GSR1   R    USB General Status Register 1
USB     14H       4  U_IMR1   R/W  USB Interrupt Mask Register 1
USB     18H       4  U_GSR2   R    USB General Status Register 2
USB     1CH       4  U_IMR2   R/W  USB Interrupt Mask Register 2
USB     20H       4  U_EP0CR  R/W  USB EP0 Control Register
USB     24H       4  U_EP1CR  R/W  USB EP1 Control Register
USB     28H       4  U_EP2CR  R/W  USB EP2 Control Register
USB     2CH       4  U_EP3CR  R/W  USB EP3 Control Register
USB     30H       4  U_EP4CR  R/W  USB EP4 Control Register
USB     34H       4  U_EP5CR  R/W  USB EP5 Control Register
USB     38H       4  U_EP6CR  R/W  USB EP6 Control Register
USB     3CH       -  N/A      -    Reserved for future use
USB     40H       4  U_CMR    R/W  USB Command Register
USB     44H       4  U_CA     R/W  USB Command Address Register
USB     48H       4  U_TEPSR  R    USB Tx EndPoint Status Register
USB     4CH       -  N/A      -    Reserved for future use
USB     50H       4  U_RP0IR  R/W  USB Rx Pool0 Information Register
USB     54H       4  U_RP0AR  R    USB Rx Pool0 Address Register
USB     58H       4  U_RP1IR  R/W  USB Rx Pool1 Information Register
USB     5CH       4  U_RP1AR  R    USB Rx Pool1 Address Register
USB     60H       4  U_RP2IR  R/W  USB Rx Pool2 Information Register
USB     64H       4  U_RP2AR  R    USB Rx Pool2 Address Register
USB     68H       -  N/A      -    Reserved for future use
USB     6CH       -  N/A      -    Reserved for future use
USB     70H       4  U_TMSA   R/W  USB Tx MailBox Start Address Register
USB     74H       4  U_TMBA   R/W  USB Tx MailBox Bottom Address Register
USB     78H       4  U_TMRA   R/W  USB Tx MailBox Read Address Register
USB     7CH       4  U_TMWA   R    USB Tx MailBox Write Address Register
USB     80H       4  U_RMSA   R/W  USB Rx MailBox Start Address Register
USB     84H       4  U_RMBA   R/W  USB Rx MailBox Bottom Address Register
USB     88H       4  U_RMRA   R/W  USB Rx MailBox Read Address Register
USB     8CH       4  U_RMWA   R    USB Rx MailBox Write Address Register
USB     90H-FFH   -  N/A      -    Reserved for future use
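The register list shows several UART registers sharing one offset, selected by DLAB and by the access direction (for example, offset 80H is UARTDLL when DLAB=1, and UARTRBR on read or UARTTHR on write when DLAB=0). The following is a minimal sketch of that banking as a software model, not driver code. It assumes that DLAB is bit 7 of UARTLCR and that the divisor is clock / (16 × baud), as on 16550-style UARTs; neither is stated in this table. The names uart_model, uart_write, uart_read, uart_divisor and uart_set_divisor are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets within the System Controller block, from the register list. */
enum { UART_OFF_80 = 0x80, UART_OFF_84 = 0x84, UART_OFF_LCR = 0x8C };

/* Assumption: DLAB is bit 7 of UARTLCR (16550 convention). */
#define LCR_DLAB 0x80u

/* Software model of the registers that share offsets 80H and 84H. */
struct uart_model {
    uint8_t dll, dlm, rbr, thr, ier, lcr;
};

/* Write: the DLAB bit decides which register an offset selects. */
static void uart_write(struct uart_model *u, uint32_t off, uint8_t v)
{
    int dlab = (u->lcr & LCR_DLAB) != 0;
    switch (off) {
    case UART_OFF_80:  if (dlab) u->dll = v; else u->thr = v; break;
    case UART_OFF_84:  if (dlab) u->dlm = v; else u->ier = v; break;
    case UART_OFF_LCR: u->lcr = v; break;
    default: assert(!"offset not modeled");
    }
}

/* Read: offset 80H returns UARTRBR when DLAB=0, UARTDLL when DLAB=1. */
static uint8_t uart_read(const struct uart_model *u, uint32_t off)
{
    int dlab = (u->lcr & LCR_DLAB) != 0;
    switch (off) {
    case UART_OFF_80:  return dlab ? u->dll : u->rbr;
    case UART_OFF_84:  return dlab ? u->dlm : u->ier;
    case UART_OFF_LCR: return u->lcr;
    default: assert(!"offset not modeled"); return 0;
    }
}

/* Assumption: 16550-style divisor, clock / (16 * baud). */
static uint16_t uart_divisor(uint32_t clk_hz, uint32_t baud)
{
    assert(baud != 0);
    return (uint16_t)(clk_hz / (16u * baud));
}

/* Set DLAB, write both divisor latches, then restore UARTLCR. */
static void uart_set_divisor(struct uart_model *u, uint16_t div)
{
    uint8_t lcr = u->lcr;
    uart_write(u, UART_OFF_LCR, (uint8_t)(lcr | LCR_DLAB));
    uart_write(u, UART_OFF_80, (uint8_t)(div & 0xFFu));
    uart_write(u, UART_OFF_84, (uint8_t)(div >> 8));
    uart_write(u, UART_OFF_LCR, lcr);
}
```

With an 18.432 MHz UART clock, uart_divisor(18432000, 115200) evaluates to 10 under the assumed formula.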
Using a 32-bit address, the processor physical address space encompasses 4 Gbytes. VR4120A uses this 4-Gbyte
physical address space as shown in the following figure.
Figure 1-10. Memory Map

Address range              Region                             Size
FFFF_FFFFH - 2000_0000H    Mirror of 0000_0000H - 1FFF_FFFFH
1FFF_FFFFH - 1F00_0000H    Boot ROM/Flash (Note 1)            16 MB
1EFF_FFFFH - 1030_0000H    RFU
102F_FFFFH - 1010_0000H    PCI Controller (For PCI Window)    2 MB
100F_FFFFH - 1002_0000H    RFU
1001_FFFFH - 1001_0000H    ATM Cell Processor                 64 KB
1000_FFFFH - 1000_5000H    RFU
1000_4FFFH - 1000_4000H    PCI Controller (For Register)      4 KB
1000_3FFFH - 1000_3000H    Ethernet Controller #2             4 KB
1000_2FFFH - 1000_2000H    Ethernet Controller #1             4 KB
1000_1FFFH - 1000_1000H    USB Controller                     4 KB
1000_0FFFH - 1000_0000H    System Controller                  4 KB
0FFF_FFFFH - 0000_0000H    SDRAM (Note 2)                     256 MB

The figure also marks an IBUS Target Address Range over the internal controller blocks.

Notes 1. Actual size of PROM/Flash is max. 8 MB.
         Configuration:
         1 MB: 1FCF_FFFFH-1FC0_0000H
         2 MB: 1FDF_FFFFH-1FC0_0000H
         4 MB: 1FFF_FFFFH-1FC0_0000H
         8 MB: 1FFF_FFFFH-1F80_0000H
      2. Actual size of SDRAM is max. 32 MB.
         Configuration:
         04 MB: 003F_FFFFH-0000_0000H
         08 MB: 007F_FFFFH-0000_0000H
         16 MB: 00FF_FFFFH-0000_0000H
         32 MB: 01FF_FFFFH-0000_0000H
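As an illustration of how software might use the memory map, the following sketch looks up the region that contains a physical address. It is an assumption-laden example rather than part of the device specification: the order of the 4-Kbyte controller blocks (System Controller, USB, Ethernet #1, Ethernet #2, PCI registers, counting up from 1000_0000H) is inferred from the figure labels, the mirror is modeled by masking to the low 29 address bits, and the names region, memory_map, unmirror, find_region and region_size are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry of the physical memory map (bounds are inclusive). */
struct region {
    uint32_t start;
    uint32_t end;
    const char *name;
};

/* Regions read from Figure 1-10. The order of the 4-Kbyte blocks is an
   assumption of this sketch. Ranges not listed are RFU. */
static const struct region memory_map[] = {
    { 0x00000000u, 0x0FFFFFFFu, "SDRAM" },
    { 0x10000000u, 0x10000FFFu, "System Controller" },
    { 0x10001000u, 0x10001FFFu, "USB Controller" },
    { 0x10002000u, 0x10002FFFu, "Ethernet Controller #1" },
    { 0x10003000u, 0x10003FFFu, "Ethernet Controller #2" },
    { 0x10004000u, 0x10004FFFu, "PCI Controller (registers)" },
    { 0x10010000u, 0x1001FFFFu, "ATM Cell Processor" },
    { 0x10100000u, 0x102FFFFFu, "PCI Controller (PCI window)" },
    { 0x1F000000u, 0x1FFFFFFFu, "Boot ROM/Flash" },
};

/* Assumption: addresses at 2000_0000H and above mirror the low 512 MB. */
static uint32_t unmirror(uint32_t pa)
{
    return pa & 0x1FFFFFFFu;
}

/* Return the region containing pa, or NULL for an RFU range. */
static const struct region *find_region(uint32_t pa)
{
    uint32_t a = unmirror(pa);
    for (size_t i = 0; i < sizeof memory_map / sizeof memory_map[0]; i++) {
        if (a >= memory_map[i].start && a <= memory_map[i].end)
            return &memory_map[i];
    }
    return NULL;
}

/* Size of a region in bytes. */
static uint32_t region_size(const struct region *r)
{
    assert(r != NULL);
    return r->end - r->start + 1u;
}
```

A lookup such as find_region(0x1FC00000) falls in the Boot ROM/Flash window, while an address in an RFU gap returns NULL.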
1.10 Reset Configuration
The falling edge of the Clock Control Unit (CCU)'s reset line (RST_B) serves as the µPD98502's internal reset. The
System Controller generates the IBUS reset signal using RST_B for the global reset of the µPD98502. After 4 IBUS
clock (SDCLK) cycles, the System Controller deasserts the IBUS reset signal synchronously with the IBUS clock (66 MHz).
The System Controller also generates the internal Cold Reset signal and Hot Reset signal for performing the cold reset
of the VR4120A. Once power to the µPD98502 is established, the System Controller asserts the internal CLKSET signal,
the internal Cold Reset (COLDRST#) signal, and the internal Hot Reset (HOTRST#) signal at the falling edge of the RST_B
signal. Two VR4120A clock (internal VCLOCK) cycles after the rising edge of RST_B, the System Controller deasserts
the CLKSET signal synchronously with "clkm". Then, 16 "clkm" cycles (see section 1.12) after the rising edge of the
RST_B signal, the System Controller deasserts COLDRST# synchronously with "clkm". The System Controller also
deasserts HOTRST# synchronously with "clkm", 16 "clkm" clock cycles after the deassertion of COLDRST#.
Figure 1-11. Reset Configuration
(Block diagram of the µPD98502 reset connections. The external RESET input is labeled reset and cresetb inside the
device. Signals shown between the System Controller and each block on the IBUS:
  USB Controller:          ibrset, usbwrst, usbrdy
  Ethernet Controller #1:  ibrset, macwrst, macrdy
  Ethernet Controller #2:  ibrset, mac2wrst, mac2rdy
  PCI Controller:          ibrset, pciwrst, pcirdy
  ATM Cell Processor:      atmwrst, atmrdy
  VR4120A RISC Processor Core: CLKSET, COLDRST#, HOTRST#
Other blocks and interfaces shown: Boot ROM, SDRAM, UART, USB, MII (two), PCI, UTOPIA2, PHY-MGR.)
1.11 Interrupts
The controller supports maskable interrupts and non-maskable interrupts (NMI) to the CPU.
Figure 1-12. Interrupt Signal Connection
(Block diagram. Interrupt sources: ATM Cell Processor, USB Controller, Ethernet Controller #1, Ethernet Controller #2,
PCI Controller, DSU, UART, TIMER, and the external pins EXTNMI and EXTINT. Inside the System Controller, the BUS-IF
blocks and the S_NSR, S_NER, S_ISR, and S_IMR registers are shown. Outputs to the VR4120A: nmib and intb[0] to intb[4].)
1.12 Clock Control Unit
This section describes how the µPD98502's internal clocks are supplied by the Clock Control Unit (CCU), as shown in the following figure.
Figure 1-13. Block Diagram of Clock Control Unit
(Block diagram of the CCU (Clock Control Unit). Recoverable labels:
  Inputs: SCLK (33 MHz), USBCLK (12 MHz), MITCLK (25 MHz) and MIRCLK (25 MHz) for each Ethernet Controller,
          PCICLK (33 MHz), URTCLK (18.432 MHz)
  Inside the CCU: PLL (x6), dividers 1/2, 1/3, 1/4, 1/8, SEL, and one CLOCK ENABLER per block
  Stop signals: pcistop, mac2stop, macstop, usbstop, atmstop
  Clock destinations: ATM Cell Processor (33/25/16.5 MHz, 66 MHz), USB Controller (48 MHz, 66 MHz),
          Ethernet Controller #1 and #2 (25 MHz, 25 MHz, 66 MHz each), PCI Controller (66 MHz), IBUS (66 MHz),
          UART, Peripheral (25/16.7 MHz), System Controller (100/66 MHz), VR4120A (100/66 MHz))
CHAPTER 2 VR4120A
This chapter describes the operation of the VR4120A RISC Processor Core (MIPS instructions, pipeline, etc.). In the rest
of this document, the VR4120A RISC Processor Core is referred to simply as "VR4120A" or "VR4120A Core".
Caution The µPD98502 doesn't support MIPS16 instructions.
2.1 Overview for VR4120A
Figure 2-1 shows the internal block diagram of the VR4120A core.
In addition to the conventional high-performance integer operation units, this CPU core has a fully associative
translation lookaside buffer (TLB) with 32 entries, each of which maps a pair of pages (odd and even).
It also has an instruction cache, a data cache, and a bus interface.
Figure 2-1. VR4120A Core Internal Block Diagram
(Block diagram. The VR4120A Core contains the CPU, CP0, TLB, Instruction Cache (16 Kbyte), Data Cache (8 Kbyte),
Bus Interface, and Clock Generator, connected by the internal VA bus and ID bus. The Bus Interface connects to the
System Controller through Control(o), Control(i), Address/Data(o), and Address/Data(i).)
2.1.1 Internal block configuration
2.1.1.1 CPU
The CPU has hardware resources to process integer instructions: a 64-bit register file, a 64-bit integer data
bus, and a multiply-and-accumulate operation unit.
2.1.1.2 Coprocessor 0 (CP0)
CP0 incorporates a memory management unit (MMU) and an exception handling function. While translating addresses,
the MMU checks whether an access crosses between different memory segments (user, supervisor, and kernel).
The translation lookaside buffer (TLB) converts virtual addresses to physical addresses.
2.1.1.3 Instruction cache
The instruction cache employs direct mapping, virtual index, and physical tag. Its capacity is 16 Kbytes.
2.1.1.4 Data cache
The data cache employs direct mapping, virtual index, physical tag, and write back. Its capacity is 8 Kbytes.
2.1.1.5 CPU bus interface
The CPU bus interface controls data transmission/reception between the VR4120A and the BCU, which is one of the
peripheral units. The VR4120A interface consists of two 32-bit multiplexed address/data buses (one for input and
one for output), clock signals, and control signals such as interrupts.
2.1.2 VR4120A registers
The VR4120A has the following registers.
• General-purpose registers (GPR): 64 bits × 32
In addition, the processor provides the following special registers:
• 64-bit Program Counter (PC)
• 64-bit HI register, containing the integer multiply and divide upper doubleword result
• 64-bit LO register, containing the integer multiply and divide lower doubleword result
Two of the general-purpose registers have assigned functions:
• r0 is hardwired to a value of zero, and can be used as the target register for any instruction whose result is to
  be discarded. r0 can also be used as a source when a zero value is needed.
• r31 is the link register used by link instructions, such as JAL (Jump and Link). This register can also be
  used by other instructions. However, take care that use of the register by a link instruction does not conflict
  with its use for other operations.
A register group is also provided within CP0 (the system control coprocessor) to process exceptions and to manage
addresses.
CPU registers can operate as either 32-bit or 64-bit registers, depending on the VR4120A processor operation
mode.
Figure 2-2 shows the CPU registers.
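Figure 2-2. VR4120A Registers
(Register diagram. General-purpose registers r0 (= 0), r1, r2, ..., r29, r30, r31 (= LinkAddress), each with
bits 63 to 0. Multiply/divide registers HI and LO, each with bits 63 to 0. Program Counter PC, bits 63 to 0.
Each register is drawn with its upper half as bits 63 to 32 and its lower half as bits 31 to 0.)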
The VR4120A has no Program Status Word (PSW) register as such; this is covered by the Status and Cause
registers incorporated within the System Control Coprocessor (CP0).
The CP0 registers are used for exception handling or address management. The overview of these registers is
described in 2.1.5 Coprocessors (CP0).
2.1.3 VR4120A instruction set overview
The CPU supports only one type of instruction: the 32-bit length instruction (MIPS III).
2.1.3.1 MIPS III instruction
All the CPU instructions are 32-bit length when executing MIPS III instructions, and they are classified into three
instruction formats as shown in Figure 2-3: immediate (I-type), jump (J-type), and register (R-type). The field of each
instruction format is described in Section 2.2 MIPS III Instruction Set Summary.
Figure 2-3. CPU Instruction Formats (32-bit Length Instruction)

I-type (immediate)
  Bits    31-26  25-21  20-16  15-0
  Field   op     rs     rt     immediate

J-type (jump)
  Bits    31-26  25-0
  Field   op     target

R-type (register)
  Bits    31-26  25-21  20-16  15-11  10-6  5-0
  Field   op     rs     rt     rd     sa    funct
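The field positions in the three instruction formats can be sketched in code as follows. This is an illustrative decoder and R-type encoder based only on the bit ranges in Figure 2-3; the names mips_fields, decode and encode_r are hypothetical and do not come from this manual.

```c
#include <assert.h>
#include <stdint.h>

/* All fields of a 32-bit instruction word. Which fields are meaningful
   depends on the format (I-type, J-type, or R-type). */
struct mips_fields {
    unsigned op, rs, rt, rd, sa, funct;
    uint16_t immediate;  /* I-type, bits 15-0 */
    uint32_t target;     /* J-type, bits 25-0 */
};

/* Extract every field from an instruction word. */
static struct mips_fields decode(uint32_t insn)
{
    struct mips_fields f;
    f.op        = (insn >> 26) & 0x3Fu;   /* bits 31-26 */
    f.rs        = (insn >> 21) & 0x1Fu;   /* bits 25-21 */
    f.rt        = (insn >> 16) & 0x1Fu;   /* bits 20-16 */
    f.rd        = (insn >> 11) & 0x1Fu;   /* bits 15-11 */
    f.sa        = (insn >> 6) & 0x1Fu;    /* bits 10-6  */
    f.funct     = insn & 0x3Fu;           /* bits 5-0   */
    f.immediate = (uint16_t)(insn & 0xFFFFu);
    f.target    = insn & 0x03FFFFFFu;
    return f;
}

/* Assemble an R-type instruction word from its six fields. */
static uint32_t encode_r(unsigned op, unsigned rs, unsigned rt,
                         unsigned rd, unsigned sa, unsigned funct)
{
    assert(op < 64 && rs < 32 && rt < 32 && rd < 32 && sa < 32 && funct < 64);
    return ((uint32_t)op << 26) | ((uint32_t)rs << 21) |
           ((uint32_t)rt << 16) | ((uint32_t)rd << 11) |
           ((uint32_t)sa << 6) | (uint32_t)funct;
}
```

For example, encode_r(0, 1, 2, 3, 0, 0x21) yields 0x00221821, and decoding that word returns the same six fields.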
The instruction set can be further divided into the following five groupings:
(a) Load and store instructions move data between memory and general-purpose registers. They are all
immediate (I-type) instructions, since the only addressing mode supported is base register plus 16-bit, signed
immediate offset.
(b) Computational instructions perform arithmetic, logical, shift, and multiply and divide operations on values in
registers. They include R-type (in which both the operands and the result are stored in registers) and I-type
(in which one operand is a 16-bit signed immediate value) formats.
(c) Jump and branch instructions change the control flow of a program. Jumps are always made to an absolute
address formed by combining a 26-bit target address with the high-order bits of the Program Counter (J-type
format) or register address (R-type format). The format of the branch instructions is I type. Branches have
16-bit offsets relative to the Program Counter. JAL instructions save their return address in register 31.
(d) Coprocessor 0 (System Control Coprocessor, CP0) instructions perform operations on CP0 registers to
control the memory-management and exception-handling facilities of the processor.
(e) Special instructions perform system calls and breakpoint operations, or cause a branch to the general
exception-handling vector based upon the result of a comparison. These instructions occur in both R-type
and I-type formats.
For the operation of each instruction, refer to Section 2.2 MIPS III Instruction Set Summary and APPENDIX A
MIPS III INSTRUCTION SET DETAILS.
For the µPD98502, byte ordering within all of the larger data formats (halfword, word, doubleword) can be
configured in either big-endian or little-endian order.
Endianness refers to the location of byte 0 within the multi-byte data structure.
When configured as a little-endian system, byte 0 is always the least-significant (rightmost) byte, which is
compatible with iAPX™ and DEC VAX™ conventions. Figures 2-4 and 2-5 show this configuration.
Figure 2-4. Little-Endian Byte Ordering in Word Data

                 Word      Bit No.
                 address   31-24  23-16  15-8  7-0
High-order       12        15     14     13    12
address          8         11     10     9     8
                 4         7      6      5     4
Low-order        0         3      2      1     0
address

(Table entries are byte numbers.)
Remarks 1. The lowest byte is the lowest address.
        2. The address of word data is specified by the lowest byte's address.

Figure 2-5. Little-Endian Byte Ordering in Double Word Data

                 Double word  Bit No.
                 address      63-56  55-48  47-40  39-32  31-24  23-16  15-8  7-0
High-order       16           23     22     21     20     19     18     17    16
address          8            15     14     13     12     11     10     9     8
Low-order        0            7      6      5      4      3      2      1     0
address

(Table entries are byte numbers. Within a doubleword, a word occupies bits 63-32 or 31-0, a halfword
occupies 16 bits such as 63-48, 47-32, 31-16, or 15-0, and a byte occupies 8 bits.)
Remarks 1. The lowest byte is the lowest address.
        2. The address of word data is specified by the lowest byte's address.
The CPU core uses the following byte boundaries for halfword, word, and doubleword accesses:
Halfword: An even byte boundary (0, 2, 4...)
Word: A byte boundary divisible by four (0, 4, 8...)
Doubleword: A byte boundary divisible by eight (0, 8, 16...)
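The byte-ordering and boundary rules above can be sketched as two small helpers. This is an illustration only: word_byte returns the byte found at byte address n within a word under either ordering, and is_aligned checks the halfword, word, and doubleword boundaries listed above. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Return the byte at byte address n (0 to 3) within a 32-bit word.
   Little-endian: byte 0 is the least-significant byte.
   Big-endian:    byte 0 is the most-significant byte. */
static uint8_t word_byte(uint32_t word, unsigned n, int little_endian)
{
    assert(n < 4);
    unsigned shift = little_endian ? 8u * n : 8u * (3u - n);
    return (uint8_t)(word >> shift);
}

/* Check the boundary rule for an access of size 2 (halfword),
   4 (word), or 8 (doubleword) bytes. */
static int is_aligned(uint64_t addr, unsigned size)
{
    assert(size == 2 || size == 4 || size == 8);
    return (addr & (size - 1u)) == 0;
}
```

For the word 0x11223344, byte 0 is 0x44 in little-endian order and 0x11 in big-endian order.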
The following special instructions load and store data that are not aligned on 4-byte (word) or 8-byte
(doubleword) boundaries:
LWL  LWR  SWL  SWR
LDL  LDR  SDL  SDR
These instructions are used in pairs to provide an access to misaligned data. Accessing misaligned data incurs
one additional instruction cycle over that required for accessing aligned data.
Figure 2-6 shows the access of a misaligned word that has byte address 3 for the little-endian conventions.
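Figure 2-6. Misaligned Word Accessing (Little-Endian)

                 Bit No.
                 31-24  23-16  15-8  7-0
High-order       -      6      5     4
address
Low-order        3      -      -     -
address

(Table entries are byte numbers. The misaligned word occupies bytes 3, 4, 5, and 6.)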
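The paired use of LWR and LWL for a misaligned word can be modeled in software as follows. This is a sketch of the little-endian behavior as generally defined for the MIPS architecture, not a description of the VR4120A pipeline. The helper names load_aligned_le, lwr_le, lwl_le and load_word_unaligned_le are hypothetical, and the caller is assumed to guarantee that both aligned words touched by the access lie inside the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read the aligned word that contains addr from little-endian memory. */
static uint32_t load_aligned_le(const uint8_t *mem, size_t addr)
{
    size_t base = addr & ~(size_t)3;
    assert(mem != NULL);
    return (uint32_t)mem[base] |
           ((uint32_t)mem[base + 1] << 8) |
           ((uint32_t)mem[base + 2] << 16) |
           ((uint32_t)mem[base + 3] << 24);
}

/* LWR (little-endian): merge the bytes from addr up to the end of its
   aligned word into the low-order part of reg. */
static uint32_t lwr_le(uint32_t reg, const uint8_t *mem, size_t addr)
{
    unsigned shift = (unsigned)(addr & 3) * 8u;
    uint32_t mask = 0xFFFFFFFFu >> shift;
    return (reg & ~mask) | (load_aligned_le(mem, addr) >> shift);
}

/* LWL (little-endian): merge the bytes from the start of the aligned
   word up to addr into the high-order part of reg. */
static uint32_t lwl_le(uint32_t reg, const uint8_t *mem, size_t addr)
{
    unsigned shift = (3u - (unsigned)(addr & 3)) * 8u;
    uint32_t mask = 0xFFFFFFFFu << shift;
    return (reg & ~mask) | (load_aligned_le(mem, addr) << shift);
}

/* Misaligned word load: LWR at addr, then LWL at addr + 3. */
static uint32_t load_word_unaligned_le(const uint8_t *mem, size_t addr)
{
    uint32_t r = 0;
    r = lwr_le(r, mem, addr);
    r = lwl_le(r, mem, addr + 3);
    return r;
}
```

With byte address 3, as in Figure 2-6, the LWR step supplies byte 3 and the LWL step supplies bytes 4, 5, and 6.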
2.1.5 Coprocessors (CP0)
The MIPS ISA defines four coprocessors (CP0 to CP3).
• CP0 translates virtual addresses to physical addresses, switches the operating mode (kernel, supervisor, or
  user mode), and manages exceptions. It also controls the cache subsystem to analyze a cause and to return
  from the error state.
• CP1 is reserved for floating-point instructions.
• CP2 is reserved for future definition by MIPS.
• CP3 is no longer defined. CP3 instructions are reserved for future extensions.
Figure 2-7 shows the definitions of the CP0 register, and Table 2-1 shows simple descriptions of each register. For
the detailed descriptions of the registers related to the virtual system memory, refer to Section 2.4 Memory
Management System. For the detailed descriptions of the registers related to exception handling, refer to Section
2.5 Exception Processing.
Figure 2-7. CP0 Registers

Register No.  Register name       Register No.  Register name
0             Index (Note 1)      16            Config (Note 1)
1             Random (Note 1)     17            LLAddr (Note 1)
2             EntryLo0 (Note 1)   18            WatchLo (Note 2)
3             EntryLo1 (Note 1)   19            WatchHi (Note 2)
4             Context (Note 2)    20            XContext (Note 2)
5             PageMask (Note 1)   21            RFU
6             Wired (Note 1)      22            RFU
7             RFU                 23            RFU
8             BadVAddr (Note 1)   24            RFU
9             Count (Note 2)      25            RFU
10            EntryHi (Note 1)    26            PErr (Note 2)
11            Compare (Note 2)    27            CacheErr (Note 2)
12            Status (Note 2)     28            TagLo (Note 1)
13            Cause (Note 2)      29            TagHi (Note 1)
14            EPC (Note 2)        30            ErrorEPC (Note 2)
15            PRId (Note 1)       31            RFU

Notes 1. for Memory management
      2. for Exception handling
Remark RFU: Reserved for future use
Table 2-1. System Control Coprocessor (CP0) Register Definitions

Register Number  Register Name  Description
0                Index          Programmable pointer to TLB array
1                Random         Pseudo-random pointer to TLB array (read only)
2                EntryLo0       Low half of TLB entry for even VPN
3                EntryLo1       Low half of TLB entry for odd VPN
4                Context        Pointer to kernel virtual PTE in 32-bit mode
5                PageMask       TLB page mask
6                Wired          Number of wired TLB entries
7                -              Reserved for future use
8                BadVAddr       Virtual address where the most recent error occurred
Doubleword Multiply    DMULT rs, rt
    The contents of registers rt and rs are multiplied, treating both operands as signed integers.
    The 128-bit result is stored into special registers HI and LO.
Doubleword Multiply Unsigned    DMULTU rs, rt
    The contents of registers rt and rs are multiplied, treating both operands as unsigned integers.
    The 128-bit result is stored into special registers HI and LO.
Doubleword Divide    DDIV rs, rt
    The contents of register rs are divided by that of register rt, treating both operands as signed integers.
    The 64-bit quotient is stored into special register LO, and the 64-bit remainder is stored into special
    register HI.
Doubleword Divide Unsigned    DDIVU rs, rt
    The contents of register rs are divided by that of register rt, treating both operands as unsigned
    integers.
    The 64-bit quotient is stored into special register LO, and the 64-bit remainder is stored into special
    register HI.
Multiply and Add Accumulate    MACC{h}{u}{s} rd, rs, rt
    The contents of registers rt and rs are multiplied, treating both operands as 32-bit signed integers. The
    result is added to the combined value of special registers HI and LO. The 64-bit result is stored into
    special registers HI and LO.
    If h=0, the same data as that stored in register LO is also stored in register rd; if h=1, the same data as
    that stored in register HI is also stored in register rd.
    If u is specified, the operand is treated as unsigned data.
    If s is specified, registers rs and rd are treated as a 16-bit value (32 bits sign- or zero-extended), and
    the value obtained by combining registers HI and LO is treated as a 32-bit value (64 bits sign- or
    zero-extended). Moreover, saturation processing is performed for the operation result in the format
    specified with u.
Doubleword Multiply and Add Accumulate    DMACC{h}{u}{s} rd, rs, rt
    The contents of registers rt and rs are multiplied, treating both operands as 32-bit signed integers. The
    result is added to the value of special register LO. The 64-bit result is stored into special register LO.
    If h=0, the same data as that stored in register LO is also stored in register rd; if h=1, undefined data is
    stored in register rd.
    If u is specified, the operand is treated as unsigned data.
    If s is specified, registers rs and rd are treated as a 16-bit value (32 bits sign- or zero-extended), and
    register LO is treated as a 32-bit value (64 bits sign- or zero-extended). Moreover, saturation
    processing is performed for the operation result in the format specified with u.

Instruction format:  op | rs | rt | rd | sa | funct
MFHI and MFLO instructions after a multiply or divide instruction generate interlocks to delay execution of the next
instruction, inhibiting the result from being read until the multiply or divide instruction completes.
Table 2-14 gives the number of processor cycles (PCycles) required to resolve interlock or stall between various
multiply or divide instructions and a subsequent MFHI or MFLO instruction.
Table 2-14. Number of Stall Cycles in Multiply and Divide Instructions

Instruction  Number of Instruction Cycles
MULT         1
MULTU        1
DIV          36
DIVU         36
DMULT        3
DMULTU       3
DDIV         68
DDIVU        68
MACC         0
DMACC        0
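The way doubleword multiply and divide results are split across HI and LO can be illustrated with a small software model. This is a sketch, not the hardware algorithm: dmultu builds the 128-bit unsigned product from 32-bit partial products, and ddiv places the quotient in LO and the remainder in HI. The function names are hypothetical, and the model simply reports failure for a zero divisor or the overflowing case, whose results this section does not specify.

```c
#include <assert.h>
#include <stdint.h>

/* Model of DMULTU: 64 x 64 -> 128-bit unsigned product.
   HI receives the upper 64 bits, LO the lower 64 bits. */
static void dmultu(uint64_t rs, uint64_t rt, uint64_t *hi, uint64_t *lo)
{
    uint64_t a_lo = rs & 0xFFFFFFFFu, a_hi = rs >> 32;
    uint64_t b_lo = rt & 0xFFFFFFFFu, b_hi = rt >> 32;
    uint64_t p0 = a_lo * b_lo;
    uint64_t p1 = a_lo * b_hi;
    uint64_t p2 = a_hi * b_lo;
    uint64_t p3 = a_hi * b_hi;
    /* Sum of the three terms that land in bits 32-63, with carries kept. */
    uint64_t mid = (p0 >> 32) + (p1 & 0xFFFFFFFFu) + (p2 & 0xFFFFFFFFu);

    assert(hi != 0 && lo != 0);
    *lo = (mid << 32) | (p0 & 0xFFFFFFFFu);
    *hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
}

/* Model of DDIV: signed quotient to LO, remainder to HI.
   Returns -1 for cases this sketch does not model. */
static int ddiv(int64_t rs, int64_t rt, int64_t *hi, int64_t *lo)
{
    assert(hi != 0 && lo != 0);
    if (rt == 0 || (rs == INT64_MIN && rt == -1))
        return -1;
    *lo = rs / rt;   /* quotient, truncated toward zero */
    *hi = rs % rt;   /* remainder, same sign as rs */
    return 0;
}
```

Software running on the core would read these results with MFHI and MFLO, which is where the stall cycles in Table 2-14 apply.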
2.2.2.3 Jump and branch instructions
Jump and branch instructions change the control flow of a program. All jump and branch instructions occur with a
delay of one instruction: that is, the instruction immediately following the jump or branch instruction (this is known as
the instruction in the delay slot) always executes while the target instruction is being fetched from memory.
For instructions involving a link (such as JAL and BLTZAL), the return address is saved in register r31.
Table 2-15. Number of Delay Slot Cycles in Jump and Branch Instructions

Instruction         Necessary Number of Cycles
Branch instruction  1
Jump instruction    1
(1) Overview of jump instructions
Subroutine calls in high-level languages are usually implemented with J or JAL instructions, both of which are J-
type instructions. In J-type format, the 26-bit target address shifts left 2 bits and combines with the high-order 4
bits of the current program counter to form a 32-bit or 64-bit absolute address.
Returns, dispatches, and cross-page jumps are usually implemented with the JR or JALR instructions. Both are
R-type instructions that take the 32-bit or 64-bit byte address contained in one of the general registers.
For more information, refer to APPENDIX A MIPS III INSTRUCTION SET DETAILS.
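The J-type address formation described above can be sketched for the 32-bit case as follows. This is an illustration under stated assumptions: the function names are hypothetical, the pc argument stands for the program counter value whose high-order four bits are used, and link_address models the return address saved by JAL as the address of the instruction following the delay slot.

```c
#include <assert.h>
#include <stdint.h>

/* Form a 32-bit jump address: the 26-bit target shifted left by 2 bits,
   combined with the high-order 4 bits of the program counter. */
static uint32_t jump_target32(uint32_t pc, uint32_t target26)
{
    assert(target26 < (1u << 26));
    return (pc & 0xF0000000u) | (target26 << 2);
}

/* Return address saved by a link instruction located at jump_pc:
   the instruction after the delay slot. */
static uint32_t link_address(uint32_t jump_pc)
{
    return jump_pc + 8u;
}
```

Because only the low 28 bits come from the instruction, a J or JAL can reach targets only within the current 256-Mbyte-aligned region, which is why cross-page jumps use JR or JALR.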
(2) Overview of branch instructions
A branch instruction has a PC-relative signed 16-bit offset.
Tables 2-16 through 2-18 show the lists of Jump, Branch, and Expanded ISA instructions, respectively.
Table 2-16. Jump Instruction

Instruction    Format and Description

Jump    J target
    The contents of the 26-bit target address are shifted left by two bits and combined with the high-order four
    bits of the PC. The program jumps to this calculated address with a delay of one instruction.
Jump And Link    JAL target
    The contents of the 26-bit target address are shifted left by two bits and combined with the high-order four
    bits of the PC. The program jumps to this calculated address with a delay of one instruction. The
    address of the instruction following the delay slot is stored into r31 (link register).
    Instruction format:  op | target

Jump And Link Exchange    JALX target
    The contents of the 26-bit target address are shifted left by two bits and combined with the high-order four
    bits of the PC. The program jumps to this calculated address with a delay of one instruction, and then
    the ISA mode bit is reversed. The address of the instruction following the delay slot is stored into r31
    (link register).
    Instruction format:  op | target

Jump Register    JR rs
    The program jumps to the address specified in register rs with a delay of one instruction.
Jump And Link Register    JALR rs, rd
    The program jumps to the address specified in register rs with a delay of one instruction.
    The address of the instruction following the delay slot is stored into rd.
    Instruction format:  op | rs | rt | rd | sa | funct
There are the following common restrictions for Tables 2-17 and 2-18.
(3) Branch address
All branch instruction target addresses are computed by adding the address of the instruction in the delay slot to
the 16-bit offset (shifted left by 2 bits and sign-extended to 64 bits). All branches occur with a delay of one
instruction.
(4) Operation when unbranched (Table 2-18)
If the branch condition is not met when executing a branch likely instruction, the instruction in its delay slot is nullified.
For all other branch instructions, the instruction in its delay slot is unconditionally executed.
Remark The target instruction of the branch is fetched at the EX stage of the branch instruction. Comparison of
the operands of the branch instruction and calculation of the target address is performed at phase 2 of
the RF stage and phase 1 of the EX stage of the instruction. Branch instructions require one cycle of
the branch delay slot defined by the architecture. Jump instructions also require one cycle of delay slot.
If the branch condition is not satisfied in a branch likely instruction, the instruction in its delay slot is
nullified.
There are special symbols used in the instruction formats of Tables 2-17 through 2-21.
Table 2-17. Branch Instructions

Instruction    Format and Description

Branch On Equal    BEQ rs, rt, offset
    If the contents of register rs are equal to that of register rt, the program branches to the target address.
Branch On Not Equal    BNE rs, rt, offset
    If the contents of register rs are not equal to that of register rt, the program branches to the target
    address.
Branch On Less Than Or Equal To Zero    BLEZ rs, offset
    If the contents of register rs are less than or equal to zero, the program branches to the target address.
Branch On Greater Than Zero    BGTZ rs, offset
    If the contents of register rs are greater than zero, the program branches to the target address.
    Instruction format:  op | rs | rt | offset

Branch On Less Than Zero    BLTZ rs, offset
    If the contents of register rs are less than zero, the program branches to the target address.
Branch On Greater Than Or Equal To Zero    BGEZ rs, offset
    If the contents of register rs are greater than or equal to zero, the program branches to the target
    address.
Branch On Less Than Zero And Link    BLTZAL rs, offset
    The address of the instruction that follows the delay slot is stored to register r31 (link register). If the
    contents of register rs are less than zero, the program branches to the target address.
Branch On Greater Than Or Equal To Zero And Link    BGEZAL rs, offset
    The address of the instruction that follows the delay slot is stored to register r31 (link register). If the
    contents of register rs are greater than or equal to zero, the program branches to the target address.
    Instruction format:  REGIMM | rs | sub | offset

Branch On Coprocessor 0 True    BC0T offset
    Adds the 16-bit offset (shifted left by two bits and sign extended to 32 bits) to the address of the
    instruction in the delay slot to calculate the branch target address.
    If the conditional signal of the coprocessor 0 is true, the program branches to the target address with
    one-instruction delay.
Branch On Coprocessor 0 False    BC0F offset
    Adds the 16-bit offset (shifted left by two bits and sign extended to 32 bits) to the address of the
    instruction in the delay slot to calculate the branch target address.
    If the conditional signal of the coprocessor 0 is false, the program branches to the target address with
    one-instruction delay.
    Instruction format:  COP0 | BC | br | offset
Table 2-18. Branch Instructions (Extended ISA)

Instruction    Format and Description

Branch On Equal Likely    BEQL rs, rt, offset
    If the contents of register rs are equal to that of register rt, the program branches to the target address.
    If the branch condition is not met, the instruction in the delay slot is discarded.
Branch On Not Equal Likely    BNEL rs, rt, offset
    If the contents of register rs are not equal to that of register rt, the program branches to the target
    address. If the branch condition is not met, the instruction in the delay slot is discarded.
Branch On Less Than Or Equal To Zero Likely    BLEZL rs, offset
    If the contents of register rs are less than or equal to zero, the program branches to the target address.
    If the branch condition is not met, the instruction in the delay slot is discarded.
Branch On Greater Than Zero Likely    BGTZL rs, offset
    If the contents of register rs are greater than zero, the program branches to the target address. If the
    branch condition is not met, the instruction in the delay slot is discarded.
    Instruction format:  op | rs | rt | offset

Branch On Less Than Zero Likely    BLTZL rs, offset
    If the contents of register rs are less than zero, the program branches to the target address. If the
    branch condition is not met, the instruction in the delay slot is discarded.
Branch On Greater Than Or Equal To Zero Likely    BGEZL rs, offset
    If the contents of register rs are greater than or equal to zero, the program branches to the target
    address. If the branch condition is not met, the instruction in the delay slot is discarded.
Branch On Less Than Zero And Link Likely    BLTZALL rs, offset
    The address of the instruction that follows the delay slot is stored to register r31 (link register).
    If the contents of register rs are less than zero, the program branches to the target address. If the
    branch condition is not met, the instruction in the delay slot is discarded.
Branch On Greater Than Or Equal To Zero And Link Likely    BGEZALL rs, offset
    The address of the instruction that follows the delay slot is stored to register r31 (link register).
    If the contents of register rs are greater than or equal to zero, the program branches to the target
    address. If the branch condition is not met, the instruction in the delay slot is discarded.
    Instruction format:  REGIMM | rs | sub | offset

Branch On Coprocessor 0 True Likely    BC0TL offset
    Adds the 16-bit offset (shifted left by two bits and sign extended to 32 bits) to the address of the
    instruction in the delay slot to calculate the branch target address.
    If the conditional signal of the coprocessor 0 is true, the program branches to the target address with
    one-instruction delay.
    If the branch condition is not met, the instruction in the delay slot is discarded.
Branch On Coprocessor 0 False Likely    BC0FL offset
    Adds the 16-bit offset (shifted left by two bits and sign extended to 32 bits) to the address of the
    instruction in the delay slot to calculate the branch target address.
    If the conditional signal of the coprocessor 0 is false, the program branches to the target address with
    one-instruction delay.
    If the branch condition is not met, the instruction in the delay slot is discarded.
    Instruction format:  COP0 | BC | br | offset
2.2.2.4 Special instructions
Special instructions generate software exceptions. Their formats are R-type (Syscall, Break). The Trap instruction
is available only for the VR4000 Series. All the other instructions are available for all VR Series.
Table 2-19. Special Instructions

Instruction    Format and Description

Synchronize    SYNC
    Completes the load/store instruction executing in the current pipeline before the next load/store
    instruction starts execution.
System Call    SYSCALL
    Generates a system call exception, and then transfers control to the exception handling program.
Breakpoint    BREAK
    Generates a breakpoint exception, and then transfers control to the exception handling program.
Table 2-20. Special Instructions (Extended ISA) (1/2)

Instruction    Format and Description

Trap If Greater Than Or Equal    TGE rs, rt
    The contents of register rs are compared with that of register rt, treating both operands as signed
    integers. If the contents of register rs are greater than or equal to that of register rt, an exception
    occurs.
Trap If Greater Than Or Equal Unsigned    TGEU rs, rt
    The contents of register rs are compared with that of register rt, treating both operands as unsigned
    integers. If the contents of register rs are greater than or equal to that of register rt, an exception
    occurs.
Trap If Less Than    TLT rs, rt
    The contents of register rs are compared with that of register rt, treating both operands as signed
    integers. If the contents of register rs are less than that of register rt, an exception occurs.
Trap If Less Than Unsigned    TLTU rs, rt
    The contents of register rs are compared with that of register rt, treating both operands as unsigned
    integers. If the contents of register rs are less than that of register rt, an exception occurs.
Trap If Equal    TEQ rs, rt
    If the contents of registers rs and rt are equal, an exception occurs.
Trap If Not Equal    TNE rs, rt
    If the contents of registers rs and rt are not equal, an exception occurs.
    Instruction format:  SPECIAL | rs | rt | rd | sa | funct
Preliminary User’s Manual S15543EJ1V0UM
CHAPTER 2 VR4120A
Table 2-20. Special Instructions (Extended ISA) (2/2)
Instruction: Format
    Description
Trap If Greater Than Or Equal Immediate: TGEI rs, immediate
    The contents of register rs are compared with 16-bit sign-extended immediate data, treating both
    operands as signed integers. If the contents of register rs are greater than or equal to the 16-bit
    sign-extended immediate data, an exception occurs.
Trap If Greater Than Or Equal Immediate Unsigned: TGEIU rs, immediate
    The contents of register rs are compared with 16-bit sign-extended immediate data, treating both
    operands as unsigned integers. If the contents of register rs are greater than or equal to the 16-bit
    sign-extended immediate data, an exception occurs.
Trap If Less Than Immediate: TLTI rs, immediate
    The contents of register rs are compared with 16-bit sign-extended immediate data, treating both
    operands as signed integers. If the contents of register rs are less than the 16-bit sign-extended
    immediate data, an exception occurs.
Trap If Less Than Immediate Unsigned: TLTIU rs, immediate
    The contents of register rs are compared with 16-bit sign-extended immediate data, treating both
    operands as unsigned integers. If the contents of register rs are less than the 16-bit sign-extended
    immediate data, an exception occurs.
Trap If Equal Immediate: TEQI rs, immediate
    If the contents of register rs and immediate data are equal, an exception occurs.
Trap If Not Equal Immediate: TNEI rs, immediate
    If the contents of register rs and immediate data are not equal, an exception occurs.
Instruction format fields: REGIMM, rs, sub, immediate
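The trap conditions in Table 2-20 can be summarized in a short Python sketch. It is illustrative only: the 64-bit register width, the function names, and the table `TRAP_CONDITIONS` are assumptions made here, and the immediate forms are modeled as sign-extending the 16-bit immediate before either a signed or an unsigned comparison, as in the MIPS III definition.

```python
MASK64 = (1 << 64) - 1  # assumed register width of 64 bits


def to_signed(value, bits=64):
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


# Condition under which each register-form trap instruction raises an exception.
TRAP_CONDITIONS = {
    "TGE": lambda rs, rt: to_signed(rs) >= to_signed(rt),
    "TGEU": lambda rs, rt: (rs & MASK64) >= (rt & MASK64),
    "TLT": lambda rs, rt: to_signed(rs) < to_signed(rt),
    "TLTU": lambda rs, rt: (rs & MASK64) < (rt & MASK64),
    "TEQ": lambda rs, rt: (rs & MASK64) == (rt & MASK64),
    "TNE": lambda rs, rt: (rs & MASK64) != (rt & MASK64),
}


def trap_taken(mnemonic, rs, rt):
    """Return True if the register-form trap would raise a Trap exception."""
    return TRAP_CONDITIONS[mnemonic](rs, rt)


def trap_imm_taken(mnemonic, rs, imm16):
    """Immediate forms: sign-extend the immediate, then reuse the register form."""
    base = mnemonic.replace("I", "", 1)  # TGEI -> TGE, TGEIU -> TGEU, ...
    extended = to_signed(imm16, 16) & MASK64
    return trap_taken(base, rs, extended)
```

For example, `trap_imm_taken("TGEI", 5, 0xFFFF)` compares 5 with -1 and reports a trap, while `trap_imm_taken("TGEIU", 5, 0xFFFF)` compares 5 with the largest unsigned value and does not.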
2.2.2.5 System control coprocessor (CP0) instructions
System control coprocessor (CP0) instructions perform operations specifically on the CP0 registers to manipulate
the memory management and exception handling facilities of the processor.
Table 2-21. System Control Coprocessor (CP0) Instructions (1/2)
Instruction: Format
    Description
Move To System Control Coprocessor: MTC0 rt, rd
    The word data of general register rt in the CPU are loaded into general register rd in the CP0.
Move From System Control Coprocessor: MFC0 rt, rd
    The word data of general register rd in the CP0 are loaded into general register rt in the CPU.
Doubleword Move To System Control Coprocessor 0: DMTC0 rt, rd
    The doubleword data of general register rt in the CPU are loaded into general register rd in the CP0.
Doubleword Move From System Control Coprocessor 0: DMFC0 rt, rd
    The doubleword data of general register rd in the CP0 are loaded into general register rt in the CPU.
Instruction format fields: COP0, sub, rt, rd, 0
Table 2-21. System Control Coprocessor (CP0) Instructions (2/2)
Instruction: Format
    Description
Read Indexed TLB Entry: TLBR
    The TLB entry indexed by the index register is loaded into the entryHi, entryLo0, entryLo1, and page
    mask registers.
Write Indexed TLB Entry: TLBWI
    The contents of the entryHi, entryLo0, entryLo1, and page mask registers are loaded into the TLB entry
    indexed by the index register.
Write Random TLB Entry: TLBWR
    The contents of the entryHi, entryLo0, entryLo1, and page mask registers are loaded into the TLB entry
    indexed by the random register.
Probe TLB For Matching Entry: TLBP
    The address of the TLB entry that matches the contents of the entryHi register is loaded into the index
    register.
Return From Exception: ERET
    The program returns from an exception, interrupt, or error trap.
STANDBY: STANDBY
    The processor's operating mode shifts from fullspeed mode to standby mode.
SUSPEND: SUSPEND
    The processor's operating mode shifts from fullspeed mode to suspend mode.
HIBERNATE: HIBERNATE
    The processor's operating mode shifts from fullspeed mode to hibernate mode.
Instruction format fields (TLB, ERET, and power mode instructions): COP0, CO, funct
Cache Operation: Cache op, offset (base)
    The 16-bit offset is sign-extended to 32 bits and added to the contents of the register base to form a
    virtual address. This virtual address is translated to a physical address with the TLB. The cache
    operation indicated by the 5-bit sub-opcode is performed for this physical address.
Instruction format fields (CACHE instruction): CACHE, base, op, offset
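The address formation described for the Cache instruction can be sketched in Python as follows. The function names are hypothetical, and the bit positions used to split the instruction word (base in bits 25 to 21, op in bits 20 to 16, offset in bits 15 to 0) are assumed from the usual MIPS immediate-type layout rather than taken from this table. TLB translation is not modeled.

```python
def sign_extend(value, bits):
    """Sign-extend the low `bits` bits of value."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value & (1 << (bits - 1)) else value


def decode_cache_fields(word):
    """Split a 32-bit instruction word into base, op, and offset fields.

    Field positions are an assumption (standard MIPS I-type layout).
    """
    return {
        "base": (word >> 21) & 0x1F,
        "op": (word >> 16) & 0x1F,
        "offset": word & 0xFFFF,
    }


def cache_virtual_address(base_value, offset16, addr_bits=32):
    """Virtual address = contents of register base + sign-extended 16-bit offset."""
    return (base_value + sign_extend(offset16, 16)) & ((1 << addr_bits) - 1)
```

With a base register value of 0x80001000, an offset field of 0xFFF0 is treated as -16 and gives the virtual address 0x80000FF0.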
2.3 Pipeline
This section describes the basic operation of the VR4120A Core pipeline, which includes descriptions of the delay
slots (instructions that follow a branch or load instruction in the pipeline), interruptions to the pipeline flow
caused by interlocks and exceptions, and CP0 hazards.
2.3.1 Pipeline stages
The pipeline is controlled by PClock, which runs at four times the frequency of MasterClock. One cycle of PClock
is called a PCycle, and each pipeline stage takes one PCycle.
2.3.1.1 Pipeline in MIPS III instruction mode
The VR4120A has a five-stage instruction pipeline; each stage takes one PCycle, and each PCycle has two
phases: Φ1 and Φ2, as shown in Figure 2-9. Thus, the execution of each instruction takes at least 5 PCycles. An
instruction can take longer - for example, if the required data is not in the cache, the data must be retrieved from main
memory.
Figure 2-9. Pipeline Stages (MIPS III Instruction Mode)
PCycle   |   1   |   2   |   3   |   4   |   5   |
Phase    | Φ1 Φ2 | Φ1 Φ2 | Φ1 Φ2 | Φ1 Φ2 | Φ1 Φ2 |
Cycle    |  IF   |  RF   |  EX   |  DC   |  WB   |
The five pipeline stages are:
IF - Instruction cache fetch
RF - Register fetch
EX - Execution
DC - Data cache fetch
WB - Write back
Figure 2-10 shows the five stages of the instruction pipeline. In this figure, a row indicates the execution process of
each instruction, and a column indicates the processes executed simultaneously.
Figure 2-10. Instruction Execution in the Pipeline
(Five stages; one row per instruction, one pair of phase columns per PCycle)
IF1 IF2 RF1 RF2 EX1 EX2 DC1 DC2 WB1 WB2
        IF1 IF2 RF1 RF2 EX1 EX2 DC1 DC2 WB1 WB2
                IF1 IF2 RF1 RF2 EX1 EX2 DC1 DC2 WB1 WB2
                        IF1 IF2 RF1 RF2 EX1 EX2 DC1 DC2 WB1 WB2
                                IF1 IF2 RF1 RF2 EX1 EX2 DC1 DC2 WB1 WB2
Current CPU cycle: the PCycle in which all five rows are active.
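The staggered execution in Figure 2-10 can be reproduced with a small Python sketch. It assumes an ideal pipeline with no stalls or slips, and the helper names `pipeline_chart` and `stages_in_cycle` are introduced here for illustration only.

```python
STAGES = ["IF", "RF", "EX", "DC", "WB"]


def pipeline_chart(num_instructions):
    """Return one text row per instruction; each instruction starts one PCycle later."""
    rows = []
    for n in range(num_instructions):
        cells = ["  "] * n + STAGES  # n empty PCycles, then the five stages
        rows.append(" ".join(cells).rstrip())
    return rows


def stages_in_cycle(cycle, num_instructions):
    """Map instruction index to the stage it occupies in the given PCycle (0-based)."""
    return {
        n: STAGES[cycle - n]
        for n in range(num_instructions)
        if 0 <= cycle - n < len(STAGES)
    }


if __name__ == "__main__":
    for row in pipeline_chart(5):
        print(row)
```

In PCycle 4 (counting from 0), `stages_in_cycle(4, 5)` shows the oldest instruction in WB and the newest in IF, which corresponds to the column in which five instructions are processed simultaneously.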
2.3.1.2 Pipeline activities
(1) MIPS III instruction
Figure 2-11 shows the activities that can occur during each pipeline stage in MIPS III Instruction mode. Table 2-22
describes these pipeline activities.
Figure 2-11. Pipeline Activities (MIPS III)
(Figure: activities in each phase of the IF, RF, EX, DC, and WB cycles for instruction fetch and decode, ALU,
load/store, and branch operations. Mnemonics shown: IDC, ITLB, ICA, ITC, IDEC, RF, BAC, EX, DVA, SA, DCA, DTLB,
DTD, DCW, WB.)
Table 2-22. Operation in Each Stage of Pipeline (MIPS III)
Cycle  Phase  Mnemonic  Description
IF     Φ1     IDC       Instruction cache address decode
              ITLB      Instruction address translation
       Φ2     ICA       Instruction cache array access
              ITC       Instruction tag check
RF     Φ1     IDEC      Instruction decode
       Φ2     RF        Register operand fetch
              BAC       Branch address calculation
EX     Φ1     EX        Execution stage
              DVA       Data virtual address calculation
              SA        Store align
       Φ2     DCA       Data cache address decode/array access
              DTLB      Data address translation
DC     Φ1     DLA       Data cache load align
              DTC       Data tag check
              DTD       Data transfer to data cache
WB     Φ1     DCW       Data cache write
              WB        Write back to register file
2.3.2 Branch delay
During a VR4120A's pipeline operation, a one-cycle branch delay occurs when:
• Target address is calculated by a Jump instruction
• Branch condition of branch instruction is met and then logical operation starts for branch-destination
  comparison
The instruction location following the Jump/Branch instruction is called a branch delay slot.
The instruction address generated at the EX stage of the Jump/Branch instruction is available in the IF stage two
instructions later. In MIPS III instruction mode, branch delay is two cycles. One instruction in the branch delay
slot is executed, except for branch likely instructions.
Figure 2-12 illustrates the branch delay and the location of the branch delay slot during MIPS III instruction mode.
Figure 2-12. Branch Delay (In MIPS III Instruction Mode)
                      (one column per PCycle)
Branch                IF RF EX DC WB
(Branch delay slot)      IF RF EX DC WB
Target                      IF RF EX DC WB
(Label in figure: Branch delay)
2.3.3 Load delay
In the case of a load instruction, 2 cycles are required for the DC stage, for reading from the data cache and
performing data alignment. In this case, the hardware automatically generates an interlock.
A load instruction that does not allow its result to be used by the instruction immediately following is called a
delayed load instruction. The instruction immediately following this delayed load instruction is referred to as the load
delay slot.
In the VR4120A, the instruction immediately following a load instruction can use the contents of the loaded register;
however, in such cases hardware interlocks insert additional delay cycles. Consequently, scheduling load delay slots
can be desirable, both for performance and VR-Series processor compatibility.
2.3.4 Pipeline operation
The operation of the pipeline is illustrated by the following examples that describe how typical instructions are
executed. Six instructions are described: ADD, JALR, BEQ, TLT, LW, and SW. Each instruction is taken through
the pipeline and the operations that occur in each relevant stage are described.
2.3.4.1 Add instruction (ADD rd, rs, rt)
IF stage: In Φ1 of the IF stage, the eleven least-significant bits of the virtual address are used to access
          the instruction cache. In Φ2 of the IF stage, the cache index is compared with the page frame
          number and the cache data is read out. The virtual PC is incremented by 4 so that the next
          instruction can be fetched.
RF stage: During Φ2, the 2-port register file is addressed with the rs and rt fields and the register data is
          valid at the register file output. At the same time, bypass multiplexers select inputs from either
          the EX- or DC-stage output in addition to the register file output, depending on the need for an
          operand bypass.
EX stage: The ALU controls are set to do an A + B operation. The operands flow into the ALU inputs, and
          the ALU operation is started. The result of the ALU operation is latched into the ALU output
          latch during Φ1.
DC stage: This stage is a NOP for this instruction. The data from the output of the EX stage (the ALU) is
          moved into the output latch of the DC.
WB stage: During Φ1, the WB latch feeds the data to the inputs of the register file, which is addressed by
          the rd field. The file write strobe is enabled. By the end of Φ1, the data is written into the file.
Figure 2-13. ADD Instruction Pipeline Activities (In MIPS III Instruction Mode)
(Figure: activities shown for this instruction: IDC, ITLB, ICA, ITC, IDEC, RF, EX, WB. See Table 2-22 for the
phase of each activity.)
2.3.4.2 Jump and link register instruction (JALR rd, rs)
IF stage: Same as the IF stage for the ADD instruction.
IT stage: Same as the IT stage for the ADD instruction.
RF stage: A register specified in the rs field is read from the file during Φ2 at the RF stage, and the value
          read from the rs register is input to the virtual PC latch synchronously. This value is used to
          fetch an instruction at the jump destination. The value of the virtual PC incremented during the
          IF stage is incremented again to produce the link address PC + 8 where PC is the address of
          the JALR instruction. The resulting value is the PC to which the program will eventually return.
          This value is placed in the Link output latch of the Instruction Address unit.
EX stage: The PC + 8 value is moved from the Link output latch to the output latch of the EX stage.
DC stage: The PC + 8 value is moved from the output latch of the EX stage to the output latch of the DC
          stage.
WB stage: Refer to the ADD instruction. Note that if no value is explicitly provided for rd then register 31 is
          used as the default. If rd is explicitly specified, it cannot be the same register addressed by rs;
          if it is, the result of executing such an instruction is undefined.
Figure 2-14. JALR Instruction Pipeline Activities (In MIPS III Instruction Mode)
(Figure: activities shown for this instruction: IDC, ITLB, ICA, ITC, IDEC, RF, BAC, EX, WB. See Table 2-22 for
the phase of each activity.)
2.3.4.3 Branch on equal instruction (BEQ rs, rt, offset)
IF stage: Same as the IF stage for the ADD instruction.
IT stage: Same as the IT stage for the ADD instruction.
RF stage: During Φ2, the register file is addressed with the rs and rt fields. A check is performed to
          determine if each corresponding bit position of these two operands has equal values. If they
          are equal, the PC is set to PC + target, where target is the sign-extended offset field. If they are
          not equal, the PC is set to PC + 4.
EX stage: The next PC resulting from the branch comparison is valid at the beginning of Φ2 for instruction
          fetch.
DC stage: This stage is a NOP for this instruction.
WB stage: This stage is a NOP for this instruction.
Figure 2-15. BEQ Instruction Pipeline Activities (In MIPS III Instruction Mode)
(Figure: activities shown for this instruction: IDC, ITLB, ICA, ITC, IDEC, RF, BAC, EX. See Table 2-22 for the
phase of each activity.)
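The next-PC selection described for the RF stage of BEQ can be written as a minimal Python sketch. It follows the wording of this section literally (PC + target when the operands are equal, PC + 4 otherwise). The function names and the 32-bit address width are assumptions, and the sketch leaves out two details of the full MIPS definition: the offset is shifted left by two bits, and it is added to the address of the delay slot instruction.

```python
def sign_extend16(value):
    """Sign-extend a 16-bit field."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def beq_next_pc(pc, rs_value, rt_value, offset16, addr_bits=32):
    """Select the next PC for BEQ as described in the RF-stage text above."""
    mask = (1 << addr_bits) - 1
    if rs_value == rt_value:
        # Branch condition met: PC + target, target = sign-extended offset field.
        return (pc + sign_extend16(offset16)) & mask
    # Branch condition not met: fall through.
    return (pc + 4) & mask
```

A negative offset such as 0xFFF0 moves the PC backward by 16, which is the typical shape of a loop branch.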
2.3.4.4 Trap if less than instruction (TLT rs, rt)
IF stage: Same as the IF stage for the ADD instruction.
RF stage: Same as the RF stage for the ADD instruction.
EX stage: ALU controls are set to do an A - B operation. The operands flow into the ALU inputs, and the
          ALU operation is started. The result of the ALU operation is latched into the ALU output latch
          during Φ1. The sign bits of operands and of the ALU output latch are checked to determine if a
          less than condition is true. If this condition is true, a Trap exception occurs. The PC register is
          loaded with the exception vector value, and all instructions from this point on are invalidated.
DC stage: No operation
WB stage: The value of the PC is loaded to EPC register if the less than condition was met in the EX
          stage. The Cause register ExCode field and BD bit are updated appropriately, as is the EXL bit
          of the Status register. If the less than condition was not met in the EX stage, no activity occurs
          in the WB stage.
Figure 2-16. TLT Instruction Pipeline Activities
(Figure: activities shown for this instruction: IDC, ITLB, ICA, ITC, IDEC, RF, EX. See Table 2-22 for the phase
of each activity.)
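The EX-stage check for TLT uses the sign bits of the operands and of the A - B result. The manual does not give the exact logic, so the following Python sketch shows one common way to derive a signed less-than condition from those three sign bits. The 64-bit width and the names `sign_bit` and `signed_less_than` are assumptions.

```python
BITS = 64  # assumed operand width
MASK = (1 << BITS) - 1


def sign_bit(value):
    """Return the most significant bit of a BITS-wide value."""
    return (value >> (BITS - 1)) & 1


def signed_less_than(a, b):
    """Decide a < b (two's complement) from A - B and the three sign bits."""
    diff = (a - b) & MASK
    if sign_bit(a) != sign_bit(b):
        # Signs differ: the subtraction may overflow, so the operand sign decides.
        return sign_bit(a) == 1
    # Same sign: A - B cannot overflow, so the sign of the result decides.
    return sign_bit(diff) == 1
```

Looking only at the sign of the result would give the wrong answer when the most negative value is compared with the most positive one, which is why the operand signs are also examined.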
2.3.4.5 Load word instruction (LW rt, offset (base))
IF stage: Same as the IF stage for the ADD instruction.
IT stage: Same as the IT stage for the ADD instruction.
RF stage: Same as the RF stage for the ADD instruction. Note that the base field is in the same position
          as the rs field.
EX stage: Refer to the EX stage for the ADD instruction. For LW, the inputs to the ALU come from
          GPR[base] through the bypass multiplexer and from the sign-extended offset field. The result
          of the ALU operation that is latched into the ALU output latch in Φ1 represents the effective
          virtual address of the operand (DVA).
DC stage: The cache tag field is compared with the Page Frame Number (PFN) field of the TLB entry.
          After passing through the load aligner, aligned data is placed in the DC output latch during Φ2.
WB stage: During Φ1, the cache read data is written into the register file addressed by the rt field.
Figure 2-17. LW Instruction Pipeline Activities (In MIPS III Instruction Mode)
(Figure: activities shown for this instruction: IDC, ITLB, ICA, ITC, IDEC, RF, EX, DVA, DCA, DTLB, DLA, DTC, WB.
See Table 2-22 for the phase of each activity.)
2.3.4.6 Store word instruction (SW rt, offset (base))
IF stage: Same as the IF stage for the ADD instruction.
IT stage: Same as the IT stage for the ADD instruction.
RF stage: Same as the RF stage for the LW instruction.
EX stage: Refer to the LW instruction for a calculation of the effective address. From the RF output latch,
          the GPR[rt] is sent through the bypass multiplexer and into the main shifter, where the shifter
          performs the byte-alignment operation for the operand. The results of the ALU are latched in
          the output latches during Φ1. The shift operations are latched in the output latches during Φ2.
DC stage: Refer to the LW instruction for a description of the cache access.
WB stage: If there was a cache hit, the content of the store data output latch is written into the data cache
          at the appropriate word location.
          Note that all store instructions use the data cache for two consecutive PCycles. If the following
          instruction requires use of the data cache, the pipeline is slipped for one PCycle to complete the
          writing of the aligned store data.
Figure 2-18. SW Instruction Pipeline Activities (In MIPS III Instruction Mode)
(Figure: activities shown for this instruction: IDC, ITLB, ICA, ITC, IDEC, RF, EX, DVA, SA, DTLB, DTC, DTD, DCW.
See Table 2-22 for the phase of each activity.)
2.3.5 Interlock and exception handling
Smooth pipeline flow is interrupted when cache misses or exceptions occur, or when data dependencies are
detected. Interruptions handled using hardware, such as cache misses, are referred to as interlocks, while those that
are handled using software are called exceptions. As shown in Figure 2-19, all interlock and exception conditions are
collectively referred to as faults.
Figure 2-19. Relationship among Interlocks, Exceptions, and Faults
Faults
    Software: Exceptions
        Abort
    Hardware: Interlocks
        Stall
        Slip
At each cycle, exception and interlock conditions are checked for all active instructions.
Because each exception or interlock condition corresponds to a particular pipeline stage, a condition can be traced
back to the particular instruction in the exception/interlock stage, as shown in Table 2-23. For instance, an LDI
Interlock is raised in the Register Fetch (RF) stage.
Tables 2-24 and 2-25 describe the pipeline interlocks and exceptions listed in Table 2-23.
Table 2-23. Correspondence of Pipeline Stage to Interlock and Exception Conditions
Status \ Stage     IF       RF (IT)              EX       DC               WB
Interlock: Stall   −        ITM, ICM             −        DTM, DCM, DCB    −
Interlock: Slip    −        LDI, MDI, SLI, CP0   −        −                −
Exception          IAErr    NMI                  Trap     Reset            −
                            ITLB                 OVF      DTLB
                            IPErr                DAErr    TMod
                            INTr                          DPErr
                            IBE                           WAT
                            SYSC                          DBE
                            BP
                            CUn
                            RSVD
Remark: In the above table, exception conditions are listed in descending order of priority.
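Because the exception conditions of each stage are listed in priority order, selecting the condition to report for a stage amounts to taking the first listed name that is pending. The Python sketch below assumes the per-stage lists as read from Table 2-23 in this copy of the manual. The dictionary `STAGE_EXCEPTIONS` and the function `highest_priority` are illustrative names, not part of the processor documentation.

```python
# Exception conditions per pipeline stage, highest priority first
# (transcribed from Table 2-23 as read here; treat as an assumption).
STAGE_EXCEPTIONS = {
    "IF": ["IAErr"],
    "RF": ["NMI", "ITLB", "IPErr", "INTr", "IBE", "SYSC", "BP", "CUn", "RSVD"],
    "EX": ["Trap", "OVF", "DAErr"],
    "DC": ["Reset", "DTLB", "TMod", "DPErr", "WAT", "DBE"],
    "WB": [],
}


def highest_priority(stage, pending):
    """Return the highest-priority pending exception for a stage, or None."""
    for name in STAGE_EXCEPTIONS[stage]:
        if name in pending:
            return name
    return None
```

For instance, if both a breakpoint (BP) and an interrupt (INTr) are pending in the RF stage, the sketch reports INTr because it appears earlier in the list.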
Table 2-24. Pipeline Interlock
Interlock  Description
ITM        Instruction TLB Miss
ICM        Instruction Cache Miss
LDI        Load Data Interlock
MDI        MD Busy Interlock
SLI        Store-Load Interlock
CP0        Coprocessor 0 Interlock
DTM        Data TLB Miss
DCM        Data Cache Miss
DCB        Data Cache Busy
Table 2-25. Description of Pipeline Exception
Exception  Description
IAErr      Instruction Address Error exception
NMI        Non-maskable Interrupt exception
ITLB       ITLB exception
IPErr      Instruction Parity Error exception
INTr       Interrupt exception
IBE        Instruction Bus Error exception
SYSC       System Call exception
BP         Breakpoint exception
CUn        Coprocessor Unusable exception
RSVD       Reserved Instruction exception
Trap       Trap exception
OVF        Integer overflow exception
DAErr      Data Address Error exception
Reset      Reset exception
DTLB       DTLB exception
DTMod      DTLB Modified exception
DPErr      Data Parity Error exception
WAT        Watch exception
DBE        Data Bus Error exception
2.3.5.1 Exception conditions
When an exception condition occurs, the relevant instruction and all those that follow it in the pipeline are
cancelled. Accordingly, any stall conditions and any later exception conditions that may have referenced this
instruction are inhibited; there is no benefit in servicing stalls for a cancelled instruction.
When an exception condition is detected for an instruction, the VR4120A will discard it and all following
instructions. When this instruction reaches the WB stage, the exception flag and various information items are written
to CP0 registers. The current PC is changed to the appropriate exception vector address and the exception bits of
earlier pipeline stages are cleared.
This implementation allows all preceding instructions to complete execution and prevents all subsequent
instructions from completing. Thus the value in the EPC is sufficient to restart execution. It also ensures that
exceptions are taken in the order of execution; an instruction taking an exception may itself be killed by an instruction
further down the pipeline that takes an exception in a later cycle.
Figure 2-20. Exception Detection
(Figure: staggered pipeline rows, IF1 through WB2, showing the instruction on which the exception is detected, the
killed stages of that instruction and of the instructions that follow it, and the fetch from the exception vector.
Legend: killed stage; interrupt.)
2.3.5.2 Stall conditions
Stalls are used to stop the pipeline for conditions detected after the RF stage. When a stall occurs, the processor
will resolve the condition and then the pipeline will continue. Figure 2-21 shows a data cache miss stall, and
Figure 2-22 shows a CACHE instruction stall.
Figure 2-21. Data Cache Miss Stall
IF RF EX DC WB WB WB WB WB
   IF RF EX DC DC DC DC DC WB
      IF RF EX EX EX EX EX DC WB
         IF RF RF RF RF RF EX DC WB
1: Detect data cache miss
2: Start moving data cache line to write buffer
3: Get last word into cache and restart pipeline
If the cache line to be replaced is dirty (the W bit is set), the data is moved to the internal write buffer in the
next cycle. The write-back data is returned to memory. The last word in the data is returned to the cache at 3, and
pipelining restarts.
Figure 2-22. CACHE Instruction Stall
IF RF EX DC WB WB WB WB WB
   IF RF EX DC DC DC DC DC WB
      IF RF EX EX EX EX EX DC WB
         IF RF RF RF RF RF EX DC WB
1: CACHE instruction start
2: CACHE instruction complete
When the CACHE instruction enters the DC stage, the pipeline stalls while the CACHE instruction is executed.
The pipeline begins running again when the CACHE instruction is completed, allowing the instruction fetch to proceed.
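The rows of Figures 2-21 and 2-22 follow a simple pattern: during a stall every instruction holds the stage it currently occupies, then all advance together. A minimal Python sketch of that pattern is given below. The function name `stall_rows` and its parameters are hypothetical, and the number of stall cycles is an input rather than something the sketch derives from cache behavior.

```python
STAGES = ["IF", "RF", "EX", "DC", "WB"]


def stall_rows(stall_cycles, num_instructions=4):
    """Return the stage sequence of each instruction around one stall.

    Instruction 0 is the oldest. When the stall begins it is in WB, and each
    younger instruction is one stage behind. Every instruction repeats its
    current stage for `stall_cycles` extra PCycles.
    """
    rows = []
    for n in range(num_instructions):
        held = len(STAGES) - 1 - n  # index of the stage occupied when the stall starts
        row = STAGES[:held + 1] + [STAGES[held]] * stall_cycles + STAGES[held + 1:]
        rows.append(row)
    return rows


if __name__ == "__main__":
    for n, row in enumerate(stall_rows(4)):
        print("   " * n + " ".join(row))
```

With four stall cycles the sketch yields the same four stage sequences that appear in the figures, for example IF RF EX DC DC DC DC DC WB for the second instruction.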
2.3.5.3 Slip conditions
During Φ2 of the RF stage and Φ1 of the EX stage, internal logic will determine whether it is possible to start the
current instruction in this cycle. If all of the source operands are available (either from the register file or via the
internal bypass logic) and all the hardware resources necessary to complete the instruction will be available whenever
required, then the instruction will “run”; otherwise, the instruction will “slip”. Slipped instructions are retried on
subsequent cycles until they issue. The backend of the pipeline (stages DC and WB) will advance normally during slips
in an attempt to resolve the conflict. NOPs will be inserted into the bubble in the pipeline. Instructions killed by
branch likely instructions, ERET or exceptions will not cause slips.
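Figure 2-23. Load Data Interlock
(Stage sequence of each instruction)
Load A                   IF RF EX DC WB
Load B                   IF RF EX DC WB
ADD A,B                  IF RF RF EX DC WB   (operand supplied through the bypass)
(following instruction)  IF RF EX DC WB
1: Detect load interlock
2: Get the target data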
Load Data Interlock is detected in the RF stage, as shown in Figure 2-23, and the pipeline slips in that stage.
Load Data Interlock occurs when data fetched by a load instruction, or data moved from the HI, LO, or CP0 registers,
is required by the immediately following instruction. The pipeline begins running again one clock after the target of
the load is read from the data cache or from the HI, LO, or CP0 registers. The data returned at the end of the DC
stage is input into the end of the RF stage, using the bypass multiplexers.
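The load data interlock can be illustrated with a simplified Python sketch that inserts one extra RF cycle when an instruction reads the register written by the load immediately before it. The one-cycle slip is taken from Figure 2-23. The tuple layout of `program` and the function name `schedule` are assumptions, and moves from HI, LO, or CP0 are not modeled.

```python
def schedule(program):
    """Return the stage sequence of each instruction.

    program: list of (text, dest_reg, source_regs, is_load) tuples.
    An instruction that uses the destination of the immediately preceding
    load slips for one cycle in the RF stage.
    """
    rows = []
    prev = None
    for text, dest, sources, is_load in program:
        slip = prev is not None and prev[3] and prev[1] in sources
        rows.append(["IF", "RF"] + (["RF"] if slip else []) + ["EX", "DC", "WB"])
        prev = (text, dest, sources, is_load)
    return rows


if __name__ == "__main__":
    demo = [
        ("Load A", "A", ["base"], True),
        ("Load B", "B", ["base"], True),
        ("ADD A,B", "C", ["A", "B"], False),
    ]
    for (text, _, _, _), row in zip(demo, schedule(demo)):
        print(f"{text:8s} {' '.join(row)}")
```

Placing an unrelated instruction between Load B and the ADD removes the slip in this model, which is the load delay slot scheduling recommended in 2.3.3.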
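Figure 2-24. MD Busy Interlock
(Stage sequence of each instruction)
(preceding instruction)  IF RF EX DC WB
MFLO/MFHI                IF RF RF EX DC WB   (operand supplied through the bypass)
(following instruction)  IF RF EX DC WB
1: Detect MD busy interlock
2: Get target data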
MD Busy Interlock is detected in the RF stage, as shown in Figure 2-24, and the pipeline slips in that stage. MD
Busy Interlock occurs when the HI/LO register is required by an MFHI/MFLO instruction before Mult/Div execution has
finished. The pipeline begins running again one clock after Mult/Div execution finishes. The data returned from the
HI/LO register at the end of the DC stage is input into the end of the RF stage, using the bypass multiplexers.
Store-Load Interlock is detected in the EX stage and the pipeline slips in the RF stage. Store-Load Interlock occurs
when a store instruction followed by a load instruction is detected. The pipeline begins running again one clock
after.
Coprocessor 0 Interlock is detected in the EX stage and the pipeline slips in the RF stage. A coprocessor interlock
occurs when an MTC0 instruction for the Configuration or Status register is detected. The pipeline begins running
again one clock after.
2.3.5.4 Bypassing
In some cases, data and conditions produced in the EX, DC and WB stages of the pipeline are made available to
the EX stage (only) through the bypass data path.
Operand bypass allows an instruction in the EX stage to continue without having to wait for data or conditions to be
written to the register file at the end of the WB stage. Instead, the Bypass Control Unit is responsible for ensuring
data and conditions from later pipeline stages are available at the appropriate time for instructions earlier in the
pipeline.
The Bypass Control Unit is also responsible for controlling the source and destination register addresses supplied
to the register file.
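The operand selection performed by bypassing can be sketched in Python as a search over results that are still in flight. This is only a conceptual model: the manual does not describe the internals of the Bypass Control Unit, so the data layout (`in_flight` as a list ordered from the youngest producer to the oldest) and the function name `read_operand` are assumptions. Register 0 is treated as hardwired to zero, as on MIPS processors.

```python
def read_operand(reg, register_file, in_flight):
    """Return (value, source) for a source register needed in the EX stage.

    in_flight: list of (stage, dest_reg, value) for results not yet written
    back, ordered youngest producer first (EX, then DC, then WB). A value of
    None means the result is not available yet.
    """
    if reg == 0:
        return 0, "r0"
    for stage, dest, value in in_flight:
        if dest == reg and value is not None:
            return value, stage  # newest matching result wins
    return register_file[reg], "register file"
```

If two in-flight instructions write the same register, the sketch forwards the younger result, since that is the value the program order requires.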
2.3.6 Program compatibility
The VR4120A core is designed taking into consideration program compatibility with other VR-Series processors.
However, because the VR4120A differs from other processors in its architecture, it may not be able to run some
programs that run on other processors. Likewise, programs that run on the VR4120A will not necessarily run on other
processors. Matters which should be paid attention to when porting programs between the VR4120A core and other
VR-Series processors are listed below.
• The VR4120A core does not support floating-point instructions since it has no Floating-Point Unit (FPU).
• Multiply-add instructions (DMACC, MACC) are added in the VR4120A.
• Instructions for power modes (HIBERNATE, STANDBY, SUSPEND) are added in the VR4120A to support
  power modes.
• The VR4120A does not have the LL bit to perform synchronization of multiprocessing. Therefore, the CPU
  core does not support instructions which manipulate the LL bit (LL, LLD, SC, SCD).
• A 16-bit length MIPS16 instruction set is added in the VR4120A (but the µPD98502 does not support MIPS16
  mode).
• The CP0 hazards of the VR4120A are equally or less stringent than those of other processors (for details, see
  APPENDIX B VR4120A COPROCESSOR 0 HAZARDS).
• An instruction for debug has been added for the VR4120A. However, this instruction cannot be used for the
  VR4120A.
For more information, refer to APPENDIX A MIPS III INSTRUCTION SET DETAILS, the VR4100, VR4111™
User's Manual, or the VR4300 User's Manual.
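When porting code, the points listed above can be turned into a simple mnemonic check. The Python sketch below is a hypothetical helper, not a tool described in this manual. It covers only the mnemonics named in the list (floating-point instructions are not enumerated here), and the set names and the function `porting_report` are illustrative.

```python
# Mnemonics taken from the porting notes above.
UNSUPPORTED_ON_VR4120A = {"LL", "LLD", "SC", "SCD"}
ADDED_ON_VR4120A = {"MACC", "DMACC", "HIBERNATE", "STANDBY", "SUSPEND"}


def porting_report(mnemonics):
    """Classify the mnemonics used by a program for porting purposes."""
    used = {m.upper() for m in mnemonics}
    return {
        # Instructions that rely on the LL bit, which the VR4120A lacks.
        "unsupported_on_vr4120a": sorted(used & UNSUPPORTED_ON_VR4120A),
        # Instructions added in the VR4120A that other processors may lack.
        "may_be_missing_elsewhere": sorted(used & ADDED_ON_VR4120A),
    }
```

A program that uses LL and SC needs another synchronization method before it can run on the VR4120A, while a program that uses MACC needs checking against the target processor when moved the other way.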
The list of instructions supported by VR-Series products is shown below.
Table 2-26. VR Series Supported Instructions
Product columns:
    (1) VR4100, VR4102™
    (2) VR4111
    (3) VR4120A Core
    (4) VR4300, VR4305™, VR4310™
    (5) VR5000™, VR10000™
Instruction                      (1)  (2)  (3)                                            (4)  (5)
MIPS I instruction set           Ο    Ο    Ο                                              Ο    Ο
MIPS II instruction set          Ο    Ο    Ο                                              Ο    Ο
MIPS III instruction set         Ο    Ο    Ο                                              Ο    Ο
LL bit operation                 ×    ×    ×                                              Ο    Ο
MIPS IV instruction set          ×    ×    ×                                              ×    Ο
MIPS16 instruction set           ×    Ο    Ο (Note)                                       ×    ×
16-bit multiply-add operation    Ο    Ο    Ο (Use of 32-bit multiply-add operation ...)   ×    ×
32-bit multiply-add operation    ×    ×    Ο                                              ×    ×
Floating-point operation         ×    ×    ×                                              Ο    Ο
Power mode transfer              Ο    Ο    Ο                                              ×    ×
Note: The µPD98502 does not support MIPS16 mode. The MIPD16EN pin (located at D11) should be connected to
GND.