Compaq EV68A User Manual

21264/EV68A Microprocessor Hardware Reference Manual

Part Number: DS–0038B–TE

This manual is directly derived from the internal 21264/EV68A Specifications, Revision 1.1. You can access this hardware reference manual in PDF format from the following site:

ftp://ftp.compaq.com/pub/products/alphaCPUdocs

Revision/Update Information: Revision 1.1, March 2002

Compaq Computer Corporation Shrewsbury, Massachusetts

March 2002

The information in this publication is subject to change without notice.

COMPAQ COMPUTER CORPORATION SHALL NOT BE LIABLE FOR TECHNICAL OR EDITORIAL ERRORS OR OMISSIONS CONTAINED HEREIN, NOR FOR INCIDENTAL OR CONSEQUENTIAL DAMAGES RESULTING FROM THE FURNISHING, PERFORMANCE, OR USE OF THIS MATERIAL. THIS INFORMATION IS PROVIDED “AS IS” AND COMPAQ COMPUTER CORPORATION DISCLAIMS ANY WARRANTIES, EXPRESS, IMPLIED OR STATUTORY AND EXPRESSLY DISCLAIMS THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR PARTICULAR PURPOSE, GOOD TITLE AND AGAINST INFRINGEMENT.

This publication contains information protected by copyright. No part of this publication may be photocopied or reproduced in any form without prior written consent from Compaq Computer Corporation.

COMPAQ, the Compaq logo, the Digital logo, and VAX are registered in the United States Patent and Trademark Office.

Pentium is a registered trademark of Intel Corporation.

Other product names mentioned herein may be trademarks and/or registered trademarks of their respective companies.

21264/EV68A Hardware Reference Manual

Preface

1 Introduction

1.1 The Architecture .......... 1–1

1.1.1 Addressing .......... 1–2

1.1.2 Integer Data Types .......... 1–2

1.1.3 Floating-Point Data Types .......... 1–2

1.2 21264/EV68A Microprocessor Features .......... 1–3

2 Internal Architecture

2.1 21264/EV68A Microarchitecture .......... 2–1

2.1.1 Instruction Fetch, Issue, and Retire Unit .......... 2–2

2.1.1.1 Virtual Program Counter Logic .......... 2–2

2.1.1.2 Branch Predictor .......... 2–3

2.1.1.3 Instruction-Stream Translation Buffer .......... 2–5

2.1.1.4 Instruction Fetch Logic .......... 2–6

2.1.1.5 Register Rename Maps .......... 2–6

2.1.1.6 Integer Issue Queue .......... 2–6

2.1.1.7 Floating-Point Issue Queue .......... 2–7

2.1.1.8 Exception and Interrupt Logic .......... 2–8

2.1.1.9 Retire Logic .......... 2–8

2.1.2 Integer Execution Unit .......... 2–8

2.1.3 Floating-Point Execution Unit .......... 2–10

2.1.4 External Cache and System Interface Unit .......... 2–11

2.1.4.1 Victim Address File and Victim Data File .......... 2–11

2.1.4.2 I/O Write Buffer .......... 2–11

2.1.4.3 Probe Queue .......... 2–11

2.1.4.4 Duplicate Dcache Tag Array .......... 2–11

2.1.5 Onchip Caches .......... 2–11

2.1.5.1 Instruction Cache .......... 2–11

2.1.5.2 Data Cache .......... 2–12

2.1.6 Memory Reference Unit .......... 2–12

2.1.6.1 Load Queue .......... 2–13

2.1.6.2 Store Queue .......... 2–13

2.1.6.3 Miss Address File .......... 2–13

2.1.6.4 Dstream Translation Buffer .......... 2–13

2.1.7 SROM Interface .......... 2–13

2.2 Pipeline Organization .......... 2–13

2.2.1 Pipeline Aborts .......... 2–16

2.3 Instruction Issue Rules .......... 2–16

2.3.1 Instruction Group Definitions .......... 2–17

2.3.2 Ebox Slotting .......... 2–18

2.3.3 Instruction Latencies .......... 2–20

2.4 Instruction Retire Rules .......... 2–21

2.4.1 Floating-Point Divide/Square Root Early Retire .......... 2–22

2.5 Retire of Operate Instructions into R31/F31 .......... 2–22

2.6 Load Instructions to R31 and F31 .......... 2–23

2.6.1 Normal Prefetch: LDBU, LDF, LDG, LDL, LDT, LDWU, HW_LDL Instructions .......... 2–23

2.6.2 Prefetch with Modify Intent: LDS Instruction .......... 2–23

2.6.3 Prefetch, Evict Next: LDQ and HW_LDQ Instructions .......... 2–24

2.7 Special Cases of Alpha Instruction Execution .......... 2–24

2.7.1 Load Hit Speculation .......... 2–24

2.7.2 Floating-Point Store Instructions .......... 2–26

2.7.3 CMOV Instruction .......... 2–26

2.8 Memory and I/O Address Space Instructions .......... 2–27

2.8.1 Memory Address Space Load Instructions .......... 2–27

2.8.2 I/O Address Space Load Instructions .......... 2–27

2.8.3 Memory Address Space Store Instructions .......... 2–28

2.8.4 I/O Address Space Store Instructions .......... 2–29

2.9 MAF Memory Address Space Merging Rules .......... 2–30

2.10 Instruction Ordering .......... 2–30

2.11 Replay Traps .......... 2–31

2.11.1 Mbox Order Traps .......... 2–31

2.11.1.1 Load-Load Order Trap .......... 2–31

2.11.1.2 Store-Load Order Trap .......... 2–31

2.11.2 Other Mbox Replay Traps .......... 2–32

2.12 I/O Write Buffer and the WMB Instruction .......... 2–32

2.12.1 Memory Barrier (MB/WMB/TB Fill Flow) .......... 2–32

2.12.1.1 MB Instruction Processing .......... 2–33

2.12.1.2 WMB Instruction Processing .......... 2–33

2.12.1.3 TB Fill Flow .......... 2–34

2.13 Performance Measurement Support—Performance Counters . ...................... 2–35

2.14 Floating-Point Control Register .......... 2–35

2.15 AMASK and IMPLVER Instruction Values .......... 2–37

2.15.1 AMASK .......... 2–38

2.15.2 IMPLVER .......... 2–38

2.16 Design Examples .......... 2–38

3 Hardware Interface

3.1 21264/EV68A Microprocessor Logic Symbol . . . ................................. 3–1

3.2 21264/EV68A Signal Names and Functions..................................... 3–3

3.3 Pin Assignments .......... 3–8

3.4 Mechanical Specifications .......... 3–17

3.5 21264/EV68A Packaging . .................................................. 3–18

4 Cache and External Interfaces

4.1 Introduction to the External Interfaces .......... 4–1

4.1.1 System Interface .......... 4–3

4.1.1.1 Commands and Addresses .......... 4–4

4.1.2 Second-Level Cache (Bcache) Interface .......... 4–4

4.2 Physical Address Considerations .......... 4–4

4.3 Bcache Structure .......... 4–7

4.3.1 Bcache Interface Signals .......... 4–7

4.3.2 System Duplicate Tag Stores .......... 4–7

4.4 Victim Data Buffer .......... 4–8

4.5 Cache Coherency .......... 4–8

4.5.1 Cache Coherency Basics .......... 4–8

4.5.2 Cache Block States .......... 4–9

4.5.3 Cache Block State Transitions .......... 4–10

4.5.4 Using SysDc Commands .......... 4–11

4.5.5 Dcache States and Duplicate Tags .......... 4–13

4.6 Lock Mechanism .......... 4–14

4.6.1 In-Order Processing of LDx_L/STx_C Instructions .......... 4–15

4.6.2 Internal Eviction of LDx_L Blocks .......... 4–15

4.6.3 Liveness and Fairness .......... 4–15

4.6.4 Managing Speculative Store Issues with Multiprocessor Systems .......... 4–16

4.7 System Port .......... 4–16

4.7.1 System Port Pins .......... 4–17

4.7.2 Programming the System Interface Clocks .......... 4–18

4.7.3 21264/EV68A-to-System Commands .......... 4–19

4.7.3.1 Bank Interleave on Cache Block Boundary Mode .......... 4–19

4.7.3.2 Page Hit Mode .......... 4–20

4.7.4 21264/EV68A-to-System Commands Descriptions .......... 4–21

4.7.5 ProbeResponse Commands (Command[4:0] = 00001). . . ...................... 4–24

4.7.6 SysAck and 21264/EV68A-to-System Commands Flow Control .......... 4–25

4.7.7 System-to-21264/EV68A Commands ...................................... 4–26

4.7.7.1 Probe Commands (Four Cycles) .......... 4–26

4.7.7.2 Data Transfer Commands (Two Cycles) .......... 4–28

4.7.8 Data Movement In and Out of the 21264/EV68A .......... 4–30

4.7.8.1 21264/EV68A Clock Basics .......... 4–30

4.7.8.2 Fast Data Mode .......... 4–31

4.7.8.3 Fast Data Disable Mode .......... 4–33

4.7.8.4 SysDataInValid_L and SysDataOutValid_L .......... 4–34

4.7.8.5 SysFillValid_L .......... 4–35

4.7.8.6 Data Wrapping .......... 4–36

4.7.9 Nonexistent Memory Processing .......... 4–38

4.7.10 Ordering of System Port Transactions .......... 4–40

4.7.10.1 21264/EV68A Commands and System Probes .......... 4–40

4.7.10.2 System Probes and SysDc Commands .......... 4–42

4.8 Bcache Port .......... 4–42

4.8.1 Bcache Port Pins .......... 4–43

4.8.2 Bcache Clocking .......... 4–44

4.8.2.1 Setting the Period of the Cache Clock .......... 4–45

4.8.3 Bcache Transactions .......... 4–47

4.8.3.1 Bcache Data Read and Tag Read Transactions .......... 4–47

4.8.3.2 Bcache Data Write Transactions .......... 4–48

4.8.3.3 Bubbles on the Bcache Data Bus .......... 4–49

4.8.4 Pin Descriptions .......... 4–50

4.8.4.1 BcAdd_H[23:4] .......... 4–51

4.8.4.2 Bcache Control Pins .......... 4–51

4.8.4.3 BcDataInClk_H and BcTagInClk_H .......... 4–53

4.8.5 Bcache Banking .......... 4–53

4.8.6 Disabling the Bcache for Debugging . ...................................... 4–53

4.9 Interrupts................................................................ 4–54

5 Internal Processor Registers

5.1 Ebox IPRs .......... 5–3

5.1.1 Cycle Counter Register – CC .......... 5–3

5.1.2 Cycle Counter Control Register – CC_CTL .......... 5–3

5.1.3 Virtual Address Register – VA .......... 5–4

5.1.4 Virtual Address Control Register – VA_CTL .......... 5–4

5.1.5 Virtual Address Format Register – VA_FORM .......... 5–5

5.2 Ibox IPRs .......... 5–6

5.2.1 ITB Tag Array Write Register – ITB_TAG .......... 5–6

5.2.2 ITB PTE Array Write Register – ITB_PTE .......... 5–6

5.2.3 ITB Invalidate All Process (ASM=0) Register – ITB_IAP .......... 5–7

5.2.4 ITB Invalidate All Register – ITB_IA .......... 5–7

5.2.5 ITB Invalidate Single Register – ITB_IS .......... 5–7

5.2.6 ProfileMe PC Register – PMPC .......... 5–8

5.2.7 Exception Address Register – EXC_ADDR .......... 5–8

5.2.8 Instruction Virtual Address Format Register – IVA_FORM .......... 5–9

5.2.9 Interrupt Enable and Current Processor Mode Register – IER_CM .......... 5–9

5.2.10 Software Interrupt Request Register – SIRR .......... 5–10

5.2.11 Interrupt Summary Register – ISUM .......... 5–11

5.2.12 Hardware Interrupt Clear Register – HW_INT_CLR .......... 5–12

5.2.13 Exception Summary Register – EXC_SUM .......... 5–13

5.2.14 PAL Base Register – PAL_BASE .......... 5–15

5.2.15 Ibox Control Register – I_CTL .......... 5–15

5.2.16 Ibox Status Register – I_STAT .......... 5–18

5.2.17 Icache Flush Register – IC_FLUSH .......... 5–21

5.2.18 Icache Flush ASM Register – IC_FLUSH_ASM .......... 5–21

5.2.19 Clear Virtual-to-Physical Map Register – CLR_MAP .......... 5–21

5.2.20 Sleep Mode Register – SLEEP .......... 5–21

5.2.21 Process Context Register – PCTX .......... 5–21

5.2.22 Performance Counter Control Register – PCTR_CTL .......... 5–23

5.3 Mbox IPRs .......... 5–25

5.3.1 DTB Tag Array Write Registers 0 and 1 – DTB_TAG0, DTB_TAG1 .......... 5–25

5.3.2 DTB PTE Array Write Registers 0 and 1 – DTB_PTE0, DTB_PTE1 .......... 5–26

5.3.3 DTB Alternate Processor Mode Register – DTB_ALTMODE .......... 5–26

5.3.4 Dstream TB Invalidate All Process (ASM=0) Register – DTB_IAP .......... 5–27

5.3.5 Dstream TB Invalidate All Register – DTB_IA .......... 5–27

5.3.6 Dstream TB Invalidate Single Registers 0 and 1 – DTB_IS0,1 .......... 5–27

5.3.7 Dstream TB Address Space Number Registers 0 and 1 – DTB_ASN0,1 .......... 5–28

5.3.8 Memory Management Status Register – MM_STAT .......... 5–28

5.3.9 Mbox Control Register – M_CTL .......... 5–29

5.3.10 Dcache Control Register – DC_CTL .......... 5–30

5.3.11 Dcache Status Register – DC_STAT .......... 5–31

5.4 Cbox CSRs and IPRs .......... 5–32

5.4.1 Cbox Data Register – C_DATA .......... 5–33

5.4.2 Cbox Shift Register – C_SHFT .......... 5–33

5.4.3 Cbox WRITE_ONCE Chain Description .......... 5–33

5.4.4 Cbox WRITE_MANY Chain Description .......... 5–38

5.4.5 Cbox Read Register (IPR) Description .......... 5–41

6 Privileged Architecture Library Code

6.1 PALcode Description .......... 6–1

6.2 PALmode Environment .......... 6–2

6.3 Required PALcode Function Codes .......... 6–3

6.4 Opcodes Reserved for PALcode .......... 6–3

6.4.1 HW_LD Instruction .......... 6–3

6.4.2 HW_ST Instruction .......... 6–4

6.4.3 HW_RET Instruction .......... 6–5

6.4.4 HW_MFPR and HW_MTPR Instructions .......... 6–6

6.5 Internal Processor Register Access Mechanisms .......... 6–7

6.5.1 IPR Scoreboard Bits .......... 6–8

6.5.2 Hardware Structure of Explicitly Written IPRs .......... 6–8

6.5.3 Hardware Structure of Implicitly Written IPRs .......... 6–9

6.5.4 IPR Access Ordering .......... 6–9

6.5.5 Correct Ordering of Explicit Writers Followed by Implicit Readers .......... 6–10

6.5.6 Correct Ordering of Explicit Readers Followed by Implicit Writers .......... 6–11

6.6 PALshadow Registers .......... 6–11

6.7 PALcode Emulation of the FPCR .......... 6–11

6.7.1 Status Flags .......... 6–12

6.7.2 MF_FPCR ........................................................... 6–12

6.7.3 MT_FPCR ........................................................... 6–12

6.8 PALcode Entry Points .......... 6–12

6.8.1 CALL_PAL Entry Points .......... 6–12

6.8.2 PALcode Exception Entry Points .......... 6–13

6.9 Translation Buffer (TB) Fill Flows .......... 6–14

6.9.1 DTB Fill .......... 6–14

6.9.2 ITB Fill .......... 6–16

6.10 Performance Counter Support .......... 6–17

6.10.1 General Precautions .......... 6–18

6.10.2 Aggregate Mode Programming Guidelines .......... 6–18

6.10.2.1 Aggregate Mode Precautions .......... 6–18

6.10.2.2 Operation .......... 6–19

6.10.2.3 Aggregate Counting Mode Description .......... 6–20

6.10.2.3.1 Cycle counting .......... 6–20

6.10.2.3.2 Retired instructions cycles .......... 6–20

6.10.2.3.3 Bcache miss or long latency probes cycles .......... 6–20

6.10.2.3.4 Mbox replay traps cycles .......... 6–20

6.10.2.4 Counter Modes for Aggregate Mode .......... 6–20

6.10.3 ProfileMe Mode Programming Guidelines .......... 6–20

6.10.3.1 ProfileMe Mode Precautions .......... 6–20

6.10.3.2 Operation .......... 6–21

6.10.3.3 ProfileMe Counting Mode Description .......... 6–23

6.10.3.3.1 Cycle counting .......... 6–23

6.10.3.3.2 Inum retire delay cycles .......... 6–23

6.10.3.3.3 Retired instructions cycles .......... 6–23

6.10.3.3.4 Bcache miss or long latency probes cycles .......... 6–23

6.10.3.3.5 Mbox replay traps cycles .......... 6–23

6.10.3.4 Counter Modes for ProfileMe Mode .......... 6–24

7 Initialization and Configuration

7.1 Power-Up Reset Flow and the Reset_L and DCOK_H Pins .......... 7–1

7.1.1 Power Sequencing and Reset State for Signal Pins .......... 7–3

7.1.2 Clock Forwarding and System Clock Ratio Configuration .......... 7–4

7.1.3 PLL Ramp Up .......... 7–6

7.1.4 BiST and SROM Load and the TestStat_H Pin .......... 7–6

7.1.5 Clock Forward Reset and System Interface Initialization .......... 7–7

7.2 Fault Reset Flow .......... 7–8

7.3 Energy Star Certification and Sleep Mode Flow .......... 7–9

7.4 Warm Reset Flow .......... 7–11

7.5 Array Initialization .......... 7–12

7.6 Initialization Mode Processing .......... 7–12

7.7 External Interface Initialization .......... 7–14

7.8 Internal Processor Register Power-Up Reset State .......... 7–14

7.9 IEEE 1149.1 Test Port Reset .......... 7–16

7.10 Reset State Machine .......... 7–16

7.11 Phase-Lock Loop (PLL) Functional Description .......... 7–19

7.11.1 Differential Reference Clocks .......... 7–19

7.11.2 PLL Output Clocks .......... 7–19

7.11.2.1 GCLK........................................................... 7–19

7.11.2.2 Differential 21264/EV68A Clocks ...................................... 7–19

7.11.2.3 Nominal Operating Frequency . . ...................................... 7–19

7.11.2.4 Power-Up/Reset Clocking .......... 7–20

8 Error Detection and Error Handling

8.1 Data Error Correction Code .......... 8–2

8.2 Icache Data or Tag Parity Error .......... 8–2

8.3 Dcache Tag Parity Error .......... 8–2

8.4 Dcache Data Single-Bit Correctable ECC Error .......... 8–3

8.4.1 Load Instruction .......... 8–3

8.4.2 Store Instruction (Quadword or Smaller) .......... 8–4

8.4.3 Dcache Victim Extracts .......... 8–4

8.5 Dcache Store Second Error .......... 8–4

8.6 Dcache Duplicate Tag Parity Error .......... 8–4

8.7 Bcache Tag Parity Error .......... 8–5

8.8 Controlling Bcache Block Parity Calculation .......... 8–5

8.9 Bcache Data Single-Bit Correctable ECC Error .......... 8–5

8.9.1 Icache Fill from Bcache .......... 8–5

8.9.2 Dcache Fill from Bcache .......... 8–6

8.9.3 Bcache Victim Read .......... 8–7

8.9.3.1 Bcache Victim Read During a Dcache/Bcache Miss .......... 8–7

8.9.3.2 Bcache Victim Read During an ECB Instruction .......... 8–7

8.10 Memory/System Port Single-Bit Data Correctable ECC Error .......... 8–7

8.10.1 Icache Fill from Memory .......... 8–7

8.10.2 Dcache Fill from Memory .......... 8–8

8.11 Bcache Data Single-Bit Correctable ECC Error on a Probe .......... 8–9

8.12 Double-Bit Fill Errors .......... 8–9

8.13 Error Case Summary .......... 8–10

9 Electrical Data

9.1 Electrical Characteristics .......... 9–1

9.2 DC Characteristics .......... 9–2

9.3 Power Supply Sequencing and Avoiding Potential Failure Mechanisms .......... 9–5

9.4 AC Characteristics .......... 9–6

10 Thermal Management

10.1 Operating Temperature .......... 10–1

10.2 Heat Sink Specifications .......... 10–3

10.3 Thermal Design Considerations .......... 10–6

11 Testability and Diagnostics

11.1 Test Pins .......... 11–1

11.2 SROM/Serial Diagnostic Terminal Port .......... 11–2

11.2.1 SROM Load Operation .......... 11–2

11.2.2 Serial Terminal Port .......... 11–2

11.3 IEEE 1149.1 Port .......... 11–3

11.4 TestStat_H Pin .......... 11–4

11.5 Power-Up Self-Test and Initialization .......... 11–5

11.5.1 Built-in Self-Test .......... 11–5

11.5.2 SROM Initialization .......... 11–5

11.5.2.1 Serial Instruction Cache Load Operation .......... 11–6

11.6 Notes on IEEE 1149.1 Operation and Compliance ............................... 11–7

A Alpha Instruction Set

A.1 Alpha Instruction Summary .......... A–1

A.2 Reserved Opcodes .......... A–8

A.2.1 Opcodes Reserved for Compaq........................................... A–8

A.2.2 Opcodes Reserved for PALcode .......................................... A–9

A.3 IEEE Floating-Point Instructions .......... A–9

A.4 VAX Floating-Point Instructions .......... A–11

A.5 Independent Floating-Point Instructions .......... A–11

A.6 Opcode Summary .......... A–12

A.7 Required PALcode Function Codes .......... A–13

A.8 IEEE Floating-Point Conformance .......... A–14

B 21264/EV68A Boundary-Scan Register

B.1 Boundary-Scan Register .......... B–1

B.1.1 BSDL Description of the Alpha 21264/EV68A Boundary-Scan Register .......... B–1

C Serial Icache Load Predecode Values

D PALcode Restrictions and Guidelines

D.1 Restriction 1: Reset Sequence Required by Retire Logic and Mapper .......... D–1

D.2 Restriction 2: No Multiple Writers to IPRs in Same Scoreboard Group .......... D–8

D.3 Restriction 4: No Writers and Readers to IPRs in Same Scoreboard Group .......... D–8

D.4 Guideline 6: Avoid Consecutive Read-Modify-Write-Read-Modify-Write .......... D–9

D.5 Restriction 7: Replay Trap, Interrupt Code Sequence, and STF/ITOF .......... D–9

D.6 Restriction 9: PALmode Istream Address Ranges .......... D–10

D.7 Restriction 10: Duplicate IPR Mode Bits .......... D–10

D.8 Restriction 11: Ibox IPR Update Synchronization .......... D–11

D.9 Restriction 12: MFPR of Implicitly-Written IPRs EXC_ADDR, IVA_FORM, and EXC_SUM .......... D–11

D.10 Restriction 13: DTB Fill Flow Collision .......... D–11

D.11 Restriction 14: HW_RET .......... D–11

D.12 Guideline 16: JSR-BAD VA .......... D–12

D.13 Restriction 17: MTPR to DTB_TAG0/DTB_PTE0/DTB_TAG1/DTB_PTE1 .......... D–12

D.14 Restriction 18: No FP Operates, FP Conditional Branches, FTOI, or STF in Same Fetch Block as HW_MTPR .......... D–12

D.15 Restriction 19: HW_RET/STALL After Updating the FPCR by way of MT_FPCR in PALmode .......... D–12

D.16 Guideline 20: I_CTL[SBE] Stream Buffer Enable .......... D–12

D.17 Restriction 21: HW_RET/STALL After HW_MTPR ASN0/ASN1 .......... D–12

D.18 Restriction 22: HW_RET/STALL After HW_MTPR IS0/IS1 .......... D–13

D.19 Restriction 23: HW_ST/P/CONDITIONAL Does Not Clear the Lock Flag .......... D–13

D.20 Restriction 24: HW_RET/STALL After HW_MTPR IC_FLUSH, IC_FLUSH_ASM, CLEAR_MAP .......... D–14

D.21 Restriction 25: HW_MTPR ITB_IA After Reset .......... D–14

D.22 Guideline 26: Conditional Branches in PALcode .......... D–14

D.23 Restriction 27: Reset of ‘Force-Fail Lock Flag’ State in PALcode .......... D–15

D.24 Restriction 28: Enforce Ordering Between IPRs Implicitly Written by Loads and Subsequent Loads .......... D–15

D.25 Guideline 29: JSR, JMP, RET, and JSR_COR in PALcode .......... D–15

D.26 Restriction 30: HW_MTPR and HW_MFPR to the Cbox CSR .......... D–15

D.27 Restriction 31: I_CTL[VA_48] Update .......... D–17

D.28 Restriction 32: PCTR_CTL Update .......... D–17

D.29 Restriction 33: HW_LD Physical/Lock Use .......... D–18

D.30 Restriction 34: Writing Multiple ITB Entries in the Same PALcode Flow .......... D–18

D.31 Guideline 35: HW_INT_CLR Update .......... D–18

D.32 Restriction 36: Updating I_CTL[SDE] .......... D–18

D.33 Restriction 37: Updating VA_CTL[VA_48] .......... D–18

D.34 Restriction 38: Updating PCTR_CTL .......... D–18

D.35 Guideline 39: Writing Multiple DTB Entries in the Same PAL Flow .......... D–19

D.36 Restriction 40: Scrubbing a Single-Bit Error .......... D–19

D.37 Restriction 41: MTPR ITB_TAG, MTPR ITB_PTE Must be in the Same Fetch Block .......... D–21

D.38 Restriction 42: Updating VA_CTL, CC_CTL, or CC IPRs .......... D–21

D.39 Restriction 43: No Trappable Instructions Along with HW_MTPR .......... D–21

D.40 Restriction 44: Not Applicable to the 21264/EV68A ............................... D–21

D.41 Restriction 45: No HW_JMP or JMP Instructions in PALcode .......... D–21

D.42 Restriction 46: Avoiding Livelocks in Speculative Load CRD Handlers .......... D–22

D.43 Restriction 47: Cache Eviction for Single-Bit Cache Errors .......... D–22

D.44 Restriction 48: MB Bracketing of Dcache Writes to Force Bad Data ECC and Force Bad Tag Parity .......... D–24

E 21264/EV68A-to-Bcache Pin Interface

E.1 Forwarding Clock Pin Groupings .......... E–1

E.2 Late-Write Non-Bursting SSRAMs .......... E–2

E.3 Dual-Data Rate SSRAMs .......... E–3

Glossary

Index

Figures

2–1 21264/EV68A Block Diagram ..... 2–3

2–2 Branch Predictor ..... 2–4

2–3 Local Predictor ..... 2–4

2–4 Global Predictor ..... 2–5

2–5 Choice Predictor ..... 2–5

2–6 Integer Execution Unit—Clusters 0 and 1 ..... 2–9

2–7 Floating-Point Execution Units ..... 2–10

2–8 Pipeline Organization ..... 2–14

2–9 Pipeline Timing for Integer Load Instructions ..... 2–24

2–10 Pipeline Timing for Floating-Point Load Instructions ..... 2–25

2–11 Floating-Point Control Register ..... 2–36

2–12 Typical Uniprocessor Configuration ..... 2–39

2–13 Typical Multiprocessor Configuration ..... 2–39

3–1 21264/EV68A Microprocessor Logic Symbol ..... 3–2

3–2 Package Dimensions ..... 3–17

3–3 21264/EV68A Top View (Pin Down) ..... 3–18

3–4 21264/EV68A Bottom View (Pin Up) ..... 3–19

4–1 21264/EV68A System and Bcache Interfaces ..... 4–3

4–2 21264/EV68A Bcache Interface Signals ..... 4–7

4–3 Cache Subset Hierarchy ..... 4–9

4–4 System Interface Signals ..... 4–17

4–5 Fast Transfer Timing Example ..... 4–32

4–6 SysFillValid_L Timing ..... 4–36

5–1 Cycle Counter Register ..... 5–3

5–2 Cycle Counter Control Register ..... 5–3

5–3 Virtual Address Register ..... 5–4

5–4 Virtual Address Control Register ..... 5–4

5–5 Virtual Address Format Register (VA_48 = 0, VA_FORM_32 = 0) ..... 5–5

5–6 Virtual Address Format Register (VA_48 = 1, VA_FORM_32 = 0) ..... 5–6

5–7 Virtual Address Format Register (VA_48 = 0, VA_FORM_32 = 1) ..... 5–6

5–8 ITB Tag Array Write Register ..... 5–6

5–9 ITB PTE Array Write Register ..... 5–7

5–10 ITB Invalidate Single Register ..... 5–7

5–11 ProfileMe PC Register ..... 5–8

5–12 Exception Address Register ..... 5–8

5–13 Instruction Virtual Address Format Register (VA_48 = 0, VA_FORM_32 = 0) ..... 5–9

5–14 Instruction Virtual Address Format Register (VA_48 = 1, VA_FORM_32 = 0) ..... 5–9

5–15 Instruction Virtual Address Format Register (VA_48 = 0, VA_FORM_32 = 1) ..... 5–9

5–16 Interrupt Enable and Current Processor Mode Register ..... 5–10

5–17 Software Interrupt Request Register ..... 5–11

5–18 Interrupt Summary Register ..... 5–11

5–19 Hardware Interrupt Clear Register ..... 5–12

5–20 Exception Summary Register ..... 5–14

5–21 PAL Base Register ..... 5–15

5–22 Ibox Control Register ..... 5–16

5–23 Ibox Status Register ..... 5–19

5–24 Process Context Register ..... 5–22

5–25 Performance Counter Control Register ..... 5–23

5–26 DTB Tag Array Write Registers 0 and 1 ..... 5–25

5–27 DTB PTE Array Write Registers 0 and 1 ..... 5–26

5–28 DTB Alternate Processor Mode Register ..... 5–26

5–29 Dstream Translation Buffer Invalidate Single Registers ..... 5–27

5–30 Dstream Translation Buffer Address Space Number Registers 0 and 1 ..... 5–28

5–31 Memory Management Status Register ..... 5–28

5–32 Mbox Control Register ..... 5–29

5–33 Dcache Control Register ..... 5–31


5–34 Dcache Status Register ..... 5–32

5–35 Cbox Data Register ..... 5–33

5–36 Cbox Shift Register ..... 5–33

5–37 WRITE_MANY Chain Write Transaction Example ..... 5–39

6–1 HW_LD Instruction Format ..... 6–4

6–2 HW_ST Instruction Format ..... 6–4

6–3 HW_RET Instruction Format ..... 6–6

6–4 HW_MFPR and HW_MTPR Instructions Format ..... 6–6

6–5 Single-Miss DTB Instructions Flow Example ..... 6–14

6–6 ITB Miss Instructions Flow Example ..... 6–16

7–1 Power-Up Timing Sequence ..... 7–3

7–2 Fault Reset Sequence of Operation ..... 7–9

7–3 Sleep Mode Sequence of Operation ..... 7–11

7–4 Example for Initializing Bcache ..... 7–13

7–5 21264/EV68A Reset State Machine State Diagram ..... 7–17

10–1 Type 1 Heat Sink ..... 10–3

10–2 Type 2 Heat Sink ..... 10–4

10–3 Type 3 Heat Sink ..... 10–5

11–1 TestStat_H Pin Timing During Power-Up Built-In Self-Test (BiST) ..... 11–5

11–2 TestStat_H Pin Timing During Built-In Self-Initialization (BiSI) ..... 11–5

11–3 SROM Content Map ..... 11–6


Tables

1–1 Integer Data Types ..... 1–2

2–1 Pipeline Abort Delay (GCLK Cycles) ..... 2–16

2–2 Instruction Name, Pipeline, and Types ..... 2–17

2–3 Instruction Group Definitions and Pipeline Unit ..... 2–18

2–4 Instruction Class Latency in Cycles ..... 2–20

2–5 Minimum Retire Latencies for Instruction Classes ..... 2–21

2–6 Instructions Retired Without Execution ..... 2–23

2–7 Rules for I/O Address Space Load Instruction Data Merging ..... 2–28

2–8 Rules for I/O Address Space Store Instruction Data Merging ..... 2–29

2–9 MAF Merging Rules ..... 2–30

2–10 Memory Reference Ordering ..... 2–30

2–11 I/O Reference Ordering ..... 2–31

2–12 TB Fill Flow Example Sequence 1 ..... 2–34

2–13 TB Fill Flow Example Sequence 2 ..... 2–34

2–14 Floating-Point Control Register Fields ..... 2–36

2–15 21264/EV68A AMASK Values ..... 2–38

2–16 AMASK Bit Assignments ..... 2–38

3–1 Signal Pin Types Definitions ..... 3–3

3–2 21264/EV68A Signal Descriptions ..... 3–3

3–3 21264/EV68A Signal Descriptions by Function ..... 3–6

3–4 Pin List Sorted by Signal Name ..... 3–8

3–5 Pin List Sorted by PGA Location ..... 3–12

3–6 Ground and Power (VSS and VDD) Pin List ..... 3–16

4–1 Translation of Internal References to External Interface Reference ..... 4–5

4–2 21264/EV68A-Supported Cache Block States ..... 4–9

4–3 Cache Block State Transitions ..... 4–10

4–4 System Responses to 21264/EV68A Commands ..... 4–10

4–5 System Responses to 21264/EV68A Commands and Reactions ..... 4–11

4–6 System Port Pins ..... 4–17

4–7 Programming Values for System Interface Clocks ..... 4–18

4–8 Program Values for Data-Sample/Drive CSRs ..... 4–18

4–9 Forwarded Clocks and Frame Clock Ratio ..... 4–19

4–10 Bank Interleave on Cache Block Boundary Mode of Operation ..... 4–19

4–11 Page Hit Mode of Operation ..... 4–20

4–12 21264/EV68A-to-System Command Fields Definitions ..... 4–20

4–13 Maximum Physical Address for Short Bus Format ..... 4–21

4–14 21264/EV68A-to-System Commands Descriptions ..... 4–21

4–15 Programming INVAL_TO_DIRTY_ENABLE[1:0] ..... 4–23

4–16 Programming SET_DIRTY_ENABLE[2:0] ..... 4–24

4–17 21264/EV68A ProbeResponse Command ..... 4–24

4–18 ProbeResponse Fields Descriptions ..... 4–25

4–19 System-to-21264/EV68A Probe Commands ..... 4–26

4–20 System-to-21264/EV68A Probe Commands Fields Descriptions ..... 4–27

4–21 Data Movement Selection by Probe[4:3] ..... 4–27

4–22 Next Cache Block State Selection by Probe[2:0] ..... 4–27

4–23 Data Transfer Command Format ..... 4–28

4–24 SysDc[4:0] Field Description ..... 4–29

4–25 SYSCLK Cycles Between SysAddOut and SysData ..... 4–32

4–26 Cbox CSR SYSDC_DELAY[4:0] Examples ..... 4–33

4–27 Four Timing Examples ..... 4–34

4–28 Data Wrapping Rules ..... 4–36

4–29 System Wrap and Deliver Data ..... 4–37

4–30 Wrap Interleave Order ..... 4–37

4–31 Wrap Order for Double-Pumped Data Transfers ..... 4–38

4–32 21264/EV68A Commands with NXM Addresses and System Response ..... 4–39

4–33 21264/EV68A Response to System Probe and In-Flight Command Interaction ..... 4–41


4–34 Rules for System Control of Cache Status Update Order ..... 4–42

4–35 Range of Maximum Bcache Clock Ratios ..... 4–43

4–36 Bcache Port Pins ..... 4–43

4–37 BC_CPU_CLK_DELAY[1:0] Values ..... 4–45

4–38 BC_CLK_DELAY[1:0] Values ..... 4–45

4–39 Program Values to Set the Cache Clock Period (Single-Data) ..... 4–46

4–40 Program Values to Set the Cache Clock Period (Dual-Data Rate) ..... 4–46

4–41 Data-Sample/Drive Cbox CSRs ..... 4–47

4–42 Programming the Bcache to Support Each Size of the Bcache ..... 4–51

4–43 Programming the Bcache Control Pins ..... 4–51

4–44 Control Pin Assertion for RAM_TYPE A ..... 4–51

4–45 Control Pin Assertion for RAM_TYPE B ..... 4–52

4–46 Control Pin Assertion for RAM_TYPE C ..... 4–52

4–47 Control Pin Assertion for RAM_TYPE D ..... 4–52

5–1 Internal Processor Registers ..... 5–1

5–2 Cycle Counter Control Register Fields Description ..... 5–4

5–3 Virtual Address Control Register Fields Description ..... 5–5

5–4 ProfileMe PC Fields Description ..... 5–8

5–5 IER_CM Register Fields Description ..... 5–10

5–6 Software Interrupt Request Register Fields Description ..... 5–11

5–7 Interrupt Summary Register Fields Description ..... 5–12

5–8 Hardware Interrupt Clear Register Fields Description ..... 5–13

5–9 Exception Summary Register Fields Description ..... 5–14

5–10 PAL Base Register Fields Description ..... 5–15

5–11 Ibox Control Register Fields Description ..... 5–16

5–12 Ibox Status Register Fields Description ..... 5–19

5–13 IPR Index Bits and Register Fields ..... 5–21

5–14 Process Context Register Fields Description ..... 5–22

5–15 Performance Counter Control Register Fields Description ..... 5–23

5–16 Performance Counter Control Register Input Select Fields ..... 5–25

5–17 DTB Alternate Processor Mode Register Fields Description ..... 5–26

5–18 Memory Management Status Register Fields Description ..... 5–28

5–19 Mbox Control Register Fields Description ..... 5–30

5–20 Dcache Control Register Fields Description ..... 5–31

5–21 Dcache Status Register Fields Description ..... 5–32

5–22 Cbox Data Register Fields Description ..... 5–33

5–23 Cbox Shift Register Fields Description ..... 5–33

5–24 Cbox WRITE_ONCE Chain Order ..... 5–34

5–25 Cbox WRITE_MANY Chain Order ..... 5–39

5–26 Cbox Read IPR Fields Description ..... 5–41

6–1 Required PALcode Function Codes ..... 6–3

6–2 Opcodes Reserved for PALcode ..... 6–3

6–3 HW_LD Instruction Fields Descriptions ..... 6–4

6–4 HW_ST Instruction Fields Descriptions ..... 6–5

6–5 HW_RET Instruction Fields Descriptions ..... 6–6

6–6 HW_MFPR and HW_MTPR Instructions Fields Descriptions ..... 6–7

6–7 Paired Instruction Fetch Order ..... 6–9

6–8 PALcode Exception Entry Locations ..... 6–13

6–9 IPRs Used for Performance Counter Support ..... 6–18

6–10 Aggregate Mode Returned IPR Contents ..... 6–19

6–11 Aggregate Mode Performance Counter IPR Input Select Fields ..... 6–20

6–12 CMOV Decomposed ..... 6–21

6–13 ProfileMe Mode Returned IPR Contents ..... 6–22

6–14 ProfileMe Mode PCTR_CTL Input Select Fields ..... 6–24

7–1 21264/EV68A Reset State Machine Major Operations ..... 7–1

7–2 Signal Pin Reset State ..... 7–3

7–3 Pin Signal Names and Initialization State ..... 7–5

7–4 Power-Up Flow Signals and Their Constraints ..... 7–7

7–5 Effect on IPRs After Fault Reset ..... 7–8


7–6 Effect on IPRs After Transition Through Sleep Mode ..... 7–10

7–7 Signals and Constraints for the Sleep Mode Sequence ..... 7–11

7–8 Effect on IPRs After Warm Reset ..... 7–11

7–9 WRITE_MANY Chain CSR Values for Bcache Initialization ..... 7–12

7–10 Internal Processor Registers at Power-Up Reset State ..... 7–14

7–11 21264/EV68A Reset State Machine State Descriptions ..... 7–17

7–12 Differential Reference Clock Frequencies in Full-Speed Lock ..... 7–20

8–1 21264/EV68A Error Detection Mechanisms ..... 8–1

8–2 64-Bit Data and Check Bit ECC Code ..... 8–2

8–3 Error Case Summary ..... 8–10

9–1 Maximum Electrical Ratings ..... 9–1

9–2 Signal Types ..... 9–2

9–3 VDD (I_DC_POWER) ..... 9–3

9–4 Input DC Reference Pin (I_DC_REF) ..... 9–3

9–5 Input Differential Amplifier Receiver (I_DA) ..... 9–3

9–6 Input Differential Amplifier Clock Receiver (I_DA_CLK) ..... 9–3

9–7 Pin Type: Open-Drain Output Driver (O_OD) ..... 9–4

9–8 Bidirectional, Differential Amplifier Receiver, Open-Drain Output Driver (B_DA_OD) ..... 9–4

9–9 Pin Type: Open-Drain Driver for Test Pins (O_OD_TP) ..... 9–4

9–10 Bidirectional, Differential Amplifier Receiver, Push-Pull Output Driver (B_DA_PP) ..... 9–4

9–11 Push-Pull Output Driver (O_PP) ..... 9–5

9–12 Push-Pull Output Clock Driver (O_PP_CLK) ..... 9–5

9–13 AC Specifications ..... 9–7

10–1 Operating Temperature at Heat Sink Center (Tc) ..... 10–1

10–2 θca at Various Airflows for 21264/EV68A ..... 10–2

10–3 Maximum Ta for 21264/EV68A @ 750 MHz and @ 1.7 V with Various Airflows ..... 10–2

10–4 Maximum Ta for 21264/EV68A @ 833 MHz and @ 1.7 V with Various Airflows ..... 10–2

10–5 Maximum Ta for 21264/EV68A @ 875 MHz and @ 1.7 V with Various Airflows ..... 10–2

10–6 Maximum Ta for 21264/EV68A @ 940 MHz and @ 1.7 V with Various Airflows ..... 10–2

11–1 Dedicated Test Port Pins ..... 11–1

11–2 IEEE 1149.1 Instructions and Opcodes ..... 11–3

11–3 TAP Controller State Machine ..... 11–4

11–4 Icache Bit Fields in an SROM Line ..... 11–7

A–1 Instruction Format and Opcode Notation ..... A–1

A–2 Architecture Instructions ..... A–2

A–3 Opcodes Reserved for Compaq ..... A–8

A–4 Opcodes Reserved for PALcode ..... A–9

A–5 IEEE Floating-Point Instruction Function Codes ..... A–9

A–6 VAX Floating-Point Instruction Function Codes ..... A–11

A–7 Independent Floating-Point Instruction Function Codes ..... A–12

A–8 Opcode Summary ..... A–12

A–9 Key to Opcode Summary Used in Table A–8 ..... A–13

A–10 Required PALcode Function Codes ..... A–13

A–11 Exceptional Input and Output Conditions ..... A–15

E–1 Bcache Forwarding Clock Pin Groupings ..... E–1

E–2 Late-Write Non-Bursting SSRAMs Data Pin Usage ..... E–2

E–3 Late-Write Non-Bursting SSRAMs Tag Pin Usage ..... E–2

E–4 Dual-Data Rate SSRAM Data Pin Usage ..... E–3

E–5 Dual-Data Rate SSRAM Tag Pin Usage ..... E–4


Preface

Audience

This manual is for system designers and programmers who use the Alpha 21264/EV68A microprocessor (referred to as the 21264/EV68A).

Content

This manual contains the following chapters and appendixes:

Chapter 1, Introduction, introduces the 21264/EV68A and provides an overview of the Alpha architecture.

Chapter 2, Internal Architecture, describes the major hardware functions and the internal chip architecture. It describes performance measurement facilities, coding rules, and design examples.

Chapter 3, Hardware Interface, lists and describes the internal hardware interface signals, and provides mechanical data and packaging information, including signal pin lists.

Chapter 4, Cache and External Interfaces, describes the external bus functions and transactions, lists bus commands, and describes the clock functions.

Chapter 5, Internal Processor Registers, lists and describes the internal processor register set.

Chapter 6, Privileged Architecture Library Code, describes the privileged architecture library code (PALcode).

Chapter 7, Initialization and Configuration, describes the initialization and configuration sequence.

Chapter 8, Error Detection and Error Handling, describes error detection and error handling.

Chapter 9, Electrical Data, provides electrical data and describes signal integrity issues.

Chapter 10, Thermal Management, provides information about thermal management.

Chapter 11, Testability and Diagnostics, describes chip and system testability features.

Appendix A, Alpha Instruction Set, summarizes the Alpha instruction set.

Appendix B, 21264/EV68A Boundary-Scan Register, presents the BSDL description of the 21264/EV68A boundary-scan register.

Appendix C, Serial Icache Load Predecode Values, provides a pointer to the Alpha Motherboards Software Developer’s Kit (SDK), which contains this information.

Appendix D, PALcode Restrictions and Guidelines, lists restrictions and guidelines that must be adhered to when generating PALcode.

Appendix E, 21264/EV68A-to-Bcache Pin Interface, provides the pin interface between the 21264/EV68A and Bcache SSRAMs.

The Glossary lists and defines terms associated with the 21264/EV68A.

An Index is provided at the end of the document.

Documentation Included by Reference

The companion volume to this manual, the Alpha Architecture Reference Manual, Fourth Edition, can be accessed from the following website: ftp.compaq.com/pub/products/alphaCPUdocs.


Terminology and Conventions

This section defines the abbreviations, terminology, and other conventions used throughout this document.

Abbreviations

• Binary Multiples

The abbreviations K, M, and G (kilo, mega, and giga) represent binary multiples and have the following values.

K = 2^10 (1024)
M = 2^20 (1,048,576)
G = 2^30 (1,073,741,824)

For example:

2KB       = 2 kilobytes  = 2 × 2^10 bytes
4MB       = 4 megabytes  = 4 × 2^20 bytes
8GB       = 8 gigabytes  = 8 × 2^30 bytes
2K pixels = 2 kilopixels = 2 × 2^10 pixels
4M pixels = 4 megapixels = 4 × 2^20 pixels

• Register Access

The abbreviations used to indicate the type of access to register fields and bits have the following definitions:

Abbreviation Meaning

IGN Ignore

Bits and fields specified are ignored on writes.

MBZ Must Be Zero

Software must never place a nonzero value in bits and fields specified as MBZ. A nonzero read produces an Illegal Operand exception. Also, MBZ fields are reserved for future use.

RAZ Read As Zero

Bits and fields return a zero when read.

RC Read Clears

Bits and fields are cleared when read. Unless otherwise specified, such bits cannot be written.

RES Reserved

Bits and fields are reserved by Compaq and should not be used; however, zeros can be written to reserved fields that cannot be masked.

RO Read Only

The value may be read by software. It is written by hardware. Software write operations are ignored.

RO,n Read Only, and takes the value n at power-on reset.

The value may be read by software. It is written by hardware. Software write operations are ignored.


RW Read/Write

Bits and fields can be read and written.

RW,n Read/Write, and takes the value n at power-on reset.

Bits and fields can be read and written.

W1C Write One to Clear

If read operations are allowed to the register, then the value may be read by software. If it is a write-only register, then a read operation by software returns an UNPREDICTABLE result. Software write operations of a 1 cause the bit to be cleared by hardware. Software write operations of a 0 do not modify the state of the bit.

W1S Write One to Set

If read operations are allowed to the register, then the value may be read by software. If it is a write-only register, then a read operation by software returns an UNPREDICTABLE result. Software write operations of a 1 cause the bit to be set by hardware. Software write operations of a 0 do not modify the state of the bit.

WO Write Only

Bits and fields can be written but not read.

WO,n Write Only, and takes the value n at power-on reset.

Bits and fields can be written but not read.
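The W1C and W1S write behavior defined in the table above can be modeled in a few lines. The following Python sketch is illustrative only and is not 21264/EV68A code; the function names write_w1c and write_w1s and the 64-bit register width are assumptions made for this example.

```python
MASK64 = (1 << 64) - 1  # assumed register width for this sketch


def write_w1c(current: int, written: int) -> int:
    """Model a write to a W1C field: each written 1 clears that bit."""
    # A written 0 leaves the corresponding bit unchanged.
    return current & ~written & MASK64


def write_w1s(current: int, written: int) -> int:
    """Model a write to a W1S field: each written 1 sets that bit."""
    # A written 0 leaves the corresponding bit unchanged.
    return (current | written) & MASK64


# Hypothetical status value with bits 3, 1, and 0 set.
pending = 0b1011
pending = write_w1c(pending, 0b0010)  # clears bit 1 only
```

Note that writing all zeros is a no-op in both models, which is why software can clear or set one bit without first reading the register.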

• Sign extension

SEXT(x) means x is sign-extended to the required size.
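As a rough illustration of SEXT(x), the Python sketch below sign-extends the low bits of a value to a wider size. The function name sext, its parameters, and the default 64-bit result size are assumptions for this example; the manual itself only defines the notation.

```python
def sext(x: int, bits: int, size: int = 64) -> int:
    """Sign-extend the low `bits` bits of x to `size` bits.

    The result is returned as an unsigned `size`-bit integer.
    """
    if not 0 < bits <= size:
        raise ValueError("bits must be in the range 1..size")
    x &= (1 << bits) - 1                 # keep only the source field
    sign = 1 << (bits - 1)               # sign bit of the source field
    extended = (x ^ sign) - sign         # signed Python integer
    return extended & ((1 << size) - 1)  # wrap into `size` bits


# A 16-bit value with its sign bit set is extended with ones.
displacement = sext(0x8000, 16)
```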

Addresses

Unless otherwise noted, all addresses and offsets are hexadecimal.

Aligned and Unaligned

The terms aligned and naturally aligned are interchangeable and refer to data objects that are powers of two in size. An aligned datum of size 2^n is stored in memory at a byte address that is a multiple of 2^n; that is, one that has n low-order zeros. For example, an aligned 64-byte stack frame has a memory address that is a multiple of 64.

A datum of size 2^n is unaligned if it is stored in a byte address that is not a multiple of 2^n.
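The alignment rule above reduces to a test of the low-order address bits. The Python sketch below shows one way to express it; the function name is_aligned is an assumption for this example.

```python
def is_aligned(address: int, size: int) -> bool:
    """Return True if `address` is a multiple of `size` (a power of two)."""
    if size <= 0 or size & (size - 1):
        raise ValueError("size must be a power of two")
    # For size = 2**n, the n low-order address bits must all be zero.
    return (address & (size - 1)) == 0


# A 64-byte stack frame at 0x1000 is aligned; at 0x1008 it is not.
frame_ok = is_aligned(0x1000, 64)
frame_bad = is_aligned(0x1008, 64)
```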

Bit Notation

Multiple-bit fields can include contiguous and noncontiguous bits contained in square brackets ([]). Multiple contiguous bits are indicated by a pair of numbers separated by a colon [:]. For example, [9:7,5,2:0] specifies bits 9, 8, 7, 5, 2, 1, and 0. Similarly, single bits are frequently indicated with square brackets. For example, [27] specifies bit 27. See also Field Notation.
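To make the bit notation concrete, the following Python sketch expands a specification such as [9:7,5,2:0] into individual bit numbers and into a mask. The helper names bits_from_notation and mask_from_notation are hypothetical and exist only for this illustration.

```python
def bits_from_notation(spec: str) -> list[int]:
    """Expand a bit specification such as "[9:7,5,2:0]" into bit numbers."""
    bits = []
    for part in spec.strip("[]").split(","):
        if ":" in part:
            high, low = (int(n) for n in part.split(":"))
            bits.extend(range(high, low - 1, -1))  # inclusive, high to low
        else:
            bits.append(int(part))
    return bits


def mask_from_notation(spec: str) -> int:
    """Build an integer mask with a 1 in every bit named by `spec`."""
    mask = 0
    for bit in bits_from_notation(spec):
        mask |= 1 << bit
    return mask


example_bits = bits_from_notation("[9:7,5,2:0]")
example_mask = mask_from_notation("[9:7,5,2:0]")
```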

Caution

Cautions indicate potential damage to equipment or loss of data.


Data Units

The following data unit terminology is used throughout this manual.

Term      Words  Bytes  Bits  Other
Byte      ½      1      8     —
Word      1      2      16    —
Longword  2      4      32    Dword
Quadword  4      8      64    2 longwords

Do Not Care (X)

A capital X represents any valid value.

External

Unless otherwise stated, external means not contained in the chip.

Field Notation

The names of single-bit and multiple-bit fields can be used rather than the actual bit numbers (see Bit Notation). When the field name is used, it is contained in square brackets ([]). For example, RegisterName[LowByte] specifies RegisterName[7:0].

Note

Notes emphasize particularly important information.

Numbering

All numbers are decimal or hexadecimal unless otherwise indicated. The prefix 0x indicates a hexadecimal number. For example, 19 is decimal, but 0x19 and 0x19A are hexadecimal (also see Addresses). Otherwise, the base is indicated by a subscript; for example, 100₂ is a binary number.

Ranges and Extents

Ranges are specified by a pair of numbers separated by two periods (..) and are inclusive. For example, a range of integers 0..4 includes the integers 0, 1, 2, 3, and 4.

Extents are specified by a pair of numbers in square brackets ([]) separated by a colon (:) and are inclusive. Bit fields are often specified as extents. For example, bits [7:3] specifies bits 7, 6, 5, 4, and 3.
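Because extents are inclusive at both ends, an extent [high:low] covers high − low + 1 bits. The Python sketch below extracts such a field from a register value; the function name extract_extent and the sample register value are assumptions for this example.

```python
def extract_extent(value: int, high: int, low: int) -> int:
    """Return bits [high:low] of `value`, inclusive, right-justified."""
    if high < low or low < 0:
        raise ValueError("expected high >= low >= 0")
    width = high - low + 1               # both end bits are included
    return (value >> low) & ((1 << width) - 1)


# Hypothetical 64-bit register contents.
register = 0x1234_5678_9ABC_DEF0
low_byte = extract_extent(register, 7, 0)   # bits [7:0]
field = extract_extent(0xF8, 7, 3)          # bits 7, 6, 5, 4, and 3

# An inclusive range 0..4 corresponds to range(0, 5) in Python.
integers = list(range(0, 4 + 1))
```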

The gray areas in register figures indicate reserved or unused bits and fields.

Bit ranges that are coupled with the field name specify the bits of the named field that are included in the register. The bit range may, but need not necessarily, correspond to the bit Extent in the register. See the explanation above Table 5–1 for more information.

Signal Names

The following examples describe signal-name conventions used in this document.


AlphaSignal[n:n]    Boldface, mixed-case type denotes signal names that are assigned internal and external to the 21264/EV68A (that is, the signal traverses a chip interface pin).

AlphaSignal_x[n:n]  When a signal has high and low assertion states, a lowercase italic x represents the assertion states. For example, SignalName_x[3:0] represents SignalName_H[3:0] and SignalName_L[3:0].

UNDEFINED

Operations specified as UNDEFINED may vary from moment to moment, implementation to implementation, and instruction to instruction within implementations. The operation may vary in effect from nothing to stopping system operation.

UNDEFINED operations may halt the processor or cause it to lose information. However, UNDEFINED operations must not cause the processor to hang, that is, reach an unhalted state from which there is no transition to a normal state in which the machine executes instructions.

UNPREDICTABLE

UNPREDICTABLE results or occurrences do not disrupt the basic operation of the processor; it continues to execute instructions in its normal manner. Further:

• Results or occurrences specified as UNPREDICTABLE may vary from moment to moment, implementation to implementation, and instruction to instruction within implementations. Software can never depend on results specified as UNPREDICTABLE.

• An UNPREDICTABLE result may acquire an arbitrary value subject to a few constraints. Such a result may be an arbitrary function of the input operands or of any state information that is accessible to the process in its current access mode. UNPREDICTABLE results may be unchanged from their previous values.

Operations that produce UNPREDICTABLE results may also produce exceptions.

• An occurrence specified as UNPREDICTABLE may happen or not based on an arbitrary choice function. The choice function is subject to the same constraints as are UNPREDICTABLE results and, in particular, must not constitute a security hole.

Specifically, UNPREDICTABLE results must not depend upon, or be a function of, the contents of memory locations or registers that are inaccessible to the current process in the current access mode.

Also, operations that may produce UNPREDICTABLE results must not:

– Write or modify the contents of memory locations or registers to which the current process in the current access mode does not have access, or

– Halt or hang the system or any of its components.

For example, a security hole would exist if some UNPREDICTABLE result depended on the value of a register in another process, on the contents of processor temporary registers left behind by some previously running process, or on a sequence of actions of different processes.




This chapter provides a brief introduction to the Alpha architecture, Compaq’s RISC (reduced instruction set computing) architecture designed for high performance. The chapter then summarizes the specific features of the Alpha 21264/EV68A microprocessor (hereafter called the 21264/EV68A) that implements the Alpha architecture. Appendix A provides a list of Alpha instructions.

The companion volume to this document, the Alpha Architecture Reference Manual, Fourth Edition, contains the complete architecture information.

1.1 The Architecture

The Alpha architecture is a 64-bit load and store RISC architecture designed with particular emphasis on speed, multiple instruction issue, multiple processors, and software migration from many operating systems.

All registers are 64 bits long and all operations are performed between 64-bit registers. All instructions are 32 bits long. Memory operations are either load or store operations. All data manipulation is done between registers.

The Alpha architecture supports the following data types:

• 8-, 16-, 32-, and 64-bit integers

• IEEE 32-bit and 64-bit floating-point formats

• VAX architecture 32-bit and 64-bit floating-point formats

In the Alpha architecture, instructions interact with each other only by one instruction writing to a register or memory location and another instruction reading from that register or memory location. This use of resources makes it easy to build implementations that issue multiple instructions every CPU cycle.

The 21264/EV68A uses a set of subroutines, called privileged architecture library code (PALcode), that is specific to a particular Alpha operating system implementation and hardware platform. These subroutines provide operating system primitives for context switching, interrupts, exceptions, and memory management. These subroutines can be invoked by hardware or CALL_PAL instructions. CALL_PAL instructions use the function field of the instruction to vector to a specified subroutine. PALcode is written in standard machine code with some implementation-specific extensions to provide direct access to low-level hardware functions. PALcode supports optimizations for multiple operating systems, flexible memory-management implementations, and multi-instruction atomic sequences.
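As a rough illustration of how a CALL_PAL instruction selects a subroutine, the sketch below extracts the function field from a 32-bit instruction word. It assumes the PALcode instruction format defined by the Alpha architecture (opcode in bits <31:26>, with CALL_PAL using opcode 0x00, and the function field in bits <25:0>). The helper name and the sample function code are hypothetical, and the mapping from function code to a PALcode entry address is implementation-specific and is not modeled here.

```python
CALL_PAL_OPCODE = 0x00         # assumption: CALL_PAL opcode per the Alpha architecture
FUNCTION_MASK = (1 << 26) - 1  # assumption: function field occupies bits <25:0>


def decode_call_pal(instruction: int) -> int:
    """Return the function field of a CALL_PAL instruction word.

    Raises ValueError if the word is not a 32-bit CALL_PAL instruction.
    """
    if not 0 <= instruction < (1 << 32):
        raise ValueError("Alpha instructions are 32 bits long")
    opcode = instruction >> 26
    if opcode != CALL_PAL_OPCODE:
        raise ValueError("not a CALL_PAL instruction")
    return instruction & FUNCTION_MASK


# Hypothetical function code, used only to exercise the decoder.
sample_word = (CALL_PAL_OPCODE << 26) | 0x83
print(hex(decode_call_pal(sample_word)))
```

The decoder stops at the function code on purpose: which subroutine that code reaches depends on the PALcode image loaded for a given operating system and platform.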

The Alpha architecture performs byte shifting and masking with normal 64-bit, register-to-register instructions. The 21264/EV68A performs single-byte and single-word load and store instructions.
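Byte shifting and masking on a 64-bit register value can be pictured with the sketch below. It shows the general shift-and-mask technique in Python and does not reproduce the semantics of any particular Alpha instruction; the function names are hypothetical.

```python
MASK64 = (1 << 64) - 1  # registers are 64 bits wide


def extract_byte(reg: int, index: int) -> int:
    """Return byte `index` (0 = least significant) of a 64-bit value."""
    if not 0 <= index < 8:
        raise ValueError("byte index must be in 0..7")
    return ((reg & MASK64) >> (8 * index)) & 0xFF


def insert_byte(reg: int, index: int, value: int) -> int:
    """Return `reg` with byte `index` replaced by the low 8 bits of `value`."""
    if not 0 <= index < 8:
        raise ValueError("byte index must be in 0..7")
    shift = 8 * index
    cleared = reg & ~(0xFF << shift) & MASK64  # mask out the target byte
    return cleared | ((value & 0xFF) << shift)


quad = 0x1122334455667788
print(hex(extract_byte(quad, 0)), hex(insert_byte(quad, 7, 0xFF)))
```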

1.1.1 Addressing

The basic addressable unit in the Alpha architecture is the 8-bit byte. The 21264/EV68A supports a 48-bit or 43-bit virtual address (selectable under IPR control).

Virtual addresses as seen by the program are translated into physical memory addresses by the memory-management mechanism. The 21264/EV68A supports a 44-bit physical address.
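To put the address widths above in concrete terms, the following sketch computes the size of each address space in bytes. Only the bit widths come from this manual; the constant and function names are hypothetical.

```python
VIRTUAL_ADDRESS_BITS = (48, 43)  # selectable under IPR control
PHYSICAL_ADDRESS_BITS = 44
TIB = 1 << 40                    # one tebibyte, used for readable output


def address_space_bytes(bits: int) -> int:
    """Number of distinct byte addresses reachable with `bits` address bits."""
    return 1 << bits


def fits_physical(address: int) -> bool:
    """True if `address` is representable in the 44-bit physical address."""
    return 0 <= address < address_space_bytes(PHYSICAL_ADDRESS_BITS)


for bits in (*VIRTUAL_ADDRESS_BITS, PHYSICAL_ADDRESS_BITS):
    print(f"{bits}-bit address space: {address_space_bytes(bits) // TIB} TiB")
```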

1.1.2 Integer Data Types

The Alpha architecture supports the four integer data types listed in Table 1–1.

Table 1–1 Integer Data Types

Data Type   Description
Byte        A byte is 8 contiguous bits that start at an addressable byte boundary. A byte is an 8-bit value.
Word        A word is 2 contiguous bytes that start at an arbitrary byte boundary. A word is a 16-bit value.
Longword    A longword is 4 contiguous bytes that start at an arbitrary byte boundary. A longword is a 32-bit value.
Quadword    A quadword is 8 contiguous bytes that start at an arbitrary byte boundary. A quadword is a 64-bit value.

Note: Alpha implementations may impose a significant performance penalty when accessing operands that are not naturally aligned. Refer to the Alpha Architecture Handbook, Version 4 for details.
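The alignment note can be made concrete with a small check. The sketch below assumes the usual meaning of natural alignment, namely that an operand's address is a multiple of its size in bytes; the table and function names are hypothetical.

```python
# Sizes in bytes of the four integer data types in Table 1-1.
SIZE_BYTES = {"byte": 1, "word": 2, "longword": 4, "quadword": 8}


def is_naturally_aligned(address: int, data_type: str) -> bool:
    """True if `address` is a multiple of the operand size for `data_type`."""
    return address % SIZE_BYTES[data_type] == 0


print(is_naturally_aligned(0x1000, "quadword"))  # address is a multiple of 8
print(is_naturally_aligned(0x1004, "quadword"))  # multiple of 4 but not of 8
```

A check like this only identifies accesses that may incur the penalty; the size of the penalty is implementation-dependent.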

1.1.3 Floating-Point Data Types

The 21264/EV68A supports the following floating-point data types:

• Longword integer format in floating-point unit

• Quadword integer format in floating-point unit

• IEEE floating-point formats

– S_floating

– T_floating

• VAX floating-point formats

– F_floating

– G_floating

– D_floating (limited support)

1.2 21264/EV68A Microprocessor Features

The 21264/EV68A microprocessor is a superscalar pipelined processor. It is packaged in a 587-pin PGA carrier and has removable application-specific heat sinks. A number of configuration options allow its use in system designs ranging from extremely simple uniprocessor systems with minimum component count to high-performance multiprocessor systems with very high cache and memory bandwidth.

The 21264/EV68A can issue four Alpha instructions in a single cycle, thereby minimizing the average cycles per instruction (CPI). A number of low-latency and/or high-throughput features in the instruction issue unit and the onchip components of the memory subsystem further reduce the average CPI.

The 21264/EV68A and associated PALcode implement IEEE single-precision and double-precision and VAX F_floating and G_floating data types, and support longword (32-bit) and quadword (64-bit) integers. Byte (8-bit) and word (16-bit) support is provided by byte-manipulation instructions. Limited hardware support is provided for the VAX D_floating data type.

Other 21264/EV68A features include:

• The ability to issue up to four instructions during each CPU clock cycle.

• A peak instruction execution rate of four times the CPU clock frequency.

• An onchip, demand-paged memory-management unit with translation buffer, which, when used with PALcode, can implement a variety of page table structures and translation algorithms. The unit consists of a 128-entry, fully-associative data translation buffer (DTB) and a 128-entry, fully-associative instruction translation buffer (ITB), with each entry able to map a single 8KB page or a group of 8, 64, or 512 8KB pages. The allocation scheme for the ITB and DTB is round-robin. The size of each translation buffer entry’s group is specified by hint bits stored in the entry. The DTB and ITB implement 8-bit address space numbers (ASN), MAX_ASN=255.

• Two onchip, high-throughput pipelined floating-point units, capable of executing both VAX and IEEE floating-point data types.

• An onchip, 64KB virtually-addressed instruction cache with 8-bit ASNs (MAX_ASN=255).

• An onchip, virtually-indexed, physically-tagged dual-read-ported, 64KB data cache.

• Support for a 48-bit or 43-bit virtual address (program selectable).

• Support for a 44-bit physical address.

• An onchip I/O write buffer with four 64-byte entries for I/O write transactions.

• An onchip, 8-entry victim data buffer.

• An onchip, 32-entry load queue.

• An onchip, 32-entry store queue.

• An onchip, 8-entry miss address file for cache fill requests and I/O read transactions.

• An onchip, 8-entry probe queue, holding pending system port probe commands.

• An onchip, duplicate tag array used to maintain level 2 cache coherency.

• A 64-bit data bus with onchip parity and error correction code (ECC) support.

• Support for an external second-level cache (Bcache). The size and some timing parameters of the Bcache are programmable.

• An internal clock generator providing a high-speed clock used by the 21264/EV68A, and two clocks for use by the CPU module.

• Onchip performance counters to measure and analyze CPU and system performance.

• Chip and module level test support, including an instruction cache test interface to support chip and module level testing.

• A 2.0-V external interface.
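The translation buffer figures in the feature list imply a range of address coverage per buffer. The sketch below works through that arithmetic using only the numbers stated above (128 entries, 8KB pages, groups of 1, 8, 64, or 512 pages). The names are hypothetical, and the encoding of the hint bits is not modeled.

```python
PAGE_BYTES = 8 * 1024           # 8KB page
GROUP_PAGES = (1, 8, 64, 512)   # pages mapped by a single entry
TB_ENTRIES = 128                # entries in each of the DTB and ITB


def entry_bytes(pages: int) -> int:
    """Bytes mapped by one translation buffer entry of the given group size."""
    if pages not in GROUP_PAGES:
        raise ValueError("group size must be 1, 8, 64, or 512 pages")
    return pages * PAGE_BYTES


def tb_reach_bytes(pages: int) -> int:
    """Bytes mapped by a full buffer if every entry uses the same group size."""
    return TB_ENTRIES * entry_bytes(pages)


for pages in GROUP_PAGES:
    print(pages, entry_bytes(pages) // 1024, "KB per entry,",
          tb_reach_bytes(pages) // (1024 * 1024), "MB per buffer")
```

In practice a buffer holds a mix of group sizes, so the per-buffer figures are bounds rather than typical values.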

Refer to Chapter 9 for 21264/EV68A dc and ac electrical characteristics. Refer to the Alpha Architecture Handbook, Version 4, Appendix E, for waivers and any other implementation-dependent information.

Internal Architecture

This chapter provides both an overview of the 21264/EV68A microarchitecture and a system designer’s view of the 21264/EV68A implementation of the Alpha architecture. The combination of the 21264/EV68A microarchitecture and privileged architecture library code (PALcode) defines the chip’s implementation of the Alpha architecture. If a certain piece of hardware seems to be “architecturally incomplete,” the missing functionality is implemented in PALcode. Chapter 6 provides more information on PALcode.

This chapter describes the major functional hardware units and is not intended to be a detailed hardware description of the chip. It is organized as follows:

• 21264/EV68A microarchitecture

• Pipeline organization

• Instruction issue and retire rules

• Load instructions to R31/F31 (software-directed instruction prefetch)

• Special cases of Alpha instruction execution

• Memory and I/O address space

• Miss address file (MAF) and load-merging rules

• Instruction ordering

• Replay traps

• I/O write buffer and the WMB instruction

• Performance measurement support

• Floating-point control register

• AMASK and IMPLVER instruction values

• Design examples

2.1 21264/EV68A Microarchitecture

The 21264/EV68A microprocessor is a high-performance third-generation implementation of the Compaq Alpha architecture. The 21264/EV68A consists of the following sections, as shown in Figure 2–1:

• Instruction fetch, issue, and retire unit (Ibox)

• Integer execution unit (Ebox)

• Floating-point execution unit (Fbox)

• Onchip caches (Icache and Dcache)

• Memory reference unit (Mbox)

• External cache and system interface unit (Cbox)

• Pipeline operation sequence

2.1.1 Instruction Fetch, Issue, and Retire Unit

The instruction fetch, issue, and retire unit (Ibox) consists of the following subsections:

• Virtual program counter logic

• Branch predictor

• Instruction-stream translation buffer (ITB)

• Instruction fetch logic

• Register rename maps

• Integer and floating-point issue queues

• Exception and interrupt logic

• Retire logic

2.1.1.1 Virtual Program Counter Logic

The virtual program counter (VPC) logic maintains the virtual addresses for instructions that are in flight. There can be up to 80 instructions, in 20 successive fetch slots, in flight between the register rename mappers and the end of the pipeline. The VPC logic contains a 20-entry table to store these fetched VPC addresses.
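The figures above work out to four instructions per fetch slot (80 instructions across 20 slots). The sketch below is a toy software model of a 20-entry table under that reading. It is an illustration only, with hypothetical class and method names, and it makes no claim about how the hardware allocates or frees entries.

```python
from collections import deque

FETCH_SLOTS = 20      # entries in the VPC table
MAX_IN_FLIGHT = 80    # instructions in flight
INSTRUCTIONS_PER_SLOT = MAX_IN_FLIGHT // FETCH_SLOTS  # 4


class VpcTable:
    """Toy model: one stored VPC address per fetch slot, oldest first."""

    def __init__(self) -> None:
        self._slots = deque()

    def allocate(self, vpc: int) -> bool:
        """Record the VPC of a new fetch slot; return False if the table is full."""
        if len(self._slots) >= FETCH_SLOTS:
            return False
        self._slots.append(vpc)
        return True

    def retire_oldest(self) -> int:
        """Free the oldest fetch slot and return its VPC address."""
        return self._slots.popleft()

    def instructions_covered(self) -> int:
        """Upper bound on in-flight instructions for the occupied slots."""
        return len(self._slots) * INSTRUCTIONS_PER_SLOT


table = VpcTable()
for slot in range(FETCH_SLOTS):
    table.allocate(0x10000 + 16 * slot)  # hypothetical addresses, 4 x 32-bit instructions apart
print(table.instructions_covered())
```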
