Compaq EV68A User Manual

21264/EV68A Microprocessor Hardware Reference Manual
Part Number: DS–0038B–TE
This manual is directly derived from the internal 21264/EV68A Specifications, Revision 1.1. You can access this hardware reference manual in PDF format from the following site:
Revision/Update Information: Revision 1.1, March 2002
Compaq Computer Corporation, Shrewsbury, Massachusetts
March 2002
The information in this publication is subject to change without notice.
COMPAQ COMPUTER CORPORATION SHALL NOT BE LIABLE FOR TECHNICAL OR EDITORIAL ERRORS OR OMISSIONS CONTAINED HEREIN, NOR FOR INCIDENTAL OR CONSEQUENTIAL DAMAGES RESULTING FROM THE FURNISHING, PERFORMANCE, OR USE OF THIS MATERIAL. THIS INFORMATION IS PROVIDED “AS IS” AND COMPAQ COMPUTER CORPORATION DISCLAIMS ANY WARRANTIES, EXPRESS, IMPLIED OR STATUTORY AND EXPRESSLY DISCLAIMS THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR PARTICULAR PURPOSE, GOOD TITLE AND AGAINST INFRINGEMENT.
This publication contains information protected by copyright. No part of this publication may be photocopied or reproduced in any form without prior written consent from Compaq Computer Corporation.
© Compaq Computer Corporation 2002. All rights reserved. Printed in the U.S.A.
COMPAQ, the Compaq logo, the Digital logo, and VAX Registered in United States Patent and Trademark Office.
Pentium is a registered trademark of Intel Corporation.
Other product names mentioned herein may be trademarks and/or registered trademarks of their respective companies.
21264/EV68A Hardware Reference Manual

Table of Contents

Preface
1 Introduction
1.1 The Architecture ...... 1–1
1.1.1 Addressing ...... 1–2
1.1.2 Integer Data Types ...... 1–2
1.1.3 Floating-Point Data Types ...... 1–2
1.2 21264/EV68A Microprocessor Features ...... 1–3
2 Internal Architecture
2.1 21264/EV68A Microarchitecture ...... 2–1
2.1.1 Instruction Fetch, Issue, and Retire Unit ...... 2–2
2.1.1.1 Virtual Program Counter Logic ...... 2–2
2.1.1.2 Branch Predictor ...... 2–3
2.1.1.3 Instruction-Stream Translation Buffer ...... 2–5
2.1.1.4 Instruction Fetch Logic ...... 2–6
2.1.1.5 Register Rename Maps ...... 2–6
2.1.1.6 Integer Issue Queue ...... 2–6
2.1.1.7 Floating-Point Issue Queue ...... 2–7
2.1.1.8 Exception and Interrupt Logic ...... 2–8
2.1.1.9 Retire Logic ...... 2–8
2.1.2 Integer Execution Unit ...... 2–8
2.1.3 Floating-Point Execution Unit ...... 2–10
2.1.4 External Cache and System Interface Unit ...... 2–11
2.1.4.1 Victim Address File and Victim Data File ...... 2–11
2.1.4.2 I/O Write Buffer ...... 2–11
2.1.4.3 Probe Queue ...... 2–11
2.1.4.4 Duplicate Dcache Tag Array ...... 2–11
2.1.5 Onchip Caches ...... 2–11
2.1.5.1 Instruction Cache ...... 2–11
2.1.5.2 Data Cache ...... 2–12
2.1.6 Memory Reference Unit ...... 2–12
2.1.6.1 Load Queue ...... 2–13
2.1.6.2 Store Queue ...... 2–13
2.1.6.3 Miss Address File ...... 2–13
2.1.6.4 Dstream Translation Buffer ...... 2–13
2.1.7 SROM Interface ...... 2–13
2.2 Pipeline Organization ...... 2–13
2.2.1 Pipeline Aborts ...... 2–16
2.3 Instruction Issue Rules ...... 2–16
2.3.1 Instruction Group Definitions ...... 2–17
2.3.2 Ebox Slotting ...... 2–18
2.3.3 Instruction Latencies ...... 2–20
2.4 Instruction Retire Rules ...... 2–21
2.4.1 Floating-Point Divide/Square Root Early Retire ...... 2–22
2.5 Retire of Operate Instructions into R31/F31 ...... 2–22
2.6 Load Instructions to R31 and F31 ...... 2–23
2.6.1 Normal Prefetch: LDBU, LDF, LDG, LDL, LDT, LDWU, HW_LDL Instructions ...... 2–23
2.6.2 Prefetch with Modify Intent: LDS Instruction ...... 2–23
2.6.3 Prefetch, Evict Next: LDQ and HW_LDQ Instructions ...... 2–24
2.7 Special Cases of Alpha Instruction Execution ...... 2–24
2.7.1 Load Hit Speculation ...... 2–24
2.7.2 Floating-Point Store Instructions ...... 2–26
2.7.3 CMOV Instruction ...... 2–26
2.8 Memory and I/O Address Space Instructions ...... 2–27
2.8.1 Memory Address Space Load Instructions ...... 2–27
2.8.2 I/O Address Space Load Instructions ...... 2–27
2.8.3 Memory Address Space Store Instructions ...... 2–28
2.8.4 I/O Address Space Store Instructions ...... 2–29
2.9 MAF Memory Address Space Merging Rules ...... 2–30
2.10 Instruction Ordering ...... 2–30
2.11 Replay Traps ...... 2–31
2.11.1 Mbox Order Traps ...... 2–31
2.11.1.1 Load-Load Order Trap ...... 2–31
2.11.1.2 Store-Load Order Trap ...... 2–31
2.11.2 Other Mbox Replay Traps ...... 2–32
2.12 I/O Write Buffer and the WMB Instruction ...... 2–32
2.12.1 Memory Barrier (MB/WMB/TB Fill Flow) ...... 2–32
2.12.1.1 MB Instruction Processing ...... 2–33
2.12.1.2 WMB Instruction Processing ...... 2–33
2.12.1.3 TB Fill Flow ...... 2–34
2.13 Performance Measurement Support – Performance Counters ...... 2–35
2.14 Floating-Point Control Register ...... 2–35
2.15 AMASK and IMPLVER Instruction Values ...... 2–37
2.15.1 AMASK ...... 2–38
2.15.2 IMPLVER ...... 2–38
2.16 Design Examples ...... 2–38
3 Hardware Interface
3.1 21264/EV68A Microprocessor Logic Symbol ...... 3–1
3.2 21264/EV68A Signal Names and Functions ...... 3–3
3.3 Pin Assignments ...... 3–8
3.4 Mechanical Specifications ...... 3–17
3.5 21264/EV68A Packaging ...... 3–18
4 Cache and External Interfaces
4.1 Introduction to the External Interfaces ...... 4–1
4.1.1 System Interface ...... 4–3
4.1.1.1 Commands and Addresses ...... 4–4
4.1.2 Second-Level Cache (Bcache) Interface ...... 4–4
4.2 Physical Address Considerations ...... 4–4
4.3 Bcache Structure ...... 4–7
4.3.1 Bcache Interface Signals ...... 4–7
4.3.2 System Duplicate Tag Stores ...... 4–7
4.4 Victim Data Buffer ...... 4–8
4.5 Cache Coherency ...... 4–8
4.5.1 Cache Coherency Basics ...... 4–8
4.5.2 Cache Block States ...... 4–9
4.5.3 Cache Block State Transitions ...... 4–10
4.5.4 Using SysDc Commands ...... 4–11
4.5.5 Dcache States and Duplicate Tags ...... 4–13
4.6 Lock Mechanism ...... 4–14
4.6.1 In-Order Processing of LDx_L/STx_C Instructions ...... 4–15
4.6.2 Internal Eviction of LDx_L Blocks ...... 4–15
4.6.3 Liveness and Fairness ...... 4–15
4.6.4 Managing Speculative Store Issues with Multiprocessor Systems ...... 4–16
4.7 System Port ...... 4–16
4.7.1 System Port Pins ...... 4–17
4.7.2 Programming the System Interface Clocks ...... 4–18
4.7.3 21264/EV68A-to-System Commands ...... 4–19
4.7.3.1 Bank Interleave on Cache Block Boundary Mode ...... 4–19
4.7.3.2 Page Hit Mode ...... 4–20
4.7.4 21264/EV68A-to-System Commands Descriptions ...... 4–21
4.7.5 ProbeResponse Commands (Command[4:0] = 00001) ...... 4–24
4.7.6 SysAck and 21264/EV68A-to-System Commands Flow Control ...... 4–25
4.7.7 System-to-21264/EV68A Commands ...... 4–26
4.7.7.1 Probe Commands (Four Cycles) ...... 4–26
4.7.7.2 Data Transfer Commands (Two Cycles) ...... 4–28
4.7.8 Data Movement In and Out of the 21264/EV68A ...... 4–30
4.7.8.1 21264/EV68A Clock Basics ...... 4–30
4.7.8.2 Fast Data Mode ...... 4–31
4.7.8.3 Fast Data Disable Mode ...... 4–33
4.7.8.4 SysDataInValid_L and SysDataOutValid_L ...... 4–34
4.7.8.5 SysFillValid_L ...... 4–35
4.7.8.6 Data Wrapping ...... 4–36
4.7.9 Nonexistent Memory Processing ...... 4–38
4.7.10 Ordering of System Port Transactions ...... 4–40
4.7.10.1 21264/EV68A Commands and System Probes ...... 4–40
4.7.10.2 System Probes and SysDc Commands ...... 4–42
4.8 Bcache Port ...... 4–42
4.8.1 Bcache Port Pins ...... 4–43
4.8.2 Bcache Clocking ...... 4–44
4.8.2.1 Setting the Period of the Cache Clock ...... 4–45
4.8.3 Bcache Transactions ...... 4–47
4.8.3.1 Bcache Data Read and Tag Read Transactions ...... 4–47
4.8.3.2 Bcache Data Write Transactions ...... 4–48
4.8.3.3 Bubbles on the Bcache Data Bus ...... 4–49
4.8.4 Pin Descriptions ...... 4–50
4.8.4.1 BcAdd_H[23:4] ...... 4–51
4.8.4.2 Bcache Control Pins ...... 4–51
4.8.4.3 BcDataInClk_H and BcTagInClk_H ...... 4–53
4.8.5 Bcache Banking ...... 4–53
4.8.6 Disabling the Bcache for Debugging ...... 4–53
4.9 Interrupts ...... 4–54
5 Internal Processor Registers
5.1 Ebox IPRs ...... 5–3
5.1.1 Cycle Counter Register – CC ...... 5–3
5.1.2 Cycle Counter Control Register – CC_CTL ...... 5–3
5.1.3 Virtual Address Register – VA ...... 5–4
5.1.4 Virtual Address Control Register – VA_CTL ...... 5–4
5.1.5 Virtual Address Format Register – VA_FORM ...... 5–5
5.2 Ibox IPRs ...... 5–6
5.2.1 ITB Tag Array Write Register – ITB_TAG ...... 5–6
5.2.2 ITB PTE Array Write Register – ITB_PTE ...... 5–6
5.2.3 ITB Invalidate All Process (ASM=0) Register – ITB_IAP ...... 5–7
5.2.4 ITB Invalidate All Register – ITB_IA ...... 5–7
5.2.5 ITB Invalidate Single Register – ITB_IS ...... 5–7
5.2.6 ProfileMe PC Register – PMPC ...... 5–8
5.2.7 Exception Address Register – EXC_ADDR ...... 5–8
5.2.8 Instruction Virtual Address Format Register – IVA_FORM ...... 5–9
5.2.9 Interrupt Enable and Current Processor Mode Register – IER_CM ...... 5–9
5.2.10 Software Interrupt Request Register – SIRR ...... 5–10
5.2.11 Interrupt Summary Register – ISUM ...... 5–11
5.2.12 Hardware Interrupt Clear Register – HW_INT_CLR ...... 5–12
5.2.13 Exception Summary Register – EXC_SUM ...... 5–13
5.2.14 PAL Base Register – PAL_BASE ...... 5–15
5.2.15 Ibox Control Register – I_CTL ...... 5–15
5.2.16 Ibox Status Register – I_STAT ...... 5–18
5.2.17 Icache Flush Register – IC_FLUSH ...... 5–21
5.2.18 Icache Flush ASM Register – IC_FLUSH_ASM ...... 5–21
5.2.19 Clear Virtual-to-Physical Map Register – CLR_MAP ...... 5–21
5.2.20 Sleep Mode Register – SLEEP ...... 5–21
5.2.21 Process Context Register – PCTX ...... 5–21
5.2.22 Performance Counter Control Register – PCTR_CTL ...... 5–23
5.3 Mbox IPRs ...... 5–25
5.3.1 DTB Tag Array Write Registers 0 and 1 – DTB_TAG0, DTB_TAG1 ...... 5–25
5.3.2 DTB PTE Array Write Registers 0 and 1 – DTB_PTE0, DTB_PTE1 ...... 5–26
5.3.3 DTB Alternate Processor Mode Register – DTB_ALTMODE ...... 5–26
5.3.4 Dstream TB Invalidate All Process (ASM=0) Register – DTB_IAP ...... 5–27
5.3.5 Dstream TB Invalidate All Register – DTB_IA ...... 5–27
5.3.6 Dstream TB Invalidate Single Registers 0 and 1 – DTB_IS0,1 ...... 5–27
5.3.7 Dstream TB Address Space Number Registers 0 and 1 – DTB_ASN0,1 ...... 5–28
5.3.8 Memory Management Status Register – MM_STAT ...... 5–28
5.3.9 Mbox Control Register – M_CTL ...... 5–29
5.3.10 Dcache Control Register – DC_CTL ...... 5–30
5.3.11 Dcache Status Register – DC_STAT ...... 5–31
5.4 Cbox CSRs and IPRs ...... 5–32
5.4.1 Cbox Data Register – C_DATA ...... 5–33
5.4.2 Cbox Shift Register – C_SHFT ...... 5–33
5.4.3 Cbox WRITE_ONCE Chain Description ...... 5–33
5.4.4 Cbox WRITE_MANY Chain Description ...... 5–38
5.4.5 Cbox Read Register (IPR) Description ...... 5–41
6 Privileged Architecture Library Code
6.1 PALcode Description ...... 6–1
6.2 PALmode Environment ...... 6–2
6.3 Required PALcode Function Codes ...... 6–3
6.4 Opcodes Reserved for PALcode ...... 6–3
6.4.1 HW_LD Instruction ...... 6–3
6.4.2 HW_ST Instruction ...... 6–4
6.4.3 HW_RET Instruction ...... 6–5
6.4.4 HW_MFPR and HW_MTPR Instructions ...... 6–6
6.5 Internal Processor Register Access Mechanisms ...... 6–7
6.5.1 IPR Scoreboard Bits ...... 6–8
6.5.2 Hardware Structure of Explicitly Written IPRs ...... 6–8
6.5.3 Hardware Structure of Implicitly Written IPRs ...... 6–9
6.5.4 IPR Access Ordering ...... 6–9
6.5.5 Correct Ordering of Explicit Writers Followed by Implicit Readers ...... 6–10
6.5.6 Correct Ordering of Explicit Readers Followed by Implicit Writers ...... 6–11
6.6 PALshadow Registers ...... 6–11
6.7 PALcode Emulation of the FPCR ...... 6–11
6.7.1 Status Flags ...... 6–12
6.7.2 MF_FPCR ...... 6–12
6.7.3 MT_FPCR ...... 6–12
6.8 PALcode Entry Points ...... 6–12
6.8.1 CALL_PAL Entry Points ...... 6–12
6.8.2 PALcode Exception Entry Points ...... 6–13
6.9 Translation Buffer (TB) Fill Flows ...... 6–14
6.9.1 DTB Fill ...... 6–14
6.9.2 ITB Fill ...... 6–16
6.10 Performance Counter Support ...... 6–17
6.10.1 General Precautions ...... 6–18
6.10.2 Aggregate Mode Programming Guidelines ...... 6–18
6.10.2.1 Aggregate Mode Precautions ...... 6–18
6.10.2.2 Operation ...... 6–19
6.10.2.3 Aggregate Counting Mode Description ...... 6–20
6.10.2.3.1 Cycle counting ...... 6–20
6.10.2.3.2 Retired instructions cycles ...... 6–20
6.10.2.3.3 Bcache miss or long latency probes cycles ...... 6–20
6.10.2.3.4 Mbox replay traps cycles ...... 6–20
6.10.2.4 Counter Modes for Aggregate Mode ...... 6–20
6.10.3 ProfileMe Mode Programming Guidelines ...... 6–20
6.10.3.1 ProfileMe Mode Precautions ...... 6–20
6.10.3.2 Operation ...... 6–21
6.10.3.3 ProfileMe Counting Mode Description ...... 6–23
6.10.3.3.1 Cycle counting ...... 6–23
6.10.3.3.2 Inum retire delay cycles ...... 6–23
6.10.3.3.3 Retired instructions cycles ...... 6–23
6.10.3.3.4 Bcache miss or long latency probes cycles ...... 6–23
6.10.3.3.5 Mbox replay traps cycles ...... 6–23
6.10.3.4 Counter Modes for ProfileMe Mode ...... 6–24
7 Initialization and Configuration
7.1 Power-Up Reset Flow and the Reset_L and DCOK_H Pins ...... 7–1
7.1.1 Power Sequencing and Reset State for Signal Pins ...... 7–3
7.1.2 Clock Forwarding and System Clock Ratio Configuration ...... 7–4
7.1.3 PLL Ramp Up ...... 7–6
7.1.4 BiST and SROM Load and the TestStat_H Pin ...... 7–6
7.1.5 Clock Forward Reset and System Interface Initialization ...... 7–7
7.2 Fault Reset Flow ...... 7–8
7.3 Energy Star Certification and Sleep Mode Flow ...... 7–9
7.4 Warm Reset Flow ...... 7–11
7.5 Array Initialization ...... 7–12
7.6 Initialization Mode Processing ...... 7–12
7.7 External Interface Initialization ...... 7–14
7.8 Internal Processor Register Power-Up Reset State ...... 7–14
7.9 IEEE 1149.1 Test Port Reset ...... 7–16
7.10 Reset State Machine ...... 7–16
7.11 Phase-Lock Loop (PLL) Functional Description ...... 7–19
7.11.1 Differential Reference Clocks ...... 7–19
7.11.2 PLL Output Clocks ...... 7–19
7.11.2.1 GCLK ...... 7–19
7.11.2.2 Differential 21264/EV68A Clocks ...... 7–19
7.11.2.3 Nominal Operating Frequency ...... 7–19
7.11.2.4 Power-Up/Reset Clocking ...... 7–20
8 Error Detection and Error Handling
8.1 Data Error Correction Code ...... 8–2
8.2 Icache Data or Tag Parity Error ...... 8–2
8.3 Dcache Tag Parity Error ...... 8–2
8.4 Dcache Data Single-Bit Correctable ECC Error ...... 8–3
8.4.1 Load Instruction ...... 8–3
8.4.2 Store Instruction (Quadword or Smaller) ...... 8–4
8.4.3 Dcache Victim Extracts ...... 8–4
8.5 Dcache Store Second Error ...... 8–4
8.6 Dcache Duplicate Tag Parity Error ...... 8–4
8.7 Bcache Tag Parity Error ...... 8–5
8.8 Controlling Bcache Block Parity Calculation ...... 8–5
8.9 Bcache Data Single-Bit Correctable ECC Error ...... 8–5
8.9.1 Icache Fill from Bcache ...... 8–5
8.9.2 Dcache Fill from Bcache ...... 8–6
8.9.3 Bcache Victim Read ...... 8–7
8.9.3.1 Bcache Victim Read During a Dcache/Bcache Miss ...... 8–7
8.9.3.2 Bcache Victim Read During an ECB Instruction ...... 8–7
8.10 Memory/System Port Single-Bit Data Correctable ECC Error ...... 8–7
8.10.1 Icache Fill from Memory ...... 8–7
8.10.2 Dcache Fill from Memory ...... 8–8
8.11 Bcache Data Single-Bit Correctable ECC Error on a Probe ...... 8–9
8.12 Double-Bit Fill Errors ...... 8–9
8.13 Error Case Summary ...... 8–10
9 Electrical Data
9.1 Electrical Characteristics ...... 9–1
9.2 DC Characteristics ...... 9–2
9.3 Power Supply Sequencing and Avoiding Potential Failure Mechanisms ...... 9–5
9.4 AC Characteristics ...... 9–6
10 Thermal Management
10.1 Operating Temperature ...... 10–1
10.2 Heat Sink Specifications ...... 10–3
10.3 Thermal Design Considerations ...... 10–6
11 Testability and Diagnostics
11.1 Test Pins ...... 11–1
11.2 SROM/Serial Diagnostic Terminal Port ...... 11–2
11.2.1 SROM Load Operation ...... 11–2
11.2.2 Serial Terminal Port ...... 11–2
11.3 IEEE 1149.1 Port ...... 11–3
11.4 TestStat_H Pin ...... 11–4
11.5 Power-Up Self-Test and Initialization ...... 11–5
11.5.1 Built-in Self-Test ...... 11–5
11.5.2 SROM Initialization ...... 11–5
11.5.2.1 Serial Instruction Cache Load Operation ...... 11–6
11.6 Notes on IEEE 1149.1 Operation and Compliance ...... 11–7
A Alpha Instruction Set
A.1 Alpha Instruction Summary ...... A–1
A.2 Reserved Opcodes ...... A–8
A.2.1 Opcodes Reserved for Compaq ...... A–8
A.2.2 Opcodes Reserved for PALcode ...... A–9
A.3 IEEE Floating-Point Instructions ...... A–9
A.4 VAX Floating-Point Instructions ...... A–11
A.5 Independent Floating-Point Instructions ...... A–11
A.6 Opcode Summary ...... A–12
A.7 Required PALcode Function Codes ...... A–13
A.8 IEEE Floating-Point Conformance ...... A–14
B 21264/EV68A Boundary-Scan Register
B.1 Boundary-Scan Register ...... B–1
B.1.1 BSDL Description of the Alpha 21264/EV68A Boundary-Scan Register ...... B–1
C Serial Icache Load Predecode Values
D PALcode Restrictions and Guidelines
D.1 Restriction 1: Reset Sequence Required by Retire Logic and Mapper ...... D–1
D.2 Restriction 2: No Multiple Writers to IPRs in Same Scoreboard Group ...... D–8
D.3 Restriction 4: No Writers and Readers to IPRs in Same Scoreboard Group ...... D–8
D.4 Guideline 6: Avoid Consecutive Read-Modify-Write-Read-Modify-Write ...... D–9
D.5 Restriction 7: Replay Trap, Interrupt Code Sequence, and STF/ITOF ...... D–9
D.6 Restriction 9: PALmode Istream Address Ranges ...... D–10
D.7 Restriction 10: Duplicate IPR Mode Bits ...... D–10
D.8 Restriction 11: Ibox IPR Update Synchronization ...... D–11
D.9 Restriction 12: MFPR of Implicitly-Written IPRs EXC_ADDR, IVA_FORM, and EXC_SUM ...... D–11
D.10 Restriction 13: DTB Fill Flow Collision ...... D–11
D.11 Restriction 14: HW_RET ...... D–11
D.12 Guideline 16: JSR-BAD VA ...... D–12
D.13 Restriction 17: MTPR to DTB_TAG0/DTB_PTE0/DTB_TAG1/DTB_PTE1 ...... D–12
D.14 Restriction 18: No FP Operates, FP Conditional Branches, FTOI, or STF in Same Fetch Block as HW_MTPR ...... D–12
D.15 Restriction 19: HW_RET/STALL After Updating the FPCR by way of MT_FPCR in PALmode ...... D–12
D.16 Guideline 20: I_CTL[SBE] Stream Buffer Enable ...... D–12
D.17 Restriction 21: HW_RET/STALL After HW_MTPR ASN0/ASN1 ...... D–12
D.18 Restriction 22: HW_RET/STALL After HW_MTPR IS0/IS1 ...... D–13
D.19 Restriction 23: HW_ST/P/CONDITIONAL Does Not Clear the Lock Flag ...... D–13
D.20 Restriction 24: HW_RET/STALL After HW_MTPR IC_FLUSH, IC_FLUSH_ASM, CLEAR_MAP ...... D–14
D.21 Restriction 25: HW_MTPR ITB_IA After Reset ...... D–14
D.22 Guideline 26: Conditional Branches in PALcode ...... D–14
D.23 Restriction 27: Reset of ‘Force-Fail Lock Flag’ State in PALcode ...... D–15
D.24 Restriction 28: Enforce Ordering Between IPRs Implicitly Written by Loads and Subsequent Loads ...... D–15
D.25 Guideline 29: JSR, JMP, RET, and JSR_COR in PALcode ...... D–15
D.26 Restriction 30: HW_MTPR and HW_MFPR to the Cbox CSR ...... D–15
D.27 Restriction 31: I_CTL[VA_48] Update ...... D–17
D.28 Restriction 32: PCTR_CTL Update ...... D–17
D.29 Restriction 33: HW_LD Physical/Lock Use ...... D–18
D.30 Restriction 34: Writing Multiple ITB Entries in the Same PALcode Flow ...... D–18
D.31 Guideline 35: HW_INT_CLR Update ...... D–18
D.32 Restriction 36: Updating I_CTL[SDE] ...... D–18
D.33 Restriction 37: Updating VA_CTL[VA_48] ...... D–18
D.34 Restriction 38: Updating PCTR_CTL ...... D–18
D.35 Guideline 39: Writing Multiple DTB Entries in the Same PAL Flow ...... D–19
D.36 Restriction 40: Scrubbing a Single-Bit Error ...... D–19
D.37 Restriction 41: MTPR ITB_TAG, MTPR ITB_PTE Must be in the Same Fetch Block ...... D–21
D.38 Restriction 42: Updating VA_CTL, CC_CTL, or CC IPRs ...... D–21
D.39 Restriction 43: No Trappable Instructions Along with HW_MTPR ...... D–21
D.40 Restriction 44: Not Applicable to the 21264/EV68A ...... D–21
D.41 Restriction 45: No HW_JMP or JMP Instructions in PALcode ...... D–21
D.42 Restriction 46: Avoiding Livelocks in Speculative Load CRD Handlers ...... D–22
D.43 Restriction 47: Cache Eviction for Single-Bit Cache Errors ...... D–22
D.44 Restriction 48: MB Bracketing of Dcache Writes to Force Bad Data ECC and Force Bad Tag Parity ...... D–24
E 21264/EV68A-to-Bcache Pin Interface
E.1 Forwarding Clock Pin Groupings ...... E–1
E.2 Late-Write Non-Bursting SSRAMs ...... E–2
E.3 Dual-Data Rate SSRAMs ...... E–3
Glossary
Index

Figures

2–1 21264/EV68A Block Diagram ........ 2–3
2–2 Branch Predictor ........ 2–4
2–3 Local Predictor ........ 2–4
2–4 Global Predictor ........ 2–5
2–5 Choice Predictor ........ 2–5
2–6 Integer Execution Unit—Clusters 0 and 1 ........ 2–9
2–7 Floating-Point Execution Units ........ 2–10
2–8 Pipeline Organization ........ 2–14
2–9 Pipeline Timing for Integer Load Instructions ........ 2–24
2–10 Pipeline Timing for Floating-Point Load Instructions ........ 2–25
2–11 Floating-Point Control Register ........ 2–36
2–12 Typical Uniprocessor Configuration ........ 2–39
2–13 Typical Multiprocessor Configuration ........ 2–39
3–1 21264/EV68A Microprocessor Logic Symbol ........ 3–2
3–2 Package Dimensions ........ 3–17
3–3 21264/EV68A Top View (Pin Down) ........ 3–18
3–4 21264/EV68A Bottom View (Pin Up) ........ 3–19
4–1 21264/EV68A System and Bcache Interfaces ........ 4–3
4–2 21264/EV68A Bcache Interface Signals ........ 4–7
4–3 Cache Subset Hierarchy ........ 4–9
4–4 System Interface Signals ........ 4–17
4–5 Fast Transfer Timing Example ........ 4–32
4–6 SysFillValid_L Timing ........ 4–36
5–1 Cycle Counter Register ........ 5–3
5–2 Cycle Counter Control Register ........ 5–3
5–3 Virtual Address Register ........ 5–4
5–4 Virtual Address Control Register ........ 5–4
5–5 Virtual Address Format Register (VA_48 = 0, VA_FORM_32 = 0) ........ 5–5
5–6 Virtual Address Format Register (VA_48 = 1, VA_FORM_32 = 0) ........ 5–6
5–7 Virtual Address Format Register (VA_48 = 0, VA_FORM_32 = 1) ........ 5–6
5–8 ITB Tag Array Write Register ........ 5–6
5–9 ITB PTE Array Write Register ........ 5–7
5–10 ITB Invalidate Single Register ........ 5–7
5–11 ProfileMe PC Register ........ 5–8
5–12 Exception Address Register ........ 5–8
5–13 Instruction Virtual Address Format Register (VA_48 = 0, VA_FORM_32 = 0) ........ 5–9
5–14 Instruction Virtual Address Format Register (VA_48 = 1, VA_FORM_32 = 0) ........ 5–9
5–15 Instruction Virtual Address Format Register (VA_48 = 0, VA_FORM_32 = 1) ........ 5–9
5–16 Interrupt Enable and Current Processor Mode Register ........ 5–10
5–17 Software Interrupt Request Register ........ 5–11
5–18 Interrupt Summary Register ........ 5–11
5–19 Hardware Interrupt Clear Register ........ 5–12
5–20 Exception Summary Register ........ 5–14
5–21 PAL Base Register ........ 5–15
5–22 Ibox Control Register ........ 5–16
5–23 Ibox Status Register ........ 5–19
5–24 Process Context Register ........ 5–22
5–25 Performance Counter Control Register ........ 5–23
5–26 DTB Tag Array Write Registers 0 and 1 ........ 5–25
5–27 DTB PTE Array Write Registers 0 and 1 ........ 5–26
5–28 DTB Alternate Processor Mode Register ........ 5–26
5–29 Dstream Translation Buffer Invalidate Single Registers ........ 5–27
5–30 Dstream Translation Buffer Address Space Number Registers 0 and 1 ........ 5–28
5–31 Memory Management Status Register ........ 5–28
5–32 Mbox Control Register ........ 5–29
5–33 Dcache Control Register ........ 5–31
5–34 Dcache Status Register ........ 5–32
5–35 Cbox Data Register ........ 5–33
5–36 Cbox Shift Register ........ 5–33
5–37 WRITE_MANY Chain Write Transaction Example ........ 5–39
6–1 HW_LD Instruction Format ........ 6–4
6–2 HW_ST Instruction Format ........ 6–4
6–3 HW_RET Instruction Format ........ 6–6
6–4 HW_MFPR and HW_MTPR Instructions Format ........ 6–6
6–5 Single-Miss DTB Instructions Flow Example ........ 6–14
6–6 ITB Miss Instructions Flow Example ........ 6–16
7–1 Power-Up Timing Sequence ........ 7–3
7–2 Fault Reset Sequence of Operation ........ 7–9
7–3 Sleep Mode Sequence of Operation ........ 7–11
7–4 Example for Initializing Bcache ........ 7–13
7–5 21264/EV68A Reset State Machine State Diagram ........ 7–17
10–1 Type 1 Heat Sink ........ 10–3
10–2 Type 2 Heat Sink ........ 10–4
10–3 Type 3 Heat Sink ........ 10–5
11–1 TestStat_H Pin Timing During Power-Up Built-In Self-Test (BiST) ........ 11–5
11–2 TestStat_H Pin Timing During Built-In Self-Initialization (BiSI) ........ 11–5
11–3 SROM Content Map ........ 11–6

Tables

1–1 Integer Data Types ........ 1–2
2–1 Pipeline Abort Delay (GCLK Cycles) ........ 2–16
2–2 Instruction Name, Pipeline, and Types ........ 2–17
2–3 Instruction Group Definitions and Pipeline Unit ........ 2–18
2–4 Instruction Class Latency in Cycles ........ 2–20
2–5 Minimum Retire Latencies for Instruction Classes ........ 2–21
2–6 Instructions Retired Without Execution ........ 2–23
2–7 Rules for I/O Address Space Load Instruction Data Merging ........ 2–28
2–8 Rules for I/O Address Space Store Instruction Data Merging ........ 2–29
2–9 MAF Merging Rules ........ 2–30
2–10 Memory Reference Ordering ........ 2–30
2–11 I/O Reference Ordering ........ 2–31
2–12 TB Fill Flow Example Sequence 1 ........ 2–34
2–13 TB Fill Flow Example Sequence 2 ........ 2–34
2–14 Floating-Point Control Register Fields ........ 2–36
2–15 21264/EV68A AMASK Values ........ 2–38
2–16 AMASK Bit Assignments ........ 2–38
3–1 Signal Pin Types Definitions ........ 3–3
3–2 21264/EV68A Signal Descriptions ........ 3–3
3–3 21264/EV68A Signal Descriptions by Function ........ 3–6
3–4 Pin List Sorted by Signal Name ........ 3–8
3–5 Pin List Sorted by PGA Location ........ 3–12
3–6 Ground and Power (VSS and VDD) Pin List ........ 3–16
4–1 Translation of Internal References to External Interface Reference ........ 4–5
4–2 21264/EV68A-Supported Cache Block States ........ 4–9
4–3 Cache Block State Transitions ........ 4–10
4–4 System Responses to 21264/EV68A Commands ........ 4–10
4–5 System Responses to 21264/EV68A Commands and Reactions ........ 4–11
4–6 System Port Pins ........ 4–17
4–7 Programming Values for System Interface Clocks ........ 4–18
4–8 Program Values for Data-Sample/Drive CSRs ........ 4–18
4–9 Forwarded Clocks and Frame Clock Ratio ........ 4–19
4–10 Bank Interleave on Cache Block Boundary Mode of Operation ........ 4–19
4–11 Page Hit Mode of Operation ........ 4–20
4–12 21264/EV68A-to-System Command Fields Definitions ........ 4–20
4–13 Maximum Physical Address for Short Bus Format ........ 4–21
4–14 21264/EV68A-to-System Commands Descriptions ........ 4–21
4–15 Programming INVAL_TO_DIRTY_ENABLE[1:0] ........ 4–23
4–16 Programming SET_DIRTY_ENABLE[2:0] ........ 4–24
4–17 21264/EV68A ProbeResponse Command ........ 4–24
4–18 ProbeResponse Fields Descriptions ........ 4–25
4–19 System-to-21264/EV68A Probe Commands ........ 4–26
4–20 System-to-21264/EV68A Probe Commands Fields Descriptions ........ 4–27
4–21 Data Movement Selection by Probe[4:3] ........ 4–27
4–22 Next Cache Block State Selection by Probe[2:0] ........ 4–27
4–23 Data Transfer Command Format ........ 4–28
4–24 SysDc[4:0] Field Description ........ 4–29
4–25 SYSCLK Cycles Between SysAddOut and SysData ........ 4–32
4–26 Cbox CSR SYSDC_DELAY[4:0] Examples ........ 4–33
4–27 Four Timing Examples ........ 4–34
4–28 Data Wrapping Rules ........ 4–36
4–29 System Wrap and Deliver Data ........ 4–37
4–30 Wrap Interleave Order ........ 4–37
4–31 Wrap Order for Double-Pumped Data Transfers ........ 4–38
4–32 21264/EV68A Commands with NXM Addresses and System Response ........ 4–39
4–33 21264/EV68A Response to System Probe and In-Flight Command Interaction ........ 4–41
4–34 Rules for System Control of Cache Status Update Order ........ 4–42
4–35 Range of Maximum Bcache Clock Ratios ........ 4–43
4–36 Bcache Port Pins ........ 4–43
4–37 BC_CPU_CLK_DELAY[1:0] Values ........ 4–45
4–38 BC_CLK_DELAY[1:0] Values ........ 4–45
4–39 Program Values to Set the Cache Clock Period (Single-Data) ........ 4–46
4–40 Program Values to Set the Cache Clock Period (Dual-Data Rate) ........ 4–46
4–41 Data-Sample/Drive Cbox CSRs ........ 4–47
4–42 Programming the Bcache to Support Each Size of the Bcache ........ 4–51
4–43 Programming the Bcache Control Pins ........ 4–51
4–44 Control Pin Assertion for RAM_TYPE A ........ 4–51
4–45 Control Pin Assertion for RAM_TYPE B ........ 4–52
4–46 Control Pin Assertion for RAM_TYPE C ........ 4–52
4–47 Control Pin Assertion for RAM_TYPE D ........ 4–52
5–1 Internal Processor Registers ........ 5–1
5–2 Cycle Counter Control Register Fields Description ........ 5–4
5–3 Virtual Address Control Register Fields Description ........ 5–5
5–4 ProfileMe PC Fields Description ........ 5–8
5–5 IER_CM Register Fields Description ........ 5–10
5–6 Software Interrupt Request Register Fields Description ........ 5–11
5–7 Interrupt Summary Register Fields Description ........ 5–12
5–8 Hardware Interrupt Clear Register Fields Description ........ 5–13
5–9 Exception Summary Register Fields Description ........ 5–14
5–10 PAL Base Register Fields Description ........ 5–15
5–11 Ibox Control Register Fields Description ........ 5–16
5–12 Ibox Status Register Fields Description ........ 5–19
5–13 IPR Index Bits and Register Fields ........ 5–21
5–14 Process Context Register Fields Description ........ 5–22
5–15 Performance Counter Control Register Fields Description ........ 5–23
5–16 Performance Counter Control Register Input Select Fields ........ 5–25
5–17 DTB Alternate Processor Mode Register Fields Description ........ 5–26
5–18 Memory Management Status Register Fields Description ........ 5–28
5–19 Mbox Control Register Fields Description ........ 5–30
5–20 Dcache Control Register Fields Description ........ 5–31
5–21 Dcache Status Register Fields Description ........ 5–32
5–22 Cbox Data Register Fields Description ........ 5–33
5–23 Cbox Shift Register Fields Description ........ 5–33
5–24 Cbox WRITE_ONCE Chain Order ........ 5–34
5–25 Cbox WRITE_MANY Chain Order ........ 5–39
5–26 Cbox Read IPR Fields Description ........ 5–41
6–1 Required PALcode Function Codes ........ 6–3
6–2 Opcodes Reserved for PALcode ........ 6–3
6–3 HW_LD Instruction Fields Descriptions ........ 6–4
6–4 HW_ST Instruction Fields Descriptions ........ 6–5
6–5 HW_RET Instruction Fields Descriptions ........ 6–6
6–6 HW_MFPR and HW_MTPR Instructions Fields Descriptions ........ 6–7
6–7 Paired Instruction Fetch Order ........ 6–9
6–8 PALcode Exception Entry Locations ........ 6–13
6–9 IPRs Used for Performance Counter Support ........ 6–18
6–10 Aggregate Mode Returned IPR Contents ........ 6–19
6–11 Aggregate Mode Performance Counter IPR Input Select Fields ........ 6–20
6–12 CMOV Decomposed ........ 6–21
6–13 ProfileMe Mode Returned IPR Contents ........ 6–22
6–14 ProfileMe Mode PCTR_CTL Input Select Fields ........ 6–24
7–1 21264/EV68A Reset State Machine Major Operations ........ 7–1
7–2 Signal Pin Reset State ........ 7–3
7–3 Pin Signal Names and Initialization State ........ 7–5
7–4 Power-Up Flow Signals and Their Constraints ........ 7–7
7–5 Effect on IPRs After Fault Reset ........ 7–8
7–6 Effect on IPRs After Transition Through Sleep Mode ........ 7–10
7–7 Signals and Constraints for the Sleep Mode Sequence ........ 7–11
7–8 Effect on IPRs After Warm Reset ........ 7–11
7–9 WRITE_MANY Chain CSR Values for Bcache Initialization ........ 7–12
7–10 Internal Processor Registers at Power-Up Reset State ........ 7–14
7–11 21264/EV68A Reset State Machine State Descriptions ........ 7–17
7–12 Differential Reference Clock Frequencies in Full-Speed Lock ........ 7–20
8–1 21264/EV68A Error Detection Mechanisms ........ 8–1
8–2 64-Bit Data and Check Bit ECC Code ........ 8–2
8–3 Error Case Summary ........ 8–10
9–1 Maximum Electrical Ratings ........ 9–1
9–2 Signal Types ........ 9–2
9–3 VDD (I_DC_POWER) ........ 9–3
9–4 Input DC Reference Pin (I_DC_REF) ........ 9–3
9–5 Input Differential Amplifier Receiver (I_DA) ........ 9–3
9–6 Input Differential Amplifier Clock Receiver (I_DA_CLK) ........ 9–3
9–7 Pin Type: Open-Drain Output Driver (O_OD) ........ 9–4
9–8 Bidirectional, Differential Amplifier Receiver, Open-Drain Output Driver (B_DA_OD) ........ 9–4
9–9 Pin Type: Open-Drain Driver for Test Pins (O_OD_TP) ........ 9–4
9–10 Bidirectional, Differential Amplifier Receiver, Push-Pull Output Driver (B_DA_PP) ........ 9–4
9–11 Push-Pull Output Driver (O_PP) ........ 9–5
9–12 Push-Pull Output Clock Driver (O_PP_CLK) ........ 9–5
9–13 AC Specifications ........ 9–7
10–1 Operating Temperature at Heat Sink Center (Tc) ........ 10–1
10–2 θca at Various Airflows for 21264/EV68A ........ 10–2
10–3 Maximum Ta for 21264/EV68A @ 750 MHz and @ 1.7 V with Various Airflows ........ 10–2
10–4 Maximum Ta for 21264/EV68A @ 833 MHz and @ 1.7 V with Various Airflows ........ 10–2
10–5 Maximum Ta for 21264/EV68A @ 875 MHz and @ 1.7 V with Various Airflows ........ 10–2
10–6 Maximum Ta for 21264/EV68A @ 940 MHz and @ 1.7 V with Various Airflows ........ 10–2
11–1 Dedicated Test Port Pins ........ 11–1
11–2 IEEE 1149.1 Instructions and Opcodes ........ 11–3
11–3 TAP Controller State Machine ........ 11–4
11–4 Icache Bit Fields in an SROM Line ........ 11–7
A–1 Instruction Format and Opcode Notation ........ A–1
A–2 Architecture Instructions ........ A–2
A–3 Opcodes Reserved for Compaq ........ A–8
A–4 Opcodes Reserved for PALcode ........ A–9
A–5 IEEE Floating-Point Instruction Function Codes ........ A–9
A–6 VAX Floating-Point Instruction Function Codes ........ A–11
A–7 Independent Floating-Point Instruction Function Codes ........ A–12
A–8 Opcode Summary ........ A–12
A–9 Key to Opcode Summary Used in Table A–8 ........ A–13
A–10 Required PALcode Function Codes ........ A–13
A–11 Exceptional Input and Output Conditions ........ A–15
E–1 Bcache Forwarding Clock Pin Groupings ........ E–1
E–2 Late-Write Non-Bursting SSRAMs Data Pin Usage ........ E–2
E–3 Late-Write Non-Bursting SSRAMs Tag Pin Usage ........ E–2
E–4 Dual-Data Rate SSRAM Data Pin Usage ........ E–3
E–5 Dual-Data Rate SSRAM Tag Pin Usage ........ E–4

Preface

Audience
This manual is for system designers and programmers who use the Alpha 21264/EV68A microprocessor (referred to as the 21264/EV68A).

Content
This manual contains the following chapters and appendixes:
Chapter 1, Introduction, introduces the 21264/EV68A and provides an overview of the Alpha architecture.
Chapter 2, Internal Architecture, describes the major hardware functions and the internal chip architecture. It describes performance measurement facilities, coding rules, and design examples.
Chapter 3, Hardware Interface, lists and describes the internal hardware interface signals, and provides mechanical data and packaging information, including signal pin lists.
Chapter 4, Cache and External Interfaces, describes the external bus functions and transactions, lists bus commands, and describes the clock functions.
Chapter 5, Internal Processor Registers, lists and describes the internal processor register set.
Chapter 6, Privileged Architecture Library Code, describes the privileged architecture library code (PALcode).
Chapter 7, Initialization and Configuration, describes the initialization and configuration sequence.
Chapter 8, Error Detection and Error Handling, describes error detection and error handling.
Chapter 9, Electrical Data, provides electrical data and describes signal integrity issues.
Chapter 10, Thermal Management, provides information about thermal management.
Chapter 11, Testability and Diagnostics, describes chip and system testability features.
Appendix A, Alpha Instruction Set, summarizes the Alpha instruction set.
Appendix B, 21264/EV68A Boundary-Scan Register, presents the BSDL description of the 21264/EV68A boundary-scan register.
Appendix C, Serial Icache Load Predecode Values, provides a pointer to the Alpha Motherboards Software Developer’s Kit (SDK), which contains this information.
Appendix D, PALcode Restrictions and Guidelines, lists restrictions and guidelines that must be adhered to when generating PALcode.
Appendix E, 21264/EV68A-to-Bcache Pin Interface, provides the pin interface between the 21264/EV68A and Bcache SSRAMs.
The Glossary lists and defines terms associated with the 21264/EV68A.
An Index is provided at the end of the document.
Documentation Included by Reference
The companion volume to this manual, the Alpha Architecture Reference Manual, Fourth Edition, can be accessed from the following website: ftp.compaq.com/pub/products/alphaCPUdocs.
Terminology and Conventions
This section defines the abbreviations, terminology, and other conventions used throughout this document.
Abbreviations
Binary Multiples
The abbreviations K, M, and G (kilo, mega, and giga) represent binary multiples and have the following values.
K = 2^10 (1024)
M = 2^20 (1,048,576)
G = 2^30 (1,073,741,824)
For example:
2KB = 2 kilobytes = 2 × 2^10 bytes
4MB = 4 megabytes = 4 × 2^20 bytes
8GB = 8 gigabytes = 8 × 2^30 bytes
2K pixels = 2 kilopixels = 2 × 2^10 pixels
4M pixels = 4 megapixels = 4 × 2^20 pixels
Register Access
The abbreviations used to indicate the type of access to register fields and bits have the following definitions:
Abbreviation Meaning
IGN Ignore
Bits and fields specified are ignored on writes.
MBZ Must Be Zero
Software must never place a nonzero value in bits and fields specified as MBZ. A nonzero read produces an Illegal Operand exception. Also, MBZ fields are reserved for future use.
RAZ Read As Zero
Bits and fields return a zero when read.
RC Read Clears
Bits and fields are cleared when read. Unless otherwise specified, such bits cannot be written.
RES Reserved
Bits and fields are reserved by Compaq and should not be used; however, zeros can be written to reserved fields that cannot be masked.
RO Read Only
The value may be read by software. It is written by hardware. Software write operations are ignored.
RO,n Read Only, and takes the value n at power-on reset.
The value may be read by software. It is written by hardware. Software write operations are ignored.
RW Read/Write
Bits and fields can be read and written.
RW,n Read/Write, and takes the value n at power-on reset.
Bits and fields can be read and written.
W1C Write One to Clear
If read operations are allowed to the register, then the value may be read by software. If it is a write-only register, then a read operation by software returns an UNPREDICTABLE result. Software write operations of a 1 cause the bit to be cleared by hardware. Software write operations of a 0 do not modify the state of the bit.
W1S Write One to Set
If read operations are allowed to the register, then the value may be read by software. If it is a write-only register, then a read operation by software returns an UNPREDICTABLE result. Software write operations of a 1 cause the bit to be set by hardware. Software write operations of a 0 do not modify the state of the bit.
WO Write Only
Bits and fields can be written but not read.
WO,n Write Only, and takes the value n at power-on reset.
Bits and fields can be written but not read.
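The W1C and W1S write semantics defined above can be modeled in a few lines. The following Python sketch is illustrative only: the function names write_w1c and write_w1s and the 64-bit register width are assumptions made for this example and are not taken from the 21264/EV68A specification.

```python
# Assumed register width for this illustration only.
MASK64 = (1 << 64) - 1

def write_w1c(current, written):
    """Write One to Clear: a 1 written to a bit clears it; a 0 leaves it unchanged."""
    return current & ~written & MASK64

def write_w1s(current, written):
    """Write One to Set: a 1 written to a bit sets it; a 0 leaves it unchanged."""
    return (current | written) & MASK64

# Hypothetical status value with bits 0, 1, and 3 set.
status = 0b1011
status = write_w1c(status, 0b0010)   # clears bit 1 only
status = write_w1s(status, 0b0100)   # sets bit 2 only
```

In both cases a write of all zeros leaves the modeled register unchanged, which matches the definitions in the table.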
Sign extension
SEXT(x) means x is sign-extended to the required size.
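As a minimal sketch of what sign extension does, the following Python function extends the low-order bits of a value to a wider size. The name sext, its parameters, and the default target size of 64 bits are assumptions of this example; the manual defines only the notation SEXT(x).

```python
def sext(x, from_bits, to_bits=64):
    """Sign-extend the low from_bits bits of x to to_bits bits.

    The result is returned as a non-negative Python integer holding the
    to_bits-wide two's-complement bit pattern.
    """
    x &= (1 << from_bits) - 1          # keep only the source field
    sign = 1 << (from_bits - 1)        # position of the sign bit
    signed_value = (x ^ sign) - sign   # interpret the field as signed
    return signed_value & ((1 << to_bits) - 1)
```

For instance, under these assumptions an 8-bit field of all ones extends to a 64-bit value of all ones, while an 8-bit field with the top bit clear is unchanged.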
Addresses
Unless otherwise noted, all addresses and offsets are hexadecimal.
Aligned and Unaligned
The terms aligned and naturally aligned are interchangeable and refer to data objects that are powers of two in size. An aligned datum of size 2^n is stored in memory at a byte address that is a multiple of 2^n; that is, one that has n low-order zeros. For example, an aligned 64-byte stack frame has a memory address that is a multiple of 64.
A datum of size 2^n is unaligned if it is stored in a byte address that is not a multiple of 2^n.
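The alignment rule above reduces to a test on the low-order address bits. The following Python sketch shows one way to express it; the function name is_aligned is hypothetical and is used here only for illustration.

```python
def is_aligned(address, size):
    """Return True if address is a multiple of size, where size is a power of two."""
    if size <= 0 or size & (size - 1):
        raise ValueError("size must be a positive power of two")
    # An aligned address has all n low-order bits equal to zero.
    return (address & (size - 1)) == 0
```

With this definition, a 64-byte object at address 0x1000 is aligned, and the same object at address 0x1008 is unaligned.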
Bit Notation
Multiple-bit fields can include contiguous and noncontiguous bits contained in square brackets ([]). Multiple contiguous bits are indicated by a pair of numbers separated by a colon [:]. For example, [9:7,5,2:0] specifies bits 9, 8, 7, 5, 2, 1, and 0. Similarly, single bits are frequently indicated with square brackets. For example, [27] specifies bit 27. See also Field Notation.
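To make the bit notation concrete, the following Python sketch expands a specification such as [9:7,5,2:0] into its individual bit numbers and the corresponding mask. The helper names bits_from_spec and mask_from_spec are assumptions of this example, and the parser handles only the simple forms shown in this section.

```python
def bits_from_spec(spec):
    """Expand a bit specification such as '[9:7,5,2:0]' into a list of bit numbers."""
    bits = []
    for part in spec.strip("[]").split(","):
        part = part.strip()
        if ":" in part:
            high, low = (int(p) for p in part.split(":"))
            bits.extend(range(high, low - 1, -1))   # inclusive, high to low
        else:
            bits.append(int(part))
    return bits

def mask_from_spec(spec):
    """Return an integer with exactly the specified bits set."""
    mask = 0
    for bit in bits_from_spec(spec):
        mask |= 1 << bit
    return mask
```

Applied to the example in the text, the specification [9:7,5,2:0] expands to bits 9, 8, 7, 5, 2, 1, and 0.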
Caution
Cautions indicate potential damage to equipment or loss of data.
Data Units
The following data unit terminology is used throughout this manual.
Term      Words  Bytes  Bits  Other
Byte      ½      1      8     —
Word      1      2      16    —
Longword  2      4      32    Dword
Quadword  4      8      64    2 longwords
Do Not Care (X)
A capital X represents any valid value.
External
Unless otherwise stated, external means not contained in the chip.
Field Notation
The names of single-bit and multiple-bit fields can be used rather than the actual bit numbers (see Bit Notation). When the field name is used, it is contained in square brackets ([]). For example, RegisterName[LowByte] specifies RegisterName[7:0].
Note
Notes emphasize particularly important information.
Numbering
All numbers are decimal or hexadecimal unless otherwise indicated. The prefix 0x indicates a hexadecimal number. For example, 19 is decimal, but 0x19 and 0x19A are hexadecimal (also see Addresses). Otherwise, the base is indicated by a subscript; for example, 100₂ is a binary number.
Ranges and Extents
Ranges are specified by a pair of numbers separated by two periods (..) and are inclusive. For example, a range of integers 0..4 includes the integers 0, 1, 2, 3, and 4.
Extents are specified by a pair of numbers in square brackets ([]) separated by a colon (:) and are inclusive. Bit fields are often specified as extents. For example, bits [7:3] specifies bits 7, 6, 5, 4, and 3.
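Both notations are inclusive at each end, which is easy to get wrong in code. The following Python sketch shows one way to model them; the function names inclusive_range and extract_extent are hypothetical and appear here only as an illustration.

```python
def inclusive_range(first, last):
    """Model the range notation first..last, which includes both endpoints."""
    return list(range(first, last + 1))

def extract_extent(value, high, low):
    """Return the bits [high:low] of value, right-justified."""
    width = high - low + 1                 # an extent is inclusive, so add 1
    return (value >> low) & ((1 << width) - 1)
```

Under these assumptions, the extent [7:3] is five bits wide, so extracting it from a value yields a number between 0 and 31.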
Register Figures
The gray areas in register figures indicate reserved or unused bits and fields.
Bit ranges that are coupled with the field name specify the bits of the named field that are included in the register. The bit range may, but need not necessarily, correspond to the bit Extent in the register. See the explanation above Table 5–1 for more information.
Signal Names
The following examples describe signal-name conventions used in this document.
AlphaSignal[n:n] Boldface, mixed-case type denotes signal names that are assigned internal and external to the 21264/EV68A (that is, the signal traverses a chip interface pin).
AlphaSignal_x[n:n] When a signal has high and low assertion states, a lowercase italic x represents the assertion states. For example, SignalName_x[3:0] represents SignalName_H[3:0] and SignalName_L[3:0].
UNDEFINED
Operations specified as UNDEFINED may vary from moment to moment, implementation to implementation, and instruction to instruction within implementations. The operation may vary in effect from nothing to stopping system operation.
UNDEFINED operations may halt the processor or cause it to lose information. However, UNDEFINED operations must not cause the processor to hang, that is, reach an unhalted state from which there is no transition to a normal state in which the machine executes instructions.
UNPREDICTABLE
UNPREDICTABLE results or occurrences do not disrupt the basic operation of the processor; it continues to execute instructions in its normal manner. Further:
• Results or occurrences specified as UNPREDICTABLE may vary from moment to moment, implementation to implementation, and instruction to instruction within implementations. Software can never depend on results specified as UNPREDICTABLE.
• An UNPREDICTABLE result may acquire an arbitrary value subject to a few constraints. Such a result may be an arbitrary function of the input operands or of any state information that is accessible to the process in its current access mode. UNPREDICTABLE results may be unchanged from their previous values.
• Operations that produce UNPREDICTABLE results may also produce exceptions.
• An occurrence specified as UNPREDICTABLE may happen or not based on an arbitrary choice function. The choice function is subject to the same constraints as are UNPREDICTABLE results and, in particular, must not constitute a security hole.
Specifically, UNPREDICTABLE results must not depend upon, or be a function of, the contents of memory locations or registers that are inaccessible to the current process in the current access mode.
Also, operations that may produce UNPREDICTABLE results must not:
– Write or modify the contents of memory locations or registers to which the current process in the current access mode does not have access, or
– Halt or hang the system or any of its components.
For example, a security hole would exist if some UNPREDICTABLE result depended on the value of a register in another process, on the contents of processor temporary registers left behind by some previously running process, or on a sequence of actions of different processes.
X
Do not care. A capital X represents any valid value.
1 Introduction

This chapter provides a brief introduction to the Alpha architecture, Compaq’s RISC (reduced instruction set computing) architecture designed for high performance. The chapter then summarizes the specific features of the Alpha 21264/EV68A microprocessor (hereafter called the 21264/EV68A) that implements the Alpha architecture. Appendix A provides a list of Alpha instructions.
The companion volume to this document, the Alpha Architecture Reference Manual, Fourth Edition, contains the complete architecture information.

1.1 The Architecture

The Alpha architecture is a 64-bit load and store RISC architecture designed with particular emphasis on speed, multiple instruction issue, multiple processors, and software migration from many operating systems.
All registers are 64 bits long and all operations are performed between 64-bit registers. All instructions are 32 bits long. Memory operations are either load or store operations. All data manipulation is done between registers.
The Alpha architecture supports the following data types:
8-, 16-, 32-, and 64-bit integers
IEEE 32-bit and 64-bit floating-point formats
VAX architecture 32-bit and 64-bit floating-point formats
In the Alpha architecture, instructions interact with each other only by one instruction writing to a register or memory location and another instruction reading from that register or memory location. This use of resources makes it easy to build implementations that issue multiple instructions every CPU cycle.
The 21264/EV68A uses a set of subroutines, called privileged architecture library code (PALcode), that is specific to a particular Alpha operating system implementation and hardware platform. These subroutines provide operating system primitives for context switching, interrupts, exceptions, and memory management. These subroutines can be invoked by hardware or CALL_PAL instructions. CALL_PAL instructions use the function field of the instruction to vector to a specified subroutine. PALcode is written in standard machine code with some implementation-specific extensions to provide direct access to low-level hardware functions. PALcode supports optimizations for multiple operating systems, flexible memory-management implementations, and multi-instruction atomic sequences.
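The role of the CALL_PAL function field can be illustrated with a minimal C sketch. It assumes the standard Alpha PALcode instruction format (opcode in bits <31:26>, function in bits <25:0>, with CALL_PAL using opcode 0x00). The helper names are hypothetical and are not part of any Compaq software. How PALcode maps a function value to an entry point is implementation specific and is not modeled here.

```c
#include <stdint.h>
#include <stdbool.h>

/* Assumed PALcode instruction format: opcode <31:26>, function <25:0>. */
#define ALPHA_OPCODE_SHIFT     26
#define ALPHA_OPCODE_CALL_PAL  0x00u
#define ALPHA_PAL_FUNC_MASK    0x03FFFFFFu

/* Return the 6-bit major opcode of a 32-bit Alpha instruction. */
static inline uint32_t alpha_opcode(uint32_t insn)
{
    return insn >> ALPHA_OPCODE_SHIFT;
}

/* True if the instruction word is a CALL_PAL. */
static inline bool is_call_pal(uint32_t insn)
{
    return alpha_opcode(insn) == ALPHA_OPCODE_CALL_PAL;
}

/* Return the 26-bit function field that selects the PALcode subroutine. */
static inline uint32_t call_pal_function(uint32_t insn)
{
    return insn & ALPHA_PAL_FUNC_MASK;
}
```

A caller would first test is_call_pal() on the instruction word and then pass call_pal_function() to whatever dispatch mechanism the platform's PALcode defines.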
The Alpha architecture performs byte shifting and masking with normal 64-bit, register-to-register instructions. The 21264/EV68A performs single-byte and single-word load and store instructions.
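As an illustration of byte manipulation done entirely on 64-bit register values, the following C sketch extracts and replaces one byte of a quadword using only shifts and masks. It shows the general technique rather than any specific Alpha instruction, and the function names are hypothetical.

```c
#include <stdint.h>

/* Extract byte `index` (0 = least significant) from a 64-bit value. */
static inline uint64_t extract_byte(uint64_t q, unsigned index)
{
    return (q >> ((index & 7u) * 8u)) & 0xFFu;
}

/* Return `q` with byte `index` replaced by `b`; other bytes are unchanged. */
static inline uint64_t insert_byte(uint64_t q, unsigned index, uint8_t b)
{
    unsigned shift = (index & 7u) * 8u;
    uint64_t mask = (uint64_t)0xFF << shift;
    return (q & ~mask) | ((uint64_t)b << shift);
}
```

Both helpers operate purely on register-sized values, so no memory access narrower than a quadword is needed.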

1.1.1 Addressing

The basic addressable unit in the Alpha architecture is the 8-bit byte. The 21264/EV68A supports a 48-bit or 43-bit virtual address (selectable under IPR control).
Virtual addresses as seen by the program are translated into physical memory addresses by the memory-management mechanism. The 21264/EV68A supports a 44-bit physical address.
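To put these address widths in concrete terms, the short C sketch below computes the size of an address space from its width in bits and the number of pages it contains. The 8KB page size is the base page size stated in the feature list of this manual. The function names are illustrative only.

```c
#include <stdint.h>

#define PAGE_SHIFT 13u  /* 8KB base page: 2^13 bytes */

/* Number of bytes addressable with `bits` address bits (bits < 64). */
static inline uint64_t address_space_bytes(unsigned bits)
{
    return (uint64_t)1 << bits;
}

/* Number of 8KB pages in an address space of `bits` bits (bits >= 13). */
static inline uint64_t address_space_pages(unsigned bits)
{
    return (uint64_t)1 << (bits - PAGE_SHIFT);
}
```

With these helpers, a 43-bit virtual address spans 8 TB, a 48-bit virtual address spans 256 TB, and the 44-bit physical address spans 16 TB.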

1.1.2 Integer Data Types

The Alpha architecture supports the four integer data types listed in Table 1–1.

Table 1–1 Integer Data Types

Data Type   Description
Byte        A byte is 8 contiguous bits that start at an addressable byte boundary. A byte is an 8-bit value.
Word        A word is 2 contiguous bytes that start at an arbitrary byte boundary. A word is a 16-bit value.
Longword    A longword is 4 contiguous bytes that start at an arbitrary byte boundary. A longword is a 32-bit value.
Quadword    A quadword is 8 contiguous bytes that start at an arbitrary byte boundary. A quadword is a 64-bit value.

Note: Alpha implementations may impose a significant performance penalty when accessing operands that are not naturally aligned. Refer to the Alpha Architecture Handbook, Version 4, for details.
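The four integer types and the notion of natural alignment can be sketched in C as follows. The typedef names are hypothetical, and the alignment test assumes the usual definition: an operand is naturally aligned when its address is a multiple of its size, with the size being a power of two.

```c
#include <stdint.h>
#include <stdbool.h>
#include <stddef.h>

/* Illustrative C equivalents of the Alpha integer data types. */
typedef uint8_t  alpha_byte;      /*  8-bit value */
typedef uint16_t alpha_word;      /* 16-bit value */
typedef uint32_t alpha_longword;  /* 32-bit value */
typedef uint64_t alpha_quadword;  /* 64-bit value */

/* True if `addr` is a multiple of `size`; `size` must be a power of two. */
static inline bool is_naturally_aligned(uint64_t addr, size_t size)
{
    return (addr & ((uint64_t)size - 1u)) == 0;
}
```

For example, a quadword at address 0x1004 is not naturally aligned, while a longword at the same address is.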

1.1.3 Floating-Point Data Types

The 21264/EV68A supports the following floating-point data types:
Longword integer format in floating-point unit
Quadword integer format in floating-point unit
IEEE floating-point formats
– S_floating
– T_floating
VAX floating-point formats
– F_floating
– G_floating
– D_floating (limited support)

1.2 21264/EV68A Microprocessor Features
The 21264/EV68A microprocessor is a superscalar pipelined processor. It is packaged in a 587-pin PGA carrier and has removable application-specific heat sinks. A number of configuration options allow its use in system designs ranging from extremely simple uniprocessor systems with minimum component count to high-performance multiprocessor systems with very high cache and memory bandwidth.
The 21264/EV68A can issue four Alpha instructions in a single cycle, thereby minimizing the average cycles per instruction (CPI). A number of low-latency and/or high-throughput features in the instruction issue unit and the onchip components of the memory subsystem further reduce the average CPI.
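The relationship between issue width, peak rate, and CPI can be made concrete with a small C sketch. It assumes only the four-instruction issue width stated above. The clock frequency is a caller-supplied parameter, and any value used with it is hypothetical rather than a specification of this part.

```c
#define ISSUE_WIDTH 4  /* instructions issued per CPU cycle at peak */

/* Peak instruction rate (instructions per second) for a given clock in Hz. */
static inline double peak_instructions_per_second(double clock_hz)
{
    return ISSUE_WIDTH * clock_hz;
}

/* Lowest achievable average cycles per instruction at full issue width. */
static inline double minimum_cpi(void)
{
    return 1.0 / ISSUE_WIDTH;
}
```

Sustained CPI on real workloads is higher than this floor, because cache misses, branch mispredictions, and data dependencies prevent full issue in every cycle.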
The 21264/EV68A and associated PALcode implement IEEE single-precision and double-precision, VAX F_floating and G_floating data types, and support longword (32-bit) and quadword (64-bit) integers. Byte (8-bit) and word (16-bit) support is provided by byte-manipulation instructions. Limited hardware support is provided for the VAX D_floating data type.
Other 21264/EV68A features include:
The ability to issue up to four instructions during each CPU clock cycle.
A peak instruction execution rate of four times the CPU clock frequency.
An onchip, demand-paged memory-management unit with translation buffer, which, when used with PALcode, can implement a variety of page table structures and translation algorithms. The unit consists of a 128-entry, fully-associative data translation buffer (DTB) and a 128-entry, fully-associative instruction translation buffer (ITB), with each entry able to map a single 8KB page or a group of 8, 64, or 512 8KB pages. The allocation scheme for the ITB and DTB is round-robin. The size of each translation buffer entry’s group is specified by hint bits stored in the entry. The DTB and ITB implement 8-bit address space numbers (ASN), MAX_ASN=255.
Two onchip, high-throughput pipelined floating-point units, capable of executing both VAX and IEEE floating-point data types.
An onchip, 64KB virtually-addressed instruction cache with 8-bit ASNs (MAX_ASN=255).
An onchip, virtually-indexed, physically-tagged dual-read-ported, 64KB data cache.
Supports a 48-bit or 43-bit virtual address (program selectable).
Supports a 44-bit physical address.
An onchip I/O write buffer with four 64-byte entries for I/O write transactions.
An onchip, 8-entry victim data buffer.
An onchip, 32-entry load queue.
An onchip, 32-entry store queue.
An onchip, 8-entry miss address file for cache fill requests and I/O read transactions.
An onchip, 8-entry probe queue, holding pending system port probe commands.
An onchip, duplicate tag array used to maintain level 2 cache coherency.
A 64-bit data bus with onchip parity and error correction code (ECC) support.
Support for an external second-level (Bcache) cache. The size and some timing parameters of the Bcache are programmable.
An internal clock generator providing a high-speed clock used by the 21264/EV68A, and two clocks for use by the CPU module.
Onchip performance counters to measure and analyze CPU and system performance.
Chip and module level test support, including an instruction cache test interface to support chip and module level testing.
A 2.0-V external interface.
Refer to Chapter 9 for 21264/EV68A dc and ac electrical characteristics. Refer to the Alpha Architecture Handbook, Version 4, Appendix E, for waivers and any other implementation-dependent information.
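The translation buffer figures in the feature list above (128 entries, each mapping 1, 8, 64, or 512 8KB pages) determine how much memory a fully loaded ITB or DTB can cover. The following C sketch works through that arithmetic. It assumes, for illustration only, that the hint is a 2-bit value gh selecting 8^gh pages; the exact encoding of the hint bits is not given in this section, and the names used are hypothetical.

```c
#include <stdint.h>

#define PAGE_BYTES  (8u * 1024u)  /* 8KB base page */
#define TB_ENTRIES  128u          /* entries in each of the ITB and DTB */

/* Assumed encoding: gh = 0..3 selects a group of 1, 8, 64, or 512 pages. */
static inline uint64_t tb_entry_pages(unsigned gh)
{
    return (uint64_t)1 << (3u * (gh & 3u));
}

/* Bytes mapped by a single translation buffer entry with hint `gh`. */
static inline uint64_t tb_entry_bytes(unsigned gh)
{
    return tb_entry_pages(gh) * PAGE_BYTES;
}

/* Bytes covered when all 128 entries use the same hint `gh`. */
static inline uint64_t tb_reach_bytes(unsigned gh)
{
    return TB_ENTRIES * tb_entry_bytes(gh);
}
```

Under these assumptions, a buffer filled with single-page entries covers 1 MB, while one filled with 512-page entries covers 512 MB.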
2 Internal Architecture

This chapter provides both an overview of the 21264/EV68A microarchitecture and a system designer’s view of the 21264/EV68A implementation of the Alpha architecture. The combination of the 21264/EV68A microarchitecture and privileged architecture library code (PALcode) defines the chip’s implementation of the Alpha architecture. If a certain piece of hardware seems to be “architecturally incomplete,” the missing functionality is implemented in PALcode. Chapter 6 provides more information on PALcode.
This chapter describes the major functional hardware units and is not intended to be a detailed hardware description of the chip. It is organized as follows:
21264/EV68A microarchitecture
Pipeline organization
Instruction issue and retire rules
Load instructions to R31/F31 (software-directed instruction prefetch)
Special cases of Alpha instruction execution
Memory and I/O address space
Miss address file (MAF) and load-merging rules
Instruction ordering
Replay traps
I/O write buffer and the WMB instruction
Performance measurement support
Floating-point control register
AMASK and IMPLVER instruction values
Design examples

2.1 21264/EV68A Microarchitecture

The 21264/EV68A microprocessor is a high-performance third-generation implementation of the Compaq Alpha architecture. The 21264/EV68A consists of the following sections, as shown in Figure 2–1:
Instruction fetch, issue, and retire unit (Ibox)
Integer execution unit (Ebox)
Floating-point execution unit (Fbox)
Onchip caches (Icache and Dcache)
Memory reference unit (Mbox)
External cache and system interface unit (Cbox)
Pipeline operation sequence

2.1.1 Instruction Fetch, Issue, and Retire Unit

The instruction fetch, issue, and retire unit (Ibox) consists of the following subsections:
Virtual program counter logic
Branch predictor
Instruction-stream translation buffer (ITB)
Instruction fetch logic
Register rename maps
Integer and floating-point issue queues
Exception and interrupt logic
Retire logic

2.1.1.1 Virtual Program Counter Logic

The virtual program counter (VPC) logic maintains the virtual addresses for instructions that are in flight. There can be up to 80 instructions, in 20 successive fetch slots, in flight between the register rename mappers and the end of the pipeline. The VPC logic contains a 20-entry table to store these fetched VPC addresses.
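The numbers above imply four instructions per fetch slot (80 instructions in 20 slots). The C sketch below models a 20-entry VPC table on that basis. It is a simplified software illustration, not a description of the hardware: the circular allocation policy, the assumption that the four instructions of a slot are contiguous in memory, and all names are assumptions made for the example. The 4-byte instruction size follows from the 32-bit Alpha instruction length.

```c
#include <stdint.h>

#define VPC_SLOTS       20u  /* entries in the VPC table */
#define INSNS_PER_SLOT  4u   /* 80 in-flight instructions / 20 slots */
#define INSN_BYTES      4u   /* all Alpha instructions are 32 bits long */

typedef struct {
    uint64_t slot_vpc[VPC_SLOTS];  /* virtual address of each fetch slot */
    unsigned next;                 /* next slot to allocate (assumed circular) */
} vpc_table;

/* Record the address of a newly fetched slot and return its table index. */
static inline unsigned vpc_record(vpc_table *t, uint64_t vpc)
{
    unsigned slot = t->next;
    t->slot_vpc[slot] = vpc;
    t->next = (slot + 1u) % VPC_SLOTS;
    return slot;
}

/* Virtual address of instruction `pos` (0..3) within fetch slot `slot`. */
static inline uint64_t vpc_of_instruction(const vpc_table *t,
                                          unsigned slot, unsigned pos)
{
    return t->slot_vpc[slot % VPC_SLOTS]
         + (uint64_t)(pos % INSNS_PER_SLOT) * INSN_BYTES;
}

/* Maximum number of in-flight instructions the table can describe. */
static inline unsigned vpc_max_in_flight(void)
{
    return VPC_SLOTS * INSNS_PER_SLOT;
}
```

Storing one address per slot rather than one per instruction keeps the table at 20 entries while still allowing the address of any of the 80 in-flight instructions to be recovered.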