DEC DIGITAL Server 7300 DIGITAL Server 7300/7300R Series Service Manual

DIGITAL Server 7300/7300R Series Service Manual
Part Number: EK-K9FWW-SG. A01
This manual is for anyone who services a DIGITAL Server 7300/7300R Series system. It covers installation, power-up, initial troubleshooting, and component installation.
January 1998
Digital Equipment Corporation Maynard, Massachusetts
Digital Equipment Corporation makes no representations that the use of its products in the manner described in this publication will not infringe on existing or future patent rights, nor do the descriptions contained in this publication imply the granting of licenses to make, use, or sell equipment or software in accordance with the description.
The information in this document is subject to change without notice and should not be construed as a commitment by Digital Equipment Corporation.
Digital Equipment Corporation assumes no responsibility for any errors that may appear in this document.
The software, if any, described in this document is furnished under a license and may be used or copied only in accordance with the terms of such license. No responsibility is assumed for the use or reliability of software or equipment that is not supplied by Digital Equipment Corporation or its affiliated companies.
© Digital Equipment Corporation 1998. All rights reserved.
The following are trademarks of Digital Equipment Corporation: DEClaser, Digital, OpenVMS, PATHWORKS, and the DIGITAL logo.
The following are third-party trademarks: Adobe and PostScript are registered trademarks of Adobe Systems, Incorporated. Helvetica and Times are registered trademarks of Linotype Co. Microsoft and MS-DOS are registered trademarks and Windows is a trademark of Microsoft Corporation. Lifestyle 28.8 DATA/FAX Modem is a trademark of Motorola, Inc. UNIX is a registered trademark in the U.S. and other countries, licensed exclusively through X/Open Company Ltd. U.S. Robotics and Sportster are registered trademarks of U.S. Robotics. Windows NT is a trademark of Microsoft, Inc. All other trademarks and registered trademarks are the property of their respective holders.
FCC Notice:
The equipment described in this manual generates, uses, and may emit radio frequency energy. The equipment has been type tested and found to comply with the limits for a Class A digital device pursuant to Part 15 of FCC Rules, which are designed to provide reasonable protection against such radio frequency interference. Operation of this equipment in a residential area may cause interference, in which case the user at his own expense will be required to take whatever measures are required to correct the interference.
Shielded Cables:
If shielded cables have been supplied or specified, they must be used on the
system in order to maintain international regulatory compliance.
Warning!
This is a Class A product. In a domestic environment this product may cause radio
interference, in which case the user may be required to take adequate measures.
Achtung!
Dieses ist ein Gerät der Funkstörgrenzwertklasse A. In Wohnbereichen können bei Betrieb dieses Gerätes Rundfunkstörungen auftreten, in welchen Fällen der Benutzer für entsprechende Gegenmaßnahmen verantwortlich ist.
Avertissement!
Cet appareil est un appareil de Classe A. Dans un environnement résidentiel, cet appareil peut provoquer des brouillages radioélectriques. Dans ce cas, il peut être demandé à l'utilisateur de prendre les mesures appropriées.

Table of Contents

1 System Overview

DIGITAL Server 7300/7300R System Drawer (BA30A) ......................................................1–3
Cover Interlocks ............................................................................................................ 1–4
Cabinet System.....................................................................................................................1–6
Cabinet Differences.......................................................................................................1–7
Cabinet System Fan Tray............................................................................................... 1–7
Pedestal System.................................................................................................................... 1–8
Control Panel and Drives.................................................................................................... 1–10
System Consoles................................................................................................................. 1–12
SRM Console............................................................................................................... 1–12
AlphaBIOS Console.....................................................................................................1–12
Environment Variables................................................................................................1–13
System Architecture ........................................................................................................... 1–14
System Motherboard........................................................................................................... 1–16
CPU Types.........................................................................................................................1–18
CPU Variants............................................................................................................... 1–18
CPU Module Layout....................................................................................................1–18
Alpha Chip Composition ............................................................................................. 1–19
Chip Description.......................................................................................................... 1–19
CPU Configuration Rules ............................................................................................ 1–19
CPU Module Color Codes............................................................................................ 1–19
Memory Modules ............................................................................................................... 1–20
Memory Variants......................................................................................................... 1–21
Memory Operation.......................................................................................................1–21
Memory Configuration Rules....................................................................................... 1–21
Memory Addressing ........................................................................................................... 1–23
System Bus......................................................................................................................... 1–25
System Bus to PCI Bus Bridge Module............................................................................... 1–27
PCI I/O Subsystem ............................................................................................................. 1–29
Server Control Module....................................................................................................... 1–31
Power Control Module ....................................................................................................... 1–33
Power Supply..................................................................................................................... 1–35

2 Power-Up

Control Panel ....................................................................................................................... 2–2
Power-Up Sequence....................................................................................................... 2–4
SROM Power-Up Test Flow................................................................................................. 2–8
SROM Errors Reported ...................................................................................................... 2–11
XSROM Power-Up Test Flow............................................................................................ 2–12
XSROM Errors Reported.................................................................................................... 2–15
Console Power-Up Tests .................................................................................................... 2–17
Console Device Determination........................................................................................... 2–19
Console Device Options .............................................................................................. 2–20
Console Power-Up Display................................................................................................. 2–21
Fail-Safe Loader................................................................................................................. 2–25

3 Troubleshooting

Troubleshooting with LEDs...................................................................................................3-2
Processor (CPU) LEDs ..................................................................................................3-3
System Bus to PCI Bus Bridge Module LEDs (B3040-AA)............................................3-3
Cabinet Power and Fan LEDs.........................................................................................3-4
Troubleshooting Power Problems..........................................................................................3-5
If Halt Is Caused by Power, Fan, or Over-Temperature Problems...................................3-5
If Power Problem Occurs at Power-Up ...........................................................................3-6
Recommended Order for Troubleshooting Failure at Power-Up......................................3-6
Power Control Module LEDs..........................................................................................3-7
Troubleshooting with the Maintenance Bus (I²C Bus)............................................................3-9
Monitoring System Conditions .....................................................................................3-10
Displaying Faults..........................................................................................................3-10
Writing Error States......................................................................................................3-10
Tracking Configurations............................................................................................... 3-10
Running Diagnostics – Test Command.............................................................................. 3-11
Testing an Entire System..................................................................................................... 3-12
Testing Memory...........................................................................................................3-15
Testing PCI Buses and Devices .................................................................................... 3-18

4 Power System

Power Supply....................................................................................................................... 4–2
Power Control Module Features ........................................................................................... 4–4
Power Circuit and Cover Interlocks...................................................................................... 4–6
Power-Up/Down Sequence...................................................................................................4–8
Cabinet Power Configuration Rules.................................................................................... 4–10
Pedestal Power Configuration Rules (North America and Japan)........................................4–12
Pedestal Power Configuration Rules (Europe and Asia Pacific).......................................... 4–14

5 Error Detection with Error Registers

Overview of Error Detection.................................................................................................5–2
Error Registers......................................................................................................................5–5
External Interface Status Register – EI_STAT............................................................... 5–6
External Interface Address Register - EI_ADDR ......................................................... 5–10
MC Error Information Register 0 (MC_ERR0 - Offset = 800)...................................... 5–12
MC Error Information Register 0 (MC_ERR0 - Offset = 800)...................................... 5–13
MC Error Information Register 1 (MC_ERR1 - Offset = 840)..................................... 5–14
CAP Error Register (CAP_ERR - Offset = 880).......................................................... 5–16
PCI Error Status Register 1 (PCI_ERR1 - Offset = 1040)............................................. 5–19
Troubleshooting IOD-Detected Errors................................................................................ 5–20
System Bus ECC Error ................................................................................................ 5–21
System Bus Nonexistent Address Error........................................................................ 5–22
System Bus Address Parity Error................................................................................. 5–23
PIO Buffer Overflow Error (PIO_OVFL)..................................................................... 5–24
Page Table Entry Invalid Error .................................................................................... 5–25
PCI Master Abort.........................................................................................................5–25
PCI System Error......................................................................................................... 5–25
PCI Parity Error........................................................................................................... 5–25
Broken Memory........................................................................................................... 5–26
Command Codes.......................................................................................................... 5–28
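Node IDs..................................................................................................................... 5–29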
Double Error Halts and Machine Checks While in PAL Mode............................................ 5–31
PALcode Overview......................................................................................................5–31
Double Error Halt........................................................................................................5–32
Machine Checks While in PAL.................................................................................... 5–32

6 Removal and Replacement

System Safety.......................................................................................................................6–2
FRU List...............................................................................................................................6–3
Power System FRUs.............................................................................................................6–9
System Drawer Exposure (Cabinet)....................................................................................6–11
System Drawer Exposure (Pedestal) ................................................................................... 6–13
CPU Removal and Replacement........................................................................................ 6–15
CPU Fan Removal and Replacement.................................................................................. 6–17
Memory Removal and Replacement................................................................................... 6–19
Power Control Module Removal and Replacement............................................................. 6–21
System Bus to PCI Bus Bridge (B3040-AA) Module Removal and Replacement ............... 6–23
System Motherboard Removal and Replacement................................................................ 6–25
PCI/EISA Motherboard (B3050/B3052) Removal and Replacement.................................. 6–27
Server Control Module Removal and Replacement ............................................................ 6–29
PCI/EISA Option Removal and Replacement..................................................................... 6–31
Power Supply Removal and Replacement........................................................................... 6–33
Power Harness Removal and Replacement......................................................................... 6–35
System Drawer Fan Removal and Replacement.................................................................. 6–37
Cover Interlock Removal and Replacement........................................................................ 6–39
Operator Control Panel Removal and Replacement (Cabinet)............................................. 6–41
Operator Control Panel Removal and Replacement (Pedestal)............................................ 6–43
Floppy Removal and Replacement ..................................................................................... 6–45
CD-ROM Removal and Replacement................................................................................. 6–47
Cabinet Fan Tray Removal and Replacement..................................................................... 6–49
Cabinet Fan Tray Power Supply Removal and Replacement............................................... 6–51
Cabinet Fan Tray Fan Removal and Replacement .............................................................. 6–53
Cabinet Fan Tray Fan Fail Detect Module Removal and Replacement ............................... 6–55
StorageWorks Shelf Removal and Replacement................................................................. 6–57

7 Running Utilities

Selecting Utilities from the AlphaBIOS Menu...................................................................... 7–2
Running Utilities from a Serial Terminal.............................................................................. 7–3
Running the EISA Configuration Utility............................................................................... 7–4
Running RAID Standalone Configuration Utility.................................................................. 7–5
Updating Firmware............................................................................................................... 7–6
Updating Firmware from the Internal CD-ROM ............................................................ 7–8
Updating Firmware from the Internal Floppy Disk......................................................... 7–9
Updating Firmware from a Network Device................................................................ 7–12
LFU Commands................................................................................................................. 7–13

8 SRM Console Commands and Environment Variables

Summary of SRM Console Commands................................................................................. 8–2
Summary of SRM Environment Variables............................................................................ 8–4
Recording Environment Variables........................................................................................ 8–6

9 Operating the System Remotely

RCM Console Overview....................................................................................................... 9–2
Modem Usage ............................................................................................................... 9–3
Entering and Leaving Command Mode.......................................................................... 9–6
RCM Commands........................................................................................................... 9–7
Dial-Out Alerts............................................................................................................ 9–15
Resetting the RCM to Factory Defaults........................................................................ 9–18
Troubleshooting Guide ................................................................................................ 9–19
Modem Dialog Details................................................................................................. 9–22

Figures

Figure 1-1 Components of the BA30A System Drawer............................... 1–3
Figure 1-2 Cover Interlock Circuit............................................................... 1–5
Figure 1-3 DIGITAL Server 7300/7300R Cabinet System ........................... 1–6
Figure 1-4 Cabinet Fan Tray........................................................................ 1–7
Figure 1-5 Pedestal System Front................................................................. 1–8
Figure 1-6 Pedestal System Rear.................................................................. 1–9
Figure 1-7 Control Panel Assembly ........................................................... 1–10
Figure 1-8 AlphaBIOS Boot Menu............................................................. 1–13
Figure 1-9 Architecture Diagram ............................................................... 1–14
Figure 1-10 System Motherboard Module Locations.................................. 1–16
Figure 1-11 CPU Module Layout............................................................... 1–18
Figure 1-12 Memory Module Layout......................................................... 1–20
Figure 1-13 How Memory Addressing Is Calculated.................................. 1–23
Figure 1-14 System Bus Block Diagram.................................................... 1–25
Figure 1-15 Bridge Module........................................................................ 1–27
Figure 1-16 PCI Block Diagram................................................................. 1–29
Figure 1-17 Server Control Module ........................................................... 1–31
Figure 1-18 Power Control Module............................................................ 1–33
Figure 1-19 Location of Power Supply....................................................... 1–35
Figure 2-1 Control Panel and LCD Display.................................................. 2–2
Figure 2-2 Power-Up Flow........................................................................... 2–4
Figure 2-3 Contents of FEPROMs................................................................ 2–5
Figure 2-4 Console Code Critical Path......................................................... 2–6
Figure 2-5 SROM Power-Up Test Flow....................................................... 2–8
Figure 2-6 XSROM Power-Up Flowchart .................................................. 2–12
Figure 2-7 Console Device Determination Flowchart................................. 2–19
Figure 3-1 CPU and Bridge Module LEDs....................................................3-2
Figure 3-2 Cabinet Power and Fan LEDs......................................................3-4
Figure 3-3 PCM LEDs..................................................................................3-7
Figure 3-4 I²C Bus Block Diagram................................................................3-9
Figure 4-1 Power Supply Outputs ................................................................ 4–2
Figure 4-2 Power Control Module................................................................ 4–4
Figure 4-3 Power Circuit Diagram...............................................................4–6
Figure 4-4 Power Up/Down Sequence Flowchart......................................... 4–8
Figure 4-5 -EN & -EP Single Drawer Cabinet Power Configuration .......... 4–10
Figure 4-6 -EN Three Drawer Cabinet Power Configuration...................... 4–11
Figure 4-7 Pedestal Power Distribution (N.A. and Japan) ...........................4–12
Figure 4-8 Pedestal Power Distribution (Europe and AP)............................ 4–14
Figure 5-1 Error Detector Placement ........................................................... 5–2
Figure 6-1 System Drawer FRU Locations................................................... 6–3
Figure 6-2 Location of Power System FRUs................................................ 6–9
Figure 6-3 Exposing System Drawer (H9A10-EN & -EP Cabinet)..............6–11
Figure 6-4 Exposing System Drawer (Pedestal) ..........................................6–13
Figure 6-5 Removing a CPU Module..........................................................6–15
Figure 6-6 Removing CPU Fan...................................................................6–17
Figure 6-7 Removing a Memory Module....................................................6–19
Figure 6-8 Removing Power Control Module .............................................6–21
Figure 6-9 Removing System Bus to PCI/EISA Bus Bridge Module...........6–23
Figure 6-10 Removing the System Motherboard........................................6–25
Figure 6-11 Replacing PCI/EISA Motherboard...........................................6–27
Figure 6-12 Removing Server Control Module ...........................................6–29
Figure 6-13 Removing PCI/EISA Option....................................................6–31
Figure 6-14 Removing Power Supply .........................................................6–33
Figure 6-15 Removing Power Harness........................................................6–35
Figure 6-16 Removing System Drawer Fan ................................................6–37
Figure 6-17 Removing Cover Interlocks.....................................................6–39
Figure 6-18 Removing OCP (Cabinet)........................................................6–41
Figure 6-19 Removing OCP (Pedestal).......................................................6–43
Figure 6-20 Removing Floppy Drive ..........................................................6–45
Figure 6-21 Removing CD-ROM................................................................6–47
Figure 6-22 Removing Cabinet Fan Tray....................................................6–49
Figure 6-23 Removing Cabinet Fan Tray Power Supply .............................6–51
Figure 6-24 Removing Cabinet Fan Tray Fan.............................................6–53
Figure 6-25 Removing Fan Tray Fan Fail Detect Module...........................6–55
Figure 6-26 Removing StorageWorks Shelf................................................6–57
Figure 7-1 Running a Utility from a Graphics Monitor ............................... 7–2
Figure 7-2 Starting LFU from the AlphaBIOS Console................................ 7–6
Figure 7-3 Formatting a FAT Partition......................................................... 7–9
Figure 7-4 Standard Formatting..................................................................7–10
Figure 9-1 RCM Connections...................................................................... 9–3

Tables

Table 1-1 PCI Motherboard Slot Numbering ..............................................1–29
Table 2-1 Control Panel Display.................................................................. 2–3
Table 2-2 XSROM Tests ............................................................................2–13
Table 2-3 Memory Tests.............................................................................2–14
Table 2-4 IOD Tests .................................................................................. 2–17
Table 2-5 PCI Motherboard Tests (B3050/B3052)..................................... 2–18
Table 3-1 Power Control Module LED States...............................................3-8
Table 5-1 External Interface Status Register................................................ 5–8
Table 5-2 Loading and Locking Rules for External Interface Registers...... 5–11
Table 5-3 MC Error Information Register 0............................................... 5–12
Table 5-4 MC Error Information Register 0..............................................5–13
Table 5-5 MC Error Information Register 1............................................... 5–15
Table 5-6 CAP Error Register.................................................................... 5–17
Table 5-7 PCI Error Status Register 1........................................................ 5–19
Table 5-8 CAP Error Register Data Pattern................................................ 5–20
Table 5-9 System Bus ECC Error Data Pattern .......................................... 5–21
Table 5-10 System Bus Nonexistent Address Error Troubleshooting.......... 5–22
Table 5-11 Address Parity Error Troubleshooting...................................... 5–23
Table 5-12 Cause of PIO_OVFL Error....................................................... 5–24
Table 5-13 ECC Syndrome Bits Table....................................................... 5–27
Table 5-14 Decoding Commands............................................................... 5–28
Table 5-15 Node IDs.................................................................................. 5–30
Table 6-1 Field-Replaceable Unit Part Numbers..........................................6–4
Table 7-1 AlphaBIOS Option Key Mapping ................................................ 7–3
Table 7-2 File Locations for Creating Update Diskettes on a PC................ 7–10
Table 7-3 LFU Command Summary.......................................................... 7–13
Table 8-1 Summary of SRM Console Commands........................................8–2
Table 8-2 Environment Variable Summary.................................................. 8–4
Table 8-3 Environment Variables Worksheet............................................... 8–6
Table 9-1 RCM Command Summary........................................................... 9–7
Table 9-2 RCM Status Command Fields.................................................... 9–14
Table 9-3 RCM Troubleshooting ............................................................... 9–19
Table 9-4 RCM/Modem Interchange Summary.......................................... 9–24

Preface

Document Audience

This manual is written for the customer service engineer.

Document Structure

This manual uses a structured documentation design. Topics are organized into small sections for efficient online and printed reference. Each topic begins with an abstract, followed by an illustration or example, and ends with descriptive text.
This manual has nine chapters, as follows:
Chapter 1, System Overview, introduces the DIGITAL Server 7300/7300R series
pedestal and cabinet systems and gives an overview of the system bus modules.
Chapter 2, Power-Up, provides information on how to interpret the power-up display
on the operator control panel, the console screen, and system LEDs. It also describes how hardware diagnostics execute when the system is initialized.
Chapter 3, Troubleshooting, describes troubleshooting during power-up and booting,
as well as the test command.
Chapter 4, Power System, describes the DIGITAL Server 7300/7300R power system.
Chapter 5, Error Detection with Error Registers, describes the error registers used
to hold error information.
Chapter 6, Removal and Replacement, describes removal and replacement
procedures for field-replaceable units (FRUs).
Chapter 7, Running Utilities, explains how to run utilities such as the EISA
Configuration Utility and RAID Standalone Configuration Utility.
Chapter 8, SRM Console Commands and Environment Variables, summarizes the
commands used to examine and alter the system configuration.
Chapter 9, Operating the System Remotely, describes how to use the remote
console monitor (RCM) to monitor and control the system remotely.

Documentation Titles

The following table lists titles related to DIGITAL Server 7300/7300R series systems.
DIGITAL Server 7300/7300R Series Documentation
Title                                                                 Order Number
DIGITAL 7300/7300R Series User and Configuration Documentation Kit   QC-06DAC-H8
  System Drawer User's Guide                                          ER-K9FWW-UA
  Configuration and Installation Guide                                ER-K9FWW-IA
  Illustrated Parts Breakdown                                         ER-K9FWW-IP
  CPU Installation Card                                               ER-PD02U-IN
  Memory Installation Card                                            ER-ACSMA-IN
  Power Supply Installation Card                                      ER-H7291-IN
  ServerWORKS Manager Administrator User's Guide                      ER-4QXAA-UA

Information on the Internet

Using a Web browser, you can access information about DIGITAL Servers at:
http://www.windowsnt.digital.com/products
Access the latest system firmware with a Web browser at:
http://www.windowsnt.digital.com/support

Chapter 1, System Overview

This chapter introduces the DIGITAL Server 7300/7300R series systems. These systems are available in cabinets or pedestals.
The pedestal system has one system drawer and up to three StorageWorks shelves. The cabinet system can have a combination of system drawers and StorageWorks shelves that occupy the five sections of the cabinet. There is one system drawer, the BA30A, used with the DIGITAL Server 7300/7300R series.
Topics in this chapter include the following:
DIGITAL Server 7300/7300R System Drawer (BA30A)
Cabinet System
Pedestal System
Control Panel and Drives
System Consoles
System Architecture
System Motherboard
CPU Types
Memory Modules
System Bus
System Bus to PCI Bus Bridge Module
PCI I/O Subsystem
Server Control Module
Power Control Module
Power Supply

DIGITAL Server 7300/7300R System Drawer (BA30A)

Components in the BA30A system drawer are located in the system bus card cage, the PCI card cage, the control panel assembly, and the power and cooling section. The drawer measures 30 cm x 45 cm (11.8 in. x 17.7 in.) and fully configured weighs approximately 45.5 kg (~100 lbs).
Figure 1-1 Components of the BA30A System Drawer
When the system drawer is in a pedestal, the control panel assembly is mounted in a tray at the top of the drawer.
The numbered callouts in Figure 1-1 refer to components of the system drawer.
1. System card cage, which holds the system motherboard and the CPU, memory, bridge, and power control modules.
2. PCI/EISA card cage, which holds the PCI motherboard, option cards, and server control module.
3. Server control module, which holds the I/O connectors and remote console monitor.
4. Control panel assembly, which includes the control panel, a floppy drive, and a CD-ROM drive.
5. Power and cooling section, which contains one to three power supplies and fans.

Cover Interlocks

The system drawer has three cover interlocks: one for the system bus card cage, one for the PCI card cage, and one for the power and system fan area. Figure 1-2 shows the cover interlock circuit. Note that “B305n” in Figure 1-2 stands for either the B3050-AA or B3052-AA PCI Motherboard.
Figure 1-2 Cover Interlock Circuit (blocks shown: OCP logic and switch, B305n, B3040, SCM, PCM, power supply, motherboard, and the three cover interlock switches; signals shown: RSM_DC_EN_L, DC_ENABLE_L, POWER_FAULT_L)
NOTE: The cover interlocks must be engaged to enable power-up. To override the cover interlocks, find a suitable object to close the interlock circuit.

Cabinet System

The DIGITAL Server 7300/7300R series cabinet system can accommodate multiple systems in a single cabinet. There are two cabinet variations that can hold different system configurations. From the outside, the cabinets look almost identical and are of one basic type. The differences are in power controllers.
Figure 1-3 DIGITAL Server 7300/7300R Cabinet System

Cabinet Differences

Cabinet    Power                                     Mounting                         Destination
H9A10-EN   Two 120-volt H7600-AA power controllers   Pull-out tray (max drawers: 3)   North America, Asia Pacific
H9A10-EP   Two 240-volt H7600-DB power controllers   Pull-out tray (max drawers: 3)   Europe

Cabinet System Fan Tray

At the top of cabinet systems is a fan tray containing three exhaust fans, a small 12-volt power supply, and a module that distributes power to the server control module in each drawer.
Figure 1-4 Cabinet Fan Tray (labels: Fan LED, Power LED, To SCM, AC Power)

Pedestal System

The pedestal system contains one system drawer with a control panel, a CD-ROM drive, and a floppy drive. In the pedestal control panel area there is space for an optional tape or disk drive. Three StorageWorks shelves provide up to 90 Gbytes of in-cabinet storage.
Figure 1-5 Pedestal System Front
In the pedestal system, the control panel is located at the top left in a tray. There is space for an optional device beside it.
Figure 1-6 Pedestal System Rear

Control Panel and Drives

The control panel includes the On/Off, Halt, and Reset buttons and a display. In a pedestal system the control panel is located in a tray at the top of the system drawer. In a cabinet system, the control panel is at the bottom of the system drawer with the CD-ROM drive and the floppy drive.
Figure 1-7 Control Panel Assembly (pedestal and cabinet versions; labels: Control Panel, CD-ROM Drive, Floppy Drive)
On/Off button. Powers the system drawer on or off. When the LED at the top of the button is lit, the power is on. The On/Off button is connected to the power supplies and the system interlocks.
NOTE: The LEDs on some modules are on when the line cord is plugged in, regardless of the position of the On/Off button.
Halt button. Pressing this button in (so the LED at the top of the button is on) has no effect on Windows NT. If the Halt button is in when the system is reset or powered up, the system halts in the SRM console; AlphaBIOS is not loaded and started.
Reset button. Initializes the system drawer. If the Halt button is pressed (LED on) when the system is reset, the SRM console is loaded and remains in the system regardless of any other conditions.
Control panel display. Indicates status during power-up and self-test. The OCP display is a 16-character LCD. Its controller is on the XBUS on the PCI motherboard. While the operating system is running, the display shows the system type by default. This message can be changed by the user.
CD-ROM drive. The CD-ROM drive is used to load software, firmware, and updates. Its controller is on PCI1 on the PCI motherboard.
Floppy disk drive. The floppy drive is used to load software and firmware updates. The floppy controller is on the XBUS on the PCI motherboard.

System Consoles

There are two console programs: the SRM console and the AlphaBIOS console.

SRM Console

The SRM console is a command-line interface that tests the system after power-up or reset and launches the AlphaBIOS graphical interface. For some configuration and diagnostic or testing tasks, you may need to use the SRM console interface rather than launch the AlphaBIOS console. To reach the SRM console interface, power up or reset the system with the Halt button pressed in. You then see the SRM console prompt:
P00>>>
NOTE: The console prompt displays only after the entire power-up sequence is complete. This can take up to several minutes if the memory is very large.
After the SRM console prompt appears, you should change the Halt button back to the “out” position.

AlphaBIOS Console

The AlphaBIOS console is a menu-based interface that supports the Microsoft Windows NT operating system. You use AlphaBIOS to set up operating system selections, boot Windows NT, and display information about the system configuration. You also run the EISA Configuration Utility and the RAID Standalone Configuration Utility from the AlphaBIOS console. With the DIGITAL Server 7300/7300R series, AlphaBIOS runs on either a serial (character-cell) terminal or a graphics monitor.
When you invoke the AlphaBIOS console, you see the following Boot menu:
Figure 1-8 AlphaBIOS Boot Menu
AlphaBIOS Version 5.12
Please select the operating system to start:
Windows NT Server 3.51
Use the up and down arrow keys to move the highlight to your choice. Press Enter to choose.
Press <F2> to enter SETUP

Environment Variables

Environment variables are software parameters that, among other things, define the system configuration. You can use them to pass information to different pieces of software running in the system at various times.
The os_type environment variable determines which of the two consoles is to be used. The SRM console is always brought into memory, but AlphaBIOS is loaded if os_type is set to NT (which it must be on the DIGITAL Server 7300/7300R series) and the Halt button is out (not lit).
See the section “Summary of SRM Environment Variables” in Chapter 8 of this manual for a list of the environment variables used to configure DIGITAL Server 7300/7300R series systems.
Refer to the DIGITAL Server 7300/7300R Series System Drawer User’s Guide for information on setting environment variables.
You should keep a record of the environment variables for each system that you service. Some environment variable settings are lost when a module is swapped and must be restored after the new module is installed. Refer to Table 8-3 for a convenient worksheet for recording environment variable settings.
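The console selection rule described in this section (os_type and the Halt button) can be summarized as a small decision function. The following Python sketch is illustrative only and is not firmware code; the function name console_after_reset and its arguments are hypothetical.

```python
def console_after_reset(os_type: str, halt_button_in: bool) -> str:
    """Return the console left in control after power-up or reset.

    Hypothetical helper that models only the rule stated in this manual:
    the SRM console is always loaded first, and AlphaBIOS is loaded
    afterward only if os_type is NT and the Halt button is out (not lit).
    """
    if halt_button_in:
        # Halt button in: the system halts in the SRM console.
        return "SRM"
    if os_type.strip().upper() == "NT":
        return "AlphaBIOS"
    return "SRM"


if __name__ == "__main__":
    for halt in (False, True):
        print("Halt in:", halt, "->", console_after_reset("NT", halt))
```

A service engineer who needs the SRM prompt would therefore press the Halt button in before resetting, then return it to the out position once the prompt appears.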

System Architecture

Alpha microprocessor chips are used in these systems. The CPU, memory and the I/O bridge module are connected to the system bus motherboard.
Figure 1-9 Architecture Diagram (blocks shown: CPU 0, memory pairs, system bus with 128-bit data bus + 16 ECC and 40-bit command/address bus, system to PCI bus bridges 0 and 1, two 64-bit PCI buses, EISA bridge, and the PCI and PCI/EISA slots on the PCI motherboard)
DIGITAL Server 7300/7300R series systems use the Alpha chip for the CPU. The CPU, memory, and I/O bridge module to PCI/EISA I/O buses are connected to the system bus motherboard. A fourth type of module, the power control module, also plugs into the system motherboard.
A fully configured DIGITAL Server 7300/7300R series system drawer can have up to four CPUs, four memory pairs, and a total of eight I/O options. The I/O options can be all PCI options or a combination of PCI options and EISA options. However, there can be no more than three EISA options.
The system bus has a 128-bit data bus protected by 16 bits of ECC and a 40-bit command/address bus protected by parity. The bus speed depends on the speed of the CPU in slot 0, which provides the clock for the buses. The 40-bit address bus can address one terabyte (about a million million bytes). The bus connects CPUs, memory, and the system bus to PCI bus bridge(s).
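The bus figures can be checked with simple arithmetic. The short Python sketch below assumes the widths shown in Figure 1-9 (128 data bits, 16 ECC bits, 40 address bits); the constant names are illustrative.

```python
ADDRESS_BITS = 40   # command/address bus width
DATA_BITS = 128     # data lines
ECC_BITS = 16       # ECC lines protecting the data

address_space_bytes = 2 ** ADDRESS_BITS
terabytes = address_space_bytes // 2 ** 40
bytes_per_transfer = DATA_BITS // 8
data_and_ecc_lines = DATA_BITS + ECC_BITS

print("Address space in bytes:", address_space_bytes)
print("Address space in terabytes:", terabytes)
print("Bytes moved per bus transfer:", bytes_per_transfer)
print("Data plus ECC lines:", data_and_ecc_lines)
```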
The CPU modules are available with an onboard cache. The Alpha chip has an 8-Kbyte instruction cache (I-cache), an 8-Kbyte write-through data cache (D-cache), and a 96-Kbyte, write-back secondary data cache (S-cache). The cache system is write-back. The system drawer supports up to four CPUs.
The memory modules are placed on the system motherboard in pairs. Each module drives half of the system bus, along with the associated ECC bits. Memory pairs consist of two modules that are the same size and type. Two types are available: synchronous and asynchronous (EDO) memory.
The system bus to PCI bus bridge module translates system bus commands and data addressed to I/O space to PCI commands and data. It also translates PCI bus commands and data addressed to system memory or CPUs to system bus commands and data. The PCI bus is a 64-bit wide bus used for I/O. The 7300/7300R series has one PCI/EISA card cage.
The power control module, which is on the system motherboard, monitors power and the system environment.

System Motherboard

The system motherboard is on the floor of the system card cage. It has slots for the CPU, memory, power control, and bridge modules.
Figure 1-10 System Motherboard Module Locations
The system motherboard has the logic for the system bus. It is the backplane that holds the CPU, memory, bridge, and power control modules. Figure 1-10 shows a diagram of the motherboard used in DIGITAL Server systems. The module locations are designated by the callouts.
1. CPU module
2. Memory module
3. Bridge module
4. Power control module

CPU Types

DIGITAL Server 7300/7300R series systems can be configured with one of two CPU variants.

CPU Variants

Module Variant   Clock Frequency   Onboard Cache
B3105-AA         400 MHz           4 Mbytes
B3105-CA         533 MHz           4 Mbytes

CPU Module Layout

Figure 1-11 shows the layout of the CPU module.
Figure 1-11 CPU Module Layout (shows a typical cached CPU module and CPU module slots 0 through 3 on the system motherboard)

Alpha Chip Composition

The Alpha chip is made using state-of-the-art chip technology, has a transistor count of 9.3 million, consumes 50 watts of power, and is air cooled (a fan is on the chip). The default cache system is write-back; when the module has an external cache, that cache is also write-back.

Chip Description

Unit          Description
Instruction   8-Kbyte cache, 4-way issue
Execution     4-way execution; 2 integer units, 1 floating-point adder, 1 floating-point multiplier
Memory        Merge logic, 8-Kbyte write-through first-level data cache, 96-Kbyte write-back second-level data cache, bus interface unit

CPU Configuration Rules

The first CPU must be in CPU slot 0 to provide the system clock.
Additional CPU modules should be installed in ascending order by slot number.
All CPUs must have the same Alpha chip clock speed. The system bus hangs without an error message if the oscillators clocking the CPUs are different.
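Because a mixed-speed configuration hangs the bus without an error message, it can help to check a planned configuration on paper first. The following Python sketch applies the three CPU configuration rules to a slot map; the function check_cpu_config and its input format are hypothetical and are not part of any DIGITAL tool.

```python
def check_cpu_config(slots: dict) -> list:
    """Check a planned CPU layout against the configuration rules.

    slots maps a CPU slot number (0-3) to the Alpha clock speed in MHz.
    Returns a list of problem descriptions; an empty list means the
    layout satisfies the rules stated in this manual.
    """
    if not slots:
        return ["no CPU modules installed"]
    problems = []
    if 0 not in slots:
        problems.append("slot 0 is empty; the first CPU must be in slot 0")
    if sorted(slots) != list(range(len(slots))):
        problems.append("CPUs should fill slots in ascending order from 0")
    if len(set(slots.values())) > 1:
        problems.append("all CPUs must have the same clock speed")
    return problems


if __name__ == "__main__":
    print(check_cpu_config({0: 400, 1: 400}))   # valid layout
    print(check_cpu_config({0: 400, 1: 533}))   # mixed clock speeds
```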

CPU Module Color Codes

The top edge of each CPU module variant is color coded for easy identification.
Option Number   Description           Color
B3105-AA        400 MHz, 4MB cached   Orange
B3105-CA        533 MHz, 4MB cached   Violet

Memory Modules

Memory modules are used only in pairs — two modules of the same size and type. Each module provides either the low half or the high half of the memory space. The 7300/7300R series system drawer can hold up to four memory module pairs.
Figure 1-12 Memory Module Layout (shows a typical synchronous memory module and a typical EDO memory module)

Memory Variants

Each memory option consists of two identical modules. Each DIGITAL Server 7300/7300R series drawer supports up to four memory options, for a total of 8 Gbytes of memory (four 2-Gbyte options). Memory modules are used only in pairs and are available in 128 Mbyte, 512 Mbyte, 1 Gbyte, and 2 Gbyte sizes. The 128-Mbyte option is synchronous memory, while the larger sizes are asynchronous memory (EDO).
Option Part No   Size     Module     Type            DRAM Number   DRAM Size
FR-ACSMA-AA      128 MB   B3020-CA   Synch.          36            4 MB x 4
FR-ACSMA-AB      512 MB   B3030-EA   Asynch. (EDO)   144           4 MB x 4
FR-ACSMA-AC      1 GB     B3030-FA   Asynch. (EDO)   72            16 MB x 4
FR-ACSMA-AD      2 GB     B3030-GA   Asynch. (EDO)   144           16 MB x 4
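The DRAM counts in the table can be cross-checked against the option sizes. The Python sketch below rests on three assumptions that the table does not state outright: that the DRAM number is a per-module count, that "4 MB x 4" denotes a part with 4M locations of 4 bits each, and that every 64 data bits are stored with 8 ECC bits (as described under Memory Operation). Under those assumptions each option works out to its listed size.

```python
MEGA = 2 ** 20

# (option, pair size in Mbytes, DRAMs per module, depth in M locations, width in bits)
OPTIONS = [
    ("FR-ACSMA-AA", 128, 36, 4, 4),
    ("FR-ACSMA-AB", 512, 144, 4, 4),
    ("FR-ACSMA-AC", 1024, 72, 16, 4),
    ("FR-ACSMA-AD", 2048, 144, 16, 4),
]


def data_mbytes_per_module(dram_count: int, depth_m: int, width_bits: int) -> int:
    """Usable data capacity of one module in Mbytes, excluding ECC bits."""
    raw_bits = dram_count * depth_m * MEGA * width_bits
    data_bits = raw_bits * 64 // 72   # assumed 64 data bits per 72 stored bits
    return data_bits // 8 // MEGA


if __name__ == "__main__":
    for option, pair_mb, count, depth, width in OPTIONS:
        per_module = data_mbytes_per_module(count, depth, width)
        print(option, "per module:", per_module, "Mbytes;",
              "matches table:", 2 * per_module == pair_mb)
```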

Memory Operation

Memory modules are used only in pairs; each module provides half the data, or 64 bits plus 8 ECC bits, of the octaword (16 byte) transferred on the system bus. Modules are placed in slots designated MEMxL and MEMxH.
NOTE: Modules in slots MEMxL do not drive the lower 8 bytes, and
modules in slots MEMxH do not drive the higher 8 bytes of the 16 byte transfer.
Unless otherwise programmed, memory drives the system bus in bursts. Upon each memory fetch, data is transferred in 4 consecutive cycles transferring 64 bytes. There are situations, however, when memories made with EDO DRAMs cannot provide data fast enough to complete the system bus transactions. When these situations arise, EDO type memories assert a signal that causes the system bus to stall for one (occasionally more) clock tick. When memory completes such an operation, it releases the system bus.

Memory Configuration Rules

In a system, memories of different sizes and types are permitted, but:
Memory modules are installed and used in pairs. Both modules in a memory pair must be of the same size and type.
The largest memory pair must be in slots MEM0L and MEM0H.
Other memory pairs must be the same size or smaller than the first memory pair.
Memory pairs must be installed in consecutive slots.

Memory Addressing

Alpha system memory addressing is unusual: the memory address space is determined not by the amount of physical memory installed but as a multiple of the size of the memory pair in slot MEM0x.
Figure 1-13 How Memory Addressing Is Calculated
Address range (Mbytes)   Contents
1536 to 2048             Fourth pair address space, 512 Mbytes, empty
1024 to 1536             Third pair address space, 512 Mbytes, 1/2 occupied (2 B3020-DA, 128 Mbytes/module)
512 to 1024              Second pair address space, 512 Mbytes, 1/2 occupied (2 B3020-DA, 128 Mbytes/module)
0 to 512                 First pair defines total address space, always fully occupied (2 B3020-EA, 256 Mbytes/module)
The unoccupied part of a half-occupied address space is an address hole.
The rules for addressing memory are as follows:
Address space is determined by the memory pair in slot MEM0.
Memory pairs need not be the same size.
The memory pair in slot MEM0 must be the largest of all memory pairs. Other
memory pairs may be as large but none may be larger.
The starting address of each memory pair is N times the size of the memory pair in
slot MEM0. N=0,1,2,3.
Memory addresses are contiguous within each module pair.
If memory pairs are of different sizes, memory “holes” can occur in the physical
address space. See Figure 1-13.
Software creates contiguous virtual memory even though physical memory may not be contiguous.
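The addressing rules above can be worked through in a few lines. The following Python sketch computes the starting address of each memory pair and the unused space left in its region; the function memory_map and its return format are hypothetical, and sizes are in Mbytes.

```python
def memory_map(pair_sizes_mb: list) -> list:
    """Lay out memory pairs per the addressing rules in this section.

    pair_sizes_mb lists the size of the pair in MEM0, MEM1, ... in Mbytes.
    Each pair starts at N times the size of the MEM0 pair (N = 0, 1, 2, 3).
    """
    if not pair_sizes_mb:
        return []
    if len(pair_sizes_mb) > 4:
        raise ValueError("a system drawer holds at most four memory pairs")
    region = pair_sizes_mb[0]
    if any(size > region for size in pair_sizes_mb):
        raise ValueError("the pair in slot MEM0 must be the largest")
    layout = []
    for n, size in enumerate(pair_sizes_mb):
        start = n * region
        layout.append({
            "slot": f"MEM{n}",
            "start_mb": start,
            "end_mb": start + size,
            "unused_mb": region - size,  # space left unpopulated in this region
        })
    return layout


if __name__ == "__main__":
    # Configuration from Figure 1-13: one 512-Mbyte pair, two 256-Mbyte pairs.
    for entry in memory_map([512, 256, 256]):
        print(entry)
```

In the Figure 1-13 configuration, the second pair begins at 512 Mbytes and ends at 768 Mbytes, so the range from 768 to 1024 Mbytes is a hole in the physical address space.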

System Bus

The system bus consists of a 40-bit command/address bus, a 128-bit plus ECC data bus, and several control signals and clocks.
Figure 1-14 System Bus Block Diagram (blocks shown: memory pairs MEM0 through MEM3, CPU0 through CPU3, system bus control and arbitration, and the system to PCI bus bridge with IOD0 and IOD1; bus signals shown: MC ADR <39:4> and MC DATA <127:0>)
The system bus motherboard consists of a 40-bit command/address bus, a 128-bit plus ECC data bus, and several control signals, clocks, and a bus arbiter. The bus requires that all CPUs have the same high-speed oscillator providing the clock to the Alpha chip.
The DIGITAL Server 7300/7300R series system bus connects up to four CPUs, four pairs of memory modules, and a single I/O bus bridge module. The I/O bus bridges are designated IODn, where n is the number of the PCI bus. The two bridges on the bridge module are therefore designated IOD0 and IOD1.
The system bus clock is provided by an oscillator on the CPU in slot CPU0. This oscillator has a 1:5 ratio to the Alpha chip. With 400 MHz CPUs, for example, the system bus operates at 80 MHz.
The system bus motherboard initiates memory refresh transactions. The motherboard sits at the bottom of the system drawer, and in addition to CPUs, memory, and I/O bridges, holds a power control module.
5 volt and 3.43 volt power is provided directly to the motherboard from the power supplies.

System Bus to PCI Bus Bridge Module

The bridge module is the physical interconnect between the system motherboard and any PCI motherboard in the system.
Figure 1-15 Bridge Module (blocks shown: CAP, MDPA, and MDPB chips; system bus side: control, address, ECC & Data <63:0>, ECC & Data <127:64>; PCI bus side: control, AD<31:0>, AD<63:32>)
The system bus to PCI bus bridge module converts:
System bus commands and data addressed to I/O space to PCI commands and data
PCI bus commands and data addressed to system memory or CPUs to system bus
commands and data.
A DIGITAL Server 7300/7300R series system has one bridge module. The bridge module has two major components:
Command/address processor (CAP) chip
Two data path chips (MDPA and MDPB)
There are two sets of these three chips, one set on each side of the module. Each set bridges to one of the PCI buses on the PCI motherboard.
The interface on the system bus side of the bridge responds to system bus commands addressed to the upper 64 Gbytes of I/O space. I/O space is addressed whenever bit <39> on the system bus address lines is set. The space so defined is 512 Gbytes in size. The first 448 Gbytes are reserved and the last 64 Gbytes, when bits <38:36> are set, are mapped to the PCI I/O buses.
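The address decode just described can be expressed directly in terms of the address bits. The Python sketch below classifies a 40-bit system bus address as memory space, reserved I/O space, or PCI I/O space, using only the bit positions given above; the function name classify_address is hypothetical.

```python
GBYTE = 2 ** 30


def classify_address(addr: int) -> str:
    """Classify a 40-bit system bus address by the rules in this section."""
    if not 0 <= addr < 2 ** 40:
        raise ValueError("address must fit in 40 bits")
    if not (addr >> 39) & 1:
        return "memory"            # bit <39> clear: not I/O space
    if (addr >> 36) & 0b111 == 0b111:
        return "pci"               # bits <38:36> all set: upper 64 Gbytes
    return "reserved"              # first 448 Gbytes of I/O space


if __name__ == "__main__":
    io_base = 2 ** 39
    pci_base = io_base + 448 * GBYTE
    print("I/O space starts at", io_base // GBYTE, "Gbytes")
    print("PCI window starts at", pci_base // GBYTE, "Gbytes")
    print(classify_address(pci_base))
```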
The interface on the PCI side of the bridge responds to commands addressed to CPUs and memory on the system bus. On the PCI side, the bridge provides the interface to the PCIs. Each PCI bus is addressed separately. The bridge does not respond to devices communicating with each other on the same PCI bus. However, should a device on one PCI address a device on the other PCI bus, commands, addresses, and data run through the bridge out onto the system bus and back through the bridge to the other PCI bus.
In addition to its bridge function, the system bus to PCI bus bridge module monitors every transaction on the system bus for errors. It monitors the data lines for ECC errors and the command/address lines for parity errors.

PCI I/O Subsystem

The I/O subsystem is PCI. The DIGITAL Server 7300/7300R series has two four-slot PCI buses that hold up to eight I/O options. One of these buses can be both PCI and EISA, but can hold no more than four options, three of which may be EISA.
Figure 1-16 PCI Block Diagram (blocks shown: system bus, PCI-0 and PCI-1 buses with four 64-bit slots each, PCI to EISA bridge chipset, 32-bit EISA slots, SCSI control 53C810, and XBUS devices: combo I/O with serial ports, parallel port, and floppy controller; mouse/keyboard; real-time clock; 8Kx8 NVRAM; 2MB flash ROM; I2C bus interface)
Table 1-1 PCI Motherboard Slot Numbering
Slot   PCI0                 PCI1
0      Reserved             Reserved
1      PCI to EISA bridge   Internal CD-ROM controller
2      PCI or EISA slot     PCI slot
3      PCI or EISA slot     PCI slot
4      PCI or EISA slot     PCI slot
5      PCI slot             PCI slot
The logic for two PCI buses is on each PCI motherboard.
PCI0 is a 64-bit bus with a built-in PCI to EISA bus bridge. PCI0 has one dedicated PCI slot and three slots that can be PCI or EISA slots. Each of these three slots has both an EISA connector and a PCI connector (six connectors in all), only one of which may be used at a time. PCI0 is powered by 5V.
PCI1 is a 64-bit bus with a built-in CD-ROM controller and four PCI slots. PCI1 is powered by 5V.
An 8-bit XBUS is connected to the EISA bus. On this bus there is an interface to the system I2C bus; mouse and keyboard support; an I/O combo controller supporting two serial ports, the floppy controller, and a parallel port; a real-time clock; two 1-Mbyte flash ROMs containing system firmware; and an 8-Kbyte NVRAM.

Server Control Module

The server control module enables remote console connections to the system drawer. The module also passes the signals for COM ports 1 and 2, the keyboard, and the mouse to the standard I/O connectors.
Figure 1-17 Server Control Module (labels: Remote Console Monitor, Standard I/O)
The server control module has two sections: the remote console monitor (RCM) and the standard I/O. See Chapter 9 for information on controlling the system remotely.
The remote console monitor connects to a modem through the modem port on the bulkhead. The RCM requires a 12V power connection.
The standard I/O ports (keyboard, mouse, COM1 and COM2 serial, and parallel ports) are on the same bulkhead.

Power Control Module

The power control module controls power sequencing and monitors power supply voltage, temperature, and fans.
Figure 1-18 Power Control Module (shows the power control module slot on the system motherboard)
The power control module performs the following functions:
Controls power sequencing.
Monitors the combined output of power supplies and shuts down power if it is not in
range.
Monitors system temperature and shuts off power if it is out of range.
Monitors the fans in the system drawer and on the CPU modules and shuts down
power if a fan fails.
Provides visual indication of faults through LEDs.

Power Supply

The system drawer power supplies provide power only to components in the drawer. One or two power supplies are required, depending on the number of CPU modules and PCI card cages; a second or third can be added for redundancy. The power system is described in detail in Chapter 4.
Figure 1-19 Location of Power Supply (labels: Power Supply 2, Power Supply 1, Power Supply 0)

Description

One to three power supplies provide power to components in the system drawer. (They supply power only for the drawer in which they are located.) Three power supplies provide redundant power in fully loaded DIGITAL Server 7300/7300R series systems.
These power supplies share the load, and redundant configurations are supported. They autoselect line voltage (120V to 240V). Each has 450 W output and supplies up to 75A of 3.43V, 50A of 5.0V, 11A of 12V, and small amounts of –5V, –12V, and auxiliary voltage (Vaux).
NOTE: The LEDs on some modules are on when the line cord is plugged in, regardless of the position of the On/Off button.

Configuration

A DIGITAL Server 7300/7300R series system with one or two CPUs requires one power supply (two for redundancy).
A DIGITAL Server 7300/7300R series system with three or four CPUs requires two power supplies (three for redundancy).
Power supply 0 is installed first, power supply 2 second, and power supply 1 third. See Figure 1-19 Location of Power Supply. (The power supply numbering shown here corresponds to the numbering displayed by the SRM console's show power command.)
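The power supply configuration rules reduce to a small lookup. The Python sketch below returns the number of supplies needed for a given CPU count and the bays to populate in the stated installation order (0, then 2, then 1); the function names are hypothetical.

```python
INSTALL_ORDER = [0, 2, 1]  # power supply 0 first, then 2, then 1


def power_supplies_needed(cpu_count: int, redundant: bool = False) -> int:
    """Number of power supplies required for a given number of CPUs."""
    if not 1 <= cpu_count <= 4:
        raise ValueError("a system drawer holds one to four CPUs")
    required = 1 if cpu_count <= 2 else 2
    return required + 1 if redundant else required


def bays_to_populate(cpu_count: int, redundant: bool = False) -> list:
    """Power supply positions to fill, in installation order."""
    return INSTALL_ORDER[:power_supplies_needed(cpu_count, redundant)]


if __name__ == "__main__":
    for cpus in (1, 2, 3, 4):
        print(cpus, "CPU(s):", bays_to_populate(cpus),
              "redundant:", bays_to_populate(cpus, redundant=True))
```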

Chapter 2, Power-Up

This chapter describes system power-up testing and explains the power-up displays. The following topics are covered:
Control Panel
Power-Up Sequence
SROM Power-Up Test Flow
SROM Errors Reported
XSROM Power-Up Test Flow
XSROM Errors Reported
Console Power-Up Tests
Console Device Determination
Console Power-Up Display
Fail-Safe Loader

Control Panel

The control panel display indicates the likely device when testing fails.
Figure 2-1 Control Panel and LCD Display (labels: On/Off, Halt, Reset, Potentiometer Access Hole; sample display: P0 TEST 11 CPU00)
When the On/Off button LED is on, power is applied and the system is running. When it is off, the system is not running, but power may or may not be present. If power is present, the PCM or the power LED on the system bus to PCI bus bridge module should be flashing. Otherwise, there is a power problem.
When the Halt button LED is lit and the On/Off button is on, the system should be running either the SRM console or Windows NT. If the Halt button is in, but the LED is off, the OCP, its cables, or the PCM are likely to be broken.
Table 2-1 Control Panel Display
Field Content      Display                 Meaning
CPU number         P0–P3                   CPU reporting status
Status             TEST                    Tests are executing
                   FAIL                    Failure has been detected
                   MCHK                    Machine check has occurred
                   INTR                    Error interrupt has occurred
Test number
Suspected device   CPU0–3                  CPU module number (1)
                   MEM0–3 and L, H, or *   Memory pair number and low module, high module, or either (2)
                   IOD0                    Bridge to PCI bus 0 (3)
                   IOD1                    Bridge to PCI bus 1 (3)
                   FROM0                   Flash ROM (4)
                   COMBO                   COM controller (4)
                   PCEB                    PCI-to-EISA bridge (4)
                   ESC                     EISA system controller (4)
                   NVRAM                   Nonvolatile RAM (4)
                   TOY                     Real-time clock (4)
                   I8242                   Keyboard and mouse controller (4)
The potentiometer, accessible through the access hole just above the Reset button, controls the intensity of the LCD. Use a small Phillips head screwdriver to adjust.
Field-replaceable units:
(1) CPU module
(2) Memory module
(3) Bridge module (B3040-AA)
(4) EISA/PCI motherboard
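As a reading aid for Table 2-1, the Python sketch below splits a control panel message into its four fields and looks up the field-replaceable unit for the suspected device. It assumes the fields are separated by spaces, as in the sample display "P0 TEST 11 CPU00" in Figure 2-1, and it matches CPU and memory devices by prefix; both choices are assumptions, and the function names are hypothetical.

```python
import re

STATUS_CODES = {"TEST", "FAIL", "MCHK", "INTR"}
MOTHERBOARD_DEVICES = {"FROM0", "COMBO", "PCEB", "ESC", "NVRAM", "TOY", "I8242"}


def fru_for_device(device: str) -> str:
    """Map a suspected-device string to the FRU named in Table 2-1."""
    if device.startswith("CPU"):
        return "CPU module"
    if device.startswith("MEM"):
        return "Memory module"
    if device in ("IOD0", "IOD1"):
        return "Bridge module (B3040-AA)"
    if device in MOTHERBOARD_DEVICES:
        return "EISA/PCI motherboard"
    return "unknown"


def parse_ocp_display(text: str) -> dict:
    """Split an OCP message into CPU, status, test number, and device."""
    fields = text.split()
    if len(fields) != 4:
        raise ValueError("expected four space-separated fields")
    cpu, status, test_number, device = fields
    if not re.fullmatch(r"P[0-3]", cpu):
        raise ValueError("CPU field must be P0 through P3")
    if status not in STATUS_CODES:
        raise ValueError("unrecognized status field")
    return {
        "cpu": cpu,
        "status": status,
        "test_number": test_number,  # kept as text; the number base is not assumed
        "device": device,
        "fru": fru_for_device(device),
    }


if __name__ == "__main__":
    print(parse_ocp_display("P0 TEST 11 CPU00"))
```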

Power-Up Sequence

Console and most power-up tests reside on the I/O subsystem, not on the CPU nor on any other module on the system bus.
Figure 2-2 Power-Up Flow
1. Power-up/reset
2. SROM code loaded into each CPU's I-cache
3. SROM tests execute
4. XSROM loaded into each CPU's S-cache
5. XSROM tests execute
6. SRM console loaded into memory
7. SRM console tests execute
8. SRM console either remains in the system or loads AlphaBIOS console

Definitions

SROM. The SROM is a 128-Kbit ROM on each CPU module. SROM contains minimal diagnostics that test the Alpha chip and the path to the XSROM. Once the path is verified, it loads XSROM code into the Alpha chip and jumps to it.
XSROM. The XSROM, or extended SROM, contains back-up cache and memory tests, and a fail-safe loader. The XSROM code resides in sector 0 of FEPROM 0 on the XBUS. Sector 2 of FEPROM 0 contains a duplicate copy of the code and is used if sector 0 is bad.
FEPROM. Two 1-Mbyte programmable ROMs are on the XBUS on PCI0. FEPROM 0 contains two copies of the XSROM, and the SRM console and decompression code. FEPROM 1 contains the AlphaBIOS and NT HALcode. These two FEPROMs can be flash updated. Refer to Chapter 7.
Figure 2-3 Contents of FEPROMs (labels: FEPROM 0 with XSROM, fail-safe loader, decompression and PAL code, and SRM console code; FEPROM 1 with AlphaBIOS code)
For the console to run, the path from the CPU to the XSROM must be functional. The XSROM resides in FEPROM0 on the XBUS, off the EISA bus, off PCI 0, off IOD 0. See Figure 2-4. This path is minimally tested by SROM.
Figure 2-4 Console Code Critical Path
[Figure: block diagram of the path from the CPU to the console code. The CPU and memory pair sit on the system bus (128-bit data bus + 16 ECC and 40-bit command/address bus). System to PCI bus bridge 0 on the bridge module drives the 64-bit PCI bus 0 on the PCI motherboard, which leads through the EISA bridge, EISA bus, and XBUS transceivers to the XBUS. The XBUS carries the 2-MB flash ROM, 8K x 8 NVRAM, real-time clock, combo I/O (serial ports, parallel port, floppy controller), mouse/keyboard, and I2C bus interface. System to PCI bus bridge 1 drives the 64-bit PCI bus 1 and its PCI slots.]
The SROM contents are loaded into each CPU’s I-cache and executed on power-up/reset. After testing the caches on each processor chip, the SROM code tests the path to the XSROM. Once this path is tested and deemed reliable, layers of the XSROM are loaded sequentially into the processor chip on each CPU. None of the SROM or XSROM power-up tests are run from memory; all run from the caches in the CPU chip, which isolates CPU faults from memory faults. Later power-up tests, run under the console, are used to complete testing of the I/O subsystem.
There are two console programs: the SRM console and the AlphaBIOS console, as detailed in the DIGITAL Server 7300/7300R Series System Drawer User’s Guide (ER–K9FWW–UA). By default, the SRM console is always loaded and I/O system tests are run under it before the system loads AlphaBIOS. To load AlphaBIOS, the os_type environment variable must be set to NT and the Halt button must be out (LED not lit).
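The console selection rule described above can be sketched in a few lines of Python. This is only an illustration of the decision, not firmware code; the function name and arguments are hypothetical, and the comparison is made case-insensitive as an assumption because the manual writes the value both as NT and as nt.

```python
def console_after_power_up(os_type: str, halt_button_in: bool) -> str:
    """Return the console that ends up running: "SRM" or "AlphaBIOS"."""
    # The SRM console is always loaded first and runs the I/O system tests.
    wants_alphabios = os_type.strip().lower() == "nt"
    # AlphaBIOS loads only if os_type is NT and the Halt button is out.
    if wants_alphabios and not halt_button_in:
        return "AlphaBIOS"
    return "SRM"


if __name__ == "__main__":
    for os_type, halt_in in [("NT", False), ("NT", True), ("other", False)]:
        print(os_type, halt_in, console_after_power_up(os_type, halt_in))
```

Pressing the Halt button in therefore keeps the system at the SRM console even when os_type is NT.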

SROM Power-Up Test Flow

The SROM tests the CPU chip and the path to the XSROM.
Figure 2-5 SROM Power-Up Test Flow
For each CPU:
1. Initialize CPU chip. Turn off CPU LED.
2. D-cache errors? Yes: hang.
3. All 3 S-cache banks pass? No: hang.
4. Duplicate tag or fill errors? Yes: hang.
5. Light CPU LED.
6. Determine primary.
7. Size IOD. Loopback on each IOD. Pass: light IOD LEDs.
8. Initialize PCI-EISA bridge chip.
9. Read TOY NVRAM.
10. Initialize combo chip on XBUS for access to COM port 1.
11. Initialize OCP port on XBUS for access to OCP display.
12. Print to console device and OCP.
13. Initialize all S-cache banks.
14. Check integrity of XSROM. Fail twice: hang.
15. Pass: load first 8K of XSROM into S-cache.
16. Jump to XSROM overlay in S-cache.
The Alpha chip built-in self-test tests the I-cache at power-up and upon reset. Each CPU chip loads its SROM code into its I-cache and starts executing it. If the chip is partially functional, the SROM code continues to execute. However, if the chip cannot perform most of its functions, that CPU hangs and that CPU's pass/fail LED remains off.
If the system has more than one CPU and at least one passes both the SROM and XSROM power-up tests, the system will bring up the console. The console checks the FW_SCRATCH register where evidence of the power-up failure is left. Upon finding the error, the console sends these messages to COM1 and the OCP:
COM1 (or VGA): Power-up tests have detected a problem with your system
OCP: Power-up failure
Table 2-2 lists the tests performed by the SROM.
Table 2-2 SROM Tests
Test Name                    Logic Tested
D-cache RAM March test       D-cache access, D-cache data, D-cache address logic
D-cache Tag RAM March test   D-cache tag store RAM, D-cache bank address logic
S-cache Data March test      S-cache RAM cells, S-cache data path, S-cache address path
S-cache Tag RAM March test   S-cache tag store RAM, S-cache bank address logic
I-cache Parity Error test    I-cache parity error detection, ISCR register and error forcing logic, IC_PERR_STAT register and reporting logic
D-cache Parity Error test    D-cache parity error detection, DC_MODE register and parity error forcing logic, DC_PERR_STAT register and reporting logic
S-cache Parity Error test    S-cache parity error detection, AC_CTL register and parity error forcing logic, SC_STAT register and reporting logic
IOD Access test              Access to IOD CSRs, data path through CAP chip and MDP0 on each IOD, PCI0 A/D lines <31:0>

SROM Errors Reported

The SROM reports machine checks, pending interrupt/exception errors, and errors related to corruption of FEPROM 0. If SROM errors are fatal, the particular CPU will hang and only the CPU self-test pass LEDs and/or the LEDs on the system bus to PCI bus bridge module will indicate the failure.
Example 2-1 SROM Errors Reported at Power-Up
Unexpected Machine Check (CPU Error)
UNEX MCHK on CPU 0 EXC_ADR 42a9
EI_STAT fffffff004ffffff EI_ADDR ffffff000000801f SC_STAT 0 SC_ADDR FFFFFF0000005F2F
Pending Interrupt/Exception (CPU Error)
INT-EXC on CPU0 ISR 400000
EI_STAT fffffff007ffffff EI_ADDR ffffff7fffffffdf FIL_SYN 631B
BCTGADR ffffffa7fffcafff
FEPROM Failures (PCI Motherboard Error)
Sctr 0 -XSROM headr PTTRN fail
Sctr 0 -XSROM headr CHKSM fail
Sctr 0 -XSROM code CHKSM fail
Sctr 2 -XSROM headr PTTRN fail
Sctr 2 -XSROM headr CHKSM fail
Sctr 2 -XSROM code CHKSM fail
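When power-up output is captured from COM1, the FEPROM failure lines shown in Example 2-1 can be picked apart mechanically. The following Python sketch is an illustration only: the regular expression is inferred from the message text in the example, and the names FepromFailure and parse_feprom_failure are hypothetical, not part of any DIGITAL tool.

```python
import re
from typing import NamedTuple, Optional


class FepromFailure(NamedTuple):
    sector: int  # FEPROM 0 sector number
    image: str   # image name as printed, for example "XSROM"
    part: str    # "headr" or "code"
    check: str   # "PTTRN" (pattern) or "CHKSM" (checksum)


# Pattern assumed from the message layout in Example 2-1.
_FAILURE = re.compile(r"Sctr\s+(\d+)\s+-(\w+)\s+(headr|code)\s+(PTTRN|CHKSM)\s+fail")


def parse_feprom_failure(line: str) -> Optional[FepromFailure]:
    """Return the parsed failure, or None if the line is not a FEPROM failure."""
    match = _FAILURE.search(line)
    if match is None:
        return None
    sector, image, part, check = match.groups()
    return FepromFailure(int(sector), image, part, check)


if __name__ == "__main__":
    print(parse_feprom_failure("Sctr 2 -XSROM code CHKSM fail"))
```

Failures reported for both sector 0 and sector 2 mean that neither copy of the XSROM could be verified.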

XSROM Power-Up Test Flow

After the SROM has completed its tests and verified the path to the FEPROM containing the XSROM code, it loads the first 8 Kbytes of XSROM into the primary CPU’s S-cache and jumps to it.
Figure 2-6 XSROM Power-Up Flowchart
1. XSROM banner to OCP/console device.
2. Clear SC_FHIT (force hit). Enable all 3 S-cache banks.
3. Run B-cache tests. Print errors to OCP/console device. Done message to console device.
4. Boot processor redetermination.
5. Initialize B-cache and enable duplicate tag.
6. Size system memory through I squared C bus.
7. Print memory information to console device. Check for illegal memory configuration. Print warnings to console device and OCP. Initialize all memory pairs.
8. Run memory tests. Print trace to OCP/console device. Print errors to OCP/console device. Done message to console device.
9. Boot processor redetermination.
10. Primary verifies checksum of PAL/decompression/console code (pass or fail).
11. Primary unloads PAL/decompression code (pass) or fail-safe loader (fail), depending upon results of checksum.
12. Primary jumps to PALcode and starts the console.
13. Secondaries alerted that console has started. They jump to and run PALcode, joining the console.
Note: The XSROM can only print to the console device if the environment variable console = serial. It always sends output to the OCP.
XSROM tests are described in the following table. A failure indicates a CPU failure.
After jumping to the primary CPU's S-cache, the code intentionally I-caches itself and is completely register based (no D-stream for stack or data storage is used). The only D-stream accesses are writes/reads during testing.
Each FEPROM has sixteen 64-Kbyte sectors. The first sector contains B-cache tests, memory tests, and a fail-safe loader. The second sector contains PALcode. The third sector contains a copy of the first sector. The remaining thirteen sectors contain the SRM console and decompression code.
NOTE: Memory tests are run during power-up and reset (see Table 2-3). They are also affected by the state of the memory_test environment variable, which can have the following values:
FULL Test all memory
PARTIAL Test up to the first 256 Mbytes
NONE Test 32 Mbytes
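The effect of the three memory_test values can be expressed as a small calculation. The Python sketch below is an illustration under two stated assumptions: a Mbyte is taken as 2^20 bytes, and the tested amount is capped at the installed memory (the manual does not say what happens on a system with less than 32 Mbytes). The function name is hypothetical.

```python
MBYTE = 1024 * 1024  # assumption: 1 Mbyte = 2**20 bytes


def bytes_tested(memory_test: str, installed_bytes: int) -> int:
    """Return how much memory the power-up tests cover for a given setting."""
    setting = memory_test.strip().upper()
    if setting == "FULL":
        return installed_bytes
    if setting == "PARTIAL":
        return min(installed_bytes, 256 * MBYTE)
    if setting == "NONE":
        # Despite its name, NONE still tests 32 Mbytes.
        return min(installed_bytes, 32 * MBYTE)
    raise ValueError("memory_test must be FULL, PARTIAL, or NONE")


if __name__ == "__main__":
    for value in ("FULL", "PARTIAL", "NONE"):
        print(value, bytes_tested(value, 512 * MBYTE) // MBYTE, "Mbytes")
```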
Table 2-2 XSROM Tests
Test  Test Name                                  Logic Tested
11    B-cache Tag Data Line test                 Access to B-cache tags, shorts between tag data and its status and parity bits
12    B-cache Tag March test                     B-cache tag store RAMs, B-cache STAT store RAMs
13    B-cache Data Line test                     B-cache data lines to B-cache data RAMs, B-cache read/write logic
14    B-cache Data March test                    B-cache data RAMs, CPU chip B-cache control, CPU chip B-cache address decode, INDEX_H<2x:6> (address bus)
15    B-cache ECC Data Line test                 CPU chip ECC generation and checking logic, ECC lines from CPU chip to B-cache, B-cache ECC RAMs
16    B-cache Data ECC March test                Portion of B-cache data RAMs used for ECC
17    CPU chip ECC Single/Double bit Error test  CPU chip ECC single-bit error detection and correction, ECC double-bit error detection, ECC error reporting
18    B-cache Tag Store Parity Error test        B-cache tag array, CPU parity detection, EI_ADDR and EI_STAT register operation
19    B-cache STAT Store Parity Error test       B-cache STAT array, CPU chip B-cache STAT parity generation/detection
Table 2-3 Memory Tests
Test  Test Name               Logic Tested                                                          Description
20    Memory Data test        Data path to and from memory. Data path on memory and RAMs.           01 – FF. Errors are reported as an 8-bit binary field. A set bit indicates a module failure. Bit <0> indicates pass/fail of MEM0_L; <1> indicates pass/fail of MEM0_H; <2> indicates pass/fail of MEM1_L; <7> indicates pass/fail of MEM3_H.
21    Memory Address test     Address path to and from memory. Address path on memory and RAMs.     Same as test 20.
23*   Memory Bitmap Building  No new logic                                                          Maps out bad memory by way of the bitmap. It does not completely fail memory.
24    Memory March test       No new logic                                                          Maps out bad memory.
* There is no test 22.
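The 8-bit error field reported by tests 20 and 21 can be decoded into module names. The Python sketch below is illustrative only. Table 2-3 names bits <0>, <1>, <2>, and <7>; the assignments for bits <3> through <6> (MEM1_H, MEM2_L, MEM2_H, MEM3_L) are an assumption that follows the same low/high pattern, and the function name is hypothetical.

```python
from typing import List

# Bit position -> module. Bits 0, 1, 2, and 7 come from Table 2-3;
# bits 3 to 6 are inferred from the pattern and are an assumption.
MODULE_BY_BIT = [
    "MEM0_L", "MEM0_H", "MEM1_L", "MEM1_H",
    "MEM2_L", "MEM2_H", "MEM3_L", "MEM3_H",
]


def failing_modules(error_field: int) -> List[str]:
    """Return the modules whose bit is set in the 8-bit error field."""
    if not 0 <= error_field <= 0xFF:
        raise ValueError("error field must fit in 8 bits")
    return [name for bit, name in enumerate(MODULE_BY_BIT)
            if error_field & (1 << bit)]


if __name__ == "__main__":
    print(failing_modules(0x05))
```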

XSROM Errors Reported

The XSROM reports B-cache test errors and memory test errors. The XSROM also reports a warning if memory is illegally configured.
Example 2-2 XSROM Errors Reported at Power-Up
B-Cache Error (CPU Error)
TEST ERR on cpu0 #CPU running the test
FRU cpu0 err# 2 tst# 11
exp: 5555555555555555 #Expected data
rcv: aaaaaaaaaaaaaaaa #Received data
adr: ffff8 #B-cache location error occurred
Memory Error (Memory Module Indicated)
20..21..
TEST ERR on cpu0 #CPU running test
FRU: MEM1L #Low member of memory pair 1
err# c tst# 21
22..23..24..Memory testing complete on cpu0
Memory Configuration Error (Operator Error)
ERR! mem_pair0 misconfigured
ERR! mem_pair1 card size mismatch
ERR! mem_pair1 card type mismatch
ERR! mem_pair1 EMPTY
FEPROM Failures (PCI Motherboard Error)
Sctr 1 -PAL headr PTTRN fail
Sctr 1 -PAL headr CHKSM fail
Sctr 1 -PAL code CHKSM fail
Sctr 3 -CONSLE headr PTTRN fail
Sctr 3 -CONSLE headr CHKSM fail
Sctr 3 -CONSLE code CHKSM fail
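The exp: and rcv: values in a B-cache error report can be compared bit by bit to see which data lines disagree. The Python sketch below is a generic XOR comparison offered as an illustration; it is not a DIGITAL diagnostic, and the function name is hypothetical.

```python
from typing import List


def differing_bits(expected_hex: str, received_hex: str) -> List[int]:
    """Return the bit positions (0 = least significant) where the values differ."""
    difference = int(expected_hex, 16) ^ int(received_hex, 16)
    return [bit for bit in range(difference.bit_length())
            if (difference >> bit) & 1]


if __name__ == "__main__":
    bits = differing_bits("5555555555555555", "aaaaaaaaaaaaaaaa")
    print(len(bits), "bits differ")
```

For the values in Example 2-2, every one of the 64 bits differs, which points at the whole pattern being inverted rather than at a single stuck line.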

Console Power-Up Tests

Once the SRM console is loaded, it does further testing of each IOD. Table 2-4 describes the IOD power-up tests, and Table 2-5 describes the PCI motherboard power-up tests.
Table 2-4 IOD Tests
Test Number  Test Name                        Description
1            IOD CSR Access test              Read and write all CSRs in each IOD.
2            Loopback test                    Dense space writes to the IOD’s PCI dense space to check the integrity of ECC lines on the IODs.
3            ECC test                         Loopback tests similar to test 2 but with a varying pattern to create an ECC of 0s. Single- and double-bit errors are checked.
4            Parity Error and Fill Error tests  Parity errors are forced on the address and data lines on system bus and PCI buses. A fill error transaction is forced on the system bus.
5            Translation Error test           A loopback test using scatter/gather address translation logic on each IOD.
6            Write Pending test               Runs test 2 with the write-pending bit set and clear in the CAP chip control register.
7            PCI Loopback test                Loops data through each PCI on each IOD, testing the mask field of the system bus.
8            PCI Peer-to-Peer Byte Mask test  Tests that devices on the same PCI and on different PCIs can communicate.
Table 2-5 PCI Motherboard Tests (B3050/B3052)
Test Number  Test Name                             Diagnostic Name  Description
1            PCEB                                  pceb_diag        Tests the PCI to EISA bridge chip
2            ESC                                   esc_diag         Tests the EISA system controller
3            8K NVRAM                              nvram_diag       Tests the NVRAM
4            Real-Time Clock                       ds1287_diag      Tests the real-time clock chip
5            Keyboard and Mouse                    i8242_diag       Tests the keyboard/mouse chip
6            Flash ROM                             flash_diag       Dumps contents of flash ROM
7            Serial and Parallel Ports and Floppy  combo_diag       Tests COM ports 1 and 2, the parallel port, and the floppy
8            CD-ROM                                ncr810_diag      Tests the CD-ROM controller
For both IOD tests and PCI 0 and PCI 1 tests, trace and failure status is sent to the OCP. If any of these tests fail, a warning is sent to the SRM console device after the console prompt (or AlphaBIOS pop-up box). The LEDs on the system bus to PCI bus bridge module are controlled by the diagnostics. If an LED is off, a failure occurred.

Console Device Determination

After the SROM and XSROM have completed their tasks, the SRM console program, as it starts, determines where to send its power-up messages.
Figure 2-7 Console Device Determination Flowchart
1. Power-up/reset or P00>>> init.
2. Console envar = serial? Yes: enable COM port 1 and send messages as system is powering up.
3. No: console envar = graphics. VGA adapter on PCI0?
4. Yes: VGA becomes the console device.
5. No: enable COM port 1 and send messages as system is powering up. Warning message sent if a VGA adapter is seen on PCI 1.
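The decision in Figure 2-7 can be written as a short function. This Python sketch only restates the flowchart; the function name, argument names, and the (device, warning) return shape are hypothetical, and values of the console environment variable other than serial and graphics are rejected as an assumption.

```python
from typing import Tuple


def console_device(console_envar: str, vga_on_pci0: bool,
                   vga_on_pci1: bool = False) -> Tuple[str, bool]:
    """Return (device, warning_sent) following the Figure 2-7 flow."""
    setting = console_envar.strip().lower()
    if setting == "serial":
        return ("COM1", False)
    if setting == "graphics":
        if vga_on_pci0:
            return ("VGA", False)
        # No VGA adapter on PCI0: fall back to COM1, and warn if a
        # VGA adapter is seen on PCI 1 instead.
        return ("COM1", vga_on_pci1)
    raise ValueError("console must be serial or graphics")


if __name__ == "__main__":
    print(console_device("graphics", vga_on_pci0=False, vga_on_pci1=True))
```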

Console Device Options

The console device on a DIGITAL Server 7300/7300R series must be either a serial terminal connected to COM1 off the server control module set at 9600 baud or a graphics monitor off an adapter on PCI0. The console program must be AlphaBIOS.
During power-up, the SROM and the XSROM always send progress and error messages to the OCP. Since the console environment variable is set to graphics, no messages are sent to COM1.
Console power-up messages are sent to the graphics monitor console device, but SROM and XSROM power-up messages are lost. No matter what the console environment variable setting, each of the three programs sends messages to the control panel display.
Messages Sent By  To a Graphics Console Device Are
SROM              Lost
XSROM             Lost
SRM console       Sent to VGA

Console Power-Up Display

The last several lines of the power-up display appear on a graphics monitor, and parts of the display print to the control panel display.
Example 2-3 Power-Up Display
SROM V1.0 on cpu0
SROM V1.0 on cpu1
SROM V1.0 on cpu2
SROM V1.0 on cpu3
XSROM V1.0 on cpu2
XSROM V1.0 on cpu1
XSROM V1.0 on cpu3
XSROM V1.0 on cpu0
BCache testing complete on cpu2
BCache testing complete on cpu0
BCache testing complete on cpu3
BCache testing complete on cpu1
mem_pair0 - 128 MB
mem_pair1 - 128 MB
20..20..21..20..21..20..21..21..23..24..24..24..24..
Memory testing complete on cpu0
Memory testing complete on cpu1
Memory testing complete on cpu3
Memory testing complete on cpu2
At power-up or reset, the SROM code on each CPU module is loaded into that module’s I-cache and tests the module. If all tests pass, the processor’s LED lights. If any test fails, the LED remains off and power-up testing terminates on that CPU.
The first determination of the primary processor is made, and the primary processor executes a loopback test to each PCI bridge. If this test passes, the bridge LED lights. If it fails, the LED remains off and power-up continues. The EISA system controller, PCI-to-EISA bridge, COM1 port, and control panel port are all initialized thereafter.
Each CPU prints an SROM banner to the device attached to the COM1 port and to the control panel display. (The banner prints to the COM1 port if the console environment variable is set to serial. If it is set to graphics, nothing prints to the console terminal, only to the control panel display, until
Each processor's S-cache is initialized, and the XSROM code in the FEPROM on the PCI 0 is unloaded into them. (If the unload is not successful, a copy is unloaded from a different FEPROM sector. If the second try fails, the CPU hangs.)
Each processor jumps to the XSROM code and sends an XSROM banner to the COM1 port and to the control panel display.
The three S-cache banks on each processor are enabled, and then the B-cache is tested. If a failure occurs, a message is sent to the COM1 port and to the control panel display.
Each CPU sends a B-cache completion message to COM1.
The primary CPU is again determined, and it sizes memory by reading memory registers on the I2C bus.
The information on memory pairs is sent to COM1. If an illegal memory configuration is detected, a warning message is sent to COM1 and the control panel display.
Memory is initialized and tested, and the test trace is sent to COM1 and the control panel display. Each CPU participates in the memory testing. The numbers for tests 20 and 21 might appear interspersed. This is normal behavior. Test 24 can take several minutes if the memory is very large. The message “P0 TEST 24 MEM**” is displayed on the control panel display; the second asterisk rotates to indicate that testing is continuing. If a failure occurs, a message is sent to the COM1 port and to the control panel display.
Each CPU sends a test completion message to COM1.
Example 2-3 Power-Up Display (Continued)
starting console on CPU 0
sizing memory
0 128 MB SYNC
1 128 MB SYNC
starting console on CPU 1
starting console on CPU 2
starting console on CPU 3
probing IOD1 hose 1
bus 0 slot 1 - NCR 53C810
bus 0 slot 2 - DECchip 21041-AA
bus 0 slot 3 - NCR 53C810
bus 0 slot 4 - DECchip 21040-AA
probing IOD0 hose 0
bus 0 slot 1 - PCEB
Configuring I/O adapters...
DIGITAL Server 7300 Console V1.0, 13-MAR-1997 18:18:26
P00>>>
The final primary CPU determination is made. The primary CPU unloads PALcode and decompression code from the FEPROM on the PCI 0 to its B-cache. The primary CPU then jumps to the PALcode to start the SRM console.
The primary CPU prints a message indicating that it is running the console. Starting with this message, the power-up display is printed to the default console terminal, regardless of the state of the console environment variable. (If console is set to graphics, the display from here to the end is saved in a memory buffer and printed to the graphics monitor after the PCI buses are sized and the graphics device is initialized.)
The size and type of each memory pair is determined.
The console is started on each of the secondary CPUs. A status message prints for each CPU.
The PCI bridges (indicated as IODn) are probed and the devices are reported. I/O adapters are configured.
The SRM console banner and prompt are printed. (The SRM prompt is shown in this manual as P00>>>. It can, however, be P01>>>, P02>>>, or P03>>>. The number indicates the primary processor.)
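Because the prompt encodes the primary processor, a captured console log can be checked for which CPU became primary. The Python sketch below is illustrative; the pattern accepts only the four prompts listed above, and the function name is hypothetical.

```python
import re


def primary_cpu_from_prompt(prompt: str) -> int:
    """Return the primary CPU number encoded in an SRM prompt such as P00>>>."""
    match = re.fullmatch(r"P0([0-3])>>>", prompt.strip())
    if match is None:
        raise ValueError("not an SRM console prompt: %r" % prompt)
    return int(match.group(1))


if __name__ == "__main__":
    print(primary_cpu_from_prompt("P02>>>"))
```

A prompt other than P00>>> on a system with CPU 0 installed suggests that CPU 0 did not pass its power-up tests.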
When the os_type environment variable is set to nt (as it must be on the DIGITAL Server 7300/7300R series), the SRM console loads and starts the AlphaBIOS console and does not print the SRM banner or prompt.

Fail-Safe Loader

The fail-safe loader is a software routine that loads the SRM console image from floppy. Once the console is running, run LFU to update FEPROM 0 with a new image.
NOTE: FEPROM 0 contains images of the SROM, XSROM, decompression, and SRM console code.
If the fail-safe loader loads, the following conditions exist on the machine:
The SROM has passed its tests and successfully unloaded the XSROM. If the SROM fails to unload both copies of XSROM, it reports the failure to the control panel display and COM1 if possible, and the system hangs.
The XSROM reports the errors encountered and loads the fail-safe loader.
3

Troubleshooting

This chapter describes troubleshooting during power-up and booting, as well as diagnostics for DIGITAL Server 7300/7300R series systems. The chapter covers the following topics:
Troubleshooting with LEDs
Troubleshooting Power Problems
Troubleshooting with the Maintenance Bus (I2C Bus)
Running Diagnostics — Test Command
Testing an Entire System

Troubleshooting with LEDs

During power-up, reset, initialization, or testing, diagnostics are run on CPUs, memories, bridge modules, PCI motherboards, and sometimes options. The following sections describe possible problems that can be identified by checking LEDs.
Figure 3-1 CPU and Bridge Module LEDs
Bridge Module LEDs (IOD 0 & 1): IOD0 Self-Test Pass, IOD0 Self-Test Pass, POWER_FAN_OK, TEMP_OK
CPU LEDs: DC_OK, SROM Oscillator, CPU Self-Test Pass, Regulator OK (EV56)

Processor (CPU) LEDs

If the CPU STP LED on any processor (CPU) module is lit, that CPU chip is functioning properly. If the CPU STP LED is off, that CPU may or may not be functioning.
You can use the Halt button on the OCP to prevent the AlphaBIOS console (which turns off the CPU STP LED) from booting, thus assuring the validity of the CPU STP LED. If the LED is off, replace the CPU. If the LED is lit, you can use the SRM console command alphabios to load and run the AlphaBIOS console.
The top LED on a CPU module is a DC OK LED. It is driven by the PCM module. If it is not lit, there are probably power problems.
The second LED from the top on a CPU lights only when the SROM on the CPU is loaded.
On modules with EV56 CPU processors, a fourth LED is present at the bottom of the column. The LED is normally on, indicating that the power regulator on the module is working properly. If the LED is off, replace the module.

System Bus to PCI Bus Bridge Module LEDs (B3040-AA)

There are four LEDs on the B3040-AA system bus to PCI bus bridge module:
The top two LEDs indicate the condition of the bridge module. If either is off, the module should be replaced.
The bottom two LEDs are passed from the PCM. Both should be on during normal operation. If either is off while the system is on, the LEDs on the PCM module should indicate what failed. If they do not, the PCM could be broken or the bridge module is not passing the signals to the LEDs.
NOTE: If AC power is applied and the system is off and a power supply is in operation, the power LED, the top one of the bottom two, flashes, indicating the presence of Vaux (auxiliary voltage).

Cabinet Power and Fan LEDs

Figure 3-2 shows the cabinet power and fan LEDs.
Figure 3-2 Cabinet Power and Fan LEDs
Fan LED
Power LED
A cabinet system has three exhaust fans at the top of the cabinet. They are powered from a small power supply in the fan tray. This power supply also powers the server control module at the bottom of the PCI card cage to allow remote access to the system. A failure of the power supply is indicated only by the LEDs. No messages are displayed.
There are two LEDs on the top panel: a fan LED and a power LED.
When the fan LED (amber) is flashing, a cabinet fan needs replacing. Look to see which fan appears broken (either not functioning at all or turning slower than the others).
When the power LED (green) is off, either the power supply in the fan tray is broken or there is a power problem.

Troubleshooting Power Problems

Power problems can occur before the system is up or while the system is running. If a system stops running, make a habit of checking the PCM.
Power Problem List
The system will halt for the following:
1. A CPU fan failure
2. A system fan failure
3. An overtemperature condition
4. Power supplied out of tolerance
5. Circuit breaker(s) tripped
6. AC problem
7. Interlock switch activation or failure
8. PCM failure
9. Environmental electrical failure or unrecoverable system fault with auto_action ev = halt or boot
10. Operator error - failure to unplug all power supplies and letting Vaux drain (10 sec delay) before restarting
11. Cable failure
12. Module failure - system motherboard, PCI motherboard, or system bus to PCI bus bridge
13. SCM breaking the interlock circuit
Indications of failure:
1. Power control module LEDs indicate CPU fan, system fan, overtemperature, and power supply failures
2. Circuit breaker(s) tripped
No obvious indications for failures 7 - 13 from the power system.
If Halt Is Caused by Power, Fan, or Over-Temperature Problems
If a system is stopped because of a power, fan, or over-temperature problem, use the PCM LEDs to diagnose the problem.

If Power Problem Occurs at Power-Up

If the system has a power problem on a cold start, the PCM LEDs are not valid until after DCOK_SENSE has been asserted. The cause is one of the following:
Broken system fan
Broken CPU fan
Power supplied to the system is out of tolerance (a power supply could be broken and the system could still power up)
PCM failure
Interlock failure
Wire problems
Temperature problem (unlikely)

Recommended Order for Troubleshooting Failure at Power-Up

1. Check to see if any CPU fan or system fan is not spinning. Fans can fail by not spinning and/or not putting out the tachometer output necessary as input to the PCM comparator that checks the fans. (See steps 4 and 5.) Replace broken fan.
2. Replace the PCM.
3. Sequentially remove CPUs and try to power up after you remove a CPU. If the system powers up, the last CPU you removed had a fan failure.
4. Check the output of the power supplies. See the section “Power Supply” in Chapter 4 for locations of +5 and +3.43 volt output pins. If the output is above or below the threshold, replace the faulty power supply.
5. Check the output of each system fan with a voltmeter. Probe the middle of three outputs of the fans with the positive lead of the meter and ground the other probe. The meter should read 2.5 volts to 3 volts. If a fan’s output is out of this range, replace the fan.
NOTE: You will have to disable the interlocks to check the voltages in step 5. You will have only 10 seconds to measure them. There is a 10-second delay before the PCM turns off the power.
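The pass/fail criterion in step 5 is a simple range check on each fan's middle output. The Python sketch below illustrates recording the readings and flagging fans to replace; treating the 2.5-volt and 3-volt limits as inclusive is an assumption, and the function and fan names are hypothetical.

```python
from typing import Dict, List

FAN_OUTPUT_MIN_VOLTS = 2.5  # lower limit from step 5
FAN_OUTPUT_MAX_VOLTS = 3.0  # upper limit from step 5


def fans_to_replace(readings: Dict[str, float]) -> List[str]:
    """Return the fans whose measured output is outside 2.5 to 3 volts."""
    return sorted(
        name for name, volts in readings.items()
        # Assumption: the limits themselves count as in range.
        if not FAN_OUTPUT_MIN_VOLTS <= volts <= FAN_OUTPUT_MAX_VOLTS
    )


if __name__ == "__main__":
    print(fans_to_replace({"fan0": 2.7, "fan1": 1.9, "fan2": 3.0}))
```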
The PCM must sense a change in Vaux (auxiliary voltage) to start the power supplies. Pressing the On button has no effect if the machine halted because of a failure in the power system. The power supplies must be unplugged and plugged back in for the On button to work.

Power Control Module LEDs

The PCM has 11 LEDs visible through the system card cage. The LED display shows the relative placement of the LEDs.
Figure 3-3 PCM LEDs
DCOK_SENSE PS0_OK PS1_OK PS2_OK
TEMP_OK CPUFAN_OK SYSFAN_OK CS_FAN0
CS_FAN1 CS_FAN2 C_FAN3
Normally On Tested at one-second intervals
Off if power supply not present or broken
Table 3-1 Power Control Module LED States
LED         State  Description
DCOK_SENSE  On     Both +5.0V and +3.43V are present and within limits.
PS0_OK      On     Power supply 0 is present and has asserted POK_H.
PS1_OK      On     Power supply 1 is present and has asserted POK_H.
            Off    Power supply 1 not present.
PS2_OK      On     Power supply 2 is present and has asserted POK_H.
            Off    Power supply 2 not present.
TEMP_OK     On     The system temperature is below 55° C.
CPUFAN_OK   On     All CPU fans are OK.
            Off    A CPU fan has failed. The specific fan is identified by the CS_FANx or C_FAN3 LED that remains lit.
SYSFAN_OK   On     All system fans are OK.
            Off    A system fan has failed. The specific fan is identified by the CS_FANx that remains lit.
CS_FAN0     On     CPU fan 0 and system fan 0 are being sampled or one of them has failed as indicated by CPUFAN_OK and SYSFAN_OK.
            Off    CPU fan 0 and system fan 0 are not being sampled and are functioning properly.
CS_FAN1     On     CPU fan 1 and system fan 1 are being sampled or one of them has failed as indicated by CPUFAN_OK and SYSFAN_OK.
            Off    CPU fan 1 and system fan 1 are not being sampled and are functioning properly.
CS_FAN2     On     CPU fan 2 and system fan 2 are being sampled or one of them has failed as indicated by CPUFAN_OK and SYSFAN_OK.
            Off    CPU fan 2 and system fan 2 are not being sampled and are functioning properly.
C_FAN3      On     CPU fan 3 is being sampled or has failed as indicated by CPUFAN_OK and SYSFAN_OK.
            Off    CPU fan 3 and system fan 3 are not being sampled and are functioning properly.
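Table 3-1 implies a two-step reading of the fan LEDs: first see whether CPUFAN_OK or SYSFAN_OK is off, then see which CS_FANx or C_FAN3 LED remains lit. The Python sketch below illustrates that reading for a halted system. It is not PCM firmware; the function name and the dictionary input (LED name mapped to True when lit) are hypothetical, and it assumes the LEDs are read after the fault has latched rather than during normal one-second sampling.

```python
from typing import Dict, List

# Fan-select LED name -> fan number, as listed in Table 3-1.
FAN_LEDS = {"CS_FAN0": 0, "CS_FAN1": 1, "CS_FAN2": 2, "C_FAN3": 3}


def failed_fans(leds: Dict[str, bool]) -> List[str]:
    """Name the failed fans from PCM LED states (True means lit)."""
    lit = [number for name, number in FAN_LEDS.items() if leds[name]]
    failures: List[str] = []
    if not leds["CPUFAN_OK"]:
        failures += ["CPU fan %d" % number for number in lit]
    if not leds["SYSFAN_OK"]:
        # Table 3-1 names only CS_FANx (fans 0 to 2) for system fans.
        failures += ["system fan %d" % number for number in lit if number != 3]
    return failures


if __name__ == "__main__":
    state = {"CPUFAN_OK": False, "SYSFAN_OK": True, "CS_FAN0": False,
             "CS_FAN1": True, "CS_FAN2": False, "C_FAN3": False}
    print(failed_fans(state))
```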
Troubleshooting with the Maintenance Bus (I2C Bus)
The I2C bus (referred to as the “I squared C bus”) is a small internal maintenance bus used to monitor system conditions scanned by the power control module, write the fault display, store error state, and track configuration information in the system. Although all system modules (not I/O modules) sit on the maintenance bus, only the I2C controller accesses it. Everything written or read on the I2C bus is done by the controller.
Figure 3-4 I2C Bus Block Diagram
[Figure: the I2C bus controller, located on the XBUS (EISA, PCI 0, IOD 0), connects over the I2C bus to the OCP controller, the PCM registers, the CPUs (CPU 0 through 3), the memory pairs, IOD 0/PCI 0, IOD 1/PCI 1, and the motherboard.]

Monitoring System Conditions

The I2C bus monitors the state of system conditions scanned by the PCM. There are two registers on the PCM:
One records the state of the fans and power supplies and is latched when there is a fault.
The other causes an interrupt on the I2C bus when a CPU or system fan fails, an overtemperature condition exists, or power supplied to the system is out of tolerance.
The interrupt received by the I2C bus controller on PCI 0 alerts the system of imminent power shutdown. The controller has 30 seconds to read the two registers and store the information in the EEPROM on the PCM. The SRM console command show power reads these registers.

Displaying Faults

The OCP display is written through the I2C bus.

Writing Error States

Error state is written and read for power conditions. The state of the Halt button (in/out) is read on the I2C bus.

Tracking Configurations

Each CPU, PCI bridge, PCI motherboard, and system motherboard has an EEPROM that contains information about the module that can be written and read over the I2C bus. All modules contain the following information:
Module type
Module serial number
Hardware revision
Firmware revision
Memory size (only required for memory modules)

Running Diagnostics — Test Command

The test command runs diagnostics on the entire system, CPU devices, memory devices, and the PCI I/O subsystem. The test command runs only from the SRM console. Ctrl/C stops the test.
Example 3-1 Test Command Syntax
P00>>> help test
FUNCTION
SYNOPSIS
test ([-q] [-t <time>] [option]
where option is:
cpun
memn
pcin
and n can be one of 0, 1, 2, 3, or *.
The entire system is tested by default if no option specified.
NOTE: Switch from AlphaBIOS to the SRM console to enter the test command. From the AlphaBIOS console, press in the Halt button (the LED will light) and reset the system.
test [-t time] [-q] [option]
-t time  Specifies the run time in seconds. The default for system test is 600 seconds (10 minutes).
-q       Disables the display of status messages as exerciser processes are started and stopped during testing.
option   Either cpun, memn, or pcin, where n is 0, 1, 2, 3, or *. If nothing is specified, the entire system is tested.
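When test commands are generated by a script (for example, to drive a serial console session), the syntax above can be checked before the command is sent. The Python sketch below assembles a command line from the documented arguments; it does not talk to a console, and the function name and keyword arguments are hypothetical.

```python
import re
from typing import Optional


def build_test_command(option: Optional[str] = None,
                       seconds: Optional[int] = None,
                       quiet: bool = False) -> str:
    """Build a command line of the form: test [-t time] [-q] [option]."""
    parts = ["test"]
    if seconds is not None:
        if seconds <= 0:
            raise ValueError("run time must be a positive number of seconds")
        parts += ["-t", str(seconds)]
    if quiet:
        parts.append("-q")
    if option is not None:
        # option is cpun, memn, or pcin, where n is 0, 1, 2, 3, or *.
        if re.fullmatch(r"(cpu|mem|pci)([0-3]|\*)", option) is None:
            raise ValueError("invalid option: %r" % option)
        parts.append(option)
    return " ".join(parts)


if __name__ == "__main__":
    print(build_test_command("mem*", seconds=120, quiet=True))
```

Calling the function with no arguments yields the bare test command, which tests the entire system for the default 600 seconds.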

Testing an Entire System

A test command with no modifiers runs all exercisers for subsystems and devices on the system. I/O devices tested are supported boot devices. The test runs for 10 minutes.
Example 3-2 Sample Test Command
P00>>> test
Console is in diagnostic mode
System test, runtime 600 seconds
Type ^C to stop testing
Configuring system..
polling ncr0 (NCR 53C810) slot 1, bus 0 PCI, hose 1 SCSI Bus ID 7
dka500.5.0.1.1 DKa500 RRD45 1645
polling ncr1 (NCR 53C810) slot 3, bus 0 PCI, hose 1 SCSI Bus ID 7
dkb200.2.0.3.1 DKb200 RZ29B 0007
dkb400.4.0.3.1 DKb400 RZ29B 0007
polling floppy0 (FLOPPY) PCEB - XBUS hose 0
dva0.0.0.1000.0 DVA0 RX23
polling tulip0 (DECchip 21040-AA) slot 2, bus 0 PCI, hose 1
ewa0.0.0.2.1: 08-00-2B-E5-B4-1A
Testing EWA0 network device
Testing VGA (alphanumeric mode only)
Starting background memory test, affinity to all CPUs..
Starting processor/cache thrasher on each CPU..
Starting processor/cache thrasher on each CPU..
Starting processor/cache thrasher on each CPU..
Starting processor/cache thrasher on each CPU..
Testing SCSI disks (read-only)
No CD/ROM present, skipping embedded SCSI test
Testing other SCSI devices (read-only)..
Testing floppy drive (dva0, read-only)
ID Program Device Pass Hard/Soft Bytes Written Bytes Read
-------- ------------ ------------ ------ --------- ------------- ------------
00003047 memtest memory 1 0 0 134217728 134217728
00003050 memtest memory 205 0 0 213883392 213883392
00003059 memtest memory 192 0 0 200253568 200253568
00003062 memtest memory 192 0 0 200253568 200253568
00003084 memtest memory 80 0 0 82827392 82827392
000030d8 exer_kid dkb200.2.0.3 26 0 0 0 13690880
000030d9 exer_kid dkb400.4.0.3 26 0 0 0 13674496
0000310d exer_kid dva0.0.0.100 0 0 0 0 0
ID Program Device Pass Hard/Soft Bytes Written Bytes Read
-------- ------------ ------------ ------ --------- ------------- ------------
00003047 memtest memory 1 0 0 432013312 432013312
00003050 memtest memory 635 0 0 664716032 664716032
00003059 memtest memory 619 0 0 647940864 647940864
00003062 memtest memory 620 0 0 648989312 648989312
00003084 memtest memory 263 0 0 274693376 274693376
000030d8 exer_kid dkb200.2.0.3 90 0 0 0 47572992
000030d9 exer_kid dkb400.4.0.3 90 0 0 0 47523840
0000310d exer_kid dva0.0.0.100 0 0 0 0 327680
ID Program Device Pass Hard/Soft Bytes Written Bytes Read
-------- ------------ ------------ ------ --------- ------------- ------------
00003047 memtest memory 1 0 0 727711744 727711744
00003050 memtest memory 1054 0 0 1104015744 1104015744
00003059 memtest memory 1039 0 0 1088289024 1088289024
00003062 memtest memory 1041 0 0 1090385920 1090385920
00003084 memtest memory 447 0 0 467607808 467607808
000030d8 exer_kid dkb200.2.0.3 155 0 0 0 81488896
000030d9 exer_kid dkb400.4.0.3 155 0 0 0 81472512
0000310d exer_kid dva0.0.0.100 1 0 0 0 607232
Testing aborted. Shutting down tests.
Please wait..
System test complete
^C
P00>>>
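When a long test run is captured to a log, the status rows can be scanned for errors automatically. The following is a minimal sketch under stated assumptions: the names ExerciserStatus, parse_status_line, and rows_with_errors are hypothetical, the two numbers under the Hard/Soft heading are assumed to be hard and soft error counts, and the ID column is assumed to be hexadecimal, as it appears in the samples.

```python
from dataclasses import dataclass


@dataclass
class ExerciserStatus:
    process_id: str
    program: str
    device: str
    passes: int
    hard_errors: int    # assumed meaning of the first Hard/Soft number
    soft_errors: int    # assumed meaning of the second Hard/Soft number
    bytes_written: int
    bytes_read: int


def parse_status_line(line):
    """Parse one status row; return None for headers, rules, and other text."""
    fields = line.split()
    if len(fields) != 8:
        return None
    try:
        int(fields[0], 16)                       # ID looks hexadecimal in the samples
        numbers = [int(f) for f in fields[3:]]   # the five numeric columns
    except ValueError:
        return None
    return ExerciserStatus(fields[0], fields[1], fields[2], *numbers)


def rows_with_errors(lines):
    """Return the parsed rows that report a nonzero hard or soft count."""
    rows = (parse_status_line(line) for line in lines)
    return [r for r in rows if r and (r.hard_errors or r.soft_errors)]


sample = [
    "ID Program Device Pass Hard/Soft Bytes Written Bytes Read",
    "00003050 memtest memory 205 0 0 213883392 213883392",
    "000030d8 exer_kid dkb200.2.0.3 26 0 0 0 13690880",
]
print(rows_with_errors(sample))  # []
```

Header lines, rule lines, and console messages do not split into eight fields with numeric columns, so they are skipped rather than reported as errors.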

Testing Memory

The test mem command tests individual memory devices or all memory. The test shown in Example 3-3 runs for 2 minutes.
Example 3-3 Sample Test Memory Command
P00>>> test memory
Console is in diagnostic mode
System test, runtime 120 seconds
Type ^C to stop testing
Starting background memory test, affinity to all CPUs..
Starting memory thrasher on each CPU..
Starting memory thrasher on each CPU..
Starting memory thrasher on each CPU..
Starting memory thrasher on each CPU..
ID Program Device Pass Hard/Soft Bytes Written Bytes Read
-------- ------------ ------------ ------ --------- ------------- ------------
000046d7 memtest memory 1 0 0 48234496 48234496
000046e0 memtest memory 122 0 0 126862208 126862208
000046e9 memtest memory 111 0 0 115329280 115329280
000046f2 memtest memory 109 0 0 113232384 113232384
000046fb memtest memory 41 0 0 41937920 41937920
ID Program Device Pass Hard/Soft Bytes Written Bytes Read
-------- ------------ ------------ ------ --------- ------------- ------------
000046d7 memtest memory 1 0 0 226492416 226492416
000046e0 memtest memory 566 0 0 592373120 592373120
000046e9 memtest memory 555 0 0 580840192 580840192
000046f2 memtest memory 554 0 0 579791744 579791744
000046fb memtest memory 211 0 0 220174080 220174080
ID Program Device Pass Hard/Soft Bytes Written Bytes Read
-------- ------------ ------------ ------ --------- ------------- ------------
000046d7 memtest memory 1 0 0 404750336 404750336
000046e0 memtest memory 1011 0 0 1058932480 1058932480
000046e9 memtest memory 1000 0 0 1047399552 1047399552
000046f2 memtest memory 999 0 0 1046351104 1046351104
000046fb memtest memory 381 0 0 398410240 398410240
ID Program Device Pass Hard/Soft Bytes Written Bytes Read
-------- ------------ ------------ ------ --------- ------------- ------------
000046d7 memtest memory 1 0 0 583008256 583008256
000046e0 memtest memory 1456 0 0 1525491840 1525491840
000046e9 memtest memory 1446 0 0 1515007360 1515007360
000046f2 memtest memory 1444 0 0 1512910464 1512910464
000046fb memtest memory 550 0 0 575597952 575597952
ID Program Device Pass Hard/Soft Bytes Written Bytes Read
-------- ------------ ------------ ------ --------- ------------- ------------
000046d7 memtest memory 1 0 0 761266176 761266176
000046e0 memtest memory 1901 0 0 1992051200 1992051200
000046e9 memtest memory 1892 0 0 1982615168 1982615168
000046f2 memtest memory 1889 0 0 1979469824 1979469824
000046fb memtest memory 720 0 0 753834112 753834112
ID Program Device Pass Hard/Soft Bytes Written Bytes Read
-------- ------------ ------------ ------ --------- ------------- ------------
000046d7 memtest memory 1 0 0 937426944 937426944
000046e0 memtest memory 2346 0 0 2458610560 2458610560
000046e9 memtest memory 2337 0 0 2449174528 2449174528
000046f2 memtest memory 2333 0 0 2444980736 2444980736
000046fb memtest memory 890 0 0 932070272 932070272
Memory test complete
Test time has expired...
P00>>>

Testing PCI Buses and Devices

The test pci command tests PCI buses and devices. The test runs for 2 minutes.
Example 3-4 Sample Test Command for PCI
P00>>> test pci*
Console is in diagnostic mode
System test, runtime 120 seconds
Type ^C to stop testing
Configuring all PCI buses..
polling ncr0 (NCR 53C810) slot 1, bus 0 PCI, hose 1 SCSI Bus ID 7
dka500.5.0.1.1 DKa500 RRD45 1645
polling ncr1 (NCR 53C810) slot 3, bus 0 PCI, hose 1 SCSI Bus ID 7
dkb200.2.0.3.1 DKb200 RZ29B 0007
dkb400.4.0.3.1 DKb400 RZ29B 0007
polling tulip0 (DECchip 21040-AA) slot 2, bus 0 PCI, hose 1
ewa0.0.0.2.1: 08-00-2B-E5-B4-1A
polling floppy0 (FLOPPY) PCEB - XBUS hose 0
dva0.0.0.1000.0 DVA0 RX23
Testing all PCI buses..
Testing EWA0 network device
Testing VGA (alphanumeric mode only)
Testing SCSI disks (read-only)
Testing floppy (dva0, read-only)
ID Program Device Pass Hard/Soft Bytes Written Bytes Read
-------- ------------ ------------ ------ --------- ------------- ------------
00002c29 exer_kid dkb200.2.0.3 27 0 0 0 14642176
00002c2a exer_kid dkb400.4.0.3 27 0 0 0 14642176
00002c5e exer_kid dva0.0.0.100 0 0 0 0 0
ID Program Device Pass Hard/Soft Bytes Written Bytes Read
-------- ------------ ------------ ------ --------- ------------- ------------
00002c29 exer_kid dkb200.2.0.3 92 0 0 0 48689152
00002c2a exer_kid dkb400.4.0.3 92 0 0 0 48689152
00002c5e exer_kid dva0.0.0.100 0 0 0 0 286720
Testing aborted. Shutting down tests.
Please wait..
Testing complete
^C
P00>>>

Power System

This chapter describes the DIGITAL Server 7300/7300R series power system:
Power Supply
Power Control Module Features
Power Circuit and Cover Interlocks
Power-Up/Down Sequence
Cabinet Power Configuration Rules
Pedestal Power Configuration Rules (North America and Japan)
Pedestal Power Configuration Rules (Europe and Asia Pacific)

Power Supply

Power supply outputs are shown in Figure 4-1.
Figure 4-1 Power Supply Outputs
[Figure labels: Misc. Signal, Current share, +5V/Return, +3.4V/Return (two), +12V/Return]
Power Supply Features
90–264 Vrms input
450 watts output. Output voltages are as follows:
Output Voltage   Min. Voltage   Max. Voltage   Max. Current
+5.0             4.85           5.25           50
+3.43            3.400          3.465          75
+12              11.5           12.6           11
–12              –10.9          –13.2          0.2
–5.0             –4.6           –5.5           0.2
Vaux             8.5            9.5            0.05
Remote sense on +5.0V and +3.43V
+5.0V is sensed on all CPUs in the system, the system bus motherboard, and the PCI bus motherboard(s).
+3.43V is sensed on all CPUs in the system and the system bus motherboard.
Current share on +5.0V, +3.43V, and +12V.
1% regulation on +3.43V.
Fault protection (latched). If a fault is detected by the power supply, it will shut down. The faults detected are:
Overvoltage
Overcurrent
Power overload
DC_ENABLE_L input signal starts the DC outputs.
POK_H output signal indicates that the power supply is operating properly.
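The voltage limits in the output table lend themselves to a simple range check when recording measurements during service. This is a minimal sketch: the names OUTPUT_LIMITS and in_range are hypothetical, and the negative outputs are stored in numeric order (most negative first), whereas the table lists them by magnitude. The last two lines work out the 1% band around the 3.43V nominal for comparison with the table limits of 3.400 and 3.465.

```python
# (lowest, highest) acceptable values in volts, copied from the output table.
# For the negative outputs the pair is ordered numerically, not by magnitude.
OUTPUT_LIMITS = {
    "+5.0": (4.85, 5.25),
    "+3.43": (3.400, 3.465),
    "+12": (11.5, 12.6),
    "-12": (-13.2, -10.9),
    "-5.0": (-5.5, -4.6),
    "Vaux": (8.5, 9.5),
}


def in_range(output, measured):
    """Return True if a measured voltage lies within the table limits."""
    low, high = OUTPUT_LIMITS[output]
    return low <= measured <= high


print(in_range("+5.0", 5.02))   # True
print(in_range("-12", -10.5))   # False

# 1% band around the 3.43V nominal.
nominal = 3.43
print(round(nominal * 0.99, 3), round(nominal * 1.01, 3))  # 3.396 3.464
```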

Power Control Module Features

The power control module (54-24117-01) is located behind the B3040-AA module, the system bus to PCI bus bridge module.
Figure 4-2 Power Control Module
[Figure labels: System Motherboard, Power Control Module Slot]
The power control module performs the following functions:
Controls the power-up/down sequencing.
Monitors the combined output of the power supplies, VDD (3.43V) and VCC (5.0V). It asserts DCOK_SENSE if both voltages are within range; if either is not, it asserts POWER_FAULT_L, causing an immediate power shutdown.
Monitors system temperature and asserts TEMP_FAIL if the temperature exceeds 55°C.
Monitors CPU and system drawer fans. It asserts CPUFAN_OK if all CPU fans are functioning properly and SYSTEM_FAN_OK if the drawer cooling fans are functioning properly; otherwise it asserts FAN_FAULT_L. Each fan is checked at 1-second intervals.
Powers down the system by asserting POWER_FAULT_L 30 seconds after detecting TEMP_FAIL, the absence of CPUFAN_OK, or the absence of SYSTEM_FAN_OK.
Provides visual indication of faults through LEDs.
Has two registers: one that generates interrupts when bits change, and one that latches errors but does not generate interrupts.
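The monitoring rules in the list above can be modeled in a few lines, which may help when reasoning about why a system shut down. This is a sketch only, not the module's actual logic: the names Readings, evaluate, and shutdown_delay are hypothetical, True means a signal is asserted regardless of its electrical polarity, and the voltage ranges are assumed to match the power supply output table because the module's own thresholds are not stated here.

```python
from dataclasses import dataclass

# Assumed limits, borrowed from the power supply output table.
VDD_RANGE = (3.400, 3.465)
VCC_RANGE = (4.85, 5.25)
TEMP_LIMIT_C = 55.0       # TEMP_FAIL is asserted above this temperature
SHUTDOWN_DELAY_S = 30     # delay before shutdown on temperature or fan faults


@dataclass
class Readings:
    vdd: float
    vcc: float
    temp_c: float
    cpu_fans_ok: bool
    system_fans_ok: bool


def evaluate(r):
    """Return signal states as booleans (True means asserted)."""
    dcok = (VDD_RANGE[0] <= r.vdd <= VDD_RANGE[1]
            and VCC_RANGE[0] <= r.vcc <= VCC_RANGE[1])
    return {
        "DCOK_SENSE": dcok,
        "TEMP_FAIL": r.temp_c > TEMP_LIMIT_C,
        "CPUFAN_OK": r.cpu_fans_ok,
        "SYSTEM_FAN_OK": r.system_fans_ok,
        "FAN_FAULT_L": not (r.cpu_fans_ok and r.system_fans_ok),
    }


def shutdown_delay(signals):
    """Seconds until POWER_FAULT_L is asserted, or None if no shutdown is due."""
    if not signals["DCOK_SENSE"]:
        return 0  # voltage out of range: immediate shutdown
    if (signals["TEMP_FAIL"] or not signals["CPUFAN_OK"]
            or not signals["SYSTEM_FAN_OK"]):
        return SHUTDOWN_DELAY_S
    return None


hot = evaluate(Readings(vdd=3.43, vcc=5.0, temp_c=60.0,
                        cpu_fans_ok=True, system_fans_ok=True))
print(shutdown_delay(hot))  # 30
```

A voltage fault takes priority in this model because the text describes it as causing an immediate shutdown, while temperature and fan faults allow a 30-second delay.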

Power Circuit and Cover Interlocks

Figure 4-3 is a diagram of the power circuit. Note that B305n in the diagram stands for either the B3050-AA or B3052-AA PCI Motherboard.
Figure 4-3 Power Circuit Diagram
[Figure labels: OCP Switch, OCP Logic, B305n, B3040, SCM, PCM, Power Supply, Motherboard, Cover Interlocks; signals RSM_DC_EN_L, DC_ENABLE_L, POWER_FAULT_L; part numbers 17-04201-02, 17-04217-01, 17-04201-01 or 17-04302-01, 17-04196-01, 70-32016-01]