This manual is for anyone who services a DIGITAL Server 7300/7300R Series system.
It covers installation, power-up, initial troubleshooting, and component installation.
January 1998
Digital Equipment Corporation
Maynard, Massachusetts
Digital Equipment Corporation makes no representations that the use of its products in the manner
described in this publication will not infringe on existing or future patent rights, nor do the
descriptions contained in this publication imply the granting of licenses to make, use, or sell
equipment or software in accordance with the description.
The information in this document is subject to change without notice and should not be construed as
a commitment by Digital Equipment Corporation.
Digital Equipment Corporation assumes no responsibility for any errors that may appear in this
document.
The software, if any, described in this document is furnished under a license and may be used or
copied only in accordance with the terms of such license. No responsibility is assumed for the use
or reliability of software or equipment that is not supplied by Digital Equipment Corporation or its
affiliated companies.
PATHWORKS, and the DIGITAL logo. The following are third-party trademarks: Adobe and
PostScript are registered trademarks of Adobe Systems, Incorporated. Helvetica and Times are
registered trademarks of Linotype Co. Microsoft and MS-DOS are registered trademarks and
Windows is a trademark of Microsoft Corporation.
The following are third-party trademarks: Lifestyle 28.8 DATA/FAX Modem is a trademark of
Motorola, Inc. UNIX is a registered trademark in the U.S. and other countries, licensed exclusively
through X/Open Company Ltd. U.S. Robotics and Sportster are registered trademarks of U.S.
Robotics. Windows NT is a trademark of Microsoft, Inc. All other trademarks and registered
trademarks are the property of their respective holders.
FCC Notice:
The equipment described in this manual generates, uses, and may emit radio
frequency energy. The equipment has been type tested and found to comply with the limits for a
Class A digital device pursuant to Part 15 of FCC Rules, which are designed to provide reasonable
protection against such radio frequency interference. Operation of this equipment in a residential
area may cause interference, in which case the user at his own expense will be required to take
whatever measures are required to correct the interference.
Shielded Cables:
If shielded cables have been supplied or specified, they must be used on the
system in order to maintain international regulatory compliance.
Warning!
This is a Class A product. In a domestic environment this product may cause radio
interference, in which case the user may be required to take adequate measures.
Achtung!
Dieses ist ein Gerät der Funkstörgrenzwertklasse A. In Wohnbereichen können bei
Betrieb dieses Gerätes Rundfunkstörungen auftreten, in welchen Fällen der Benutzer für
entsprechende Gegenmaßnahmen verantwortlich ist.
Avertissement!
Cet appareil est un appareil de Classe A. Dans un environnement résidentiel, cet
appareil peut provoquer des brouillages radioélectriques. Dans ce cas, il peut être demandé à
l'utilisateur de prendre les mesures appropriées.
Table of Contents
1 System Overview
DIGITAL Server 7300/7300R System Drawer (BA30A)
This manual is written for the customer service engineer.
Document Structure
This manual uses a structured documentation design. Topics are organized into small
sections for efficient online and printed reference. Each topic begins with an abstract,
followed by an illustration or example, and ends with descriptive text.
This manual has nine chapters, as follows:
• Chapter 1, System Overview, introduces the DIGITAL Server 7300/7300R series
pedestal and cabinet systems and gives an overview of the system bus modules.
• Chapter 2, Power-Up, provides information on how to interpret the power-up display
on the operator control panel, the console screen, and system LEDs. It also describes
how hardware diagnostics execute when the system is initialized.
• Chapter 3, Troubleshooting, describes troubleshooting during power-up and booting,
as well as the test command.
• Chapter 4, Power System, describes the DIGITAL Server 7300/7300R power system
Preface
• Chapter 5, Error Detection with Error Registers, describes the error registers used
to hold error information.
• Chapter 6, Removal and Replacement, describes removal and replacement
procedures for field-replaceable units (FRUs).
• Chapter 7, Running Utilities, explains how to run utilities such as the EISA
Configuration Utility and RAID Standalone Configuration Utility.
DIGITAL Server 7300/7300R Series Service Manual xi
• Chapter 8, SRM Console Commands and Environment Variables, summarizes the
commands used to examine and alter the system configuration.
• Chapter 9, Operating the System Remotely, describes how to use the remote
console monitor (RCM) to monitor and control the system remotely.
Documentation Titles
The following table lists titles related to DIGITAL Server 7300/7300R series systems.
DIGITAL Server 7300/7300R Series Documentation
Title    Order Number
DIGITAL 7300/7300R Series User and Configuration
Documentation Kit
System Drawer User's Guide
Configuration and Installation Guide
Illustrated Parts Breakdown
CPU Installation Card
Memory Installation Card
Power Supply Installation Card
ServerWORKS Manager Administrator User's Guide
Using a Web browser, you can access information about DIGITAL Servers at:
http://www.windowsnt.digital.com/products
Access the latest system firmware with a Web browser at:
http://www.windowsnt.digital.com/support
1 System Overview
This chapter introduces the DIGITAL Server 7300/7300R series systems. These systems
are available in cabinets or pedestals.
The pedestal system has one system drawer and up to three StorageWorks shelves. The
cabinet system can have a combination of system drawers and StorageWorks shelves that
occupy the five sections of the cabinet. There is one system drawer, the BA30A, used
with the DIGITAL Server 7300/7300R series.
Topics in this chapter include the following:
• DIGITAL Server 7300/7300R System Drawer (BA30A)
• Cabinet System
• Pedestal System
• Control Panel and Drives
• System Consoles
• System Architecture
• System Motherboard
• CPU Types
• Memory Modules
• System Bus
• System Bus to PCI Bus Bridge Module
• PCI I/O Subsystem
• Server Control Module
• Power Control Module
• Power Supply
DIGITAL Server 7300/7300R System Drawer (BA30A)
Components in the BA30A system drawer are located in the system bus card cage,
the PCI card cage, the control panel assembly, and the power and cooling section.
The drawer measures 30 cm x 45 cm (11.8 in. x 17.7 in.) and fully configured weighs
approximately 45.5 kg (~100 lbs).
Figure 1-1 Components of the BA30A System Drawer
When the system drawer is in a pedestal, the control panel assembly is mounted in a tray
at the top of the drawer.
The numbered callouts in Figure 1-1 refer to components of the system drawer.
1. System card cage, which holds the system motherboard and the CPU, memory, bridge,
and power control modules.
2. PCI/EISA card cage, which holds the PCI motherboard, option cards, and server
control module.
3. Server control module, which holds the I/O connectors and remote console monitor.
4. Control panel assembly, which includes the control panel, a floppy drive, and a CD-ROM drive.
5. Power and cooling section, which contains one to three power supplies and fans.
Cover Interlocks
The system drawer has three cover interlocks: one for the system bus card cage, one for
the PCI card cage, and one for the power and system fan area. Figure 1-2 shows the cover
interlock circuit. Note that “B305n” in Figure 1-2 stands for either the B3050-AA or
B3052-AA PCI Motherboard.
Figure 1-2 Cover Interlock Circuit
(Diagram labels: OCP logic, OCP switch, SCM, PCM, power supply, motherboard, B305n, B3040, and
three cover interlock switches. Signals: RSM_DC_EN_L, DC_ENABLE_L, POWER_FAULT_L. Cables:
17-04201-01 or 17-04302-01, 17-04201-02, 17-04217-01, 17-04196-01, 70-32016-01.)
NOTE: The cover interlocks must be engaged to enable power-up. To
override the cover interlocks, find a suitable object to close the
interlock circuit.
Cabinet System
The DIGITAL Server 7300/7300R series cabinet system can accommodate multiple
systems in a single cabinet. There are two cabinet variations that can hold different
system configurations. From the outside, the cabinets look almost identical and are of one
basic type. The differences are in power controllers.
Figure 1-3 DIGITAL Server 7300/7300R Cabinet System
Cabinet Differences
Cabinet     Power                                      Mounting
H9A10-EN    Two 120 volt H7600-AA power controllers    Pull-out tray (max drawers: 3)
H9A10-EP    Two 240 volt H7600-DB power controllers    Pull-out tray (max drawers: 3)
Destination: North America, Asia Pacific, Europe
Cabinet System Fan Tray
At the top of cabinet systems is a fan tray containing three exhaust fans, a small 12-volt
power supply, and a module that distributes power to the server control module in each
drawer.
Figure 1-4 Cabinet Fan Tray
(Labels: Fan LED, Power LED, To SCM, AC power.)
Pedestal System
The pedestal system contains one system drawer with a control panel, a CD-ROM drive,
and a floppy drive. In the pedestal control panel area there is space for an optional tape or
disk drive. Three StorageWorks shelves provide up to 90 Gbytes of in-cabinet storage.
Figure 1-5 Pedestal System Front
In the pedestal system, the control panel is located at the top left in a tray. There is space
for an optional device beside it.
Figure 1-6 Pedestal System Rear
Control Panel and Drives
The control panel includes the On/Off, Halt, and Reset buttons and a display. In a
pedestal system the control panel is located in a tray at the top of the system drawer.
In a cabinet system, the control panel is at the bottom of the system drawer with the
CD-ROM drive and the floppy drive.
Figure 1-7 Control Panel Assembly
(Pedestal and cabinet views. Labels: control panel, CD-ROM drive, floppy drive.)
On/Off button. Powers the system drawer on or off. When the LED at the top of the
button is lit, the power is on. The On/Off button is connected to the power supplies
and the system interlocks.
NOTE: The LEDs on some modules are on when the line cord is
plugged in, regardless of the position of the On/Off button.
Halt button. Pressing this button in (so the LED at the top of the button is on) has no
effect on Windows NT.
If the Halt button is in when the system is reset or powered up, the system halts in the
SRM console. AlphaBIOS is not loaded and started.
Reset button. Initializes the system drawer. If the Halt button is pressed (LED on)
when the system is reset, the SRM console is loaded and remains in the system
regardless of any other conditions.
Control panel display. Indicates status during power-up and self-test. The OCP
display is a 16-character LCD. Its controller is on the XBUS on the PCI
motherboard.
While the operating system is running, the display shows the system type by default. The
user can change this message.
CD-ROM drive. The CD-ROM drive is used to load software, firmware, and
updates. Its controller is on PCI1 on the PCI motherboard.
Floppy disk drive. The floppy drive is used to load software and firmware updates.
The floppy controller is on the XBUS on the PCI motherboard.
System Consoles
There are two console programs: the SRM console and the AlphaBIOS console.
SRM Console
The SRM console is a command-line interface that tests the system after power-up or
reset and launches the AlphaBIOS graphical interface. For some configuration and
diagnostic or testing tasks, you may need to use the SRM console interface rather than
launch the AlphaBIOS console. To reach the SRM console interface, power up or reset the
system with the Halt button pressed in. You then see the SRM console prompt:
P00>>>
NOTE: The console prompt displays only after the entire power-up
sequence is complete. This can take up to several minutes if
the memory is very large.
After the SRM console prompt appears, you should change the Halt button back to the
“out” position.
AlphaBIOS Console
The AlphaBIOS console is a menu-based interface that supports the Microsoft Windows
NT operating system. You use AlphaBIOS to set up operating system selections, boot
Windows NT, and display information about the system configuration. You also run the
EISA Configuration Utility and the RAID Standalone Configuration Utility from the
AlphaBIOS console. With the DIGITAL Server 7300/7300R series, AlphaBIOS runs on
either a serial (character-cell) terminal or a graphics monitor.
When you invoke the AlphaBIOS console, you see the following Boot menu:
Figure 1-8 AlphaBIOS Boot Menu
AlphaBIOS Version 5.12
Please select the operating system to start:
Windows NT Server 3.51
Use ↑ and ↓ to move the highlight to your choice.
Press Enter to choose.
Press <F2> to enter SETUP
Environment Variables
Environment variables are software parameters that, among other things, define the system
configuration. You can use them to pass information to different pieces of software
running in the system at various times.
The os_type environment variable determines which of the two consoles is to be used.
The SRM console is always brought into memory, but AlphaBIOS is loaded if os_type is
set to NT (which it must be on the DIGITAL Server 7300/7300R series) and the Halt
button is out (not lit).
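The console selection described above can be sketched as a small decision function. This is only an illustration of the stated rule: the function name, the string return values, and the case-insensitive comparison of os_type are assumptions, not part of the firmware.

```python
def console_after_reset(os_type: str, halt_button_in: bool) -> str:
    """Return the console the operator ends up in after power-up or reset.

    Hypothetical helper: models only the rule stated in the text.
    """
    # The SRM console is always brought into memory first.
    if halt_button_in:
        # Halt button in (LED lit): the system halts in the SRM console.
        return "SRM"
    if os_type.strip().lower() == "nt":
        # os_type is NT and the Halt button is out: AlphaBIOS is loaded.
        return "AlphaBIOS"
    return "SRM"


print(console_after_reset("NT", halt_button_in=False))
print(console_after_reset("NT", halt_button_in=True))
```

A field engineer can read this as a checklist: if the AlphaBIOS Boot menu does not appear, check the Halt button position first, then the os_type setting.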
See the section “Summary of SRM Environment Variables” in Chapter 8 of this manual
for a list of the environment variables used to configure DIGITAL Server 7300/7300R
series systems.
Refer to the DIGITAL Server 7300/7300R Series System Drawer User’s Guide for
information on setting environment variables.
You should keep a record of the environment variables for each system that you service.
Some environment variable settings are lost when a module is swapped and must be
restored after the new module is installed. Refer to Table 8-3 for a convenient worksheet
for recording environment variable settings.
System Architecture
Alpha microprocessor chips are used in these systems. The CPU, memory and the I/O
bridge module are connected to the system bus motherboard.
Figure 1-9 Architecture Diagram
(Block diagram: CPU 0 and the memory pairs attach to the system bus, which has a 128-bit data bus
plus 16 ECC bits and a 40-bit command/address bus. System to PCI bus bridges 0 and 1 each drive a
64-bit PCI bus on the PCI motherboard. One PCI bus has an EISA bridge, one PCI slot, and three
PCI/EISA slots; the other has four PCI slots.)
DIGITAL Server 7300/7300R series systems use the Alpha chip for the CPU. The CPU modules,
memory modules, and the I/O bridge module (which bridges to the PCI/EISA I/O buses) are
connected to the system bus motherboard. A fourth type of module, the power control module,
also plugs into the system motherboard.
A fully configured DIGITAL Server 7300/7300R series system drawer can have up to four
CPUs, four memory pairs, and a total of eight I/O options. The I/O options can be all PCI
options or a combination of PCI options and EISA options. However, there can be no more
than three EISA options.
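The option limits above lend themselves to a quick configuration check. The sketch below encodes only the two limits stated in the text (eight I/O options in total, at most three of them EISA); the function and constant names are illustrative.

```python
MAX_IO_OPTIONS = 8     # total PCI plus EISA options per system drawer
MAX_EISA_OPTIONS = 3   # EISA options allowed within that total


def io_options_allowed(pci_options: int, eisa_options: int) -> bool:
    """Return True if the option counts fit the limits stated in the text."""
    if pci_options < 0 or eisa_options < 0:
        return False
    if eisa_options > MAX_EISA_OPTIONS:
        return False
    return pci_options + eisa_options <= MAX_IO_OPTIONS


print(io_options_allowed(pci_options=5, eisa_options=3))
print(io_options_allowed(pci_options=4, eisa_options=4))
```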
The system bus has a 128-bit data bus protected by 16 bits of ECC (144 lines in all) and a 40-bit
command/address bus protected by parity. The bus speed depends on the speed of the
CPU in slot 0, which provides the clock for the buses. The 40-bit address bus can address
one terabyte (2^40 bytes, roughly a million million). The bus connects CPUs, memory, and
the system bus to PCI bus bridge(s).
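As a quick check of the bus figures, the arithmetic can be worked through directly. The sketch assumes the split of 128 data bits plus 16 ECC bits given in the system bus description elsewhere in this chapter; the variable names are illustrative.

```python
ADDRESS_BITS = 40
DATA_BITS = 128
ECC_BITS = 16

# Number of distinct byte addresses a 40-bit address can express.
addressable_bytes = 2 ** ADDRESS_BITS

# Data lines plus ECC lines on the data bus.
data_bus_lines = DATA_BITS + ECC_BITS

print(addressable_bytes)              # bytes addressable with 40 bits
print(addressable_bytes // 2 ** 40)   # the same value in binary terabytes
print(data_bus_lines)                 # total data plus ECC lines
```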
The CPU modules are available with an onboard cache. The Alpha chip has an 8-Kbyte
instruction cache (I-cache), an 8-Kbyte write-through data cache (D-cache), and a 96-Kbyte,
write-back secondary data cache (S-cache). The cache system is write-back. The
system drawer supports up to four CPUs.
The memory modules are placed on the system motherboard in pairs. Each module drives
half of the system bus, along with the associated ECC bits. Memory pairs consist of two
modules that are the same size and type. Two types are available: synchronous and
asynchronous (EDO) memory.
The system bus to PCI bus bridge module translates system bus commands and data
addressed to I/O space to PCI commands and data. It also translates PCI bus commands
and data addressed to system memory or CPUs to system bus commands and data. The
PCI bus is a 64-bit wide bus used for I/O. The 7300/7300R series has one PCI/EISA card
cage.
The power control module, which is on the system motherboard, monitors power and the
system environment.
System Motherboard
The system motherboard is on the floor of the system card cage. It has slots for the
CPU, memory, power control, and bridge modules.
Figure 1-10 System Motherboard Module Locations
The system motherboard has the logic for the system bus. It is the backplane that
holds the CPU, memory, bridge, and power control modules. Figure 1-10 shows
a diagram of the motherboard used in DIGITAL Server 7300/7300R series
systems. The module locations are designated by the callouts.
1. CPU module
2. Memory module
3. Bridge module
4. Power control module
CPU Types
DIGITAL Server 7300/7300R series systems can be configured with one of two CPU
variants.
CPU Variants
Module Variant    Clock Frequency    Onboard Cache
B3105-AA          400 MHz            4 Mbytes
B3105-CA          533 MHz            4 Mbytes
CPU Module Layout
Figure 1-11 shows the layout of the CPU module.
Figure 1-11 CPU Module Layout
(Shows CPU module slots 0 through 3 on the system motherboard and a typical cached CPU module.)
Alpha Chip Composition
The Alpha chip has a transistor count of 9.3 million, consumes 50 watts of power, and is air
cooled (a fan is on the chip). The cache system is write-back by default, and an external cache
on the module, when present, is also write-back.
Memory Modules
Memory modules are used only in pairs: two modules of the same size and type.
Each module provides either the low half or the high half of the memory space. The
7300/7300R series system drawer can hold up to four memory module pairs.
Figure 1-12 Memory Module Layout
(Shows a typical synchronous memory module and a typical EDO memory module.)
Memory Variants
Each memory option consists of two identical modules. Each DIGITAL Server
7300/7300R series drawer supports up to four memory options, for a total of 4 Gbytes of
memory. Memory modules are used only in pairs and are available in 128 Mbyte, 512
Mbyte, 1 Gbyte, and 2 Gbyte sizes. The 128-Mbyte option is synchronous memory, while
the larger sizes are asynchronous memory (EDO).
Option Part No   Size     Module     DRAM Type       DRAM Number   DRAM Size
FR-ACSMA-AA      128 MB   B3020-CA   Synch.          36            4 MB x 4
FR-ACSMA-AB      512 MB   B3030-EA   Asynch. (EDO)   144           4 MB x 4
FR-ACSMA-AC      1 GB     B3030-FA   Asynch. (EDO)   72            16 MB x 4
FR-ACSMA-AD      2 GB     B3030-GA   Asynch. (EDO)   144           16 MB x 4
Memory Operation
Memory modules are used only in pairs; each module provides half the data, or 64 bits
plus 8 ECC bits, of the octaword (16 byte) transferred on the system bus. Modules are
placed in slots designated MEMxL and MEMxH.
NOTE: Modules in slots MEMxL do not drive the lower 8 bytes, and
modules in slots MEMxH do not drive the higher 8 bytes of the
16 byte transfer.
Unless otherwise programmed, memory drives the system bus in bursts. Upon each
memory fetch, data is transferred in 4 consecutive cycles transferring 64 bytes. There are
situations, however, when memories made with EDO DRAMs cannot provide data fast
enough to complete the system bus transactions. When these situations arise, EDO type
memories assert a signal that causes the system bus to stall for one (occasionally more)
clock tick. When memory completes such an operation, it releases the system bus.
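The transfer sizes in this section follow from the per-module widths. The short sketch below works through that arithmetic; the names are illustrative, and burst_ticks simply adds the stall ticks that EDO memory may insert, without modelling when they occur.

```python
DATA_BITS_PER_MODULE = 64   # each module of a pair supplies 64 data bits
ECC_BITS_PER_MODULE = 8     # plus 8 ECC bits
MODULES_PER_PAIR = 2
CYCLES_PER_BURST = 4        # consecutive cycles per memory fetch

# One bus cycle moves an octaword: 2 modules x 64 bits = 128 bits = 16 bytes.
bytes_per_cycle = MODULES_PER_PAIR * DATA_BITS_PER_MODULE // 8
# A burst of four cycles therefore moves 64 bytes.
bytes_per_burst = bytes_per_cycle * CYCLES_PER_BURST
# ECC bits accompanying each octaword.
ecc_bits_per_cycle = MODULES_PER_PAIR * ECC_BITS_PER_MODULE


def burst_ticks(stall_ticks: int = 0) -> int:
    """Clock ticks for one burst, including any EDO stall ticks."""
    return CYCLES_PER_BURST + stall_ticks


print(bytes_per_cycle, bytes_per_burst, ecc_bits_per_cycle, burst_ticks(1))
```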
Memory Configuration Rules
In a system, memories of different sizes and types are permitted, but:
• Memory modules are installed and used in pairs. Both modules in a memory pair
must be of the same size and type.
• The largest memory pair must be in slots MEM0L and MEM0H.
• Other memory pairs must be the same size or smaller than the first memory pair.
• Memory pairs must be installed in consecutive slots.
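The configuration rules above can be expressed as a simple checker. This is a sketch under stated assumptions: the Module class, the slot representation (a list of low/high tuples indexed by MEMn), and the message wording are all hypothetical and exist only to illustrate the rules.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Module:
    size_mb: int
    kind: str  # "sync" or "edo" (labels chosen for this sketch)


Slot = Tuple[Optional[Module], Optional[Module]]  # (MEMnL, MEMnH)


def check_memory_config(slots: List[Slot]) -> List[str]:
    """Return a list of rule violations; an empty list means the layout is valid."""
    problems = []
    gap_seen = False
    first_pair_mb = None
    for n, (low, high) in enumerate(slots):
        if low is None and high is None:
            gap_seen = True
            continue
        if low is None or high is None:
            problems.append(f"MEM{n}: modules must be installed in pairs")
            continue
        if low != high:
            problems.append(f"MEM{n}: both modules must be the same size and type")
            continue
        if gap_seen:
            problems.append(f"MEM{n}: pairs must occupy consecutive slots")
        pair_mb = 2 * low.size_mb
        if n == 0:
            first_pair_mb = pair_mb
        elif first_pair_mb is not None and pair_mb > first_pair_mb:
            problems.append(f"MEM{n}: pair is larger than the pair in MEM0")
    return problems


big = Module(256, "edo")
small = Module(64, "sync")
print(check_memory_config([(big, big), (small, small), (None, None), (None, None)]))
print(check_memory_config([(small, small), (big, big), (None, None), (None, None)]))
```

Mixing sizes and types across pairs is accepted, as the rules allow; only mismatches within a pair, gaps, and a pair larger than the one in MEM0 are reported.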
Memory Addressing
Alpha system memory addressing is unusual because the memory address space is
determined not by the amount of physical memory but by a multiple of
the size of the memory pair in slot MEM0x.
Figure 1-13 How Memory Addressing Is Calculated
Address Range         Pair      Contents
0 to 512 Mbyte        First     Defines total address space; always fully occupied
                                (2 B3020-EA, 256 Mbyte/module)
512 to 1024 Mbyte     Second    512-Mbyte space half occupied (2 B3020-DA, 128 Mbyte/module);
                                the unoccupied half is an address hole
1024 to 1536 Mbyte    Third     512-Mbyte space half occupied (2 B3020-DA, 128 Mbyte/module)
1536 to 2048 Mbyte    Fourth    512-Mbyte space empty
The rules for addressing memory are as follows:
• Address space is determined by the memory pair in slot MEM0.
• Memory pairs need not be the same size.
• The memory pair in slot MEM0 must be the largest of all memory pairs. Other
memory pairs may be as large but none may be larger.
• The starting address of each memory pair is N times the size of the memory pair in
slot MEM0. N=0,1,2,3.
• Memory addresses are contiguous within each module pair.
• If memory pairs are of different sizes, memory “holes” can occur in the physical
address space. See Figure 1-13.
Software creates contiguous virtual memory even though physical memory may not be
contiguous.
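The addressing rules above can be illustrated with a short calculation. The sketch below assumes pair sizes are given in Mbytes in slot order MEM0 through MEM3, with 0 for an empty slot; the function name and return format are illustrative. The example values mirror Figure 1-13 (a 512-Mbyte pair followed by two 256-Mbyte pairs).

```python
def memory_map(pair_sizes_mb):
    """Return (regions, holes) for the pair sizes given in MEM0..MEM3 order.

    regions: (slot, start_mb, end_mb) for each occupied pair
    holes:   (start_mb, end_mb) gaps between consecutive occupied pairs
    """
    stride = pair_sizes_mb[0]  # the pair in MEM0 defines each pair's address space
    regions = [
        (f"MEM{n}", n * stride, n * stride + size)
        for n, size in enumerate(pair_sizes_mb)
        if size > 0
    ]
    holes = [
        (prev_end, next_start)
        for (_, _, prev_end), (_, next_start, _) in zip(regions, regions[1:])
        if next_start > prev_end
    ]
    return regions, holes


regions, holes = memory_map([512, 256, 256, 0])
for slot, start, end in regions:
    print(f"{slot}: {start} to {end} Mbyte")
print("holes:", holes)
```

The second pair starts at 512 Mbyte (1 x 512) and the third at 1024 Mbyte (2 x 512), so the 256 Mbytes between the end of the second pair and the start of the third are unpopulated.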
System Bus
The system bus consists of a 40-bit command/address bus, a 128-bit plus ECC data
bus, and several control signals and clocks.
Figure 1-14 System Bus Block Diagram
(Block diagram: CPU0 through CPU3, memory pairs MEM0 through MEM3, and the system to PCI bus
bridge (IOD0 and IOD1) share the system bus address lines MC ADR <39:4>, the data lines
MC DATA <127:0>, and the system bus control signals.)
The system bus on the motherboard consists of a 40-bit command/address bus, a 128-bit plus
ECC data bus, several control signals and clocks, and a bus arbiter. The bus requires that
all CPUs have the same high-speed oscillator providing the clock to the Alpha chip.
The DIGITAL Server 7300/7300R series system bus connects up to four CPUs, four pairs
of memory modules, and a single I/O bus bridge module. The I/O bus bridges are
designated IODn, where n is the number of the PCI bus. The two bridges on the module are
therefore IOD0 and IOD1.
The system bus clock is provided by an oscillator on the CPU in slot CPU0. This
oscillator has a 1:5 ratio to the Alpha chip. With 400 MHz CPUs, for example, the system
bus operates at 80 MHz.
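The 400 MHz example works out as follows. The sketch applies the 1:5 ratio stated above; whether the same ratio holds for other CPU variants is not stated here, so only the documented case is shown. The names are illustrative.

```python
CPU_TO_BUS_RATIO = 5  # one system bus clock for every five CPU clocks


def system_bus_mhz(cpu0_mhz: float) -> float:
    """System bus frequency derived from the oscillator on the CPU in slot CPU0."""
    return cpu0_mhz / CPU_TO_BUS_RATIO


print(system_bus_mhz(400))  # the 400 MHz case from the text
```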
The system bus motherboard initiates memory refresh transactions. The motherboard sits
at the bottom of the system drawer, and in addition to CPUs, memory, and I/O bridges,
holds a power control module.
5 volt and 3.43 volt power is provided directly to the motherboard from the power
supplies.
System Bus to PCI Bus Bridge Module
The bridge module is the physical interconnect between the system motherboard and
any PCI motherboard in the system.
Figure 1-15 Bridge Module
(Block diagram labels: CAP chip with address and control lines; data path chips MDPA and MDPB
with ECC and data <63:0> and <127:64> on the system bus side; AD<31:0> and AD<63:32> on the
PCI bus side.)
The system bus to PCI bus bridge module converts:
• System bus commands and data addressed to I/O space to PCI commands and data
• PCI bus commands and data addressed to system memory or CPUs to system bus
commands and data.
A DIGITAL Server 7300/7300R series system has one bridge module. The bridge module
has two major components:
• Command/address processor (CAP) chip
• Two data path chips (MDPA and MDPB)
There are two sets of these three chips, one set on each side of the module. Each set
bridges to one of the PCI buses on the PCI motherboard.
The interface on the system bus side of the bridge responds to system bus commands
addressed to the upper 64 Gbytes of I/O space. I/O space is addressed whenever bit <39>
on the system bus address lines is set. The space so defined is 512 Gbytes in size. The
first 448 Gbytes are reserved and the last 64 Gbytes, when bits <38:36> are set, are
mapped to the PCI I/O buses.
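The address decoding described above can be sketched with a few bit masks. This is an illustration of the stated bit assignments only: the function name and the three region labels are assumptions, and the sketch does not model how the 64 Gbytes are divided between the PCI buses.

```python
IO_SPACE_BIT = 1 << 39          # bit <39> set selects I/O space
PCI_SELECT_MASK = 0b111 << 36   # bits <38:36> all set selects the PCI region


def classify_address(addr: int) -> str:
    """Classify a 40-bit system bus address as memory, reserved I/O, or PCI I/O."""
    if not 0 <= addr < (1 << 40):
        raise ValueError("address must fit in 40 bits")
    if not addr & IO_SPACE_BIT:
        return "memory"
    if (addr & PCI_SELECT_MASK) == PCI_SELECT_MASK:
        return "pci"
    return "reserved"


GBYTE = 2 ** 30
print((1 << 39) // GBYTE)   # size of I/O space in Gbytes
print((1 << 36) // GBYTE)   # size of the PCI-mapped region in Gbytes
print(classify_address(0x80_0000_0000), classify_address(0xF0_0000_0000))
```

With bit <39> set, the space spans 512 Gbytes; requiring bits <38:36> to be set as well leaves the top 64 Gbytes, so the remaining 448 Gbytes are the reserved portion.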
The interface on the PCI side of the bridge responds to commands addressed to CPUs and
memory on the system bus. On the PCI side, the bridge provides the interface to the PCIs.
Each PCI bus is addressed separately. The bridge does not respond to devices
communicating with each other on the same PCI bus. However, should a device on one
PCI address a device on the other PCI bus, commands, addresses, and data run through the
bridge out onto the system bus and back through the bridge to the other PCI bus.
In addition to its bridge function, the system bus to PCI bus bridge module monitors every
transaction on the system bus for errors. It monitors the data lines for ECC errors and the
command/address lines for parity errors.
PCI I/O Subsystem
The I/O subsystem is PCI. The DIGITAL Server 7300/7300R series has two four-slot
PCI buses that hold up to eight I/O options. One of these buses can be both PCI and
EISA, but it can hold no more than four options, three of which may be EISA.
Figure 1-16 PCI Block Diagram
(Block diagram labels: system bus; PCI-0 bus and PCI-1 bus, each with four 64-bit slots; serial
interrupt logic; PCI to EISA bridge chipset; EISA 32-bit slots; SCSI control 53C810 and connector;
XBUS with combo I/O (serial ports, parallel port, floppy controller), mouse/keyboard, real-time
clock, 8K x 8 NVRAM, 2-MB flash ROM, and I2C bus interface.)
Table 1-1 PCI Motherboard Slot Numbering
Slot    PCI0                  PCI1
0       Reserved              Reserved
1       PCI to EISA bridge    Internal CD-ROM controller
2       PCI or EISA slot      PCI slot
3       PCI or EISA slot      PCI slot
4       PCI or EISA slot      PCI slot
5       PCI slot              PCI slot
The logic for two PCI buses is on each PCI motherboard.
PCI0 is a 64-bit bus with a built-in PCI to EISA bus bridge. PCI0 has one dedicated PCI
slot and three slots that can be either PCI or EISA slots. Each of these three slots has an
EISA connector and a PCI connector (six connectors in all), only one of which may be used
at a time. PCI0 is powered by 5V.
PCI1 is a 64-bit bus with a built-in CD-ROM controller and four PCI slots. PCI1 is
powered by 5V.
An 8-bit XBUS is connected to the EISA bus. On this bus there is an interface to the
system I2C bus; mouse and keyboard support; an I/O combo controller supporting two
serial ports, the floppy controller, and a parallel port; a real-time clock; two 1-Mbyte flash
ROMs containing system firmware; and an 8-Kbyte NVRAM.
Server Control Module
The server control module enables remote console connections to the system drawer.
The module passes signals for COM ports 1 and 2, the keyboard, and the mouse to the
standard I/O connectors.
Figure 1-17 Server Control Module
(Labels: remote console monitor, standard I/O.)
The server control module has two sections: the remote console monitor (RCM) and the
standard I/O. See Chapter 9 for information on controlling the system remotely.
The remote console monitor connects to a modem through the modem port on the
bulkhead. The RCM requires a 12V power connection.
The standard I/O ports (keyboard, mouse, COM1 and COM2 serial, and parallel ports) are
on the same bulkhead.
Power Control Module
The power control module controls power sequencing and monitors power supply
voltage, temperature, and fans.
Figure 1-18 Power Control Module
(Shows the power control module slot on the system motherboard.)
The power control module performs the following functions:
• Controls power sequencing.
• Monitors the combined output of power supplies and shuts down power if it is not in
range.
• Monitors system temperature and shuts off power if it is out of range.
• Monitors the fans in the system drawer and on the CPU modules and shuts down
power if a fan fails.
• Provides visual indication of faults through LEDs.
Power Supply
The system drawer power supplies provide power only to components in the drawer.
One or two power supplies are required, depending on the number of CPU modules
and PCI card cages; a second or third can be added for redundancy. The power
system is described in detail in Chapter 4.
Figure 1-19 Location of Power Supply
(Labels: Power Supply 2, Power Supply 1, Power Supply 0.)
Description
One to three power supplies provide power to components in the system drawer.
(They supply power only for the drawer in which they are located.) Three
power supplies provide redundant power in fully loaded DIGITAL Server
7300/7300R series systems.
These power supplies share the load, and redundant configurations are supported. They
autoselect line voltage (120V to 240V). Each has 450 W output and supplies up to 75A of
3.43V, 50A of 5.0V, 11A of 12V, and small amounts of –5V, –12V, and auxiliary voltage
(Vaux).
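The rail figures above are per-rail maximums, and multiplying them out shows they cannot all be drawn at once from a 450 W supply. The sketch below works through that arithmetic. Treating 450 W as a combined ceiling across the three main rails is an assumption made for illustration, and the small –5V, –12V, and Vaux loads are ignored.

```python
RATED_OUTPUT_W = 450.0
# Maximum current per main rail, keyed by rail voltage (from the text above).
MAX_CURRENT_A = {3.43: 75.0, 5.0: 50.0, 12.0: 11.0}


def rail_power_w(load_a):
    """Total power in watts for a load given as {rail voltage: amps}."""
    return sum(volts * amps for volts, amps in load_a.items())


def within_rating(load_a):
    """True if no rail exceeds its current limit and the total stays within 450 W."""
    for volts, amps in load_a.items():
        if amps > MAX_CURRENT_A[volts]:
            return False
    return rail_power_w(load_a) <= RATED_OUTPUT_W


# All three rails at maximum current would exceed the rated output.
print(round(rail_power_w(MAX_CURRENT_A), 2))
print(within_rating({3.43: 40.0, 5.0: 30.0, 12.0: 5.0}))
```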
NOTE: The LEDs on some modules are on when the line cord is
plugged in, regardless of the position of the On/Off button.
Configuration
A DIGITAL Server 7300/7300R series system with one or two CPUs requires one power
supply (two for redundancy).
A DIGITAL Server 7300/7300R series system with three or four CPUs requires
two power supplies (three for redundancy).
Power supply 0 is installed first, power supply 2 second, and power supply 1 third. See
Figure 1-19 Location of Power Supply. (The power supply numbering shown here
corresponds to the numbering displayed by the SRM console's show power command.)
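The power supply configuration rules can be summarized in a small helper. This is a sketch of the rules as stated in this section (supply count by CPU count, one extra supply for redundancy, install order 0, 2, 1); the function names are illustrative.

```python
INSTALL_ORDER = (0, 2, 1)  # power supply 0 first, then 2, then 1


def supplies_required(cpus: int, redundant: bool = False) -> int:
    """Number of power supplies needed for the given CPU count."""
    if not 1 <= cpus <= 4:
        raise ValueError("a system drawer holds one to four CPUs")
    base = 1 if cpus <= 2 else 2
    return base + (1 if redundant else 0)


def positions_to_populate(count: int) -> tuple:
    """Power supply positions to fill, in installation order."""
    if not 1 <= count <= len(INSTALL_ORDER):
        raise ValueError("a system drawer holds one to three power supplies")
    return INSTALL_ORDER[:count]


needed = supplies_required(cpus=4, redundant=True)
print(needed, positions_to_populate(needed))
```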
2 Power-Up
This chapter describes system power-up testing and explains the power-up displays. The
following topics are covered:
• Control Panel
• Power-Up Sequence
• SROM Power-Up Test Flow
• SROM Errors Reported
• XSROM Power-Up Test Flow
• XSROM Errors Reported
• Console Power-Up Tests
• Console Device Determination
• Console Power-Up Display
• Fail-Safe Loader
Control Panel
The control panel display indicates the likely device when testing fails.
Figure 2-1 Control Panel and LCD Display
(Figure labels: On/Off, Halt, Reset, Potentiometer Access Hole. Sample LCD display: P0 TEST 11 CPU00)
When the On/Off button LED is on, power is applied and the system is running. When it
is off, the system is not running, but power may or may not be present. If power is
present, the PCM or the power LED on the system bus to PCI bus bridge module should be
flashing. Otherwise, there is a power problem.
When the Halt button LED is lit and the On/Off button is on, the system should be running
either the SRM console or Windows NT. If the Halt button is in, but the LED is off, the
OCP, its cables, or the PCM are likely to be broken.
Table 2-1 Control Panel Display
Field Content      Display          Meaning
CPU number         P0–P3            CPU reporting status
Status             TEST             Tests are executing
                   FAIL             Failure has been detected
                   MCHK             Machine check has occurred
                   INTR             Error interrupt has occurred
Test number
Suspected device   CPU0–3 (1)       CPU module number
                   MEM0–3 and       Memory pair number and low module,
                   L, H, or * (2)   high module, or either
                   IOD0 (3)         Bridge to PCI bus 0
                   IOD1 (3)         Bridge to PCI bus 1
                   FROM0 (4)        Flash ROM
                   COMBO (4)        COM controller
                   PCEB (4)         PCI-to-EISA bridge
                   ESC (4)          EISA system controller
                   NVRAM (4)        Nonvolatile RAM
                   TOY (4)          Real-time clock
                   I8242 (4)        Keyboard and mouse controller
(1) CPU module
(2) Memory module
(3) Bridge module (B3040-AA)
(4) EISA/PCI motherboard
The potentiometer, accessible through the access hole just above the Reset button, controls
the intensity of the LCD. Use a small Phillips head screwdriver to adjust.
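When logging what the control panel shows, it can help to split a display line into the four fields of Table 2-1. The sketch below is illustrative only: the function name and the whitespace-separated layout are assumptions based on the sample display "P0 TEST 11 CPU00", not a documented format.

```python
import re

# Status values and meanings as listed in Table 2-1.
STATUS_MEANINGS = {
    "TEST": "Tests are executing",
    "FAIL": "Failure has been detected",
    "MCHK": "Machine check has occurred",
    "INTR": "Error interrupt has occurred",
}

# Assumed layout: CPU number, status, test number, suspected device.
_DISPLAY = re.compile(r"^(P[0-3])\s+(TEST|FAIL|MCHK|INTR)\s+(\S+)\s+(\S+)$")


def parse_ocp_line(line: str):
    """Split a control panel display line into its fields, or return None."""
    match = _DISPLAY.match(line.strip())
    if match is None:
        return None
    cpu, status, test, device = match.groups()
    return {
        "cpu": cpu,
        "status": status,
        "meaning": STATUS_MEANINGS[status],
        "test": test,
        "device": device,
    }
```

Called on the sample display from Figure 2-1, this returns CPU `P0`, status `TEST`, test `11`, and suspected device `CPU00`.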
Power-Up Sequence
Console and most power-up tests reside on the I/O subsystem, not on the CPU nor on
any other module on the system bus.
Figure 2-2 Power-Up Flow
Power-up flow, in order:
1. Power-up/reset.
2. SROM code is loaded into each CPU's I-cache.
3. SROM tests execute.
4. XSROM is loaded into each CPU's S-cache.
5. XSROM tests execute.
6. SRM console is loaded into memory.
7. SRM console tests execute.
8. SRM console either remains in the system or loads the AlphaBIOS console.
Definitions
SROM. The SROM is a 128-Kbit ROM on each CPU module. SROM contains minimal
diagnostics that test the Alpha chip and the path to the XSROM. Once the path is verified,
it loads XSROM code into the Alpha chip and jumps to it.
XSROM. The XSROM, or extended SROM, contains back-up cache and memory tests,
and a fail-safe loader. The XSROM code resides in sector 0 of FEPROM 0 on the XBUS.
Sector 2 of FEPROM 0 contains a duplicate copy of the code and is used if sector 0 is bad.
FEPROM. Two 1-Mbyte programmable ROMs are on the XBUS on PCI0. FEPROM 0
contains two copies of the XSROM, and the SRM console and decompression code.
FEPROM 1 contains the AlphaBIOS and NT HALcode. These two FEPROMs can be flash
updated. Refer to Chapter 7.
Figure 2-3 Contents of FEPROMs
(Figure labels. FEPROM 0, in 64-Kbyte sectors: XSROM and fail-safe loader; XSROM and fail-safe loader (second copy); decompression code, PALcode, and SRM console code. FEPROM 1, 1 Mbyte: AlphaBIOS code.)
For the console to run, the path from the CPU to the XSROM must be functional. The
XSROM resides in FEPROM0 on the XBUS, off the EISA bus, off PCI 0, off IOD 0. See
Figure 2-4. This path is minimally tested by SROM.
Figure 2-4 Console Code Critical Path
(Block diagram labels: CPU and memory pair on the system bus (128-bit data bus + 16 ECC and 40-bit command/address bus); bridge module with system to PCI bus bridge 0 and bridge 1; 64-bit PCI bus 0 and PCI bus 1 on the PCI motherboard, with PCI slots and PCI/EISA slots; EISA bridge and EISA bus; XBUS transceivers and BDATA transceivers; XBUS devices: combo I/O (serial ports, parallel port, floppy controller), mouse/keyboard, I2C bus interface, 8K x 8 NVRAM, 2-MB flash ROM, and real-time clock.)
The SROM contents are loaded into each CPU’s I-cache and executed on power-up/reset.
After testing the caches on each processor chip, it tests the path to the XSROM. Once this
path is tested and deemed reliable, layers of the XSROM are loaded sequentially into the
processor chip on each CPU. None of the SROM or XSROM power-up tests are run from
memory—all run from the caches in the CPU chip, thus providing excellent diagnostic
isolation. Later power-up tests, run under the console, are used to complete testing of the
I/O subsystem.
There are two console programs: the SRM console and the AlphaBIOS console, as detailed
in the DIGITAL Server 7300/7300R Series System Drawer User’s Guide
(ER–K9FWW–UA). By default, the SRM console is always loaded and I/O system tests are run under it
before the system loads AlphaBIOS. To load AlphaBIOS, the os_type environment
variable must be set to NT and the Halt button should be out (LED not lit).
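The rule for which console is left running after power-up can be stated as a single condition. The sketch below is a hypothetical illustration of that rule; the function name and its return strings are not part of the firmware.

```python
def console_after_power_up(os_type: str, halt_button_in: bool) -> str:
    """Return the console expected to be running after power-up tests.

    The SRM console always loads first. AlphaBIOS is loaded from it only
    when os_type is NT and the Halt button is out (LED not lit).
    """
    if os_type.lower() == "nt" and not halt_button_in:
        return "AlphaBIOS"
    return "SRM"
```

With the Halt button pressed in, the function returns `"SRM"` even when `os_type` is `NT`, which is the state used later in this manual to reach the SRM console for diagnostics.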
SROM Power-Up Test Flow
The SROM tests the CPU chip and the path to the XSROM.
Figure 2-5 SROM Power-Up Test Flow
For each CPU:
1. Initialize the CPU chip.
2. Turn off the CPU LED.
3. If there are D-cache errors, hang.
4. If all three S-cache banks do not pass, hang.
5. If there are duplicate tag or fill errors, hang.
6. Light the CPU LED.
7. Determine the primary CPU.
8. Size the IODs.
9. Run loopback on each IOD. On a pass, light the IOD LEDs. On a failure, continue.
10. Initialize the PCI-EISA bridge chip.
11. Read the TOY NVRAM.
12. Initialize the combo chip on the XBUS for access to COM port 1.
13. Initialize the OCP port on the XBUS for access to the OCP display.
14. Print to the console device and OCP.
15. Initialize all S-cache banks.
16. Check the integrity of the XSROM. On a pass, load the first 8K of XSROM into the S-cache. If the check fails twice, hang.
17. Jump to the XSROM overlay in the S-cache.
The Alpha chip built-in self-test tests the I-cache at power-up and upon reset.
Each CPU chip loads its SROM code into its I-cache and starts executing it. If the chip is
partially functional, the SROM code continues to execute. However, if the chip cannot
perform most of its functions, that CPU hangs and that CPU pass/fail LED remains off.
If the system has more than one CPU and at least one passes both the SROM and XSROM
power-up tests, the system will bring up the console. The console checks the
FW_SCRATCH register where evidence of the power-up failure is left. Upon finding the
error, the console sends these messages to COM1 and the OCP:
• COM1 (or VGA): Power-up tests have detected a problem with your system
• OCP: Power-up failure
Table 2-2 lists the tests performed by the SROM.
Table 2-2 SROM Tests
Test Name                     Logic Tested
D-cache RAM March test
D-cache Tag RAM March test
S-cache Data March test       S-cache RAM cells, S-cache data path, S-cache address path
S-cache Tag RAM March test    S-cache tag store RAM, S-cache bank address logic
I-cache Parity Error test     I-cache parity error detection, ISCR register and error
                              forcing logic, IC_PERR_STAT register and reporting logic
D-cache Parity Error test     D-cache parity error detection, DC_MODE register and
                              parity error forcing logic, DC_PERR_STAT register and
                              reporting logic
S-cache Parity Error test     S-cache parity error detection, AC_CTL register and parity
                              error forcing logic, SC_STAT register and reporting logic
IOD Access test               Access to IOD CSRs, data path through CAP chip and MDP0
                              on each IOD, PCI0 A/D lines <31:0>
SROM Errors Reported
The SROM reports machine checks, pending interrupt/exception errors, and errors
related to corruption of FEPROM 0. If SROM errors are fatal, the particular CPU
will hang and only the CPU self-test pass LEDs and/or the LEDs on the system bus to
PCI bus bridge module will indicate the failure.
XSROM Power-Up Test Flow
After the SROM has completed its tests and verified the path to the FEPROM
containing the XSROM code, it loads the first 8 Kbytes of XSROM into the primary
CPU’s S-cache and jumps to it.
Figure 2-6 XSROM Power-Up Flowchart
XSROM power-up flow, in order:
1. Send the XSROM banner to the OCP/console device.
2. Clear SC_FHIT (force hit) and enable all three S-cache banks.
3. Run B-cache tests. Print errors to the OCP/console device. Send a done message to the console device.
4. Initialize the B-cache and enable duplicate tag.
5. Boot processor redetermination.
6. Size system memory through the I squared C bus. Print memory information to the console device.
7. Check for illegal memory configuration. Print warnings to the console device and OCP.
8. Initialize all memory pairs.
9. Run memory tests. Print trace to the OCP/console device. Print errors to the OCP/console device. Send a done message to the console device.
10. Boot processor redetermination.
11. The primary verifies the checksum of the PAL/decompression/console code.
12. The primary unloads PAL/decompression code or the fail-safe loader, depending upon the results of the checksum.
13. On a pass, the primary jumps to PALcode and starts the console. On a failure, control passes to the fail-safe loader.
14. Secondaries are alerted that the console has started. They jump to and run PALcode, joining the console.
Note: The XSROM can only print to the console device if the environment variable console = serial. It always sends output to the OCP.
XSROM tests are described in the following table. Failure indicates a CPU failure.
After jumping to the primary CPU's S-cache, the code then intentionally I-caches itself and
is completely register based (no D-stream for stack or data storage is used). The only
D-stream accesses are writes/reads during testing.
Each FEPROM has sixteen 64-Kbyte sectors. The first sector contains B-cache tests,
memory tests, and a fail-safe loader. The second sector contains PALcode. The third
sector contains a copy of the first sector. The remaining thirteen sectors contain the SRM
console and decompression code.
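The sector layout described above can be checked with a little arithmetic: sixteen 64-Kbyte sectors make up the 1-Mbyte part. The following sketch is illustrative only. The function names are hypothetical, and the byte offsets are an assumption (relative to the start of FEPROM 0, not system addresses).

```python
SECTOR_SIZE = 64 * 1024   # 64 Kbytes per sector
SECTOR_COUNT = 16         # sixteen sectors per FEPROM


def sector_range(sector: int) -> tuple:
    """Return (first, last) byte offset of a sector within the FEPROM."""
    if not 0 <= sector < SECTOR_COUNT:
        raise ValueError("sector must be 0 to 15")
    start = sector * SECTOR_SIZE
    return start, start + SECTOR_SIZE - 1


def feprom0_contents(sector: int) -> str:
    """Describe what FEPROM 0 holds in a sector, per the text above."""
    sector_range(sector)  # validates the sector number
    if sector == 0:
        return "B-cache tests, memory tests, fail-safe loader"
    if sector == 1:
        return "PALcode"
    if sector == 2:
        return "copy of sector 0"
    return "SRM console and decompression code"
```

Here `sector_range(2)` returns offsets 0x20000 through 0x2FFFF, the location of the backup copy used when sector 0 is bad.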
NOTE: Memory tests are run during power-up and reset (see Table 2-3).
They are also affected by the state of the memory_test
environment variable, which can have the following values:
FULL     Test all memory
PARTIAL  Test up to the first 256 Mbytes
NONE     Test 32 Mbytes
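To estimate how much memory a power-up will test under each memory_test setting, the three values can be mapped as below. This is a sketch under stated assumptions: the function name is hypothetical, and capping the NONE case at the installed size is an assumption for systems with less than 32 Mbytes.

```python
def memory_tested_mb(total_mb: int, memory_test: str) -> int:
    """Return the Mbytes of memory tested for a memory_test setting."""
    setting = memory_test.lower()
    if setting == "full":
        return total_mb               # test all memory
    if setting == "partial":
        return min(total_mb, 256)     # up to the first 256 Mbytes
    if setting == "none":
        return min(total_mb, 32)      # 32 Mbytes (capped: assumption)
    raise ValueError("memory_test must be FULL, PARTIAL, or NONE")
```

On a system with 512 Mbytes installed, PARTIAL tests 256 Mbytes and NONE tests 32 Mbytes.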
Table 2-2 XSROM Tests
Test  Test Name                          Logic Tested
11    B-cache Tag Data Line test         Access to B-cache tags, shorts between
                                         tag data and its status and parity bits
12    B-cache Tag March test             B-cache tag store RAMs, B-cache
                                         STAT store RAMs
13    B-cache Data Line test             B-cache data lines to B-cache data
                                         RAMs, B-cache read/write logic
14    B-cache Data March test            B-cache data RAMs, CPU chip B-cache
                                         control, CPU chip B-cache address
                                         decode, INDEX_H<2x:6> (address bus)
15    B-cache ECC Data Line test         CPU chip ECC generation and checking
                                         logic, ECC lines from CPU chip to
                                         B-cache, B-cache ECC RAMs
16    B-cache Data ECC March test        Portion of B-cache data RAMs used for
                                         ECC
17    CPU chip ECC Single/Double bit     CPU chip ECC single-bit error detection
      Error test                         and correction, ECC double-bit error
                                         detection, ECC error reporting
18    B-cache Tag Store Parity Error     B-cache tag array, CPU parity
      test                               detection, EI_ADDR and EI_STAT
                                         register operation
19    B-cache STAT Store Parity Error    B-cache STAT array, CPU chip B-cache
      test                               STAT parity generation/detection
Table 2-3 Memory Tests
Test  Test Name          Logic Tested              Description
20    Memory Data test   Data path to and from     01 – FF: Errors are reported as an
                         memory. Data path on      8-bit binary field. A set bit
                         memory and RAMs           indicates a module failure. Bit <0>
                                                   indicates pass/fail of MEM0_L; <1>
                                                   indicates pass/fail of MEM0_H; <2>
                                                   indicates pass/fail of MEM1_L; <7>
                                                   indicates pass/fail of MEM3_H.
21    Memory Address     Address path to and       Same as test 20.
      test               from memory. Address
                         path on memory and RAMs
23*   Memory Bitmap      No new logic              Maps out bad memory by way of the
      Building                                     bitmap. It does not completely fail
                                                   memory.
24    Memory March test  No new logic              Maps out bad memory.
* There is no test 22.
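The 8-bit error field reported by tests 20 and 21 can be decoded one bit per memory module. The sketch below is a hypothetical helper, not firmware code. The table gives bits <0>, <1>, <2>, and <7> explicitly; the mapping of bits <3> through <6> is an assumption that follows the same low/high pattern.

```python
def failing_modules(error_field: int) -> list:
    """Return the memory modules flagged in an 8-bit error field (01-FF)."""
    if not 0 <= error_field <= 0xFF:
        raise ValueError("error field must fit in 8 bits")
    # Bit 0 = MEM0_L, bit 1 = MEM0_H, bit 2 = MEM1_L, ..., bit 7 = MEM3_H.
    # Bits 3-6 are inferred from the pattern, not stated in the table.
    names = [f"MEM{pair}_{half}" for pair in range(4) for half in ("L", "H")]
    return [name for bit, name in enumerate(names) if error_field & (1 << bit)]
```

For example, a field of 04 (binary 00000100) flags MEM1_L, and 81 (hexadecimal) flags both MEM0_L and MEM3_H.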
XSROM Errors Reported
The XSROM reports B-cache test errors and memory test errors. The XSROM also
reports a warning if memory is illegally configured.
Example 2-2 XSROM Errors Reported at Power-Up
B-Cache Error (CPU Error)
TEST ERR on cpu0          #CPU running the test
FRUcpu0
err#2
tst#11
exp:5555555555555555      #Expected data
rcv:aaaaaaaaaaaaaaaa      #Received data
adr:ffff8                 #B-cache location error
                          #occurred
Memory Error (Memory Module Indicated)
20..21..
TEST ERR on cpu0          #CPU running test
FRU:MEM1L                 #Low member of memory pair 1
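When reading a B-cache error report, comparing the exp and rcv values bit by bit shows which data bits differ. The sketch below is an illustrative aid with a hypothetical function name; it is not something the XSROM or console provides.

```python
def differing_bits(expected_hex: str, received_hex: str) -> list:
    """Return the bit positions where expected and received data differ."""
    diff = int(expected_hex, 16) ^ int(received_hex, 16)  # XOR marks mismatches
    return [bit for bit in range(diff.bit_length()) if (diff >> bit) & 1]
```

For the report above, the expected pattern 5555555555555555 and the received pattern aaaaaaaaaaaaaaaa differ in all 64 bit positions, so the received data is the exact complement of the expected data.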
Console Power-Up Tests
Once the SRM console is loaded, it does further testing of each IOD. Table 2-4
describes the IOD power-up tests, and Table 2-5 describes the PCI motherboard
power-up tests.
Table 2-4 IOD Tests
Test
Number  Test Name                  Description
1       IOD CSR Access test        Read and write all CSRs in each IOD.
2       Loopback test              Dense space writes to the IOD’s PCI dense
                                   space to check the integrity of ECC lines on
                                   the IODs.
3       ECC test                   Loopback tests similar to test 2 but with a
                                   varying pattern to create an ECC of 0s.
                                   Single- and double-bit errors are checked.
4       Parity Error and Fill      Parity errors are forced on the address and
        Error tests                data lines on system bus and PCI buses. A fill
                                   error transaction is forced on the system bus.
5       Translation Error test     A loopback test using scatter/gather address
                                   translation logic on each IOD.
6       Write Pending test         Runs test 2 with the write-pending bit set and
                                   clear in the CAP chip control register.
7       PCI Loopback test          Loops data through each PCI on each IOD,
                                   testing the mask field of the system bus.
8       PCI Peer-to-Peer           Tests that devices on the same PCI and on
        Byte Mask test             different PCIs can communicate.
Table 2-5 PCI Motherboard Tests (B3050/B3052)
Test                             Diagnostic
Number  Test Name                Name          Description
1       PCEB                     pceb_diag     Tests the PCI to EISA bridge chip
2       ESC                      esc_diag      Tests the EISA system controller
3       8K NVRAM                 nvram_diag    Tests the NVRAM
4       Real-Time Clock          ds1287_diag   Tests the real-time clock chip
5       Keyboard and Mouse       i8242_diag    Tests the keyboard/mouse chip
6       Flash ROM                flash_diag    Dumps contents of flash ROM
7       Serial and Parallel      combo_diag    Tests COM ports 1 and 2, the
        Ports and Floppy                       parallel port, and the floppy
8       CD-ROM                   ncr810_diag   Tests the CD-ROM controller
For both IOD tests and PCI 0 and PCI 1 tests, trace and failure status is sent to the OCP. If
any of these tests fail, a warning is sent to the SRM console device after the console
prompt (or AlphaBIOS pop-up box). The LEDs on the system bus to PCI bus bridge
module are controlled by the diagnostics. If an LED is off, a failure occurred.
Console Device Determination
After the SROM and XSROM have completed their tasks, the SRM console program,
as it starts, determines where to send its power-up messages.
Figure 2-7 Console Device Determination Flowchart
Console device determination, starting at power-up/reset or at P00>>> init:
1. If the console environment variable = serial, enable COM port 1 and send messages as the system is powering up.
2. Otherwise, if the console environment variable = graphics, check for a VGA adapter on PCI0.
3. If a VGA adapter is on PCI0, VGA becomes the console device.
4. If no VGA adapter is on PCI0, enable COM port 1 and send messages as the system is powering up. A warning message is sent if a VGA adapter is seen on PCI 1.
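The console device decision in Figure 2-7 can be sketched as a small function. This is a hypothetical model for illustration only; the function name, argument names, and return values are assumptions, not SRM console internals.

```python
def console_device(console_envar: str, vga_on_pci0: bool,
                   vga_on_pci1: bool = False) -> tuple:
    """Return (device, warning) for the console environment variable.

    warning is True when messages fall back to COM1 and a VGA adapter
    is seen on PCI 1 instead of PCI 0.
    """
    if console_envar == "serial":
        return ("COM1", False)
    if console_envar == "graphics":
        if vga_on_pci0:
            return ("VGA", False)      # VGA becomes the console device
        return ("COM1", vga_on_pci1)   # fall back to COM1, maybe warn
    raise ValueError("console must be 'serial' or 'graphics'")
```

A system set to graphics with its VGA adapter installed on PCI 1 therefore falls back to COM1 and produces the warning.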
Console Device Options
The console device on a DIGITAL Server 7300/7300R series must be either a serial
terminal connected to COM1 off the server control module set at 9600 baud or a
graphics monitor off an adapter on PCI0. The console program must be AlphaBIOS.
During power-up, the SROM and the XSROM always send progress and error messages to
the OCP. Since the console environment variable is set to graphics, no messages are sent
to COM1.
Console power-up messages are sent to the graphics monitor console device, but SROM
and XSROM power-up messages are lost. No matter what the console environment
variable setting, each of the three programs sends messages to the control panel display.
Messages Sent By   To a Graphics Console Device Are
SROM               Lost
XSROM              Lost
SRM console        Sent to VGA
Console Power-Up Display
The last several lines of the power-up display appear on a graphics monitor,
and parts of it print to the control panel display.
Example 2-3 Power-Up Display
SROM V1.0 on cpu0
SROM V1.0 on cpu1
SROM V1.0 on cpu2
SROM V1.0 on cpu3
XSROM V1.0 on cpu2
XSROM V1.0 on cpu1
XSROM V1.0 on cpu3
XSROM V1.0 on cpu0
BCache testing complete on cpu2
BCache testing complete on cpu0
BCache testing complete on cpu3
BCache testing complete on cpu1
mem_pair0 - 128 MB
mem_pair1 - 128 MB
20..20..21..20..21..20..21..21..23..24..24..24..24..
Memory testing complete on cpu0
Memory testing complete on cpu1
Memory testing complete on cpu3
Memory testing complete on cpu2
At power-up or reset, the SROM code on each CPU module is loaded into that
module’s I-cache and tests the module. If all tests pass, the processor’s LED lights.
If any test fails, the LED remains off and power-up testing terminates on that CPU.
The first determination of the primary processor is made, and the primary processor
executes a loopback test to each PCI bridge. If this test passes, the bridge LED
lights. If it fails, the LED remains off and power-up continues. The EISA system
controller, PCI-to-EISA bridge, COM1 port, and control panel port are all
initialized thereafter.
Each CPU prints an SROM banner to the device attached to the COM1 port and to
the control panel display. (The banner prints to the COM1 port if the console
environment variable is set to serial. If it is set to graphics, nothing prints to the
console terminal, only to the control panel display, until the console starts.)
Each processor's S-cache is initialized, and the XSROM code in the FEPROM on
the PCI 0 is unloaded into them. (If the unload is not successful, a copy is unloaded
from a different FEPROM sector. If the second try fails, the CPU hangs.)
Each processor jumps to the XSROM code and sends an XSROM banner to the
COM1 port and to the control panel display.
The three S-cache banks on each processor are enabled, and then the
B-cache is tested. If a failure occurs, a message is sent to the COM1 port and to the
control panel display.
Each CPU sends a B-cache completion message to COM1.
The primary CPU is again determined, and it sizes memory by reading memory
registers on the I2C bus.
The information on memory pairs is sent to COM1. If an illegal memory
configuration is detected, a warning message is sent to COM1 and the control panel
display.
Memory is initialized and tested, and the test trace is sent to COM1 and the control
panel display. Each CPU participates in the memory testing. The numbers for
tests 20 and 21 might appear interspersed. This is normal behavior. Test 24 can
take several minutes if the memory is very large. The message “P0 TEST 24
MEM**” is displayed on the control panel display; the second asterisk rotates to
indicate that testing is continuing. If a failure occurs, a message is sent to the
COM1 port and to the control panel display.
Each CPU sends a test completion message to COM1.
Example 2-3 Power-Up Display (Continued)
starting console on CPU 0
sizing memory
0 128 MB SYNC
1 128 MB SYNC
starting console on CPU 1
starting console on CPU 2
starting console on CPU 3
probing IOD1 hose 1
bus 0 slot 1 - NCR 53C810
bus 0 slot 2 - DECchip 21041-AA
bus 0 slot 3 - NCR 53C810
bus 0 slot 4 - DECchip 21040-AA
probing IOD0 hose 0
bus 0 slot 1 - PCEB
Configuring I/O adapters...
DIGITAL Server 7300 Console V1.0, 13-MAR-1997 18:18:26
P00>>>
The final primary CPU determination is made. The primary CPU unloads PALcode
and decompression code from the FEPROM on the PCI 0 to its B-cache. The
primary CPU then jumps to the PALcode to start the SRM console.
The primary CPU prints a message indicating that it is running the console. Starting
with this message, the power-up display is printed to the default console terminal,
regardless of the state of the console environment variable. (If console is set to
graphics, the display from here to the end is saved in a memory buffer and printed
to the graphics monitor after the PCI buses are sized and the graphics device is
initialized.)
The size and type of each memory pair is determined.
The console is started on each of the secondary CPUs. A status message prints for
each CPU.
The PCI bridges (indicated as IODn) are probed and the devices are reported. I/O
adapters are configured.
The SRM console banner and prompt are printed. (The SRM prompt is shown in
this manual as P00>>>. It can, however, be P01>>>, P02>>>, or P03>>>. The
number indicates the primary processor.)
When the os_type environment variable is set to nt (as it must be on the DIGITAL
Server 7300/7300R series), the SRM console loads and starts the AlphaBIOS
console and does not print the SRM banner or prompt.
Fail-Safe Loader
The fail-safe loader is a software routine that loads the SRM console image from
floppy. Once the console is running, you will want to run LFU to update FEPROM 0
with a new image.
NOTE: FEPROM 0 contains images of the SROM, XSROM,
decompression, and SRM console code.
If the fail-safe loader loads, the following conditions exist on the machine:
• The SROM has passed its tests and successfully unloaded the XSROM. If the SROM
fails to unload both copies of XSROM, it reports the failure to the control panel
display and COM1 if possible, and the system hangs.
• The XSROM reports the errors encountered and loads the fail-safe loader.
3
Troubleshooting
This chapter describes troubleshooting during power-up and booting, as well as diagnostics
for DIGITAL Server 7300/7300R series systems. The chapter covers the following topics:
• Troubleshooting with LEDs
• Troubleshooting Power Problems
• Troubleshooting with the Maintenance Bus (I2C Bus)
• Running Diagnostics — Test Command
• Testing an Entire System
Troubleshooting with LEDs
During power-up, reset, initialization, or testing, diagnostics are run on CPUs,
memories, bridge modules, PCI motherboards, and sometimes options. The following
sections describe possible problems that can be identified by checking LEDs.
Figure 3-1 CPU and Bridge Module LEDs
(Figure labels. Bridge module LEDs (IOD 0 & 1): IOD0 Self-Test Pass, IOD0 Self-Test Pass, POWER_FAN_OK, TEMP_OK. CPU LEDs: DC_OK, SROM Oscillator, CPU Self-Test Pass, Regulator OK (EV56).)
Processor (CPU) LEDs
If the CPU STP LED on any processor (CPU) module is lit, that CPU chip is functioning
properly. If the CPU STP LED is off, that CPU may or may not be functioning.
You can use the Halt button on the OCP to prevent the AlphaBIOS console (which turns
off the CPU STP LED) from booting, thus assuring the validity of the CPU STP LED. If
the LED is off, replace the CPU. If the LED is lit, you can use the SRM console command
alphabios to load and run the AlphaBIOS console.
The top LED on a CPU module is a DC OK LED. It is driven by the PCM module. If it is
not lit, there are probably power problems.
The second from the top LED on a CPU lights only when the SROM on the CPU is loaded.
On modules with EV56 CPU processors a fourth LED is present at the bottom of the
column. The LED is normally on indicating that the power regulator on the module is
working properly. If the LED is off, replace the module.
System Bus to PCI Bus Bridge Module LEDs (B3040-AA)
There are four LEDs on the B3040-AA system bus to PCI bus bridge module:
The top two LEDs indicate the condition of the bridge module. If either is off, the module
should be replaced.
The bottom two LEDs are passed from the PCM. Both should be on during normal
operation. If either is off while the system is on, the LEDs on the PCM module should
indicate what failed. If they do not, the PCM could be broken or the bridge module is not
passing the signals to the LEDs.
NOTE: If AC power is applied and the system is off and a power supply
is in operation, the power LED, the top one of the bottom two,
flashes, indicating the presence of Vaux (auxiliary voltage).
Cabinet Power and Fan LEDs
Figure 3-2 shows the cabinet power and fan LEDs.
Figure 3-2 Cabinet Power and Fan LEDs
(Figure labels: Fan LED, Power LED.)
A cabinet system has three exhaust fans at the top of the cabinet. They are powered from
a small power supply in the fan tray. This power supply also powers the server control
module at the bottom of the PCI card cage to allow remote access to the system. A failure
of the power supply is indicated only by the LEDs. No messages are displayed.
There are two LEDs on the top panel: a fan LED and a power LED.
When the fan LED (amber) is flashing, a cabinet fan needs replacing. Look to see which
fan appears broken (either not functioning at all or turning slower than the others).
When the power LED (green) is off, either the power supply in the fan tray is broken or
there is a power problem.
Troubleshooting Power Problems
Power problems can occur before the system is up or while the system is running. If a
system stops running, make a habit of checking the PCM.
Power Problem List
The system will halt for the following:
1. A CPU fan failure
2. A system fan failure
3. An overtemperature condition
4. Power supplied out of tolerance
5. Circuit breaker(s) tripped
6. AC problem
7. Interlock switch activation or failure
8. PCM failure
9. Environmental electrical failure or unrecoverable system fault
with auto_action ev = halt or boot
10. Operator error - failure to unplug all power supplies and letting
Vaux drain (10 sec delay) before restarting
11. Cable failure
12. Module failure - System motherboard, PCI motherboard, or system
bus to PCI bus bridge
13. SCM breaking the interlock circuit
Indications of failure:
1. Power control module LEDs indicate CPU fan, system fan,
overtemperature, and power supply failures
2. Circuit breaker(s) tripped
No obvious indications for failures 7 - 13 from the power system.
If Halt Is Caused by Power, Fan, or Over-Temperature Problems
If a system is stopped because of a power, fan, or over-temperature problem, use the PCM
LEDs to diagnose the problem.
If Power Problem Occurs at Power-Up
If the system has a power problem on a cold start, the PCM LEDs are not valid until after
DCOK_SENSE has been asserted. The cause is one of the following:
• Broken system fan
• Broken CPU fan
• Power supplied to the system is out of tolerance (a power supply could be broken and
the system could still power up)
• PCM failure
• Interlock failure
• Wire problems
• Temperature problem (unlikely)
Recommended Order for Troubleshooting Failure at Power-Up
1. Check to see if any CPU fan or system fan is not spinning. Fans can fail by not
spinning and/or not putting out the tachometer output necessary as input to the PCM
comparator that checks the fans. (See steps 4 and 5.) Replace broken fan.
2. Replace the PCM.
3. Sequentially remove CPUs and try to power up after you remove a CPU. If the
system powers up, the last CPU you removed had a fan failure.
4. Check the output of the power supplies. See the section “Power Supply” in Chapter 4
for locations of +5 and +3.43 volt output pins. If the output is above or below the
threshold, replace the faulty power supply.
5. Check the output of each system fan with a voltmeter. Probe the middle of three
outputs of the fans with the positive lead of the meter and ground the other probe.
The meter should read 2.5 volts to 3 volts. If a fan’s output is out of this range,
replace the fan.
NOTE: You will have to disable the interlocks to check the voltages in
step 5. You will have only 10 seconds to measure them. There
is a 10-second delay before the PCM turns off the power.
The PCM must sense a change in Vaux (auxiliary voltage) to
start the power supplies. Pressing the On button has no effect
if the machine halted because of a failure in the power system.
The power supplies must be unplugged and plugged back in for
the On button to work.
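When recording the voltmeter readings from step 5, the acceptable range can be applied mechanically. The sketch below is a hypothetical bookkeeping aid: the function name and the fan labels are made up for illustration, and only the 2.5 V to 3 V range comes from the procedure.

```python
FAN_OUTPUT_MIN_V = 2.5   # lower limit from step 5
FAN_OUTPUT_MAX_V = 3.0   # upper limit from step 5


def fans_to_replace(readings: dict) -> list:
    """Return the fans whose measured output is outside 2.5 V to 3 V."""
    return [
        fan for fan, volts in readings.items()
        if not FAN_OUTPUT_MIN_V <= volts <= FAN_OUTPUT_MAX_V
    ]
```

With readings of 2.7 V, 0.0 V, and 3.0 V for three fans, only the fan reading 0.0 V is returned for replacement.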
Power Control Module LEDs
The PCM has 11 LEDs visible through the system card cage. The LED display shows
the relative placement of the LEDs.
Figure 3-3 PCM LEDs
(Figure labels, in the order shown: DCOK_SENSE, PS0_OK, PS1_OK, PS2_OK, TEMP_OK, CPUFAN_OK, SYSFAN_OK, CS_FAN0, CS_FAN1, CS_FAN2, C_FAN3. Annotations: Normally On; Tested at one-second intervals; Off if power supply not present or broken.)
Table 3-1 Power Control Module LED States
LED          State  Description
DCOK_SENSE   On     Both +5.0V and +3.43V are present and within limits.
PS0_OK       On     Power supply 0 is present and has asserted POK_H.
PS1_OK       On     Power supply 1 is present and has asserted POK_H.
             Off    Power supply 1 not present.
PS2_OK       On     Power supply 2 is present and has asserted POK_H.
             Off    Power supply 2 not present.
TEMP_OK      On     The system temperature is below 55° C.
CPUFAN_OK    On     All CPU fans are OK.
             Off    A CPU fan has failed. The specific fan is identified by the
                    CS_FANx or C_FAN3 LED that remains lit.
SYSFAN_OK    On     All system fans are OK.
             Off    A system fan has failed. The specific fan is identified by the
                    CS_FANx that remains lit.
CS_FAN0      On     CPU fan 0 and system fan 0 are being sampled or one of
                    them has failed as indicated by CPUFAN_OK and
                    SYSFAN_OK.
             Off    CPU fan 0 and system fan 0 are not being sampled and are
                    functioning properly.
CS_FAN1      On     CPU fan 1 and system fan 1 are being sampled or one of
                    them has failed as indicated by CPUFAN_OK and
                    SYSFAN_OK.
             Off    CPU fan 1 and system fan 1 are not being sampled and are
                    functioning properly.
CS_FAN2      On     CPU fan 2 and system fan 2 are being sampled or one of
                    them has failed as indicated by CPUFAN_OK and
                    SYSFAN_OK.
             Off    CPU fan 2 and system fan 2 are not being sampled and are
                    functioning properly.
C_FAN3       On     CPU fan 3 is being sampled or has failed as indicated by
                    CPUFAN_OK and SYSFAN_OK.
             Off    CPU fan 3 and system fan 3 are not being sampled and
                    are functioning properly.
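The fan rows of Table 3-1 combine two pieces of information: CPUFAN_OK or SYSFAN_OK going off says which kind of fan failed, and the fan LED that remains lit says which position. The sketch below models that reading. It is a hypothetical aid, and it assumes the input lists the LEDs observed as steadily lit after the halt (True) or off (False), not a snapshot taken during normal one-second sampling.

```python
FAN_LEDS = ("CS_FAN0", "CS_FAN1", "CS_FAN2", "C_FAN3")


def failed_fans(leds: dict) -> list:
    """Name the failed fans from steady PCM LED states (True = lit)."""
    lit = [name for name in FAN_LEDS if leds.get(name, False)]
    findings = []
    if not leds.get("CPUFAN_OK", True):
        # Any fan LED that remains lit can identify a CPU fan.
        findings += [f"CPU fan {name[-1]}" for name in lit]
    if not leds.get("SYSFAN_OK", True):
        # Only the CS_FANx LEDs identify system fans.
        findings += [f"system fan {name[-1]}"
                     for name in lit if name.startswith("CS_")]
    return findings
```

For example, CPUFAN_OK off with CS_FAN1 remaining lit points to CPU fan 1.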
Troubleshooting with the Maintenance Bus (I2C Bus)
The I2C bus (referred to as the “I squared C bus”) is a small internal maintenance bus
used to monitor system conditions scanned by the power control module, write the
fault display, store error state, and track configuration information in the system.
Although all system modules (not I/O modules) sit on the maintenance bus, only the
I2C controller accesses it. Everything written or read on the I2C bus is done by the
controller.
Figure 3-4 I2C Bus Block Diagram
(Figure labels: CPU 0 through CPU 3, memory pairs, IOD 0 and PCI 0, IOD 1 and PCI 1, motherboard, PCM registers, OCP controller, I2C bus, I2C bus controller, XBUS, EISA.)
Monitoring System Conditions
The I2C bus monitors the state of system conditions scanned by the PCM. There are two
registers on the PCM:
One records the state of the fans and power supplies and is latched when there is a fault.
The other causes an interrupt on the I2C bus when a CPU or system fan fails, an
overtemperature condition exists, or power supplied to the system is out of tolerance.
The interrupt received by the I2C bus controller on PCI 0 alerts the system of imminent
power shutdown. The controller has 30 seconds to read the two registers and store the
information in the EEPROM on the PCM. The SRM console command show power reads
these registers.
Displaying Faults
The OCP display is written through the I2C bus.
Writing Error States
Error state is written and read for power conditions. The state of the Halt button (in/out) is
read on the I2C bus.
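As a rough illustration of the two PCM registers described under Monitoring System
Conditions, the Python sketch below models one register that freezes its contents when a
fault is seen and one that raises an interrupt when its bits change. The class name, the bit
names, and the bit positions are hypothetical; this manual does not give the register layout.

```python
# Hypothetical model of the two PCM registers. Bit positions are assumptions,
# not the real register layout.
FAN_FAULT = 0x01    # assumed bit: a CPU or system fan has failed
OVERTEMP = 0x02     # assumed bit: overtemperature condition exists
POWER_FAULT = 0x04  # assumed bit: supplied power is out of tolerance


class PcmRegisters:
    def __init__(self):
        self.state = 0                  # fan and power supply state register
        self.is_latched = False         # True once a fault has frozen the state register
        self.interrupt = 0              # register whose bit changes raise an interrupt
        self.interrupt_pending = False

    def update(self, conditions):
        """Apply one sample of condition bits as scanned by the PCM."""
        # The state register follows the inputs until a fault latches it.
        if not self.is_latched:
            self.state = conditions
            if conditions != 0:
                self.is_latched = True
        # The interrupt register signals the controller whenever its bits change.
        if conditions != self.interrupt:
            self.interrupt = conditions
            self.interrupt_pending = True

    def read_both(self):
        """Model the I2C controller reading both registers after an interrupt."""
        self.interrupt_pending = False
        return self.state, self.interrupt
```

In this model a fan fault that later clears still shows in the latched register, which is
the behavior a service technician would rely on when reading the stored state afterward.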
Tracking Configurations
Each CPU, PCI bridge, PCI motherboard, and system motherboard has an EEPROM that
contains information about the module that can be written and read over the I2C bus. All
modules contain the following information:
• Module type
• Module serial number
• Hardware revision
• Firmware revision
• Memory size (only required for memory modules)
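The EEPROM fields listed above can be pictured as a simple record. The Python sketch below
is only an illustration: the class name, the field types, the "memory" type string, and the
use of megabytes for memory size are assumptions, since this manual does not describe the
EEPROM encoding.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModuleRecord:
    """Hypothetical in-memory view of one module's EEPROM contents."""
    module_type: str
    serial_number: str
    hardware_revision: str
    firmware_revision: str
    memory_size_mb: Optional[int] = None  # only required for memory modules

    def validate(self):
        """Return a list of problems; an empty list means the record is complete."""
        problems = []
        for name in ("module_type", "serial_number",
                     "hardware_revision", "firmware_revision"):
            if not getattr(self, name):
                problems.append(f"missing {name}")
        # Assumption: memory modules are identified by the type string "memory".
        if self.module_type == "memory" and self.memory_size_mb is None:
            problems.append("missing memory_size_mb")
        return problems
```

A record for a CPU module validates without a memory size, while a memory module record
without one is reported as incomplete.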
Running Diagnostics — Test Command
The test command runs diagnostics on the entire system, CPU devices, memory
devices, and the PCI I/O subsystem. The test command runs only from the SRM
console. Ctrl/C stops the test.
Example 3-1 Test Command Syntax
P00>>> help test
FUNCTION
SYNOPSIS
    test ([-q] [-t <time>] [option]
        where option is:
            cpun
            memn
            pcin
        and n can be one of 0, 1, 2, 3, or *.
    The entire system is tested by default if no option specified.
NOTE: Switch from AlphaBIOS to the SRM console to enter the test
command. From the AlphaBIOS console, press in the Halt
button (the LED will light) and reset the system.
test [-t time] [-q] [option]
-t time   Specifies the run time in seconds. The default for system test is 600 seconds
          (10 minutes).
-q        Disables the display of status messages as exerciser processes are started and
          stopped during testing.
option    Either cpun, memn, or pcin, where n is 0, 1, 2, 3, or *. If nothing is
          specified, the entire system is tested.
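The syntax above can be checked mechanically before a command is typed at the console. The
Python sketch below assembles a test command line from the documented pieces. The function
name build_test_command and its parameters are hypothetical helpers, not part of the SRM
console, and the rejection of non-positive run times is an assumption.

```python
# Hypothetical helper that builds an SRM "test" command line from the
# documented syntax: test [-t time] [-q] [option]
VALID_PREFIXES = ("cpu", "mem", "pci")
VALID_UNITS = ("0", "1", "2", "3", "*")


def build_test_command(option=None, seconds=None, quiet=False):
    parts = ["test"]
    if seconds is not None:
        # Assumption: a run time must be a positive number of seconds.
        if seconds <= 0:
            raise ValueError("run time must be a positive number of seconds")
        parts += ["-t", str(seconds)]
    if quiet:
        parts.append("-q")
    if option is not None:
        # An option is cpun, memn, or pcin, where n is 0, 1, 2, 3, or *.
        prefix, unit = option[:3], option[3:]
        if prefix not in VALID_PREFIXES or unit not in VALID_UNITS:
            raise ValueError(f"invalid option: {option!r}")
        parts.append(option)
    return " ".join(parts)


print(build_test_command("cpu0", seconds=120, quiet=True))
```

With no arguments the helper returns the bare test command, which tests the entire system
for the default 600 seconds.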
Testing an Entire System
A test command with no modifiers runs all exercisers for subsystems and devices on
the system. I/O devices tested are supported boot devices. The test runs for 10
minutes.
Example 3-2 Sample Test Command
P00>>> test
Console is in diagnostic mode
System test, runtime 600 seconds
Type ^C to stop testing
Configuring system..
polling ncr0 (NCR 53C810) slot 1, bus 0 PCI, hose 1 SCSI Bus ID 7
dka500.5.0.1.1 DKa500 RRD45 1645
polling ncr1 (NCR 53C810) slot 3, bus 0 PCI, hose 1 SCSI Bus ID 7
Starting background memory test, affinity to all CPUs..
Starting processor/cache thrasher on each CPU..
Starting processor/cache thrasher on each CPU..
Starting processor/cache thrasher on each CPU..
Starting processor/cache thrasher on each CPU..
Testing SCSI disks (read-only)
No CD/ROM present, skipping embedded SCSI test
Testing other SCSI devices (read-only)..
Testing floppy drive (dva0, read-only)
ID Program Device Pass Hard/Soft Bytes Written Bytes Read
+5.0V is sensed on all CPUs in the system, the system bus motherboard, and the PCI
bus motherboard(s).
+3.43V is sensed on all CPUs in the system and the system bus motherboard.
• Current share on +5.0V, +3.43V, and +12V.
• 1% regulation on +3.43V.
• Fault protection (latched). If a fault is detected by the power supply, it will shut
down. The faults detected are:
Overvoltage
Overcurrent
Power overload
• DC_ENABLE_L input signal starts the DC outputs.
• POK_H output signal indicates that the power supply is operating properly.
Power System
Power Control Module Features
The power control module (54-24117-01) is located behind the B3040-AA module, the
system bus to PCI bus bridge module.
Figure 4-2 Power Control Module
[Figure labels: System Motherboard, Power Control Module Slot]
The power control module performs the following functions:
• Controls the power-up/down sequencing.
• Monitors the combined output of power supplies VDD (3.43V) and VCC (5.0V). Asserts
DCOK_SENSE if both voltages are within range; if either is not, asserts
POWER_FAULT_L, causing an immediate power shutdown.
• Monitors system temperature and asserts TEMP_FAIL if the temperature exceeds 55°C.
• Monitors CPU and system drawer fans. Asserts CPUFAN_OK if all CPU fans are
functioning properly and SYSTEM_FAN_OK if the drawer cooling fans are
functioning properly; otherwise it asserts FAN_FAULT_L. Each fan is checked at
1-second intervals.
• Powers down the system 30 seconds after detecting TEMP_FAIL, or the absence of
CPUFAN_OK, or the absence of SYSTEM_FAN_OK by asserting
POWER_FAULT_L.
• Provides visual indication of faults through LEDs.
• Has two registers, one that generates interrupts when bits change, and one that latches
errors but does not generate interrupts.
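The shutdown rules in the list above can be summarized as a small decision function. The
Python sketch below is a simplified model, not the PCM logic itself: the function name
pcm_action and its return values are hypothetical, and the voltage check is reduced to a
single true/false input because this manual does not state the tolerance limits.

```python
TEMP_LIMIT_C = 55      # from the list above: TEMP_FAIL if temperature exceeds 55 C
SHUTDOWN_DELAY_S = 30  # from the list above: power-down 30 seconds after detection


def pcm_action(voltages_in_range, temperature_c, cpufan_ok, system_fan_ok):
    """Return (action, delay_seconds) for one sample of PCM inputs."""
    if not voltages_in_range:
        # VDD or VCC out of range: POWER_FAULT_L causes an immediate shutdown.
        return ("shutdown", 0)
    temp_fail = temperature_c > TEMP_LIMIT_C
    if temp_fail or not cpufan_ok or not system_fan_ok:
        # TEMP_FAIL, or absence of CPUFAN_OK or SYSTEM_FAN_OK: delayed shutdown.
        return ("shutdown", SHUTDOWN_DELAY_S)
    return ("run", None)


print(pcm_action(True, 40, True, True))
print(pcm_action(True, 60, True, True))
```

A voltage fault is checked first because it is the only condition in the list that shuts
the system down without the 30-second delay.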
Power Circuit and Cover Interlocks
Figure 4-3 is a diagram of the power circuit. Note that B305n in the diagram stands
for either the B3050-AA or B3052-AA PCI Motherboard.
Figure 4-3 Power Circuit Diagram
[Figure labels: OCP Logic, OCP Switch, SCM, PCM, Power Supply, Motherboard, Cover Interlocks,
B305n, B3040. Signals: RSM_DC_EN_L, DC_ENABLE_L, POWER_FAULT_L. Part numbers shown:
17-04201-02, 17-04217-01, 17-04201-01 or 17-04302-01, 17-04196-01, 70-32016-01]