Compaq ProLiant 800 Maintenance And Service Manual

Page 1

ProLiant 800 Servers

Supporting Pentium II Processors and 100 MHz System Bus Maintenance and Service Guide
First Edition (September 1998) Part Number 320984-001 Spare Part Number 320981-001 Compaq Computer Corporation
Page 2

Notice

COMPAQ COMPUTER CORPORATION SHALL NOT BE LIABLE FOR TECHNICAL OR EDITORIAL ERRORS OR OMISSIONS CONTAINED HEREIN, NOR FOR INCIDENTAL OR CONSEQUENTIAL DAMAGES RESULTING FROM THE FURNISHING, PERFORMANCE, OR USE OF THIS MATERIAL. THIS INFORMATION IS PROVIDED “AS IS” AND COMPAQ COMPUTER CORPORATION DISCLAIMS ANY WARRANTIES, EXPRESS, IMPLIED OR STATUTORY AND EXPRESSLY DISCLAIMS THE IMPLIED WARRANTIES OF MERCHANTABILITY, FITNESS FOR PARTICULAR PURPOSE, GOOD TITLE AND AGAINST INFRINGEMENT.
This publication contains information protected by copyright. No part of this publication may be photocopied or reproduced in any form without prior written consent from Compaq Computer Corporation.
© 1998 Compaq Computer Corporation. All rights reserved. Printed in the U.S.A.
The software described in this guide is furnished under a license agreement or nondisclosure agreement. The software may be used or copied only in accordance with the terms of the agreement.
Compaq, Deskpro, Fastart, Compaq Insight Manager, Systempro, Systempro/LT, ProLiant, ROMPaq, QVision, SmartStart, NetFlex, QuickFind, PaqFax, and ProSignia are registered with the United States Patent and Trademark Office.
Netelligent, Systempro/XL, SoftPaq, QuickBlank, and QuickLock are trademarks and/or service marks of Compaq Computer Corporation.
Microsoft, MS-DOS, Windows, and Windows NT are registered trademarks of Microsoft Corporation.
Other product names mentioned herein may be trademarks and/or registered trademarks of their respective companies.
Page 3

Contents

About This Guide
Symbols in Text.........................................................................................................................vii
Compaq Technician Notes.........................................................................................................vii
Where to Go for Additional Help .............................................................................................viii
Telephone Numbers...........................................................................................................viii
Chapter 1
Illustrated Parts Catalog
Mechanical Parts Exploded View.............................................................................................1-1
System Components Exploded View........................................................................................1-2
Spare Parts List.........................................................................................................................1-3
Chapter 2
Removal and Replacement Procedures
Electrostatic Discharge Information .........................................................................................2-1
Symbols in Equipment..............................................................................................................2-2
Preparation Procedures.............................................................................................................2-2
Server Warnings and Precautions......................................................................................2-3
Front Bezel Door......................................................................................................................2-4
Left Side Access Panel ............................................................................................................. 2-5
Feet...........................................................................................................................................2-6
U-Channel Access Panel...........................................................................................................2-7
Top Access Panel .....................................................................................................................2-8
Removable Media and Mass Storage Devices.......................................................................... 2-9
Non-Hot-Plug Drive Cage ............................................................................................... 2-10
IDE CD-ROM Drive .......................................................................................................2-11
1.44-MB Diskette Drive..................................................................................................2-12
Cable Folding and Routing Diagrams.....................................................................................2-13
Riser Board with Tray ............................................................................................................2-15
I/O Fan Assembly...................................................................................................................2-17
Processor Fan..........................................................................................................................2-18
Power Switch and Cable Assembly ........................................................................................ 2-19
Processor .........................................................................................................................2-20
Processor Retention Bracket............................................................................................2-21
Processor Power Module ................................................................................................. 2-22
Memory........................................................................................................................... 2-23
Power Supply..........................................................................................................................2-26
External Replacement Battery................................................................................................2-27
Compaq ProLiant 800 Servers Maintenance and Service Guide
Page 4
Chapter 3
Diagnostic Tools
Default Configuration...............................................................................................................3-2
Default Configuration Messages .......................................................................................3-2
Utilities Access.........................................................................................................................3-2
Running Compaq Utilities ................................................................................................. 3-2
Power-On Self-Test (POST).....................................................................................................3-4
POST Error Messages .......................................................................................................3-4
Diagnostics Software.............................................................................................................. 3-17
Running Diagnostics........................................................................................................3-18
Primary Processor Test Error Codes................................................................................ 3-19
Memory Test Error Codes............................................................................................... 3-20
Keyboard Test Error Codes ............................................................................................. 3-21
Parallel Printer Test Error Codes.....................................................................................3-21
Video Display Unit Test Error Codes..............................................................................3-22
Diskette Drive Test Error Codes .....................................................................................3-23
Monochrome Video Board Test Error Codes ..................................................................3-23
Serial Test Error Codes ...................................................................................................3-24
Modem Communications Test Error Codes ....................................................................3-24
Hard Drive Test Error Codes...........................................................................................3-25
Tape Drive Test Error Codes...........................................................................................3-26
Advanced VGA Board Test Error Codes ........................................................................3-27
NetFlex-2 Controller Test Error Codes ...........................................................................3-29
Compaq Network Interface Boards Test Error Codes .....................................................3-30
SCSI Hard Drive Test Error Codes .................................................................................3-31
SCSI/IDE CD-ROM Drive Test Error Codes .................................................................. 3-31
SCSI Tape Drive Test Error Codes .................................................................................3-32
Server Manager/R Board Test Error Codes..................................................................... 3-32
Pointing Device Interface Test Error Codes ....................................................................3-33
Drive Array Advanced Diagnostics (DAAD)......................................................................... 3-34
Starting DAAD................................................................................................................3-35
DAAD Diagnostic Messages........................................................................................... 3-36
Integrated Management Log...................................................................................................3-46
Multiple Ways of Viewing the Log ................................................................................. 3-46
Event List ........................................................................................................................3-48
Event Messages...............................................................................................................3-48
Rapid Recovery Services ........................................................................................................3-50
Automatic Server Recovery-2.........................................................................................3-50
Server Health Logs.......................................................................................................... 3-59
Storage Fault Recovery Tracking ....................................................................................3-63
Storage Automatic Reconstruction .................................................................................. 3-63
Network Interface Fault Recovery Tracking ...................................................................3-63
Memory Fault Recovery Tracking...................................................................................3-63
Remote Service Features ........................................................................................................3-64
ROMPaq.................................................................................................................................3-65
Compaq Insight Manager .......................................................................................................3-65
Features of Compaq Insight Management.......................................................................3-65
Compaq Insight Management Software Architecture......................................................3-66
Page 5
Chapter 4
Connectors, Switches, Jumpers, and LEDs
Connectors................................................................................................................................4-1
Rear Panel Connectors ......................................................................................................4-1
System Board Connectors .................................................................................................4-2
Riser Board Connector ......................................................................................................4-3
Switches....................................................................................................................................4-4
System Maintenance Switch..............................................................................................4-4
Bus/Core Frequency Switch ..............................................................................................4-5
Jumpers.....................................................................................................................................4-7
LEDs.........................................................................................................................................4-8
Power LEDs ......................................................................................................................4-8
Network Interface Controller LEDs ..................................................................................4-9
Chapter 5
Physical, Operating, and Performance Specifications
System Unit ..............................................................................................................................5-2
Power Supply............................................................................................................................5-3
SDRAM Dual Inline Memory Module (DIMM) ...................................................................... 5-4
1.44-MB Diskette Drive ...........................................................................................................5-4
IDE CD-ROM Drive ................................................................................................................5-5
Diskette Drive Cable ................................................................................................................5-6
IDE CD-ROM Drive Cable......................................................................................................5-6
Wide SCSI Cable......................................................................................................................5-7
SCSI Drive Power Cable..........................................................................................................5-7
Index
Page 6

About This Guide

This Maintenance and Service Guide is a troubleshooting guide that can be used for reference when servicing Compaq ProLiant 800 Servers.
IMPORTANT: The installation of options and servicing of this product shall be performed by individuals that are knowledgeable of the procedures, precautions, and hazards associated with equipment containing hazardous energy circuits.

Symbols in Text

These symbols may be found in the text of this guide. They have the following meanings.
WARNING: To reduce the risk of personal injury from electrical shock and hazardous energy levels, only authorized service technicians should attempt to repair this equipment. Improper repairs could create conditions that are hazardous.
WARNING: Indicates that failure to follow directions in the warning could result in bodily harm or loss of life.
CAUTION: Indicates that failure to follow directions could result in damage to equipment or loss of information.
IMPORTANT: Presents clarifying information or specific instructions.
NOTE: Presents commentary, sidelights, or interesting points of information.

Compaq Technician Notes

WARNING: Only authorized technicians trained by Compaq should attempt to repair this equipment. All troubleshooting and repair procedures are detailed to allow only subassembly/module level repair. Because of the complexity of the individual boards and subassemblies, no one should attempt to make repairs at the component level or to make modifications to any printed wiring board. Improper repairs can create a safety hazard. Any indications of component replacement or printed wiring board modifications may void any warranty.
WARNING: To reduce the risk of personal injury from electrical shock and hazardous energy levels, do not exceed the level of repair specified in these procedures. Because of the complexity of the individual boards and subassemblies, do not attempt to make repairs at the component level or to make modifications to any printed wiring board. Improper repairs could create conditions that are hazardous.
Page 7
WARNING: To reduce the risk of electric shock or damage to the equipment:
If the system has multiple power supplies, disconnect power from the system by unplugging all power cords from the power supplies.
Do not disable the power cord grounding plug. The grounding plug is an important safety feature.
Plug the power cord into a grounded (earthed) electrical outlet that is easily accessible at all times.
CAUTION: To properly ventilate your system, you must provide at least 12 inches (30.5 cm) of clearance at the front and back of the computer.
CAUTION: The computer is designed to be electrically grounded. To ensure proper operation, plug the AC power cord into a properly grounded AC outlet only.

Where to Go for Additional Help

In addition to this guide, the following information sources are available:
User Documentation
Compaq Service Quick Reference Guide
Service Training Guides
Compaq Service Advisories and Bulletins
Compaq QuickFind
Compaq Insight Manager
Compaq Download Facility: Call 1-281-518-1418

Telephone Numbers

For the name of your nearest Compaq Authorized Reseller:
In the United States, call 1-800-345-1518
In Canada, call 1-800-263-5868
For Compaq technical support:
In the United States and Canada, call 1-800-386-2172
For Compaq technical support phone numbers outside the United States and Canada, visit the Compaq website at:
http://www.compaq.com
Page 8
Chapter 1
Illustrated Parts Catalog
This chapter provides the illustrated parts breakdown and a spare parts list for Compaq ProLiant 800 Servers. See Table 1-1 for the names of referenced spare parts.

Mechanical Parts Exploded View

Figure 1-1. ProLiant 800 Servers mechanical parts exploded view
Page 9

System Components Exploded View

Figure 1-2. ProLiant 800 Servers system components exploded view
Page 10

Spare Parts List

Table 1-1
Spare Parts List
Ref Description Spare Part #
CHASSIS
1 Chassis 320979-001
2 Panel Kit 320973-001
a) U-Channel Access Panel
b) Left Side Access Panel
c) Top Access Panel
3 Feet 333575-001
4 Front Bezel 298012-001
5 Drive Cage 320975-001
SYSTEM COMPONENTS
6 Power Supply, 325W 320976-001
7 External Replacement Battery (4.5 V) 160274-001
8 Power Switch and Cable Assembly 320974-001
9 Processor with Heat Sink 350/100 313623-001
10 Processor with Heat Sink 400/100 313624-001 *
11 Processor with Heat Sink 450/100 179780-001 *
12 Processor Power Module 327660-001
13 Processor Retention Bracket 333575-001
BOARDS
14 System Board 320978-001
15 Riser Board with Tray 320977-001
FANS
16 I/O Fan Assembly 327308-001
17 Processor Fan With Bracket 326873-001
MEMORY
18 64-MB DIMM (SDRAM, Reg 100 MHz) 317745-001
MASS STORAGE DEVICES
19 1.44-MB Diskette Drive 160788-201
20 IDE CD-ROM Drive 328369-001
Continued
Page 11
Spare Parts List Continued
Ref Description Spare Part #
CABLES
21 Data Cable Kit 386559-001 *
a) Wide SCSI cable
b) Diskette drive cable
c) 40-position SCSI cable
22 Adapter Cable 186423-001 *
MISCELLANEOUS
23 Return Kit 298017-001 *
24 Carton and Buns (International) 298017-002 *
25 Maintenance and Service Guide 320981-001 *
26 Illustrated Parts Map 320980-001 *
OPTIONS
27 32-MB DIMM (SDRAM, Reg 100 MHz) 317747-001 *
28 128-MB DIMM (SDRAM, Reg 100 MHz) 317756-001 *
29 256-MB DIMM (SDRAM, Reg 100 MHz) 317749-001 *
30 9.1-GB Non-Pluggable Wide Ultra 1.6-inch Hard Drive 199886-001 *
31 9.1-GB Non-Pluggable Fast-SCSI-2 Hard Drive 199885-001 *
32 4.3-GB Non-Pluggable Wide Ultra 1-inch Hard Drive 242606-001 *
33 4.3-GB Non-Pluggable Fast-Wide SCSI-2 Hard Drive 199599-001 *
34 4.3-GB Non-Pluggable Fast-SCSI-2 Hard Drive 199585-001 *
35 4.3-GB Wide Ultra SCSI Hard Drive (7,200 rpm) 339514-001 *
36 4.3-GB Wide Ultra SCSI Hard Drive (10,000 rpm) 336383-001 *
37 9.1-GB Wide Ultra SCSI Hard Drive 339515-001 *
38 18.2-GB Wide Ultra SCSI Hard Drive 336385-001 *
* Not Shown
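When ordering from the list above, it can help to keep the reference numbers and spare part numbers in a small lookup structure. The following is a minimal sketch in Python, not a Compaq tool: the names SPARE_PARTS, PART_NUMBER, and lookup are hypothetical, only a few rows of Table 1-1 are transcribed, and the six-digit, hyphen, three-digit pattern is simply what every entry in this list happens to follow, not a documented numbering rule.

```python
import re

# A few rows transcribed from Table 1-1:
# ref -> (description, spare part number, shown in the exploded views)
SPARE_PARTS = {
    1: ("Chassis", "320979-001", True),
    3: ("Feet", "333575-001", True),
    6: ("Power Supply, 325W", "320976-001", True),
    14: ("System Board", "320978-001", True),
    18: ("64-MB DIMM (SDRAM, Reg 100 MHz)", "317745-001", True),
    22: ("Adapter Cable", "186423-001", False),
}

# Assumption: part numbers look like NNNNNN-NNN, as observed in the list.
PART_NUMBER = re.compile(r"^\d{6}-\d{3}$")


def lookup(ref):
    """Return one table row as text, marking parts not shown with '*'."""
    description, part_number, shown = SPARE_PARTS[ref]
    if not PART_NUMBER.match(part_number):
        raise ValueError(f"unexpected part number format: {part_number}")
    marker = "" if shown else " *"
    return f"{ref} {description} {part_number}{marker}"


if __name__ == "__main__":
    print(lookup(6))
    print(lookup(22))
```

The format check only guards against transcription slips such as a dropped digit; it does not confirm that a part number is valid or orderable.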
Page 12
Chapter 2
Removal and Replacement Procedures
This chapter provides subassembly/module-level removal and replacement procedures for Compaq ProLiant 800 Servers. After completing all necessary removal and replacement procedures, run the Diagnostics program to verify that all components operate properly.
To service Compaq ProLiant 800 Servers, you might need the following:
Torx T-15 screwdriver
From the Compaq SmartStart and Support Software CD:
System Configuration Utility software
Drive Array Advanced Diagnostics software
Diagnostics software

Electrostatic Discharge Information

A discharge of static electricity can damage static-sensitive devices or microcircuitry. Proper packaging and grounding techniques are necessary precautions to prevent damage. To prevent electrostatic damage, observe the following precautions:
Transport products in static-safe containers such as conductive tubes, bags, or boxes.
Keep electrostatic-sensitive parts in their containers until they arrive at static-free stations.
Cover work stations with approved static-dissipating material. Provide a wrist strap connected to the work surface and properly grounded tools and equipment.
Keep work area free of non-conductive materials such as ordinary plastic assembly aids and foam packing.
Make sure you are always properly grounded when touching a static-sensitive component or assembly.
Avoid touching pins, leads, or circuitry.
Always place drives PCB assembly side down.
Use conductive field service tools.
Page 13

Symbols in Equipment

WARNING: Any surface or area of the equipment marked with these symbols indicates the presence of a hot surface or hot component. If this surface is contacted, the potential for injury exists. To reduce the risk of injury from a hot component, allow the surface to cool before touching.
WARNING: Any surface or area of the equipment marked with these symbols indicates the presence of electrical shock hazards. The enclosed area contains no operator serviceable parts. To reduce the risk of injury from electrical shock hazards, do not open this enclosure.
WARNING: Any RJ-45 receptacle marked with these symbols indicates a Network Interface Connection. To reduce the risk of electrical shock, fire, or damage to the equipment, do not plug telephone or telecommunications connectors into this receptacle.
CLASS 1 LASER PRODUCT

Preparation Procedures

Before beginning any of the removal and replacement procedures for non-hot-plug devices:
1. Turn off the server.
2. Disconnect the AC power cord from the AC outlet, then from the server.
3. Disconnect all external peripheral devices from the server.
4. For some removal and replacement procedures, you must remove the server from the rack and place it on a sturdy table or workbench. Refer to the ProLiant 800 Servers Supporting Pentium II Processors and 100 MHz System Bus Setup and Installation Guide for instructions.
CAUTION: Electrostatic discharge can damage electronic components. Be sure you are properly grounded before beginning any installation procedure. See the section titled “Electrostatic Discharge Information” in this chapter for more information.
WARNING: This label or equivalent is located on the surface of your CD-ROM drive. This label indicates that the product is classified as a CLASS 1 LASER PRODUCT.
Page 14

Server Warnings and Precautions

WARNING: To reduce the risk of personal injury from hot surfaces, allow the internal system components to cool before touching.
WARNING: To reduce the risk of electric shock or damage to the equipment:
Do not disable the power cord grounding plug. The grounding plug is an important safety feature.
Plug the power cord into a grounded (earthed) electrical outlet that is easily accessible at all times.
Disconnect power from the server by unplugging the power cord from either the electrical outlet or the server.
CAUTION: Protect the server from power fluctuations and temporary interruptions with a regulating uninterruptible power supply (UPS). This device protects the hardware from damage caused by power surges and voltage spikes and keeps the system in operation during a power failure.
CAUTION: Compaq ProLiant 800 Servers must always be operated with the system unit cover on. Proper cooling will not be achieved if the system unit cover is removed.
Page 15

Front Bezel Door

The front bezel door is removed for replacement or to convert the unit from tower to rack. To access the front of the server, open the front bezel door.
To remove the front bezel door:
1. Unlock the front bezel door.
2. Open the front bezel door.
3. Lift up the front bezel door, then pull it away from the chassis.
Figure 2-1. Removing the front bezel door
Reverse steps 1 through 3 to replace the front bezel door.
Page 16

Left Side Access Panel

Remove the left side access panel to access the IDE CD-ROM drive, power supply, riser board with tray, power switch and cable assembly, and 1.44-MB diskette drive cables.
To remove the left side access panel:
WARNING: To reduce the risk of personal injury from hot surfaces, allow the internal system components to cool before touching them.
NOTE: The illustration below shows the tower model. Procedures are the same for the rack-mountable model when removed from the rack.
1. Perform the preparation procedures. See “Preparation Procedures” earlier in this chapter.
2. Unlock and open the front bezel door.
3. Loosen the two thumbscrews securing the left side access panel to the front of the chassis.
4. Slide the left side access panel back, then pull it away from the chassis.
Figure 2-2. Removing the left side access panel
Reverse steps 1 through 4 to replace the left side access panel.
Page 17

Feet

The feet are removed for replacement, to replace the U-channel access panel, or to convert the unit from tower to rack.
To remove the feet from the chassis, one at a time:
1. Perform the preparation procedures. See “Preparation Procedures” earlier in this chapter.
2. Remove the front bezel door. See “Front Bezel Door” earlier in this chapter.
3. Place the server on its left side.
4. Remove the T-15 screw from each foot.
5. Pivot each foot down, then pull it off the base of the chassis.
Figure 2-3. Removing the feet from the chassis
Reverse steps 1 through 5 to replace the feet. Make sure each foot snaps securely into its position.
Page 18

U-Channel Access Panel

The U-channel access panel is removed only for replacement. It does not need to be removed to access any other parts.
To remove the U-channel access panel:
1. Perform the preparation procedures. See “Preparation Procedures” earlier in this chapter.
2. Remove the front bezel door. See “Front Bezel Door” earlier in this chapter.
3. Remove the feet on the base of the U-channel access panel. See “Feet” earlier in this chapter.
4. Remove the two T-15 screws securing the U-channel access panel to the front of the chassis.
5. Pull the U-channel access panel back, then away from the chassis.
Figure 2-4. Removing the U-channel access panel
Reverse steps 1 through 5 to replace the U-channel access panel.
Page 19

Top Access Panel

Remove the top access panel to access PCI and ISA boards, IDE CD-ROM drive cables, the I/O fan with bracket, and the power switch.
WARNING: To reduce the risk of personal injury from hot surfaces, allow the internal system components to cool before touching them.
To remove the top access panel:
1. Perform the preparation procedures. See “Preparation Procedures” earlier in this chapter.
2. Unlock and open the front bezel door.
3. Loosen the thumbscrew securing the top access panel to the chassis.
4. Slide the top access panel back, then away from the chassis.
Figure 2-5. Removing the top access panel
Reverse steps 1 through 4 to replace the top access panel.
Page 20

Removable Media and Mass Storage Devices

Compaq ProLiant 800 Servers ship standard with four removable media and four mass storage device bays. The removable media bays contain a one-third height, 1.44-MB diskette drive, a one-half height IDE CD-ROM drive, and two open bays. The open bays may be used for a second CD-ROM drive, tape drives, hard drives, or any SCSI device. The four mass storage bays can contain 1-inch or 1.6-inch non-hot-plug drives. Figure 2-6 and Table 2-1 depict the standard drive configuration.
Figure 2-6. Removable media and mass storage device bays
Table 2-1
Description of Removable Media and Mass Storage Device Bays
Drive Position Configuration
0-3 Hard Drive Bays
4-5 Removable Media Bays
A IDE CD-ROM Drive
B 1.44-MB Diskette Drive
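The bay labels in Table 2-1 can be captured in a short lookup when recording a server's drive layout. This is an illustrative Python sketch only; the function name describe_bay is hypothetical, and the mapping is exactly the four rows of the table above.

```python
def describe_bay(position):
    """Map a drive position label from Table 2-1 to its standard configuration."""
    key = str(position).strip().upper()
    if key == "A":
        return "IDE CD-ROM Drive"
    if key == "B":
        return "1.44-MB Diskette Drive"
    if key.isdigit():
        number = int(key)
        if 0 <= number <= 3:
            return "Hard Drive Bay"
        if 4 <= number <= 5:
            return "Removable Media Bay"
    # Anything else is not a position listed in Table 2-1.
    raise ValueError(f"unknown drive position: {position!r}")


if __name__ == "__main__":
    for label in (0, 3, 4, "A", "B"):
        print(label, "->", describe_bay(label))
```

Positions 4 and 5 are the two open removable media bays; A and B are the bays occupied by the factory-installed CD-ROM and diskette drives.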
Page 21

Non-Hot-Plug Drive Cage

The non-hot-plug drive cage is removed for replacement or to access the non-hot-plug hard drives.
To remove the non-hot-plug drive cage:
1. Perform the preparation procedures. See “Preparation Procedures” earlier in this chapter.
2. Unlock and remove the front bezel door. See “Front Bezel Door” earlier in this chapter.
3. Remove the left side access panel. See “Left Side Access Panel” earlier in this chapter.
4. Disconnect all cables from any installed drives in the drive cage.
5. Remove the four T-15 screws that secure the drive cage to the chassis.
6. Slide the drive cage out the front of the chassis.
Figure 2-7. Removing the non-hot-plug drive cage
Reverse steps 1 through 6 to replace the non-hot-plug drive cage.
CAUTION: Make sure that all power and signal cables to the non-hot-plug drive cage have been reseated properly.
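Several procedures in this chapter begin by referring back to earlier ones, so the full teardown order for a part is spread across sections. The sketch below, in Python, resolves that order for the procedures described so far. It is an illustration only: PREREQUISITES and removal_order are hypothetical names, and the table lists only procedures that a section explicitly says to perform or remove first (steps that merely open the front bezel door are noted in comments, not modeled).

```python
# Procedure -> procedures its section says to complete first.
PREREQUISITES = {
    "Preparation Procedures": [],
    "Front Bezel Door": [],
    # Also requires unlocking and opening (not removing) the front bezel door.
    "Left Side Access Panel": ["Preparation Procedures"],
    "Feet": ["Preparation Procedures", "Front Bezel Door"],
    "U-Channel Access Panel": ["Preparation Procedures", "Front Bezel Door", "Feet"],
    # Also requires unlocking and opening (not removing) the front bezel door.
    "Top Access Panel": ["Preparation Procedures"],
    "Non-Hot-Plug Drive Cage": [
        "Preparation Procedures",
        "Front Bezel Door",
        "Left Side Access Panel",
    ],
}


def removal_order(target, done=None):
    """Return the procedures to perform, prerequisites first, ending with target."""
    if done is None:
        done = []
    for prerequisite in PREREQUISITES[target]:
        removal_order(prerequisite, done)
    if target not in done:
        done.append(target)
    return done


if __name__ == "__main__":
    for step, name in enumerate(removal_order("Non-Hot-Plug Drive Cage"), start=1):
        print(f"{step}. {name}")
```

Reassembly follows the same list in reverse, which matches the "Reverse steps" instruction at the end of each procedure.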
Page 22

IDE CD-ROM Drive

The IDE CD-ROM drive is removed for replacement or to reconfigure the drives in the removable media area.
To remove the IDE CD-ROM drive:
1. Perform the preparation procedures. See “Preparation Procedures” earlier in this chapter.
2. Remove the left side access panel. See “Left Side Access Panel” earlier in this chapter.
3. Remove the two T-15 screws and washers from the front of the drive.
4. Disconnect all cables from the CD-ROM drive.
5. Slide the CD-ROM drive out the front of the chassis.
Figure 2-8. Removing the IDE CD-ROM drive
Reverse steps 1 through 5 to replace the IDE CD-ROM drive.
Page 23

1.44-MB Diskette Drive

The 1.44-MB diskette drive is removed for replacement only.
To remove the 1.44-MB diskette drive:
1. Perform the preparation procedures. See Preparation Procedures earlier in this chapter.
2. Remove the top cover. See Top Cover earlier in this chapter.
3. Disconnect all cables from the diskette drive.
4. Remove the two T-15 screws and washers from the front of the drive.
5. Slide the diskette drive out the front of the chassis.
Figure 2-9. Removing the 1.44-MB diskette drive
Reverse steps 1 through 5 to replace the 1.44-MB diskette drive.
Page 24

Cable Folding and Routing Diagrams

Figure 2-10. IDE CD-ROM drive cable
Figure 2-11. 1.44-MB diskette drive cable
Page 25
Figure 2-12. Hard drive cage cable (Wide SCSI cable)
Page 26

Riser Board with Tray

The riser board with tray seats any installed PCI and ISA boards. Figure 2-13 depicts the layout of the PCI and PCI/ISA slots on the riser board.
Figure 2-13. Riser board expansion board slots
Table 2-2
Riser Board Expansion Board Slot Descriptions
Slot Description
1 PCI/ISA expansion board slots
2 PCI expansion board slots
Page 27
The riser board with tray is removed for replacement and to access the expansion board slots or the riser board connectors.
To remove the riser board with tray:
1. Perform the preparation procedures. See Preparation Procedures earlier in this chapter.
2. Remove the top access panel. See Top Access Panel earlier in this chapter.
3. Disconnect all cables from the expansion boards.
4. Remove the side access panel. See Side Access Panel earlier in this chapter.
5. Remove any installed boards. Place them on a non-conductive work surface.
6. Remove the retaining screws.
7. Slide the riser board out the side of the chassis.
Figure 2-14. Removing the riser board
Reverse steps 1 through 7 to replace the riser board with tray. Reinstall each board you removed into the same slot from which it was removed.
Page 28

I/O Fan Assembly

The I/O fan assembly is removed for replacement only.
To remove the I/O fan assembly:
1. Perform the preparation procedures. See Preparation Procedures earlier in this chapter.
2. Remove the top cover. See Top Cover earlier in this chapter.
3. Loosen the single thumbscrew securing the I/O fan assembly to the chassis.
4. Tilt the top of the fan assembly forward, then away from the chassis.
5. Disconnect the fan assembly power cable.
Figure 2-15. Removing the I/O fan assembly
Reverse steps 1 through 5 to replace the I/O fan assembly.
Page 29

Processor Fan

The processor fan is removed for replacement only.
To remove the processor fan:
1. Perform the preparation procedures. See Preparation Procedures earlier in this chapter.
2. Remove the left side access panel. See Left Side Access Panel earlier in this chapter.
3. Remove the four T-15 screws securing the processor fan to the chassis.
4. Pull the processor fan forward slightly, then out the side of the server.
Figure 2-16. Removing the processor fan
Reverse steps 1 through 4 to replace the processor fan.
Page 30

Power Switch and Cable Assembly

The power switch and cable assembly is removed for replacement only.
To remove the power switch and cable assembly:
WARNING: Any surface or area of the equipment marked with these symbols indicates the presence of electrical shock hazards. The enclosed area contains no operator-serviceable parts. To reduce the risk of injury from electrical shock hazards, do not open this enclosure.
1. Perform the preparation procedures. See Preparation Procedures earlier in this chapter.
2. Remove the top cover. See Top Cover earlier in this chapter.
3. Disconnect all cables from the power switch and cable assembly.
4. Remove the single T-15 screw that secures the power switch and cable assembly to the chassis.
5. Slide the power switch back, then lift it out of the chassis.
Figure 2-17. Removing the power switch and cable assembly
Reverse steps 1 through 5 to replace the power switch and cable assembly.
Page 31

Processor

The processor is removed for replacement or for the replacement of the system board.
To remove the processor:
1. Perform the preparation procedures. See Preparation Procedures earlier in this chapter.
2. Remove the left side access panel. See Left Side Access Panel earlier in this chapter.
3. Push in the latches on each side of the processor until you hear two clicks. This locks the tabs in the open position.
4. Lift the processor from the system board.
Figure 2-18. Removing the processor
Reverse steps 1 through 4 to replace the processor.
Page 32

Processor Retention Bracket

The processor retention bracket is removed for replacement or to replace the system board.
To remove the processor retention bracket:
1. Perform the preparation procedures. See Preparation Procedures earlier in this chapter.
2. Remove the processor from the system board. See Processor earlier in this chapter.
3. Remove the four T-15 screws, then lift the processor retention bracket from the system board.
Figure 2-19. Removing the processor retention bracket.
Reverse steps 1 through 3 to replace the processor retention bracket.
Page 33

Processor Power Module

The processor power module is removed for replacement or to replace the system board.
To remove the processor power module:
1. Perform the preparation procedures. See Preparation Procedures earlier in this chapter.
2. Pull back the clips at each end of the processor power module.
3. Lift the processor power module from the system board.
Figure 2-20. Removing the processor power module
Reverse steps 1 through 3 to replace a processor power module. The clips on the processor power module will snap into a locked position automatically when the processor power module is properly seated in its slot.
Page 34

Memory

Compaq ProLiant 800 Servers ship standard with 64 MB (1 DIMM) of memory installed on the system board in socket 4. Memory is expandable to a maximum of 1 GB when using four 256-MB DIMMs.
Figure 2-21. SDRAM DIMM sockets on the system board
Page 35
The following guidelines MUST be followed when installing or replacing memory:
Use 100 MHz, 32-, 64-, 128-, or 256-MB, registered SDRAM DIMMs.
WARNING: Use only Compaq SDRAM DIMMs. SDRAM DIMMs from other sources may adversely affect data integrity. Power-On Self-Test (POST) will warn of non-supported SDRAM DIMMs.
To remove an SDRAM DIMM:
1. Perform the preparation procedures. See Preparation Procedures earlier in this chapter.
2. Push the levers at each end of the memory module.
3. Pull the module from the board.
Figure 2-22. Removing an SDRAM DIMM from the system board
Reverse steps 1 through 3 to replace the SDRAM DIMM. Memory modules can be installed in one way only. Match the notch on the module with the tab on the memory socket. Push the module down into the socket making sure that the module is inserted fully and seated properly.
Page 36
The recommended order of SDRAM DIMM installation is:
Second SDRAM DIMM in slot 2 (J15)
Third SDRAM DIMM in slot 3 (J16)
Fourth SDRAM DIMM in slot 4 (J19)
Any combination of SDRAM DIMMs can be used.
Table 2-3
Examples of SDRAM DIMM Upgrade Combinations
Total Memory Slot 1 Slot 2 Slot 3 Slot 4
64 MB 32 MB 32 MB
64 MB 64 MB
96 MB 64 MB 32 MB
128 MB 64 MB 64 MB
240 MB 64 MB 128 MB
256 MB 128 MB 128 MB 32 MB
256 MB 64 MB 64 MB 64 MB 64 MB
384 MB 64 MB 64 MB 128 MB 128 MB
512 MB 128 MB 128 MB 128 MB 128 MB
1 GB 256 MB 256 MB 256 MB 256 MB
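The limits stated in this section (four DIMM sockets, DIMM sizes of 32, 64, 128, or 256 MB, and a 1 GB maximum) can be checked with a short script before ordering or installing memory. The following is a minimal sketch only; the function name and the input format (a list of DIMM sizes in MB) are assumptions made for illustration and are not part of any Compaq utility.

```python
# Sketch: check a proposed DIMM population against the limits stated in this section.
# The names below are hypothetical and exist only for this example.

ALLOWED_DIMM_SIZES_MB = (32, 64, 128, 256)  # DIMM sizes listed in the guidelines
SOCKET_COUNT = 4                            # SDRAM DIMM sockets on the system board


def total_memory_mb(dimms_mb):
    """Return the total memory in MB for a list of DIMM sizes.

    Raises ValueError if the list uses more sockets than the board has
    or contains a DIMM size that the guidelines do not list.
    """
    if not 1 <= len(dimms_mb) <= SOCKET_COUNT:
        raise ValueError(f"expected 1 to {SOCKET_COUNT} DIMMs, got {len(dimms_mb)}")
    for size in dimms_mb:
        if size not in ALLOWED_DIMM_SIZES_MB:
            raise ValueError(f"unsupported DIMM size: {size} MB")
    # Four 256-MB DIMMs give 1024 MB, which is the stated 1 GB maximum.
    return sum(dimms_mb)


two_dimms = total_memory_mb([64, 64])                 # the 128 MB row of Table 2-3
fully_populated = total_memory_mb([256, 256, 256, 256])  # the 1 GB row of Table 2-3
```

Because any combination of supported DIMMs is allowed, the check only needs the socket count and the size list; it does not model slot order.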
Page 37

Power Supply

The power supply is removed for replacement only.
To remove the power supply:
1. Perform the preparation procedures. See Preparation Procedures earlier in this chapter.
2. Remove the left side access panel. See Left Side Access Panel earlier in this chapter.
3. Disconnect all cables from the power supply.
4. Remove the four T-15 screws securing the power supply to the back of the chassis.
5. Pull the power supply out the side of the chassis.
Figure 2-23. Removing the power supply
Reverse steps 1 through 5 to replace the power supply.
Page 38

External Replacement Battery

The external replacement battery is installed when the system board battery fails.
CAUTION: Do not remove the lithium battery from the system board or permanent damage may occur. If the battery fails, use the replacement battery.
To install the external replacement battery:
1. Perform the preparation procedures. See Preparation Procedures earlier in this chapter.
2. Remove the top cover. See Top Cover earlier in this chapter.
3. Remove the adhesive backing from the hook-and-loop fastener strip. Place the battery and the hook-and-loop fastener strip on the system board close to the lithium battery.
4. Connect the external replacement battery cable connector to battery header E2 on the system board. The connector should fit over pins 4, 5, and 7.
5. Move the jumper from pins 1 and 2 to pins 2 and 3.
6. Place the sticker included with your external replacement battery kit on the back of your server above the power connector.
Figure 2-24. Installing the replacement battery
NOTE: If an external replacement battery is not installed before the lithium battery fails, and CMOS/NVRAM is lost, run the System Configuration Utility.
Page 39
Chapter 3
Diagnostic Tools
This chapter describes software and firmware diagnostic tools available for all Compaq server products. The sections in this chapter are:
Default Configuration
Access to Compaq Utilities
Power-On Self-Test (POST)
Diagnostics Software
Drive Array Advanced Diagnostics (DAAD)
Integrated Management Log
Rapid Recovery Services
Remote Service Features
ROMPaq
Compaq Insight Manager
Page 40
3-2 Diagnostic Tools

Default Configuration

When the system is first powered on, the system ROM detects the un-configured state of the hardware and provides default configuration settings for most devices. By providing this initialization, the system can run Diagnostics and other software applications before running the normal SmartStart and System Configuration programs.

Default Configuration Messages

IMPORTANT: If you choose to format and partition your boot drive before running SmartStart and the System Configuration programs, this may prevent the creation of a System Partition and the off-line remote management features that it provides.
If you insert a System Configuration, Diagnostics, or SmartStart and Support Software CD in the CD-ROM drive before powering on the server, the system ROM boots to that utility. If the system ROM does not detect one of those CDs, you are prompted for your intended operating system. The system reboots if the new operating system selection changes any operating system-dependent configuration settings. If the settings for the selected operating system are the same as the current settings, the system boots normally. If you select the wrong operating system, you can change the selection on a subsequent reboot.
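The power-on behavior described above amounts to a small decision flow. The sketch below restates it in Python purely as a reading aid; the function name, the argument names, and the returned strings are hypothetical, and the code does not represent the actual system ROM logic.

```python
# Sketch of the power-on decision flow described in the text. All names are hypothetical.
RECOGNIZED_CDS = {
    "System Configuration",
    "Diagnostics",
    "SmartStart and Support Software",
}


def power_on_action(cd_title, os_settings_changed):
    """Return the action the text describes for a given power-on situation.

    cd_title: title of the CD in the CD-ROM drive, or None if the drive is empty.
    os_settings_changed: True if the operating system chosen at the prompt
        changes any operating system-dependent configuration settings.
    """
    if cd_title in RECOGNIZED_CDS:
        return "boot utility from CD"
    # No recognized CD: the user is prompted for the intended operating system.
    if os_settings_changed:
        return "reboot to apply new settings"
    return "boot normally"


with_cd = power_on_action("Diagnostics", os_settings_changed=False)
changed_os = power_on_action(None, os_settings_changed=True)
same_os = power_on_action(None, os_settings_changed=False)
```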

Utilities Access

The Compaq SmartStart and Support Software CD contains the SmartStart program and many of the Compaq utilities needed to maintain your system, including:
System Configuration Utility
Array Configuration Utility
Drive Array Advanced Diagnostics Utility
ROMPaq Firmware Upgrade Utilities
CAUTION: Do not select the Erase Utility when running the SmartStart and Support Software CD. This will result in data loss to the entire system.

Running Compaq Utilities

There are three ways to access Compaq Utilities:
Run the utilities on the system partition.
If the system was installed using SmartStart, the Compaq utilities will automatically be available on the system partition. The system partition could also have been created during a manual system installation.
Page 41
To run the utilities on the system partition, boot the system and press F10 when you see: “Press F10 for system partition utilities.” Then select the utilities from the menu.
System Configuration Utility is available under the System Configuration menu.
Array Configuration Utility is available under the System Configuration menu.
Drive Array Advanced Diagnostics Utility is available under the Diagnostics and Utilities menu.
ROMPaq Firmware Upgrade Utility is available under the Diagnostics and Utilities menu.
Run the utilities from diskette.
You can also run the utilities from their individual diskettes. If you have a utility diskette newer than the version on the SmartStart and Support Software CD, use that diskette.
You can also create a diskette version of the utility from the SmartStart and Support Software CD. To create diskette versions of the utilities from the CD:
1. Boot the Compaq SmartStart and Support Software CD.
2. From the Compaq System Utilities screen, select Create Support Software, then select Next.
3. Select the diskette you would like to create from the list, then follow the instructions on the screen.
Run the utilities from the Compaq SmartStart and Support Software CD.
IMPORTANT: Only the System Configuration Utility and the Array Configuration Utility can be executed from the Compaq SmartStart and Support Software CD. All other utilities must be executed from the system partition or from diskette.
To run these utilities directly from the Compaq SmartStart and Support Software CD:
1. Boot the Compaq SmartStart and Support Software CD.
2. From the Compaq System Utilities screen, select the utility you wish to run, then select Next.
To execute the System Configuration Utility, select Run System Configuration Utility.
To execute the Array Configuration Utility, select Run Array Configuration Utility.
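The three access methods differ in which utilities they can run. As a quick reference, the sketch below records that information in a small lookup table built from the text above; the dictionary, the source labels, and the helper function are illustrative names only, not part of any Compaq software.

```python
# Sketch: where each utility can be run from, according to this section.
# The structure and names are hypothetical and only summarize the text.
UTILITY_SOURCES = {
    "System Configuration Utility": {"system partition", "diskette", "CD"},
    "Array Configuration Utility": {"system partition", "diskette", "CD"},
    "Drive Array Advanced Diagnostics Utility": {"system partition", "diskette"},
    "ROMPaq Firmware Upgrade Utility": {"system partition", "diskette"},
}


def can_run_from(utility, source):
    """Return True if the section says the utility can be run from the given source."""
    return source in UTILITY_SOURCES[utility]


scu_from_cd = can_run_from("System Configuration Utility", "CD")
rompaq_from_cd = can_run_from("ROMPaq Firmware Upgrade Utility", "CD")
```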
Page 42

Power-On Self-Test (POST)

POST is a series of diagnostic tests that run automatically on Compaq computers when the system is turned on. POST checks the following assemblies to ensure that the computer system is functioning properly:
Processors
Keyboard
Power supply
System board
Memory
Memory expansion boards
Controllers
Diskette drives
Hard drives

POST Error Messages

If POST finds an error in the system, an error condition is indicated by an audible and/or visual message. If an error code displays on the screen during POST or after resetting the system, follow the instructions in Table 3-1. The error messages and codes listed in Table 3-1 include all codes generated by Compaq products. Your system generates only those codes that are applicable to your configuration and options.
In each case, the Recommended Action column lists the steps necessary to correct the problem. After completing each step, run the Diagnostics program to verify whether the error condition has been corrected. If the error code reappears, perform the next step, then run the Diagnostics program again. Follow this procedure until Diagnostics no longer detects an error condition.
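The step-by-step procedure in the preceding paragraph can be sketched as a simple loop: perform one recommended action, run Diagnostics, and stop as soon as the error no longer appears. The code below is only an illustration of that procedure; the function and its callable arguments are hypothetical stand-ins for manual service steps and for the Diagnostics program.

```python
# Sketch of the "one action, then re-run Diagnostics" procedure. Names are hypothetical.
def clear_post_error(recommended_actions, diagnostics_pass):
    """Work through recommended actions in order, running Diagnostics after each one.

    recommended_actions: list of zero-argument callables, one per listed step.
    diagnostics_pass: zero-argument callable that returns True when Diagnostics
        no longer detects the error condition.
    Returns the number of steps performed when the error clears,
    or None if the error persists after every step.
    """
    for step_number, action in enumerate(recommended_actions, start=1):
        action()
        if diagnostics_pass():
            return step_number
    return None


# Example: the simulated fault clears after the second recommended step.
state = {"fixed": False}
steps = [lambda: None, lambda: state.update(fixed=True), lambda: None]
steps_needed = clear_post_error(steps, lambda: state["fixed"])
```

Stopping at the first passing Diagnostics run avoids replacing assemblies that are not at fault.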
Table 3-1
POST Error Messages
Error Code | Audible Beeps (L=Long, S=Short) | Probable Source of Problem | Recommended Action
A Critical Error occurred prior to this power-up | None | A catastrophic system error, which caused the server to crash, has been logged. | Run Diagnostics. Replace failed assembly as indicated.
101-ROM Error | 1L, 1S | System ROM checksum. | Run Diagnostics. Replace failed assembly as indicated.
101-I/O ROM Error | None | Options ROM checksum. | Run Diagnostics. Replace failed assembly as indicated.
102-System Board Failure | None | DMA, timers, and so on. | Replace the system board. Run the Compaq System Configuration Utility.
104-ASR-2 Timer Failure | None | System board failure. | Run Diagnostics. Replace failed assembly as indicated.
Continued
Page 43
POST Error Messages Continued
Error Code | Audible Beeps (L=Long, S=Short) | Probable Source of Problem | Recommended Action
162-System Options Not Set | 2S | Configuration incorrect. | Run the System Configuration Utility and correct.
163-Time & Date Not Set | 2S | Invalid time or date in configuration memory. | Run the System Configuration Utility and correct.
170-Expansion Device Not Responding | None | EISA or PCI expansion board failure. | Check board for secure installation. Replace the failed board if necessary.
172-Configuration Nonvolatile Memory Invalid | None | Nonvolatile configuration corrupt or jumper installed. | Run the System Configuration Utility and correct.
172-1 Configuration Nonvolatile Memory Invalid | None | Nonvolatile configuration corrupt. | Run the System Configuration Utility and correct.
172-2 IRC Configuration Invalid | None | IRC enabled and video controller is in PCI slot on the secondary bus. | Move video controller to a PCI slot on the primary PCI bus.
173-Slot ID Mismatch | None | Board replaced, configuration not updated. | Run the System Configuration Utility and correct.
174-Configuration/Slot Mismatch Device Not Found | None | EISA or PCI board not found. | Run the System Configuration Utility and correct.
175-Configuration/Slot Mismatch Device Found | None | EISA or PCI board added, configuration not updated. | Run the System Configuration Utility and correct.
176-Slot with Not Readable ID Yields Valid ID | None | EISA or PCI board in slot that should contain an ISA board. | Run the System Configuration Utility and correct.
177-Configuration Not Complete | None | Incomplete System Configuration. | Run the System Configuration Utility and correct.
178-Processor Configuration Invalid | None | Processor type or step does not match configuration memory. | Run the System Configuration Utility and correct.
179-System Revision Mismatch | None | A board was installed that has a different revision date. | Run the System Configuration Utility and correct.
180-Log Reinitialized | None | N/A | N/A
Continued
Page 44
POST Error Messages Continued
Error Code | Audible Beeps (L=Long, S=Short) | Probable Source of Problem | Recommended Action
201-Memory Error | None | RAM failure. | Run Diagnostics. Replace failed assembly as indicated.
203-Memory Address Error | None | RAM failure. | Run Diagnostics. Replace failed assembly as indicated.
205-Cache Memory Error | None | Cache memory error. | Replace the processor board in the slot indicated.
205-Option Cache Memory Error | None | Option cache memory error. | Replace the option cache board.
206-Cache Controller Error | None | Cache controller failure. | Run Diagnostics. Replace failed assembly as indicated.
207-Invalid Memory Configuration - Check DIMM [SIMM] Installation | None | Memory module installed incorrectly. | Verify placement of memory modules.
208-Invalid Memory Speed - Check DIMM [SIMM] Installation | 1L, 1S | The speed of the memory is too slow, where: xx00 = expansion board SIMMs are too slow, or 00yy = system board SIMMs are too slow. xx and yy have corresponding bit set. | The speed of the memory modules must be 60 ns. Verify the speed of the memory modules installed and replace.
210-Invalid Memory Configuration Detected. System halted. | None | Maximum amount of memory exceeded. | Verify installed memory does not exceed 1 GB.
211-Cache Switch Set Incorrectly | None | Switch not set properly during installation or upgrade. | Verify switch settings.
212-System Processor Failed/Mapped out | 1S | Processor in slot x failed. | Run Diagnostics and replace failed processor.
213-Cache Size Error | None | Invalid optional cache size. | Replace cache with 256K cache.
213-System Processor Not Installed | 1S | System processor configured for slot indicated is missing. | Install processor in the slot indicated or run the System Configuration Utility to remove the processor from the .CFG file.
214-DC-DC Converter Failed | None | PowerSafe Module (DC-DC Converter) failed. | Run Diagnostics. Replace failed assembly as indicated.
Continued
Page 45
POST Error Messages Continued
Error Code | Audible Beeps (L=Long, S=Short) | Probable Source of Problem | Recommended Action
301-Keyboard Error | None | Keyboard failure. | Turn off the computer, then reconnect the keyboard.
301-Keyboard Error or Test Fixture Installed | None | Keyboard failure. | Replace the keyboard.
ZZ-301-Keyboard Error | None | Keyboard failure. (ZZ represents the Keyboard Scan Code.) | 1. A key is stuck. Try to free it. 2. Replace the keyboard.
303-Keyboard Controller Error | None | System board, keyboard, or mouse controller failure. | 1. Run Diagnostics. 2. Replace failed assembly as indicated.
304-Keyboard or System Unit Error | None | Keyboard, keyboard cable, or system board failure. | 1. Make sure the keyboard is attached. 2. Run Diagnostics to determine which is in error. 3. Replace the part indicated.
40X-Parallel Port X Address Assignment Conflict | 2S | Both external and internal ports are assigned to parallel port X. | Run the System Configuration Utility and correct.
402-Monochrome Adapter Failure | 1L, 2S | Monochrome display controller. | Replace the monochrome display controller.
501-Display Adapter Failure | 1L, 2S | Video display controller. | Replace the video board.
601-Diskette Controller Error | None | Diskette controller circuitry failure. | 1. Make sure the diskette drive cables are attached. 2. Replace the diskette drive and/or cable. 3. Replace the system board.
605-Diskette Drive Type Error | 2S | Mismatch in drive type. | Run the System Configuration Utility to set diskette type correctly.
702-A coprocessor has been detected that was not reported by CMOS | None | Installed coprocessor not configured. | Run the System Configuration Utility and correct.
Continued
Page 46
POST Error Messages Continued
Error Code | Audible Beeps (L=Long, S=Short) | Probable Source of Problem | Recommended Action
703-CMOS reports a coprocessor that has not been detected | 2S | Coprocessor or configuration error. | 1. Run the System Configuration Utility and correct. 2. Replace the coprocessor.
1151-Com Port 1 Address Assignment Conflict | 2S | Both external and internal serial ports are assigned to COM1. | Run the System Configuration Utility and correct.
1152-Com Port 2, 3, or 4 Address Assignment Conflict | 2S | Both external and internal serial ports are assigned to COM2, COM3 or COM4. | Run the System Configuration Utility and correct.
1600-Server Manager/R Failure | None | Server Manager/R board failure. Error code displays after error message. | Run Diagnostics. Replace failed assembly as indicated.
1610-Temperature violation detected. Waiting for system to cool | 2S | Ambient system temperature too hot. | Check fan in system environment.
1611-Fan [fan description] failure detected | 2S | Required fan has failed. | Check fans.
1611-Fan [fan description] not present | 2S | Fan not present. | Make sure fans are plugged in.
1612-Primary power supply failure | 2S | Primary power supply has failed. | Replace power supply as soon as possible.
1613-Low System Battery | None | Real time clock system battery is running low on power. | Run Diagnostics. Replace failed assembly as indicated.
1615-Power Supply Failure in Bay X | None | A power supply has failed. | Replace or check specified power supply.
1616-Power Supply Configuration Error | 2L, 2S | Single power supply system is installed in Bay 2 and not in Bay 1. | Move power supply from Bay 2 to Bay 1.
1701-SCSI Controller failure | None | A test on the Fast SCSI-2 Controller failed. | Run Diagnostics. Replace failed assembly as indicated.
Continued
Page 47
POST Error Messages Continued
Error Code | Audible Beeps (L=Long, S=Short) | Probable Source of Problem | Recommended Action
1702-SCSI cable error detected. System halted. | None | Incorrect cabling. | 1. For integrated SCSI Controllers, ensure that the internal connector has SCSI termination attached. 2. For option card SCSI controllers, ensure that only one of the two internal connectors has termination attached.
1703-SCSI cable error detected. Internal SCSI cable not attached to system board connector. System halted. | None | Incorrect cabling. | Ensure that the integrated SCSI controller has SCSI termination attached.
1704-Unsupported Virtual Mode Disk Operation. DOS Driver Required. System halted. | None | System attempted to perform a virtual mode disk operation without virtual mode memory services. | Use fixed-disk device driver that supports virtual mode memory services.
1705-Locked SCSI Bus Detected. System halted. | None | SCSI bus failure. | Run Diagnostics. Replace failed assembly as indicated.
1730-Fixed Disk 0 does not support DMA Mode. | None | Fixed disk drive error. | Run the System Configuration Utility and correct.
1731-Fixed Disk 1 does not support DMA Mode. | None | Fixed disk drive error. | Run the System Configuration Utility and correct.
1740-Fixed Disk 0 failed Set Block Mode command | None | Fixed disk drive error. | Run the System Configuration Utility and correct.
1741-Fixed Disk 1 failed Set Block Mode command | None | Fixed disk drive error. | Run the System Configuration Utility and correct.
1750-Fixed Disk 0 failed Identify command | None | Fixed disk drive error. | Run the System Configuration Utility and correct.
Continued
Page 48
POST Error Messages Continued
Error Code | Audible Beeps (L=Long, S=Short) | Probable Source of Problem | Recommended Action
1751-Fixed Disk 1 failed Identify command | None | Fixed disk drive error. | Run the System Configuration Utility and correct.
1760-Fixed Disk 0 does not support Block Mode | None | Fixed disk drive error. | Run the System Configuration Utility and correct.
1761-Fixed Disk 1 does not support Block Mode | None | Fixed disk drive error. | Run the System Configuration Utility and correct.
1764-Slot x Drive Array - Capacity Expansion Process is temporarily disabled (followed by one of the following): Expansion will resume when Array Accelerator has been reattached. Expansion will resume when Array Accelerator has been replaced. Expansion will resume when Array Accelerator RAM allocation is successful. Expansion will resume when Array Accelerator battery reaches full charge. Expansion will resume when automatic data recovery has been completed. | | | Reattach or replace Array Accelerator, wait until the Array Accelerator batteries have charged, or for Automatic Data Recovery to complete, as indicated.
1765-Slot x Drive Array Option ROM Appears to Conflict With an ISA Card. ISA cards with 16-bit memory cannot be configured in memory range C0000 to DFFFF along with the SMART-2/E 8-bit Option ROM due to EISA bus limitations. Please remove or reconfigure your ISA card. | | | Remove or reconfigure conflicting ISA cards. Disable “shared memory” on any ISA network cards that may be installed.
1766-Slot x Drive Array requires System ROM Upgrade. Run Systems ROMPaq Utility. | | | Run the latest Systems ROMPaq Utility to upgrade your System ROMs.
1767-Slot x Drive Array Option ROM is Not Programmed Correctly or may Conflict with the Memory Address Range of an ISA Card. Check the Memory Address Configuration of installed ISA Card(s) or run Options ROMPaq Utility to attempt SMART-2/E Option ROM Reprogramming. | | | Remove or reconfigure conflicting ISA cards, especially any cards that are not recognized by the System Configuration Utility. Try reprogramming the SMART-2/E Controller’s ROMs using the latest Options ROMPaq (version 2.29 or higher).
1768-Slot x Drive Array - Resuming logical drive expansion process. | None | SMART-2 Controller error | No action required. Appears whenever a controller reset or power cycle occurs while array expansion is in progress.
Continued
Page 49
POST Error Messages Continued
Error Code | Audible Beeps (L=Long, S=Short) | Probable Source of Problem | Recommended Action
1769-Slot x Drive Array - Drive(s) disabled due to failure during expand. Select F1 to continue with logical drives disabled. Select F2 to accept data loss and to re-enable logical drives. | None | SMART-2 Controller error. | Data has been lost while expanding the array, therefore the drives have been temporarily disabled. Press F2 to accept the data loss and re-enable the logical drives. Restore data from backup.
1771-Primary Disk Port Address Assignment Conflict | None | Internal and external hard drive controllers are both assigned to the primary address. | Run the System Configuration Utility and correct.
1772-Secondary Disk Port Address Assignment Conflict | None | Address Assignment Conflict. Internal and external hard drive controllers are both assigned to the secondary address. | Run the System Configuration Utility and correct.
1773-Primary Fixed Disk Port Assignment Conflict | None | Fixed disk drive error. | Run the System Configuration Utility and correct.
1774-Slot x Drive Array - Obsolete data found in Array Accelerator. Select F1 to discard contents of Array Accelerator. Select F2 to write contents of Array Accelerator to drives. | None | SMART-2 Controller error. | Data found in Array Accelerator is older than data found on drives. Press F1 to discard the older data in the Array Accelerator and retain the newer data on the drives.
1776-Drive Array - SCSI Port Termination Error | None | External and internal SCSI drives are both configured to Port 1. | Reconfigure drives.
1777-Drive Array External Drive Subsystem Error | None | Cooling fan failure, internal temperature alert or open side panel. | Inspect for cooling fan failure or open side panel.
Continued
Page 50
POST Error Messages Continued
Error Code | Audible Beeps (L=Long, S=Short) | Probable Source of Problem | Recommended Action
1778-Drive Array resuming Automatic Data Recovery process | None | This message appears whenever a controller reset or power cycle occurs while Automatic Data Recovery is in progress. | No action necessary.
1779-Drive Array Controller detects replacement drives | None | Intermittent drive failure and/or possible loss of data. | If this message appears and drive X has not been replaced, this indicates an intermittent drive failure. This message also appears once immediately following drive replacement whenever data must be restored from backup.
1780-Disk 0 Failure | None | Hard drive/format error. | Run Diagnostics. Replace failed assembly as indicated.
1781-Disk 1 Failure | None | Hard drive/format error. | Run Diagnostics. Replace failed assembly as indicated.
1782-Disk Controller Failure | None | Hard disk drive circuitry error. | Run Diagnostics. Replace failed assembly as indicated.
1784-Drive Array Drive Failure, Physical Drive | None | Defective drive and/or cables. | Check for loose cables. Replace defective drive X and/or cable(s).
1785-Drive Array not Configured | None | Configuration error. | Run the System Configuration Utility and correct.
1786-Drive Array Recovery Needed. The following drive(s) need Automatic Data Recovery: Drive X. Select "F1" to continue with recovery of data to drive(s). Select "F2" to continue without recovery of data to drive(s). | None | Interim Data Recovery mode. Data has not been recovered yet. | Press the F1 key to allow Automatic Data Recovery to begin. Data will automatically be restored to drive X now that the drive has been replaced or now seems to be working. -Or- Press the F2 key and the system will continue to operate in the Interim Data Recovery mode.
Continued
Page 51
POST Error Messages Continued

1787-Drive Array Operating in Interim Recovery Mode.
Physical drive replacement needed: Drive X
Audible Beeps: None
Probable Source of Problem: Hard drive X failed or cable is loose or defective. Following a system restart, this message reminds you that drive X is defective and fault tolerance is being used.
Recommended Action:
1. Replace drive X as soon as possible.
2. Check loose cables.
3. Replace defective cables.

*1788-Incorrect Drive Replaced: Drive X
Drive(s) were incorrectly replaced: Drive Y
Select "F1" to continue - drive array will remain disabled. Select "F2" to reset configuration - all data will be lost.
Audible Beeps: None
Probable Source of Problem: Drives are not installed in their original positions, so the drives have been disabled. See note below.
Recommended Action: Reinstall the drives correctly as indicated. Press F1 to restart the computer with the drive array disabled.
-Or-
Press F2 to use the drives as configured and lose all the data on them.
*NOTE: The 1788 error message might also be displayed inadvertently due to a bad power cable connection to the drive or by noise on the data cable. If this message was due to a bad power cable connection, but not because of an incorrect drive replacement, repair the connection and press F2.
-Or-
If this message was not due to a bad power cable connection, and no drive replacement took place, this could indicate noise on the data cable. Check cable for proper routing.
Page 52
POST Error Messages Continued

1789-Drive Not Responding, Physical Drive
Check cables or replace physical drive X.
Select "F1" to continue - drive array will remain disabled.
Select "F2" to fail drive(s) that are not responding - Interim Recovery Mode will be enabled if configured for fault tolerance.
Audible Beeps: None
Probable Source of Problem: Cable or hard drive failure.
Recommended Action:
1. Check the cable
2. Replace the cables.
3. Replace
4. If you do not want to

1790-Disk 0 Configuration Error
Audible Beeps: None
Probable Source of Problem: Hard drive error or wrong drive type.
Recommended Action: Run the System Configuration Utility and Diagnostics and correct.

1791-Disk 1 Error
Audible Beeps: None
Probable Source of Problem: Hard drive error or wrong drive type.
Recommended Action: Run the System Configuration Utility and Diagnostics and correct.

1792-Drive Array Reports Valid Data Found in Array Accelerator.
Data will automatically be written to drive array.
Audible Beeps: None
Probable Source of Problem: This indicates that while the system was in use, power was interrupted while data was in the Array Accelerator memory. Power was then restored within eight to ten days, and the data in the Array Accelerator was flushed to the drive array.
Recommended Action: No action necessary; no data has been lost. Perform orderly system shutdowns to avoid data remaining in the Array Accelerator.
Page 53
POST Error Messages Continued

1793-Drive Array - Array Accelerator Battery Depleted - Data Lost
(Error message 1794 also displays.)
Audible Beeps: None
Probable Source of Problem: This indicates that while the system was in use, power was interrupted while data was in the Array Accelerator memory. Array Accelerator batteries failed. Data in Array Accelerator has been lost.
Recommended Action: Power was not restored within eight to ten days. Perform orderly system shutdowns to avoid data remaining in the Array Accelerator.

1794-Drive Array - Array Accelerator Battery Charge Low. Array Accelerator is temporarily disabled. Array Accelerator will be re-enabled when battery reaches full charge.
Audible Beeps: None
Probable Source of Problem: This is a warning that the battery charge is below 75%. Posted-writes are disabled.
Recommended Action: Replace the Array Accelerator board if batteries do not recharge within 36 power-on hours.

1795-Drive Array - Array Accelerator Configuration Error.
Data does not correspond to this drive array. Array Accelerator is temporarily disabled.
Audible Beeps: None
Probable Source of Problem: This indicates that while the system was in use, power was interrupted while data was in the Array Accelerator memory. The data stored in the Array Accelerator does not correspond to this drive array.
Recommended Action: Match the Array Accelerator to the correct drive array, or run the System Configuration Utility to clear the data in the Array Accelerator.

1796-Drive Array - Array Accelerator Not Responding.
Array Accelerator is temporarily disabled.
Audible Beeps: None
Probable Source of Problem: Array Accelerator is defective or has been removed.
Recommended Action:
1. Check that the Array Accelerator is properly seated.
2. Run the System Configuration Utility to reconfigure the Compaq IDA-2 without the Array Accelerator.

1797-Drive Array - Array Accelerator Read Error Occurred. Data in Array Accelerator has been lost. Array Accelerator is disabled.
Audible Beeps: None
Probable Source of Problem: Hard parity error while reading data from posted-writes memory.
Recommended Action: Enable Array Accelerator.
Page 54
POST Error Messages Continued

1798-Drive Array - Array Accelerator Write Error Occurred.
Array Accelerator is disabled.
Audible Beeps: None
Probable Source of Problem: Hard parity error while writing data to posted-writes memory.
Recommended Action: Enable Array Accelerator.

1799-Drive Array - Drive(s) Disabled due to Array Accelerator Data Loss. Select "F1" to continue with logical drives disabled. Select "F2" to accept data loss and to re-enable logical drives.
Audible Beeps: None
Probable Source of Problem: Volume failed due to loss of data in posted-writes memory.
Recommended Action: Press F1 to continue with logical drives disabled or F2 to accept data loss and re-enable logical drive.

Beeps only: 2 Long + 2 Short
Audible Beeps: 2L, 2S (L=Long, S=Short)
Probable Source of Problem: Power is cycled. Temperature too hot. Processor fan not installed or spinning.
Recommended Action: Check fans.

(Run System Configuration Utility - F10 key)
Audible Beeps: None
Probable Source of Problem: A configuration error occurred during POST.
Recommended Action: Press F10 to run System Configuration Utility.

(RESUME - F1 KEY)
Audible Beeps: None
Probable Source of Problem: As indicated to continue.
Recommended Action: Press the F1 key.
Page 55

Diagnostics Software

Tables 3-2 through 3-20 include all test error codes generated by Compaq products. Each code has a corresponding description and recommended action(s). Your system generates only those codes that are applicable to your configuration and options.
When you select Diagnostics and Utilities from the System Configuration Utility main menu, the utility prompts you to test, inspect, upgrade, and diagnose the server.
Diagnostics and Utilities are located on the system partition on the hard drive and must be accessed when a system configuration error is detected during the Power-On Self-Test (POST). Compaq Diagnostics software is also available on the Compaq SmartStart and Support Software CD. You can create a Diagnostics diskette from the SmartStart and Support Software CD and run Diagnostics from diskette.
The following options are available from the Diagnostics and Utilities menu:
Test Computer
Inspect Computer
Upgrade Firmware
Remote Utilities
Diagnose Drive Array
Diagnostic error codes are generated when the Diagnostics software recognizes a problem. These error codes, listed in tables 3-2 through 3-20, help identify possible defective subassemblies.
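As a rough illustration of how the numbered series relate to subassemblies, the following Python sketch maps a Diagnostics error code to the series named in this chapter. The names SERIES, series_of, and subsystem_for are hypothetical, and rounding the code down to the nearest hundred is an assumption inferred from the codes listed in tables 3-2 through 3-20, not a rule stated by the Diagnostics software.

```python
import re

# Series numbers and subsystems as named in the section introductions of this chapter.
SERIES = {
    100: "processor and system board",
    200: "memory subsystem",
    300: "keyboard",
    400: "parallel printer interface",
    500: "video display unit",
    600: "diskette drive",
    800: "monochrome video board",
    1100: "serial interface",
    1200: "modem",
    1700: "hard drive",
    1900: "tape drive",
    2400: "Advanced VGA board",
    6000: "NetFlex-2 and network interface controllers",
    6500: "SCSI hard drive",
    6600: "SCSI/IDE CD-ROM drive",
    6700: "SCSI tape drive",
    7000: "Server Manager/R board",
    8600: "pointing device interface",
}

def series_of(code: str) -> int:
    """Return the series for a code such as '1701-xx', '7000-11', or '215'.

    Assumption: the series is the numeric part rounded down to a multiple of 100.
    """
    match = re.match(r"^\s*(\d+)(?:-\w+)?\s*$", code)
    if match is None:
        raise ValueError(f"not a Diagnostics error code: {code!r}")
    return int(match.group(1)) // 100 * 100

def subsystem_for(code: str) -> str:
    """Look up the subsystem for a code, or report that the series is not listed."""
    return SERIES.get(series_of(code), "unknown series")
```

For example, subsystem_for("1701-xx") looks up the 1700 series, and a code from a series not listed in this chapter falls back to "unknown series".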
In each case, the Recommended Action column lists the steps necessary to correct the problem. After completing each step, run the Diagnostics program to verify whether the error condition has been corrected. If the error code reappears, perform the next step, then run the Diagnostics program again. Follow this procedure until the Diagnostics program no longer detects an error condition.
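The step-and-retest procedure described above can be sketched as a simple loop. This is only an illustration: isolate_fault, apply_action, and error_persists are hypothetical names standing in for the technician's work and for a rerun of the Diagnostics program; they are not part of any Compaq utility.

```python
from typing import Callable, List, Sequence

def isolate_fault(
    actions: Sequence[str],
    apply_action: Callable[[str], None],
    error_persists: Callable[[], bool],
) -> List[str]:
    """Perform recommended actions in order, retesting after each one.

    Stops as soon as the retest no longer reports the error.
    Returns the list of actions that were actually performed.
    """
    performed: List[str] = []
    for action in actions:
        apply_action(action)        # carry out one step from the Recommended Action column
        performed.append(action)
        if not error_persists():    # rerun Diagnostics; stop if the error is gone
            break
    return performed
```

If the list is exhausted and the error still persists, the function simply returns every action; deciding what to do next is left to the caller.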
If you encounter an error condition, complete the following steps before starting problem isolation procedures:
1. Be certain proper ventilation exists. The computer should have approximately 12 inches (30.5 cm) clearance at the front and back of the system unit.
2. Turn off the computer and peripheral devices.
3. Disconnect any peripheral devices not required for testing. Do not disconnect the printer if you want to test it or use it to log error messages.
4. Turn on the computer.
5. Delete the power-on password, if set. You will know that the power-on password is set when a key icon appears on the screen when POST completes. If this occurs, you must enter the password to continue. To delete the password, type the current password, a forward slash ( / ), and press the Enter key.
Page 56
6. Disable the power-on password by using the Password Disable switch on the system board, if you do not have access to the password.
7. Install a loopback plug (Part Number 142054-001), when required by Diagnostics.
8. Run the latest version of Diagnostics.

Running Diagnostics

There are two ways to access the utilities:
From the System Partition.
From diskette. A diskette can be created from the SmartStart and Support Software CD.
To access the utilities from the system partition:
1. Reboot the server by pressing the Ctrl+Alt+Delete keys.
2. Press F10 when the following prompt appears at the top of the screen during POST.
Press “F10” for System Partition Utilities.
IMPORTANT: The text appears for only two seconds. If you do not press F10 during this time, you must reboot the server.
3. From the System Configuration Main Menu, select Diagnostics and Utilities.
If errors are detected in your Server Health Log, the Diagnostics Utility automatically displays the following screen message:
CAUTION: Errors have been detected in your Server Health Log. Diags will now identify your system hardware.
4. Press the Enter key to continue.
5. After a short pause, the Server Health Log menu displays with a list of system errors. If there is more than one error, press the Space Bar to select the error you want to correct. Press Enter.
6. The Diagnostics Utility prompts you and suggests corrective action.
Page 57

Primary Processor Test Error Codes

The 100 series of Diagnostic error codes identifies failures with processor and system board functions.
Table 3-2
Primary Processor Test Error Codes
101-xx: CPU test failed.
Recommended Action: Replace the processor board and retest.
103-xx: DMA page registers test failed.
104-xx: Interrupt controller master test failed.
105-xx: Port 61 error.
106-xx: Keyboard controller self-test failed.
Recommended Action: For error codes 103-xx through 106-xx, replace the processor board and retest.
107-xx: CMOS RAM test failed.
108-xx: CMOS interrupt test failed.
109-xx: CMOS clock load data test failed.
Recommended Action (the following steps apply to error codes 107-xx through 109-xx):
1. Replace the battery/clock module and retest.
2. Replace the system board and retest.
110-xx: Programmable timer load data test failed.
111-xx: Refresh detect test failed.
112-xx: Speed test slow mode out of range.
113-xx: Protected mode test failed.
Recommended Action: For error codes 110-xx through 113-xx, replace the system board and retest.
114-xx: Speaker test failed.
Recommended Action:
1. Verify the speaker connection and retest.
2. Replace the speaker and retest.
3. Replace the system board and retest.
116-xx: Cache test failed.
Recommended Action: Replace the system board and retest.
122-xx: Multiprocessor Dispatch test failed.
123-xx: Interprocessor Communication test failed.
Recommended Action (the following steps apply to error codes 122-xx through 123-xx):
1. Check the system configuration and retest.
2. Replace the processor board and retest.
3. Replace the system board and retest.
199-xx: Installed devices test failed.
Recommended Action:
1. Check the system configuration and retest.
2. Verify cable connections and retest.
3. Check switch and/or jumper settings and retest.
4. Run the Configuration utility and retest.
5. Replace the processor board and retest.
6. Replace the system board and retest.
Page 58

Memory Test Error Codes

The 200 series of Diagnostic error codes identifies failures with the memory subsystem.
Table 3-3
Memory Test Error Codes
200-xx: Invalid memory configuration.
Recommended Action: Reinsert memory modules in correct location and retest.
201-xx: Memory machine ID test failed.
202-xx: Memory system ROM checksum failed.
Recommended Action (the following steps apply to error codes 201-xx and 202-xx):
1. Replace the system ROM and retest.
2. Replace the processor board and retest.
3. Replace the memory expansion board and retest.
203-xx: Memory write/read test failed.
204-xx: Memory address test failed.
205-xx: Walking I/O test failed.
206-xx: Increment pattern test failed.
Recommended Action (the following steps apply to error codes 203-xx through 210-xx):
1. Replace the memory module and retest.
2. Replace the processor board and retest.
3. Replace the memory expansion board and retest.
207-xx: Invalid memory configuration - check DIMM installation. DIMMs installed have 8K refresh.
Recommended Action: Replace DIMMs.
208-xx: Invalid memory speed detected - check DIMM installation. Slow DIMMs may cause data loss.
Recommended Action: Replace DIMMs with timing greater than 60 ns.
210-xx: Random pattern test failed.
Recommended Action:
1. Replace the memory module and retest.
2. Replace the processor board and retest.
3. Replace the memory expansion board and retest.
215: Non-functioning DC-DC converter for processor X.
Recommended Action: Replace the DC-DC converter (processor power module).
Page 59

Keyboard Test Error Codes

The 300 series of Diagnostic error codes identifies failures with keyboard and system board functions.
Table 3-4
Keyboard Test Error Codes
301-xx: Keyboard short test, 8042 self-test failed.
302-xx: Keyboard long test failed.
303-xx: Keyboard LED test, 8042 self-test failed.
304-xx: Keyboard typematic test failed.
Recommended Action (the following steps apply to error codes 301-xx through 304-xx):
1. Check the keyboard connection. If disconnected, turn off the computer and connect the keyboard and retest.
2. Replace the keyboard and retest.
3. Replace the system board and retest.

Parallel Printer Test Error Codes

The 400 series of Diagnostic error codes identifies failures with parallel printer interface card or system board functions.
Table 3-5
Parallel Printer Test Error Codes
401-xx: Printer failed or not connected.
402-xx: Printer data register failed.
403-xx: Printer pattern test failed.
498-xx: Printer failed or not connected.
Recommended Action (the following steps apply to error codes 401-xx through 498-xx):
1. Connect the printer and retest.
2. Check the power to the printer and retest.
3. Install the loopback connector and retest.
4. Check the switch on the Serial/Parallel Interface board (if applicable) and retest.
5. Replace the Serial/Parallel Interface board (if applicable) and retest.
6. Replace the system board and retest.
Page 60

Video Display Unit Test Error Codes

The 500 series of Diagnostic error codes identifies failures with video or system board functions.
Table 3-6
Video Display Unit Test Error Codes
501-xx: Video controller test failed.
502-xx: Video memory test failed.
503-xx: Video attribute test failed.
504-xx: Video character set test failed.
505-xx: Video 80 x 25 mode 9 x 14 character cell test failed.
506-xx: Video 80 x 25 mode 8 x 8 character cell test failed.
507-xx: Video 40 x 25 mode test failed.
508-xx: Video 320 x 200 mode color set 0 test failed.
509-xx: Video 320 x 200 mode color set 1 test failed.
510-xx: Video 640 x 200 mode test failed.
511-xx: Video screen memory page test failed.
512-xx: Video gray scale test failed.
514-xx: Video white screen test failed.
516-xx: Video noise pattern test failed.
Recommended Action (the following steps apply to error codes 501-xx through 516-xx):
1. Replace the monitor and retest.
2. Replace the Advanced VGA board and retest.
3. Replace the system board and retest.
Page 61

Diskette Drive Test Error Codes

The 600 series of Diagnostic error codes identifies failures with diskette, diskette drive, or system board functions.
Table 3-7
Diskette Drive Test Error Codes
600-xx: Diskette ID drive types test failed.
601-xx: Diskette format failed.
602-xx: Diskette read test failed.
603-xx: Diskette write/read/compare test failed.
604-xx: Diskette random seek test failed.
605-xx: Diskette ID media failed.
606-xx: Diskette speed test failed.
607-xx: Diskette wrap test failed.
608-xx: Diskette write protect test failed.
609-xx: Diskette reset controller test failed.
610-xx: Diskette change line test failed.
694-xx: Pin 34 is not cut on 360 KB diskette drive.
697-xx: Diskette type error.
698-xx: Diskette drive speed not within limits.
Recommended Action:
1. Replace the diskette and retest.
2. Check and/or replace the diskette power and signal cables and retest.
3. Replace the diskette drive and retest.
4. Replace the system board and retest.
699-xx: Diskette drive/media ID error.
Recommended Action:
1. Replace the media and retest.
2. Run the Configuration utility and retest.

Monochrome Video Board Test Error Codes

The 800 series of Diagnostic error codes identifies failures with monochrome video boards or system board functions.
Table 3-8
Monochrome Video Board Test Error Codes
802-xx: Video memory test failed.
824-xx: Monochrome video text mode test failed.
Recommended Action:
1. Replace monitor and retest.
2. Replace the Advanced VGA board and retest.
3. Replace monochrome board and retest.
4. Replace the system board and retest.
Page 62

Serial Test Error Codes

The 1100 series of Diagnostic error codes identifies failures with serial/parallel interface board or system board functions.
Table 3-9
Serial Test Error Codes
1101-xx: Serial port test failed.
1109-xx: Clock register test failed.
Recommended Action:
1. Check the switch settings on the Serial/Parallel Interface board (if applicable) and retest.
2. Replace the Serial/Parallel Interface board (if applicable) and retest.
3. Replace the system board and retest.

Modem Communications Test Error Codes

The 1200 series of Diagnostic error codes identifies failures with the modem(s).
Table 3-10
Modem Communications Test Error Codes
1201-xx: Modem internal loopback test failed.
1202-xx: Modem time-out test failed.
1203-xx: Modem external termination test failed.
1204-xx: Modem auto originate test failed.
1206-xx: Dial multi-frequency tone test failed.
1210-xx: Modem direct connect test failed.
Recommended Action:
1. Refer to the modem documentation for correct setup procedures and retest.
2. Check the modem line and retest.
3. Replace the modem and retest.
Page 63

Hard Drive Test Error Codes

The 1700 series of Diagnostic error codes identifies failures with hard drives, hard drive controller boards, hard drive cabling, and system board functions. If your system uses a drive array controller, see the section for Drive Array Advanced Diagnostics (DAAD).
Table 3-11
Hard Drive Test Error Codes
1700-xx: Fixed disk ID drive types test failed.
1701-xx: Fixed disk format test failed.
1702-xx: Fixed disk read test failed.
1703-xx: Fixed disk write/read/compare test failed.
1704-xx: Fixed disk random seek test failed.
1705-xx: Fixed disk controller test failed.
1708-xx: Fixed disk format bad track test failed.
1709-xx: Fixed disk reset controller test failed.
1710-xx: Fixed disk park head test failed.
1715-xx: Fixed disk head select test failed.
1716-xx: Fixed disk conditional format test failed.
1717-xx: Fixed disk ECC* test failed.
1719-xx: Fixed disk drive power mode test failed.
1736-xx: Drive Monitoring failed.
1799-xx: Invalid fixed disk drive type failed.
* Error Checking and Correcting
Recommended Action:
1. Run the System Configuration Utility and verify the drive type.
2. Replace the fixed disk drive signal and power cables and retest.
3. Replace the fixed disk drive controller and retest.
4. Replace the fixed disk drive and retest.
5. Replace the system board and retest.
Page 64

Tape Drive Test Error Codes

The 1900 series of Diagnostic error codes identifies failures with tape cartridges, tape drives, tape drive cabling, adapter boards, or the system board assembly.
Table 3-12
Tape Drive Test Error Codes
1900-xx: Tape ID failed.
1901-xx: Tape servo write failed.
1902-xx: Tape format failed.
1903-xx: Tape drive sensor test failed.
1904-xx: Tape BOT/EOT test failed.
1905-xx: Tape read test failed.
1906-xx: Tape write/read/compare test failed.
Recommended Action:
1. Replace the tape cartridge and retest.
2. Check and/or replace the signal cable and retest.
3. Check the switch settings on the adapter board (if applicable).
4. Replace the tape adapter board (if applicable) and retest.
5. Replace the tape drive and retest.
6. Replace the system board and retest.
Page 65

Advanced VGA Board Test Error Codes

The 2400 series of Diagnostic error codes identifies failures with video boards, monitors, or the system board assembly.
Table 3-13
Advanced VGA Board Test Error Codes
2402-xx: Video memory test failed.
2403-xx: Video attribute test failed.
2404-xx: Video character set test failed.
2405-xx: Video 80 x 25 mode 9 x 14 character cell test failed.
2406-xx: Video 80 x 25 mode 8 x 8 character cell test failed.
2407-xx: Video 40 x 25 mode test failed.
2408-xx: Video 320 x 200 mode color set 0 test failed.
2409-xx: Video 320 x 200 mode color set 1 test failed.
2410-xx: Video 640 x 200 mode test failed.
2411-xx: Video screen memory page test failed.
2412-xx: Video gray scale test failed.
2414-xx: Video white screen test failed.
2416-xx: Video noise pattern test failed.
2417-xx: Lightpen text mode test failed, no response.
2418-xx: ECG/VGC memory test failed.
Recommended Action:
1. Run the System Configuration Utility.
2. Replace the monitor and retest.
3. Replace the Advanced VGA board or other video board and retest.
4. Replace the system board and retest.
Page 66
Advanced VGA Board Test Error Codes Continued
2419-xx: ECG/VGC ROM checksum test failed.
2420-xx: ECG/VGC attribute test failed.
2421-xx: ECG/VGC 640 x 200 graphics mode test failed.
2422-xx: ECG/VGC 640 x 350 16-color set test failed.
Recommended Action:
1. Run the System Configuration Utility.
2. Replace the monitor and retest.
3. Replace the Advanced VGA board or other video board and retest.
4. Replace the system board and retest.
2423-xx: ECG/VGC 640 x 350 64-color test failed.
2424-xx: ECG/VGC monochrome text mode test failed.
2425-xx: ECG/VGC monochrome graphics mode test failed.
2431-xx: 640 x 480 graphics test failure.
2432-xx: 320 x 200 graphics (256-color mode) test failure.
2448-xx: Advanced VGA Controller test failed.
2451-xx: 132-column Advanced VGA test failed.
2456-xx: Advanced VGA 256-Color test failed.
2458-xx: Advanced VGA Bit BLT Test.
2468-xx: Advanced VGA DAC Test.
2477-xx: Advanced VGA Data Path Test.
2480-xx: Advanced VGA DAC Test.
Recommended Action:
1. Run Setup.
2. Replace the system board and retest.

NetFlex-2 Controller Test Error Codes

The 6000 series of Diagnostic error codes identifies failures with 32-bit DualSpeed NetFlex-2 and NetFlex-2 Token Ring Controllers.
Table 3-14
NetFlex-2 Controller Test Error Codes
6000-xx: Network card ID failed.
6001-xx: Network card setup failed.
6002-xx: Network card transmit failed.
6014-xx: Network card configuration failed.
6016-xx: Network card reset failed.
6028-xx: Network card internal failed.
6029-xx: Network card external failed.
6089-xx: Network card open failed.
Recommended Action:
1. Check the controller installation in the EISA slot.
2. Check the interrupt type and number setting.
3. Check the media connection at the controller and Multistation Access Unit (MAU).
4. Check the media speed (4/16) and type Unshielded Twisted Pair/Shielded Twisted Pair (UTP/STP) settings.
5. Check the MAU, cabling, or other network components.
6. Replace the controller.
Page 68

Compaq Network Interface Boards Test Error Codes

The 6000 series of Diagnostic error codes identifies failures with 32-bit DualSpeed NetFlex-2/Token Ring Controllers.
Table 3-15
Compaq Network Interface Boards Test Error Codes
6000-xx: Network card ID failed.
6001-xx: Network card setup failed.
6002-xx: Network card transmit failed.
6014-xx: Network card configuration failed.
6016-xx: Network card reset failed.
6028-xx: Network card internal failed.
6029-xx: Network card external failed.
6089-xx: Network card open failed.
6090-xx: Network card initialization failed.
Recommended Action:
1. Check the controller installation in the EISA slot.
2. Check the interrupt type and number setting.
3. Check the media connection at the controller and Multistation Access Unit (MAU).
4. Check the media speed (4/16) and type Unshielded Twisted Pair/Shielded Twisted Pair (UTP/STP) settings.
5. Check the MAU, cabling, or other network components.
6. Replace the controller.
6091-xx: Network card internal loopback failed.
6092-xx: Network card external loopback failed.
Page 69

SCSI Hard Drive Test Error Codes

The 6500 series of Diagnostic error codes identifies failures with SCSI hard drives, SCSI hard drive controller boards, SCSI hard drive cabling, and system board functions. If your system uses a drive array controller, see the section for Drive Array Advanced Diagnostics (DAAD).
Table 3-16
SCSI Hard Drive Test Error Codes
6500-xx: SCSI Disk ID drive types test failed.
6502-xx: SCSI Disk Unconditional Format test failed.
6505-xx: SCSI Disk Read test failed.
6506-xx: SCSI Disk SA/Media test failed.
6509-xx: SCSI Disk Erase tape test failed.
6523-xx: SCSI Disk Random Read test failed.
6528-xx: Media load/unload test failed.
Recommended Action:
1. Run the System Configuration Utility and verify the drive type.
2. Replace the SCSI disk drive signal and power cables and retest.
3. Replace the SCSI controller and retest.
4. Replace the SCSI disk drive and retest.
5. Replace the system board and retest.

SCSI/IDE CD-ROM Drive Test Error Codes

The 6600 series of Diagnostic error codes identifies failures with the CD-ROM cabling, CD-ROM drives, adapter boards, or the system board assembly.
Table 3-17
SCSI/IDE CD-ROM Drive Test Error Codes
6600-xx: CD-ROM ID failed.
6605-xx: CD-ROM Read failed.
Recommended Action:
1. Replace the CD-ROM media and retest.
2. Check and/or replace the signal cable and retest.
3. Check the switch settings on the adapter board (if applicable).
4. Replace the SCSI controller (if applicable) and retest.
5. Replace the CD-ROM drive and retest.
6. Replace the system board and retest.
Page 70

SCSI Tape Drive Test Error Codes

The 6700 series of Diagnostic error codes identifies failures with tape cartridges, tape drives, media changers, tape drive cabling, adapter boards, or the system board assembly.
Table 3-18
SCSI Tape Drive Test Error Codes
6700-xx: SCSI Tape ID drive types test failed.
6706-xx: SCSI Disk SA/Media test failed.
6709-xx: SCSI Disk Erase tape test failed.
6728-xx: Media load/unload test failed.
Recommended Action:
1. Run the System Configuration Utility and verify the drive type.
2. Replace the SCSI Tape drive signal and power cables and retest.
3. Replace the SCSI controller and retest.
4. Replace the SCSI Tape drive and retest.
5. Replace the system board and retest.

Server Manager/R Board Test Error Codes

The 7000 series of Diagnostic error codes identifies failures with the Server Manager/R board.
Table 3-19
Server Manager/R Board Test Error Codes
7000-11: Processor (80186 Timer).
7000-12: Processor (80186 Registers).
7000-13: Processor (Watch Dog Timer).
7000-14: Processor (8570 RAM).
7000-15: Processor (8570 RTC).
7000-21: Memory.
7000-22: Memory Write/Read.
7000-23: Memory Address.
7000-24: Memory Refresh Alert.
Recommended Action: Replace the Server Manager/R board and retest.
Page 71
Server Manager/R Board Test Error Codes Continued
7000-25: Memory Increment.
7000-26: Memory Random Data.
7000-27: Memory Disturb Address.
7000-28: Memory HBM.
7000-33: HBM IO.
7000-34: HBM BMIC.
7000-35: HBM Video.
7000-41: ser_int.
7000-42: ser_int.
7000-43: ser_ext.
7000-44: ser_ext.
7000-45: ser_ext_int.
7000-46: ser_ext_int.
Recommended Action: Replace the Server Manager/R board and retest.
7000-51: mdm_int.
7000-52: mdm_int.
7000-53: mdm_ext.
7000-54: mdm_ext.
7000-55: mdm_ext_int.
7000-56: mdm_ext_int.
7000-57: mdm\c\analog.
Recommended Action: Replace the Server Manager/R board Enhanced 2400-Baud Integrated Modem and retest.
7000-61: Voice/DTMF Internal Loopback.
7000-62: Voice/DTMF Internal Loopback.
Recommended Action: Replace the Server Manager/R board Voice ROM.
7000-78: Host ADC Measurements.
7000-79: Battery.
Recommended Action: Replace the Server Manager/R board battery.

Pointing Device Interface Test Error Codes

The 8600 series of Diagnostic error codes identifies failures with the pointing device (mouse, trackball, and so on) or the system board assembly.
Table 3-20
Pointing Device Interface Test Error Codes
8601-xx: Pointing Device Interface test failed.
Recommended Action:
1. Replace with a working pointing device and retest.
2. Replace the system board and retest.
Page 72

Drive Array Advanced Diagnostics (DAAD)

Drive Array Advanced Diagnostics (DAAD) is a DOS-based tool designed to run on all Compaq products that contain a Compaq Drive Array Controller. The error messages and codes listed include all codes generated by Compaq products. Your system generates only codes applicable to your configuration and options.
The two main functions of DAAD are:
Collecting all possible information about array controllers in the system
Offering a list of all detected problems
NOTE: Refer to the Drive Array Advanced Diagnostics User Guide, found on the SmartStart and Support Software CD, for complete details and procedures about this diagnostic tool.
DAAD works by issuing multiple commands to the array controllers to determine if a problem exists. This data can then be saved to a file and, in severe situations, this file can be sent to Compaq for analysis. In most cases, DAAD provides enough information to initiate problem resolution immediately.
NOTE: DAAD does not write to the drives or destroy data. It does not change or remove configuration information.
Page 73

Starting DAAD

To start DAAD:
1. Insert the DAAD diskette into drive A.
2. Reboot the system - OR - if you are at the DOS prompt, enter the following:
A:DAAD
NOTE: To generate a DAAD report without starting the interactive portion of the utility, enter the following at the DOS prompt: DAAD filename where filename is the name of the file or report.
A dialog box displays, indicating the version of DAAD installed.
3. Press the Enter (or C) key to continue, or press the Esc (or E) key to exit without continuing.
If you continue, a Please Wait panel displays, indicating that DAAD is identifying the system parameters. DAAD gathers all the information it can from all of the array controllers in the system. The time it takes to gather this information depends on the size of your system.
A second Please Wait panel may display to indicate that the utility is identifying the ROM version of an array controller in the system.
CAUTION: Do not cycle the power; the utility must perform low-level operations that, if interrupted, could cause the controller to revert to a previous level of firmware if the firmware was soft-upgraded.
When the information gathering process is complete, the main DAAD screen displays.
Page 74

DAAD Diagnostic Messages

Table 3-21 lists DAAD diagnostic messages in alphabetical order.
Table 3-21
DAAD Diagnostic Messages

Message: Accelerator board not detected
Description: Array controller did not detect a configured array accelerator board.
Recommended Action: Install array accelerator board on array controller. If an array accelerator board is installed, check for proper seating on the array controller board. You may need to run the System Configuration Utility and disable the array accelerator board to get this message off the screen.

Message: Accelerator error log
Description: List of the last 32 parity errors on transfers to or from memory on the array accelerator board. Displays starting memory address, transfer count, and operation (read and write).
Recommended Action: If there are many parity errors, you may need to replace the array accelerator board.

Message: Accelerator parity read errors: n
Description: Number of times that read memory parity errors were detected during transfers from memory on array accelerator board.
Recommended Action: If there are many parity errors, you may need to replace the array accelerator board.

Message: Accelerator parity write errors: n
Description: Number of times that write memory parity errors were detected during transfers to memory on the array accelerator board.
Recommended Action: If there are many parity errors, you may need to replace the array accelerator board.

Message: Accelerator status: Permanently disabled
Description: Array accelerator board has been permanently disabled. It remains disabled until it is reinitialized using the System Configuration Utility.
Recommended Action: Check the Disable Code field. Run the System Configuration Utility to reinitialize the array accelerator board.

Message: Accelerator status: Possible data loss in cache
Description: Possible data loss detected during power-up due to all batteries being below sufficient voltage level and no presence of identification signatures on the array accelerator board.
Recommended Action: There is no way to determine if dirty or bad data was in the cache and is now lost.

Message: Accelerator status: Temporarily disabled
Description: Array accelerator board has been temporarily disabled.
Recommended Action: Check the Disable Code field.

Message: Accelerator status: Unrecognized status
Description: A status returned from the array accelerator board that DAAD does not recognize.
Recommended Action: Obtain the latest version of DAAD.
Page 75
Message: Accelerator status: Obsolete data sensed at reset
Description: During reset initialization obsolete data was found in the cache. This was due to the drives being moved and written to by another controller.
Recommended Action: Nothing needs to be done. The controller will either write the data to the drives or discard the data completely. Normal operations should continue.

Message: Accelerator status: Obsolete data was written to drives
Description: During reset initialization obsolete data was found in the cache. The obsolete data was written to the drives, but newer data may have been overwritten.
Recommended Action: If newer data was overwritten, you may need to restore newer data; otherwise, nothing needs to be done. Normal operations should continue.

Message: Accelerator status: Obsolete data was discarded
Description: During reset initialization obsolete data was found in the cache and it was discarded (not written to the drives).
Recommended Action: Nothing needs to be done. Normal operations should continue.

Message: Accelerator status: Dirty data detected. Unable to write dirty data to drives
Description: At least one cache line contains dirty data that the controller has been unable to flush (write) to the drives. This problem usually occurs when there is a problem with the drive(s).
Recommended Action: Fix the problem with the drive(s). Then the controller will be able to write the dirty data to the drives.

Message: Accelerator status: Dirty data detected has reached limit. Cache still enabled, but writes no longer being posted
Description: The number of cache lines containing dirty data that cannot be flushed (written) to the drives has reached a preset limit. The cache is still enabled, but writes are no longer being posted. This problem usually occurs when there is a problem with the drive(s).
Recommended Action: Fix the problem with the drive(s). Then the controller will be able to write the dirty data to the drives and posted write operations will be restored.

Message: Accelerator status: Excessive ECC errors detected in at least one cache line. As a result, at least one cache line is no longer in use.
Description: At least one line in the cache is no longer in use due to excessive ECC errors detected during use of the memory associated with that cache line.
Recommended Action: Replacement of the cache should be considered. If cache replacement is not done, the remaining cache lines should continue to operate properly.

Message: Accelerator status: Data in the cache was lost due to some reason other than the battery being discharged
Description: Data in the cache was lost, but not because of the battery being discharged.
Recommended Action: Check to be sure that the array accelerator is properly seated. If the error continues, you may need to replace the array accelerator.
Page 76
Message: Accelerator status: Cache was automatically configured during last controller reset. This can occur when cache board is replaced with one of a different size.
Description: Cache board was probably replaced with one of a different size.
Recommended Action: Nothing needs to be done. Normal operations should continue.

Message: Accelerator status: Valid data found at reset
Description: Valid data was found in posted write memory at reinitialization. Data will be flushed to disk.
Recommended Action: Not an error or data loss condition. No action needs to be taken.

Message: Accelerator status: Warranty alert
Description: Catastrophic problem with array accelerator board. Refer to other messages on Diagnostics screen for exact meaning of this message.
Recommended Action: Replace the array accelerator board.

Message: Adapter/NVRAM ID mismatch
Description: EISA nonvolatile RAM has an ID for a different controller from the one physically present in the slot.
Recommended Action: Run the System Configuration Utility.

Message: Battery X not fully charged
Description: Battery is not fully charged.
Recommended Action: Allow 36 hours for it to recharge.

Message: Board not attached
Description: Array controller configured for use with array accelerator board, but one is not attached.
Recommended Action: Attach array accelerator board to array controller.

Message: NVRAM configuration present, controller not detected
Description: EISA nonvolatile RAM has a configuration for an array controller but there is no board in this slot. Either a board has been removed from the system or a board has been placed in the wrong slot.
Recommended Action: Place the array controller in the proper slot or run the System Configuration Utility to reconfigure nonvolatile RAM to reflect the removal or new position.

Message: Compatibility port problem detected
Description: Compatibility port configured for this IDA controller. When DAAD was verifying this interface, a serious problem was detected.
Recommended Action: A hardware problem has occurred; replace the IDA controller.
Page 77
Message: Configuration signature is zero
Description: DAAD detected that nonvolatile RAM contains a configuration signature that is zero. Old versions of the System Configuration Utility could cause this.
Recommended Action: Run the latest version of System Configuration Utility to configure the controller and nonvolatile RAM.

Message: Configuration signature mismatch
Description: Array accelerator board configured for a different array controller board. Configuration signature on array accelerator board does not match the one stored on the array controller board.
Recommended Action: To recognize the array accelerator board, run the System Configuration Utility.

Message: Controller communication failure occurred
Description: Controller communication failure occurred.
Recommended Action: DAAD was unable to successfully issue commands to the controller in this slot.

Message: Controller detected. NVRAM configuration not present
Description: EISA nonvolatile RAM does not contain a configuration for this controller.
Recommended Action: Run the System Configuration Utility to configure the nonvolatile RAM.

Message: Controller firmware needs upgrading
Description: Controller firmware is below the latest recommended version.
Recommended Action: Run Options ROMPaq to upgrade the controller to the latest firmware revision.

Message: Controller firmware needs upgrading (DAAD Error 102)
Description: Controller is correct, however, IDA firmware version should be greater than 1.26.
Recommended Action: Obtain the latest firmware.

Message: Controller is located in special video slot
Description: Controller is installed in slot for special video control signals. If controller is used in this slot, LED indicators on front panel may not function properly.
Recommended Action: Install the controller in a different slot and run the System Configuration Utility to configure the controller and nonvolatile RAM.

Message: Controller is not configured
Description: Controller is not configured. If controller was previously configured and you change drive locations, there may be a problem with placement of the drives. DAAD examines each physical drive and looks for drives that have been moved to a different drive bay.
Recommended Action: Look for messages indicating which drives have been moved. If none appear and drive swapping did not occur, run the System Configuration Utility to configure the controller and nonvolatile RAM. Do not run the System Configuration Utility if you believe drive swapping has occurred.

Message: Controller needs replacing (DAAD Error 102)
Description: IDA firmware is less than version 0.96.
Recommended Action: Replace the controller as soon as possible.
Page 78
Message: Controller needs replacing (DAAD Error 104)
Description: The Intelligent Array Expansion System firmware is less than version 1.14.
Recommended Action: Replace the controller as soon as possible.

Message: Controller reported POST error. Error Code: x
Description: The controller returned an error from its internal Power-On Self Tests.
Recommended Action: Replace the controller.

Message: Controller restarted with a signature of zero
Description: DAAD did not find a valid configuration signature to use to get the data. Nonvolatile RAM may not be present (unconfigured) or the signature present in nonvolatile RAM may not match the signature on the controller.
Recommended Action: Run the System Configuration Utility to configure the controller and nonvolatile RAM.

Message: DAAD recorded errors attempting to access: X
Description: DAAD found errors while attempting to access physical drive X, believed to be operational. Message followed by specific information about the error.
Recommended Action: Replace the drive, or correct the condition that caused the error.

Message: Disable command issued
Description: Posted-writes have been disabled by the issuing of the Accelerator Disable command. This occurred because of an operating system device driver.
Recommended Action: Restart the system. Run the System Configuration Utility to reinitialize the array accelerator board.

Message: Drive (bay) X needs replacing (DAAD Error 102)
Description: The 210-megabyte hard drive has firmware version 2.30 or 2.31.
Recommended Action: Replace the drive.

Message: Drive Monitoring features are unobtainable
Description: DAAD unable to get monitor and performance data due to fatal command problem such as drive time-out, or unable to get data due to these features not supported on the controller.
Recommended Action: Check for other errors (time-outs, and so on). If no other errors occur, upgrade the firmware to a version that supports monitor and performance, if desired.

Message: Drive Monitoring is NOT enabled for drive bay X
Description: The monitor and performance features have not been enabled.
Recommended Action: Run the System Configuration Utility to initialize the monitor and performance features.

Message: Drive time-out occurred on physical drive bay X
Description: DAAD issued a command to a physical drive and the command was never acknowledged.
Recommended Action: The drive or cable may be bad. Check the other error messages on the Diagnostics screen to determine resolution.
Page 79
Message: Drive (bay) X firmware needs upgrading
Description: Firmware on this physical drive is below the latest recommended version.
Recommended Action: Run the Options ROMPaq Utility to upgrade the drive firmware to the latest revision.

Message: Drive (bay) X has invalid M&P stamp
Description: Physical drive has invalid monitor and performance data.
Recommended Action: Run the System Configuration Utility to properly initialize this drive.

Message: Drive X indicates position Y
Description: Message indicates which physical drive appears to be scrambled or in a drive bay other than the one for which it was originally configured.
Recommended Action: Examine the graphical drive representation on DAAD to determine proper drive locations. Remove drive X and place it in drive position Y. Rearrange the drives according to the DAAD instructions.

Message: Drive (bay) X RIS copy mismatch
Description: The copies of the RIS on this drive do not match.
Recommended Action: This drive may need to be replaced. Check for other errors.

Message: Drive (bay) X upload code not readable
Description: An error occurred while DAAD was trying to read the upload code information from this drive.
Recommended Action: If there were multiple errors, this drive may need to be replaced.

Message: Drive (bay) X has loose cable
Description: The array controller could not communicate with this drive at power-up. This drive has not previously failed.
Recommended Action: Check all cable connections first. The cables could be bad, loose, or disconnected. Turn on the system and attempt to reconnect signal/power cable to the drive. If this does not work, replace the cable. If that does not work, the drive may need to be replaced.

Message: Drive (bay) X is a replacement drive
Description: This drive has been replaced. This message displays if a drive is replaced in a fault tolerant logical volume.
Recommended Action: If the replacement was intentional, allow the drive to rebuild.

Message: Drive (bay) X is a replacement drive marked OK
Description: This drive has been replaced and marked OK by the firmware. This may occur if a drive has an intermittent failure (for example, if a drive has previously failed, then when DAAD is run, the drive starts working again).
Recommended Action: Replace the drive.

Message: Drive (bay) X is failed
Description: The indicated physical drive has failed.
Recommended Action: Replace this drive.

Message: Drive (bay) X has insufficient capacity for its configuration
Description: Drive has insufficient capacity to be used in this logical drive configuration.
Recommended Action: Replace this drive with a larger capacity drive.

Message: Drive (bay) X is undergoing drive recovery
Description: This drive is being rebuilt from the corresponding mirror or parity data.
Recommended Action: Normal operations should occur.
Page 80
Message: Drive (bay) X was inadvertently replaced
Description: The physical drive was incorrectly replaced after another drive failed.
Recommended Action: Replace the drive that was incorrectly replaced and replace the original drive that failed. Do not run the System Configuration Utility and try to reconfigure; data will be lost.

Message: Duplicate write memory error
Description: Data could not be written to the array accelerator board in duplicate due to the detection of parity errors. This is not a data loss situation.
Recommended Action: Replace the array accelerator board.

Message: Error occurred reading RIS copy from drive (bay) X
Description: An error occurred while DAAD was trying to read the RIS from this drive.
Recommended Action: If there were multiple errors, this drive may need to be replaced.

Message: FYI: Drive (bay) X is non-Compaq supplied
Description: The installed drive was not supplied by Compaq.
Recommended Action: If problems exist with this drive, replace it with a Compaq drive.

Message: Identify controller data did not match with NVRAM
Description: The identify controller data from the array controller did not match the information stored in nonvolatile RAM. This could occur if new, previously configured drives have been placed in a system that has also been previously configured. It could also occur if the firmware on the controller has been upgraded and the System Configuration Utility was not run.
Recommended Action: Check the identify controller data under the Inspect Utility. If the firmware version field is the only thing different between the controller and nonvolatile RAM data, this is not a problem. Otherwise, run the System Configuration Utility.

Message: Identify logical drive data did not match with NVRAM
Description: The identify unit data from the array controller did not match with the information stored in nonvolatile RAM. This could occur if new, previously configured drives have been placed in a system that has also been previously configured.
Recommended Action: Run the System Configuration Utility to configure the controller and nonvolatile RAM.

Message: Insufficient adapter resources
Description: The adapter does not have sufficient resources to perform operations to the array accelerator board. Drive rebuild may be occurring.
Recommended Action: Operate the system without the array accelerator board until the drive rebuild completes.
Page 81
Message: Logical drive X failed due to cache error
Description: This logical drive failed due to a catastrophic cache error.
Recommended Action: Replace the array accelerator board and reconfigure using the System Configuration Utility.

Message: Logical Drive X status = FAILED
Description: This status could be issued for several reasons. If this logical drive is configured for No Fault Tolerance and one or more drives fail, this status will occur. If mirroring is enabled, and any two mirrored drives fail, this status will occur. If Data Guarding is enabled, and two or more drives fail in this unit, this status will occur. This status may also occur if another configured logical drive is in the WRONG DRIVE REPLACED or LOOSE CABLE DETECTED state.
Recommended Action: Check for drive failures, wrong drive replaced, or loose cable messages. If there was a drive failure, replace the failed drive(s) and then restore the data for this logical drive from the tape backup. Otherwise, follow the wrong drive replaced or loose cable detected procedures.

Message: Logical Drive X status = INTERIM RECOVERY
Description: A physical drive in this logical drive has failed. The logical drive is operating in interim recovery mode and is vulnerable.
Recommended Action: Replace the failed drive as soon as possible.

Message: Logical Drive X status = LOOSE CABLE DETECTED
Description: A physical drive has a cabling problem.
Recommended Action: Turn the system off and attempt to reattach the cable onto the drive. If this does not work, replace the cable.

Message: Logical Drive X status = NEEDS RECOVER
Description: A physical drive in this logical drive has failed and has now been replaced. This drive needs to be rebuilt from the mirror drive or the parity data.
Recommended Action: When booting up the system, select the "F1 - rebuild drive" option to rebuild the replaced drive.

Message: Logical Drive X status = OVERHEATED
Description: The temperature of the Intelligent Array Expansion System drives is beyond safe operating levels and it has shut down to avoid damage.
Recommended Action: Check the fans and the operating environment.

Message: Logical Drive X status = OVERHEATING
Description: The temperature of the Intelligent Array Expansion System drives is beyond safe operating levels.
Recommended Action: Check the fans and the operating environment.

Message: Logical Drive X status = RECOVERING
Description: A physical drive in this logical drive has failed and has now been replaced. The replaced drive is rebuilding from the mirror drive or the parity data.
Recommended Action: Nothing needs to be done. Normal operations can occur.
Page 82
Message: Logical Drive X status = WRONG DRIVE REPLACED
Description: A physical drive in this logical drive has failed. The incorrect drive was replaced.
Recommended Action: Replace the drive that was incorrectly replaced. Then, replace the original drive that failed with a new drive. Do not run the System Configuration Utility to reconfigure; you will lose data on the drive.

Message: Loose cable detected - logical drives may be marked FAILED until corrected
Description: Controller unable to communicate with one or more physical drives, probably because of a cabling problem. Logical drives may be in a FAILED state until the condition is corrected, preventing access to data on the controller.
Recommended Action: Check all controller and drive cable connections.

Message: Mirror data miscompare
Description: Data was found at reinitialization in the posted write memory; however, the mirror data compare test failed, resulting in data being marked as invalid. Data loss is possible.
Recommended Action: Replace the array accelerator board.

Message: Mirrored memory location errors
Description: Soft errors occurred when attempting to read the same data from both sides of the mirrored memory. Data loss will occur.
Recommended Action: Replace the array accelerator board.

Message: No configuration for Accelerator Board
Description: The array accelerator board has not been configured.
Recommended Action: If the array accelerator board is present, run the System Configuration Utility to configure the board, if desired.

Message: SCSI port X, drive ID Y firmware needs upgrading
Description: Drive firmware may cause problems and should be upgraded.
Recommended Action: Run Options ROMPaq to upgrade the drive firmware to a later revision.

Message: Set configuration command issued
Description: The configuration of the array controller has been updated. The array accelerator board may remain disabled until it is reinitialized.
Recommended Action: Run the System Configuration Utility to reinitialize the array accelerator board.

Message: Soft Firmware Upgrade required
Description: DAAD has determined that your controller is running firmware that has been soft upgraded by the Compaq Upgrade Utility. However, the firmware running is not present on all drives. This could be caused by the addition of new drives in the system.
Recommended Action: Run the Compaq Upgrade Utility to place the latest firmware on all drives.

Message: Threshold for drive (bay) X violated
Description: This message indicates that a monitor and performance threshold for this drive has been violated.
Recommended Action: Check for the particular threshold that has been violated.
Page 83
Message: Threshold violations for drive (bay) X
Description: This is a list of the individual thresholds that have been violated for this drive.
Recommended Action: The drive may need to be replaced. Run the Compaq Diagnostics Utility to determine if the drive has been initialized and the threshold violation warrants drive replacement.

Message: Unknown disable code
Description: A code was returned from the array accelerator board that DAAD does not recognize.
Recommended Action: Obtain the latest version of DAAD.

Message: Warning bit detected
Description: A monitor and performance threshold violation may have occurred. The status of a logical drive may not be OK.
Recommended Action: Check the other error messages for an indication of the problem.

Message: WARNING - Drive Write Cache is enabled on X
Description: Drive has its internal write cache enabled. The drive may be a third-party drive or the drive's operating parameters may have been altered. Condition may cause data corruption if power to the drive is interrupted.
Recommended Action: Replace the drive with a Compaq-supplied drive, or restore the drive's operating parameters.

Message: Wrong Accelerator
Description: This could mean that either the board was replaced in the wrong slot or placed in a system that was previously configured with another board type. Included with this message is a message indicating the type of adapter sensed by DAAD and a message indicating the type of adapter last configured in EISA nonvolatile RAM.
Recommended Action: Check the diagnosis screen for other error messages. Run the System Configuration Utility to update the system configuration.
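When DAAD messages are captured to a report file, looking up the recommended action can be scripted. The following is a minimal Python sketch, not part of DAAD. It copies four rows from Table 3-21 into a hypothetical lookup (`ACTIONS`, `recommended_action` are illustrative names) and assumes that the placeholder X in a message stands for a decimal number; the manual does not specify the exact report format.

```python
import re

# Four rows copied from Table 3-21; the full table contains many more.
ACTIONS = {
    "Drive (bay) X is failed": "Replace this drive.",
    "Drive (bay) X is undergoing drive recovery": "Normal operations should occur.",
    "Board not attached": "Attach array accelerator board to array controller.",
    "Unknown disable code": "Obtain the latest version of DAAD.",
}


def _to_regex(template):
    """Turn a table message into a regex; a standalone X matches digits (assumption)."""
    escaped = re.escape(template)
    body = re.sub(r"\bX\b", lambda _m: r"(\d+)", escaped)
    return re.compile("^" + body + "$")


PATTERNS = [(_to_regex(template), action) for template, action in ACTIONS.items()]


def recommended_action(message):
    """Return the table's recommended action for a message, or None if unknown."""
    text = message.strip()
    for pattern, action in PATTERNS:
        if pattern.match(text):
            return action
    return None


print(recommended_action("Drive (bay) 3 is failed"))
```

Messages that are not in the lookup return `None`, so a caller can fall back to reading the full table.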
Page 84

Integrated Management Log

On servers supporting the Integrated Management Display, the Compaq Integrated Management Log (IML) replaces the Critical Error Log and Correctable Memory Logs. It records system events and stores them in an easily viewable form. It marks each event with a time-stamp with one-minute granularity.
Events listed in the Integrated Management Log are categorized as one of four event severity levels:
Status - indicates that the message is informational only.
Repaired - indicates that corrective action has been taken.
Caution - indicates a non-fatal error condition.
Critical - indicates a component failure.
The Integrated Management Log requires Compaq Operating System-dependent drivers. Refer to the Compaq Support Software CD for instructions on installing the appropriate drivers.

Multiple Ways of Viewing the Log

You can view an event in the IML in several ways:
On the Integrated Management Display
From within Compaq Insight Manager
From within Compaq Survey Utility
From within IML Management Utility
Integrated Management Display
The Integrated Management Display is a Liquid Crystal Display (LCD) panel that presents information directly at the server, assisting in diagnosing and servicing the server without a keyboard and monitor.
Compaq Insight Manager
Compaq Insight Manager is a server management tool providing in-depth fault, configuration, and performance monitoring of hundreds of Compaq servers from a single management console. System parameters that are monitored describe the status of all key server components. Because you can view the events that occur on these components, you can take immediate action. You can view and print the event list from within Compaq Insight Manager by following the procedures below. You can also mark a Critical or Caution event as Repaired after the affected component has been replaced, for example, when a failed fan has been replaced. Marking an event as repaired lowers the severity of the event.
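The four severity levels and the effect of marking an event as repaired can be modeled in a few lines. This is an illustrative Python sketch only: the class and function names are hypothetical, and the numeric ordering of the levels is an assumption (the guide names the levels but does not rank Status against Repaired).

```python
from enum import IntEnum


class Severity(IntEnum):
    """IML severity levels; the numeric order is an assumption for illustration."""
    STATUS = 0     # informational only
    REPAIRED = 1   # corrective action has been taken
    CAUTION = 2    # non-fatal error condition
    CRITICAL = 3   # component failure


def needs_attention(severity):
    """Caution and Critical events describe conditions that still need action."""
    return severity >= Severity.CAUTION


def mark_repaired(severity):
    """Lower a Caution or Critical event to Repaired; leave other levels unchanged."""
    return Severity.REPAIRED if needs_attention(severity) else severity


print(mark_repaired(Severity.CRITICAL).name)
```

Using an ordered enumeration keeps the "lower the severity" rule to a single comparison.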
Page 85
Viewing the Event List
1. From Compaq Insight Manager, select the appropriate server, then select View Device Data. The selected server displays, with buttons around its perimeter.
2. Select the Recovery button → Integrated Management Log.
3. If a failed component has been replaced, select the event from the list, then select Mark Repaired.
Printing the Event List
NOTE: You can only view the event list from the Recovery/Integrated Management Log screen as described above.
1. From the Insight Manager, select the appropriate server.
2. Select the Configuration button → Recovery button → Print.
Compaq Survey Utility
The Compaq Survey Utility is a serviceability tool available from Windows NT and Novell NetWare that delivers online-configuration capture and comparison to maximize server availability. It is delivered on the Compaq Management CD in the SmartStart package or is available on the Compaq website. Refer to the Compaq Management CD for information on installing and running the Compaq Survey Utility.
After running the Compaq Survey Utility, you can view the IML by loading the output of the utility (typically called “survey.txt”) into a text viewer such as Notepad. The event list follows the system slot information. Once you have opened the text file, you can print it using the print feature of the viewer.
Compaq IML Management Utility
The Compaq IML Management Utility is a DOS-based tool that gives you the off-line ability to review, mark corrected, and print events from the IML. It is located on the Compaq SmartStart and Support Software CD. Refer to the SmartStart Installation for Servers poster, which ships with the server, for information on how to install and use the IML Management Utility.
Page 86

Event List

The event list displays the affected components and the associated error messages. Though the same basic information is displayed, the format of the list may differ, depending on how you view it: on the Integrated Management Display, from within Compaq Insight Manager, the IML management utility, or the Compaq Survey Utility. An example of the format of an event (as displayed on the Integrated Management Display) is as follows:
**001 of 010**
---caution---
03/19/1997 12:54 PM FAN INSERTED Main System Location: System Board Fan ID: 03 **END OF EVENT**
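An event in the layout shown above can be split into its fields with a short parser. The Python sketch below is an illustration under stated assumptions: it assumes the text arrives exactly in the sample layout (index line, severity line, MM/DD/YYYY date, 12-hour time, free-form body, end marker), and the names `EVENT_RE` and `parse_event` are hypothetical. Other viewers may format events differently.

```python
import re
from datetime import datetime

# Pattern for the sample layout only; other viewers may use a different format.
EVENT_RE = re.compile(
    r"\*\*(?P<index>\d+) of (?P<total>\d+)\*\*\s+"
    r"---(?P<severity>\w+)---\s+"
    r"(?P<date>\d{2}/\d{2}/\d{4})\s+(?P<time>\d{1,2}:\d{2} [AP]M)\s+"
    r"(?P<body>.*?)\s*\*\*END OF EVENT\*\*",
    re.DOTALL,
)


def parse_event(text):
    """Parse one event; the time-stamp has one-minute granularity."""
    match = EVENT_RE.search(text)
    if match is None:
        raise ValueError("text does not look like an IML event")
    stamp = match.group("date") + " " + match.group("time")
    return {
        "index": int(match.group("index")),
        "total": int(match.group("total")),
        "severity": match.group("severity").lower(),
        "timestamp": datetime.strptime(stamp, "%m/%d/%Y %I:%M %p"),
        "body": " ".join(match.group("body").split()),
    }


SAMPLE = """**001 of 010**
---caution---
03/19/1997 12:54 PM FAN INSERTED Main System Location: System Board Fan ID: 03 **END OF EVENT**"""

event = parse_event(SAMPLE)
print(event["severity"], event["timestamp"].isoformat(), event["body"])
```

The body is kept as a single string because the guide does not define how the location and ID fields are delimited.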

Event Messages

Table 3-22
Event Messages

Event Type: Event Message

Machine Environment
  Fan Failure: System Fan Failure (Fan X, Location)
  Fan Inserted: System Fan Inserted (Fan X, Location)
  Fan Removed: System Fan Removed (Fan X, Location)
  Fans Not Redundant: System Fans Not Redundant
  Overheat Condition: System Overheating (Zone X, Location)
Main Memory
  Correctable Error threshold exceeded:
    Corrected Memory Error threshold passed (Slot X, Memory Module X)
    Corrected Memory Error threshold passed (System Memory)
    Corrected Memory Error threshold passed (Memory Module unknown)
  Uncorrectable Error:
    Uncorrectable Memory Error (Slot X, Memory Module X)
    Uncorrectable Memory Error (System Memory)
    Uncorrectable Memory Error (Memory Module unknown)
Page 87
Processor
  Correctable Error Threshold exceeded: Processor Correctable error Threshold passed (Slot X, Socket X)
  Uncorrectable Error:
  Host Bus Error:
    Unrecoverable Host Bus Data Parity Error
    Unrecoverable Host Bus Address Parity Error
  EISA Bus:
    EISA Expansion Bus Master Timeout (Slot X)
    EISA Expansion Bus Slave Timeout
    EISA Expansion Board Error (Slot X)
    EISA Expansion Bus Arbitration Error
  PCI Bus Error: PCI Bus Error (Slot X, Bus X, Device X, Function X)
Power Subsystem
  Power Supply Failure: System Power Supply Failure (Power Supply X)
  Power Supply Inserted: System Power Supply Inserted (Power Supply X)
  Power Supply Removed: System Power Supply Removed (Power Supply X)
  Power Supply Not Redundant: System Power Supplies Not Redundant
  System Configuration Battery Low: Real-Time Clock Battery Failing
  Power Module Failure:
    A CPU Power Module (System Board, Socket X)
    A CPU Power Module (Slot X, Socket X)
  Power Modules Not Redundant: System Power Modules Not Redundant
  AC Voltage Problem: System AC Power Problem (Power Supply X)
  Power AC Overload: System AC Power Overload (Power Supply X)
Automatic Server Recovery
  System Lockup: ASR Lockup Detected: Cause
Operating System
  System Crash:
    Blue Screen Trap: Cause [NT]
    Kernel Panic: Cause [UNIX]
    Abnormal Program Termination: Cause [NetWare]
  Automatic OS Shutdown:
    Automatic Operating System Shutdown Initiated Due to Fan Failure
    Automatic Operating System Shutdown Initiated Due to Overheat Condition
    Fatal Exception (Number X, Cause)
Page 88

Rapid Recovery Services

Compaq servers provide rapid recovery services for diagnosing and recovering from errors. These tools are available for local and remote diagnosis and recovery.
Rapid recovery means fast identification and resolution of complex faults. The Rapid Recovery Engine and Insight Management Agents notify the system administrator when a failure occurs, ensuring that the server experiences minimal downtime. You enable these features through the System Configuration Utility. These integrated server management features are:
Automatic Server Recovery-2 (ASR-2)
Server Health Logs (on servers not supporting Integrated Management Logs)
Storage Fault Recovery Tracking
Storage Automatic Reconstruction
Network Interface Fault Recovery Tracking
Memory Fault Recovery Tracking (with option upgrade kit)
These are discussed in more detail on the Systems Reference Library CD (SRL).

Automatic Server Recovery-2

Automatic Server Recovery-2 (ASR-2) lets the server restart automatically from the operating system or the Compaq Utilities. To use this feature, you must use the System Configuration Utility to install Compaq Utilities in the system partition.
You can tell ASR-2 to restart your server after a critical hardware or software error occurs. Using the Compaq System Configuration Utility, configure the system for either automatic recovery or for attended local or remote access to diagnostic and configuration tools.
You can also configure ASR-2 to page an administrator when the system restarts. ASR-2 depends on the application and driver that routinely notify the ASR-2 hardware of proper system operations. If the time between ASR-2 notifications exceeds the specified period, ASR-2 assumes a fault has occurred and initiates the recovery process.
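The notification scheme described above behaves like a watchdog timer: software checks in periodically, and a missed deadline is treated as a fault. The following Python sketch models only that timing rule. It is not the ASR-2 hardware or driver interface; the `Watchdog` class, its methods, and the use of seconds as the unit are all assumptions made for illustration.

```python
class Watchdog:
    """Software model of a watchdog timer (hypothetical, for illustration only)."""

    def __init__(self, timeout_s):
        self.timeout_s = timeout_s   # the configured recovery time-out
        self.last_notify = 0.0       # time of the most recent notification

    def notify(self, now):
        """Record that the driver reported proper system operation at time `now`."""
        self.last_notify = now

    def expired(self, now):
        """True when the time since the last notification exceeds the time-out."""
        return (now - self.last_notify) > self.timeout_s


wd = Watchdog(timeout_s=600)
wd.notify(now=100)
print(wd.expired(now=700), wd.expired(now=701))
```

Passing the current time in as an argument, rather than reading a clock inside the class, keeps the model deterministic and easy to test.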
To configure ASR-2:
1. Execute the System Configuration Utility.
2. Select View and Edit Details.
3. Set the software error recovery status to Enabled.
4. Set the software error recovery time-out.
Page 89
The available recovery features are:
Software Error Recovery – automatically restarts the server after a software-induced server failure
Environmental Recovery – allows the server to restart when temperature, fan, or AC power conditions return to normal
Unattended Recovery
For unattended recovery, ASR-2 performs the following actions:
Logs the error information to the IML
Resets the server
Pages you (if a modem is present and you selected Paging)
Tries to restart the operating system. Often the server restarts successfully, making unattended recovery the ideal choice for remote locations where trained service personnel are not immediately available.
If ASR-2 cannot restart the server within 10 attempts, it places a critical error in the Integrated Management Log, starts the server into Compaq Utilities, and enables remote access (if you configured remote access).
To use this level of ASR-2, you must configure ASR-2 to load the operating system after restart.
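The unattended recovery sequence can be summarized as a small decision flow. The Python sketch below is a simplified model, not ASR-2 firmware behavior: the function name, the `try_boot_os` callback, and the action strings are hypothetical, and it assumes that logging, reset, and paging happen once before the restart attempts (the guide does not say whether they repeat on every attempt). The limit of 10 attempts comes from the text above.

```python
MAX_RESTART_ATTEMPTS = 10  # limit stated in the guide


def unattended_recovery(try_boot_os, paging_enabled=False, remote_access_configured=False):
    """Return the ordered list of actions taken in this simplified model.

    try_boot_os is a hypothetical callback: it receives the attempt number
    and returns True if the operating system restarted successfully.
    """
    actions = ["log error information to the IML", "reset the server"]
    if paging_enabled:
        actions.append("page the administrator")
    for attempt in range(1, MAX_RESTART_ATTEMPTS + 1):
        if try_boot_os(attempt):
            actions.append(f"operating system restarted on attempt {attempt}")
            return actions
    # All attempts failed: fall back to Compaq Utilities.
    actions.append("log critical error to the IML")
    actions.append("start Compaq Utilities")
    if remote_access_configured:
        actions.append("enable remote access")
    return actions


for step in unattended_recovery(lambda attempt: attempt == 3, paging_enabled=True):
    print(step)
```

Modeling the boot attempt as a callback lets the same flow be exercised for both the success path and the fallback path.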
Attended Recovery
For attended recovery, ASR-2 performs the following actions:
Logs the error information to the IML
Resets the server
Pages you (if a modem is present and you selected Paging)
Starts Compaq Utilities from the hard drive
Enables remote access
During system configuration, these utilities are placed on the system utilities partition of the hard drive.
If you have configured for dial-in access and have a modem with an auto-answer feature installed, you can dial in and remotely diagnose or reconfigure the server.
If you have configured the Compaq Utilities for network access, you can access the utilities over the network. You can use Compaq Insight Manager for dial-in or network access.
Hardware Requirements
To use this level of ASR-2 over a modem, you need the following:
Compaq modem or optional Hayes compatible modem
System Configuration Utility and Diagnostics Utility installed on the system partition of the hard drive
ASR-2 configured to load Compaq Utilities after restart
You can also run Compaq Utilities remotely over an IPX or IP network using the Network feature:
To use Compaq Utilities on an IPX network, you must have Compaq Insight Manager 2.0 or later or an NVT (Novell Virtual Terminal) Terminal Emulator with VT100 or ANSI terminal capabilities.
To use Compaq Utilities on an IP network, you must have Compaq Insight Manager 2.10 or later, or a Telnet Terminal Emulator with VT100 or ANSI capabilities.
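The version requirements above can be checked with a short script. This is a sketch only: the function names are hypothetical, and it assumes version strings have the form major.minor, with the minor part compared as a whole number (so 2.10 is later than 2.0).

```python
from typing import Tuple

# Minimum Compaq Insight Manager versions taken from the requirements above.
MIN_INSIGHT_MANAGER = {"IPX": (2, 0), "IP": (2, 10)}


def parse_version(text: str) -> Tuple[int, int]:
    """Split 'major.minor' into integers (assumed version format)."""
    major, minor = text.split(".")
    return int(major), int(minor)


def insight_manager_ok(protocol: str, version: str) -> bool:
    """Return True if this Insight Manager version meets the minimum."""
    return parse_version(version) >= MIN_INSIGHT_MANAGER[protocol.upper()]
```

A workstation that does not meet the minimum can still reach Compaq Utilities with the terminal emulator named for that protocol (NVT for IPX, Telnet for IP).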
If you are notified that ASR-2 restarted the server and you have restarted to Compaq Utilities, use the Inspect Utility or Compaq Insight Manager to view the critical error in the Critical Error Log. Run Diagnostics to diagnose and resolve the problem.
You can configure ASR-2 to restart the server into Compaq Utilities to diagnose the critical error, or to start the operating system to return the server to operational status as rapidly as possible.
When you enable ASR-2 to start the operating system, the server tries to start from the primary partition. In this mode, ASR-2 can page you if a critical error occurs, but you cannot access Compaq Utilities.
When you enable ASR-2 to start Compaq Utilities, your server restarts after a critical error and loads Compaq Utilities from the system partition on the hard drive.
You can configure your server to start Compaq Utilities in four different ways:
Without remote console support; for example, to run Compaq Utilities from the server console only
With remote console support using modems for dial-in access
With remote console support using a modem to dial a predetermined telephone number
With remote console support through a network connection (IP or IPX)
Compaq Integrated Remote Console
The standard Compaq Integrated Remote Console performs a wide range of configuration activities. Some of the console's features include:
Accessible using ANSI terminal
Operates independently of the operating system
Provides for remote server reboot
Provides access to system configuration
Uses out-of-band communication with dedicated management modem installed in the server
For more information, see the Integrated Remote Console User Guide that shipped with your server.
IMPORTANT: Before configuring ASR-2, verify that the System Configuration Utility and Diagnostics software are installed on the system partition. ASR-2 must have this to start Compaq Utilities after a system restart. Compaq recommends this even if you configure ASR-2 to start the operating system.
Compaq Health Driver
The Compaq Health Driver continually resets the ASR-2 timer according to the frequency you specified in the System Configuration Utility (for example, 10 minutes). If the ASR-2 timer counts down to zero before being reset, due to an operating system crash or a server lock-up, ASR-2 restarts the server into either Compaq Utilities or the operating system (as indicated by the System Configuration parameters). The default value is 10 minutes. The allowable settings are 5, 10, 20, and 30 minutes.
For remote and off-site (unattended) servers, setting the software error recovery time-out for 5 minutes reduces server downtime and allows the server to recover quickly. For local (attended) servers located onsite, you can set the software error recovery time-out for 20 or 30 minutes, giving you time to arrive at the server if you wish to manually diagnose the problem.
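The timer behavior described above resembles a watchdog. The sketch below models it under stated assumptions: the class `AsrTimer` and its methods are hypothetical names, and time is advanced manually rather than read from hardware. The allowed and default time-out values are the ones listed in this section.

```python
ALLOWED_TIMEOUTS_MIN = (5, 10, 20, 30)  # settings listed in this section
DEFAULT_TIMEOUT_MIN = 10                # default stated in this section


class AsrTimer:
    """Simplified model of the ASR-2 countdown timer."""

    def __init__(self, timeout_min: int = DEFAULT_TIMEOUT_MIN):
        if timeout_min not in ALLOWED_TIMEOUTS_MIN:
            raise ValueError(f"time-out must be one of {ALLOWED_TIMEOUTS_MIN}")
        self.timeout_s = timeout_min * 60
        self.remaining_s = self.timeout_s

    def reset(self) -> None:
        """What the health driver does on each notification."""
        self.remaining_s = self.timeout_s

    def tick(self, seconds: int) -> bool:
        """Advance time; return True once the timer has reached zero."""
        self.remaining_s = max(0, self.remaining_s - seconds)
        return self.remaining_s == 0
```

As long as `reset()` is called before the countdown reaches zero, `tick()` keeps returning False. If the operating system stops responding and the resets stop, `tick()` eventually returns True, which corresponds to the point where ASR-2 restarts the server.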
The Compaq Health Driver is independent of the ASR-2 timer. You should load it and enable the ASR-2 timer. This allows the driver to detect and log information about numerous hardware and software errors in the IML. However, you cannot enable the ASR-2 timer without loading the Compaq Health Driver.
Before ASR-2 restarts the server, it records any information available about the condition of the operating system in the Critical Error Log, or the IML depending on the server support. This information can be used to diagnose an operating system crash or server lock-up, while still allowing the server to be restarted.
The following ASR-2 flow chart shows you the sequence of events after a hardware or software error occurs:
Figure 3-1. ASR-2 flow chart
1. Hardware/Software error occurs.
2. Error records in the Critical Error Log, or the Integrated Management Log, depending on your server configuration.
3. Operating System halts normal operation.
4. ASR Timer expires.
5. Server is reset.
6. If a modem is installed and paging is enabled, the Server Failure Notification pager alert is sent to the Server Administrator.
7. Unattended server boots the Operating System. ---Or--- Server boots the Compaq Utilities on the system partition on the hard drive.
If the server continues experiencing hardware/software errors and the number of ASR cycles exceeds the specified number of recovery attempts, the server logs an error to the Server Health Log or the Integrated Management Log and boots the Compaq Utilities from the system partition on the hard drive.
8. If a modem is installed, ASR puts the modem on auto answer so that the Server Administrator can dial in (using third party terminal emulator software) to remotely run the Compaq Utilities to identify the source of the fault. ---Or--- Local Server Administrator runs Compaq Utilities from server console to identify the source of the fault.
Booting into Compaq Utilities
When you enable ASR-2 to start into Compaq Utilities and a critical error occurs, the operating-system-specific Health Driver logs the error information in the Critical Error Log or the IML and the ASR-2 feature restarts the server. When the system reinitializes, the system pages the designated administrator (if enabled), and starts Compaq Utilities from the hard drive.
If Dial-In status is enabled, the modem is placed in auto-answer mode. If you enable Dial-Out status, you are automatically enabled for Dial-In.
If Network Status is enabled, the appropriate network support software is loaded, depending on the network protocol, IP or IPX. This allows remote access via the network.
IMPORTANT: Compaq Utilities are loaded from a specially created system partition on the hard drive. This partition was configured during server configuration.
You can access the server and view the Server Health Logs (in servers not supporting the IML) remotely by modem, in-band over the network, or directly from the server. For modem access, you must have either Compaq Insight Manager 2.0 or above or have a VT100 or ANSI terminal type device. You may use a standard CRT with VT100 or ANSI emulation capability, or you may use a PC with a VT100 or ANSI terminal emulation package. The communication parameters must be set for 8 data bits, no parity, and 1 stop bit.
You can also enable ASR-2 to allow network access using the Network Status feature in the System Configuration Utility. You must have either Compaq Insight Manager 2.0 or greater or a Novell Virtual Terminal (NVT) emulator on an IPX network to use this feature. You must also have version 2.24 or later of the System Configuration Utility. For IP access, you must have Compaq Insight Manager 2.10 or later, or a Telnet Terminal emulator to use this feature. You also must have version 2.24 or later of the System Configuration Utility.
The System Configuration Utility settings should resemble the settings in Table 3-23 when you enable ASR-2 to start into Compaq Utilities.
Table 3-23
Compaq System Configuration Utility Pager Settings for Booting into Compaq Utilities

Pager Data | Setting | Description
Pager status | Enabled | Indicates if the pager feature is enabled or disabled.
Pager dial string | ATDT 555-5555 | Indicates the pager dial string and delay before the pager message. Pagers typically use one of the following formats: Local pagers: ATDT 555-5555. Wide area pagers: ATDT 1-800-555-5555,1234567#
Pager message | 1234567# | Represents a unique number (maximum seven digits, numeric only) that you must designate to identify the server on your pager display. The ROM adds a three-digit code to the front of this number. The first two indicate the subsystem and the third indicates the severity of the error that caused the alert. The # symbol usually terminates the message. If no message is required, delete the # symbol.
Pager test | Select to test pager setup | Use this to test the current pager settings. Press Enter to dial the pager number, and the pager message (if present) displays. You must configure the computer before testing the pager and the Pager Status must be set to Enabled. Do not test the pager if you are running remotely and are using only one modem.
Serial interface | COM1 | Select the communications port for the modem used by the pager and the remote ASR-2 functions. The options are COM1 and COM2.
Dial-in status | Enabled | Set Dial-In Status to Enabled. Be sure the Reset Boot option is set to Boot Compaq Utilities. When the system starts because of an ASR reset, it starts to the Compaq Utilities, sets the Management Modem to auto-answer, and waits for the administrator to dial in and run the Compaq Utilities. You automatically disable this option when you configure the software error recovery start option to Boot Operating System. When ASR pages you, you cannot dial in unless ASR-2 exceeds 10, the threshold number of server restart retries. When this happens, ASR-2 restarts the server into the Compaq Utilities and places the modem in auto-answer mode.
Dial-out status | Enabled | Allows ASR-2 to dial out to a remote workstation. If you selected this option, Dial-In Status is automatically selected.
Dial-out string | 555-1234 | Enter the dial string followed by the remote computer telephone number. To use the dial-out feature, set Dial-Out Status to Enabled and set the Dial-Out String to the correct phone number. You must also set the Reset Boot option to Boot Compaq Utilities. When the system restarts because of an ASR reset, the administrator is paged via Pager Status and Pager Dial String, the system restarts to the Compaq Utilities, and dials out to the phone number provided in the Dial-Out string. The dial-out number will be tried five times. If it fails to connect after five attempts, the modem is put in auto-answer mode.
Network status | Enabled | To allow network access to Compaq Utilities, set Network Status to Enabled and make sure the Reset Boot option is set to Boot Compaq Utilities.
Network protocol | | To use IPX network access, set Network Protocol to IPX. When the system restarts to the Compaq Utilities because of an ASR reset, it loads IPX network support. This enables remote access via NVT. To use IP network access, set Network protocol to IP. Also make sure to set Network IP address, Network IP net mask, and Network IP router address. When the system restarts to the Compaq Utilities because of an ASR reset, it loads IP network support. This enables remote access via Telnet. NOTE: The Network Status must be set to Enabled for network access.
Network controller | Compaq | For all Compaq Standard Network Controllers.
Network host name | CPQHOU | Enter the network name of the server. Use underscores instead of spaces within the name, for example, Compaq_Server. If you are using IPX network access to the Compaq Utilities, this server name is used to advertise NVT host services. This server name displays in the Compaq Insight Manager server list when it determines it can communicate via NVT. Set this name to be the same as the server name you assign when the host OS is running.
Network card slot | Slot # | Select the slot number of the network interface card you wish to use for network access to Compaq Utilities.
Network frame type | ETHERNET_II | Select the frame type for your network. Selections include both Ethernet and Token Ring topologies.
Network IP address | | Enter the IP address for this server in standard dot notation. NOTE: This is not used if you select Custom for Network controller. You must enter your IP address in the NET.CFG file that you load into the system partition.
Network IP net mask | | Enter the net mask for this server in standard dot notation. NOTE: This is not used if you select Custom for network controller. You must enter your IP address in the NET.CFG file that you load into the system partition.
Network IP router address | | Enter the router to be used for this server in standard dot notation. NOTE: This is not used if you select Custom for network controller. You must enter your IP address in the NET.CFG file that you load into the system partition.
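Before entering the three IP values, it can help to confirm that each is valid dot notation and that the router lies on the server's subnet. The sketch below uses the Python standard library. The function name is hypothetical, the same-subnet check is a general networking sanity check rather than a requirement stated in this guide, and the addresses in the usage note are examples only.

```python
import ipaddress


def check_ip_settings(address: str, net_mask: str, router: str) -> bool:
    """Return True if the router is on the same subnet as the server.

    Raises ValueError if any value is not valid dot notation.
    """
    # strict=False lets a host address (not only a network address)
    # be combined with the mask to derive the subnet.
    network = ipaddress.IPv4Network(f"{address}/{net_mask}", strict=False)
    return ipaddress.IPv4Address(router) in network
```

For example, `check_ip_settings("192.168.1.20", "255.255.255.0", "192.168.1.1")` returns True, while a router of 10.0.0.1 with the same address and mask returns False.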
If you configure the server to boot into Compaq Utilities, it prepares for remote communications. You can remotely run Diagnostics software, the Inspect Utility, or the System Configuration Utility using a workstation running terminal emulation software, such as Compaq Insight Manager or PC Anywhere.
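The Pager message entry in Table 3-23 states that the ROM places a three-digit code in front of the server number: two digits for the subsystem and one for the severity. A pager display could be split along those lines as sketched below. The function name and the sample digits are hypothetical, and this guide does not list the code values themselves, so the sketch returns them undecoded.

```python
def decode_pager_display(display: str) -> dict:
    """Split a pager display into subsystem, severity, and server number."""
    digits = display.rstrip("#")  # the # symbol usually terminates the message
    # Three code digits plus a server number of one to seven digits.
    if not digits.isdigit() or not 4 <= len(digits) <= 10:
        raise ValueError("expected 3 code digits plus a 1-7 digit server number")
    return {
        "subsystem": digits[:2],  # first two digits
        "severity": digits[2],    # third digit
        "server_id": digits[3:],  # number entered as the Pager message
    }
```

With the sample Pager message 1234567# and an illustrative code of 123, the display 1231234567# splits into subsystem 12, severity 3, and server number 1234567.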
Booting into the Operating System
When you enable ASR-2 to restart into the operating system and a critical error occurs, ASR-2 logs the error in the Critical Error Log or IML and restarts the server. The system ROM pages the designated administrator, then executes the normal restart process.
IMPORTANT: When you enable ASR-2 to restart into the operating system, Modem Dial-In Status, Network Status, and Modem Dial-Out Status are automatically disabled. In this mode, ASR-2 can page you if a critical error occurs, but you cannot access the server, and the server cannot dial out to a remote workstation.
If the ASR-2 feature cannot restart the server within 10 attempts, it logs a critical error in the Critical Error Log or IML, restarts the server into the Compaq Utilities, and puts the modem into auto-answer mode.
Your System Configuration Utility settings should resemble the following when you enable ASR-2 to restart into the operating system:
Serial interface COM1
Dial-in status Disabled
Dial-out status Disabled
Dial-out string 555-1234
Network status Disabled
Network protocol IPX
Network controller Compaq
Network host name CPQHOU
Network card slot Slot #
Network frame type ETHERNET_II
Network IP address xxx.xxx.xxx.xxx
Network IP net mask xxx.xxx.xxx.xxx
Network IP router address xxx.xxx.xxx.xxx
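The dependencies between these settings can be expressed as a small consistency check. This is an illustrative sketch: the dictionary keys and the function name are assumptions, and the System Configuration Utility applies these rules automatically, so the code only restates them.

```python
def check_asr_settings(settings: dict) -> list:
    """Return a list of rule violations; an empty list means consistent."""
    problems = []
    # Hypothetical key: "reset_boot" is "os" or "utilities".
    if settings["reset_boot"] == "os":
        # Booting the operating system disables all remote access options.
        for key in ("dial_in", "dial_out", "network"):
            if settings.get(key):
                problems.append(f"{key} must be disabled when booting the operating system")
    # Enabling dial-out automatically enables dial-in.
    if settings.get("dial_out") and not settings.get("dial_in"):
        problems.append("dial_out requires dial_in")
    return problems
```

The settings listed above (dial-in, dial-out, and network status all Disabled with the operating system as the boot target) produce an empty list.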
ASR-2 Security
The standard Compaq password features function differently during ASR-2 than during a typical system startup.
During ASR-2, the system does not prompt for the Power-On Password. This allows ASR-2 to restart the operating system or Compaq Utilities without user intervention.
To maintain system security, set the server to boot in Network Server Mode (an option in the System Configuration Utility). This option ensures that the server keyboard is locked until you enter the Keyboard Password.
Select an Administrator Password (an option in the System Configuration Utility). During attended ASR-2 (local or remote), you must enter this Administrator Password before any modifications can be made to the server configuration.

Server Health Logs

In some servers, Server Health Logs are replaced by the IML, if it is supported. See “Integrated Management Display” in this chapter for more information.
Server Health Logs contain information to help identify and correct any server failures and correlate hardware changes with server failure. Server Health Logs are stored in nonvolatile RAM and consist of the Critical Error Log and the Revision History Table.
If errors occur, information about the errors is automatically stored in the Critical Error Log.
Whenever boards or components (that support revision tracking) are updated to a new revision, the Revision History Table is updated.
Critical Error Log
The Critical Error Log records memory errors, as well as catastrophic hardware and software errors that cause the system to fail. This information helps you quickly identify and correct the problem, thus minimizing downtime.
You can view the Critical Error Log through the Compaq Insight Manager. The Diagnostics Utility either resolves the error or suggests corrective action in systems that do not support event logs.
The Critical Error Log identifies and records the following errors. Each error type is briefly explained below.
Table 3-24
Critical Error Log Messages

Message | Description
Abnormal Program Termination | The operating system has encountered an abnormal situation that has caused a system failure.
ASR-2 detected by ROM | An ASR-2 activity has been detected and logged by the system ROM.
ASR-2 Test Event | The System Configuration Utility generated a test alert.
Automatic Server Recovery Base Memory Parity Error | The system detected a data error in base memory following a reset due to the Automatic Server Recovery-2 (ASR-2) timer expiration.
Automatic Server Recovery Extended Memory Parity Error | The system detected a data error in extended memory following a reset due to the ASR-2 timer expiration.
Automatic Server Recovery Memory Parity Error | The system ROM was unable to allocate enough memory to create a stack. Then, it was unable to put a message on the screen or continue booting the server.
Automatic Server Recovery Reset Limit Reached | The maximum number of system resets due to ASR-2 timer expiration has been reached, resulting in the loading of Compaq Utilities.
Battery Failing | Low system battery warning. Replace battery within 7 days to prevent loss of nonvolatile configuration memory. Failure of the battery supporting the system's nonvolatile RAM is imminent.
Caution: Temperature Exceeded | The operating system has detected that the temperature of the system has exceeded the caution level. Accompanying data in the log notes if an auto-shutdown sequence has been invoked by the operating system.
Diagnostic Error | An error was detected by the Diagnostics Utility. See the specific error code in this chapter for a detailed explanation.
Error Detected On Boot Up | The server detected an error during the Power-On Self-Test (POST).
Processor Prefailure | A CPU has passed an internal corrected error threshold; excessive internal ECC cache errors.
NMI - PCI Bus Parity Error | A parity error was detected on the PCI bus.
NMI - Expansion Board Error | A board on the expansion bus indicated an error condition, resulting in a server failure.
NMI - Expansion Bus Master Time-Out | A bus master expansion board in the indicated slot did not release the bus after its maximum time, resulting in a server failure.
NMI - Expansion Bus Slave Time-Out | A board on the expansion bus delayed a bus cycle beyond the maximum time, resulting in a server failure.
NMI - Fail-Safe Timer Expiration | Software was unable to reset the system fail-safe timer, resulting in a server failure.
Processor Exception | The indicated processor exception occurred.
NMI - Processor Parity Error | The processor detected a data error, resulting in a server failure.
Server Manager Failure | An error occurred with the Server Manager/R.
NMI - Software Generated Interrupt Detected Error | Software indicated a system error, resulting in a server failure.
Caution: Temperature Exceeded | The operating system has detected that the temperature of the system has exceeded the caution level. Accompanying data in the log notes if an auto-shutdown sequence has been invoked by the operating system.
Abnormal Program Termination | The operating system has encountered an abnormal situation that has caused a system failure.
ASR-2 Test Event | The System Configuration Utility generated a test alert.
NMI - Automatic Server Recovery Timer Expiration | The operating system has received notice of an impending ASR-2 timer expiration.
Required System Fan Failure | The required system fan has failed. Accompanying data in the log notes if an auto-shutdown sequence has been invoked by the operating system.
UPS A/C Line Failure Shutdown or Battery Low | The UPS notified the operating system that the AC power line has failed. Accompanying data indicates if an auto-shutdown sequence has been invoked or if the battery has been nearly depleted.
ASR-2 detected by ROM | An ASR-2 activity has been detected and logged by the system ROM.
Revision History Table
Some errors can be resolved by reviewing changes to the server configuration. The server has an Automatic Revision Tracking (ART) feature that helps you review recent changes to the server configuration.
One ART feature is the Revision History Table, which contains the hardware version number of the system board and any other system boards providing ART-compatible revision information. This feature lets you determine the level of functionality of an assembly in a system without opening or powering down the unit.
Table 3-25
Revision History Format

Current Revisions
Date 10/31/95
System Board Revision 03
  Assembly Version 1
  Functional Revision Level C
Processor 01 Revision 01
  Assembly Version 1
  Functional Revision Level A

Previous Revisions
Date 03
System Board Revision 03
  Assembly Version 1
  Functional Revision Level C
Processor 01 Revision 01
  Assembly Version 1
  Functional Revision Level A
The Revision History Table is stored in nonvolatile RAM and is accessed through the Inspect Utility and Compaq Insight Manager.
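When correlating a failure with a hardware change, the useful step is comparing the current revision record with the previous one. The sketch below shows that comparison on two hypothetical records. The field names mirror Table 3-25, but the values and the function name are illustrative and are not read from a real server.

```python
def changed_fields(current: dict, previous: dict) -> dict:
    """Map each differing field to a (previous, current) pair."""
    changes = {}
    for field in sorted(set(current) | set(previous)):
        old, new = previous.get(field), current.get(field)
        if old != new:
            changes[field] = (old, new)
    return changes


# Hypothetical records for illustration only.
current = {"System Board Revision": "03", "Functional Revision Level": "C"}
previous = {"System Board Revision": "02", "Functional Revision Level": "C"}
```

Here `changed_fields(current, previous)` reports only the System Board Revision, which moved from 02 to 03.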