Compaq ProLiant 800 Servers Supporting Pentium II Processors and 100 MHz System Bus
Maintenance and Service Guide
First Edition (September 1998)
Part Number 320984-001
Spare Part Number 320981-001
Compaq Computer Corporation
Notice
The information in this publication is subject to change without notice.
COMPAQ COMPUTER CORPORATION SHALL NOT BE LIABLE FOR TECHNICAL OR
EDITORIAL ERRORS OR OMISSIONS CONTAINED HEREIN, NOR FOR INCIDENTAL OR
CONSEQUENTIAL DAMAGES RESULTING FROM THE FURNISHING, PERFORMANCE, OR
USE OF THIS MATERIAL. THIS INFORMATION IS PROVIDED “AS IS” AND COMPAQ
COMPUTER CORPORATION DISCLAIMS ANY WARRANTIES, EXPRESS, IMPLIED OR
STATUTORY AND EXPRESSLY DISCLAIMS THE IMPLIED WARRANTIES OF
MERCHANTABILITY, FITNESS FOR PARTICULAR PURPOSE, GOOD TITLE AND AGAINST
INFRINGEMENT.
This publication contains information protected by copyright. No part of this publication may be
photocopied or reproduced in any form without prior written consent from Compaq Computer
Corporation.
© 1998 Compaq Computer Corporation.
All rights reserved. Printed in the U.S.A.
The software described in this guide is furnished under a license agreement or nondisclosure agreement.
The software may be used or copied only in accordance with the terms of the agreement.
Compaq, Deskpro, Fastart, Compaq Insight Manager, Systempro, Systempro/LT, ProLiant, ROMPaq,
QVision, SmartStart, NetFlex, QuickFind, PaqFax, ProSignia, registered United States Patent and
Trademark Office.
Netelligent, Systempro/XL, SoftPaq, QuickBlank, QuickLock are trademarks and/or service marks of
Compaq Computer Corporation.
Microsoft, MS-DOS, Windows, and Windows NT are registered trademarks of Microsoft Corporation.
Other product names mentioned herein may be trademarks and/or registered trademarks of their respective
companies.
Compaq ProLiant 800 Servers Supporting Pentium II Processors and 100 MHz System Bus
Contents
About This Guide
Symbols in Text
SCSI Drive Power Cable
Index
Compaq ProLiant 800 Servers Maintenance and Service Guide
About This Guide
This Maintenance and Service Guide is a troubleshooting guide that can be used for reference
when servicing Compaq ProLiant 800 Servers.
IMPORTANT: The installation of options and servicing of this product shall be performed
by individuals that are knowledgeable of the procedures, precautions, and hazards
associated with equipment containing hazardous energy circuits.
Symbols in Text
These symbols may be found in the text of this guide. They have the following meanings.
WARNING: To reduce the risk of personal injury from electrical shock and
hazardous energy levels, only authorized service technicians should attempt to
repair this equipment. Improper repairs could create conditions that are
hazardous.
WARNING: Indicates that failure to follow directions in the warning could result
in bodily harm or loss of life.
CAUTION: Indicates that failure to follow directions could result in damage to
equipment or loss of information.
IMPORTANT: Presents clarifying information or specific instructions.
NOTE: Presents commentary, sidelights, or interesting points of information.
Compaq Technician Notes
WARNING: Only authorized technicians trained by Compaq should attempt to
repair this equipment. All troubleshooting and repair procedures are detailed to
allow only subassembly/module level repair. Because of the complexity of the
individual boards and subassemblies, no one should attempt to make repairs at
the component level or to make modifications to any printed wiring board.
Improper repairs can create a safety hazard. Any indications of component
replacement or printed wiring board modifications may void any warranty.
WARNING: To reduce the risk of personal injury from electrical shock and
hazardous energy levels, do not exceed the level of repair specified in these
procedures. Because of the complexity of the individual boards and
subassemblies, do not attempt to make repairs at the component level or to make
modifications to any printed wiring board. Improper repairs could create conditions
that are hazardous.
WARNING: To reduce the risk of electric shock or damage to the equipment:
■ If the system has multiple power supplies, disconnect power from the system
by unplugging all power cords from the power supplies.
■ Do not disable the power cord grounding plug. The grounding plug is an
important safety feature.
■ Plug the power cord into a grounded (earthed) electrical outlet that is easily
accessible at all times.
CAUTION: To properly ventilate your system, you must provide at least 12
inches (30.5 cm) of clearance at the front and back of the computer.
CAUTION: The computer is designed to be electrically grounded. To ensure
proper operation, plug the AC power cord into a properly grounded AC outlet only.
Where to Go for Additional Help
In addition to this guide, the following information sources are available:
■ User Documentation
■ Compaq Service Quick Reference Guide
■ Service Training Guides
■ Compaq Service Advisories and Bulletins
■ Compaq QuickFind
■ Compaq Insight Manager
■ Compaq Download Facility: Call 1-281-518-1418
Telephone Numbers
For the name of your nearest Compaq Authorized Reseller:
In the United States, call 1-800-345-1518
In Canada, call 1-800-263-5868
For Compaq technical support:
In the United States and Canada, call 1-800-386-2172
For Compaq technical support phone numbers outside the United States and Canada, visit the
Compaq website at:
http://www.compaq.com
Chapter 1
Illustrated Parts Catalog
This chapter provides the illustrated parts breakdown and a spare parts list for Compaq ProLiant 800 Servers. See Table 1-1 for the names of referenced spare parts.
Mechanical Parts Exploded View
Figure 1-1. ProLiant 800 Servers mechanical parts exploded view
System Components Exploded View
Figure 1-2. ProLiant 800 Servers system components exploded view
Spare Parts List
Table 1-1
Spare Parts List
Ref  Description                                            Spare Part #
CHASSIS
1    Chassis                                                320979-001
2    Panel Kit                                              320973-001
     a) U-Channel Access Panel
     b) Left Side Access Panel
     c) Top Access Panel
3    Feet                                                   333575-001
4    Front Bezel                                            298012-001
5    Drive Cage                                             320975-001
SYSTEM COMPONENTS
6    Power Supply, 325W                                     320976-001
7    External Replacement Battery (4.5 V)                   160274-001
8    Power Switch and Cable Assembly                        320974-001
9    Processor with Heat Sink 350/100                       313623-001
10   Processor with Heat Sink 400/100                       313624-001 *
11   Processor with Heat Sink 450/100                       179780-001 *
12   Processor Power Module                                 327660-001
13   Processor Retention Bracket                            333575-001
BOARDS
14   System Board                                           320978-001
15   Riser Board with Tray                                  320977-001
FANS
16   I/O Fan Assembly                                       327308-001
17   Processor Fan With Bracket                             326873-001
MEMORY
18   64-MB DIMM (SDRAM, Reg 100 MHz)                        317745-001
MASS STORAGE DEVICES
19   1.44-MB Diskette Drive                                 160788-201
20   IDE CD-ROM Drive                                       328369-001
CABLES
21   Data Cable Kit                                         386559-001 *
     a) Wide SCSI cable
     b) Diskette drive cable
     c) 40-position SCSI cable
22   Adapter Cable                                          186423-001 *
MISCELLANEOUS
23   Return Kit                                             298017-001 *
24   Carton and Buns (International)                        298017-002 *
25   Maintenance and Service Guide                          320981-001 *
26   Illustrated Parts Map                                  320980-001 *
OPTIONS
27   32-MB DIMM (SDRAM, Reg 100 MHz)                        317747-001 *
28   128-MB DIMM (SDRAM, Reg 100 MHz)                       317756-001 *
29   256-MB DIMM (SDRAM, Reg 100 MHz)                       317749-001 *
30   9.1-GB Non-Pluggable Wide Ultra 1.6-inch Hard Drive    199886-001 *
31   9.1-GB Non-Pluggable Fast-SCSI-2 Hard Drive            199885-001 *
32   4.3-GB Non-Pluggable Wide Ultra 1-inch Hard Drive      242606-001 *
33   4.3-GB Non-Pluggable Fast-Wide SCSI-2 Hard Drive       199599-001 *
34   4.3-GB Non-Pluggable Fast-SCSI-2 Hard Drive            199585-001 *
35   4.3-GB Wide Ultra SCSI Hard Drive (7,200 rpm)          339514-001 *
36   4.3-GB Wide Ultra SCSI Hard Drive (10,000 rpm)         336383-001 *
37   9.1-GB Wide Ultra SCSI Hard Drive                      339515-001 *
38   18.2-GB Wide Ultra SCSI Hard Drive                     336385-001 *
* Not Shown
Chapter 2
Removal and Replacement Procedures
This chapter provides subassembly/module-level removal and replacement procedures for
Compaq ProLiant 800 Servers. After completing all necessary removal and replacement
procedures, run the Diagnostics program to verify that all components operate properly.
To service Compaq ProLiant 800 Servers, you might need the following:
■ Torx T-15 screwdriver
■ From the Compaq SmartStart and Support Software CD:
❏ System Configuration Utility software
❏ Drive Array Advanced Diagnostics software
❏ Diagnostics software
Electrostatic Discharge Information
A discharge of static electricity can damage static-sensitive devices or microcircuitry. Proper
packaging and grounding techniques are necessary precautions to prevent damage. To prevent
electrostatic damage, observe the following precautions:
■ Transport products in static-safe containers such as conductive tubes, bags, or boxes.
■ Keep electrostatic-sensitive parts in their containers until they arrive at
static-free stations.
■ Cover work stations with approved static-dissipating material. Provide a wrist strap
connected to the work surface and properly grounded tools and equipment.
■ Keep work area free of non-conductive materials such as ordinary plastic assembly aids
and foam packing.
■ Make sure you are always properly grounded when touching a static-sensitive component
or assembly.
■ Avoid touching pins, leads, or circuitry.
■ Always place drives PCB assembly side down.
■ Use conductive field service tools.
Symbols in Equipment
WARNING: Any surface or area of the equipment marked with these
symbols indicates the presence of a hot surface or hot component. If this
surface is contacted, the potential for injury exists. To reduce the risk of
injury from a hot component, allow the surface to cool before touching.
WARNING: Any surface or area of the equipment marked with these
symbols indicates the presence of electrical shock hazards. The
enclosed area contains no operator serviceable parts. To reduce the risk
of injury from electrical shock hazards, do not open this enclosure.
WARNING: Any RJ-45 receptacle marked with these symbols
indicates a Network Interface Connection. To reduce the risk of electrical
shock, fire, or damage to the equipment, do not plug telephone or
telecommunications connectors into this receptacle.
WARNING: This label or equivalent is located on the surface of your
CD-ROM drive. This label indicates that the product is classified as a
CLASS 1 LASER PRODUCT.
Preparation Procedures
Before beginning any of the removal and replacement procedures for non-hot-plug devices:
1. Turn off the server.
2. Disconnect the AC power cord from the AC outlet, then from the server.
3. Disconnect all external peripheral devices from the server.
4. For some removal and replacement procedures, you must remove the server from the rack
and place it on a sturdy table or workbench. Refer to the ProLiant 800 Servers Supporting
Pentium II Processors and 100 MHz System Bus Setup and Installation Guide for
instructions.
CAUTION: Electrostatic discharge can damage electronic components. Be sure
you are properly grounded before beginning any installation procedure. See the
section titled “Electrostatic Discharge Information” in this chapter for
more information.
Server Warnings and Precautions
WARNING: To reduce the risk of personal injury from hot surfaces, allow the
internal system components to cool before touching.
WARNING: To reduce the risk of electric shock or damage to the equipment:
■ Do not disable the power cord grounding plug. The grounding plug is an
important safety feature.
■ Plug the power cord into a grounded (earthed) electrical outlet that is easily
accessible at all times.
■ Disconnect power from the server by unplugging the power cord from either
the electrical outlet or the server.
CAUTION: Protect the server from power fluctuations and temporary
interruptions with a regulating uninterruptible power supply (UPS). This device
protects the hardware from damage caused by power surges and voltage spikes
and keeps the system in operation during a power failure.
CAUTION: Compaq ProLiant 800 Servers must always be operated with the
system unit cover on. Proper cooling will not be achieved if the system unit cover
is removed.
Front Bezel Door
The front bezel door is removed for replacement or to convert the unit from tower to rack. To
access the front of the server, open the front bezel door.
To remove the front bezel door:
1. Unlock the front bezel door.
2. Open the front bezel door.
3. Lift up the front bezel door, then pull it away from the chassis.
Figure 2-1. Removing the front bezel door
Reverse steps 1 through 3 to replace the front bezel door.
Left Side Access Panel
Remove the left side access panel to access the IDE CD-ROM drive, power supply, riser board
with tray, power switch and cable assembly, and 1.44-MB diskette drive cables.
To remove the left side access panel:
WARNING: To reduce the risk of personal injury from hot surfaces, allow the
internal system components to cool before touching them.
NOTE: The illustration below shows the tower model. Procedures are the same for the
rack-mountable model when removed from the rack.
1. Perform the preparation procedures. See “Preparation Procedures” earlier in this chapter.
2. Unlock and open the front bezel door.
3. Loosen the two thumbscrews securing the left side access panel to the front of
the chassis.
4. Slide the left side access panel back, then pull it away from the chassis.
Figure 2-2. Removing the left side access panel
Reverse steps 1 through 4 to replace the left side access panel.
Feet
The feet are removed for replacement, to replace the U-channel access panel, or to convert the
unit from tower to rack.
To remove the feet from the chassis, one at a time:
1. Perform the preparation procedures. See “Preparation Procedures” earlier in this chapter.
2. Remove the front bezel door. See “Front Bezel Door” earlier in this chapter.
3. Place the server on its left side.
4. Remove the T-15 screw from each foot.
5. Pivot each foot down, then pull it off the base of the chassis.
Figure 2-3. Removing the feet from the chassis
Reverse steps 1 through 5 to replace the feet. Make sure each foot snaps securely
into its position.
U-Channel Access Panel
The U-channel access panel is removed only for replacement. It does not need to be removed to
access any other parts.
To remove the U-channel access panel:
1. Perform the preparation procedures. See “Preparation Procedures” earlier in this chapter.
2. Remove the front bezel door. See “Front Bezel Door” earlier in this chapter.
3. Remove the feet on the base of the U-channel access panel. See “Feet” earlier in this
chapter.
4. Remove the two T-15 screws securing the U-channel access panel to the front of the
chassis.
5. Pull the U-channel access panel back, then away from the chassis.
Figure 2-4. Removing the U-channel access panel
Reverse steps 1 through 5 to replace the U-channel access panel.
Top Access Panel
Remove the top access panel to access PCI and ISA boards, IDE CD-ROM drive cables,
the I/O fan with bracket, and the power switch.
WARNING: To reduce the risk of personal injury from hot surfaces, allow the
internal system components to cool before touching them.
To remove the top access panel:
1. Perform the preparation procedures. See “Preparation Procedures” earlier in this chapter.
2. Unlock and open the front bezel door.
3. Loosen the thumbscrew securing the top access panel to the chassis.
4. Slide the top access panel back, then away from the chassis.
Figure 2-5. Removing the top access panel
Reverse steps 1 through 4 to replace the top access panel.
Removable Media and Mass Storage Devices
Compaq ProLiant 800 Servers ship standard with four removable media and four mass storage
device bays. The removable media bays contain a one-third height, 1.44-MB diskette drive, a
one-half height IDE CD-ROM drive, and two open bays. The open bays may be used for a
second CD-ROM drive, tape drives, hard drives, or any SCSI device. The four mass storage bays
can contain 1-inch or 1.6-inch non-hot-plug drives. Figure 2-6 and Table 2-1 depict the standard
drive configuration.
Figure 2-6. Removable media and mass storage device bays
Table 2-1
Description of Removable Media and
Mass Storage Device Bays
Drive Position   Configuration
0-3              Hard Drive Bays
4-5              Removable Media Bays
A                IDE CD-ROM Drive
B                1.44-MB Diskette Drive
Non-Hot-Plug Drive Cage
The non-hot-plug drive cage is removed for replacement or to access the non-hot-plug
hard drives.
To remove the non-hot-plug drive cage:
1. Perform the preparation procedures. See “Preparation Procedures” earlier in this chapter.
2. Unlock and remove the front bezel door. See “Front Bezel Door” earlier in this chapter.
3. Remove the left side access panel. See “Left Side Access Panel” earlier in this chapter.
4. Disconnect all cables from any installed drives in the drive cage.
5. Remove the four T-15 screws that secure the drive cage to the chassis.
6. Slide the drive cage out the front of the chassis.
Figure 2-7. Removing the non-hot-plug drive cage
Reverse steps 1 through 6 to replace the non-hot-plug drive cage.
CAUTION: Make sure that all power and signal cables to the non-hot-plug drive
cage have been reseated properly.
IDE CD-ROM Drive
The IDE CD-ROM drive is removed for replacement or to reconfigure the drives in the removable
media area.
To remove the IDE CD-ROM drive:
1. Perform the preparation procedures. See “Preparation Procedures” earlier in this chapter.
2. Remove the left side access panel. See “Left Side Access Panel” earlier in this chapter.
3. Remove the two T-15 screws and washers from the front of the drive.
4. Disconnect all cables from the CD-ROM drive.
5. Slide the CD-ROM drive out the front of the chassis.
Figure 2-8. Removing the IDE CD-ROM drive
Reverse steps 1 through 5 to replace the IDE CD-ROM drive.
1.44-MB Diskette Drive
The 1.44-MB diskette drive is removed for replacement only.
To remove the 1.44-MB diskette drive:
1. Perform the preparation procedures. See “Preparation Procedures” earlier in this chapter.
2. Remove the top access panel. See “Top Access Panel” earlier in this chapter.
3. Disconnect all cables from the diskette drive.
4. Remove the two T-15 screws and washers from the front of the drive.
5. Slide the diskette drive out the front of the chassis.
Figure 2-9. Removing the 1.44-MB diskette drive
Reverse steps 1 through 5 to replace the 1.44-MB diskette drive.
Cable Folding and Routing Diagrams
Figure 2-10. IDE CD-ROM drive cable
Figure 2-11. 1.44-MB diskette drive cable
Figure 2-12. Hard drive cage cable (Wide SCSI cable)
Riser Board with Tray
The riser board with tray seats any installed PCI and ISA boards. Figure 2-13 depicts the layout
of the PCI and PCI/ISA slots on the riser board.
Figure 2-13. Riser board expansion board slots
Table 2-2
Riser Board Expansion Board Slot Descriptions
Slot   Description
1      PCI/ISA expansion board slots
2      PCI expansion board slots
The riser board with tray is removed for replacement and to access the expansion board slots or
the riser board connectors.
To remove the riser board with tray:
1. Perform the preparation procedures. See “Preparation Procedures” earlier in this chapter.
2. Remove the top access panel. See “Top Access Panel” earlier in this chapter.
3. Disconnect all cables from the expansion boards.
4. Remove the left side access panel. See “Left Side Access Panel” earlier in this chapter.
5. Remove any installed boards. Place them on a non-conductive work surface.
6. Remove the retaining screws.
7. Slide the riser board out the side of the chassis.
Figure 2-14. Removing the riser board
Reverse steps 1 through 7 to replace the riser board with tray. Reinstall any boards removed in
step 5 into the same slots from which they were removed.
I/O Fan Assembly
The I/O fan assembly is removed for replacement only.
To remove the I/O fan assembly:
1. Perform the preparation procedures. See “Preparation Procedures” earlier in this chapter.
2. Remove the top access panel. See “Top Access Panel” earlier in this chapter.
3. Loosen the single thumbscrew securing the I/O fan assembly to the chassis.
4. Tilt the top of the fan assembly forward, then away from the chassis.
5. Disconnect the fan assembly power cable.
Figure 2-15. Removing the I/O fan assembly
Reverse steps 1 through 5 to replace the I/O fan assembly.
Processor Fan
The processor fan is removed for replacement only.
To remove the processor fan:
1. Perform the preparation procedures. See “Preparation Procedures” earlier in this chapter.
2. Remove the left side access panel. See “Left Side Access Panel” earlier in this chapter.
3. Remove the four T-15 screws securing the processor fan to the chassis.
4. Pull the processor fan forward slightly, then out the side of the server.
Figure 2-16. Removing the processor fan
Reverse steps 1 through 4 to replace the processor fan.
Power Switch and Cable Assembly
The power switch and cable assembly is removed for replacement only.
To remove the power switch and cable assembly:
WARNING: Any surface or area of the equipment marked with these symbols
indicates the presence of electrical shock hazards. The enclosed area contains no
operator-serviceable parts. To reduce the risk of injury from electrical shock
hazards, do not open this enclosure.
1. Perform the preparation procedures. See “Preparation Procedures” earlier in this chapter.
2. Remove the top access panel. See “Top Access Panel” earlier in this chapter.
3. Disconnect all cables from the power switch and cable assembly.
4. Remove the single T-15 screw that secures the power switch and cable assembly to the
chassis.
5. Slide the power switch back, then lift it out of the chassis.
Figure 2-17. Removing the power switch and cable assembly
Reverse steps 1 through 5 to replace the power switch and cable assembly.
Processor
The processor is removed for replacement or for the replacement of the system board.
To remove the processor:
1. Perform the preparation procedures. See “Preparation Procedures” earlier in this chapter.
2. Remove the left side access panel. See “Left Side Access Panel” earlier in this chapter.
3. Push in the latches on each side of the processor until you hear two clicks. This locks
the tabs in the open position.
4. Lift the processor from the system board.
Figure 2-18. Removing the processor
Reverse steps 1 through 4 to replace the processor.
Processor Retention Bracket
The processor retention bracket is removed for replacement or to replace the system board.
To remove the processor retention bracket:
1. Perform the preparation procedures. See “Preparation Procedures” earlier in this chapter.
2. Remove the processor from the system board. See “Processor” earlier in this chapter.
3. Remove the four T-15 screws, then lift the processor retention bracket from the
system board.
Figure 2-19. Removing the processor retention bracket
Reverse steps 1 through 3 to replace the processor retention bracket.
Processor Power Module
The processor power module is removed for replacement or to replace the system board.
To remove the processor power module:
1. Perform the preparation procedures. See “Preparation Procedures” earlier in this chapter.
2. Pull back the clips at each end of the processor power module.
3. Lift the processor power module from the system board.
Figure 2-20. Removing the processor power module
Reverse steps 1 through 3 to replace a processor power module. The clips on the processor
power module will snap into a locked position automatically when the processor power module
is properly seated in its slot.
Memory
Compaq ProLiant 800 Servers ship standard with 64 MB (1 DIMM) of memory installed on
the system board in socket 4. Memory is expandable to a maximum of 1 GB when using
four 256-MB DIMMs.
Figure 2-21. SDRAM DIMM sockets on the system board
The following guidelines MUST be followed when installing or replacing memory:
Use 100 MHz, 32-, 64-, 128-, or 256-MB, registered SDRAM DIMMs.
WARNING: Use only Compaq SDRAM DIMMs. SDRAM DIMMs from other
sources may adversely affect data integrity. Power-On Self-Test (POST) will warn
of non-supported SDRAM DIMMs.
To remove an SDRAM DIMM:
1. Perform the preparation procedures. See “Preparation Procedures” earlier in this chapter.
2. Push the levers at each end of the memory module.
3. Pull the module from the board.
Figure 2-22. Removing an SDRAM DIMM from the system board
Reverse steps 1 through 3 to replace the SDRAM DIMM. Memory modules can be installed in
one way only. Match the notch on the module with the tab on the memory socket. Push the
module down into the socket making sure that the module is inserted fully and seated properly.
The recommended order of SDRAM DIMM installation is:
■ Second SDRAM DIMM in slot 2 (J15)
■ Third SDRAM DIMM in slot 3 (J16)
■ Fourth SDRAM DIMM in slot 4 (J19)
Any combination of SDRAM DIMMs can be used.
Table 2-3
Examples of SDRAM DIMM Upgrade Combinations
Total Memory   Slot 1   Slot 2   Slot 3   Slot 4
64 MB          32 MB    32 MB
64 MB          64 MB
96 MB          64 MB    32 MB
128 MB         64 MB    64 MB
224 MB         64 MB    128 MB   32 MB
256 MB         128 MB   128 MB
256 MB         64 MB    64 MB    64 MB    64 MB
384 MB         64 MB    64 MB    128 MB   128 MB
512 MB         128 MB   128 MB   128 MB   128 MB
1 GB           256 MB   256 MB   256 MB   256 MB
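The memory rules in this section (four DIMM sockets, supported sizes of 32, 64, 128, and 256 MB, and a 1-GB maximum) can be checked with a short script. The following Python sketch is illustrative only and is not part of any Compaq utility; the function and constant names are hypothetical.

```python
# Illustrative check of a planned DIMM population against the limits in this guide.
# All names here are hypothetical; this is not a Compaq tool.

SUPPORTED_DIMM_MB = {32, 64, 128, 256}  # DIMM sizes listed in the memory guidelines
SOCKET_COUNT = 4                        # SDRAM DIMM sockets on the system board


def total_memory_mb(dimms_mb):
    """Return the total memory in MB for a list of installed DIMM sizes.

    Raises ValueError if more DIMMs are listed than there are sockets,
    or if a size is not one of the supported sizes.
    """
    if len(dimms_mb) > SOCKET_COUNT:
        raise ValueError(f"only {SOCKET_COUNT} DIMM sockets are available")
    for size in dimms_mb:
        if size not in SUPPORTED_DIMM_MB:
            raise ValueError(f"unsupported DIMM size: {size} MB")
    return sum(dimms_mb)
```

For example, `total_memory_mb([64, 64, 128, 128])` returns 384, and four 256-MB DIMMs give 1024 MB (1 GB), the stated maximum.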
Power Supply
The power supply is removed for replacement only.
To remove the power supply:
1. Perform the preparation procedures. See “Preparation Procedures” earlier in this chapter.
2. Remove the left side access panel. See “Left Side Access Panel” earlier in this chapter.
3. Disconnect all cables from the power supply.
4. Remove the four T-15 screws securing the power supply to the back of the chassis.
5. Pull the power supply out the side of the chassis.
Figure 2-23. Removing the power supply
Reverse steps 1 through 5 to replace the power supply.
External Replacement Battery
The external replacement battery is installed when the system board battery fails.
CAUTION: Do not remove the lithium battery from the system board or
permanent damage may occur. If the battery fails, use the replacement battery.
To install the external replacement battery:
1. Perform the preparation procedures. See “Preparation Procedures” earlier in this chapter.
2. Remove the top access panel. See “Top Access Panel” earlier in this chapter.
3. Remove the adhesive backing from the hook-and-loop fastener strip. Place the battery and
the hook-and-loop fastener strip on the system board close to the lithium battery.
4. Connect the external replacement battery cable connector to battery header E2 on the
system board. The connector should fit over pins 4, 5, and 7.
5. Move the jumper from pins 1 and 2 to pins 2 and 3.
Figure 2-24. Installing the replacement battery
6. Place the sticker included with your external replacement battery kit on the back of your
server above the power connector.
NOTE: If an external replacement battery is not installed before the lithium battery fails, and
CMOS/NVRAM is lost, run the System Configuration Utility.
Chapter 3
Diagnostic Tools
This chapter describes software and firmware diagnostic tools available for all Compaq server
products. The sections in this chapter are:
■ Default Configuration
■ Access to Compaq Utilities
■ Power-On Self-Test (POST)
■ Diagnostics Software
■ Drive Array Advanced Diagnostics (DAAD)
■ Integrated Management Log
■ Rapid Recovery Services
■ Remote Service Features
■ ROMPaq
■ Compaq Insight Manager
Default Configuration
When the system is first powered on, the system ROM detects the un-configured state of the
hardware and provides default configuration settings for most devices. By providing this
initialization, the system can run Diagnostics and other software applications before running
the normal SmartStart and System Configuration programs.
Default Configuration Messages
IMPORTANT: If you choose to format and partition your boot drive before running
SmartStart and the System Configuration programs, doing so may prevent the creation of a System
Partition and the off-line remote management features that it provides.
If you insert a System Configuration, Diagnostics, or SmartStart and Support Software CD in the
CD-ROM drive prior to powering on the Server, the system ROM will boot to that utility. If the
system ROM does not detect one of those CDs, you will be prompted for your intended
operating system. The system will reboot if any operating system-dependent configurations have
changed with the new operating system selection. If the selected operating system-dependent
configurations are the same as the current configurations, the system will boot normally. If you
make the wrong choice, you can change the operating system selection on a subsequent reboot.
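The decision flow described in the preceding paragraph can be summarized in a short Python sketch. The function name, parameter names, and returned strings are hypothetical labels chosen for illustration; they do not correspond to any actual system ROM interface.

```python
# Illustrative summary of the default-configuration boot flow described above.
# Names and return values are hypothetical, not an actual ROM interface.

def next_boot_step(utility_cd_inserted: bool, os_dependent_settings_changed: bool) -> str:
    """Return the action the text describes for an unconfigured system."""
    if utility_cd_inserted:
        # A System Configuration, Diagnostics, or SmartStart CD was found at power-on.
        return "boot utility from CD"
    # Otherwise the user is prompted for the intended operating system.
    if os_dependent_settings_changed:
        return "reboot"
    return "boot normally"
```

For instance, with no CD inserted and an operating system selection that changes no operating system-dependent settings, the sketch returns "boot normally".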
Utilities Access
The Compaq SmartStart and Support Software CD contains the SmartStart program and many of
the Compaq utilities needed to maintain your system, including:
■ System Configuration Utility
■ Array Configuration Utility
■ Drive Array Advanced Diagnostics Utility
■ ROMPaq Firmware Upgrade Utilities
CAUTION: Do not select the Erase Utility when running the SmartStart and
Support Software CD. This will result in data loss to the entire system.
Running Compaq Utilities
There are three ways to access Compaq Utilities:
■ Run the utilities on the system partition.
If the system was installed using SmartStart, the Compaq utilities will automatically be
available on the system partition. The system partition could also have been created during
a manual system installation.
To run the utilities on the system partition, boot the system and press F10 when you see:
“Press F10 for system partition utilities.” Then select the utilities from the menu.
❏ System Configuration Utility is available under the System Configuration menu.
❏ Array Configuration Utility is available under the System Configuration menu.
❏ Drive Array Advanced Diagnostics Utility is available under the Diagnostics and
Utilities menu.
❏ ROMPaq Firmware Upgrade Utility is available under the Diagnostics and
Utilities menu.
■ Run the utilities from diskette.
You can also run the utilities from their individual diskettes. If you have a utility diskette
newer than the version on the SmartStart and Support Software CD, use that diskette.
You can also create a diskette version of the utility from the SmartStart and Support
Software CD. To create diskette versions of the utilities from the CD:
1. Boot the Compaq SmartStart and Support Software CD.
2. From the Compaq System Utilities screen, select Create Support Software → Next.
3. Select the diskette you would like to create from the list, then follow the instructions
on the screen.
■ Run the utilities from the Compaq SmartStart and Support Software CD.
IMPORTANT: Only the System Configuration Utility and the Array Configuration Utility can
be executed from the Compaq SmartStart and Support Software CD. All other utilities must
be executed from the system partition or from diskette.
To run these utilities directly from the Compaq SmartStart and Support Software CD:
1. Boot the Compaq SmartStart and Support Software CD.
2. From the Compaq System Utilities screen, select the utility you wish to run, then
select Next.
❏ To execute the System Configuration Utility, select Run System
Configuration Utility.
❏ To execute the Array Configuration Utility, select Run Array
Configuration Utility.
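The three access methods above can be summarized as a lookup table that records where each utility can be run. The sketch below is illustrative only: the dictionary, the source labels, and the function name are hypothetical, and the table assumes that every listed utility has its own diskette, as the diskette bullet above suggests.

```python
# Where each utility can be run, per the three access methods described above.
# Labels such as "system partition" are illustrative, not identifiers used by Compaq.
UTILITY_SOURCES = {
    "System Configuration Utility": {"system partition", "diskette", "CD"},
    "Array Configuration Utility": {"system partition", "diskette", "CD"},
    "Drive Array Advanced Diagnostics Utility": {"system partition", "diskette"},
    "ROMPaq Firmware Upgrade Utility": {"system partition", "diskette"},
}


def can_run_from(utility: str, source: str) -> bool:
    """Return True if the named utility can be run from the given source."""
    return source in UTILITY_SOURCES.get(utility, set())
```

Only the System Configuration Utility and the Array Configuration Utility return True for the "CD" source, which matches the IMPORTANT note above.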
Power-On Self-Test (POST)
POST is a series of diagnostic tests that run automatically on Compaq computers when the
system is turned on. POST checks the following assemblies to ensure that the computer system
is functioning properly:
■ Processors
■ Keyboard
■ Power supply
■ System board
■ Memory
■ Memory expansion boards
■ Controllers
■ Diskette drives
■ Hard drives
POST Error Messages
If POST finds an error in the system, an error condition is indicated by an audible and/or visual
message. If an error code displays on the screen during POST or after resetting the system,
follow the instructions in Table 3-1. The error messages and codes listed in Table 3-1 include all
codes generated by Compaq products. Your system generates only those codes that are
applicable to your configuration and options.
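Table 3-1 records audible beeps in a compact notation such as 1L, 2S, where L means a long beep and S means a short beep. A minimal sketch of how that notation could be decoded when logging POST results is shown here. The function name and the returned tuple layout are assumptions made for illustration only.

```python
import re
from typing import List, Tuple

# One group of beeps: a count followed by L (long) or S (short).
_BEEP_GROUP = re.compile(r"(\d+)\s*([LS])", re.IGNORECASE)
_BEEP_NAMES = {"L": "long", "S": "short"}


def parse_beeps(cell: str) -> List[Tuple[int, str]]:
    """Decode an Audible Beeps cell such as '1L, 2S' into (count, length) pairs.

    A cell of 'None' contains no beep groups and yields an empty list.
    """
    return [(int(count), _BEEP_NAMES[kind.upper()])
            for count, kind in _BEEP_GROUP.findall(cell)]
```

For example, the entry 1L, 2S decodes to one long beep followed by two short beeps.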
In each case, the Recommended Action column lists the steps necessary to correct the problem.
After completing each step, run the Diagnostics program to verify whether the error condition
has been corrected. If the error code reappears, perform the next step, then run the Diagnostics
program again. Follow this procedure until Diagnostics no longer detects an error condition.
Table 3-1
POST Error Messages
Error Code | Audible Beeps (L=Long, S=Short) | Probable Source of Problem | Recommended Action
A Critical Error occurred prior to this power-up | None | A catastrophic system error, which caused the server to crash, has been logged. |
101-ROM Error | 1L, 1S | System ROM checksum. | Run Diagnostics. Replace failed assembly as indicated.
101-I/O ROM Error | None | Options ROM checksum. | Run Diagnostics. Replace failed assembly as indicated.
102-System Board Failure | None | DMA, timers, and so on. | Replace the system board. Run the
104-ASR-2 Timer Failure | | |
| | | Verify placement of memory modules.
| | slow, where: xx00 = expansion board SIMMs are too slow, or 00yy = system board SIMMs are too slow. xx and yy have corresponding bit set. | The speed of the memory modules must be 60 ns. Verify the speed of the memory modules installed and replace.
| None | Maximum amount of memory exceeded. | Verify installed memory does not exceed 1 GB.
211-Cache Switch Set Incorrectly | None | Switch not set properly during installation or upgrade. | Verify switch settings.
212-System Processor Failed/Mapped out | 1S | Processor in slot x failed. | Run Diagnostics and replace failed processor.
213-Cache Size Error | None | Invalid optional cache size. | Replace cache with 256K cache.
213-System Processor Not Installed | 1S | System processor configured for slot indicated is missing. | Install processor in the slot indicated or run the System Configuration Utility to remove the processor from the .CFG file.
214-DC-DC Converter Failed | None | PowerSafe Module (DC-DC Converter) failed. | Run Diagnostics. Replace failed assembly as indicated.
Continued
POST Error Messages Continued
Error Code | Audible Beeps (L=Long, S=Short) | Probable Source of Problem | Recommended Action
301-Keyboard Error | None | Keyboard failure. | Turn off the computer, then reconnect the keyboard.
301-Keyboard Error or Test Fixture Installed | None | Keyboard failure. | Replace the keyboard.
ZZ-301-Keyboard Error | None | Keyboard failure. (ZZ represents the Keyboard Scan Code.) | 1. A key is stuck. Try to free it. 2. Replace the keyboard.
303-Keyboard Controller Error | None | System board, keyboard, or mouse controller failure. | 1. Run Diagnostics. 2. Replace failed assembly as indicated.
304-Keyboard or System Unit Error | None | Keyboard, keyboard cable, or system board failure. | 1. Make sure the keyboard is attached. 2. Run Diagnostics to determine which is in error. 3. Replace the part indicated.
40X-Parallel Port X Address Assignment Conflict | 2S | Both external and internal ports are assigned to parallel port X. | Run the System Configuration Utility and correct.
402-Monochrome Adapter Failure | 1L, 2S | Monochrome display controller. | Replace the monochrome display controller.
501-Display Adapter Failure | 1L, 2S | Video display controller. | Replace the video board.
601-Diskette Controller Error | None | Diskette controller circuitry failure. | 1. Make sure the diskette drive cables are attached. 2. Replace the diskette drive and/or cable. 3. Replace the system board.
605-Diskette Drive Type Error | 2S | Mismatch in drive type. | Run the System Configuration Utility to set diskette type correctly.
702-A coprocessor has been detected that was not reported by CMOS | None | Installed coprocessor not configured. | Run the System Configuration Utility and correct.
Continued
POST Error Messages Continued
Error Code | Audible Beeps (L=Long, S=Short) | Probable Source of Problem | Recommended Action
703-CMOS reports a coprocessor that has not been detected | 2S | Coprocessor or configuration error. | 1. Run the System Configuration Utility and correct. 2. Replace the coprocessor.
1151-Com Port 1 Address Assignment Conflict | 2S | Both external and internal serial ports are assigned to COM1. | Run the System Configuration Utility and correct.
1152-Com Port 2, 3, or 4 Address Assignment Conflict | 2S | Both external and internal serial ports are assigned to COM2, COM3 or COM4. | Run the System Configuration Utility and correct.
1600-Server Manager/R Failure | None | Server Manager/R board failure. Error code displays after error message. | Run Diagnostics. Replace failed assembly as indicated.
1610-Temperature violation detected. Waiting for system to cool | 2S | Ambient system temperature too hot. | Check fan in system environment.
1611-Fan [fan description] failure detected | 2S | Required fan has failed. | Check fans.
1611-Fan [fan description] not present | 2S | Fan not present. | Make sure fans are plugged in.
1612-Primary power supply failure | 2S | Primary power supply has failed. | Replace power supply as soon as possible.
1613-Low System Battery | None | Real time clock system battery is running low on power. | Run Diagnostics. Replace failed assembly as indicated.
1615-Power Supply Failure in Bay X | None | A power supply has failed. | Replace or check specified power supply.
1616-Power Supply Configuration Error | 2L, 2S | Single power supply system is installed in Bay 2 and not in Bay 1. | Move power supply from Bay 2 to Bay 1.
1701-SCSI Controller failure | None | A test on the Fast SCSI-2 Controller failed. | Run Diagnostics. Replace failed assembly as indicated.
Continued
POST Error Messages Continued
Error Code | Audible Beeps (L=Long, S=Short) | Probable Source of Problem | Recommended Action
1702-SCSI cable error detected. System halted. | None | Incorrect cabling. | 1. For integrated SCSI Controllers, ensure that the internal connector has SCSI termination attached. 2. For option card SCSI controllers, ensure that only one of the two internal connectors has termination attached.
1703-SCSI cable error detected. Internal SCSI cable not attached to system board connector. System halted. | None | Incorrect cabling. | Ensure that the integrated SCSI controller has SCSI termination attached.
1704-Unsupported Virtual Mode Disk Operation. DOS Driver Required. System halted. | None | System attempted to perform a virtual mode disk operation without virtual mode memory services. | Use fixed-disk device driver that supports virtual mode memory services.
1705-Locked SCSI Bus Detected. System halted. | None | SCSI bus failure. | Run Diagnostics. Replace failed assembly as indicated.
1730-Fixed Disk 0 does not support DMA Mode. | None | Fixed disk drive error. | Run the System Configuration Utility and correct.
1731-Fixed Disk 1 does not support DMA Mode. | None | Fixed disk drive error. | Run the System Configuration Utility and correct.
1740-Fixed Disk 0 failed Set Block Mode command | None | Fixed disk drive error. | Run the System Configuration Utility and correct.
1741-Fixed Disk 1 failed Set Block Mode command | None | Fixed disk drive error. | Run the System Configuration Utility and correct.
1750-Fixed Disk 0 failed Identify command | None | Fixed disk drive error. | Run the System Configuration Utility and correct.
Continued
POST Error Messages Continued
Error Code | Audible Beeps (L=Long, S=Short) | Probable Source of Problem | Recommended Action
1751-Fixed Disk 1 failed Identify command | None | Fixed disk drive error. | Run the System Configuration Utility and correct.
1760-Fixed Disk 0 does not support Block Mode | None | Fixed disk drive error. | Run the System Configuration Utility and correct.
1761-Fixed Disk 1 does not support Block Mode | None | Fixed disk drive error. | Run the System Configuration Utility and correct.
1764-Slot x Drive Array - Capacity Expansion Process is temporarily disabled (followed by one of the following): Expansion will resume when Array Accelerator has been reattached. Expansion will resume when Array Accelerator has been replaced. Expansion will resume when Array Accelerator RAM allocation is successful. Expansion will resume when Array Accelerator battery reaches full charge. Expansion will resume when automatic data recovery has been completed. | | | Reattach or replace Array Accelerator, wait until the Array Accelerator batteries have charged, or for Automatic Data Recovery to complete, as indicated.
1765-Slot x Drive Array Option ROM Appears to Conflict With an ISA Card. ISA cards with 16-bit memory cannot be configured in memory range C0000 to DFFFF along with the SMART-2/E 8-bit Option ROM due to EISA bus limitations. Please remove or reconfigure your ISA card. | | | Remove or reconfigure conflicting ISA cards. Disable “shared memory” on any ISA network cards that may be installed.
1766-Slot x Drive Array requires System ROM Upgrade. Run Systems ROMPaq Utility. | | | Run the latest Systems ROMPaq Utility to upgrade your System ROMs.
1767-Slot x Drive Array Option ROM is Not Programmed Correctly or may Conflict with the Memory Address Range of an ISA Card. Check the Memory Address Configuration of installed ISA Card(s) or run Options ROMPaq Utility to attempt SMART-2/E Option ROM Reprogramming. | | | Remove or reconfigure conflicting ISA cards, especially any cards that are not recognized by the System Configuration Utility. Try reprogramming the SMART-2/E Controller’s ROMs using the latest Options ROMPaq (version 2.29 or higher).
1768-Slot x Drive Array - Resuming logical drive expansion process. | None | SMART-2 Controller error. | No action required. Appears whenever a controller reset or power cycle occurs while array expansion is in progress.
Continued
POST Error Messages Continued
Error Code | Audible Beeps (L=Long, S=Short) | Probable Source of Problem | Recommended Action
1769-Slot x Drive Array - Drive(s) disabled due to failure during expand. Select F1 to continue with logical drives disabled. Select F2 to accept data loss and to re-enable logical drives. | None | SMART-2 Controller error. | Data has been lost while expanding the array, therefore the drives have been temporarily disabled. Press F2 to accept the data loss and re-enable the logical drives. Restore data from backup.
1771-Primary Disk Port Address Assignment Conflict | None | Internal and external hard drive controllers are both assigned to the primary address. | Run the System Configuration Utility and correct.
1772-Secondary Disk Port Address Assignment Conflict | None | Address Assignment Conflict. Internal and external hard drive controllers are both assigned to the secondary address. | Run the System Configuration Utility and correct.
1773-Primary Fixed Disk Port Assignment Conflict | None | Fixed disk drive error. | Run the System Configuration Utility and correct.
1774-Slot x Drive Array - Obsolete data found in Array Accelerator. Select F1 to discard contents of Array Accelerator. Select F2 to write contents of Array Accelerator to drives. | None | SMART-2 Controller error. | Data found in Array Accelerator is older than data found on drives. Press F1 to discard the older data in the Array Accelerator and retain the newer data on the drives.
1776-Drive Array SCSI Port Termination Error | None | External and internal SCSI drives are both configured to Port 1. | Reconfigure drives.
1777-Drive Array External Drive Subsystem Error | None | Cooling fan failure, internal temperature alert or open side panel. | Inspect for cooling fan failure or open side panel.
Continued
POST Error Messages Continued
Error Code | Audible Beeps (L=Long, S=Short) | Probable Source of Problem | Recommended Action
1778-Drive Array resuming Automatic Data Recovery process | | a controller reset or power cycle occurs while Automatic Data Recovery is in progress. | No action necessary.
| None | Intermittent drive failure and/or possible loss of data. | If this message appears and drive X has not been replaced, this indicates an intermittent drive failure. This message also appears once immediately following drive replacement whenever data must be restored from backup.
| None | Hard disk drive circuitry error. | Run Diagnostics. Replace failed assembly as indicated.
| None | Defective drive and/or cables. | Check for loose cables. Replace defective drive X and/or cable(s).
1785-Drive Array not Configured | None | Configuration error. | Run the System Configuration Utility and correct.
1786-Drive Array Recovery Needed. The following drive(s) need Automatic Data Recovery: Drive X. Select "F1" to continue with recovery of data to drive(s). Select "F2" to continue without recovery of data to drive(s). | None | Interim Data Recovery mode. Data has not been recovered yet. | Press F1 key to allow Automatic Data Recovery to begin. Data will automatically be restored to drive X now that the drive has been replaced or now seems to be working. -Or- Press the F2 key and the system will continue to operate in the Interim Data Recovery mode.
Continued
POST Error Messages Continued
Error Code | Audible Beeps (L=Long, S=Short) | Probable Source of Problem | Recommended Action
1787-Drive Array Operating in Interim Recovery Mode. Physical drive replacement needed: Drive X | None | Hard drive X failed or cable is loose or defective. Following a system restart, this message reminds you that drive X is defective and fault tolerance is being used. | 1. Replace drive X as soon as possible. 2. Check loose cables. 3. Replace defective cables.
*1788-Incorrect Drive Replaced: Drive X. Drive(s) were incorrectly replaced: Drive Y. Select "F1" to continue - drive array will remain disabled. Select "F2" to reset configuration - all data will be lost. | None | Drives are not installed in their original positions, so the drives have been disabled. See note below. | Reinstall the drives correctly as indicated. Press F1 to restart the computer with the drive array disabled. -Or- Press F2 to use the drives as configured and lose all the data on them.
*NOTE: The 1788 error message might also be displayed inadvertently due to a bad power cable connection to the drive
or by noise on the data cable. If this message was due to a bad power cable connection, but not because of an incorrect
drive replacement, repair the connection and press F2.
-Or-
If this message was not due to a bad power cable connection, and no drive replacement took place, this could indicate
noise on the data cable. Check cable for proper routing.
Continued
POST Error Messages Continued
Error Code | Audible Beeps (L=Long, S=Short) | Probable Source of Problem | Recommended Action
1789-Drive Not Responding, Physical Drive. Check cables or replace physical drive X. Select "F1" to continue - drive array will remain disabled. Select "F2" to fail drive(s) that are not responding - Interim Recovery Mode will be enabled if configured for fault tolerance. | None | Cable or hard drive failure. | 1. Check the cable 2. Replace the cables. 3. Replace 4. If you do not want to
1790-Disk 0 Configuration Error | None | Hard drive error or wrong drive type. | Run the System Configuration Utility and Diagnostics and correct.
1791-Disk 1 Error | None | Hard drive error or wrong drive type. | Run the System Configuration Utility and Diagnostics and correct.
1792-Drive Array Reports Valid Data Found in Array Accelerator. Data will automatically be written to drive array. | None | This indicates that while the system was in use, power was interrupted while data was in the Array Accelerator memory. Power was then restored within eight to ten days, and the data in the Array Accelerator was flushed to the drive array. | No action necessary; no data has been lost. Perform orderly system shutdowns to avoid data remaining in the Array Accelerator.
Continued
POST Error Messages Continued
Error Code | Audible Beeps (L=Long, S=Short) | Probable Source of Problem | Recommended Action
1793-Drive Array - Array Accelerator Battery Depleted - Data Lost (Error message 1794 also displays.) | None | This indicates that while the system was in use, power was interrupted while data was in the Array Accelerator memory. Array Accelerator batteries failed. Data in Array Accelerator has been lost. | Power was not restored within eight to ten days. Perform orderly system shutdowns to avoid data remaining in the Array Accelerator.
1794-Drive Array - Array Accelerator Battery Charge Low. Array Accelerator is temporarily disabled. Array Accelerator will be re-enabled when battery reaches full charge. | None | This is a warning that the battery charge is below 75%. Posted-writes are disabled. | Replace the Array Accelerator board if batteries do not recharge within 36 power-on hours.
Data does not correspond to this drive array. Array Accelerator is temporarily disabled. | None | This indicates that while the system was in use, power was interrupted while data was in the Array Accelerator memory. The data stored in the Array Accelerator does not correspond to this drive array. | Match the Array Accelerator to the correct drive array, or run the System Configuration Utility to clear the data in the Array Accelerator.
1796-Drive Array - Array Accelerator Not Responding. Array Accelerator is temporarily disabled. | None | Array Accelerator is defective or has been removed. | 1. Check that the Array Accelerator is properly seated. 2. Run the System Configuration Utility to reconfigure the Compaq IDA-2 without the Array Accelerator.
1797-Drive Array - Array Accelerator Read Error Occurred. Data in Array Accelerator has been lost. Array Accelerator is disabled. | None | Hard parity error while reading data from posted-writes memory. | Enable Array Accelerator.
Continued
POST Error Messages Continued
Error Code | Audible Beeps (L=Long, S=Short) | Probable Source of Problem | Recommended Action
| None | Hard parity error while writing data to posted-writes memory. | Enable Array Accelerator.
1799-Drive Array - Drive(s) Disabled due to Array Accelerator Data Loss. Select "F1" to continue with logical drives disabled. Select "F2" to accept data loss and to re-enable logical drives. | None | Volume failed due to loss of data in posted-writes memory. | Press F1 to continue with logical drives disabled or F2 to accept data loss and re-enable logical drive.
Beeps only: 2 Long + 2 Short | 2L, 2S | Power is cycled. Temperature too hot. Processor fan not installed or spinning. | Check fans.
(Run System Configuration Utility - F10 key) | None | A configuration error occurred during POST. | Press F10 to run System Configuration Utility.
(RESUME - F1 KEY) | None | As indicated to continue. | Press the F1 key.
Diagnostics Software
Tables 3-2 through 3-20 include all test error codes generated by Compaq products. Each code
has a corresponding description and recommended action(s). Your system generates only those
codes that are applicable to your configuration and options.
When you select Diagnostics and Utilities from the System Configuration Utility main menu, the
utility prompts you to test, inspect, upgrade, and diagnose the server.
Diagnostics and Utilities are located on the system partition on the hard drive and must be
accessed when a system configuration error is detected during the Power-On Self-Test (POST).
Compaq Diagnostics software is also available on the Compaq SmartStart and Support Software
CD. You can create a Diagnostics diskette from the SmartStart and Support Software CD and
run Diagnostics from diskette.
The following options are available from the Diagnostics and Utilities menu:
■ Test Computer
■ Inspect Computer
■ Upgrade Firmware
■ Remote Utilities
■ Diagnose Drive Array
Diagnostic error codes are generated when the Diagnostics software recognizes a problem.
These error codes, listed in tables 3-2 through 3-20, help identify possible defective
subassemblies.
In each case, the Recommended Action column lists the steps necessary to correct the problem.
After completing each step, run the Diagnostics program to verify whether the error condition
has been corrected. If the error code reappears, perform the next step, then run the Diagnostics
program again. Follow this procedure until the Diagnostics program no longer detects an error
condition.
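The step-by-step procedure above is a simple loop: apply one recommended action, rerun Diagnostics, and stop as soon as the error no longer appears. A minimal sketch of that loop follows, assuming each recommended action and the Diagnostics check can be represented as callables. The names used here are hypothetical and do not correspond to any Compaq software interface.

```python
from typing import Callable, Iterable, Optional


def isolate_fault(actions: Iterable[Callable[[], None]],
                  error_detected: Callable[[], bool]) -> Optional[int]:
    """Apply recommended actions in order, rerunning Diagnostics after each.

    Returns the 1-based number of the step after which the error cleared,
    or None if the error remains after every listed step.
    """
    for step_number, action in enumerate(actions, start=1):
        action()                   # perform the next recommended action
        if not error_detected():   # rerun Diagnostics to verify the fix
            return step_number
    return None
```

A return value of None corresponds to the case where every listed step has been tried and Diagnostics still reports the error.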
If you encounter an error condition, complete the following steps before starting problem
isolation procedures:
1. Be certain proper ventilation exists. The computer should have approximately 12 inches
(30.5 cm) clearance at the front and back of the system unit.
2. Turn off the computer and peripheral devices.
3. Disconnect any peripheral devices not required for testing. Do not disconnect the printer if
you want to test it or use it to log error messages.
4. Turn on the computer.
5. Delete the power-on password, if set. You will know that the power-on password is set
when a key icon appears on the screen when POST completes. If this occurs, you must
enter the password to continue. To delete the password, type the current password, a
forward slash ( / ), and press the Enter key.
6. Disable the power-on password by using the Password Disable switch on the system board,
if you do not have access to the password.
7. Install a loopback plug (Part Number 142054-001), when required by Diagnostics.
8. Run the latest version of Diagnostics.
Running Diagnostics
There are two ways to access the utilities:
■ From the System Partition.
■ From diskette. A diskette can be created from the SmartStart and Support Software CD.
To access the utilities from the system partition:
1. Reboot the server by pressing the Ctrl+Alt+Delete keys.
2. Press F10 when the following prompt appears at the top of the screen during POST.
Press “F10” for System Partition Utilities.
IMPORTANT: The text appears for only two seconds. If you do not press F10 during this
time, you must reboot the server.
3. From the System Configuration Main Menu, select Diagnostics and Utilities.
If errors are detected in your Server Health Log, the Diagnostics Utility automatically
displays the following screen message:
CAUTION: Errors have been detected in your Server Health
Log. Diags will now identify your system hardware.
4. Press the Enter key to continue.
5. After a short pause, the Server Health Log menu displays with a list of system errors.
If there is more than one error, press the Space Bar to select the error you want to correct.
Press Enter.
6. The Diagnostics Utility prompts you and suggests corrective action.
Primary Processor Test Error Codes
The 100 series of Diagnostic error codes identifies failures with processor and system board
functions.
Table 3-2
Primary Processor Test Error Codes
Error Code | Description | Recommended Action
101-xx | CPU test failed. | Replace the processor board and retest.
103-xx | DMA page registers test failed. | Replace the processor board and retest.
104-xx | Interrupt controller master test failed. | Replace the processor board and retest.
105-xx | Port 61 error. | Replace the processor board and retest.
106-xx | Keyboard controller self-test failed. | Replace the processor board and retest.
107-xx | CMOS RAM test failed. | 1. Replace the battery/clock module and retest. 2. Replace the system board and retest.
108-xx | CMOS interrupt test failed. | 1. Replace the battery/clock module and retest. 2. Replace the system board and retest.
109-xx | CMOS clock load data test failed. | 1. Replace the battery/clock module and retest. 2. Replace the system board and retest.
110-xx | Programmable timer load data test failed. | Replace the system board and retest.
111-xx | Refresh detect test failed. | Replace the system board and retest.
112-xx | Speed test slow mode out of range. | Replace the system board and retest.
113-xx | Protected mode test failed. | Replace the system board and retest.
114-xx | Speaker test failed. | 1. Verify the speaker connection and retest. 2. Replace the speaker and retest. 3. Replace the system board and retest.
116-xx | Cache test failed. | Replace the system board and retest.
122-xx | Multiprocessor Dispatch test failed. | 1. Check the system configuration and retest. 2. Replace the processor board and retest. 3. Replace the system board and retest.
123-xx | Interprocessor Communication test failed. | 1. Check the system configuration and retest. 2. Replace the processor board and retest. 3. Replace the system board and retest.
199-xx | Installed devices test failed. | 1. Check the system configuration and retest. 2. Verify cable connections and retest. 3. Check switch and/or jumper settings and retest. 4. Run the Configuration utility and retest. 5. Replace the processor board and retest. 6. Replace the system board and retest.
Memory Test Error Codes
The 200 series of Diagnostic error codes identifies failures with the memory subsystem.
Table 3-3
Memory Test Error Codes
Error Code | Description | Recommended Action
200-xx | Invalid memory configuration. | Reinsert memory modules in correct location and retest.
201-xx | Memory machine ID test failed. | 1. Replace the system ROM and retest. 2. Replace the processor board and retest. 3. Replace the memory expansion board and retest.
202-xx | Memory system ROM checksum failed. | 1. Replace the system ROM and retest. 2. Replace the processor board and retest. 3. Replace the memory expansion board and retest.
203-xx | Memory write/read test failed. | 1. Replace the memory module and retest. 2. Replace the processor board and retest. 3. Replace the memory expansion board and retest.
204-xx | Memory address test failed. | 1. Replace the memory module and retest. 2. Replace the processor board and retest. 3. Replace the memory expansion board and retest.
205-xx | Walking I/O test failed. | 1. Replace the memory module and retest. 2. Replace the processor board and retest. 3. Replace the memory expansion board and retest.
206-xx | Increment pattern test failed. | 1. Replace the memory module and retest. 2. Replace the processor board and retest. 3. Replace the memory expansion board and retest.
207-xx | Invalid memory configuration - check DIMM installation. DIMMs installed have 8K refresh. | Replace DIMMs.
208-xx | Invalid memory speed detected - check DIMM installation. Slow DIMMs may cause data loss. | Replace DIMMs with timing greater than 60 ns.
210-xx | Random pattern test failed. | 1. Replace the memory module and retest. 2. Replace the processor board and retest. 3. Replace the memory expansion board and retest.
215 | Non-functioning DC-DC converter for processor X. | Replace the DC-DC converter (processor power module).
Keyboard Test Error Codes
The 300 series of Diagnostic error codes identifies failures with keyboard and system board
functions.
Table 3-4
Keyboard Test Error Codes
Error Code | Description
301-xx | Keyboard short test, 8042 self-test failed.
302-xx | Keyboard long test failed.
303-xx | Keyboard LED test, 8042 self-test failed.
304-xx | Keyboard typematic test failed.
Recommended Action
The following steps apply to error codes 301-xx through 304-xx:
1. Check the keyboard connection. If disconnected, turn off the computer and
connect the keyboard and retest.
2. Replace the keyboard and retest.
3. Replace the system board and retest.
Parallel Printer Test Error Codes
The 400 series of Diagnostic error codes identifies failures with parallel printer interface card or
system board functions.
Table 3-5
Parallel Printer Test Error Codes
Error Code | Description
401-xx | Printer failed or not connected.
402-xx | Printer data register failed.
403-xx | Printer pattern test failed.
498-xx | Printer failed or not connected.
Recommended Action
The following steps apply to error codes 401-xx through 498-xx:
1. Connect the printer and retest.
2. Check the power to the printer and retest.
3. Install the loopback connector and retest.
4. Check the switch on the Serial/Parallel Interface board (if applicable) and retest.
5. Replace the Serial/Parallel Interface board (if applicable) and retest.
6. Replace the system board and retest.
Video Display Unit Test Error Codes
The 500 series of Diagnostic error codes identifies failures with video or system board functions.
Table 3-6
Video Display Unit Test Error Codes
Description
Video controller test failed.
Video memory test failed.
Video attribute test failed.
Video character set test failed.
Video 80 x 25 mode 9 x 14 character cell test failed.
Video 80 x 25 mode 8 x 8 character cell test failed.
Video 40 x 25 mode test failed.
Video 320 x 200 mode color set 0 test failed.
Video 320 x 200 mode color set 1 test failed.
Video 640 x 200 mode test failed.
Video screen memory page test failed.
Video gray scale test failed.
Video white screen test failed.
Video noise pattern test failed.
Recommended Action
The following steps apply to error codes 501-xx through 516-xx:
1. Replace the monitor and retest.
2. Replace the Advanced VGA board and retest.
3. Replace the system board and retest.
Diskette Drive Test Error Codes
The 600 series of Diagnostic error codes identifies failures with diskette, diskette drive, or
system board functions.
Table 3-7
Diskette Drive Test Error Codes
Description
Diskette ID drive types test failed.
Diskette format failed.
Diskette read test failed.
Diskette write/read/compare test failed.
Diskette random seek test failed.
Diskette ID media failed.
Diskette speed test failed.
Diskette wrap test failed.
Diskette write protect test failed.
Diskette reset controller test failed.
Diskette change line test failed.
Pin 34 is not cut on 360 KB diskette drive.
Diskette type error.
Diskette drive speed not within limits.
Recommended Action
1. Replace the diskette and retest.
2. Check and/or replace the diskette power and signal cables and retest.
3. Replace the diskette drive and retest.
4. Replace the system board and retest.
1. Replace the media and retest.
2. Run the Configuration utility and retest.
Monochrome Video Board Test Error Codes
The 800 series of Diagnostic error codes identifies failures with monochrome video boards or
system board functions.
Table 3-8
Monochrome Video Board Test Error Codes
Error Code | Description
802-xx | Video memory test failed.
824-xx | Monochrome video text mode test failed.
Recommended Action
1. Replace monitor and retest.
2. Replace the Advanced VGA board and retest.
3. Replace monochrome board and retest.
4. Replace the system board and retest.
Serial Test Error Codes
The 1100 series of Diagnostic error codes identifies failures with serial/parallel interface board
or system board functions.
Table 3-9
Serial Test Error Codes
Error Code   Description
1101-xx      Serial port test failed.
1109-xx      Clock register test failed.
Recommended Action:
1. Check the switch settings on the Serial/Parallel Interface board (if applicable) and retest.
2. Replace the Serial/Parallel Interface board (if applicable) and retest.
3. Replace the system board and retest.
Modem Communications Test Error Codes
The 1200 series of Diagnostic error codes identifies failures with the modem(s).
Table 3-10
Modem Communications Test Error Codes
Error Code   Description
1201-xx      Modem internal loopback test failed.
1202-xx      Modem time-out test failed.
1203-xx      Modem external termination test failed.
1204-xx      Modem auto originate test failed.
1206-xx      Dial multi-frequency tone test failed.
1210-xx      Modem direct connect test failed.
Recommended Action:
1. Refer to the modem documentation for correct setup procedures and retest.
2. Check the modem line and retest.
3. Replace the modem and retest.
Hard Drive Test Error Codes
The 1700 series of Diagnostic error codes identifies failures with hard drives, hard drive
controller boards, hard drive cabling, and system board functions. If your system uses a drive
array controller, see the section for Drive Array Advanced Diagnostics (DAAD).
Fixed disk ID drive types test failed.
Fixed disk format test failed.
Fixed disk read test failed.
Fixed disk write/read/compare test failed.
Fixed disk random seek test failed.
Fixed disk controller test failed.
Fixed disk format bad track test failed.
Fixed disk reset controller test failed.
Fixed disk park head test failed.
Fixed disk head select test failed.
Fixed disk conditional format test failed.
Fixed disk ECC* test failed.
Fixed disk drive power mode test failed.
Drive Monitoring failed.
Invalid fixed disk drive type failed.
Table 3-11
Hard Drive Test Error Codes
1. Run the System Configuration Utility and verify the drive type.
2. Replace the fixed disk drive signal and power cables and retest.
3. Replace the fixed disk drive controller and retest.
4. Replace the fixed disk drive and retest.
5. Replace the system board and retest.
Tape Drive Test Error Codes
The 1900 series of Diagnostic error codes identifies failures with tape cartridges, tape drives,
tape drive cabling, adapter boards, or the system board assembly.
Table 3-12
Tape Drive Test Error Codes
Error Code   Description
1900-xx      Tape ID failed.
1901-xx      Tape servo write failed.
1902-xx      Tape format failed.
1903-xx      Tape drive sensor test failed.
1904-xx      Tape BOT/EOT test failed.
1905-xx      Tape read test failed.
1906-xx      Tape write/read/compare test failed.
Recommended Action:
1. Replace the tape cartridge and retest.
2. Check and/or replace the signal cable and retest.
3. Check the switch settings on the adapter board (if applicable).
4. Replace the tape adapter board (if applicable) and retest.
5. Replace the tape drive and retest.
6. Replace the system board and retest.
Advanced VGA Board Test Error Codes
The 2400 series of Diagnostic error codes identifies failures with video boards, monitors, or the
system board assembly.
Table 3-13
Advanced VGA Board Test Error Codes

Error Code   Description
2402-xx      Video memory test failed.
2403-xx      Video attribute test failed.
2404-xx      Video character set test failed.
2405-xx      Video 80 x 25 mode 9 x 14 character cell test failed.
Recommended Action:
1. Run the System Configuration Utility.
2. Replace the monitor and retest.
3. Replace the Advanced VGA board or other video board and retest.
4. Replace the system board and retest.

Error Code   Description
2406-xx      Video 80 x 25 mode 8 x 8 character cell test failed.
2407-xx      Video 40 x 25 mode test failed.
2408-xx      Video 320 x 200 mode color set 0 test failed.
2409-xx      Video 320 x 200 mode color set 1 test failed.
2410-xx      Video 640 x 200 mode test failed.
2411-xx      Video screen memory page test failed.
2412-xx      Video gray scale test failed.
2414-xx      Video white screen test failed.
2416-xx      Video noise pattern test failed.
2417-xx      Lightpen text mode test failed, no response.
2418-xx      ECG/VGC memory test failed.

Error Code   Description
2419-xx      ECG/VGC ROM checksum test failed.
2420-xx      ECG/VGC attribute test failed.
2421-xx      ECG/VGC 640 x 200 graphics mode test failed.
2422-xx      ECG/VGC 640 x 350 16-color set test failed.
Recommended Action:
1. Run the System Configuration Utility.
2. Replace the monitor and retest.
3. Replace the Advanced VGA board or other video board and retest.
4. Replace the system board and retest.

Error Code   Description
2423-xx      ECG/VGC 640 x 350 64-color test failed.
2424-xx      ECG/VGC monochrome text mode test failed.
2425-xx      ECG/VGC monochrome graphics mode test failed.
2431-xx      640 x 480 graphics test failure.
2432-xx      320 x 200 graphics (256-color mode) test failure.
2448-xx      Advanced VGA Controller test failed.
2451-xx      132-column Advanced VGA test failed.
2456-xx      Advanced VGA 256-Color test failed.
2458-xx      Advanced VGA Bit BLT Test.
2468-xx      Advanced VGA DAC Test.
2477-xx      Advanced VGA Data Path Test.
2480-xx      Advanced VGA DAC Test.
Recommended Action:
1. Run Setup.
2. Replace the system board and retest.
NetFlex-2 Controller Test Error Codes
The 6000 series of Diagnostic error codes identifies failures with 32-bit DualSpeed NetFlex-2
and NetFlex-2 Token Ring Controllers.
Table 3-14
NetFlex-2 Controller Test Error Codes
Error Code   Description
6000-xx      Network card ID failed
6001-xx      Network card setup failed
6002-xx      Network card transmit failed
6014-xx      Network card Configuration failed
6016-xx      Network card Reset failed
6028-xx      Network card Internal failed
6029-xx      Network card External failed
6089-xx      Network card Open failed
Recommended Action:
1. Check the controller installation in the EISA slot.
The 6500 series of Diagnostic error codes identifies failures with SCSI hard drives, SCSI hard
drive controller boards, SCSI hard drive cabling, and system board functions. If your system
uses a drive array controller, see the section for Drive Array Advanced Diagnostics (DAAD).
SCSI Disk ID drive types test failed.
SCSI Disk Unconditional Format test failed.
SCSI Disk Read Test Failed.
SCSI Disk SA/Media test failed.
SCSI Disk Erase tape test failed.
SCSI Disk Random Read test failed.
Media load/unload test failed.
Table 3-16
1. Run the System Configuration Utility and verify the drive type.
2. Replace the SCSI disk drive signal and power cables and retest.
3. Replace the SCSI controller and retest.
4. Replace the SCSI disk drive and retest.
5. Replace the system board and retest.
SCSI/IDE CD-ROM Drive Test Error Codes
The 6600 series of Diagnostic error codes identifies failures with the CD-ROM cabling,
CD-ROM drives, adapter boards, or the system board assembly.
Table 3-17
SCSI/IDE CD-ROM Drive Test Error Codes
Error Code   Description
6600-xx      CD-ROM ID failed.
6605-xx      CD-ROM Read failed.
Recommended Action:
1. Replace the CD-ROM media and retest.
2. Check and/or replace the signal cable and retest.
3. Check the switch settings on the adapter board (if applicable).
4. Replace the SCSI controller (if applicable) and retest.
5. Replace the CD-ROM drive and retest.
6. Replace the system board and retest.
SCSI Tape Drive Test Error Codes
The 6700 series of Diagnostic error codes identifies failures with tape cartridges, tape drives,
media changers, tape drive cabling, adapter boards, or the system board assembly.
Table 3-18
SCSI Tape Drive Test Error Codes
Error Code   Description
6700-xx      SCSI Tape ID drive types test failed.
6706-xx      SCSI Disk SA/Media test failed.
6709-xx      SCSI Disk Erase tape test failed.
6728-xx      Media load/unload test failed.
Recommended Action:
1. Run the System Configuration Utility and verify the drive type.

Server Manager/R Board Test Error Codes
The 7000 series of Diagnostic error codes identifies failures with the Server Manager/R board.
Replace the Server Manager/R board Enhanced 2400-Baud Integrated Modem and retest.
Replace the Server Manager/R board Voice ROM.
Error Code   Description               Recommended Action
7000-78      Host ADC Measurements.
7000-79      Battery.                  Replace the Server Manager/R board battery.

Pointing Device Interface Test Error Codes
The 8600 series of Diagnostic error codes identifies failures with the pointing device (mouse, trackball, and so on) or the system board assembly.
Table 3-20
Pointing Device Interface Test Error Codes
Error Code   Description
8601-xx      Pointing Device Interface test failed.
Recommended Action:
1. Replace with a working pointing device and retest.
2. Replace the system board and retest.
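Each Diagnostic error code has the form series number, hyphen, two-character suffix (for example 1101-xx or 7000-78), and the series identifies the subsystem at fault. The lookup can be sketched in Python as shown here. The names SERIES, series_of, and subsystem are hypothetical, the series labels are taken from the section and table titles of this chapter, and the 500 entry is inferred from the 501-xx through 516-xx range given for Table 3-6.

```python
import re
from typing import Optional

# Series -> subsystem, using the section titles of this chapter.
# 500 is inferred from the 501-xx..516-xx range in Table 3-6.
SERIES = {
    500: "Video display unit",
    600: "Diskette drive",
    800: "Monochrome video board",
    1100: "Serial",
    1200: "Modem communications",
    1700: "Hard drive",
    1900: "Tape drive",
    2400: "Advanced VGA board",
    6000: "NetFlex-2 controller",
    6500: "SCSI hard drive",
    6600: "SCSI/IDE CD-ROM drive",
    6700: "SCSI tape drive",
    7000: "Server Manager/R board",
    8600: "Pointing device interface",
}

# Three or four digits, a hyphen, then a two-character suffix.
_CODE = re.compile(r"^(\d{3,4})-(\w{2})$")


def series_of(code: str) -> Optional[int]:
    """Return the series (code rounded down to a multiple of 100)."""
    match = _CODE.match(code.strip())
    if match is None:
        return None
    return (int(match.group(1)) // 100) * 100


def subsystem(code: str) -> Optional[str]:
    """Return the subsystem name for a code, or None if unknown."""
    series = series_of(code)
    if series is None:
        return None
    return SERIES.get(series)
```

With this sketch, subsystem("6605-xx") yields the CD-ROM entry, and a code outside the listed series yields None.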
Drive Array Advanced Diagnostics (DAAD)
Drive Array Advanced Diagnostics (DAAD) is a DOS-based tool designed to run on all Compaq
products that contain a Compaq Drive Array Controller. The error messages and codes listed
include all codes generated by Compaq products. Your system generates only codes applicable
to your configuration and options.
The two main functions of DAAD are:
■ Collecting all possible information about array controllers in the system
■ Offering a list of all detected problems
NOTE: Refer to the Drive Array Advanced Diagnostics User Guide, found on the SmartStart and
Support Software CD, for complete details and procedures about this diagnostic tool.
DAAD works by issuing multiple commands to the array controllers to determine if a problem
exists. This data can then be saved to a file and, in severe situations, this file can be sent to
Compaq for analysis. In most cases, DAAD provides enough information to initiate problem
resolution immediately.
NOTE: DAAD does not write to the drives or destroy data. It does not change or remove
configuration information.
Starting DAAD
To start DAAD:
1. Insert the DAAD diskette into drive A.
2. Reboot the system - OR - if you are at the DOS prompt, enter the following:
A:DAAD
NOTE: To generate a DAAD report without starting the interactive portion of the utility, enter the following at the DOS prompt:
DAAD filename
where filename is the name of the file or report.
A dialog box displays, indicating the version of DAAD installed. Press the Enter (or ‘C’) key to continue, or press the Esc (or ‘E’) key to exit without continuing.
3. If you continue, a Please Wait panel displays, indicating that DAAD is identifying the system parameters.
DAAD gathers all the information it can from all of the array controllers in the system. The time it takes to gather this information depends on the size of your system.
A second Please Wait panel may display to indicate that the utility is identifying the ROM version of an array controller in the system.
CAUTION: Do not cycle the power; the utility must perform low-level operations that, if interrupted, could cause the controller to revert to a previous level of firmware if the firmware was soft-upgraded.
When the information gathering process is complete, the main DAAD screen displays.
DAAD Diagnostic Messages
Table 3-21 lists DAAD diagnostic messages in alphabetical order.
Table 3-21
DAAD Diagnostic Messages

Message: Accelerator board not detected
Description: Array controller did not detect a configured array accelerator board.
Recommended Action: Install array accelerator board on array controller. If an array accelerator board is installed, check for proper seating on the array controller board. You may need to run the System Configuration Utility and disable the array accelerator board to get this message off the screen.

Message: Accelerator error log
Description: List of the last 32 parity errors on transfers to or from memory on the array accelerator board. Displays starting memory address, transfer count, and operation (read and write).
Recommended Action: If there are many parity errors, you may need to replace the array accelerator board.

Message: Accelerator parity read errors: n
Description: Number of times that read memory parity errors were detected during transfers from memory on array accelerator board.
Recommended Action: If there are many parity errors, you may need to replace the array accelerator board.

Message: Accelerator parity write errors: n
Description: Number of times that write memory parity errors were detected during transfers to memory on the array accelerator board.
Recommended Action: If there are many parity errors, you may need to replace the array accelerator board.

Message: Accelerator status: Permanently disabled
Description: Array accelerator board has been permanently disabled. It remains disabled until it is reinitialized using the System Configuration Utility.
Recommended Action: Check the Disable Code field. Run the System Configuration Utility to reinitialize the array accelerator board.

Message: Accelerator status: Possible data loss in cache
Description: Possible data loss detected during power-up due to all batteries being below sufficient voltage level and no presence of identification signatures on the array accelerator board.
Recommended Action: There is no way to determine if dirty or bad data was in the cache and is now lost.

Message: Accelerator status: Temporarily disabled
Description: Array accelerator board has been temporarily disabled.
Recommended Action: Check the Disable Code field.

Message: Accelerator status: Unrecognized status
Description: A status returned from the array accelerator board that DAAD does not recognize.
Recommended Action: Obtain the latest version of DAAD.
Message: Accelerator status: Obsolete data sensed at reset
Description: During reset initialization obsolete data was found in the cache. This was due to the drives being moved and written to by another controller.
Recommended Action: Nothing needs to be done. The controller will either write the data to the drives or discard the data completely. Normal operations should continue.

Message: Accelerator status: Obsolete data was written to drives
Description: During reset initialization obsolete data was found in the cache. The obsolete data was written to the drives, but newer data may have been overwritten.
Recommended Action: If newer data was overwritten, you may need to restore newer data; otherwise, nothing needs to be done. Normal operations should continue.

Message: Accelerator status: Obsolete data was discarded
Description: During reset initialization obsolete data was found in the cache and it was discarded (not written to the drives).
Recommended Action: Nothing needs to be done. Normal operations should continue.

Message: Accelerator status: Dirty data detected. Unable to write dirty data to drives
Description: At least one cache line contains dirty data that the controller has been unable to flush (write) to the drives. This problem usually occurs when there is a problem with the drive(s).
Recommended Action: Fix the problem with the drive(s). Then the controller will be able to write the dirty data to the drives.

Message: Accelerator status: Dirty data detected has reached limit. Cache still enabled, but writes no longer being posted
Description: The number of cache lines containing dirty data that cannot be flushed (written) to the drives has reached a preset limit. The cache is still enabled, but writes are no longer being posted. This problem usually occurs when there is a problem with the drive(s).
Recommended Action: Fix the problem with the drive(s). Then the controller will be able to write the dirty data to the drives and posted write operations will be restored.

Message: Accelerator status: Excessive ECC errors detected in at least one cache line. As a result, at least one cache line is no longer in use.
Description: At least one line in the cache is no longer in use due to excessive ECC errors detected during use of the memory associated with that cache line.
Recommended Action: Replacement of the cache should be considered. If cache replacement is not done, the remaining cache lines should continue to operate properly.

Message: Accelerator status: Data in the cache was lost due to some reason other than the battery being discharged
Description: Data in the cache was lost, but not because of the battery being discharged.
Recommended Action: Check to be sure that the array accelerator is properly seated. If the error continues, you may need to replace the array accelerator.
Message: Accelerator status: Cache was automatically configured during last controller reset. This can occur when cacheboard is replaced with one of a different size.
Description: Cache board was probably replaced with one of a different size.
Recommended Action: Nothing needs to be done. Normal operations should continue.

Message: Accelerator status: Valid data found at reset
Description: Valid data was found in posted write memory at reinitialization. Data will be flushed to disk.
Recommended Action: Not an error or data loss condition. No action needs to be taken.

Message: Accelerator status: Warranty alert
Description: Catastrophic problem with array accelerator board. Refer to other messages on Diagnostics screen for exact meaning of this message.
Recommended Action: Replace the array accelerator board.

Message: Adapter/NVRAM ID mismatch
Description: EISA nonvolatile RAM has an ID for a different controller from the one physically present in the slot.
Recommended Action: Run the System Configuration Utility.

Message: Battery X not fully charged
Description: Battery is not fully charged.
Recommended Action: Allow 36 hours to recharge them.

Message: Board not attached
Description: Array controller configured for use with array accelerator board, but one is not attached.
Recommended Action: Attach array accelerator board to array controller.

Message: NVRAM configuration present, controller not detected
Description: EISA nonvolatile RAM has a configuration for an array controller but there is no board in this slot. Either a board has been removed from the system or a board has been placed in the wrong slot.
Recommended Action: Place the array controller in the proper slot or run the System Configuration Utility to reconfigure nonvolatile RAM to reflect the removal or new position.

Message: Compatibility port problem detected
Description: Compatibility port configured for this IDA controller. When DAAD was verifying this interface, a serious problem was detected.
Recommended Action: A hardware problem has occurred; replace the IDA controller.
Message: Configuration signature is zero
Description: DAAD detected that nonvolatile RAM contains a configuration signature that is zero. Old versions of the System Configuration Utility could cause this.
Recommended Action: Run the latest version of System Configuration Utility to configure the controller and nonvolatile RAM.

Message: Configuration signature mismatch
Description: Array accelerator board configured for a different array controller board. Configuration signature on array accelerator board does not match the one stored on the array controller board.
Recommended Action: To recognize the array accelerator board, run the System Configuration Utility.

Message: Controller communication failure occurred
Description: Controller communication failure occurred.
Recommended Action: DAAD was unable to successfully issue commands to the controller in this slot.

Message: Controller detected. NVRAM configuration not present
Description: EISA nonvolatile RAM does not contain a configuration for this controller.
Recommended Action: Run the System Configuration Utility to configure the nonvolatile RAM.

Description: Controller firmware is below the latest recommended version.
Recommended Action: Run Options ROMPaq to upgrade the controller to the latest firmware revision.

Description: Controller is correct, however, IDA firmware version should be greater than 1.26.
Recommended Action: Obtain the latest firmware.

Message: Controller is located in special “video” slot
Description: Controller is installed in slot for special video control signals. If controller is used in this slot, LED indicators on front panel may not function properly.
Recommended Action: Install the controller in a different slot and run the System Configuration Utility to configure the controller and nonvolatile RAM.

Message: Controller is not configured
Description: Controller is not configured. If controller was previously configured and you change drive locations, there may be a problem with placement of the drives. DAAD examines each physical drive and looks for drives that have been moved to a different drive bay.
Recommended Action: Look for messages indicating which drives have been moved. If none appear and drive swapping did not occur, run the System Configuration Utility to configure the controller and nonvolatile RAM. Do not run the System Configuration Utility if you believe drive swapping has occurred.

Message: Controller needs replacing (DAAD Error 102)
Description: IDA firmware is less than version 0.96.
Recommended Action: Replace the controller as soon as possible.
Message: Controller needs replacing (DAAD Error 104)
Description: The Intelligent Array Expansion System firmware is less than version 1.14.
Recommended Action: Replace the controller as soon as possible.

Message: Controller reported POST error. Error Code: x
Description: The controller returned an error from its internal Power-On Self Tests.
Recommended Action: Replace the controller.

Message: Controller restarted with a signature of zero
Description: DAAD did not find a valid configuration signature to use to get the data. Nonvolatile RAM may not be present (unconfigured) or the signature present in nonvolatile RAM may not match the signature on the controller.
Recommended Action: Run the System Configuration Utility to configure the controller and nonvolatile RAM.

Message: DAAD recorded errors attempting to access: X
Description: DAAD found errors while attempting to access physical drive X, believed to be operational. Message followed by specific information about the error.
Recommended Action: Replace the drive, or correct the condition that caused the error.

Message: Disable command issued
Description: Posted-writes have been disabled by the issuing of the Accelerator Disable command. This occurred because of an operating system device driver.
Recommended Action: Restart the system. Run the System Configuration Utility to reinitialize the array accelerator board.

Message: Drive (bay) X needs replacing (DAAD Error 102)
Description: The 210-megabyte hard drive has firmware version 2.30 or 2.31.
Recommended Action: Replace the drive.

Message: Drive Monitoring features are unobtainable
Description: DAAD unable to get monitor and performance data due to fatal command problem such as drive time-out, or unable to get data due to these features not supported on the controller.
Recommended Action: Check for other errors (time-outs, and so on). If no other errors occur, upgrade the firmware to a version that supports monitor and performance, if desired.

Message: Drive Monitoring is NOT enabled for drive bay X
Description: The monitor and performance features have not been enabled.
Recommended Action: Run the System Configuration Utility to initialize the monitor and performance features.

Message: Drive time-out occurred on physical drive bay X
Description: DAAD issued a command to a physical drive and the command was never acknowledged.
Recommended Action: The drive or cable may be bad. Check the other error messages on the Diagnostics screen to determine resolution.
Message: Drive (bay) X firmware needs upgrading
Description: Firmware on this physical drive is below the latest recommended version.
Recommended Action: Run the Options ROMPaq Utility to upgrade the drive firmware to the latest revision.

Message: Drive (bay) X has invalid M&P stamp
Description: Physical drive has invalid monitor and performance data.
Recommended Action: Run the System Configuration Utility to properly initialize this drive.

Message: Drive X indicates position Y
Description: Message indicates which physical drive appears to be scrambled or in a drive bay other than the one for which it was originally configured.
Recommended Action: Examine the graphical drive representation on DAAD to determine proper drive locations. Remove drive X and place it in drive position Y. Rearrange the drives according to the DAAD instructions.

Message: Drive (bay) X RIS copy mismatch
Description: The copies of the RIS on this drive do not match.
Recommended Action: This drive may need to be replaced. Check for other errors.

Message: Drive (bay) X upload code not readable
Description: An error occurred while DAAD was trying to read the upload code information from this drive.
Recommended Action: If there were multiple errors, this drive may need to be replaced.

Message: Drive (bay) X has loose cable
Description: The array controller could not communicate with this drive at power-up. This drive has not previously failed.
Recommended Action: Check all cable connections first. The cables could be bad, loose, or disconnected. Turn on the system and attempt to reconnect signal/power cable to the drive. If this does not work, replace the cable. If that does not work, the drive may need to be replaced.

Message: Drive (bay) X is a replacement drive
Description: This drive has been replaced. This message displays if a drive is replaced in a fault tolerant logical volume.
Recommended Action: If the replacement was intentional, allow the drive to rebuild.

Message: Drive (bay) X is a replacement drive marked OK
Description: This drive has been replaced and marked OK by the firmware. This may occur if a drive has an intermittent failure (for example, if a drive has previously failed, then when DAAD is run, the drive starts working again).
Recommended Action: Replace the drive.

Message: Drive (bay) X is failed
Description: The indicated physical drive has failed.
Recommended Action: Replace this drive.

Message: Drive (bay) X has insufficient capacity for its configuration
Description: Drive has insufficient capacity to be used in this logical drive configuration.
Recommended Action: Replace this drive with a larger capacity drive.

Message: Drive (bay) X is undergoing drive recovery
Description: This drive is being rebuilt from the corresponding mirror or parity data.
Recommended Action: Normal operations should occur.
Message: Drive (bay) X was inadvertently replaced
Description: The physical drive was incorrectly replaced after another drive failed.
Recommended Action: Replace the drive that was incorrectly replaced and replace the original drive that failed. Do not run the System Configuration Utility and try to reconfigure; data will be lost.

Message: Duplicate write memory error
Description: Data could not be written to the array accelerator board in duplicate due to the detection of parity errors. This is not a data loss situation.
Recommended Action: Replace the array accelerator board.

Message: Error occurred reading RIS copy from drive (bay) X
Description: An error occurred while DAAD was trying to read the RIS from this drive.
Recommended Action: If there were multiple errors, this drive may need to be replaced.

Message: FYI: Drive (bay) X is non-Compaq supplied
Description: The installed drive was not supplied by Compaq.
Recommended Action: If problems exist with this drive, replace it with a Compaq drive.

Message: Identify controller data did not match with NVRAM
Description: The identify controller data from the array controller did not match the information stored in nonvolatile RAM. This could occur if new, previously configured drives have been placed in a system that has also been previously configured. It could also occur if the firmware on the controller has been upgraded and the System Configuration Utility was not run.
Recommended Action: Check the identify controller data under the Inspect Utility. If the firmware version field is the only thing different between the controller and nonvolatile RAM data, this is not a problem. Otherwise, run the System Configuration Utility.

Message: Identify logical drive data did not match with NVRAM
Description: The identify unit data from the array controller did not match with the information stored in nonvolatile RAM. This could occur if new, previously configured drives have been placed in a system that has also been previously configured.
Recommended Action: Run the System Configuration Utility to configure the controller and nonvolatile RAM.

Message: Insufficient adapter resources
Description: The adapter does not have sufficient resources to perform operations to the array accelerator board. Drive rebuild may be occurring.
Recommended Action: Operate the system without the array accelerator board until the drive rebuild completes.
Message: Logical drive X failed due to cache error
Description: This logical drive failed due to a catastrophic cache error.
Recommended Action: Replace the array accelerator board and reconfigure using the System Configuration Utility.

Message: Logical Drive X status = FAILED
Description: This status could be issued for several reasons. If this logical drive is configured for No Fault Tolerance and one or more drives fail, this status will occur. If mirroring is enabled, and any two mirrored drives fail, this status will occur. If Data Guarding is enabled, and two or more drives fail in this unit, this status will occur. This status may also occur if another configured logical drive is in the WRONG DRIVE REPLACED or LOOSE CABLE DETECTED state.
Recommended Action: Check for drive failures, wrong drive replaced, or loose cable messages. If there was a drive failure, replace the failed drive(s) and then restore the data for this logical drive from the tape backup. Otherwise, follow the wrong drive replaced or loose cable detected procedures.

Message: Logical Drive X status = INTERIM RECOVERY
Description: A physical drive in this logical drive has failed. The logical drive is operating in interim recovery mode and is vulnerable.
Recommended Action: Replace the failed drive as soon as possible.

Message: Logical Drive X status = LOOSE CABLE DETECTED
Description: A physical drive has a cabling problem.
Recommended Action: Turn the system off and attempt to reattach the cable onto the drive. If this does not work, replace the cable.

Message: Logical Drive X status = NEEDS RECOVER
Description: A physical drive in this logical drive has failed and has now been replaced. This drive needs to be rebuilt from the mirror drive or the parity data.
Recommended Action: When booting up the system, select the "F1 - rebuild drive" option to rebuild the replaced drive.

Message: Logical Drive X status = OVERHEATED
Description: The temperature of the Intelligent Array Expansion System drives is beyond safe operating levels and it has shut down to avoid damage.
Recommended Action: Check the fans and the operating environment.

Message: Logical Drive X status = OVERHEATING
Description: The temperature of the Intelligent Array Expansion System drives is beyond safe operating levels.
Recommended Action: Check the fans and the operating environment.

Message: Logical Drive X status = RECOVERING
Description: A physical drive in this logical drive has failed and has now been replaced. The replaced drive is rebuilding from the mirror drive or the parity data.
Recommended Action: Nothing needs to be done. Normal operations can occur.
Message: Logical Drive X status = WRONG DRIVE REPLACED
Description: A physical drive in this logical drive has failed. The incorrect drive was replaced.
Recommended Action: Replace the drive that was incorrectly replaced. Then, replace the original drive that failed with a new drive. Do not run the System Configuration Utility to reconfigure; you will lose data on the drive.

Message: Loose cable detected - logical drives may be marked FAILED until corrected
Description: Controller unable to communicate with one or more physical drives, probably because of a cabling problem. Logical drives may be in a FAILED state until the condition is corrected, preventing access to data on the controller.
Recommended Action: Check all controller and drive cable connections.

Message: Mirror data miscompare
Description: Data was found at reinitialization in the posted write memory; however, the mirror data compare test failed resulting in data being marked as invalid. Data loss is possible.
Recommended Action: Replace the array accelerator board.

Message: Mirrored memory location errors
Description: Soft errors occurred when attempting to read the same data from both sides of the mirrored memory. Data loss will occur.
Recommended Action: Replace the array accelerator board.

Message: No configuration for Accelerator Board
Description: The array accelerator board has not been configured.
Recommended Action: If the array accelerator board is present, run the System Configuration Utility to configure the board, if desired.

Message: SCSI port X, drive ID Y firmware needs upgrading
Description: Drive’s firmware may cause problems and should be upgraded.
Recommended Action: Run Options ROMPaq to upgrade the drive’s firmware to a later revision.

Message: Set configuration command issued
Description: The configuration of the array controller has been updated. The array accelerator board may remain disabled until it is reinitialized.
Recommended Action: Run the System Configuration Utility to reinitialize the array accelerator board.

Message: Soft Firmware Upgrade required
Description: DAAD has determined that your controller is running firmware that has been soft upgraded by the Compaq Upgrade Utility. However, the firmware running is not present on all drives. This could be caused by the addition of new drives in the system.
Recommended Action: Run the Compaq Upgrade Utility to place the latest firmware on all drives.

Message: Threshold for drive (bay) X violated
Description: This message indicates that a monitor and performance threshold for this drive has been violated.
Recommended Action: Check for the particular threshold that has been violated.
Message: Threshold violations for drive (bay) X
Description: This is a list of the individual thresholds that have been violated for this drive.
Recommended Action: The drive may need to be replaced. Run the Compaq Diagnostics Utility to
determine if the drive has been initialized and the threshold violation warrants
drive replacement.

Message: Unknown disable code
Description: A code was returned from the array accelerator board that DAAD does not recognize.
Recommended Action: Obtain the latest version of DAAD.

Message: Warning bit detected
Description: A monitor and performance threshold violation may have occurred. The status of a
logical drive may not be OK.
Recommended Action: Check the other error messages for an indication of the problem.

Message: WARNING - Drive Write Cache is enabled on X
Description: Drive has its internal write cache enabled. The drive may be a third-party drive or
the drive’s operating parameters may have been altered. Condition may cause data corruption if
power to the drive is interrupted.
Recommended Action: Replace the drive with a Compaq supplied drive, or restore the drive’s
operating parameters.

Message: Wrong Accelerator
Description: This could mean that either the board was replaced in the wrong slot or placed in a
system that was previously configured with another board type. Included with this message is a
message indicating the type of adapter sensed by DAAD and a message indicating the type of
adapter last configured in EISA nonvolatile RAM.
Recommended Action: Check the diagnosis screen for other error messages. Run the System
Configuration Utility to update the system configuration.
Integrated Management Log
On servers supporting the Integrated Management Display, the Compaq Integrated Management
Log (IML) replaces the Critical Error Log and Correctable Memory Logs. It records system
events and stores them in an easily viewable form. It marks each event with a time-stamp with
one-minute granularity.
Events listed in the Integrated Management Log are categorized as one of four event
severity levels:
■ Status - indicates that the message is informational only.
■ Repaired - indicates that corrective action has been taken.
■ Caution - indicates a non-fatal error condition.
■ Critical - indicates a component failure.
The Integrated Management Log requires Compaq Operating System-dependent drivers. Refer
to the Compaq Support Software CD for instructions on installing the appropriate drivers.
Multiple Ways of Viewing the Log
You can view an event in the IML in several ways:
■ On the Integrated Management Display
■ From within Compaq Insight Manager
■ From within Compaq Survey Utility
■ From within IML Management Utility
Integrated Management Display
The Integrated Management Display is a Liquid Crystal Display (LCD) panel that presents
information directly at the server, assisting in diagnosing and servicing the server without a
keyboard and monitor.
Compaq Insight Manager
Compaq Insight Manager is a server management tool providing in-depth fault, configuration, and
performance monitoring of hundreds of Compaq servers from a single management console.
System parameters that are monitored describe the status of all key server components. By being
able to view the events that may occur to these components, you can take immediate action. You
can view and print the event list from within Compaq Insight Manager by following the
instructions that follow. You can also mark a Critical or Caution event as Repaired after the
affected component has been replaced, for example, when a failed fan has been replaced. By
marking the component as repaired, you can lower the severity of the event.
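The four severity levels and the Mark Repaired action can be modeled in a few lines of Python. This is only an illustrative sketch: the names Severity, ImlEvent, and mark_repaired are assumptions made for this example and are not part of Compaq Insight Manager. The only rule taken from the text is that a Critical or Caution event can be lowered to Repaired.

```python
from dataclasses import dataclass
from enum import Enum


class Severity(Enum):
    """The four IML severity levels listed in this section."""
    STATUS = "Status"      # informational only
    REPAIRED = "Repaired"  # corrective action has been taken
    CAUTION = "Caution"    # non-fatal error condition
    CRITICAL = "Critical"  # component failure


@dataclass
class ImlEvent:
    """Hypothetical in-memory record for one log entry."""
    description: str
    severity: Severity


def mark_repaired(event: ImlEvent) -> ImlEvent:
    """Lower a Critical or Caution event to Repaired; leave others unchanged."""
    if event.severity in (Severity.CRITICAL, Severity.CAUTION):
        return ImlEvent(event.description, Severity.REPAIRED)
    return event
```

For example, passing an event created as ImlEvent("Fan failure", Severity.CRITICAL) returns a copy whose severity is Severity.REPAIRED, while a Status event is returned unchanged.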
Viewing the Event List
1. From Compaq Insight Manager, select the appropriate server, then select View Device
Data. The selected server displays, with buttons around its perimeter.
2. Select the Recovery button → Integrated Management Log.
3. If a failed component has been replaced, select the event from the list, then select
Mark Repaired.
Printing the Event List
NOTE: You can only view the event list from the Recovery/Integrated Management Log screen
as described above.
1. From the Insight Manager, select the appropriate server.
2. Select the Configuration button → Recovery button → Print.
Compaq Survey Utility
The Compaq Survey Utility is a serviceability tool available from Windows NT and Novell
NetWare that delivers online-configuration capture and comparison to maximize server
availability. It is delivered on the Compaq Management CD in the SmartStart package or is
available on the Compaq website. Refer to the Compaq Management CD for information on
installing and running the Compaq Survey Utility.
After running the Compaq Survey Utility, you can view the IML by loading the output of the
utility (typically called “survey.txt”) into a text viewer such as Notepad. The event list follows
the system slot information. Once you have opened the text file, you can print it using the print
feature of the viewer.
Compaq IML Management Utility
The Compaq IML Management Utility is a DOS-based tool that gives you the off-line ability to
review, mark corrected, and print events from the IML. It is located on the Compaq SmartStart
and Support Software CD. Refer to the SmartStart Installation for Servers poster, which ships
with the server, for information on how to install and use the IML Management Utility.
Event List
The event list displays the affected components and the associated error messages. Though the
same basic information is displayed, the format of the list may differ, depending on how you
view it: on the Integrated Management Display, from within Compaq Insight Manager, the IML
management utility, or the Compaq Survey Utility. An example of the format of an event (as
displayed on the Integrated Management Display) is as follows:
**001 of 010**
---caution---
03/19/1997
12:54 PM
FAN INSERTED
Main System
Location:
System Board
Fan ID: 03
**END OF EVENT**
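If an event is captured as text in the layout shown above, it can be split into fields with a short Python sketch. The line order (counter, severity, date, time, message, then detail lines) is assumed from this single example and is not a documented format; parse_event and the dictionary keys are hypothetical names.

```python
import re
from datetime import datetime

SAMPLE = """\
**001 of 010**
---caution---
03/19/1997
12:54 PM
FAN INSERTED
Main System
Location:
System Board
Fan ID: 03
**END OF EVENT**"""


def parse_event(text: str) -> dict:
    """Split one displayed event into fields, assuming the sample's line order."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    if len(lines) < 6 or lines[-1] != "**END OF EVENT**":
        raise ValueError("not a complete event")
    counter = re.fullmatch(r"\*\*(\d+) of (\d+)\*\*", lines[0])
    if counter is None:
        raise ValueError("missing event counter line")
    # Date and time occupy two separate lines on the display.
    stamp = datetime.strptime(f"{lines[2]} {lines[3]}", "%m/%d/%Y %I:%M %p")
    return {
        "index": int(counter.group(1)),
        "total": int(counter.group(2)),
        "severity": lines[1].strip("-"),
        "timestamp": stamp,
        "message": lines[4],
        "details": lines[5:-1],
    }
```

Calling parse_event(SAMPLE) yields index 1 of 10, severity "caution", and the message "FAN INSERTED"; the location and fan ID lines are kept unparsed in "details" because their structure is not described in this guide.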
Automatic Operating System Shutdown Initiated Due to Fan Failure
Automatic Operating System Shutdown Initiated Due to Overheat Condition
Fatal Exception (Number X, Cause)
Rapid Recovery Services
Compaq servers provide rapid recovery services for diagnosing and recovering from errors.
These tools are available for local and remote diagnosis and recovery.
Rapid recovery means fast identification and resolution of complex faults. The Rapid Recovery
Engine and Insight Management Agents notify the system administrator when a failure occurs,
ensuring that the server experiences minimal downtime. You enable these features through the
System Configuration Utility. These integrated server management features are:
■ Automatic Server Recovery-2 (ASR-2)
■ Server Health Logs (on servers not supporting Integrated Management Logs)
These are discussed in more detail on the Systems Reference Library CD (SRL).
Automatic Server Recovery-2
Automatic Server Recovery-2 (ASR-2) lets the server restart automatically from the operating
system or the Compaq Utilities. To use this feature, you must use the System Configuration
Utility to install Compaq Utilities in the system partition.
You can tell ASR-2 to restart your server after a critical hardware or software error occurs.
Using the Compaq System Configuration Utility, configure the system for either automatic
recovery or for attended local or remote access to diagnostic and configuration tools.
You can also configure ASR-2 to page an administrator when the system restarts. ASR-2
depends on the application and driver that routinely notify the ASR-2 hardware of proper system
operations. If the time between ASR-2 notifications exceeds the specified period, ASR-2
assumes a fault has occurred and initiates the recovery process.
To configure ASR-2:
1. Execute the System Configuration Utility.
2. Select View and Edit Details.
3. Set the software error recovery status to Enabled.
4. Set the software error recovery time-out.
The available recovery features are:
■ Software Error Recovery – automatically restarts the server after a software-induced
server failure
■ Environmental Recovery – allows the server to restart when temperature, fan, or AC
power conditions return to normal
Unattended Recovery
For unattended recovery, ASR-2 performs the following actions:
■ Logs the error information to the IML
■ Resets the server
■ Pages you (if a modem is present and you selected Paging)
■ Tries to restart the operating system. Often the server restarts successfully, making
unattended recovery the ideal choice for remote locations where trained service personnel
are not immediately available.
If ASR-2 cannot restart the server within 10 attempts, it places a critical error in the Integrated
Management Log, starts the server into Compaq Utilities, and enables remote access (if you
configured remote access).
To use this level of ASR-2, you must configure ASR-2 to load the operating system after restart.
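The unattended recovery sequence can be pictured as a retry loop. The Python sketch below is a simulation for illustration only, not ASR-2 firmware behavior: the function name, its parameters, and the action strings are assumptions, and only the 10-attempt limit and the order of actions come from the text.

```python
from typing import Callable, List

MAX_RESTART_ATTEMPTS = 10  # limit stated in the text


def unattended_recovery(try_boot_os: Callable[[], bool],
                        paging_enabled: bool = False,
                        remote_access_configured: bool = False) -> List[str]:
    """Return the ordered actions a simulated unattended recovery performs."""
    actions: List[str] = []
    for attempt in range(1, MAX_RESTART_ATTEMPTS + 1):
        actions.append(f"log error to IML (attempt {attempt})")
        actions.append("reset server")
        if paging_enabled:
            actions.append("page administrator")
        if try_boot_os():
            actions.append("operating system restarted")
            return actions
    # All attempts failed: fall back to Compaq Utilities.
    actions.append("log critical error to IML")
    actions.append("start Compaq Utilities")
    if remote_access_configured:
        actions.append("enable remote access")
    return actions
```

Passing a try_boot_os callable that always returns False shows the fallback path: ten resets, then a critical log entry and a start into Compaq Utilities.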
Attended Recovery
For attended recovery, ASR-2 performs the following actions:
■ Logs the error information to the IML
■ Resets the server
■ Pages you (if a modem is present and you selected Paging)
■ Starts Compaq Utilities from the hard drive
■ Enables remote access
During system configuration, these utilities are placed on the system utilities partition of the hard
drive.
If you have configured for dial-in access and have a modem with an auto-answer feature
installed, you can dial in and remotely diagnose or reconfigure the server.
If you have configured the Compaq Utilities for network access, you can access the utilities over
the network. You can use Compaq Insight Manager for dial-in or network access.
Hardware Requirements
To use this level of ASR-2 over a modem, you need the following:
■ Compaq modem or optional Hayes compatible modem
■ System Configuration Utility and Diagnostics Utility installed on the system partition of
the hard drive
■ ASR-2 configured to load Compaq Utilities after restart
You can also run Compaq Utilities remotely over an IPX or IP network using the
Network feature:
■ To use Compaq Utilities on an IPX network, you must have Compaq Insight Manager 2.0
or later or an NVT (Novell Virtual Terminal) Terminal Emulator with VT100 or ANSI
terminal capabilities.
■ To use Compaq Utilities on an IP network, you must have Compaq Insight Manager 2.10
or later, or a Telnet Terminal Emulator with VT100 or ANSI capabilities.
If you are notified that ASR-2 restarted the server and you have restarted to Compaq Utilities,
use the Inspect Utility or Compaq Insight Manager to view the critical error in the Critical Error
Log. Run Diagnostics to diagnose and resolve the problem.
You can configure ASR-2 to restart the server into Compaq Utilities to diagnose the critical
error, or to start the operating system to return the server to operational status as rapidly
as possible.
When you enable ASR-2 to start the operating system, the server tries to start from the primary
partition. In this mode, ASR-2 can page you if a critical error occurs, but you cannot access
Compaq Utilities.
When you enable ASR-2 to start Compaq Utilities, your server restarts after a critical error and
loads Compaq Utilities from the system partition on the hard drive.
You can configure your server to start Compaq Utilities in four different ways:
■ Without remote console support; for example, to run Compaq Utilities from the server
console only
■ With remote console support using modems for dial-in access
■ With remote console support using a modem to dial a predetermined telephone number
■ With remote console support through a network connection (IP or IPX)
Compaq Integrated Remote Console
The standard Compaq Integrated Remote Console performs a wide range of configuration
activities. Some of the console’s features include:
■ Accessible using ANSI terminal
■ Operates independently of the operating system
■ Provides for remote server reboot
■ Provides access to system configuration
■ Uses out-of-band communication with dedicated management modem installed in
the server
For more information, see the Integrated Remote Console User Guide that shipped with
your server.
IMPORTANT: Before configuring ASR-2, verify that the System Configuration Utility and
Diagnostics software are installed on the system partition. ASR-2 must have this to start
Compaq Utilities after a system restart. Compaq recommends this even if you configure
ASR-2 to start the operating system.
Compaq Health Driver
The Compaq Health Driver continually resets the ASR-2 timer according to the frequency you
specified in the System Configuration Utility (for example, 10 minutes). If the ASR-2 timer
counts down to zero before being reset, due to an operating system crash or a server lock-up,
ASR-2 restarts the server into either Compaq Utilities or the operating system (as indicated by
the System Configuration parameters). The default value is 10 minutes. The allowable settings
are 5, 10, 20, and 30 minutes.
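The countdown-and-reset behavior described above is a watchdog timer. The following Python sketch models it under stated assumptions: AsrTimer and its methods are hypothetical names, and time is passed in as plain seconds so the example stays deterministic. The allowable settings and the 10-minute default are taken from the text.

```python
ALLOWED_TIMEOUTS_MIN = (5, 10, 20, 30)  # settings listed in the text
DEFAULT_TIMEOUT_MIN = 10


class AsrTimer:
    """Toy countdown timer; times are plain seconds supplied by the caller."""

    def __init__(self, timeout_min: int = DEFAULT_TIMEOUT_MIN, now: float = 0.0):
        if timeout_min not in ALLOWED_TIMEOUTS_MIN:
            raise ValueError(f"time-out must be one of {ALLOWED_TIMEOUTS_MIN}")
        self.timeout_s = timeout_min * 60
        self.last_reset = now

    def reset(self, now: float) -> None:
        """Called by the simulated health driver while the OS is healthy."""
        self.last_reset = now

    def expired(self, now: float) -> bool:
        """True once the countdown has reached zero without a reset."""
        return now - self.last_reset >= self.timeout_s
```

As long as reset() keeps being called more often than the time-out, expired() stays False; if the simulated driver stops calling it, expired() becomes True once the configured number of minutes has elapsed, which is the point at which ASR-2 would restart the server.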
For remote and off-site (unattended) servers, setting the software error recovery time-out for
5 minutes reduces server downtime and allows the server to recover quickly. For local (attended)
servers located onsite, you can set the software error recovery time-out for 20 or 30 minutes,
giving you time to arrive at the server if you wish to manually diagnose the problem.
The Compaq Health Driver is independent of the ASR-2 timer. You should load it and enable the
ASR-2 timer. This allows the driver to detect and log information about numerous hardware and
software errors in the IML. However, you cannot enable the ASR-2 timer without loading the
Compaq Health Driver.
Before ASR-2 restarts the server, it records any information available about the condition of the
operating system in the Critical Error Log, or the IML depending on the server support. This
information can be used to diagnose an operating system crash or server lock-up, while still
allowing the server to be restarted.
The following ASR-2 flow chart shows you the sequence of events after a hardware or software
error occurs:
Figure 3-1. ASR-2 flow chart
Hardware/Software error occurs
|
Error records in the Critical Error Log, or the Integrated Management Log,
depending on your server configuration.
|
Operating System halts normal operation
|
ASR Timer expires
|
Server is reset
|
If a modem is installed and paging is enabled, the Server Failure Notification pager alert is
sent to the Server Administrator
|
Unattended server boots the Operating System. If the server continues experiencing
hardware/software errors and the number of ASR cycles exceed the specified number of recovery
attempts, the server logs an error to the Server Health Log or the Integrated Management Log and
boots the Compaq Utilities from the system partition on the hard drive
---Or---
Server boots the Compaq Utilities on the system partition on the hard drive
|
If a modem is installed, ASR puts the modem on auto answer so that the Server Administrator can
dial in (using third party terminal emulator software) to remotely run the Compaq Utilities to
identify the source of the fault
|
Or
|
Local Server Administrator runs Compaq Utilities from server console to identify the source
of the fault
Booting into Compaq Utilities
When you enable ASR-2 to start into Compaq Utilities and a critical error occurs, the operating system-specific Health Driver logs the error information in the Critical Error Log or the IML and
the ASR-2 feature restarts the server. When the system reinitializes, the system pages the
designated administrator (if enabled), and starts Compaq Utilities from the hard drive.
If Dial-In status is enabled, the modem is placed in auto-answer mode. If you enable Dial-Out
status, you are automatically enabled for Dial-In.
If Network Status is enabled, the appropriate network support software is loaded, depending on
the network protocol, IP or IPX. This allows remote access via the network.
IMPORTANT: Compaq Utilities are loaded from a specially created system partition on the
hard drive. This partition was configured during server configuration.
You can access the server and view the Server Health Logs (in servers not supporting the IML)
remotely by modem, in-band over the network, or directly from the server. For modem access,
you must have either Compaq Insight Manager 2.0 or above or have a VT100 or ANSI terminal
type device. You may use a standard CRT with VT100 or ANSI emulation capability, or you
may use a PC with a VT100 or ANSI terminal emulation package. The communication
parameters must be set for 8 data bits, no parity, and 1 stop bit.
You can also enable ASR-2 to allow network access using the Network Status feature in the
System Configuration Utility. This feature requires version 2.24 or later of the System
Configuration Utility. On an IPX network, you must have either Compaq Insight Manager 2.0 or
later or a Novell Virtual Terminal (NVT) emulator. For IP access, you must have Compaq Insight
Manager 2.10 or later, or a Telnet terminal emulator.
The System Configuration Utility settings should resemble the settings in Table 3-23 when you
enable ASR-2 to start into Compaq Utilities.
Table 3-23
Compaq System Configuration Utility Pager Settings
for Booting into Compaq Utilities

Pager Data: Pager status
Setting: Enabled
Description: Indicates if the pager feature is enabled or disabled.

Pager Data: Pager dial string
Setting: ATDT 555-5555
Description: Indicates the pager dial string and delay before the pager message. Pagers
typically use one of the following formats:
Local pagers: ATDT 555-5555
Wide area pagers: ATDT 1-800-555-5555,1234567#

Pager Data: Pager message
Setting: 1234567#
Description: Represents a unique number (maximum seven digits, numeric only) that you
must designate to identify the server on your pager display. The ROM adds a
three-digit code to the front of this number. The first two indicate the subsystem
and the third indicates the severity of the error that caused the alert. The #
symbol usually terminates the message. If no message is required, delete
the # symbol.

Pager Data: Pager test
Setting: Select to test pager setup
Description: Use this to test the current pager settings. Press Enter to dial the pager number,
and the pager message (if present) displays. You must configure the computer
before testing the pager and the Pager Status must be set to Enabled. Do not test
the pager if you are running remotely and are using only one modem.

Pager Data: Serial interface
Setting: COM1
Description: Select the communications port for the modem used by the pager and the remote
ASR-2 functions. The options are COM1 and COM2.

Pager Data: Dial-in status
Setting: Enabled
Description: Set Dial-In Status to Enabled. Be sure the Reset Boot option is set to Boot
Compaq Utilities. When the system starts because of an ASR reset, it starts to the
Compaq Utilities, sets the Management Modem to auto-answer, and waits for the
administrator to dial in and run the Compaq Utilities.
You automatically disable this option when you configure the software error
recovery start option to Boot Operating System. When ASR pages you, you cannot
dial in unless ASR-2 exceeds 10, the threshold number of server restart retries.
When this happens, ASR-2 restarts the server into the Compaq Utilities and
places the modem in auto-answer mode.
Pager Data: Dial-out status
Setting: Enabled
Description: Allows ASR-2 to dial out to a remote workstation. If you selected this option,
Dial-In Status is automatically selected.

Pager Data: Dial-out string
Setting: 555-1234
Description: Enter the dial string followed by the remote computer telephone number.
To use the dial-out feature, set Dial-Out Status to Enabled and set the Dial-Out
String to the correct phone number. You must also set the Reset Boot option to
Boot Compaq Utilities. When the system restarts because of an ASR reset, the
administrator is paged via Pager Status and Pager Dial String, the system
restarts to the Compaq Utilities, and dials out to the phone number provided in
the Dial-Out string. The dial-out number will be tried five times. If it fails to
connect after five attempts, the modem is put in auto-answer mode.

Pager Data: Network status
Setting: Enabled
Description: To allow network access to Compaq Utilities, set Network Status to Enabled
and make sure the Reset Boot option is set to Boot Compaq Utilities.

Pager Data: Network protocol
Description: To use IPX network access, set Network Protocol to IPX. When the system
restarts to the Compaq Utilities because of an ASR reset, it loads IPX network
support. This enables remote access via NVT.
To use IP network access, set Network protocol to IP. Also make sure to set
Network IP address, Network IP net mask, and Network IP router address.
When the system restarts to the Compaq Utilities because of an ASR reset, it
loads IP network support. This enables remote access via Telnet.
NOTE: The Network Status must be set to Enabled for network access.

Pager Data: Network controller
Setting: Compaq
Description: For all Compaq Standard Network Controllers.

Pager Data: Network host name
Setting: CPQHOU
Description: Enter the network name of the server. Use underscores instead of spaces
within the name, for example, Compaq_Server. If you are using IPX network
access to the Compaq Utilities, this server name is used to advertise NVT host
services. This server name displays in the Compaq Insight Manager server list
when it determines it can communicate via NVT. Set this name to be the same
as the server name you assign when the host OS is running.

Pager Data: Network card slot
Setting: Slot #
Description: Select the slot number of the network interface card you wish to use for
network access to Compaq Utilities.

Pager Data: Network frame type
Setting: ETHERNET_II
Description: Select the frame type for your network. Selections include both Ethernet and
Token Ring topologies.
Pager Data: Network IP address
Description: Enter the IP address for this server in standard dot notation.
NOTE: This is not used if you select Custom for Network controller. You
must enter your IP address in the NET.CFG file that you load into the
system partition.

Pager Data: Network IP net mask
Description: Enter the net mask for this server in standard dot notation.
NOTE: This is not used if you select Custom for network controller. You
must enter your IP address in the NET.CFG file that you load into the
system partition.

Pager Data: Network IP router address
Description: Enter the router to be used for this server in standard dot notation.
NOTE: This is not used if you select Custom for network controller. You
must enter your IP address in the NET.CFG file that you load into the
system partition.
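The Pager message entry in Table 3-23 states that the ROM adds a three-digit code in front of the server number: two digits for the subsystem and one for the severity. A minimal Python sketch of splitting such a pager display follows. The function name and the sample code 012 are hypothetical, and the meaning of individual code values is not given in this guide, so the sketch only separates the fields.

```python
def split_pager_display(display: str) -> dict:
    """Split a pager display into subsystem, severity, and server number."""
    digits = display.rstrip("#")  # the # symbol only terminates the message
    if not (digits.isascii() and digits.isdigit()) or not 4 <= len(digits) <= 10:
        raise ValueError("expected a 3-digit code followed by 1 to 7 digits")
    return {
        "subsystem": digits[0:2],   # first two digits of the ROM prefix
        "severity": digits[2],      # third digit of the ROM prefix
        "server_id": digits[3:],    # number configured as the pager message
    }
```

For a server configured with the pager message 1234567#, a display of 0121234567 would split into subsystem "01", severity "2", and server number "1234567".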
If you configure the server to boot into Compaq Utilities, it prepares for remote communications.
You can remotely run Diagnostics software, the Inspect Utility, or the System Configuration
Utility using a workstation running terminal emulation software, such as Compaq Insight
Manager or PC Anywhere.
Booting into the Operating System
When you enable ASR-2 to restart into the operating system and a critical error occurs, ASR-2
logs the error in the Critical Error Log or IML and restarts the server. The system ROM pages
the designated administrator, then executes the normal restart process.
IMPORTANT: When you enable ASR-2 to restart into the operating system, Modem Dial-In
Status, Network Status, and Modem Dial-Out Status are automatically disabled. In this
mode, ASR-2 can page you if a critical error occurs, but you cannot access the server, and
the server cannot dial out to a remote workstation.
If the ASR-2 feature cannot restart the server within 10 attempts, it logs a critical error in the
Critical Error Log or IML, restarts the server into the Compaq Utilities, and puts the modem
into auto-answer mode.
Your System Configuration Utility settings should resemble the following when you enable ASR-2
to restart into the operating system:
■ Serial interface: COM1
■ Dial-in status: Disabled
■ Dial-out status: Disabled
■ Dial-out string: 555-1234
■ Network status: Disabled
■ Network protocol: IPX
■ Network controller: Compaq
■ Network host name: CPQHOU
■ Network card slot: Slot #
■ Network frame type: ETHERNET_II
■ Network IP address: xxx.xxx.xxx.xxx
■ Network IP net mask: xxx.xxx.xxx.xxx
■ Network IP router address: xxx.xxx.xxx.xxx
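The automatic adjustments described in this chapter (restarting into the operating system disables Dial-In, Dial-Out, and Network Status; enabling Dial-Out also enables Dial-In) can be expressed as a small consistency check. This Python sketch is illustrative only: RemoteSettings, normalize, and the "utilities" and "os" labels are assumptions for this example, not System Configuration Utility names.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class RemoteSettings:
    boot_target: str = "utilities"  # "utilities" or "os" (hypothetical labels)
    dial_in: bool = False
    dial_out: bool = False
    network: bool = False


def normalize(s: RemoteSettings) -> RemoteSettings:
    """Apply the automatic adjustments described in the text."""
    if s.boot_target not in ("utilities", "os"):
        raise ValueError("boot_target must be 'utilities' or 'os'")
    if s.boot_target == "os":
        # Restarting into the operating system disables all remote-access options.
        return replace(s, dial_in=False, dial_out=False, network=False)
    if s.dial_out:
        # Enabling dial-out automatically enables dial-in.
        return replace(s, dial_in=True)
    return s
```

Normalizing a configuration with boot_target "os" returns every remote option disabled, which matches the list above.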
ASR-2 Security
The standard Compaq password features function differently during ASR-2 than during a typical
system startup.
During ASR-2, the system does not prompt for the Power-On Password. This allows the ASR-2
to restart the operating system or Compaq Utilities without user intervention.
To maintain system security, set the server to boot in Network Server Mode (an option in the
System Configuration Utility). This option ensures that the server keyboard is locked until you
enter the Keyboard Password.
Select an Administrator Password (an option in the System Configuration Utility). During
attended ASR-2 (local or remote), you must enter this Administrator Password before any
modifications can be made to the server configuration.
Server Health Logs
In some servers, Server Health Logs are replaced by the IML, if it is supported. See “Integrated
Management Display” in this chapter for more information.
Server Health Logs contain information to help identify and correct any server failures and
correlate hardware changes with server failure. Server Health Logs are stored in nonvolatile
RAM and consist of the Critical Error Log and the Revision History Table.
If errors occur, information about the errors is automatically stored in the Critical Error Log.
Whenever boards or components (that support revision tracking) are updated to a new revision,
the Revision History Table is updated.
Critical Error Log
The Critical Error Log records memory errors, as well as catastrophic hardware and software
errors that cause the system to fail. This information helps you quickly identify and correct the
problem, thus minimizing downtime.
You can view the Critical Error Log through the Compaq Insight Manager. The Diagnostics
Utility either resolves the error or suggests corrective action in systems that do not support event
logs.
The Critical Error Log identifies and records the following errors. Each error type is briefly
explained below.
Table 3-24
Critical Error Log Messages

Message: Abnormal Program Termination
Description: The operating system has encountered an abnormal situation that has caused a
system failure.

Message: ASR-2 detected by ROM
Description: An ASR-2 activity has been detected and logged by the system ROM.

Message: ASR-2 Test Event
Description: The System Configuration Utility generated a test alert.

Message: Automatic Server Recovery Base Memory Parity Error
Description: The system detected a data error in base memory following a reset due to the
Automatic Server Recovery-2 (ASR-2) timer expiration.

Message: Automatic Server Recovery Extended Memory Parity Error
Description: The system detected a data error in extended memory following a reset due to the
ASR-2 timer expiration.

Message: Automatic Server Recovery Memory Parity Error
Description: The system ROM was unable to allocate enough memory to create a stack. Then, it was
unable to put a message on the screen or continue booting the server.

Message: Automatic Server Recovery Reset Limit Reached
Description: The maximum number of system resets due to ASR-2 timer expiration has been reached,
resulting in the loading of Compaq Utilities.

Message: Battery Failing
Description: Low system battery warning. Replace battery within 7 days to prevent loss of
nonvolatile configuration memory. Failure of the battery supporting the system’s nonvolatile RAM
is imminent.

Message: Caution: Temperature Exceeded
Description: The operating system has detected that the temperature of the system has exceeded
the caution level. Accompanying data in the log notes if an auto-shutdown sequence has been
invoked by the operating system.

Message: Diagnostic Error
Description: An error was detected by the Diagnostics Utility. See the specific error code in
this chapter for a detailed explanation.

Message: Error Detected On Boot Up
Description: The server detected an error during the Power-On Self-Test (POST).

Message: Processor Prefailure
Description: A CPU has passed an internal corrected error threshold; excessive internal ECC
cache errors.
Message: NMI - PCI Bus Parity Error
Description: A parity error was detected on the PCI bus.

Message: NMI - Expansion Board Error
Description: A board on the expansion bus indicated an error condition, resulting in a
server failure.

Message: NMI - Processor Parity Error
Description: The processor detected a data error, resulting in a server failure.

Message: Server Manager Failure
Description: An error occurred with the Server Manager/R.

Description: A bus master expansion board in the indicated slot did not release the bus after
its maximum time, resulting in a server failure.

Description: A board on the expansion bus delayed a bus cycle beyond the maximum time, resulting
in a server failure.

Description: Software was unable to reset the system fail-safe timer, resulting in a
server failure.

Message: NMI - Software Generated Interrupt Detected Error
Description: Software indicated a system error, resulting in a server failure.

Message: Caution: Temperature Exceeded
Description: The operating system has detected that the temperature of the system has exceeded
the caution level. Accompanying data in the log notes if an auto-shutdown sequence has been
invoked by the operating system.

Message: Abnormal Program Termination
Description: The operating system has encountered an abnormal situation that has caused a
system failure.

Message: ASR-2 Test Event
Description: The System Configuration Utility generated a test alert.

Message: NMI - Automatic Server Recovery Timer Expiration
Description: The operating system has received notice of an impending ASR-2 timer expiration.

Message: Required System Fan Failure
Description: The required system fan has failed. Accompanying data in the log notes if an
auto-shutdown sequence has been invoked by the operating system.

Message: UPS A/C Line Failure Shutdown or Battery Low
Description: The UPS notified the operating system that the AC power line has failed.
Accompanying data indicates if an auto-shutdown sequence has been invoked or if the battery has
been nearly depleted.

Message: ASR-2 detected by ROM
Description: An ASR-2 activity has been detected and logged by the system ROM.
Revision History Table
Some errors can be resolved by reviewing changes to the server configuration. The server has an
Automatic Revision Tracking (ART) feature that helps you review recent changes to the server
configuration.
One ART feature is the Revision History Table, which contains the hardware version number of
the system board and any other system boards providing ART-compatible revision information.
This feature lets you determine the level of functionality of an assembly in a system without
opening or powering down the unit.
Table 3-25
Revision History Format

Current Revisions
Date: 10/31/95
System Board Revision: 03
  Assembly Version: 1
  Functional Revision Level: C
Processor 01 Revision: 01
  Assembly Version: 1
  Functional Revision Level: A

Previous Revisions
Date: 03
System Board Revision: 03
  Assembly Version: 1
  Functional Revision Level: C
Processor 01 Revision: 01
  Assembly Version: 1
  Functional Revision Level: A
The Revision History Table is stored in nonvolatile RAM and is accessed through the Inspect
Utility and Compaq Insight Manager.