HP NetServer AA 4000 Reference Guide

Printed in March 2000
HP NetServer AA
Notice
The information contained in this document is subject to change without notice.
incidental or consequential damages in connection with the furnishing, performance, or use of this material.
Hewlett-Packard assumes no responsibility for the use or reliability of its software on equipment that is not furnished by Hewlett-Packard.
This document contains proprietary information that is protected by copyright. All rights are reserved. No part of this document may be photocopied, reproduced, or translated to another language without the prior written consent of Hewlett-Packard Company.
Assured Availability is a trademark, and the Marathon logo and Endurance are registered trademarks of Marathon Technologies Corporation. Microsoft and Windows NT are registered trademarks of Microsoft Corporation. All other brand or product names are trademarks or registered trademarks of their respective holders.
Network Server Division
NSD Technical Training
10955 Tantau Avenue, MS45SLF
Cupertino, California 95014 USA
© Copyright 2000, Hewlett-Packard Company.
Hewlett-Packard Company
AA 4000 Reference Guide
Contents
CHAPTER ONE ~ ARCHITECTURE OVERVIEW AND TERMINOLOGY
What is HP AA?
HPAA Components
Software Components
The Logical Server
Windows NT and Application Licensing
Division of Labor
The Compute Elements
The SSDLs
The I/O Processors
Client Network Access
SCSI Identifiers
SCSI Port Number Changes
Device Redirection
Putting it all together
NetServer Rackmount Configurations
Rules for maintaining availability
CHAPTER TWO ~ HPAA SYSTEM BOOT UP
Verifying the MIC connections
Checking the SSDL LEDs
Troubleshooting a “RED” LED
The MTCTEST utility
Powering Up the HPAA System
Cabling the AA 4000 hardware
Cabling the Console Switch
Power Distribution
Power On Sequence
AA 4000 Boot Options
AA 4000 Boot Process
IOP Boot
The First CE Boot
The Second CE Boot
Using the Keyboard, Mouse, and Video
Video
Keyboard and Mouse Control
Shutting Down the System
MTCCONS.exe
Removing Components
Server Shutdowns and Reboots
Avoiding Unnecessary Re-Mirror Operations
Using the “Right” Copy of Windows NT
When to use Windows NT on the IOPs
When to use Windows NT on the CEs
CHAPTER THREE ~ AA 4000 AND HP MANAGEMENT TOOLS
AA 4000 Software Architecture
Marathon System Manager (MSM)
Remote Management
MSM – Main Screen
Control and Display
Control and Display Options
MSM Preferences
Device Status
Last Mirror Copy Status
Utilities
Display Software Revisions
HP TopTools Remote Control Card
HP TopTools and Agents
ManageX
CHAPTER FOUR ~ NETWORKING EXPLAINED
Network Planning
PCI Slot locations
Windows NT Bus Numbering
How Windows NT sees it…
Gathering Networking Information
Three Independent Networks
The Private Network (IOP link)
IOP Link Configuration
The Public Network (Ethernet Rails)
Public Rail Configuration (IOP)
IOP Public Rail Bindings
Public Rail Configuration (CE)
Virtual Adapters
CE Bindings
Public Network to IOP to CE and Back
The Virtual Network
Virtual Network Configuration
Adding a Public Rail
Working with a fully configured system
CHAPTER FIVE ~ SYSTEM UPGRADES
Before Upgrading the HP AA System
Becoming Familiar With the Array
System Documentation
Adding Additional Storage to the Array
Can downtime be tolerated?
Software requirements:
Document the present storage configuration
Install the additional storage
Decision time
Reboot the IOP
Configuring the new mirrored drive
Determine the 4 digit SCSI Identifier
Modify the Marathon configuration on IOP1 and IOP2
Reboot the array
Confirm the new drives on the CE
Adding SCSI Devices (HBA's, HP NetRAID)
Upgrading the Marathon Software
Upgrading Marathon software on the CE Operating System
Upgrading Marathon Software on Each IOP
Running MTCFlash on each CE
Verifying the Upgrade
Upgrading an Installed System to an SMP IOP System
Other Upgrade and Downgrade Options
Updating/Patching Windows NT with Service Packs
For the CE Operating System
For the IOP Operating System
Updating NT Applications
CHAPTER SIX ~ BACKUP AND RESTORE
Backup topologies and tradeoffs
Pure Local Backups
Semi-Local Backups
Network Backups
Configuration Comparisons
Backup Configuration Setup Notes
Pure-local backup configuration
Semi-local backup configuration
Network backup configuration
Disaster recovery procedures
Part Numbers for Backup Configurations
CHAPTER SEVEN ~ BASIC TROUBLESHOOTING
Overview of Troubleshooting in a HP AA Environment
Diagnosing Faults
Other MTC Tools
Isolating the Faults
Analyzing an Event
Correcting the Faults
Providing Information to the HP Call Center
The Windows NT "Blue Screen of Death"
Basic Marathon Hardware Replacement
Replacing the MIC Cable
Replacing the TL Cable
Replacing the IL Cable
Replacing an IOPx.Ethernet Cable
Replacing a MIC
Replacing an SSDL
Replacing an IOP
Replacing a CE
Replacing a Failed Ethernet Adapter
Replacing a Failed Mirrored Disk
Replacing a Failed NetRAID Adapter
Reenabling faulted Components
Troubleshooting Tips
Common Problems
Ch 1: Architecture Overview and Terminology
Chapter One ~ Architecture Overview and Terminology
This chapter contains a brief overview of the HP AA system based on the Endurance 4000 software from Marathon Technologies. Topics to be covered include:
HP AA Components
Installation Overview
How the system works
Storage Architecture, and
Network Architecture
What is HP AA?
HP AA is a platform of high-availability solutions offering the highest levels of system uptime with the lowest total cost of ownership in the industry. Using HP NetServers, standard Windows NT, and unmodified "off-the-shelf" applications, every HP NetServer AA Solution delivers:
Nonstop processing through failures & repairs
Continuous data access to storage
Uninterrupted network connectivity
Disaster tolerance for multi-site protection
In addition to HP NetServers, the solution uses an OEM hardware/software kit from Marathon Technologies to create one logical server array from four NetServers. The current model of the kit is known as the Endurance 4000 (AA 4000). Though HP AA is an HP product, sold and supported by HP, the system splash screens, administrative tools, and product documentation make several references to Marathon and, more importantly, to Endurance 4000. Knowing the product name AA 4000 is important in order to maintain and apply the correct software revisions for the array itself and the firmware of the interconnect cards.
For the remainder of this reference guide, when referring to the array as a whole, the convention used is HPAA. When referring to the specifics of the components, ‘AA 4000’ may be used to distinguish the generation of the product (as opposed to future products that may be referred to as ‘E6XX’).
HPAA Components
There are four major hardware components of the HPAA system:
The NetServers – Four NetServers are needed: two perform a synchronous operation of the NT operating system, and the other two perform asynchronous I/O operations.
Marathon Interface Cards (MIC) – Each NetServer has a MIC placed in a particular PCI slot. The MICs are identical and all of them must have the same firmware revision levels.
SplitSite Data Link (SSDL) – There are two SSDLs that provide for the interconnection between the four NetServers. The SSDLs are simply a transport mechanism between the NetServers and offer little software control of the system. The SSDLs are also used to provide video, keyboard, and mouse functions to the administrator of the system. More information on the SSDLs will be provided later.
MIC Cables – Each MIC attaches to the SSDL through a 100-pin serial cable. There are two different MIC cables. The four NetServers attach to the SSDLs with identical 5-meter cables. The SSDLs attach to each other with a similar, but unique cable. This cable is identified as the one cable in the kit that has a ferrite (a thicker, black rectangular shape) incorporated at one end of the cable.
Software Components
Though at first glance, the HPAA solution appears to be mostly a hardware solution, in fact, it is an “85%” software solution. There are two major components of the software: the firmware on the MICs and the AA 4000 software installed on each of the NetServers. This obviously does not count the Windows NT operating system and any application software to be added for operation. The Windows NT software is fairly standard in all implementations; the application environment will vary among the different implementations.
MIC Firmware
The MIC firmware has a revision level, as do some of its subcomponents. When the NetServer is booted and going through its normal boot routine, following POST operations the NetServer will detect the presence of the MIC. When the MIC is detected the screen will display the following revision levels:
Marathon BIOS
Ucode
FPGA
Adapter Revision
The revision levels of each of these components must be identical on all four NetServers. If one of the NetServers contains a MIC with down-level revisions, then the MIC must be flashed using the MTCFlash utility. Specific steps for performing this operation can be found in Chapter 6 of the E 4000 User Guide provided with the system (and provided in Adobe Acrobat format on the AA 4000 CD), or Chapter 6 of this guide.
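The requirement that all four MICs report identical revision levels can be expressed as a simple comparison. The following Python sketch is illustrative only: the revision strings are made-up placeholders that an administrator would transcribe by hand from each NetServer's boot screen, and `find_mismatched_revisions` is a hypothetical helper, not part of the AA 4000 software.

```python
from collections import Counter

# Components whose revision levels are displayed when the MIC is detected at boot.
COMPONENTS = ("Marathon BIOS", "Ucode", "FPGA", "Adapter Revision")


def find_mismatched_revisions(revisions):
    """Return {component: {node: revision}} for nodes whose revision differs
    from the most common value reported for that component."""
    mismatches = {}
    for component in COMPONENTS:
        reported = {node: levels[component] for node, levels in revisions.items()}
        most_common, _ = Counter(reported.values()).most_common(1)[0]
        outliers = {n: r for n, r in reported.items() if r != most_common}
        if outliers:
            mismatches[component] = outliers
    return mismatches


# Placeholder values transcribed by hand (not real revision levels).
example = {
    "CE1": {"Marathon BIOS": "A", "Ucode": "B", "FPGA": "C", "Adapter Revision": "D"},
    "CE2": {"Marathon BIOS": "A", "Ucode": "B", "FPGA": "C", "Adapter Revision": "D"},
    "IOP1": {"Marathon BIOS": "A", "Ucode": "B", "FPGA": "C", "Adapter Revision": "D"},
    "IOP2": {"Marathon BIOS": "A", "Ucode": "old", "FPGA": "C", "Adapter Revision": "D"},
}

print(find_mismatched_revisions(example))
```

The sketch only flags the node that disagrees with the majority. Deciding which revision is actually down-level, and flashing the MIC, remains a manual step.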
AA 4000 Software
The HPAA system is shipped to the customer as a complete, operational array. This includes the AA 4000 software already installed. The AA 4000 software can also be found on the Marathon CD provided with the system. It may be necessary during maintenance procedures to re-install the software (details on this operation are found in Chapter 6 of this guide). The AA 4000 software exists on the same logical drive where the Windows NT system files are located for all four NetServers. Some of the files are in a new directory and some are in the Windows NT system directory. The specific locations of the AA 4000 software are not as important as the presence of the software itself.
CAUTION
The specific files of the AA 4000 software do not need to be modified or accessed by the administrator through Windows Explorer. Maintenance of these files must take place only through AA 4000 Management Tools or Utilities.
The Logical Server
Logical servers are created from an array of four separate servers. Computing is distinctly separate from the input/output (I/O) processing, and the array runs simultaneously on two symmetrical halves (or tuples), which, combined together, do not have a single point of failure. I/O processors run asynchronously, and the compute elements run synchronously in lockstep.
Compute Elements
Two of the NetServers take the roles of Compute Elements (CEs). Within the AA 4000 software the CEs are numbered CE1 and CE2. The CEs are two identical NetServers, including the same stepping code of the same processor type, and the same system memory sizes. All other components of the CEs are either disabled in the BIOS or removed from the system. There will be no use of any onboard SCSI or a SCSI HBA. There are no network cards, keyboard, mouse, or any other peripheral devices. The lone exception to this is the MIC. The MIC is the only I/O device in the CE. These characteristics of the CEs result in two servers that can perform processing in what is called “lockstep.”
With the CEs running in lockstep, together they are running one copy of Windows NT Server. During a typical HPAA boot process, one CE boots from a system disk located in one of the servers functioning as an I/O Processor. The second CE does not boot from a disk; instead, it synchronizes with the other CE. Once synchronization is complete, the two CEs process in “lockstep,” allowing for fail-through performance should anything happen to one of the CEs.
I/O Processors
The other two NetServers take the roles of I/O Processors (IOPs). Within the AA 4000 software the IOPs are numbered IOP1 and IOP2. An IOP performs all I/O operations on behalf of the CE. It contains the hard disk drives necessary for storing its own copy of Windows NT, the CE’s copy of Windows NT, the applications installed on the CEs (applications for the array), and all of the needed data. It also has all of the network cards necessary for client access. In a PCI slot in the IOP is a MIC. Through the SSDLs, the IOP’s MIC can communicate with either one of the CEs. Typically it only communicates with one of the CEs.
It is necessary for the IOPs to boot first before the CEs can boot up. At least one IOP has to be ready with disks available in order for a CE to have a disk from which it can boot. When the IOP is operational, the NT administrative tools only see one logical disk, the one with its own copy of Windows NT. The rest of the disks have been “redirected” to the ownership of the CEs (redirection will be covered later in this chapter).
In effect, whenever the CE has an I/O operation to perform, the AA 4000 software intercepts the I/O request Windows NT has made and passes it to the CE’s MIC, which passes it to the MIC in the IOP; the IOP then executes the I/O. This happens in the IOP as if the I/O operation originated in the IOP’s copy of Windows NT, when in fact it was “inserted” by the AA 4000 software and MICs. When the I/O operation is complete, the “results” are sent back through the MICs to the processor and cache of the CEs.
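The request path described above (CE, MIC, IOP, and back) can be pictured with a toy model. This Python sketch is purely conceptual: the class names, the tuple-shaped request, and the in-memory dictionary standing in for disks are all assumptions made for illustration and do not reflect the actual AA 4000 software or MIC protocol.

```python
class IOP:
    """Toy I/O Processor: owns the disks and executes I/O on behalf of the CE."""

    def __init__(self, name):
        self.name = name
        self.blocks = {}  # block number -> data; stands in for the redirected disks

    def execute(self, request):
        op, block, data = request
        if op == "write":
            self.blocks[block] = data
            return ("ok", None)
        if op == "read":
            return ("ok", self.blocks.get(block))
        return ("error", f"unsupported operation: {op}")


class MicPath:
    """Toy stand-in for the MIC-to-MIC path through the SSDLs."""

    def __init__(self, iop):
        self.iop = iop

    def forward(self, request):
        # The request crosses to the IOP, which executes it as if it were local.
        return self.iop.execute(request)


class ComputeElement:
    """Toy CE: performs no I/O itself; every request is intercepted and forwarded."""

    def __init__(self, path):
        self.path = path

    def io(self, op, block, data=None):
        status, result = self.path.forward((op, block, data))
        if status != "ok":
            raise IOError(result)
        return result  # results travel back over the same path to the CE


iop1 = IOP("IOP1")
ce = ComputeElement(MicPath(iop1))
ce.io("write", 7, b"payroll")
print(ce.io("read", 7))  # the data is served from the IOP's storage
```

The point of the model is that the CE object holds no storage of its own: all state lives in the IOP, which is why the CE needs no SCSI or network hardware.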
Tuple
[Figure: the array divided into Tuple 1 and Tuple 2]
The term tuple simply refers to the pair of one CE and one IOP connected through one SSDL. Tuples are important during installation and when trying to determine the status of the array. By default, CE1 attempts communication with IOP1 first, and then IOP2 in the event of IOP1 being unavailable. However, even though the CEs try to communicate within their own tuple first, cross-tuple communications will occur when one of the NetServers is unavailable.
As long as the MIC cables are attached to the correct ports on the SSDLs, the tuples are “predefined.” There is only one way to configure the tuples: CE1 and IOP1 will always be tuple 1, and the same is true for CE2, IOP2, and tuple 2.
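The preference for the local tuple, with cross-tuple fallback, can be sketched in a few lines. This is a minimal illustration of the rule stated above, not the AA 4000 selection logic itself; the function `choose_iop` and the idea of passing in a set of available IOP names are assumptions made for the example.

```python
# Tuple membership is fixed by cabling: CE1 with IOP1, CE2 with IOP2.
TUPLE_OF = {"CE1": 1, "IOP1": 1, "CE2": 2, "IOP2": 2}


def choose_iop(ce, available):
    """Return the IOP a CE would use: its own tuple's IOP first,
    then the other tuple's IOP, or None if neither is available."""
    own = f"IOP{TUPLE_OF[ce]}"
    other = "IOP2" if own == "IOP1" else "IOP1"
    for candidate in (own, other):
        if candidate in available:
            return candidate
    return None


print(choose_iop("CE1", {"IOP1", "IOP2"}))  # IOP1
print(choose_iop("CE1", {"IOP2"}))          # IOP2 (cross-tuple)
print(choose_iop("CE2", {"IOP1", "IOP2"}))  # IOP2
```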
All of the MICs are the same. What distinguishes CE from IOP, and CE1 from CE2, is the position of the cable on the SSDL and the SSDL itself. Though the SSDLs are identical in appearance, there is a slight difference in the inside of the SSDL that distinguishes between SSDL1 and SSDL2. The following is a rear view of the SSDL where the MIC cables are plugged.
Note that there are two 100-pin serial ports for the MIC cables, but they are specifically labeled for CE or IOP. The Data Link A port is for the Tuple Link cable (similar to the MIC cable) that goes to the other SSDL.
Windows NT and Application Licensing
The HPAA based on the AA 4000 software requires four Windows NT licenses. The CEs must have two licenses of Windows NT Server or Windows NT Enterprise Edition, and the two IOPs must have two licenses of any of the Windows NT products (technically, the IOPs will work with Windows NT Workstation). Given these parameters, it is recommended to have four Windows NT Server licenses. Windows NT Enterprise Edition is rarely needed since the HPAA only supports a single CPU and the MS clustering service is not used. Windows NT Workstation is not recommended for the IOPs since future upgrades will not support NT Workstation.
Windows NT Installations
There are three actual installations of Windows NT: one for each IOP and one shared by both CEs. Because there are four servers, each with its own memory and therefore its own copy of the NT Kernel, Microsoft requires four total NT licenses. Localized language versions of Windows NT can be installed; however, the AA 4000 software itself is only available in English and Japanese versions. When installing the AA 4000 software, Windows NT must have Service Pack 3 or greater.
Application Licensing
Though the array is composed of four NetServers, each running its own Windows NT kernel, the array is presented to the network as only one server. Clients have only one attachment point at any given moment. This, combined with the fact that the CEs are running only one copy of Windows NT (in lockstep on two NetServers), means that only one application server license is needed per array. For example, if the HPAA is going to be an MS Exchange Server for the network, only one Exchange application server license is needed. Client licenses are unaffected. The client license requirements are the same as if there were only one physical server running the application.
Division of Labor
The Compute Elements
The compute elements and the I/O processors have very distinct roles and therefore have different performance characteristics. If the array were going to do nothing more than run Windows NT without any applications, the memory requirements would be minimal. The array can be functional with 64 MB of memory for each node. This shows that the AA 4000 software itself is not memory intensive and does not require a significant amount of server resources.
The compute element functions with just the CPU and the memory. The server is sized for the maximum CPU and memory that the application requires, within the memory limits of the physical server and the limits of the HP AA 4000 software, which currently supports one CPU and a maximum of 2 GB of system memory. The system will generate I/O requests, but they are immediately intercepted by the HP AA 4000 software and transferred to the IOP via the MICs, cables, and SSDLs. This creates a very low overhead for I/O operations on the CE.
The SSDLs
These two components are nothing more than I/O routers. By requiring that the nodes be attached to specific ports, what little software function there is in the SSDLs is easily maintained and does not have to be concerned with any kind of routing scheme or mesh. The SSDLs are for making sure the data is transferred from each CE to both IOPs by simply providing a path. The HP AA 4000 software is responsible for the integrity of the data between the nodes and largely does this through checksums (EDCs) at the end of each packet transfer.
The I/O Processors
Here a different kind of server activity takes place. The I/O Processors do not run any applications other than simple management software added by the customer. They are complex I/O controllers that, for lack of an easier solution, happen to be running Windows NT. They are not in lock-step with each other, but instead maintain disk synchronization through an HP AA 4000 software disk partition on each logical disk the CE uses. These are not CPU or memory intensive operations. As it turns out, the disk activity itself, as in most optimized server environments, can be the bottleneck if there is one.
CAUTION
It is possible to run applications on the IOPs. However, this impacts the reliability of the overall system. HP strongly recommends that other applications not be run on the IOPs but will continue to support the array.
Client Network Access
The HPAA System provides client network access to one logical server. As a single logical server, the system can provide services and applications to clients just like any other NT Server. However, the implementation of the network hardware and software is different than a single server environment.
The I/O Processors have all of the needed network interface cards installed for the solution. Two of the network cards provide a private link between the I/O Processors for disk synchronization and other activities. Each I/O Processor also has one additional network card for each subnet to which it provides services and applications. Network cards must be ordered in pairs so that each IOP can continue to provide access to each subnet. The two I/O Processors each attach to the same subnet for client access using a softset MAC address from one of the cards. The network cards pass all network traffic to the Compute Elements. The CEs then decide what action, if any, needs to be taken as a result of the network packet. When network traffic is outbound from the array, only one network card of the pair actually places data on the wire so as to avoid Ethernet collisions.
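The card counts described above reduce to simple arithmetic. The following Python sketch is illustrative only; the function name is hypothetical, and it assumes that the private link uses exactly one card in each IOP, which is one reading of "two of the network cards provide a private link."

```python
def nic_count(subnets):
    """Return (cards per IOP, cards for the whole array).

    Assumption: one private-link card per IOP, plus one card per
    client subnet in each IOP (cards are ordered in pairs).
    """
    if not isinstance(subnets, int) or subnets < 1:
        raise ValueError("subnets must be a positive integer")
    per_iop = 1 + subnets        # private link + one card per subnet
    return per_iop, 2 * per_iop  # two IOPs in the array


# Example: an array serving two client subnets
per_iop, total = nic_count(2)
print(f"{per_iop} cards per IOP, {total} cards in total")
```

Under these assumptions, an array serving two subnets would need three cards in each IOP.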
SCSI Identifiers
During the installation and maintenance of the HPAA system, there are several different pieces of configuration data that must be collected, documented, and referenced. One of the more important pieces is the SCSI identifier for the logical drives that the AA 4000 software will “redirect” to the ownership of the CE.
In the AA 4000 Installation Guide bundled with the system (also an Adobe Acrobat file on the AA 4000 CD), there are several blank charts for recording SCSI device information. This chart should be filled out, or a similar one made. The chart helps the administrator keep track of the SCSI configurations for each SCSI device. An example of the chart is below.
If the SCSI information for the SCSI devices in the array is not known, check the Windows NT Registry. The Registry contains an entry for each SCSI device. Before looking for the SCSI information, be sure to know which adapter is being used and the driver name associated with that adapter.
To check the Registry:
1. Open the Windows NT Registry.
2. Choose HKEY_LOCAL_MACHINE\HARDWARE\DEVICEMAP\Scsi
3. Choose the SCSI port matching the adapter.
(Make sure the Driver parameters on the right side of the Registry window match the adapter being checked.)
4. Choose \Scsi BUS x\Target ID x\Logical Unit Number x
5. On the right side of the Registry window, make sure that the Identifier and Type parameters describe the SCSI devices.
6. Map the following Registry information to the appropriate field on the SCSI Configuration Chart.
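As a rough aid for steps 2 through 4, the Python sketch below builds the Registry path for one SCSI device from its port, bus, target ID, and logical unit number. It only assembles a string and does not read the Registry. The function name is hypothetical, and the subkey spellings ("Scsi Port", "Scsi Bus", "Target Id", "Logical Unit Id") are an assumption based on how these keys commonly appear on Windows NT; confirm them in the Registry editor before relying on them.

```python
# Root key named in step 2 of the procedure.
DEVICEMAP_SCSI = r"HKEY_LOCAL_MACHINE\HARDWARE\DEVICEMAP\Scsi"


def scsi_device_key(port, bus, target_id, lun):
    """Build the Registry path for one SCSI device (string only)."""
    fields = (("port", port), ("bus", bus),
              ("target_id", target_id), ("lun", lun))
    for name, value in fields:
        if not isinstance(value, int) or value < 0:
            raise ValueError(f"{name} must be a non-negative integer")
    parts = [
        DEVICEMAP_SCSI,
        f"Scsi Port {port}",        # assumed subkey spelling
        f"Scsi Bus {bus}",          # assumed subkey spelling
        f"Target Id {target_id}",   # assumed subkey spelling
        f"Logical Unit Id {lun}",   # assumed subkey spelling
    ]
    return "\\".join(parts)


# Example: the device at port 0, bus 0, target ID 1, LUN 0
print(scsi_device_key(0, 0, 1, 0))
```

Recording the four numbers alongside the Identifier and Type values gives one row of the SCSI Configuration Chart.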
Here is an example of the SCSI information needed from the Windows NT Registry:
When filling out the SCSI configuration chart included in the Installation Guide, the following notes are some reminders about the configurations of SCSI devices:
SCSI Bus Numbers – Be sure to have all of the drivers installed for the SCSI adapters to be used. Then check the NT Registry for the bus numbers.
SCSI IDs – Verify what SCSI ID is being used by the adapter. This may be seen in the setup utility of the SCSI adapter as the “Initiator ID”, or in the NT Registry as the “Target ID.” Typically the adapter ID is ‘7,’ but it may be changed or different. The same SCSI ID for the adapter must be used on both IOPs.
Boot Disks – For the IOP, this should be the disk that is SCSI ID 0 on each IOP. For the CE, this should be the first disk redirected in the AA 4000 software; typically this is SCSI ID 1.
SCSI Port Number Changes
Windows NT assigns port numbers to SCSI adapters based on the load order of the adapters’ device drivers at boot time. This is important because the addition of another SCSI adapter may change the NT port assignments. For example, the addition of an Adaptec controller may result in its port assignment being 0, and the SCSI adapters that previously existed would have their port numbers increased by 1. The result of this scenario is that the AA 4000 software will not be able to identify the devices correctly based upon its configuration file.
To prevent this problem from occurring, you must change the default load order used by Windows NT. Changing the Windows NT default load order for SCSI adapter drivers requires modification to the Registry. Each adapter driver has a Registry key located at:
\HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\%adapter_driver_name%
(Where %adapter_driver_name% is the name of the SCSI adapter driver’s Registry key. For example, an Adaptec 2940 driver has a Registry key name of aic78xx.)
Each SCSI driver’s Registry key has an associated value named Tag. The Tag value contains a number that is used to control the load order of a particular SCSI driver. Drivers with smaller Tag values load before drivers with larger Tag values. For example, the Adaptec 1542’s Tag value can be changed to a value greater than the Adaptec 2940’s Tag value, which will cause the Adaptec 1542 adapter driver to load last, preventing the port number discrepancy discussed above.
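The effect of the Tag value on port numbering can be illustrated with a small model. The Python sketch below is a simplification of the behavior described above, not a description of how Windows NT is implemented: it assumes each driver controls one adapter unless told otherwise, and that ports are numbered from 0 in driver load order. The Tag numbers and the name driver_1542 are hypothetical placeholders; aic78xx is the Adaptec 2940 key name mentioned above.

```python
def assign_ports(driver_tags, adapters_per_driver=None):
    """Model SCSI port numbering from driver Tag values.

    driver_tags: dict of driver key name -> Tag value.
    adapters_per_driver: optional dict of driver key name -> adapter
    count (defaults to 1 adapter per driver).
    Returns a dict of driver key name -> list of port numbers.
    """
    counts = adapters_per_driver or {}
    # Smaller Tag values load first.
    load_order = sorted(driver_tags, key=lambda name: driver_tags[name])
    ports = {}
    next_port = 0
    for name in load_order:
        count = counts.get(name, 1)
        ports[name] = list(range(next_port, next_port + count))
        next_port += count
    return ports


# A newly added 1542 driver with a smaller Tag takes port 0 ...
before = assign_ports({"aic78xx": 20, "driver_1542": 10})
# ... raising its Tag above the 2940 driver's Tag restores the old numbering.
after = assign_ports({"aic78xx": 20, "driver_1542": 30})
print(before)
print(after)
```

In this model, all adapters that share one driver receive consecutive port numbers, so the Tag value cannot reorder adapters controlled by the same driver.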
The Tag value controls the load order of different drivers. However, if one driver operates more than one SCSI adapter, the Tag value is useless for altering the port number assignment. In this case, it may be best to change an adapter’s position in the PCI bus with regard to another adapter. For example, if you have a SCSI adapter in PCI slot 1, and later you add another SCSI adapter to PCI slot 0, and these adapters are controlled by the same SCSI driver, this may cause the SCSI adapter in slot 1 to have its SCSI port number assignment changed by Windows NT from 0 to 1. To prevent this port number discrepancy from occurring, try swapping the SCSI adapter in slot 1 with the one in slot 0.
If changing the PCI bus position of an adapter has no effect, consider moving the devices from the original adapter to the newly installed adapter. You may be able to accomplish this by simply unplugging the SCSI bus cables from each adapter and swapping them. However, if it becomes too difficult to move devices from one adapter to another, you may want to consider reworking your AA 4000 device configuration database.
Device Redirection
With the disk and network resources existing on the IOPs, but “owned” and accessed by the CEs, the AA 4000 software has to have a way to make this happen. The method is called device redirection. First, let’s start with a list of devices that can be redirected:
SCSI Disks
SCSI Tape Drives
CD-ROMs
Ethernet Adapters
Floppy Disk Drives
Keyboard and Mouse
Serial Ports
These devices exist on the IOP, but when the AA 4000 software loads during the Windows NT boot process, a configuration file is checked and all devices that have been configured for redirection no longer are accessible by the IOP. When the CE boots, it will have access to the redirected devices.
HP AA 4000 Configuration Utility
The list of redirected devices can be found by using the HP AA 4000 Configuration Utility. As seen by the screenshot below, the devices are redirected by category.
The keyboard and the mouse connected to the IOPs are automatically redirected to the CEs when they boot. Control of the keyboard and mouse can be switched back to the IOP or to the CE by pressing <CTRL> <SHIFT> <F12>.
The HP AA 4000 Configuration Utility is accessible through the Start, Program, and HP AA 4000 menus on each of the IOPs. It is automatically installed as part of the AA 4000 software install. Whenever a configuration file is changed on one IOP, it must be committed and the same configuration must exist on the other IOP. The easiest way to ensure that both configurations are the same is to save the changed configuration from the first IOP to a floppy diskette, load that file from the diskette on the other IOP, and commit the changes there. Configuration changes do not take effect until the next reboot of the IOPs.
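One way to double-check that both IOPs hold the same configuration is to compare copies of the saved file. The Python sketch below is only an illustration: this guide does not describe the configuration file's name, location, or format, so the paths in the usage comment are hypothetical, and treating "the same configuration" as "byte-identical files" is an assumption that may be stricter than the utility requires.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 digest of a file's contents as hex text."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def configs_match(iop1_copy, iop2_copy):
    """True if the two saved configuration files are byte-identical."""
    return file_digest(iop1_copy) == file_digest(iop2_copy)


# Usage (hypothetical paths to copies saved from each IOP):
#   configs_match("A:/iop1_config.sav", "C:/temp/iop2_config.sav")
```

A mismatch would suggest that a change was committed on one IOP but not loaded and committed on the other.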
Mirrored Devices
Within the list of devices that can be redirected to the CE, a subset is mirrored devices. A mirrored device exists on both IOPs and is the same resource at all times on both IOPs. Therefore, if one device should fail, the other one is still available and in the necessary state to function as if there were no failure at all. The best example of this is a disk resource. A mirrored disk is a disk that exists on each IOP on the same SCSI bus, with the same SCSI IDs, using the same physical hard drives, and the same logical disk size. Data is written to each disk on each IOP asynchronously by the CEs. If one disk should fail, then the CEs would simply continue to use the remaining disk.
Single Ended Devices
Devices that cannot be mirrored are commonly referred to as “Single-Ended Devices” in AA 4000 documentation. Do not confuse this with “Single-Ended SCSI”; they are not the same at all. An example of a single-ended device is a CD-ROM. The most important characteristic of a single-ended device is if it fails, the device must be repaired or replaced. There is not a mirrored device ready to take over. For example, if the CD-ROM fails on an IOP, then there will be no access to the CD-ROM until that CD-ROM is repaired, replaced, or the system is rebooted and the CD-ROM in the other IOP is redirected. The CD-ROM is not mirrored.
Putting it all together
Before the conclusion of the chapter, the following is a review of all of the AA 4000 components and how they work together. Using the diagram below, the best way to understand how the system works is to trace a typical client transaction through the HPAA system.
1. The client requests (for example) to perform a database query.
2. The NIC on both IOPs has picked up the network traffic and will immediately pass the packet to the MIC and later to the CE for the parsing of the network packet. At this point the IOPs are not looking at any information in the packet; they are simply acting as a pass-through for all packets regardless of their intended destination.
3. The NIC passes the packet to the MIC.
4. The MIC on each IOP passes the packet to its own SSDL for its tuple.
5. The SSDL passes the packet “as-is” to the MIC on the CE. At this point the packet is going to both CEs from both SSDLs at the same instant.
6. The CE looks at the packet and begins to do the parsing to determine if the packet is to be “dropped” or passed on to the rest of the OSI layers.
At this point the CE then performs the query to the database. It will be accessing the disks on each IOP via the MICs. Each tuple will perform this. At the CE level, all I/O going in and out of the MICs is synchronous. At the IOP level, due to the different spin rates of the disk drives, the I/O is asynchronous.
The query results are eventually gathered by the CE (remember, the CE is running the SQL server application, not the IOP) and passed by both CEs to the SSDLs. The SSDLs both pass the network packet to the IOPs for transmittal. And at the last instant, only one IOP places the frame on the wire. The second IOP holds off so as to avoid Ethernet collisions.
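The transaction traced above can be sketched as a small model. The Python code below is illustrative only and makes two assumptions that the text does not settle: each packet is shown travelling along its own tuple's path (IOP, SSDL, CE), and the IOP that finally transmits is passed in as a parameter, because this chapter does not describe how that IOP is chosen. All function names are hypothetical.

```python
def inbound(packet):
    """Both IOPs pick up the packet and pass it, unchanged, to the CEs."""
    deliveries = {}
    for tuple_id in (1, 2):
        path = [
            f"IOP{tuple_id} NIC",
            f"IOP{tuple_id} MIC",
            f"SSDL{tuple_id}",
            f"CE{tuple_id} MIC",
        ]
        deliveries[f"CE{tuple_id}"] = {"packet": packet, "path": path}
    return deliveries


def outbound(reply, transmitting_iop="IOP1"):
    """Both IOPs receive the reply, but only one places it on the wire."""
    frames_on_wire = []
    for iop in ("IOP1", "IOP2"):
        if iop == transmitting_iop:   # the other IOP holds off
            frames_on_wire.append((iop, reply))
    return frames_on_wire


received = inbound("database query")
sent = outbound("query results")
print(sorted(received))   # both CEs see the same packet
print(sent)               # a single frame reaches the network
```

The point of the model is the asymmetry: inbound traffic is duplicated to both CEs, while outbound traffic is reduced to a single frame to avoid Ethernet collisions.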
NetServer Rackmount Configurations
So far the diagrams used to describe the HPAA system have shown the NetServers out of the rack to help illustrate the components and their functions. The following is a look at the different configurations available when ordering the HPAA system.
NOTE
These configurations are specifically for the AA 4000 and are as of 3/2000.
NetServer LPr as the CE and LH 3r or LH 4r as the IOP
NetServer LH 4r as the CE and the IOP (two racks needed)
NetServer LPr as the CE and the IOP
Rules for maintaining availability
There really is only one rule when working with the HPAA system: Always maintain the highest level of availability.
How is this done? Here are a few simple reminders to adhere to when working with or administering the HPAA system:
Never shut down the “server” with clients attached.
Anytime one IOP goes offline for any reason while the CEs are in operation, a disk re-mirror operation must take place after the IOP is brought back online.
To avoid unnecessary disk re-mirror operations, shut off the CEs during planned maintenance activities.
One component can fail and the HPAA system still continues on as if no failure took place. However, a second failure of the same component generally means the HPAA system will be unavailable.
When changing, adding, or deleting an IOP’s AA 4000 configuration file, the same operation must be done on the other IOP. The changes do not take effect until the next server reboot.
The CEs are running the NT operating system that is used by the applications and client network. All changes that affect these resources need to be made on the CE, not the IOP.
Changes to the NT operating system on the CE may not take place until NT is rebooted (just like in a single server environment). Take the appropriate precautions before rebooting the CE’s NT operating system.
Chapter Two ~ HPAA System Boot Up
This chapter covers the startup process for the HPAA system. Before going through the details of powering on the system and beginning to use it, the proper hardware connections should be verified. In the event there is a problem with the basic connections, how to use the MTCTEST utility to troubleshoot will be covered first. The remainder of the chapter discusses the process from power on to HPAA system online, before the administrator can start using the system. Topics to be covered include:
Verifying the MIC connections
Powering up the system
The different boot options
The HPAA boot process
Proper shutdown of the system, and
Using the correct keyboard and monitor view
Verifying the MIC connections
Before booting the system, a quick visual verification of the connections between the MICs and the SSDLs should be performed. A more thorough verification can be performed using the MTCTEST utility. It is important to make sure the MIC connections are good before powering up the system, and to re-check the connections when a failure of any type occurs in the system. A quick visual inspection can save hours of needless troubleshooting if in fact there is a problem with a MIC cable or even the MIC itself.
Checking the SSDL LEDs
On the front of each SSDL are a series of LEDs that help verify that the MIC cards are in good working order and the MIC cable connections are correct. However, this is not the final verification that all is in good working order. There are rare instances when all LEDs indicate a working system, but there may still be a minor problem with a MIC cable. The occurrence is rare enough that the LEDs can be “trusted” and the cable connections can be ruled out as the cause of a problem when troubleshooting. But in the rare instance that all other troubleshooting is not pinpointing the problem, the cable connections can be re-verified through the use of the MTCTEST utility.
The above is a graphic of both SSDLs. In the middle of the SSDLs are the LED indicators. Below is a close-up of how the LEDs are organized.
[Figure: close-up of the SSDL LEDs, arranged in columns 1, 2, and 3 and rows Compute, I/O, and Link.]
There are three columns of LEDs; the third column was reserved for a possible future development and is not used. Column 1 represents tuple 1 and column 2 represents tuple 2. The Compute row shows the status of the MIC connections from the CEs to the SSDLs (1 and 2). The I/O row indicates the status of the MIC connections from the IOPs to the SSDLs (1 and 2). The last row, labeled “Link,” is for the connection between the SSDLs.
When the SSDLs are not powered (not plugged in), the LEDs are completely off. On the far right of the front of the SSDL are power indicators, one for each power cord that can be used (the SSDL has two power inlets for redundancy; only one is required). When the SSDLs are powered, but the NetServers are powered down (standby power), the LEDs will be red for the CE and IOP rows and green for the “Link” row. So a “red” LED is not necessarily an indication of a failure; it could simply mean the server is not powered on. As each NetServer is powered on, the MIC communicates with the SSDL, and the LED representing the CE or IOP and the particular tuple changes from “red” to “green.” At any time the HPAA system is fully operational, all LEDs on the SSDL should be “green.”
If an LED is “red” check to see which connection it represents by identifying the server role and in which tuple. Check to see if the NetServer is powered up. If not, then a “red” LED is normal. If the NetServer is powered up and the LED is still “red,” then there may be a problem with the MIC cable connection for that node or a problem with the MIC itself. In this case, a further inspection and/or utility tests are needed to isolate the problem.
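The LED rules above can be summarized as a small decision table. The Python sketch below is an illustration of that logic only; the function names and message wording are hypothetical, and the Link row is left out because this section does not describe what a red Link LED indicates.

```python
# Row label on the SSDL front panel -> node role
ROWS = {"compute": "CE", "io": "IOP"}


def led_node(row, column):
    """Map an LED position to its node, e.g. ('compute', 2) -> 'CE2'."""
    if row not in ROWS:
        raise ValueError("row must be 'compute' or 'io'")
    if column not in (1, 2):
        raise ValueError("only columns 1 and 2 are used")
    return f"{ROWS[row]}{column}"


def interpret_led(row, column, color, server_powered):
    """Suggest what a Compute or I/O row LED color means."""
    node = led_node(row, column)
    if color == "off":
        return "SSDL is not powered"
    if color == "green":
        return f"{node} connection is working"
    if color == "red" and not server_powered:
        return f"{node} is powered down; a red LED is normal"
    if color == "red":
        return f"check the MIC cable and the MIC for {node}"
    raise ValueError("color must be 'off', 'red', or 'green'")


# Example: column 2, Compute row, red, with the NetServer powered up
print(interpret_led("compute", 2, "red", True))
```

A red LED for a powered-up NetServer is the only case that calls for further inspection or utility tests.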
Tuple IDs
Also on the front of the SSDLs are tuple ID LEDs. And next to the LED is a locking mechanism. The LEDs for the tuples are actually buttons that can be pushed in implying that the tuple ID for the SSDL can be changed. This is partially true.
WARNING
If a tuple ID button is pushed in that does not match the SSDL number the unit was originally configured for, not only will the ID be changed, but the SSDL will also cease to function.
The SSDLs are preset to belong to a particular tuple. Looking at the back of the SSDL and seeing how it is labeled proves this. It is recommended to verify the correct tuple ID LED is pressed and remove the key and set aside somewhere safe and forget about using it. Once the tuple ID is correctly set, the key is not ever needed.
Troubleshooting a “RED” LED
When an LED is Red on the SSDL, the problem is the MIC cable, the port it is plugged into, or the MIC itself. Troubleshoot this situation as follows:
Disconnect the cable at the point where the LED indicates there is a problem. For example, if the LED that intersects column 2 and the “compute” row is red, then remove the cable from the MIC card on CE2.
Check the pins on the cable and the port on the MIC. Re-insert the cable by attaching the cable straight onto the MIC and turning both screws by hand evenly until secure. Then use a flathead screwdriver to give the screws one more quarter turn to tighten. DO NOT overtighten. Check the LED again. If it is still red, move to the next step.
After the cable connection at the MIC has been eliminated as the problem, the more difficult connection to access should be checked: the connection at the SSDL. Using the same example, remove the MIC cable on SSDL2 for the port labeled CE2.
Check the pins on the cable and the port on the SSDL. Re-insert the cable by attaching the cable straight onto the SSDL and turning both screws by hand evenly until secure. Then use a flathead screwdriver to give the screws one more quarter turn to tighten. DO NOT overtighten. Check the LED again. If it is still red, move to the next step.
Since both cable connections have been checked and eliminated as the cause of the problem, attention is now focused on the MIC itself.
NOTE
It is possible that the SSDL has failed, but this is unlikely and not checked unless there are multiple LED problems or other symptoms.
To test the validity of the MIC and its ability to perform communications, the MTCTEST utility should be used. However, before performing the test, open the cover of the NetServer and verify that the MIC has been properly seated into the PCI slot.
The MTCTEST utility
MTCTEST is a utility for testing MIC communications. The utility can be found on the AA 4000 software CD under the /MTCUTILS directory. The test cannot be used on a server that is booted into Windows NT. MTCTEST is a DOS-based application that runs from a bootable floppy. To use MTCTEST, copy the contents of the /MTCUTILS directory to a DOS-bootable diskette and power up the NetServer with the floppy. To fully test MIC communications, two