This guide describes how to perform routine system hardware operations for HP
Integrity NonStop™ NS-series servers. These tasks include monitoring the system,
performing common operations tasks, and performing routine hardware maintenance.
This guide is written for system operators.
Product Version
N.A.
Supported Release Version Updates (RVUs)
This guide supports H06.08 and all subsequent H-series RVUs until otherwise
indicated by its replacement publication.
Manual Information
New and Changed Information
About This Guide
Who Should Use This Guide
What Is in This Guide
Where to Get More Information
Notation Conventions
1. Introduction to Integrity NonStop NS-Series Operations
When to Use This Section
Understanding the Operational Environment
What Are the Operator Tasks?
Monitoring the System and Performing Recovery Operations
Preparing for and Recovering from Power Failures
Stopping and Powering Off the System
Powering On and Starting the System
Creating Startup and Shutdown Files
Performing Preventive Maintenance
Operating Disk Drives and Tape Drives
Responding to Spooler Problems
Updating Firmware
Determining the Cause of a Problem: A Systematic Approach
A Problem-Solving Worksheet
Task 1: Get the Facts
Task 2: Find and Eliminate the Cause of the Problem
Task 3: Escalate the Problem If Necessary
Task 4: Prevent Future Problems
Hewlett-Packard Company—529869-005
Contents
Logging On to an Integrity NonStop Server
System Consoles
Opening a TACL Window
Overview of OSM Applications
Launching OSM Applications
Service Procedures
Support and Service Library
2. Determining Your System Configuration
When to Use This Section
Modular Hardware Components
Differences Between Integrity NonStop NS-Series Systems
Terms Used to Describe System Hardware Components
Recording Your System Configuration
Using SCF to Determine Your System Configuration
SCF System Naming Conventions
SCF Configuration Files
Using SCF to Display Subsystem Configuration Information
Displaying SCF Configuration Information for Subsystems
3. Overview of Monitoring and Recovery
When to Use This Section
Functions of Monitoring
Monitoring Tasks
Working With a Daily Checklist
Tools for Checking the Status of System Hardware
Additional Monitoring Tasks
Monitoring and Resolving Problems—An Approach
Using OSM to Monitor the System
Using the OSM Service Connection
Recovery Operations for Problems Detected by OSM
Monitoring Problem Incident Reports
Using SCF to Monitor the System
Determining Device States
Automating Routine System Monitoring
Using the Status LEDs to Monitor the System
Related Reading
HP Integrity NonStop NS-Series Operations Guide—529869-005
4. Monitoring EMS Event Messages
When to Use This Section
What Is the Event Management Service (EMS)?
Tools for Monitoring EMS Event Messages
OSM Event Viewer
EMSDIST
ViewPoint
Web ViewPoint
Related Reading
5. Processes: Monitoring and Recovery
When to Use This Section
Types of Processes
System Processes
I/O Processes (IOPs)
Generic Processes
Monitoring Processes
Monitoring System Processes
Monitoring IOPs
Monitoring Generic Processes
Recovery Operations for Processes
Related Reading
6. Communications Subsystems: Monitoring and Recovery
When to Use This Section
Communications Subsystems
Local Area Networks (LANs) and Wide Area Networks (WANs)
Monitoring Communications Subsystems and Their Objects
Monitoring the SLSA Subsystem
Monitoring the WAN Subsystem
Monitoring the NonStop TCP/IP Subsystem
Monitoring Line-Handler Process Status
Tracing a Communications Line
Recovery Operations for Communications Subsystems
Related Reading
7. ServerNet Resources: Monitoring and Recovery
When to Use This Section
ServerNet Communications Network
System I/O ServerNet Connections
Monitoring the Status of the ServerNet Fabrics
Monitoring the ServerNet Fabrics Using OSM
Monitoring the ServerNet Fabrics Using SCF
Related Reading
8. I/O Adapters and Modules: Monitoring and Recovery
When to Use This Section
I/O Adapters and Modules
Fibre Channel ServerNet Adapter (FCSA)
Gigabit Ethernet 4-Port Adapter (G4SA)
4-Port ServerNet Extender (4PSE)
Monitoring I/O Adapters and Modules
Monitoring the FCSAs
Monitoring the G4SAs
Monitoring the 4PSEs
Recovery Operations for I/O Adapters and Modules
Related Reading
9. Processors and Components: Monitoring and Recovery
When to Use This Section
Overview of the NonStop Blade Complex
Monitoring and Maintaining Processors
Monitoring Processors Automatically Using TFDS
Monitoring Processor Status Using the OSM Low-Level Link
Monitoring Processor Status Using the OSM Service Connection
Monitoring Processor Performance Using ViewSys
Identifying Processor Problems
Processor or System Hangs
Processor Halts
OSM Alarms and Attribute Values
Recovery Operations for Processors
Recovery Operations for a Processor Halt
Halting One or More Processors
Reloading a Single Processor on a Running Server
Recovery Operations for a System Hang
Enabling/Disabling Processor and System Freeze
Freezing the System and Freeze-Enabled Processors
Dumping a Processor to Disk
Backing Up a Processor Dump to Tape
Replacing Processor Memory
Replacing the Processor Board and Processor Entity
Submitting Information to Your Service Provider
Related Reading
10. Disk Drives: Monitoring and Recovery
When to Use This Section
Overview of Disk Drives
Internal SCSI Disk Drives
M8xxx Fibre Channel Disk Drives
Enterprise Storage System (ESS) Disks
Monitoring Disk Drives
Monitoring Disk Drives With OSM
Monitoring Disk Drives With SCF
Monitoring the State of Disk Drives
Monitoring the Use of Space on a Disk Volume
Monitoring the Size of Database Files
Monitoring Disk Configuration and Performance
Identifying Disk Drive Problems
Internal SCSI Disk Drives
M8xxx Fibre Channel Disk Drives
Recovery Operations for Disk Drives
Recovery Operations for a Down Disk or Down Disk Path
Recovery Operations for a Nearly Full Database File
Related Reading
11. Tape Drives: Monitoring and Recovery
When to Use This Section
Overview of Tape Drives
Monitoring Tape Drives
Monitoring Tape Drive Status With OSM
Monitoring Tape Drive Status With SCF
Monitoring Tape Drive Status With MEDIACOM
Monitoring the Status of Labeled-Tape Operations
Identifying Tape Drive Problems
Recovery Operations for Tape Drives
Recovery Operations Using the OSM Service Connection
Recovery Operations Using SCF
Related Reading
12. Printers and Terminals: Monitoring and Recovery
When to Use This Section
Overview of Printers and Terminals
Monitoring Printer and Collector Process Status
Monitoring Printer Status
Monitoring Collector Process Status
Recovery Operations for Printers and Terminals
Recovery Operations for a Full Collector Process
Related Reading
13. Applications: Monitoring and Recovery
When to Use This Section
Monitoring TMF
Monitoring the Status of TMF
Monitoring Data Volumes
TMF States
Monitoring the Status of Pathway
PATHMON States
Related Reading
14. Power Failures: Preparation and Recovery
When to Use This Section
System Response to Power Failures
NonStop NS-Series Cabinets (Modular Cabinets)
NonStop S-Series I/O Enclosures
External Devices
ESS Cabinets
Air Conditioning
Preparing for Power Failure
Set Ride-Through Time
Configure OSM Power Fail Support
Monitor Power Supplies
Monitor Batteries
Maintain Batteries
Power Failure Recovery
Procedure to Recover From a Power Failure
Setting System Time
Related Reading
15. Starting and Stopping the System
When to Use This Section
Powering On a System
Powering On the System From a Low Power State
Powering On the System From a No Power State
Starting a System
Loading the System
Starting Other System Components
Performing a System Load
Performing a System Load From a Specific Processor
Reloading Processors
Minimizing the Frequency of Planned Outages
Anticipating and Planning for Change
Stopping Applications, Devices, and Processes
Stopping the System
Alerts
Halting All Processors Using OSM
Powering Off a System
System Power-Off Using OSM
System Power-Off Using SCF
Emergency Power-Off Procedure
Troubleshooting and Recovery Operations
Fans Are Not Turning
System Does Not Appear to Be Powered On
Green LED Is Not Lit After POSTs Finish
Amber LED on a Component Remains Lit After the POST Finishes
Components Fail When Testing the Power
Recovering From a System Load Failure
Getting a Corrupt System Configuration File Analyzed
Recovering From a Reload Failure
Exiting the OSM Low-Level Link
Opening Startup Event Stream and Startup TACL Windows
Related Reading
16. Creating Startup and Shutdown Files
Automating System Startup and Shutdown
Managed Configuration Services (MCS)
Startup
Shutdown
For More Information
Processes That Represent the System Console
$YMIOP.#CLCI
$YMIOP.#CNSL
$ZHOME
$ZHOME Alternative
Example Command Files
CIIN File
Establishing a CIIN File
Modifying a CIIN File
If a CIIN File Is Not Specified or Enabled in OSM
Example CIIN Files
Writing Efficient Startup and Shutdown Command Files
Command File Syntax
Avoid Manual Intervention
Use Parallel Processing
Investigate Product-Specific Techniques
How Process Persistence Affects Configuration and Startup
Tips for Startup Files
Startup File Examples
System Startup File
Spooler Warm-Start File
TMF Warm-Start File
TCP/IP Stack Configuration and Startup File
CP6100 Lines Startup File
ATP6100 Lines Startup File
X.25 Lines Startup File
Printer Line Startup File
Expand-Over-IP Line Startup File
Expand Direct-Connect Line Startup File
Tips for Shutdown Files
Shutdown File Examples
System Shutdown File
CP6100 Lines Shutdown File
ATP6100 Lines Shutdown File
X.25 Lines Shutdown File
Printer Line Shutdown File
Expand-Over-IP Line Shutdown File
Direct-Connect Line Shutdown File
Spooler Shutdown File
TMF Shutdown File
17. Preventive Maintenance
When to Use This Section
Monitoring Physical Facilities
Checking Air Temperature and Humidity
Checking Physical Security
Maintaining Order and Cleanliness
Checking Fire-Protection Systems
Cleaning System Components
Cleaning an Enclosure
Cleaning and Maintaining Printers
Cleaning Tape Drives
Handling and Storing Cartridge Tapes
A. Operational Differences Between Systems Running G-Series and H-Series RVUs
B. Tools and Utilities for Operations
When to Use This Appendix
BACKCOPY
BACKUP
Disk Compression Program (DCOM)
Disk Space Analysis Program (DSAP)
EMSDIST
Event Management Service Analyzer (EMSA)
File Utility Program (FUP)
Measure
MEDIACOM
NonStop NET/MASTER
NSKCOM and the Kernel-Managed Swap Facility (KMSF)
OSM Package
PATHCOM
PEEK
RESTORE
SPOOLCOM
Subsystem Control Facility (SCF)
HP Tandem Advanced Command Language (TACL)
TMFCOM
Web ViewPoint
ViewPoint
ViewSys
C. Related Reading
D. Converting Numbers
When to Use This Appendix
Overview of Numbering Systems
Binary to Decimal
Octal to Decimal
Hexadecimal to Decimal
Decimal to Binary
Decimal to Octal
Decimal to Hexadecimal
Safety and Compliance
Index
Examples
Example 2-1. SCF LISTDEV Command Output
Example 2-2. SCF ADD DISK Command Output
Example 2-3. SCF INFO PROCESS Command Output
Example 2-4. SCF INFO SAC Command Output
Example 2-5. SCF INFO PROCESS $ZZWAN Command Output
Example 2-6. SCF INFO LINE Command Output
Example 3-1. SCF STATUS TAPE Command
Example 3-2. System Monitoring Command File
Example 3-3. System Monitoring Output File
Figures
Figure 3-1. OSM Management: System Icons Indicate Problems Within
Figure 3-2. Expanding the Tree Pane to Locate the Source of Problems
Figure 3-3. Attributes Tab
Figure 3-4. Using System Status Icons to Monitor Multiple Systems
Figure 3-5. Alarm Summary Dialog Box
Figure 3-6. Problem Summary Dialog Box
Figure 7-1. Integrity NonStop NS16000 System
Figure 7-2. Integrity NonStop NS14000 System with IOAM Enclosure
Figure 7-3. I/O Connections to the PICS in a P-Switch
Figure 9-1. Modular NSAA With One NonStop Blade Complex and Four Processors
Figure 9-2. Processor Status Display
Figure 9-3. OSM Representation of Processor Complex
Figure 11-1. OSM: Monitoring Tape Drives Connected to an FCSA
Figure 11-2. OSM: Monitoring Tape Drives Connected to an IOMF2
Figure 15-1. System Load Dialog Box
Figure 15-2. Logical Processor Reload Parameters
Figure 15-3. Opening a Startup TACL Window
Figure 15-4. OutsideView Buttons on the Windows Toolbar
Figure D-1. Binary to Decimal Conversion
Figure D-2. Octal to Decimal Conversion
Figure D-3. Hexadecimal to Decimal Conversion
Tables
Table 1-1. Problem-Solving Worksheet
Table 2-1. Key Subsystems and Their Logical Device Names and Device Types
Table 2-2. Displaying Information for the TCP/IP Subsystem ($ZTCO)
Table 2-3. Displaying Information for the Kernel Subsystem ($ZZKRN)
Table 2-4. Displaying Information for the Storage Subsystem ($ZZST0)
Table 2-5. Displaying Information for the SLSA Subsystem ($ZZLAN)
Table 2-6. Displaying Information for the WAN Subsystem ($ZZWAN)
Table 2-7. Subsystem Objects Controlled by SCF
Table 3-1. Monitoring System Components
Table 3-2. Daily Tasks Checklist
Table 3-3. SCF Object States
Table 3-4. Status LEDs and Their Functions
Table 3-5. Related Reading for Monitoring
Table 4-1. Related Reading for Monitoring EMS Event Messages
Table 6-1. Related Reading for Communications Lines and Devices
Table 8-1. Service, Flash Firmware, Flash Boot Firmware, Device, and Enabled States for the FCSA
Table 8-2. Service, Device, and Enabled States for the G4SA
Table 8-3. Related Reading for I/O Adapters and Modules
Table 9-1. Other Files to Submit to Your Service Provider
Table 9-2. Additional Processor Dump Information for Your Service Provider
Table 9-3. Related Reading for Monitoring and Recovery Operations on Processors
Table 10-1. Primary and Backup Path States for Disk Drives
Table 10-2. Possible Causes of Common Disk Drive Problems
Table 10-3. Common Recovery Operations for Disk Drives
Table 11-1. Common Tape Drive Problems
Table 11-2. Related Reading for Tapes and Tape Drives
Table 13-1. TMF States
Table 15-1. System Load Paths in Order of Use
Table 15-2. Related Reading for Starting and Stopping a System
Table C-1. Related Reading for Tools and Utilities
Table D-1. Descriptions of Number Systems
What’s New in This Manual
Manual Information
HP Integrity NonStop NS-Series Operations Guide
Abstract
This guide describes how to perform routine system hardware operations for HP
Integrity NonStop™ NS-series servers. These tasks include monitoring the system,
performing common operations tasks, and performing routine hardware maintenance.
This guide is written for system operators.
Product Version
N.A.
Supported Release Version Updates (RVUs)
This guide supports H06.08 and all subsequent H-series RVUs until otherwise
indicated by its replacement publication.
New and Changed Information
This manual has been updated to include references to HP Integrity NonStop NS14000
and NS1000 servers containing VIO enclosures (in place of an IOAM enclosure).
About This Guide
This guide describes how to perform routine system hardware operations for HP
Integrity NonStop NS-series servers on H-series release version updates.
This guide is written primarily for commercial NonStop NS-series servers
(see Differences Between Integrity NonStop NS-Series Systems on page 2-2 for high-
level architectural and hardware differences between the various commercial models).
While basic monitoring principles, such as Using OSM to Monitor the System on
page 3-7, apply to Telco as well as commercial systems, refer to the NonStop NS-Series Carrier Grade Server Manual for hardware details and service procedures
specific to Telco systems.
Note. NS-series refers to the hardware that makes up the server. H-series refers to the
software that runs on the server.
The term NonStop server refers to both NonStop S-series servers and Integrity NonStop
NS-series servers.
Use this guide along with the Guardian User’s Guide and the written policies and
procedures of your company regarding:
• General operations
• Security
• System backups
• Starting and stopping applications
Who Should Use This Guide
This guide is written for operators who perform system hardware operations. It
provides an overview of the routine tasks of monitoring the system and guides the
operator through the infrequent tasks of starting and stopping the system and
performing online recovery on the system.
What Is in This Guide
Section 1     Introduction to Integrity NonStop NS-Series Operations
Section 2     Determining Your System Configuration
Section 3     Overview of Monitoring and Recovery
Section 4     Monitoring EMS Event Messages
Section 5     Processes: Monitoring and Recovery
Section 6     Communications Subsystems: Monitoring and Recovery
Section 7     ServerNet Resources: Monitoring and Recovery
Section 8     I/O Adapters and Modules: Monitoring and Recovery
Section 9     Processors and Components: Monitoring and Recovery
Section 10    Disk Drives: Monitoring and Recovery
Section 11    Tape Drives: Monitoring and Recovery
Section 12    Printers and Terminals: Monitoring and Recovery
Section 13    Applications: Monitoring and Recovery
Section 14    Power Failures: Preparation and Recovery
Section 15    Starting and Stopping the System
Section 16    Creating Startup and Shutdown Files
Section 17    Preventive Maintenance
Appendix A    Operational Differences Between Systems Running G-Series and H-Series RVUs
Appendix B    Tools and Utilities for Operations
Appendix C    Related Reading
Appendix D    Converting Numbers
Where to Get More Information
Operations planning and operations management practices appear in these manuals:
• NonStop NSxxxx Planning Guide for your NS16000, NS14000, or NS1000 server
• Availability Guide for Application Design
• Availability Guide for Change Management
• Availability Guide for Problem Management
Note. For manuals not available in the H-series collection, please refer to the G-series
collection on NTL.
For comprehensive information about performing operations tasks for an Integrity
NonStop NS-series server, you need both this guide and the Guardian User’s Guide.
The Guardian User’s Guide describes some tasks not covered in this guide, such as
supporting users of the system.
The Guardian User’s Guide describes routine tasks common to system operations on
all NonStop servers. Instructions and examples show how to support users of the
system, how to monitor operator messages, how to control the spooler, and how to
manage disks and tapes. Numerous tools that support these functions are also
documented. Some monitoring procedures in the Guardian User’s Guide have
information about using only the Subsystem Control Facility (SCF). That guide does
not generally describe any monitoring procedures using the OSM packages.
Information about the use of OSM, such as how to migrate from TSM to OSM, how to
install and configure OSM server and client components, and how to use the OSM
Service Connection, appears in these manuals:
• OSM Migration and Configuration Guide
• NonStop System Console Installer Guide
• OSM Service Connection User’s Guide (available in NTL and as online help within
  the OSM Service Connection)
Servers that are connected in ServerNet clusters require special installation and
operating procedures that are not documented in this manual. Such information is
instead provided with the appropriate cluster documentation and the ServerNet Cluster Supplement for Integrity NonStop NS-Series Servers.
In the 6780 ServerNet cluster environment, installation and operating procedures are
documented in these manuals:
• ServerNet Cluster 6780 Planning and Installation Guide
• ServerNet Cluster 6780 Operations Guide
Installation and operating procedures for earlier server clusters (those using 6770
switches) are documented in:
• ServerNet Cluster Manual
OSM is the required system management tool for servers that use 6780 switches in
ServerNet clusters, but OSM also provides system management for earlier versions of
ServerNet clusters.
For other documentation related to operations tasks, refer to Appendix C, Related
Reading.
Support and Service Library
These NTL Support and Service library categories provide procedures, part numbers,
troubleshooting tips, and tools for servicing NonStop S-series and Integrity NonStop
NS-series systems:
• Hardware Service and Maintenance Publications
• Service Information
• Service Procedures
• Tools and Download Files
• Troubleshooting Tips
Within these categories, where applicable, content might be further categorized
according to server or enclosure type.
Authorized service providers can also order the NTL Support and Service Library CD:
• Channel Partners and Authorized Service Providers: Order the CD from the SDRC
  at https://scout.nonstop.compaq.com/SDRC/ce.htm.
• HP employees: Subscribe at World on a Workbench (WOW). Subscribers
  automatically receive CD updates. Access the WOW order form at
  http://hps.knowledgemanagement.hp.com/wow/order.asp.
Notation Conventions
Hypertext Links
Blue underline is used to indicate a hypertext link within text. By clicking a passage of
text with a blue underline, you are taken to the location described. For example:
This requirement is described under Backup DAM Volumes and Physical Disk
Drives on page 3-2.
General Syntax Notation
The following list summarizes the notation conventions for syntax presentation in this
manual.
UPPERCASE LETTERS. Uppercase letters indicate keywords and reserved words; enter
these items exactly as shown. Items not enclosed in brackets are required. For
example:
MAXATTACH
lowercase italic letters. Lowercase italic letters indicate variable items that you supply.
Items not enclosed in brackets are required. For example:
file-name
computer type. Computer type letters within text indicate C and Open System Services
(OSS) keywords and reserved words; enter these items exactly as shown. Items not
enclosed in brackets are required. For example:
myfile.c
italic computer type. Italic computer type letters within text indicate C and Open
System Services (OSS) variable items that you supply. Items not enclosed in brackets
are required. For example:
pathname
[ ] Brackets. Brackets enclose optional syntax items. For example:
TERM [\system-name.]$terminal-name
INT[ERRUPTS]
A group of items enclosed in brackets is a list from which you can choose one item or
none. The items in the list may be arranged either vertically, with aligned brackets on
each side of the list, or horizontally, enclosed in a pair of brackets and separated by
vertical lines. For example:
FC [ num  ]
   [ -num ]
   [ text ]
K [ X | D ] address
{ } Braces. A group of items enclosed in braces is a list from which you are required to
choose one item. The items in the list may be arranged either vertically, with aligned
braces on each side of the list, or horizontally, enclosed in a pair of braces and
separated by vertical lines. For example:
LISTOPENS PROCESS { $appl-mgr-name }
                  { $process-name  }
ALLOWSU { ON | OFF }
| Vertical Line. A vertical line separates alternatives in a horizontal list that is enclosed in
brackets or braces. For example:
INSPECT { OFF | ON | SAVEABEND }
… Ellipsis. An ellipsis immediately following a pair of brackets or braces indicates that you
can repeat the enclosed sequence of syntax items any number of times. For example:
M address [ , new-value ]…
[ - ] {0|1|2|3|4|5|6|7|8|9}…
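The second example above can be read as a pattern: an optional minus sign followed by a run of digits. The following is a minimal sketch in Python, not part of the product. It assumes that an ellipsis after a braced list means "one or more" (because braces require one choice), and the names SIGNED_INTEGER and matches_signed_integer are hypothetical.

```python
import re

# Assumed reading of the syntax:  [ - ] {0|1|2|3|4|5|6|7|8|9}…
#   [ - ]      optional minus sign
#   {0|...|9}… one required digit, repeated any number of times
SIGNED_INTEGER = re.compile(r"-?[0-9]+")


def matches_signed_integer(text: str) -> bool:
    """Return True if the whole string fits the assumed syntax."""
    return SIGNED_INTEGER.fullmatch(text) is not None


for candidate in ("42", "-7", "", "4-2"):
    print(repr(candidate), matches_signed_integer(candidate))
```

Under this reading, an empty string is rejected because the braces require at least one digit.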
An ellipsis immediately following a single syntax item indicates that you can repeat that
syntax item any number of times. For example:
"s-char…"
Punctuation. Parentheses, commas, semicolons, and other symbols not previously
described must be entered as shown. For example:
error := NEXTFILENAME ( file-name ) ;
LISTOPENS SU $process-name.#su-name
Quotation marks around a symbol such as a bracket or brace indicate the symbol is a
required character that you must enter as shown. For example:
"[" repetition-constant-list "]"
Item Spacing. Spaces shown between items are required unless one of the items is a
punctuation symbol such as a parenthesis or a comma. For example:
CALL STEPMOM ( process-id ) ;
If there is no space between two items, spaces are not permitted. In the following
example, there are no spaces permitted between the period and any other items:
$process-name.#su-name
Line Spacing. If the syntax of a command is too long to fit on a single line, each
continuation line is indented three spaces and is separated from the preceding line by
a blank line. This spacing distinguishes items in a continuation line from items in a
vertical list of selections. For example:
ALTER [ / OUT file-spec / ] LINE

   [ , attribute-spec ]…
Notation for Messages
The following list summarizes the notation conventions for the presentation of
displayed messages in this manual.
Bold Text. Bold text in an example indicates user input entered at the terminal. For
example:
ENTER RUN CODE
?123
CODE RECEIVED: 123.00
The user must press the Return key after typing the input.
Nonitalic text. Nonitalic letters, numbers, and punctuation indicate text that is displayed or
[ ] Brackets. Brackets enclose items that are sometimes, but not always, displayed. For
example:
Event number = number [ Subject = first-subject-value ]
A group of items enclosed in brackets is a list of all possible items that can be
displayed, of which one or none might actually be displayed. The items in the list might
be arranged either vertically, with aligned brackets on each side of the list, or
horizontally, enclosed in a pair of brackets and separated by vertical lines. For
example:
proc-name trapped [ in SQL | in SQL file system ]
{ } Braces. A group of items enclosed in braces is a list of all possible items that can be
displayed, of which one is actually displayed. The items in the list might be arranged
either vertically, with aligned braces on each side of the list, or horizontally, enclosed in
a pair of braces and separated by vertical lines. For example:
obj-type obj-name state changed to state, caused by
{ Object | Operator | Service }

process-name State changed from old-objstate to objstate
{ Operator Request. }
{ Unknown.          }
| Vertical Line. A vertical line separates alternatives in a horizontal list that is enclosed in
brackets or braces. For example:
Transfer status: { OK | Failed }
% Percent Sign. A percent sign precedes a number that is not in decimal notation. The
% notation precedes an octal number. The %B notation precedes a binary number.
The %H notation precedes a hexadecimal number. For example:
%005400
%B101111
%H2F
P=%p-register E=%e-register
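When values in this notation are copied from a display into a script or spreadsheet, they usually need to be converted to decimal first. The following is a minimal sketch in Python, not part of the product. The function name parse_percent_notation is hypothetical, and treating an unprefixed value as decimal is an assumption of this sketch.

```python
def parse_percent_notation(text: str) -> int:
    """Convert a number written in the % notation to an integer.

    %B...  binary
    %H...  hexadecimal
    %...   octal
    Text without a percent sign is treated as decimal (assumption).
    """
    text = text.strip()
    prefix = text[:2].upper()
    if prefix == "%B":
        return int(text[2:], 2)
    if prefix == "%H":
        return int(text[2:], 16)
    if text.startswith("%"):
        return int(text[1:], 8)
    return int(text, 10)


for sample in ("%005400", "%B101111", "%H2F"):
    print(sample, "=", parse_percent_notation(sample))
```

The %B and %H checks come before the plain octal check because all three forms start with a percent sign. Octal digits never include B or H, so the order is unambiguous.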
Change Bar Notation
Change bars are used to indicate substantive differences between this edition of the
manual and the preceding edition. Change bars are vertical rules placed in the right
margin of changed portions of text, figures, tables, examples, and so on. Change bars
highlight new or revised information. For example:
The message types specified in the REPORT clause are different in the COBOL85
environment and the Common Run-Time Environment (CRE).
The CRE has many new message types and some new message type codes for
old message types. In the CRE, the message type SYSTEM includes all messages
except LOGICAL-CLOSE and LOGICAL-OPEN.
1. Introduction to Integrity NonStop NS-Series Operations
When to Use This Section on page 1-2
Understanding the Operational Environment on page 1-2
What Are the Operator Tasks? on page 1-2
Monitoring the System and Performing Recovery Operations on page 1-2
Preparing for and Recovering from Power Failures on page 1-3
Stopping and Powering Off the System on page 1-3
Powering On and Starting the System on page 1-3
Performing Preventive Maintenance on page 1-3
Operating Disk Drives and Tape Drives on page 1-3
Responding to Spooler Problems on page 1-4
Updating Firmware on page 1-4
Determining the Cause of a Problem: A Systematic Approach on page 1-4
A Problem-Solving Worksheet on page 1-4
Task 1: Get the Facts on page 1-6
Task 2: Find and Eliminate the Cause of the Problem on page 1-7
Task 3: Escalate the Problem If Necessary on page 1-8
Task 4: Prevent Future Problems on page 1-9
Logging On to an Integrity NonStop Server on page 1-9
System Consoles on page 1-9
Opening a TACL Window on page 1-10
Overview of OSM Applications on page 1-11
Launching OSM Applications on page 1-11
Service Procedures on page 1-12
Support and Service Library on page 1-12
When to Use This Section
This section introduces system hardware operations for Integrity NonStop NS-series
servers. It provides an introduction to the other sections in this guide.
Understanding the Operational Environment
To understand the operational environment:
• If you are already familiar with other NonStop systems, see Appendix A,
  Operational Differences Between Systems Running G-Series and H-Series RVUs.
• For a brief introduction to the system organization and the location of system
  components in an Integrity NonStop server, see Section 2, Determining Your
  System Configuration.
• For information about various software tools and utilities you can use to perform
  system operations on an Integrity NonStop server, see Appendix B, Tools and
  Utilities for Operations.
What Are the Operator Tasks?
The system operations described in this guide include:
• Monitoring the system and performing recovery operations
• Preparing for and recovering from power failures
• Stopping and powering off the system
• Powering on and starting the system
• Performing preventive maintenance
• Operating disk drives and tape drives
• Responding to spooler problems
Monitoring the System and Performing Recovery Operations
Checking for indications of potential system problems by monitoring the system is part
of the normal system operations routine. You perform recovery operations to restore a
malfunctioning system component to normal use. Most recovery procedures for
Integrity NonStop servers can be performed online. Monitoring the status of all system
components and performing recovery operations are described in:
• Section 3, Overview of Monitoring and Recovery
• Section 4, Monitoring EMS Event Messages
• Section 5, Processes: Monitoring and Recovery
• Section 6, Communications Subsystems: Monitoring and Recovery
• Section 7, ServerNet Resources: Monitoring and Recovery
• Section 8, I/O Adapters and Modules: Monitoring and Recovery
• Section 9, Processors and Components: Monitoring and Recovery
• Section 10, Disk Drives: Monitoring and Recovery
• Section 11, Tape Drives: Monitoring and Recovery
• Section 12, Printers and Terminals: Monitoring and Recovery
• Section 13, Applications: Monitoring and Recovery
Recovery operations for a system console are not discussed in this guide. For recovery
procedures for a system console and the applications installed on the system console,
see the NonStop NSxxxx Hardware Installation Manual for your Integrity NonStop
NS16000, NS14000, or NS1000 server.
Preparing for and Recovering from Power Failures
You can minimize unplanned outage time by having procedures to prepare for and
recover quickly from power failures, as described in Section 14, Power Failures:
Preparation and Recovery.
Stopping and Powering Off the System
HP recommends a specific set of procedures for stopping and powering off an Integrity
NonStop server or its components, as described in Section 15, Starting and Stopping
the System.
Powering On and Starting the System
HP recommends a specific set of procedures for powering on and starting an Integrity
NonStop server or its components, as described in Section 15, Starting and Stopping
the System.
Creating Startup and Shutdown Files
HP recommends a specific set of procedures for creating startup and shutdown files on
an Integrity NonStop server or its components, as described in Section 16, Creating
Startup and Shutdown Files.
Performing Preventive Maintenance
Routine preventive maintenance consists of:
• Dusting or cleaning enclosures as needed
• Cleaning tape drives regularly
• Evaluating tape condition regularly
• Cleaning and reverifying tapes as needed
Routine hardware maintenance procedures are described in Section 17, Preventive
Maintenance.
Operating Disk Drives and Tape Drives
Refer to the documentation shipped with the drive.
Responding to Spooler Problems
Refer to the Spooler Utilities Reference Manual.
Updating Firmware
Refer to the H06.xx Software Installation and Upgrade Guide.
Determining the Cause of a Problem: A Systematic Approach
Continuous availability of your NonStop system is important to system users, and your
problem-solving processes can help make such availability a reality. To determine the
cause of a problem on your system, start by trying the easiest, least expensive
possibilities. Move to more complex, expensive possibilities only if the easier solutions
fail.
This subsection presents an approach you can use in your operations environment to:
• Determine the possible causes of problems
• Systematically fix or escalate such problems
• Develop ways of preventing the same problems from recurring
The four basic steps in systematic problem solving are:
• Task 1: Get the Facts
• Task 2: Find and Eliminate the Cause of the Problem
• Task 3: Escalate the Problem If Necessary
• Task 4: Prevent Future Problems
A Problem-Solving Worksheet
Table 1-1 is a worksheet that you can use to help you through the problem-solving
process. Use this worksheet to:
• Get the facts about a problem
• Find and eliminate the cause of the problem
• Make any appropriate escalation decisions
• Prevent future problems
Make copies of this worksheet and use it to collect and analyze facts regarding a
problem you are experiencing. The results might not tell you exactly what is occurring,
but they will narrow down the number of possible causes.
You are authorized by HP to reproduce this worksheet only for the purpose of
operating your system.
Table 1-1. Problem-Solving Worksheet

Problem Facts                         Possible Causes
  What?
  Where?
  When?
  Magnitude?
Situation Facts                       Escalation Decision
Plan to Verify/Fix
Plan to Prevent and Control Damage
Task 1: Get the Facts
The first step in solving any problem is to get the facts. Although it is tempting to
speculate about causes, your time is better spent in first understanding the symptoms
of the problem.
Task 1a: Determine the Facts About the Problem
To get a clear, complete description of problem symptoms, ask questions to determine
the facts about the problem. For example:
Category      Questions to Ask
What?         What are you having trouble with?
              What specifically is wrong?
Where?        Where did you first notice the problem?
              Where has it occurred since you first noticed it?
              Which applications, components, devices, and people are affected?
When?         When did the problem occur?
              What is the frequency of the problem?
              Has this problem occurred before this time?
Magnitude?    Is the problem quantifiable in any way? (That is, can it be measured?)
              For example, how many people are affected? Is this problem getting worse?
Task 1b: Determine the Facts About the Situation
Collect facts about the situation in which the problem arose. A clear description of the
situation that led to the problem could indicate a simple solution. Examples of
questions to ask are:
• Who reported the problem and how can this person be contacted?
• How critical is the situation?
• What events led to the problem?
• Has anything changed recently that might have caused the problem?
• What event messages have you received?
• What is the current configuration of the hardware and software products affected?
An example of information you might obtain from asking questions:
Question                          Answer
What is happening that            A terminal is hung.
indicates a problem?
Where is this problem             In the office of USER.BONNIE. The affected terminal is
occurring?                        named $JT1.#C02.
When is this problem occurring?   At 8:30 this morning and also at the same time two days
                                  ago. Both times, this problem occurred after three
                                  unsuccessful attempts to log on.
What is the magnitude of this     Intermittent; the problem seemed to disappear on its own
problem?                          when it first occurred two days ago.
Task 2: Find and Eliminate the Cause of the Problem
After you collect the facts, you are ready to begin considering the possible causes of a
problem. Using these facts and relying on your knowledge and experience, begin to
list possible causes of the problem.
Task 2a: Identify the Most Likely Cause
To evaluate the possible causes of any problem, you must compare each cause with
the problem symptoms. The problem-solving worksheet gives you a guide for
accomplishing this task. In the following example:
• Possible causes become column headings.
• Entries made in the worksheet’s rows indicate whether the cause in that column
  could have produced the problem symptoms you listed in that row.
  ° Write yes in the appropriate box if that cause could explain that symptom.
  ° Write no in the appropriate box if a possible cause does not explain a fact.
  The most likely cause is the one that best explains all the facts; that is, the cause
  that contains the most yes answers.
For example, possible causes of a hung terminal problem could be:
• A terminal hardware problem
• A stopped or suspended TACL process
• System security, which locks a user out after three unsuccessful logon attempts
This worksheet lists some possible causes of a hung terminal and illustrates further
how to evaluate the possible causes:
Problem Facts                        Possible Causes
                                     Terminal     TACL        Security
                                     hardware     process
What?
  Terminal $JT1.#C02 is hung         Yes          Yes         Yes
Where?
  Office of USER.BONNIE              Yes          Yes         Yes
When?
  8:30 a.m. today                    Yes          Yes         Yes
  Two days ago at 8:30 a.m.          Yes          Yes         Yes
  After 3 failed logon attempts      No           No          Yes
Magnitude?
  Intermittent                       ?            Yes         Yes
  Goes away on its own               ?            Yes         Yes
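The tally described in Task 2a can also be done mechanically. The following Python sketch is illustrative only and is not part of any HP tool: it transcribes the example worksheet above (the cell values are read from that worksheet, and "?" is treated as neither yes nor no) and counts the yes answers for each possible cause. The function names are hypothetical.

```python
# Example worksheet: each row is a problem fact, each column a possible cause.
# "?" means the fact neither supports nor rules out that cause.
CAUSES = ["Terminal hardware", "TACL process", "Security"]

WORKSHEET = {
    "Terminal $JT1.#C02 is hung":    ["Yes", "Yes", "Yes"],
    "Office of USER.BONNIE":         ["Yes", "Yes", "Yes"],
    "8:30 a.m. today":               ["Yes", "Yes", "Yes"],
    "Two days ago at 8:30 a.m.":     ["Yes", "Yes", "Yes"],
    "After 3 failed logon attempts": ["No", "No", "Yes"],
    "Intermittent":                  ["?", "Yes", "Yes"],
    "Goes away on its own":          ["?", "Yes", "Yes"],
}


def count_yes(worksheet, causes):
    """Count, per cause, how many facts are answered with Yes."""
    totals = {cause: 0 for cause in causes}
    for answers in worksheet.values():
        for cause, answer in zip(causes, answers):
            if answer == "Yes":
                totals[cause] += 1
    return totals


def most_likely_cause(worksheet, causes):
    """Return the cause with the most Yes answers (first one wins a tie)."""
    totals = count_yes(worksheet, causes)
    return max(causes, key=lambda cause: totals[cause])
```

For this worksheet, the security column collects the most yes answers, which matches the conclusion drawn in Task 2b. A tie would need human judgment; the sketch simply returns the first tied cause.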
Task 2b: Fix the Most Probable Cause of the Problem
For the example in the worksheet, the most likely cause of the hung terminal is a
security problem. Ask yourself what would be the fastest, least expensive, safest, and
surest way of verifying that this is the most probable cause of the problem.
Once you have determined the most likely cause, try to fix it. Follow through and
implement the appropriate solution. If this solution does not fix the problem, continue
trying other possible solutions that are reasonable considering time, expense, and
safety.
Task 3: Escalate the Problem If Necessary
If the solutions you tried in the previous tasks do not solve the problem, you might
consider escalating the problem to get additional help.
Task 3a: Determine Whether You Need to Escalate the Problem
After you complete each task in the problem-solving process, you must decide whether
you can continue by yourself or if you must ask for help. Ask yourself these questions:
• Do I have the authority to resolve this problem?
• Do I have the necessary knowledge?
• Do I have the skill?
• Do I have the time?
• What other people need to become involved, if any?
• Who needs to be informed about the problem’s status?
Task 3b: Provide Documentation
If you decide to escalate the problem, you might be required to document the problem
by providing:
• A problem identification number
• A problem classification
• A complete description and history of the problem
• Diagnostic information such as copies of the event log, results of memory dumps,
  and so on
You might also have procedures at your site for logging problems. If you have a shift
log or problem log, make timely entries in the log.
Task 4: Prevent Future Problems
Solving problems that occur with your system can be exciting because it is active and
stimulating. Preventing problems is often less dramatic. But in the end, prevention is
more productive than solving problems. The more work you do to prevent problems
before they arise, the fewer problems will arise at potentially critical times.
These questions provide a framework for your problem-prevention efforts:
• Why did this problem occur? What was the root cause? Were there any
  contributing causes?
• How serious was the problem?
• What is the likelihood that it will occur again?
• Is it possible to eliminate the causes of this problem?
• Is it possible to reduce the likelihood that this problem will occur in the future?
• Can automation tools be used to detect and respond to preliminary symptoms of
  this problem?
• Can anything be done now to minimize the damage that would result from a
  reoccurrence of this problem?
• Can the problem resolution process be improved in any way?
Logging On to an Integrity NonStop Server
Many operations and troubleshooting tasks are performed by logging on to your
Integrity NonStop server from a system console and using the TACL command
interpreter or one of the OSM applications. For example, the TACL command
interpreter allows you to access SCF, which you use to configure, control, and collect
information about objects within subsystems. For examples of OSM tasks and
functions, see Overview of OSM Applications on page 1-11.
System Consoles
A system console is a personal computer approved by HP to run maintenance and
diagnostic software for Integrity NonStop servers. New system consoles are
preconfigured with the required HP and third-party software. When you upgrade to the
latest RVU, you can install software upgrades from the HP NonStop System Console
Installer CD.
System consoles communicate with Integrity NonStop servers over a dedicated service
LAN (local area network). System consoles configured as the primary and backup
dial-out points are referred to as the primary and backup system consoles, respectively.
The OSM Low-Level Link and OSM Notification Director applications reside on the
system console, along with other required HP and third-party software. OSM Service
Connection and OSM Event Viewer software resides on your server, and connectivity
is established from the console through Internet Explorer browser sessions. For more
information, see Launching OSM Applications on page 1-11.
Opening a TACL Window
On a system console, you must open a TACL window before you can log on to the
TACL command interpreter. For information about logging on to a TACL command
interpreter, see the Guardian User’s Guide.
You can use any of the following methods to open a TACL window.
Opening a TACL Window Directly From OutsideView
If you know the IP address of the NonStop server (not that of OSM), use this method:
1. Select Start > Programs > OutsideView32 7.1.
2. From the Session menu, select New. The New Session Properties dialog box
   appears.
3. From the New Session Properties dialog box, Session tab, click IO Properties.
   The TCP/IP Properties dialog box appears.
4. In the TCP/IP Properties dialog box:
   a. In the Host name or IP address and port box, type the IP address, followed by
      a space and the port number. For example:
      172.17.22.187 23
      The port number is 23 for a TACL prompt and 301 for a Startup TACL prompt.
      In general, you should use port number 23 to perform operations tasks.
   b. Click OK.
5. From the New Session Properties dialog box, click OK. A TACL window appears.
6. Log on to the TACL prompt.
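If a session does not open, it can help to confirm that the server accepts TCP connections on the port typed in step 4a before investigating OutsideView itself. The following Python sketch is an illustration only, assuming a Python interpreter is available on some workstation on the same LAN; it is not part of OutsideView or OSM, and the function names are hypothetical. It parses the same "address port" string used in step 4a and attempts a plain TCP connection.

```python
import socket

TACL_PORT = 23            # TACL prompt, as stated in step 4a
STARTUP_TACL_PORT = 301   # Startup TACL prompt, as stated in step 4a


def parse_endpoint(text):
    """Split an 'address port' string (the step 4a format) into (host, port)."""
    host, port = text.split()
    return host, int(port)


def port_is_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

For example, `port_is_reachable(*parse_endpoint("172.17.22.187 23"))` reports whether the example address accepts connections on the TACL port. A successful connection shows only that the port is open; it does not verify that a TACL prompt will be offered.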
Opening a TACL Window From the Low-Level Link
You can also open a TACL window from the OSM Low-Level Link application as
described in the Troubleshooting section in Opening Startup Event Stream and Startup
TACL Windows on page 15-22.
For more details on the functions of the TACL command interpreter, see Appendix B,
Tools and Utilities for Operations.
Overview of OSM Applications
HP NonStop Open System Management (OSM) applications perform a variety of
functions, such as:
• The OSM Low-Level Link Application is primarily used for down-system support,
  such as Two startup event stream windows and two startup TACL windows are
  automatically launched on the system console configured to receive them. on
  page 15-6, Recovery Operations for Processors on page 9-9, and configuring
  IOAM, VIO, and P-switch modules (see the NonStop NSxxxx Hardware Installation
  Manual for your Integrity NonStop NS16000, NS14000, or NS1000 server).
• The OSM Service Connection is used to monitor, inventory, and perform actions
  on system and ServerNet Cluster components. See Using OSM to Monitor the
  System on page 3-7 for an overview of how the OSM Service Connection is used
  to monitor your system components.
• The OSM Event Viewer is used for Section 4, Monitoring EMS Event Messages.
• The OSM Notification Director is used for Monitoring Problem Incident Reports on
  page 3-12 and dialing out information to your service provider.
Launching OSM Applications
Several operations tasks in this guide require you to log on to one of the OSM
applications. Assuming that all OSM client components have been installed on the
system console, launch the desired application as described below, then see the online
help (or default home page, for the browser-based OSM applications) for log-on
instructions.
To launch OSM applications, select Start > Programs > HP OSM. Then select the name of the
application to launch:
• OSM Service Connection
• OSM Low-Level Link Application
• OSM Notification Director > Start/Stop
• OSM Event Viewer
• OSM System Inventory Tool
The OSM Service Connection and the OSM Event Viewer are browser-based
applications. Assuming that the OSM Console Tools component has been installed on
the system console, the Start menu shortcuts launch a default web page for these two
applications. From that page, you can select the system of your choice from the list of
bookmarks displayed in the left column of the page (available bookmarks include those
that were user-created during previous sessions and those converted automatically
from an existing OSM system list). If no bookmarks are available, the web page also
contains instructions on how to access these applications by entering a system URL as
an Internet Explorer address. The system console-based OSM Console Tools
component is not required to use the OSM Service Connection and the OSM Event
Viewer applications; it merely installs the Start menu shortcuts and default home pages
to make accessing these applications easier. You can also simply open a new Internet
Explorer browser window and enter the URL of the system you wish to access.
For more information on configuring, accessing, or using OSM applications, see:
• OSM Migration and Configuration Guide
• OSM Service Connection User’s Guide
• Online help within the OSM Service Connection, Low-Level Link, Notification
  Director, and Event Viewer applications
Service Procedures
OSM offers a variety of guided procedures, interactive actions, and documented
service procedures to automate or assist with system serviceability. They are launched
by actions within the OSM Service Connection, and include online help.
For a list of service procedures (and their help files), both those incorporated into OSM and
others that are not part of OSM, refer to the Support and Service Library.
Support and Service Library
These NTL Support and Service library categories provide procedures, part numbers,
troubleshooting tips, and tools for servicing NonStop S-series and Integrity NonStop
NS-series systems:
• Hardware Service and Maintenance Publications
• Service Information
• Service Procedures
• Tools and Download Files
• Troubleshooting Tips
Within these categories, where applicable, content might be further categorized
according to server or enclosure type.
Authorized service providers can also order the NTL Support and Service Library CD:
• Channel Partners and Authorized Service Providers: Order the CD from the SDRC
  at https://scout.nonstop.compaq.com/SDRC/ce.htm.
• HP employees: Subscribe at World on a Workbench (WOW). Subscribers
  automatically receive CD updates. Access the WOW order form at
  http://hps.knowledgemanagement.hp.com/wow/order.asp.
2. Determining Your System Configuration
When to Use This Section
Modular Hardware Components
Differences Between Integrity NonStop NS-Series Systems
Terms Used to Describe System Hardware Components
Recording Your System Configuration
Using SCF to Determine Your System Configuration
SCF System Naming Conventions
SCF Configuration Files
Using SCF to Display Subsystem Configuration Information
Displaying SCF Configuration Information for Subsystems
Additional Subsystems Controlled by SCF
Displaying Configuration Information: SCF Examples
When to Use This Section
This section describes the system enclosures, the system organization, numbering and
labeling, and how to identify components in an Integrity NonStop NS-series server. For
detailed information on system hardware organization, refer to the NonStop NSxxxx
Planning Guide for your Integrity NonStop NS16000, NS14000, or NS1000 server.
Modular Hardware Components
Hardware for Integrity NonStop systems is implemented in modules or enclosures that
are installed in modular cabinets. The servers include these hardware components:
• Modular Cabinet with Power Distribution Unit (PDU)
• NonStop Blade Complex
• NonStop Blade Element
• Logical Synchronization Unit (LSU) (in Integrity NonStop NS16000 and NS14000
  systems only; Integrity NonStop NS1000 systems have no LSUs)
• Processor Switch, or p-switch (in Integrity NonStop NS16000 systems only;
  Integrity NonStop NS14000 and NS1000 systems have no processor switches)
• I/O Adapter Module (IOAM) Enclosure, including subcomponent I/O Adapters:
  • Fibre Channel ServerNet adapter (FCSA)
  • Gigabit Ethernet 4-port ServerNet adapter (G4SA)
  • 4-Port ServerNet Extenders (4PSEs) (Integrity NonStop NS14000 and NS1000
    systems only)
• VIO Enclosure (displayed by OSM as a VIO Module object). For more
  information, see Integrity NonStop NS14000 Systems, Integrity NonStop NS1000
  Systems, or the Versatile I/O (VIO) Manual.
• Fibre Channel disk module (FCDM)
• Maintenance Switch (Ethernet)
• UPS and ERM
• NonStop System Console (to manage the system)
• Cable Management Devices
• Enterprise Storage System (ESS)
Differences Between Integrity NonStop NS-Series Systems
NonStop System Architectures
Integrity NonStop NS-series systems offer a variety of architecture and configuration
options to suit different customer needs. Integrity NonStop NS16000 and Integrity
NonStop NS14000 systems take advantage of NonStop advanced architecture
(NSAA). For more information, see the NonStop NS16000 Planning Guide or NonStop
NS14000 Planning Guide. Integrity NonStop NS1000 systems employ the NonStop
value architecture (NSVA). For more information, see the NonStop NS1000 Planning
Guide.
Integrity NonStop NS16000 Systems
In Integrity NonStop NS16000 systems, IOAM enclosures connect through ServerNet
links to the processors via the processor switches. One IOAM enclosure provides
ServerNet connectivity for up to 10 ServerNet I/O adapters on each of the two
ServerNet fabrics. FCSAs and G4SAs can be installed in an IOAM enclosure for
communications to storage devices and subsystems as well as to LANs. Additional
IOAM enclosures can be added to increase connectivity and storage resources.
Integrity NonStop NS16000 systems connect to NonStop S-series I/O enclosures by
using fiber-optic ServerNet links to connect the p-switches of the Integrity NonStop
system to IOMF2 CRUs in the I/O enclosures.
Integrity NonStop NS14000 Systems
In Integrity NonStop NS14000 systems, there are no p-switches. There are now two
types of NS14000 systems:
• A NonStop NS14000 system consisting of a single IOAM enclosure, with an I/O
  adapter module on each ServerNet fabric. Processor connections are made
  through ports on 4-Port ServerNet Extenders (4PSEs), located in slot 1 and
  optionally slot 2 of each I/O adapter module, to the processors via the LSUs. The
  IOAM enclosure provides ServerNet connectivity for up to 8 ServerNet I/O
  adapters on each of the two ServerNet fabrics (FCSAs and G4SAs can be installed
  in slots 2 through 5 of the two IOAMs in the IOAM enclosure for communications to
  storage devices and subsystems as well as to LANs). Integrity NonStop NS14000
  systems do not support connections to additional IOAM enclosures or NonStop
  S-series I/O enclosures.
• A NonStop NS14000 system consisting of two VIO enclosures, one on each
  ServerNet fabric. Processor connections for processors 0-3 are made through
  ports 1-4 of the VIO Logic Board in slot 14 of each VIO enclosure, via the LSUs.
  An optional Optical Extender PIC in slot 2 provides for additional processor
  connectivity (processors 4-7). VIO enclosures have embedded ports and allow for
  optional expansion ports to supply the equivalent functionality provided by FCSAs
  and G4SAs in NS14000 systems with IOAMs.
Integrity NonStop NS14000 systems do not support connections to additional IOAM
enclosures or NonStop S-series I/O enclosures.
For more information on Integrity NonStop NS14000 systems, see the Versatile I/O
(VIO) Manual, the NonStop NS14000 Planning Guide, or the NonStop NS14000
Hardware Installation Manual.
Integrity NonStop NS1000 Systems
Integrity NonStop NS1000 systems have no processor switches or LSUs. Like Integrity
NonStop NS14000 systems, there are now two types: those consisting of a single
IOAM enclosure (two IOAMs) and those consisting of one VIO enclosure for each
fabric. ServerNet connectivity for each type is accomplished as described for the
Integrity NonStop NS14000 Systems, except for the absence of the LSUs.
Integrity NonStop NS1000 systems do not support connections to NonStop S-series
I/O enclosures. Besides the architectural differences, Integrity NonStop NS1000
systems also utilize different NonStop Blade Elements than Integrity NonStop NS16000
or NS14000 systems. For more information on Integrity NonStop NS1000 systems,
refer to the NonStop NS1000 Planning Guide and the NonStop NS1000 Hardware
Installation Manual.
Terms Used to Describe System Hardware Components
The terms used to describe system hardware components vary. These terms include:
• Device
• System resource or object
Device
A device can be a physical device or a logical device. A physical device is a physical
component of a computer system that is used to communicate with the outside world
or to acquire or store data. A logical device is a process used to conduct input or
output with a physical device.
System Resource or Object
The term “system resource” is used in OSM documentation to refer to server
components that OSM software displays, monitors, and often controls. The term
“object” is often used when referring to a specific resource, such as “the Disk object.”
All system resources are displayed in hierarchical form in the tree pane of the OSM
Service Connection; many are also displayed in Physical or Inventory views of the view
pane. The effect of selecting an object in either pane is the same: for example, you can
view attributes for the selected system resource in the Attributes tab, view alarms for
that resource (if any exist) in the Alarms tab, or right-click on the resource object and
select Actions, to display the Actions dialog box (from which you can select and
perform actions on the selected system resource). Besides physical hardware
components, such as IOAM enclosures, power supplies, ServerNet adapters, and disk
and tape drives, system resources also include logical entities that OSM supports,
such as logical processors, ServerNet fabrics, and LIFs (logical interfaces).
Recording Your System Configuration
As a system operator, you need to understand how your system is configured so you
can confirm that the hardware and system software are operating normally. If problems
do occur, knowing your configuration allows you to pinpoint problems more easily. If
your system configuration is corrupted, documentation about your configuration is
essential for recovery. You should be familiar with the system organization, system
configuration, and naming conventions.
Several methods are available for researching and recording your system
configuration:
• Maintaining records in hard-copy format
• Using the OSM Service Connection to inventory your system
  In the OSM Service Connection tree pane, select the System object. From the
  View pane drop-down menu, select Inventory to display a list of the system’s
  hardware resources. Click Save to save this list to a Microsoft Excel file.
• Using SCF to list objects and devices and to display subsystem configuration
  information
For information on forms available that can help you record your system configuration,
refer to the NonStop NSxxxx Planning Guide for your Integrity NonStop NS16000,
NS14000, or NS1000 server.
Using SCF to Determine Your System Configuration
SCF is one of the most important tools available to you as a system operator. SCF
commands configure and control the objects (lines, controllers, processes, and so on)
belonging to each subsystem running on the Integrity NonStop NS-series server. You
also use SCF to display information about subsystems and their objects.
SCF accepts commands from a workstation, a disk file, or an application process. It
sends display output to a workstation, a file, a process, or a printer. Some SCF
commands are available only to some subsystems. An overall SCF reference is the
SCF Reference Manual for H-Series RVUs. Subsystem-specific information appears in
a separate manual for each subsystem. For a partial list of these manuals, refer to
Appendix C, Related Reading.
More details about the functions of SCF appear in Subsystem Control Facility (SCF) on
page B-4.
SCF System Naming Conventions
SCF object names usually follow a consistent set of naming conventions defined for
each installation. HP preconfigures some of the naming conventions to create the
logical device names for many SCF objects.
System planning and configuration staff at your site likely will change or expand on the
preconfigured file-naming conventions that HP provides, typically by establishing
naming conventions for configuring such objects as storage devices, communication
processes, and adapters. These conventions should simplify your monitoring tasks by
making process or object functions intuitively obvious to someone looking at the object
name. For example, in your environment, tape drives might be named $TAPEn, where
n is a sequential number.
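A site convention like this is easy to check automatically once device names have been collected (for example, copied from SCF output into a list). The following Python sketch is a hypothetical illustration, not an HP utility: it assumes the $TAPEn convention from the example above and reports names that do not follow it, plus any gaps in the sequence. The function names and the sample names in the usage note are invented for illustration.

```python
import re

# Assumed site convention from the example above: $TAPEn, n a sequential number.
TAPE_NAME = re.compile(r"^\$TAPE(\d+)$")


def nonconforming(names):
    """Return the names that do not match the $TAPEn convention."""
    return [name for name in names if not TAPE_NAME.match(name)]


def missing_numbers(names):
    """Return numbers skipped between the lowest and highest $TAPEn found."""
    present = {int(m.group(1)) for m in map(TAPE_NAME.match, names) if m}
    if not present:
        return []
    return [n for n in range(min(present), max(present) + 1) if n not in present]
```

Given the invented names `$TAPE0`, `$TAPE1`, `$DRIVE7`, and `$TAPE3`, the first function flags `$DRIVE7` and the second reports that number 2 is unused. Adapt the pattern to whatever convention your site actually defines.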
The SCF Reference Manual for H-Series RVUs lists naming conventions for SCF
objects, as well as HP reserved names that cannot be changed or used for other
objects or processes in your environment.
SCF Configuration Files
Your system is delivered with a standard set of configuration files:
• The $SYSTEM.SYSnn.CONFBASE file contains the minimal configuration
  required to load the system.
• The $SYSTEM.ZSYSCONF.CONFIG file contains a standard system configuration
  created by HP. This basic configuration includes such objects as disk drives, tape
  drives, ServerNet adapters, the local area network (LAN) and wide area network
  (WAN) subsystem manager processes, the OSM server processes, and so on.
  You typically use this file to load the system.
• The $SYSTEM.ZSYSCONF.CONFIG file is also saved on your system as the
  ZSYSCONF.CONF0000 file.
All subsequent changes to the system configuration are made using SCF. The system
saves configuration changes on an ongoing basis in the ZSYSCONF.CONFIG file. You
have the option to save a stable copy of your configuration at any time in
ZSYSCONF.CONFxxyy using the SCF SAVE command. For example:
-> SAVE CONFIGURATION 01.02
You can save multiple system configurations by numbering them sequentially based on
a meaningful convention that reflects, for example, different hardware configurations.
Each time you load the system from CONFBASE or CONFxxyy, the system
automatically saves in a file called ZSYSCONF.CONFSAVE a copy of the
configuration file used for the system load.
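The relationship between the SAVE CONFIGURATION argument and the saved file name can be sketched in Python. This is an illustration only, not an HP tool: the function name saved_config_file is hypothetical, and the sketch assumes that an argument of the form xx.yy maps directly onto the file name CONFxxyy, with each part exactly two digits, as the SAVE example in this subsection suggests.

```python
import re


def saved_config_file(version: str) -> str:
    """Return the ZSYSCONF file name for a SAVE CONFIGURATION argument.

    Assumption: 'xx.yy' maps to CONFxxyy, each part exactly two digits.
    """
    match = re.fullmatch(r"(\d{2})\.(\d{2})", version)
    if match is None:
        raise ValueError(f"expected xx.yy with two-digit parts, got {version!r}")
    return f"ZSYSCONF.CONF{match.group(1)}{match.group(2)}"


# The argument used in the SAVE example in this subsection.
print(saved_config_file("01.02"))
```

Under the same assumption, the argument 00.00 corresponds to the ZSYSCONF.CONF0000 file that is delivered with the system.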
For guidelines on how to recover if your system configuration files are corrupted, refer
to Troubleshooting and Recovery Operations on page 15-18.
For certain SCF subsystems, configuration changes are persistent. The changes
persist through processor and system loads unless you load the system with a different
configuration file. Examples of these subsystems are the Kernel, ServerNet LAN
Systems Access (SLSA), the storage subsystem, and WAN. For other SCF
subsystems, the changes are not persistent. You must reimplement them after a
system or processor load. Examples of these subsystems are General Device Support
(GDS), Open System Services (OSS), and SQL communication subsystem (SCS).
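As a memory aid, the persistence rule above can be written as a small Python lookup. This is a sketch only: the set names, the short subsystem labels, and the function name are chosen for this illustration, and only the subsystems named in the paragraph above are classified.

```python
# Subsystems named in this section, grouped by whether SCF configuration
# changes survive a processor or system load. The labels are for this
# sketch only.
PERSISTENT = {"Kernel", "SLSA", "Storage", "WAN"}
NOT_PERSISTENT = {"GDS", "OSS", "SCS"}


def must_reapply_after_load(subsystem: str) -> bool:
    """Return True if changes must be reimplemented after a load."""
    if subsystem in NOT_PERSISTENT:
        return True
    if subsystem in PERSISTENT:
        return False
    raise KeyError(f"{subsystem!r} is not classified in this section")


for name in ("Kernel", "OSS"):
    print(name, must_reapply_after_load(name))
```

For any subsystem not listed here, check the subsystem-specific manual rather than assuming either behavior.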
Using SCF to Display Subsystem Configuration Information
SCF enables you to display, in varying levels of detail, the configuration of objects in
each subsystem supported by SCF. For example, you can use the LISTDEV command
to list all the devices on your system or to list the objects within a given subsystem.
Then you can use the INFO command with a logical device name or device type to
obtain information about a specific device or class of devices.
Another useful command when displaying information is the ASSUME command. Use
the ASSUME command to define a current default object and fully qualified object
name. Then you can use INFO to display information just for that object. For example,
if you type this command and then enter the INFO command without specifying an
object, SCF displays only the information for the workstation called $L1.#TERM1:
> SCF ASSUME WS $L1.#TERM1
SCF LISTDEV: Listing the Devices on Your System
To obtain listings for most devices and processes that have a device type known to
SCF, at a TACL prompt type:
> SCF LISTDEV
In the example shown in Example 2-1, the SCF LISTDEV command lists all the
physical and logical devices on the system.
The columns in Example 2-1 mean:
LDev     The logical device number
Name     The logical device name
PPID     The primary processor number and process identification number (PIN)
         of the specified device
BPID     The backup processor number and PIN of the specified device
Type     The device type and subtype
RSize    The record size the device is configured for
Pri      The priority level of the I/O process
Program  The fully qualified name of the program file for the process
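If you capture LISTDEV output to a file, the column meanings above can be applied programmatically. The Python sketch below is an assumption-laden illustration: the exact spacing of a LISTDEV row is not shown in this section, so the regular expression assumes whitespace-separated columns with the type and subtype in parentheses, and the sample row (including its numbers and program file name) is hypothetical. Rows that lack a backup process are not handled.

```python
import re
from dataclasses import dataclass


@dataclass
class ListdevRow:
    ldev: int        # logical device number
    name: str        # logical device name
    ppid: str        # primary processor number and PIN, "cpu,pin"
    bpid: str        # backup processor number and PIN, "cpu,pin"
    dev_type: int    # device type
    subtype: int     # device subtype
    rsize: int       # configured record size
    pri: int         # priority of the I/O process
    program: str     # fully qualified program file name


# Assumed layout: LDev Name PPID BPID (type,subtype) RSize Pri Program
ROW = re.compile(
    r"\s*(\d+)\s+(\S+)\s+(\d+,\d+)\s+(\d+,\d+)\s+"
    r"\(\s*(\d+)\s*,\s*(\d+)\s*\)\s+(\d+)\s+(\d+)\s+(\S+)\s*"
)


def parse_row(line: str) -> ListdevRow:
    """Parse one LISTDEV row that follows the assumed layout."""
    m = ROW.fullmatch(line)
    if m is None:
        raise ValueError(f"row does not match the assumed layout: {line!r}")
    ldev, name, ppid, bpid, dev_type, subtype, rsize, pri, program = m.groups()
    return ListdevRow(int(ldev), name, ppid, bpid, int(dev_type),
                      int(subtype), int(rsize), int(pri), program)


# Hypothetical row, made up for this sketch.
sample = "  63 $DATA1  0,270  1,270  ( 3,44 )  4096  220  \\MS9.$SYSTEM.SYS00.PROG"
row = parse_row(sample)
print(row.name, row.dev_type, row.subtype)
```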
Table 2-1 gives the names of some subsystems that are common to most Integrity
NonStop NS-series systems and are routinely monitored by operations. These
subsystems appear in the LISTDEV output in Example 2-1 on page 2-7.
Table 2-1. Key Subsystems and Their Logical Device Names and Device Types
All storage devices; for example, disk and tape
Systems Access (SLSA)
connection and facilities
(WAN) connections
Also, in Example 2-1 on page 2-7, several disk drives and tape drives have been
configured. You can identify the subsystem that owns a device by looking up its device
type in the SCF Reference Manual for H-Series RVUs.
To display information about a particular device:
> SCF LISTDEV TYPE n
where n is a number for the device type. For example, if n is 3, the device type is
disks. For the \MS9 system, entering LISTDEV TYPE 3 would display information for
$DATA6, $DATA5, $DATA4, $DATA3, $DATA2, $DATA1, and $DATA.
To display information for a given subsystem:
> SCF LISTDEV subsysname
where subsysname is the logical name of a subsystem; for example, $ZZKRN for the
Kernel subsystem.
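The two LISTDEV forms above can be generated from a small helper when you script routine checks. The Python sketch below only builds the command text to type at a TACL prompt; it does not run SCF. The function name and the DEVICE_TYPES table are hypothetical, the device type numbers are the two used in this section (3 for disks, 4 for tape drives), and refusing to combine a device class with a subsystem is a simplification of this sketch, not a documented SCF rule.

```python
from typing import Optional

# Device types used in this section: 3 is disks, 4 is tape drives.
DEVICE_TYPES = {"disk": 3, "tape": 4}


def listdev_command(device_class: Optional[str] = None,
                    subsystem: Optional[str] = None) -> str:
    """Build the text of an SCF LISTDEV command for a TACL prompt."""
    if device_class is not None and subsystem is not None:
        raise ValueError("give a device class or a subsystem, not both")
    if device_class is not None:
        return f"SCF LISTDEV TYPE {DEVICE_TYPES[device_class]}"
    if subsystem is not None:
        return f"SCF LISTDEV {subsystem}"
    return "SCF LISTDEV"


print(listdev_command(device_class="disk"))
print(listdev_command(subsystem="$ZZKRN"))
```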
Displaying SCF Configuration Information for Subsystems
The following tables give some of the SCF commands that display configuration
information for objects controlled by subsystems that are common to most Integrity
NonStop NS-series systems. The examples use the SCF ASSUME command to make
a given subsystem the current default object for gathering information.
TCP/IP Subsystem
These examples are based on a TCP/IP process named $ZTC0. Before using the
commands listed in Table 2-2, type this command to make the TCP/IP subsystem the
default object:
> SCF ASSUME PROCESS $ZTC0
Table 2-2. Displaying Information for the TCP/IP Subsystem ($ZTC0)
To Display Information About These                Enter This Command
Configured Objects
All TCP/IP devices                                LISTDEV TCPIP
Detailed information about the TCP/IP             INFO, DETAIL
subsystem manager
All SUBNET names                                  INFO SUBNET *
All ROUTE names                                   INFO ROUTE *
Integrity NonStop servers support two versions of TCP/IP: NonStop TCP/IPv6 and
NonStop TCP/IP. When you use the SCF LISTDEV and INFO commands, all current
TCP/IP processes are displayed. For more information, refer to the TCP/IPv6
Configuration and Management Manual and the TCP/IP Configuration and
Management Manual.
Kernel Subsystem
Before using commands listed in Table 2-3, type this command to make the Kernel
subsystem the default object:
> SCF ASSUME PROCESS $ZZKRN
Generic processes are part of the SCF Kernel subsystem. Generic processes can be
created by the operating system or by a user. Examples of generic processes created
by the operating system are the Kernel, SLSA, the storage subsystem, and WAN
subsystem manager processes. Examples of generic processes created by a user are
a Pathway program, a third-party program, or a user-written program that you
configure to be controlled by the operating system. The $ZPM persistence manager
starts and monitors all generic processes.
Table 2-3. Displaying Information for the Kernel Subsystem ($ZZKRN)
To Display Information About These                Enter This Command
Configured Objects
The Kernel subsystem manager and                  LISTDEV KERNEL
ServerNet process names
All Kernel subsystem object and process           NAMES $ZZKRN
names
All generic processes                             INFO *
Detailed information about a generic              INFO #generic-process, DETAIL
process
Storage Subsystem
The storage subsystem manages disk and tape drives as well as SCSI and HP
NonStop Storage Management Foundation (SMF) devices. Use the commands listed
in Table 2-4 to display desired information.
Table 2-4. Displaying Information for the Storage Subsystem ($ZZSTO)
To Display Information About These                Enter This Command
Configured Objects
All disk and tape drives (list)                   LISTDEV STORAGE
All storage subsystem objects and                 NAMES $ZZSTO
processes (by name)
All disk drives (list)                            LISTDEV TYPE 3
All disk drives (summary information)             INFO DISK $*
A specific disk drive (detailed information)      INFO DISK $name, DETAIL
All tape drives (list)                            LISTDEV TYPE 4
All tape drives (summary information)             INFO TAPE $*
A specific tape drive (detailed information)      INFO TAPE $name, DETAIL
When displaying configuration files for disk and tape devices in the storage subsystem,
you can use the OBEYFORM option with the INFO command to display currently
defined attribute values in the format that you would use to set up a configuration file.
Each attribute appears as a syntactically correct configuration command.
For example, this command shows all the attributes for $SYSTEM in OBEYFORM:
You can create a command file containing the output by using the OUT option of the
INFO command. For details, see the SCF Reference Manual for the Storage Subsystem.
To get detailed configuration information in command format for all disks on the
system, issue this command:
-> INFO DISK $*,OBEYFORM
To get detailed configuration information in command format for all tape drives on the
system, issue this command:
-> INFO TAPE $*,OBEYFORM
ServerNet LAN Systems Access (SLSA) Subsystem
Before using commands listed in Table 2-5, type this command to make the SLSA
subsystem the default object:
> SCF ASSUME PROCESS $ZZLAN
The SLSA subsystem provides access to parallel LAN and WAN I/O for Integrity
NonStop servers. The SLSA subsystem provides access to Ethernet, token-ring, and
multifunction I/O board Ethernet adapters and to the ServerNet wide area network
(SWAN) concentrator.
Table 2-5. Displaying Information for the SLSA Subsystem ($ZZLAN)
To Display Information About These                Enter This Command
Configured Objects
The SLSA subsystem manager                        LISTDEV SLSA
All SLSA subsystem object and process             NAMES $ZZLAN
names
All configured adapters, with                     INFO ADAPTER *
group/module/slot and adapter type
A specific adapter                                INFO ADAPTER adapter, DETAIL
All logical interface (LIF) names, with           INFO LIF *
associated MAC addresses, associated
physical interface (PIF) names, and port
types
A specific LIF                                    INFO LIF lifname, DETAIL
A specific PIF                                    INFO PIF pifname, DETAIL
All ServerNet addressable controller (SAC)        INFO SAC *
names
A specific SAC                                    INFO SAC sacname.n, DETAIL
When displaying configuration files for adapter and LIF devices in the SLSA
subsystem, you can use the OBEYFORM option with the INFO command to display
currently defined attribute values in the format that you would use to set up a
configuration file. Each attribute appears as a syntactically correct system configuration
command. For example:
Examples of the INFO command used with the OBEYFORM option are:
-> INFO ADAPTER $*, OBEYFORM
-> INFO LIF $*, OBEYFORM
WAN Subsystem
Before using commands listed in Table 2-6, type this command to make the wide area
network (WAN) subsystem the default object:
> SCF ASSUME PROCESS $ZZWAN
The WAN subsystem has responsibility for all WAN connections.
Table 2-6. Displaying Information for the WAN Subsystem ($ZZWAN)
To Display Information About These                Enter This Command
Configured Objects
The WAN subsystem manager                         LISTDEV WAN
All WAN configuration managers, TCP/IP            INFO *
processes, and WANBoot processes
All PATH names                                    INFO PATH *
The WAN adapters                                  INFO ADAPTER *
All DEVICE objects                                INFO DEVICE *
All PROFILE objects                               INFO PROFILE *
Additional Subsystems Controlled by SCF
Table 2-7 lists the names associated with additional subsystems that can be controlled
by SCF, along with their device types and subtypes. You can use SCF commands to
display the current attribute values for these objects.
Some SCF commands are available only to some subsystems. The objects that each
command affects and the attributes of those objects are subsystem specific. This
subsystem-specific information is presented in a separate manual for each subsystem.
A partial list of these manuals appears in Table 6-1 on page 6-13.
Refer to the SCF Reference Manual for H-Series RVUs for further information.
Table 2-7. Subsystem Objects Controlled by SCF
Subsystem Acronym  Description                                                                 Device Type  Device Subtype
AM3270             AM3270 Access Method                                                        60           0 or 10
ATM                Asynchronous Transfer Mode (ATM) protocol                                   42           0 or 1
ATP6100            Asynchronous Terminal Process 6100                                          53           0
CP6100             Communications Process Subsystem                                            51           0
Envoy              Byte-synchronous and asynchronous communications data link-level interface  7            0
EnvoyACP/XF        Byte-synchronous communications data link-level interface                   11           40, 41, 42, 43
Expand             Expand network control process ($NCP) or line-handler process               62 or 63     2, 3, 5, or 6
GDS                General Device Support                                                      57
OSIAPLMG           Open Systems Interconnection/Application Manager                            55           20
OSIAS              Open Systems Interconnection/Application Services                           55           1-5
OSICMIP            Open Systems Interconnection/Common Management Information Protocol         55           24
OSIFTAM            Open Systems Interconnection/File Transfer, Access, and Management          55           21 or 25
OSIMHS             Open Systems Interconnection/Message Handling System                        55           11 or 12
OSITS              Open Systems Interconnection/Transport Services                             55           55, 4
OSS                Open System Services                                                        24           0
PAM                Port Access Method
QIO                Queued I/O product                                                          45           0
SCP                Subsystem Control Point                                                     50           63
SCS                SQL Communications Subsystem                                                38           0
SNAX/APN           SNAX Advanced Peer Networking                                               58 or 13     0
SNAX/XF            SNAX Extended Facility                                                      58 or 13
SNAXAPC            SNAX Advanced Program Communication                                         13           10
SNAXCRE            SNAX Creator-2                                                              18           0
SNAXHLS            SNAX High-Level Support                                                     13           5
SNMP               Simple Network Management Protocol agent                                    31           0
TELSERV            TCP/IP TELNET product                                                       46           0
TR3271             TR3271 Access Method                                                        60           1 or 11
X25AM              X.25 Access Method                                                          61           0
Displaying Configuration Information: SCF Examples
These examples show SCF commands that display subsystem configuration
information, along with the information that is returned. These commands are not
preceded by an ASSUME command.
To display all the processes running in the Kernel subsystem:
-> INFO PROC $ZZKRN.#*
The system displays a listing similar to that shown in Example 2-3:
Example 2-3. SCF INFO PROCESS Command Output
32-> INFO PROCESS $ZZKRN.#*
NONSTOP KERNEL - Info PROCESS \DRP09.$ZZKRN
Symbolic Name *Name *Autorestart *Program
3. Overview of Monitoring and Recovery
When to Use This Section on page 3-1
Functions of Monitoring on page 3-2
Monitoring Tasks on page 3-2
Working With a Daily Checklist on page 3-2
Tools for Checking the Status of System Hardware on page 3-3
Additional Monitoring Tasks on page 3-6
Monitoring and Resolving Problems—An Approach on page 3-7
Using OSM to Monitor the System on page 3-7
Using the OSM Service Connection on page 3-7
Recovery Operations for Problems Detected by OSM on page 3-12
Monitoring Problem Incident Reports on page 3-12
Using SCF to Monitor the System on page 3-12
Determining Device States on page 3-13
Automating Routine System Monitoring on page 3-16
Using the Status LEDs to Monitor the System on page 3-20
Related Reading on page 3-22
When to Use This Section
This section provides an overview of monitoring an Integrity NonStop server using
various tools. It describes some common monitoring tasks. It also refers you to other
sections or manuals for more information about monitoring specific system
components, events, applications, or processes.
Functions of Monitoring
You must monitor a system to ensure that it is operating properly and to recognize
when corrective action is required. By monitoring a system, you can:
• Verify whether components are currently up or down
• Be quickly notified of error conditions, state changes, and threshold conditions that
  have been exceeded or are reaching their limits
• View a chronological list of events that can help with problem diagnosis and
  resolution
• Determine how much of a particular resource is being used; for example,
  processor capacity, disk or file space, or communications line bandwidth
• Find performance problems that can affect the users of the system
• Make better use of existing resources
• Ensure that products such as HP NonStop SQL/MP, HP NonStop SQL/MX, HP
  NonStop Transaction Management Facility (TMF), and Pathway are available
• Prevent many problems and outages from occurring
Monitoring Tasks
Regardless of the shift you work, certain areas of your hardware and software
environment need to be checked on a regular basis. This subsection provides
guidelines that will enable you to determine the general areas you should monitor.
Working With a Daily Checklist
A good method for ensuring that certain areas of your operations environment are
monitored is to develop a checklist. Monitor these items on a system frequently. At
least daily, monitor:
• OSM Service Connection GUI
• Event messages
• Alarms
• Problem incident reports
• The status of all system components
• The status of processes
• The status of all applications
• The performance of processors, disks, and communications lines (Monitoring
  performance is not discussed in this guide.)
An example of a checklist you might use to standardize your routine daily monitoring
tasks is:
Task                                              Operator’s name  Date & time  Notes and questions
Check phone messages
Check faxes
Check e-mail
Check shift log
Check EMS event messages
Check status of terminals
Check comm. lines
Check TMF status
Check Pathway status
Check disks
Check tape drives
Check processors
Check printers
Check spooler supervisor and collector processes
Check ServerNet cluster status
Tools for Checking the Status of System Hardware
Several tools are available to check the status of system components in an Integrity
NonStop NS-series server. The most frequently used tools are the OSM Service
Connection and the Subsystem Control Facility (SCF).
For information relating to system components in NonStop S-series servers, refer to
the appropriate NonStop S-Series documentation.
Table 3-1 lists the tools available to monitor system components.
Table 3-1. Monitoring System Components

Resource: Adapters for communications subsystems: G4SA
Monitored Using These Tools: OSM Service Connection; SCF interface to various subsystems
See: Using the OSM Service Connection on page 3-7; Section 6, Communications Subsystems:
  Monitoring and Recovery; Section 8, I/O Adapters and Modules: Monitoring and Recovery;
  OSM Service Connection User’s Guide (or OSM Service Connection online help)

Resource: Adapters for the storage subsystem: Fibre Channel ServerNet adapter (FCSA)
Monitored Using These Tools: OSM Service Connection; SCF interface to the storage subsystem
See: Using the OSM Service Connection on page 3-7; Section 8, I/O Adapters and Modules:
  Monitoring and Recovery; OSM Service Connection User’s Guide (or OSM Service Connection
  online help)

Resource: AWAN access server
Monitored Using These Tools: RAS management tool
See: AWAN 3886 Server Installation and Configuration Guide

Resource: Communications lines
Monitored Using These Tools: SCF interface to the various subsystems
See: Section 6, Communications Subsystems: Monitoring and Recovery

Resource: Disk drive enclosure (and individual disk drives) attached to FCSAs
Monitored Using These Tools: OSM Service Connection; SCF interface to the storage subsystem;
  DSAP
See: Using the OSM Service Connection on page 3-7; Section 8, I/O Adapters and Modules:
  Monitoring and Recovery; Section 10, Disk Drives: Monitoring and Recovery; Guardian
  User’s Guide

Resource: Disk drives attached to ServerNet adapters in legacy NonStop S-series enclosures
Monitored Using These Tools: OSM Service Connection; SCF interface to the storage subsystem;
  DSAP
See: Using the OSM Service Connection on page 3-7; Section 10, Disk Drives: Monitoring and
  Recovery; Guardian User’s Guide

Resource: Modular I/O adapter module (IOAM) and subcomponents, including ServerNet switch
  boards, power supplies, and fans
Monitored Using These Tools: OSM Service Connection
See: Using the OSM Service Connection on page 3-7; Monitor Batteries on page 14-4;
  OSM Service Connection User’s Guide (or OSM Service Connection online help)

Resource: Legacy NonStop S-series enclosure and subcomponents, including IOMF2 CRUs, PMCUs,
  power supplies, fans, and batteries
Monitored Using These Tools: OSM Service Connection
See: Using the OSM Service Connection on page 3-7; Section 8, I/O Adapters and Modules:
  Monitoring and Recovery; OSM Service Connection User’s Guide (or OSM Service Connection
  online help)

Resource: NonStop Blade Complex components: Blade Elements, LSUs, logical processors
Monitored Using These Tools: OSM Service Connection
See: Using the OSM Service Connection on page 3-7; Section 9, Processors and Components:
  Monitoring and Recovery; OSM Service Connection User’s Guide (or OSM Service Connection
  online help)

Resource: NonStop ServerNet Cluster 6770 Switch
Monitored Using These Tools: OSM Service Connection
See: ServerNet Cluster 6770 Hardware Installation and Support Guide, or ServerNet Cluster
  Manual; OSM Service Connection User’s Guide (or OSM Service Connection online help)

Resource: NonStop ServerNet Cluster 6780 Switch
Monitored Using These Tools: OSM Service Connection
See: ServerNet Cluster 6780 Operations Guide; OSM Service Connection User’s Guide (or OSM
  Service Connection online help)

Resource: Printers
Monitored Using These Tools: SCF; SPOOLCOM
See: Section 12, Printers and Terminals: Monitoring and Recovery; Guardian User’s Guide

Resource: Processor switch (P-switch) module and subcomponents, including ServerNet switch
  boards, power supplies, fans, PICs and ports
Monitored Using These Tools: OSM Service Connection
See: Using the OSM Service Connection on page 3-7; OSM Service Connection User’s Guide (or
  OSM Service Connection online help)

Resource: ServerNet connectivity for an Integrity NonStop NS14000 or NS1000 system (which
  have no processor switches); 4-Port ServerNet Extender (4PSE)
Monitored Using These Tools: OSM Service Connection
See: Using the OSM Service Connection on page 3-7; OSM Service Connection User’s Guide (or
  OSM Service Connection online help)

Resource: ServerNet fabrics: processor-to-processor and processor-to-IOMF2 communication
Monitored Using These Tools: OSM Service Connection; SCF interface to the Kernel subsystem
See: Using the OSM Service Connection on page 3-7; Section 7, ServerNet Resources:
  Monitoring and Recovery

Resource: ServerNet wide area network (SWAN) concentrator
Monitored Using These Tools: OSM Service Connection; SCF interface to the WAN subsystem
See: Using the OSM Service Connection on page 3-7; Section 6, Communications Subsystems:
  Monitoring and Recovery

Resource: Tape drives
Monitored Using These Tools: OSM Service Connection; SCF interface to the storage subsystem;
  MEDIACOM
See: Section 11, Tape Drives: Monitoring and Recovery; Section 8, I/O Adapters and Modules:
  Monitoring and Recovery; Guardian User’s Guide

Resource: Uninterruptible Power Supply (UPS)
Monitored Using These Tools: OSM Service Connection
See: Monitor Batteries on page 14-4; OSM Service Connection User’s Guide (or OSM Service
  Connection online help)

Additional Monitoring Tasks
Table 3-2 provides an example of additional areas you should monitor daily.
Table 3-2. Daily Tasks Checklist

General Tasks: Monitor messages from system users
Specific Tasks: Check telephone, fax, electronic mail, and any other messages
For More Information, See: Guardian User’s Guide

General Tasks: Monitor operator messages
Specific Tasks: From the OSM Event Viewer; from the EMSDIST printing distributor; from
  ViewPoint
For More Information, See: Section 4, Monitoring EMS Event Messages; OSM Event Viewer online
  help; Guardian User’s Guide; ViewPoint Manual

General Tasks: Monitor key applications
Specific Tasks: Monitor Pathway and TMF; monitor SQL/MX, SQL/MP and other applications
For More Information, See: Section 13, Applications: Monitoring and Recovery; the
  documentation specific to the application

General Tasks: Monitor system processes
Specific Tasks: Use the SCF and TACL PPD commands
For More Information, See: Section 5, Processes: Monitoring and Recovery
Monitoring and Resolving Problems—An
Approach
A useful approach to identifying and resolving problems in your system is to first use
OSM to locate the focal point of a hardware problem and then use SCF to gather all
the related data from the subsystems that control or act on the hardware. In this way,
you can develop a larger picture that encompasses the whole environment, including
communications links and other objects and services that might be contributing to the
problem or affected by it.
To get comprehensive online descriptions of all the available SCF commands, use the
SCF HELP command.
The following subsections give instructions for using OSM and SCF to monitor and
resolve problems.
Using OSM to Monitor the System
This section deals mostly with the OSM Service Connection, the primary OSM
interface for system monitoring and serviceability.
See Overview of OSM Applications on page 1-11 for examples of how the other OSM
applications are used for monitoring-related functions.
Using the OSM Service Connection
The OSM Service Connection can be used in a variety of ways to monitor your system,
including:
Use of colors and symbols to direct you to the source of any problems
•
Attribute values for system resources, displayed in the Attributes tab and in many
•
dialog boxes.
Alarms, displayed in the Alarms tab and Alarm Summary dialog box.
•
The following section presents one model for using the OSM Service Connection to
monitor your system, along with a few other options.
A Top-Down Approach
The Management (or main) window of the OSM Service Connection uses a series of
colors and symbols to notify you that problems exist within the system. You can tell at a
high-level glance when problem conditions exist, then drill down (expand the tree
pane) to find the component reporting the problem. Figure 3-1 illustrates how both the
rectangular system icon (located at the top of the view pane) and the system object
in the tree pane indicate problems within the system. The system icon, which is green
when OSM is reporting no problems on the system, has turned yellow. The system
icon in the tree pane is displaying a yellow arrow to indicate a problem within.
Figure 3-1. OSM Management: System Icons Indicate Problems Within
Note. In the OSM Service Connection Management window, the tree pane is located on the
far left. In the lower right is the Overview pane. Located between them is the details pane, from
which you can choose to view the Attributes or Alarms tab. Directly above the details pane is
the view pane, from which you can choose a Physical or Inventory view of your system or
ServerNet Cluster. The gray bar directly above the view pane is an OSM-specific toolbar (as
opposed to the standard Internet Explorer menu bar at the top of the browser window).
Expanding the system object in the tree pane, you can see a yellow arrow on the
Group 110 object, indicating that the problem is located somewhere within that group.
Expanding the tree pane further, as illustrated in Figure 3-2, yellow arrows on the
IOAM Enclosure 110 and IOAM 110.3 objects reveal that the problem exists on a
ServerNet adapter in slot 3 of that I/O module. The red bell-shaped icon by that
resource object (in the tree pane) indicates that there is an alarm on the object. To
obtain information about the alarm:
1. Click to select the object displaying the red triangular and bell-shaped symbols.
2. Select the Alarms tab from the details pane.
3. Click to select the alarm, then right-click and select Details.
Figure 3-2. Expanding the Tree Pane to Locate the Source of Problems
Check the Attributes tab (Figure 3-3) also, as a yellow or red triangular symbol
indicates problem attribute values exist. In this case, the degraded Service State
attribute was caused by an alarm. However, when a resource displays a yellow or red
triangular object but no bell-shaped icon, it has no alarms but is reporting problem or
degraded attribute values.
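The top-down approach amounts to walking a tree from the system object toward the resource that raised the alarm. The Python sketch below models that walk. It is an illustration only: OSM is a graphical tool, and the Node class, its has_alarm field, and the healthy "Group 100" branch are hypothetical. The other object names follow the example described above.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Node:
    name: str
    has_alarm: bool = False          # stands in for the bell-shaped icon
    children: List["Node"] = field(default_factory=list)


def has_problem(node: Node) -> bool:
    """True if this object or anything beneath it has an alarm.

    This mirrors the arrow that marks a problem somewhere within.
    """
    return node.has_alarm or any(has_problem(c) for c in node.children)


def paths_to_alarms(node: Node, prefix: Tuple[str, ...] = ()) -> List[Tuple[str, ...]]:
    """Return the path from the top of the tree to every alarmed object."""
    path = prefix + (node.name,)
    found = [path] if node.has_alarm else []
    for child in node.children:
        if has_problem(child):       # expand only branches showing a problem
            found.extend(paths_to_alarms(child, path))
    return found


adapter = Node("ServerNet adapter (slot 3)", has_alarm=True)
system = Node("System", children=[
    Node("Group 100"),               # hypothetical healthy group
    Node("Group 110", children=[
        Node("IOAM Enclosure 110", children=[
            Node("IOAM 110.3", children=[adapter]),
        ]),
    ]),
])

for p in paths_to_alarms(system):
    print(" > ".join(p))
```

Expanding only the branches that report a problem is the same shortcut the colors and symbols give you in the tree pane: healthy groups never need to be opened.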
Figure 3-3. Attributes Tab
Using System Status Icons to Monitor Multiple Systems
When you are monitoring multiple systems, you can create a System Status Icon for
each system, allowing you to keep a high-level eye on each system while saving
screen space. Figure 3-4 shows three separate System Status icons, each created by:
1. Establishing an OSM Service Connection session to the system.
2. From the Summary menu on the OSM toolbar, selecting System Status.
You can then minimize, but not close, the OSM Service Connection Management
window for each system. If the System Status icon for a system turns from green to
yellow, as illustrated in Figure 3-4, open the Management window for that system and
locate the problem as described in A Top-Down Approach on page 3-7.
Figure 3-4. Using System Status Icons to Monitor Multiple Systems
Using Alarm and Problem Summaries
Other options for monitoring your system with the OSM Service Connection include
using the Alarm Summary (Figure 3-5) or Problem Summary (Figure 3-6) dialog boxes
to quickly view all alarms and problem conditions that exist on your system.
Figure 3-5. Alarm Summary Dialog Box
Figure 3-6. Problem Summary Dialog Box
Suppressing Problems and Alarms
In certain cases, you might want to acknowledge or suppress a known problem to
stop it from propagating all the way up to the system level. That way,
it will be easier to identify other problems that might occur. For more information on
OSM problem management features such as deleting or suppressing alarms and
suppressing problem attributes, see the OSM Service Connection User’s Guide (also
available as online help within the OSM Service Connection).
Recovery Operations for Problems Detected by OSM
Recovery operations depend on the particular problem, of course. Methods of
determining the appropriate recovery action include:
• Alarm Details, available for each alarm displayed in OSM, provide suggested
  repair actions.
• The values displayed by problem attributes in OSM often provide clues to
  recovery.
• EMS events, retrieved and viewed in the OSM Event Viewer, include cause, effect,
  and recovery information in the event details.
• Check the section in this guide that covers the system resource (for example,
  Section 11, Tape Drives: Monitoring and Recovery) for information on using
  SCF and other tools to determine the cause of a problem. Then follow the
  directions in the Recovery Operations subsection in the relevant section.
Replacing a system component that has malfunctioned is beyond the scope of this
guide. For more information, contact your service provider, or refer to the Support and
Service Library on page 1-12.
Monitoring Problem Incident Reports
The OSM Notification Director generates problem incident reports when changes occur
that could directly affect the availability of resources on your Integrity NonStop server.
The Incident Report List tab on the Notification Director dialog box allows you to view,
sort, authorize, and reject incident reports. The Notification Director allows you to
forward notifications to your service provider if your system is configured for remote
dial-out.
Using SCF to Monitor the System
Use the Subsystem Control Facility (SCF) to display information and current status for
all the devices on your system known to SCF. Some SCF commands are available
only to some subsystems. The objects that each command affects and the attributes of
those objects are subsystem specific. This subsystem-specific information appears in a
separate manual for each subsystem. A partial list of these manuals appears in
Appendix C, Related Reading.
Determining Device States
This subsection explains how to determine the state of devices on your system. For
example, to monitor the current state of all tape devices on your system, at an SCF
prompt:
-> STATUS TAPE $*
Example 3-1 shows the results of the SCF STATUS TAPE $* command:
Example 3-1. SCF STATUS TAPE Command
1-> STATUS TAPE $*
STORAGE - Status TAPE \COMM.$TAPE0
LDev State Primary Backup DeviceStatus
PID PID
156 STOPPED 2,268 3,288 NOT READY
STORAGE - Status TAPE \COMM.$DLT20
LDev State Primary Backup DeviceStatus
PID PID
394 STARTED 2,267 3,295 NOT READY
STORAGE - Status TAPE \COMM.$DLT21
LDev State Primary Backup DeviceStatus
PID PID
393 STARTED 1,289 0,299 NOT READY
STORAGE - Status TAPE \COMM.$DLT22
LDev State Primary Backup DeviceStatus
PID PID
392 STARTED 0,300 1,288 NOT READY
STORAGE - Status TAPE \COMM.$DLT23
LDev State Primary Backup DeviceStatus
PID PID
391 STARTED 1,287 0,301 NOT READY
STORAGE - Status TAPE \COMM.$DLT24
LDev State Primary Backup DeviceStatus
PID PID
390 STARTED 6,265 7,298 NOT READY
STORAGE - Status TAPE \COMM.$DLT25
LDev State Primary Backup DeviceStatus
PID PID
389 STARTED 4,265 5,285 NOT READY
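When many tape devices are listed, it can help to pick out the ones that are not in the STARTED state. The following is a minimal Python sketch, assuming that the STATUS TAPE output has been captured as text on a workstation. The function names parse_tape_status and not_started are hypothetical and are not part of SCF. The sketch expects rows that carry both a primary and a backup PID, as in Example 3-1.

```python
import re

# Excerpt of the STATUS TAPE output shown in Example 3-1.
SAMPLE = "\n".join([
    "STORAGE - Status TAPE \\COMM.$TAPE0",
    "LDev State Primary Backup DeviceStatus",
    "PID PID",
    "156 STOPPED 2,268 3,288 NOT READY",
    "STORAGE - Status TAPE \\COMM.$DLT20",
    "LDev State Primary Backup DeviceStatus",
    "PID PID",
    "394 STARTED 2,267 3,295 NOT READY",
])

# A device header line names the tape device.
HEADER = re.compile(r"^STORAGE - Status TAPE (\S+)$")
# A data row: LDev, State, primary PID, backup PID, device status.
ROW = re.compile(r"^(\d+)\s+([A-Z]+)\s+(\d+,\d+)\s+(\d+,\d+)\s+(.+)$")


def parse_tape_status(text):
    """Return one dict per tape device found in captured STATUS TAPE text."""
    devices = []
    name = None
    for line in text.splitlines():
        line = line.strip()
        header = HEADER.match(line)
        if header:
            name = header.group(1)
            continue
        row = ROW.match(line)
        if row and name is not None:
            devices.append({
                "name": name,
                "ldev": int(row.group(1)),
                "state": row.group(2),
                "primary": row.group(3),
                "backup": row.group(4),
                "device_status": row.group(5).strip(),
            })
            name = None
    return devices


def not_started(devices):
    """Return the names of devices whose state is not STARTED."""
    return [d["name"] for d in devices if d["state"] != "STARTED"]


if __name__ == "__main__":
    for device_name in not_started(parse_tape_status(SAMPLE)):
        print(device_name)
```

Rows that lack a backup PID would need a looser pattern than the one assumed here.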
Some other examples of the SCF STATUS command are:
-> STATUS LINE $LAM3
-> STATUS WS $LAM3.#WS1
-> STATUS WS $LAM3.*
-> STATUS WINDOW $LAM3.#WS1.*
-> STATUS WINDOW $LAM3.*, SEL STOPPED
The general format of the STATUS display follows. However, the format varies
depending on the subsystem.
subsystem STATUS object-type object-name
Name          State  PPID    BPID    attr1  attr2  attr3 …
object-name1  state  nn,nnn  nn,nnn  val1   val2   val3 …
object-name2  state  nn,nnn  nn,nnn  val1   val2   val3 …
where:
subsystem     The reporting subsystem name
object-type   The object, or device, type
object-name   The fully qualified name of the object
State         One of the valid object states: ABORTING, DEFINED, and the other
              states listed in Table 3-3
PPID          The primary processor number and process identification number
              (PIN) of the object
BPID          The backup processor number and PIN of the object
attrn         The name of an attribute of the object
valn          The value of that object attribute
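The general format can be illustrated with a short Python sketch that splits one data row into named fields and separates each PID into its processor number and PIN. The sketch assumes whitespace-separated fields with no spaces inside a field. The function names, the sample row, and the attribute names are hypothetical, and real subsystems vary the layout.

```python
def parse_pid(field):
    """Split a 'cpu,pin' field such as '2,268' into (cpu, pin) integers."""
    cpu, pin = field.split(",")
    return int(cpu), int(pin)


def parse_status_row(row, attr_names=()):
    """Parse one row laid out as: name state PPID BPID val1 val2 ..."""
    parts = row.split()
    name, state, ppid, bpid = parts[:4]
    record = {
        "name": name,
        "state": state,
        "ppid": parse_pid(ppid),
        "bpid": parse_pid(bpid),
    }
    # Pair any remaining values with the caller-supplied attribute names.
    record.update(zip(attr_names, parts[4:]))
    return record


if __name__ == "__main__":
    # Hypothetical row that follows the general format above.
    print(parse_status_row("$LAM3 STARTED 2,268 3,288 ON 5", ("attr1", "attr2")))
```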
SCF Object States
Table 3-3 lists and explains the possible object states that the SCF STATUS command
can report.
Table 3-3. SCF Object States
State        Substate              Explanation
ABORTING                           The object is being aborted. The object is
                                   responding to an ABORT command or some type
                                   of malfunction. In this state, no new links are
                                   allowed, and drastic measures might be underway
                                   to reach the STOPPED state. This state is
                                   irrevocable.
DEFINED                            One of the generally defined possible conditions
                                   of an object with respect to the management of
                                   that object.
DIAGNOSING                         The object is in a subsystem-defined test mode
                                   entered through the DIAGNOSE command.
INITIALIZED                        The system has created the process, but it is not
                                   yet in one of the operational states.
SERVICING    SPECIAL               The object is being serviced or used by a
                                   privileged process and is inaccessible to user
                                   processes.
             TEST                  The object is reserved for exclusive testing.
STARTED                            The object is logically accessible to user
                                   processes.
STARTING                           The object is being initialized and is in
                                   transition to the STARTED state.
STOPPED      CONFIG-ERROR          The object is configured improperly.
             DOWN                  The object is no longer logically accessible to
                                   user processes.
             HARDDOWN              The object is in the hard-down state or is
                                   physically inaccessible due to a hardware error.
             INACCESSIBLE          The object is inaccessible to user processes.
             PREMATURE-TAKEOVER    The backup input/output (I/O) process was asked
                                   to take over for the primary I/O process before
                                   it had the proper information.
             RESOURCE-UNAVAILABLE  The input/output (I/O) process could not obtain
                                   a necessary resource.
             UNKNOWN-REASON        The input/output (I/O) process is down for an
                                   unknown reason.
STOPPING                           The object is in transition to the STOPPED state.
                                   No new links are allowed to or from the object.
                                   Existing links are in the process of being
                                   deleted.
SUSPENDED                          The flow of information to and from the object is
                                   restricted. (It is typically prevented.) A
                                   subsystem must clearly distinguish between the
                                   type of information that is allowed to flow in
                                   the SUSPENDED state and that which normally flows
                                   in the STARTED or STOPPED state. In the
                                   SUSPENDED state, the object must complete any
                                   outstanding work defined by the subsystem.
SUSPENDING                         The object is in transition to the SUSPENDED
                                   state. The subsystem must clearly define the
                                   nature of the restrictions that this state
                                   imposes on its objects.
UNKNOWN                            The object’s state cannot be determined because
                                   the object is inaccessible.
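The STOPPED substates in Table 3-3 lend themselves to a simple lookup when you review captured STATUS output. The following Python sketch is one possible arrangement, not an HP-supplied tool. The explanations are copied from the table, the function names are hypothetical, and treating ABORTING, STARTING, STOPPING, and SUSPENDING as transitional is an interpretation of the table text.

```python
# Explanations for the STOPPED substates, taken from Table 3-3.
STOPPED_SUBSTATES = {
    "CONFIG-ERROR": "The object is configured improperly.",
    "DOWN": "The object is no longer logically accessible to user processes.",
    "HARDDOWN": ("The object is in the hard-down state or is physically "
                 "inaccessible due to a hardware error."),
    "INACCESSIBLE": "The object is inaccessible to user processes.",
    "PREMATURE-TAKEOVER": ("The backup input/output (I/O) process was asked to "
                           "take over for the primary I/O process before it had "
                           "the proper information."),
    "RESOURCE-UNAVAILABLE": ("The input/output (I/O) process could not obtain "
                             "a necessary resource."),
    "UNKNOWN-REASON": ("The input/output (I/O) process is down for an "
                       "unknown reason."),
}

# Assumption: states that the table describes as moving toward another state.
TRANSITIONAL_STATES = {"ABORTING", "STARTING", "STOPPING", "SUSPENDING"}


def is_transitional(state):
    """Return True if the state is described as a transition in Table 3-3."""
    return state.upper() in TRANSITIONAL_STATES


def explain_stopped(substate):
    """Return the Table 3-3 explanation for a STOPPED substate."""
    key = substate.upper()
    return STOPPED_SUBSTATES.get(
        key, "No explanation recorded for substate " + key)


if __name__ == "__main__":
    print(explain_stopped("HARDDOWN"))
    print(is_transitional("STOPPING"))
```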
Automating Routine System Monitoring
You can automate many of the monitoring procedures. Automation saves you time and
helps you to perform many routine tasks more efficiently.
Your operations environment might be using TACL macros, TACL routines, or
command files to perform routine system monitoring and other tasks. These items
allow you to run many procedures so that you can quickly determine system status,
produce reports, or perform other common tasks. The TACL Reference Manual
contains an example that you can adapt to automate system monitoring.
Example 3-2 contains an example of a command file you can use or adapt to check
many of the system elements:
1. To create a command file named SYSCHK that will automate system monitoring,
   type the text shown in Example 3-2 into an EDIT file.
Example 3-2. System Monitoring Command File
COMMENT THIS IS THE FILE SYSCHK
COMMENT THIS CHECKS ALL DISKS:
SCF STATUS DISK $*
COMMENT THIS CHECKS ALL TAPE DRIVES:
SCF STATUS TAPE $*
COMMENT THIS CHECKS THE SPOOLER PRINT DEVICES:
SPOOLCOM DEV
COMMENT THIS CHECKS THE LINE HANDLERS:
SCF STATUS LINE $*
COMMENT THIS CHECKS THE STATUS OF TMF:
TMFCOM;STATUS TMF
COMMENT THIS CHECKS THE STATUS OF PATHWAY:
PATHCOM $ZVPT;STATUS PATHWAY;STATUS PATHMON
COMMENT THIS CHECKS ALL SACS:
SCF STATUS SAC $*
COMMENT THIS CHECKS ALL ADAPTERS
SCF STATUS ADAPTER $*
COMMENT THIS CHECKS ALL LIFS
SCF STATUS LIF $*
COMMENT THIS CHECKS ALL PIFS
SCF STATUS PIF $*
2. After you create this file, at a TACL prompt, type this command to execute the file
and automatically monitor many elements of your system:
> OBEY SYSCHK
For an example of the output that is sent to your home terminal when you execute a
command file such as SYSCHK, refer to Example 3-3. This output shows that all
elements of the system being monitored are up and running normally.
Example 3-3. System Monitoring Output File
COMMENT THIS IS THE FILE SYSCHK
COMMENT THIS CHECKS ALL DISKS:
SCF STATUS DISK $*
STORAGE - Status DISK \SHARK.$DATA12
LDev  Primary   Backup   Mirror    MirrorBackup  Primary  Backup
                                                 PID      PID
  52  *STARTED  STARTED  *STARTED  STARTED       3,262    2,263
STORAGE - Status DISK \SHARK.$DATA01
LDev  Primary   Backup   Mirror    MirrorBackup  Primary  Backup
                                                 PID      PID
  63  *STARTED  STARTED  *STARTED  STARTED       0,267    1,266
STORAGE - Status DISK \SHARK.$DATA04
LDev  Primary   Backup   Mirror    MirrorBackup  Primary  Backup
                                                 PID      PID
  60  *STARTED  STARTED  *STARTED  STARTED       0,270    1,263
STORAGE - Status DISK \SHARK.$SYSTEM
LDev  Primary   Backup   Mirror    MirrorBackup  Primary  Backup
                                                 PID      PID
   6  *STARTED  STARTED  STOPPED   STOPPED       0,256    1,256
COMMENT THIS CHECKS ALL TAPE DRIVES:
SCF STATUS TAPE $*
STORAGE - Status TAPE $TAPE1
LDev  State    SubState  Primary  Backup  DeviceStatus
                         PID      PID
  48  STARTED            0,274
STORAGE - Status TAPE $TAPE0
LDev  State    SubState  Primary  Backup  DeviceStatus
                         PID      PID
  49  STARTED            0,273
COMMENT THIS CHECKS THE SPOOLER PRINT DEVICES:
SPOOLCOM DEV
DEVICE  STATE    FLAGS  PROC   FORM
$LINE1  WAITING  H      $SPLX
$LINE2  WAITING  H      $SPLX
$LINE3  WAITING  H      $SPLX
$LASER  WAITING  H      $SPLP
COMMENT THIS CHECKS ALL SACS:
SCF STATUS SAC $*
SLSA Status SAC
Name            Owner  State
$ZZLAN.E4SA1.0  1      STARTED
$ZZLAN.E4SA1.1  0      STARTED
$ZZLAN.E4SA1.2  0      STARTED
$ZZLAN.E4SA1.3  1      STARTED
COMMENT THIS CHECKS ALL ADAPTERS
SCF STATUS ADAPTER $*
SLSA Status ADAPTER
Name          State
$ZZLAN.MIOE0  STARTED
$ZZLAN.E4SA0  STARTED
$ZZLAN.MIOE1  STARTED
$ZZLAN.E4SA2  STARTED
COMMENT THIS CHECKS ALL LIFS
SCF STATUS LIF $*
SLSA Status LIF
Name         State    Access State
$ZZLAN.LAN0  STARTED  UP
$ZZLAN.LAN3  STARTED  DOWN
COMMENT THIS CHECKS ALL PIFS
SCF STATUS PIF $*
SLSA Status PIF
Name              State
$ZZLAN.E4SA0.0.A  STARTED
$ZZLAN.E4SA0.0.B  STARTED
$ZZLAN.E4SA0.1.A  STOPPED
$ZZLAN.E4SA0.1.B  STARTED
COMMENT THIS CHECKS THE LINE HANDLERS:
SCF STATUS LINE $*
COMMENT THIS CHECKS THE STATUS OF TMF:
TMFCOM;STATUS TMF
TMF Status:
System: \SAGE, Time: 12-Jul-1994 14:05:00
State: Started
Transaction Rate: 0.25 TPS
AuditTrail Status:
Master:
Active audit trail capacity used: 68%
First pinned file: $TMF1.ZTMFAT.AA000044
Reason: Active transactions(s).
Current file: $TMF1.ZTMFAT.AA000045
AuditDump Status:
Master: State: enabled, Status: active, Process $X545,
File: $TMF2.ZTMFAT.AA000042
BeginTrans Status: Enabled
Catalog Status:
Status: Up
Processes Status:
Dump Files:
#0: State: InProgress
COMMENT THIS CHECKS THE STATUS OF PATHWAY:
PATHCOM $ZVPT;STATUS PATHWAY;STATUS PATHMON
PATHWAY -- STATE=RUNNING
               RUNNING
EXTERNALTCPS   0
LINKMONS       0
PATHCOMS       1
SPI            0
                                         FREEZE
               RUNNING  STOPPED  THAWED  FROZEN  PENDING
SERVERCLASSES  17       0        17      0       0
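Output from a command file such as SYSCHK can run to several screens, so it can help to pick out the lines that deserve a closer look. The following Python sketch assumes that the output has been captured to text and copied to a workstation. It groups lines that contain the words STOPPED or DOWN under the preceding COMMENT line. The function name flag_lines and the choice of watched words are assumptions made for illustration only.

```python
# Excerpt of captured SYSCHK output, taken from Example 3-3.
SAMPLE_OUTPUT = "\n".join([
    "COMMENT THIS CHECKS ALL DISKS:",
    "SCF STATUS DISK $*",
    "STORAGE - Status DISK \\SHARK.$SYSTEM",
    "6 *STARTED STARTED STOPPED STOPPED 0,256 1,256",
    "COMMENT THIS CHECKS ALL LIFS",
    "SCF STATUS LIF $*",
    "$ZZLAN.LAN0 STARTED UP",
    "$ZZLAN.LAN3 STARTED DOWN",
])

# Assumption: words that mark a line as worth reviewing.
WATCH = ("STOPPED", "DOWN")


def flag_lines(text, watch=WATCH):
    """Group lines containing a watched word under the preceding COMMENT line."""
    findings = {}
    section = "(no COMMENT seen yet)"
    for line in text.splitlines():
        stripped = line.strip()
        if stripped.startswith("COMMENT"):
            section = stripped
            continue
        # Drop the '*' that marks the active path so '*STARTED' reads as a word.
        words = stripped.replace("*", " ").split()
        if any(word in words for word in watch):
            findings.setdefault(section, []).append(stripped)
    return findings


if __name__ == "__main__":
    for section, lines in flag_lines(SAMPLE_OUTPUT).items():
        print(section)
        for flagged in lines:
            print("    " + flagged)
```

Whether a flagged line indicates a problem depends on your configuration. Column headings that contain a watched word, such as the server class summary in the PATHCOM output, are flagged as well, so review the results rather than treating them as a list of failures.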
Using the Status LEDs to Monitor the System
Status LEDs on the various enclosures and system components light during certain
operations, such as the series of power-on self-tests (POSTs) that the system
performs when a server is first powered on. Table 3-4 lists some of the status
light-emitting diodes (LEDs) and their functions.
Table 3-4. Status LEDs and Their Functions
Location          LED Name        Color       Function
Disk drive        Power-on        Green       Lights when the disk drive is receiving
                                              power.
                  Activity        Yellow or   Lights when the disk drive is executing a
                                  amber       read or write command.
Disk drive, fibre Drive Ready     Green       Flashes when drive is starting. (At the
channel           (top green)                 same time, the middle green light is lit and
                                              the bottom amber light is lit.)
                  Drive Online    Green       Flashes when drive is operational and
                  (middle green)              performing a locate function.
                  Drive Failure   Amber       Flashes when drive is inactive or in error
                  (bottom amber)              condition. When this occurs, verify the
                                              loop and replace the drive, if necessary.
                  All                         If all lights are on and none are flashing,
                                              the drive is not operational. Perform the
                                              following actions:
                                              1. Check FCSA. Replace if defective.
                                              2. Check FC-AL I/O module. Replace if
                                                 defective.
                                              3. Replace drive.
EMU               Heartbeat       Left Green  Flashes when EMU is operational and
                                              performing locate. Power might just have
                                              been applied to the EMU, or an enclosure
                                              fault might exist.
                                              On when an EMU fault exists that is not
                                              an enclosure fault.
                                              Off when an EMU fault exists, which could
                                              be or might not be an enclosure fault.
                  Power           Middle      Flashes when EMU is operational and
                                  Green       performing locate.
                                              On when EMU is operational. An EMU or
                                              an enclosure fault might still exist.
                                              Off when power has just been applied to
                                              an enclosure, or when an enclosure fault
                                              exists.
                  Enclosure       Amber       Flashes when EMU is operational and
                  Status                      performing locate.
                                              On when EMU is operational, but an
                                              enclosure fault exists.
                                              Off when EMU is operational, or power
                                              has just been applied to an enclosure, or
                                              when an EMU fault exists that is not an
                                              enclosure fault, or when an enclosure fault
                                              exists.
FC-AL I/O         Power-on        Middle      Lights when power is on and module is
Module                            Green       available for normal operation. If light is
                                              off, the module is nonoperational: check
                                              FCSAs, cables, and power supplies.
                  Port 1          Bottom      Lights when carrier on Port 1 is
                                  Green       operational.
                  Port 2          Top Green   Lights when carrier on Port 2 is
                                              operational.
Fibre Channel     Power-on        Green       Lights when the adapter is receiving
ServerNet                                     power.
adapter (FCSA)    Service         Amber       Lights to indicate internal failure or
                                              service action required.
Gigabit Ethernet  Power-on        Green       Lights when the adapter is receiving
4-port ServerNet                              power.
adapter (G4SA)    Service         Amber       Lights to indicate internal failure or
                                              service action required.
LSU I/O PIC       Power-on        Green       Lights when power is on and adapter is
                                              available for normal operation.
                  Service         Amber       Lights when a POST is in progress, board
                                              is being reset, or a fault exists.
LSU optics        Power-on        Green       Lights when NonStop Blade Element optic
adapter                                       or ServerNet link is functional.
connector
LSU logic board   Power-on        Green       Lights when power is on and adapter is
                                              available for normal operation.
                  Service         Amber       Lights when a POST is in progress, board
                                              is being reset, or a fault exists.
NonStop Blade     Power-on        Flashing    Lights when power is on and Blade Element
Element                           Green       is available for normal operation.
                                  Flashing    Lights when Blade Element is in low
                                  Yellow      power mode.
                  Service         Steady      Lights when a hardware or software fault
                                  Amber       exists.
                  Locator         Flashing    Lights when the system locator is
                                  Blue        activated.
P-switch PICs     Power-on        Green       Lights when power is on with PIC available
                                              for normal operation.
                                  Amber       Lights when a fault exists.
P-switch PIC      Power-on        Green       Lights when a ServerNet link is functional.
ServerNet
connector
Related Reading
For more information about monitoring, see the documentation listed in Table 3-5.
Table 3-5. Related Reading for Monitoring
Task                        Tool               For information, see...
Monitoring system           OSM Service        OSM online help
hardware, including         Connection         OSM Service Connection User’s Guide
locating failed or failing
FRUs
Using SCF, its              SCF interface to   SCF Reference Manual for H-Series RVUs
commands and options,       subsystems         SCF Reference Manual for the Storage
and device types and                           Subsystem
subtypes
Monitoring clustered        OSM Service        ServerNet Cluster 6780 Operations Guide
servers                     Connection         ServerNet Cluster Manual
4
Monitoring EMS Event Messages
When to Use This Section on page 4-1
What Is the Event Management Service (EMS)? on page 4-1
Tools for Monitoring EMS Event Messages on page 4-1
OSM Event Viewer on page 4-2
EMSDIST on page 4-2
ViewPoint on page 4-2
Web ViewPoint on page 4-2
Related Reading on page 4-2
When to Use This Section
Use this section for a brief description of the Event Management Service (EMS) and
the tools used to monitor EMS event messages.
What Is the Event Management Service (EMS)?
The Event Management Service (EMS) is a collection of processes, tools, and
interfaces that support the reporting and retrieval of event information. Information
retrieved from EMS can help you to:
• Monitor your system or network environment
• Analyze circumstances that led up to a problem
• Detect failure patterns
• Adjust for changes in the run-time environment
• Recognize and handle critical problems
• Perform many other tasks required to maintain a productive computing operation
Tools for Monitoring EMS Event Messages
To view EMS event messages for an Integrity NonStop server, use one of these tools:
• OSM Event Viewer
• EMSDIST
• ViewPoint
• Web ViewPoint
OSM Event Viewer
The OSM Event Viewer is a browser-based event viewer. The OSM Event Viewer
allows you to retrieve and view events from any EMS formatted log files ($0, $ZLOG,
or an alternate collector) for rapid assessment of operating system problems.
To access the OSM Event Viewer, refer to Launching OSM Applications on page 1-11.
For details on how to use the OSM Event Viewer, refer to the online help.
EMSDIST
The EMSDIST program is the object program for a printing, forwarding, or consumer
distributor, any of which you can start with a TACL RUN command. This guide does
not describe using EMSDIST. For more information, see the Guardian User’s Guide.
ViewPoint
ViewPoint displays event messages about current or past events occurring anywhere
in the network on a set of block-mode events screens. The messages can be errors,
failures, warnings, and requests for operator actions. The events screens allow
operators to monitor significant occurrences or problems in the network as they occur.
Critical events or events requiring immediate action are highlighted.
Web ViewPoint
Web ViewPoint, a browser-based product, accesses the Event Viewer, Object
Manager, and Performance Monitor subsystems. Web ViewPoint monitors and
displays EMS events; identifies and lists all supported subsystems; manages NonStop
server subsystems and user applications in a secure, automated, and customizable
way; monitors and graphs performance attributes and trends; investigates and displays
most active system processes; and offers simple navigation and a point-and-click
command interface.
Related Reading
For more information about monitoring EMS event messages, see the documentation
in Table 4-1.
Table 4-1. Related Reading for Monitoring EMS Event Messages
Task                Tool              For information, see...
Viewing event logs  EMSDIST           Guardian User’s Guide
                    ViewPoint         ViewPoint Manual
                    OSM Event Viewer  OSM Event Viewer online help
5
Processes: Monitoring and Recovery
When to Use This Section on page 5-1
Types of Processes on page 5-1
System Processes on page 5-1
I/O Processes (IOPs) on page 5-2
Generic Processes on page 5-2
Monitoring Processes on page 5-3
Monitoring System Processes on page 5-3
Monitoring IOPs on page 5-4
Monitoring Generic Processes on page 5-4
Recovery Operations for Processes on page 5-6
Related Reading on page 5-6
When to Use This Section
This section provides basic information about the different types of processes for
Integrity NonStop servers. It gives a brief example of monitoring each type of process
and provides information about the commands available for recovery operations.
Types of Processes
Three types of processes are of major concern to a system operator of an Integrity
NonStop NS-series server:
• System processes
• I/O processes (IOPs)
• Generic processes
System Processes
A system process is a privileged process that is created during system load and exists
continuously for a given configuration for as long as the processor remains operable.
Examples of system processes include the memory manager, the monitor, and the I/O
control processes.
I/O Processes (IOPs)
An I/O process (IOP) is a system process that manages communications between a
processor and I/O devices. IOPs are often configured as fault-tolerant process pairs,
and they typically control one or more I/O devices or communications lines. Each IOP
is configured in a maximum of two processors, typically a primary processor and a
backup processor.
An IOP provides an application program interface (API) that allows access to an I/O
interface. A wide area network (WAN) communications line is an example of an I/O
interface. IOPs configured using the SCF interface to the WAN subsystem manage the
input and output functions for the ServerNet wide area network (SWAN) concentrator.
Examples of IOPs include, but are not limited to, line-handler processes for Expand
and other communications subsystems.
Generic Processes
Generic processes are configured by the SCF interface to the Kernel subsystem. They
can be configured in one or more processors. Although sometimes called
system-managed processes, generic processes can be either system processes or
user-created processes. Any process that can be started from a TACL prompt can be
configured as a generic process. Generic processes can be configured to have
persistence; that is, to automatically restart if stopped abnormally.
Examples of generic processes:
• The $ZZKRN Kernel subsystem manager process
• Other generic processes controlled by $ZZKRN; for example:
  ° The $ZZSTO storage subsystem manager process
  ° The $ZZWAN wide area network (WAN) subsystem manager process
  ° QIO processes
  ° OSM server processes
  ° The $ZZLAN ServerNet LAN Systems Access (SLSA) subsystem manager
    process
  ° The $FCSMON fibre channel storage monitor
For more information, refer to the SCF Reference Manual for the Kernel Subsystem.
Monitoring Processes
This subsection briefly provides examples of some of the tools available to monitor
processes. For some processes, such as IOPs, monitoring is more fully discussed in
other manuals. In general, use this method to monitor processes:
1. Develop a list of processes that are crucial to the operation of your system.
2. Determine how each of these processes is configured.
3. Use the appropriate tool to monitor the process.
Monitoring System Processes
Check that the system processes are up and running in the processors as you
intended. At a TACL prompt:
> STATUS *
This example shows partial output produced by the TACL STATUS * command:
$SYSTEM STARTUP 2> status *
Process Pri PFR %WT Userid Program file Hometerm
0,0 201 P R 000 255,255 $SYSTEM.SYS14.NMONTOR $YMIOP.#CLCI
0,1 210 P 040 255,255 $SYSTEM.SYS14.NMEMMAN $YMIOP.#CLCI
0,2 210 P 051 255,255 $SYSTEM.SYS14.NMSNGERR $YMIOP.#CLCI
$0 0,3 201 P 011 255,255 $SYSTEM.SYS14.OPCOLL $YMIOP.#CLCI
0,4 211 P 017 255,255 $SYSTEM.SYS14.TMFMON $YMIOP.#CLCI
$YMIOP 0,5 205 P 251 255,255 $SYSTEM.SYS14.TMIOP $YMIOP.#CLCI
$ZNUP 0,6 200 P 015 255,255 $SYSTEM.SYS14.NZNUP $YMIOP.#CLCI
$Z0 0,7 200 P 015 255,255 $SYSTEM.SYS14.OCDIST $YMIOP.#CLCI
$ZOPR 0,8 201 P 011 255,255 $SYSTEM.SYS14.OAUX $YMIOP.#CLCI
$ZCNF 0,9 200 P 001 255,255 $SYSTEM.SYS14.TZCNF $YMIOP.#CLCI
$ZTM00 0,11 200 P 017 255,255 $SYSTEM.SYS14.TMFMON2 $YMIOP.#CLCI
$TMP 0,12 204 P 005 255,255 $SYSTEM.SYS14.TMFTMP $YMIOP.#CLCI
$ZL00 0,13 200 P 001 255,255 $SYSTEM.SYS14.ROUT $ZHOME
$NCP 0,14 199 P 011 255,255 $SYSTEM.SYS14.NCPOBJ $ZHOME
$ZEXP 0,15 150 P 001 255,255 $SYSTEM.SYS14.OZEXP $ZHOME
$CLCI 0,34 199 000 0,0 $SYSTEM.SYS14.TACL $YMIOP.#CLCI
$TRAK 0,40 146 000 255,255 $SYSTEM.SYSTOOLS.QATRACK $ZHOME
$Z00Y 0,43 150 015 255,255 $SYSTEM.SYS14.FDIST $ZHOME
$NULL B 0,45 147 001 255,255 $SYSTEM.SYSTEM.NULL $Z01J
$ZNET 0,64 175 P 011 255,255 $SYSTEM.SYS14.SCP $ZHOME
$Z1RL 0,249 148 R 000 98,98 $SYSTEM.SYS14.TACL $ZTNT.#PTBY5D
$SYSTEM 0,257 220 P 317 255,255 $SYSTEM.SYS14.TSYSDP2 $YMIOP.#CLCI
$ZHOME 0,292 199 P 001 255,255 $SYSTEM.SYS14.ZHOME $YMIOP.#CLCI
$ZM00 0,294 201 P 015 255,255 $SYSTEM.SYS14.QIOMON $ZHOME
$ZZWAN 0,295 180 011 255,255 $SYSTEM.SYS14.WANMGR $ZHOME
$ZZSTO 0,296 180 P 011 255,255 $SYSTEM.SYS14.TZSTO $ZHOME
$ZZLAN 0,297 199 P 015 255,255 $SYSTEM.SYS14.LANMAN $ZHOME
$ZZKRN 0,298 180 P 011 255,255 $SYSTEM.SYS14.OZKRN $ZHOME
$Z000 0,299 180 P 011 255,255 $SYSTEM.SYS14.TZSTOSRV $ZHOME
$ZLM00 0,300 200 P 015 255,255 $SYSTEM.SYS14.LANMON $ZHOME
$IXPOHO 0,301 199 P 355 255,255 $SYSTEM.SYS14.LHOBJ $ZHOME
$ZTXAE 0,330 145 015 255,255 $SYSTEM.SYS14.SNMPTMUX $ZHOME
$ZWBAF 0,333 179 P 015 255,255 $SYSTEM.SYS14.WANBOOT $ZHOME
$ZZW00 0,334 199 P 215 255,255 $SYSTEM.SYS14.CONMGR $ZHOME
$DSMSCM 0,335 220 P 317 255,255 $SYSTEM.SYS14.TSYSDP2 $ZHOME
$DATA2 0,336 220 P 317 255,255 $SYSTEM.SYS14.TSYSDP2 $ZHOME
$ZLOG 0,340 150 011 255,255 $SYSTEM.SYS14.EMSACOLL $ZHOME
$ZTH00 0,343 148 P 005 255,255 $SYSTEM.SYS14.TFDSHLP $YMIOP.#CLCI
$DSMSCM 0,344 220 P 317 255,255 $SYSTEM.SYS14.TSYSDP2 $ZHOME
$Z1RM 1,80 148 005 255,255 $SYSTEM.SYS14.TACL $ZTNT.#PTBY5D
$ZPP01 1,280 160 P 015 255,255 $SYSTEM.SYS14.OSSPS $YMIOP.#CLCI
$ZLM01 1,342 200 P 015 255,255 $SYSTEM.SYS14.LANMON $ZHOME
$ZTC0 B 1,352 200 P 011 255,255 $SYSTEM.SYS14.TCPIP $ZHOME
$ZTNT B 1,355 149 001 255,255 $SYSTEM.SYS14.TELSERV $ZHOME
$ZPORT B 1,357 149 001 255,255 $SYSTEM.SYS14.LISTNER $ZHOME
$KLA9E 1,424 147 001 255,255 $DATA2.KMZTT.LOGGER $ZTNT.#PTBY5D
$ZTM02 2,5 200 P 017 255,255 $SYSTEM.SYS14.TMFMON2 $YMIOP.#CLCI
$GRD2 2,243 147 P 001 255,255 $DATA2.QA9050.RUNNER $ZTNT.#PTBY5CV
$ZP02A B 2,300 195 001 255,255 $SYSTEM.ZRPC.PORTMAP $ZHOME
$ZCMOM B 2,303 150 001 255,255 $SYSTEM.SYS14.CIMOM $ZHOME
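If you keep a list of the processes that are crucial to your system, you can compare it with captured STATUS * output. The following Python sketch assumes that the output has been saved as text. It treats a line as a named process row when the line starts with a process name and is followed by a cpu,pin pair, optionally after a one-word flag such as B. The function names and the list of required names are hypothetical.

```python
import re

# Excerpt of the TACL STATUS * output shown above.
SAMPLE_STATUS = "\n".join([
    "$SYSTEM STARTUP 2> status *",
    "Process Pri PFR %WT Userid Program file Hometerm",
    "0,0 201 P R 000 255,255 $SYSTEM.SYS14.NMONTOR $YMIOP.#CLCI",
    "$0 0,3 201 P 011 255,255 $SYSTEM.SYS14.OPCOLL $YMIOP.#CLCI",
    "$NULL B 0,45 147 001 255,255 $SYSTEM.SYSTEM.NULL $Z01J",
    "$ZZKRN 0,298 180 P 011 255,255 $SYSTEM.SYS14.OZKRN $ZHOME",
])

CPU_PIN = re.compile(r"^\d+,\d+$")


def running_process_names(text):
    """Return the set of process names that appear in STATUS * rows."""
    names = set()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 2 or not fields[0].startswith("$"):
            continue
        # The cpu,pin pair follows the name, optionally after a flag such as B.
        if any(CPU_PIN.match(field) for field in fields[1:3]):
            names.add(fields[0])
    return names


def missing_processes(text, required):
    """Return the required process names that do not appear in the output."""
    return sorted(set(required) - running_process_names(text))


if __name__ == "__main__":
    # Hypothetical list of crucial processes for one site.
    print(missing_processes(SAMPLE_STATUS, ["$0", "$ZZKRN", "$ZZSTO"]))
```

Unnamed processes, such as the first row of the sample, are skipped because they cannot be matched by name.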
Monitoring IOPs
For more information about monitoring I/O processes (IOPs), refer to the WAN
Subsystem Configuration and Management Manual, the SWAN Concentrator and
WAN Subsystem Troubleshooting Guide, and the Expand Configuration and
Management Manual.
Monitoring Generic Processes
Because generic processes are configured using the SCF interface to the Kernel
subsystem, you specify the $ZZKRN Kernel subsystem manager process when
monitoring a generic process. These SCF commands are available for monitoring
$ZZKRN and other generic processes:
INFO      Displays configuration information for the specified objects
NAMES     Displays a list of subordinate object types and names for the
          specified objects
STATUS    Displays current status information about the specified objects
Monitoring the Status of $ZZKRN
To monitor the status of the $ZZKRN Kernel subsystem manager process, at a TACL
prompt:
> SCF STATUS SUBSYS $ZZKRN
This example shows the output produced by this command:
1 -> STATUS SUBSYS $ZZKRN
NONSTOP KERNEL - Status SUBSYS \COMM.$ZZKRN
Name State Processes
(conf/strd)
\COMM.$ZZKRN STARTED ( 25/22 )
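The Processes column can be checked mechanically. The following Python sketch assumes that the conf/strd heading stands for configured and started process counts, which is a reading of the heading rather than a documented definition. It extracts the two numbers from a captured status line and reports the difference. The function names are hypothetical.

```python
import re

# Matches a pair such as "( 25/22 )" in the Processes column.
COUNTS = re.compile(r"\(\s*(\d+)\s*/\s*(\d+)\s*\)")


def process_counts(status_line):
    """Return (configured, started) parsed from a STATUS SUBSYS data line."""
    match = COUNTS.search(status_line)
    if match is None:
        raise ValueError("no conf/strd counts found in: " + status_line)
    return int(match.group(1)), int(match.group(2))


def not_started_count(status_line):
    """Return how many configured processes are not reported as started."""
    configured, started = process_counts(status_line)
    return configured - started


if __name__ == "__main__":
    line = "\\COMM.$ZZKRN STARTED ( 25/22 )"
    print(process_counts(line), not_started_count(line))
```

A nonzero difference is not necessarily a fault, because some generic processes are expected to stop after they have run.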
Monitoring the Status of All Generic Processes
To monitor the status of all generic processes controlled by $ZZKRN, at a TACL
prompt:
> SCF STATUS PROCESS $ZZKRN.#*
This example shows the output produced by this command:
1-> STATUS PROCESS $ZZKRN.#*
NONSTOP KERNEL - Status PROCESS \DRP25.$ZZKRN.#CLCI-TACL
Symbolic Name Name State Sub Primary Backup Owner
PID PID ID
CLCI-TACL $CLCI STOPPED None None
MSGMON $ZIM00 STARTED 0 ,306 None 255,255
MSGMON $ZIM01 STARTED 1 ,291 None 255,255
MSGMON $ZIM02 STARTED 2 ,285 None 255,255
MSGMON $ZIM03 STARTED 3 ,280 None 255,255
MSGMON $ZIM04 STARTED 4 ,280 None 255,255
MSGMON $ZIM05 STARTED 5 ,280 None 255,255
MSGMON $ZIM06 STARTED 6 ,280 None 255,255
MSGMON $ZIM07 STARTED 7 ,280 None 255,255
MSGMON $ZIM08 STARTED 8 ,280 None 255,255
MSGMON $ZIM09 STARTED 9 ,280 None 255,255
MSGMON $ZIM10 STARTED 10,280 None 255,255
MSGMON $ZIM11 STOPPED None None
MSGMON $ZIM12 STOPPED None None
MSGMON $ZIM13 STOPPED None None
MSGMON $ZIM14 STOPPED None None
MSGMON $ZIM15 STOPPED None None
OSM-APPSRVR $ZOSM STARTED 2 ,292 None 255,255
OSM-CIMOM $ZCMOM STARTED 2 ,294 3 ,288 255,255
OSM-CONFLH-RD $ZOLHI STOPPED None None
OSM-OEV $ZOEV STARTED 2 ,290 None 255,255
QATRAK $TRAK STARTED 0 ,17 None 255,255
QIOMON $ZM00 STARTED 0 ,290 None 255,255
QIOMON $ZM01 STARTED 1 ,280 None 255,255
QIOMON $ZM02 STARTED 2 ,280 None 255,255
QIOMON $ZM03 STARTED 3 ,279 None 255,255
QIOMON $ZM04 STARTED 4 ,279 None 255,255
QIOMON $ZM05 STARTED 5 ,279 None 255,255
QIOMON $ZM06 STARTED 6 ,279 None 255,255
QIOMON $ZM07 STARTED 7 ,279 None 255,255
QIOMON $ZM08 STARTED 8 ,279 None 255,255
QIOMON $ZM09 STARTED 9 ,279 None 255,255
QIOMON $ZM10 STARTED 10,279 None 255,255
QIOMON $ZM11 STOPPED None None
QIOMON $ZM12 STOPPED None None
QIOMON $ZM13 STOPPED None None
QIOMON $ZM14 STOPPED None None
QIOMON $ZM15 STOPPED None None
RTACL $RTACL STOPPED None None
SCP $ZNET STARTED 0 ,14 1 ,13 255,255
SP-EVENT $ZSPE STARTED 0 ,309 None 255,255
TFDSHLP $ZTH00 STARTED 0 ,310 None 255,255
TFDSHLP $ZTH01 STARTED 1 ,292 None 255,255
TFDSHLP $ZTH02 STARTED 2 ,286 None 255,255
TFDSHLP $ZTH03 STARTED 3 ,281 None 255,255
TFDSHLP $ZTH04 STARTED 4 ,281 None 255,255
TFDSHLP $ZTH05 STARTED 5 ,281 None 255,255
TFDSHLP $ZTH06 STARTED 6 ,281 None 255,255
TFDSHLP $ZTH07 STARTED 7 ,281 None 255,255
TFDSHLP $ZTH08 STARTED 8 ,281 None 255,255
TFDSHLP $ZTH09 STARTED 9 ,281 None 255,255
TFDSHLP $ZTH10 STARTED 10,281 None 255,255
TFDSHLP $ZTH11 STOPPED None None
TFDSHLP $ZTH12 STOPPED None None
TFDSHLP $ZTH13 STOPPED None None
TFDSHLP $ZTH14 STOPPED None None
TFDSHLP $ZTH15 STOPPED None None
ZEXP $ZEXP STARTED 0 ,13 1 ,15 255,255
ZHOME $ZHOME STARTED 0 ,289 1 ,295 255,255
ZLOG $ZLOG STARTED 0 ,308 1 ,329 255,255
ZZKRN $ZZKRN STARTED 0 ,293 1 ,319 255,255
ZZLAN $ZZLAN STARTED 0 ,292 1 ,297 255,255
ZZSCL $ZZSCL STARTED 1 ,290 2 ,279 255,255
ZZSMN $ZZSMN STARTED 1 ,289 2 ,282 255,255
ZZSTO $ZZSTO STARTED 0 ,291 1 ,320 255,255
ZZWAN $ZZWAN STARTED 2 ,296 3 ,289 255,255
In nearly all circumstances, items that are essential to system operations, and that
must be running at all times, restart automatically if they are stopped for any
reason while the NonStop Kernel operating system is running.
Some OSM processes stop after executing a macro that runs during system load or
during the reload of processor 0 or 1. Those processes include $ZOLHI.
Optionally, you can also configure other processes such as the Expand subsystem
manager process, $ZEXP, and the Safeguard monitor process, $ZSMP, as generic
processes.
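If you capture a generic-process status listing like the one above to a text file, a short script can pick out the entries that are not in the STARTED state. The following Python sketch is illustrative only: the function name is hypothetical, the capture step is assumed to have been done separately, and the column layout (symbolic name, process name, state) is taken from the sample listing in this guide. A STOPPED state is not necessarily a fault; for example, some processes stop after system load, and monitor processes for processors that are not present remain stopped.

```python
def not_started_processes(listing: str) -> list[str]:
    """Return the process names whose state column is not STARTED.

    Assumes each data row looks like the sample listing in this guide:
    symbolic name, process name (starting with $), state, then PID columns.
    """
    result = []
    for line in listing.splitlines():
        fields = line.split()
        # Skip headings, prompts, and blank lines.
        if len(fields) < 3 or not fields[1].startswith("$"):
            continue
        if fields[2] != "STARTED":
            result.append(fields[1])
    return result


sample = """\
QIOMON $ZM10 STARTED 10,279 None 255,255
QIOMON $ZM11 STOPPED None None
ZZWAN $ZZWAN STARTED 2 ,296 3 ,289 255,255
"""
print(not_started_processes(sample))
```

Compare the resulting names against your recorded system configuration before treating any of them as a problem.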
Recovery Operations for Processes
For recovery operations on generic processes, use the SCF interface to the Kernel
subsystem and specify the PROCESS object. These SCF commands are available for
controlling generic processes:
ABORT   Terminates operation of a generic process. This command is not
        supported for the subsystem manager processes.
START   Initiates the operation of a generic process.
Generic processes that are configured to be persistent usually do not require operator
intervention for recovery. In most circumstances, persistent generic processes restart
automatically.
For recovery operations on IOPs, refer to the WAN Subsystem Configuration and
Management Manual, the SWAN Concentrator and WAN Subsystem Troubleshooting
Guide, and the Expand Configuration and Management Manual.
For recovery operations on system processes, refer to the Guardian User’s Guide.
Related Reading
For more information about generic processes and the SCF interface to the Kernel
subsystem, refer to the SCF Reference Manual for the Kernel Subsystem.
For more information about IOPs, refer to the WAN Subsystem Configuration and
Management Manual, the SWAN Concentrator and WAN Subsystem Troubleshooting
Guide, and the Expand Configuration and Management Manual.
6
Communications Subsystems:
Monitoring and Recovery
When to Use This Section on page 6-1
Communications Subsystems on page 6-1
Local Area Networks (LANs) and Wide Area Networks (WANs) on page 6-2
Monitoring Communications Subsystems and Their Objects on page 6-4
Monitoring the SLSA Subsystem on page 6-4
Monitoring the WAN Subsystem on page 6-6
Monitoring the NonStop TCP/IP Subsystem on page 6-9
Monitoring Line-Handler Process Status on page 6-10
Tracing a Communications Line on page 6-12
Recovery Operations for Communications Subsystems on page 6-13
Related Reading on page 6-13
When to Use This Section
Use this section to determine where to find more information about monitoring and
recovery operations for communications devices such as ServerNet adapters, printers,
and spoolers; communications lines; and communications processes such as WAN
IOPs.
Communications Subsystems
The software that provides users of Integrity NonStop systems with access to a set of
communications services is called a communications subsystem. Because connectivity
is an important part of online transaction processing (OLTP), HP offers a variety of
communications products that support a wide range of applications.
Communication between specific devices or networks is typically achieved using
several communications products or subsystems. These products are related as
components in a layered structure. To accomplish the required connection, higher-level
components (for example, NonStop TCP/IP processes) use the services of lower-level
components such as the ServerNet LAN Systems Access (SLSA) subsystem.
The same higher-level component can often use any of several lower-level
components; thus, the Expand subsystem—which consists of multiple processes on a
node—can use the NonStop TCP/IP subsystem, the X.25 Access Method (X.25 AM),
or other communication interface options to provide data transmissions over local area
networks (LANs) or wide area networks (WANs), respectively. Similarly, multiple
higher-level components can use the services of a single lower-level component.
Local Area Networks (LANs) and Wide Area Networks (WANs)
Two important communications interfaces for LANs and WANs on Integrity NonStop
servers are the SLSA subsystem and the WAN subsystem.
The SLSA subsystem supports parallel LAN I/O operations, allowing Integrity NonStop
NS-series servers to communicate across the ServerNet fabrics and access Ethernet
devices through various LAN protocols. SLSA also communicates with the appropriate
adapter type over the ServerNet fabrics. Adapters supported on Integrity NonStop
systems include:
• Gigabit Ethernet 4-port adapter (G4SA)
• Fibre Channel ServerNet adapter (FCSA) (for the Storage subsystem)
I/O adapter module (IOAM) enclosures enable I/O operations to take place between
Integrity NonStop servers and some Fibre Channel storage devices. See the Modular I/O Installation and Configuration Guide for more information.
Adapters supported on NonStop S-series servers that can be accessed through
Expand-over-IP include:
• ATM 3 ServerNet adapter (ATM3SA)
• Ethernet 4 ServerNet adapter (E4SA)
• Fast Ethernet ServerNet adapter (FESA)
• Gigabit Ethernet ServerNet adapter (GESA)
• Gigabit Ethernet 4-Port ServerNet adapter (G4SA)
• Multifunction I/O board (MFIOB) in the processor multifunction (PMF)
  customer-replaceable unit (CRU) and I/O multifunction (IOMF) CRU
• Token-Ring ServerNet adapter (TRSA)
For further information, refer to the Introduction to Networking for NonStop NS-Series
Servers.
In addition to the adapters, the SLSA subsystem supports these objects:
• Processes
• Monitors
• ServerNet addressable controllers (SACs)
• Logical interfaces (LIFs)
• Filters
• Physical interfaces (PIFs)
Processes that use the SLSA subsystem to send and receive data on a LAN attached
to an Integrity NonStop server are called LAN service providers. Two service
providers—the NonStop TCP/IP and NonStop TCP/IPv6 subsystems and the Port
Access Method (PAM)—are currently supported. They provide access for these
subsystems:
LAN Service Provider          Subsystems Supported
NonStop TCP/IP subsystem,     The Expand subsystem, which provides
NonStop TCP/IPv6 subsystem    Expand-over-IP connections.
Port Access Method (PAM)      Ethernet and token-ring LANs. The OSI/AS,
                              OSI/TS, SNAX/XF, and SNAX/APN subsystems
                              communicate with SLSA through the PAM
                              subsystem.
Processes, user applications, and subsystems that use the SLSA subsystem and
related LAN providers to connect to an FCSA or G4SA attached to an Integrity
NonStop NS-series server are called LAN clients. For example, the WAN subsystem is
a client of the SLSA subsystem because the SLSA subsystem provides the WAN
subsystem access to the ServerNet wide area network (SWAN) concentrator through
the LAN.
The WAN subsystem is used to control access to the SWAN concentrator. Depending
on your configuration, it can be used to configure and manage both WAN and LAN
connectivity for these communication subsystem objects:
Object                              Connectivity By
AM3270                              Line-handler processes
Asynchronous Terminal Process       Line-handler processes
6100 (ATP6100)
Communications Process              Line-handler processes
subsystem (CP6100)
EnvoyACP/XF                         Line-handler processes
Envoy subsystem                     Line-handler processes
Expand                              Subsystem network control process and
                                    line-handler processes
ServerNet cluster                   Line-handler processes
(Expand-over-ServerNet)
SNAX/APN                            Subsystem service manager process and
                                    line-handler processes
SNAX/XF                             Subsystem service manager process and
                                    line-handler processes
TR3271                              Line-handler processes
X25AM                               Line-handler processes
You can define these communications subsystem objects as WAN subsystem devices.
Monitoring Communications Subsystems and
Their Objects
Monitoring and recovery operations for communications subsystems can be complex.
An error in any of the components—service providers, clients, objects, adapters,
processes, and so on—can generate multiple error messages from many
interdependent subsystems and processes. Analyzing and solving an error that
originates in an object controlled by a LAN or a WAN often requires that you
methodically gather status information about the affected services and then eliminate
objects that are working normally.
Detailed monitoring and recovery techniques for devices and processes related to
communications subsystems are described in the manuals for each
subsystem. For more information, refer to Related Reading on page 6-13.
This guide provides some basic commands you can use to identify and resolve
common problems. Your most powerful tool for monitoring and collecting information
about subsystem objects is the SCF facility. You can use SCF commands to get
information and status for subsystem objects by name, device type, or device subtype.
Subdevices are defined if a subsystem potentially operates on numerous, separately
addressable objects, such as stations on a multipoint line; the line is a device, and the
stations are subdevices.
For a list of subsystems with their device type numbers and device subtypes, see
Using SCF to Determine Your System Configuration on page 2-5.
Monitoring the SLSA Subsystem
This subsection describes how to obtain the status of adapters, SACs, LIFs, and PIFs.
For more information on the SLSA subsystem, refer to the LAN Configuration and Management Manual.
Monitoring the Status of an Adapter and Its Components
1. To monitor the status of an adapter:
> SCF STATUS ADAPTER adapter-name
A listing similar to this example is sent to your home terminal:
->STATUS ADAPTER $ZZLAN.G11123
SLSA Status ADAPTER
Name State
$ZZLAN.G11123 STARTED
This example shows the listing displayed when checking all adapters on $ZZLAN:
> SCF STATUS ADAPTER $ZZLAN.*
1->STATUS ADAPTER $ZZLAN.*
SLSA Status ADAPTER
Name State
$ZZLAN.G11121 STARTED
$ZZLAN.G11122 STARTED
$ZZLAN.G11123 STARTED
$ZZLAN.G11124 STARTED
$ZZLAN.G11125 STARTED
$ZZLAN.MIOE0 STARTED
$ZZLAN.MIOE1 STARTED
2. The SAC object corresponds directly to the hardware on an adapter. A SAC is a
component of an adapter and can support one or more PIFs. To monitor the status
of a SAC:
> SCF STATUS SAC sac-name
A listing similar to this example is sent to your home terminal:
1->STATUS SAC $ZZLAN.G11123.0
SLSA Status SAC
Name Owner State Trace Status
$ZZLAN.G11123.0 1 STARTED ON
This example shows a listing of the status of all SACs on $ZZLAN.G11123:
> SCF STATUS SAC $ZZLAN.G11123*
->STATUS SAC $ZZLAN.G11123*
SLSA Status SAC
Name Owner State Trace Status
$ZZLAN.G11123.0 1 STARTED ON
3. The PIF object corresponds directly to hardware on the adapter. A PIF is the
physical connection to the LAN. To monitor the status of a PIF:
> SCF STATUS PIF pif-name
A listing similar to this example is sent to your home terminal:
->STATUS PIF $ZZLAN.G11123.0
SLSA Status PIF
Name State Trace Status
$ZZLAN.G11123.0.A STARTED ON
This example shows a listing of the status of all PIFs on $ZZLAN.G11123:
> SCF STATUS PIF $ZZLAN.G11123.*
->STATUS PIF $ZZLAN.G11123.*
SLSA Status PIF
Name State Trace Status
$ZZLAN.G11123.0.A STARTED ON
$ZZLAN.G11123.0.B STARTED ON
$ZZLAN.G11123.0.C STOPPED OFF
$ZZLAN.G11123.0.D STARTED ON
4. The LIF provides an interface to the PIF. The LIF object corresponds to logical
processes that handle data transferred between the LAN and a system using the
ServerNet architecture. To monitor the status of a LIF:
> SCF STATUS LIF lif-name
A listing similar to this example is sent to your home terminal:
->STATUS LIF $ZZLAN.L11021A
SLSA Status LIF
Name State Access State
$ZZLAN.L11021A STARTED UP
This example shows a detailed listing of the status of the LIF on $ZZLAN.L11021A:
> SCF STATUS LIF $ZZLAN.L11021A , DETAIL
->STATUS LIF $ZZLAN.L11021A , DETAIL
SLSA Detailed Status LIF \SYS.$ZZLAN.L11021A
Access State............. UP
CPUs with Data Path...... ( 0, 1, 2 )
Potential Access CPUs.... ( 0, 1, 2, 3 )
State.................... STARTED
Trace Filename...........
Trace Status.............
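When you check many adapters, SACs, PIFs, or LIFs, scanning the tabular SLSA listings by eye can be slow. The following Python sketch shows one way to extract the State column from saved listing text. It assumes only the row layouts shown in the examples above (an object name beginning with $ZZLAN., an optional numeric Owner column, then the state); the function name is hypothetical and is not part of SCF.

```python
def slsa_object_states(listing: str) -> dict[str, str]:
    """Map each $ZZLAN object name in a STATUS listing to its state.

    The state is taken to be the first purely alphabetic column after the
    name, which skips the numeric Owner column in STATUS SAC listings.
    """
    states = {}
    for line in listing.splitlines():
        fields = line.split()
        # Data rows start with the object name; prompts and headings do not.
        if len(fields) < 2 or not fields[0].startswith("$ZZLAN."):
            continue
        state = next((f for f in fields[1:] if f.isalpha()), None)
        if state is not None:
            states[fields[0]] = state
    return states


sample = """\
->STATUS PIF $ZZLAN.G11123.*
SLSA Status PIF
Name State Trace Status
$ZZLAN.G11123.0.A STARTED ON
$ZZLAN.G11123.0.B STARTED ON
$ZZLAN.G11123.0.C STOPPED OFF
$ZZLAN.G11123.0.D STARTED ON
"""
states = slsa_object_states(sample)
not_started = [name for name, state in states.items() if state != "STARTED"]
print(not_started)
```

For the sample PIF listing, the only object reported is the stopped PIF, $ZZLAN.G11123.0.C.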
Monitoring the WAN Subsystem
This subsection describes how to obtain the status of SWAN concentrators, data
communications devices, processes, and CLIPs. For more information on the WAN
subsystem, see the WAN Subsystem Configuration and Management Manual.
Monitoring Status for a SWAN Concentrator
To display the current status for a SWAN concentrator:
> SCF STATUS ADAPTER $ZZWAN.#concentrator-name
The system displays a listing similar to:
-> status adapter $zzwan.#s01
WAN Manager STATUS ADAPTER for ADAPTER \TAHITI.$ZZWAN.#S01
State........... STARTED
Number of clips. 3
Clip 1 status : CONFIGURED
Clip 2 status : CONFIGURED
Clip 3 status : CONFIGURED
To display the status for all SWAN concentrators configured for your system:
> SCF STATUS ADAPTER $ZZWAN.*
The system displays a listing similar to:
1-> STATUS ADAPTER $ZZWAN.*
WAN Manager STATUS ADAPTER for ADAPTER \COMM.$ZZWAN.#SWAN1
State........... STARTED
Number of clips. 3
Clip 1 status : CONFIGURED
Clip 2 status : CONFIGURED
Clip 3 status : CONFIGURED
WAN Manager STATUS ADAPTER for ADAPTER \COMM.$ZZWAN.#SWAN2
State........... STARTED
Number of clips. 3
Clip 1 status : CONFIGURED
Clip 2 status : CONFIGURED
Clip 3 status : CONFIGURED
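The STATUS ADAPTER listing repeats one block per SWAN concentrator. As an illustration, the following Python sketch groups saved listing text into one record per concentrator so that the state and CLIP statuses can be checked together. The function name and the dictionary layout are assumptions made for this example, and the patterns match only the output format shown above.

```python
import re

HEADER = re.compile(r"WAN Manager STATUS ADAPTER for ADAPTER (\S+)")
STATE = re.compile(r"^State\.+\s*(\S+)", re.IGNORECASE)
CLIP = re.compile(r"^Clip (\d+) status\s*:\s*(\S+)")


def parse_wan_adapters(listing: str) -> dict[str, dict]:
    """Return {adapter name: {"state": ..., "clips": {number: status}}}."""
    adapters = {}
    current = None
    for raw in listing.splitlines():
        line = raw.strip()
        header = HEADER.search(line)
        if header:
            # A header line starts a new concentrator block.
            current = {"state": None, "clips": {}}
            adapters[header.group(1)] = current
            continue
        if current is None:
            continue
        state = STATE.match(line)
        if state:
            current["state"] = state.group(1)
            continue
        clip = CLIP.match(line)
        if clip:
            current["clips"][int(clip.group(1))] = clip.group(2)
    return adapters


sample = r"""
1-> STATUS ADAPTER $ZZWAN.*
WAN Manager STATUS ADAPTER for ADAPTER \COMM.$ZZWAN.#SWAN1
State........... STARTED
Number of clips. 3
Clip 1 status : CONFIGURED
Clip 2 status : CONFIGURED
Clip 3 status : CONFIGURED
WAN Manager STATUS ADAPTER for ADAPTER \COMM.$ZZWAN.#SWAN2
State........... STARTED
Number of clips. 3
Clip 1 status : CONFIGURED
Clip 2 status : CONFIGURED
Clip 3 status : CONFIGURED
"""
adapters = parse_wan_adapters(sample)
for name, info in adapters.items():
    print(name, info["state"], info["clips"])
```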
Monitoring Status for a Data Communications Device
To verify that a WAN subsystem device is in the STARTED state:
> SCF STATUS DEVICE $ZZWAN.#device-name
The system displays a listing similar to:
-> status DEVICE $zzwan.#IP01
WAN Manager STATUS DEVICE for DEVICE \COWBOY.$ZZWAN.#IP01
STATE ...........STARTED
LDEV number.. ..173
PPIN...........2, 13 BPIN............3, 11
Monitoring WAN Processes
To display the status of all WAN subsystem processes—configuration managers,
TCP/IP processes, WANBoot processes:
> SCF STATUS PROCESS $ZZWAN.*
The system displays a listing similar to:
-> STATUS PROCESS $ZZWAN.*
WAN Manager STATUS PROCESS for PROCESS \COMM.$ZZWAN.#5
State :......... STARTED
LDEV Number..... 66
PPIN............ 5 ,264 Process traced.. NO
WAN Manager STATUS PROCESS for PROCESS \COMM.$ZZWAN.#4
State :......... STARTED
LDEV Number..... 67
PPIN............ 4 ,264 Process traced.. NO
WAN Manager STATUS PROCESS for PROCESS \COMM.$ZZWAN.#ZTF00
State :......... STARTED
PPIN............ 4 ,342
WAN Manager STATUS PROCESS for PROCESS \COMM.$ZZWAN.#SWB1
State :......... STARTED
PPIN............ 4 ,275 BPIN............ 5 ,302
WAN Manager STATUS PROCESS for PROCESS \COMM.$ZZWAN.#ZTF01
State :......... STARTED
PPIN............ 5 ,340
WAN Manager STATUS PROCESS for PROCESS \COMM.$ZZWAN.#SWB0
State :......... STARTED
PPIN............ 4 ,274 BPIN............ 5 ,303
To monitor a single WANBoot process, type:
> SCF STATUS PROCESS $ZZWAN.#boot-process
The system displays a listing similar to:
-> status PROCESS $ZZWAN.#ZB017
WAN Manager STATUS PROCESS for PROCESS \ICEBAT.$ZZWAN.#ZB017
STATE:...........STARTED
PPIN.............0 ,278 BPIN.............0, 282
Monitoring CLIPs
To display the current status for a CLIP:
> SCF STATUS SERVER $ZZWAN.#concentrator-name.clip-num
Values for the CLIP number are 1, 2, or 3.
The system displays a listing similar to:
-> status server $zzwan.#s01.1
WAN Manager STATUS SERVER for CLIP \COWBOY.$ZZWAN.#S01.1
STATE :..........STARTED
PATH A...........: CONFIGURED
PATH B...........: CONFIGURED
NUMBER of lines. 2
Line...............0 : $SAT23A
Line...............1 : $SAT23B
Monitoring the NonStop TCP/IP Subsystem
This subsection describes how to obtain the status for NonStop TCP/IP processes,
routes, and subnets. For additional information, refer to the TCP/IP Configuration and
Management Manual. For NonStop TCP/IPv6, refer to the TCP/IPv6 Configuration and
Management Manual.
Monitoring the NonStop TCP/IP Process
To display the dynamic state of a NonStop TCP/IP process, first list the names of all
NonStop TCP/IP processes:
> SCF LISTDEV TCPIP
Then type:
> SCF STATUS PROCESS tcp/ip-process-name
where tcp/ip-process-name is the name of the process you want information
about.
The system displays a listing similar to this output, which is for process $ZTC0:
-> Status Process $ZTC0
TCPIP Status PROCESS \SYSA.$ZTC0
Status: STARTED
Monitoring NonStop TCP/IP Routes
To display status information for all NonStop TCP/IP routes:
> SCF STATUS ROUTE $ZTC0.*
The system displays a listing similar to:
1-> Status Route $ZTC0.*
TCPIP Status ROUTE \SYSA.$ZTC0.*
Name Status RefCnt
#ROU11 STARTED 0
#ROU9 STARTED 0
#ROU12 STARTED 0
#ROU8 STARTED 1
#ROU3 STOPPED 0
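If the route listing is long, you might filter saved output for routes that are not started. This Python sketch is a minimal illustration that assumes the three-column layout shown above (Name, Status, RefCnt); the function name is hypothetical.

```python
def routes_not_started(listing: str) -> list[tuple[str, int]]:
    """Return (route name, reference count) for each route not STARTED."""
    result = []
    for line in listing.splitlines():
        fields = line.split()
        # Route rows have exactly three columns and a name starting with '#'.
        if len(fields) != 3 or not fields[0].startswith("#"):
            continue
        name, status, refcnt = fields
        if status != "STARTED" and refcnt.isdigit():
            result.append((name, int(refcnt)))
    return result


sample = """\
Name Status RefCnt
#ROU11 STARTED 0
#ROU9 STARTED 0
#ROU12 STARTED 0
#ROU8 STARTED 1
#ROU3 STOPPED 0
"""
print(routes_not_started(sample))
```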
Monitoring NonStop TCP/IP Subnets
To obtain the status of all NonStop TCP/IP subnets:
> SCF STATUS SUBNET $ZTC0.*
The system displays a listing similar to:
1-> STATUS SUBNET $ZTC0.*
TCPIP Status SUBNET \SYSA.$ZTC0.*
Name Status
#LOOP0 STARTED
#EN1 STARTED
Monitoring Line-Handler Process Status
A line-handler process is a component of a data communications subsystem. It is an
I/O process that transmits and receives data on a communications line, either directly
or by communicating with another I/O process. This subsection explains how to
monitor the status of a line-handler process on your system or on another system in
your network to which you have remote access.
To check the status of a line-handler process on your system:
> SCF STATUS LINE $line
A listing similar to this example is sent to your home terminal:
1-> STATUS LINE $LHPLIN1
EXPAND Status LINE
Name State PPID BPID ConMgr-LDEV
$LHCS6S STARTED 1, 20 2,25 49
This listing shows that the Expand line-handler process being monitored is up and
functioning normally.
The data shown in the report means:
Name         Specifies the name of the object
State        Indicates the summary state of the object, which is either
             STARTED, STARTING, DIAGNOSING (for SWAN concentrators
             only), or STOPPED
PPID         Specifies the primary process ID
BPID         Specifies the backup process ID
ConMgr-LDEV  Contains the LDEV of the concentrator manager process. This
             field applies only to SWAN concentrator lines.
If any state other than STARTED appears, check the meaning of the state in SCF
Object States on page 3-14. Depending upon the type of problem, follow your
established procedures for problem reporting and escalation.
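The fields described above can also be pulled out of a saved STATUS LINE row programmatically. The following Python sketch is an assumption-laden illustration, not an HP-supplied tool: the function name and the returned dictionary keys are invented for this example, and the pattern matches only rows laid out like the sample row in this subsection.

```python
import re

# One data row: name, state, PPID (cpu,pin), BPID (cpu,pin), optional LDEV.
LINE_ROW = re.compile(
    r"^(?P<name>\$\S+)\s+(?P<state>[A-Z]+)\s+"
    r"(?P<pcpu>\d+)\s*,\s*(?P<ppin>\d+)\s+"
    r"(?P<bcpu>\d+)\s*,\s*(?P<bpin>\d+)"
    r"(?:\s+(?P<ldev>\d+))?\s*$"
)


def parse_line_status(row: str):
    """Parse one STATUS LINE data row; return None if it does not match."""
    m = LINE_ROW.match(row.strip())
    if m is None:
        return None
    return {
        "name": m["name"],
        "state": m["state"],
        "ppid": (int(m["pcpu"]), int(m["ppin"])),
        "bpid": (int(m["bcpu"]), int(m["bpin"])),
        "conmgr_ldev": int(m["ldev"]) if m["ldev"] else None,
        # Any state other than STARTED should be looked up and followed up.
        "needs_attention": m["state"] != "STARTED",
    }


info = parse_line_status("$LHCS6S STARTED 1, 20 2,25 49")
print(info)
```

Rows that do not fit this layout (for example, the column heading) return None rather than raising an error, so the function can be applied to every line of a saved listing.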
Examples
To check the detailed status of line $LHCS6S:
> SCF STATUS LINE $LHCS6S, DETAIL
A listing such as this output is sent to your home terminal:
Trace Status............ OFF Clip Status......... UNLOADED
ConMgr-LDEV............. 49
Path-prim
Path-alter
To display the status of all the Expand lines that are currently active on your system,
enter this INFO PROCESS command for the Expand manager process $NCP:
-> INFO PROCESS $NCP, LINESET
The system displays a listing similar to this output. The NEIGHBOR field displays the
system to which a given line connects, and the STATUS field indicates whether the line
is up:
1-> INFO PROCESS $NCP, LINESET
EXPAND Info PROCESS $NCP , LINESET
LINESETS AT \COMM (116) #LINESETS=35 TIME: JUL 9,2001 19:28:04
LINESET NEIGHBOR LDEV TF PID LINE LDEV STATUS FileErr#
Tracing a Communications Line
Use the SCF TRACE command to trace the operation of a communications line. The
line continues normal operation while being traced, but it passes all its message traffic
to a trace procedure. Tracing enables you to see the history of a communications line,
including its internal processing.
You can display trace files by using the commands available in the PTrace program.
For information about PTrace, refer to the PTrace Reference Manual. For information
about configuring a trace by using the SCF TRACE command, refer to the
configuration and management manual for the communications subsystem you want to
trace.
Recovery Operations for Communications
Subsystems
Some general troubleshooting guidelines are:
• Examine the contents of the event message log for the subsystem. For example,
  the WAN subsystem or Kernel subsystem might have issued an event message
  that provides information about the process failure. Event messages returned by
  the WAN subsystem and SWAN concentrator are described in the WANMGR and
  TRAPMUX sections of the Operator Messages Manual, respectively.
• HP provides a comprehensive library of troubleshooting guides for the
  communications subsystems. Attempt to analyze the problems and restart the
  process or object using the commands described in the appropriate manual listed
  in Related Reading on page 6-13. If you are unable to start a required process or
  object, contact your service provider.
Related Reading
For more information about monitoring and performing recovery operations for
communications subsystems, see the manuals listed in Table 6-1. The appropriate
manual to use depends on how your system is configured.
For example, if a process is configured using the SCF interface to the WAN subsystem
and then reconfigured with the SCF interface to another subsystem, only the SCF
interface to the other subsystem would provide current information about the
configuration. The SCF interface to the WAN subsystem would provide only
information about the configuration before it changed.
Table 6-1. Related Reading for Communications Lines and Devices

For Information About...      Refer to...

General information about     Introduction to Networking for HP NonStop
communications subsystems       NS-Series Servers

Using SCF to monitor          SCF Reference Manual for the Kernel Subsystem
generic processes

Using SCF to monitor the      LAN Configuration and Management Manual
SLSA subsystem as well as
Ethernet addressable
devices, such as ServerNet
adapters

Using SCF to monitor WAN      WAN Subsystem Configuration and Management Manual
communications lines for
devices and intersystem
communications protocols

Using SCF to monitor a        Asynchronous Terminals and Printer Processes
specific device or              Configuration and Management Manual
communications protocol       ATM Adapter Installation and Support Guide
product; troubleshooting      ATM Configuration and Management Manual
specific communications       CP6100 Configuration and Management Manual
subsystems and protocols      EnvoyACP/XF Configuration and Management Manual
                              Expand Configuration and Management Manual
                              Fibre Channel ServerNet Adapter Installation and
                                Support Guide
                              Gigabit Ethernet 4-Port Adapter Installation and
                                Support Guide
                              PAM Configuration and Management Manual
                              QIO Configuration and Management Manual
                              SCF Reference Manual for H-Series RVUs
                              ServerNet Cluster Manual
                              SNAX/XF and SNAX/APN Configuration and
                                Management Manual
                              SWAN Concentrator and WAN Subsystem
                                Troubleshooting Guide
                              TCP/IPv6 Configuration and Management Manual
                              TCP/IP Configuration and Management Manual
                              Token-Ring Adapter Installation and Support Guide
                              X25AM Configuration and Management Manual
7
ServerNet Resources: Monitoring
and Recovery
ServerNet Communications Network on page 7-1
System I/O ServerNet Connections on page 7-4
Monitoring the Status of the ServerNet Fabrics on page 7-4
Monitoring the ServerNet Fabrics Using OSM on page 7-5
Monitoring the ServerNet Fabrics Using SCF on page 7-6
Related Reading on page 7-8
When to Use This Section
Use this section to learn about monitoring and performing recovery operations for the
internal and external ServerNet fabrics, and to understand how and when an Integrity
NonStop NS-series system can be connected to legacy NonStop S-series I/O
enclosures.
Notes. Integrity NonStop NS16000 systems support connectivity to NonStop S-series I/O
enclosures; Integrity NonStop NS14000 and NS1000 systems do not. For more information,
see Differences Between Integrity NonStop NS-Series Systems on page 2-2.
An Integrity NonStop NS16000 system can be part of the same ServerNet cluster as NonStop
S-series systems; an Integrity NonStop NS14000 system cannot. For more information, see
the ServerNet Cluster Supplement for Integrity NonStop NS-Series Servers.
Integrity NonStop NS1000 systems do not support ServerNet clusters.
All Integrity NonStop system I/O is performed through the ServerNet system area
network (SAN). LSU logic boards connect the SAN to the replicated four-way
microprocessors on Integrity NonStop systems (except for Integrity NonStop NS1000
systems, which have no LSUs; see System I/O ServerNet Connections on page 7-4).
ServerNet Communications Network
The ServerNet communications network is a high-speed network within an Integrity
NonStop system that connects processors to each other and to peripheral controllers.
This network offers the connectivity of a standard network, but it does not depend on
shared resources such as interprocessor buses or I/O channels. Instead, the
ServerNet communications network uses the ServerNet architecture, which is
wormhole-routed, full-duplex, packet-switched, and point-to-point. This network offers
low latency, low software overhead, high bandwidth, and parallel operation.
In the ServerNet architecture, each processor maintains two independent paths to
other processors, I/O devices, and ServerNet adapters. These dual paths can be used
simultaneously to improve performance, and to ensure that no single failure disrupts
communications among the remaining system components.
A ServerNet adapter provides the interface between a ServerNet fabric and the Fibre
Channel and Ethernet links. A ServerNet adapter contains a ServerNet bus interface
(SBI) and one or more ServerNet addressable controllers (SACs).
Integrity NonStop NS16000 ServerNet Connectivity
An Integrity NonStop NS16000 system uses the ServerNet fabric for interconnections
between the LSUs, p-switches, and IOAMs, enabling an Integrity NonStop system to
be connected to legacy NonStop S-series enclosures. Figure 7-1 shows a logical
representation of a complete system with the X and Y ServerNet fabrics.
Figure 7-1. Integrity NonStop NS16000 System
Integrity NonStop NS14000 ServerNet Connectivity
ServerNet connections between I/O devices and processors depend on whether the
Integrity NonStop NS14000 system has an IOAM enclosure or VIO enclosures.
Figure 7-2 shows an NS14000 system with an IOAM enclosure. For more information
on Integrity NonStop NS14000 systems with VIO enclosures, see Integrity NonStop
NS14000 Systems on page 2-3, the NonStop NS14000 Planning Guide, or the
Versatile I/O (VIO) Manual.
Figure 7-2. Integrity NonStop NS14000 System with IOAM Enclosure
[Figure: 4-processor, duplex configuration. Shows the IOAM enclosure (4PSEs, FCSAs, and G4SAs), LSU enclosure 0, Blade Elements A and B, the X and Y fabrics, and the connections to the maintenance switch and the 6780 ServerNet cluster switch.]
Integrity NonStop NS1000 ServerNet Connectivity
ServerNet connections between I/O devices and processors depend on whether the
Integrity NonStop NS1000 system has an IOAM enclosure or VIO enclosures. For
more information on Integrity NonStop NS1000 systems, see the NonStop NS1000
Planning Guide, NonStop NS1000 Hardware Installation Manual, or the Versatile I/O
(VIO) Manual.
System I/O ServerNet Connections
For Integrity NonStop NS16000 systems, ServerNet connections to the system I/O
devices (storage disk and tape drive as well as Ethernet communication to networks)
radiate out from the p-switches for both the X and Y ServerNet fabrics.
ServerNet cables connected to the p-switch PICs in slots 10 through 13 come from the
LSUs and processors. Cables connected to the PICs in slots 4 through 9 connect to
one or more IOAM enclosures or to NonStop S-series I/O enclosures equipped with
IOMF2 CRUs. Figure 7-3 shows the connections to the PICs in a fully populated
p-switch.
For Integrity NonStop NS14000 systems, see Integrity NonStop NS14000 ServerNet
Connectivity on page 7-3. Like NS14000 systems, Integrity NonStop NS1000 systems
use 4PSEs to provide ServerNet connections between I/O devices and processors.
However, there are no LSUs; the 4PSEs connect directly to the Blade Elements. For
more information, see the NonStop NS1000 Hardware Installation Manual.
Figure 7-3. I/O Connections to the PICs in a P-Switch
Monitoring the Status of the ServerNet Fabrics
The ServerNet fabrics provide the communication paths used for interprocessor
messages, for communication between processors and I/O devices, and (in the case of