Before using this information and the product it supports, be sure to read the general information in
Appendix J, “Special Notices” on page 389.
First Edition (December 1997)
This edition applies to Version 3, Release 7, Modification 0 of the AS/400 Operating System and to all subsequent
releases until otherwise indicated in new editions or technical bulletins.
Comments may be addressed to:
IBM Corporation, International Technical Support Organization
Dept. JLU Building 107-2
3605 Highway 52N
Rochester, Minnesota 55901-7829
When you send information to IBM, you grant IBM a non-exclusive right to use or distribute the information in any
way it believes appropriate without incurring any obligation to you.
Copyright International Business Machines Corporation 1997. All rights reserved.
Note to U.S. Government Users — Documentation related to restricted rights — Use, duplication or disclosure is
subject to restrictions set forth in GSA ADP Schedule Contract with IBM Corp.
Improving communication performance is not a trivial task. The purpose of this
redbook is to discuss how to manage communications performance and ways to
locate the problem areas in communication performance. This redbook collects
a large amount of the performance information from several sources and
presents it in an ordered manner. The databases created by the Performance
Tools/400 were used to give the key performance indicators.
This redbook is intended for technical professionals including network designers
who want to tune the IBM AS/400 system to improve communications
performance.
An intermediate knowledge of the Performance Tools/400 (5716-PT1) and
Query/400 (5716-QU1) is assumed.
The Team That Wrote This Redbook
This redbook was produced by a team of specialists from around the world
working at the International Technical Support Organization Rochester Center.
Suehiro Sakai is an Advisory International Technical Support Specialist for the
AS/400 system at the International Technical Support Organization, Rochester
Center. He writes extensively and teaches IBM classes worldwide in all areas of
AS/400 communications. Before joining the ITSO, he worked in AS/400 Brand,
Japan as an AS/400 Solution Specialist.
Petri Nuutinen is a Systems Support Engineer in Finland. He has 15 years of
experience in the Work Management field; first with S/38 and with the AS/400
system from 1987. His areas of expertise include performance tuning and work
management. He has written extensively on how to find a performance problem
and whether it is related to hardware or software.
Jozsef Redey has been with IBM for 5 years and is a Software Customer
Engineer in Hungary. He has 15 years of experience in the IBM network and
connectivity fields. He holds a degree in electrical engineering from the HfV in
Dresden and in digital systems design from the Technical University in Budapest.
His areas of expertise include multi-platform SNA communications, Client Access
and AS/400 Internet connectivity.
Marcelo Porta has been supporting AS/400 in Argentina since 1988. Since 1991,
he has been working in the AS/400 communications area, and PC Support/Client
Access areas. His areas of expertise include APPN, mainframe
communications, TCP/IP connection with RS/6000, and satellite
communications.
This document is based on the ITSO redbook, AS/400 Communication
Performance Investigation, GG24-4669.
The authors of the redbook were:
Petri Nuutinen, IBM Finland
Philip Ryder, IBM Australia
Meindert de Schiffart, IBM Netherlands
Thanks to the following people for their invaluable contributions to this project:
Allan Johnson, Rochester Development
Bob Manulik, Rochester Development
Tom Freeman, Rochester Development
John Horvath, Rochester Development
Doug Prigge, Rochester Development
Lois Douglas, ITSO Rochester
Comments Welcome
Your comments are important to us!
We want our redbooks to be as helpful as possible. Please send us your
comments about this or other redbooks in one of the following ways:
• Fax the evaluation form found in “ITSO Redbook Evaluation” on page 399 to
  the fax number shown on the form.
• Use the electronic evaluation form found on the Redbooks Web sites:
  For Internet users        http://www.redbooks.ibm.com
  For IBM Intranet users    http://w3.itso.ibm.com
• Send us a note at the following address:
  redbook@vnet.ibm.com
xiiComm Perf Investigation - V3R6/V3R7
Chapter 1. Tools Used for Finding Performance Problems
Finding a performance problem is similar to solving a three-dimensional
crossword puzzle: all of the puzzles are different from each other but after
solving several puzzles, you begin to grasp a pattern. For example, you start the
puzzle from the lower left-hand corner and continue systematically towards the
upper right-hand corner. Solving a communications performance problem is a
task even more challenging. You need to have the AS/400 system tuned
properly before trying to figure out what is causing the communications
performance problem.
Just as it is impossible to give anyone explicit instructions for solving a
crossword puzzle, it is impossible to give you an exact checklist to follow to
find and solve a communications performance problem. In this book, we lead you
to the beginning of a never-ending task: the search for perfect performance.
The first step of solving a communication performance problem is to collect
material to be analyzed with the tools available. The collection is done by
entering the Start Performance Monitor (STRPFRMON) command that is
described in Section 1.2, “Collecting Communications Performance Data” on
page 2.
The tools you need to solve a performance problem are:
• CL commands described in more detail in Chapter 2, “Using CL Commands
  to Find Performance Problems” on page 11:
  − WRKSYSVAL, Work with System Values
  − WRKSYSSTS, Work with System Status
  − WRKACTJOB, Work with Active Jobs
  − WRKDSKSTS, Work with Disk Status
• Performance Tools/400 is described in Chapter 3, “Using Performance
  Tools/400” on page 27 and consists of the following parts:
  − WRKSYSACT, Work with System Activity command
    This command differs from the rest of the performance tools because it
    is the only tool used for a real-time analysis. For information about
    using this command, see Chapter 2, “Using CL Commands to Find
    Performance Problems” on page 11.
  − DSPPFRDTA, Display Performance Data command
  − Advisor
  − Reports
  − BEST/1 is used to plan for system growth and analyze the effect of work
    load and hardware changes. Using this tool is discussed in Chapter 4,
    “Using BEST/1 for Communications Performance Analysis and Capacity
    Planning” on page 53.
  − Programmer performance utilities such as:
    - Job trace
    - Disk Data Collection
    - Analyze Process Access Group
    - Performance Explorer
• System Service Tools is discussed in Chapter 5, “Using System Service
  Tools” on page 71.
• Communications Trace is discussed in Chapter 5, “Using System Service
  Tools” on page 71.
The tools should be used in sequence from top to bottom. First, use the Work
with System Values command to find out the settings of the allocation system
values. After that, check the overall performance by using the Work with System
Status command. Then find out if any individual jobs are using too much of the
system's resources by using the Work with Active Jobs command. The Work
with Disk Status command helps you to determine if any of the actuators are
being over-committed or whether the total number of disk arms is adequate.
By using Performance Tools/400, you can identify the performance bottlenecks,
which can then be analyzed more thoroughly by running queries against the
performance tools database. Communications trace is used to find out how the
data is passed between the AS/400 system and the remote end.
Please note that the users on a local token-ring are considered as remote users.
1.1 Usual Symptoms of Degraded Performance
There are several ways of finding out if your AS/400 system is having a
performance problem in the communications area, but a good starting point is to
ask users what they think about response times. Bear in mind that workstation
users are usually not satisfied with the response time, even when it is
sub-second.
The indicators to pay attention to are:
• Poor response time
• Reduced throughput
• Heavy faulting rate in the main storage
• High usage of system resources such as CPU, IOP, or DISK
Normally the degradation of response times is the first indication that
something is going wrong. Be aware that response times usually get longer little
by little, so noticing the degradation is almost impossible without regular
observation of system performance.
1.2 Collecting Communications Performance Data
Before collecting the performance data to solve a communications performance
problem, decide what might be the problem to be investigated. The problem
description does not need to be overly detailed or technical, just try to simply
describe one problem. For example:
• Remote response time seems too slow.
• File transfer should go faster.
• At times, the entire system seems sluggish.
Next, determine when the problem usually occurs. Maybe remote response time
is slow the first thing in the morning, or the file transfers seem slow late in the
afternoon.
When you can describe the communications performance problem and have
determined when it seems to occur, you are ready to collect communications
performance data for your analysis.
If possible, focus on collecting data for one problem at a time. Of course, try to
collect the data when the problem is the most likely to appear. You can decide
later how much of the data you want to analyze. For more information about
when to collect performance data and how much to collect, see the first few
pages of Chapter 4 in the AS/400 Performance Tools/400 Guide.
1.2.1 Why Collect Performance Data
Collect performance data on a regular basis and create historical data out of the
material collected. For example, you can run the performance data collection for
two hours on every Wednesday afternoon with the default parameters; the trace
data is not needed for the historical data. The reason for doing this is that
viewing the historical data graphics is the easiest way to notice any trends in
system performance if you are not using the Performance Monitor/400 software.
Another reason for collecting data regularly is that without having a baseline to
compare your performance data with, you have no way of telling whether the
performance is improving or degrading.
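The value of a baseline can be illustrated with a small calculation. The
following Python sketch is not part of Performance Tools/400; the function
name, the sample figures, and the 10% tolerance are assumptions chosen only for
illustration. It compares the average response time of a new collection with
the baseline average and reports the relative change.

```python
from statistics import mean


def compare_to_baseline(baseline, current, tolerance=0.10):
    """Classify a collection against a baseline of response times (seconds).

    tolerance is an assumed threshold: a change smaller than this fraction
    of the baseline average is reported as "stable".
    """
    base_avg = mean(baseline)
    cur_avg = mean(current)
    change = (cur_avg - base_avg) / base_avg
    if change > tolerance:
        verdict = "degrading"
    elif change < -tolerance:
        verdict = "improving"
    else:
        verdict = "stable"
    return verdict, change


# Hypothetical response time averages, not measured data.
baseline_samples = [1.0, 1.2, 1.1, 0.9]
latest_samples = [1.4, 1.5, 1.3, 1.6]

verdict, change = compare_to_baseline(baseline_samples, latest_samples)
print(f"{verdict}: average response time changed by {change:+.0%}")
```

Without the baseline samples, the latest figures alone say nothing about the
direction in which performance is moving.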
1.2.2 How to Collect Performance Data
You do not need Performance Tools/400 to collect the data; the collection is
done by entering the Start Performance Monitor (STRPFRMON) command. This
generates several performance database files that contain statistics for each
communications protocol used. When collecting performance data to analyze a
communications performance problem, set the sampling interval to the smallest
value possible.
Figure 1 shows an example of how to collect performance data to generate
communications statistics to be analyzed either by the advisor or Performance
Tools/400.
(Start Performance Monitor (STRPFRMON) prompt display; the screen contents are
not reproduced here.)
Figure 1. STRPFRMON Command
Notes:
1 When collecting performance data, you can use the default library
QPFRDATA or you can create a specific library for your data. For
example, you can create a library with your customer name.
2 As you may have several performance members in that library, put a
text description of each member collected to help identify them. Usually,
it is a good idea to include the date of the collection in the description
field.
3 Set the time interval to five minutes.
4 This specifies the type of information collected.
The possible values are:
*ALL   All of the information is collected including system information,
       communications information, and input/output processor (IOP)
       information.
*SYS   Only system information is collected. IOP information is not
       collected.
5 This parameter is new in Version 3 Release 6. Specifying *YES
starts database monitoring for all the jobs in the system, which usually
is not desirable.
1.2.2.2 Start Database Monitor (STRDBMON) Command
You may use the STRDBMON command to start monitoring database activities if
special information is required. Entering the STRDBMON command provides you
with the following display:
                     Start Database Monitor (STRDBMON)

 Type choices, press Enter.

 File to receive output . . . . . 1               Name
   Library  . . . . . . . . . . .     *LIBL       Name, *LIBL, *CURLIB
 Output member options:
   Member to receive output . . .   *FIRST        Name, *FIRST
   Replace or add records . . . .   *REPLACE      *REPLACE, *ADD
 Job name . . . . . . . . . . . . 2 *             Name, *, *ALL
   User . . . . . . . . . . . . .                 Name
   Number . . . . . . . . . . . .                 000000-999999
 Type of records  . . . . . . . .   *SUMMARY      *SUMMARY, *DETAIL
 Force record write . . . . . . .   *CALC         0-32767, *CALC
 Comment  . . . . . . . . . . . . 3 *BLANK

                                                                      Bottom
 F3=Exit   F4=Prompt   F5=Refresh   F12=Cancel   F13=How to use this display
 F24=More keys

Figure 2. STRDBMON Command
Notes:
1 Use this parameter to specify both the library and the file name to
which the performance statistics are written. If the file does not exist, one
is created based on the QAQQDBMN file in library QSYS.
2 Use this parameter to choose the job or jobs whose database
activities are to be monitored.
3 Enter up to 100 characters of descriptive text on this input field.
Please note that at the time this publication was being written, there were no
tools available for analyzing the data collected. Be extremely careful when
collecting data because there is no way of knowing whether database monitoring
is active or not.
Usually the data collected through the STRDBMON command includes no data
directly related to communications performance.
IMPORTANT!
If you forget to turn the monitoring off, you may eventually fill up all of the
disk space on the AS/400 system.
1.2.3 Automatic Data Collection
Automatic data collection allows you to select specific days of the week to
collect the data using the OS/400 performance monitor. Use the Add
Performance Collection (ADDPFRCOL) command or choose option 1 (add) on the
Work with Performance Collection menu (displayed by entering the WRKPFRCOL
command) to establish a regular schedule for collecting performance data
automatically on any day of the week.
You may either specify the day and the time to collect the performance data or
just specify starting and ending times and run it every day of the week. Please
make sure that the collection time includes the peak hours or the period you
want to monitor.
Note: The default value of the RMTRSPTIME (Remote Response Time)
parameter is *NONE which means that remote workstation response time is not
collected unless otherwise specified.
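The advice to make the collection time include the peak hours can be checked
with a simple comparison. The following Python sketch is an illustration only:
the function name and the sample times are assumptions, and it handles only
collection windows that start and end on the same day.

```python
from datetime import time


def window_covers(collect_start, collect_end, peak_start, peak_end):
    """Return True if the collection window fully contains the peak period.

    All arguments are datetime.time values on the same day; windows that
    cross midnight are outside the scope of this sketch.
    """
    return collect_start <= peak_start and peak_end <= collect_end


# Hypothetical schedule: collect 08:00-12:00, peak load 09:30-11:00.
print(window_covers(time(8, 0), time(12, 0), time(9, 30), time(11, 0)))

# A collection that ends at 10:00 misses part of the same peak period.
print(window_covers(time(8, 0), time(10, 0), time(9, 30), time(11, 0)))
```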
1.2.4 Performance Management/400
One tool that is completely different from all the other tools discussed in this
publication is Performance Management/400. It is a tool that is a combination of
both collecting and analyzing the performance data.
Performance Management/400 (PM/400) is an IBM system management service
offering that assists customers by helping them to plan and manage system
resources through regular analysis of key performance indicators.
The service uses a set of software and procedures installed on the customer's
system. The software collects performance data and summarizes and transmits
the summarized data weekly to your local service provider.
PM/400 automates these functions and provides a summary of capacity and
performance information. Reports and graphs are produced in a format that
both non-technical and technical persons can understand.
Performance data is both analyzed and maintained by IBM. Contact your local
service provider for more information about using PM/400.
PM/400 does not require Performance Tools/400 (5716-PT1) and is not intended
to replace that product.
1.3 Using CL Commands Interactively
You have several commands to use for identifying the performance problem
interactively:
WRKSYSSTS   This command is used to get a quick look at the system-wide
            performance figures such as:
            • CPU usage
            • Disk usage
            • Memory usage
              Note: There is no way of knowing the amount of memory
              used; you can only observe the rate of paging, which
              indirectly tells you whether there is enough storage
              available or not.
            • Job state transition rates
WRKACTJOB   With this command, you can easily find out how the individual
            jobs are using system resources.
WRKDSKSTS   With this command, you can observe the performance of each
            disk arm on the system.
WRKSYSACT   With this command, you can observe both external jobs and
            internal tasks or processes. This command is actually the two
            previous commands in one package and is only available as a
            part of the Performance Tools/400 licensed program.
NOTICE!
Please bear in mind that using these commands can add a significant
amount of workload to the system, especially if you are using the console
display. In other words, analyzing a performance problem can cause more
performance problems.
1.4 Using Performance Tools/400
Performance Tools/400 provides more ways for you to display performance
related information about the system being analyzed.
1.4.1 WRKSYSACT Command
The Work with System Activity display allows you to view performance data in a
real-time fashion. The data is reported for any selected job or task that is
currently active on the system. Besides having the capacity to view this data on
the display station, you may also direct the data to be stored in a database file
for future use.
1.4.2 PRTACTRPT Command
The Print Activity Report (PRTACTRPT) command generates reports based on
the data collected by the Work With System Activity (WRKSYSACT) command.
1.4.3 DSPPFRDTA Command
The Display Performance Data (DSPPFRDTA) command starts the interactive
displays that are used for showing the performance data.
Note: This command can only be used when previously collected performance
data is available.
1.4.4 The Advisor
Pay attention to any communications related recommendations or conclusions.
1.4.5 Produce Reports
The following list contains reports that you can produce by using the
Performance Tools/400 licensed software.
System report
    Prints an overview of what happened on the system.
Component report
    Prints performance data by job, user, pool, disk, IOP, local
    workstation, and exception.
Transaction report
    Prints information about the transactions that occurred during
    the time that the performance data was collected.
    The transaction report may be extended to print:
    • Transaction detail report
    • Transition detail report
    Note: The transaction detail and transition detail reports are
    quite detailed. Use select/omit parameters to choose specific
    jobs, users, and time intervals only.
Lock report
    Prints a report that is used to determine whether jobs are
    delayed during processing because of unsatisfied lock requests
    or internal machine waits.
Job report
    Prints performance data about jobs that were active during the
    time that the performance data was collected.
Pool report
    Prints performance data about pools.
Resource report
    Prints performance data about the system resources such as
    disks and workstation controllers.
Batch job report
    Prints performance data about batch jobs traced through time.
    Resources utilized, exceptions, and state transitions are
    reported.
1.5 What to Look For
Follow the flow chart shown in Figure 3 on page 9 to solve your communication
performance problem.
Questions to ask yourself about the performance problems are:
• Is the performance always unacceptable?
• Is the AS/400 system balanced? If it is not, follow the map in Figure 7 on
  page 20 or contact your service provider to get assistance with tuning the
  system.
• Is there a specific time of day/week/month when performance is poor?
• Are there batch jobs or file transfer jobs running during the poor
  performance time?
  − If the answer is yes, are the batch jobs running in the same storage pool
    as the communication jobs?
  − If the answer is yes, consider creating a separate storage pool for either
    batch jobs or the communication jobs.
• Are all of the users affected?
• Are only remote users affected?
• What do the complaining users have in common?
  − Is the same application used both in remote locations and locally?
  − Are all of the users for this application complaining?
  − Is there only one group of users having a problem?
  − Are all of the users connected to the same controller/line/IOP?
Figure 3. Where to Read, 1 of 2
Figure 4. Where to Read, 2 of 2
Chapter 2. Using CL Commands to Find Performance Problems
This chapter provides information about identifying a communications
performance problem by using command language (CL) commands interactively.
Please bear in mind that using these commands can add a significant amount of
workload to the system, especially if you are using the console display. In other
words, analyzing a performance problem can cause more performance
problems.
2.1 WRKSYSVAL Command
System values are pieces of information that affect the operating environment in
the entire system. System values are not objects and, therefore, they cannot be
passed as parameter values the same as CL variables.
There are some system values that affect performance such as QTOTJOB,
QACTJOB, QMAXACTLVL, QMCHPOOL, and QCMNRCYLMT. Review these
values first because they can relate to your situation.
2.1.1 QTOTJOB
This value controls the total number of jobs for which the storage is allocated
during IPL.
The correct setting of this system value can be obtained by entering the
WRKSYSSTS command. Pay attention to the value displayed in the "Jobs in
system" field because the number of jobs in the system should never be greater
than the value of QTOTJOB. Add 15% to the number in the "Jobs in system" field
and set this to be the system value QTOTJOB provided that the following
cautions are followed:
• Remember to clear output queues regularly because OS/400 reserves
  storage for a job as long as there is at least one spooled output file for that
  job even though the job is inactive. The more files there are in output
  queues, the more jobs you see on the Work with System Status display.
• If you have a high number of spooled files on the system while using the
  WRKSYSSTS command and you add 15% more to set the QTOTJOB value,
  you significantly increase the time it takes to IPL the system. Performance is
  also affected at run time of any system functions that search through the
  system-wide Work Control Block Table (WCBT). These functions include the
  WRKACTJOB command, WRKJOB command, and STRSBS command.
• Consider using the AS/400 Operational Assistant options to clean the
  obsolete spooled files such as old job logs and program dumps from the
  system. This can be done by entering GO CLEANUP on any command line.
If the number of "Jobs in system" reaches this value, all of the jobs are paged
out from the main storage and the number of job structures given with the
QADLTOTJ system value (the shipped value is 10) is created before all of the
jobs are paged into the main storage and normal processing continues.
You can suspect a wrong setting of QTOTJOB if the system seems to "slow
down" periodically with no apparent reason such as a heavy batch job visible.
The "hang up" situation normally lasts a couple of minutes after which normal
processing continues until the previously created job structures are used up and
a new "hang up" situation arises.
The value shipped with the operating system is 30 which normally is not large
enough.
Note: A change of this system value is effective only after the next IPL.
2.1.2 QACTJOB
This value controls the initial number of active jobs for which storage is to be
allocated during IPL. The amount of storage allocated for each active job is
approximately 110K.
The correct setting for this value can be determined by entering the
WRKACTJOB command; the number of active jobs in the system is shown in the
upper right-hand corner of the display. Find the highest number of
active jobs during a busy day, add 10% to the number, and you have found the
correct setting for the QACTJOB system value. The number of active jobs
should not exceed this value, or all of the jobs are paged out from main storage
until the number of job structures given with QADLACTJ is created.
You can suspect a wrong setting of QACTJOB if the system seems to "fall
asleep" periodically with no apparent reason visible. The "sluggish
performance" situation normally lasts a couple of minutes after which normal
processing continues until the previously created job structures are
used up and a new "hang up" situation arises.
The value shipped with the operating system is 20 which normally is not large
enough.
Note: A change of this system value is effective only after the next IPL.
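The sizing rules for QTOTJOB (jobs in system plus 15%) and QACTJOB (peak
active jobs plus 10%) can be worked through as follows. This Python sketch is
an illustration only: the function names and the sample job counts are
assumptions, and rounding up to the next whole job is a choice made here, not
something the text prescribes. The storage figure uses the approximate 110K
per active job quoted above.

```python
def add_percent_rounded_up(count, percent):
    """Add a percentage to a job count using integer arithmetic.

    Rounding up is an assumption of this sketch; integer math avoids
    floating-point surprises such as 100 * 1.10 exceeding 110.
    """
    return -(-count * (100 + percent) // 100)


def suggest_qtotjob(jobs_in_system):
    """Jobs in system (WRKSYSSTS display) plus 15%."""
    return add_percent_rounded_up(jobs_in_system, 15)


def suggest_qactjob(peak_active_jobs):
    """Highest number of active jobs on a busy day (WRKACTJOB) plus 10%."""
    return add_percent_rounded_up(peak_active_jobs, 10)


def active_job_storage_kb(qactjob, kb_per_job=110):
    """Approximate storage allocated at IPL for the active job structures."""
    return qactjob * kb_per_job


# Hypothetical observations, not measured data.
jobs_in_system = 200
peak_active_jobs = 150

qtotjob = suggest_qtotjob(jobs_in_system)
qactjob = suggest_qactjob(peak_active_jobs)
print("QTOTJOB:", qtotjob)
print("QACTJOB:", qactjob)
print("Approximate active job storage (KB):", active_job_storage_kb(qactjob))
```

Take the observed counts only after the output queues have been cleaned up,
because spooled files left on the system inflate the "Jobs in system" figure.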
Do Not Set the Values Too Large!
You must keep QACTJOB, QTOTJOB, QADLACTJ, and QADLTOTJ at
reasonable values. If you make QACTJOB and QTOTJOB excessively high,
the IPL is slower due to excessive storage allocation. If you make QACTJOB
and QTOTJOB too small for your environment and you make QADLTOTJ and
QADLACTJ excessively large, run-time performance can be impacted.
2.1.3 QMAXACTLVL
This value determines the maximum activity level of the system. This is the
number of all the jobs that can compete at the same time for main storage and
processor resources. If a job cannot be processed because no activity levels
are available, the job is held until another job reaches a time slice end or a long
wait. See Chapter 14 in the Work Management Guide for information about job
state transitions.
Even though the value shipped with V3R7 is *NOMAX, ensure that this is the
setting on your AS/400 system. This is because the value shipped with the
previous releases (prior to V3R1M0) was 100 and normally the system values are
not changed during the update of the operating system. A change to this system
value takes effect immediately.
2.1.4 QMCHPOOL
This system value affects the size of the *MACHINE storage pool. The machine
storage pool contains the highly-shared microcode and operating system
programs. Some of the programs are pageable and some of them are not
pageable. This means that you must be careful when changing the size for this
storage pool because system performance may be impaired if the storage pool
is too small.
Notes:
1. A change to this system value takes effect immediately. The shipped value
   is 20000KB.
2. This value may be changed by the performance adjust support when the
   system value QPFRADJ is set to 1, 2, or 3.
You can also change the setting of the QMCHPOOL system value by using the
Work with System Status display as described in Section 2.4, “WRKSYSSTS
Command” on page 16.
The third way of changing this system value is done by using the WRKSHRPOOL
(Work with Shared Pools) command.
2.1.5 QCMNRCYLMT
This system value provides recovery limits for system communications recovery.
It specifies the number of recovery attempts to make and when to send an
inquiry message to the system operator if the specified number of recovery
attempts has been reached.
The recommended value is (2 5), which means that two communication line or
control unit retries are attempted within a 5-minute interval. Never set the
first value (count limit) equal to or greater than the second value (time
interval) excluding (0 0).
If the count limit is 0, regardless of the time interval, no recovery attempts are
made. When the count limit is greater than 0 and the time interval is 0, infinite
recovery attempts are made. If the count limit is greater than 0 and the
time interval is greater than 0, the specified number of recovery attempts are
made and an inquiry message is sent to the operator after the specified time
interval.
Table 1. QCMNRCYLMT Settings Examples
Count Limit       Time Interval     Action
0                 0                 No recovery
0                 1 through 120     No recovery
1 through 99      0                 Infinite recovery
1 through 99      1 through 120     Count and time recovery
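The rules in Table 1 can be expressed as a small lookup. The following Python
sketch is an illustration only and is not an OS/400 interface: the function
names are assumptions, and the value ranges (0 through 99 and 0 through 120)
are taken from the table. The second function flags settings that break the
guideline that the count limit should be lower than the time interval.

```python
def recovery_action(count_limit, time_interval):
    """Return the Table 1 action for a (count limit, time interval) pair."""
    if not (0 <= count_limit <= 99 and 0 <= time_interval <= 120):
        raise ValueError("count limit must be 0-99 and time interval 0-120")
    if count_limit == 0:
        return "No recovery"
    if time_interval == 0:
        return "Infinite recovery"
    return "Count and time recovery"


def violates_guideline(count_limit, time_interval):
    """True when the count limit is equal to or greater than the time
    interval, with (0 0) as the only exception."""
    if (count_limit, time_interval) == (0, 0):
        return False
    return count_limit >= time_interval


# The recommended (2 5) setting, followed by settings that are not recommended.
for setting in [(2, 5), (0, 0), (5, 0), (10, 5)]:
    print(setting, recovery_action(*setting), violates_guideline(*setting))
```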
An incorrect setting of a QCMNRCYLMT value can cause the system to perform
the line or controller recovery continuously. Under some conditions, the
continuous retries can consume a significant amount of system resources. If this
occurs, stop the process by varying the configuration object off.
2.2 PRTERRLOG Command
The next step of solving a communications performance problem is to verify that
the hardware is functioning properly. This can be done with the PRTERRLOG
(Print Error Log) command that is used primarily for problem analysis tasks. The
command places a formatted printer file of the data in the system error log (in
case there are errors reported) into a spooled printer device file named
QPCSMPRT or into a specified output file.
This command is shipped with public *EXCLUDE authority. The following user
profiles have private authorities to use the command: QPGMR, QSYSOPR,
QSRV, and QSRVBAS.
The first page of the PRTERRLOG command prompt looks similar to the following
display:
(Print Error Log (PRTERRLOG) prompt display; the screen contents are not
reproduced here.)
Figure 5. PRTERRLOG Command Prompt
You can also view the error log by using the System Service Tool as described
in Chapter 5, “Using System Service Tools” on page 71.
If the list produced with the Print Error Log command contains no hardware
errors in lines, controllers, or IOPs, proceed with the next topic. Otherwise,
contact your hardware service provider.
2.3 PTF Commands
This topic provides only part of the information about working with PTFs. For
more information, see Chapter 4 in AS/400 System Startup and Problem
Handling, SC41-4206.
Install the latest cumulative PTF package about every four months or at least
twice a year. This is to ensure that your system has the latest level of code
installed, and usually most of the so-called "performance PTFs" are included in
the cumulative PTF packages.
IBM creates PTFs to correct problems or potential problems found within IBM
licensed programs. PTFs may fix problems that appear to be hardware failures,
or they may provide new or enhanced functions.
2.3.1 DSPPTF
The Display Program Temporary Fix (DSPPTF) command shows the program
temporary fixes (PTFs) for a specified product.
To find out what level of code is running on the system, type the DSPPTF 5716999
command on any command line and you receive the "Display PTF Status"
display. The first line displayed shows you the latest cumulative PTF package
installed on your system.
2.3.2 SNDPTFORD
To find out what the latest PTF package is, enter the SNDPTFORD
PTFID((SF98370)) command and press Enter. If you have a maintenance
agreement with IBM, you receive a file that has information about:
• PTF packages available for Version 3 Release 7
• Installing the latest cumulative package
• Preventive service planning (PSP) information for installing the latest
  cumulative PTF package
• PSP information for installing Version 3 Release 7
• IBM frequently-asked questions about the AS/400 system
• Summary of the Version 3 Release 7 High Impact/Pervasive (HIPER) PTFs
  and PTFs that are in error (PE)
• Complete detailed list of the Version 3 Release 7 PTFs that are in error (PE)
• Complete detailed list of the Version 3 Release 7 High Impact/Pervasive
  (HIPER) problems
• Summary of the generally available Version 3 Release 7 PTFs
Enter the SNDPTFORD PTFID((SF97370)) command to obtain a listing that
provides you with a convenient reference of the License Internal Code fixes and
program temporary fixes (PTFs) that are available by IBM licensed program
categories. This listing is updated regularly. You may choose to order a
PTF/FIX that affects one of your IBM licensed programs.
Enter the SNDPTFORD PTFID((SF99370)) command to order the latest cumulative
PTF package that is available in your country.
Information about the latest performance PTFs can also be obtained by reading
item 130NC in HONE.
2.4 WRKSYSSTS Command
Observe and balance the overall (system wide) performance before focusing on
a communications performance problem. The reason for this is that the
communications performance is only a relatively small part of the overall
performance. If the entire system is functioning poorly, there normally is no use
trying to figure out what might be wrong with communications.
2.4.1 WRKSYSSTS
The Work with System Status display shows the current status of the system in
real time. Use this display to observe the paging fault rates and job transitions.
The indicators you need to pay special attention to (in order of priority) are:
1. Non-database fault rates in the machine pool
2. Non-database fault rates in all the other pools
3. Page rates in all the pools
4. Transition rates in all the pools
Note: When tuning the system, make sure that the machine pool is treated
separately from the other pools.
Use the faulting guidelines in the Work Management Guide manual and
Appendix I, “Guidelines for Interpreting Performance Data” on page 379 to
determine the effects that faulting has on performance. The following examples
may help you to understand the faulting guidelines:
• The response time of an interactive transaction is affected by any faults that
  occur during that transaction. Each fault adds from 10 to 30 milliseconds to
  the end user's response time. For example, if the disk response time is 20
  milliseconds and the transaction has five faults, add about
  0.1 seconds to the total response time.
• Each fault consumes a certain amount of CPU power: the more faults that
  occur, the more CPU is consumed for unproductive work. In the
  following examples, processing the transactions consumes 70% of the CPU
  capability and the faulting rate is 100.
  − On a 9401 class (CPW close to 7) processor, these faults use CPU for 0.6
    seconds.
  − On a 9402 model 2130 class (CPW close to 12) processor, these faults
    use CPU for 0.3 seconds.
  − On a 9406 530 class (CPW close to 132) processor, these faults use CPU
    for 0.02 seconds.
  If the faulting rate of your system is close to the poor end of the faulting
  guidelines tables, approximately 10% to 20% of the CPU is used for faulting.
  Adding main storage to reduce the faulting rate also lowers the CPU
  utilization, thus leaving more processing power available to handle more
  transactions.
• As the faulting rate increases, the amount of disk I/O also increases. If
  you have only a few actuators, these faults can cause the disk utilizations to
  increase more rapidly than if you have many disk arms. As your disk arm
  (actuator) utilization increases, the time to process disk I/Os increases and
  the response times get longer.
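The response time arithmetic in the first example above can be sketched as follows. This is only an illustration: the function names are hypothetical, and the sketch assumes that every fault costs one full disk response time.

```python
def added_response_time(faults_per_transaction, disk_response_ms):
    """Seconds added to one transaction by page faults.

    Assumes each fault waits for one full disk response time.
    """
    return faults_per_transaction * disk_response_ms / 1000.0


def added_response_time_range(faults_per_transaction):
    """Best and worst case in seconds, using the 10 to 30 ms per fault range."""
    return (faults_per_transaction * 10 / 1000.0,
            faults_per_transaction * 30 / 1000.0)


# The example from the text: five faults at a 20 ms disk response time.
print(added_response_time(5, 20))
print(added_response_time_range(5))
```

With five faults per transaction, the 10 to 30 millisecond range gives an added delay of between 0.05 and 0.15 seconds.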
While using the Work with System Status display to analyze a communication
performance problem, concentrate on two storage pools:
*MACHINE pool
  This is the pool in which the OS/400 jobs and microcode tasks run.
  Normally the rate of non-database faults in this pool should be
  below 10 faults per second.
OTHER pool
  This is the pool to which the communications jobs are routed. The
  shipped value for this is the *BASE pool. Investigate the subsystem
  descriptions for the QCMN and QSERVER subsystems to see which
  storage pool is being used by the jobs and focus on that storage pool.
• What is the faulting rate in the *MACHINE pool? See Table 17 on page 379
  for guidelines on non-database page faults in the storage pool. If the rate is
  not acceptable, see the map in Figure 7 on page 20.
• What is the faulting rate in the storage pool used for communications jobs?
• A rule of thumb for the initial Activity Level Factor used for the
  communications subsystem is 500K per activity level (for example, 4000K of
  memory and an activity level of 8 should provide adequate resources for
  interactive work). If 500K per activity level is not enough, add memory to the
  pool or decrease the activity level in the pool.
  Remember to provide enough activity levels in the pool where the
  communication jobs are running or you may experience a significant
  performance degradation. Note that file transfer jobs require
  considerably more memory than interactive jobs; a rule of thumb for a
  file transfer job is 2000K per transfer.
• If you have Client Access/400 users running critical file transfer functions,
  consider separating the transfer jobs into a storage pool of their own. Create
  a new storage pool for subsystem QCMN and direct the routing entry having
  the compare value QTFDWNLD to that pool. The following table describes the
  routing entries that you may work with to override the IBM-supplied default
  values:
Table 2. IBM Supplied Program Routing Entry Compare Values for V3R7
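The memory rules of thumb for the communications pool (500K per activity level, 2000K per file transfer) can be turned into a quick sizing check. The sketch below is only an illustration; the function and constant names are hypothetical and the figures are initial estimates, not measured requirements.

```python
KB_PER_ACTIVITY_LEVEL = 500   # rule of thumb from the text, interactive work
KB_PER_FILE_TRANSFER = 2000   # rule of thumb from the text, file transfer jobs


def initial_pool_size_kb(activity_level):
    """Initial pool size in KB for a given activity level."""
    return activity_level * KB_PER_ACTIVITY_LEVEL


def max_activity_level(pool_size_kb):
    """Largest activity level the pool supports at 500K per level."""
    return pool_size_kb // KB_PER_ACTIVITY_LEVEL


def file_transfer_pool_kb(concurrent_transfers):
    """Memory in KB for a separate pool that runs only file transfer jobs."""
    return concurrent_transfers * KB_PER_FILE_TRANSFER


print(initial_pool_size_kb(8))      # activity level 8, as in the text
print(max_activity_level(4000))     # 4000K pool, as in the text
print(file_transfer_pool_kb(3))     # three concurrent transfers (made-up figure)
```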
1 This column is the most important column of this display. Because
the machine pool contains objects used system-wide, page faulting in this
pool affects all of the jobs in the system. Therefore, it is desirable to
maintain a low page fault rate in this pool. The only way to affect the
paging in the machine pool is to adjust the size of the pool.
See Table 17 on page 379 for guidelines on non-database page faults in
the *MACHINE pool.
2 The rule of thumb for adjusting the machine pool size is to multiply
the number in the "Reserved Size" field by one and a half.
3 This column represents the sum of non-database faults in all of the
storage pools and this is the column you need to focus your attention on.
The non-database faults include program code (jobs' work areas and
variables, for example). To affect the faulting rate in a pool (except the
machine pool), you can change either the size or the activity level of the
pool.
See Table 18 on page 379 and Table 19 on page 380 for guidelines about
the number of faults in storage pools.
4 This column represents the sum of database faults in all of the
storage pools. Remember that a system with no database faults is
a "dead" system. This is because the data may be changed only when
the data is in the main storage and if the data is not in the main storage,
the system issues a fault. When no database pages are brought into the
main storage, not a single piece of data is being changed and no work is
done with the system.
Basically, a fault is an order to go and get a piece of data from a disk to
main storage so that the data can be changed. Technically speaking, a
page fault is a program notification that occurs when a page that is
marked as not in main storage is referred to by an active program.
5 These last three columns (from left to right) represent the jobs' state
transitions. When the pool size and activity level settings are in balance
with each other, the ratio of columns (from left to right) should be 10 to
one. Usually, when the pool size and activity level settings are correct for
the workload, the transition rates fall within the guidelines.
A job running on the system is in one of the following states:
Active      The job is in main storage and it is processing work that is
            requested by the application.
Wait        The job needs to use a resource that is momentarily
            unavailable.
Ineligible  The job has all of the resources required to do the
            processing, but it is waiting for a free activity level.
Wait-to-ineligible transitions need not be zero all of the time. When there
is a momentary period of heavy usage, it may be better to let the jobs
become ineligible to avoid excessive page fault rates or thrashing.
See Table 20 on page 380 for guidelines on the ratio of
Wait-to-Ineligible/Active-to-Wait transitions.
6 The observation period should be kept between five
and 30 minutes. If the observation period is less than five minutes,
occasional peak loads tend to distort the rates of both faults and pages.
On the other hand, if the time period is over 30 minutes, important
data may be lost because the counters holding the data may wrap.
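Two of the rules in the notes above are simple enough to check in a few lines: the machine pool rule of thumb (one and a half times the reserved size) and the five to 30 minute observation window. The sketch below is only an illustration; the function names are hypothetical and the reserved size used in the example is a made-up figure.

```python
def suggested_machine_pool_kb(reserved_size_kb):
    """Rule of thumb from note 2: reserved size multiplied by 1.5."""
    return reserved_size_kb * 1.5


def observation_period_ok(elapsed_minutes):
    """Rule from note 6: observe for at least 5 and at most 30 minutes."""
    return 5 <= elapsed_minutes <= 30


print(suggested_machine_pool_kb(20000))   # made-up reserved size of 20000K
print(observation_period_ok(3))           # too short, peaks distort the rates
print(observation_period_ok(10))
```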
2.4.2 Information About Activity Level Guidelines
Table 3. Activity Level
Resource Description        Where to Look                     Compare With
Activity Level for *BASE    System Report: Storage Pool       Figures given in Chapter 14
and QSPL pool               Utilization, WRKSYSSTS, ADVISOR   in the Work Management Guide.
QINTER Activity Level       System Report: Storage Pool       See Table 22 on page 380.
                            Utilization, WRKSYSSTS, ADVISOR
2.4.3 Information About Transition Guidelines
Table 4. W-I and A-W Ratio
Resource Description        Where to Look                     Compare With
W-I/A-W                     System Report: Storage Pool       See Table 20 on page 380.
                            Utilization, WRKSYSSTS
2.4.4 Interactive Tuning Roadmap
Balancing your main memory and CPU utilization is accomplished by allocating
the memory available and setting the activity levels in the storage pools. Refer
to the Work Management Guide for the guidelines of both the memory and
activity level settings.
Note: You have to repeat Step 4 through Step 7 for all of the other pools in your
AS/400 system; Step 3 is for the *MACHINE pool only. Follow the road map
during periods of high system activity because there is no use tuning the
system when there is only a relatively light workload on the system. Make sure
that system value QPFRADJ is set to zero before following the tuning road map.
Interactive AS/400 Tuning Roadmap
1. Enter command WRKSYSSTS.
Press PF21 to set assistance level to Intermediate.
2. Wait 2-3 minutes and press PF5 to refresh.
3. Does *MACHINE NDB faults meet the guidelines?
a. Yes ... Press PF10 and go to step 4.
b. No .... Adjust QMCHPOOL:
1) -50K if fault rate = 0
2) +50K if fault rate > 3.0
3) Press PF10 to reset and go to step 2.
4. Is the DB fault + NDB fault > 20 in any pool?
a. Yes ... Increase pool size by 50KB, press PF10 and repeat
Step 4 (repeat until all pools are less than 20).
b. No .... Go to step 5.
5. Wait 2-5 minutes, press PF5. Press PF21 to set the Assistance
level to Advanced.
Is the Wait to Ineligible state = 0?
a. Yes ... Reduce Activity level by 2, press PF10 to reset, and
repeat step 5.
b. No .... Go to step 6.
6. Is the Active to Wait state 10x the activity level?
a. No .... System not heavily used or complex application mix,
go to step 4.
b. Yes ... Go to step 7.
7. Is the sum of all fault rates for all pools within guidelines?
a. No .... Go to step 4.
b. Yes ... Go to step 8.
8. Activity levels and pool sizes probably OK. Continue monitoring
WRKSYSSTS display regularly.
Figure 7. AS/400 Tuning Roadmap
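The decisions in steps 3 through 6 of the road map can be sketched as two small functions. This is a reading of the figure, not IBM code. The function names are hypothetical. The sketch assumes that a machine pool fault rate above 0 and up to 3.0 needs no change, and it reads "10x the activity level" as "at least ten times the activity level".

```python
def machine_pool_adjustment_kb(ndb_fault_rate):
    """Step 3: change to QMCHPOOL in KB; 0 means leave the pool as it is."""
    if ndb_fault_rate == 0:
        return -50
    if ndb_fault_rate > 3.0:
        return 50
    return 0  # assumption: rates between 0 and 3.0 meet the guidelines


def pool_action(db_faults, ndb_faults, wait_to_inel, active_to_wait,
                activity_level):
    """Steps 4 to 6 for one pool other than *MACHINE."""
    if db_faults + ndb_faults > 20:                 # step 4
        return "increase pool size by 50KB"
    if wait_to_inel == 0:                           # step 5
        return "reduce activity level by 2"
    if active_to_wait < 10 * activity_level:        # step 6, answer No
        return "light use or complex mix: go back to step 4"
    return "go to step 7: check the sum of fault rates"  # step 6, answer Yes


print(machine_pool_adjustment_kb(4.2))
print(pool_action(15, 10, 0, 50, 5))
```

As in the figure, reset the statistics with PF10 and wait a few minutes after each change before evaluating the pool again.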
2.5 WRKACTJOB Command
The Work with Active Jobs command measures system performance by
measuring aspects such as the CPU usage and response time. The following
examples show the different Work with Active Jobs displays.
To view the Work with Active Jobs display, type WRKACTJOB
and press the Enter key. Press the PF21 key to show more jobs on one display
as in the following example:
• Is a communications job consuming a relatively large amount of CPU?
• Are there any communications jobs creating lots of I/O?
  − If there is, display the job by entering a number five in front of the job
  − Are there many files opened?
  − Are there many logical files opened?
  − Can a similar task be done on a locally attached terminal?
  − Is the response time the same in both cases?
  − If it is, go and see what the application is doing.
  − Can the application itself be modified?
2.6 Using WRKDSKSTS
The Work with Disk Status display shows performance and status information
about the disk units on the system. Type the WRKDSKSTS command on the
command line and press the Enter key. The Work with Disk Status display is
shown:
Note: Before observing disk status, have your system tuned according to either
Figure 7 on page 20 or as described in Chapter 14 in the Work Management
Guide.
When viewing the Work with Disk Status display, pay attention to the percent
busy data, which is the estimated percentage of time the disk unit is being
used during the elapsed time. This estimate is based on:
• The number of I/O requests
• The amount of data transferred
• The performance characteristics of the type of disk unit
Each unit (actuator) should be less than 50% busy. An actuator is the device
within an auxiliary storage device that moves the read and write heads. If each
unit is between 50% and 70% busy, you may experience variable response
times. If all the units are more than 70% busy, the number of actuators is
inadequate for the workload in the system. If you have a well-tuned system with
actuators exceeding the 50% busy guideline, increase the number of disk
actuators.
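The percent busy thresholds above can be applied to the values read from the Work with Disk Status display with a short check such as the following. The sketch is only an illustration; the function names are hypothetical and the sample figures are made up.

```python
def classify_disk_unit(percent_busy):
    """Classify one actuator against the 50% and 70% busy guidelines."""
    if percent_busy < 50:
        return "within guideline"
    if percent_busy <= 70:
        return "variable response times possible"
    return "over 70% busy"


def too_few_actuators(percent_busy_by_unit):
    """True when every unit is more than 70% busy."""
    units = list(percent_busy_by_unit)
    return bool(units) and all(busy > 70 for busy in units)


# Made-up percent busy values for four disk units.
sample = [42, 55, 73, 38]
print([classify_disk_unit(busy) for busy in sample])
print(too_few_actuators(sample))
```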
It is possible to experience unacceptable performance even if only one actuator
exceeds the 50% busy guideline. This usually happens when frequently-used
data is placed on a single actuator. If this happens on your system, use the
Performance Tools/400 licensed program to run the disk report to find out which
data is frequently used. After identifying the data causing the bottleneck, you
can save the data, delete the data, and restore the data to spread it across all of
the actuators.
A batch job accessing the data can cause a short time period of an actuator
exceeding the 50% guideline. If the data is not concentrated on a single
actuator, you notice the high percentage of the utilization moving from one unit
to another unit.
Note: Please remember that observation periods of less than five minutes
usually do not provide reliable results.
To notice either improving or degrading trends in the disk performance, observe
the historical data created from the regularly collected performance data.
2.7 WRKSYSACT Command
This command is a part of the Performance Tools/400 licensed program and is
actually an enhancement of the Work with Active Jobs display. It is the only tool
that shows both external jobs and internal tasks at the same time on the display.
By default, the jobs are sequenced by CPU usage but you can also sequence the
display by I/O.
• Are there any communications-related modules consuming CPU?
• Are there any communications-related modules consuming disk?
  − If the answer is yes, contact your service provider to find out if there are
    any Program Temporary Fixes available for these modules.
Notice!
The performance statistics reported by this function represent activity that
has occurred since a previous collection. This implementation may be
different from other system functions that generally provide cumulative
values until specifically reset.
If the Performance Tools/400 licensed program is installed on your system, enter
the WRKSYSACT command on any command line to receive the following
display:
Figure 12. WRKSYSACT Display View 1 is a Summary Display
Notes:
1 This column displays the run priority of the job.
2 This column displays the CPU utilization of the job.
3 This column displays the total amount of synchronous I/O the job is
causing. Having a low amount of synchronous I/O is important because a
job has to wait for the completion of the synchronous I/O operation before
continuing.
4 This column displays the total amount of asynchronous I/O caused by
the job. The amount of asynchronous I/O is of less importance than the
amount of synchronous I/O because a job can continue processing
immediately after requesting an asynchronous I/O. In a way, an
asynchronous I/O is similar to a batch job; after having submitted it, you
do not have to wait for its completion.
5 By entering "1" in this field, you can monitor this job only and by
entering "5", you access the "Work With Job" display. You can monitor up
to 20 jobs and tasks at a single time.
6 By pressing the PF11 key, you can select from three different
displays: the summary, Synchronous I/O details, and Asynchronous I/O
details.
7 Press the PF24 key for additional function keys to use. Press the
PF14 key to display jobs only and exclude the information for tasks. Press
the PF15 key to display tasks only and exclude the information for jobs.
You can also use the PF16 key to display the jobs/tasks in a descending
order of disk I/O operations.
Figure 14. WRKSYSACT Display View 3, Details of Asynchronous I/O
Chapter 3. Using Performance Tools/400
Performance analysis is a method of investigating, measuring, and correcting
deficiencies so that system performance meets the user's expectations. The
problem solving cycle should be similar to:
1. Understand the symptoms of the problem.
2. Use tools to measure and define the problem.
3. Isolate the cause.
4. Correct the problem.
5. Use tools to verify the correction.
Once the apparent cause (or causes) is isolated, you can propose a solution.
The solution can be something simple such as tuning the storage pools, or a
complex one that requires application recoding.
To achieve the optimum performance, you must understand the relationships
between critical system resources and attempt to balance the use of these
resources: the CPU, the main storage, the disk units, and the
communication lines. However, any improvement can only come through
analyzing the critical resources and contention for both system and application
objects.
3.1 System-Wide Problem Analysis
The ways to analyze the system-wide performance are:
• Using the CL commands as described in Chapter 2, “Using CL Commands to
  Find Performance Problems” on page 11.
• Using the Performance Tools/400 reports and displays as described in this
  chapter.
3.1.1 Advisor
The Advisor provides the easiest way of evaluating the performance data. It is a
tool located between automatic system tuning and performance reports. You
can either enter the Analyze Performance Data (ANZPFRDTA) command or
choose option 10 on the PERFORM menu to start the advisor.
The Advisor uses data collected by the Performance Monitor to recommend
performance tuning changes, and it can also point out other problems affecting
system performance. You can use the Advisor to analyze the performance data
collected from other systems.
The Advisor analyzes one member set of performance data at a time. Select the
member that was collected when the performance problem occurred.
It is easy to find the right time interval to analyze by using the display histogram
function. For example, if you need a time interval when the transactions had the
longest response times, select the transaction response time option on the
Display Histogram display. From the chart, select a time interval by moving the
cursor to that interval, type 1, and press the Enter key. The Advisor analyzes the
performance data collected during that particular interval and gives you
recommendations and conclusions.
The Advisor analyzes performance data, including:
• Storage pool sizes
• Activity levels
• Disk and CPU utilization percentages
• Communications line utilization percentages and error rates
• IOP utilization percentages
• Unusual job activities such as exceptions or excessive use of system
  resources
• Interactive trace data (when collected)
Note!
To avoid causing a serious performance impact while running the Advisor
interactively, start the advisor by entering the ANZPFRDTA command. Press
the PF4 (prompt) key followed by the PF10 (additional parameters) key to
change the value of the DATATYPE parameter from *ALL to *SAMPLE. If you
need to analyze the trace data with Advisor, consider submitting the job or
running it when your system has a light workload.
You can either select all of the intervals or a subset of the time intervals for
analysis. You can run the Advisor either interactively or as a batch job. The
output of using the Advisor is grouped under the following headings:
• Recommendations
• Conclusions
• Interval conclusions
All of the headings have information about:
• CPU utilization of high priority (with a run priority 20 or higher) jobs
• Performance analysis of interactive transactions by using the trace data
  collected with the performance monitor
• Main storage utilization and Wait-To-Ineligible versus Active-To-Wait ratio.
  This addresses page faulting and activity-level analysis.
• Disk utilization and other disk activities
• IOP utilization
• System impact of authority lookups
• System impact of exceptions
• Communication line utilization and error percentages
The Advisor does not:
• Make any recommendations for modifying specific programs to improve their
  performance.
• Analyze noninteractive trace data.
The recommendations may include changes to the system's basic tuning values
that can improve performance. They also may list problems that (when solved)
can solve other performance problems.
The conclusions display lists conditions that may have affected performance
during the data collection. Good examples of conclusions are:
• Thresholds reached
• Save and restore activities
• Communications line errors
You can use the conclusions that are not related to recommendations as guides
for collecting more performance data or for adjusting the system.
The Advisor may suggest changes to pool sizes and activity levels. These
changes are not made dynamically but only after the operator tells the advisor to
make the changes or to ignore the recommendations. The tuning is done by
pressing the PF9 key on the display recommendations display. Pool and activity
level changes can be made to all of the main storage pools on the system.
The Advisor also suggests which report to run to get more information for your
problem analysis. See the "Advisor" chapter of the Performance Tools Guide,
SC41-4340, for detailed information.
3.1.2 Performance Graphics
Performance data collected by the performance monitor can also be displayed in
a graphical format. The graphs can either be displayed interactively, or printed
or plotted to hardcopy. Printing the graphs is recommended
because the graphics on paper are more descriptive. The best printouts are
created by using an *IPDS printer. If you must use a conventional printer,
check the PAGESIZE parameter of the printer file QPPGGPH in the QPFR library.
You can enter the DSPPFRGPH command or the DSPHSTGPH command, or you
may choose option 9 (Performance graphics) on the PERFORM menu. You have two
types of performance graphics to choose from:
• Option 1. Display performance data graphics:
  Performance data graphs are graphs that:
  − Use select/omit criteria.
  − Are run against the original performance data.
  − Are used to show the performance during one data collection only.
• Option 2. Display historical data graphics:
  Historical data graphs are graphs that:
  − Use performance data from several collections.
  − Are useful when tracking performance trends.
The Display Historical Graph (DSPHSTGPH) command produces a graph from
the historical data created by the Create Historical Data (CRTHSTDTA)
command. The DSPHSTGPH command is intended to give you a historical
perspective of the system performance in a graphical presentation.
You must run the Create Historical Data (CRTHSTDTA) command for each
member that you want to include in the graph. If no historical data has been
created for a member, it is not included in the graph unless you specify *YES on
the CRTHSTDTA parameter of the DSPHSTGPH command. You can use the
IBM-supplied format (in the QPFRDATA library) or you can create a format of
your own.
3.1.3 Print Activity Report
The Print Activity Report (PRTACTRPT) command creates a report using the
performance data collected by the WRKSYSACT command.
You may produce two different reports:
Summary report
  This prints out a report showing the top 10 list of:
  • CPU intensive jobs and tasks
  • I/O intensive jobs and tasks
Detailed report
  This prints out either:
  • Selected numbers of entries for each interval specified by the
    PERIOD parameter.
  • The number of entries specified by the NBRJOBS parameter that
    are listed in the order specified by the SEQ parameter.
The following example shows a summary report:
Job or Task      -- Job or task name
User             -- User profile associated with the job
Number           -- Job number
Pty              -- Job or task priority
CPU Util         -- Percent of CPU used by the job or task
Total Sync I/O   -- Total number of synchronous I/O operations
Total Async I/O  -- Total number of asynchronous I/O operations
PAG Fault        -- Number of faults involving the process access group
Synchronous I/O  --
  DB Read        -- Number of synchronous database reads
  DB Write       -- Number of synchronous database writes
  Non-DB Read    -- Number of synchronous non-database reads
  Non-DB Write   -- Number of synchronous non-database writes
Asynchronous I/O --
  DB Read        -- Number of asynchronous database reads
  DB Write       -- Number of asynchronous database writes
  Non-DB Read    -- Number of asynchronous non-database reads
  Non-DB Write   -- Number of asynchronous non-database writes
Figure 15. An Example of PRTACTRPT Output
Notes:
The upper part of the list shows the jobs sequenced by the CPU
utilization.
The lower part of the list shows the jobs sequenced by the disk I/O.
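The two orderings used in the summary report (by CPU utilization and by disk I/O) can be reproduced on exported figures with a short sketch like the one below. It is only an illustration and does not read PRTACTRPT output. The dictionary keys, job names, and numbers are hypothetical, and total disk I/O is assumed to be synchronous plus asynchronous I/O.

```python
def total_io(job):
    """Total disk I/O for one job or task (synchronous plus asynchronous)."""
    return job["sync_io"] + job["async_io"]


def summary_lists(jobs, limit=10):
    """Return (top entries by CPU utilization, top entries by total disk I/O)."""
    by_cpu = sorted(jobs, key=lambda j: j["cpu_util"], reverse=True)[:limit]
    by_io = sorted(jobs, key=total_io, reverse=True)[:limit]
    return by_cpu, by_io


# Hypothetical sample entries; the names and numbers are made up.
sample = [
    {"name": "JOBA", "cpu_util": 12.5, "sync_io": 40, "async_io": 10},
    {"name": "JOBB", "cpu_util": 3.1, "sync_io": 300, "async_io": 120},
    {"name": "TASKC", "cpu_util": 25.0, "sync_io": 5, "async_io": 0},
]
by_cpu, by_io = summary_lists(sample)
print([j["name"] for j in by_cpu])
print([j["name"] for j in by_io])
```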
3.1.4 Performance Tools Reports
Printing performance reports extracts information from previously collected
performance data. You can review the performance of specific jobs or
transactions, or other performance elements. This can be done by choosing
option 3 (print performance report) on the
print performance report display. You may also issue any of the following
commands:
Each of these commands provides you with a different level of information. The
following reports are produced from the sample data collected with the
performance monitor:
• System report
• Component report
• Job report
• Pool report
• Resource report
If you collected trace data with the performance monitor, you can produce:
• Transaction report that can further be extended to:
  − Transaction detail report
  − Transition detail report
• Lock report
• Batch job trace report
In the early stages of problem determination, print only the first two reports
(system report and component report). These help you to determine whether
you need to analyze the problem in more detail or not.
The system report and the component report provide information to evaluate
your system-wide performance. Pay attention to the following items:
• Average response time in the system report workload
• Number of transactions for total run time and per hour
• CPU percent for all levels of priority and also cumulative. The cumulative
  value up to and including priority 20 should not exceed 60 percent provided
  no queries are run interactively.
• Number of database/non-database page faults in each storage pool
• Disk (percentage used and utilization of the actuators)
• Communication lines traffic and IOP utilization
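The cumulative CPU check in the list above (priority 20 and higher should stay at or below 60 percent) can be sketched as follows. This is only an illustration; the function names are hypothetical and the sample percentages are made up. On the AS/400 a lower priority number means a higher priority, so "up to and including priority 20" covers priority values from 0 through 20.

```python
def cumulative_cpu_through_priority(cpu_by_priority, max_priority=20):
    """Sum the CPU percent of all priority levels up to and including max_priority."""
    return sum(pct for pty, pct in cpu_by_priority.items() if pty <= max_priority)


def high_priority_cpu_ok(cpu_by_priority, limit_pct=60.0):
    """True when the cumulative value through priority 20 is within the limit."""
    return cumulative_cpu_through_priority(cpu_by_priority) <= limit_pct


# Made-up CPU percentages keyed by run priority.
sample = {0: 5.0, 10: 12.0, 20: 35.0, 50: 20.0}
print(cumulative_cpu_through_priority(sample))
print(high_priority_cpu_ok(sample))
```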
3.1.4.1 System Report
In the system report, you find the basic set of information to compare against
your predefined performance objectives and the guideline tables as shown in
the Work Management Guide manual.
• The system overview workload and resource utilization part shows you what
  the system workload is and what the cost of processing the workload is. The
  CPU utilization shows the percentage of processing unit time used by each
  job type. According to the guidelines, the total CPU utilization should not
  continuously exceed 81% (for four-way processors). See Table 16 on
  page 379 for other CPU categories.
• Check the percent of space in use and the utilization of disk on the utilization
  part of the report; compare those values to Table 16 on page 379. The "ops
  per second" column and the number of disk IOPs installed on the system show
  whether or not you are overdriving the IOPs. On a normal distribution of disk
  operations, each IOP's average should be between 30 and 60 per second.
• The Avg util and Max util columns on the communication part give you the
  average and maximum percentage of the line capacity used during the
  measured interval. Compare those values to Table 16 on page 379.
If you find any discrepancies between the system performance report and the
guidelines, go to the component report to find out whether you need to do a
problem analysis on the system performance.
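The disk IOP check from the system report can be sketched as a simple division. This is only an illustration and an assumption about how to combine the two figures: it divides the total disk operations per second evenly by the number of disk IOPs installed, which the text implies but does not state. The function names and sample figures are hypothetical.

```python
def ops_per_iop(disk_ops_per_second, disk_iops_installed):
    """Average disk operations per second handled by each IOP."""
    return disk_ops_per_second / disk_iops_installed


def iop_load_status(disk_ops_per_second, disk_iops_installed):
    """Compare the per-IOP average with the 30 to 60 per second guideline."""
    average = ops_per_iop(disk_ops_per_second, disk_iops_installed)
    if average > 60:
        return "above guideline, IOPs may be overdriven"
    if average < 30:
        return "below guideline range"
    return "within guideline range"


print(iop_load_status(180, 4))   # made-up figures: 45 operations per IOP
print(iop_load_status(300, 4))   # made-up figures: 75 operations per IOP
```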
3.1.4.2 Component Report
The component report provides information about the same components as the
system report but at a greater level of detail.
• Component interval activity shows the use of CPU, disk, and pools at
  selected time intervals. For example:
  − Is the transaction rate high in all the intervals?
  − Is the same disk unit suffering from high utilization during all of the
    intervals?
  − Which of the memory pools has the highest faulting rate?
• Job workload activity shows the activities of each job. You need to perform
  problem analysis on a particular job if you find that the job accounts for most
  of the disk I/O operations (under column disk I/O) or CPU utilization (CPU util).
• In the pools storage activity part, you need to look at the columns DB faults
  and Non-DB faults. Compare those values to Table 18 on page 379 and
  Table 19 on page 380. Wait-to-ineligible need not be zero all of the time, but
  it must be less than .25 for good performance. See Table 20 on page 380 for
  the guidelines of activity level changes.
• Disk activity shows average disk activity per hour and the capacity of each
  disk. Batch processing may cause a high utilization of individual disk drives.
  Batch sequential processing can stay on one drive for some time.
  Interactive performance is not normally degraded if the batch jobs are
  running in a storage pool of their own. However, if there are many
  interactive jobs, a high disk utilization can indicate a performance problem.
• The database journal summary includes user journal and system journaling
  of access paths disk write counts. No guidelines are provided so you must
  record this information over time to determine any increase in the disk I/O
  as a result of journaling.
If you need more data on your current system performance before you decide to
analyze, issue the WRKSYSACT command. Refer to Chapter 2, “Using CL
Commands to Find Performance Problems” on page 11 for more information
about that command.
Based on this information, you can decide if there is a problem with the overall
performance of the system.
3.1.5 Memory Performance Displays and Reports
You cannot measure the amount of memory currently in use; you only can
observe the amount of faults that indirectly tell you whether there is enough
main storage or not. The tools used for finding out the memory performance
are:
•
The WRKSYSSTS command
The Work with System Status display shows you in real time what the
demand for main storage is. See Chapter 2, “Using CL Commands to Find
Performance Problems” on page 11 for information about using this
command.
•
The DSPPFRDTA command
The Display Performance Data command provides an interactive interface to
the previously collected data given in the system, component, and interval
reports.
•
DSPACCGRP command and ANZACCGRP command
These commands show for a job or a group of jobs:
−The temporary storage used
−Open files
−File I/O counts
−Active programs
The Display Access Group and Analyze Access Group commands provide data on
the size of the ″currently in use″ part of the PAG. The PAG size can be
affected by reducing the number of active programs, the number of display
and database files open, and the number of display formats and database
buffers allocated for the files. See Section 3.4.2.1, “DSPACCGRP and
ANZACCGRP” on page 51 for more information about displaying and
analyzing the access groups.
3.1.5.1 Where to Find Information About Memory Usage
Table 5. Memory Utilization Information
Resource Description                Where to Look                      Compare With
Machine pool NDB page fault         System Report: Storage Pool        Table 17 on page 379
                                    Utilization, WRKSYSSTS, Advisor
Sum of DB and NDB page faults       System Report: Storage Pool        Table 18 on page 379
for each pool                       Utilization, WRKSYSSTS, Advisor
Sum of DB and NDB page faults in    System Report: Storage Pool        Table 19 on page 380
all pools                           Utilization, WRKSYSSTS, Advisor
Pool size By interval               Pool Report: Pool Activity
The pool with the highest fault     Component Report: Component
rate for each time interval         Interval Activity
3.1.6 CPU Performance Reports and Displays
•
System report
•
Component report
•
Transaction report
If the interactive utilization percentage of CPU is always more than 85, try
modeling to see if a faster CPU can help.
•
The WRKACTJOB command:
This command allows you to determine:
−What is the utilization percentage of CPU?
−How much does each job use CPU, both in terms of percentage and for
how long a time total?
•
The WRKSYSACT command
3.1.7 A Brief Discussion About Program Exceptions Consuming CPU
On pre-V3R6 systems, the Performance Tools Component Report and the Advisor
reported a number of exception types. The report showed the number of
exceptions per second that occurred in each interval; the Advisor showed the
percent of the CPU used by exceptions in an interval. In addition, there is a
set of charts that show the percent of the CPU used as a result of ″n″ number of
exceptions by type per second.
From V3R6, some of the exceptions have been eliminated for one reason or
another, and some of the exception CPU overhead has changed as a result of
the machine implementation or the type of program in which they occur.
3.1.7.1 Program Exceptions
EAO
These are gone from V3R6 as a result of the difference in PowerPC
hardware addressing structure. Any data field or report that had
them has been changed to indicate their absence. See Section
3.1.7.3, “Removal of Effective Address Overflow (EAO) Exceptions in
V3R6” on page 36 for more details on why EAO exceptions used to
occur and why, with PowerPC technology and 64-bit addressing, they
no longer occur.
Size
These are the result of an arithmetic operation in which the receiving
field is too small for the result. They are an application programming
problem and still occur.
The programs should be reviewed and changed to ensure that the
proper receiver field specification is used or that the programming
algorithm performs the function in a manner that avoids size exceptions.
Using Performance Monitor trace data and the Transaction Report
(PRTTNSRPT RPTTYPE(*TRSIT)) may provide additional information
about which programs are getting size exceptions.
Verify
Verify exceptions occur when trying to resolve an as yet unresolved
pointer.
This exception can occur on the RISC machine and is an application
programming problem. The program should be changed to ensure
that the variable used in a CALL instruction does not change from
one use to the next. Use PEX STATS and PEX TRACE to find out
where it is occurring.
Authority
These can occur on the RISC machine and are the result of a system
security setup mismatch.
The same rules apply to fixing these as on pre-V3R6 systems. Use
authorization lists instead of group profiles and ensure that objects do
not have private authorities on them that are less than the PUBLIC
authority. For example, PUBAUT(*CHANGE) and QPGMR(*USE)
causes authority checking to be done.
Decimal Data
This can occur on PowerPC AS/400 systems. It is usually related
to incorrect data specification in application data migrated from other
systems, especially the System/36.
3.1.7.2 CPU Cost Variations
What has changed is the CPU cost for each exception. For the program
exceptions that were tested (Authority, Size, Verify) that can occur on the RISC
machine, the CPU costs varied depending on the type of program model used
when the program was built.
There are three different options that affect the cost:
1. OPM (Original Program Model)
2. ILE with DFTACTGRP(*YES) where the activation group is not already created
3. ILE with DFTACTGRP(*NO)
The results of testing the different exceptions and program models showed
inconsistent variations in the CPU/exception cost. For that reason, there are
multiple costs shown in this document for each type of exception.
The challenge is to know what types of programs are generating the exceptions
on the machine. The exception reporting mechanism does not discriminate
between the types of programs so, in some cases, you have to make an
intelligent guess about the possible severity of the problem.
The test results in Appendix I, “Guidelines for Interpreting Performance Data” on
page 379 were generated by running each type of program on an AS/400 model
500-2142. There were two programs for each case, one that did not get the
exception and one that did. The results were calculated by computing the
difference between the two test cases, calculating the CPU time used per
exception, and extrapolating the results to other PowerPC models using the ratio
of the relative performance ratings between the measured system model and the
other models.
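The calculation described above can be sketched as follows. This is a minimal illustration, not the procedure actually used for Appendix I: the function names and all of the numbers are hypothetical, and it assumes that CPU time per exception scales inversely with the relative performance rating of the model.

```python
def cpu_seconds_per_exception(cpu_with, cpu_without, exception_count):
    """CPU seconds attributable to one exception on the measured model.

    cpu_with    -- CPU seconds used by the program that gets the exception
    cpu_without -- CPU seconds used by the matching program that does not
    """
    if exception_count <= 0:
        raise ValueError("exception_count must be positive")
    return (cpu_with - cpu_without) / exception_count


def extrapolate_cost(cost_measured, rating_measured, rating_other):
    """Scale a per-exception cost to another model.

    Assumption: cost is inversely proportional to the relative performance
    rating, so a model rated twice as fast pays half the CPU time.
    """
    return cost_measured * rating_measured / rating_other


# Hypothetical measurements, for illustration only.
cost = cpu_seconds_per_exception(cpu_with=12.5, cpu_without=2.5,
                                 exception_count=100_000)
cost_on_faster_model = extrapolate_cost(cost, rating_measured=1.0,
                                        rating_other=2.0)
print(cost, cost_on_faster_model)
```

Multiplying the per-exception cost by the exceptions-per-second figure from the Component Report gives a rough fraction of CPU consumed by that exception type.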
3.1.7.3 Removal of Effective Address Overflow (EAO) Exceptions in
V3R6
Effective Address Overflow exceptions on the S/38 and CISC AS/400 systems
result from two different views of the six-byte address. This address really has
two parts, the segment identifier that identifies a unique group of virtual
addresses (called a segment or segment group) and an offset that identifies
specific locations within the segment.
S/38 hardware was designed with the assumption that the system would use a
larger number of smaller segments. Thus, the hardware treated the six-byte
address as four bytes of segment identifier and two bytes of offset. This allows
over four billion segments each with a maximum size of 64KB. The software
designers felt that the system needed a smaller number of larger segments and
treated the address as three bytes of segment identifier and three bytes of offset
that allows over 16 million segments each with a maximum size of 16MB.
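The two views of the same six-byte address can be illustrated with a short sketch. The function names and the sample address are hypothetical; only the byte splits (four plus two for the hardware, three plus three for the software) come from the text.

```python
ADDRESS_BYTES = 6  # the six-byte address described above


def split_address(address, segment_id_bytes):
    """Split an address into (segment identifier, offset).

    segment_id_bytes is 4 for the hardware view and 3 for the software view.
    """
    offset_bits = (ADDRESS_BYTES - segment_id_bytes) * 8
    segment_id = address >> offset_bits
    offset = address & ((1 << offset_bits) - 1)
    return segment_id, offset


def max_segment_size(segment_id_bytes):
    """Largest segment, in bytes, that the offset part can address."""
    return 1 << ((ADDRESS_BYTES - segment_id_bytes) * 8)


address = 0x0000A1FFFF00  # hypothetical address, for illustration only
hardware_view = split_address(address, 4)  # 4-byte segment id, 2-byte offset
software_view = split_address(address, 3)  # 3-byte segment id, 3-byte offset
print(hardware_view, software_view)
```

The same address falls in segment 0xA1 for the software but in "hardware" segment 0xA1FF, which is why one software segment can contain many hardware segments.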
The hardware detected any operation that spanned a 64K boundary and raised
an Effective Address Overflow exception when this happened. According to the
hardware's view of the address, this was always a bad thing in that someone
had tried to cross from their segment into what was potentially someone else's
segment. But due to the software design, this was generally not a problem. For
example, a program could be storing a piece of data that crossed from the first
64K of a segment into the second 64K of the segment. (Note that the 16MB
segment viewed by the software can contain up to 256 of the 64K ″hardware″
segments.) Thus, the software needed to look at EAO exceptions and decide if
they were bad or not. Because IMPI operations cannot use operands spanning a
64K boundary, this handler also needed to look at the good exceptions and
decompose the instruction into pieces that do not span a 64K boundary. For
example, a Move Character operation might have a target operand of 10 bytes
that spanned a 64K boundary. This move needs to be split into two Move
Character operations, one to move the first part prior to the 64K boundary and
one to move the second part after the 64K boundary. Needless to say, this
software EAO exception handler could and did impact performance at times.
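The decomposition performed by the handler can be sketched as follows. This is an illustration of the idea only, not the actual handler logic; the function name and the sample offset are hypothetical.

```python
BOUNDARY = 64 * 1024  # the 64K boundary seen by the hardware


def split_at_64k(offset, length):
    """Break an operand into pieces that never cross a 64K boundary.

    Returns a list of (offset, length) pairs covering the original operand.
    """
    pieces = []
    while length > 0:
        room = BOUNDARY - (offset % BOUNDARY)  # bytes left before the boundary
        chunk = min(length, room)
        pieces.append((offset, chunk))
        offset += chunk
        length -= chunk
    return pieces


# A 10-byte target operand that starts 6 bytes before a 64K boundary.
pieces = split_at_64k(offset=0xFFFA, length=10)
print(pieces)
```

For the 10-byte example this yields two moves, one of 6 bytes before the boundary and one of 4 bytes after it; an operand that does not span a boundary is returned as a single piece.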
Because the AS/400 PowerPC hardware is a new design using eight-byte
addresses, there is no longer a mismatch between the hardware and software
views of the address. There is no longer any need for the software handler
previously described and no corresponding performance impact. Hence, EAO
exceptions are not reported for PowerPC AS/400 systems.
3.1.7.4 Where to Find Information About CPU Usage
Table 6 (Page 1 of 2). CPU Utilization Information
CPU Queuing MultiplierTransaction Report: Job Summary,
System Summary Data, System
Report
CPU Queuing Multiplier by Job
Priority
Total CPU usage by job typeSystem Report: Resource utilization
System Report
expansion
Table 16 on page 379
Table 34 on page 386
Table 6 (Page 2 of 2). CPU Utilization Information
Resource Description                Where to Look                      Compare With
Total CPU usage by individual       Transaction Report: Job Summary,
jobs                                System Summary Data
CPU utilization and seconds per
job and system task
CPU Usage by Subsystem and          Pool Report: Subsystem activity    Historical Data
Pool by Interval
Job Maximums of CPU, I/O,           Pool Report: Subsystem Activity
Transactions and Response Time
by Pool
CPU Time by Job Per Interval        PRTACTRPT, Component Report
CPU Time by LIC Task Per            PRTACTRPT, Component Report        Historical Data
Interval
Table 7. Information about Exceptions Consuming CPU
Resource Description                Where to Look                      Compare With
Authority Lookup                    Component Report: Exception        Table 26 on page 382
                                    Occurrence Summary
Size (Arithmetic Overflow and       Component Report: Exception        Table 28 on page 383
Binary Overflow)                    Occurrence Summary
Verify                              Component Report: Exception        Table 31 on page 384
                                    Occurrence Summary
Note: Even though one exception consumes only a relatively small amount of CPU at a time, the
cumulative effect of exceptions can add a significant workload to the CPU.
3.1.8 Disk Performance Reports and Displays
•
The system report shows you:
−The disk I/O by job type (batch, system, interactive, pass-through, and so
on)
−The IOP utilization percentage
−The ASP number and mirrored units
−The disk unit size
−The I/O rate per disk unit
−The disk IOP and device service time
•
The component report shows you:
−The synchronous and asynchronous disk I/O per second, displayed by
interval.
−The summary of the highest used device in the interval.
−The synchronous and asynchronous disk I/O per job total.
−The summary of database journal deposits (entries), bundle (blocks of
deposits) writes for both user journaling and for system managed access
path protection (SMAPP) support, system access path journal deposits
and bundle writes, and access path recovery time estimates.
−Per interval and by unit:
- The utilization percentage
- The size
- The number of overruns and underruns
- The seek activity
•
Transaction report - summary report
−Shows synchronous and asynchronous disk I/O per transaction per job
•
Resource report
−Shows you by interval:
- The number of disk I/O per second
- The number of reads and writes per second
- The average amount of data transferred per disk I/O
- The highest utilization and service time disk unit
- The total disk space used
−Shows per unit and interval:
- The unit identification data per disk unit including:
Bus
IOP
ASP
- The number of reads and writes per second
- The average data transfer size
- The unit service time average
- The IOP service time average
- The average device I/O queue depth
•
Pool report
−Shows the highest number of disk I/O operations by a job running in a
pool during an interval
•
The WRKSYSACT command:
−Shows the number of disk I/O operations by job and LIC task. These are
further separated as synchronous and asynchronous operations.
•
The WRKDSKSTS command:
−The Work with Disk Status display shows performance and status
information about the disk units on the system. It displays the:
- Number of units currently on the system
- Type of each disk unit
- Size of disk space
- Percentage of disk space used
- I/O requests per second
- Average size of the I/O requests
- Average number of read and write requests
- Average amount of data read and written
- Percentage of time the disk is being used
3.1.8.1 Where to Find Information About Disk Performance
Table 8. Disk Utilization Information
Resource Description                Where to Look                      Compare With
Disk Arm Utilization                System Report: Disk Utilization,   Table 16 on page 379
                                    WRKDSKSTS
Disk IOP Utilization                Component Report: IOP Utilization  Table 16 on page 379
Disk Physical I/O per Transaction   System Report: Resource            Table 35 on page 386
(Average)                           Utilization, Transaction Report
Disk Physical I/O per Transaction   Transaction Report: Job Summary    Table 35 on page 386
per Job
Synchronous and Asynchronous        Job Interval Report                Historical Data
DB and NDB I/O per Job by
Interval
Sync and Async Disk I/O per Job     PRTACTRPT
or LIC Task per Interval
Database journal deposits and       Component Report: Database
bundle writes to user and system    Journal Summary
(SMAPP) journals
Sync and Async Disk I/O by          Pool Report: Subsystem activity
Subsystems and Pools by Interval
Note:
Use the Write, Read, and Total Physical Disk I/O per transaction values shown in Table 35 on page 386
as a ″reasonability measure″. Verify that any job exceeding the values is performing the work required.
Note that any asynchronous disk I/O performed by the system QDBSRVnn jobs on behalf of a user job
is not included in the job's asynchronous I/O totals shown on performance reports.
See Table 36 on page 387 to find where the different types of job and disk I/O activities are counted.
3.1.9 Communications Performance Data
The performance monitor can optionally collect remote response time data from
5494 remote controllers with Microcode Release 1.1 or later installed on the
5494. Communication IOP and line performance data is always collected.
If you have 5494 workstation controllers included in the data collection, you have
information about:
•
The number of active workstations on each controller
•
The range of response times for the remote workstations
•
The average response time for the remote workstations
The Performance Tools/400 system report and resource report list this 5494
response time data. Note that when entering the STRPFRMON command, you
must set the RRSPTIME (remote response time) parameter to a value other than
the default *NONE to have the response time data collected from the active 5494
controllers. A value of *SYS uses the same response time slots (0-1 seconds,
and so on) as for the LRSPTIME (local response time) parameter.
•
The component report shows you:
−The workstation IOP utilization
−The multifunction IOP utilizations
−The twinaxial line utilization for local workstation IOPs
It is possible to have either high local workstation IOP utilization and low
twinaxial utilization or low local workstation IOP utilization and high twinaxial
utilization. High IOP utilization can occur if there is heavy use of the text
assist functions for an OV/400 editor. High twinaxial utilization can occur if
there is a significant amount of high-speed printer output, Client Access/400
shared folder activities, or file transfer work going on.
•
The system report shows you:
−Both the average and peak line utilization over the report period
•
The resource interval report shows communication line details per time
interval selected.
•
Resource Report:
−Additional line utilization data by interval
−Response time counts per ″response time buckets″ for local workstations
and optionally for remote 5494-attached workstations
•
Query:
−The performance tools reports do not include all data or they show
certain combinations of data. A common use of a query is to tie together
more complex analysis structures such as jobs, pools, lines, and so on.
−Display Performance Data (DSPPFRDTA) command:
- Provides an interactive access to information contained in system,
component, and resource reports.
−Work with System Activity (WRKSYSACT) command:
- LIC communication task activity (CPU, disk I/O, frequency)
−QSYSOPR message queue:
- Error failure, threshold, and communication job start and end
messages are found in the system operator's message queue.
−QHST log:
- Error failure, threshold, and communication job start and end
messages are found in the history log.
−Communication error log:
- Communication errors are logged in the system error log regardless
of Performance Monitor activity. Each entry is time stamped. Use
the STRSST command to view the logged data. Assistance from your
service provider in interpreting the log data is needed in most cases.
See Chapter 5, “Using System Service Tools” on page 71 for
information about using the service tools.
3.1.9.1 Where to Look for Information About Communications
Performance
Table 9. Line and IOP Utilization Information
Resource Description                Where to Look                      Compare With
Local WS IOP                        Component Report: IOP Utilization  Table 16 on page 379
Multifunction IOP                   Component Report: IOP Utilization  Table 16 on page 379
Communication IOP                   Component Report: IOP Utilization  Table 16 on page 379
File Server IOP                     Component Report: IOP Utilization  IOP reported is the one
                                                                       for exchanging data
                                                                       between the Integrated
                                                                       PC Server and AS/400
                                                                       Disk. No guideline
                                                                       available at this time.
                                                                       Attached LAN lines are
                                                                       reported under remote
                                                                       lines, LAN lines. See
                                                                       Appendix F, “Integrated
                                                                       PC Server Query” on
                                                                       page 345 for information
                                                                       on cache read/write hit
                                                                       and 486 CPU utilization
                                                                       percentage guidelines.
Remote Lines, LAN Lines             System Report: Communication
                                    Summary, Resource Report
Communications I/O Count by Job     System Report: Resource
Type                                Utilization
Line Utilization and Activity       System Report: Communications      Table 16 on page 379
(input/output)                      Summary
Communications Gets and Puts        System Report: Resource
per Transaction by Job type         Utilization Expansion
Communication I/O Per Job           Component Report: Job Workload
                                    Activity
Local and Remote Workstation        Component Report: Local Work
Response Time Distribution          Stations - Response Time Buckets
Local and Remote Workstation        Resource Report: Local             Historical Data
Response Time Distribution By       Workstation IOP Utilization and
Interval                            Remote Workstation Response
                                    Times
3.1.10 Activity Level Performance Reports and Displays
•
System report:
−Displays Job State changes (movement in and out of activity level) per
pool for the total collection period.
•
Component report shows you:
−Job State changes by pool summarized over selected time intervals.
•
Pool report shows you:
−Job State changes by subsystem and pool for each selected time
interval.
−Pool activity level for each interval. This may change during the time
period due to operator action, an OEM automatic tuner, or the OS/400
automatic tuning through QPFRADJ. The value shown is the value at the
time of the sample.
•
The DSPPFRDTA command:
−Provides interactive access to database including system, component,
and pool interval report data.
•
The WRKSYSSTS command
−Provides real-time information on activity level usage and job state
changes. See Chapter 2, “Using CL Commands to Find Performance
Problems” on page 11 for information about using this command.
3.1.11 Comparing with Activity Level Guidelines
Table 10. Activity Level Information
Resource Description                Where to Look                      Compare With
Activity Level for *BASE and        System Report: Storage Pool        Figures 14-8, 14-9, and
Spooled Writer pool                 Utilization, WRKSYSSTS, ADVISOR    14-10 in the Work
                                                                       Management Guide, page
                                                                       14-10.
QINTER Activity Level               System Report: Storage Pool        See Table 22 on
                                    Utilization, WRKSYSSTS, ADVISOR    page 380.
3.1.12 Comparing W-I and A-W Ratio Guidelines
Table 11. W to I and A to W Ratio Information
Resource Description                Where to Look                      Compare With
W-I/A-W                             System Report: Storage Pool        Table 20 on page 380
                                    Utilization, WRKSYSSTS
3.2 User Level Problem Analysis
The first step in determining a user level problem is to identify the affected user
or users. The following questions are good examples of how to start the user
level problem analysis:
•
Are all of the users affected by poor performance or is there only a small,
easily-defined group of users affected?
•
What do these users have in common?
•
Are they using the same application?
•
Are they sharing the same (possibly small) memory pool?
•
Is there only one user suffering from poor performance?
−How does this user differ from the rest of the users?
After answering these questions, the solution is much closer.
User level problem analysis is done by:
•
Using the WRKACTJOB command:
For information about using this command, see Section 2.5, “WRKACTJOB
Command” on page 21.
•
Using the WRKSYSACT command:
For information about using this command, see Section 2.7, “WRKSYSACT
Command” on page 24.
•
Using DSPPFRDTA command
•
Analyzing Performance Tools/400 reports.
3.2.1 Print Job Summary Report
Find out if the user appears in the ″job statistics″ section of the ″job summary
report″. Next look at the ″individual transaction statistics″ section to see what
programs are used. Is this user the only one using this program? If not, is this
user the only one with a performance problem? If all of the users of this
program have problems with performance, see Section 3.3, “Application Level
Problem Analysis” for more information.
Refer to the ″system summary data″ section, (″analysis by interactive response
time″) to see how your response time objectives are met.
3.2.2 Print Transaction Summary Report
The transaction summary report provides you with information about response
times, CPU utilization, and disk I/O by job. This report can be used for both
user-level problem analysis and application-level problem analysis.
If the job summary section shows jobs that have high response times, high disk
I/O activity, or high CPU utilization, use the transaction detail report to
investigate further. However, always print the summary report first because both
the transaction detail report and the transition report provide detailed
information. By using the summary report, you can choose to print only the
intervals or users that have performance problems instead of printing thousands
of pages of irrelevant data.
3.3 Application Level Problem Analysis
Is there a problem with one application only?
Are there only some operations that are slow?
Application level analysis is based on the Performance Tools/400 reports.
See the ″Interactive Program Statistics″ section of Job Report for the top 10
programs with the highest resource utilization such as:
•
CPU per transaction
•
Disk I/O per transaction
•
Response time per transaction
•
Database reads/writes per transaction
•
Non-database reads/writes per transaction
Compare this information with Table 35 on page 386 to see if values are
acceptable.
Please note that values shown are guidelines only so you must verify that each
transaction exceeding the values is performing the work required.
3.3.1 Charging Resource Utilization to Interactive Program
The Job Summary report, Individual Transaction Statistics, and Interactive
Program Statistics data list several categories of performance metrics commonly
referred to as the 10 worst. Some of these metrics are the transactions with the
longest CPU service time and transactions with the longest lock wait time. Each
of these categories identifies a program that is charged with consuming that
resource.
The program name listed is the program first doing a workstation output
operation following the receipt of the workstation input. The receipt of the input
is used to signal to the system the beginning of a transaction. In many
application environments, this accurately reflects the program doing the work.
On the other hand, in many application environments, the program called by that
program actually consumes the system resource. Therefore, further analysis of
the identified program (and programs it called) is required to find out what
program really consumes the resources.
A good example of this kind of situation is the OS/400 User Interface Manager
program appearing in the list showing the worst 10 programs. Frequently,
QUIINMGR and QUIMNDRV are in this list even though these programs are
almost never responsible for high consumption of a system resource. Usually,
the functions and programs called from a menu display are the ones responsible
for consuming the system resources. You need to choose one specific job and
print the Transition Detail Report to see what is happening below the QUIxxxxx
program falsely accused of stealing the system's resources.
One example of this false interpretation is doing a Send Network File command
from any system menu. The SNDNETF function does no workstation I/O but it
may lock the display station while copying a file to an internal space (on disk) for
a later delivery done by SNADS functions. When the SNDNETF command
completes its work, a user receives a message indicating the results and this
message is written by QUIINMGR.
3.3.2 Print Transaction Summary Report
From the transaction report, you can select those programs that show a frequent
high resource utilization. These programs should be analyzed in deeper detail
using the other tools listed in Section 3.4, “Programmer Performance Utilities” to
find out the cause of the problem.
3.3.3 Print Transaction Detail Report
If you need a more detailed problem analysis, print a transaction detail report by
specifying RPTTYPE(*TNSACT) on the PRTTNSRPT command. The transaction
report output has two parts:
•
The details, which show data about every transaction in the job.
•
The summary, which shows data about overall job operation.
If there are response times that are not acceptable compared to your objectives,
read the report further.
The next section to look at is the job summary data and especially the
synchronous disk I/O counts. If there are, for example, 200 DB Reads (database
read operations) per transaction, the response times are surely unacceptable.
3.3.4 Print Transition Report
If you want to know all of the state changes within a transaction, run the
Transaction report by specifying RPTTYPE(*TRSIT) on the PRTTNSRPT command.
Be Careful!
Remember to use the select/omit parameters or you receive several
thousand pages of printout while adding a significant workload to your
system.
The transition report is composed of two sections:
•
Transition detail, which shows each state transition made by the job, for
example, active-to-ineligible and transaction boundaries. For a brief
discussion about transaction boundaries, see the index entry for trace points.
•
Summary, which shows the same data as the summary output from the
transaction report.
You may see in the transaction report (seize/lock conflict reports) that object
″ADDR 00000E00 0002IUSE″ is being held for a relatively long time. This refers to
the internal object ″database file in use table″, which indicates frequent
occurrences of one of the following conditions:
•
File opens/closes
•
File creates/deletes
•
Clear physical file member
•
Reorganize physical file member, and so on.
Since these functions have a significant impact on system and job performance,
reduce their usage.
You may also see the I/O transaction boundaries in the transaction report. They
indicate the trace points such as:
•
SOTn
−Start of a transaction
−Start of the response time for that transaction
−n represents various transaction types.
•
SOR
−Start of resource utilization time
•
EORn
−End of response time for the transaction
•
EOTn
−End of resource usage time
−End of the transaction
See the Performance Tools/400 Guide, SC41-4340, for more details.
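The relationship between the four trace points can be sketched as a small calculation. The function name, the timestamp unit (milliseconds), and the sample values are hypothetical; only the pairing of the trace points (response time runs from SOT to EOR, resource usage time runs from SOR to EOT) is taken from the list above.

```python
def transaction_times(sot, sor, eor, eot):
    """Derive the two durations bounded by the trace points.

    sot -- start of transaction (start of response time)
    sor -- start of resource utilization time
    eor -- end of response time
    eot -- end of transaction (end of resource usage time)
    All values are timestamps in the same unit.
    """
    return {
        "response_time": eor - sot,
        "resource_usage_time": eot - sor,
    }


# Hypothetical timestamps in milliseconds, for illustration only.
times = transaction_times(sot=0, sor=120, eor=900, eot=950)
print(times)
```

Note that the two intervals overlap but are not identical: resource usage can start after the response time clock starts and can continue after the response has been delivered.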
3.4 Programmer Performance Utilities
The tools described in this part are not meant to be used for all of the cases with
performance problems. These tools are meant to be used only as a last resort if
none of the other tools provide you with the information required.
Usually the data acquired by using these tools is used for tuning the application
only. These tools normally provide only a limited amount of data about
performance in the communications area.
3.4.1 OS/400 Utilities for Tracing a Job
The following OS/400 commands may be used to produce trace job information:
•
STRSRVJOB
The Start Service Job command starts the remote service operation for a
specified job (other than the job issuing the command) so that other service
commands can be entered to service the specified job. Any dump, debug,
and trace commands can be run in that job until the service operation ends.
The service operation continues until the End Service Job command is run.
To use this command, you must be signed on as QPGMR, QSYSOPR, QSRV,
or QSRVBAS, or have *ALLOBJ authority.
•
ENDSRVJOB
The End Service Job command ends the remote job service operation. This
command stops the service operation that began when the Start Service Job
command was entered.
To use this command, you must be signed on as QPGMR, QSYSOPR, QSRV,
or QSRVBAS, or have *ALLOBJ authority.
•
TRCJOB
The Trace Job command controls traces of Original Program Model (OPM)
programs and Integrated Language Environment (ILE) procedure calls and
returns that occur in the current job or in the job being serviced as a result
of the Start Service Job command directed to that job. The command, which
sets a trace on or off, can trace module flow, operating system data
acquisition (including CL command traces), or both.
Restrictions for using the TRCJOB command:
1. The record format of the database output file must match the record
format of the IBM-supplied output file QATRCJOB.
2. The number of trace records processed between the start and end of the
trace must not exceed one million.
3. This command is shipped with public *EXCLUDE authority.
The following user profiles are authorized to use this command: QPGMR,
QSRV, QSRVBAS, QSYSOPR, and QRJE.
The following display is an example of starting a job trace:
Figure 17. An Example of Output Created with TRCJOB Command
Note: This is the first of the eight pages produced by tracing the Display Job
command so please be careful when tracing a job. To avoid producing
thousands of pages of printout, run the job trace for a short period of time only.
3.4.2 Performance Tools/400 Utilities for Tracing a Job
Performance Tools/400 provides some additional commands for gathering trace
information:
•
STRJOBTRC
The Start Job Trace command starts the job tracing function to collect
performance statistics for the specified job. After job tracing is started, a
trace record is generated for every:
−External (program) call and return
−Exception
−Message
−Workstation wait in the job
At least two (usually more) trace records are generated for every I/O
statement (open, close, read, and write) in a high-level language program.
•
ENDJOBTRC
The End Job Trace command turns off the job tracing function. It also:
−Saves all of the collected trace records in a database file.
−Optionally produces reports.
You may also use the Print Job Trace (PRTJOBTRC) command to produce
reports from the same data.
Notice!
Tracing has a significant effect on the performance of the job being traced. It
also affects the performance of the system in general, but to a lesser extent.
The following trace examples are produced by using the Start Job Trace
(STRJOBTRC) command followed by End Job Trace (ENDJOBTRC) command.
Transaction ending program . . .   ENDTNS   QT3REQIO
Transaction starting program . .   STRTNS   QWSGET
Job name . . . . . . . . . . . .   JOB      ENDJOBTRC
Job description  . . . . . . . .   JOBD     QPFRJOBD
  Library  . . . . . . . . . . .            *LIBL
Figure 18. ENDJOBTRC Command Prompt
Note:
1 By entering *SUMMARY, you request two reports that summarize the job
trace data by workstation transaction. One report shows primarily physical
disk activity; its printer file is QPPTTRC1, and its page heading includes
the text "Trace Analysis Summary". The other report concentrates on
higher-level activities such as database I/O and inter-program transfers
of control; its printer file is QPPTTRC2, and its page heading includes the
text "Trace Analysis I/O Summary".
Based on your needs, you may use either of the job tracing functions because
they show different kinds of data. The OS/400 Job Trace shows the job flow, and
the trace obtained with the Performance Tools/400 shows the number of different
disk I/O operations.
[Trace report output: alternating WAIT-ACT and ACTIVE rows followed by AVERAGE and TOTAL lines. The column values are not recoverable from the source.]
[Report column headings: PROGRAM; PROGRAM DATA BASE I/O; FULL, SHARE, SUBFILE.]
Figure 20. Trace Analysis I/O Summary
The trace job outputs are used to determine the following information that can
be used to analyze job performance:
•
Programs called and calling sequence and frequency
•
Wall clock time of the program call and return sequence
•
CPU time used by each program
•
The number of synchronous DB and NDB disk I/Os per program called
•
The number of full and shared file opens
•
Messages received by each program
Do not use the wall clock time (TIME heading) or CPU time (CPU TIME heading)
to estimate the actual time used by each program. The implementation of a
trace job inflates the real values to those shown in the trace job data. However,
you can use the time values to identify relative differences among the programs
listed.
Attention!
Be aware that tracing a job with hundreds of user program or procedure
calls may have a significant impact on CPU utilization.
3.4.2.1 DSPACCGRP and ANZACCGRP
Use these commands to analyze process access group (PAG) activity.
Collect the data with the DSPACCGRP command and direct the output to a
database file. The command lets you select jobs by generic job or user name,
or by type (interactive or all).
Use the ANZACCGRP command to print a summary of the data in the file. For
each job type, it shows:
•
How many jobs exist
•
The number of files that are in use in each job, and the amount of I/O done
by the job
•
What files are open in the system, what duplicate files a job may have, and
the amount of I/O going on for each file
•
The active programs within the jobs selected
Analyze job PAGs to see if savings can be made. Opening and closing
seldom-used files each time they are used saves buffer space. In some cases,
display files have many formats but a job uses only one or two. Placing these
formats into a separate display file (for example, based on application function)
can reduce PAG size. This reduces the number of disk I/O operations to read
and write the PAG and saves space while the PAG is in memory. This is
valuable on a system with limited main storage.
3.4.2.2 The Performance Explorer
If the tools introduced earlier in this chapter do not give you enough information,
you might consider using the Performance Explorer. The Performance Explorer
is a combination of the Timing and Paging Statistics Tool (TPST) and the Sampled
Address Monitor (SAM). The use of Performance Explorer is beyond the scope of this
publication, but if you have used either TPST or SAM earlier, you should have no
problems with Performance Explorer. For detailed information about using the
Performance Explorer, see the AS/400 Performance Tools/400 Guide, SC41-4340.
3.5 Performance Data Conversion
You can analyze performance data collected on a system running an earlier
release of OS/400, but the files must be converted before the current (V3R7)
level of Performance Tools can use them.
This is done by running the Convert Performance Data (CVTPFRDTA) command
against the down-level performance data.
The conversion may be done in the library in which the current data resides, or
to a different library. If the conversion is done in the same library, the current
data is replaced by the new data. If the conversion is done to a different library,
the new data exists in the new library while the "back level" data continues to
exist in the "old" library.
Note: To avoid the risk of destroying the old data if the command ends
abnormally, convert the data into a different library (To library prompt (TOLIB
parameter)), and later, delete the data from the old library (From library prompt
(FROMLIB parameter)). Data conversion may affect the response times of other
transactions. Consider submitting it during a period of low CPU
utilization.
To be able to analyze performance data collected on a V3R2 level operating
system, you must specify either TGTRLS(V3R1M0) or TGTRLS(*PRV) when saving the
performance data library.
Chapter 4. Using BEST/1 for Communications Performance Analysis
and Capacity Planning
This chapter discusses using the BEST/1 function of the Performance Tools to
analyze communications performance problems. The following major topics are
covered in this chapter:
•
V3R7 AS/400 capacity planning
•
Creating a model for communications capacity planning analysis
•
Using a model for communications capacity planning analysis
•
Changing communications resources
•
BEST/1 communications support for performance capacity analysis
•
BEST/1 considerations when analyzing communications data
4.1 V3R7 BEST/1 Capacity Planning
Creating a model of the current system is the most common use of the capacity
planning tool. Use the model to see how changing either the system
configuration or the workload affects performance. You can find out the
effect on remote response time, line utilization, IOP utilization, CPU utilization,
and other parameters.
The scenario of changing either hardware or workload and re-analyzing the data
and viewing the results is discussed in this chapter.
If you are interested in seeing how the changes to your hardware configuration
affect communications performance, you can do so by using the V3R7 BEST/1
Capacity Planning, which is a part of the Performance Tools/400 program
product. If you are not familiar with BEST/1 Capacity Planning, it is highly
recommended that you review the following manual:
•
AS/400 BEST/1 Capacity Planning Tool, SC41-3341
4.1.1 When to Use BEST/1 for Communications Performance Analysis
Use BEST/1 modeling when the communications performance does not meet the
predefined objectives or when you know that there are major changes coming to
either the workload or the system configuration. For example, you can predict
the impact on system performance of adding 100 new users to the existing
configuration. You can also see how replacing a 2626 IOP with a 6506 IOP
affects either response times or the number of transactions completed.
The first step in analyzing the communications performance data collected by
using the Start Performance Monitor (STRPFRMON) CL command usually is to
use the Advisor tool. See Chapter 3, “Using Performance Tools/400” on
page 27 for information about using the advisor tool for communications
performance analysis. By using BEST/1 with the real communications
performance data, you can simulate beforehand what happens if you change, for
example:
•
The line speed of a communication line
•
A communication IOP
•
The CPU model
•
The size of main storage
•
DASD configuration
4.2 Creating a Model for Communications Analysis
The purpose of this section is to show you the steps of building a model using
performance measurement data.
When building a model, choose a performance data member that represents a
normal workload on the system. If you choose a member with only a few active
jobs, the results you obtain may not help you in your search for the performance
bottleneck. The heavier the workload in the performance data file member you
choose to build the model from, the more usable the model is that you create.
If the created model can handle a workload significantly heavier than the one
you have in real life, the system performance after the configuration changes
should be acceptable in real life also. Usually, the results acquired by using the
BEST/1 tool are accurate within five percent.
The following sections describe some of the displays associated with these
steps. For a complete step-by-step demonstration, see the AS/400 BEST/1
Capacity Planning Tool Guide.
Enter the STRBEST command to start the BEST/1 modeling tool and, from the
BEST/1 for the AS/400 menu, choose option 1 to work with models. The Work
with BEST/1 Models menu is displayed. Enter option 1 to create a new model
from performance data. Either use the default job classification or create your
own job classifications as shown on the following pages.
4.2.1 Assigning Jobs to Workloads by Communications Line
BEST/1 enables you to assign jobs to workloads based on communications line
or control unit options.
This enables models to be created that allow for workload changes according to
remote locations. For example, you can use these options to predict what
happens if:
•
The number of remote workstation users increases or decreases.
•
The business volumes at the remote end change, such as when:
−A new branch is opened.
−A competing company is bought.
4.2.1.1 Specify Job Classification by Communication Line
Figure 21 on page 55 shows an example of the Specify Job Classification
Category display. In this example, option 9 is used to group the jobs according
to the communication line they are attached to.
                  Specify Job Classification Category

Type choice, press Enter.

  Category . . . . . . . . . . . .   9     1=User ID
                                           2=Job type
                                           3=Job name
                                           4=Account code
                                           5=Job number
                                           6=Subsystem
                                           7=Pool

F3=Exit   F12=Cancel
1 When a control unit option is selected, all of the work that can be
associated with a local station controller, a display station pass-through
virtual controller, or a WAN controller is identified. All of the other work
is assigned to a single workload.
2 When a communications line option is selected, all of the work that
was not associated by the Performance Monitor with a communication
line is assigned to only one workload.
4.2.1.2 Edit Job Classification
Figure 22 shows workloads and communications line pairs manually typed in.
Only the communications lines that were active during performance data
collection are shown. Press the PF9 key to get a list of communications lines
from the previously gathered performance data.
                        Edit Job Classifications

Enter workload names and category values which are assigned to each workload,
press Enter. Jobs with unassigned values become part of workload QDEFAULT.

    1           2
Workload    Comm Line     Workload    Comm Line     Workload    Comm Line
SDLC101     SC101         ________    __________    ________    __________
SDLC102     SC102         ________    __________    ________    __________
SDLC103     SC103         ________    __________    ________    __________
TRNLAN      LINTRN        ________    __________    ________    __________
                                                                    More...
F3=Exit   F9=Display values from data   F12=Cancel
To display values from performance data, press F9.   4
Figure 22. Edit Job Classifications Display
Notes:
1 and 2 show the workload assignments with activity from lines
SC101, SC102, SC103, LINTRN, LINX25A, and LINX25B.
3 The remaining activity is assigned to workload QDEFAULT.
4 For easier assignment of workloads, press the PF9 (Display values
from data) key to have BEST/1 query the QAPMJOBS file and show you a
list of communications line names instead of trying to remember the
names of communications lines.
4.2.1.3 Assign Jobs to Workloads
Figure 23 shows an example of the selection display after pressing the PF9 key.
                        Assign Jobs to Workloads

                               3            3          3
                           Number of       CPU        I/O
                                                             5 Bottom
F3=Exit   F12=Cancel   F15=Sort by workload   F16=Sort by comm line
F17=Sort by transactions   F18=Sort by CPU seconds   F19=Sort by I/O count
Figure 23. Assign Jobs to Workloads
Notes:
1 Type a workload name and use option 1 or 2 beside the
communications line to group work into workloads.
2 Use options 1 and 2 to assign or unassign the activity associated with
the communications lines.
3 The Number of Transactions, CPU Seconds, and I/O Counts statistics
provide you information about the activity on the line.
4 The first line shown under communications line names shows no
communications line. This line represents all of the work that was not
assigned to any of the communications lines. This other work can be
assigned to only one workload, but it is preferable to let BEST/1 assign
this work to the QDEFAULT workload to handle the *LIC (Licensed Internal
Code) tasks properly. Leave this line unassigned.
5 You can sort the workloads by different criteria by using the
function keys shown at the bottom of the display.
4.2.2 Creating a Model
The actual creation of the model is submitted after the job classification scheme
has been created. The create process accesses the performance data collected
by the Performance Monitor and builds a model according to the specified job
classifications.
4.3 Using a Model for Communications Analysis
Once the batch job creating the model has completed, you can work with the
model to examine the results, which can be viewed either as reports or graphs.
Have the model analyzed and view the calibration report prior to any
what-if...? analysis.
The model is analyzed by selecting either option 5 (Analyze current
model) or option 6 (Analyze current model and give recommendations) on the
Work with BEST/1 Models menu.
4.3.1 Displaying Model Reports
Figure 24 shows an example of the Display Comm Resources Report display that
shows the communications IOP utilization and the following information for each
communications line resource:
                     Display Comm Resources Report
                                                                    More...
F3=Exit   F10=Re-analyze   F12=Cancel   F15=Configuration menu
F17=Analyze multiple points   F18=Specify objectives   F24=More keys
Figure 24. Communications Resources Report
Notes:
1 The descriptive name of the analysis period (for example, a date).
2 The system-assigned or user-assigned name of the communications
resource.
3 The predicted utilization of this line resource. For a communications
IOP (such as the CC01 in this example), this is calculated from the number
of frames processed and the service time per frame. For a line resource,
this is calculated from the number of characters transferred and the line
speed.
4 The line utilization that is due to overhead. For example, let's
assume that the line resource has a line overhead value of 10%. This
means that for every 10 information bytes transferred, there is an extra
overhead byte transferred. If the predicted line utilization (total) is 55%,
the utilization due to overhead is 5%.
5 This represents the average predicted response time in seconds of
the interactive transactions that contribute to the line traffic. The BEST/1
workload definitions indicate what portion of LAN or WAN transactions
flow across each communications line. See Section 4.5.3, “Distribution of
Characters Transferred Across Line Resources” on page 68 for more
details. The workload report shows average WAN and LAN response
times for an entire workload.
6 This field indicates the number of lines that the line resource
represents. Adding or removing line resources is done by using the Work
with Communications IOP Features display.
7 The line speeds for lines that are represented by the line resource. If
this is an input field, type the line speed of your choice, or press PF4 to
select from a list of line speeds. This line speed must be the actual
speed of the connection. If the modems are communicating at 9.6Kbps,
use 9.6Kbps as the AS/400 line speed (ignore what the AS/400 line
description parameters show since they may not match the current
situation).
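The overhead arithmetic in note 4 can be checked with a short calculation. The following Python sketch is an illustration only and is not part of BEST/1; the function name `overhead_utilization` is hypothetical. It assumes, as the note does, that the overhead percentage is expressed relative to the information bytes.

```python
def overhead_utilization(total_utilization_pct: float, overhead_pct: float) -> float:
    """Return the share of total line utilization that is due to overhead.

    overhead_pct is overhead bytes as a percentage of information bytes,
    so total = information * (1 + overhead_pct / 100).
    """
    information_pct = total_utilization_pct / (1 + overhead_pct / 100)
    return total_utilization_pct - information_pct


# Values from note 4: 55% total utilization with a 10% line overhead.
print(round(overhead_utilization(55.0, 10.0), 2))  # prints 5.0
```

With these inputs, 50 points of the utilization carry information and 5 points are overhead, which matches the note.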
4.3.2 Understanding Recommendations
If you choose option 6 to analyze the model and give recommendations, you may
get suggestions for configuration changes. These changes are based on the
internal tables containing values for resource utilization limits, so reading the
recommendations is helpful when analyzing communications performance.
Some of the recommended changes may be creating, changing, or deleting
communications IOPs.
4.3.2.1 Display Recommendations
Figure 25 on page 60 shows an example of the Display Recommendations
display that indicates both the components that were not meeting the desired
objectives and the changes proposed to the current configuration to meet those
objectives. This report contains two sections of information: exceptions and
recommendations.
                         Display Recommendations

***** Analysis Exceptions ***** 1
Utilization of 65.22 for communications IOP CC02 exceeds objectives of 45.00
Utilization of 93.20 for communications IOP CC03 exceeds objectives of 45.00
***** Analysis Recommendations ***** 2
Move 3 Communication lines from IOP CC02 to CIOP1
Move 4 Communication lines from IOP CC03 to CIOP2
Create 1 communications IOP(s)
Move 1 communication lines from IOP CC02 to CC04
Move 1 communication lines from IOP CC03 to CC05
                                                                     Bottom
Performance estimates -- Press help to see disclaimer.
F3=Exit   F12=Cancel   F15=Configuration menu   F17=Analyze multiple points
F18=Specify objectives   F19=Work with workloads
Figure 25. Display Recommendations
Notes:
1 Exceptions are conditions that BEST/1 has identified as indicators of
poor performance according to the objectives or guidelines.
2 Recommendations may suggest configuration changes to achieve the
desired performance.
4.3.2.2 Exceptions
There are two basic types of exceptions related to communications
performance:
•
Utilization of ... exceeds objectives ...
The predicted utilization of the identified hardware component has exceeded
the guideline for that type of component.
•
... is saturated
The predicted utilization of the identified hardware component has exceeded
100%. This usually indicates a severe over-commitment of the hardware
resource, which means that the system is not capable of managing the
workload.
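The two exception types amount to a simple threshold check. This Python sketch is illustrative only; the function name and the returned strings are hypothetical, and the objective is whatever guideline applies to the component (45.00 for the communications IOPs in Figure 25).

```python
def classify_utilization(utilization_pct: float, objective_pct: float) -> str:
    """Classify a predicted utilization against its guideline value."""
    if utilization_pct > 100.0:
        # Predicted utilization above 100% means the resource is over-committed.
        return "saturated"
    if utilization_pct > objective_pct:
        return "exceeds objectives"
    return "within objectives"


# Utilization values reported for CC02 and CC03 in Figure 25.
for name, utilization in (("CC02", 65.22), ("CC03", 93.20)):
    print(name, classify_utilization(utilization, 45.00))
```

Both IOPs in the figure fall into the first exception type: they are above the guideline but below saturation.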
4.3.2.3 Recommendations
If one or more exceptions have occurred, BEST/1 makes specific reconfiguration
suggestions. These suggestions have one of two possible origins:
•
Primary
The reconfiguration is specifically indicated to remove one or more
performance exceptions.
•
Secondary
Additional reconfiguration is required by AS/400 configuration requirements.
For example, if the communications IOP utilization guideline is exceeded,
BEST/1 probably recommends installing a new communications IOP.
For communication lines, BEST/1 suggests increasing line speeds first. Then
it suggests adding more lines to the configuration.
BEST/1 only recommends hardware reconfigurations that are both:
•
Adequately or completely defined in the hardware table.
•
Marked as currently available (Y).
4.3.2.4 How Communication Resources Utilizations are Predicted
BEST/1 uses the following values to calculate the utilization of
communications resources for each workload:
•
The total number of characters transferred
•
Characters per transaction
•
Transactions per function
•
Functions per user
Characters per transaction is the total of all characters transferred in and out.
When communications lines are present, the number of characters is kept
separately for each line. These are added across all workloads.
When building the workloads, BEST/1 assigns relative communication line
activity to each workload based on the CPU usage of the jobs that have been
assigned to each workload. Most interactive jobs indicate the communications
line they are associated with, so this assignment is thought to be valid. Results in your
environment may vary if you are using Client Access workstations attached to
5294 or 5394 remote workstation controllers.
WAN Controllers and LAN Controllers:
Utilization of LAN and WAN controllers is
determined by using the number of active jobs connected to LAN or WAN.
This method is based on the assumption that an average service time is
representative and that the traffic is evenly spread among all the controllers of a
particular type.
LAN and WAN IOPs:
Utilization of LAN IOPs is determined by the attached
communications lines. Each line's contribution depends on the total LAN
characters (determined previously for each workload), frame size, and IOP
service time per frame. All the attached lines are added together to calculate
the total percentage of the IOP utilization.
LAN IOP service time per frame is specific to each IOP. Line frame size is
specific to each communications line.
Utilization for WAN IOPs is determined similarly by using the total WAN
characters and number of WAN IOPs for WAN utilizations.
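As a rough illustration of the IOP calculation described above, the following Python sketch converts characters into frames, multiplies by the service time per frame, and adds the contributions of all attached lines. It is one plausible reading of the description, not the BEST/1 algorithm itself; the function name, the measurement interval parameter, and the sample values are assumptions.

```python
def iop_utilization_pct(lines, service_ms_per_frame, interval_seconds):
    """Estimate IOP utilization as busy time divided by elapsed time.

    lines: iterable of (characters_transferred, frame_size_in_characters),
           one entry per line attached to the IOP (hypothetical input shape).
    """
    busy_ms = 0.0
    for characters, frame_size in lines:
        frames = characters / frame_size          # frames sent on this line
        busy_ms += frames * service_ms_per_frame  # IOP time spent on them
    return busy_ms / (interval_seconds * 1000.0) * 100.0


# Hypothetical sample: two lines measured over a 15-minute interval.
sample_lines = [(360_000, 256), (180_000, 256)]
print(iop_utilization_pct(sample_lines, service_ms_per_frame=7.2, interval_seconds=900))
```

The sketch treats every line on the IOP the same way, so a smaller frame size or a longer service time per frame raises the predicted utilization in direct proportion.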
Communications Lines:
Utilization of a LAN communications line is calculated
by using the total LAN characters (determined previously for each workload), the
line overhead, and the line speed.
Line overhead and line speed (in kilobits per second) are specific to each
communications line. Utilization of WAN lines is determined similarly by using
the total WAN characters per line.
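The line calculation can be sketched in the same way. This Python example is a simplified illustration under stated assumptions, not the BEST/1 formula: it assumes 8 bits per character, 1000 bits per kilobit, a known measurement interval, and it ignores half-duplex or full-duplex effects. All names and sample values are hypothetical.

```python
def line_utilization_pct(characters, overhead_pct, line_speed_kbps,
                         interval_seconds, bits_per_character=8):
    """Estimate line utilization from characters, overhead, and line speed."""
    information_bits = characters * bits_per_character
    # Overhead is expressed as a percentage of the information bytes.
    total_bits = information_bits * (1 + overhead_pct / 100.0)
    capacity_bits = line_speed_kbps * 1000.0 * interval_seconds
    return total_bits / capacity_bits * 100.0


# Hypothetical sample: 360,000 characters in 15 minutes on a 9.6 Kbit/sec line
# with the default 2% line overhead.
print(round(line_utilization_pct(360_000, 2.0, 9.6, 900), 1))
```

Doubling the line speed in this sketch halves the predicted utilization, which is the kind of change the what-if analysis in BEST/1 lets you explore.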
Multifunction IOPs:
The utilization of a multifunction IOP is determined by
adding together the utilization for each type of activity. For example, the disk
IOP utilization plus communications IOP utilization equals total MFIOP utilization.
4.4 Changing Communications Resources
This section shows some of the changes you can make to your model to meet
your performance definitions. By changing some of the communications
resources, you can ask what-if...? questions that help you with communications
performance prediction.
Work with Communications Resources:
Figure 26 shows an example of the
Work with Communication Resources display. This display shows you all the
communications IOPs and line resources in the current configuration for both
LANs and WANs. Use this display to change the properties of communications
resources.
For a communications IOP, you can change:
•
Communications IOP feature
•
Average service time
For a communications line, you can change:
•
Number of lines the line represents
•
Line speeds of all the lines
                    Work with Communications Resources

Type options, press Enter.                                              1
  2=Change   3=Copy   4=Delete   7=Rename   8=Create line resources

        2           3         4       5               6            7
                            Nbr of                Line Speed    Pct Line
Opt  Resource     Feature   Lines   Text          (Kbit/sec)    Overhead
 _   CC01         2626              LAN IOP
 _    LINTRN                  1     Comm line(s)     4000.0        2.0
 _   CC02         2623              WAN IOP
 _    AE101                   1     Comm line(s)        9.6       14.5
 _    JAIRO                   1     Comm line(s)        9.6        2.0
 _    LINSI                   1     Comm line(s)        9.6       50.4
 _    LITECP1                 1     Comm line(s)        9.6       21.3
 _    SC101                   1     Comm line(s)        9.6       78.5
 _    SC102                   1     Comm line(s)        4.8       57.7
 _   CC03         2623              WAN IOP
 _    FILIAL_CTR              1     Comm line(s)       19.2        2.0
 _    ITSC                    1     Comm line(s)        9.6        2.0
 _    LINX25A                 1     Comm line(s)        9.6        3.0
 _    OM104                   1     Comm line(s)        9.6        2.0
 8                                                                More...
F3=Exit   F6=Create communications IOP   F12=Cancel
Figure 26. Changing Communications Resources
Notes:
1 The create line resources option allows you to create one or more
line resources to attach to the current communications IOP.
2 This shows you the unique system-assigned or user-assigned name
of the communications resource.
3 The communications IOP feature is shown in this column.
4 This field indicates the number of lines that the line resource
represents. Adding or removing line resources is done through the
Work with Communications IOP Features display.
5 A description of the communications resource. This can be a LAN
IOP, a WAN IOP, or a communication line.
6 The line speed parameter for the lines that are represented by the
line resource. If this is an input field, you can type the line speed of your
choice, or press the PF4 key to select from a list of line speeds.
7 The amount of non-information bytes sent over a line resource is
expressed as a percentage of the information bytes. For example, if 800
information bytes and 200 non-information bytes are transferred, the
overhead is 25%. Depending on the type of line, non-information bytes
can represent items such as protocol overhead or error retransmission.
A high percentage of line overhead may indicate error conditions. By
default, this value is set to 2% if the line utilization is 10% or less.
8 The create communications IOP function allows you to add a
communication IOP into your configuration.
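The Pct Line Overhead value described in note 7 is a plain ratio, as the following Python sketch shows. The function name is hypothetical and the code is only an illustration of the arithmetic in the note.

```python
def pct_line_overhead(information_bytes: float, non_information_bytes: float) -> float:
    """Return non-information bytes as a percentage of information bytes."""
    if information_bytes <= 0:
        raise ValueError("information_bytes must be greater than zero")
    return non_information_bytes / information_bytes * 100.0


# Values from note 7: 800 information bytes and 200 non-information bytes.
print(pct_line_overhead(800, 200))  # prints 25.0
```

Note that the percentage is relative to the information bytes, not to the total bytes on the line; relative to the total of 1000 bytes, the same traffic would be 20% overhead.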
Multiple function IOPs are shown in this display as well as the Work with Disk
Resources display because a multiple function IOP serves both as a disk IOP
and as a communications IOP.
BEST/1 uses a LAN IOP called the 613L to represent a 6130 with LAN
communications lines. The 6130 can support either WAN or LAN
communications lines, but BEST/1 requires any communications IOP in the
hardware table to be exclusively WAN or LAN. The hardware table includes a
6130 that supports WAN line speeds and a 613L that supports LAN line speeds.
When you create a model from performance data, the line speeds of the lines
attached to a 6130 determine whether the IOP is listed as a 6130 or 613L.
LAN or WAN?
BEST/1 determines the communications IOP type by whether the minimum
line speed supported by an IOP is less than 4MB.
•
An IOP with a minimum line speed of 4MB or greater is considered LAN.
•
An IOP with a minimum line speed of less than 4MB is considered WAN.
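The LAN or WAN rule can be written as a one-line comparison. This Python sketch is illustrative only; the names are hypothetical, and it assumes that "4MB" in the text means 4 Mbit/sec, that is 4000 Kbit/sec, which is the speed shown for the LINTRN line in Figure 26.

```python
# Assumption: "4MB" in the text means 4 Mbit/sec, that is 4000 Kbit/sec.
LAN_THRESHOLD_KBPS = 4000.0


def classify_iop(min_line_speed_kbps: float) -> str:
    """Return "LAN" or "WAN" from the minimum line speed an IOP supports."""
    return "LAN" if min_line_speed_kbps >= LAN_THRESHOLD_KBPS else "WAN"


for speed in (9.6, 19.2, 4000.0, 16000.0):
    print(speed, classify_iop(speed))
```

Because the rule looks only at the minimum supported speed, a feature such as the 6130 that can carry both kinds of line needs two hardware table entries, as described above.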
4.4.1 Example - Changing the IOP Type
The characteristics of the communications IOP usually affect the performance of
the lines attached to the IOP. The higher capacity 2623 IOP provides better line
performance than the lower capacity IOPs if any of these conditions are true:
•
There is high throughput.
•
The line has a high line speed.
•
The IOP has many lines attached to it.
•
Other lines on the IOP are highly utilized.
•
There is high polling activity across the lines.
Change Communications IOP:
Figure 27 shows an example of the Change
Communications IOP display that allows you to change characteristics of the
communications IOP. To change the communications IOP feature, select option
2 next to the element you want to change on the Work with Communications
Resources display shown in Figure 26 on page 62.
                        Change Communications IOP

IOP name 1 . . . . . . . :   CC02
Text 2 . . . . . . . . . :   WAN IOP

Type changes, press Enter.

3 Feature . . . . . . . . . .   2623     F4 for list
4 Service Time  . . . . . . .   7.2      Msecs per frame

F3=Exit   F4=Prompt   F12=Cancel
Figure 27. Changing IOP Type
Notes:
1 The name of the communications IOP resource
2 A description of the communications IOP resource. This can be:
•
LAN (Local Area Network) IOP
•
WAN (Wide Area Network) IOP
3 The communications IOP feature
4 The communications IOP average service time expressed in
milliseconds per frame
4.5 BEST/1 Communications Support for Performance Analysis
BEST/1 communications support refers to modeling the way the users are
connected to the system. For a communications performance analysis, look for
users that are connected to the AS/400 system through:
•
Local Area network (LAN)
•
Wide Area Network (WAN)
A high utilization percentage of the communications line (LAN or WAN)
affects response time. The purpose of this section is to describe the
configuration displays that provide the communications support. These include
the following displays:
•
Create communications IOP feature.
•
Create communications line resource.
•
Specify Chars to communication line resources.
These three displays may be helpful if you have communications performance
problems such as:
•
There are IOPs with high utilization percentages.
•
There are lines with high utilization percentages.
•
You need to spread the workload evenly between IOP and line resources.
4.5.1 Creating a Communications IOP Feature
This section shows you an example of creating a communications IOP that you
can add to your model to improve the performance. For example, if the analysis
shows that there are IOPs with a high utilization percentage, you can see the
estimated effect of adding a new IOP.
Create Communications IOP:
Figure 28 on page 66 shows an example of the
Create Communications IOP display that enables you to create a new
communications IOP for your configuration. At the same time, you can also
create communications lines and attach them to the new IOP.
You can add one or more communications lines to the current communications
IOP with the average line speed specified in kilobits per second.
Fill in the necessary parameters for the communications IOP such as the feature
number, service time, and the frame size. You also need to fill in information for
the first line resource displayed such as the number of lines the line resource
represents and whether the data is sent half or full duplex.
To create a communications IOP, press the PF6 key (Create Communications
IOP) on the Work with Communications Resources display shown in Figure 26 on
page 62.
You can also use the copy function on the Work with Communications Resources
display shown in Figure 26 on page 62 to create an IOP.
                        Create Communications IOP

Type changes, press Enter. Line resources with 0 lines are not created.

1 IOP name . . . . . . . . . . .   CMB01
2 Feature  . . . . . . . . . . .   2619     F4 for list
3 Service time . . . . . . . . .   3.0      Msecs per frame
                                                                     Bottom
F3=Exit   F4=Prompt   F6=Create line resource   F12=Cancel   10
Figure 28. Create Communications IOP
Notes:
1 The name of the communications IOP resource
2 The communications IOP feature
3 The communications IOP average service time expressed in
milliseconds per frame
4 The name of the line resource. In the case of a model created from
performance data, this name is system-defined. When you create (add) a
communications IOP to a configuration, you provide the name.
5 This field indicates the number of lines that the line resource
represents. Adding or removing line resources is done in the Work with
Communications IOP Features display.
6 The line speed for lines that are represented by the line resource. If
this is an input field, you can type the line speed of your choice, or press
the PF4 key to select from a list of line speeds.
7 The amount of non-information bytes sent over a line resource given
as a percentage of the information bytes. For example, if 800 information
bytes and 200 non-information bytes are transferred, the overhead is 25%.
Depending on the type of line, non-information bytes represent items such
as protocol overhead or error retransmission. A high value for line
overhead may indicate error conditions.
•
For lines with less than 10% utilization, the default value of Pct Line
Overhead is 2%.
•
As the traffic on the line increases, the percentage value of Pct Line
Overhead remains the same. Therefore, whether 800 or 8000
information bytes are transferred, if the value for Pct Line Overhead is
set to 5%, it remains at 5%.
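The overhead arithmetic in note 7 can be checked with a short sketch. The function names below are illustrative only and are not part of BEST/1; the figures are the ones quoted in the note.

```python
def pct_line_overhead(info_bytes: float, non_info_bytes: float) -> float:
    """Return non-information bytes as a percentage of information bytes."""
    if info_bytes <= 0:
        raise ValueError("info_bytes must be positive")
    return 100.0 * non_info_bytes / info_bytes


def non_info_bytes_for(info_bytes: float, pct_overhead: float) -> float:
    """Return the non-information bytes implied by a fixed overhead percentage."""
    return info_bytes * pct_overhead / 100.0


# 800 information bytes and 200 non-information bytes give 25% overhead.
print(pct_line_overhead(800, 200))

# A fixed 5% overhead stays 5% whether 800 or 8000 information bytes are sent;
# only the absolute number of overhead bytes grows with the traffic.
print(non_info_bytes_for(800, 5.0), non_info_bytes_for(8000, 5.0))
```

Note that the percentage is taken against the information bytes, not against the total bytes on the line, which is why 200 out of 1000 total bytes is reported as 25% rather than 20%.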
8 The size of the frame being processed by the communications line
resource. For measured models, the frame size is determined by the
average size of the information frame transferred across the line during
the measurement interval.
9 This shows whether the line resource represents half duplex lines or
full duplex lines. Half duplex lines can only send or receive data at any
one time. Full duplex lines can both send and receive at the same time.
10 Press the PF6 key (Create line resource) to add line resources to
this new communications IOP.
4.5.2 Creating a Communications Line Resource
This section gives an example of creating a communications line resource.
Adding lines can help you remove a performance bottleneck when the line
utilization percentage exceeds the guidelines.
Create Communications Lines: Figure 29 shows the Create
Communications Lines display that you can use to create a line resource and to
specify its parameters. These include:
•
Number of lines the line resource represents.
•
Line speeds of all the lines. This is not a total speed, but a speed for each
line.
Note: The fields shown in this display have already been described in Figure 28
on page 66.
To add lines to an existing communications IOP, select option 8 (Create line
resources) on the Work with Communications Resources display shown in
Figure 26 on page 62.
IOP name . . . . . . . . . . :   CMB01
Type changes, press Enter. Line resources with 0 lines are not created.
F3=Exit   F4=Prompt   F6=Create line resource   F12=Cancel
Figure 29. Create Communications Line Display
4.5.3 Distribution of Characters Transferred Across Line Resources
You may examine the distribution of transaction characters in a specific
workload being transferred over the communications line resources to find out:
•
The utilization of the line resource
•
Whether you need to consider redistributing a part of the workload
Distribution is expressed as a relative count, not as a percentage. For example,
if the relative count for line resource SC103 is 4.5 and for SC104 is 18, four
times as many characters are being transferred across SC104 as across SC103.
Note that line resource SC107 has a relative count of 67.5, which represents
15 times as many characters being transferred as across SC103.
Also notice that relative counts for LAN are not related to relative counts for
WAN. The division of work between WAN and LAN is specified by the number of
active jobs on the Objectives display.
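As a quick check of the ratios quoted for SC103, SC104, and SC107, the following sketch divides one relative count by another. The dictionary and the helper name are illustrative; only the three counts come from the text.

```python
# Relative counts quoted in the text for three line resources.
relative_counts = {"SC103": 4.5, "SC104": 18.0, "SC107": 67.5}


def ratio_to(counts: dict, line: str, baseline: str) -> float:
    """Return how many times more characters cross `line` than `baseline`."""
    return counts[line] / counts[baseline]


print(ratio_to(relative_counts, "SC104", "SC103"))  # four times as many
print(ratio_to(relative_counts, "SC107", "SC103"))  # fifteen times as many
```

Only the ratios between counts carry meaning; multiplying every count by the same factor describes the same distribution.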
Specify Chars to Comm Line Resources: Figure 30 shows the
Specify Chars to Comm Line Resources display that allows you to specify the
relative counts and characters transferred for this workload across
communications line resources.
You can access the Specify Chars to Communication Line Resources display by
pressing the PF9 key on the Change or Create Workload display.
Workload . . . . . . . :   1 INTERACTIV
Type changes, press Enter.
  2               3         Nbr of   Line Speed    6
  Line Resource   Connect   Lines    (Kbit/sec)    Relative Count
                                                                Bottom
F3=Exit   F10=Set relative counts to 1   F11=Show all line resources
F12=Cancel   F17=Set relative counts to line capacity
Figure 30. Specify Chars to Comm Line Resources
Notes:
1 The name of the workload
2 The line resource that the workload is transferring characters across
3 The type of the line resource. The values are *LAN (Local Area
Network) and *WAN (Wide Area Network). This value is determined by
the minimum line speed supported by the communications IOP feature.
LAN communications IOP features have a minimum line speed of 4 Mbps.
WAN communications IOP features have a maximum line speed of 4 Mbps.
4 This field indicates the number of lines that the line resource
represents. Adding or removing line resources is done from the Work
with Communications IOP Features display.
5 This indicates the line speeds for lines that are represented by the
line resource. If this is an input field, you can type the line speed of your
choice. You may also press the PF4 key to select from a list of line
speeds.
6 The relative number of characters transferred across this
communications line resource. These are relative to each other; they are
not percentages. Relative counts for WAN are calculated separately from
the relative counts for LAN. For example, Figure 30 on page 68 shows all
LAN traffic going across LINTRN and sixty-five hundredths of the WAN
traffic going across SC107.
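Because relative counts for LAN and WAN are kept apart, a share of traffic only makes sense within one connect type. The sketch below illustrates that under stated assumptions: LINTRN and SC107 are the names used in the text, but the resource SC999 and every relative count shown are invented so that the result matches the proportions described for Figure 30.

```python
from collections import defaultdict

# Hypothetical line resources: name -> (connect type, relative count).
# SC999 and all counts are invented for illustration.
resources = {
    "LINTRN": ("*LAN", 1.0),
    "SC107": ("*WAN", 65.0),
    "SC999": ("*WAN", 35.0),
}


def traffic_shares(resources: dict) -> dict:
    """Return each resource's share of the traffic for its own connect type."""
    totals = defaultdict(float)
    for connect, count in resources.values():
        totals[connect] += count
    return {
        name: count / totals[connect]
        for name, (connect, count) in resources.items()
        if totals[connect] > 0
    }


shares = traffic_shares(resources)
for name, share in shares.items():
    print(f"{name}: {share:.2f}")
```

With these invented counts, LINTRN carries all of the LAN traffic and SC107 carries 0.65 of the WAN traffic, regardless of how large the LAN count is compared with the WAN counts.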
4.5.3.1 Communications Workload
BEST/1 creates two types of communications workloads to represent
communications activity:
•
For communications lines that have traffic but have no jobs associated,
BEST/1 creates a workload named QCMN that represents traffic on those
lines. QCMN workload contains no I/O activity and no CPU utilization.
•
BEST/1 creates a communications workload for workloads that show
non-interactive activity in a group but with no corresponding interactive
activity in the same group. The name of the workload is your workload
name plus the letters QLAN or QWAN (QL or QW if the name is too long).
This workload contains only non-interactive activity. BEST/1 creates this
workload because it cannot show non-interactive activity for that group in
your original workload without showing corresponding interactive activity,
which misrepresents the activity of your workload.
4.6 Comparing the Model Against the Measured Performance
After completing the changes to the model created, press the PF12 key until the
Work with BEST/1 Model menu is displayed and re-analyze the model. On the
Display Analysis Summary display, press the PF11 key to compare the results
against the measured values.
Repeat the entire process described in this chapter until your performance
objectives are met.
4.7 Considerations When Analyzing Communications Data with BEST/1
The following list contains communications related assumptions under which
BEST/1 creates the model for analysis:
•
All controllers are equally distributed across all communications lines for
LAN and WAN.
•
All LAN controllers have the same service time.
•
All WAN workstation controllers have the same service time.
Please remember the following things when analyzing the model:
•
Many times, communications activity caused by batch jobs is put in the
special QCMN workload described in Section 4.5.3.1, “Communications
Workload” on page 69. Use the copy function or the combine workloads
function to properly associate communications activity with CPU and DASD
activity.
•
Client/Access users connected through 5294 or 5394 controllers are
incorrectly assigned as local Client/Access jobs. Client/Access users
attached through 5494, however, are correctly assigned.
•
Total MFIOP utilization can only be determined by adding the predicted
utilizations from the disk IOPs and Arms report with the predicted utilization
on the Communications report.
•
Assignment of relative counts of communications line activity to workloads is
done based on job assignments to workloads and their relative CPU usage.
•
LAN utilizations can only be calculated for the workload that actually
involves the AS/400 system. Other traffic on the LAN causes the actual
utilizations to differ from the predicted values.
•
Response times can only take into account the effect of the communications
line that is attached to the AS/400 system. Any other connections beyond
that line add additional response time.
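The MFIOP item in the list above amounts to a simple addition of two predicted values taken from different reports. A minimal sketch follows; the function name and the sample percentages are hypothetical and do not come from a BEST/1 report.

```python
def total_mfiop_utilization(disk_iop_pct: float, comm_pct: float) -> float:
    """Add the two predicted utilizations (both in percent) for one MFIOP."""
    for value in (disk_iop_pct, comm_pct):
        if value < 0:
            raise ValueError("utilization cannot be negative")
    return disk_iop_pct + comm_pct


# Hypothetical predictions: 12.5% from the disk IOPs and Arms report,
# 20.0% from the Communications report.
print(total_mfiop_utilization(12.5, 20.0))
```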
Chapter 5. Using System Service Tools
System Service Tools (SST) provides relatively easy access to numerous
logs that OS/400 constantly maintains. This chapter gives you some examples of
how to use the system service tools, but please remember that incorrect use of
this service tool can cause damage to data in the system. Contact your service
representative for assistance if you have even the slightest doubt about how to
proceed.
5.1 Checking the Communications Hardware
The following displays give you an example of how to find information
concerning communications error log data. Choose the option that is displayed
in the input field. Sign on to the system with a user profile having the *SERVICE
special authority and enter the STRSST command on any command line. The
following display is shown:
System Service Tools (SST)
Select one of the following:
1. Start a service tool
2. Work with active service tools
3. Work with disk units
4. Work with diskette data recovery
Selection
1
F3=Exit   F10=Command entry   F12=Cancel
Figure 31. The System Service Tools (SST) Display
System Service Tools (SST) lets you start service tools, work with active tools,
and work with disk unit data. Be aware: Service tools should only be used
under the direction of your service representative. The options you can select
from this display are:
1. Start a service tool. The service tools are:
•
Product activity log
•
Trace Licensed Internal Code
•
Work with communications trace
•
Display/Alter/Dump
•
Work with LIC log
•
Main storage dump manager
•
Hardware service manager
2. Select this option to:
•
Start a service tool.
•
Re-enter a service tool you left active.
•
End an active service tool.
The status of a service tool is shown if the service tool is either active or
ending.
3. Select this option to use tools that can be run for disk units. You can:
•
Display disk unit configuration.
•
Calculate disk configuration.
•
Work with the storage threshold of an Auxiliary Storage Pool (ASP).
•
Work with disk unit information.
•
Work with disk unit recovery.
4. This option is used to recover the data from a diskette containing read
errors. Select this option to:
•
Read the contents of a diskette into the system.
•
Print reports about the data on the diskette.
•
Review the data on the diskette.
•
Change the data that has been read from the diskette.
•
Write the changed data back to another diskette.
Use this option only when directed by your service representative.
Important!
Service Tools should only be used under direction of a service
representative. Some of the tools allow changes in the data and LIC. These
can cause unpredictable results.
The following example shows the Start a Service Tool display:
                         Start a Service Tool
Warning: Incorrect use of this service tool can cause damage
to data in this system. Contact your service representative
for assistance.
Select one of the following:
     1. Product activity log
     2. Trace Licensed Internal Code
     3. Work with communications trace
     4. Display/Alter/Dump
     5. Licensed Internal Code log
     6. Main storage dump manager
     7. Hardware service manager
Selection
     1
F3=Exit   F12=Cancel   F16=SST menu
Figure 32. The Start a Service Tool Display
The Start a Service Tool display lets you select a service tool to diagnose
problems, for example, with the system Licensed Internal Code (LIC).
The options you can select from this display are:
1. This option displays or prints errors that have occurred (such as in disk and
tape units, communications, and workstations). This option also lets you
work with tape and diskette statistics.
2. This option shows a menu that lets you start or stop a trace of Licensed
Internal Code (LIC). You can also display, dump, allocate, or clear the trace
tables where the LIC is recorded.
3. This option lets you start or stop a trace of data on a communications line or
network. Any traced data can be formatted and printed.
4. This option lets you display or change virtual storage data. You can dump
the data to tape, diskette, or printer. You can also print data that was
previously dumped to a tape or diskette. USE THIS OPTION ONLY WHEN
DIRECTED BY YOUR SERVICE REPRESENTATIVE.
5. This option lets you display LIC log information. You can dump the Licensed
Internal Code log information to tape or diskette, or to a printer.
6. This option lets you display a main storage dump or copy the dump to tape
or diskette, or to a printer.
7. This option lets you display, work with, and print the stored hardware
resource information. Both logical and packaging hardware resources are
displayed. This option also allows you to display, alter, trace, or dump
input/output (I/O) processor Licensed Internal Code. I/O processors control
the storage devices, workstations, and communication data links on the
system.
Selecting option 1 provides you with the Product Activity Log display:
                         Product Activity Log
Select one of the following:
     1. Analyze log
     2. Display or print by log ID
     3. Change log sizes
     4. Work with removable media lifetime statistics
     5. Display or print removable media session statistics
     6. Reference code description
Selection
     1
F3=Exit   F12=Cancel
Figure 33. The Product Activity Log Display
This display allows you to display or print product activity log entries, removable
media statistic log entries, or to change the size of logs. The options you can
select from this display are:
1. Select analyze log to display or print a summary of product activity entries.
This summary is useful for analyzing intermittent and multiple error
conditions.
2. Select this option to display or print data from the product activity log by log
identifier. The log ID is a unique identifier that ties together all data related
to a single error condition.
3. Select this option to verify or change the amount of storage on a disk unit
used for product activity log data.
4. Select this option to display, print, or delete the statistical data logged for the
lifetime use of a removable medium.
Lifetime is the total length of time one of these media allows information to
be read from or written to it. When a removable medium is deleted, delete
its entry from the log.
5. Select this option to display or print the statistical data logged for a session
of a removable medium.
Session is the length of time one of these media is in position to be read
from or written to (read/write heads are loaded).
6. Select this option to display or print the description of a reference code.
Select Option 1 on the Product Activity Log display. The Select Subsystem Data
display is shown.
Select Subsystem Data
Type choices, press Enter.
Log . . . . . . . . . . 5
1=All logs
2=Processor
3=Magnetic media
4=Local work station
5=Communications
6=Power
7=Licensed program
8=Licensed Internal Code
From:
  Date . . . . . . . .   10/24/96     MM/DD/YY
  Time . . . . . . . .   11:51:44     HH:MM:SS
To:
  Date . . . . . . . .   10/25/96     MM/DD/YY
  Time . . . . . . . .   11:51:44     HH:MM:SS
F3=Exit   F5=Refresh   F12=Cancel
Figure 34. The Select Subsystem Data Display
This display allows you to select a subsystem log to work with and the time
period you want to work in.
The options you can select from this display are:
1. Display or print all data in the product activity log.
2. Display or print processor log data.
3. Display or print magnetic media error log data, including data for disk and
removable media devices.
4. Display or print local workstation log data. Local workstations are connected
to the system by a method other than a local area network or a
communications device.
5. Display or print communications log data including:
•
Communications I/O processors
•
I/O adapters
•
Ports
•
Lines
•
Controllers, including devices connected with the following protocols:
−SDLC
−ASYNC
−BSC
−X.25
−IDLC
−ISDN
−Local Area Network
6. Display or print log data associated with the system power control network.
7. Display or print licensed program log data.
8. Display or print Licensed Internal Code (LIC) log data. LIC is the layered
architecture below the machine interface (MI) and above the machine. LIC is
a proprietary system design that carries out many functions such as:
•
Storage management
•
Pointers and addressing
•
Program management functions
•
Exception and event management
•
Data functions
•
I/O managers
•
Security
All of the selections lead you to the following display:
Figure 35. The Select Report Type for Subsystem Display
This display allows you to choose the type of report, the detail report format you
want, and the type of entries you want in the report.
1 The three different report options are:
1. This option provides you with a list of entries that match the selected search
values. The fields displayed include:
•
System reference code that identifies a unique logging condition. The
system reference code is made up of the first four digits of the translate
table ID followed by the four digits of the reference code.
•
Date and time when the entry was logged
•
Error class
•
Resource name and resource type
•
Logical address that is the direct select address and unit address for the
resource most closely related to the entry
•
Frame ID (the identifier assigned to the frame enclosure)
•
Card and device position
•
Device name
•
Component (the component ID of the program logging the entry)
•
Code (the product library code for the program logging the entry)
•
Description
2. This option provides you with a summary of log entries sorted by the option
specified using the PF9=Sort by... function. The default is to sort by date.
The number of entries that match the search values is displayed with each
summary line.
3. This option prints a report based on the selected search values and sort
value.
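The list entry for report option 1 describes the system reference code as the first four digits of the translate table ID followed by the four digits of the reference code. The sketch below shows that concatenation only. The function name and both sample values are placeholders, not real table IDs or reference codes.

```python
def system_reference_code(table_id: str, reference_code: str) -> str:
    """Join the first four digits of the table ID with the reference code."""
    if len(table_id) < 4:
        raise ValueError("table ID must have at least four digits")
    if len(reference_code) != 4:
        raise ValueError("reference code must have exactly four digits")
    return (table_id[:4] + reference_code).upper()


# Placeholder values chosen only to make the split visible.
print(system_reference_code("ABCD0000", "1234"))
```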
2 The optional entries to include are:
•
Informational entries that are logged to provide information about the system
(for example, vary ons and vary offs).
•
Statistic entries that are logged to record the volume statistics information
for removable media. Usually the statistic entries contain no information
about communications.
3 The reference code selection enables you to:
1. Include entries with certain reference codes only.
2. Omit entries with certain reference codes.
Type up to 10 reference codes separated by blanks or commas. Reference
codes must be four hexadecimal digits, or you may use a wildcard (*). The
wildcard represents all reference codes that match the hexadecimal digits in
front of the wildcard. For example, AA* represents all reference codes that
begin with AA. There can be only one wildcard in each value and the wildcard
must be the last character, although multiple values with wildcards may be used.
The default is to include all entries for all reference codes.
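The selection rules for reference codes can be sketched as a small validator and matcher. This is an illustration of the rules as stated, not the logic the Product Activity Log uses. The function names are invented, and accepting a bare `*` with no leading digits is an assumption.

```python
import re

_EXACT = re.compile(r"^[0-9A-F]{4}$")          # four hexadecimal digits
_WILDCARD = re.compile(r"^[0-9A-F]{0,3}\*$")   # one trailing wildcard only


def parse_selection(text: str) -> list:
    """Split on blanks or commas and validate up to 10 selection values."""
    values = [v.upper() for v in re.split(r"[ ,]+", text.strip()) if v]
    if len(values) > 10:
        raise ValueError("at most 10 reference codes may be given")
    for value in values:
        if not (_EXACT.match(value) or _WILDCARD.match(value)):
            raise ValueError(f"invalid reference code value: {value}")
    return values


def matches(code: str, values: list) -> bool:
    """Return True if `code` is selected; an empty list selects everything."""
    if not values:
        return True
    code = code.upper()
    for value in values:
        if value.endswith("*"):
            if code.startswith(value[:-1]):
                return True
        elif code == value:
            return True
    return False


selection = parse_selection("AA*, 1234")
print(selection)
print(matches("AA10", selection), matches("AB10", selection))
```

The same pattern would apply to device types and resource names, with the length limits changed to the ones given for those fields.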
4 The device selection field enables you to:
1. Include entries for selected device types.
2. Include entries for devices with specific resource names.
Type up to 10 device types or resource names separated by blanks or commas.
Device types must be four characters while the resource names are up to 10
characters. You may use wildcards (*) on both types and names. As with
reference codes, only one wildcard per entry is allowed and the wildcard must
be in the last position of the value (for example, 93*).
The default is to include all entries for all device types.
The combination of selections provides you with the Log Analysis Report display:
From . . :   1 10/25/96  11:15:04     To . . :   1 11/11/96  14:04:12
This display allows you to display or print error log entries for each resource
listed. If you select to display reports of more than one resource entry, you
cannot return to this display until all selected resource entries have been
displayed. The PF12 key can only cancel the entry you are working on.
1 From and To (date and time). The information displayed is gathered
between these times. The format is the same as the system date and time.
2 Use the display option to display the Detail Report for the selected entry.
3 Use the print option to print the Detail Report for the selected entry.
4 Using the PF11 key provides additional information about all of the entries in
the log.
The following display is the Detailed Report for the previously selected entry. If
you have selected to display more than one entry, PF12 does not return you to
the Log Analysis Report display until all of the selected reference code entries
have been displayed.
Figure 37. The Display Detail Report for Resource Display
The fields have the following meanings:
1 The Sequence field shows you the numbers (assigned to the entries
in the error log) that indicate the sequence in which the errors occurred.
The highest number is the most recent.
2 The Reference code. This is the code that your service
representative asks you for because this shows you the hardware error
code for the failing condition.
3 The Secondary code may show the failing condition (for example, IOP
return code, processor step code, program return code, or major/minor
code).
4 The Table ID. Your service representative may also ask for the contents
of this field because it identifies a group of reference codes.
5 This identifies the source of the IPL code being used at the time the
entry was added and the state of the machine when the entry was added.
6 The protocol field shows you the protocol used for sending and
receiving data between the resource and the system.
7 The Error class that identifies the type of the entry is one of the
following:
•
Permanent
•
Statistics
•
Temporary
•
Threshold
•
Buffered
•
Recoverable
•
Informational
•
Vary on or vary off
•
Machine check
•
Qualified
8 The description is provided by the reference code translate table.
5.2 Working with Communications Traces
Communications Trace is a service function that allows data to be traced on a
communications line, a network interface, or a network server. Once the data
has been traced, it may be formatted and placed in a spooled file to be
displayed or printed.
Communications Trace should be used when:
•
Your problem analysis procedures do not give sufficient information about
the problem.
•
You suspect that a protocol violation is the problem.
•
You suspect that line noise is the problem.
•
The error messages indicate that there is an SNA BIND problem.
Interpreting the communications trace output requires detailed knowledge of the
line protocols being used. The information needed to interpret the trace is in
SNA Formats, GA27-3136.
Whenever possible, start the communications trace before varying on the line to
be traced. This gives you the most accurate sample of the line coming up.
5.2.1 Starting and Stopping the Trace
There are two ways to start a trace:
•
Enter the Start Communications Trace (STRCMNTRC) CL command.
•
Press PF6 on the Work with Communications Traces display accessed by
using SST.
A communication trace continues until:
•
The End Communications Trace (ENDCMNTRC) command is run.
•
TRCFULL(*STOPTRC) is specified when starting the trace and the buffer
becomes full.
•
The Communications Trace function of the SST is used to end the trace.
•
A physical line problem causes the trace to end.
In this example, we use the SST functions to trace a token-ring line
called AN EXAMPLE. The Work with Communications Traces display is accessed
by choosing option 3 on the Start a Service Tool menu displayed in Figure 32 on
page 73. The following display is shown:
Type options, press Enter.
  2=Stop trace   4=Delete trace   6=Format and print trace
  7=Display message   8=Restart trace
     Configuration
Opt  Object       Type   Trace Description    Protocol   Trace Status
1 Buffer size (K). This shows the size of the buffer allocated to capture
data for this trace. The maximum value that can be specified is 8192K
bytes.
Note: For network server description traces, the buffer size indicates the
size of the buffer allocated for the formatted trace output.
2 Stop on buffer full (referred to later as Stop/Wrap) specifies whether
the data captured by the trace should be overwritten after the specified
buffer size is filled. "Yes" indicates that trace data is written to the buffer
only until the buffer is filled. Later data is not traced. "No" indicates that
later data is written over earlier data once the buffer is full. Data
collected at the beginning of the trace is lost if the buffer wraps.
3 Trace Direction determines whether to trace transmitted data,
received data, or both.
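The difference between the two Stop/Wrap settings can be illustrated with a small model. This is a sketch only: the real buffer is sized in K bytes, whereas this model counts records, and the class and attribute names are invented.

```python
from collections import deque


class TraceBuffer:
    """Toy model of a trace buffer that either stops or wraps when full."""

    def __init__(self, capacity: int, stop_on_full: bool):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.stop_on_full = stop_on_full
        self.records = deque()
        self.stopped = False

    def capture(self, record) -> bool:
        """Store one record; return False if the trace has already stopped."""
        if self.stopped:
            return False
        if len(self.records) == self.capacity:
            # Only reached in wrap mode: the oldest record is overwritten.
            self.records.popleft()
        self.records.append(record)
        if self.stop_on_full and len(self.records) == self.capacity:
            self.stopped = True
        return True


stop_mode = TraceBuffer(capacity=3, stop_on_full=True)
wrap_mode = TraceBuffer(capacity=3, stop_on_full=False)
for frame in range(1, 6):
    stop_mode.capture(frame)
    wrap_mode.capture(frame)

print(list(stop_mode.records))  # earliest frames kept, later ones not traced
print(list(wrap_mode.records))  # latest frames kept, earliest ones lost
```

In practice the choice depends on which end of the capture matters: stop mode preserves the start of the trace, such as a line coming up, while wrap mode preserves the frames leading up to the moment the trace is ended.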
After starting the trace, the Work with Communications Traces display is shown
again as follows:
                   Work with Communications Traces
Type options, press Enter.
  2=Stop trace   4=Delete trace   6=Format and print trace
  7=Display message   8=Restart trace
     Configuration
Opt  Object         Type     Trace Description     Protocol   Trace Status
 2   1 AN EXAMPLE   2 LINE   3 YOUR DESCRIPTION    4 TRN      5 ACTIVE
F3=Exit   F5=Refresh   F6=Start trace   F10=Change size
F11=Display buffer size   F12=Cancel
Figure 40. The Work with Communications Trace Display with Active Trace
The options or the function keys available on this display are:
•
Option 2 = Stop trace
Select this option to stop a trace that is currently active or waiting for the
line, network interface, or network server to be varied on.
•
Option 4 = Delete trace
Select this option to delete a trace that is currently stopped or has an error.
•
Option 6 = Format trace data
Select this option to show trace data formatting options.
Note: The trace status must be STOPPED or ERROR before using this option.
Use PF5 to update the display and view the current trace status. When
formatting is complete, the trace data is placed in a spooled file named
QPCSMPRT in the default output queue.
•
Option 7 = Display message
Select this option to show a message associated with a trace that has an
error status.
•
Option 8 = Restart trace
Select this option to restart a trace. Selecting this option is equivalent to
selecting option 4 (Delete trace) and pressing PF6 to start trace in
succession. Restarting a trace starts the trace again using the options
selected for the original trace.
Note: The trace status must be STOPPED or ERROR to use this option.
•
PF10 is used to change the maximum storage size for all traces.
•
PF11 is used to view more information about a trace.
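The status requirements attached to the options above can be summarized as a lookup table. The sketch below encodes only what the option descriptions state. The table and function names are illustrative, and the display itself may enforce further rules that are not modeled here.

```python
# Trace statuses in which each option is described as usable.
ALLOWED_STATUSES = {
    2: {"ACTIVE", "WAITING"},   # Stop trace
    4: {"STOPPED", "ERROR"},    # Delete trace
    6: {"STOPPED", "ERROR"},    # Format trace data
    7: {"ERROR"},               # Display message
    8: {"STOPPED", "ERROR"},    # Restart trace
}


def option_allowed(option: int, status: str) -> bool:
    """Return True if the option is described as valid for this trace status."""
    return status.upper() in ALLOWED_STATUSES.get(option, set())


print(option_allowed(6, "ACTIVE"))   # an active trace must be stopped first
print(option_allowed(6, "STOPPED"))
```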
The fields shown on this display are:
1 Configuration object shows the name of the configuration object
being traced.
2 Type shows the type of configuration object being traced.
3 Trace Description shows you the text description specified for this
trace.
4 Protocol shows you the protocol used for sending and receiving data
on the communications line, network interface, or network server. The
protocol types are:
•
Async (Asynchronous Communications)
•
BSC (Binary Synchronous Communications)
•
TRLAN (Token-Ring Network)
•
X.25
•
SDLC (Synchronous Data Link Control)
•
Ethernet (CSMA/CD or DIX V2)
•
IDLC (ISDN Data Link Control)
•
ISDN (Integrated Services Digital Network)
•
DDI (Distributed Data Interface)
•
Frame Relay
•
Wireless LAN
•
NetBIOS (Network Basic Input Output System)
5 The status of the trace is one of the following:
Active       Trace data is being gathered.
Error        An error occurred while the trace was gathering data
             (some data may have been gathered) or while the trace
             was being formatted.
Formatting   The trace is being formatted.
Starting     The trace is being started by another user. You cannot stop
             or delete this trace. Use the PF5 key to update the trace
             status.
Stopped      The trace has stopped (it is not gathering data).
Stopping     The trace is stopping.
Waiting      The trace is waiting for the line, network interface, or
             network server to be varied on and is not gathering data.
If the status of the trace is WAITING and the line, network interface, or
network server has not been varied on or the job has not been started, do
the following steps:
1. Return to the SST Main Menu.
2. From the SST Main Menu, press PF10 to receive the Command Entry
display.
3. Vary on the line, network interface, or network server and start the
job.
4. Return to the SST Main Menu and choose the option to start a service
function.
5. Select the Communications Traces service function.
6. The trace status should be shown as ACTIVE on the Work with
Communications Traces display. Use PF5 to update the trace status.
If you chose the option to stop the trace when the buffer is full, use PF5 to
update the trace status. When the trace buffer is full, the trace status
changes to STOPPED.
5.2.2 Formatting the Trace Data
After the trace has been stopped, you must format the trace. Formatting is done
by entering Option 6 on the Work with Communications Traces display. The
following display is shown:
                           Format Trace Data
Configuration object . . . . :   AN EXAMPLE
Type . . . . . . . . . . . . :   LINE
Type choices, press Enter.
  Controller . . . . . . . . . .   1 *ALL    *ALL, name
  Data representation  . . . . .   2 3       1=ASCII, 2=EBCDIC, 3=*CALC
  Format SNA data only . . . . .   3 N       Y=Yes, N=No
  Format RR, RNR commands  . . .   4 N       Y=Yes, N=No
  Format TCP/IP data only  . . .   5 N       Y=Yes, N=No
  Format UI data only  . . . . .   6 N       Y=Yes, N=No
  Format MAC or SMT data only  .   7 N       Y=Yes, N=No
  Format Broadcast data  . . . .   8 Y       Y=Yes, N=No
F3=Exit   F5=Refresh   F12=Cancel
Figure 41. The Format Trace Data Display
The display-only fields on this display are:
•
The configuration object that shows the name of the configuration object
traced.
•
The type that shows the type of configuration object traced.
The input fields on this display are:
1 Controller. This option is only valid for Async, X.25, SDLC, IDLC, and
local area networks. You can select to format the data for a specific
controller or for all controllers attached to the communications line at the
time of the trace.
•
To format the data for all controllers, type *ALL.
•
To format the data for a specific controller, type the name of the
controller.
2 Data representation. This option is not valid for BSC networks. This
option determines whether the hexadecimal data is converted to ASCII or
EBCDIC characters.
•
Select *CALC (default) to have the system calculate how to format the
displayable characters in the trace.
•
Select ASCII to convert the hexadecimal data to displayable
characters using ASCII conversion rules.
•
Select EBCDIC to convert the hexadecimal data to displayable
characters using EBCDIC conversion rules.
For example, hexadecimal 61 is a slash ('/') in EBCDIC but hexadecimal
61 is an "a" using ASCII conversion.
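The hexadecimal 61 example can be reproduced with Python's built-in codecs. This is a sketch under one assumption: EBCDIC is represented here by code page 037 (`cp037`), which is only one of several EBCDIC code pages. The *CALC choice is not modeled, and the `render` helper is illustrative.

```python
def render(data: bytes, representation: str) -> str:
    """Convert trace bytes to displayable characters, using '.' for the rest."""
    codec = {"ASCII": "ascii", "EBCDIC": "cp037"}[representation]
    text = data.decode(codec, errors="replace")
    return "".join(ch if ch.isprintable() else "." for ch in text)


byte = bytes([0x61])
print(render(byte, "ASCII"))   # the letter a
print(render(byte, "EBCDIC"))  # a slash
```

Choosing the wrong representation does not damage the trace data; the hexadecimal values are unchanged and only the character column becomes unreadable.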
3 Format SNA data only. This option is only valid for local area
networks, SDLC, X.25, and IDLC.
•
Select Yes to format and spool SNA data only.
•
Select No to format and spool line protocol data (SDLC, X.25, Ethernet,
token-ring, DDI, and wireless). SNA data is spooled (shown in
hexadecimal form) but is not formatted.
4 Format RR and RNR commands. This option is only valid for local
area networks, SDLC, X.25, IDLC, and ISDN.
•
Select Yes to format RR (Receiver Ready) and RNR (Receiver Not
Ready) commands in addition to other data.
•
Select No if you do not want RR and RNR commands formatted with
other data.
5 Format TCP/IP data only. This option is only valid for local area
networks and X.25.
•
Select Yes to format and spool frames that contain Transmission
Control Protocol/Internet Protocol (TCP/IP) data only.
•
Select No to format and spool line protocol data (token-ring, Ethernet,
X.25, or wireless) only.
6 Format UI data only. This option is only valid for local area networks.
"Yes" can be specified for this option only if "No" was specified for the
"Format SNA data only" option.
•
Select Yes to format and spool Unnumbered Information (UI) data
only.
•
Select No to format and spool line protocol data (token-ring, Ethernet,
DDI, or wireless) only.
7 Format MAC or SMT data only. This option is only valid for local area
networks.
Note: Traces of token-ring lines attached to a network server description
do not contain any MAC data; therefore, this option must be set to No.
•
Select Yes to format and spool only Medium Access Control (MAC) or
Station Management (SMT) data.
•
Select No to format and spool line protocol data (token-ring, Ethernet,
DDI, or wireless) only.
8 Format broadcast data. This option is only valid for local area
networks.
•
Select Yes to include the broadcast data (frames received with
destination MAC addresses of FFFFFFFFFFFF) in the formatted trace
data.
•
Select No to exclude the broadcast data from the trace.
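The broadcast test described for this option is a comparison of the destination MAC address against FFFFFFFFFFFF. The sketch below shows that comparison; the frame dictionaries and the function names are hypothetical and do not reflect the internal trace record layout.

```python
BROADCAST_MAC = bytes.fromhex("FFFFFFFFFFFF")


def is_broadcast(dest_mac: bytes) -> bool:
    """Return True if the destination MAC is the all-ones broadcast address."""
    return dest_mac == BROADCAST_MAC


def select_frames(frames: list, include_broadcast: bool) -> list:
    """Keep every frame, or drop broadcast frames when they are not wanted."""
    return [
        frame for frame in frames
        if include_broadcast or not is_broadcast(frame["dest"])
    ]


# Hypothetical frames: one unicast destination and one broadcast destination.
frames = [
    {"id": 1, "dest": bytes.fromhex("400000000001")},
    {"id": 2, "dest": bytes.fromhex("FFFFFFFFFFFF")},
]

print([f["id"] for f in select_frames(frames, include_broadcast=True)])
print([f["id"] for f in select_frames(frames, include_broadcast=False)])
```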