IBM E027SLL-H, Tivoli Monitoring 6.2.3 FP1 Troubleshooting Manual

IBM Tivoli Monitoring
Version 6.2.3 FP1
Troubleshooting Guide

GC32-9458-05
Note
Before using this information and the product it supports, read the information in “Notices” on page 267.
© Copyright IBM Corporation 2005, 2012.
US Government Users Restricted Rights – Use, duplication or disclosure restricted by GSA ADP Schedule Contract with IBM Corp.
Contents
Tables ...............xiii
About this information ........xv
Chapter 1. Introduction to
troubleshooting ...........1
Sources of troubleshooting information .....1
Problem classification ...........1
Viewing the IBM Support Portal ........2
Subscribing to IBM support notifications .....2
Chapter 2. Logs and data collection for
troubleshooting ...........5
Appropriate IBM Tivoli Monitoring RAS1 trace
output ................5
Running snapcore to collect information .....5
Locating the core file ...........6
Getting Dr. Watson dumps and logs ......7
KpcCMA.RAS files ............7
Sources of other important information .....8
Chapter 3. Common problem solving . . 9
About the tools .............9
I am trying to find out what software is supported . 9
Workspaces are missing or views are empty . . . 10
Diagnosing that workspaces are missing or
empty ...............10
Resolving application support problems ....11
Resolving monitoring server problems ....12
Resolving monitoring agent problems ....12
Status of a monitoring agent is mismatched between
the portal client and tacmd command .....13
Diagnosing that the status of a monitoring agent
is mismatched between the portal client and
tacmd command............13
Resolving monitoring agent problems ....14
Resolving monitoring server problems ....14
The portal server does not start or stops responding 15
Diagnosing that the portal server does not start
or stops responding ..........15
Resolving database problems - missing table or
portal server database ..........16
Resolving database problems - user ID and
password ..............16
Resolving database problems - instance not
started ...............17
Diagnosing that portal server logon fails....18
The portal client does not respond.......18
Diagnosing that the portal client does not
respond ..............18
Resolving storage or memory problems ....19
Resolving client configuration problems ....19
Historical data is missing or incorrect .....20
Diagnosing that historical data is missing or
incorrect ..............20
Resolving warehouse proxy connection problems 21
Resolving warehouse proxy agent problems -
configuration .............21
Resolving warehouse proxy agent problems -
connectivity .............22
Resolving summarization and pruning agent
problems ..............22
Resolving persistent data store for z/OS
problems ..............23
Historical data does not get collected for some
monitoring server attribute groups .....25
A situation does not raise when expected ....25
Diagnosing that a situation does not raise when
expected ..............25
Resolving situation-specific problems .....26
A reflex automation script does not run when it
should ................28
Diagnosing that a reflex automation script does
not run when it should .........28
Resolving format and variable problems ....28
High CPU usage on a distributed system ....29
Diagnosing high CPU usage on a distributed
system ...............29
Resolving situation problems - diagnostic actions 30
Resolving situation problems - corrective actions 31
Resolving firewall problems - diagnostic actions 31
Resolving firewall problems - corrective actions 31
Resolving Oracle DB Agent problems - diagnostic
actions ...............32
Resolving Oracle DB Agent problems - corrective
actions ...............32
Chapter 4. Tools...........35
Trace logging ..............35
Log file locations ...........35
Installation log files...........39
Reading RAS1 logs ...........42
Setting traces .............43
Dynamically modify trace settings for an IBM Tivoli
Monitoring component ..........53
Using the IBM Tivoli Monitoring Service Console . 55
Starting the IBM Tivoli Monitoring service
console ...............56
Blocking access to the IBM Tivoli Monitoring
Service Console ............56
Displaying portal server tasks in the command
prompt ................57
KfwSQLClient utility ...........57
Clearing the JAR cache ..........58
Using the UAGENT application .......59
pdcollect tool ..............59
ras1log tool ..............60
Backspace Check utility ..........60
Build TEPS Database utility .........61
IBM Tivoli Monitoring Operations Logging ....61
Windows and UNIX systems .......61
z/OS systems ............62
ITMSuper ...............63
Chapter 5. Installation and
configuration troubleshooting .....65
Frequently asked questions .........65
General installation frequently asked questions 65
Windows installation frequently asked questions 65
UNIX-based systems installation frequently asked
questions ..............66
General installation problems and resolutions . . . 68
Agent Builder application support is not displayed in listappinstallrecs output if it is manually installed without recycling the
monitoring server ...........68
Debugging mismatched application support files 69
Startup Center fails to reset the sysadmin password on the hub Tivoli Enterprise
Monitoring Server configuration panel ....69
Startup Center fails to create the Tivoli
Warehouse database and user .......69
On UNIX systems, a new user is not created or a password is not reset in the Startup Center when you use a non-root user to install Warehouse Proxy Agent and Tivoli Enterprise Portal Server . 69
On Windows systems, a Tivoli Monitoring Warehouse DSN is not created in the Startup
Center ...............69
Startup Center fails to test DSN with database
connectivity .............69
Startup Center shows some system types as
“Unknown Operating System” .......70
Tivoli Enterprise Monitoring Agents .....70
Upgrade SQL file not found when installing
application support on the standby hub ....73
Many files in the First Failure Data Capture log
directory ..............74
Monitoring agents fail to start after agent support
or multi-instance agents are installed .....74
Incorrect behavior after an uninstallation and
re-installation.............75
Where Remote Deployment of agents is not
supported ..............75
Application Support Installer hangs .....75
An agent bundle is not visible from the Tivoli
Enterprise Portal............75
Agent Management Services fails after deployment on Linux Itanium and xLinux with
kernel 2.4 systems ...........76
Watchdog utility requires Windows Script Host
5.6................76
Unable to deploy monitoring agents from the
Tivoli Enterprise Portal .........76
Installing application support with a silent
installation response file fails .......76
Unable to run gsk7ikm.exe ........77
*_cq_*.log files appear ..........77
SPD: Installing a bundle on the wrong operating
system, architecture, or kernel .......77
Installing a Software Package Block (SPB) on top of an existing, running IBM Tivoli Monitoring
agent ...............78
Problems with the SPB file ........78
Installation was halted and receive message
about active install ...........78
Receive an install.sh error when installing two components or agents in the same installation
directory ..............78
When attempting to install IBM Java 1.5.0 on Windows 64 bit system nothing happens . . . 78
Backup failure message during a remote
monitoring server upgrade ........79
Remote configuration of deployed Monitoring
Agent for DB2 agent fails .........79
Monitoring Server cannot find your deployment
depot ...............79
The agent installation log shows error
AMXUT7502E ............80
Failure occurs when sharing directories for the
agent deploy depot ...........80
You receive a KFWITM290E error when using deploy commands with a z/OS monitoring
server ...............80
Running deployment in a hot-standby
environment .............80
Difficulty with default port numbers .....81
Selecting Security Validation User displays a
blank popup .............81
When installing a monitoring agent on top of the Systems Monitor Agent, you receive an error . . 81
Some rows do not display in an upgraded table 81
The monitoring server and portal server automatically start after running Application
Support Installer............82
Errors occur during installation of Event IBM Tivoli Monitoring Event Forwarding tool . . . 82
Missing LSB tags and overrides warning message
at the end of installation .........82
Self-describing capability .........82
Windows installation problems and resolutions . . 84
On Windows systems, the installation fails randomly when installing different features . . 84
Problems that are cleared by rebooting the
Windows system ...........85
When installing and configuring the Tivoli Enterprise Monitoring Server on Windows Server 2008, a number of popups and errors occur. . . 85
After an upgrade, the Tivoli Enterprise Portal Server is in the 'stop pending' state and cannot
be manually started ..........86
When running the setup.exe, an unknown
publisher error message displays ......86
The error “Could not open DNS registry key”
occurs ...............86
Agent not connecting to Tivoli Enterprise
Monitoring Server ...........86
InstallShield displays the error “1607: Unable to install InstallShield Scripting Runtime” during installation on Windows from a
network-mounted drive .........87
Installation on a Windows 2003 server fails with
Error Number: 0x80040707 ........87
Extracting the nls_replace script causes remote
deployment to fail ...........87
Upgrade tool deploys agent to the wrong
directory ..............87
Deploying an agent instance gives a
KUICAR020E error ...........88
Uninstallation is not available for Application
Support on Windows systems .......88
Problems installing directly from the .zip file . . 88
Installation hangs or loops after presenting initial
splash screen .............88
Unable to discover systems within a specified IP range when running the Startup Center from
eclipse.exe ..............88
UNIX-based system installation problems and
resolutions...............88
Self-describing capability might be overwritten
by UNIX monitoring server application support . 89
On a RHEL6 64-bit system, the Tivoli Monitoring
installer fails with errors .........91
Application agent remote deployment on
workload partition fails .........91
Message is received about the Korn Shell after
running the install.sh file ........92
AIX stat_daemon memory leak .......92
Manage Tivoli Enterprise Monitoring Services
does not start on AIX V6.1 ........92
UNIX and Linux install.sh command fails with
error code: 99 and error code: 4.......93
Receive KUIC02101W error ........93
Receive JVMDG080 or JVMXM012 Java errors . . 93
On HP-UX systems, where the host name does
not equal the nodename, the upgrade installation
fails to stop running processes .......94
EIF Slot Customization does not work on
upgraded zlinux systems .........94
Running of the KfwSQLClient binary fails on
Linux and AIX systems .........94
Failed to attach to the DB2 instance db2inst1
ERROR: Unable to create TEPS, return code = 3 . 94
Installation on SLES9 terminates with install.sh
failure:KCI1008E terminating... license declined . 95
Command line interface program of the
Application Support Installer is not currently
available ..............95
Silent installation on UNIX-based systems returns
an encryption key setting error .......95
The error “Unexpected Signal: 4 occurred at
PC=0xFEC3FDE4” occurs during installation . . 95
Installing IBM Tivoli Monitoring on Red Hat 5
and see the following error: “KCI1235E
terminating ... problem with starting Java Virtual
Machine” ..............95
Installation on the Linux S390 R2.6 64-bit operating system fails with the message “LINUX MONITORING AGENT V610Rnnn unable to install agent” where nnn is the release number . 96
Troubleshooting z/OS-based installations ....96
Tivoli Monitoring z/OS initialization checklist. . 96
z/OS-based installations problems and
resolutions .............104
Uninstallation problems and workarounds . . . 107
Unable to uninstall multi-instance agent from a
managed system on windows 64bit .....108
Prompted for .msi file during uninstallation process started from 'Add/Remove Programs'
on systems with v6.2.2 installed ......108
Uninstallation is blocked by another process that is using the IBM Tivoli Monitoring Eclipse
help server .............108
Uninstallation of an agent produces help errors 108
Uninstallation of an agent occurring more than
once stops the OS agent .........109
After uninstallation, Tivoli Enterprise
Monitoring Server folder is not deleted....109
Removing a failed installation on Windows . . 109
Incorrect behavior after an uninstallation and
reinstallation.............113
Tivoli Data Warehouse database does not
uninstall ..............113
The agent installation log shows error
AMXUT7512E ............113
Prompted to uninstall a database that was not
running during uninstallation .......114
Chapter 6. Connectivity
troubleshooting ..........115
Cannot log on to the portal server ......115
Cannot connect to the portal server ......117
Cannot launch the portal client on Windows XP
after installation (message KFWITM215E) ....120
Portal server is initializing and is not ready for
communications ............121
Portal server is unavailable during a portal client
work session .............121
Portal server does not start after installation . . . 121
Portal server is not connecting with the hub
monitoring server ............121
DB2 errors when opening a Tivoli Enterprise Portal
workspace ..............122
A monitoring process fails to start on Linux or
UNIX after changing a .profile for root .....123
Heartbeat issues when running on a Linux guest
using VMware .............124
Chapter 7. Portal client
troubleshooting ..........127
Cannot select the Create new group icon within the
Object group editor ...........127
Cannot load product configuration data after changing warehouse database from Oracle to DB2
on Linux or UNIX ............127
Data in the Tivoli Enterprise Portal is missing and
you receive an error ...........127
JavaWebStart Tivoli Enterprise Portal fails to
display help screens ...........127
Client allows you to save a situation with an
invalid character ............128
Tivoli Enterprise Portal or the browser displays the
yen symbol as a backslash in Japanese .....128
Using an administrator name with non-latin1 characters, cannot log onto the Tivoli Enterprise
Portal ................128
Non-ASCII characters are not accepted in the user
ID or the distinguished name field ......128
The Tivoli Enterprise Portal desktop does not work
when exporting DISPLAY .........129
Some attribute groups showing a different name in
the Tivoli Enterprise Portal .........129
Monitoring agents show in an unexpected position
in the navigation tree...........129
Tivoli Enterprise Portal Desktop called through Java Web Start does not work properly after adding agents support for Tivoli Enterprise Portal
Server ................131
Receive a Loading Java TM0 Applet Failed... error 131
tacmd createUser output indicates that the path to
the Java home directory was not found .....131
Cannot launch the Tivoli Enterprise Portal help 132
On an Active Directory Server, sysadmin cannot
logon to the Tivoli Enterprise Portal client ....132
Several enterprise workspaces are returning an error, KFWITM217E:Request
Error,SQL1_CreateRequest Failed, rc=350 ....133
You cannot paste non-ASCII characters in the
Situation editor .............133
Situation editor cannot display advanced advice
help files ...............133
After acknowledging a situation event and selecting the link for that situation, you might
receive a message ............133
Password problem using the LDAP Security option
on Active Directory system .........134
There is a memory leak in the Tivoli Enterprise Portal browser client when the number of
workspace switches increases ........134
Help index and search text entry fields are
disabled ...............134
Java exception logging onto the Tivoli Enterprise
Portal from a browser ..........134
On Linux, IBM Tivoli Enterprise Monitoring Agent topics do not display in the Help Contents or
Index tabs ..............135
Navigator items are listed in an unexpected order 135
Clicking on the Timespan icon for one view brings
up the data for another view ........135
HEAPDUMPs and JAVACore files are placed on
desktops when running in browser mode ....135
Java errors occur with the IBM v1.4.2 JRE ....137
Web Portal Client does not work with Sun JRE . . 137
Tivoli Enterprise Portal has high memory usage
and poor response time ..........138
Tivoli Enterprise Portal has high memory usage 138
Data is not returned to the portal client ....139
DirectDraw thread loops infinitely causing poor
portal client performance .........139
Workflow Editor is disabled and the following tools do not display: Event Console, Graphic View, Edit Navigator View (Navigator view toolbar) . . 140
Situations are not firing ..........140
Historical UADVISOR situations are started on the agent if historical collection is configured to collect
data ................140
At the bottom of each view, you see a historical
workspace KFWITM217E error .......141
Installation of situation data fails due to I/O on
VSAM data sets ............141
kshsoap client fails because of missing libraries on
UNIX-based systems ...........142
Category and Message field of the universal
message does not accept DBCS .......142
An error occurs when remotely removing an
instance on Windows...........142
Agents display offline in the portal client but fire situations and agent logs report that they are
running ...............143
Navigator view sorts erratically when you remove multiple managed systems simultaneously . . . 143
Multiple events that occur at the same time are
loaded too slowly ............143
Desktop client performs poorly after installing Language Packs for IBM Tivoli Monitoring . . . 143
Existing OMEGAMON product imagery displays after upgrading to IBM Tivoli Monitoring V6.1 . . 144
The Warehouse Proxy Agent started, but does not appear in the Managed System Status list on the
Tivoli Enterprise Portal ..........144
Cannot start or stop agents from the Navigator
view ................144
Cannot load a ws_pres.css file in order to select a
language other than English ........145
Chapter 8. Portal server
troubleshooting ..........147
Performance impacts of the HTTP and HTTPS
protocols ...............147
Users who run the IBM HTTP Server do not have
permission to the content directory ......147
tacmd exportWorkspaces or importWorkspaces
receives an out of memory error .......147
The portal server and Warehouse Proxy Agent fail to connect to the database on a 64-bit Windows
system ...............148
Failed to log on as sysadmin with portal server
LDAP enabled .............148
On AIX systems, newly created users with auto-expire passwords cause installation failures . 148
Linux portal server unable to FTP catalog/attribute
files ................148
Upgrading the Tivoli Enterprise Portal Server takes
a long time ..............148
Running the Tivoli Management Services Discovery Library Adapter, results in a book that does not contain the fully qualified host name . . 148
Portal server performance is slow ......149
Cannot create a Tivoli Enterprise Portal Server
database ...............149
You receive a KFW error when a query is sent to
more than 200 managed systems .......150
Non-hub situations are not associated at the Tivoli
Enterprise Portal Server level ........151
Starting and stopping the Eclipse Help Server . . 151
Non-root stopping or starting agents causes
problems ...............151
Root password is not accepted during non-root Tivoli Enterprise Portal Server configuration . . . 151
Corba user exception is included in the portal
server log when creating situations ......152
Stopping or starting the eWAS subcomponent of
the portal server ............152
Chapter 9. Monitoring server
troubleshooting ..........153
Messages related to the index file are displayed when the agent fails back to a remote monitoring
server ................153
A generic RPC communications error is received when issuing a long-running tacmd execute
command ..............153
Troubleshooting monitoring server problems on
distributed systems ...........153
The CT_GET request method fails in SOAP
queries with a V6.2.3 hub monitoring server, a
remote hub monitoring server earlier than
V6.2.3, and an agent connected to a remote
monitoring server ...........154
Exposure of passwords in the clear .....154
Receive a seeding failed message ......154
High monitoring server CPU after restarting
with Warehouse Proxy Agents configured . . . 154
Upgrade inconsistency between the History and
Object windows ...........155
Attribute groups started for collection on the
managed systems should not be available on the
monitoring server list..........155
To decrypt a password,
KDS_VALIDATE_EXT='Y' is required ....156
Remote Tivoli Enterprise Monitoring Server
consumes high CPU when large number of
agents connect ............156
Unable to start the Tivoli Enterprise Monitoring
Server after the kdsmain process is terminated
abnormally .............156
THRESHOLDS.XML and Tivoli Enterprise
Monitoring Server table not cleaned when
managed system override is removed ....157
Situations fail to trigger for attributes by
applying group function.........157
Monitoring server application support
completes all seeding functions but might crash
as the program is exiting ........157
tacmd login fails when monitoring server is
configured with LDAP authentication ....158
Some agents are displayed in the Service Console list that are not accessible from that
user interface ............159
tacmd login fails after hub monitoring server is
recycled ..............159
tacmd and SOAP are not able to connect . . . 160
The system crashes when attempting a bulk
import or export command........160
Monitoring server fails to start, but then does
after a reboot ............160
Remote monitoring server lost connection to the
hub and all agents display offline .....161
After the set timeout, the Tivoli Enterprise
Monitoring Server is still pending .....161
Providing the wrong path to configuration files during LDAP configuration causes the Tivoli Enterprise Portal login window to hang . . . 161
Crash on Linux remote monitoring server
during hub failover to Hot Standby .....162
HUB Tivoli Enterprise Monitoring Server quiesce prevents the display of the data collected by the attached Tivoli Enterprise
Monitoring Agents ..........162
During installation of a remote Tivoli Enterprise Monitoring Server on a Windows system, the
agent support is applied, but fails .....162
Using a Deploy Group with addSystem or
updateAgent commands.........163
Tivoli Enterprise Monitoring Server requires restart if you issue itmcmd server stop/start commands when you are already logged on . . 163
Log indicates hub monitoring servers are down
when they are up ...........163
The Platform view in the Manage Tivoli Enterprise Monitoring Services panel shows the Tivoli Enterprise Monitoring Server as running as a 32 bit application, but my agents are shown
as running as 64 bit applications ......164
Tivoli Enterprise Monitoring Server does not release memory after running a large SQL query 164
SQL queries with more than 200 OR predicates
do not complete ...........164
Tivoli Enterprise Monitoring Server aborts unexpectedly when exiting the telnet session
used to start it ............165
KCIIN0084E Timeout appears while waiting for Tivoli Enterprise Monitoring Server to start on
AIX5.3..............165
In a hot standby (FTO) environment, commands to a mirror hub might not return hub records
after reconnection ...........165
A deleted object is redisplayed when two hot
standby (FTO) hubs reconnect .......166
Troubleshooting monitoring server problems on
z/OS systems .............166
Receive Program KDFCINIT and Program
FAXCMON messages..........166
The Tivoli Enterprise Monitoring Server start task (CANSDSST default) encountered error message 'KLVST044 LOADLIST MEMBER NOT FOUND IN RKANPAR DATASET (KDSLLIST) KppLLIST KLVST001 CANDLE ENGINE INITIALIZATION ERROR(S), ABEND U0012' in
the RKLVLOG at startup ........167
KDS Parameters not generated from the batch
parm deck .............167
Cannot encrypt text. A call to CSNBSYE failed.
Cannot encrypt contents of keyfile .....168
The error “KLVST005 MVS JOBSTEP AUTHORIZATION REQUIRED KLVST001 CANDLE ENGINE INITIALIZATION ERROR(S), ABEND U0012 CSV019I - Required module KLVSTWTO not accessed, it is not APF Authorized (RKANMODL) CSV028I - ABEND 306-0C” occurs in the z/OS monitoring server
RKLVLOG during startup ........168
The error “KLVSQ000 carved mode in effect for extended storage” occurred in the RKLVLOG
during startup ............168
Error message 'KDSMA013 OPEN VTAM for VDM1APPL failed with status 8' occurs in the Tivoli Enterprise Monitoring Server start task
(CANSDSST default) ..........169
Chapter 10. Monitoring agent
troubleshooting ..........171
Command-line interface ..........171
OS agents ..............171
Linux OS agent fails to start .......172
OS agent start command fails .......172
Specific events are not monitored by the
Windows OS agent ..........172
Take action commands and reflex automation . . 172
Warehouse agents ............173
Unable to configure the Warehouse Proxy agent
with modified parameters from the Tivoli
Enterprise Portal GUI .........174
Workspaces ..............174
A workspace view is showing an error ....174
Local history migration tools move the agent operation logs to multiple agent history locations . 175
Unreadable tool tip information for Available EIF
Receivers list of the Situation editor ......175
32-bit Agent Builder agent will not start on 64-bit Windows with System Monitor Agent-installed OS
Agent ................175
Unable to locate the file name of an exported
situation that begins with numerals ......176
Tivoli Enterprise Portal data for UNIX OS and Linux OS agents is not updated after stopping the
disk ................176
Testing the connection to the Tivoli Data Warehouse database is valid even with an invalid
password...............176
Configured non-root user agent starts up as root 176
Large historical collections slow monitoring agents 176
Unable to access History Collection Configuration
for any agent .............176
Agent names and icons are displayed incorrectly 177
64 bit monitoring agents are not started ....177
Errors in the configuration xml file ......177
Subnode Limitations for autonomous function . . 178
Binary Path attribute of the Windows OS agent
does not show a value ..........179
Installing pre-v6.2.1 Monitoring Agent for Windows OS onto a v6.2.1 or later monitoring server inadvertently unconfigures the monitoring
server ................179
OS agent restarted unexpectedly on heavily loaded
systems ...............179
Calendar entries overlap..........180
Receive an error when deploying an System
Service Monitor agent ..........180
The Agent Service Interface is not globalized . . . 180
Some attribute group names are unintelligible from the History Collection Configuration window . . 180
History collection fails to display the most recent
24 hours of data ............180
Situations with attributes from more than 1 group
not supported with autonomous agent .....181
Failure when importing situation xml file edited
with WordPad .............181
Printer details of another system are displayed on
the Tivoli Enterprise Portal .........181
CTIRA_MAX_RECONNECT_TRIES environment
variable is now obsolete ..........181
Agent goes offline after removing history path . . 182
Override button is not present for a situation. . . 182
Agent's Management Definition View columns are
not showing data ............182
There is a situation distribution discrepancy if there is a hub monitoring server outage when one or more remote monitoring servers remain active . 182
Installing v6.2.2 agent application support on a monitoring server for a prior release causes agents
to fail ................182
Installing backlevel Windows OS agent on existing environment causes monitoring server not to start . 182
SNMP trap Sendto fails ..........183
Situation overrides cannot be used to disable situations on specific systems at specific times . . 183
Situation or calendar name in thresholds.xml file
appears incorrect ............183
BAROC file is missing for IBM Tivoli Monitoring
5.x Endpoint situations ..........183
The target host name, platform, and version information is not displayed for the deployment
status in the CLI or the workspace ......184
Agent upgrade and restart using non-root ....184
After installing and configuring a monitoring
agent, it fails to start ...........186
situation_fullname slot missing for delete events 186
Logs are using the situation ID string instead of
the display name ............186
If a managed system list is removed for a situation,
the situation stops ............187
Descriptions are not displayed for default
situations...............187
Agent configuration failed on remote deployment
while using single quotes for configuration
properties ..............187
New attributes missing ..........187
Unable to receive summarized data for the last
hour in the Tivoli Enterprise Portal ......188
Summarization for CCC logs is not allowed . . . 188
Receive errors when modifying the JAVA HEAP
SIZE for the Summarization and Pruning Agent . . 188
When associating situations, they fire, but cannot
be viewed ..............188
The Summarization and Pruning agent fails when
processing an index created in a previous release of
the product ..............188
Summarization and Pruning agent schedule not
affected by daylight saving time .......189
Attribute names must be kept under 28 characters
long ................189
Agent deployment operations are not completing
before the TIMEOUT expires ........189
Deploy cannot tell if the installation failed....190
An agent does not display in the portal client or in
the output from the listSystems command....190
One monitoring agent's workspaces are listed
under another agent node on the portal client . . 192
Issues with starting and stopping an agent as a
non-Administrator user ..........193
UNIX-based systems Log agent was deployed,
configured, and started but returns the
KFWITM290E error ...........193
KDY1024E error is displayed when configuring the
run-as user name for an agent ........193
Interface unknown messages in ras1 logs ....193
When upgrading a System Service Monitors agent
from 3.2.1 to 4.0, receive KDY3008E message . . . 194
The Tivoli Data Warehouse fails and you either
lose data or have memory problems......194
Error list appears in warehouse logs ......195
When configuring the Monitoring Agent for Sybase
and the Warehouse Proxy Agent, receive message
to use CandleManage ..........196
listSit command with the type option fails with a
KUIC02001E message on Japanese Systems . . . 196
Creating a situation from a group member does
not copy the distribution list ........196
A changed situation name does not show up . . . 196
New agents do not display in the portal client
Navigator view .............196
An agent displays unavailable in the portal client 196
CTIRA_HOSTNAME has no effect on log file
names ................197
The Summarization and Pruning Agent and the
Warehouse Proxy Agent do not work with DB2 9.1
Fix Pack 2 ..............197
An error of 'can bind a LONG value only for
insert' appears .............197
Errors in either the Warehouse Proxy Agent or
Summarization and Pruning Agent logs ....197
Receive a message saying that the statement
parameter can only be a single select or a single
stored procedure ............197
Custom defined workspace views do not handle
symbol substitution as expected .......197
Unresolved variables in custom queries ....198
A message appears after accepting the license . . 199
Adding agent help files requires a restart of the
Eclipse Help Server and the Tivoli Enterprise
Portal Server .............200
Unable to create historical collection directory for
ud:db2inst1 ..............200
Receive a large amount of data back from the
warehouse for a baseline command ......200
Chapter 11. Command
troubleshooting ..........201
On Solaris 8 operating systems, checkprereq
processes do not complete .........201
Situations deleted from the CLI are still listed on
Tivoli Enterprise Portal Situation editor.....201
The tacmd addBundles command returns an
unexpected KUICAB010E error message ....201
tacmd removeBundles command returns
unexpected KUICRB010E error message ....201
tacmd executecommand command run against
subnode fails .............202
The krarloff command returns an error message 202
Unexpected KUIC02013E error message ....202
Missing options for login -stdin results in
unexpected behavior ...........203
A system error occurs with the tacmd
editsystemlist -e command .........203
Problem running the tacmd listsystemlist -d
command on Linux systems ........203
Commands with embedded single quotation marks
fail .................204
tacmd exportnavigator -o not behaving correctly 204
TACMD xxxxAction commands fail on Japanese
systems ...............204
Overrides set against an agent cannot be deleted
from the command line ..........204
tacmd listsit -m UX Managed System gives no
result ................204
Receive a busy monitoring server message when
using the getfile, putfile, or executecommand
commands ..............205
Reconfiguring an agent and then getting the
deploy status yields a port number message . . . 205
tacmd getfile or putfile command is failing . . . 205
Temporary files remain when tacmd getfile or
putfile is interrupted ...........205
Receive an OutOfMemory exception when using
the import or export commands .......206
The suggestbaseline or acceptbaseline commands
fail .................206
Problems with Take Action commands and curly
brackets ...............206
When configuring the Monitoring Agent for Sybase and the Warehouse Proxy Agent, receive message
to use CandleManage ..........206
listSit command with the type option fails with a KUIC02001E message on Japanese Systems . . . 206
Take Action command names do not accept
non-English characters ..........207
addBundles command times out .......207
createNode command fails .........207
tacmd suggestbaseline minimum, maximum, and
average function values are ignored ......207
tacmd suggestbaseline command receives an error 208
When using the listSystems command, the last two
digits for the version appear as 'XX' ......208
The command tacmd restartAgent fails if the agent
is already stopped ............208
Using the kinconfig command and remotely starting, stopping or recycling agents fails on
Windows 2000 systems ..........209
You receive a message when using a tacmd
command related to agents .........209
You receive a message when trying to use the
tacmd maintagent command ........209
Endpoint fails to connect to monitoring server when running createnode from a monitoring server
in a different domain ...........209
Take Action commands do not work if unrequired
values are left blank ...........210
Take Action commands do not display messages when run from a Navigator Item or from a
workspace view ............210
Corrupted tacmd responses are displayed in the
command-line interface ..........210
The listSystems command consumes high CPU in
enterprise environments ..........211
Improving tacmd command response time when
using VMWare .............211
The addSystem command fails with error message
KUICCR099E .............212
The tacmd getdeploystatus command is not
returning status return codes ........212
tacmd createSit does not send errors if you mistype
the name of an attribute ..........213
wsadmin commands' output indicates the wrong
server name ..............213
When using the viewuser command, you receive a
message that an option is repeating ......213
Commands fail when a situation name consists of
characters ..............213
tacmd addSystem fails if agent already exists. . . 213
Installing an exported agent bundle using install.sh
causes an error .............214
The addbundles command fails .......214
The exportBundles command does not work for
patches ...............214
Chapter 12. Performance Analyzer
troubleshooting ..........215
Enabling logging for the agent .......215
Enabling logging for the monitoring portal . . . 216
Installation and configuration issues......216
Problems after upgrading .........217
Tivoli Performance Analyzer graphical user interface for Tivoli Enterprise Portal fails when
downloading tasks list ..........218
When tasks are started and when you should see
data in the workspaces ..........218
No data is displayed in the workspaces ....219
The Tivoli Performance Analyzer workspaces are
not available or not displayed ........219
No chart is visible on the Forecast Details
workspace ..............219
The Performance Analyzer Agent Statistics workspace shows database errors indicating that
some tables or views are missing .......219
Nonlinear tasks take too long to complete ....220
Agent never connects to the monitoring server . . 221
The Tivoli Enterprise Monitoring Server does not
restart after installation of Domain Support . . . 221
Chapter 13. Database troubleshooting 223
Data loss prevention ...........223
Backing up the database for recovery purposes 223
Restoring the original database contents . . . 223
If you modify your password or if it expires . . . 223
DB2 pureScale environment ........224
Receive First Steps error at the end of a DB2
installation ..............225
Windows portal server cannot connect to the
database ...............225
Oracle problem with JDBC drivers prior to 11.1.0.7 226
Database contents are incorrect after installation 226
The error SQL0443N with 'SYSIBM:CLI:-805' occurs
after upgrading to DB2 UDB Version 8.1 Fix Pack
10.................227
Using DB2 v8.1, Warehouse Proxy Agent crashes 227
Using DB2 V9.1 for z/OS, Warehouse Proxy agent
encounters a large number of disconnections . . . 227
Historical data is not warehoused ......228
Historical data for logs is incorrect ......228
Warehouse Proxy Agent or Summarization and
Pruning Agent fails due to DB2 transaction log full. 228
Incorrect data is collected in the warehouse for
filtering if using a wildcard.........228
Too much historical data is collected .....229
Warehouse Proxy agent failed to export data . . . 229
There are ORACLE or DB2 errors in the
khdras1.log file .............229
SQL0552N “ITMUSER” does not have the privilege to perform operation “CREATE BUFFERPOOL”
SQLSTATE=42502 ............230
Chapter 14. Event synchronization
troubleshooting ..........231
Event synchronization installation and
configuration troubleshooting ........231
Errors occur during installation of IBM Tivoli
Monitoring event synchronization .....231
Netcool/OMNIbus Probe for Tivoli EIF does not
start after configuring the probe to use
monitoring rules ...........231
Netcool/OMNIbus integration troubleshooting . . 232
Log files for Netcool/OMNIbus Event
Synchronization ...........232
Unable to send situation events from the hub
monitoring server to Netcool/OMNIbus . . . 233
Event status updates in Netcool/OMNIbus are
not forwarded to Tivoli Monitoring .....235
Monitoring events in Netcool/OMNIbus do not have expected values for the Summary attribute or other attributes set by the IBM Tivoli
Monitoring probe rules .........239
After an event is cleared in Netcool/OMNIbus, the event's severity is changed back to its
original severity ...........241
Tivoli Enterprise Console integration
troubleshooting.............242
General event synchronization troubleshooting . . 242
Editing the default destination server
information from the command line does not
work ...............242
tacmd refreshTECinfo -t all shows no results on
console ..............243
Changing the TCP/IP timeout setting on your
event server .............243
Chapter 15. Tivoli Common Reporting
troubleshooting ..........245
Locations of log files ...........245
Java out of memory error after installation . . . 245
Running OS Cognos Reports with Tivoli Common
Reporting 2.1.1 on 64-bit AIX 6.1 results in error
DPR-ERR-2056 .............246
Displaying data for Situations History report
results in error .............247
Date and time format in IBM Tivoli Monitoring OS
Agents reports not localized ........247
Prompted to select report type when installing
reports with CLI ............247
Cognos Query Studio displays Japanese text within
the Thai web browser ..........247
The prompt page of a Cognos report within the
Tivoli Common Reporting tool displays strings that
are not translated ............247
The generated report displays an incorrect date
and time format ............247
The generated report does not display report
legend................248
Receive a 'statement is too long' error message
when running a report ..........248
Running COGNOS reports against a DB2 database
is slow ...............248
Cognos reports are displayed as a blank page . . 248
You are missing drivers after the Tivoli Common
Reporting installation...........248
Documentation for the base agents ......249
The report fails to generate because the SQL query
was not valid .............249
Message “SQL Statement does not return a
ResultSet object” displayed .........250
Your report fails to generate with unexpected error
messages displayed ...........250
The generated report displays the message “SQL
Error”................251
The report fails with a SQLSTATE:22003 arithmetic
overflow error .............251
No data is plotted in graph, or some columns in
the table are blank............252
The generated report displays the message “The
requested data is not available” .......253
You receive the message “serverName is unknown
host” ................253
You receive the message “Empty Data Set” . . . 254
Chapter 16. Tivoli Audit Facility
troubleshooting ..........255
Audit Log workspace shows only 100 of the most
recent audit records ...........255
Audit Log workspace does not display records
before the latest component startup ......255
Appendix. IBM Tivoli Monitoring
processes .............257
Documentation library .......259
IBM Tivoli Monitoring library ........259
Documentation for the base agents .....260
Related publications ...........261
Other sources of documentation .......261
Support information ........263
Notices ..............267
Glossary .............271
Index ...............285
Tables
1. Location of log files for the IBM Tivoli
Monitoring components. ........36
2. Installation log files ..........39
3. Upgrading from Tivoli Distributed
Monitoring log file ........41
4. Setting the trace option for the Tivoli
Monitoring upgrade toolkit .......50
5. General frequently asked questions ....65
6. Windows installation frequently asked
questions .............65
7. Frequently asked questions for UNIX-based
systems installation ..........66
8. lcfd log file.............80
9. Removing a failed installation on Windows 109
10. Installation logs ...........113
11. Uninstall OS command ........114
12. Cannot log in to the Tivoli Enterprise Portal
Server ..............115
13. Cannot connect to Tivoli Enterprise Portal
Server ..............118
14. The Tivoli Enterprise Portal Server does not
start after installation .........121
15. Control interface publishing ......122
16. Resolutions for agent deployment operations
that TIMEOUT ...........189
17. createNode command fails .......207
18. Utilities for backing up the database ....223
19. Resolving problems sending events to
Netcool/OMNIbus..........233
20. Event status updates in Netcool/OMNIbus are not forwarded to Tivoli Monitoring . . . 236
21. Monitoring events in Netcool/OMNIbus do
not have expected values .......239
22. IBM Tivoli Monitoring processes by operating
system ..............257
About this information
This guide provides problem determination and resolution information for the issues most commonly encountered with IBM® Tivoli® Monitoring components and related products.
You can use this guide in conjunction with the other books for your product.
Chapter 1. Introduction to troubleshooting
To troubleshoot a problem, you typically start with a symptom or set of symptoms and trace back to the cause.
Troubleshooting is not the same as problem solving. However, while troubleshooting you can often gather enough information to solve a problem, as with end-user errors, application programming errors, and system programming errors.
You might not always be able to solve a problem yourself after determining its cause. For example, a performance problem might be caused by a limitation of your hardware. If you are unable to solve a problem on your own, contact IBM Software Support for a solution. See Chapter 2, “Logs and data collection for troubleshooting,” on page 5 for information on the types of data to collect before contacting Support.
Sources of troubleshooting information
The primary troubleshooting feature is logging. Logging refers to the text messages and trace data generated by the software. Messages and trace data are sent to an output destination, such as a console screen or a file.
Typically, text messages relay information about the state and performance of a system or application. Messages also alert the system administrator to exceptional conditions when they occur. Consult the explanation and operator response associated with the displayed messages to determine the cause of the failure. See the document IBM Tivoli Monitoring Messages for message information.
Trace data captures transient information about the current operating environment when a component or application fails to operate as designed. IBM Software Support personnel use the captured trace information to determine the source of an error or unexpected condition. See “Trace logging” on page 35 for more information about tracing.
Problem classification
The first task in troubleshooting is to determine the origin of the problem, or which component or function is experiencing a problem. To assist you in determining the origin of the problem, collect documentation at the time of the error.
You might experience problems with IBM Tivoli Monitoring in the following areas:
• Installation
• Upgrading
• Configuration
• Connectivity
• Tivoli Enterprise Portal
• Tivoli Enterprise Portal Server
• Tivoli Enterprise Monitoring Server
• Tivoli Enterprise Monitoring Agent deployment
• Databases
• Tivoli Data Warehouse
• Universal Agent
• IBM Tivoli Enterprise Console
Viewing the IBM Support Portal
The IBM Support Portal is a unified, customizable view of all technical support tools and information for your IBM systems, software, and services. It brings all the support resources available for IBM hardware and software offerings together in one place.
About this task
Perform the following actions to access technotes for this product:
Procedure
1. Open the http://ibm.com website and select Support & downloads >
Technical support. You can also launch an IBM support website, such as
http://www.ibm.com/support/us.
2. Enter your IBM user ID when prompted or, in the Quick start page or Support
home, click Sign in to sign in with your IBM user ID or to register if you have not yet registered.
3. Enter a keyword or keywords for the information you want to find in the
Quick Find or Search support fields. You can also browse through the other Support tabs.
Subscribing to IBM support notifications
You can subscribe to email notifications about product tips and newly published fixes through the Support portal.
In the Support portal, you can specify the products for which you want to receive notifications; choose from flashes, downloads, and technotes; and set up email updates.
About this task
Perform the following actions to subscribe to Support emails.
Procedure
1. Open the http://ibm.com website and select Support & downloads >
Technical support. You can also launch an IBM support website, such as
http://www.ibm.com/support/us.
2. In the Quick start page or Support home, click Sign in to sign in or to register
if you have not yet registered.
3. In the Notifications area of Support home, click Manage all my subscriptions.
4. In the Subscribe and My defaults tabs, select a product family and continue
setting your preferences to specify the information you want in your emails.
5. If you have not yet added an email address to your profile, click My IBM >
Profile > Edit and add it to your personal information.
Results
You begin receiving “IBM My notifications” emails about the products you selected, at the interval you specified.
Chapter 2. Logs and data collection for troubleshooting
If you have a problem that you are unable to solve using the information in this guide or on the IBM Support Portal, gather the information that relates to the problem and contact IBM Software Support for further assistance.
Appropriate IBM Tivoli Monitoring RAS1 trace output
IBM Software Support uses the information captured by trace logs to trace a problem to its source or to determine why an error occurred.
The reliability, availability, and serviceability (RAS) trace logs are available on the Tivoli Enterprise Monitoring Server, the Tivoli Enterprise Portal Server, and the monitoring agent. By default, the logs are stored in the installation path for IBM Tivoli Monitoring.
The following links to sections in this document supply more information on these files:
• For information on where they are stored, see “Log file locations” on page 35.
• For information on setting the trace option for an IBM Tivoli Monitoring component, see “Setting traces” on page 43.
• For information on dynamically setting the trace settings, see “Dynamically modify trace settings for an IBM Tivoli Monitoring component” on page 53.
• For information on reading RAS1 logs, see “Reading RAS1 logs” on page 42.
• For information on the ras1log tool, see “ras1log tool” on page 60.
Running snapcore to collect information
Use the snapcore command for collecting information for use in identifying and resolving problems with an application.
The snapcore command gathers a core file, program, and libraries used by the program and compresses the information into a pax file. The file can then be downloaded to disk or tape, or transmitted to a remote system.
About this task
Take the following steps to run the snapcore command and collect information you might need to debug and analyze the problem:
Procedure
1. Change to the directory where the core dump file is located:
   # ls -l
   total 84176
   -rw-r--r-- 1 root system 2704 Feb 21 09:52 core.18048.01084144
2. Run the snapcore command to collect all needed files:
   # snapcore -d /tmp/myDir core.18048.01084144
   The snapcore command gathers all information and creates a new compressed pax archive in the /tmp/myDir directory. If you do not specify a special directory using the -d flag, the archive will be stored in the /tmp/snapcore directory. The new archive file will be named as snapcore_$pid.pax.Z:
   # ls -l /tmp/myDir
   total 5504
   -rw-r--r-- 1 root system 2815081 Feb 21 09:56 snapcore_20576.pax.Z
3. To check the content of the pax archive, run the uncompress command:
   # uncompress -c snapcore_20576.pax.Z | pax
   core.18048.01084144
   README
   lslpp.out
   errpt.out
   vi
   ./usr/lib/libc.a
   ./usr/lib/libcrypt.a
   ./usr/lib/libcurses.a
   ./usr/lib/nls/loc/en_US
   ./usr/lib/libi18n.a
   ./usr/lib/libiconv.
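The snapcore steps in this procedure can be wrapped in a small shell helper. The following is a minimal sketch, not part of the product: the function name collect_core and the DRY_RUN variable are hypothetical names introduced for illustration, and the only snapcore option used is the -d flag shown in the procedure. When DRY_RUN is set to 1, the function prints the command instead of running it, which helps when you prepare the call on a system where snapcore is not installed.

```shell
#!/bin/sh
# collect_core: hypothetical wrapper around the snapcore steps in this section.
# Usage: collect_core <core-file> [archive-dir]
collect_core() {
    core_file=$1
    # /tmp/snapcore is the default archive directory named in the text.
    out_dir=${2:-/tmp/snapcore}

    if [ -z "$core_file" ]; then
        echo "usage: collect_core <core-file> [archive-dir]" >&2
        return 2
    fi

    # DRY_RUN is an assumption of this sketch: print the command, do not run it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "snapcore -d $out_dir $core_file"
        return 0
    fi

    if ! command -v snapcore >/dev/null 2>&1; then
        echo "snapcore not found on this system" >&2
        return 1
    fi

    # Create the pax archive, then list the archive directory to confirm it.
    snapcore -d "$out_dir" "$core_file" && ls -l "$out_dir"
}
```

For example, running collect_core core.18048.01084144 /tmp/myDir from the directory that holds the core file corresponds to steps 1 and 2 of the procedure.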
Locating the core file
You can read the core file for information related to system stops on UNIX-based systems. Use the errpt -a command to get a summary of the most recent system stoppages and the location of the core file.
If the system stops on UNIX-based systems, collect the core file from the directory that stores the binary file to which the process belongs. For example, if the failing process is the Tivoli Enterprise Portal Server process, KfwServices, the core is created in the /opt/IBM/ITM/archtype/cq/bin/ directory.
Procedure
To retrieve information on where the core file is created, enter the errpt -a command.
Results
A summary of information is displayed about the most recent crashes and also the location of the core file:
---------------
LABEL: CORE_DUMP
IDENTIFIER: A63BEB70

Date/Time: Tue Jun 30 15:38:47 DFT 2009
Sequence Number: 1229
Machine Id: 0056536D4C00
Node Id: nc114062
Class: S
Type: PERM
Resource Name: SYSPROC

Description
SOFTWARE PROGRAM ABNORMALLY TERMINATED

Probable Causes
SOFTWARE PROGRAM

User Causes
USER GENERATED SIGNAL

Recommended Actions
CORRECT THEN RETRY

Failure Causes
SOFTWARE PROGRAM

Recommended Actions
RERUN THE APPLICATION PROGRAM
IF PROBLEM PERSISTS THEN DO THE FOLLOWING
CONTACT APPROPRIATE SERVICE REPRESENTATIVE

Detail Data
SIGNAL NUMBER
11
USER’S PROCESS ID:
32248
FILE SYSTEM SERIAL NUMBER
10
INODE NUMBER
655367
PROCESSOR ID
0
CORE FILE NAME
/opt/IBM/ITM/aix533/cq/bin/core
PROGRAM NAME
KfwServices
STACK EXECUTION DISABLED
---------------
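To pull the core file location out of a long error report, you can filter the errpt -a output. The following is a minimal sketch under stated assumptions: the function name core_file_from_errpt is hypothetical, the value is assumed to follow the CORE FILE NAME label either on the same line or on the next line, and the path is assumed to contain no spaces. It prints the path from the first matching entry in its input.

```shell
#!/bin/sh
# core_file_from_errpt: hypothetical helper that reads "errpt -a" output on
# standard input and prints the path reported under CORE FILE NAME.
core_file_from_errpt() {
    awk '
    /^CORE FILE NAME/ {
        rest = $0
        sub(/^CORE FILE NAME[ \t]*/, "", rest)
        # If the label stands alone, the value is on the next line.
        if (rest == "") {
            if ((getline rest) <= 0) exit
        }
        # Keep only the first whitespace-delimited token (the path).
        n = split(rest, parts, " ")
        if (n > 0) print parts[1]
        exit
    }'
}

# Example on AIX: errpt -a | core_file_from_errpt
```

The helper stops at the first match, so it reports one core file per call. If the report holds no CORE FILE NAME entry, it prints nothing.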
Getting Dr. Watson dumps and logs
Use the Dr. Watson debugger to get the information needed by IBM Support to diagnose problems on Windows systems.
If you encounter errors or failures on your Windows system, collect the drwtsn32.log and user.dmp files if they are available. The drwtsn32.log and user.dmp files are located in: \Documents and Settings\All Users\Documents\DrWatson.
About this task
Take the following steps to enable Dr. Watson and configure it to create a detailed dump file:
Procedure
1. To enable Dr. Watson as the default debugger, at the command prompt, enter the following command: drwtsn32 -i
2. To open the Dr. Watson configuration dialog, at the command prompt, enter the following command: drwtsn32
3. Set the following fields:
   a. Set the Crash dump Type to FULL.
   b. Clear the Dump Symbol Table check box.
   c. Enable the Dump all Thread Contexts check box.
   d. Enable the Create Crash Dump File check box.
KpcCMA.RAS files
IBM Tivoli Monitoring on Windows systems has KpcCMA.RAS files (where pc is the two-character product or component code) in the c:\windows\system32 directory to collect information about monitoring process failures.
For example, KNTCMA.RAS is the reliability, availability, and serviceability file for the Monitoring Agent for Windows OS. These files contain system dump information similar to the drwtsn32.log file, but are generated by the IBM Tivoli Monitoring infrastructure.
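The naming rule for these files can be expressed as a short helper. This is only an illustrative sketch: the function name ras_file_path is hypothetical, and the check that a product code is exactly two alphanumeric characters is an assumption of the example. The directory is the one named in this section.

```shell
#!/bin/sh
# ras_file_path: hypothetical helper that builds the KpcCMA.RAS path for a
# two-character product code, for example "nt" -> KNTCMA.RAS.
ras_file_path() {
    pc=$(printf '%s' "$1" | tr '[:lower:]' '[:upper:]')
    case $pc in
        [A-Z0-9][A-Z0-9]) ;;
        *) echo "product code must be two characters" >&2; return 2 ;;
    esac
    # Directory taken from the text above; single quotes keep the backslashes.
    printf '%s\\K%sCMA.RAS\n' 'c:\windows\system32' "$pc"
}
```

Calling ras_file_path nt gives the path of the file for the Monitoring Agent for Windows OS, which is the file to look for when collecting data about a failure of that agent.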
Sources of other important information
You can collect important information from log files, such as trace or message logs that report system failures. Also, application information provides details on the application that is being monitored, and you can obtain information from messages or information on screen.
The following sources provide additional information to aid in troubleshooting:
• Monitored application file as specified on the SOURCE FILE statement, if applicable.
• Description of the operation scenario that led to the problem.
• Incorrect output, such as Tivoli Enterprise Portal screen captures or a description of what you observed, if applicable.
• Log files collected from failing systems. You can collect all logs or logs of a certain type, such as RAS trace logs or message logs.
• Messages and other information displayed on the screen.
• Information about the application that you are monitoring, such as DB2 or SAP. This information includes the version number, patch level, and a sample application data file if you are monitoring a file.
• Operating system version number and patch level.
• Version number of the following members of the monitoring environment:
  – IBM Tivoli Monitoring and the patch level, if available.
  – Monitoring Agent version number.
  – Tivoli Enterprise Portal (Select Help > About Tivoli Enterprise Portal)
Note: The version number of the Tivoli Enterprise Portal and the Tivoli Enterprise Portal Server must always be synchronized.
Chapter 3. Common problem solving
Customers using IBM Tivoli Monitoring products or the components of Tivoli Management Services can encounter problems such as missing workspaces or historical data, or a reflex automation script that does not run when it should. In many cases you can recover from these problems by following a few steps.
Note: Use the trace settings indicated in these troubleshooting instructions only while you are trying to diagnose a specific problem. To avoid generating excessive trace data, go back to the default trace settings as soon as the problem is solved.
About the tools
You can access several troubleshooting tools, such as the Log analyzer or the pdcollect tool, to help you troubleshoot your IBM Tivoli Monitoring product or the components of Tivoli Management Services.
ITMSUPER Tools
The ITMSUPER Tools give you information about the health of your managed systems, situations, and environment configuration. You can find the tools by searching for “ITMSUPER” in the IBM Integrated Service Management Library (http://www.ibm.com/software/brandcatalog/ismlibrary).
pdcollect tool
The pdcollect tool collects the most commonly used information from a system. It gathers log files, configuration information, version information, and other data. You can also use this tool to manage the size of trace data repositories. For more information see “pdcollect tool” on page 59.
IBM Support Assistant
The IBM Support Assistant is a free, stand-alone application that you can install on any workstation. Then, you can enhance the application by installing product-specific plug-in modules for the IBM products you use. For more information see “Support information” on page 263.
I am trying to find out what software is supported
Use resources in the IBM Tivoli Monitoring Installation and Setup Guide and the IBM website to determine the software that is supported. This enables you to find platform or database information for specific products.
The following resources are available to determine the software that is supported:
• For specific information about the supported software for IBM Tivoli Monitoring, see “Hardware and software requirements” in the IBM Tivoli Monitoring Installation and Setup Guide.
• For platform and database support information for most Tivoli products, consult the matrix at Tivoli Supported Platforms (http://www-306.ibm.com/software/sysmgmt/products/support/Tivoli_Supported_Platforms.html).
Workspaces are missing or views are empty
You might find that workspaces are missing or that views are empty. For example, a workspace might return no data.
Symptoms of the problem:
• The workspaces return no data.
• There are no child Navigator items under the agent node in the Navigator view. See “Resolving application support problems” on page 11.
• The Navigator items are labeled with internal names, such as Knt:KNT1076, instead of the correct names (such as Disk). See “Resolving application support problems” on page 11.
• You receive message KFWITM217E: Request error: SQL1_CreateRequest failed, rc=209. See “Resolving application support problems” on page 11.
• You receive message KFWITM220E: Request failed during execution. See “Resolving monitoring agent problems” on page 14.
For more information on workspaces that relate to historical data, see “Historical data is missing or incorrect” on page 20.
To diagnose the problem that workspaces are missing or empty, see “Diagnosing that workspaces are missing or empty.”
Diagnosing that workspaces are missing or empty
You can diagnose that workspaces are missing or empty by verifying that the monitoring agent has been started and that the configuration is correct.
You can also check that application support has been added.
About this task
To diagnose that workspaces are missing or empty, perform the following steps:
Procedure
Preliminary diagnostics
1. Refresh the Navigator by clicking View > Refresh.
2. Verify that the monitoring agent has been started. Restart if necessary. In the
Tivoli Enterprise Portal, right-click the Navigator item of the monitoring agent and click Start or Restart.
3. Verify that the monitoring agent configuration is correct.
4. If your data is missing in an Oracle Agent workspace, see “Resolving Oracle
DB Agent problems - diagnostic actions” on page 32. Similar problems might exist for other monitoring agents.
5. Check that application support has been added. See “Resolving application
support problems” on page 11.
What to do next
For more information on actions that relate to these diagnostics, see the problem resolution tasks.
Resolving application support problems
Application support problems are caused by a lack of application support or an application support level mismatch among the components: monitoring server, portal server, desktop and Java Web Start clients, and monitoring agents. Check the installed level of application support or run the ITMSUPER Tivoli Enterprise Monitoring Server analysis tool (or both) to get more information.
Before you begin
Complete one or both of the following tasks to ensure that this is an application support problem:
• “Diagnosing that workspaces are missing or empty” on page 10
• “Diagnosing that a situation does not raise when expected” on page 25
About this task
To resolve application support problems, you perform diagnostic and corrective actions. These actions include checking application support on the servers and client and running the Tivoli Enterprise Monitoring Server tool to ensure that application support is installed consistently in your environment.
Procedure
Diagnostic and corrective actions
1. Check application support on the monitoring server, portal server, and portal client:
   • Run the kincinfo.exe -i command in the %CANDLE_HOME%\InstallITM directory to show what is installed.
   • Run the ./cinfo -i command in the $CANDLEHOME/bin directory to show what is installed.
   • (monitoring server) Look in the &rhilev.&rte.RKANDATV data set for files named KppCAT and KppATR, where pp is the two-character product or component code, &rhilev is the high-level qualifier, and &rte is the mid-level qualifier of the libraries for the runtime environment where the monitoring server is configured.
2. You can also run the Tivoli Enterprise Monitoring Server analysis tool provided by ITMSUPER against the hub monitoring server to ensure that application support is installed consistently throughout your environment.
3. If application support is missing, add the appropriate application support to the portal server and monitoring server for the monitoring agents.
4. If the desktop client or Java Web Start client is being used, ensure application support is installed on the portal client.
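The ./cinfo -i check in this procedure can be scripted so that it fails with a clear message when the installation directory is not where you expect. The following is a minimal sketch, assuming the CANDLEHOME environment variable points to the IBM Tivoli Monitoring installation directory; the function name show_installed_support is hypothetical.

```shell
#!/bin/sh
# show_installed_support: hypothetical wrapper for the ./cinfo -i check.
# It runs "cinfo -i" from $CANDLEHOME/bin, as described in the procedure.
show_installed_support() {
    if [ -z "${CANDLEHOME:-}" ]; then
        echo "CANDLEHOME is not set" >&2
        return 2
    fi
    if [ ! -x "$CANDLEHOME/bin/cinfo" ]; then
        echo "cinfo not found in $CANDLEHOME/bin" >&2
        return 1
    fi
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$CANDLEHOME/bin" && ./cinfo -i )
}
```

Running the command in a subshell keeps the working directory of the calling script unchanged, and the distinct return codes let a caller tell an unset variable apart from a missing cinfo program.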
What to do next
For more information and instructions on installing application support see “Configuring application support for nonbase monitoring agents” in the IBM Tivoli Monitoring Installation and Setup Guide. For instructions on installing application
support on a z/OS monitoring server, see “Adding application support to a monitoring server on z/OS” in Configuring IBM Tivoli Enterprise Monitoring Server on z/OS.
Resolving monitoring server problems
Monitoring server problems are caused by a monitoring server that is not started or connectivity that is lost either between servers or between servers and agents. You can restart the Tivoli Enterprise Monitoring Server, and you can also run the ITMSUPER Topology tool to get more information.
About this task
To resolve monitoring server problems, you perform diagnostic and corrective actions. These actions include running tools, such as the Topology or Connectivity tool, and correcting the communication failures reported in the logs.
Procedure
Diagnostic and corrective actions
1. If you are an administrator, restart the monitoring server. Otherwise, notify an
administrator and wait for the monitoring server to be restarted.
2. Running the following ITMSUPER tools might also provide more information:
v Topology tool
v Connectivity tool
v Tivoli Enterprise Monitoring Server analysis tool
v Tivoli Enterprise Portal Server
3. Check the portal server logs for messages indicating communication failures to
the monitoring server.
4. Check the monitoring server logs for messages indicating communication
failures to the remote monitoring servers or to monitoring agents.
5. Correct the communication failures indicated in the logs.
Resolving monitoring agent problems
If the monitoring agent is running but data is not being returned or if you receive an error message from an agent log, such as Endpoint unresponsive, verify that the agent is connected and online. You can also verify that application support has been installed correctly.
About this task
To resolve monitoring agent problems, you perform diagnostic and corrective actions. These actions include verifying that the agent is running and that application support has been installed correctly. For information on monitoring agents on z/OS, see each product's Problem Determination Guide.
Procedure
Diagnostic and corrective actions
1. Verify that the agent is connected. Check the monitoring server log for messages similar to Remote node <SYS:MQIRA> is ON-LINE.
2. If the agent is online, check to see whether subnodes are online in the agent log. For example: KMQMI171I Node JSG1:SYS:MQESA is online.
3. If subnodes are online, are workspaces showing correct titles?
v No: Verify that application support has been installed correctly and that
buildpresentation.bat ran correctly.
v Yes: Go to the next step.
4. If workspaces contain titles, is there a column heading?
v No: Verify that application support has been installed correctly and that
buildpresentation.bat ran correctly.
v Yes: Go to the next step.
5. If there is only a column heading with no data, turn on UNIT:KRA ALL in the
agent and verify that rows are being returned when the workspaces are displayed.
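Steps 1 and 2 can be scripted when the logs are large. The following is a minimal Python sketch, assuming that the log lines contain messages shaped like the two examples quoted in the steps above. The function name online_nodes and the regular expressions are illustrative assumptions, not part of the product, and real log lines might carry additional prefixes.

```python
import re

# Patterns modeled on the two example messages quoted in steps 1 and 2.
# search() is used rather than match() because real log lines may start
# with a time stamp or other prefix (an assumption about the log layout).
REMOTE_NODE_ONLINE = re.compile(r"Remote node <(?P<node>[^>]+)> is ON-LINE")
SUBNODE_ONLINE = re.compile(r"\bNode (?P<node>\S+) is online")


def online_nodes(lines):
    """Return (agent_nodes, subnodes) reported online in the given log lines."""
    agents, subnodes = [], []
    for line in lines:
        m = REMOTE_NODE_ONLINE.search(line)
        if m:
            agents.append(m.group("node"))
            continue
        m = SUBNODE_ONLINE.search(line)
        if m:
            subnodes.append(m.group("node"))
    return agents, subnodes


if __name__ == "__main__":
    sample = [
        "Remote node <SYS:MQIRA> is ON-LINE.",
        "KMQMI171I Node JSG1:SYS:MQESA is online.",
    ]
    print(online_nodes(sample))
```

Feed the monitoring server log and the agent log to the function separately if you want to keep the two checks distinct.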
Status of a monitoring agent is mismatched between the portal client and tacmd command
The status of a monitoring agent can differ between the Tivoli Enterprise Portal client and the tacmd command. For example, a monitoring agent shows as online in the portal client and offline in the results of a tacmd command.
Diagnosing that the status of a monitoring agent is mismatched between the portal client and tacmd command
You can diagnose that the status of a monitoring agent is mismatched between the Tivoli Enterprise Portal and tacmd command by setting a trace to determine whether the problem is in the Tivoli Enterprise Portal Server or the Tivoli Enterprise Monitoring Server.
About this task
To diagnose that the status of a monitoring agent is mismatched between the portal client and tacmd command, perform the following steps:
Procedure
Preliminary diagnostics
1. Verify the state of the monitoring agent in Manage Tivoli Monitoring Services.
2. Compare the status of the node in the physical Navigator view with the status
reported in the Managed System Status workspace. If the status in the physical Navigator view agrees with the status shown in the Managed System Status workspace, then the problem is at the monitoring agent. See “Resolving monitoring agent problems” on page 14.
3. To determine whether the problem is in the portal server or monitoring server, set the following trace in the portal server: ERROR (UNIT:ctcmw IN ER)
4. Then examine the portal server log for the following statement: Node Status event received (managed system name).
v If the trace shows that the last node status record received for the managed system matches the status shown in the portal client, then the problem is located in the monitoring server. See “Resolving monitoring server problems” on page 14.
v If the trace shows that the last node status record received for the managed system indicated the correct status, then the problem is located in the portal server. Run the portal server trace, collect logs, and call IBM Software Support.
What to do next
For more information on actions that relate to these diagnostics, see the problem resolution tasks.
Resolving monitoring agent problems
Monitoring agent problems, such as a monitoring agent that has not started, can be resolved by refreshing the Navigator status in the Managed System Status workspace.
About this task
To resolve monitoring agent problems, you perform diagnostic and corrective actions. These actions include checking the status of the monitoring agent and the monitoring server.
Procedure
Diagnostic and corrective actions
1. Open the Managed System Status workspace and click View > Refresh.
2. Make sure the monitoring agent is connected to the correct monitoring server.
3. Check the status of the monitoring server that the monitoring agent is
connected to. For more information, see the monitoring server problem resolution task.
Resolving monitoring server problems
Tivoli Enterprise Monitoring Server problems, such as loss of connectivity between a monitoring agent and a remote monitoring server, can be resolved by checking, for example, that the remote monitoring server is connected to the hub monitoring server.
Some causes of monitoring server problems:
v A remote monitoring server has shut down
v Loss of connectivity between the monitoring agent and the remote monitoring
server to which it reports, or between that monitoring server and the hub monitoring server
v You receive the following message in the monitoring server log:
KDS9151E: The heartbeat from remote TEMS variable was not received at its scheduled time and the remote TEMS has been marked offline.
About this task
To resolve monitoring server problems, you perform diagnostic and corrective actions. These actions include running tools, such as the ITMSUPER tools, and correcting connectivity failures.
Procedure
Diagnostic and corrective actions
1. Check the Managed System Status workspace in the portal client.
2. If the monitoring agent is connected through a remote monitoring server, confirm that the remote monitoring server is connected to the hub monitoring server.
3. If the remote monitoring server is not running and you are an administrator, restart it. Otherwise, notify an administrator and wait for the remote monitoring server to be restarted.
4. Running the following ITMSUPER tools might also provide more information:
v Topology tool
v Connectivity tool
v Agent response time tool
v Tivoli Enterprise Monitoring Server analysis tool
5. Correct the connectivity failures identified.
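To find which remote monitoring servers have been marked offline, you can search the monitoring server log for the KDS9151E message quoted above. The following Python sketch assumes that the word "variable" in the message text is replaced by the remote monitoring server name as a single token without spaces. The function name offline_remote_tems and the sample name REMOTE_host1 are hypothetical.

```python
import re

# Modeled on the KDS9151E text quoted in this section; the remote TEMS name
# is assumed to appear as one whitespace-free token where the guide shows
# the placeholder "variable".
KDS9151E = re.compile(
    r"KDS9151E:? The heartbeat from remote TEMS (?P<tems>\S+) was not received"
)


def offline_remote_tems(lines):
    """Return remote TEMS names flagged by KDS9151E, in first-seen order."""
    seen = []
    for line in lines:
        m = KDS9151E.search(line)
        if m and m.group("tems") not in seen:
            seen.append(m.group("tems"))
    return seen


if __name__ == "__main__":
    sample = [
        "KDS9151E: The heartbeat from remote TEMS REMOTE_host1 was not "
        "received at its scheduled time and the remote TEMS has been "
        "marked offline.",
    ]
    print(offline_remote_tems(sample))
```

Each name returned is a candidate for steps 2 and 3: confirm its connection to the hub monitoring server and restart it if it is not running.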
The portal server does not start or stops responding
The Tivoli Enterprise Portal Server might not start, or it might stop responding. For example, you might receive a message that communication with the portal server could not be established or that the portal server is not ready.
Symptoms of the problem
v Portal client logon fails. See “Diagnosing that portal server logon fails” on page 18.
v The portal server stops responding during normal operation of the portal client.
v You receive message KFWITM091E: View not available at this time.
v You receive message KFWITM010I: Tivoli Enterprise Portal Server not ready.
v You receive message KFWITM402E: Communication with the Tivoli Enterprise Portal Server could not be established.
v You find a text string similar to KFWDBVER, version not found when trying to start the portal server. See “Resolving database problems - missing table or portal server database” on page 16.
Diagnosing that the portal server does not start or stops responding
You can diagnose that the Tivoli Enterprise Portal Server does not start or stops responding by running the Tivoli Enterprise Portal Server Analysis ITMSUPER tool.
About this task
To diagnose that the portal server does not start or stops responding, perform the following steps:
Procedure
Preliminary diagnostics
1. For more information about any messages received, see the IBM Tivoli Monitoring: Messages (http://publib.boulder.ibm.com/infocenter/tivihelp/v15r1/topic/com.ibm.itm.doc_6.2.3fp1/itm623_messages.htm) reference guide. Operator responses and general information are provided for each message.
2. Allow the portal client enough time to establish a connection with the portal server.
3. Is DB2 running?
v Yes: See step 4.
v No: See “Resolving database problems - instance not started” on page 17.
4. Collect the portal server log or the operations log and look for the following
text strings:
v KFWDBVER, version not found
v TEPS database not found
v user ID or password invalid
v DB2 instance not started
5. Run the Tivoli Enterprise Portal Server Analysis ITMSUPER tool.
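The search in step 4 can be sketched as a simple lookup from each text string to the resolution task that appears to address it. This is an illustration only: the link from KFWDBVER to the missing-table task is stated in the symptoms list, but the other three pairings are inferred from the task titles, and the exact wording in a real log might differ. The names SIGNATURES and suggest_tasks are hypothetical.

```python
# Resolution task titles as they appear in this chapter.
MISSING_DB = "Resolving database problems - missing table or portal server database"
USER_PASSWORD = "Resolving database problems - user ID and password"
INSTANCE = "Resolving database problems - instance not started"

# Text strings from step 4. Only the KFWDBVER pairing is stated in the guide;
# the remaining pairings are assumptions based on the task titles.
SIGNATURES = {
    "KFWDBVER, version not found": MISSING_DB,
    "TEPS database not found": MISSING_DB,
    "user ID or password invalid": USER_PASSWORD,
    "DB2 instance not started": INSTANCE,
}


def suggest_tasks(log_text):
    """Return the resolution tasks whose text strings occur in log_text."""
    lowered = log_text.lower()
    tasks = []
    for signature, task in SIGNATURES.items():
        if signature.lower() in lowered and task not in tasks:
            tasks.append(task)
    return tasks


if __name__ == "__main__":
    print(suggest_tasks("... DB2 instance not started ..."))
```

The comparison is case-insensitive because the guide describes the strings as similar text rather than exact matches.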
What to do next
For more information on actions that relate to these diagnostics, see the problem resolution tasks.
Resolving database problems - missing table or portal server database
Database problems caused by a missing table or Tivoli Enterprise Portal Server database, or by a mismatch between the portal server version and the version record in the database, can be resolved by reconfiguring the portal server.
About this task
To resolve database problems, you perform diagnostic and corrective actions. These actions include reconfiguring the portal server.
Procedure
Diagnostic and corrective actions
1. On Windows: To reconfigure the portal server, open Manage Tivoli Monitoring Services, right-click the portal server, and select Reconfigure. If the problem persists, run one of the following commands and set the correct password in the window that is displayed:
v For an SQL database, cnpsdatasource.exe
v For a DB2 database, db2datasource.exe
2. On Linux or UNIX: To reconfigure the portal server, take one of the following steps:
v On the GUI interface, open Manage Tivoli Monitoring Services, right-click the portal server, and select Reconfigure.
v On the command-line interface, run the ./itmcmd config -A cq command.
3. Run the buildpresentation script.
Resolving database problems - user ID and password
Database problems caused by a password that does not match the operating system password or an incorrect password in the registry can be resolved by reconfiguring the Tivoli Enterprise Portal Server and verifying the user ID and password.
The causes of database problems include:
v Portal server database user password is out of sync.
v User ID does not match the operating system logon user ID.
v Password does not match the operating system password.
v Registry does not have the correct password.
About this task
To resolve database problems, you perform diagnostic and corrective actions. These actions include reconfiguring the portal server and ensuring that your portal client user ID is the same as the logon user ID of your system.
Procedure
Diagnostic and corrective actions
v On Windows: To reconfigure the portal server, run the tacmd configureportalserver command. If the problem persists, take one or more of the following steps:
– Ensure that your portal client user ID is identical to the logon user ID of your system and use the correct capitalization for your user ID and password. If you need to change your password, take the following steps:
1. Right-click My Computer and select Manage.
2. Select Local Users and Groups.
3. Select Users.
4. Right-click your user ID and select Properties.
5. For db2admin, set the password to never expire.
– Check the DB2 UDB database and ensure that the db2admin user ID and password match those of the db2admin local account:
1. Click Control Panel > Administrative Tools > Services.
2. Right-click DB2 - DB2 and select Properties.
3. Select the Log On tab and ensure that the db2admin user ID and password match the db2admin UDB account.
– Check the DB2 user ID and password for the database and data source:
1. Click Control Panel > Administrative Tools > Data Sources (ODBC).
2. On the System DSN tab, select TEPS2 and click Configure.
3. Enter your user ID and password. For example: db2admin for database and CNPS for data source.
4. To test the connection to the UDB database, click Connect.
– On the Advanced Settings tab, verify that the DATABASE name is correct.
v On Linux or UNIX: To reconfigure the portal server, take one of the following steps:
– On the GUI interface, open Manage Tivoli Monitoring Services, right-click the portal server, and select Reconfigure.
– On the command-line interface, run the ./itmcmd config -A cq command.
Resolving database problems - instance not started
Database problems such as a DB2 instance that is not started can be resolved by recycling the Tivoli Enterprise Portal Server and ensuring that the user ID and password are correct.
About this task
To resolve database problems, you perform diagnostic and corrective actions. These actions include ensuring that the user ID and password are correct.
Procedure
1. Check the status of the instance in the DB2 Control Panel.
2. Recycle the portal server and resolve any issues reported.
3. Ensure that the user ID and password are correct.
Diagnosing that portal server logon fails
Logging on to the Tivoli Enterprise Portal Server fails when a user ID is locked or disabled, or when an internal error occurs during logon. Examine the portal server or portal client logs for more information.
The logon failure might cause one or more of the following messages to display:
v KFWITM392E: Internal error occurred during logon.
v KFWITM009I: The Tivoli Enterprise Portal Server is still being initialized and is not ready for communications.
v KFWITM010I: Tivoli Enterprise Portal Server not ready.
v KFWITM395E: User ID has been locked or disabled.
v KFWITM396E: User ID has been locked or disabled by Tivoli Enterprise Portal Server.
About this task
To diagnose that Tivoli Enterprise Portal Server logon fails, perform the following steps:
Procedure
Preliminary diagnostics
1. For a guide to the messages and operator responses, refer to IBM Tivoli Monitoring: Messages (http://publib.boulder.ibm.com/infocenter/tivihelp/v15r1/topic/com.ibm.itm.doc_6.2.3fp1/itm623_messages.htm).
2. Look in the portal server or portal client logs for more information concerning
the message.
The portal client does not respond
The Tivoli Enterprise Portal client might not respond, or it might stop running.
Diagnosing that the portal client does not respond
You can diagnose that the Tivoli Enterprise Portal does not respond by verifying that a Tivoli Enterprise Monitoring Server is started or the selected workspace is returning data.
About this task
To diagnose that the Tivoli Enterprise Portal does not respond, perform the following steps:
Procedure
Preliminary diagnostics
1. Verify that the monitoring server is started.
2. If you have selected a workspace that is retrieving large amounts of data, wait
for the data to be returned. If the workspace returns empty, see “Workspaces are missing or views are empty” on page 10.
3. On Windows, check the Windows Task Manager and, in the %CANDLE_HOME%\InstallITM directory, run the following kincinfo.exe commands:
v kincinfo.exe -r to show running processes.
v kincinfo.exe -i to show what is installed.
4. On Linux or UNIX, in the $CANDLEHOME/bin directory, run the following cinfo commands:
v ./cinfo -r to show running processes.
v ./cinfo -i to show what is installed.
5. If your portal client stops responding while in an Oracle Agent workspace, see
“High CPU usage on a distributed system” on page 29. Your problem might be related to a high CPU usage problem. Similar problems might exist for other monitoring agents.
6. Running the following ITMSUPER tools might also provide more information:
v Stressed Resources tool
v Connectivity tool
v Topology tool
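If you collect this information from many systems, the platform choice in steps 3 and 4 can be wrapped in a small helper. The following Python sketch only builds the working directory and the argument list and does not run anything. It assumes that the CANDLE_HOME (Windows) or CANDLEHOME (Linux or UNIX) environment variable is set, and the function name info_command is hypothetical.

```python
import os
import sys


def info_command(option, platform=None, env=None):
    """Build (working_directory, argv) for the commands in steps 3 and 4.

    option   -- "-r" to show running processes, "-i" to show what is installed
    platform -- a sys.platform style string; defaults to the current platform
    env      -- mapping holding CANDLE_HOME or CANDLEHOME; defaults to os.environ
    """
    if option not in ("-r", "-i"):
        raise ValueError("option must be '-r' or '-i'")
    platform = sys.platform if platform is None else platform
    env = os.environ if env is None else env
    if platform.startswith("win"):
        # Windows: kincinfo.exe lives in %CANDLE_HOME%\InstallITM
        return env["CANDLE_HOME"] + "\\InstallITM", ["kincinfo.exe", option]
    # Linux or UNIX: cinfo lives in $CANDLEHOME/bin
    return env["CANDLEHOME"] + "/bin", ["./cinfo", option]


if __name__ == "__main__":
    print(info_command("-i", platform="linux", env={"CANDLEHOME": "/opt/IBM/ITM"}))
```

The installation paths in the example are placeholders. To run the command, change to the returned directory first or pass it as the working directory of the process that you start.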
What to do next
For more information on actions that relate to these diagnostics, see the problem resolution tasks.
Resolving storage or memory problems
Storage or memory problems arise when a condition leads to a lack of storage or memory. To resolve storage or memory problems, you perform diagnostic and corrective actions. These actions include reconfiguring the Java Control Panel.
Procedure
Reconfigure the Java Control Panel. See “Tivoli Enterprise Portal has high memory usage” on page 138.
Resolving client configuration problems
To resolve Tivoli Enterprise Portal configuration problems, disable DirectDraw to reduce high CPU usage caused by the Java process attempting to write to the screen.
Procedure
Disable DirectDraw by setting the sun.java2d.noddraw client variable to true. See the “Portal client configuration settings” topics in the IBM Tivoli Monitoring Administrator's Guide.
Historical data is missing or incorrect
Historical data might be missing or incorrect. For example, a workspace might be missing historical data.
Symptoms of the problem:
v Workspace is missing historical data.
v Workspace graphs or tables contain short-term but not long-term historical data.
By default, long-term historical data is older than 24 hours.
v Summarized historical data is not displayed.
v You suspect that the values returned for historical data are incorrect.
Diagnosing that historical data is missing or incorrect
You can diagnose that historical data is missing or incorrect by using workspaces such as the Self-Monitoring Topology workspace to verify component activity. Also, you can use the tacmd commands to verify the configuration of historical data.
About this task
To diagnose that historical data is missing or incorrect, perform the following steps:
Procedure
Preliminary diagnostics
1. To verify component connectivity through the Self-Monitoring Topology
workspace, perform the following steps:
a. In the Navigator Physical view, click the Enterprise item.
b. Select Workspace > Self-Monitoring Topology.
Alternatively, review the Tivoli Data Warehouse workspaces for the Warehouse Proxy agent and Summarization and Pruning agent.
2. Verify the historical data collection configuration in the portal client or by
issuing the following tacmd commands:
v tacmd histListProduct
v tacmd histListAttributeGroups
v tacmd histViewAttributeGroup
v tacmd histConfigureGroups
v tacmd histUnconfigureGroups
v tacmd histStartCollection
v tacmd histStopCollection
What to do next
For more information on actions that relate to these diagnostics, see the problem resolution tasks. See also the troubleshooting topics in the IBM Tivoli Warehouse Proxy Agent User's Guide (http://publib.boulder.ibm.com/infocenter/tivihelp/v15r1/topic/com.ibm.itm.doc_6.2.3fp1/wpa/wpagent_user.htm) and IBM Tivoli Warehouse Summarization and Pruning Agent User's Guide (http://publib.boulder.ibm.com/infocenter/tivihelp/v15r1/topic/com.ibm.itm.doc_6.2.3fp1/spa/spagent_user.htm).
Resolving warehouse proxy connection problems
Tivoli Data Warehouse proxy agent connection problems can be resolved by verifying that the correct socket connection is being used.
Warehouse Proxy agent problems can occur because short-term historical data is being stored at the monitoring agent or the Tivoli Enterprise Monitoring Server and the storage location should be switched, or because the Warehouse Proxy agent cannot connect to the data warehouse or to the monitoring server.
About this task
To resolve warehouse proxy connection problems, you perform diagnostic and corrective actions. These actions include verifying that the monitoring agent is connected to the Tivoli Data Warehouse and that the connection from the agent to the Tivoli Enterprise Monitoring Server is not being prevented by a firewall.
Procedure
Diagnostic and corrective actions
1. Ensure that the Warehouse Proxy agent is running.
2. Look for export failures to the Warehouse Proxy agent in either the monitoring
agent RAS1 log or the monitoring server RAS1 log. Depending on where the error is found, see the monitoring agent steps or monitoring server steps below.
v Monitoring agent:
a. Verify that the correct socket connection is being used.
b. Verify that the monitoring agent is connected to the Tivoli Data
Warehouse.
v Monitoring server:
a. Verify that the connection between the monitoring server and Warehouse
Proxy agent is not being stopped by a firewall.
b. Verify that the correct port is being used for each component.
3. For corrective actions, perform the following steps:
v Monitoring agent:
– Store the collected data at the location of the monitoring server to ensure a
stable connection.
v Monitoring server:
– Consider using a high port number for connecting to the monitoring
server. See the “Controlling port number assignments” topics in the IBM Tivoli Monitoring Installation and Setup Guide for more information on the COUNT and SKIP options for port number allocation.
Resolving warehouse proxy agent problems - configuration
If the Warehouse Proxy agent is connected to the Tivoli Data Warehouse and Tivoli Enterprise Monitoring Server but cannot transmit data, change the environment variable settings in the Warehouse configuration file.
About this task
To resolve Warehouse Proxy agent problems, you perform diagnostic and corrective actions. These actions include modifying the environment variable
settings in the Warehouse configuration file.
Procedure
Diagnostic and corrective actions
1. To perform a diagnostic action, review the current CTIRA_NCSLISTEN and
KHD_QUEUE_LENGTH settings in the Warehouse configuration file.
2. To perform a corrective action, set CTIRA_NCSLISTEN to at least 20 times the value of KHD_EXPORT_THREADS and increase KHD_QUEUE_LENGTH to a value greater than the number of agents being handled by that Warehouse Proxy agent.
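The two sizing rules in step 2 are simple arithmetic, so they can be checked before you edit the configuration file. The following Python sketch takes the three settings and the agent count as plain integers. How you read them from the Warehouse configuration file is left out, and the function name check_wpa_settings is hypothetical.

```python
def check_wpa_settings(ncs_listen, export_threads, queue_length, agent_count):
    """Return a list of findings; an empty list means both rules are met.

    Rule 1: CTIRA_NCSLISTEN >= 20 * KHD_EXPORT_THREADS
    Rule 2: KHD_QUEUE_LENGTH > number of agents handled by this agent
    """
    findings = []
    minimum_listen = 20 * export_threads
    if ncs_listen < minimum_listen:
        findings.append(
            f"CTIRA_NCSLISTEN={ncs_listen} is below the minimum of {minimum_listen}"
        )
    if queue_length <= agent_count:
        findings.append(
            f"KHD_QUEUE_LENGTH={queue_length} must exceed the agent count of {agent_count}"
        )
    return findings


if __name__ == "__main__":
    # Illustrative values only: 10 export threads and 1500 agents.
    for finding in check_wpa_settings(10, 10, 1000, 1500):
        print(finding)
```

With the illustrative values shown, both rules fail: 10 export threads call for CTIRA_NCSLISTEN of at least 200, and 1500 agents call for KHD_QUEUE_LENGTH above 1500.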
Resolving warehouse proxy agent problems - connectivity
Warehouse Proxy agent problems, such as the inability to send data to the Tivoli Data Warehouse, can be resolved by ensuring that the agent can export data and that it is not too busy.
About this task
To resolve warehouse proxy agent problems, you perform diagnostic and corrective actions. These actions include verifying that the warehouse database password and user ID are correct. Also, you can update the configuration parameters.
Procedure
Diagnostic and corrective actions
1. To perform diagnostic actions:
a. Verify component connectivity through the Self-Monitoring Topology
workspace. To open this workspace, right-click the Enterprise Navigator item, and then select Workspace > Self-Monitoring Topology.
b. Verify that the warehouse database password and user ID are correct and
have not expired.
c. Look at the Warehouse Proxy agent RAS1 logs for export resource
availability timeout. The Warehouse Proxy agent might be unable to export because it is too busy.
2. To perform a corrective action:
v Update the configuration parameters. See “Environment variables” in the
IBM Tivoli Monitoring Installation and Setup Guide.
Resolving summarization and pruning agent problems
Summarization and Pruning agent problems, such as unexpected values for attributes, can be resolved by reviewing the documentation for the monitoring agents that are producing the unexpected values.
The causes of Summarization and Pruning agent problems include the following: the Summarization and Pruning agent yields unexpected values; unanticipated attribute behavior leads to unexpected data.
About this task
To resolve summarization and pruning agent problems, you perform diagnostic and corrective actions. These actions include comparing real-time data from a workspace view with the unexpected data.
Procedure
1. To perform a diagnostic action:
v Open a workspace view that shows real-time data and compare it with the
unexpected data.
v To understand how data is aggregated for various data types, see Tivoli Management Services Warehouse and Reporting (http://www.redbooks.ibm.com/abstracts/sg247290.html). This IBM Redbooks publication describes aggregation methods used by the Summarization and Pruning Agent.
2. To perform a corrective action:
v Review the documentation for the monitoring agents that are generating the
unexpected values. This clarifies the expected types of values for the attributes in question.
Resolving persistent data store for z/OS problems
If you encounter a problem with the configuration of the persistent data store on the Tivoli Enterprise Monitoring Server, you can check the RKPDLOG output to verify the configuration.
About this task
To resolve persistent data store for z/OS problems, you perform diagnostic and corrective actions. These actions include verifying that the data store is configured properly.
Procedure
Diagnostic and corrective actions
1. Is historical data configured to be collected at the agent or the monitoring
server? If the agent is configured in the address space of the monitoring server, then historical data can be collected only at the monitoring server.
v If historical data is configured to be collected at the monitoring server, see
step 2 below.
v If historical data is configured to be collected at the agent, see step 3 below.
2. To verify that the persistent data store is configured correctly, on the
monitoring server, check the RKPDLOG output, for example:
2008/07/28 08:45:41 KPDIFIL: Status of files assigned to group GENHIST:
2008/07/28 08:45:41 -----------------------------------------------------
2008/07/28 08:45:41 &philev.RGENHIS3 Status = Active
2008/07/28 08:45:41 &philev.RGENHIS2 Status = Offline
2008/07/28 08:45:41 &philev.RGENHIS1 Status = Offline
2008/07/28 08:45:41 -----------------------------------------------------
2008/07/28 08:45:41 KPDIFIL: End of group GENHIST status.
3. To verify that the persistent data store is configured correctly at the agent,
check the RKPDLOG of the agent, for example:
v If KM5AGENT (this agent runs on the monitoring server), check the
RKPDLOG of the monitoring server:
2008/07/28 08:48:27 KPDIFIL: Status of files assigned to group PLEXDATA:
2008/07/28 08:48:27 -----------------------------------------------------
2008/07/28 08:48:27 &philev.RKM5PLX3 Status = Active
2008/07/28 08:48:27 &philev.RKM5PLX2 Status = Empty
2008/07/28 08:48:27 &philev.RKM5PLX1 Status = Partially Full
2008/07/28 08:48:27 -----------------------------------------------------
v If the MQ agent is running in its own address space, check its RKPDLOG
(time stamp not shown):
Response: &philev.RMQSGRP3 1700 83 14 5000 Active Write
Response: &philev.RMQSGRP2 1700 25 0 5000 Empty Read Access
Response: &philev.RMQSGRP1 1700 25 0 5000 Empty Read Access
Response: &philev.RKMQPDS3 23327 31 0 4000 Empty Read Access
Response: &philev.RKMQPDS2 23327 6598 143 4000 Partial Read Access
Response: &philev.RKMQPDS1 23327 3523 105 4000 Active Write
4. Verify that the files are not being used by another task.
5. Verify that the files are initialized correctly and that the KppPDICT is inserted
into the persistent data store files.
6. Verify that the maintenance procedure is correctly processing the persistent data store files.
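The KPDIFIL status blocks shown in steps 2 and 3 have a regular shape, so the data set status can be extracted per group. The following Python sketch assumes the layout of the GENHIST and PLEXDATA examples above (a "Status of files assigned to group" line followed by one "Status =" line per data set). It does not handle the "Response:" layout of the MQ agent example, and the function name parse_group_status is hypothetical.

```python
import re

# Modeled on the KPDIFIL examples in steps 2 and 3 of this procedure.
GROUP_START = re.compile(
    r"KPDIFIL: Status of files assigned to group (?P<group>\w+):"
)
FILE_STATUS = re.compile(r"(?P<dataset>\S+)\s+Status = (?P<status>.+?)\s*$")


def parse_group_status(lines):
    """Return {group: {data_set_name: status}} for each KPDIFIL status block."""
    groups = {}
    current = None
    for line in lines:
        m = GROUP_START.search(line)
        if m:
            current = m.group("group")
            groups[current] = {}
            continue
        m = FILE_STATUS.search(line)
        if m and current is not None:
            groups[current][m.group("dataset")] = m.group("status")
    return groups


if __name__ == "__main__":
    sample = [
        "2008/07/28 08:45:41 KPDIFIL: Status of files assigned to group GENHIST:",
        "2008/07/28 08:45:41 &philev.RGENHIS3 Status = Active",
        "2008/07/28 08:45:41 &philev.RGENHIS2 Status = Offline",
        "2008/07/28 08:45:41 KPDIFIL: End of group GENHIST status.",
    ]
    print(parse_group_status(sample))
```

Separator lines and the "End of group" line are ignored because they contain no "Status =" field.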
Example
Examples of the error codes in the RKPDLOG:
Error code 25804
Indicates that an attempt was made to read slot 0 of the GENHIST dataset. This is a protected record and the persistent data store will not allow the slot to be read. One possible cause is a problem with DELETE processing. The warehouse code, which is the only code that attempts to use the delete logic, might be generating a bad condition.
Run the RECOVERY command, which saves the data and rebuilds the indexes so that the data is once again usable.
Error code 3205
The last 3 digits represent the error and the beginning digits represent the persistent data store function that was being called. The 205 indicates the error RowExceedsFileFormat.
This error is generated if the row you attempt to insert is larger than what will fit in a block allocated to the persistent data store data set. The actual maximum length is about 100 bytes smaller than the block size. Therefore, if you allocate a block size of 1000 (Window=1) and attempt to write a row greater than 900, you receive this message. The persistent data store cannot span a data row across multiple blocks. One other possibility is that either the API calls to the persistent data store to do the insert are specifying an invalid row length or the lengths of all the columns put together for the insert exceed the buffer length.
Error code 35404
This code has many causes. One possibility is that a PARMA parameter intended for the agent processing is mistakenly set to the monitoring server and interpreted as a column name. This might be due to obsolete SQL saved in the monitoring server database. In most cases you can ignore this error. Set monitoring server traces to (UNIT:kdssqprs input,error).
The UNIT:kdssqprs input, error trace returns large amounts of data. Turn the trace off as soon as you finish troubleshooting.
KFAPERR : error code 14209
Persistent data store Filename is Not Available messages in the RKLVLOG of an agent or monitoring server on z/OS: Error 8 trying to set up table <table-name>, KRAIRA000, Starting UADVISOR_Kpp_table-name, where pp is the two-character component or product code and table-name is the application table name.
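The description of error code 3205 gives a rule for reading these codes: the last three digits are the error and the leading digits are the persistent data store function. The following Python sketch applies that rule and the approximate row-length limit. It assumes that the same split applies to the other numeric codes listed here, which the guide does not state explicitly. The only error name it knows is the one given above, and the function names are hypothetical.

```python
# The only error name documented in this section.
KNOWN_ERRORS = {205: "RowExceedsFileFormat"}


def split_pds_error(code):
    """Split an error code into (function_id, error_id).

    The last three digits are the error; the leading digits identify the
    persistent data store function that was being called.
    """
    code = int(code)
    if code < 0:
        raise ValueError("error code must be non-negative")
    return divmod(code, 1000)


def describe_pds_error(code):
    """Return a short description such as 'function 3, error 205 (...)'."""
    function_id, error_id = split_pds_error(code)
    name = KNOWN_ERRORS.get(error_id, "unknown")
    return f"function {function_id}, error {error_id} ({name})"


def approx_max_row_length(block_size, overhead=100):
    """Approximate longest row for a block size; the overhead is 'about 100'."""
    return block_size - overhead


if __name__ == "__main__":
    print(describe_pds_error(3205))
    print(approx_max_row_length(1000))
```

The 100-byte overhead is the approximate figure from the text, so treat the computed limit as an estimate rather than an exact boundary.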
What to do next
For more information about the persistent data store, see the IBM Tivoli OMEGAMON XE and Tivoli Management Services on z/OS: Common Planning and Configuration Guide.
Historical data does not get collected for some monitoring server attribute groups
Problem description: When you configure ITM Historical Data Collection, the following attribute groups cannot be stored successfully at the agent when the agent runs on the z/OS platform:
v CCC Logs - Agent Operations Log
v CCC Logs - ITM Audit
This is a limitation in the current implementation of ITM history collection for these attribute groups that should be fixed in a future release.
The following error messages will be visible in the z/OS agent RAS1 log (RKLVLOG) when this problem occurs (time stamp not shown):
(0034-D8CDE7B3:kraahbin.cpp,977,"ConnectToPDS") Unable to locate table KRAAUDIT
(0034-D8CDE7B3:kraahbin.cpp,977,"ConnectToPDS") Unable to locate table OPLOG
When historical collection data is required from any z/OS monitoring agent for the CCC Logs - Agent Operations Log or CCC Logs - ITM Audit attribute groups, configure Historical Data Collection for short-term data storage at the TEMS rather than at the agent.
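To confirm that a z/OS agent is affected by this limitation, you can scan its RKLVLOG for the two messages shown above. The following Python sketch assumes that the message text matches the examples, and it reports only the table names KRAAUDIT and OPLOG, because those are the only ones this section documents. The function name affected_tables is hypothetical.

```python
import re

# Modeled on the two RKLVLOG lines quoted in this section.
MISSING_TABLE = re.compile(
    r'"ConnectToPDS"\) Unable to locate table (?P<table>\S+)'
)

# Table names that appear in the documented messages.
DOCUMENTED_TABLES = {"KRAAUDIT", "OPLOG"}


def affected_tables(lines):
    """Return the documented table names found in 'Unable to locate' lines."""
    found = set()
    for line in lines:
        m = MISSING_TABLE.search(line)
        if m and m.group("table") in DOCUMENTED_TABLES:
            found.add(m.group("table"))
    return sorted(found)


if __name__ == "__main__":
    sample = [
        '(0034-D8CDE7B3:kraahbin.cpp,977,"ConnectToPDS") '
        "Unable to locate table KRAAUDIT",
    ]
    print(affected_tables(sample))
```

A non-empty result suggests moving short-term storage for the affected attribute groups to the monitoring server, as described above.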
A situation does not raise when expected
A situation might not be raised when you expect it to be. For example, certain conditions might not raise a situation as expected.
Symptoms of the problem:
v In the portal client or Tivoli Enterprise Console, certain conditions exist that
should have raised a situation but the situation has not been raised.
Diagnosing that a situation does not raise when expected
You can diagnose that a situation does not open an event when expected by verifying that the monitoring agent has started.
About this task
To diagnose that a situation does not raise when expected, perform the following steps:
Procedure
Preliminary diagnostics
1. Verify that the monitoring agent has started.
2. Verify that the situation is associated with a Navigator item in the Tivoli
Enterprise Portal.
3. In the situation event console, confirm that the situation is true and an event
has opened.
4. Verify that maintenance has not been run against situations. One possible
tacmd command that could have been run is tacmd maintAgent. If maintenance has been run, wait for the situation to restart.
5. Click any workspace that should contain the data to verify that data is arriving.
6. To provide more information, run the following ITMSUPER tools:
v Situation Test tool
v Exceptions Analysis tool
v Distributions Analysis tool
What to do next
For more information on actions that relate to these diagnostics, see the following tasks:
Related tasks:
“Resolving situation-specific problems”
Resolving situation-specific problems
To resolve situation-specific problems, check the log files (such as the agent operations log) to verify that the situation was started; check that the agent returned data and that the SITMON received the data; and check that the situation opened an event.
About this task
To resolve situation-specific problems, you perform diagnostic and corrective actions. These actions include checking if the agent is online and if the agent returned data. Then, you check that the SITMON received the data.
Procedure
Diagnostic and corrective actions
1. Verify that the situation was started by checking one of the following log files for text strings based on the specific situation:
v Agent operations log
For example: 1061110125731000KRAIRA000 Starting FireOnWednesday
<7340776,3145895> for KPX.LOCALTIME
v Monitoring server log
For example: 11/13/06 15:07:21 KO41046 Monitoring for enterprise
situation FireOnWednesday started.
2. Did the situation start?
v No: See step 3.
v Yes: See step 5.
3. Is the situation distributed to the agent and is the agent online?
v Look for a text string similar to the following text string in the monitoring
server log:
KO41047 Situation CheckIfSituationCreated distribution Primary:KATE:NT added
v Yes: See step 5.
v No: Use (UNIT:kpxreqds all) to trace the distribution at the monitoring
server for a situation.
(4558D8CC.0033-1114:kpxreqds.cpp,621,"DetermineTargets") Active RAS1 Classes: EVERYT EVERYE EVERYU
(4558D8CC.0034-1114:kpxreqds.cpp,661,"determineTargetsFromNodelist") Active RAS1 Classes: EVERYT EVERYE EVERYU
(4558D8CC.0035-1114:kpxreqds.cpp,661,"determineTargetsFromNodelist") Entry
(4558D8CC.0036-1114:kpxreqds.cpp,669,"determineTargetsFromNodelist") Exit
(4558D8CC.0037-1114:kpxreqds.cpp,821,"determineTargetsFromAccessList") Active RAS1 Classes: EVERYT EVERYE EVERYU
(4558D8CC.0038-1114:kpxreqds.cpp,821,"determineTargetsFromAccessList") Entry
(4558D8CC.0039-1114:kpxreqds.cpp,837,"determineTargetsFromAccessList") Calling KFA_GetAccessListNodes for NT_Paging_File_Critical, 5140
(4558D8cc.003A-1114:kpxreqds.cpp,852,"determineTargetsFromAccessList") Node #0 Primary:KATE:NT
(4558D8CC.003B-1114:kpxreqds.cpp,891,"determineTargetsFromAccessList") Deleting NodeRecEntry: #0, node_entries @0x1B63B90, next @0xNULL, ptr_next @0xNULL
(4558D8CC.003C-1114:kpxreqds.cpp,898,"determineTargetsFromAccessList") Exit
4. Did the agent return data?
v On the monitoring server set this trace level (UNIT:kpxrpcrq ERROR STATE)
to show the number of rows returned by each agent.
(3A933B00.24A827C0-154:kpxrpcrq.cpp,547,"IRA_NCS_Sample")
Rcvd 1 rows sz 448 tbl *.NTLOGINFO req NT_Log_Space_Low <4294706777,761>
node <Primary:NODE1:NT
v If Yes: See step 6.
v If No: Is the situation authored correctly? At the agent, trace (UNIT:kdsfilt
all).
a. Yes: The problem might be related to the monitoring agent. See the
Troubleshooting appendix of the distributed agent's User's Guide or the Troubleshooting Guide of the z/OS monitoring agent.
b. No: See step 5.
5. Look in the log of the monitoring server to which the agent is attached. Search
for the situation name and look for any errors logged.
v Catalog errors (message return codes 202 and 209). Ensure the application
support is installed at the monitoring server.
v Message KO41046 is missing – situation failed to lodge message:
KO41039 Error in request MCS_Sit. Status= 1133. Reason= 1133.
KO41039 Error in request MCS_Sit. Status= 1131. Reason= 1131.
(4558E8EF.0079-11A4:ko4sitma.cpp,782,"IBInterface::lodge") error: Lodge <1131>
(4558E8EF.007A-11A4:ko4ibstr.cpp,659,"IBStream::op_ls_req") IB Err: 1131
(4558E8EF.007B-11A4:ko4sit.cpp,658,"Situation::slice") Sit MCS_Sit: Unable to lodge - giving up.
KO48156 Not able to start monitoring for situation MCS_Sit.
v SITMON/IB Lodge errors
a. Attribute file is incorrect (wrong version) or missing and the RULE could
not be created.
b. A value of 1133 or 1203 leads to a value of 1131.
c. A value of 1145 generally means that the referenced situation either has
been deleted or has not been distributed correctly.
#define ERR_LODGEERROR 1131 // Bad lodge request
#define ERR_NOATTRIBUTE 1133 // No attribute found
#define ERR_DUPLICATEINSERT 1144 // Duplicate record exists
#define ERR_INVALIDSITNAME 1145 // Invalid sitname supplied
#define ERR_RULESYNTAX 1203 // Generic rule syntax error
6. Did SITMON receive the data?
v Monitoring server trace (UNIT:ko4async ERROR STATE FLOW)
(UNIT:ko4tobje ALL) (UNIT:ko4sitma ALL)
v If Yes and SITMON receives the data: Does the situation apply to the
Enterprise? For example:
11/08/06 16:18:49 KO46256 Situation definition CheckIfSituationCreated created by *ENTERPRISE
v This displays *ENTERPRISE in the MSG2 message of the monitoring server
message log when the situation was created. Only Enterprise situations show up in the portal client user interface. A non-Enterprise situation does not show up in the portal client user interface, even if the situation is raised.
v The distinction between Enterprise and non-Enterprise situations is shown in
the following monitoring server log examples:
a. Enterprise situation KO41046 Monitoring for enterprise situation
MS_Offline started.
b. Non-Enterprise situation KO41036 Monitoring for situation Weekday
started.
v If Yes and it is a Non-Enterprise situation: See step 7.
v If No and it is not an Enterprise situation: Reconfigure the situation to
include the Enterprise flag setting.
v If No and SITMON does not receive the data: Use the Monitoring server
trace (UNIT:kdsruc1 ERROR STATE) (UNIT:kfaadloc all) to see where the data is getting filtered out.
This trace generates a large amount of data. Turn the trace off as soon as you finish troubleshooting.
7. Is there a MSG2 message indicating the situation raised?
v Yes: Contact IBM Software Support. See Chapter 2, “Logs and data collection
for troubleshooting,” on page 5 for information on what types of data to collect before contacting Support. Also consult the IBM Support Portal (http://www.ibm.com/support/entry/portal/software).
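The lodge error check in step 5 can be sketched as a small script. This is a minimal illustration, not a supported tool: it assumes the monitoring server log has been read into a string, it relies only on the KO41039 message layout and the error codes listed in this procedure, and the function name find_lodge_errors is hypothetical.

```python
import re

# Lodge error codes as listed in the #define block in step 5.
LODGE_ERRORS = {
    1131: "Bad lodge request",
    1133: "No attribute found",
    1144: "Duplicate record exists",
    1145: "Invalid sitname supplied",
    1203: "Generic rule syntax error",
}

# Matches messages such as:
#   KO41039 Error in request MCS_Sit. Status= 1133. Reason= 1133.
KO41039 = re.compile(
    r"KO41039 Error in request (\S+)\. Status= (\d+)\. Reason= (\d+)\."
)


def find_lodge_errors(log_text):
    """Return (situation, status, description) for each KO41039 message."""
    results = []
    for match in KO41039.finditer(log_text):
        situation = match.group(1)
        status = int(match.group(2))
        description = LODGE_ERRORS.get(status, "not listed in this guide")
        results.append((situation, status, description))
    return results


# Sample text copied from the log excerpt in step 5.
sample = (
    "KO41039 Error in request MCS_Sit. Status= 1133. Reason= 1133.\n"
    "KO41039 Error in request MCS_Sit. Status= 1131. Reason= 1131.\n"
)

for situation, status, description in find_lodge_errors(sample):
    print(f"{situation}: {status} ({description})")
```

Because a value of 1133 or 1203 leads to a value of 1131, the earliest status reported for a situation is usually the one to investigate first.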
A reflex automation script does not run when it should
You can encounter a problem in which a reflex automation script does not run when it should. For example, after a situation is raised, a particular action might not occur.
Diagnosing that a reflex automation script does not run when it should
You can diagnose that a reflex automation script does not run when it should by checking if the situation raised.
Procedure
Preliminary diagnostics
If the situation does not raise, see “Diagnosing that a situation does not raise when expected” on page 25.
What to do next
For more information on actions that relate to these diagnostics, see the problem resolution task.
Resolving format and variable problems
To resolve format and variable problems, you verify that the system command is correct and that it can be executed on a specific platform. You can check the monitoring agent operations log to see if reflex automation occurred.
Procedure
Diagnostic and corrective actions
1. Does the system command run correctly from a command line?
v Yes: Go to the next step.
v No: Verify that the command you typed in is correct.
2. Is the length of the command within the limit for your operating system?
v Yes: Go to the next step.
v No: The command cannot be executed on this platform. You might be able to
write a wrapper script to issue the command.
3. Are the required user type and environment variables correct?
v Yes: Go to the next step.
v No: Include the set command in the shell script or batch script and redirect
the output to a file. Review the file afterwards to show which variables are being used.
4. Collect the monitoring agent operations log, which shows whether reflex
automation occurred. A monitoring server message log also confirms which error occurred.
5. Correct the identified problem.
High CPU usage on a distributed system
You can encounter a problem that CPU usage is high on a distributed system.
Symptoms of the problem:
v Performance degrades or availability is lost because of high processing in an
application or a computer.
v No data is returned in the portal client and the collector log contains the text
string Open Probe pipe error. See “Resolving Oracle DB Agent problems - corrective actions” on page 32.
v Situations alert you frequently about a managed system cycling between online
and offline. See “Resolving firewall problems - corrective actions” on page 31.
Diagnosing high CPU usage on a distributed system
You can diagnose that CPU usage is high on a distributed system by determining whether a monitoring component, application, or process running on the system might be the cause of the problem. Also, you can use the ITMSUPER tools, such as the Connectivity tool to provide more information.
About this task
To diagnose that CPU usage is high on a distributed system, perform the following steps:
Procedure
Preliminary diagnostics
1. Determine whether an IBM Tivoli Monitoring component is the root cause.
Another application or process running on the system might be causing high CPU usage.
2. Use the tools and data provided by Task Manager to identify the process
causing high CPU usage. In the Processes tab you can reorder the processes by CPU usage. An example of a process name is kntcma.exe for the Windows OS agent.
3. Use the top command to display processes using high CPU. For
UNIX, you can also use the ps auxww command.
4. Verify the following:
v Is historical data collection enabled?
v Is the database undergoing a backup?
5. Is the situation writing a lot of event logs?
v Yes: Disable all event log monitoring situations.
6. Select each of the workspaces in turn, to see which one is consuming high
CPU.
7. Running the following ITMSUPER tools might also provide more information:
v Stressed Resources tool
v Connectivity tool
v Situations tool
8. When the computer (where the monitoring agent is running) has multiple
Network Interface Cards (NICs), the agent might not be bound to the Primary NIC. The agent might not be able to establish connectivity with the monitoring server. High CPU usage can result from the agent's frequent attempts to connect.
a. To correct this, you might need to set the environment variable
KDEB_INTERFACELIST = '!*' or KDEB_INTERFACELIST = IP_address, where IP_address is the address of the NIC.
b. Make the changes in the associated agent *ENV configuration file for
Windows, or the *.ini configuration file for UNIX or Linux.
What to do next
For more information on actions that relate to these diagnostics, see the problem resolution tasks.
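The ps auxww check in the preliminary diagnostics can be sketched as follows. This is an illustration only: it assumes the common 11-column ps aux layout (USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND), which can differ between UNIX variants, and the sample output, process names, and the function name top_cpu_processes are invented for the example.

```python
def top_cpu_processes(ps_output, limit=5):
    """Return (cpu_percent, pid, command) tuples, highest CPU first."""
    rows = []
    lines = ps_output.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        # Split into at most 11 fields so the command keeps its arguments.
        fields = line.split(None, 10)
        if len(fields) < 11:
            continue  # not a process row in the assumed layout
        rows.append((float(fields[2]), int(fields[1]), fields[10]))
    rows.sort(key=lambda row: row[0], reverse=True)
    return rows[:limit]


# Invented sample output; the paths are hypothetical.
sample_ps = """
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root      4321 87.5  1.2 204800 51200 ?        Sl   09:15  12:34 /opt/IBM/ITM/bin/example_agent -x
root       101  0.3  0.1  10240  2048 ?        Ss   08:00   0:01 /usr/sbin/sshd
"""

for cpu, pid, command in top_cpu_processes(sample_ps):
    print(f"{cpu:5.1f}% pid={pid} {command}")
```

In practice the text would come from the saved output of ps auxww on the affected system; the sorted result only points at a candidate process and does not show why it is busy.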
Resolving situation problems - diagnostic actions
You can resolve situation problems by running the ITMSUPER tool. Also, you can examine the situation definition and formula in the agent-specific .lg0 file.
Procedure
Diagnostic actions
1. Run the Situation Test ITMSUPER tool.
2. Find out which situations have been deployed to the monitoring agent.
3. Open the agent-specific .lg0 file to view a list of situations started for that
agent.
v Windows: install_dir\TMAITM6\logs
v UNIX or Linux: install_dir/logs
4. Examine the situation definition and formula.
5. Does the situation contain any wildcard * characters against UTF8 columns?
v Yes: See “Resolving situation problems - corrective actions” on page 31.
6. Switch between situations to see which one is causing the high CPU.
Resolving situation problems - corrective actions
You can resolve situation problems by either changing the formulation of situations or rewriting the situation using the SCAN strcscan function. You can also use non-UTF8 columns to rewrite the situation or combine predicates with an OR.
Procedure
Corrective actions
1. Change the formulation of any situations that are causing excessive processing.
2. For situations with wildcard * characters, perform one of the following steps:
v Rewrite the situation using the SCAN strcscan function instead of the
character-by-character pattern-matching function LIKE. For example, situations with this simple LIKE"*/process" pattern can be rewritten as SCAN "/process".
v Rewrite the situation using non-UTF8 columns. For example, *IF *LIKE
NT_System.User_Name_U *EQ ’*group’ can be rewritten as *IF *LIKE NT_System.User_Name *EQ ’*group’ where User_Name is a non-UTF8 column and User_Name_U is the corresponding UTF8 column.
v Rewrite the situation combining predicates with an OR. For example, *IF
*LIKE NT_System.User_Name_U *EQ ’group*’ can be rewritten as *IF(( *VALUE NT_System.User_Name_U *EQ ’groupA’ ) *OR ( *VALUE NT_System.User_Name_U *EQ ’groupB’ ) *OR ( *VALUE NT_System.User_Name_U *EQ ’groupC’ ) ).
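The third rewrite, replacing a wildcard comparison with explicit predicates joined by *OR, can be sketched as a small formula builder. This is a hedged illustration: the function name expand_to_or_formula is hypothetical, the output uses plain single quotes, and the spacing is only an approximation of the formula shown above, so check the generated text in the Situation editor before using it.

```python
def expand_to_or_formula(attribute, values):
    """Build an *IF formula that tests the attribute against each value."""
    if not values:
        raise ValueError("at least one value is required")
    predicates = [f"( *VALUE {attribute} *EQ '{value}' )" for value in values]
    return "*IF ( " + " *OR ".join(predicates) + " )"


formula = expand_to_or_formula(
    "NT_System.User_Name_U", ["groupA", "groupB", "groupC"]
)
print(formula)
```

This approach only helps when the set of values that the wildcard would match is known and small; otherwise the SCAN or non-UTF8 column rewrites are likely to be more practical.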
Resolving firewall problems - diagnostic actions
A problem with firewall interference or a problem with communication between the Tivoli Enterprise Monitoring Server and monitoring agents can be resolved by using the ping command to verify the communication between the server and agents.
Procedure
Diagnostic actions
1. Check connectivity between the monitoring agent and the monitoring server.
2. Use the ping command to verify whether communication exists between the
monitoring server and agents. Ping from the monitoring agent system to the monitoring server, and then from the monitoring server system to the monitoring agent.
v Use the IP address of the host name specified during agent configuration.
v If the communication is broken and you see high CPU, proceed to the
corrective actions.
3. Turn on the RAS1 trace log to verify whether the monitoring agent has made a
connection to the monitoring server. See “Setting traces” on page 43 for more information.
Resolving firewall problems - corrective actions
You can resolve firewall problems by contacting IBM Software Support.
Procedure
1. If you still have high CPU usage issues even after ensuring proper connectivity
across firewalls, open a problem report with IBM Software Support or refer to the IBM Support Portal (http://www.ibm.com/support/entry/portal/software).
2. For more information, see the “Firewalls” topics in the IBM Tivoli Monitoring
Installation and Setup Guide.
Resolving Oracle DB Agent problems - diagnostic actions
Oracle DB Agent problems, such as a cursor performance problem, can be resolved by setting an environment variable to disable problematic cursors.
Procedure
Diagnostic actions
1. Collect the detail traces of collector and RAS1 log. See the problem
determination topics for enabling detailed tracing in the collector trace log and setting RAS trace parameters in the IBM Tivoli Monitoring for Databases: Oracle Agent User's Guide.
2. Identify the SQL query that caused the high CPU usage issues from the
collector logs.
3. You can identify the SQL query that caused the high CPU usage issue from
ITM Oracle Agent logs or the Oracle tools. Use the following procedure to identify the problematic cursors from ITM Oracle Agent logs:
a. Open the collector logs and find CFE1645 messages. The messages show the
return time of each cursor. For example:
CFE1645T (165929) Time = 2008/06/06 16:59:29, collected records in 6 seconds.
b. The default timeout value of ITM Oracle Agent is 45 seconds. If it takes
more than 45 seconds, it might cause a timeout problem and Open Probe
pipe error will be reported in the collector log.
CFE1645T (170246) Time = 2008/06/06 17:02:46, collected records in 203 seconds
c. When a timeout happens, review the previous cursor that executed before
this message. For example:
PDR3000T (170002) Deleting (1) rows for cursor DB6
RPF0300T (170002) Doing prep_l_fet for cursor DB6
ORU0085I (170002) --------------------------------------------------
ORU0090I (170002) Starting new SQL query.
ORU0095I (170002) <SELECT /*+RULE*/ COUNT(*) EXTENTS FROM SYS.DBA_EXTENTS >
ORU0085I (170002) --------------------------------------------------
CAT1610I (170213) Dump of row 1
UPX0100T 000: 20202020 20202020 20202032 34313135 * 24115*
4. The previous cursor (DB6) took about 2 minutes and 11 seconds to return data
causing the performance problem.
5. Were you able to identify an SQL query?
v Yes: Continue to the corrective actions task.
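The search for slow collections in step 3 can be sketched as a script that scans collector log text for CFE1645T messages and reports those that exceed the 45-second default timeout. This is a minimal sketch that assumes the message layout shown in the examples above; the function name slow_collections is hypothetical.

```python
import re

# Default timeout of the ITM Oracle Agent, as stated in step 3b.
DEFAULT_TIMEOUT_SECONDS = 45

# Matches messages such as:
#   CFE1645T (170246) Time = 2008/06/06 17:02:46, collected records in 203 seconds
CFE1645 = re.compile(
    r"CFE1645T \((\d{6})\) Time = (\S+ \S+), collected records in (\d+) seconds"
)


def slow_collections(log_text, timeout=DEFAULT_TIMEOUT_SECONDS):
    """Return (timestamp, seconds) for each collection slower than timeout."""
    slow = []
    for match in CFE1645.finditer(log_text):
        seconds = int(match.group(3))
        if seconds > timeout:
            slow.append((match.group(2), seconds))
    return slow


# Sample text copied from the examples in step 3.
sample_log = (
    "CFE1645T (165929) Time = 2008/06/06 16:59:29, collected records in 6 seconds.\n"
    "CFE1645T (170246) Time = 2008/06/06 17:02:46, collected records in 203 seconds\n"
)

for timestamp, seconds in slow_collections(sample_log):
    print(f"{timestamp}: {seconds} seconds")
```

Each reported time stamp marks where to look backwards in the collector log for the cursor that ran just before the message, as described in step 3c.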
Resolving Oracle DB Agent problems - corrective actions
Oracle DB Agent problems, such as problematic cursors, can be resolved by setting environment variables and overriding variable settings.
Procedure
Corrective actions
1. Disable the problematic cursors by setting an environment variable:
v Windows: COLL_DISABLE_CURSORS
a. Launch Manage Tivoli Enterprise Monitoring Services.
b. Right-click the row that contains the name of the monitoring agent whose
environment variables you want to set.
c. From the pop-up menu, select Advanced > Edit Variables.
d. If the agent is running, accept the prompt to stop the agent.
e. The list dialog box is displayed. When only the default settings are in
effect, variables are not displayed. The variables are listed after you override them. Override the variable settings:
1) Click Add.
2) From the Variable menu, select COLL_DISABLE_CURSORS. If the variable
is not there, you can add it.
3) In the Value field, type a value and click OK twice.
4) Restart the agent.
v UNIX or Linux: db_extparms
a. Use a text editor to enter a new value for the db_extparms in the
hostname_or_instance_name.cfg file in the install_dir/config directory.
b. The cursors that are listed below take longer to return data and consume
excessive system resources in some customer environments: DB3, DB6, KF1, KF4, STATLTRN, TS1, TS3, TS5, and TS6.
c. Each comma-delimited value, with no white space, represents a change to the
SQL cursor that is executed during data gathering operations within the agent. The values are the SQL cursor names. For example, setting the Extended Parameters field to DB3,TS1 means that the DB3 and TS1 SQL cursors are enabled for Set FREEBYTES to zero, Set TSNEXTS to zero, and
Set MAXEXTTS to zero. The SQL cursor name is not case sensitive.
2. Recycle the Monitoring Agent for Oracle to recognize these changes to the
Extended Parameters value.
3. Using the name of the SQL cursor, you can look in the korcoll.ctl file for the
SQL modification that is done when the SQL cursor is enabled. The korcoll.ctl file is located in the following locations:
v Windows: %CANDLE_HOME%\TMAITM6
v UNIX or Linux: $CANDLEHOME/misc
When these cursors are enabled, the Monitoring Agent for Oracle displays default attribute values of these cursors in the Tivoli Enterprise Portal, meaning, the Monitoring Agent for Oracle no longer monitors the attributes of the enabled cursors.
4. An example of an SQL cursor is displayed below:
SQL cursor: DB3 - ARCHIVE LOG DISPLAY
SQL:
SELECT TABLESPACE_NAME UTSNAME,
SUM(BYTES) FREEBYTES FROM SYS.DBA_FREE_SPACE GROUP BY TABLESPACE_NAME;
Enabled: Set FREEBYTES to zero
Navigation Tree : Databases->Database Summary
Workspace: Oracle_Database/Database Summary->Database Summary(Bar Chart View)
Oracle_Database/Database Summary->Database Summary(Table View)
Column : DB Percent Free Space = 0
System TS Percent Free = 0
Navigation Tree : Databases->Enterprise Database Summary
Workspace: Oracle_Statistics_Enterprise/Databases Global->Database Summary(Bar Chart View)
Oracle_Statistics_Enterprise/Databases Global->Database Summary(Table View)
Column : System TS Percent Free = 0
Situation: Oracle_DB_PctFree_Space_Low = always true
Oracle_SystemTS_PctFree_Critica = always true
Oracle_SystemTS_PctFree_Warning = always false
5. For more information on the cursors, see the
Oracle Agent 6.2.0-TIV-ITM_ORA-LA0001 README or a higher version of the README.
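The format rules for the cursor list (comma-delimited, no white space, names not case sensitive) can be sketched as a small helper that normalizes a list of cursor names before it is typed into the variable or the .cfg file. This is an illustration under those stated rules only; the function name format_extended_parameters is hypothetical, and converting to upper case is a cosmetic choice, not a requirement.

```python
def format_extended_parameters(cursor_names):
    """Join cursor names with commas, removing white space and duplicates."""
    cleaned = []
    for name in cursor_names:
        name = name.strip().upper()  # names are not case sensitive
        if name and name not in cleaned:
            cleaned.append(name)
    return ",".join(cleaned)


value = format_extended_parameters([" db3", "TS1", "ts1 ", ""])
print(value)
```

The helper does not check that a name is a real cursor; compare the result against the cursor names in the korcoll.ctl file before restarting the agent.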
Chapter 4. Tools
IBM Tivoli Monitoring provides several tools; some include functionality for diagnosing problems. The primary diagnostic tool is logging. Logging refers to the text messages and trace data generated by the software. Messages and trace data are sent to an output destination, such as a console screen or a file.
Trace logging
Trace logs capture information about the operating environment when component software fails to operate as intended. IBM Software Support uses the information captured by trace logs to trace a problem to its source or to determine why an error occurred.
The principal log type is the reliability, availability, and serviceability (RAS) trace log. RAS logs are in the English language only. The RAS trace log mechanism is available on the Tivoli Enterprise Monitoring Server, the Tivoli Enterprise Portal Server, and the monitoring agent. By default, the logs are stored in the installation path for IBM Tivoli Monitoring.
The default configuration for tracing, such as whether tracing is enabled or disabled and trace level, depends on the source of the tracing. You can choose how many files to keep when the log rolls. If you cannot find the log files you need, restart the system and try again.
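When reading RAS1 trace entries such as those shown in Chapter 3, it can help to split the header of each entry into its parts. The following sketch assumes that the first eight hexadecimal digits are the entry time in seconds since 1 January 1970 (UTC), which is consistent with the November 2006 log examples earlier in this guide but is an assumption of this example. The field names sequence and thread, and the function name parse_ras1_line, are hypothetical labels.

```python
import re
from datetime import datetime, timezone

# Header layout assumed from entries such as:
#   (4558D8CC.0033-1114:kpxreqds.cpp,621,"DetermineTargets") Active RAS1 ...
RAS1_HEADER = re.compile(
    r'\((?P<time>[0-9A-Fa-f]{8})\.(?P<sequence>[0-9A-Fa-f]+)'
    r'-(?P<thread>[0-9A-Fa-f]+):(?P<source>[^,]+),(?P<line>\d+),'
    r'"(?P<function>[^"]*)"\)\s*(?P<text>.*)'
)


def parse_ras1_line(line):
    """Split one RAS1 entry into fields, or return None if it does not match."""
    match = RAS1_HEADER.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    # Assumption: the time field is hexadecimal seconds since the epoch.
    fields["time"] = datetime.fromtimestamp(
        int(fields["time"], 16), tz=timezone.utc
    )
    fields["line"] = int(fields["line"])
    return fields


entry = parse_ras1_line(
    '(4558D8CC.0033-1114:kpxreqds.cpp,621,"DetermineTargets") '
    "Active RAS1 Classes: EVERYT EVERYE EVERYU"
)
print(entry["time"].isoformat(), entry["source"], entry["function"])
```

Entries written on z/OS use a different header layout (for example, the RKLVLOG entries shown earlier) and are not matched by this pattern.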
Log file locations
Log files are saved in log and component directories in your IBM Tivoli Monitoring installation. The log files have a certain format that includes a time stamp.
The log files are maintained with the following naming:
v On Windows systems, the log file name includes a time stamp in hexadecimal format. By
default, the logs are stored in the installation path for IBM Tivoli Monitoring. The following is an example of a log file name that includes the time stamp in hexadecimal format:
ibm-kpmn803v01_cq_472649ef-01.log
v On UNIX-based systems, the log file name includes a time stamp. The UNIX-based systems
RAS1 log files are stored in the /logs directory. The following is an example of a log file name that includes the time stamp:
f50pa2b_ux_1112097194.log
v In the Firefox browser:
candle_home/.java/deployment/log
Note: When you communicate with IBM Software Support, you must capture and send the RAS1 log that matches any problem occurrence that you report.
The following table lists the location of the log files directories for the components in your Tivoli Monitoring environment.
Table 1. Location of log files for the IBM Tivoli Monitoring components.
Component Windows UNIX-based systems
Tivoli Enterprise Portal Server
Tivoli Enterprise Portal browser client
Tivoli Enterprise Portal desktop client
install_dir\logs install_dir/logs/
C:\Documents and Settings\Administrator\Application Data\Java\Deployment\log\plugin142.trace
install_dir\CNP\kcjerror.log install_dir\CNP\kcjras1.log
When launched via Java Web Start:
%USERPROFILE%\Application Data\IBM\Java\Deployment\log\javawsnnnnn.trace
where 'nnnnn' is a unique, randomly generated numeric suffix to support generational logs (that is, the last generated log is not overlaid by the most recent execution of Tivoli Enterprise Portal using Java Web Start). This is in contrast to the Tivoli Enterprise Portal browser client, whose log has a fixed name and is overlaid with each execution cycle.
hostname_PC_timestamp.log
where:
install_dir
Specifies the directory where the Tivoli Enterprise Portal Server was installed.
hostname
Specifies the name of the system hosting the product.
PC Specifies the product code. cq for the Tivoli
Enterprise Portal Server.
timestamp
A decimal representation of the time at which the process was started.
None.
install_dir/logs/ hostname_PC_timestamp.log
where:
install_dir
Specifies the directory where the Tivoli Enterprise Portal Server was installed.
hostname
Specifies the name of the system hosting the product.
PC Specifies the product code. cq for the Tivoli
Enterprise Portal Server.
timestamp
A decimal representation of the time at which the process was started.
When launched via Java Web Start:
${user.home}/.java/deployment/log/javawsnnnnn.trace
where 'nnnnn' is a unique, randomly generated numeric suffix to support generational logs (that is, the last generated log is not overlaid by the most recent execution of Tivoli Enterprise Portal using Java Web Start). This is in contrast to the Tivoli Enterprise Portal browser client, whose log has a fixed name and is overlaid with each execution cycle.
Tivoli Enterprise Monitoring Server
Windows:
install_dir\logs\hostname_PC_HEXtimestamp-nn.log
where:
install_dir
Specifies the directory where the Tivoli Enterprise Monitoring Server was installed.
PC Specifies the product code.
ms for Tivoli Enterprise Monitoring Server
HEXtimestamp
A hexadecimal representation of the time at which the process was started.
nn Represents the circular
sequence in which logs are rotated. Ranges from 1-5, by default, though the first is always retained, since it includes configuration parameters.
UNIX-based systems:
install_dir/logs/hostname_PC_timestamp.log
where:
install_dir
Specifies the directory where the Tivoli Enterprise Portal Server was installed.
hostname
Specifies the name of the system hosting the product.
PC Specifies the product code. cq for the Tivoli
Enterprise Portal Server.
timestamp
A decimal representation of the time at which the process was started.
Monitoring agents
Windows:
install_dir\tmaitm6\logs\hostname_PC_HEXtimestamp-nn.log
where:
install_dir
Specifies the directory where the monitoring agent was installed.
PC Specifies the product
codes, for example, um for Universal Agent or nt for Windows.
HEXtimestamp
A hexadecimal representation of the time at which the process was started.
nn Represents the circular
sequence in which logs are rotated. Ranges from 1-5, by default, though the first is always retained, since it includes configuration parameters.
UNIX-based systems:
install_dir/logs/hostname_PC_timestamp.log
where:
install_dir
Specifies the directory where the Tivoli Enterprise Portal Server was installed.
hostname
Specifies the name of the system hosting the product.
PC Specifies the product code. cq for the Tivoli
Enterprise Portal Server.
timestamp
A decimal representation of the time at which the process was started.
IBM Tivoli Warehouse Proxy agent
IBM Tivoli Summarization and Pruning agent
IBM Tivoli Enterprise Console Event Forwarder
install_dir\logs\hostname_PC_timestamp.log
where
PC Specifies the product code.
hd is the product code for the IBM Tivoli Warehouse Proxy agent
The Summarization and Pruning Agent uses C-based RAS1 tracing, Java-based RAS1 tracing and Java-based internal tracing. By default, Summarization and Pruning Agent trace data is written to a file in the logs subdirectory.
install_dir\logs\hostname_PC_HEXtimestamp-nn.log
install_dir\logs\hostname_PC_ras1java_HEXtimestamp-nn.log
install_dir\logs\hostname_PC_java_HEXtimestamp-nn.log
where:
install_dir
Specifies the directory where the monitoring agent was installed.
PC Specifies the product codes, for example, sy for IBM Tivoli Summarization and Pruning agent.
HEXtimestamp
A hexadecimal representation of the time at which the process was started.
nn Represents the circular sequence in which logs are rotated. Ranges from 1-5, by default, though the first is always retained, since it includes configuration parameters.
install_dir\logs\hostname_PC_HEXtimestamp-nn.log
where:
install_dir
Specifies the directory where the Tivoli Enterprise Monitoring Server was installed.
PC Specifies the product code.
ms for Tivoli Enterprise Monitoring Server
HEXtimestamp
A hexadecimal representation of the time at which the process was started.
nn Represents the circular
sequence in which logs are rotated. Ranges from 1-5, by default, though the first is always retained, since it includes configuration parameters.
Not supported.
install_dir/logs/hostname_PC_HEXtimestamp-nn.log
install_dir/logs/hostname_PC_ras1java_HEXtimestamp-nn.log
install_dir/logs/hostname_PC_java_HEXtimestamp-nn.log
install_dir/logs/hostname_PC_timestamp.log
install_dir/logs/hostname_PC_HEXtimestamp-nn.log
where:
install_dir
Specifies the directory where the Tivoli Enterprise Portal Server was installed.
hostname
Specifies the name of the system hosting the product.
PC Specifies the product code. cq for the Tivoli
Enterprise Portal Server.
timestamp
A decimal representation of the time at which the process was started.
IBM Tivoli Enterprise Console Situation Update Forwarder
c:\tmp\itmsynch\logs\synch_trace.log
c:\tmp\itmsynch\logs\synch_msg.log
Note: IBM Tivoli Enterprise Console Situation Update Forwarder logs are created on the IBM Tivoli Enterprise Console server.
/tmp/itmsynch/logs/synch_trace.log
/tmp/itmsynch/logs/synch_msg.log
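The naming patterns in Table 1 (hostname_PC_HEXtimestamp-nn.log and hostname_PC_timestamp.log) can be split into their parts with a short script. This sketch makes two assumptions that the table does not state outright: a name with an -nn suffix carries the hexadecimal time stamp and a name without it carries the decimal one, and both time stamps count seconds since 1 January 1970 (UTC). The function name parse_log_name is hypothetical.

```python
import re
from datetime import datetime, timezone

LOG_NAME = re.compile(
    r"(?P<hostname>.+)_(?P<pc>[A-Za-z0-9]{2})_"
    r"(?P<stamp>[0-9A-Fa-f]+)(?:-(?P<seq>\d+))?\.log"
)


def parse_log_name(file_name):
    """Return the parts of a RAS1 log file name, or None if it does not fit."""
    match = LOG_NAME.fullmatch(file_name)
    if match is None:
        return None
    # Assumption: an -nn suffix marks the hexadecimal form of the time stamp.
    base = 16 if match.group("seq") else 10
    try:
        seconds = int(match.group("stamp"), base)
    except ValueError:
        return None  # hexadecimal digits in a name without the -nn suffix
    return {
        "hostname": match.group("hostname"),
        "product_code": match.group("pc"),
        "started": datetime.fromtimestamp(seconds, tz=timezone.utc),
        "sequence": int(match.group("seq")) if match.group("seq") else None,
    }


# File names taken from the examples in "Log file locations".
windows_log = parse_log_name("ibm-kpmn803v01_cq_472649ef-01.log")
unix_log = parse_log_name("f50pa2b_ux_1112097194.log")
print(windows_log["product_code"], windows_log["started"].isoformat())
print(unix_log["product_code"], unix_log["started"].isoformat())
```

Names that carry an extra qualifier, such as the ras1java and java logs of the Summarization and Pruning agent, do not fit this pattern and return None.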
Installation log files
Use the log files that are created during installation to help diagnose any errors or operational issues.
The following table lists and describes the log files created during installations.
Table 2. Installation log files
Windows:
v ITM_HOME\InstallITM\Abort<Product_name><date_timestamp>.log
This log is created if an abort occurs for either a first time installation or a modification of previous installation of IBM Tivoli Monitoring.
v ITM_HOME\InstallITM\<Product_name>_<timestamp>.log
This log is created during a normal clean installation.
v ITM_HOME\InstallITM\MOD_<Product_name>timestamp.log
This log is created if you modify an existing product specified with the PC, or when adding or deleting components.
where:
Product_name
Specifies the product name. IBM Tivoli Monitoring 20050923 1815.log is the log file name for the IBM Tivoli Monitoring installation CD.
timestamp
A decimal representation of the time at which the process was started.
You can find a log for uninstallation on Windows in the root directory where the product was installed:
Uninstall<PC><date_timestamp>.log
UNIX-based systems:
$CANDLEHOME/logs/candle_installation.log
Windows installer and configuration logs
Obtain details about the installation (or upgrade) process in the logging and tracing information. You can set the trace levels.
You can set the degree of logging and tracing to one of three levels:
v DEBUG_MIN
v DEBUG_MID
v DEBUG_MAX
By default, logging and tracing is set to DEBUG_MIN. Higher levels give you more detailed information about the installation process. This can be useful for investigating any problems or errors that occur.
Level name What is logged or traced
DEBUG_MIN Most important method entries, exits and trace messages are traced
DEBUG_MID Most of the method entries, exits and trace messages are traced
DEBUG_MAX All of the method entries, exits and trace messages are traced
You can set the level of logging and tracing by using the /z flag when you execute the setup.exe file in the CLI.
v For GUI installation use one of the following commands:
– setup.exe /zDEBUG_MAX
– setup.exe /zDEBUG_MID
– setup.exe /zDEBUG_MIN
v For silent installation use one of the following commands:
– start /wait setup /z"DEBUG_MAX/sfC:\temp\SILENT_SERVER.txt" /s /f2"C:\temp\silent_setup.log"
– start /wait setup /z"DEBUG_MID/sfC:\temp\SILENT_SERVER.txt" /s /f2"C:\temp\silent_setup.log"
– start /wait setup /z"DEBUG_MIN/sfC:\temp\SILENT_SERVER.txt" /s /f2"C:\temp\silent_setup.log"
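The command lines above differ only in the trace level, so they can be generated rather than typed. The following is a minimal sketch in Python; the function names are hypothetical, and only the /z, /sf, /s, and /f2 flags and the three level names come from the text above.

```python
# Sketch: compose the setup.exe command lines listed above.
# Function names are illustrative; the flags and levels are taken from the text.
LEVELS = ("DEBUG_MIN", "DEBUG_MID", "DEBUG_MAX")


def gui_install_command(level="DEBUG_MIN"):
    """Return the GUI installation command for the given trace level."""
    if level not in LEVELS:
        raise ValueError(f"unknown trace level: {level}")
    return f"setup.exe /z{level}"


def silent_install_command(level, response_file, log_file):
    """Return the silent installation command for the given trace level."""
    if level not in LEVELS:
        raise ValueError(f"unknown trace level: {level}")
    return f'start /wait setup /z"{level}/sf{response_file}" /s /f2"{log_file}"'


if __name__ == "__main__":
    print(gui_install_command("DEBUG_MAX"))
    print(silent_install_command("DEBUG_MID",
                                 r"C:\temp\SILENT_SERVER.txt",
                                 r"C:\temp\silent_setup.log"))
```

Rejecting unknown level names before the installer starts avoids a run that silently falls back to another trace level.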
UNIX installer and configuration logs
Obtain details about the installation (or upgrade) process in the logging and tracing information. You can set the trace levels.
This mechanism traces and logs the Java code that runs on UNIX systems so that you can debug problems. Two sets of information are created: logs and traces. Logs (*.log) are globalized and traces (*.trc) are in English. They contain entry and exit parameters of methods and stack traces for exceptions. The amount of information traced depends on the level of tracing set.
Level name    What is logged or traced
LOG_ERR       Only exceptions and errors are logged and traced
LOG_INFO      Also log messages are logged and traced (DEFAULT)
DEBUG_MIN     Also most important method entries, exits and trace messages are traced
DEBUG_MID     Most of the method entries, exits and trace messages are traced
DEBUG_MAX     All of the method entries, exits and trace messages are traced
The level can be set in configuration files or by exporting an environment variable called TRACE_LEVEL with one of the values mentioned above. Configuration of RAS settings is stored in the following files:
v CH/config/ITMInstallRAS.properties (for installation)
v CH/config/ITMConfigRAS.properties (for configuration)
Callpoints are the only component that is handled differently: their logs and traces always go to the directory CH/InstallITM/plugin/executionEvents. The default location for installation is CH/logs/itm_install.log(.trc) and for configuration it is CH/logs/itm_config.log(.trc).
To gather all the needed logs and environment information in case of an error, use the pdcollect tool. See “pdcollect tool” on page 59.
Component                    Location    File name
Install logs/traces          CH/logs     candle_installation.log, itm_install.log (.trc)
Config logs/traces           CH/logs     itm_config.log (.trc)
Logs for component startup   CH/logs     pc.env (lists env variables passed to the agent), hostname_pc_ID.log
Callpoint logs/traces        CH/InstallITM/plugin/executionEvents/logs/timestamp/install(config)/plugin_type/pc    callpoint.trc (.log), *.stderr, *.stdout
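The TRACE_LEVEL environment variable described earlier in this section can also be set for a single run from a wrapper script. The following Python sketch only prepares the environment; the function name is hypothetical, and the command that you start with this environment is left to you.

```python
import os

# Level names from the UNIX installer table above.
UNIX_LEVELS = ("LOG_ERR", "LOG_INFO", "DEBUG_MIN", "DEBUG_MID", "DEBUG_MAX")


def installer_env(level, base_env=None):
    """Return a copy of the environment with TRACE_LEVEL set to a valid level."""
    if level not in UNIX_LEVELS:
        raise ValueError(f"unknown trace level: {level}")
    env = dict(os.environ if base_env is None else base_env)
    env["TRACE_LEVEL"] = level
    return env


if __name__ == "__main__":
    env = installer_env("DEBUG_MAX")
    print("TRACE_LEVEL =", env["TRACE_LEVEL"])
```

The returned dictionary can be passed as the env argument of subprocess.run, which leaves the environment of the calling shell unchanged.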
Tivoli Distributed Monitoring upgrade log file
All upgrade actions performed by the IBM Tivoli Monitoring Upgrade Toolkit are recorded in a central log with an associated user ID and a time stamp.
Upgrade actions taken outside of the Upgrade Toolkit are not recorded in the log.
Table 3. Upgrading from Tivoli Distributed Monitoring log file
Windows:
$DBDIR/AMX/logs/log_tool_timestamp.log
where:
$DBDIR
The Tivoli Management Environment Framework environment variable that specifies the directory where the Object Repository (odb.bdb) is located.
tool
Specifies the IBM Tivoli Monitoring Upgrade Toolkit tool: witmscantmr, witmassess, or witmupgrade.
timestamp
Specifies a time stamp that includes the date and time of execution.
For example: log_witmscantmr_20050721_15_30_15.log
UNIX-based systems:
$DBDIR/AMX/logs/log_tool_timestamp.log
The log file name displays when the Upgrade Toolkit tool completes the upgrade operation. Each time an Upgrade Toolkit tool runs, it generates a new log file that is never reused by any tool. The contents of the log file conform to the Tivoli Message Standard XML logging format. The following example is an excerpt from an Upgrade Toolkit tool log file:
<Message Id="AMXUT2504I" Severity="INFO">
<Time Millis="1121977824199"> 2005.07.21 15:30:24.199 CST </Time>
<Server Format="IP">YFELDMA1.austin.ibm.com</Server>
<ProductId>AMXAMX</ProductId>
<Component>ScanTMR</Component>
<ProductInstance>1</ProductInstance>
<LogText><![CDATA[AMXUT2504I The software is creating a new baseline file C:\PROGRA~1\Tivoli\db\YFELDMA1.db\AMX\shared\analyze\scans\1889259234.xml.]]></LogText>
<TranslationInfo Type="JAVA" Catalog="com.ibm.opmt.utils.messages.MigrationManager_msgs" MsgKey="AMXUT2504I"><Param><![CDATA[C:\PROGRA~1\Tivoli\db\YFELDMA1.db\AMX\shared\analyze\scans\1889259234.xml]]></Param></TranslationInfo>
<Principal></Principal>
</Message>
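Because the log records are XML, they can be summarized with a standard XML parser. The following Python sketch parses one record; the sample is a shortened, well-formed record modeled on the excerpt above, the function name is hypothetical, and treating the Millis attribute as milliseconds since 1 January 1970 UTC is an assumption (it is consistent with the time printed in the excerpt).

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone

# Shortened record modeled on the excerpt above; not a complete log file.
SAMPLE = """<Message Id="AMXUT2504I" Severity="INFO">
<Time Millis="1121977824199"> 2005.07.21 15:30:24.199 CST </Time>
<Component>ScanTMR</Component>
<LogText><![CDATA[AMXUT2504I The software is creating a new baseline file.]]></LogText>
</Message>"""


def summarize_message(message_xml):
    """Return the main fields of one <Message> record as a dictionary."""
    msg = ET.fromstring(message_xml)
    millis = int(msg.find("Time").get("Millis"))
    # Assumption: Millis counts milliseconds since the Unix epoch (UTC).
    when = (datetime.fromtimestamp(millis // 1000, tz=timezone.utc)
            + timedelta(milliseconds=millis % 1000))
    return {
        "id": msg.get("Id"),
        "severity": msg.get("Severity"),
        "component": msg.findtext("Component"),
        "utc_time": when,
        "text": msg.findtext("LogText").strip(),
    }


if __name__ == "__main__":
    for key, value in summarize_message(SAMPLE).items():
        print(f"{key}: {value}")
```

A real log file contains many such records, so a script would first have to split the file into individual Message elements before calling the function.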
Reading RAS1 logs
The RAS1 trace log lists details related to the health of an ODBC data provider application.
This topic provides an example of the universal agent RAS1 trace logs. By default, the universal agent RAS1 trace log lists the following details about the ODBC data provider application:
v Whether the ODBC tables come online during startup.
v Whether the ODBC table data is collected.
v Errors with the ODBC-related status messages, including informational messages about when each ODBC connection completes.
v Errors that occur during ODBC data provider data retrieval, including errors in the ODBC driver code.
v Independent Software Vendor (ISV) API errors. (The universal agent makes API calls to the ISV ODBC driver to implement the connections and SQL select statements.)
The following RAS1 log excerpt lists ODBC status messages using default tracing:
KUMP_ProcessStartUpConfig") Loading metafile <f:\candle\cma\metafiles\TIVOLI_DATA_WAREHOUSE.mdl> from startup config file f:\candle\CMA\WORK\KUMPCNFG_INST1
"DCHserver::dp_register") Application TIVOLI_DATA_WAREHOUSE successfully registered
"KUMP_ProcessStartUpConfig") 1 application metafile(s) processed from startup config file f:\candle\CMA\WORK\KUMPCNFG_INST1
"KUMP_StartDataProvider") Starting ODBC Data Provider...
"KUMP_WaitODBCsourceReadyForMonitor") Reusing connection handle for ODBC source TIVOLI_DATA_WAREHOUSE table <syscharsets>
"KUMP_ODBCserver") Successfully connected to ODBC source TIVOLI_DATA_WAREHOUSE table <syscharsets>
"KUMP_WaitODBCsourceReadyForMonitor") Reusing connection handle for ODBC source TIVOLI_DATA_WAREHOUSE table <syscomments>
The Reusing connection handle messages indicate that the ODBC data provider is reusing resources to conserve memory. The ODBC data provider allocates a connection for each metafile with multiple attribute groups that connect to the same data source using the same user ID and password combination. Each SQL Select statement that is run for the various attribute groups shares the same connection handle.
The following is an excerpt from later in the same log:
userDataList::calculateChecksum") Initial creation of catalog/attribute tables for applName <Tivoli_Data_Warehouse>
"KUMP_ODBCserver") ODBC source <Tivoli_Data_Warehouse> table <syscharsets> is now online to the data provider
"KUMP_ODBCserver") ODBC source <Tivoli_Data_Warehouse> table <syscacheobjects> is now online to the data provider
"KUMP_ODBCserver") ODBC source <Tivoli_Data_Warehouse> table <syscomments> is now online to the data provider
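When a RAS1 log is long, a small script can list which ODBC tables came online. The following Python sketch searches for the "is now online to the data provider" message shown in the excerpt; the regular expression is derived only from that excerpt, so other message variants are not covered, and the function name is hypothetical.

```python
import re

# Pattern derived from the "is now online" messages in the excerpt above.
ONLINE = re.compile(
    r"ODBC source <(?P<source>[^>]+)> table <(?P<table>[^>]+)> is now online"
)


def online_tables(text):
    """Return (source, table) pairs for every 'is now online' message in text."""
    return [(m.group("source"), m.group("table")) for m in ONLINE.finditer(text)]


SAMPLE_LOG = (
    '"KUMP_ODBCserver") ODBC source <Tivoli_Data_Warehouse> table <syscharsets> '
    "is now online to the data provider\n"
    '"KUMP_ODBCserver") ODBC source <Tivoli_Data_Warehouse> table <syscomments> '
    "is now online to the data provider\n"
)

if __name__ == "__main__":
    for source, table in online_tables(SAMPLE_LOG):
        print(f"{source}: {table}")
```

Comparing the result with the attribute groups defined in the metafile shows which tables never came online.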
Setting traces
When you encounter an error with IBM Tivoli Monitoring that requires contacting IBM Software Support, you might be asked to submit a copy of the error log. The error log is part of the trace diagnostic tool in Tivoli Monitoring.
The tool is set to log errors, and you can set other parameters for collecting specific details. Always back up the files before altering them.
RAS1 syntax
Follow the RAS1 syntax for setting traces in your environment file.
KBB_RAS1= global_class (COMP: component_type) (ENTRY: entry_point) (UNIT: unit_name, class)
where:
global_class
Indicates the level of tracing that you want. This is a global setting that applies to all RAS1 filters in the process. If you set this global class by itself, it is global in scope and the trace cannot filter on any of the other keywords. Separate combined classes with a space. The following values are possible. Valid abbreviations are in parentheses.
ERROR (ER): returns severe error messages only (this is the default for most applications).
STATE (ST): records the condition or current setting of flags and variables in the process. If state tracing is enabled, you can see the current state of particular variables or flags as the process is running.
FLOW (FL): causes a message to be generated at an entry or exit point of a function.
DETAIL (DE): produces a detailed level of tracing.
INPUT (IN): records data created by a particular API, function, or process.
ALL: causes all available messages to be recorded. This setting combines all the other forms of tracing.
COMP
Indicates that the trace includes a component type. The COMP keyword is used to trace groups of routines related by function (or component). Use this keyword only at the explicit request of an IBM Software Support representative.
component_type
Identifies a component type. An IBM Software Support representative can tell you what value to specify.
ENTRY
Narrows a filtering routine to specify a specific ENTRY POINT. Since multiple entry points for a single routine are rare, use this keyword only at the explicit request of an IBM Software Support representative.
entry_point
Represents the name of the entry point. An IBM Software Support representative can tell you what value to specify.
UNIT
Indicates that the trace is to look for a match between the compilation unit dispatched and the fully or partially qualified compilation unit specified on the RAS1 statement. A match results in a trace entry.
unit_name
Represents the name of the compilation unit. In most instances, this name defines the component that is being traced. The value is likely to be the three-character component identifier for the monitoring agent (KHL for OMEGAMON® z/OS Management Console).
class
One of the same values specified for global_class but, because of its position inside the parentheses, narrowed in scope to apply only to the unit_name specified.
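A KBB_RAS1 value that uses only a global class and UNIT filters can be assembled mechanically, which avoids spacing mistakes between filters. The following Python sketch is an illustration only: the function name is hypothetical, it covers the UNIT keyword but not COMP or ENTRY, and it writes each filter in the (UNIT:name class) form used in the examples elsewhere in this chapter.

```python
# Global classes and abbreviations listed in the syntax description above.
CLASSES = {"ERROR", "ER", "STATE", "ST", "FLOW", "FL",
           "DETAIL", "DE", "INPUT", "IN", "ALL"}


def kbb_ras1(global_class="ERROR", units=()):
    """Build a KBB_RAS1 line from a global class and (unit_name, class) pairs.

    global_class may combine several classes separated by spaces.
    """
    for name in global_class.split():
        if name.upper() not in CLASSES:
            raise ValueError(f"unknown trace class: {name}")
    parts = [global_class]
    for unit_name, trace_class in units:
        if trace_class.upper() not in CLASSES:
            raise ValueError(f"unknown trace class: {trace_class}")
        parts.append(f"(UNIT:{unit_name} {trace_class})")
    # A single space separates the global class and each filter.
    return "KBB_RAS1=" + " ".join(parts)


if __name__ == "__main__":
    print(kbb_ras1("ERROR", [("KDY", "ALL"), ("KFAPRPST", "ALL")]))
```

Joining the parts with a single space guarantees the separator that each UNIT trace setting requires.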
Setting the trace option for the portal client trace
A log file is created automatically the first time you start the Tivoli Enterprise Portal, and is named install_dir\cnp\logs\kcjras1.log. This log file contains all of the RAS1 tracing for the portal client. Whenever you start a new work session, the log file is purged and rewritten for the current work session. If you want to preserve the log file from the last work session, you must rename it or copy it to another directory before starting the portal client again. The kcj.log file contains errors generated by the Java libraries used in the portal client.
Procedure
1. Always back up the files before altering them.
2. From the Tivoli Enterprise Portal menu, select File > Trace Options.
3. Select a trace class from the list or as instructed by IBM Software Support (such as UNIT:Workspace ALL):
v ALL provides data for all classes. Use the setting temporarily, because it generates large amounts of data.
v ERROR logs internal error conditions. This setting provides the minimum level of tracing, with little resource overhead, and ensures that program failures will be caught and detailed.
v NONE turns off the error log so no data is collected.
4. Click OK to close the window and turn on logging.
Setting the trace option for the portal server trace
Set the trace options for the Tivoli Enterprise Portal Server through Manage Tivoli Enterprise Monitoring Services.
Before you set the trace options for the portal server, determine the trace string. The trace string specifies the trace setting. Set trace options for the portal server when you start it. The log file continues to grow until you either turn off the trace or recycle the portal server. Always back up the files before altering them.
Procedure
v On the computer where the portal server is installed, click Start > Programs > IBM Tivoli Monitoring > Manage Tivoli Enterprise Monitoring Services.
1. Right-click the Tivoli Enterprise Portal Server service.
2. Select Advanced > Edit Trace Parms to display the Trace Parameters window.
3. Select the RAS1 filters. The default setting is ERROR.
4. Accept the defaults for the rest of the fields and click OK.
v Set the following variable in the install_dir/config/cq.ini file:
KBB_RAS1=ERROR (UNIT:filter trace_level)
where filter is the component you want to trace and trace_level is the level of tracing you want.
What to do next
Recycle the Tivoli Enterprise Portal Server.
Setting the trace option for the Tivoli Enterprise Monitoring Server trace
About this task
On Windows systems:
1. Always back up the files before altering them.
2. On the computer where the Tivoli Enterprise Monitoring Server is installed, select Start > Programs > IBM Tivoli Monitoring > Manage Tivoli Enterprise Monitoring Services.
3. Right-click the Tivoli Enterprise Monitoring Server service.
4. Select Advanced > Edit Trace Parms to display the Trace Parameters window.
5. Select the RAS1 filters. RAS1 is the unit trace for the monitoring server. The default setting is ERROR.
Note: There must be a space between each UNIT trace setting. For example, ERROR (UNIT:kdy all) (UNIT:kfaprpst all).
6. Accept the defaults for the rest of the fields.
7. Click OK to set the new trace options.
8. Click Yes to recycle the service.
On UNIX systems:
1. Always back up the files before altering them.
2. Set the following variable in the ms.ini file in the $CANDLEHOME/config directory:
KBB_RAS1=ERROR (UNIT:filter trace_level)
where filter is the component you want to trace and trace_level is the level of tracing you want. The following example traces everything in the Deploy component:
KBB_RAS1=ERROR (UNIT:KDY ALL)
Note: There must be a space between each UNIT trace setting. For example:
KBB_RAS1=ERROR (UNIT:KDY ALL) (UNIT:KFAPRPST ALL)
3. Set the following variable in $CANDLEHOME/bin/tacmd to trace the command line interface of the Tivoli Enterprise Monitoring Server:
KBB_RAS1=ERROR (UNIT:filter trace_level)
4. Regenerate the host_name_ms_TEMS_NAME.config file by running the ./itmcmd config -S [-h install_dir] [-a arch] -t tems_name command.
5. Recycle the Tivoli Enterprise Monitoring Server by restarting it, or by stopping and then starting it. The command syntax for starting and stopping the monitoring server is ./itmcmd server [-h install_dir] [-l] [-n] start|stop tems_name.
For information on how to set trace levels dynamically, see “Dynamically modify trace settings for an IBM Tivoli Monitoring component” on page 53.
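Editing the KBB_RAS1 setting in ms.ini can be scripted together with the backup that the procedure calls for. The following Python sketch assumes that the file holds simple NAME=value lines; the function names and the .bak suffix are hypothetical, and the sketch does not regenerate the .config file or recycle the monitoring server.

```python
import shutil


def set_kbb_ras1(ini_text, value):
    """Return ini_text with its KBB_RAS1 line replaced, or appended if absent."""
    new_line = f"KBB_RAS1={value}"
    lines = ini_text.splitlines()
    replaced = False
    for index, line in enumerate(lines):
        # Commented-out lines do not start with the variable name and are kept.
        if line.strip().startswith("KBB_RAS1="):
            lines[index] = new_line
            replaced = True
    if not replaced:
        lines.append(new_line)
    return "\n".join(lines) + "\n"


def update_ini_file(path, value):
    """Back up path to path + '.bak', then rewrite its KBB_RAS1 setting."""
    shutil.copy2(path, path + ".bak")
    with open(path, encoding="utf-8") as handle:
        text = handle.read()
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(set_kbb_ras1(text, value))


if __name__ == "__main__":
    sample = "SOME_SETTING=1\nKBB_RAS1=ERROR\n"  # hypothetical file content
    print(set_kbb_ras1(sample, "ERROR (UNIT:KDY ALL)"), end="")
```

After the file is updated, the remaining steps of the procedure still apply: regenerate the configuration file and recycle the monitoring server.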
Setting the trace option for the Agent Deploy tool
About this task
On Windows systems:
1. On the computer where the Tivoli Enterprise Monitoring Server is installed, select Start > Programs > IBM Tivoli Monitoring > Manage Tivoli Enterprise Monitoring Services.
2. Right-click the Tivoli Enterprise Monitoring Server service.
3. Select Advanced > Edit Trace Parms to display the Trace Parameters window.
4. Type (UNIT:kdy all) in the Enter RAS1 Filters field.
5. Accept the defaults for the rest of the fields.
6. Click OK to set the new trace options.
7. Click Yes to recycle the service.
On Linux systems, set the following variable in $CANDLEHOME/config/lz.ini:
KBB_RAS1=ERROR (UNIT:kdy ALL) (UNIT:kdd ALL)
On UNIX systems other than Linux:
1. Set the following variable in $CANDLEHOME/config/ux.ini:
KBB_RAS1=ERROR (UNIT:kdy ALL) (UNIT:kdd ALL)
2. Recycle the OS Agent on that endpoint.
Setting any monitoring agent's trace option for SNMP alerts
When troubleshooting SNMP Alerts for any agent, set the following trace:
ERROR (UNIT:KRA ALL)
If the agent is configured to use SNMPv3 Encryption when emitting the SNMP alerts, set (COMP:SNMP ALL) so that the trace setting would be the following:
ERROR (UNIT:KRA ALL) (COMP:SNMP ALL)
Use (COMP:SNMP ALL) when you are focusing on SNMP traps. If you are focusing on an agent communication error or crash, then use:
KBB_RAS1=(UNIT:KRA ALL) (UNIT:s_ ALL)
The (UNIT:s_ ALL) trace level includes tracing of system calls during SNMP processing.
Setting the trace option for the universal agent
About this task
Use the universal agent trace facility to diagnose problems. The universal agent uses RAS1 tracing. By default, universal agent trace data is written to a file in the logs subdirectory. The default RAS1 trace level is ERROR for all universal agent components and modules. On Windows, the kumras1.log is overwritten each time the universal agent starts and there is no method for archiving previous RAS1 log files. Therefore, you must obtain the RAS1 log that matches the problem occurrence before contacting IBM Software Support. You can set tracing options for individual universal agent components and modules in the KUMENV file on Windows or the um.ini file on UNIX-based systems.
RAS1 supports pattern matching. For example, (UNIT:kums options) traces all SNMP data provider modules because they all begin with kums. Detailed RAS1 tracing can degrade universal agent performance due to high CPU usage and I/O overhead. Therefore, set the universal agent RAS1 tracing to KBB_RAS1=ERROR after problem diagnosis. If a module produces excessive error messages and fills the RAS1 log, set (UNIT:modulename None) to suppress the module until you resolve the errors. If you discover an old Windows RAS1 log file, the KBB_RAS1 environment variable was erased or commented out in the KUMENV file. Add KBB_RAS1=ERROR to the KUMENV file to reactivate universal agent RAS1 tracing to the install_dir\logs\hostname_um_timestamp.log file.
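The effect of a partially qualified unit name can be pictured as a prefix match on module names. The following Python sketch illustrates that reading only: the module names are hypothetical, and the case-insensitive prefix comparison is an assumption about how the match behaves, not a description of the RAS1 implementation.

```python
def traced_modules(unit_pattern, module_names):
    """Return the module names that start with unit_pattern.

    Assumption: a (UNIT:pattern ...) filter selects modules by a
    case-insensitive prefix match.
    """
    prefix = unit_pattern.lower()
    return [name for name in module_names if name.lower().startswith(prefix)]


# Hypothetical module names, for illustration only.
MODULES = ["kumsnmp", "kumstrap", "kumpodbc", "kumdchsv"]

if __name__ == "__main__":
    print(traced_modules("kums", MODULES))
    print(traced_modules("kum", MODULES))
```

The shorter the pattern, the more modules it selects, which is why a broad pattern combined with ALL can produce large logs.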
Set the universal agent trace from Manage Tivoli Enterprise Monitoring Services:
1. Right-click the universal agent.
2. Select Advanced > Edit Trace Parms.
3. Select the RAS1 filters. The default setting is ERROR.
4. Accept the defaults for the rest of the fields.
5. Click OK to set the new trace options.
6. Click Yes to recycle the service.
Setting the trace option for the Warehouse Proxy agent
Procedure
1. On Windows systems, on the computer where the Tivoli Enterprise Monitoring Server is installed, select Start > Programs > IBM Tivoli Monitoring > Manage Tivoli Enterprise Monitoring Services.
2. Right-click Warehouse Proxy.
3. Select Advanced > Edit Trace Parms.
4. Select the RAS1 filters. The default setting is ERROR.
5. Accept the defaults for the rest of the fields.
6. Click OK to set the new trace options.
7. Click Yes to recycle the service.
Warehouse Proxy agent trace configuration:
You can edit the handler configuration file, $CANDLEHOME/config/ITMConfigRAS.properties on UNIX systems or %CANDLEHOME%\Config\ITMConfigRAS.properties on Windows systems, to set Handler99 as the configuration handler and set the debug tracing to the maximum, DEBUG_MAX, as shown below:
Handler99.name=config
Handler99.scope=*
Handler99.scopeName=Config
Handler99.logFile=../logs/config.log
Handler99.traceFile=../logs/config.trc
Handler99.level=DEBUG_MAX
Handler99.onConsoleToo=true
Handler99.maxFiles=10
Handler99.maxFileSize=8192
Then create a file called kKHDconfig.sysprops.cfg in the $CANDLEHOME/TMAITM6 directory on UNIX systems, or the %CANDLEHOME%\TMAITM6 directory on Windows systems, containing a link to the handler configuration file as shown below:
DInstallRASConfig="ITMConfigRAS.properties"
When the Warehouse Proxy Agent configuration panel is executed, tracing appears in the $CANDLEHOME/logs/config.trc file on UNIX systems, or the %CANDLEHOME%\logs\config.trc file on Windows systems, as described by the handler configuration file.
To trace the 2way translator, set the trace level to (UNIT: KDY ALL) (UNIT: KHD_XA ALL) in the Warehouse Proxy Agent environment file for KBB_RAS1.
Setting the trace option for the Summarization and Pruning Agent
Use the IBM Tivoli Universal Agent trace facility to diagnose problems with the Summarization and Pruning Agent. See “Setting the trace option for the universal agent” on page 47. The Summarization and Pruning Agent uses C-based RAS1 tracing, Java-based RAS1 tracing and Java-based internal tracing. By default, Summarization and Pruning Agent trace data is written to a file in the logs subdirectory. The default RAS1 trace level is ERROR for all Summarization and Pruning Agent components and modules.
The following trace options are available for the IBM Tivoli Summarization and Pruning Agent:
KBB_RAS1=ERROR
Trace general errors. Affects the content of the C-based RAS1 tracing (hostname_sy_HEXtimestamp-nn.log).
KBB_RAS1=ERROR (UNIT:ksz ALL)
Trace agent startup. Affects the content of the C-based RAS1 tracing (hostname_sy_HEXtimestamp-nn.log).
KBB_RAS1=ERROR (COMP:com.tivoli.twh.ksy ALL)
Minimum level trace for summarization. Affects the content of the Java-based RAS1 tracing (hostname_sy_ras1java_timestamp-nn.log).
KBB_RAS1=ERROR (UNIT:ksy1 ALL)
Medium level trace for summarization. Affects the content of the Java-based internal tracing (hostname_sy_java_timestamp-n.log).
KBB_RAS1=ERROR (UNIT:ksy2 ALL)
Connection level trace for summarization. Affects the content of the Java-based internal tracing (hostname_sy_java_timestamp-n.log).
KBB_RAS1=ERROR (UNIT:ksy3 ALL)
Statement level trace for summarization. Affects the content of the Java-based internal tracing (hostname_sy_java_timestamp-n.log).
KBB_RAS1=ERROR (UNIT:ksy4 ALL)
ResultSet level trace for summarization. Affects the content of the Java-based internal tracing (hostname_sy_java_timestamp-n.log).
KBB_RAS1=ERROR (UNIT:ksy5 ALL)
Column value level trace for summarization. Affects the content of the Java-based internal tracing (hostname_sy_java_timestamp-n.log).
KBB_RAS1=ERROR (UNIT:ksysql ALL)
Traces every SQL statement being executed. Affects the content of the Java-based internal tracing (hostname_sy_java_timestamp-n.log).
KBB_RAS1=ERROR (UNIT:ksysql1 ALL)
Same as (UNIT:ksysql ALL) but also includes all the parameter values used in the parameterized statements.
Note:
1. The following settings: (UNIT:ksy3 ALL) or (UNIT:ksy4 ALL) or (UNIT:ksy5 ALL) produce a high volume of trace output.
2. By default, the Java-based internal trace (hostname_sy_java_timestamp-n.log) wraps at 5 files, and each file contains 300000 lines. To change the defaults, use the following settings in the KSYENV (Windows) or sy.ini (UNIX) files:
KSZ_JAVA_ARGS=-Dibm.tdw.maxNumberDetailTraceFiles=<A> -Dibm.tdw.maxLinesForDetailTraceFile=<B>
where:
<A> Specifies the maximum number of Java-based internal trace files that can exist at any one time for a single launch.
<B> Specifies the maximum number of lines per Java-based internal trace file.
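Before changing the wrap settings, it can help to estimate how many trace lines are retained in total. The following Python sketch composes the KSZ_JAVA_ARGS line and multiplies the two limits; the function names are hypothetical, and the defaults of 5 files and 300000 lines are the ones stated in the note above.

```python
def ksz_java_args(max_files=5, max_lines=300000):
    """Return the KSZ_JAVA_ARGS line for the given wrap limits."""
    return (f"KSZ_JAVA_ARGS=-Dibm.tdw.maxNumberDetailTraceFiles={max_files} "
            f"-Dibm.tdw.maxLinesForDetailTraceFile={max_lines}")


def max_retained_lines(max_files=5, max_lines=300000):
    """Return the largest number of trace lines kept for a single launch."""
    return max_files * max_lines


if __name__ == "__main__":
    print(ksz_java_args(10, 100000))
    print("default lines retained:", max_retained_lines())
```

With the default limits, at most 5 x 300000 = 1500000 lines are kept for a single launch.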
Using the Summarization and Pruning Agent user interface:
You can edit the handler configuration file, $CANDLEHOME/config/ITMConfigRAS.properties on UNIX systems or %CANDLEHOME%\Config\ITMConfigRAS.properties on Windows systems, to set Handler99 as the configuration handler and set the debug tracing to the maximum, DEBUG_MAX, as shown below:
Handler99.name=config
Handler99.scope=*
Handler99.scopeName=Config
Handler99.logFile=../logs/config.log
Handler99.traceFile=../logs/config.trc
Handler99.level=DEBUG_MAX
Handler99.onConsoleToo=true
Handler99.maxFiles=10
Handler99.maxFileSize=8192
Then create a file called kKSYconfig.sysprops.cfg in the $CANDLEHOME/TMAITM6 directory on UNIX systems, or the %CANDLEHOME%\TMAITM6 directory on Windows systems, containing a link to the handler configuration file as shown below:
DInstallRASConfig="ITMConfigRAS.properties"
When the Summarization and Pruning Agent configuration panel is executed, tracing appears in the $CANDLEHOME/logs/config.trc file on UNIX systems, or the %CANDLEHOME%\logs\config.trc file on Windows systems, as described by the handler configuration file.
To trace the 2way translator, set the trace level to (UNIT: KDY ALL) (UNIT: KHD_XA ALL) in the Summarization and Pruning Agent environment file for KBB_RAS1.
Setting the trace option for the tacmd commands
For Windows systems, manually edit the KUIENV file in the CANDLEHOME directory with the standard KBB_RAS1 statement to include the following:
KBB_RAS1=ERROR (UNIT:ksh all) (UNIT:kui all)
On UNIX systems, manually edit the $CANDLEHOME/bin/tacmd shell script to add a line like the following example:
KBB_RAS1=ERROR (UNIT:ksh all) (UNIT:kui all)
In order to debug KT1 as well, edit the line to be like the following example:
KBB_RAS1=ERROR (UNIT:ksh all) (UNIT:kui all) (UNIT:kt1 all)
Setting the trace option for the IBM Tivoli Monitoring upgrade toolkit
Table 4. Setting the trace option for the Tivoli Monitoring upgrade toolkit
Trace option: Endpoint tracing
Instructions: Run the following command to set log_threshold=3 or higher on an endpoint and enable endpoint tracing:
wep ep set_config log_threshold=3
Traces are written to lcfd.log on the endpoint in $LCF_DATDIR.
Trace option: Tracing in a test environment
Instructions: The trace setting is a Boolean value of TRUE or FALSE. The default is FALSE. Run the following command from a Tivoli Management Environment command prompt to enable tracing:
idlcall oid _set_debug TRUE
where:
oid
Specifies the object ID of the Upgrade Manager object. Run the wlookup Framework command to locate the Upgrade Manager object ID in the Tivoli Management Environment:
wlookup -a | grep Upgrade
Note: Setting the trace value to TRUE sets all Upgrade Toolkit tools to TRUE, affecting all users running Upgrade Toolkit tools.
A trace file named trace_tool_timestamp.log is created in the $DBDIR/AMX/trace/ directory in XML format, where tool is witmscantmr, witmassess, or witmupgrade, and timestamp is a time stamp that includes the date and time of execution. Each record in this log contains a time stamp and message. Additionally, these tools inherit Framework FFTC mechanisms such as wtrace and odstat for transaction and method stack traces. See the Tivoli Management Framework documentation for more information about the commands.
Trace option: OS Agent tracing
Instructions: OS Agent tracing is enabled at a minimum level by default. Agent tracing levels can be adjusted with agent specific settings. Logs are stored in install_dir\installITM\ on Windows agents or install_dir/logs/ on UNIX-based systems agents. These logs follow the RAS1 log format.
Setting the trace option for Tivoli Enterprise Console event forwarding
If your monitoring environment is configured for IBM Tivoli Monitoring event forwarding, you can forward situation events to the Tivoli Enterprise Console and view events on the event server through the Tivoli Enterprise Portal. If you want to forward situation events to and view updates from Tivoli Enterprise Console event server in the portal client, you can set the trace for the event forwarder on the Tivoli Enterprise Monitoring Server.
Use the event forwarding trace facility to diagnose problems with event forwarding.
About this task
The event forwarding trace facility uses RAS1 tracing. Event forwarding is set during installation. The acceptable values include:
v STATE
v DETAIL
v ALL
The default trace value is STATE. If you change the trace level, you must restart the monitoring server for the change to take effect.
Use the following instructions to set the trace levels:
On Windows systems:
1. In Manage Tivoli Enterprise Monitoring Services, right-click the Tivoli Enterprise Monitoring Server.
2. Click Advanced > Edit Trace Parms.
3. Under Enter RAS1 Filters, add UNIT:kfaot trc_class
where:
trc_class
Specifies STATE, DETAIL, or ALL, which produce increasingly more trace information.
4. The default log file location is C:\IBM\ITM\CMS\logs\KMSRAS1.LOG. Change it if necessary.
5. Click OK to set the trace.
6. Recycle the monitoring server for the trace to take effect.
On UNIX systems:
1. Edit install_dir/config/hostname_ms_Tivoli_Enterprise_Monitoring_Server_ID.config
where:
install_dir
Specifies the installation directory of the monitoring server.
hostname
Specifies the host name value supplied during installation.
2. Add (UNIT:kfaot trc_class) to the line KBB_RAS1='ERROR'
where:
trc_class
Specifies one of the following levels of trace detail:
v STATE - minimum detail.
v DETAIL - medium detail.
v ALL - maximum detail.
For example, KBB_RAS1='ERROR (UNIT:kfaot STATE)'
3. Save the file.
4. Recycle the monitoring server for the trace to take effect.
5. The monitoring server log can be found in install_dir/logs/hostname_ms_nnnnnnn.log where nnnnnnn is a time stamp. There might be multiple files with different time stamps in the logs directory.
Setting the trace option for the IBM Tivoli Enterprise Console Situation Update Forwarder
If your monitoring environment is configured for the IBM Tivoli Enterprise Console, you can forward situation events to the Tivoli Enterprise Console event server. You can also view events on the event server through the Tivoli Enterprise Portal. If you want to forward situation events to and view updates from IBM Tivoli Enterprise Console in the Tivoli Enterprise Portal, you can set the trace for the Situation Update Forwarder on the IBM Tivoli Enterprise Console event server. The default trace setting is low. You can edit the trace setting using the sitconfig command.
$BINDIR/TME/TEC/OM_TEC/bin/sitconfig.sh update fileName=configuration_file_name logLevel=trace_level
where:
configuration_file_name
The file name of the actively loaded configuration file as indicated by the situpdate.properties file.
trace_level
Specifies the level of trace as low, med, or verbose.
Use the IBM Tivoli Enterprise Console Situation Update Forwarder trace facility to diagnose problems with the IBM Tivoli Enterprise Console Situation Update Forwarder. The trace for the IBM Tivoli Enterprise Console Situation Update Forwarder is set during installation. The acceptable values include:
v low
v med
v verbose
The default trace value is low. If you change the trace level after the Situation Update Forwarder is started, you must restart the Situation Update Forwarder for the change to take effect. There are two trace files:
synch_trace.log
is always created.
synch_msg.log
is created if an error occurs while running the Situation Update Forwarder.
Run the following command to set the trace levels:
$BINDIR/TME/TEC/OM_TEC/bin/sitconfig.sh update fileName=configuration_file_name logLevel=trace_level
where:
configuration_file_name
The file name of the actively loaded configuration file as indicated by the situpdate.properties file.
trace_level
Specifies the level of trace as low, med, or verbose.
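If the trace level is changed from a script, the sitconfig.sh arguments can be validated before the command is run. The following Python sketch only builds the argument list; the function name and the sample values are hypothetical, and the script path and the fileName and logLevel parameters are the ones shown in the command above.

```python
TRACE_LEVELS = ("low", "med", "verbose")


def sitconfig_update_args(bindir, configuration_file_name, trace_level):
    """Return the argument list for the sitconfig.sh update command."""
    if trace_level not in TRACE_LEVELS:
        raise ValueError(f"unknown trace level: {trace_level}")
    script = f"{bindir}/TME/TEC/OM_TEC/bin/sitconfig.sh"
    return [
        script,
        "update",
        f"fileName={configuration_file_name}",
        f"logLevel={trace_level}",
    ]


if __name__ == "__main__":
    # Hypothetical values; use your own $BINDIR and configuration file name.
    print(" ".join(sitconfig_update_args("/opt/bin", "my_situpdate.conf", "verbose")))
```

The list form can be passed directly to subprocess.run, which avoids shell quoting problems when the configuration file name contains spaces.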
Setting up RAS1 tracing on z/OS systems
Edit the KpcENV (where pc is the product code) environment file to set the RAS1 trace level for your OMEGAMON product.
This syntax is used to specify a RAS1 trace in the KppENV file (where pp is the product code: HL for the OMEGAMON z/OS Management Console or DS for the Tivoli Enterprise Monitoring Server). After you add this configuration setting to the KppENV file, you must stop and restart the address space for the setting to take effect. After that, it remains in effect for the life of the address space. To end the trace, you must edit the KppENV file again to reset the trace level, and stop and restart the address space.
RAS1 trace setting syntax
The KBB_RAS1 environment variable setting follows the RAS1 trace setting syntax as described in “RAS1 syntax” on page 43.
Note: The default setting for monitoring agents on z/OS is KBB_RAS1=ERROR, meaning that only error tracing is enabled. You can specify any combination of UNIT, COMP, and ENTRY keywords. No keyword is required. However, the RAS1 value you set with the global class applies to all components. For more information on setting RAS1 tracing on z/OS systems, see your individual monitoring agent's user's guide.
Dynamically modify trace settings for an IBM Tivoli Monitoring component
You can access the Tivoli Enterprise Monitoring Server, Tivoli Enterprise Portal Server, almost all of the agents, and other IBM Tivoli Monitoring components from this utility.
This method of modifying trace settings on an IBM Tivoli Monitoring component is the most efficient method since it allows you to do so without restarting the component. Settings take effect immediately. Modifications made this way are not persistent.
Note: When the component is restarted, the trace settings are read again from the .env file. Dynamically modifying these settings does not change the settings in the .env files. To modify these trace settings permanently, modify them in the .env files.
How to turn tracing on:
In order to use this utility you need to know a local log-on credential for the system.
This method uses the IBM Tivoli Monitoring Service Console. The Service Console is accessed using a web browser. Access the utility by using the following link:
http://hostname:1920
where hostname is the host name or IP address of the system where the IBM Tivoli Monitoring component is running. The utility then appears with information about the components that are currently running on this system.
For example, the component Tivoli Enterprise Portal Server shows as cnp, the Monitoring Agent for Windows OS shows as nt, and the Tivoli Enterprise Monitoring Server shows as ms.
Select the link below the component for which you want to modify the trace settings. For example, to modify tracing for the Tivoli Enterprise Monitoring Server, select the "IBM Tivoli Monitoring Service Console" link under the service point system.balayne_ms.
When you select one of the links, you will be prompted for a user ID and password to access the system. This is any valid user that has access to the system.
Typing ? displays a list of the supported commands.
The command for modifying the trace settings is ras1.
If you type ras1 in the field at the bottom of the screen, you will then see the help for this command.
The set option (ras1 set) turns on the tracing, but does not affect existing tracing.
An example would be ras1 set (UNIT:xxx ALL) (UNIT:yyy Detail). This command will enable full tracing for the xxx class of the component and low-level detailed tracing on the yyy class of the component.
The ras1 list command lists what tracing is set as default. It is best to do an initial list in order to track what changes you have made to the tracing settings.
The following list describes the options of tracing available:
ALL - Provides all trace levels. Shown as ALL when using the ras1 list command.
Flow - Provides control flow data describing function entry and exit. Shown as Fl when using the ras1 list command.
ERROR - Logs internal error conditions. Shown as ER when using the ras1 list command. The output also shows as EVERYE+EVERYU+ER.
Other settings which provide component specific information are:
Detail - Shown as Det when using the ras1 list command.
INPUT - Shown as IN when using the ras1 list command.
Metrics - Shown as ME when using the ras1 list command.
OUTPUT - Shown as OUT when using the ras1 list command.
State - Shown as ST when using the ras1 list command.
Setting trace to ALL includes every trace point defined for the component. This might result in a large amount of trace. If you have been given a more specific setting, use it. ALL can sometimes be necessary when isolating a problem. It is the equivalent of setting "Error Detail Flow State Input Output Metrics".
The ras1 units command is used to determine the list of UNITs and COMPs available in an IBM Tivoli Monitoring component. The first column is the list of available UNIT values, the last column lists the corresponding COMP values.
Turning on (COMP:KDH ALL) turns ALL level tracing on for all of the files where KDH is listed in the right-hand column.
The following is a subset of the results for the Monitoring for Windows agent:
kbbcre1.c, 400, May 29 2007, 12:54:43, 1.1, *
kbbcrn1.c, 400, May 29 2007, 12:54:42, 1.1, *
kdhb1de.c, 400, May 29 2007, 12:59:34, 1.1, KDH
kdh0med.c, 400, May 29 2007, 12:59:24, 1.1, KDH
kdhsrej.c, 400, May 29 2007, 13:00:06, 1.5, KDH
kdhb1fh.c, 400, May 29 2007, 12:59:33, 1.1, KDH
kdhb1oe.c, 400, May 29 2007, 12:59:38, 1.2, KDH
kdhs1ns.c, 400, May 29 2007, 13:00:08, 1.3, KDH
kbbacdl.c, 400, May 29 2007, 12:54:27, 1.2, ACF1
kbbaclc.c, 400, May 29 2007, 12:54:27, 1.4, ACF1
kbbac1i.c, 400, May 29 2007, 12:54:28, 1.11, ACF1
kdhsfcn.c, 400, May 29 2007, 13:00:11, 1.1, KDH
kdhserq.c, 400, May 29 2007, 12:59:53, 1.1, KDH
kdhb1pr.c, 400, May 29 2007, 12:59:39, 1.1, KDH
kdhsgnh.c, 400, May 29 2007, 12:59:49, 1.1, KDH
kdh0uts.c, 400, May 29 2007, 12:59:23, 1.1, KDH
kdhsrsp.c, 400, May 29 2007, 13:00:13, 1.2, KDH
kdhs1rp.c, 400, May 29 2007, 13:00:12, 1.1, KDH
kdhscsv.c, 400, May 29 2007, 12:59:58, 1.9, KDH
kdebbac.c, 400, May 29 2007, 12:56:50, 1.10, KDE
The UNIT value matches any unit that starts with the specified value. For example, (UNIT:kra FLOW) prints the FLOW traces for all files which match kra*.
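To see in advance which files a COMP or UNIT filter would cover, the ras1 units listing can be saved and examined offline. The following Python sketch assumes each entry has six comma-separated fields with the UNIT first and the COMP last, as in the sample above, and that prefix matching is case-sensitive. The function names are illustrative and are not part of the product.

```python
def parse_units(listing):
    """Parse `ras1 units` output into (unit, comp) pairs.

    Assumes six comma-separated fields per entry: unit, a number,
    build date, build time, version, and comp.
    """
    pairs = []
    for line in listing.splitlines():
        fields = [field.strip() for field in line.split(",")]
        if len(fields) != 6:
            continue  # skip blank or unexpected lines
        pairs.append((fields[0], fields[-1]))
    return pairs


def units_for_comp(pairs, comp):
    """Units that a (COMP:comp ...) setting would cover."""
    return [unit for unit, unit_comp in pairs if unit_comp == comp]


def units_for_prefix(pairs, prefix):
    """Units that a (UNIT:prefix ...) setting would cover (prefix match)."""
    return [unit for unit, _ in pairs if unit.startswith(prefix)]


SAMPLE = """\
kbbcre1.c, 400, May 29 2007, 12:54:43, 1.1, *
kdhb1de.c, 400, May 29 2007, 12:59:34, 1.1, KDH
kbbacdl.c, 400, May 29 2007, 12:54:27, 1.2, ACF1
kdebbac.c, 400, May 29 2007, 12:56:50, 1.10, KDE
"""

if __name__ == "__main__":
    pairs = parse_units(SAMPLE)
    print(units_for_comp(pairs, "KDH"))    # files covered by (COMP:KDH ALL)
    print(units_for_prefix(pairs, "kbb"))  # files covered by (UNIT:kbb FLOW)
```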
How to turn tracing back off:
The option for turning the tracing off is ANY. For example you would use the following command to turn off tracing for the kbbcrcd class of the Windows OS agent:
ras1 set (UNIT:kbbcrcd ANY)
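When several filters must be switched on and later off again, it can help to generate the command text instead of typing it. The sketch below is a hypothetical helper, not a product interface. It formats ras1 set commands from (kind, name, level) tuples using only the keywords listed in this section.

```python
# Trace keywords named in this section; ANY is the "off" setting.
KNOWN_LEVELS = {"ALL", "FLOW", "ERROR", "DETAIL", "INPUT",
                "METRICS", "OUTPUT", "STATE", "ANY"}


def ras1_set(*filters):
    """Build a `ras1 set` command from (kind, name, level) tuples.

    kind is "UNIT" or "COMP"; level is one of the keywords above.
    """
    parts = []
    for kind, name, level in filters:
        if kind not in ("UNIT", "COMP"):
            raise ValueError("kind must be UNIT or COMP")
        if level.upper() not in KNOWN_LEVELS:
            raise ValueError("unknown trace level: " + level)
        parts.append("({}:{} {})".format(kind, name, level))
    return "ras1 set " + " ".join(parts)


if __name__ == "__main__":
    # Turn tracing on for one unit, then generate the matching "off" command.
    print(ras1_set(("UNIT", "kbbcrcd", "ALL")))
    print(ras1_set(("UNIT", "kbbcrcd", "ANY")))
```

The generated text is then pasted into the command field of the service console.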
Using the IBM Tivoli Monitoring Service Console
The IBM Tivoli Monitoring Service Console enables you to read logs and turn on traces for remote product diagnostics and configuration.
The IBM Tivoli Monitoring Service Console is uniquely identified by its service point name. All service consoles for a host are linked and presented on the IBM Tivoli Monitoring Service Index for that host. Point a browser to the HTTP port 1920 on a specific host (for example, http://goby:1920) to launch the IBM Tivoli Monitoring Service Index. You can also launch the service console with the https protocol by connecting via the https protocol and port 3661. You can perform operations on a specific IBM Tivoli Monitoring process by selecting the service console associated with a service point name.
The IBM Tivoli Monitoring Service Index has links to service consoles for the components installed on the computers. Now, when you go to the Service Index, you will also see links to the Agent Service Interface. Use the Agent Service Interface to get reports for an installed agent, whether it is a Tivoli Enterprise Monitoring Agent or Tivoli System Monitoring Agent. After logging into the local operating system, you can choose reports of agent information, private situations, private history, and attribute descriptions and current values. You can also make a service interface request using provided XML elements.
Starting the IBM Tivoli Monitoring service console
You can start the service console by accessing the Tivoli Enterprise Portal Server port.
Procedure
1. Start Internet Explorer V5 or higher.
2. In the Address field, type the URL for the portal server: http://hostname:1920
where hostname is the fully qualified name or IP address of the computer
where the portal server is installed. If the service console is not displayed, a
system administrator might have blocked access to it. See “Blocking access to
the IBM Tivoli Monitoring Service Console.”
3. Click the service console link associated with the desired process (service point
name).
4. When the log in window opens, click OK. In secure environments, you need a
valid user ID and password to proceed. Upon successful login, the service
console opens with three areas:
• Header
• Command Results
• Command Field
You can now issue service console commands in the command input area. For
a list of available commands, type a question mark (?) and click Submit.
Results
The service console performs user authentication using the native OS security facility. If you use the service console on z/OS systems, your user ID and password are checked by the z/OS security facility (RACF/SAF). If you use the service console on Windows systems, then you must pass the Windows workstation user ID and password prompt. This is the rule except for instances of a NULL or blank password, which are not accepted.
A password is always required to access the service console. Blank passwords, even if correct, cannot access the service console. Even if a user ID is allowed to log in to the operating system without a password, access to the service console is denied. Create a password for the user ID that is being used to log in to the service console.
Blocking access to the IBM Tivoli Monitoring Service Console
The Tivoli Management Services integral web server is installed automatically with the Tivoli Enterprise Portal Server and enables users to access the IBM Tivoli Monitoring Service Console. You can prevent users from accessing the service console that is available through the integral web server (http://portal_server_host_name:1920).
To block access to the service console, disable the integral web server. However, if you disable the integral web server, you must install a third party web server on the portal server computer to access the images and style sheets for the graphic view and edit the application parameters at every desktop client.
Procedure
1. From the Windows desktop select Start > Run
2. Type regedit.
3. Open the Tivoli Enterprise Portal Server Environment folder:
HKEY_LOCAL_MACHINE\SOFTWARE\Tivoli Monitoring Services\KFW\Tivoli Enterprise Portal Server\KFWSRV\Environment
4. Locate KDC_FAMILIES in the right frame, add a space, and type the
following at the end of the line: http_server:n
Example: IP PORT:1918 SNA use:n IP.PIPE use:n http_server:n
5. Install a third party web server on each computer where you installed the
Tivoli Enterprise Portal desktop client:
a. From the Windows desktop select Start > Programs > IBM Tivoli
Monitoring > Manage Tivoli Enterprise Monitoring Services.
b. Right-click Tivoli Enterprise Portal desktop and select Reconfigure from
the menu.
c. In the list of parameters that opens, double-click cnp.http.url.DataBus to
open the Edit Tivoli Enterprise Portal Parm window.
d. Type the URL to the external web server and to the cnps.ior file in the
candle\cnb directory. For example, if the web server name is myweb.hostname.com and its document root was configured to be \candle\cnb, the value to type is: http://myweb.hostname.com/cnps.ior
e. Select the In Use check box and click OK.
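The change to KDC_FAMILIES in step 4 is a plain string edit, and the sketch below shows it on the example value from that step. It does not read or write the registry. The function name is an assumption, and replacing an existing http_server token instead of adding a second one is a design choice of this sketch, not documented product behavior.

```python
def disable_http_server(kdc_families):
    """Return the KDC_FAMILIES value with http_server:n at the end.

    Any existing http_server token is dropped first so that the
    setting appears exactly once.
    """
    tokens = [token for token in kdc_families.split()
              if not token.lower().startswith("http_server:")]
    tokens.append("http_server:n")
    return " ".join(tokens)


if __name__ == "__main__":
    print(disable_http_server("IP PORT:1918 SNA use:n IP.PIPE use:n"))
```

Because the function is idempotent, running it on a value that already ends in http_server:n leaves the value unchanged.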
Displaying portal server tasks in the command prompt
The Tivoli Enterprise Portal Server has an option to display the tasks at the command prompt. This is used primarily with IBM Software Support for gathering diagnostic information.
Procedure
1. From your Windows desktop, select Start > Programs > IBM Tivoli
Monitoring > Manage Tivoli Enterprise Monitoring Services.
2. Right-click Tivoli Enterprise Portal Server, then select Change Startup
from the menu.
3. Select the Allow Service to Interact with Desktop check box.
Results
The next time the portal server is started, the process tasks are shown in a command prompt window.
KfwSQLClient utility
This utility provides an optional cleanup step if any of the portal server-generated workspace queries must be deleted. A sample scenario where this might be necessary is if you initially create a metafile application called DISKMONITOR for the Tivoli Universal Agent that has five attribute groups in it. Assume that you subsequently remove two of the attribute groups, which results in a new
application version suffix. You then decide to run um_cleanup to reset the DISKMONITOR version back to 00. After completing the cleanup process, the Navigator tree still shows workspaces for each of the five original attribute groups, even though the metafile contains only three attribute groups.
This mismatch is caused by the fact that the portal server saves workspace queries in the KFWQUERY table of the portal server database, which is not updated by the um_cleanup script. Therefore, the original 00 version of the queries, which knows about the five original attribute groups, is still being used when you view the DISKMONITOR00 application.
If you determine that you need to delete one or more portal server-generated queries for your Tivoli Universal Agent applications, there is a Tivoli Universal Agent-provided script called um_cnpsCleanup.bat, which is installed on Windows computers, that demonstrates how to perform the delete. The script is very short and uses only the following command:
kfwsqlclient /d TEPS2 /e "delete from kfwquery where id like 'zkum.%%';"
For a Windows-based portal server, this command is entered from the \IBM\ITM\CNPS directory. The command assumes that the portal server database is using the default data source name of TEPS2, but you can change it if you have configured a different data source name.
On Linux and UNIX systems, this command should be invoked using the itmcmd execute command, for example:
itmcmd execute cq "KfwSQLClient -f myqueries.sql"
Note that this command deletes all portal server-generated Universal Agent queries, which always begin with zkum. To confirm that portal server-generated Tivoli Universal Agent queries have been deleted, or to see which queries are currently defined, run the following select command against the KFWQUERY table:
kfwsqlclient /d TEPS2 /e "select id, name from kfwquery where id like 'zkum.%%';"
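To illustrate which rows the LIKE pattern selects, the following sketch runs the same delete and select statements against an in-memory SQLite table. SQLite is only a stand-in for the portal server database. The two-column table and the sample ids are hypothetical, and the sketch assumes that the doubled %% in the script is the batch-file escape for a single % wildcard.

```python
import sqlite3

# In-memory stand-in for KFWQUERY; the real table has more columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kfwquery (id TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO kfwquery VALUES (?, ?)",
    [
        ("zkum.DISKMONITOR00.1", "generated query"),  # hypothetical id
        ("kux.disk", "product-provided query"),       # hypothetical id
    ],
)


def generated_query_ids(connection):
    """Ids that match the pattern used by the cleanup command."""
    rows = connection.execute(
        "SELECT id FROM kfwquery WHERE id LIKE 'zkum.%' ORDER BY id")
    return [row[0] for row in rows]


before = generated_query_ids(conn)
conn.execute("DELETE FROM kfwquery WHERE id LIKE 'zkum.%'")
after = generated_query_ids(conn)
remaining = [row[0] for row in conn.execute("SELECT id FROM kfwquery")]

if __name__ == "__main__":
    print("matched before delete:", before)
    print("matched after delete:", after)
    print("rows left in table:", remaining)
```

Only ids that begin with zkum. are removed; queries with other prefixes are left in place.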
Clearing the JAR cache
If you encounter problems, IBM Software Support might instruct you to uninstall and to clear the Java archive (JAR) cache.
Procedure
1. If the Tivoli Enterprise Portal is running, exit by closing the browser window.
2. Start the Java Plug-in. You can find the Java Plug-in in Start > Settings >
Control Panel. To start it, double-click the Java Plug-in icon. Your desktop
might have a shortcut to the Java Plug-in.
3. In the Java Plug-in Control Panel window, select the Cache tab and click Clear
JAR Cache.
4. When a message indicates that the JAR cache is cleared, click OK.
What to do next
If you want to start browser mode again, restart your browser and type the URL for the Tivoli Enterprise Portal. The Java Extension Installation progress bar shows as each Java archive file is downloaded. Upon completion, the logon window opens and prompts you to enter a user ID.
Using the UAGENT application
The UAGENT application is a diagnostic tool to help solve problems you might experience with the universal agent. Every universal agent data provider automatically activates an application called UAGENT, which includes the DPLOG and ACTION workspaces.
DPLOG
The DPLOG is a pure event table in that it maintains only the most recent 100 rows, unless overridden by the KUMA_MAX_EVENT_ENTRIES environment variable. The DPLOG contains informational and error messages about the status of a data provider that indicate:
• If a metafile was validated successfully.
• If a metafile failed validation (which means the application will not come online).
• If a data source was available at startup.
• Which console ports and socket listening ports were used or unavailable.
• When monitoring started and stopped for a data source.
• When monitoring switched from one file to another.
• When an API or socket client program connected and disconnected.
The DPLOG also records other actions including metafile refreshes. The two most common universal agent problem symptoms are:
• One or more managed systems do not come online.
• The managed systems are online but the workspaces are empty.
Use the UAGENT application workspaces as one of the first tools to diagnose a universal agent problem. You might find the solutions for both problems in the appropriate DPLOG. The ODBC data provider also includes a DPLOG message indicating when monitoring started for every attribute group listed in every ODBC metafile.
ACTION workspace
Whenever a Take Action command is issued or a Reflex Action fires, an entry is added to the ACTION workspace. The Action table is keyed and ActionID is the Key attribute. The Action table rows have a time-to-live value of 30 minutes. Unlike the DPLOG which is data provider-specific, the ACTION table is shared by all data providers. If you run multiple data providers, the ACTION workspace under every UAGENT application contains the same rows.
The Action_Result can indicate what happened to a particular Take Action command. For example, if universal agent reflex actions fire faster than one per second, the ACTION workspace temporarily stops recording the results. Recording resumes after several minutes if the action rate slows down.
pdcollect tool
Use the pdcollect tool to collect the most commonly used information from a system. Technicians in IBM Software Support use this information to investigate a problem.
The pdcollect tool is used to gather log files, configuration information, version information, and other information to help solve a problem. You can also use the tool to manage the size of trace data repositories.
The pdcollect tool is run from the tacmd pdcollect command. To use this tool, you must install the User Interface Extension. When you install or upgrade the Tivoli Enterprise Portal Server, the Tivoli Enterprise Services User Interface Extensions software is automatically installed in the same directory. The portal server extensions are required for some products that use the Tivoli Enterprise Portal, such as IBM Tivoli Composite Application Manager products. For more information about this command, see the IBM Tivoli Monitoring Command Reference (http://publib.boulder.ibm.com/infocenter/tivihelp/v15r1/topic/com.ibm.itm.doc_6.2.3fp1/itm623_cmdref.htm).
ras1log tool
This is a tool that converts the time stamps contained in trace logs into readable values. This tool can be found in the itm_install/bin directory on both Windows and UNIX systems. The following lists how the help appears:
usage: ras1log [-l|u] logfile ...
-l for local time
-u for UTC time
logfile can be either a file name or '-' for stdin (default).
You can either pass the tool a file name or you can filter a file through it to obtain a readable log. You do not need to specify any arguments.
The following examples work on Windows systems:
ras1log <balayne_ms_46c071a6-01.log
ras1log <balayne_ms_46c071a6-01.log | grep GetEnv
ras1log <balayne_ms_46c071a6-01.log > tems_log
The first example sends the result to the screen, the second sends the result to grep to find all of the lines with the text 'GetEnv' in them, which are then printed on the screen, and the third sends the result to a file named tems_log.
By default this tool converts the timestamps to UTC time. When using the -l option, it writes local time instead.
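The kind of conversion that ras1log performs can be sketched in a few lines of Python. This is not the tool's implementation, and its output format is not reproduced here. The sketch assumes that each trace line begins with an opening parenthesis followed by eight hexadecimal digits giving seconds since the epoch, the same form as the hexadecimal value in the log file name above. The sample line content is hypothetical.

```python
import re
from datetime import datetime, timezone

# Assumed line prefix: "(" followed by 8 hex digits of epoch seconds.
STAMP = re.compile(r"^\(([0-9A-Fa-f]{8})")


def convert_line(line, local=False):
    """Prefix a trace line with a readable form of its hex time stamp.

    Lines that do not match the assumed prefix are returned unchanged.
    UTC is used unless local is True, mirroring the -u and -l options.
    """
    match = STAMP.match(line)
    if not match:
        return line
    seconds = int(match.group(1), 16)
    if local:
        stamp = datetime.fromtimestamp(seconds)
    else:
        stamp = datetime.fromtimestamp(seconds, tz=timezone.utc)
    return stamp.strftime("%Y-%m-%d %H:%M:%S") + " " + line


# Hypothetical trace line; only the leading hex value matters here.
SAMPLE_LINE = '(46C071A6.0000-1:kbbssge.c,52,"BSS1_GetEnv") sample text'

if __name__ == "__main__":
    print(convert_line(SAMPLE_LINE))
```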
Backspace Check utility
On UNIX systems, if you have incorrectly configured the backspace key, you will see the following:
• When you press the backspace key, characters such as "^?" and "^H" are displayed on the screen.
• The backspace key seems to be working correctly when entering text, but you later find characters such as "^?" and "^H" in configuration files and your software malfunctions.
Configure your terminal and "stty erase" to use the same key code for backspace. Consider using "^?" as the key code. Verify your configuration with the IBM Tivoli Monitoring distributed utility, Install: BackspaceCheckUtility.
Build TEPS Database utility
About this task
You can use this utility to build a blank database. Prior to the IBM Tivoli Monitoring v6.1 release, this utility would also populate the database with tables. Now, it is necessary to also run the BuildPresentation utility to build the tables in the database.
To build and populate a database, complete the following steps:
1. From the Manage Tivoli Enterprise Monitoring Services window, right-click
TEPS.
2. Select Advanced > Utilities > Build TEPS Database.
3. Run the BuildPresentation.bat file found in install_dir\CNPS.
IBM Tivoli Monitoring Operations Logging
You can use this logging facility to determine the cause of IBM Tivoli Monitoring problems. IBM Tivoli Monitoring Operations Logging replaces MSG2 logging. With MSG2 logging, physical space problems can occur due to MSG2 logs that grow without bound until a process is stopped. IBM Tivoli Monitoring Operations Logging enables you to configure log file management to avoid these problems.
Note: This functionality is not supported for agents. It is only supported for the log messages produced through the MSG2 facility as used by the Tivoli Enterprise Monitoring Server and Tivoli Enterprise Portal Server.
Windows and UNIX systems
The new optional logs replace the Tivoli Enterprise Monitoring Server log files \install_dir\cms\kdsmain.msg on Windows systems and install_dir/logs/hostname_ms_timestamp.log on UNIX-based systems. For the Tivoli Enterprise Portal Server, they replace the \install_dir\logs\kfwservices.msg file on Windows systems and the install_dir/logs/kfwservices.msg file on UNIX-based systems.
To use the new logging facility for the Tivoli Enterprise Monitoring Server, modify the \install_dir\cms\KBBENV file on Windows systems or the install_dir/config/hostname_ms_TEMS ID.config file and install_dir/config/kbbenv.ini file on UNIX-based systems. Add the following line to the file:
MSG_MODE=kms
To disable the new logging facility and return to original logging, either remove this line in the file or change it to:
MSG_MODE=MSG2
To use the new logging facility for the Tivoli Enterprise Portal Server, modify the \install_dir\cnps\kfwenv file on Windows systems, or the install_dir/config/cq.ini file on UNIX-based systems. Add the following line to the file:
MSG_MODE=kcq
To disable the new logging facility and return to original logging, either remove this line from the file or change it to:
MSG_MODE=MSG2
When you have enabled the new logging facility, the Tivoli Enterprise Monitoring Server writes a new log file: install_dir/itmLogs/itmc_hostname_kms.log. The Tivoli Enterprise Portal Server also writes to a new file: install_dir/itmLogs/itmc_hostname_kcq.log.
The properties file (install_dir/itmLogs/itmc_kms.properties for the Tivoli Enterprise Monitoring Server and the install_dir/itmLogs/itmc_kcq.properties file for the Tivoli Enterprise Portal Server) determines the maximum size and the number of rolling log files. The default is 100000 bytes per file and 3 files.
You can modify these values by changing these parameters in the properties file: fh.maxFileSize=1000000 fh.maxFiles=3. When the log file exceeds the maxFileSize, the file moves to a new name, for example, itmc_hostname_kms1.log, and new messages are then written to the original file name, for example, itmc_hostname_kms.log. The process continues for the number of maxFiles.
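For capacity planning, the two parameters can be read from the properties file and multiplied to estimate how much disk space the rolling logs can occupy. The following Python sketch assumes a simple key=value file layout. The function names are illustrative, and the result is an estimate because a file is rolled only after it exceeds maxFileSize.

```python
def parse_properties(text):
    """Parse simple key=value lines, ignoring blanks and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        props[key.strip()] = value.strip()
    return props


def approximate_log_bytes(props):
    """Rough space estimate: maximum file size times number of files."""
    return int(props["fh.maxFileSize"]) * int(props["fh.maxFiles"])


# Values taken from the parameter example in the text above.
SAMPLE = "fh.maxFileSize=1000000\nfh.maxFiles=3\n"

if __name__ == "__main__":
    print(approximate_log_bytes(parse_properties(SAMPLE)), "bytes")
```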
z/OS systems
About this task
To use the new logging facility for the Tivoli Enterprise Monitoring Server, modify the RKANPARU member. Add the following line to the file:
MSG_MODE=kms
To disable the new logging facility and return to original logging, either remove this line in the file or change it to:
MSG_MODE=MSG2
The number of log datasets is determined by how many datasets are defined in the Tivoli Enterprise Monitoring Server JCL procedure. The amount of data that can be written to each dataset depends on the amount of space allocated at the time each dataset is created.
To use IBM Tivoli Monitoring Operations Logging on z/OS systems, complete the following procedure:
1. Create the log datasets. You must create at least one to use the new logging
facility, but it is best to create three. The only limit to the number of datasets
that you create is the limit imposed by the system (typically, about 70). Each
dataset must reside on a disk device. The DCB attributes are DSORG=PS,
RECFM=VB, LRECL=256. The BLKSIZE specification must be at least 260, but
you can allow it to default to the system-determined value for best
performance. The amount of space allocated to the datasets is not critical.
Allocating five 3390 cylinders allows space for about 50000 log records. (The
number varies, depending on the lengths of the messages.)
Note: Do not specify secondary allocations. Any secondary allocations are
ignored.
2. Edit your Tivoli Enterprise Monitoring Server JCL procedure (typically named
CANSDSST, but can be named otherwise). For each of your datasets, add a DD
statement with the DDNAME "RKMSLGnn" that points to the dataset with
DISP=SHR. The first dataset should use "RKMSLG00" as the DDNAME, with
"nn" incrementing by one for each additional dataset.
Note: DO NOT SKIP VALUES of "nn." If any values are skipped, subsequent
OpsLog DD statements are ignored.
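Because skipped values of nn cause later DD statements to be ignored, generating the statements from a list is one way to keep the numbering contiguous. The Python sketch below is an illustration under stated assumptions: the dataset names are hypothetical placeholders, and the generated text still has to be placed in the JCL procedure by hand.

```python
def opslog_dd_statements(dataset_names):
    """Return one RKMSLGnn DD statement per dataset, numbered from 00.

    Numbering follows list order, so no value of nn is skipped.
    """
    if len(dataset_names) > 100:
        raise ValueError("nn has only two digits (00 through 99)")
    lines = []
    for index, dsn in enumerate(dataset_names):
        ddname = "RKMSLG{:02d}".format(index)
        lines.append("//{} DD DISP=SHR,DSN={}".format(ddname, dsn))
    return lines


if __name__ == "__main__":
    # Hypothetical dataset names; substitute the datasets created in step 1.
    for statement in opslog_dd_statements(
            ["HLQ.TEMS.OPSLOG0", "HLQ.TEMS.OPSLOG1", "HLQ.TEMS.OPSLOG2"]):
        print(statement)
```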
You can examine the contents of the log datasets using ISPF Browse or any equivalent tool. Note that the dataset that is currently receiving log data might appear to be empty. You can force a switch to the next dataset (which in turn will flush any buffered log data to the current dataset), using the MODIFY OPSLOGSW console command. The syntax of this command is:
F procname,OPSLOGSW,KMS
"KMS" indicates that the log associated with Tivoli Enterprise Monitoring Server is to be processed.
ITMSuper
The ITMSUPER tool performs audits of the IBM Tivoli Monitoring environment (topology, connectivity, application support consistency checks, situations distribution, warehouse analysis, etc.).
A Windows environment is required.
This tool can be run in stand-alone mode by pointing to the Tivoli Enterprise Monitoring Server on any platform. You can run the ITMSUPER tool from a Windows system without having other ITMSUPER software installed. The ITMSUPER Tools are included in the IBM Support Assistant (ISA), a free local software serviceability workbench that helps you resolve questions and problems with IBM software products. See IBM Support Assistant (http://www-01.ibm.com/software/support/isa).
Chapter 5. Installation and configuration troubleshooting
This chapter contains the following sections, which provide information about problems that might occur during installation, upgrading from previous versions, and uninstallation of the product and product components:
• “Frequently asked questions”
• “General installation problems and resolutions”
• “Windows installation problems and resolutions”
• “UNIX-based system installation problems and resolutions”
• “Troubleshooting z/OS-based installations”
• “Uninstallation problems and workarounds”
Frequently asked questions
General installation frequently asked questions
The following table lists general installation frequently asked questions.
Table 5. General frequently asked questions
Question: Are fix packs required if a user migrates a Candle monitoring agent to IBM Tivoli Monitoring?
Answer: Fix packs for CNP196 are delivered as each monitoring agent is migrated to IBM Tivoli Monitoring. Note: The IBM Tivoli Monitoring download image or CD provides application fix packs for the monitoring agents that are installed from that CD (for example, the agents for operating systems such as Windows, Linux, UNIX, and i5/OS®). The migration software for other agents is located on the download image or CDs for that specific monitoring agent, such as the agents for database applications. If you do not migrate the monitoring agent to IBM Tivoli Monitoring, the agent continues to work. However, you must migrate to have all the functionality that IBM Tivoli Monitoring offers.
Question: Do presentation files and customized OMEGAMON DE screens for Candle monitoring agents need to be migrated to a new zLinux system?
Answer: The migration from version 350 to IBM Tivoli Monitoring handles export of the presentation files and the customized OMEGAMON DE screens.
Windows installation frequently asked questions
Table 6. Windows installation frequently asked questions
Question: How can I determine if Windows Security logging is on?
Answer: If the sysadmin account that you use to log on to Tivoli Enterprise Portal is not a Windows Administrator, you do not see the security log. Windows security logging is not turned on by default. Normally, data is not collected in the security log unless the Windows administrator turns it on. A Record Count = 0 in the Windows monitored logs report confirms that security logging is not turned on.
Question: How can I diagnose problems with product browse settings?
Answer:
1. Select Start > Programs > IBM Tivoli Monitoring > Manage Tivoli Enterprise Monitoring Services.
2. Right-click the Windows agent and select Browse Settings. A text window displays.
3. Click Save As and save the information in the text file. If requested, you can forward this file to IBM Software Support for analysis.
UNIX-based systems installation frequently asked questions
Table 7. Frequently asked questions for UNIX-based systems installation
Problem: The product was installed as root. Without reinstalling the product, how can I change from root to another ID?
Solution: If you installed and started the agent as root, the files do not have correct permissions, so the result is unpredictable. For this reason, do not use the root ID to install or start the UNIX-based systems agents. Instead, create a user ID with the authority and permissions needed to install and run the agents, or use any existing ID other than root.
As root, run the UnSetRoot command, which is located in the install_dir/bin/ directory. This script resets all the files under the install_dir directory that are owned by root.
UnSetRoot [ -h CANDLEHOME ] userID
After running the script, run the SetPerm command, which is located in the install_dir/bin/ directory. This command sets root permission for certain UNIX-based systems agent files.
Problem: How can I set the trace option to capture any abends (core files)?
Solution: Add the following line to the agent .ini file. For example, for the KUX agent, add the line to the install_dir/config/ux.ini file:
KBB_SIG1=trace -dumpoff
Problem: In an environment of 50 servers with at least one agent per server, a new agent (vt) was installed outside the firewall. The new agent must be configured on the Tivoli Enterprise Monitoring Server for IP.PIPE communication. Is it necessary to change all the other UNIX-based systems agents for IP.PIPE?
Solution: It is not necessary to change all the other UNIX-based systems agents for IP.PIPE. You have to configure only the agent that connects to the Tivoli Enterprise Monitoring Server through a firewall. However, you must configure the Tivoli Enterprise Monitoring Server for IP.PIPE communication.
While configuring the agent that communicates through the firewall, you get the following options:
• Does the agent connect through a firewall? [YES or NO] (Default is: NO)
• IP.PIPE Port Number (Default is: 1918)
• Enter name of KDC_PARTITION (Default is: null)
Problem: Does SNMP need to be turned on to monitor a UNIX-based systems host? The monitoring server is running WINNT4.0 and the monitoring agent is running on HPUX.
Solution: If you are communicating only through the Tivoli Enterprise Monitoring Server, you do not need SNMP. However, if you are sending traps to the emitter through CA Unicenter or HP OpenView, SNMP is required.
Problem: When I press the backspace key, characters such as "^?" and "^H" appear on the screen. Alternatively, the backspace key appears to be working correctly when entering text, but I later find characters such as "^?" and "^H" in configuration files and the software malfunctions.
Solution: If you see either of these symptoms when using the backspace key on UNIX computers, you have incorrectly configured the backspace key.
Configure your terminal and "stty erase" to use the same key code for backspace. Consider using "^?" as the key code. Verify your configuration with the IBM Tivoli Monitoring distributed utility, Install: BackspaceCheckUtility.
Problem: When running the install.sh script on a Linux system, I get a Memory fault (core dump) at different, random stages of the installation, regardless of what selections I make.
Solution: When I run the command "getconf GNU_LIBPTHREAD_VERSION" on my system, the response I receive is "linuxthreads-0.10" or something similar. This is caused by the /etc/profile entry of "LD_ASSUME_KERNEL=2.4". If I unset this variable or change its value in /etc/profile to "2.6", the getconf command returns "NPTL 2.3.4" or something like it. This enables me to run the install.sh script without causing the memory fault.
OR
Changing the JAVA_COMPILER variable to NONE before upgrading allows me to continue without hitting the core dump.
Why does a Linux or UNIX-based installation to a non-default path create directories in the default /opt/IBM/ITM path?
This is an expected condition. The following example depicts an AIX installation to a non-default location. The following links are created when the SetPerm command is run:
/opt/IBM/ITM/tmaitm6
/opt/IBM/ITM/tmaitm6/links
/opt/IBM/ITM/tmaitm6/links/aix52
/opt/IBM/ITM/tmaitm6/links/aix52x6
/opt/IBM/ITM/tmaitm6/links/aix53
The SetPerm command creates those links by design. Some of the binaries have hard-coded execution paths. This coding is required by the operating system in order to invoke a program object in authorized mode [root owned with UID].
The IBM Tivoli Monitoring Installation and Setup Guide documents installation on a single target location. However, by using local testing and configuration control, you can install to multiple target locations and run Tivoli Monitoring from all of them. For example, you can run multiple remote monitoring servers on a single server. Of course, the multiple monitoring servers require a non-default configuration, such as using different base port numbers.
v If all installations on the system are at the same maintenance level, running the SetPerm command and updating the hard-coded /opt/IBM/ITM/tmaitm6/links directory structure does not cause any problems.
v If the installations on the system are not all at the same maintenance level, running the SetPerm command and updating the hard-coded /opt/IBM/ITM/tmaitm6/links directory structure can cause problems. This scenario needs more testing than the scenario where all installations are at the same level.
The following procedure might resolve problems you encounter in the latter scenario:
v Maintain an installation on this system with the most current maintenance.
v Run the SetPerm command from this installation each time after other installations apply maintenance or add agents.
v Run the SetPerm command from this installation each time after other installations run the SetPerm command or the secureMain commands.
Note: In some cases (for example, the OS agents), only one agent can be installed because of the agent's interaction with the operating system.
General installation problems and resolutions
This section describes general installation problems and resolutions.
Agent Builder application support is not displayed in listappinstallrecs output if it is manually installed without recycling the monitoring server
If you run the scripts to manually install the Agent Builder application support on the Tivoli Enterprise Monitoring Server (TEMS) and specify both the user name and password, the expected result is that the application support files are loaded without causing the TEMS to restart. After that, if you run the tacmd listappinstallrecs command to verify the application support installation, the support is not listed in the command output. As a result, a lower version SDA-enabled Agent Builder agent might override the higher version application support when it is connected through that TEMS. To avoid this situation, you must recycle the monitoring server.
Debugging mismatched application support files
After upgrading your Tivoli Enterprise Monitoring Server and Tivoli Enterprise Portal Server to IBM Tivoli Monitoring V6.2.3 or higher, you might be warned that the portal server identified mismatched support files.
Mismatched files are identified when you upgrade the portal server (TEPS) support but forget to upgrade the agent support files, or when you upgrade the agent support files but forget to upgrade the TEPS support.
To remedy this situation, complete the support upgrade specified by the warning. See “Resolving application support problems” on page 11 for more information.
Startup Center fails to reset the sysadmin password on the hub Tivoli Enterprise Monitoring Server configuration panel
If the Startup Center fails to reset the sysadmin password on the hub Tivoli Enterprise Monitoring Server configuration panel, reset the password manually.
Startup Center fails to create the Tivoli Data Warehouse database and user
If the Startup Center fails to create the Tivoli Data Warehouse database and user, follow the Warehouse Proxy Agent configuration instructions to create the Tivoli Data Warehouse database and user. See "Configuring a Warehouse Proxy agent" in the IBM Tivoli Monitoring Installation and Setup Guide.
On UNIX systems, a new user is not created or a password is not reset in the Startup Center when you use a non-root user to install Warehouse Proxy Agent and Tivoli Enterprise Portal Server
On UNIX systems, a new user cannot be created and a password cannot be reset in the Startup Center when you use a non-root user to install Warehouse Proxy Agent and Tivoli Enterprise Portal Server. To remedy this situation, create the user or reset the password manually.
On Windows systems, a Tivoli Monitoring Warehouse DSN is not created in the Startup Center
If the Tivoli Monitoring Warehouse DSN is not created by the Startup Center, create the DSN manually by using the Warehouse Proxy Agent configuration instructions. See “Configuring a Warehouse Proxy Agent on Windows (ODBC connection)” in the IBM Tivoli Monitoring Installation and Setup Guide. For more information, check the WAREHOUSE_ODBC.log and WAREHOUSE_ODBC.trc files under the target system temporary_directory\DSNUtil (for example, C:\Temp\DSNUtil).
Startup Center fails to test DSN with database connectivity
If you have an existing 32-bit Warehouse database in the 64-bit DB2 instance, the Startup Center fails to test the DSN for database connectivity after creating the Tivoli Monitoring Warehouse DSN. The WAREHOUSE database is not upgraded from 32-bit to 64-bit automatically. For more information, check the WAREHOUSE_ODBC.log and WAREHOUSE_ODBC.trc files under the target system temporary_directory\DSNUtil (for example, C:\Temp\DSNUtil).
Startup Center shows some system types as “Unknown Operating System”
When you run the discovery process for available machines, the Startup Center might not identify the type of operating system for some systems. These operating systems are listed as Unknown Operating System.
This issue does not prevent the use of the affected systems. If the operating system type of a specific system cannot be discovered, you are given the opportunity to categorize the system manually in a later step. When you assign systems to the components, if a system categorized as "Unknown Operating System" is assigned to a component, you can select the correct operating system from the list in the window that is displayed. After you have specified the correct OS, the system is moved to the correct category in the list.
The Startup Center uses Nmap OS detection to categorize systems. Nmap OS detection works by running through a set of probes against target IP implementations and comparing responses with those in the fingerprint database. These responses are affected by the specific IP stack creating the response, which allows for OS detection. However, in some cases it can also be affected by the IP stack on the system where nmap is running, as well as intermediate firewalls and routers, for example. In other words, for the same target OS type, several different fingerprints in the database might be required in order to address these variations. For additional information, see “Dealing with Misidentified and Unidentified Hosts” at the Nmap site: http://nmap.org/book/osdetect-unidentified.html
Whenever you find an OS that is not discovered correctly, you should ideally force nmap to generate a signature, so that you can submit it to Insecure for integration into the Nmap fingerprint database.
The nmap command is located on the Startup Center media in:
v (W32) StartupCenter/SDE/nmap-5.21-win32
v (Linux) StartupCenter/SDE/nmap-5.21-linux-x86
Run the nmap -O -sSU -T4 -d <target> command, where <target> is the misidentified system in question. The fingerprint is a series of lines, each of which starts with "OS". Submit the information at http://insecure.org/cgi-bin/submit.cgi?corr-os.
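The following Python sketch shows one way to pull the fingerprint lines out of saved nmap output before submitting them. It assumes that the output of the nmap command above was captured to a text file; the function name, the file name, and the sample text are hypothetical. The default prefix follows the description above ("OS"), which might also match other report lines that begin with those letters, so review the result before you submit it.

```python
from typing import List


def extract_fingerprint(nmap_output: str, prefix: str = "OS") -> List[str]:
    """Return the lines of saved nmap output that start with the given prefix."""
    return [line for line in nmap_output.splitlines() if line.startswith(prefix)]


if __name__ == "__main__":
    # Hypothetical sample text; real output comes from the nmap command above.
    sample = "\n".join([
        "Nmap scan report for target",
        "OS:SCAN(V=5.21)",
        "OS:SEQ(SP=1)",
        "PORT STATE SERVICE",
    ])
    for line in extract_fingerprint(sample):
        print(line)
```

To process a real capture, read the file contents (for example, a hypothetical nmap_output.txt) and pass the text to extract_fingerprint.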
Tivoli Enterprise Monitoring Agents
Review the monitoring agent installation and configuration troubleshooting topics for help with monitoring agent problems that occur during or after installation and initial configuration.
OS agent installation does not detect system monitor agents
Any agent released with IBM Tivoli Monitoring v6.2.2 or before, other than agents built with the latest Agent Builder tool, should not be installed on top of the IBM Tivoli Monitoring v6.2.2 System Monitor agents.
OS Agent does not install and a message indicates it was already installed
About this task
The status.properties file, located in $DBDIR/AMX/data/, is not deleted when you uninstall the Upgrade Toolkit. The Upgrade Toolkit refers to the old status.properties file, which contains information indicating that the OS Agent was installed. You might experience this problem if you do the following in the order listed:
1. Upgrade an endpoint.
2. Uninstall the Upgrade Toolkit.
3. Clean the endpoint manually.
4. Reinstall the Upgrade Toolkit.
5. Attempt to upgrade the endpoint that you previously upgraded in step 1.
Take the following steps to verify that information in the status.properties file is causing this problem:
1. Open the status.properties file.
2. Look for an entry like the following example:
#Copyright IBM Corporation 2005
#Wed Sep 14 15:54:43 CDT 2005
@Endpoint\:\:east@EndpointClass\:TMF_Endpoint\:\:Endpoint=COMPLETE
@Monitor\:Coast\ :120401@Threshold\:critical=COMPLETE
In this example, the status of the endpoint "east" is COMPLETE, which indicates that it was upgraded successfully. The witmupgrade command does not upgrade any item with the COMPLETE status and reports that it was already upgraded.
To upgrade the endpoint, the status for the endpoint "east" must be INCOMPLETE, as in the following example:
@Endpoint\:\:east@EndpointClass\:TMF_Endpoint\:\:Endpoint=INCOMPLETE
The only way to change the endpoint status in the status.properties file to INCOMPLETE is to perform a rollback on the upgrade of the item. See the IBM Tivoli Monitoring: Upgrading from Tivoli Distributed Monitoring.
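The verification steps above can be sketched as a small read-only Python check that lists the entries marked COMPLETE. This is a minimal sketch, not part of the Upgrade Toolkit: the function names are hypothetical, the "west" entry is invented for illustration, and the parsing assumes only what the example shows (comment lines start with "#", colons in keys are escaped as "\:", and the status follows the last "="). The sketch does not modify the file, because the status can be changed only by a rollback.

```python
from typing import Dict, List


def parse_status(text: str) -> Dict[str, str]:
    """Map each entry in status.properties text to its status value."""
    entries: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        key, sep, status = line.rpartition("=")
        if not sep:
            continue  # not a key=value line
        entries[key.replace("\\:", ":")] = status.strip()
    return entries


def completed(entries: Dict[str, str]) -> List[str]:
    """Return the entries that witmupgrade would treat as already upgraded."""
    return [key for key, status in entries.items() if status == "COMPLETE"]


if __name__ == "__main__":
    # The second entry ("west") is hypothetical and shown only for contrast.
    sample = r"""#Copyright IBM Corporation 2005
@Endpoint\:\:east@EndpointClass\:TMF_Endpoint\:\:Endpoint=COMPLETE
@Endpoint\:\:west@EndpointClass\:TMF_Endpoint\:\:Endpoint=INCOMPLETE
"""
    for key in completed(parse_status(sample)):
        print(key)
```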
Rolling back the upgrade: About this task
You can use the rollback option (-r option) of the witmupgrade command to remove the new IBM Tivoli Monitoring resources that you created. This is a necessary step if you want to repeat the test scenario. Rolling back the upgrade for the test scenario removes the Windows OS monitoring agent from the Windows endpoint and also removes the new situations and managed system list.
Follow these steps to roll back the upgrade:
1. Change to the $DBDIR/AMX/shared/analyze directory:
cd $DBDIR/AMX/shared/analyze
2. Type the following command to roll back the upgrade:
witmupgrade -x profilemanagers/DM_TEST_PM.xml -r -f scans/baseline.xml
where:
-x profilemanagers/DM_TEST_PM.xml
Specifies the name and location of the output file that resulted from the assessment of the DM_TEST_PM profile manager.
-r Indicates that the purpose of this command is to perform a rollback.
-f scans/baseline.xml
Specifies the name and location of the baseline file to use as input for this command.
3. Restart the Windows endpoint.
The rollback option can also be used to roll back an endpoint upgrade or a profile upgrade independently. By rolling back the profile manager upgrade, you roll back all upgrades (profile manager, profile, and endpoint) in one step.
“SQL1_OpenRequest status = 79” return code occurs when upgrading an agent: The return code SQL1_OpenRequest status = 79 occurs in the agent log when the application support is added during an upgrade. This return code results from an attempt to delete a table entry that does not exist in the table. When you add application support for a V6.1 agent, the return code is expected behavior because the agent application support data does not exist in the table.
Installation of OS agent on a Microsoft Windows Server 2003 fails with this error: “Unable to establish BSS1 environment, can't continue”
This error is caused by the deletion of the gskit directory, whether intentional or accidental, without clearing the registry information. If gskit was previously installed by another product that depends on it (for example, DB2 9.1), let that product reinstall it. If no other products depend on that version of gskit, clear the GSK7 entry in the registry, which is found under My Computer\HKEY_LOCAL_MACHINE\SOFTWARE\IBM\GSK7. Then rerun the IBM Tivoli Monitoring installation to allow gskit to be reinstalled.
Note: Create a backup of the registry before editing it.
Unable to update the Tivoli Data Warehouse agent by using the command line interface
When using remote deployment to upgrade the Tivoli Data Warehouse agents (Warehouse Proxy Agent and Summarization and Pruning Agent), you must use a specific workaround to ensure that the upgrade is successful.
On UNIX and Linux systems, you must add the following variable to the hd.ini file for the Warehouse Proxy Agent or the sy.ini file for the Summarization and Pruning Agent, and then restart the agent:
CTIRA_SYSTEM_NAME=$RUNNINGHOSTNAME$
On Windows systems, you must add the following line to the KHDCMA.INI file for the Warehouse Proxy Agent or the KSYCMA.INI file for the Summarization and Pruning Agent, and then reconfigure and restart the agent:
CTIRA_SYSTEM_NAME=%computername% .TYPE=REG_EXPAND_SZ
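For the UNIX and Linux case, the change to hd.ini or sy.ini can be sketched in Python as follows. This is a minimal sketch that assumes the .ini file consists of plain KEY=value lines; the function name is hypothetical and the location of the file is not shown. The function sets the variable if it is already present and appends it otherwise, so it can be run more than once. Restart the agent after the file is updated, as described above.

```python
def ensure_setting(ini_text: str, key: str, value: str) -> str:
    """Return ini_text with key=value set, replacing an existing line or appending one."""
    lines = ini_text.splitlines()
    for index, line in enumerate(lines):
        if line.split("=", 1)[0].strip() == key:
            lines[index] = f"{key}={value}"  # replace the existing setting
            break
    else:
        lines.append(f"{key}={value}")  # setting not present, so append it
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical existing content; read the real hd.ini or sy.ini instead.
    original = "EXAMPLE_SETTING=1\n"
    print(ensure_setting(original, "CTIRA_SYSTEM_NAME", "$RUNNINGHOSTNAME$"), end="")
```

Back up the .ini file before writing the result to it.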
Monitoring server connection information is changed after upgrade
A Tivoli Enterprise Monitoring Agent that connects to a different Tivoli Enterprise Monitoring Server than the one that the OS agent connects to might have its monitoring server changed after the monitoring agent is upgraded to a new version using remote deployment.
For example, consider an environment in which the monitoring agent for DB2 and the Linux OS agent V6.2.3 are installed on the same computer. The DB2 agent connects to the RTEMS 1 monitoring server and the OS agent connects to the RTEMS 2 monitoring server. After the upgrade, the DB2 agent connection is to RTEMS 2 rather than RTEMS 1. The same problem occurs when agents at V6.2.3 or earlier are updated using group deploy or single deploy.
If you have a Tivoli Monitoring V6.2.3 (or earlier) OS agent installed on the same computer as a product monitoring agent and both agents connect to different monitoring servers, upgrade the OS agent to V6.2.3 Fix Pack 1 or later version before upgrading the other monitoring agent. Otherwise, a remote deployment upgrade of the monitoring agent causes the monitoring server connection to change to the same monitoring server that the OS agent connects to.
Receive duplicate insert errors (SQL1 return code 80) after an agent switches away from the remote monitoring server and then switches back
When a Global Access List hub monitoring server is installed with a previous version of a remote monitoring server, you will see duplicate insert errors (SQL1 return code 80) after an agent switches away from the remote monitoring server and then switches back. These messages do not indicate an environment execution error.
Upgrade SQL file not found when installing application support on the standby hub
When adding application support to the hubs in a hot standby setup, after the first hub has been seeded, you might receive an error message similar to the following about the productcode_upg.sql file not being found while seeding the second hub:
Seeding support for Monitoring Agent for Microsoft SharePoint Server [8 of 10]
KCIIN1602E ERROR - file not found: /boadata/IBM/ITM/tables/cicatrsq/SQLLIB/kqp_upg.sql
Option "-f install|upgrade" can be used with the "itmcmd support" command to force using the pristine installation or upgrade support file for the product’s application support.
Seeding failed.
Seeding support for Monitoring Agent for Microsoft Virtual Server [9 of 10]
KCIIN1602E ERROR - file not found: /boadata/IBM/ITM/tables/cicatrsq/SQLLIB/kqr_upg.sql
Option "-f install|upgrade" can be used with the "itmcmd support" command to force using the pristine installation or upgrade support file for the product’s application support.
Seeding failed.
This error is not necessarily a fatal error. It simply means the application did not provide an upgrade seeding file. There are generally two types of seeding files: install and upgrade. The installer determines which one to apply by checking to see if there are already situations belonging to the application on the target hub. If no situation is found, then the installation seeding file is chosen, otherwise the upgrade seeding file is used if provided. In a hot standby setup, as soon as one hub is seeded, the other hub can copy the situations immediately. So when seeding is applied to the second hub, the installer detects existing situations and looks for the upgrade seeding file instead. Even though some applications do not provide upgrade seeding files, because the hubs automatically synchronize seeded data, it is generally not a serious issue. Seeding can still be forced on the second hub by using the -f option.
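The selection logic described above can be illustrated with a short Python sketch. It is a simplified model of the described behavior, not the installer's actual code: the function name and parameters are hypothetical, and a result of None stands for the "file not found" case shown in the message.

```python
from typing import Optional


def choose_seed_file(has_existing_situations: bool,
                     upgrade_file_provided: bool,
                     force: Optional[str] = None) -> Optional[str]:
    """Return "install", "upgrade", or None when no suitable seeding file exists."""
    if force is not None:
        if force not in ("install", "upgrade"):
            raise ValueError("force must be 'install' or 'upgrade'")
        return force  # models the -f option, which overrides the detection
    if not has_existing_situations:
        return "install"  # first hub: no situations yet for this application
    if upgrade_file_provided:
        return "upgrade"  # situations exist and an upgrade file is available
    return None  # situations exist but the application ships no upgrade file


if __name__ == "__main__":
    # Second hub in a hot standby pair: situations were already copied over.
    print(choose_seed_file(True, False))
    print(choose_seed_file(True, False, force="install"))
```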
Many files in the First Failure Data Capture log directory
On Windows systems, there are eWAS logs in the following location of the IBM Tivoli Monitoring home directory:
CANDLE_HOME\CNPSJ\profiles\ITMProfile\logs\ffdc\
And, on UNIX systems, they are found in the following directory:
CANDLE_HOME/arch/iw/profiles/ITMProfile/logs/ffdc/
These log files might contain the following exceptions:
org.omg.CORBA.BAD_OPERATION
CORBA.TRANSIENT
ClassNotFound on MQJMS
These exceptions can be ignored and have no impact on either eWAS or IBM Tivoli Monitoring functionality.
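When the ffdc directory contains many files, a small filter can help separate the exceptions listed above from anything else that might deserve attention. The following Python sketch is illustrative only: the function names and sample lines are hypothetical, and the matching rule for "ClassNotFound on MQJMS" (a line that contains both "ClassNotFound" and "MQJMS") is an assumption about how that exception appears in the logs.

```python
from typing import Iterable, List, Tuple

# Substrings taken from the list of exceptions that can be ignored.
IGNORABLE_SUBSTRINGS = ("org.omg.CORBA.BAD_OPERATION", "CORBA.TRANSIENT")


def is_ignorable(line: str) -> bool:
    """Return True if the line matches one of the exceptions that can be ignored."""
    if any(marker in line for marker in IGNORABLE_SUBSTRINGS):
        return True
    # Assumption: the MQJMS case shows both of these strings on one line.
    return "ClassNotFound" in line and "MQJMS" in line


def partition(lines: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split lines into (ignorable, other)."""
    ignorable: List[str] = []
    other: List[str] = []
    for line in lines:
        (ignorable if is_ignorable(line) else other).append(line)
    return ignorable, other


if __name__ == "__main__":
    # Hypothetical log lines for illustration.
    sample = [
        "org.omg.CORBA.BAD_OPERATION: example",
        "java.lang.NullPointerException",
    ]
    ignorable, other = partition(sample)
    print(len(ignorable), "ignorable;", len(other), "to review")
```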
Monitoring agents fail to start after agent support or multi-instance agents are installed
Monitoring agents on an IBM Tivoli Monitoring V6.2.1 (or later) managed system that have an unsupported system GSKit version installed might fail to start after an IBM Tivoli Monitoring V6.2 Multi-Instance Agent or IBM Tivoli Monitoring V6.2 Agent Support is locally installed.
The installer used by both IBM Tivoli Monitoring V6.2 Multi-Instance Agents (including fix packs) and IBM Tivoli Monitoring V6.2 Application Support causes monitoring agents on an IBM Tivoli Monitoring V6.2.1 managed system (or later) to revert back to using the system GSKit instead of the IBM Tivoli Monitoring embedded GSKit. This issue occurs on local installations only. Remote installation (remote deploy) does not have this issue.
If a system GSKit is installed on the managed system at a level supported by IBM Tivoli Monitoring, the monitoring agents continue to operate normally.
Monitoring agents might fail to start, however, if all of the following conditions are met:
v The managed system does not have a system GSKit installed or the system GSKit is at a version not supported by IBM Tivoli Monitoring V6.2.1 or later.
v The agent is configured to use secure communications (IP.SPIPE) rather than normal communication (IP.PIPE).
If agents on a managed system fail to start after an IBM Tivoli Monitoring V6.2 Multi-Instance Agent or IBM Tivoli Monitoring V6.2 Agent Support is installed, any one of the following corrective actions can be taken:
v Run kinconfig.exe -G on the managed system.
OR
v Reconfigure any of the IBM Tivoli Monitoring V6.2.1 (or later) monitoring agents on the managed system by running kinconfig.exe -rKproductcode.
OR
v Install another IBM Tivoli Monitoring V6.2.1 monitoring agent (or later).
Incorrect behavior after an uninstallation and re-installation
You might experience incorrect behavior if you uninstall then reinstall the product without rebooting. For example, you might experience the following problems:
v Inability to create trace logs.
v Agents do not start.
v Agent data is corrupt.
Reboot the system to resolve the problems.
Where Remote Deployment of agents is not supported
Remote Deployment is not supported for OMEGAMON agents. It is also not supported in environments with a z/OS Tivoli Enterprise Monitoring Server.
Remote Deployment is not supported when the Tivoli Enterprise Monitoring Server, Tivoli Enterprise Portal Server or the Tivoli Enterprise Portal are on the same system as the agent. It is also not supported if the target endpoint has a Tivoli Enterprise Monitoring Server, Tivoli Enterprise Portal Server or the Tivoli Enterprise Portal installed on it.
This restriction includes the following commands:
v tacmd viewagent
v tacmd startagent
v tacmd stopagent
v tacmd restartagent
v tacmd configuresystem
v tacmd updateagent
v tacmd removesystem
v tacmd createnode
v tacmd cleardeploystatus
v tacmd restartfaileddeployment
v tacmd checkprereq
v tacmd addsystem
Application Support Installer hangs
The Application Support Installer (ASI) reaches the screen indicating "Please select which applications you would like to add support for." After you click the "Next" button, the installation hangs and does not update the screen. The %TEMP%\ITM_AppSupport_Install.log (Windows) or /tmp/ITM_AppSupport_Install.log (UNIX and Linux) file also fails to be updated after this point, even after waiting for hours.
Change to the directory where setup.jar exists, and then use java -jar setup.jar to run the installer.
An agent bundle is not visible from the Tivoli Enterprise Portal
The bundle has been added to the depot and is viewable from there, but it is missing from the list of agents available for deployment from the Tivoli Enterprise Portal for a given node. You cannot deploy an agent from the Tivoli Enterprise Portal if the xml version in the depot is later than the installed version because the newer xml might contain configuration properties that the back-level agent does not support. This issue was noticed for the DB2 agent.
Agent Management Services fails after deployment on Linux Itanium and xLinux with kernel 2.4 systems
Agent Management Services fails after deployment on Linux Itanium and xLinux with kernel 2.4 systems when the -o KDYRXA.AUTOCLEAN=YES option is used. The Proxy Agent Services agent will not start when the deployment process completes if the option that removes the temporary directory used by remote deployment was used. To start the OS agent when this problem occurs, do one of the following actions:
v On the agent system, manually restart the OS agent.
v On the agent system, run $CANDLEHOME/bin/itmcmd execute -c lz startWatchdog.sh.
v Go to the Agent Management Services workspace for the agent in question and run the 'AMS Start Agent' Take Action against the Proxy Agent Services agent with a resetRestartCount of 0.
Watchdog utility requires Windows Script Host 5.6
The OS Agent watchdog utility calls scripts that require Windows Script Host 5.6 at a minimum. If these scripts are run on a system running an earlier version of Windows Script Host (for example 5.1), then the script continues to run, and over time results in multiple cscript processes running on the system.
Upgrade Windows Script Host to version 5.6 or later.
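To check the installed level before deciding whether an upgrade is needed, the version can be read from the banner that cscript prints when it starts. The following Python sketch parses such a banner and compares it with the 5.6 minimum. It assumes a banner of the form "Microsoft (R) Windows Script Host Version 5.6"; the function names are hypothetical, and the banner text must be captured separately.

```python
import re
from typing import Optional, Tuple

MINIMUM_VERSION = (5, 6)  # minimum level required by the watchdog scripts


def parse_wsh_version(banner: str) -> Optional[Tuple[int, int]]:
    """Extract (major, minor) from a Windows Script Host banner, or None if absent."""
    match = re.search(r"Windows Script Host Version (\d+)\.(\d+)", banner)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def meets_minimum(banner: str) -> bool:
    """Return True if the banner reports version 5.6 or later."""
    version = parse_wsh_version(banner)
    return version is not None and version >= MINIMUM_VERSION


if __name__ == "__main__":
    # Assumed banner format, shown with the two versions mentioned above.
    print(meets_minimum("Microsoft (R) Windows Script Host Version 5.6"))
    print(meets_minimum("Microsoft (R) Windows Script Host Version 5.1"))
```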
Unable to deploy monitoring agents from the Tivoli Enterprise Portal
You receive the following error message in a pop-up window when attempting to deploy a monitoring agent from a previous version of IBM Tivoli Monitoring through the Tivoli Enterprise Portal:
KFWITM291E An agent configuration schema was not found
The application support for the version being deployed must be installed to the portal server, or the agent configuration xml file (for example, r2_dd_062100000.xml) must be manually copied to the same location in the portal server (../classes/candle/kr2/resources/config) where the current-level configuration xml file (for example, r2_dd_062200000.xml) resides.
Installing application support with a silent installation response file fails
Running the Application Support Installer with a silent installation response file to apply application support on the monitoring server, the portal server, or the Tivoli Enterprise Portal fails and displays a failure message:
Error java.lang.ArrayIndexOutOfBoundsException: 0
Additionally, the resulting application support files contained in the support package are not installed.
Using the Application Support Installer with the Silent Installation Response file option is not supported. The recommended mechanism for the installation is the GUI.
Unable to run gsk7ikm.exe
About this task
You are unable to run c:\IBM\ITM\GSK7\bin\gsk7ikm.exe because it fails with the following error:
Failed to parse JAVA_HOME setting
On UNIX and Linux systems, complete the following steps:
1. Open a console.
2. Get the IBM Java location by running the following script:
CANDLEHOME/bin/CandleGetJavaHome
3. Export the JAVA_HOME variable to point to the IBM Java path. For the 64-bit gsk7ikm, this must be a 64-bit Java.
4. Check the path for a local GSKit. This path is in CANDLEHOME/config/gsKit.config. GskitInstallDir points to a 32-bit GSKit and GskitInstallDir_64 points to a 64-bit GSKit.
5. Run the GSKit key manager by running one of the following commands, depending on your system setup:
GskitInstallDir/bin/gsk7ikm_32 (32-bit on HP)
GskitInstallDir/bin/gsk7ikm (32-bit on Linux, AIX, or Solaris)
GskitInstallDir_64/bin/gsk7ikm_64 (64-bit)
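The choice in step 5 can be expressed as a small Python helper. This is a sketch under stated assumptions: the function name, the platform labels, and the example directories are hypothetical, and the two directory arguments are the GskitInstallDir and GskitInstallDir_64 values that you read from gsKit.config yourself (the file format is not parsed here).

```python
def key_manager_path(install_dir_32: str, install_dir_64: str,
                     platform: str, bits: int) -> str:
    """Return the gsk7ikm binary to run for the given platform and bitness."""
    if bits == 64:
        return f"{install_dir_64}/bin/gsk7ikm_64"
    if bits != 32:
        raise ValueError("bits must be 32 or 64")
    if platform == "hp":
        return f"{install_dir_32}/bin/gsk7ikm_32"  # 32-bit on HP
    if platform in ("linux", "aix", "solaris"):
        return f"{install_dir_32}/bin/gsk7ikm"  # 32-bit on Linux, AIX, or Solaris
    raise ValueError(f"unrecognized platform: {platform}")


if __name__ == "__main__":
    # Hypothetical directories standing in for the gsKit.config values.
    print(key_manager_path("/example/gsk7", "/example/gsk7_64", "linux", 32))
    print(key_manager_path("/example/gsk7", "/example/gsk7_64", "aix", 64))
```

Remember that JAVA_HOME must already point to a Java of the matching bitness (step 3) before the selected binary is run.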
On Windows systems:
1. Run cmd.
2. Get the IBM Java location by running the following script:
CANDLEHOME\InstallITM\GetJavaHome.bat
3. Set the JAVA_HOME variable to point to the IBM Java location.
4. Get the GSKit location by running the following script:
CANDLEHOME\InstallITM\GetGSKitHome.bat
5. Change the directory to GSKit path\bin.
6. Run the gsk7ikm.exe file.
*_cq_*.log files appear
Some of the *_cq_*.log files are from seeding operations, so exception messages in these files are expected by design.
SPD: Installing a bundle on the wrong operating system, architecture, or kernel
This problem occurs when you attempt to install a bundle on a system that does not correspond to the correct binaries (for example, installing a 32-bit bundle on a 64-bit system, or installing a 2.4 kernel-level bundle on a 2.6 kernel-level system). Look at the Software Package Block (SPB) logs, which are located in the temporary directory of the system (/tmp for UNIX or %temp% for Windows). These logs show that GSKit could not be installed.
To identify the right bundle for a particular system, the generated Software Package Definition (SPD) file uses the naming convention: product_code interp.spd. The interp tells you in which operating system, architecture, or kernel the bundle can be installed.
Installing a Software Package Block (SPB) on top of an existing, running IBM Tivoli Monitoring agent
When you attempt to install another IBM Tivoli Monitoring agent bundle using Tivoli Configuration Manager (TCM) or the Tivoli Provisioning Manager (TPM) on a system that has another IBM Tivoli Monitoring agent running, the second agent is not successfully installed due to overlapping libraries and ports configuration.
To prevent this problem, stop the running agent, and use Tivoli Configuration Manager (TCM) or Tivoli Provisioning Manager (TPM) to install the second agent.
Problems with the SPB file
If a Software Package Definition (SPD) file, created with the tacmd exportBundles command, is moved to a different system to create an SPB, the files copied by the tacmd exportBundles command need to be moved with the SPD file as well, and the SOURCE_DIR in the default_variable section of the SPD file needs to be updated to reflect the new directory where the agent files are located.
Installation was halted and receive message about active install
If the installation was halted for any reason, for example by pressing Ctrl-C or by a power outage, and you then run the uninstallation, you receive the following message:
An install may currently be running in "/data/itmfp6_preUPGR" from the following machine(s):
Continue with this uninstallation [1-yes, 2-no; "2" is default]?
Recovery from a hard kill of the installer is currently not a supported scenario since the current installer does not have built-in rollback capability. Executing a hard stop of the installer will leave some or all IBM Tivoli Monitoring functions (including uninstall) in an unpredictable or disabled state.
However, you should be able to continue with the uninstallation after ensuring that there is indeed no installation being run on the system.
Receive an install.sh error when installing two components or agents in the same installation directory
Installing two components or agents in the same CANDLEHOME or installation directory is supported as long as the user ID used to run the installation is always the same.
Installing two components or agents in the same CANDLEHOME or installation directory using different user IDs is not supported.
When attempting to install IBM Java 1.5.0 on Windows 64 bit system nothing happens
Only 32-bit browsers are supported on the AMD 64 Windows environment due to the lack of native 64-bit Web Start or Java Plug-in support.
Backup failure message during a remote monitoring server upgrade
During a remote Tivoli Enterprise Monitoring Server upgrade, if you receive the message, “The Backup procedure for TEMS database files has failed. If you continue with the installation your customized tables could be lost. Would you like to abort the installation?”, exit the upgrade installation to avoid losing data.
About this task
If you click YES, there is a risk of losing your customized tables. To ensure that you do not lose data, complete the following steps:
Procedure
1. Click NO and exit the upgrade installation.
2. Restart the remote monitoring server computer.
3. Stop all Tivoli Monitoring components.
4. Rerun the upgrade installation now with the remote monitoring server in the stopped state.
Results
The upgrade installation is complete.
Remote configuration of deployed Monitoring Agent for DB2 agent fails
The following message is returned when running the tacmd addsystem command:
The agent action SETCONFIG failed with a return code of -1073741819 for product code ud.
Remote configuration and installation of a database agent requires that IBM Global Security Kit (GSKit) be installed in directory C:\Program Files\ibm\gsk7, or that the GSKit directory be defined in the Windows System environment variable ICCRTE_DIR. DB2 9.1 installs the GSKit package in C:\ibm\gsk7 and the ICCRTE_DIR environment variable is not exported as a System environment variable. Therefore, tacmd addsystem remote configuration processing cannot execute and results in the failure message reported to the user.
Choose one of the following resolutions that best fits your environment:
v Install the GSKit product by executing the InsGSKit.exe program in the target directory C:\Program Files\ibm\.
v Assign the System Environment variable named ICCRTE_DIR to the directory
path of the currently installed GSKit product (for example, C:\ibm\gsk7).
v When the error is reported, manually configure the Monitoring Agent for DB2
Service Startup Parameters to use the correct user name and password to interact with the DB2 9.1 product. Ensure that the InteractsWithDesktop Service is not enabled for this DB2 Agent Service.
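The location check described above can be sketched in Python. This is a minimal illustration, not part of the product: the function name gskit_location and the injectable isdir parameter are hypothetical, and the sketch inspects only the environment of the current process, which approximates but does not prove that ICCRTE_DIR is defined as a Windows System environment variable.

```python
import os

# Default GSKit directory named in the text above.
DEFAULT_GSKIT_DIR = r"C:\Program Files\ibm\gsk7"


def gskit_location(env=None, isdir=os.path.isdir):
    """Return the GSKit directory that remote configuration could use, or None.

    Hypothetical helper: checks ICCRTE_DIR first, then the default directory.
    The env and isdir parameters exist only so the logic can be tested
    without a real Windows system.
    """
    env = os.environ if env is None else env
    configured = env.get("ICCRTE_DIR")
    if configured and isdir(configured):
        return configured
    if isdir(DEFAULT_GSKIT_DIR):
        return DEFAULT_GSKIT_DIR
    return None
```

A result of None suggests that neither condition is met on the target system, which matches the situation after a DB2 9.1 installation that places GSKit in C:\ibm\gsk7 without exporting ICCRTE_DIR.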
Monitoring Server cannot find your deployment depot
If you create a shared deployment repository named depot on the server hosting the deployment depot and you create this repository in a subdirectory of the depot directory, the monitoring server will not be able to find your deployment depot, and you will receive this message:
KDY2077E: The specified agent bundle depot \\hubtems\depot is not a directory. Either the agent bundle depot directory does not exist or it is not a directory. The agent bundle depot directory does not exist because no bundles have been added.
Create the repository at the C:\IBM\ITM\CMS level of the directory structure, not at the C:\IBM\ITM\CMS\depot level. Then set DEPOTHOME=\\hubtems\centralrepository\depot.
The agent installation log shows error AMXUT7502E
The error AMXUT7502E might occur when running the Distributed Monitoring Upgrade Toolkit.
The agent was not installed for one of the following reasons:
v There is another installation in progress that cannot complete until the computer
is restarted.
–OR–
v You are attempting to install a component that is already installed.
Refer to the lcfd.log on the endpoint and the agent installation log as listed in Table 8 to determine the exact cause of the problem.
Table 8. lcfd log file
Windows             install_dir/Install/Abort IBM Tivoli Monitoring timeStamp.log
UNIX-based systems  install_dir/logs/candle_installation.log
Contact IBM Software Support if you cannot install the agent. See Chapter 2, “Logs and data collection for troubleshooting,” on page 5 for information on what types of data to collect before contacting Support. See the IBM Support Portal (http://www.ibm.com/support/entry/portal/software).
Failure occurs when sharing directories for the agent deploy depot
Although it is more efficient to use a network shared directory for the agent deploy depot directory, there are weaknesses that might negatively impact deployment in large enterprises:
v If an NFS is used to contain the depot and there is a problem with the NFS, then
the deployment activity is suspended for all deployments in progress.
v For UNIX environments, the directories that are mentioned on the shared
directory must have the names of each of the Tivoli Enterprise Monitoring Server servers.
v Administrator privileges need to be assigned based on a domain user ID. This is
impractical and is contrary to the desired effect of sharing.
You receive a KFWITM290E error when using deploy commands with a z/OS monitoring server
Remote Deployment is not supported in environments with a z/OS Tivoli Enterprise Monitoring Server.
Running deployment in a hot-standby environment
The IBM Tivoli Monitoring hot-standby capability allows your monitoring environment to continue operating in the event of environmental or operational issues with the primary hub monitoring server. For detailed information about the hot-standby feature, see the IBM Tivoli Monitoring High-Availability Guide for Distributed Systems. Refrain from deploying or updating agents while IBM Tivoli Monitoring is converting to a mirror monitoring server. Do not run agent deployments or remote deployment operations from a hot-standby mirror hub, because doing so might leave your deployment transactions stuck in a queued state that you might not be able to clear.
Difficulty with default port numbers
You can use Telnet to test if the port is open in the firewall. Use the following command for this test:
telnet hostname 15001
where 15001 is the port number in question.
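If Telnet is not available on the system, the same reachability test can be approximated with a short Python script. This is a minimal sketch under stated assumptions: the function name port_is_open and the 3-second timeout are illustrative choices, not product settings. It reports only whether a TCP connection can be established, which is what the Telnet test shows as well.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host name and attempts the connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


# Example call (hostname and 15001 are the placeholders used in the text):
# port_is_open("hostname", 15001)
```

A False result does not distinguish between a firewall that blocks the port and a server that is not listening on it, so check that the component is running before changing firewall rules.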
Selecting Security Validation User displays a blank popup
While configuring the Tivoli Enterprise Monitoring Server, you have an option to select the Security Validation User. When you select this option, a blank popup is displayed. Security validation works despite the blank popup, which shows a yellow triangle with an exclamation point and the following label:
TEMS User Authentication actions are needed!
When installing a monitoring agent on top of the Systems Monitor Agent, you receive an error
If you try to install a monitoring agent (that is not one of the agents built with IBM Tivoli Monitoring v6.2.2 Agent Builder) on top of the Systems Monitor Agent, you receive an error:
install.sh failure: KCI1163E cannot read file "/opt/IBM/ITM/registry/imdprof.tbl".
Monitoring agents that have been configured to connect to a monitoring server cannot be installed on the same system as those that have been configured for autonomous operation.
Also, monitoring agents that have been configured for autonomous operation cannot be installed on the same system as those that are connected to a monitoring server.
Some rows do not display in an upgraded table
About this task
You might not see all tables after upgrading the Warehouse Proxy to IBM Tivoli Monitoring V6.1 because some tables might be corrupted. Do the following to find the errors that occurred during the upgrade:
1. Edit the KHDRAS1_Mig_Detail.log file.
2. Search for the word EXCEPTION.
The KHD_MAX_ROWS_SKIPPED_PER_TABLE environment variable allows you to skip bad data. It sets the maximum number of rows per table that the migration skips when the data to be inserted is incorrect. When this number is reached, migration of the table is aborted.
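The log search in the steps above can also be scripted. The following Python sketch is illustrative only: the function name find_exceptions is hypothetical, and the path to KHDRAS1_Mig_Detail.log must be supplied by you because it depends on your installation.

```python
def find_exceptions(log_path, keyword="EXCEPTION"):
    """Return (line_number, text) pairs for log lines that contain keyword."""
    hits = []
    # errors="replace" keeps the scan going if the log contains odd bytes.
    with open(log_path, encoding="utf-8", errors="replace") as log:
        for number, line in enumerate(log, start=1):
            if keyword in line:
                hits.append((number, line.rstrip("\n")))
    return hits


# Example call (the path is a placeholder for your own log location):
# for number, text in find_exceptions("KHDRAS1_Mig_Detail.log"):
#     print(number, text)
```

The line numbers make it easier to return to the surrounding entries in the log and identify which table each error belongs to.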
The monitoring server and portal server automatically start after running Application Support Installer
After you run the Application Support Installer, the Tivoli Enterprise Monitoring Server and Tivoli Enterprise Portal Server start automatically, even if they were not running before the installation. This behavior is harmless, and there is currently no workaround.
Errors occur during installation of the IBM Tivoli Monitoring Event Forwarding tool
The product functions normally in spite of the error. Check the installation log for more details.
One or more errors occured during the replacement of files (tecSyncAllFile1) with files (tecSyncAllFile1). Refer to install log for more details.
One or more errors occured during the replacement of files (tecSyncAllFile2) with files (tecSyncAllFile1). Refer to install log for more details.
One or more errors occured during the replacement of files (tecSyncAllFile3) with files (tecSyncAllFile1). Refer to install log for more details.
. . .
Missing LSB tags and overrides warning message at the end of installation
During the installation process, you might see these unexpected warning messages:
insserv: warning: script ’S02ITMAgents2’ missing LSB tags and overrides
insserv: warning: script ’ITMAgents2’ missing LSB tags and overrides
These warnings are caused by an older installer missing some tags required by the chkconfig utility, used to manage system startup files. These warnings do not adversely affect the installation, and can safely be ignored.
Self-describing capability
Review the self-describing agent and application support problems to learn if your installation or configuration problem is related.
Receive a message after installing a self-describing capable agent
After installing a self-describing capable agent, you see an error message if the self-describing application support packages are not present on the installation media.
Unable to install agent name support packages required for self-describing mode. Check installation log file for more details.
You can review the details of installation failure by reading the installation main log file. The following entries should be stored in the log file:
Unable to install agent name support packages required for self-describing mode. Following error(s) detected:
Self-describing mode for agent name is not enabled. When the problem is fixed, reinstall agent name to enable self-describing mode.
list of error(s)