HP Platform LSF Command Reference Guide

Platform™ LSF™ Command Reference
Version 7 Update 3
Release date: May 2008
Last modified: May 16, 2008
Comments to: doc@platform.com
Support: support@platform.com
Copyright © 1994-2008, Platform Computing Inc.
Although the information in this document has been carefully reviewed, Platform Computing Inc. (“Platform”) does not warrant it to be free of errors or omissions. Platform reserves the right to make corrections, updates, revisions or changes to the information in this document.
UNLESS OTHERWISE EXPRESSLY STATED BY PLATFORM, THE PROGRAM DESCRIBED IN THIS DOCUMENT IS PROVIDED “AS IS” AND WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT WILL PLATFORM COMPUTING BE LIABLE TO ANYONE FOR SPECIAL, COLLATERAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING WITHOUT LIMITATION ANY LOST PROFITS, DATA, OR SAVINGS, ARISING OUT OF THE USE OF OR INABILITY TO USE THIS PROGRAM.
We’d like to hear from you
You can help us make this document better by telling us what you think of the content, organization, and usefulness of the information. If you find an error, or just want to make a suggestion for improving this document, please address your comments to doc@platform.com.
Your comments should pertain only to Platform documentation. For product support, contact support@platform.com.
Document redistribution and translation
This document is protected by copyright and you may not redistribute or translate it into another language, in part or in whole.
Internal redistribution
You may only redistribute this document internally within your organization (for example, on an intranet) provided that you continue to check the Platform Web site for updates and update your version of the documentation. You may not make it available to your organization over the Internet.
Trademarks
LSF is a registered trademark of Platform Computing Inc. in the United States and in other jurisdictions.
ACCELERATING INTELLIGENCE, PLATFORM COMPUTING, PLATFORM SYMPHONY, PLATFORM JOBSCHEDULER, PLATFORM ENTERPRISE GRID ORCHESTRATOR, PLATFORM EGO, and the PLATFORM and PLATFORM LSF logos are trademarks of Platform Computing Inc. in the United States and in other jurisdictions.
UNIX is a registered trademark of The Open Group in the United States and in other jurisdictions.
Microsoft is either a registered trademark or a trademark of Microsoft Corporation in the United States and/or other countries.
Windows is a registered trademark of Microsoft Corporation in the United States and other countries.
Other products or services mentioned in this document are identified by the trademarks or service marks of their respective owners.
Third-party license agreements
http://www.platform.com/Company/third.part.license.htm
Third-party copyright notices
http://www.platform.com/Company/Third.Party.Copyright.htm
Contents
bacct
bapp
badmin
bbot
bchkpnt
bclusters
bgadd
bgdel
bhist
bhosts
bhpart
bgmod
bjgroup
bjobs
bkill
bladmin
blaunch
blcollect
blhosts
blimits
blinfo
blkill
blparams
blplugins
blstat
bltasks
blusers
bmgroup
bmig
bmod
bparams
bpeek
bpost
bqueues
bread
brequeue
bresources
brestart
bresume
brlainfo
brsvadd
brsvdel
brsvmod
brsvs
brun
bsla
bslots
bstatus
bstop
bsub
bswitch
btop
bugroup
busers
ch
lsacct
lsacctmrg
lsadmin
lsclusters
lseligible
lsfinstall
lsfmon
lsfrestart
lsfshutdown
lsfstartup
lsgrun
lshosts
lsid
lsinfo
lsload
lsloadadj
lslogin
lsltasks
lsmon
lspasswd
lsplace
lsrcp
lsrtasks
lsrun
lstcsh
pam
patchinstall
pmcadmin
pmcremoverc
pmcsetrc
perfadmin
perfremoverc
perfsetrc
pversions (Windows)
pversions (UNIX)
ssacct
ssched
taskman
tspeek
tssub
wgpasswd
wguser
Index

bacct

Displays accounting statistics about finished jobs.

Synopsis

bacct [-b | -l] [-d] [-e] [-w] [-x] [-app application_profile_name] [-C time0,time1] [-D time0,time1] [-f logfile_name] [-Lp ls_project_name ...] [-m host_name ...|-M host_list_file] [-N host_name | -N host_model | -N cpu_factor] [-P project_name ...] [-q queue_name ...] [-sla service_class_name ...] [-S time0,time1] [-u user_name ... | -u all]
bacct [-b | -l] [-f logfile_name] [job_ID ...]
bacct [-U reservation_ID ... | -U all] [-u user_name ... | -u all]
bacct [-h | -V]

Description

Displays a summary of accounting statistics for all finished jobs (with a DONE or EXIT status) submitted by the user who invoked the command, on all hosts, projects, and queues in the LSF system. bacct displays statistics for all jobs logged in the current LSF accounting log file:
LSB_SHAREDIR/cluster_name/logdir/lsb.acct.
CPU time is not normalized.
All times are in seconds.
Statistics not reported by bacct but of interest to individual system administrators can be generated by directly using awk or perl to process the lsb.acct file.

Throughput calculation

The throughput (T) of the LSF system, certain hosts, or certain queues is calculated by the formula:
T = N/(ET-BT)
where:
N is the total number of jobs for which accounting statistics are reported
BT is the Start time, when the first job was logged
ET is the End time, when the last job was logged
You can use the option -C time0,time1 to specify the Start time as time0 and the End time as time1. In this way, you can examine throughput during a specific time period.
Jobs involved in the throughput calculation are only those being logged (that is, with a DONE or EXIT status). Jobs that are running, suspended, or that have never been dispatched after submission are not considered, because they are still in the LSF system and not logged in lsb.acct.
The total throughput of the LSF system can be calculated by specifying -u all without any of the -m, -q, -S, -D or job_ID options. The throughput of certain hosts can be calculated by specifying -u all without the -q, -S, -D or job_ID options. The throughput of certain queues can be calculated by specifying -u all without the -m, -S, -D or job_ID options.
bacct does not show local pending batch jobs killed using bkill -b. bacct shows MultiCluster jobs and local running jobs even if they are killed using bkill -b.

Options

-b Brief format.
-d Displays accounting statistics for successfully completed jobs (with a DONE status).
-e Displays accounting statistics for exited jobs (with an EXIT status).
-l Long format with additional detail.
-w Wide field format.
-x Displays jobs that have triggered a job exception (overrun, underrun, idle). Use with the -l option to show the exception status for individual jobs.
-app application_profile_name
Displays accounting information about jobs submitted to the specified application profile. You must specify an existing application profile configured in lsb.applications.
-C time0,time1 Displays accounting statistics for jobs that completed or exited during the specified time interval. Reads lsb.acct and all archived log files (lsb.acct.n) unless -f is also used.
The time format is the same as in bhist(1).
-D time0,time1 Displays accounting statistics for jobs dispatched during the specified time interval. Reads lsb.acct and all archived log files (lsb.acct.n) unless -f is also used.
The time format is the same as in bhist(1).
-f logfile_name Searches the specified job log file for accounting statistics. Specify either an absolute or relative path.
Useful for offline analysis.
The specified file path can contain up to 4094 characters for UNIX, or up to 255 characters for Windows.
-Lp ls_project_name ... Displays accounting statistics for jobs belonging to the specified License Scheduler projects. If a list of projects is specified, project names must be separated by spaces and enclosed in quotation marks (") or (’).
-M host_list_file Displays accounting statistics for jobs dispatched to the hosts listed in a file
(host_list_file) containing a list of hosts. The host list file has the following format:
Multiple lines are supported
Each line includes a list of hosts separated by spaces
The length of each line must be less than 512 characters
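The format rules above can be checked before a file is handed to -M. The following Python sketch is an illustration only: the function name parse_host_list is hypothetical, and it assumes that the 512-character limit applies to each line without its trailing newline.

```python
MAX_LINE_LENGTH = 512  # each line must be shorter than this, per the rules above


def parse_host_list(text):
    """Return the host names found in the text of a host list file, in order.

    Hypothetical helper, not part of LSF. Raises ValueError if a line is
    512 characters or longer.
    """
    hosts = []
    for number, line in enumerate(text.splitlines(), start=1):
        if len(line) >= MAX_LINE_LENGTH:
            raise ValueError(f"line {number} is {len(line)} characters long")
        hosts.extend(line.split())  # hosts on one line are separated by spaces
    return hosts


example = "hostA hostB\nhostC\n"
print(parse_host_list(example))  # ['hostA', 'hostB', 'hostC']
```

Blank lines simply contribute no hosts in this sketch; whether bacct itself accepts them is not stated here.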
-m host_name ...
Displays accounting statistics for jobs dispatched to the specified hosts.
If a list of hosts is specified, host names must be separated by spaces and enclosed in quotation marks (") or (’).
-N host_name | -N host_model | -N cpu_factor
Normalizes CPU time by the CPU factor of the specified host or host model, or by the specified CPU factor.
If you use bacct offline by indicating a job log file, you must specify a CPU factor.
-P project_name ... Displays accounting statistics for jobs belonging to the specified projects. If a list of projects is specified, project names must be separated by spaces and enclosed in quotation marks (") or (’).
-q queue_name ... Displays accounting statistics for jobs submitted to the specified queues.
If a list of queues is specified, queue names must be separated by spaces and enclosed in quotation marks (") or (’).
-S time0,time1 Displays accounting statistics for jobs submitted during the specified time interval. Reads lsb.acct and all archived log files (lsb.acct.n) unless -f is also used.
The time format is the same as in bhist(1).
-sla service_class_name
Displays accounting statistics for jobs that ran under the specified service class.
If a default system service class is configured with ENABLE_DEFAULT_EGO_SLA in lsb.params but not explicitly configured in lsb.applications, bacct -sla service_class_name displays accounting information for the specified default service class.
-U reservation_id ... | -U all
Displays accounting statistics for the specified advance reservation IDs, or for all reservation IDs if the keyword all is specified.
A list of reservation IDs must be separated by spaces and enclosed in quotation marks (") or (’).
The -U option also displays historical information about reservation modifications.
When combined with the -U option, -u is interpreted as the user name of the reservation creator. For example:
bacct -U all -u user2
shows all the advance reservations created by user user2.
Without the -u option, bacct -U shows all advance reservation information about jobs submitted by the user.
In a MultiCluster environment, advance reservation information is only logged in the execution cluster, so bacct displays advance reservation information for local reservations only. You cannot see information about remote reservations. You cannot specify a remote reservation ID, and the keyword all only displays information about reservations in the local cluster.
-u user_name ...|-u all Displays accounting statistics for jobs submitted by the specified users, or by all users if the keyword all is specified.
If a list of users is specified, user names must be separated by spaces and enclosed in quotation marks (") or (’). You can specify both user names and user IDs in the list of users.
job_ID ... Displays accounting statistics for jobs with the specified job IDs.
If the reserved job ID 0 is used, it is ignored.
-h Prints command usage to stderr and exits.
-V Prints LSF release version to stderr and exits.

Default output format (SUMMARY)

Statistics on jobs. The following fields are displayed:
Total number of done jobs
Total number of exited jobs
Total CPU time consumed
Average CPU time consumed
Maximum CPU time of a job
Minimum CPU time of a job
Total wait time in queues
Average wait time in queue
Maximum wait time in queue
Minimum wait time in queue
Average turnaround time (seconds/job)
Maximum turnaround time
Minimum turnaround time
Average hog factor of a job (cpu time/turnaround time)
Maximum hog factor of a job
Minimum hog factor of a job
Total throughput
Beginning time: the completion or exit time of the first job selected
Ending time: the completion or exit time of the last job selected
The total, average, minimum, and maximum statistics are on all specified jobs.
The wait time is the elapsed time from job submission to job dispatch.
The turnaround time is the elapsed time from job submission to job completion.
The hog factor is the amount of CPU time consumed by a job divided by its turnaround time.
The throughput is the number of completed jobs divided by the time period to finish these jobs (jobs/hour).
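The definitions above can be checked against the sample output shown in the examples in this section. The following Python sketch is an illustration only: the function names are hypothetical, the year in the timestamps is assumed (the output does not show one), and treating a zero turnaround time as a hog factor of 0.0 is a choice made here, not documented bacct behavior.

```python
from datetime import datetime


def job_metrics(submit, dispatch, finish, cpu_seconds):
    """Return (wait, turnaround, hog_factor) for one finished job.

    wait       = dispatch time - submission time (seconds)
    turnaround = completion time - submission time (seconds)
    hog factor = CPU time / turnaround time
    """
    wait = (dispatch - submit).total_seconds()
    turnaround = (finish - submit).total_seconds()
    hog_factor = cpu_seconds / turnaround if turnaround > 0 else 0.0
    return wait, turnaround, hog_factor


def throughput_per_hour(job_count, period_hours):
    """Number of finished jobs divided by the time period, in jobs/hour."""
    return job_count / period_hours


# Job <1743> from the job exception example (year assumed).
submit = datetime(2008, 8, 11, 18, 16, 17)
dispatch = datetime(2008, 8, 11, 18, 17, 22)
finish = datetime(2008, 8, 11, 18, 18, 54)
wait, turnaround, hog = job_metrics(submit, dispatch, finish, cpu_seconds=0.19)
print(wait, turnaround, round(hog, 4))  # 65.0 157.0 0.0012

# Default format example: 60 done + 118 exited jobs during 266.18 hours.
print(round(throughput_per_hour(60 + 118, 266.18), 2))  # 0.67
```

Both results agree with the WAIT, TURNAROUND, HOG_FACTOR, and Total throughput values printed in those examples.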

Brief format (-b)

In addition to the default format SUMMARY, displays the following fields:
U/UID Name of the user who submitted the job. If LSF fails to get the user name by getpwuid(3), the user ID is displayed.
QUEUE Queue to which the job was submitted.
SUBMIT_TIME Time when the job was submitted.
CPU_T CPU time consumed by the job.
WAIT Wait time of the job.
TURNAROUND Turnaround time of the job.
FROM Host from which the job was submitted.
EXEC_ON Host or hosts to which the job was dispatched to run.
JOB_NAME The job name assigned by the user, or the command string assigned by default at job submission with bsub. If the job name is too long to fit in this field, then only the latter part of the job name is displayed.
The displayed job name or job command can contain up to 4094 characters for UNIX, or up to 255 characters for Windows.

Long format (-l)

In addition to the fields displayed by default in SUMMARY and by -b, displays the following fields:
JOBID Identifier that LSF assigned to the job.
PROJECT_NAME Project name assigned to the job.
STATUS Status that indicates the job was either successfully completed (DONE) or exited (EXIT).
DISPAT_TIME Time when the job was dispatched to run on the execution hosts.
COMPL_TIME Time when the job exited or completed.
HOG_FACTOR Average hog factor, equal to "CPU time" / "turnaround time".
MEM Maximum resident memory usage of all processes in a job. By default, memory usage is shown in MB. Use LSF_UNIT_FOR_LIMITS in lsf.conf to specify a larger unit for display (MB, GB, TB, PB, or EB).
CWD Current working directory of the job.
SWAP Maximum virtual memory usage of all processes in a job. By default, swap space is shown in MB. Use LSF_UNIT_FOR_LIMITS in lsf.conf to specify a larger unit for display (MB, GB, TB, PB, or EB).
INPUT_FILE File from which the job reads its standard input (see bsub -i input_file).
OUTPUT_FILE File to which the job writes its standard output (see bsub -o output_file).
ERR_FILE File in which the job stores its standard error output (see bsub -e err_file).
EXCEPTION STATUS Possible values for the exception status of a job include:
idle
The job is consuming less CPU time than expected. The job idle factor (CPU time/runtime) is less than the configured JOB_IDLE threshold for the queue and a job exception has been triggered.
overrun
The job is running longer than the number of minutes specified by the JOB_OVERRUN threshold for the queue and a job exception has been triggered.
underrun
The job finished sooner than the number of minutes specified by the JOB_UNDERRUN threshold for the queue and a job exception has been triggered.
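The three definitions above can be restated as simple comparisons. The following Python sketch is an illustration only and not how LSF evaluates exceptions internally: the function name and the threshold values in the usage lines are made up, and treating an unset threshold (None) as not configured is an assumption of the sketch.

```python
def exception_status(cpu_seconds, run_seconds, job_idle=None,
                     job_overrun=None, job_underrun=None):
    """Return the exception keywords that apply to one finished job.

    job_idle is a CPU time/run time ratio; job_overrun and job_underrun
    are run times in minutes. A threshold left as None is skipped.
    """
    run_minutes = run_seconds / 60.0
    status = []
    if job_overrun is not None and run_minutes > job_overrun:
        status.append("overrun")
    if job_underrun is not None and run_minutes < job_underrun:
        status.append("underrun")
    if (job_idle is not None and run_seconds > 0
            and cpu_seconds / run_seconds < job_idle):
        status.append("idle")
    return status


# Job <1948> in the job exception example: 0.20 s of CPU over a 593 s run.
# The thresholds below are invented for illustration.
print(exception_status(0.20, 593, job_idle=0.10, job_overrun=5, job_underrun=2))
# ['overrun', 'idle']
```

The actual thresholds come from the JOB_IDLE, JOB_OVERRUN, and JOB_UNDERRUN settings of the queue.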

Advance Reservations (-U)

Displays the following fields:
RSVID Advance reservation ID assigned by brsvadd command
TYPE Type of reservation: user or system
CREATOR User name of the advance reservation creator, who submitted the brsvadd command
USER User name of the advance reservation user, who submitted the job with bsub -U
NCPUS Number of CPUs reserved
RSV_HOSTS List of hosts for which processors are reserved, and the number of processors reserved
TIME_WINDOW Time window for the reservation.
A one-time reservation displays fields separated by slashes (month/day/hour/minute). For example:
11/12/14/0-11/12/18/0
A recurring reservation displays fields separated by colons (day:hour:minute). For example:
5:18:0 5:20:0
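A one-time TIME_WINDOW value can be turned into a duration with a few lines of code. The following Python sketch is an illustration only: the function name is hypothetical, and because the displayed window carries no year, the caller must supply one. Windows that cross a year boundary and recurring windows are not handled.

```python
from datetime import datetime


def parse_one_time_window(window, year):
    """Parse 'month/day/hour/minute-month/day/hour/minute' into two datetimes.

    The year is not part of the displayed window, so it is passed in
    separately (an assumption of this sketch).
    """
    def parse_point(text):
        month, day, hour, minute = (int(part) for part in text.split("/"))
        return datetime(year, month, day, hour, minute)

    begin_text, end_text = window.split("-")
    return parse_point(begin_text), parse_point(end_text)


begin, end = parse_one_time_window("11/12/14/0-11/12/18/0", year=2008)
print(end - begin)  # 4:00:00
```

Applied to the window 9/16/17/36-9/16/17/38 from the advance reservation example, the same function gives a two-minute duration, which matches the Total duration time reported there.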

Termination reasons displayed by bacct

When LSF detects that a job is terminated, bacct -l displays one of the following termination reasons. The corresponding integer value logged to the JOB_FINISH record in lsb.acct is given in parentheses.
TERM_ADMIN: Job killed by root or LSF administrator (15)
TERM_BUCKET_KILL: Job killed with bkill -b (23)
TERM_CHKPNT: Job killed after checkpointing (13)
TERM_CWD_NOTEXIST: current working directory is not accessible or does not exist on the execution host (25)
TERM_CPULIMIT: Job killed after reaching LSF CPU usage limit (12)
TERM_DEADLINE: Job killed after deadline expires (6)
TERM_EXTERNAL_SIGNAL: Job killed by a signal external to LSF (17)
TERM_FORCE_ADMIN: Job killed by root or LSF administrator without time for cleanup (9)
TERM_FORCE_OWNER: Job killed by owner without time for cleanup (8)
TERM_LOAD: Job killed after load exceeds threshold (3)
TERM_MEMLIMIT: Job killed after reaching LSF memory usage limit (16)
TERM_OWNER: Job killed by owner (14)
TERM_PREEMPT: Job killed after preemption (1)
TERM_PROCESSLIMIT: Job killed after reaching LSF process limit (7)
TERM_REQUEUE_ADMIN: Job killed and requeued by root or LSF administrator (11)
TERM_REQUEUE_OWNER: Job killed and requeued by owner (10)
TERM_RUNLIMIT: Job killed after reaching LSF run time limit (5)
TERM_SLURM: Job terminated abnormally in SLURM (node failure) (22)
TERM_SWAP: Job killed after reaching LSF swap usage limit (20)
TERM_THREADLIMIT: Job killed after reaching LSF thread limit (21)
TERM_UNKNOWN: LSF cannot determine a termination reason; 0 is logged but TERM_UNKNOWN is not displayed (0)
TERM_WINDOW: Job killed after queue run window closed (2)
TERM_ZOMBIE: Job exited while LSF is not available (19)
TIP: The integer values logged to the JOB_FINISH record in lsb.acct and termination reason keywords are mapped in lsbatch.h.
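When processing lsb.acct with a script, the integer values listed above can be translated back into keywords with a lookup table. The following Python sketch is an illustration only: the table is transcribed from the list above, the function name term_reason is hypothetical, and lsbatch.h remains the authoritative mapping.

```python
# Integer values and keywords transcribed from the list above.
TERM_REASONS = {
    0: "TERM_UNKNOWN",
    1: "TERM_PREEMPT",
    2: "TERM_WINDOW",
    3: "TERM_LOAD",
    5: "TERM_RUNLIMIT",
    6: "TERM_DEADLINE",
    7: "TERM_PROCESSLIMIT",
    8: "TERM_FORCE_OWNER",
    9: "TERM_FORCE_ADMIN",
    10: "TERM_REQUEUE_OWNER",
    11: "TERM_REQUEUE_ADMIN",
    12: "TERM_CPULIMIT",
    13: "TERM_CHKPNT",
    14: "TERM_OWNER",
    15: "TERM_ADMIN",
    16: "TERM_MEMLIMIT",
    17: "TERM_EXTERNAL_SIGNAL",
    19: "TERM_ZOMBIE",
    20: "TERM_SWAP",
    21: "TERM_THREADLIMIT",
    22: "TERM_SLURM",
    23: "TERM_BUCKET_KILL",
    25: "TERM_CWD_NOTEXIST",
}


def term_reason(code):
    """Return the keyword for a logged integer, or a note if it is not listed."""
    return TERM_REASONS.get(code, f"unlisted value {code}")


print(term_reason(5))  # TERM_RUNLIMIT
```

Values that do not appear in the list above (for example 4, 18, or 24) are reported as unlisted rather than guessed.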

Example: Default format

bacct
Accounting information about jobs that are:
- submitted by users user1.
- accounted on all projects.
- completed normally or exited.
- executed on all hosts.
- submitted to all queues.
- accounted on all service classes.
------------------------------------------------------------------------------
SUMMARY: ( time unit: second )
Total number of done jobs: 60 Total number of exited jobs: 118
Total CPU time consumed: 1011.5 Average CPU time consumed: 5.7
Maximum CPU time of a job: 991.4 Minimum CPU time of a job: 0.0
Total wait time in queues: 134598.0
Average wait time in queue: 756.2
Maximum wait time in queue: 7069.0 Minimum wait time in queue: 0.0
Average turnaround time: 3585 (seconds/job)
Maximum turnaround time: 77524 Minimum turnaround time: 6
Average hog factor of a job: 0.00 ( cpu time / turnaround time )
Maximum hog factor of a job: 0.56 Minimum hog factor of a job: 0.00
Total throughput: 0.67 (jobs/hour) during 266.18 hours
Beginning time: Aug 8 15:48 Ending time: Aug 19 17:59

Example: Jobs that have triggered job exceptions

bacct -x -l
Accounting information about jobs that are:
- submitted by users user1.
- accounted on all projects.
- completed normally or exited
- executed on all hosts.
- submitted to all queues.
- accounted on all service classes.
------------------------------------------------------------------------------
Job <1743>, User <user1>, Project <default>, Status <DONE>, Queue <normal>, Command <sleep 30>
Mon Aug 11 18:16:17: Submitted from host <hostB>, CWD <$HOME/jobs>, Output File </dev/null>;
Mon Aug 11 18:17:22: Dispatched to <hostC>;
Mon Aug 11 18:18:54: Completed <done>.
EXCEPTION STATUS: underrun
Accounting information about this job:
CPU_T WAIT TURNAROUND STATUS HOG_FACTOR MEM SWAP
0.19 65 157 done 0.0012 4M 5M
------------------------------------------------------------------------------
Job <1948>, User <user1>, Project <default>, Status <DONE>, Queue <normal>, Command <sleep 550>
Tue Aug 12 14:15:03: Submitted from host <hostB>, CWD <$HOME/jobs>, Output File </dev/null>;
Tue Aug 12 14:15:15: Dispatched to <hostC>;
Tue Aug 12 14:25:08: Completed <done>.
EXCEPTION STATUS: overrun idle
Accounting information about this job:
CPU_T WAIT TURNAROUND STATUS HOG_FACTOR MEM SWAP
0.20 12 605 done 0.0003 4M 5M
------------------------------------------------------------------------------
Job <1949>, User <user1>, Project <default>, Status <DONE>, Queue <normal>, Command <sleep 400>
Tue Aug 12 14:26:11: Submitted from host <hostB>, CWD <$HOME/jobs>, Output File </dev/null>;
Tue Aug 12 14:26:18: Dispatched to <hostC>;
Tue Aug 12 14:33:16: Completed <done>.
EXCEPTION STATUS: idle
Accounting information about this job:
CPU_T WAIT TURNAROUND STATUS HOG_FACTOR MEM SWAP
0.17 7 425 done 0.0004 4M 5M
------------------------------------------------------------------------------
Job <719[14]>, Job Name <test[14]>, User <user1>, Project <default>, Status <EXIT>, Queue <normal>, Command </home/user1/job1>
Mon Aug 18 20:27:44: Submitted from host <hostB>, CWD <$HOME/jobs>, Output File </dev/null>;
Mon Aug 18 20:31:16: [14] dispatched to <hostA>;
Mon Aug 18 20:31:18: Completed <exit>.
EXCEPTION STATUS: underrun
Accounting information about this job:
CPU_T WAIT TURNAROUND STATUS HOG_FACTOR MEM SWAP
0.19 212 214 exit 0.0009 2M 4M
------------------------------------------------------------------------------
SUMMARY: ( time unit: second )
Total number of done jobs: 45 Total number of exited jobs: 56
Total CPU time consumed: 1009.1 Average CPU time consumed: 10.0
Maximum CPU time of a job: 991.4 Minimum CPU time of a job: 0.1
Total wait time in queues: 116864.0
Average wait time in queue: 1157.1
Maximum wait time in queue: 7069.0 Minimum wait time in queue: 7.0
Average turnaround time: 1317 (seconds/job)
Maximum turnaround time: 7070 Minimum turnaround time: 10
Average hog factor of a job: 0.01 ( cpu time / turnaround time )
Maximum hog factor of a job: 0.56 Minimum hog factor of a job: 0.00
Total throughput: 0.59 (jobs/hour) during 170.21 hours
Beginning time: Aug 11 18:18 Ending time: Aug 18 20:31

Example: Advance reservation accounting information

bacct -U user1#2
Accounting for:
- advanced reservation IDs: user1#2
- advanced reservations created by user1
-----------------------------------------------------------------------------
RSVID TYPE CREATOR USER NCPUS RSV_HOSTS TIME_WINDOW
user1#2 user user1 user1 1 hostA:1 9/16/17/36-9/16/17/38
SUMMARY:
Total number of jobs: 4
Total CPU time consumed: 0.5 second
Maximum memory of a job: 4.2 MB
Maximum swap of a job: 5.2 MB
Total duration time: 0 hour 2 minute 0 second
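The TIME_WINDOW value can be cross-checked against the total duration reported in the SUMMARY. The sketch below assumes the window is written as month/day/hour/minute-month/day/hour/minute, which this example does not spell out. Because the window carries no year, an arbitrary one is supplied. parse_window is a hypothetical helper.

```python
from datetime import datetime, timedelta


def parse_window(window: str, year: int = 2008) -> timedelta:
    """Return the length of an 'M/D/h/m-M/D/h/m' window (assumed format).

    The year is not part of the window, so an arbitrary one is supplied.
    """
    begin_text, end_text = window.split("-")

    def to_datetime(text: str) -> datetime:
        month, day, hour, minute = (int(part) for part in text.split("/"))
        return datetime(year, month, day, hour, minute)

    return to_datetime(end_text) - to_datetime(begin_text)


duration = parse_window("9/16/17/36-9/16/17/38")
print(duration)  # 0:02:00
```

Under that assumed format the window is two minutes long, which matches the "0 hour 2 minute 0 second" line in the output.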

Example: LSF Job termination reason logging

When a job finishes, LSF reports the last job termination action it took against the job and logs it into lsb.acct.
If a running job exits because of node failure, LSF sets the correct exit information in lsb.acct, lsb.events, and the job output file.
Use bacct -l to view job exit information logged to lsb.acct:
bacct -l 7265
Accounting information about jobs that are:
- submitted by all users.
- accounted on all projects.
- completed normally or exited
- executed on all hosts.
- submitted to all queues.
- accounted on all service classes.
------------------------------------------------------------------------------
Job <7265>, User <lsfadmin>, Project <default>, Status <EXIT>, Queue <normal>,
Command <srun sleep 100000>
Thu Sep 16 15:22:09: Submitted from host <hostA>, CWD <$HOME>;
Thu Sep 16 15:22:20: Dispatched to 4 Hosts/Processors <4*hostA>;
Thu Sep 16 15:22:20: slurm_id=21793;ncpus=4;slurm_alloc=n[13-14];
Thu Sep 16 15:23:21: Completed <exit>; TERM_RUNLIMIT: job killed after reaching
LSF run time limit.
Accounting information about this job:
Share group charged </lsfadmin>
CPU_T WAIT TURNAROUND STATUS HOG_FACTOR MEM SWAP
0.04 11 72 exit 0.0006 0K 0K
------------------------------------------------------------------------------
SUMMARY: ( time unit: second )
Total number of done jobs: 0 Total number of exited jobs: 1
Total CPU time consumed: 0.0 Average CPU time consumed: 0.0
Maximum CPU time of a job: 0.0 Minimum CPU time of a job: 0.0
Total wait time in queues: 11.0
Average wait time in queue: 11.0
Maximum wait time in queue: 11.0 Minimum wait time in queue: 11.0
Average turnaround time: 72 (seconds/job)
Maximum turnaround time: 72 Minimum turnaround time: 72
Average hog factor of a job: 0.00 ( cpu time / turnaround time )
Maximum hog factor of a job: 0.00 Minimum hog factor of a job: 0.00

Files

Reads lsb.acct, lsb.acct.n.

See also

bhist, bsub, bjobs, lsb.acct, brsvadd, brsvs, bsla, lsb.serviceclasses

bapp

Displays information about application profile configuration.

Synopsis

bapp [-l | -w] [application_profile_name ...]
bapp [-h | -V]

Description

Displays information about application profiles configured in lsb.applications.
Returns application name, job slot statistics, and job state statistics for all application profiles.
In MultiCluster, returns the information about all application profiles in the local cluster.
CPU time is normalized.

Options

-w Wide format. Fields are displayed without truncation.
-l Long format with additional information.
Displays the following additional information: application profile description, application profile characteristics and statistics, parameters, resource usage limits, associated commands, and job controls.
application_profile_name ...
Displays information about the specified application profile.
-h Prints command usage to stderr and exits.
-V Prints product release version to stderr and exits.

Default output format

Displays the following fields:
APPLICATION_NAME
The name of the application profile. Application profiles are named to correspond to the type of application that usually runs within them.
NJOBS The total number of job slots held currently by jobs in the application profile. This includes pending, running, suspended and reserved job slots. A parallel job that is running on n processors is counted as n job slots, since it takes n job slots in the application.
PEND The number of job slots used by pending jobs in the application profile.
RUN The number of job slots used by running jobs in the application profile.
SUSP The number of job slots used by suspended jobs in the application profile.

Long output format (-l)

In addition to the above fields, the -l option displays the following:
Description A description of the typical use of the application profile.
PARAMETERS/STATISTICS
SSUSP
The number of job slots in the application profile allocated to jobs that are suspended by LSF because of load levels or run windows.
USUSP
The number of job slots in the application profile allocated to jobs that are suspended by the job submitter or by the LSF administrator.
RSV
The number of job slots in the application profile that are reserved by LSF for pending jobs.
Per-job resource usage limits
The soft resource usage limits that are imposed on the jobs associated with the application profile. These limits are imposed on a per-job and a per-process basis.
The possible per-job limits are:
CPULIMIT
The maximum CPU time a job can use, in minutes, relative to the CPU factor of the named host. CPULIMIT is scaled by the CPU factor of the execution host so that jobs are allowed more time on slower hosts.
MEMLIMIT
The maximum resident set size (RSS) of a process.
By default, the limit is shown in KB. Use LSF_UNIT_FOR_LIMITS in lsf.conf to specify a larger unit for display (MB, GB, TB, PB, or EB).
MEMLIMIT_TYPE
A memory limit is the maximum amount of memory a job is allowed to consume. Jobs that exceed the level are killed. You can specify different types of memory limits to enforce, based on PROCESS, TASK, or JOB (or any combination of the three).
PROCESSLIMIT
The maximum number of concurrent processes allocated to a job.
PROCLIMIT
The maximum number of processors allocated to a job.
SWAPLIMIT
The swap space limit that a job may use.
By default, the limit is shown in KB. Use LSF_UNIT_FOR_LIMITS in lsf.conf to specify a larger unit for display (MB, GB, TB, PB, or EB).
THREADLIMIT
The maximum number of concurrent threads allocated to a job.
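As a rough illustration of the CPULIMIT scaling described in the per-job limits above, the sketch below assumes the limit is multiplied by the CPU factor of the named host and divided by the CPU factor of the execution host. That formula, the helper name, and the CPU factors used are assumptions made for illustration. The text itself only states that jobs are allowed more time on slower hosts.

```python
def scaled_cpu_limit(limit_minutes: float,
                     named_host_cpu_factor: float,
                     execution_host_cpu_factor: float) -> float:
    """Scale a CPU limit to an execution host (assumed formula).

    limit * factor(named host) / factor(execution host): a host with a
    smaller CPU factor (a slower host) ends up with a longer effective limit.
    """
    if execution_host_cpu_factor <= 0:
        raise ValueError("CPU factor must be positive")
    return limit_minutes * named_host_cpu_factor / execution_host_cpu_factor


# Hypothetical CPU factors: the named host has factor 2.0.
print(scaled_cpu_limit(60, 2.0, 1.0))  # slower execution host: 120.0
print(scaled_cpu_limit(60, 2.0, 4.0))  # faster execution host: 30.0
```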
Per-process resource usage limits
The possible UNIX per-process resource limits are:
CORELIMIT
The maximum size of a core file.
By default, the limit is shown in KB. Use LSF_UNIT_FOR_LIMITS in lsf.conf to specify a larger unit for display (MB, GB, TB, PB, or EB).
DATALIMIT
The maximum size of the data segment of a process, in KB. This restricts the amount of memory a process can allocate.
FILELIMIT
The maximum file size a process can create, in KB.
RUNLIMIT
The maximum wall clock time a process can use, in minutes. RUNLIMIT is scaled by the CPU factor of the execution host.
STACKLIMIT
The maximum size of the stack segment of a process. This restricts the amount of memory a process can use for local variables or recursive function calls.
By default, the limit is shown in KB. Use LSF_UNIT_FOR_LIMITS in lsf.conf to specify a larger unit for display (MB, GB, TB, PB, or EB).
CHKPNT_DIR The checkpoint directory, if automatic checkpointing is enabled for the application profile.
CHKPNT_INITPERIOD
The initial checkpoint period in minutes. The periodic checkpoint does not happen until the initial period has elapsed.
CHKPNT_PERIOD The checkpoint period in minutes. The running job is checkpointed automatically every checkpoint period.
CHKPNT_METHOD The checkpoint method.
MIG The migration threshold in minutes. A value of 0 (zero) specifies that a suspended job should be migrated immediately.
Where a host migration threshold is also specified, and is lower than the job value, the host value is used.
PRE_EXEC The pre-execution command for the application profile. The PRE_EXEC command runs on the execution host before the job associated with the application profile is dispatched to the execution host (or to the first host selected for a parallel batch job).
POST_EXEC The post-execution command for the application profile. The POST_EXEC command runs on the execution host after the job finishes.
JOB_INCLUDE_POSTPROC
If JOB_INCLUDE_POSTPROC=Y, post-execution processing of the job is included as part of the job.
JOB_POSTPROC_TIMEOUT
Timeout in minutes for job post-execution processing. If post-execution processing takes longer than the timeout, sbatchd reports that post-execution has failed (POST_ERR status), and kills the process group of the job’s post-execution processes.
REQUEUE_EXIT_VALUES
Jobs that exit with these values are automatically requeued.
RES_REQ Resource requirements of the application profile. Only the hosts that satisfy these resource requirements can be used by the application profile.
JOB_STARTER An executable file that runs immediately prior to the batch job, taking the batch job file as an input argument. All jobs submitted to the application profile are run via the job starter, which is generally used to create a specific execution environment before processing the jobs themselves.
CHUNK_JOB_SIZE Chunk jobs only. Specifies the maximum number of jobs allowed to be dispatched together in a chunk job. All of the jobs in the chunk are scheduled and dispatched as a unit rather than individually.
RERUNNABLE If the RERUNNABLE field displays yes, jobs in the application profile are automatically restarted or rerun if the execution host becomes unavailable. However, a job in the application profile is not restarted if you use bmod to remove the rerunnable option from the job.
RESUME_CONTROL The configured actions for the resume job control.
The configured actions are displayed in the format [action_type, command] where action_type is RESUME.
SUSPEND_CONTROL
The configured actions for the suspend job control.
The configured actions are displayed in the format [action_type, command] where action_type is SUSPEND.
TERMINATE_CONTROL
The configured actions for terminate job control.
The configured actions are displayed in the format [action_type, command] where action_type is TERMINATE.

See also

lsb.applications, lsb.queues, bsub, bjobs, badmin, mbatchd

badmin

Administrative tool for LSF.

Synopsis

badmin subcommand
badmin [-h | -V]

Description

IMPORTANT: This command can only be used by LSF administrators.
badmin provides a set of subcommands to control and monitor LSF. If no subcommands are supplied for badmin, badmin prompts for a subcommand from standard input.
Information about each subcommand is available through the help command.
The badmin subcommands include privileged and non-privileged subcommands.
Privileged subcommands can only be invoked by root or LSF administrators. Privileged subcommands are:
reconfig
mbdrestart
qopen
qclose
qact
qinact
hopen
hclose
hrestart
hshutdown
hstartup
diagnose
The configuration file lsf.sudoers(5) must be set to use the privileged command hstartup by a non-root user.
All other commands are non-privileged commands and can be invoked by any LSF user. If the privileged commands are to be executed by the LSF administrator, badmin must be installed, because it needs to send the request using a privileged port.
For subcommands for which multiple hosts can be specified, do not enclose the host names in quotation marks.
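A wrapper script might check a subcommand against the privileged list in this description before calling badmin. The sketch below is a hypothetical helper, not part of LSF. It only encodes the list given above and treats every other subcommand as non-privileged.

```python
# Privileged badmin subcommands, copied from the list in the description.
PRIVILEGED_SUBCOMMANDS = frozenset({
    "reconfig", "mbdrestart", "qopen", "qclose", "qact", "qinact",
    "hopen", "hclose", "hrestart", "hshutdown", "hstartup", "diagnose",
})


def needs_admin(subcommand: str) -> bool:
    """True if the subcommand is on the privileged list (root or LSF administrator)."""
    return subcommand in PRIVILEGED_SUBCOMMANDS


for name in ("qclose", "qhist", "hstartup"):
    print(name, "privileged" if needs_admin(name) else "non-privileged")
```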

Subcommand synopsis
ckconfig [-v]
diagnose [job_ID ... | "job_ID[index]" ...]
reconfig [-v] [-f]
mbdrestart [-C comment] [-v] [-f]
qopen [-C comment] [queue_name ... | all]
qclose [-C comment] [queue_name ... | all]
qact [-C comment] [queue_name ... | all]
qinact [-C comment] [queue_name ... | all]
qhist [-t time0,time1] [-f logfile_name] [queue_name ...]
hopen [-C comment] [host_name ... | host_group ... | all]
hclose [-C comment] [host_name ... | host_group ... | all]
hrestart [-f] [host_name ... | all]
hshutdown [-f] [host_name ... | all]
hstartup [-f] [host_name ... | all]
hhist [-t time0,time1] [-f logfile_name] [host_name ...]
mbdhist [-t time0,time1] [-f logfile_name]
hist [-t time0,time1] [-f logfile_name]
hghostadd [-C comment] host_group host_name [host_name ...]
hghostdel [-f] [-C comment] host_group host_name [host_name ...]
help [command ...] | ? [command ...]
quit
mbddebug [-c class_name ...] [-l debug_level] [-f logfile_name] [-o]
mbdtime [-l timing_level] [-f logfile_name] [-o]
sbddebug [-c class_name ...] [-l debug_level] [-f logfile_name] [-o] [host_name ...]
sbdtime [-l timing_level] [-f logfile_name] [-o] [host_name ...]
schddebug [-c class_name ...] [-l debug_level] [-f logfile_name] [-o]
schdtime [-l timing_level] [-f logfile_name] [-o]
showconf mbd | [sbd [ host_name … | all ]]
perfmon start [sample_period]| stop | view | setperiod sample_period
-h
-V

Options

subcommand Executes the specified subcommand. See Usage section.
-h Prints command usage to stderr and exits.
-V Prints LSF release version to stderr and exits.

Usage

ckconfig [-v]
Checks LSF configuration files located in the LSB_CONFDIR/cluster_name/configdir directory, and checks LSF_ENVDIR/lsf.licensescheduler.
The LSB_CONFDIR variable is defined in lsf.conf (see lsf.conf(5)) which is in LSF_ENVDIR or /etc (if LSF_ENVDIR is not defined).
By default, badmin ckconfig displays only the result of the configuration file check. If warning errors are found, badmin prompts you to display detailed messages.
-v
Verbose mode. Displays detailed messages about configuration file checking to stderr.
diagnose [job_ID ... | "job_ID[index]" ...]
Displays full pending reason list if CONDENSE_PENDING_REASONS=Y is set in lsb.params. For example:
badmin diagnose 1057
reconfig [-v] [-f]
Dynamically reconfigures LSF without restarting mbatchd.
Configuration files are checked for errors and the results displayed to stderr. If no errors are found in the configuration files, a reconfiguration request is sent to mbatchd and configuration files are reloaded.
With this option, mbatchd and mbschd are not restarted and lsb.events is not replayed. To restart mbatchd and mbschd, and replay lsb.events, use badmin mbdrestart.
When you issue this command, mbatchd is available to service requests while reconfiguration files are reloaded. Configuration changes made since system boot or the last reconfiguration take effect.
If warning errors are found, badmin prompts you to display detailed messages. If fatal errors are found, reconfiguration is not performed, and badmin exits.
If you add a host to a queue or to a host group, the new host is not recognized by jobs that were submitted before you reconfigured. If you want the new host to be recognized, you must use the command badmin mbdrestart.
Resource requirements determined by the queue no longer apply to a running job after running badmin reconfig. For example, if you change the RES_REQ parameter in a queue and reconfigure the cluster, the previous queue-level resource requirements for running jobs are lost.
-v
Verbose mode. Displays detailed messages about the status of the configuration files. Without this option, the default is to display the results of configuration file checking. All messages from the configuration file check are printed to stderr.
-f
Disables interaction and proceeds with reconfiguration if configuration files contain no fatal errors.
mbdrestart [-C comment] [-v] [-f]
Dynamically reconfigures LSF and restarts mbatchd and mbschd.
Configuration files are checked for errors and the results printed to stderr. If no errors are found, configuration files are reloaded, mbatchd and mbschd are restarted, and events in lsb.events are replayed to recover the running state of the last mbatchd. While mbatchd restarts, it is unavailable to service requests.
If warning errors are found, badmin prompts you to display detailed messages. If fatal errors are found, mbatchd and mbschd restart is not performed, and badmin exits.
If lsb.events is large, or many jobs are running, restarting mbatchd can take several minutes. If you only need to reload the configuration files, use badmin reconfig.
-C comment
Logs the text of comment as an administrator comment record to lsb.events. The maximum length of the comment string is 512 characters.
-v
Verbose mode. Displays detailed messages about the status of configuration files. All messages from configuration checking are printed to stderr.
-f
Disables interaction and forces reconfiguration and mbatchd restart to proceed if configuration files contain no fatal errors.
qopen [-C comment] [queue_name ... | all]
Opens specified queues, or all queues if the reserved word all is specified. If no queue is specified, the system default queue is assumed. A queue can accept batch jobs only if it is open.
-C comment
Logs the text of comment as an administrator comment record to lsb.events. The maximum length of the comment string is 512 characters.
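The queue control subcommands (qopen, qclose, qact, qinact) share the same argument shape and the same 512-character limit on the -C comment. A script that drives badmin could validate the comment before building the command line. The sketch below is a hypothetical helper that only assembles an argument list and never runs badmin. The queue name night is made up for the example.

```python
MAX_COMMENT_LENGTH = 512  # limit stated for the -C comment string

QUEUE_CONTROL = ("qopen", "qclose", "qact", "qinact")


def queue_control_argv(subcommand, queues=(), comment=None):
    """Build an argument list such as ['badmin', 'qclose', '-C', text, 'night'].

    Hypothetical helper: it only assembles arguments and never runs badmin.
    """
    if subcommand not in QUEUE_CONTROL:
        raise ValueError(f"not a queue control subcommand: {subcommand}")
    argv = ["badmin", subcommand]
    if comment is not None:
        if len(comment) > MAX_COMMENT_LENGTH:
            raise ValueError("comment exceeds 512 characters")
        argv += ["-C", comment]
    argv += list(queues)  # no queue names leaves the system default queue implied
    return argv


print(queue_control_argv("qclose", ["night"], "maintenance window"))
```

Passing the result to a process launcher as a list, rather than as one shell string, avoids having to quote the comment text.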
qclose [-C comment] [queue_name ... | all]
Closes specified queues, or all queues if the reserved word all is specified. If no queue is specified, the system default queue is assumed. A queue does not accept any job if it is closed.
-C comment
Logs the text as an administrator comment record to lsb.events. The maximum length of the comment string is 512 characters.
qact [-C comment] [queue_name ... | all]
Activates specified queues, or all queues if the reserved word all is specified. If no queue is specified, the system default queue is assumed. Jobs in a queue can be dispatched if the queue is activated.
A queue inactivated by its run windows cannot be reactivated by this command.
-C comment
Logs the text of the comment as an administrator comment record to lsb.events. The maximum length of the comment string is 512 characters.
qinact [-C comment] [queue_name ... | all]
Inactivates specified queues, or all queues if the reserved word all is specified. If no queue is specified, the system default queue is assumed. No job in a queue can be dispatched if the queue is inactivated.
-C comment
Logs the text as an administrator comment record to lsb.events. The maximum length of the comment string is 512 characters.
qhist [-t time0,time1] [-f logfile_name] [queue_name ...]
Displays historical events for specified queues, or for all queues if no queue is specified. Queue events are queue opening, closing, activating and inactivating.
-t time0,time1
Displays only those events that occurred during the period from time0 to time1. See bhist(1) for the time format. The default is to display all queue events in the event log file (see below).
-f logfile_name
Specify the file name of the event log file. Either an absolute or a relative path name may be specified. The default is to use the event log file currently used by the LSF system: LSB_SHAREDIR/cluster_name/logdir/lsb.events. Option -f is useful for offline analysis.
If you specified an administrator comment with the -C option of the queue control commands qclose, qopen, qact, and qinact, qhist displays the comment text.
hopen [-C comment] [host_name ... | host_group ... | all]
Opens batch server hosts. Specify the names of any server hosts or host groups. All batch server hosts are opened if the reserved word all is specified. If no host or host group is specified, the local host is assumed. A host accepts batch jobs if it is open.
IMPORTANT: If EGO-enabled SLA scheduling is configured through ENABLE_DEFAULT_EGO_SLA in lsb.params, and a host is closed by EGO, it cannot be reopened by badmin hopen. Hosts closed by EGO have status closed_EGO in bhosts -l output.
-C comment
Logs the text as an administrator comment record to lsb.events. The maximum length of the comment string is 512 characters.
If you open a host group, each host group member displays with the same comment string.
hclose [-C comment] [host_name ... | host_group ... | all]
Closes batch server hosts. Specify the names of any server hosts or host groups. All batch server hosts are closed if the reserved word all is specified. If no argument is specified, the local host is assumed. A closed host does not accept any new job, but jobs already dispatched to the host are not affected. Note that this is different from a host closed by a window; all jobs on it are suspended in that case.
-C comment
Logs the text as an administrator comment record to lsb.events. The maximum length of the comment string is 512 characters.
If you close a host group, each host group member displays with the same comment string.
hghostadd [-C comment] host_group host_name [host_name ...]
If dynamic host configuration is enabled, dynamically adds hosts to a host group. After receiving the host information from the master LIM, mbatchd dynamically adds the host without triggering a reconfig.
Once the host is added to the group, it is considered to be part of that group with respect to scheduling decision making for both newly submitted jobs and for existing pending jobs.
This command fails if any of the specified host groups or host names are not valid.
RESTRICTION: If EGO-enabled SLA scheduling is configured through ENABLE_DEFAULT_EGO_SLA in lsb.params, you cannot use hghostadd because all host allocation is under control of Platform EGO.
-C comment
Logs the text as an administrator comment record to lsb.events. The maximum length of the comment string is 512 characters.
hghostdel [-f] [-C comment] host_group host_name [host_name ...]
Dynamically deletes hosts from a host group by triggering an mbatchd reconfig.
This command fails if any of the specified host groups or host names are not valid.
CAUTION: If you want to change a dynamic host to a static host, first use the command badmin hghostdel to remove the dynamic host from any host group that it belongs to, and then configure the host as a static host in lsf.cluster.cluster_name.
RESTRICTION: If EGO-enabled SLA scheduling is configured through ENABLE_DEFAULT_EGO_SLA in lsb.params, you cannot use hghostdel because all host allocation is under control of Platform EGO.
hrestart [-f] [host_name ... | all]
Restarts sbatchd on the specified hosts, or on all server hosts if the reserved word all is specified. If no host is specified, the local host is assumed. sbatchd reruns itself from the beginning. This allows new sbatchd binaries to be used.
-f
Disables interaction and does not ask for confirmation for restarting sbatchd.
hshutdown [-f] [host_name ... | all]
Shuts down sbatchd on the specified hosts, or on all batch server hosts if the reserved word all is specified. If no host is specified, the local host is assumed. sbatchd exits upon receiving the request.
-f
Disables interaction and does not ask for confirmation for shutting down sbatchd.
hstartup [-f] [host_name ... | all]
Starts sbatchd on the specified hosts, or on all batch server hosts if the reserved word all is specified. Only root and users listed in the file lsf.sudoers(5) can use the all and -f options. These users must be able to use rsh or ssh on all LSF hosts without having to type in passwords. If no host is specified, the local host is assumed.
The shell command specified by LSF_RSH in lsf.conf is used before rsh is tried.
-f
Disables interaction and does not ask for confirmation for starting sbatchd.
hhist [-t time0,time1] [-f logfile_name] [host_name ...]
Displays historical events for specified hosts, or for all hosts if no host is specified. Host events are host opening and closing.
-t time0,time1
Displays only those events that occurred during the period from time0 to time1. See bhist(1) for the time format. The default is to display all queue events in the event log file (see below).
-f logfile_name
Specify the file name of the event log file. Either an absolute or a relative path name may be specified. The default is to use the event log file currently used by the LSF system: LSB_SHAREDIR/cluster_name/logdir/lsb.events. Option -f is useful for offline analysis.
If you specified an administrator comment with the -C option of the host control commands hclose or hopen, hhist displays the comment text.
mbdhist [-t time0,time1] [-f logfile_name]
Displays historical events for mbatchd. Events describe the starting and exiting of mbatchd.
-t time0,time1
Displays only those events that occurred during the period from time0 to time1. See bhist(1) for the time format. The default is to display all queue events in the event log file (see below).
-f logfile_name
Specify the file name of the event log file. Either an absolute or a relative path name may be specified. The default is to use the event log file currently used by the LSF system: LSB_SHAREDIR/cluster_name/logdir/lsb.events. Option -f is useful for offline analysis.
If you specified an administrator comment with the -C option of the mbdrestart command, mbdhist displays the comment text.
hist [-t time0,time1] [-f logfile_name]
Displays historical events for all the queues, hosts and mbatchd.
-t time0,time1
Displays only those events that occurred during the period from time0 to time1. See bhist(1) for the time format. The default is to display all queue events in the event log file (see below).
-f logfile_name
Specify the file name of the event log file. Either an absolute or a relative path name may be specified. The default is to use the event log file currently used by the LSF system: LSB_SHAREDIR/cluster_name/logdir/lsb.events. Option -f is useful for offline analysis.
If you specified an administrator comment with the -C option of the queue, host, and mbatchd commands, hist displays the comment text.
help [command ...] | ? [command ...]
Displays the syntax and functionality of the specified commands.
quit Exits the badmin session.
mbddebug [-c class_name ...] [-l debug_level] [-f logfile_name] [-o]
Sets message log level for mbatchd to include additional information in log files. You must be root or the LSF administrator to use this command.
See sbddebug for an explanation of options.
mbdtime [-l timing_level] [-f logfile_name] [-o]
Sets timing level for mbatchd to include additional timing information in log files. You must be root or the LSF administrator to use this command. See sbdtime for an explanation of options.
sbddebug [-c class_name ...] [-l debug_level] [-f logfile_name] [-o] [host_name ...]
Sets the message log level for sbatchd to include additional information in log files. You must be root or the LSF administrator to use this command.
In MultiCluster, debug levels can only be set for hosts within the same cluster. For example, you cannot set debug or timing levels from a host in clusterA for a host in clusterB. You need to be on a host in clusterB to set up debug or timing levels for clusterB hosts.
If the command is used without any options, the following default values are used:
class_name=0 (no additional classes are logged)
debug_level=0 (LOG_DEBUG level in parameter LSF_LOG_MASK)
logfile_name=current LSF system log file in the LSF system log file directory, in the format daemon_name.log.host_name
host_name=local host (host from which command was submitted)
-c class_name ...
Specifies software classes for which debug messages are to be logged.
Format of class_name is the name of a class, or a list of class names separated by spaces and enclosed in quotation marks. Classes are also listed in lsf.h.
Valid log classes are:
LC_ADVRSV - Log advance reservation modifications
LC_AFS - Log AFS messages
LC_AUTH - Log authentication messages
LC_CHKPNT - Log checkpointing messages
LC_COMM - Log communication messages
LC_CONF - Print out all parameters in lsb.params
LC_DCE - Log messages pertaining to DCE support
LC_EEVENTD - Log eeventd messages
LC_ELIM - Log ELIM messages
LC_EXEC - Log significant steps for job execution
LC_FAIR - Log fairshare policy messages
LC_FILE - Log file transfer messages
LC_HANG - Mark where a program might hang
LC_JARRAY - Log job array messages
LC_JLIMIT - Log job slot limit messages
LC_LICENSE - Log license management messages (LC_LICENCE is also supported for backward compatibility)
LC_LOADINDX - Log load index messages
LC_M_LOG - Log multievent logging messages
LC_MPI - Log MPI messages
LC_MULTI - Log messages pertaining to MultiCluster
LC_PEND - Log messages related to job pending reasons
LC_PERFM - Log performance messages
LC_PIM - Log PIM messages
LC_PREEMPT - Log preemption policy messages
LC_SIGNAL - Log messages pertaining to signals
LC_SYS - Log system call messages
LC_TRACE - Log significant program walk steps
LC_XDR - Log everything transferred by XDR
Default: 0 (no additional classes are logged)
-l debug_level
Specifies level of detail in debug messages. The higher the number, the more detail that is logged. Higher levels include all lower levels.
Possible values:
0 LOG_DEBUG level in parameter LSF_LOG_MASK in lsf.conf.
1 LOG_DEBUG1 level for extended logging. A higher level includes lower logging levels. For example, LOG_DEBUG3 includes LOG_DEBUG2, LOG_DEBUG1, and LOG_DEBUG levels.
2 LOG_DEBUG2 level for extended logging. A higher level includes lower logging levels. For example, LOG_DEBUG3 includes LOG_DEBUG2, LOG_DEBUG1, and LOG_DEBUG levels.
3 LOG_DEBUG3 level for extended logging. A higher level includes lower logging levels. For example, LOG_DEBUG3 includes LOG_DEBUG2, LOG_DEBUG1, and LOG_DEBUG levels.
Default: 0 (LOG_DEBUG level in parameter LSF_LOG_MASK)
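The rule that a higher debug level includes all lower levels can be written down in a few lines. The sketch below is illustrative only. levels_logged is a hypothetical helper, and the level names are those listed for -l debug_level.

```python
# Level names as listed for -l debug_level (values 0 through 3).
DEBUG_LEVELS = ("LOG_DEBUG", "LOG_DEBUG1", "LOG_DEBUG2", "LOG_DEBUG3")


def levels_logged(debug_level: int) -> list:
    """Return every level that is active when debug_level is requested."""
    if not 0 <= debug_level < len(DEBUG_LEVELS):
        raise ValueError("debug_level must be 0, 1, 2, or 3")
    # A higher level includes all lower levels.
    return list(DEBUG_LEVELS[:debug_level + 1])


print(levels_logged(0))  # ['LOG_DEBUG']
print(levels_logged(2))  # ['LOG_DEBUG', 'LOG_DEBUG1', 'LOG_DEBUG2']
```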
-f logfile_name
Specify the name of the file into which debugging messages are to be logged. A file name with or without a full path may be specified.
If a file name without a path is specified, the file is saved in the LSF system log directory.
The name of the file that is created has the following format:
logfile_name.daemon_name.log.host_name
On UNIX, if the specified path is not valid, the log file is created in the /tmp directory.
On Windows, if the specified path is not valid, no log file is created.
Default: current LSF system log file in the LSF system log file directory.
-o
Turns off temporary debug settings and resets them to the daemon starting state. The message log level is reset back to the value of LSF_LOG_MASK and classes are reset to the value of LSB_DEBUG_MBD, LSB_DEBUG_SBD.
The log file is also reset back to the default log file.
host_name ...
Optional. Sets debug settings on the specified host or hosts.
Lists of host names must be separated by spaces and enclosed in quotation marks.
Default: local host (host from which command was submitted)
sbdtime [-l timing_level] [-f logfile_name] [-o] [host_name ...]
Sets the timing level for sbatchd to include additional timing information in log files. You must be root or the LSF administrator to use this command.
In MultiCluster, timing levels can only be set for hosts within the same cluster. For example, you could not set debug or timing levels from a host in clusterA for a host in clusterB. You need to be on a host in clusterB to set up debug or timing levels for clusterB hosts.
If the command is used without any options, the following default values are used:
timing_level=no timing information is recorded
logfile_name=current LSF system log file in the LSF system log file directory, in the format daemon_name.log.host_name
host_name=local host (host from which command was submitted)
-l timing_level
Specifies detail of timing information that is included in log files. Timing messages indicate the execution time of functions in the software and are logged in milliseconds.
Valid values: 1 | 2 | 3 | 4 | 5