HP Platform LSF Command Reference Guide

Platform™ LSF™ Command Reference
Version 7 Update 3
Release date: May 2008
Last modified: May 16, 2008
Comments to: doc@platform.com
Support: support@platform.com
Copyright © 1994-2008, Platform Computing Inc.
Although the information in this document has been carefully reviewed, Platform Computing Inc. (“Platform”) does not warrant it to be free of errors or omissions. Platform reserves the right to make corrections, updates, revisions or changes to the information in this document.
UNLESS OTHERWISE EXPRESSLY STATED BY PLATFORM, THE PROGRAM DESCRIBED IN THIS DOCUMENT IS PROVIDED “AS IS” AND WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT WILL PLATFORM COMPUTING BE LIABLE TO ANYONE FOR SPECIAL, COLLATERAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING WITHOUT LIMITATION ANY LOST PROFITS, DATA, OR SAVINGS, ARISING OUT OF THE USE OF OR INABILITY TO USE THIS PROGRAM.
We’d like to hear from you
You can help us make this document better by telling us what you think of the content, organization, and usefulness of the information. If you find an error, or just want to make a suggestion for improving this document, please address your comments to doc@platform.com.
Your comments should pertain only to Platform documentation. For product support, contact support@platform.com.
Document redistribution and translation
This document is protected by copyright and you may not redistribute or translate it into another language, in part or in whole.
Internal redistribution
You may only redistribute this document internally within your organization (for example, on an intranet) provided that you continue to check the Platform Web site for updates and update your version of the documentation. You may not make it available to your organization over the Internet.
Trademarks
LSF is a registered trademark of Platform Computing Inc. in the United States and in other jurisdictions.
ACCELERATING INTELLIGENCE, PLATFORM COMPUTING, PLATFORM SYMPHONY, PLATFORM JOBSCHEDULER, PLATFORM ENTERPRISE GRID ORCHESTRATOR, PLATFORM EGO, and the PLATFORM and PLATFORM LSF logos are trademarks of Platform Computing Inc. in the United States and in other jurisdictions.
UNIX is a registered trademark of The Open Group in the United States and in other jurisdictions.
Microsoft is either a registered trademark or a trademark of Microsoft Corporation in the United States and/or other countries.
Windows is a registered trademark of Microsoft Corporation in the United States and other countries.
Other products or services mentioned in this document are identified by the trademarks or service marks of their respective owners.
Third-party license agreements
http://www.platform.com/Company/third.part.license.htm
Third-party copyright notices
http://www.platform.com/Company/Third.Party.Copyright.htm
Contents
bacct
bapp
badmin
bbot
bchkpnt
bclusters
bgadd
bgdel
bhist
bhosts
bhpart
bgmod
bjgroup
bjobs
bkill
bladmin
blaunch
blcollect
blhosts
blimits
blinfo
blkill
blparams
blplugins
blstat
bltasks
blusers
bmgroup
bmig
bmod
bparams
bpeek
bpost
bqueues
bread
brequeue
bresources
brestart
bresume
brlainfo
brsvadd
brsvdel
brsvmod
brsvs
brun
bsla
bslots
bstatus
bstop
bsub
bswitch
btop
bugroup
busers
ch
lsacct
lsacctmrg
lsadmin
lsclusters
lseligible
lsfinstall
lsfmon
lsfrestart
lsfshutdown
lsfstartup
lsgrun
lshosts
lsid
lsinfo
lsload
lsloadadj
lslogin
lsltasks
lsmon
lspasswd
lsplace
lsrcp
lsrtasks
lsrun
lstcsh
pam
patchinstall
pmcadmin
pmcremoverc
pmcsetrc
perfadmin
perfremoverc
perfsetrc
pversions (Windows)
pversions (UNIX)
ssacct
ssched
taskman
tspeek
tssub
wgpasswd
wguser
Index

bacct

Displays accounting statistics about finished jobs.

Synopsis

bacct [-b | -l] [-d] [-e] [-w] [-x] [-app application_profile_name] [-C time0,time1] [-D time0,time1] [-f logfile_name] [-Lp ls_project_name ...] [-m host_name ...|-M host_list_file] [-N host_name | -N host_model | -N cpu_factor] [-P project_name ...] [-q queue_name ...] [-sla service_class_name ...] [-S time0,time1] [-u user_name ... | -u all]
bacct [-b | -l] [-f logfile_name] [job_ID ...]
bacct [-U reservation_ID ... | -U all] [-u user_name ... | -u all]
bacct [-h | -V]

Description

Displays a summary of accounting statistics for all finished jobs (with a DONE or EXIT status) submitted by the user who invoked the command, on all hosts, projects, and queues in the LSF system. bacct displays statistics for all jobs logged in the current LSF accounting log file: LSB_SHAREDIR/cluster_name/logdir/lsb.acct.
CPU time is not normalized.
All times are in seconds.
Statistics not reported by bacct but of interest to individual system administrators can be generated by directly using awk or perl to process the lsb.acct file.

Throughput calculation

The throughput (T) of the LSF system, certain hosts, or certain queues is calculated by the formula:
T = N/(ET-BT)
where:
N is the total number of jobs for which accounting statistics are reported
BT is the Start time, when the first job was logged
ET is the End time, when the last job was logged
You can use the option -C time0,time1 to specify the Start time as time0 and the End time as time1. In this way, you can examine throughput during a specific time period.
Jobs involved in the throughput calculation are only those being logged (that is, with a DONE or EXIT status). Jobs that are running, suspended, or that have never been dispatched after submission are not considered, because they are still in the LSF system and not logged in lsb.acct.
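The formula T = N/(ET-BT) can be checked by hand against a bacct summary. The following Python sketch is illustrative only: the function name throughput_jobs_per_hour is hypothetical, the figures come from the default-format example later in this section, and the year is an assumption because bacct prints only month, day, and time.

```python
from datetime import datetime

def throughput_jobs_per_hour(n_jobs: int, begin: datetime, end: datetime) -> float:
    """Compute T = N / (ET - BT), with the interval expressed in hours."""
    hours = (end - begin).total_seconds() / 3600.0
    if hours <= 0:
        raise ValueError("end time must be later than begin time")
    return n_jobs / hours

# Figures from the default-format example: 60 done + 118 exited jobs,
# logged between Aug 8 15:48 and Aug 19 17:59 (the year 2008 is assumed).
n = 60 + 118
bt = datetime(2008, 8, 8, 15, 48)
et = datetime(2008, 8, 19, 17, 59)
print(f"{throughput_jobs_per_hour(n, bt, et):.2f} jobs/hour")  # 0.67 jobs/hour
```

Because the summary truncates the beginning and ending times to the minute, a hand calculation may differ slightly from the value bacct reports.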
The total throughput of the LSF system can be calculated by specifying -u all without any of the -m, -q, -S, -D or job_ID options. The throughput of certain hosts can be calculated by specifying -u all without the -q, -S, -D or job_ID options. The throughput of certain queues can be calculated by specifying -u all without the -m, -S, -D or job_ID options.
bacct does not show local pending batch jobs killed using bkill -b. bacct shows MultiCluster jobs and local running jobs even if they are killed using bkill -b.

Options

-b Brief format.
-d Displays accounting statistics for successfully completed jobs (with a DONE status).
-e Displays accounting statistics for exited jobs (with an EXIT status).
-l Long format with additional detail.
-w Wide field format.
-x Displays jobs that have triggered a job exception (overrun, underrun, idle). Use with the -l option to show the exception status for individual jobs.
-app application_profile_name
Displays accounting information about jobs submitted to the specified application profile. You must specify an existing application profile configured in lsb.applications.
-C time0,time1 Displays accounting statistics for jobs that completed or exited during the specified time interval. Reads lsb.acct and all archived log files (lsb.acct.n) unless -f is also used.
The time format is the same as in bhist(1).
-D time0,time1 Displays accounting statistics for jobs dispatched during the specified time interval. Reads lsb.acct and all archived log files (lsb.acct.n) unless -f is also used.
The time format is the same as in bhist(1).
-f logfile_name Searches the specified job log file for accounting statistics. Specify either an absolute or relative path.
Useful for offline analysis.
The specified file path can contain up to 4094 characters for UNIX, or up to 255 characters for Windows.
-Lp ls_project_name ... Displays accounting statistics for jobs belonging to the specified License Scheduler projects. If a list of projects is specified, project names must be separated by spaces and enclosed in quotation marks (") or (').
-M host_list_file Displays accounting statistics for jobs dispatched to the hosts listed in a file (host_list_file) containing a list of hosts. The host list file has the following format:
Multiple lines are supported
Each line includes a list of hosts separated by spaces
The length of each line must be less than 512 characters
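The three format rules for the host list file can be checked before the file is passed to -M. This is a minimal Python sketch, not part of LSF: the function name parse_host_list and the sample host names are hypothetical, and it only enforces the rules listed above.

```python
def parse_host_list(text: str, max_line_len: int = 512) -> list[str]:
    """Return the host names in a host list file, enforcing the line-length rule."""
    hosts: list[str] = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        # Each line must be shorter than 512 characters.
        if len(line) >= max_line_len:
            raise ValueError(
                f"line {lineno} has {len(line)} characters; must be < {max_line_len}"
            )
        # Hosts on a line are separated by spaces; multiple lines are allowed.
        hosts.extend(line.split())
    return hosts

sample = "hostA hostB\nhostC\n"
print(parse_host_list(sample))  # ['hostA', 'hostB', 'hostC']
```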
-m host_name ...
Displays accounting statistics for jobs dispatched to the specified hosts.
If a list of hosts is specified, host names must be separated by spaces and enclosed in quotation marks (") or (').
-N host_name | -N host_model | -N cpu_factor
Normalizes CPU time by the CPU factor of the specified host or host model, or by the specified CPU factor.
If you use bacct offline by indicating a job log file, you must specify a CPU factor.
-P project_name ... Displays accounting statistics for jobs belonging to the specified projects. If a list of projects is specified, project names must be separated by spaces and enclosed in quotation marks (") or (').
-q queue_name ... Displays accounting statistics for jobs submitted to the specified queues.
If a list of queues is specified, queue names must be separated by spaces and enclosed in quotation marks (") or (').
-S time0,time1 Displays accounting statistics for jobs submitted during the specified time interval. Reads lsb.acct and all archived log files (lsb.acct.n) unless -f is also used.
The time format is the same as in bhist(1).
-sla service_class_name
Displays accounting statistics for jobs that ran under the specified service class.
If a default system service class is configured with ENABLE_DEFAULT_EGO_SLA in lsb.params but not explicitly configured in lsb.applications, bacct -sla service_class_name displays accounting information for the specified default service class.
-U reservation_id ... | -U all
Displays accounting statistics for the specified advance reservation IDs, or for all reservation IDs if the keyword all is specified.
A list of reservation IDs must be separated by spaces and enclosed in quotation marks (") or (').
The -U option also displays historical information about reservation modifications.
When combined with the -U option, -u is interpreted as the user name of the reservation creator. For example:
bacct -U all -u user2
shows all the advance reservations created by user user2.
Without the -u option, bacct -U shows all advance reservation information about jobs submitted by the user.
In a MultiCluster environment, advance reservation information is only logged in the execution cluster, so bacct displays advance reservation information for local reservations only. You cannot see information about remote reservations. You cannot specify a remote reservation ID, and the keyword all only displays information about reservations in the local cluster.
-u user_name ...|-u all Displays accounting statistics for jobs submitted by the specified users, or by all users if the keyword all is specified.
If a list of users is specified, user names must be separated by spaces and enclosed in quotation marks (") or ('). You can specify both user names and user IDs in the list of users.
job_ID ... Displays accounting statistics for jobs with the specified job IDs.
If the reserved job ID 0 is used, it is ignored.
-h Prints command usage to stderr and exits.
-V Prints LSF release version to stderr and exits.
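Several options above take a list of names that must be separated by spaces and enclosed in quotation marks. When bacct is launched from a script, the same effect can be had by passing the whole list as a single argument. The Python sketch below only builds and prints a command line, it does not run bacct. The helper name bacct_argv is hypothetical, and it covers only the -u, -m, -q, and -P options described above.

```python
import shlex

def bacct_argv(users=None, hosts=None, queues=None, projects=None):
    """Build an argument vector for bacct from optional lists of names."""
    argv = ["bacct"]
    for flag, names in (("-u", users), ("-m", hosts), ("-q", queues), ("-P", projects)):
        if names:
            # One argv element holding the space-separated list is what the
            # shell produces when the list is enclosed in quotation marks.
            argv += [flag, " ".join(names)]
    return argv

argv = bacct_argv(users=["all"], hosts=["hostA", "hostB"])
print(shlex.join(argv))  # bacct -u all -m 'hostA hostB'
```

Keeping the list in one argument avoids the second host name being read as a job ID or as a separate option value.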

Default output format (SUMMARY)

Statistics on jobs. The following fields are displayed:
Total number of done jobs
Total number of exited jobs
Total CPU time consumed
Average CPU time consumed
Maximum CPU time of a job
Minimum CPU time of a job
Total wait time in queues
Average wait time in queue
Maximum wait time in queue
Minimum wait time in queue
Average turnaround time (seconds/job)
Maximum turnaround time
Minimum turnaround time
Average hog factor of a job (cpu time/turnaround time)
Maximum hog factor of a job
Minimum hog factor of a job
Total throughput
Beginning time: the completion or exit time of the first job selected
Ending time: the completion or exit time of the last job selected
The total, average, minimum, and maximum statistics are on all specified jobs.
The wait time is the elapsed time from job submission to job dispatch.
The turnaround time is the elapsed time from job submission to job completion.
The hog factor is the amount of CPU time consumed by a job divided by its turnaround time.
The throughput is the number of completed jobs divided by the time period to finish these jobs (jobs/hour).
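The wait time, turnaround time, and hog factor definitions above can be reproduced from the three timestamps bacct -l prints for a job. The Python sketch below is illustrative: the function name job_metrics is hypothetical, the timestamps and CPU time are those of job 1743 in the bacct -x -l example later in this section, and the year is an assumption.

```python
from datetime import datetime

def job_metrics(submit: datetime, dispatch: datetime, finish: datetime, cpu_seconds: float):
    """Return (wait, turnaround, hog_factor); times are in seconds."""
    wait = (dispatch - submit).total_seconds()       # submission to dispatch
    turnaround = (finish - submit).total_seconds()   # submission to completion
    # Guarding against a zero turnaround is an assumption of this sketch.
    hog_factor = cpu_seconds / turnaround if turnaround > 0 else 0.0
    return wait, turnaround, hog_factor

# Job 1743 from the bacct -x -l example (the year 2008 is assumed).
wait, turnaround, hog = job_metrics(
    datetime(2008, 8, 11, 18, 16, 17),  # submitted
    datetime(2008, 8, 11, 18, 17, 22),  # dispatched
    datetime(2008, 8, 11, 18, 18, 54),  # completed
    cpu_seconds=0.19,
)
print(wait, turnaround, round(hog, 4))  # 65.0 157.0 0.0012
```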

Brief format (-b)

In addition to the default format SUMMARY, displays the following fields:
U/UID Name of the user who submitted the job. If LSF fails to get the user name by getpwuid(3), the user ID is displayed.
QUEUE Queue to which the job was submitted.
SUBMIT_TIME Time when the job was submitted.
CPU_T CPU time consumed by the job.
WAIT Wait time of the job.
TURNAROUND Turnaround time of the job.
FROM Host from which the job was submitted.
EXEC_ON Host or hosts to which the job was dispatched to run.
JOB_NAME The job name assigned by the user, or the command string assigned by default at job submission with bsub. If the job name is too long to fit in this field, then only the latter part of the job name is displayed.
The displayed job name or job command can contain up to 4094 characters for UNIX, or up to 255 characters for Windows.

Long format (-l)

In addition to the fields displayed by default in SUMMARY and by -b, displays the following fields:
JOBID Identifier that LSF assigned to the job.
PROJECT_NAME Project name assigned to the job.
STATUS Status that indicates the job was either successfully completed (DONE) or exited (EXIT).
DISPAT_TIME Time when the job was dispatched to run on the execution hosts.
COMPL_TIME Time when the job exited or completed.
HOG_FACTOR Average hog factor, equal to "CPU time" / "turnaround time".
MEM Maximum resident memory usage of all processes in a job. By default, memory usage is shown in MB. Use LSF_UNIT_FOR_LIMITS in lsf.conf to specify a larger unit for display (MB, GB, TB, PB, or EB).
CWD Current working directory of the job.
SWAP Maximum virtual memory usage of all processes in a job. By default, swap space is shown in MB. Use LSF_UNIT_FOR_LIMITS in lsf.conf to specify a larger unit for display (MB, GB, TB, PB, or EB).
INPUT_FILE File from which the job reads its standard input (see bsub -i input_file).
OUTPUT_FILE File to which the job writes its standard output (see bsub -o output_file).
ERR_FILE File in which the job stores its standard error output (see bsub -e err_file).
EXCEPTION STATUS Possible values for the exception status of a job include:
idle
The job is consuming less CPU time than expected. The job idle factor (CPU time/runtime) is less than the configured JOB_IDLE threshold for the queue and a job exception has been triggered.
overrun
The job is running longer than the number of minutes specified by the JOB_OVERRUN threshold for the queue and a job exception has been triggered.
underrun
The job finished sooner than the number of minutes specified by the JOB_UNDERRUN threshold for the queue and a job exception has been triggered.

Advance Reservations (-U)

Displays the following fields:
RSVID Advance reservation ID assigned by brsvadd command
TYPE Type of reservation: user or system
CREATOR User name of the advance reservation creator, who submitted the brsvadd command
USER User name of the advance reservation user, who submitted the job with bsub -U
NCPUS Number of CPUs reserved
RSV_HOSTS List of hosts for which processors are reserved, and the number of processors reserved
TIME_WINDOW Time window for the reservation.
A one-time reservation displays fields separated by slashes (month/day/hour/minute). For example:
11/12/14/0-11/12/18/0
A recurring reservation displays fields separated by colons (day:hour:minute). For example:
5:18:0 5:20:0
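A script that post-processes bacct -U output may need to tell the two TIME_WINDOW forms apart. The following Python sketch is one way to do so, under stated assumptions: the function name parse_time_window is hypothetical, and the start and end values are assumed to be separated by a hyphen or by whitespace, as in the two examples above.

```python
import re

def parse_time_window(window: str):
    """Split a TIME_WINDOW value into (kind, start_fields, end_fields)."""
    # Assumption: start and end are separated by "-" or by whitespace.
    parts = re.split(r"[-\s]+", window.strip())
    if len(parts) != 2:
        raise ValueError(f"expected a start and an end time: {window!r}")
    start, end = parts
    if "/" in start:
        kind, sep, names = "one-time", "/", ("month", "day", "hour", "minute")
    else:
        kind, sep, names = "recurring", ":", ("day", "hour", "minute")

    def fields(text):
        values = [int(v) for v in text.split(sep)]
        if len(values) != len(names):
            raise ValueError(f"expected {len(names)} fields in {text!r}")
        return dict(zip(names, values))

    return kind, fields(start), fields(end)

print(parse_time_window("11/12/14/0-11/12/18/0"))
print(parse_time_window("5:18:0 5:20:0"))
```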

Termination reasons displayed by bacct

When LSF detects that a job is terminated, bacct -l displays one of the following termination reasons. The corresponding integer value logged to the JOB_FINISH record in lsb.acct is given in parentheses.
TERM_ADMIN: Job killed by root or LSF administrator (15)
TERM_BUCKET_KILL: Job killed with bkill -b (23)
TERM_CHKPNT: Job killed after checkpointing (13)
TERM_CWD_NOTEXIST: Current working directory is not accessible or does not exist on the execution host (25)
TERM_CPULIMIT: Job killed after reaching LSF CPU usage limit (12)
TERM_DEADLINE: Job killed after deadline expires (6)
TERM_EXTERNAL_SIGNAL: Job killed by a signal external to LSF (17)
TERM_FORCE_ADMIN: Job killed by root or LSF administrator without time for cleanup (9)
TERM_FORCE_OWNER: Job killed by owner without time for cleanup (8)
TERM_LOAD: Job killed after load exceeds threshold (3)
TERM_MEMLIMIT: Job killed after reaching LSF memory usage limit (16)
TERM_OWNER: Job killed by owner (14)
TERM_PREEMPT: Job killed after preemption (1)
TERM_PROCESSLIMIT: Job killed after reaching LSF process limit (7)
TERM_REQUEUE_ADMIN: Job killed and requeued by root or LSF administrator (11)
TERM_REQUEUE_OWNER: Job killed and requeued by owner (10)
TERM_RUNLIMIT: Job killed after reaching LSF run time limit (5)
TERM_SLURM: Job terminated abnormally in SLURM (node failure) (22)
TERM_SWAP: Job killed after reaching LSF swap usage limit (20)
TERM_THREADLIMIT: Job killed after reaching LSF thread limit (21)
TERM_UNKNOWN: LSF cannot determine a termination reason; 0 is logged but TERM_UNKNOWN is not displayed (0)
TERM_WINDOW: Job killed after queue run window closed (2)
TERM_ZOMBIE: Job exited while LSF is not available (19)
TIP: The integer values logged to the JOB_FINISH record in lsb.acct and termination reason keywords are mapped in lsbatch.h.
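Scripts that read the integer value directly from lsb.acct can translate it back to a keyword with a lookup table. The Python sketch below copies its values from the list above rather than from lsbatch.h, so it covers only the reasons documented here. The names TERM_REASONS and term_reason are hypothetical.

```python
# Integer values and keywords copied from the termination reason list above.
TERM_REASONS = {
    0: "TERM_UNKNOWN", 1: "TERM_PREEMPT", 2: "TERM_WINDOW", 3: "TERM_LOAD",
    5: "TERM_RUNLIMIT", 6: "TERM_DEADLINE", 7: "TERM_PROCESSLIMIT",
    8: "TERM_FORCE_OWNER", 9: "TERM_FORCE_ADMIN", 10: "TERM_REQUEUE_OWNER",
    11: "TERM_REQUEUE_ADMIN", 12: "TERM_CPULIMIT", 13: "TERM_CHKPNT",
    14: "TERM_OWNER", 15: "TERM_ADMIN", 16: "TERM_MEMLIMIT",
    17: "TERM_EXTERNAL_SIGNAL", 19: "TERM_ZOMBIE", 20: "TERM_SWAP",
    21: "TERM_THREADLIMIT", 22: "TERM_SLURM", 23: "TERM_BUCKET_KILL",
    25: "TERM_CWD_NOTEXIST",
}

def term_reason(code: int) -> str:
    """Return the keyword for a logged integer value, or a placeholder if unlisted."""
    return TERM_REASONS.get(code, f"unlisted ({code})")

print(term_reason(5))  # TERM_RUNLIMIT
print(term_reason(4))  # unlisted (4)
```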

Example: Default format

bacct
Accounting information about jobs that are:
- submitted by users user1.
- accounted on all projects.
- completed normally or exited.
- executed on all hosts.
- submitted to all queues.
- accounted on all service classes.
------------------------------------------------------------------------------
SUMMARY: ( time unit: second )
Total number of done jobs: 60 Total number of exited jobs: 118
Total CPU time consumed: 1011.5 Average CPU time consumed: 5.7
Maximum CPU time of a job: 991.4 Minimum CPU time of a job: 0.0
Total wait time in queues: 134598.0
Average wait time in queue: 756.2
Maximum wait time in queue: 7069.0 Minimum wait time in queue: 0.0
Average turnaround time: 3585 (seconds/job)
Maximum turnaround time: 77524 Minimum turnaround time: 6
Average hog factor of a job: 0.00 ( cpu time / turnaround time )
Maximum hog factor of a job: 0.56 Minimum hog factor of a job: 0.00
Total throughput: 0.67 (jobs/hour) during 266.18 hours
Beginning time: Aug 8 15:48 Ending time: Aug 19 17:59

Example: Jobs that have triggered job exceptions

bacct -x -l
Accounting information about jobs that are:
- submitted by users user1.
- accounted on all projects.
- completed normally or exited
- executed on all hosts.
- submitted to all queues.
- accounted on all service classes.
------------------------------------------------------------------------------
Job <1743>, User <user1>, Project <default>, Status <DONE>, Queue <normal>, Command <sleep 30>
Mon Aug 11 18:16:17: Submitted from host <hostB>, CWD <$HOME/jobs>, Output File </dev/null>;
Mon Aug 11 18:17:22: Dispatched to <hostC>;
Mon Aug 11 18:18:54: Completed <done>.
EXCEPTION STATUS: underrun
Accounting information about this job:
CPU_T WAIT TURNAROUND STATUS HOG_FACTOR MEM SWAP
0.19 65 157 done 0.0012 4M 5M
------------------------------------------------------------------------------
Job <1948>, User <user1>, Project <default>, Status <DONE>, Queue <normal>, Command <sleep 550>
Tue Aug 12 14:15:03: Submitted from host <hostB>, CWD <$HOME/jobs>, Output File </dev/null>;
Tue Aug 12 14:15:15: Dispatched to <hostC>;
Tue Aug 12 14:25:08: Completed <done>.
EXCEPTION STATUS: overrun idle
Accounting information about this job:
CPU_T WAIT TURNAROUND STATUS HOG_FACTOR MEM SWAP
0.20 12 605 done 0.0003 4M 5M
------------------------------------------------------------------------------
Job <1949>, User <user1>, Project <default>, Status <DONE>, Queue <normal>, Command <sleep 400>
Tue Aug 12 14:26:11: Submitted from host <hostB>, CWD <$HOME/jobs>, Output File </dev/null>;
Tue Aug 12 14:26:18: Dispatched to <hostC>;
Tue Aug 12 14:33:16: Completed <done>.
EXCEPTION STATUS: idle
Accounting information about this job:
CPU_T WAIT TURNAROUND STATUS HOG_FACTOR MEM SWAP
0.17 7 425 done 0.0004 4M 5M
------------------------------------------------------------------------------
Job <719[14]>, Job Name <test[14]>, User <user1>, Project <default>, Status <EXIT>, Queue <normal>, Command </home/user1/job1>
Mon Aug 18 20:27:44: Submitted from host <hostB>, CWD <$HOME/jobs>, Output File </dev/null>;
Mon Aug 18 20:31:16: [14] dispatched to <hostA>;
Mon Aug 18 20:31:18: Completed <exit>.
EXCEPTION STATUS: underrun
Accounting information about this job:
CPU_T WAIT TURNAROUND STATUS HOG_FACTOR MEM SWAP
0.19 212 214 exit 0.0009 2M 4M
------------------------------------------------------------------------------
SUMMARY: ( time unit: second )
Total number of done jobs: 45 Total number of exited jobs: 56
Total CPU time consumed: 1009.1 Average CPU time consumed: 10.0
Maximum CPU time of a job: 991.4 Minimum CPU time of a job: 0.1
Total wait time in queues: 116864.0
Average wait time in queue: 1157.1
Maximum wait time in queue: 7069.0 Minimum wait time in queue: 7.0
Average turnaround time: 1317 (seconds/job)
Maximum turnaround time: 7070 Minimum turnaround time: 10
Average hog factor of a job: 0.01 ( cpu time / turnaround time )
Maximum hog factor of a job: 0.56 Minimum hog factor of a job: 0.00
Total throughput: 0.59 (jobs/hour) during 170.21 hours
Beginning time: Aug 11 18:18 Ending time: Aug 18 20:31

Example: Advance reservation accounting information

bacct -U user1#2
Accounting for:
- advanced reservation IDs: user1#2
- advanced reservations created by user1
------------------------------------------------------------------------------
RSVID TYPE CREATOR USER NCPUS RSV_HOSTS TIME_WINDOW
user1#2 user user1 user1 1 hostA:1 9/16/17/36-9/16/17/38
SUMMARY:
Total number of jobs: 4
Total CPU time consumed: 0.5 second
Maximum memory of a job: 4.2 MB
Maximum swap of a job: 5.2 MB
Total duration time: 0 hour 2 minute 0 second
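As a cross-check of the reservation output above, the following Python sketch computes the length of the TIME_WINDOW value. It assumes the window is written as month/day/hour/minute-month/day/hour/minute; that reading is an assumption, not something this section states. The function name and the `year` parameter are hypothetical (the output carries no year, so any year that contains both dates works).

```python
from datetime import datetime


def window_seconds(time_window: str, year: int = 2004) -> int:
    """Length in seconds of a 'M/D/h/m-M/D/h/m' reservation window.

    The year is an arbitrary assumption; the bacct output does not show one.
    """
    def parse(stamp: str) -> datetime:
        month, day, hour, minute = (int(part) for part in stamp.split("/"))
        return datetime(year, month, day, hour, minute)

    begin, end = time_window.split("-")
    return int((parse(end) - parse(begin)).total_seconds())


print(window_seconds("9/16/17/36-9/16/17/38"))
```

Under that assumption the window is 120 seconds long, which agrees with the "Total duration time: 0 hour 2 minute 0 second" line.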

Example: LSF Job termination reason logging

When a job finishes, LSF reports the last job termination action it took against the job and logs it into lsb.acct.
If a running job exits because of node failure, LSF sets the correct exit information in lsb.acct, lsb.events, and the job output file.
Use bacct -l to view job exit information logged to lsb.acct:
bacct -l 7265
Accounting information about jobs that are:
- submitted by all users.
- accounted on all projects.
- completed normally or exited
- executed on all hosts.
- submitted to all queues.
- accounted on all service classes.
------------------------------------------------------------------------------
Job <7265>, User <lsfadmin>, Project <default>, Status <EXIT>, Queue <normal>,
Command <srun sleep 100000>
Thu Sep 16 15:22:09: Submitted from host <hostA>, CWD <$HOME>;
Thu Sep 16 15:22:20: Dispatched to 4 Hosts/Processors <4*hostA>;
Thu Sep 16 15:22:20: slurm_id=21793;ncpus=4;slurm_alloc=n[13-14];
Thu Sep 16 15:23:21: Completed <exit>; TERM_RUNLIMIT: job killed after reaching
LSF run time limit.
Accounting information about this job:
Share group charged </lsfadmin>
CPU_T WAIT TURNAROUND STATUS HOG_FACTOR MEM SWAP
0.04 11 72 exit 0.0006 0K 0K
------------------------------------------------------------------------------
SUMMARY: ( time unit: second )
Total number of done jobs: 0 Total number of exited jobs: 1
Total CPU time consumed: 0.0 Average CPU time consumed: 0.0
Maximum CPU time of a job: 0.0 Minimum CPU time of a job: 0.0
Total wait time in queues: 11.0
Average wait time in queue: 11.0
Maximum wait time in queue: 11.0 Minimum wait time in queue: 11.0
Average turnaround time: 72 (seconds/job)
Maximum turnaround time: 72 Minimum turnaround time: 72
Average hog factor of a job: 0.00 ( cpu time / turnaround time )
Maximum hog factor of a job: 0.00 Minimum hog factor of a job: 0.00
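When post-processing saved bacct -l output, the termination reason can be pulled out of the "Completed" line. The following Python sketch is illustrative only: the function name is hypothetical, and the pattern assumes the layout shown in the example above (a TERM_ keyword followed by a colon and a description that ends at the first period).

```python
import re

# Matches: Completed <status>; TERM_XXX: description.
TERM_RE = re.compile(
    r"Completed <(?P<status>\w+)>;\s*(?P<reason>TERM_\w+):\s*(?P<detail>[^.]+)\."
)


def termination_reason(bacct_text: str):
    """Return (status, reason, detail) or None if no reason is logged."""
    match = TERM_RE.search(bacct_text)
    if match is None:
        return None
    # The description may wrap across lines; collapse the whitespace.
    detail = " ".join(match.group("detail").split())
    return match.group("status"), match.group("reason"), detail


sample = (
    "Thu Sep 16 15:23:21: Completed <exit>; TERM_RUNLIMIT: job killed after reaching\n"
    "LSF run time limit.\n"
)
print(termination_reason(sample))
```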
Files
Reads lsb.acct, lsb.acct.n.

See also

bhist, bsub, bjobs, lsb.acct, brsvadd, brsvs, bsla, lsb.serviceclasses

bapp

Displays information about application profile configuration.

Synopsis

bapp [-l | -w] [application_profile_name ...]
bapp [-h | -V]

Description

Displays information about application profiles configured in lsb.applications.
Returns application name, job slot statistics, and job state statistics for all application profiles.
In MultiCluster, returns the information about all application profiles in the local cluster.
CPU time is normalized.

Options

-w Wide format. Fields are displayed without truncation.
-l Long format with additional information.
Displays the following additional information: application profile description, application profile characteristics and statistics, parameters, resource usage limits, associated commands, and job controls.
application_profile_name ...
Displays information about the specified application profile.
-h Prints command usage to stderr and exits.
-V Prints product release version to stderr and exits.

Default output format

Displays the following fields:
APPLICATION_NAME
The name of the application profile. Application profiles are named to correspond to the type of application that usually runs within them.
NJOBS The total number of job slots held currently by jobs in the application profile. This
includes pending, running, suspended and reserved job slots. A parallel job that is running on n processors is counted as n job slots, since it takes n job slots in the application.
PEND The number of job slots used by pending jobs in the application profile.
RUN The number of job slots used by running jobs in the application profile.
SUSP The number of job slots used by suspended jobs in the application profile.

Long output format(-l)

In addition to the above fields, the -l option displays the following:
Description A description of the typical use of the application profile.
PARAMETERS/STATISTICS
SSUSP
The number of job slots in the application profile allocated to jobs that are suspended by LSF because of load levels or run windows.
USUSP
The number of job slots in the application profile allocated to jobs that are suspended by the job submitter or by the LSF administrator.
RSV
The number of job slots in the application profile that are reserved by LSF for pending jobs.
Per-job resource usage limits
The soft resource usage limits that are imposed on the jobs associated with the application profile. These limits are imposed on a per-job and a per-process basis.
The possible per-job limits are:
CPULIMIT
The maximum CPU time a job can use, in minutes, relative to the CPU factor of the named host. CPULIMIT is scaled by the CPU factor of the execution host so that jobs are allowed more time on slower hosts.
MEMLIMIT
The maximum running set size (RSS) of a process.
By default, the limit is shown in KB. Use LSF_UNIT_FOR_LIMITS in lsf.conf to specify a larger unit for display (MB, GB, TB, PB, or EB).
MEMLIMIT_TYPE
A memory limit is the maximum amount of memory a job is allowed to consume. Jobs that exceed the level are killed. You can specify different types of memory limits to enforce, based on PROCESS, TASK, or JOB (or any combination of the three).
PROCESSLIMIT
The maximum number of concurrent processes allocated to a job.
PROCLIMIT
The maximum number of processors allocated to a job.
SWAPLIMIT
The swap space limit that a job may use.
By default, the limit is shown in KB. Use LSF_UNIT_FOR_LIMITS in lsf.conf to specify a larger unit for display (MB, GB, TB, PB, or EB).
THREADLIMIT
The maximum number of concurrent threads allocated to a job.
Per-process resource usage limits
The possible UNIX per-process resource limits are:
CORELIMIT
The maximum size of a core file.
By default, the limit is shown in KB. Use LSF_UNIT_FOR_LIMITS in lsf.conf to specify a larger unit for display (MB, GB, TB, PB, or EB).
DATALIMIT
The maximum size of the data segment of a process, in KB. This restricts the amount of memory a process can allocate.
FILELIMIT
The maximum file size a process can create, in KB.
RUNLIMIT
The maximum wall clock time a process can use, in minutes. RUNLIMIT is scaled by the CPU factor of the execution host.
STACKLIMIT
The maximum size of the stack segment of a process. This restricts the amount of memory a process can use for local variables or recursive function calls.
By default, the limit is shown in KB. Use LSF_UNIT_FOR_LIMITS in lsf.conf to specify a larger unit for display (MB, GB, TB, PB, or EB).
CHKPNT_DIR The checkpoint directory, if automatic checkpointing is enabled for the application profile.
CHKPNT_INITPERIOD
The initial checkpoint period in minutes. The periodic checkpoint does not happen until the initial period has elapsed.
CHKPNT_PERIOD The checkpoint period in minutes. The running job is checkpointed automatically
every checkpoint period.
CHKPNT_METHOD The checkpoint method.
MIG The migration threshold in minutes. A value of 0 (zero) specifies that a suspended
job should be migrated immediately.
Where a host migration threshold is also specified, and is lower than the job value, the host value is used.
PRE_EXEC The pre-execution command for the application profile. The PRE_EXEC command
runs on the execution host before the job associated with the application profile is dispatched to the execution host (or to the first host selected for a parallel batch job).
POST_EXEC The post-execution command for the application profile. The POST_EXEC
command runs on the execution host after the job finishes.
JOB_INCLUDE_POSTPROC
If JOB_INCLUDE_POSTPROC=Y, post-execution processing of the job is included as part of the job.
JOB_POSTPROC_TIMEOUT
Timeout in minutes for job post-execution processing. If post-execution processing takes longer than the timeout, sbatchd reports that post-execution has failed (POST_ERR status), and kills the process group of the job’s post-execution processes.
REQUEUE_EXIT_VALUES
Jobs that exit with these values are automatically requeued.
RES_REQ Resource requirements of the application profile. Only the hosts that satisfy these
resource requirements can be used by the application profile.
JOB_STARTER An executable file that runs immediately prior to the batch job, taking the batch job
file as an input argument. All jobs submitted to the application profile are run via the job starter, which is generally used to create a specific execution environment before processing the jobs themselves.
CHUNK_JOB_SIZE Chunk jobs only. Specifies the maximum number of jobs allowed to be dispatched
together in a chunk job. All of the jobs in the chunk are scheduled and dispatched as a unit rather than individually.
RERUNNABLE If the RERUNNABLE field displays yes, jobs in the application profile are
automatically restarted or rerun if the execution host becomes unavailable. However, a job in the application profile is not restarted if you use bmod to remove the rerunnable option from the job.
RESUME_CONTROL The configured actions for the resume job control.
The configured actions are displayed in the format [action_type, command] where action_type is RESUME.
SUSPEND_CONTROL
The configured actions for the suspend job control.
The configured actions are displayed in the format [action_type, command] where action_type is SUSPEND.
TERMINATE_CONTROL
The configured actions for terminate job control.
The configured actions are displayed in the format [action_type, command] where action_type is TERMINATE.
See also
lsb.applications, lsb.queues, bsub, bjobs, badmin, mbatchd

badmin

Administrative tool for LSF.

Synopsis

badmin subcommand
badmin [-h | -V]

Description

IMPORTANT: This command can only be used by LSF administrators.
badmin provides a set of subcommands to control and monitor LSF. If no subcommands are supplied for badmin, badmin prompts for a subcommand from standard input.
Information about each subcommand is available through the help command.
The badmin subcommands include privileged and non-privileged subcommands.
Privileged subcommands can only be invoked by root or LSF administrators. Privileged subcommands are:
reconfig
mbdrestart
qopen
qclose
qact
qinact
hopen
hclose
hrestart
hshutdown
hstartup
diagnose
The configuration file lsf.sudoers(5) must be set to use the privileged command
hstartup by a non-root user.
All other commands are non-privileged commands and can be invoked by any LSF user. If the privileged commands are to be executed by the LSF administrator,
badmin must be installed, because it needs to send the request using a privileged
port.
For subcommands for which multiple hosts can be specified, do not enclose the host names in quotation marks.

Subcommand synopsis

ckconfig [-v]
diagnose [job_ID ... | "job_ID[index]" ...]
reconfig [-v] [-f]
mbdrestart [-C comment] [-v] [-f]
qopen [-C comment] [queue_name ... | all]
qclose [-C comment] [queue_name ... | all]
qact [-C comment] [queue_name ... | all]
qinact [-C comment] [queue_name ... | all]
qhist [-t time0,time1] [-f logfile_name] [queue_name ...]
hopen [-C comment] [host_name ... | host_group ... | all]
hclose [-C comment] [host_name ... | host_group ... | all]
hrestart [-f] [host_name ... | all]
hshutdown [-f] [host_name ... | all]
hstartup [-f] [host_name ... | all]
hhist [-t time0,time1] [-f logfile_name] [host_name ...]
mbdhist [-t time0,time1] [-f logfile_name]
hist [-t time0,time1] [-f logfile_name]
hghostadd [-C comment] host_group host_name [host_name ...]
hghostdel [-f] [-C comment] host_group host_name [host_name ...]
help [command ...] | ? [command ...]
quit
mbddebug [-c class_name ...] [-l debug_level] [-f logfile_name] [-o]
mbdtime [-l timing_level] [-f logfile_name] [-o]
sbddebug [-c class_name ...] [-l debug_level] [-f logfile_name] [-o] [host_name ...]
sbdtime [-l timing_level] [-f logfile_name] [-o] [host_name ...]
schddebug [-c class_name ...] [-l debug_level] [-f logfile_name] [-o]
schdtime [-l timing_level] [-f logfile_name] [-o]
showconf mbd | [sbd [ host_name … | all ]]
perfmon start [sample_period] | stop | view | setperiod sample_period
-h
-V

Options

subcommand Executes the specified subcommand. See Usage section.
-h Prints command usage to stderr and exits.
-V Prints LSF release version to stderr and exits.

Usage

ckconfig [-v]
Checks LSF configuration files located in the LSB_CONFDIR/cluster_name/configdir directory, and checks LSF_ENVDIR/lsf.licensescheduler.
The LSB_CONFDIR variable is defined in lsf.conf (see lsf.conf(5)) which is in LSF_ENVDIR or /etc (if LSF_ENVDIR is not defined).
By default, badmin ckconfig displays only the result of the configuration file check. If warning errors are found, badmin prompts you to display detailed messages.
-v
Verbose mode. Displays detailed messages about configuration file checking to
stderr.
diagnose [job_ID ... | "job_ID[index]" ...]
Displays full pending reason list if CONDENSE_PENDING_REASONS=Y is set in
lsb.params. For example:
badmin diagnose 1057
reconfig [-v] [-f]
Dynamically reconfigures LSF without restarting mbatchd.
Configuration files are checked for errors and the results displayed to stderr. If no errors are found in the configuration files, a reconfiguration request is sent to mbatchd and configuration files are reloaded.
With this option, mbatchd and mbschd are not restarted and lsb.events is not replayed. To restart mbatchd and mbschd, and replay lsb.events, use badmin mbdrestart.
When you issue this command, mbatchd is available to service requests while reconfiguration files are reloaded. Configuration changes made since system boot or the last reconfiguration take effect.
If warning errors are found, badmin prompts you to display detailed messages. If fatal errors are found, reconfiguration is not performed, and badmin exits.
If you add a host to a queue or to a host group, the new host is not recognized by jobs that were submitted before you reconfigured. If you want the new host to be recognized, you must use the command badmin mbdrestart.
Resource requirements determined by the queue no longer apply to a running job after running badmin reconfig. For example, if you change the RES_REQ parameter in a queue and reconfigure the cluster, the previous queue-level resource requirements for running jobs are lost.
-v
Verbose mode. Displays detailed messages about the status of the configuration files. Without this option, the default is to display the results of configuration file checking. All messages from the configuration file check are printed to stderr.
-f
Disables interaction and proceeds with reconfiguration if configuration files contain no fatal errors.
mbdrestart [-C comment] [-v] [-f]
Dynamically reconfigures LSF and restarts mbatchd and mbschd.
Configuration files are checked for errors and the results printed to stderr. If no errors are found, configuration files are reloaded, mbatchd and mbschd are restarted, and events in lsb.events are replayed to recover the running state of the last mbatchd. While mbatchd restarts, it is unavailable to service requests.
If warning errors are found, badmin prompts you to display detailed messages. If fatal errors are found, mbatchd and mbschd restart is not performed, and badmin exits.
If lsb.events is large, or many jobs are running, restarting mbatchd can take several minutes. If you only need to reload the configuration files, use badmin reconfig.
-C comment
Logs the text of comment as an administrator comment record to lsb.events. The maximum length of the comment string is 512 characters.
-v
Verbose mode. Displays detailed messages about the status of configuration files. All messages from configuration checking are printed to stderr.
-f
Disables interaction and forces reconfiguration and mbatchd restart to proceed if configuration files contain no fatal errors.
qopen [-C comment] [queue_name ... | all]
Opens specified queues, or all queues if the reserved word all is specified. If no queue is specified, the system default queue is assumed. A queue can accept batch jobs only if it is open.
-C comment
Logs the text of comment as an administrator comment record to lsb.events. The maximum length of the comment string is 512 characters.
qclose [-C comment] [queue_name ... | all]
Closes specified queues, or all queues if the reserved word all is specified. If no queue is specified, the system default queue is assumed. A queue does not accept any job if it is closed.
-C comment
Logs the text as an administrator comment record to lsb.events. The maximum length of the comment string is 512 characters.
qact [-C comment] [queue_name ... | all]
Activates specified queues, or all queues if the reserved word all is specified. If no queue is specified, the system default queue is assumed. Jobs in a queue can be dispatched if the queue is activated.
A queue inactivated by its run windows cannot be reactivated by this command.
-C comment
Logs the text of the comment as an administrator comment record to lsb.events. The maximum length of the comment string is 512 characters.
qinact [-C comment] [queue_name ... | all]
Inactivates specified queues, or all queues if the reserved word all is specified. If no queue is specified, the system default queue is assumed. No job in a queue can be dispatched if the queue is inactivated.
-C comment
Logs the text as an administrator comment record to lsb.events. The maximum length of the comment string is 512 characters.
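The four queue control subcommands above share the same argument shape. The following Python sketch shows how a wrapper script might assemble such a command line before handing it to a process launcher. It only builds the argument list and does not invoke badmin. The function and constant names are hypothetical; the subcommand names, the -C option, and the 512-character comment limit come from the descriptions above.

```python
QUEUE_ACTIONS = {"qopen", "qclose", "qact", "qinact"}
MAX_COMMENT_LEN = 512  # maximum comment length stated for -C


def queue_control_argv(action, queues=(), comment=None):
    """Build an argument list for a badmin queue control subcommand."""
    if action not in QUEUE_ACTIONS:
        raise ValueError(f"not a queue control subcommand: {action}")
    argv = ["badmin", action]
    if comment is not None:
        if len(comment) > MAX_COMMENT_LEN:
            raise ValueError("comment exceeds 512 characters")
        argv += ["-C", comment]
    # With no queue names, badmin acts on the system default queue.
    argv += list(queues)
    return argv


print(queue_control_argv("qclose", ["normal"], "closed for maintenance"))
```

Passing the result as a list (rather than a single shell string) keeps a multi-word comment as one argument.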
qhist [-t time0,time1] [-f logfile_name] [queue_name ...]
Displays historical events for specified queues, or for all queues if no queue is specified. Queue events are queue opening, closing, activating and inactivating.
-t time0,time1
Displays only those events that occurred during the period from time0 to time1. See
bhist(1) for the time format. The default is to display all queue events in the event
log file (see below).
-f logfile_name
Specify the file name of the event log file. Either an absolute or a relative path name may be specified. The default is to use the event log file currently used by the LSF system:
LSB_SHAREDIR/cluster_name/logdir/lsb.events. Option -f is useful for
offline analysis.
If you specified an administrator comment with the -C option of the queue control commands qclose, qopen, qact, and qinact, qhist displays the comment text.
hopen [-C comment] [host_name ... | host_group ... | all]
Opens batch server hosts. Specify the names of any server hosts or host groups. All batch server hosts are opened if the reserved word all is specified. If no host or host group is specified, the local host is assumed. A host accepts batch jobs if it is open.
IMPORTANT: If EGO-enabled SLA scheduling is configured through ENABLE_DEFAULT_EGO_SLA
in lsb.params, and a host is closed by EGO, it cannot be reopened by badmin hopen. Hosts closed by EGO have status closed_EGO in bhosts -l output.
-C comment
Logs the text as an administrator comment record to lsb.events. The maximum length of the comment string is 512 characters.
If you open a host group, each host group member displays with the same comment string.
hclose [-C comment] [host_name ... | host_group ... | all]
Closes batch server hosts. Specify the names of any server hosts or host groups. All batch server hosts are closed if the reserved word all is specified. If no argument is specified, the local host is assumed. A closed host does not accept any new job, but jobs already dispatched to the host are not affected. Note that this is different from a host closed by a window; all jobs on it are suspended in that case.
-C comment
Logs the text as an administrator comment record to lsb.events. The maximum length of the comment string is 512 characters.
If you close a host group, each host group member displays with the same comment string.
hghostadd [-C comment] host_group host_name [host_name ...]
If dynamic host configuration is enabled, dynamically adds hosts to a host group. After receiving the host information from the master LIM, mbatchd dynamically adds the host without triggering a reconfig.
Once the host is added to the group, it is considered to be part of that group with respect to scheduling decision making for both newly submitted jobs and for existing pending jobs.
This command fails if any of the specified host groups or host names are not valid.
RESTRICTION: If EGO-enabled SLA scheduling is configured through ENABLE_DEFAULT_EGO_SLA
in lsb.params, you cannot use hghostadd because all host allocation is under control of Platform EGO.
-C comment
Logs the text as an administrator comment record to lsb.events. The maximum length of the comment string is 512 characters.
hghostdel [-f] [-C comment] host_group host_name [host_name ...]
Dynamically deletes hosts from a host group by triggering an mbatchd reconfig.
This command fails if any of the specified host groups or host names are not valid.
CAUTION: If you want to change a dynamic host to a static host, first use the command
badmin hghostdel to remove the dynamic host from any host group that it belongs to, and then configure the host as a static host in lsf.cluster.cluster_name.
RESTRICTION: If EGO-enabled SLA scheduling is configured through ENABLE_DEFAULT_EGO_SLA
in lsb.params, you cannot use hghostdel because all host allocation is under control of Platform EGO.
hrestart [-f] [host_name ... | all]
Restarts sbatchd on the specified hosts, or on all server hosts if the reserved word
all is specified. If no host is specified, the local host is assumed. sbatchd reruns
itself from the beginning. This allows new
sbatchd binaries to be used.
-f
Disables interaction and does not ask for confirmation for restarting sbatchd.
hshutdown [-f] [host_name ... | all]
Shuts down sbatchd on the specified hosts, or on all batch server hosts if the reserved word all is specified. If no host is specified, the local host is assumed. sbatchd exits upon receiving the request.
-f
Disables interaction and does not ask for confirmation for shutting down sbatchd.
hstartup [-f] [host_name ... | all]
Starts sbatchd on the specified hosts, or on all batch server hosts if the reserved word all is specified. Only root and users listed in the file lsf.sudoers(5) can use the all and -f options. These users must be able to use rsh or ssh on all LSF hosts without having to type in passwords. If no host is specified, the local host is assumed.
The shell command specified by LSF_RSH in lsf.conf is used before rsh is tried.
-f
Disables interaction and does not ask for confirmation for starting sbatchd.
hhist [-t time0,time1] [-f logfile_name] [host_name ...]
Displays historical events for specified hosts, or for all hosts if no host is specified. Host events are host opening and closing.
-t time0,time1
Displays only those events that occurred during the period from time0 to time1. See bhist(1) for the time format. The default is to display all queue events in the event log file (see below).
-f logfile_name
Specify the file name of the event log file. Either an absolute or a relative path name may be specified. The default is to use the event log file currently used by the LSF system: LSB_SHAREDIR/cluster_name/logdir/lsb.events. Option -f is useful for offline analysis.
If you specified an administrator comment with the -C option of the host control commands hclose or hopen, hhist displays the comment text.
mbdhist [-t time0,time1] [-f logfile_name]
Displays historical events for mbatchd. Events describe the starting and exiting of mbatchd.
-t time0,time1
Displays only those events that occurred during the period from time0 to time1. See bhist(1) for the time format. The default is to display all queue events in the event log file (see below).
-f logfile_name
Specify the file name of the event log file. Either an absolute or a relative path name may be specified. The default is to use the event log file currently used by the LSF system: LSB_SHAREDIR/cluster_name/logdir/lsb.events. Option -f is useful for offline analysis.
If you specified an administrator comment with the -C option of the mbdrestart command, mbdhist displays the comment text.
hist [-t time0,time1] [-f logfile_name]
Displays historical events for all the queues, hosts and mbatchd.
-t time0,time1
Displays only those events that occurred during the period from time0 to time1. See bhist(1) for the time format. The default is to display all queue events in the event log file (see below).
-f logfile_name
Specify the file name of the event log file. Either an absolute or a relative path name may be specified. The default is to use the event log file currently used by the LSF system: LSB_SHAREDIR/cluster_name/logdir/lsb.events. Option -f is useful for offline analysis.
If you specified an administrator comment with the -C option of the queue, host, and mbatchd commands, hist displays the comment text.
help [command ...] | ? [command ...]
Displays the syntax and functionality of the specified commands.
quit
Exits the badmin session.
mbddebug [-c class_name ...] [-l debug_level] [-f logfile_name] [-o]
Sets message log level for mbatchd to include additional information in log files. You must be root or the LSF administrator to use this command.
See sbddebug for an explanation of options.
mbdtime [-l timing_level] [-f logfile_name] [-o]
Sets timing level for mbatchd to include additional timing information in log files. You must be root or the LSF administrator to use this command. See sbdtime for an explanation of options.
sbddebug [-c class_name ...] [-l debug_level] [-f logfile_name] [-o] [host_name ...]
Sets the message log level for sbatchd to include additional information in log files. You must be root or the LSF administrator to use this command.
In MultiCluster, debug levels can only be set for hosts within the same cluster. For example, you cannot set debug or timing levels from a host in clusterA for a host in clusterB. You need to be on a host in clusterB to set up debug or timing levels for clusterB hosts.
If the command is used without any options, the following default values are used:
class_name=0 (no additional classes are logged)
debug_level=0 (LOG_DEBUG level in parameter LSF_LOG_MASK)
logfile_name=current LSF system log file in the LSF system log file directory, in the format daemon_name.log.host_name
host_name=local host (host from which command was submitted)
-c class_name ...
Specifies software classes for which debug messages are to be logged.
Format of class_name is the name of a class, or a list of class names separated by spaces and enclosed in quotation marks. Classes are also listed in lsf.h.
Valid log classes are:
LC_ADVRSV - Log advance reservation modifications
LC_AFS - Log AFS messages
LC_AUTH - Log authentication messages
LC_CHKPNT - Log checkpointing messages
LC_COMM - Log communication messages
LC_CONF - Print out all parameters in lsb.params
LC_DCE - Log messages pertaining to DCE support
LC_EEVENTD - Log eeventd messages
LC_ELIM - Log ELIM messages
LC_EXEC - Log significant steps for job execution
LC_FAIR - Log fairshare policy messages
LC_FILE - Log file transfer messages
LC_HANG - Mark where a program might hang
LC_JARRAY - Log job array messages
LC_JLIMIT - Log job slot limit messages
LC_LICENSE - Log license management messages (LC_LICENCE is also
supported for backward compatibility)
LC_LOADINDX - Log load index messages
LC_M_LOG - Log multievent logging messages
LC_MPI - Log MPI messages
LC_MULTI - Log messages pertaining to MultiCluster
LC_PEND - Log messages related to job pending reasons
LC_PERFM - Log performance messages
LC_PIM - Log PIM messages
LC_PREEMPT - Log preemption policy messages
LC_SIGNAL - Log messages pertaining to signals
LC_SYS - Log system call messages
LC_TRACE - Log significant program walk steps
LC_XDR - Log everything transferred by XDR
Default: 0 (no additional classes are logged)
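A script that drives sbddebug may want to reject misspelled class names before calling badmin. The following Python sketch checks names against the list above and joins them into the single space-separated value expected by -c. The function and constant names are hypothetical, and the sketch only builds the argument; it does not run badmin.

```python
# Class names copied from the list above; LC_LICENCE is the
# backward-compatible spelling of LC_LICENSE.
VALID_LOG_CLASSES = {
    "LC_ADVRSV", "LC_AFS", "LC_AUTH", "LC_CHKPNT", "LC_COMM", "LC_CONF",
    "LC_DCE", "LC_EEVENTD", "LC_ELIM", "LC_EXEC", "LC_FAIR", "LC_FILE",
    "LC_HANG", "LC_JARRAY", "LC_JLIMIT", "LC_LICENSE", "LC_LICENCE",
    "LC_LOADINDX", "LC_M_LOG", "LC_MPI", "LC_MULTI", "LC_PEND", "LC_PERFM",
    "LC_PIM", "LC_PREEMPT", "LC_SIGNAL", "LC_SYS", "LC_TRACE", "LC_XDR",
}


def debug_class_arg(class_names):
    """Join validated class names into one space-separated -c value."""
    unknown = [name for name in class_names if name not in VALID_LOG_CLASSES]
    if unknown:
        raise ValueError(f"unknown log classes: {unknown}")
    return " ".join(class_names)


# As an argument list, the joined value is a single element, which is the
# equivalent of enclosing the list in quotation marks on a shell command line.
argv = ["badmin", "sbddebug", "-c", debug_class_arg(["LC_EXEC", "LC_TRACE"])]
print(argv)
```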
-l debug_level
Specifies level of detail in debug messages. The higher the number, the more detail that is logged. Higher levels include all lower levels.
Possible values:
0 LOG_DEBUG level in parameter LSF_LOG_MASK in lsf.conf.
1 LOG_DEBUG1 level for extended logging. A higher level includes lower logging levels. For example, LOG_DEBUG3 includes LOG_DEBUG2, LOG_DEBUG1, and LOG_DEBUG levels.
2 LOG_DEBUG2 level for extended logging. A higher level includes lower logging levels. For example, LOG_DEBUG3 includes LOG_DEBUG2, LOG_DEBUG1, and LOG_DEBUG levels.
3 LOG_DEBUG3 level for extended logging. A higher level includes lower logging levels. For example, LOG_DEBUG3 includes LOG_DEBUG2, LOG_DEBUG1, and LOG_DEBUG levels.
Default: 0 (LOG_DEBUG level in parameter LSF_LOG_MASK)
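The inclusion rule for debug levels can be expressed as a simple prefix of an ordered list. The following Python sketch is only an illustration of that rule; the function and constant names are hypothetical and are not part of LSF.

```python
# Level names in increasing order of detail, indexed by the -l value.
DEBUG_LEVELS = ["LOG_DEBUG", "LOG_DEBUG1", "LOG_DEBUG2", "LOG_DEBUG3"]


def included_levels(debug_level: int):
    """Return every logging level covered by the given -l value."""
    if not 0 <= debug_level < len(DEBUG_LEVELS):
        raise ValueError("debug_level must be 0, 1, 2, or 3")
    return DEBUG_LEVELS[: debug_level + 1]


print(included_levels(3))
```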
-f logfile_name
Specify the name of the file into which debugging messages are to be logged. A file name with or without a full path may be specified.
If a file name without a path is specified, the file is saved in the LSF system log directory.
The name of the file that is created has the following format:
logfile_name.daemon_name.
log.host_name
On UNIX, if the specified path is not valid, the log file is created in the directory.
On Windows, if the specified path is not valid, no log file is created.
Default: current LSF system log file in the LSF system log file directory.
-o
Turns off temporary debug settings and resets them to the daemon starting state. The message log level is reset back to the value of LSF_LOG_MASK and classes are reset to the value of LSB_DEBUG_MBD, LSB_DEBUG_SBD.
The log file is also reset back to the default log file.
host_name ...
Optional. Sets debug settings on the specified host or hosts.
Lists of host names must be separated by spaces and enclosed in quotation marks.
Default: local host (host from which command was submitted)
sbdtime [-l timing_level] [-f logfile_name] [-o] [host_name ...]
Sets the timing level for sbatchd to include additional timing information in log files. You must be
root or the LSF administrator to use this command.
In MultiCluster, timing levels can only be set for hosts within the same cluster. For example, you could not set debug or timing levels from a host in clusterA for a host in clusterB. You need to be on a host in clusterB to set up debug or timing levels for clusterB hosts.
If the command is used without any options, the following default values are used:
/tmp
timing_level=no timing information is recorded
logfile_name=current LSF system log file in the LSF system log file directory, in the format daemon_name.
host_name=local host (host from which command was submitted)
-l timing_level
Specifies detail of timing information that is included in log files. Timing messages indicate the execution time of functions in the software and are logged in milliseconds.
Valid values: 1 | 2 | 3 | 4 | 5
30 Platform LSF Command Reference
log.host_name
The higher the number, the more functions in the software that are timed and whose execution time is logged. The lower numbers include more common software functions. Higher levels include all lower levels.
Default: undefined (no timing information is logged)
-f logfile_name
Specify the name of the file into which timing messages are to be logged. A file name with or without a full path may be specified.
If a file name without a path is specified, the file is saved in the LSF system log file directory.
The name of the file created has the following format:
logfile_name.daemon_name.log.host_name
On UNIX, if the specified path is not valid, the log file is created in the /tmp directory.
On Windows, if the specified path is not valid, no log file is created.
Note: Both timing and debug messages are logged in the same files.
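The naming and fallback rules for -f can be summarized in a short Python sketch. This is only an illustration, not LSF code: the function name resolve_log_path and its parameters are hypothetical, and "path is not valid" is assumed here to mean that the directory part of the name does not exist.

```python
import os


def resolve_log_path(logfile_name, daemon_name, host_name, lsf_log_dir,
                     is_windows=False):
    """Return the log file path implied by the -f rules, or None.

    Hypothetical helper: models only the rules stated in the text.
    """
    # Created file name format: logfile_name.daemon_name.log.host_name
    file_name = "{}.{}.log.{}".format(
        os.path.basename(logfile_name), daemon_name, host_name)
    directory = os.path.dirname(logfile_name)
    if not directory:
        # No path given: the file goes to the LSF system log file directory.
        return os.path.join(lsf_log_dir, file_name)
    if os.path.isdir(directory):
        # Assumption: a path is "valid" when its directory exists.
        return os.path.join(directory, file_name)
    if is_windows:
        # Windows with an invalid path: no log file is created.
        return None
    # UNIX with an invalid path: fall back to /tmp.
    return os.path.join("/tmp", file_name)
```

For example, a bare name such as mylog for sbatchd on hostA resolves to mylog.sbatchd.log.hostA inside the directory passed as lsf_log_dir.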
Default: current LSF system log file in the LSF system log file directory, in the format daemon_name.log.host_name.
-o
Optional. Turn off temporary timing settings and reset them to the daemon starting state. The timing level is reset back to the value of the parameter for the corresponding daemon (LSB_TIME_MBD, LSB_TIME_SBD).
The log file is also reset back to the default log file.
host_name ...
Sets the timing level on the specified host or hosts.
Lists of hosts must be separated by spaces and enclosed in quotation marks.
Default: local host (host from which command was submitted)
schddebug [-c class_name ...] [-l debug_level] [-f logfile_name] [-o]
Sets message log level for mbschd to include additional information in log files. You must be root or the LSF administrator to use this command.
See sbddebug for an explanation of options.
schdtime [-l timing_level] [-f] [-o]
Sets timing level for mbschd to include additional timing information in log files. You must be root or the LSF administrator to use this command.
See sbdtime for an explanation of options.
showconf mbd | [sbd [ host_name … | all ]]
Display all configured parameters and their values set in lsf.conf or ego.conf that affect mbatchd and sbatchd.
In a MultiCluster environment, badmin showconf only displays the parameters of daemons on the local cluster.
Running badmin showconf from a master candidate host reaches all server hosts in the cluster. Running badmin showconf from a slave-only host may not be able to reach other slave-only hosts.
badmin showconf only displays the values used by LSF.
For example, if you define LSF_MASTER_LIST in lsf.conf, and EGO_MASTER_LIST in ego.conf, badmin showconf displays the value of EGO_MASTER_LIST.
badmin showconf displays the value of EGO_MASTER_LIST from wherever it is defined. You can define either LSF_MASTER_LIST or EGO_MASTER_LIST in lsf.conf. LIM reads lsf.conf first, and ego.conf if EGO is enabled in the LSF cluster. The value of LSF_MASTER_LIST is displayed only if EGO_MASTER_LIST is not defined at all in ego.conf.
For example, if EGO is enabled in the LSF cluster, and you define LSF_MASTER_LIST in lsf.conf, and EGO_MASTER_LIST in ego.conf, badmin showconf displays the value of EGO_MASTER_LIST in ego.conf.
If EGO is disabled, ego.conf is not loaded, so whatever is defined in lsf.conf is displayed.
perfmon start [sample_period] | stop | view | setperiod sample_period
Dynamically enables and controls scheduler performance metric collection.
Collecting and recording performance metric data may affect the performance of LSF. Smaller sampling periods result in the lsb.streams file growing faster.
The following metrics are collected and recorded in each sample period:
The number of queries handled by mbatchd
The number of queries for each of jobs, queues, and hosts (bjobs, bqueues, and bhosts, as well as other daemon requests)
The number of jobs submitted (divided into job submission requests and jobs actually submitted)
The number of jobs dispatched
The number of jobs completed
The number of jobs sent to a remote cluster
The number of jobs accepted from a remote cluster
start [sample_period]
Starts performance metric collection dynamically and specifies an optional sampling period in seconds for performance metric collection.
If no sampling period is specified, the default period set in
SCHED_METRIC_SAMPLE_PERIOD in lsb.params is used.
stop
Stop performance metric collection dynamically.
view
Display real time performance metric information for the current sampling period.
setperiod sample_period
Set a new sampling period in seconds.

See also

bqueues, bhosts, lsb.params, lsb.queues, lsb.hosts, lsf.conf, lsf.cluster, sbatchd, mbatchd, mbschd

bbot

Moves a pending job relative to the last job in the queue.

Synopsis

bbot job_ID | "job_ID[index_list]" [position]
bbot -h | -V

Description

Changes the queue position of a pending job or job array element, to affect the order in which jobs are considered for dispatch.
By default, LSF dispatches jobs in a queue in the order of arrival (that is, first-come, first-served), subject to availability of suitable server hosts.
The bbot command allows users and the LSF administrator to manually change the order in which jobs are considered for dispatch. Users can only operate on their own jobs, whereas the LSF administrator can operate on any user’s jobs.
If invoked by the LSF administrator, bbot moves the selected job after the last job with the same priority submitted to the queue.
If invoked by a user, bbot moves the selected job after the last job with the same priority submitted by the user to the queue.
Pending jobs are displayed by bjobs in the order in which they are considered for dispatch.
A user may use bbot to change the dispatch order of their jobs scheduled using a fairshare policy. However, if a job scheduled using a fairshare policy is moved by the LSF administrator using btop, the job is not subject to further fairshare scheduling unless the same job is subsequently moved by the LSF administrator using bbot; in this case the job is scheduled again using the same fairshare policy.
To prevent users from changing the queue position of a pending job with bbot, configure JOB_POSITION_CONTROL_BY_ADMIN=Y in lsb.params.
You cannot run bbot on jobs pending in an absolute priority scheduling (APS) queue.

Options

job_ID | "job_ID[index_list]"
Required. Job ID of the job or job array on which to operate.
For a job array, the index list, the square brackets, and the quotation marks are required. An index list is used to operate on a job array. The index list is a comma-separated list whose elements have the syntax start_index[-end_index[:step]] where start_index, end_index and step are positive integers. If the step is omitted, a step of one is assumed. The job array index starts at one. The maximum job array index is 1000. All jobs in the array share the same job_ID and parameters. Each element of the array is distinguished by its array index.
position
Optional. The position argument can be specified to indicate where in the queue the job is to be placed. position is a positive number that indicates the target position of the job from the end of the queue. The positions are relative to only the applicable jobs in the queue, depending on whether the invoker is a regular user or the LSF administrator. The default value of 1 means the position is after all other jobs with the same priority.
-h Prints command usage to stderr and exits.
-V Prints LSF release version to stderr and exits.

See also

bjobs(1), bswitch(1), btop(1), JOB_POSITION_CONTROL_BY_ADMIN in lsb.params
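The index_list syntax described under Options for bbot, start_index[-end_index[:step]], can be illustrated with a short Python sketch. It is not LSF code: the function name expand_index_list is hypothetical, the input is assumed to be the text between the square brackets, and the limit of 1000 is taken from the option description.

```python
def expand_index_list(index_list, max_index=1000):
    """Expand an index list such as '1-10:2,15' into array indices.

    Hypothetical helper: follows the documented element syntax
    start_index[-end_index[:step]] with positive integers only.
    """
    indices = []
    for element in index_list.split(","):
        range_part, _, step_text = element.strip().partition(":")
        start_text, _, end_text = range_part.partition("-")
        if step_text and not end_text:
            # The syntax only allows a step after an end index.
            raise ValueError("step requires an end index: %r" % element)
        start = int(start_text)
        end = int(end_text) if end_text else start
        step = int(step_text) if step_text else 1  # default step is one
        if start < 1 or step < 1 or end < start or end > max_index:
            raise ValueError("invalid index list element: %r" % element)
        indices.extend(range(start, end + 1, step))
    return indices
```

With this sketch, the list 1-10:2,15 selects elements 1, 3, 5, 7, 9, and 15 of the array.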

bchkpnt

checkpoints one or more checkpointable jobs

Synopsis

bchkpnt [-f] [-k] [-p minutes | -p 0] job_ID | "job_ID[index_list]" ...
bchkpnt [-f] [-k] [-p minutes | -p 0] -J job_name | -m host_name | -m host_group | -q queue_name | -u "user_name" | -u all [0]
bchkpnt -h | -V

Description

Checkpoints the most recently submitted running or suspended checkpointable job.
LSF administrators and root can checkpoint jobs submitted by other users.
Jobs continue to execute after they have been checkpointed.
LSF invokes the echkpnt(8) executable found in LSF_SERVERDIR to perform the checkpoint.
Only running members of a chunk job can be checkpointed. For chunk jobs in WAIT state, mbatchd rejects the checkpoint request.

Options

0 (Zero). Checkpoints all of the jobs that satisfy other specified criteria.
-f Forces a job to be checkpointed even if non-checkpointable conditions exist (these conditions are OS-specific).
-k Kills a job after it has been successfully checkpointed.
-p minutes | -p 0 Enables periodic checkpointing and specifies the checkpoint period, or modifies the checkpoint period of a checkpointed job. Specify -p 0 (zero) to disable periodic checkpointing.
Checkpointing is a resource-intensive operation. To allow your job to make progress while still providing fault tolerance, specify a checkpoint period of 30 minutes or longer.
-J job_name Checkpoints only jobs that have the specified job name.
-m host_name | -m host_group
Checkpoints only jobs dispatched to the specified hosts.
-q queue_name
Checkpoints only jobs dispatched from the specified queue.
-u "user_name" | -u all
Checkpoints only jobs submitted by the specified users. The keyword all specifies all users. Ignored if a job ID other than 0 (zero) is specified. To specify a Windows user account, include the domain name in uppercase letters and use a single backslash (DOMAIN_NAME\user_name) in a Windows command line or a double backslash (DOMAIN_NAME\\user_name) in a UNIX command line.
job_ID | "job_ID[index_list]"
Checkpoints only the specified jobs.
-h Prints command usage to stderr and exits.
-V Prints LSF release version to stderr and exits.

Examples

bchkpnt 1234
Checkpoints the job with job ID 1234.
bchkpnt -p 120 1234
Enables periodic checkpointing or changes the checkpoint period to 120 minutes (2 hours) for a job with job ID 1234.
bchkpnt -m hostA -k -u all 0
When issued by root or the LSF administrator, checkpoints and kills all checkpointable jobs on hostA. This is useful when a host needs to be shut down or rebooted.

See also

bsub(1), bmod(1), brestart(1), bjobs(1), bqueues(1), bhosts(1), libckpt.a(3), lsb.queues(5), echkpnt(8), erestart(8), mbatchd(8)

bclusters

displays MultiCluster information

Synopsis

bclusters [-app]
bclusters [-h | -V]

Description

For the job forwarding model, displays a list of MultiCluster queues together with their relationship with queues in remote clusters.
For the resource leasing model, displays remote resource provider and consumer information, resource flow information, and connection status between the local and remote cluster.

Options

-app Displays available application profiles in remote clusters.
-h Prints command usage to stderr and exits.
-V Prints LSF release version to stderr and exits.

Output

Job Forwarding Information
Displays a list of MultiCluster queues together with their relationship with queues in remote clusters.
Information related to the job forwarding model is displayed under the heading Job Forwarding Information.
LOCAL_QUEUE Name of a local MultiCluster send-jobs or receive-jobs queue.
JOB_FLOW Indicates direction of job flow.
send
The local queue is a MultiCluster send-jobs queue (SNDJOBS_TO is defined in the local queue).
recv
The local queue is a MultiCluster receive-jobs queue (RCVJOBS_FROM is defined in the local queue).
REMOTE For send-jobs queues, shows the name of the receive-jobs queue in a remote cluster.
For receive-jobs queues, always “-”.
CLUSTER For send-jobs queues, shows the name of the remote cluster containing the receive-jobs queue.
For receive-jobs queues, shows the name of the remote cluster that can send jobs to the local queue.
STATUS Indicates the connection status between the local queue and remote queue.
ok
The two clusters can exchange information and the system is properly configured.
disc
Communication between the two clusters has not been established. This could occur because there are no jobs waiting to be dispatched, or because the remote master cannot be located.
reject
The remote queue rejects jobs from the send-jobs queue. The local queue and remote queue are connected and the clusters communicate, but the queue-level configuration is not correct. For example, the send-jobs queue in the submission cluster points to a receive-jobs queue that does not exist in the remote cluster.
If the job is rejected, it returns to the submission cluster.
Resource Lease Information
Displays remote resource provider and consumer information, resource flow information, and connection status between the local and remote cluster.
Information related to the resource leasing model is displayed under the heading Resource Lease Information.
REMOTE_CLUSTER For borrowed resources, name of the remote cluster that is the provider.
For exported resources, name of the remote cluster that is the consumer.
RESOURCE_FLOW Indicates direction of resource flow.
IMPORT
Local cluster is the consumer and borrows resources from the remote cluster (HOSTS parameter in one or more local queue definitions includes remote resources).
EXPORT
Local cluster is the provider and exports resources to the remote cluster.
STATUS Indicates the connection status between the local and remote cluster.
ok
MultiCluster jobs can run.
disc
No communication between the two clusters. This could be a temporary situation or could indicate a MultiCluster configuration error.
conn
The two clusters communicate, but the lease is not established. This should be a temporary situation.
Remote Cluster Application Information
bclusters -app displays information related to application profile configuration under the heading Remote Cluster Application Information. Application profile information is only displayed for the job forwarding model. bclusters does not show local cluster application profile information.
REMOTE_CLUSTER The name of the remote cluster.
APP_NAME The name of the application profile available in the remote cluster.
DESCRIPTION The description of the application profile.

Files

Reads lsb.queues and lsb.applications.

See also

bapp, bhosts, bqueues, lsclusters, lsinfo, lsb.queues

bgadd

creates job groups

Synopsis

bgadd [-L limit] [-sla service_class_name] job_group_name
bgadd [-h | -V]

Description

Creates a job group with the job group name specified by job_group_name.
You must provide full group path name for the new job group. The last component of the path is the name of the new group to be created.
You do not need to create the parent job group before you create a sub-group under it. If no groups in the job group hierarchy exist, all groups are created with the specified hierarchy.

Options

-L limit Specifies the maximum number of concurrent jobs allowed to run under the job group (including child groups). -L limits the number of started jobs (RUN, SSUSP, USUSP) under the job group.
Specify a positive number between 0 and 2147483647. If the specified limit is zero (0), no jobs under the job group can run.
You cannot specify a limit for the root job group. The root job group has no job limit. Job groups added with no limits specified inherit any limits of existing parent job groups. The -L option only limits the lowest level job group created.
If a parallel job requests 2 CPUs (bsub -n 2), the job group limit is per job, not per slots used by the job.
By default, a job group has no job limit. Limits persist across mbatchd restart or reconfiguration.
-sla service_class_name
The name of a service class defined in lsb.serviceclasses, or the name of the SLA defined in ENABLE_DEFAULT_EGO_SLA in lsb.params. The job group is attached to the specified SLA.
job_group_name Full path of the job group name.
-h Prints command usage to stderr and exits.
-V Prints LSF release version to stderr and exits.

Examples

Create a job group named risk_group under the root group /:
bgadd /risk_group
Create a job group named portfolio1 under job group /risk_group:
bgadd /risk_group/portfolio1

See also

bgdel, bjgroup
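The bgadd description states that parent job groups do not need to exist before a sub-group is created. The following Python sketch models which groups a given path implies. It is only an illustration of that rule, not LSF code: the function name groups_to_create and the existing argument are hypothetical, and a full path is assumed to start with /.

```python
def groups_to_create(job_group_path, existing=()):
    """List the job groups that would be created for a full group path.

    Hypothetical helper: every missing level of the hierarchy is created,
    and levels already present in `existing` are left alone.
    """
    if not job_group_path.startswith("/"):
        # Assumption: a full group path begins at the root group "/".
        raise ValueError("a full job group path is required")
    created = []
    current = ""
    for component in job_group_path.split("/"):
        if not component:
            continue  # skip the empty piece before the leading slash
        current = current + "/" + component
        if current not in existing:
            created.append(current)
    return created
```

For /risk_group/portfolio1 with no existing groups, the sketch reports both /risk_group and /risk_group/portfolio1. If /risk_group already exists, only /risk_group/portfolio1 is reported.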

bgdel

deletes job groups

Synopsis

bgdel [-u user_name | -u all] job_group_name | 0
bgdel -c job_group_name
bgdel [-h | -V]

Description

Deletes a job group with the job group name specified by job_group_name and all its subgroups.
You must provide full group path name for the job group to be deleted. The job group cannot contain any jobs.
Users can only delete their own job groups. LSF administrators can delete any job groups.
Job groups can be created explicitly or implicitly:
A job group is created explicitly with the bgadd command.
A job group is created implicitly by the bsub -g or bmod -g command when the specified group does not exist. Job groups are also created implicitly when a default job group is configured (DEFAULT_JOBGROUP in lsb.params or LSB_DEFAULT_JOBGROUP environment variable).

Options

0 Delete the empty job groups. These groups can be explicit or implicit.
-u user_name Delete empty job groups owned by the specified user. Only administrators can use this option. These groups can be explicit or implicit. If you specify a job group name, the -u option is ignored.
-u all Delete empty job groups and their sub groups for all users. Only administrators can use this option. These groups can be explicit or implicit. If you specify a job group name, the -u option is ignored.
-c job_group_name Delete all the empty groups below the requested job_group_name including the job_group_name itself. These groups can be explicit or implicit.
job_group_name Full path of the job group name.
-h Prints command usage to stderr and exits.
-V Prints LSF release version to stderr and exits.

Example

bgdel /risk_group
Job group /risk_group is deleted.
Deletes the job group /risk_group and all its subgroups.

See also

bgadd, bjgroup

bhist

displays historical information about jobs

Synopsis

bhist [-a | -d | -e |-p | -r | -s] [-b | -w] [-l] [-C start_time,end_time] [-D start_time,end_time] [-f logfile_name | -n number_logfiles | -n 0] [-S start_time,end_time] [-J job_name] [-Lp ls_project_name] [-m host_name] [-N host_name | -N host_model | -N CPU_factor] [-P project_name] [-q queue_name] [-u user_name | -u all]
bhist [-t] [-f logfile_name] [-T start_time,end_time]
bhist [-J job_name] [-N host_name | -N host_model | -N cpu_factor] [job_ID ... | "job_ID[index]" ...]
bhist [-h | -V]

Description

By default:
Displays information about your own pending, running and suspended jobs.
Groups information by job
CPU time is not normalized
Searches the event log file currently used by the LSF system: $LSB_SHAREDIR/cluster_name/logdir/lsb.events (see lsb.events(5))
Displays events occurring in the past week, but this can be changed by setting the environment variable LSB_BHIST_HOURS to an alternative number of hours
If neither -l nor -b is present, the default is to display only the fields shown in Output.

Options

-a Displays information about both finished and unfinished jobs.
This option overrides -d, -p, -s, and -r.
-b Brief format. Displays the information in a brief format. If used with the -s option, shows the reason why each job was suspended.
-d Only displays information about finished jobs.
-e Only displays information about exited jobs.
-l Long format. Displays additional information. If used with -s, shows the reason why each job was suspended.
If you submitted a job using the OR (||) expression to specify alternative resources, this option displays the successful Execution rusage string with which the job ran.
If you submitted a job with multiple resource requirement strings using the bsub -R option for the order, same, rusage, and select sections, bjobs -l displays a single, merged resource requirement string for those sections, as if they were submitted using a single -R.
bhist -l can display job exit codes. A job with exit code 131 means that the job exceeded a configured resource usage limit and LSF killed the job with signal 3 (131-128=3).
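The arithmetic in the exit code example (131-128=3) can be written as a small Python sketch. This is an illustration only: the function name signal_from_exit_code is hypothetical, and treating every exit code above 128 as "128 plus a signal number" is an assumption based on the example, not a statement about every code that LSF reports.

```python
def signal_from_exit_code(exit_code):
    """Return the signal number implied by an exit code, or None.

    Assumption: codes above 128 encode 128 + signal number, as in the
    documented example where exit code 131 corresponds to signal 3.
    """
    if exit_code > 128:
        return exit_code - 128
    return None  # the code does not indicate termination by a signal
```

Applied to the documented case, an exit code of 131 yields signal 3.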
bhist -l can display changes to pending jobs as a result of the following bmod
options:
Absolute priority scheduling (-aps | -apsn)
Runtime estimate (-We | -Wen)
Post-execution command (-Ep | -Epn)
User limits (-ul | -uln)
Current working directory (-cwd | -cwdn)
Checkpoint options (-k | -kn)
Migration threshold (-mig | -mign)
-p Only displays information about pending jobs.
-r Only displays information about running jobs.
-s Only displays information about suspended jobs.
-t Displays job events chronologically.
-w Wide format. Displays the information in a wide format.
-C start_time,end_time
Only displays jobs that completed or exited during the specified time interval. Specify the span of time for which you want to display the history. If you do not specify a start time, the start time is assumed to be the time of the first occurrence. If you do not specify an end time, the end time is assumed to be now.
Specify the times in the format "yyyy/mm/dd/HH:MM". Do not specify spaces in the time interval string.
The time interval can be specified in many ways. For more specific syntax and examples of time formats, see Time Interval Format.
-D start_time,end_time
Only displays jobs dispatched during the specified time interval. Specify the span of time for which you want to display the history. If you do not specify a start time, the start time is assumed to be the time of the first occurrence. If you do not specify an end time, the end time is assumed to be now.
Specify the times in the format "yyyy/mm/dd/HH:MM". Do not specify spaces in the time interval string.
The time interval can be specified in many ways. For more specific syntax and examples of time formats, see Time Interval Format.
-S start_time,end_time
Only displays information about jobs submitted during the specified time interval. Specify the span of time for which you want to display the history. If you do not specify a start time, the start time is assumed to be the time of the first occurrence. If you do not specify an end time, the end time is assumed to be now.
Specify the times in the format "yyyy/mm/dd/HH:MM". Do not specify spaces in the time interval string.
The time interval can be specified in many ways. For more specific syntax and examples of time formats, see Time Interval Format.
-T start_time,end_time
Used together with -t.
Only displays information about job events within the specified time interval. Specify the span of time for which you want to display the history. If you do not specify a start time, the start time is assumed to be the time of the first occurrence. If you do not specify an end time, the end time is assumed to be now.
Specify the times in the format "yyyy/mm/dd/HH:MM". Do not specify spaces in the time interval string.
The time interval can be specified in many ways. For more specific syntax and examples of time formats, see Time Interval Format.
-f logfile_name Searches the specified event log. Specify either an absolute or a relative path.
Useful for analysis directly on the file.
The specified file path can contain up to 4094 characters for UNIX, or up to 255 characters for Windows.
-J job_name Only displays the jobs that have the specified job name.
The specified job name can contain up to 4094 characters for UNIX, or up to 255 characters for Windows.
-Lp ls_project_name Only displays information about jobs belonging to the specified License Scheduler
project.
-m host_name Only displays jobs dispatched to the specified host.
-n number_logfiles | -n 0
Searches the specified number of event logs, starting with the current event log and working through the most recent consecutively numbered logs. The maximum number of logs you can search is 100. Specify 0 to specify all the event log files in
$(LSB_SHAREDIR)/cluster_name/logdir (up to a maximum of 100 files).
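The search order for -n, starting with the current event log and continuing through consecutively numbered logs, can be sketched in Python. This is not LSF code: the function name event_logs_to_search and the existing argument (the set of log file names present in logdir) are hypothetical. The sketch stops at the first gap in the numbering, since older files are not reachable once the sequence is broken.

```python
def event_logs_to_search(number_logfiles, existing, max_logs=100):
    """Return the event log names that would be searched, newest first.

    Hypothetical helper: 0 means "all logs", capped at max_logs files.
    """
    if number_logfiles == 0:
        limit = max_logs
    else:
        limit = min(number_logfiles, max_logs)
    names = []
    for i in range(limit):
        # The current log has no suffix; older logs are numbered from 1.
        name = "lsb.events" if i == 0 else "lsb.events.%d" % i
        if name not in existing:
            break  # a missing file breaks the consecutive numbering
        names.append(name)
    return names
```

For example, with lsb.events through lsb.events.3 present, a value of 3 selects the first three files. If lsb.events.2 is missing, only lsb.events and lsb.events.1 are selected for any value of 3 or more.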
If you delete a file, you break the consecutive numbering, and older files are inaccessible to bhist.
For example, if you specify 3, LSF searches lsb.events, lsb.events.1, and lsb.events.2. If you specify 4, LSF searches lsb.events, lsb.events.1, lsb.events.2, and lsb.events.3. However, if lsb.events.2 is missing, both searches include only lsb.events and lsb.events.1.
-N host_name | -N host_model | -N cpu_factor
Normalizes CPU time by the specified CPU factor, or by the CPU factor of the specified host or host model.
If you use bhist directly on an event log, you must specify a CPU factor.
Use lsinfo to get host model and CPU factor information.
-P project_name Only displays information about jobs belonging to the specified project.
-q queue_name Only displays information about jobs submitted to the specified queue.
-u user_name | -u all Displays information about jobs submitted by the specified user, or by all users if the keyword all is specified. To specify a Windows user account, include the domain name in uppercase letters and use a single backslash (DOMAIN_NAME\user_name) in a Windows command line or a double backslash (DOMAIN_NAME\\user_name) in a UNIX command line.
job_ID | "job_ID[index]"
Searches all event log files and only displays information about the specified jobs. If you specify a job array, displays all elements chronologically.
This option overrides all other options except -J, -N, -h, and -V. When it is used with -J, only those jobs listed here that have the specified job name are displayed.
-h Prints command usage to stderr and exits.
-V Prints LSF release version to stderr and exits.
Output
Default format
Statistics of the amount of time that a job has spent in various states:
PEND The total waiting time excluding user suspended time before the job is dispatched.
PSUSP The total user suspended time of a pending job.
RUN The total run time of the job.
USUSP The total user suspended time after the job is dispatched.
SSUSP The total system suspended time after the job is dispatched.
UNKWN The total unknown time of the job (job status becomes unknown if sbatchd on the
execution host is temporarily unreachable).
TOTAL The total time that the job has spent in all states; for a finished job, it is the
turnaround time (that is, the time interval from job submission to job completion).
Long format (-l)
The -l option displays a long format listing with the following additional fields:
Project The project the job was submitted from.
Application Profile The application profile the job was submitted to.
Command The job command.
Detailed history includes job group modification, the date and time the job was forwarded and the name of the cluster to which the job was forwarded.
The displayed job command can contain up to 4094 characters for UNIX, or up to 255 characters for Windows.
Initial checkpoint period
The initial checkpoint period specified at the job level, by bsub -k, or in an application profile with CHKPNT_INITPERIOD.
Checkpoint period The checkpoint period specified at the job level, by bsub -k, in the queue with CHKPNT, or in an application profile with CHKPNT_PERIOD.
Checkpoint directory
The checkpoint directory specified at the job level, by bsub -k, in the queue with CHKPNT, or in an application profile with CHKPNT_DIR.
Migration threshold
The migration threshold specified at the job level, by bsub -mig.

Files

Reads lsb.events

See also

lsb.events, bgadd, bgdel, bjgroup, bsub, bjobs, lsinfo

Time Interval Format

You use the time interval to define a start and end time for collecting the data to be retrieved and displayed. While you can specify both a start and an end time, you can also let one of the values default. You can specify either of the times as an absolute time, by specifying the date or time, or you can specify them relative to the current time.
Specify the time interval as follows:
start_time,end_time|start_time,|,end_time|start_time
Specify start_time or end_time in the following format:
[year/][month/][day][/hour:minute|/hour:]|.|.-relative_int
Where:
year is a four-digit number representing the calendar year.
month is a number from 1 to 12, where 1 is January and 12 is December.
day is a number from 1 to 31, representing the day of the month.
hour is an integer from 0 to 23, representing the hour of the day on a 24-hour
clock.
minute is an integer from 0 to 59, representing the minute of the hour.
. (period) represents the current month/day/hour:minute.
.-relative_int is a number, from 1 to 31, specifying a relative start or end time
prior to now.
start_time,end_time
Specifies both the start and end times of the interval.
start_time,
Specifies a start time, and lets the end time default to now.
,end_time
Specifies to start with the first logged occurrence, and end at the time specified.
start_time
Starts at the beginning of the most specific time period specified, and ends at the maximum value of the time period specified. For example, 2/ specifies the month of February: start February 1 at 00:00 a.m. and end at the last possible minute in February, February 28th at midnight.
Absolute Time Examples
Assume the current time is May 9 17:06 2008:
1,8 = May 1 00:00 2008 to May 8 23:59 2008
,4 = the time of the first occurrence to May 4 23:59 2008
6 = May 6 00:00 2008 to May 6 23:59 2008
2/ = Feb 1 00:00 2008 to Feb 28 23:59 2008
/12: = May 9 12:00 2008 to May 9 12:59 2008
2/1 = Feb 1 00:00 2008 to Feb 1 23:59 2008
2/1, = Feb 1 00:00 to the current time
,. = the time of the first occurrence to the current time
,2/10: = the time of the first occurrence to May 2 10:59 2008
2001/12/31,2008/5/1 = from Dec 31, 2001 00:00:00 to May 1st 2008 23:59:59
Relative Time Examples
.-9, = April 30 17:06 2008 to the current time
,.-2/ = the time of the first occurrence to Mar 7 17:06 2008
.-9,.-2 = nine days ago to two days ago (April 30, 2008 17:06 to May 7, 2008 17:06)
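The day-based relative forms in the examples above can be checked with a small Python sketch. It is a partial illustration, not LSF's parser: the names resolve_point and resolve_interval are hypothetical, only the "." and ".-relative_int" (days before now) tokens and empty defaults are handled, and absolute times or month-relative forms such as .-2/ are rejected with ValueError.

```python
from datetime import datetime, timedelta


def resolve_point(token, now):
    """Resolve '.', '.-N' (N days before now), or '' (use the default)."""
    if token == "":
        return None  # caller applies the default for this side
    if token == ".":
        return now
    if token.startswith(".-"):
        days = int(token[2:])  # raises ValueError for forms like '.-2/'
        if not 1 <= days <= 31:
            raise ValueError("relative_int must be from 1 to 31")
        return now - timedelta(days=days)
    raise ValueError("only relative forms are handled in this sketch")


def resolve_interval(spec, now):
    """Return (start, end); a start of None means the first occurrence."""
    start_token, comma, end_token = spec.partition(",")
    if not comma:
        raise ValueError("this sketch requires a comma in the interval")
    start = resolve_point(start_token, now)
    end = resolve_point(end_token, now)
    if end is None:
        end = now  # an omitted end time defaults to now
    return start, end
```

With now set to May 9 17:06 2008, the interval .-9,.-2 resolves to April 30 17:06 through May 7 17:06, matching the last relative example.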

bhosts

displays hosts and their static and dynamic resources

Synopsis

bhosts [-e | -l | -w] [-x] [-X] [-R "res_req"] [host_name | host_group] ...
bhosts [-e | -l | -w] [-X] [-R "res_req"] [cluster_name]
bhosts [-e ] -s [resource_name ...]
bhosts [-h | -V]

Description

By default, returns the following information about all hosts: host name, host status, job state statistics, and job slot limits.
bhosts displays output for condensed host groups. These host groups are defined by CONDENSE in the HostGroup section of lsb.hosts. These host groups are displayed as a single entry with the name as defined by GROUP_NAME in the HostGroup section of lsb.hosts.
The -l and -X options display uncondensed output.
The -s option displays information about the numeric resources (shared or host-based) and their associated hosts.
With MultiCluster, displays the information about hosts available to the local cluster. Use -e to view information about exported hosts.

Options

-e MultiCluster only. Displays information about resources that have been exported to
another cluster.
-l Displays host information in a (long) multi-line format. In addition to the default fields, displays information about the CPU factor, the current load, and the load thresholds. Also displays information about the dispatch windows.
If you specified an administrator comment with the -C option of the host control commands hclose or hopen, -l displays the comment text.
-w Displays host information in wide format. Fields are displayed without truncation.
For condensed host groups, the -w option displays the overall status and the number of hosts with the ok, unavail, unreach, and busy status in the following format:
host_group_status num_ok/num_unavail/num_unreach/num_busy
where
host_group_status is the overall status of the host group. If a single host in the host group is ok, the overall status is also ok.
num_ok, num_unavail, num_unreach, and num_busy are the number of hosts that are ok, unavail, unreach, and busy, respectively.
For example, if there are five ok, two unavail, one unreach, and three busy hosts in a condensed host group hg1, its status is displayed as the following:
hg1 ok 5/2/1/3
If any hosts in the host group are closed, the status for the host group is displayed as closed, with no status for the other states:
hg1 closed
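The condensed status format shown for -w can be split into its parts by a script. The following Python sketch is illustrative only: the function and field names are hypothetical, and it assumes the line contains just the host group name, the overall status, and the optional count field, as in the two examples above.

```python
from typing import Dict, NamedTuple, Optional


class GroupStatus(NamedTuple):
    name: str                         # host group name, e.g. "hg1"
    status: str                       # overall status, e.g. "ok" or "closed"
    counts: Optional[Dict[str, int]]  # per-state host counts, or None if absent


def parse_condensed_status(line: str) -> GroupStatus:
    """Parse 'hg1 ok 5/2/1/3' or 'hg1 closed' (hypothetical helper)."""
    parts = line.split()
    name, status = parts[0], parts[1]
    counts = None
    if len(parts) > 2:
        # Order follows the documented format:
        # num_ok/num_unavail/num_unreach/num_busy
        ok, unavail, unreach, busy = (int(n) for n in parts[2].split("/"))
        counts = {"ok": ok, "unavail": unavail, "unreach": unreach, "busy": busy}
    return GroupStatus(name, status, counts)
```

A closed host group carries no count field, so counts is None in that case.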
-x Displays hosts whose job exit rate has exceeded the threshold configured by EXIT_RATE in lsb.hosts for longer than JOB_EXIT_RATE_DURATION configured in lsb.params, and are still high. By default, these hosts are closed the next time LSF checks host exceptions and invokes eadmin.
Use with the -l option to show detailed information about host exceptions.
If no hosts exceed the job exit rate, bhosts -x displays:
There is no exceptional host found
-X Displays uncondensed output for host groups.
-R "res_req" Only displays information about hosts that satisfy the resource requirement
expression. For more information about resource requirements, see Administering Platform LSF. The size of the resource requirement string is limited to 512 bytes.
LSF supports ordering of resource requirements on all load indices, including external load indices, either static or dynamic.
-s [resource_name ...]
Displays information about the specified resources (shared or host-based). The resources must have numeric values. Returns the following information: the resource names, the total and reserved amounts, and the resource locations.
bhosts -s only shows consumable resources.
When LOCAL_TO is configured for a license feature in lsf.licensescheduler, bhosts -s shows different resource information depending on the cluster locality of the features. For example:
From clusterA:
bhosts -s
RESOURCE TOTAL RESERVED LOCATION
hspice 36.0 0.0 host1
From clusterB in siteB:
bhosts -s
RESOURCE TOTAL RESERVED LOCATION
hspice 76.0 0.0 host2
host_name ... | host_group ...
Only displays information about the specified hosts. Do not use quotes when specifying multiple hosts.
For host groups, the names of the hosts belonging to the group are displayed instead of the name of the host group. Do not use quotes when specifying multiple host groups.
cluster_name MultiCluster only. Displays information about hosts in the specified cluster.
-h Prints command usage to stderr and exits.
-V Prints LSF release version to stderr and exits.

Output

Host-Based Default
Displays the following fields:
HOST_NAME The name of the host. If a host has batch jobs running and the host is removed from the configuration, the host name is displayed as lost_and_found.
For condensed host groups, this is the name of host group.
STATUS With MultiCluster, not shown for fully exported hosts.
The current status of the host and the sbatchd daemon. Batch jobs can only be dispatched to hosts with an ok status. The possible values for host status are as follows:
ok
The host is available to accept batch jobs.
For condensed host groups, if a single host in the host group is ok, the overall status is also shown as ok.
If any host in the host group is not ok, bhosts displays the first host status it encounters as the overall status for the condensed host group. Use bhosts -X to see the status of individual hosts in the host group.
unavail
The host is down, or LIM and sbatchd on the host are unreachable.
unreach
LIM on the host is running but sbatchd is unreachable.
closed
The host is not allowed to accept any remote batch jobs. There are several reasons for the host to be closed (see Host-Based -l Options).
unlicensed
The host does not have a valid LSF license.
JL/U With MultiCluster, not shown for fully exported hosts.
The maximum number of job slots that the host can process on a per user basis. If a dash (-) is displayed, there is no limit.
For condensed host groups, this is the total number of job slots that all hosts in the host group can process on a per user basis.
The host does not allocate more than JL/U job slots for one user at the same time. These job slots are used by running jobs, as well as by suspended or pending jobs that have slots reserved for them.
For preemptive scheduling, the accounting is different. These job slots are used by running jobs and by pending jobs that have slots reserved for them (see the description of PREEMPTIVE in
lsb.queues(5) and JL/U in lsb.hosts(5)).
MAX The maximum number of job slots available. If a dash (-) is displayed, there is no
limit.
For condensed host groups, this is the total maximum number of job slots available in all hosts in the host group.
These job slots are used by running jobs, as well as by suspended or pending jobs that have slots reserved for them.
If preemptive scheduling is used, suspended jobs are not counted (see the description of PREEMPTIVE in lsb.queues(5) and MXJ in lsb.hosts(5)).
A host does not always have to allocate this many job slots if there are waiting jobs; the host must also satisfy its configured load conditions to accept more jobs.
NJOBS The number of job slots used by jobs dispatched to the host. This includes running,
suspended, and chunk jobs.
For condensed host groups, this is the total number of job slots used by jobs dispatched to any host in the host group.
RUN The number of job slots used by jobs running on the host.
For condensed host groups, this is the total number of job slots used by jobs running on any host in the host group.
SSUSP The number of job slots used by system suspended jobs on the host.
For condensed host groups, this is the total number of job slots used by system suspended jobs on any host in the host group.
USUSP The number of job slots used by user suspended jobs on the host. Jobs can be
suspended by the user or by the LSF administrator.
For condensed host groups, this is the total number of job slots used by user suspended jobs on any host in the host group.
RSV The number of job slots used by pending jobs that have job slots reserved on the host.
For condensed host groups, this is the total number of job slots used by pending jobs that have job slots reserved on any host in the host group.
Host-Based -l Option
In addition to the above fields, the -l option also displays the following:
loadSched, loadStop
The scheduling and suspending thresholds for the host. If a threshold is not defined, the threshold from the queue definition applies. If both the host and the queue define a threshold for a load index, the most restrictive threshold is used.
The migration threshold is the time that a job dispatched to this host can remain suspended by the system before LSF attempts to migrate the job to another host.
If the host’s operating system supports checkpoint copy, this is indicated here. With checkpoint copy, the operating system automatically copies all open files to the checkpoint directory when a process is checkpointed. Checkpoint copy is currently supported only on Cray systems.
STATUS The long format shown by the -l option gives the possible reasons for a host to be
closed:
closed_Adm
The host is closed by the LSF administrator or root (see badmin(8)). No job can be dispatched to the host, but jobs that are executing on the host are not affected.
closed_Lock
The host is locked by the LSF administrator or root (see lsadmin(8)). All batch jobs on the host are suspended by LSF.
closed_Wind
The host is closed by its dispatch windows, which are defined in the configuration file
lsb.hosts(5). Jobs already started are not affected by the dispatch windows.
closed_Full
The configured maximum number of batch job slots on the host has been reached (see the MAX field).
closed_Excl
The host is currently running an exclusive job.
closed_Busy
The host is overloaded, because some load indices go beyond the configured thresholds (see
lsb.hosts(5)). The displayed thresholds that cause the host to be
busy are preceded by an asterisk (*).
closed_LIM
LIM on the host is unreachable, but sbatchd is ok.
closed_EGO
For EGO-enabled SLA scheduling, host is closed because it has not been allocated by EGO to run LSF jobs. Hosts allocated from EGO display status
ok.
CPUF Displays the CPU normalization factor of the host (see lshosts(1)).
DISPATCH_WINDOW
Displays the dispatch windows for each host. Dispatch windows are the time windows during the week when batch jobs can be run on each host. Jobs already started are not affected by the dispatch windows. When the dispatch windows close, jobs are not suspended. Jobs already running continue to run, but no new jobs are started until the windows reopen. The default for the dispatch window is no restriction or always open (that is, twenty-four hours a day and seven days a week). For the dispatch window specification, see the description for the DISPATCH_WINDOWS keyword under the
-l option in bqueues(1).
CURRENT LOAD Displays the total and reserved host load.
Reserved
You specify reserved resources by using bsub -R. These resources are reserved by jobs running on the host.
Total
The total load has different meanings depending on whether the load index is increasing or decreasing.
For increasing load indices, such as run queue lengths, CPU utilization, paging activity, logins, and disk I/O, the total load is the consumed plus the reserved amount. The total load is calculated as the sum of the current load and the reserved load. The current load is the load seen by
lsload(1).
For decreasing load indices, such as available memory, idle time, available swap space, and available space in tmp, the total load is the available amount. The total load is the difference between the current load and the reserved load. This difference is the available resource as seen by
lsload(1).
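The two cases for the total load can be written as a small calculation. The Python sketch below is a minimal illustration of the arithmetic described above; the function name and the sample numbers are hypothetical and are not taken from bhosts output.

```python
def total_load(current: float, reserved: float, increasing: bool) -> float:
    """Return the total load for one load index.

    current    -- the current load, as seen by lsload
    reserved   -- the amount reserved by jobs (bsub -R)
    increasing -- True for indices such as run queue length or paging rate,
                  False for indices such as available memory or swap space
    """
    if increasing:
        # Consumed plus reserved amount.
        return current + reserved
    # Available amount: what remains after reservations are subtracted.
    return current - reserved


# Hypothetical values: a run queue length of 1.5 with 0.5 reserved,
# and 1024 MB of available memory with 256 MB reserved.
print(total_load(1.5, 0.5, increasing=True))
print(total_load(1024.0, 256.0, increasing=False))
```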
LOAD THRESHOLD Displays the scheduling threshold loadSched and the suspending threshold
loadStop. Also displays the migration threshold if defined and the checkpoint
support if the host supports checkpointing.
The format for the thresholds is the same as for batch job queues (see bqueues(1) and lsb.queues(5)). For an explanation of the thresholds and load indices, see the description for the "QUEUE SCHEDULING PARAMETERS" keyword under the -l option in bqueues(1).
THRESHOLD AND LOAD USED FOR EXCEPTIONS
Displays the configured threshold of EXIT_RATE for the host and its current load value for host exceptions.
ADMIN ACTION COMMENT
If the LSF administrator specified an administrator comment with the -C option of the badmin host control commands hclose or hopen, the comment text is displayed.
Resource-Based -s Option
The -s option displays the following: the amounts used for scheduling, the amounts reserved, and the associated hosts for the resources. Only resources (shared or host-based) with numeric values are displayed. See lim(8) and lsf.cluster(5) on how to configure shared resources.
The following fields are displayed:
RESOURCE The name of the resource.
TOTAL The total amount free of a resource used for scheduling.
RESERVED The amount reserved by jobs. You specify the reserved resource using bsub -R.
LOCATION The hosts that are associated with the resource.

Files

Reads lsb.hosts.

See also

lsb.hosts, bqueues, lshosts, badmin, lsadmin

bhpart
displays information about host partitions

Synopsis

bhpart [-r] [host_partition_name ...]
bhpart [-h | -V]

Description

By default, displays information about all host partitions. Host partitions are used to configure host-partition fairshare scheduling.

Options

-r Displays the entire information tree associated with the host partition recursively.
host_partition_name ...
Displays information about the specified host partitions only.
-h Prints command usage to stderr and exits.
-V Prints LSF release version to stderr and exits.

Output

The following fields are displayed for each host partition:
HOST_PARTITION_NAME
Name of the host partition.
HOSTS
Hosts or host groups that are members of the host partition. The name of a host group is appended by a slash (/) (see bmgroup(1)).
USER/GROUP
Name of users or user groups who have access to the host partition (see bugroup(1)).
SHARES
Number of shares of resources assigned to each user or user group in this host partition, as configured in the file lsb.hosts. The shares affect dynamic user priority for when fairshare scheduling is configured at the host level.
PRIORITY
Dynamic user priority for the user or user group. Larger values represent higher priorities. Jobs belonging to the user or user group with the highest priority are considered first for dispatch.
In general, users or user groups with larger SHARES, fewer STARTED and RESERVED, and a lower CPU_TIME and RUN_TIME have higher PRIORITY.
STARTED
Number of job slots used by running or suspended jobs owned by users or user groups in the host partition.
RESERVED
Number of job slots reserved by the jobs owned by users or user groups in the host partition.
CPU_TIME
Cumulative CPU time used by jobs of users or user groups executed in the host partition. Measured in seconds, to one decimal place.
LSF calculates the cumulative CPU time using the actual (not normalized) CPU time and a decay factor such that 1 hour of recently-used CPU time decays to 0.1 hours after an interval of time specified by HIST_HOURS in lsb.params (5 hours by default).
RUN_TIME
Wall-clock run time plus historical run time of jobs of users or user groups that are executed in the host partition. Measured in seconds.
LSF calculates the historical run time using the actual run time of finished jobs and a decay factor such that 1 hour of recently-used run time decays to 0.1 hours after an interval of time specified by HIST_HOURS in lsb.params (5 hours by default). Wall-clock run time is the run time of running jobs.
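The decay described for CPU_TIME and RUN_TIME can be illustrated numerically. The Python sketch below assumes a smooth exponential decay between the two points this section gives (1 hour now, 0.1 hours after HIST_HOURS). That curve is an assumption made for illustration; the text above only fixes the endpoints, and the function name is hypothetical.

```python
def decayed_hours(hours: float, elapsed_hours: float, hist_hours: float = 5.0) -> float:
    """Decay `hours` of usage over `elapsed_hours`.

    Assumption: exponential decay chosen so that usage shrinks to one
    tenth of its value after each `hist_hours` interval (HIST_HOURS,
    5 hours by default).
    """
    return hours * 0.1 ** (elapsed_hours / hist_hours)


# One hour of CPU time, examined at several ages (hours).
for age in (0.0, 2.5, 5.0, 10.0):
    print(age, round(decayed_hours(1.0, age), 4))
```

Under this assumed curve, usage that is two HIST_HOURS intervals old contributes about one hundredth of its original value.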

Files

Reads lsb.hosts.

See also

bugroup(1), bmgroup(1), lsb.hosts(5)

bgmod
modifies job groups

Synopsis

bgmod [-L limit | -Ln] job_group_name
bgmod [-h | -V]

Description

Modifies the job group with the job group name specified by job_group_name.
Only root, LSF administrators, the job group creator, or the creator of the parent job groups can use bgmod to modify a job group limit.
You must provide the full group path name for the modified job group. The last component of the path is the name of the job group to be modified.

Options

-L limit Changes the limit of job_group_name to the specified limit value. If the job group has parent job groups, the new limit cannot exceed the limits of any higher level job groups. Similarly, if the job group has child job groups, the new value must be greater than any limits on the lower level job groups.
limit specifies the maximum number of concurrent jobs allowed to run under the job group (including child groups). -L limits the number of started jobs (RUN, SSUSP, USUSP) under the job group.
Specify a positive number between 0 and 2147483647. If the specified limit is zero (0), no jobs under the job group can run.
You cannot specify a limit for the root job group. The root job group has no job limit. The -L option only limits the lowest level job group specified.
If a parallel job requests 2 CPUs (bsub -n 2), the job group limit is per job, not per slots used by the job.
-Ln Removes the existing job limit for the job group. If the job group has parent job groups, the modified job group automatically inherits any limits from its direct parent job group.
job_group_name Full path of the job group name.
-h Prints command usage to stderr and exits.
-V Prints LSF release version to stderr and exits.

Examples

The following command only modifies the limit of group /canada/projects/test1. It does not modify limits of /canada or /canada/projects.
bgmod -L 6 /canada/projects/test1
To modify limits of /canada or /canada/projects, you must specify the exact group name:
bgmod -L 6 /canada
or
bgmod -L 6 /canada/projects

See also

bgadd, bgdel, bjgroup

bjgroup

bjgroup
displays information about job groups

Synopsis

bjgroup [-N] [-s [group_name]]
bjgroup [-h | -V]

Description

Displays job group information.

Options

-s Sorts job groups by group hierarchy.
For example, for job groups named /A, /A/B, /X and /X/Y, bjgroup without -s displays:
bjgroup
GROUP_NAME NJOBS PEND RUN SSUSP USUSP FINISH SLA JLIMIT OWNER
/A 0 0 0 0 0 0 () 0/10 user1
/X 0 0 0 0 0 0 () 0/- user2
/A/B 0 0 0 0 0 0 () 0/5 user1
/X/Y 0 0 0 0 0 0 () 0/5 user2
For the same job groups, bjgroup -s displays:
bjgroup -s
GROUP_NAME NJOBS PEND RUN SSUSP USUSP FINISH SLA JLIMIT OWNER
/A 0 0 0 0 0 0 () 0/10 user1
/A/B 0 0 0 0 0 0 () 0/5 user1
/X 0 0 0 0 0 0 () 0/- user2
/X/Y 0 0 0 0 0 0 () 0/5 user2
Specify a job group name to show the hierarchy of a single job group:
bjgroup -s /X
GROUP_NAME NJOBS PEND RUN SSUSP USUSP FINISH SLA JLIMIT OWNER
/X 25 0 25 0 0 0 puccini 25/100 user1
/X/Y 20 0 20 0 0 0 puccini 20/30 user1
/X/Z 5 0 5 0 0 0 puccini 5/10 user2
Specify a job group name with a trailing slash character (/) to show only the root job group:
bjgroup -s /X/
GROUP_NAME NJOBS PEND RUN SSUSP USUSP FINISH SLA JLIMIT OWNER
/X 25 0 25 0 0 0 puccini 25/100 user1
-N Displays job group information by job slots instead of number of jobs. NSLOTS, PEND, RUN, SSUSP, USUSP, RSV are all counted in slots rather than number of jobs:
bjgroup -N
GROUP_NAME NSLOTS PEND RUN SSUSP USUSP RSV SLA OWNER
/X 25 0 25 0 0 0 puccini user1
/A/B 20 0 20 0 0 0 wagner batch
-N by itself shows job slot info for all job groups, and can combine with -s to sort the job groups by hierarchy:
bjgroup -N -s
GROUP_NAME NSLOTS PEND RUN SSUSP USUSP RSV SLA OWNER
/A 0 0 0 0 0 0 wagner batch
/A/B 0 0 0 0 0 0 wagner user1
/X 25 0 25 0 0 0 puccini user1
/X/Y 20 0 20 0 0 0 puccini batch
/X/Z 5 0 5 0 0 0 puccini batch
-h Prints command usage to stderr and exits.
-V Prints LSF release version to stderr and exits.
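The hierarchy ordering produced by the -s option can be reproduced on a plain list of group paths. The Python sketch below is only an illustration: the function name is hypothetical, and it assumes that sibling groups are ordered alphabetically, which matches the /A, /A/B, /X, /X/Y example but is not stated by this reference.

```python
from typing import List


def sort_by_hierarchy(groups: List[str]) -> List[str]:
    """Order job group paths so that each parent is followed by its children."""
    # Comparing the lists of path components places "/A" before "/A/B"
    # and keeps every subtree together.
    return sorted(groups, key=lambda path: path.strip("/").split("/"))


unsorted_groups = ["/A", "/X", "/A/B", "/X/Y"]
print(sort_by_hierarchy(unsorted_groups))
```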

Default output

A list of job groups is displayed with the following fields:
GROUP_NAME
The name of the job group.
NJOBS
The current number of jobs in the job group. A parallel job is counted as 1 job, regardless of the number of job slots it uses.
PEND
The number of pending jobs in the job group.
RUN
The number of running jobs in the job group.
SSUSP
The number of system-suspended jobs in the job group.
USUSP
The number of user-suspended jobs in the job group.
FINISH
The number of jobs in the specified job group in EXITED or DONE state.
SLA
The name of the service class that the job group is attached to with bgadd -sla service_class_name. If the job group is not attached to any service class, empty parentheses () are displayed in the SLA name column.
JLIMIT
The job group limit set by bgadd -L or bgmod -L. Job groups that have no configured limits or no limit usage are indicated by a dash (-). Job group limits are displayed in a USED/LIMIT format. For example, if a limit of 5 jobs is configured and 1 job is started, bjgroup displays the job limit under JLIMIT as 1/5.
OWNER
The job group owner.
Example
bjgroup
GROUP_NAME NJOBS PEND RUN SSUSP USUSP FINISH SLA JLIMIT OWNER
/fund1_grp 5 4 0 1 0 0 Venezia 1/5 user1
/fund2_grp 11 2 5 0 0 4 Venezia 5/5 user1
/bond_grp 2 2 0 0 0 0 Venezia 0/- user2
/risk_grp 2 1 1 0 0 0 () 1/- user2
/admi_grp 4 4 0 0 0 0 () 0/- user2
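The USED/LIMIT values in the JLIMIT column of the example above can be split by a script. The Python sketch below is illustrative: the function name is hypothetical, and it treats a dash on either side as "no value" (returned as None), following the description of the dash under JLIMIT.

```python
from typing import Optional, Tuple


def parse_jlimit(field: str) -> Tuple[Optional[int], Optional[int]]:
    """Split a JLIMIT value such as '1/5' or '0/-' into (used, limit)."""
    used_text, limit_text = field.split("/")
    # A dash means no usage or no configured limit; map it to None.
    used = None if used_text == "-" else int(used_text)
    limit = None if limit_text == "-" else int(limit_text)
    return used, limit


for value in ("1/5", "5/5", "0/-"):
    print(value, parse_jlimit(value))
```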
Job slots (-N) output
NSLOTS, PEND, RUN, SSUSP, USUSP, RSV are all counted in slots rather than number of jobs. A list of job groups is displayed with the following fields:
GROUP_NAME
The name of the job group.
NSLOTS
The total number of job slots held currently by jobs in the job group. This includes pending, running, suspended and reserved job slots. A parallel job that is running on n processors is counted as n job slots, since it takes n job slots in the job group.
PEND
The number of job slots used by pending jobs in the job group.
RUN
The number of job slots used by running jobs in the job group.
SSUSP
The number of job slots used by system-suspended jobs in the job group.
USUSP
The number of job slots used by user-suspended jobs in the job group.
RSV
The number of job slots in the job group that are reserved by LSF for pending jobs.
SLA
The name of the service class that the job group is attached to with
bgadd -sla service_class_name. If the job group is not attached to any service class,
empty parentheses
() are displayed in the SLA name column.
OWNER
The job group owner.
Example
bjgroup -N
GROUP_NAME NSLOTS PEND RUN SSUSP USUSP RSV SLA OWNER
/X 25 0 25 0 0 0 puccini user1
/A/B 20 0 20 0 0 0 wagner batch

See also

bgadd, bgdel, bgmod

bjobs
displays information about LSF jobs

Synopsis

bjobs [-A] [-a] [-W] [-w | -l] [-X] [-x]
[-app application_profile_name] [-g job_group_name] [-sla service_class_name] [-J job_name] [-Lp ls_project_name] [-m host_name | -m host_group | -m cluster_name] [-N host_name | -N host_model | -N cpu_factor] [-P project_name] [-q queue_name] [-u user_name | -u user_group | -u all | -G user_group]
job_ID | "job_ID[index_list]" ...
bjobs [-A] [-d] [-p] [-r] [-s] [-W] [-w | -l] [-X] [-x]
[-app application_profile_name] [-g job_group_name] [-sla service_class_name] [-J job_name] [-Lp ls_project_name] [-m host_name | -m host_group | -m cluster_name] [-N host_name | -N host_model | -N cpu_factor] [-P project_name] [-q queue_name] [-u user_name | -u user_group | -u all | -G user_group]
job_ID | "job_ID[index_list]" ...
bjobs [-w | -l | -aps] [-A] [-a] [-d] [-p] [-s] [-r] [-X] [-x]
[-m host_name] [-q queue_name] [-u user_name | -u user_group | -u all | -G user_group]
[-g job_group] [-sla service_class] [-P project_name] [-N host_spec] [-Lp license_project] [-app application_profile] [-J name_spec] [job_ID | "job_ID[index_list]" ...]
bjobs [-h | -V]

Description

By default, displays information about your own pending, running and suspended jobs.
bjobs displays output for condensed host groups. These host groups are defined by CONDENSE in the HostGroup section of lsb.hosts. These host groups are displayed as a single entry with the name as defined by GROUP_NAME in the HostGroup section of lsb.hosts. The -l and -X options display uncondensed output.
If you defined LSB_SHORT_HOSTLIST=1 in lsf.conf, parallel jobs running in the same condensed host group are displayed as an abbreviated list.
To display older historical information, use bhist.

Options

-A Displays summarized information about job arrays. If you specify job arrays with the job array ID, and also specify -A, do not include the index list with the job array ID.
You can use -w to show the full array specification, if necessary.
-a Displays information about jobs in all states, including finished jobs that finished recently, within an interval specified by CLEAN_PERIOD in lsb.params (the default period is 1 hour).
Use -a with -x option to display all jobs that have triggered a job exception (overrun, underrun, idle).
-aps Displays absolute priority scheduling (APS) information for pending jobs in a
queue with APS_PRIORITY enabled. The APS value is calculated based on the current scheduling cycle, so jobs are not guaranteed to be dispatched in this order.
Pending jobs are ordered by APS value. Jobs with system APS values are listed first, from highest to lowest APS value. Jobs with calculated APS values are listed next ordered from high to low value. Finally, jobs not in an APS queue are listed. Jobs with equal APS values are listed in order of submission time. APS values of jobs not in an APS queue are shown with a dash (-).
If queues are configured with the same priority, bjobs -aps may not show jobs in the correct expected dispatch order. Jobs may be dispatched in the order the queues are configured in lsb.queues. You should avoid configuring queues with the same priority.
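The listing order described for -aps can be expressed as a sort key. The Python sketch below is a hedged illustration, not LSF's implementation: the PendingJob record and its field names are hypothetical, and ordering the jobs outside an APS queue by submission time is an assumption, since the text above only says that they are listed last.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class PendingJob:
    job_id: int
    submit_time: float                      # e.g. seconds since the epoch
    system_aps: Optional[float] = None      # set when a system APS value exists
    calculated_aps: Optional[float] = None  # set for jobs in an APS queue


def aps_sort_key(job: PendingJob) -> Tuple[int, float, float]:
    # Tier 0: system APS values, highest first.
    if job.system_aps is not None:
        return (0, -job.system_aps, job.submit_time)
    # Tier 1: calculated APS values, highest first.
    if job.calculated_aps is not None:
        return (1, -job.calculated_aps, job.submit_time)
    # Tier 2: jobs not in an APS queue (tie order is an assumption).
    return (2, 0.0, job.submit_time)


def order_for_display(jobs: List[PendingJob]) -> List[PendingJob]:
    """Return jobs in the order described for bjobs -aps."""
    return sorted(jobs, key=aps_sort_key)
```

Within a tier, equal APS values fall back to submission time, as described above.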
-d Displays information about jobs that finished recently, within an interval specified by CLEAN_PERIOD in lsb.params (the default period is 1 hour).
-l Long format. Displays detailed information for each job in a multiline format.
The -l option displays the following additional information: project name, job command, current working directory on the submission host, initial checkpoint period, checkpoint directory, migration threshold, pending and suspending reasons, job status, resource usage, resource usage limits information, runtime resource usage information on the execution hosts.
Use bjobs -A -l to display detailed information for job arrays including job array job limit (%job_limit) if set.
If JOB_IDLE is configured in the queue, use bjobs -l to display job idle exception information.
If you submitted your job with the -U option to use advance reservations created with the brsvadd command, bjobs -l shows the reservation ID used by the job.
If LSF_HPC_EXTENSIONS="SHORT_PIDLIST" is specified in lsf.conf, the output from bjobs is shortened to display only the first PID and a count of the process group IDs (PGIDs) and process IDs for the job. Without SHORT_PIDLIST, all of the process IDs (PIDs) for a job are displayed.
If you submitted a job with multiple resource requirement strings using the bsub -R option for the order, same, rusage, and select sections, bjobs -l displays a single, merged resource requirement string for those sections, as if they were submitted using a single -R.
If you submitted a job using the OR (||) expression to specify alternative resources, this option displays the Execution rusage string with which the job runs.
For jobs submitted to an absolute priority scheduling (APS) queue, -l shows the ADMIN factor value and the system APS value if they have been set by the administrator for the job.
-p Displays pending jobs, together with the pending reasons that caused each job not to be dispatched during the last dispatch turn. The pending reason shows the number of hosts for that reason, or names the hosts if -l is also specified.
With MultiCluster, -l shows the names of hosts in the local cluster.
Each pending reason is associated with one or more hosts and states why these hosts are not allocated to run the job. In situations where the job requests specific hosts (using bsub -m), users may see reasons for unrelated hosts also being displayed, together with the reasons associated with the requested hosts.
The life cycle of a pending reason ends after the time indicated by PEND_REASON_UPDATE_INTERVAL in lsb.params.
When the job slot limit is reached for a job array (bsub -J "jobArray[indexList]%job_slot_limit") the following message is displayed:
The job array has reached its job slot limit.
-r Displays running jobs.
-s Displays suspended jobs, together with the suspending reason that caused each job
to become suspended.
The suspending reason may not remain the same while the job stays suspended. For example, a job may have been suspended due to the paging rate, but after the paging rate dropped another load index could prevent the job from being resumed. The suspending reason is updated according to the load index. The reasons could be as old as the time interval specified by SBD_SLEEP_TIME in
lsb.params. So the
reasons shown may not reflect the current load situation.
-W Provides resource usage information for: PROJ_NAME, CPU_USED, MEM,
SWAP, PIDS, START_TIME, FINISH_TIME.
-w Wide format. Displays job information without truncating fields.
-X Displays uncondensed output for host groups.
-x Displays unfinished jobs that have triggered a job exception (overrun, underrun,
idle). Use with the
-l option to show the actual exception status. Use with -a to
display all jobs that have triggered a job exception.
-app application_profile_name
Displays information about jobs submitted to the specified application profile. You must specify an existing application profile.
-G user_group Only displays jobs associated with a user group submitted with bsub -G for the specified user group. The -G option does not display jobs from subgroups within the specified user group.
The -G option cannot be used together with the -u option. You can only specify a user group name. The keyword all is not supported for -G.
-g job_group_name Displays information about jobs attached to the job group specified by job_group_name. For example:
bjobs -g /risk_group
JOBID USER STAT QUEUE FROM_HOST EXEC_HOST JOB_NAME SUBMIT_TIME
113 user1 PEND normal hostA myjob Jun 17 16:15
111 user2 RUN normal hostA hostA myjob Jun 14 15:13
110 user1 RUN normal hostB hostA myjob Jun 12 05:03
104 user3 RUN normal hostA hostC myjob Jun 11 13:18
Use -g with -sla to display job groups attached to a service class. Once a job group is attached to a service class, all jobs submitted to that group are subject to the SLA.
bjobs -l with -g displays the full path to the group to which a job is attached. For example:
bjobs -l -g /risk_group
Job <101>, User <user1>, Project <default>, Job Group </risk_group>, Status <RUN>, Queue <normal>, Command <myjob>
Tue Jun 17 16:21:49: Submitted from host <hostA>, CWD </home/user1>;
Tue Jun 17 16:22:01: Started on <hostA>;
...
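The first line of the bjobs -l example above is a series of Name <value> pairs, which a script can collect into a dictionary. The Python sketch below is illustrative only: the function name is hypothetical, and the pattern assumes that no value contains a closing angle bracket, which holds for the example but may not hold for every job command.

```python
import re
from typing import Dict

# A label made of letters and spaces, followed by a value in angle brackets.
FIELD_RE = re.compile(r"([A-Za-z][A-Za-z ]*?) <([^>]*)>")


def parse_job_header(line: str) -> Dict[str, str]:
    """Collect 'Name <value>' pairs from the first line of bjobs -l output."""
    return dict(FIELD_RE.findall(line))


header = (
    "Job <101>, User <user1>, Project <default>, Job Group </risk_group>, "
    "Status <RUN>, Queue <normal>, Command <myjob>"
)
fields = parse_job_header(header)
print(fields["Job Group"], fields["Status"])
```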
-J job_name Displays information about the specified jobs or job arrays. Only displays jobs that
were submitted by the user running this command.
The job name can be up to 4094 characters long for UNIX and Linux or up to 255 characters for Windows.
-Lp ls_project_name Displays jobs that belong to the specified LSF License Scheduler project.
-m host_name ... | -m host_group ... | -m cluster_name ...
Only displays jobs dispatched to the specified hosts. To see the available hosts, use bhosts.
If a host group is specified, displays jobs dispatched to all hosts in the group. To determine the available host groups, use bmgroup.
With MultiCluster, displays jobs in the specified cluster. If a remote cluster name is specified, you see the remote job ID, even if the execution host belongs to the local cluster. To determine the available clusters, use bclusters.
-N host_name | -N host_model | -N cpu_factor
Displays the normalized CPU time consumed by the job. Normalizes using the CPU factor specified, or the CPU factor of the host or host model specified.
-P project_name Only displays jobs that belong to the specified project.
-q queue_name Only displays jobs in the specified queue.
The command bqueues returns a list of queues configured in the system, and information about the configurations of these queues.
In MultiCluster, you cannot specify remote queues.
-sla service_class_name
Displays jobs belonging to the specified service class.
bjobs also displays information about jobs assigned to a default SLA configured with ENABLE_DEFAULT_EGO_SLA in lsb.params.
Use -sla with -g to display job groups attached to a service class. Once a job group is attached to a service class, all jobs submitted to that group are subject to the SLA.
Use bsla to display the configuration properties of service classes configured in lsb.serviceclasses, the default SLA configured in lsb.params, and dynamic information about the state of each service class.
-u user_name... | -u user_group... | -u all
Platform LSF Command Reference 69

Output

job_ID | "job_ID[index]"
-h Prints command usage to stderr and exits.
-V Prints LSF release version to stderr and exits.
Output
Only displays jobs that have been submitted by the specified users or user groups. The keyword
all specifies all users. To specify a Windows user account, include the
domain name in uppercase letters and use a single backslash (DOMAIN_NAME\ user_name) in a Windows command line or a double backslash
(DOMAIN_NAME\\user_name) in a UNIX command line.
The
-u option cannot be used with the -G option.
Displays information about the specified jobs or job arrays.
If you use
-A, specify job array IDs without the index list.
Pending jobs are displayed in the order in which they are considered for dispatch. Jobs in higher priority queues are displayed before those in lower priority queues. Pending jobs in the same priority queues are displayed in the order in which they were submitted but this order can be changed by using the commands
bbot. If more than one job is dispatched to a host, the jobs on that host are listed in
btop or
the order in which they are considered for scheduling on this host by their queue priorities and dispatch times. Finished jobs are displayed in the order in which they were completed.
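The ordering rule for pending jobs can be illustrated with a small sort. This is only a sketch: the PendingJob record and its fields are hypothetical names, it assumes that a larger queue priority value means a higher priority queue, and it ignores any reordering done with btop or bbot.

```python
from dataclasses import dataclass


@dataclass
class PendingJob:
    job_id: int
    queue_priority: int  # assumption: larger value = higher priority queue
    submit_time: float   # seconds since the epoch


def display_order(jobs):
    """Higher priority queues first, then earlier submissions first."""
    return sorted(jobs, key=lambda j: (-j.queue_priority, j.submit_time))


jobs = [
    PendingJob(103, queue_priority=30, submit_time=300.0),
    PendingJob(101, queue_priority=30, submit_time=100.0),
    PendingJob(102, queue_priority=40, submit_time=200.0),
]
ordered_ids = [j.job_id for j in display_order(jobs)]
print(ordered_ids)  # [102, 101, 103]
```

Job 102 comes first because its queue has the higher priority; jobs 101 and 103 share a queue priority and fall back to submission order.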
Default Display
A listing of jobs is displayed with the following fields:
JOBID The job ID that LSF assigned to the job.
USER The user who submitted the job.
STAT The current status of the job (see JOB STATUS below).
QUEUE The name of the job queue to which the job belongs. If the queue to which the job belongs has been removed from the configuration, the queue name is displayed as lost_and_found. Use bhist to get the original queue name. Jobs in the lost_and_found queue remain pending until they are switched with the bswitch command into another queue.
In a MultiCluster resource leasing environment, jobs scheduled by the consumer cluster display the remote queue name in the format queue_name@cluster_name. By default, this field truncates at 10 characters, so you might not see the cluster name unless you use -w or -l.
FROM_HOST The name of the host from which the job was submitted.
With MultiCluster, if the host is in a remote cluster, the cluster name and remote job ID are appended to the host name, in the format host_name@cluster_name:job_ID. By default, this field truncates at 11 characters; you might not see the cluster name and job ID unless you use -w or -l.
EXEC_HOST The name of one or more hosts on which the job is executing (this field is empty if the job has not been dispatched). If the host on which the job is running has been removed from the configuration, the host name is displayed as lost_and_found. Use bhist to get the original host name.
If the host is part of a condensed host group, the host name is displayed as the name of the condensed host group.
If you configure a host to belong to more than one condensed host group using wildcards, bjobs can display any of the host groups as execution host name.
JOB_NAME The job name assigned by the user, or the command string assigned by default at job submission with bsub. If the job name is too long to fit in this field, then only the latter part of the job name is displayed.
The displayed job name or job command can contain up to 4094 characters for UNIX, or up to 255 characters for Windows.
SUBMIT_TIME The submission time of the job.
-l output
The -l option displays a long format listing with the following additional fields:
Project The project the job was submitted from.
Application Profile The application profile the job was submitted to.
Command The job command.
CWD The current working directory on the submission host.
Initial checkpoint period
The initial checkpoint period specified at the job level, by bsub -k, or in an application profile with CHKPNT_INITPERIOD.
Checkpoint period The checkpoint period specified at the job level, by bsub -k, in the queue with CHKPNT, or in an application profile with CHKPNT_PERIOD.
Checkpoint directory
The checkpoint directory specified at the job level, by bsub -k, in the queue with CHKPNT, or in an application profile with CHKPNT_DIR.
Migration threshold
The migration threshold specified at the job level, by bsub -mig.
Post-execute Command
The post-execution command specified at the job level, by bsub -Ep.
PENDING REASONS The reason the job is in the PEND or PSUSP state. The names of the hosts associated with each reason are displayed when both -p and -l options are specified.
SUSPENDING REASONS
The reason the job is in the USUSP or SSUSP state.
loadSched
The load scheduling thresholds for the job.
loadStop
The load suspending thresholds for the job.
JOB STATUS Possible values for the status of a job include:
PEND
The job is pending, that is, it has not yet been started.
PSUSP
The job has been suspended, either by its owner or the LSF administrator, while pending.
RUN
The job is currently running.
USUSP
The job has been suspended, either by its owner or the LSF administrator, while running.
SSUSP
The job has been suspended by LSF due to either of the following two causes:
The load conditions on the execution host or hosts have exceeded a threshold according to the loadStop vector defined for the host or queue.
The run window of the job’s queue is closed. See bqueues(1), bhosts(1), and lsb.queues(5).
DONE
The job has terminated with status of 0.
EXIT
The job has terminated with a non-zero status – it may have been aborted due to an error in its execution, or killed by its owner or the LSF administrator.
For example, exit code 131 means that the job exceeded a configured resource usage limit and LSF killed the job.
UNKWN
mbatchd has lost contact with the sbatchd on the host on which the job runs.
WAIT
For jobs submitted to a chunk job queue, members of a chunk job that are waiting to run.
ZOMBI
A job becomes ZOMBI if:
A non-rerunnable job is killed by bkill while the sbatchd on the execution host is unreachable and the job is shown as UNKWN.
The host on which a rerunnable job is running is unavailable and the job has been requeued by LSF with a new job ID, as if the job were submitted as a new job.
After the execution host becomes available, LSF tries to kill the ZOMBI job. Upon successful termination of the ZOMBI job, the job’s status is changed to EXIT.
With MultiCluster, when a job running on a remote execution cluster becomes a ZOMBI job, the execution cluster treats the job the same way as local ZOMBI jobs. In addition, it notifies the submission cluster that the job is in ZOMBI state and the submission cluster requeues the job.
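The EXIT entry above quotes exit code 131 as an example of a job killed by LSF. Assuming such codes follow the common UNIX shell convention in which a status above 128 encodes 128 plus a signal number (an assumption; this section does not state it), the signal number can be recovered as sketched below. The helper name is hypothetical.

```python
def signal_from_exit_code(exit_code):
    """Return the signal number encoded in an exit code, or None.

    Assumes the shell convention exit_code = 128 + signal_number.
    """
    if exit_code > 128:
        return exit_code - 128
    return None


print(signal_from_exit_code(131))  # 3
print(signal_from_exit_code(1))    # None
```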
RUNTIME Estimated run time for the job, specified by bsub -We or bmod -We.
RESOURCE USAGE For the MultiCluster job forwarding model, this information is not shown if
MultiCluster resource usage updating is disabled.
The values for the current usage of a job include:
CPU time
Cumulative total CPU time in seconds of all processes in a job.
IDLE_FACTOR
Job idle information (CPU time/runtime) if JOB_IDLE is configured in the queue, and the job has triggered an idle exception.
MEM
Total resident memory usage of all processes in a job. By default, memory usage is shown in MB. Use LSF_UNIT_FOR_LIMITS in lsf.conf to specify a larger unit for display (MB, GB, TB, PB, or EB).
SWAP
Total virtual memory usage of all processes in a job. By default, swap space is shown in MB. Use LSF_UNIT_FOR_LIMITS in lsf.conf to specify a larger unit for display (MB, GB, TB, PB, or EB).
NTHREAD
Number of currently active threads of a job.
PGID
Currently active process group ID in a job.
PIDs
Currently active processes in a job.
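IDLE_FACTOR is described above as CPU time divided by run time, and the idle exception (see EXCEPTION STATUS) is triggered when that ratio falls below the JOB_IDLE threshold of the queue. A minimal sketch of that arithmetic follows; the function names and the sample numbers are illustrative only.

```python
def idle_factor(cpu_time_s, run_time_s):
    """Job idle factor: CPU time divided by run time."""
    if run_time_s <= 0:
        raise ValueError("run time must be positive")
    return cpu_time_s / run_time_s


def is_idle(cpu_time_s, run_time_s, job_idle_threshold):
    """True when the idle factor is below the JOB_IDLE threshold."""
    return idle_factor(cpu_time_s, run_time_s) < job_idle_threshold


factor = idle_factor(30.0, 600.0)
print(round(factor, 3))             # 0.05
print(is_idle(30.0, 600.0, 0.10))   # True
```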
RESOURCE LIMITS The hard resource usage limits that are imposed on the jobs in the queue (see
getrlimit(2) and lsb.queues(5)). These limits are imposed on a per-job and a
per-process basis.
The possible per-job resource usage limits are:
CPULIMIT
PROCLIMIT
MEMLIMIT
SWAPLIMIT
PROCESSLIMIT
THREADLIMIT
OPENFILELIMIT
The possible UNIX per-process resource usage limits are:
RUNLIMIT
FILELIMIT
DATALIMIT
STACKLIMIT
CORELIMIT
If a job submitted to the queue has any of these limits specified (see bsub(1)), then the lower of the corresponding job limits and queue limits are used for the job.
If no resource limit is specified, the resource is assumed to be unlimited. User shell limits that are unlimited are not displayed.
EXCEPTION STATUS Possible values for the exception status of a job include:
idle
The job is consuming less CPU time than expected. The job idle factor (CPU time/runtime) is less than the configured JOB_IDLE threshold for the queue and a job exception has been triggered.
overrun
The job is running longer than the number of minutes specified by the JOB_OVERRUN threshold for the queue and a job exception has been triggered.
underrun
The job finished sooner than the number of minutes specified by the JOB_UNDERRUN threshold for the queue and a job exception has been triggered.
Job Array Summary Information
If you use -A, displays summary information about job arrays. The following fields are displayed:
JOBID Job ID of the job array.
ARRAY_SPEC Array specification in the format of name[index]. The array specification may be truncated; use the -w option together with -A to show the full array specification.
OWNER Owner of the job array.
NJOBS Number of jobs in the job array.
PEND Number of pending jobs of the job array.
RUN Number of running jobs of the job array.
DONE Number of successfully completed jobs of the job array.
EXIT Number of unsuccessfully completed jobs of the job array.
SSUSP Number of LSF system suspended jobs of the job array.
USUSP Number of user suspended jobs of the job array.
PSUSP Number of held jobs of the job array.

Examples

bjobs -pl
Displays detailed information about all pending jobs of the invoker.
bjobs -ps
Displays only pending and suspended jobs.
bjobs -u all -a
Displays all jobs of all users.
bjobs -d -q short -m hostA -u user1
Displays all the recently finished jobs submitted by user1 to the queue short, and executed on the host hostA.
bjobs 101 102 203 509
Displays jobs with job ID 101, 102, 203, and 509.
bjobs -X 101 102 203 509
Displays jobs with job ID 101, 102, 203, and 509 as uncondensed output even if these jobs belong to hosts in condensed host groups.
bjobs -sla Uclulet
Displays all jobs belonging to the service class Uclulet.
bjobs -app fluent
Displays all jobs belonging to the application profile fluent.

See also

bsub(1), bkill(1), bhosts(1), bmgroup(1), bclusters(1), bqueues(1), bhist(1), bresume(1), bsla(1), bstop(1), lsb.params(5), lsb.serviceclasses(5), mbatchd(8)

bkill

sends signals to kill, suspend, or resume unfinished jobs

Synopsis

bkill [-l] [-app application_profile_name] [-g job_group_name]
[-sla service_class_name] [-J job_name] [-m host_name |
-m host_group] [-q queue_name] [-r |
-s signal_value | signal_name] [-u user_name |
-u user_group | -u all] [job_ID ... | 0 | "job_ID[index]" ...]
bkill [-l] [-b] [-app application_profile_name] [-g job_group_name]
[-sla service_class_name] [-J job_name] [-m host_name |
-m host_group] [-q queue_name] [-u user_name |
-u user_group | -u all] [job_ID ... | 0 | "job_ID[index]" ...]
bkill [-h | -V]

Description

By default, sends a set of signals to kill the specified jobs. On UNIX, SIGINT and SIGTERM are sent to give the job a chance to clean up before termination, then SIGKILL is sent to kill the job. The time interval between sending each signal is defined by the JOB_TERMINATE_INTERVAL parameter in lsb.params(5).
By default, kills the last job submitted by the user running the command. You must specify a job ID or -app, -g, -J, -m, -u, or -q. If you specify -app, -g, -J, -m, -u, or -q without a job ID, bkill kills the last job submitted by the user running the command. Specify job ID 0 (zero) to kill multiple jobs.
On Windows, job control messages replace the SIGINT and SIGTERM signals (but only customized applications can process them) and the TerminateProcess() system call is sent to kill the job.
Exit code 130 is returned when a dispatched job is killed with bkill.
Only root and LSF administrators can run bkill -r. The -r option is ignored for other users.
Users can only operate on their own jobs. Only root and LSF administrators can operate on jobs submitted by other users.
If a signal request fails to reach the job execution host, LSF tries the operation later when the host becomes reachable. LSF retries the most recent signal request.
If a job is running in a queue with CHUNK_JOB_SIZE set, bkill has the following results depending on job state:
PEND
Job is removed from chunk (NJOBS -1, PEND -1)
RUN
All jobs in the chunk are suspended (NRUN -1, NSUSP +1)
USUSP
Job finishes, next job in the chunk starts if one exists (NJOBS -1, PEND -1, SUSP -1, RUN +1)
WAIT
Job finishes (NJOBS -1, PEND -1)
If the job cannot be killed, use bkill -r to remove the job from the LSF system without waiting for the job to terminate, and free the resources of the job.

Options

0 Kills all the jobs that satisfy other options (-app, -g, -m, -q, -u, and -J).
-b Kills large numbers of jobs as soon as possible. Local pending jobs are killed immediately and cleaned up as soon as possible, ignoring the time interval specified by CLEAN_PERIOD in lsb.params. Jobs killed in this manner are not logged to lsb.acct.
Other jobs, such as running jobs, are killed as soon as possible and cleaned up normally.
If the -b option is used with the 0 subcommand, bkill kills all applicable jobs and silently skips the jobs that cannot be killed.
bkill -b 0
Operation is in progress
The -b option is ignored if used with the -r or -s options.
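The interaction between -b, -r, and -s can be modelled as a small check. This sketch covers only two rules from this page: -b is ignored when combined with -r or -s, and -r cannot be combined with -s (documented under those options). The function name is hypothetical, and raising an error for the -r/-s conflict is an assumption about how the conflict is reported.

```python
def effective_bkill_options(options):
    """Return the option letters that would actually take effect.

    options: iterable of option letters, for example {"b", "r"}.
    """
    opts = set(options)
    if "r" in opts and "s" in opts:
        # Documented as an invalid combination; the error type is assumed.
        raise ValueError("-r cannot be used with -s")
    if "b" in opts and ({"r", "s"} & opts):
        opts.discard("b")  # -b is ignored alongside -r or -s
    return opts


print(sorted(effective_bkill_options({"b", "r"})))  # ['r']
print(sorted(effective_bkill_options({"b"})))       # ['b']
```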
-l Displays the signal names supported by bkill. This is a subset of signals supported by /bin/kill and is platform-dependent.
-r Removes a job from the LSF system without waiting for the job to terminate in the operating system.
Only root and LSF administrators can run bkill -r. The -r option is ignored for other users.
Sends the same series of signals as bkill without -r, except that the job is removed from the system immediately, the job is marked as EXIT, and the job resources that LSF monitors are released as soon as LSF receives the first signal.
Also operates on jobs for which a bkill command has been issued but which cannot be reached to be acted on by sbatchd (jobs in ZOMBI state). If sbatchd recovers before the jobs are completely removed, LSF ignores the zombi jobs killed with bkill -r.
Use bkill -r only on jobs that cannot be killed in the operating system, or on jobs that cannot be otherwise removed using bkill.
The -r option cannot be used with the -s option.
-app application_profile_name
Operates only on jobs associated with the specified application profile. You must specify an existing application profile. If job_ID or 0 is not specified, only the most recently submitted qualifying job is operated on.
-g job_group_name Operates only on jobs in the job group specified by job_group_name.
Use -g with -sla to kill jobs in job groups attached to a service class.
bkill does not kill jobs in lower level job groups in the path. For example, jobs are attached to job groups /risk_group and /risk_group/consolidate:
bsub -g /risk_group myjob
Job <115> is submitted to default queue <normal>.
bsub -g /risk_group/consolidate myjob2
Job <116> is submitted to default queue <normal>.
The following bkill command only kills jobs in /risk_group, not the subgroup /risk_group/consolidate:
bkill -g /risk_group 0
Job <115> is being terminated
bkill -g /risk_group/consolidate 0
Job <116> is being terminated
-J job_name Operates only on jobs with the specified job name. The -J option is ignored if a job ID other than 0 is specified in the job_ID option.
-m host_name | -m host_group
Operates only on jobs dispatched to the specified host or host group.
If job_ID is not specified, only the most recently submitted qualifying job is operated on. The -m option is ignored if a job ID other than 0 is specified in the job_ID option. See bhosts(1) and bmgroup(1) for more information about hosts and host groups.
-q queue_name Operates only on jobs in the specified queue.
If job_ID is not specified, only the most recently submitted qualifying job is operated on.
The -q option is ignored if a job ID other than 0 is specified in the job_ID option.
See bqueues(1) for more information about queues.
-s signal_value | signal_name
Sends the specified signal to specified jobs. You can specify either a name, stripped of the SIG prefix (such as KILL), or a number (such as 9).
Eligible UNIX signal names are listed by bkill -l.
The -s option cannot be used with the -r option.
Use bkill -s to suspend and resume jobs by using the appropriate signal instead of using bstop or bresume. Sending the SIGCONT signal is the same as using bresume.
Sending the SIGSTOP signal to sequential jobs or the SIGTSTP to parallel jobs is the same as using bstop.
You cannot suspend a job that is already suspended, or resume a job that is not suspended. Using SIGSTOP or SIGTSTP on a job that is in the USUSP state has no effect and using SIGCONT on a job that is not in either the PSUSP or the USUSP state has no effect. See bjobs(1) for more information about job states.
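The two accepted forms of a -s argument (a name without the SIG prefix, or a number) can be normalized as in the sketch below. It uses the signal table of the host running the script, which is an assumption: the set of names bkill accepts is whatever bkill -l reports and may differ. The function name is hypothetical.

```python
import signal


def to_signal_number(spec):
    """Accept '9' or 'KILL' (no SIG prefix) and return the signal number."""
    if spec.isdigit():
        return int(spec)
    # Look the name up in the local signal table; unknown names raise KeyError.
    return signal.Signals["SIG" + spec.upper()].value


print(to_signal_number("KILL"))
print(to_signal_number("9"))
```

On a typical UNIX host both calls yield the same number, matching the KILL and 9 pairing used in the option description.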
-sla service_class_name
Operates on jobs belonging to the specified service class.
If job_ID is not specified, only the most recently submitted job is operated on.
Use -sla with -g to kill jobs in job groups attached to a service class.
The -sla option is ignored if a job ID other than 0 is specified in the job_ID option.
Use bsla to display the configuration properties of service classes configured in lsb.serviceclasses, the default SLA configured with ENABLE_DEFAULT_EGO_SLA in lsb.params, and dynamic information about the state of each service class.
-u user_name | -u user_group | -u all
Operates only on jobs submitted by the specified user or user group, or by all users if the reserved user name all is specified. To specify a Windows user account, include the domain name in uppercase letters and use a single backslash (DOMAIN_NAME\user_name) in a Windows command line or a double backslash (DOMAIN_NAME\\user_name) in a UNIX command line.
If job_ID is not specified, only the most recently submitted qualifying job is operated on. The -u option is ignored if a job ID other than 0 is specified in the job_ID option.
job_ID ... | 0 | "job_ID[index]" ...
Operates only on jobs that are specified by job_ID or "job_ID[index]", where "job_ID[index]" specifies selected job array elements (see bjobs(1)). For job arrays, quotation marks must enclose the job ID and index, and index must be enclosed in square brackets.
Jobs submitted by any user can be specified here without using the -u option. If you use the reserved job ID 0, all the jobs that satisfy other options (that is, -m, -q, -u and -J) are operated on; all other job IDs are ignored.
The options -u, -q, -m and -J have no effect if a job ID other than 0 is specified. Job IDs are returned at job submission time (see bsub(1)) and may be obtained with the bjobs command (see bjobs(1)).
-h Prints command usage to stderr and exits.
-V Prints LSF release version to stderr and exits.

Examples

bkill -s 17 -q night
Sends signal 17 to the last job that was submitted by the invoker to queue night.
bkill -q short -u all 0
Kills all the jobs that are in the queue short.
bkill -r 1045
Forces the removal of unkillable job 1045.
bkill -sla Tofino 0
Kills all jobs belonging to the service class named Tofino.
bkill -g /risk_group 0
Kills all jobs in the job group /risk_group.
bkill -app fluent
Kills the most recently submitted job associated with the application profile fluent for the current user.
bkill -app fluent 0
Kills all jobs associated with the application profile fluent for the current user.

See also

bsub(1), bjobs(1), bqueues(1), bhosts(1), bresume(1), bapp(1), bsla(1), bstop(1), bgadd(1), bgdel(1), bjgroup(1), bparams(5), lsb.serviceclasses(5), mbatchd(8), kill(1), signal(2)

bladmin

reconfigures the Platform LSF License Scheduler daemon (bld)

Synopsis

bladmin subcommand
bladmin [-h | -V]

Description

Use this command to reconfigure the License Scheduler daemon (bld).
You must be a License Scheduler administrator to use this command.

Subcommand List

ckconfig [-v]
reconfig [host_name ... | all]
shutdown [host_name ... | all]
blddebug [-c class_name ...] [-l debug_level] [-f logfile_name] [-o]
blcdebug [-l debug_level] [-f logfile_name] [-o] collector_name ... | all
-h
-V

Options

ckconfig [-v] Checks LSF License Scheduler configuration in LSF_ENVDIR/lsf.licensescheduler and lsf.conf.
By default, bladmin ckconfig displays only the result of the configuration file check. If warning errors are found, bladmin prompts you to use the -v option to display detailed messages.
-v
Verbose mode. Displays detailed messages about configuration file checking to stderr.
reconfig [host_name ... | all]
Reconfigures License Scheduler.
shutdown [host_name ... | all]
Shuts down License Scheduler.
blddebug [-c class_name ...] [-l debug_level] [-f logfile_name] [-o]
Sets the message log level for bld to include additional information in log files. You must be root or the LSF administrator to use this command.
If bladmin blddebug is used without any options, the following default values are used:
class_name=0 (no additional classes are logged)
debug_level=0 (LOG_DEBUG level in parameter LS_LOG_MASK)
logfile_name=current LSF system log file in the LSF system log file directory, in the format daemon_name.log.host_name
-c class_name ...
Specifies software classes for which debug messages are to be logged.
Format of class_name is the name of a class, or a list of class names separated by spaces and enclosed in quotation marks. Classes are also listed in lsf.h.
Valid log classes:
LC_AUTH - Log authentication messages
LC_COMM - Log communication messages
LC_FLEX - Log everything related to FLEX_STAT or FLEX_EXEC Macrovision APIs
LC_LICENCE - Log license management messages
LC_PREEMPT - Log preemption policy messages
LC_TRACE - Log significant program walk steps
LC_XDR - Log everything transferred by XDR
Default: 0 (no additional classes are logged)
-l debug_level
Specifies level of detail in debug messages. The higher the number, the more detail that is logged. Higher levels include all lower levels.
Possible values:
0 LOG_DEBUG level in parameter LS_LOG_MASK in lsf.conf.
1 LOG_DEBUG1 level for extended logging.
2 LOG_DEBUG2 level for extended logging.
3 LOG_DEBUG3 level for extended logging.
A higher level includes lower logging levels. For example, LOG_DEBUG3 includes LOG_DEBUG2, LOG_DEBUG1, and LOG_DEBUG levels.
Default: 0 (LOG_DEBUG level in parameter LS_LOG_MASK)
-f logfile_name
Specifies the name of the file where debugging messages are logged. The file name can be a full path. If a file name without a path is specified, the file is saved in the LSF system log directory.
The name of the file has the following format:
logfile_name.daemon_name.log.host_name
On UNIX, if the specified path is not valid, the log file is created in the /tmp directory.
On Windows, if the specified path is not valid, no log file is created.
Default: current LSF system log file in the LSF system log file directory.
-o
Turns off temporary debug settings and resets them to the daemon starting state. The message log level is reset back to the value of LS_LOG_MASK and classes are reset to the value of LSB_DEBUG_BLD. The log file is also reset back to the default log file.
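The -f naming and fallback rules described above can be sketched as a small pure function. Everything here is illustrative: the function name, the path_is_valid flag (supplied by the caller, since the sketch does not touch the file system), and the sample log directory are assumptions, and only UNIX-style paths are handled.

```python
import posixpath


def debug_log_path(logfile_name, daemon_name, host_name, log_dir,
                   path_is_valid=True, on_windows=False):
    """Return where debug messages would be written, or None if nowhere."""
    directory, base = posixpath.split(logfile_name)
    # Documented format: logfile_name.daemon_name.log.host_name
    file_name = f"{base}.{daemon_name}.log.{host_name}"
    if not directory:
        # No path given: the file goes to the LSF system log directory.
        return posixpath.join(log_dir, file_name)
    if path_is_valid:
        return posixpath.join(directory, file_name)
    # Invalid path: /tmp on UNIX, no log file at all on Windows.
    return None if on_windows else posixpath.join("/tmp", file_name)


print(debug_log_path("trace", "bld", "hostA", "/var/log/lsf"))
print(debug_log_path("/bad/dir/trace", "bld", "hostA", "/var/log/lsf",
                     path_is_valid=False))
```

The first call shows a bare file name landing in the (hypothetical) system log directory; the second shows the /tmp fallback for an invalid path on UNIX.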
blcdebug [-l debug_level] [-f logfile_name] [-o] collector_name | all
Sets the message log level for blcollect to include additional information in log files. You must be root or the LSF administrator to use this command.
If bladmin blcdebug is used without any options, the following default values are used:
debug_level=0 (LOG_DEBUG level in parameter LS_LOG_MASK)
logfile_name=current LSF system log file in the LSF system log file directory, in the format daemon_name.log.host_name
collector_name=default
-l debug_level
Specifies level of detail in debug messages. The higher the number, the more detail that is logged. Higher levels include all lower levels.
Possible values:
0 LOG_DEBUG level in parameter LS_LOG_MASK in lsf.conf.
1 LOG_DEBUG1 level for extended logging.
2 LOG_DEBUG2 level for extended logging.
3 LOG_DEBUG3 level for extended logging.
A higher level includes lower logging levels. For example, LOG_DEBUG3 includes LOG_DEBUG2, LOG_DEBUG1, and LOG_DEBUG levels.
Default: 0 (LOG_DEBUG level in parameter LS_LOG_MASK)
-f logfile_name
Specifies the name of the file where debugging messages are logged. The file name can be a full path. If a file name without a path is specified, the file is saved in the LSF system log directory.
The name of the file has the following format:
logfile_name.daemon_name.log.host_name
On UNIX, if the specified path is not valid, the log file is created in the /tmp directory.
On Windows, if the specified path is not valid, no log file is created.
Default: current LSF system log file in the LSF system log file directory.
-o
Turns off temporary debug settings and resets them to the daemon starting state. The message log level is reset back to the value of LS_LOG_MASK and classes are reset to the value of LSB_DEBUG_BLD. The log file is also reset back to the default log file.
If a collector name is not specified, the default is to restore the original log mask and log file directory for the default collector.
collector_name ... | all
Specifies the collector names separated by blanks. all means all the collectors.
-h Prints command usage to stderr and exits.
-V Prints release version to stderr and exits.

See also

blhosts, lsf.licensescheduler, lsf.conf

blaunch

launches parallel tasks on a set of hosts

Synopsis

blaunch [-n] [-u host_file | -z host_name ... | host_name] command [argument ...]
blaunch [-h | -V]

Description

IMPORTANT: You cannot run blaunch directly from the command line.
RESTRICTION: The command blaunch does not work with user account mapping. Do not run blaunch on a user account mapping host.
Most MPI implementations and many distributed applications use rsh and ssh as their task launching mechanism. The blaunch command provides a drop-in replacement for rsh and ssh as a transparent method for launching parallel applications within LSF.
blaunch supports the following core command line options as rsh and ssh:
rsh host_name command
ssh host_name command
All other rsh and ssh options are silently ignored.
blaunch transparently connects directly to the RES/SBD on the remote host, and subsequently creates and tracks the remote tasks, and provides the connection back to LSF. You do not need to insert pam, taskstarter or any other wrapper.
blaunch only works under LSF. It can only be used to launch tasks on remote hosts that are part of a job allocation. It cannot be used as a standalone command.
blaunch is not supported on Windows.
When no host names are specified, LSF allocates all hosts listed in the environment variable LSB_MCPU_HOSTS.

Options

-n Standard input is taken from /dev/null.
-u host_file Executes the task on all hosts listed in the host_file.
Specify the path to a file that contains a list of host names. Each host name must be listed on a separate line in the host list file.
This option is exclusive of the -z option.
host_name The name of the host where remote tasks are to be launched.
-z host_name ... Executes the task on all specified hosts.
Whereas the host name value for rsh and ssh is a single host name, you can use the -z option to specify a space-delimited list of hosts where tasks are started in parallel.
Specify a list of hosts on which to execute the task. If multiple host names are specified, the host names must be enclosed by quotation marks (" or ') and separated by white space.
This option is exclusive of the -u option.
command [argument ...]
Specify the command to execute. This must be the last argument on the command line.
-h Prints command usage to stderr and exits.
-V Prints LSF release version to stderr and exits.

Diagnostics

Exit status is 0 if all commands are executed correctly.

See also

lsb_getalloc(3), lsb_launch(3)

blcollect

license information collection daemon that collects license usage information

Synopsis

blcollect -c collector_name -m host_name [...] -p license_scheduler_port [-i lmstat_interval | -D lmstat_path]
blcollect [-h | -V]

Description

Periodically collects license usage information from Macrovision FLEXnet. It queries FLEXnet for license usage information from the FLEXnet lmstat command, and passes the information to the License Scheduler daemon (bld). The blcollect daemon improves performance by allowing you to distribute license information queries on multiple hosts.
By default, license information is collected from FLEXnet on one host. Use blcollect to distribute the license collection on multiple hosts.
For each service domain configuration in lsf.licensescheduler, specify one name for blcollect to use. You can only specify one collector per service domain, but you can specify one collector to serve multiple service domains. You can choose any collector name you want, but must use that exact name when you run blcollect.

Options

-c Required. Specify the collector name you set in lsf.licensescheduler. You must use the collector name (LIC_COLLECT) you define in the ServiceDomain section of the configuration file.
-m Required. Specifies a space-separated list of hosts to which license information is sent. The hosts do not need to be running License Scheduler or FLEXnet. Use fully qualified host names.
-p Required. You must specify the License Scheduler listening port, which is set in lsf.licensescheduler and has a default value of 9581.
-i lmstat_interval Optional. The frequency in seconds of the calls that License Scheduler makes to lmstat to collect license usage information from FLEXnet.
The default interval is 60 seconds.
-D lmstat_path Optional. Location of the FLEXnet command lmstat.
-h Prints command usage to stderr and exits.
-V Prints release version to stderr and exits.

See also

lsf.licensescheduler

blhosts

displays the names of all the hosts running the License Scheduler daemon (bld)

Synopsis

blhosts [-h | -V]

Description

Displays a list of hosts running the License Scheduler daemon. This includes the License Scheduler master host and all the candidate License Scheduler hosts running bld.

Options

-h Prints command usage to stderr and exits.
-V Prints release version to stderr and exits.

Output

Prints out the names of all the hosts running the License Scheduler daemon (bld).
For example, the following sample output shows the License Scheduler master host and two candidate License Scheduler hosts running bld:
bld is running on:
master: host1.domain1.com
slave: host2.domain1 host3.domain1

See also

blinfo, blstat, bladmin

blimits

displays information about resource allocation limits of running jobs

Synopsis

blimits [-w] [-n limit_name ...]
[-m host_name | -m host_group | -m cluster_name ...] [-P project_name ...] [-q queue_name ...] [-u user_name | -u user_group ...]
blimits -c
blimits -h | -V

Description

Displays current usage of resource allocation limits configured in Limit sections in lsb.resources:
Configured limit policy name
Users (-u option)
Queues (-q option)
Hosts (-m option)
Project names (-P option)
Limits (SLOTS, MEM, TMP, SWP, JOBS)
Limit configuration (-c option). This is the same as bresources with no options.
Resources that have no configured limits or no limit usage are indicated by a dash (-). Limits are displayed in a USED/LIMIT format. For example, if a limit of 10 slots is configured and 3 slots are in use, then blimits displays the limit for SLOTS as 3/10.
Note that if there are no jobs running against resource allocation limits, LSF indicates that there is no information to be displayed:
No resource usage found.
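A script that reads blimits output has to handle both the USED/LIMIT form and the dash. The following Python sketch shows one way to do that; the function name `parse_usage` is hypothetical, and treating the values as floating-point numbers is an assumption.

```python
def parse_usage(cell):
    """Return (used, limit) for a USED/LIMIT cell, or None for a dash.

    Assumption: both parts are plain numbers, for example "3/10".
    """
    cell = cell.strip()
    if cell == "-":
        return None  # no configured limit or no usage
    used, sep, limit = cell.partition("/")
    if not sep:
        raise ValueError(f"not in USED/LIMIT format: {cell!r}")
    return float(used), float(limit)


usage = parse_usage("3/10")
if usage is not None:
    used, limit = usage
    print(f"{limit - used:g} slots still available")
```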
If limits MEM, SWP, or TMP are configured as percentages, both the limit and the amount used are displayed in MB. For example, lshosts displays maxmem of 249 MB, and MEM is limited to 10% of available memory, which is about 25 MB. If 10 MB out of 25 MB are used, blimits displays the limit for MEM as 10/25 (10 MB USED from a 25 MB LIMIT).
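The arithmetic behind this example can be checked with a short Python sketch. Rounding to the nearest whole MB is an assumption made here to reproduce the 25 MB figure; this guide does not state how LSF rounds, and the function name `percent_limit_mb` is hypothetical.

```python
def percent_limit_mb(maxmem_mb, percent):
    """Convert a percentage limit into MB.

    Assumption: the result is rounded to the nearest whole MB.
    """
    return round(maxmem_mb * percent / 100)


limit_mb = percent_limit_mb(249, 10)  # 10% of 249 MB is 24.9 MB
used_mb = 10
print(f"{used_mb}/{limit_mb}")
```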
Limits are displayed for both the vertical tabular format and the horizontal format for Limit sections. If a vertical format Limit section has no name, blimits displays NONAMEnnn under the NAME column for these limits, where the unnamed limits are numbered in the order the vertical-format Limit sections appear in the lsb.resources file.
If a resource consumer is configured as all, the limit usage for that consumer is indicated by a dash (-).
PER_HOST slot limits are not displayed. The bhosts command displays these as MXJ limits.
In MultiCluster, blimits returns the information about all limits in the local cluster.
Limit names and policies are set up by the LSF administrator. See lsb.resources(5) for more information.
Options
-c Displays all resource configurations in lsb.resources. This is the same as bresources with no options.
-w Displays resource allocation limits information in a wide format. Fields are displayed without truncation.
-n limit_name ... Displays resource allocation limits for the specified named Limit sections. If a list of limit sections is specified, Limit section names must be separated by spaces and enclosed in quotation marks (") or (’).
-m host_name | -m host_group | -m cluster_name ...
Displays resource allocation limits for the specified hosts. Do not use quotes when specifying multiple hosts.
To see the available hosts, use bhosts.
For host groups:
If the limits are configured with HOSTS, the name of the host group is displayed.
If the limits are configured with PER_HOST, the names of the hosts belonging to the group are displayed instead of the name of the host group.
TIP: PER_HOST slot limits are not displayed. The bhosts command displays these as MXJ limits.
For a list of host groups see bmgroup(1).
In MultiCluster, if a cluster name is specified, displays resource allocation limits in the specified cluster.
-P project_name ... Displays resource allocation limits for the specified projects.
If a list of projects is specified, project names must be separated by spaces and enclosed in quotation marks (") or (’).
-q queue_name ... Displays resource allocation limits for the specified queues.
The command bqueues returns a list of queues configured in the system, and information about the configurations of these queues.
In MultiCluster, you cannot specify remote queues.
-u user_name | -u user_group ...
Displays resource allocation limits for the specified users.
If a list of users is specified, user names must be separated by spaces and enclosed in quotation marks (") or (’). You can specify both user names and user IDs in the list of users.
If a user group is specified, displays the resource allocation limits that include that group in their configuration. For a list of user groups see bugroup(1).
-h Prints command usage to stderr and exits.
-V Prints LSF release version to stderr and exits.

Output

Configured limits and resource usage for built-in resources (slots, mem, tmp, and swp load indices, and running and suspended job limits) are displayed as INTERNAL RESOURCE LIMITS separately from custom external resources, which are shown as EXTERNAL RESOURCE LIMITS.
Resource Consumers
blimits displays the following fields for resource consumers:
NAME The name of the limit policy as specified by the Limit section NAME parameter.
USERS List of user names or user groups on which the displayed limits are enforced, as specified by the Limit section parameters USERS or PER_USER.
User group names have a slash (/) added at the end of the group name. See bugroup(1).
QUEUES The name of the queue to which the limits apply, as specified by the Limit section parameters QUEUES or PER_QUEUE.
If the queue has been removed from the configuration, the queue name is displayed as lost_and_found. Use bhist to get the original queue name. Jobs in the lost_and_found queue remain pending until they are switched with the bswitch command into another queue.
In a MultiCluster resource leasing environment, jobs scheduled by the consumer cluster display the remote queue name in the format queue_name@cluster_name. By default, this field truncates at 10 characters, so you might not see the cluster name unless you use -w or -l.
HOSTS List of hosts and host groups on which the displayed limits are enforced, as specified by the Limit section parameters HOSTS or PER_HOST.
Host group names have a slash (/) added at the end of the group name. See bmgroup(1).
TIP: PER_HOST slot limits are not displayed. The bhosts command displays these as MXJ limits.
PROJECTS List of project names on which limits are enforced, as specified by the Limit section parameters PROJECTS or PER_PROJECT.
Resource Limits
blimits displays resource allocation limits for the following resources:
SLOTS Number of slots currently used and maximum number of slots configured for the limit policy, as specified by the Limit section SLOTS parameter.
MEM Amount of memory currently used and maximum configured for the limit policy, as specified by the Limit section MEM parameter.
TMP Amount of tmp space currently used and maximum amount of tmp space configured for the limit policy, as specified by the Limit section TMP parameter.
SWP Amount of swap space currently used and maximum amount of swap space configured for the limit policy, as specified by the Limit section SWP parameter.
JOBS Number of currently running and suspended jobs and the maximum number of jobs configured for the limit policy, as specified by the Limit section JOBS parameter.
Example
The following command displays limit configuration and dynamic usage information for project proj1:
blimits -P proj1
INTERNAL RESOURCE LIMITS:
NAME       USERS  QUEUES  HOSTS  PROJECTS     SLOTS  MEM  TMP  SWP  JOBS
limit1     user1  -       hostA  proj1        2/6    -    -    -    -
NONAME022  -      -       hostB  proj1 proj2  1/3    -    -    -    -
EXTERNAL RESOURCE LIMITS:
NAME       USERS  QUEUES  HOSTS  PROJECTS     tmp1
limit1     user1  -       hostA  proj1        1/1
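Rows of the INTERNAL RESOURCE LIMITS table in this example cannot be split on whitespace alone, because the PROJECTS column may hold several names. The following Python sketch takes the first four and last five fields as fixed columns and treats everything between them as PROJECTS. That only PROJECTS holds multiple values is an assumption that fits this example, not a documented guarantee, and the function name is hypothetical.

```python
LEFT = ["NAME", "USERS", "QUEUES", "HOSTS"]
RIGHT = ["SLOTS", "MEM", "TMP", "SWP", "JOBS"]


def parse_internal_row(line):
    """Parse one data row of the INTERNAL RESOURCE LIMITS table.

    Assumption: only the PROJECTS column contains spaces.
    """
    fields = line.split()
    if len(fields) < len(LEFT) + len(RIGHT) + 1:
        raise ValueError(f"too few fields: {line!r}")
    row = dict(zip(LEFT, fields[:len(LEFT)]))
    row["PROJECTS"] = fields[len(LEFT):-len(RIGHT)]
    row.update(zip(RIGHT, fields[-len(RIGHT):]))
    return row


row = parse_internal_row("NONAME022 - - hostB proj1 proj2 1/3 - - - -")
print(row["PROJECTS"], row["SLOTS"])
```

A row whose USERS or HOSTS column lists more than one name would be parsed incorrectly by this sketch, so real output should be checked before relying on it.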

See also

bclusters, bhosts, bhist, bmgroup, bqueues, bugroup, lsb.resources

blinfo

displays static License Scheduler configuration information

Synopsis

blinfo -Lp | -p | -D | -G | -P
blinfo [-a [-t token_name | "token_name ..."]] [-o alpha | total] [-g "feature_group ..."]
blinfo -A [-t token_name | "token_name ..."] [-o alpha | total] [-g "feature_group ..."]
blinfo -C [-t token_name | "token_name ..."] [-o alpha | total] [-g "feature_group ..."]
blinfo [-t token_name | "token_name ..."] [-o alpha | total] [-g "feature_group ..."]
blinfo [-h | -V]

Description

Displays different license configuration information, depending on the option selected.
By default, displays information about the distribution of licenses managed by License Scheduler.

Options

-A When LOCAL_TO is configured for a feature in lsf.licensescheduler, shows the feature allocation by cluster locality.
You can optionally provide license token names.
-a Shows all information, including information about non-shared licenses (NON_SHARED_DISTRIBUTION) and workload distribution (WORKLOAD_DISTRIBUTION).
You can optionally provide license token names.
blinfo -a does not display NON_SHARED information for hierarchical project group scheduling policies. Use blinfo -G to see hierarchical group configuration.
-C When LOCAL_TO is configured for a feature in lsf.licensescheduler, shows the cluster locality information for the features.
You can optionally provide license token names.
-D Lists the License Scheduler service domains and the corresponding FLEXnet license server hosts.
-G Lists the hierarchical configuration information.
If PRIORITY is defined in the ProjectGroup Section of lsf.licensescheduler, this option also shows the priorities of each project.
-g feature_group ...
When FEATURE_GROUP is configured for a group of license features in lsf.licensescheduler, shows only information about the features configured in the FEATURE_LIST of specified feature groups. You can specify more than one feature group at one time.
When you specify feature names with -t, features in the feature list defined by -t and feature groups are both displayed.
Feature groups listed with -g but not defined in lsf.licensescheduler are ignored.
-Lp Lists the active projects managed by License Scheduler.
-Lp only displays projects associated with configured features.
If PRIORITY is defined in the Projects Section of lsf.licensescheduler, this option also lists the priorities of each project.
-o alpha | total Sorts license feature information alphabetically, by total licenses, or by available licenses.
alpha: Features are listed in descending alphabetical order.
total: Features are sorted by the descending order of the sum of licenses that are allocated to LSF workload from all the service domains configured to supply licenses to the feature. Licenses borrowed by non-LSF workload are not included in this amount.
-P When LS_FEATURE_PERCENTAGE=Y, lists the license ownership in percentage.
-p Displays values of lsf.licensescheduler configuration parameters and lsf.conf parameters related to License Scheduler. This is useful for troubleshooting.
-t token_name |"token_name ..."
Only shows information about specified license tokens. Use spaces to separate multiple names, and enclose them in quotation marks.
-h Prints command usage to stderr and exits.
-V Prints the License Scheduler release version to stderr and exits.
Output
Default output
Displays the following fields:
FEATURE The license name. This becomes the license token name.
When LOCAL_TO is configured for a feature in lsf.licensescheduler, blinfo shows the cluster locality information for the license features.
SERVICE_DOMAIN The name of the service domain that provided the license.
TOTAL The total number of licenses managed by FLEXnet. This number comes from FLEXnet.
DISTRIBUTION The distribution of the licenses among license projects in the format [project_name, percentage[/number_licenses_owned]]. This determines how many licenses a project is entitled to use when there is competition for licenses. The percentage is calculated from the share specified in the configuration file.
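The DISTRIBUTION format lends itself to a regular expression. The following Python sketch extracts the project name, the percentage, and the optional number of owned licenses from each bracketed entry. The pattern is an assumption based on the format string above and on the sample values in the Examples section; the names `ENTRY` and `parse_distribution` are hypothetical.

```python
import re

# One [project_name, percentage[/number_licenses_owned]] entry.
ENTRY = re.compile(
    r"\[\s*(?P<project>[^,\]]+?)\s*,"      # project name up to the comma
    r"\s*(?P<percent>[\d.]+)%\s*"          # percentage, such as 50.0%
    r"(?:/\s*(?P<owned>\d+)\s*)?\]"        # optional "/ owned" part
)


def parse_distribution(text):
    """Return a list of (project, percent, owned_or_None) tuples."""
    entries = []
    for match in ENTRY.finditer(text):
        owned = match.group("owned")
        entries.append((
            match.group("project"),
            float(match.group("percent")),
            int(owned) if owned is not None else None,
        ))
    return entries


print(parse_distribution("[p1, 50.0%] [p2, 50.0% / 2]"))
```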
Allocation output (-A)
FEATURE The license name. This becomes the license token name.
When LOCAL_TO is configured for a feature in lsf.licensescheduler, blinfo shows the cluster locality information for the license features.
PROJECT The License Scheduler project name.
ALLOCATION
The percentage of shares assigned to each cluster for a feature and a project.
All output (-a)
Same as Default Output with NON_SHARED_DISTRIBUTION.
NON_SHARED_DISTRIBUTION
This column is displayed directly under DISTRIBUTION with the -a option. If there are non-shared licenses, then the non-shared license information is output in the following format: [project_name, number_licenses_non_shared]
If there are no non-shared licenses, then the following license information is output:
- (dash)
Cluster locality output (-C)
NAME The license feature token name.
When LOCAL_TO is configured for a feature in lsf.licensescheduler, blinfo shows the cluster locality information for the license features.
FLEX_NAME The actual FLEXnet feature name, that is, the name used by FLEXnet to identify the type of license. May be different from the License Scheduler token name if a different FLEX_NAME is specified in lsf.licensescheduler.
CLUSTER_NAME The name of the cluster the feature is assigned to.
FEATURE The license feature name. This becomes the license token name.
When LOCAL_TO is configured for a feature in lsf.licensescheduler, blinfo shows the cluster locality information for the license features.
SERVICE_DOMAIN The service domain name.
Service Domain Output (-D)
SERVICE_DOMAIN The service domain name.
LIC_SERVERS Names of FLEXnet license server hosts that belong to the service domain. Each host name is enclosed in parentheses, as shown:
(port_number@host_name)
Redundant hosts (that share the same FLEXnet license file) are grouped together as shown:
(port_number@host_name port_number@host_name port_number@host_name)
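The parenthesized LIC_SERVERS notation can be unpacked into groups of servers. The following Python sketch returns one list per pair of parentheses, so redundant hosts stay together. The port numbers and host names in the sample string are invented for illustration, and the function name `parse_lic_servers` is hypothetical.

```python
import re


def parse_lic_servers(text):
    """Return a list of server groups; each group is a list of
    (port, host) string pairs taken from one pair of parentheses."""
    groups = []
    for body in re.findall(r"\(([^()]*)\)", text):
        servers = []
        for item in body.split():
            port, _, host = item.partition("@")
            servers.append((port, host))
        groups.append(servers)
    return groups


# Hypothetical values: one single server and one redundant group.
sample = "(1700@hostA) (1700@hostB 1700@hostC 1700@hostD)"
for group in parse_lic_servers(sample):
    print(group)
```

Ports are kept as strings so that an entry without a port number does not cause a conversion error.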
Hierarchical Output (-G)
The following fields describe the values of their corresponding configuration fields in the ProjectGroup Section of lsf.licensescheduler.
GROUP The project names in the hierarchical grouping and its relationships. Each entry specifies the name of the hierarchical group and its members. The entry is enclosed in parentheses as shown:
(group (member ...))
SHARES The shares assigned to the hierarchical group member projects.
OWNERSHIP The number of licenses that each project owns.
LIMITS The maximum number of licenses that the hierarchical group member project can use at any one time.
NON_SHARED The number of licenses that the hierarchical group member projects use exclusively.
PRIORITY The priority of the project if it is different from the default behavior. A larger number indicates a higher priority.
DESCRIPTION The description of the project group.
Project Output (-Lp)
List of active License Scheduler projects.
-Lp only displays projects associated with configured features.
PROJECT The project name.
PRIORITY The priority of the project if it is different from the default behavior. A larger number indicates a higher priority.
DESCRIPTION The description of the project.
Parameters Output (-p)
ADMIN The License Scheduler administrator. Defined in lsf.licensescheduler.
DISTRIBUTION_POLICY_VIOLATION_ACTION
This parameter includes
The interval (a multiple of LM_STAT_INTERVAL periods) at which License Scheduler checks for distribution policy violations, and
The directory path and command that License Scheduler runs when reporting a violation
Defined in lsf.licensescheduler.
EXT_FILTER_PORT TCP listening port used by all external plug-ins to communicate with License Scheduler hosts. Defined in lsf.licensescheduler.
FLX_LICENSE_FILE Path to the file that contains the license keys FLEXnet.Ext.Filter and FLEXnet.Usage.Snapshot to enable the FLEXnet APIs. Defined in lsf.licensescheduler.
HOSTS License Scheduler candidate hosts. Defined in lsf.licensescheduler.
LM_REMOVE_INTERVAL
Minimum time a job must have a license checked out before lmremove can remove the license. Defined in lsf.licensescheduler.
LM_STAT_INTERVAL Time interval between calls that License Scheduler makes to collect license usage information from FLEXnet license management. Defined in lsf.licensescheduler.
LS_MAX_TASKMAN_SESSIONS
Maximum number of taskman jobs that run simultaneously. Defined in lsf.licensescheduler.
LSF_LIC_SCHED_HOSTS
List of hosts that are candidate LSF License Scheduler hosts. Defined in lsf.conf.
LSF_LIC_SCHED_PREEMPT_REQUEUE
Specifies whether to requeue or suspend a job whose license is preempted by LSF License Scheduler. Defined in lsf.conf.
LSF_LIC_SCHED_PREEMPT_SLOT_RELEASE
Specifies whether to release the slot of a job that is suspended when its license is preempted by LSF License Scheduler. Defined in lsf.conf.
LSF_LIC_SCHED_PREEMPT_STOP
Specifies whether to use job controls to stop a job that is preempted. Defined in lsf.conf.
LSF_LICENSE_FILE Location of the LSF license file, which includes License Scheduler keys. Defined in lsf.conf.
PORT TCP listening port used by License Scheduler. Defined in lsf.licensescheduler.
Examples
blinfo -a displays both NON_SHARED_DISTRIBUTION and WORKLOAD_DISTRIBUTION information:
blinfo -a
FEATURE  SERVICE_DOMAIN  TOTAL  DISTRIBUTION
g1       LS              3      [p1, 50.0%] [p2, 50.0% / 2]
                                NON_SHARED_DISTRIBUTION
                                [p2, 2]
                                WORKLOAD_DISTRIBUTION
                                [LSF 66.7%, NON_LSF 33.3%]
blinfo -a does not display NON_SHARED_DISTRIBUTION, if the NON_SHARED_DISTRIBUTION is not defined:
blinfo -a
FEATURE  SERVICE_DOMAIN  TOTAL  DISTRIBUTION
g1       LS              0      [p1, 50.0%] [p2, 50.0%]
                                WORKLOAD_DISTRIBUTION
                                [LSF 66.7%, NON_LSF 33.3%]
g2       LS              0      [p1, 50.0%] [p2, 50.0%]
g33      WS              0      [p1, 50.0%] [p2, 50.0%]
blinfo -a does not display WORKLOAD_DISTRIBUTION, if the WORKLOAD_DISTRIBUTION is not defined:
blinfo -a
FEATURE  SERVICE_DOMAIN  TOTAL  DISTRIBUTION
g1       LS              3      [p1, 50.0%] [p2, 50.0% / 2]
                                NON_SHARED_DISTRIBUTION
                                [p2, 2]

Files

Reads lsf.licensescheduler

See also

blstat, blusers

blkill

terminates an interactive License Scheduler task

Synopsis

blkill [-t seconds] task_ID
blkill [-h | -V]

Description

Terminates a running or waiting interactive task in License Scheduler.
Users can kill their own tasks. You must be a License Scheduler administrator to terminate another user’s task.
By default, blkill notifies the user and waits 30 seconds before killing the task.

Options

task_ID Task ID of the task you want to kill.
-t seconds Specify how many seconds to delay before killing the task. A value of 0 means to kill the task immediately (do not give the user any time to save work).
-h Prints command usage to stderr and exits.
-V Prints License Scheduler release version to stderr and exits.
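A wrapper script that calls blkill has to decide whether to pass -t. The following Python sketch only builds the argument list from the synopsis above and does not run the command. The function name `blkill_argv` and the task ID 42 are hypothetical.

```python
def blkill_argv(task_id, delay_seconds=None):
    """Build an argument list for blkill.

    delay_seconds=None omits -t, so blkill applies its default
    30-second delay; delay_seconds=0 requests an immediate kill.
    """
    argv = ["blkill"]
    if delay_seconds is not None:
        argv += ["-t", str(delay_seconds)]
    argv.append(str(task_id))
    return argv


print(blkill_argv(42))      # default delay
print(blkill_argv(42, 0))   # kill immediately
```

The resulting list could be handed to subprocess.run on a host where License Scheduler is installed.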

blparams

displays information about configurable License Scheduler parameters defined in the files lsf.licensescheduler and lsf.conf

Synopsis

blparams [-h | -V]

Description

Displays the following parameter values:
ADMIN
The License Scheduler administrator. Defined in lsf.licensescheduler.
DISTRIBUTION_POLICY_VIOLATION_ACTION
This parameter includes
The interval (a multiple of LM_STAT_INTERVAL periods) at which License Scheduler checks for distribution policy violations, and
The directory path and command that License Scheduler runs when reporting a violation
Defined in lsf.licensescheduler.
EXT_FILTER_PORT
TCP listening port used by all external plugins to communicate with License Scheduler hosts. Defined in lsf.licensescheduler.
FLX_LICENSE_FILE
Path to the file that contains the license keys FLEXnet.Ext.Filter and FLEXnet.Usage.Snapshot to enable the FLEXnet APIs. Defined in lsf.licensescheduler.
HOSTS
License Scheduler candidate hosts. Defined in lsf.licensescheduler.
LM_REMOVE_INTERVAL
Minimum time a job must have a license checked out before lmremove can remove the license. Defined in lsf.licensescheduler.
LM_STAT_INTERVAL
Time interval between calls that License Scheduler makes to collect license usage information from FLEXnet license management. Defined in lsf.licensescheduler.
LS_DEBUG_BLD
Sets the debugging log class for the LSF License Scheduler bld daemon. Defined in lsf.licensescheduler.