Sample UNIX installation directories
Names ending in "/" are directories; all other names are files. Numbers in parentheses refer to the key below.

LSF_TOP/
    conf/ (1)
        lsbatch/ (5)
            cluster_name/
                configdir/
                    lsb.hosts
                    lsb.params
                    lsb.queues
                    …
        license.dat
        lsf.cluster.cluster_name
        lsf.conf
        lsf.shared
        lsf.task
        profile.lsf
        cshrc.lsf
    work/ (2)
        cluster_name/
            logdir/
                lsb.event.lock
                info/
            lsf_indir/
            lsf_cmdir/
    log/ (3)
    version/ (4)
        man/ (6)
        include/ (8)
            lsf/
                lsbatch.h
                lsf.h
        misc/ (12)
            conf_tmpl/
            examples/
            make.def
            make.misc
            …
        install/
            instlib/
            scripts/
            lsfinstall
            hostsetup
            ...
        aix5-64/, sparc-sol7-64/, ... (7)
            bin/ (9)
                badmin
                bjobs
                lsadmin
                …
            etc/ (10)
                lim
                res
                sbatchd
                …
            lib/ (11)
                locale/
                uid/
                ckpt_crt0.o
                libampi.a
                …

Key
1   LSF_CONFDIR = LSF_ENVDIR
2   LSB_SHAREDIR
3   LSF_LOGDIR
4   LSF_VERSION
5   LSB_CONFDIR
6   LSF_MANDIR
7   Machine-dependent directory
8   LSF_INCLUDEDIR
9   LSF_BINDIR
10  LSF_SERVERDIR
11  LSF_LIBDIR
12  LSF_MISC
Daemon error log files
Daemon error log files are stored in the directory defined by LSF_LOGDIR in lsf.conf.
LSF base system daemon log files    LSF batch system daemon log files
lim.log.hostname                    mbatchd.log.hostname
res.log.hostname                    sbatchd.log.hostname
pim.log.hostname                    mbschd.log.hostname
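The naming pattern in the table above (daemon name, then ".log.", then the host name) can be sketched in Python. This is only an illustration: the function name daemon_log_paths and the host name "hosta" are made up for the example, and the log directory must be whatever LSF_LOGDIR is set to in your lsf.conf.

```python
from pathlib import PurePosixPath

# Daemon names taken from the table above, grouped the same way.
BASE_DAEMONS = ("lim", "res", "pim")
BATCH_DAEMONS = ("mbatchd", "sbatchd", "mbschd")


def daemon_log_paths(logdir, hostname):
    """Return {daemon: path} following the daemon.log.hostname pattern."""
    root = PurePosixPath(logdir)
    return {
        daemon: str(root / f"{daemon}.log.{hostname}")
        for daemon in BASE_DAEMONS + BATCH_DAEMONS
    }


if __name__ == "__main__":
    # "/tmp" is the documented UNIX default for LSF_LOGDIR;
    # "hosta" is a hypothetical host name.
    for daemon, path in daemon_log_paths("/tmp", "hosta").items():
        print(daemon, path)
```

A helper like this is convenient when a script needs to tail or collect the logs of every daemon on one host.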
Configuration files
lsf.conf, lsf.shared, and lsf.cluster.cluster_name are located in LSF_CONFDIR.
lsb.params, lsb.queues, lsb.modules, and lsb.resources are located in LSB_CONFDIR/cluster_name/configdir/.
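As a rough sketch of the two locations just described, the following Python function assembles the full path of each file. The function name config_paths and the sample directory and cluster names are hypothetical; substitute the values of LSF_CONFDIR and LSB_CONFDIR from your own installation.

```python
from pathlib import PurePosixPath

# Files named in the text above; lsf.cluster.<cluster_name> is added separately.
LSF_FILES = ("lsf.conf", "lsf.shared")
LSB_FILES = ("lsb.params", "lsb.queues", "lsb.modules", "lsb.resources")


def config_paths(lsf_confdir, lsb_confdir, cluster_name):
    """List configuration file paths for one cluster."""
    lsf_dir = PurePosixPath(lsf_confdir)
    lsb_dir = PurePosixPath(lsb_confdir) / cluster_name / "configdir"
    paths = [lsf_dir / name for name in LSF_FILES]
    paths.append(lsf_dir / f"lsf.cluster.{cluster_name}")
    paths.extend(lsb_dir / name for name in LSB_FILES)
    return [str(p) for p in paths]


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    for path in config_paths("/usr/local/lsf/conf",
                             "/usr/local/lsf/conf/lsbatch",
                             "cluster1"):
        print(path)
```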
File                      Description
install.config            Options for Platform LSF installation and configuration
lsf.conf                  Generic environment configuration file describing the
                          configuration and operation of the cluster
lsf.shared                Definition file shared by all clusters. Used to define cluster
                          name, host types, host models and site-defined resources
lsf.cluster.cluster_name  Cluster configuration files used to define hosts, administrators,
                          and locality of site-defined shared resources
lsf.licensescheduler      Configures Platform LSF License Scheduler
lsb.params                Configures LSF batch parameters
lsb.queues                Batch queue configuration file
lsb.modules               Configures LSF scheduler and resource broker plugin modules
lsb.resources             Configures resource allocation limits, exports, and resource
                          usage limits
lsb.serviceclasses        Defines service-level agreements (SLAs) in an LSF cluster as
                          service classes, which define the properties of the SLA
lsb.users                 Configures user groups, hierarchical fairshare for users and
                          user groups, and job slot limits for users and user groups
Cluster configuration parameters (lsf.conf)
Variable        Description                                     UNIX Default
LSF_TOP         Top-level LSF installation directory, must      /usr/local/lsf
                be accessible from all hosts in the cluster
LSF_BINDIR      Directory containing LSF user commands,         LSF_TOP/version/platform/bin
                shared by all hosts of the same type
LSF_CONFDIR     Directory for all LSF configuration files       LSF_TOP/conf
LSF_ENVDIR      Directory containing the lsf.conf file, must    /etc (if LSF_CONFDIR is not defined)
                be owned by root
LSF_INCLUDEDIR  Directory containing LSF API header files       LSF_TOP/version/include
                lsf.h and lsbatch.h
LSF_LIBDIR      LSF libraries, shared by all hosts of the       LSF_TOP/version/platform/lib
                same type
LSF_LOGDIR      (Optional) Directory for LSF daemon logs,       /tmp
                must be owned by root
LSF_LOG_MASK    Specifies the logging level of error            LOG_WARNING
                messages from LSF commands
LSF_MANDIR      Directory containing LSF man pages              LSF_TOP/version/man
LSF_MISC        Help files for the LSF GUI tools, sample C      LSF_TOP/version/misc
                programs and shell scripts, and a template
                for an external LIM (elim)
LSF_SERVERDIR   Directory for all server binaries and shell     LSF_TOP/version/platform/etc
                scripts, and external executables invoked
                by LSF daemons, must be owned by root,
                and shared by all hosts of the same type
LSB_CONFDIR     Directory for LSF Batch configuration           LSF_CONFDIR/lsbatch
                directories, containing user and host lists,
                operation parameters, and batch queues
LSB_SHAREDIR    Directory for LSF Batch job history and         LSF_TOP/work
                accounting log files for each cluster, must
                be owned by primary LSF administrator
LSF_LIM_PORT    TCP service port used for communication         6879
                with lim
LSF_RES_PORT    TCP service port used for communication         6878
                with res
LSB_MBD_PORT    TCP service port used for communication         6881
                with mbatchd
LSB_SBD_PORT    TCP service port used for communication         6882
                with sbatchd
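The sketch below shows one way a script might read these parameters and fall back to the documented port defaults. It assumes lsf.conf holds one NAME=VALUE setting per line with "#" starting a comment; the function names parse_conf and service_ports and the sample text are illustrative and not part of LSF.

```python
# Port defaults copied from the table above.
DEFAULT_PORTS = {
    "LSF_LIM_PORT": 6879,
    "LSF_RES_PORT": 6878,
    "LSB_MBD_PORT": 6881,
    "LSB_SBD_PORT": 6882,
}


def parse_conf(text):
    """Parse NAME=VALUE lines, skipping blank lines and '#' comments.

    Assumption: this simple format is enough for the settings of interest.
    """
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        name, _, value = line.partition("=")
        settings[name.strip()] = value.strip().strip('"')
    return settings


def service_ports(settings):
    """Return the four daemon ports, using the table defaults when unset."""
    return {
        name: int(settings.get(name, default))
        for name, default in DEFAULT_PORTS.items()
    }


if __name__ == "__main__":
    # Hypothetical excerpt; the values are made up for the example.
    sample = """
    # example settings
    LSF_LOGDIR=/var/log/lsf
    LSF_LIM_PORT=7000
    """
    conf = parse_conf(sample)
    print(conf.get("LSF_LOGDIR", "/tmp"))
    print(service_ports(conf))
```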
Platform LSF®
Quick Reference
Version 6.2
Administration and accounting commands
Only LSF administrators or root can use these commands.
Command Description
lsacct Displays accounting statistics on finished RES tasks in the LSF system
lsadmin LSF administrative tool to control the operation of the LIM and RES
daemons in an LSF cluster. lsadmin help shows all subcommands.
lsfinstall Installs LSF using the install.config input file
lsfrestart Restart the LSF daemons on all hosts in the local cluster
lsfshutdown Shut down the LSF daemons on all hosts in the local cluster
lsfstartup Start the LSF daemons on all hosts in the local cluster
bacct Reports accounting statistics on completed LSF jobs
badmin LSF administrative tool to control the operation of the LSF Batch
system including sbatchd, mbatchd, hosts and queues. badmin help
shows all subcommands.
bladmin Reconfigures the Platform LSF License Scheduler daemon (bld)
brun Forces LSF to run a submitted, pending job immediately on a specified
host
brsvadd Creates an advance reservation
brsvdel Deletes an advance reservation
Daemons
Executable Name Description
lim Load Information Manager (LIM)—collects load and resource
information about all server hosts in the cluster and provides host
selection services to applications through LSLIB. LIM maintains
information on static system resources and dynamic load indices.
mbatchd Master Batch Daemon (MBD)—accepts and holds all batch jobs.
MBD periodically checks load indices on all server hosts by
contacting the Master LIM.
mbschd Master Batch Scheduler Daemon—performs the scheduling
functions of LSF and sends job scheduling decisions to MBD for
dispatch. Runs on the LSF master server host.
sbatchd Slave Batch Daemon (SBD)—accepts job execution requests
from MBD, and monitors the progress of jobs. Controls job
execution, enforces batch policies, reports job status to MBD, and
launches MBD.
pim Process Information Manager (PIM)—monitors resources used
by submitted jobs while they are running. PIM is used to enforce
resource limits and load thresholds, and for fairshare scheduling.
res Remote Execution Server (RES)—accepts remote execution
requests from all load sharing applications and handles I/O on the
remote host for load sharing processes.
User commands
Viewing information about your cluster
Command Description
bhosts Displays hosts and their static and dynamic resources
bhpart Displays information about host partitions
bmgroup Displays information about host groups
blimits Displays information about resource allocation limits of running jobs
bparams Displays information about tunable batch system parameters
bqueues Displays information about batch queues
brsvs Displays advance reservations
bugroup Displays information about user groups
busers Displays information about users and user groups
lshosts Displays hosts and their static resource information
lsid Displays the current LSF version number, cluster name and the master
host name
lsinfo Displays load sharing configuration information
lsload Displays dynamic load indices for hosts
Monitoring jobs and tasks
Command Description
bhist Displays historical information about jobs
bjgroup Displays information about job groups
bjobs Displays information about jobs
blimits Displays information about resource allocation limits
bpeek Displays stdout and stderr of unfinished jobs
bsla Displays information about service class configuration for goal-oriented
service-level agreement (SLA) scheduling
bstatus Reads or sets external job status messages and data files
Submitting and controlling jobs
Command Description
bbot Moves a pending job relative to the last job in the queue
bchkpnt Checkpoints a checkpointable job
bgadd Creates job groups
bgdel Deletes job groups
bkill Sends a signal to a job
bmig Migrates a checkpointable or rerunnable job
bmod Modifies job submission options
bpost Sends messages and attaches data files to a job
bread Reads messages and attached data files from a job
brequeue Kills and requeues a job
brestart Restarts a checkpointed job
bresume Resumes a suspended job
bstop Suspends a job
bsub Submits a job
bswitch Moves unfinished jobs from one queue to another
btop Moves a pending job relative to the first job in the queue
bsub command
Syntax
bsub [options] command [arguments]
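As an illustration of the syntax line, the following Python sketch assembles a bsub argument list from a few of the options described in this reference (-q, -n, -W, -o). It only builds and prints the command and never runs it. The function name build_bsub, the queue name "normal", and the program and file names are hypothetical.

```python
import shlex


def build_bsub(command, queue=None, min_proc=None, max_proc=None,
               run_limit_minutes=None, output_file=None):
    """Return a bsub argument list: bsub [options] command [arguments]."""
    argv = ["bsub"]
    if queue is not None:
        argv += ["-q", queue]
    if min_proc is not None:
        # -n takes min_proc[,max_proc]
        spec = str(min_proc) if max_proc is None else f"{min_proc},{max_proc}"
        argv += ["-n", spec]
    if run_limit_minutes is not None:
        # -W takes [hour:]minute
        hours, minutes = divmod(run_limit_minutes, 60)
        argv += ["-W", f"{hours}:{minutes:02d}" if hours else str(minutes)]
    if output_file is not None:
        argv += ["-o", output_file]
    return argv + list(command)


if __name__ == "__main__":
    argv = build_bsub(["./my_app", "input.dat"], queue="normal",
                      min_proc=2, max_proc=4, run_limit_minutes=90,
                      output_file="job.out")
    # Prints: bsub -q normal -n 2,4 -W 1:30 -o job.out ./my_app input.dat
    print(shlex.join(argv))
```

Keeping the command as a list until the last moment avoids shell quoting mistakes; the list can be handed to subprocess.run on a host where bsub is installed.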
Options
Option                                          Description
-B                                              Sends email when the job is dispatched
-H                                              Holds the job in the PSUSP state at submission
-I | -Ip | -Is                                  Submits a batch interactive job. -Ip creates a
                                                pseudo-terminal. -Is creates a pseudo-terminal in
                                                shell mode.
-K                                              Submits a job and waits for the job to finish
-N                                              Emails the job report when the job finishes
-r                                              Makes a job rerunnable
-x                                              Exclusive execution
-a esub_parameters                              String format parameter containing the name of an
                                                application-specific esub program to be passed to the
                                                master esub
-b begin_time                                   Dispatches the job on or after the specified date and
                                                time in the form [[month:]day:]hour:minute
-C core_limit                                   Sets a per-process (soft) core file size limit (KB) for
                                                all the processes that belong to this job
-c cpu_time[/host_name | /host_model]           Limits the total CPU time the job can use. CPU time is
                                                in the form [hour:]minute
-D data_limit                                   Sets a per-process (soft) data segment size limit (KB)
                                                for each process that belongs to the job
-e error_file                                   Appends the standard error output to a file
-ext[sched] "external_scheduler_options"        Application-specific external scheduling options for
                                                the job (-extsched can be abbreviated to -ext)
-E "pre_exec_command [arguments ...]"           Runs the specified pre-exec command on the
                                                execution host before running the job
-f "local_file op [remote_file]" ...            Copies a file between the local (submission) host and
                                                remote (execution) host. op is one of >, <, <<, ><, <>
-F file_limit                                   Sets a per-process (soft) file size limit (KB) for each
                                                process that belongs to the job
-G user_group                                   Associates job with a specified user group
-g job_group_name                               Associates job with a specified job group
-i input_file | -is input_file                  Gets the standard input for the job from specified file
-J "job_name[index_list]%job_slot_limit"        Assigns the specified name to the job. For a job array,
                                                index_list has the form start[-end[:step]], and
                                                %job_slot_limit is the maximum number of jobs that
                                                can run at any given time.
-k "chkpnt_dir [chkpnt_period]                  Makes a job checkpointable and specifies the
    [method=method_name]"                       checkpoint directory, period in minutes, and method
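To make the -J index_list form start[-end[:step]] concrete, here is a small Python sketch that expands one such element into the job indices it denotes and assembles the quoted -J argument. It handles a single element only and assumes end is inclusive; the function names expand_index_list and job_name_spec and the job name "sim" are made up for the example.

```python
import re

# One index_list element: start, optionally -end, optionally :step.
_INDEX_RE = re.compile(r"^(\d+)(?:-(\d+)(?::(\d+))?)?$")


def expand_index_list(index_list):
    """Expand 'start[-end[:step]]' into a list of indices (end inclusive)."""
    match = _INDEX_RE.match(index_list)
    if match is None:
        raise ValueError(f"not of the form start[-end[:step]]: {index_list!r}")
    start = int(match.group(1))
    end = int(match.group(2)) if match.group(2) else start
    step = int(match.group(3)) if match.group(3) else 1
    if step < 1 or end < start:
        raise ValueError(f"invalid range: {index_list!r}")
    return list(range(start, end + 1, step))


def job_name_spec(job_name, index_list, job_slot_limit=None):
    """Build the -J argument: job_name[index_list]%job_slot_limit."""
    spec = f"{job_name}[{index_list}]"
    if job_slot_limit is not None:
        spec += f"%{job_slot_limit}"
    return spec


if __name__ == "__main__":
    print(expand_index_list("1-10:3"))
    print(job_name_spec("sim", "1-100", 10))
```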
Option                                          Description
-L login_shell                                  Initializes the execution environment using the
                                                specified login shell
-Lp ls_project_name                             Assigns the job to the specified License Scheduler
                                                project
-m "host_name[@cluster_name][+[pref_level]] |   Runs job on one of the specified hosts. Plus (+) after
    host_group[+[pref_level]] ..."              the names of hosts or host groups indicates a
                                                preference. Optionally, a positive integer indicates a
                                                preference level. Higher numbers indicate greater
                                                preferences for those hosts.
-M mem_limit                                    Sets the memory limit (KB)
-n min_proc[,max_proc]                          Specifies the minimum and maximum numbers of
                                                processors required for a parallel job
-o output_file                                  Appends the standard output to a file
-P project_name                                 Assigns job to specified project
-p process_limit                                Sets the limit of the number of processes for the whole
                                                job
-q "queue_name ..."                             Submits job to specified queues
-R "res_req"                                    Specifies host resource requirements
-sla service_class_name                         Specifies the service class where the job is to run
-sp priority                                    Specifies user-assigned job priority to allow users to
                                                order their jobs in a queue
-S stack_limit                                  Sets a per-process (soft) stack segment size limit (KB)
                                                for each of the processes that belong to the job
-s signal                                       Sends a signal when a queue-level run window closes
-T thread_limit                                 Sets the limit of the number of concurrent threads for
                                                the whole job
-t term_time                                    Specifies the job termination deadline in the form
                                                [[month:]day:]hour:minute
-U reservation_ID                               Uses an advance reservation created with brsvadd
-u mail_user                                    Sends mail to the specified email address
-v swap_limit                                   Sets the total process virtual memory limit (KB) for the
                                                whole job
-w 'dependency_expression'                      Places a job when the dependency expression
                                                evaluates to TRUE
-wa '[signal | command | CHKPNT]'               Specifies the job action to be taken before a job
                                                control action occurs
-wt '[hour:]minute'                             Specifies the amount of time before a job control
                                                action occurs that a job warning action is to be taken
-W run_time[/host_name | /host_model]           Sets the run time limit of the job in the form
                                                [hour:]minute
-Zs                                             Spools a command file for the job to the directory
                                                specified by the JOB_SPOOL_DIR in lsb.params
-h                                              Prints command usage to stderr and exits
-V                                              Prints LSF release version to stderr and exits

© 2000-2005 Platform Computing Corporation. All rights reserved.
Last Update: September 29 2005
All products or services mentioned in this document are identified by the trademarks or service marks of their
respective owners.

www.platform.com
doc@platform.com
support@platform.com
training@platform.com
+1 87PLATFORM (+1 877 528 3676)