jobqueues.lsfqueue module
- class jobqueues.lsfqueue.LsfQueue(_configapp=None, _configfile=None, _findExecutables=True, _logger=True)
Bases:
SimQueue
Queue system for LSF
- Parameters:
version ([9, 10], int, default=9) – LSF major version
jobname (str, default=None) – Job name (identifier)
queue (list, default=None) – The queue or list of queues to run on. If a list is given, it attempts to submit the job to the first queue listed
app (str, default=None) – The application profile
ngpu (int, default=1) – Number of GPUs to use for a single job
gpu_options (dict, default=None) – GPU options for a single job (valid dict entries: {'mode': <'shared' or 'exclusive_process'>}, {'mps': <'yes' or 'no'>}, {'j_exclusive': <'yes' or 'no'>})
ncpu (int, default=1) – Number of CPUs to use for a single job
memory (int, default=4000000) – Amount of memory per job (KB)
walltime (int, default=None) – Job timeout (hour:min or min)
resources (list, default=None) – Resources of the queue
outputstream (str, default='lsf.%J.out') – Output stream.
errorstream (str, default='lsf.%J.err') – Error stream.
datadir (str, default=None) – The path in which to store completed trajectories.
trajext (str, default='xtc') – Extension of trajectory files. This is needed to copy them to datadir.
envvars (str, default='ACEMD_HOME') – Environment variables to propagate from the submission node to the running node (comma-separated)
prerun (list, default=None) – Shell commands to execute on the running node before the job (e.g. loading modules)
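To make the `ngpu`, `gpu_options`, and `walltime` parameters more concrete, here is a minimal sketch of how such values could be rendered as scheduler option strings. LSF's `bsub` accepts a `-gpu` value of the form `num=...:mode=...:mps=...:j_exclusive=...` and a run limit written as `[hour:]minute`; whether `LsfQueue` produces exactly these strings is an assumption. The helper names `build_gpu_option` and `format_walltime` are hypothetical and not part of jobqueues.

```python
def build_gpu_option(ngpu, gpu_options=None):
    """Compose a value for LSF's ``bsub -gpu`` option.

    Hypothetical helper for illustration; not part of jobqueues.
    """
    # Valid keys and values as listed in the gpu_options parameter description.
    allowed = {
        "mode": {"shared", "exclusive_process"},
        "mps": {"yes", "no"},
        "j_exclusive": {"yes", "no"},
    }
    parts = [f"num={ngpu}"]
    for key, value in (gpu_options or {}).items():
        if key not in allowed:
            raise ValueError(f"unknown gpu_options key: {key!r}")
        if value not in allowed[key]:
            raise ValueError(f"invalid value {value!r} for {key!r}")
        parts.append(f"{key}={value}")
    return ":".join(parts)


def format_walltime(minutes):
    """Render a minute count in 'hour:min' form (hypothetical helper)."""
    hours, mins = divmod(int(minutes), 60)
    return f"{hours}:{mins:02d}"


print(build_gpu_option(1, {"mode": "exclusive_process", "mps": "no"}))
# num=1:mode=exclusive_process:mps=no
print(format_walltime(90))
# 1:30
```

Validating keys and values up front surfaces a mistyped option before the job reaches the scheduler.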
Examples
>>> s = LsfQueue()
>>> s.jobname = 'simulation1'
>>> s.queue = 'multiscale'
>>> s.submit('/my/runnable/folder/')  # Folder containing a run.sh bash script
- inprogress()
Returns the total number of running and queued workunits of the specific group in the engine.
- Returns:
total – Total running and queued workunits
- Return type:
int
- property memory
Subclasses need to have this property. It is expected to return an integer in MiB
- property ncpu
Subclasses need to have this property
- property ngpu
Subclasses need to have this property
- retrieve()
Subclasses need to implement this method
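The "subclasses need to have this property" notes on `memory`, `ncpu`, `ngpu`, and `retrieve()` describe an interface that each queue class must fill in. The sketch below shows one generic Python way to express such a contract with the standard `abc` module. `QueueInterface` and `ToyQueue` are hypothetical names; this is not the actual `SimQueue` implementation, which is not shown in this page.

```python
from abc import ABC, abstractmethod


class QueueInterface(ABC):
    """Hypothetical stand-in for a queue base class; not the real SimQueue."""

    @property
    @abstractmethod
    def ncpu(self):
        """Number of CPUs per job."""

    @property
    @abstractmethod
    def ngpu(self):
        """Number of GPUs per job."""

    @property
    @abstractmethod
    def memory(self):
        """Memory per job as an integer in MiB."""

    @abstractmethod
    def retrieve(self):
        """Fetch results of completed jobs."""


class ToyQueue(QueueInterface):
    """Minimal concrete subclass that satisfies the interface."""

    def __init__(self, ncpu=1, ngpu=1, memory=4000):
        self._ncpu, self._ngpu, self._memory = ncpu, ngpu, memory

    @property
    def ncpu(self):
        return self._ncpu

    @property
    def ngpu(self):
        return self._ngpu

    @property
    def memory(self):
        return self._memory

    def retrieve(self):
        return []  # nothing to fetch in this toy example


q = ToyQueue(ncpu=4)
print(q.ncpu, q.ngpu, q.memory)
```

With this pattern, a subclass that omits any of the abstract members cannot be instantiated; Python raises `TypeError` at construction time.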
- stop()
Cancels all currently running and queued jobs
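As an illustration of how `inprogress()` and `stop()` might be combined, the following sketch polls a queue until no workunits remain and cancels everything if a deadline passes. It relies only on the two methods documented above. The helper `wait_or_stop` and the `FakeQueue` test double are hypothetical and not part of jobqueues.

```python
import time


def wait_or_stop(queue, timeout_s, poll_s=1.0, sleep=time.sleep, clock=time.monotonic):
    """Poll queue.inprogress() until it reaches 0.

    Calls queue.stop() if timeout_s seconds elapse first.
    Returns True if all work finished, False if it was cancelled.
    """
    deadline = clock() + timeout_s
    while queue.inprogress() > 0:
        if clock() >= deadline:
            queue.stop()  # cancel all running and queued jobs
            return False
        sleep(poll_s)
    return True


class FakeQueue:
    """Test double: each inprogress() call reports one fewer workunit."""

    def __init__(self, remaining):
        self.remaining = remaining
        self.stopped = False

    def inprogress(self):
        current = self.remaining
        self.remaining = max(0, self.remaining - 1)
        return current

    def stop(self):
        self.stopped = True
        self.remaining = 0


q = FakeQueue(3)
print(wait_or_stop(q, timeout_s=60, poll_s=0))
# True
```

Passing `sleep` and `clock` as parameters keeps the loop easy to test without real waiting; with a real queue object the defaults would normally be left as they are.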