Description

A Lightweight Wrapper for 'Slurm'.

'Slurm', Simple Linux Utility for Resource Management <https://slurm.schedmd.com/>, is a popular 'Linux' based software used to schedule jobs in 'HPC' (High Performance Computing) clusters. This R package provides a specialized lightweight wrapper of 'Slurm' with a syntax similar to that found in the 'parallel' R package. The package also includes a method for creating socket cluster objects spanning multiple nodes that can be used with the 'parallel' package.


slurmR: A Lightweight Wrapper for Slurm

Slurm Workload Manager is a popular HPC cluster job scheduler found in many of the top 500 supercomputers. The slurmR R package provides an R wrapper for it that matches the parallel package’s syntax: just as parallel provides parLapply, clusterMap, parSapply, etc., slurmR provides Slurm_lapply, Slurm_Map, Slurm_sapply, etc.
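As a rough illustration of that correspondence, the sketch below runs parLapply on a small local cluster and shows the analogous slurmR call as a comment. The local two-worker cluster and the njobs = 2 value are assumptions made for the example; the slurmR line is not executed because it needs a Slurm installation.

```r
library(parallel)

# Toy workload: 100 vectors of length 50
set.seed(881)
x <- replicate(100, runif(50), simplify = FALSE)

# parallel: the work is sent to workers held in a cluster object
cl <- makeCluster(2)
ans_par <- parLapply(cl, x, mean)
stopCluster(cl)

# slurmR analogue (not run here; it needs a Slurm cluster):
# ans_slurm <- slurmR::Slurm_lapply(x, mean, njobs = 2)

# The parallel result agrees with a plain lapply() call
all.equal(ans_par, lapply(x, mean))
```

The only visible difference is where the workers come from: a cluster object for parallel, a Slurm job array for slurmR.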

While there are other alternatives such as future.batchtools, batchtools, clustermq, and rslurm, this R package has the following goals:

  1. It is dependency-free, which means that it works out-of-the-box.

  2. It emphasizes being similar to the workflow in the R package parallel.

  3. It provides a general framework for creating personalized wrappers without using template files.

  4. It is specialized on Slurm, meaning more flexibility (no need to modify template files) and debugging tools (e.g., job resubmission).

  5. It provides a backend for the parallel package, offering an out-of-the-box method for creating Socket cluster objects for multi-node operations. (See the examples below on how to use it with other R packages.)

Check out the VS section to compare slurmR with other R packages. Wondering who is using Slurm? Check out the list at the end of this document.

Installation

From your HPC command line, you can install the development version from GitHub with:

$ git clone https://github.com/USCbiostats/slurmR.git
$ R CMD INSTALL slurmR/ 

The second line assumes you have R available in your system (usually loaded via module R or some other command). Alternatively, use the devtools package from within R:

# install.packages("devtools")
devtools::install_github("USCbiostats/slurmR")

Citation

To cite slurmR in publications use:

  Vega Yon et al., (2019). slurmR: A lightweight wrapper for HPC with
  Slurm. Journal of Open Source Software, 4(39), 1493,
  https://doi.org/10.21105/joss.01493

And the actual R package:

  Vega Yon G, Marjoram P (2022). _slurmR: A Lightweight Wrapper for
  'Slurm'_. R package version 0.5-2,
  <https://github.com/USCbiostats/slurmR>.

To see these entries in BibTeX format, use 'print(<citation>,
bibtex=TRUE)', 'toBibtex(.)', or set
'options(citation.bibtex.max=999)'.

Running slurmR with Docker

For testing purposes, slurmR is available on Docker Hub. The rcmdcheck and interactive images are built on top of xenonmiddleware/slurm.

Once you download the files contained in the slurmR repository, you can go to the docker folder and use the Makefile included there to start a Unix session with slurmR and Slurm included.

To test slurmR using docker, check the README.md file located at https://github.com/USCbiostats/slurmR/tree/master/docker.

Examples

Example 1: Computing means (and looking under the hood)

library(slurmR)
#  Loading required package: parallel
#  slurmR default option for `tmp_path` (used to store auxiliar files) set to:
#    /home/george/Documents/development/slurmR
#  You can change this and checkout other slurmR options using: ?opts_slurmR, or you could just type "opts_slurmR" on the terminal.

# Suppose that we have 100 vectors of length 50 ~ Unif(0,1)
set.seed(881)
x <- replicate(100, runif(50), simplify = FALSE)

We can use the function Slurm_lapply to distribute computations:

ans <- Slurm_lapply(x, mean, plan = "none")
#  Warning in normalizePath(file.path(tmp_path, job_name)):
#  path[1]="/home/george/Documents/development/slurmR/slurmr-job-113bd5bca5b18":
#  No such file or directory
#  Warning: [submit = FALSE] The job hasn't been submitted yet. Use sbatch() to submit the job, or you can submit it via command line using the following:
#  sbatch --job-name=slurmr-job-113bd5bca5b18 /home/george/Documents/development/slurmR/slurmr-job-113bd5bca5b18/01-bash.sh
Slurm_clean(ans) # Cleaning after you

Notice the plan = "none" option; this tells Slurm_lapply to only create the job object but do nothing with it, i.e., skip submission. To get more information, we can turn verbose mode on:

opts_slurmR$verbose_on()
ans <- Slurm_lapply(x, mean, plan = "none")
#  Warning in normalizePath(file.path(tmp_path, job_name)):
#  path[1]="/home/george/Documents/development/slurmR/slurmr-job-113bd5bca5b18":
#  No such file or directory
#  --------------------------------------------------------------------------------
#  [VERBOSE MODE ON] The R script that will be used is located at: /home/george/Documents/development/slurmR/slurmr-job-113bd5bca5b18/00-rscript.r and has the following contents:
#  --------------------------------------------------------------------------------
#  .libPaths(c("/home/george/R/x86_64-pc-linux-gnu-library/4.2", "/usr/local/lib/R/site-library", "/usr/lib/R/site-library", "/usr/lib/R/library"))
#  message("[slurmR info] Loading variables and functions... ", appendLF = FALSE)
#  Slurm_env <- function (x = "SLURM_ARRAY_TASK_ID") 
#  {
#      y <- Sys.getenv(x)
#      if ((x == "SLURM_ARRAY_TASK_ID") && y == "") {
#          return(1)
#      }
#      y
#  }
#  ARRAY_ID  <- as.integer(Slurm_env("SLURM_ARRAY_TASK_ID"))
#  
#  # The -snames- function creates the write names for I/O of files as a 
#  # function of the ARRAY_ID
#  snames    <- function (type, array_id = NULL, tmp_path = NULL, job_name = NULL) 
#  {
#      if (length(array_id) && length(array_id) > 1) 
#          return(sapply(array_id, snames, type = type, tmp_path = tmp_path, 
#              job_name = job_name))
#      type <- switch(type, r = "00-rscript.r", sh = "01-bash.sh", 
#          out = "02-output-%A-%a.out", rds = if (missing(array_id)) "03-answer-%03i.rds" else sprintf("03-answer-%03i.rds", 
#              array_id), job = "job.rds", stop("Invalid type, the only valid types are `r`, `sh`, `out`, and `rds`.", 
#              call. = FALSE))
#      sprintf("%s/%s/%s", tmp_path, job_name, type)
#  }
#  TMP_PATH  <- "/home/george/Documents/development/slurmR"
#  JOB_NAME  <- "slurmr-job-113bd5bca5b18"
#  
#  # The -tcq- function is a wrapper of tryCatch that on error tries to recover
#  # the message and saves the outcome so that slurmR can return OK.
#  tcq <- function (...) 
#  {
#      ans <- tryCatch(..., error = function(e) e)
#      if (inherits(ans, "error")) {
#          ARRAY_ID. <- get("ARRAY_ID", envir = .GlobalEnv)
#          msg <- paste0("[slurmR info] An error has ocurred while evualting the expression:\n[slurmR info]   ", 
#              paste(deparse(match.call()[[2]]), collapse = "\n[slurmR info]   "), 
#              "\n[slurmR info] in ", "ARRAY_ID # ", ARRAY_ID., 
#              "\n[slurmR info] The error will be saved and quit R.\n")
#          message(msg, immediate. = TRUE, call. = FALSE)
#          ans <- list(res = ans, array_id = ARRAY_ID., job_name = get("JOB_NAME", 
#              envir = .GlobalEnv), slurmr_msg = structure(msg, 
#              class = "slurm_info"))
#          saveRDS(list(ans), snames("rds", tmp_path = get("TMP_PATH", 
#              envir = .GlobalEnv), job_name = get("JOB_NAME", envir = .GlobalEnv), 
#              array_id = ARRAY_ID.))
#          message("[slurmR info] job-status: failed.\n")
#          q(save = "no")
#      }
#      invisible(ans)
#  }
#  message("done loading variables and functions.")
#  tcq({
#    INDICES <- readRDS("/home/george/Documents/development/slurmR/slurmr-job-113bd5bca5b18/INDICES.rds")
#  })
#  tcq({
#    X <- readRDS(sprintf("/home/george/Documents/development/slurmR/slurmr-job-113bd5bca5b18/X_%04d.rds", ARRAY_ID))
#  })
#  tcq({
#    FUN <- readRDS("/home/george/Documents/development/slurmR/slurmr-job-113bd5bca5b18/FUN.rds")
#  })
#  tcq({
#    mc.cores <- readRDS("/home/george/Documents/development/slurmR/slurmr-job-113bd5bca5b18/mc.cores.rds")
#  })
#  tcq({
#    seeds <- readRDS("/home/george/Documents/development/slurmR/slurmr-job-113bd5bca5b18/seeds.rds")
#  })
#  set.seed(seeds[ARRAY_ID], kind = NULL, normal.kind = NULL)
#  tcq({
#    ans <- parallel::mclapply(
#      X                = X,
#      FUN              = FUN,
#      mc.cores         = mc.cores
#  )
#  })
#  saveRDS(ans, sprintf("/home/george/Documents/development/slurmR/slurmr-job-113bd5bca5b18/03-answer-%03i.rds", ARRAY_ID), compress = TRUE)
#  message("[slurmR info] job-status: OK.\n")
#  --------------------------------------------------------------------------------
#  The bash file that will be used is located at: /home/george/Documents/development/slurmR/slurmr-job-113bd5bca5b18/01-bash.sh and has the following contents:
#  --------------------------------------------------------------------------------
#  #!/bin/sh
#  #SBATCH --job-name=slurmr-job-113bd5bca5b18
#  #SBATCH --output=/home/george/Documents/development/slurmR/slurmr-job-113bd5bca5b18/02-output-%A-%a.out
#  #SBATCH --array=1-2
#  #SBATCH --job-name=slurmr-job-113bd5bca5b18
#  #SBATCH --cpus-per-task=1
#  #SBATCH --ntasks=1
#  /usr/lib/R/bin/Rscript  /home/george/Documents/development/slurmR/slurmr-job-113bd5bca5b18/00-rscript.r
#  --------------------------------------------------------------------------------
#  EOF
#  --------------------------------------------------------------------------------
#  Warning: [submit = FALSE] The job hasn't been submitted yet. Use sbatch() to submit the job, or you can submit it via command line using the following:
#  sbatch --job-name=slurmr-job-113bd5bca5b18 /home/george/Documents/development/slurmR/slurmr-job-113bd5bca5b18/01-bash.sh
Slurm_clean(ans) # Cleaning after you
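
The generated R script suggests a split/apply/combine pattern: the input is divided among the array tasks, each task applies the function to its own chunk and saves an answer file, and the answers are read back in order. The following is a minimal base-R emulation of that idea, not slurmR's actual implementation. The temporary directory, the sequential for loop standing in for the array tasks, and the use of parallel::splitIndices are all assumptions made for illustration.

```r
# Same toy data as in the example above
set.seed(881)
x <- replicate(100, runif(50), simplify = FALSE)

njobs   <- 2L
indices <- parallel::splitIndices(length(x), njobs)  # one index set per task

# Scratch directory standing in for tmp_path/job_name
tmp <- tempfile("slurmr-sketch-")
dir.create(tmp)

# Each iteration plays the role of one array task (ARRAY_ID = id)
for (id in seq_len(njobs)) {
  chunk <- x[indices[[id]]]
  ans   <- lapply(chunk, mean)
  saveRDS(ans, file.path(tmp, sprintf("03-answer-%03i.rds", id)))
}

# "Collecting": read the per-task answers back in task order and concatenate
files <- file.path(tmp, sprintf("03-answer-%03i.rds", seq_len(njobs)))
res   <- do.call(c, lapply(files, readRDS))

# The combined result matches a plain lapply() over the full input
identical(res, lapply(x, mean))

unlink(tmp, recursive = TRUE)
```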

Example 2: Job resubmission

The following example was extracted from the package’s manual.

# Submitting a simple job
job <- Slurm_EvalQ(slurmR::WhoAmI(), njobs = 20, plan = "submit")

# Checking the status of the job (we can simply print)
job
status(job) # or use the state function
sacct(job) # or get more info with the sacct wrapper.

# Suppose some of the jobs are taking too long to complete (say 1, 2, and 15 through 20)
# we can stop it and resubmit the job as follows:
scancel(job)

# Resubmitting only those array IDs
sbatch(job, array = "1,2,15-20") # A new jobid will be assigned

# Once it's done, we can collect all the results at once
res <- Slurm_collect(job)

# And clean up if we don't need to use it again
Slurm_clean(res)

Take a look at the vignette here.
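
To make the array specification used in the resubmission concrete, here is a small sketch that expands a string such as "1,2,15-20" into the individual task IDs it covers. The helper expand_array_spec is hypothetical (it is not part of slurmR), and it only handles comma-separated values and simple ranges, not the step or concurrency-limit syntax that Slurm also accepts.

```r
# Hypothetical helper: expand a simple Slurm-style array specification
expand_array_spec <- function(spec) {
  parts <- strsplit(spec, ",", fixed = TRUE)[[1]]
  ids <- lapply(parts, function(p) {
    bounds <- as.integer(strsplit(p, "-", fixed = TRUE)[[1]])
    # A single number stands for itself; "a-b" stands for a, a+1, ..., b
    if (length(bounds) == 1L) bounds else seq(bounds[1], bounds[2])
  })
  sort(unique(unlist(ids)))
}

ids <- expand_array_spec("1,2,15-20")
print(ids)
```

A check like this can be handy before resubmitting, for example to confirm that the IDs you list are the ones that did not finish.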

Example 3: Using slurmR and future/doParallel/boot/…

The function makeSlurmCluster creates a PSOCK cluster within a Slurm HPC network, meaning that users can go beyond a single-node cluster object and take advantage of Slurm to create a multi-node cluster object. This feature allows using slurmR with other R packages that support working with SOCKcluster class objects. Here are some examples:

With the future package

library(future)
library(slurmR)

cl <- makeSlurmCluster(50)

# It only takes using a cluster plan!
plan(cluster, cl)

# ...your fancy futuristic code...

# Slurm Clusters are stopped in the same way any cluster object is
stopCluster(cl)

With the doParallel package

library(doParallel)
library(slurmR)

cl <- makeSlurmCluster(50)

registerDoParallel(cl)
m <- matrix(rnorm(9), 3, 3)
# Illustrative loop body: scale each row of m by its mean
foreach(i=1:nrow(m), .combine=rbind) %dopar% (m[i,] / mean(m[i,]))

stopCluster(cl)
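
With the boot package

The same idea should carry over to any function that accepts a cluster object. The sketch below uses boot with a local two-worker cluster as a stand-in so that it runs without Slurm. The assumption is that on a Slurm system the makeCluster line would be replaced by makeSlurmCluster; the data, statistic, and number of replicates are made up for the example.

```r
library(parallel)
library(boot)

# Stand-in: a local PSOCK cluster. On a Slurm system, the assumption is
# that this would instead be: cl <- slurmR::makeSlurmCluster(2)
cl <- makeCluster(2)

# Bootstrap the mean of a small sample, spreading replicates over the workers
set.seed(1)
dat <- rnorm(100)
mean_stat <- function(d, idx) mean(d[idx])
b <- boot(dat, mean_stat, R = 200, parallel = "snow", ncpus = 2, cl = cl)

stopCluster(cl)

dim(b$t)  # one row per bootstrap replicate
```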

Example 4: Using slurmR directly from the command line

The slurmR package has a couple of convenient functions designed to save the user time. First, the function sourceSlurm() lets you skip explicitly creating a bash script file to be used together with sbatch by putting all the required configuration options on the first lines of an R script, for example:

#!/bin/sh
#SBATCH --account=lc_ggv
#SBATCH --partition=scavenge
#SBATCH --time=01:00:00
#SBATCH --mem-per-cpu=4G
#SBATCH --job-name=Waiting
Sys.sleep(10)
message("done.")

This is an R script whose first line coincides with that of a shell script for Slurm: #!/bin/sh. The following lines start with #SBATCH, explicitly specifying options for sbatch, and the remaining lines are just R code.

The previous R script is included in the package (type system.file("example.R", package="slurmR")).

Suppose that the R script is named example.R. You can then use the sourceSlurm function to submit it to Slurm as follows:

slurmR::sourceSlurm("example.R")

This will create the corresponding bash file required to be used with sbatch, and submit it to Slurm.
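
To illustrate the kind of header such a script carries, the following base-R sketch separates the #SBATCH lines from the R code and turns them into named options. It is only an illustration of the file layout and makes no claim about how sourceSlurm parses scripts internally; the temporary file and the variable names are assumptions.

```r
# Write the example script to a temporary file
script <- c(
  "#!/bin/sh",
  "#SBATCH --account=lc_ggv",
  "#SBATCH --partition=scavenge",
  "#SBATCH --time=01:00:00",
  "#SBATCH --mem-per-cpu=4G",
  "#SBATCH --job-name=Waiting",
  "Sys.sleep(10)",
  "message(\"done.\")"
)
path <- tempfile(fileext = ".R")
writeLines(script, path)

lines     <- readLines(path)
is_header <- grepl("^#SBATCH\\s+--", lines)

# Turn "--name=value" pairs into a named character vector
flags <- sub("^#SBATCH\\s+--", "", lines[is_header])
opts  <- setNames(sub("^[^=]+=", "", flags), sub("=.*$", "", flags))

# Everything that is neither a header line nor the shebang is plain R code
r_code <- lines[!is_header & !grepl("^#!", lines)]

print(opts)
print(r_code)
```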

Another nice tool is slurmr_cmd(). This function creates a simple bash script that we can use as a command-line tool to submit this type of R script. Moreover, it can add the command to your session’s aliases as follows:

library(slurmR)
slurmr_cmd("~", add_alias = TRUE)

Once that’s done, you can submit R scripts with “Slurm-like headers” (as shown previously) as follows:

$ slurmr example.R

Example 5: Using the preamble

Since version 0.4-3, slurmR includes the option preamble. This provides a way for the user to specify commands/modules that need to be executed before running the Rscript. Here is an example using module load:

# Turning the verbose mode off
opts_slurmR$verbose_off()

# Setting the preamble can be done globally
opts_slurmR$set_preamble("module load gcc/6.0")

# Or on the fly
ans <- Slurm_lapply(1:10, mean, plan = "none", preamble = "module load pandoc")

# Printing out the bashfile
cat(readLines(ans$bashfile), sep = "\n")
#  #!/bin/sh
#  #SBATCH --job-name=slurmr-job-113bd5bca5b18
#  #SBATCH --output=/home/george/Documents/development/slurmR/slurmr-job-113bd5bca5b18/02-output-%A-%a.out
#  #SBATCH --array=1-2
#  #SBATCH --job-name=slurmr-job-113bd5bca5b18
#  #SBATCH --cpus-per-task=1
#  #SBATCH --ntasks=1
#  module load gcc/6.0
#  module load pandoc
#  /usr/lib/R/bin/Rscript  /home/george/Documents/development/slurmR/slurmr-job-113bd5bca5b18/00-rscript.r

Slurm_clean(ans) # Cleaning after you
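
In the printed batch file, the preamble lines sit between the #SBATCH options and the Rscript call, with the global preamble first and the on-the-fly one after it. The sketch below reproduces that layout with a hypothetical helper, build_bash, which is not a slurmR function; the option names and file name are placeholders.

```r
# Hypothetical helper: assemble the lines of a batch script
build_bash <- function(sbatch_opts, preamble, rscript) {
  c(
    "#!/bin/sh",
    sprintf("#SBATCH --%s=%s", names(sbatch_opts), unlist(sbatch_opts)),
    preamble,                        # runs before R is started
    sprintf("Rscript %s", rscript)
  )
}

global_preamble <- "module load gcc/6.0"  # set once for the session
local_preamble  <- "module load pandoc"   # passed for a single call

bash <- build_bash(
  sbatch_opts = list(`job-name` = "my-job", array = "1-2"),
  preamble    = c(global_preamble, local_preamble),
  rscript     = "00-rscript.r"
)
cat(bash, sep = "\n")
```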

VS

There are several ways to enhance R for HPC. Depending on your goals, restrictions, and preferences, you can use any of the following from this manually curated list:

Package            Rerun (1)  *apply (2)  makeCluster (3)  Slurm options
slurmR             yes        yes         yes              on the fly
drake              yes        -           -                by template
rslurm             -          yes         -                on the fly
future.batchtools  -          yes         yes              by template
batchtools         yes        yes         -                by template
clustermq          -          -           -                by template

  1. After errors, a part or the entire job can be resubmitted.
  2. Functionality similar to the apply family in base R, e.g., lapply, sapply, mapply or similar.
  3. Creating a cluster object using either MPI or Socket connection.

The packages slurmR and rslurm work only on Slurm. The drake package is focused on workflows.

Contributing

We welcome contributions to slurmR. Whether it is reporting a bug, starting a discussion by asking a question, or proposing/requesting a new feature, please create a new issue in the GitHub repository (https://github.com/USCbiostats/slurmR) so that we can talk about it.

Please note that this project is released with a Contributor Code of Conduct (see the CODE_OF_CONDUCT.md file included in this project). By participating in this project, you agree to abide by its terms.

Who uses Slurm

Here is a manually curated list of institutions using Slurm:

Institution                                               Country
University of Utah’s CHPC                                 US
USC Center for Advanced Research Computing                US
Princeton Research Computing                              US
Harvard FAS                                               US
Harvard HMS research computing                            US
UC San Diego WM Keck Lab for Integrated Biology           US
Stanford Sherlock                                         US
Stanford SCG Informatics Cluster                          US
UC Berkeley Open Computing Facility                       US
University of Utah CHPC                                   US
The University of Kansas Center for Research Computing    US
University of Cambridge                                   UK
Indiana University                                        US
Caltech HPC Center                                        US
Institute for Advanced Study                              US
UT Southwestern Medical Center BioHPC                     US
Vanderbilt University ACCRE                               US
University of Virginia Research Computing                 US
Center for Advanced Computing                             CA
SciNet                                                    CA
NLHPC                                                     CL
Kultrun                                                   CL
Matbio                                                    CL
TIG MIT                                                   US
MIT Supercloud (supercloud.mit.edu/)                      US
Oxford’s ARC                                              UK

Funding

This project is supported by the National Cancer Institute, Grant #1P01CA196596.

Computation for the work described in this paper was supported by the University of Southern California’s Center for High-Performance Computing (hpcc.usc.edu).

Metadata

Version

0.5-4

License

Unknown

Platforms (75)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
  • aarch64-darwin
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows