diff --git a/news/index.html b/news/index.html
index f4ace1b..965756a 100644
--- a/news/index.html
+++ b/news/index.html
@@ -60,7 +60,7 @@
-
clustermq 0.9.4
+
clustermq 0.9.4
CRAN release: 2024-03-04
- Fix a bug where worker stats were shown as
NA
(#325)
- Worker API:
env()
now visibly lists environment if called without arguments
diff --git a/pkgdown.yml b/pkgdown.yml
index a26022b..749b6f5 100644
--- a/pkgdown.yml
+++ b/pkgdown.yml
@@ -5,5 +5,5 @@ articles:
faq: faq.html
technicaldocs: technicaldocs.html
userguide: userguide.html
-last_built: 2024-03-04T15:24Z
+last_built: 2024-03-05T08:33Z
diff --git a/search.json b/search.json
index eaeabf6..31215c2 100644
--- a/search.json
+++ b/search.json
@@ -1 +1 @@
-[{"path":"/articles/faq.html","id":"install","dir":"Articles","previous_headings":"","what":"Installation errors","title":"Frequently asked questions","text":"compile package fully C++11 compliant compiler required. implicit CRAN packages since R=3.6.2 hence listed SystemRequirements. encounter error saying matching function call zmq::message_t::message_t(std::string&) exists, compiler (fully) support automated check failed reason. happens instance old versions gcc compiler (default Linux distributions). can check version terminal using: case, likely HPC system already newer compiler installed need add $PATH load module. set, can install package R started terminal module/path active.","code":"In file included from CMQMaster.cpp:2:0: CMQMaster.h: In member function ‘void CMQMaster::proxy_submit_cmd(SEXP, int)’: CMQMaster.h:146:40: error: no matching function for call to ‘zmq::message_t::message_t(std::string&)’ mp.push_back(zmq::message_t(cur)); # the minimum required gcc version is 5.5 for full C++11 support (3.3 for clang) cc --version"},{"path":"/articles/faq.html","id":"stuck","dir":"Articles","previous_headings":"","what":"Session gets stuck at “Running calculations”","title":"Frequently asked questions","text":"R session may stuck something like following: see every time jobs queued yet started. Depending busy HPC , may take long time. can check queueing status jobs terminal e.g. qstat (SGE), bjobs (LSF), sinfo (SLURM). jobs already finished, likely means clustermq workers can connect main session. can confirm passing log_worker=TRUE Q inspect logs created current working directory. state something like: submitted job indeed unable establish network connection head node. can happen HPC allow incoming connections, likely happens multiple network interfaces, access head node. can list available network interfaces using ifconfig command terminal. Find interface shares subnetwork head node add R option clustermq.host=. 
unclear, contact system administrators see interface use.","code":"> clustermq::Q(identity, x=42, n_jobs=1) Submitting 1 worker jobs (ID: cmq8480) ... Running 1 calculations (5 objs/19.4 Kb common; 1 calls/chunk) ... > clustermq:::worker(\"tcp://my.headnode:9091\") 2023-12-11 10:22:58.485529 | Master: tcp://my.headnode:9091 2023-12-11 10:22:58.488892 | connecting to: tcp://my.headnode:9091: Error: Connection failed after 10016 ms Execution halted"},{"path":"/articles/faq.html","id":"ssh","dir":"Articles","previous_headings":"","what":"SSH not working","title":"Frequently asked questions","text":"trying remote schedulers via SSH, make sure scheduler works first connect cluster run job . terminal stuck make sure step SSH connection works typing following commands local terminal make sure don’t get errors warnings step: get Command found: R error, make sure $PATH set correctly ~/.bash_profile /~/.bashrc (depending cluster config might need either). may also need modify SSH template load R module conda environment. get SSH warning error try ssh -v enable verbose output. forward works, run following local R session (ideally also command-line R, RStudio): create log file remote server contain errors might occurred ssh_proxy startup. ssh_proxy startup fails local machine error server log show errors, can try increasing timeout: can happen SSH startup template includes additional steps starting R, activating module conda environment, confirm connection via two-factor authentication.","code":"Connecting via SSH ... # test your ssh login that you set up in ~/.ssh/config # if this fails you have not set up SSH correctly ssh # test port forwarding from 54709 remote to 6687 local (ports are random) # if the fails you will not be able to use clustermq via SSH ssh -R 54709:localhost:6687 R --vanilla options(clustermq.scheduler = \"ssh\", clustermq.ssh.log = \"~/ssh_proxy.log\") Q(identity, x=1, n_jobs=1) Remote R process did not respond after 5 seconds. Check your SSH server log. 
options(clustermq.ssh.timeout = 30) # in seconds"},{"path":"/articles/faq.html","id":"master-in-container","dir":"Articles","previous_headings":"","what":"Running the master inside containers","title":"Frequently asked questions","text":"master process inside container, accessing HPC scheduler difficult. Containers, including singularity docker, isolate processes inside container host. R process able submit job scheduler found. Note HPC node running master process must allowed submit jobs. HPC systems allow compute nodes submit jobs. case, may need run master process login node, discuss issue system administrator. container binary compatible host, may able bind scheduler executable container. example, PBS might look something like: working example binding SLURM CentOS 7 container image CentOS 7 host available https://groups.google.com//lbl.gov/d/msg/singularity/syLcsIWWzdo/NZvF2Ud2AAAJ Alternatively, can create script uses SSH execute scheduler login node. , need SSH client container, keys set password-less login, create script call scheduler login node via ssh (e.g. ~/bin/qsub SGE/PBS/Torque, bsub LSF sbatch Slurm): Make sure script executable, bind/copy container somewhere $PATH. Home directories bound default singularity.","code":"#PBS directives ... module load singularity SINGULARITYENV_APPEND_PATH=/opt/pbs/bin singularity exec --bind /opt/pbs/bin r_image.sif Rscript master_script.R #!/bin/bash ssh -i ~/.ssh/ ${PBS_O_HOST:-\"no_host_not_in_a_pbs_job\"} qsub \"$@\" chmod u+x ~/bin/qsub SINGULARITYENV_APPEND_PATH=~/bin"},{"path":[]},{"path":"/articles/technicaldocs.html","id":"base-api-and-schedulers","dir":"Articles","previous_headings":"Worker API","what":"Base API and schedulers","title":"Technical Documentation","text":"main worker functions wrapped R6 class name QSys. provides standardized API lower-level messages sent via ZeroMQ. 
base class derived scheduler classes add required functions submitting cleaning jobs: user-visible object worker Pool wraps , eventually allow manage different workers.","code":"+ QSys |- Multicore |- LSF + SGE |- PBS |- Torque |- etc."},{"path":[]},{"path":"/articles/technicaldocs.html","id":"creating-a-worker-pool","dir":"Articles","previous_headings":"Worker API > Workers","what":"Creating a worker pool","title":"Technical Documentation","text":"pool workers can created using workers() function, instantiates Pool object corresponding QSys-derived scheduler class. See ?workers details.","code":"# start up a pool of three workers using the default scheduler w = workers(n_jobs=3) # if we make an unclean exit for whatever reason, clean up the jobs on.exit(w$cleanup())"},{"path":"/articles/technicaldocs.html","id":"worker-startup","dir":"Articles","previous_headings":"Worker API > Workers","what":"Worker startup","title":"Technical Documentation","text":"workers started via scheduler, know machine run . start every worker TCP/IP address master socket distribute work. achieved call R common schedulers:","code":"R --no-save --no-restore -e 'clustermq:::worker(\"{{ master }}\")'"},{"path":"/articles/technicaldocs.html","id":"worker-communication","dir":"Articles","previous_headings":"Worker API > Workers","what":"Worker communication","title":"Technical Documentation","text":"master’s side, wait worker connects: can send expression evaluated worker using send method: expression (...), variables passed along call can added. batch processing clustermq usually , command work_chunk, chunk data added:","code":"msg = w$recv() # this will block until a worker is ready w$send(expression, ...) 
w$send(clustermq:::work_chunk(chunk, fun, const, rettype, common_seed), chunk = chunk(iter, submit_index))"},{"path":"/articles/technicaldocs.html","id":"worker-environment","dir":"Articles","previous_headings":"Worker API > Workers","what":"Worker environment","title":"Technical Documentation","text":"can add number objects worker environment using env method: also invisibly return data.frame objects currently environment. user wants inspect environment without changing can call w$env() without arguments. environment propagated workers automatically greedy fashion.","code":"w$env(object=value, ...)"},{"path":"/articles/technicaldocs.html","id":"main-event-loop","dir":"Articles","previous_headings":"Worker API","what":"Main event loop","title":"Technical Documentation","text":"Putting together event loop, get essentially implemented master. w$send invisibly returns identifier track call submitted, w$current() matches w$recv(). loop similar structure can used extend clustermq. example, done targets package.","code":"w = workers(3) on.exit(w$cleanup()) w$env(...) while (we have new work to send || jobs pending) { res = w$recv() # the result of the call, or NULL for a new worker w$current()$call_ref # matches answer to request, -1 otherwise # handle result if (more work) call_ref = w$send(expression, ...) # call_ref tracks request identity else w$send_shutdown() }"},{"path":"/articles/technicaldocs.html","id":"zeromq-message-specification","dir":"Articles","previous_headings":"","what":"ZeroMQ message specification","title":"Technical Documentation","text":"Communication master (main event loop) workers (QSys base class) organised messages. chunks serialized data sent via ZeroMQ’s protocol (ZMTP). 
parts message called frames.","code":""},{"path":"/articles/technicaldocs.html","id":"master---worker-communication","dir":"Articles","previous_headings":"ZeroMQ message specification","what":"Master - Worker communication","title":"Technical Documentation","text":"master requests evaluation message X frames (direct) Y proxied. handled clustermq internally. worker identity frame routing identifier delimiter frame Worker status (wlife_t) call evaluated variable name environment object yet present worker variable value using proxy, followed SEXP contains variable names proxy add forwarding worker.","code":""},{"path":"/articles/technicaldocs.html","id":"worker-evaluation","dir":"Articles","previous_headings":"ZeroMQ message specification","what":"Worker evaluation","title":"Technical Documentation","text":"worker evaluates call using R C API: error occurs evaluation returned structure class worker_error. developer wants catch errors warnings fine-grained manner, recommended add callingHandlers cmd (clustermq work work_chunk).","code":"R_tryEvalSilent(cmd, env, &err);"},{"path":"/articles/technicaldocs.html","id":"worker---master-communication","dir":"Articles","previous_headings":"ZeroMQ message specification","what":"Worker - Master communication","title":"Technical Documentation","text":"result evaluation returned message four (direct) five (proxied) frames: Worker identity frame (handled internally ZeroMQ’s ZMQ_REQ socket) Empty frame (handled internally ZeroMQ’s ZMQ_REQ socket) Worker status (wlife_t) handled internally clustermq result call (SEXP), visible user using worker via SSH, frames preceded routing identify frame handled internally ZeroMQ added peeled proxy.","code":""},{"path":"/articles/userguide.html","id":"installation","dir":"Articles","previous_headings":"","what":"Installation","title":"User Guide","text":"Install clustermq package R CRAN. 
automatically detect ZeroMQ installed otherwise use bundled library: Alternatively can use remotes package install directly Github. Note version needs autoconf/automake CMake compilation: develop branch, introduce code changes new features. may contain bugs, poor documentation, inconveniences. branch may install times. However, feedback welcome. installation issues please see FAQ.","code":"# Recommended: # If your system has `libzmq` installed but you want to enable the worker crash # monitor, set the following environment variable to enable compilation of the # bundled `libzmq` library with the required feature (`-DZMQ_BUILD_DRAFT_API=1`): # Sys.setenv(CLUSTERMQ_USE_SYSTEM_LIBZMQ=0) install.packages(\"clustermq\") # Sys.setenv(CLUSTERMQ_USE_SYSTEM_LIBZMQ=0) # install.packages('remotes') remotes::install_github(\"mschubert/clustermq\") # remotes::install_github(\"mschubert/clustermq@develop\") # dev version"},{"path":"/articles/userguide.html","id":"configuration","dir":"Articles","previous_headings":"","what":"Configuration","title":"User Guide","text":"HPC cluster’s scheduler ensures computing jobs distributed available worker nodes. Hence, clustermq interfaces order computations. default, take whichever scheduler find fall back local processing. work , cases. may need configure scheduler.","code":""},{"path":"/articles/userguide.html","id":"scheduler-setup","dir":"Articles","previous_headings":"Configuration","what":"Setting up the scheduler","title":"User Guide","text":"set scheduler explicitly, see following links: SLURM - work without setup LSF - work without setup SGE - may require configuration PBS/Torque - needs options(clustermq.scheduler=\"PBS\"/\"Torque\") can suggest another scheduler opening issue may addition need activate compute environments containers shell (e.g. ~/.bashrc) automatically. 
Check FAQ job submission/call Q errors gets stuck.","code":""},{"path":"/articles/userguide.html","id":"local-parallelization","dir":"Articles","previous_headings":"Configuration","what":"Local parallelization","title":"User Guide","text":"main focus package, can use parallelize function calls locally multiple cores processes. can also useful test code subset data submitting scheduler. Multiprocess (recommended) - Use callr package run manage multiple parallel R processes options(clustermq.scheduler=\"multiprocess\") Multicore - Uses parallel package fork current R process multiple threads options(clustermq.scheduler=\"multicore\"). sometimes causes problems (macOS, RStudio) available Windows.","code":""},{"path":"/articles/userguide.html","id":"ssh-connector","dir":"Articles","previous_headings":"Configuration","what":"SSH connector","title":"User Guide","text":"reasons might prefer work computing cluster directly rather local machine instead. RStudio excellent local IDE, ’s responsive feature-rich browser-based solutions (RStudio server, Project Jupyter), avoids X forwarding issues want look plots just made. Using setup, however, lost access computing cluster. Instead, copy data , submit individual scripts jobs, aggregating data end . clustermq trying solve providing transparent SSH interface. order use clustermq local machine, package needs installed computing cluster. computing cluster, set scheduler make sure clustermq runs without problems. Note remote scheduler can LOCAL (default HPC scheduler found) SSH work. local machine, add following options: recommend set SSH keys password-less login.","code":"# If this is set to 'LOCAL' or 'SSH' you will get the following error: # Expected PROXY_READY, received ‘PROXY_ERROR: Remote SSH QSys is not allowed’ options( clustermq.scheduler = \"multiprocess\" # or multicore, LSF, SGE, Slurm etc. 
) options( clustermq.scheduler = \"ssh\", clustermq.ssh.host = \"user@host\", # use your user and host, obviously clustermq.ssh.log = \"~/cmq_ssh.log\" # log for easier debugging )"},{"path":[]},{"path":"/articles/userguide.html","id":"the-q-function","dir":"Articles","previous_headings":"Usage","what":"The Q function","title":"User Guide","text":"following arguments supported Q: fun - function call. needs self-sufficient (access master environment) ... - iterated arguments passed function. one, need named const - named list non-iterated arguments passed fun export - named list objects export worker environment Behavior can fine-tuned using options : fail_on_error - Whether stop one calls returns error seed - common seed combined job number reproducible results memory - Amount memory request job (bsub -M) n_jobs - Number jobs submit function calls job_size - Number function calls per job. used combination n_jobs latter overall limit chunk_size - many calls worker process reporting back master. Default: every worker report back 100 times total full documentation available typing ?Q.","code":""},{"path":"/articles/userguide.html","id":"examples","dir":"Articles","previous_headings":"Usage","what":"Examples","title":"User Guide","text":"package designed distribute arbitrary function calls HPC worker nodes. , however, couple caveats observe R session running worker share local memory. simplest example function call completely self-sufficient, one argument (x) iterate : Non-iterated arguments supported const argument: function relies objects environment passed arguments (including functions), can exported using export argument: want use package function need load worker using pkgs parameter, referencing package_name:::","code":"fx = function(x) x * 2 Q(fx, x=1:3, n_jobs=1) #> Running sequentially ('LOCAL') ... 
#> [[1]] #> [1] 2 #> #> [[2]] #> [1] 4 #> #> [[3]] #> [1] 6 fx = function(x, y) x * 2 + y Q(fx, x=1:3, const=list(y=10), n_jobs=1) #> Running sequentially ('LOCAL') ... #> [[1]] #> [1] 12 #> #> [[2]] #> [1] 14 #> #> [[3]] #> [1] 16 fx = function(x) x * 2 + y Q(fx, x=1:3, export=list(y=10), n_jobs=1) #> Running sequentially ('LOCAL') ... #> [[1]] #> [1] 12 #> #> [[2]] #> [1] 14 #> #> [[3]] #> [1] 16 f1 = function(x) splitIndices(x, 3) Q(f1, x=3, n_jobs=1, pkgs=\"parallel\") #> Running sequentially ('LOCAL') ... #> [[1]] #> [[1]][[1]] #> [1] 1 #> #> [[1]][[2]] #> [1] 2 #> #> [[1]][[3]] #> [1] 3 f2 = function(x) parallel::splitIndices(x, 3) Q(f2, x=8, n_jobs=1) #> Running sequentially ('LOCAL') ... #> [[1]] #> [[1]][[1]] #> [1] 1 2 3 #> #> [[1]][[2]] #> [1] 4 5 #> #> [[1]][[3]] #> [1] 6 7 8 # Q(f1, x=5, n_jobs=1) # (Error #1) could not find function \"splitIndices\""},{"path":"/articles/userguide.html","id":"as-parallel-foreach-backend","dir":"Articles","previous_headings":"Usage","what":"As parallel foreach backend","title":"User Guide","text":"foreach package provides interface perform repeated tasks different backends. can perform function simple loops using %%: can also perform operations parallel using %dopar%: latter allows registering different handlers parallel execution, can use clustermq: BiocParallel supports foreach , means can run packages use BiocParallel cluster well via DoparParam.","code":"library(foreach) x = foreach(i=1:3) %do% sqrt(i) x = foreach(i=1:3) %dopar% sqrt(i) #> Warning: executing %dopar% sequentially: no parallel backend registered # set up the scheduler first, otherwise this will run sequentially clustermq::register_dopar_cmq(n_jobs=2, memory=1024) # this accepts same arguments as `Q` x = foreach(i=1:3) %dopar% sqrt(i) # this will be executed as jobs #> Running sequentially ('LOCAL') ... library(BiocParallel) register(DoparParam()) # after register_dopar_cmq(...) 
bplapply(1:3, sqrt)"},{"path":"/articles/userguide.html","id":"with-targets","dir":"Articles","previous_headings":"Usage","what":"With targets","title":"User Guide","text":"targets package enables users define dependency structure different function calls, evaluate underlying data changed. targets package Make-like pipeline tool statistics data science R. package skips costly runtime tasks already date, orchestrates necessary computation implicit parallel computing, abstracts files R objects. current output matches current upstream code data, whole pipeline date, results trustworthy otherwise. can use clustermq perform calculations jobs.","code":""},{"path":"/articles/userguide.html","id":"options","dir":"Articles","previous_headings":"","what":"Options","title":"User Guide","text":"various configurable options mentioned throughout documentation, applicable, however, list options reference. Options can set including call options( = ) current session added line ~/.Rprofile. former available active session, latter available time restart R. clustermq.scheduler - One supported clustermq schedulers; options \"LOCAL\", \"multiprocess\", \"multicore\", \"lsf\", \"sge\", \"slurm\", \"pbs\", \"Torque\", \"ssh\" (default HPC scheduler found $PATH, otherwise \"LOCAL\") clustermq.host - name node device constructing ZeroMQ host address (default Sys.info()[\"nodename\"]) clustermq.ssh.host - user name host connecting HPC via SSH (e.g. user@host); recommend setting SSH keys password-less login clustermq.ssh.log - Path file (SSH host) created populated logging information regarding SSH connection (e.g. 
\"~/cmq_ssh.log\"); helpful debugging purposes clustermq.ssh.timeout - amount time wait (seconds) SSH start-connection timing (default 10 seconds) clustermq.ssh.hpc_fwd_port - Port opened SSH reverse tunneling workers HPC local session (default: one integer range 50-55k) clustermq.worker.timeout - amount time wait (seconds) master-worker communication timing (default wait indefinitely) clustermq.template - Path template file submitting HPC jobs; necessary using template, otherwise default template used (default depends set inferred clustermq.scheduler) clustermq.data.warning - threshold size common data (Mb) clustermq throws warning (default 1000) clustermq.defaults - named-list default values HPC template; takes precedence defaults specified template file (default empty list)","code":""},{"path":"/articles/userguide.html","id":"debugging-workers","dir":"Articles","previous_headings":"","what":"Debugging workers","title":"User Guide","text":"Function calls evaluated workers wrapped event handlers, means even call evaluation throws error, reported back main R session. However, reasons workers might crash, case can report back. include: segfault low-level process Process kill due resource constraints (e.g. walltime) Reaching wait timeout without signal master process Probably others case, useful worker(s) create log file also include events reported back. can requested using: create file called -.log current working directory, irrespective scheduler use. can customize file name using Note case log_file template field scheduler script, hence needs present order work. default templates field included. order log worker separately, schedulers support wildcards log file names. 
instance: Multicore/Multiprocess: log_file=\"/path/.file.%\" SGE: log_file=\"/path/.file.$TASK_ID\" LSF: log_file=\"/path/.file.%\" Slurm: log_file=\"/path/.file.%\" PBS: log_file=\"/path/.file.$PBS_ARRAY_INDEX\" Torque: log_file=\"/path/.file.$PBS_ARRAYID\" scheduler documentation details available options. reporting bug includes worker crashes, please always include log file.","code":"Q(..., log_worker=TRUE) Q(..., template=list(log_file = ))"},{"path":"/articles/userguide.html","id":"environments","dir":"Articles","previous_headings":"","what":"Environments","title":"User Guide","text":"cases, may necessary activate specific computing environment scheduler jobs prior starting worker. can , instance, R installed specific environment container. Examples environments containers : Bash module environments Conda environments Docker/Singularity containers possible activate job submission script (.e., template file). widely untested, look following LSF scheduler (analogous others): template still needs filled, example need pass either set via option:","code":"#BSUB-J {{ job_name }}[1-{{ n_jobs }}] # name of the job / array jobs #BSUB-o {{ log_file | /dev/null }} # stdout + stderr #BSUB-M {{ memory | 4096 }} # Memory requirements in Mbytes #BSUB-R rusage[mem={{ memory | 4096 }}] # Memory requirements in Mbytes ##BSUB-q default # name of the queue (uncomment) ##BSUB-W {{ walltime | 6:00 }} # walltime (uncomment) module load {{ bashenv | default_bash_env }} # or: source activate {{ conda | default_conda_env_name }} # or: your environment activation command ulimit -v $(( 1024 * {{ memory | 4096 }} )) CMQ_AUTH={{ auth }} R --no-save --no-restore -e 'clustermq:::worker(\"{{ master }}\")' Q(..., template=list(bashenv=\"my environment name\")) options( clustermq.defaults = list(bashenv=\"my default env\") )"},{"path":[]},{"path":"/articles/userguide.html","id":"lsf","dir":"Articles","previous_headings":"Scheduler templates","what":"LSF","title":"User Guide","text":"Set 
following options R session submit jobs: supply template, save contents desired changes file clustermq.template point . file, #BSUB-* defines command-line arguments bsub program. Memory: defined BSUB-M BSUB-R. Check local setup memory values supplied MiB KiB, default 4096 requesting memory calling Q() Queue: BSUB-q default. Use queue name default. likely exist system, choose right name uncomment removing additional # Walltime: BSUB-W {{ walltime }}. Set maximum time job allowed run killed. default disable line. enable , enter fixed value pass walltime argument function call. way written, use 6 hours arguemnt given. options, see LSF documentation add via #BSUB-* (* represents argument) change identifiers curly braces ({{ ... }}), used fill right variables done, package use settings longer warn missing options.","code":"options( clustermq.scheduler = \"lsf\", clustermq.template = \"/path/to/file/below\" # if using your own template ) #BSUB-J {{ job_name }}[1-{{ n_jobs }}] # name of the job / array jobs #BSUB-n {{ cores | 1 }} # number of cores to use per job #BSUB-o {{ log_file | /dev/null }} # stdout + stderr; %I for array index #BSUB-M {{ memory | 4096 }} # Memory requirements in Mbytes #BSUB-R rusage[mem={{ memory | 4096 }}] # Memory requirements in Mbytes ##BSUB-q default # name of the queue (uncomment) ##BSUB-W {{ walltime | 6:00 }} # walltime (uncomment) ulimit -v $(( 1024 * {{ memory | 4096 }} )) CMQ_AUTH={{ auth }} R --no-save --no-restore -e 'clustermq:::worker(\"{{ master }}\")'"},{"path":"/articles/userguide.html","id":"sge","dir":"Articles","previous_headings":"Scheduler templates","what":"SGE","title":"User Guide","text":"Set following options R session submit jobs: supply template, save contents desired changes file clustermq.template point . file, #$-* defines command-line arguments qsub program. Queue: $ -q default. Use queue name default. likely exist system, choose right name uncomment removing additional # options, see SGE documentation. 
change identifiers curly braces ({{ ... }}), used fill right variables. done, package use settings longer warn missing options.","code":"options( clustermq.scheduler = \"sge\", clustermq.template = \"/path/to/file/below\" # if using your own template ) #$ -N {{ job_name }} # job name ##$ -q default # submit to queue named \"default\" #$ -j y # combine stdout/error in one file #$ -o {{ log_file | /dev/null }} # output file #$ -cwd # use pwd as work dir #$ -V # use environment variable #$ -t 1-{{ n_jobs }} # submit jobs as array #$ -pe smp {{ cores | 1 }} # number of cores to use per job #$ -l m_mem_free={{ memory | 1073741824 }} # 1 Gb in bytes ulimit -v $(( 1024 * {{ memory | 4096 }} )) CMQ_AUTH={{ auth }} R --no-save --no-restore -e 'clustermq:::worker(\"{{ master }}\")'"},{"path":"/articles/userguide.html","id":"slurm","dir":"Articles","previous_headings":"Scheduler templates","what":"SLURM","title":"User Guide","text":"Set following options R session submit jobs: supply template, save contents desired changes file clustermq.template point . file, #SBATCH defines command-line arguments sbatch program. Partition: SBATCH --partition default. Use queue name default. likely exist system, choose right name uncomment removing additional # options, see SLURM documentation. change identifiers curly braces ({{ ... }}), used fill right variables. 
done, package use settings longer warn missing options.","code":"options( clustermq.scheduler = \"slurm\", clustermq.template = \"/path/to/file/below\" # if using your own template ) #!/bin/sh #SBATCH --job-name={{ job_name }} ##SBATCH --partition=default #SBATCH --output={{ log_file | /dev/null }} #SBATCH --error={{ log_file | /dev/null }} #SBATCH --mem-per-cpu={{ memory | 4096 }} #SBATCH --array=1-{{ n_jobs }} #SBATCH --cpus-per-task={{ cores | 1 }} ulimit -v $(( 1024 * {{ memory | 4096 }} )) CMQ_AUTH={{ auth }} R --no-save --no-restore -e 'clustermq:::worker(\"{{ master }}\")'"},{"path":"/articles/userguide.html","id":"pbs","dir":"Articles","previous_headings":"Scheduler templates","what":"PBS","title":"User Guide","text":"Set following options R session submit jobs: supply template, save contents desired changes file clustermq.template point . file, #PBS-* defines command-line arguments qsub program. Queue: #PBS-q default. Use queue name default. likely exist system, choose right name uncomment removing additional # options, see PBS documentation. change identifiers curly braces ({{ ... }}), used fill right variables. done, package use settings longer warn missing options.","code":"options( clustermq.scheduler = \"pbs\", clustermq.template = \"/path/to/file/below\" # if using your own template ) #PBS -N {{ job_name }} #PBS -J 1-{{ n_jobs }} #PBS -l select=1:ncpus={{ cores | 1 }}:mpiprocs={{ cores | 1 }}:mem={{ memory | 4096 }}MB #PBS -l walltime={{ walltime | 12:00:00 }} #PBS -o {{ log_file | /dev/null }} #PBS -j oe ##PBS -q default ulimit -v $(( 1024 * {{ memory | 4096 }} )) CMQ_AUTH={{ auth }} R --no-save --no-restore -e 'clustermq:::worker(\"{{ master }}\")'"},{"path":"/articles/userguide.html","id":"torque","dir":"Articles","previous_headings":"Scheduler templates","what":"Torque","title":"User Guide","text":"Set following options R session submit jobs: supply template, save contents desired changes file clustermq.template point . 
file, #PBS-* defines command-line arguments qsub program. Queue: #PBS-q default. Use queue name default. likely exist system, choose right name uncomment removing additional # options, see Torque documentation. change identifiers curly braces ({{ ... }}), used fill right variables. done, package use settings longer warn missing options.","code":"options(clustermq.scheduler = \"Torque\", clustermq.template = \"/path/to/file/below\" # if using your own template ) #PBS -N {{ job_name }} #PBS -l nodes={{ n_jobs }}:ppn={{ cores | 1 }},walltime={{ walltime | 12:00:00 }} #PBS -o {{ log_file | /dev/null }} #PBS -j oe ##PBS -q default ulimit -v $(( 1024 * {{ memory | 4096 }} )) CMQ_AUTH={{ auth }} R --no-save --no-restore -e 'clustermq:::worker(\"{{ master }}\")'"},{"path":"/articles/userguide.html","id":"ssh-template","dir":"Articles","previous_headings":"Scheduler templates","what":"SSH","title":"User Guide","text":"SSH scheduler, can access remote schedulers via SSH. want use , first make sure clustermq works server real scheduler. move setting SSH. default template shown . R HPC $PATH, may need specify path load required bash modules/conda environments. 
supply template, save contents desired changes file local machine clustermq.template point .","code":"options(clustermq.scheduler = \"ssh\", clustermq.ssh.host = \"myhost\", # set this up in your local ~/.ssh/config clustermq.ssh.log = \"~/ssh_proxy.log\", # log file on your HPC clustermq.ssh.timeout = 30, # if changing the default connection timeout clustermq.template = \"/path/to/file/below\" # if using your own template ) ssh -o \"ExitOnForwardFailure yes\" -f -R {{ ctl_port }}:localhost:{{ local_port }} -R {{ job_port }}:localhost:{{ fwd_port }} {{ ssh_host }} \"R --no-save --no-restore -e 'clustermq:::ssh_proxy(ctl={{ ctl_port }}, job={{ job_port }})' > {{ ssh_log | /dev/null }} 2>&1\""},{"path":"/authors.html","id":null,"dir":"","previous_headings":"","what":"Authors","title":"Authors and Citation","text":"Michael Schubert. Author, maintainer, copyright holder. ZeroMQ authors. Author, copyright holder. source files 'src/libzmq' 'src/cppzmq'","code":""},{"path":"/authors.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Authors and Citation","text":"Schubert, M. clustermq enables efficient parallelisation genomic analyses. Bioinformatics (2019). 
doi:10.1093/bioinformatics/btz284","code":"@Article{, title = {clustermq enables efficient parallelisation of genomic analyses}, author = {Michael Schubert}, journal = {Bioinformatics}, month = {May}, year = {2019}, language = {en}, doi = {10.1093/bioinformatics/btz284}, url = {https://github.com/mschubert/clustermq}, }"},{"path":"/index.html","id":"clustermq-send-r-function-calls-as-cluster-jobs","dir":"","previous_headings":"","what":"Evaluate Function Calls on HPC Schedulers (LSF, SGE, SLURM, PBS/Torque)","title":"Evaluate Function Calls on HPC Schedulers (LSF, SGE, SLURM, PBS/Torque)","text":"package allow send function calls jobs computing cluster minimal interface provided Q function: Computations done entirely network without temporary files network-mounted storage, strain file system apart starting R per job. calculations load-balanced, .e. workers get jobs done faster also receive function calls work . especially useful calls return time, one worker high load. Browse vignettes : User Guide Technical Documentation FAQ","code":"# load the library and create a simple function library(clustermq) fx = function(x) x * 2 # queue the function call on your scheduler Q(fx, x=1:3, n_jobs=1) # list(2,4,6)"},{"path":"/index.html","id":"installation","dir":"","previous_headings":"","what":"Installation","title":"Evaluate Function Calls on HPC Schedulers (LSF, SGE, SLURM, PBS/Torque)","text":"Install clustermq package R CRAN (including bundled ZeroMQ system library): Alternatively can use remotes package install directly Github. 
Note version needs autoconf/automake CMake compilation: [!TIP] installation problems, see FAQ","code":"install.packages('clustermq') # install.packages('remotes') remotes::install_github('mschubert/clustermq') # remotes::install_github('mschubert/clustermq@develop') # dev version"},{"path":"/index.html","id":"schedulers","dir":"","previous_headings":"","what":"Schedulers","title":"Evaluate Function Calls on HPC Schedulers (LSF, SGE, SLURM, PBS/Torque)","text":"HPC cluster’s scheduler ensures computing jobs distributed available worker nodes. Hence, clustermq interfaces order computations. currently support following schedulers (either locally via SSH): Multiprocess - test calls parallelize cores using options(clustermq.scheduler=\"multiprocess\") SLURM - work without setup LSF - work without setup SGE - may require configuration PBS/Torque - needs options(clustermq.scheduler=\"PBS\"/\"Torque\") via SSH - needs options(clustermq.scheduler=\"ssh\", clustermq.ssh.host=) [!TIP] Follow links configure scheduler case working box check FAQ job submission errors gets stuck","code":""},{"path":"/index.html","id":"usage","dir":"","previous_headings":"","what":"Usage","title":"Evaluate Function Calls on HPC Schedulers (LSF, SGE, SLURM, PBS/Torque)","text":"common arguments Q : fun - function call. needs self-sufficient (access master environment) ... - iterated arguments passed function. one, need named const - named list non-iterated arguments passed fun export - named list objects export worker environment documentation arguments can accessed typing ?Q. Examples using const export : clustermq can also used parallel backend foreach. 
also used BiocParallel, can run packages cluster well: examples available User Guide.","code":"# adding a constant argument fx = function(x, y) x * 2 + y Q(fx, x=1:3, const=list(y=10), n_jobs=1) # exporting an object to workers fx = function(x) x * 2 + y Q(fx, x=1:3, export=list(y=10), n_jobs=1) library(foreach) register_dopar_cmq(n_jobs=2, memory=1024) # see `?workers` for arguments foreach(i=1:3) %dopar% sqrt(i) # this will be executed as jobs library(BiocParallel) register(DoparParam()) # after register_dopar_cmq(...) bplapply(1:3, sqrt)"},{"path":"/index.html","id":"comparison-to-other-packages","dir":"","previous_headings":"","what":"Comparison to other packages","title":"Evaluate Function Calls on HPC Schedulers (LSF, SGE, SLURM, PBS/Torque)","text":"packages provide high-level parallelization R function calls computing cluster. compared clustermq BatchJobs batchtools processing many short-running jobs, found approximately 1000x less overhead cost. short, use clustermq want: one-line solution run cluster jobs minimal setup access cluster functions local Rstudio via SSH fast processing many function calls without network storage /O Use batchtools : want use mature well-tested package don’t mind arguments every call written /read disc don’t mind ’s load-balancing run-time Use Snakemake targets : want design run workflow HPC Don’t use batch (last updated 2013) BatchJobs (issues SQLite network-mounted storage).","code":""},{"path":"/index.html","id":"contributing","dir":"","previous_headings":"","what":"Contributing","title":"Evaluate Function Calls on HPC Schedulers (LSF, SGE, SLURM, PBS/Torque)","text":"Contributions welcome come many different forms, shapes, sizes. include, limited : Questions: Ask Github Discussions board. advanced user, please also consider answering questions . Bug reports: File issue something work expected. sure include self-contained Minimal Reproducible Example set log_worker=TRUE. Code contributions: look good first issue tag. 
Please discuss anything complicated putting lot work , ’m happy help get started. [!TIP] Check User Guide FAQ first, maybe query already answered ","code":""},{"path":"/index.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Evaluate Function Calls on HPC Schedulers (LSF, SGE, SLURM, PBS/Torque)","text":"project part academic work, evaluated citations. like able continue working research support tools like clustermq, please cite article using publications: M Schubert. clustermq enables efficient parallelisation genomic analyses. Bioinformatics (2019). doi:10.1093/bioinformatics/btz284","code":""},{"path":"/reference/LOCAL.html","id":null,"dir":"Reference","previous_headings":"","what":"Placeholder for local processing — LOCAL","title":"Placeholder for local processing — LOCAL","text":"Mainly tests pass without setting scheduler","code":""},{"path":"/reference/LSF.html","id":null,"dir":"Reference","previous_headings":"","what":"LSF scheduler functions — LSF","title":"LSF scheduler functions — LSF","text":"Derives QSys provide LSF-specific functions","code":""},{"path":"/reference/MULTICORE.html","id":null,"dir":"Reference","previous_headings":"","what":"Process on multiple cores on one machine — MULTICORE","title":"Process on multiple cores on one machine — MULTICORE","text":"Derives QSys provide multicore-specific functions","code":""},{"path":"/reference/MULTIPROCESS.html","id":null,"dir":"Reference","previous_headings":"","what":"Process on multiple processes on one machine — MULTIPROCESS","title":"Process on multiple processes on one machine — MULTIPROCESS","text":"Derives QSys provide callr-specific functions","code":""},{"path":"/reference/Pool.html","id":null,"dir":"Reference","previous_headings":"","what":"Class for basic queuing system functions — Pool","title":"Class for basic queuing system functions — Pool","text":"Provides basic functions needed communicate machines abstract functions rZMQ scheduler implementations can 
rely higher level functionality","code":""},{"path":"/reference/Q.html","id":null,"dir":"Reference","previous_headings":"","what":"Queue function calls on the cluster — Q","title":"Queue function calls on the cluster — Q","text":"Queue function calls cluster","code":""},{"path":"/reference/Q.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Queue function calls on the cluster — Q","text":"","code":"Q( fun, ..., const = list(), export = list(), pkgs = c(), seed = 128965, memory = NULL, template = list(), n_jobs = NULL, job_size = NULL, split_array_by = -1, rettype = \"list\", fail_on_error = TRUE, workers = NULL, log_worker = FALSE, chunk_size = NA, timeout = Inf, max_calls_worker = Inf, verbose = TRUE )"},{"path":"/reference/Q.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Queue function calls on the cluster — Q","text":"fun function call ... Objects iterated function call const list constant arguments passed function call export List objects exported worker pkgs Character vector packages load worker seed seed set function call memory Short template=list(memory=value) template named list values fill template n_jobs number LSF jobs submit; upper limit jobs job_size given well job_size number function calls per job split_array_by dimension number split arrays `...`; default: last rettype Return type function call (vector type 'list') fail_on_error error occurs workers, continue fail? workers Optional instance QSys representing worker pool log_worker Write log file worker chunk_size Number function calls chunk together defaults 100 chunks per worker max. 
10 kb per chunk timeout Maximum time seconds wait worker (default: Inf) max_calls_worker Maxmimum number chunks sent one worker verbose Print status messages progress bar (default: TRUE)","code":""},{"path":"/reference/Q.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Queue function calls on the cluster — Q","text":"list whatever `fun` returned","code":""},{"path":"/reference/Q.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Queue function calls on the cluster — Q","text":"","code":"if (FALSE) { # Run a simple multiplication for numbers 1 to 3 on a worker node fx = function(x) x * 2 Q(fx, x=1:3, n_jobs=1) # list(2,4,6) # Run a mutate() call in dplyr on a worker node iris %>% mutate(area = Q(`*`, e1=Sepal.Length, e2=Sepal.Width, n_jobs=1)) # iris with an additional column 'area' }"},{"path":"/reference/QSys.html","id":null,"dir":"Reference","previous_headings":"","what":"Class for basic queuing system functions — QSys","title":"Class for basic queuing system functions — QSys","text":"Provides basic functions needed communicate machines abstract functions rZMQ scheduler implementations can rely higher level functionality","code":""},{"path":"/reference/Q_rows.html","id":null,"dir":"Reference","previous_headings":"","what":"Queue function calls defined by rows in a data.frame — Q_rows","title":"Queue function calls defined by rows in a data.frame — Q_rows","text":"Queue function calls defined rows data.frame","code":""},{"path":"/reference/Q_rows.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Queue function calls defined by rows in a data.frame — Q_rows","text":"","code":"Q_rows( df, fun, const = list(), export = list(), pkgs = c(), seed = 128965, memory = NULL, template = list(), n_jobs = NULL, job_size = NULL, rettype = \"list\", fail_on_error = TRUE, workers = NULL, log_worker = FALSE, chunk_size = NA, timeout = Inf, max_calls_worker = Inf, 
verbose = TRUE )"},{"path":"/reference/Q_rows.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Queue function calls defined by rows in a data.frame — Q_rows","text":"df data.frame iterated arguments fun function call const list constant arguments passed function call export List objects exported worker pkgs Character vector packages load worker seed seed set function call memory Short template=list(memory=value) template named list values fill template n_jobs number LSF jobs submit; upper limit jobs job_size given well job_size number function calls per job rettype Return type function call (vector type 'list') fail_on_error error occurs workers, continue fail? workers Optional instance QSys representing worker pool log_worker Write log file worker chunk_size Number function calls chunk together defaults 100 chunks per worker max. 10 kb per chunk timeout Maximum time seconds wait worker (default: Inf) max_calls_worker Maxmimum number chunks sent one worker verbose Print status messages progress bar (default: TRUE)","code":""},{"path":"/reference/Q_rows.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Queue function calls defined by rows in a data.frame — Q_rows","text":"","code":"if (FALSE) { # Run a simple multiplication for data frame columns x and y on a worker node fx = function (x, y) x * y df = data.frame(x = 5, y = 10) Q_rows(df, fx, job_size = 1) # [1] 50 # Q_rows also matches the names of a data frame with the function arguments fx = function (x, y) x - y df = data.frame(y = 5, x = 10) Q_rows(df, fx, job_size = 1) # [1] 5 }"},{"path":"/reference/SGE.html","id":null,"dir":"Reference","previous_headings":"","what":"SGE scheduler functions — SGE","title":"SGE scheduler functions — SGE","text":"Derives QSys provide SGE-specific functions","code":""},{"path":"/reference/SLURM.html","id":null,"dir":"Reference","previous_headings":"","what":"SLURM scheduler functions — 
SLURM","title":"SLURM scheduler functions — SLURM","text":"Derives QSys provide SLURM-specific functions","code":""},{"path":"/reference/SSH.html","id":null,"dir":"Reference","previous_headings":"","what":"SSH scheduler functions — SSH","title":"SSH scheduler functions — SSH","text":"Derives QSys provide SSH-specific functions","code":""},{"path":"/reference/check_args.html","id":null,"dir":"Reference","previous_headings":"","what":"Function to check arguments with which Q() is called — check_args","title":"Function to check arguments with which Q() is called — check_args","text":"Function check arguments Q() called","code":""},{"path":"/reference/check_args.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Function to check arguments with which Q() is called — check_args","text":"","code":"check_args(fun, iter, const = list())"},{"path":"/reference/check_args.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Function to check arguments with which Q() is called — check_args","text":"fun function call iter Objects iterated function call const list constant arguments passed function call","code":""},{"path":"/reference/check_args.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Function to check arguments with which Q() is called — check_args","text":"Processed iterated argument list 'iter' list","code":""},{"path":"/reference/chunk.html","id":null,"dir":"Reference","previous_headings":"","what":"Subset index chunk for processing — chunk","title":"Subset index chunk for processing — chunk","text":"'attr' `[.data.frame` takes much CPU time","code":""},{"path":"/reference/chunk.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Subset index chunk for processing — chunk","text":"","code":"chunk(x, 
i)"},{"path":"/reference/chunk.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Subset index chunk for processing — chunk","text":"x Index data.frame Rows subset","code":""},{"path":"/reference/chunk.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Subset index chunk for processing — chunk","text":"x[,]","code":""},{"path":"/reference/clustermq-package.html","id":null,"dir":"Reference","previous_headings":"","what":"Evaluate Function Calls on HPC Schedulers (LSF, SGE, SLURM) — clustermq-package","title":"Evaluate Function Calls on HPC Schedulers (LSF, SGE, SLURM) — clustermq-package","text":"Provides Q function send arbitrary function calls workers HPC schedulers without relying network-mounted storage. Allows using remote schedulers via SSH.","code":""},{"path":"/reference/clustermq-package.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Evaluate Function Calls on HPC Schedulers (LSF, SGE, SLURM) — clustermq-package","text":"hood, submit cluster job connects master via TCP master send function argument chunks worker worker return results master everything done get back result Computations done entirely network without temporary files network-mounted storage, strain file system apart starting R per job. removes biggest bottleneck distributed computing. Using approach, can easily load-balancing, .e. workers get jobs done faster also receive function calls work . especially useful calls return time, one worker high load. 
detailed usage instructions, see documentation Q function.","code":""},{"path":[]},{"path":"/reference/clustermq-package.html","id":"author","dir":"Reference","previous_headings":"","what":"Author","title":"Evaluate Function Calls on HPC Schedulers (LSF, SGE, SLURM) — clustermq-package","text":"Maintainer: Michael Schubert mschu.dev@gmail.com (ORCID) [copyright holder] Authors: ZeroMQ authors (source files 'src/libzmq' 'src/cppzmq') [copyright holder]","code":""},{"path":"/reference/cmq_foreach.html","id":null,"dir":"Reference","previous_headings":"","what":"clustermq foreach handler — cmq_foreach","title":"clustermq foreach handler — cmq_foreach","text":"clustermq foreach handler","code":""},{"path":"/reference/cmq_foreach.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"clustermq foreach handler — cmq_foreach","text":"","code":"cmq_foreach(obj, expr, envir, data)"},{"path":"/reference/cmq_foreach.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"clustermq foreach handler — cmq_foreach","text":"obj Returned foreach::foreach, containing following variables: args : Arguments passed, call argnames: character vector arguments passed evalenv : Environment evaluate arguments export : character vector variable names export nodes packages: character vector required packages verbose : whether print status messages [logical] errorHandling: string function name call error , e.g. \"stop\" expr R expression curly braces envir Environment evaluate arguments data Common arguments passed register_dopcar_cmq(), e.g. 
n_jobs","code":""},{"path":"/reference/dot-onAttach.html","id":null,"dir":"Reference","previous_headings":"","what":"Report queueing system on package attach if not set — .onAttach","title":"Report queueing system on package attach if not set — .onAttach","text":"Report queueing system package attach set","code":""},{"path":"/reference/dot-onAttach.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Report queueing system on package attach if not set — .onAttach","text":"","code":".onAttach(libname, pkgname)"},{"path":"/reference/dot-onAttach.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Report queueing system on package attach if not set — .onAttach","text":"libname default arg compatibility pkgname default arg compatibility","code":""},{"path":"/reference/dot-onLoad.html","id":null,"dir":"Reference","previous_headings":"","what":"Select the queueing system on package loading — .onLoad","title":"Select the queueing system on package loading — .onLoad","text":"done setting variable 'qsys' package environment object contains desired queueing system.","code":""},{"path":"/reference/dot-onLoad.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Select the queueing system on package loading — .onLoad","text":"","code":".onLoad(libname, pkgname)"},{"path":"/reference/dot-onLoad.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Select the queueing system on package loading — .onLoad","text":"libname default arg compatibility pkgname default arg compatibility","code":""},{"path":"/reference/fill_template.html","id":null,"dir":"Reference","previous_headings":"","what":"Fill a template string with supplied values — fill_template","title":"Fill a template string with supplied values — fill_template","text":"Fill template string supplied 
values","code":""},{"path":"/reference/fill_template.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Fill a template string with supplied values — fill_template","text":"","code":"fill_template(template, values, required = c())"},{"path":"/reference/fill_template.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Fill a template string with supplied values — fill_template","text":"template character string submission template values named list key-value pairs required Keys must present template (default: none)","code":""},{"path":"/reference/fill_template.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Fill a template string with supplied values — fill_template","text":"template placeholder fields replaced values","code":""},{"path":"/reference/host.html","id":null,"dir":"Reference","previous_headings":"","what":"Construct the ZeroMQ host address — host","title":"Construct the ZeroMQ host address — host","text":"Construct ZeroMQ host address","code":""},{"path":"/reference/host.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Construct the ZeroMQ host address — host","text":"","code":"host( node = getOption(\"clustermq.host\", Sys.info()[\"nodename\"]), ports = 6000:9999, n = 100 )"},{"path":"/reference/host.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Construct the ZeroMQ host address — host","text":"node Node device name ports Range ports consider n many addresses return","code":""},{"path":"/reference/host.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Construct the ZeroMQ host address — host","text":"possible addresses character vector","code":""},{"path":"/reference/master.html","id":null,"dir":"Reference","previous_headings":"","what":"Master controlling the workers — master","title":"Master controlling the workers — 
master","text":"exchanging messages master workers works following way: * submitted job know start * starts, sends message list(id=0) indicating ready * send function definition common data * also send first data set work * get id > 0, result store * send next data set/index work * computatons complete, send id=0 worker * responds id=-1 (usage stats) shuts ","code":""},{"path":"/reference/master.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Master controlling the workers — master","text":"","code":"master( pool, iter, rettype = \"list\", fail_on_error = TRUE, chunk_size = NA, timeout = Inf, max_calls_worker = Inf, verbose = TRUE )"},{"path":"/reference/master.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Master controlling the workers — master","text":"pool Instance Pool object iter Objects iterated function call rettype Return type function fail_on_error error occurs workers, continue fail? chunk_size Number function calls chunk together defaults 100 chunks per worker max. 
500 kb per chunk timeout Maximum time seconds wait worker (default: Inf) max_calls_worker Maxmimum number function calls sent one worker verbose Print progress messages","code":""},{"path":"/reference/master.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Master controlling the workers — master","text":"list whatever `fun` returned","code":""},{"path":"/reference/msg_fmt.html","id":null,"dir":"Reference","previous_headings":"","what":"Message format for logging — msg_fmt","title":"Message format for logging — msg_fmt","text":"Message format logging","code":""},{"path":"/reference/msg_fmt.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Message format for logging — msg_fmt","text":"","code":"msg_fmt(verbose = TRUE)"},{"path":"/reference/register_dopar_cmq.html","id":null,"dir":"Reference","previous_headings":"","what":"Register clustermq as `foreach` parallel handler — register_dopar_cmq","title":"Register clustermq as `foreach` parallel handler — register_dopar_cmq","text":"Register clustermq `foreach` parallel handler","code":""},{"path":"/reference/register_dopar_cmq.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Register clustermq as `foreach` parallel handler — register_dopar_cmq","text":"","code":"register_dopar_cmq(...)"},{"path":"/reference/register_dopar_cmq.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Register clustermq as `foreach` parallel handler — register_dopar_cmq","text":"... List arguments passed `Q` function, e.g. 
n_jobs","code":""},{"path":"/reference/ssh_proxy.html","id":null,"dir":"Reference","previous_headings":"","what":"SSH proxy for different schedulers — ssh_proxy","title":"SSH proxy for different schedulers — ssh_proxy","text":"call manually, SSH qsys ","code":""},{"path":"/reference/ssh_proxy.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"SSH proxy for different schedulers — ssh_proxy","text":"","code":"ssh_proxy(fwd_port, qsys_id = qsys_default)"},{"path":"/reference/ssh_proxy.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"SSH proxy for different schedulers — ssh_proxy","text":"fwd_port port master address connect (remote end reverse tunnel) qsys_id Character string QSys class use","code":""},{"path":"/reference/summarize_result.html","id":null,"dir":"Reference","previous_headings":"","what":"Print a summary of errors and warnings that occurred during processing — summarize_result","title":"Print a summary of errors and warnings that occurred during processing — summarize_result","text":"Print summary errors warnings occurred processing","code":""},{"path":"/reference/summarize_result.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Print a summary of errors and warnings that occurred during processing — summarize_result","text":"","code":"summarize_result( result, n_errors, n_warnings, cond_msgs, at = length(result), fail_on_error = TRUE )"},{"path":"/reference/summarize_result.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Print a summary of errors and warnings that occurred during processing — summarize_result","text":"result list vector processing result n_errors many errors occurred n_warnings many warnings occurred cond_msgs Error warnings messages, display first 50 many calls procesed point fail_on_error Stop error(s) 
occurred","code":""},{"path":"/reference/vec_lookup.html","id":null,"dir":"Reference","previous_headings":"","what":"Lookup table for return types to vector NAs — vec_lookup","title":"Lookup table for return types to vector NAs — vec_lookup","text":"Lookup table return types vector NAs","code":""},{"path":"/reference/vec_lookup.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Lookup table for return types to vector NAs — vec_lookup","text":"","code":"vec_lookup"},{"path":"/reference/vec_lookup.html","id":"format","dir":"Reference","previous_headings":"","what":"Format","title":"Lookup table for return types to vector NAs — vec_lookup","text":"object class list length 9.","code":""},{"path":"/reference/work_chunk.html","id":null,"dir":"Reference","previous_headings":"","what":"Function to process a chunk of calls — work_chunk","title":"Function to process a chunk of calls — work_chunk","text":"chunk comes encapsulated data.frame","code":""},{"path":"/reference/work_chunk.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Function to process a chunk of calls — work_chunk","text":"","code":"work_chunk( df, fun, const = list(), rettype = \"list\", common_seed = NULL, progress = FALSE )"},{"path":"/reference/work_chunk.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Function to process a chunk of calls — work_chunk","text":"df data.frame call IDs rownames arguments columns fun function call const Constant arguments passed call rettype Return type function common_seed seed offset common function calls progress Logical indicated whether display progress bar","code":""},{"path":"/reference/work_chunk.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Function to process a chunk of calls — work_chunk","text":"list call results (try-error 
failed)","code":""},{"path":"/reference/worker.html","id":null,"dir":"Reference","previous_headings":"","what":"R worker submitted as cluster job — worker","title":"R worker submitted as cluster job — worker","text":"call manually, master ","code":""},{"path":"/reference/worker.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"R worker submitted as cluster job — worker","text":"","code":"worker(master, ..., verbose = TRUE, context = NULL)"},{"path":"/reference/worker.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"R worker submitted as cluster job — worker","text":"master master address (tcp://ip:port) ... Catch-break older template values (ignored) verbose Whether print debug messages context ZeroMQ context (internal testing)","code":""},{"path":"/reference/workers.html","id":null,"dir":"Reference","previous_headings":"","what":"Creates a pool of workers — workers","title":"Creates a pool of workers — workers","text":"Creates pool workers","code":""},{"path":"/reference/workers.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Creates a pool of workers — workers","text":"","code":"workers( n_jobs, data = NULL, reuse = TRUE, template = list(), log_worker = FALSE, qsys_id = getOption(\"clustermq.scheduler\", qsys_default), verbose = FALSE, ... )"},{"path":"/reference/workers.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Creates a pool of workers — workers","text":"n_jobs Number jobs submit (0 implies local processing) data Set common data (function, constant args, seed) reuse Whether workers reusable get shut call template named list values fill template log_worker Write log file worker qsys_id Character string QSys class use verbose Print message worker startup ... 
Additional arguments passed qsys constructor","code":""},{"path":"/reference/workers.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Creates a pool of workers — workers","text":"instance QSys class","code":""},{"path":"/reference/wrap_error.html","id":null,"dir":"Reference","previous_headings":"","what":"Wraps an error in a condition object — wrap_error","title":"Wraps an error in a condition object — wrap_error","text":"Wraps error condition object","code":""},{"path":"/reference/wrap_error.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Wraps an error in a condition object — wrap_error","text":"","code":"wrap_error(call)"},{"path":"/news/index.html","id":"clustermq-094","dir":"Changelog","previous_headings":"","what":"clustermq 0.9.4","title":"clustermq 0.9.4","text":"Fix bug worker stats shown NA (#325) Worker API: env() now visibly lists environment called without arguments","code":""},{"path":"/news/index.html","id":"clustermq-093","dir":"Changelog","previous_headings":"","what":"clustermq 0.9.3","title":"clustermq 0.9.3","text":"CRAN release: 2024-01-09 Fix bug BiocParallel export required objects (#302) Fix bug already finished workers killed (#307) Fix bug worker results stats garbage collected (#324) now FAQ vignette answers frequently asked questions Worker API: send() now reports call identifier current() tracks","code":""},{"path":"/news/index.html","id":"clustermq-092","dir":"Changelog","previous_headings":"","what":"clustermq 0.9.2","title":"clustermq 0.9.2","text":"CRAN release: 2023-12-07 Fix bug SSH proxy cache data properly (#320) Fix bug max_calls_worker respected (#322) Local parallelism (multicore, multiprocess) uses local IP (#321) Worker API: info() now also returns current worker number calls","code":""},{"path":"/news/index.html","id":"clustermq-091","dir":"Changelog","previous_headings":"","what":"clustermq 0.9.1","title":"clustermq 0.9.1","text":"CRAN release: 
2023-11-21 Disconnect monitor (libzmq -DZMQ_BUILD_DRAFT_API=1) now optional (#317) Fix bug worker shutdown notifications can cause crash (#306, #308, #310) Fix bug template values filled correctly (#309) Fix bug using Rf_error lead improper cleanup resources (#311) Fix bug maximum worker timeout multiplied led undefined behavior Fix bug ZeroMQ’s -Werror flag led compilation issues M1 Mac Fix bug SSH tests error timeout high load Worker API: CMQMaster now needs know add_pending_workers(n) Worker API: status report info() now displays properly","code":""},{"path":"/news/index.html","id":"clustermq-090","dir":"Changelog","previous_headings":"","what":"clustermq 0.9.0","title":"clustermq 0.9.0","text":"CRAN release: 2023-09-23","code":""},{"path":"/news/index.html","id":"features-0-9-0","dir":"Changelog","previous_headings":"","what":"Features","title":"clustermq 0.9.0","text":"Reuse common data now supported (#154) Jobs now error instead stalling upon unexpected worker disconnect (#150) Workers now error can establish connection within time limit Error n_jobs max_calls_worker provide insufficient call slots (#258) Request 1 GB default SGE template (#298) @nickholway Error warning summary now orders index severity (#304) call can multiple warnings forwarded, last","code":""},{"path":"/news/index.html","id":"bugfix-0-9-0","dir":"Changelog","previous_headings":"","what":"Bugfix","title":"clustermq 0.9.0","text":"Fix bug max memory reporting gc() may different column (#240) Fix passing numerical job_id qdel PBS (#265) job port/id pool now used properly upon binding failure (#270) @luwidmer Common data size warning now displayed exceeding limits (#287)","code":""},{"path":"/news/index.html","id":"internal-0-9-0","dir":"Changelog","previous_headings":"","what":"Internal","title":"clustermq 0.9.0","text":"Complete rewrite worker API longer depend purrr 
package","code":""},{"path":"/news/index.html","id":"clustermq-0895","dir":"Changelog","previous_headings":"","what":"clustermq 0.8.95","title":"clustermq 0.8.95","text":"CRAN release: 2020-07-01 now using ZeroMQ via Rcpp preparation v0.9 (#151) New multiprocess backend via callr instead forking (#142, #197) Sending data sockets now blocking avoid excess memory usage (#161) multicore, multiprocess schedulers now support logging (#169) New option clustermq.host can specify host IP network interface name (#170) Template filling now raise error missing keys (#174, #198) Workers failing large common data improved (fixed?) (#146, #179, #191) Local connections now routed via 127.0.0.1 instead localhost (#192) Submit messages different local, multicore HPC (#196) Functions exported foreach now environment stripped (#200) Deprecation log_worker=T/F argument rescinded","code":""},{"path":"/news/index.html","id":"clustermq-089","dir":"Changelog","previous_headings":"","what":"clustermq 0.8.9","title":"clustermq 0.8.9","text":"CRAN release: 2020-02-29 New option clustermq.ssh.timeout SSH proxy startup (#157) @brendanf New option clustermq.worker.timeout delay worker shutdown (#188) Fixed PBS/Torque docs, template cleanup (#184, #186) @mstr3336 Warning common data large, set clustermq.data.warning (#189)","code":""},{"path":"/news/index.html","id":"clustermq-088","dir":"Changelog","previous_headings":"","what":"clustermq 0.8.8","title":"clustermq 0.8.8","text":"CRAN release: 2019-06-05 Q, Q_rows new arguments verbose (#111) pkgs (#144) foreach backend now uses dedicated API possible (#143, #144) Number size objects common calls now work properly Templates filled internally longer depend infuser package","code":""},{"path":"/news/index.html","id":"clustermq-087","dir":"Changelog","previous_headings":"","what":"clustermq 0.8.7","title":"clustermq 0.8.7","text":"CRAN release: 2019-04-15 Q now max_calls_worker argument avoid walltime (#110) Submission messages now list size common 
data (drake#800) default templates now optional cores per job field (#123) foreach now treats .export (#124) .combine (#126) correctly New option clustermq.error.timeout wait clean shutdown (#134) SSH command now specified via template file (#122) SSH now forward errors local process (#135) Wiki deprecated, use https://mschubert.github.io/clustermq/ instead","code":""},{"path":"/news/index.html","id":"clustermq-086","dir":"Changelog","previous_headings":"","what":"clustermq 0.8.6","title":"clustermq 0.8.6","text":"CRAN release: 2019-02-22 Progress bar now shown workers start (#107) Socket connections now authenticated using session password (#125) Marked internal functions @keywords internal Added vignettes User Guide Technical Documentation","code":""},{"path":"/news/index.html","id":"clustermq-085","dir":"Changelog","previous_headings":"","what":"clustermq 0.8.5","title":"clustermq 0.8.5","text":"CRAN release: 2018-09-29 Added experimental support parallel foreach backend (#83) Moved templates package inst/ directory (#85) Added send_call worker evaluate arbitrary expressions (drake#501; #86) Option clustermq.scheduler now respected set package load (#88) System interrupts now handled correctly (rzmq#44; #73, #93, #97) Number workers running/total now shown progress bar (#98) Unqualified (short) host names now resolved default (#104)","code":""},{"path":"/news/index.html","id":"clustermq-084","dir":"Changelog","previous_headings":"","what":"clustermq 0.8.4","title":"clustermq 0.8.4","text":"CRAN release: 2018-04-22 Fix error qsys$reusable using n_jobs=0/local processing (#75) Scheduler-specific templates deprecated. 
Use clustermq.template instead Allow option clustermq.defaults fill default template values (#71) Errors worker processing now shut cleanly (#67) Progress bar now shows estimated time remaining (#66) Progress bar now also shown processing locally Memory summary now adds estimated memory R session (#69)","code":""},{"path":"/news/index.html","id":"clustermq-083","dir":"Changelog","previous_headings":"","what":"clustermq 0.8.3","title":"clustermq 0.8.3","text":"CRAN release: 2018-01-21 Support rettype function calls return type known (#59) Reduce memory requirements processing results receive Fix bug cleanup, log_worker flag working SGE/SLURM","code":""},{"path":"/news/index.html","id":"clustermq-082","dir":"Changelog","previous_headings":"","what":"clustermq 0.8.2","title":"clustermq 0.8.2","text":"CRAN release: 2017-11-30 Fix bug never-started jobs cleaned Fix bug tests leave processes port binding fails (#60) Multicore longer prints worker debug messages (#61)","code":""},{"path":"/news/index.html","id":"clustermq-081","dir":"Changelog","previous_headings":"","what":"clustermq 0.8.1","title":"clustermq 0.8.1","text":"CRAN release: 2017-11-27 Fix performance issues high number function calls (#56) Fix bug multicore workers shut properly (#58) Fix default templates SGE, LSF SLURM (misplaced quote)","code":""},{"path":"/news/index.html","id":"clustermq-080","dir":"Changelog","previous_headings":"","what":"clustermq 0.8.0","title":"clustermq 0.8.0","text":"CRAN release: 2017-11-11","code":""},{"path":"/news/index.html","id":"features-0-8-0","dir":"Changelog","previous_headings":"","what":"Features","title":"clustermq 0.8.0","text":"Templates changed: clustermq:::worker now takes master argument Creating workers now separated Q, enabling worker reuse (#45) Objects function environment must now exported explicitly (#47) Added multicore qsys using parallel package (#49) New function Q_rows using data.frame rows iterated arguments (#43) Job summary now report max memory 
reported gc (#18)","code":""},{"path":"/news/index.html","id":"bugfix-0-8-0","dir":"Changelog","previous_headings":"","what":"Bugfix","title":"clustermq 0.8.0","text":"Fix bug copies common_data collected gc slowly (#19)","code":""},{"path":"/news/index.html","id":"internal-0-8-0","dir":"Changelog","previous_headings":"","what":"Internal","title":"clustermq 0.8.0","text":"Messages master now processed threads (#42) Jobs now submitted array possible","code":""},{"path":"/news/index.html","id":"clustermq-070","dir":"Changelog","previous_headings":"","what":"clustermq 0.7.0","title":"clustermq 0.7.0","text":"CRAN release: 2017-08-28 Initial release CRAN","code":""}]
+[{"path":"/articles/faq.html","id":"install","dir":"Articles","previous_headings":"","what":"Installation errors","title":"Frequently asked questions","text":"compile package fully C++11 compliant compiler required. implicit CRAN packages since R=3.6.2 hence listed SystemRequirements. encounter error saying matching function call zmq::message_t::message_t(std::string&) exists, compiler (fully) support automated check failed reason. happens instance old versions gcc compiler (default Linux distributions). can check version terminal using: case, likely HPC system already newer compiler installed need add $PATH load module. set, can install package R started terminal module/path active.","code":"In file included from CMQMaster.cpp:2:0: CMQMaster.h: In member function ‘void CMQMaster::proxy_submit_cmd(SEXP, int)’: CMQMaster.h:146:40: error: no matching function for call to ‘zmq::message_t::message_t(std::string&)’ mp.push_back(zmq::message_t(cur)); # the minimum required gcc version is 5.5 for full C++11 support (3.3 for clang) cc --version"},{"path":"/articles/faq.html","id":"stuck","dir":"Articles","previous_headings":"","what":"Session gets stuck at “Running calculations”","title":"Frequently asked questions","text":"R session may stuck something like following: see every time jobs queued yet started. Depending busy HPC , may take long time. can check queueing status jobs terminal e.g. qstat (SGE), bjobs (LSF), sinfo (SLURM). jobs already finished, likely means clustermq workers can connect main session. can confirm passing log_worker=TRUE Q inspect logs created current working directory. state something like: submitted job indeed unable establish network connection head node. can happen HPC allow incoming connections, likely happens multiple network interfaces, access head node. can list available network interfaces using ifconfig command terminal. Find interface shares subnetwork head node add R option clustermq.host=. 
unclear, contact system administrators see interface use.","code":"> clustermq::Q(identity, x=42, n_jobs=1) Submitting 1 worker jobs (ID: cmq8480) ... Running 1 calculations (5 objs/19.4 Kb common; 1 calls/chunk) ... > clustermq:::worker(\"tcp://my.headnode:9091\") 2023-12-11 10:22:58.485529 | Master: tcp://my.headnode:9091 2023-12-11 10:22:58.488892 | connecting to: tcp://my.headnode:9091: Error: Connection failed after 10016 ms Execution halted"},{"path":"/articles/faq.html","id":"ssh","dir":"Articles","previous_headings":"","what":"SSH not working","title":"Frequently asked questions","text":"trying remote schedulers via SSH, make sure scheduler works first connect cluster run job . terminal stuck make sure step SSH connection works typing following commands local terminal make sure don’t get errors warnings step: get Command found: R error, make sure $PATH set correctly ~/.bash_profile /~/.bashrc (depending cluster config might need either). may also need modify SSH template load R module conda environment. get SSH warning error try ssh -v enable verbose output. forward works, run following local R session (ideally also command-line R, RStudio): create log file remote server contain errors might occurred ssh_proxy startup. ssh_proxy startup fails local machine error server log show errors, can try increasing timeout: can happen SSH startup template includes additional steps starting R, activating module conda environment, confirm connection via two-factor authentication.","code":"Connecting via SSH ... # test your ssh login that you set up in ~/.ssh/config # if this fails you have not set up SSH correctly ssh # test port forwarding from 54709 remote to 6687 local (ports are random) # if this fails you will not be able to use clustermq via SSH ssh -R 54709:localhost:6687 R --vanilla options(clustermq.scheduler = \"ssh\", clustermq.ssh.log = \"~/ssh_proxy.log\") Q(identity, x=1, n_jobs=1) Remote R process did not respond after 5 seconds. Check your SSH server log. 
options(clustermq.ssh.timeout = 30) # in seconds"},{"path":"/articles/faq.html","id":"master-in-container","dir":"Articles","previous_headings":"","what":"Running the master inside containers","title":"Frequently asked questions","text":"master process inside container, accessing HPC scheduler difficult. Containers, including singularity docker, isolate processes inside container host. R process able submit job scheduler found. Note HPC node running master process must allowed submit jobs. HPC systems allow compute nodes submit jobs. case, may need run master process login node, discuss issue system administrator. container binary compatible host, may able bind scheduler executable container. example, PBS might look something like: working example binding SLURM CentOS 7 container image CentOS 7 host available https://groups.google.com//lbl.gov/d/msg/singularity/syLcsIWWzdo/NZvF2Ud2AAAJ Alternatively, can create script uses SSH execute scheduler login node. , need SSH client container, keys set password-less login, create script call scheduler login node via ssh (e.g. ~/bin/qsub SGE/PBS/Torque, bsub LSF sbatch Slurm): Make sure script executable, bind/copy container somewhere $PATH. Home directories bound default singularity.","code":"#PBS directives ... module load singularity SINGULARITYENV_APPEND_PATH=/opt/pbs/bin singularity exec --bind /opt/pbs/bin r_image.sif Rscript master_script.R #!/bin/bash ssh -i ~/.ssh/ ${PBS_O_HOST:-\"no_host_not_in_a_pbs_job\"} qsub \"$@\" chmod u+x ~/bin/qsub SINGULARITYENV_APPEND_PATH=~/bin"},{"path":[]},{"path":"/articles/technicaldocs.html","id":"base-api-and-schedulers","dir":"Articles","previous_headings":"Worker API","what":"Base API and schedulers","title":"Technical Documentation","text":"main worker functions wrapped R6 class name QSys. provides standardized API lower-level messages sent via ZeroMQ. 
base class derived scheduler classes add required functions submitting cleaning jobs: user-visible object worker Pool wraps , eventually allow manage different workers.","code":"+ QSys |- Multicore |- LSF + SGE |- PBS |- Torque |- etc."},{"path":[]},{"path":"/articles/technicaldocs.html","id":"creating-a-worker-pool","dir":"Articles","previous_headings":"Worker API > Workers","what":"Creating a worker pool","title":"Technical Documentation","text":"pool workers can created using workers() function, instantiates Pool object corresponding QSys-derived scheduler class. See ?workers details.","code":"# start up a pool of three workers using the default scheduler w = workers(n_jobs=3) # if we make an unclean exit for whatever reason, clean up the jobs on.exit(w$cleanup())"},{"path":"/articles/technicaldocs.html","id":"worker-startup","dir":"Articles","previous_headings":"Worker API > Workers","what":"Worker startup","title":"Technical Documentation","text":"workers started via scheduler, know machine run . start every worker TCP/IP address master socket distribute work. achieved call R common schedulers:","code":"R --no-save --no-restore -e 'clustermq:::worker(\"{{ master }}\")'"},{"path":"/articles/technicaldocs.html","id":"worker-communication","dir":"Articles","previous_headings":"Worker API > Workers","what":"Worker communication","title":"Technical Documentation","text":"master’s side, wait worker connects: can send expression evaluated worker using send method: expression (...), variables passed along call can added. batch processing clustermq usually , command work_chunk, chunk data added:","code":"msg = w$recv() # this will block until a worker is ready w$send(expression, ...) 
w$send(clustermq:::work_chunk(chunk, fun, const, rettype, common_seed), chunk = chunk(iter, submit_index))"},{"path":"/articles/technicaldocs.html","id":"worker-environment","dir":"Articles","previous_headings":"Worker API > Workers","what":"Worker environment","title":"Technical Documentation","text":"can add number objects worker environment using env method: also invisibly return data.frame objects currently environment. user wants inspect environment without changing can call w$env() without arguments. environment propagated workers automatically greedy fashion.","code":"w$env(object=value, ...)"},{"path":"/articles/technicaldocs.html","id":"main-event-loop","dir":"Articles","previous_headings":"Worker API","what":"Main event loop","title":"Technical Documentation","text":"Putting together event loop, get essentially implemented master. w$send invisibly returns identifier track call submitted, w$current() matches w$recv(). loop similar structure can used extend clustermq. example, done targets package.","code":"w = workers(3) on.exit(w$cleanup()) w$env(...) while (we have new work to send || jobs pending) { res = w$recv() # the result of the call, or NULL for a new worker w$current()$call_ref # matches answer to request, -1 otherwise # handle result if (more work) call_ref = w$send(expression, ...) # call_ref tracks request identity else w$send_shutdown() }"},{"path":"/articles/technicaldocs.html","id":"zeromq-message-specification","dir":"Articles","previous_headings":"","what":"ZeroMQ message specification","title":"Technical Documentation","text":"Communication master (main event loop) workers (QSys base class) organised messages. chunks serialized data sent via ZeroMQ’s protocol (ZMTP). 
parts message called frames.","code":""},{"path":"/articles/technicaldocs.html","id":"master---worker-communication","dir":"Articles","previous_headings":"ZeroMQ message specification","what":"Master - Worker communication","title":"Technical Documentation","text":"master requests evaluation message X frames (direct) Y proxied. handled clustermq internally. worker identity frame routing identifier delimiter frame Worker status (wlife_t) call evaluated variable name environment object yet present worker variable value using proxy, followed SEXP contains variable names proxy add forwarding worker.","code":""},{"path":"/articles/technicaldocs.html","id":"worker-evaluation","dir":"Articles","previous_headings":"ZeroMQ message specification","what":"Worker evaluation","title":"Technical Documentation","text":"worker evaluates call using R C API: error occurs evaluation returned structure class worker_error. developer wants catch errors warnings fine-grained manner, recommended add callingHandlers cmd (clustermq work work_chunk).","code":"R_tryEvalSilent(cmd, env, &err);"},{"path":"/articles/technicaldocs.html","id":"worker---master-communication","dir":"Articles","previous_headings":"ZeroMQ message specification","what":"Worker - Master communication","title":"Technical Documentation","text":"result evaluation returned message four (direct) five (proxied) frames: Worker identity frame (handled internally ZeroMQ’s ZMQ_REQ socket) Empty frame (handled internally ZeroMQ’s ZMQ_REQ socket) Worker status (wlife_t) handled internally clustermq result call (SEXP), visible user using worker via SSH, frames preceded routing identify frame handled internally ZeroMQ added peeled proxy.","code":""},{"path":"/articles/userguide.html","id":"installation","dir":"Articles","previous_headings":"","what":"Installation","title":"User Guide","text":"Install clustermq package R CRAN. 
automatically detect ZeroMQ installed otherwise use bundled library: Alternatively can use remotes package install directly Github. Note version needs autoconf/automake CMake compilation: develop branch, introduce code changes new features. may contain bugs, poor documentation, inconveniences. branch may install times. However, feedback welcome. installation issues please see FAQ.","code":"# Recommended: # If your system has `libzmq` installed but you want to enable the worker crash # monitor, set the following environment variable to enable compilation of the # bundled `libzmq` library with the required feature (`-DZMQ_BUILD_DRAFT_API=1`): # Sys.setenv(CLUSTERMQ_USE_SYSTEM_LIBZMQ=0) install.packages(\"clustermq\") # Sys.setenv(CLUSTERMQ_USE_SYSTEM_LIBZMQ=0) # install.packages('remotes') remotes::install_github(\"mschubert/clustermq\") # remotes::install_github(\"mschubert/clustermq@develop\") # dev version"},{"path":"/articles/userguide.html","id":"configuration","dir":"Articles","previous_headings":"","what":"Configuration","title":"User Guide","text":"HPC cluster’s scheduler ensures computing jobs distributed available worker nodes. Hence, clustermq interfaces order computations. default, take whichever scheduler find fall back local processing. work , cases. may need configure scheduler.","code":""},{"path":"/articles/userguide.html","id":"scheduler-setup","dir":"Articles","previous_headings":"Configuration","what":"Setting up the scheduler","title":"User Guide","text":"set scheduler explicitly, see following links: SLURM - work without setup LSF - work without setup SGE - may require configuration PBS/Torque - needs options(clustermq.scheduler=\"PBS\"/\"Torque\") can suggest another scheduler opening issue may addition need activate compute environments containers shell (e.g. ~/.bashrc) automatically. 
Check FAQ job submission/call Q errors gets stuck.","code":""},{"path":"/articles/userguide.html","id":"local-parallelization","dir":"Articles","previous_headings":"Configuration","what":"Local parallelization","title":"User Guide","text":"main focus package, can use parallelize function calls locally multiple cores processes. can also useful test code subset data submitting scheduler. Multiprocess (recommended) - Use callr package run manage multiple parallel R processes options(clustermq.scheduler=\"multiprocess\") Multicore - Uses parallel package fork current R process multiple threads options(clustermq.scheduler=\"multicore\"). sometimes causes problems (macOS, RStudio) available Windows.","code":""},{"path":"/articles/userguide.html","id":"ssh-connector","dir":"Articles","previous_headings":"Configuration","what":"SSH connector","title":"User Guide","text":"reasons might prefer work computing cluster directly rather local machine instead. RStudio excellent local IDE, ’s responsive feature-rich browser-based solutions (RStudio server, Project Jupyter), avoids X forwarding issues want look plots just made. Using setup, however, lost access computing cluster. Instead, copy data , submit individual scripts jobs, aggregating data end . clustermq trying solve providing transparent SSH interface. order use clustermq local machine, package needs installed computing cluster. computing cluster, set scheduler make sure clustermq runs without problems. Note remote scheduler can LOCAL (default HPC scheduler found) SSH work. local machine, add following options: recommend set SSH keys password-less login.","code":"# If this is set to 'LOCAL' or 'SSH' you will get the following error: # Expected PROXY_READY, received ‘PROXY_ERROR: Remote SSH QSys is not allowed’ options( clustermq.scheduler = \"multiprocess\" # or multicore, LSF, SGE, Slurm etc. 
) options( clustermq.scheduler = \"ssh\", clustermq.ssh.host = \"user@host\", # use your user and host, obviously clustermq.ssh.log = \"~/cmq_ssh.log\" # log for easier debugging )"},{"path":[]},{"path":"/articles/userguide.html","id":"the-q-function","dir":"Articles","previous_headings":"Usage","what":"The Q function","title":"User Guide","text":"following arguments supported Q: fun - function call. needs self-sufficient (access master environment) ... - iterated arguments passed function. one, need named const - named list non-iterated arguments passed fun export - named list objects export worker environment Behavior can fine-tuned using options : fail_on_error - Whether stop one calls returns error seed - common seed combined job number reproducible results memory - Amount memory request job (bsub -M) n_jobs - Number jobs submit function calls job_size - Number function calls per job. used combination n_jobs latter overall limit chunk_size - many calls worker process reporting back master. Default: every worker report back 100 times total full documentation available typing ?Q.","code":""},{"path":"/articles/userguide.html","id":"examples","dir":"Articles","previous_headings":"Usage","what":"Examples","title":"User Guide","text":"package designed distribute arbitrary function calls HPC worker nodes. , however, couple caveats observe R session running worker share local memory. simplest example function call completely self-sufficient, one argument (x) iterate : Non-iterated arguments supported const argument: function relies objects environment passed arguments (including functions), can exported using export argument: want use package function need load worker using pkgs parameter, referencing package_name:::","code":"fx = function(x) x * 2 Q(fx, x=1:3, n_jobs=1) #> Running sequentially ('LOCAL') ... 
#> [[1]] #> [1] 2 #> #> [[2]] #> [1] 4 #> #> [[3]] #> [1] 6 fx = function(x, y) x * 2 + y Q(fx, x=1:3, const=list(y=10), n_jobs=1) #> Running sequentially ('LOCAL') ... #> [[1]] #> [1] 12 #> #> [[2]] #> [1] 14 #> #> [[3]] #> [1] 16 fx = function(x) x * 2 + y Q(fx, x=1:3, export=list(y=10), n_jobs=1) #> Running sequentially ('LOCAL') ... #> [[1]] #> [1] 12 #> #> [[2]] #> [1] 14 #> #> [[3]] #> [1] 16 f1 = function(x) splitIndices(x, 3) Q(f1, x=3, n_jobs=1, pkgs=\"parallel\") #> Running sequentially ('LOCAL') ... #> [[1]] #> [[1]][[1]] #> [1] 1 #> #> [[1]][[2]] #> [1] 2 #> #> [[1]][[3]] #> [1] 3 f2 = function(x) parallel::splitIndices(x, 3) Q(f2, x=8, n_jobs=1) #> Running sequentially ('LOCAL') ... #> [[1]] #> [[1]][[1]] #> [1] 1 2 3 #> #> [[1]][[2]] #> [1] 4 5 #> #> [[1]][[3]] #> [1] 6 7 8 # Q(f1, x=5, n_jobs=1) # (Error #1) could not find function \"splitIndices\""},{"path":"/articles/userguide.html","id":"as-parallel-foreach-backend","dir":"Articles","previous_headings":"Usage","what":"As parallel foreach backend","title":"User Guide","text":"foreach package provides interface perform repeated tasks different backends. can perform function simple loops using %%: can also perform operations parallel using %dopar%: latter allows registering different handlers parallel execution, can use clustermq: BiocParallel supports foreach , means can run packages use BiocParallel cluster well via DoparParam.","code":"library(foreach) x = foreach(i=1:3) %do% sqrt(i) x = foreach(i=1:3) %dopar% sqrt(i) #> Warning: executing %dopar% sequentially: no parallel backend registered # set up the scheduler first, otherwise this will run sequentially clustermq::register_dopar_cmq(n_jobs=2, memory=1024) # this accepts same arguments as `Q` x = foreach(i=1:3) %dopar% sqrt(i) # this will be executed as jobs #> Running sequentially ('LOCAL') ... library(BiocParallel) register(DoparParam()) # after register_dopar_cmq(...) 
bplapply(1:3, sqrt)"},{"path":"/articles/userguide.html","id":"with-targets","dir":"Articles","previous_headings":"Usage","what":"With targets","title":"User Guide","text":"targets package enables users define dependency structure different function calls, evaluate underlying data changed. targets package Make-like pipeline tool statistics data science R. package skips costly runtime tasks already date, orchestrates necessary computation implicit parallel computing, abstracts files R objects. current output matches current upstream code data, whole pipeline date, results trustworthy otherwise. can use clustermq perform calculations jobs.","code":""},{"path":"/articles/userguide.html","id":"options","dir":"Articles","previous_headings":"","what":"Options","title":"User Guide","text":"various configurable options mentioned throughout documentation, applicable, however, list options reference. Options can set including call options( = ) current session added line ~/.Rprofile. former available active session, latter available time restart R. clustermq.scheduler - One supported clustermq schedulers; options \"LOCAL\", \"multiprocess\", \"multicore\", \"lsf\", \"sge\", \"slurm\", \"pbs\", \"Torque\", \"ssh\" (default HPC scheduler found $PATH, otherwise \"LOCAL\") clustermq.host - name node device constructing ZeroMQ host address (default Sys.info()[\"nodename\"]) clustermq.ssh.host - user name host connecting HPC via SSH (e.g. user@host); recommend setting SSH keys password-less login clustermq.ssh.log - Path file (SSH host) created populated logging information regarding SSH connection (e.g. 
\"~/cmq_ssh.log\"); helpful debugging purposes clustermq.ssh.timeout - amount time wait (seconds) SSH start-connection timing (default 10 seconds) clustermq.ssh.hpc_fwd_port - Port opened SSH reverse tunneling workers HPC local session (default: one integer range 50-55k) clustermq.worker.timeout - amount time wait (seconds) master-worker communication timing (default wait indefinitely) clustermq.template - Path template file submitting HPC jobs; necessary using template, otherwise default template used (default depends set inferred clustermq.scheduler) clustermq.data.warning - threshold size common data (Mb) clustermq throws warning (default 1000) clustermq.defaults - named-list default values HPC template; takes precedence defaults specified template file (default empty list)","code":""},{"path":"/articles/userguide.html","id":"debugging-workers","dir":"Articles","previous_headings":"","what":"Debugging workers","title":"User Guide","text":"Function calls evaluated workers wrapped event handlers, means even call evaluation throws error, reported back main R session. However, reasons workers might crash, case can report back. include: segfault low-level process Process kill due resource constraints (e.g. walltime) Reaching wait timeout without signal master process Probably others case, useful worker(s) create log file also include events reported back. can requested using: create file called -.log current working directory, irrespective scheduler use. can customize file name using Note case log_file template field scheduler script, hence needs present order work. default templates field included. order log worker separately, schedulers support wildcards log file names. 
instance: Multicore/Multiprocess: log_file=\"/path/.file.%\" SGE: log_file=\"/path/.file.$TASK_ID\" LSF: log_file=\"/path/.file.%\" Slurm: log_file=\"/path/.file.%\" PBS: log_file=\"/path/.file.$PBS_ARRAY_INDEX\" Torque: log_file=\"/path/.file.$PBS_ARRAYID\" scheduler documentation details available options. reporting bug includes worker crashes, please always include log file.","code":"Q(..., log_worker=TRUE) Q(..., template=list(log_file = ))"},{"path":"/articles/userguide.html","id":"environments","dir":"Articles","previous_headings":"","what":"Environments","title":"User Guide","text":"cases, may necessary activate specific computing environment scheduler jobs prior starting worker. can , instance, R installed specific environment container. Examples environments containers : Bash module environments Conda environments Docker/Singularity containers possible activate job submission script (.e., template file). widely untested, look following LSF scheduler (analogous others): template still needs filled, example need pass either set via option:","code":"#BSUB-J {{ job_name }}[1-{{ n_jobs }}] # name of the job / array jobs #BSUB-o {{ log_file | /dev/null }} # stdout + stderr #BSUB-M {{ memory | 4096 }} # Memory requirements in Mbytes #BSUB-R rusage[mem={{ memory | 4096 }}] # Memory requirements in Mbytes ##BSUB-q default # name of the queue (uncomment) ##BSUB-W {{ walltime | 6:00 }} # walltime (uncomment) module load {{ bashenv | default_bash_env }} # or: source activate {{ conda | default_conda_env_name }} # or: your environment activation command ulimit -v $(( 1024 * {{ memory | 4096 }} )) CMQ_AUTH={{ auth }} R --no-save --no-restore -e 'clustermq:::worker(\"{{ master }}\")' Q(..., template=list(bashenv=\"my environment name\")) options( clustermq.defaults = list(bashenv=\"my default env\") )"},{"path":[]},{"path":"/articles/userguide.html","id":"lsf","dir":"Articles","previous_headings":"Scheduler templates","what":"LSF","title":"User Guide","text":"Set 
following options R session submit jobs: supply template, save contents desired changes file clustermq.template point . file, #BSUB-* defines command-line arguments bsub program. Memory: defined BSUB-M BSUB-R. Check local setup memory values supplied MiB KiB, default 4096 requesting memory calling Q() Queue: BSUB-q default. Use queue name default. likely exist system, choose right name uncomment removing additional # Walltime: BSUB-W {{ walltime }}. Set maximum time job allowed run killed. default disable line. enable , enter fixed value pass walltime argument function call. way written, use 6 hours argument given. options, see LSF documentation add via #BSUB-* (* represents argument) change identifiers curly braces ({{ ... }}), used fill right variables done, package use settings longer warn missing options.","code":"options( clustermq.scheduler = \"lsf\", clustermq.template = \"/path/to/file/below\" # if using your own template ) #BSUB-J {{ job_name }}[1-{{ n_jobs }}] # name of the job / array jobs #BSUB-n {{ cores | 1 }} # number of cores to use per job #BSUB-o {{ log_file | /dev/null }} # stdout + stderr; %I for array index #BSUB-M {{ memory | 4096 }} # Memory requirements in Mbytes #BSUB-R rusage[mem={{ memory | 4096 }}] # Memory requirements in Mbytes ##BSUB-q default # name of the queue (uncomment) ##BSUB-W {{ walltime | 6:00 }} # walltime (uncomment) ulimit -v $(( 1024 * {{ memory | 4096 }} )) CMQ_AUTH={{ auth }} R --no-save --no-restore -e 'clustermq:::worker(\"{{ master }}\")'"},{"path":"/articles/userguide.html","id":"sge","dir":"Articles","previous_headings":"Scheduler templates","what":"SGE","title":"User Guide","text":"Set following options R session submit jobs: supply template, save contents desired changes file clustermq.template point . file, #$-* defines command-line arguments qsub program. Queue: $ -q default. Use queue name default. likely exist system, choose right name uncomment removing additional # options, see SGE documentation. 
change identifiers curly braces ({{ ... }}), used fill right variables. done, package use settings longer warn missing options.","code":"options( clustermq.scheduler = \"sge\", clustermq.template = \"/path/to/file/below\" # if using your own template ) #$ -N {{ job_name }} # job name ##$ -q default # submit to queue named \"default\" #$ -j y # combine stdout/error in one file #$ -o {{ log_file | /dev/null }} # output file #$ -cwd # use pwd as work dir #$ -V # use environment variable #$ -t 1-{{ n_jobs }} # submit jobs as array #$ -pe smp {{ cores | 1 }} # number of cores to use per job #$ -l m_mem_free={{ memory | 1073741824 }} # 1 Gb in bytes ulimit -v $(( 1024 * {{ memory | 4096 }} )) CMQ_AUTH={{ auth }} R --no-save --no-restore -e 'clustermq:::worker(\"{{ master }}\")'"},{"path":"/articles/userguide.html","id":"slurm","dir":"Articles","previous_headings":"Scheduler templates","what":"SLURM","title":"User Guide","text":"Set following options R session submit jobs: supply template, save contents desired changes file clustermq.template point . file, #SBATCH defines command-line arguments sbatch program. Partition: SBATCH --partition default. Use queue name default. likely exist system, choose right name uncomment removing additional # options, see SLURM documentation. change identifiers curly braces ({{ ... }}), used fill right variables. 
done, package use settings longer warn missing options.","code":"options( clustermq.scheduler = \"slurm\", clustermq.template = \"/path/to/file/below\" # if using your own template ) #!/bin/sh #SBATCH --job-name={{ job_name }} ##SBATCH --partition=default #SBATCH --output={{ log_file | /dev/null }} #SBATCH --error={{ log_file | /dev/null }} #SBATCH --mem-per-cpu={{ memory | 4096 }} #SBATCH --array=1-{{ n_jobs }} #SBATCH --cpus-per-task={{ cores | 1 }} ulimit -v $(( 1024 * {{ memory | 4096 }} )) CMQ_AUTH={{ auth }} R --no-save --no-restore -e 'clustermq:::worker(\"{{ master }}\")'"},{"path":"/articles/userguide.html","id":"pbs","dir":"Articles","previous_headings":"Scheduler templates","what":"PBS","title":"User Guide","text":"Set following options R session submit jobs: supply template, save contents desired changes file clustermq.template point . file, #PBS-* defines command-line arguments qsub program. Queue: #PBS-q default. Use queue name default. likely exist system, choose right name uncomment removing additional # options, see PBS documentation. change identifiers curly braces ({{ ... }}), used fill right variables. done, package use settings longer warn missing options.","code":"options( clustermq.scheduler = \"pbs\", clustermq.template = \"/path/to/file/below\" # if using your own template ) #PBS -N {{ job_name }} #PBS -J 1-{{ n_jobs }} #PBS -l select=1:ncpus={{ cores | 1 }}:mpiprocs={{ cores | 1 }}:mem={{ memory | 4096 }}MB #PBS -l walltime={{ walltime | 12:00:00 }} #PBS -o {{ log_file | /dev/null }} #PBS -j oe ##PBS -q default ulimit -v $(( 1024 * {{ memory | 4096 }} )) CMQ_AUTH={{ auth }} R --no-save --no-restore -e 'clustermq:::worker(\"{{ master }}\")'"},{"path":"/articles/userguide.html","id":"torque","dir":"Articles","previous_headings":"Scheduler templates","what":"Torque","title":"User Guide","text":"Set following options R session submit jobs: supply template, save contents desired changes file clustermq.template point . 
file, #PBS-* defines command-line arguments qsub program. Queue: #PBS-q default. Use queue name default. likely exist system, choose right name uncomment removing additional # options, see Torque documentation. change identifiers curly braces ({{ ... }}), used fill right variables. done, package use settings longer warn missing options.","code":"options(clustermq.scheduler = \"Torque\", clustermq.template = \"/path/to/file/below\" # if using your own template ) #PBS -N {{ job_name }} #PBS -l nodes={{ n_jobs }}:ppn={{ cores | 1 }},walltime={{ walltime | 12:00:00 }} #PBS -o {{ log_file | /dev/null }} #PBS -j oe ##PBS -q default ulimit -v $(( 1024 * {{ memory | 4096 }} )) CMQ_AUTH={{ auth }} R --no-save --no-restore -e 'clustermq:::worker(\"{{ master }}\")'"},{"path":"/articles/userguide.html","id":"ssh-template","dir":"Articles","previous_headings":"Scheduler templates","what":"SSH","title":"User Guide","text":"SSH scheduler, can access remote schedulers via SSH. want use , first make sure clustermq works server real scheduler. move setting SSH. default template shown . R HPC $PATH, may need specify path load required bash modules/conda environments. 
supply template, save contents desired changes file local machine clustermq.template point .","code":"options(clustermq.scheduler = \"ssh\", clustermq.ssh.host = \"myhost\", # set this up in your local ~/.ssh/config clustermq.ssh.log = \"~/ssh_proxy.log\", # log file on your HPC clustermq.ssh.timeout = 30, # if changing the default connection timeout clustermq.template = \"/path/to/file/below\" # if using your own template ) ssh -o \"ExitOnForwardFailure yes\" -f -R {{ ctl_port }}:localhost:{{ local_port }} -R {{ job_port }}:localhost:{{ fwd_port }} {{ ssh_host }} \"R --no-save --no-restore -e 'clustermq:::ssh_proxy(ctl={{ ctl_port }}, job={{ job_port }})' > {{ ssh_log | /dev/null }} 2>&1\""},{"path":"/authors.html","id":null,"dir":"","previous_headings":"","what":"Authors","title":"Authors and Citation","text":"Michael Schubert. Author, maintainer, copyright holder. ZeroMQ authors. Author, copyright holder. source files 'src/libzmq' 'src/cppzmq'","code":""},{"path":"/authors.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Authors and Citation","text":"Schubert, M. clustermq enables efficient parallelisation genomic analyses. Bioinformatics (2019). 
doi:10.1093/bioinformatics/btz284","code":"@Article{, title = {clustermq enables efficient parallelisation of genomic analyses}, author = {Michael Schubert}, journal = {Bioinformatics}, month = {May}, year = {2019}, language = {en}, doi = {10.1093/bioinformatics/btz284}, url = {https://github.com/mschubert/clustermq}, }"},{"path":"/index.html","id":"clustermq-send-r-function-calls-as-cluster-jobs","dir":"","previous_headings":"","what":"Evaluate Function Calls on HPC Schedulers (LSF, SGE, SLURM, PBS/Torque)","title":"Evaluate Function Calls on HPC Schedulers (LSF, SGE, SLURM, PBS/Torque)","text":"package allow send function calls jobs computing cluster minimal interface provided Q function: Computations done entirely network without temporary files network-mounted storage, strain file system apart starting R per job. calculations load-balanced, .e. workers get jobs done faster also receive function calls work . especially useful calls return time, one worker high load. Browse vignettes : User Guide Technical Documentation FAQ","code":"# load the library and create a simple function library(clustermq) fx = function(x) x * 2 # queue the function call on your scheduler Q(fx, x=1:3, n_jobs=1) # list(2,4,6)"},{"path":"/index.html","id":"installation","dir":"","previous_headings":"","what":"Installation","title":"Evaluate Function Calls on HPC Schedulers (LSF, SGE, SLURM, PBS/Torque)","text":"Install clustermq package R CRAN (including bundled ZeroMQ system library): Alternatively can use remotes package install directly Github. 
Note version needs autoconf/automake CMake compilation: [!TIP] installation problems, see FAQ","code":"install.packages('clustermq') # install.packages('remotes') remotes::install_github('mschubert/clustermq') # remotes::install_github('mschubert/clustermq@develop') # dev version"},{"path":"/index.html","id":"schedulers","dir":"","previous_headings":"","what":"Schedulers","title":"Evaluate Function Calls on HPC Schedulers (LSF, SGE, SLURM, PBS/Torque)","text":"HPC cluster’s scheduler ensures computing jobs distributed available worker nodes. Hence, clustermq interfaces order computations. currently support following schedulers (either locally via SSH): Multiprocess - test calls parallelize cores using options(clustermq.scheduler=\"multiprocess\") SLURM - work without setup LSF - work without setup SGE - may require configuration PBS/Torque - needs options(clustermq.scheduler=\"PBS\"/\"Torque\") via SSH - needs options(clustermq.scheduler=\"ssh\", clustermq.ssh.host=) [!TIP] Follow links configure scheduler case working box check FAQ job submission errors gets stuck","code":""},{"path":"/index.html","id":"usage","dir":"","previous_headings":"","what":"Usage","title":"Evaluate Function Calls on HPC Schedulers (LSF, SGE, SLURM, PBS/Torque)","text":"common arguments Q : fun - function call. needs self-sufficient (access master environment) ... - iterated arguments passed function. one, need named const - named list non-iterated arguments passed fun export - named list objects export worker environment documentation arguments can accessed typing ?Q. Examples using const export : clustermq can also used parallel backend foreach. 
also used BiocParallel, can run packages cluster well: examples available User Guide.","code":"# adding a constant argument fx = function(x, y) x * 2 + y Q(fx, x=1:3, const=list(y=10), n_jobs=1) # exporting an object to workers fx = function(x) x * 2 + y Q(fx, x=1:3, export=list(y=10), n_jobs=1) library(foreach) register_dopar_cmq(n_jobs=2, memory=1024) # see `?workers` for arguments foreach(i=1:3) %dopar% sqrt(i) # this will be executed as jobs library(BiocParallel) register(DoparParam()) # after register_dopar_cmq(...) bplapply(1:3, sqrt)"},{"path":"/index.html","id":"comparison-to-other-packages","dir":"","previous_headings":"","what":"Comparison to other packages","title":"Evaluate Function Calls on HPC Schedulers (LSF, SGE, SLURM, PBS/Torque)","text":"packages provide high-level parallelization R function calls computing cluster. compared clustermq BatchJobs batchtools processing many short-running jobs, found approximately 1000x less overhead cost. short, use clustermq want: one-line solution run cluster jobs minimal setup access cluster functions local Rstudio via SSH fast processing many function calls without network storage /O Use batchtools : want use mature well-tested package don’t mind arguments every call written /read disc don’t mind ’s load-balancing run-time Use Snakemake targets : want design run workflow HPC Don’t use batch (last updated 2013) BatchJobs (issues SQLite network-mounted storage).","code":""},{"path":"/index.html","id":"contributing","dir":"","previous_headings":"","what":"Contributing","title":"Evaluate Function Calls on HPC Schedulers (LSF, SGE, SLURM, PBS/Torque)","text":"Contributions welcome come many different forms, shapes, sizes. include, limited : Questions: Ask Github Discussions board. advanced user, please also consider answering questions . Bug reports: File issue something work expected. sure include self-contained Minimal Reproducible Example set log_worker=TRUE. Code contributions: look good first issue tag. 
Please discuss anything complicated putting lot work , ’m happy help get started. [!TIP] Check User Guide FAQ first, maybe query already answered ","code":""},{"path":"/index.html","id":"citation","dir":"","previous_headings":"","what":"Citation","title":"Evaluate Function Calls on HPC Schedulers (LSF, SGE, SLURM, PBS/Torque)","text":"project part academic work, evaluated citations. like able continue working research support tools like clustermq, please cite article using publications: M Schubert. clustermq enables efficient parallelisation genomic analyses. Bioinformatics (2019). doi:10.1093/bioinformatics/btz284","code":""},{"path":"/reference/LOCAL.html","id":null,"dir":"Reference","previous_headings":"","what":"Placeholder for local processing — LOCAL","title":"Placeholder for local processing — LOCAL","text":"Mainly tests pass without setting scheduler","code":""},{"path":"/reference/LSF.html","id":null,"dir":"Reference","previous_headings":"","what":"LSF scheduler functions — LSF","title":"LSF scheduler functions — LSF","text":"Derives QSys provide LSF-specific functions","code":""},{"path":"/reference/MULTICORE.html","id":null,"dir":"Reference","previous_headings":"","what":"Process on multiple cores on one machine — MULTICORE","title":"Process on multiple cores on one machine — MULTICORE","text":"Derives QSys provide multicore-specific functions","code":""},{"path":"/reference/MULTIPROCESS.html","id":null,"dir":"Reference","previous_headings":"","what":"Process on multiple processes on one machine — MULTIPROCESS","title":"Process on multiple processes on one machine — MULTIPROCESS","text":"Derives QSys provide callr-specific functions","code":""},{"path":"/reference/Pool.html","id":null,"dir":"Reference","previous_headings":"","what":"Class for basic queuing system functions — Pool","title":"Class for basic queuing system functions — Pool","text":"Provides basic functions needed communicate machines abstract functions rZMQ scheduler implementations can 
rely higher level functionality","code":""},{"path":"/reference/Q.html","id":null,"dir":"Reference","previous_headings":"","what":"Queue function calls on the cluster — Q","title":"Queue function calls on the cluster — Q","text":"Queue function calls cluster","code":""},{"path":"/reference/Q.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Queue function calls on the cluster — Q","text":"","code":"Q( fun, ..., const = list(), export = list(), pkgs = c(), seed = 128965, memory = NULL, template = list(), n_jobs = NULL, job_size = NULL, split_array_by = -1, rettype = \"list\", fail_on_error = TRUE, workers = NULL, log_worker = FALSE, chunk_size = NA, timeout = Inf, max_calls_worker = Inf, verbose = TRUE )"},{"path":"/reference/Q.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Queue function calls on the cluster — Q","text":"fun function call ... Objects iterated function call const list constant arguments passed function call export List objects exported worker pkgs Character vector packages load worker seed seed set function call memory Short template=list(memory=value) template named list values fill template n_jobs number LSF jobs submit; upper limit jobs job_size given well job_size number function calls per job split_array_by dimension number split arrays `...`; default: last rettype Return type function call (vector type 'list') fail_on_error error occurs workers, continue fail? workers Optional instance QSys representing worker pool log_worker Write log file worker chunk_size Number function calls chunk together defaults 100 chunks per worker max. 
10 kb per chunk timeout Maximum time seconds wait worker (default: Inf) max_calls_worker Maximum number chunks sent one worker verbose Print status messages progress bar (default: TRUE)","code":""},{"path":"/reference/Q.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Queue function calls on the cluster — Q","text":"list whatever `fun` returned","code":""},{"path":"/reference/Q.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Queue function calls on the cluster — Q","text":"","code":"if (FALSE) { # Run a simple multiplication for numbers 1 to 3 on a worker node fx = function(x) x * 2 Q(fx, x=1:3, n_jobs=1) # list(2,4,6) # Run a mutate() call in dplyr on a worker node iris %>% mutate(area = Q(`*`, e1=Sepal.Length, e2=Sepal.Width, n_jobs=1)) # iris with an additional column 'area' }"},{"path":"/reference/QSys.html","id":null,"dir":"Reference","previous_headings":"","what":"Class for basic queuing system functions — QSys","title":"Class for basic queuing system functions — QSys","text":"Provides basic functions needed communicate machines abstract functions rZMQ scheduler implementations can rely higher level functionality","code":""},{"path":"/reference/Q_rows.html","id":null,"dir":"Reference","previous_headings":"","what":"Queue function calls defined by rows in a data.frame — Q_rows","title":"Queue function calls defined by rows in a data.frame — Q_rows","text":"Queue function calls defined rows data.frame","code":""},{"path":"/reference/Q_rows.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Queue function calls defined by rows in a data.frame — Q_rows","text":"","code":"Q_rows( df, fun, const = list(), export = list(), pkgs = c(), seed = 128965, memory = NULL, template = list(), n_jobs = NULL, job_size = NULL, rettype = \"list\", fail_on_error = TRUE, workers = NULL, log_worker = FALSE, chunk_size = NA, timeout = Inf, max_calls_worker = Inf, 
verbose = TRUE )"},{"path":"/reference/Q_rows.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Queue function calls defined by rows in a data.frame — Q_rows","text":"df data.frame iterated arguments fun function call const list constant arguments passed function call export List objects exported worker pkgs Character vector packages load worker seed seed set function call memory Short template=list(memory=value) template named list values fill template n_jobs number LSF jobs submit; upper limit jobs job_size given well job_size number function calls per job rettype Return type function call (vector type 'list') fail_on_error error occurs workers, continue fail? workers Optional instance QSys representing worker pool log_worker Write log file worker chunk_size Number function calls chunk together defaults 100 chunks per worker max. 10 kb per chunk timeout Maximum time seconds wait worker (default: Inf) max_calls_worker Maximum number chunks sent one worker verbose Print status messages progress bar (default: TRUE)","code":""},{"path":"/reference/Q_rows.html","id":"ref-examples","dir":"Reference","previous_headings":"","what":"Examples","title":"Queue function calls defined by rows in a data.frame — Q_rows","text":"","code":"if (FALSE) { # Run a simple multiplication for data frame columns x and y on a worker node fx = function (x, y) x * y df = data.frame(x = 5, y = 10) Q_rows(df, fx, job_size = 1) # [1] 50 # Q_rows also matches the names of a data frame with the function arguments fx = function (x, y) x - y df = data.frame(y = 5, x = 10) Q_rows(df, fx, job_size = 1) # [1] 5 }"},{"path":"/reference/SGE.html","id":null,"dir":"Reference","previous_headings":"","what":"SGE scheduler functions — SGE","title":"SGE scheduler functions — SGE","text":"Derives QSys provide SGE-specific functions","code":""},{"path":"/reference/SLURM.html","id":null,"dir":"Reference","previous_headings":"","what":"SLURM scheduler functions — 
SLURM","title":"SLURM scheduler functions — SLURM","text":"Derives QSys provide SLURM-specific functions","code":""},{"path":"/reference/SSH.html","id":null,"dir":"Reference","previous_headings":"","what":"SSH scheduler functions — SSH","title":"SSH scheduler functions — SSH","text":"Derives QSys provide SSH-specific functions","code":""},{"path":"/reference/check_args.html","id":null,"dir":"Reference","previous_headings":"","what":"Function to check arguments with which Q() is called — check_args","title":"Function to check arguments with which Q() is called — check_args","text":"Function check arguments Q() called","code":""},{"path":"/reference/check_args.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Function to check arguments with which Q() is called — check_args","text":"","code":"check_args(fun, iter, const = list())"},{"path":"/reference/check_args.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Function to check arguments with which Q() is called — check_args","text":"fun function call iter Objects iterated function call const list constant arguments passed function call","code":""},{"path":"/reference/check_args.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Function to check arguments with which Q() is called — check_args","text":"Processed iterated argument list 'iter' list","code":""},{"path":"/reference/chunk.html","id":null,"dir":"Reference","previous_headings":"","what":"Subset index chunk for processing — chunk","title":"Subset index chunk for processing — chunk","text":"'attr' `[.data.frame` takes much CPU time","code":""},{"path":"/reference/chunk.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Subset index chunk for processing — chunk","text":"","code":"chunk(x, 
i)"},{"path":"/reference/chunk.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Subset index chunk for processing — chunk","text":"x Index data.frame Rows subset","code":""},{"path":"/reference/chunk.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Subset index chunk for processing — chunk","text":"x[,]","code":""},{"path":"/reference/clustermq-package.html","id":null,"dir":"Reference","previous_headings":"","what":"Evaluate Function Calls on HPC Schedulers (LSF, SGE, SLURM) — clustermq-package","title":"Evaluate Function Calls on HPC Schedulers (LSF, SGE, SLURM) — clustermq-package","text":"Provides Q function send arbitrary function calls workers HPC schedulers without relying network-mounted storage. Allows using remote schedulers via SSH.","code":""},{"path":"/reference/clustermq-package.html","id":"details","dir":"Reference","previous_headings":"","what":"Details","title":"Evaluate Function Calls on HPC Schedulers (LSF, SGE, SLURM) — clustermq-package","text":"hood, submit cluster job connects master via TCP master send function argument chunks worker worker return results master everything done get back result Computations done entirely network without temporary files network-mounted storage, strain file system apart starting R per job. removes biggest bottleneck distributed computing. Using approach, can easily load-balancing, .e. workers get jobs done faster also receive function calls work . especially useful calls return time, one worker high load. 
detailed usage instructions, see documentation Q function.","code":""},{"path":[]},{"path":"/reference/clustermq-package.html","id":"author","dir":"Reference","previous_headings":"","what":"Author","title":"Evaluate Function Calls on HPC Schedulers (LSF, SGE, SLURM) — clustermq-package","text":"Maintainer: Michael Schubert mschu.dev@gmail.com (ORCID) [copyright holder] Authors: ZeroMQ authors (source files 'src/libzmq' 'src/cppzmq') [copyright holder]","code":""},{"path":"/reference/cmq_foreach.html","id":null,"dir":"Reference","previous_headings":"","what":"clustermq foreach handler — cmq_foreach","title":"clustermq foreach handler — cmq_foreach","text":"clustermq foreach handler","code":""},{"path":"/reference/cmq_foreach.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"clustermq foreach handler — cmq_foreach","text":"","code":"cmq_foreach(obj, expr, envir, data)"},{"path":"/reference/cmq_foreach.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"clustermq foreach handler — cmq_foreach","text":"obj Returned foreach::foreach, containing following variables: args : Arguments passed, call argnames: character vector arguments passed evalenv : Environment evaluate arguments export : character vector variable names export nodes packages: character vector required packages verbose : whether print status messages [logical] errorHandling: string function name call error , e.g. \"stop\" expr R expression curly braces envir Environment evaluate arguments data Common arguments passed register_dopar_cmq(), e.g. 
n_jobs","code":""},{"path":"/reference/dot-onAttach.html","id":null,"dir":"Reference","previous_headings":"","what":"Report queueing system on package attach if not set — .onAttach","title":"Report queueing system on package attach if not set — .onAttach","text":"Report queueing system package attach set","code":""},{"path":"/reference/dot-onAttach.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Report queueing system on package attach if not set — .onAttach","text":"","code":".onAttach(libname, pkgname)"},{"path":"/reference/dot-onAttach.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Report queueing system on package attach if not set — .onAttach","text":"libname default arg compatibility pkgname default arg compatibility","code":""},{"path":"/reference/dot-onLoad.html","id":null,"dir":"Reference","previous_headings":"","what":"Select the queueing system on package loading — .onLoad","title":"Select the queueing system on package loading — .onLoad","text":"done setting variable 'qsys' package environment object contains desired queueing system.","code":""},{"path":"/reference/dot-onLoad.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Select the queueing system on package loading — .onLoad","text":"","code":".onLoad(libname, pkgname)"},{"path":"/reference/dot-onLoad.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Select the queueing system on package loading — .onLoad","text":"libname default arg compatibility pkgname default arg compatibility","code":""},{"path":"/reference/fill_template.html","id":null,"dir":"Reference","previous_headings":"","what":"Fill a template string with supplied values — fill_template","title":"Fill a template string with supplied values — fill_template","text":"Fill template string supplied 
values","code":""},{"path":"/reference/fill_template.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Fill a template string with supplied values — fill_template","text":"","code":"fill_template(template, values, required = c())"},{"path":"/reference/fill_template.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Fill a template string with supplied values — fill_template","text":"template character string submission template values named list key-value pairs required Keys must present template (default: none)","code":""},{"path":"/reference/fill_template.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Fill a template string with supplied values — fill_template","text":"template placeholder fields replaced values","code":""},{"path":"/reference/host.html","id":null,"dir":"Reference","previous_headings":"","what":"Construct the ZeroMQ host address — host","title":"Construct the ZeroMQ host address — host","text":"Construct ZeroMQ host address","code":""},{"path":"/reference/host.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Construct the ZeroMQ host address — host","text":"","code":"host( node = getOption(\"clustermq.host\", Sys.info()[\"nodename\"]), ports = 6000:9999, n = 100 )"},{"path":"/reference/host.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Construct the ZeroMQ host address — host","text":"node Node device name ports Range ports consider n many addresses return","code":""},{"path":"/reference/host.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Construct the ZeroMQ host address — host","text":"possible addresses character vector","code":""},{"path":"/reference/master.html","id":null,"dir":"Reference","previous_headings":"","what":"Master controlling the workers — master","title":"Master controlling the workers — 
master","text":"exchanging messages master workers works following way: * submitted job know start * starts, sends message list(id=0) indicating ready * send function definition common data * also send first data set work * get id > 0, result store * send next data set/index work * computations complete, send id=0 worker * responds id=-1 (usage stats) shuts ","code":""},{"path":"/reference/master.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Master controlling the workers — master","text":"","code":"master( pool, iter, rettype = \"list\", fail_on_error = TRUE, chunk_size = NA, timeout = Inf, max_calls_worker = Inf, verbose = TRUE )"},{"path":"/reference/master.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Master controlling the workers — master","text":"pool Instance Pool object iter Objects iterated function call rettype Return type function fail_on_error error occurs workers, continue fail? chunk_size Number function calls chunk together defaults 100 chunks per worker max. 
500 kb per chunk timeout Maximum time seconds wait worker (default: Inf) max_calls_worker Maximum number function calls sent one worker verbose Print progress messages","code":""},{"path":"/reference/master.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Master controlling the workers — master","text":"list whatever `fun` returned","code":""},{"path":"/reference/msg_fmt.html","id":null,"dir":"Reference","previous_headings":"","what":"Message format for logging — msg_fmt","title":"Message format for logging — msg_fmt","text":"Message format logging","code":""},{"path":"/reference/msg_fmt.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Message format for logging — msg_fmt","text":"","code":"msg_fmt(verbose = TRUE)"},{"path":"/reference/register_dopar_cmq.html","id":null,"dir":"Reference","previous_headings":"","what":"Register clustermq as `foreach` parallel handler — register_dopar_cmq","title":"Register clustermq as `foreach` parallel handler — register_dopar_cmq","text":"Register clustermq `foreach` parallel handler","code":""},{"path":"/reference/register_dopar_cmq.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Register clustermq as `foreach` parallel handler — register_dopar_cmq","text":"","code":"register_dopar_cmq(...)"},{"path":"/reference/register_dopar_cmq.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Register clustermq as `foreach` parallel handler — register_dopar_cmq","text":"... List arguments passed `Q` function, e.g. 
n_jobs","code":""},{"path":"/reference/ssh_proxy.html","id":null,"dir":"Reference","previous_headings":"","what":"SSH proxy for different schedulers — ssh_proxy","title":"SSH proxy for different schedulers — ssh_proxy","text":"call manually, SSH qsys ","code":""},{"path":"/reference/ssh_proxy.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"SSH proxy for different schedulers — ssh_proxy","text":"","code":"ssh_proxy(fwd_port, qsys_id = qsys_default)"},{"path":"/reference/ssh_proxy.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"SSH proxy for different schedulers — ssh_proxy","text":"fwd_port port master address connect (remote end reverse tunnel) qsys_id Character string QSys class use","code":""},{"path":"/reference/summarize_result.html","id":null,"dir":"Reference","previous_headings":"","what":"Print a summary of errors and warnings that occurred during processing — summarize_result","title":"Print a summary of errors and warnings that occurred during processing — summarize_result","text":"Print summary errors warnings occurred processing","code":""},{"path":"/reference/summarize_result.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Print a summary of errors and warnings that occurred during processing — summarize_result","text":"","code":"summarize_result( result, n_errors, n_warnings, cond_msgs, at = length(result), fail_on_error = TRUE )"},{"path":"/reference/summarize_result.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Print a summary of errors and warnings that occurred during processing — summarize_result","text":"result list vector processing result n_errors many errors occurred n_warnings many warnings occurred cond_msgs Error warnings messages, display first 50 many calls procesed point fail_on_error Stop error(s) 
occurred","code":""},{"path":"/reference/vec_lookup.html","id":null,"dir":"Reference","previous_headings":"","what":"Lookup table for return types to vector NAs — vec_lookup","title":"Lookup table for return types to vector NAs — vec_lookup","text":"Lookup table return types vector NAs","code":""},{"path":"/reference/vec_lookup.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Lookup table for return types to vector NAs — vec_lookup","text":"","code":"vec_lookup"},{"path":"/reference/vec_lookup.html","id":"format","dir":"Reference","previous_headings":"","what":"Format","title":"Lookup table for return types to vector NAs — vec_lookup","text":"object class list length 9.","code":""},{"path":"/reference/work_chunk.html","id":null,"dir":"Reference","previous_headings":"","what":"Function to process a chunk of calls — work_chunk","title":"Function to process a chunk of calls — work_chunk","text":"chunk comes encapsulated data.frame","code":""},{"path":"/reference/work_chunk.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Function to process a chunk of calls — work_chunk","text":"","code":"work_chunk( df, fun, const = list(), rettype = \"list\", common_seed = NULL, progress = FALSE )"},{"path":"/reference/work_chunk.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Function to process a chunk of calls — work_chunk","text":"df data.frame call IDs rownames arguments columns fun function call const Constant arguments passed call rettype Return type function common_seed seed offset common function calls progress Logical indicated whether display progress bar","code":""},{"path":"/reference/work_chunk.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Function to process a chunk of calls — work_chunk","text":"list call results (try-error 
failed)","code":""},{"path":"/reference/worker.html","id":null,"dir":"Reference","previous_headings":"","what":"R worker submitted as cluster job — worker","title":"R worker submitted as cluster job — worker","text":"call manually, master ","code":""},{"path":"/reference/worker.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"R worker submitted as cluster job — worker","text":"","code":"worker(master, ..., verbose = TRUE, context = NULL)"},{"path":"/reference/worker.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"R worker submitted as cluster job — worker","text":"master master address (tcp://ip:port) ... Catch-break older template values (ignored) verbose Whether print debug messages context ZeroMQ context (internal testing)","code":""},{"path":"/reference/workers.html","id":null,"dir":"Reference","previous_headings":"","what":"Creates a pool of workers — workers","title":"Creates a pool of workers — workers","text":"Creates pool workers","code":""},{"path":"/reference/workers.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Creates a pool of workers — workers","text":"","code":"workers( n_jobs, data = NULL, reuse = TRUE, template = list(), log_worker = FALSE, qsys_id = getOption(\"clustermq.scheduler\", qsys_default), verbose = FALSE, ... )"},{"path":"/reference/workers.html","id":"arguments","dir":"Reference","previous_headings":"","what":"Arguments","title":"Creates a pool of workers — workers","text":"n_jobs Number jobs submit (0 implies local processing) data Set common data (function, constant args, seed) reuse Whether workers reusable get shut call template named list values fill template log_worker Write log file worker qsys_id Character string QSys class use verbose Print message worker startup ... 
Additional arguments passed qsys constructor","code":""},{"path":"/reference/workers.html","id":"value","dir":"Reference","previous_headings":"","what":"Value","title":"Creates a pool of workers — workers","text":"instance QSys class","code":""},{"path":"/reference/wrap_error.html","id":null,"dir":"Reference","previous_headings":"","what":"Wraps an error in a condition object — wrap_error","title":"Wraps an error in a condition object — wrap_error","text":"Wraps error condition object","code":""},{"path":"/reference/wrap_error.html","id":"ref-usage","dir":"Reference","previous_headings":"","what":"Usage","title":"Wraps an error in a condition object — wrap_error","text":"","code":"wrap_error(call)"},{"path":"/news/index.html","id":"clustermq-094","dir":"Changelog","previous_headings":"","what":"clustermq 0.9.4","title":"clustermq 0.9.4","text":"CRAN release: 2024-03-04 Fix bug worker stats shown NA (#325) Worker API: env() now visibly lists environment called without arguments","code":""},{"path":"/news/index.html","id":"clustermq-093","dir":"Changelog","previous_headings":"","what":"clustermq 0.9.3","title":"clustermq 0.9.3","text":"CRAN release: 2024-01-09 Fix bug BiocParallel export required objects (#302) Fix bug already finished workers killed (#307) Fix bug worker results stats garbage collected (#324) now FAQ vignette answers frequently asked questions Worker API: send() now reports call identifier current() tracks","code":""},{"path":"/news/index.html","id":"clustermq-092","dir":"Changelog","previous_headings":"","what":"clustermq 0.9.2","title":"clustermq 0.9.2","text":"CRAN release: 2023-12-07 Fix bug SSH proxy cache data properly (#320) Fix bug max_calls_worker respected (#322) Local parallelism (multicore, multiprocess) uses local IP (#321) Worker API: info() now also returns current worker number calls","code":""},{"path":"/news/index.html","id":"clustermq-091","dir":"Changelog","previous_headings":"","what":"clustermq 0.9.1","title":"clustermq 
0.9.1","text":"CRAN release: 2023-11-21 Disconnect monitor (libzmq -DZMQ_BUILD_DRAFT_API=1) now optional (#317) Fix bug worker shutdown notifications can cause crash (#306, #308, #310) Fix bug template values filled correctly (#309) Fix bug using Rf_error lead improper cleanup resources (#311) Fix bug maximum worker timeout multiplied led undefined behavior Fix bug ZeroMQ’s -Werror flag led compilation issues M1 Mac Fix bug SSH tests error timeout high load Worker API: CMQMaster now needs know add_pending_workers(n) Worker API: status report info() now displays properly","code":""},{"path":"/news/index.html","id":"clustermq-090","dir":"Changelog","previous_headings":"","what":"clustermq 0.9.0","title":"clustermq 0.9.0","text":"CRAN release: 2023-09-23","code":""},{"path":"/news/index.html","id":"features-0-9-0","dir":"Changelog","previous_headings":"","what":"Features","title":"clustermq 0.9.0","text":"Reuse common data now supported (#154) Jobs now error instead stalling upon unexpected worker disconnect (#150) Workers now error can establish connection within time limit Error n_jobs max_calls_worker provide insufficient call slots (#258) Request 1 GB default SGE template (#298) @nickholway Error warning summary now orders index severity (#304) call can multiple warnings forwarded, last","code":""},{"path":"/news/index.html","id":"bugfix-0-9-0","dir":"Changelog","previous_headings":"","what":"Bugfix","title":"clustermq 0.9.0","text":"Fix bug max memory reporting gc() may different column (#240) Fix passing numerical job_id qdel PBS (#265) job port/id pool now used properly upon binding failure (#270) @luwidmer Common data size warning now displayed exceeding limits (#287)","code":""},{"path":"/news/index.html","id":"internal-0-9-0","dir":"Changelog","previous_headings":"","what":"Internal","title":"clustermq 0.9.0","text":"Complete rewrite worker API longer depend purrr 
package","code":""},{"path":"/news/index.html","id":"clustermq-0895","dir":"Changelog","previous_headings":"","what":"clustermq 0.8.95","title":"clustermq 0.8.95","text":"CRAN release: 2020-07-01 now using ZeroMQ via Rcpp preparation v0.9 (#151) New multiprocess backend via callr instead forking (#142, #197) Sending data sockets now blocking avoid excess memory usage (#161) multicore, multiprocess schedulers now support logging (#169) New option clustermq.host can specify host IP network interface name (#170) Template filling now raise error missing keys (#174, #198) Workers failing large common data improved (fixed?) (#146, #179, #191) Local connections now routed via 127.0.0.1 instead localhost (#192) Submit messages different local, multicore HPC (#196) Functions exported foreach now environment stripped (#200) Deprecation log_worker=T/F argument rescinded","code":""},{"path":"/news/index.html","id":"clustermq-089","dir":"Changelog","previous_headings":"","what":"clustermq 0.8.9","title":"clustermq 0.8.9","text":"CRAN release: 2020-02-29 New option clustermq.ssh.timeout SSH proxy startup (#157) @brendanf New option clustermq.worker.timeout delay worker shutdown (#188) Fixed PBS/Torque docs, template cleanup (#184, #186) @mstr3336 Warning common data large, set clustermq.data.warning (#189)","code":""},{"path":"/news/index.html","id":"clustermq-088","dir":"Changelog","previous_headings":"","what":"clustermq 0.8.8","title":"clustermq 0.8.8","text":"CRAN release: 2019-06-05 Q, Q_rows new arguments verbose (#111) pkgs (#144) foreach backend now uses dedicated API possible (#143, #144) Number size objects common calls now work properly Templates filled internally longer depend infuser package","code":""},{"path":"/news/index.html","id":"clustermq-087","dir":"Changelog","previous_headings":"","what":"clustermq 0.8.7","title":"clustermq 0.8.7","text":"CRAN release: 2019-04-15 Q now max_calls_worker argument avoid walltime (#110) Submission messages now list size common 
data (drake#800) default templates now optional cores per job field (#123) foreach now treats .export (#124) .combine (#126) correctly New option clustermq.error.timeout wait clean shutdown (#134) SSH command now specified via template file (#122) SSH now forward errors local process (#135) Wiki deprecated, use https://mschubert.github.io/clustermq/ instead","code":""},{"path":"/news/index.html","id":"clustermq-086","dir":"Changelog","previous_headings":"","what":"clustermq 0.8.6","title":"clustermq 0.8.6","text":"CRAN release: 2019-02-22 Progress bar now shown workers start (#107) Socket connections now authenticated using session password (#125) Marked internal functions @keywords internal Added vignettes User Guide Technical Documentation","code":""},{"path":"/news/index.html","id":"clustermq-085","dir":"Changelog","previous_headings":"","what":"clustermq 0.8.5","title":"clustermq 0.8.5","text":"CRAN release: 2018-09-29 Added experimental support parallel foreach backend (#83) Moved templates package inst/ directory (#85) Added send_call worker evaluate arbitrary expressions (drake#501; #86) Option clustermq.scheduler now respected set package load (#88) System interrupts now handled correctly (rzmq#44; #73, #93, #97) Number workers running/total now shown progress bar (#98) Unqualified (short) host names now resolved default (#104)","code":""},{"path":"/news/index.html","id":"clustermq-084","dir":"Changelog","previous_headings":"","what":"clustermq 0.8.4","title":"clustermq 0.8.4","text":"CRAN release: 2018-04-22 Fix error qsys$reusable using n_jobs=0/local processing (#75) Scheduler-specific templates deprecated. 
Use clustermq.template instead Allow option clustermq.defaults fill default template values (#71) Errors worker processing now shut cleanly (#67) Progress bar now shows estimated time remaining (#66) Progress bar now also shown processing locally Memory summary now adds estimated memory R session (#69)","code":""},{"path":"/news/index.html","id":"clustermq-083","dir":"Changelog","previous_headings":"","what":"clustermq 0.8.3","title":"clustermq 0.8.3","text":"CRAN release: 2018-01-21 Support rettype function calls return type known (#59) Reduce memory requirements processing results receive Fix bug cleanup, log_worker flag working SGE/SLURM","code":""},{"path":"/news/index.html","id":"clustermq-082","dir":"Changelog","previous_headings":"","what":"clustermq 0.8.2","title":"clustermq 0.8.2","text":"CRAN release: 2017-11-30 Fix bug never-started jobs cleaned Fix bug tests leave processes port binding fails (#60) Multicore longer prints worker debug messages (#61)","code":""},{"path":"/news/index.html","id":"clustermq-081","dir":"Changelog","previous_headings":"","what":"clustermq 0.8.1","title":"clustermq 0.8.1","text":"CRAN release: 2017-11-27 Fix performance issues high number function calls (#56) Fix bug multicore workers shut properly (#58) Fix default templates SGE, LSF SLURM (misplaced quote)","code":""},{"path":"/news/index.html","id":"clustermq-080","dir":"Changelog","previous_headings":"","what":"clustermq 0.8.0","title":"clustermq 0.8.0","text":"CRAN release: 2017-11-11","code":""},{"path":"/news/index.html","id":"features-0-8-0","dir":"Changelog","previous_headings":"","what":"Features","title":"clustermq 0.8.0","text":"Templates changed: clustermq:::worker now takes master argument Creating workers now separated Q, enabling worker reuse (#45) Objects function environment must now exported explicitly (#47) Added multicore qsys using parallel package (#49) New function Q_rows using data.frame rows iterated arguments (#43) Job summary now report max memory 
reported gc (#18)","code":""},{"path":"/news/index.html","id":"bugfix-0-8-0","dir":"Changelog","previous_headings":"","what":"Bugfix","title":"clustermq 0.8.0","text":"Fix bug copies common_data collected gc slowly (#19)","code":""},{"path":"/news/index.html","id":"internal-0-8-0","dir":"Changelog","previous_headings":"","what":"Internal","title":"clustermq 0.8.0","text":"Messages master now processed threads (#42) Jobs now submitted array possible","code":""},{"path":"/news/index.html","id":"clustermq-070","dir":"Changelog","previous_headings":"","what":"clustermq 0.7.0","title":"clustermq 0.7.0","text":"CRAN release: 2017-08-28 Initial release CRAN","code":""}]