CRAN Task View: High-Performance and Parallel Computing with R

Maintainer: Dirk Eddelbuettel
Contact: Dirk.Eddelbuettel at R-project.org
Version: 2022-04-23
URL: https://CRAN.R-project.org/view=HighPerformanceComputing
Source: https://github.com/cran-task-views/HighPerformanceComputing/
Contributions: Suggestions and improvements for this task view are very welcome and can be made through issues or pull requests on GitHub or via e-mail to the maintainer address. For further details see the Contributing guide.
Citation: Dirk Eddelbuettel (2022). CRAN Task View: High-Performance and Parallel Computing with R. Version 2022-04-23. URL https://CRAN.R-project.org/view=HighPerformanceComputing.
Installation: The packages from this task view can be installed automatically using the ctv package. For example, ctv::install.views("HighPerformanceComputing", coreOnly = TRUE) installs all the core packages or ctv::update.views("HighPerformanceComputing") installs all packages that are not yet installed and up-to-date. See the CRAN Task View Initiative for more details.

This CRAN Task View contains a list of packages, grouped by topic, that are useful for high-performance computing (HPC) with R. In this context, we are defining ‘high-performance computing’ rather loosely as just about anything related to pushing R a little further: using compiled code, parallel computing (in both explicit and implicit modes), working with large objects as well as profiling.

Unless otherwise mentioned, all packages presented with hyperlinks are available from the Comprehensive R Archive Network (CRAN).

Several of the areas discussed in this Task View are undergoing rapid change. Please send suggestions for additions and extensions for this task view via e-mail to the maintainer or submit an issue or pull request in the GitHub repository linked above. See the Contributing page in the CRAN Task Views repo for details.

Suggestions and corrections by Achim Zeileis, Markus Schmidberger, Martin Morgan, Max Kuhn, Tomas Radivoyevitch, Jochen Knaus, Tobias Verbeke, Hao Yu, David Rosenberg, Marco Enea, Ivo Welch, Jay Emerson, Wei-Chen Chen, Bill Cleveland, Ross Boylan, Ramon Diaz-Uriarte, Mark Zeligman, Kevin Ushey, Graham Jeffries, Will Landau, Tim Flutre, Reza Mohammadi, Ralf Stubner, Bob Jansen, Matt Fidler, Brent Brewington and Ben Bolder (as well as others I may have forgotten to add here) are gratefully acknowledged.

The ctv package supports these Task Views. Its functions install.views and update.views allow, respectively, installation or update of packages from a given Task View; the option coreOnly can restrict operations to packages labeled as core below.

Direct support for parallel computing in R started with release 2.14.0, which includes a new package parallel incorporating (slightly revised) copies of the packages multicore and snow. Some types of clusters are not handled directly by the base package ‘parallel’. However, as explained in the package vignette, the parts of parallel that provide snow-like functions will accept snow clusters, including MPI clusters. Use vignette("parallel") to view the package vignette. The parallel package also supports multiple RNG streams following L’Ecuyer et al. (2002), for both mclapply and snow clusters. The version released for R 2.14.0 contains base functionality; higher-level convenience functions are planned for later R releases.
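As a minimal sketch of the snow-like and multicore-like interfaces in parallel, the following uses only functions shipped with base R. The worker count of 2, the seed 123, and the helper name sim_mean are illustrative assumptions and do not come from this task view.

```r
library(parallel)

# Hypothetical helper: mean of n standard-normal draws (i is only a task index)
sim_mean <- function(i, n = 1000) mean(rnorm(n))

# snow-like interface: start a local socket cluster with 2 workers
cl <- makeCluster(2)

# Give each worker its own L'Ecuyer-CMRG stream, seeded from iseed
clusterSetRNGStream(cl, iseed = 123)
res1 <- parLapply(cl, 1:4, sim_mean)

# Resetting the streams with the same seed repeats the same draws
clusterSetRNGStream(cl, iseed = 123)
res2 <- parLapply(cl, 1:4, sim_mean)

stopCluster(cl)
identical(res1, res2)

# multicore-like interface: mclapply forks the current process.
# Forking is unavailable on Windows, so fall back to one core there.
RNGkind("L'Ecuyer-CMRG")
set.seed(123)
cores <- if (.Platform$OS.type == "windows") 1L else 2L
res3 <- mclapply(1:4, sim_mean, mc.cores = cores)
```

The socket cluster works on all platforms, while mclapply only runs in parallel on Unix-alikes, which is why the sketch guards the core count.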

Parallel computing: Explicit parallelism

Parallel computing: Implicit parallelism

Parallel computing: Grid computing

Parallel computing: Hadoop

Parallel computing: Random numbers

Parallel computing: Resource managers and batch schedulers

Parallel computing: Applications

Parallel computing: GPUs

Large memory and out-of-memory data

Easier interfaces for Compiled code

Profiling tools

Packages profvis, proffer, profmem, GUIProfiler, proftools, and aprof summarize and visualize output from the Rprof interface for profiling. The profile package reads and writes profiling data and converts among file formats such as pprof by Google and Rprof. The xrprof command-line tool implements profile sampling for a given R process on Linux or Windows, and it can profile R code alongside compiled code.
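The packages above build on the output of base R's sampling profiler. A minimal sketch of producing and summarizing that output with Rprof and summaryRprof from the utils package is shown below; the workload function slow_sum, the iteration counts, and the 0.01 second sampling interval are hypothetical choices made only for illustration.

```r
# Hypothetical workload: a deliberately slow loop so the sampler has something to record
slow_sum <- function(n) {
  total <- 0
  for (i in seq_len(n)) total <- total + sqrt(i)
  total
}

# Write profiling samples to a temporary file, sampling every 0.01 seconds
prof_file <- tempfile(fileext = ".out")
Rprof(prof_file, interval = 0.01)
for (k in 1:20) slow_sum(1e6)
Rprof(NULL)  # stop profiling

# Summarize time spent per function (self time and total time)
prof <- summaryRprof(prof_file)
head(prof$by.self)
```

The file written by Rprof is plain text, so the same file can be passed on to the visualization packages listed above.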

CRAN packages

Core: Rmpi, snow.
Regular: aprof, arrow, batch, BatchExperiments, BatchJobs, batchtools, bcp, BDgraph, biglm, bigmemory, bigstatsr, bnlearn, caret, clustermq, data.table, dclone, disk.frame, doFuture, doMC, doMPI, doRNG, doSNOW, dqrng, drake, ff, flexiblas, flowr, foreach, future, gcbd, GUIProfiler, h2o, HistogramTools, inline, keras, LaF, latentnet, Matching, MonetDB.R, mvnfast, OpenCL, orloca, parSim, pbapply, pbdMPI, peperr, pls, proffer, profile, profmem, proftools, profvis, pvclust, qsub, randomForestSRC, Rborist, Rcpp, RcppParallel, Rdsm, reticulate, rgenoud, RhpcBLASctl, RInside, rJava, rlecuyer, RProtoBuf, rslurm, rstream, Sim.DiffProc, sitmo, snowfall, snowFT, speedglm, sqldf, ssgraph, STAR, targets, tensorflow, tfestimators, tm, varSelRF, xgboost.
Archived: ffbase, HadoopStreaming.

Related links

Other resources