
mirai - Minimalist Async Evaluation Framework for R
Evaluates R expressions asynchronously and in parallel, locally or distributed across networks. An official parallel cluster type for R. Built on 'nanonext' and 'NNG', its non-polling, event-driven architecture scales from a laptop to thousands of processes across high-performance computing clusters and cloud platforms. Features FIFO scheduling with task cancellation and bounded queues, promises for reactive programming, 'OpenTelemetry' distributed tracing, and custom serialization for cross-language data types.
Last updated
asyncasynchronous-tasksconcurrencydistributed-computinghigh-performance-computingparallel-computing
16.06 score 323 stars 162 dependents 524 scripts 77k downloads
targets - Dynamic Function-Oriented 'Make'-Like Declarative Pipelines
Pipeline tools coordinate the pieces of computationally demanding analysis projects. The 'targets' package is a 'Make'-like pipeline tool for statistics and data science in R. The package skips costly runtime for tasks that are already up to date, orchestrates the necessary computation with implicit parallel computing, and abstracts files as R objects. If all the current output matches the current upstream code and data, then the whole pipeline is up to date, and the results are more trustworthy than otherwise. The methodology in this package borrows from GNU 'Make' (2015, ISBN:978-9881443519) and 'drake' (2018, <doi:10.21105/joss.00550>).
Last updated
data-sciencehigh-performance-computingmakepeer-reviewedpipeliner-targetopiareproducibilityreproducible-researchtargetsworkflow
15.49 score 1.1k stars 25 dependents 7.4k scripts 17k downloads
targets - Dynamic Function-Oriented 'Make'-Like Declarative Pipelines
Pipeline tools coordinate the pieces of computationally demanding analysis projects. The 'targets' package is a 'Make'-like pipeline tool for statistics and data science in R. The package skips costly runtime for tasks that are already up to date, orchestrates the necessary computation with implicit parallel computing, and abstracts files as R objects. If all the current output matches the current upstream code and data, then the whole pipeline is up to date, and the results are more trustworthy than otherwise. The methodology in this package borrows from GNU 'Make' (2015, ISBN:978-9881443519) and 'drake' (2018, <doi:10.21105/joss.00550>).
Last updated
data-sciencehigh-performance-computingmakepeer-reviewedpipeliner-targetopiareproducibilityreproducible-researchtargetsworkflow
15.45 score 1.1k stars 25 dependents 7.3k scripts 18k downloads
nanonext - Lightweight Toolkit for Messaging, Concurrency and the Web
R binding for NNG (Nanomsg Next Gen), a successor to ZeroMQ. A toolkit for messaging, concurrency and the web. High-performance socket messaging over in-process, IPC, TCP, WebSocket and secure TLS transports implements 'Scalability Protocols', a standard for common communications patterns including publish/subscribe, request/reply and survey. A threaded concurrency framework with intuitive 'aio' objects that resolve automatically upon completion of asynchronous operations, and synchronisation primitives that allow R to wait on events signalled by concurrent threads. A unified HTTP server hosting REST endpoints, WebSocket connections and streaming on a single port, with a built-in HTTP client.
Last updated
concurrencyhttp-clienthttp-serveripc-messagemessaging-libraryrpcsocket-communicationsynchronization-primitivestcp-protocoltlswebsocketmbedtls
13.77 score 82 stars 170 dependents 76 scripts 82k downloads
tarchetypes - Archetypes for Targets
Function-oriented Make-like declarative pipelines for Statistics and data science are supported in the 'targets' R package. As an extension to 'targets', the 'tarchetypes' package provides convenient user-side functions to make 'targets' easier to use. By establishing reusable archetypes for common kinds of targets and pipelines, these functions help express complicated reproducible pipelines concisely and compactly. The methods in this package were influenced by the 'targets' R package. by Will Landau (2018) <doi:10.21105/joss.00550>.
Last updated
data-sciencehigh-performance-computingpeer-reviewedpipeliner-targetopiareproducibilitytargetsworkflow
11.63 score 151 stars 12 dependents 3.4k scripts 6.5k downloads
crew - A Distributed Worker Launcher Framework
In computationally demanding analysis projects, statisticians and data scientists asynchronously deploy long-running tasks to distributed systems, ranging from traditional clusters to cloud services. The 'NNG'-powered 'mirai' R package by Gao (2023) <doi:10.5281/zenodo.7912722> is a sleek and sophisticated scheduler that efficiently processes these intense workloads. The 'crew' package extends 'mirai' with a unifying interface for third-party worker launchers. Inspiration also comes from packages. 'future' by Bengtsson (2021) <doi:10.32614/RJ-2021-048>, 'rrq' by FitzJohn and Ashton (2023) <https://github.com/mrc-ide/rrq>, 'clustermq' by Schubert (2019) <doi:10.1093/bioinformatics/btz284>), and 'batchtools' by Lang, Bischel, and Surmann (2017) <doi:10.21105/joss.00135>.
Last updated
high-performance-computing
11.63 score 151 stars 3 dependents 524 scripts 3.8k downloads
tarchetypes - Archetypes for Targets
Function-oriented Make-like declarative pipelines for Statistics and data science are supported in the 'targets' R package. As an extension to 'targets', the 'tarchetypes' package provides convenient user-side functions to make 'targets' easier to use. By establishing reusable archetypes for common kinds of targets and pipelines, these functions help express complicated reproducible pipelines concisely and compactly. The methods in this package were influenced by the 'targets' R package. by Will Landau (2018) <doi:10.21105/joss.00550>.
Last updated
data-sciencehigh-performance-computingpeer-reviewedpipeliner-targetopiareproducibilitytargetsworkflow
11.63 score 150 stars 13 dependents 3.1k scripts 6.5k downloadsdrake - A Pipeline Toolkit for Reproducible Computation at Scale
A general-purpose computational engine for data analysis, drake rebuilds intermediate data objects when their dependencies change, and it skips work when the results are already up to date. Not every execution starts from scratch, there is native support for parallel and distributed computing, and completed projects have tangible evidence that they are reproducible. Extensive documentation, from beginner-friendly tutorials to practical examples and more, is available at the reference website <https://docs.ropensci.org/drake/> and the online manual <https://books.ropensci.org/drake/>.
Last updated
data-sciencedrakehigh-performance-computingmakefilepeer-reviewedpipelinereproducibilityreproducible-researchropensciworkflow
10.69 score 1.3k stars 1 dependents 1.7k scripts 1.3k downloadsbrms.mmrm - Bayesian MMRMs using 'brms'
The mixed model for repeated measures (MMRM) is a popular model for longitudinal clinical trial data with continuous endpoints, and 'brms' is a powerful and versatile package for fitting Bayesian regression models. The 'brms.mmrm' R package leverages 'brms' to run MMRMs, and it supports a simplified interfaced to reduce difficulty and align with the best practices of the life sciences. References: Bürkner (2017) <doi:10.18637/jss.v080.i01>, Mallinckrodt (2008) <doi:10.1177/009286150804200402>.
Last updated
brmslife-sciencesmc-stanmmrmstanstatistics
8.26 score 24 stars 17 scripts 722 downloads
stantargets - Targets for Stan Workflows
Bayesian data analysis usually incurs long runtimes and cumbersome custom code. A pipeline toolkit tailored to Bayesian statisticians, the 'stantargets' R package leverages 'targets' and 'cmdstanr' to ease these burdens. 'stantargets' makes it super easy to set up scalable Stan pipelines that automatically parallelize the computation and skip expensive steps when the results are already up to date. Minimal custom code is required, and there is no need to manually configure branching, so usage is much easier than 'targets' alone. 'stantargets' can access all of 'cmdstanr''s major algorithms (MCMC, variational Bayes, and optimization) and it supports both single-fit workflows and multi-rep simulation studies. For the statistical methodology, please refer to 'Stan' documentation (Stan Development Team 2020) <https://mc-stan.org/>.
Last updated
bayesianhigh-performance-computingmaker-targetopiareproducibilitystanstatisticstargets
6.85 score 50 stars 237 scripts
jagstargets - Targets for JAGS Pipelines
Bayesian data analysis usually incurs long runtimes and cumbersome custom code. A pipeline toolkit tailored to Bayesian statisticians, the 'jagstargets' R package is leverages 'targets' and 'R2jags' to ease this burden. 'jagstargets' makes it super easy to set up scalable JAGS pipelines that automatically parallelize the computation and skip expensive steps when the results are already up to date. Minimal custom code is required, and there is no need to manually configure branching, so usage is much easier than 'targets' alone. For the underlying methodology, please refer to the documentation of 'targets' <doi:10.21105/joss.02959> and 'JAGS' (Plummer 2003) <https://www.r-project.org/conferences/DSC-2003/Proceedings/Plummer.pdf>.
Last updated
bayesianhigh-performance-computingjagsmaker-targetopiareproducibilityrjagsstatisticstargetscpp
6.25 score 11 stars 40 scripts 681 downloads
gittargets - Data Version Control for the Targets Package
In computationally demanding data analysis pipelines, the 'targets' R package (2021, <doi:10.21105/joss.02959>) maintains an up-to-date set of results while skipping tasks that do not need to rerun. This process increases speed and increases trust in the final end product. However, it also overwrites old output with new output, and past results disappear by default. To preserve historical output, the 'gittargets' package captures version-controlled snapshots of the data store, and each snapshot links to the underlying commit of the source code. That way, when the user rolls back the code to a previous branch or commit, 'gittargets' can recover the data contemporaneous with that commit so that all targets remain up to date.
Last updated
data-sciencedata-version-controldata-versioningreproducibilityreproducible-researchtargetsworkflow
6.03 score 89 stars 12 scripts 677 downloads
pmrm - Progression Models for Repeated Measures
A progression model for repeated measures (PMRM) is a continuous-time nonlinear mixed-effects model for longitudinal clinical trials in progressive diseases. Unlike mixed models for repeated measures (MMRMs), which estimate treatment effects as linear combinations of additive effects on the outcome scale, PMRMs characterize treatment effects in terms of the underlying disease trajectory. This framing yields clinically interpretable quantities such as average time saved and percent reduction in decline due to treatment. This package implements frequentist PMRMs by Raket (2022) <doi:10.1002/sim.9581> using 'RTMB' by Kristensen (2016) <doi:10.18637/jss.v070.i05>.
Last updated
adcompdisease-progression-modelmmrmpmrmrtmbtmb
5.80 score 6 stars 2 scripts 507 downloadsrfacts - R Interface to 'FACTS' on Unix-Like Systems
The 'rfacts' package is an R interface to the Fixed and Adaptive Clinical Trial Simulator ('FACTS') on Unix-like systems. It programmatically invokes 'FACTS' to run clinical trial simulations, and it aggregates simulation output data into tidy data frames. These capabilities provide end-to-end automation for large-scale simulation pipelines, and they enhance computational reproducibility. For more information on 'FACTS' itself, please visit <https://www.berryconsultants.com/software/>.
Last updated
clinical-trialsfactssimulation
5.08 score 8 stars 10 scripts 188 downloads
crew.aws.batch - A Crew Launcher Plugin for AWS Batch
In computationally demanding analysis projects, statisticians and data scientists asynchronously deploy long-running tasks to distributed systems, ranging from traditional clusters to cloud services. The 'crew.aws.batch' package extends the 'mirai'-powered 'crew' package with a worker launcher plugin for AWS Batch. Inspiration also comes from packages 'mirai' by Gao (2023) <https://github.com/r-lib/mirai>, 'future' by Bengtsson (2021) <doi:10.32614/RJ-2021-048>, 'rrq' by FitzJohn and Ashton (2023) <https://github.com/mrc-ide/rrq>, 'clustermq' by Schubert (2019) <doi:10.1093/bioinformatics/btz284>), and 'batchtools' by Lang, Bischl, and Surmann (2017). <doi:10.21105/joss.00135>.
Last updated
aws-batchcrewhigh-performance-computing
4.80 score 18 stars 9 scripts 651 downloadstest - Codeberg R Package Test
Codeberg R Package Test. Not a serious project.
Last updated
2.49 score 306 scripts



