CRAN Task View: Package Development
Do not edit this README by hand. See CONTRIBUTING.md.
|Maintainer:||Thomas J. Leeper|
|Contact:||thosjleeper at gmail.com|
Packages provide a mechanism for loading optional code, data, and documentation as needed. At the very minimum only a text editor and an R installation are needed for package creation. Nonetheless many useful tools and R packages themselves have been provided to ease or improve package development. This Task View focuses on these tools/R packages, grouped by topics.
The main reference for package development is the “Writing R Extensions” manual. For further documentation and tutorials, see the “Related links” section below.
If you think that some packages or tools are missing from the list, feel free to e-mail me (thosjleeper at gmail dot com) or contribute directly to the Task View by submitting a pull request on GitHub. Many thanks to Christopher Gandrud, Christophe Dutang, Darren Norris, Dirk Eddelbuettel, Gabor Grothendieck, Gregory Jefferis, John Maindonald, Luca Braglia, Spencer Graves, Tobias Verbeke, and the R-core team for contributions.
Searching for Existing Packages
Before starting a new package, it’s worth searching for packages that are already available, both from a developer’s standpoint (“do not reinvent the wheel”) and from a user’s (many packages implementing the same or similar procedures can be confusing). If a package addressing the same functionality already exists, consider contributing to it instead of starting a new one.
- utils::RSiteSearch() searches for keywords/phrases in help pages (all CRAN packages except those for Windows only, and some from Bioconductor), vignettes, or task views, using the search engine at http://search.r-project.org/. A convenient wrapper around RSiteSearch that adds hit ranking is the findFn() function from sos.
- RSeek allows searching for keywords/phrases in books, task views, support lists, functions/packages, blogs, etc.
- Rdocumentation allows searching for keywords/phrases in help pages for all CRAN and some Bioconductor/GitHub packages.
- Crantastic! maintains an up-to-date and tagged directory of packages on CRAN. The Managed R Archive Network from Revolution Analytics is a CRAN mirror that additionally provides visualizations of package dependency trees.
- http://www.r-pkg.org/ is an unofficial CRAN mirror that provides a relatively complete archive of packages and read-only access to package sources on GitHub.
- CRANberries provides a feed of new, updated, and removed packages for CRAN.
- If you’re looking to create a package but want ideas for what sorts of packages are in demand, rOpenSci maintains a wishlist for science-related packages and a TODO list of web services and data APIs in need of packaging.
Initializing an R package
- utils::package.skeleton() automates some of the setup for a new source package. It creates directories, saves the functions, data, and R code files provided to appropriate places, and creates skeleton help files and a Read-and-delete-me file describing further steps in packaging.
- kitten() from pkgKitten allows one to specify the main DESCRIPTION entries and doesn’t create source code and data files from global environment objects or sourced files. It’s used to initialize a simple package that passes R CMD check cleanly.
- create() from devtools is similar to package.skeleton except that it allows one to specify DESCRIPTION entries and doesn’t create source code and data files from global environment objects or sourced files.
- mason provides a fun, interactive tool for creating a package based on a variety of inputs.
- Rcpp.package.skeleton() from Rcpp adds C++ handling via Rcpp to package.skeleton, e.g. by modifying NAMESPACE accordingly, creating examples if needed, and allowing the user to specify (with a character vector of paths) which C++ files to include in the src directory. Finally the user can decide main
- mvbutils provides a variety of useful functions for development which include tools for managing and analyzing the development environment, auto-generating certain function types, and visualizing a function dependency graph. pagerank (not on CRAN) can calculate a package’s PageRank from its dependency graph.
- swagger (not on CRAN) uses the Swagger JSON web service API specification to automatically generate an R client package for a web service API.
When initializing a package, it is worth considering how it should be licensed. CRAN provides a list of the most commonly used software licences for R packages. osi (GitHub) provides a more comprehensive list in a standardized format.
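As a rough sketch of the utils::package.skeleton() workflow described in this section, the following creates a throwaway source package in a temporary directory. The package name demoPkg and the function add_one are hypothetical placeholders, not part of any real package.

```r
# Hypothetical function to be packaged; it is kept in its own environment so
# the sketch does not depend on what happens to be in the global environment.
src_env <- new.env()
src_env$add_one <- function(x) x + 1

# Create the skeleton in a temporary directory rather than the working directory.
pkg_root <- tempfile("skeleton-")
dir.create(pkg_root)
utils::package.skeleton(
  name = "demoPkg",        # hypothetical package name
  list = "add_one",        # names of the objects to include
  environment = src_env,   # where those objects are looked up
  path = pkg_root
)

# List what was generated (DESCRIPTION, NAMESPACE, R/, man/, Read-and-delete-me).
created <- list.files(file.path(pkg_root, "demoPkg"), recursive = TRUE)
print(created)
```

The generated DESCRIPTION and help files contain placeholders that still need to be edited by hand before the package is ready for R CMD check.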
R is foremost a functional programming language with dynamic typing, but has three built-in forms of object-oriented programming as well as additional object-oriented paradigms available in add-on packages.
- The built-in S3 classes rely on method dispatch, wherein a generic function (e.g., summary) employs a distinct method for an object of a given class (i.e., it is possible to implement class-specific methods for a given generic function). If a package implements new object classes, it is common to implement methods for commonly used generics such as summary, etc. These methods must be registered in the package’s NAMESPACE file. R.methodsS3 aims to simplify the creation of S3 generic functions and S3 methods.
- S4 is a more formalized form of object orientation that is available through methods. S4 classes have formal definitions and can dispatch methods based on multiple arguments (not just the first argument, as in S3). S4 is notable for its use of the @ symbol to extract slots from S4 objects. John Chambers’s “How S4 Methods Work” tutorial may serve as a useful introduction.
- Reference classes were introduced in R 2.12.0 and are also part of methods. They offer a distinct paradigm from S3 and S4 because reference class objects are mutable and methods belong to objects, not generic functions.
- aoos and R.oo are other packages facilitating object-oriented programming. R6 (Github) provides an alternative to reference classes without a dependency on
- proto provides a prototype-based object-oriented programming paradigm.
- rtype provides a strong type system.
- argufy (not on CRAN) provides a syntax for creating functions with strictly typed arguments, among other possible checks.
- lambda.r, lambdaR (not on CRAN), and purrr provide interfaces for creating lambda (anonymous) functions.
- functools (GitHub) provides higher-order functions (Map, Reduce, etc.) common in functional programming.
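To make the differences between the three built-in object systems concrete, here is a minimal sketch using only base R and methods. The class names temperature, Interval, and Counter and the generic overlaps are hypothetical, chosen only for illustration.

```r
library(methods)

# S3: dispatch is based on the class attribute of the first argument.
temperature <- function(celsius) {
  structure(list(celsius = celsius), class = "temperature")
}
summary.temperature <- function(object, ...) {
  c(n = length(object$celsius), mean = mean(object$celsius))
}
s3_result <- summary(temperature(c(10, 20, 30)))

# S4: formal class definition; dispatch can use more than one argument.
setClass("Interval", slots = c(lower = "numeric", upper = "numeric"))
setGeneric("overlaps", function(a, b) standardGeneric("overlaps"))
setMethod("overlaps", signature("Interval", "Interval"), function(a, b) {
  a@lower <= b@upper && b@lower <= a@upper  # slots are extracted with @
})
s4_result <- overlaps(new("Interval", lower = 0, upper = 5),
                      new("Interval", lower = 3, upper = 8))

# Reference classes: objects are mutable and methods belong to the object.
Counter <- setRefClass("Counter",
  fields = list(n = "numeric"),
  methods = list(
    increment = function() {
      n <<- n + 1
      invisible(.self)
    }
  )
)
cnt <- Counter$new(n = 0)
alias <- cnt         # no copy is made; both names refer to the same object
alias$increment()
ref_result <- cnt$n  # the change made through `alias` is visible here
```

In a package, the S3 method would additionally be registered in NAMESPACE, as noted above; in an interactive session, defining it in the calling environment is enough for dispatch to find it.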
Another feature of R is the ability to rely on both standard and non-standard evaluation of function arguments. Non-standard evaluation is seen in commonly used functions like subset and can also be used in packages.
- substitute() provides the most straightforward interface to non-standard evaluation of function arguments.
- lazyeval (Github) aims to help developers design packages with parallel function implementations that follow both standard and non-standard evaluation.
- An increasingly popular form of non-standard evaluation involves chained expressions or “pipelines”. magrittr provides the %>% chaining operator that passes the results of one expression evaluation to the next expression in the chain, as well as other similar piping operators. pipeR offers a larger set of pipe operators. assertr and ensurer provide (fairly similar) testing frameworks for pipelines.
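As an illustration of non-standard evaluation with substitute(), the sketch below pairs a non-standard-evaluation function with a standard-evaluation counterpart, the kind of parallel design mentioned above. The function names filter_rows and filter_rows_ are hypothetical and use base R only.

```r
# Standard-evaluation version: takes an already-quoted expression, which makes
# it straightforward to call from other functions.
filter_rows_ <- function(data, expr, env = parent.frame()) {
  keep <- eval(expr, data, env)  # columns of `data` are looked up first
  data[!is.na(keep) & keep, , drop = FALSE]
}

# Non-standard-evaluation version: `condition` is captured unevaluated with
# substitute() and handed to the standard-evaluation version.
filter_rows <- function(data, condition) {
  expr <- substitute(condition)
  env <- parent.frame()
  filter_rows_(data, expr, env = env)
}

df <- data.frame(x = 1:5, y = c("a", "b", "a", "b", "a"))
nse_result <- filter_rows(df, x > 2 & y == "a")
se_result  <- filter_rows_(df, quote(x > 2 & y == "a"))
```

Both calls select the same rows; the second form is the one that remains convenient when the condition has to be built programmatically.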
Packages that have dependencies on other packages need to be vigilant of changes to the functionality, behaviour, or API of those packages.
- packrat (Github) provides facilities for creating local package repositories to manage and check dependencies.
- checkpoint relies on the Revolution Analytics MRAN repository to access packages from specified dates.
- pacman (Github) can install, uninstall, load, and unload various versions of packages from CRAN and Github.
- GRANBase (GitHub) provides some sophisticated tools for managing dependencies and testing packages conditional on changes.
- Two packages currently provide non-standard ways to import objects from packages (e.g., to assign those objects names different from the ones used in their host packages). import (Github) can import numerous objects from a namespace and assign arbitrary names. modules (not on CRAN) provides functionality for importing alternative non-package code from Python-like “modules”.
- functionMap provides a visualization tool useful for understanding function dependencies within and across packages. atomize can quickly extract functions from within a package into their own package.
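One lightweight way to stay alert to changes in a dependency, independent of the packages listed above, is to check the installed version at run time. The helper below is a hypothetical sketch in base R, not a function from any of those packages.

```r
# Hypothetical helper: stop with an informative error unless `pkg` is installed
# in at least version `minimum`.
require_version <- function(pkg, minimum) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    stop(sprintf("Package '%s' is required but not installed.", pkg),
         call. = FALSE)
  }
  installed <- utils::packageVersion(pkg)
  if (installed < minimum) {
    stop(sprintf("Package '%s' >= %s is required; found %s.",
                 pkg, minimum, as.character(installed)),
         call. = FALSE)
  }
  invisible(installed)
}

# stats ships with R, so this check is expected to pass.
stats_version <- require_version("stats", "3.0.0")
```

Version constraints in the DESCRIPTION file remain the primary mechanism; a run-time check of this kind is mainly useful for optional dependencies.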
Foreign Languages Interfaces
- inline eases adding code in C, C++, or Fortran to R. It takes care of the compilation, linking and loading of embedded code segments that are stored as R strings.
- Rcpp offers a number of C++ classes that make transferring R objects to C++ functions (and back) easier. RInside provides C++ classes for embedding R within C++ applications.
- rGroovy integrates with the Groovy scripting language.
- rJava provides a low-level interface to Java similar to the .Call interface for C and C++. helloJavaWorld provides an example rJava-based package. jvmr (archived on CRAN) provides a bi-directional interface to Java, Scala, and related languages, while rscala is designed specifically for Scala.
- rustr provides bindings to Rust.
- reach (not on CRAN) and matlabr provide rough interfaces to Matlab.
- rPython, rJython, PythonInR, rpy2 (not on CRAN), and SnakeCharmR (not on CRAN) provide interfaces to Python.
- RJulia (not on CRAN) provides an interface with Julia. RCall embeds R within Julia.
- RStata is an interface with Stata. RCall embeds R in Stata.
- tcltk, which is a package built in to R, provides a general interface to Tcl, which is especially useful for accessing Tcl/tk (for graphical interfaces). after (GitHub) uses tcltk to run R code in a separate event loop.
The knitr package, which supplies various foreign language engines, can also be used to generate documents that call python, awk, ruby, haskell, bash, perl, dot, tikz, sas, coffeescript, and polyglot.
Writing packages that involve compiled code requires a developer toolchain. If developing on Windows, this requires Rtools, which is updated with each R minor release.
- log4r (Github) and logging provide logging functionality in the style of log4j.
- loggr (not on CRAN) aims to provide a simplified logging interface without the need for
- rollbar reports messages and errors to Rollbar, a web service.
- rchk provides tools for identifying memory-protection bugs in C code, including base R and packages.
Code Analysis and Formatting
- codetools provides a number of low-level functions for identifying possible problems with source code.
- lint and lintr provide tools for checking source code compliance with a style guide.
- formatR and rfmt (not on CRAN) can be used to neatly format source code.
- FuncMap provides a graphical representation of function calls used in a package.
- Profiling data is provided by utils::Rprof() and can be summarized by utils::summaryRprof(). prof.tree (GitHub) provides an alternative output data structure to
- profr can visualize output from the Rprof interface for profiling.
- proftools and aprof can also be used to analyse profiling output.
- profvis (not on CRAN) provides an interactive, graphical interface for examining profile results.
- lineprof (not on CRAN) provides a visualization tool for examining profiling results.
- Rperform (not on CRAN) compares package performance across different git versions and branches.
- base::system.time() is a basic timing utility that calculates times based on one iteration of an expression.
- microbenchmark and rbenchmark provide timings based on multiple iterations of an expression and potentially provide more reliable timings than a single-iteration measurement.
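The difference between a single timing and repeated timings can be sketched in base R alone. The function slow_sum is a hypothetical workload, and the replicate() loop only approximates what the dedicated benchmarking packages do with more care.

```r
# Hypothetical workload: a deliberately naive summation loop.
slow_sum <- function(n) {
  total <- 0
  for (i in seq_len(n)) total <- total + i
  total
}

# One iteration: a single elapsed-time measurement, sensitive to noise.
single_run <- system.time(slow_sum(1e5))[["elapsed"]]

# Many iterations: repeat the measurement and summarize the distribution.
timings <- replicate(20, system.time(slow_sum(1e5))[["elapsed"]])
timing_summary <- c(min = min(timings),
                    median = stats::median(timings),
                    max = max(timings))
print(timing_summary)
```

Reporting the median (or minimum) over repeated runs reduces the influence of one-off delays such as garbage collection.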
- Packages should pass all basic code and documentation checks provided by the R CMD check quality assurance tools built in to R. rcmdcheck provides programmatic access to R CMD check from within R and callr (not on CRAN) provides a generic interface for calling R from within R.
- R documentation files can contain demonstrative examples of package functionality. Complete testing of correct package performance is better reserved for the tests directory. Several packages provide testing functionality, including RUnit, svUnit, testit (not on CRAN), testthat, testthatsomemore, and pkgmaker. runittotestthat provides utilities for converting existing RUnit tests to testthat tests.
- assertive, assertr, checkmate, ensurer, and assertthat provide test-like functions for use at run-time or in examples that will trigger messages, warnings, or errors if an R object differs from what is expected by the user or developer.
- covr and testCoverage (not on CRAN) offer utilities for monitoring how well tests cover a package’s source code. These can be complemented by services such as Codecov or Coveralls that provide web interfaces for assessing code coverage.
- withr (GitHub) provides functions to evaluate code within a temporarily modified global state, which may be useful for unit testing, debugging, or package development.
- The revdep_check() function from devtools can be used to test reverse package dependencies to ensure code changes have not affected downstream package functionality. crandalf (not on CRAN) provides an alternative mechanism for testing reverse dependencies.
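Without any add-on package, run-time assertions and a plain test script can be sketched with stopifnot() and tryCatch(). The function scale01 is hypothetical; the lines after its definition are the kind of content that could go into a script in a package's tests directory.

```r
# Hypothetical function under test, with a run-time assertion on its input.
scale01 <- function(x) {
  stopifnot(is.numeric(x), length(x) > 0)
  rng <- range(x, na.rm = TRUE)
  (x - rng[1]) / (rng[2] - rng[1])
}

# Test 1: known input gives the expected output (up to numerical tolerance).
scaled <- scale01(c(2, 4, 6))
stopifnot(isTRUE(all.equal(scaled, c(0, 0.5, 1))))

# Test 2: invalid input is rejected with an error rather than a wrong answer.
bad_input <- tryCatch(scale01("a"), error = function(e) e)
stopifnot(inherits(bad_input, "error"))
```

R CMD check runs the scripts in that directory and reports a failure if any of them stops with an error, which is what makes a bare stopifnot() sufficient for a first test.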
Internationalization and Localization
- There is no standard mechanism for translation of package documentation into languages other than English. Creating non-English documentation requires manual creation of supplemental .Rd files or package vignettes. Packages supplying non-English documentation should include a Language field in the DESCRIPTION file.
- R provides useful features for the localization of diagnostic messages, warnings, and errors from functions at both the C and R levels based on GNU gettext. “Translating R Messages” describes the process of creating and installing message translations.
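At the R level, translatable messages are usually produced with gettext(), gettextf(), or ngettext(). The sketch below assumes a hypothetical package called demoPkg, whose R-level messages would by convention belong to the translation domain "R-demoPkg".

```r
# Hypothetical message helper: picks the singular or plural form for `n`
# and then fills in the number.
report_files <- function(n) {
  msg <- ngettext(n, "Found %d file", "Found %d files", domain = "R-demoPkg")
  sprintf(msg, n)
}

one_file    <- report_files(1)
three_files <- report_files(3)

# gettextf() combines message lookup and sprintf()-style formatting.
open_error <- gettextf("Cannot open '%s'", "data.csv", domain = "R-demoPkg")
```

Because no translation catalog is installed for this hypothetical domain, the English strings are returned unchanged; with a catalog in place, the same calls would return the translated text.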
Creating Graphical Interfaces
- For simple interactive interfaces, readline() can be used to create a simple prompt. getPass provides cross-platform mechanisms for securely requesting user input without displaying the input (e.g., for passwords). utils::select.list() can provide graphical and console-based selection of items from a list, and utils::txtProgressBar() provides a simple text progress bar.
- tcltk is an R base package that provides a large set of tools for creating interfaces using Tcl/tk (most functions are thin wrappers around corresponding Tcl and tk functions), though the documentation is sparse. tcltk2 provides additional widgets and functionality. qtbase provides bindings for Qt. RGtk (not on CRAN) provides bindings for Gtk and gnome. gWidgets2 offers a language-independent API for building graphical user interfaces in Gtk, Qt, or Tcl/tk.
- fgui can create a Tcl/tk interface for any arbitrary function.
- shiny provides a browser-based infrastructure for creating dashboards and interfaces for R functionality. htmlwidgets is a shiny enhancement that provides a framework for creating HTML widgets.
- progress (Github) offers progress bars for the terminal, including a C++ API.
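The base R tools mentioned at the start of this list can be combined as in the following sketch. The helper ask_or_default is hypothetical; it falls back to a default value when R is not running interactively, since readline() cannot wait for input in that case.

```r
# Hypothetical helper: prompt the user when possible, otherwise use a default.
ask_or_default <- function(prompt, default) {
  if (!interactive()) return(default)
  answer <- readline(prompt)
  if (nzchar(answer)) answer else default
}

n_steps <- as.integer(ask_or_default("How many steps? ", "5"))

# A simple text progress bar that is advanced once per step.
pb <- utils::txtProgressBar(min = 0, max = n_steps, style = 3)
for (i in seq_len(n_steps)) {
  Sys.sleep(0.01)  # stand-in for real work
  utils::setTxtProgressBar(pb, i)
}
final_value <- utils::getTxtProgressBar(pb)
close(pb)
```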
Command Line Argument Parsing
- Several packages provide functionality for parsing command line arguments: argparse, argparser, commandr, docopt, GetoptLong, and optigrab.
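For very small scripts, base R's commandArgs() may be enough. The parser below is a deliberately simple, hypothetical sketch that handles only --name=value options, --flag switches, and positional arguments; the packages above cover help text, types, and validation.

```r
# Hypothetical minimal parser for arguments passed after the script name.
parse_args <- function(args = commandArgs(trailingOnly = TRUE)) {
  opts <- list()
  for (arg in args) {
    if (grepl("^--[^=]+=", arg)) {
      # Option of the form --name=value
      key <- sub("^--([^=]+)=.*$", "\\1", arg)
      opts[[key]] <- sub("^--[^=]+=", "", arg)
    } else if (grepl("^--.", arg)) {
      # Boolean switch of the form --flag
      opts[[sub("^--", "", arg)]] <- TRUE
    } else {
      # Anything else is collected as a positional argument
      opts$positional <- c(opts$positional, arg)
    }
  }
  opts
}

# Passing the arguments explicitly keeps the example independent of how the
# script was started.
parsed <- parse_args(c("--input=data.csv", "--verbose", "extra"))
```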
Using Options in Packages
- pkgconfig (GitHub) allows developers to set package-specific options, which will not affect options set or used by other packages.
Writing Package Documentation
Package documentation is written in a TeX-like format as .Rd files that are stored in the man subdirectory of a package. These files are compiled to plain text, HTML, or PDF by R as needed.
- One can write .Rd files directly. A popular alternative is to rely on roxygen2, which uses special markup in R source files to generate documentation files before a package is built. This functionality is provided by devtools::document(). roxygen2 eliminates the need to learn some of the formatting requirements of an .Rd file at the cost of adding a step to the development process (the need to roxygenise before calling R CMD build).
- Rd2roxygen can convert existing .Rd files to roxygen source documentation, facilitating the conversion of existing documentation to a roxygen workflow.
- inlinedocs and documair provide further alternative documentation schemes based on source code commenting.
- tools::parse_Rd() can be used to manipulate the contents of an .Rd file.
- tools::checkRd() is useful for validating an .Rd file. Duncan Murdoch’s “Parsing Rd files” tutorial is a useful reference for advanced use of R documentation. Rdpack provides additional tools for manipulating documentation files.
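To show what a hand-written .Rd file looks like and how the tools functions above operate on it, the sketch below writes a minimal help page for a hypothetical function add_one to a temporary file, then parses, checks, and renders it.

```r
# A minimal, hand-written Rd file for a hypothetical function add_one().
rd_lines <- c(
  "\\name{add_one}",
  "\\alias{add_one}",
  "\\title{Add One to a Number}",
  "\\description{Adds one to each element of a numeric vector.}",
  "\\usage{add_one(x)}",
  "\\arguments{\\item{x}{A numeric vector.}}",
  "\\value{A numeric vector of the same length as \\code{x}.}"
)
rd_file <- tempfile(fileext = ".Rd")
writeLines(rd_lines, rd_file)

# Parse the file into an R object and list its top-level sections.
rd <- tools::parse_Rd(rd_file)
top_level_tags <- vapply(rd, function(el) attr(el, "Rd_tag"), character(1))
print(setdiff(unique(top_level_tags), "TEXT"))

# Validate the file; any problems found are printed.
problems <- tools::checkRd(rd_file)
print(problems)

# Render the parsed documentation as plain text.
tools::Rd2txt(rd)
```

With roxygen2, the same sections would instead be written as special comments above the function definition and the .Rd file would be generated from them.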
Package vignettes provide additional documentation of package functionality that is not tied to a specific function (as in an .Rd file). Historically, vignettes were used to explain the statistical or computational approach taken by a package in an article-like format that would be rendered as a PDF document using Sweave. Since R 3.0.0, non-Sweave vignette engines have also been supported, including knitr, which can produce Sweave-like PDF vignettes but can also support HTML vignettes that are written in R-flavored markdown. To use a non-Sweave vignette engine, the vignette needs to start with a code block indicating the package and function to be used.
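The engine declaration referred to above is not reproduced in this section. As a hedged sketch based on knitr's documented conventions, a knitr-based vignette source typically begins with metadata lines like the ones below; the vignette title and file name are hypothetical. The lines are written to a temporary file from R so that the example stays runnable.

```r
# Metadata lines declaring the vignette engine in the form package::engine.
header <- c(
  "%\\VignetteEngine{knitr::knitr}",
  "%\\VignetteIndexEntry{An Introduction to demoPkg}"  # hypothetical title
)

# Hypothetical vignette source file; in a package it would live in vignettes/.
vignette_file <- file.path(tempdir(), "demoPkg-intro.Rnw")
writeLines(header, vignette_file)

# Show the first lines of the file as they would appear to the engine.
cat(readLines(vignette_file), sep = "\n")
```

The package's DESCRIPTION file would also need a VignetteBuilder: knitr field so that R knows which package supplies the engine.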
- utils provides multiple functions for spell-checking portions of packages, including .Rd files (utils::aspell_package_Rd_files) and vignettes (utils::aspell_package_vignettes) via the general purpose aspell function, which requires a system spell checking library, such as http://aspell.net/, http://hunspell.github.io/, or http://lasr.cs.ucla.edu/geoff/ispell.html.
- hunspell provides an interface to hunspell.
Data in Packages
- lazyData offers the ability to use data contained within packages that have not been configured using LazyData.
Tools and Services
Text Editors and IDEs
- By far the most popular integrated development environment (IDE) for R is RStudio, which is an open-source product available with both commercial and AGPL licensing. It can be run both locally and on a remote server. rstudioapi facilitates interaction with RStudio from within R.
- StatET is an R plug-in for the Eclipse IDE.
- Emacs Speaks Statistics (ESS) is a feature-rich add-on package for editors like Emacs or XEmacs.
- GNU Make is a tool that typically builds executable programs and libraries from source code by reading files called Makefile. It can be used to manage R packages as well; maker is a Makefile completely devoted to R package development based on makeR.
- remake (not on CRAN) provides a yaml-based, Makefile-like format that can be used in Make-like workflows from within R.
- R itself is maintained under version control using Subversion.
- Many packages are maintained using git, particularly those hosted on GitHub. git2r (Github) provides bindings to libgit2 for programmatic use of git within R.
Hosting and Package Building Services
Many hosting services are available. Use of different hosts depends largely on what type of version control software is used to maintain a package. The most common sites are:
- R-Forge, which relies on Subversion. Rforge.net is another popular Subversion-based system.
- GitHub mainly supports Git. Packages hosted on GitHub can be installed directly using ghit::install_github() from ghit. gh (not on CRAN) is a lightweight client for the GitHub API. Bitbucket is an alternative host that provides no-cost private repositories and GitLab is an open source alternative. gitlabr provides an API client for managing GitLab projects.
- GitHub supports continuous integration for R packages. Travis CI is a popular continuous integration tool that supports Linux and OS X build environments. Travis has native R support and can easily provide code coverage information via covr to Codecov.io or Coveralls. travisci (not on CRAN) provides an API client for Travis. Use of other CI services, such as Circle CI, may require additional code; examples are available from r-travis and/or r-builder. circleci (not on CRAN) provides an API client for Circle CI. badgecreatr (GitHub) provides a convenient way of creating standardized badges (or “shields”) for package READMEs.
- WinBuilder is a service intended for useRs who do not have Windows available for checking and building Windows binary packages. The package sources (after an R CMD check) can be uploaded via html form or passive ftp in binary mode; after checking/building, a mail will be sent to the Maintainer with links to the package zip file and logs for download/inspection. Appveyor is a continuous integration service that offers a Windows build environment. r-appveyor (not on CRAN) and appveyor (not on CRAN) provide API clients for Appveyor.
- Rocker provides containers for use with Docker. harbor can be used to control Docker containers on remote and local hosts and dockertest provides facilities for running tests on Docker.
- Some packages, especially some that are no longer under active development, remain hosted on Google Code. This service is closed to new projects, however, and will shut down in January 2016.
- drat can be used to distribute pre-built packages via Github or another server.
- CRAN does not provide package download statistics, but the RStudio CRAN mirror does. packagetrackr (Source) facilitates downloading and analyzing those logs.
- devtools (core)
- knitr (core)
- roxygen2 (core)
- [Manual] “Writing R Extensions” by R-core team
- [Tutorial] “Creating R Packages: A Tutorial” by Friedrich Leisch
- [Tutorial] “Best practices for writing an API package” by Hadley Wickham
- [Webpage] “CRAN Repository Policy” lists rules for hosting packages on CRAN
- [Webpage] Dirk Eddelbuettel provides a feed of CRAN policy changes
- [Webpage] “Developing R packages” by Jeff Leek
- [Book] “Software for Data Analysis” by John Chambers
- [Book] “Advanced R” by Hadley Wickham
- [Book] “R packages” by Hadley Wickham