R Package Management

Package Installation

RStudio Connect installs the R package dependencies of Shiny applications, Plumber APIs, and R Markdown documents when that content is deployed.

The RStudio IDE uses the rsconnect and packrat R packages to identify the target source code and enumerate its dependencies. That information is bundled into an archive (.tar.gz) file and uploaded to RStudio Connect.

RStudio Connect receives a bundle archive (.tar.gz) file, unpacks it, and uses packrat to install the identified package dependencies.

Note

RStudio Connect includes and manages its own installation of the packrat package. This packrat installation is not available to user code and is used only when restoring execution environments.

The execution environment created by RStudio Connect and packrat contains the same package versions you are using in your development environment.

Package Caching

The packrat package attempts to re-use R packages whenever possible. The shiny package, for example, is installed when the first Shiny application is deployed. That version of shiny is placed into the packrat package cache and associated with that Shiny application deployment. Other Shiny applications built with the same version of the shiny package will use that cached installation. Deployments are faster when they can take advantage of previously-installed packages.

The packrat package cache allows multiple versions of a package to exist on a system. An old Shiny application built with shiny version 1.0.5 continues to use that package version even as newer deployments choose updated versions of shiny. Each Shiny application has an R environment with its expected shiny version. The different applications and shiny versions coexist.

Publish new content without worrying about package updates breaking existing, deployed content. Distinct versions of packages are kept isolated from each other.

Package Compilation

Some packages contain C and C++ code components. That code needs to be compiled during package installation. The Server.CompilationConcurrency setting controls the number of concurrent compilation processes used by package installation.

The default value for the Server.CompilationConcurrency setting is derived from the number of available CPUs with the formula max(1, min(8, (cpus-1)/2)), where the division is rounded down to a whole number. This default makes package installs less likely to run out of memory on lightweight hosts while allowing more concurrency on high-capacity servers.

CPUs    CompilationConcurrency
1       1
2       1
4       1
6       2
8       3
16      7
24      8
32      8
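
The default can be reproduced for any CPU count with a few lines of R. This is a minimal sketch, not the server's implementation: the function name default_compilation_concurrency is hypothetical, and integer division is assumed because it matches the values in the table.

```r
# Hypothetical helper: default concurrency for a given CPU count, following
# the documented formula max(1, min(8, (cpus - 1) / 2)).
# Integer division (%/%) is an assumption that reproduces the table values.
default_compilation_concurrency <- function(cpus) {
  cpus <- as.integer(cpus)
  pmax(1L, pmin(8L, (cpus - 1L) %/% 2L))
}

cpus <- c(1, 2, 4, 6, 8, 16, 24, 32)
data.frame(
  CPUs = cpus,
  CompilationConcurrency = default_compilation_concurrency(cpus)
)
```

The lower bound of 1 keeps single-CPU hosts working, and the upper bound of 8 caps concurrency on large servers.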

You can customize Server.CompilationConcurrency to force a specific level of concurrency.

; /etc/rstudio-connect/rstudio-connect.gcfg
[Server]
CompilationConcurrency = 1

External Package Installation

Warning

Adding external packages decreases the reproducibility and isolation of content on RStudio Connect, and should only be done as a last resort.

You can indicate that a system-wide installation of a package should be used instead of one fetched by packrat. Use the Packages.External setting to list each system-provided package.

For example, rJava and ROracle are large installations with unusual external dependencies, such as your choice of JDK or Oracle InstantClient. First, install these packages in every R installation that RStudio Connect will use. Then, configure RStudio Connect with the following parameters:

; /etc/rstudio-connect/rstudio-connect.gcfg
[Packages]
External = ROracle
External = rJava

This is the same as setting the packrat option external.packages to c("ROracle", "rJava") using packrat::set_opts. The external.packages option instructs packrat::restore to load certain packages from the user library. See the packrat documentation for more information.
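
Because external packages must already exist in every R installation that RStudio Connect uses, it can help to check for them before changing the configuration. The sketch below assumes you run it once in each of those R installations; the helper name missing_external_packages is hypothetical and is not part of packrat or RStudio Connect.

```r
# Hypothetical helper: report which of the given packages cannot be found
# in the library paths of the current R installation.
missing_external_packages <- function(pkgs, lib.loc = .libPaths()) {
  found <- vapply(
    pkgs,
    function(pkg) length(find.package(pkg, lib.loc = lib.loc, quiet = TRUE)) > 0,
    logical(1)
  )
  pkgs[!found]
}

# Package names taken from the Packages.External example above.
missing_external_packages(c("ROracle", "rJava"))
```

An empty result means every listed package is present in that R installation; any names returned still need to be installed there.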

Proxy Configuration

If the http_proxy and/or https_proxy environment variables are provided to RStudio Connect when the server starts, those variables will be passed to all processes run by RStudio Connect, including the package installation process.

Configuring Packages.HTTPProxy and Packages.HTTPSProxy will provide their values as the http_proxy and https_proxy environment variables only when packages are installed during deployment. This could be useful if you have a special proxy just for downloading package dependencies. You could regulate access to unapproved packages in non-CRAN repositories by rejecting certain URL patterns.
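
As an illustration, a proxy dedicated to package downloads might be configured as follows. The proxy host and port are placeholders, not values from this guide; only the setting names come from the text above.

```ini
; /etc/rstudio-connect/rstudio-connect.gcfg
[Packages]
; Hypothetical proxy address; replace with your own.
HTTPProxy = http://pkg-proxy.example.com:3128
HTTPSProxy = http://pkg-proxy.example.com:3128
```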

Private Repositories

Packrat records details about how a package was obtained in addition to information about its dependencies. Most public packages will come from a public CRAN mirror. Packrat lets RStudio Connect support alternate repositories in addition to CRAN.

Info

Learn how to create your own custom repository; this directory can then be shared over HTTP or through a shared filesystem.

Here are some reasons why your organization might use an alternate/private repository.

  1. Internally developed packages are made available through a corporate repository. This is used in combination with a public CRAN mirror.

  2. All packages (private and public) are approved before use and must be obtained through the corporate repository. Public CRAN mirrors are not used.

  3. Direct access to a public CRAN mirror is not permitted. A corporate repository is used as a proxy and caches public packages to avoid external network access.

RStudio Connect supports private repositories in these situations, provided that the deploying instance of R is correctly configured. No adjustment to the RStudio Connect server is needed in this case.

In the case where the deploying instance of R and RStudio Connect must have different repository URLs, the RPackageRepository configuration option allows the repository URLs set by the user to be overridden on each packrat restore.
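
A server-side override might look like the sketch below. It assumes that RPackageRepository is a named configuration section keyed by the repository name with a URL setting, and the URL itself is a placeholder; confirm the exact syntax against the configuration reference for your RStudio Connect version.

```ini
; /etc/rstudio-connect/rstudio-connect.gcfg
; Assumed syntax: one section per repository name, overriding the URL
; recorded for that name at deploy time. The URL is hypothetical.
[RPackageRepository "CRAN"]
URL = https://cran.internal.example.com/
```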

Repository information is configured using the repos R option. Your users will need to make sure their desktop R is configured to use your corporate repository.

Note

RStudio IDE version 0.99.1285 or greater is needed when using repositories other than the public CRAN mirrors.

We recommend using an .Rprofile file to configure multiple repositories or non-public repositories.

The .Rprofile file should be created in a user's home directory.

# A sample .Rprofile file with two different package repositories.
local({
  r <- getOption("repos")
  r["CRAN"] <- "https://cran.rstudio.com/"
  r["mycompany"] <- "http://rpackages.mycompany.com/"
  options(repos = r)
})

This .Rprofile creates a custom repos option. It instructs R to look for packages in both the "CRAN" and the "mycompany" repositories. When more than one repository offers the same package, R by default installs the highest available version (see the "duplicates" filter in help(available.packages)).

With this custom repos option, you will be able to install packages from the mycompany repository. RStudio Connect will be able to install these packages as code is deployed.

For more information about the .Rprofile file, see help(Startup) in R. For details about package installation, see help(install.packages) and help(available.packages).

Private Packages

Packages available on CRAN, a private package repository, or a public GitHub repository are automatically downloaded and built when an application is deployed. RStudio Connect cannot automatically obtain packages from private GitHub repositories, but a workaround is available.

Note

We recommend using a private repository to host internal packages when possible. See the Private Repositories section for details.

Warning

Server.SourcePackageDir is deprecated as of RStudio Connect 1.8.6 and will be removed in a future version. We recommend using a private repository.

The configuration option Server.SourcePackageDir can reference a directory containing additional packages that Connect would not otherwise be able to retrieve. This directory and its contents must be readable by the Applications.RunAs user. Connect will look in this directory for packages before attempting to obtain them from a remote location.

This feature has some limitations.

  • The package must be tracked in a git repository so that each distinct version has a unique commit hash associated with it.

  • The package must have been installed from the git repository using the devtools package so that the hash is contained in the DESCRIPTION file on the client machine.

If these conditions are met, you may place .tar.gz source packages into per-package subdirectories of SourcePackageDir. The proper layout of these files is <package-name>/<full-git-hash>.tar.gz.

For example, if Server.SourcePackageDir is defined as /opt/R-packages, source bundles for the MyPrivatePkg package are located at /opt/R-packages/MyPrivatePkg. A commit hash of 28547e90d17f44f3a2b0274a2aa1ca820fd35b80 needs its source bundle stored at the following path:

/opt/R-packages/MyPrivatePkg/28547e90d17f44f3a2b0274a2aa1ca820fd35b80.tar.gz
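
The layout above can be sketched in R. Both helper names are hypothetical. The first assumes that your version of devtools records the commit hash in the RemoteSha field of the installed package's DESCRIPTION file; older versions may use a different field name. The second assumes a full 40-character SHA-1 hash.

```r
# Hypothetical helper: commit hash recorded in an installed package's
# DESCRIPTION file, or NA when none was recorded.
# Assumes the hash is stored in the RemoteSha field.
recorded_commit_hash <- function(pkg) {
  sha <- suppressWarnings(utils::packageDescription(pkg, fields = "RemoteSha"))
  if (is.na(sha)) NA_character_ else as.character(sha)
}

# Hypothetical helper: expected location of a source bundle, following the
# <package-name>/<full-git-hash>.tar.gz layout.
source_bundle_path <- function(source_package_dir, pkg, hash) {
  # Require a full 40-character lowercase hexadecimal hash.
  stopifnot(grepl("^[0-9a-f]{40}$", hash))
  file.path(source_package_dir, pkg, paste0(hash, ".tar.gz"))
}

# Values taken from the example above.
source_bundle_path(
  "/opt/R-packages",
  "MyPrivatePkg",
  "28547e90d17f44f3a2b0274a2aa1ca820fd35b80"
)
```

On a client machine, recorded_commit_hash("MyPrivatePkg") would return the hash to use in the file name, provided the package was installed from git as described in the limitations.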

When private package source is arranged in this manner, users of RStudio Connect will be able to use those package versions in their deployed content.

Be aware that this mechanism is specific to the commit hash. You will need to either make many git revisions of your package available in the Server.SourcePackageDir directory hierarchy or standardize on a particular git commit of the package.