Deploying Content¶
This section explains how to use Connect Server APIs to create content in RStudio Connect and deploy code associated with that content. These APIs can be used for any type of content supported by RStudio Connect, including Shiny applications, R Markdown notebooks, Plumber APIs, and Jupyter notebooks.
The deployment APIs are experimental and will continue to evolve in upcoming RStudio Connect releases. Please try using these APIs to build your own deployment tools. Let your Customer Success representative know about your experience!
The `rstudio/connect-api-deploy-shiny` GitHub repository contains a sample Shiny application and uses the recipes in this section in deployment scripts that you can use as examples when building your own workflows.
The Connect Server API Reference contains documentation for each of the endpoints used in these recipes.
These recipes use `bash` snippets and rely on `curl` to perform HTTP requests. We use the `CONNECT_SERVER` and `CONNECT_API_KEY` environment variables introduced in the Getting Started section of this cookbook.
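As a reminder, these variables might be set in your shell before running any of the snippets; the values below are placeholders, and the recipes assume that `CONNECT_SERVER` ends with a trailing slash:

```bash
# Placeholder values; substitute your own server address and API key.
export CONNECT_SERVER="https://connect.example.com/"
export CONNECT_API_KEY="your-api-key-here"
```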
These recipes do not prescribe a single workflow. Some example workflows include:
- Automate the creation of a new, uniquely named Shiny application every quarter to analyze sales leads. The latest quarter contains new dashboard functionality, but you cannot alter prior quarters; those applications need to capture that point in time.
- A Plumber API that receives updates after the R code supporting that API is fully tested by your continuous integration environment. These tests confirm that all updates remain backwards-compatible.
- A team collaborates on an application over the course of a two-week sprint cycle. The code is shared with Git and progress is tracked in GitHub issues. The team performs production updates at the end of each sprint with a Docker-based deployment environment.
- Your organization does not permit data scientists to publish directly to the production server. Production updates are scheduled events and gated by successful user-acceptance testing. A deployment engineer, who is not an R user, uses scripts to create and publish content in production by interacting with the Connect Server APIs.
Workflow¶
The content deployment workflow includes several steps:
- Create a new content item; content can receive multiple deployments.
- Create a bundle capturing your code and its dependencies.
- Upload the bundle archive to RStudio Connect.
- Deploy (activate) that bundle and monitor its progress.
You can choose to create a new content item with each deployment or repeatedly target the same content item. It is good practice to re-use an existing content item as you continue to develop that application or report. Create new content items for new artifacts.
You must create a new content item when changing the type of content. You cannot deploy a bundle containing an R Markdown document to a content item already running a Shiny application.
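As a condensed outline of the recipes that follow (each step is covered in detail below):

```bash
# 1. Create a content item:   POST /v1/experimental/content                 -> returns "guid"
# 2. Package your code:       tar czf bundle.tar.gz manifest.json <files>
# 3. Upload the archive:      POST /v1/experimental/content/{guid}/upload   -> returns "bundle_id"
# 4. Activate the bundle:     POST /v1/experimental/content/{guid}/deploy   -> returns "task_id"
# 5. Monitor the deployment:  GET  /v1/experimental/tasks/{task_id}         -> poll until finished
```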
Creating Content¶
The `POST /experimental/content` content creation API is used to create a new content item in RStudio Connect. It takes a JSON document as input.
The `Content` definition in the Connect Server API Reference describes the full set of fields that may be supplied to this endpoint. Our example is only going to provide two: `name` and `title`.
The `name` field is required and must be unique across all content within your account. It is a descriptive, URL-friendly identifier.
The `title` field is where you define a user-friendly identifier. The `title` field is optional; when set, `title` is shown in the RStudio Connect dashboard instead of `name`.
export DATA='{"name": "shakespeare", "title": "Shakespeare Word Clouds"}'
curl --silent --show-error -L --max-redirs 0 --fail -X POST \
-H "Authorization: Key ${CONNECT_API_KEY}" \
--data "${DATA}" \
"${CONNECT_SERVER}__api__/v1/experimental/content"
# => {
# => "guid": "ccbd1a41-90a0-4b7b-89c7-16dd9ad47eb5",
# => "name": "shakespeare",
# => "title": "Shakespeare Word Clouds",
# => ...
# => "owner_guid": "0b609163-aad5-4bfd-a723-444e446344e3",
# => "url": "http://localhost:3939/content/271/",
# => }
The JSON response is the full set of content fields. Those you did not supply when creating the content will receive default values. The Connect Server API Reference describes all the request and response fields for the `POST /experimental/content` content creation endpoint.
Fields marked read-only should not be specified when creating content. If you
happen to include read-only properties, they will be ignored. Read-only fields
are computed internally by RStudio Connect as other operations occur.
Let's define a `CONTENT_GUID` environment variable containing the `guid` of the content we just created. We will use this variable in the remaining deployment examples.
export CONTENT_GUID="ccbd1a41-90a0-4b7b-89c7-16dd9ad47eb5"
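If you have `jq` available (an assumption of this snippet, not a requirement of the API), you can create the content item and capture its `guid` in a single step instead:

```bash
# Create the content item and capture the guid from the JSON response (requires jq).
export CONTENT_GUID=$(curl --silent --show-error -L --max-redirs 0 --fail -X POST \
    -H "Authorization: Key ${CONNECT_API_KEY}" \
    --data "${DATA}" \
    "${CONNECT_SERVER}__api__/v1/experimental/content" | jq -r .guid)
echo "Created content: ${CONTENT_GUID}"
```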
Creating a Bundle¶
An RStudio Connect content "bundle" is a point-in-time snapshot of your code. You can associate a number of bundles with your content, though only one bundle is active. The active bundle is used to render your report or run your application, and supplies what you see when you visit that content in your browser.
Create bundles to match your workflow:
- As you improve / enhance your content
- Corresponding to a Git commit or merge
- Upon completion of tests run in your continuous integration environment
- After review and approval by your stakeholders
The bundle is uploaded to RStudio Connect as a `.tar.gz` archive. You will use the `tar` utility to create this file. Before we create the archive, let's consider what should go inside.
- All source files used by your content. This is usually a collection of `.R`, `.Rmd`, `.py`, and `.ipynb` files. Include any required HTML, CSS, and JavaScript resources as well.
- Data files, images, or other resources that are loaded when executing or viewing your content. These might be `.png`, `.jpg`, `.gif`, or `.csv` files. If your report uses an Excel spreadsheet as input, include it!
- A `manifest.json`. This JSON file describes the requirements of your content. For R content, this includes a full snapshot of all of your package requirements. The `manifest.json` is created with the `rsconnect::writeManifest` function.
From the command-line:

```bash
# This directory should be your current working directory.
Rscript -e 'rsconnect::writeManifest()'
```

From an R console:

```r
# This directory should be your current working directory.
rsconnect::writeManifest()
```
NOTE: Please use `rsconnect` version 0.8.15 or higher when generating a manifest file.
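One way to confirm the installed version from the shell (a convenience check, not part of the deployment workflow):

```bash
# Print the installed rsconnect version; it should be 0.8.15 or higher.
Rscript -e 'packageVersion("rsconnect")'
# If it is older, upgrade from CRAN:
# Rscript -e 'install.packages("rsconnect")'
```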
We recommend committing the `manifest.json` into your source control system and regenerating it whenever you push new versions of your code, especially when updating packages or otherwise changing its dependencies! Refer to the user guide for more information on creating the manifest.
Create your bundle `.tar.gz` file once you have collected the set of files to include. Here is an example that archives a simple Shiny application; `app.R` contains the R source and `data` is a directory with data files loaded by the application.
tar czf bundle.tar.gz manifest.json app.R data
You MUST bundle the `manifest.json` and primary content files at the top level; do NOT include the containing directory in the archive.
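You can confirm the layout before uploading by listing the archive; for the example above, every entry should appear at the top level (the exact file names will vary with your content):

```bash
# List the archive contents; entries must not be nested under a leading directory.
tar tzf bundle.tar.gz
# manifest.json
# app.R
# data/
# ...
```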
Uploading Bundles¶
The `CONTENT_GUID` environment variable identifies the content that will own the uploaded bundle. Bundles are associated with exactly one piece of content.
We use the `POST /experimental/content/{guid}/upload` upload content bundle endpoint with the `bundle.tar.gz` file as its payload:
curl --silent --show-error -L --max-redirs 0 --fail -X POST \
-H "Authorization: Key ${CONNECT_API_KEY}" \
--data-binary @"bundle.tar.gz" \
"${CONNECT_SERVER}__api__/v1/experimental/content/${CONTENT_GUID}/upload"
# => {"bundle_id":"485","bundle_size":162987}
The response from the upload endpoint contains an identifier for the created bundle and the number of bytes received.
You MUST use the `--data-binary` argument to `curl`, which sends the data file without additional processing. Do NOT use the `--data` argument: it submits data in the same way as a browser when you "submit" a form and is not appropriate.
Extract the bundle ID from the upload response and assign it to a `BUNDLE_ID` environment variable:
export BUNDLE_ID="485"
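As with the content `guid`, the bundle identifier can be captured directly from the upload response if `jq` is available:

```bash
# Upload the archive and capture the bundle_id from the JSON response (requires jq).
export BUNDLE_ID=$(curl --silent --show-error -L --max-redirs 0 --fail -X POST \
    -H "Authorization: Key ${CONNECT_API_KEY}" \
    --data-binary @"bundle.tar.gz" \
    "${CONNECT_SERVER}__api__/v1/experimental/content/${CONTENT_GUID}/upload" | jq -r .bundle_id)
echo "Uploaded bundle: ${BUNDLE_ID}"
```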
Deploying a Bundle¶
This recipe explains how to deploy, or activate, an uploaded bundle. It assumes that `CONTENT_GUID` references the target content item and `BUNDLE_ID` indicates the bundle to deploy.
Bundle deployment triggers an asynchronous task that makes the uploaded data available for serving. The workflow applied to the bundled files varies depending on the type of content.
This uses the `POST /experimental/content/{guid}/deploy` deploy content bundle endpoint.
# Build the JSON input naming the bundle to deploy.
export DATA='{"bundle_id":"'"${BUNDLE_ID}"'"}'
# Trigger a deployment.
curl --silent --show-error -L --max-redirs 0 --fail -X POST \
-H "Authorization: Key ${CONNECT_API_KEY}" \
--data "${DATA}" \
"${CONNECT_SERVER}__api__/v1/experimental/content/${CONTENT_GUID}/deploy"
# => {"task_id":"BkkakQAXicqIGxC1"}
The result from a deployment request includes a task identifier that we can use to poll for the progress of that deployment task.
export TASK="BkkakQAXicqIGxC1"
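The same `jq`-based pattern works here, capturing the task identifier when the deployment is triggered:

```bash
# Trigger the deployment and capture the task_id from the JSON response (requires jq).
export TASK=$(curl --silent --show-error -L --max-redirs 0 --fail -X POST \
    -H "Authorization: Key ${CONNECT_API_KEY}" \
    --data "${DATA}" \
    "${CONNECT_SERVER}__api__/v1/experimental/content/${CONTENT_GUID}/deploy" | jq -r .task_id)
echo "Deployment task: ${TASK}"
```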
Task Polling¶
This recipe explains how to poll for updates to a task. It assumes that the task identifier is present in the `TASK` environment variable.
The `GET /experimental/tasks/{id}` get task endpoint lets you obtain the latest information about a dispatched operation.
There are two ways to poll for task information; you can request complete or incremental task output. The `first` URL query argument controls how much data is returned.
Here is a typical initial task progress request. It does not specify the `first` URL query argument, meaning all available output is returned. When `first` is not given, the value `first=0` is assumed.
curl --silent --show-error -L --max-redirs 0 --fail \
-H "Authorization: Key ${CONNECT_API_KEY}" \
"${CONNECT_SERVER}__api__/v1/experimental/tasks/${TASK}?wait=1"
# => {
# => "id": "BkkakQAXicqIGxC1",
# => "output": [
# => "Building Shiny application...",
# => "Bundle requested R version 3.5.1; using ...",
# => ],
# => "finished": false,
# => "code": 0,
# => "error": "",
# => "last": 2
# => }
The `wait=1` argument tells the server to collect output for up to one second. This long-polling approach is an alternative to explicitly sleeping within your polling loop.
The `last` field in the response lets us incrementally fetch task output. Our initial request returned two output lines; we want our next request to continue from that point. Here is a request for task progress that does not include the first two lines of output.
export FIRST=2
curl --silent --show-error -L --max-redirs 0 --fail \
-H "Authorization: Key ${CONNECT_API_KEY}" \
"${CONNECT_SERVER}__api__/v1/experimental/tasks/${TASK}?wait=1&first=${FIRST}"
# => {
# => "id": "BkkakQAXicqIGxC1",
# => "output": [
# => "Removing prior manifest.json to packrat transformation.",
# => "Performing manifest.json to packrat transformation.",
# => ],
# => "finished": false,
# => "code": 0,
# => "error": "",
# => "last": 4
# => }
Continue incrementally fetching task progress until the response is marked as finished. The final lines of output are included in this response.
export FIRST=86
curl --silent --show-error -L --max-redirs 0 --fail \
-H "Authorization: Key ${CONNECT_API_KEY}" \
"${CONNECT_SERVER}__api__/v1/experimental/tasks/${TASK}?wait=1&first=${FIRST}"
# => {
# => "id": "BkkakQAXicqIGxC1",
# => "output": [
# => "Completed packrat build against R version: '3.4.4'",
# => "Launching Shiny application..."
# => ],
# => "finished": true,
# => "code": 0,
# => "error": "",
# => "last": 88
# => }
Errors are indicated in the response by a non-zero `code` and an `error` message. It is likely that the `output` stream also includes information that will help you understand the cause of the error. Problems installing R packages, for example, will appear in the lines of `output`.
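Putting the polling recipe together, here is a minimal loop you might adapt. It assumes `jq` is available and that `TASK` is already set; a production script would add a timeout and more robust error handling.

```bash
# Poll the task until it finishes, printing new output as it arrives (requires jq).
FIRST=0
FINISHED=false
while [ "${FINISHED}" != "true" ] ; do
    RESPONSE=$(curl --silent --show-error -L --max-redirs 0 --fail \
        -H "Authorization: Key ${CONNECT_API_KEY}" \
        "${CONNECT_SERVER}__api__/v1/experimental/tasks/${TASK}?wait=1&first=${FIRST}")
    # Print any new lines of output and advance the cursor.
    echo "${RESPONSE}" | jq -r '.output[]'
    FIRST=$(echo "${RESPONSE}" | jq -r .last)
    FINISHED=$(echo "${RESPONSE}" | jq -r .finished)
done

# A non-zero code indicates that the deployment failed.
CODE=$(echo "${RESPONSE}" | jq -r .code)
if [ "${CODE}" -ne 0 ] ; then
    echo "Deployment failed: $(echo "${RESPONSE}" | jq -r .error)"
    exit 1
fi
```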