10 Database
RStudio Connect supports multiple database options. Currently, the supported databases are SQLite and PostgreSQL.
Customize the Database.Provider
property with a database scheme appropriate
for your organization. See Section A.9 for
details
Here is a partial configuration which chooses to use SQLite
; /etc/rstudio-connect/rstudio-connect.gcfg
[Database]
Provider = sqlite
10.1 SQLite
SQLite is the default database provider.
RStudio Connect will use SQLite database if the
Database.Provider
setting has a value of sqlite
or if
Provider
is not present in the configuration file.
; /etc/rstudio-connect/rstudio-connect.gcfg
[Database]
Provider = sqlite
You can also specify the directory to store the SQLite file on your file
system. This can be done by specifying SQLite.Dir
in the
configuration file.
; /etc/rstudio-connect/rstudio-connect.gcfg
[SQLite]
Dir = /mnt/connect/sqlite
If this field is not specified, it will default to {Server.DataDir}/db
. This
location must exist on local storage.
If the location for Server.DataDir
is not local storage but
a networked location over NFS, configure the SQLite.Dir
setting so it still resides on some local volume.
10.1.1 SQLite Backups
RStudio Connect can be configured to periodically back up its database while running.
; /etc/rstudio-connect/rstudio-connect.gcfg
[SQLite]
Backup = true
BackupFrequency = 24h
BackupRetentionLimit = 7
The above config will execute an online backup every 24 hours. RStudio Connect will retain up to 7 of those backups. If an eigth backup is created, the oldest of the previous backups will be deleted. This provides a grace period for an administrator implementing a data retention policy, such as a practice of copying backups to tape periodically.
Backups are stored in the same directory as the main database file: /var/lib/rstudio-connect/db
by default. Backups
are lexically sortable, because they are timestamped with the UNIX epoch time padded to 11 digits. For example, if
the server’s database file is /var/lib/rstudio-connect/db/connect.db
, a backup of that database could be
/var/lib/rstudio-connect/db/connect.db.01508526538
.
The same process is applied to the instrumentation database file. It will be backed up based on the same settings as for the main database file.
Automatic SQLite backups are not a complete backup solution for RStudio Connect. You should also make regular
backups of the Server.DataDir
directory. This is especially important because the Server.DataDir
directory is
expected to be kept in sync with the database.
Restoring a SQLite backup is straightforward:
- Ensure that the backup is valid with the command
sqlite3 <backup file name> "PRAGMA integrity_check;"
- Stop the RStudio Connect service
- Copy the current database file as well as its
.wal
file, if any. (If you wish to analyze them or send a copy to RStudio Support) - Replace the current database file with the backup
- If they exist, delete any
.wal
and.shm
files associated with the previous database. Failure to do this could lead to further downtime and possible data corruption. - Start the RStudio Connect service
Note also that RStudio Connect has no way of restoring applications deployed or changes made since the last backup. Restoring the backup file will cause these changes to be lost permanently.
10.2 PostgreSQL
PostgreSQL is an available database provider which is more powerful and performant than SQLite.
You must provide your own Postgres server which will likely be a separate box from your RStudio Connect server (but not required). We currently support any 9.x, 10.x or 11.x version greater than or equal to 9.4. Your Postgres server does not have to be dedicated to RStudio Connect, but it must have its own dedicated database.
Configure the Database.Provider
as postgres
for RStudio
Connect to use a PostgreSQL database rather than the default SQLite database.
The Postgres.URL
setting is the fully-qualified connection
URL for your Postgres server.
; /etc/rstudio-connect/rstudio-connect.gcfg
[Database]
Provider = postgres
[Postgres]
URL = "postgres://username:password@db.seed.co/connect"
The user credentials can be included in the Postgres.URL
URL - remember to
URL-encode any special
characters in the username or password. We recommend using
Postgres.Password
to avoid using a bare password in your
configuration file. Examples using Postgres.Password
are found in Section
10.2.4.
The user credentials supplied in the Postgres.URL
URL must be able to create
and alter database tables, in addition to read/write permissions in the
database referenced as the path of the URL. A blank database with the given
name MUST already exist.
10.2.1 SSL
RStudio Connect assumes by default that SSL is enabled on the Postgres server.
If SSL is not enabled on your Postgres database, add ?sslmode=disable
to the
Postgres.URL
.
Here is an example configuration using a Postgres database with SSL connections:
; /etc/rstudio-connect/rstudio-connect.gcfg
[Database]
Provider = postgres
[Postgres]
URL = "postgres://username:password@db.seed.co/connect"
Here is an example configuration using a Postgres database without SSL connections:
; /etc/rstudio-connect/rstudio-connect.gcfg
[Database]
Provider = postgres
[Postgres]
URL = "postgres://username:password@db.seed.co/connect?sslmode=disable"
10.2.2 Schemas
If you need to, you can tell Connect to restrict itself to keeping its tables
within a specific schema. You control this by giving PostgreSQL a search path
as part of the URL by adding options=-csearch_path=<schema-name>
to the URL.
If it’s the only item you’re adding, separate it from the rest of the URL with
?
(just like the sslmode
item above). Otherwise, separate it from other
items with ‘&’.
Here is an example configuration that specifies a schema to use, with SSL:
; /etc/rstudio-connect/rstudio-connect.gcfg
[Database]
Provider = postgres
[Postgres]
URL = "postgres://username:password@db.seed.co/connect?options=-csearch_path=connect_schema"
Here is an example configuration that specifies a schema to use, with SSL disabled:
; /etc/rstudio-connect/rstudio-connect.gcfg
[Database]
Provider = postgres
[Postgres]
URL = "postgres://username:password@db.seed.co/connect?sslmode=disable&options=-csearch_path=connect_schema"
RStudio Connect will refuse to start when given a schema that does not already exist. The schema must be owned by the connecting user or by a group that contains the connecting user.
10.2.3 Historical Database
Historical events generated by RStudio Connect can be written to a separate PostgreSQL database. See the section 20 for details.
By default, historical events will be written to the same database.
Use Postgres.InstrumentationURL
to use a separate database.
; /etc/rstudio-connect/rstudio-connect.gcfg
[Postgres]
URL = "postgres://username:password@db.seed.co/connect"
InstrumentationURL = "postgres://username:password@db.seed.co/connect_instrumentation"
10.2.4 Passwords
When your PostgreSQL connection URL requires a password, use
Postgres.Password
with an encrypted value to avoid
credential leakage. Appendix A.1.3
explains encrypted settings, how to encrypt secrets like passwords, and using
them in your configuration file.
Note: We do not recommend embedding passwords directly in the
Postgres.URL
connection URL setting.
RStudio Connect uses Postgres.Password
when connecting to your PostgreSQL
database. Do not
percent-encode your
Postgres.Password
value. If necessary, Connect will encode your password
when it is combined with the Postgres.URL
.
Note: Passwords included directly in
Postgres.URL
need to have special characters percent-encoded. The set of characters that must be encoded within the user information component of a URL can be found in Section 3.2.1 of RFC 3986.
Here is a sample configuration using a PostgreSQL connection URL with a username and a separate, encrypted password.
; /etc/rstudio-connect/rstudio-connect.gcfg
[Postgres]
URL = "postgres://username@db.seed.co/connect"
Password = <base64 encrypted password value>
The historical database connection URL can also be used with an encrypted
password. Use Postgres.InstrumentationPassword
to provide
the password associated with the connection URL
Postgres.InstrumentationURL
.
Here is a sample configuration that has two separate Postgres URLs; one for the primary database and one for the historical database. Both connection URLs have separate, encrypted passwords.
; /etc/rstudio-connect/rstudio-connect.gcfg
[Postgres]
URL = "postgres://username@db.seed.co/connect"
Password = <base64 encrypted password value>
InstrumentationURL = "postgres://username@db.seed.co/connect_instrumentation"
InstrumentationPassword = <base64 encrypted password value>
10.3 Changing Database Provider
Connect includes a migrate
command for migrating data from one database
to another.
The migration utility is installed at
/opt/rstudio-connect/bin/migrate
. It uses the configuration defined in
/etc/rstudio-connect/rstudio-connect.gcfg
unless you specify an alternate
configuration file with the --config
flag.
The migrate
utility must be run as root
.
The migrate
utility can only be run when Connect is stopped. See Section
5.1 for information on stopping and restarting Connect.
Note: Migration from PostgreSQL to SQLite is not supported at this time.
Note: Migration of instrumentation event data is not supported at this time.
If you are also migrating your RStudio Connect installation to a new server, see 4.9.
10.3.1 Database Migration Checklist
Use this checklist to guide your migration process:
10.3.2 Configuration Requirements
When migrating data, the configuration file must contain valid configuration
sections for both SQLite
and Postgres
. The migration utility will connect
to the SQLite and PostgreSQL databases specified in the configuration.
10.4 Authentication Provider
The migration between authentication providers is detailed in section 11.1.
10.5 Configuration Migration
New version of RStudio Connect will automatically generate a migration file whenever a new setting needs to be added to the configuration of an existing installation. The migration file will be kept separated from the main configuration file. It is recommended that migrated settings be moved to the main configuration file at your earliest convenience.
See the configuration appendix A for details about the process and other information on this migration.