Databases and Shiny


Attaching package: 'dplyr'
The following objects are masked from 'package:stats':

    filter, lag
The following objects are masked from 'package:base':

    intersect, setdiff, setequal, union

SQL Jedi

{dplyr} Jedi

Interface to Databases: {DBI}

Unified set of methods & classes bridging interfacing R to database management systems (DBMS)

  • Connect and disconnect from DB
  • Execute queries
  • Extract results
  • Obtain metadata when available
  • Each DBMS supported by a dedicated package

All About Connections

You have used connections in R (and may not realize it)

read.csv("path/to/file.csv", stringsAsFactors = FALSE)

readr::read_csv("path/to/file.csv")

openxlsx::write.xlsx(df, file = "/path/to/file.xlsx")


Behind the scenes: connections are dynamically created (and terminated) during file input/output.

Database Options

  • Relational databases come in many flavors

SQLite


Written to file

Open access

{RSQLite}

Ideal for prototyping DB solutions

PostGreSQL


Hosted on server

Access via authentication

{RPostgres}

Ideal for production-grade DB workflows

Defining Connection

library(DBI)
library(RSQLite)

# initialize connection object
con <- dbConnect(
  drv = RSQLite::SQLite(),
  dbname = ":memory:"
)

# send data frame to a table
dbWriteTable(con, "sim_patients", sim_patients)

# disconnect when done
#dbDisconnect(con)

Applying your {dplyr} Skillz

{dbplyr} provides automatic translation from dplyr syntax to SQL statements

  • Integrates with connection objects created by {DBI}
  • Calls are evaluated lazily: Only when you request results
  • Common dplyr verbs supported out of the box

Applying your {dplyr} Skillz

library(dplyr)

sim_patients_db <- tbl(con, "sim_patients")

sim_patients_db %>%
  group_by(ethnicity) %>%
  count()
# Source:   SQL [2 x 2]
# Database: sqlite 3.39.4 [:memory:]
  ethnicity       n
  <chr>       <int>
1 hispanic       20
2 nonhispanic   140

Connections in Shiny

Logical ways to manage connections when developing solo

Connections in Shiny

  • Many users creating connections
  • Potential for degraded performance in your app

Enter the {pool}!

Abstraction layer on top of database connection objects

  • Holds a group of connections to database
  • Knows to expand or reduce connections as needed

Let’s Dive In

con <- dbConnect(
  drv = RSQLite::SQLite(),
  dbname = ":memory:"
)
pool <- dbPool(
  drv = RSQLite::SQLite(),
  dbname = ":memory:"
)
  • pool object a drop-in replacement for con
  • Each query goes to the pool first, then fetches or initializes a connection
  • Not necessary to create connections yourself

Code-Along

Optimize backend calculations in {simclindata.shiny} with SQLite database