Part 1: to be completed at home before the lab
In this lab, we give an introduction to creating interactive visualizations using the R Shiny framework. The lab reviews the fundamental topics covered in the lecture through supported exercises. Parts of this practical are adapted from Hadley Wickham’s upcoming textbook Mastering Shiny, in addition to the Learn Shiny tutorials from the RStudio website.
For this lab, you will need the package shiny, which you will likely need to install with install.packages("shiny") before running the library() function. Additionally, as with all these practicals, it is recommended you also have the tidyverse loaded, since some data manipulation will be required. You can download the student zip including all needed files for lab 6 here.
Note: the completed homework has to be handed in on Black Board and will be graded (pass/fail, counting towards your grade for individual assignment). The deadline is two hours before the start of your lab. Hand-in should be a separate screen shot of your code for the UI and server tab (see below) for the at home part.
Note that for this lab, we do not include a student .Rmd file to work in. When creating your R Shiny applications, it’s better to work directly in the R Shiny application files (that is, ui.R and server.R, as you will see below) instead of in an R Markdown file.
library(shiny)
library(tidyverse)
Our R Shiny app
Throughout this lab, you will develop one particular R Shiny app. While building this app, you will learn how to make your own R Shiny application. The code for this particular example will be released alongside the answers to this practical. Today’s practical consists of many parts, with some of them being optional. If you complete all parts (including the optional ones), your R Shiny app will look like this.
Note that it is still worthwhile to complete the optional parts afterwards if you did not manage to do them during the lab: the optional parts will teach you how to improve your R Shiny app even further and provide you with skills you can use for completing the group assignment.
When you open the example, you can see that there are multiple important components which make this plot interactive:
Inputs to set the range of the price and carat variables in the data, which are then reflected in both the visualization and the statistical analysis.
Inputs to select the cuts you wish to display.
By interacting with these components, you can observe changes to the output of the ggplot graphic, the table and the reactive text beneath. Throughout section 4, we will discuss how to implement these different reactive components.
A simpler example to start with
Before we start building the elaborate R Shiny app based on the diamonds dataset, we start with a simpler example, in which we extend R's default Shiny app provided when starting a new R Shiny document via RStudio.
Creating a new R Shiny app
The easiest way to create a new Shiny application is through the RStudio GUI. This can be done in one of two ways:
For this practical we recommend creating a Shiny Application using the Multiple File format (ui.R/server.R).
Run R's default Shiny app on the Old Faithful Geyser data, which is provided when you start a new Shiny application.
Hint: To run the app, click Run App in the top right of your main coding panel. This should produce a pop-out window for the application. If this does not happen, check either your viewer panel (usually the bottom right panel) or the Console for any errors.
Now that you have opened a new Shiny application, you should see R's default Shiny application, the Old Faithful Geyser histogram. This is a really nice basic example to check that everything is working properly (before we start changing anything!).
Core Structure of the app
Let's talk about the core structure of this basic example. If you are using the Multiple File format, you should clearly see that any Shiny application comprises two components: the ui.R tab and the server.R tab.
The ui.R tab
The names of these tabs give quite a clear indication of what they do. The ui.R tab provides a schematic for how the application will be presented to the individual using the application. Within the Old Faithful Geyser example, you can observe that it specifies details such as layout through the functions:
titlePanel(): provides a title
sidebarLayout() & sidebarPanel(): provide a sidebar and its layout
mainPanel(): provides a main panel
This is only one example of the layouts which an R Shiny application can take, with more examples being found in Chapter 3.4 in Mastering Shiny. However, for this tutorial we will be using this layout, as it is one of the most commonly used.
Although not formally defined, it is traditional that the parameter-changing aspects, so the inputs (for example slider bars, drop-down boxes etc.), are located in the sidebar, with any outputs (graphs, tables etc.) being located in the mainPanel. This allows the majority of your application’s visual space to be used to display your output rather than any input.
To add or remove a component from your sidebar or mainPanel, use them as you would any other R function, and add or remove components as you need them.
It should be noted: this is a really good time to get into the habit of clearly structuring and annotating (using #) your code. Although functions are usually clearly named for what they do, listing specifically what you intend each component to do is really helpful when debugging or reviewing your own or someone else’s code.
The server.R tab
So what about the server.R tab? This contains everything which should be done in the background of the application, whether that is completing a statistical analysis, plotting graphs, or any other code which is required for the output.
Everything within shinyServer() can be classified under two components, either inputs or outputs.
As the name inputs suggests, these take information which is entered in the ui.R tab and move it to the server.R tab.
output does the reverse, and allows information to be passed from the server.R tab to the ui.R tab.
In the Old Faithful Geyser example, we can see that:
input here holds the number of bins (input$bins)
output defines the entire histogram plotted (output$distPlot).
Important Note: As you can see by the use of the $ operator, both input and output can hold multiple elements. For example, for the input we can see in the ui.R tab under the sliderInput() function that we define the object bins. In the server.R tab, the object bins is called using input$bins. If we had defined an additional function within the sidebarPanel() to define a user-specified input, for example hiscol to set the colour of the histogram, we could call that object within the server.R tab using input$hiscol. For the output, we can observe that in the server.R tab we define a plot called distPlot using output$distPlot. In the ui.R tab we directly call the plot in the mainPanel() using "distPlot".
So at this point, you should have a basic understanding that Shiny applications are a composite of two sections, a user interface and a server section, so let's start changing things (until it breaks!!).
Adding a reactive component
Reactivity is one of the most complex parts of any Shiny application, but it is also one of the most useful: it allows a large amount of automation to occur within your application, meaning the application can be provided to clients, other workers or anyone in between to aid their understanding of a topic.
Within this Old Faithful Geyser graphic, there is already one reactive component, the slider: when a user changes the number of bins, the graph you observe changes. Let's add another element of reactivity by making this graph more colourful.
Step 1. Adding reactive input to server.R
Starting in the server.R tab, we can observe that the current colour of the histogram is darkgrey; personally I find this very boring. So let's allow the user to change this to one from a selection we provide.
To do this, we need to create a new input parameter, let's call it input$hiscol, and replace darkgrey with input$hiscol.
Change darkgrey to input$hiscol in the server.R tab.
# This is how your server.R tab should look:
# Define server logic required to draw a histogram
shinyServer(function(input, output) {
  output$distPlot <- renderPlot({
    # generate bins based on input$bins from ui.R
    x <- faithful[, 2]
    bins <- seq(min(x), max(x), length.out = input$bins + 1)
    # draw the histogram with the specified number of bins
    hist(x, breaks = bins, col = input$hiscol, border = 'white')
  })
})
Step 2. Defining the corresponding reactive input in ui.R
Moving to the ui.R tab, we now need to specify where this input is going to come from. Since these bars can only be a single colour, we must select an input type which allows only one value to be chosen, so that it doesn’t break the code. For the choices, you can use any of the single, solid colours named within base R, such as “red”, “blue” or “purple”.
Let's use selectInput(), which allows users to make a choice from a provided selection.
Add selectInput() to your ui.R tab, as part of your sidebarPanel().
Note: Make sure to place it below sliderInput(), and ensure that you follow the closing bracket of the sliderInput() function with a comma (,).
selectInput(inputId = ??? , # What input$??? will be used?
label = ??? , # What would you like the title above the options to be?
choices = c(???, ???), # What colours would you like
selected = ???) # What colour would you like first displayed
# (remember to use `" "` around your colours)
# This is how your ui.R tab sidebar Panel should look:
# Sidebar with a slider input for number of bins
sidebarLayout(
sidebarPanel(
sliderInput("bins",
"Number of bins:",
min = 1,
max = 50,
value = 30),
selectInput(inputId = "hiscol",
label = "Bar Colour:",
choices = c("red", "blue", "purple"),
selected = "red")
),
So now, you should be able to save both your ui.R and server.R tabs and Run/Reload the application. Does everything work - can you now change the colours of your bars?
If so, congratulations, you’ve added your first reactive component!! During the lab, we will move onto making an application from scratch!
Part 2: to be completed during the lab - building your own application
The beauty of Shiny applications, as with most things within R, is that they are highly modular. This section will therefore focus on building up your application bit by bit to reach the example shown at the beginning. It is recommended that at the end of each mini-section you test-run your application, as regular checking helps you catch errors early.
Step 1. Start with the basics
If you take a moment to review the example provided at the beginning of this lab, you will see that it is based around our favourite dataset, diamonds from ggplot2, and as such you will need to ensure that the tidyverse is loaded in both your ui.R and server.R tabs.
Before we get started, it is important to clear the simple example from Part 1 out of our code, so that none of the previous code interacts with the new code we are going to write.
# In your ui.R tab
library(tidyverse)
shinyUI(fluidPage(
# Application title
titlePanel(),
# Description
p(),
# Sidebar
sidebarLayout(
sidebarPanel(
),
# Show the generated plots
mainPanel(
# Sub-Heading
h4(),
# Main Graphical Output
plotOutput()
)
)
))
# In your server.R tab
library(tidyverse)
shinyServer(function(input, output) {
output$ <- renderPlot({
})
})
Step 2. Core Components
At this point, you have an empty shell of an R Shiny application. The aim is now to create the basic user interface, containing a static example plot together with a title and a brief description to inform the user of the purpose of the application. As the final example contains a large number of different components, we will work step by step, first creating a basic model before adapting it to make it more complex.
Although there are multiple ways to approach creating a Shiny application (like all things within R), it is often most useful to start by thinking about what you would like to achieve and then working backwards from this. In this case, we are looking to create an application which has a clear title and a brief line of text, alongside a scatterplot output which plots diamond price against diamond carat, coloured by the diamond’s cut. The best way for us to create such a graphic is by using ggplot.
As such, we are able to split this into two parts: firstly adding a title and text, then plotting the static graph.
As previously discussed, the typical layout of a Shiny application is based around a titlePanel(), a sidebarPanel() and a mainPanel(), which are defined within the ui.R tab. To add a title, simply add your text within the titlePanel() function (within " ") to specify it as your application title. By contrast, to add plain (non-reactive) text to your application, you need to use functions based upon HTML code. To do this, simply add the text you wish to display into the function p() (see within the template under # Description).
It is also possible to add headings or additional text sections within your panels themselves. To add an additional heading, simply call one of the functions h[x](), replacing the [x] with a number between 1 and 6 depending on the size of the heading you would like.
6a. Within the ui.R tab, add an appropriate title, as part of the titlePanel() function, and an appropriate description of the application, as part of the p() function.
# Application title
titlePanel("Statistics in the Sky with Diamonds"),
# Description
p("A Shiny application, predicting the Carat of different Cuts of Diamonds, by their Price."),
Next, we are able to plot our static graph using ggplot, as part of the renderPlot({}) function within the server.R tab.
6b. Within the server.R tab, as part of the renderPlot({}) function, start by creating a basic ggplot graphic, which plots price (x-axis) by carat (y-axis) and cut (colour) from the diamonds dataset within ggplot2. This should then be assigned to output$diaPlot.
# This is how your server.R tab should look:
output$diaPlot <- renderPlot({
ggplot(data = diamonds,
mapping = aes(x = price, y = carat, colour = cut)) +
geom_point() +
theme_minimal()
})
So now you have a basic setup behind the scenes which will create a static plot in your Shiny application. Make sure this is now matched in the UI.
Pass "diaPlot" to the function plotOutput() in the application's mainPanel(). In addition, provide a sub-heading for your graphical plot in the function h4(), just above your plot in the mainPanel().
# This is how your ui.R tab should look:
# Application title
titlePanel("Statistics in the Sky with Diamonds"),
# Sidebar
sidebarLayout(
sidebarPanel(
),
# Show the generated plots
mainPanel(
# Sub-Heading
h4("Graphical Output"),
plotOutput("diaPlot")
)
)
If you run this setup now, you should get a rather boring Shiny application with no reactive components, meaning you will see a static ggplot graph.
Step 3. Adding reactive components
Within the example, you can see that a large number of different reactive components are used. Let's start with one of the core components, slider inputs. Adding slider inputs to a Shiny application requires the use of sliderInput() within your ui.R tab, with the corresponding input used within your server.R tab.
Since there are two sliders within this example, we will walk through one, before you create the second for yourself.
Let us begin with the Price Input slider, which defines the price range of the diamonds shown within your ggplot graphic.
Within the ui.R tab, as part of the sidebarPanel(), you will need to add the following component:
sliderInput(inputId = ??,      # The name of your slider, which will match with the input$inputId
            label = ??,        # The label presented on the UI
            min = ??,          # The slider's min value
            max = ??,          # The slider's max value
            value = c(??, ??), # The initial value of the slider
            step = ??)         # The stepped increments in the slider
This will produce a slider bar with two handles, which allows a range of values to be applied to the graphic. If you would prefer the slider to select a single value rather than a range, simply specify a single value under value rather than a pair.
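The difference between the two kinds of slider can be sketched as follows. The id "maxPriceInput" and both labels are hypothetical names chosen for this illustration; only the length of value differs between the two calls.

```r
library(shiny)

# A single value gives a one-handle slider; on the server,
# input$maxPriceInput would then be a single number
single_slider <- sliderInput(inputId = "maxPriceInput",
                             label = "Maximum price",
                             min = 0, max = 20000,
                             value = 1000, step = 100)

# A pair of values gives a two-handle range slider; on the server,
# input$priceInput would then be a numeric vector of length 2
range_slider <- sliderInput(inputId = "priceInput",
                            label = "Price",
                            min = 0, max = 20000,
                            value = c(100, 1000), step = 100)
```

With the range slider, the lower bound is read as input$priceInput[1] and the upper bound as input$priceInput[2].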
Implementing this slider bar behind the scenes in the server.R tab is a little more complex. Conceptually, the purpose of this slider is to change the range of data displayed within the plot. Therefore, to implement this slider, you will need a filtering tool, such as dplyr’s filter() function, which allows you to filter the data based upon the values selected on the slider bar.
To implement this within the server.R tab, you will need to create a new variable (such as dia.filtered), which holds the dataset diamonds filtered by the price range indicated, and which replaces diamonds within your ggplot function. In action this looks as so:
dia.filtered <-                          # name of new variable
  diamonds %>%                           # dataset to be used, piped (%>%)
  filter(price >= input$priceInput[1],   # retain all values greater than or equal to the lower bound
         price <= input$priceInput[2])   # retain all values lower than or equal to the upper bound
This takes the specified input variable and filters the dataset accordingly, before it is passed to the plot. To make this work, ensure it is added inside the renderPlot({}) function, before you specify your scatter plot.
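The filtering step can be tried outside Shiny first. In the sketch below, the vector price_range is a stand-in (an assumption for this illustration) for what input$priceInput would hold when the slider handles sit at 100 and 1000.

```r
library(dplyr)
library(ggplot2)  # provides the diamonds dataset

# Stand-in for input$priceInput: lower and upper bound of the slider
price_range <- c(100, 1000)

# Keep only the diamonds whose price lies within the selected range
dia.filtered <-
  diamonds %>%
  filter(price >= price_range[1],
         price <= price_range[2])

# Check the result: number of rows kept and the price range that remains
nrow(dia.filtered)
range(dia.filtered$price)
```

Once this behaves as expected, replacing price_range with input$priceInput inside renderPlot({}) gives the reactive version.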
Notes:
For the second (carat) slider, extend the filter() function to include carat >= input... .
# ui.R sliders
sliderInput(inputId = "priceInput", # The name of your slider, which will match with the input$inputId
            label = "Price",        # The label presented on the UI
            min = 0,                # The slider's min value
            max = 20000,            # The slider's max value
            value = c(100, 1000),   # The initial value of the slider
            step = 100),            # The stepped increments in the slider
sliderInput(inputId = "caratInput", # The name of your slider, which will match with the input$inputId
            label = "Carat",        # The label presented on the UI
            min = 0,                # The slider's min value
            max = 5,                # The slider's max value
            value = c(0, 5),        # The initial value of the slider
            step = 0.1)             # The stepped increments in the slider
# server.R tab should look:
output$diaPlot <- renderPlot({
dia.filtered <-
diamonds %>%
filter(price >= input$priceInput[1],
price <= input$priceInput[2],
carat >= input$caratInput[1],
carat <= input$caratInput[2])
ggplot(data = dia.filtered,
mapping = aes(x = price, y = carat, colour = cut)) +
geom_point() +
theme_minimal()
})
So, if everything is working, you should now have a Shiny application with two slider inputs, allowing you to specify the range of the variables displayed within your ggplot graphic output.
This is a good time to re-run your application and ensure that all your core components are working. Can you change the range of both your x and y variables?
OPTIONAL: Adding other types of reactive components
If you look again at the example application, you will see that other reactive components are present which allow a user to interact with the application. Although different in their function, these follow the same logic and application as the slider functions. The other main reactive components come from the functions checkboxGroupInput(), selectInput() and checkboxInput(). This step will look only at the first two, since the last (checkboxInput()) will be covered in Step 4. Adding statistical functions within Shiny.
Both checkboxGroupInput() and selectInput() follow the same functional layout, requiring the following components:
# Function can be replaced with checkboxGroupInput() or selectInput()
function(inputId = ??,
label = ??,
choices = c(??, ??),
selected = ??)
These components are an input identifier, a UI label, the choices offered, and those which should initially be selected when the application loads; for checkboxGroupInput(), this initial selection can include multiple values.
As with the previous examples we have discussed, this produces an input variable, identified through the inputId, which can then be called wherever you may need it. Within the presented example, we can see that the cuts of diamonds displayed within the graph are altered through the checkboxGroupInput() function, with the colours for the cuts being altered through the selectInput() function.
Multiple checkbox: displaying by diamond cut
Since the purpose of the checkboxGroupInput() function in this case is to define which part of the data should be used, the changes should be made during the filtering of the diamonds dataset.
Add a checkboxGroupInput() to the side panel of your UI (in ui.R). Next, in server.R, use the operator %in% to specify that the values selected in this checkboxGroupInput() should be used to filter the diamonds dataset, allowing the user to specify which cuts of diamonds they would like to see in the graph. Because several cuts can be selected at once, == would compare the values element by element rather than test membership.
# ui.R
sliderInput(inputId = "priceInput", # The name of your slider, which will match with the input$inputId
            label = "Price",        # The label presented on the UI
            min = 0,                # The slider's min value
            max = 20000,            # The slider's max value
            value = c(100, 1000),   # The initial value of the slider
            step = 100),            # The stepped increments in the slider
sliderInput(inputId = "caratInput", # The name of your slider, which will match with the input$inputId
            label = "Carat",        # The label presented on the UI
            min = 0,                # The slider's min value
            max = 5,                # The slider's max value
            value = c(0, 5),        # The initial value of the slider
            step = 0.1),            # The stepped increments in the slider
checkboxGroupInput(inputId = "cutInput",
                   label = "Diamond Cut",
                   choices = c("Fair", "Good", "Very Good", "Premium", "Ideal"),
                   selected = c("Fair", "Good", "Very Good", "Premium", "Ideal")),
# server.R tab should look:
output$diaPlot <- renderPlot({
  dia.filtered <-
    diamonds %>%
    filter(price >= input$priceInput[1],
           price <= input$priceInput[2],
           carat >= input$caratInput[1],
           carat <= input$caratInput[2],
           cut %in% input$cutInput)    # keep rows whose cut is among the ticked boxes
  ggplot(data = dia.filtered,
         mapping = aes(x = price, y = carat, colour = cut)) +
    geom_point() +
    theme_minimal()
})
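When filtering on an input that can hold several selected values, testing membership with %in% and comparing element by element with == give different results. The base R sketch below uses small made-up vectors (cut_values and selected are illustrative names, with selected standing in for input$cutInput) to show the difference.

```r
# Made-up column of cuts and a made-up selection of two cuts
cut_values <- c("Fair", "Good", "Ideal", "Good", "Fair", "Ideal")
selected   <- c("Fair", "Ideal")

# == recycles `selected` along `cut_values` and compares position by position,
# so whether a row is kept depends on where it sits in the data
by_equality <- cut_values == selected

# %in% asks, for every row, whether its cut is among the selected ones
by_membership <- cut_values %in% selected

data.frame(cut_values, by_equality, by_membership)
```

In this example the third row ("Ideal") is dropped by == but kept by %in%, even though "Ideal" was selected.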
Drop down option: selecting colour arrays
We have already seen earlier in this practical that selectInput() can be used to change a single colour with ease (by replacing the colour with the reactive input parameter). However, as you can see within the main example, it is also possible to let the user choose a whole colour palette. This is done by adding the scale_colour_brewer() function to your ggplot function, which directs what colour scheme should be used when displaying the variables.
It should be noted that scale_colour_brewer() applies to the colour aesthetic, as used for the points in scatter plots, while scale_fill_brewer() applies to the fill aesthetic, as used for box plots, bar plots and the like. These functions are based around the RColorBrewer package, which should be installed and loaded accordingly, although many users may already have this package installed! The RColorBrewer package, as briefly mentioned within the visualization lecture and corresponding practical, provides a large number of pre-defined colour schemes for a variety of different uses. For a full list and examples please visit their website.
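If you want to check which palette names are valid before offering them as choices, the RColorBrewer package ships a lookup table. The sketch below, which assumes RColorBrewer is installed, lists all palette names and picks out the qualitative ones; the variable names are illustrative.

```r
library(RColorBrewer)

# brewer.pal.info is a data frame with one row per palette;
# the row names are the palette names accepted by scale_colour_brewer()
palette_names <- rownames(brewer.pal.info)

# Qualitative palettes are usually a sensible choice for a categorical
# variable such as cut
qualitative_palettes <- rownames(brewer.pal.info[brewer.pal.info$category == "qual", ])

palette_names
qualitative_palettes
```

Any of these names could be passed as the choices argument of a selectInput().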
Add a selectInput() to the side panel of your UI (in ui.R) so the user can select one of any number of RColorBrewer colour palettes. Next, add scale_colour_brewer() to your ggplot() function, with its palette argument set to the input from your UI (input$ followed by your input ID).
# ui.R
sliderInput(inputId = "priceInput", # The name of your slider, which will match with the input$inputId
            label = "Price",        # The label presented on the UI
            min = 0,                # The slider's min value
            max = 20000,            # The slider's max value
            value = c(100, 1000),   # The initial value of the slider
            step = 100),            # The stepped increments in the slider
sliderInput(inputId = "caratInput", # The name of your slider, which will match with the input$inputId
            label = "Carat",        # The label presented on the UI
            min = 0,                # The slider's min value
            max = 5,                # The slider's max value
            value = c(0, 5),        # The initial value of the slider
            step = 0.1),            # The stepped increments in the slider
checkboxGroupInput(inputId = "cutInput",
                   label = "Diamond Cut",
                   choices = c("Fair", "Good", "Very Good", "Premium", "Ideal"),
                   selected = c("Fair", "Good", "Very Good", "Premium", "Ideal")),
selectInput(inputId = "colInput",
            label = "Colour Choice",
            choices = c("Pastel1", "Paired", "Spectral", "RdBu", "PuOr"),
            selected = "Spectral"),
# server.R tab should look:
output$diaPlot <- renderPlot({
  dia.filtered <-
    diamonds %>%
    filter(price >= input$priceInput[1],
           price <= input$priceInput[2],
           carat >= input$caratInput[1],
           carat <= input$caratInput[2],
           cut %in% input$cutInput)    # keep rows whose cut is among the ticked boxes
  ggplot(data = dia.filtered,
         mapping = aes(x = price, y = carat, colour = cut)) +
    geom_point() +
    scale_colour_brewer(palette = input$colInput) +
    theme_minimal()
})
At this point, it is a good time to re-run your application so far and ensure that all your core components are working.
Can you:
If you can do all of this, well done!! Time to get to the statistical analysis component!
Step 4. Adding statistical functions within Shiny
As Shiny is built using the R framework, statistical functions can be created and used in the same way as within traditional R code. However, some adaptations and considerations are required to ensure that they are integrated correctly into the application. For this practical, we will focus on the plotting of relatively simple regression analyses; however, as this is only an introduction to Shiny applications, it is recommended you look at other ways to output statistical analysis results (see Chapter 3.3 in Mastering Shiny).
Within the example application we are working towards, you will see that the three regression analyses used are a standard linear regression (using the lm() function), a regression including a square term (using lm(y ~ x + I(x^2))), and a regression adding a cubic term (using lm(y ~ x + I(x^2) + I(x^3))). We will first cover the steps you need to take to integrate statistical functions and their output into your Shiny applications, before allowing you to apply these three methods (in whatever way you see fit) to your application so far.
Basics: adding regression models
As this is an introduction, let us say that we are looking to apply standard statistical functions to your application which, although dependent on the data the user controls, cannot be directly changed by the user through reactive components (don’t worry, we’ll come on to how to change these directly later!). As such, we can apply our knowledge of statistical functions within the server.R tab.
Using your knowledge from Week 4 (linear regressions), you will know that the standard layout of a regression analysis is:
model <- lm(data = ??,
formula = y ~ x)
This can easily be added to a Shiny application, within the server.R tab. If you would like the model to be influenced by the reactive components, simply place it after your filter() function and fit it on the filtered data.
model.lin <- lm(carat ~ price, data = dia.filtered)
model.sq <- lm(carat ~ price + I(price^2), data = dia.filtered)
model.cub <- lm(carat ~ price + I(price^2) + I(price^3), data = dia.filtered)
Adding corresponding regression lines to the graphic output
Now that you have the models included within your server.R tab, it is useful to add the regression lines themselves to your graphic output.
Create an x prediction variable for all your models (this predictor variable will remain consistent across the three different regression models). This can be created using the template given below, which produces a sequence of possible values for the predictor variable x, which can then be used to predict the y outcome variable.
x_pred <- seq(min(data$x),
              max(data$x),
              length.out = ??)
Create y predictions for each individual model. Remember that as each model employs a different formula, you will need to create a y prediction for each model you want to use. This can be done using this template, which creates predictions based upon the x prediction values created in the previous step.
y_pred.model <- predict(object = ??,
                        newdata = tibble(x = x_pred))
Add the regression lines to your ggplot function. For this, you can use the geom_line() function with the data created by the prediction step, as so, replacing x and y with the variables you have used in the main section of the plot.
geom_line(data = tibble(x = x_pred, y = y_pred),
          size = 1,
          col = ??)
These steps, as you can see, are incredibly useful in the plotting of regression lines.
Create the x & y prediction values accordingly, and add the predicted regression lines to your output plot.
# server.R tab should look:
output$diaPlot <- renderPlot({
  # filter the data according to the reactive inputs
  dia.filtered <-
    diamonds %>%
    filter(price >= input$priceInput[1],
           price <= input$priceInput[2],
           carat >= input$caratInput[1],
           carat <= input$caratInput[2],
           cut %in% input$cutInput)
  # fit the three regression models on the filtered data
  model.lin <- lm(carat ~ price, data = dia.filtered)
  model.sq <- lm(carat ~ price + I(price^2), data = dia.filtered)
  model.cub <- lm(carat ~ price + I(price^2) + I(price^3), data = dia.filtered)
  # predict carat over an evenly spaced grid of prices
  x_pred <- seq(min(dia.filtered$price), max(dia.filtered$price), length.out = 500)
  y_pred.lin <- predict(model.lin, newdata = tibble(price = x_pred))
  y_pred.sq <- predict(model.sq, newdata = tibble(price = x_pred))
  y_pred.cub <- predict(model.cub, newdata = tibble(price = x_pred))
  # scatter plot with the three regression lines on top
  ggplot(data = dia.filtered,
         mapping = aes(x = price, y = carat, colour = cut)) +
    geom_point() +
    geom_line(data = tibble(price = x_pred, carat = y_pred.lin), size = 1, col = "blue") +
    geom_line(data = tibble(price = x_pred, carat = y_pred.sq), size = 1, col = "red") +
    geom_line(data = tibble(price = x_pred, carat = y_pred.cub), size = 1, col = "green") +
    scale_color_brewer(palette = input$colInput) +
    theme_minimal()
})
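The fit-then-predict pattern can also be practised outside Shiny. The sketch below uses base R only and the built-in cars dataset (a substitution for this illustration, not the diamonds data used in the lab) to fit a model with a square term and predict over an evenly spaced grid.

```r
# Fit a regression with a square term on the built-in cars data
model.sq <- lm(dist ~ speed + I(speed^2), data = cars)

# Evenly spaced grid over the observed range of the predictor
x_pred <- seq(min(cars$speed), max(cars$speed), length.out = 50)

# The column name in newdata must match the predictor name in the formula
y_pred <- predict(model.sq, newdata = data.frame(speed = x_pred))

# The prediction grid, ready to be drawn as a line
pred_line <- data.frame(speed = x_pred, dist = y_pred)
head(pred_line)
```

The key detail is the naming of the newdata column: if it does not match the predictor in the formula, predict() cannot find the values it needs.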
OPTIONAL: Adding reactivity to used statistical functions
So far, changes to these regression functions occur only through changes to the data they use, driven by your interactive components (sliders, selectors etc.). However, it is possible to use these reactive components to directly affect either the statistical methods used or the way in which they are carried out. In this section, we will examine only the example of changing the statistical method used / displayed. We will also demonstrate how to aid the communication of your regression lines by choosing which regression lines to display, in essence decluttering your plot and aiding a user’s understanding.
Let us firstly begin with the visibility of different regression lines.
The way these lines are included in or removed from this example is by changing their size from 0 to 1 or vice versa, since at size 0 they become invisible. This, as you will see, is achieved through the use of the checkboxInput() function. It uses simple binary (TRUE/FALSE) logic to communicate to the graph whether the line should have a size of 0 or 1. This works because R interprets FALSE values as 0 and TRUE values as 1.
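The coercion this trick relies on can be checked directly in the console. In the sketch below, show_line is a made-up named vector standing in for the three checkbox values.

```r
# Stand-ins for three checkboxInput() values (ticked = TRUE)
show_line <- c(linear = TRUE, square = FALSE, cubic = TRUE)

# Logical values are coerced to numbers: TRUE becomes 1, FALSE becomes 0
line_size <- as.numeric(show_line)
line_size

# The same coercion happens implicitly in arithmetic
TRUE + TRUE
FALSE * 5
```

If you want visible lines to be thicker than 1, multiplying the checkbox value by the desired size works in the same way.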
This can be added in the same way in which we have added the other reactive components, using the following general template for the ui.R tab:
checkboxInput(inputId = ??,
label = ??,
value = ??)
This is applied in the server.R tab in the same way as components like colour, by replacing the size value of 1 with input$ followed by your inputId for this reactive component.
Add a checkboxInput() for each of the three regression lines, and the required components within your ggplot function.
# Reactive checkboxInputs:
checkboxInput("regInput", "Linear regression", value = FALSE, width = NULL),
checkboxInput("sqInput", "Square regression", value = FALSE, width = NULL),
checkboxInput("cubicInput", "Cubic regression", value = FALSE, width = NULL),
# your ggplot function in your server.R tab should look like this:
ggplot(data = dia.filtered,
mapping = aes(x = price, y = carat, colour = cut)) +
geom_point() +
geom_line(data = tibble(price = x_pred, carat = y_pred.lin), size = input$regInput, col = "blue") +
geom_line(data = tibble(price = x_pred, carat = y_pred.sq), size = input$sqInput, col = "red") +
geom_line(data = tibble(price = x_pred, carat = y_pred.cub), size = input$cubicInput, col = "green") +
scale_color_brewer(palette = input$colInput) +
theme_minimal()
})
It is also possible to add reactive components directly to your statistical functions. These work in the same way as the reactive components we have previously discussed, in that a user directly supplies an input which is then used within the method. This could include, but is not limited to, the polynomial degree used (e.g., including a square term, a cubic term or even beyond) or the number of folds used within a K-fold cross-validation of a classification or regression model. In theory, any statistical method with a setting that can be specified can become reactive. Disclaimer: in practice, this is sometimes a lot trickier than at other times. We will not try this within this lab, but there are plenty of examples to be found online; see below for links.
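As a rough illustration of such a reactive setting, the sketch below lets a single number choose the polynomial degree of the model. It uses simulated data in base R so that it runs on its own; the names `sim`, `fit_poly` and `degree` are hypothetical, and in an app the degree could come from an input such as a sliderInput() with an inputId of your choosing.

```r
# Simulated stand-in for the filtered diamonds data (assumption: two numeric
# columns called price and carat, as in the lab).
set.seed(1)
sim <- data.frame(price = runif(200, min = 300, max = 18000))
sim$carat <- 0.2 + 1e-4 * sim$price + rnorm(200, sd = 0.1)

# Hypothetical helper: regress carat on a polynomial in price of a given degree.
# poly() builds orthogonal polynomial terms; the fitted values match the
# I(price^2) style used in this lab, only the coefficients are scaled differently.
fit_poly <- function(data, degree) {
  lm(carat ~ poly(price, degree), data = data)
}

model.deg2 <- fit_poly(sim, degree = 2)
model.deg3 <- fit_poly(sim, degree = 3)

# Number of estimated coefficients: the intercept plus one per polynomial term
length(coef(model.deg2))  # 3
length(coef(model.deg3))  # 4
```

Inside server.R, the fixed `degree = 2` would be replaced by the reactive input value, so the model is refitted whenever the user changes it.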
OPTIONAL: Presenting statistical output in tables
Although presenting your results visually is valuable, a plot on its own limits the information your user can gather from it, as it displays no formal evidence of the strength of fit or of other important values. Presenting your model results in a table can therefore be incredibly useful.
In order to add a table to the Main Panel of your application, use the function tableOutput() within your mainPanel() function in your ui.R tab. When called with the name of the output table (surrounded by " "), it presents a basic table within your application. This is matched behind the scenes in the server.R tab by adding the following as part of the shinyServer() function, but separately from the output$diaPlot variable we produced earlier:
output$stattable <- renderTable({})
When a table, data frame or matrix is called directly inside renderTable({}) (simply by typing its name), it is printed as a table in your UI. At this point, you may be a little unsure of how to continue, as you have three models from which you usually extract the results using the summary() function. These results are typically not stored within the list formed by the model itself. By storing the summary() as a list (via model.lin.sum <- summary(model.lin)), you are able to directly call the specific components of the results summary, such as the degrees of freedom, the F-statistic and the R-squared value, which you can then use within your table.
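The components mentioned above can be inspected as follows. This sketch uses the built-in `cars` dataset rather than the diamonds data so that it runs without extra packages; `model.example` is a hypothetical name, while the element names are those of a standard summary() of an lm model.

```r
# Fit a simple linear model on a built-in dataset (stand-in for model.lin)
model.example <- lm(dist ~ speed, data = cars)
model.example.sum <- summary(model.example)

# The summary is a list, so its elements can be called with $
names(model.example.sum)

model.example.sum$r.squared      # R-squared
model.example.sum$adj.r.squared  # adjusted R-squared
model.example.sum$fstatistic     # F value, numerator df, denominator df
model.example.sum$df             # df[2] is the residual degrees of freedom
```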
Add a tableOutput() in addition to a relevant sub-heading (using h4()) in your ui.R tab, and use the template below to create the server.R side of this table output.
Note: When creating the table, it will directly use all digits within the variable. Wrapping any variable in the round() function (for example, round(mean(diamonds$price), digits = 2) rounds this value to 2 decimal places) can help neaten up any tables produced.
# Model Table output
output$stattable <- renderTable({
## Copy and paste your data preprocessing (via the filter function) here
## Copy and paste your model functions here
model.lin.sum <- summary() # Add your specific models in this function
model.sq.sum <- summary() # Add your specific models in this function
model.cub.sum <- summary() # Add your specific models in this function
tablemodelres <- matrix(c("Linear regression", "Square regression", "Cubic regression",
?? # Call the r-squared value from each model summary list
?? # Call the Adj r-squared value from each model summary list
?? # Call the df value from each model summary list
), ncol = 4)
# Using colnames(tablemodelres), name your table columns accordingly
colnames(tablemodelres) <- c()
tablemodelres
})
# ui.R tab
# Sub-Heading
h4("Statistical Analysis"),
# Statistical Analysis Outcome
tableOutput("stattable"),
# server.R
# Model Table output
output$stattable <- renderTable({
filtered <-
diamonds %>%
filter(price >= input$priceInput[1],
price <= input$priceInput[2],
carat >= input$caratInput[1],
carat <= input$caratInput[2],
cut == input$cutInput
)
model.lin <- lm(carat ~ price, data = filtered)
model.sq <- lm(carat ~ price + I(price^2), data = filtered)
model.cub <- lm(carat ~ price + I(price^2) + I(price^3), data = filtered)
model.lin.sum <- summary(model.lin)
model.sq.sum <- summary(model.sq)
model.cub.sum <- summary(model.cub)
tablemodelres <- matrix(c("Linear regression", "Square regression", "Cubic regression",
round(model.lin.sum$r.squared, 3), round(model.sq.sum$r.squared, 3),
round(model.cub.sum$r.squared, 3), round(model.lin.sum$adj.r.squared, 3),
round(model.sq.sum$adj.r.squared, 3), round(model.cub.sum$adj.r.squared, 3),
model.lin.sum$df[2], model.sq.sum$df[2], model.cub.sum$df[2]), ncol = 4)
colnames(tablemodelres) <- c(" ", "R-squared", "Adj R-squared", "df")
tablemodelres
})
At this point, you should have a reactive table whose results change depending on the input values.
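Since a data frame can also be called inside renderTable(), one alternative to the character matrix is to collect the results in a data frame, which keeps the numeric columns numeric. The following is only a sketch, using simulated data in place of the filtered diamonds; `sim` and `get_row` are hypothetical names.

```r
# Simulated stand-in for the filtered diamonds data
set.seed(1)
sim <- data.frame(price = runif(200, min = 300, max = 18000))
sim$carat <- 0.2 + 1e-4 * sim$price + rnorm(200, sd = 0.1)

model.lin <- lm(carat ~ price, data = sim)
model.sq  <- lm(carat ~ price + I(price^2), data = sim)
model.cub <- lm(carat ~ price + I(price^2) + I(price^3), data = sim)

# Hypothetical helper: one row of results per model
get_row <- function(label, model) {
  s <- summary(model)
  data.frame(Model = label,
             R.squared = round(s$r.squared, 3),
             Adj.R.squared = round(s$adj.r.squared, 3),
             df = s$df[2])
}

# Stack the three rows into one table
tablemodelres <- rbind(
  get_row("Linear regression", model.lin),
  get_row("Square regression", model.sq),
  get_row("Cubic regression", model.cub)
)
tablemodelres
```

In the app, `tablemodelres` would be the last line inside renderTable({}), exactly as with the matrix version.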
OPTIONAL: Presenting reactive text
Finally, although it is incredibly helpful to have a table of results for your statistical models, such a table can be of limited use to users who are not statistically minded. It is therefore useful to provide a reactive text output that explains the results and adapts to the data provided. For example, as you can see in the example we are working towards, the cubic model almost always remains our best fitting model. However, this may not be the case for all datasets, so it is useful to know how to create text which changes depending on the results. It should be noted at this stage that there are multiple ways of completing this step, and the method presented here is just one of many; consider which method best suits the way you wish to present your data.
Before we get to the nitty gritty, it is useful to highlight that this UI output is created in the same way as both the graph and the table we have previously discussed. It requires an output function, in this case textOutput(), within the ui.R tab, in which the name of the output is called. This is matched with the outputId as part of the server.R tab, which uses a render function, in this case renderText({}), to produce the text.
The internal structure (the nitty gritty bit) of the renderText({}) function is also similar to the previous ones: it is made up of the reactive sections, followed by a final code section dedicated to the output, which in this case is produced by the function paste0().
Note that if you want to add new lines in your output, you need to use htmlOutput("statout") in your ui.R tab. In that case, you can use the HTML tag <br /> inside the paste0() function in server.R, which will print a new line in your output.
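A minimal sketch of this pattern is shown below. It assumes the outputId statout from the note above; the text itself is a hypothetical placeholder, and the app is only launched when run in an interactive session.

```r
library(shiny)

# Build a two-line message; <br /> is the HTML line break
two_lines <- paste0("First line of commentary.", "<br />",
                    "Second line of commentary.")

ui <- fluidPage(
  # htmlOutput() renders the string as HTML, so <br /> becomes a new line
  htmlOutput("statout")
)

server <- function(input, output) {
  output$statout <- renderText({
    two_lines
  })
}

# Launch the app only when run interactively
if (interactive()) {
  runApp(shinyApp(ui = ui, server = server))
}
```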
To start with, let us say you would like to simply print your results as text, say in an APA format. To do this, you would aim for a structure like this: R-square = r.squared, F(fstatistic[2], fstatistic[3]) = fstatistic[1], p.., where the F value and its two degrees of freedom are all taken from the fstatistic element of the model summary.
For example, if I would like to print the sentence “The Mean Price of this Filtered Diamond Dataset is…”, I would use the following code within the renderText({}) function.
output$statout <- renderText({
## All my lovely filtering... (to the dataset dia.filtered)
paste0("The Mean Price of this Filtered Diamond Dataset is: ", round(mean(dia.filtered$price)))
})
As you can see from this example, any generic text which is to be displayed is contained within " ", with any functions or reactive components being called outside of the quotes, and with each chunk being separated by a comma (,).
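Applying the same paste0() pattern to an APA-style summary could look like the sketch below. It uses the built-in `cars` dataset so that it runs on its own. The p-value is not stored as an element of the summary list, so here it is computed from the F-statistic with pf(); `model.example`, `p.value` and `apa.text` are hypothetical names.

```r
model.example <- lm(dist ~ speed, data = cars)
model.example.sum <- summary(model.example)

# fstatistic holds: [1] the F value, [2] numerator df, [3] denominator df
f <- model.example.sum$fstatistic

# The upper-tail probability of the F distribution gives the model p-value
p.value <- pf(f[1], f[2], f[3], lower.tail = FALSE)

# Quoted text and computed values alternate, separated by commas
apa.text <- paste0("R-squared = ", round(model.example.sum$r.squared, 3),
                   ", F(", f[2], ", ", f[3], ") = ", round(f[1], 3),
                   ", p = ", signif(p.value, 3))
apa.text
```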
# Model Statistical Output (Text)
output$statout <- renderText({
## Copy and paste your data preprocessing (via the filter function) here
## Copy and paste your model functions here
paste0("Explaining the association between Diamond Price, Carat and Cut associations were found via the linear model: ", )
})
# ui.R
# Text Commentary
textOutput("statout")
# server.R
# Model Statistical Output (Text)
output$statout <- renderText({
filtered <-
diamonds %>%
filter(price >= input$priceInput[1],
price <= input$priceInput[2],
carat >= input$caratInput[1],
carat <= input$caratInput[2],
cut == input$cutInput
)
model.lin <- lm(carat ~ price, data = filtered)
model.sq <- lm(carat ~ price + I(price^2), data = filtered)
model.cub <- lm(carat ~ price + I(price^2) + I(price^3), data = filtered)
model.lin.sum <- summary(model.lin)
model.sq.sum <- summary(model.sq)
model.cub.sum <- summary(model.cub)
paste0("Explaining the association between Diamond Price, Carat and Cut associations were found via the Linear Model: ", round(model.lin.sum$adj.r.squared, 3), ", F(", model.lin.sum$fstatistic[2], ", ", model.lin.sum$fstatistic[3], ") = ",
round(model.lin.sum$fstatistic[1], 3), ".")
})
Although it is useful to display the generic statistics, those not familiar with how to interpret them will also benefit from some reactive text which explains which model is the best. This requires logic-driven statements (if ... else statements) to determine which model is best. This in itself (outside of R Shiny) can be quite complex, so I hope this summary is useful!
Working backwards, the aim of this text block is to present to the user which model best describes the data. In this case it is useful to compare the Adjusted R-squared values to see which is largest. Although there are many ways to do this, one method I like is based upon decision-tree logic (covered more generally in next week's lecture), in which a value is checked against a condition in order to determine whether it meets that requirement.
In this case we can propose the following logic tree
By following these steps we can determine which Adjusted R-Squared is best.
To achieve this in R we need to use a process called the if-else ladder, which is used when there are more than two outcomes (here there are three). In general it is structured as follows. Please note that text between square brackets ([ ]) should be replaced by your variables; the descriptions here are specific to this example for context.
if(([linear model adj r squared variable] > [sq model adj r square ]) &
([linear model adj r squared variable] > [cubic model adj r square])){
[outcome 1]
} else if(([sq model adj r squared variable] > [lin model adj r square ] ) &
([sq model adj r squared variable] > [cubic model adj r square])){
[outcome 2]
} else if(([cubic model adj r squared variable] > [sq model adj r square ]) &
([cubic model adj r squared variable] > [linear model adj r square])){
[outcome 3]
}
In this example, I would replace each [outcome X] placeholder with code which assigns the model identified as having the largest value to a new model variable. This variable can then be called later in the text, allowing the text to adapt to the results.
As a result, it is possible to add this if-else ladder to your renderText() function, in order to allow you to have interpretation-based text output.
In your renderText() function, add in an if-else ladder to allow you to produce interpretation-driven comments like those seen within the example application.
Note: Add the following code before your ladder to allow you to call on the name of the specific model within your interpretation section:
model.lin.sum[["name"]] <- "Linear regression"
model.sq.sum[["name"]] <- "Square regression"
model.cub.sum[["name"]] <- "Cubic regression"
# server.R
output$statout <- renderText({
filtered <-
diamonds %>%
filter(price >= input$priceInput[1],
price <= input$priceInput[2],
carat >= input$caratInput[1],
carat <= input$caratInput[2],
cut == input$cutInput
)
model.lin <- lm(carat ~ price, data = filtered)
model.sq <- lm(carat ~ price + I(price^2), data = filtered)
model.cub <- lm(carat ~ price + I(price^2) + I(price^3), data = filtered)
model.lin.sum <- summary(model.lin)
model.sq.sum <- summary(model.sq)
model.cub.sum <- summary(model.cub)
model.lin.sum[["name"]] <- "Linear regression"
model.sq.sum[["name"]] <- "Square regression"
model.cub.sum[["name"]] <- "Cubic regression"
# Best fitting model, with R-squared
if((model.lin.sum$adj.r.squared > model.sq.sum$adj.r.squared) & (model.lin.sum$adj.r.squared > model.cub.sum$adj.r.squared)){
model.text.out <- model.lin.sum
} else if((model.sq.sum$adj.r.squared > model.lin.sum$adj.r.squared) & (model.sq.sum$adj.r.squared > model.cub.sum$adj.r.squared)){
model.text.out <- model.sq.sum
} else if ((model.cub.sum$adj.r.squared > model.lin.sum$adj.r.squared ) & (model.cub.sum$adj.r.squared > model.sq.sum$adj.r.squared)){
model.text.out <- model.cub.sum
}
paste0("The regression analysis which most accounts for the relationship is: ", model.text.out$name, ". ",
"With an Adjusted R-squared of: ", round(model.text.out$adj.r.squared, 3), ", F(",
model.text.out$fstatistic[2], ", ", model.text.out$fstatistic[3],
") = ", round(model.text.out$fstatistic[1],3), ".")
})
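As mentioned, there are other ways to select the best model. One of them is to put the three adjusted R-squared values in a vector and use which.max(). The sketch below uses simulated data and is an alternative to the ladder, not the method used in the example application. One difference is that which.max() always returns a result (the first maximum) if two values tie, whereas the ladder assigns nothing in that case. `sim`, `summaries`, `adj.values`, `best` and `best.name` are hypothetical names.

```r
# Simulated stand-in for the filtered diamonds data
set.seed(1)
sim <- data.frame(price = runif(200, min = 300, max = 18000))
sim$carat <- 0.2 + 1e-4 * sim$price + rnorm(200, sd = 0.1)

# Named list of model summaries; the names double as display labels
summaries <- list(
  "Linear regression" = summary(lm(carat ~ price, data = sim)),
  "Square regression" = summary(lm(carat ~ price + I(price^2), data = sim)),
  "Cubic regression"  = summary(lm(carat ~ price + I(price^2) + I(price^3),
                                   data = sim))
)

# Extract the adjusted R-squared of each model and find the largest
adj.values <- sapply(summaries, function(s) s$adj.r.squared)
best <- which.max(adj.values)
model.text.out <- summaries[[best]]
best.name <- names(adj.values)[best]

paste0("The regression analysis which most accounts for the relationship is: ",
       best.name, ". With an Adjusted R-squared of: ",
       round(model.text.out$adj.r.squared, 3), ".")
```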
If you have managed to successfully complete all these steps, you should now have a Shiny application which looks like the example. If not, you can double check your code against the code used here.
Even more advanced Shiny options
As this practical is aimed at being an introduction to Shiny applications, there are several topics which we have only touched upon. We actively encourage you to play around with them, read more about them and watch the tutorials in the links provided to further develop your understanding. These include:
As you can tell when first looking at the Mastering Shiny textbook, it is still in development, although Shiny itself has been around for a while. The use of Shiny applications is growing steadily within data science driven workplaces, due to their convenience in helping to automate the flow of challenging statistical and processing tasks. The more you use, interact with and experiment with building and editing Shiny applications, the better you will get. And, like everything in R, you are really only limited by your imagination. Check out some incredible examples of Shiny applications here.
Closing Note: Single vs Multiple File Formats?
By this point, you should have successfully built a Shiny application using the multiple file format. However, it is easy to convert between the two file formats, with only minor modifications being required.
When creating a new single-file Shiny application using the RStudio GUI, you will see that it once again provides a template which creates the Old Faithful Geyser Data histogram. The application is still clearly divided into two sections, ui and server. However, rather than the shinyUI() and shinyServer() functions being used, these sections are instead created as the variables ui and server, before the following code is run at the end of the document:
# Run the application
shinyApp(ui = ui, server = server)
As such, to convert your multiple file format to a single application file, simply copy and paste everything within the shinyUI() and shinyServer() functions into their relevant sections, and there you have it: a single file format.
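As a rough sketch, a single-file application has the following shape. The title and the text output are hypothetical placeholders, not the diamonds app from this lab; the contents of your own shinyUI() and shinyServer() functions would go in their place.

```r
# app.R
library(shiny)

# Everything that was inside shinyUI() goes here
ui <- fluidPage(
  titlePanel("My single-file app"),
  textOutput("message")
)

# Everything that was inside shinyServer() goes here
server <- function(input, output) {
  output$message <- renderText({
    "Hello from a single-file Shiny app"
  })
}

# Create the application object. It is assigned to a name here so that simply
# sourcing this sketch does not launch the app; the RStudio template ends with
# the bare call shinyApp(ui = ui, server = server) instead.
app <- shinyApp(ui = ui, server = server)
```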
In the grand scheme of things, coding your application as a single file or as multiple files is generally equivalent. The benefits are seen predominantly on the organisational side of programming rather than the functional side, as the code achieves the same outcome regardless of the format. As a result, it is up to you as the programmer which you prefer, and how any clients, supervisors or associates would like it produced!