rmarkdown
Programming paradigm first introduced by Donald E. Knuth.
Treat a program as a piece of literature understandable to human beings
move away from writing programs in the manner and order imposed by the computer
focus instead on the logic and flow of human thought and understanding
single document to integrate data analysis (executable code) with textual documentation, linking data, code, and text
Increasing data collection throughput; data are more complex and high-dimensional
Existing databases can be merged to become bigger databases
Computing power allows more sophisticated analyses, even on “small” data
For every field “X” there is a “Computational X”
Even basic analyses difficult to describe
Errors more easily introduced into long analysis pipelines
Knowledge transfer is inhibited
Results are difficult to replicate or reproduce
Complicated analyses cannot be trusted
Reproducibility has the potential to serve as a minimum standard for judging scientific claims when full independent replication of a study is not possible.
… highlight problems with users jumping straight into software implementations of methods (e.g. in R) that may lack documentation on biases and assumptions that are mentioned in the original papers.
To help solve these problems, we make a number of suggestions including providing blog posts or videos to explain new methods in less technical terms, encouraging reproducibility and code sharing, making wiki-style pages summarising the literature on popular methods, more careful consideration and testing of whether a method is appropriate for a given question/data set, increased collaboration, and a shift from publishing purely novel methods to publishing improvements to existing methods and ways of detecting biases or testing model fit. Many of these points are applicable across methods in ecology and evolution, not just phylogenetic comparative methods.
rmarkdown (.Rmd) integrates:
– a documentation language (.md)
– a programming language (R)
Combine tools, processes and outputs into interactive evidence streams that are easily shareable, particularly through the web.
.md to html: the user can focus on communicating & disseminating.
Markdown is intended to be as easy-to-read and easy-to-write as possible.
most powerful as a format for writing to the web.
syntax is very small, corresponding only to a very small subset of HTML tags.
clean and legible across platforms (even mobile) and outputs.
formatting handled automatically
html markup language also handled.
Code chunks are defined through special notation and executed in sequence. Execution of individual chunks is controllable.
knitr
Can read appropriately annotated .R scripts in and call them within an .Rmd
Knit together through package knitr
to
Many great packages and applications build on rmarkdown.
All this makes it incredibly versatile. Check out the gallery.
Simple interface to powerful modern web technologies and libraries
Publish rendered rmarkdown documents on the web with the click of a button, for free!
Rmd documents
Can be useful for a number of research related materials
Useful features: - bibliographies and citations
Throughout this workshop, we’ll be working with the gapminder dataset to produce a reproducible Rmarkdown vignette of our work.
.Rmd
Before knitting, the document needs to be saved. Give it a useful name, e.g. gapminder.Rmd
Render the document by clicking on the knit button.
You can also render .Rmd documents to html using the rmarkdown function render()
rmarkdown::render(input = "gapminder.Rmd")
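As a self-contained sketch of the same call, the snippet below writes a throwaway .Rmd to a temporary directory and renders it. The file name `gapminder-demo.Rmd` and its contents are made up for illustration, and the call is guarded because `render()` relies on pandoc being available (it ships with RStudio).

```r
# Minimal sketch: create a tiny .Rmd and render it to html.
rmd_lines <- c(
  "---",
  'title: "Gapminder"',
  "output: html_document",
  "---",
  "",
  "Some *markdown* text."
)
rmd_path <- file.path(tempdir(), "gapminder-demo.Rmd")  # hypothetical file name
writeLines(rmd_lines, rmd_path)

# render() needs pandoc; skip the rendering step if it cannot be found
if (rmarkdown::pandoc_available()) {
  html_path <- rmarkdown::render(
    input = rmd_path,
    output_format = "html_document",
    quiet = TRUE
  )
  print(html_path)  # path of the rendered .html file
}
```

`render()` returns the path of the output file, which is handy when the rendering is scripted rather than triggered from the knit button.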
Register an account on RPubs
Publish your rendered document (don’t worry, you can delete or overwrite it later)
install.packages(c("rmarkdown", "tidyverse", "plotly", "DT", "reprex"))
html_document
---
title: "Untitled"
author: "Anna Krystalli"
date: "3/23/2018"
output: html_document
---
---
title: "Untitled"
author: "Anna Krystalli"
date: "3/23/2018"
output:
  html_document:
    toc: true
    toc_float: true
---
Specify bootswatch themes.
---
title: "Untitled"
author: "Anna Krystalli"
date: "3/23/2018"
output:
  html_document:
    toc: true
    toc_float: true
    theme: cosmo
---
---
title: "Untitled"
author: "Anna Krystalli"
date: "3/23/2018"
output:
  html_document:
    toc: true
    toc_float: true
    theme: cosmo
    highlight: zenburn
---
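The same options can also be set from R rather than in the YAML header. The sketch below builds an output format object with `rmarkdown::html_document()` using the option values from the headers above; passing it to `render()` is shown only as a comment, since it assumes a `gapminder.Rmd` file exists in the working directory.

```r
# Minimal sketch: the YAML options expressed as arguments to html_document()
fmt <- rmarkdown::html_document(
  toc = TRUE,            # table of contents
  toc_float = TRUE,      # float the table of contents beside the text
  theme = "cosmo",       # bootswatch theme
  highlight = "zenburn"  # syntax highlighting style
)

# The format object can then be handed to render(), e.g. (not run here):
# rmarkdown::render("gapminder.Rmd", output_format = fmt)
class(fmt)
```

Options passed this way take the place of the `output:` field in the header, which can be convenient when one document is rendered with several different settings.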
Clear everything BELOW THE YAML header. You should be left with just this:
---
title: "Gapminder"
author: "Anna Krystalli"
date: "3/23/2018"
output: html_document
---
add a floating table of contents
set a theme of your choice (see available themes here and the associated bootstrap styles here)
normal text
normal text
*italic text*
italic text
**bold text**
bold text
***bold italic text***
bold italic text
rmarkdown
# Header 1
## Header 2
### Header 3
#### Header 4
##### Header 5
###### Header 6
rendered html
rmarkdown
- first item in the list
- second item in list
- third item in list
rendered html
rmarkdown
1. first item in the list
1. second item in list
1. third item in list
rendered html
rmarkdown
> this text will be quoted
rendered html
this text will be quoted
rmarkdown
`this text will appear as code` inline
rendered html
this text will appear as code
inline
a <- 10
rmarkdown
the value of parameter *a* is `r a`
rendered html
the value of parameter a is 10
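To see inline evaluation happen outside of RStudio, the sketch below knits a small piece of R Markdown passed as text. The backticks are assembled with `strrep()` and `paste0()` purely to keep the example readable; the document text itself mirrors the example above.

```r
# Minimal sketch: knit a chunk plus a line of text containing inline R code
fence <- strrep("`", 3)  # chunk delimiter (three backticks)
tick <- "`"              # inline code delimiter

rmd_text <- c(
  paste0(fence, "{r}"),
  "a <- 10",
  fence,
  "",
  paste0("the value of parameter *a* is ", tick, "r a", tick)
)

# knit() evaluates the chunk, then substitutes the inline expression
md <- knitr::knit(text = rmd_text, quiet = TRUE)
cat(md)
```

The inline expression is replaced by its value in the knitted markdown, so the text always reflects whatever `a` holds when the document is rendered.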
Provide either a path to a local image file or the URL of an image.
rmarkdown
![](assets/cheat.png)
rendered html
html in rmarkdown
<img src="assets/cheat.png" width="200px" />
rendered html
rmarkdown
Table Header | Second Header
------------- | -------------
Cell 1 | Cell 2
Cell 3 | Cell 4
rendered html
Table Header | Second Header |
---|---|
Cell 1 | Cell 2 |
Cell 3 | Cell 4 |
Check out handy online .md table converter
rmarkdown
[Download R](http://www.r-project.org/)
[RStudio](http://www.rstudio.com/)
rendered html
Supports mathematical notations through MathJax.
You can write LaTeX math expressions inside a pair of dollar signs, e.g. $\alpha+\beta$
renders \(\alpha+\beta\). You can use the display style with double dollar signs:
$$\bar{X}=\frac{1}{n}\sum_{i=1}^nX_i$$
Do some quick online research on Gapminder. A good place to start: https://www.gapminder.org/
Add a "Background" section using headers.
Write a short description of the Gapminder project (feel free to copy, paste and edit information).
Make use of markdown annotation to:
Add an image related to Gapminder.
R code chunks execute code.
They can also be used as a means to render R output into documents or to simply display code for illustration (e.g. with option eval=FALSE)
chunk notation in .rmd
```{r chunk-name}
print('hello world!')
```
rendered html code and output
print("hello world!")
## [1] "hello world!"
Chunks can be labelled with chunk names, names must be unique.
You can quickly insert chunks with:
- the keyboard shortcut Ctrl + Alt + I (OS X: Cmd + Option + I)
- by typing the chunk delimiters ```{r} and ```.
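Chunk options control, for example:
- whether the code is displayed (echo setting)
- whether the code is evaluated (eval setting)
- figure dimensions (fig.width and fig.height settings)
- whether warnings and messages are displayed (warning and message settings)
- whether results are cached (cache setting)
- whether the chunk is included when tangling code with purl (purl setting)
echo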
chunk notation in .rmd
```{r hide-code, echo=FALSE}
print('hello world!')
```
rendered html code and output
## [1] "hello world!"
eval
chunk notation in .rmd
```{r dont-eval, eval=FALSE}
print('hello world!')
```
rendered html code and output
print("hello world!")
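The effect of these two options can be checked programmatically. The sketch below knits the same one-line chunk twice via `knitr::knit(text = ...)`, once with `echo=FALSE` and once with `eval=FALSE`; the helper `knit_chunk()` is a hypothetical convenience function written for this example, not part of knitr.

```r
# Minimal sketch: compare knitted output under echo=FALSE and eval=FALSE
fence <- strrep("`", 3)  # chunk delimiter (three backticks)

# hypothetical helper: knit a single chunk with the given header
knit_chunk <- function(header) {
  knitr::knit(
    text = c(paste0(fence, "{r ", header, "}"),
             "print('hello world!')",
             fence),
    quiet = TRUE
  )
}

hidden  <- knit_chunk("hide-code, echo=FALSE")  # output only, no source
not_run <- knit_chunk("dont-eval, eval=FALSE")  # source only, no output

cat(hidden)
cat(not_run)
```

With `echo=FALSE` the source line is dropped but its printed result is kept; with `eval=FALSE` the source is shown but never executed.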
knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE)
For this exercise we’ll be accessing the gapminder data through the gapminder R package.
Add an "Installation" section using headers.
Write brief instructions (including code) for others to access the dataset in R. Have a look at the package documentation on GitHub for inspiration.
In R we often need to describe a setup procedure that involves specifying the installation of required packages. However, installation of packages is not handled in .Rmd! (For the moment, install packages through the console).
In our case, we’ll want to include the code for installing the gapminder package but not evaluate it in the .Rmd.
data.frames
data(airquality)
head(airquality)
## Ozone Solar.R Wind Temp Month Day
## 1 41 190 7.4 67 5 1
## 2 36 118 8.0 72 5 2
## 3 12 149 12.6 74 5 3
## 4 18 313 11.5 62 5 4
## 5 NA NA 14.3 56 5 5
## 6 28 NA 14.9 66 5 6
tibbles
library(tibble)
as_tibble(airquality)
## # A tibble: 153 x 6
## Ozone Solar.R Wind Temp Month Day
## <int> <int> <dbl> <int> <int> <int>
## 1 41 190 7.40 67 5 1
## 2 36 118 8.00 72 5 2
## 3 12 149 12.6 74 5 3
## 4 18 313 11.5 62 5 4
## 5 NA NA 14.3 56 5 5
## 6 28 NA 14.9 66 5 6
## 7 23 299 8.60 65 5 7
## 8 19 99 13.8 59 5 8
## 9 8 19 20.1 61 5 9
## 10 NA 194 8.60 69 5 10
## # ... with 143 more rows
Displaying knitr::kable() tables
library(knitr)
data(airquality)
kable(head(airquality), caption = "New York Air Quality Measurements")
Ozone | Solar.R | Wind | Temp | Month | Day |
---|---|---|---|---|---|
41 | 190 | 7.4 | 67 | 5 | 1 |
36 | 118 | 8.0 | 72 | 5 | 2 |
12 | 149 | 12.6 | 74 | 5 | 3 |
18 | 313 | 11.5 | 62 | 5 | 4 |
NA | NA | 14.3 | 56 | 5 | 5 |
28 | NA | 14.9 | 66 | 5 | 6 |
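`kable()` takes a few further arguments for tidying up a table. The sketch below uses `col.names`, `align` and `caption`; the column labels are illustrative choices for this example rather than anything prescribed by the dataset.

```r
# Minimal sketch: relabel and align the columns of a kable table
library(knitr)

tbl <- kable(
  head(airquality, 3),
  col.names = c("Ozone", "Solar radiation", "Wind speed",
                "Temperature", "Month", "Day"),  # illustrative labels
  align = "c",                                   # centre every column
  caption = "New York Air Quality Measurements"
)
tbl
```

Inside a chunk the table is rendered in the output document; in the console it prints as plain text.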
DT::datatable() tables
library(DT)
data(airquality)
datatable(airquality, caption = "New York Air Quality Measurements")
skimr::skim()
Provides a frictionless approach to displaying summary statistics that users can skim quickly to understand their data.
skimr::skim(airquality)
## Skim summary statistics
## n obs: 153
## n variables: 6
##
## Variable type: integer
## variable missing complete n mean sd p0 p25 median p75 p100
## Day 0 153 153 15.8 8.86 1 8 16 23 31
## Month 0 153 153 6.99 1.42 5 6 7 8 9
## Ozone 37 116 153 42.13 32.99 1 18 31.5 63.25 168
## Solar.R 7 146 153 185.93 90.06 7 115.75 205 258.75 334
## Temp 0 153 153 77.88 9.47 56 72 79 85 97
## hist
## ▇▇▇▇▆▇▇▇
## ▇▇▁▇▁▇▁▇
## ▇▆▃▃▂▁▁▁
## ▃▃▃▃▅▇▇▃
## ▂▂▃▆▇▇▃▃
##
## Variable type: numeric
## variable missing complete n mean sd p0 p25 median p75 p100
## Wind 0 153 153 9.96 3.52 1.7 7.4 9.7 11.5 20.7
## hist
## ▁▃▇▇▅▅▁▁
install.packages("gapminder")
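Add a "Dataset" section using headers.
Describe the dataset: what type of object it is (?class) and its dimensions (?dim, ?ncol etc).
Summarise the data (e.g. ?summary, ?skimr).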
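library(ggplot2)  # provides ggplot() and the diamonds dataset
set.seed(100)
d <- diamonds[sample(nrow(diamonds), 1000), ]
# the text aesthetic is ignored by ggplot2 itself but picked up by plotly tooltips
p <- ggplot(data = d, aes(x = carat, y = price)) +
  geom_point(aes(text = paste("Clarity:", clarity)), size = 1) +
  geom_smooth(aes(colour = cut, fill = cut)) +
  facet_wrap(~cut)
p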
Wraps nicely around plotting library ggplot2
library(plotly)
ggplotly(p)
Replicate some of the plots you produced earlier today with the gapminder data but hide the code that generates them.
Add a new plot of your own
Add some comments for each plot
.Rmd
R -> Rmd
You can read in chunks of code from an annotated .R (or any other language) script using knitr::read_chunk()
Chunks are defined by the following notation. Names must be unique.
# ---- descriptive-chunk-name1 ----
code("you want to run as a chunk")
# ---- descriptive-chunk-name2 ----
code("you want to run as a chunk")
.R
script hello-world.R
hello-world.R
# ---- demo-read_chunk ----
print("hello world")
hello-world.R
knitr::read_chunk("hello-world.R")
rmarkdown r chunk notation
```{r demo-read_chunk}
```
rendered html code and output
print("hello world")
## [1] "hello world"
knitr:::knit_code$get()
## $`demo-read_chunk`
## [1] "print(\"hello world\")"
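The whole round trip can be tried without leaving the console. The sketch below writes the annotated script to a temporary directory, reads it with `read_chunk()`, and looks the chunk up by its label; it uses the same internal `knitr:::knit_code` object as above, so treat that part as a convenience for inspection rather than a stable interface.

```r
# Minimal sketch: write an annotated script, read its chunks, inspect them
script <- file.path(tempdir(), "hello-world.R")
writeLines(
  c("# ---- demo-read_chunk ----",
    'print("hello world")'),
  script
)

knitr::read_chunk(script)

# look up the stored code by chunk label
code <- knitr:::knit_code$get("demo-read_chunk")
print(as.character(code))

# clear the stored chunks so they do not leak into later knitting
knitr:::knit_code$restore()
```

In an .Rmd the `read_chunk()` call would normally sit in a setup chunk, with the empty labelled chunks placed wherever the code should run.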
.Rmd
Rmd
-> R
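You can use knitr::purl() to tangle code out of an Rmd into an .R script. purl takes many of the same arguments as knit(). The most important additional argument is:
documentation: an integer specifying the level of documentation to go into the tangled script:
- 0: pure code, all text is discarded
- 1 (default): chunk headers are added to the code as comments
- 2: all text is added to the code as roxygen comments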
purl("file-to-extract-code-from.Rmd", documentation = 0)
purl
Here I’m running a loop to extract the code in demo-rmd.Rmd for each documentation level
file <- "demo-rmd.Rmd"
# tangle the same document once per documentation level (0, 1 and 2)
for (docu in 0:2) {
  knitr::purl(file,
              output = paste0(tools::file_path_sans_ext(file), "_", docu, ".R"),
              documentation = docu, quiet = TRUE)
}
demo-rmd_0.R
knitr::opts_chunk$set(echo = TRUE)
summary(cars)
plot(pressure)
demo-rmd_1.R
## ----setup, include=FALSE------------------------------------------------
knitr::opts_chunk$set(echo = TRUE)
## ----cars----------------------------------------------------------------
summary(cars)
## ----pressure, echo=FALSE------------------------------------------------
plot(pressure)
demo-rmd_2.R
#' ---
#' title: "Untitled"
#' author: "Anna Krystalli"
#' date: "3/23/2018"
#' output:
#'   html_document:
#'     toc: true
#'     toc_float: true
#'     theme: cosmo
#'     highlight: textmate
#'
#' ---
#'
## ----setup, include=FALSE------------------------------------------------
knitr::opts_chunk$set(echo = TRUE)
#'
#' ## R Markdown
#'
#'
#' This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see <http://rmarkdown.rstudio.com>.
#'
#' When you click the **Knit** button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
#'
## ----cars----------------------------------------------------------------
summary(cars)
#'
#' ## Including Plots
#'
#' You can also embed plots, for example:
#'
## ----pressure, echo=FALSE------------------------------------------------
plot(pressure)
#'
#' Note that the `echo = FALSE` parameter was added to the code chunk to prevent printing of the R code that generated the plot.
#'
#'
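For a version that runs without `demo-rmd.Rmd` on disk, the sketch below writes a small .Rmd to a temporary directory and tangles it at documentation level 0. The file names and document contents are made up for this example.

```r
# Minimal sketch: tangle a throwaway .Rmd into a pure-code .R script
fence <- strrep("`", 3)  # chunk delimiter (three backticks)

rmd <- file.path(tempdir(), "purl-demo.Rmd")  # hypothetical file name
writeLines(
  c("# A heading",
    "",
    "Some text.",
    "",
    paste0(fence, "{r cars}"),
    "summary(cars)",
    fence),
  rmd
)

out <- knitr::purl(
  rmd,
  output = file.path(tempdir(), "purl-demo.R"),
  documentation = 0,  # keep code only, drop text and chunk headers
  quiet = TRUE
)

tangled <- readLines(out)
print(tangled)
```

Changing `documentation` to 1 or 2 in this call reproduces the other two styles of script shown above.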
.R
script.R
script.R
Read the script into your .Rmd (?read_chunk()).
Call the chunks in your .Rmd workflow by labelling an empty chunk with your chunk(s) name(s).
Once your document is ready, try and extract the contents of your .Rmd into an .R script (?purl).
This snippet, copied from Twitter in the embed format,
<blockquote class="twitter-tweet" data-lang="en"><p lang="en" dir="ltr">How cool does this tweet look embedded in <a href="https://twitter.com/hashtag/rmarkdown?src=hash&ref_src=twsrc%5Etfw">#rmarkdown</a>! 😎</p>— annakrystalli (@annakrystalli) <a href="https://twitter.com/annakrystalli/status/977209749958791168?ref_src=twsrc%5Etfw">March 23, 2018</a></blockquote>
<script async src="https://platform.twitter.com/widgets.js" charset="utf-8"></script>
renders to this
How cool does this tweet look embedded in #rmarkdown! 😎
— annakrystalli (@annakrystalli) March 23, 2018
Embed gifs, videos and widgets in this way
To get help, you need a reproducible example
reprex
Use function reprex::reprex() to produce a reproducible example in a custom markdown format for the venue of your choice:
- "gh" for GitHub (default)
- "so" for StackOverflow
- "r" or "R" for a runnable R script, with commented output interleaved.
In the console, call the reprex function
reprex::reprex()
bookdown
Authoring with R Markdown. Offers:
The publication can be exported to HTML, PDF, and e-books (e.g. EPUB). Can even be used to write a thesis!
The workflowr R package makes it easier for researchers to organize their projects and share their results with colleagues.
Check out https://awesome-blogdown.com/, a curated list of awesome #rstats blogs in blogdown for inspiration!
Use Git and GitHub to manage, publish and collaborate on your work