index.Rmd example
This R Markdown file should contain your analysis and be saved in the root directory as index.Rmd.
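The file shown below leaves most of its chunks empty and fills them from an external script with `knitr::read_chunk("analysis.R")`. As a rough sketch of how that mechanism works, the snippet writes a small stand-in script to a temporary file, registers its labelled sections, and lists the labels knitr now knows about. The script body and the `analysis-summary` label are placeholders made up for illustration; they are not the contents of the real `analysis.R`.

```r
library(knitr)

# Hypothetical stand-in for analysis.R (placeholder code, not the real script).
# Each "## ---- label ----" comment starts a named section of code.
script <- tempfile(fileext = ".R")
writeLines(c(
  "## ---- analysis-setup ----",
  "x <- c(1, 2, 3)",
  "",
  "## ---- analysis-summary ----",
  "mean(x)"
), script)

# Register the labelled sections with knitr
read_chunk(script)

# Labels that empty chunks in an Rmd file can now refer to
labels <- names(knit_code$get())
print(labels)
```

An empty chunk in the Rmd whose label matches one of these sections, for example `{r analysis-setup}`, is then filled with the corresponding code when the document is knitted.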
My final index.Rmd looks like this:
---
title: "Analysis of NEON Woody plant vegetation structure data"
author: "Anna Krystalli"
date: "2021-05-05"
output:
  html_document:
    toc: true
    toc_float: true
    theme: cosmo
    highlight: zenburn
---
# Background
<img src="https://data.neonscience.org/data-products/static/media/NSF-NEON-logo.192b6661.png" width="40%">
The [NEON Woody plant vegetation structure dataset](https://data.neonscience.org/data-products/DP1.10098.001) contains structure measurements, including height, canopy diameter, and stem diameter, as well as mapped position of individual woody plants across the survey area.
This data product contains the quality-controlled, native sampling resolution data from in-situ measurements of live and standing dead woody individuals and shrub groups, from all terrestrial NEON sites with qualifying woody vegetation. The exact measurements collected per individual depend on growth form, and these measurements are focused on enabling biomass and productivity estimation, estimation of shrub volume and biomass, and calibration / validation of multiple NEON airborne remote-sensing data products.
Our analyses focus on the **relationship between individual stem height and diameter** and how that relationship **varies across growth forms**.
# Data
The data were downloaded from the NEON data portal and processed into a single table using the script `data-raw/individual.R`.
```{r, echo = FALSE}
knitr::opts_chunk$set(message = FALSE)
```
```{r read-chunks, echo=FALSE}
knitr::read_chunk("analysis.R")
```
## Read in data and setup analysis
First we read in the data and select only the columns we are interested in, i.e. `stem_diameter`, `height` and `growth_form`.
```{r analysis-setup}
```
```{r}
summary(individual)
```
## Prepare data
To prepare the data we exclude rows for which the value of `growth_form` was `NA` or `liana`.
```{r analysis-filter-data}
```
We also convert `growth_form` to a factor and set the levels according to ascending counts of each level in the raw data.
```{r analysis-set-factor-levels}
```
```{r, echo=FALSE}
DT::datatable(analysis_df, caption = "Table 1: Prepared analysis data")
```
## Data properties
### Statistical summaries of variables
```{r}
summary(analysis_df)
```
```{r analysis-fig1-barplot, fig.cap="Figure 1: Counts of growth forms"}
```
```{r analysis-fig2-violinplots, fig.cap="Figure 2: Distribution and statistical summaries of stem_diameter and height across growth_forms"}
```
# Analysis
## Modelling overall `stem_diameter` as a function of `height`
Initially we fit a linear model of `log(stem_diameter)` as a function of `log(height)`.
```{r analysis-lm-overall}
```
Our model is statistically significant but explains only a modest proportion of the variation, as indicated by an `r.squared` of `r broom::glance(lm_overall)$r.squared`.
```{r analysis-lm-fig3-overall, fig.cap="Figure 3: Log stem diameter as a function of log height"}
```
However, plotting our data reveals subgroups in the data. We can examine whether including `growth_form` in our analysis improves the model fit by capturing variation that arises from differing relationships across growth forms.
## Including an interaction with `growth_form`
We fit another model, this time including an interaction term with the variable `growth_form`.
```{r analysis-lm-growth}
```
Our model is still significant but this time explains a larger proportion of the variation (`r broom::glance(lm_growth)$r.squared`).
```{r analysis-lm-fig4-growth, fig.cap="Figure 4: Log stem diameter as a function of the interaction of log height and growth form"}
```
# Session Info
```{r}
sessionInfo()
```
and renders to: