---
title: "Penguins"
author: Your Name Here
date: May 2022
format: pdf
---
```{r}
#| label: load-packages
#| include: false
library(tidyverse)
library(palmerpenguins)
library(modelsummary)
library(fixest)
```
## Introduction
This report analyzes the `penguins` data from the palmerpenguins package. It will settle the longstanding debate in the literature over whether there is a positive correlation between the following variables:

- bill length
- bill depth

According to ..., the dataset contains size measurements for `r nrow(penguins)` penguins from three species observed on three islands in the Palmer Archipelago, Antarctica.
## Distribution of species
We can determine the number of penguins for each species easily with functions from the tidyverse...:
```{r}
penguins %>%
  count(species)
```
However, using `datasummary_skim()` from the package modelsummary... for the same task results in a nicely formatted table that is automatically embedded in this report.
```{r}
#| echo: false
# ...
```
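As a minimal sketch of how the species table could be produced, assuming that the categorical summary is what is wanted here, the call might look like the following. The name `species_only` is introduced purely for illustration and is not part of the original report.

```r
library(dplyr)
library(palmerpenguins)
library(modelsummary)

# Keep only the species column so the table reports species counts alone
# (species_only is a hypothetical helper name).
species_only <- penguins %>% select(species)

# type = "categorical" tabulates the levels of factor columns.
datasummary_skim(species_only, type = "categorical")
```

Restricting the data to a single column first keeps the other factor variables (such as island and sex) out of the table.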
## Visual exploration
At first glance, there seems to be a negative correlation between bill length and bill depth, if anything. However, broken down by species, the plot supports the notion of a positive correlation. This phenomenon, in which a trend within groups vanishes or reverses when the groups are combined, is an example of Simpson's paradox.
```{r}
#| label: fig-paradox
#| fig-cap: Simpson's paradox
#| echo: false
#| warning: false
ggplot(penguins, aes(bill_length_mm, bill_depth_mm)) +
  geom_point(aes(color = species, shape = species)) +
  scale_color_manual(values = c("darkorange", "purple", "cyan4")) +
  labs(
    title = "Bill length and bill depth",
    subtitle = "Dimensions for penguins at Palmer Station LTER",
    x = "Bill length (mm)", y = "Bill depth (mm)",
    color = "Penguin species",
    shape = "Penguin species"
  ) +
  theme_minimal()
```
In the next section, we will analyze the correlation depicted in @fig-paradox^[Labels should start with `fig-` for figures and `tbl-` for tables.] using linear regression.
## Statistical analysis
We formalize our model with the following equation:
...
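One possible way to write down the specification that the `feols()` call in this section estimates is sketched here. The notation is an assumption made for illustration: $i$ indexes penguins, $s$ indexes species, and the coefficients carry a species superscript because the model is fitted separately for each species.

$$
\text{bill\_depth}_{i} = \beta_{0}^{(s)} + \beta_{1}^{(s)} \, \text{bill\_length}_{i} + \varepsilon_{i},
\qquad s \in \{\text{Adelie}, \text{Chinstrap}, \text{Gentoo}\}
$$

Under this notation, a positive $\beta_{1}^{(s)}$ within a species corresponds to a positive association between bill length and bill depth for that species.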
Results are displayed in @tbl-linreg.
```{r}
#| label: tbl-linreg
#| tbl-cap: Linear regression
#| echo: false
#| warning: false
mod <- feols(bill_depth_mm ~ bill_length_mm,
             data = penguins,
             split = ~species)
# ...
```
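A minimal sketch of how the fitted models could be turned into a regression table is shown here, assuming that `modelsummary()` is the intended function and that its default settings are acceptable. The model specification is repeated so that the example is self-contained.

```r
library(palmerpenguins)
library(fixest)
library(modelsummary)

# Same specification as in the report: one regression per species.
mod <- feols(bill_depth_mm ~ bill_length_mm,
             data = penguins,
             split = ~species)

# Summarize the per-species fits in a single table.
modelsummary(mod)
```

If a plain console table is sufficient, `etable(mod)` from fixest is an alternative way to inspect the same estimates.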
## References