Articles by R on I Should Be Writing

ggplot: Easy as pie (charts)

August 11, 2021 | R on I Should Be Writing

This post by no means endorses the use of pie charts. But, if you must, here’s how… For some reason, the top Google results for “ggplot2 pie chart” show some very convoluted code to accomplish what should be easy: make slices; add labels to the mid...
[Read more...]
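
The excerpt is cut off before any code, so the following is only a minimal sketch of one common way to do the two steps it names, not necessarily the author's approach. It assumes a hypothetical toy data frame `slices` with columns `group` and `n`: a single stacked bar is wrapped into a circle with `coord_polar()`, and `position_stack(vjust = 0.5)` places each label at the middle of its slice.

```r
library(ggplot2)

# Hypothetical toy data (not from the post)
slices <- data.frame(
  group = c("A", "B", "C"),
  n = c(50, 30, 20)
)

pie <- ggplot(slices, aes(x = 1, y = n, fill = group)) +
  # Step 1: one stacked bar, one segment per group
  geom_col(width = 1) +
  # Step 2: a label at the midpoint of each stacked segment
  geom_text(aes(label = n), position = position_stack(vjust = 0.5)) +
  # Mapping y to the angle turns the stacked bar into a pie
  coord_polar(theta = "y") +
  theme_void()

pie
```

Because the labels use the same stacking as the bars, they stay centred in their slices if the counts change.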

ggplot: plot only some of the data

July 11, 2021 | R on I Should Be Writing

Often (especially when working with large and/or rich datasets) our (gg)plots can feel cluttered with information. But they don’t have to be! Let’s look at the following plot:

Generate some data

library(dplyr)

bfi <- psychTools::bfi %>% 
  mutate(
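    # num_range() selects only the item columns (e.g. O1..O5). starts_with() is
    # case-insensitive by default, so "E" and "A" would also match the
    # `education` and `age` columns.
    O = across(num_range("O", 1:5)) %>% rowMeans(na.rm = TRUE),
    C = across(num_range("C", 1:5)) %>% rowMeans(na.rm = TRUE),
    E = across(num_range("E", 1:5)) %>% rowMeans(na.rm = TRUE),
    A = across(num_range("A", 1:5)) %>% rowMeans(na.rm = TRUE),
    N = across(num_range("N", 1:5)) %>% rowMeans(na.rm = TRUE)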
  ) %>% 
  mutate(
    gender = factor(gender, labels = c("Man", "Woman")),
    education = factor(education, labels = c("HS", "finished HS", "some college", "college graduate", "graduate degree"))
  ) %>% 
  select(gender, education, age, O:N) %>% 
  tidyr::drop_na(education) %>% 
  # multiply the data set
  sample_n(size = 10000, replace = TRUE) %>% 
  # and add some noise
  mutate(across(O:N, \(x) x + rnorm(x, 0, sd(x))))
library(ggplot2)

theme_set(theme_bw())

base_plot <- ggplot(bfi, aes(age, O, color = education)) + 
  facet_wrap(facets = vars(gender)) + 
  coord_cartesian(ylim = c(1, 6)) + 
  scale_color_viridis_d()

base_plot + 
  geom_point(shape = 16, alpha = 0.1) + 
  geom_smooth(se = FALSE)
This is a busy plot. It’s hard to see what each ...
[Read more...]
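
The excerpt stops before showing a fix, so here is a hedged sketch of one way to "plot only some of the data"; it may differ from what the full post does. It relies on the fact that a ggplot2 layer's `data` argument may be a function, which is then called on the plot's data. The data frame `toy` and the helper `thin()` are hypothetical stand-ins (the real `bfi` data is not needed to see the idea), and `method = "lm"` is used only to keep the sketch light.

```r
library(ggplot2)

set.seed(1)
# Hypothetical stand-in for the 10,000-row `bfi` data
toy <- data.frame(
  age = sample(18:70, 10000, replace = TRUE),
  education = factor(sample(c("HS", "college", "graduate"), 10000, replace = TRUE)),
  gender = factor(sample(c("Man", "Woman"), 10000, replace = TRUE))
)
toy$O <- 4 + 0.01 * toy$age + rnorm(nrow(toy))

# Hypothetical helper: keep a random subset of at most `n` rows
thin <- function(d, n = 500) {
  d[sample(nrow(d), min(n, nrow(d))), , drop = FALSE]
}

p <- ggplot(toy, aes(age, O, color = education)) +
  facet_wrap(facets = vars(gender)) +
  # The points layer only sees the 500 rows returned by thin()
  geom_point(data = thin, shape = 16, alpha = 0.3) +
  # The smooth layer is still fitted to all 10,000 rows
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  scale_color_viridis_d()

p
```

Only the points layer is thinned, so the trend lines keep the precision of the full data set while the scatter stays readable.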

Everything You Always Wanted to Know About ANOVA*

May 24, 2021 | R on I Should Be Writing

ANOVAs in R; Simultaneous Sum of Squares; Adding Interactions; Balanced vs. Unbalanced Data; ANOVA Made Easy; Other Types of Models; GLMs; (G)LMMs; Concluding Remarks.

Analysis of variance (ANOVA) is a statistical procedure, developed by R. A. Fisher, used to analyze the relationship between a continuous outcome (dependent variable) and ... [Read more...]

Testing The Equality of Regression Coefficients

February 15, 2021 | R on I Should Be Writing

The Problem; Method 1: As Model Comparisons; Method 1b: Composite Variable + Difference; Method 2: Paternoster et al (1998); Method 3: emmeans.

... \(\beta_{\text{n\_comps}}\). Method 1b: Composite Variable + Difference. Edit (Feb-17, 2021): Thanks to @joejps84 for pointing this out! We can achieve the same thing in a single model. If we say that the ... [Read more...]
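
The excerpt breaks off before the derivation, so the following is only a sketch of the general "composite variable + difference" idea under stated assumptions, not the post's own code. It uses simulated data with hypothetical predictors `x1` and `x2`. Since b1*x1 + b2*x2 = ((b1 + b2)/2)*(x1 + x2) + ((b1 - b2)/2)*(x1 - x2), the slope of the difference term in the reparameterised model is half the difference between the two original slopes, and its t-test is a test of b1 = b2.

```r
set.seed(42)

# Simulated data; x1, x2 and the true slopes are illustrative assumptions
n <- 200
x1 <- rnorm(n)
x2 <- rnorm(n)
y <- 0.5 * x1 + 0.3 * x2 + rnorm(n)

# Original parameterisation: one slope per predictor
m_full <- lm(y ~ x1 + x2)

# Reparameterised model: slope of the sum and slope of the difference
x_sum <- x1 + x2
x_diff <- x1 - x2
m_repar <- lm(y ~ x_sum + x_diff)

# The x_diff slope equals (b1 - b2) / 2 from the original model
b <- coef(m_full)
diff_slope <- unname(coef(m_repar)["x_diff"])

# Estimate, standard error, t value and p value for the test of b1 = b2
summary(m_repar)$coefficients["x_diff", ]
```

Both models have identical fitted values; only the way the two slopes are expressed changes.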
