Signal Detection Theory vs. Logistic Regression

Posted on July 28, 2019 by R on I Should Be Writing: The Musical in R bloggers

[This article was first published on R on I Should Be Writing: The Musical, and kindly contributed to R-bloggers.]

I recently came across a paper that explained the equality between the parameters of signal detection theory (SDT) and the parameters of logistic regression in which the state (“absent”/“present”) is used to predict the response (“yes”/“no”, but also applicable in scale-rating designs) (DeCarlo, 1998; DOI: 10.1037/1082-989X.3.2.186).

Here is a short simulation-proof for this equality.

Setup

For these simulations we will need the following packages:

# For plotting
library(ggplot2)

# For extracting SDT parameters
library(neuropsychology)

For the logistic regression analysis, we also need to set the contrast coding of our factors to effects coding (contr.sum); otherwise the intercept will not correspond to the criterion (aka the overall response bias):

options(contrasts = c('contr.sum', 'contr.poly'))
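To see what this coding does to a two-level predictor, here is a small sketch. The factor `state` and its labels are illustrative assumptions, not objects from the simulation, and the contrast is set locally through `contrasts.arg` rather than through the global option.

```r
# contr.sum codes the first level as +1 and the last level as -1
contr.sum(2)

# Illustrative two-level factor; the labels are assumptions for this sketch
state <- factor(c("absent", "present"), levels = c("absent", "present"))

# Design matrix under effects coding: an intercept column and one +1/-1 column
mm <- model.matrix(~ state, contrasts.arg = list(state = "contr.sum"))
mm
```

With this coding the intercept sits midway between the two states’ logits instead of at the logit of the first level, which is why it can play the role of an overall bias term.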

The Simulations

n <- 100L
B <- 100L

We’ll run 100 simulations with 100 trials each.

Simulation Code

set.seed(1)

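SDT_params <- function(state,resp) {
  # rows = state (absent, present), columns = response (no, yes);
  # fixing the levels keeps the table 2 x 2 even if one response never occurs
  tab <- table(factor(state, levels = c(FALSE, TRUE)),
               factor(resp,  levels = c(FALSE, TRUE)))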
  
  sdt_res <- neuropsychology::dprime(
    n_hit  = tab[2,2],
    n_miss = tab[2,1],
    n_fa   = tab[1,2],
    n_cr   = tab[1,1]
  )
  
  c(sdt_res$dprime , sdt_res$c)
}

logistic_reg_params <- function(state,resp){
  fit <- glm(resp ~ state, family = binomial())
  
  coef(fit)
}

# initialize
res <- data.frame(d_    = numeric(B),
                  c_    = numeric(B),
                  int   = numeric(B),
                  slope = numeric(B))

# Loop
for (b in seq_len(B)) {
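  true_sensitivity <- rexp(1,10) # random d' (exponential, mean 0.1)
  true_criterion <- runif(1,-1,1) # random criterion (uniform on [-1, 1])
  
  # true state vector: first half noise trials, second half signal trials
  state_i <- rep(c(FALSE,TRUE), each = n/2)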
  
  # response vector
  Xn <- rnorm(n/2) # noise dist
  Xs <- rnorm(n/2, mean = true_sensitivity) # signal + noise dist
  X <- c(Xn,Xs)
  resp_i <- X > true_criterion
  
  # SDT params
  res[b,1:2] <- SDT_params(state_i,resp_i)
  
  # logistic regression params
  res[b,3:4] <- logistic_reg_params(state_i,resp_i)
}

Results

SDT parameters are on a standardized normal scale, meaning they are scaled to \(\sigma=1\). The standard logistic distribution, however, has a standard deviation of \(\sigma=\pi/\sqrt3\). Thus, to convert the logistic regression’s parameters to the SDT’s, we need to scale both the intercept and the slope by \(\sqrt3/\pi\) to put them on the same scale as \(c\) and \(d'\). Additionally:

The slope must also be scaled by \(-2\), because the effects coding set above (contr.sum) codes the two states as \(+1\) and \(-1\).
The intercept must also be scaled by \(-1\); see the paper for the full rationale.
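One way to see where the factors \(-2\) and \(-1\) come from is the short derivation below. It is a sketch under stated assumptions: the equal-variance SDT model with \(c\) measured from the midpoint between the two distributions, contr.sum assigning \(+1\) to “absent” (the first level) and \(-1\) to “present”, and the normal curve replaced by a logistic curve with matched standard deviation, which is an approximation. The symbols \(\beta_0\) and \(\beta_1\) are names introduced here for the intercept and the slope.

```latex
\begin{align*}
\operatorname{logit} P(\text{yes} \mid \text{present}) &= \beta_0 - \beta_1
  = \frac{\pi}{\sqrt{3}}\left(\frac{d'}{2} - c\right) \\
\operatorname{logit} P(\text{yes} \mid \text{absent}) &= \beta_0 + \beta_1
  = \frac{\pi}{\sqrt{3}}\left(-\frac{d'}{2} - c\right) \\[4pt]
\text{sum:}\quad 2\beta_0 &= -\frac{2\pi}{\sqrt{3}}\,c
  \;\Rightarrow\; c = -\frac{\sqrt{3}}{\pi}\,\beta_0 \\
\text{difference (absent minus present):}\quad 2\beta_1 &= -\frac{\pi}{\sqrt{3}}\,d'
  \;\Rightarrow\; d' = -2\,\frac{\sqrt{3}}{\pi}\,\beta_1
\end{align*}
```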

The red-dashed line represents the expected regression line predicting the SDT parameters from their logistic counterparts:

\(d' = -2\times\frac{\sqrt{3}}{\pi}\times Slope\)
\(c = -\frac{\sqrt{3}}{\pi}\times Intercept\)

(The blue line is the empirical regression line.)
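The conversion can also be checked by hand on a single 2×2 table, using base R only. This is a sketch: the counts are made up for illustration, and \(d'\) and \(c\) are computed directly with `qnorm()` rather than with `neuropsychology::dprime()`, which may apply corrections not reproduced here. Because the logistic and normal curves differ in shape, the converted values should land close to the SDT values without matching them exactly.

```r
# Hypothetical counts for illustration (not taken from the simulation above)
n_hit <- 35; n_miss <- 15   # signal-present trials
n_fa  <- 20; n_cr   <- 30   # signal-absent trials

state <- factor(rep(c("absent", "present"),
                    times = c(n_fa + n_cr, n_hit + n_miss)),
                levels = c("absent", "present"))
resp  <- c(rep(c(TRUE, FALSE), times = c(n_fa, n_cr)),
           rep(c(TRUE, FALSE), times = c(n_hit, n_miss)))

# Effects coding set locally instead of through options()
fit <- glm(resp ~ state, family = binomial(),
           contrasts = list(state = "contr.sum"))
b <- unname(coef(fit))      # b[1] = intercept, b[2] = slope

H  <- n_hit / (n_hit + n_miss)   # hit rate
FA <- n_fa  / (n_fa + n_cr)      # false-alarm rate

# On the logit scale the relation to the two rates is exact
logit_intercept <- (qlogis(H) + qlogis(FA)) / 2
logit_slope     <- (qlogis(FA) - qlogis(H)) / 2

# Classical equal-variance SDT estimates on the normal scale
d_sdt <- qnorm(H) - qnorm(FA)
c_sdt <- -(qnorm(H) + qnorm(FA)) / 2

# Logistic coefficients converted with the formulas above
k <- sqrt(3) / pi
d_logit <- -2 * k * b[2]
c_logit <- -k * b[1]

round(c(d_sdt = d_sdt, d_logit = d_logit, c_sdt = c_sdt, c_logit = c_logit), 3)
```

Since the model is saturated, the fitted intercept and slope reproduce `logit_intercept` and `logit_slope`; any gap between the converted and the SDT values comes only from swapping the normal curve for a logistic one.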

Conclusions

I haven’t tested here how this equality can be extended to multi-level designs with generalized linear mixed models (GLMM), but I see no reason why this wouldn’t be possible. One could model random effects per subject, and the moderating effect of some \(X\) on sensitivity could in theory be modeled by including an interaction between \(X\) and state; similarly, the moderating effect of \(X\) on the criterion could be modeled by including a main effect for \(X\) (moderating the intercept).
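The fixed-effects part of this idea can be sketched with plain `glm()`. Everything below is an assumption made for illustration: the two-level moderator `X`, the cell size, and the true \(d'\) values are invented, and no random effects are included (those would need a mixed-model package, which is not shown here).

```r
set.seed(42)
n_per_cell <- 200L   # hypothetical number of trials per state-by-X cell

design <- expand.grid(trial = seq_len(n_per_cell),
                      state = c("absent", "present"),
                      X     = c("low", "high"),
                      stringsAsFactors = FALSE)
design$state <- factor(design$state, levels = c("absent", "present"))
design$X     <- factor(design$X,     levels = c("low", "high"))

# Assumed true values: X changes sensitivity; the criterion stays at the midpoint
d_true   <- ifelse(design$X == "high", 2, 1)
mu       <- ifelse(design$state == "present", d_true / 2, -d_true / 2)
evidence <- rnorm(nrow(design), mean = mu)
design$resp <- evidence > 0

fit <- glm(resp ~ state * X, family = binomial(), data = design,
           contrasts = list(state = "contr.sum", X = "contr.sum"))
b <- coef(fit)
k <- sqrt(3) / pi

# contr.sum codes X as low = +1, high = -1, so the state slope within each level is:
d_low  <- unname(-2 * k * (b["state1"] + b["state1:X1"]))
d_high <- unname(-2 * k * (b["state1"] - b["state1:X1"]))

# The criterion within each level comes from the intercept plus or minus the X effect
c_low  <- unname(-k * (b["(Intercept)"] + b["X1"]))
c_high <- unname(-k * (b["(Intercept)"] - b["X1"]))

round(c(d_low = d_low, d_high = d_high, c_low = c_low, c_high = c_high), 2)
```

In this sketch the interaction coefficient carries the difference in sensitivity between the two levels of `X`, and the main effect of `X` carries the difference in criterion.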
