Log odds ratios and an indicator matrix from categorical data
[This article was first published on is.R(), and kindly contributed to R-bloggers].
A long title, but there are a couple of handy things in this Gist. The first, and more obscure, is the conversion of a data.frame of categorical variables into a matrix of dummy/binary/indicator variables, one for each category of each original variable.
It is not obvious (to me, at least) how best to do this; the solution used here comes from “Gavin Simpson” and “fabians” at Stack Overflow.
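For readers who want to see the shape of the conversion, here is a minimal sketch of one common way to do it in base R, using model.matrix() with identity ("no contrast") coding for every factor. It may differ in detail from the Gist's code; the data frame df and its columns colour and size are made-up toy data, and the sketch assumes every column is a factor.

```r
# Hypothetical toy data: two categorical variables stored as factors
df <- data.frame(
  colour = factor(c("red", "blue", "red", "green")),
  size   = factor(c("S", "L", "L", "S"))
)

# Identity coding for each factor, so that no level is dropped as a baseline
full_contrasts <- lapply(df, contrasts, contrasts = FALSE)

# "- 1" removes the intercept; contrasts.arg keeps one column per level
ind <- model.matrix(~ . - 1, data = df, contrasts.arg = full_contrasts)

# One 0/1 column for each category of each original variable
colnames(ind)
```

Each row of ind then contains exactly one 1 per original variable, so the row sums equal the number of columns in df.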
The second part of this Gist shows how to construct a table of log odds ratios between each of these indicator variables, which may be a first step in the estimation of something like (but not exactly the same as) multiple correspondence analysis.
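As a rough illustration of the idea, and not necessarily the way the Gist does it, the log odds ratio for a pair of indicators can be computed from the four cells of their 2x2 cross-tabulation as log((n11 * n00) / (n10 * n01)). The sketch below gets all four cell counts at once with crossprod(). The function name log_odds_ratios, the toy matrix X, and the 0.5 added to every cell are assumptions made for this example; the 0.5 is there because two indicators derived from the same original variable never co-occur, which would otherwise give an infinite log odds ratio.

```r
# Hypothetical helper: pairwise log odds ratios between the columns of a 0/1 matrix.
# `add` is a continuity correction added to every cell (an assumption of this
# sketch) so that empty cells do not produce infinite values.
log_odds_ratios <- function(X, add = 0.5) {
  n11 <- crossprod(X)            # row indicator 1, column indicator 1
  n10 <- crossprod(X, 1 - X)     # row indicator 1, column indicator 0
  n01 <- crossprod(1 - X, X)     # row indicator 0, column indicator 1
  n00 <- crossprod(1 - X)        # row indicator 0, column indicator 0
  lor <- log(((n11 + add) * (n00 + add)) / ((n10 + add) * (n01 + add)))
  diag(lor) <- NA                # an indicator paired with itself is not informative
  lor
}

# Made-up indicator matrix with three binary columns
X <- cbind(
  a = c(1, 1, 0, 0, 1, 0),
  b = c(1, 1, 0, 0, 0, 1),
  c = c(0, 0, 1, 1, 0, 1)
)

lor <- log_odds_ratios(X)
round(lor, 2)
```

Positive entries mark pairs of categories that occur together more often than apart, and negative entries mark pairs that tend to exclude each other. The resulting table is symmetric.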