How to convert contingency tables to data frames with R
I wanted to write contingency tables to HTML with hwrite() (from the hwriter package). I realized that hwrite() has no method for table objects. I could use as.data.frame(), but the table it produces is non-intuitive. A quick search on R-bloggers turned up the solution to my problem: the as.data.frame.matrix() function.
The contingency table
A contingency table is a display format used to analyse and record the relationship between two categorical variables. As an example, we use two variables from the state datasets (see ?state) included in R: x (state.division) and y (state.region).
state.division
state.region
nlevels(state.division)
nlevels(state.region)
These two variables have r = 9 and s = 4 levels, respectively. The contingency table therefore contains (r+1)×(s+1)–1 = 49 informative cells: r row labels, s column labels and r×s counts.
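The cell count can be checked directly in R. This is only a small sketch of the arithmetic above; the names r, s and n_cells are chosen here for illustration.

```r
# Number of levels of each variable (the state datasets ship with base R)
r <- nlevels(state.division)  # rows of the contingency table
s <- nlevels(state.region)    # columns of the contingency table

# r row labels + s column labels + r * s counts
n_cells <- (r + 1) * (s + 1) - 1
n_cells  # 49
```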
The contingency table shows the number of times each combination of state.division and state.region appears.
(MyTable <- table(state.division, state.region))
##                     state.region
## state.division       Northeast South North Central West
##   New England                6     0             0    0
##   Middle Atlantic            3     0             0    0
##   South Atlantic             0     8             0    0
##   East South Central         0     4             0    0
##   West South Central         0     4             0    0
##   East North Central         0     0             5    0
##   West North Central         0     0             7    0
##   Mountain                   0     0             0    8
##   Pacific                    0     0             0    5
as.data.frame()
Contingency tables in R are objects of class table. They are not handled the same way as objects of class data.frame: some functions that have a method for data.frame have none for table (e.g. hwrite()). Unfortunately, converting a contingency table to a data frame with as.data.frame() gives a non-intuitive result.
as.data.frame(MyTable)
state.division | state.region | Freq |
New England | Northeast | 6 |
Middle Atlantic | Northeast | 3 |
South Atlantic | Northeast | 0 |
East South Central | Northeast | 0 |
West South Central | Northeast | 0 |
East North Central | Northeast | 0 |
West North Central | Northeast | 0 |
Mountain | Northeast | 0 |
Pacific | Northeast | 0 |
New England | South | 0 |
Middle Atlantic | South | 0 |
South Atlantic | South | 8 |
East South Central | South | 4 |
West South Central | South | 4 |
East North Central | South | 0 |
West North Central | South | 0 |
Mountain | South | 0 |
Pacific | South | 0 |
New England | North Central | 0 |
Middle Atlantic | North Central | 0 |
South Atlantic | North Central | 0 |
East South Central | North Central | 0 |
West South Central | North Central | 0 |
East North Central | North Central | 5 |
West North Central | North Central | 7 |
Mountain | North Central | 0 |
Pacific | North Central | 0 |
New England | West | 0 |
Middle Atlantic | West | 0 |
South Atlantic | West | 0 |
East South Central | West | 0 |
West South Central | West | 0 |
East North Central | West | 0 |
West North Central | West | 0 |
Mountain | West | 8 |
Pacific | West | 5 |
Here, the same information is presented in a table of 3×r×s = 108 cells. Each level of x is written s times, and each level of y is written r times.
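A quick way to confirm these numbers is to inspect the dimensions of the converted object. The sketch below rebuilds MyTable so that it runs on its own; the name long is just an illustrative choice.

```r
MyTable <- table(state.division, state.region)

# Long format: one row per combination of levels, plus a Freq column
long <- as.data.frame(MyTable)

dim(long)        # 36 rows (r * s) and 3 columns
prod(dim(long))  # 108 cells in total

# Each level of state.division appears once per region, i.e. s = 4 times
table(long$state.division)
```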
as.data.frame.matrix()
To convert a table to a data.frame while keeping its original structure, call the as.data.frame.matrix() function directly. This is probably the only situation in which this obscure function would be called by name.
as.data.frame.matrix(MyTable)
 | Northeast | South | North Central | West |
New England | 6 | 0 | 0 | 0 |
Middle Atlantic | 3 | 0 | 0 | 0 |
South Atlantic | 0 | 8 | 0 | 0 |
East South Central | 0 | 4 | 0 | 0 |
West South Central | 0 | 4 | 0 | 0 |
East North Central | 0 | 0 | 5 | 0 |
West North Central | 0 | 0 | 7 | 0 |
Mountain | 0 | 0 | 0 | 8 |
Pacific | 0 | 0 | 0 | 5 |
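To see that the result really is a data frame with the layout of the original table, one can inspect its class, dimensions and names. This is a minimal check, with wide as an illustrative name.

```r
MyTable <- table(state.division, state.region)

# Wide format: rows and columns match the contingency table
wide <- as.data.frame.matrix(MyTable)

class(wide)     # "data.frame"
dim(wide)       # 9 rows and 4 columns
rownames(wide)  # levels of state.division
colnames(wide)  # levels of state.region
```

Because wide is an ordinary data frame, it can be passed to any function that expects one.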
Finally…
If you are fussy, you might notice that the variable names do not appear in contingency tables written with hwrite(). This can cause problems if the levels do not have explicit names (e.g., a variable encoded 1, 2, …, r). In that case, remember to identify your variables by adding a caption to your table.
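The variable names are still stored in the table object, so the caption text can be built from them before converting. The sketch below only assembles the caption string; the wording of the caption and the names var_names and caption are assumptions, and attaching the text to the HTML output is left to whatever tool writes the table.

```r
MyTable <- table(state.division, state.region)

# The names of the dimnames hold the variable names used to build the table
var_names <- names(dimnames(MyTable))

# Hypothetical caption wording, to be placed next to the HTML table
caption <- paste("Rows:", var_names[1], "/ Columns:", var_names[2])
caption
```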