(The connection with variable selection is that each level of the tree corresponds to the binary choice between including and excluding one of the variables. The tree thus has 2^k endpoints/leaves for k potential variables in the model.) The cost of updating the probabilities is actually O(k), where k is the number of levels, rather than O(2^k), because most branches of the tree are unaffected by setting one final branch to probability zero. The second part deals with the adaptive and approximative issues.
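To make the pruning mechanism concrete, here is a minimal R sketch of sampling models without replacement from the binary inclusion tree. All names are mine, and I assume an initial sampling distribution with independent Bernoulli(p_j) inclusions, so this illustrates the technique rather than the paper's exact implementation: each sampled leaf's probability mass is subtracted along its length-k path, which is the O(k) update.

```r
## Sketch: sample inclusion vectors without replacement from the binary tree,
## assuming independent Bernoulli(p[j]) inclusion probabilities (hypothetical
## setup). Pruning a sampled leaf touches only its k ancestors, hence O(k).
sample_tree_wor <- function(p, n_draws) {
  k <- length(p)
  stopifnot(n_draws <= 2^k)          # at most 2^k distinct models exist
  removed <- new.env(hash = TRUE)    # mass already removed under each prefix
  rem <- function(s)
    if (exists(s, envir = removed, inherits = FALSE)) removed[[s]] else 0
  draws <- matrix(NA_integer_, n_draws, k)
  for (t in seq_len(n_draws)) {
    s <- "r"; mass <- 1; gamma <- integer(k)
    for (j in seq_len(k)) {
      m1 <- mass * p[j]       - rem(paste0(s, 1))  # remaining mass: include j
      m0 <- mass * (1 - p[j]) - rem(paste0(s, 0))  # remaining mass: exclude j
      gamma[j] <- as.integer(runif(1) < m1 / (m1 + m0))
      mass <- if (gamma[j] == 1) mass * p[j] else mass * (1 - p[j])
      s <- paste0(s, gamma[j])
    }
    ## O(k) update: prune the sampled leaf by removing its mass on the path
    removed[["r"]] <- rem("r") + mass
    path <- "r"
    for (j in seq_len(k)) {
      path <- paste0(path, gamma[j])
      removed[[path]] <- rem(path) + mass
    }
    draws[t, ] <- gamma
  }
  draws
}
```

With p = rep(0.5, 3) and n_draws = 8, the sketch enumerates all eight models exactly once, in random order, since an already-sampled leaf is left with zero available mass.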
In a model selection setup, the posterior partial conditional probability of including variable i given the inclusion/exclusion of variables 1,…,i-1 is obviously unknown. The authors suggest using instead an approximation based on the marginal posterior inclusion probabilities, namely for variable j

$\hat p_j^{(t)} = \frac{1}{t}\sum_{s=1}^{t}\gamma_j^{(s)},$

the frequency of inclusion of variable j in the models sampled over the past t iterations (with γ_j^{(s)} the inclusion indicator at iteration s), with a possible shrinkage correction to avoid probability estimates equal to zero. (The motivation is the fundamental paper by Barbieri and Berger, 2004, Annals of Statistics, which shows the optimality of the median posterior probability model.) In this sense, the algorithm is adaptive. (In addition, there is a step that periodically recomputes the marginal inclusion probabilities, replacing the sampling distribution used since time t=0.) But it is also approximative in that the only convergence result is one by attrition, namely that the posterior partial conditional probabilities are exactly recovered once all models have been sampled. (The paper is associated with an R package called BAS.)
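As a hedged illustration of the adaptive step (the function name and the Beta-style shrinkage constants a and b are my own assumptions, not values from the paper), the estimated inclusion probabilities can be refreshed from the models sampled so far:

```r
## Sketch: estimate marginal inclusion probabilities as shrunken inclusion
## frequencies over past draws, so that no estimate is exactly zero or one
## (a = b = 1 is an assumed default, not taken from the paper)
update_inclusion_probs <- function(draws, a = 1, b = 1) {
  ## draws: t x k matrix of 0/1 inclusion indicators, one row per iteration
  (colSums(draws) + a) / (nrow(draws) + a + b)
}
```

And for reference, a minimal call to the BAS package for linear regression, where `df` stands for a hypothetical data frame with response y; method = "BAS" selects the sampling-without-replacement scheme:

```r
library(BAS)
fit <- bas.lm(y ~ ., data = df, method = "BAS")
summary(fit)
```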
Filed under: R, Statistics Tagged: adaptivity, BAS, Bayes factor, Bayesian model choice, median posterior probability, R, sampling w/o replacement, tree, variable selection