This post is not about a new technique or package; rather, it combines existing functionality in interpretable machine learning and data visualization in a way that facilitates the analysis of model results. We’ll make use of two packages DALEX and PLOTLY ot...
This short tutorial provides a quick guide on how to develop an R package from scratch and how to use Travis CI for automatic builds on various R versions and automatic test coverage calculation. The resulting package can be found here: CIexamplePkg
A very nice general introduction can be found here:
...
We present a novel approach for measuring feature importance in k-means clustering, or variants thereof, to increase the interpretability of clustering results. In supervised machine learning, feature importance is a widely used tool to ensure interpretability of complex models. We adapt this idea to unsupervised learning via partitional clustering. Our ...
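To make the idea concrete, here is a minimal sketch of one plausible way to carry permutation feature importance over to k-means. The excerpt above is truncated, so this is an assumption rather than the authors' method: shuffle one feature at a time, reassign every point to its nearest centroid, and record the share of points whose cluster changes. All function names and the toy data are hypothetical, and the example uses only the Python standard library.

```python
import random

def nearest_centroid(point, centroids):
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    dists = [sum((p - c) ** 2 for p, c in zip(point, cen)) for cen in centroids]
    return dists.index(min(dists))

def kmeans(data, k, n_iter=20):
    """Plain Lloyd's algorithm with a deterministic farthest-point start."""
    centroids = [list(data[0])]
    while len(centroids) < k:
        # next start: the point farthest from all centroids chosen so far
        far = max(data, key=lambda p: min(
            sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids))
        centroids.append(list(far))
    for _ in range(n_iter):
        labels = [nearest_centroid(p, centroids) for p in data]
        for j in range(k):
            members = [p for p, lab in zip(data, labels) if lab == j]
            if members:  # keep the old centroid if a cluster runs empty
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return centroids

def permutation_importance(data, centroids, n_repeats=10, seed=0):
    """Share of points that switch cluster when a single feature is shuffled."""
    rng = random.Random(seed)
    baseline = [nearest_centroid(p, centroids) for p in data]
    importances = []
    for f in range(len(data[0])):
        rates = []
        for _ in range(n_repeats):
            column = [p[f] for p in data]
            rng.shuffle(column)
            changed = 0
            for p, v, b in zip(data, column, baseline):
                q = list(p)
                q[f] = v  # replace feature f by a shuffled value
                changed += nearest_centroid(q, centroids) != b
            rates.append(changed / len(data))
        importances.append(sum(rates) / n_repeats)
    return importances

# Toy data (made up): feature 0 separates two groups, feature 1 is pure noise.
rng = random.Random(42)
data = ([[rng.gauss(0, 1), rng.gauss(0, 1)] for _ in range(100)]
        + [[rng.gauss(10, 1), rng.gauss(0, 1)] for _ in range(100)])
centroids = kmeans(data, k=2)
importance = permutation_importance(data, centroids)
print(importance)
```

In this toy setup the separating feature should receive a clearly higher score than the noise feature, because shuffling it moves many points across the cluster boundary.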
The goal is to compare a few algorithms for missing-data imputation when used before k-means clustering is performed. For the latter we use the same algorithm as in ClustImpute to ensure that only the computation time of the imputation is compared. In a nutshell, we’ll see that ClustImpute scales ...
We are happy to introduce a new k-means clustering algorithm that includes a powerful multiple missing data imputation at the computational cost of a few extra random imputations (benchmarks to follow in a separate article). More precisely, the algorithm draws the missing values iteratively based on the current cluster assignment so that ...
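The iterative draw described above can be illustrated with a small sketch. This is not the package's implementation, and the excerpt is truncated, so the details are assumptions: missing entries (marked `None`) start with a random draw from the observed values of the same feature, and after each k-means step they are redrawn from observed values of points in the same current cluster. The function name and toy data are hypothetical; only the Python standard library is used.

```python
import random

def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans_with_imputation(data, k, n_iter=10, seed=0):
    """k-means where missing values (None) are redrawn from the current cluster."""
    rng = random.Random(seed)
    n = len(data)
    missing = [(i, f) for i, row in enumerate(data)
               for f, v in enumerate(row) if v is None]
    # Start: fill each gap with a random observed value of the same feature.
    filled = [list(row) for row in data]
    for i, f in missing:
        filled[i][f] = rng.choice([row[f] for row in data if row[f] is not None])
    # Deterministic farthest-point initialisation of the centroids.
    centroids = [list(filled[0])]
    while len(centroids) < k:
        far = max(filled, key=lambda p: min(sq_dist(p, c) for c in centroids))
        centroids.append(list(far))
    labels = [0] * n
    for _ in range(n_iter):
        # Assignment step on the currently completed data.
        labels = [min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                  for p in filled]
        # Update step; an empty cluster keeps its previous centroid.
        for j in range(k):
            members = [p for p, lab in zip(filled, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
        # Imputation step: redraw each gap from observed values in its cluster.
        for i, f in missing:
            donors = [data[m][f] for m in range(n)
                      if labels[m] == labels[i] and data[m][f] is not None]
            if donors:
                filled[i][f] = rng.choice(donors)
    return filled, labels

# Toy data (made up): two groups around 0 and 20 in three features,
# with one value knocked out in every fifth row.
rng = random.Random(1)
data = ([[rng.gauss(0, 1) for _ in range(3)] for _ in range(50)]
        + [[rng.gauss(20, 1) for _ in range(3)] for _ in range(50)])
for i, row in enumerate(data):
    if i % 5 == 2:
        row[i % 3] = None
completed, labels = kmeans_with_imputation(data, k=2)
print(sum(v is None for row in completed for v in row))  # gaps left after imputation
```

Because each redraw only uses donors from the point's current cluster, the imputed values move toward the cluster the observed features point to, at the cost of one extra random draw per missing entry and iteration.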