The useR! 2021 Conference is starting soon, on July 5. This year MI2DataLab will showcase our recent R packages and applications in a 3-hour workshop on Responsible Machine Learning and five talks. Feel free to get in touch with us during the conference, especially for chats related to Responsible Machine Learning.
Check out the list below and see you at the conference!
Workshop
Introduction to Responsible Machine Learning
by Przemysław Biecek, Hubert Baniecki, Anna Kozak, Jakub Wiśniewski
Wednesday, 7th of July, 7:00–10:00 am (UTC)
What? The workshop focuses on responsible machine learning, including areas such as model fairness, explainability, and validation.
Why? To gain the theoretical background and hands-on experience needed to develop safe and effective predictive models.
For whom? For those with basic knowledge of R, familiar with supervised machine learning and interested in model validation.
What will be used? We will use the DALEX package for explanations, fairmodels for checking bias, and modelStudio for interactive model analysis.
We have also prepared interesting materials, so be sure to check them out!
Regular Talks
Triplot: model agnostic measures and visualisations for variable importance in predictive models that take into account the hierarchical correlation structure
by Katarzyna Pękala, Katarzyna Woźnica, Przemysław Biecek
Tuesday, 6th of July, 9:05–9:25 am (UTC)
We propose new methods to support model analysis by exploiting the information about the correlation between variables. The dataset level aspect importance measure is inspired by the block permutations procedure, while the instance level aspect importance measure is inspired by the LIME method. We show how to analyse groups of variables (aspects) both when they are proposed by the user and when they should be determined automatically based on the hierarchical structure of correlations between variables. Additionally, we present a new type of model visualisation, triplot, that exploits a hierarchical structure of variable grouping to produce a high information density model visualisation.
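The idea of dataset-level importance for groups of variables can be illustrated with a minimal base R sketch. It is not the triplot package API: the helper `group_permutation_importance()`, the simulated data, and the clustering threshold `h = 0.5` are all assumptions made for illustration. The sketch permutes every variable of an aspect with the same row shuffle and measures the increase in RMSE, then shows one way aspects could be derived automatically by hierarchical clustering of correlations.

```r
# Toy data (assumed): x1 and x2 are strongly correlated, x3 is unrelated noise.
set.seed(1)
n  <- 500
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)
x3 <- rnorm(n)
y  <- 2 * x1 + x2 + rnorm(n, sd = 0.5)
df <- data.frame(x1, x2, x3, y)

model <- lm(y ~ x1 + x2 + x3, data = df)

rmse <- function(model, data) {
  sqrt(mean((data$y - predict(model, newdata = data))^2))
}

# Hypothetical helper: block permutation importance for groups of variables.
# All variables in one aspect are shuffled with the same row order, so the
# correlation structure inside the group is kept intact.
group_permutation_importance <- function(model, data, aspects, n_rep = 20) {
  baseline <- rmse(model, data)
  sapply(aspects, function(vars) {
    increases <- replicate(n_rep, {
      permuted <- data
      idx <- sample(nrow(data))
      permuted[vars] <- data[idx, vars, drop = FALSE]
      rmse(model, permuted) - baseline
    })
    mean(increases)
  })
}

# Aspects proposed by the user.
user_aspects <- list(correlated = c("x1", "x2"), noise = "x3")
importance <- group_permutation_importance(model, df, user_aspects)
print(importance)

# Aspects determined automatically: cluster variables on 1 - |correlation|
# and cut the tree at an assumed height of 0.5.
X <- df[c("x1", "x2", "x3")]
tree <- hclust(as.dist(1 - abs(cor(X))))
groups <- cutree(tree, h = 0.5)
auto_aspects <- split(names(groups), groups)
print(auto_aspects)
```

In this toy setting the correlated pair ends up in one aspect and receives a much larger importance than the noise variable. Cutting the tree at different heights yields coarser or finer groupings.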
For more on Triplot, check out our blog.
fairmodels: A Flexible Tool For Bias Detection, Visualization, And Mitigation
by Jakub Wiśniewski, Przemysław Biecek
Friday, 9th of July, 1:45–2:05 pm (UTC)
fairmodels is an R package that helps to validate fairness and eliminate bias in classification models in an easy and flexible fashion. The package offers a model-agnostic approach to bias detection, visualization, and mitigation. The implemented set of functions and fairness metrics enables model fairness validation from different perspectives. The package includes a series of methods for bias mitigation that aim to diminish the discrimination in the model. The package is designed not only to examine a single model but also to facilitate comparisons between multiple models.
You can read more about fairmodels on our blog.
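To make "fairness metrics from different perspectives" more concrete, here is a minimal base R sketch of a group-wise bias check. It does not use the fairmodels API: the helper `rates_by_group()`, the simulated decisions, and the "four-fifths" threshold `epsilon = 0.8` are assumptions chosen for illustration. It compares two metrics between an unprivileged group B and a privileged group A and flags ratios that fall outside the accepted range.

```r
# Simulated data (assumed): group A receives positive decisions more often.
set.seed(2)
n <- 1000
group  <- sample(c("A", "B"), n, replace = TRUE)
actual <- rbinom(n, 1, 0.5)
p_pos  <- 0.1 + 0.5 * actual + 0.3 * (group == "A")
predicted <- rbinom(n, 1, p_pos)

# Hypothetical helper: two fairness-related rates computed per subgroup.
rates_by_group <- function(predicted, actual, group) {
  sapply(split(seq_along(group), group), function(i) {
    c(
      # share of positive decisions (statistical parity)
      positive_rate = mean(predicted[i] == 1),
      # share of actual positives that were predicted positive (equal opportunity)
      true_positive_rate = mean(predicted[i][actual[i] == 1] == 1)
    )
  })
}

rates  <- rates_by_group(predicted, actual, group)
ratios <- rates[, "B"] / rates[, "A"]   # unprivileged vs. privileged

# Assumed acceptance range: a metric passes if its ratio lies in (0.8, 1.25).
epsilon <- 0.8
passes  <- ratios > epsilon & ratios < 1 / epsilon

print(round(rates, 2))
print(round(ratios, 2))
print(passes)
```

With this simulated bias both metrics fall below the threshold, so the toy model would be flagged from both perspectives. Looking at several metrics side by side matters because a model can pass one and fail another.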
Elevator Pitches
Open the Machine Learning Black-Box with modelStudio & Arena
by Hubert Baniecki, Piotr Piątyszek
We present the modelStudio and arenar packages, which automatically generate interactive and customizable dashboards that help to “open the black box”. These tools build upon the DALEX package and are model-agnostic, that is, compatible with most predictive models and frameworks in R. The focus is on lowering the entry threshold for crucial parts of today's MLOps practice. We showcase how little coding is needed to produce a powerful dashboard consisting of model explanations and data exploration visualizations. The output can be saved and shared with anyone, further promoting reproducibility and explainability in machine learning practice. Finally, we highlight the features of the Arena dashboard, which is specifically designed for comparing various predictive models.
Posts about Arena:
- How to use the Arena for exploration of ML models for credit scoring
- What will happen if I change this a little — introducing ArenaR 0.2.0
Simpler is Better: Lifting Interpretability-Performance Trade-off via Automated Feature Engineering
by Alicja Gosiewska, Anna Kozak, Przemysław Biecek
SAFE is a framework that uses elastic black boxes as supervisor models to create simpler, less opaque, yet still accurate and interpretable glass box models. The new models are created using newly engineered features extracted with the help of a supervisor model.
We support the analysis with a large-scale benchmark on several tabular data sets from the OpenML database. There are three main results: 1) we show that extracting information from complex models may improve the performance of simpler models, 2) we question a common myth that complex predictive models outperform simpler predictive models, 3) we present a real-life application of the proposed method.
Check out: Simplify your model: Supervised Assisted Feature Extraction for Machine Learning
Landscape of R packages for eXplainable Artificial Intelligence
by Szymon Maksymiuk, Alicja Gosiewska, Przemysław Biecek
The growing availability of data and computing power is fueling the development of predictive models. To ensure the safe and effective functioning of such models, we need methods for exploration, debugging, and validation. New methods and tools for this purpose are being developed within the eXplainable Artificial Intelligence (XAI) subdomain of machine learning. In this lightning talk, we present our taxonomy of methods for model explanation, show which methods are included in the most popular R XAI packages, and point out trends in recent developments.
R packages for eXplainable Artificial Intelligence
If you are interested in other posts about explainable, fair, and responsible ML, follow #ResponsibleML on Medium.
MI2 talks at useR! 2021 was originally published in ResponsibleML on Medium, where people are continuing the conversation by highlighting and responding to this story.