
Why R? Text Mining Hackathon Summary

[This article was first published on http://r-addict.com, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

It’s been 2 weeks since the end of the Text Mining Hackathon (2020.whyr.pl/hackathon/) at Why R?. This was a promotional event ahead of the Why R? 2020 conference, aimed at promoting knowledge of text analysis. We prepared 4 different challenges so that teams could pick the tasks that best suited their skills. At the beginning of this week we published videos presenting the winning solutions on the youtube.com/WhyRFoundation channel. If you are interested in how the event went, read on.

We initially had 300 people interested in joining the hackathon before we even announced the theme and challenges. Two weeks before the event started, we announced that it would be about text mining and that we were aiming for group submissions from teams of 4 or 5 members. Around 60 people from all over the world formed 13 teams, of which 7 submitted at least one solution to one of the challenges presented at the hackathon. We’ve seen people from Australia, India, Germany, Poland, Senegal, the US, Canada, Nepal, Spain, the UK and Brazil!

Intro

A language-agnostic competition devoted to text mining, where every machine learning practitioner could find challenges to test their team!

At this hackathon you could choose the difficulty level and the area of the challenges yourself. Depending on your skills and the time you had, you could tune the fun on your own!

Table of Contents

1. Why text mining? 2. Why Hackathon? 3. Challenges
4. Competition Rules 5. Mentors and Judges 6. Sponsor
7. Talks 8. For whom? 9. Organizers

Winners

Challenge 1 – Predictions

Challenge 2 – Segmentation

Challenge 3 – Churn

Challenge 4 – Text Analysis / Revealing the content

Why text mining?

Text mining is widely known among machine learning practitioners. The increased interest in text mining is driven by the growing number of internet users and by the rapid growth of internet data, a large part of which is said to be text. Extracting information from articles, news, posts and comments has become a desirable skill, but what is needed even more are tools for diagnosing and visualizing text mining models.

Even though there are a lot of tools, books and webinars available online, there is still room for improvement and development.

Why Hackathon?

Hackathons are events where enthusiasts of a specific topic gather in one place to work together on challenges that arose for a particular community.

Hackathons tend to be time-pressured events, where solutions need to be created quickly and active cooperation between participants is necessary. To set the pace of the event, participants are divided into teams, which compete to prepare the most valuable solution and win a prize.

For a participant such an undertaking is a great chance to:

Challenges

Challenges and the guidance for solutions are published here

github.com/WhyR2020/hackathon

Competition Rules

Since the event was a competition with symbolic prizes, we wanted to grade the solutions. Solutions were sent as videos (we aimed at max 5 min per video!). Videos should present the insights developed to solve the stated challenges. Each team could send a solution for each challenge in a separate video (one video per challenge). Details of the hackathon criteria were announced at the opening of the hackathon and are listed below!

Each presented solution had to be submitted as a video. It was nice to have if the solution was based on a presentation or a dashboard. For challenges 2-4 the winning solution was chosen based on the insightfulness and usefulness of the identified patterns. For challenge 1 the winning solution was chosen based on a cost function; however, we also wanted to know how teams arrived at their predictions.

Mentors and Judges

Speakers of the hackathon: Julia Silge, Kenneth Benoit.

Judges and mentors: Mateusz Zawisza, Piotr Zielonka, Marcin Kosinski, Maciej Eder, Michał Burdukiewicz.

Sponsor

McKinsey Analytics in Poland combines advanced data analytics solutions with in-depth industry and business knowledge, including multiple sectors such as commerce, banking, insurance, telecommunications, industrial production and heavy industry. McKinsey data scientists and architects, together with machine learning and data engineers, complement strategic and operational consulting and provide clients with advanced and robust data-driven solutions.

McKinsey Analytics experts specialize in many different areas: statistical learning, deep learning, evolutionary and multi-criteria optimization, multi-agent simulations, game theory, reinforcement learning, advanced econometrics, causal & Bayesian inference, uplift modelling, Explainable Artificial Intelligence, visualization and data engineering.

We are all looking forward to sharing with you some insights on how to identify and capture the most value and the most meaningful insights from data, and turn them into competitive advantages!

Talks

For whom?

We strongly encouraged people with analytical thinking skills to participate in the event. Data analysts, developers, storytellers, BI consultants, web designers, researchers and data enthusiasts were all welcome, since they could learn a lot from one another!

The event was made just for you!

Event details

Organizers

Why R? Foundation – whyr.pl.
