
Working with Big Data with Hadoop and Spark workshop


Learn how to work with Big Data with Hadoop and Spark! Join our workshop on Working with Big Data with Hadoop and Spark, which is part of our workshops for Ukraine series.

Here’s some more info:

Title: Working with Big Data with Hadoop and Spark

Date: Thursday, May 18th, 18:00 – 20:00 CEST (Rome, Berlin, Paris timezone)

Speaker: Jannic Cutura is an economist turned data engineer turned software engineer who works as a Python developer on the European Central Bank’s Stress Test team. Prior to his current position, he worked as a research analyst/data engineer in the financial stability and monetary policy divisions of the ECB. He holds a master’s degree and a Ph.D. in quantitative economics from Goethe University Frankfurt and has conducted research projects at the BIS, the IMF, and Columbia University.

Description: Big data (datasets that are difficult to handle on standalone retail-grade computers) are rapidly becoming the norm in social science research. This is true both in academia and for policy-oriented research at central banks and similar institutions, let alone in industry applications. Yet traditional econometrics (and econometrics training) tells us little about how to work efficiently with large datasets. In practice, any dataset larger than the researcher’s computer memory (roughly 20-30 GB) is very challenging to handle: once that barrier is crossed, most data manipulation tasks become painfully slow and prone to failure. The goal of this presentation is to (i) explain what happens under the hood when your computer gets slow and (ii) show how distributed computing (in particular Hadoop/Spark) can help mitigate those issues. By the end, participants will understand the power of distributed computing and how they can use it to tackle existing data handling challenges as well as new ones that were previously prohibitively expensive to evaluate on retail-grade computers. The workshop will contain both a theory part and a lab session using Databricks. If you want to follow along during the live session, you can create your own free Databricks account by signing up for the Community Edition (no credit card required).
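To give a flavour of the kind of workflow the lab session is about, here is a minimal sketch in R using the sparklyr package. This is only an illustration, not the workshop’s official material: the local Spark installation and the file path flights.csv are assumptions for the example.

library(sparklyr)
library(dplyr)

# Connect to a local Spark instance; on Databricks you would typically
# connect with spark_connect(method = "databricks") instead.
sc <- spark_connect(master = "local")

# Register a CSV file as a Spark table. Spark keeps the data out of R's
# memory and only computes results when they are explicitly collected.
flights <- spark_read_csv(sc, name = "flights", path = "flights.csv")

# dplyr verbs are translated to Spark SQL and executed in parallel, so
# aggregations still work when the data exceed the machine's RAM.
flights %>%
  group_by(carrier) %>%
  summarise(mean_delay = mean(dep_delay, na.rm = TRUE)) %>%
  collect()

spark_disconnect(sc)

The key idea the example illustrates is lazy, distributed evaluation: the heavy lifting happens inside Spark, and only the small aggregated result is pulled back into the R session with collect().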

Minimal registration fee: 20 euro (or 20 USD or 800 UAH)

How can I register?

If you are not personally interested in attending, you can also contribute by sponsoring the participation of a student, who will then be able to attend for free. If you choose to sponsor a student, all proceeds will likewise go directly to organisations working in Ukraine. You can either sponsor a particular student or leave it up to us to allocate the sponsored place to a student who has signed up for the waiting list.


How can I sponsor a student?

If you are a university student and cannot afford the registration fee, you can also sign up for the waiting list here. (Note that signing up for the waiting list does not guarantee participation.)


You can also find more information about this workshop series, a schedule of our future workshops, and a list of our past workshops (for which you can get the recordings & materials) here.


Looking forward to seeing you during the workshop!