This blog post is an excerpt of my ebook Modern R with the tidyverse that you can read for
free here. This is taken from Chapter 5, which presents
the {tidyverse} packages and how to use them to compute descriptive statistics and manipulate data.
In the text below, I show how ... [Read more...]
This blog post is an excerpt of my ebook Modern R with the tidyverse that you can read for
free here. This is taken from Chapter 5, which presents
the {tidyverse} packages and how to use them to compute descriptive statistics and manipulate data.
In the text below, I scrape a ... [Read more...]
This blog post is an excerpt of my ebook Modern R with the tidyverse that you can read for
free here. This is taken from Chapter 7, which deals
with statistical models. In the text below, I explain what hyper-parameters are, and as an example
I run a ridge regression using ... [Read more...]
Introduction
This blog post will use several packages from the
{tidymodels} collection of packages, namely
{recipes},
{rsample} and
{parsnip} to train a random forest the tidy way. I will
also use {mlrMBO} to tune the hyper-parameters of the random forest.
Set up
Let’s load the needed packages:
Inspired by David Schoch’s blog post,
Traveling Beerdrinker Problem.
Check out his blog; he has some amazing posts!
Introduction
Luxembourg, like any proper European country, is full of castles. According to Wikipedia,
“By some optimistic estimates, there are as many as 130 castles in Luxembourg but more realistically
there are ... [Read more...]
Introduction
In this blog post, I’ll use the data that I cleaned in a previous
blog post, which you can download
here. If you want to follow along,
download the monthly data. In my last blog post
I showed how to perform a grid search the “tidy” way. As ... [Read more...]
Introduction
In this blog post, I’ll use the data that I cleaned in a previous
blog post, which you can download
here. If you want to follow along,
download the monthly data.
In the previous blog post, I used the auto.arima() function to very quickly get a “good-enough”
... [Read more...]
In this blog post, I will show you how you can quickly and easily forecast a univariate time series.
I am going to use data from the EU Open Data Portal on air passenger transport. You can find the
data here. I downloaded
the data in the TSV format for ... [Read more...]
Link to webscraping the data
Link to Analysis, part 1
Introduction
This is the third blog post that deals with data from the game NetHack, and oh boy, did a lot of
things happen since the last blog post! Here’s a short timeline of the events:
I scraped data from ... [Read more...]
Abstract
In this post, I will analyse the data I scraped and put into an R package, which I called {nethack}.
NetHack is a roguelike game; for more context, read my previous blog
post.
You can install the {nethack} package and play around with the data yourself by installing it ... [Read more...]
"If someone told me a decade ago (back before I'd ever heard the term "roguelike") what I'd be doing today, I would have trouble believing this...Yet here we are." pic.twitter.com/N6Hh6A4tWl (Josh Ge, @GridSageGames, June 21, 2018)
Abstract
In this post, I am going to show ...
Abstract
You can find the data used in this blog post here: https://github.com/b-rodrigues/elections_lux
This is a follow-up to a previous blog post
where I extracted data on the 2018 Luxembourgish elections from Excel Workbooks.
Now that I have the data, I will create a map ... [Read more...]
In this blog post, similar to a previous blog post,
I am going to show you how we can go from an Excel workbook that contains data to a flat file. I will be
taking advantage of the structure of the tables inside the Excel sheets by writing a function
that extracts ... [Read more...]
I was recently confronted with the following problem: creating hundreds of plots that could still be
edited by our client. What this meant was that I needed to export the graphs to Excel or PowerPoint
or some other such tool that was familiar to the client, and not export the ... [Read more...]
In a previous blog post I showed
how you could use the {tidyxl} package to go from a human-readable Excel Workbook to a tidy
data set (or flat file, as they are also called). Some people then contributed their solutions,
which is always something I really enjoy when ... [Read more...]
I won’t write a very long introduction; we all know that Excel is ubiquitous in business, and that
it has a lot of very nice features, especially for business practitioners that do not know any
programming. However, when people use Excel for purposes it was not designed for, it ... [Read more...]
I’ve been using GNU+Linux distros for about 10 years now, and settled on openSUSE as my main operating system around 3 years ago, perhaps even more. If you’re a gamer, you might have heard about SteamOS
and how more and more games are available on GNU+Linux. I ... [Read more...]
First of all, is it heteroskedasticity or heteroscedasticity? According to
McCulloch (1985),
heteroskedasticity is the proper spelling, because when transliterating Greek words, scientists
use the Latin letter k in place of the Greek letter κ (kappa). κ sometimes is transliterated as
the Latin letter c, but only when these words entered the English ... [Read more...]
In this blog post I will discuss missing data imputation and instrumental variables regression. This
is based on a short presentation I will give at my job. You can find the data used here on this
website: http://eclr.humanities.manchester.ac.uk/index.php/IV_in_R
The data ... [Read more...]
I’ve been measuring my weight almost daily for almost 2 years now; I actually started earlier, but
not as consistently. The goal of this blog post is to get reacquainted with time series; I haven’t
had the opportunity to work with time series for a long time now and ... [Read more...]