Articles by arthur charpentier

On my way to Manizales (Colombia)

June 16, 2019 | arthur charpentier

Next week, I will be in Manizales, Colombia, for the Third International Congress on Actuarial Science and Quantitative Finance. I will be giving a lecture on Wednesday with Jed Frees and Emiliano Valdez. I will give my course on Algorithms for Predictive Modeling on Thursday morning (after Jed and Emil’...
[Read more...]

Pareto Models for Top Incomes

June 3, 2019 | arthur charpentier

With Emmanuel Flachaire, we uploaded on hal a paper on Pareto Models for Top Incomes. Top incomes are often related to the Pareto distribution. To date, economists have mostly used the Pareto Type I distribution to model the upper tail of income and wealth distributions. It is a parametric distribution, with an ... [Read more...]
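
The teaser cuts off before the details, but the Pareto Type I tail fit it refers to can be sketched in a few lines of R. This is a minimal illustration on simulated data (the log-normal body, the top-5% threshold u and the sample are my assumptions, not the paper's data): above a threshold u, the maximum-likelihood estimator of the tail index is the number of exceedances divided by the sum of log(x/u).

# hedged sketch: Pareto Type I tail fitted by maximum likelihood above a threshold
set.seed(1)
income <- exp(rnorm(10000, 10, 1))            # hypothetical incomes (log-normal body)
u <- quantile(income, .95)                    # threshold: top 5% of the sample
top <- income[income > u]
alpha_hat <- length(top) / sum(log(top / u))  # MLE of the Pareto I tail index
alpha_hat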

Estimates on training vs. validation samples

May 23, 2019 | arthur charpentier

Before moving to cross-validation, it was natural to say “I will burn 50% (say) of my data to train a model, and then use the remaining to validate it”. For instance, we can use training data for variable selection (e.g. using some stepwise procedure in a logistic regression), and ...
[Read more...]
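
A minimal sketch of that idea in R, with choices that are mine and not necessarily the post's (a 50/50 split of the infert dataset and stepwise selection via MASS::stepAIC):

library(MASS)                                  # for stepAIC
set.seed(123)
n <- nrow(infert)
idx <- sample(n, n/2)                          # "burn" half of the data for training
train <- infert[idx, ]
valid <- infert[-idx, ]
full <- glm(case ~ age + parity + induced + spontaneous,
            data = train, family = binomial)
selected <- stepAIC(full, trace = FALSE)       # stepwise selection on the training half
p <- predict(selected, newdata = valid, type = "response")
mean((p > .5) == valid$case)                   # accuracy on the validation half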

The “probability to win” is hard to estimate…

November 6, 2018 | arthur charpentier

Real-time computation (or estimation) of the “probability to win” is difficult. We’ve seen that in soccer games, in elections… but actually, as a professor, I see that frequently when I grade my students. Consider a classical multiple choice exam. After each question, imagine that you try to compute the ...
[Read more...]
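
For instance, one can recompute, after each question, the probability that the final grade reaches the passing threshold. A toy sketch in R, under assumptions that are mine and not the post's (20 questions, a 50% passing grade, answers correct independently with probability p):

m <- 20; pass <- 10; p <- .6
set.seed(1)
answers <- rbinom(m, 1, p)                 # one simulated exam
prob_win <- sapply(1:m, function(k) {
  s <- sum(answers[1:k])                   # correct answers after k questions
  1 - pbinom(pass - s - 1, m - k, p)       # P(enough correct answers remain)
})
round(prob_win, 3)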

Solving the Chinese postman problem

October 19, 2018 | arthur charpentier

Some pre-Halloween post today. It actually started while I was in Barcelona: the kids wanted to go back to some store we’d seen on the first day, in the Gothic quarter, and I could not remember where it was. And I said to myself that it would be quite long to do ...
[Read more...]

Monte Carlo techniques to create counterfactuals

October 11, 2018 | arthur charpentier

In the previous STT5100 course, last week, we’ve seen how to use Monte Carlo simulations. The idea is that, in statistics, we observe a sample {y_1, ..., y_n} and, more generally, in econometrics, {(y_1, x_1), ..., (y_n, x_n)}. But let’s get back to statistics (without covariates) to illustrate. We assume that observations are realizations of ... [Read more...]
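
A minimal sketch of that Monte Carlo idea, with a Gaussian model and made-up data (my assumptions, not the STT5100 code): fit the parameters, then draw new samples from the fitted distribution to study the sampling distribution of an estimator.

set.seed(1)
y <- rnorm(100, mean = 2, sd = 1.5)                 # observed sample
mu_hat <- mean(y); sigma_hat <- sd(y)               # fitted parameters
y_star <- replicate(1000, rnorm(length(y), mu_hat, sigma_hat))  # simulated samples
hist(apply(y_star, 2, mean), main = "Monte Carlo distribution of the sample mean")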

October, grant proposal season

October 9, 2018 | arthur charpentier

In 2012, Danielle Herbert, Adrian Barnett, Philip Clarke and Nicholas Graves published an article entitled “on the time spent preparing grant proposals: an observational study of Australian researchers”, whose conclusions had been included in Nature under a more explicit title, “Australia’s grant system wastes time”! In this study, they included 3700 ...
[Read more...]

Combining automatically factor levels in R

October 6, 2018 | arthur charpentier

Each time we face real applications in an applied econometrics course, we have to deal with categorical variables. And the same question arises from students: how can we combine factor levels automatically? Is there a simple R function? I have uploaded a few blog posts over the past years. But ...
[Read more...]
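
One simple option, sketched here on simulated data (an illustration of the idea, not necessarily the approach from the post): let a regression tree group the levels of the categorical predictor, and read the merged levels off the leaves.

library(rpart)
set.seed(1)
x <- factor(sample(LETTERS[1:8], 500, replace = TRUE))
mu <- c(A = 0, B = 0, C = 1, D = 1, E = 1, F = 2, G = 2, H = 2)
y <- mu[as.character(x)] + rnorm(500)
tree <- rpart(y ~ x, control = rpart.control(cp = .01))
x_merged <- factor(predict(tree))   # leaves of the tree = merged groups of levels
table(x, x_merged)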

Convex Regression Model

July 5, 2018 | arthur charpentier

This morning during the lecture on nonlinear regression, I mentioned (very) briefly the case of convex regression. Since I forgot to mention the codes in R, I will publish them here. Assume that y_i = m(x_i) + ε_i where m is some convex function. Then m is convex if and only if, for all x_1, x_2 and all t in [0,1], m(t x_1 + (1-t) x_2) ≤ t m(x_1) + (1-t) m(x_2). Hildreth (1954) proved that if ... then ... is ...
[Read more...]
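
One way to compute such a fit, on simulated univariate data (a sketch with quadprog, not necessarily the code from the lecture): least squares on the fitted values, under the constraint that the slopes between consecutive points are nondecreasing.

library(quadprog)
set.seed(1)
n <- 100
x <- sort(runif(n, -2, 2))
y <- x^2 + rnorm(n, sd = .5)
A <- matrix(0, n - 2, n)                 # convexity: nondecreasing consecutive slopes
for (i in 2:(n - 1)) {
  A[i - 1, i - 1] <-  1 / (x[i] - x[i - 1])
  A[i - 1, i]     <- -1 / (x[i] - x[i - 1]) - 1 / (x[i + 1] - x[i])
  A[i - 1, i + 1] <-  1 / (x[i + 1] - x[i])
}
fit <- solve.QP(Dmat = diag(n), dvec = y, Amat = t(A), bvec = rep(0, n - 2))
plot(x, y); lines(x, fit$solution, col = "red", lwd = 2)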

Game of Friendship Paradox

June 27, 2018 | arthur charpentier

In the introduction of my course next week, I will (briefly) mention networks, and I wanted to provide some illustration of the Friendship Paradox. In Network of Thrones (discussed in Beveridge and Shan (2016)), there is a dataset with the network of characters in Game of Thrones. The word “friend” might ...
[Read more...]
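
The paradox ("on average, your friends have more friends than you do") is easy to check numerically. A small sketch on a random graph with igraph (not the Game of Thrones dataset used in the post):

library(igraph)
set.seed(1)
g <- sample_gnp(200, p = .05)
d <- degree(g)
friends_deg <- sapply(V(g), function(v) mean(degree(g, neighbors(g, v))))
mean(d)                              # average number of friends
mean(friends_deg, na.rm = TRUE)      # average number of friends of friends (larger)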

Linear Regression, with Map-Reduce

June 18, 2018 | arthur charpentier

Sometimes, with big data, matrices are too big to handle, and it is possible to use tricks to still do the math numerically. Map-Reduce is one of those. With several cores, it is possible to split the problem, to map on each machine, and then to aggregate it back at ... [Read more...]
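
A minimal sketch of that idea (an illustration on simulated data, not the post's code): compute X'X and X'y on each chunk (map), sum the pieces (reduce), then solve the normal equations.

set.seed(1)
n <- 10000
X <- cbind(1, matrix(rnorm(n * 3), n, 3))
y <- X %*% c(1, 2, -1, .5) + rnorm(n)
chunks <- split(1:n, rep(1:4, length.out = n))           # 4 "machines"
pieces <- lapply(chunks, function(i)                     # map step
  list(XtX = crossprod(X[i, ]), Xty = crossprod(X[i, ], y[i])))
XtX <- Reduce(`+`, lapply(pieces, `[[`, "XtX"))          # reduce step
Xty <- Reduce(`+`, lapply(pieces, `[[`, "Xty"))
solve(XtX, Xty)                                          # OLS estimates of the coefficients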

Quantile Regression (home made)

June 14, 2018 | arthur charpentier

After my series of posts on classification algorithms, it’s time to get back to R codes, this time for quantile regression. Yes, I still want to get a better understanding of optimization routines in R. Before looking at quantile regression, let us compute the median, or the quantile, ...
[Read more...]
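
The starting point can be sketched directly: the tau-quantile is the minimizer of the pinball (check) loss, so a generic optimizer recovers it. A minimal illustration (my example, not necessarily the post's code):

set.seed(1)
y <- rexp(1000)
tau <- .9
pinball <- function(q) sum((y - q) * (tau - (y < q)))   # check/pinball loss
optimize(pinball, range(y))$minimum                     # close to quantile(y, .9)
quantile(y, tau)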

Discrete or continuous modeling?

June 13, 2018 | arthur charpentier

Tuesday, we held our conference “Insurance, Actuarial Science, Data & Models”, and Dylan Possamaï gave a very interesting concluding talk. In the introduction, he briefly came back to a nice discussion we usually have in economics about the kind of model we should consider. It was about optimal control. In many ...
[Read more...]

Classification from scratch, boosting 11/8

June 8, 2018 | arthur charpentier

Eleventh post of our series on classification from scratch. Today, that should be the last one… unless I forgot something important. So today, we discuss boosting. An econometrician’s perspective: I might start with a non-conventional introduction. But that’s actually how I understood what boosting was about. And I am ...
[Read more...]

Classification from scratch, bagging and forests 10/8

June 8, 2018 | arthur charpentier

Tenth post of our series on classification from scratch. Today, we’ll see the heuristics of the algorithm inside bagging techniques. Often, bagging is associated with trees, to generate forests. But actually, it is possible to use bagging with any kind of model. Recall that bagging means “bootstrap aggregation”. So, consider ...
[Read more...]
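
To echo the point that bagging is not restricted to trees, here is a small sketch of bootstrap aggregation with a logistic regression (the iris-based example and the choice of covariates are mine, not the post's):

set.seed(1)
df <- iris[iris$Species != "setosa", ]
df$y <- as.numeric(df$Species == "virginica")
B <- 100
preds <- replicate(B, {
  idx <- sample(nrow(df), replace = TRUE)               # bootstrap sample
  fit <- glm(y ~ Sepal.Length + Sepal.Width, data = df[idx, ], family = binomial)
  predict(fit, newdata = df, type = "response")
})
bagged <- rowMeans(preds)                               # aggregate the B predictions
mean((bagged > .5) == df$y)                             # in-sample accuracy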
