Articles by arthur charpentier

Third Actuarial Pricing Game

January 9, 2017 | arthur charpentier

With the support of the ACTINFO Chair and the (French) Institute of Actuaries, our Third Actuarial Pricing Game starts today! There is a toolbox file available online, with a description of the game: the rules, the dates, and a description of the datasets. 3 datasets: one underwriting and one claims database, for ...
[Read more...]

(Advanced) R Crash Course, for Actuaries

January 6, 2017 | arthur charpentier

In two weeks, the third year of the Data Science for Actuaries program will start. I will be there for the introduction to R. The slides are available online (created with slidify). A markdown version is also available. I have to thank Ewen for his help on slidify (especially for the ...
[Read more...]

Forecasting Natural Catastrophes (is rather difficult)

January 2, 2017 | arthur charpentier

Following my previous post, I wanted to spend more time on the time series with “global weather-related disaster losses as a proportion of global GDP” over the time period 1990-2016 that Roger Pielke sent me last night. db=data.frame(year=1990:2016, ratio=c(.23,.27,.32,.37,.22,.26,.29,.15,.40,.28,.14,.09,.24,.18,.29,.51,.13,.17,.25,.13,.21,.29,.25,.2,.15,.12,.12)) In my previous post, I spent some ...
[Read more...]

What is a Linear Trend, by the way?

January 1, 2017 | arthur charpentier

I had a very strange discussion on Twitter (yes, another one), about regression curves. I think it started with a tweet based on an xkcd picture (just for fun, because it was New Year’s Day) “don’t trust linear regressions” https://t.co/exUCvyRd1G pic.twitter.com/O6...
[Read more...]

Rupture Detection

December 13, 2016 | arthur charpentier

There are some graphs that you cannot forget. One graph that I found puzzling was mentioned on Andrew Gelman’s blog, a few years back, and was related to rupture detection. What I remember from this graph is that if you want to get a rupture, you can easily find ...
[Read more...]

How long could it take to run a regression

April 6, 2016 | arthur charpentier

This afternoon, while I was talking with Montserrat (aka @mguillen_estany), we were wondering how long it might take to run a regression model. More specifically, how long it might take if we use a Bayesian approach. My guess was that the time should probably be linear in n, the number ...
[Read more...]

Where People Live, part 2

April 4, 2016 | arthur charpentier

Following my previous post, I wanted to use another dataset to visualize where people live, on Earth. The dataset comes from sedac.ciesin.columbia.edu. Once you register, you can download the database. base=read.table("glp00ag15.asc",skip=6) The database is a ‘big’ 1440×572 matrix, in each cell (...
[Read more...]

Classification on the German Credit Database

March 18, 2016 | arthur charpentier

In our data science course, this morning, we used random forests to improve prediction on the German Credit Dataset. The dataset is url="http://freakonometrics.free.fr/german_credit.csv"; credit=read.csv(url, header = TRUE, sep = ",") Almost all variables are treated as numeric, but actually, most of them ...
[Read more...]

Forecasts with ARIMA Models

March 16, 2016 | arthur charpentier

In our time series class this morning, I was discussing forecasts with ARIMA Models. Consider a simple simulated stationary AR(1) time series: n=95; set.seed(1); E=rnorm(n); X=rep(0,n); phi=.85; for(t in 2:n) X[t]=phi*X[t-1]+E[t]; plot(X,type="l") If we fit ...
[Read more...]

Where People Live

March 3, 2016 | arthur charpentier

There was an interesting map on reddit this morning, with a visualisation of the latitude and longitude of where people live, on Earth. So I tried to reproduce it. To compute the density, I used a kernel-based approach: library(maps); data("world.cities"); X=world.cities[,c("lat","pop")]; liss=...
[Read more...]

Mortality by Weekday and Age

February 27, 2016 | arthur charpentier

A few days ago, I mentioned on Twitter a nice graph, with Mortality by Weekday and Age: "https://t.co/LyzQ7nJABZ very interesting difference, young vs. old pic.twitter.com/EfrX0C1GBS" (Arthur Charpentier, @freakonometrics, February 27, 2016). My colleague Jean-Philippe was extremely sceptical, so I tried to ...
[Read more...]

Reverse Engineering with Correlated Features

February 11, 2016 | arthur charpentier

In econometric modeling, I usually have a problem with correlated features. A few weeks ago, I was discussing feature selection when features are correlated. This week, I was wondering about reverse engineering when features might be correlated (not to say very correlated). The way I see reverse engineering is the ...
[Read more...]

Clustering French Cities (based on Temperatures)

February 11, 2016 | arthur charpentier

In order to illustrate hierarchical clustering techniques and k-means, I borrowed François Husson’s dataset, with monthly average temperatures in several French cities. temp=read.table("http://freakonometrics.free.fr/FR_temp.txt", header=TRUE,dec=",") We have 15 cities, with monthly observations: X=temp[,1:12]; boxplot(X) Since the ...
[Read more...]

Clusters of Texts

February 10, 2016 | arthur charpentier

Another popular application of classification techniques is text mining (see e.g. an old post on French presidents’ speeches). Consider the following example, inspired by Norbert Ryciak’s post, with 12 Wikipedia pages, on various topics: library(tm); library(stringi); library(proxy); titles = c("Boosting_(machine_learning)", "Random_forest", "K-nearest_neighbors_...
[Read more...]