Articles by arthur charpentier

Forecast, Automatic Routines vs. Experience

March 18, 2015 | arthur charpentier

This morning, in our Time Series course, we played with some data I got from google.ca/trends/. Actually, we worked with an older version, downloaded 18 months ago (discussed in a previous post, in French).
urls = "http://freakonometrics.free.fr/report-headphones-2015.csv"
report=read.table(urls,... [Read more...]

Growing some Trees

March 18, 2015 | arthur charpentier

Consider here the dataset used in a previous post, about visualising a classification (with more than 2 features),
MYOCARDE=read.table("http://freakonometrics.free.fr/saporta.csv", header=TRUE,sep=";")
The default classification tree is
arbre = rpart(factor(PRONO)~.,data=MYOCARDE)
rpart.plot(arbre,type=4,extra=6)
We can change the options ... [Read more...]

Visualising a Classification in High Dimension

March 6, 2015 | arthur charpentier

So far, when discussing classification, we’ve been playing with my toy dataset (actually, I should not claim it’s mine, it is inspired by the one used in the introduction of Boosting, by Robert Schapire and Yoav Freund). But in real life, there are more observations, and more explanatory variables.... [Read more...]

John Snow, and Google Maps

February 27, 2015 | arthur charpentier

In my previous post, I discussed how to use OpenStreetMap (and standard plotting functions of R) to visualize John Snow’s dataset. But it is also possible to use Google Maps (and ggplot2 types of graphs).
library(ggmap)
get_london [Read more...]

John Snow, and OpenStreetMap

February 27, 2015 | arthur charpentier

While I was preparing a training session on data visualization, I wanted to get a nice visual for John Snow’s cholera dataset. This dataset can actually be found in a great package of famous historical datasets.
library(HistData)
data(Snow.deaths)
data(Snow.streets)
One can easily visualize the ... [Read more...]

Visualizing Clusters

February 24, 2015 | arthur charpentier

Consider the following dataset, with (only) ten points
x=c(.4,.55,.65,.9,.1,.35,.5,.15,.2,.85)
y=c(.85,.95,.8,.87,.5,.55,.5,.2,.1,.3)
plot(x,y,pch=19,cex=2)
We want to get – say – two clusters. Or more specifically, two sets of observations, each of them sharing some similarities. Since the number of observations is rather small, it is actually possible to ... [Read more...]

k-means clustering and Voronoi sets

February 22, 2015 | arthur charpentier

In the context of k-means, we want to partition the space of our observations into k classes. Each observation belongs to the cluster with the nearest mean. Here “nearest” is in the sense of some norm, usually the Euclidean norm. Consider the case where we have 2 classes. The means being respectively ... [Read more...]
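
The nearest-mean rule described in this excerpt can be sketched in a few lines of base R. This is only an illustration under stated assumptions: it reuses the ten points from the "Visualizing Clusters" excerpt above, and the two cluster means are arbitrary values picked for the example, not taken from the post.

```r
# Ten points from the "Visualizing Clusters" excerpt
x <- c(.4, .55, .65, .9, .1, .35, .5, .15, .2, .85)
y <- c(.85, .95, .8, .87, .5, .55, .5, .2, .1, .3)
pts <- cbind(x, y)

# Two ASSUMED cluster means (arbitrary, for illustration only)
means <- rbind(c(0.3, 0.3), c(0.7, 0.8))

# Squared Euclidean distance from every point to every mean
# (one column per mean, one row per observation)
d2 <- sapply(seq_len(nrow(means)), function(j) {
  centre <- matrix(means[j, ], nrow(pts), ncol(pts), byrow = TRUE)
  rowSums((pts - centre)^2)
})

# Each observation is assigned to the cluster with the nearest mean
cluster <- apply(d2, 1, which.min)

# With two means, the Voronoi boundary is the perpendicular bisector
# of the segment joining them; this midpoint lies on that boundary
midpoint <- colMeans(means)
```

Updating each mean to the average of its assigned points and repeating the assignment step gives the usual k-means iteration; the built-in `kmeans()` function in the stats package does this for you.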

Inequalities and Quantile Regression

February 6, 2015 | arthur charpentier

In the course on inequality measures, we saw how to compute various (standard) inequality indices, based on some sample of incomes (that can be binned, in various categories). On Thursday, we discussed the fact that incomes can be related to different variables (e.g. experience), and that comparing income inequalities ... [Read more...]

Modeling Incomes and Inequalities

January 17, 2015 | arthur charpentier

Last week, in our Inequality course, we looked at data. We started with some simulated data, just a few observations
library("ineq")
load(url("http://freakonometrics.free.fr/income_5.RData"))
(income=sort(income))
[1] 19233 23707 53297 61667 218662
How could we say that there is inequality in this sample? If we look at ... [Read more...]
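
One possible way to quantify inequality in this five-income sample is the Gini index. The sketch below is an assumption about where the discussion could go (the excerpt is truncated), and it uses only base R with the values printed above, so it does not depend on the remote `.RData` file.

```r
# Sample printed in the excerpt, already sorted in increasing order
income <- c(19233, 23707, 53297, 61667, 218662)
n <- length(income)

# Lorenz curve: cumulative share of total income held by the poorest i/n
lorenz <- cumsum(income) / sum(income)

# Gini index from the rank-weighted formula
#   G = sum_i (2i - n - 1) x_(i) / (n * sum_i x_(i))
gini <- sum((2 * seq_len(n) - n - 1) * income) / (n * sum(income))

# Share of total income held by the single richest individual
top_share <- income[n] / sum(income)
```

A Gini index of 0 would mean all five incomes are equal; the further the Lorenz curve falls below the diagonal, the closer the index gets to its maximum.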

Confidence vs. Credibility Intervals

November 26, 2014 | arthur charpentier

Tomorrow, for the final lecture of the Mathematical Statistics course, I will try to illustrate - using Monte Carlo simulations - the difference between classical statistics and the Bayesian approach. The (simple) way I see it is the following: for frequentists, a probability is a measure of the frequency ... [Read more...]
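
A minimal Monte Carlo sketch of the contrast, not the simulation used in the lecture: the binomial setting, the sample size, the true proportion, the observed count and the uniform prior are all assumptions made for this illustration.

```r
set.seed(1)
n      <- 20     # ASSUMED sample size
p_true <- 0.3    # ASSUMED true proportion (fixed but unknown to the analyst)
nsim   <- 2000   # number of simulated samples

# Frequentist view: the parameter is fixed and the interval is random.
# Over repeated samples, about 95% (or more) of intervals should cover p_true.
covered <- replicate(nsim, {
  x  <- rbinom(1, n, p_true)
  ci <- binom.test(x, n)$conf.int   # exact (Clopper-Pearson) 95% interval
  ci[1] <= p_true && p_true <= ci[2]
})
coverage <- mean(covered)

# Bayesian view: the data are fixed and the parameter is random.
# With a uniform Beta(1, 1) prior and x_obs successes out of n,
# the posterior is Beta(1 + x_obs, 1 + n - x_obs).
x_obs <- 6       # ASSUMED observed number of successes
credible <- qbeta(c(0.025, 0.975), 1 + x_obs, 1 + n - x_obs)
```

Here `coverage` is a statement about the procedure across many samples, while `credible` is a statement about the parameter given one observed sample.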

Reinterpreting Lee-Carter Mortality Model

November 18, 2014 | arthur charpentier

Last week, while I was giving my crash course on R for insurance, we discussed possible extensions of the Lee & Carter (1992) model. If we look at the seminal paper, the model is defined as follows […] Hence, it means that […] This would be a (non)linear model on the logarithm ... [Read more...]
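
For reference, the Lee-Carter model is commonly written as below. The notation is the one usually found in the literature and is an assumption here; it may differ from the symbols used in the post, whose equations are missing from this excerpt.

```latex
\log m_{x,t} = a_x + b_x \, k_t + \varepsilon_{x,t},
\qquad \sum_x b_x = 1, \quad \sum_t k_t = 0
```

Here $m_{x,t}$ is the central death rate at age $x$ in year $t$, $a_x$ is an age-specific level, $k_t$ is a time index of mortality, and $b_x$ measures how strongly age $x$ responds to changes in $k_t$. The product $b_x k_t$ is what makes the model nonlinear in its parameters, and the two constraints are the usual identifiability conditions.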