Simulate data with R

Mic

8 years ago

[This article was first published on The Beginner Programmer, and kindly contributed to R-bloggers].


Last semester I was attending a boring class: even though the professor was really clever, he kept circling around the main topic and never got straight to the point. While thinking about everything but the class, I had an idea. When you are given a set of data, say X and Y, you can easily fit a linear regression model, i.e. the regression line, and learn something about the data. You also learn about the errors the linear model makes in predicting the data (the residuals). If you estimate the distribution of those errors, you can simulate data similar to the original: take the values predicted by the regression line and add a random error drawn from that distribution.
Furthermore, for a least-squares line fitted with an intercept, the errors average to 0, so only their spread has to be estimated.

Here is the code to implement this idea in R. You can get the data to work on at the bottom of the page.
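What follows is a minimal sketch of the idea rather than the original listing. It assumes synthetic `x` and `y` in place of the linked data set, and it assumes the errors are roughly normal, so that their distribution is described by a mean of 0 and an estimated standard deviation. The names `res_sd`, `y_sim` and `plot_actual_vs_simulated` are made up for this illustration.

```r
# Sketch: simulate data from a fitted regression line.
# x and y are synthetic stand-ins for the real data set (assumption).
set.seed(42)

n <- 200
x <- runif(n, min = 0, max = 10)
y <- 3 + 2 * x + rnorm(n, mean = 0, sd = 1.5)

# Fit the regression line and extract the errors it makes (the residuals).
fit <- lm(y ~ x)
res <- residuals(fit)

# The model has an intercept, so the residuals average to zero;
# only their spread needs to be estimated.
res_sd <- sd(res)

# Assumption: the errors are roughly normal. Draw fresh errors and add
# them to the values predicted by the regression line.
y_sim <- fitted(fit) + rnorm(n, mean = 0, sd = res_sd)

# Hypothetical helper: actual data in blue, simulated data in red,
# with the fitted regression line drawn on top.
plot_actual_vs_simulated <- function(x, y, y_sim, fit) {
  plot(x, y, col = "blue", pch = 16, ylim = range(y, y_sim),
       xlab = "x", ylab = "y")
  points(x, y_sim, col = "red", pch = 16)
  abline(fit)
}
```

Calling `plot_actual_vs_simulated(x, y, y_sim, fit)` draws the comparison. If a histogram of `res` does not look normal, one alternative under the same idea is to resample the observed residuals with `sample(res, n, replace = TRUE)` instead of calling `rnorm`.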

The result should look something like this: the actual data in blue and the simulated data in red.

I hope this was useful. If you know the name of this method, please leave a comment and let me know. Click here to get the data I used.

