Like many concepts in mathematics, linearity has multiple interpretations and meanings. What do we mean when we say something is linear? Well, it depends on what follows. A linear relationship is different from a linear system. Let’s see why.
Linear Relationships
Suppose I’m Mark Zuckerberg, and I have a theory that mathematicians are motivated by money [1]. My hypothesis is that there is a linear relationship between the percentage of people majoring in mathematics (enrollment) and salary (salary):
enrollment <- c(rep(.7,3), rep(.6,2), rep(.7,5), rep(.8,4), rep(.9,2))
names(enrollment) <- 1:length(enrollment) + 1994
salary <- c(rep(94,3), rep(82,2), rep(94,5), rep(106,4), rep(118,2))
names(salary) <- 1:length(salary) + 1994
Plotting out this contrived example, we see that the relationship does seem to take the shape of a line.
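How might we find the slope and y-intercept? Recall that in algebra, we first solve for the slope $m$ using two points, $m = (y_2 - y_1)/(x_2 - x_1)$, and then substitute one of the points into $y = mx + b$ to find the intercept $b$. In R, we can let lm do the work.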
> lm(enrollment ~ salary, data.frame(enrollment, salary))

Call:
lm(formula = enrollment ~ salary, data = data.frame(enrollment, salary))

Coefficients:
(Intercept)       salary
  -0.083333     0.008333
We call these linear models precisely because the relationship between the response and the predictors can be described by a line. Note that it doesn’t matter if we have one or more than one predictor: with several predictors the equation describes a plane or hyperplane, which is simply the higher-dimensional analogue of a line.
If the data provided are messy, then no single line passes through every point, and the exact equation for a line might not work out. We would see slight variations in the slope and intercept depending on which data points we chose when solving for the slope. A linear regression uses all of the points at once, which generally gets us closer to a true value in this circumstance. Keep this in the back of your head as we go forward.
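As a rough sketch of the two approaches, the snippet below computes the slope and intercept by hand from two of the points and compares them with the coefficients from lm. It reuses the contrived vectors from this section; the helper name line_through is hypothetical and exists only for illustration.

```r
# Same contrived data as in this section (names omitted for brevity)
enrollment <- c(rep(.7,3), rep(.6,2), rep(.7,5), rep(.8,4), rep(.9,2))
salary <- c(rep(94,3), rep(82,2), rep(94,5), rep(106,4), rep(118,2))

# Hypothetical helper: slope and intercept of the line through points i and j
line_through <- function(x, y, i, j) {
  m <- (y[j] - y[i]) / (x[j] - x[i])   # rise over run
  b <- y[i] - m * x[i]                 # substitute point i into y = m*x + b
  c(intercept = b, slope = m)
}

# Points 1 and 4 are (94, .7) and (82, .6)
by_hand <- line_through(salary, enrollment, 1, 4)
by_lm <- coef(lm(enrollment ~ salary))

print(by_hand)
print(by_lm)
```

Because this data set lies exactly on a line, both approaches should agree up to rounding. With noisy data, different pairs of points would give different answers, while lm would still return a single fit.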
Linear Systems
Moving on to a linear system, otherwise known as a system of linear equations, you might think that this is simply a bunch of equations for lines thrown together. While almost right, you’d be slightly off the Mark. For convenience let’s look at a system of two variables, such as the one in Example 1.6 of Cherney, Denton, and Waldron:
$2x + 6y = 20$
$4x + 8y = 28$
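How is this different from the structure of the equation in the first section? You might have to squint to see it, as it’s subtle. One thing is that we can solve for $x$ and $y$ directly, since the coefficients are already given. But what happens if we hand the same numbers to lm as though they were data?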
> lm(z ~ x + y, data.frame(x=c(2,4), y=c(6,8), z=c(20,28)))

Call:
lm(formula = z ~ x + y, data = data.frame(x = c(2, 4), y = c(6, 8), z = c(20, 28)))

Coefficients:
(Intercept)            x            y
         12            4           NA
This is clearly wrong. Before, the regression worked out perfectly and gave us the slope and intercept. So what’s going on? In the first example we had one line, but now we have two. Let’s rearrange the equations into slope-intercept form to see.
$y = -\frac{1}{3}x + \frac{10}{3}$
$y = -\frac{1}{2}x + \frac{7}{2}$
Plotting these two equations illustrates the difference nicely.
> f1 <- function(x) -1/3 * x + 10/3
> f2 <- function(x) -1/2 * x + 7/2
> xs <- seq(-2,2, by=.1)
> plot(xs, f1(xs), type='l', col='brown')
> lines(xs, f2(xs), col='orange')
> points(1,3)
The lesson here is that a linear relationship is a single relationship between a dependent variable (response) and a set of independent variables (predictors). Ultimately you are looking for the coefficients that describe a line. On the other hand, a system of linear equations describes multiple linear relationships. What’s important is that here you are solving for the variables that solve all the equations simultaneously.
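To make the contrast concrete, here is a minimal sketch that solves the two-variable system with base R's solve, treating the numbers as coefficients and not as observations. The variable names rhs and solution are my own choices, not part of the original example.

```r
# Coefficient matrix and right-hand side of the system
#   2x + 6y = 20
#   4x + 8y = 28
A <- matrix(c(2, 4, 6, 8), nrow = 2)   # filled column by column
rhs <- c(20, 28)

# solve() finds the vector (x, y) that satisfies both equations at once
solution <- solve(A, rhs)
print(solution)

# Check: multiplying back should reproduce the right-hand side
print(A %*% solution)
```

The result should be the intersection point of the two lines, the same point marked by points(1,3) in the plot above.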
Linearity
Finally, let’s think about linearity. As we know, a function is linear if it satisfies two properties: additivity and homogeneity. In equation form, these look like
Additivity: $f(x + y) = f(x) + f(y)$
Homogeneity: $f(ax) = a f(x)$
So what types of functions are linear? It’s natural to ask whether the equation for a line satisfies linearity. Let’s see what additivity looks like for $f(x) = mx + b$:
$f(x_1 + x_2) = m(x_1 + x_2) + b$
$f(x_1) + f(x_2) = m x_1 + m x_2 + 2b$
Surprise! So lines do not exhibit linearity unless $b = 0$.
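A quick numeric check of additivity for a line of the form y = mx + b can make this tangible. The values m = 2 and b = 3 below are arbitrary and chosen only for illustration.

```r
# A line with a nonzero intercept, and the same line through the origin
f <- function(x) 2 * x + 3   # m = 2, b = 3 (arbitrary illustrative values)
g <- function(x) 2 * x       # m = 2, b = 0

x1 <- 1
x2 <- 4

# Additivity fails for f: the intercept is counted twice on the right-hand side
print(f(x1 + x2))       # 13
print(f(x1) + f(x2))    # 5 + 11 = 16

# Additivity holds for g
print(g(x1 + x2))       # 10
print(g(x1) + g(x2))    # 2 + 8 = 10
```

The gap between the two results for f is exactly the intercept, which is why only lines through the origin pass the test.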
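The bigger punchline is of course that re-writing the system once more into matrix form yields another insight. Now our system can be described simply by $A\mathbf{v} = \mathbf{w}$, where $A$ holds the coefficients, $\mathbf{v} = (x, y)$ the unknowns, and $\mathbf{w} = (20, 28)$ the constants.
This brings us back to the textbook, where our matrix $A$ is a linear function of the vector it multiplies. Additivity holds because matrix multiplication distributes over vector addition: $A(\mathbf{v}_1 + \mathbf{v}_2) = A\mathbf{v}_1 + A\mathbf{v}_2$.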
Proving homogeneity is easier, so I leave that up to you.
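If you would like a sanity check before attempting the proof, the sketch below tests both properties numerically for the coefficient matrix of the two-variable system. A numeric check is not a proof, and the vectors u, v and the scalar a are arbitrary values picked for illustration.

```r
A <- matrix(c(2, 4, 6, 8), nrow = 2)   # coefficient matrix of the two-variable system
u <- c(1, 3)                           # arbitrary illustrative vectors
v <- c(-2, 5)
a <- 7                                 # arbitrary scalar

# Additivity: A(u + v) equals Au + Av
additive <- isTRUE(all.equal(A %*% (u + v), A %*% u + A %*% v))

# Homogeneity: A(a * u) equals a * (A u)
homogeneous <- isTRUE(all.equal(A %*% (a * u), a * (A %*% u)))

print(c(additive = additive, homogeneous = homogeneous))
```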
One last question before we close this chapter: why are matrices also called linear transformations? To answer this, let’s look at our original problem that we posed as Mark Z. Let’s create a more complete line based on our enrollment ~ salary relationship.
> A = matrix(c(2,4,6,8), nrow=2)
> old <- rbind(salary, enrollment)
> new <- A %*% old
> xlim <- range(c(old[1,],new[1,]))
> ylim <- range(c(old[2,],new[2,]))
> plot(old[1,], old[2,], type='l', col='brown', xlim=xlim, ylim=ylim)
> lines(new[1,], new[2,], col='orange')
What we see is that the original line (brown) has been transformed into a new line (orange). This transformation will behave the same for any collection of points. So this time we’re creating a whole new set of points from the old ones.
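One way to check, without a plot, that the transformed points still form a line is to look at the correlation between their coordinates. This is only a sketch under the assumption that neither line is horizontal or vertical, in which case perfectly collinear points have a correlation of +1 or -1. The names old_cor and new_cor are my own.

```r
# Same contrived data and matrix as in this post
enrollment <- c(rep(.7,3), rep(.6,2), rep(.7,5), rep(.8,4), rep(.9,2))
salary <- c(rep(94,3), rep(82,2), rep(94,5), rep(106,4), rep(118,2))

A <- matrix(c(2, 4, 6, 8), nrow = 2)
old <- rbind(salary, enrollment)   # one column per point
new <- A %*% old                   # transform every point at once

# Correlation between the two coordinates, before and after the transformation
old_cor <- cor(old[1, ], old[2, ])
new_cor <- cor(new[1, ], new[2, ])
print(c(old = old_cor, new = new_cor))
```

Both correlations should come out at 1 up to floating-point error, consistent with a line being mapped to another line.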
Conclusion
We’ve covered three different aspects of linearity. Notice that even though each section begins with a linear equation, by changing the representation and problem we accomplish a different task. This is probably the most important result and one that you won’t find in textbooks.
Footnotes
1 This example is for entertainment/illustrative purposes only but inspired by real events: http://www.math.columbia.edu/~woit/wordpress/?p=6498
2 See http://www.nsf.gov/statistics/seind12/c2/c2s2.htm