Site icon R-bloggers

The Trapezoidal Rule of Numerical Integration in R

[This article was first published on R – Aaron Schlegel, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Part of 8 in the series Numerical Analysis

The Trapezoidal Rule is another of Closed Newton-Cotes formulas for approximating the definite integral of a function. The trapezoidal rule is so named due to the area approximated under the integral \(\int^a_b f(x) \space dx\) representing a trapezoid. Although there exist much more accurate quadrature methods, the trapezoidal rule converges rather fast comparatively and is known to be extremely accurate when approximating the definite integral of periodic functions.

The Trapezoidal Rule

The trapezoidal rule is a 2-point Closed Newton-Cotes formula that is based somewhat on the midpoint rule, in which the interval \([a, b]\) is divided into \(n\) subintervals of equal width:

\( \large{h = \frac{b – a}{n}} \)

Summed over the \(n\) points, the approximation becomes:

\( \large{\int^b_a f(x) dx \approx \frac{h}{2}[f(x_0) + f(x_1)] \qquad h = x_1 – x_0} \)

Which is the definition of the Trapezoidal Rule for a uniformly-spaced grid of points. The error term is \(\frac{h^3}{12} f^{\prime\prime}(\epsilon)\), which indicates the error can be computed exactly if \(f \in C^2 [a, b]\) (the function is twice differentiable on the interval \([a, b]\)). The Trapezoidal Rule with the error term applied becomes:

\( \large{\int^b_a f(x) dx = \frac{h}{2}[f(x_0) + f(x_1)] – \frac{h^3}{12} f^{\prime\prime}(\epsilon)} \)
Trapezoidal Rule Example

We will use the Trapezoidal Rule to approximate the following definite integral:

\( \large{\int_0^{\frac{\pi}{4}} x \space sin \space x \space dx} \)

The following image depicts how the trapezoidal rule approximates the integral of the function in the interval. The darker area represents the actual area under the function. The trapezoid rule approximates the area under the function by constructing the trapezoid and calculating its area.

# Replicate the function in R
f <- function(x) {
  return(x * sin(x))
}

# Create a data frame to store all the points needed for the approximation trapezoid
df <- data.frame(cbind(c(0, pi/4, pi/4, 0), c(0, f(pi/4), 0, 0)))

# Plot the function and its approximation by the trapezoidal rule
ggplot(data = data.frame(x = 0), mapping = aes(x = x)) + 
  stat_function(fun = f, size = 1.05, alpha = 0.75, color='blue') + 
  geom_segment(aes(x = 0, y = 0, xend = pi/4, yend = f(pi/4))) + 
  geom_segment(aes(x = pi/4, y = 0, xend = pi/4, yend = f(pi/4))) + 
  geom_polygon(data = df, aes(x=X1, y=X2), fill = 'blue', alpha = 0.2) + 
  geom_area(stat = 'function', fun = f, fill = 'black', alpha = 0.3, xlim = c(0, pi/4)) + 
  xlim(c(-0.5,1))

Approximating this integral with the Trapezoidal Rule:

\( \large{\int_0^{\frac{\pi}{4}} x \space sin \space x \space dx \approx \frac{\frac{\pi}{4} – 0}{2}[(0 * sin(0)) + (\frac{\pi}{4} sin(\frac{\pi}{4}))] = 0.2180895} \)
(pi/4)/2 * ((0 * sin(0)) + (pi/4 * sin(pi/4)))
## [1] 0.2180895

Integrating the function over the interval yields \(\frac{\pi – 4}{4\sqrt{2}} = 0.15175\), which gives an error of 0.0663395.

For a second example, apply the Trapezoidal Rule to approximate the definite integral:

\( \large{\int^1_0 x^2 \space e^{-x} \space dx} \)

Visualizing how the trapezoidal rule approximates the integral:

f2 <- function(x) {
  return(x^2 * exp(-x))
}

df <- data.frame(cbind(c(0, 1, 1, 0), c(0, f2(1), 0, 0)))

ggplot(data = data.frame(x = 0), mapping = aes(x = x)) + 
  stat_function(fun = f2, size = 1.05, alpha = 0.75, color='blue') + 
  geom_segment(aes(x = 0, y = 0, xend = 1, yend = f2(1))) + 
  geom_segment(aes(x = 1, y = 0, xend = 1, yend = f2(1))) + 
  geom_polygon(data = df, aes(x=X1, y=X2), fill = 'blue', alpha = 0.2) + 
  geom_area(stat = 'function', fun = f2, fill = 'black', alpha = 0.3, xlim = c(0, 1)) + 
  xlim(c(0,1))

\( \large{\int^1_0 x^2 \space e^{-x} \space dx \approx \frac{1 – 0}{2}[(0 * (0)^2 e^0) + ((1)^2 e^{-1})] = 0.1839397} \)
0.5 * ((0 * 0^2 * exp(0)) + (1^2 * exp(-1)))
## [1] 0.1839397


2 - 5/exp(1)
## [1] 0.1606028

The function integrated over the interval \([0, 1]\) is \(2 – \frac{5}{e} = 0.1606028\). Therefore the error is 0.0233369.

Trapezoidal Rule in R

The Trapezoidal Rule is comparatively straightforward to implement in R.

trapezoid <- function(f, a, b) {
  if (is.function(f) == FALSE) {
    stop('f must be a function with one parameter (variable)')
  }
  
  h <- b - a
  
  fxdx <- (h / 2) * (f(a) + f(b))
  
  return(fxdx)
}

Test our implementation using the functions in the examples above.

trapezoid(f, 0, pi/4)
## [1] 0.2180895


trapezoid(f2, 0, 1)
## [1] 0.1839397

We see the function reports the same values as our manual calculations. Though the Trapezoidal Rule is straightforward to implement and is rather accurate in its approximations, it begins to fail (as do the other Newton-Cotes formulas) when applied over large integration intervals. A quick example using the function \(x^2 – e^{-1}\) integrated over the interval \([0, 10]\):

trapezoid(f2, 0, 10)
## [1] 0.02269996


# The function integrated over the interval
2 - (122/exp(10))
## [1] 1.994461

The error becomes much higher as the integration interval increases. Composite methods, which implement a piecewise approach to numerical integration, are therefore generally preferred and used in actual practice.

Composite Trapezoidal Rule

The Composite Trapezoidal Rule, similar to other composite methods, divides the integral interval into \(n\) subintervals. The Trapezoidal Rule is then performed on each of those \(n\) subintervals. As before, we let the function \(f\) be twice differentiable in the interval \([a, b]\), or more formally, \(f \in C^2 [a, b]\). We also let \(h = (b – a) / n\) and \(x_j = a + jh\) for each \(j = 0, 1, \cdots, n\). The Composite Trapezoidal Rule, with its error term, is defined as:

\( \large{\int^a_b f(x) \space dx = \frac{h}{2}\big[f(a) + 2 \sum^{n-1}_{j=1} f(x_j) + f(b) \big] – \frac{b – a}{12}h^2 f^{\prime\prime}(\mu)} \)

Where \(a \leq \mu \leq b\). As the Trapezoidal Rule only requires one interval in each iteration of the subintervals, \(n\) can be either even or odd.

Let’s say we are interested in approximating the following integral using the Composite Trapezoidal Rule with \(n = 8\) subintervals:

\( \large{\int_0^2 e^{2x} \space sin \space 3x \space dx} \)

Replicate the function in R.

f3 <- function(x) {
  return(exp(2 * x) * sin(3 * x))
}

Visualize the composite trapezoid rule and how it divides the interval \([a, b]\) into \(n = 8\) subintervals.

# Break the interval into n subintervals
seg <- seq.int(0, 2, length.out = 9)

# Initialize an empty vector to store function outputs
fx <- vector(length = length(seg))

# For each subinterval, calculate the function
for (i in 1:length(seg)) {
  fx[i] <- f3(seg[i])
}

# Collect the needed points into a data.frame
df <- data.frame(xend = seg, 
                 y = rep(0, 9), 
                 yend = fx, 
                 yend1 = c(fx[2:9], fx[9]), 
                 xend1 = c(seg[2:9], seg[9]))

# plot the function and its approximation
ggplot(data = df) + 
  stat_function(fun = f3, size = 1.05, alpha = 0.75, color='blue') + 
  geom_segment(aes(x=xend, y=y, xend=xend, yend=yend)) + 
  geom_segment(aes(x=xend, y=yend, xend=xend1, yend=yend1)) + 
  geom_ribbon(aes(x=xend, ymin=y, ymax=yend), fill = 'blue', alpha = 0.3) + 
  geom_area(stat = 'function', fun = f3, fill = 'black', alpha = 0.3, xlim = c(0, 2)) + 
  xlim(c(-0.5, 2))

The following function is an implementation of the composite trapezoidal rule.

composite.trapezoid <- function(f, a, b, n) {
  if (is.function(f) == FALSE) {
    stop('f must be a function with one parameter (variable)')
  }
  
  h <- (b - a) / n
  
  j <- 1:n - 1

  xj <- a + j * h

  approx <- (h / 2) * (f(a) + 2 * sum(f(xj)) + f(b))
  
  return(approx)
}

Use the function to perform the composite trapezoid procedure.

composite.trapezoid(f3, 0, 2, 8)
## [1] -13.57598

Integrating the function over the interval \([0, 2]\) gives the actual value:

\( \int_0^2 e^{2x} \space sin \space 3x \space dx = \frac{1}{13} (3 + e^4 (2 sin(6) – 3 cos(6))) \approx -14.214 \)

The error is therefore \(-14.214 – (-13.57598) = -0.63802\).

Though the error is relatively large in this case, the composite trapezoid rule was more accurate than the trapezoid rule.

trapezoid(f3, 0, 2)
## [1] -15.25557

The error is \(-15.25557 – (-13.57598) = -1.67959\). As the wideness of the interval increases, the less accurate the trapezoid rule becomes. Since this example considered the interval \([0, 2]\), the difference in accuracy between the composite and standard trapezoidal rules is not as drastic as it would be with a wider interval.

References

Burden, R. L., & Faires, J. D. (2011). Numerical analysis (9th ed.). Boston, MA: Brooks/Cole, Cengage Learning.

Weisstein, Eric W. “Trapezoidal Rule.” From MathWorld–A Wolfram Web Resource. http://mathworld.wolfram.com/TrapezoidalRule.html

The post The Trapezoidal Rule of Numerical Integration in R appeared first on Aaron Schlegel.

To leave a comment for the author, please follow the link and comment on their blog: R – Aaron Schlegel.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.