Advanced Raster Data: Exercises
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Geospatial data is increasingly used to solve numerous ‘real-life’ problems (check out some examples here). In turn, R is becoming a powerful open-source solution for handling this type of data, currently providing an exceptional range of functions and tools for GIS and Remote Sensing data analysis.
In particular, raster data provides support for representing spatial phenomena by dividing the surface into a grid (or matrix) composed of cells of regular size. Each raster data-set has a certain number of columns and rows, and each cell contains a value with information for the variable of interest. Stored data can be either: (i) thematic, representing a discrete variable (e.g., a land cover classification map), or (ii) continuous (e.g., elevation).
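As a rough illustration of the grid idea, the sketch below uses a plain base-R matrix as a stand-in for a single raster layer (it does not use the raster package). The elevation values, the class breaks and the names `elev` and `elev_class` are made up for the example.

```r
# A 4 x 5 grid of cells holding a continuous variable
# (synthetic elevation values in metres, an assumption for this sketch)
set.seed(1)
elev <- matrix(runif(20, min = 0, max = 1500), nrow = 4, ncol = 5)

# Grid dimensions (rows, columns) and total number of cells
dim(elev)
length(elev)

# Derive a thematic (discrete) layer by binning the continuous values
# into three hypothetical elevation classes coded 1, 2, 3
breaks <- c(0, 500, 1000, 1500)
elev_class <- matrix(
  cut(elev, breaks = breaks, labels = FALSE, include.lowest = TRUE),
  nrow = nrow(elev), ncol = ncol(elev)
)

# Number of cells falling in each class
table(elev_class)
```

The continuous layer stores a measurement in every cell, while the thematic layer stores only a class code for the same cells.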
The raster package currently provides an extensive set of functions to create, read, export, manipulate and process raster data-sets. It also provides low-level functionalities for creating more advanced processing chains, as well as the ability to manage large data-sets. For more information, see: vignette("functions", package = "raster"). You can also check more about raster data on the tutorial series about this topic here.
In this exercise set, we will explore the following topics in raster data processing and geostatistical analysis (previously discussed in this tutorial series):
- Unsupervised classification/clustering of satellite data
- Regression-kriging (RK)
We will also address how to use the package RStoolbox (link) to calculate the:
- Tasseled Cap Transformation (TCT)
- PCA rotation/transformation
Both data compression techniques examined here will use spectral data from satellite imagery.
Answers to these exercises are available here.
Exercise 1
Use the data in this link (Landsat-8 surface reflectance data bands 1-7, for Peneda-Geres National Park – PGNP, NW Portugal) to answer the next exercises (1 to 6). Download the data, uncompress and create a raster brick. How many pixels and layers does the data have?
Exercise 2
Make an RGB plot with bands 5, 1, and 3 with linear stretching.
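For readers unfamiliar with linear stretching, here is a minimal base-R sketch of what it does to one band. It assumes a clip at the 2nd and 98th percentiles, which is a common convention; the cut-offs used by a particular plotting function may differ. The helper name `lin_stretch` and the reflectance values are hypothetical.

```r
# Linear contrast stretch of one band to the 0-255 display range.
# Assumption: values are clipped at the 2nd and 98th percentiles first.
lin_stretch <- function(x, probs = c(0.02, 0.98)) {
  q <- unname(quantile(x, probs = probs, na.rm = TRUE))
  x <- pmin(pmax(x, q[1]), q[2])       # clip to the percentile range
  (x - q[1]) / (q[2] - q[1]) * 255     # rescale linearly to 0-255
}

set.seed(1)
band <- runif(100, min = 0.01, max = 0.4)  # synthetic reflectance values
stretched <- lin_stretch(band)
range(stretched)
```

For an RGB composite, the same stretch would be applied separately to each of the three bands assigned to the red, green and blue channels.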
Exercise 3
Using the k-means algorithm, perform an unsupervised classification/clustering of the data with 5 clusters.
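The general workflow for clustering an image can be sketched in base R on synthetic data: flatten the image to a pixel-by-band matrix, cluster the rows, then put the labels back on the grid. This is only an illustration; the grid size, the random values and the names `pixels` and `cluster_map` are assumptions, and real imagery usually needs missing pixels removed first because kmeans() does not accept NA values.

```r
set.seed(12345)
n_row <- 20; n_col <- 30; n_band <- 7

# Synthetic "image": one row per pixel, one column per band
pixels <- matrix(runif(n_row * n_col * n_band), ncol = n_band)
colnames(pixels) <- paste0("b", 1:n_band)

# k-means with 5 clusters; several random starts reduce the risk of
# ending in a poor local optimum
km <- kmeans(pixels, centers = 5, nstart = 5, iter.max = 50)

# Put the cluster labels back on the grid. Filling row by row is an
# assumption here; use whatever cell order the pixel matrix was built in.
cluster_map <- matrix(km$cluster, nrow = n_row, ncol = n_col, byrow = TRUE)

# Number of pixels assigned to each cluster
table(km$cluster)
```

Because k-means starts from random centres, fixing the seed is what makes the cluster labels reproducible between runs.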
Exercise 4
Use the CLARA algorithm (package cluster) to perform an unsupervised classification/clustering of the data with 5 clusters and Euclidean distance.
Exercise 5
Using package RStoolbox, calculate the Tasseled Cap Transformation of the data (remember it is Landsat-8 data with bands 1-7).
Exercise 6
Using package RStoolbox, calculate the standardized PCA transform. What is the cumulative % of explained variance in the first three components?
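To make "standardized PCA" and "cumulative % of explained variance" concrete, the sketch below uses base R's prcomp() on a synthetic pixel-by-band matrix rather than RStoolbox on the real image. The data and the names `pixels`, `var_explained` and `cum_explained` are assumptions for illustration only.

```r
set.seed(12345)

# Synthetic pixel-by-band matrix with correlated bands
# (a stand-in for seven reflectance bands)
base <- rnorm(500)
pixels <- sapply(1:7, function(i) base * i + rnorm(500, sd = 2))
colnames(pixels) <- paste0("b", 1:7)

# Standardized PCA: centre each band and scale it to unit variance,
# so bands with larger numeric ranges do not dominate the components
pca <- prcomp(pixels, center = TRUE, scale. = TRUE)

# Proportion of variance explained by each component, and its running total
var_explained <- pca$sdev^2 / sum(pca$sdev^2)
cum_explained <- cumsum(var_explained)

# Cumulative % of explained variance for the first three components
round(100 * cum_explained[1:3], 1)
```

The same ratio of squared component standard deviations applies whichever tool computes the rotation.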
Exercise 7
- Use the data in this link to answer the next exercises (annual average temperature for weather stations in Portugal; column AvgTemp). Using the Lat and Lon columns from the clim_data_pt.csv table, create a SpatialPointsDataFrame object with CRS WGS 1984.
- Using Ordinary Kriging from package gstat, interpolate temperature values employing a Spherical variogram model. Calculate the RMSE from 5-fold cross-validation (see function krige.cv) and use set.seed(12345).
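The mechanics of a 5-fold cross-validated RMSE can be sketched by hand in base R. Note the assumptions: the stations are synthetic, and the interpolator is simple inverse distance weighting used as a stand-in, not kriging. The helper `idw_predict` is hypothetical, and the function the exercise points to (krige.cv) handles the fold splitting itself.

```r
set.seed(12345)

# Synthetic stations: coordinates and a temperature-like response
n <- 50
stations <- data.frame(Lon = runif(n, -9, -6), Lat = runif(n, 37, 42))
stations$AvgTemp <- 25 - 0.8 * (stations$Lat - 37) + rnorm(n, sd = 0.5)

# Stand-in interpolator: inverse distance weighting (NOT kriging)
idw_predict <- function(train, test, power = 2) {
  sapply(seq_len(nrow(test)), function(i) {
    d <- sqrt((train$Lon - test$Lon[i])^2 + (train$Lat - test$Lat[i])^2)
    w <- 1 / pmax(d, 1e-9)^power   # guard against zero distances
    sum(w * train$AvgTemp) / sum(w)
  })
}

# Randomly assign each station to one of 5 folds of equal size
k <- 5
fold <- sample(rep(1:k, length.out = n))

# Predict each fold from the other four and store the errors
errors <- numeric(n)
for (f in 1:k) {
  test_idx <- fold == f
  pred <- idw_predict(stations[!test_idx, ], stations[test_idx, ])
  errors[test_idx] <- stations$AvgTemp[test_idx] - pred
}

# Root mean squared error over all held-out predictions
rmse <- sqrt(mean(errors^2))
rmse
```

Setting the seed before the fold assignment is what makes the RMSE comparable between two models, since both are then evaluated on the same splits.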
Exercise 8
Following the rationale of the previous exercise, now experiment with an Exponential model. Also calculate the RMSE from 5-fold CV. Which model was the best according to RMSE?
Exercise 9
Using the cubist regression algorithm (package Cubist), predict AvgTemp based on latitude (column Lat), elevation (column Elev) and distance to the coastline (column distCoast). Calculate the RMSE for a random test set of 15 observations. Use set.seed(12345).
Exercise 10
From the previous exercise, extract the training residuals and interpolate them. Following a Regression-kriging approach, add the interpolated residuals to the regression results. Calculate the RMSE for the test set (defined in Exercise 9) and check whether this improves the modeling performance any further.
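The structure of this approach (a regression trend plus spatially interpolated residuals) can be sketched in base R. Two stand-ins are used and should not be confused with the methods the exercise asks for: lm() replaces Cubist, and inverse distance weighting replaces kriging. The data are synthetic and the helper names `idw` and `rmse` are hypothetical.

```r
set.seed(12345)

# Synthetic stations with a spatial term the regression does not model
n <- 60
d <- data.frame(Lon = runif(n, -9, -6), Lat = runif(n, 37, 42),
                Elev = runif(n, 0, 1500))
d$AvgTemp <- 25 - 0.8 * (d$Lat - 37) - 0.006 * d$Elev +
  sin(2 * d$Lon) + rnorm(n, sd = 0.2)

# Random hold-out of 15 observations
test_idx <- sample(n, 15)
train <- d[-test_idx, ]
test  <- d[test_idx, ]

# Step 1: trend model (lm as a stand-in for Cubist)
fit <- lm(AvgTemp ~ Lat + Elev, data = train)
trend_test <- predict(fit, newdata = test)

# Step 2: interpolate the training residuals to the test locations
# (inverse distance weighting as a stand-in for kriging)
idw <- function(x, y, z, x0, y0, power = 2) {
  sapply(seq_along(x0), function(i) {
    dist <- sqrt((x - x0[i])^2 + (y - y0[i])^2)
    w <- 1 / pmax(dist, 1e-9)^power
    sum(w * z) / sum(w)
  })
}
res_test <- idw(train$Lon, train$Lat, residuals(fit), test$Lon, test$Lat)

# Step 3: final prediction = trend + interpolated residual
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))
rmse_trend <- rmse(test$AvgTemp, trend_test)
rmse_rk    <- rmse(test$AvgTemp, trend_test + res_test)
c(trend_only = rmse_trend, trend_plus_residuals = rmse_rk)
```

Comparing the two RMSE values on the same test set shows whether the residuals carried spatial structure that the regression alone missed.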