[This article was first published on NIR-Quimiometría, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
Once we have chosen the model, we can continue acquiring spectra of new samples. The spectra are exported to a txt or csv file and imported into R to be processed.
We use the function “predict” from the pls package. I have done this with 20 new samples. First we need to apply to them the appropriate math treatment (the same one used in the model). I call this sample set for prediction “fatty2ac_val”, after applying the “msc” math treatment.
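Applying "the same" MSC means correcting the new spectra against the reference spectrum of the calibration set, not against their own mean. A minimal sketch of that step, assuming the gasoline data shipped with the pls package as a stand-in for the fatty acid spectra (the row split and the object names are hypothetical):

```r
library(pls)

# Stand-in data: gasoline NIR spectra from the pls package,
# not the fatty acid data set used in this post
data(gasoline)
cal_rows <- 1:50    # hypothetical calibration samples
val_rows <- 51:60   # hypothetical new samples

# MSC on the calibration spectra; their mean spectrum is kept as the reference
nir_cal_msc <- msc(gasoline$NIR[cal_rows, ])

# MSC on the new spectra, using the SAME reference as the calibration set
nir_val_msc <- predict(nir_cal_msc, newdata = gasoline$NIR[val_rows, ])
```

If the model is fitted with a formula such as y ~ msc(X), predict on the model should take care of this step for new data by itself.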
So, let's see the predictions:
> predict(C16_0,ncomp=12,newdata=fatty2ac_val)
, , 12 comps
C16_0
220 22.01807
221 20.44803
222 19.79991
223 21.64232
224 20.29058
225 20.20099
226 21.52053
227 19.83305
228 18.95492
229 21.39239
230 21.11044
231 20.67454
232 19.28662
233 20.97292
234 21.70614
235 20.27464
236 19.70897
237 21.30686
238 20.21069
239 19.21576
I have used the model with 12 terms.
If this data set contains the lab values as well as the spectra, we can also validate the model and check the number of terms to use.
> RMSEP(C16_0,newdata=fatty2ac_val)
(Intercept)      1 comps      2 comps      3 comps      4 comps      5 comps
     1.6280       1.5987       1.2554       1.1071       1.3447       0.9122
    6 comps      7 comps      8 comps      9 comps     10 comps     11 comps
     0.8754       0.8413       0.6597       0.6154       0.5669       0.5791
   12 comps     13 comps     14 comps     15 comps     16 comps
     0.5935       0.6261       0.5994       0.6315       0.5787
We can see our RMSEP for the new samples (0.5935 with 12 components) and compare it with the RMSEP obtained in the PLSR “LOO validation statistics”, which was 0.5733. We can also see that we would get an even lower validation error with fewer components (0.5669 with 10).
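To pick the number of terms from the validation error rather than by eye, the RMSEP values can be pulled out of the returned object and searched for the minimum. A minimal sketch, again assuming the gasoline data from the pls package as a stand-in (fit, cal and val are hypothetical names, not the C16_0 model):

```r
library(pls)

# Stand-in data and model, not the C16_0 calibration from this post
data(gasoline)
cal <- gasoline[1:50, ]    # hypothetical calibration set
val <- gasoline[51:60, ]   # hypothetical validation set with lab values
fit <- plsr(octane ~ NIR, ncomp = 10, data = cal)

# RMSEP on the validation samples for 0 (intercept only) to 10 components
rmsep_val <- RMSEP(fit, estimate = "test", newdata = val)
err <- drop(rmsep_val$val)          # one error value per model size

# Number of components with the lowest validation error
best_ncomp <- rmsep_val$comps[which.min(err)]
best_ncomp
```

With only a few validation samples the minimum can move by chance, so it is worth looking at the whole curve before changing the number of terms.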
It seems that the model is working almost as expected, but let's have a look at the plots:
> predplot(C16_0,ncomp=7:15,newdata=fatty2ac_val,
+ asp=1,line=TRUE)
I can observe a bias of almost 0.50.
The error corrected for the bias would be 0.34.
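The relation between the total error, the bias and the bias-corrected error can be sketched in a few lines of base R. The lab and predicted values below are hypothetical, made up only to show the arithmetic, and the bias-corrected error uses n (not n - 1) in the denominator so that the decomposition holds exactly:

```r
# Hypothetical lab values and NIR predictions for five samples
lab  <- c(20.1, 21.4, 19.8, 20.9, 21.0)
pred <- c(20.7, 21.8, 20.4, 21.3, 21.6)

resid <- pred - lab

bias  <- mean(resid)                    # systematic offset of the predictions
rmsep <- sqrt(mean(resid^2))            # total prediction error
sep   <- sqrt(mean((resid - bias)^2))   # error left after removing the bias

# The total error splits into the two parts: rmsep^2 = bias^2 + sep^2
c(bias = bias, rmsep = rmsep, sep = sep)
```

A large bias with a small bias-corrected error, as in this post, suggests the model ranks the samples well but is shifted for this set.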
Bias can be due to different reasons (temperature, sample presentation, particle size, optical path, state of the instrument, …).
These samples are fat ground into a paste, put in a petri dish and measured in a transmittance instrument.
The next step would be to add this data to the database and recalibrate.
Divide the data into a calibration set and a validation set. Be sure that the validation set includes some samples from this last set.
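One way to make such a split in base R is to sample the old and the new samples separately, so the validation set is guaranteed to contain some of the new ones. A minimal sketch, assuming 219 earlier samples (the new ones are numbered 220 to 239 in the output above) and hypothetical validation sizes of 40 and 5:

```r
set.seed(1)  # only to make the example reproducible

old_ids <- 1:219     # assumed earlier samples in the database
new_ids <- 220:239   # the 20 new samples

# Draw validation samples from each group separately
val_ids <- c(sample(old_ids, 40), sample(new_ids, 5))

# Everything else goes to the calibration set
cal_ids <- setdiff(c(old_ids, new_ids), val_ids)

length(cal_ids)  # calibration samples
length(val_ids)  # validation samples
```

The remaining new samples end up in the calibration set, which is what lets the recalibrated model learn the source of the bias.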
The idea behind this series of posts about this data set was to work with data I use almost daily in my job and to see how to proceed in R. Once you get used to it, the results are very good for understanding chemometrics for multivariate data. I will continue exporting some of my data sets to work with in R. The analysis of fatty acids by NIR/NIT can give good results for some of them (C16:0, C18:0, C18:1, ...).
To leave a comment for the author, please follow the link and comment on their blog: NIR-Quimiometría.