[This article was first published on More or Less Numbers, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
For those who don’t work in education or aren’t aware, a child’s reading level can be measured with a Lexile® Level. This level can be obtained through various reading assessments, and it can be used to match a child to books at the same level. To support this, the company maintains a database of books to which it has assigned levels. There are other assessments that measure how well a child is reading, but I’m not aware of any system that assigns levels to books as comprehensively. The library at the school I work at lacked Lexile Levels for its books. This is unfortunate because teachers are not able to provide students with books matched to their respective Lexile Levels. Fortunately, though, the library had a list of ISBNs for the books.
On the Lexile website, books can be searched by ISBN to retrieve their Lexile® Level, provided the book is in their database. Entering every ISBN by hand is a task fit for something not human.
rvest to the rescue.
Below is the script to retrieve the Lexile Levels of books if a list of ISBNs is available. It was an incredible time saver, and hopefully someone else out there can use it.
Link to code
library(rvest)
library(httr)
library(htmltools)
library(dplyr)
##Prep for things used later
url<-"https://www.lexile.com/fab/results/?keyword="
url2<-"https://lexile.com/book/details/"
##CSV file with ISBN numbers
dat1<-read.csv("~isbns.csv",header=FALSE)
##dat1<-data.frame(dat1[203:634,])
dat<-as.character(dat1[,1])%>%trimws()
##dat<-dat[41:51]
blank<-as.character("NA")
blank1<-as.character("NA")
##blank2<-as.character("NA")
##blank3<-as.character("NA")
##Empty data frame with the right column names to collect results
all<-data.frame("A","B","C",stringsAsFactors = FALSE)
colnames(all)<-c("name","lexiledat","num")
all<-data.frame(all[-1,])
for(i in dat) {
  ##Search URL for this ISBN
  sites<-paste(url,i,sep="")
  x <- GET(sites, add_headers('user-agent' = 'r'))
  ##Read the page at the resolved URL and parse it
  webpath<-x$url%>%includeHTML%>%read_html()
  ##Book Name
  name<-webpath%>%html_nodes(xpath="///div[2]/div/div[2]/h4/a")%>%html_text()%>%trimws()
  ##Lexile Range
  lexile<-webpath%>%html_nodes(xpath="///div[2]/div/div[3]/div[1]")%>%html_text()%>%trimws()%>%as.character()
  ##CSS change sometimes
  lexiledat<-ifelse(is.na(lexile[2])==TRUE,lexile,lexile[2])
  ##Breaks every now and then when adding Author/Pages
  ##Author Name
  ##author<-webpath%>%html_nodes(xpath='///div[2]/div/div[2]/span')%>%html_text()%>%as.character()%>%trimws()
  ##author<-sub("by: ","",author)
  ##Pages
  ##pages<-webpath%>%html_nodes(xpath='///div[2]/div/div[2]/div/div[1]')%>%html_text()%>%as.character()%>%trimws()
  ##pages<-sub("Pages: ","",pages)
  ##Some books not found, this excludes them and replaces with NA values
  ##(lexiledat is a single value, so the test is safe inside if())
  df<-if(is.na(lexiledat)) data.frame(blank,blank1,stringsAsFactors = FALSE) else data.frame(name,lexiledat,stringsAsFactors = FALSE)
  colnames(df)<-c("name","lexiledat")
  df$num <- i
  all<-bind_rows(all,df)
}
##all1 is not defined in this script; it presumably holds results from an earlier batch
master<-if(exists("all1")) rbind(all1,all) else all
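The "not found" handling in the script above can also be sketched with selectors that return NA on their own when nothing matches. This is only a rough illustration, not the script's actual method: the helper names `clean_isbns` and `parse_result`, the CSS classes `.title` and `.lexile`, the inline HTML, and the placeholder ISBNs are all made up for the example, and the real Lexile page would need its own selectors. It assumes rvest 1.0 or later for `html_element()`.

```r
library(rvest)

# Hypothetical helper: keep only digits (and a trailing X), then drop
# blanks and duplicates so each book is looked up once.
clean_isbns <- function(x) {
  x <- gsub("[^0-9Xx]", "", as.character(x))
  unique(x[nzchar(x)])
}

# Hypothetical helper: extract title and Lexile text from a parsed page.
# html_element() yields a missing node when nothing matches, and
# html_text() turns that into NA, so no separate "not found" branch is needed.
# The ".title" and ".lexile" classes are assumptions for this sketch.
parse_result <- function(page, isbn) {
  name   <- page %>% html_element(".title")  %>% html_text(trim = TRUE)
  lexile <- page %>% html_element(".lexile") %>% html_text(trim = TRUE)
  data.frame(name = name, lexiledat = lexile, num = isbn,
             stringsAsFactors = FALSE)
}

# Made-up pages standing in for a hit and a miss.
found   <- read_html('<div><h4 class="title"> Some Book </h4><div class="lexile">700L</div></div>')
missing <- read_html("<div><p>No results</p></div>")

# Placeholder ISBNs, not real books.
isbns <- clean_isbns(c(" 978-0-00-000000-2", "9780000000002", ""))

results <- rbind(
  parse_result(found, isbns[1]),
  parse_result(missing, "0000000000")
)
print(results)
```

Keeping the parsing in its own function means it can be tried on a saved page before pointing the loop at the live site, and every ISBN still gets exactly one row in the output.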
To leave a comment for the author, please follow the link and comment on their blog: More or Less Numbers.