Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
I recently taught a very basic introduction to SQL workshop and needed a way to have participants interact with SQL statements. Obviously there are lots of tools to interface with a database, but since we are all R users I thought it would be nice to be able interact without leaving R. Although this interface is fairly basic, the fact that we can type in a SQL statement and get the results as an R data frame provides all the advantages of having data in R. Moreover, I found this to be an interesting exercise in see the power of R as programming language, not just as statistical software. The function described here is part of the sqlutils
package which was created to manage a library of SQL files. More information about that is provided on the project page and I will likely have a forthcoming blog post too.
First we need to create a database to interact with. In this example we will use the students
data frame from the retention
package. We will save this data frame into a SQLite database using the RSQLite package. The R code to setup the database is provided as a demo in the package. Type demo('isql')
to start.
require(sqlutils) require(RSQLite) require(retention) data(students) students$CreatedDate = as.character(students$CreatedDate) m <- dbDriver("SQLite") tmpfile <- tempfile('students.db', fileext='.db') conn <- dbConnect(m, dbname=tmpfile) dbWriteTable(conn, "students", students[!is.na(students$CreatedDate),])
We begin an interactive SQL environment with the isql
function. The only required parameter is conn
which is the connection to the database that SQL statements will be executed. The sql
parameter is optional and sets the initial SQL statement for the session that can be edited or executed.
> hist <- isql(conn=conn, sql=getSQL('StudentSummary')) Interactive SQL mode (type quit to exit, help for available commands)... SQL> help Command Description ___________ ______________________________________________________ quit quit interactive mode help display this message sql enter SQL statement edit edit SQL in a separate text window print print the last entered SQL statement exec execute that last entered SQL statement result prints the last results save [name] save the last executed query to the global environment SLQ> print SELECT CreatedDate, count(StudentId) AS count FROM students GROUP BY CreatedDate ORDER BY CreatedDate SLQ> edit
SLQ> print SELECT CreatedDate, count(StudentId) AS count FROM students GROUP BY CreatedDate ORDER BY CreatedDate SLQ> exec Executing SQL... 118 rows of 2 variables returned SLQ> save Data frame sql.results saved to global environment SLQ> quit
The isql
function returns the history of the session invisibly (that is the results will not be printed but can be assigned to a variable). There are two elements in the returned list, commands
is a character vector listing all the commands entered and sql
is a character vector containing all the SQL statements entered.
> names(hist) [1] "sql" "commands"
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.