Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.
The number of times I have to connect to SQL and I forget part of the RODBC command to connect to an internal data table.
As part of a project I am working on I have been connecting to lots of different sources and became tired of typing lots of lines and repeating the same command. Therefore, I have been rescued by R’s lovely functions.
The function
The function is very simple, but extremely useful. This is contained in the code block below and I will talk you through this:
sql_connection <- function(conn_str, sql_str){ library(RODBC) message("Loading SQL....") conn <- odbcDriverConnect(connection=conn_str) sql_command <- sql_str df <- data.frame(sqlQuery(conn,sql_command)) return(df) odbcCloseAll() message("SQL loaded") }
The function is simple - it looks for a connection string (conn_str) and a SQL command (sql_str) and the following process is initiated:
- The RODBC library is loaded, please make sure this package is installed beforehand
- A message is sent to the Console to say that the SQL has started to load - you could add more functionality to this message, but this will suffice
- A variable is created called conn to initiate the connection and pass the connection string to the connection parameter
- Then a variable to store the sql command is created and links to the user defined function parameter of sql_str
- Next, a data frame is created based on the connection and SQL command
- Following this the data frame is returned from the function
- The last piece in the puzzle is to close the ODBC connection and show a message to the user to say that the SQL has been loaded
Using the function
To use the function we need to call it in our R code and pass two important pieces of information. The first is the connection string, this is the most important part of the function, as an error here will cause the connection not to be established. I provide a brilliant website I used to use when I was a DBA as reference: https://www.connectionstrings.com/sql-server/ .
The second part is the SQL command itself. This is shown in the R block below:
sql_df <- sql_connection(conn_str="Driver={SQL Server Native Client 11.0};Server=yourserver;Database=yourDB;Trusted_Connection=yes;", sql_str="SELECT your fields FROM your table WHERE some condition")
This will then create a data frame with the results of your SQL query in. Then you can use DPLYR and the Tidyverse to do the relevant data manipulation and cleaning.
I hope you find this function useful and use it at will. It has certainly helped me not have to search the functionality of this package every time.
R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.