Site icon R-bloggers

Real-time data visualization using R and data extracting from SQL Server

[This article was first published on R – TomazTsql, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

In the previous post, I have showed how to visualize near real-time data using Python and Dash module.  And it is time to see one of the many ways, how to do it in R. This time, I will not use any additional frames for visualization, like shiny, plotly or any others others, but will simply use base R functions and RODBC package to extract data from SQL Server.

Extracting data from SQL Server will and simulating inserts in SQL Server table will primarily simulate the near real-time data. If you have followed the previous post, you will notice that I am using same T-SQL table and query to extract real-time data.

First, we will create a sample table in SQL Server and populate it with some sample data:

CREATE DATABASE Test;
GO

USE Test;
GO

CREATE TABLE dbo.LiveStatsFromSQLServer
(ID INT IDENTITY(1,1)
,Num tinyint NOT NULL)

And populate it with some sample data:

-- Insert some test data
INSERT INTO dbo.LiveStatsFromSQLServer(num)
SELECT ABS(CHECKSUM(NewId())) % 14
GO 10

Now, that we have SQL foundations set up, let’s focus on R code.

First we set the environment variable and the RODBC library:

library(RODBC)
# create env for storing the variables/data frames between the functions
assign("getREnvironment", new.env(), envir = .GlobalEnv)

We will generate a function for extracting data from SQL Server and storing it in environment data.frame variable:

# Function to read data from SQL Server
getSQLServerData <- function()
{
#extract environment settings for storing data
getREnvironment <- get("getREnvironment", envir = .GlobalEnv, mode = "environment")
#get the SQL Server data
con <- odbcDriverConnect('driver={SQL Server};
                         server=TOMAZK\\MSSQLSERVER2017;
                         database=test;trusted_connection=true')
db_df <- sqlQuery(con, 'SELECT 
                         TOP 20 id
                           ,num 
                        FROM LiveStatsFromSQLServer ORDER BY id DESC')
close(con)
#overwrite existing data with new data
df_overwrite <- db_df
getREnvironment$db_df <- data.frame(df_overwrite)
try(assign("getREnvironment", getREnvironment, envir = .GlobalEnv))
invisible() #do not print the results
}

 

Once we have this function registered, we can now create a small for loop that will update the plot with newly fetched data from SQL Server:

# Plot graph 
n=1000 #nof iterations
windowQuery=20 # syncronised with TOP clause in SELECT statement
for (i in 1:(n-windowQuery)) {
  flush.console()
  getSQLServerData()
  getREnvironment <- get("getREnvironment", envir = .GlobalEnv, mode = "environment")
  data <- getREnvironment$db_df
  plot( data$id, data$num, type='l',main='Realtime data from SQL Server')
  Sys.sleep(0.5)
}

 

Once we run the complete R code, we need to trigger and run also the new inserts in SQL Server Management studio:

-- Do some inserts to mimic the data stream
INSERT INTO dbo.LiveStatsFromSQLServer(num)
SELECT ABS(CHECKSUM(NewId())) % 14
WAITFOR DELAY '00:00:00.500'
GO 100

Once we do this, we can observe the realtime data from SQL Server being plotted in R environment (R Studio).

As always, complete code is available at Github.

Happy R-coding!

To leave a comment for the author, please follow the link and comment on their blog: R – TomazTsql.

R-bloggers.com offers daily e-mail updates about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.
Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.