
Item Based Collaborative Filtering Recommender Systems in R

[This article was first published on Data Perspective, and kindly contributed to R-bloggers.]
In this series on implementing recommendation engines, my previous post on recommendation systems in R explained how to implement user based collaborative filtering. In this post, I will explain a basic implementation of item based collaborative filtering recommender systems in R.
Intuition:

Item based Collaborative Filtering:
Unlike the user based collaborative filtering discussed previously, item based collaborative filtering considers the set of items rated by the user and computes their similarities with the targeted item. Once similar items are found, the rating for the new item is predicted by taking a weighted average of the user's ratings on these similar items.
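The weighted average described above can be written compactly. The notation here is introduced only for illustration and is not from the original post: r_{u,j} is the rating user u gave item j, s(i,j) is the similarity between items i and j, and R_u is the set of items user u has rated.

```latex
\hat{r}_{u,i} = \frac{\sum_{j \in R_u} s(i,j)\, r_{u,j}}{\sum_{j \in R_u} s(i,j)}
```

With cosine similarity on non-negative ratings, every s(i,j) is non-negative, so the prediction stays within the range of the user's own ratings.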
Let's understand this with an example. Consider a dataset containing users' ratings of movies (the rating matrix loaded in the code later in this post). Let us build an algorithm to recommend movies to CHAN.
Implementing item based recommender systems, like user based collaborative filtering, requires two steps:
  • Calculating item similarities
  • Predicting the targeted item rating for the targeted user

Step1: Calculating Item Similarity:
This is a critical step: we calculate the similarity between co-rated items. We use cosine similarity or Pearson similarity to compute the similarity between items. The output of this step is a similarity matrix between items.

Code snippet:
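# step 1: item-similarity calculation
# co-rated items are considered and similarity between two items
# is calculated using cosine similarity
library(lsa)
ratings = read.csv("Rating Matrix.csv")
# columns 2 to 7 hold the movie ratings (column 1 is the user name)
x = ratings[,2:7]
# missing ratings are treated as 0
x[is.na(x)] = 0

# cosine() on a matrix compares its columns, i.e. the movies
item_sim = cosine(as.matrix(x))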
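
To make the similarity step more concrete, here is a minimal base R sketch of the same column-wise cosine calculation, without the lsa package. The toy matrix and the name item_sim_manual are made up for illustration; they are not the data or code from this post.

```r
# Toy ratings: rows are users, columns are items
# (made-up values, with missing ratings already replaced by 0)
x <- matrix(c(5, 3, 0,
              4, 0, 4,
              1, 1, 5),
            nrow = 3, byrow = TRUE,
            dimnames = list(c("u1", "u2", "u3"), c("A", "B", "C")))

# Cosine similarity between every pair of columns:
# dot products divided by the product of the column norms
dot <- crossprod(x)            # same as t(x) %*% x
norms <- sqrt(diag(dot))       # Euclidean norm of each column
item_sim_manual <- dot / outer(norms, norms)

print(round(item_sim_manual, 3))
```

Each diagonal entry is 1 (an item is identical to itself) and the matrix is symmetric, which is a quick sanity check for any item similarity matrix.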
Step2: Predicting the targeted item rating for the targeted user CHAN.
In this step, the most important one, we predict ratings for the items the user has not rated, using the ratings he has given to other items and the similarity values calculated in the previous step. First we select the item to be predicted, in our case INCEPTION. We predict its rating as the weighted sum of the ratings CHAN gave to movies similar to INCEPTION: for each movie CHAN has rated, we take its similarity score with respect to INCEPTION, multiply it by CHAN's rating for that movie, and sum these products over all rated movies. This sum is then divided by the total of the similarity scores of the rated movies with respect to INCEPTION.
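As a small worked example of this weighted sum, the sketch below uses made-up similarity scores and ratings for three hypothetical rated items A, B and C. None of these numbers come from the rating matrix used in this post.

```r
# Hypothetical similarities between the target item and three rated items
sim_to_target <- c(A = 0.9, B = 0.5, C = 0.1)
# The user's ratings for those same items (made-up values)
user_ratings <- c(A = 4, B = 2, C = 5)

# Weighted sum of the ratings, divided by the sum of the similarity scores
predicted <- sum(sim_to_target * user_ratings) / sum(sim_to_target)
print(predicted)  # (3.6 + 1.0 + 0.5) / 1.5 = 3.4
```

The prediction sits closest to the rating of item A, because A is the most similar to the target item.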
Recommending Top N items:
Once ratings for all the non-rated movies are predicted, we recommend the top N movies to CHAN.
Code for item based collaborative filtering in R:
# data input
ratings = read.csv("~Rating Matrix.csv")

# step 1: item-similarity calculation
# co-rated items are considered and similarity between two items
# is calculated using cosine similarity
library(lsa)
x = ratings[,2:7]
x[is.na(x)] = 0
item_sim = cosine(as.matrix(x))

# Recommending items for CHAN: since three movies are not rated,
# as a first step we have to predict a rating value for each movie.
# In CHAN's case we have to first predict values for Titanic, Inception, Matrix.
rec_itm_for_user = function(userno)
{
  # split the movies into those the user has rated and those not yet rated
  userRatings = ratings[userno, 2:ncol(ratings)]
  not_rated = is.na(unlist(userRatings))
  non_rated_movies = colnames(userRatings)[not_rated]
  rated_movies = colnames(userRatings)[!not_rated]
  # the user's existing ratings, as a named numeric vector
  user_vals = unlist(userRatings[rated_movies])
  # predict each non-rated movie as the similarity-weighted average
  # of the ratings the user gave to the rated movies
  for(movie in non_rated_movies){
    sims = item_sim[movie, rated_movies]
    ratings[userno, movie] = sum(sims * user_vals) / sum(sims)
  }
  # ratings is modified only inside the function; the global copy is unchanged
  return(ratings[userno,])
}

> rec_itm_for_user(7)
  Users  Titanic Batman Inception SuperMan Spiderman   matrix
7  CHAN 3.085298    4.5  2.940811        4         1 3.170034
Calling the above function gives predicted ratings for the movies CHAN has not yet seen: Titanic, Inception and Matrix. Now we can sort the predictions and recommend the top items.
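The sorting step can be sketched as follows. The predicted values are copied from the output shown earlier; top_n is a hypothetical helper name introduced here, not part of the original code.

```r
# Predicted ratings for the movies CHAN had not rated
# (values taken from the rec_itm_for_user(7) output)
predicted <- c(Titanic = 3.085298, Inception = 2.940811, matrix = 3.170034)

# Sort the predictions in decreasing order and keep the top n titles
top_n <- function(scores, n = 2) {
  head(names(sort(scores, decreasing = TRUE)), n)
}

print(top_n(predicted, n = 2))
```

With these values, matrix ranks first and Titanic second, so those two would be recommended to CHAN when N is 2.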
That is all about collaborative filtering in R. In my upcoming posts I will talk about content based recommender systems in R.