Data Hacking with RDSTK 2
RDSTK is a versatile package. It includes functions that convert IP addresses to geographic locations and derive statistics from them. It can also take a body of text and score its sentiment.
This exercise set continues from the previous one, RDSTK 1.
This package provides an R interface to Pete Warden’s Data Science Toolkit. See www.datasciencetoolkit.org for more information.
Answers to the exercises are available here.
If you obtained a different (correct) answer than those listed on the solutions page, please feel free to post your answer as a comment on that page.
Exercise 1
Load the rdstk2 CSV dataset into a data frame called df.
Exercise 2
a. Create a string called s1 and store “statistics” in it.
b. Create a string called s3 and store “value” in it.
c. Create a function that takes a string s2 as input and returns a string in the format s1+s2+s3, separated by “.”. Name this function “stringer”.
Exercise 3
Let's test this function.
stringer("hello")
You should see the output “statistics.hello.value”.
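One possible way to build such a function is with paste() and its sep argument. This is only a sketch, not necessarily the approach used on the solutions page:

```r
# Join three pieces with "." as the separator.
s1 <- "statistics"
s3 <- "value"

stringer <- function(s2) {
  # s1 and s3 are looked up in the enclosing (here: global) environment
  paste(s1, s2, s3, sep = ".")
}

stringer("hello")
```

Because sep = "." is set, paste() puts a dot between each pair of arguments instead of the default space.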
Exercise 4
Create a for loop that iterates over the rows of df and derives the population density of each location using the coordinates2statistics function. Save the results in df$pop.
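The row-by-row pattern can be sketched as follows. To keep the sketch runnable offline, it uses a made-up data frame and a hypothetical stub, lookup_density(), in place of the real coordinates2statistics call; the column names lat and lon are also assumptions, since the actual columns of the rdstk2 dataset may differ.

```r
# Hypothetical stand-in for df; the real column names may differ.
df <- data.frame(lat = c(37.77, 40.71), lon = c(-122.42, -74.01))

# Hypothetical stub used instead of a network call.
# In the exercise, this is where coordinates2statistics() would be called
# and the relevant value pulled out of its result.
lookup_density <- function(lat, lon) {
  round(abs(lat) + abs(lon))
}

df$pop <- NA_real_                      # pre-allocate the result column
for (i in seq_len(nrow(df))) {          # seq_len() is safe even for 0 rows
  df$pop[i] <- lookup_density(df$lat[i], df$lon[i])
}
df
```

Pre-allocating df$pop and filling it by index keeps the loop simple and makes it obvious which row each result belongs to.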
Exercise 5
Let's now write a function using what you learned in exercises 3 and 4. The function takes a string as input, like s2 in exercise 3, and combines it with s1 and s3 inside the function body. It then runs the same for loop as in exercise 4, but stores the result in df$pop2 instead of df$pop. Name this function stat_maker. When the function returns df, you should see a new column containing all the results.
Exercise 6
Test the function stat_maker by calling stat_maker("population_density"). Notice that the call returns a modified copy of df but leaves the df in your workspace unchanged. This is because the function works on its own local copy, and we did not assign df as a global variable. That's okay; we will learn how to do that in the next exercise.
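The behaviour is easier to see on a small example. The sketch below uses a toy data frame and a hypothetical function, add_column(), rather than stat_maker itself:

```r
df <- data.frame(x = 1:3)

add_column <- function() {
  df$y <- df$x * 2   # modifies a local copy of df, not the global one
  df                 # returns the modified copy
}

result <- add_column()
names(result)  # "x" "y"
names(df)      # "x"
```

The returned value has the new column, while the global df is untouched unless you assign the result back, for example with df <- add_column().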
Exercise 7
Great. Before we modify our function, let's learn how to create a global variable inside a function. Use the same code from exercise 5, but this time assign df$pop2 globally instead of locally. Run the function and test it again.
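One way to assign globally from inside a function is the <<- operator. The sketch below shows it on a toy data frame with a hypothetical function, add_column_global(), and is not meant as the official solution:

```r
df <- data.frame(x = 1:3)

add_column_global <- function() {
  # <<- assigns into the df found in an enclosing environment
  df$y <<- df$x * 2
  invisible(NULL)
}

add_column_global()
names(df)  # "x" "y"
```

Unlike <-, the <<- operator searches the enclosing environments for an existing df and modifies that one, so the change is visible after the function returns.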
Exercise 8
You can also use the assign() function inside a function to set a result as a global variable. Let's first see an example of assign():
assign("test", 50)
Now type test in your console. You should see 50. Try it.
Exercise 9
Now put the code from exercise 8 inside the stat_maker function, changing test to test2. Once you run the function, you will see that typing test2 in the console does not return 50. This is because test2 was created as a local variable inside the function, not as a global variable.
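The effect can be reproduced in isolation. This sketch uses a hypothetical helper, make_local(), and assumes test2 has not already been defined in your workspace:

```r
make_local <- function() {
  # With no envir argument, assign() writes to the current (function) environment
  assign("test2", 50)
  exists("test2", inherits = FALSE)  # TRUE inside the function
}

make_local()
exists("test2")  # FALSE at top level if test2 was never defined there
```

The variable lives only in the function's own environment, which is discarded when the function returns.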
Exercise 10
Set test2 as a global variable inside the stat_maker function. Run the function, and test2 should now return 50 when you call it.
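A minimal sketch of one way to do this is to pass envir = .GlobalEnv to assign(). The helper name make_global() is hypothetical; in the exercise the call would sit inside stat_maker:

```r
make_global <- function() {
  # envir = .GlobalEnv tells assign() to write into the global environment
  assign("test2", 50, envir = .GlobalEnv)
  invisible(NULL)
}

make_global()
test2  # 50
```

Naming the target environment explicitly makes it clear where the variable ends up, regardless of where the function is called from.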