by Alan Weaver, Advanced Analytics Specialist at Microsoft
Very often data scientists and analysts require access to back-end resources on Azure. For example, they may need to start a virtual machine or resize a Hadoop cluster. This typically requires making a request to the IT department and patiently waiting.
AzureSMR is a simple R package that lets those users perform many of these operations themselves.
Commonly used operations are easy to script, so they can be run without navigating the portal or its wizards. AzureSMR uses the Azure Systems Management API and builds on standard packages such as httr, so it runs in any R session (you don't need Microsoft R Server). You can also manage multiple Azure subscriptions from within the same session.
The AzureSMR functions currently address the following Azure services:
- Azure Blob: List, Read and Write to Blob Services
- Azure Resources: List, Create and Delete Azure Resources. Deploy ARM templates.
- Azure VM: List, Start and Stop Azure VMs
- Azure HDI: List and Scale Azure HDInsight Clusters
- Azure Hive: Run Hive queries against an HDInsight Cluster
- Azure Spark: List and create Spark jobs/sessions against an HDInsight Cluster (Livy)
For example, here's how you would use the AzureSMR package to find, start and stop an Azure virtual machine:
azureListVM(sc, resourceGroup = "Analytics")  # LIST VMS
azureStartVM(sc, vmName = "MYVM")             # START A VM
azureStopVM(sc, vmName = "MYVM")              # STOP A VM
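Because these are ordinary R functions, they can be wrapped in small helpers. The sketch below is one possible pattern, not part of AzureSMR: `run_on_vm()` is a hypothetical helper that starts a VM, runs a task, and stops the VM again even if the task fails. It calls `azureStartVM()` and `azureStopVM()` the same way as the example above, and assumes `sc` is an authenticated context whose resource group is already set.

```r
# Hypothetical helper (not part of AzureSMR): start a VM, run a task,
# and always stop the VM afterwards.
# start_vm / stop_vm default to the AzureSMR functions shown above; they are
# parameters only so the helper can be tried out with stand-in functions.
run_on_vm <- function(sc, vmName, task,
                      start_vm = AzureSMR::azureStartVM,
                      stop_vm  = AzureSMR::azureStopVM) {
  start_vm(sc, vmName = vmName)
  # on.exit() fires when the function exits, including on error,
  # so the VM is not left running if task() fails.
  on.exit(stop_vm(sc, vmName = vmName), add = TRUE)
  task()
}
```

With a real context this might be called as `run_on_vm(sc, "MYVM", function() source("nightly_job.R"))`, where the script name is only a placeholder.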
Or, you could spin up a Spark cluster in HDInsight (though you should choose better passwords!):
azureCreateHDI(sc,
    resourceGroup = "myResourceGroup",
    clusterName = "myhdicluster",
    location = "northeurope",
    storageAcc = "azuresmrteststorage",
    version = "3.5",
    sshUser = "hdiuser",
    sshPassword = "Password123!!",
    adminUser = "admin",
    adminPassword = "Password123!!"
)
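On the subject of passwords: one way to keep them out of scripts is to read them from environment variables. The following is a minimal sketch under that assumption. `get_secret()` and `create_cluster()` are hypothetical helpers, and the variable names `HDI_SSH_PASSWORD` and `HDI_ADMIN_PASSWORD` are made up for illustration; the `azureCreateHDI()` arguments are the ones from the example above.

```r
# Hypothetical helper: read a required secret from an environment variable
# and fail loudly if it is missing.
get_secret <- function(name) {
  value <- Sys.getenv(name, unset = "")
  if (!nzchar(value)) {
    stop("Environment variable ", name, " is not set", call. = FALSE)
  }
  value
}

# Hypothetical wrapper around the azureCreateHDI() call shown above.
# Defining it does not contact Azure; nothing happens until it is called
# with an authenticated context `sc`.
create_cluster <- function(sc) {
  AzureSMR::azureCreateHDI(sc,
    resourceGroup = "myResourceGroup",
    clusterName   = "myhdicluster",
    location      = "northeurope",
    storageAcc    = "azuresmrteststorage",
    version       = "3.5",
    sshUser       = "hdiuser",
    sshPassword   = get_secret("HDI_SSH_PASSWORD"),    # assumed variable name
    adminUser     = "admin",
    adminPassword = get_secret("HDI_ADMIN_PASSWORD")   # assumed variable name
  )
}
```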
Or create an Azure Storage account and list and create files:
azureCreateStorageAccount(sc, storageAccount = "azuresmrteststorage", resourceGroup = "myResourceGroup")
azureCreateStorageContainer(sc, "opendata")
azureListStorageContainers(sc, storageAccount = "azuresmrteststorage", resourceGroup = "myResourceGroup")
azurePutBlob(sc, contents = "XXX", blob = "testdir/testfile.csv")
azureBlobCD(sc, "/testdir")
azureBlobLS(sc)
For a detailed list of the available functions and their syntax please refer to the Help pages. A vignette is available to help set up Authentication and there is also a Getting started tutorial.
To get started with the AzureSMR package, simply use the devtools package to install it from GitHub.
if (!require("devtools")) install.packages("devtools")
devtools::install_github("Microsoft/AzureSMR")
library(AzureSMR)
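The examples in this post all pass a context object `sc`, which is not shown being created. A minimal sketch follows, assuming the package's `createAzureContext()` function takes `tenantID`, `clientID` and `authKey` arguments (check the authentication vignette for the exact setup). The environment variable names are hypothetical, and the call is skipped when credentials or the package are missing.

```r
# Read Azure Active Directory credentials from environment variables.
# The variable names below are assumptions made for this sketch.
creds <- Sys.getenv(c("AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_AUTH_KEY"))

if (all(nzchar(creds)) && requireNamespace("AzureSMR", quietly = TRUE)) {
  # Assumed signature: createAzureContext(tenantID, clientID, authKey)
  sc <- AzureSMR::createAzureContext(
    tenantID = creds[["AZURE_TENANT_ID"]],
    clientID = creds[["AZURE_CLIENT_ID"]],
    authKey  = creds[["AZURE_AUTH_KEY"]]
  )
} else {
  message("Credentials or AzureSMR not available; skipping context creation.")
}
```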
We welcome your feedback on this package! Contact us via the GitHub repo, linked below.
GitHub (Microsoft): AzureSMR, an R package for managing a selection of Azure resources