Bug bash instructions - RevolutionAnalytics/AzureML GitHub Wiki
To install the package directly from GitHub, use:
# Install devtools
if(!require("devtools")) install.packages("devtools")
devtools::install_github("RevolutionAnalytics/AzureML")
To publish web services, you need to have an external zip utility installed. This utility must be available on the path. See ?zip for more details.
On Windows, it's sufficient to install Rtools.
Note: the utility should be called zip, since the R function zip() looks for a file called zip in the path. Thus, publishWebService() may fail, even if you have a program like 7-zip installed.
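One way to check whether R can find a zip utility is to query the path with Sys.which(). This is a minimal sketch using only base R; the message wording is illustrative and not taken from the package.

```r
# Look up the zip executable on the search path.
# Sys.which() returns an empty string when nothing is found.
zip_path <- Sys.which("zip")

if (nzchar(zip_path)) {
  message("zip found at: ", zip_path)
} else {
  message("No zip utility found on the path; on Windows, try installing Rtools.")
}
```

If the check fails even though a zip program is installed, the executable is probably either named differently or not on the path.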
To use any of the functions, you need your AzureML credentials. To find these, read the vignette Getting Started with the AzureML Package.
You will need these credentials to use the function workspace(). This function sets up your credentials in R and allows you to use all of the other functions in the package.
The easiest way to use the workspace() function is to create a JSON file at ~/.azureml/settings.json. Copy the following and modify it with your own credentials:
{"workspace":{
"id" : "Add your id here",
"authorization_token" : "Add your authorisation token here"
}}
Then save the file at ~/.azureml/settings.json. On Windows, save the file in the folder C:\Users\<yourname>\Documents\.azureml
If you have any doubt as to the location of ~/, try:
> path.expand("~/")
[1] "C:/Users/adevries/Documents/"
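If you prefer to create the settings file from R, the sketch below shows one possible approach. The helper write_azureml_settings() is hypothetical and not part of the AzureML package. It assumes the id and token contain no characters that need JSON escaping, and it refuses to overwrite an existing file.

```r
# Hypothetical helper (not part of the AzureML package): writes a
# settings.json skeleton into `dir` unless one already exists there.
write_azureml_settings <- function(id, token,
                                   dir = file.path(path.expand("~"), ".azureml")) {
  dir.create(dir, showWarnings = FALSE, recursive = TRUE)
  path <- file.path(dir, "settings.json")
  if (file.exists(path)) {
    stop("settings.json already exists at ", path)
  }
  writeLines(c(
    '{"workspace":{',
    sprintf('  "id" : "%s",', id),
    sprintf('  "authorization_token" : "%s"', token),
    '}}'
  ), path)
  invisible(path)
}

# Demonstration against a temporary directory, so nothing in ~ is touched.
demo_path <- write_azureml_settings("Add your id here",
                                    "Add your authorisation token here",
                                    dir = file.path(tempdir(), ".azureml"))
readLines(demo_path)
```

Calling the helper without the dir argument would target the .azureml folder under the location reported by path.expand("~").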
Try:
- Read the help for ?workspace, ?datasets and ?download.datasets
- Create a workspace object
- Get a listing of available datasets in your workspace
- Download a specific dataset from AzureML as a data frame
ws <- workspace()
d <- datasets(ws)
dat <- download.datasets(d, "Movie Ratings")
head(dat)
You can publish almost any R function as a web service in AzureML, subject to some input/output constraints.
- Read the help for ?publishWebService
- Try some of the examples in ?publishWebService or ?consume
Here is a more complicated example showing how to create a function that takes ordered factors as input:
# Train a model using diamonds in ggplot2
library(rpart)
data(diamonds, package="ggplot2")
set.seed(1)
train_idx <- sample.int(nrow(diamonds), 30000)
test_idx <- sample(setdiff(seq_len(nrow(diamonds)), train_idx), 500)
train <- diamonds[train_idx, ]
test <- diamonds[test_idx, ]
model <- glm(price ~ carat + clarity + color + cut - 1, data = train,
             family = Gamma(link = "log"))
diamondLevels <- diamonds[1, ]
# The model works reasonably well, except for some outliers
plot(exp(predict(model, test)) ~ test$price)
# Create a function to publish. The function takes care of converting characters correctly to factors
predictDiamonds <- function(x) {
  x$cut <- factor(x$cut,
                  levels = levels(diamondLevels$cut), ordered = TRUE)
  x$clarity <- factor(x$clarity,
                      levels = levels(diamondLevels$clarity), ordered = TRUE)
  x$color <- factor(x$color,
                    levels = levels(diamondLevels$color), ordered = TRUE)
  predict(model, newdata = x, type = "response")
}
# Publish the service
ws <- workspace()
ep <- publishWebService(ws, fun = predictDiamonds, name = "diamonds",
                        inputSchema = test)
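The factor conversion inside predictDiamonds() can be tried in isolation. The sketch below uses only base R and writes out the levels of diamonds$cut by hand so that ggplot2 is not needed. The input vector is made up, and the idea that the service receives these columns as character strings is an assumption based on the comment in the example above.

```r
# Levels of diamonds$cut, in order (written out here so the example
# does not need ggplot2).
cut_levels <- c("Fair", "Good", "Very Good", "Premium", "Ideal")

# Made-up character input, standing in for what a caller might send.
x <- c("Ideal", "Fair", "Premium")

# Rebuild the ordered factor with the full set of training levels.
cut_factor <- factor(x, levels = cut_levels, ordered = TRUE)

is.ordered(cut_factor)   # TRUE
levels(cut_factor)       # all five levels, not only the three present in x
as.integer(cut_factor)   # position of each value within cut_levels
```

Supplying the levels explicitly keeps the factor coding consistent with the training data, even when the new data contains only some of the levels.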
Now that you've published an API, you can send data for scoring by using the function consume().
- Read the help for ?consume
- Try some of the examples
To consume the model you published in the previous section, try:
results <- consume(ep, test)$ans
# A summary of the relative prediction errors:
summary((results - test$price) / test$price)
# Compare the AzureML results with locally computed ones:
crossprod(predictDiamonds(test) - results)
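For a numeric vector, crossprod() returns the sum of squared elements as a 1 x 1 matrix, so a value close to zero in the comparison above suggests that the local and remote predictions agree. A small base R sketch with made-up numbers (not real AzureML output):

```r
# Made-up numbers standing in for local and remote predictions.
local_pred  <- c(100, 250, 400)
remote_pred <- c(100, 250, 400.5)

delta <- local_pred - remote_pred

# crossprod() on a vector returns a 1 x 1 matrix holding the sum of squares.
ss <- crossprod(delta)
dim(ss)

# The same quantity as a plain number.
sum(delta^2)

# all.equal() offers a tolerance-based comparison as an alternative.
isTRUE(all.equal(local_pred, remote_pred, tolerance = 1e-8))
```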
Delete this example web service when you're done if you wish:
deleteWebService(ws, "diamonds")
To report issues or problems, use the issue log or send me a direct message.