
Azure Machine Learning (Azure ML) How-to

  • December 29, 2016
  • By Uma Narayanan

Introduction to Azure ML

Azure Machine Learning (Azure ML) is a SaaS cloud offering from Microsoft. Its advantage is that it provides a UI-based interface and predefined algorithms that can be used to create a training model. It also supports R and Python script integration. This article explains how to create a training model and then deploy it as a Web service, using a "book recommender" model as the example.

Before we start with the actual experiment, let's look at the prerequisites.

Prerequisites

To create a machine learning model, we first create a workspace. Navigate to https://portal.azure.com and click the "New" button. In the panel, search for "Machine Learning Workspace." In the search results, click "Machine learning workspace" and then "Create."

ML01
Figure 1: Creating the workspace

ML02
Figure 2: Search results

Navigate through the steps and finish the workspace creation. Once the workspace is created, navigate to the workspace and click "Launch Machine Learning Studio."

ML03
Figure 3: Launching Machine Learning Studio

ML Studio is a UI-based editor that provides a set of predefined algorithms to create a training model. The most popular algorithms are already implemented and ready to be used in an experiment. ML Studio gives an easy and quick way to create ML experiments and validate them.

Let's understand it with the help of an example. In this example, we will create a book recommender experiment. We will upload two new datasets, "book-ratings" and "book." The "book" dataset is a master set of books with the following columns: ISBN, Book-Title, Book-Author, Year-of-Publication, and Publisher. The "book-ratings" dataset has the following columns: User-ID, ISBN, and Book-Rating.

Navigate to ML Studio at https://studio.azureml.net.

Dataset

On the left panel, select "Dataset"; then click the "New" button, as shown in Figure 4:

ML04
Figure 4: Starting a new dataset

Clicking "New" opens a screen to upload a new CSV file. Click "From Local File" and upload the file. Because only one file can be uploaded at a time, we need to do this twice: once for the "Book Ratings" data and once for the "Book" data.

ML05
Figure 5: Uploading a local file

Clicking "From Local File" opens a dialog window, as shown in Figure 6:

ML06
Figure 6: The Upload a new dataset window

Choose a file to upload. Leave "Select a Type for the new dataset" set to "Generic CSV File with a header (.csv)" and save the changes. Repeat the same process for Books.csv.
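Because the dataset type expects a header row, it can help to check the header of each file before uploading it. The following is a minimal Python sketch of such a check, not part of ML Studio itself. The function name `missing_columns` and the sample rows are hypothetical, and the rating column is assumed to be named "Book-Rating," as in the relational expression used in the Split Data step.

```python
import csv
import io

# Column names as described in the article; "Book-Rating" is an assumption
# based on the expression used later in the Split Data step.
EXPECTED_COLUMNS = {
    "book": ["ISBN", "Book-Title", "Book-Author", "Year-of-Publication", "Publisher"],
    "book-ratings": ["User-ID", "ISBN", "Book-Rating"],
}


def missing_columns(csv_text, dataset_name):
    """Return the expected columns that are absent from the CSV header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])  # an empty file yields an empty header
    return [name for name in EXPECTED_COLUMNS[dataset_name] if name not in header]


# Made-up sample data; the comparison is case sensitive.
sample_ok = "User-ID,ISBN,Book-Rating\n1,A1,5\n"
sample_bad = "user-id,ISBN,Book-Rating\n1,A1,5\n"

print(missing_columns(sample_ok, "book-ratings"))   # []
print(missing_columns(sample_bad, "book-ratings"))  # ['User-ID']
```

To check a real file, read its text first (for example with `open(path, newline="").read()`) and pass it to the function.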

Experiment

Once the datasets are created, we are ready to create an experiment. Select Experiment from the left panel (as seen in Figure 5) and click "New." Select "Blank Experiment," as shown in Figure 7.

ML07
Figure 7: The Blank Experiment window

This opens a canvas with a panel on the left that lists a number of modules. These modules can be dragged and dropped onto the canvas, as in Figure 8:

ML08
Figure 8: The canvas with a panel and modules

Give the experiment a name, for example "Books Experiment." Expand "My Datasets" on the left panel (refer to Figure 8), select "Book rating," and drag and drop it onto the canvas. Right-click the circle at the bottom of the dataset and click Visualize to see the data and the column heading names (see Figure 9).

ML09
Figure 9: Clicking Visualize

Once the dataset is added, the next step is to cleanse the data to ensure the experiment gives the desired results. In this example, cleansing means filtering the records by rating: records that have no rating, or a rating of 0, are filtered out. On the left panel, search for "Split," select "Split Data," and then drag and drop it onto the canvas.

ML10
Figure 10: Selecting Split Data

Connect the two modules as shown in Figure 11:

ML11
Figure 11: Connecting the two modules

Select the "Split Data" module and, in the properties panel, set the "Splitting mode" to "Relational Expression" and the "Relational expression" to '\"Book-Rating" != 0', where "Book-Rating" is a column name in the "Book-Ratings" dataset.

Note: Column names are case sensitive.
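To make the effect of the relational expression concrete, here is a minimal Python sketch of the same filter applied to rows held in memory. It is an illustration only and not how ML Studio implements the module. The function name `split_by_rating` and the sample rows are hypothetical, and treating empty values as "no rating" is an assumption.

```python
def split_by_rating(rows, column="Book-Rating"):
    """Partition rows into (rating != 0, everything else)."""
    rated, unrated = [], []
    for row in rows:
        value = row.get(column)  # the key lookup is case sensitive
        # Assumption: missing or empty ratings are treated like a rating of 0.
        if value not in (None, "") and float(value) != 0:
            rated.append(row)
        else:
            unrated.append(row)
    return rated, unrated


# Made-up sample rows shaped like the "book-ratings" dataset.
rows = [
    {"User-ID": "1", "ISBN": "A1", "Book-Rating": "5"},
    {"User-ID": "2", "ISBN": "B2", "Book-Rating": "0"},
    {"User-ID": "3", "ISBN": "C3", "Book-Rating": ""},
    {"User-ID": "4", "ISBN": "D4", "Book-Rating": "8"},
]

rated, unrated = split_by_rating(rows)
print(len(rated), len(unrated))  # 2 2
```

Only the first group (rows with a nonzero rating) is passed on to the rest of the experiment.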

We further split the data so that one portion of the records is used to train the model and the rest to score it. Here the dataset is divided 50-50: 50% of the data is used to train the model and the other 50% to score it. This ratio can be adjusted, for example to 80-20 or 70-30.

ML12
Figure 12: Adjusting the ratio
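The train/score split can be sketched in a few lines of Python. This is a minimal illustration that assumes a random row split with a fixed seed; the function name `split_rows` and its parameters are hypothetical and do not correspond to ML Studio property names.

```python
import random


def split_rows(rows, fraction=0.5, seed=0):
    """Shuffle rows reproducibly and cut them into (train, score) parts."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    shuffled = list(rows)      # copy, so the caller's list is not modified
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * fraction)
    return shuffled[:cut], shuffled[cut:]


records = list(range(10))  # stand-in for ten rating records

train, score = split_rows(records)                   # 50-50 split
train_80, score_20 = split_rows(records, fraction=0.8)  # 80-20 split

print(len(train), len(score))        # 5 5
print(len(train_80), len(score_20))  # 8 2
```

Every record lands in exactly one of the two parts, so the scoring set contains only records the model did not see during training.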

The next step is to train the model. Because we are building a "book recommender," search for "recommender" in the left panel. It brings up "Train Matchbox Recommender," "Score Matchbox Recommender," and "Evaluate Recommender." We will use all three in this experiment.

ML13
Figure 13: Finding Train Matchbox Recommender

First, add the "Train Matchbox Recommender" to the canvas and connect it, as shown in Figure 14:

ML14
Figure 14: Adding Train Matchbox Recommender to the canvas

Note: Hovering over a node displays the kind of data the node supports. An example can be seen in Figure 15.

ML15
Figure 15: Hovering over the nodes

Now, add the "Score Matchbox Recommender" to the canvas. It has several input nodes, and each expects a different type of data. The first node expects the output of "Train Matchbox Recommender," and the second node expects the dataset to score against, which is the second half of the split data. The connected model is shown in Figure 16:

ML16
Figure 16: The connected model

Click "Score Matchbox Recommender" and set its properties as demonstrated in Figure 17:

ML17
Figure 17: Setting the Score Matchbox Recommender properties

The last step is to add "Evaluate Recommender." Its first input node takes a "Test dataset" (the second part of the split), and its second node takes the output of "Score Matchbox Recommender." The updated model is shown in Figure 18:

ML18
Figure 18: Adding Evaluate Recommender

Run the experiment to visualize the data. Once the experiment finishes, a green icon is displayed against every module. However, when we visualize the output of the "Score Matchbox Recommender," the result contains only IDs, with no book titles (see Figure 19).

ML19
Figure 19: Showing IDs, but no titles

The scored dataset has "Item" and "Related Item 1" columns. Both contain ISBN values, which are difficult to interpret on their own. To make the output more readable, we need two joins with the "book" dataset to get the book titles. Add the "Book" dataset and a "Select Columns in Dataset" module to the canvas. Connect the modules (see Figure 20):

ML20
Figure 20: Connecting the modules

Select the "Select Columns in Dataset" and, in the properties panel, click "Launch the column selector." Select the columns, as shown in Figure 21:

ML21
Figure 21: Selecting the columns

Now, add two "Join Data" modules to the canvas and connect the modules (see Figure 22):

ML22
Figure 22: Connecting the modules

Select the first "Join Data" and, in the properties panel, select the keys to perform the join. Here, inner join the "Item" column from the "Score Matchbox Recommender" output dataset to the "ISBN" column of the "Books" dataset.

ML23
Figure 23: Performing a Join

In the second "Join Data," inner join the "Related Item 1" column from the first "Join Data" output to the "ISBN" column of the "Books" dataset.

ML24
Figure 24: Inner joining two columns
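The two joins can be pictured as two ISBN lookups against the book list. The following Python sketch shows the idea on made-up data and is not the module's implementation. The function name `add_titles` and the output column names "Item Title" and "Related Item 1 Title" are hypothetical, and the sketch assumes each ISBN appears only once in the book dataset.

```python
def add_titles(scored_rows, books):
    """Inner join scored rows to books twice: on Item and on Related Item 1."""
    # Assumption: ISBN is unique in the book dataset.
    title_by_isbn = {book["ISBN"]: book["Book-Title"] for book in books}
    joined = []
    for row in scored_rows:
        item, related = row["Item"], row["Related Item 1"]
        # Inner join semantics: rows without a match on both keys are dropped.
        if item in title_by_isbn and related in title_by_isbn:
            joined.append({
                **row,
                "Item Title": title_by_isbn[item],
                "Related Item 1 Title": title_by_isbn[related],
            })
    return joined


# Made-up sample data; the ISBN values are placeholders.
books = [
    {"ISBN": "A1", "Book-Title": "First Book"},
    {"ISBN": "B2", "Book-Title": "Second Book"},
]
scored = [
    {"Item": "A1", "Related Item 1": "B2"},
    {"Item": "A1", "Related Item 1": "Z9"},  # Z9 is not in books, so it is dropped
]

result = add_titles(scored, books)
print(result)
```

Because both joins are inner joins, a scored row survives only if both of its ISBN values are found in the book dataset.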

Run the experiment again. Now, visualize the data at the second "Join Data"; it displays the book titles. The final experiment is shown in Figure 25:

ML25
Figure 25: The final experiment

After executing the experiment successfully, click "Predictive Web service" (see Figure 26).

ML26
Figure 26: Clicking Predictive Web service

The output of the "Predictive Web service" looks as shown in Figure 27:

ML27
Figure 27: The output of Predictive Web service

Run the predictive model; the "Score Matchbox Recommender" again displays the data with ISBN IDs only. The joins need to be added again because they were added to the training model, not to the predictive model. The only difference is that the "Join Data" output is connected as an input to the "Web Service Output." After the joins are added, the predictive model looks as shown in Figure 28:

ML28
Figure 28: After adding the joins

Once the predictive model runs successfully, select "Deploy Web service [Classic]," as shown in Figure 29:

ML29
Figure 29: Selecting Deploy Web service (Classic)

The published Web service provides an API key for accessing it. Verify the Web service by clicking the "Test" button in the "Test" column under "Default Endpoint."

ML30
Figure 30: Verifying the Web service
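Outside the portal, the service is typically called over HTTP with the API key. The sketch below only builds such a request in Python and does not send it. The endpoint URL, the key, the input name "input1," and the JSON body layout are all assumptions for illustration; the actual values and request format should be taken from the API help page of the deployed service.

```python
import json
import urllib.request


def build_request(url, api_key, column_names, values):
    """Build (but do not send) a JSON POST request that carries the API key."""
    # Assumed body layout; check the service's API help page for the real one.
    body = {
        "Inputs": {"input1": {"ColumnNames": column_names, "Values": values}},
        "GlobalParameters": {},
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer " + api_key,  # key from the service dashboard
        },
        method="POST",
    )


# Placeholder URL and key; replace both with the values of the deployed service.
request = build_request(
    "https://example.invalid/execute",
    "my-api-key",
    ["User-ID", "ISBN", "Book-Rating"],
    [["1", "A1", "5"]],
)
print(request.get_method(), request.full_url)
```

With a real endpoint and key, passing the request to `urllib.request.urlopen` would send it and return the service's response.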

Summary

In the preceding example, we created a training model and a predictive model without writing any code. ML Studio makes this easy because popular algorithms are already defined and ready to use. It also provides modules to run custom R and Python scripts. Azure ML provides an easier and faster way to create training models. For beginners, it's a great place to start.
