All things data and analytics


Introduction to Machine Learning with IBM Watson Studio

By Jen Underwood  •  July 31, 2018


Machine learning is a type of artificial intelligence (AI) that enables computers to learn without being explicitly programmed. Algorithms identify patterns in data to generate predictive models. Typically, machine learning tasks fall into three categories:

Supervised Learning – Computers train on labeled data and learn general rules. Commonly used algorithms include Support Vector Machines, Linear Regression, Logistic Regression, Naive Bayes, and Neural Networks.

Unsupervised Learning – Data fed into the computer is not labeled. The goal is to explore and find structure. Popular unsupervised learning algorithms include Cluster Analysis and Market Basket Analysis.

Reinforcement Learning – Computers learn through feedback systems. Reinforcement learning powers self-driving systems and robotics.
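
To make the first two categories concrete, here is a minimal sketch in plain Python. It is not how Watson Studio implements anything. The function names and the toy numbers are made up for illustration, and reinforcement learning is left out because it needs an environment to interact with. The supervised function learns from labeled points, while the unsupervised one only sees raw values and has to find the grouping on its own.

```python
from statistics import mean

def nearest_neighbor_predict(labeled_points, x):
    """Supervised: predict the label of x from labeled (value, label) pairs."""
    _, label = min(labeled_points, key=lambda pair: abs(pair[0] - x))
    return label

def two_means(values, iterations=10):
    """Unsupervised: split unlabeled values into two clusters (1-D k-means, k=2)."""
    low, high = min(values), max(values)
    if low == high:  # all values identical, nothing to separate
        return list(values), []
    for _ in range(iterations):
        left = [v for v in values if abs(v - low) <= abs(v - high)]
        right = [v for v in values if abs(v - low) > abs(v - high)]
        low, high = mean(left), mean(right)  # move each center to its cluster mean
    return left, right

# Made-up toy numbers, for illustration only.
labeled = [(1.0, "small"), (2.0, "small"), (8.0, "large"), (9.0, "large")]
print(nearest_neighbor_predict(labeled, 3.0))  # small
print(two_means([1.0, 2.0, 8.0, 9.0]))         # ([1.0, 2.0], [8.0, 9.0])
```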

In this lesson, we will walk through creating a supervised learning model with IBM SPSS Modeler, a service within IBM Watson Studio. Let’s get started. 

Getting Started with IBM SPSS Modeler

With IBM SPSS Modeler, you can build machine learning models with drag-and-drop ease. On a visual canvas, you load data, sample it, transform it, apply algorithms, and evaluate predictive model performance through a series of nodes, uncovering hidden patterns or variables that influence outcomes.

For our first foray into machine learning, we will download and explore Titanic, a publicly available dataset from Kaggle about the infamous shipwreck. The Titanic sank after colliding with an iceberg, killing 1502 out of 2224 passengers and crew. Unfortunately, the ship did not carry enough lifeboats for everyone. To predict which groups of people were more likely to survive than others, we will create a supervised learning model.

1. Creating a Project and Loading Data

After logging into Watson Studio, select New Modeler Flow. Enter a name, keep the default settings, and then click Create.

 

2. Loading Training Data

Next, expand the Import menu, drag the Data Asset node onto the stream canvas, and select the Titanic training data file (train.csv) in the node settings to load the data into the project. Right-click the node and select Preview to inspect the dataset.
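
If you would like to peek at the same file outside the canvas, a rough Python equivalent of Preview might look like the sketch below. The helper name preview_rows is hypothetical, the two rows are made-up placeholders rather than real passengers, and the column names are an assumption based on the Kaggle file layout, so check the header of your own download.

```python
import csv
import io
from itertools import islice

def preview_rows(csv_file, n=5):
    """Return up to n records from an open CSV file as dicts keyed by column name."""
    return list(islice(csv.DictReader(csv_file), n))

# Made-up placeholder rows; the column names are assumed to match a subset
# of the train.csv header.
SAMPLE_CSV = (
    "PassengerId,Survived,Pclass,Name,Sex,Age\n"
    '1,0,3,"Doe, Mr. John",male,30\n'
    '2,1,1,"Roe, Mrs. Jane",female,40\n'
)

for row in preview_rows(io.StringIO(SAMPLE_CSV), n=2):
    print(row["PassengerId"], row["Sex"], row["Pclass"], row["Survived"])
```

With the real download, you would pass an open file instead of the in-memory sample, for example open("train.csv", newline="").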

3. Designing a Stream

To build a modeler stream, look under Record Operations. Pick Sample and drag it onto the canvas. Then click the circle on the right side of the Data Asset node and drag the line to the left side of the Sample node to connect the two operations. Now right-click Sample to view its settings. For Titanic, we will use the First n defaults.
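
Conceptually, a First n sample simply keeps the leading records of the input. The sketch below shows that idea in plain Python next to a random alternative for contrast. Both function names are hypothetical, and this is only an approximation of the idea, not the Sample node's actual implementation.

```python
import random
from itertools import islice

def first_n(records, n):
    """Keep the first n records in their original order."""
    return list(islice(records, n))

def random_n(records, n, seed=0):
    """Alternative: keep n records chosen at random (reproducible via seed)."""
    pool = list(records)
    return random.Random(seed).sample(pool, min(n, len(pool)))

records = list(range(1, 11))   # stand-ins for ten passenger rows
print(first_n(records, 3))     # [1, 2, 3]
print(sorted(random_n(records, 3)))
```

Taking the first n rows is fast and repeatable, but if a file happens to be sorted (by ticket class, for example), a random sample is usually the safer choice.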

4. Choosing Model Algorithms

Now we will experiment with algorithms. Expand the Modeling menu and explore the large library of available machine learning models. For classifying Titanic survivors, we will pick Decision List, Classification & Regression Tree (C&R Tree), and Neural Net. Drag those three nodes onto the canvas and connect them to the Data Types node. Now let’s run the stream.

To run the stream, click the small blue triangle in the top menu of the stream canvas. SPSS will process the data through the selected machine learning models. Notice that when the run completes, new orange nodes appear. These nodes contain the model performance results.
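
For some intuition about what a tree model does while the stream runs, here is a small sketch of the split-scoring idea behind CART-style trees, which commonly use Gini impurity. It is an assumption that this mirrors the node's default settings, and it is certainly not SPSS Modeler's code. The rows are made-up toy data and the function names are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum(p_k ** 2)."""
    total = len(labels)
    if total == 0:
        return 0.0
    return 1.0 - sum((count / total) ** 2 for count in Counter(labels).values())

def weighted_gini_after_split(rows, feature, target="Survived"):
    """Size-weighted impurity of the groups produced by splitting on feature."""
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row[target])
    return sum(len(g) / len(rows) * gini(g) for g in groups.values())

# Made-up toy rows, not taken from the Titanic dataset.
TOY_ROWS = [
    {"Sex": "female", "Pclass": 1, "Survived": 1},
    {"Sex": "female", "Pclass": 3, "Survived": 1},
    {"Sex": "female", "Pclass": 3, "Survived": 1},
    {"Sex": "male", "Pclass": 1, "Survived": 1},
    {"Sex": "male", "Pclass": 3, "Survived": 0},
    {"Sex": "male", "Pclass": 3, "Survived": 0},
]

# The split with the lowest weighted impurity separates the classes best.
best_feature = min(["Sex", "Pclass"], key=lambda f: weighted_gini_after_split(TOY_ROWS, f))
print(best_feature)  # Sex
```

A tree repeats this scoring on each resulting group, which is how it ends up with rules such as a particular combination of sex and ticket class.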

5. Evaluating Model Performance

To review the findings, right-click each of the model result nodes and explore the evaluation menus. Note that each algorithm offers different options. In the Titanic C&R Tree model, females with 1st class tickets had the highest probability of survival, 97.33%. Other groups did not fare nearly as well.
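
A quick way to sanity-check a finding like this is to compute raw survival rates per group yourself. The sketch below does that in plain Python with a hypothetical helper and made-up toy rows, so it will not reproduce the 97.33% figure. For that you would feed it the rows from train.csv, and even then a tree's node probability can differ from a simple group average.

```python
from collections import defaultdict

def survival_rate_by_group(rows, keys=("Sex", "Pclass"), target="Survived"):
    """Return {group: share of survivors} for each combination of the key columns."""
    counts = defaultdict(lambda: [0, 0])  # group -> [survivors, total]
    for row in rows:
        group = tuple(row[k] for k in keys)
        counts[group][0] += int(row[target])
        counts[group][1] += 1
    return {group: survived / total for group, (survived, total) in counts.items()}

# Made-up toy rows, for illustration only.
toy_rows = [
    {"Sex": "female", "Pclass": "1", "Survived": "1"},
    {"Sex": "female", "Pclass": "1", "Survived": "1"},
    {"Sex": "female", "Pclass": "3", "Survived": "1"},
    {"Sex": "female", "Pclass": "3", "Survived": "0"},
    {"Sex": "male", "Pclass": "1", "Survived": "1"},
    {"Sex": "male", "Pclass": "1", "Survived": "0"},
    {"Sex": "male", "Pclass": "3", "Survived": "0"},
    {"Sex": "male", "Pclass": "3", "Survived": "0"},
]

rates = survival_rate_by_group(toy_rows)
for group, rate in sorted(rates.items(), key=lambda item: -item[1]):
    print(group, f"{rate:.0%}")
```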

6. Deploying and Using Models

Now that you have created several simple supervised machine learning models with IBM SPSS Modeler, you would begin testing those models with the unlabeled Titanic test dataset (test.csv) to see whether they remain accurate at predicting survival outcomes on new data.

Keep in mind that finding an optimal machine learning model on your first run is unusual. Typically, you will keep experimenting, refining model inputs and algorithm settings to improve predictive accuracy.
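
Because test.csv is unlabeled, one common way to measure accuracy between experiments is to hold back part of the labeled training data for validation. The sketch below illustrates that idea in plain Python. It is an assumption about workflow rather than a step from the Modeler flow, and the function names, the toy rows, and the baseline rule are all made up.

```python
import random

def train_validation_split(rows, validation_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and hold back a fraction for validation."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_validation = round(len(shuffled) * validation_fraction)
    cut = len(shuffled) - n_validation
    return shuffled[:cut], shuffled[cut:]

def accuracy(predict, rows, target="Survived"):
    """Share of rows where the prediction matches the known label."""
    correct = sum(1 for row in rows if predict(row) == row[target])
    return correct / len(rows)

def baseline_rule(row):
    """Hypothetical baseline: predict survival (1) for females, 0 otherwise."""
    return 1 if row["Sex"] == "female" else 0

# Made-up labeled rows standing in for train.csv.
labeled_rows = [
    {"Sex": "female", "Survived": 1},
    {"Sex": "male", "Survived": 0},
] * 5

train_rows, validation_rows = train_validation_split(labeled_rows)
print(len(train_rows), len(validation_rows))  # 8 2
print(accuracy(baseline_rule, validation_rows))
```

Each time you change inputs or algorithm settings, comparing accuracy on the same held-back rows shows whether the change actually helped.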

After a strong-performing model is built, it can be used to make predictions on new data. To deploy a machine learning model, right-click a final output node and then click Save branch as a model. Navigate to your model list on your Watson Studio project overview page. On the right side of that list, click Add Deployment and choose Web Deployment, Batch Prediction, or Real-time Streaming Prediction. That’s all there is to it.

For More Information

In this tutorial, we introduced how to get started building machine learning models using IBM SPSS Modeler. If you’d like to learn more, please review the following recommended resources.

  • Watson Studio: ibm.biz/watsonstudio
  • Watson Studio online docs
  • SPSS Flow online docs

This tutorial was written by Jen Underwood (@idigdata) courtesy of IBM Watson Studio. She received compensation to write it. However, all opinions expressed are her own.

The Author

Jen Underwood

Jen Underwood works in the emerging citizen data science segment. She has a unique blend of product management and “hands-on” experience in data warehousing, reporting, visualization, and advanced analytics. In addition to keeping a constant pulse on industry trends, she enjoys digging into oceans of data to solve complex problems with machine learning. Over the past 20+ years, Jen has held worldwide product management roles at Microsoft, DataRobot, and Aible. She also served as a technical lead for system implementation firms. She has experience launching new products and turning around failed projects. Most recently she provided advisory, strategy, educational content development, and marketing services to 100+ technology vendors through her own firm. She has been mentioned by KD Nuggets, Information Management and Forbes for her work. She also has written for InformationWeek, O’Reilly Media, and numerous other tech industry publications. Jen has a Bachelor of Business Administration – Marketing, Cum Laude from the University of Wisconsin, Milwaukee and a post-graduate certificate in Computer Science – Data Mining from the University of California, San Diego. She was also honored to be a former IBM Analytics Insider, Tableau Zen Master, and Top 10 Women Influencer.
