I first came across Alteryx about a year and a half ago. To be completely honest, I was tough on them. In our initial meeting, I literally blurted out, “Why would I ever pay $X to use this solution in light of Excel Project Data Explorer (now Excel Power Query), Lavastorm and open source Rattle for R?” I had seen a demo and asked a few questions. I felt that what I saw was too narrowly focused on location analytics, and I objected to a pricing model that at the time included unnecessary premium data sets. Although I was skeptical and difficult, I liked Alteryx’s visual, easy approach to self-service ETL in concept and continued to follow the progress of the predictive R wrapper.

Back in April 2013, I reviewed them again. This time I was pleasantly surprised by the rapid improvements and deemed the solution “king of the hill” in the self-service ETL niche. I also noted that Alteryx had listened and addressed my earlier objections to the pricing model. The new pricing was much more appealing and competitive. I was impressed with the predictive R wrapper, the options to map to existing R libraries and the great array of samples to jump-start development.


Flash forward to February 2014: Alteryx shocked and rocked the latest Gartner BI Magic Quadrant AND was included in the first-ever Gartner Advanced Analytics Magic Quadrant for its maturing predictive offering. Alteryx had one of the largest positive leaps in year-over-year Gartner rating performance that I had ever seen! Thus it was time to review them again, and indeed the Alteryx Designer desktop had blossomed into much more than a pleasant self-service ETL tool. The offering included support for the full CRISP-DM life cycle and quite a few different algorithms via drag-and-drop, point-and-click tasks that I was able to teach myself in a couple of minutes.

Alteryx INSPIRE!

In mid-June Alteryx will be holding their annual user conference, INSPIRE, in sunny San Diego, California. INSPIRE is a fantastic name. I only touch on tiny segments of Alteryx that I care about in data preparation and predictive/data mining. There are user interface and app features that I have not played with at all. There are also excellent recent additions of Revolution R, social media, JSON parsing, blob handling, Marketo, SAS and SPSS connectors for combining solutions, etc. It will be fun to see how other folks are using Alteryx in the real world. If you are going to INSPIRE, please let me know. I’d love to meet up with you during that conference.

New Alteryx 9.0 Features

In preparation for the conference, I went ahead and downloaded the most recent 9.0 trial. The overall ease of use, scheduler, scale-out, big data, and variety of cool data connector improvements are getting even more compelling! I was surprised to see a QlikView connector, since Alteryx and Tableau have historically gone to market hand-in-hand. Expanding the Alteryx connectors to cover more self-service data visualization tools does make a lot of sense as a way to protect and secure a premium market segment position. We already got a “hint” of Tableau adding a few ETL-like features with v8.2 from TCC 2013 and a recently posted role seeking a Program Manager, ETL Features.


Recently at Informatica’s conference, a solution called Springbok, aimed at self-service data visualization tools, was mentioned. Excel Power Query keeps getting better, and so on.

Looking at Alteryx v9.0, I can clearly see that they have not been complacent. Here is a brief summary of a few new features that I genuinely appreciate:

New Data Connectors

  • Social Media
    • Twitter: Supports tweet searches over the last 7 days by given search terms, with location and user relationship as optional properties
    • Gnip: Supports tweet searches over the last 30 days from the market-leading Gnip Search API, with multiple configuration options
    • DataSift: Supports DataSift social media analytics, a market-leading, totally awesome social data platform for extracting insights from billions of public conversations
    • Foursquare: Supports searching the Foursquare Venues API by a location, with multiple options
  • SQL Server bulk loader
  • IBM SPSS – Read and write .sav files
  • SAS – Read and write .sas7bdat files for SAS version 7 through 9.4
  • Amazon Redshift – ODBC driver
  • HP Vertica – ODBC driver
  • Pivotal Greenplum – ODBC driver
  • Marketo Marketing Automation
  • QlikView – Write .qvx files
  • Google Analytics Core Reporting API
  • Improved connectivity to web APIs with an enhanced Download task
  • JSON Parse: Supports separating JavaScript Object Notation text into a table schema for the purpose of downstream processing (JSON is getting more and more important when dealing with unstructured data sources)
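
To make the JSON Parse idea a bit more concrete, here is a minimal Python sketch of turning nested JSON text into flat name/value rows that a table-oriented tool can process downstream. This is only an illustration of the general technique, not how Alteryx implements the tool; the function name `flatten_json`, the dotted path naming and the sample record are my own assumptions.

```python
import json


def flatten_json(value, prefix=""):
    """Flatten nested JSON data into a list of (name, value) rows.

    Hypothetical helper for illustration: nested keys and list positions
    are joined with dots to form the row name.
    """
    rows = []
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else str(key)
            rows.extend(flatten_json(child, path))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            path = f"{prefix}.{index}" if prefix else str(index)
            rows.extend(flatten_json(child, path))
    else:
        # Scalars (strings, numbers, booleans, null) become one row each.
        rows.append((prefix, value))
    return rows


# Made-up sample record, loosely shaped like a social media post.
sample = '{"user": {"name": "ann", "tags": ["bi", "etl"]}, "retweets": 3}'
rows = flatten_json(json.loads(sample))

for name, val in rows:
    print(name, val)
```

Each scalar in the document becomes one row, so a deeply nested payload ends up as a simple two-column table that can be filtered, joined or pivoted like any other data stream.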

Predictive Analytics

  • R was updated to version 3.0.2 and there are quite a few improvements to the R tool for higher scalability
  • Revolution R Integration: Alteryx can now leverage the performance benefits of Revolution R Enterprise (See Revolution R Installation Guide on the downloads page)
    • XDF Input: This tool enables access to an XDF format file (the format used by Revolution R Enterprise’s RevoScaleR system to scale predictive analytics to millions of records) for either: (1) using the XDF file as input to a predictive analytics tool or (2) reading the file into an Alteryx data stream for further data hygiene or blending activities.
    • XDF Output: This tool reads an Alteryx data stream into an XDF format file.
  • Gamma Regression: Relate a Gamma distributed, strictly positive variable of interest (target variable) to one or more variables (predictor variables) that are expected to have an influence on the target variable
  • Heat Plot: Uses a heat plot color map to show the joint distribution of two variables that are either continuous numeric variables or ordered categories. (Note: I could not resist testing this one the moment I saw it! Here is a sneak peek.)


  • Spline Model: Predict a variable of interest (target variable) based on one or more predictor variables using the two-step approach of Friedman’s multivariate adaptive regression splines (MARS) algorithm. Step 1 selects the most relevant variables for predicting the target variable and creates a piecewise linear function to approximate the relationship between the target and predictor variables. Step 2 smooths out the piecewise function, which reduces the chance of overfitting the model to the estimation data. The Spline Model is useful for a multitude of classification and regression problems and can automatically select the most appropriate model with minimal input from the user.
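
For readers wondering what the “piecewise linear function” in the Spline Model description looks like, here is a small Python sketch of the hinge functions that MARS-style models are built from. It only evaluates a function with hand-picked terms; it does not perform the variable selection, knot search or smoothing steps, and it is not Alteryx’s or R’s implementation. The names `hinge` and `piecewise_linear`, and all the coefficients and knots, are made up for illustration.

```python
def hinge(x, knot, direction=1):
    """Hinge basis function: max(0, x - knot) when direction is +1,
    max(0, knot - x) when direction is -1."""
    return max(0.0, direction * (x - knot))


def piecewise_linear(x, intercept, terms):
    """Evaluate intercept + sum of coef * hinge(x, knot, direction).

    `terms` is a list of (coef, knot, direction) tuples. In a real MARS
    fit these would be chosen by the algorithm; here they are assumed.
    """
    return intercept + sum(
        coef * hinge(x, knot, direction) for coef, knot, direction in terms
    )


# Hypothetical model: the slope changes at the knot x = 3.
terms = [(2.0, 3.0, +1), (-1.0, 3.0, -1)]

for x in (1.0, 3.0, 5.0):
    print(x, piecewise_linear(x, intercept=1.0, terms=terms))
```

Below the knot only the second term is active and above it only the first, which is how a handful of hinge terms can bend a straight line to follow a curved relationship.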

And last but not least… Alteryx added more samples and updated their website with a clean, fresh, modern feel. If you want to play with the latest and greatest Alteryx v9.0, you can download it from http://downloads.alteryx.com/.