Decision making and the techniques and technologies to support and automate it will be the next competitive battleground for organizations. Those who are using business rules, data mining, analytics and optimization today are the shock troops of this next wave of business innovation.
- Tom Davenport, Competing on Analytics
Additional articles available on Tech Target's BeyeNETWORK and SQL Server Pro Magazine.
Impact Analytix Winter 2013 News & Views
As promised, I am pleased to announce that the Impact Analytix quarterly newsletter, NEWS & VIEWS, has been published for Winter 2013. NEWS & VIEWS contains the latest business intelligence and predictive analytics industry updates, what's hot/what's not, technical product reviews, career outlooks and other hot topics - no holds barred - in a slightly different, much more condensed format than the blog.
The current issue contains critical career tips for those of you considering a new job move in 2014 when making those annual New Year's resolutions. I warn of employment and contractor agreement traps for high-demand skills, non-competes and other challenging career issues now happening in the war for limited analytic talent - a topic near and dear to my heart. I also share 2013 industry reflections and 2014 predictions, a look at Tableau TCC and the Tableau 8.1 release, other upcoming BI vendor events, resources and specials, and introduce Josh Sebert.
Josh has over 20 years of combined financial, accounting, human resources, information technology, legal and operational management experience as a CFO and COO. He has extensive experience with enterprise systems, reporting systems and business improvement methodologies, and his projects have been highlighted in several high-profile business intelligence case studies. Josh graduated from Purdue University with a BS, followed with a JD from California Western School of Law, and then earned an MBA from the University of California, Irvine. I am excited to be working with him. He brings a fantastic background in business strategy, operations management, performance management systems, financial forecasting and analytical modeling.
For more scoop, download the current copy of NEWS & VIEWS and subscribe to automatically receive future updates by email. This is just the beginning of what is sure to be an enlightening, quite different industry newsletter - one covering the hot-button topics that no one openly talks about but that everyone in the BI and analytics profession needs to know about.
SAP Lumira v1.13
I have been following SAP Lumira since its debut as SAP Visual Intelligence at SAP TechEd in 2012. I remember sitting in various SAP TechEd sessions in Las Vegas last year hearing SAP Product Managers tell the audience to expect three-to-four-week feature release cycles. At the time I was both shocked and skeptical - what I was hearing seemed almost impossible for a global business intelligence vendor the size of SAP. One year later, that is exactly what SAP is doing.
I bought SAP Lumira Professional after the Sapphire conference in May and have already received a few updates since. No sooner had I installed v1.12 than v1.13 was released a few weeks later, with v1.14 right behind it. These rapid releases are truly amazing. Now to be totally blunt, SAP does have a lot of catching up to do in the Data Discovery space where SAP Lumira plays - SAP Lumira is nowhere near a Tableau or Tibco Spotfire offering today. However, SAP Lumira does have many more data visualization types than Microsoft's Power View. With the latest 1.12 and 1.13 releases, SAP has made improvements worth covering. Here are a few of my review notes from a BI professional's perspective.
Upon launching the latest and greatest SAP Lumira v1.13, I noticed the Welcome screen had been enhanced with links to sample content, videos and other resources. I was pleased with the nicer user interface and delightful new branding. The development path steps were visually displayed and linked to the related screens for a quick and easy jump start.
There were more data sources available in this version than in previous versions, including Excel, CSV, SAP HANA, SAP Business Objects Universe, JDBC and ODBC SQL queries, IBM DB2, Greenplum, PostgreSQL, Apache Hadoop, Microsoft SQL Server, Oracle, Netezza, SAP ERP, SAP R/3, Sybase and Teradata. For my review, I wanted to test a Big Data set to see how SAP Lumira would perform. I used the Hortonworks Hadoop Sandbox NYSE Stock data set and a Hive connection to load the data. To connect and query Big Data with Hive, you need to install the drivers. Other options for connecting and querying Big Data are described on the SAP + Hortonworks partnership page.
Once I had the Big Data loaded, I used the SAP Lumira Prepare features to add a Time hierarchy for calendar date, time-based intelligence and analysis. The Prepare functional area felt like a less robust Microsoft Power Query for imported data manipulation. To create a Time hierarchy, I selected the gear icon on the Calendar Date attribute and chose Create Time Hierarchy. Instantly, SAP Lumira generated a drill-capable hierarchy including Year, Quarter, Month and Date. I did not test how easy it would be to create a custom calendar or a classic 4-4-5; most of the BI tools I review do not have great wizards or tools for those classic BI time analysis scenarios. The Prepare screen also had features for data cleansing, including data type conversions, geospatial data types, filters, sorts, appends, merges, show or hide a field, and calculated fields with a library of data formatting and logic functions.
Now that my Big Data was cleansed and prepared, I was ready to get to the fun part…visually exploring the data to look for patterns and trends. To begin visualizing a data set in SAP Lumira, you use the Visual features. To create visualizations, you simply drag fields onto the user interface and choose a visualization type. This experience was still a bit clunky and limited in comparison to the other BI tools I review. Tableau and Microsoft Power View have far nicer drag-and-drop visualization build experiences and much better filtering capabilities.
I was impressed by the plethora of data visualization types available in SAP Lumira 1.13. These include, but are not limited to, Column (Bar), Line, Pie, Area, Stacked, Dual Axis, Combination, Donut, Scatter, Bubble, Tree Maps, Heat Maps, Geospatial Maps, Radar, Box Plots, Word Clouds, Waterfall, Parallel Coordinate, Funnel Charts and Grids. I did not see a way to overlay visualizations or control the axes, but the most popular viz types were indeed there. I also liked the visualization brushing features: I was able to visually select a subset of data on any of my views to narrow my focus and dig deeper into the details.
One of the Data Discovery features that I like to test is the plotting capability for visualizing big data sets. This is where a lot of BI tools fall down or fail altogether. Why do I care about this capability? We live in a big data world! A decent, modern Data Discovery tool should be capable of helping you visually analyze big data sets. For example, Excel does a horrible job of visualizing large data sets and suffers from a condition called over-plotting. Microsoft Power View can only render 1,000 points before it starts randomly sampling the data for the rendered visualization; SAP Lumira used to have that same 1,000-point limit. Now with SAP Lumira 1.13, it can render up to 10,000 points. 10,000 is much better than 1,000, but it totally pales in comparison to Tableau, which can render 60 million points as recently shown by Allen Walker.
A wonderful feature that I stumbled upon was basic, out-of-the-box SAP Lumira predictive calculations. By choosing the down arrow on a measure displayed against a date range, I was able to add a Forecast or Linear Regression predictive calculation to my visualization and specify how many periods forward I wanted to predict. There is also a snap-in product for SAP Lumira called SAP Predictive Analysis.
SAP Predictive Analysis is far more robust with regards to predictive features than base SAP Lumira. SAP Predictive Analysis supports the use of predictive algorithms from open source R and SAP HANA. This offering is closer to a Tibco Spotfire TERR type solution and will allow you to get quite sophisticated in your visual predictive analysis. SAP Predictive Analysis builds upon the data acquisition and data manipulation functionality of SAP Lumira and adds predictive features on top. I don't know if SAP might add KXEN predictive features into future releases of SAP Predictive Analysis, but predictive does appear to be an area where SAP has some core strengths that they could better leverage.
Each time I created a visualization, I could optionally save it to a collection for use in an SAP Lumira story board, a.k.a. a dashboard. Available visualizations for story boards were displayed as thumbnails at the bottom of the SAP Lumira user interface. To create an SAP Lumira story board, you use the Compose features. The Compose functional area of SAP Lumira allows you to select a layout, drag views onto the story board layout sections, and add filters, text boxes and images. There is an option to immediately preview your work - switching between authoring and viewing during the development process.
When you are happy with your story board, you can create a new one to add to the story or you can share it.
After a story board is built with SAP Lumira v1.13, there are a few different ways to share it. You can export the file for another SAP Lumira desktop user to import, or you can publish the dataset to SAP HANA, SAP BusinessObjects Explorer, SAP StreamWork or SAP Lumira Cloud. You can also email the visualization as a Portable Network Graphics (.png) image. The biggest gap here is NO export to Excel - the #1 request for almost all BI projects and dashboard tools! The other huge gap was rendering the story board when published to SAP Lumira Cloud.
In my review, I chose to publish to the SAP Lumira Cloud. Since I had a larger data set, it did take a few minutes to transfer both the SAP Lumira views and the dataset created in the SAP Lumira desktop to the SAP Lumira Cloud. When the upload finished, I saw both my views file and my dataset in the My Items list. I immediately wanted to see how my dashboard looked in a web browser, but I could not find any way to see it up there!?! After searching help, Google and all the usual places, I learned that SAP Lumira views built in the desktop version DO NOT render in SAP Lumira Cloud right now, though they do render in an SAP Mobile BI app. Odd...
So what does render in SAP Lumira Cloud? Exploring a bit more, I discovered that a different authoring and exploration view of my uploaded Hadoop data set was available. This SAP Lumira Cloud authoring environment did not have as many bells and whistles as the SAP Lumira desktop version, but it did render the big data views extremely fast when I recreated them in the cloud.
Bottom line, SAP Lumira 1.13 has come a loooong way from where it was a mere few months ago thanks to the rapid, couple-week release cycles. It is a good stride forward for SAP, which historically has had the worst customer satisfaction performance in BI analyst surveys and the most difficult BI tools to implement. In my opinion, SAP Lumira is something to keep an eye on, but it most likely won't fulfill real-world dashboard, Data Discovery or analytic needs yet.
BeyeNETWORK Prescriptive Analytics Channel Series
I am happy to announce the launch of my Prescriptive Analytics Channel on BeyeNETWORK. For those of you who are unfamiliar with prescriptive analytics, it is a decision science area that provides the best options for a given situation based on the concepts of optimization. Prescriptive analytics lies at the far end of the analytics maturity spectrum that starts with descriptive analytics, progresses to diagnostic and then predictive analytics, and finishes with prescriptive analytics.
Where descriptive analytics is reactive in nature and provides an understanding of what has happened in the past, both predictive and prescriptive analytics support identifying what may be best in the future across a variety of scenarios in a solution space. The types of real-world problems businesses face today are often quite complex and can be addressed through multiple possible courses of action. This is where prescriptive analytics, modeling and optimization can be used to evaluate those complex options and help select the best one. Prescriptive analytics can't solve all problems - but it can be incredibly powerful!
For more information on prescriptive analytics, skills, popular tools and use cases, please tune into my new article series on BeyeNETWORK.
Fun Tips for Tableau Maps
Here are a few fun tips for visualizing data with Tableau maps. Tableau maps are more powerful and deeper than they initially appear - you can do some amazing geospatial visualizations with them, including awesome animated mapping visualizations over time periods. Unlike just about every other aspect of Tableau that I simply adore, I do confess to battling a bit with maps. If it weren't for Richard Leeke's example workbooks on radius mapping and other shared workbooks on Tableau Public, I would have been totally stumped on how to accomplish the advanced radius mapping visualizations.
(Note: Speaking of sharing workbooks, one more Tableau community site was launched this week that will be a good resource to reverse engineer Tableau workbooks. It is called the Tableau Workbook Library.)
First Tip, Street Level Mapping
Tableau has a mapping page dedicated to explaining how their maps work and what data is used, along with some good tutorials on mapping. To visualize locations - for example, competitor locations - at street level on a map, you need the street address latitude and longitude. There are several free and for-fee geocoding services available. Here are a couple of them:
Address Geocoding for Street level Latitude/Longitude Mapping
Tableau has a live example of street-level mapping in the Retail industry section of the Tableau website with a fully functional workbook that you can download, use and/or change with your own data sets. Several free online mapping tutorials, step-by-step examples and a Tableau Knowledgebase article are available on this topic.
Second Tip, Radius Mapping
To add radius filtering to your maps, there are several examples and a Knowledgebase article that are a must-read. Radius circles are an advanced topic since they use complex earth geometry calculations in calculated fields to compute and render the circles. I would highly recommend taking an existing, fully functional radius circle example and loading your data into it to avoid having to define the geometry calculations yourself. Here are my favorite “How To” articles on this topic that also include Tableau workbook samples. Richard Leeke is a brilliant guru on the Tableau radius mapping topic; he has shared several complex examples in his posts that I have found invaluable.
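To give a feel for the earth geometry involved, here is a minimal sketch of a great-circle distance calculated field using the spherical law of cosines. The field names [Point Lat], [Point Lon], [Center Lat] and [Center Lon] are hypothetical placeholders, and 3959 is the earth's radius in miles; comparing this distance to a radius parameter is the essence of the radius filter:
// Approximate distance in miles between the circle center and each mapped point
3959 * ACOS(SIN(RADIANS([Center Lat])) * SIN(RADIANS([Point Lat]))
+ COS(RADIANS([Center Lat])) * COS(RADIANS([Point Lat]))
* COS(RADIANS([Point Lon]) - RADIANS([Center Lon])))
The shared workbooks referenced above wrap this same math in more robust calculations, so reusing one of them is still the faster path.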
If you want to get really savvy and create drive-time radius geocodes, Alteryx offers advanced geocoding that is easy for business users to do themselves. Alteryx also has a rich array of related data sources such as MOSAIC, Dun & Bradstreet, Experian, and TomTom. I showed an example of drive-time radius mapping in an earlier blog, Self-Service ETL Tool Options.
Third Tip, Weather and Data Mapping Overlays
The WMS map layering in Tableau just might be one of my favorite mapping features. I like to see whether weather affects results when analyzing data. To add a weather layer, you can use a free or paid WMS service. Once you find a WMS service, you can add it to Tableau's WMS Servers list: go to Maps > Background Maps > WMS Servers, choose Add, enter the URL of the WMS service and click OK.
Now go to Maps > Background Maps and select the WMS service to display the weather. Additional WMS map layering information is available in the Tableau online docs.
Fourth Tip, Using Google Maps in Tableau
Tableau can also integrate with Google Maps; there is a Tableau Knowledgebase article on this topic. I usually integrate Google Maps via an Action.
On the Tableau dashboard, choose Dashboard > Actions > Add Action. Select URL and paste in a Google Maps URL, e.g. http://maps.google.com/maps?f=q&source=s_q&hl=en&authuser=0&q=Miami . Highlight the city name Miami and select the City attribute in your data set to dynamically display city-level mapping. You can use this same approach for mapping other locational data. Aside from maps, actions can also be used to integrate other web applications - for example, to display a specific Salesforce.com record or to embed a Microsoft Reporting Services report.
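After that substitution, the resulting URL action template might look like the sketch below - Tableau swaps the <City> field placeholder for the value of the selected mark at run time (City is assumed to be the field name in your data set):
http://maps.google.com/maps?q=<City>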
There are many other Tableau mapping visualization techniques that I have not touched on here for things like filled maps, background image maps, Polygon-Shaded Maps with ArcGIS Shapefiles, logistics routing/optimization and hurricane paths. If you still need more than what is available in the base Tableau map offering, several Tableau partners offer enhanced mapping solutions. Many of these and other great data visualizations are highlighted on my Inspirations web page.
Bottom line on Tableau maps, there is more to them than meets the eye. You can get exceptionally sophisticated with Tableau mapping to add location context within your Tableau analytic dashboards.
Power Query = the Shining Star of Power BI
I have written a few times this year about Data Explorer, now called Power Query for Excel 2010 and 2013, in the blogs New Data Explorer for Excel and Self-Service ETL Tool Options. I honestly feel that Power Query is the shining star of Power BI. Literally anyone who uses Excel and pulls in data for quick analysis - or even for visualization in other Data Discovery applications - can enjoy Power Query.
In the first Beta Preview versions, I was impressed with simple things like XML imports and the pivot/unpivot in the data load/preparation steps. I continue to be really pleased with all the investments that I am seeing in simple, practical yet incredibly helpful data querying and manipulation capabilities with many different types of structured and unstructured data sources. If you have not tried Power Query yet or kept up with all the enhancements this year, you are missing out on a good thing. You can download it and learn more about it on the Office web site.
With all the extremely frequent BI vendor updates, I simply can't keep up reporting on all of them in long blogs, but I will try to share quick bits to keep my loyal and ever-growing reader base informed of what I feel is significant. I was just about ready to share a blog on SAP Lumira v1.13 when, a mere two weeks later, v1.14 was on its way - couple-week release cycles from SAP - WOW!!! I have had Spotfire 6.0 to report on for over a month now (my homework is done and I am almost ready), more awesome Tableau material is in development, and so on. Then the November updates to Power Query arrived this week - simply too good to sit in my blog backlog. It appears that Power Query and Power BI, just like SAP Lumira, are on a rapid release cadence; from what I have seen, it looks like a monthly cycle. I hinted years ago in my SharePoint 2010 talks that faster Microsoft BI releases were coming and that the future would look like the Azure world. Now faster releases are indeed a reality. Is it sassy to say that I told you so years later? I will have to do a webinar and video on Power Query soon. If you love data, you will love Power Query.
Here is a quick summary of highlighted Power Query updates:
- Sharing of Queries (LOVE THIS FEATURE - a little like SSRS Shared Data Sets)
- Certify shared queries (great for self-service data governance)
- Add users to the Data Steward Role (to certify shared queries)
- View and manage queries in an Excel workbook
- Improved search experience
- Expanded number of available datasets including Dun & Bradstreet (D&B)
- New Search Tools and Cancel Search
- Merge multiple queries
- Import data using Native Database Query
- Connect to Windows Azure Table Storage sources
- Directly select a table and load it into your workbook
- Import multiple items from the same source in a single shot
- Specify the Load Settings for each of your queries upfront
- Use the Query Editor ribbon to perform reshape operations
- Added support for 20 additional languages in this release (37 in total)
- Store credentials for data sources associated with a data management gateway
- Restore a gateway on another computer for fail over scenarios
For more information, please review the What's new for Power BI Excel Help topic.
Predixion Enterprise Insight 3.1
Recently I had the honor of walking through Predixion Insight 3.1 and getting a glimpse of the upcoming 3.2 directly with Jamie MacLennan, Co-Founder and CTO at Predixion Software. I have been a fan of Jamie's for almost 10 years now. He was formerly the development manager of SQL Server Analysis Services and the development lead of the SQL Server Data Mining platform.
In previous blogs I have written about SQL Server Data Mining in What-If Analytic Simulation Options, Predictive Analytics with Tableau and Practical Predictive Analytics. I also have a myriad of Predictive decks on SlideShare that cover SQL Server Data Mining solutions such as Predictive Analytics with Excel, and Predictive Analytics with SQL Server. In most of these materials, I mention Predixion but I never go deep in sharing why I mention them so this blog post is long overdue.
(Note: If you are not familiar with the base SQL Server Analysis Services Data Mining features, please see my SlideShare decks on this topic: Predictive Analytics with Excel, and Predictive Analytics with SQL Server. This blog article assumes an existing understanding of that offering.)
Historically, when I recommended Predixion, it was only when a customer wanted more than could be accomplished out of the box with the base SQL Server Analysis Services Data Mining features. Microsoft also slowed down investments in that offering a few years ago, leaving some features like PMML support behind and out of date. After reviewing this latest version of Predixion Insight 3.1, I will be suggesting it much more often for Microsoft-centric accounts. Predixion Insight is far more feature-rich than the base offering that I have been showing in my web casts, SQL Saturday sessions and blogs. Here is a peek at what the latest version of Predixion Insight 3.1 brings to the table.
Predixion Insight is a supplement to the base SQL Server Analysis Services Data Mining features. However, starting with version 3.0, Predixion Insight also works with R and some Mahout via a plug-in that allows various machine learning libraries to be used. That is a really significant, warmly welcomed enhancement that I will dig deeper into a little later on. Predixion Insight also has an Excel add-in, a Server and an optional Cloud offering. The user-friendly Excel add-in adds two new tabs to Excel: INSIGHT NOW and INSIGHT ANALYTICS. The INSIGHT NOW tab contains the enhancements to the base Table Analysis Tools. Things like Analyze Key Influencers, Detect Categories and Market Basket analysis reside here but have been improved upon. For example, the Analyze Key Influencer reports are more detailed and the report presentation is nicer.
Most of the great, robust features data miners and analysts will use are located in the INSIGHT ANALYTICS tab. Here you will immediately notice the sheer breadth and depth of enhancements over the base Microsoft offering:
- The ability to use PowerPivot as a data source
- A wonderful data profiling feature that I could see using on non-predictive projects as well as predictive ones
- Easier use of external data sources and an option for in-database model scoring
- Better sampling, discretization and labeling capabilities, including a new predictive analytical expression (PAX) function that provides better binning
- Added features for normalizing the data or adding calculated fields with a statistical function library
- A link to the Predixion Marketplace where you can share or get already-developed predictive models to fast-track your project
- A link to the Predixion Server or Cloud to centralize, share and collaborate on models
- A feature-rich Insight Workbench for developing predictive models
The list is simply too long to do justice in one blog. Refer to the online documentation to get a much better idea of all the goodies Predixion Insight 3.1 offers.
One of my favorite features - one that I longed for in the base Microsoft offering but never had a nice work-around for - is the instant base statistical information for predictive model variables. I simply love the way Jamie and his team have implemented this feature to show the variable distributions, core stats and correlation relationships with the other variables in the predictive model. It is fantastic, and it is obvious to me that they know what information predictive modelers need and when they need it in the predictive life-cycle work flow!
Other enhancements that I really appreciated include the improved model performance and exploration capabilities. These allowed me to test my demo predictive model - built on the classic Bike Buyer data mining sample data set - with various variable combinations, a form of live predictive what-if analysis, to see how changes would impact the model prediction and the related prediction score.
One of the biggest new features is the Machine Learning Semantic Model (MLSM), which allows data scientists and predictive modelers to change their work flow from creating predictive models to creating predictive applications. MLSM packages all of the necessary data transformations, predictive modeling logic, sampling, validation and model selection generated while creating a predictive solution into a reusable application. That is a big deal for automating and speeding up predictive model development. It also empowers predictive model sharing and collaboration amongst a team.
Revisiting the Predixion plug-ins for R and Mahout, I wanted to better understand how they work, since I saw what looked like the classic Analysis Services data mining models with the GUID names I was used to seeing in the past. I was told not to be fooled - it is possible that no model resides on Analysis Services at all, so do not try viewing, exploring or building Prediction Queries (DMX) against the under-the-covers Predixion Analysis Services data sources. After training a predictive model and finding patterns, regardless of library format (Analysis Services, R, etc.), Predixion extracts and stores the needed information in a Predixion-specific format. In that process, adjustments may be made to improve the model's predictive accuracy. Bottom line: the raw models created in Analysis Services or R by Predixion cannot and should not be executed directly.
The only area that I thought was more difficult than the base Microsoft offering - or could use some improvement for us mere mortals - was prediction queries and embedding them into applications. In the base Microsoft offering, I have been able to easily create and embed dynamic Prediction Queries (DMX) in applications and reports using just an Analysis Services connection and a DMX script (like a SQL script) with variables, allowing real-time predictions. That is a powerful concept for making predictive models useful and actionable. I have used the Prediction Queries (DMX) solution for check fraud, healthcare payments and other use cases. The equivalent Predixion capability for real-time prediction scoring/queries was a bit more difficult. I pinged Jamie and his team to see if I was missing something. They came back with a few ways to achieve similar capability.
1) Predixion’s API can be invoked with a row of data (singleton), a batch of data or a pointer to an external data source (e.g. a database table or a Hadoop hdfs:// store). From an application, the prediction query is invoked pretty similarly to Analysis Services DMX - an ADO.NET-like connection, a query with parameters, and results. The query is an XML representation of the request. From a modeling tool such as Excel, the query is presented as a Visual Macro in a tabular format that can easily be copied into SSIS ETL packages or other applications.
You can execute a singleton query in "real-time" mode from Excel by means of the VBA API, or from a .NET application. A VBA API real-time predictive call looks like this:
Dim pred As New PredixionVBA.Prediction
' Specify the target MLSM and Model
pred.Application = "Bike Buyer Demo Application"
pred.Model = "BikeBuyer_Classification"
' Add singleton inputs
pred.Inputs.AddField "Age", 35
pred.Inputs.AddField "Gender", "M"
' Specify the desired output
pred.Outputs.Add ("PredictProbability([Purchased Bike], 1)")
' Execute the prediction and collect the result
Set result = pred.Execute
All the scoring is done in Predixion code, without calling into whatever machine learning library was used in training. Because the MLSM transformation code plus the model scoring belongs entirely to Predixion, it can be encapsulated and moved out of Predixion for embedding into applications.
2) Alternatively, the MLSM can be downloaded from the Predixion server and brought into a process running .NET or Java for true real-time scoring without the price of network latency - another enhancement over base Analysis Services DMX, where that is simply not possible. Scoring code can also run inside certain databases, such as SQL Server (via SQLCLR), Greenplum (Java UDFs) or Hadoop (Java); the result is in-database scoring with no latency.
If a developer wants to change the execution of a query from server-side to local, only a few lines of code need to be added:
using (IDbCommand icmd = cn.CreateCommand())
{
    PredixionCommand cmd = icmd as PredixionCommand;
    // Download the MLSM and execute it locally instead of on the server
    cmd.ScoringExecutionMode = PredixionCommand.ScoringQueryExecutionMode.LocalExecution;
    // Check the cached MLSM for updates every 10 seconds
    cmd.CachedExecutionPlanExpiration = new TimeSpan(0, 0, 10);
    cmd.CommandText = query;
}
Easy querying and embedding is an important capability, as we will see more and more predictive functionality embedded into business processes and reporting. Is it easier than before? Not really - I still don't think so - but it is not much more difficult, and I do like the added packaging and in-database scoring options for removing network latency.
Some other features for embedding predictive analytics into data, reporting or business processes with Predixion include an API for VBA, components for SSIS ETL packages, updated PMML support for exchanging models (versions 2.0 through 4.0 of the PMML standard) and ODBC connectivity that allows querying of Predixion job results. Last but not least, Predixion is compatible with the latest Microsoft technologies including Office 365, Excel 2013, Windows 8 and SQL Server 2012.
Stay tuned for additional articles on this topic here and on SQL Server Pro Magazine. I will also make a video for my YouTube channel and hold a web cast on this excellent Excel add-in and predictive platform offering soon. In the meantime, if you want to dive in and check it out yourself, download a trial and do the walk-throughs in the Predixion online documentation.
A Quick Peek at the Free MicroStrategy Analytics Desktop
I openly admit to being a business intelligence and analytics tool addict. When I come across news about a new and interesting offering, I simply can't help myself... I usually end up downloading, playing with it and learning all about it. This year it feels like I have played with a zillion business intelligence vendor releases. I do have a backlog of blogs on a few: SAP Lumira 1.13, Microsoft Power BI, Tableau 8.1, Spotfire 6.0, Predixion, Solver, Pentaho, Jaspersoft, Datameer, Karmasphere, Splunk Hunk, Rapid Miner, R, IBM BigSheets and so on. I don't blog about them all but I often tweet about them and include highlights of tools in various SlideShare decks.
I recently explored the new free MicroStrategy Analytics Desktop (formerly Visual Insight) that I saw mentioned in the social Twitter-sphere. MicroStrategy is one of the market leaders in the professional business intelligence space, with a broad and unified set of business intelligence capabilities. They are also innovative, with a long history of being first or early to market with fairly nice offerings. In 2012 they ranked very high in the Gartner Magic Quadrant, but this past year they dropped a bit; I wrote up some notes on that in an earlier blog. MicroStrategy is expensive in both licensing fees and build times. They are a traditional BI player with long semantic layer builds and a significant learning curve, though the end results are great and their mobile BI solutions are truly best in class. They have been offering some free entry-level solutions for a while now to take share in the Cloud BI space and in some other areas. The new free MicroStrategy Analytics Desktop appears to be their response to increasing market pressure from Tableau, Excel Power BI, SAP Lumira, Spotfire and other data discovery players.
After downloading and installing Analytics Desktop, I launched the app, watched the tutorials and looked through the nice set of showcased examples and sample data - hmmmm, it looks a lot like Tableau's opening view. They did a wonderful job fast-tracking the ramp-up with this splash page. Ironically, this was close to an idea that I had back in my Microsoft days for Excel 2013 Power View, but it is challenging to get changes like that into the Microsoft Office world - especially BI-specific ideas for a product that is not only for BI. I was kindly shut down at the time, but I see that even Power BI now comes with a gallery of samples, so maybe someone listened to me back in the day. : ) Anyway, kudos to MicroStrategy. The introduction, tutorials and samples were excellent. I also downloaded the PDF user guide. My only gripe was that the loud music and child-voice narration in the tutorial videos were really annoying.
From here I went ahead and tried the data import process. It was easy. You can import data from file sources, databases, Hadoop HiveQL, and Freeform SQL, SOQL or Web Service XQuery scripts. You can also import a dashboard that another MicroStrategy Analytics Desktop user has shared with you. It does appear that you can mash up multiple data sources, with limited automatic relationship detection based on column names. When you import data into MicroStrategy Analytics Desktop, the data is saved as a MicroStrategy in-memory OLAP cube. You can define a data column as an attribute or metric, change the data type or rename data columns in a Data Preview panel. You can also assign geospatial roles or shape keys to your data for Map or Image Layout visualizations. I did not try a stored procedure, so I do not know if those are supported in this release. I also did not see any data cleansing, filtering, aggregation, transformation or deduplication features in the data import process - the import was limited. Once data is loaded into the MicroStrategy in-memory OLAP cube, there is an option for either full or incremental data refresh.
After I loaded a test Bike Buyer data set, I went on to building a few dashboards. Again it was pretty easy, though I was not able to find Filters until I searched the user guide and re-watched the tutorial - Filters are hidden under the Show menu icon. Exploring the user interface, it is not anywhere near a Tableau-level benchmark. I did like the simple builds of basic views, the nice selection of charts, drill-down grids, the show data feature, the search feature, exports, cascading filters whose displays can be changed to save screen real estate, the ability to change threshold colors, and the ability to use a visualization as a filter with optional targeting of other visuals. I was disappointed in the "boxy" placement that quite a few BI tools still require in 2013. It would be soooo much nicer to have free-form control over where charts are placed on dashboards. I was also surprised by the use of Flash...uugh. Birst also suffers from a Flash UI. What is the future of Flash these days? HTML5 seems more future-friendly, and Microsoft has gone in that direction for Power View, migrating away from Silverlight. I was not able to find parameters, and I am not sure I adore the web-based UI when working on my desktop. There is a list of additional features, limitations and versions that you can reference here.
Speaking of charts, MicroStrategy Analytics Desktop has Grid, Area, Dual Axis and Combination charts. All of these charts can be used to filter other charts for contextual dashboard creation. I did giggle again when I saw the MicroStrategy Sample Gallery and data sets - they look exactly like Tableau's. Even the demos look similar. I do like that they included the raw data sets - that was something I was often asked to do as a BI Product Manager for groups that wanted to create their own demos but struggled to find the right data.
Once my dashboard was created, I could save it, email it, print it or export it to JPG, PDF or a MicroStrategy file. There are also options to create a new folder or export the data in the views.
All in all - for a freebie - the new MicroStrategy Analytics Desktop is decent. Will data visualization freebies like this one, Power View now in Excel 2013, open source Pentaho, Jaspersoft and others put some pricing pressure on Tableau? Gee, I don't know. I do know that Tableau is still best in class and superior in deep data discovery feature sets right now, which continues to justify the premium price tag. The market for basic data discovery and data visualization, though, is getting crowded. It will be fascinating to see where this all goes, how these basic free offerings mature in a year or two, and what the market leaders add next to stay ahead of all the copy-cat vendors that are heavily investing - it sure seems like they are all fighting for the same segments of the business intelligence market.
PASS Summit 2013 News
This one will be short and sweet. If you want to watch the PASS Summit 2013 key note, it is available at the PASS Summit site.
This year very little news and no forward-looking roadmap came out of PASS Summit. Microsoft only showcased existing known solutions and some preview solutions you can already use today. The biggest bit of news was that SQL Server 2014 CTP2 is now available for public download. This CTP will be very close to the future release-to-manufacturing (RTM) experience, so if you want to get a feel for what is coming in SQL Server 2014, CTP2 is what you should download and evaluate. Other news bytes were around a nice new capability to back up databases via URL, with encryption and compression, to the Azure cloud; that option is supposed to work with all supported versions of SQL Server. Lastly, on the BI front there was no news to report except a Power BI Facebook contest. Winners of that contest will get trips to the PASS BA Conference, and runners-up will get Microsoft game consoles.
Setting Up a SharePoint 2013 BI Farm
Getting through a SharePoint 2013 BI Farm install is a complex and daunting process for even the very best Microsoft BI talent in the world. It is not easy...period. Each and every time I go through a SharePoint BI installation process, I promise myself that it will be the last one for a very, very, very long time.
Last year at this same time, I was setting up the SQL Server PASS Summit key note SharePoint BI demo machine installations and the Microsoft BI booth demo machines. I think I have set up SharePoint BI machines about 50 times in the past two years across all the various SQL Server 2008 R2 and 2012 CTPs and the Office 2013 and SharePoint 2013 releases. For the most part, I usually set up "All-in-One" machines for demos - meaning the databases reside on the same machine as SharePoint to avoid the Kerberos Delegation configuration that double-hops require. "All-in-One" machines take a day or two max to set up and are not too rough. However, in the real world, a distributed SharePoint BI Farm is typically what needs to be installed... and that, my friends, is another story, another level of complexity and a bit larger project.
If you are considering a new SharePoint 2013 BI Farm installation, first read Kay Unkroth's incredible white paper to understand SharePoint BI security. It is a variation of Carl Rabeler's infamous earlier paper on a related topic and several SQL CAT articles on Microsoft BI with Kerberos Delegation. Then watch Rob Kerr's video series - a must-see!!! Finally, start reviewing the TechNet docs SharePoint 2013/SQL Server 2012 SP1 BI Installation Overview and Process and Supported Combinations of SharePoint and Reporting Services Components (TIP! For SharePoint 2013, you can ONLY use the SQL Server 2012 SP1 version of the Reporting Services add-in for SharePoint), take notes on your specific environment, and collaborate on planning the SharePoint 2013 BI Farm installation with both your IT and database operations folks.
With each SQL Server, Office or SharePoint release, the SharePoint BI install process changes and/or new steps are added. There is also a myriad of little BI bugs that sneak in, usually with neglected PerformancePoint. SharePoint CUs also tend to break BI features - beware, and be extra diligent when choosing to apply a SharePoint CU to your BI Farm. In the SharePoint 2007 days, BI was a very hacky install. In the SharePoint 2010 days, with the introduction of PowerPivot, it was still hacky: web.config file tweaks were common, you had to run and rerun installs, install SSRS MSIs, ADOMD drivers and on and on... it got more complex. In 2013, a few parts of the PowerPivot and SSRS BI installs have improved, but the growing list of little things to install for BI now makes this the longest installation process of any SharePoint and SQL Server release thus far. Allocate at least a week or two for up to a four-machine SharePoint BI Farm build. I have heard of some SharePoint 2013 BI Farm installations taking much, much longer. It really depends on your environment.
In preparing for this past SharePoint 2013 BI Farm installation with the latest SQL Server 2012 SP1 release and SQL Server CU4+ for DAXMD Power View, I noticed that there were no instructions for a multi-server farm installation on TechNet - only single machine installs. When I reached out to my former peers at Microsoft, they informed me that there had been some staffing/budget reductions and they were no longer able to do the level of documentation that they used to do for BI. Since this group and my BI peers over there have always been kind to me, I promised that I would share the SharePoint 2013 BI Farm steps in a blog. They know it is tricky, and we have all been on calls together over clients' broken installs before... if you get it wrong, there might not be an easy repair. So don't take it lightly.
Minimum Steps for Setting Up a SharePoint 2013 BI Farm with SQL Server 2012 SP1 CU4+
Note that this is NOT the official blessed install process - one does not seem to exist. However, treat these steps as a running-start outline to get you prepared. Also keep in mind that each high-level step listed here has related TechNet articles containing many other little steps.
TIP! Create detailed documents as you go through each step with your chosen token passwords, account configurations and other options to help you troubleshoot or upgrade later on.
1. DO YOUR HOMEWORK! Minimally read:
- Kay Unkroth's SharePoint BI Security White Paper
- Supported Combinations of SharePoint and Reporting Services Components
- SharePoint 2013/SQL Server 2012 SP1 BI Installation Overview and Process
2. Plan SharePoint 2013 BI Farm Server Topology and Roles
3. Plan Service Accounts for SharePoint, various services and data sources
4. Take care of Install Pre-requisites
- Check that the hardware or VMs are capable of hosting SharePoint 2013 (the specs have increased) and the various SQL Server 2012 SP1 BI features you want installed
- Log the software license installation keys to use; note that SharePoint Server 2013 and SQL Server Enterprise Edition are required for BI features
- Make sure Kerberos is supported within the organization
- Installer accounts must be a Local Administrator to run installs
- SharePoint Farm machines must be joined to a Domain/Active Directory
- Must have one or more domain user accounts to provision the BI services. Domain accounts are required by the managed accounts feature in SharePoint. The Database Engine can be provisioned using a virtual account, but all other services should run as a domain user. Domain user accounts are needed for the following BI related services: Reporting Services, Analysis Services, Excel Services, Secure Store Services and PowerPivot System Service.
- SharePoint Farm machines should have an Internet connection for various installation steps; you can turn that off later
- Choose the URLs that you will use to hit the SharePoint site and have those configured in DNS
Set Up the Base SharePoint Farm
5. Install and Configure Base SharePoint
6. Create the SharePoint 2013 Farm App Servers and WFEs
7. Register the SharePoint 2013 Service Accounts
8. Configure the SharePoint 2013 Search Service
9. Create the SharePoint 2013 Site Collection
TIP! Use BI Center as Root Site Collection for the best BI experience.
10. Add additional Base SharePoint Services, Apps or Add-Ins
Set Up BI on the Base SharePoint Farm
11. Configure and Start Secure Store Services
12. Configure and Start Excel Services
- Set up Unattended Service Account: this is what Excel uses to connect to data sources to auto-refresh data, the account needs database permissions too for published data sources.
- Optionally Enable EffectiveUserName and follow configuration steps
- Set up at least one Trusted Data Source Library
- TIP! If you run into errors, it is often related to Loopback in Registry settings
13. Configure and Start Claims to Windows Token Service
14. Configure PerformancePoint Services in SharePoint 2013
- Manually install SQLSERVER2008_ASADOMD10.msi (for PerformancePoint to connect to SSAS cubes) on all servers in the SharePoint Farm and do an IISRESET
15. Install and Configure PowerPivot for SharePoint 2013 and the PowerPivot Add-In
- Note that this has changed a bit with SQL Server 2012 SP1, though no one really talked much about it. There is a new stand-alone SSAS server option for higher scale; to learn about that, check out this article.
- Don't forget to change the Maximum File Upload Size
16. Install and Configure SSRS Integrated Mode for SharePoint 2013
- You need to download and install the SSRS Add-In 2012 SP1 and ADOMD10.msi on all WFEs in the farm and issue another IISRESET
- Optionally set up SSRS Alerts and Subscriptions in SharePoint and SQL Server
- Add SSRS Content Types to SharePoint document libraries
17. Install SQL Server 2012 CU4+ for Power View on Multidimensional Models (DAXMD)
TIP! Bradley Schacht has a nice blog on this step.
18. Configure Kerberos Constrained Delegation
TIP! This is usually the step that bites you. READ THE DOCS OR YOU WILL FAIL. Watch Rob's videos. Register entries in Active Directory for SharePoint, the various services and data sources. Read the latest docs - this has changed. You also have to make sure the appropriate service accounts have permissions on the data sources used within SharePoint. You will have to do this for every data source used with SharePoint that lives on a machine outside the SharePoint Farm - basically all real-world data sources.
19. Test, test, test and test again
20. Brand the SharePoint BI Farm, master pages, page layouts and theme
21. Celebrate that the SharePoint BI Farm install pain is finally over
22. Get back to enjoying the fun data and BI work!!!
I hope this helps other folks out there that have been looking for TechNet documentation that simply does not exist. Oh and if you need a consultant to come in to do a SharePoint 2013 BI Farm install for you, I'd be happy to refer you to someone else, anyone else but me. : )
UPDATE: I did hear from a former Microsoft peer today that there are instructions for setting up a SharePoint 2013 BI Farm - if you deploy it as IaaS on the Azure cloud! Here is an 80+ page white paper that currently exists only for the SharePoint 2013 BI Farm on Azure IaaS: Deploy SQL Server Business Intelligence in Windows Azure Virtual Machines.
Pivotstream Evolves to All Up Microsoft Cloud BI
Earlier this past summer, after the Office 365 Power BI announcements, I chatted with the folks at Pivotstream about how that news might impact their successful, fairly unique PowerPivot hosting business. While at Microsoft, I would refer high-profile, externally hosted PowerPivot projects to them because I knew they were the world's experts in that specific niche. Externally hosted Microsoft BI projects can be very challenging due to authentication complexities. There is an excellent white paper by Kay Unkroth on Microsoft BI Authentication and Identity Delegation that dives deep into those topics if you ever want to really understand how it all works.
I had also heard that Rob Collie, a well-known former Microsoft peer, had left to start his own Excel training business. Rob's sometimes sassy blog was the main marketing and communication vehicle for Pivotstream, and since Rob left it had been pretty quiet over there. Honestly, I was a little concerned about what the future would hold for them.
We had some amazing conversations about where they had been in the past, market perceptions, their web site messaging and where they might go in the future in light of both the Power BI announcements and existing Azure IaaS for Microsoft BI. During those talks, I learned what Pivotstream really offers and was surprised by what I heard.
Actually, Pivotstream hosts a lot more than PowerPivot, and they do a lot more than just hosting. They offer easy-button, all-up Microsoft Cloud BI professional-level solutions for the various flavors of Analysis Services, Reporting Services and SharePoint BI that can be fully customized for specific needs. Pivotstream also has high-touch support services. These are things you will not find in Office 365 Power BI-land today, where no back-end infrastructure customizations can be made. There are not a lot of easy-button, all-up SharePoint BI/Microsoft Cloud BI professional-level hosting options out there today except maybe CloudShare... and CloudShare is a hosted development VM solution, not a production solution.
For my tech-savvy readers: yes, you could probably set up your own Pivotstream-type solution today on your own infrastructure or on hosted servers like Rackspace, AWS or Azure IaaS. But that would be time-consuming to set up, you would have to maintain it, and you would not get the added extras. Pivotstream has created a library of customizations and add-on features for Microsoft BI Cloud hosted scenarios that I have not seen done anywhere else. They got pretty excited when I told them I was impressed and asked for permission to use my comments on their new web site.
Seriously, I don't know of anyone else that has instant Microsoft Cloud BI "professional" level solutions - sign up on a web form and get all the pro-quality Microsoft BI without the pain and suffering of setting up a SharePoint BI environment. I wanted that super badly while I was at Microsoft and could not find it anywhere; the hosted Microsoft BI sales demos and virtual labs were as close as I could get. Office 365 had some BI with Excel and Power View. Power BI adds Power Query, Power Map, Q&A and a few other things, but for self-service with Excel - not traditional Analysis Services, Reporting Services and PerformancePoint. If you know of someone that does have it, please let me know.
My one Pivotstream gripe thus far... when I asked how much this fantastic solution costs, I got the classic "it depends". Unlike the easy-button setup, there does not appear to be easy-button pricing. In sharing all the good, I need to be balanced and add the fair warning that I truly have no clue what Pivotstream costs. Costs must be weighed into technical solution design decisions to buy versus build it yourself.
All in all, if you haven't taken a look at the new Pivotstream, it may be time for a peek. They have significantly evolved from the PowerPivot hosting group that I used to know, and today they fill a unique void in the hosted Microsoft Cloud BI professional solution space.
Getting Started with Tableau 8.1 Beta & R - Part 1
Last weekend, in the middle of the night again, I finally got a chance to play with the new Tableau 8.1 Beta and its R integration via RServe. After a Diet Mountain Dew to wake me up and some tasty Betty Crocker vanilla frosting to ease my frustration with RServe on a Windows laptop (more on that to come), I did get functional with this cool combined solution and WOW - Tableau with R is a total blast! Here is what I did to get it up and running. I am going to walk through the install, connecting Tableau with RServe, the classic Hello World example, and one example of calling an R function with a parameter from Tableau to visualize the results in Tableau.
Getting R to work with Tableau 8.1 should not be difficult; the key issue for me was ensuring my firewall and ports were configured to allow communications to RServe. Before you install RServe, you need to have base R installed as a prerequisite. You can download base R from http://www.r-project.org. Note that an R GUI is not required for using R with Tableau. However, I am finding that RStudio helps with testing/debugging my R scripts before putting them into Tableau.
Once you have base R installed, you can go ahead and install RServe. RServe is a TCP/IP server that allows other programs like Tableau or a web app to use R without the need to initialize R or link to the R library. RServe supports remote connections, authentication and file transfer. The RServe download contains code samples for popular languages such as C/C++ and Java; it even includes nice RServe web app code examples for PHP. In addition to Tableau 8.1+, SAP Predictive Analysis, Oracle and many other well-known enterprise applications use this same RServe approach to integrate R computation of statistical models and visualizations into their applications today.
Read the instructions and FAQs first to fully understand what you are installing and the little tidbits that may bite you on a Windows platform - RServe is not 64-bit, you need to copy a few files to the 32-bit R bin folder instead of following the typical R library install routine, you need to double-click RServe.exe to start the process or use a Windows command line, etc. I thought JSoftware and Sudo had some nice reference material on this topic too. Do note the many warnings that RServe is not ideal on Windows - it shines on Unix, Linux and other platforms, and on Windows there are connection limitations and many other annoyances. However, if you simply want to learn how to use this combined solution like I am doing, Windows should be OK. Serious RServe implementations would not be rolled out on Windows... hmmm, maybe that's why Microsoft has not fully embraced R yet, I don't know.
For my RServe install on a 64-bit Windows 8 laptop, I needed to open the RServe default port 6311 and allow communications to the RServe app in Windows Firewall. I also had to copy a few RServe files into the 32-bit R program folders for it to run.
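If you prefer to install and start RServe from within an R console rather than double-clicking RServe.exe, a minimal sketch looks like this (these are the standard Rserve package calls from CRAN; on Windows, remember to run them from 32-bit R):
# One-time install of the Rserve package from CRAN
install.packages("Rserve")
# Load the package and start the server on the default port 6311
library(Rserve)
Rserve()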
To ensure it was running, I looked in Windows Task Manager for the RServe.exe process. I also used a Telnet trick to connect to it: telnet localhost 6311. With a functional RServe.exe process ready to go, it is now time to connect Tableau Desktop to this RServe and send it some R functions!
Connecting Tableau to RServe
Here is the super easy part that used to be much more challenging prior to Tableau 8.1 (see the end of this blog). To connect Tableau with RServe, simply navigate to Help > Manage R Connection and enter your RServe URL and port; the defaults are populated automatically. Note that RServe is NOT a data source, which is why it does not appear in the list of data sources - it is a server application, so the connection info lives under Help. To ensure your connection works, click Test Connection. If you get a success message, continue on. Otherwise, you most likely need to check your firewall settings.
Hello World and Parameterized R calls from Tableau
Here the real fun with R begins! To use R with Tableau, you are going to create a Calculated Field with the R function call and possibly use Tableau Parameters to store results. The R Calculated Field will be treated in a similar manner as Tableau Table Calculations. In Tableau 8.1 there are several new Calculated Field functions related to R: SCRIPT_STR, SCRIPT_REAL, SCRIPT_BOOL, and SCRIPT_INT. These new functions include examples in the help descriptions of how to format your R calls properly. I find it is easiest to first test and run your R function in RStudio or another R tool, then copy the functioning R code snippet into the Tableau Calculated Field window. In classic programming 101, Hello World is always the first example taught, so here is the Hello World example in R with Tableau.
You preface the R function with SCRIPT_STR since the R result you will get back from RServe is a string data type in this example. Then you enter the R function and the parameter where you want to store the returned value ('hello<-"Hello world!"',ATTR([R Result])). With that created, you can now test it and visualize your exciting R Result in Tableau just as you would any other Calculated Field.
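Pieced together, the complete Hello World Calculated Field looks like this:
// returns the string "Hello world!" from RServe
SCRIPT_STR('hello<-"Hello world!"', ATTR([R Result]))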
That was nifty and certainly helps start putting this puzzle together. In all reality, you will most likely be passing parameters from Tableau Calculated Field R functions and getting numeric values back or other really amazing things like data mining results that we will cover in future blogs. (By the way if you went to TCC13 last month, Bora Beran had a great session on more advanced R statistics, what if analysis and R data mining with Tableau that is a must see for R enthusiasts. It was called "Making the most of your expanded analysis toolbox – Stats 2".)
The next beginner Tableau with R example involves passing a Tableau field as a parameter to a simple Tableau Calculated Field R function and displaying the result. What I like about this example is you can easily get fancier and pass in a Tableau parameter as an R parameter (.arg argument in R world) for what if analysis.
Here you will use SCRIPT_INT since you will be getting a number back from RServe. In this sample you are passing the R function a parameter that is a Tableau field called [CONSTANT_VAR]. This does not have to be a constant value - I am using a constant for testing ease. You could also use an aggregate like SUM([Sales]) where I am using [CONSTANT_VAR]=10. When I visualize my new RServeTest2 field, I see that R is indeed computing the value 20 and Tableau is displaying that result.
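As a sketch of what the RServeTest2 Calculated Field might contain - assuming the R expression simply doubles the value passed in as .arg1, which is my guess based on the 10-in/20-out behavior above - it would look something like this:
// .arg1 is the first Tableau expression passed to the R script
SCRIPT_INT('.arg1 * 2', ATTR([CONSTANT_VAR]))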
There you have it. Easy breezy visualization of R functions with Tableau using an RServe connection...I totally love it. The new Tableau 8.1 R Integration features are A LOT EASIER than the approaches in my previous blogs and decks on this combined solution topic, Visualizing R Models in Tableau and How To Predictive in Tableau. That wraps up Getting Started with Tableau 8.1 Beta & R - Part 1. In the next part of this series, we will showcase visualizing much more exciting R data mining routines in Tableau.
BI News from Oracle Open World and Qlik QTUnsummit
In the continued annual fall barrage of conferences, this week Oracle Open World and the QlikTech QTUnsummit have some big BI news to share. Although these are not heavily implemented solutions for my little biz, I do like to keep a pulse on them and play with their offerings from time to time. I also know my customers may have these solutions in their BI portfolios. Many companies have a lot of BI tools in the mix. One customer shared that they had 19 different BI tools - variety is the real-world BI reality.
Oracle BI News
Typically my Oracle projects involve existing customer Oracle data sources such as the relational database or Hyperion Essbase. I do like Oracle Data Miner but I don't see much of it. I have been keeping a pulse on Oracle Exalytics, Endeca and OBIEE for a while. Since Oracle now has preconfigured BI VMs, I have been meaning to revisit OBIEE again soon - especially to test combining it with Tableau, Microsoft and other Data Discovery solutions.
This week at Oracle Open World, one of the key announcements was around the Oracle 12c In-Memory Database and the M6 Big Memory Machine with 32TB of DRAM. You can watch the keynote highlights if you want to hear their pitch. SAP HANA and SQL Server 2014 Project Hekaton both have similar In-Memory Database offerings. Oracle's added in-memory column-store for Oracle databases is across all platforms, not just Exadata. Like SAP Hana and SQL Server 2014, Oracle's column-store capability will sit side-by-side with the existing on-disk row-store. Sure seems like SAP, Microsoft and Oracle are all singing the same tune to me. I do know all these database vendors are seeing market share pressure from Hadoop, MongoDB, MySQL, Google Big Query, Amazon RedShift and other cloud and open source technologies that have skyrocketed the past few years. I certainly won't forget the "partnership" announcement earlier this year between Microsoft and Oracle - it was nothing short of totally shocking if you know the history between these two companies.
Oracle also had a lot to say about Cloud, like every other major vendor does. Oracle introduced 10 Cloud services including some BI in the Cloud for interactive dashboards on the web and mobile devices. They set the stage with the trends driving IT - Internet-connected devices and machine logging. I heard that a lot at Microsoft with Azure too. We saw Internet device and machine logging begin to emerge as a trend when Splunk growth exploded. This will only continue to grow with all the major vendors wanting a piece of that logging and event analytics big data.
On the BI specific front, a few other newsworthy items include:
- Exalytics T5-8 hype started...expect to hear much more of it
- Oracle Planning and Budgeting Service Cloud with GA in 2013
- Endeca will be integrated into the E-Business Suite and OBIEE
- Oracle Business Analytics in the Cloud is planned for 2014
- Oracle Mobile BI apps were showcased heavily and they are HOT
- EPM and Planning mobile apps are a work-in-progress
Qlik BI News
Next on to the Qlik QTUnsummit. Here I must add a huge disclaimer that I am not a fan of Qlik. I am usually rough on them because I don't care for their intense, misleading sales tactics, scripted development experience, QVDs and so on. I do know that people out there like the Qlik solution, so I will try my very best to buffer my comments and stay open minded as a vendor-neutral implementer.
I will say that just the fact that Qlik is overhauling their entire experience and platform in vNext is an admission of needing to significantly improve their offering. I have not seen Qlik vNext yet but it will be interesting to see the new version and test the differences in user and developer experiences.
Qlik is heavily spinning vNext as "Natural Analytics", playing up being outdoors in nature in all their analyst interviews and marketing imagery. The Natural Analytics story sounds similar to the "Analytic Journey" story Tableau has been telling and winning customers over with for a few years. Tableau truly does have a beautiful analytic journey today that is intuitive and deep - the best user experience of all the Data Discovery tools (yes, I try them all, or almost all of them) - and application design is what differentiates Tableau from all the look-alikes. The challenge for Qlik is not to become another look-alike that lacks great user experience. Simply marketing the same story with a different label will not fool savvy BI buyers.
Now reading analyst blogs, I see that Qlik is saying their flavor of Natural Analytics is based on persona mapping - designing Qlik for different mindsets and skill levels: novice, explorer, achiever and collaborator. The concept is to align product capabilities and the user interface to fit the needs of those personas. This is a little unique, although I have seen similar tiered designs with Microsoft Excel PowerPivot, PowerPivot Advanced and the pro-version SSAS Tabular mode, so it is not entirely new. Again it will be interesting to see what this looks like when released.
Other interesting tidbits from QTUnsummit include:
- Tweets about smart vizzes. I assume this means Qlik added predictive features into views to compete with the likes of Spotfire, SAP Lumira with R and the recent news of Tableau 8.1 R integration. It could also mean guided analytics, or something else "smart"...time will tell.
- Tweets about an HTML5 mobile-first, design once, deploy anywhere UI with automated data adaptation.
- Tweets about Story Telling. Now Tableau, SAP and Qlik are all pitching this Story Telling, a nice way of saving your specific views with some text context for presentations.
- The early adopter release of Qlik vNext will be rolled out to a small group of customers in 2013. A quite different "strategic phased approach" for rolling out vNext to other customers includes cutting support for Qlik v11 after 3 years. This kind of bold move by a BI vendor typically invites customers to shop around for other BI solutions if the platform migration requires rework and heavy migration investment. All the major BI vendors that have been through platform overhauls like this usually offer investment funding to offset customer rebuild spend. When a BI platform overhaul happens, the other BI vendors jump on the window of opportunity to start calling into those accounts with hopes to win them back.
The short 3-year support time frame for Qlik v11 is a surprise. See another blog here on that shock factor. While with Microsoft in 2012, when I talked about sunsetting products or features I was sharing that news with a 10-year support cycle. I can't even fathom what those conversations would be like with a short 3-year support cycle - those chats are rarely pleasant even with 10 years of continued support. We do live in a world of rapid and continuous cloud release cycles. Qlik may be one of the first BI vendors to start a new trend of reduced support cycles.
- The "feature release cycle choices" are also new. Qlik customers will now have to pick from two feature release choices - either get three feature updates a year or only one per year.
- Qlik is forming a Customer Success Framework to help with this transition, presenting it as a warm fuzzy best practices thing. In reality, that is a "save market share during transition" thing. Much like with other BI vendors going through major platform transitions to the cloud or otherwise, the final Qlik destination may be worth the waiting and the wondering about what it will take to get there. Usually that process is unnerving even for the most loyal customers. The other vendors will pounce on that FUD to take market share.
- Cloud??? What I was expecting to hear, but did not, was Qlik Cloud. Qlik is now the only Data Discovery vendor that does NOT have a real Cloud offering - only an IaaS-type offering in the Cloud today. I guess after vNext is rolled out, maybe then they will pitch a real Cloud BI offering. We shall see.
SAP buys KXEN to Further Predictive Analytics
The red hot BI and Predictive Analytics market is packed with top vendors on big time shopping sprees. Somehow I missed this bit of news at the time it was released. Running a business is a non-stop, 24x7 effort - I tend to be catching up on industry highlights now each Saturday. This morning scanning through my alert mail and news I saw the SAP + KXEN announcement.
Although I omitted KXEN in my earlier Practical Predictive Analytics blog - I did cover what SAP offered with regards to predictive in their offerings. KXEN is a GREAT buy for SAP. KXEN has a state-of-the-art, partially automated predictive model building solution through InfiniteInsight® Modeler. KXEN claims through automation to build accurate models in hours, not the historical weeks-or-months type work effort. Their solution covers a broad array of data mining functions including classification, regression, attribute predictive importance, segmentation/clustering, forecasting and association rules. In addition, KXEN has strong in-database predictive model scoring with the added ability to apply SAS and SPSS models. That is a HOT feature! Soooo even though SAP already had Predictive Analytic Library (PAL) functions and some R integration in SAP HANA, plus their SAP Predictive Intelligence tool that supplements SAP Lumira (formerly Visual Intelligence) - the addition of KXEN is a fantastic buy that may get them to a better place in their annual BI customer satisfaction surveys.
According to the press release, the market for predictive analytics software is estimated to be worth US$2 billion today and is expected to exceed US$3 billion in 2017. It is finally time for predictive analytics to take center stage and be brought to the mainstream. I have been doing predictive analytic projects for 10 years now but they are almost always a rare gem find or something that I have suggested and had to first prove out to gain executive sponsor buy-in. Usually predictive is an afterthought since so many companies still struggle with basic data needs today: cleansing data, getting data into a useful format for historical reporting and so on. The past two years I have seen more predictive inquiries than I saw in all of the last 10 years combined. Part of this is due to Cloud Computing, Big Data and various flavors of Hadoop adoption. Other reasons for the predictive analytic boom include, but are not limited to: analytic maturity in organizations; younger talent entering the workforce bringing advanced analytic skills learned in updated school curriculums; increased global competition in a connected world with no or low barriers to entry into many markets; and the total market fallout of 2008. I am ecstatic for the predictive analytics community because I have witnessed the power of predictive at work. It can make a huge positive impact when done right.
So back to SAP + KXEN and what this acquisition may bring. In the article, SAP mentions that KXEN is intended to complement existing advanced analytics from SAP. They will be adding it to the SAP Predictive Intelligence workbench tool that snaps into SAP Lumira. SAP also intends to apply KXEN algorithms to core SAP business processes within their industry-focused applications for managing operations, customer relationships, energy, supply chains, risk and fraud. Embedding predictive analytics into core applications and business processes is a trend I expect to continue to see from ALL the software and SaaS application vendors. I am already doing some of this work today and see it growing steadily across all industries. Last but not least, of course SAP will integrate KXEN with SAP HANA. SAP is all about SAP HANA - you can't go to an SAP event or talk to an SAP rep and not hear that H A N A word said a million times. At SAPPHIRE and SAP TechEd there were pools on how many times it would be said in the event keynote. We are smart people and tired of hearing one word repeated incessantly. We get it.
Personally I am excited to see and play with the new SAP + KXEN predictive features when they are made available. I am not sure when we will start seeing this but I imagine we will learn more about it at the upcoming SAP TechEd in late October.
TIBCO (Spotfire) Acquires Extended Results and PUSH BI
Breaking news today from a great little company called Extended Results, creator of the PUSH BI mobile applications many Microsoft BI customers used to bridge the mobile BI gaps. TIBCO, the maker of Spotfire, acquired them and their technology. The press release is here. While this is great news for my BI friends Patrick Husting and Bryan Colyer, I imagine this news may cool their prior exceptionally cozy relationship with Microsoft BI now that they are living under a competing firm's umbrella. I look forward to hearing more news as their story unfolds in the BI news network channels. (Information Age Article, Information Management Article)
For those of you that don't know Extended Results, they are simply amazing. They just "get BI" and they are "brilliant marketers". Patrick Husting is a good Midwest guy from Minnesota - I adored him and his team right away, often referring them into accounts to help my customers prior to starting my business. They have nice solutions that totally sizzle. I look forward to seeing what they will cook up next in the TIBCO Spotfire world. Congrats!
Tableau TCC 2013, What's New in 8.1+ and Zen Master
What an incredible week of data lover industry and community events - Annual Tableau Customer Conference (TCC), SAP BusinessObjects User Conference, Tampa Analytics Group with Pentaho by Mark Kromer, South Florida BI with Chris Webb and SQL Saturday Orlando. I am EXHAUSTED and finally getting a chance to update the blog on a Saturday night again. Here is the latest and greatest Tableau scoop from TCC on Tableau v8.1 (Fall 2013), 8.2 (Early 2014) and a glimpse of 9.0 (TBD in 2014).
TCC 2013 was in Washington DC this year. The DATA crowd has grown to ~4,000 from over 40 countries. I saw quite a few of my peers there from the Big 4 consulting firms, niche BI implementer firms and Microsoft Partner community. I also noticed a few Microsoft BI Program Managers and other competing BI vendors checking out TCC - Tableau is now "on their radar". The big name BI industry analysts such as Rita Sallam, Claudia Imhoff, Cindi Howson, Wayne Eckerson, Andrew Brust, Steve Dine and other well-known names in the BI and analytics world were there. TCC was THE place to be and I was not going to miss it this year.
The TCC 13 keynote was quite good and walked through upcoming features in 8.1, 8.2 and some 9.0. Here is a quick summary of what I saw and heard. A full summary of keynote highlight features is also on the Tableau blog.
The opening theme was about Discovery. Christian Chabot talked about the pains of working with slow, cumbersome Traditional BI that crushes innovation and prevents on-the-fly thinking and data discovery impact. He compared Traditional BI to trying to write with a pencil taped to a large brick. Pictures of the Traditional BI process and change requests were shown on the screen. Anyone who has been through Traditional BI could totally relate to this message.
Tableau is revolutionary. The highly intuitive dashboard design experience is best in class. It is also shockingly easy and fast to set up a Tableau Server - literally a 15 to 30 minute process for some Tableau Server installs versus multi-week or multi-month engagements in the Traditional BI world that I have grown up in all these years. To be fair, Traditional BI vendors do have a few more bells and whistles in the broader BI spectrum across master data, data quality services, ETL, operational reporting, predictive and alerting, where Tableau is a focused BI dashboard and data discovery solution. However, just looking at development of a BI semantic layer and dashboards, Tableau is exceptionally better than all the other players. When I first learned Tableau Server, I kept thinking that I must be missing something - where is the catch, this can't possibly scale out, etc. The reality is Tableau Server is elegant - it is easy, fast to get rolling and it does scale out. Heck, it even upgrades in-place easily. After experiencing life with Tableau, you never want to go back to the nightmare install, upgrade and development experiences of SharePoint BI, Microstrategy, IBM Cognos, SAP Business Objects or other Traditional BI players. Sorry to be so blunt about that pain but it is the cold, hard truth. Traditional BI vendors take note: Cloud may be a nice future "easy-button", plug and play alternative, but on-premise is not going away soon - improve your offerings or continue to lose market share.
From there my favorite part of the TCC key note - Tableau Engineering Devs on Stage! This is where the developers that are building the products, not product marketing, CXO, or a conference presenter/vision sales demo guy, get on the stage to show & tell what they have been working on. It is refreshing - I don't know of another BI vendor that uses this approach. Tableau Engineering Devs opened with improvements to importing data into Tableau. What was shown looked a little like Data Explorer a.k.a. Power Query but less feature rich. They showed visual joins, some workflow and data prep/cleansing features.
Next up were statistics-related features: box plots, easier ranking with percentiles, two-pass totals, prediction bands and my #1 TOP FAVORITE new feature, R integration. YAY! You could work with R and Tableau before - I have a deck and blog on it - but it was not ideal. Tableau chose to implement an RServe hook approach similar to the one SAP uses with SAP Hana R and SAP Predictive Intelligence.
Then on to some other nice features like dashboard item transparency - imagine the creative applications that feature alone will allow. There were some basic things like copying and pasting dashboard items between workbook sheets, folders to categorize dimension attributes and measures in the shared semantic layer, a calendar control and quick filter formatting. Mobile and web authoring also showcased great new features such as better branding color palettes, movable filters, resizable windows, improved personalization and dashboard authoring in the Android app.
A feature set that the crowd seemed to really like, but I didn't quite understand or value, was called story telling. I won't be able to do this one much justice, but it looked to me like a new navigation control with variables from the worksheet data. Some audience members were tweeting bye-bye PowerPoint... nah, I don't agree with that, but I do see value-add if Tableau gets better PowerPoint integration or slickery presentation features for delivering analytic results to an audience.
Some of the BIGGEST high impact improvements for Tableau enterprise deployments were ones that most of this audience would not fully appreciate. Serious enterprise BI features such as SAML for Single Sign-On, IPv6, external load balancing support, automatic gateway fail over, host names versus IP addresses in configurations for easy machine swaps and finally 64-bit for both Tableau Desktop and Tableau Server were announced. The enterprise Tableau feature news was just huuuuge. Way to go!
One of the last announcements mentioned in the TCC 2013 keynote was Tableau for the Mac - it got a standing ovation from the crowd. As a PC gal myself that can't imagine life without my Lenovo W530 with Windows 8, I didn't really care about Mac but this audience sure did. Tableau for Mac is expected in v9.0.
To wrap up news from TCC 2013, I confess to keeping a secret for a few weeks. My secret = I was chosen as a Tableau Zen Master. It is an incredible honor and total SHOCK. I would not have guessed it in a million years. A Tableau Zen Master is like a Microsoft MVP but there are far fewer - fewer than 15 in the whole world. I was dying to tell people but I was sworn to secrecy. I am deeply grateful for this honor. I even had a wonderful, heart touching conversation with Christian Chabot. Zen Master was awarded right before Nate Silver's TCC keynote. It is hard to see me but I am wearing my favorite color "red", standing in the middle of the pack on the stage in the photo above. With this award comes responsibility too - I promise to do my best to keep the Tableau community abreast of news, tips and tricks, keep the engineers on their toes and the sales team honest, and also continue sharing my knowledge just as I have in the past year.
That concludes Tableau TCC 2013 updates. I will share other Pentaho and SAP news one of these days after getting some seriously overdue sleep.
Power View HTML5 Mobile Preview for Office 365
The Power BI and Mobile App Preview is beginning to expand. Honestly, I have not had one spare moment yet to spin up my Power BI Preview environment. I do hope to give it a try this weekend. In the meantime, I have been watching the cross-vendor BI world blogs and tweets closely as I always do. I saw that Dan English, one of my long-time MSBI favorite gurus, tweeted the link to the NEW Power View HTML5 Mobile Preview notes. It is a good read and a link to keep in your Favorites to see how it continues to evolve.
The Silverlight to HTML5 migration of Power View is a seriously huuuuuge deal for the Power View team led by Ariel Netz to get out the door. Almost every single MSBI conversation that I have had over the past two years - especially after showing cool Power View moving bubbles or Maps demos - also had the iPad, mobile and Silverlight question come up. I am genuinely happy for Ariel and his team to achieve this major milestone. I can't wait to play with it now.
Other big wins for Ariel and team would be taking this to on-premise MSBI and adding Power View parameters for common embedding, dashboard, and ISV scenarios. They do mention ISV scenarios in this article but they do not share a timeline nor mention on-premise. It is only for Cloud Office 365. I do look forward to seeing and hearing what the others that have beat me to this Cloud Power BI Preview are thinking and experiencing as they dig into it.
I would bet that the Power View team probably needs more real-world feedback on various mobile device experiences - what works and what does not. These are the devices supported in the Preview right now. Mobile device testing was an obsession area that I used to do a lot of research in while I was at Microsoft. Many people don't realize just how many bazillion mobile devices there are around the world. At TechEd 2012, I had someone show me an SSRS report on a "Dolphin" phone and to my surprise it worked...even though it was not officially supported. Last month I shared in the Power BI Mobile App blog that Power View maps and scatter plots will not render in the initial version.
So a shout out to all of you that wanted Power View to be HTML5: step up, start testing it on your favorite mobile devices and provide the Power View team feedback when you get your Cloud Power BI Preview invitation.
Tip! How To Display SSAS Member Properties in Tableau
I have been meaning to share this tip for a long time and stumbled upon it again while backing up files to my off-site server this past week. A while ago I was pinged by a large soft drink manufacturer's BI group that was using Analysis Services cubes with Tableau. Within their cubes, they used Member Properties for attribute characteristics and metadata that did not aggregate, such as packaging specs. If you are unfamiliar with the concept of Analysis Services Member Properties and want to learn about it, please review User-Defined Member Properties and Creating and Using Property Values. Another resource that is easier to follow than the TechNet Online docs on this specific topic is Sorna's blog.
This BI group wanted to be able to see and use their Custom Member Properties data along with the Analysis Services cube data in Tableau dashboards, but they could not see them nor could they figure out how to get them. Turns out that getting those Member Properties was a little tricky since they are not available in the default Analysis Services connection data. After some experimentation, I was able to get them using Tableau Custom SQL with Open Query...one of my favorite Tableau go-to tricks.
The following example uses the Adventure Works cube so you can try this yourself. I created two connections to the Analysis Services cube - the usual one for the Analysis Services cube data and one more using Custom SQL with Open Query to query the Analysis Services cube Member Properties. The Analysis Services Member Properties Custom SQL with Open Query syntax is shown below:
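(The packaged workbook linked at the end of this tip contains the exact query I used. As an illustrative sketch only - the linked server name SSAS_LINKED is made up here, and Color and Size are just two well-known Member Properties on the Adventure Works Product attribute - the general pattern exposes Member Properties as calculated measures through OPENQUERY like this:)
-- SSAS_LINKED is a hypothetical SQL Server linked server pointing at the Analysis Services instance
SELECT * FROM OPENQUERY(SSAS_LINKED,
'WITH
MEMBER [Measures].[Color] AS [Product].[Product].CurrentMember.Properties("Color")
MEMBER [Measures].[Size] AS [Product].[Product].CurrentMember.Properties("Size")
SELECT {[Measures].[Color],[Measures].[Size]} ON COLUMNS,
[Product].[Product].[Product].Members ON ROWS
FROM [Adventure Works]')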
Then I created two Tableau worksheets. One for the Analysis Services cube report that contained a Tool Tip to display an Action to get the selected Member Properties. The other worksheet was used to display the selected Member Properties within the dashboard.
On the dashboard that contained these two Tableau worksheets, I used a Filter Action. I wired the Filter Action to source the Analysis Services cube report worksheet, target the Member Properties worksheet and mapped the Target Filter Source Field to the Analysis Services cube Product dimension [Product].[Product].[Product] level where the Member Properties are defined.
The final solution is shown in the very first image of this blog. In the final solution, the Tableau dashboard user can now see Analysis Services Custom Member Properties for any Product that they select by clicking on the Tool Tip Action. It is not ideal but it works. Ironically, it is a similar (not exact) approach to querying Analysis Services Member Properties within Reporting Services reports.
For further reference, improvements or to simply reverse engineer what I did, you can download the Tableau packaged workbook. Note that you will need to update the Analysis Services and SQL Server data source connections to point to your own instance of Adventure Works.
Hope that Tableau tip is useful for other Analysis Services fans using Tableau.
Couple Quick SQL Server World Updates
August has been a busy blog month so I am going to keep this one short and sweet. There are a couple key updates in SQL Server World floating around right now. The first is related to PASS, another comes from Marco Russo on rallying for a BI MCM and the last comes from SQLCAT.
A PASS announcement on July 26, 2013 by Thomas Larock explains additional new leadership and the vision for PASS over the next two years. This was followed by a SQL Connector article on August 21, 2013 titled the PASS Mission Statement: Updated for a Growing Community. PASS is expanding into new international markets and new data-related markets. We have already seen PASS touching new audiences including the Business Analyst (Excel User) with the PASS BA Conference. PASS, like the SQL Server product line, is also expanding into new areas of data such as big data and cloud.
With PASS BA and these two recent announcements, there seems to be confusion about what this means for PASS within the wider SQL Server community. I heard some misinformation myself last week, prompting me to ask around and write this blog. If confused, read the two PASS vision and mission articles linked above and you will see that the foundational SQL Server Core is still there in the PASS we know and love. PASS is expanding like many other successful offerings do - i.e. Cool Ranch Doritos, Spicy BBQ Doritos, Doritos Mini Bites in the Doritos product line. Expansion is natural in technical innovation product and services strategy. Here is an Ansoff Matrix published by Harvard Business Review and a McKinsey strategy guru. Imagine the PASS SQL Server audience as that Core layer and the recent PASS changes as the Adjacent audience expansions to continue growth. I don't have inner insight into the PASS strategy, but from the outside, if I compare what I see happening to generic winning growth strategies, the recent expansion while also optimizing the Core is a great move.
The next bit of news comes from Marco Russo and is a Call to Action for Microsoft BI pros. Please write an email to email@example.com and tell them you would be interested in Microsoft Certified Architect (MCA) and Microsoft Certified Solutions Master (MCSM) for Business Intelligence certifications, just like the SQL Server Database pro path. Make your voice heard! We know it can work since we recently saw Microsoft answer customer pleas to add an Excel 2013 SKU with PowerPivot after the infamous Rob Collie rally earlier this year. (Also per Rob Collie, Excel 2013 Standalone will *not* successfully install PowerPivot or Power View until the September 10, 2013 auto update http://ow.ly/ojlaS.)
The other SQL Server news comes from SQLCAT on August 21, 2013. The popular SQLCAT.com website is shutting down and the content on it is merging into MSDN to minimize reader confusion and to streamline content publication. That move makes a lot of sense and I sure hope they use a domain pointer/redirect when it migrates so people that miss this news in all the daily noise can easily find content they bookmarked and are used to getting at SQLCAT.com. SQLCAT also added Azure Cloud coverage into their scope and with that change now has two Twitter handles: @SQLCAT and @WinAzureCAT. So update your favorites, bookmarks, feeds and Twitter to continue getting the deep technical content that the SQLCAT team delivers.
...and PowerBI Preview is now available for the first wave of groups that subscribed to be notified.
SAP BI and Analytics Roadmap Updates
In a colorful #AllAccessAnalytics webinar and Twitter event, Steven Lucas, President of SAP Platform Solutions, and his newly announced BI, Analytics and Big Data team openly briefed attendees on new organization players, philosophy changes and the upcoming SAP BI and Analytics Roadmap updates. Here are some highlights from the session and what you will be seeing soon from SAP.
SAP opened by stating a bold new approach, boasting that they have 60,000 SAP Business Objects customers, and shared that the free SAP Lumira Personal offering was evidence of a new change in thinking. They joked a little about Tableau but also confessed to understanding that they have a lot of work to do for their customers. As the session progressed, a variation of the classic Gartner BI Lifecycle concept was shown using overlapping circles: Enterprise BI, Agile Visualization, and Advanced Analytics (Data Science for Everyone). SAP talked about a new mindset of easily adopted LAVA (Lightweight Applied Visual Analytics) solutions, business focus, and story telling. Ironically, earlier this summer at the European Tableau Customer Conference, story telling was also shared as an investment theme. I imagine other BI vendors will also jump on this "trend train" and be singing the same tune in 2014.
Anyone who has implemented SAP Business Objects, SAP BW Business Warehouse or other SAP products knows that SAP BI solutions are a far cry from being lightweight. Crystal Reports and Xcelsius are exceptions. Of all the BI vendors, SAP probably has the most time consuming, complex and clunky BI platform roll outs with very slooooow initial solution times to value. Although SAP Business Objects pros are passionate and treat it like a BI religion, SAP BI consistently ranks the lowest of all BI vendors in annual Gartner BI customer satisfaction surveys. The latest 2013 Gartner BI customer satisfaction survey is available in the Alteryx newsletter. SAP has a lot of room for improvement and it appears that they know it. Could this be the beginning of a serious SAP BI change? Time will tell...
The SAP BI pros in the session are used to the old way of doing things and expressed typical BI pro control fears of overlapping BI tools, product sunsets and self-service BI tool chaos. These concerns are not at all unique to SAP - I have heard and addressed the same exact concerns expressed with Microsoft PowerPivot, Tableau, Spotfire and other BI player tools. The entire BI industry is evolving right now. In reality these new Data Discovery/Agile BI tools ***if properly implemented, governed and managed*** are strategic game changers with the exceptionally rapid solution time to value that businesses love. BI pros need to embrace these tools, learn how to best manage them and focus on developing consumable, business-usable semantic data layers.
SAP did confess to having more tools and SKUs than any other BI vendor. I agree their offerings are overwhelming and they do overlap. Microsoft suffers from the same issues, offers white papers on when to use each BI tool, and now uses Excel as a single wrapper around many stand-alone add-in tools. Other traditional vendors have the same bazillion-tools-in-the-BI-bag-of-tricks issues and yet all of them have now added a Data Discovery/Agile BI tool into the mix. Typical product sunset fears were addressed - SAP Lumira is not replacing other tools. Each tool has a place and, like the other traditional BI vendors, SAP will probably have to provide guidance to customers on when to use each tool, which tool is best for certain use cases and so on. Of course SAP would love it if only SAP tools were used in BI strategies but realistically there will be a mix of BI tools for many reasons. Having a variety of BI tools in a BYORT (Bring Your Own Reporting Tool) world does not have to be a bad thing but it does need to be carefully thought through and properly managed.
The session progressed to a slide on SAP's vision for Agile Visualization. (see image above) The theme was BI on any device, anywhere...SAP Lumira Desktop ties into on-premise SAP Hana and/or SAP Lumira Cloud. SAP stated SAP Lumira Cloud is NOT just for Lumira content but that other BI content can also be hosted on it. SAP Lumira Cloud is available for anyone to test and try out.
From there the self-service ETL features of SAP Lumira were shown along with upcoming improvements that will be available soon. Among the new features was an HTML5 user interface with more than one view - the prior one-view-at-a-time limit was a show stopper for me, so I am awaiting this update to my SAP Lumira Pro version to see if it is a viable player yet or not. They also showed a very nice storyboarding and mobile device layout publishing feature that I am looking forward to testing. Lastly they showed geospatial views that sure looked a lot like Alteryx and Microsoft Power Map (Geoflow) solutions.
From there SAP Predictive Intelligence was mentioned. Right now SAP Predictive Intelligence is an add-in to SAP Lumira with different licensing and purchase requirements. I did not catch if it was now included or not. Personally I really, really like this add-in. There are several base predictive algorithms out-of-the-box with ties into SAP HANA and a plug-in to R that opens up a world of predictive possibility.
The traditional SAP BI tools were almost totally neglected on this call with the exception of Webi. SAP had one slide on upcoming improvements for working with Webi and SAP Hana. Now that does not mean nothing is happening with those tools - they simply were not the focus of the call, claims SAP. At SAP TechEd the traditional SAP BI tool roadmaps will be shared with the usual anything-can-change disclaimers that all major software vendors toss in.
My favorite announcement was the new FREE SAP BI Academy and BI Resource site that will be launched on September 9, 2013. I think that is fantastic and long overdue. I already took the SAP Hana Academy courses and thought they were fabulous for technical professionals to jump start learning new technology. The SAP Academy courses also had related AWS and hosted VM images to play hands-on with the technologies being taught.
At the end of the session, SAP was grilled for a bit. No question was off limits and some rude ones were asked questioning their commitment to the new direction, another new team and speed to market. Steven Lucas directly expressed his personal commitment to this direction and SAP's desire to be a serious player in the wider Analytics ecosystem. He also shared that more SAP BI and Analytics announcements will be made at SAP TechEd this fall. In the meantime, several new SAP BI product general availability releases are expected later this month.
Myth: Tableau can't use Stored Procedures. Reality: Yes it can!
There seems to be a myth out there that Tableau can't work with SQL Server stored procedures. That may have been true in Tableau v6, but it is not true in Tableau v7 or v8. I have done it. I do admit that the stored procedure solution is not the simplest one around. At the upcoming Tableau Customer Conference, I will ask about this along with assumed predictive improvements that I am hoping are shown at the much anticipated Day 1 TCC Keynote. The bottom line right now = you can indeed use SQL Server stored procedures with and without parameters.
The example Tableau workbook above calls a stored procedure [Federal].[dbo].[uspSQLSprocDemo] with two parameters, @State='FL' and @Program='Water'. Although I hard coded them in my quick sample above, starting in Tableau v8 you can use Tableau parameters in a Custom SQL statement to make the stored procedure call truly dynamic.
Here is a quick explanation of how I did it.
1. Enabled 'Ad Hoc Distributed Queries', a SQL Server server-level configuration setting.
exec sp_configure 'show advanced options', 1;
RECONFIGURE;
exec sp_configure 'Ad Hoc Distributed Queries', 1;
RECONFIGURE;  -- RECONFIGURE applies each sp_configure change
2. Added a Linked Server with a login to the existing SQL Server instance. Note I used localhost and my account since I have no users set up on my dev laptop. In a production environment, you can use any of the other Linked Server login account mapping options.
EXEC sp_addlinkedserver @server='localhost', @srvproduct='SQL Server';
-- create a login map to a single account for the linked server
EXEC sp_addlinkedsrvlogin @rmtsrvname = 'localhost',
@useself = 'false',
@locallogin = NULL,
@rmtuser = 'jen',
@rmtpassword = 'mysupersecretpassword';
3. I created the demo stored procedure with default parameters.
/****** Object: StoredProcedure [dbo].[uspSQLSprocDemo] ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- Author: Jen Underwood
-- Create date: 01/28/2012
-- Description: Sproc Demo
-- exec [dbo].[uspSQLSprocDemo] 'FL', 'Water'
CREATE PROCEDURE [dbo].[uspSQLSprocDemo]
-- Defaults only used when no parms passed in
@State varchar(4) = 'FL',
@Program varchar(100) = 'Water'
AS
BEGIN
SET NOCOUNT ON;
-- the SELECT body was trimmed from the original post; presumably it returns rows filtered on @State and @Program
END
4. I then created a SQL Server data connection in Tableau with the Advanced option for Custom SQL and copied in the following OPENQUERY snippet. In v8 you can use Tableau parameters to dynamically pass SQL Server stored procedure parameters within the live data connection Custom SQL statement.
SELECT * FROM OPENQUERY
(localhost,'SET FMTONLY OFF; exec [Federal].[dbo].[uspSQLSprocDemo] ''FL'', ''Water'';') X
If I did use Tableau parameters, this is roughly what it would look like. The <Parameters.State> and <Parameters.Program> placeholders below are what Tableau inserts into the Custom SQL when you use its Insert Parameter option, assuming you have created Tableau parameters named State and Program:
SELECT * FROM OPENQUERY
(localhost,'SET FMTONLY OFF; exec [Federal].[dbo].[uspSQLSprocDemo] ''<Parameters.State>'', ''<Parameters.Program>'';') X
And that is it... This was not hard to do but there is a myth that it is not possible. MYTH BUSTED. Enjoy and happy querying to all the stored procedure lovers like me.
Ah ha! Power BI Mobile App is in the Windows App Store
Checking the social media and Twitter-sphere tonight, I noticed a post on the Power BI Mobile App being showcased in the Windows 8 app store. (Tip: For my readers outside the United States, you may need to toggle your web browser locale to get this app if it is not added to your local country's Windows App Store yet.) Guess I must have been working way too hard to have missed this earlier today, but it also looks like we did not get the email updates that we were expecting from registering on the Power BI site. Regardless of the reason for the late night pleasant surprise, the Power BI Mobile BI app contains demo material that is hosted on a Power BI site somewhere for you to play with.
We are still awaiting the Public Preview availability announcements that we assume will be made any day now. You can currently test the Mobile BI app preview with your own Enterprise Office 365 E3-level account and Excel 2013 files, or review the posted demo Excel 2013 content if you have a Windows 8 device, for a "sneak peek" of this looooooooooooong awaited native Microsoft Mobile BI app. The Microsoft field sales team has shared that Power View maps and scatter plots will not render in the initial version. Over time, updates will be pushed to the Mobile BI app through the app stores. There is an excellent demo by AJ Mee posted on YouTube of how to publish your own Excel 2013 files to test this app.
iPad lovers were informed that they would have to wait a little longer and there are currently no plans for an Android app - Android fans are being told to use the browser-based option with Office 365 Power BI. If you want more information about Power BI, please check out my latest articles on SQL Server Pro magazine and also the earlier July 2013 Office 365 Power BI blog below.
If you have on-premise Microsoft BI - this Power BI mobile app WILL NOT work with it. If you want Mobile BI for on-premise Microsoft BI: Power View, SSRS, Excel Services, PerformancePoint, and so on, I have several blogs on the Microsoft Mobile BI topic, a presentation on SlideShare and a TechEd 2012 Mobile BI video that you may find super helpful...OR feel free to simply ping me. I have been openly obsessed with Mobile BI and Microsoft Mobile BI the past few years.
Data Discovery Player Market Trends
In keeping a close pulse on the overall BI industry, one of the stats I like to track is Job Trends - which skills are HOT and the increase or decrease in searches for them. You can do this at Indeed.com, an aggregator of postings across many sites. Right now in August 2013, HTML5, MongoDB and iOS are the top 3 technical job trends. Looking at the Data Discovery Players was also really interesting. I will share the Database Player results in another blog since those trends were also quite revealing in light of MongoDB, Hadoop, and other players that have entered that space - let's just say those results made the Microsoft and Oracle "partnership" make a tiny bit more sense. To do your own comparison, simply put a comma between the vendor, skill or app names.
What was fun and interesting about this specific comparison of players is the obvious market growth explosion that started in January 2010. At the time Spotfire was the HOT tool. I recall playing with it and thinking it was great. I had the pleasure of meeting some Spotfire folks that had left and started their own non-BI related company called Phreesia. I had run into Qlik only a few times in the healthcare market and was being courted as a possible partner since I was a BI Practice lead. I did not care for the Qlik developer experience, tried Tableau - which at that time felt limited - and ended up staying with Spotfire while also participating in the Microsoft PowerPivot TAP program. Each year I download and review all these tools again when they announce releases. During 2011 and 2012, I was doing a lot more with PowerPivot than the others since I joined Microsoft. I saw and felt the market changing as I talked to customers and heard from people all around the world that were buying these other vendor tools. It seemed like I heard the Tableau love everywhere I went. The Tableau user experience and analytic depth is undeniably excellent, and clearly right now they are growing the fastest even amid a mix of FREE and much lower cost tools. That proves people are willing to pay a premium for a fantastic analytic solution. SAP Visual Intelligence, now Lumira, just entered the game in late 2012. Microsoft embedded Power View into Excel in late 2012 and added many other Excel Add-Ins to try to compete more effectively in this space. IBM, Microstrategy and many other BI players are also heavily investing in better data discovery user experiences, trying to emulate current market leader Tableau. It will be fun to keep track of this to see where it goes in 2014, 2015 and so on.
The past few years have been a fascinating time in BI and truly a traditional BI game changer. There are still many heartaches around how to best roll out, manage, and deploy these strategic tools to empower the front-line business user SMEs. Many CIOs, BI Directors, and traditionally skilled BI Lead resources are still struggling with how to embrace these tools in their BI Strategies while also maintaining a valid, single version of the truth. It can be done - there are self-service/data discovery BI best practices that apply to all of these tools and to how to best combine them with traditional BI solutions. I am happy to help you if you are struggling with this yourself. Self-service/data discovery BI best practices and governance frameworks do need to be personalized to a company's specific environment - people/culture, processes and technology. If you are a traditional BI skilled resource, learn how these tools connect to data sources, define semantic models that can be shared, apply security, and how to deploy, integrate, manage, and monitor them, among other things. You will see there are many features that look a lot like the world you lived in before - they are just implemented a little differently, allowing for truly rapid, high impact results versus the slow, old school ways of doing things. At the end of the day, it is the BI total solution time to value that the business is buying and why these tools took off like wildfire!
Prediction, Optimization, Simulation and Prescriptive Analytics with the Solver offering for Excel 2013
I recently came across an impressive Excel 2013 Add-In that I recall first seeing a few years ago, for an older version of Excel, while in a Data Mining class at UCSD. This new Excel 2013 Add-In happens to be an early adopter in migrating from the old-world Excel Add-In and VBA frameworks to the modern Microsoft Windows Azure and Apps for Office framework. The solution is called Frontline Systems Solver. This Solver is advanced - it is not exactly the same thing as the basic Excel Solver that I mention at the end of this article. This advanced analytics Excel Solver offering can leverage Excel data, Excel PowerPivot data, and data stored in relational databases. It works with stand-alone Excel and also has an Excel Web App on Office 365 or SharePoint 2013. It is a very hot, must-check-it-out type solution for analytics lovers on Office 2013 or Office 365 needing a lot more than basic Excel Solver intelligence. This much better Solver App is located within the Data Visualization + BI category App listing in the Office Store. It is priced from free, to $995 for a basic offering, to $5,495 for the Analytic Solver Platform.
Some of the advanced analytics features in this advanced Solver app that caught my eye included:
- Time Series Forecasting with classical regression, exponential smoothing, and ARIMA models.
- Data Mining for Predictions with XLMiner, including regression trees, k-nearest neighbors, neural networks, naive Bayes, principal components, k-means clustering, and hierarchical clustering.
- Conventional Optimization, Stochastic Optimization, and Prescriptive Analytics with Premium Solver Pro to better allocate scarce resources or evaluate decision options in uncertain scenarios. They also have linear, nonlinear, quadratic, and mixed integer programming. To ease development of some of the complex models, a wizard can be used.
- Risk Analysis with Monte Carlo Simulation and Decision Trees with Risk Solver.
There is also an SDK for application developers to integrate these features into their applications. This embedded analytics is the type of thing that we will see in the near future for sure! The Solver Server has APIs for standard web service calls, SOAP and WSDL protocols.
For someone wanting to take analytics to the next level and needing much more than the current out-of-the-box Excel What-If analysis options - 1) Scenarios, 2) Data Tables, 3) Goal Seek, 4) Excel's basic Solver, an add-in to Excel that allows for more variables, or 5) the dated SQL Server Data Mining Add-Ins for Excel that do NOT support live data in SharePoint 2013 or Excel 2013 - Frontline Systems Solver may just be your analytics dream come true and worth a test drive.
Setting, Revisiting, and Pursuing Life Goals
Pursuing Life Goals. Two months ago I shared in the Impact Analytix Launch blog that I decided to pursue a life dream of owning my own business. Not everyone in my network reads my blog and I continue to get quite a few pings about "WHY?". Why did I do this? Why not be a senior level BI or Analytics leader for an amazing company? What on earth am I thinking and on and on? Sure there are probably a hundred easier paths that I could take than to start my own business but I would never have excelled in the BI/technology world if I was all about the easy path, right? I thrive on challenges, enjoy helping people, do work that I truly believe makes a positive impact, like to do my own thing, enjoy exploring new technologies, and have an undeniable entrepreneurial spirit that has not gone away since I wrote my Life Goals list in 1991.
So let's talk about Life Goals! I think everyone should have personal and professional Life Goals - set them early in life and continue to update them as life takes you in different directions. I set my Life Goals (see the wrinkled yellow paper above) when I was young, naive, and in college. I was in Business School, not Computer Science. At the time, I had no idea that I would fall madly in love with data and technology - a direction where hobbies, passions, and talents ended up leading me after school. Of course over time, some goals are achieved, others change, new ones are added and some are removed. Some examples of goals accomplished include graduating from college, being able to support myself, paying off my debts, buying a house, working for Microsoft, volunteering to teach kids, coaching for a cheerleading team, traveling to Italy and Mexico, going on a cruise, and learning to scuba dive. An example of a goal that I decided, after time, just didn't make sense to keep on my Life Goals list = being able to do a round-off back-handspring with a full twist. My body was never going to be able to do that after age 29 and in reality it was unimportant in the bigger picture. I also ended up not having children. My husband was in the military during most of our marriage and our lives took very different directions than most people take. I studied and worked A LOT while he was out to sea when many other people had children and busy personal lives. That is not a bad thing at all - it is just a bit of a different life path. Instead of 3 children, I had 3 dogs that I deeply cherish. Some goals that stayed on the Life Goals list each year and NEVER seemed to change include: having an unlimited supply of Diet Mountain Dew, wonderful friendships, enjoying the great outdoors, traveling to Spain, Brazil, England, Australia and Japan, buying a red convertible, and HAVING MY OWN BUSINESS.
It is no secret, but I also haven't widely shared, that my husband and I struggled with carrying two peak-bought mortgages and a steep Southern California rent payment after the 2008 housing market crash, holding us back from a couple of fantastic career opportunities and putting our Life Goals on long-term hold. We did not see that market crash coming - it changed our lives forever. I will never forget that time, and ironically I had a great job on a "keep the lights on team" helping a company in the construction industry survive the financial crisis, analyzing data and scenarios that had never, ever been planned for in their business models. I was very lucky. My husband was not as lucky since his employer went bankrupt. We both put our Life Goals on hold to ensure that we could work to pay for food and our housing debt, and make it through the long spell of rainy days, so to speak. We cut off TV and all other non-essential expenses just like many other people during that time. We also faced a variety of other challenges during those hard times but vowed to do the right things and we would get through it. We ultimately made the choice to leave a good career role to live in one of our two empty peak-bought houses to stay afloat. We ended up landing nice jobs back home. I was further blessed to work in my Microsoft dream job the last few years and then finally we were able to sell one of those empty houses at a significant loss just this past March 2013. Loss or not, it was a huuuuge financial relief for us. Selling that house opened the door for being able to explore our Life Goals again.
There are many challenges that you may encounter in your life and career - health/illness, family member death, financial stress, job loss, failure, or many other things that may put your life dreams on a long-term hold. In my case, I kept that little yellow Life Goals paper and tightly held on to those goals and dreams. I never completely gave up on them, not ever, not for a moment...and it kept me going even in the scariest of times like that Fall of 2008. "It is never too late to be what you might have been." - George Eliot
Life takes many unexpected twists and turns. I would advise all of you to reflect on your own life goals, write them down to remember them, continue pursuing them, and revisit them each and every year no matter what happens to stay true to your soul. My Life Goals list has been a wonderful compass for me to keep on going in the right direction in good times and in times of uncertainty or fear. I hope this helps clarify the why I am doing what I am doing and that it may also help you keep striving towards your own Life Goals. ; )
Like it or not, Self-Service BI and Cloud BI are Here
The BI market is buzzing, with two significant BI players announcing Cloud BI offerings for their popular Self-Service BI solutions in the last month: Microsoft and now Tableau. These two join veteran Cloud BI players such as GoodData, Birst, and MicroStrategy, as well as other vendors like Pivotstream, Spotfire, SAP, IBM, Jaspersoft, and so on in this crowded Cloud BI space. There are many players entering the game due to lower barriers to market entry now that visualization and cloud hosting are mainstream. New players seem to pop up almost monthly these days - at least in the big data BI and analytics space that is closely related to Cloud BI. Big data BI also has some overlaps with Cloud BI: players like Splunk, Datameer, Karmasphere, Hadapt, and Platfora also offer cloud BI and reporting. Now, not all vendors are equal here at all - this is still an immature market. Some players are pure SaaS, some only IaaS, and some offer both. Some have strange pricing and will limit you by the number of rows in your data, some can only use your data if it is in the cloud, some can connect to on-premise data but don't have the capability to use your company's authentication options, and so on. Pricing for the various Cloud BI Self-Service BI offerings varies widely today, from free to over $1000 per month per user. Many vendors offer free private trials and also very low cost and "freemium" versions that typically require published views to be public facing.
Keep in mind that according to the last Gartner Magic Quadrant for BI, there was a key point about Cloud BI buried within the Spotfire section, where Gartner cited that "67% of the survey population at large will never put their enterprise BI in the cloud". According to the MarketsandMarkets Cloud Based Business Analytics Market Global Forecast & Analysis (2013 - 2018) study, "currently more than 13% of companies have cloud based business analytics and BI solutions. This figure is expected to reach more than 50% by 2018, at a Compound Annual Growth Rate (CAGR) of 25.8% for the period of 2013 to 2018. The overall cloud based business analytics market had an overall size of $4.3 billion in 2012. It is expected to grow to $16.5 billion by 2018". The flurry of BI vendors pitching cloud to us is either going after a mere 33% of the market and/or pushing this solution onto the on-premise market to grab market share as quickly as possible. Regardless of where or how they are getting customers, and regardless of any lack of customer-driven demand for Cloud, make no mistake that BOTH Self-Service BI and Cloud BI are here to stay. A good article on this topic, Embrace Self-Service BI, was written a while ago. There are many, many more of these articles, white papers, and studies from top BI and IT industry analysts.
Now there are some fantastic reasons to go to the Cloud - things like the ease of spinning up an environment versus weeks or months of implementation consulting services before you see any real business value from your BI investment. You also get much quicker enhancements to the offerings you buy, since it is easier for the vendors to deploy changes in the cloud. There is also peace of mind that someone else is protecting the systems, doing the backups, and can recover the systems should something go wrong. Business units can now bypass IT in some cases and truly enjoy a robust, fully functional BI system without IT involvement. Of course, that last point usually freaks out most DBAs and IT people, who understandably fear for data security, governance, loss of control, and loss of insight into what is happening, and also fear for the future of their careers.
There are a few valid concerns around the Cloud BI option that you should sort through and evaluate. Some common concerns include the "true costs" and how to realistically estimate them, how to get your data and report IP back if you need to swap vendors, data movement latency, how to authorize, authenticate, and secure the data with your existing security mechanisms, laws about where the data can reside (some countries require local data centers), privacy, and user BI tool connectivity experiences. There are many others. Cloud BI can be great, but it really needs to be tested before you commit to it. Consultants (like me - of course I had to throw that in there to stay alive) can help you sort through the Cloud BI realities and vendor fact/fiction, test these solutions with pilot projects, help socialize the value in politically charged environments if it does make sense, guide you on how to best implement governance around these solutions, and design the migration path to Cloud BI or Hybrid BI. Personally, I feel Hybrid BI is going to be a very hot area of BI in the next 5 years, since most companies I know will not move all of their data into a cloud but will want to utilize the nice array of Cloud BI solutions with their on-premise data.
A few words of advice to the folks pushing back on turning on Self-Service BI and Cloud BI: learn it, understand how to govern it, deploy it properly enterprise-wide, manage it, and transform your role into enabling the business and providing a consumable data source the business can use with these tools - or you will not be employable in the future. Although according to Gartner 57% of reports were still built by IT as of 2012, the days of job security from long report queues and the business needing to go to IT to add a field to a report are going to be long gone. There is nothing more rewarding than having an empowered customer that sees you as their hero. I worked for a Hawaii company back in 2003 that recently came back to me in 2013 to ask for help empowering them again 10 years later. It is a big shift in mentality - especially for IT control freaks. I totally, totally get it. I am still a little hesitant around true cloud costs myself. The bottom line is to keep an open mind, look around you, and learn about these technologies now or be left behind, since the train is leaving the station. The sheer amount of free and low cost training resources - from a $99-a-year ACM membership to free Microsoft Virtual Academy, OpenSAP, IBM Big Data University, Dataversity, Pluralsight, and sooooo many others - is simply staggering. There really is no excuse not to keep skills/knowledge up to date - prioritize the time to keep up.
Microsoft Office 365 Cloud Power BI
Big Cloud BI news today from Microsoft's annual partner conference. The announcements regarding an upcoming preview offering for Office 365 customers, called Power BI, include:
- Mobile BI for Windows Surface Pro and iPads (YAY! YAY! YAY! Finally but when? Android?)
- Direct data source connectivity to on-premise data sources from Office 365
- HTML5 support was mentioned for Power View (a GOOD move away from the dying Silverlight platform)
- Data Explorer is now called Power Query
- GeoFlow is now called Power Maps
- PowerPivot is now officially Power Pivot with a space. PowerPivot all one word is also valid.
- Natural language Query Engine (Oracle Endeca-ish, Splunk Hunk Search Processing Language response?)
There are some interesting points about this news. One key point to note is that most if not ALL BI news since July 2012 has been coming from the Office side of the Microsoft house and NOT the traditional SQL Server side. If you want to follow Microsoft BI, take note that Office is where you should look for BI news. The SQL Server 2014 CTP 1 news and decks are further proof of where Microsoft BI lives since July 2012 = in Office. That is a huuuge hot button for me personally, but it is a good move for Microsoft overall. Aside from Excel being everywhere, Office touches over 2 BILLION people, and what better way to market your product than to have it sitting right in front of your target buyer audience. Excel was also notably cited in the Microsoft announcement blogs as being "the one and only Power BI authoring tool". Microsoft is betting on Excel to provide basic BI for everyone, removing the need for BI specialist tools. Microsoft BI = Excel.
Next key point: "Cloud First and Cloud Only" is further proven by this news. Actions speak louder than words, and offering new BI enhancements like Mobile BI for Office 365 Cloud Only is a significant action. Great news for the lucky early Office 365 cloud adopters like me.
The first version of the native Mobile BI app will be available on Windows 8 and Windows RT devices, with a native Mobile BI app for iPad coming later. For Android and other unsupported platforms, SharePoint browser-based options or third-party apps might be available. WHEN will the Mobile BI app be in the Apple Store? I am literally dying to play with it. Little businesses like mine really should look at the higher-level Office 365 E plans, where Power BI is sure to land, if we want to use this functionality - though pricing has yet to be released. I do wish that moving from an Office 365 P to an E plan was easier than the current start-over: downtime, DNS migration moves, etc. I need to inquire to see if the Office 365 Developer account will work with Power BI.
If I read this announcement right, I am disappointed that SharePoint on-premise customers that have been waiting for Mobile BI for a few years now might not get to use the Mobile BI app. If I read it wrong, please let me know and I will immediately correct this disappointment statement. What I read was that for the rest of the folks not on Office 365 (the current majority, and most of the largest enterprise accounts in the world right now), today's Cloud BI news is not so lovely. Many customers can't and won't be able to immediately move to Office 365, or even Office 2013 for that matter. A lot of people are locked into lower versions of Office because of reliance on third-party software add-ins that are not compatible. Some groups will not move to the Office 365 Cloud because of security fears, and in some cases/countries/industries there are laws against putting data in the Cloud. Many businesses need to find the extra money from somewhere and budget to pay more for Office than they ever have before with the subscription-based model. Budgeting changes take time, planning, and ROI evaluation. Those people won't be able to fully enjoy their Microsoft BI investments, get new BI enhancements, or use Mobile BI until they move to the Cloud. I guess they might be able to buy one Office 365 Power BI account to have some Mobile BI, but it would be a hacky work-around at best and it might violate licensing if that account was shared.
Seeing Microsoft say "Excel is the one and only Power BI authoring tool" raises questions around the futures of PerformancePoint and even classic SSRS. Neither PerformancePoint nor classic SSRS is in the Mobile BI app. SSRS is the most popular Microsoft BI reporting tool after Excel. I have a feeling about where PerformancePoint stands today, but I'd really like to know where classic SSRS stands. I haven't seen any real investment in it since the SQL Server 2008 R2 days (2010), or any news about classic SSRS other than its integration into SharePoint back in the SQL Server 2012 release. The BI focus has been around Excel and Excel add-ins. Once in a blue moon SSRS on Azure might be mentioned, but it is a very rare thing. When Power View can accept parameters and Office 365 Excel can be easily embedded into apps, then I can see classic SSRS taking a back seat. In the meantime, we really need classic SSRS, and we need it to be truly mobile in the real BI world, as Melissa Coates recently blogged. The current browser-based mobile experience is OK but has many gaps. The native Mobile BI app looks much nicer. For groups not on Office 365, third-party vendors are still the best option for Mobile BI. I guess you gotta start somewhere, and this is v1 of the Mobile BI app.
Direct data source connectivity from Office 365 is a BIG deal. It will be interesting to see/test if transferring data back and forth is performant - bandwidth typically varies quite a bit, especially in developing countries. It also will be interesting to see if big customers use it with the data warehouses that traditionally have been the data sources for BI, or if they will have to move their data warehouses to the cloud for it to be usable. These would be awesome areas to evaluate to see what works best for you. FUN! Let me know if you want to do a pilot project - Hybrid BI is HOT and I would love to work on a Hybrid BI project.
There was also an update posted to Data Explorer, now called Power Query. I am a big fan of this tool and think that after a few revs and an increase in general market awareness, Power Query will be quite popular. If you have not looked at it, check out my blog at New Data Explorer for Excel. It works with both Office 2010 and 2013.
I have not been a big promoter of GeoFlow - it is a neat add-in, but I could literally think of a hundred other top customer BI requests that I would have rather invested in before developing yet another stand-alone, slickery "demo-ware" Excel add-in that has a very niche and limited use case in the real BI world. I will leave it at that. ; ) Don't get me started... It is what it is, and now that silly shiny ball has a new name, Power Maps.
Another big name change: PowerPivot is now officially Power Pivot with a space (PowerPivot all one word is also still valid). That is really quite interesting to me, having been in product management. There were lots of chats about PowerPivot name changes - what gets affected, docs, books, marketing, market awareness, and so on. I am pleased that they chose to keep Power Pivot - the brand has strong value. I can't share what other names we talked about, but I will share that I was passionate about keeping the PowerPivot name.
To wrap up, Microsoft Office 365 Power BI, a.k.a. Cloud BI, is fantastic news for Microsoft Office 365 customers like me who want to enjoy some real Cloud BI and Mobile BI. Sign up for the Preview notice to be one of the first people to play with this release when it is made available.
Practical Predictive Analytics
I recently did a fun TDWI Practical Predictive Analytics presentation on how to get started with data mining and predictive analytics using a wide variety of tools including: RapidMiner, R, Predixion, Weka, Statistica, SAS, IBM, Microsoft, Oracle, Spotfire, Tableau, and Alteryx. This blog will summarize some of the presentation key points and reviews of what I liked about the various products shown.
I can't believe it, but it has actually been a little over eleven years since I first implemented a truly predictive project. I was working for a retailer in Hawaii while my husband was in the U.S. Navy supporting Operation Enduring Freedom overseas. Retailers historically are quite savvy in statistics and predictive analytics. At the time, I was working alongside a Harvard University intern who was both brilliant and inspiring. He came to me with an algorithm called Item-Based Collaborative Filtering (a Market Basket Recommender) and asked that I run it on the Oracle database servers that I was managing as part of my role back then. Intrigued by our conversations and the results of our predictive projects, I dove deeper into SQL Server data mining, statistics with SAS, and other analytic areas. I later enrolled in a two-year Data Mining Post-Graduate certificate program at the University of California, San Diego. From that point on I was hooked. Later on I implemented check fraud detection, healthcare payment estimation, customer segmentation, insurance agent classification, and a few other predictive projects. In general, predictive projects seem few and far between. I literally beg to get these projects since I totally love this work. For those of you just starting out, here is how you can get started learning this awesome science, along with some of my lessons learned. Enjoy! ...and please ping me if you ever do have one of these projects and want some extra assistance.
What is predictive analytics? It is an area of statistics focused on capturing relationships between explanatory variables and predicted variables from past occurrences and using them for prediction. Often predictive analytics involves Data Mining: automatically discovering interesting patterns in data. Predictive analytics can be applied to unknowns in the past, present, or future. The accuracy and usability of your models vary with the level of analysis and the quality of your assumptions. Companies use predictive analytics to improve decision making, and because in the world of exploding data there is waaaaaaaaay too much data and there are too many variables to manually analyze or use traditional statistical techniques effectively. Traditional analytics and statistical methods fail when faced with complex non-linear and multi-variable combinations.
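To make that concrete, here is a minimal sketch in R (my favorite of the free tools covered below) of fitting a model that relates explanatory variables to a predicted variable. It uses R's built-in mtcars sample data, so the variables are purely illustrative rather than from a real project:

    # Predict transmission type (am) from weight (wt) and horsepower (hp)
    # using logistic regression - explanatory variables in, predictions out.
    fit <- glm(am ~ wt + hp, data = mtcars, family = binomial)

    # Score new, unseen observations with the trained model.
    new_cars <- data.frame(wt = c(2.5, 3.5), hp = c(110, 200))
    predict(fit, newdata = new_cars, type = "response")  # probability of am = 1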
Predictive analytics also provides a strategic competitive advantage. Most Fortune 500 companies already are using predictive analytics today. In the future, we will see much more predictive embedded into business processes and applications even in small companies.
One way to get started with learning data mining and predictive analytics is to download free open-source software, buy a couple of books, take a course, get a mentor, and start learning hands-on with your own data. You don't need to have big data or be a Data Scientist, PhD, or Statistician to learn the basics. I do feel that some specialty courses and an understanding of foundational statistics are important. A few of my favorite books are shown below.
Another great place to get started is KD Nuggets. KD Nuggets is a "goldmine" of resources and a true data mining classic. Don't let the ugly, old, 90's web site design fool you - this site is FANTASTIC.
Once you are ready to get started, you will follow what the industry calls the CRISP-DM process. This involves choosing an appropriate business question or problem to predict, then gathering, understanding, and preparing your data for the predictive algorithms. Preparing data is time consuming and a bit of an art and a science. Predictive algorithms use "flattened" input data sets - you do not just point them at a data warehouse or data mart to get decent results. After the data is prepared in a data mining friendly format, you then load that data into the analytics tool of your choice or use "in-database" predictive algorithms. From there, identify true predictive influencers, evaluate various data mining models, and further transform and iteratively experiment with variables until you have a good predictive model to share, deploy, or integrate into reporting or application logic. When evaluating predictive models, scoring typically means squared error or percent correctly classified. To choose the best model, keep the business problem in mind and what errors, true positives, and false positives really mean. Typically, models are judged on estimation or classification Gain/Lift, ROI, or ROC curves. To deploy predictive models, you might encode the rules into application logic, use the industry standard PMML, program predictive queries for smart KPIs or forecast reports, or apply the models within ETL processes.
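To illustrate the evaluate-then-deploy steps above, here is a hedged R sketch that holds out a test partition, scores a model by percent correctly classified, and shows one PMML export option. It reuses R's built-in mtcars data, and the PMML file name is a hypothetical placeholder:

    # Partition the data so the model is judged on rows it never saw.
    set.seed(42)
    train_rows <- sample(nrow(mtcars), floor(0.7 * nrow(mtcars)))
    train <- mtcars[train_rows, ]
    test  <- mtcars[-train_rows, ]

    # Train on the training partition only.
    fit <- glm(am ~ wt + hp, data = train, family = binomial)

    # Evaluate on the held-out partition: percent correctly classified.
    probs   <- predict(fit, newdata = test, type = "response")
    classes <- ifelse(probs > 0.5, 1, 0)
    mean(classes == test$am)  # accuracy on unseen data

    # One deployment option: export the model as industry-standard PMML
    # (requires the pmml and XML packages from CRAN).
    # library(pmml); saveXML(pmml(fit), file = "my_model.pmml")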
Some of the lessons learned from real-world predictive projects include:
- Do not overlook the critical importance of properly choosing, preparing, cleansing, transforming, and sampling data to train and develop high performing models (think MONEYBALL for your business)
- Use principal component analysis or other attribute reduction techniques to reduce variables and avoid "over-fitting" predictive models (see the short R sketch after this list)
- Be sure to partner with the business process subject matter experts to make sure all relevant aspects are captured, so as not to "under-fit" or incorrectly design a model
- Choose a predictive modeling algorithm that can be effectively used and deployed within the business process, not one that just looks cool in a slide deck
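For the attribute reduction tip, a minimal R sketch of principal component analysis on R's built-in USArrests sample data shows the idea - keep the few components that explain most of the variance and drop the rest:

    # Run PCA with the variables scaled to comparable units.
    pca <- prcomp(USArrests, scale. = TRUE)

    # Proportion of variance explained by each component; the first
    # couple of components usually capture most of the signal.
    summary(pca)

    # Feed the reduced component scores into your model instead of the
    # original correlated variables to lower the over-fitting risk.
    reduced_inputs <- pca$x[, 1:2]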
For a deeper review of these concepts and more, please refer to the shared Practical Predictive Analytics presentation posted on SlideShare.
Some of the tools mentioned at the TDWI meeting included RapidMiner, R, Predixion, Weka, Statistica, SAS, IBM, Microsoft, Oracle, SAP, Spotfire, Tableau, Alteryx, and Knime. There are many more than these in the market today. Here is my quick take on these solutions.
RapidMiner: Free and one of the most popular predictive analytic tools in the world according to the KD Nuggets 2013 Poll Results. It has deep and compelling functionality across the entire life-cycle of predictive analytic model development. Initially learning and designing workflows can be cumbersome and confusing. There are some videos but the best resource I found was Matthew North's book, Data Mining for the Masses, shown above.
R and Rattle: Free and seriously one of the most popular, most used statistics and predictive platforms. Most of the major BI and database vendors have added R wrappers into their solutions to leverage the sea of available community-developed predictive algorithms. R is a must-know for anyone seriously interested in an analytics career. Typically, if enterprise-scale R is needed, the Revolution Analytics flavor of R is used due to limitations in free, open source R.
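If raw R scripting feels intimidating at first, it only takes a few lines of R to launch the Rattle GUI, a point-and-click data mining front end that sits on top of R (note that Rattle also needs the GTK+ toolkit installed the first time you run it):

    # One-time install of the Rattle package from CRAN, then launch it.
    install.packages("rattle")
    library(rattle)
    rattle()  # opens the point-and-click data mining workbench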
Predixion: Founded by former Microsoft predictive analytics program manager/engineering expert Jamie MacLennan, Predixion has a wonderful Excel Add-In and other predictive offerings. I was really sad to see Jamie leave Microsoft, but I am thrilled by the raging success of Predixion. Predixion grew 622% in popularity on the KD Nuggets poll in 2013 - they are doing exceptionally well. Predixion has a fantastic set of add-ins for Excel that layer on top of Microsoft's base predictive offering in SQL Server Analysis Services. They also are up to date with their PMML support, unlike Microsoft's base offering. They have a cloud solution and several pre-cooked industry solutions. If you are using Microsoft predictive, Predixion is one of the must-see solutions.
Weka: Free and one of the classic data mining tools used by long-term data miners. This was the tool I used throughout the University of California, San Diego program courses, and it is covered in depth within the classic Witten and Frank Data Mining text shown above. Weka has features across the entire life-cycle of data mining, from preparation to evaluation, plus some visualization and workflow. Pentaho recently added Weka into their mix of tools.
Statistica: One of the more mature and popular for-fee vendors in this space, StatSoft Statistica has an entire suite of products and industry specific solutions including but by no means limited to ERP, SAP certified solutions, Teradata, ETL, Process Control, Quality, Text Mining, and many, many others... far too many to list here.
SAS Enterprise Miner and SAS JMP: SAS is quite popular and the best of the best... If you have it, I admit that I am jealous! Many educational institutions, including the analytics program that I would most LOVE to attend, NC State Analytics, use SAS in their courses. Seriously, SAS is a market-leading for-fee vendor in this space. SAS solutions are high cost but also come with a sea of depth in robust, highly scalable predictive functionality. SAS Enterprise Miner is a true, full CRISP-DM life-cycle data mining solution offering. SAS JMP is not really a predictive modeling focused tool - it is more of a statistical survey, experimentation, and evaluation tool - but SAS JMP does contain decision trees, neural networks, and some regression algorithms. SAS also has robust in-database processing options via SAS/ACCESS that can leverage Base SAS, SAS_PUT(), and SAS Scoring features within databases such as Teradata, Oracle, Aster, Netezza, Greenplum, DB2, and Hadoop via a SAS Embedded Process.
IBM SPSS Modeler: SPSS is also quite popular and one of the best of the best. SPSS Modeler is also a true, full CRISP-DM life-cycle data mining solution offering. IBM has exceptional depth in text analytics thanks to the innovative natural-language "Watson" research that they applied to their Semantic and Text Mining features, making IBM a clear industry leader in those specific applications. SPSS is another vendor that is often used in educational courses and has long been used by government agencies, top Fortune 500 companies, and other high profile groups.
Microsoft Data Mining: Microsoft has had data mining algorithms embedded within Analysis Services at least since the SQL Server 2000 days. I recall playing with the Poisonous Mushroom Decision Trees demos way back when I was learning predictive for retail scenarios. Although this offering did not get significantly enhanced in the latest SQL Server 2012 release, the Add-Ins for Excel have continued to be upgraded: to 64-bit, to work with the latest versions of SQL Server, and to work with both Excel 2010 and 2013. If you have basic data mining needs and want to start somewhere really easy, the Excel Data Mining Add-Ins are a no-brainer. You can also easily deploy and use Microsoft data mining models with DMX predictive queries in your applications or reports. I have a nice public live demo of combining Microsoft data mining models and DMX predictive queries with Tableau here. Many Microsoft data mining customers opt to upgrade those solutions to use Predixion.
Oracle Data Mining: Oracle Data Mining is a component of the Oracle Advanced Analytics Option for the Oracle database. This is an "in-database" predictive solution offering with full life-cycle development. The free Oracle Data Miner GUI is an extension to Oracle SQL Developer that enables working directly with data inside the database, exploring data graphically, and building and evaluating multiple data mining models. Like other "in-database" predictive solutions, developers can use native SQL APIs to deploy and use predictive models within business intelligence, reports, or applications. Oracle can also integrate with SAS via SAS/ACCESS.
SAP Lumira with Predictive Analysis and SAP HANA PAL: SAP entered the predictive market late last year with a new offering that combines the visual discovery/analytics tool SAP Lumira with a Predictive Analysis tool. The Predictive Analysis tool snaps into the visual tool and allows for full life-cycle model development. It currently has limited out-of-the-box models but does have an R wrapper that can be configured, opening up the use of the wealth of R algorithms. The current toolsets are immature today, but SAP is releasing frequently to catch up to other vendors. SAP HANA PAL, the Predictive Analytic Function Library, is another "in-database" predictive solution offering and is quite nice. There are some great videos here to learn how to get started with SAP HANA PAL.
Teradata Warehouse Miner and Aster: Teradata also has "in-database" predictive solution offerings with full life-cycle development. This solution has statistics, data profiling, transformation, data reduction functions, model management, and scoring. Teradata and Aster can both integrate with SAS via SAS/ACCESS, allowing SAS_SCORE and other predictive capabilities to be available as database UDFs for easy predictive model integration.
Spotfire: Tibco Spotfire also has predictive modeling capabilities, though the focus of the software is typically perceived as being more on the visual discovery/analytics side than the CRISP-DM model development process. For many years, the pharmaceutical industry has been using Spotfire in drug research and development. Base Spotfire has limited predictive options. More sophisticated Spotfire flavors allow you to visualize existing predictive models from other applications and also create new predictive models with TIBCO Enterprise Runtime for R (TERR), R, and S+. One of the nice features of Spotfire is the ability to combine or mash up predictive model results with other data sets to visually examine what-if analysis. Here is a demo video showcasing this concept.
Tableau: It is no secret that I am a huge fan of Tableau, and I have already posted several blogs this year on the topic, including Visualizing R Models in Tableau and combining Tableau with Microsoft DMX. Another popular combination is using Alteryx Predictive R with Tableau, since a Tableau TDE destination component was added to Alteryx this year. Tableau is currently not a true data mining or predictive modeling development tool but rather a solution to visualize and explore results. There are some native Tableau, out-of-the-box predictive features around Trending and Forecasting that do not require R or any other stats programs. There are also multi-pass aggregation capabilities that can be used for some predictive processing scenarios and are easily overlooked. The Tableau Trending features cover linear and nonlinear model types, including Linear, Logarithmic, Exponential, and Polynomial modeling.
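For the R-with-Tableau combination, the R side of the setup is tiny. Here is a hedged sketch, assuming you are pairing Tableau with a local Rserve instance as described in my Visualizing R Models in Tableau post:

    # Start Rserve so an external tool such as Tableau can send R
    # expressions to this machine for evaluation.
    install.packages("Rserve")
    library(Rserve)
    Rserve()  # listens on Rserve's default port, 6311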
Alteryx: Although I think of Alteryx as the king of Self-Service ETL tools, it also has a nice array of predictive analytics features and an R algorithm wrapper that can be used by mainstream business analysts. I do not think of Alteryx as a true CRISP-DM data mining model development tool, but you can certainly build, apply, and evaluate predictive models in a self-service ETL process rather easily with it. Since Alteryx comes with an awesome library of exceptional, high value data sets such as MOSAIC, Dun & Bradstreet, demographics, and geospatial, it may make a lot of sense to use it for predictive analytics in retail, marketing, and other use cases that reference those data sets.
Knime: Free and a user-friendly visual workbench for the entire predictive analysis process: data access, data transformation, initial investigation, model development, visualization, and reporting. Although I have not used this tool myself, another respected TDWI member mentioned it, and after looking at it I agree that it should be added to this list of favorite predictive modeling tools.
That finally concludes my high-level overview of the TDWI Practical Predictive presentation I did last week. I also have a few other blogs on this topic including: Predictive Analytics with Tableau, How To Predictive in Tableau, What-If Analytic Simulation Options with Excel and SSAS, Visualizing R Models in Tableau, and Predictive Analytics with Mahout.
If you are interested in learning more about this fascinating and powerful topic, please don't hesitate to contact me directly.
SQL Server 2014 CTP 1 Business Intelligence Summary
In an unusual but warmly welcomed move, Microsoft SQL Server Technical Marketing released the SQL Server 2014 CTP 1 Product Guide before the CTP was released to the public. The first CTP for SQL Server 2014 is expected to be released later this month. If you have not looked through SQL Server product guides before, they contain fantastic technical overviews, white papers, and presentations. I used to be on the Microsoft SQL Server Technical Marketing team that put this type of material together, and I can share that these resources are excellent for keeping up to date on what is coming next with SQL Server and for getting a feel for the key investment themes to future-proof your technology roadmaps.
After downloading the latest product guide, I found the general decks and the BI 300 deck and took a quick peek through them hoping to see some exciting news on mobile BI, predictive, and other areas that I am dying to get an update on. Here are some notes on the overall BI investment themes that I did see in the CTP 1 product guide decks for the next release of SQL Server: In-Memory Across Workloads, Performance & Scale, Hybrid Cloud, HDInsight, and Cloud BI.
Note that "Cloud BI" a.k.a. Office365 Excel and desktop Excel for the BI presentation layer seems to be the BI message in this next release. This "SQL Server 2014 BI = Excel for presentation layer" message does align to the "Basic Intelligence" Excel BI theme from the PASS BA Conference a few months ago. There was notably NO mention of Reporting Services that I could see in the BI decks or whitepapers....hmmmm, interesting.
SQL Server 2014 CTP 1 Product Guide BI themes:
- HDInsight Big Data (most of the emphasis and focus in this release)
- PDW Polybase (Remember Dr. DeWitt's keynotes, PASS 2011, NoSQL + SQL = SQL Server 2014)
- Excel, Data Explorer, PowerPivot, Power View and GeoFlow (heavy emphasis on Excel)
- BI Semantic Model, Analysis Services Tabular, and Multidimensional flavors (lightly covered)
- Reporting Services (notably missing - no Reporting Services or Azure Reporting Services mentioned)
- Data Quality Services and Master Data Services (mentioned but no coverage)
- SharePoint and Office 365 (lightly covered)
If something is not covered or is covered lightly, it does not necessarily mean that there are no investments. I know from various Microsoft product team public blogs that there are investments in Analysis Services Tabular and other areas that were only glossed over lightly here. However, I was truly surprised not to see classic Reporting Services covered in any of the decks, since Reporting Services is by far the most popular Microsoft BI application used today across the globe (aside from Excel, of course). Excel is not really an operational reporting tool, yet operational reporting is needed everywhere and comes up in most BI conversations. In the SQL Server 2012 release, there was not a lot of classic Reporting Services enhancement aside from SharePoint integration; Power View was truly the main focus of the Reporting Services team at that time. Power View does not really replace operational reporting, especially as it stands today with no parameters, scheduling, or bursting... I'd love to know the future vision for classic Reporting Services, as would a bazillion other people. I get asked about it each time I do a Wave 15 / What's New in 2013 BI overview. I also saw one mention of the old Excel Data Mining Add-Ins, which I totally love by the way. It was very nice to see them shown here, and I do hope that means those great Excel predictive analytic add-ins will still be around in v2014.
It was quite clear to me that HDInsight and PDW are getting most of the love in 2014 from the sheer amount and depth of coverage in the product guide content. The above image of the Microsoft big data technical architecture components and how they all tie together is also key. I would like to see an updated technical architecture diagram like the one SAP uses (see picture below) so customers can clearly understand Microsoft's comparable offerings. Here is my hack of what I saw in SQL Server 2014 and how it relates to SAP's big data framework that has the exact same types of technical functionality.
The HDInsight coverage included PowerShell and cross-platform CLI tools, and the MapReduce material included conceptual overviews and code examples. I imagine this area will have the most significant learning curve for most of us, since it looks more like traditional application development than database development skills. If you haven't started learning about big data yet, you really, really, really should... it is still early adoption, but it is gaining serious market traction, will be mainstream soon, and thus will soon be a need-to-know.
Ah, PDW Polybase! Finally, Dr. DeWitt's PASS 2011 keynote vision of combining the NoSQL world with the relational database SQL world comes to life in SQL Server 2014. PDW Polybase allows native querying across PDW and Hadoop using regular SQL queries, integrating structured and unstructured data.
- SQL query access to data stored in HDFS seen as ‘external tables’ in PDW
- Basic statistics support for data coming from HDFS
- Distributed join querying across PDW and Hadoop tables
- Fully parallelized, high performance import of data from HDFS files into PDW tables
- Fully parallelized, high performance export of data in PDW tables into HDFS files
- Integration with various Hadoop distributions: Hadoop on Windows Server, Hortonworks, and Cloudera.
- Support for Hadoop 1.0 and 2.0
Polybase will be a powerful and totally fantastic feature - if only I could get my hands on a PDW to play with it! I am seeing PDW demand pick up this year. For those of you that don't know PDW, it is Microsoft's Parallel Data Warehouse appliance. Although it is not covered in the BI section, one other BI-related item is that an updateable column store index is in 2014 - that will be a very nice enhancement to the existing column store index in 2012.
In the SQL Server 2014 release, we see a lot of Excel where we used to see Reporting Services or PerformancePoint. I noted only one mention of PerformancePoint, in a recycled slide. The message is loud and clear = Excel with Add-Ins and Office 365 Excel is the future for Microsoft BI. The 300 deck opened with Excel Data Explorer, reviewing all the data source types that can be searched and explored.
I have a nice overview of Excel Data Explorer, an easy self-service ETL tool, in a previous blog if you are interested. Also in the Excel slides there was basic coverage of Excel Web Apps, PowerPivot, Power View, and GeoFlow. No specific enhancements were really mentioned in these decks yet. I am guessing that future CTP deck iterations will add deeper-dive technical content for these areas - possibly even a deck for each one of these areas, like there used to be in past SQL Server release product guides.
Lastly, there was light coverage of the BI Semantic Model, the Analysis Services Tabular and Multidimensional flavors, Data Quality Services, and Master Data Services. Again, no specific enhancements were mentioned in any of the decks yet. My wish list for the next iteration of the SQL Server 2014 Product Guide BI content includes:
- Expand upon classic Reporting Services or cover what will be the future for operational reporting
- Add more Predictive Analytics, Mobile BI, and Cloud BI content
- Expand Data Explorer and include things like powerful M scripting
- Add deeper PowerPivot, Analysis Services, DQS, and MDS material for what is new, wild, and exciting
That wraps up what I saw in the current SQL Server 2014 material from the latest product guide release. I highly advise keeping a pulse on these docs as they evolve throughout the next year.
The Beginning of Impact Analytix
After much reflection, thought, and consideration, I chose to pursue one of my life dreams - starting my own business. My new business, Impact Analytix, LLC, will consult and develop solutions with the BI, analytics, and technology platforms customers choose. Over the last 20 years I have seen many BI peers go out on their own, take a leap of faith, and SUCCEED! I have quietly envied their ability to play with varying database and BI platform technologies, decide when to accept/turn away projects, write books, blog, go and speak at amazing industry conferences, take long vacations, write off technology and continued learning as business expenses, and so on. Having saved enough money, lived modestly, gained relevant work experiences, and with my husband's support and continual urging to start my own business for years now - I FINALLY DID IT. It is exciting and I am completely reinvigorated. I registered my own business, got an EIN, business accounts, set up Office365 (super fantastic for a small business), started researching technology partnership programs across various information technology vendors, looked at cloud stuff, set up the computer systems, interviewed numerous small business owners, and continue to plan and research as I get my first engagements started. Thanks to Mark Tabladillo, Lynn Overall, Mayra Harley and others for being true inspirations and invaluable mentors for me in this new venture. I know there are many more of you out there that have made this leap too. I DO want to talk to you, hear your stories, and lessons learned.
So that is my scoop on the beginning of Impact Analytix, LLC. Fear not, my next blog will be a technical topic again - SAS JMP, Tableau European Conference news, Microsoft SQL Server's latest service pack with the long-awaited and finally released DAXMD allowing Power View reporting on top of Analysis Services multidimensional cubes (via Reporting Services/SSRS in SharePoint right now, not for Excel 2013 yet), or something else truly geeky.
New Data Explorer for Excel 2010 and 2013
I recently blogged about Self-Service ETL Tool Options. That blog was pretty much a high-level summary of current market solutions in that space, with links to further explore the reviewed tools and find more related tools. Having now spent some hands-on time with the new Excel Data Explorer Preview add-in, I'd like to supplement that post with additional feedback. To be fair to the other excellent self-service ETL vendors, Alteryx and LavaStorm, I will explore their offerings deeper in future blogs too. I do know that Alteryx has released an upgrade that works with varied local R installations, so I can better evaluate their predictive features in my next review. Since Excel touches the lives of at least 2 billion people globally and the Data Explorer add-in is currently free, I wanted to first dig deeper into the new Excel Data Explorer Preview add-in, since most people will naturally start there and then decide if another tool is needed.
Just to refresh you, the Data Explorer Preview is an Excel add-in for Excel 2010 and 2013. You can download and install it from http://www.microsoft.com/en-us/download/details.aspx?id=36803. Data Explorer provides easy, self-service ETL type functionality within Excel.
It has a simple, non-technical wizard interface for loading, combining, and transforming both structured and unstructured data into Excel - it also has some complex scripting capability via a new language called "M". You can read up on M in the M technical docs. Additional resources to ramp up on Data Explorer include an introduction tutorial and a more advanced one on combining data sources.
To get started, navigate to the Excel Data Explorer menu and choose Online Search or a Data Source Type to find, query, prepare, cleanse, and load data into Excel. The current list of supported data sources is fairly large: Web Page, Excel or CSV files, XML, Text, Folders and file metadata, SQL Server database, Windows Azure SQL database, Access, Oracle, IBM DB2, MySQL, PostgreSQL, Teradata, SharePoint Lists, OData, Hadoop HDFS, Windows Azure Marketplace, Active Directory, Windows Azure HDInsight, and Facebook. More data sources are sure to be added over time - I wish sources like Twitter and LinkedIn were much easier, for example. TweetSQL, DataSift, R packages, and many other third parties can get you your social data to analyze. The Data Explorer Web Page and Online Search data sources were interesting and quite excellent. I can't begin to count the number of times I have had to search and then copy, paste, and restructure data from a web page where the figures were embedded in messy HTML tables for analysis.
A few of the other features that I liked and foresee being really invaluable to typical data analysts are the pivot/unpivot, scripting, joining, filtering, deduplicating, grouping, splitting, and transform capabilities. The Data Explorer formula language is pretty straightforward, though I do wish learning yet another new programming language was not necessary. Many of the Microsoft MVPs have great blogs on Data Explorer - Chris Webb, Paul Turley, Melissa Coates, Bill Pearson, Mark Tabladillo, Teo Lachev, Rob Kerr, and Jason Thomas are probably my top-of-mind favorite bloggers with good technical coverage. I know I am forgetting a few of the best of the best out there. One of the things that I found "hidden", or not best implemented, was the nice menu for cleansing and data preparation shown above. The "trick" to see/access this menu is to right-click on top of the data column header name in the Query Editor window. That right-click is not at all intuitive given the little drop-down arrow the user is naturally drawn to click on. If the user does not know to right-click on top of the data column header, they miss out on the true beauty and power of all the transformation capabilities. I do hope that nit-pick is fixed, or I fear many analysts may overlook Data Explorer's depth of functionality altogether or fear the script window being too complex. After visually choosing your transformations, you can then click on the script icon to see the generated M scripts. If you are a hard-core scripting fan, you can write your own M scripts in the script window.
Personally, I don't think Excel Data Explorer got the love it deserved at the PASS BA Conference, most likely since the UI does not "sizzle" and there were some grumbles about the PASS 2012 Day 2 keynotes. Regardless, it is one of the best things to come out of Microsoft BI this year, and it is something to add to your bag of analytical tricks. Since Excel Data Explorer is more for personal ETL versus enterprise ETL, you may be asking when you should use it. This is where I think Excel Data Explorer makes the most sense:
- You need basic ETL on data sources with less than one million rows (refer to the Data Explorer limits)
- You are struggling with a mess of VBA routines or are using Access linked servers with merge queries, and so forth (there are much better ways to do data mash-ups today, though I keep seeing the VBA and Access messes all over the place in the real world)
- You need personal data cleansing
- You want lightweight web page content imports or text mining capabilities
I feel that Excel Data Explorer currently is not a go-to solution for the following scenarios:
- You need large scale data for ETL or want true web or text mining
- You need ETL scheduling, monitoring, automation, etc.
- You want a data quality solution that references third-party data sources, can be automated, and uses workflows, data stewards, reporting, etc. (use a real DQS tool for this type of thing)
- You need to merge the same attributes (i.e. Customer, Product) from different systems with system-of-record priority, workflows, data stewards, and automation (use a real MDM tool for this type of thing)
That concludes my "slightly" deeper dive into Excel Data Explorer. I know it is still pretty shallow from a "true techy" perspective but I do hope that it was enough to provide a better understanding for what Data Explorer is, what it can do, and where it fits.
Combined BI solutions with SAP for SAPPHIRE 2013
I recently went to SAP SAPPHIRE and had a great time seeing my BI friends from Microsoft, Simplement, Accenture, Deloitte, Hitachi, Ernst & Young, and numerous customer accounts. SAPPHIRE seems to be a much more heavily attended conference than SAP TechEd, which I attended last fall. To prepare for this event, I had to ramp up on combined Tableau with SAP solutions. In the past, I have worked with combined Microsoft with SAP solutions - to read up on those, check out PowerPivot with SAP HANA and Microsoft BI with SAP. Now having seen both combined vendor solutions with SAP, I do see common patterns, and I also see some areas where it sure does help to have a great SAP relationship for the combined solution connectivity!
Tableau directly connects to SAP HANA, SAP BW, and SAP Sybase IQ. Tableau also can be called from or embedded into SAP Business Objects - the Ablaze Group has an excellent video on combining Tableau with SAP Business Objects. Historically, SAP customers have been using Tableau with SAP BW for a while now. This year I am seeing a little more SAP HANA than I did last year as the database market share competition heats up. Thus I started my exploration of the SAP + Tableau experience with SAP HANA and found that it was super easy. The first thing I did was install the SAP HANA ODBC driver on my laptop and connect to an SAP HANA instance hosted on AWS. From there the Tableau experience was exactly the same as it is with any other relational database - simple drag-and-drop visual analysis with the lightning fast speeds that SAP HANA in-memory delivers. See the image above of my Mortgage Analytics testing (using fake data of course).
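As a side note, once the SAP HANA ODBC driver and a DSN are configured, you can sanity-check the connection outside of any BI tool. Here is a hedged R sketch using the RODBC package - the DSN name, credentials, and table are hypothetical placeholders, not my actual test instance:

    # Test an SAP HANA ODBC DSN before pointing a BI tool at it.
    # "HANA_AWS" is a placeholder DSN configured in the ODBC manager.
    library(RODBC)
    ch <- odbcConnect("HANA_AWS", uid = "myuser", pwd = "mypassword")
    head(sqlQuery(ch, "SELECT * FROM MORTGAGE.LOANS"))  # sample query
    odbcClose(ch)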
Although it was not required for Tableau, I also installed SAP HANA Studio so I could play with SAP HANA in general - load data, create stored procedures, and test the Predictive Analytics Library for a later post. For learning SAP HANA, I watched a few of the wonderful videos on SAP Academy. Working with SAP HANA feels like most other relational databases that I have worked with in the past: DBA/developer management studio, role-based permissions, tables, stored procedures, import/export, SQL, explain plans, sequences, and so forth. Some of the differences were the optional Analytics Function Library, the Predictive Analysis Library, R integration (it relies on Rserve, so that seems more like a wrapper to me than true integration), and a nice data previewer with charting capabilities. More to come on SAP HANA, but for now back to Tableau + SAP.
Next on the list was SAP BW with Tableau. There are quite a few of the largest companies in the world already doing this, and they were sure to stop by the booth and ping me, so I had to be on my A-game! To get connected to SAP BW using the Tableau SAP BW connector, I found the Knowledge Base article quite helpful. This connector uses the OLE DB for OLAP provider and issues live MDX queries to SAP BW. The connection experience is similar to using Tableau with other OLAP data sources, such as Microsoft SQL Server Analysis Services and Oracle Essbase. I needed to install the SAP GUI for Windows and the OLE DB for OLAP provider from the SAP Service Marketplace. Tableau enables you to connect to a BEx Query or to an InfoCube in SAP BW - other SAP objects like MultiProviders and ODS must be reached through a Query that has the "Allow External Access to this Query" option selected in the Properties tab of the BEx Query Designer application. In Tableau, choose the SAP NetWeaver Business Warehouse connection type, enter your credentials, and then select an InfoProvider to connect to a BEx Query or InfoCube. Note that BEx Queries often perform better than connecting to an InfoCube, since they can be designed to get specific information from SAP BW rather than everything. BEx Queries can be defined using the BEx Query Designer. Once a BEx Query or InfoCube is selected, Tableau detects the properties of the available objects, dimensions, hierarchies, and key figures, as shown side by side in the image below.
Now you can get to the real fun: creating amazing Tableau visualizations of this SAP data, blending it with other data, and publishing it to Tableau Server for sharing, or even embedding the analysis into a web site, web application, SharePoint, or an SAP reporting portal.
While on this topic, quite a few people asked about SAP Visual Intelligence, now called SAP Lumira, vs. Tableau this week. I would advise you to check them both out side-by-side, test, play, and experience the vast differences in product maturity. Tableau has a good 10+ years of development head start on SAP in this space. Tableau is #1 in this class; the depth, breadth, and design are far superior. It is notable that SAP Lumira's embedded predictive analytics with SAP HANA, R, and writeback features are much easier than doing predictive with Tableau today. SAP Lumira also has much nicer data cleansing options than Tableau right now. However, data connectivity and just about everything else in visual analytics is much better in Tableau than SAP Lumira. In time SAP Lumira may have something more compelling, but I have not seen much in the way of enhancements since the debut last year at SAP TechEd in October. Bottom line = test them both and choose the best strategic analytics tool for you.
Hopefully this was a good primer on how to get started with Tableau and SAP. This happens to be a pretty popular combined solution at some of the largest companies in the world. If you have any questions or need further guidance about this topic, please don't hesitate to reach out to me.
Mobile BI for iOS, Android, Surface, and Smart Phones
Mobile, mobile, mobile BI! Powerful information delivery for anytime, anywhere analytics - getting answers to your questions when and where you need them most, in the field. If you follow me, you know that I am obsessed with mobile. I have loved mobile apps since the 90s. I have been dying to write this post for a long time now. Honestly, I was waiting to see if there would be any news announced at the PASS BA Conference to add to this post, since I get soooooooo many pings about mobilizing Microsoft BI. Alas, crickets were chirping on the mobile BI topic - there was NO mobile news. So if you have pure Microsoft BI, the same limited, not ideal, but workable options that I showed you last year with HTML5, jQuery, SSRS 2012 SP1, Excel 2010/2013, Visio 2013, PerformancePoint, and SharePoint 2010/2013 with browser-based mobile for iPad are still around. However, Power View is still Silverlight and thus not mobile-friendly. GeoFlow is not mobile-friendly. The PerformancePoint right-click feature to access menus broke late last year when Apple disabled that menu navigation option. I also would not recommend building any new PerformancePoint content now that "Excel has been positioned as the future BI direction for Microsoft BI" at the PASS BA Conference. There are still a wide variety of third-party mobile BI options you can use - Tableau being my favorite.
I first learned of Tableau's mobile capabilities with Microsoft BI from Jen Stirrup, a Microsoft BI MVP, and since have fallen in love with it. Why do I love Tableau + Microsoft BI for Mobile BI?
1) Tableau's touch-optimized business intelligence is available across a wide array of mobile device types, browsers, and operating systems including Windows, iPad, and Android - native apps and browser-based. In the BYOD world we live in today, having wide device support is critical. Don't limit your mobile BI consumers to one device type.
2) Mobile users can view, edit, or even author new visualizations on their favorite mobile tablets - not only to view data but also to ask on-the-fly questions in meetings or on site and get immediate answers.
3) With Tableau you can mobilize the most popular Microsoft BI and other data sources, including SQL Server, Analysis Services, PowerPivot, and Excel.
4) Tableau mobile business intelligence emphasizes usability. It is touch context sensitive. You can filter, scroll, pinch, and zoom, and it all just works without HTML5 coding hoops - there are NO coding hoops.
5) Tableau's author-once, view-anywhere approach means that every dashboard you publish is automatically touch-enabled.
6) You can use Tableau to easily combine data from many other enterprise platforms like SAP or IBM, not just Microsoft, to fully leverage AGILE, mobile data discovery alongside your existing enterprise BI implementations. Tableau is not a rip-and-replace vendor - Tableau values supplementing other vendors' BI solutions for best-of-breed results.
7) You can install the fully functional Tableau Desktop app on a Windows Surface tablet if you are a true, purist, die-hard Microsoft fan!
8) You can embed live Tableau visualizations in PowerPoint (I was mad when the Power View export to PowerPoint was removed in Excel 2013... boo!). You can have that favorite feature with Tableau.
9) As a bonus, you can play/animate all Tableau views over time periods to see trends and patterns over time on maps, in bar charts, line charts, scatter charts, combination charts, and even in background image charts. I have seen some amazing animated visual analytics applied to supply chain, finance, healthcare, retail, transportation, science, weather, and spatial analytics scenarios.
10) Tableau mobile BI has solid security. You can enforce existing security protocols and integrate with Active Directory via Tableau Server. If a user loses their iPad or Android tablet, simply disable their Tableau Server account and give them a new one. Sensitive data is not stored on mobile devices.
So that wraps up my Top 10 list of reasons why Tableau + Microsoft BI for Mobile BI. I am sure I could come up with many more. To get started exploring and deploying Tableau mobile BI, check your mobile phone or tablet app store to install and play with the latest free Tableau native mobile BI app samples. You can also download a fully functional, free trial of Tableau Desktop and Tableau Server to test internally on your network - or you can use the free Tableau Public option and publish your dashboards there to see what they look like from anywhere in the world with an internet connection on your favorite mobile device.
Feel free to reach out to me if you want more information on this super hot mobile BI topic. Note that I will also try to provide a SlideShare deck and some videos soon, like I have done in the past.
Self-Service ETL, Data Quality, Cleansing, Prep and other Analytic Data Goodies
As promised, I wanted to share some of the options I have been evaluating for self-service ETL, data quality, and cleansing. In this post I will briefly highlight the Microsoft Data Explorer Excel Add-In, Data Wrangler, LavaStorm Analytics, Alteryx, and Microsoft Data Quality Services. There are actually quite a few of these tools out there! A great list of more options is available on KDnuggets, my favorite data mining site. Data prep is probably the most important aspect of getting a great predictive or analytic data model.
Lately there has been a lot of good Microsoft Excel MVP buzz about the Data Explorer Preview, an Excel add-in for Excel 2010 and 2013. Data Explorer seems to be a Microsoft light version of an Oracle Endeca-like offering. Data Explorer pricing and Excel version support are not yet known - the preview is free. After the Rob Collie Excel 2013 PowerPivot rant, I wanted to be clear that Data Explorer pricing/version details are totally unknown to me right now. (Digression: also, Microsoft is not evil - wow, Rob's blog post caused a huge uproar. Microsoft is not a charity either - they just happen to do a lot of charitable things and build low cost, less feature rich, for-the-masses-type tools. Microsoft is a great company that makes a positive difference in the world. Yes, I am biased.) Data Explorer is absolutely worth checking out if you want easy, self-service ETL-type functionality within Excel.
Data Explorer has a simple, non-technical wizard interface for loading, combining, and transforming both structured and unstructured data into Excel. The learning curve is minimal. The Data Explorer team did a great job of covering common data loading, data prep, and tweaking tasks from the most popular data and web data sources. They even have a social media connector and a big data connector. Notably, instead of using SQL to query and tweak data, they introduced a new language called "M" with Data Explorer. "M" seems pretty powerful; you can read up on it in the M docs. Additional resources to ramp up on Data Explorer include an introduction tutorial and a more advanced one on combining data sources.
Another tool in this self-service ETL space is LavaStorm Analytics. I really liked what this group offers but do confess that here we are closer to a power user or enterprise ETL developer user experience. The LavaStorm Analytics user interface is still much simpler than the enterprise ETL tools I have used before. The introductory YouTube video quickly gave me enough understanding of the capabilities to begin building ETL solutions the same day. I liked the richness of the LavaStorm Analytics Engine functionality, the ETL component reusability, the pre-packaged analytic libraries for incorporating predictive models and statistical data analysis, and the out-of-the-box R task/R integration. You do need to script the R yourself right now in the R task (bummer), but I am hopeful that will improve in the future. The ETL component reusability is a big plus since analytic routines can be complex and often are shared across numerous analytic projects. I am not doing this great self-service ETL tool justice in a one-paragraph overview. Download it, watch the free online video training, and play with it yourself. They have a free desktop version, several levels of paid desktop versions, and server versions starting at $4,500. They also have other product offerings around specific analytic use cases like fraud, spend, and optimization.
Alteryx seems to be the "king" of the self-service ETL tools I reviewed, and it also had the highest pricing at ~$45,000 per seat and $15,000 for a personal edition. Alteryx was a highlighted vendor in the 2013 Gartner BI Magic Quadrant report. Alteryx is super feature rich and comes with embedded, prepackaged industry data sources like MOSAIC profiles, TomTom geospatial, Dun & Bradstreet, Experian, and other in-tool geographic, demographic, and business data included with purchase. So the high price is really a bundled price for both the industry analytic data sources and the self-service ETL tool. Alteryx has bulk data loaders for Salesforce, Amazon S3, SharePoint, Teradata, Oracle, MongoDB, Hive Hadoop, and many other structured and unstructured data sources. They offer a read-write, hosted analytics application wrapper framework on their website. They also have nice R integration that did not require any R coding AND an R code task option if you want or need R code within your self-service ETL flows. I was not able to test the R functions myself since I had a version conflict with my local open source R installation. Alteryx's predictive product manager was kind enough to spend time with me covering the R features in depth and also shared that they are working on a vNext iteration that will support side-by-side installs of various R flavors. What I felt were true strengths of Alteryx are its deep location/geospatial analytics and prepackaged industry data sources for mash-ups. Powerful common analytic tasks - like plotting 15-minute drive times from a specific store location on a map - are very hard to geocode and then visualize with other tools; in Alteryx it is easy. They also just added a Tableau TDE destination that allows Alteryx ETL data to be easily sent to Tableau for deep visual, interactive analytics - see the image above for the type of analytic solution this awesome combination empowers. Alteryx seems to be used by larger enterprise analytic groups - it is not for the occasional one-off ETL project or even pure ETL. If you just need ETL, there are much cheaper tools like LavaStorm Analytics, SSIS, or Pentaho.
Last but not least, if you just want data cleansing and your company already owns SQL Server 2012, don't forget about SQL Server Data Quality Services (DQS) and the Data Quality Services add-in for Excel. I am going to copy the description directly from the official overview: "SQL Server Data Quality Services (DQS) is a knowledge-driven data quality product. DQS enables you to build a knowledge base and use it to perform a variety of critical data quality tasks, including correction, enrichment, standardization, and de-duplication of your data. DQS enables you to perform data cleansing by using cloud-based reference data services provided by reference data providers. You can also perform data quality processes by using the DQS Cleansing component in Integration Services and Master Data Services (MDS) data quality functionality, both of which are based on DQS." Now, this option is really not self-service ETL, but I felt the need to mention it here since many self-service ETL projects really are basic data cleansing projects where DQS fits very well. There are tons of free videos and resources to learn how to get started and use DQS.
Tip! Self-Service BI/Data Discovery Software Evaluations
Tip of the day: download free trials of self-service BI/data discovery tools and experience using them yourself BEFORE you buy. "Experiencing" the differences in HOW to build views with your own data sources, and playing with the depth of available analytic features, is really important since many of these tools look exactly the same when you only see the end-result views or dashboards. If it is really a self-service or data discovery tool, then you should be able to point to/load your own data sources, mash up/blend your own data sources, add your own time intelligence (YTD, QTD, MTD, Year over Year, Parallel Period), and build your own analytic views - all by yourself! Don't let BI vendor sales reps fool you by copying your data into "pre-made" self-service BI demos or POCs. The HOW in these use cases is a Critical Success Factor.
Tip! Analysis Services Cubes with Tableau Design Patterns
I found a "golden nugget" for using Analysis Services (SSAS) cubes with Tableau. There is a Knowledge Base article on the topic that is a little dated, BUT the attached workbook with the common design pattern examples is really excellent. There do seem to be a lot of customers using Analysis Services cubes with Tableau, or trying to figure out how to do it. Keep this handy.
KB Article: http://kb.tableausoftware.com/articles/knowledgebase/functional-differences-olap-relational
Design Patterns: http://kbcdn.tableausoftware.com/workbooks/adventureworks_cube_on_scdemo-dbs.twbx
Tableau v8 "Kraken" is Officially Released
So that wraps up another late-night blog post. I was not expecting this release today and was pleasantly surprised when I saw the Tweets. My next planned blog post will be on self-service ETL, covering the new Microsoft Data Explorer, Alteryx, and LavaStorm Analytics.
Big Data Visual Analytics
Earlier this week I attended and presented at the TDWI Big Data Analytics Solution Summit. It was interesting to hear the other presenters, talk to groups using big data, and play hands-on with messy XML big data stored on a Cloudera Hadoop cluster for my big data visual analytics with Tableau demo. I first tooled around with big data over a year ago after Denny Lee, Saptak Sen, and a few others inspired me to do so while I was at Microsoft, but I confess I felt overwhelmed by all the strange terminology of Pig, Hive, and Sqoop and by the high level of effort needed to be functional with it. Since then the ease of using big data has vastly improved. Today most big data vendors offer connectors like the Hive ODBC driver, and market-leading visual analytics tools like Tableau include additional user-friendly big data function libraries on top of those drivers to make the power of big data analytics accessible to a much wider audience of analysts and developers. As a result, this technology is becoming widespread. If you are a business intelligence professional, you need to learn what big data is and how to work with it. In this post, I will share how simple it is to do big data analytics with Tableau and cover some cool, value-added benefits of the Tableau-specific features for unstructured, messy, big data objects - a common pain point with many other visual analytics tools.
To get started with Tableau and big data, you need a big data source. Tableau can connect to quite a few of them, including Hortonworks Hadoop Hive, MapR Hadoop Hive, Cloudera Hadoop, Cassandra, Hadapt, Karmasphere, and Google BigQuery. If you don't have one of these, most of the vendors have virtual machines you can download with a cluster already set up for learning. Cloudera Hadoop is the one I am using. Once you have a big data source, you need the related Hive ODBC driver installed on each machine running Tableau Desktop or Tableau Server, and you also need to ensure Hive is set up. Hive enables SQL-like queries on Hadoop file systems; more about Hive, what it is, and how it works can be found here. I am on 64-bit Windows 7 and found that setting up the ODBC DSN was a little tricky: on a 64-bit OS I had to call "C:\WINDOWS\SysWOW64\odbcad32.exe" from the command line to get the right ODBC user interface to see the Hive driver and create a DSN to use in Tableau. Once I had that figured out, I could point to the DSN in the Tableau data source connection window, choose my connection type and schema, and start having some real fun visualizing the big data. Creating big data visualizations in Tableau was no different than with any other data source type - it was a drag/drop, pleasant, visual data discovery experience.
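Before pointing Tableau at the DSN, you may want to sanity-check it from another ODBC client first. Here is a minimal R sketch of such a smoke test; the DSN name "ClouderaHive" and the Hive table "web_logs" are hypothetical placeholders for your own names.

    # Minimal Hive DSN smoke test from R using the RODBC package.
    # Assumes a hypothetical DSN named "ClouderaHive" created with the
    # Hive ODBC driver and a hypothetical Hive table named "web_logs".
    library(RODBC)

    ch <- odbcConnect("ClouderaHive")        # open the Hive DSN
    print(sqlQuery(ch, "SHOW TABLES"))       # list tables to confirm connectivity

    # Always LIMIT exploratory Hive queries - each one spins up MapReduce work
    sample_rows <- sqlQuery(ch, "SELECT * FROM web_logs LIMIT 10")
    print(head(sample_rows))

    odbcClose(ch)

If the query comes back with rows here, the same DSN should light up in the Tableau connection window.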
Tableau shines in big data visualization - no programming required, and no waiting on super slow, long-running MapReduce queries if you choose to use and schedule extracts, as most groups do in real-world use cases after initial exploration and experimentation. In the 2012 TCC keynote demos, Christian Chabot showcased how Tableau could render over 800,000 data points with ease, blowing away other data visualization vendors that often fail to render more than a thousand data points and use uncontrollable data sampling techniques to hide those limitations. In a world of big data, the ability to easily see and analyze patterns, outliers, or exceptions can be the difference between struggling to survive and thriving in your industry. I talked about this strategic competitive advantage during my TDWI Solution Summit demo.
I also discussed how working with big data sources often means working with messy data, unstructured data, JSON, and XML files that can be exceptionally challenging to analyze directly on a Hadoop cluster. Most other visual analytics tools require programming or ETL before you can analyze this type of data. The key point I showed was the low level of effort required with Tableau in these common big data scenarios! I showed how easy it is to do drag/drop visual analysis of XML files over a live, direct Hadoop connection, using Tableau's enhanced big data driver libraries for XML, JSON, and other string functions that uncomplicate text processing, unpacking nested data, performing data transformations, and processing URLs. Tableau's big data driver features also support pass-through Hadoop UDF functions and Java programs. This is another very important point to keep in mind, because one of the benefits of open source is freely leveraging the work of the worldwide open source developer community, further easing big data analytics. In Tableau, you can call Hadoop UDFs, often built as JAR files, by appending the JAR file location to the Initial SQL (add JAR /usr/lib/hive/lib/hive-contrib-0.7.1-cdh3u1.jar; add FILE /mnt/hive_backlink_mapper.py;). For more information on this point, refer to the Hive CLI documentation.
Another thing you can do with Tableau Initial SQL is performance tune the Hadoop MapReduce query settings. To force more query parallelism, you can lower the data set size threshold required for a single unit of work (set mapred.max.split.size=1000000;). You can also specify using clustered, "bucketed" fields to improve big data query join performance (set hive.optimize.bucketmapjoin=true;). Another Tableau Initial SQL big data optimization technique is to consider the shape of your data: big data is often unevenly distributed, so MapReduce tasks can lead to system hot spots where a small number of compute nodes get slammed with most of the computation work. The following Initial SQL setting informs Hive that the data may be skewed and to take a different approach when formulating MapReduce jobs (set hive.groupby.skewindata=true;).
For ad-hoc big data queries, on-the-fly ETL, or data cleansing, Tableau Custom SQL can be used. In real-world implementations, a LIMIT clause is often added to Custom SQL statements during development and later removed when the Tableau views are deployed into production. There are many other tips and tricks for using Tableau with big data; I hope the above provides a taste of some of the things you can easily do to start getting up to speed. More information on this topic is available in the Tableau knowledge base and forums.
How To Visually Analyze R Predictive Models in Tableau
One of the frequently asked questions I get these days when talking to analysts about Tableau solutions is "what does Tableau offer around predictive analytics?" There does not seem to be a lot of information available on this topic. There is a fantastic how-to tutorial on using R with Tableau in the Tableau community, including some great pointers from Bora Beran on using SQL UDF functions with R. I have been continuously updating a SlideShare presentation as I learn more about this topic.
If I can find the time, I will write up a few white papers on various predictive solutions with Tableau alone and with Tableau combined with open source R, SAS, SPSS, Alteryx, Microsoft SQL Server and Excel Data Mining, Oracle Data Mining, Oracle SQL extensions, SAP HANA, Sybase RAP, and other options. In the meantime, I hope my blogs on this fun topic will be helpful. In my January blog, I covered how to combine Microsoft SQL Server Data Mining with Tableau (live view here). In this blog, I want to cover the super popular R. Using R with Tableau is actually very EASY to do and does not require R scripting - or any scripting at all - if you prefer simple point-and-click user interfaces.
Before we jump into R, keep in mind that there are native, out-of-the-box Tableau predictive features around Trending and Forecasting (new in v8) that do not require R or any other stats programs. There are also multi-pass aggregation capabilities that can be used for some predictive processing scenarios and are easily overlooked. The Tableau Trending feature covers both linear and nonlinear model types, including Linear, Logarithmic, Exponential, and Polynomial models, with statistical significance details, model formulas, p-value, t-value, degrees of freedom, standard error, mean squared error, confidence bands, residual analysis, and related model evaluation information. The Tableau Forecasting feature is new with Tableau v8; it uses exponential smoothing algorithms with and without trending and seasonality detection. Both Trending and Forecasting can be accessed by right-clicking on a visualization in Tableau or via the Analysis menu. Anyone can easily use these Tableau predictive features - you do not need to be a data scientist - though understanding statistical significance helps you interpret the generated model formulas, judge a good model from a not-so-good one, and dive deeper into the model evaluation details.
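Tableau computes all of those trend statistics for you, but if you are curious what is happening under the hood, the equivalent of a linear trend line in open source R takes just a few lines. Here is a sketch using R's built-in cars data set:

    # A linear trend model in R - summary() reports the same kinds of
    # statistics Tableau shows for a linear trend line: coefficients,
    # p-values, R-squared, standard error, and residual details.
    fit <- lm(dist ~ speed, data = cars)   # stopping distance vs. speed
    summary(fit)                           # significance details and model formula
    confint(fit)                           # confidence intervals for the coefficients
    plot(fit, which = 1)                   # residuals vs. fitted values plot

Seeing a model formula and p-values printed this way once or twice makes the details Tableau surfaces in its Describe Trend Model dialog feel much less mysterious.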
If you are looking for more sophisticated data mining models, a combined R and Tableau solution can be used. You can use R, with or without the nice point-and-click Rattle user interface, to generate predictive models such as classification (decision tree, regression tree), clustering (kmeans, hierarchical, ewkm, bicluster), association (market basket), neural networks, support vector machines, or other types of R models, then export the R predictive model scoring output to a .csv file or a database and visually analyze those model outputs in Tableau just as you would any other data source. For most scenarios you can use the free versions of R and Rattle. For very large predictive models with hundreds of attributes and variables that surpass the memory limits of free R, something like Revolution Analytics' R may be a better option for creating the predictive models and scoring.
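To make that round trip concrete before the Rattle walkthrough, here is a minimal R sketch of the clustering path, using R's built-in iris data in place of your own prepared data set (the output file name is my invention):

    # Cluster a prepared data set with kmeans, then export the scored rows
    # to a .csv file that Tableau can visualize like any other data source.
    data(iris)
    inputs <- iris[, 1:4]                  # numeric input variables only

    set.seed(42)                           # make the clustering repeatable
    model <- kmeans(inputs, centers = 3)   # ask for three clusters

    scored <- cbind(iris, Cluster = factor(model$cluster))  # append assignments
    write.csv(scored, "iris_clusters.csv", row.names = FALSE)
    # In Tableau: Connect to Data > Text File > iris_clusters.csv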
Let's walk through it. 1) Your data should already be prepared for data mining: a single data set in a flattened format, with variables and transformed variables like those used in statistics. The Dorian Pyle book Data Preparation for Data Mining is my favorite for learning how to best structure data for predictive analytics. Regardless of your predictive analytic tool choice, preparing data has to be done and is a bit of an art and a science. If none of this makes sense to you so far, you need to learn the basics of data mining to understand what you are doing, even if it is easy to do. The classic book used to teach data mining 101 is Witten, Frank, and Hall's Data Mining: Practical Machine Learning Tools and Techniques. The data prep task typically takes the most time in any predictive project, much like ETL in a data warehouse project.
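To make the "flattened format" idea concrete, here is a small R sketch that rolls raw transaction rows up into one row per customer with a few derived variables. The file name and the CustomerID, OrderDate, and Amount columns are hypothetical stand-ins for your own data:

    # Flatten raw transactions into one row per customer with derived
    # variables - the shape a data mining tool expects. Assumes a
    # hypothetical transactions.csv with CustomerID, OrderDate, Amount.
    tx <- read.csv("transactions.csv", stringsAsFactors = FALSE)
    tx$OrderDate <- as.Date(tx$OrderDate)

    spend  <- aggregate(Amount    ~ CustomerID, tx, sum)     # total spend
    orders <- aggregate(Amount    ~ CustomerID, tx, length)  # order count
    recent <- aggregate(OrderDate ~ CustomerID, tx, max)     # most recent order
    names(spend)[2]  <- "TotalSpend"
    names(orders)[2] <- "Orders"
    names(recent)[2] <- "LastOrder"

    flat <- merge(merge(spend, orders), recent)              # one row per customer
    flat$DaysSinceLast <- as.numeric(Sys.Date() - flat$LastOrder)  # derived recency variable

    write.csv(flat, "customers_flat.csv", row.names = FALSE)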
Your flattened, prepared data can reside in a variety of places: R can load data from .csv files, Excel, databases via ODBC, and other formats. 2) To load your data into Rattle, choose the data source and click the Execute button. Then choose the fields/attributes/variables that you want to predict (Target) and the variables that are the influencers (Inputs) used in the prediction. You can optionally assign weightings to the inputs.
Now you can 3) start creating predictive models by choosing the desired predictive model type tab and model option.
Then you can 4) evaluate the model to check whether it is a good or poor predictor and continue experimenting until you have a good model.
When you are happy with your model's Lift/ROC and want to run new data through the model to score it for predictions and visualize it in Tableau, 5) click the Evaluate tab, choose Type: Score, Report: Class or Probability, and Include: All to generate the model scoring output as a .csv file. Then click the Execute button and select where to save the output.
Yay - so now we have predictive model output that we can start exploring and visualizing with all the awesome features within Tableau that you already know: create a dashboard, publish it, or even explore it further on a mobile device like an iPad, Android, or Surface tablet. 6) To load the predictive model output, open Tableau, choose Connect to Data, select Text file, and "ta da" - now you can use Tableau with the predictive model output just like any other data source type. If you do want to incorporate some of the R visuals that are not in the box yet, like the decision tree, you can embed that image into your Tableau workbook or dashboard like I did in the sample pictures at the start of this article.
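By the way, if you would rather script that train-evaluate-score loop than click through Rattle, here is a minimal R sketch using rpart (one of the decision tree libraries Rattle calls under the covers). The file names and the Churned target column are hypothetical:

    # Train a decision tree, score new data with class probabilities, and
    # write the output .csv for Tableau - the scripted equivalent of
    # Rattle's Model and Evaluate > Score steps. Assumes hypothetical
    # prepared files with a Churned target that has "Yes"/"No" values.
    library(rpart)

    train   <- read.csv("customers_train.csv")
    newdata <- read.csv("customers_new.csv")
    train$Churned <- factor(train$Churned)   # ensure a factor target for classification

    model <- rpart(Churned ~ ., data = train, method = "class")

    newdata$ChurnProbability <- predict(model, newdata, type = "prob")[, "Yes"]
    newdata$PredictedClass   <- predict(model, newdata, type = "class")

    write.csv(newdata, "churn_scores.csv", row.names = FALSE)

The resulting churn_scores.csv loads into Tableau exactly like the Rattle-generated scoring file from step 6 above.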
Now there is no excuse for not adding some advanced predictive modeling into your Tableau analytics. It is FREE, quite easy to do, fun, and powerful!
I do confess that this blog post was rushed - written on a plane trip from Washington DC to Tampa due to high demand after I mentioned doing this for a recent POC. There will be much more to come on this topic when I have a few spare moments to breathe! In addition to the white paper, I have a SlideShare presentation with a sample live demo of a Tableau packaged workbook from my previous post for you to play with yourself. Enjoy!
Thoughts on the Gartner 2013 Magic Quadrant for BI Platforms
***See a visualization of the rankings changes over time posted here.***
Tableau is ranked as a Leader in the 2013 Gartner Magic Quadrant for Business Intelligence report.
Customers rate Tableau #1 for ease of use and fastest development times, and it has unmatched big data visualization. Told ya so... I couldn't resist! I said as much in my last few posts. I knew that Tableau had matured into a true beauty for data lovers, the upcoming version 8 is simply amazing, and apparently the Gartner industry analysts see it in a similar light. Data discovery is now mainstream, with every vendor trying to emulate what Tableau has been doing for years. It should get interesting down the road, but Tableau appears to be way ahead of this trend and best in class in this niche space. I told my Microsoft friends Tableau was complementary, and indeed Gartner confirmed that important point as well today. I am so excited that both of my loved BI platforms were seen as stars.
This year the Gartner Magic Quadrant for Business Intelligence report took on a whole new level of special meaning for me, since I was on the technical product marketing team that submitted Microsoft's response. It was an excellent experience working with both the engineering and marketing teams, huddling up and rallying around the best answers with brilliant engineers who don't give their solutions enough of the rah-rah they deserve. The teams did a fantastic job responding, presenting, and supporting the analysts' reviews. It is a thorough process.
After reading through a few of the BI vendor detail comments, I wanted to share a few notes and takeaways. The first is that ease of use and total solution time to value are typically the most important decision factors and need to be included, or "scored", in BI vendor evaluations. Time is money, period. I see intelligent people overlook that in their RFP/RFI evaluation matrices. The happiest customers have BI platforms that they enjoy working with and that are agile and supportable. I did note that, for the second year in a row, QlikView had the worst/longest development times of any vendor reviewed. Having been in a global BI technical marketing role, listening to win/loss customer reviews around the world, I heard my fair share of customers cite QlikView pains - excessive development times and total costs of ownership far beyond what was estimated or budgeted. If you are reviewing QlikView, make them show you how it is built: the scripts, time intelligence, how to change or add new columns, how to do mash-ups, test real-time data, test incremental loads, load test it, and so forth.
The next item I want to bring up is cloud BI. Many of the vendors were given kudos about cloud BI, but in the Spotfire commentary section I noted a point that "67% of the survey population at large will never put their enterprise BI in the cloud". That is an interesting figure that I'd like to better understand. There are a couple of vendors betting big on the cloud, including Birst and Microsoft. I also find it interesting that in 2013 several of the pure in-memory players are finally waking up to the direct query advantages in the world of big data.
The mobile BI commentary in the MicroStrategy section was spot on with what I have seen in the market over the past year. The report noted that 50% of MicroStrategy customers were piloting mobile BI in 2012, and MicroStrategy was being invited to competitive bids at companies that see mobility as a strategic imperative, succeeding in replacing long-established BI vendors that lag behind in mobile BI. It is a shame that innovative MicroStrategy tumbled this year. It is a great but expensive and complex solution that takes a bit of time to get the shared metadata layers ready before you can enjoy Visual Insight and other reporting.
Congrats to a peer of mine at IBM who again rocked this report. I used to develop with Cognos, got some Cognos v10 books this year to play around with, and I do like some of the things I am seeing from them. SAP...oooh. I went to SAP TechEd last fall and left feeling exuberant about Microsoft BI. SAP is all over the place: obsessed with SAP HANA, alienating acquired Business Objects shops that are not SAP ERP, and, much like Qlik, SAP customers cite long project development times and pains. I did like SAP's Predictive Analytics with R, but I didn't see any mention of it in this report. Spotfire will be one to keep an eye on this year! I do like what they are doing with predictive and R. Since they are tied into Tibco, I wonder how much of the middleware may become a future requirement.
Lastly, I do find it a bit intriguing that vendors with a narrow scope of offerings were rated side-by-side with vendors that have a plethora of functionality across all the BI lifecycle areas. This report should be reviewed along with other reports on the specific functional areas of interest to get apples-to-apples comparisons across vendors.
I could ramble on and on, but I do need to get some sleep. To wrap up, congratulations to all the BI Leaders. The rapid industry development and vendor progression in this space are thrilling. I can't wait to see how this plays out in the upcoming years.
Combining the Best of Microsoft BI and Tableau
The past month I have been fielding a lot of calls and email inquiries - feel free to email me if you want to chat. Let me take this opportunity to reassure all of my Microsoft friends that I continue to LOVE Microsoft deeply with all my heart and soul. As mentioned in my previous post, in 2013 I will be advocating the best of combined Microsoft + Tableau solutions for many reasons. One of the biggest is that Tableau has mature, touch-friendly, browser-based mobile BI for a wide variety of mobile device types. Tableau published content is mobile-aware and touch-optimized with no extra coding or dual authoring - a publish once, view anywhere model. Tableau also has native mobile BI apps for both iOS and Android - by far the most popular BYOD mobile devices in the market today. I anxiously await the Microsoft Surface Pro and hope there will be market uptake of it. If you do have a Windows mobile device, you can use Tableau's browser-based mobile BI option. Anyone who has followed me through the years knows that I have tooled around with mobile gadgets since PocketPC and SQL Server CE in 1999/2000. I also need to apologize to Jen Stirrup, because she was recommending this Microsoft + Tableau mobile BI combination last year and I kindly asked her, as a Microsoft MVP, to downplay it... and now here I am, only a little over a year later, singing the praises of the solution she first evangelized. I am sorry, Jen - you did find a great solution and I was slow to embrace it.
Why Tableau? Tableau has won many accolades from Gartner, Forrester, and other analysts as the BI industry's data visualization "sweetheart". It has been best in class for a few years now. It also won the top award in SQL Server Pro magazine's 2012 Best BI Reporting product category. Tableau is data and BI platform vendor neutral and allows optional in-memory or direct connections. Tableau customers totally love them - much like I totally love Microsoft. I have never in my career replaced Tableau or heard bad things about them from customers. I have replaced QlikView quite often and more than once heard cries for help only months after a QlikView purchase. Spotfire seemed interesting to me, especially with the R integration for predictive, but I didn't hear the amazing kind of customer feedback from Spotfire customers that I heard from Tableau customers. I also figured that I'd find a way to combine R with Tableau. When I tested Tableau I found it user friendly and functionally deep for analytics, with gorgeous ad-hoc data visualization and a huge, dreamy array of native direct data source connectivity and visualization options. The mobile BI support for iPad and Android sealed the deal.
For over 15 years I have been a consultant recommending, building, and delivering the best mixed-technology solutions. It has only been in the past few years that I became vendor specific due to my role. I am a true engineer who listens to her customers, admits when she is wrong, calls you up when you are wrong, and likes nothing more than exploring what all the technology vendors offer in order to recommend the best solutions. It is incredibly wonderful to have customers that evolve into true friends as the years pass... I thrive on hearing that a solution I built or recommended a zillion years ago is still up, running, and doing the job today. It is no secret that I am motivated by making a difference - not by making the most money. If I can help people and be intellectually challenged while doing so, then it is a fantastic day. Thank you to all of the people who have been reaching out to me. This is what I have been up to lately. I still love Microsoft and I am still a huge data geek.
Happy New Year!
Forbes 100 Must Watch Trends in 2013 - OK, so it is not my article, but I did feel it was a "must include" for my readers to stay at the top of their professions. I had a particular appreciation for #22 Data Scientists: The New Hotshots ("Tableau is great visualization software for data science types"), #59 Mobile-Optimized Goes Mainstream, #72 Personal Data Ownership, #77 Responsive Web Design, #80 Self-Service, and #88 Tablet Shopping.
Predictive Data Visualization with Tableau
One of my New Year's resolutions was to drink a little less Microsoft Kool-Aid and become more open-minded about combined analytic solutions that pull in best-in-class offerings. Don't take that the wrong way - I still LOVE the Microsoft data platform and SQL Server, confess to using Microsoft Office daily, and even pay for an Office 365 personal account - BUT I also feel it makes a lot of sense to supplement Microsoft products when there are voids, or when the basic "for the masses" features become a limiting factor or even a liability. I believe that BI and analytics tools are strategic market differentiators. Organizations that have the best BI and analytics will have a competitive advantage, period.
Let's look at a scenario where I was unable to easily explore or visualize a large data set in Excel 2013. I needed something much more robust that could deliver interactive data visualizations at the speed of thought with data sets having hundreds of thousands or millions of records - a scenario that even the latest Excel 2013 and/or Power View simply can't handle, or compromises on with data sampling techniques. Power View starts sampling at 1,000 data points to approximate the display of large data sets. Tableau, on the other hand, rapidly rendered over 800,000 data points with ease in the last TCC keynote demos. The Microsoft Yahoo! and Klout case studies are other excellent examples where it just made sense to supplement the largest Microsoft data platform projects with Tableau for large-scale visual analytics.
Having said that, let's get to the real content of this blog's title. I wanted to take the Tableau v8 "Kraken" beta and combine it with Microsoft SSAS prediction queries (DMX) to showcase the power of combining two best-of-breed offerings. To see this in action, I posted a live demo on Tableau Public.
2012 and 2011 Blog Archive
Looking back on 2012...
Moved to Blog Archive
Unstructured Data Mining/Predictive Analytics with Mahout
Moved to Blog Archive
PASS and SharePoint Conference 2012 BI Highlights
Moved to Blog Archive
Extend Excel Data Visualizations with Office Apps
Moved to Blog Archive
What-If Analytic Scenario Simulation Options
Moved to Blog Archive
Beautiful BI Sites with SharePoint 2013
Moved to Blog Archive
Free BI Technical Training from Microsoft Product Teams
Moved to Blog Archive
Office 2013 Business Intelligence Highlights
Moved to Blog Archive
Mobile Business Intelligence
Moved to Blog Archive
FAQ: SSAS Tabular Model Security White Paper Released
Moved to Blog Archive
How to deploy SQL Server 2012 BI in a SharePoint 2010 farm
Moved to Blog Archive
FAQ: How is PowerPivot Different than Excel?
Moved to Blog Archive
Server 2012 Improvements for SAP Environments
Moved to Blog Archive
PowerPivot "How To" for Large Scale Projects
Moved to Blog Archive
SAP Hana and PowerPivot
Moved to Blog Archive
Windows 8 Consumer Preview
Moved to Blog Archive
Big Data, Hadoop and StreamInsight
Moved to Blog Archive
Social Analytics PowerPivot Solutions
Moved to Blog Archive
PowerPivot Maps, Tabs, Image Overlays and Other Creative Tips
Moved to Blog Archive
Good News for Microsoft BI on iPad
Moved to Blog Archive
Playing with the Big Data Hadoop ODBC Driver
Moved to Blog Archive
Microsoft BI Metadata, Impact Analysis, Dependency Analysis and Lineage
Moved to Blog Archive
How Microsoft Rolled Out Self-Service Business Intelligence to 40,000+ Global Users
Moved to Blog Archive
Data, Hadoop + PDW Better Together: must see for data lovers!
Moved to Blog Archive
Visio Services Data Driven Visualizations in SharePoint
Moved to Blog Archive
Extending Project Server 2010 Business Intelligence
Moved to Blog Archive
Automate Unstructured Document Data Extraction
Moved to Blog Archive
SharePoint BI/Excel Services Tip!!!
Moved to Blog Archive
Using PowerPivot as PerformancePoint Data Sources
Moved to Blog Archive
Undocumented SSRS with SSAS Stricter Parameter Naming Change
Moved to Blog Archive
SQL Server 2008R2 Data Quality and Master Data Management
Moved to Blog Archive
Analysis Services Operations Guide
Moved to Blog Archive
More MS BI with SAP
Moved to Blog Archive
TechEd 2011 BI Tips
Moved to Blog Archive
The Data Loading Performance Guide
Moved to Blog Archive
Analysis Services Performance Guide
Moved to Blog Archive
Microsoft BI with SAP
Moved to Blog Archive
Data Warehousing Best Practices and Guides
Moved to Blog Archive
Please do not copy blog content and present it as your own work without permission. For more information on copyright laws, review CopyScape and other resources.