Search Results for: Spark

Spark for Big Data Analytics [Part 3]

December 20, 2016

Next up in the Oceans of Data series covering Apache Spark, I will explore analyzing big data with SparkR, MLlib and H2O for data science and machine learning. In previous articles, …

Spark for Big Data Analytics [Part 2]

October 31, 2016

In this next Oceans of Data series article covering Apache Spark, I continue where I left off in Part 1. Let’s have fun playing with data on a Spark cluster. …

Spark for Big Data Analytics [Part 1]

October 16, 2016

In this Oceans of Data series, I will begin diving into Apache Spark. Spark is the most active open source project in big data according to Databricks’ latest survey, with 1000+ contributors …

Go Beyond Predictions. Optimize Business Impact.

November 12, 2019

Why didn’t I think of that? Sometimes we get caught up in our day-to-day lives and don’t stop to see if we are solving the right, bigger picture problems. That’s …

Highlighted Projects

July 23, 2019

The following is a list of my highlighted projects over the past two and a half decades. During the first 15 years of my career, I worked with system implementation …

Exploring IBM Watson Studio Part 1

June 11, 2018

IBM Watson Studio has come a long way since I first tested IBM Data Science Experience in November 2016. The new Watson Studio delivers a more collaborative, enterprise-quality data …

Infoworks Automated Big Data Engineering

May 14, 2018

Recently I engaged in a guided “hands-on” evaluation of Infoworks, a “no code” big data engineering solution that expedites and automates Hadoop and cloud workflows. Within four hours of logging …

DataRobot Automated Machine Learning

May 6, 2018

DataRobot is the world’s most advanced automated machine learning platform. It empowers data analysts and data scientists to rapidly find key insights and hidden data patterns and to make better predictions faster. …

Getting Started with Seahorse

April 21, 2018

Powered by Apache Spark, Seahorse is an open-source visual framework for data science pipelines. Seahorse’s compelling value proposition is that it quickly and easily allows the user to take advantage …

Gartner Magic Quadrant for Data Science and Machine Learning 2018

February 27, 2018

Last week the annual Gartner Magic Quadrant for Data Science and Machine-Learning Platforms 2018 was published. The old guard of SAS and IBM has tumbled this year, with KNIME and RapidMiner taking …

Industry Pulse November 2017 Highlights

December 5, 2017

November 2017 was another peak analytics industry event news month. I am thrilled that it is finally over. Here are the collected highlights from IBM Watson Data Science for All, …

Moving from BI to Machine Learning with Automation

October 14, 2017

Automated machine learning is an ideal innovation for business intelligence and analytics professionals to advance their careers. You don’t need to be a data scientist to get immediate predictive insights …

R-Brain: A New Data Science Platform

September 21, 2017

R-Brain is a next generation platform for data science built on top of JupyterLab with Docker. It was recently unveiled at JupyterCon in late August. Don’t let the name fool …

Industry Pulse August 2017 Highlights

August 31, 2017

So much for a slow summer. In August 2017, Python exceeded R in popularity in a recent KDnuggets poll. Both the Gartner Magic Quadrant for Data Integration 2017 and …

Why You Need a Data Catalog and How To Select One

August 30, 2017

In a digital world where data lives everywhere, enterprise data catalogs are an invaluable asset in your information architecture. Over the past two years, I mentioned data catalogs for enhancing …

Spotlight on TIBCO: Exploring Spotfire Today

August 8, 2017

Have you taken a look at TIBCO Spotfire and TIBCO’s growing ecosystem of offerings lately? My initial impression is…shock. I missed what has been happening outside of TIBCO Spotfire the …

Industry Pulse July 2017 Highlights

July 31, 2017

Here are filtered analytics industry highlights from the sweltering month of July. Augmented analytics, automation, data privacy, artificial intelligence and increased antitrust concerns dominated the news. Don’t forget you can also …

Getting Started with Python [Part 2]

July 30, 2017

In Getting Started with Python [Part 1], you were introduced to the popular Python language for data analysis. Now we are going to play with pandas, Python’s data analysis library. pandas …

Top 5 OLAP on Hadoop Design Tips

July 17, 2017

Process failed. That’s what happens when your data outgrows your OLAP technology. To solve that problem, OLAP on Hadoop was born. In a recent article, I shared classic OLAP dimensional …

Getting Started with Python [Part 1]

July 7, 2017

Python is a splendid, flexible, open source language that is easy to learn, easy to use, and has powerful libraries for data analysis and data science. I have been meaning …