Recently we’ve seen demos on several billion rows of data using older technology that stretches previous data visualization boundaries. Although numerous vendors tout big data analytics capabilities, most can’t truly deliver on that big data exploration claim when put to the test. In this article, I will discuss common challenges and share how Zoomdata, a leading big data visual analytics solution, uses microservices to interactively join, add custom calculations to, and visually explore datasets of tens of billions of rows, hundreds of billions of rows and more at the speed of thought.

Common Big Data Visual Analytics Issues

In today’s big data world, data analysts and data scientists are tasked with gaining a competitive advantage from massive volumes of collected but untapped data. Unfortunately, that is a perplexing task given the current state of modern visual analytics tools.

Modern visual analytics tools often require you to move big data into a single source for analysis. If you work with a big data source, you might be restricted by direct query connections that do not support blending any other dataset, big or small, which limits exploratory analysis. Additional challenges such as connection timeouts, out-of-memory exceptions and rendering limitations almost immediately become show stoppers. Another common challenge with big data exploration is the types of data sources required for analysis. Big data analytics projects often include streaming and unstructured data sources that mainstream visual analytics solutions cannot elegantly handle. It is a frustrating experience at best to try to visually examine the valuable data locked within those treasure chests.

Loading data...

Last summer I tested the top data visualization offerings with datasets that exceeded several billion records. Notably, none of the tested solutions performed reasonably. I consistently experienced extremely long query wait times to load data and also ran into query timeouts. In my tests, I’d wait 30 minutes, an hour or longer only to get back an out-of-memory error or a timeout message, or to watch the solution crash. If I did finally get past the data loading issues, rendering the data in a chart was my next big disappointment.

True Big Data Discovery with Zoomdata

After my failed performance testing exercises, I called peers and collected market intelligence to find out how visual exploration with big data is being done successfully. I learned numerous groups were trying to build their own solutions with limited success using open source frameworks. Another solution that came up in my calls was Zoomdata. Zoomdata solves the big data discovery problem with a unique micro-query approach. I have been following this group since 2013.

Big Data Exploration in Three Steps

Zoomdata offers the world’s fastest visual analytics solution for big, fast and unstructured data. They directly connect to the broadest set of big data sources for exploratory analysis and dashboards without requiring data to be moved. Zoomdata delivers almost immediate big data visual exploration and interactive experiences without long waits using patented data sharpening and micro-queries.

Zoomdata’s key differentiators in the big data analytics space include:

  • Supports direct query with Hadoop, search-based, NoSQL, cloud and many types of big data sources
  • Empowers anyone to enjoy rapid visualization and interactive big data discovery without the long query waits or system time-outs using unique Zoomdata sharpening and micro-query technology
  • Enables big data joins and dynamic data blending across different direct query sources without moving data with a capability called Zoomdata Fusion
  • Provides strong, live streaming data rendering along with record, rewind, and review of streaming data history with Zoomdata Data DVR features
  • Offers rich APIs and an SDK that is easy to embed or extend into any application
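
The article does not describe how Zoomdata Fusion works internally. As a rough illustration of blending across sources without moving the raw data, one common pattern is to let each source aggregate its own rows and then join only the small summaries. The sketch below assumes that pattern; the function names `aggregate` and `blend` and the sample rows are hypothetical and are not part of any Zoomdata API.

```python
from typing import Dict, Iterable, Tuple

Row = Tuple[str, float]  # (join key, measure)


def aggregate(rows: Iterable[Row]) -> Dict[str, float]:
    """Stand-in for a GROUP BY that would run inside one data source."""
    totals: Dict[str, float] = {}
    for key, value in rows:
        totals[key] = totals.get(key, 0.0) + value
    return totals


def blend(left: Dict[str, float], right: Dict[str, float]) -> Dict[str, Tuple[float, float]]:
    """Inner-join two per-source summaries on their shared key."""
    return {key: (left[key], right[key]) for key in left if key in right}


# Hypothetical rows living in two different sources.
sales_rows = [("east", 100.0), ("west", 40.0), ("east", 20.0)]
visits_rows = [("east", 7.0), ("north", 3.0), ("east", 1.0)]

# Only the aggregated summaries cross the wire, never the raw rows.
blended = blend(aggregate(sales_rows), aggregate(visits_rows))
print(blended)
```

Only keys present in both summaries survive the join, so the blended result stays as small as the number of shared groups regardless of how many raw rows each source holds.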

Zoomdata was specifically designed to handle massive, streaming and unstructured data from the start. The company has been active in the big data open source community for years presenting novel microservices solutions at Strata + Hadoop World and other well-known conferences. They were one of the first teams leading the way into a new world of visual big data discovery. In fact, they are currently partnering with SAP, Amazon, Microsoft and Google to deliver integrations that fill gaps in mainstream visual analytics tools. They also partner with Cloudera, Informatica, Atos, Deloitte and Infosys.

Zoomdata is partnering with SAP, Amazon, Microsoft and Google

Big data discovery with Zoomdata

Zoomdata enables the business to extract value from big data sources of any size without running into user or account data size restrictions. Data sharpening delivers a delightfully simple, drag-and-drop, big data visualization experience that has been sorely missing from other modern visual analytics tools.

Zoomdata has already been adopted in numerous big data rich industries. 

  • A global investment bank uses Zoomdata to monitor real-time trades, looking for anomalies and market opportunities across 10s to 100s of billions of rows and tens of thousands of transactions per minute
  • A federal government agency uses Zoomdata to support 1000s of concurrent users with 4 PB+ clusters of data and 32K+ transactions per minute
  • Telco network monitoring use cases rely on Zoomdata to monitor 330M subscribers 24×7 across 4 continents and 100s of billions of observations
  • Cybercrime operations use Zoomdata to protect more than 5 billion devices and applications by analyzing 100s of billions of observations
  • Cable network and IoT usage scenarios rely on Zoomdata to analyze device usage across 10s of millions of consumers, totaling over 480 billion observations
  • Digital advertising providers that serve real-time, personalized ad recommendations across 200 billion observations use Zoomdata to keep a pulse on what is happening
  • Pharmaceutical firms use Zoomdata to fuse clinical data in Impala with clinical notes in Solr, spanning more than 2,100 data sources and more than 5 petabytes, to give scientists an analytic advantage and actionable data-driven insights
  • Healthcare groups use Zoomdata to explore millions of detail records across entire clinical populations, identifying outliers and distributions and exploring scatterplots with numerous dimensions

Mobile big data discovery

Big data visual exploration at the speed of thought is possible with Zoomdata. Queries that took me 30 minutes or longer with other data visualization tools completed in 1 to 10 seconds with Zoomdata micro-queries and data sharpening.

Diving into Zoomdata’s Technical Architecture

To get a sense for what makes Zoomdata special in a sea of solutions that look similar, you need to understand how it works with big data and why the microservices approach is superior.

Zoomdata Technical Architecture

Zoomdata includes a set of highly optimized native “Smart Connectors” for modern big, streaming, search, and cloud data sources. Each connector is designed to leverage the capabilities of the underlying data source. For example, connectors issue faceted search queries against Elasticsearch and Solr, and micro-queries against high-volume sources like Impala, Hive and MPP data warehouses.
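
To make the idea of source-specific connectors concrete, here is a minimal sketch of one logical request (count rows per value of a field) translated into two native forms: a SQL GROUP BY for engines such as Impala or Hive, and a terms aggregation request body for Elasticsearch. The `Connector` protocol and both classes are hypothetical illustrations and do not reflect Zoomdata's actual connector interface.

```python
from typing import Protocol, Union


class Connector(Protocol):
    """Hypothetical interface: turn a logical request into a native query."""

    def count_by(self, dataset: str, field: str) -> Union[str, dict]: ...


class SqlConnector:
    """For SQL engines such as Impala or Hive: emit a GROUP BY statement."""

    def count_by(self, dataset: str, field: str) -> str:
        # Identifiers are interpolated directly for brevity; real code
        # would validate or quote them.
        return f"SELECT {field}, COUNT(*) AS n FROM {dataset} GROUP BY {field}"


class SearchConnector:
    """For Elasticsearch: emit a terms aggregation request body."""

    def count_by(self, dataset: str, field: str) -> dict:
        # size=0 asks for aggregation buckets only, with no document hits.
        return {
            "index": dataset,
            "body": {"size": 0, "aggs": {"by_field": {"terms": {"field": field}}}},
        }


sql_query = SqlConnector().count_by("trades", "region")
search_query = SearchConnector().count_by("clinical_notes", "region")
print(sql_query)
print(search_query)
```

The calling code asks the same question of every source and leaves it to each connector to phrase that question in whatever form its engine executes most efficiently.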

Micro-queries work by breaking down traditional long-running, monolithic analytic queries into a set of tiny subset queries that typically execute in a few seconds or less, even on huge datasets. As each micro-query completes, the visual display is “sharpened” in real time, delivering immediate information and immediate gratification. It is a bit like starting with a fuzzy image that sharpens into a beautiful high-fidelity image over time. Rapid rendering happens live as you interactively explore big data. You never have to wait or watch an annoying data loading icon. At any time, you can proceed with exploratory slicing, dicing, filtering, zooming or drilling down into the details.
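
The sharpening behavior described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Zoomdata's patented implementation: the dataset is an in-memory list, each fixed-size chunk stands in for one micro-query, and the names `partition` and `sharpen` are hypothetical.

```python
from typing import Dict, Iterator, List, Tuple

Row = Tuple[str, float]  # (group key, measure)


def partition(rows: List[Row], size: int) -> Iterator[List[Row]]:
    """Split the dataset into small chunks; each stands in for one micro-query."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


def sharpen(rows: List[Row], size: int) -> Iterator[Dict[str, float]]:
    """Yield a running SUM per group after each micro-query completes."""
    totals: Dict[str, float] = {}
    for chunk in partition(rows, size):
        for key, value in chunk:
            totals[key] = totals.get(key, 0.0) + value
        # Snapshot of the partial result; a chart could redraw from it.
        yield dict(totals)


rows = [("east", 10.0), ("west", 5.0), ("east", 2.0), ("west", 1.0), ("east", 3.0)]
snapshots = list(sharpen(rows, size=2))
for step, snapshot in enumerate(snapshots, start=1):
    print(step, snapshot)
```

Each snapshot is a usable partial answer, so a chart can be drawn after the first chunk and refined as later chunks arrive. The final snapshot equals the result of the single monolithic query.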

Big data discovery with Zoomdata

The latest Zoomdata release, January 2018, adds smarter push-down processing, patented intelligent querying that queries and renders only what is currently visible, next-generation streaming analytics, massive scale through in-cluster execution, and new personalization and scorecard visualizations to an already impressive offering.

For More Information

Successful digital era businesses need to be able to explore massive volumes of data stored in data lakes or other big data sources to make fast, actionable, insights-driven decisions. Zoomdata is an ideal solution for those use cases. It empowers self-service, big data exploration, analytics, and visualization with no data movement or complex data modeling required. Zoomdata is simple enough for anyone to visually analyze big data in a governed manner. It can be used stand-alone, combined with your favorite visual analytics tools, embedded into applications, or run on mobile devices delivering real-time, big data analytics anywhere, anytime.

To learn more about Zoomdata, please review the following resources.