GPU-accelerated computing is the next frontier of computational speed and efficiency. Several weeks ago, I wrote an article explaining how GPU acceleration works and why analytics pros should care about it. Pairing the GPU with the CPU speeds up applications: thousands of GPU cores process bulk tasks in parallel, versus the serial manner of CPU processing. As a result, GPU databases perform orders of magnitude faster than the CPU-based columnar databases and in-memory architectures that most analytics applications, such as Tableau, leverage.
In this article, I continue to explore GPU computing, common analytics use cases, and the added benefits of specifically combining Tableau with Kinetica, including:
- Intelligent dashboards with TensorFlow, Caffe or Torch
- Real-time dashboards with Apache NiFi, Kafka or Spark
Kinetica is a GPU-accelerated, in-memory analytics database that delivers real-time response to queries on large, complex, and streaming data sets. With GPU technology, thousands of small, efficient cores can execute repeated similar instructions in parallel. Built from the ground up to take advantage of GPUs, Kinetica is well-suited for compute-intensive analytics workloads on large data sets.
Key Kinetica benefits for analytics pros include the minimization of database tuning, indexing, aggregation, and storage of pre-aggregated data in data marts. GPU acceleration with Kinetica provides 100X query performance improvements either on commodity, industry-standard hardware from IBM, Cisco, Dell, and HPE or on GPU instances in the cloud from Amazon, Azure or Google Cloud Platform.
Kinetica consistently outperforms leading in-memory and NoSQL databases, delivering faster ingest speeds and millisecond response times on complex OLAP queries across hundreds of billions of records. It is often used as a speed layer in modern analytics architectures. Tableau and Kinetica together make data analytics simple and instantaneous, so you can query massive datasets in seconds.
Kinetica’s database runs completely in-memory to optimize throughput and deliver fast query performance. A tiered memory management approach ensures that hot, warm, and cold data can be distributed across GPU VRAM and system memory to balance capacity and performance. A column-oriented database design ensures that the data structures are optimized for in-memory management and fast analytics. A relational database model with familiar concepts such as tables, columns, and SQL support ensures that Kinetica is easy to deploy, use, and manage.
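To make the column-oriented point concrete, here is a small sketch in plain Python (not Kinetica code, and the table is invented for illustration) of why storing each column contiguously speeds up analytic scans: an aggregate over one column only has to touch that column's values, instead of walking every full record.

```python
# Illustrative only: row-oriented vs column-oriented storage of the
# same tiny table, and how each layout serves an aggregate query.

rows = [
    {"id": 1, "region": "EU", "amount": 120.0},
    {"id": 2, "region": "US", "amount": 75.5},
    {"id": 3, "region": "EU", "amount": 50.0},
]

# Column-oriented layout: one contiguous list per column.
columns = {
    "id": [r["id"] for r in rows],
    "region": [r["region"] for r in rows],
    "amount": [r["amount"] for r in rows],
}

# Row store: the scan visits every record to aggregate one field.
total_row_store = sum(r["amount"] for r in rows)

# Column store: the aggregate reads a single contiguous list,
# which is also the memory pattern GPU cores process efficiently.
total_col_store = sum(columns["amount"])

assert total_row_store == total_col_store == 245.5
```

The same principle is why columnar data structures pair well with in-memory management and parallel hardware: identical operations over contiguous values.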
With a distributed, scale-out architecture, a Kinetica cluster contains data sharded across multiple nodes to leverage parallelization for ingest, analytics, and visualization. Additional nodes can be added for scale-out to improve query performance and system capacity. Kinetica can power real-time Tableau queries, reports, and dashboards by simultaneously ingesting data into the same tables that are being queried by Tableau.
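The sharding idea above can be sketched generically. The hash scheme, function names, and shard-key choice below are illustrative assumptions about how hash-based sharding works in general, not Kinetica's actual placement algorithm:

```python
# Hypothetical sketch of hash-based sharding: map each record's shard
# key to one of N nodes with a stable hash, so ingest and queries can
# run on all nodes in parallel.
import hashlib

NUM_NODES = 4

def shard_for(key: str) -> int:
    """Map a shard key to a node index with a stable hash."""
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % NUM_NODES

records = ["order-1001", "order-1002", "order-1003", "order-1004"]
placement = {key: shard_for(key) for key in records}

# Every record lands on exactly one valid node, deterministically,
# so the same key always routes to the same shard.
assert all(0 <= node < NUM_NODES for node in placement.values())
```

Adding nodes changes `NUM_NODES` and redistributes keys, which is the essence of scale-out for both capacity and query performance.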
Common Use Cases
Organizations use Kinetica to simultaneously ingest, explore, analyze, and visualize data within milliseconds. US Army intelligence analysts have been conducting near real-time analytics with Kinetica on SIGINT, ISR, and GEOINT streaming, geospatial, and temporal big data feeds for a major joint cloud initiative. In that project, they query and visualize billions to trillions of near real-time objects in a mission-critical environment. Previously, this group used 42 Oracle 10gR2 servers that took 92 minutes to process its queries. Now a single Kinetica GPU database server completes the same queries within 20 milliseconds, uses 42 times less storage, and costs 28 times less than Oracle.
USPS is another Kinetica customer. As the largest logistics entity in the country, USPS employs roughly 600,000 people, uses 215,000 vehicles, and delivers 500 million pieces of mail to 154 million addresses daily. Over 200,000 USPS devices emit location data every minute, making USPS a classic large-scale IoT analytics operation. Its parallel Kinetica cluster serves up to 15,000 simultaneous sessions, providing the service’s managers and analysts with the capability to instantly analyze geospatial dashboards.
Kinetica customers span financial services, retail, oil and gas, life sciences, utilities, and media and entertainment. In one financial services use case, a large European bank was able to move from batch, overnight analysis to a streaming, real-time system for counterparty risk analytics. With increased regulatory requirements, banks struggle to calculate trading book fair value: valuation adjustments must project years into the future, making risk computations complex and computationally heavy. The bank turned to Kinetica’s GPU-accelerated database to run custom risk algorithms in-database at scale, computing risk metrics for each trade in real time to accurately measure the fair value of its trading book and manage risk.
Kinetica + Tableau
Tableau combined with Kinetica helps you do more with your data. As a speed layer for instantaneous insights, Kinetica solves slow-query pains and eliminates the need for data preparation, tuning, and aggregation. Free-form big data discovery is finally possible with Tableau using Kinetica’s GPU database. To get started accelerating Tableau workbooks quickly, you can use Tableau’s Replace Data Source feature. In a recent Kinetica deployment, the customer simply pointed their Tableau workbooks to Kinetica’s ODBC or JDBC connector. Since Kinetica is SQL-92 compliant, minimal changes were needed, and they immediately gained 100x faster reporting speeds.
For developing intelligent Tableau dashboards, Kinetica provides access to C++, Java, and Python-based User Defined Functions (UDF). When combined with Tableau’s TabPy feature, Kinetica’s UDF framework democratizes data science by making machine learning, deep learning, and custom functions available to non-technical business users through Tableau dashboards.
Kinetica UDFs can be used with Tableau and TabPy as an orchestration layer for machine learning. UDFs can receive table data, execute calculations, and save results. With direct access to APIs, compute-to-grid analytics can be accomplished with custom or packaged code. Analytics pros can deploy machine learning or artificial intelligence libraries written in TensorFlow, Caffe, Torch, Python, Java, C++, and other languages with Tableau.
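As a hedged illustration, here is the kind of plain Python function an analyst might expose to Tableau through TabPy, or wrap inside a Kinetica UDF. The z-score outlier logic, function name, and threshold are my assumptions for the example, not a built-in feature of either product:

```python
# Illustrative scoring function: flag values far from the mean.
# In a TabPy setup, a function like this would be deployed to a
# running TabPy server and invoked from a Tableau calculated field
# (e.g. via SCRIPT_INT); in Kinetica it could run in-database as a UDF.
import statistics

def zscore_outliers(values, threshold=2.0):
    """Return 1 for values more than `threshold` std devs from the mean."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        return [0 for _ in values]
    return [1 if abs(v - mean) / stdev > threshold else 0 for v in values]

# Eight normal readings and one spike: only the spike is flagged.
flags = zscore_outliers([10] * 8 + [100])
assert flags == [0] * 8 + [1]
```

The appeal of the pattern is that a non-technical dashboard user never sees this code: they interact with a filter or a highlighted mark, while the scoring runs server-side.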
Kinetica can also enable IoT streaming analytics with Tableau for real-time reporting from sensors, connected devices, social media, and mobile apps. Kinetica connectors for Apache Kafka, Apache NiFi, Apache Storm, and Apache Spark can ingest large, complex data in parallel, making streaming data available for Tableau to query in real time.
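Streaming connectors like these generally rely on a batched-ingest pattern: buffer incoming records and flush them to the database in bulk rather than row by row. The `BatchWriter` class, batch size, and record shapes below are illustrative assumptions, not Kinetica's connector API:

```python
# Hedged sketch of micro-batched ingest, the common pattern behind
# Kafka/NiFi-style connectors feeding an analytics database.

class BatchWriter:
    def __init__(self, batch_size, sink):
        self.batch_size = batch_size
        self.sink = sink          # callable performing the bulk insert
        self.buffer = []

    def write(self, record):
        self.buffer.append(record)
        if len(self.buffer) >= self.batch_size:
            self.flush()

    def flush(self):
        if self.buffer:
            self.sink(list(self.buffer))
            self.buffer.clear()

# Stand-in for bulk inserts into a database table: collect each batch.
batches = []
writer = BatchWriter(batch_size=3, sink=batches.append)

for event in range(7):            # simulate a stream of 7 sensor readings
    writer.write({"sensor_id": event % 2, "value": event * 1.5})
writer.flush()                    # drain the partial final batch

assert [len(b) for b in batches] == [3, 3, 1]
```

Batching amortizes per-insert overhead, which is what lets a connector keep ingesting in parallel while dashboards query the same tables.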
To Learn More
In this article, I barely scratched the surface of what is possible with Kinetica GPU-powered analytics and Tableau. For more information, please join me in the upcoming webinar, “5 Steps to Smarter, Faster, Simpler Tableau Dashboards,” on Thursday, July 20th, at 10am PT. You can also explore the plethora of technical resources provided on the Kinetica website and in the online docs.