How are you doing on those New Year’s resolutions? You have to be doing better than me. I get an F for January: total failure. Despite all the workout gear and live-healthy gifts that I received over the holidays, I have not made a single healthy or work/life balance change yet. Instead, I have been indulging in life’s simple pleasures: scented candles, books, an active bird feeder, thriving plants including a lovely new bunch of purple calla lilies, and the pups. So on Monday, February 1, I am going to give my live-healthy goals and work/life balance resolutions another try. If at first you don’t succeed, try, try again.
Where have I been successful this month? I am getting my top wish list content out the door. I am totally amazed that ~60,000 of you checked out the SQL Server 2016 BI SlideShare in just a few weeks. That makes it my second most popular shared deck of all time! Only my data visualization best practices deck is beating it today, and that presentation has been posted for a few years. My most viewed blog post this month, the Star Wars themed industry trends article, also broke records.
BI Industry Hot Topics
One of the BI industry hot topics also happens to be one of my personal hot topics, self-service BI governance. Self-service BI governance is critical for long-term success and avoiding an enterprise reporting data mess. I have written about this topic before and yet again earlier this year from a different perspective. Now I want to revisit the original framework for a few reasons:
- Data lakes in modern information delivery technical architectures
- Enterprise data catalog and the significant value that it brings to self-service BI in a bring your own reporting tool (BYORT) world
- Hybrid business intelligence considerations
- Changes with Power BI 2.0 and SQL Server 2016
Yesterday I led a self-service BI governance webinar along with an uber-talented peer of mine, Adam Saxton. The presentation is now available on-demand and on SlideShare. Note that this session is only Part 1 in an upcoming series.
In this session, I shared one example of how IT and BI Pros can empower the self-service BI masses by structuring their information delivery approach with governance steps. The process is similar to the one I showed in the past. This time I added streaming, big data and IoT sources, Enterprise Data Lake, Enterprise Data Catalog, Master Data Services and Data Quality Services.
The goal is to provide information that is reliable and timely for self-service reporting. There is also a balancing act with ensuring data availability versus data quality depending on the use case. For data scientists that need to immediately explore information, there is less control. For enterprise-wide KPI, financial or compliance related reporting, that data should be cleansed and blessed before letting the business go wild building reports with it. Ultimately governance will be uniquely structured to best serve the unique needs of varied organizations and data-driven cultures.
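To make that balancing act a bit more concrete, here is a minimal Python sketch of governance steps keyed by use case. Everything in it is an assumption for illustration only: the tier names, the use-case labels, the step names and the `required_steps` helper are hypothetical and do not come from any Microsoft product or from the webinar.

```python
from enum import Enum
from typing import List


class Tier(Enum):
    EXPLORATORY = "exploratory"  # less control, e.g. immediate data science exploration
    CERTIFIED = "certified"      # cleansed and blessed before business reporting


# Hypothetical mapping from use case to governance tier.
USE_CASE_TIER = {
    "data_science_exploration": Tier.EXPLORATORY,
    "enterprise_kpi": Tier.CERTIFIED,
    "financial_reporting": Tier.CERTIFIED,
    "compliance_reporting": Tier.CERTIFIED,
}

# Hypothetical governance steps required before data is released in each tier.
TIER_STEPS = {
    Tier.EXPLORATORY: ["register_in_catalog"],
    Tier.CERTIFIED: ["register_in_catalog", "cleanse", "steward_approval"],
}


def required_steps(use_case: str) -> List[str]:
    """Return the governance steps for a use case.

    Unknown use cases fall back to the stricter CERTIFIED tier.
    """
    tier = USE_CASE_TIER.get(use_case, Tier.CERTIFIED)
    return list(TIER_STEPS[tier])


print(required_steps("data_science_exploration"))
print(required_steps("financial_reporting"))
```

Defaulting unknown use cases to the stricter tier is one possible design choice. An organization with a more exploratory data culture might reasonably choose the opposite default.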
Enterprise Data Lakes
In the global BI market, we are seeing a growing number of organizations adopt enterprise data lakes. A data lake is a hyper-scale repository for big data analytics workloads. It enables you to capture data of any size, shape, type and ingestion speed in one single place for operational and exploratory analytics. It removes the slow, old school ETL complexities by ingesting and storing all of your data while making it faster to get up and running with batch, streaming, and interactive analytics. It also brings new considerations and capabilities for delivering even faster self-service BI to top data science and power user talent.
I shared new information delivery architectural design patterns in my presentation on real-time analytics last year. The modern patterns are quite different from traditional data warehousing methodology. At times, they contradict just about everything we did in this space a few years ago.
- Collect everything: A data lake contains all data, both raw sources over extended periods of time as well as any processed data
- Dive in anywhere: A data lake enables users across multiple business units to refine, explore and enrich data on their terms
- Flexible access: A data lake enables multiple data access patterns across a shared infrastructure: batch, interactive, online, search, in-memory and other processing engines
I find these new approaches and related technological advances to be absolutely fascinating. It is fun as a data enthusiast to see how analytics with exponential data volumes is really being accomplished today. I loved Intel’s recent white paper on how they put 140,000 data sources, including the data warehouse data, into their data lake. Chatting with other groups, I usually see the data warehouse used in a parallel fashion with the data lake.
Earlier this week I had a conversation about analytics with a group capturing telemetry data at a pace of 30 petabytes an hour. Yes, you read that correctly. It is an IoT events workload with massive amounts of data streaming in. What an awesome project.
If the data lake topic interests you, check out the upcoming Azure Data Lake webinar that will cover components, architectural layers for storage and compute, new U-SQL, and design patterns for organizing and processing big data pipelines.
Enterprise Data Catalog
The other topic that is near and dear to my heart is Enterprise Data Catalog. If you have a data lake, then you will want a data catalog for self-service BI. Even if you don’t have a data lake, it is likely that there are hundreds or thousands of data sources in your organization that your business users struggle to find, or don’t even know about, that would be invaluable to include in their decision-making processes. Enterprise data catalog supports an ever-growing array of data source types including but not limited to Azure SQL DB, SQL Server, Analysis Services, Reporting Services Reports, Oracle, Azure Storage, Azure Data Lake, HDFS, Apache Hive Tables, Teradata, MySQL and SAP HANA. It also supports a growing list of self-service BI tools that will not be limited to Microsoft offerings. If you want a quick overview, I covered Enterprise Data Catalog in a previous article.
In the self-service BI governance session, I demonstrated Azure Data Catalog and showcased how Microsoft is already using this fabulous tool in our self-service BI processes. I personally value the following features the most.
- Data source publish and search
- Linked data subject matter experts
- Added data and metadata context
- Data previews and profiles
- Linked data source documentation
- Open in my self-service BI tool of choice
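To illustrate how several of those features fit together, here is a small Python sketch of a catalog entry with search. It is a toy model under my own assumptions, not the Azure Data Catalog API: the `CatalogEntry` fields, the `search` function and the sample entries (including names, email address and URL) are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CatalogEntry:
    """Hypothetical record for one published data source."""
    name: str
    source_type: str
    description: str = ""                              # added business context
    experts: List[str] = field(default_factory=list)   # linked subject matter experts
    tags: List[str] = field(default_factory=list)      # added metadata
    documentation_url: str = ""                        # linked documentation


def search(entries: List[CatalogEntry], term: str) -> List[CatalogEntry]:
    """Case-insensitive match against name, description and tags."""
    needle = term.lower()
    return [
        e for e in entries
        if needle in e.name.lower()
        or needle in e.description.lower()
        or any(needle in t.lower() for t in e.tags)
    ]


# Made-up sample entries for illustration.
catalog = [
    CatalogEntry(
        name="SalesOrders",
        source_type="SQL Server",
        description="Order headers and lines for all regions",
        experts=["jane@example.com"],
        tags=["sales", "certified"],
        documentation_url="https://example.com/docs/sales-orders",
    ),
    CatalogEntry(
        name="DeviceTelemetry",
        source_type="Azure Data Lake",
        description="Raw IoT events",
        tags=["iot", "raw"],
    ),
]

for entry in search(catalog, "sales"):
    print(entry.name, entry.source_type, entry.experts)
```

The point of the sketch is that the search hits carry their context with them: who to ask, what the data means and where the documentation lives.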
Additional Key Self-Service BI Capabilities
For self-service BI to be successful, you can’t just tell the business users where the data sources reside and expect them to figure out the rest on their own. Often database column names make no sense to a business user. ERP and CRM application databases like Dynamics, Salesforce, Oracle Financials or SAP are nearly impossible to navigate even with the best self-service BI tools. There are many other reasons that I am not citing here. The bottom line is that non-technical business users do need a user-friendly reporting layer, also known as a BI Semantic Model.
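As a tiny illustration of the column naming problem, here is a Python sketch that presents cryptic source columns under business-friendly names. The column names, the `FRIENDLY_NAMES` mapping and the `to_business_view` helper are hypothetical. A real BI Semantic Model also handles relationships, measures and business logic, which this sketch does not attempt.

```python
from typing import Any, Dict, List

# Hypothetical mapping from cryptic source columns to friendly names.
FRIENDLY_NAMES = {
    "CUSTNM01": "Customer Name",
    "ORD_DT": "Order Date",
    "NET_AMT_LC": "Net Amount (Local Currency)",
}


def to_business_view(
    rows: List[Dict[str, Any]], friendly_names: Dict[str, str]
) -> List[Dict[str, Any]]:
    """Rename known columns; columns without a mapping keep their source name."""
    return [
        {friendly_names.get(col, col): value for col, value in row.items()}
        for row in rows
    ]


# Made-up source row for illustration.
source_rows = [
    {"CUSTNM01": "Contoso", "ORD_DT": "2016-01-28", "NET_AMT_LC": 1250.0, "ZZ_FLAG": "X"},
]

print(to_business_view(source_rows, FRIENDLY_NAMES))
```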
While I am on the topic of the BI Semantic Model, it is also important that it works with multiple reporting tools in a BYORT world.
Don’t trap reporting dimensions, facts, business logic or data in a proprietary BI Semantic Model that only one tool can use. In other words, don’t lock up analytical assets in a “data jail”.
Just like Excel Power Pivot, Power BI’s underlying engine is powered by an improved, modernized Analysis Services in-memory, columnar compressed database. Enterprise flavors of Analysis Services BI Semantic Models can be used with Excel, Power BI Desktop, Reporting Services and a plethora of other third-party vendor reporting and self-service BI tools. Don’t overlook or underestimate the importance of analytical freedom in your enterprise self-service BI strategy.
Other areas that were covered in the self-service BI governance webinar include:
- Master Data Services
- Data Quality Services
- Enterprise Data Gateway
- Organizational Content Packs
- Power BI Administration
- Mobile BI Administration
- Development Life Cycle
Each of these topics is worthy of a few blogs and deep dive sessions to truly appreciate their capabilities and related self-service BI reporting potential. Stay tuned for more webinars and articles on self-service BI governance technologies and other industry hot topics.