It’s hard to believe, but back in the ’90s people actually said, “The Internet changes everything.” Now it’s something else: 89% of business leaders believe big data will revolutionize business operations in the same way the Internet did. But nearly a decade into the big data era, only 28% of businesses believe that they are generating strategic value from the data they collect, and nearly 40% admit that they need a plan to take advantage of big data.
Investment in Hadoop is “tentative”
These statistics neatly capture both the potential and pitfalls of Hadoop, an open-source software framework for distributed storage and distributed processing of very large data sets. According to Gartner, Inc.’s 2015 Hadoop Adoption Study, only 26% of respondents claim to be either deploying, piloting, or experimenting with Hadoop. Eleven percent of survey participants plan to invest in it within 12 months, and 7% are planning investment within 24 months.
What’s holding back Hadoop?
Netting it all out, Gartner says investment in Hadoop remains tentative “in the face of sizable challenges around business value and skills.” Here’s why: Hadoop is great for data storage and processing. But the same capabilities that make Hadoop great for running complex calculations on massive amounts of unstructured data make it not-so-great for analytics. Specifically:
1. Hadoop is not designed to answer analytics questions at business speed. Most SQL-on-Hadoop engines are designed to do full table scans, but tend to be too slow for the bread and butter of interactive analysis: individual record lookups, range scans, and highly iterative analytic scenarios. For example, to see weekly close-of-business sales, compared to the month prior, for products that fall into category x but don’t belong to region r, you need data to be organized in an analytic-ready format. Otherwise you will be joining many tables and waiting a long time for your results, and if you change category x to category y, you will have to rinse, repeat, and start over again.
2. Hadoop is not built to handle high-volume user concurrency. Hadoop delivers high throughput; queries can be very complex and touch most, if not all, of the data in the system at any time. This is powerful, but it does not support many queries at once, which is exactly what large user populations generate when they demand information regularly and simultaneously.
3. Hadoop is not consumable by business users. Today, the simplest reporting interface for getting data out of Hadoop is SQL (via tools like Spark, Impala or Hive). This is simply not a skill business users have. They think in terms of revenue, sales, and customers, and not joins, where clauses, and select statements.
As a result, IT becomes the Hadoop gatekeeper, increasingly burdened to prepare data for each new business question or use case. This is manageable for pilot projects but untenable when insights are needed at the point of impact, every day, where new questions are asked every hour.
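To make the first point concrete, here is a minimal sketch of the “category x but not region r” question, asked two ways. Everything in it is an assumption for illustration: the table names (`sales`, `products`, `stores`, `weekly_sales`), the sample rows, and the use of Python’s built-in SQLite as a stand-in for a SQL-on-Hadoop engine. It is not a benchmark; it only shows the difference in shape between joining raw tables for every question and reading from a table that has already been organized for analysis.

```python
import sqlite3

# In-memory SQLite stands in for a SQL-on-Hadoop engine (an assumption for
# illustration only). All table names and rows below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_id INTEGER PRIMARY KEY, category TEXT);
CREATE TABLE stores   (store_id   INTEGER PRIMARY KEY, region   TEXT);
CREATE TABLE sales    (product_id INTEGER, store_id INTEGER, week TEXT, amount REAL);

INSERT INTO products VALUES (1, 'x'), (2, 'y');
INSERT INTO stores   VALUES (10, 'r'), (20, 's');
INSERT INTO sales VALUES
  (1, 10, '2015-W01', 100.0),
  (1, 20, '2015-W01',  40.0),
  (1, 20, '2015-W05',  60.0),
  (2, 20, '2015-W05',  75.0);
""")


def weekly_sales_by_join(category, excluded_region):
    """Answer the question from raw tables: every call repeats both joins."""
    return conn.execute(
        """
        SELECT s.week, SUM(s.amount)
        FROM sales s
        JOIN products p ON p.product_id = s.product_id
        JOIN stores  st ON st.store_id  = s.store_id
        WHERE p.category = ? AND st.region <> ?
        GROUP BY s.week
        ORDER BY s.week
        """,
        (category, excluded_region),
    ).fetchall()


# "Analytic-ready" format: the joins and the weekly roll-up are done once,
# ahead of time, so each new question is a filter over a small summary table.
conn.executescript("""
CREATE TABLE weekly_sales AS
SELECT p.category, st.region, s.week, SUM(s.amount) AS amount
FROM sales s
JOIN products p ON p.product_id = s.product_id
JOIN stores  st ON st.store_id  = s.store_id
GROUP BY p.category, st.region, s.week;

CREATE INDEX idx_weekly_sales ON weekly_sales (category, week);
""")


def weekly_sales_from_summary(category, excluded_region):
    """Answer the same question from the pre-aggregated table, with no joins."""
    return conn.execute(
        """
        SELECT week, SUM(amount)
        FROM weekly_sales
        WHERE category = ? AND region <> ?
        GROUP BY week
        ORDER BY week
        """,
        (category, excluded_region),
    ).fetchall()


if __name__ == "__main__":
    # Changing category x to category y only changes a parameter here.
    for category in ("x", "y"):
        print(category, weekly_sales_from_summary(category, "r"))
```

Both functions return the same rows; the difference is where the work happens. With raw tables, swapping category x for category y reruns the joins from scratch. With the summary table, it is just a different filter value.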
How to bring Hadoop into the enterprise mainstream
The good news is, there are ways to up-level Hadoop to meet the demands of business users. In his new report, “Market Overview: Boost Your Customer Insights With BI-On-Hadoop: Six Steps To Bridge The Worlds Of Business Intelligence And Big Data,” Forrester vice president and principal analyst Boris Evelson “analyzes the market landscape and provides vendor segmentation of business intelligence (BI) platforms designed to process and analyze Hadoop-based data.” (The report can be accessed by Forrester clients and can also be purchased.)
In the report Evelson says, “Existing BI architectures are not flexible enough. Most organizations take too long to get to the ultimate goal of a centralized BI environment, and by the time they think they are done, there are new data sources, new regulations, and new customer needs, which all require more changes to the BI environment.”
This is a great point. In my next post, I’ll talk about the specifics of using BI to bring Hadoop up to the performance standards enterprise business users expect. In the meantime, to learn more about Birst’s flexible BI architecture, click here.