Put the BI in Big Data by going data native
Customer drop-off, supply chain problems, lack of competitor awareness -- there is a plethora of critical business problems out there, but they share one answer -- actionable big data. For any company struggling to decide what direction to take its business, there is often a disconnect between its business intelligence (BI) and its big data.
Companies are amassing more data than ever, with Gartner predicting that data volumes will grow by 800 percent over the next five years -- and that 80 percent of that data will be unstructured. Herein lies the breakdown -- companies need robust storage, processing, and analysis to make that unstructured data actionable and, therefore, capable of stopping critical business problems in their tracks. But many companies’ existing data architecture investments are holding them back. They are hamstrung by legacy business intelligence and data visualization tools that aren’t geared toward solving complex big data problems.
Imagine a self-driving car maker or a cyber threat monitoring company that couldn’t review its systems’ data in real time. Neither would be a safe option, and neither business would outlast the competition. If analytics, or BI tools, can’t operate where the data lives, then data must be shipped from system to system. That reduces visibility and lengthens time to action. As a result, companies can’t keep up with their data -- or business realities -- and fall behind those that can.
There is a solution -- a data-native approach to increased data demand. In other words, companies must build data-centric applications on an analytics framework that runs in a distributed fashion where the data resides, without moving the data. There is a slew of popular data platforms aimed at the problem -- Apache Hadoop, Amazon S3, Apache Solr, and MongoDB, to name a few -- and each has its benefits. A native approach to a framework like Hadoop -- used by mega-companies such as Facebook, Intel, and Yahoo -- enables those companies to get actionable insights on demand, in as close to real time as possible.
The Consequences of Not Going Native
Without a data-native approach, features of a traditional BI stack come unglued. Time to insight is slower, and the ability to drill down to detail is diminished.
For one, traditional BI architectures require more system management and administration, and much of that time is dedicated to data movement, transformation, and aggregation. The data is often stored inside Hadoop in a completely raw format, which then needs to be transformed and refined. Then it needs to be aggregated and summarized in another system for reporting and analytics, given the scaling limitations of traditional BI tools. As data is aggregated and replicated, it loses detailed information -- which is never a good thing. This setup provides limited drill-down capabilities, meaning the same data that could be analyzed natively will not only be more expensive to manage, but also less useful. A native tool would push analytic computation and visualization down as close to the data as possible, with no data movement required.
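To make the loss of detail concrete, here is a minimal sketch in plain Python. It is an illustration only: the event records, field names, and both function names are hypothetical, and an in-memory list stands in for data that would really live in a cluster. The first function mimics exporting a summary to a separate reporting system; the second answers a drill-down question against the raw records.

```python
from collections import defaultdict

# Hypothetical raw events; in practice these would sit in the cluster.
RAW_EVENTS = [
    {"user": "u1", "region": "east", "page": "pricing", "seconds": 40},
    {"user": "u2", "region": "east", "page": "signup", "seconds": 15},
    {"user": "u3", "region": "west", "page": "pricing", "seconds": 65},
    {"user": "u1", "region": "east", "page": "signup", "seconds": 20},
]


def summarize_by_region(events):
    """Mimic an export to a reporting system: keep only totals per region."""
    totals = defaultdict(int)
    for event in events:
        totals[event["region"]] += event["seconds"]
    return dict(totals)


def drill_down(events, region, page):
    """Answer a detail question directly against the raw events."""
    return [
        event["user"]
        for event in events
        if event["region"] == region and event["page"] == page
    ]


summary = summarize_by_region(RAW_EVENTS)
detail = drill_down(RAW_EVENTS, "east", "signup")

print(summary)  # per-region totals; users and pages are no longer recoverable
print(detail)   # users who hit the signup page in the east region
```

The summary keeps only one number per region, so a question such as "which users reached the signup page in the east?" can be answered only from the raw events. That is the drill-down capability the aggregated copy gives up.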
When a company has to dedicate this amount of time and resources just to keep its data architecture churning, it is also typically harder to integrate new features into the end user application. That means that the punch list of data points a marketing department decides to use on day one will largely be in place for the lifecycle of the data architecture, and new features thought up even during the implementation process will likely be left on the drawing board. All this makes it more likely that -- despite the best intentions -- addressing critical business problems will continue to seem out of reach.
The Promise of a Native Approach
By running BI and analytics within an architecture like Hadoop in a native fashion, many of the above consequences flip to the other side of the coin: Analytics are scalable, functional, and easy to use -- all without any data movement.
As a company scales up, its big data efforts need to expand -- after all, the amount of data it collects certainly has. A company running a native data and analytics solution can work with massive data sets that are accessible by hundreds of users at once. This lets employees collaborate to achieve the company’s business intelligence goals.
The power of having data in one place is that comprehensive, deep analytics can be performed right on top of it. Users can also put more data to work in new ways, since native integration is well suited to behavioral data analytics and other insights that draw on complex, semi-structured, and unstructured data. (Relational databases, meanwhile, are more geared toward transactional data.) This data-native approach takes less time, pushing response times into the sub-second range, because there are no data marts and far fewer ETL steps. Employees can start exploring their raw data in real time, right where the data lives.
All of this improves the performance of interactive BI and analytics, but being data-native also improves security, since the data never leaves the cluster, and functionality, since a business can now drill down into customer engagement flows and conversion funnels through behavior-based segmentation. With data that is accessible, super-fast, and easy to explore, companies can use web-based dashboards with visualization tools that make it easier to present discoveries and give customers, suppliers, and partners self-service access to the same insights, so they can ask deeper questions on their own.
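As a rough illustration of a conversion funnel with behavior-based segmentation, the sketch below splits users by whether they ever performed a given action and counts how many in each group reached each funnel step. Everything here is an assumption made for the example: the event tuples, the step names, the `watch_demo` action, and the `funnel_by_segment` function. It also runs on an in-memory list rather than inside a cluster.

```python
# Hypothetical funnel steps and (user, action) events.
FUNNEL = ["visit", "signup", "purchase"]

EVENTS = [
    ("u1", "visit"), ("u1", "watch_demo"), ("u1", "signup"), ("u1", "purchase"),
    ("u2", "visit"), ("u2", "signup"),
    ("u3", "visit"),
    ("u4", "visit"), ("u4", "watch_demo"), ("u4", "signup"),
]


def funnel_by_segment(events, steps, segment_action):
    """Count users reaching each funnel step, split by a behavior.

    A user counts toward a step only if every earlier step was also seen.
    Event order is ignored in this simplified version.
    """
    actions_by_user = {}
    for user, action in events:
        actions_by_user.setdefault(user, set()).add(action)

    counts = {"did": [0] * len(steps), "did_not": [0] * len(steps)}
    for done in actions_by_user.values():
        row = counts["did" if segment_action in done else "did_not"]
        for index, step in enumerate(steps):
            if step not in done:
                break  # the user dropped out of the funnel here
            row[index] += 1
    return counts


result = funnel_by_segment(EVENTS, FUNNEL, "watch_demo")
for segment, row in result.items():
    print(segment, dict(zip(FUNNEL, row)))
```

Because the segment is derived from the raw events themselves, changing the segmenting behavior is a one-argument change rather than a new export. This kind of question is hard to ask of a pre-aggregated summary.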
While traditional BI tools will always be necessary, it’s imperative to have a way to perform native data analysis at big data scale. By keeping data and analytics in one primary place, such as a native data and analytics architecture, businesses can save money and time while increasing analytics performance and functionality. This empowers businesses to quickly visualize data that can drive their day-to-day decision making and support new customer-facing data applications -- ensuring they can immediately address their critical business problems and create sustained differentiation by leveraging big data with greater ease.
Priyank Patel is co-founder and CPO of Arcadia Data, and leads the charge in building visually beautiful and highly scalable analytical products. Prior to co-founding Arcadia, he was part of the founding engineering team at Aster Data, where he designed core components of the Aster Database. He later transitioned into field roles to win the company’s first customers in the Eastern US region and then into product management for the SQL-MapReduce and Analytical Frameworks.