How data APIs accelerate creation of analytics apps

Intelligent APIs

One way to access data efficiently and accelerate the development and deployment of analytics apps is to build an API. APIs are a natural way to access data, whether it be personalization scores for web content or a service to assess the risk of a part failing.

There are a number of benefits to using an API for data access. First, an API restricts users to efficient requests. Google Analytics is a prime example: its query API gives you access to the rich data in your Google Analytics instance. While the API is fairly flexible, it also lets Google define precisely which types of queries can be performed efficiently.
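
As a rough illustration of restricting users to efficient requests, the sketch below validates a report request against an allow-list before anything reaches the data store. It is not the Google Analytics API; the metric and dimension names, the 90-day cap, and the `ReportQuery` and `validate` names are all assumptions made up for this example.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical allow-lists: in practice these would reflect what is
# actually indexed or pre-aggregated in the backing store.
ALLOWED_METRICS = {"sessions", "pageviews"}
ALLOWED_DIMENSIONS = {"country", "device"}
MAX_RANGE_DAYS = 90  # assumed limit, chosen only for illustration


@dataclass(frozen=True)
class ReportQuery:
    metric: str
    dimension: str
    start: date
    end: date


def validate(query: ReportQuery) -> ReportQuery:
    """Reject requests the backend cannot answer efficiently."""
    if query.metric not in ALLOWED_METRICS:
        raise ValueError(f"unsupported metric: {query.metric}")
    if query.dimension not in ALLOWED_DIMENSIONS:
        raise ValueError(f"unsupported dimension: {query.dimension}")
    if query.end < query.start:
        raise ValueError("end date precedes start date")
    if (query.end - query.start).days > MAX_RANGE_DAYS:
        raise ValueError(f"date range exceeds {MAX_RANGE_DAYS} days")
    return query
```

A request that passes `validate` can be handed to the query engine; anything else fails fast with a message that tells the caller what the API does support.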

APIs also enable you to mask the complexity of a query. It’s not uncommon to have data in multiple places. APIs allow you to hide the details regarding the data’s location. You might have an API that runs a Presto query while another part of the API looks up something in a prebuilt index and yet another hits prebuilt aggregates in HBase. In short, an API can efficiently navigate a variety of implementation mechanisms without exposing messy details to the user.
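
A minimal sketch of this idea, assuming three stand-in backend functions in place of a real Presto query, a prebuilt index, and HBase aggregates. The class name, method names, and returned fields are hypothetical; the point is only that callers see one interface and never learn where each answer comes from.

```python
from typing import Callable, Dict

# A backend is modeled as any callable that takes a customer id and
# returns a dict. Real implementations would wrap the actual clients.
Backend = Callable[[str], Dict[str, object]]


def presto_purchases(customer_id: str) -> Dict[str, object]:
    # Stand-in for an ad hoc SQL query run through Presto.
    return {"source": "presto", "customer_id": customer_id}


def index_profile(customer_id: str) -> Dict[str, object]:
    # Stand-in for a lookup in a prebuilt index.
    return {"source": "index", "customer_id": customer_id}


def hbase_daily_totals(customer_id: str) -> Dict[str, object]:
    # Stand-in for reading prebuilt aggregates from HBase.
    return {"source": "hbase", "customer_id": customer_id}


class CustomerDataApi:
    """One interface for callers; the storage details stay private."""

    def __init__(self, purchases: Backend, profile: Backend, totals: Backend):
        self._purchases = purchases
        self._profile = profile
        self._totals = totals

    def purchase_history(self, customer_id: str) -> Dict[str, object]:
        return self._purchases(customer_id)

    def profile(self, customer_id: str) -> Dict[str, object]:
        return self._profile(customer_id)

    def daily_totals(self, customer_id: str) -> Dict[str, object]:
        return self._totals(customer_id)


api = CustomerDataApi(presto_purchases, index_profile, hbase_daily_totals)
```

Callers write `api.profile("c-42")` the same way they write `api.daily_totals("c-42")`, even though the two calls are served by different systems.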

An API can also be used to help data scientists frame successful queries. At Think Big we’ve seen instances where data scientists run a direct query against Impala that works 80% of the time, but 20% of the time it fails and needs to be expressed a little differently in Hive. An API can make the query process more reliable.
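
One way an API might absorb that kind of failure is to try the fast engine first and fall back to a second engine when the first rejects the query. The sketch below assumes each engine is wrapped in a plain callable that raises a `QueryError` on failure; `QueryError`, `run_with_fallback`, and the optional `rewrite` hook are hypothetical names, not part of any Impala or Hive client library.

```python
from typing import Callable, List, Tuple

Rows = List[Tuple]
Engine = Callable[[str], Rows]


class QueryError(Exception):
    """Raised by an engine wrapper that cannot run the query as written."""


def run_with_fallback(
    sql: str,
    primary: Engine,
    fallback: Engine,
    rewrite: Callable[[str], str] = lambda s: s,
) -> Rows:
    """Run sql on the primary engine; on QueryError, retry on the fallback.

    rewrite lets the API re-express the query for the fallback engine,
    so the caller never has to know two dialects.
    """
    try:
        return primary(sql)
    except QueryError:
        return fallback(rewrite(sql))
```

In this arrangement the data scientist submits one query, and the decision about which engine finally answers it stays inside the API.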

Some APIs become standardized and widely used. Elasticsearch is a good example. It’s a general-purpose API that lets you access data in your environment. But organizations also build APIs that address a more specific business problem, such as personalization or looking up a set of related information assets stored in big data systems.

On the other hand, sometimes you want the flexibility of a true data warehouse where you have organized data to support a wide variety of ad hoc queries. In those cases, there’s a lot of value in using a mature MPP database with a good optimizer. Big data systems are still primarily useful for transforming and loading data into these formats. So good design for data systems blends APIs for efficiency and succinctness with the flexibility of third normal form for data products that support a variety of ad hoc queries.

Advice for Building APIs

Building good APIs takes some thought. An API represents a long-term contract that implies you’re going to continue to offer and support that API. It’s painful for users if you break that contract, so think carefully about what you’re willing to support. Complexity is another consideration; you need enough richness in your APIs to allow people to do their jobs, but you must also weigh the cost of adding complexity against the benefits for users. Bear in mind that good APIs support loose coupling. That way, as your analytics architecture evolves, you can change implementation details under the API while still supporting the functionality and keeping the contract with your users.
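
To make loose coupling concrete, here is a small sketch built on the part-failure example mentioned earlier. The contract is a single method, `failure_risk`, and two interchangeable implementations sit behind it. All names, the ten-year scaling in `AgeBasedScorer`, and the 0.5 threshold are assumptions for illustration only.

```python
from typing import Dict, Protocol


class RiskScorer(Protocol):
    """The contract callers depend on: a risk between 0.0 and 1.0."""

    def failure_risk(self, part_id: str) -> float: ...


class LookupTableScorer:
    """First implementation: scores precomputed and stored in a table."""

    def __init__(self, table: Dict[str, float]):
        self._table = table

    def failure_risk(self, part_id: str) -> float:
        return self._table.get(part_id, 0.0)


class AgeBasedScorer:
    """Later implementation: derives the score from part age instead."""

    def __init__(self, ages_in_years: Dict[str, float]):
        self._ages = ages_in_years

    def failure_risk(self, part_id: str) -> float:
        # Assumed toy model: risk grows linearly and caps at ten years.
        return min(1.0, self._ages.get(part_id, 0.0) / 10.0)


def needs_inspection(scorer: RiskScorer, part_id: str, threshold: float = 0.5) -> bool:
    # Client code is written against the contract, not an implementation.
    return scorer.failure_risk(part_id) >= threshold
```

Because `needs_inspection` only relies on `failure_risk`, swapping `LookupTableScorer` for `AgeBasedScorer` requires no change on the caller's side.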

Build Analytic Wealth

As you build APIs for data access, they serve as a form of analytic wealth. When building an analytic app, you’re no longer starting from scratch; you already have a portion of what you need in existing APIs and can simply add what you don’t have. Our Dashboard Engine for Hadoop is one example of this pattern; it’s made up of APIs built for understanding consumer behavior across a series of events. In the same way, the APIs you create to access data make it easier to build analytics apps without reinventing the wheel, and each one adds to your analytic wealth.

Ron Bodkin is Think Big's President and Founder. He founded Think Big to help companies realize measurable value from Big Data. Previously, Ron was VP Engineering at Quantcast where he led the data science and engineering teams that pioneered the use of Hadoop and NoSQL for batch and real-time decision making. Prior to that, Ron was Founder of New Aspects, which provided enterprise consulting for aspect-oriented programming.

© 1998-2024 BetaNews, Inc. All Rights Reserved.