New structured data service allows faster access for applications
As enterprises make more use of widely available analytics engines such as Presto, Apache Spark SQL or Apache Hive, they often run into inefficient data formats and face performance challenges as a result.
Open source cloud data software company Alluxio is launching a new Structured Data Service (SDS) that gives developers and data scientists a simpler data platform. It connects to different catalogs for access to structured data, with fewer copies and pipelines and more compute-optimized data.
"Alluxio now provides just-in-time data transform of data to be compute-optimized, independent of the storage format for OLAP engines, such as Presto and Apache Spark," says Haoyuan Li, founder and CTO of Alluxio. "These schema-aware optimizations are made possible with the new Alluxio Catalog Service which abstracts the widely-used Apache Hive Metastore, so regardless of how the data was initially stored -- CSV and text formatted files, for example -- the data is now transformed into the generally recognized compute-optimized parquet format. Almost every organization has a surprising amount of data in CSV or other text formats and this removes the manual work to make that data more usable. A second type of transformation will coalesce many smaller files, enabling the data to be combined into fewer files, which is more efficient to process for SQL engines. And yet a third type of transformation is for sorting, enabling table columns to be sorted adding to the efficiency of queries, newly available in our Enterprise Edition."
Features of the SDS include a Presto connector for easy integration and configuration of Alluxio with Presto; a new Catalog Service that manages the metadata of structured data in the system; and a Transformation Service that turns data into a compute-optimized representation independent of the storage-optimized format. This enables physical data independence.
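To give a rough idea of what the coalescing and sorting transformations described above involve, here is a minimal Python sketch. It is not Alluxio's implementation or API: the function name `coalesce_and_sort`, its parameters and the sample data are all hypothetical, and it writes CSV rather than Parquet so that it runs with the standard library alone.

```python
import csv
import io


def coalesce_and_sort(small_files, sort_column, rows_per_output):
    """Merge many small CSV texts into fewer, sorted CSV texts.

    small_files: list of CSV strings sharing one header (hypothetical input).
    Returns a list of CSV strings with at most rows_per_output rows each.
    """
    header = None
    rows = []
    for text in small_files:
        reader = csv.DictReader(io.StringIO(text))
        if header is None:
            header = reader.fieldnames
        rows.extend(reader)

    # Sort once across all inputs so each output covers a contiguous key range.
    # Values are compared as strings here; a real engine would use the schema.
    rows.sort(key=lambda row: row[sort_column])

    outputs = []
    for start in range(0, len(rows), rows_per_output):
        buf = io.StringIO()
        writer = csv.DictWriter(buf, fieldnames=header, lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows[start:start + rows_per_output])
        outputs.append(buf.getvalue())
    return outputs


# Four one-row "files" become two sorted two-row files.
small = ["id,city\n3,Oslo\n", "id,city\n1,Rome\n",
         "id,city\n2,Lima\n", "id,city\n4,Bonn\n"]
merged = coalesce_and_sort(small, "id", rows_per_output=2)
```

A query engine reading `merged` opens two files instead of four, and because the rows are ordered by `id` it could skip a whole file when a filter falls outside that file's key range.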
You can find out more on the Alluxio blog.