New cloud platform enables cost-effective data engineering

By Ian Barker
Published 7 years ago

Businesses encounter a variety of challenges in building systems on and around Spark to meet the needs of data engineering.

Engineers often need to perform mission-critical data cleansing, transformations, and manipulations to make business activities such as real-time dashboards or fraud detection possible. Mastering data engineering is therefore an essential step toward automating systems and making data-driven decisions.
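To give a sense of what cleansing and transformation mean in practice, here is a minimal sketch in plain Python rather than Spark. The record fields (account, amount, country), the threshold, and the function names are all hypothetical and chosen only for illustration; a real pipeline would apply the same ideas to far larger data sets.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Transaction:
    account: str
    amount: float
    country: str


def cleanse(raw_rows: Iterable[dict]) -> List[Transaction]:
    """Drop malformed rows and normalize the ones that remain."""
    cleaned = []
    for row in raw_rows:
        # Normalize text fields: trim whitespace, upper-case the country code.
        account = (row.get("account") or "").strip()
        country = (row.get("country") or "").strip().upper()
        try:
            amount = float(row.get("amount"))
        except (TypeError, ValueError):
            continue  # missing or unparseable amount: skip the row
        if not account or not country or amount < 0:
            continue  # incomplete or nonsensical record: skip the row
        cleaned.append(Transaction(account, amount, country))
    return cleaned


def flag_suspicious(txns: Iterable[Transaction],
                    threshold: float = 10_000.0) -> List[Transaction]:
    """Toy fraud rule (an assumption): flag amounts at or above a threshold."""
    return [t for t in txns if t.amount >= threshold]


# Hypothetical raw input, as it might arrive from an upstream system.
raw = [
    {"account": " A-1 ", "amount": "120.50", "country": "gb"},
    {"account": "A-2", "amount": "not-a-number", "country": "US"},
    {"account": "", "amount": "15", "country": "US"},
    {"account": "A-3", "amount": "25000", "country": "us"},
]

cleaned = cleanse(raw)
flagged = flag_suspicious(cleaned)
print(cleaned)
print(flagged)
```

Only the first and last rows survive cleansing here, and only the last is flagged. The point is simply that downstream uses such as dashboards or fraud rules depend on the cleansing step having run first.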

Spark-based data science platform Databricks is launching a new edition of its cloud platform optimized specifically for data engineering workloads.

Databricks for Data Engineering enables more cost-effective data engineering using Spark while empowering data engineers to easily combine SQL, structured streaming, Extract, Transform, Load (ETL), and machine learning workloads to rapidly and securely deploy data pipelines into production.

The new release will complement the company’s existing cloud platform by providing all enterprises with a unified data analytics platform that enables seamless collaboration to accelerate data-driven decisions across the organization.

"The expansion of our product portfolio to meet the needs of data engineering workloads is a major step in our journey to make big data simple for very complex data problems," says Ali Ghodsi, CEO and co-founder at Databricks. "Databricks for Data Engineering will offer organizations a unified environment for data science and data engineering users alike, while optimizing Apache Spark performance -- all with the reliability of an enterprise data and analytics platform at an efficient price."

Key features include performance optimization for faster processing, plus an optimized AWS S3 access layer. Cluster management capabilities such as auto-scaling and support for AWS Spot instances reduce operational costs by avoiding the time-consuming tasks needed to build, configure, and maintain complex Spark infrastructure.
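As a rough illustration of what an auto-scaling, Spot-backed cluster definition can look like, the sketch below builds a cluster specification as a Python dictionary. The field names are modeled on Databricks' publicly documented Clusters REST API and may not match the release described here, so treat them as assumptions and check the current documentation. The cluster name, the placeholder values, and the build_cluster_spec helper are hypothetical.

```python
import json


def build_cluster_spec(min_workers: int, max_workers: int,
                       on_demand_nodes: int = 1) -> dict:
    """Build a cluster definition that scales between two worker counts.

    Hypothetical helper: it only assembles a dictionary and sends nothing.
    """
    if not 0 < min_workers <= max_workers:
        raise ValueError("need 0 < min_workers <= max_workers")
    return {
        "cluster_name": "nightly-etl",          # hypothetical name
        "spark_version": "<runtime-version>",   # placeholder, fill in yourself
        "node_type_id": "<instance-type>",      # placeholder, fill in yourself
        # Let the platform add or remove workers within these bounds.
        "autoscale": {
            "min_workers": min_workers,
            "max_workers": max_workers,
        },
        # Prefer Spot capacity, keeping some nodes on demand (assumed fields).
        "aws_attributes": {
            "availability": "SPOT_WITH_FALLBACK",
            "first_on_demand": on_demand_nodes,
        },
    }


spec = build_cluster_spec(min_workers=2, max_workers=8)
print(json.dumps(spec, indent=2))
```

The saving comes from two directions: the worker count shrinks when the workload is light, and Spot capacity is typically cheaper than on-demand capacity, at the price of possible interruption.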

It's designed to integrate with tools and services, such as Redshift and Kinesis, and machine learning frameworks such as TensorFlow. An integrated data sources catalog makes every data source immediately available to all Databricks users without duplicating data ingest work.

Enterprise-level security is built in, with turnkey security standards including SOC 2 Type 1 certification and HIPAA compliance, end-to-end data encryption, detailed logs easily accessible in AWS S3 for debugging, and IT admin capabilities such as Single Sign-On with SAML 2.0 support and role-based access controls for clusters, jobs, and notebooks.

You can find out more about Databricks and start a free trial on the company's website.

Photo Credit: agsandrew/Shutterstock
