techminis

A naukri.com initiative

VentureBeat

Image Credit: VentureBeat

Databricks open-sources declarative ETL framework powering 90% faster pipeline builds

  • Databricks open-sources its declarative ETL framework as Apache Spark Declarative Pipelines at the Data + AI Summit, aiming to enhance the Apache Spark community's capabilities.
  • The move reinforces Databricks' commitment to open ecosystems and positions it against Snowflake's recently launched Openflow data integration service.
  • Spark Declarative Pipelines simplifies data pipeline authoring, automates operations, and supports both batch and streaming workloads.
  • Engineers define pipelines using SQL or Python, allowing Apache Spark to handle execution, dependency tracking, and operational tasks.
  • The framework supports various data types and sources, enabling real-time and periodic processing through a unified API with validations before execution.
  • By leveraging the declarative approach, Databricks aims to streamline the Apache Spark experience and simplify end-to-end pipeline development.
  • Numerous enterprises have already adopted the framework, reporting reduced development time, lower maintenance effort, and improved pipeline performance.
  • Databricks' open-source strategy makes Spark Declarative Pipelines accessible to a broader user base beyond its existing customers.
  • The rollout aligns with Databricks' past contributions to the open-source community and coincides with the commercial version's availability.
  • Apache Spark Declarative Pipelines will be integrated into the Apache Spark codebase soon, while the commercial version offers additional enterprise features.
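The core idea above (engineers declare datasets and their inputs; the framework handles execution order and dependency tracking) can be illustrated with a toy sketch. This is plain Python, not the actual Spark Declarative Pipelines API; the `table` decorator, `registry`, and dataset names are invented here purely to show the declarative pattern:

```python
# Toy illustration of the declarative-pipeline idea (NOT the real
# Spark Declarative Pipelines API): you declare datasets and their
# inputs; the framework works out the execution order for you.

registry = {}  # dataset name -> (build function, list of input names)

def table(*, inputs=()):
    """Register a dataset definition instead of running it immediately."""
    def decorator(fn):
        registry[fn.__name__] = (fn, list(inputs))
        return fn
    return decorator

@table()
def raw_events():
    # Stand-in for reading from a real source (files, Kafka, etc.)
    return [{"user": "a", "clicks": 3}, {"user": "b", "clicks": 5}]

@table(inputs=["raw_events"])
def click_totals(raw_events):
    # A downstream dataset derived from the declared input.
    return {"total_clicks": sum(r["clicks"] for r in raw_events)}

def run_pipeline(target):
    """Resolve declared dependencies depth-first, then build the target."""
    fn, inputs = registry[target]
    return fn(*(run_pipeline(name) for name in inputs))

print(run_pipeline("click_totals"))  # {'total_clicks': 8}
```

The point of the pattern is that `click_totals` never schedules anything itself: it only declares what it needs, and the runner derives the build order, which is what lets a real engine add validation, retries, and batch/streaming handling around the same definitions.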

Read Full Article
