
PlanetPython


Ed Crewe: Talk about Cloud Prices at PyConLT 2025

  • Ed Crewe will be speaking at PyConLT 2025 about cloud pricing complexity and data pipelines for EDB's Postgres AI product.
  • Cloud pricing involves managing nearly 5 million prices across AWS, Azure, and GCP for various services, types, tiers, and regions.
  • To estimate costs for customers, a data pipeline was built using Python, Airflow, and Postgres, replacing a third-party service.
  • The pipeline's Python code uses an abstract base class for scrapers, Psycopg for faster database updates, and Go for Embedded Postgres.
  • Separate temporary Postgres databases per step ensure independent data handling and compatibility with the final target database.
  • The Click package helps in developing and testing the pipeline, including the ability to run individual scrapes for debugging.
  • Unit testing is facilitated by creating mock response objects for data scrapers, enabling functional testing of the scrape and data creation ETL cycle.
  • Data pipeline orchestrators like Airflow and Dagster offer a local development mode, which allows faster testing of DAG steps and improves the development experience.
  • Soda tests check the correctness of the scraped data by validating that the number of prices, tiered rates, and service ranges match what is expected.
  • The final data artifacts are loaded into the price schema of a Postgres cluster, which backs a microservice running on CloudNativePG.
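
The abstract base class for scrapers mentioned above might look roughly like the following. This is a minimal sketch, not the code from the talk: the names `PriceScraper`, `Price`, `fetch`, `parse`, and `StaticScraper`, and the sample payload, are all hypothetical.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Price:
    """One scraped price row (hypothetical shape)."""
    provider: str
    service: str
    region: str
    usd_per_hour: float


class PriceScraper(ABC):
    """Hypothetical base class: each provider implements fetch() and parse()."""

    provider: str = ""

    @abstractmethod
    def fetch(self) -> dict:
        """Return the raw pricing payload for this provider."""

    @abstractmethod
    def parse(self, raw: dict) -> Iterable[Price]:
        """Turn the raw payload into Price rows."""

    def run(self) -> list[Price]:
        # Shared driver: every concrete scraper follows the same two steps.
        return list(self.parse(self.fetch()))


class StaticScraper(PriceScraper):
    """Toy subclass that returns a fixed payload instead of calling an API."""

    provider = "example"

    def fetch(self) -> dict:
        return {"items": [{"service": "vm", "region": "eu-west-1", "price": "0.096"}]}

    def parse(self, raw: dict) -> Iterable[Price]:
        for item in raw["items"]:
            yield Price(self.provider, item["service"], item["region"], float(item["price"]))
```

Because `fetch` and `parse` are abstract, instantiating `PriceScraper` directly raises `TypeError`, so a provider-specific scraper that forgets one of the steps cannot be created at all.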
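
Running a single scrape from the command line with Click could be sketched as below. The `scrape` command, the `SCRAPES` registry, and the `--dry-run` flag are assumptions made for illustration; only the Click decorators and `click.echo` are real library calls.

```python
import click

# Hypothetical registry mapping a provider name to a callable that
# performs the scrape and returns the number of prices collected.
SCRAPES = {
    "aws": lambda: 3,
    "azure": lambda: 2,
    "gcp": lambda: 1,
}


@click.command()
@click.argument("provider", type=click.Choice(sorted(SCRAPES)))
@click.option("--dry-run", is_flag=True, help="Show what would run without scraping.")
def scrape(provider: str, dry_run: bool) -> None:
    """Run one provider's scrape in isolation, for debugging."""
    if dry_run:
        click.echo(f"would scrape {provider}")
        return
    count = SCRAPES[provider]()
    click.echo(f"{provider}: {count} prices scraped")


if __name__ == "__main__":
    scrape()
```

Saved as a script, this could be invoked with the provider name as its only argument, optionally adding `--dry-run`; an unknown provider name is rejected by `click.Choice` before any scrape runs.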
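
The mock response objects used for unit testing scrapers can be illustrated with the standard library's `unittest.mock`. This is a sketch under assumptions: `parse_prices`, `make_mock_response`, and the payload keys `Items`, `sku`, and `unitPrice` are hypothetical and do not come from the talk.

```python
from unittest.mock import Mock


def parse_prices(response):
    """Hypothetical transform step: turn an HTTP response into (sku, price) rows."""
    response.raise_for_status()
    return [(item["sku"], float(item["unitPrice"])) for item in response.json()["Items"]]


def make_mock_response(payload, status_code=200):
    """Build a stand-in for an HTTP response so no network call is needed."""
    response = Mock()
    response.status_code = status_code
    response.json.return_value = payload
    response.raise_for_status.return_value = None
    return response


def test_parse_prices():
    response = make_mock_response(
        {"Items": [{"sku": "vm-small", "unitPrice": "0.05"}]}
    )
    assert parse_prices(response) == [("vm-small", 0.05)]
    # The scraper should check the HTTP status exactly once.
    response.raise_for_status.assert_called_once()
```

Because the mock replaces only the response object, the same parsing code runs in the test as in the real scrape, which is what makes this a functional test of the extract and transform steps.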
