Estimation in data projects is about being less wrong rather than aiming to be right.
Four proven estimation methods are discussed in the context of building a modern data pipeline on Azure.
One method draws estimates from similar past projects: for example, a previous data ingestion effort that took 4 weeks serves as the baseline for a comparable one.
Another method calculates the estimate from known metrics, such as the time it takes to complete one dbt model for a dimension table, scaled by the number of models to build.
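The calculation from known metrics can be sketched in a few lines of Python. This is a minimal illustration, not the original article's worked example: the function name `parametric_estimate`, the count of 12 dimension tables, and the rate of 1.5 days per dbt model are all hypothetical values chosen for the sketch.

```python
def parametric_estimate(unit_count: int, days_per_unit: float) -> float:
    """Return total effort in days as unit count times measured effort per unit."""
    if unit_count < 0 or days_per_unit < 0:
        raise ValueError("unit_count and days_per_unit must be non-negative")
    return unit_count * days_per_unit


# Hypothetical inputs: neither number comes from the source text.
dimension_tables = 12        # assumed number of dimension tables to model
days_per_dbt_model = 1.5     # assumed historical average per dbt model

total_days = parametric_estimate(dimension_tables, days_per_dbt_model)
print(f"Estimated effort: {total_days:.1f} days")
```

The estimate is only as good as the per-unit rate, so the rate should come from measured history rather than from memory.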
A third method applies the PERT formula to optimistic, most likely, and pessimistic scenarios to derive a time estimate for building a Delta Live Table pipeline.
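The standard PERT formula weights the most likely case four times as heavily as the two extremes: expected = (optimistic + 4 x most likely + pessimistic) / 6, with a standard deviation commonly approximated as (pessimistic - optimistic) / 6. The sketch below applies it to a Delta Live Table pipeline; the three input values (5, 8, and 17 days) and the function name `pert_estimate` are hypothetical and are not taken from the source.

```python
def pert_estimate(optimistic: float, most_likely: float, pessimistic: float) -> tuple[float, float]:
    """Return (expected duration, standard deviation) using the PERT three-point formula."""
    if not optimistic <= most_likely <= pessimistic:
        raise ValueError("expected optimistic <= most_likely <= pessimistic")
    expected = (optimistic + 4 * most_likely + pessimistic) / 6
    std_dev = (pessimistic - optimistic) / 6
    return expected, std_dev


# Hypothetical scenario values in days for the Delta Live Table pipeline.
expected_days, std_dev_days = pert_estimate(optimistic=5, most_likely=8, pessimistic=17)
print(f"Expected: {expected_days:.1f} days, std dev: {std_dev_days:.1f} days")
```

With these assumed inputs the long pessimistic tail pulls the expected value above the most likely case, which is the behavior the formula is meant to capture.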
Breaking the project down into smaller, manageable tasks allows more accurate estimation, because each task can be weighed by factors such as its stage, predictability, uncertainty, and level of commitment.
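One possible way to roll task-level estimates back up into a project figure is sketched below. It assumes each task is given a three-point estimate, that tasks run one after another, and that their durations are statistically independent, so the means add and the variances add. The task names and all numbers are hypothetical, and the source does not prescribe this particular roll-up.

```python
import math

# Hypothetical task breakdown: (optimistic, most likely, pessimistic) in days.
tasks = {
    "ingestion": (2, 4, 6),
    "dbt dimension models": (3, 5, 13),
    "Delta Live Table pipeline": (5, 8, 17),
}


def three_point(optimistic: float, most_likely: float, pessimistic: float) -> tuple[float, float]:
    """Return (mean, variance) for one task using the PERT approximation."""
    mean = (optimistic + 4 * most_likely + pessimistic) / 6
    variance = ((pessimistic - optimistic) / 6) ** 2
    return mean, variance


# Sum the means and the variances across tasks (independence assumed).
total_mean = sum(three_point(*estimate)[0] for estimate in tasks.values())
total_variance = sum(three_point(*estimate)[1] for estimate in tasks.values())
total_sd = math.sqrt(total_variance)

print(f"Project estimate: {total_mean:.1f} days (std dev {total_sd:.1f} days)")
```

If tasks share a common risk, such as the same unfamiliar platform, the independence assumption understates the overall spread, so the resulting standard deviation is best read as a lower bound.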
Estimation is compared to a lens that helps in seeing more clearly, even if the end result may still be somewhat inaccurate.