Data scientists often face challenges in translating fuzzy concepts into concrete target variables for predictive modeling tasks.
A study involving interviews with fifteen data scientists in education and healthcare reveals that they construct target variables through a bricolage process.
The target variable construction process involves iterative negotiation between high-level measurement objectives and practical constraints to satisfy criteria like validity, simplicity, predictability, portability, and resource requirements.
Data scientists employ adaptive problem (re)formulation strategies, such as swapping target variables or combining multiple outcomes, to meet modeling objectives effectively.