Multimodal learning is widely used in artificial intelligence to combine different types of data, but missing modalities can degrade predictive accuracy.
Ignoring missing data when estimating the value of additional modalities can lead to incorrect conclusions in downstream tasks.
The ICYM2I (In Case You Multimodal Missed It) framework is introduced to evaluate predictive performance and information gain under missingness.
The importance of accounting for missing data in estimating information gain is demonstrated on synthetic, semi-synthetic, and real-world medical datasets.
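To make the risk of ignoring missingness concrete, here is a minimal simulation sketch. It is not the ICYM2I implementation: it assumes inverse probability weighting as the correction, which the summary above does not spell out, and it assumes the probability of observing the modality is known exactly, whereas in practice it would have to be estimated. All names (`simulate`, `naive_accuracy`, `ipw_accuracy`) and all numeric settings are hypothetical and chosen only for illustration.

```python
import random


def simulate(n=200_000, seed=0):
    """Draw (correct, observed, p_obs) triples for a hypothetical classifier.

    Assumed setup: a binary covariate x drives both how often the classifier
    is right and how often the modality is observed (missing at random
    given x).
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random() < 0.5
        # The classifier is right 60% of the time when x is true, 90% otherwise.
        correct = rng.random() < (0.6 if x else 0.9)
        # The modality is observed far less often when x is true.
        p_obs = 0.3 if x else 0.9
        observed = rng.random() < p_obs
        rows.append((correct, observed, p_obs))
    return rows


def full_accuracy(rows):
    """Accuracy over every sample (unavailable in practice, used as reference)."""
    return sum(c for c, _, _ in rows) / len(rows)


def naive_accuracy(rows):
    """Accuracy over observed samples only, ignoring why the others are missing."""
    seen = [c for c, o, _ in rows if o]
    return sum(seen) / len(seen)


def ipw_accuracy(rows):
    """Accuracy over observed samples, each weighted by 1 / P(observed)."""
    num = sum(c / p for c, o, p in rows if o)
    den = sum(1 / p for _, o, p in rows if o)
    return num / den


rows = simulate()
full = full_accuracy(rows)
naive = naive_accuracy(rows)
ipw = ipw_accuracy(rows)
print(f"full={full:.3f} naive={naive:.3f} ipw={ipw:.3f}")
```

Under these assumed settings the population accuracy is 0.75, while the observed-only estimate converges to about 0.825, because the easy cases are overrepresented among observed samples. The weighted estimate recovers the full-population value up to sampling noise. The same distortion would carry over to any comparison of models with and without the extra modality.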