Topic models often struggle to capture minority topics like mental health themes in online comments.
The proposed solution uses Constrained Non-negative Matrix Factorization (NMF), which incorporates seed word lists for the minority content without requiring the minority topic to be specified in detail beforehand.
On synthetic data, Constrained NMF, solved using the Karush-Kuhn-Tucker (KKT) conditions, outperforms baselines on topic purity and mutual information.
In a case study on YouTube vlog comments, the model identifies minority mental health content in viewer discussions.
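
The idea of steering a factorization with a seed word list can be illustrated with a minimal sketch. It is not the authors' formulation: it assumes a Frobenius-norm objective, standard Lee-Seung multiplicative updates, and a simple hard constraint in which only one reserved topic may place weight on the seed words. The function name `seeded_nmf`, its parameters, and the toy count matrix are all hypothetical. The sketch relies on one property of multiplicative updates, often motivated by the KKT complementary-slackness condition: an entry initialized to zero stays zero, so the constraint holds without an explicit projection step.

```python
import numpy as np


def seeded_nmf(X, n_topics, seed_word_ids, n_iter=200, eps=1e-9, random_state=0):
    """Factor X (docs x words) as W @ H, reserving topic 0 for the seed words.

    Assumed constraint (illustrative only): every topic except topic 0 has
    zero weight on the seed words, so seed-word mass can only be explained
    by the reserved minority topic.
    """
    rng = np.random.default_rng(random_state)
    n_docs, n_words = X.shape
    W = rng.random((n_docs, n_topics)) + eps   # document-topic weights
    H = rng.random((n_topics, n_words)) + eps  # topic-word weights

    # Zero out seed words in all non-minority topics. Multiplicative updates
    # only rescale entries, so these zeros persist through every iteration.
    H[1:, seed_word_ids] = 0.0

    for _ in range(n_iter):
        # Lee-Seung multiplicative updates for the Frobenius objective.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Toy document-term counts; columns 4 and 5 stand in for seed words.
X = np.array([
    [3, 2, 0, 0, 1, 0],
    [2, 3, 0, 1, 0, 0],
    [0, 0, 4, 3, 0, 1],
    [0, 1, 3, 4, 0, 0],
    [0, 0, 1, 0, 2, 3],
], dtype=float)

W, H = seeded_nmf(X, n_topics=3, seed_word_ids=[4, 5])
print(np.round(H, 2))     # topic-word weights; rows 1 and 2 are zero on the seed columns
print(np.round(W[:, 0], 2))  # how strongly each document loads on the seeded topic
```

Note that the reserved topic is left free to place weight on non-seed words as well, which is one way to read "without detailed pre-specification": the seed list anchors the topic but does not fix its full word distribution.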