This work introduces a machine learning model for generating synthetic network flow datasets that closely resemble real-world networks.
The approach builds dynamic multigraphs in two parts: a stochastic Kronecker graph generator produces the graph structure, and a tabular generative adversarial network (GAN) produces the features.
An XGBoost model is employed for graph alignment to ensure accurate overlay of features onto the generated graph structure.
The model demonstrates improved accuracy over previous large-scale graph generation methods while maintaining efficiency. The work also explores the trade-off between accuracy and diversity in synthetic graph dataset creation.
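
To make the structure-generation step more concrete, the following is a minimal sketch of edge sampling from a stochastic Kronecker graph. It is an illustration under stated assumptions, not the implementation described here: the 2x2 initiator values, the function name `sample_kronecker_edges`, and the R-MAT-style recursive descent are all chosen for the example. Repeated edges are kept, so the result can be read as a multigraph.

```python
import random
from collections import Counter


def sample_kronecker_edges(initiator, levels, num_edges, seed=0):
    """Sample directed edges from a stochastic Kronecker graph.

    `initiator` is a 2x2 matrix of non-negative weights (hypothetical values).
    For each edge, one initiator cell is drawn per level; the chosen row bits
    form the source id and the chosen column bits form the destination id,
    giving node ids in the range [0, 2**levels).
    """
    rng = random.Random(seed)
    cells = [(0, 0), (0, 1), (1, 0), (1, 1)]
    weights = [initiator[r][c] for r, c in cells]
    edges = []
    for _ in range(num_edges):
        src = dst = 0
        for _ in range(levels):
            # Pick a quadrant with probability proportional to its weight.
            r, c = rng.choices(cells, weights=weights)[0]
            src = (src << 1) | r
            dst = (dst << 1) | c
        edges.append((src, dst))
    return edges


# Illustrative initiator; these numbers are assumptions, not fitted values.
initiator = [[0.57, 0.19], [0.19, 0.05]]
edges = sample_kronecker_edges(initiator, levels=10, num_edges=5000)

# Duplicate (src, dst) pairs are kept, so count them as edge multiplicities.
multiplicity = Counter(edges)
```

In a full pipeline, the initiator would presumably be fitted to the real graph, and each sampled edge would later receive a feature row from the tabular generator through the alignment step.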