Designing neural networks typically involves either manual trial and error or neural architecture search (NAS) followed by a separate weight-training stage. A new approach called SWAT-NN instead optimizes a network's architecture and its weights simultaneously. The method uses a universal multi-scale autoencoder to embed both architectural and parametric information into a continuous latent space, and the search is carried out directly in that space. Experiments show that SWAT-NN discovers sparse, compact networks with strong performance on synthetic regression tasks.
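
To make the core idea concrete, below is a minimal sketch of searching for a network in a continuous latent space by gradient descent through a decoder. This is an illustration only, not SWAT-NN's actual method: the paper's multi-scale autoencoder also encodes architecture, whereas this simplified stand-in decodes only the weights of a fixed-shape MLP, and all names here (`LatentDecoder`, `search_latent`, the dimensions) are hypothetical.

```python
# Illustrative sketch: optimize a latent code z whose decoding defines a small
# network, by back-propagating a task loss through the decoder to z.
# Hypothetical setup; does not reproduce SWAT-NN's multi-scale autoencoder.
import torch
import torch.nn as nn

class LatentDecoder(nn.Module):
    """Maps a latent code to the weights of a tiny fixed-shape MLP."""
    def __init__(self, latent_dim=16, in_dim=1, hidden=8, out_dim=1):
        super().__init__()
        self.shapes = [(hidden, in_dim), (hidden,), (out_dim, hidden), (out_dim,)]
        n_params = sum(torch.Size(s).numel() for s in self.shapes)
        self.net = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(),
                                 nn.Linear(64, n_params))

    def forward(self, z):
        flat = self.net(z)
        params, i = [], 0
        for s in self.shapes:
            n = torch.Size(s).numel()
            params.append(flat[i:i + n].view(s))
            i += n
        return params

def decoded_mlp(params, x):
    """Run the decoded weights as a one-hidden-layer MLP."""
    w1, b1, w2, b2 = params
    h = torch.relu(x @ w1.T + b1)
    return h @ w2.T + b2

def search_latent(decoder, x, y, latent_dim=16, steps=200, lr=0.05):
    """Gradient descent in latent space: only the code z is updated."""
    z = torch.zeros(latent_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        pred = decoded_mlp(decoder(z), x)
        loss = nn.functional.mse_loss(pred, y)
        loss.backward()
        opt.step()
    return z, loss.item()

if __name__ == "__main__":
    torch.manual_seed(0)
    x = torch.linspace(-1, 1, 64).unsqueeze(1)
    y = torch.sin(3 * x)                      # toy synthetic regression target
    decoder = LatentDecoder()
    z, final_loss = search_latent(decoder, x, y)
    print(f"final MSE after latent search: {final_loss:.4f}")
```

The point of the sketch is the search mechanism: because the decoder is differentiable, a network can be found by moving a single continuous vector, which is what makes joint optimization over structure and parameters possible once architecture is also embedded in the latent space.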