Reinforcement learning has shown potential for training legged robots to perform complex locomotion behaviors. However, policies trained in simulation often struggle to transfer to real-world robots because of unrealistic assumptions made in simulation. Traditional methods address this by penalizing aggressive motions, but they require extensive tuning. This work proposes Spectral Normalization as an efficient method to enforce Lipschitz continuity in the policy while reducing memory usage.
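
To make the idea of spectral normalization concrete, the following is a minimal pure-Python sketch, not the implementation used in this work. It estimates the largest singular value of a weight matrix by power iteration and divides the weights by it, so that the linear map has a Lipschitz constant of about 1 in the L2 norm. The function names `spectral_norm` and `spectrally_normalize`, the iteration count, and the example matrix are all assumptions chosen for illustration.

```python
import math
import random


def spectral_norm(weight, n_iters=50, seed=0):
    """Estimate the largest singular value of `weight` (a list of rows).

    Hypothetical helper: runs power iteration on W^T W.
    """
    rng = random.Random(seed)
    rows, cols = len(weight), len(weight[0])
    # Random starting vector in the input space.
    v = [rng.gauss(0.0, 1.0) for _ in range(cols)]
    for _ in range(n_iters):
        # u = W v
        u = [sum(weight[i][j] * v[j] for j in range(cols)) for i in range(rows)]
        # v = W^T u, then renormalize to unit length
        v = [sum(weight[i][j] * u[i] for i in range(rows)) for j in range(cols)]
        norm_v = math.sqrt(sum(x * x for x in v))
        if norm_v == 0.0:
            return 0.0
        v = [x / norm_v for x in v]
    # With v close to the top right singular vector, ||W v|| approximates sigma_max.
    u = [sum(weight[i][j] * v[j] for j in range(cols)) for i in range(rows)]
    return math.sqrt(sum(x * x for x in u))


def spectrally_normalize(weight, n_iters=50):
    """Return a copy of `weight` rescaled so its spectral norm is about 1."""
    sigma = spectral_norm(weight, n_iters)
    if sigma == 0.0:
        return [row[:] for row in weight]
    return [[w / sigma for w in row] for row in weight]


# Illustrative weights only; not taken from any trained policy.
W = [[1.0, 2.0], [3.0, 4.0]]
W_sn = spectrally_normalize(W)
print(spectral_norm(W), spectral_norm(W_sn))
```

If every layer of a network is normalized this way and the activations are themselves 1-Lipschitz (ReLU or tanh, for example), the Lipschitz constant of the whole network is bounded by the product of the per-layer spectral norms. In practice one would more likely rely on a framework utility such as PyTorch's `torch.nn.utils.spectral_norm` than on a hand-written loop.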