This work proposes a learning approach for efficiently satisfying complex Linear Temporal Logic (LTL) specifications in multi-task reinforcement learning (RL).
Existing approaches for satisfying LTL specifications suffer from various limitations, such as applicability only to finite-horizon fragments of LTL, suboptimal solutions, and inadequate handling of safety constraints.
The proposed method uses Büchi automata to represent the semantics of LTL specifications and learns policies conditioned on sequences of truth assignments.
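To make the idea of truth-assignment sequences concrete, here is a minimal toy sketch (not the paper's implementation) of a Büchi-style automaton for the co-safe LTL formula F(a ∧ F b) ("eventually a, then eventually b"). Assignments are modeled as frozensets of propositions that hold at a step, and we enumerate the finite assignment sequences that drive the automaton into its accepting sink. All state and function names are illustrative assumptions; full Büchi acceptance over infinite words (visiting accepting states infinitely often) reduces here to reaching the accepting sink because the formula is co-safe.

```python
from itertools import product

# Toy automaton for F(a & F b).
# q0: waiting for a; q1: a seen, waiting for b; q2: accepting sink.
PROPS = ("a", "b")
ACCEPTING = {"q2"}

def step(state, assignment):
    """Transition on one truth assignment (a frozenset of true propositions)."""
    if state == "q0":
        if "a" in assignment:
            # If b holds at the same step, F b is already satisfied.
            return "q2" if "b" in assignment else "q1"
        return "q0"
    if state == "q1":
        return "q2" if "b" in assignment else "q1"
    return "q2"  # q2 is an accepting sink

def accepting_sequences(max_len):
    """Enumerate truth-assignment sequences (up to max_len) that reach acceptance."""
    assignments = [frozenset(p for p, on in zip(PROPS, bits) if on)
                   for bits in product([False, True], repeat=len(PROPS))]
    results = []
    for length in range(1, max_len + 1):
        for seq in product(assignments, repeat=length):
            state = "q0"
            for a in seq:
                state = step(state, a)
            if state in ACCEPTING:
                results.append(seq)
    return results

seqs = accepting_sequences(2)
print(len(seqs))  # number of accepting sequences of length <= 2
```

In the approach described above, a learned policy would be conditioned on such accepting sequences, steering the agent to realize each assignment in turn; this sketch only illustrates how the automaton defines which sequences count as satisfying.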
Experiments show that the approach satisfies a wide range of finite- and infinite-horizon specifications zero-shot and outperforms existing methods in satisfaction probability and efficiency.