This paper explores the theoretical questions that arise when applying active learning of probabilistic deterministic finite automata (PDFA) to neural language models.
The paper defines a congruence that handles null next-symbol probabilities, which arise when the output of a language model is constrained by composing it with an automaton and/or a sampling strategy.
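To make the source of these null probabilities concrete, the minimal sketch below (the function name, symbol set, and choice of a top-k strategy are illustrative assumptions, not the paper's code) composes a next-symbol distribution with an automaton's allowed-symbol set and a sampling strategy; every symbol filtered out by either receives probability zero.

```python
# Hedged sketch: composing a next-symbol distribution with an automaton
# mask and a top-k sampling strategy yields null probabilities.
def constrain(dist, allowed, k):
    """Drop symbols disallowed by the automaton, keep the top-k of the
    rest, and renormalize; everything else gets probability 0."""
    masked = {s: p for s, p in dist.items() if s in allowed}
    top_k = dict(sorted(masked.items(), key=lambda kv: -kv[1])[:k])
    total = sum(top_k.values())
    return {s: (top_k.get(s, 0.0) / total if total else 0.0) for s in dist}

# Toy next-symbol distribution over a 4-symbol vocabulary.
dist = {"a": 0.5, "b": 0.3, "c": 0.15, "$": 0.05}
print(constrain(dist, allowed={"a", "b", "c"}, k=2))
# -> {'a': 0.625, 'b': 0.375, 'c': 0.0, '$': 0.0}
```

Any symbol outside the automaton's allowed set or below the top-k cutoff ends up with a null probability, which is precisely the situation the congruence must accommodate.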
An algorithm is developed to efficiently learn the quotient PDFA induced by this congruence, and case studies are conducted to analyze the statistical properties of large language models.
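As a rough illustration of how such a learner might operate (the names `next_dist`, `equivalent`, and `learn_quotient`, the breadth-first state-merging loop, and the tolerance-based equivalence test are all assumptions made here, not the paper's algorithm), one can grow a quotient automaton by querying next-symbol distributions for prefixes and merging prefixes whose distributions are equivalent, skipping transitions on symbols with null probability:

```python
# Hedged sketch of quotient-automaton learning from distribution queries.
from collections import deque

def equivalent(d1, d2, t=0.05):
    """Illustrative equivalence test: identical supports (nulls must
    match as nulls) and probabilities within tolerance t elsewhere."""
    if {s for s, p in d1.items() if p > 0} != {s for s, p in d2.items() if p > 0}:
        return False
    return all(abs(p - d2.get(s, 0.0)) <= t for s, p in d1.items())

def learn_quotient(next_dist, alphabet, max_states=50):
    """BFS over prefixes, merging those with equivalent distributions."""
    reps = [()]                      # representative prefixes = states
    trans = {}                       # (state, symbol) -> state
    queue = deque([()])
    while queue and len(reps) < max_states:
        u = queue.popleft()
        du = next_dist(u)
        for a in alphabet:
            if du.get(a, 0.0) == 0.0:    # null probability: no transition
                continue
            v = u + (a,)
            dv = next_dist(v)
            for r in reps:               # merge into an equivalent state
                if equivalent(dv, next_dist(r)):
                    trans[(u, a)] = r
                    break
            else:                        # genuinely new state
                reps.append(v)
                trans[(u, a)] = v
                queue.append(v)
    return reps, trans

# Toy oracle alternating between two distributions over {a, b, $}.
def oracle(u):
    return ({"a": 0.6, "b": 0.4, "$": 0.0} if len(u) % 2 == 0
            else {"a": 0.0, "b": 0.5, "$": 0.5})

reps, trans = learn_quotient(oracle, alphabet=["a", "b", "$"])
print(len(reps), "states")   # -> 2 states
```

In the actual method, the equivalence test would be the paper's congruence and `next_dist` would query the (constrained) language model; the sketch only conveys the quotienting idea.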
The experimental results demonstrate the relevance and effectiveness of the approach.