Tokenization is a crucial step in language modeling: it segments input text into atomic units.
A new deep model has been introduced to incorporate morphological structure guidance into tokenization.
The model utilizes a mechanism called $\textit{MorphOverriding}$ to maintain the indecomposability of morphemes and align with morphological rules.
Empirical results show that the proposed method outperforms traditional methods like BPE and WordPiece in morphological segmentation and language modeling tasks.
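
The indecomposability property mentioned above can be illustrated with a small check. The sketch below is not the MorphOverriding mechanism itself, whose internals are not described here; it only tests whether a given tokenization keeps every morpheme intact, that is, whether every token boundary coincides with a morpheme boundary. The function names `boundaries` and `respects_morphemes` and the example segmentation of "unbreakable" are hypothetical and chosen for illustration.

```python
from typing import List, Set


def boundaries(segments: List[str]) -> Set[int]:
    """Return the internal cut positions (character offsets) of a segmentation."""
    cuts: Set[int] = set()
    pos = 0
    for seg in segments[:-1]:  # the end of the last segment is not an internal cut
        pos += len(seg)
        cuts.add(pos)
    return cuts


def respects_morphemes(tokens: List[str], morphemes: List[str]) -> bool:
    """True if no token boundary falls strictly inside a morpheme.

    Tokens may span several morphemes, but they may not split one.
    """
    if "".join(tokens) != "".join(morphemes):
        raise ValueError("tokens and morphemes must spell the same word")
    # Every token cut must also be a morpheme cut.
    return boundaries(tokens) <= boundaries(morphemes)


# Hypothetical gold segmentation, for illustration only.
gold = ["un", "break", "able"]

ok_exact = respects_morphemes(["un", "break", "able"], gold)   # cuts match exactly
ok_merged = respects_morphemes(["unbreak", "able"], gold)      # merges, never splits
bad_split = respects_morphemes(["unb", "reak", "able"], gold)  # cuts inside "break"
```

Under these assumptions, a frequency-driven tokenizer such as BPE or WordPiece can yield segmentations like the third one, since its merges are learned from co-occurrence statistics rather than from morphological structure.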