Deep neural networks have proven to be powerful predictive models across various fields, including tabular data problems.
The transformer architecture has challenged gradient-boosted decision trees on tabular data.
However, the black-box nature of deep tabular transformer networks makes it difficult to interpret marginal feature effects.
A proposed adaptation of tabular transformer networks aims to identify and retain intelligible marginal feature effects while preserving predictive power.