Recent research introduces RETENTION, an end-to-end framework that reduces the content-addressable memory (CAM) capacity required for tree-based model inference.
RETENTION has two components: an iterative pruning algorithm driven by a novel pruning criterion, and a tree mapping scheme that relies on innovative data placement strategies.
The tree mapping scheme alone significantly improves space efficiency.
The full RETENTION framework substantially reduces the CAM capacity requirement with minimal accuracy loss, offering a resource-efficient approach to accelerating tree-based models.
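
To make the capacity problem concrete, the following minimal sketch shows a commonly used baseline way of placing a decision tree in a CAM: each root-to-leaf path becomes one row, and each row stores one range per feature. This is an assumption for illustration only; it is not RETENTION's mapping scheme or pruning criterion, which are not detailed here. The names `Leaf`, `Split`, `tree_to_rows`, `cam_lookup`, and `dont_care_fraction` are hypothetical.

```python
from dataclasses import dataclass
from math import inf
from typing import Union


@dataclass
class Leaf:
    label: int


@dataclass
class Split:
    feature: int
    threshold: float
    left: "Node"   # taken when x[feature] < threshold
    right: "Node"  # taken when x[feature] >= threshold


Node = Union[Leaf, Split]


def tree_to_rows(node, n_features, bounds=None):
    """Return one (intervals, label) row per root-to-leaf path.

    Each row holds a half-open interval [lo, hi) for every feature.
    Features never tested on the path keep (-inf, inf), i.e. "don't care".
    """
    if bounds is None:
        bounds = [(-inf, inf)] * n_features
    if isinstance(node, Leaf):
        return [(list(bounds), node.label)]
    lo, hi = bounds[node.feature]
    left_bounds = list(bounds)
    left_bounds[node.feature] = (lo, min(hi, node.threshold))
    right_bounds = list(bounds)
    right_bounds[node.feature] = (max(lo, node.threshold), hi)
    return (tree_to_rows(node.left, n_features, left_bounds)
            + tree_to_rows(node.right, n_features, right_bounds))


def cam_lookup(rows, x):
    """Emulate a CAM search in software: return the label of the matching row."""
    for intervals, label in rows:
        if all(lo <= v < hi for v, (lo, hi) in zip(x, intervals)):
            return label
    return None


def dont_care_fraction(rows):
    """Fraction of stored cells that carry no information."""
    cells = [iv for intervals, _ in rows for iv in intervals]
    return sum(iv == (-inf, inf) for iv in cells) / len(cells)


# A small hypothetical tree over 3 features.
tree = Split(0, 0.5,
             Leaf(0),
             Split(1, 2.0,
                   Leaf(1),
                   Split(2, -1.0, Leaf(0), Leaf(1))))

rows = tree_to_rows(tree, n_features=3)
print(len(rows), "rows x 3 features =", len(rows) * 3, "CAM cells")
print("don't-care fraction:", dont_care_fraction(rows))
```

Under this baseline, the row count equals the number of leaves and every row spans all features, so shallow paths waste cells on "don't care" entries. Removing leaves (pruning) and packing rows more densely (mapping and data placement) are the two levers the summary above attributes to RETENTION.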