A novel spatiotemporal learning framework for event-based object recognition is presented. The framework uses a VGG network enhanced with a Convolutional Block Attention Module (CBAM). The approach achieves performance comparable to state-of-the-art ResNet-based methods while reducing the parameter count. Experimental results highlight the efficiency and effectiveness of the framework for real-world applications.
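
To give a rough sense of why attaching CBAM to a VGG-style convolution adds few parameters, the sketch below counts learnable weights for one convolution layer and one CBAM block. It is not the framework's implementation. It assumes the commonly described CBAM layout (a shared, bias-free two-layer MLP with reduction ratio 16 for channel attention, and a single bias-free 7x7 convolution over a 2-channel pooled map for spatial attention). The function names and the 512-channel layer size are hypothetical and chosen only for illustration.

```python
def conv2d_params(in_ch, out_ch, kernel, bias=True):
    """Learnable parameters in one 2D convolution layer."""
    return in_ch * out_ch * kernel * kernel + (out_ch if bias else 0)


def cbam_params(channels, reduction=16, spatial_kernel=7):
    """Parameters of one CBAM block under the assumptions stated above."""
    hidden = max(channels // reduction, 1)
    # Channel attention: shared two-layer MLP (C -> C/r -> C), no biases assumed.
    channel_attention = channels * hidden + hidden * channels
    # Spatial attention: one conv over the 2-channel [avg-pool, max-pool] map.
    spatial_attention = conv2d_params(2, 1, spatial_kernel, bias=False)
    return channel_attention + spatial_attention


# Hypothetical example: a 3x3 convolution with 512 input and output channels,
# followed by one CBAM block on its 512-channel output.
conv = conv2d_params(512, 512, 3)
attn = cbam_params(512)
print(f"conv: {conv}, CBAM: {attn}, overhead: {100 * attn / conv:.2f}%")
```

Under these assumptions the attention block is a small fraction of the convolution it follows, which is consistent with the claim that attention can be added without a large increase in model size. The actual savings relative to ResNet-based methods depend on the full architecture, which this sketch does not model.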