“Deep convolutional neural networks (DCNNs) have been found to demonstrate hierarchical mapping to human brain regions on tasks such as object recognition. However, it remains unclear if such hierarchical mapping also applies to action recognition, which involves dynamic visual information processing. Here, we compared action representations of two-stream DCNNs to the human visual system. Five visual areas that are associated with object and action processing were selected. Nine human action categories were adopted from three semantic classes to examine the action representations of both DCNNs and human visual areas. In two fMRI experiments, actions were presented in the forms of computer-rendered videos and point-light biological motion videos. Results showed that although two-stream DCNNs demonstrated hierarchical representations of actions as layers grow deeper, DCNNs lack a hierarchical mapping to human visual areas. Consistently across different video displays and DCNN pathways, only the top DCNN layers demonstrated high similarity to representations in the human visual system. The results suggest that the dynamic representations of human actions may be different in DCNNs compared to the human visual system, even after big-data training.”
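The abstract does not say how similarity between DCNN layers and visual areas was measured. One common way to make such a comparison is representational similarity analysis (RSA), and the sketch below illustrates that idea only; it is an assumption, not the authors' pipeline. The function names, the array sizes (512 layer units, 200 voxels), and the random data are all hypothetical; only the count of nine action categories comes from the abstract.

```python
import numpy as np


def rdm(patterns):
    """Representational dissimilarity matrix (1 - Pearson r).

    patterns: array of shape (n_conditions, n_features), one row per
    action category (e.g. averaged layer activations or voxel responses).
    """
    return 1.0 - np.corrcoef(patterns)


def upper_triangle(matrix):
    """Return the off-diagonal upper-triangle entries as a flat vector."""
    rows, cols = np.triu_indices(matrix.shape[0], k=1)
    return matrix[rows, cols]


def ranks(values):
    """Rank values from 0..n-1 (assumes no ties, fine for continuous data)."""
    return np.argsort(np.argsort(values))


def rdm_similarity(rdm_a, rdm_b):
    """Spearman correlation between two RDMs, using off-diagonal entries."""
    a = ranks(upper_triangle(rdm_a))
    b = ranks(upper_triangle(rdm_b))
    return float(np.corrcoef(a, b)[0, 1])


# Hypothetical data: 9 action categories, random stand-ins for real patterns.
rng = np.random.default_rng(0)
layer_patterns = rng.normal(size=(9, 512))  # assumed DCNN layer activations
voxel_patterns = rng.normal(size=(9, 200))  # assumed fMRI voxel responses

layer_rdm = rdm(layer_patterns)
brain_rdm = rdm(voxel_patterns)
score = rdm_similarity(layer_rdm, brain_rdm)
print(f"layer-to-area RDM similarity: {score:.3f}")
```

Under this scheme, a hierarchical mapping would show up as early layers scoring highest against early visual areas and deep layers against higher-level areas. The pattern reported in the abstract would instead look like only the top layers scoring high, regardless of area.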