Abstract
Multi-person pose estimation is a fundamental yet challenging task in computer vision. Both rich context information and spatial information are required to precisely locate the keypoints for all persons in an image. In this paper, a novel Context-and-Spatial Aware Network (CSANet), which integrates both a Context Aware Path and Spatial Aware Path, is proposed to obtain effective features involving both context information and spatial information. Specifically, we design a Context Aware Path with structure supervision strategy and spatial pyramid pooling strategy to enhance the context information. Meanwhile, a Spatial Aware Path is proposed to preserve the spatial information, which also shortens the information propagation path from low-level features to high-level features. On top of these two paths, we employ a Heavy Head Path to further combine and enhance the features effectively. Experimentally, our proposed network outperforms state-of-the-art methods on the COCO keypoint benchmark, which verifies the effectiveness of our method and further corroborates the above proposition.
Abstract (translated by Google)
URL
https://arxiv.org/abs/1905.05355