Human Parsing with Contextualized Convolutional Neural Network

Abstract: In this work, we address the human parsing task with a novel Contextualized Convolutional Neural Network (Co-CNN) architecture, which integrates the cross-layer context, global image-level context, semantic edge context, within-super-pixel context and cross-super-pixel neighborhood context into a unified network. Given an input human image, Co-CNN produces the pixel-wise categorization in an end-to-end way. First, the cross-layer context is captured by our basic local-to-global-to-local structure, which hierarchically combines the global semantic information and the local fine details across different convolutional layers. Second, the global image-level label prediction is used as an auxiliary objective in the intermediate layer of the Co-CNN, and its outputs are further used to guide the feature learning in subsequent convolutional layers, leveraging the global image-level context. Third, semantic edge context is incorporated into Co-CNN, where the high-level semantic boundaries are leveraged to guide pixel-wise labeling. Finally, to further utilize the local super-pixel contexts, the within-super-pixel smoothing and cross-super-pixel neighborhood voting are formulated as natural sub-components of the Co-CNN to achieve local label consistency in both the training and testing processes. Comprehensive evaluations on two public datasets demonstrate the significant superiority of our Co-CNN over other state-of-the-art methods for human parsing. In particular, the F-1 score on the large dataset [1] reaches 81.72%.
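
To make the within-super-pixel smoothing step more concrete, the following is a minimal NumPy sketch and not the authors' implementation. It assumes that smoothing means averaging the per-pixel class scores over each super-pixel, which the abstract does not specify. The function name `within_superpixel_smoothing`, the array shapes and the toy data are all hypothetical choices made for illustration.

```python
import numpy as np


def within_superpixel_smoothing(scores, superpixels):
    """Average per-pixel class scores inside each super-pixel.

    Assumption (not stated in the abstract): smoothing is a plain mean.

    scores:      float array (H, W, C) of per-pixel class confidences.
    superpixels: int array (H, W) of super-pixel ids in [0, K).
    Returns an (H, W, C) array in which every pixel carries the mean
    score vector of the super-pixel it belongs to.
    """
    h, w, c = scores.shape
    flat_ids = superpixels.reshape(-1)
    flat_scores = scores.reshape(-1, c)
    k = int(flat_ids.max()) + 1

    # Accumulate score vectors per super-pixel id (unbuffered add).
    sums = np.zeros((k, c), dtype=np.float64)
    np.add.at(sums, flat_ids, flat_scores)

    # Number of pixels in each super-pixel; guard against empty ids.
    counts = np.bincount(flat_ids, minlength=k).astype(np.float64)
    means = sums / np.maximum(counts, 1.0)[:, None]

    # Broadcast each super-pixel mean back to its pixels.
    return means[flat_ids].reshape(h, w, c)


# Toy 2x2 image with 2 classes and 2 super-pixels (top row, bottom row).
scores = np.array([[[0.9, 0.1], [0.3, 0.7]],
                   [[0.2, 0.8], [0.4, 0.6]]])
superpixels = np.array([[0, 0],
                        [1, 1]])

smoothed = within_superpixel_smoothing(scores, superpixels)
labels = smoothed.argmax(axis=-1)
```

In this toy input the top-right pixel initially prefers class 1, but after averaging with its super-pixel neighbor the whole top row is labeled class 0, which is the kind of local label consistency the abstract refers to. In Co-CNN this step is described as a sub-component of the network used in both training and testing; the sketch above only shows the averaging as a standalone post-processing function.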
