Automatic Body Segmentation with Graph Cut and Self-Adaptive Initialization Level Set (SAILS)




Abstract

As computer technologies find ever wider application, automatic object segmentation plays an increasingly important role in digital video processing, pattern recognition, and computer vision. In this paper, we propose an automatic human body segmentation system consisting mainly of human body detection and object segmentation. First, an automatic human body detector is designed to provide hard constraints on the object and background for the subsequent segmentation. Second, in the first frame to be segmented, a coarse-to-fine segmentation strategy is employed to handle the case of a partly detected object. By investigating the well-known graph cut segmentation algorithm and its implementations, we find that segmentation errors often occur at the object boundary against a cluttered background, which results in unpleasant visual artifacts in the segmented video. Therefore, background contrast removal (BCR) is proposed to weaken the high contrast within the background while preserving the contrast across the boundary between foreground and background. Third, we propose a self-adaptive initialization level set (SAILS) to address the difficult problem of undefined boundaries in region-based segmentation and, at the same time, to speed up the evolution process. Finally, an object updating scheme is proposed to detect and re-initialize a new object when an object disappears and reappears in the scene. Experimental results demonstrate that our body segmentation system works very well on live video with strong edges and similar colors in the background.
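
To make the idea of weakening background contrast concrete, the following is a minimal Python sketch. It is not the BCR formulation of this paper, whose details are not given on this page. It assumes a known background image and a simple attenuation rule in the spirit of "Background cut" [12]: the contrast between two neighboring pixels is scaled down when the background image shows an edge at the same place. The function names and the parameters k and beta are hypothetical and chosen only for illustration.

```python
import numpy as np


def horizontal_contrast(image):
    """Squared color difference between each pixel and its right neighbor."""
    pixels = image.astype(np.float64)
    diff = pixels[:, 1:, :] - pixels[:, :-1, :]
    return np.sum(diff ** 2, axis=2)


def attenuated_contrast(frame, background, k=5.0):
    """Scale down frame contrast where the background has an edge too.

    Assumed rule (illustrative only): d' = d / (1 + d_background / k^2).
    Pairs with no background edge keep their original contrast.
    """
    d_frame = horizontal_contrast(frame)
    d_back = horizontal_contrast(background)
    return d_frame / (1.0 + d_back / k ** 2)


def smoothness_weights(contrast, beta=None):
    """Standard graph cut pairwise weight exp(-beta * contrast)."""
    if beta is None:
        mean = contrast.mean()
        # Common heuristic: beta is the inverse of twice the mean contrast.
        beta = 1.0 / (2.0 * mean) if mean > 0 else 0.0
    return np.exp(-beta * contrast)


if __name__ == "__main__":
    # Background with one strong vertical edge between columns 2 and 3.
    background = np.zeros((4, 6, 3), dtype=np.uint8)
    background[:, 3:, :] = 100
    # The frame adds a new edge (a stand-in for a foreground boundary)
    # between columns 4 and 5, where the background is flat.
    frame = background.copy()
    frame[:, 5, :] = 200

    contrast = attenuated_contrast(frame, background)
    weights = smoothness_weights(contrast)
    print("attenuated contrast, first row:", contrast[0])
    print("pairwise weights, first row:", weights[0])
```

In this toy example the edge that also exists in the background is strongly attenuated, so cutting through it becomes expensive, while the edge that appears only in the frame keeps its full contrast and remains a cheap place for the segmentation boundary.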


Experimental Results (Compressed by JPEG)

Fig.1 Comparison with "Background cut" [12]: the input images are listed in the first row; the second and last rows show the results segmented by "Background cut" and by our method, respectively.
The improvement is marked by white ellipses.

Fig.2 Comparison with Level Set [32]: the red curves mark the segmented boundary between object and background. The first row shows the results of Level Set
and the second row shows the results of our method. The improvement is marked by white ellipses.

Fig.3 Segmentation results of the "In lab" and "ZW" sequences with a partly moving background and similar colors in the background: the input images are listed in the odd rows and the corresponding results are shown in the even rows.

Fig.4 Segmentation results of the "In office" sequence with head and body movement: the input images are listed in the odd rows and the corresponding results are shown in the even rows.

Fig.5 Segmentation results of an outdoor sequence with moving people: the input images are listed in the odd rows and the corresponding results are shown in the even rows.



References

[1] Ç. E. Erdem, F. Ernst, A. Redert and E. Hendriks, "Temporal stabilization of Video Object Segmentation for 3D-TV applications," Signal Processing: Image Communication, Volume 20, Issue 2, February 2005, Pages 151-16
[2] J. Y. A. Wang and E. H. Adelson, "Layered representation for motion analysis," in IEEE Conference on Computer Vision and Pattern Recognition, pp. 361-366, New York, June 1993.
[3] J. Wills, S. Agarwal, and S. Belongie, "What went where," in IEEE Conference on Computer Vision and Pattern Recognition, pp. 37-44, 2003.
[4] H. Li and K.N. Ngan, "Automatic Video Segmentation and Tracking for Content-based Multimedia Services," IEEE Communications Magazine, U.S.A., vol. 45, no. 1, pp. 27-33, January 2007.
[5] Y. Boykov and M. P. Jolly, "Interactive graph cuts for optimal boundary & region segmentation of objects in n-d images," in Proceedings of the International Conference on Computer Vision, vol. 1, pp. 105-112, July 2001.
[6] C. Rother, V. Kolmogorov, and A. Blake, "GrabCut - interactive foreground extraction using iterated graph cuts," in Proceedings of ACM SIGGRAPH, 2004.
[7] Y. Li, J. Sun, C. K. Tang, and H. Y. Shum, "Lazy snapping," in Proceedings of ACM SIGGRAPH, 2004.
[8] Y. Li, J. Sun, and H. Y. Shum, "Video object cut and paste," in Proceedings of ACM SIGGRAPH, pp. 595-600, 2005.
[9] J. Wang, P. Bhat, R. A. Colburn, M. Agrawala, and M. F. Cohen, "Interactive video cutout," in Proceedings of ACM SIGGRAPH, pp. 585-594, 2005.
[10] Huitao Luo, Alexandros Eleftheriadis, "An interactive authoring system for video object segmentation and annotation," Signal Processing: Image Communication, Volume 17, Issue 7, August 2002, Pages 559-572.
[11] H. Li and K.N. Ngan, "Unsupervised video segmentation with low depth of field," IEEE Transactions on Circuits and Systems for Video Technology, U.S.A., vol. 17, no. 12, pp. 1742-1751, December 2007.
[12] J. Sun, W. Zhang, X. Tang, and H.-Y. Shum, "Background cut," in Proceedings of European Conference on Computer Vision, vol.2, pp. 628-641, 2006.
[13] Q. Zhang, K.N. Ngan and W. Yang, "Automatic Segmentation for Semantic Objects from Multiview Images," International Conference on Visual Information Engineering, Xian, China, pp. 554-559, 29 July-1 August, 2008.
[14] V. Kolmogorov, A. Criminisi, A. Blake, G. Cross, and C. Rother, "Bi-layer segmentation of binocular stereo video," in IEEE Conference on Computer Vision and Pattern Recognition, pp. 1186-1193, 2005.
[15] A. Criminisi, G. Cross, A. Blake, and V. Kolmogorov, "Bilayer segmentation of live video," in IEEE Conference on Computer Vision and Pattern Recognition, pp. 53-60, 2006.
[16] H. Li, K.N. Ngan and Q. Liu, "FaceSeg: Automatic Face Segmentation for Real-time Video," IEEE Transactions on Multimedia, U.S.A., vol. 11, no. 1, pp. 77-88, January 2009.
[17] H. Gao, Weisi Lin, P. Xue, Wan-Chi Siu, "Marker-based image segmentation relying on disjoint set union," Signal Processing: Image Communication, Volume 21, Issue 2, February 2006, Pages 100-112.
[18] K. Mehmood, M. Mrak, J. Calic, A. Kondoz, "Object tracking in surveillance videos using compressed domain features from scalable bit streams," Signal Processing: Image Communication, Volume 24, Issue 10, November 2009, Pages 814-824.
[19] Changick Kim and Jenq-Neng Hwang, "Fast and Automatic Video Object Segmentation and Tracking for Content-Based Applications," IEEE Transactions on Circuits and Systems for Video Technology, vol. 12, no. 2, Feb. 2002.
[20] N. Atzpadin, P. Kauff and O. Schreer, "Stereo analysis by hybrid recursive matching for real-time immersive video conferencing," IEEE Transactions on Circuits and Systems for Video Technology, vol.14, no.3. pp. 321-334, 2004.
[21] J. Ostermann, "Object-based analysis-synthesis coding based on the source model of moving rigid 3D objects," Signal Processing: Image Communication, Volume 6, Issue 2, May 1994, Pages 143-161.
[22] J. Wang, Yingqing Xu, Heung-Yeung Shum, Michael F. Cohen, "Video tooning," ACM SIGGRAPH 2004 Papers, August 08-12, 2004, Los Angeles, California.
[23] P. Viola and M. Jones, "Rapid object detection using a boosted cascade of simple features," in IEEE Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 511-518, Dec. 2001.
[24] L. Bourdev and J. Malik, "Poselets: Body part detectors trained using 3d human pose annotations," in ICCV, 2009.
[25] Y. Boykov and V. Kolmogorov, "An experimental comparison of min-cut/max-flow algorithms for energy minimization in vision," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 26, no. 9, pp. 1124-11, September 2004.
[26] Y. Sheikh and M. Shah, "Bayesian object detection in dynamic scenes," in IEEE Conference on Computer Vision and Pattern Recognition, pp. 1778-1792, 2005.
[27] O. Tuzel, F. Porikli, and Peter Meer, "A bayesian approach to background modeling," In IEEE Workshop on Machine Vision for Intelligent Vehicles, ISSN: 1063-6919, Vol. 3, pp.58-63, June 2005.
[28] C. Wren, A. Azarbayejani, T. Darrell, and A. Pentland, "Pfinder: Real-time tracking of the human body," IEEE Transactions on Pattern Analysis and Machine Intelligence, volume 19, pp. 780-785, 1997.
[29] Ç. E. Erdem, "Video object segmentation and tracking using region based statistics," Signal Processing: Image Communication, Volume 22, Issue 10, November 2007, Pages 891-905.
[30] Ioannis Patras, Emile A. Hendriks, Reginald L. Lagendijk, "Semiautomatic object-based video segmentation with labeling of color segments," Signal Processing: Image Communication, Volume 18, Issue 1, January 2003, Pages 51-65.
[31] L. Evans, "Partial Differential Equations," Providence: American Mathematical Society, 1998.
[32] C. Li, C. Y. Xu, C. F. Gui and M. D. Fox, "Level Set Evolution Without Re-initialization: A New Variational Formulation," in IEEE Conference on Computer Vision and Pattern Recognition, vol. 1, pp. 430-436, June 2005.
[33] J. A. Sethian, "Level set methods and fast marching methods," Cambridge: Cambridge University Press, 1999.
[34] S. Osher and R. Fedkiw, "Level Set Methods and Dynamic Implicit Surfaces," Springer-Verlag, New York, 2002.
[35] T. Chan and L. Vese, "Active contours without edges," IEEE Transactions on Image Processing, vol. 10, pp. 266-277, Feb. 2001.
[36] B. Vemuri and Y. Chen, "Joint image registration and segmentation," Geometric Level Set Methods in Imaging, Vision, and Graphics, Springer, New York, pp. 251-269, 2003.
[37] I. E. Richardson, "H.264 and MPEG-4 Video Compression: Video Coding for Next-generation Multimedia," John Wiley & Sons Ltd., Sussex, England, Dec. 2003.
[38] Y. Y. Chuang, B. Curless, D. H. Salesin, and R. Szeliski, "A Bayesian approach to digital matting," in IEEE Conference on Computer Vision and Pattern Recognition, vol. 2, pp. 264-271, 2001.
[39] J. Sun, J. Jia, C. K. Tang, and H.-Y. Shum, "Poisson matting," in Proceedings of ACM SIGGRAPH, 2004.


Miscellaneous

Innovation Expro 2007


 Last Update: July 7, 2010.