Feature-based Object Rendering from Sparse Views

 

Introduction

Most current multiview rendering schemes involve a large number of cameras to capture the scene or object of interest.

This work attempts to synthesize photorealistic novel views from a small set of widely separated views, where the angular spacing between two adjacent views exceeds 30 degrees.
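To make the wide-baseline notion concrete, here is a minimal Python sketch (not part of this work) that measures the angle subtended at the object by two camera centers. angular_baseline_deg is a hypothetical helper; in practice the camera centers would come from calibration and the pivot from the object's centroid.

import numpy as np

def angular_baseline_deg(cam_a, cam_b, target):
    # Angle (in degrees) subtended at `target` by the two camera centers.
    va = cam_a - target
    vb = cam_b - target
    cos_angle = np.dot(va, vb) / (np.linalg.norm(va) * np.linalg.norm(vb))
    return np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0)))

# Example: two cameras on a circle of radius 3 around the object, 40 degrees apart.
c0 = np.array([3.0, 0.0, 0.0])
c1 = 3.0 * np.array([np.cos(np.radians(40.0)), np.sin(np.radians(40.0)), 0.0])
print(angular_baseline_deg(c0, c1, np.zeros(3)))  # ~40.0, i.e. a wide baseline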

 

[Figure: a dense camera system (left) vs. a sparse camera system (right)]

 

Such a sparse camera system has a number of advantages, such as lower hardware cost, a simpler capture setup, and less data to store and process.

Experimental Results

 

Yoga sequence

Input: three views + one mask that defines the object of interest

(Note that because the background changes significantly in the wide-baseline setting, we focus only on rendering the foreground object.)

 


 

 

Output: free viewpoint navigation between the input views

(Note that the rendered object may look incomplete in the leftmost and rightmost views, because some regions of the object surface inherently have no correspondences among the input images; see the sketch below.)
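As a rough illustration of this visibility constraint, here is a minimal Python sketch (a simplification, not the method used in this work): a surface point can be matched, and hence rendered, only if it faces at least two of the input cameras. All names are hypothetical, and occlusion by other geometry is ignored.

import numpy as np

def faces_camera(point, normal, cam_center, max_angle_deg=80.0):
    # A surface point can be photographed only if its outward normal
    # points roughly toward the camera (occlusion is ignored here).
    view_dir = cam_center - point
    view_dir = view_dir / np.linalg.norm(view_dir)
    return float(np.dot(normal, view_dir)) > np.cos(np.radians(max_angle_deg))

def has_correspondence(point, normal, cam_centers):
    # Stereo matching needs at least two views of the point; otherwise
    # the region cannot be reconstructed and shows up as a hole.
    return sum(faces_camera(point, normal, c) for c in cam_centers) >= 2

# A point whose normal faces away from both cameras has no correspondence:
cams = [np.array([3.0, 0.0, 0.0]), np.array([0.0, 3.0, 0.0])]
print(has_correspondence(np.zeros(3), np.array([-1.0, 0.0, 0.0]), cams))  # False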

 

Download the full-resolution video

 

Comparison: the result obtained by PMVS [1], which is provided with three object masks for its best performance.

 

 

Comparison with the ground truth:

 


[Figure: ground truth (left) vs. synthesized view (right), and their per-pixel color difference]
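The color-difference image can be reproduced in a few lines; the sketch below assumes a per-pixel Euclidean distance in RGB (the exact metric used here is not specified), and the file names are placeholders.

import numpy as np
from PIL import Image

def color_difference(gt_path, synth_path):
    # Per-pixel Euclidean RGB distance, rescaled to an 8-bit grayscale image.
    gt = np.asarray(Image.open(gt_path).convert("RGB"), dtype=np.float32)
    sy = np.asarray(Image.open(synth_path).convert("RGB"), dtype=np.float32)
    diff = np.linalg.norm(gt - sy, axis=2)
    if diff.max() > 0:
        diff = diff / diff.max() * 255.0
    return Image.fromarray(diff.astype(np.uint8))

# color_difference("ground_truth.png", "synthesized.png").save("diff.png")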

 

 

 

Girl sequence

Input: two views + one object mask

 


 

 

Output: free viewpoint navigation between the input views

 

Download the full-resolution video

 

Comparison with the ground truth:

 


[Figure: ground truth (left) vs. synthesized view (right), and their per-pixel color difference]

 

 

 

Cityhall sequence

Input: two views (the entire first image is treated as the object of interest)

 


 

 

Output: free viewpoint navigation between the input views

 

Download the full-resolution video

 

 

[1] Y. Furukawa and J. Ponce, "Accurate, Dense, and Robust Multi-View Stereopsis," in IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), 2007.