Stereo Vision Laboratory

8. PAPER: A new stereo formulation not using pixel and disparity models

We have submitted a paper on gaze_line-depth model stereo to arXiv.

Abstract

We introduce a new stereo formulation that does not use pixel and disparity models. Many problems in vision are treated as assigning a label to each pixel; in stereo, the labels are disparities. Such pixel-labeling problems are naturally represented in terms of energy minimization, where the energy function has two terms: one penalizes solutions that are inconsistent with the observed data, and the other enforces spatial smoothness. Graph cuts are one of the efficient methods for solving energy minimization problems. However, graph cuts can minimize multi-label problems exactly only when the smoothness term is convex. In the pixel-disparity formulation, convex smoothness terms do not produce well-reconstructed 3D results. Thus, smoothness terms such as truncated linear or quadratic ones are used, which require approximate energy minimization. In this paper, we introduce a new site-labeling formulation in which the sites are not pixels but lines in 3D space, and the labels are not disparities but depth numbers. In this formulation, visibility reasoning is naturally included in the energy function. In addition, the formulation allows us to use a small smoothness term, which does not affect the 3D results much. This makes the optimization step very simple, so we were able to develop an approximation method for the graph cut itself (not for energy minimization) and a high-performance GPU graph cut program. For the Tsukuba stereo pair in the Middlebury data set, we obtained the result in 5 ms on a GTX 1080 GPU and in 19 ms on a GTX 660 GPU.
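
As a rough illustration of the two-term energy described in the abstract, the following toy sketch evaluates a data term plus a smoothness term on a 1D chain of four sites. It is not the paper's method or code: the function names, the 0/1 data cost, the cap value, and the example labelings are all assumptions chosen for illustration. It only shows why a convex (linear) smoothness penalty tends to favor smoothing away a depth discontinuity, while a truncated penalty can preserve it.

```python
def linear_penalty(a, b):
    """Convex smoothness penalty: grows without bound with the label difference."""
    return abs(a - b)


def truncated_linear_penalty(a, b, cap=1):
    """Truncated penalty: the cost of a label jump is capped at `cap` (assumed value)."""
    return min(abs(a - b), cap)


def labeling_energy(labels, data_cost, penalty, weight=1.0):
    """Energy of a labeling on a 1D chain of sites.

    data_cost[i][l] is the cost of assigning label l to site i.
    """
    data_term = sum(data_cost[i][l] for i, l in enumerate(labels))
    smooth_term = sum(penalty(a, b) for a, b in zip(labels, labels[1:]))
    return data_term + weight * smooth_term


# Hypothetical toy data: 4 sites, 6 labels.
# The data cost is 0 at the "observed" label and 1 for any other label.
observed = [0, 0, 5, 5]
data_cost = [[0 if l == o else 1 for l in range(6)] for o in observed]

sharp = [0, 0, 5, 5]  # keeps the discontinuity between sites 1 and 2
flat = [0, 0, 0, 0]   # smooths the discontinuity away

e_sharp_linear = labeling_energy(sharp, data_cost, linear_penalty)
e_flat_linear = labeling_energy(flat, data_cost, linear_penalty)
e_sharp_trunc = labeling_energy(sharp, data_cost, truncated_linear_penalty)
e_flat_trunc = labeling_energy(flat, data_cost, truncated_linear_penalty)

print("linear:   ", e_sharp_linear, e_flat_linear)
print("truncated:", e_sharp_trunc, e_flat_trunc)
```

With these assumed costs, the linear penalty gives the flat labeling the lower energy, whereas the truncated penalty gives the lower energy to the labeling that keeps the discontinuity. Real stereo energies are defined over 2D neighborhoods and use image-based data costs, so this sketch conveys only the qualitative effect.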