Data association via set packing for computer vision applications



Significant progress has been made in computer vision thanks to the development of supervised machine learning algorithms, which efficiently extract information from high-dimensional data such as images and videos. Such techniques are particularly effective at recognizing the presence or absence of entities in domains where labeled data is abundant. However, supervised learning alone is not sufficient in applications where one needs to annotate each unique entity in a crowded scene while respecting known domain-specific structures of those entities. This problem, known as data association, provides fertile ground for the application of combinatorial optimization. In this paper, we present three computer vision applications, namely multi-person tracking, multi-person pose estimation, and multi-cell segmentation, which can be formulated as integer linear programs with a massive number of variables. To solve these problems, column generation algorithms are applied to circumvent the need to enumerate all variables explicitly. To enhance the solution process, we provide a general approach for applying subset-row inequalities to tighten the formulations, and introduce novel dual optimal inequalities to reduce the dual search space. The proposed algorithms and their enhancements are successfully applied to the three aforementioned computer vision problems and achieve superior performance compared to benchmark approaches.
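
For orientation, a generic set-packing integer linear program and the pricing test used in column generation can be sketched as follows. This is a textbook form under assumed notation, not necessarily the formulation used in the paper: the symbols D (set of observations, e.g. detections), H (set of candidate hypotheses such as tracks, poses, or cells, each a subset of D), c_h (cost of hypothesis h), x_h (selection variable), and \lambda_d (dual value of the packing constraint for observation d) are all introduced here for illustration.

```latex
% Set-packing ILP (assumed notation): select hypotheses of minimum total cost
% so that each observation d belongs to at most one selected hypothesis.
\begin{align*}
  \min_{x} \quad & \sum_{h \in H} c_h \, x_h \\
  \text{s.t.} \quad & \sum_{h \in H \,:\, d \in h} x_h \le 1 \qquad \forall d \in D, \\
  & x_h \in \{0, 1\} \qquad \forall h \in H.
\end{align*}

% Column generation: solve the LP relaxation over a small subset \hat{H} \subseteq H,
% read off the dual values \lambda_d \ge 0 of the packing constraints, then price.
% The reduced cost of a hypothesis h is
\[
  \bar{c}_h \;=\; c_h + \sum_{d \in h} \lambda_d .
\]
% If \min_{h \in H} \bar{c}_h < 0, the minimizing hypothesis is added to \hat{H}
% and the restricted LP is re-solved; otherwise the LP relaxation over H is solved.
```

Because H is typically far too large to enumerate, the minimization of the reduced cost is carried out by a problem-specific pricing routine rather than by scanning all hypotheses.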

34 pages

G1942.pdf (20 MB)