Parallax effect motion generation

Stereo vision is a growing topic in computer vision due to the many opportunities and applications this technology offers for the development of modern solutions, such as virtual and augmented reality applications. Motion parallax estimation is a promising technique for enhancing the user’s experience in three-dimensional virtual environments. In this paper, we propose an algorithm for generating parallax motion effects from a single image, taking advantage of state-of-the-art instance segmentation and depth estimation approaches. This work also presents a comparison among these approaches to investigate the trade-off between efficiency and quality of the parallax motion effects, taking into consideration a multitask learning network capable of performing instance segmentation and depth estimation at once. Experimental results and visual quality assessment indicate that the PyD-Net network (depth estimation) combined with the Mask R-CNN or FBNet networks (instance segmentation) can produce parallax motion effects with good visual quality.
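To illustrate the general idea of depth-driven parallax, the following is a minimal sketch and not the algorithm proposed in the paper. It assumes a depth map is already available (for example, from a depth estimation network) and that larger depth values mean farther pixels. The function name `parallax_frame` and its parameters are hypothetical. Pixels are shifted horizontally in proportion to their nearness, layers are pasted from far to near so that closer content occludes the background, and the disoccluded holes are only reported as a mask; filling them would require a separate inpainting step.

```python
import numpy as np

def parallax_frame(image, depth, shift_px):
    """Render one parallax frame by shifting pixels according to depth.

    image:    (H, W, 3) uint8 array.
    depth:    (H, W) float array; larger = farther (assumed convention).
    shift_px: horizontal displacement, in pixels, of the nearest pixels.
    Returns the warped frame and a boolean mask of unfilled (hole) pixels.
    """
    h, w = depth.shape
    span = depth.max() - depth.min()
    if span > 0:
        # Normalize to [0, 1] and invert so the nearest pixels get weight 1.
        nearness = 1.0 - (depth - depth.min()) / span
    else:
        # Flat depth map: nothing moves.
        nearness = np.zeros_like(depth, dtype=float)

    # Integer displacement per pixel; far pixels move least.
    disp = np.rint(shift_px * nearness).astype(int)

    out = np.zeros_like(image)
    filled = np.zeros((h, w), dtype=bool)

    # Paste layers in order of increasing |displacement| (far to near),
    # so nearer layers overwrite farther ones where they overlap.
    for d in sorted(np.unique(disp), key=abs):
        src_y, src_x = np.nonzero(disp == d)
        dst_x = src_x + d
        ok = (dst_x >= 0) & (dst_x < w)  # drop pixels shifted off-frame
        out[src_y[ok], dst_x[ok]] = image[src_y[ok], src_x[ok]]
        filled[src_y[ok], dst_x[ok]] = True

    return out, ~filled
```

Calling this function with a sequence of `shift_px` values (for instance, sweeping from a negative to a positive offset) would yield frames of a simple motion effect. An instance segmentation mask could be used to keep each object's pixels at a single displacement, though that refinement is not shown here.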

Jose Luis Flores Campana
Ph.D. in Computer Science

Jose Luis Flores received his B.Sc. in Computer and Software Engineering from the University of San Antonio Abad de Cusco (UNSAAC), Peru, in 2016. As a bachelor’s student, Jose worked on a research paper on the recognition and classification of sign language hand gestures using handcrafted and deep learning techniques. Afterward, Jose obtained his M.Sc. in Computer Science from the State University of Campinas (Unicamp), Brazil, in 2020. As a master’s student, Jose was part of a team of researchers from SAMSUNG Brasil and UNICAMP, where he worked on two projects: “Multilingual text detection and recognition in images and videos” and “Generation of parallax motion effects”. In 2024, Jose received his Ph.D. from the State University of Campinas (Unicamp), Brazil. As a Ph.D. student, Jose worked on Image Inpainting and Image Synthesis, focusing on Deep Learning models such as Generative Adversarial Networks and Vision Transformers. His research covers Machine Learning, Deep Learning, and Image Processing, with a specialization in Text Detection and Recognition in images and videos, Image Inpainting, and Image Synthesis. He currently works as a software engineer at Loggi.