Sayan Deb Sarkar

I am a Computer Science MSc student at ETH Zurich, majoring in Visual Interactive Computing. Currently, I am a research intern in the Perception team at Qualcomm XR Labs hosted by Marco Manfredi, optimising SLAM algorithms for real-time performance and integrating deep learning algorithms for improved tracking in adversarial scenarios. At ETHZ, I am a part of the Computer Vision and Geometry Group headed by Prof. Marc Pollefeys, where I work closely with Dr. Daniel Barath, Dr. Ondrej Miksik (Microsoft Mixed Reality & AI Labs, Zurich) and Prof. Iro Armeni (Stanford University) on aligning real-world 3D environments from multi-modal data and improving design processes using data-driven methods.

Before starting my MSc, I worked for around a year as a Computer Vision Research Engineer at Mercedes-Benz R&D on Intelligent Interior Camera Systems, and for one and a half years as a Research Assistant with Prof. Vincent Lepetit at the Institute of Computer Graphics and Vision, TU Graz. My work has been published in top-tier Vision conferences such as CVPR, ICCV and ECCV.

I am always looking for interesting research collaborations and ideas; please get in touch via email for any such opportunities. Also, if you're in or around Zurich, feel free to reach out. I am always up for a good cup of coffee!

CV  /  Google Scholar  /  Twitter  /  Github  /  LinkedIn

Recent News
  • 07/2023 Started a research internship at Qualcomm XR Labs, moved to Amsterdam
  • 07/2023 SGAligner accepted to ICCV 2023, my first first-author submission!
  • 04/2023 We're organising the workshop CV4AEC @ CVPR 2023; participate in the 2D and 3D challenges!
  • 10/2022 Started working at CVG on scene understanding
  • 09/2022 Moved to Zurich! Started MSc at ETH
  • 07/2022 Keypoint Transformer accepted at CVPR 2022 as an Oral presentation
  • 05/2021 Started at Mercedes-Benz R&D as a Computer Vision Research Engineer!

My research interests lie at the intersection of Computer Vision and Machine Learning, specifically in the areas of 3D scene understanding and pose estimation.

SGAligner : 3D Scene Alignment with Scene Graphs
Sayan Deb Sarkar, Ondrej Miksik, Marc Pollefeys, Daniel Barath, Iro Armeni
arXiv / Project Page / Video / Code
International Conference on Computer Vision (ICCV), 2023

We focus on the fundamental problem of aligning pairs of 3D scene graphs whose overlap can range from zero to partial and can contain arbitrary changes. We propose SGAligner, the first method for aligning pairs of 3D scene graphs that is robust to in-the-wild scenarios.

Keypoint Transformer: Solving Joint Identification in Challenging Hands and Object Interactions for Accurate 3D Pose Estimation
Shreyas Hampali, Sayan Deb Sarkar, Mahdi Rad, Vincent Lepetit
Computer Vision and Pattern Recognition (CVPR), 2022 Oral
arXiv / Project Page / Video / Code

We propose an efficient network architecture for estimating the poses of two hands and an object during complex interactions. We also release the challenging H2O-3D dataset, which contains two hands interacting with YCB objects.

Monte Carlo Scene Search for 3D Scene Understanding
Shreyas Hampali*, Sinisa Stekovic*, Sayan Deb Sarkar, Chetan Srinivasa Kumar, Friedrich Fraundorfer, Vincent Lepetit
Computer Vision and Pattern Recognition (CVPR), 2021
arXiv / Project Page / Video / Code

We propose a Monte-Carlo Tree Search (MCTS) based analysis-by-synthesis method to recover the complete scene (3D layout + objects) from an RGB-D scan of the environment.
*Equal contribution

General 3D Room Layout from a Single View by Render-and-Compare
Sinisa Stekovic, Shreyas Hampali, Mahdi Rad, Sayan Deb Sarkar, Friedrich Fraundorfer, Vincent Lepetit
European Conference on Computer Vision (ECCV), 2020
arXiv / Project Page / Video / Code

We propose an analysis-by-synthesis method to estimate the 3D layout of a room (walls, floors, ceilings) from a single perspective view. The method recovers complex non-cuboid layouts by solving a constrained discrete optimization problem.

Course Projects

Ray Tracing
Computer Graphics Rendering Competition, Autumn Semester 2022

Implemented a ray tracer on the Nori framework with features such as advanced camera models, participating media, photon mapping and the Disney BRDF.

Design and code from Jon Barron's website