Haoye Dong 

Carnegie Mellon University

I have been a Postdoctoral Fellow at CMU since September 2022, working with Prof. Fernando de la Torre and Dr. Dong Huang. I received my Ph.D. from Sun Yat-sen University, advised by Prof. Jian Yin and Prof. Xiaodan Liang.

My research focuses mainly on human-centric generative AI.

donghaoye@cmu.edu

I'm looking for a research position starting in 2024. Please reach out to me about any opportunities. Thanks!

Research Statement

Research Timeline

An overview of my past research, which includes controllable 2D/3D human image generation, realistic human try-on video synthesis, and accurate 3D human motion/generation using robust regressors and neural rendering.

Future Research Plan

The big picture of my future research plan: first, building large human models for accurate and robust 3D humans from a single image; second, understanding and generating 3D humans in the scene; and last, leveraging physical dynamics-aware models for learning 3D human-scene interaction.

Latest Projects

Human-Adapter: Customizable 3D Human Generation with Gaussian Splatting

Key idea:

We introduce a shape and appearance adaptation framework, named Human-Adapter, which generates customizable 3D humans from user-input measurements, without multi-view images. Human-Adapter is the first attempt to achieve not only shape control but also identity preservation with a specified face image.

DreamVTON: Customizing 3D Virtual Try-on with Personalized Diffusion Models

Key idea:

We propose a novel model for customizing 3D human try-on, named DreamVTON, which optimizes the geometry and texture of the 3D human separately. A personalized Stable Diffusion (SD) model with multi-concept LoRA provides the generative prior for the specific person and clothes. DreamVTON introduces a template-based optimization mechanism, which uses mask templates to learn the geometry shape and normal/RGB templates to learn geometry and texture details.

WarpDiffusion: Efficient Diffusion Model for High-Fidelity Virtual Try-on

Key idea:

We propose WarpDiffusion, which bridges the warping-based and diffusion-based paradigms via a novel informative and local garment feature attention mechanism. Specifically, WarpDiffusion incorporates local texture attention to reduce resource consumption and uses a novel auto-mask module that retains only the critical areas of the warped garment while disregarding unrealistic or erroneous portions. Notably, WarpDiffusion can be integrated as a plug-and-play component into existing VITON methods to improve their synthesis quality.

Physical-space Multi-body Mesh Detection Achieved by Local Alignment and Global Dense Learning

Key idea:

We introduce Physical-space Multi-body Mesh Detection, in which (1) locally, we preserve the body aspect ratio, align the body-to-RoI layout, and densely refine the person-wise RoI features for robustness; and (2) globally, we learn dense-depth-guided features to amend the body-wise local features for physical depth estimation.

Hello 3D: Universal Monocular 3D Human Recovery Engine

Key idea:

We present a universal software engine for real-time 3D human perception in moving robots, stationary monitoring, and sports training. Our perception engine uses only one monocular RGB camera, produces accurate 3D human meshes in physical sizes with 3D translations, and enables real-time deployment on both moving and stationary platforms. A live public demo based on the engine is installed on the 3rd floor of the NSH building on the Carnegie Mellon University campus, Pittsburgh, PA.
https://delightcmu.github.io/Hello3D/ 

Publications

  1. Human-Adapter: Customizable 3D Human Generation with Gaussian Splatting. Under review

  1. DreamVTON: Customizing 3D Virtual Try-on with Personalized Diffusion Models. Under review

  1. WarpDiffusion: Efficient Diffusion Model for High-Fidelity Virtual Try-on. Under review

  1. ControlNeRF: Text-to-3D Object Generation via Learning Diffusion-driven Cross-modal Guidances. Under review

  1. Universal Monocular 3D Human Recovery Engine. Project Web
Haoye Dong, Jun Liu, Dong Huang. Proceedings of IEEE International Conference on Robotics and Automation (ICRA), 2024.
  1. Physical-space Multi-body Mesh Detection Achieved by Local Alignment and Global Dense Learning. PDF, CODE coming soon
Haoye Dong, Tiange Xiang, Sravan Chittupalli, Jun Liu, Dong Huang. Proceedings of Winter Conference on Applications of Computer Vision (WACV), 2024:1267--1276.
  1. Coordinate Transformer: Achieving Single-stage Multi-person Mesh Recovery from Videos. PDF, CODE 
Haoyuan Li*, Haoye Dong*, Hanchao Jia, Dong Huang, Michael C. Kampffmeyer, Liang Lin, Xiaodan Liang. Proceedings of International Conference on Computer Vision (ICCV), 2023:7744--8753.
  1. GP-VTON: Towards General Purpose Virtual Try-on via Collaborative Local-Flow Global-Parsing Learning. PDF, CODE
Zhenyu Xie, Zaiyu Huang, Xin Dong, Fuwei Zhao, Haoye Dong, Xijin Zhang, Feida Zhu, Xiaodan Liang. Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), 2023.
  1. Human MotionFormer: Transferring Human Motions with Vision Transformers. PDF, CODE
Hongyu Liu, Xintong Han, Chenbin Jin, Lihui Qian, Huawei Wei, Zhe Lin, Faqiang Wang, Haoye Dong, Yibing Song, Jia Xu and Qifeng Chen. Proceedings of International Conference on Learning Representations (ICLR), 2023.
  1. XFormer: Fast and Accurate Monocular 3D Body Capture. PDF
Lihui Qian, Xintong Han, Faqiang Wang, Hongyu Liu, Haoye Dong, Zhiwen Li, Huawei Wei, Zhe Lin and Chengbin Jin. Proceedings of International Joint Conference on Artificial Intelligence (IJCAI), 2023.
  1. Towards Scalable Unpaired Virtual Try-On via Patch-Routed Spatially-Adaptive GAN. PDF, CODE
Zhenyu Xie, Zaiyu Huang, Fuwei Zhao, Haoye Dong, Michael Kampffmeyer, Xiaodan Liang. Proceedings of Annual Conference on Neural Information Processing Systems (NeurIPS), 2021.
  1. M3D-VTON: A Monocular-to-3D Virtual Try-On Network. PDF, CODE
Fuwei Zhao, Zhenyu Xie, Michael Kampffmeyer, Haoye Dong, Songfang Han, Tianxiang Zheng, Tao Zhang, Xiaodan Liang. Proceedings of International Conference on Computer Vision (ICCV), 2021:13239--13249.
  1. Image Comes Dancing with Collaborative Parsing-Flow Video Synthesis. PDF  
Bowen Wu, Zhenyu Xie, Xiaodan Liang, Yubei Xiao, Haoye Dong, Liang Lin. IEEE Transactions on Image Processing (TIP). 30: 9259--9269 (2021).
  1. WAS-VTON: Warping Architecture Search for Virtual Try-on Network. PDF, CODE
Zhenyu Xie, Xujie Zhang, Fuwei Zhao, Haoye Dong, Michael C. Kampffmeyer, Haonan Yan, Xiaodan Liang. ACM Multimedia 2021 (ACM MM):3350--3359
  1. Fashion Editing with Adversarial Parsing Learning. PDF, CODE & DATASETS
Haoye Dong, Xiaodan Liang, Xiaohui Shen, Zhenyu Xie, Jian Yin, et al. Proceedings of Conference on Computer Vision and Pattern Recognition (CVPR), 2020: 8120--8128.
  1. Towards Multi-pose Guided Virtual Try-on Network. PDF, CODE & DATASET
Haoye Dong, Xiaodan Liang, Xiaohui Shen, Bochao Wang, Hanjiang Lai, Jia Zhu, Zhiting Hu, Jian Yin. Proceedings of International Conference on Computer Vision (ICCV), 2019:9026--9035. 
  1. FW-GAN: Flow-navigated Warping GAN for Video Virtual Try-on. PDF, CODE & DATASET
Haoye Dong, Xiaodan Liang, Xiaohui Shen, Bowen Wu, Bing-Cheng Chen, Jian Yin. Proceedings of International Conference on Computer Vision (ICCV), 2019:1161--1170.
  1. Part-Preserving Pose Manipulation for Person Image Synthesis. PDF
Haoye Dong, Xiaodan Liang, Chenxing Zhou, Hanjiang Lai, Jia Zhu, Jian Yin. Proceedings of IEEE International Conference on Multimedia & Expo (ICME), 2019: 1234--1239.
  1. Deep Generative Models with Learnable Knowledge Constraints. PDF
Zhiting Hu, Zichao Yang, Ruslan Salakhutdinov, Xiaodan Liang, Lianhui Qin, Haoye Dong, Eric Xing. Proceedings of Annual Conference on Neural Information Processing Systems (NeurIPS), 2018: 10501--10512.
  1. Soft-Gated Warping-GAN for Pose-Guided Person Image Synthesis. PDF
Haoye Dong, Xiaodan Liang, Ke Gong, Hanjiang Lai, Jia Zhu, Jian Yin. Proceedings of Annual Conference on Neural Information Processing Systems (NeurIPS), 2018: 472--482.

Mentoring

I am very fortunate to have met you all. More motivated collaborators are always welcome.

Academic Services


donghaoye12 at gmail.com | wechat: humanmodeling

© 2024 Haoye Dong