Wan, Bo (万博)

Ph.D. candidate,
PSI, ESAT,
KU Leuven,
Belgium
E-mail: bowan.cvml AT gmail.com

About me

I am a research scientist at Meta FAIR. Previously, I completed my Ph.D. under the supervision of Prof. Tinne Tuytelaars at KU Leuven, Belgium. Before that, I received a B.S. degree from the Beijing University of Posts and Telecommunications in 2017 and an M.S. degree from ShanghaiTech University in 2020. I also obtained a double bachelor's degree in Economics from Peking University in 2017. My main research interests are in machine learning, especially:

Vision-Language Pretraining and Understanding
Diffusion-based Image and Video Generation

Internship

Student Researcher @ DeepMind, 07.2023-12.2023

Worked on large-scale location-aware vision-language pretraining.

Student Researcher @ Google Brain, 10.2022-02.2023

Explored an encoder-decoder transformer architecture for vision-language multitasking.

Research Intern @ Tencent AI Lab, 08.2020-10.2020

Concentrated on human-centric video understanding.

Algorithm Intern @ Microsoft, 03.2017-06.2017

Developed algorithms for the Chinese-language Bing Search question answering system.

Publications

B. Wan, M. Tschannen, Y. Xian, F. Pavetic, I. Alabdulmohsin, X. Wang, A.S. Pinto, A. Steiner, L. Beyer, X. Zhai, "LocCa: Visual Pretraining with Location-aware Captioners", NeurIPS, 2024. [pdf]
M. Li*, B. Wan*, S. Moens, T. Tuytelaars, "Animate Your Motion: Turning Still Images into Dynamic Videos", ECCV, 2024. [pdf]
H. Diao, B. Wan, X. Jia, Y. Zhuge, Y. Zhang, H. Lu, L. Chen, "SHERL: Synthesizing High Accuracy and Efficient Memory for Resource-Limited Transfer Learning", ECCV, 2024. [pdf]
H. Diao, B. Wan, Y. Zhang, X. Jia, H. Lu, L. Chen, "UniPT: Universal Parallel Tuning for Transfer Learning with Efficient Parameter and Memory Exploitation", CVPR, 2024. [pdf]
B. Wan, T. Tuytelaars, "Exploiting CLIP for Zero-shot HOI Detection Requires Knowledge Distillation at Multiple Levels", WACV, 2024. [pdf]
B. Wan*, Y. Liu*, D. Zhou, T. Tuytelaars, X. He, "Weakly-supervised HOI Detection via Prior-guided Bi-level Representation Learning", ICLR, 2023. [pdf]
L. Beyer*, B. Wan*, G. Madan*, F. Pavetic*, A. Steiner*, A. Kolesnikov, A.S. Pinto, E. Bugliarello, X. Wang, Q. Yu, L. Chen, X. Zhai*, "A Study of Autoregressive Decoders for Multi-Tasking in Computer Vision", arXiv, 2023. [pdf]
B. Wan, W. Han, Z. Zheng, and T. Tuytelaars, "Unsupervised Vision-Language Grammar Induction with Shared Structure Modeling", ICLR Oral, 2022. [pdf]
Y. Liu*, B. Wan*, L. Ma, and X. He, "Relation-aware Instance Refinement for Weakly Supervised Visual Grounding", CVPR, 2021. [pdf][code]
R. Li, S. Zhang, B. Wan, and X. He, "Bipartite graph network with adaptive message passing for unbiased scene graph generation", CVPR, 2021. [pdf][code]
Q. He, D. Zhou, B. Wan, and X. He, "Single Image 3D Object Estimation with Primitive Graph Networks", ACMMM, 2021. [pdf]
Y. Liu*, B. Wan*, L. Ma, and X. He, "Learning cross-modal context graph for visual grounding", AAAI, 2020. [pdf][code]
B. Wan*, D. Zhou*, Y. Liu, R. Li, and X. He, "Pose-aware multi-level feature network for human object interaction detection", ICCV Oral, 2019. [pdf][code]

Note: * indicates equal contribution. The full list of publications is available on Google Scholar.

Academic service

Reviewer for TPAMI, ICML, ICLR, NeurIPS, CVPR, ICCV, ECCV, WACV