Ziqi Pang (庞子奇)

I am a fourth-year CS Ph.D. student focusing on computer vision and machine learning at the University of Illinois Urbana-Champaign (UIUC), where my advisor is Prof. Yu-Xiong Wang. Before that, I graduated from Peking University (PKU) with a Bachelor's degree in Computer Science.

I interned at Toyota Research Institute (TRI) with Dr. Pavel Tokmakov during my Ph.D. studies. Prior to joining UIUC, I interned at Carnegie Mellon University (CMU) with Prof. Martial Hebert, conducted research at Peking University (PKU) with Prof. Shiliang Zhang, and spent an exciting year at TuSimple pushing the boundaries of autonomous driving under the guidance of Dr. Naiyan Wang.

Email  /  CV  /  Google Scholar  /  Twitter  /  GitHub

Research

My research focuses on perception in spatial (3D) and temporal (video) contexts, which is the cornerstone for embodied agents and digital assistants. Although I am a so-called "perception" person, my ambition is to unify generative modeling (LLMs, diffusion models, NeRF, etc.) with perception tasks. My goal is to use the generative capabilities of models to unlock better scaling, better interactivity with humans, and the self-improvement and self-exploration of perception models.

InstructG2I: Synthesizing Images from Multimodal Attributed Graphs
Bowen Jin, Ziqi Pang, Bingjun Guo, Yu-Xiong Wang, Jiaxuan You, Jiawei Han
NeurIPS, 2024
Project Page / Code / arXiv

Text-to-image diffusion models can digest additional "graph" conditions about the relationships of entities, supporting more nuanced generation for recommendation systems, virtual arts, etc.

RMem: Restricted Memory Banks Improve Video Object Segmentation
Junbao Zhou*, Ziqi Pang*, Yu-Xiong Wang
CVPR, 2024 (Winner at ECCV 2024 VOTS Challenge)
Project Page / Code / arXiv

Simply bounding the size of memory banks improves VOS on challenging state changes and long videos, indicating the importance of selecting relevant information from long contexts.

Frozen Transformers from Language Models are Effective Visual Encoder Layers
Ziqi Pang, Ziyang Xie*, Yunze Man*, Yu-Xiong Wang
ICLR, 2024 (Spotlight)
Code / arXiv

Frozen transformers from language models, though trained solely on textual data, can effectively improve diverse visual tasks by directly encoding visual tokens.

MV-Map: Offboard HD-Map Generation with Multi-view Consistency
Ziyang Xie*, Ziqi Pang*, Yu-Xiong Wang
ICCV, 2023  
Code / arXiv / Demo

MV-Map is the first offboard auto-labeling pipeline for HD-Maps, whose crux is to fuse BEV perception results guided by geometric cues from NeRFs.

Streaming Motion Forecasting for Autonomous Driving
Ziqi Pang, Deva Ramanan, Mengtian Li, Yu-Xiong Wang
IROS, 2023  
Code / arXiv / Demo

"Streaming forecasting" narrows the gap between conventional "snapshot-based" motion forecasting and streaming real-world traffic.

Standing Between Past and Future: Spatio-Temporal Modeling for Multi-Camera 3D Multi-Object Tracking (Alias: PF-Track)
Ziqi Pang, Jie Li, Pavel Tokmakov, Dian Chen, Sergey Zagoruyko, Yu-Xiong Wang
CVPR, 2023  
Code / arXiv / Demo

PF-Track is a vision-centric, end-to-end 3D MOT framework for autonomous driving that dramatically decreases ID switches.

Embracing Single Stride 3D Object Detector with Sparse Transformer (Alias: SST)
Lue Fan, Ziqi Pang, Tianyuan Zhang, Yu-Xiong Wang, Hang Zhao, Feng Wang, Naiyan Wang, Zhaoxiang Zhang
CVPR, 2022  
Code / arXiv

SST emphasizes the small object sizes and sparsity of point clouds. Its sparse transformers inspire new backbones for outdoor LiDAR-based detection.

SimpleTrack: Understanding and Rethinking 3D Multi-object Tracking
Ziqi Pang, Zhichao Li, Naiyan Wang
ECCV Workshop, 2022  
Code / arXiv / Patent

SimpleTrack is a simple-yet-effective 3D MOT system with more than 200 stars on GitHub.

Model-free Vehicle Tracking and State Estimation in Point Cloud Sequences (Alias: LiDAR-SOT)
Ziqi Pang, Zhichao Li, Naiyan Wang
IROS, 2021  
Code / arXiv / Demo

LiDAR-SOT is a LiDAR-based state estimation algorithm for both onboard use in redundancy systems and offboard use in auto-labeling.


Huge thanks to Jon Barron for providing the template for the page.