Ziqi Pang (庞子奇)

I am a third-year CS Ph.D. student focusing on computer vision and machine learning at the University of Illinois Urbana-Champaign (UIUC), where my advisor is Prof. Yu-Xiong Wang. Before that, I graduated from Peking University (PKU) with a Bachelor's degree in Computer Science.

I interned at Toyota Research Institute (TRI) with Dr. Pavel Tokmakov during my Ph.D. studies. Prior to joining UIUC, I interned at Carnegie Mellon University (CMU) with Prof. Martial Hebert, conducted research at Peking University (PKU) with Prof. Shiliang Zhang, and spent an exciting year at TuSimple pushing the boundaries of autonomous driving under the guidance of Dr. Naiyan Wang.

Email  /  CV  /  Google Scholar  /  Twitter  /  Github  / 

profile photo

My research concentrates on perception in 3D and in context, which is a cornerstone for embodied agents. At the core of my research is understanding the flow of information across datasets, models, and inference-time contexts. I therefore follow the approach of Paul Erdős and investigate a broad range of tasks, topics, models, and techniques:

  • Representation learning driven by foundation models (e.g., LLMs).
  • Propagation of temporal information in 3D tracking for autonomous driving.
  • Selection and fusion of long contexts in AR/VR-oriented video understanding and autonomous driving.
  • Sparsifying and condensing representations for 3D perception.

Restricted Memory Banks Improve Video Object Segmentation: A Revisit (Alias: RMem)
Junbao Zhou*, Ziqi Pang*, Yu-Xiong Wang
CVPR, 2024
Project Page / Code / arXiv

Simply bounding the size of memory banks improves VOS on challenging state changes and long videos, indicating the importance of selecting relevant information from long contexts.

Frozen Transformers from Language Models are Effective Visual Encoder Layers
Ziqi Pang, Ziyang Xie*, Yunze Man*, Yu-Xiong Wang
ICLR, 2024 (Spotlight)
Code / arXiv

Frozen transformers from language models, though trained solely on textual data, can effectively improve diverse visual tasks by directly encoding visual tokens.

MV-Map: Offboard HD-Map Generation with Multi-view Consistency (Alias: MV-Map)
Ziyang Xie*, Ziqi Pang*, Yu-Xiong Wang
ICCV, 2023  
Code / arXiv / Demo

MV-Map is the first offboard auto-labeling pipeline for HD-Maps; at its core, it fuses BEV perception results guided by geometric cues from NeRFs.

Streaming Motion Forecasting for Autonomous Driving
Ziqi Pang, Deva Ramanan, Mengtian Li, Yu-Xiong Wang
IROS, 2023  
Code / arXiv / Demo

"Streaming forecasting" mitigates the gap between "snapshot-based" conventional motion forecasting and the streaming real-world traffic.

Standing Between Past and Future: Spatio-Temporal Modeling for Multi-Camera 3D Multi-Object Tracking (Alias: PF-Track)
Ziqi Pang, Jie Li, Pavel Tokmakov, Dian Chen, Sergey Zagoruyko, Yu-Xiong Wang
CVPR, 2023  
Code / arXiv / Demo

PF-Track is a vision-centric, end-to-end 3D MOT framework for autonomous driving that dramatically decreases ID-switches.

Embracing Single Stride 3D Object Detector with Sparse Transformer (Alias: SST)
Lue Fan, Ziqi Pang, Tianyuan Zhang, Yu-Xiong Wang, Hang Zhao, Feng Wang, Naiyan Wang, Zhaoxiang Zhang
CVPR, 2022  
Code / arXiv

SST addresses the small object sizes and the sparsity of point clouds. Its sparse transformers inspire new backbones for outdoor LiDAR-based detection.

SimpleTrack: Understanding and Rethinking 3D Multi-object Tracking (Alias: SimpleTrack)
Ziqi Pang, Zhichao Li, Naiyan Wang
ECCV Workshop, 2022  
Code / arXiv / Patent

SimpleTrack is a simple-yet-effective 3D MOT system with more than 200 stars on GitHub.

Model-free Vehicle Tracking and State Estimation in Point Cloud Sequences (Alias: LiDAR-SOT)
Ziqi Pang, Zhichao Li, Naiyan Wang
IROS, 2021  
Code / arXiv / Demo

LiDAR-SOT is a LiDAR-based state estimation algorithm for both onboard use in redundancy systems and offboard use in auto-labeling.

Huge thanks to Jon Barron for providing the template for this page.