Hello! I am currently a PhD student at Zhejiang University (浙江大学), majoring in Computer Science and Technology and supervised by Prof. Jianke Zhu. I received my B.Eng. (with Honors) from Wuhan University (武汉大学), majoring in Computer Science and Technology and supervised by Prof. Zheng Wang. I was previously a summer research intern at McGill University and Mila-Quebec AI Institute in Montreal, Canada, under the supervision of Prof. Xujie Si. Prior to that, I was a visiting student at KAIST in Daejeon, Korea, supervised by Prof. Chang D. Yoo.

My research interests include 2D/3D Multimodal LLMs, Visual/Scene Understanding, Spatial Intelligence, and Embodied AI, with a particular focus on:

1. Enabling MLLMs to perform common visual tasks, including open-vocabulary visual grounding for images, videos, and 3D scenes.

2. Embodied scene understanding/reasoning, including streaming 3D interaction and embodied dialogue/planning.

3. Efficient and effective MLLMs, including visual token compression and lightweight MLLMs.

If you are interested in any form of academic collaboration with me, please feel free to email me at hanxun.yu@zju.edu.cn.

🔥 News

  • 2026.01:  🎉🎉 One paper was accepted to ICLR 2026.
  • 2025.02:  🎉🎉 One paper was accepted to CVPR 2025 as a Highlight. (2.9%, 387/13008)
  • 2024.07:  🎉🎉 One paper was accepted to IEEE TPAMI 2024.
  • 2023.07:  🎉🎉 One paper was accepted to ACM MM 2023.
  • 2023.06:  🎉🎉 I won the National Scholarship at Wuhan University. (Top 2%)
  • 2022.06:  🎉🎉 I was accepted to the Mitacs Globalink Research Internship 2022 program. (200 per year nationwide)

📝 Selected Publications

* indicates equal contribution


ICLR 2026

VisionTrim: Unified Vision Token Compression for Training-Free MLLM Acceleration

Hanxun Yu*, Wentong Li*, Xuan Qu*, Song Wang, Junbo Chen, Jianke Zhu

ICLR 2026

[Paper] [Code]

  • An efficient vision token compression framework with two modules, Dominant Vision Token Selection (DVTS) and Text-Guided Vision Complement (TGVC).
CVPR 2025 (Highlight)

Inst3D-LMM: Instance-Aware 3D Scene Understanding with Multi-modal Instruction Tuning

Hanxun Yu*, Wentong Li*, Song Wang, Junbo Chen, Jianke Zhu

CVPR 2025 (Highlight, Top 2.9%)

[Paper] [Code]

  • A unified and effective instance-aware 3D Large Multi-modal Model for multi-task 3D scene understanding through coupled 2D-3D modality encoding.
ACM MM 2023

Moiré Backdoor Attack (MBA): A Novel Trigger for Pedestrian Detectors in the Physical World

Hui Wei*, Hanxun Yu*, Kewei Zhang, Zhixiang Wang, Jianke Zhu, Zheng Wang

ACM MM 2023

[Paper] [Code]

  • This paper targets safety-critical AI tasks and proposes moiré-based triggers for backdoor attacks on pedestrian detection models.

📚 Other Publications

arXiv

N3D-VLM: Native 3D Grounding Enables Accurate Spatial Reasoning in Vision-Language Models

Yuxin Wang, Lei Ke, Boqiang Zhang, Tianyuan Qu, Hanxun Yu, Zhenpeng Huang, Meng Yu, Dan Xu, Dong Yu

arXiv 2025

[Project] [Paper] [Code]

  • A unified framework that equips Vision-Language Models with native 3D grounding to enable accurate spatial reasoning.
arXiv

StreamingAssistant: Efficient Visual Token Pruning for Accelerating Online Video Understanding

Xinqi Jin*, Hanxun Yu*, Bohan Yu, Kebin Liu, Jian Liu, Keda Tao, Yixuan Pei, Huan Wang, Fan Dang, Jiangchuan Liu, Weiqiang Wang

arXiv 2025

[Paper] [Code]

  • A token pruning method designed to reduce both spatial and temporal redundancy in online video understanding.
TPAMI 2024

Physical Adversarial Attack meets Computer Vision: A Decade Survey

Hui Wei, Hao Tang, Xuemei Jia, Zhixiang Wang, Hanxun Yu, Zhubo Li, Shin’ichi Satoh, Luc Van Gool, Zheng Wang

IEEE TPAMI 2024

[Paper] [Code]

  • This survey summarizes existing physical adversarial attack methods and provides insights toward the development of trustworthy AI systems.

🎖 Honors and Awards

  • 2024  The Chiang Chen Scholarship, China.
  • 2024, 2025  The First Prize of Excellent Graduate Scholarship, Zhejiang University.
  • 2023  The National Scholarship, China. (Top 2%)
  • 2023  Outstanding Undergraduate Dissertation Award, Wuhan University.
  • 2023  Outstanding Graduate, Wuhan University.
  • 2022  Mitacs-CSC Globalink Research Internship Scholarship, China. (200 per year nationwide)
  • 2020, 2021, 2022  The First Prize of Excellent Undergraduate Scholarship, Wuhan University.

📖 Education

  • 2023.09 - now, Ph.D., Zhejiang University.
  • 2019.09 - 2023.06, B.Eng. (with Honors), Wuhan University.

💻 Internships

💬 Academic Services

  • Journal Reviewer: IEEE TPAMI
  • Conference Reviewer: ECCV 2026, ICLR 2026, ICML 2025-2026, CVPR 2025-2026, ACM MM 2023-2024