|
News
2025-02: A paper on KV Cache compression and video understanding with LLMs accepted by CVPR, 2025.
2025-02: A paper on vision token pruning for MLLMs accepted by CVPR, 2025.
2024-12: Started an internship at ByteDance.
2024-09: A paper on referring image & video segmentation accepted by TPAMI, 2024.
2024-02: Started an internship at Tencent ARC Lab.
|
|
Recent Publications
* Indicates Equal Contribution
|
|
VoCo-LLaMA: Towards Vision Compression with Large Language Models
Xubing Ye,
Yukang Gan,
Xiaoke Huang,
Yixiao Ge,
Yansong Tang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025
[arXiv]
[PDF]
[Project Page]
[Code]
[AK]
[Chinese Explainer]
Proposed VoCo-LLaMA, an attention-distilled video token compression method that lets video-LLMs train on and run inference over million-token (1+ hour) videos within a 4k-context LLM.
|
|
ATP-LLaVA: Adaptive Token Pruning for Large Vision Language Models
Xubing Ye,
Yukang Gan,
Yixiao Ge,
Xiao-ping Zhang,
Yansong Tang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2025
[arXiv]
[PDF]
[Project Page]
Proposed ATP-LLaVA, an efficient MLLM that performs adaptive instance-wise and decoder-layer-wise token pruning with almost no performance degradation.
|
|
Language-Aware Vision Transformer for Referring Segmentation
Xubing Ye*,
Zhao Yang*,
Jiaqi Wang*,
Yansong Tang,
Kai Chen,
Hengshuang Zhao,
Philip H.S. Torr
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI, IF=20.8), 2024
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
[IEEE]
[PDF]
[Code]
[Conference Version]
Proposed LAVT, a Transformer-based universal framework for referring image segmentation (RIS) and referring video object segmentation (RVOS) that performs language-aware visual encoding instead of cross-modal fusion after feature extraction.
|
|
Selected Honors and Awards
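Nanhu Elite Scholarship of Tsinghua University, 2025. (Tsinghua University Comprehensive Excellence Scholarship, university-level first class)
Zhaoyi Scholarship of Tsinghua University, 2024. (Tsinghua University Comprehensive Excellence Scholarship, university-level first class)
First Prize Scholarship of Tongji University, 2023. (Tongji University Comprehensive Excellence Scholarship, university-level first class)
Second Prize Scholarship of Tongji University, 2021, 2022. (Tongji University Comprehensive Excellence Scholarship, university-level second class)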
|
|
ByteDance Seed Application, Beijing, China. December 2024 - April 2025.
Project: AI Search with MLLMs.
Worked with Dr. Baihan Shu.
|
|
Tencent ARC Lab (PCG), Shenzhen, China. February 2024 - December 2024.
Project: Token Pruning & Compression for MLLMs and Video MLLMs.
Worked with Dr. Yukang Gan, Dr. Yixiao Ge, and Dr. Ying Shan.
|
|
Academic Services
Reviewer: CVPR 2025 (conference); JVCIR 2024, 2025 (journal)
|
© Xubing Ye | Last updated: Mar. 17, 2024 | Website Template
|