Ye Liu

yeliudev

https://yeliu.dev/

AI & ML interests

Vision & Language

Recent Activity

updated a model 2 days ago

yeliudev/VideoMind-2B-FT-QVHighlights

updated a dataset 2 days ago

yeliudev/VideoMind-Dataset

updated a model 2 days ago

yeliudev/VideoMind-7B

yeliudev's collections (4)

VideoMind

[ICLR 2026] VideoMind: A Chain-of-LoRA Agent for Temporal-Grounded Video Reasoning

Running on Zero

36

VideoMind 2B

💡

A Chain-of-LoRA Agent for Temporal-Grounded Video Reasoning
yeliudev/VideoMind-2B

Video-Text-to-Text • Updated 2 days ago • 19 • 2
yeliudev/VideoMind-7B

Video-Text-to-Text • Updated 2 days ago • 15 • 4
yeliudev/VideoMind-Dataset

Preview • Updated 2 days ago • 1.69k • 11

E.T. Bench

[NeurIPS 2024] E.T. Bench: Towards Open-Ended Event-Level Video-Language Understanding

PolyU-ChenLab/ETBench

Viewer • Updated Oct 29, 2024 • 5 • 191 • 4
PolyU-ChenLab/ET-Instruct-164K

Viewer • Updated Sep 27, 2024 • 115k • 369 • 4
PolyU-ChenLab/ETChat-Phi3-Mini-Stage-1

Video-Text-to-Text • 5B • Updated Oct 29, 2024 • 3 • 2
PolyU-ChenLab/ETChat-Phi3-Mini-Stage-2

5B • Updated Sep 27, 2024 • 170

UniPixel

[NeurIPS 2025] UniPixel: Unified Object Referring and Segmentation for Pixel-Level Visual Reasoning

Running on Zero

6

UniPixel

🔮

An MLLM for Unified Object Referring and Segmentation
PolyU-ChenLab/UniPixel-3B

Video-Text-to-Text • 4B • Updated Oct 4, 2025 • 152 • 3
PolyU-ChenLab/UniPixel-7B

Video-Text-to-Text • 8B • Updated Oct 22, 2025 • 342 • 1
PolyU-ChenLab/UniPixel-SFT-1M

Preview • Updated Oct 4, 2025 • 988 • 2

R2-Tuning

[ECCV 2024] R2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding

Running

6

R2-Tuning

🌀

[ECCV 2024] Localizing moments in videos via text queries
yeliudev/R2-Tuning

Updated Apr 17, 2024 • 2
R^2-Tuning: Efficient Image-to-Video Transfer Learning for Video Temporal Grounding

Paper • 2404.00801 • Published Mar 31, 2024
