Don't Look Twice: Faster Video Transformers with Run-Length Tokenization Paper • 2411.05222 • Published Nov 7, 2024 • 3
One Layer Is Enough: Adapting Pretrained Visual Encoders for Image Generation Paper • 2512.07829 • Published Dec 8, 2025 • 23
Supervised Learning as Lossy Compression: Characterizing Generalization and Sample Complexity via Finite Blocklength Analysis Paper • 2602.04107 • Published 14 days ago • 1
Full-Cycle Energy Consumption Benchmark for Low-Carbon Computer Vision Paper • 2108.13465 • Published Aug 30, 2021 • 1
CoPE-VideoLM: Codec Primitives For Efficient Video Language Models Paper • 2602.13191 • Published 4 days ago • 26
BPDQ: Bit-Plane Decomposition Quantization on a Variable Grid for Large Language Models Paper • 2602.04163 • Published 14 days ago • 7
OneVision-Encoder: Codec-Aligned Sparsity as a Foundational Principle for Multimodal Intelligence Paper • 2602.08683 • Published 8 days ago • 42
MirrorLA: Reflecting Feature Map for Vision Linear Attention Paper • 2602.04346 • Published 14 days ago • 1
Dual-Representation Image Compression at Ultra-Low Bitrates via Explicit Semantics and Implicit Textures Paper • 2602.05213 • Published 13 days ago • 1
Vision Transformer Finetuning Benefits from Non-Smooth Components Paper • 2602.06883 • Published 11 days ago • 4
AudioSAE: Towards Understanding of Audio-Processing Models with Sparse AutoEncoders Paper • 2602.05027 • Published 13 days ago • 59
Unified ROI-based Image Compression Paradigm with Generalized Gaussian Model Paper • 2602.01325 • Published 16 days ago • 1
Generative Preprocessing for Image Compression with Pre-trained Diffusion Models Paper • 2512.15270 • Published Dec 17, 2025 • 1
L-STEC: Learned Video Compression with Long-term Spatio-Temporal Enhanced Context Paper • 2512.12790 • Published Dec 14, 2025 • 1