VTCBench: Can Vision-Language Models Understand Long Context with Vision-Text Compression? Paper • 2512.15649 • Published 16 days ago • 6
Local-Prompt: Extensible Local Prompts for Few-Shot Out-of-Distribution Detection Paper • 2409.04796 • Published Sep 7, 2024
DESIRE: Dynamic Knowledge Consolidation for Rehearsal-Free Continual Learning Paper • 2411.19154 • Published Nov 28, 2024
EventVAD: Training-Free Event-Aware Video Anomaly Detection Paper • 2504.13092 • Published Apr 17, 2025
MambaIC: State Space Models for High-Performance Learned Image Compression Paper • 2503.12461 • Published Mar 16, 2025
PPT: Token Pruning and Pooling for Efficient Vision Transformers Paper • 2310.01812 • Published Oct 3, 2023
ModalPrompt: Towards Efficient Multimodal Continual Instruction Tuning with Dual-Modality Guided Prompt Paper • 2410.05849 • Published Oct 8, 2024
Towards Efficient and General-Purpose Few-Shot Misclassification Detection for Vision-Language Models Paper • 2503.20492 • Published Mar 26, 2025
Parameter Efficient Merging for Multimodal Large Language Models with Complementary Parameter Adaptation Paper • 2502.17159 • Published Feb 24, 2025 • 2
HiDe-LLaVA: Hierarchical Decoupling for Continual Instruction Tuning of Multimodal Large Language Model Paper • 2503.12941 • Published Mar 17, 2025
Token Reduction Should Go Beyond Efficiency in Generative Models -- From Vision, Language to Multimodality Paper • 2505.18227 • Published May 23, 2025 • 15
ChartEdit: How Far Are MLLMs From Automating Chart Analysis? Evaluating MLLMs' Capability via Chart Editing Paper • 2505.11935 • Published May 17, 2025