Think-at-Hard: Selective Latent Iterations to Improve Reasoning Language Models Paper • 2511.08577 • Published Nov 11 • 105
Distilled Decoding 2: One-step Sampling of Image Auto-regressive Models with Conditional Score Distillation Paper • 2510.21003 • Published Oct 23 • 7
FrameFusion: Combining Similarity and Importance for Video Token Reduction on Large Visual Language Models Paper • 2501.01986 • Published Dec 30, 2024 • 1
Linear Combination of Saved Checkpoints Makes Consistency and Diffusion Models Better Paper • 2404.02241 • Published Apr 2, 2024 • 2
FilMaster: Bridging Cinematic Principles and Generative AI for Automated Film Generation Paper • 2506.18899 • Published Jun 23 • 6
MBQ: Modality-Balanced Quantization for Large Vision-Language Models Paper • 2412.19509 • Published Dec 27, 2024
Latent Zoning Network: A Unified Principle for Generative Modeling, Representation Learning, and Classification Paper • 2509.15591 • Published Sep 19 • 45
Struct-Bench: A Benchmark for Differentially Private Structured Text Generation Paper • 2509.10696 • Published Sep 12
Jigsaw-R1: A Study of Rule-based Visual Reinforcement Learning with Jigsaw Puzzles Paper • 2505.23590 • Published May 29 • 25
R2R: Efficiently Navigating Divergent Reasoning Paths with Small-Large Model Token Routing Paper • 2505.21600 • Published May 27 • 71
MMDT: Decoding the Trustworthiness and Safety of Multimodal Foundation Models Paper • 2503.14827 • Published Mar 19