Shengmei Shen
janeshen
AI & ML interests
Computer vision, ML/DL, AIGC
Organizations
None yet
CLIP
Video
- ShareGPT4Video: Improving Video Understanding and Generation with Better Captions
  Paper • 2406.04325 • Published • 74
- SF-V: Single Forward Video Generation Model
  Paper • 2406.04324 • Published • 24
- VideoTetris: Towards Compositional Text-to-Video Generation
  Paper • 2406.04277 • Published • 25
- Vript: A Video Is Worth Thousands of Words
  Paper • 2406.06040 • Published • 28
Text To Image
- Improving Text-to-Image Consistency via Automatic Prompt Optimization
  Paper • 2403.17804 • Published • 20
- Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation
  Paper • 2403.16990 • Published • 25
- Getting it Right: Improving Spatial Consistency in Text-to-Image Models
  Paper • 2404.01197 • Published • 31
- Condition-Aware Neural Network for Controlled Image Generation
  Paper • 2404.01143 • Published • 13
Transformer
Video Generation