Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Shengmei Shen's picture

Shengmei Shen

janeshen

https://www.janeshensm.com/

AI & ML interests

Computer vision, ML/DL. AIGC

Organizations

None yet

janeshen 's collections 5

Improving Text-to-Image Consistency via Automatic Prompt Optimization

Paper • 2403.17804 • Published Mar 26, 2024 • 20
Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation

Paper • 2403.16990 • Published Mar 25, 2024 • 25
Getting it Right: Improving Spatial Consistency in Text-to-Image Models

Paper • 2404.01197 • Published Apr 1, 2024 • 31
Condition-Aware Neural Network for Controlled Image Generation

Paper • 2404.01143 • Published Apr 1, 2024 • 13

Scaling (Down) CLIP: A Comprehensive Analysis of Data, Architecture, and Training Strategies

Paper • 2404.08197 • Published Apr 12, 2024 • 29

ShareGPT4Video: Improving Video Understanding and Generation with Better Captions

Paper • 2406.04325 • Published Jun 6, 2024 • 74
SF-V: Single Forward Video Generation Model

Paper • 2406.04324 • Published Jun 6, 2024 • 24
VideoTetris: Towards Compositional Text-to-Video Generation

Paper • 2406.04277 • Published Jun 6, 2024 • 25
Vript: A Video Is Worth Thousands of Words

Paper • 2406.06040 • Published Jun 10, 2024 • 28

ViTAR: Vision Transformer with Any Resolution

Paper • 2403.18361 • Published Mar 27, 2024 • 55
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation

Paper • 2406.06525 • Published Jun 10, 2024 • 71

Video Generation

ShareGPT4Video: Improving Video Understanding and Generation with Better Captions

Paper • 2406.04325 • Published Jun 6, 2024 • 74

Improving Text-to-Image Consistency via Automatic Prompt Optimization

Paper • 2403.17804 • Published Mar 26, 2024 • 20
Be Yourself: Bounded Attention for Multi-Subject Text-to-Image Generation

Paper • 2403.16990 • Published Mar 25, 2024 • 25
Getting it Right: Improving Spatial Consistency in Text-to-Image Models

Paper • 2404.01197 • Published Apr 1, 2024 • 31
Condition-Aware Neural Network for Controlled Image Generation

Paper • 2404.01143 • Published Apr 1, 2024 • 13

ViTAR: Vision Transformer with Any Resolution

Paper • 2403.18361 • Published Mar 27, 2024 • 55
Autoregressive Model Beats Diffusion: Llama for Scalable Image Generation

Paper • 2406.06525 • Published Jun 10, 2024 • 71

Scaling (Down) CLIP: A Comprehensive Analysis of Data, Architecture, and Training Strategies

Paper • 2404.08197 • Published Apr 12, 2024 • 29

Video Generation

ShareGPT4Video: Improving Video Understanding and Generation with Better Captions

Paper • 2406.04325 • Published Jun 6, 2024 • 74

ShareGPT4Video: Improving Video Understanding and Generation with Better Captions

Paper • 2406.04325 • Published Jun 6, 2024 • 74
SF-V: Single Forward Video Generation Model

Paper • 2406.04324 • Published Jun 6, 2024 • 24
VideoTetris: Towards Compositional Text-to-Video Generation

Paper • 2406.04277 • Published Jun 6, 2024 • 25
Vript: A Video Is Worth Thousands of Words

Paper • 2406.06040 • Published Jun 10, 2024 • 28

Company

TOS Privacy About Careers

Website

Models Datasets Spaces Pricing Docs