S.F.

search-facility · ipv6

AI & ML interests

None yet

Recent Activity

Organizations

None yet

upvoted 5 papers 3 days ago

Representation Alignment for Just Image Transformers is not Easier than You Think

Paper • 2603.14366 • Published 16 days ago • 9

Less Gaussians, Texture More: 4K Feed-Forward Textured Splatting

Paper • 2603.25745 • Published 4 days ago • 10

AVControl: Efficient Framework for Training Audio-Visual Controls

Paper • 2603.24793 • Published 5 days ago • 21

Voxtral TTS

Paper • 2603.25551 • Published 4 days ago • 49

PixelSmile: Toward Fine-Grained Facial Expression Editing

Paper • 2603.25728 • Published 4 days ago • 114

upvoted a paper 6 days ago

Repurposing Geometric Foundation Models for Multi-view Diffusion

Paper • 2603.22275 • Published 7 days ago • 45

upvoted a paper 7 days ago

FlowScene: Style-Consistent Indoor Scene Generation with Multimodal Graph Rectified Flow

Paper • 2603.19598 • Published 11 days ago • 32

upvoted 2 papers 11 days ago

LoST: Level of Semantics Tokenization for 3D Shapes

Paper • 2603.17995 • Published 12 days ago • 31

Complementary Reinforcement Learning

Paper • 2603.17621 • Published 13 days ago • 36

upvoted a paper 12 days ago

InCoder-32B: Code Foundation Model for Industrial Scenarios

Paper • 2603.16790 • Published 13 days ago • 304

upvoted 3 papers 13 days ago

Mixture-of-Depths Attention

Paper • 2603.15619 • Published 14 days ago • 79

Attention Residuals

Paper • 2603.15031 • Published 15 days ago • 169

Grounding World Simulation Models in a Real-World Metropolis

Paper • 2603.15583 • Published 14 days ago • 149

upvoted a paper 14 days ago

OmniForcing: Unleashing Real-time Joint Audio-Visual Generation

Paper • 2603.11647 • Published 19 days ago • 31

upvoted 3 papers 19 days ago

Fish Audio S2 Technical Report

Paper • 2603.08823 • Published 21 days ago • 36

Reading, Not Thinking: Understanding and Bridging the Modality Gap When Text Becomes Pixels in Multimodal LLMs

Paper • 2603.09095 • Published 21 days ago • 28

Omni-Diffusion: Unified Multimodal Understanding and Generation with Masked Discrete Diffusion

Paper • 2603.06577 • Published 24 days ago • 48

upvoted 2 papers 20 days ago

Holi-Spatial: Evolving Video Streams into Holistic 3D Spatial Intelligence

Paper • 2603.07660 • Published 22 days ago • 84

Lost in Stories: Consistency Bugs in Long Story Generation by LLMs

Paper • 2603.05890 • Published 25 days ago • 91

upvoted a paper 21 days ago

WildActor: Unconstrained Identity-Preserving Video Generation

Paper • 2603.00586 • Published about 1 month ago • 37