Yansong Shi

nanamma

https://huggingface.co/nanamma

AI & ML interests

multimodality, video understanding, robotics

Recent Activity

upvoted a paper 3 days ago

Video-o3: Native Interleaved Clue Seeking for Long Video Multi-Hop Reasoning

new activity 6 days ago

nanamma/RIVER: Add task categories and link to paper

updated a dataset 6 days ago

OpenGVLab/RIVER

authored a paper 7 days ago

RIVER: A Real-Time Interaction Benchmark for Video LLMs

Paper • 2603.03985 • Published 8 days ago • 5

submitted a paper to Daily Papers 7 days ago

RIVER: A Real-Time Interaction Benchmark for Video LLMs

Paper • 2603.03985 • Published 8 days ago • 5

authored 2 papers 6 months ago

InternVideo2: Scaling Video Foundation Models for Multimodal Video Understanding

Paper • 2403.15377 • Published Mar 22, 2024 • 29

TimeSuite: Improving MLLMs for Long Video Understanding via Grounded Tuning

Paper • 2410.19702 • Published Oct 25, 2024 • 1