zhangtao

zhangtao-whu

https://github.com/zhang-tao-whu

AI & ML interests

segmentation

Recent Activity

upvoted a paper 10 days ago

SAMTok: Representing Any Mask with Two Words

liked a model 18 days ago

stepfun-ai/Step3-VL-10B

upvoted a paper 3 months ago

Open-o3 Video: Grounded Video Reasoning with Explicit Spatio-Temporal Evidence

authored 6 papers about 1 year ago

Point Cloud Mamba: Point Cloud Learning via State Space Model

Paper • 2403.00762 • Published Mar 1, 2024

DVIS++: Improved Decoupled Framework for Universal Video Segmentation

Paper • 2312.13305 • Published Dec 20, 2023

Are They the Same? Exploring Visual Correspondence Shortcomings of Multimodal LLMs

Paper • 2501.04670 • Published Jan 8, 2025

OMG-LLaVA: Bridging Image-level, Object-level, Pixel-level Reasoning and Understanding

Paper • 2406.19389 • Published Jun 27, 2024 • 54

DVIS-DAQ: Improving Video Segmentation via Dynamic Anchor Queries

Paper • 2404.00086 • Published Mar 29, 2024

DVIS: Decoupled Video Instance Segmentation Framework

Paper • 2306.03413 • Published Jun 6, 2023