Yujun Zhou

yujunzhou

AI & ML interests

None yet

Recent Activity

submitted a paper 10 days ago

Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning

updated a model 10 days ago

yujunzhou/SFT_Advanced_Risk_Self_Grading_Qwen3-4B

Organizations

None yet

upvoted a paper 10 days ago

Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning

Paper • 2512.15687 • Published 10 days ago • 17

upvoted a paper 16 days ago

MotionEdit: Benchmarking and Learning Motion-Centric Image Editing

Paper • 2512.10284 • Published 17 days ago • 25

upvoted a paper about 1 month ago

VisPlay: Self-Evolving Vision-Language Models from Images

Paper • 2511.15661 • Published Nov 19 • 42

upvoted a paper 2 months ago

Building a Foundational Guardrail for General Agentic Systems via Synthetic Data

Paper • 2510.09781 • Published Oct 10 • 26

upvoted 3 papers 3 months ago

VOGUE: Guiding Exploration with Visual Uncertainty Improves Multimodal Reasoning

Paper • 2510.01444 • Published Oct 1 • 19

CLUE: Non-parametric Verification from Experience via Hidden-State Clustering

Paper • 2510.01591 • Published Oct 2 • 27

Evolving Language Models without Labels: Majority Drives Selection, Novelty Promotes Variation

Paper • 2509.15194 • Published Sep 18 • 33

upvoted 2 papers 4 months ago

CDE: Curiosity-Driven Exploration for Efficient Reinforcement Learning in Large Language Models

Paper • 2509.09675 • Published Sep 11 • 28

Parallel-R1: Towards Parallel Thinking via Reinforcement Learning

Paper • 2509.07980 • Published Sep 9 • 101

upvoted a paper 5 months ago

R-Zero: Self-Evolving Reasoning LLM from Zero Data

Paper • 2508.05004 • Published Aug 7 • 130

upvoted a paper 10 months ago

On the Trustworthiness of Generative Foundation Models: Guideline, Assessment, and Perspective

Paper • 2502.14296 • Published Feb 20 • 45

upvoted a paper 11 months ago

Preference Leakage: A Contamination Problem in LLM-as-a-judge

Paper • 2502.01534 • Published Feb 3 • 40

upvoted a paper about 1 year ago

Interleaved Scene Graph for Interleaved Text-and-Image Generation Assessment

Paper • 2411.17188 • Published Nov 26, 2024 • 20