TongZheng PRO
TongZheng1999
AI & ML interests
Natural Language Processing
Recent Activity
upvoted a paper 2 days ago
Training Data Efficiency in Multimodal Process Reward Models
liked a Space 3 days ago
EfficientReasoning/efficient_reasoning_online_judgement
upvoted a paper 3 days ago
CoBA-RL: Capability-Oriented Budget Allocation for Reinforcement Learning in LLMs