Official collection for the paper "Reward Modeling from Natural Language Human Feedback".
AI & ML interests
LLM, Conversational AI, Agent
Organization Card
Tongyi-ConvAI: The official repository containing the Alibaba Tongyi Conversational AI models and datasets.
models 13
Tongyi-ConvAI/Cold-Start-MetaRM-RM-NLHF-Qwen-32B
32B • Updated • 11 • 1
Tongyi-ConvAI/Cold-Start-MetaRM-RM-NLHF-Qwen-7B
7B • Updated • 16
Tongyi-ConvAI/Final-MetaRM-RM-NLHF-Qwen-7B
7B • Updated • 11
Tongyi-ConvAI/Final-MetaRM-RM-NLHF-Qwen-32B
32B • Updated • 8
Tongyi-ConvAI/RM-NLHF-Qwen-32B
33B • Updated • 15
Tongyi-ConvAI/Baseline-Outcome-Reward-Qwen-7B
8B • Updated • 14
Tongyi-ConvAI/RM-NLHF-Qwen-7B
8B • Updated • 24 • 1
Tongyi-ConvAI/P-GenRM-8B-ChatbotArena
8B • Updated • 10 • 2
Tongyi-ConvAI/P-GenRM-8B-PRISM
8B • Updated • 1
Tongyi-ConvAI/OmniCharacter-7B
Updated
datasets 6
Tongyi-ConvAI/RM-NLHF
Viewer • Updated • 49.5k • 23 • 1
Tongyi-ConvAI/OmniCharacter
Viewer • Updated • 10.1k • 268 • 2
Tongyi-ConvAI/EPO-RL-data
Viewer • Updated • 9.38k • 46 • 1
Tongyi-ConvAI/OpenOmni
Preview • Updated • 145 • 4
Tongyi-ConvAI/SDPO
Preview • Updated • 35 • 5
Tongyi-ConvAI/MMEvol
Preview • Updated • 1.33k • 15