ConicCat

AI & ML interests

Preference optimization for learning from human feedback.

Recent Activity

updated a model about 11 hours ago

ConicCat/Llama3_3-Nemo-Super-Writer-49B

published a model about 11 hours ago

ConicCat/Llama3_3-Nemo-Super-Writer-49B

new activity about 12 hours ago

Jackrong/Qwen3.5-27B-Claude-4.6-Opus-Reasoning-Distilled: How do you hack the cot responses?

Organizations

None yet

upvoted 2 papers about 2 months ago

WildFeedback: Aligning LLMs With In-situ User Interactions And Feedback

Paper • 2408.15549 • Published Aug 28, 2024 • 2

WildReward: Learning Reward Models from In-the-Wild Human Interactions

Paper • 2602.08829 • Published Feb 9 • 3

upvoted a paper 6 months ago

DISCO: Diversifying Sample Condensation for Efficient Model Evaluation

Paper • 2510.07959 • Published Oct 9, 2025 • 15