On Policy Preference Data

community

AI & ML interests

None defined yet.

Recent Activity

shizhuo2 submitted a paper 3 days ago

Good SFT Optimizes for SFT, Better SFT Prepares for Reinforcement Learning

genglinliu authored a paper 3 months ago

AI Debate Aids Assessment of Controversial Claims

genglinliu authored a paper 3 months ago

WebCoach: Self-Evolving Web Agents with Cross-Session Memory Guidance

shizhuo2

submitted a paper to Daily Papers 3 days ago

Good SFT Optimizes for SFT, Better SFT Prepares for Reinforcement Learning

Paper • 2602.01058 • Published 5 days ago • 38

genglinliu

authored 2 papers 3 months ago

AI Debate Aids Assessment of Controversial Claims

Paper • 2506.02175 • Published Jun 2, 2025 • 1

WebCoach: Self-Evolving Web Agents with Cross-Session Memory Guidance

Paper • 2511.12997 • Published Nov 17, 2025 • 11

genglinliu

authored 2 papers 10 months ago

X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents

Paper • 2504.13203 • Published Apr 15, 2025 • 35

MOSAIC: Modeling Social AI for Content Dissemination and Regulation in Multi-Agent Simulations

Paper • 2504.07830 • Published Apr 10, 2025 • 18

shizhuo2

authored a paper over 1 year ago

Only-IF: Revealing the Decisive Effect of Instruction Diversity on Generalization

Paper • 2410.04717 • Published Oct 7, 2024 • 18

genglinliu

authored a paper over 1 year ago

SciCode: A Research Coding Benchmark Curated by Scientists

Paper • 2407.13168 • Published Jul 18, 2024 • 17

shizhuo2

authored 3 papers over 1 year ago

Instruction Diversity Drives Generalization To Unseen Tasks

Paper • 2402.10891 • Published Feb 16, 2024

PACE-LM: Prompting and Augmentation for Calibrated Confidence Estimation with GPT-4 in Cloud Incident Root Cause Analysis

Paper • 2309.05833 • Published Sep 11, 2023

PLUM: Preference Learning Plus Test Cases Yields Better Code Language Models

Paper • 2406.06887 • Published Jun 11, 2024 • 2

shizhuo2

authored a paper almost 2 years ago

Transformer-Based Models Are Not Yet Perfect At Learning to Emulate Structural Recursion

Paper • 2401.12947 • Published Jan 23, 2024 • 4

Team members 2

on-policy-pref's activity