StreamBP: Memory-Efficient Exact Backpropagation for Long Sequence Training of LLMs • Paper 2506.03077 • Published Jun 3, 2025
DPO-Shift: Shifting the Distribution of Direct Preference Optimization • Paper 2502.07599 • Published Feb 11, 2025