LLaDA-8B-BGPO

Updated Oct 11, 2025

Boundary-Guided Policy Optimization for Memory-Efficient RL of Diffusion Large Language Models

  • THU-KEG/LLaDA-8B-BGPO-math

    Reinforcement Learning • 8B • Updated Oct 14, 2025 • 11 downloads • 1 like

  • THU-KEG/LLaDA-8B-BGPO-code

    Reinforcement Learning • 8B • Updated Oct 14, 2025 • 13 downloads • 1 like

  • THU-KEG/LLaDA-8B-BGPO-countdown

    Reinforcement Learning • 8B • Updated Oct 14, 2025 • 7 downloads • 1 like

  • THU-KEG/LLaDA-8B-BGPO-sudoku

    Reinforcement Learning • 8B • Updated Oct 14, 2025 • 10 downloads • 1 like