# Qwen3.5-4B LoRA SFT (checkpoint-971)
A LoRA adapter for Qwen/Qwen3.5-4B, fine-tuned on stepfun-ai/Step-3.5-Flash-SFT using Axolotl on a single NVIDIA H100 PCIe 80GB.

This is **checkpoint-971** (epoch 1.0), the best checkpoint by eval loss (0.787).
## Training Details
| Setting | Value |
|---|---|
| Base model | Qwen/Qwen3.5-4B |
| Dataset | stepfun-ai/Step-3.5-Flash-SFT (chunk_0, ~14.5k samples) |
| LoRA rank | 128 |
| LoRA alpha | 256 |
| Trainable params | 195M / 4.73B (4.13%) |
| Best eval loss | 0.787 (epoch 1.0) |
| Hardware | 1× H100 PCIe 80GB |
| Training time | ~3 hours |
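For intuition on the numbers above, here is a small sketch of the LoRA arithmetic. For a single `d_out × d_in` linear layer, LoRA at rank `r` adds an `A` matrix of shape `(r, d_in)` and a `B` matrix of shape `(d_out, r)`, and scales the update `BA` by `alpha / r`. The 2560-wide projection below is a hypothetical layer size chosen purely for illustration, not a measured dimension of this model:

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Extra trainable parameters LoRA adds to one (d_out x d_in) linear layer:
    A has shape (r, d_in) and B has shape (d_out, r)."""
    return r * d_in + d_out * r

r, alpha = 128, 256  # this card's settings

# Hypothetical 2560x2560 projection at rank 128
print(lora_param_count(2560, 2560, r))  # 655360 extra params for that layer

# The LoRA update BA is scaled by alpha / r; here that factor is 2.0
print(alpha / r)  # 2.0

# Rough sanity check on the table's trainable-parameter share
print(round(195e6 / 4.73e9 * 100, 2))  # ~4.12 (the card's 4.13% uses exact counts)
```

Summed over all adapted layers, this per-layer formula is what yields the ~195M trainable parameters reported above.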
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load the base model and attach the LoRA adapter
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3.5-4B", dtype=torch.bfloat16, device_map="cuda")
tokenizer = AutoTokenizer.from_pretrained("FutureMa/qwen35-4b-lora-sft")
model = PeftModel.from_pretrained(base, "FutureMa/qwen35-4b-lora-sft")

# Build a chat prompt (thinking mode disabled)
messages = [{"role": "user", "content": "Explain what a lambda function is in Python."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True, enable_thinking=False)
inputs = tokenizer(text, return_tensors="pt").to("cuda")

# Greedy decoding; print only the newly generated tokens
out = model.generate(**inputs, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
## Framework versions
- PEFT 0.18.1
- Axolotl (latest)