# Qwen3.5-4B LoRA SFT (checkpoint-971)
A LoRA adapter for Qwen/Qwen3.5-4B, fine-tuned on stepfun-ai/Step-3.5-Flash-SFT using Axolotl on a single NVIDIA H100 PCIe 80GB.

This is **checkpoint-971** (epoch 1.0), the best checkpoint by eval loss (0.787).
## Training Details
| Setting | Value |
|---|---|
| Base model | Qwen/Qwen3.5-4B |
| Dataset | stepfun-ai/Step-3.5-Flash-SFT (chunk_0, ~14.5k samples) |
| LoRA rank | 128 |
| LoRA alpha | 256 |
| Trainable params | 195M / 4.73B (4.13%) |
| Best eval loss | 0.787 (epoch 1.0) |
| Hardware | 1× H100 PCIe 80GB |
| Training time | ~3 hours |
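For intuition on the numbers above, here is a small sketch of the LoRA arithmetic. For a single `d_out × d_in` linear layer, LoRA at rank `r` adds an `A` matrix of shape `(r, d_in)` and a `B` matrix of shape `(d_out, r)`, and scales the update `BA` by `alpha / r`. The 2560-wide projection below is a hypothetical layer size chosen purely for illustration, not a measured dimension of this model:

```python
def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Extra trainable parameters LoRA adds to one (d_out x d_in) linear layer:
    A has shape (r, d_in) and B has shape (d_out, r)."""
    return r * d_in + d_out * r

r, alpha = 128, 256  # this card's settings

# Hypothetical 2560x2560 projection at rank 128
print(lora_param_count(2560, 2560, r))  # 655360 extra params for that layer

# The LoRA update BA is scaled by alpha / r; here that factor is 2.0
print(alpha / r)  # 2.0

# Rough sanity check on the table's trainable-parameter share
print(round(195e6 / 4.73e9 * 100, 2))  # ~4.12 (the card's 4.13% uses exact counts)
```

Summed over all adapted layers, this per-layer formula is what yields the ~195M trainable parameters reported above.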
## Usage
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load the base model and attach the LoRA adapter
base = AutoModelForCausalLM.from_pretrained("Qwen/Qwen3.5-4B", dtype=torch.bfloat16, device_map="cuda")
tokenizer = AutoTokenizer.from_pretrained("FutureMa/qwen35-4b-lora-sft")
model = PeftModel.from_pretrained(base, "FutureMa/qwen35-4b-lora-sft")

# Build a chat prompt (thinking mode disabled)
messages = [{"role": "user", "content": "Explain what a lambda function is in Python."}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True, enable_thinking=False)
inputs = tokenizer(text, return_tensors="pt").to("cuda")

# Greedy decoding; print only the newly generated tokens
out = model.generate(**inputs, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```
## Framework versions
- PEFT 0.18.1
- Axolotl (latest)