# Llama-3-8B-CLARA-MeD-QLoRA 🏥🇪🇸

- Developed by: Jordiett
- Task: Biomedical Text Simplification (Spanish)
- Base Model: unsloth/llama-3-8b-Instruct-bnb-4bit
- Dataset: CLARA-MeD (3,800 pairs of parallel medical texts)
## Model Description
This model is a fine-tuned version of Llama-3-8B-Instruct, optimized for simplifying complex medical texts into plain Spanish that patients can understand.
It was trained with Unsloth and QLoRA (4-bit quantization) on the CLARA-MeD dataset. The model outperforms baseline translation models (such as NLLB) and previous-generation LLMs (Llama-2) on simplification metrics, most notably achieving a high SARI score.
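QLoRA freezes the quantized base weights and trains only a pair of small low-rank matrices per adapted layer; their scaled product is added to the frozen weight at inference time. A minimal pure-Python sketch of that low-rank update (toy 4x4 matrices and illustrative values, no quantization):

```python
# Toy illustration of a LoRA update: W_eff = W + (alpha / r) * B @ A
# Real QLoRA applies this on top of 4-bit frozen weights; here W is a plain matrix.

def matmul(X, Y):
    """Multiply two matrices given as lists of rows."""
    return [[sum(X[i][k] * Y[k][j] for k in range(len(Y)))
             for j in range(len(Y[0]))] for i in range(len(X))]

def lora_effective_weight(W, A, B, alpha=16, r=2):
    """Frozen weight W plus the scaled low-rank update B @ A (B: d x r, A: r x d)."""
    delta = matmul(B, A)
    scale = alpha / r
    return [[W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

# 4x4 frozen weight, rank-2 adapters: 2 * d * r = 16 trainable values
# instead of the d * d = 16-value full matrix (the savings grow with d).
W = [[1.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 0.0, 0.0],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]
B = [[0.1, 0.0], [0.0, 0.1], [0.0, 0.0], [0.0, 0.0]]  # d x r
A = [[0.0, 0.0, 0.0, 0.1], [0.0, 0.0, 0.1, 0.0]]      # r x d

W_eff = lora_effective_weight(W, A, B)
```

Only `A` and `B` receive gradients during fine-tuning; the 4-bit base weights stay untouched, which is what makes training an 8B model feasible on a single consumer GPU.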
## Key Features
- Domain: Clinical/Biomedical.
- Language: Spanish.
- Method: QLoRA (Quantized Low-Rank Adaptation) + Unsloth.
- Optimization: Inference parameters selected via Grid Search (Best: Greedy Decoding, Temp=0.0).
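The grid search over inference parameters can be sketched as follows. Note that `score_fn`, the candidate grid, and the scoring values are hypothetical stand-ins for "generate on a dev set with these parameters and compute SARI":

```python
import itertools

# Hypothetical stand-in for: generate dev-set outputs with these decoding
# parameters and return the SARI score. Toy function that peaks at greedy decoding.
def score_fn(temperature, top_p):
    return 40.0 - 10.0 * temperature - 2.0 * (1.0 - top_p)

# Candidate decoding configurations (illustrative values).
grid = {
    "temperature": [0.0, 0.3, 0.7, 1.0],
    "top_p": [0.9, 0.95, 1.0],
}

best_params, best_score = None, float("-inf")
for values in itertools.product(*grid.values()):
    params = dict(zip(grid.keys(), values))
    s = score_fn(**params)
    if s > best_score:
        best_params, best_score = params, s

print(best_params)  # {'temperature': 0.0, 'top_p': 1.0}
```

With a real model in the loop, each grid point costs a full dev-set generation pass, which is why the grid is kept small.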
## Performance 📊
The model was evaluated on the CLARA-MeD test set (10% split).
| Metric | Score | Description |
|---|---|---|
| SARI | 39.92 | Main simplification metric (Keep/Add/Del). |
| BLEU | 22.97 | N-gram precision against reference. |
| COMET | ~0.76 | Semantic similarity. |
| ROUGE-L | 0.44 | Recall-based metric (Longest Common Subsequence). |
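ROUGE-L in the table above is based on the longest common subsequence (LCS) between prediction and reference. A minimal token-level sketch, using the qualitative example from this card (the reported scores were computed with standard evaluation tooling, not this snippet):

```python
def lcs_length(a, b):
    """Longest common subsequence length via dynamic programming."""
    dp = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i, x in enumerate(a, 1):
        for j, y in enumerate(b, 1):
            dp[i][j] = dp[i-1][j-1] + 1 if x == y else max(dp[i-1][j], dp[i][j-1])
    return dp[len(a)][len(b)]

def rouge_l_f1(prediction, reference):
    """ROUGE-L F1 over whitespace-separated tokens."""
    p, r = prediction.split(), reference.split()
    lcs = lcs_length(p, r)
    if lcs == 0:
        return 0.0
    prec, rec = lcs / len(p), lcs / len(r)
    return 2 * prec * rec / (prec + rec)

pred = "infección de la uña del dedo del pie"
ref = "infección por hongos de la uña del pie"
score = rouge_l_f1(pred, ref)  # LCS = 6 tokens out of 8 on each side
```

SARI additionally compares against the *source* sentence to reward kept, added, and deleted n-grams separately, which is why it is the preferred headline metric for simplification.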
## Qualitative Example
The model handles medical terminology accurately, avoiding the hallucinations common in zero-shot baselines.
| Type | Text |
|---|---|
| Original (Input) | "onicosimicotica y perionixis" |
| Reference (Gold) | "infección por hongos de la uña del pie" |
| Model Prediction | "infección de la uña del dedo del pie" |
## How to Use 💻
To use this model you need `unsloth`, which makes inference roughly 2x faster and uses about 60% less memory.
```python
# 1. Install Unsloth
# !pip install "unsloth[colab-new] @ git+https://github.com/unslothai/unsloth.git"
# !pip install --no-deps "xformers<0.0.27" "trl<0.9.0" peft accelerate bitsandbytes

from unsloth import FastLanguageModel
import torch

# 2. Load Model & Tokenizer
model_name = "Jordiett/llama3-8b-claramed-qlora"
max_seq_length = 512
dtype = None          # auto-detect (float16 / bfloat16)
load_in_4bit = True   # 4-bit quantization (QLoRA)

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = model_name,
    max_seq_length = max_seq_length,
    dtype = dtype,
    load_in_4bit = load_in_4bit,
)
FastLanguageModel.for_inference(model)  # switch to optimized inference mode

# 3. Define the Prompt (Alpaca Style)
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
Actúa como un doctor experto. Simplifica el siguiente texto médico técnico al español claro para un paciente.

### Input:
{}

### Response:
"""

# 4. Run Inference
text_to_simplify = "El paciente presenta cefalea tensional crónica y odinofagia."  # Example

inputs = tokenizer(
    [alpaca_prompt.format(text_to_simplify)],
    return_tensors = "pt",
).to("cuda")

# Greedy decoding (the best configuration found in the grid search)
outputs = model.generate(**inputs, max_new_tokens = 128, use_cache = True, do_sample = False)
result = tokenizer.batch_decode(outputs, skip_special_tokens = True)
print(result[0].split("### Response:")[-1].strip())
# Expected output: "El paciente tiene dolor de cabeza constante y dolor al tragar."
```