
A proof of concept showing that a smaller autoencoder (AE) can cut Sana's end-to-end inference time in half.

Code is here: https://github.com/Luke100000/Mini-DC-AE

Training is suboptimal and outputs are blurry, since I failed to replicate the GAN training from the paper.


The parameter count of the AE decoder is reduced by 9×.
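You can verify the size difference yourself by comparing parameter counts. A minimal sketch (`count_params` is a hypothetical helper; the toy modules below only stand in for the real decoders, which you would load and compare via `pipeline.vae.decoder`):

```python
import torch
from torch import nn

def count_params(module: nn.Module) -> int:
    """Total number of trainable parameters in a module."""
    return sum(p.numel() for p in module.parameters() if p.requires_grad)

# Toy stand-ins for the two decoders; in practice, load the original
# DC-AE and the mini AE and call count_params on each .decoder.
big = nn.Sequential(nn.Linear(32, 512), nn.Linear(512, 512), nn.Linear(512, 3))
small = nn.Sequential(nn.Linear(32, 128), nn.Linear(128, 3))
print(count_params(big) / count_params(small))
```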

End-to-end inference time went down massively.


```python
import torch
from diffusers import AutoencoderDC, SanaSprintPipeline

device = "cuda"
dtype = torch.float32 if device == "cpu" else torch.bfloat16

pipeline = SanaSprintPipeline.from_pretrained(
    "Efficient-Large-Model/Sana_Sprint_0.6B_1024px_diffusers",
    torch_dtype=dtype,
)

# Swap the default DC-AE for the smaller mini variant.
pipeline.vae = AutoencoderDC.from_pretrained(
    "Luke100000/dc-ae-mini-f32c32-sana-1.1-diffusers",
    torch_dtype=dtype,
    low_cpu_mem_usage=False,
)

pipeline.to(device=device, dtype=dtype)

image = pipeline(
    prompt="a tiny astronaut hatching from an egg on the moon",
    num_inference_steps=2,
    width=512,
    height=512,
).images[0]
```
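To reproduce the timing comparison on your own hardware, a rough wall-clock harness like the one below can be used. This is a sketch with a dummy module; `time_decode` is a hypothetical helper, and with the pipeline above you would pass `pipeline.vae.decode` and a real latent tensor instead:

```python
import time
import torch

@torch.no_grad()
def time_decode(decoder, latents, warmup=2, iters=5):
    """Average wall-clock time of a decoder forward pass, in seconds."""
    for _ in range(warmup):
        decoder(latents)
    if latents.is_cuda:
        torch.cuda.synchronize()
    start = time.perf_counter()
    for _ in range(iters):
        decoder(latents)
    if latents.is_cuda:
        torch.cuda.synchronize()
    return (time.perf_counter() - start) / iters

# Dummy decoder for illustration only; substitute pipeline.vae.decode
# and latents shaped like the pipeline's output to benchmark for real.
dummy = torch.nn.Conv2d(32, 3, kernel_size=3, padding=1)
latents = torch.randn(1, 32, 16, 16)
print(f"{time_decode(dummy, latents) * 1e3:.2f} ms")
```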