dougeeai 
posted an update Nov 9, 2025
Llama.cpp wheels for Windows - Hot off the press!

I got tired of fighting with Visual Studio and CUDA Toolkit every time I wanted to use llama-cpp-python on Windows, so I've been building pre-compiled wheels for the community.

## What's Available:
✅ RTX 50/40/30/20 Series support (Blackwell, Ada, Ampere, Turing)
✅ CUDA 11.8, 12.1, 13.0 (Blackwell is CUDA 13 only)
✅ Python 3.10-3.13
✅ Just 'pip install' and run - no build tools needed
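
If you want to check which build matches your setup before downloading, here is a rough sketch. The helper names (`cpython_tag`, `cuda_options`, `is_supported`) are made up for illustration, and the per-architecture CUDA mapping is an assumption: the only constraint stated above is that Blackwell needs CUDA 13. Check the repo for the builds that actually exist.

```python
import sys

# Python versions listed in the post (3.10 through 3.13).
SUPPORTED_PYTHONS = {(3, 10), (3, 11), (3, 12), (3, 13)}

# CUDA versions listed in the post.
ALL_CUDA = ["11.8", "12.1", "13.0"]

# ASSUMPTION: every non-Blackwell architecture has a build for each CUDA
# version. The post only states that Blackwell is CUDA 13 only.
CUDA_BY_ARCH = {
    "Turing": ALL_CUDA,
    "Ampere": ALL_CUDA,
    "Ada": ALL_CUDA,
    "Blackwell": ["13.0"],
}


def cpython_tag(version=None):
    """Return the CPython wheel tag, e.g. 'cp312' for Python 3.12."""
    major, minor = (version or sys.version_info)[:2]
    return f"cp{major}{minor}"


def cuda_options(arch):
    """Return the CUDA versions assumed to be available for an architecture."""
    return list(CUDA_BY_ARCH.get(arch, []))


def is_supported(arch, cuda, version=None):
    """True if the (architecture, CUDA, Python) combination is in the table."""
    major, minor = (version or sys.version_info)[:2]
    return (major, minor) in SUPPORTED_PYTHONS and cuda in cuda_options(arch)


if __name__ == "__main__":
    # Report the tag for the running interpreter and one sample combination.
    print(cpython_tag(), is_supported("Ada", "12.1"))
```

The `cp3XX` tag is the part of a wheel filename that has to match your interpreter, so it is the first thing to look for when picking a file.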

## Why this matters:
Windows users face a painful setup process with llama-cpp-python. These wheels eliminate:
- Visual Studio installation
- CUDA Toolkit setup
- Compilation errors
- Hours of troubleshooting

**Download:** https://github.com/dougeeai/llama-cpp-python-wheels
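
After installing a wheel, a quick sanity check along these lines can confirm that the package imports cleanly. This is only a sketch: `check_install` is a hypothetical helper, and it assumes the installed module is named `llama_cpp` and may expose `llama_supports_gpu_offload`; if that function is missing in your version, the GPU field simply stays `None`.

```python
import importlib
import importlib.util


def check_install(module_name="llama_cpp"):
    """Report whether a module is installed, its version, and GPU offload support."""
    report = {"installed": False, "version": None, "gpu_offload": None}

    # find_spec returns None when the top-level module cannot be found.
    if importlib.util.find_spec(module_name) is None:
        return report

    try:
        module = importlib.import_module(module_name)
    except Exception as exc:  # e.g. a missing CUDA runtime DLL on Windows
        report["error"] = repr(exc)
        return report

    report["installed"] = True
    report["version"] = getattr(module, "__version__", None)

    # ASSUMPTION: the bindings expose llama_supports_gpu_offload(); skip if absent.
    probe = getattr(module, "llama_supports_gpu_offload", None)
    if callable(probe):
        report["gpu_offload"] = bool(probe())
    return report


if __name__ == "__main__":
    print(check_install())
```

If `installed` is true but `gpu_offload` comes back false, the most likely cause is a CPU-only build or a wheel built for a different CUDA version than the one your driver supports.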

Linux wheels coming soon! Let me know what configs you need.

Tested on Ada Lovelace and Ampere GPUs.


#llama-cpp #gguf #windows #prebuilt

Nice! The original repository is still maintained, but it has stopped publishing prebuilt wheels.
Another variation I know of:
https://github.com/JamePeng/llama-cpp-python
