You should explain your file types better
Just a suggestion, but most users have no idea what your file suffixes mean or how to choose the right model. Outside of the hardcore Discord crowd, this isn’t common knowledge.
You clearly put a lot of effort into the quantization levels, but without explaining them up front, it ends up confusing people.
Please paste the README info directly into the model card section. That’s where users decide what to download, and it would prevent 90% of the confusion. I’m assuming the README covers the Q and K file types (I learned about them from other sources). Either way, it’s a really easy fix that would get your work out to more people.
Right now the presentation makes the project look rough around the edges, even though the work itself is impressive. A little documentation goes a long way.
Hi, I am glad you decided to open this discussion here, where the files are downloaded, rather than somewhere else such as the Discord server you participate in.
These GGUF files are quantizations made with llama.cpp, and you can find more information at:
https://github.com/ggml-org/llama.cpp
Just like other users who have questions about the models we publish, you are welcome to ask yours by creating a discussion on the model you are using, as you did here, though this time without actually asking anything.