Qdrant/minicoil-v1

Sentence Similarity


mrscoopers committed on Jun 11, 2025

Commit a267ad7 · verified · 1 Parent(s): ebc6496

Update README.md

Files changed (1)

README.md +13 -4

@@ -8,16 +8,25 @@ pipeline_tag: sentence-similarity
 # MiniCOIL v1
-MiniCOIL - is a sparse contextualized per-token embeddings.
-Read more about it in [the article](https://qdrant.tech/articles/minicoil).
 ## Usage
-This model is designed to be used with [FastEmbed](https://github.com/qdrant/fastembed) library.
 > Note:
-This model is supposed to be used with Qdrant. Vectors have to be configured with [Modifier.IDF](https://qdrant.tech/documentation/concepts/indexing/?q=modifier#idf-modifier).
 ```py
 from fastembed import SparseTextEmbedding

 # MiniCOIL v1
+MiniCOIL is a sparse neural embedding model for textual retrieval.
+It creates 4-dimensional embeddings for each word stem, capturing the word's meaning.
+These meaning embeddings are combined into a bag-of-words (BoW) representation of the input text.
+The final sparse representation is calculated by weighting each word using the BM25 scoring formula.
+<img src="https://storage.googleapis.com/qdrant-examples/miniCOIL_inference.png" alt="miniCOIL inference" width="600"/>
+If a word is absent from the miniCOIL vocabulary, its weight in the sparse representation is based purely on its BM25 score.
+Read more about miniCOIL in [the article](https://qdrant.tech/articles/minicoil).
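
The composition described in the added lines can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not FastEmbed's implementation: the function names, the toy `meaning` table, the `(stem, dimension)` keys, and the BM25 parameters `k` and `b` are all hypothetical. The real model derives its 4-dimensional vectors from context rather than from a fixed lookup table, and the IDF factor is omitted here because the note below delegates it to Qdrant.

```python
from collections import Counter


def bm25_tf_weight(tf, doc_len, avg_len, k=1.2, b=0.75):
    """Term-frequency part of BM25 (the IDF factor is deliberately left out)."""
    return tf * (k + 1) / (tf + k * (1 - b + b * doc_len / avg_len))


def sparse_representation(stems, meaning, avg_len):
    """Combine per-stem meaning vectors into a weighted bag-of-words.

    stems:   word stems of one text (hypothetical pre-tokenized input)
    meaning: dict mapping in-vocabulary stems to 4-dimensional vectors (toy data)
    avg_len: assumed average text length, counted in stems
    """
    counts = Counter(stems)
    sparse = {}
    for stem, tf in counts.items():
        weight = bm25_tf_weight(tf, len(stems), avg_len)
        if stem in meaning:
            # In-vocabulary word: scale each of the 4 meaning components.
            for dim, component in enumerate(meaning[stem]):
                sparse[(stem, dim)] = weight * component
        else:
            # Out-of-vocabulary word: the weight is the BM25 score alone.
            sparse[(stem, None)] = weight
    return sparse


# Toy example: "bank" is in the vocabulary, "river" is not.
toy_meaning = {"bank": [0.5, -0.5, 0.0, 1.0]}
vector = sparse_representation(["bank", "river"], toy_meaning, avg_len=2.0)
```

Here `vector` holds four entries for "bank" (one per meaning dimension) and a single BM25-only entry for "river".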
 ## Usage
+This model is designed to be used with the [FastEmbed](https://github.com/qdrant/fastembed) library.
 > Note:
+This model was designed with Qdrant's specifics in mind; miniCOIL sparse vectors in Qdrant have to be configured with [Modifier.IDF](https://qdrant.tech/documentation/concepts/indexing/?q=modifier#idf-modifier). Otherwise, you'll have to calculate the IDF part of the BM25 formula yourself and scale the produced sparse representations by it.
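
If the vectors are stored somewhere that does not apply IDF for you, the manual scaling mentioned in the note might look roughly like the sketch below. It assumes a common BM25 IDF variant, ln(1 + (N - n + 0.5) / (n + 0.5)); the function names, the document-frequency table, and the dictionary-based vector layout are hypothetical, and the exact formula Qdrant applies should be checked against its documentation.

```python
import math


def bm25_idf(total_docs, docs_with_term):
    """Assumed BM25 IDF variant; N = total_docs, n = docs_with_term."""
    return math.log(1.0 + (total_docs - docs_with_term + 0.5) / (docs_with_term + 0.5))


def scale_by_idf(sparse, doc_freq, total_docs):
    """Multiply every sparse value by the IDF of its term.

    sparse:   dict keyed by (term, dimension) tuples (hypothetical layout)
    doc_freq: dict mapping a term to the number of documents containing it
    """
    return {
        key: value * bm25_idf(total_docs, doc_freq.get(key[0], 0))
        for key, value in sparse.items()
    }


# Toy corpus of 2 documents: "bank" occurs in one, "river" in both.
scaled = scale_by_idf(
    {("bank", 0): 0.5, ("river", None): 1.0},
    {"bank": 1, "river": 2},
    total_docs=2,
)
```

The rarer term ("bank") keeps more of its weight than the term that appears in every document, which is the effect the IDF modifier provides inside Qdrant.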
 ```py
 from fastembed import SparseTextEmbedding