webxos committed on
Commit e07033f · verified
1 Parent(s): 970442c

Update README.md

Files changed (1)
  1. README.md +1 -2
README.md CHANGED
@@ -52,8 +52,7 @@ small set of files the user can use to template their own agents. Designed for e
 Use **MICROD V1.0 (micro-distill-grpo-vae)** in your own custom projects and train it from the ground up.
 
 The model's architecture details further underscore an educational niche: a hidden size of 512, 8 layers, 8 attention heads, a vocabulary of 50,257 tokens,
-and a max sequence length of 1024. It supports KV-cache reuse with a 512 cache size, enabling faster generation for sequential thoughts, though this feature
-is noted as inactive in some interfaces. Licensed under Apache 2.0, it's openly available for modification, and its small footprint allows quantization,
+and a max sequence length of 1024. Licensed under Apache 2.0, it's openly available for modification, and its small footprint allows quantization,
 making it runnable on modest hardware like CPUs or even browsers via TensorFlow.js integration.
 
 ## Model Details