# Marziel OS v1.0.1

Private AI Operating System that runs entirely on your hardware.
## Install

```bash
pip install marziel==1.0.1
marziel serve
```
## Features (v1.0.1)
### AI Kernel
- Persistent event loop with autonomous decision-making
- 3-Tier Memory: Working, Long-Term (AES-256), Episodic
- Process Manager: Unix-like ps/top/kill
- Task Scheduler for recurring tasks
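A minimal sketch of how the three memory tiers could fit together. The class and method names below are illustrative, not Marziel's internals, and the long-term tier's AES-256 encryption is omitted:

```python
import time
from collections import deque

class MemoryTiers:
    """Toy 3-tier memory: working (small, bounded), long-term (keyed
    store; Marziel encrypts this tier with AES-256, omitted here), and
    episodic (timestamped event log)."""

    def __init__(self, working_size=8):
        self.working = deque(maxlen=working_size)  # recent context only
        self.long_term = {}                        # durable facts
        self.episodic = []                         # (timestamp, event) history

    def observe(self, event):
        # New events enter working memory (evicting the oldest when full)
        # and are also appended to the episodic log.
        self.working.append(event)
        self.episodic.append((time.time(), event))

    def remember(self, key, value):
        self.long_term[key] = value

    def recall(self, key):
        return self.long_term.get(key)

mem = MemoryTiers(working_size=2)
mem.observe("user said hello")
mem.observe("user asked about the weather")
mem.observe("user asked for a summary")  # "hello" falls out of working memory
mem.remember("user_name", "Efe")
```

The bounded deque gives working memory its "recent context" behavior for free, while the episodic log keeps the full history.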
### MarzielFlow: 5-Component Adaptive Inference
- Speculative Decoding with automatic fallback
- T/Z Distribution Quantization (Normal/Student-t/Beta)
- Adaptive Bit-Width: 2.88-bit avg across 32 layers
- Fuzzy Logic Controller (Mamdani-style)
- Attention Sink Cache: 75% memory savings
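Speculative decoding in miniature: a cheap draft model proposes a few tokens, the target model verifies them, and on a mismatch the target's own token is emitted instead (a stand-in for the automatic fallback). The toy `target`/`draft` functions just count modulo 10 and are not Marziel's models:

```python
def target(seq):
    # Stand-in "big" model: the true next token counts up modulo 10.
    return (seq[-1] + 1) % 10

def draft(seq, k):
    # Stand-in "small" model: same rule, but it always mispredicts 5 as 0.
    out, proposal = list(seq), []
    for _ in range(k):
        nxt = (out[-1] + 1) % 10
        if nxt == 5:
            nxt = 0  # deliberate draft error
        proposal.append(nxt)
        out.append(nxt)
    return proposal

def speculative_decode(prompt, k=3, max_new=8):
    out = list(prompt)
    while len(out) - len(prompt) < max_new:
        for tok in draft(out, k):
            if len(out) - len(prompt) >= max_new:
                break
            if tok == target(out):
                out.append(tok)          # draft token verified: accept
            else:
                out.append(target(out))  # mismatch: fall back to target's token
                break
    return out

print(speculative_decode([0]))  # [0, 1, 2, 3, 4, 5, 6, 7, 8]
```

When the draft agrees, several tokens are accepted per target check; when it slips (here, at 5), decoding degrades gracefully to the target's own prediction.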
### TurboQuant
PolarQuant + QJL 3-bit KV cache for a 3x memory reduction.
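A toy sketch of the bit-width arithmetic behind the savings: one group of values quantized symmetrically to 3-bit integer levels with a shared scale. PolarQuant's polar decomposition and QJL's sketching sit on top of this and are omitted; the grouping and function names are illustrative:

```python
def quantize_3bit(values):
    # Symmetric 3-bit quantization for one group: 8 integer levels in
    # [-4, 3] sharing a single float scale.
    scale = max(abs(v) for v in values) / 4 or 1.0
    q = [max(-4, min(3, round(v / scale))) for v in values]
    return q, scale

def dequantize_3bit(q, scale):
    return [level * scale for level in q]

group = [0.9, -1.6, 0.1, 3.2]
q, scale = quantize_3bit(group)     # q = [1, -2, 0, 3]
approx = dequantize_3bit(q, scale)  # roughly [0.8, -1.6, 0.0, 2.4]
```

Each cached value drops from 16 bits to 3 plus a small per-group scale, which is where a multiple-fold KV-cache reduction comes from.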
## Model Formats
| Format | Size | Platform |
|---|---|---|
| GGUF Q4_K_M | 4.8 GB | NVIDIA GPU, CPU |
| MLX 4-bit | 4.5 GB | Apple Silicon |
| Safetensors | 16 GB | Full precision |
### GGUF Usage
```python
from llama_cpp import Llama

model = Llama.from_pretrained(
    repo_id="efops/marziel-8b-custom",
    filename="marziel-v6-Q4_K_M.gguf",
    n_gpu_layers=-1,  # offload all layers to GPU
    n_ctx=4096,
)
output = model.create_chat_completion(
    messages=[{"role": "user", "content": "Hello!"}]
)
```
### MLX Usage

```bash
pip install mlx-lm
mlx_lm.generate --model efops/marziel-8b-custom-MLX --prompt "Hello!"
```
## OS API

| Method | Endpoint | Description |
|---|---|---|
| GET | `/os/status` | Kernel status |
| GET | `/os/memory` | Memory tiers |
| GET | `/os/ps` | Process list |
| GET | `/os/top` | Resource monitor |
| POST | `/os/recall` | Memory recall |
| POST | `/os/remember` | Store memory |
| POST | `/os/schedule` | Schedule tasks |
| POST | `/os/kill/:pid` | Kill process |
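A hypothetical client for the endpoints above. The base URL, port, and JSON shapes are assumptions, not documented behavior; check the `marziel serve` output for the real address:

```python
import json
import urllib.request

class MarzielClient:
    """Hypothetical wrapper for the OS API; base URL and payloads are guesses."""

    def __init__(self, base="http://localhost:8000"):
        self.base = base

    def _url(self, path):
        # All endpoints hang off the /os/ prefix, e.g. /os/status.
        return f"{self.base}/os/{path}"

    def status(self):
        # GET /os/status
        with urllib.request.urlopen(self._url("status")) as resp:
            return json.load(resp)

    def remember(self, payload):
        # POST /os/remember with a JSON body
        req = urllib.request.Request(
            self._url("remember"),
            data=json.dumps(payload).encode(),
            headers={"Content-Type": "application/json"},
            method="POST",
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)

client = MarzielClient()
print(client._url("kill/1234"))  # http://localhost:8000/os/kill/1234
```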
## Performance
- 52.9 tok/s on NVIDIA RTX A5000
- 75% KV cache memory savings
- 2.88-bit avg quantization
## License

MIT License. Built by Efe (Efkan Isazade).
## Base Model

mistralai/Ministral-8B-Instruct-2410