# FunctionGemma Infrastructure Tools v8
A fine-tuned FunctionGemma 270M model for infrastructure error diagnosis and remediation. It achieves 100% accuracy across the 7 supported infrastructure tools when supplied with the exact tool definitions provided below.
## Model Details
- Base Model: google/functiongemma-270m-it
- Format: LiteRT-LM (.litertlm) - optimized for on-device inference
- Quantization: INT8 (Q8)
- Size: ~271MB
- Training: 50 epochs on 10,500 examples (1,500 per tool)
## Supported Tools
| Tool | Description | Use Case |
|---|---|---|
| enableCors | Enable CORS for a specific origin | CORS policy errors, blocked cross-origin requests |
| updateConnectionUrl | Update service connection URL | ECONNREFUSED errors, localhost connection issues in containers |
| setEnvVar | Set environment variable | Missing configuration, undefined env vars |
| addHostMapping | Add hostname to IP mapping | DNS resolution (ENOTFOUND) errors |
| increaseMemory | Increase memory limit | OOMKilled errors, out of memory crashes |
| increaseTimeout | Increase timeout value | 504 Gateway Timeout, connection timeout errors |
| restartService | Restart a service | Stuck processes, stale data after deployment |
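Given a log line that matches one of these patterns, the model emits a single tool call with filled-in arguments. As an illustration (the argument values below are hypothetical and the exact response envelope depends on your runtime; see the dad-express examples later in this README), an ECONNREFUSED error would map to:

```typescript
// Input log line (from the sample training examples below):
//   "Error: connect ECONNREFUSED 127.0.0.1:5432 - database connection failed"
// Expected tool call from the model; argument values are illustrative.
const expectedCall = {
  name: "updateConnectionUrl",
  arguments: { service: "database", hostname: "db", port: 5432 },
};
```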
## Usage with LiteRT-LM
### Download the Model
```bash
# Using huggingface-cli
huggingface-cli download macmacmacmac/functiongemma-nextjs functiongemma-infra-v8_q8_ekv1024.litertlm
```

```python
# Or using Python
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="macmacmacmac/functiongemma-nextjs",
    filename="functiongemma-infra-v8_q8_ekv1024.litertlm"
)
```
### Required Tool Definitions
**Important:** You must use these exact tool definitions for optimal accuracy; the model was trained with these specific descriptions.
```javascript
const tools = [
{
type: "function",
function: {
name: "enableCors",
description: "Enable CORS for a specific origin to fix blocked cross-origin requests.",
parameters: {
type: "object",
properties: {
origin: { type: "string", description: "The origin to allow (e.g., http://localhost:3000)" },
methods: { type: "string", description: "Allowed HTTP methods (e.g., GET,POST,PUT,DELETE)" }
},
required: ["origin"]
}
}
},
{
type: "function",
function: {
name: "updateConnectionUrl",
description: "Update a service connection URL to fix ECONNREFUSED errors, typically changing localhost to the correct service hostname.",
parameters: {
type: "object",
properties: {
service: { type: "string", description: "The service to update (e.g., database, redis, api)" },
hostname: { type: "string", description: "The correct hostname to connect to" },
port: { type: "integer", description: "The port number to connect to" }
},
required: ["service", "hostname", "port"]
}
}
},
{
type: "function",
function: {
name: "setEnvVar",
description: "Set an environment variable to fix missing configuration errors.",
parameters: {
type: "object",
properties: {
name: { type: "string", description: "Environment variable name (e.g., DATABASE_URL, API_KEY)" },
value: { type: "string", description: "The value to set" }
},
required: ["name", "value"]
}
}
},
{
type: "function",
function: {
name: "addHostMapping",
description: "Add a hostname to IP mapping to fix DNS resolution (ENOTFOUND) errors.",
parameters: {
type: "object",
properties: {
hostname: { type: "string", description: "The hostname to map" },
ip: { type: "string", description: "The IP address to map to" }
},
required: ["hostname", "ip"]
}
}
},
{
type: "function",
function: {
name: "increaseMemory",
description: "Increase memory limit for a service to fix OOMKilled errors.",
parameters: {
type: "object",
properties: {
service: { type: "string", description: "The service/container/pod name" },
memoryMb: { type: "integer", description: "Memory limit in megabytes" }
},
required: ["service", "memoryMb"]
}
}
},
{
type: "function",
function: {
name: "increaseTimeout",
description: "Increase timeout value to fix 504 Gateway Timeout or connection timeout errors.",
parameters: {
type: "object",
properties: {
service: { type: "string", description: "The service to configure" },
timeoutMs: { type: "integer", description: "Timeout value in milliseconds" }
},
required: ["service", "timeoutMs"]
}
}
},
{
type: "function",
function: {
name: "restartService",
description: "Restart a service to apply configuration changes or fix a stuck process.",
parameters: {
type: "object",
properties: {
service: { type: "string", description: "The service/container/pod name to restart" }
},
required: ["service"]
}
}
}
];
```
### Example Usage with dad-express
```javascript
const { FunctionGemmaEngine } = require('dad-express');

const engine = new FunctionGemmaEngine({
  modelPath: './functiongemma-infra-v8_q8_ekv1024.litertlm',
  tools: JSON.stringify(tools)
});

// Diagnose an error
const result = await engine.call('Container api was OOMKilled - out of memory');
console.log(result.tool_calls[0].function);
// { name: 'increaseMemory', arguments: { service: 'api', memoryMb: 1024 } }
```
## Training Data
The model was trained on 10,500 synthetic examples covering common infrastructure errors:
| Error Pattern | Tool | Examples |
|---|---|---|
| CORS policy errors | enableCors | 1,500 |
| ECONNREFUSED errors | updateConnectionUrl | 1,500 |
| Missing env vars | setEnvVar | 1,500 |
| DNS/ENOTFOUND errors | addHostMapping | 1,500 |
| OOMKilled errors | increaseMemory | 1,500 |
| Timeout errors | increaseTimeout | 1,500 |
| Stuck services | restartService | 1,500 |
### Sample Training Examples
"CORS error: No 'Access-Control-Allow-Origin' header from http://localhost:3000" β enableCors
"Error: connect ECONNREFUSED 127.0.0.1:5432 - database connection failed" β updateConnectionUrl
"Missing required environment variable: DATABASE_URL" β setEnvVar
"getaddrinfo ENOTFOUND db" β addHostMapping
"Container api was OOMKilled" β increaseMemory
"504 Gateway Timeout from backend" β increaseTimeout
"nginx container is not responding" β restartService
## Fully Loaded Serving
Fully Loaded Serving is an end-to-end intelligent error remediation pipeline that runs entirely on-device. It combines:
- Low-latency vector embeddings (EmbeddingGemma) for streaming log classification
- Semantic clustering to group similar errors/issues/outliers
- Function calling (FunctionGemma) to automatically diagnose and fix infrastructure issues
- Prompt optimization via Ax with MiPRO for continuous improvement
### Architecture
```
Next.js Application: stdout/stderr ──▶ Log Stream ──▶ dad-express middleware
                 │
                 ▼
┌──────────────────────────────────┐
│ EmbeddingGemma (~5ms)            │
│ 768-dim vector per log line      │
└────────────────┬─────────────────┘
                 ▼
┌──────────────────────────────────┐
│ Semantic Clustering (cosine)     │
│  • Group similar errors          │
│  • Detect outliers               │
│  • Identify recurring patterns   │
└────────────────┬─────────────────┘
                 ▼
┌──────────────────────────────────┐
│ FunctionGemma (~50ms/call)       │
│  → enableCors, setEnvVar, etc.   │
└────────────────┬─────────────────┘
                 ▼
┌──────────────────────────────────┐
│ Auto-Remediation Layer           │
│ Execute fix or notify developer  │
└──────────────────────────────────┘

All model inference runs on-device via LiteRT-LM (~300MB RAM)
```
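In code, the pipeline above reduces to one async path per incoming log line. The sketch below is a condensed, illustrative version of the full hosting example later in this README; `assignToCluster` and `applyFix` are hypothetical placeholders for the clustering and remediation logic spelled out there.

```typescript
import { EmbeddingEngine, FunctionGemmaEngine } from "dad-express";

// Engines are constructed once at startup (see the full example below).
declare const embedEngine: EmbeddingEngine;
declare const functionGemma: FunctionGemmaEngine;
// Hypothetical helpers standing in for the clustering / remediation code shown later.
declare function assignToCluster(embedding: Float32Array): string;
declare function applyFix(name: string, args: Record<string, unknown>): void;

async function handleLogLine(line: string): Promise<void> {
  // Skip lines that don't look like errors
  if (!/error|fail|exception|timeout|refused|denied/i.test(line)) return;

  // 1. Embed the log line with EmbeddingGemma (~5ms)
  const embedding = await embedEngine.encodeAsync(line);

  // 2. Group it with similar recent errors via cosine similarity
  const clusterId = assignToCluster(embedding);

  // 3. Ask FunctionGemma (~50ms) for a structured tool call
  const result = await functionGemma.sendMessage(line);
  const call = result.functionCalls?.[0];

  // 4. Execute the fix (or notify a developer) for this cluster
  if (call) applyFix(call.name, call.arguments);
  console.log(`[pipeline] cluster=${clusterId} tool=${call?.name ?? "none"}`);
}
```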
### Ax Integration with MiPRO
Ax is a TypeScript DSPy-style framework for declarative AI programming. dad-express provides AxLiteRTProvider to run Ax signatures entirely on-device:
```typescript
import { AxGen } from "@ax-llm/ax";
import { AxLiteRTProvider, EmbeddingEngine, FunctionGemmaEngine } from "dad-express";
// Create on-device provider with both embedding and chat models
const provider = new AxLiteRTProvider({
chat: {
modelPath: "./models/functiongemma-infra-v8_q8_ekv1024.litertlm",
tools: infrastructureTools, // The 7 tools from this repo
},
embed: {
modelPath: "./models/embedding_gemma.tflite",
tokenizerPath: "./models/tokenizer.model",
},
});
// Define Ax signature for error diagnosis (MiPRO-optimizable)
const diagnoseError = new AxGen(`
errorMessage:string "The error log line",
errorCluster:string? "Similar errors seen recently"
->
diagnosis:string "Root cause analysis",
toolName:string "Which infrastructure tool to call",
confidence:class "high, medium, low"
`);
// Run inference on-device
const result = await diagnoseError.forward(provider, {
errorMessage: "CORS error from http://localhost:3000",
errorCluster: "3 similar CORS errors in last 5 minutes",
});
console.log(result);
// { diagnosis: "Frontend origin not in allowed list",
// toolName: "enableCors",
//   confidence: "high" }
```
### Example: Hosting Next.js with Fully Loaded Serving
```typescript
// server.ts - Next.js with intelligent error remediation
import { createApp, FunctionGemmaEngine, EmbeddingEngine } from "dad-express";
import { spawn } from "child_process";
// Infrastructure tools (exact definitions for 100% accuracy)
const tools = [
{ type: "function", function: { name: "enableCors", description: "Enable CORS for a specific origin to fix blocked cross-origin requests.", parameters: { type: "object", properties: { origin: { type: "string", description: "The origin to allow" } }, required: ["origin"] } } },
{ type: "function", function: { name: "updateConnectionUrl", description: "Update a service connection URL to fix ECONNREFUSED errors.", parameters: { type: "object", properties: { service: { type: "string" }, hostname: { type: "string" }, port: { type: "integer" } }, required: ["service", "hostname", "port"] } } },
{ type: "function", function: { name: "setEnvVar", description: "Set an environment variable to fix missing configuration errors.", parameters: { type: "object", properties: { name: { type: "string" }, value: { type: "string" } }, required: ["name", "value"] } } },
{ type: "function", function: { name: "addHostMapping", description: "Add a hostname to IP mapping to fix DNS resolution errors.", parameters: { type: "object", properties: { hostname: { type: "string" }, ip: { type: "string" } }, required: ["hostname", "ip"] } } },
{ type: "function", function: { name: "increaseMemory", description: "Increase memory limit for a service to fix OOMKilled errors.", parameters: { type: "object", properties: { service: { type: "string" }, memoryMb: { type: "integer" } }, required: ["service", "memoryMb"] } } },
{ type: "function", function: { name: "increaseTimeout", description: "Increase timeout value to fix 504 Gateway Timeout errors.", parameters: { type: "object", properties: { service: { type: "string" }, timeoutMs: { type: "integer" } }, required: ["service", "timeoutMs"] } } },
{ type: "function", function: { name: "restartService", description: "Restart a service to apply changes or fix stuck processes.", parameters: { type: "object", properties: { service: { type: "string" } }, required: ["service"] } } },
];
// Initialize on-device models
const embedEngine = new EmbeddingEngine({
modelPath: "./models/embedding_gemma.tflite",
tokenizerPath: "./models/tokenizer.model",
});
const functionGemma = new FunctionGemmaEngine({
modelPath: "./models/functiongemma-infra-v8_q8_ekv1024.litertlm",
tools: JSON.stringify(tools),
});
// Error clustering state
const errorClusters = new Map<string, { embedding: Float32Array; count: number; lastSeen: Date }>();
async function classifyAndCluster(logLine: string): Promise<string | null> {
// Skip non-error lines
if (!logLine.match(/error|fail|exception|timeout|refused|denied/i)) {
return null;
}
// Generate embedding (~5ms on CPU)
const embedding = await embedEngine.encodeAsync(logLine);
// Find similar errors via cosine similarity
let bestMatch: string | null = null;
let bestSimilarity = 0.85; // Threshold for clustering
for (const [clusterId, cluster] of errorClusters) {
const similarity = EmbeddingEngine.cosineSimilarity(embedding, cluster.embedding);
if (similarity > bestSimilarity) {
bestSimilarity = similarity;
bestMatch = clusterId;
}
}
if (bestMatch) {
// Update existing cluster
const cluster = errorClusters.get(bestMatch)!;
cluster.count++;
cluster.lastSeen = new Date();
return bestMatch;
}
// Create new cluster
const clusterId = `cluster_${Date.now()}`;
errorClusters.set(clusterId, { embedding, count: 1, lastSeen: new Date() });
return clusterId;
}
async function diagnoseAndFix(errorLog: string, clusterId: string): Promise<void> {
const cluster = errorClusters.get(clusterId);
// Call FunctionGemma for diagnosis (~50ms)
const result = await functionGemma.sendMessage(errorLog);
if (result.functionCalls && result.functionCalls.length > 0) {
const call = result.functionCalls[0];
console.log(`[AutoFix] Detected ${cluster?.count || 1}x: ${call.name}`);
console.log(`[AutoFix] Args: ${JSON.stringify(call.arguments)}`);
// Execute remediation (in production, this would call actual infrastructure APIs)
switch (call.name) {
case "enableCors":
console.log(`[AutoFix] Would enable CORS for: ${call.arguments.origin}`);
break;
case "restartService":
console.log(`[AutoFix] Would restart: ${call.arguments.service}`);
break;
case "increaseMemory":
console.log(`[AutoFix] Would increase memory for ${call.arguments.service} to ${call.arguments.memoryMb}MB`);
break;
// ... handle other tools
}
}
}
// Create dad-express app
const app = createApp();
// API routes
app.get("/health", () => ({ status: "ok", models: { embed: true, functionGemma: true } }));
app.get("/clusters", () => {
const clusters = [];
for (const [id, cluster] of errorClusters) {
clusters.push({ id, count: cluster.count, lastSeen: cluster.lastSeen });
}
return clusters;
});
// Start Next.js as child process with log monitoring
const nextProcess = spawn("npx", ["next", "start"], {
stdio: ["inherit", "pipe", "pipe"],
env: { ...process.env, PORT: "3001" },
});
// Stream stdout
nextProcess.stdout.on("data", (data) => {
const line = data.toString().trim();
console.log(`[next] ${line}`);
});
// Stream stderr with intelligent processing
nextProcess.stderr.on("data", async (data) => {
const line = data.toString().trim();
console.log(`[next:err] ${line}`);
// Classify and cluster error
const clusterId = await classifyAndCluster(line);
if (clusterId) {
// Diagnose and auto-fix
await diagnoseAndFix(line, clusterId);
}
});
// Start dad-express on separate port for monitoring
app.listen(4000, () => {
console.log("dad-express monitoring on http://localhost:4000");
console.log("Next.js app on http://localhost:3001");
});
```
### Key Benefits
| Feature | Latency | Memory | Cloud Calls |
|---|---|---|---|
| EmbeddingGemma | ~5ms/embed | ~50MB | 0 |
| FunctionGemma | ~50ms/call | ~271MB | 0 |
| Semantic clustering | <1ms | Varies | 0 |
| Total pipeline | ~60ms | ~350MB | 0 |
- Zero cloud dependency: All inference runs locally via LiteRT-LM
- Sub-100ms latency: Fast enough for real-time log processing
- Privacy-preserving: Error logs never leave the device
- Continuous improvement: Use Ax MiPRO to optimize prompts over time
## Limitations
- Optimized for the 7 specific infrastructure tools listed above
- Requires exact tool definitions for best accuracy
- May not generalize well to error patterns not seen in training
## License
This model inherits the Gemma license from the base model.