# FunctionGemma Infrastructure Tools v8
A fine-tuned FunctionGemma 270M model for infrastructure error diagnosis and remediation. It achieves 100% accuracy across the 7 supported infrastructure tools when supplied with the exact tool definitions provided below.
## Model Details
- Base Model: google/functiongemma-270m-it
- Format: LiteRT-LM (.litertlm) - optimized for on-device inference
- Quantization: INT8 (Q8)
- Size: ~271MB
- Training: 50 epochs on 10,500 examples (1,500 per tool)
## Supported Tools
| Tool | Description | Use Case |
|---|---|---|
| enableCors | Enable CORS for a specific origin | CORS policy errors, blocked cross-origin requests |
| updateConnectionUrl | Update service connection URL | ECONNREFUSED errors, localhost connection issues in containers |
| setEnvVar | Set environment variable | Missing configuration, undefined env vars |
| addHostMapping | Add hostname to IP mapping | DNS resolution (ENOTFOUND) errors |
| increaseMemory | Increase memory limit | OOMKilled errors, out of memory crashes |
| increaseTimeout | Increase timeout value | 504 Gateway Timeout, connection timeout errors |
| restartService | Restart a service | Stuck processes, stale data after deployment |
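Given a log line that matches one of these patterns, the model emits a single tool call with filled-in arguments. As an illustration (the argument values below are hypothetical and the exact response envelope depends on your runtime; see the dad-express examples later in this README), an ECONNREFUSED error would map to:

```typescript
// Input log line (from the sample training examples below):
//   "Error: connect ECONNREFUSED 127.0.0.1:5432 - database connection failed"
// Expected tool call from the model; argument values are illustrative.
const expectedCall = {
  name: "updateConnectionUrl",
  arguments: { service: "database", hostname: "db", port: 5432 },
};
```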
## Usage with LiteRT-LM
### Download the Model
```bash
# Using huggingface-cli
huggingface-cli download macmacmacmac/functiongemma-nextjs functiongemma-infra-v8_q8_ekv1024.litertlm
```

```python
# Or using Python
from huggingface_hub import hf_hub_download

model_path = hf_hub_download(
    repo_id="macmacmacmac/functiongemma-nextjs",
    filename="functiongemma-infra-v8_q8_ekv1024.litertlm"
)
```
### Required Tool Definitions
**Important:** You must use these exact tool definitions for optimal accuracy; the model was trained with these specific descriptions.
```javascript
const tools = [
{
type: "function",
function: {
name: "enableCors",
description: "Enable CORS for a specific origin to fix blocked cross-origin requests.",
parameters: {
type: "object",
properties: {
origin: { type: "string", description: "The origin to allow (e.g., http://localhost:3000)" },
methods: { type: "string", description: "Allowed HTTP methods (e.g., GET,POST,PUT,DELETE)" }
},
required: ["origin"]
}
}
},
{
type: "function",
function: {
name: "updateConnectionUrl",
description: "Update a service connection URL to fix ECONNREFUSED errors, typically changing localhost to the correct service hostname.",
parameters: {
type: "object",
properties: {
service: { type: "string", description: "The service to update (e.g., database, redis, api)" },
hostname: { type: "string", description: "The correct hostname to connect to" },
port: { type: "integer", description: "The port number to connect to" }
},
required: ["service", "hostname", "port"]
}
}
},
{
type: "function",
function: {
name: "setEnvVar",
description: "Set an environment variable to fix missing configuration errors.",
parameters: {
type: "object",
properties: {
name: { type: "string", description: "Environment variable name (e.g., DATABASE_URL, API_KEY)" },
value: { type: "string", description: "The value to set" }
},
required: ["name", "value"]
}
}
},
{
type: "function",
function: {
name: "addHostMapping",
description: "Add a hostname to IP mapping to fix DNS resolution (ENOTFOUND) errors.",
parameters: {
type: "object",
properties: {
hostname: { type: "string", description: "The hostname to map" },
ip: { type: "string", description: "The IP address to map to" }
},
required: ["hostname", "ip"]
}
}
},
{
type: "function",
function: {
name: "increaseMemory",
description: "Increase memory limit for a service to fix OOMKilled errors.",
parameters: {
type: "object",
properties: {
service: { type: "string", description: "The service/container/pod name" },
memoryMb: { type: "integer", description: "Memory limit in megabytes" }
},
required: ["service", "memoryMb"]
}
}
},
{
type: "function",
function: {
name: "increaseTimeout",
description: "Increase timeout value to fix 504 Gateway Timeout or connection timeout errors.",
parameters: {
type: "object",
properties: {
service: { type: "string", description: "The service to configure" },
timeoutMs: { type: "integer", description: "Timeout value in milliseconds" }
},
required: ["service", "timeoutMs"]
}
}
},
{
type: "function",
function: {
name: "restartService",
description: "Restart a service to apply configuration changes or fix a stuck process.",
parameters: {
type: "object",
properties: {
service: { type: "string", description: "The service/container/pod name to restart" }
},
required: ["service"]
}
}
}
];
```
### Example Usage with dad-express
```javascript
const { FunctionGemmaEngine } = require('dad-express');

const engine = new FunctionGemmaEngine({
  modelPath: './functiongemma-infra-v8_q8_ekv1024.litertlm',
  tools: JSON.stringify(tools)
});

// Diagnose an error
const result = await engine.call('Container api was OOMKilled - out of memory');
console.log(result.tool_calls[0].function);
// { name: 'increaseMemory', arguments: { service: 'api', memoryMb: 1024 } }
```
## Training Data
The model was trained on 10,500 synthetic examples covering common infrastructure errors:
| Error Pattern | Tool | Examples |
|---|---|---|
| CORS policy errors | enableCors | 1,500 |
| ECONNREFUSED errors | updateConnectionUrl | 1,500 |
| Missing env vars | setEnvVar | 1,500 |
| DNS/ENOTFOUND errors | addHostMapping | 1,500 |
| OOMKilled errors | increaseMemory | 1,500 |
| Timeout errors | increaseTimeout | 1,500 |
| Stuck services | restartService | 1,500 |
### Sample Training Examples
"CORS error: No 'Access-Control-Allow-Origin' header from http://localhost:3000" β enableCors
"Error: connect ECONNREFUSED 127.0.0.1:5432 - database connection failed" β updateConnectionUrl
"Missing required environment variable: DATABASE_URL" β setEnvVar
"getaddrinfo ENOTFOUND db" β addHostMapping
"Container api was OOMKilled" β increaseMemory
"504 Gateway Timeout from backend" β increaseTimeout
"nginx container is not responding" β restartService
## Fully Loaded Serving
Fully Loaded Serving is an end-to-end intelligent error remediation pipeline that runs entirely on-device. It combines:
- Low-latency vector embeddings (EmbeddingGemma) for streaming log classification
- Semantic clustering to group similar errors/issues/outliers
- Function calling (FunctionGemma) to automatically diagnose and fix infrastructure issues
- Prompt optimization via Ax with MiPRO for continuous improvement
### Architecture
```
Next.js Application: stdout/stderr ──▶ Log Stream ──▶ dad-express middleware
                 │
                 ▼
┌──────────────────────────────────┐
│ EmbeddingGemma (~5ms)            │
│ 768-dim vector per log line      │
└────────────────┬─────────────────┘
                 ▼
┌──────────────────────────────────┐
│ Semantic Clustering (cosine)     │
│  • Group similar errors          │
│  • Detect outliers               │
│  • Identify recurring patterns   │
└────────────────┬─────────────────┘
                 ▼
┌──────────────────────────────────┐
│ FunctionGemma (~50ms/call)       │
│  → enableCors, setEnvVar, etc.   │
└────────────────┬─────────────────┘
                 ▼
┌──────────────────────────────────┐
│ Auto-Remediation Layer           │
│ Execute fix or notify developer  │
└──────────────────────────────────┘

All model inference runs on-device via LiteRT-LM (~300MB RAM)
```
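In code, the pipeline above reduces to one async path per incoming log line. The sketch below is a condensed, illustrative version of the full hosting example later in this README; `assignToCluster` and `applyFix` are hypothetical placeholders for the clustering and remediation logic spelled out there.

```typescript
import { EmbeddingEngine, FunctionGemmaEngine } from "dad-express";

// Engines are constructed once at startup (see the full example below).
declare const embedEngine: EmbeddingEngine;
declare const functionGemma: FunctionGemmaEngine;
// Hypothetical helpers standing in for the clustering / remediation code shown later.
declare function assignToCluster(embedding: Float32Array): string;
declare function applyFix(name: string, args: Record<string, unknown>): void;

async function handleLogLine(line: string): Promise<void> {
  // Skip lines that don't look like errors
  if (!/error|fail|exception|timeout|refused|denied/i.test(line)) return;

  // 1. Embed the log line with EmbeddingGemma (~5ms)
  const embedding = await embedEngine.encodeAsync(line);

  // 2. Group it with similar recent errors via cosine similarity
  const clusterId = assignToCluster(embedding);

  // 3. Ask FunctionGemma (~50ms) for a structured tool call
  const result = await functionGemma.sendMessage(line);
  const call = result.functionCalls?.[0];

  // 4. Execute the fix (or notify a developer) for this cluster
  if (call) applyFix(call.name, call.arguments);
  console.log(`[pipeline] cluster=${clusterId} tool=${call?.name ?? "none"}`);
}
```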
### Ax Integration with MiPRO
Ax is a TypeScript DSPy-style framework for declarative AI programming. dad-express provides AxLiteRTProvider to run Ax signatures entirely on-device:
```typescript
import { AxGen } from "@ax-llm/ax";
import { AxLiteRTProvider, EmbeddingEngine, FunctionGemmaEngine } from "dad-express";
// Create on-device provider with both embedding and chat models
const provider = new AxLiteRTProvider({
chat: {
modelPath: "./models/functiongemma-infra-v8_q8_ekv1024.litertlm",
tools: infrastructureTools, // The 7 tools from this repo
},
embed: {
modelPath: "./models/embedding_gemma.tflite",
tokenizerPath: "./models/tokenizer.model",
},
});
// Define Ax signature for error diagnosis (MiPRO-optimizable)
const diagnoseError = new AxGen(`
errorMessage:string "The error log line",
errorCluster:string? "Similar errors seen recently"
->
diagnosis:string "Root cause analysis",
toolName:string "Which infrastructure tool to call",
confidence:class "high, medium, low"
`);
// Run inference on-device
const result = await diagnoseError.forward(provider, {
errorMessage: "CORS error from http://localhost:3000",
errorCluster: "3 similar CORS errors in last 5 minutes",
});
console.log(result);
// { diagnosis: "Frontend origin not in allowed list",
// toolName: "enableCors",
//   confidence: "high" }
```
### Example: Hosting Next.js with Fully Loaded Serving
```typescript
// server.ts - Next.js with intelligent error remediation
import { createApp, FunctionGemmaEngine, EmbeddingEngine } from "dad-express";
import { spawn } from "child_process";
// Infrastructure tools (exact definitions for 100% accuracy)
const tools = [
{ type: "function", function: { name: "enableCors", description: "Enable CORS for a specific origin to fix blocked cross-origin requests.", parameters: { type: "object", properties: { origin: { type: "string", description: "The origin to allow" } }, required: ["origin"] } } },
{ type: "function", function: { name: "updateConnectionUrl", description: "Update a service connection URL to fix ECONNREFUSED errors.", parameters: { type: "object", properties: { service: { type: "string" }, hostname: { type: "string" }, port: { type: "integer" } }, required: ["service", "hostname", "port"] } } },
{ type: "function", function: { name: "setEnvVar", description: "Set an environment variable to fix missing configuration errors.", parameters: { type: "object", properties: { name: { type: "string" }, value: { type: "string" } }, required: ["name", "value"] } } },
{ type: "function", function: { name: "addHostMapping", description: "Add a hostname to IP mapping to fix DNS resolution errors.", parameters: { type: "object", properties: { hostname: { type: "string" }, ip: { type: "string" } }, required: ["hostname", "ip"] } } },
{ type: "function", function: { name: "increaseMemory", description: "Increase memory limit for a service to fix OOMKilled errors.", parameters: { type: "object", properties: { service: { type: "string" }, memoryMb: { type: "integer" } }, required: ["service", "memoryMb"] } } },
{ type: "function", function: { name: "increaseTimeout", description: "Increase timeout value to fix 504 Gateway Timeout errors.", parameters: { type: "object", properties: { service: { type: "string" }, timeoutMs: { type: "integer" } }, required: ["service", "timeoutMs"] } } },
{ type: "function", function: { name: "restartService", description: "Restart a service to apply changes or fix stuck processes.", parameters: { type: "object", properties: { service: { type: "string" } }, required: ["service"] } } },
];
// Initialize on-device models
const embedEngine = new EmbeddingEngine({
modelPath: "./models/embedding_gemma.tflite",
tokenizerPath: "./models/tokenizer.model",
});
const functionGemma = new FunctionGemmaEngine({
modelPath: "./models/functiongemma-infra-v8_q8_ekv1024.litertlm",
tools: JSON.stringify(tools),
});
// Error clustering state
const errorClusters = new Map<string, { embedding: Float32Array; count: number; lastSeen: Date }>();
async function classifyAndCluster(logLine: string): Promise<string | null> {
// Skip non-error lines
if (!logLine.match(/error|fail|exception|timeout|refused|denied/i)) {
return null;
}
// Generate embedding (~5ms on CPU)
const embedding = await embedEngine.encodeAsync(logLine);
// Find similar errors via cosine similarity
let bestMatch: string | null = null;
let bestSimilarity = 0.85; // Threshold for clustering
for (const [clusterId, cluster] of errorClusters) {
const similarity = EmbeddingEngine.cosineSimilarity(embedding, cluster.embedding);
if (similarity > bestSimilarity) {
bestSimilarity = similarity;
bestMatch = clusterId;
}
}
if (bestMatch) {
// Update existing cluster
const cluster = errorClusters.get(bestMatch)!;
cluster.count++;
cluster.lastSeen = new Date();
return bestMatch;
}
// Create new cluster
const clusterId = `cluster_${Date.now()}`;
errorClusters.set(clusterId, { embedding, count: 1, lastSeen: new Date() });
return clusterId;
}
async function diagnoseAndFix(errorLog: string, clusterId: string): Promise<void> {
const cluster = errorClusters.get(clusterId);
// Call FunctionGemma for diagnosis (~50ms)
const result = await functionGemma.sendMessage(errorLog);
if (result.functionCalls && result.functionCalls.length > 0) {
const call = result.functionCalls[0];
console.log(`[AutoFix] Detected ${cluster?.count || 1}x: ${call.name}`);
console.log(`[AutoFix] Args: ${JSON.stringify(call.arguments)}`);
// Execute remediation (in production, this would call actual infrastructure APIs)
switch (call.name) {
case "enableCors":
console.log(`[AutoFix] Would enable CORS for: ${call.arguments.origin}`);
break;
case "restartService":
console.log(`[AutoFix] Would restart: ${call.arguments.service}`);
break;
case "increaseMemory":
console.log(`[AutoFix] Would increase memory for ${call.arguments.service} to ${call.arguments.memoryMb}MB`);
break;
// ... handle other tools
}
}
}
// Create dad-express app
const app = createApp();
// API routes
app.get("/health", () => ({ status: "ok", models: { embed: true, functionGemma: true } }));
app.get("/clusters", () => {
const clusters = [];
for (const [id, cluster] of errorClusters) {
clusters.push({ id, count: cluster.count, lastSeen: cluster.lastSeen });
}
return clusters;
});
// Start Next.js as child process with log monitoring
const nextProcess = spawn("npx", ["next", "start"], {
stdio: ["inherit", "pipe", "pipe"],
env: { ...process.env, PORT: "3001" },
});
// Stream stdout
nextProcess.stdout.on("data", (data) => {
const line = data.toString().trim();
console.log(`[next] ${line}`);
});
// Stream stderr with intelligent processing
nextProcess.stderr.on("data", async (data) => {
const line = data.toString().trim();
console.log(`[next:err] ${line}`);
// Classify and cluster error
const clusterId = await classifyAndCluster(line);
if (clusterId) {
// Diagnose and auto-fix
await diagnoseAndFix(line, clusterId);
}
});
// Start dad-express on separate port for monitoring
app.listen(4000, () => {
console.log("dad-express monitoring on http://localhost:4000");
console.log("Next.js app on http://localhost:3001");
});
```
### Key Benefits
| Feature | Latency | Memory | Cloud Calls |
|---|---|---|---|
| EmbeddingGemma | ~5ms/embed | ~50MB | 0 |
| FunctionGemma | ~50ms/call | ~271MB | 0 |
| Semantic clustering | <1ms | Varies | 0 |
| Total pipeline | ~60ms | ~350MB | 0 |
- Zero cloud dependency: All inference runs locally via LiteRT-LM
- Sub-100ms latency: Fast enough for real-time log processing
- Privacy-preserving: Error logs never leave the device
- Continuous improvement: Use Ax MiPRO to optimize prompts over time
## Limitations
- Optimized for the 7 specific infrastructure tools listed above
- Requires exact tool definitions for best accuracy
- May not generalize well to error patterns not seen in training
## License
This model inherits the Gemma license from the base model.