AI Copilot
Talk to your cluster in natural language. Press ⌘J to open.
How it Works
The copilot sends your question to a configured LLM provider along with tool definitions that map to KubeBolt’s REST API. The LLM calls tools to fetch live cluster data, analyzes the results, and responds with data-backed answers and kubectl commands.
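The loop above can be sketched as follows. This is an illustrative sketch only: the function names, message shapes, and stubbed LLM/tool responses are assumptions, not KubeBolt's actual API.

```typescript
// Minimal sketch of the copilot tool-calling loop; all names and
// payload shapes are illustrative assumptions, not KubeBolt's real code.
type ToolCall = { name: string; args: Record<string, unknown> };
type Reply = { text?: string; toolCalls: ToolCall[] };
type Message = { role: string; content: string };

// Stub LLM: first asks for cluster data, then answers from the tool result.
function fakeLLM(messages: Message[]): Reply {
  const sawToolResult = messages.some((m) => m.role === "tool");
  return sawToolResult
    ? { text: "3 pods are CrashLooping in namespace 'web'.", toolCalls: [] }
    : { toolCalls: [{ name: "list_resources", args: { kind: "Pod" } }] };
}

// Stub executor standing in for the REST API each tool maps to.
function runTool(call: ToolCall): string {
  return JSON.stringify({ tool: call.name, pods: 3, status: "CrashLoopBackOff" });
}

function answer(question: string): string {
  const messages: Message[] = [{ role: "user", content: question }];
  for (let round = 0; round < 8; round++) { // hard cap on tool-call rounds
    const reply = fakeLLM(messages);
    if (reply.toolCalls.length === 0) return reply.text ?? "";
    for (const call of reply.toolCalls) {
      messages.push({ role: "tool", content: runTool(call) });
    }
  }
  return "tool-call budget exhausted";
}

console.log(answer("Why are my pods restarting?"));
```

The key property is that the model never sees the cluster directly; every fact in its answer arrives through a tool result appended to the transcript.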
16 Native Tools
| Tool | What it does |
|---|---|
| get_cluster_overview | Resource counts, CPU/memory, health score, events |
| list_resources | List any of 23 resource types with filtering |
| get_resource_detail | Full detail of a specific resource |
| get_resource_yaml | Raw YAML definition (secrets redacted) |
| get_resource_describe | kubectl-describe output for deep troubleshooting |
| get_pod_logs | Pod logs with container, tail, since, and grep options |
| get_workload_pods | Pods owned by a workload controller |
| get_workload_history | Revision history for Deployments/StatefulSets/DaemonSets |
| get_cronjob_jobs | Job children of a CronJob to investigate execution history |
| get_topology | Full cluster topology graph |
| get_insights | Active insights with severity |
| get_events | Events with filtering |
| search_resources | Global search by name across 16 resource types |
| get_permissions | Detected RBAC permissions |
| list_clusters | All available kubeconfig contexts |
| get_kubebolt_docs | Product knowledge base (features, navigation, admin pages) |
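For orientation, here is what one of these tool definitions might look like when sent to the provider, in the JSON-schema style used by Anthropic/OpenAI function calling. The parameter names and descriptions are assumptions based on the table row for get_pod_logs, not KubeBolt's actual schema.

```typescript
// Illustrative tool definition (hypothetical schema, not KubeBolt's real one).
const getPodLogsTool = {
  name: "get_pod_logs",
  description:
    "Fetch logs for a pod, optionally filtered by container, tail count, time window, or grep pattern.",
  input_schema: {
    type: "object",
    properties: {
      namespace: { type: "string" },
      pod: { type: "string" },
      container: { type: "string" },
      tail: { type: "number" },
      since: { type: "string", description: "relative window, e.g. '15m'" },
      grep: { type: "string" },
    },
    required: ["namespace", "pod"],
  },
};

console.log(getPodLogsTool.name);
```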
Supported Providers
- Anthropic — Claude Sonnet 4.6, Claude Opus 4.6/4.7, Claude Haiku 4.5. Prompt caching on system prompt + tool definitions.
- OpenAI — GPT-5, GPT-5 Mini, GPT-4o, GPT-4o Mini. Automatic prompt caching; max_completion_tokens for reasoning models.
- Self-hosted — Ollama, vLLM, Groq, DeepSeek (OpenAI-compatible).
BYO Key: You bring your own API key. KubeBolt never stores or transmits your key to any server other than the provider you configure. For production, use the API proxy mode to keep keys server-side.
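The direct-vs-proxy distinction can be captured in a small config type. This is a sketch under the assumption that proxy mode routes requests through a server-side endpoint; the field and endpoint names are hypothetical.

```typescript
// Hypothetical config shape: in "direct" mode the browser holds the key
// and talks to the provider; in "proxy" mode the key stays server-side.
type CopilotConfig =
  | {
      mode: "direct";
      provider: "anthropic" | "openai" | "self-hosted";
      apiKey: string;     // never persisted or sent anywhere but the provider
      baseUrl?: string;   // needed for self-hosted OpenAI-compatible endpoints
    }
  | { mode: "proxy"; proxyUrl: string }; // recommended for production

const prodConfig: CopilotConfig = { mode: "proxy", proxyUrl: "/api/copilot" };
console.log(prodConfig.mode);
```

The discriminated union makes it a type error to supply an apiKey in proxy mode, which is the whole point of that mode.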
Contextual “Ask Copilot”
Launches the Copilot panel with a pre-loaded prompt that already carries the cluster, namespace, resource name, and symptom. Three surfaces:
- Insights — “Diagnose this insight and recommend a fix”
- Resource Detail (Pods, Deployments, StatefulSets, Services, Nodes) — “Investigate this resource”
- Events — button on every Warning row, “Explain this Kubernetes Warning event”
Templates live in services/copilot/triggers.ts and are versioned so the LLM never sees stale framing.
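A trigger template might look like the sketch below, in the spirit of services/copilot/triggers.ts. The interface, version tag, and wording are assumptions; only the general idea (a versioned prompt carrying cluster, namespace, resource, and symptom) comes from the text above.

```typescript
// Hypothetical versioned trigger template; not the actual triggers.ts shape.
interface TriggerContext {
  cluster: string;
  namespace: string;
  kind: string;
  name: string;
  symptom?: string;
}

const TEMPLATE_VERSION = 3; // bumped whenever the framing changes

function insightPrompt(ctx: TriggerContext): string {
  const symptom = ctx.symptom ? `, symptom: ${ctx.symptom}` : "";
  return (
    `[trigger:v${TEMPLATE_VERSION}] Diagnose this insight and recommend a fix. ` +
    `Cluster: ${ctx.cluster}, namespace: ${ctx.namespace}, ` +
    `resource: ${ctx.kind}/${ctx.name}${symptom}`
  );
}

console.log(
  insightPrompt({ cluster: "prod", namespace: "web", kind: "Pod", name: "api-0", symptom: "OOMKilled" })
);
```

Embedding the version in the prompt itself is one way to guarantee the LLM never sees stale framing: old transcripts are self-identifying.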
Conversation memory
Long sessions stop bleeding context. When the estimated conversation size crosses SESSION_BUDGET_TOKENS × AUTO_COMPACT_THRESHOLD (default 80%), the handler folds older turns into a summary generated by the provider’s cheap-tier model (Haiku 4.5 / gpt-4o-mini) and stubs bulky tool_results in the preserved tail. The active turn’s tool_results are always protected so mid-flight compacts never truncate a response.
A Scissors icon in the panel header exposes the same primitive on demand — “new session with summary” collapses the whole transcript into a single summary message so you can pivot topics without losing context.
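The trigger condition and fold can be sketched as follows. The chars/4 token estimate, constant values, and function names are illustrative assumptions; only the budget × threshold trigger and the preserved tail come from the description above.

```typescript
// Sketch of auto-compaction; constants mirror the env vars documented
// below, but the estimation and fold logic here are illustrative.
const SESSION_BUDGET_TOKENS = 200_000; // default: model context window
const AUTO_COMPACT_THRESHOLD = 0.8;
const PRESERVE_TURNS = 3;

type Turn = { role: "user" | "assistant" | "tool"; content: string };

// Crude token estimate (~4 chars/token), a common assumption.
const estimateTokens = (turns: Turn[]) =>
  Math.ceil(turns.reduce((n, t) => n + t.content.length, 0) / 4);

function maybeCompact(turns: Turn[], summarize: (old: Turn[]) => string): Turn[] {
  if (estimateTokens(turns) < SESSION_BUDGET_TOKENS * AUTO_COMPACT_THRESHOLD) {
    return turns; // under budget: nothing to do
  }
  const cut = Math.max(turns.length - PRESERVE_TURNS, 0);
  // Older turns fold into one summary turn (cheap-tier model in the real
  // handler); the tail, including the active turn, is kept intact.
  const summary: Turn = { role: "assistant", content: summarize(turns.slice(0, cut)) };
  return [summary, ...turns.slice(cut)];
}
```

The “new session with summary” action is the degenerate case of the same fold with an empty preserved tail.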
| Env var | Default | Purpose |
|---|---|---|
| KUBEBOLT_AI_AUTO_COMPACT | true | Master switch for auto-compaction |
| KUBEBOLT_AI_SESSION_BUDGET_TOKENS | context window of the model | Total ceiling; trigger fires at budget × threshold |
| KUBEBOLT_AI_AUTO_COMPACT_THRESHOLD | 0.80 | Fraction of the budget at which compaction fires |
| KUBEBOLT_AI_COMPACT_MODEL | auto (cheap tier of same provider) | Override the summarisation model |
| KUBEBOLT_AI_COMPACT_PRESERVE_TURNS | 3 | Turns kept intact after compaction |
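Resolving these settings with the table’s defaults might look like this sketch; the parsing rules (e.g. treating any value other than "false" as true) are assumptions, not documented behaviour.

```typescript
// Illustrative resolution of the compaction env vars with the defaults
// from the table above; parsing details are assumptions.
const env = process.env;

const autoCompact = (env.KUBEBOLT_AI_AUTO_COMPACT ?? "true") !== "false";
const threshold = Number(env.KUBEBOLT_AI_AUTO_COMPACT_THRESHOLD ?? "0.80");
const preserveTurns = Number(env.KUBEBOLT_AI_COMPACT_PRESERVE_TURNS ?? "3");
const compactModel = env.KUBEBOLT_AI_COMPACT_MODEL ?? "auto";
// Budget has no static default: when unset it falls back to the
// context window of whichever model the session is using.
const budgetOverride = env.KUBEBOLT_AI_SESSION_BUDGET_TOKENS
  ? Number(env.KUBEBOLT_AI_SESSION_BUDGET_TOKENS)
  : undefined;

console.log({ autoCompact, threshold, preserveTurns, compactModel, budgetOverride });
```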
Scope guardrail
The system prompt defines in-scope topics (Kubernetes operations, DevOps/SRE topics that support the user’s cluster, KubeBolt itself) and out-of-scope topics (general coding unrelated to cluster resources, non-technical topics, competitor cloud products). The LLM refuses out-of-scope questions with a one-sentence polite redirect in the user’s language; it never answers partially.
Admin Copilot Usage
At /admin/copilot-usage: sessions, tokens billed, cache hit rate, estimated USD cost (best-effort pricing table for Anthropic and OpenAI), top tools with error rates, and a per-session drill-down modal with tool breakdown and compact events. Range selector 24h / 7d / 30d. Stored locally in BoltDB with a 30-day / 5000-entry retention cap. Requires authentication to be enabled.
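The dual retention cap can be expressed as a small prune step. This is a sketch only: the entry shape and the order in which the age and count caps are applied are assumptions, and the BoltDB storage layer is omitted entirely.

```typescript
// Illustrative prune for the 30-day / 5000-entry retention cap;
// entry shape and cap ordering are assumptions.
type UsageEntry = { ts: number; tokens: number };

const MAX_AGE_MS = 30 * 24 * 3600 * 1000; // 30 days
const MAX_ENTRIES = 5000;

function prune(entries: UsageEntry[], now: number): UsageEntry[] {
  const fresh = entries.filter((e) => now - e.ts <= MAX_AGE_MS); // age cap
  fresh.sort((a, b) => b.ts - a.ts);                             // newest first
  return fresh.slice(0, MAX_ENTRIES);                            // entry cap
}
```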