Skip to content

Release Notes — 2.4.0

Released: 2026-05-13

Kure Monitor 2.4.0 introduces the AI Advice tab — a new dashboard view that proactively detects architectural mismatches in your cluster across two layers of analysis. Twenty-three detectors ship out of the box, covering scaling, reliability, networking, data, config, capacity, scheduling, supply-chain, and startup categories.

This release also renames the PostgreSQL resources with the kure-monitor- prefix, consolidates the three-role system (admin / write / read) down to two (admin / member), and tightens the agent’s WebSocket auth and reconnect behaviour. No API-level breaking changes.

Operator action: if upgrading from 2.3.x, helm upgrade handles the PostgreSQL rename for you. The old PVC is orphaned (rebuild-empty migration plan — no data carries over). Existing read / write users are auto-mapped to member on first backend boot.

A new Advice tab in the dashboard, scoped per namespace (and optionally per workload / pod).

  • Namespace scope auto-runs a scan. Picking a namespace from the selector immediately runs the detectors against everything in that namespace.
  • Workload / pod scope requires an explicit click. Narrowing to a workload, pod, or label set requires a Run scan click — to keep large clusters responsive when you’re browsing.
  • 23 detectors out of the box — 7 original + 16 added in this release — grouped by category in the side panel.
  • Two layers of detection.
    • Layer 1 (20 of 23) — manifest-only. Works with the existing backend K8s permissions. No extra infra required.
    • Layer 2 (3-4 of 23) — requires Cilium Hubble. Detectors like fan-out-pattern, websocket-on-deployment, all-to-all-replicas, and ephemeral-processes rely on flow data. The Hubble client is currently a stub; until the real gRPC client lands, the panel shows a Needs Hubble badge on those detectors and a coverage banner at the top.

Scans only run the detectors and persist findings with explanation: null. Cards collapse by default. Expanding a card lazily calls POST /api/advice/findings/{id}/explain, which generates the explanation and caches it on the finding. The call is idempotent — re-expanding the same card never re-invokes the LLM.

The advice explainer prompt is constrained against invention: it forbids replica counts, image names, container names, ports, or labels that are not present in the finding’s evidence dict. Explanations are grounded in the evidence the detector produced, not in plausible-sounding guesses.

A new admin-only modal to enable / disable detectors:

  • Grouped by category, with search across name and category.
  • Bulk Enable all / Disable all / Reset to defaults.
  • Hubble-gated detectors are visually marked and their toggles are disabled when Hubble is unavailable.
  • Export findings to JSON or CSV.
  • Lazy rendering on the findings list: the first 5 cards render initially, IntersectionObserver loads 5 more as you scroll, with a Load more link as a fallback for environments where the observer is throttled.
  • PostgreSQL resources renamed with the kure-monitor- prefix (StatefulSet, Service, Secret, ConfigMap — e.g. kure-monitor-postgresql).
  • Role consolidation. admin / write / read collapsed to admin / member. Idempotent DB migration auto-maps existing users on first boot.
  • Login rate limiting moved from in-memory dict to DB-backed (login_attempts table) so backend replicas share state.
  • Service-token rotation paths cleaned up. SERVICE_TOKEN env var is the source of truth and overwrites the DB row on boot if the two differ.
  • Agent WS auth defaults to X-Service-Token header instead of ?token= query param, so the secret never lands in proxy / access logs. Falls back to query-param with AGENT_AUTH_VIA_HEADER=false / SCANNER_AUTH_VIA_HEADER=false.
  • Agent WS reconnects use exponential backoff with jitter (AGENT_WS_RECONNECT_MAX_SECONDS, AGENT_WS_HEARTBEAT_SECONDS).
  • Defensive cleanup across all four services: Pydantic v2 migration, asyncio.to_thread / asyncio.Lock / asyncio.wait_for on K8s API calls, shared aiohttp.ClientSession, async shutdown lifecycle in the scanner.
  • Modal accessibility: shared useModalA11y hook (role=dialog, aria-modal, focus trap, Escape close, backdrop click close).
  • Tab order: AI Advice slots in as Monitoring → Security → Advice → Diagram → Admin (Advice and Diagram swapped order).

The PostgreSQL rename means the old kure-postgresql StatefulSet, Service, Secret, and ConfigMap are no longer managed by the chart.

  • helm upgrade handles the rename for you — it creates the new kure-monitor-postgresql resources and stops managing the old ones.
  • The old PVC is orphaned and can be deleted at your convenience. Data does not carry over — this is the accepted “rebuild empty” migration plan.
  • If you had any read or write users in 2.3.x, they’re auto-mapped to member on the first 2.4.0 backend boot. No manual action required.

Layer-1 detectors (20 of 23) work without any extra infra. If you want the Layer-2 detectors to actually produce findings rather than sit behind a Needs Hubble badge, install Cilium Hubble. Until the real gRPC client ships, those detectors stay greyed out.

Terminal window
helm repo update
helm upgrade kure-monitor kure-monitor/kure \
--namespace kure-system \
--version 2.4.0

The chart handles the PostgreSQL rename and the role-table migration.

  1. Open the dashboard. The tab order should read Monitoring → Security → Advice → Diagram → Admin.
  2. Click Advice, pick a namespace — a scan should auto-run and produce findings within a few seconds.
  3. Expand any finding card. The explanation appears (one LLM call per first expand; re-expand is cached). If you don’t have an LLM configured, the explanation falls back to the detector’s static description.
  4. Open Admin Panel → Detector Settings and confirm the per-detector toggles work, including the disabled Hubble-gated entries.

Advice scan returns “Needs Hubble” for some detectors

Section titled “Advice scan returns “Needs Hubble” for some detectors”

Expected. Three to four detectors (fan-out-pattern, websocket-on-deployment, all-to-all-replicas, ephemeral-processes) require Cilium Hubble. Install Hubble or disable those detectors in Admin Panel → Detector Settings to silence the coverage banner.

”Old PostgreSQL pod still around after helm upgrade

Section titled “”Old PostgreSQL pod still around after helm upgrade””

The orphaned kure-postgresql StatefulSet won’t be deleted automatically. Remove it manually:

Terminal window
kubectl delete statefulset kure-postgresql -n kure-system
kubectl delete svc kure-postgresql -n kure-system
kubectl delete cm kure-postgresql -n kure-system
kubectl delete secret kure-postgresql -n kure-system
# PVC is also orphaned; delete when you're sure you don't need it.
kubectl delete pvc -l app=kure-postgresql -n kure-system

Admin Panel → Detector Settings shows everything as disabled

Section titled “Admin Panel → Detector Settings shows everything as disabled”

This happens if the user account does not have the admin role. The detector toggles are admin-only; member users see the Advice tab in read mode (run scans, view findings, expand explanations) but cannot change detector configuration.

See CHANGELOG.md in the repository.