Troubleshooting
Common issues with Kure Monitor and how to fix them.
Install issues
Section titled “Install issues”Pods not starting
Section titled “Pods not starting”kubectl get pods -n kure-systemkubectl describe pod <pod-name> -n kure-system| Issue | Fix |
|---|---|
| Insufficient resources | Reduce resource requests in values.yaml |
| Image pull errors | Check registry access and image tags |
| PVC pending | Ensure StorageClass exists with available capacity |
| RBAC issues | Verify ClusterRole and ClusterRoleBinding were created |
PostgreSQL connection failed
Section titled “PostgreSQL connection failed”kubectl get pods -n kure-system -l app=postgresqlkubectl logs -n kure-system -l app=postgresqlkubectl get svc -n kure-system | grep postgreskubectl get secret -n kure-system kure-secrets -o yamlkubectl get pvc -n kure-systemImage pull errors
Section titled “Image pull errors”docker pull ghcr.io/nan0c0de/kure-monitor/backend:2.4.0
# Private registrykubectl create secret docker-registry regcred \ --docker-server=ghcr.io \ --docker-username=<user> \ --docker-password=<token> \ -n kure-systemimagePullSecrets: - name: regcredAuthentication issues
Section titled “Authentication issues”Dashboard goes straight to main page (no login)
Section titled “Dashboard goes straight to main page (no login)”Auth is always on in 2.3+. If no login page appears, the bootstrap Secret is missing.
kubectl get secret kure-monitor-bootstrap -n kure-system
# If missing (e.g., manifest install)kubectl create secret generic kure-monitor-bootstrap -n kure-system \ --from-literal=service-token="$(openssl rand -hex 32)" \ --from-literal=session-secret="$(openssl rand -hex 32)"
kubectl rollout restart deployment/kure-monitor-backend -n kure-system401 Unauthorized on API calls
Section titled “401 Unauthorized on API calls”Verify your session cookie or service token. From the browser, inspect cookies (kure_session should be set after login). From the agent / scanner, confirm X-Service-Token is in the request:
kubectl logs -n kure-system -l app=kure-backend | grep -i authAgent / scanner getting 401s
Section titled “Agent / scanner getting 401s”Ingest endpoints (POST /api/pods/failed, etc.) require X-Service-Token, not user auth. Confirm:
# Should return 400 (bad body), NOT 401curl -X POST http://localhost:8000/api/pods/failed \ -H "Content-Type: application/json" \ -H "X-Service-Token: $SERVICE_TOKEN" \ -d '{}'If you see 401, the service token in the agent/scanner pod doesn’t match the backend’s. Confirm both mount the same <release>-bootstrap Secret.
WebSocket rejected
Section titled “WebSocket rejected”- Confirm you’re logged in
- Check browser console for WebSocket errors
- Confirm the session cookie exists
Dashboard issues
Section titled “Dashboard issues”Cannot access dashboard
Section titled “Cannot access dashboard”kubectl get svc kure-monitor-frontend -n kure-systemBy service type:
# NodePortkubectl get nodes -o widekubectl get svc kure-monitor-frontend -n kure-system \ -o jsonpath='{.spec.ports[0].nodePort}'
# Port-forwardkubectl port-forward svc/kure-monitor-frontend 8080:8080 -n kure-system
# LoadBalancerkubectl get svc kure-monitor-frontend -n kure-system -wDashboard stuck on “Connecting…”
Section titled “Dashboard stuck on “Connecting…””kubectl port-forward svc/kure-monitor-backend 8000:8000 -n kure-systemcurl http://localhost:8000/api/config
kubectl logs -n kure-system -l app=kure-frontendCommon causes: backend not ready, CORS misconfig, NetworkPolicy blocking traffic.
No pod failures showing
Section titled “No pod failures showing”kubectl get pods -n kure-system -l app=kure-agentkubectl logs -n kure-system -l app=kure-agent
# Confirm agent can reach backendkubectl exec -n kure-system -l app=kure-agent -- \ curl -s http://kure-monitor-backend:8000/api/configAlso check Admin → Suppressions — the namespace may be excluded.
AI / LLM issues
Section titled “AI / LLM issues””AI Not Configured” banner
Section titled “”AI Not Configured” banner”Admin → AI Configuration → pick provider → key → model → Test → Save. See LLM Providers.
Test connection failed
Section titled “Test connection failed”| Error | Fix |
|---|---|
| Invalid API key | Re-check the key |
| Rate limited | Wait and retry, or use a different key |
| Model not available | Pick another model |
| Network error | Confirm backend can reach the LLM API |
kubectl exec -n kure-system -l app=kure-backend -- \ curl -s https://api.openai.comAI solutions not generating
Section titled “AI solutions not generating”curl http://localhost:8000/api/admin/llm/statuskubectl logs -n kure-system -l app=kure-backend | grep -i llmIf LLM fails, Kure falls back to rule-based solutions. Generic but functional.
Security scanner issues
Section titled “Security scanner issues”No security findings
Section titled “No security findings”kubectl get pods -n kure-system -l app=kure-security-scannerkubectl logs -n kure-system -l app=kure-security-scannerAlso check Admin → Suppressions for excluded namespaces.
Too many findings from system namespaces
Section titled “Too many findings from system namespaces”Add to Admin → Suppressions:
kube-systemkube-publickube-node-leasekure-system
Mirror pod issues
Section titled “Mirror pod issues””Test Fix” button missing
Section titled “”Test Fix” button missing”- Logged in as
read/write(admin only) - The pod has no AI solution yet
Mirror pod fails to deploy
Section titled “Mirror pod fails to deploy”| Issue | Fix |
|---|---|
| RBAC missing | Backend ClusterRole needs create, delete on pods |
| Quota exceeded | Check ResourceQuota in the namespace |
| Image pull error | Mirror uses the same image as the original |
kubectl get clusterrole -l app.kubernetes.io/component=backend -o yaml | grep -A5 "resources.*pods"kubectl logs -n kure-system -l app.kubernetes.io/component=backend | grep -i mirrorStuck in Pending
Section titled “Stuck in Pending”- Mirror inherits resource requests from the original — confirm node capacity
- Original may have node selectors / affinity that can’t be satisfied
kubectl describe pod <mirror-pod-name> -n <namespace>
Not auto-deleting
Section titled “Not auto-deleting”- Backend pod isn’t running (cleanup is a background task)
- Backend logs may show cleanup errors
- Manual:
kubectl delete pod <mirror-pod-name> -n <namespace> - TTL setting: Admin → Settings
Performance issues
Section titled “Performance issues”High memory usage / OOMKilled
Section titled “High memory usage / OOMKilled”backend: resources: limits: memory: 2Gi
agent: resources: limits: memory: 1GiSlow dashboard
Section titled “Slow dashboard”- Filter by namespace
- Dismiss resolved failures
- Check browser console for errors
Agent using too much CPU
Section titled “Agent using too much CPU”agent: checkInterval: 30 # bump from defaultNetwork issues
Section titled “Network issues”NetworkPolicies blocking traffic
Section titled “NetworkPolicies blocking traffic”kubectl get networkpolicies -n kure-systemkubectl delete networkpolicy -n kure-system --all # debug onlyDNS failures
Section titled “DNS failures”kubectl exec -n kure-system -l app=kure-agent -- \ nslookup kure-monitor-backend.kure-system.svc.cluster.localUpgrade issues
Section titled “Upgrade issues”Database migration failed
Section titled “Database migration failed”kubectl logs -n kure-system -l app=kure-backend | grep -i migration
kubectl exec -n kure-system -l app=postgresql -- \ psql -U kure -d kure -c "SELECT version FROM schema_migrations;"Helm upgrade fails
Section titled “Helm upgrade fails”helm list -n kure-systemhelm history kure-monitor -n kure-systemhelm rollback kure-monitor <revision> -n kure-systemhelm upgrade kure-monitor kure-monitor/kure -n kure-system --forceLogging and debugging
Section titled “Logging and debugging”Enable debug logging
Section titled “Enable debug logging”kubectl set env deployment/kure-monitor-backend -n kure-system LOG_LEVEL=DEBUGkubectl set env daemonset/kure-monitor-agent -n kure-system KURE_LOG_LEVEL=DEBUGView logs
Section titled “View logs”# All componentskubectl logs -n kure-system \ -l app.kubernetes.io/instance=kure-monitor --all-containers
# Per componentkubectl logs -n kure-system -l app=kure-backend -fkubectl logs -n kure-system -l app=kure-agent -fkubectl logs -n kure-system -l app=kure-frontend -fkubectl logs -n kure-system -l app=kure-security-scanner -fDiagnostics bundle
Section titled “Diagnostics bundle”kubectl get all -n kure-system -o yaml > kure-diagnostics.yamlkubectl logs -n kure-system -l app=kure-backend --all-containers >> kure-diagnostics.yamlkubectl logs -n kure-system -l app=kure-agent --all-containers >> kure-diagnostics.yamlkubectl describe pods -n kure-system >> kure-diagnostics.yamlStill stuck?
Section titled “Still stuck?”Open an issue on GitHub with the diagnostics bundle.