Offline-First · RAG-Enforced · Secure Local AI Second Brain
Topology = Policy. 5 invariants enforced by graph edges. Zero telemetry. Zero cloud by default.
What CyClaw is and why the architecture is deliberately hostile to cloud leakage
A personal offline-first "second brain" that answers exclusively from your private Markdown vault. No cloud, no subscription, no data exfiltration by default. FastAPI gateway + LangGraph state machine + Chroma/BM25 RRF hybrid retrieval + optional Grok fallback triple-gated behind explicit user confirmation.
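To make the fusion step concrete, here is a minimal sketch of Reciprocal Rank Fusion (RRF) over two ranked lists, such as a vector search and a BM25 search might return. The function name `rrf_fuse`, the sample paths, and the constant `k=60` (a common RRF default) are assumptions for illustration; the source does not state CyClaw's actual values.

```python
from collections import defaultdict


def rrf_fuse(rankings, k=60):
    """Fuse ranked lists of document IDs with Reciprocal Rank Fusion.

    Each document earns 1 / (k + rank) per list it appears in (rank starts at 1).
    k=60 is an assumed default, not a value taken from CyClaw.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first.
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical result lists from the two retrievers.
vector_hits = ["notes/a.md", "notes/b.md", "notes/c.md"]
bm25_hits = ["notes/b.md", "notes/a.md", "notes/d.md"]

fused = rrf_fuse([vector_hits, bm25_hits])
for doc_id, score in fused:
    print(f"{score:.4f}  {doc_id}")
```

Under that assumed `k`, a document returned by only one retriever scores at most 1/61 ≈ 0.0164, while one ranked first by both scores 2/61 ≈ 0.0328. A cutoff such as 0.028 would therefore only pass documents found by both retrievers.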
Most RAG systems bolt security on as config flags. CyClaw makes the 5 core guarantees architectural constraints: graph edges, not README promises. Bypassing RAG-first or skipping the triple-gate Grok fallback requires rewriting the LangGraph graph itself.
All embeddings, retrieval, and inference on local hardware (LM Studio). Cloud fallback (Grok) is triple-gated and opt-in only. Zero telemetry — env vars killed before any SDK import.
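A pre-import telemetry kill can be sketched as follows. The three variable names are real opt-out switches for Chroma, LangChain tracing, and the OpenTelemetry SDK, but which variables CyClaw actually sets is an assumption here, as is the helper name `kill_telemetry`.

```python
import os

# Assumed set of opt-out variables; the source does not list the ones CyClaw uses.
TELEMETRY_KILL = {
    "ANONYMIZED_TELEMETRY": "False",  # Chroma product telemetry
    "LANGCHAIN_TRACING_V2": "false",  # LangChain / LangSmith tracing
    "OTEL_SDK_DISABLED": "true",      # OpenTelemetry SDK
}


def kill_telemetry(env=os.environ):
    """Force every telemetry switch off, overriding any inherited value."""
    for name, value in TELEMETRY_KILL.items():
        env[name] = value
    return env


kill_telemetry()

# Third-party SDK imports (chromadb, langchain, ...) would go below this line,
# so that they read the environment only after the switches are set.
```

The ordering is the point of the design: a library that reads its environment at import time cannot be switched off by code that runs later.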
Persistent identity layer (soul.md) with SHA-256 drift detection, atomic writes via os.replace(), explicit human reason string required. No autonomous self-modification from any graph node.
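The combination of SHA-256 drift detection, a required reason string, and an atomic `os.replace()` write might look like the sketch below. The function names (`sha256_of`, `has_drifted`, `write_soul`) are hypothetical and the version-history bookkeeping is omitted; this is an illustration of the technique, not CyClaw's personality.py.

```python
import hashlib
import os
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA-256 digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def has_drifted(path, stored_hash):
    """True when the file on disk no longer matches the recorded hash."""
    return sha256_of(path) != stored_hash


def write_soul(path, new_text, reason):
    """Atomically replace the file and return its new hash.

    A non-empty human-written reason is mandatory (hypothetical signature).
    """
    if not reason or not reason.strip():
        raise ValueError("a human-written reason is required")
    path = Path(path)
    # Write to a temp file in the same directory so os.replace stays on one filesystem.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            tmp.write(new_text)
            tmp.flush()
            os.fsync(tmp.fileno())
        os.replace(tmp_name, path)  # readers see the old or the new file, never a partial one
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
    return sha256_of(path)
```

A caller would store the returned hash and pass it to `has_drifted` on later requests; any edit made outside `write_soul` then shows up as a mismatch.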
FastAPI HTTP at 127.0.0.1:8787, MCP server for Claude Desktop & Copilot Studio, full browser terminal UI — all bound to loopback only.
From query to audited response — every path converges, no shortcuts
    ┌────────────────────────────────────────────────────────────────┐
    │  CLIENT   Browser · curl · Claude Desktop · MCP                │
    └──────────────────────┬─────────────────────────────────────────┘
                           │ HTTP POST /query (127.0.0.1:8787 only)
                           ▼
    ┌────────────────────────────────────────────────────────────────┐
    │  gate.py (FastAPI)                                             │
    │  ① Rate limit (60/min)          ② Injection filter (33 OWASP)  │
    │  ③ Soul init + SHA drift check  ④ Telemetry kill (pre-import)  │
    └──────────────────────┬─────────────────────────────────────────┘
                           │
                           ▼
    ┌────────────────────────────────────────────────────────────────┐
    │  graph.py (LangGraph 7-Node State Machine)                     │
    │                                                                │
    │  [ENTRY] ─► retrieve (Chroma+BM25+RRF) ─► route_score          │
    │             ├─ score ≥ 0.028 ─► local_llm (LM Studio)          │
    │             │                                                  │
    │             └─ score low ─► user_gate (needs_confirm)          │
    │                                 │                              │
    │                      ┌──────────┤                              │
    │     confirmed+hybrid │          │ declined                     │
    │                      ▼          ▼                              │
    │           grok_fallback   offline_best                         │
    │           (triple gate)                                        │
    │                                                                │
    │  ALL PATHS ─► audit_logger (SHA-256 + PII redact)              │
    └────────────────────────────────────────────────────────────────┘

    Parallel: POST /ops/* ─► ops_runner.py (subprocess shim, v1.7 NEW)
              subprocess.run(argv_list) only ─ never imports sync/ or agentic/
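The two routing decisions in the diagram can be sketched as plain functions over the graph state. The state keys, the `QueryState` type, and the function `route_user_gate` are assumptions for illustration; only the node names, the 0.028 threshold, and the three gate conditions come from the source.

```python
from typing import TypedDict

SCORE_THRESHOLD = 0.028  # threshold shown in the diagram


class QueryState(TypedDict, total=False):
    # Hypothetical state keys; the real schema is not given in the source.
    rrf_score: float
    mode: str
    grok_enabled: bool
    user_confirmed_online: bool


def route_score(state: QueryState) -> str:
    """Pick the next node from the retrieval score alone; no LLM is consulted."""
    if state.get("rrf_score", 0.0) >= SCORE_THRESHOLD:
        return "local_llm"
    return "user_gate"


def route_user_gate(state: QueryState) -> str:
    """Allow the cloud fallback only when all three gates are open."""
    all_open = (
        state.get("mode") == "hybrid"
        and state.get("grok_enabled") is True
        and state.get("user_confirmed_online") is True
    )
    return "grok_fallback" if all_open else "offline_best"


print(route_score({"rrf_score": 0.031}))                  # local_llm
print(route_user_gate({"mode": "hybrid", "grok_enabled": True}))  # offline_best
```

Functions of this shape, which take the state and return the name of the next node, are what LangGraph's `add_conditional_edges` accepts as a routing callable. Because a missing or false flag falls through to `offline_best`, the offline path is the default and the cloud path needs every gate open.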
retrieve is the unconditional first node. No LLM call can ever precede retrieval. Enforced by the graph entry edge, not config.
Grok fallback requires mode=hybrid AND grok.enabled=true AND user_confirmed_online=true simultaneously. All three must be true.
Every path converges on audit_logger. Zero shortcut paths. Every query is SHA-256 hashed and logged to audit.jsonl.
Soul writes are atomic via os.replace(). No node autonomously modifies soul.
Every file in plain English and code terms
gate.py: FastAPI gateway. Runs the sanitizer.py injection scan (33 patterns) and soul init, and kills all telemetry env vars before any SDK import. All /ops/* routes proxy through ops_runner.py.
graph.py: LangGraph StateGraph with 7 named nodes. Routing is edge-defined: the route_score edge function reads the RRF score from state and never asks the LLM. All paths terminate at audit_logger. No dead-end paths.
sanitizer.py: Injection filter. Substitutes a [FILTERED] placeholder on match.
personality.py: Reads soul.md on every request, computes its SHA-256, and compares it against the stored hash. A mismatch triggers a drift alert. Writes are atomic via os.replace(). Human reason string required. Full version history.
ops_runner.py: Backs POST /ops/sync and POST /ops/agentic. Uses only subprocess.run(argv_list), never import sync or import agentic. Hard isolation boundary. Loopback + API-key gated. Dry-run default.
8 layers, all enforced by code not convention
| Layer | Mechanism | Status | File |
|---|---|---|---|
| Network | 127.0.0.1:8787 only + TrustedHostMiddleware DNS rebinding defense | ENFORCED | gate.py |
| Rate Limit | 60 req/min per IP, sliding window, thread-safe, 429 + audit on exceed | HARDCODED | gate.py |
| Injection Filter | 33 OWASP patterns, config hot-reloadable, applied pre-LLM + corpus ingest | ACTIVE | sanitizer.py |
| Telemetry Kill | LangChain/Chroma/OTel env vars blocked before any SDK import | PRE-IMPORT | gate.py |
| Grok Triple Gate | mode=hybrid + grok.enabled=true + user_confirmed_online=true — all three required | TRIPLE | graph.py |
| Soul Writes | Human reason + injection scan + atomic os.replace() + SHA-256 drift check | GOVERNED | personality.py |
| Audit | SHA-256 hash + PII redaction on ALL 6 paths — mandatory convergence | MANDATORY | logger.py |
| /ops/* Isolation (v1.7) | Loopback + API-key gated, subprocess.run(argv) only — hard module boundary | NEW v1.7 | ops_runner.py |
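The rate-limit row describes a thread-safe sliding window of 60 requests per minute per IP. A minimal sketch of that technique follows; the class name `SlidingWindowLimiter` and its injectable clock are assumptions, and the 429 response and audit entry mentioned in the table are left to the caller.

```python
import threading
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Per-client sliding-window limiter (hypothetical helper, not CyClaw's code)."""

    def __init__(self, limit=60, window_s=60.0, clock=time.monotonic):
        self.limit = limit
        self.window_s = window_s
        self.clock = clock  # injectable so tests can control time
        self._hits = defaultdict(deque)  # client IP -> timestamps of accepted requests
        self._lock = threading.Lock()

    def allow(self, client_ip):
        """Return True and record the request if the client is under its limit."""
        now = self.clock()
        with self._lock:
            hits = self._hits[client_ip]
            # Drop timestamps that have aged out of the window.
            while hits and now - hits[0] >= self.window_s:
                hits.popleft()
            if len(hits) >= self.limit:
                return False  # caller would answer 429 and write an audit entry
            hits.append(now)
            return True


limiter = SlidingWindowLimiter()
print(limiter.allow("127.0.0.1"))  # True for the first request
```

Unlike a fixed one-minute bucket, the window slides with each request, so a burst straddling a minute boundary cannot exceed the limit.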
20 commits · Browser Ops + isolation boundary perfected
Two new operator panels in static/terminal.html backed by loopback-only audited POST /ops/sync and POST /ops/agentic subprocess shims. sync/ and agentic/ are never imported by core — true out-of-band isolation. PR #239, 470 tests passing, adversarial review by 4 reviewers, zero in-scope defects.
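The isolation boundary described here, where core launches sync/ and agentic/ as child processes and never imports them, can be sketched as below. The allow-list contents, the `python -m` invocation, the `--dry-run` flag, and the function names are all hypothetical; the source only states that `subprocess.run(argv_list)` is used and that dry-run is the default.

```python
import subprocess
import sys

# Hypothetical allow-list mapping an operation name to a fixed argv prefix.
OPS_COMMANDS = {
    "sync": [sys.executable, "-m", "sync"],
    "agentic": [sys.executable, "-m", "agentic"],
}


def build_argv(op, dry_run=True):
    """Build the argv list for an allowed operation; dry-run unless disabled."""
    if op not in OPS_COMMANDS:
        raise ValueError(f"unknown operation: {op!r}")
    argv = list(OPS_COMMANDS[op])  # copy so the allow-list is never mutated
    if dry_run:
        argv.append("--dry-run")  # assumed flag name
    return argv


def run_op(op, dry_run=True, timeout_s=300):
    """Run the operation out of process; nothing from sync/ or agentic/ is imported."""
    return subprocess.run(
        build_argv(op, dry_run),  # argv list, so no shell parsing takes place
        capture_output=True,
        text=True,
        timeout=timeout_s,
        check=False,
    )


print(build_argv("sync"))
```

Passing a list rather than a command string means no shell is involved, so request data cannot be interpreted as shell syntax, and a crash in the child cannot take the gateway process down with it.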
Functional simulation of the real 46KB terminal. Real invariants. Real palette.
Real terminal.html is a 46KB single-file operator console at 127.0.0.1:8787/static/terminal.html. This demo mirrors the invariants, flow, and four console panels exactly.
ops_runner subprocess shim, Sync + Agentic consoles, 33-pattern sanitizer, 470 tests, this infographic. Version bump from 1.5.0.
Grok fallback now requires three simultaneous flags, one of which is explicit user confirmation in the UI.
Topology = Policy became non-negotiable. RAG-first enforced at graph entry. BM25 + Chroma RRF fusion introduced.
FastAPI gateway, LangGraph state machine, ChromaDB, soul governance, MCP server. Core architecture established.