codex-local

What 9B can actually do.

The reason I’m bullish on local models: codex-local — my fork of the Codex CLI, tuned to babysit a local model through a structured task list — drives multi-turn, multi-file, multi-task coding sessions to completion using just Qwen3.5-9B-IQ4_XS.

The trick isn’t the model; it’s the scaffolding.

Most local-model coding agents stall past trivial single-edit tasks. With the right scaffolding in place, 9B handles real work. The frontier model is no longer the moat; the scaffolding is.

The fork itself reworks compaction behavior, model-routing interception, and exec-log inspection so a long unattended run can survive overnight without losing the thread.
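To make the compaction idea concrete, here is a minimal sketch of the kind of transcript compaction a long unattended run needs. All names are hypothetical; this is not codex-local’s actual API, just the shape of the technique.

```python
def compact(transcript, budget, approx_tokens=lambda m: len(m["content"]) // 4):
    """Keep the system prompt plus the most recent turns within `budget`
    approximate tokens; collapse everything older into one summary stub."""
    system, rest = transcript[0], transcript[1:]
    kept, used = [], approx_tokens(system)
    for msg in reversed(rest):            # walk newest-first
        cost = approx_tokens(msg)
        if used + cost > budget:
            break
        kept.append(msg)
        used += cost
    kept.reverse()
    dropped = len(rest) - len(kept)
    if dropped:
        stub = {"role": "user",
                "content": f"[context compacted: {dropped} earlier turns summarized]"}
        return [system, stub] + kept
    return [system] + kept
```

The point is the discipline, not the heuristic: a small model only survives overnight if something keeps its window from silently overflowing.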

The rig

Dual-GPU local-first AI infra.

OpenClaw bridges

3D scenes by voice.

Built so my third son could direct his Roblox / Blender / Daz scenes through natural language (“make Torus.1 50% thinner on the Z-axis”) and watch the live environment update. An AI helper bridge runs through OpenClaw, with the language layer driving live scene state. It sits at the intersection of three threads: my AI work, my long-running 3DCG interest, and a kid who’s already designing in Blender, Roblox Studio, and Suno.
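A sketch of what the language layer hands to the scene: one spoken command becomes one structured edit. The real bridge would use a model for parsing; the regex and the op format here are hypothetical, shown only to illustrate the shape of the operation.

```python
import re

# Matches commands like "make Torus.1 50% thinner on the Z-axis".
CMD = re.compile(
    r"make (?P<obj>\S+) (?P<pct>\d+)% (?P<dir>thinner|thicker)"
    r" on the (?P<axis>[XYZ])-axis",
    re.IGNORECASE,
)

def parse_scale_command(text):
    """Turn a natural-language resize command into a scene-edit op, or None."""
    m = CMD.match(text)
    if not m:
        return None
    pct = int(m.group("pct")) / 100
    factor = 1 - pct if m.group("dir").lower() == "thinner" else 1 + pct
    return {"op": "scale", "object": m.group("obj"),
            "axis": m.group("axis").upper(), "factor": factor}
```

Downstream, the op is just data, so the same edit can drive Blender, Roblox Studio, or Daz through whatever adapter is listening.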

coding-agent-router

Multi-agent CLI routing.

Docs-first multi-agent CLI router. Routes prompts across specialized workers with shared context, built on the assumption that no single model is the right choice for every step in a long task. Pairs with codex-local for the local side and with cloud providers for harder work.
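The routing assumption above can be sketched in a few lines: classify each step, dispatch to a worker tier, and thread shared context through every call. The worker names and the keyword classifier are placeholders, not coding-agent-router’s real configuration.

```python
# Hypothetical two-tier setup: a local worker for routine steps, a cloud
# worker for harder ones. Workers share one mutable context dict.
WORKERS = {
    "local": lambda prompt, ctx: f"[local-9b] {prompt}",
    "cloud": lambda prompt, ctx: f"[cloud-frontier] {prompt}",
}

HARD_HINTS = ("refactor", "redesign", "debug across")

def route(prompt, ctx):
    """Send steps that look hard to the cloud worker, everything else local."""
    tier = "cloud" if any(h in prompt.lower() for h in HARD_HINTS) else "local"
    ctx.setdefault("history", []).append((tier, prompt))
    return WORKERS[tier](prompt, ctx)
```

The shared `ctx` is what keeps a long task coherent when no single model sees every step.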

K.O.R.A.

Same scaffolding, in production.

Everything above runs in production at Kora Labs as K.O.R.A. — the autonomous Discord ops agent. Multi-stage pipeline, multi-model routing with graceful fallback, three-tier test battery, set-me-loose-all-night execution mode. The personal tools and the production agent share the same operating doctrine. See /work for the platform context.
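Graceful fallback, the property that lets an agent run unattended all night, reduces to one small pattern: try each model in order and only fail when the whole chain is exhausted. A minimal sketch, with placeholder model names:

```python
def call_with_fallback(prompt, models, call):
    """Try each model in order; return (model, result) for the first success,
    raise only after every model in the chain has failed."""
    last_err = None
    for model in models:
        try:
            return model, call(model, prompt)
        except Exception as err:
            last_err = err            # remember the failure, fall through
    raise RuntimeError("all models failed") from last_err
```

In production the chain would also log each failure and adjust routing, but the doctrine is the same: no single model is load-bearing.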

Operating doctrine

How I actually work with AI.