Deep-dive research briefings on specific topics, newest first.
Comparative analysis of the four major agent-registry offerings and the decision framework for picking one
Registries are the layer where agent-protocol fragmentation gets absorbed and where enterprise governance lives — choosing one is a lock-in decision for the decade
Empirical findings on agent rules files and an evidence-based style guide
First controlled studies on a pattern the entire agent-coding community has been using — time to replace folklore with evidence
The capability-vs-autonomy gap and a practitioner's evaluation rubric for long-horizon agentic deployments
'Just upgrade, it's strictly better' no longer holds; model selection for autonomous deployments needs its own evaluation protocol
Convergent UX primitives across Cursor 3, Claude Code, Gemini CLI, and Stage
Three major coding-agent vendors shipped 'post-IDE' bets in one week; practitioners need to know which UX primitives to build on and which to treat as vendor-locked
Capability-gated commercial model release, Project Glasswing partner requirements, lab responses, and enterprise model-access strategy implications
Enterprise teams building on frontier models must now plan for tier-gated access as a procurement concern distinct from price-gating. Glasswing is the first post-GPT-2 commercial capability gate and every downstream lab is already moving in response.
Production-scale harness deployments (OpenAI Frontier & Symphony, Cognition Devin), LangChain's harness-as-lock-in critique, evals-driven harness iteration, and open deployment alternatives to Claude Managed Agents
Four weeks after harness engineering was named, the production chapter has data points, a lock-in critique, and competing deployment models; teams now face an architectural commitment decision, not a tooling choice
The gateway-and-registry layer hardening between agents and MCP servers — what the MCP Dev Summit put on record, which vendors are filling which slots, and how practitioners should decide what to build, buy, or skip
MCP deployment work now consists mostly of middle-tier decisions — auth proxies, tool registries, execution sandboxes, observability. Every team committing to MCP in 2026 will make four or five build-vs-buy calls on this layer before their first production agent ships.
Quantifying and mitigating the integration cost of agent-generated code
As teams scale agent-assisted development, integration costs (merge conflicts, review overhead, coordination gaps) become the binding constraint
Documentation-driven injection and poisoning attacks against agent skill ecosystems
Agents that consume third-party skills are vulnerable to a new class of supply-chain attack that poisons documentation rather than code
Harness engineering as a named discipline — mental models, concrete patterns, and implementation guidance
Three independent sources converged on harness engineering this week; teams need a unified framework for designing agent scaffolds
OpenClaw security crisis — CVEs, ClawHavoc, architectural root causes, and defensive patterns
First major real-world agent security incident; establishes patterns every agent infrastructure team needs to understand
Convergence of SKILL.md, CLAUDE.md skills, AGENTS.md, and vendor-specific agent skill systems toward a shared format
The skills layer is where agent capability actually lives — practitioners need to know which format to invest in and what portability they can expect
Comparing emerging permission model architectures for AI agents
Every agent system needs a permission model — practitioners need to choose between classifier-based, role-based, and metric-based approaches
Transforming architectural decisions into machine-readable guardrails that coding agents respect
Over-permissioned agents cause 4.5x more incidents — teams need practical patterns for encoding architectural constraints agents can follow
Supply chain attack vectors and defensive patterns specific to AI agent infrastructure
AI infrastructure dependencies are high-value targets — teams need specific defensive patterns beyond general supply chain hygiene
Strategic implications of AI labs acquiring developer tooling companies
Practitioners depend on tools now owned by AI labs — understanding the strategic logic and risks informs tooling choices
Architectural patterns from Stripe Minions, Spotify Honk, and HubSpot Sidekick
Practitioners building internal coding agents need concrete architectural patterns validated at enterprise scale
Guidance injection attacks on coding agent bootstrap/skill files
Anyone building or deploying coding agents with skill ecosystems (CLAUDE.md, AGENTS.md, custom skills) needs to understand this novel attack surface
The emerging discipline of supervisory engineering in agent-assisted workflows
Teams adopting coding agents need to understand how engineering roles, skills, and team structures are shifting
Comparing TDAD, Arbiter, and existing approaches to prompt regression testing for AI agents
Every production agent team eventually discovers that small prompt changes cause silent behavioral regressions — this scout maps the landscape of available detection and prevention approaches
Autoresearch — the autonomous experiment loop pattern for iterative optimization
Directly applicable to automated optimization, quality improvement, and security hardening workflows in agent infrastructure
Model Context Protocol production deployment issues — what breaks when MCP leaves the demo and hits real infrastructure
MCP is becoming the de facto agent-to-tool standard; understanding its production gaps is essential before committing to it as an architectural foundation