Deep-dive research briefings on specific topics, newest first.
Empirical findings on agent rules files and an evidence-based style guide
First controlled studies on a pattern the entire agent-coding community has been using — time to replace folklore with evidence
The capability-vs-autonomy gap and a practitioner's evaluation rubric for long-horizon agentic deployments
'Just upgrade, it's strictly better' no longer works as a strategy: model selection for autonomous deployments needs its own evaluation protocol
Convergent UX primitives across Cursor 3, Claude Code, Gemini CLI, and Stage
Three major coding-agent vendors shipped 'post-IDE' bets in one week: which UX primitives practitioners should build on and which they should treat as vendor-locked
Capability-gated commercial model release, Project Glasswing partner requirements, lab responses, and enterprise model-access strategy implications
Enterprise teams building on frontier models must now plan for tier-gated access as a procurement concern distinct from price-gating. Glasswing is the first post-GPT-2 commercial capability gate and every downstream lab is already moving in response.
Production-scale harness deployments (OpenAI Frontier & Symphony, Cognition Devin), LangChain's harness-as-lock-in critique, evals-driven harness iteration, and open deployment alternatives to Claude Managed Agents
Four weeks after harness engineering was named, the production chapter has datapoints, a lock-in critique, and competing deployment models — teams now face an architectural commitment decision, not a tooling choice
The gateway-and-registry layer hardening between agents and MCP servers — what the MCP Dev Summit put on record, which vendors are filling which slots, and how practitioners should decide what to build, buy, or skip
MCP deployment work now consists mostly of middle-tier decisions — auth proxies, tool registries, execution sandboxes, observability. Every team committing to MCP in 2026 will make four or five build-vs-buy calls on this layer before their first production agent ships.
Quantifying and mitigating the integration cost of agent-generated code
As teams scale agent-assisted development, integration costs (merge conflicts, review overhead, coordination gaps) become the binding constraint
Documentation-driven injection and poisoning attacks against agent skill ecosystems
Agents that consume third-party skills are vulnerable to a new class of supply-chain attack that poisons documentation rather than code
Harness engineering as a named discipline — mental models, concrete patterns, and implementation guidance
Three independent sources converged on harness engineering this week; teams need a unified framework for designing agent scaffolds
OpenClaw security crisis — CVEs, ClawHavoc, architectural root causes, and defensive patterns
First major real-world agent security incident; establishes patterns every agent infrastructure team needs to understand
Convergence of SKILL.md, CLAUDE.md skills, AGENTS.md, and vendor-specific agent skill systems toward a shared format
The skills layer is where agent capability actually lives — practitioners need to know which format to invest in and what portability they can expect
Comparing emerging permission model architectures for AI agents
Every agent system needs a permission model — practitioners need to choose between classifier-based, role-based, and metric-based approaches