AI Coding Teams and User Acceptance Testing: A Practitioner's Guide
The closed-loop problem, four UAT patterns that work, benchmark contamination explained, and a decision framework across Claude Code, OpenCode, and open-source models.
Notes
Field notes from building and shipping with AI tools. Verified sources, disclosed conflicts, honest numbers.
Eight specialized agents, each with a single job: Analyst, Architect, Engineer, Sentinel, Healer, Scribe, Inspector, Orchestrator. Why specialization beats generalism in QA.
Why AI cannot own the merge gate, the inner/outer loop architecture, four required gate components, and a six-step checklist for pipelines that catch AI-generated bugs.
What if every healthy choice you make could contribute to your financial future? A product design thesis for health-driven retirement savings.
114k tokens per task via MCP vs 27k via CLI. Token economics, setup, and a decision framework across Playwright MCP, agent-browser, playwright-skill, and Stagehand.
How Poshmark weaves social media features into its UX, the ethical implications of monetizing engagement behaviors, and where social commerce is headed.
AI tools will become table stakes within two to three years. The competitive advantage is practice: the judgment you build by using the tools, not the tools themselves.
Individual subscription vs. enterprise token billing, open-source model cost spreads, human QA benchmarks, and where the ROI actually is.
DOM assertions test structure, not what users see. The screenshot feedback loop, tool comparison, and handling non-deterministic AI-generated content.
82% of teams use AI in testing. Most either disable it within three months or spend more time managing AI misses than the AI saves. Here's why.
A framework for evaluating AI testing claims: vendor epistemology, benchmark contamination, ceiling-as-average, and category errors, including an honest accounting of the data in this cluster.