2026 · Live

Agent-Readiness Linter

Score an OpenAPI spec against the patterns LLM agents actually need.

React · Gatsby · OpenAPI · js-yaml · Rule Engine

Inspiration

The Blog Post, as a Tool

After writing the 'Designing Backends for Agents' post, the obvious next move was to turn its thesis into something a reader could use on their own API. Most teams I've talked to know their spec isn't quite agent-ready but can't point to the specific gaps. A deterministic linter is the shortest path from that feeling to a concrete punch list.

The constraint was strict: no AI in the tool itself. The whole point of the blog is that good agent ergonomics is an architectural discipline, not a model capability. Using an LLM to evaluate API specs would have undercut the argument.

The Idea

Ten Rules, Each Worth Some Points

Paste an OpenAPI spec — JSON or YAML — and a rule engine scores it out of 100 across ten weighted checks: error shape, retryability, idempotency on writes, operation descriptions, verb-led names, parameter descriptions, enum documentation, pagination, rate limits, and per-operation auth scopes.

Each rule's check is a standalone function that walks the spec, produces findings with actionable messages, and returns a pass/fail result plus the set of offending paths. The total score is the combined weight of the passing rules divided by the combined weight of all rules, scaled to 100, giving you a single number that moves when you actually fix things.
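As a rough illustration of that scoring arithmetic, here is a minimal sketch. The type names, the scoreSpec helper, the toy weights, and the rounding step are assumptions made for the example, not the linter's actual source.

```typescript
// Hypothetical shapes, named after the description above.
interface Finding {
  path: string; // offending location, e.g. "POST /orders"
  message: string; // actionable hint for fixing it
}

interface RuleResult {
  pass: boolean;
  findings: Finding[];
}

interface Rule {
  id: string;
  weight: number;
  check: (spec: unknown) => RuleResult;
}

// Score = weight of passing rules / weight of all rules, scaled to 0..100.
// Rounding to an integer is an assumption of this sketch.
function scoreSpec(rules: Rule[], spec: unknown): number {
  let passed = 0;
  let total = 0;
  for (const rule of rules) {
    total += rule.weight;
    if (rule.check(spec).pass) passed += rule.weight;
  }
  return total === 0 ? 0 : Math.round((passed / total) * 100);
}

// Two toy rules with made-up weights: 15 of 20 points pass.
const toyRules: Rule[] = [
  { id: "always-pass", weight: 15, check: () => ({ pass: true, findings: [] }) },
  {
    id: "always-fail",
    weight: 5,
    check: () => ({
      pass: false,
      findings: [{ path: "GET /users", message: "example finding" }],
    }),
  },
];

console.log(scoreSpec(toyRules, {})); // 75
```

Because each rule contributes either its full weight or nothing, fixing one failing rule moves the score by exactly that rule's share of the total.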

Architecture

Paste → Parse → Score → Drill-In

Split-pane layout. Paste on the left, report on the right. Score at the top of the report, followed by one row per rule — passing rules collapsed, failing rules expanded with rationale and per-endpoint findings.

Spec Input

Textarea accepting JSON or YAML, with a 'Load sample spec' button wired to a realistic OpenAPI 3.0 document.

Parser

Detects JSON vs YAML by first character, parses with js-yaml. Surfaces parse errors inline rather than crashing the UI.
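One way this step could look, as a sketch under assumptions: parseSpec and ParseResult are hypothetical names, and the YAML loader is passed in as a parameter so the example stays dependency-free (in the app it would be js-yaml's load function). Sending the JSON branch through JSON.parse is also an assumption; the real parser may hand both formats to js-yaml.

```typescript
// Result type that lets the UI show an inline error instead of throwing.
type ParseResult =
  | { ok: true; spec: unknown }
  | { ok: false; error: string };

// yamlLoad is injected; with js-yaml installed you would pass its `load`.
function parseSpec(
  text: string,
  yamlLoad: (source: string) => unknown
): ParseResult {
  const trimmed = text.trim();
  if (trimmed === "") return { ok: false, error: "Spec is empty" };

  // First non-whitespace character decides the format.
  const looksLikeJson = trimmed[0] === "{" || trimmed[0] === "[";
  try {
    const spec: unknown = looksLikeJson ? JSON.parse(trimmed) : yamlLoad(trimmed);
    return { ok: true, spec };
  } catch (err) {
    // Both JSON.parse and a YAML loader throw on malformed input.
    const message = err instanceof Error ? err.message : String(err);
    return { ok: false, error: message };
  }
}
```

Returning a tagged result rather than throwing keeps the error path in ordinary render logic: the report pane can check ok and show either the score or the parse message.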

Rule Engine

Array of rule objects. Each rule has id, title, weight, rationale, and a check(spec) function returning { pass, findings }.
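To make the rule shape concrete, here is a hedged sketch of what one entry might look like, using the operation-descriptions check as the example. The field names come from the description above; the OpenApiLike type, the weight of 10, the rationale text, and the finding messages are illustrative assumptions.

```typescript
interface Finding {
  path: string;
  message: string;
}

// Minimal structural view of an OpenAPI document: only what this rule reads.
interface OpenApiLike {
  paths?: Record<string, Record<string, { description?: string } | undefined>>;
}

interface Rule {
  id: string;
  title: string;
  weight: number;
  rationale: string;
  check: (spec: OpenApiLike) => { pass: boolean; findings: Finding[] };
}

// HTTP methods that OpenAPI 3.0 allows on a path item.
const HTTP_METHODS = ["get", "put", "post", "delete", "options", "head", "patch", "trace"];

const operationDescriptions: Rule = {
  id: "operation-descriptions",
  title: "Operations have descriptions",
  weight: 10, // illustrative weight, not the real one
  rationale: "Agents choose operations by reading their descriptions.",
  check(spec) {
    const findings: Finding[] = [];
    for (const [path, item] of Object.entries(spec.paths ?? {})) {
      for (const method of HTTP_METHODS) {
        const operation = item[method];
        // Flag operations whose description is missing or blank.
        if (operation && !operation.description?.trim()) {
          findings.push({
            path: `${method.toUpperCase()} ${path}`,
            message: "Add a description that says what this operation does.",
          });
        }
      }
    }
    return { pass: findings.length === 0, findings };
  },
};
```

Calling operationDescriptions.check on a spec whose POST /users operation has no description would return pass: false with a single finding for "POST /users".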

Score + Report

Total computed from passed weights. Each rule rendered as a collapsible row; failing rules expand by default with rationale and findings.

Zero-Network Runtime

All parsing, scoring, and rendering is client-side. The textarea contents never leave the browser.

Tech Deep Dive

Under the Hood

Rule Engine

Ten rules in a single file, each ~30 lines. New rules plug in by adding an entry to the array: weight, title, rationale, check function. Adding a rule requires no changes to the engine itself.

js-yaml

The only additional runtime dependency (~30KB gzipped). Imported inside the linter page component so Gatsby's route-based code splitting keeps it out of the homepage bundle.

Gatsby Route

Lives at /agent-linter as a standalone page. Heavy work stays in its own chunk; the rest of the site is unaffected.

Deterministic Scoring

No randomness, no LLM. Same spec always produces the same score, which is the entire point — agent-readiness is a contract-design question, not a modeling one.

Challenges

What Made It Hard

Writing heuristics that catch real problems without triggering on well-designed specs. The 'operations are verb-led' rule originally flagged 'GET /users' because 'users' isn't a verb — loosened it to pass as long as the operationId or summary itself starts with a verb.
Detecting idempotency support without a perfect signal. Settled on a dual check: a header parameter named Idempotency-Key, OR the word 'idempotent' appearing in the operation description. Catches most real-world patterns.
Scoring behaviour when the spec has zero 4xx/5xx responses defined. Rather than failing silently or giving full credit, the rule surfaces the gap explicitly as a failed check with an 'undocumented errors' finding.
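The first two heuristics above can be sketched roughly as follows. This is an illustration under assumptions, not the linter's code: the verb list is invented, the function names are hypothetical, matching is case-insensitive by choice, and $ref parameters are not resolved.

```typescript
interface Parameter {
  name: string;
  in: string; // "header", "query", "path", or "cookie"
}

interface Operation {
  operationId?: string;
  summary?: string;
  description?: string;
  parameters?: Parameter[];
}

// Illustrative vocabulary; the real rule's verb list is not given in the text.
const LEADING_VERBS = new Set([
  "get", "list", "create", "update", "delete",
  "search", "fetch", "add", "remove", "cancel",
]);

// Leading word of "listUsers", "list_users", or "List all users" is "list".
function firstWord(text: string): string {
  const match = text.trim().match(/^[A-Za-z][a-z]*/);
  return match ? match[0].toLowerCase() : "";
}

// Loosened check: passes if EITHER operationId or summary starts with a verb,
// so the path itself ("/users") is never what gets judged.
function isVerbLed(op: Operation): boolean {
  return [op.operationId, op.summary].some(
    (text) => text !== undefined && LEADING_VERBS.has(firstWord(text))
  );
}

// Dual signal: an Idempotency-Key header parameter, OR the word
// "idempotent" somewhere in the operation description.
function supportsIdempotency(op: Operation): boolean {
  const hasHeader = (op.parameters ?? []).some(
    (p) => p.in === "header" && p.name.toLowerCase() === "idempotency-key"
  );
  const mentioned = /idempotent/i.test(op.description ?? "");
  return hasHeader || mentioned;
}
```

Under these assumptions, an operation with operationId "listUsers" counts as verb-led even on GET /users, and a write operation documented with "This call is idempotent." passes the idempotency check without declaring the header.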