Workflow on Bradley Fidler

Wireword: Agent Control Words Should Be Hard to Misread

Tue, 21 Apr 2026 00:00:00 +0000

This is a research note for Wireword, a small tool I am building to lint LLM agent control words.

By control words, I mean short labels that can change what an agent does:

route names
tool names
prompt macro names
environment targets
approval targets
exact enum values the model must emit

The goal is narrow: make labels that control agent behavior harder to misread, miscopy, or misroute.

Of words and tokens being expensive

This started with caveman-style LLM output. The useful comparison is not really cavemen. It is telegraphese: compressed language for an expensive channel.

Western Union did not bill like an LLM API, but the pressure was similar. Ordinary domestic telegrams were billed by chargeable body word, usually with a ten-word minimum; address, signature, and date were free, while extra body words cost more.¹ A ten-word sentence from New York to Boston could cost 30 cents.²

That maps to LLM work in two basic ways:

Token cost: shorter turns are cheaper.
Context quality: shorter turns leave less low-information text in the conversation history.

The second point is not just aesthetic. Long histories are not used perfectly. Irrelevant text can distract the model or bury the useful constraint.

But compression has a failure mode. If compressed labels become too similar, the model has less redundancy to recover the intended control word.

Learning from telegraphy

I looked at other telegraph practices to see what might apply to LLM agents. Could Victorian engineers provide fresh insights for our changing world? No, except for one thing, sort of.

Most parallels are useful but general:

Telegraph practice	General pattern	LLM-agent version
`STOP` and spelled punctuation	delimiters	source/task boundaries
repeat-back	confirmation	human approval gates
service classes	priority and cost tiers	model routing / effort levels
codebooks	macros	prompt libraries
word-count checks	validation	output checks
operators	review and observability	linters / traces
private codes	substitution	PII masking

These are durable information-management practices. They are worth remembering, but they do not justify a new tool by themselves.

The more specific lead was codeword design.

Compression with redundancy

Commercial telegraph codebooks had to balance compression and recoverability. A codeword had to be short enough to save money, but distinct enough that a damaged word did not silently become another valid word.

E. L. Bentley described the rule directly: good codewords should differ by at least two letters. Then a one-letter mutilation produces an invalid codeword, not the wrong valid codeword.³

The ABC Code used the same principle. John McVey’s index quotes the 1920 sixth edition as saying that its five-letter codewords were built with at least a two-letter difference. The same note says the compilers considered Morse similarities and removed risky words.⁴

Useful rule:

Good compression leaves enough redundancy to detect mistakes.
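The two-letter-difference rule can be sketched as a pairwise check over a codebook. This is a minimal illustration, not Bentley's procedure or Wireword's code: the function names and the sample codewords are invented for the example, and it only handles equal-length words.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length words differ."""
    if len(a) != len(b):
        raise ValueError("hamming distance needs equal-length words")
    return sum(x != y for x, y in zip(a, b))


def too_close(codebook: list[str], min_diff: int = 2) -> list[tuple[str, str]]:
    """Return codeword pairs that differ in fewer than min_diff letters."""
    return [
        (a, b)
        for a, b in combinations(codebook, 2)
        if hamming(a, b) < min_diff
    ]


# Hypothetical five-letter codewords, invented for illustration.
codebook = ["ABACK", "ABIDE", "ABODE", "ACORN"]
print(too_close(codebook))  # [('ABIDE', 'ABODE')]
```

ABIDE and ABODE differ in one letter, so a single mutilation could turn one valid codeword into the other. Every other pair is at least three letters apart, so a one-letter error in those words yields something that is not in the book.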

The LLM agent version

This problem is not unique to LLMs. Similar issues appear in APIs, command-line flags, protocol enums, medication names, service names, and airport codes.

LLM agents make the problem newly common because they combine:

probabilistic language generation
exact symbolic control
natural-language prompts around short labels
tool calls and routes with real side effects

Example labels:

A1
AI
Al
prod
production
live
docs.api
doc.api
FACTCHECK_API
FACT_CHECK_API

These are not just strings. In an agent system, they may route work, call tools, select environments, expand macros, approve targets, or satisfy exact enum values.

The risk boundary is narrow. Similar labels matter when three conditions hold:

the label is visible to the model or copied through natural language
the model or a human can choose or emit the label
downstream code treats the label as an exact control input

A wrong valid label is worse than an invalid label. Invalid labels can fail validation. Wrong valid labels can pass validation and trigger the wrong action.
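A small sketch of that asymmetry, assuming a hypothetical exact-match parser built on a Python enum. The route names come from the example labels above; the `Route` class, `parse_route`, and `outcome` are illustrative names, not part of Wireword.

```python
from enum import Enum


class Route(Enum):
    # Hypothetical routes, borrowed from the example labels above.
    DOCS_API = "docs.api"
    DOC_API = "doc.api"


def parse_route(raw: str) -> Route:
    """Exact-match parser: raises ValueError for unknown labels."""
    return Route(raw)


def outcome(intended: str, emitted: str) -> str:
    """Classify what happens when the model emits a label."""
    try:
        route = parse_route(emitted)
    except ValueError:
        return "rejected"       # invalid label: validation catches it
    if route.value == intended:
        return "ok"
    return "wrong-valid"        # passes validation, triggers the wrong route


print(outcome("docs.api", "docs.api"))   # ok
print(outcome("docs.api", "docs.apii"))  # rejected
print(outcome("docs.api", "doc.api"))    # wrong-valid
```

The parser is doing its job in all three cases. The third case is the one it cannot see, because the emitted label is a legitimate member of the enum.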

This matters less when routing is deterministic, internal IDs are hidden from the model, schemas constrain the choice, or a UI forces selection from canonical options.

So Wireword should not only ask whether two strings are similar. It should ask:

What kind of label is this?
Can the model emit it?
Does a parser require an exact match?
What happens if the wrong label is chosen?
Does it target production or another external system?

Generic check vs agent-aware check

Generic similarity check:

docs.api / doc.api
Reason: edit distance 1.

Agent-aware check:

CRITICAL docs.api / doc.api
Reason: route-name collision across different effects.
Risk: read-only route is one edit away from external-write route.
Fix: rename to ROUTE_DOCS_REVIEW and ROUTE_DOCS_PUBLISH.

Generic similarity check:

prod / production / live
Reason: related strings.

Agent-aware check:

CRITICAL prod / production / live
Reason: multiple production-like environment labels.
Risk: agent may choose an inconsistent deployment target.
Fix: use ENV_PRODUCTION as the only valid production label.

That is the product line: do not only lint strings. Lint control words by the action they can trigger.
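One way to sketch the difference is to attach a little metadata to each label and let severity depend on it. Everything here is an assumption for illustration: the `Label` fields, the two-level read/write effect scale, the severity names other than CRITICAL, and which of the two routes is the write route. It is not the prototype's actual rule set.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Label:
    name: str
    kind: str            # e.g. "route", "tool", "env"
    effect: str          # "read" or "write"; a hypothetical two-level scale
    model_visible: bool  # can the model see or emit this label?


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance using a rolling row of the DP table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def check_pair(a: Label, b: Label) -> Optional[str]:
    """Return a severity for a near-collision, or None if the pair is fine."""
    if edit_distance(a.name, b.name) > 1:
        return None
    if not (a.model_visible and b.model_visible):
        return "INFO"      # similar, but the model never chooses between them
    if a.kind == b.kind and a.effect != b.effect:
        return "CRITICAL"  # same kind of control word, different side effects
    return "WARNING"


review = Label("docs.api", "route", "read", True)
publish = Label("doc.api", "route", "write", True)
print(check_pair(review, publish))  # CRITICAL
```

The string comparison is the same as in a generic check. The severity comes from the metadata: if either label is hidden from the model, the same pair drops to INFO.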

Current prototype and V1 plan

The tool is Wireword. V1 should stay small.

The current prototype now checks both layers:

raw labels: visual confusables, edit-distance-one pairs, case-only differences, punctuation-only differences, plural/stem collisions, and production-like aliases
agent-aware labels: routes, tools, named agent handoffs, approval targets, macros, profiles, production-like environments, and exact enum values the model must emit
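Three of the raw-label checks can be sketched in a few lines. This is an illustration under stated assumptions, not the prototype's implementation: the confusables map is deliberately tiny and hypothetical (a real check would need a much fuller table), and the function names are invented.

```python
import re

# Hypothetical, deliberately tiny confusables map: fold look-alike
# characters onto one representative.
CONFUSABLE = str.maketrans({"1": "l", "I": "l", "0": "O"})


def skeleton(label: str) -> str:
    """Fold visually confusable characters together."""
    return label.translate(CONFUSABLE)


def strip_punct(label: str) -> str:
    """Drop everything except ASCII letters and digits."""
    return re.sub(r"[^A-Za-z0-9]", "", label)


def raw_findings(a: str, b: str) -> list[str]:
    """Name the raw-label checks that flag a pair of distinct labels."""
    findings: list[str] = []
    if a == b:
        return findings
    if skeleton(a) == skeleton(b):
        findings.append("visual-confusable")
    if a.lower() == b.lower():
        findings.append("case-only")
    if strip_punct(a) == strip_punct(b):
        findings.append("punctuation-only")
    return findings


print(raw_findings("A1", "AI"))                         # ['visual-confusable']
print(raw_findings("FACTCHECK_API", "FACT_CHECK_API"))  # ['punctuation-only']
print(raw_findings("Prod", "prod"))                     # ['case-only']
```

Each check normalizes both labels and compares the results, so adding another raw check means adding another normalizer.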

That is enough to test the shape of the idea. The repo now has a small validation corpus with safe, dangerous, and malformed configs, plus a narrow FastMCP source extractor for tool names. It is still not a full agent security scanner.

The useful output is not just "these strings are similar." It is "these strings are similar, the model can see or emit them, and confusing them could call the wrong tool, route work to the wrong place, or target the wrong environment."

Representative targets:

MCP servers with model-visible tools
router or handoff agents
graph-based agent workflows
skill/plugin systems with named routes
exact enum outputs consumed by parsers

The repo should carry the detailed CLI examples, fixtures, and tests. This note only needs the argument.

What Wireword is not

Wireword is not:

an agent framework
a prompt framework
a general security scanner
a replacement for schemas or constrained decoding
a proof that LLMs confuse every similar label
necessary when labels are hidden behind deterministic routing, internal IDs, or strict UI selection

It is a narrow lint pass for labels that become model-visible or human-visible control inputs.

Conclusion

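Telegraph codebooks suggest one concrete lint rule for LLM agent control words: keep valid labels far enough apart that a small error produces an invalid label, not a different valid one.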

Sources

1. Nelson E. Ross, How to Write Telegrams Properly (1928), “How Tolls Are Computed” and “Punctuation Marks.” Ross explains domestic body-word billing, cable/radiogram address billing, and the rule that requested punctuation marks were counted and charged as words.
2. Western Union Telegraph Company, The Proposed Union of the Telegraph and Postal Systems (1869). Western Union gives the 1866 New York-to-Boston tariff as 30 cents for ten words, exclusive of address and signature.
3. E. L. Bentley, “Codes: Their Nature and Manipulation”, transcribed by John McVey. Bentley describes the two-letter-difference rule and explains that it prevents a one-letter mutilation from silently becoming another valid codeword.
4. John McVey, “A.B.C. Telegraphic Codes, seven editions 1873-1936”. The page quotes the 1920 sixth edition on five-letter codewords built with at least a two-letter difference and notes the code’s attention to Morse similarities.

Using CHANGELOG.md as LLM session memory

Sat, 21 Feb 2026 00:00:00 +0000

Most LLM assistants don’t maintain memory between sessions. The standard workaround — a large CLAUDE.md or AGENTS.md with everything in it — breaks down quickly. What’s more, it duplicates other content in your repo, growing the documentation maintenance surface without adding value.

Lately I avoid this problem by treating CHANGELOG.md as my LLM’s memory — specifically the [Unreleased] section from the format standardized by Keep a Changelog, which becomes the primary mutable state document.

Why it works

Keep a Changelog defines a format most LLMs recognize on sight: an [Unreleased] section at the top, dated releases below. The convention needs no explanation: [Unreleased] is active work, dated entries are history.

That maps directly onto what you need for session continuity:

[Unreleased] — mutable, updated every session. Current state, active priorities, blockers, decisions pending. The model reads this first.
Dated entries — append-only history. Evidence that decisions happened and why. The model reads these to reconstruct context if it needs depth.

The AGENTS.md (or CLAUDE.md) file becomes stable configuration: conventions, file paths, source-of-truth map. It changes rarely. The CHANGELOG takes on everything that does change.

The session start instruction

One line at the top of AGENTS.md is enough:

Read CHANGELOG.md [Unreleased] at session start.

From there the model knows where it is, what’s in flight, and what to do next — without re-explanation.

What goes in [Unreleased]

I use explicit subsections:

## [Unreleased]

### Current State
One-paragraph snapshot. Where things stand right now.

### Active Priorities
Ordered list of what needs to happen next.

### In Progress
What the model started in the current session.

### Blocked
Anything waiting on external action.

### Decisions Needed
Open questions the model should surface, not resolve unilaterally.

### Recently Completed
What just shipped. Moves to a dated entry on the next commit.

The model updates [Unreleased] at the end of each session. The next session reads it cold and picks up cleanly.
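If you want to hand the model only the session state, and not the whole file, the [Unreleased] section is easy to cut out. The following is a minimal sketch, assuming the changelog uses `## ` headings for [Unreleased] and for each release, as Keep a Changelog does. The function name and the sample changelog are made up for the example.

```python
import re


def unreleased_section(changelog: str) -> str:
    """Return the body under '## [Unreleased]', up to the next '## ' heading."""
    match = re.search(
        r"^## \[Unreleased\][^\n]*\n(.*?)(?=^## |\Z)",
        changelog,
        flags=re.MULTILINE | re.DOTALL,
    )
    return match.group(1).strip() if match else ""


# Hypothetical changelog contents, invented for illustration.
sample = """# Changelog

## [Unreleased]

### Current State
Parser passes fixtures.

### Blocked
Waiting on API key.

## [0.1.0] - 2026-02-01
### Added
- Initial release.
"""

section = unreleased_section(sample)
print(section)
```

The `### ` subsection headings stay inside the extracted text because the lookahead only stops at a line that starts with exactly two hashes and a space.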

What this is not

This is not a replacement for good project documentation. Architectural decisions, integration details, and source-of-truth maps still belong in stable docs. The changelog is the session state layer, not the full context layer.

It also does not solve the problem of context window limits on large projects. It reduces the cost of context: the model loads a small, structured, current-state document instead of scanning a stale megafile.

Result

Sessions are shorter to start, more reliable to hand off, and easier to audit. The changelog does the work it was always supposed to do — track what changed and when — and the LLM does less redundant orientation work each time.

The format is well-understood, self-describing, and version-controlled. If you’re already using Keep a Changelog, the only addition is a discipline: update [Unreleased] at the end of each session.