<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/">
  <channel>
    <title>Workflow on Bradley Fidler</title>
    <link>https://brfid.github.io/tags/workflow/</link>
    <description>Recent content in Workflow on Bradley Fidler</description>
    <generator>Hugo -- 0.156.0</generator>
    <language>en-us</language>
    <lastBuildDate>Tue, 21 Apr 2026 00:00:00 +0000</lastBuildDate>
    <atom:link href="https://brfid.github.io/tags/workflow/index.xml" rel="self" type="application/rss+xml" />
    <item>
      <title>Wireword: Agent Control Words Should Be Hard to Misread</title>
      <link>https://brfid.github.io/posts/wireword-control-words/</link>
      <pubDate>Tue, 21 Apr 2026 00:00:00 +0000</pubDate>
      <guid>https://brfid.github.io/posts/wireword-control-words/</guid>
      <description>&lt;p&gt;This is a research note for &lt;a href=&#34;https://github.com/brfid/wireword&#34;&gt;Wireword&lt;/a&gt;, a small tool I am building to lint LLM agent control words.&lt;/p&gt;
&lt;p&gt;By control words, I mean short labels that can change what an agent does:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;route names&lt;/li&gt;
&lt;li&gt;tool names&lt;/li&gt;
&lt;li&gt;prompt macro names&lt;/li&gt;
&lt;li&gt;environment targets&lt;/li&gt;
&lt;li&gt;approval targets&lt;/li&gt;
&lt;li&gt;exact enum values the model must emit&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;The goal is narrow: make labels that control agent behavior harder to misread, miscopy, or misroute.&lt;/p&gt;
&lt;h2 id=&#34;of-words-and-tokens-being-expensive&#34;&gt;Of words and tokens being expensive&lt;/h2&gt;
&lt;p&gt;This started with &lt;a href=&#34;https://github.com/juliusbrussee/caveman&#34;&gt;caveman-style LLM output&lt;/a&gt;. The useful comparison is not really cavemen. It is telegraphese: compressed language for an expensive channel.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<p>This is a research note for <a href="https://github.com/brfid/wireword">Wireword</a>, a small tool I am building to lint LLM agent control words.</p>
<p>By control words, I mean short labels that can change what an agent does:</p>
<ul>
<li>route names</li>
<li>tool names</li>
<li>prompt macro names</li>
<li>environment targets</li>
<li>approval targets</li>
<li>exact enum values the model must emit</li>
</ul>
<p>The goal is narrow: make labels that control agent behavior harder to misread, miscopy, or misroute.</p>
<h2 id="of-words-and-tokens-being-expensive">Of words and tokens being expensive</h2>
<p>This started with <a href="https://github.com/juliusbrussee/caveman">caveman-style LLM output</a>. The useful comparison is not really cavemen. It is telegraphese: compressed language for an expensive channel.</p>
<p>Western Union did not bill like an LLM API, but the pressure was similar. Ordinary domestic telegrams were billed per chargeable word in the body, usually with a ten-word minimum; address, signature, and date were free, and each extra body word cost more.<sup id="fnref:1"><a href="#fn:1" class="footnote-ref" role="doc-noteref">1</a></sup> In 1866, a ten-word message from New York to Boston cost 30 cents.<sup id="fnref:2"><a href="#fn:2" class="footnote-ref" role="doc-noteref">2</a></sup></p>
<p>That maps to LLM work in two basic ways:</p>
<ul>
<li><strong>Token cost:</strong> shorter turns are cheaper.</li>
<li><strong>Context quality:</strong> shorter turns leave less low-information text in the conversation history.</li>
</ul>
<p>The second point is not just aesthetic. Long histories are not used perfectly. Irrelevant text can distract the model or bury the useful constraint.</p>
<p>But compression has a failure mode. If compressed labels become too similar, the model has less redundancy to recover the intended control word.</p>
<h2 id="learning-from-telegraphy">Learning from telegraphy</h2>
<p>I looked at other telegraph practices to see what might apply to LLM agents. Could Victorian engineers provide fresh insights for our changing world? No, except for one thing, sort of.</p>
<p>Most parallels are useful but general:</p>
<table>
  <thead>
      <tr>
          <th>Telegraph practice</th>
          <th>General pattern</th>
          <th>LLM-agent version</th>
      </tr>
  </thead>
  <tbody>
      <tr>
          <td><code>STOP</code> and spelled punctuation</td>
          <td>delimiters</td>
          <td>source/task boundaries</td>
      </tr>
      <tr>
          <td>repeat-back</td>
          <td>confirmation</td>
          <td>human approval gates</td>
      </tr>
      <tr>
          <td>service classes</td>
          <td>priority and cost tiers</td>
          <td>model routing / effort levels</td>
      </tr>
      <tr>
          <td>codebooks</td>
          <td>macros</td>
          <td>prompt libraries</td>
      </tr>
      <tr>
          <td>word-count checks</td>
          <td>validation</td>
          <td>output checks</td>
      </tr>
      <tr>
          <td>operators</td>
          <td>review and observability</td>
          <td>linters / traces</td>
      </tr>
      <tr>
          <td>private codes</td>
          <td>substitution</td>
          <td>PII masking</td>
      </tr>
  </tbody>
</table>
<p>These are durable information-management practices. They are worth remembering, but they do not justify a new tool by themselves.</p>
<p>The more specific lead was codeword design.</p>
<h2 id="compression-with-redundancy">Compression with redundancy</h2>
<p>Commercial telegraph codebooks had to balance compression and recoverability. A codeword had to be short enough to save money, but distinct enough that a damaged word did not silently become another valid word.</p>
<p>E. L. Bentley described the rule directly: good codewords should differ by at least two letters. Then a one-letter mutilation produces an invalid codeword, not the wrong valid codeword.<sup id="fnref:3"><a href="#fn:3" class="footnote-ref" role="doc-noteref">3</a></sup></p>
<p>The ABC Code used the same principle. John McVey&rsquo;s index quotes the 1920 sixth edition saying its five-letter codewords were built with at least a two-letter difference. The same note says the compilers considered Morse similarities and removed risky words.<sup id="fnref:4"><a href="#fn:4" class="footnote-ref" role="doc-noteref">4</a></sup></p>
<p>Useful rule:</p>
<blockquote>
<p>Good compression leaves enough redundancy to detect mistakes.</p>
</blockquote>
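<p>In modern terms, the two-letter rule asks for a minimum Hamming distance of two between valid codewords. A minimal Python sketch of that check follows. The codewords and the helper names <code>hamming</code>, <code>one_letter_apart</code>, and <code>decode</code> are invented for illustration; they are not taken from Bentley, the ABC Code, or Wireword.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">from itertools import combinations

# Hypothetical five-letter codewords, invented for this example.
CODEBOOK = {"BAFEL", "BAFIL", "CODUM", "DIRAX"}


def hamming(a, b):
    """Count the positions at which two equal-length words differ."""
    if len(a) != len(b):
        raise ValueError("codewords must have equal length")
    return sum(x != y for x, y in zip(a, b))


def one_letter_apart(codebook):
    """Return sorted pairs of codewords that differ in exactly one letter."""
    return [
        (a, b)
        for a, b in combinations(sorted(codebook), 2)
        if hamming(a, b) == 1
    ]


def decode(word, codebook):
    """Accept only exact matches; anything else is treated as mutilated."""
    return word if word in codebook else None


# BAFEL and BAFIL break the rule: one damaged letter turns one into the other.
print(one_letter_apart(CODEBOOK))

# Drop BAFIL and the same mutilation is rejected instead of silently accepted.
SAFE = CODEBOOK - {"BAFIL"}
print(decode("BAFIL", SAFE))
</code></pre></div><p>With <code>BAFIL</code> removed, every remaining pair differs in at least two letters, so a one-letter mutilation of <code>BAFEL</code> fails lookup rather than decoding as a different valid word.</p>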
<h2 id="the-llm-agent-version">The LLM agent version</h2>
<p>This problem is not unique to LLMs. Similar issues appear in APIs, command-line flags, protocol enums, medication names, service names, and airport codes.</p>
<p>LLM agents make the problem newly common because they combine:</p>
<ul>
<li>probabilistic language generation</li>
<li>exact symbolic control</li>
<li>natural-language prompts around short labels</li>
<li>tool calls and routes with real side effects</li>
</ul>
<p>Example labels:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="cl">A1
</span></span><span class="line"><span class="cl">AI
</span></span><span class="line"><span class="cl">Al
</span></span><span class="line"><span class="cl">prod
</span></span><span class="line"><span class="cl">production
</span></span><span class="line"><span class="cl">live
</span></span><span class="line"><span class="cl">docs.api
</span></span><span class="line"><span class="cl">doc.api
</span></span><span class="line"><span class="cl">FACTCHECK_API
</span></span><span class="line"><span class="cl">FACT_CHECK_API
</span></span></code></pre></div><p>These are not just strings. In an agent system, they may route work, call tools, select environments, expand macros, approve targets, or satisfy exact enum values.</p>
<p>The risk boundary is narrow. Similar labels matter when three conditions hold:</p>
<ul>
<li>the label is visible to the model or copied through natural language</li>
<li>the model or a human can choose or emit the label</li>
<li>downstream code treats the label as an exact control input</li>
</ul>
<p>A wrong valid label is worse than an invalid label. Invalid labels can fail validation. Wrong valid labels can pass validation and trigger the wrong action.</p>
<p>This matters less when routing is deterministic, internal IDs are hidden from the model, schemas constrain the choice, or a UI forces selection from canonical options.</p>
<p>So Wireword should not only ask whether two strings are similar. It should ask:</p>
<ul>
<li>What kind of label is this?</li>
<li>Can the model emit it?</li>
<li>Does a parser require an exact match?</li>
<li>What happens if the wrong label is chosen?</li>
<li>Does it target production or another external system?</li>
</ul>
<h3 id="generic-check-vs-agent-aware-check">Generic check vs agent-aware check</h3>
<p>Generic similarity check:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="cl">docs.api / doc.api
</span></span><span class="line"><span class="cl">Reason: edit distance 1.
</span></span></code></pre></div><p>Agent-aware check:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="cl">CRITICAL docs.api / doc.api
</span></span><span class="line"><span class="cl">Reason: route-name collision across different effects.
</span></span><span class="line"><span class="cl">Risk: read-only route is one edit away from external-write route.
</span></span><span class="line"><span class="cl">Fix: rename to ROUTE_DOCS_REVIEW and ROUTE_DOCS_PUBLISH.
</span></span></code></pre></div><p>Generic similarity check:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="cl">prod / production / live
</span></span><span class="line"><span class="cl">Reason: related strings.
</span></span></code></pre></div><p>Agent-aware check:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-text" data-lang="text"><span class="line"><span class="cl">CRITICAL prod / production / live
</span></span><span class="line"><span class="cl">Reason: multiple production-like environment labels.
</span></span><span class="line"><span class="cl">Risk: agent may choose an inconsistent deployment target.
</span></span><span class="line"><span class="cl">Fix: use ENV_PRODUCTION as the only valid production label.
</span></span></code></pre></div><p>That is the product line: do not only lint strings. Lint control words by the action they can trigger.</p>
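<p>The difference between the two checks can be sketched in a few lines of Python. This is an illustration of the idea, not Wireword&rsquo;s implementation: the <code>Label</code> fields, the <code>lint</code> function, the <code>WARNING</code> level, and the choice of which route is read-only are all assumptions made for the example.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Label:
    name: str
    kind: str    # hypothetical field, e.g. "route", "tool", "environment"
    effect: str  # hypothetical field, e.g. "read-only", "external-write"


def edit_distance(a, b):
    """Levenshtein distance via the standard row-by-row dynamic program."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,               # delete ca
                current[j - 1] + 1,            # insert cb
                previous[j - 1] + (ca != cb),  # substitute, or keep a match
            ))
        previous = current
    return previous[-1]


def lint(labels):
    """Flag edit-distance-one pairs and escalate when their effects differ."""
    findings = []
    for a, b in combinations(labels, 2):
        if edit_distance(a.name, b.name) != 1:
            continue
        severity = "CRITICAL" if a.effect != b.effect else "WARNING"
        findings.append((severity, a.name, b.name))
    return findings


LABELS = [
    Label("docs.api", "route", "read-only"),
    Label("doc.api", "route", "external-write"),
    Label("search", "tool", "read-only"),
]

for severity, first, second in lint(LABELS):
    print(severity, first, "/", second)  # prints: CRITICAL docs.api / doc.api
</code></pre></div><p>The string comparison is the generic part. The severity comes from the metadata attached to each label, which is why the same one-edit pair would only rate a warning if both routes were read-only.</p>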
<h3 id="current-prototype-and-v1-plan">Current prototype and V1 plan</h3>
<p>The tool is <a href="https://github.com/brfid/wireword">Wireword</a>. V1 should stay small.</p>
<p>The current prototype now checks both layers:</p>
<ul>
<li><strong>raw labels:</strong> visual confusables, edit-distance-one pairs, case-only differences, punctuation-only differences, plural/stem collisions, and production-like aliases</li>
<li><strong>agent-aware labels:</strong> routes, tools, named agent handoffs, approval targets, macros, profiles, production-like environments, and exact enum values the model must emit</li>
</ul>
<p>That is enough to test the shape of the idea. The repo now has a small validation corpus with safe, dangerous, and malformed configs, plus a narrow FastMCP source extractor for tool names. It is still not a full agent security scanner.</p>
<p>The useful output is not just <code>these strings are similar</code>. It is <code>these strings are similar, the model can see or emit them, and confusing them could call the wrong tool, route work to the wrong place, or target the wrong environment</code>.</p>
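<p>Several of the raw-label checks (visual confusables, case-only differences, punctuation-only differences) can be approximated by folding each label into a comparison key and grouping labels that share one. The sketch below is an assumption about how such a pass might look, not the prototype&rsquo;s code; the folding table and the names <code>skeleton</code> and <code>collisions</code> are hypothetical.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">from collections import defaultdict

# Assumed folding table: characters that are easy to misread for one another.
CONFUSABLE = str.maketrans({"1": "l", "I": "l", "0": "O"})


def skeleton(label):
    """Fold confusable characters, case, and punctuation into one key."""
    folded = label.translate(CONFUSABLE).lower()
    return "".join(ch for ch in folded if ch.isalnum())


def collisions(labels):
    """Group labels whose keys match; keep only groups with a real clash."""
    groups = defaultdict(list)
    for label in labels:
        groups[skeleton(label)].append(label)
    # Every group holds at least one label, so this keeps shared keys only.
    return [group for group in groups.values() if len(group) != 1]


LABELS = ["A1", "AI", "Al", "docs.api", "doc.api",
          "FACTCHECK_API", "FACT_CHECK_API"]

for group in collisions(LABELS):
    print(" / ".join(group))
</code></pre></div><p>This groups <code>A1</code>, <code>AI</code>, and <code>Al</code> together, and likewise <code>FACTCHECK_API</code> and <code>FACT_CHECK_API</code>. It does not catch <code>docs.api</code> and <code>doc.api</code>, which differ by a real letter and need a separate edit-distance check.</p>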
<p>Representative targets:</p>
<ul>
<li>MCP servers with model-visible tools</li>
<li>router or handoff agents</li>
<li>graph-based agent workflows</li>
<li>skill/plugin systems with named routes</li>
<li>exact enum outputs consumed by parsers</li>
</ul>
<p>The repo should carry the detailed CLI examples, fixtures, and tests. This note only needs the argument.</p>
<h3 id="what-wireword-is-not">What Wireword is not</h3>
<p>Wireword is not:</p>
<ul>
<li>an agent framework</li>
<li>a prompt framework</li>
<li>a general security scanner</li>
<li>a replacement for schemas or constrained decoding</li>
<li>a proof that LLMs confuse every similar label</li>
<li>necessary when labels are hidden behind deterministic routing, internal IDs, or strict UI selection</li>
</ul>
<p>It is a narrow lint pass for labels that become model-visible or human-visible control inputs.</p>
<h2 id="conclusion">Conclusion</h2>
<p>Telegraph codebooks suggest one concrete lint for LLM agent control words: keep labels far enough apart that a single slip produces an invalid label, not a different valid one.</p>
<h2 id="sources">Sources</h2>
<div class="footnotes" role="doc-endnotes">
<hr>
<ol>
<li id="fn:1">
<p>Nelson E. Ross, <a href="https://en.wikisource.org/wiki/How_to_Write_Telegrams_Properly"><em>How to Write Telegrams Properly</em></a> (1928), &ldquo;How Tolls Are Computed&rdquo; and &ldquo;Punctuation Marks.&rdquo; Ross explains domestic body-word billing, cable/radiogram address billing, and the rule that requested punctuation marks were counted and charged as words.&#160;<a href="#fnref:1" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:2">
<p>Western Union Telegraph Company, <a href="https://www.gutenberg.org/ebooks/62214.html.images"><em>The Proposed Union of the Telegraph and Postal Systems</em></a> (1869). Western Union gives the 1866 New York-to-Boston tariff as 30 cents for ten words, exclusive of address and signature.&#160;<a href="#fnref:2" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:3">
<p>E. L. Bentley, <a href="https://www.jmcvey.net/cable/harmsworth_2.htm">&ldquo;Codes: Their Nature and Manipulation&rdquo;</a>, transcribed by John McVey. Bentley describes the two-letter-difference rule and explains that it prevents a one-letter mutilation from silently becoming another valid codeword.&#160;<a href="#fnref:3" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
<li id="fn:4">
<p>John McVey, <a href="https://jmcvey.net/cable/scans/ABC.htm">&ldquo;A.B.C. Telegraphic Codes, seven editions 1873-1936&rdquo;</a>. The page quotes the 1920 sixth edition on five-letter codewords built with at least a two-letter difference and notes the code&rsquo;s attention to Morse similarities.&#160;<a href="#fnref:4" class="footnote-backref" role="doc-backlink">&#x21a9;&#xfe0e;</a></p>
</li>
</ol>
</div>
]]></content:encoded>
    </item>
    <item>
      <title>Using CHANGELOG.md as LLM session memory</title>
      <link>https://brfid.github.io/posts/changelog-as-llm-memory/</link>
      <pubDate>Sat, 21 Feb 2026 00:00:00 +0000</pubDate>
      <guid>https://brfid.github.io/posts/changelog-as-llm-memory/</guid>
      <description>&lt;p&gt;Most LLM assistants don&amp;rsquo;t maintain memory between sessions. The standard workaround — a large &lt;code&gt;CLAUDE.md&lt;/code&gt; or &lt;code&gt;AGENTS.md&lt;/code&gt; with everything in it — breaks down quickly. What&amp;rsquo;s more, it duplicates other content in your repo, growing the documentation maintenance surface without adding value.&lt;/p&gt;
&lt;p&gt;Lately I avoid this problem by treating &lt;code&gt;CHANGELOG.md&lt;/code&gt; as my LLM&amp;rsquo;s memory — specifically the &lt;code&gt;[Unreleased]&lt;/code&gt; section from the format standardized by &lt;a href=&#34;https://keepachangelog.com/&#34;&gt;Keep a Changelog&lt;/a&gt;, which becomes the primary mutable state document.&lt;/p&gt;</description>
      <content:encoded><![CDATA[<p>Most LLM assistants don&rsquo;t maintain memory between sessions. The standard workaround — a large <code>CLAUDE.md</code> or <code>AGENTS.md</code> with everything in it — breaks down quickly. What&rsquo;s more, it duplicates other content in your repo, growing the documentation maintenance surface without adding value.</p>
<p>Lately I avoid this problem by treating <code>CHANGELOG.md</code> as my LLM&rsquo;s memory — specifically the <code>[Unreleased]</code> section from the format standardized by <a href="https://keepachangelog.com/">Keep a Changelog</a>, which becomes the primary mutable state document.</p>
<h2 id="why-it-works">Why it works</h2>
<p><a href="https://keepachangelog.com/">Keep a Changelog</a> defines a format most LLMs recognize on sight: an <code>[Unreleased]</code> section at the top, dated releases below. The convention is easy to read: <code>[Unreleased]</code> is active work, dated entries are history.</p>
<p>That maps directly onto what you need for session continuity:</p>
<ul>
<li><strong><code>[Unreleased]</code></strong> — mutable, updated every session. Current state, active priorities, blockers, decisions pending. The model reads this first.</li>
<li><strong>Dated entries</strong> — append-only history. Evidence that decisions happened and why. The model reads these to reconstruct context if it needs depth.</li>
</ul>
<p>The <code>AGENTS.md</code> (or <code>CLAUDE.md</code>) file becomes stable configuration: conventions, file paths, and a source-of-truth map. It changes rarely. <code>CHANGELOG.md</code> takes on everything that does change.</p>
<h2 id="the-session-start-instruction">The session start instruction</h2>
<p>One line at the top of <code>AGENTS.md</code> is enough:</p>
<pre tabindex="0"><code>Read CHANGELOG.md [Unreleased] at session start.
</code></pre><p>From there the model knows where it is, what&rsquo;s in flight, and what to do next — without re-explanation.</p>
<h2 id="what-goes-in-unreleased">What goes in [Unreleased]</h2>
<p>I use explicit subsections:</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-markdown" data-lang="markdown"><span class="line"><span class="cl"><span class="gu">## [Unreleased]
</span></span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="gu">### Current State
</span></span></span><span class="line"><span class="cl">One-paragraph snapshot. Where things stand right now.
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="gu">### Active Priorities
</span></span></span><span class="line"><span class="cl">Ordered list of what needs to happen next.
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="gu">### In Progress
</span></span></span><span class="line"><span class="cl">What the model started in the current session.
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="gu">### Blocked
</span></span></span><span class="line"><span class="cl">Anything waiting on external action.
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="gu">### Decisions Needed
</span></span></span><span class="line"><span class="cl">Open questions the model should surface, not resolve unilaterally.
</span></span><span class="line"><span class="cl">
</span></span><span class="line"><span class="cl"><span class="gu">### Recently Completed
</span></span></span><span class="line"><span class="cl">What just shipped. Moves to a dated entry on the next commit.
</span></span></code></pre></div><p>The model updates <code>[Unreleased]</code> at the end of each session. The next session reads it cold and picks up cleanly.</p>
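<p>If a script or hook needs to hand only that section to the model, pulling it out takes a few lines. The Python sketch below assumes the Keep a Changelog layout, where <code>## [Unreleased]</code> is a level-two heading and the next level-two heading starts a dated release. The function name <code>unreleased_section</code> and the sample changelog are hypothetical.</p>
<div class="highlight"><pre tabindex="0" class="chroma"><code class="language-python" data-lang="python">def unreleased_section(changelog_text):
    """Return the text from '## [Unreleased]' up to the next '## ' heading."""
    section = []
    inside = False
    for line in changelog_text.splitlines():
        if line.startswith("## "):
            if inside:
                break  # the next release heading ends the section
            inside = line.strip() == "## [Unreleased]"
        if inside:
            section.append(line)
    return "\n".join(section).strip()


# Invented sample changelog, used only to exercise the function.
SAMPLE = """# Changelog

## [Unreleased]

### Current State
Parser rewrite half done.

## [1.2.0] - 2026-02-01
- Added export command.
"""

print(unreleased_section(SAMPLE))
</code></pre></div><p>Subsection headings such as <code>### Current State</code> begin with three hashes, so they stay inside the extracted block, and the dated entries below are left out.</p>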
<h2 id="what-this-is-not">What this is not</h2>
<p>This is not a replacement for good project documentation. Architectural decisions, integration details, and source-of-truth maps still belong in stable docs. The changelog is the <em>session state layer</em>, not the full context layer.</p>
<p>It also does not remove context window limits on large projects. It only reduces the cost of context: the model loads a small, structured, current-state document instead of scanning a stale megafile.</p>
<h2 id="result">Result</h2>
<p>Sessions are quicker to start, more reliable to hand off, and easier to audit. The changelog does the work it was always supposed to do, tracking what changed and when, and the LLM does less redundant orientation work each time.</p>
<p>The format is well-understood, self-describing, and version-controlled. If you&rsquo;re already using Keep a Changelog, the only addition is a discipline: update <code>[Unreleased]</code> at the end of each session.</p>
]]></content:encoded>
    </item>
  </channel>
</rss>
