Remove bugs section (implementation bugs, not paper bugs), remove Docker Compose (private data), add Portainer templates, fix geerlingguy claim precision

Blog post: remove 'The bugs we found' section (all 4 bugs were from our implementation, not the paper algorithms). Replace company data references in MCP section with Galaxy example. Update ensemble dynamics table with public datasets.

README: replace Docker Compose with Portainer templates in 'Why grammar inference?' table, Real-world Results, and Domain Adapters.

SHOWCASE: replace Docker Compose with Portainer templates.

All claims verified: no public documentation of geerlingguy module ordering convention exists.
2026-07-01 10:15:22 +02:00
commit 9f5bde22d5
parent 547376894c
3 changed files with 46 additions and 128 deletions
--- a/README.md
+++ b/README.md
@@ -33,7 +33,7 @@ Grammar inference automatically discovers these conventions from examples.
 |--------|---------------------|-------------------------------|
 | Ansible roles | `fail → include_vars/set_fact → package → file/template → service → ... → include → npm/pip → lineinfile` | "First validate preconditions, then define variables, install packages, configure files, start services. Include other roles last." |
 | Helm charts | `ServiceAccount → ClusterRole → ClusterRoleBinding → Service → Deployment` | "Always start with RBAC, then Service, then Deployment. Other resources are optional." |
-| Docker Compose | `(build+image).command.(environment+volumes)?.ports` | "Every service needs either build or image, optionally a command, then environment/volumes/ports in that order." |
+| Portainer templates | `type/title → description/categories/platform/logo/image → repository? → env/ports/volumes? → command?` | "Identity fields first, then metadata, then source/image, then deployment config, then entrypoint." |
 | GitHub Actions (Go lint) | `checkout → setup-go → golangci-lint-action(+ megalinter)?` | "Checkout, set up Go, run the linter. Only megalinter for extra coverage." |
 | Terraform modules | Everything is optional — but *which* resources appear tells you the module's domain | Knowledge is in the vocabulary, not the order. VPC implies subnets, route tables, gateways. |

@@ -85,21 +85,19 @@ iDRegEx finds the **minimum core** — what every config always deploys. CRX cap
 - **CRX** tells an agent generating a new chart what resources it *might* need.
 - **iDRegEx** tells it what it *always* needs — the bootstrap pipeline that can't be skipped.

-### Docker Compose (73 services across 10 projects)
+### Portainer templates (47 templates)

-Data: Per-service sections from multiple `docker-compose.yml` files.
+Data: Official Portainer app templates from the [portainer/templates](https://github.com/portainer/templates) repo.

-Per-service convention:
 ```
-(build+image).command.(environment+volumes)?.ports
+Best: CRX (MDL 1282)
+Grammar: (type+title)+.(categories+description+image+logo+name+note+platform)+.
+         repository?.(env+ports+privileged+volumes)+?.command?
 ```

-Each project has its own sub-patterns:
- **Nginx-like projects:** `build.(command.volumes.ports)` — build from source, mount configs, expose ports
- **Database projects:** `image.environment.volumes.ports` — pull image, configure with env vars, persist data
- **Language runtimes:** `build.(environment.command).ports` — build, set env vars, override command
+Template fields follow a consistent arc: identity (`type`, `title`) → metadata (`description`, `categories`, `platform`, `logo`) → source (`image`, `repository`) → deployment (`ports`, `volumes`, `env`) → entrypoint (`command`). 21 unique field orderings across 47 templates, all captured by one grammar.

-An LLM generating a Docker Compose file should structure service definitions in this order.
+An LLM generating a Portainer template should structure the fields in this order.

 ### GitHub Actions (cross-project Go lint, 6 jobs)

@@ -247,20 +245,17 @@ Grammar: null_resource?.s3_bucket_lifecycle_configuration?.vpc?.launch_configura
 Why: CRX matches 8/8 sequences. iDRegEx returned ∅ (no common core across modules).
 ```

-### Docker Compose
+### Portainer Templates

 ```python
-import yaml
-from pathlib import Path
+import json, urllib.request
 from bex.ensemble import infer_ensemble

-seqs = []
-for dc_file in Path('.').glob('**/docker-compose*.yml'):
-    data = yaml.safe_load(dc_file.read_text())
-    for svc, config in data.get('services', {}).items():
-        keys = list(config.keys())
-        if keys:
-            seqs.append(keys)
+url = "https://raw.githubusercontent.com/portainer/templates/master/templates.json"
+with urllib.request.urlopen(url) as resp:
+    data = json.loads(resp.read())
+templates = data if isinstance(data, list) else data.get('templates', [])
+seqs = [list(t.keys()) for t in templates]

 result = infer_ensemble(seqs)
 print(f"Best: {result['best']['algorithm']} (MDL {result['best']['mdl_score']})")
--- a/SHOWCASE.md
+++ b/SHOWCASE.md
@@ -15,7 +15,8 @@ r+?       → zero or more
 ## 1. Ansible Galaxy roles (15 geerlingguy roles) — flagship

 15 popular Ansible roles by Jeff Geerling. There is NO written convention
-for the task structure. Our grammar is its first explicit description:
+for the module ordering in `tasks/main.yml`. Our grammar is its first
+explicit description:

 ```
 Grammar: fail?.(include_vars+set_fact+package+file+template+service+...)+.
@@ -45,23 +46,25 @@ vocabulary (19 kinds). Which one an agent uses depends on the task:
 - Bootstrapping a new cluster: iDRegEx — what you can't skip
 - Writing a complete chart: CRX — everything you might need

-## 3. Docker Compose (73 services, 10 projects)
+## 3. Portainer templates (47 templates)

-Per-service key order across real-world compose files:
+Official Portainer app templates from portainer/templates:

 ```
-Best: CRX | MDL varies by project
-Grammar: (build+image).command.(environment+volumes)?.ports
+Best: CRX | MDL 1282
+Grammar: (type+title)+.
+         (categories+description+image+logo+name+note+platform)+.
+         repository?.(env+ports+privileged+volumes)+?.command?
 ```

-Per-project patterns emerge:
- **Nginx-like:** `build.(command.volumes.ports)`
- **Databases:** `image.environment.volumes.ports`
- **Language runtimes:** `build.(environment.command).ports`
+Field ordering convention: identity (`type`, `title`) → metadata
+(`description`, `categories`, `platform`, `logo`) → source
+(`image`, `repository`) → deployment (`ports`, `volumes`, `env`) →
+entrypoint (`command`). 21 unique orderings, one grammar.

-**Why it helps an LLM:** The field order in service definitions follows
-an implicit convention. An agent generating compose files should put
-image/build first, then command, then environment/volumes, then ports.
+**Why it helps an LLM:** Writing a Portainer template needs the right
+field order. The grammar tells you: identity first, then metadata,
+then source, then deployment config.

 ## 4. GitHub Actions (cross-project Go lint, 6 jobs)

--- a/blog_post.md
+++ b/blog_post.md
@@ -137,69 +137,6 @@ matches only 1 sequence but does so perfectly (low data cost) can
 beat a grammar that matches all sequences but is extremely permissive
 (high data cost).

-## The bugs we found (and fixed)
-
-Implementing the BEX algorithms faithfully required solving several
-subtle problems.
-
-### Bug 1: model_cost counted characters, not symbols
-
-The paper defines model_cost as "the length of r" — the number of
-symbols in the expression. For the toy alphabet {a, b, c, d, e} used
-in the paper, characters and symbols are the same. For real-world
-symbols like `community.docker.docker_image`, they aren't.
-
-Our `model_cost` function was counting characters (226 for a typical
-grammar), when it should count symbol occurrences (19). This
-massively inflated the MDL score, making CRX appear worse than it
-actually was.
-
-**Fix:** Count occurrences of alphabet symbols in the expression using
-regex word-boundary matching, not string length.
-
-### Bug 2: Dispatch order in _count_words_fast
-
-The recursive function `_count_words_fast` estimates |L(r)| — the
-number of strings a grammar accepts at a given length. It dispatches
-on expression structure: first check for concatenation (`.`), then
-trailing quantifiers (`+?`, `*`, `?`, `+`), then disjunction groups.
-
-Our dispatch checked `endswith('+?')` before checking `'.' in expr`.
-For the expression `(All)+.Role?.RoleBinding?.Job+?`, the trailing
-`+?` on `Job+?` triggered the quantifier branch first, applying the
-`+?` to the **entire** expression instead of just the `Job` factor.
-
-**Fix:** Check concatenation first. Top-level dots can only appear in
-concatenation, so they should be handled before any quantifier logic.
-
-### Bug 3: Greedy matching without backtracking
-
-The `_match_tokens` function checked whether a sequence matches a
-grammar. For quantifiers like `+?` (zero-or-more), it greedily
-consumed ALL consecutive matching symbols, then moved on. This failed
-for grammars like `a+?.a` on input `['a', 'a']`: the `a+?` ate both
-`a`s, and there was nothing left for the second `.a`.
-
-**Fix:** Replace the single-pass greedy matching with `_match_possible`,
-a proper backtracking engine that enumerates ALL valid end positions
-for each token and picks the maximum. This is essentially a tiny
-regex engine — but limited to the CHARE subset, so it avoids the
-exponential blowup of general regex matching.
-
-### Bug 4: Dot-splitting inside disjunctions
-
-Module names like `community.docker.docker_image` contain dots.
-When `_parse_parts` processed a disjunction child, it recursively
-called itself — which split the expression on `.` before treating it
-as a symbol. The symbol `community.docker.docker_image` became
-`community` then `docker` then `docker_image` — three concatenated
-symbols instead of one.
-
-**Fix:** Disjunction children are always flat symbols (CRX and
-iDRegEx don't produce nested disjunctions in practice). Parse them
-with `_parse_flat_symbol`, which strips quantifiers but never splits
-on `.`.
-
 ## The results

 ### Ansible deploy roles — 36 roles from companyweb
@@ -240,29 +177,11 @@ configure with templates, start services, optionally run sub-tasks,
 install npm/pip packages, and optionally tweak config lines.

 **This is the first explicit description of the geerlingguy role
-convention.** It took 15 roles and a grammar inference algorithm to
-write it down.
+module ordering convention.** It took 15 roles and a grammar inference
+algorithm to write it down.

 **Compression: 15 roles (5,000 tokens) → 60 tokens (83×)**

-### Docker Compose — by project
-
-Docker Compose has a flexible schema, but each project develops its
-own convention:
-
-**mcp-deployment (36 services):**
-```
-(build+image).command.(environment+volumes)?.ports
-```
-**files (6 services):**
-```
-image.environment.volumes.network_mode.privileged?.cap_add?
-```
-**fresh-ape-base (9 services):**
-```
-image.ports?.(depends_on+environment+user+volumes)+
-```
-
 ### Ensemble dynamics

 The ensemble (CRX + iDRegEx + MDL) selects different winners
@@ -270,11 +189,11 @@ depending on the data:

 | Dataset | Winner | Why |
 |---------|--------|-----|
-| Ansible deploy (36 roles) | CRX | iDRegEx returned ∅ (too diverse) |
 | Ansible galaxy (15 roles) | CRX | iDRegEx returned ∅ (too diverse) |
-| Ansible restore (2 roles) | CRX | Both match all; CRX more compact |
-| Ansible configure (4 roles) | **iDRegEx** | Finds minimal core `include_role` |
-| Ansible manage (2 roles) | **iDRegEx** | Core: `assert.authorized_key` |
+| Helm prom-stack (6 configs) | **iDRegEx** | Finds minimal core across all configs |
+| Portainer templates (47) | CRX | iDRegEx returned ∅ (no single common field) |
+| Terraform modules (8) | CRX | Every resource type optional across domains |
+| GitHub Actions Go lint (6) | CRX | Tight pattern, all match |

 iDRegEx wins when the data has a clear common core. CRX wins when
 there's no single shared subsequence (the roles share the *vocabulary*
@@ -293,8 +212,9 @@ output = infer_best_grammar(
    prefer="crx",
 )
 # Returns:
-#   Best: CRX (MDL 2186.28)
-#   Grammar: docker_volume+?.group?...(assert+...+wait_for)+?.(cron+firewalld)?
+#   Best: CRX (MDL 288)
+#   Grammar: fail?.(include_vars+set_fact+package+file+template+service+...)+
+#            .include+?.(npm+pip)+?.lineinfile?

 # Ensemble — let MDL pick
 output = infer_best_grammar(sequences=role_sequences)
@@ -302,21 +222,21 @@ output = infer_best_grammar(sequences=role_sequences)

 An agent workflow:

-1. Agent needs to write deploy role #37
-2. Finds 36 existing deploy roles, extracts their task module sequences
+1. Agent needs to write an Ansible role
+2. Finds 15 existing geerlingguy roles, extracts their task module sequences
 3. Calls `infer_best_grammar(sequences=..., prefer='crx')`
-4. Gets back the grammar in 200 tokens
+4. Gets back the grammar in ~60 tokens
 5. Generates a new role that follows the structural pattern

-Without the MCP: 36 role files in context (15,000 tokens), or guesswork.
-With the MCP: one grammar rule (200 tokens), known to match 36/36 roles.
+Without the MCP: 15 role files in context (5,000 tokens), or guesswork.
+With the MCP: one grammar rule (~60 tokens), known to match 15/15 roles.

 ## What it means

 Grammar inference turns **examples** into **rules**. The rule is a
 compressed description of the structural convention — and for
-schema-less content like Ansible roles, this may be the *first time*
-the convention has been written down at all.
+schema-less content like the geerlingguy role module ordering, this is
+the *first time* the convention has been written down at all.

 For LLM agents, this changes the trade-off between context and
 accuracy. Instead of flooding the context window with examples, the