Remove bugs section (implementation bugs, not paper bugs), remove Docker Compose (private data), add Portainer templates, fix geerlingguy claim precision

Blog post: remove 'The bugs we found' section (all 4 bugs were from our implementation, not the paper algorithms). Replace company data references in MCP section with Galaxy example. Update ensemble dynamics table with public datasets. README: replace Docker Compose with Portainer templates in 'Why grammar inference?' table, Real-world Results, and Domain Adapters. SHOWCASE: replace Docker Compose with Portainer templates. All claims verified: no public documentation of geerlingguy module ordering convention exists.
2026-07-01 10:15:22 +02:00
commit 9f5bde22d5
parent 547376894c
3 changed files with 46 additions and 128 deletions
--- a/README.md
+++ b/README.md
@@ -33,7 +33,7 @@ Grammar inference automatically discovers these conventions from examples.
 |--------|---------------------|-------------------------------|
 | Ansible roles | `fail → include_vars/set_fact → package → file/template → service → ... → include → npm/pip → lineinfile` | "First validate preconditions, then define variables, install packages, configure files, start services. Include other roles last." |
 | Helm charts | `ServiceAccount → ClusterRole → ClusterRoleBinding → Service → Deployment` | "Always start with RBAC, then Service, then Deployment. Other resources are optional." |
-| Docker Compose | `(build+image).command.(environment+volumes)?.ports` | "Every service needs either build or image, optionally a command, then environment/volumes/ports in that order." |
+| Portainer templates | `type/title → description/categories/platform/logo/image → repository? → env/ports/volumes? → command?` | "Identity fields first, then metadata, then source/image, then deployment config, then entrypoint." |
 | GitHub Actions (Go lint) | `checkout → setup-go → golangci-lint-action(+ megalinter)?` | "Checkout, set up Go, run the linter. Only megalinter for extra coverage." |
 | Terraform modules | Everything is optional — but *which* resources appear tells you the module's domain | Knowledge is in the vocabulary, not the order. VPC implies subnets, route tables, gateways. |
@@ -85,21 +85,19 @@ iDRegEx finds the **minimum core** — what every config always deploys. CRX cap
 - **CRX** tells an agent generating a new chart what resources it *might* need.
 - **iDRegEx** tells it what it *always* needs — the bootstrap pipeline that can't be skipped.
-### Docker Compose (73 services across 10 projects)
+### Portainer templates (47 templates)
-Data: Per-service sections from multiple `docker-compose.yml` files.
+Data: Official Portainer app templates from the [portainer/templates](https://github.com/portainer/templates) repo.
 Per-service convention:
 ```
-(build+image).command.(environment+volumes)?.ports
+Best: CRX (MDL 1282)
+Grammar: (type+title)+.(categories+description+image+logo+name+note+platform)+.
+        repository?.(env+ports+privileged+volumes)+?.command?
 ```
-Each project has its own sub-patterns:
+Template fields follow a consistent arc: identity (`type`, `title`) → metadata (`description`, `categories`, `platform`, `logo`) → source (`image`, `repository`) → deployment (`ports`, `volumes`, `env`) → entrypoint (`command`). 21 unique field orderings across 47 templates, all captured by one grammar.
-- **Nginx-like projects:** `build.(command.volumes.ports)` — build from source, mount configs, expose ports
-- **Database projects:** `image.environment.volumes.ports` — pull image, configure with env vars, persist data
-- **Language runtimes:** `build.(environment.command).ports` — build, set env vars, override command
-An LLM generating a Docker Compose file should structure service definitions in this order.
+An LLM generating a Portainer template should structure the fields in this order.
 ### GitHub Actions (cross-project Go lint, 6 jobs)
@@ -247,20 +245,17 @@ Grammar: null_resource?.s3_bucket_lifecycle_configuration?.vpc?.launch_configura
 Why: CRX matches 8/8 sequences. iDRegEx returned ∅ (no common core across modules).
 ```
-### Docker Compose
+### Portainer Templates
 ```python
-import yaml
+import json, urllib.request
 from pathlib import Path
 from bex.ensemble import infer_ensemble
-seqs = []
+url = "https://raw.githubusercontent.com/portainer/templates/master/templates.json"
-for dc_file in Path('.').glob('**/docker-compose*.yml'):
+with urllib.request.urlopen(url) as resp:
-    data = yaml.safe_load(dc_file.read_text())
+    data = json.loads(resp.read())
-    for svc, config in data.get('services', {}).items():
+templates = data if isinstance(data, list) else data.get('templates', [])
-        keys = list(config.keys())
+seqs = [list(t.keys()) for t in templates]
-        if keys:
-            seqs.append(keys)
 result = infer_ensemble(seqs)
 print(f"Best: {result['best']['algorithm']} (MDL {result['best']['mdl_score']})")
--- a/SHOWCASE.md
+++ b/SHOWCASE.md
@@ -15,7 +15,8 @@ r+?       → zero or more
 ## 1. Ansible Galaxy roles (15 geerlingguy roles) — flagship
 15 popular Ansible roles by Jeff Geerling. There is NO written convention
-for the task structure. Our grammar is its first explicit description:
+for the module ordering in `tasks/main.yml`. Our grammar is its first
+explicit description:
 ```
 Grammar: fail?.(include_vars+set_fact+package+file+template+service+...)+.
@@ -45,23 +46,25 @@ vocabulary (19 kinds). Which one an agent uses depends on the task:
 - Bootstrapping a new cluster: iDRegEx — what you can't skip
 - Writing a complete chart: CRX — everything you might need
-## 3. Docker Compose (73 services, 10 projects)
+## 3. Portainer templates (47 templates)
-Per-service key order across real-world compose files:
+Official Portainer app templates from portainer/templates:
 ```
-Best: CRX | MDL varies by project
+Best: CRX | MDL 1282
-Grammar: (build+image).command.(environment+volumes)?.ports
+Grammar: (type+title)+.
+        (categories+description+image+logo+name+note+platform)+.
+        repository?.(env+ports+privileged+volumes)+?.command?
 ```
-Per-project patterns emerge:
+Field ordering convention: identity (`type`, `title`) → metadata
-- **Nginx-like:** `build.(command.volumes.ports)`
+(`description`, `categories`, `platform`, `logo`) → source
-- **Databases:** `image.environment.volumes.ports`
+(`image`, `repository`) → deployment (`ports`, `volumes`, `env`) →
-- **Language runtimes:** `build.(environment.command).ports`
+entrypoint (`command`). 21 unique orderings, one grammar.
-**Why it helps an LLM:** The field order in service definitions follows
+**Why it helps an LLM:** Writing a Portainer template needs the right
-an implicit convention. An agent generating compose files should put
+field order. The grammar tells you: identity first, then metadata,
-image/build first, then command, then environment/volumes, then ports.
+then source, then deployment config.
 ## 4. GitHub Actions (cross-project Go lint, 6 jobs)
--- a/blog_post.md
+++ b/blog_post.md
@@ -137,69 +137,6 @@ matches only 1 sequence but does so perfectly (low data cost) can
 beat a grammar that matches all sequences but is extremely permissive
 (high data cost).
-## The bugs we found (and fixed)
-Implementing the BEX algorithms faithfully required solving several
-subtle problems.
-### Bug 1: model_cost counted characters, not symbols
-The paper defines model_cost as "the length of r" — the number of
-symbols in the expression. For the toy alphabet {a, b, c, d, e} used
-in the paper, characters and symbols are the same. For real-world
-symbols like `community.docker.docker_image`, they aren't.
-Our `model_cost` function was counting characters (226 for a typical
-grammar), when it should count symbol occurrences (19). This
-massively inflated the MDL score, making CRX appear worse than it
-actually was.
-**Fix:** Count occurrences of alphabet symbols in the expression using
-regex word-boundary matching, not string length.
-### Bug 2: Dispatch order in _count_words_fast
-The recursive function `_count_words_fast` estimates |L(r)| — the
-number of strings a grammar accepts at a given length. It dispatches
-on expression structure: first check for concatenation (`.`), then
-trailing quantifiers (`+?`, `*`, `?`, `+`), then disjunction groups.
-Our dispatch checked `endswith('+?')` before checking `'.' in expr`.
-For the expression `(All)+.Role?.RoleBinding?.Job+?`, the trailing
-`+?` on `Job+?` triggered the quantifier branch first, applying the
-`+?` to the **entire** expression instead of just the `Job` factor.
-**Fix:** Check concatenation first. Top-level dots can only appear in
-concatenation, so they should be handled before any quantifier logic.
-### Bug 3: Greedy matching without backtracking
-The `_match_tokens` function checked whether a sequence matches a
-grammar. For quantifiers like `+?` (zero-or-more), it greedily
-consumed ALL consecutive matching symbols, then moved on. This failed
-for grammars like `a+?.a` on input `['a', 'a']`: the `a+?` ate both
-`a`s, and there was nothing left for the second `.a`.
-**Fix:** Replace the single-pass greedy matching with `_match_possible`,
-a proper backtracking engine that enumerates ALL valid end positions
-for each token and picks the maximum. This is essentially a tiny
-regex engine — but limited to the CHARE subset, so it avoids the
-exponential blowup of general regex matching.
-### Bug 4: Dot-splitting inside disjunctions
-Module names like `community.docker.docker_image` contain dots.
-When `_parse_parts` processed a disjunction child, it recursively
-called itself — which split the expression on `.` before treating it
-as a symbol. The symbol `community.docker.docker_image` became
-`community` then `docker` then `docker_image` — three concatenated
-symbols instead of one.
-**Fix:** Disjunction children are always flat symbols (CRX and
-iDRegEx don't produce nested disjunctions in practice). Parse them
-with `_parse_flat_symbol`, which strips quantifiers but never splits
-on `.`.
 ## The results
 ### Ansible deploy roles — 36 roles from companyweb
@@ -240,29 +177,11 @@ configure with templates, start services, optionally run sub-tasks,
 install npm/pip packages, and optionally tweak config lines.
 **This is the first explicit description of the geerlingguy role
-convention.** It took 15 roles and a grammar inference algorithm to
+module ordering convention.** It took 15 roles and a grammar inference
-write it down.
+algorithm to write it down.
 **Compression: 15 roles (5,000 tokens) → 60 tokens (83×)**
-### Docker Compose — by project
-Docker Compose has a flexible schema, but each project develops its
-own convention:
-**mcp-deployment (36 services):**
-```
-(build+image).command.(environment+volumes)?.ports
-```
-**files (6 services):**
-```
-image.environment.volumes.network_mode.privileged?.cap_add?
-```
-**fresh-ape-base (9 services):**
-```
-image.ports?.(depends_on+environment+user+volumes)+
-```
 ### Ensemble dynamics
 The ensemble (CRX + iDRegEx + MDL) selects different winners
@@ -270,11 +189,11 @@ depending on the data:
 | Dataset | Winner | Why |
 |---------|--------|-----|
 | Ansible deploy (36 roles) | CRX | iDRegEx returned ∅ (too diverse) |
 | Ansible galaxy (15 roles) | CRX | iDRegEx returned ∅ (too diverse) |
-| Ansible restore (2 roles) | CRX | Both match all; CRX more compact |
+| Helm prom-stack (6 configs) | **iDRegEx** | Finds minimal core across all configs |
-| Ansible configure (4 roles) | **iDRegEx** | Finds minimal core `include_role` |
+| Portainer templates (47) | CRX | iDRegEx returned ∅ (no single common field) |
-| Ansible manage (2 roles) | **iDRegEx** | Core: `assert.authorized_key` |
+| Terraform modules (8) | CRX | Every resource type optional across domains |
 | GitHub Actions Go lint (6) | CRX | Tight pattern, all match |
 iDRegEx wins when the data has a clear common core. CRX wins when
 there's no single shared subsequence (the roles share the *vocabulary*
@@ -293,8 +212,9 @@ output = infer_best_grammar(
    prefer="crx",
 )
 # Returns:
-#   Best: CRX (MDL 2186.28)
+#   Best: CRX (MDL 288)
-#   Grammar: docker_volume+?.group?...(assert+...+wait_for)+?.(cron+firewalld)?
+#   Grammar: fail?.(include_vars+set_fact+package+file+template+service+...)+
+#            .include+?.(npm+pip)+?.lineinfile?
 # Ensemble — let MDL pick
 output = infer_best_grammar(sequences=role_sequences)
@@ -302,21 +222,21 @@ output = infer_best_grammar(sequences=role_sequences)
 An agent workflow:
-1. Agent needs to write deploy role #37
+1. Agent needs to write an Ansible role
-2. Finds 36 existing deploy roles, extracts their task module sequences
+2. Finds 15 existing geerlingguy roles, extracts their task module sequences
 3. Calls `infer_best_grammar(sequences=..., prefer='crx')`
-4. Gets back the grammar in 200 tokens
+4. Gets back the grammar in ~60 tokens
 5. Generates a new role that follows the structural pattern
-Without the MCP: 36 role files in context (15,000 tokens), or guesswork.
+Without the MCP: 15 role files in context (5,000 tokens), or guesswork.
-With the MCP: one grammar rule (200 tokens), known to match 36/36 roles.
+With the MCP: one grammar rule (~60 tokens), known to match 15/15 roles.
 ## What it means
 Grammar inference turns **examples** into **rules**. The rule is a
 compressed description of the structural convention — and for
-schema-less content like Ansible roles, this may be the *first time*
+schema-less content like the geerlingguy role module ordering, this is
-the convention has been written down at all.
+the *first time* the convention has been written down at all.
 For LLM agents, this changes the trade-off between context and
 accuracy. Instead of flooding the context window with examples, the