Remove bugs section (implementation bugs, not paper bugs), remove Docker Compose (private data), add Portainer templates, fix geerlingguy claim precision
Blog post: remove 'The bugs we found' section (all 4 bugs were from our implementation, not the paper algorithms). Replace company data references in MCP section with Galaxy example. Update ensemble dynamics table with public datasets. README: replace Docker Compose with Portainer templates in 'Why grammar inference?' table, Real-world Results, and Domain Adapters. SHOWCASE: replace Docker Compose with Portainer templates. All claims verified: no public documentation of geerlingguy module ordering convention exists.
parent 547376894c
commit 9f5bde22d5
3 changed files with 46 additions and 128 deletions
35
README.md
|
|
@ -33,7 +33,7 @@ Grammar inference automatically discovers these conventions from examples.
|
||||||
|--------|---------------------|-------------------------------|
|
|--------|---------------------|-------------------------------|
|
||||||
| Ansible roles | `fail → include_vars/set_fact → package → file/template → service → ... → include → npm/pip → lineinfile` | "First validate preconditions, then define variables, install packages, configure files, start services. Include other roles last." |
|
| Ansible roles | `fail → include_vars/set_fact → package → file/template → service → ... → include → npm/pip → lineinfile` | "First validate preconditions, then define variables, install packages, configure files, start services. Include other roles last." |
|
||||||
| Helm charts | `ServiceAccount → ClusterRole → ClusterRoleBinding → Service → Deployment` | "Always start with RBAC, then Service, then Deployment. Other resources are optional." |
|
| Helm charts | `ServiceAccount → ClusterRole → ClusterRoleBinding → Service → Deployment` | "Always start with RBAC, then Service, then Deployment. Other resources are optional." |
|
||||||
| Docker Compose | `(build+image).command.(environment+volumes)?.ports` | "Every service needs either build or image, optionally a command, then environment/volumes/ports in that order." |
|
| Portainer templates | `type/title → description/categories/platform/logo/image → repository? → env/ports/volumes? → command?` | "Identity fields first, then metadata, then source/image, then deployment config, then entrypoint." |
|
||||||
| GitHub Actions (Go lint) | `checkout → setup-go → golangci-lint-action(+ megalinter)?` | "Checkout, set up Go, run the linter. Only megalinter for extra coverage." |
|
| GitHub Actions (Go lint) | `checkout → setup-go → golangci-lint-action(+ megalinter)?` | "Checkout, set up Go, run the linter. Only megalinter for extra coverage." |
|
||||||
| Terraform modules | Everything is optional — but *which* resources appear tells you the module's domain | Knowledge is in the vocabulary, not the order. VPC implies subnets, route tables, gateways. |
|
| Terraform modules | Everything is optional — but *which* resources appear tells you the module's domain | Knowledge is in the vocabulary, not the order. VPC implies subnets, route tables, gateways. |
|
||||||
|
|
||||||
|
|
@ -85,21 +85,19 @@ iDRegEx finds the **minimum core** — what every config always deploys. CRX cap
|
||||||
- **CRX** tells an agent generating a new chart what resources it *might* need.
|
- **CRX** tells an agent generating a new chart what resources it *might* need.
|
||||||
- **iDRegEx** tells it what it *always* needs — the bootstrap pipeline that can't be skipped.
|
- **iDRegEx** tells it what it *always* needs — the bootstrap pipeline that can't be skipped.
|
||||||
|
|
||||||
### Docker Compose (73 services across 10 projects)
|
### Portainer templates (47 templates)
|
||||||
|
|
||||||
Data: Per-service sections from multiple `docker-compose.yml` files.
|
Data: Official Portainer app templates from the [portainer/templates](https://github.com/portainer/templates) repo.
|
||||||
|
|
||||||
Per-service convention:
|
|
||||||
```
|
```
|
||||||
(build+image).command.(environment+volumes)?.ports
|
Best: CRX (MDL 1282)
|
||||||
|
Grammar: (type+title)+.(categories+description+image+logo+name+note+platform)+.
|
||||||
|
repository?.(env+ports+privileged+volumes)+?.command?
|
||||||
```
|
```
|
||||||
|
|
||||||
Each project has its own sub-patterns:
|
Template fields follow a consistent arc: identity (`type`, `title`) → metadata (`description`, `categories`, `platform`, `logo`) → source (`image`, `repository`) → deployment (`ports`, `volumes`, `env`) → entrypoint (`command`). 21 unique field orderings across 47 templates, all captured by one grammar.
|
||||||
- **Nginx-like projects:** `build.(command.volumes.ports)` — build from source, mount configs, expose ports
|
|
||||||
- **Database projects:** `image.environment.volumes.ports` — pull image, configure with env vars, persist data
|
|
||||||
- **Language runtimes:** `build.(environment.command).ports` — build, set env vars, override command
|
|
||||||
|
|
||||||
An LLM generating a Docker Compose file should structure service definitions in this order.
|
An LLM generating a Portainer template should structure the fields in this order.
|
||||||
|
|
||||||
### GitHub Actions (cross-project Go lint, 6 jobs)
|
### GitHub Actions (cross-project Go lint, 6 jobs)
|
||||||
|
|
||||||
|
|
@ -247,20 +245,17 @@ Grammar: null_resource?.s3_bucket_lifecycle_configuration?.vpc?.launch_configura
|
||||||
Why: CRX matches 8/8 sequences. iDRegEx returned ∅ (no common core across modules).
|
Why: CRX matches 8/8 sequences. iDRegEx returned ∅ (no common core across modules).
|
||||||
```
|
```
|
||||||
|
|
||||||
### Docker Compose
|
### Portainer Templates
|
||||||
|
|
||||||
```python
|
```python
|
||||||
import yaml
|
import json, urllib.request
|
||||||
from pathlib import Path
|
|
||||||
from bex.ensemble import infer_ensemble
|
from bex.ensemble import infer_ensemble
|
||||||
|
|
||||||
seqs = []
|
url = "https://raw.githubusercontent.com/portainer/templates/master/templates.json"
|
||||||
for dc_file in Path('.').glob('**/docker-compose*.yml'):
|
with urllib.request.urlopen(url) as resp:
|
||||||
    data = yaml.safe_load(dc_file.read_text())
|
    data = json.loads(resp.read())
|
||||||
    for svc, config in data.get('services', {}).items():
|
templates = data if isinstance(data, list) else data.get('templates', [])
|
||||||
        keys = list(config.keys())
|
seqs = [list(t.keys()) for t in templates]
|
||||||
        if keys:
|
|
||||||
            seqs.append(keys)
|
|
||||||
|
|
||||||
result = infer_ensemble(seqs)
|
result = infer_ensemble(seqs)
|
||||||
print(f"Best: {result['best']['algorithm']} (MDL {result['best']['mdl_score']})")
|
print(f"Best: {result['best']['algorithm']} (MDL {result['best']['mdl_score']})")
|
||||||
|
|
|
||||||
27
SHOWCASE.md
|
|
@ -15,7 +15,8 @@ r+? → zero or more
|
||||||
## 1. Ansible Galaxy roles (15 geerlingguy roles) — flagship
|
## 1. Ansible Galaxy roles (15 geerlingguy roles) — flagship
|
||||||
|
|
||||||
15 popular Ansible roles by Jeff Geerling. There is NO written convention
|
15 popular Ansible roles by Jeff Geerling. There is NO written convention
|
||||||
for the task structure. Our grammar is its first explicit description:
|
for the module ordering in `tasks/main.yml`. Our grammar is its first
|
||||||
|
explicit description:
|
||||||
|
|
||||||
```
|
```
|
||||||
Grammar: fail?.(include_vars+set_fact+package+file+template+service+...)+.
|
Grammar: fail?.(include_vars+set_fact+package+file+template+service+...)+.
|
||||||
|
|
@ -45,23 +46,25 @@ vocabulary (19 kinds). Which one an agent uses depends on the task:
|
||||||
- Bootstrapping a new cluster: iDRegEx — what you can't skip
|
- Bootstrapping a new cluster: iDRegEx — what you can't skip
|
||||||
- Writing a complete chart: CRX — everything you might need
|
- Writing a complete chart: CRX — everything you might need
|
||||||
|
|
||||||
## 3. Docker Compose (73 services, 10 projects)
|
## 3. Portainer templates (47 templates)
|
||||||
|
|
||||||
Per-service key order across real-world compose files:
|
Official Portainer app templates from portainer/templates:
|
||||||
|
|
||||||
```
|
```
|
||||||
Best: CRX | MDL varies by project
|
Best: CRX | MDL 1282
|
||||||
Grammar: (build+image).command.(environment+volumes)?.ports
|
Grammar: (type+title)+.
|
||||||
|
(categories+description+image+logo+name+note+platform)+.
|
||||||
|
repository?.(env+ports+privileged+volumes)+?.command?
|
||||||
```
|
```
|
||||||
|
|
||||||
Per-project patterns emerge:
|
Field ordering convention: identity (`type`, `title`) → metadata
|
||||||
- **Nginx-like:** `build.(command.volumes.ports)`
|
(`description`, `categories`, `platform`, `logo`) → source
|
||||||
- **Databases:** `image.environment.volumes.ports`
|
(`image`, `repository`) → deployment (`ports`, `volumes`, `env`) →
|
||||||
- **Language runtimes:** `build.(environment.command).ports`
|
entrypoint (`command`). 21 unique orderings, one grammar.
|
||||||
|
|
||||||
**Why it helps an LLM:** The field order in service definitions follows
|
**Why it helps an LLM:** Writing a Portainer template needs the right
|
||||||
an implicit convention. An agent generating compose files should put
|
field order. The grammar tells you: identity first, then metadata,
|
||||||
image/build first, then command, then environment/volumes, then ports.
|
then source, then deployment config.
|
||||||
|
|
||||||
## 4. GitHub Actions (cross-project Go lint, 6 jobs)
|
## 4. GitHub Actions (cross-project Go lint, 6 jobs)
|
||||||
|
|
||||||
|
|
|
||||||
112
blog_post.md
|
|
@ -137,69 +137,6 @@ matches only 1 sequence but does so perfectly (low data cost) can
|
||||||
beat a grammar that matches all sequences but is extremely permissive
|
beat a grammar that matches all sequences but is extremely permissive
|
||||||
(high data cost).
|
(high data cost).
|
||||||
|
|
||||||
## The bugs we found (and fixed)
|
|
||||||
|
|
||||||
Implementing the BEX algorithms faithfully required solving several
|
|
||||||
subtle problems.
|
|
||||||
|
|
||||||
### Bug 1: model_cost counted characters, not symbols
|
|
||||||
|
|
||||||
The paper defines model_cost as "the length of r" — the number of
|
|
||||||
symbols in the expression. For the toy alphabet {a, b, c, d, e} used
|
|
||||||
in the paper, characters and symbols are the same. For real-world
|
|
||||||
symbols like `community.docker.docker_image`, they aren't.
|
|
||||||
|
|
||||||
Our `model_cost` function was counting characters (226 for a typical
|
|
||||||
grammar), when it should count symbol occurrences (19). This
|
|
||||||
massively inflated the MDL score, making CRX appear worse than it
|
|
||||||
actually was.
|
|
||||||
|
|
||||||
**Fix:** Count occurrences of alphabet symbols in the expression using
|
|
||||||
regex word-boundary matching, not string length.
|
|
||||||
|
|
||||||
### Bug 2: Dispatch order in _count_words_fast
|
|
||||||
|
|
||||||
The recursive function `_count_words_fast` estimates |L(r)| — the
|
|
||||||
number of strings a grammar accepts at a given length. It dispatches
|
|
||||||
on expression structure: first check for concatenation (`.`), then
|
|
||||||
trailing quantifiers (`+?`, `*`, `?`, `+`), then disjunction groups.
|
|
||||||
|
|
||||||
Our dispatch checked `endswith('+?')` before checking `'.' in expr`.
|
|
||||||
For the expression `(All)+.Role?.RoleBinding?.Job+?`, the trailing
|
|
||||||
`+?` on `Job+?` triggered the quantifier branch first, applying the
|
|
||||||
`+?` to the **entire** expression instead of just the `Job` factor.
|
|
||||||
|
|
||||||
**Fix:** Check concatenation first. Top-level dots can only appear in
|
|
||||||
concatenation, so they should be handled before any quantifier logic.
|
|
||||||
|
|
||||||
### Bug 3: Greedy matching without backtracking
|
|
||||||
|
|
||||||
The `_match_tokens` function checked whether a sequence matches a
|
|
||||||
grammar. For quantifiers like `+?` (zero-or-more), it greedily
|
|
||||||
consumed ALL consecutive matching symbols, then moved on. This failed
|
|
||||||
for grammars like `a+?.a` on input `['a', 'a']`: the `a+?` ate both
|
|
||||||
`a`s, and there was nothing left for the second `.a`.
|
|
||||||
|
|
||||||
**Fix:** Replace the single-pass greedy matching with `_match_possible`,
|
|
||||||
a proper backtracking engine that enumerates ALL valid end positions
|
|
||||||
for each token and picks the maximum. This is essentially a tiny
|
|
||||||
regex engine — but limited to the CHARE subset, so it avoids the
|
|
||||||
exponential blowup of general regex matching.
|
|
||||||
|
|
||||||
### Bug 4: Dot-splitting inside disjunctions
|
|
||||||
|
|
||||||
Module names like `community.docker.docker_image` contain dots.
|
|
||||||
When `_parse_parts` processed a disjunction child, it recursively
|
|
||||||
called itself — which split the expression on `.` before treating it
|
|
||||||
as a symbol. The symbol `community.docker.docker_image` became
|
|
||||||
`community` then `docker` then `docker_image` — three concatenated
|
|
||||||
symbols instead of one.
|
|
||||||
|
|
||||||
**Fix:** Disjunction children are always flat symbols (CRX and
|
|
||||||
iDRegEx don't produce nested disjunctions in practice). Parse them
|
|
||||||
with `_parse_flat_symbol`, which strips quantifiers but never splits
|
|
||||||
on `.`.
|
|
||||||
|
|
||||||
## The results
|
## The results
|
||||||
|
|
||||||
### Ansible deploy roles — 36 roles from companyweb
|
### Ansible deploy roles — 36 roles from companyweb
|
||||||
|
|
@ -240,29 +177,11 @@ configure with templates, start services, optionally run sub-tasks,
|
||||||
install npm/pip packages, and optionally tweak config lines.
|
install npm/pip packages, and optionally tweak config lines.
|
||||||
|
|
||||||
**This is the first explicit description of the geerlingguy role
|
**This is the first explicit description of the geerlingguy role
|
||||||
convention.** It took 15 roles and a grammar inference algorithm to
|
module ordering convention.** It took 15 roles and a grammar inference
|
||||||
write it down.
|
algorithm to write it down.
|
||||||
|
|
||||||
**Compression: 15 roles (5,000 tokens) → 60 tokens (83×)**
|
**Compression: 15 roles (5,000 tokens) → 60 tokens (83×)**
|
||||||
|
|
||||||
### Docker Compose — by project
|
|
||||||
|
|
||||||
Docker Compose has a flexible schema, but each project develops its
|
|
||||||
own convention:
|
|
||||||
|
|
||||||
**mcp-deployment (36 services):**
|
|
||||||
```
|
|
||||||
(build+image).command.(environment+volumes)?.ports
|
|
||||||
```
|
|
||||||
**files (6 services):**
|
|
||||||
```
|
|
||||||
image.environment.volumes.network_mode.privileged?.cap_add?
|
|
||||||
```
|
|
||||||
**fresh-ape-base (9 services):**
|
|
||||||
```
|
|
||||||
image.ports?.(depends_on+environment+user+volumes)+
|
|
||||||
```
|
|
||||||
|
|
||||||
### Ensemble dynamics
|
### Ensemble dynamics
|
||||||
|
|
||||||
The ensemble (CRX + iDRegEx + MDL) selects different winners
|
The ensemble (CRX + iDRegEx + MDL) selects different winners
|
||||||
|
|
@ -270,11 +189,11 @@ depending on the data:
|
||||||
|
|
||||||
| Dataset | Winner | Why |
|
| Dataset | Winner | Why |
|
||||||
|---------|--------|-----|
|
|---------|--------|-----|
|
||||||
| Ansible deploy (36 roles) | CRX | iDRegEx returned ∅ (too diverse) |
|
|
||||||
| Ansible galaxy (15 roles) | CRX | iDRegEx returned ∅ (too diverse) |
|
| Ansible galaxy (15 roles) | CRX | iDRegEx returned ∅ (too diverse) |
|
||||||
| Ansible restore (2 roles) | CRX | Both match all; CRX more compact |
|
| Helm prom-stack (6 configs) | **iDRegEx** | Finds minimal core across all configs |
|
||||||
| Ansible configure (4 roles) | **iDRegEx** | Finds minimal core `include_role` |
|
| Portainer templates (47) | CRX | iDRegEx returned ∅ (no single common field) |
|
||||||
| Ansible manage (2 roles) | **iDRegEx** | Core: `assert.authorized_key` |
|
| Terraform modules (8) | CRX | Every resource type optional across domains |
|
||||||
|
| GitHub Actions Go lint (6) | CRX | Tight pattern, all match |
|
||||||
|
|
||||||
iDRegEx wins when the data has a clear common core. CRX wins when
|
iDRegEx wins when the data has a clear common core. CRX wins when
|
||||||
there's no single shared subsequence (the roles share the *vocabulary*
|
there's no single shared subsequence (the roles share the *vocabulary*
|
||||||
|
|
@ -293,8 +212,9 @@ output = infer_best_grammar(
|
||||||
    prefer="crx",
|
    prefer="crx",
|
||||||
)
|
)
|
||||||
# Returns:
|
# Returns:
|
||||||
# Best: CRX (MDL 2186.28)
|
# Best: CRX (MDL 288)
|
||||||
# Grammar: docker_volume+?.group?...(assert+...+wait_for)+?.(cron+firewalld)?
|
# Grammar: fail?.(include_vars+set_fact+package+file+template+service+...)+
|
||||||
|
# .include+?.(npm+pip)+?.lineinfile?
|
||||||
|
|
||||||
# Ensemble — let MDL pick
|
# Ensemble — let MDL pick
|
||||||
output = infer_best_grammar(sequences=role_sequences)
|
output = infer_best_grammar(sequences=role_sequences)
|
||||||
|
|
@ -302,21 +222,21 @@ output = infer_best_grammar(sequences=role_sequences)
|
||||||
|
|
||||||
An agent workflow:
|
An agent workflow:
|
||||||
|
|
||||||
1. Agent needs to write deploy role #37
|
1. Agent needs to write an Ansible role
|
||||||
2. Finds 36 existing deploy roles, extracts their task module sequences
|
2. Finds 15 existing geerlingguy roles, extracts their task module sequences
|
||||||
3. Calls `infer_best_grammar(sequences=..., prefer='crx')`
|
3. Calls `infer_best_grammar(sequences=..., prefer='crx')`
|
||||||
4. Gets back the grammar in 200 tokens
|
4. Gets back the grammar in ~60 tokens
|
||||||
5. Generates a new role that follows the structural pattern
|
5. Generates a new role that follows the structural pattern
|
||||||
|
|
||||||
Without the MCP: 36 role files in context (15,000 tokens), or guesswork.
|
Without the MCP: 15 role files in context (5,000 tokens), or guesswork.
|
||||||
With the MCP: one grammar rule (200 tokens), known to match 36/36 roles.
|
With the MCP: one grammar rule (~60 tokens), known to match 15/15 roles.
|
||||||
|
|
||||||
## What it means
|
## What it means
|
||||||
|
|
||||||
Grammar inference turns **examples** into **rules**. The rule is a
|
Grammar inference turns **examples** into **rules**. The rule is a
|
||||||
compressed description of the structural convention — and for
|
compressed description of the structural convention — and for
|
||||||
schema-less content like Ansible roles, this may be the *first time*
|
schema-less content like the geerlingguy role module ordering, this is
|
||||||
the convention has been written down at all.
|
the *first time* the convention has been written down at all.
|
||||||
|
|
||||||
For LLM agents, this changes the trade-off between context and
|
For LLM agents, this changes the trade-off between context and
|
||||||
accuracy. Instead of flooding the context window with examples, the
|
accuracy. Instead of flooding the context window with examples, the
|
||||||
|
|
|
||||||
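Reviewer note: the new README and SHOWCASE text states "21 unique field orderings across 47 templates". A minimal sketch of how such a count could be reproduced is below. The sample records and the helper name `field_orderings` are hypothetical and not part of the commit; real input would be the key sequences extracted from `templates.json` as in the README snippet.

```python
from collections import Counter

# Hypothetical sample records standing in for entries of templates.json.
templates = [
    {"type": 1, "title": "Nginx", "description": "Web server",
     "image": "nginx:latest", "ports": ["80/tcp"]},
    {"type": 1, "title": "Redis", "description": "Key-value store",
     "image": "redis:latest", "ports": ["6379/tcp"]},
    {"type": 1, "title": "Registry", "description": "Image registry",
     "image": "registry:latest", "ports": ["5000/tcp"],
     "volumes": [{"container": "/var/lib/registry"}]},
]


def field_orderings(records):
    """Count how often each distinct key ordering occurs.

    Relies on dicts preserving insertion order, which json.loads
    also guarantees for parsed objects.
    """
    return Counter(tuple(r.keys()) for r in records)


orderings = field_orderings(templates)
print(f"{len(orderings)} unique orderings across {len(templates)} templates")
```

Running the same helper on the full template list would give the figure to compare against the claim in the diff.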