Move MCP server to top of README — it's the primary interface

Restructure: MCP Server first (with agent workflow example), then Why grammar inference / showcases, then Quick Start, then details. This matches how users actually interact with the project: via MCP tools.
2026-07-01 10:18:10 +02:00 · a8a8bddb37
commit a8a8bddb37
parent 9f5bde22d5
1 changed file with 68 additions and 47 deletions
--- a/README.md
+++ b/README.md
@@ -2,6 +2,64 @@

 Infer **regular expression grammars** from example sequences using the BEX family of algorithms. Given a set of example sequences (strings over some alphabet), the engine learns a compact regular expression that describes the general pattern.

+## MCP Server
+
+The primary interface is a **Model Context Protocol (MCP)** server. Connect any MCP-compatible client (Claude, opencode, etc.) and get grammar inference as a tool:
+
+```json
+{
+  "mcpServers": {
+    "grammar-inference": {
+      "command": "python3",
+      "args": ["/path/to/bex/mcp_server.py"]
+    }
+  }
+}
+```
+
+### Tools
+
+| Tool | What it does |
+|------|-------------|
+| `infer_grammar(sequences, method, kmax, N)` | Core CRX or iDRegEx inference |
+| `infer_best_grammar(sequences, prefer, kmax, N)` | **Ensemble:** runs both CRX and iDRegEx, picks the best by MDL score. `prefer='crx'` or `prefer='idregex'` to skip the comparison and return only that algorithm. |
+| `infer_yaml_grammar(yaml_dir, pattern, method)` | YAML → key-paths → grammar |
+| `infer_ansible_role_grammar(roles_dir)` | Ansible role module sequences → per-category grammar |
+
+### Agent workflow
+
+An LLM agent uses the MCP server to discover an unwritten convention from existing examples:
+
+```
+User: Generate a new Ansible role for installing PostgreSQL.
+Agent: I'll first check 15 existing geerlingguy roles to find the structural pattern.
+       [calls infer_best_grammar with 15 role sequences, prefer='crx']
+
+       Best: CRX (MDL 288)
+       Grammar: fail?.(include_vars+set_fact+package+file+template+service+...)+
+                .include+?.(npm+pip)+?.lineinfile?
+
+       Convention: check preconditions → OS-specific vars → install packages →
+       configure templates → start services → handle language tooling.
+```
+
+- Without the MCP server: 15 role files in context (5,000+ tokens), or guesswork.
+- With the MCP server: one grammar rule (~60 tokens), known to match 15/15 existing roles.
+
+## Why grammar inference?
+
+There are many domains where developers follow **unwritten conventions** — implicit rules about the order and structure of things that no formal schema captures. An LLM generating code in these domains needs to know the convention, but it's rarely documented.
+
+Grammar inference automatically discovers these conventions from examples:
+
+| Domain | Unwritten convention | What the grammar tells an LLM |
+|--------|---------------------|-------------------------------|
+| Ansible roles | `fail → include_vars/set_fact → package → file/template → service → ... → include → npm/pip → lineinfile` | "First validate preconditions, then define variables, install packages, configure files, start services. Include other roles last." |
+| Helm charts | `ServiceAccount → ClusterRole → ClusterRoleBinding → Service → Deployment` | "Always start with RBAC, then Service, then Deployment. Other resources are optional." |
+| Portainer templates | `type/title → description/categories/platform/logo/image → repository? → env/ports/volumes? → command?` | "Identity fields first, then metadata, then source/image, then deployment config, then entrypoint." |
+| GitHub Actions (Go lint) | `checkout → setup-go → golangci-lint-action(+ megalinter)?` | "Checkout, set up Go, run the linter. Only megalinter for extra coverage." |
+| Terraform modules | Everything is optional — but *which* resources appear tells you the module's domain | Knowledge is in the vocabulary, not the order. VPC implies subnets, route tables, gateways. |
+
 ## Quick Start

 ```bash
@@ -23,29 +81,6 @@ print(f"Grammar: {result['best']['grammar']}")
 print(f"Score: {result['best']['mdl_score']}")
 ```

-## Why grammar inference?
-
-There are many domains where developers follow **unwritten conventions** — implicit rules about the order and structure of things that no formal schema captures. An LLM generating code in these domains needs to know the convention, but it's rarely documented.
-
-Grammar inference automatically discovers these conventions from examples.
-
-| Domain | Unwritten convention | What the grammar tells an LLM |
-|--------|---------------------|-------------------------------|
-| Ansible roles | `fail → include_vars/set_fact → package → file/template → service → ... → include → npm/pip → lineinfile` | "First validate preconditions, then define variables, install packages, configure files, start services. Include other roles last." |
-| Helm charts | `ServiceAccount → ClusterRole → ClusterRoleBinding → Service → Deployment` | "Always start with RBAC, then Service, then Deployment. Other resources are optional." |
-| Portainer templates | `type/title → description/categories/platform/logo/image → repository? → env/ports/volumes? → command?` | "Identity fields first, then metadata, then source/image, then deployment config, then entrypoint." |
-| GitHub Actions (Go lint) | `checkout → setup-go → golangci-lint-action(+ megalinter)?` | "Checkout, set up Go, run the linter. Only megalinter for extra coverage." |
-| Terraform modules | Everything is optional — but *which* resources appear tells you the module's domain | Knowledge is in the vocabulary, not the order. VPC implies subnets, route tables, gateways. |
-
-## Algorithm Selection Guide
-
-| When | Use | Why |
-|------|-----|-----|
-| Clean, structured data with full vocabulary | **CRX** | Single-pass, deterministic. Accepts all sequences. |
-| Few examples, or want minimal common core | **iDRegEx** | Probabilistic EM, finds only what's shared. |
-| Don't know which is better | **Ensemble (default)** | Runs both, picks the best by MDL score. |
-| Data is clearly one type | `prefer='crx'` or `prefer='idregex'` | Skips ensemble comparison, runs one algorithm. |
-
 ## Real-world Results

 ### Ansible Galaxy (15 roles, 44+ modules each)
@@ -61,11 +96,7 @@ Grammar:

 Every single role follows this pattern. The convention was **unwritten** — no document says "Ansible roles should check preconditions first, then install packages, configure with templates, enable services, then optionally install language packages."

-An LLM generating a new role:
- **Must** start with conditional includes and variable setup
- **Should** then install packages and configure files
- **Then** start services
- **Finally** include handling of language-specific tooling
+To our knowledge, this is the first explicit description of the module ordering convention in the geerlingguy roles.

 **Compression:** The grammar is ~250 chars. The 15 examples are 7200+ modules combined. **~29× compression.**

@@ -97,8 +128,6 @@ Grammar: (type+title)+.(categories+description+image+logo+name+note+platform)+.

 Template fields follow a consistent arc: identity (`type`, `title`) → metadata (`description`, `categories`, `platform`, `logo`) → source (`image`, `repository`) → deployment (`ports`, `volumes`, `env`) → entrypoint (`command`). 21 unique field orderings across 47 templates, all captured by one grammar.

-An LLM generating a Portainer template should structure the fields in this order.
-
 ### GitHub Actions (cross-project Go lint, 6 jobs)

 Data: Lint jobs from prometheus, goreleaser, cosign, sigstore.
@@ -132,6 +161,15 @@ Not every domain has an unwritten convention. Grammar inference failed (produced

 The sweet spot: **multiple implementations of the same abstract task** (like "deploy a service" or "configure a chart"), each following a shared but undocumented pattern.

+## Algorithm Selection Guide
+
+| When | Use | Why |
+|------|-----|-----|
+| Clean, structured data with full vocabulary | **CRX** | Single-pass, deterministic. Accepts all sequences. |
+| Few examples, or want minimal common core | **iDRegEx** | Probabilistic EM, finds only what's shared. |
+| Don't know which is better | **Ensemble (default)** | Runs both, picks the best by MDL score. |
+| Data is clearly one type | `prefer='crx'` or `prefer='idregex'` | Skips ensemble comparison, runs one algorithm. |
+
 ## When each algorithm wins

 | Data property | Winner | Why |
@@ -142,23 +180,6 @@ The sweet spot: **multiple implementations of the same abstract task** (like "de
 | 2–3 sequences | iDRegEx | CRX overfits. iDRegEx handles noise better. |
 | Many sequences, tight pattern | CRX | Learns precise concatenation with optional suffixes. |

-## MCP Server
-
-A **Model Context Protocol** server exposes all algorithms and domain adapters:
-
-```bash
-python -m bex.mcp_server
-```
-
-### Tools
-
-| Tool | What it does |
-|------|-------------|
-| `infer_grammar(sequences, method, kmax, N)` | Core CRX or iDRegEx inference |
-| `infer_best_grammar(sequences, prefer, kmax, N)` | **Ensemble:** runs both, picks best by MDL. `prefer='crx'` or `prefer='idregex'` to skip comparison. |
-| `infer_yaml_grammar(yaml_dir, pattern, method)` | YAML → key-paths → grammar |
-| `infer_ansible_role_grammar(roles_dir)` | Ansible role module sequences → per-category grammar |
-
 ## Domain Adapters

 ### Ansible Roles