
# Grammar Inference Engine — Showcase

Infer the unwritten convention from existing examples. Given N example
sequences, the engine produces a grammar of roughly 100 characters that
captures their shared structure in far fewer tokens than the originals.

## How it works

Your agent calls the MCP tool `infer_best_grammar` with a list of
existing sequences. It returns a compressed grammar written in this
notation:

```
a.b       → a then b (concatenation)
(a+b)     → a or b (disjunction)
r?        → optional (zero or one)
r+        → one or more (iteration)
r+?       → zero or more
```
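
To make the notation concrete, here is a minimal sketch that translates a
grammar string into a Python `re` pattern and checks a sequence against
it. The helper names `grammar_to_regex` and `accepts` are hypothetical and
not part of the engine. The rule used to tell infix `+` (disjunction) from
postfix `+` (iteration) is an assumption based on the examples above, and
symbols are assumed to be plain identifiers.

```python
import re

# Symbols are assumed to be identifiers; everything else is an operator.
TOKEN = re.compile(r"[A-Za-z_]\w*|[().+?|]")


def grammar_to_regex(grammar: str) -> re.Pattern:
    """Translate the showcase notation into a regex over 'name ' chunks."""
    tokens = TOKEN.findall(grammar)
    out = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if tok == ".":
            pass  # concatenation is implicit in a regex
        elif tok == "(":
            out.append("(?:")
        elif tok == ")":
            out.append(")")
        elif tok == "|":
            out.append("|")  # iDRegEx disjunction
        elif tok == "?":
            out.append("?")
        elif tok == "+":
            if nxt == "?":
                out.append("*")  # r+? means zero or more, not a lazy match
                i += 1
            elif nxt is None or nxt in {".", ")", "+", "|"}:
                out.append("+")  # postfix: one or more
            else:
                out.append("|")  # infix: CRX disjunction (assumed rule)
        else:
            # Each symbol matches its name followed by one space.
            out.append("(?:" + re.escape(tok) + " )")
        i += 1
    return re.compile("".join(out))


def accepts(grammar: str, sequence: list[str]) -> bool:
    """Return True if the whole sequence is generated by the grammar."""
    text = "".join(name + " " for name in sequence)
    return grammar_to_regex(grammar).fullmatch(text) is not None


toy = "fail?.(package+template)+.service+?"  # toy grammar, not engine output
print(accepts(toy, ["package", "template", "service"]))  # True
print(accepts(toy, ["service"]))                         # False
```

Note that `r+?` cannot be copied into a Python regex verbatim, because
there `+?` is a lazy quantifier; the sketch maps it to `*` instead.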

Use `prefer='crx'` for full coverage (the CRX grammar accepts every
example), or leave the choice to the ensemble, which picks between CRX
and iDRegEx by MDL (Minimum Description Length) score, where lower is
better.
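
The selection rule can be sketched as follows. This is an illustration of
the idea only, not the engine's code: the `Candidate` class, the `pick`
function, and all scores and grammars below are made up for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Candidate:
    learner: str   # "CRX" or "iDRegEx"
    grammar: str
    mdl: float     # Minimum Description Length score, lower is better
    matched: int   # how many input examples the grammar accepts
    total: int


def pick(candidates: list[Candidate], prefer: Optional[str] = None) -> Candidate:
    """Return the preferred learner's grammar, else the lowest-MDL one."""
    if prefer is not None:
        for cand in candidates:
            if cand.learner.lower() == prefer.lower():
                return cand
    return min(candidates, key=lambda cand: cand.mdl)


# Made-up candidates: CRX covers everything, iDRegEx is shorter but misses two.
candidates = [
    Candidate("CRX", "(a+b)+", mdl=120.0, matched=15, total=15),
    Candidate("iDRegEx", "(a|b)+.c?", mdl=95.5, matched=13, total=15),
]

print(pick(candidates).learner)                # iDRegEx
print(pick(candidates, prefer="crx").learner)  # CRX
```

The toy numbers show the trade-off: choosing by MDL alone can favor a
grammar that misses some examples, while forcing CRX keeps full coverage.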

## Ansible Galaxy roles — 15 geerlingguy roles

Jeff Geerling maintains 100+ of the most popular Ansible roles on
Galaxy. Their shared task structure is not written down anywhere we
know of. The grammar below, inferred from 15 of the roles, makes it
explicit:

```
Grammar: fail?.(include_vars+set_fact+package+file+template+service+...)+.
         include+?.(npm+pip)+?.lineinfile?

  CRX         MDL=  596.64  match=15/15
```

All 15 roles follow the same arc: check prerequisites, load OS-specific
vars, install packages, configure with templates, start services, and
optionally run sub-tasks. Inference works here because the roles all
converged on the same unwritten convention.
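
How a role becomes a sequence is not spelled out here. One plausible
approach, sketched below under that assumption, is to reduce each task to
the name of the module it runs. The `task_sequence` helper, the keyword
list (a partial one), and the sample tasks are all hypothetical; the tasks
are toy data, not taken from a real role.

```python
# Task-level keys that configure a task rather than name its module.
# Partial list, chosen for this sketch.
TASK_KEYWORDS = {
    "name", "when", "tags", "become", "become_user", "register", "notify",
    "loop", "with_items", "vars", "args", "changed_when", "failed_when",
    "ignore_errors", "delegate_to", "environment", "no_log",
    "block", "rescue", "always",
}


def task_sequence(tasks: list[dict]) -> list[str]:
    """Map parsed tasks (one dict each) to a list of module names."""
    sequence = []
    for task in tasks:
        modules = [key for key in task if key not in TASK_KEYWORDS]
        if len(modules) == 1:
            # Drop a collection prefix such as "ansible.builtin.".
            sequence.append(modules[0].rsplit(".", 1)[-1])
        # Tasks with zero or several candidate keys are skipped.
    return sequence


# Toy tasks in the shape a YAML parser would produce.
tasks = [
    {"name": "Load OS-specific vars", "include_vars": "{{ ansible_os_family }}.yml"},
    {"name": "Install packages", "package": {"name": "nginx", "state": "present"}},
    {"name": "Write config", "template": {"src": "a.j2", "dest": "/etc/a"}, "notify": "restart"},
    {"name": "Start service", "ansible.builtin.service": {"name": "nginx", "state": "started"}},
]

print(task_sequence(tasks))
# ['include_vars', 'package', 'template', 'service']
```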

**Compression: 15 roles (~5,000 tokens) → 60 tokens, roughly 83× smaller.**

## Notation reference

| Symbol | Meaning |
|--------|---------|
| `a.b` | a then b |
| `(a+b)` | a or b (CRX disjunction) |
| `(a\|b)` | a or b (iDRegEx disjunction) |
| `r?` | zero or one |
| `r+` | one or more |
| `r+?` | zero or more |
| `MDL` | Minimum Description Length — lower is better |

## Usage

```python
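from bex.mcp_server import infer_best_grammar

# Illustrative input (assumed shape): one list of task names per role.
role_sequences = [
    ["include_vars", "package", "template", "service"],
    ["fail", "include_vars", "package", "service"],
]

output = infer_best_grammar(
    sequences=role_sequences,
    prefer="crx",  # full coverage: the CRX grammar accepts every example
)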
```