grammar-inference-engine/SHOWCASE.md
tobjend 547376894c Update README and SHOWCASE with real-world dataset evaluations
README:
- Replace outdated company benchmarks with public showcases
- Add Algorithm Selection Guide
- Add 'When each algorithm wins' table
- Add 'Why grammar inference?' table with value prop for LLMs
- Add 'What doesn't work' section documenting failed approaches
- Update all domain adapter examples with public results
- Clean up outdated references (companyweb roles, hashistack terraform)

SHOWCASE:
- Add Helm (kube-prometheus-stack) with iDRegEx minimal core
- Add Docker Compose per-project patterns
- Add GitHub Actions cross-project Go lint pattern
- Add Terraform modules with vocabulary analysis
- Add 'What doesn't work' section
- Explain WHY each dataset helps an LLM
2026-07-01 10:04:10 +02:00

# Grammar Inference Engine — Showcase
Infer the **unwritten convention** from existing examples. Given N example
sequences, produce a ~100-char grammar that captures the structural
pattern — in far fewer tokens than the originals.
```
a.b → a then b (concatenation)
(a+b) → a or b (disjunction)
r? → optional (zero or one)
r+ → one or more (iteration)
r+? → zero or more
```
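
The notation above can be read as an ordinary regular expression over sequences of names. The following is a minimal sketch of that reading, not the engine's own parser: `to_regex` and `matches` are hypothetical helpers, and two points are assumptions. First, `+` is treated as postfix iteration when it is followed by `.`, `)`, `?`, `+`, or the end of the grammar, and as disjunction otherwise. Second, concatenation is assumed to bind tighter than disjunction. Names must not contain whitespace.

```python
import re

# Grammar tokens: one of the operators/brackets, or a name (any other non-space run).
_TOKEN = re.compile(r"[().+?]|[^().+?\s]+")
_NOT_ATOM_START = {".", "+", "?", ")", None}


def to_regex(grammar: str) -> re.Pattern:
    """Translate the showcase notation into a Python regex over 'name ' chunks."""
    toks = _TOKEN.findall(grammar)
    pos = 0

    def peek(offset=0):
        i = pos + offset
        return toks[i] if i < len(toks) else None

    def take():
        nonlocal pos
        tok = peek()
        if tok is None:
            raise ValueError("unexpected end of grammar")
        pos += 1
        return tok

    def atom():
        tok = take()
        if tok == "(":
            inner = union()
            if take() != ")":
                raise ValueError("missing ')'")
            return f"(?:{inner})"
        if tok in _NOT_ATOM_START:
            raise ValueError(f"unexpected {tok!r}")
        return f"(?:{re.escape(tok)} )"  # each name is followed by one space

    def factor():
        out = atom()
        while True:
            if peek() == "?":
                take()
                out = f"(?:{out}?)"  # wrapped so a following '+' is not possessive
            elif peek() == "+" and peek(1) in _NOT_ATOM_START:
                take()
                out = f"(?:{out}+)"  # wrapped so a following '?' is not lazy
            else:
                return out

    def concat():
        parts = [factor()]
        while peek() == ".":
            take()
            parts.append(factor())
        return "".join(parts)

    def union():
        parts = [concat()]
        while peek() == "+":  # any '+' left over by factor() is a disjunction
            take()
            parts.append(concat())
        return "|".join(parts)

    pattern = union()
    if pos != len(toks):
        raise ValueError(f"unexpected {toks[pos]!r}")
    return re.compile(f"(?:{pattern})")


def matches(grammar: str, sequence: list[str]) -> bool:
    """True if the whole sequence is accepted by the grammar."""
    encoded = "".join(name + " " for name in sequence)
    return to_regex(grammar).fullmatch(encoded) is not None
```

For example, `matches("a.b+?", ["a"])` and `matches("a.b+?", ["a", "b", "b"])` both hold, while `matches("a.b+", ["a"])` does not, which is the difference between `r+?` and `r+`.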
## 1. Ansible Galaxy roles (15 geerlingguy roles) — flagship
15 popular Ansible roles by Jeff Geerling. There is NO written convention
for the task structure. Our grammar is its first explicit description:
```
Grammar: fail?.(include_vars+set_fact+package+file+template+service+...)+.
include+?.(npm+pip)+?.lineinfile?
```
Every role: check preconditions → OS-specific vars → install packages →
configure with templates → start services → optionally handle language tooling.
All 15 roles match. **~29× compression** (7200+ modules → ~250 chars).

**Why it helps an LLM:** When generating a new Ansible role, the LLM knows the
expected structure: fail-check first, then vars, then packages, then config and
services. No guessing.
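
To make the input concrete, here is a rough sketch of how a role's task list could be reduced to the module sequence the grammar is inferred from. It assumes the tasks are already parsed into dictionaries (for example by a YAML loader); `module_sequence` and `TASK_KEYWORDS` are hypothetical names, the keyword list is deliberately incomplete, and the engine's actual adapter may work differently.

```python
# Keys that describe a task rather than name its module. This list is an
# assumption and incomplete; extend it before using it on real roles.
TASK_KEYWORDS = {
    "name", "when", "tags", "become", "register", "notify",
    "loop", "with_items", "vars", "changed_when", "failed_when",
}


def module_sequence(tasks: list[dict]) -> list[str]:
    """Return one module name per task, in file order."""
    sequence = []
    for task in tasks:
        modules = [key for key in task if key not in TASK_KEYWORDS]
        if modules:
            # Drop a collection prefix such as 'ansible.builtin.' so that
            # fully qualified and short names map to the same symbol.
            sequence.append(modules[0].rsplit(".", 1)[-1])
    return sequence


# Made-up tasks, for illustration only.
tasks = [
    {"name": "Check OS", "fail": {"msg": "unsupported"}, "when": "ansible_os_family != 'Debian'"},
    {"name": "Load vars", "include_vars": "{{ ansible_os_family }}.yml"},
    {"name": "Install", "ansible.builtin.package": {"name": "nginx"}, "become": True},
    {"name": "Start", "service": {"name": "nginx", "state": "started"}},
]
print(module_sequence(tasks))  # ['fail', 'include_vars', 'package', 'service']
```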
## 2. Helm chart (kube-prometheus-stack, 6 configs)
6 different `values.yaml` files rendered through the same chart:
```
Best: iDRegEx | MDL 1433
Grammar: ServiceAccount.ClusterRole.ClusterRoleBinding.Service.Deployment
```
This is the **minimal core** that every config must deploy. CRX instead
captures the full vocabulary (19 kinds). Which one an agent uses depends on the task:
- Bootstrapping a new cluster: iDRegEx — what you can't skip
- Writing a complete chart: CRX — everything you might need
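
The difference between a minimal core and a full vocabulary can be illustrated with plain sets. This sketch is neither iDRegEx nor CRX: it ignores order and repetition and only shows which kinds appear in every rendered config versus in any of them. The function name and the sample data are made up for illustration.

```python
def core_and_vocabulary(manifests: list[list[str]]) -> tuple[set[str], set[str]]:
    """Kinds present in every config (core) and in at least one (vocabulary).

    Expects at least one manifest list.
    """
    kind_sets = [set(kinds) for kinds in manifests]
    core = set.intersection(*kind_sets)
    vocabulary = set.union(*kind_sets)
    return core, vocabulary


# Illustrative kind sequences, not the evaluated dataset.
base = ["ServiceAccount", "ClusterRole", "ClusterRoleBinding", "Service", "Deployment"]
manifests = [
    base,
    base + ["ServiceMonitor", "PrometheusRule"],
    ["ServiceAccount", "ClusterRole", "ClusterRoleBinding", "ConfigMap", "Service", "Deployment"],
]

core, vocabulary = core_and_vocabulary(manifests)
print(sorted(core))     # the five kinds shared by all three configs
print(len(vocabulary))  # 8
```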
## 3. Docker Compose (73 services, 10 projects)
Per-service key order across real-world compose files:
```
Best: CRX | MDL varies by project
Grammar: (build+image).command.(environment+volumes)?.ports
```
Per-project patterns emerge:
- **Nginx-like:** `build.(command.volumes.ports)`
- **Databases:** `image.environment.volumes.ports`
- **Language runtimes:** `build.(environment.command).ports`

**Why it helps an LLM:** The field order in service definitions follows
an implicit convention. An agent generating compose files should put
image/build first, then command, then environment/volumes, then ports.
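
One way an agent could apply this ordering is to sort the keys of a service definition before writing it out. The sketch below assumes the service is a plain dictionary; `CONVENTIONAL_ORDER` and `order_service_keys` are hypothetical names, and keys outside the inferred pattern are simply kept at the end in their original order.

```python
# Order suggested by the inferred pattern: image/build, command,
# environment/volumes, ports.
CONVENTIONAL_ORDER = ["build", "image", "command", "environment", "volumes", "ports"]


def order_service_keys(service: dict) -> dict:
    """Return a copy of the service with keys in the conventional order."""
    rank = {key: i for i, key in enumerate(CONVENTIONAL_ORDER)}
    # sorted() is stable, so keys without a rank keep their relative order.
    ordered = sorted(service, key=lambda key: rank.get(key, len(CONVENTIONAL_ORDER)))
    return {key: service[key] for key in ordered}


service = {
    "ports": ["5432:5432"],
    "restart": "always",
    "image": "postgres",
    "environment": {"POSTGRES_DB": "app"},
}
print(list(order_service_keys(service)))  # ['image', 'environment', 'ports', 'restart']
```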
## 4. GitHub Actions (cross-project Go lint, 6 jobs)
Lint jobs from prometheus, goreleaser, cosign, sigstore:
```
Best: CRX | MDL 13.6
Grammar: actions/checkout.(actions/setup-go+run:echo+run:sudo)+.
golangci/golangci-lint-action?.megalinter?
```
All six sampled lint jobs follow the same outline: checkout → setup Go → run
linter. Only the largest projects add megalinter.

**Why it helps an LLM:** Starting a new Go project? The lint workflow
has a near-universal pattern.
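
The symbols in the grammar above (`actions/checkout`, `run:sudo`, and so on) suggest that each step is reduced to its action name without the version, or to `run:` plus the first word of the command. That mapping is an inference from the grammar, not a documented rule; `step_symbol` is a hypothetical helper that sketches it for steps already parsed into dictionaries.

```python
def step_symbol(step: dict) -> str:
    """Map one workflow step to a grammar symbol (assumed tokenization)."""
    if "uses" in step:
        # 'actions/checkout@v4' -> 'actions/checkout'
        return step["uses"].split("@", 1)[0]
    if "run" in step:
        words = step["run"].split()
        return "run:" + (words[0] if words else "")
    return "unknown"


# Illustrative lint job; the version tags are examples only.
steps = [
    {"uses": "actions/checkout@v4"},
    {"uses": "actions/setup-go@v5", "with": {"go-version": "stable"}},
    {"run": "sudo apt-get install -y libsystemd-dev"},
    {"uses": "golangci/golangci-lint-action@v6"},
]
print([step_symbol(step) for step in steps])
# ['actions/checkout', 'actions/setup-go', 'run:sudo', 'golangci/golangci-lint-action']
```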
## 5. Terraform (8 AWS modules)
Terraform modules by hashicorp and terraform-aws-modules:
```
Best: CRX | MDL 1876
Grammar: null_resource?.s3_bucket...?.vpc?...(26+ types all optional)
```
Every resource type is optional: VPC, S3, EC2, and security-group
modules share no mandatory ordering. But the **vocabulary** is the signal:
seeing `aws_vpc` implies subnets, route tables, internet gateways.

**Why it helps an LLM:** The grammar encodes which resources belong
together in each module domain.
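
The "vocabulary as signal" idea can be sketched as a co-occurrence query: given the resource types used by each module, which types always accompany a given one? This is an illustration only, separate from how the engine computes its grammar; `always_with` is a hypothetical helper and the module contents below are invented.

```python
def always_with(modules: list[set[str]], resource: str) -> set[str]:
    """Resource types present in every module that contains `resource`."""
    containing = [types for types in modules if resource in types]
    if not containing:
        return set()
    return set.intersection(*containing) - {resource}


# Invented module vocabularies, for illustration only.
modules = [
    {"aws_vpc", "aws_subnet", "aws_route_table", "aws_internet_gateway"},
    {"aws_vpc", "aws_subnet", "aws_route_table", "aws_nat_gateway"},
    {"aws_s3_bucket", "aws_s3_bucket_policy"},
]
print(sorted(always_with(modules, "aws_vpc")))  # ['aws_route_table', 'aws_subnet']
```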
## What doesn't work
| Dataset | Problem |
|---------|---------|
| Dockerfiles | Too simple — just the Dockerfile spec |
| Pre-commit (cross-project) | 252 unique hooks, no common core |
| GHA per-project | One repo = too many job types |
| Prometheus rules | Schema-enforced, no convention |

Sweet spot: **multiple implementations of the same abstract task**
with a shared but undocumented pattern.
## Usage
```python
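from bex.mcp_server import infer_best_grammar

# role_sequences: the example sequences to generalize, e.g. one list of
# module names per Ansible role (input shape assumed, not confirmed here).
output = infer_best_grammar(
    sequences=role_sequences,
    prefer="crx",
)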
```