2026-07-01 10:19:08 +02:00
# Dervish — Showcase
Infer the **unwritten convention** from existing examples. Given N example
sequences, produce a ~100-char grammar that captures the structural
pattern in far fewer tokens than the originals.

```
a.b    → a then b (concatenation)
(a+b)  → a or b (disjunction)
r?     → optional (zero or one)
r+     → one or more (iteration)
r+?    → zero or more
```
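
To make the notation concrete, here is a minimal sketch that translates a grammar into a Python regular expression and checks a sequence against it. It is not Dervish's own matcher: `to_regex` and `matches` are hypothetical helpers, and the sketch assumes that `+` is a disjunction whenever an operand follows it and an iteration otherwise, and that disjunctions are always parenthesized as in the examples on this page. Grammars with elided parts (`...`) cannot be translated this way.

```python
import re

# Tokens: one of the operators ( ) . + ? or a run of any other non-space characters.
TOKEN = re.compile(r"[().+?]|[^().+?\s]+")
OPERATORS = {"(", ")", ".", "+", "?"}


def to_regex(grammar: str) -> re.Pattern:
    """Translate the notation into a regex over space-terminated items (a sketch)."""
    tokens = TOKEN.findall(grammar)
    parts = []
    i = 0
    while i < len(tokens):
        tok = tokens[i]
        nxt = tokens[i + 1] if i + 1 < len(tokens) else None
        if tok == ".":
            pass  # concatenation is implicit in a regex
        elif tok == "(":
            parts.append("(?:")
        elif tok == ")":
            parts.append(")")
        elif tok == "?":
            parts.append("?")
        elif tok == "+":
            if nxt == "(" or (nxt is not None and nxt not in OPERATORS):
                parts.append("|")  # infix: an operand follows, so this is "or"
            elif nxt == "?":
                parts.append("*")  # r+? is zero or more, not a lazy quantifier
                i += 1             # consume the "?"
            else:
                parts.append("+")  # postfix: one or more
        else:
            # A symbol: match the item plus the space that terminates it.
            parts.append("(?:" + re.escape(tok + " ") + ")")
        i += 1
    return re.compile("".join(parts))


def matches(grammar: str, sequence: list[str]) -> bool:
    """True if the whole sequence is accepted by the grammar."""
    encoded = "".join(item + " " for item in sequence)
    return to_regex(grammar).fullmatch(encoded) is not None


print(matches("a.(b+c)+.d?", ["a", "c", "b"]))  # accepted: a, then b/c repeated, d omitted
print(matches("a.(b+c)+.d?", ["b", "a"]))       # rejected: a must come first
```

Terminating every item with a space keeps one symbol from matching the tail of a longer one, which is why the sequence is encoded before matching.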
## 1. Ansible Galaxy roles (15 geerlingguy roles, flagship)

Fifteen popular Ansible roles by Jeff Geerling. No written convention
governs the module ordering in `tasks/main.yml`; to our knowledge, this
grammar is its first explicit description:

```
Grammar: fail?.(include_vars+set_fact+package+file+template+service+...)+.
         include+?.(npm+pip)+?.lineinfile?
```

Every role: check preconditions → OS-specific vars → install packages →
configure with templates → start services → optionally handle language tooling.

All 15 roles match. **~29× compression** (7200+ modules → ~250 chars).

**Why it helps an LLM:** When generating a new Ansible role, the LLM knows the
exact structure: fail-check first, then vars, then packages, then config and
services. No guessing.
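
The grammar is inferred from module-name sequences, so each role first has to be reduced to one. The following is a minimal sketch of that step, not Dervish's actual extractor: `module_sequence`, the `TASK_KEYWORDS` list (deliberately partial), and the sample tasks are all hypothetical, and stripping collection prefixes such as `ansible.builtin.` is an assumption made so that names line up with the short ones in the grammar.

```python
# Task-level keywords that are not modules (a partial, assumed list).
TASK_KEYWORDS = {
    "name", "when", "tags", "register", "become", "become_user", "vars",
    "with_items", "loop", "notify", "changed_when", "failed_when",
    "ignore_errors", "args", "environment", "delegate_to", "no_log",
}


def module_sequence(tasks):
    """Return the module name used by each task, in file order."""
    sequence = []
    for task in tasks:
        modules = [key for key in task if key not in TASK_KEYWORDS]
        if len(modules) != 1:
            raise ValueError(f"cannot identify the module in task: {task!r}")
        # Assumption: drop a collection prefix such as "ansible.builtin.".
        sequence.append(modules[0].rsplit(".", 1)[-1])
    return sequence


# Hypothetical tasks, shaped as a YAML parser would return them for tasks/main.yml.
tasks = [
    {"name": "Check OS", "fail": {"msg": "Unsupported OS"},
     "when": "ansible_os_family != 'Debian'"},
    {"name": "Load variables", "include_vars": "{{ ansible_os_family }}.yml"},
    {"name": "Install", "ansible.builtin.package": {"name": "nginx", "state": "present"}},
    {"name": "Configure", "template": {"src": "nginx.conf.j2", "dest": "/etc/nginx/nginx.conf"}},
    {"name": "Start", "service": {"name": "nginx", "state": "started"}},
]

print(module_sequence(tasks))
# ['fail', 'include_vars', 'package', 'template', 'service']
```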
## 2. Helm chart (kube-prometheus-stack, 6 configs)

Six different `values.yaml` files rendered through the same chart:

```
Best: iDRegEx | MDL 1433
Grammar: ServiceAccount.ClusterRole.ClusterRoleBinding.Service.Deployment
```

This is the **minimal core** every config must deploy. CRX instead captures
the full vocabulary (19 kinds). Which one an agent uses depends on the task:

- Bootstrapping a new cluster: iDRegEx (what you can't skip)
- Writing a complete chart: CRX (everything you might need)
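
One way an agent could use the minimal core is as a checklist against a rendered manifest. The sketch below assumes one particular reading of the grammar, namely that the five core kinds must appear in this order but may have other kinds between them; that reading, the helper `contains_core`, and the sample kind list are hypothetical and not taken from Dervish.

```python
CORE = ["ServiceAccount", "ClusterRole", "ClusterRoleBinding", "Service", "Deployment"]


def contains_core(kinds, core=CORE):
    """True if every core kind occurs in `kinds`, in the core's order (as a subsequence)."""
    remaining = iter(kinds)
    # `kind in remaining` advances the iterator, so order is enforced.
    return all(kind in remaining for kind in core)


# Hypothetical kind sequence of one rendered configuration.
rendered = [
    "ServiceAccount", "ConfigMap", "ClusterRole", "ClusterRoleBinding",
    "Service", "Deployment", "ServiceMonitor",
]

print(contains_core(rendered))                                  # True
print(contains_core(["ServiceAccount", "Service", "Deployment"]))  # False: RBAC kinds missing
```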
## 3. Portainer templates (47 templates)

Official Portainer app templates from `portainer/templates`:

```
Best: CRX | MDL 1282
Grammar: (type+title)+.
         (categories+description+image+logo+name+note+platform)+.
         repository?.(env+ports+privileged+volumes)+?.command?
```

Field ordering convention: identity (`type`, `title`) → metadata
(`description`, `categories`, `platform`, `logo`) → source
(`image`, `repository`) → deployment (`ports`, `volumes`, `env`) →
entrypoint (`command`). Note that the grammar itself places `image` in the
metadata group; only `repository` gets a slot of its own. One grammar covers
all 21 unique orderings.

**Why it helps an LLM:** Writing a Portainer template needs the right
field order. The grammar tells you: identity first, then metadata,
then source, then deployment config.
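
The group order in the grammar can be turned into a simple lint for a template's field order. This is a minimal sketch under stated assumptions: `GROUPS` is copied from the grammar above, while `out_of_order` is a hypothetical helper that only checks the relative order of groups. It does not enforce that the first two groups are mandatory, and it ignores fields the grammar does not mention.

```python
# Field groups, in the order the grammar lists them.
GROUPS = [
    {"type", "title"},
    {"categories", "description", "image", "logo", "name", "note", "platform"},
    {"repository"},
    {"env", "ports", "privileged", "volumes"},
    {"command"},
]
RANK = {field: rank for rank, group in enumerate(GROUPS) for field in group}


def out_of_order(fields):
    """Return the fields that appear after a field from a later group."""
    misplaced = []
    highest = 0
    for field in fields:
        rank = RANK.get(field)
        if rank is None:
            continue  # not covered by the grammar; ignored in this sketch
        if rank < highest:
            misplaced.append(field)
        highest = max(highest, rank)
    return misplaced


print(out_of_order(["type", "title", "name", "image", "ports", "volumes"]))  # []
print(out_of_order(["type", "ports", "title", "image"]))  # ['title', 'image']
```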
## 4. GitHub Actions (cross-project Go lint, 6 jobs)

Lint jobs from prometheus, goreleaser, cosign, sigstore:

```
Best: CRX | MDL 13.6
Grammar: actions/checkout.(actions/setup-go+run:echo+run:sudo)+.
         golangci/golangci-lint-action?.megalinter?
```

Every lint job in this sample follows: checkout → set up Go → run linter.
Only the biggest projects add megalinter.

**Why it helps an LLM:** Starting a new Go project? The lint workflow
has a near-universal pattern.
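
The symbols in this grammar (`actions/checkout`, `run:sudo`) suggest how workflow steps are reduced to tokens. The sketch below reconstructs one plausible scheme: a `uses` step becomes the action name without its version pin, and a `run` step becomes `run:` plus the first word of the command. That scheme is inferred from the symbols above, not confirmed by the source; `step_symbol` and the sample steps are hypothetical.

```python
def step_symbol(step):
    """Map one workflow step (a parsed YAML mapping) to a grammar symbol."""
    if "uses" in step:
        return step["uses"].split("@", 1)[0]  # drop the version pin
    if "run" in step:
        words = step["run"].split()
        return "run:" + (words[0] if words else "")
    raise ValueError(f"step has neither 'uses' nor 'run': {step!r}")


# Hypothetical steps of a lint job, as a YAML parser would return them.
steps = [
    {"uses": "actions/checkout@v4"},
    {"uses": "actions/setup-go@v5", "with": {"go-version": "stable"}},
    {"name": "Install deps", "run": "sudo apt-get install -y libfoo-dev"},
    {"uses": "golangci/golangci-lint-action@v6"},
]

print([step_symbol(step) for step in steps])
# ['actions/checkout', 'actions/setup-go', 'run:sudo', 'golangci/golangci-lint-action']
```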
## 5. Terraform (8 AWS modules)

Terraform modules by hashicorp and terraform-aws-modules:

```
Best: CRX | MDL 1876
Grammar: null_resource?.s3_bucket...?.vpc?...(26+ types all optional)
```

Every resource type is optional: VPC, S3, EC2, and security-group
modules share no mandatory ordering. But the **vocabulary** is the signal:
seeing `aws_vpc` implies subnets, route tables, and internet gateways.

**Why it helps an LLM:** The grammar encodes which resources belong
together in each module domain.
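
When ordering carries no signal, the "which resources belong together" idea can be illustrated with plain pair counting. To be clear, this is a simpler stand-in technique, not how the grammar is inferred: `cooccurrence` is a hypothetical helper and the module contents below are made-up sample input.

```python
from collections import Counter
from itertools import combinations


def cooccurrence(modules):
    """Count how many modules declare each unordered pair of resource types."""
    counts = Counter()
    for resource_types in modules.values():
        # Sort so that each pair has one canonical (alphabetical) form.
        for pair in combinations(sorted(set(resource_types)), 2):
            counts[pair] += 1
    return counts


# Hypothetical input: module name -> resource types it declares.
modules = {
    "vpc-a": ["aws_vpc", "aws_subnet", "aws_route_table", "aws_internet_gateway"],
    "vpc-b": ["aws_vpc", "aws_subnet", "aws_route_table"],
    "bucket": ["aws_s3_bucket", "aws_s3_bucket_policy"],
}

counts = cooccurrence(modules)
print(counts[("aws_subnet", "aws_vpc")])     # 2: both VPC modules declare the pair
print(counts[("aws_s3_bucket", "aws_vpc")])  # 0: never declared together
```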
## What doesn't work

| Dataset | Problem |
|---------|---------|
| Dockerfiles | Too simple: just the Dockerfile spec |
| Pre-commit (cross-project) | 252 unique hooks, no common core |
| GHA per-project | One repo = too many job types |
| Prometheus rules | Schema-enforced, no convention |

Sweet spot: **multiple implementations of the same abstract task**
with a shared but undocumented pattern.
## Usage

```python
from bex.mcp_server import infer_best_grammar

# role_sequences: the example sequences to learn from, prepared beforehand
# (for section 1, one module sequence per role).
output = infer_best_grammar(
    sequences=role_sequences,
    prefer="crx",  # CRX, as named in the sections above
)
```