Dervish — Showcase

Infer the unwritten convention from existing examples. Given N example sequences, produce a ~100-char grammar that captures the structural pattern — in far fewer tokens than the originals.

a.b       → a then b (concatenation)
(a+b)     → a or b (disjunction)
r?        → optional (zero or one)
r+        → one or more (iteration)
r+?       → zero or more
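
For readers more used to POSIX-style regular expressions, the notation above can be mapped onto Python's `re` module. This is only a sketch: the helper names (`sym`, `seq`, `alt`, `opt`, `plus`, `star`, `matches`) are made up for illustration and are not part of Dervish, and it assumes symbols contain no spaces. Note that `r+?` has to become `*`, because a literal `+?` in Python means lazy one-or-more rather than zero-or-more.

```python
import re

def sym(name):
    # A single symbol, stored with a trailing space as separator
    return f"(?:{re.escape(name)} )"

def seq(*parts):   # a.b    -> concatenation
    return "".join(parts)

def alt(*parts):   # (a+b)  -> disjunction
    return "(?:" + "|".join(parts) + ")"

def opt(r):        # r?     -> zero or one
    return f"(?:{r})?"

def plus(r):       # r+     -> one or more
    return f"(?:{r})+"

def star(r):       # r+?    -> zero or more
    return f"(?:{r})*"

def matches(pattern, sequence):
    # Encode the sequence the same way sym() does, then require a full match
    text = "".join(f"{s} " for s in sequence)
    return re.fullmatch(pattern, text) is not None

# a.(b+c)+.d?
pattern = seq(sym("a"), plus(alt(sym("b"), sym("c"))), opt(sym("d")))

print(matches(pattern, ["a", "b", "c", "b"]))  # True
print(matches(pattern, ["a", "d"]))            # False: (b+c)+ needs at least one item
```

Building the pattern from small combinators keeps the two meanings of `+` (disjunction inside parentheses, iteration as a suffix) visibly separate.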

1. Ansible Galaxy roles (15 geerlingguy roles)

15 popular Ansible roles by Jeff Geerling. There is NO written convention for the module ordering in tasks/main.yml. Our grammar is its first explicit description:

Grammar: fail?.(include_vars+set_fact+package+file+template+service+...)+.
         include+?.(npm+pip)+?.lineinfile?

Every role: check preconditions → OS-specific vars → install packages → configure with templates → start services → optionally handle language tooling.

All 15/15 match. ~29× compression (7200+ modules → ~250 chars).

Why it helps an LLM: When generating a new Ansible role, the LLM already knows the structure: fail-check first, then vars, then packages, then config and services. No guessing.

Bonus: core+outlier analysis

Set min_coverage=0.8 to find the tight grammar for the majority while flagging outlier roles with unusual module usage:

Core CRX (80% coverage, 3 outliers):
  fail?.(include_vars+set_fact+package+file+template+service+...)+

Outlier sequences:
  1. phpmyadmin: include_vars → set_fact → include → include → lineinfile
  2. composer:   fail → set_fact → stat → uri → get_url → command
  3. pip:        package → file → pip

phpmyadmin uses raw lineinfile instead of templates; composer needs a stat check and a uri download; pip does little beyond calling the pip module. All three deviate from the mainstream install → configure → enable pattern.
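
To make "coverage" and "outlier" concrete, here is a rough sketch that checks sequences against the core grammar and reports which ones fall outside it. It is not how Dervish selects the core; it only evaluates a fixed pattern. The `...` alternatives in the grammar are elided in the source and are deliberately left out, the function names are invented, and the two `example_*` roles are hypothetical.

```python
import re

# Named alternatives of the core grammar; the "..." alternatives in the
# grammar above are elided in the source and are NOT filled in here.
CORE_MODULES = ["include_vars", "set_fact", "package", "file", "template", "service"]

def core_pattern():
    # fail?.(include_vars+set_fact+...)+ over space-terminated symbols
    body = "|".join(re.escape(m) + " " for m in CORE_MODULES)
    return re.compile(rf"(?:fail )?(?:{body})+")

def split_by_core(sequences):
    """Return (coverage, outliers) for a dict of role name -> module sequence."""
    pattern = core_pattern()
    outliers = {
        name: mods for name, mods in sequences.items()
        if not pattern.fullmatch("".join(f"{m} " for m in mods))
    }
    coverage = 1 - len(outliers) / len(sequences)
    return coverage, outliers

sequences = {
    # The three outlier sequences listed above
    "phpmyadmin": ["include_vars", "set_fact", "include", "include", "lineinfile"],
    "composer": ["fail", "set_fact", "stat", "uri", "get_url", "command"],
    "pip": ["package", "file", "pip"],
    # Hypothetical conforming roles, invented for this example
    "example_a": ["fail", "include_vars", "package", "template", "service"],
    "example_b": ["package", "file", "template", "service"],
}

coverage, outliers = split_by_core(sequences)
print(f"coverage: {coverage:.0%}")
print("outliers:", sorted(outliers))
```

With this toy input, the three documented outliers are rejected and only the two invented roles match, so the printed coverage describes the example data, not the real 15-role dataset.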

2. Helm charts — cross-project convention (15 charts, 6 publishers)

15 popular Helm charts from Bitnami (10), Grafana, Jetstack (cert-manager), Argo, Ingress-Nginx, and Elastic. Different publishers, different purposes (databases, web servers, infrastructure tools) — but they converged on a common resource ordering:

Best: CRX | MDL 230
Grammar: NetworkPolicy?.PodDisruptionBudget?.ServiceAccount?.Secret?
         .ConfigMap?.PersistentVolumeClaim?.ClusterRole?.ClusterRoleBinding?
         .Role?.RoleBinding?.Service.Deployment?.StatefulSet?.
         (IngressClass+MutatingWebhookConfiguration)?.ValidatingWebhookConfiguration?.Job?

Match rates: CRX=15/15

Every chart follows: resilience → identity → data → service → workload → extensions.

Service is the only resource type that appears in all 15 charts. Bitnami charts (10/15) consistently start with NetworkPolicy + PodDisruptionBudget before identity and service. Infrastructure tools (cert-manager, grafana, argo-cd, ingress-nginx) add RBAC and admission webhooks for cluster-wide access.

Why it helps an LLM: Generating a Helm chart template? You know the structure: start with availability guarantees (PDB, NetworkPolicy), then identity (ServiceAccount, Secrets), then the Service endpoint, then your workload type. Only cluster-wide tools need RBAC and webhooks — skip them for simple application charts.
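
One way a generator might apply this grammar is to sort the resources it plans to emit into the inferred order. The following is a minimal sketch under that assumption; `KIND_ORDER`, `order_resources`, and `missing_required` are illustrative names, not part of Dervish or Helm.

```python
# Resource kinds in the order they appear in the grammar above.
# IngressClass and MutatingWebhookConfiguration share one slot (a disjunction),
# so their relative order here is arbitrary.
KIND_ORDER = [
    "NetworkPolicy", "PodDisruptionBudget", "ServiceAccount", "Secret",
    "ConfigMap", "PersistentVolumeClaim", "ClusterRole", "ClusterRoleBinding",
    "Role", "RoleBinding", "Service", "Deployment", "StatefulSet",
    "IngressClass", "MutatingWebhookConfiguration",
    "ValidatingWebhookConfiguration", "Job",
]
RANK = {kind: i for i, kind in enumerate(KIND_ORDER)}

def order_resources(kinds):
    """Sort kinds into grammar order; unknown kinds keep their input order at the end."""
    return sorted(kinds, key=lambda k: RANK.get(k, len(KIND_ORDER)))

def missing_required(kinds):
    # Service is the only symbol in the grammar without a '?' suffix
    return [] if "Service" in kinds else ["Service"]

planned = ["Deployment", "Service", "ServiceAccount", "ConfigMap"]
print(order_resources(planned))
print(missing_required(["Deployment", "ConfigMap"]))
```

Because `sorted` is stable, kinds the grammar does not mention are appended in the order they were given rather than being dropped.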

3. GitHub Actions (cross-project Go lint, 6 jobs)

Lint jobs from prometheus, goreleaser, cosign, sigstore:

Best: CRX | MDL 13.6
Grammar: actions/checkout.(actions/setup-go+run:echo+run:sudo)+.
         golangci/golangci-lint-action?.megalinter?

Four independently-maintained Go projects converged on: checkout → setup Go → run golangci-lint. Only the biggest add megalinter.

Why it helps an LLM: Setting up CI for a Go project on GitHub Actions? The grammar encodes an emergent cross-project convention — four teams wrote the same pipeline without coordinating.
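
Because this grammar has only six symbols, a job's step list can be checked against it by encoding each step as one character and using an ordinary regular expression. This is a sketch only: the letter assignments and the `conforms` helper are arbitrary choices made for illustration.

```python
import re

# One character per step symbol (the letters are arbitrary, chosen for this sketch)
ALPHABET = {
    "actions/checkout": "C",
    "actions/setup-go": "G",
    "run:echo": "E",
    "run:sudo": "S",
    "golangci/golangci-lint-action": "L",
    "megalinter": "M",
}

# actions/checkout.(actions/setup-go+run:echo+run:sudo)+
#   .golangci/golangci-lint-action?.megalinter?
LINT_JOB = re.compile(r"C[GES]+L?M?")

def conforms(steps):
    """True if the step sequence matches the lint-job grammar."""
    try:
        encoded = "".join(ALPHABET[s] for s in steps)
    except KeyError:
        return False  # a step outside the grammar's alphabet
    return LINT_JOB.fullmatch(encoded) is not None

print(conforms(["actions/checkout", "actions/setup-go",
                "golangci/golangci-lint-action"]))
print(conforms(["actions/setup-go", "actions/checkout"]))
```

The first call follows the checkout → setup Go → lint order and is accepted; the second puts setup before checkout and is rejected.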

What doesn't work

Dataset                      Problem
Dockerfiles                  Too simple: just the Dockerfile spec
Pre-commit (cross-project)   252 unique hooks, no common core
GHA per-project              One repo = too many job types
Prometheus rules             Schema-enforced, no convention

Sweet spot: multiple implementations of the same abstract task with a shared but undocumented pattern.
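
A quick screen for whether a dataset sits in that sweet spot is to compare the total alphabet with the symbols shared by every sequence. The sketch below is an assumed heuristic, not something Dervish is documented to do, and the toy sequences are invented.

```python
def alphabet_stats(sequences):
    """Return (unique_symbols, symbols_in_every_sequence) for a list of sequences."""
    symbol_sets = [set(seq) for seq in sequences]
    if not symbol_sets:
        return set(), set()
    unique = set().union(*symbol_sets)
    common = set.intersection(*symbol_sets)
    return unique, common

# Hypothetical toy data, not taken from the datasets above
shared = [["checkout", "setup", "lint"], ["checkout", "setup", "test", "lint"]]
disjoint = [["black", "isort"], ["eslint", "prettier"]]

for label, data in (("shared", shared), ("disjoint", disjoint)):
    unique, common = alphabet_stats(data)
    print(f"{label}: {len(unique)} unique symbols, {len(common)} common to all")
```

A large alphabet with an empty common core, as in the pre-commit row above, suggests there is little shared structure for a grammar to capture.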

Usage

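from bex import infer_ensemble

# role_sequences is assumed to hold one sequence per role: the ordered module
# names from that role's tasks/main.yml, e.g. ["package", "file", "pip"]

# Pick the best result across all 3 algorithms (CRX + iDRegEx + kOREInference)
result = infer_ensemble(role_sequences)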
print(f"Best: {result['best']['algorithm']}")
print(f"Grammar: {result['best']['grammar']}")

# Or: find the tight core + flag outliers
result = infer_ensemble(role_sequences, min_coverage=0.8)
print(f"Core: {result['core']['grammar']}")
print(f"Outliers ({result['core']['outlier_count']}):")
for i, o in enumerate(result['core']['outliers'], 1):
    print(f"  {i}. {' → '.join(str(x) for x in o[:8])}{'...' if len(o) > 8 else ''}")
