diff --git a/README.md b/README.md
index 11dd1b1..baffa62 100644
--- a/README.md
+++ b/README.md
@@ -79,6 +79,11 @@ Agent: Let me check what pattern the existing community roles follow.
 
 **With Dervish:** one MCP call returns a ~60-token grammar known to match 15/15 existing roles. The agent follows it reliably.
 
+**Core+outlier mode:** When generating a new role, the agent can call with
+`min_coverage=0.8` to learn the mainstream pattern while seeing which roles
+deviate and why — useful when the user's case resembles an outlier
+(e.g., a PHP app like phpmyadmin that needs raw `lineinfile`).
+
 ## Quick Start
 
 ```bash
@@ -125,6 +130,8 @@ Dervish has been tested against public datasets from Ansible Galaxy, Helm, and G
 
 The sweet spot: **multiple implementations of the same abstract task** with a shared but undocumented pattern. Not everything works — Dockerfiles, pre-commit configs, and schema-enforced formats are too rigid or too diverse to yield a convention.
 
+> **kOREInference note:** Algorithm 4 (iDRegEx with MDL, arXiv 1004.2372) is included for paper-faithful correctness. On real tool-sequence data, its rwr₀ repair step returns ∅ because the k-OA is rarely SORE (interconnected symbols). The ensemble falls back to CRX or iDRegEx automatically.
+
 ## Algorithm Selection Guide
 
 | When | Use | Why |
@@ -139,8 +146,9 @@ The sweet spot: **multiple implementations of the same abstract task** with a sh
 
 | Data property | Winner | Why |
 |---------------|--------|-----|
-| Diverse patterns, full vocabulary needed | CRX | Captures all symbols. iDRegEx/kOREInference return ∅. |
+| Diverse patterns, full vocabulary needed | CRX | Captures all symbols. iDRegEx returns ∅. |
 | Clean sequences with clear core | iDRegEx | Extracts minimal common subsequence. CRX buries it in optional noise. |
+| Interconnected (non-SORE) data | CRX | kOREInference (rwr₀) returns ∅ when k-OA is not SORE. CRX handles it. |
 | Single sequence | iDRegEx (+ RWR₀) | RWR₀ repair produces a grammatical regex from one example. |
 | 2–3 sequences | iDRegEx | CRX overfits. iDRegEx handles noise better. |
 | Many sequences, tight pattern | CRX | Learns precise concatenation with optional suffixes. |
diff --git a/SHOWCASE.md b/SHOWCASE.md
index fc2ff39..0eb081f 100644
--- a/SHOWCASE.md
+++ b/SHOWCASE.md
@@ -34,6 +34,25 @@ All 15/15 match. **~29× compression** (7200+ modules → ~250 chars).
 exact structure: fail-check first, then vars, then packages, then config/svc.
 No guessing.
 
+### Bonus: core+outlier analysis
+
+Set `min_coverage=0.8` to find the tight grammar for the majority while
+flagging outlier roles with unusual module usage:
+
+```
+Core CRX (80% coverage, 3 outliers):
+ fail?.(include_vars+set_fact+package+file+template+service+...)+
+
+Outlier sequences:
+ 1. phpmyadmin: include_vars → set_fact → include → include → lineinfile
+ 2. composer: fail → set_fact → stat → uri → get_url → command
+ 3. pip: package → file → pip
+```
+
+phpmyadmin uses raw `lineinfile` instead of templates; composer needs
+a `stat` check + `uri` download; pip is purely `pip` — all three deviate
+from the mainstream install → configure → enable pattern.
+
 ## 2. Helm chart (kube-prometheus-stack, 6 configs)
 
 6 different `values.yaml` files rendered through the same chart:
@@ -77,10 +96,15 @@ with a shared but undocumented pattern.
 ## Usage
 
 ```python
-from bex.mcp_server import infer_best_grammar
+from bex import infer_ensemble
 
-output = infer_best_grammar(
-    sequences=role_sequences,
-    prefer="crx",
-)
+# Pick best across all 3 algorithms (CRX + iDRegEx + kOREInference)
+result = infer_ensemble(role_sequences)
+print(f"Best: {result['best']['algorithm']}")
+print(f"Grammar: {result['best']['grammar']}")
+
+# Or: find the tight core + flag outliers
+result = infer_ensemble(role_sequences, min_coverage=0.8)
+print(f"Core: {result['core']['grammar']}")
+print(f"Outliers: {result['core']['outliers']}")
 ```
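
Review note: the core+outlier idea in this change can be illustrated without the `bex` package. The sketch below is not how `infer_ensemble` works internally; it only shows what an 80% coverage split means for a grammar that is already known. The function `split_core_outliers`, the constant `CORE_PATTERN`, and the twelve `role_N` entries are hypothetical names and synthetic data introduced here. The pattern is a Python `re` approximation of the showcase grammar, limited to the six modules the showcase lists explicitly, since the rest of the alternation is elided there.

```python
import re

# Hypothetical approximation of the showcase core grammar: an optional `fail`,
# then one or more mainstream modules. Only the six modules named in the
# showcase are included; the elided remainder is deliberately left out.
CORE_PATTERN = (
    r"(?:fail )?"
    r"(?:(?:include_vars|set_fact|package|file|template|service)(?: |$))+"
)


def split_core_outliers(sequences, pattern=CORE_PATTERN, min_coverage=0.8):
    """Split named module sequences into pattern matches and outliers."""
    compiled = re.compile(pattern)
    core, outliers = [], []
    for name, modules in sequences.items():
        # Join module names with spaces so the regex sees whole tokens.
        encoded = " ".join(modules)
        if compiled.fullmatch(encoded):
            core.append(name)
        else:
            outliers.append(name)
    coverage = len(core) / len(sequences) if sequences else 0.0
    return {
        "core": core,
        "outliers": outliers,
        "coverage": coverage,
        "meets_threshold": coverage >= min_coverage,
    }


# Twelve synthetic mainstream roles (placeholders, not real Galaxy data).
roles = {
    f"role_{i}": ["fail", "include_vars", "package", "template", "service"]
    for i in range(12)
}
# The three outlier sequences quoted in the showcase.
roles.update({
    "phpmyadmin": ["include_vars", "set_fact", "include", "include", "lineinfile"],
    "composer": ["fail", "set_fact", "stat", "uri", "get_url", "command"],
    "pip": ["package", "file", "pip"],
})

result = split_core_outliers(roles)
print(result["outliers"], result["coverage"], result["meets_threshold"])
```

With 12 of 15 sequences matching, the split sits exactly at the 0.8 threshold, and the three non-matching roles are the ones the showcase lists as outliers.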