rename to Dervish MCP; expand description with token-savings framing; add xkcd-style bar charts; link papers to actual URLs
parent 6d1c033267
commit b05c3ee116
5 changed files with 103 additions and 7 deletions
README.md (33 lines changed)
@@ -1,8 +1,19 @@
-# Dervish
+# Dervish MCP
 
 <p align="center"><img src="dervish.gif" alt="Dervish"></p>
 
-**Dervish** infers **regular expression grammars** from example sequences using the BEX family of algorithms. Given a set of example sequences (strings over some alphabet), it learns a compact regular expression that describes the general pattern.
+**Dervish** infers **regular expression grammars** from example sequences using the BEX family of algorithms. Given a set of example sequences (strings over some alphabet), it learns a compact regular expression that captures the general pattern.
+
+Every codebase has unwritten conventions: the order tasks appear in Ansible roles, the resources a Helm chart always creates, the steps every CI pipeline runs. Nobody writes these down. They emerge from copying and converging.
+
+When an LLM agent needs to follow these conventions, it usually has two bad options:
+
+1. **Stuff every existing file into context**: 15 Ansible roles ≈ 5,000 tokens. You'll hit the context window by the third example.
+2. **Guess from one or two examples**: the LLM infers a pattern and often gets it wrong.
+
+Dervish replaces both with a **one-call MCP tool**: pass your sequences, get back a ~60-token grammar. A rule you can trust, at a fraction of the cost.
+
+**Without Dervish:** token cost scales linearly with examples. **With Dervish:** one compact grammar describes them all, a ~60–200 token rule instead of thousands of tokens of raw examples.
 
 ## MCP Server
 
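The cost framing in the README text added by this commit (raw examples scale linearly, the grammar stays roughly flat) can be sketched with a few lines of arithmetic. This is an illustrative sketch only: the helper names `raw_context_cost` and `compression_ratio` are hypothetical and not part of Dervish, and the constants are the approximate figures quoted in the README, not measurements.

```python
# Approximate figures quoted in the README text of this commit (assumptions,
# not measured values).
TOKENS_PER_EXAMPLE = 100      # "~100 tokens" per raw example
GRAMMAR_TOKENS_TIGHT = 60     # "~60 tokens for a tight pattern"
GRAMMAR_TOKENS_DIVERSE = 200  # "~200 for diverse data"


def raw_context_cost(n_examples: int,
                     tokens_per_example: int = TOKENS_PER_EXAMPLE) -> int:
    """Tokens needed to paste n_examples raw examples into context (linear)."""
    return n_examples * tokens_per_example


def compression_ratio(raw_tokens: int, grammar_tokens: int) -> float:
    """How many times smaller the grammar is than the raw examples."""
    return raw_tokens / grammar_tokens


if __name__ == "__main__":
    for n in (1, 5, 15, 36):
        raw = raw_context_cost(n)
        ratio = compression_ratio(raw, GRAMMAR_TOKENS_TIGHT)
        print(f"{n:>2} examples: raw={raw:>4} tokens, "
              f"vs tight grammar={ratio:.1f}x")
```

With the per-dataset figures used for the Ansible chart bar (5,000 raw tokens against a 60-token grammar), `compression_ratio(5000, 60)` comes to a little over 83, which matches the upper end of the range quoted in the README.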
@@ -196,6 +207,20 @@ The sweet spot: **multiple implementations of the same abstract task** (like "de
 | 2–3 sequences | iDRegEx | CRX overfits. iDRegEx handles noise better. |
 | Many sequences, tight pattern | CRX | Learns precise concatenation with optional suffixes. |
 
+## Token savings
+
+<p align="center">
+<img src="chart_context_cost.png" alt="Context cost: raw examples vs Dervish grammar" width="75%">
+</p>
+
+Without Dervish, including N examples in context costs about N × 100 tokens. With Dervish, the grammar stays small and flat: ~60 tokens for a tight pattern, ~200 for diverse data.
+
+<p align="center">
+<img src="chart_token_savings.png" alt="Token savings per dataset" width="75%">
+</p>
+
+Across all public benchmarks, Dervish delivers **40–83× compression**. The grammar is smaller than a single example file would be, yet it represents the entire dataset.
+
 ## How MDL scoring works
 
 ```
@@ -217,8 +242,8 @@ The ensemble selects the grammar with the lowest total MDL.
 
 ## Papers
 
-- **Bex et al.** *"Inferring Deterministic Regular Expressions from Positive Data"*, TODS 2010
-- **Bex et al.** *"Inferring k-optimal REs from Positive Data"*, arXiv:1004.2372
+- **Bex et al.** *[Learning Deterministic Regular Expressions for the Web](https://doi.org/10.1145/1806907.1806911)*, TODS 2010
+- **Bex et al.** *[Simplifying XML Schema: Single-Type Approximations of Regular Expressions](https://arxiv.org/abs/1004.2372)*, arXiv:1004.2372
 
 ## Tests
 
@@ -253,9 +253,9 @@ the pattern they all share. The structural convention is in the data
 ## References
 
 - Bex, G. J., Gelade, W., Neven, F., & Vansummeren, S. (2010).
-*Learning Deterministic Regular Expressions for the Web.* TODS 2010.
+[*Learning Deterministic Regular Expressions for the Web.*](https://doi.org/10.1145/1806907.1806911) TODS 2010.
 - Bex, G. J., Gelade, W., Martens, W., & Neven, F. (2010).
-*Simplifying XML Schema: Single-Type Approximations of Regular
-Expressions.* arXiv:1004.2372.
+[*Simplifying XML Schema: Single-Type Approximations of Regular
+Expressions.*](https://arxiv.org/abs/1004.2372) arXiv:1004.2372.
 - Rissanen, J. (1978). *Modeling by shortest data description.*
 Automatica 14(5).
chart_context_cost.png (BIN, normal file)
Binary file not shown. Size after: 161 KiB
chart_token_savings.png (BIN, normal file)
Binary file not shown. Size after: 137 KiB
make_charts.py (71 lines changed, normal file)
@@ -0,0 +1,71 @@
+import matplotlib.pyplot as plt
+import numpy as np
+
+plt.xkcd(scale=0.7, length=60, randomness=2)
+
+FIG_W = 8
+FIG_H = 5
+
+# ── Chart 1: Context cost vs examples ──
+fig1, ax1 = plt.subplots(figsize=(FIG_W, FIG_H))
+
+N = [1, 5, 15, 36]
+raw = [100, 500, 1500, 3600]  # ~100 tokens/example
+dervish = [40, 60, 60, 200]  # grammar grows only when diversity grows
+
+x = np.arange(len(N))
+w = 0.35
+
+bars1 = ax1.bar(x - w/2, raw, w, label='Raw examples', color='#e74c3c', alpha=0.85)
+bars2 = ax1.bar(x + w/2, dervish, w, label='Dervish grammar', color='#3498db', alpha=0.85)
+
+ax1.set_xticks(x)
+ax1.set_xticklabels([f'{n} examples' for n in N])
+ax1.set_ylabel('Tokens needed in context')
+ax1.set_title('Context cost: raw examples vs Dervish grammar')
+ax1.legend(frameon=False)
+
+for bar in bars1:
+    ax1.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 80,
+             f'{int(bar.get_height())}', ha='center', va='bottom', fontsize=9)
+for bar in bars2:
+    ax1.text(bar.get_x() + bar.get_width()/2, bar.get_height() + 80,
+             f'{int(bar.get_height())}', ha='center', va='bottom', fontsize=9)
+
+ax1.set_ylim(0, 4500)
+fig1.tight_layout()
+fig1.savefig('chart_context_cost.png', dpi=200)
+plt.close(fig1)
+
+# ── Chart 2: Tokens — Without vs With Dervish (per dataset) ──
+fig2, ax2 = plt.subplots(figsize=(FIG_W, FIG_H))
+
+datasets = ['Ansible Galaxy\n(15 roles)', 'Helm\n(6 configs)', 'Go lint\n(6 jobs)']
+without = [5000, 3000, 900]
+with_derv = [60, 40, 30]
+ratios = [f'{int(w/d)}×' for w, d in zip(without, with_derv)]
+
+x2 = np.arange(len(datasets))
+w2 = 0.3
+
+bw = ax2.bar(x2 - w2/2, without, w2, label='Without Dervish', color='#e74c3c', alpha=0.85)
+bd = ax2.bar(x2 + w2/2, with_derv, w2, label='With Dervish', color='#3498db', alpha=0.85)
+
+ax2.set_xticks(x2)
+ax2.set_xticklabels(datasets)
+ax2.set_ylabel('Tokens')
+ax2.set_title('Token savings per dataset')
+ax2.legend(frameon=False)
+ax2.set_yscale('log')
+ax2.set_ylim(5, 30000)
+
+# Label compression ratios
+for i, (r, wbar, dbar) in enumerate(zip(ratios, bw, bd)):
+    ax2.text(x2[i], without[i] * 1.3, r, ha='center', va='bottom', fontsize=11, fontweight='bold',
+             bbox=dict(boxstyle='round,pad=0.2', facecolor='white', edgecolor='gray', alpha=0.8))
+
+fig2.tight_layout()
+fig2.savefig('chart_token_savings.png', dpi=200)
+plt.close(fig2)
+
+print("Charts saved: chart_context_cost.png, chart_token_savings.png")