From 136ae08fe33b84e1159119d4ac128fc0f68e6132 Mon Sep 17 00:00:00 2001
From: tobi
Date: Thu, 2 Jul 2026 18:06:38 +0000
Subject: [PATCH] Update README.md

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index dbd08e6..186d6bb 100644
--- a/README.md
+++ b/README.md
@@ -25,7 +25,7 @@ When an LLM agent needs to follow these conventions, it usually has two bad opti
 2. **Guess from one or two examples** — the LLM infers a pattern and often gets it wrong.
 Dervish replaces both with a **one-call MCP tool**: pass your sequences, get back a ~60-token grammar.
-By leveraging **Minimum Description Length (MDL) scoring**, Dervish treats the grammar discovery problem as an optimal compression task—meaning the resulting rule is mathematically tuned to consume as few tokens as possible without losing the pattern.
+By leveraging **Minimum Description Length (MDL) scoring**, Dervish treats the grammar discovery problem as an optimal compression task. The resulting rule is optimized to consume as few tokens as possible without losing the pattern.
 **Without Dervish:** token cost scales linearly with examples.
 **With Dervish:** one compact grammar describes them all — a ~60–200 token rule instead of thousands of tokens of raw examples.
 Try it out and you too will say: