Commit graph

5 commits

Author SHA1 Message Date
tobjend
b05c3ee116 rename to Dervish MCP; expand description with token-savings framing; add xkcd-style bar charts; link papers to actual URLs 2026-07-01 11:05:03 +02:00
tobjend
25d844d1f9 purge Portainer references, format-specific tools, and Domain Adapters section; make showcases concrete with extracted types 2026-07-01 10:36:04 +02:00
tobjend
097dfc9954 Rename to Dervish, add animated logo to README 2026-07-01 10:19:08 +02:00
tobjend
9f5bde22d5 Remove bugs section (implementation bugs, not paper bugs), remove Docker Compose (private data), add Portainer templates, fix geerlingguy claim precision
Blog post: remove 'The bugs we found' section (all 4 bugs were from our implementation, not the paper algorithms). Replace company data references in MCP section with Galaxy example. Update ensemble dynamics table with public datasets.

README: replace Docker Compose with Portainer templates in 'Why grammar inference?' table, Real-world Results, and Domain Adapters.

SHOWCASE: replace Docker Compose with Portainer templates.

All claims verified: no public documentation of geerlingguy module ordering convention exists.
2026-07-01 10:15:22 +02:00
tobjend
0e2aec582b Grammar inference engine: CRX + iDRegEx ensemble with MDL scoring, MCP server, showcase, and blog post
- Ensemble inference (infer_ensemble) runs both CRX and iDRegEx, picks best by MDL
- CRX: wide-coverage grammar (accepts all sequences, large vocabulary)
- iDRegEx: minimal core grammar (tightest common pattern)
- MDL scoring: fixed model_cost to count alphabet symbol occurrences, fixed dispatch order in _count_words_fast
- Fixed _match_tokens: rewritten as _match_possible with proper backtracking
- Fixed _parse_parts disjunction: children use _parse_flat_symbol to avoid dot-splitting
- MCP server: infer_best_grammar and infer_grammar tools
- Added prefer parameter (crx/idregex) to skip ensemble
- 28 passing tests
- SHOWCASE.md with geerlingguy Galaxy demonstration
- blog_post.md with full technical deep-dive
2026-07-01 09:51:41 +02:00