grammar-inference-engine

Author	SHA1	Message	Date
tobjend	c8a49f0149	scale PNG logo to 150%	2026-07-01 11:21:02 +02:00
tobjend	9d71763b27	restore original dervish.gif below explanation; add new logo left-aligned, smaller	2026-07-01 11:18:13 +02:00
tobjend	64c3313da9	replace logo with dervis_logo.png; add to SHOWCASE.md and blog_post.md	2026-07-01 11:16:21 +02:00
tobjend	6d1c033267	remove Terraform showcase (everything-optional grammar isn't useful); fix GHA scope claim	2026-07-01 10:42:08 +02:00
tobjend	25d844d1f9	purge Portainer references, format-specific tools, and Domain Adapters section; make showcases concrete with extracted types	2026-07-01 10:36:04 +02:00
tobjend	097dfc9954	Rename to Dervish, add animated logo to README	2026-07-01 10:19:08 +02:00
tobjend	9f5bde22d5	Remove bugs section (implementation bugs, not paper bugs), remove Docker Compose (private data), add Portainer templates, fix geerlingguy claim precision. Blog post: remove 'The bugs we found' section (all 4 bugs were from our implementation, not the paper algorithms); replace company data references in MCP section with Galaxy example; update ensemble dynamics table with public datasets. README: replace Docker Compose with Portainer templates in 'Why grammar inference?' table, Real-world Results, and Domain Adapters. SHOWCASE: replace Docker Compose with Portainer templates. All claims verified: no public documentation of geerlingguy module ordering convention exists.	2026-07-01 10:15:22 +02:00
tobjend	547376894c	Update README and SHOWCASE with real-world dataset evaluations. README: replace outdated company benchmarks with public showcases; add Algorithm Selection Guide; add 'When each algorithm wins' table; add 'Why grammar inference?' table with value prop for LLMs; add 'What doesn't work' section documenting failed approaches; update all domain adapter examples with public results; clean up outdated references (companyweb roles, hashistack terraform). SHOWCASE: add Helm (kube-prometheus-stack) with iDRegEx minimal core; add Docker Compose per-project patterns; add GitHub Actions cross-project Go lint pattern; add Terraform modules with vocabulary analysis; add 'What doesn't work' section; explain WHY each dataset helps an LLM.	2026-07-01 10:04:10 +02:00
tobjend	0e2aec582b	Grammar inference engine: CRX + iDRegEx ensemble with MDL scoring, MCP server, showcase, and blog post. Ensemble inference (infer_ensemble) runs both CRX and iDRegEx, picks best by MDL; CRX: CRX algorithm for wide coverage (accepts all sequences, large vocabulary); iDRegEx: iDRegEx for minimal core grammar (tightest common pattern); MDL scoring: fixed model_cost to count alphabet symbol occurrences, fixed dispatch order in _count_words_fast; fixed _match_tokens: rewritten as _match_possible with proper backtracking; fixed _parse_parts disjunction: children use _parse_flat_symbol to avoid dot-splitting; MCP server: infer_best_grammar and infer_grammar tools; added prefer parameter (crx/idregex) to skip ensemble; 28 passing tests; SHOWCASE.md with Geerlingguy Galaxy demonstration; blog_post.md with full technical deep-dive.	2026-07-01 09:51:41 +02:00

9 commits