# Grammar Inference Engine: Agent Guide

## Overview

This repo implements the BEX family of algorithms for inferring regular expression grammars from example sequences. Use it whenever you need to discover the pattern behind a set of strings or structured sequences.

## Quick Start for Agents

```python
# Fast pattern inference
from bex.crx import CRX
g = CRX().infer([['a','b','c'], ['a','b'], ['a','c']])  # a.(b+c)?

# Probabilistic k-ORE inference (handles noise better)
from bex.idregex import idregex
g = idregex([['a','b','c'], ['a','b'], ['a','c']], kmax=2, N=3)
```

## Use Cases

1. **Ansible role patterns**: extract module sequences from tasks/main.yml, learn per-category grammars
2. **Log analysis**: find common patterns in event sequences
3. **API call patterns**: learn the typical order of API operations
4. **Configuration structure**: discover the schema behind YAML files
5. **Workflow mining**: extract the typical task flow from process logs

## Architecture

Two inference pipelines:

| Pipeline | When to use |
|----------|-------------|
| CRX (fast) | Many examples, need speed, CHAREs output |
| iDRegEx (robust) | Few/noisy examples, need probabilistic handling |

## Running Tests

```bash
python tests/test_bex.py
```

## MCP Roadmap

- [ ] Standalone MCP server wrapping CRX + iDRegEx
- [ ] Tool: `infer_grammar(sequences, method="crx")`
- [ ] Tool: `ansible_role_grammar(roles_dir)`
- [ ] Tool: `yaml_to_sequences(yaml_path)`
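
The Ansible use case and the planned `yaml_to_sequences` tool both need a step that turns a task list into a module sequence. The following is a minimal sketch of that step, not code from this repo. It assumes the YAML has already been parsed into Python dicts (for example with a YAML loader of your choice). The function name `tasks_to_sequence` and the `TASK_KEYWORDS` set are hypothetical, and the keyword set is deliberately partial. Stripping collection prefixes such as `ansible.builtin.` is a design choice of this sketch, made so that short and fully qualified module names map to the same symbol.

```python
from typing import Any, Dict, List

# Task-level keywords that are not modules. This set is an assumption of the
# sketch and is not exhaustive; extend it for real playbooks.
TASK_KEYWORDS = {
    "name", "when", "tags", "register", "become", "become_user", "vars",
    "loop", "loop_control", "notify", "ignore_errors", "changed_when",
    "failed_when", "delegate_to", "environment", "args", "no_log",
    "until", "retries", "delay", "run_once",
    # Block constructs are skipped rather than descended into.
    "block", "rescue", "always",
}


def tasks_to_sequence(tasks: List[Dict[str, Any]]) -> List[str]:
    """Return the module used by each task, in task order (hypothetical helper)."""
    sequence: List[str] = []
    for task in tasks:
        candidates = [
            key for key in task
            if key not in TASK_KEYWORDS and not key.startswith("with_")
        ]
        # Keep only unambiguous tasks: exactly one non-keyword key.
        if len(candidates) == 1:
            # Drop a collection prefix such as "ansible.builtin.".
            sequence.append(candidates[0].rsplit(".", 1)[-1])
    return sequence


example_tasks = [
    {"name": "Install nginx", "apt": {"name": "nginx"}, "become": True},
    {"name": "Write config", "ansible.builtin.template": {"src": "nginx.conf.j2"},
     "notify": "reload nginx"},
    {"name": "Start service", "service": {"name": "nginx", "state": "started"}},
]

print(tasks_to_sequence(example_tasks))  # ['apt', 'template', 'service']
```

One such sequence per role gives a list of lists in the same shape as the Quick Start input, so it could be passed to `CRX().infer(...)` or `idregex(...)` directly.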