remove redundant infer_grammar tool; update docs to single-tool MCP
parent ed495d3477
commit b8cc40177c
3 changed files with 11 additions and 37 deletions
AGENTS.md (14 changes)
@@ -38,8 +38,12 @@ Two inference pipelines:
 python tests/test_bex.py
 ```
 
-## MCP Roadmap
-- [ ] Standalone MCP server wrapping CRX + iDRegEx
-- [ ] Tool: `infer_grammar(sequences, method="crx")`
-- [ ] Tool: `ansible_role_grammar(roles_dir)`
-- [ ] Tool: `yaml_to_sequences(yaml_path)`
+## MCP Server
+
+The primary interface is an MCP server exposing a single tool:
+
+| Tool | Parameters | What it does |
+|------|-----------|-------------|
+| `infer_best_grammar` | `sequences`, `prefer`, `kmax`, `N` | Runs CRX + iDRegEx, picks best by MDL. `prefer='crx'` or `prefer='idregex'` skips ensemble. |
+
+Start it: `python /path/to/bex/mcp_server.py`, then connect any MCP client.
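Note on the hunk above: "connect any MCP client" ultimately means sending a JSON-RPC 2.0 `tools/call` request naming the tool and its arguments. The sketch below only builds that payload with the standard library; it does not open a connection. The helper name `build_tool_call` is hypothetical, and the defaults `kmax=2` and `N=3` are assumed to carry over from the values documented for the removed `infer_grammar` tool.

```python
import json


def build_tool_call(sequences, prefer=None, kmax=2, N=3, request_id=1):
    """Build a JSON-RPC 2.0 `tools/call` request for `infer_best_grammar`.

    Hypothetical helper for illustration; defaults for kmax and N are assumed.
    """
    arguments = {"sequences": sequences, "kmax": kmax, "N": N}
    # Leave `prefer` out entirely so the server can let MDL pick the winner.
    if prefer is not None:
        arguments["prefer"] = prefer
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": "infer_best_grammar", "arguments": arguments},
    }


request = build_tool_call([["a", "b"], ["a", "b", "b"]], prefer="crx")
print(json.dumps(request, indent=2))
```

In practice an MCP client library would construct and send this message after the initialization handshake; the payload is shown only to make the single-tool interface concrete.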
@@ -36,13 +36,12 @@ The primary interface is a **Model Context Protocol (MCP)** server. Connect any
 
 | Tool | Parameters | What it does |
 |------|-----------|-------------|
-| `infer_best_grammar` | `sequences`, `prefer`, `kmax`, `N` | **Recommended.** Runs CRX + iDRegEx, picks best by MDL. Set `prefer='crx'` or `prefer='idregex'` to run one algorithm. |
-| `infer_grammar` | `sequences`, `method`, `kmax`, `N` | Core single-algorithm inference. `method='crx'` (fast, deterministic) or `method='idregex'` (probabilistic EM). |
+| `infer_best_grammar` | `sequences`, `prefer`, `kmax`, `N` | **The only tool you need.** Runs CRX + iDRegEx, picks best by MDL. Set `prefer='crx'` for full coverage or `prefer='idregex'` for minimal core — skips the ensemble and runs one algorithm. |
 
 **Parameters explained:**
+- **`prefer`**: `'crx'` for full vocabulary (accepts all sequences), `'idregex'` for minimal common core (only what every example shares). Omit to let MDL pick the winner.
 - **`kmax`** (1–5): Context window for iDRegEx's k-testable automaton. Higher values capture longer-range dependencies but need more data and are slower. Default 2 works for most cases.
 - **`N`** (1–10): Baum-Welch EM iterations for iDRegEx training. More iterations = better convergence but slower. Default 3 is a good balance.
-- **`prefer`**: Skip the CRX-vs-iDRegEx comparison. Use when you know which algorithm fits your data.
 
 ### Agent workflow
 
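Note on the hunk above: the documented `prefer` behaviour (run one algorithm when set, otherwise run both and keep the lower-MDL result) can be sketched as follows. This is not the code in `ensemble.py`, which the diff does not show. The function `pick_grammar`, the stub inferrers, and their regexes and scores are all hypothetical placeholders.

```python
from typing import Callable, Optional

# Each inferrer returns (regex, mdl_score); lower MDL is assumed to be better.
Inferrer = Callable[[list[list[str]]], tuple[str, float]]


def pick_grammar(
    sequences: list[list[str]],
    inferrers: dict[str, Inferrer],
    prefer: Optional[str] = None,
) -> tuple[str, str]:
    """Return (algorithm_name, regex), honouring `prefer` when it is set."""
    if prefer is not None:
        if prefer not in inferrers:
            raise ValueError(f"Unknown prefer value: {prefer!r}")
        # Only the preferred algorithm runs; the comparison is skipped.
        regex, _ = inferrers[prefer](sequences)
        return prefer, regex
    # No preference: run every algorithm and keep the shortest description.
    results = {name: run(sequences) for name, run in inferrers.items()}
    winner = min(results, key=lambda name: results[name][1])
    return winner, results[winner][0]


# Stub inferrers with made-up outputs, standing in for CRX and iDRegEx.
stubs: dict[str, Inferrer] = {
    "crx": lambda seqs: ("(a|b)+", 12.0),
    "idregex": lambda seqs: ("a b+", 9.5),
}

examples = [["a", "b"], ["a", "b", "b"]]
print(pick_grammar(examples, stubs))                # MDL decides
print(pick_grammar(examples, stubs, prefer="crx"))  # comparison skipped
```

Passing callables rather than precomputed results keeps the key property of `prefer`: the other algorithm is never executed.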
@@ -6,40 +6,11 @@ Run as: python -m bex.mcp_server
 
 from mcp.server.fastmcp import FastMCP
 
-from .crx import CRX
-from .idregex import idregex
 from .ensemble import infer_ensemble, _matches
 
 mcp = FastMCP("grammar-inference", log_level="ERROR")
 
 
-@mcp.tool()
-def infer_grammar(
-    sequences: list[list[str]],
-    method: str = "crx",
-    kmax: int = 2,
-    N: int = 3,
-) -> str:
-    """Infer a grammar (regular expression) from example sequences.
-
-    Args:
-        sequences: List of sequences, each a list of symbols (strings).
-        method: Algorithm to use — 'crx' (fast, deterministic) or 'idregex' (probabilistic, handles noise better).
-        kmax: Maximum k for k-ORE inference (iDRegEx only).
-        N: Number of EM iterations (iDRegEx only).
-
-    Returns:
-        A regular expression string describing the inferred grammar.
-    """
-    if method == "crx":
-        return CRX().infer(sequences)
-    elif method == "idregex":
-        result = idregex(sequences, kmax=kmax, N=N)
-        return result or "∅"
-    else:
-        raise ValueError(f"Unknown method: {method}. Use 'crx' or 'idregex'.")
-
-
 @mcp.tool()
 def infer_best_grammar(
     sequences: list[list[str]],
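Note on the hunk above: callers that used the removed `infer_grammar(method=...)` tool can, per the updated docs, get single-algorithm behaviour from `infer_best_grammar(prefer=...)`. A minimal sketch of translating the old arguments follows. The helper `migrate_call` is hypothetical and not part of this commit. It also assumes nothing about the return format of `infer_best_grammar`, which this diff truncates and which may differ from the plain regex string the old tool returned.

```python
def migrate_call(old_arguments: dict) -> dict:
    """Translate `infer_grammar` arguments into `infer_best_grammar` arguments.

    Hypothetical helper; the old default `method="crx"` comes from the
    removed function signature.
    """
    new_arguments = dict(old_arguments)  # do not mutate the caller's dict
    method = new_arguments.pop("method", "crx")
    if method not in ("crx", "idregex"):
        raise ValueError(f"Unknown method: {method}. Use 'crx' or 'idregex'.")
    # `prefer` pins one algorithm, mirroring what `method` used to select.
    new_arguments["prefer"] = method
    return new_arguments


old_call = {"sequences": [["a", "b"]], "method": "idregex", "kmax": 3}
print(migrate_call(old_call))
```

Pinning `prefer` keeps the old single-algorithm behaviour; dropping it instead would opt into the MDL comparison described in the docs.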