code-search agents mcp token-efficiency local-first

Semble indexes codebases, cuts agent token use by ~98% versus grep+read

Natural-language code search library that returns only relevant snippets to agents via MCP or bash, replacing grep+read workflows. Indexes an average repo in ~250 ms and answers queries in ~1.5 ms, all on CPU.

May 19, 2026

Summary

Agents waste tokens reading full files to find code; Semble returns only the matched chunks, reducing context window pressure and latency on every retrieval step. It replaces manual grep exploration with semantic search that agents can call directly.
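To make the token argument concrete, here is a minimal sketch comparing the context cost of reading a whole file with the cost of returning one matched chunk. Everything in it is an assumption for illustration: the ~4 characters per token estimate, the fixed 5-line chunking, the word-overlap ranking, and the synthetic file. It does not use Semble's API or reproduce its retrieval method.

```python
# Illustrative only: toy data and a toy ranking, not Semble's API or algorithm.

def estimate_tokens(text: str) -> int:
    """Rough token estimate, assuming ~4 characters per token."""
    return max(1, len(text) // 4)


def chunk_lines(source: str, size: int = 5) -> list[str]:
    """Split a file into fixed-size line chunks (a hypothetical chunking scheme)."""
    lines = source.splitlines()
    return ["\n".join(lines[i:i + size]) for i in range(0, len(lines), size)]


def best_chunk(chunks: list[str], query: str) -> str:
    """Pick the chunk containing the most query words (a stand-in for semantic search)."""
    terms = set(query.lower().split())
    return max(chunks, key=lambda c: sum(t in c.lower() for t in terms))


# A synthetic 200-line file with one relevant function in the middle.
file_lines = [f"def helper_{i}(x): return x + {i}" for i in range(200)]
file_lines[100] = "def parse_config(path): return load_yaml(path)"
full_file = "\n".join(file_lines)

snippet = best_chunk(chunk_lines(full_file), "parse config path")
full_cost = estimate_tokens(full_file)      # what a grep+read step would load
snippet_cost = estimate_tokens(snippet)     # what a chunk-level search would return
savings = 1 - snippet_cost / full_cost

print(f"full file: {full_cost} tokens, snippet: {snippet_cost} tokens, saved: {savings:.0%}")
```

The saving in this toy case depends entirely on the ratio of chunk size to file size, so it says nothing about the ~98% figure beyond showing where such a reduction comes from.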

Why it matters

Implementation verdict

Ready now. Drop-in MCP server (Claude Code, Cursor, Codex, OpenCode) or bash tool. Setup is light: uv for the MCP server, or `pip install semble` for the CLI. Intended to replace grep+read workflows. Worth testing immediately if you run agents against large codebases.

Sources

1. returns the exact code snippets they need instantly, using ~98% fewer tokens than grep+read
2. indexes an average repo in ~250 ms and answers queries in ~1.5 ms, all on CPU
3. NDCG@10 of 0.854 on our benchmarks, on par with code-specialized transformer models
4. Everything runs on CPU with no API keys, GPU, or external services
5. ~200x faster indexing and ~10x faster queries than a code-specialized transformer
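
For readers unfamiliar with the NDCG@10 metric quoted above, the following is a minimal sketch of the standard definition: discounted cumulative gain over the top 10 results, divided by the gain of an ideal ordering. The relevance grades in the example are made up, and the ideal ordering is taken over the returned list only, which is a simplification. How Semble's benchmarks grade relevance is not stated in the source.

```python
import math


def dcg_at_k(relevances: list[float], k: int = 10) -> float:
    """Discounted cumulative gain: each result's relevance divided by log2(position + 1)."""
    # enumerate() is 0-based, so position = rank + 1 and the discount is log2(rank + 2).
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances[:k]))


def ndcg_at_k(relevances: list[float], k: int = 10) -> float:
    """DCG normalized by the DCG of the best possible ordering of the same results."""
    ideal = dcg_at_k(sorted(relevances, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal if ideal > 0 else 0.0


# Hypothetical relevance grades for one query's ranked results (1 = relevant, 0 = not).
perfect = ndcg_at_k([1, 0, 0])   # relevant result ranked first
demoted = ndcg_at_k([0, 1, 0])   # same result ranked second
```

A score of 1.0 means the most relevant results appear first; pushing a relevant result down the list lowers the score, with a larger penalty the further it falls.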

Dev Signal
