Setup
Gets you from nothing to "agent searching my docs" in 60 seconds.
Usage
/gnosis:setup # Interactive — asks where your docs live
/gnosis:setup /path/to/docs # Same, but ingest that folder right away
/gnosis:setup --postgres # Print PostgreSQL setup instructions too
Target: $ARGUMENTS
Step 1 — Install
Pick the extras you need. They're additive.
pip install gnosis-mcp # core (SQLite, keyword search)
pip install 'gnosis-mcp[embeddings]' # + local ONNX embeddings (hybrid search)
pip install 'gnosis-mcp[web]' # + web crawler (sitemap / BFS)
pip install 'gnosis-mcp[postgres]' # + PostgreSQL backend
pip install 'gnosis-mcp[rst,pdf]' # + .rst and .pdf file ingestion
pip install 'gnosis-mcp[reranking]' # + cross-encoder reranker
# (OFF by default; tune before turning on —
# the bundled MS-MARCO model can HURT
# dev-doc retrieval by ~27 nDCG@10)
Full stack in one shot:
pip install 'gnosis-mcp[embeddings,web,postgres,rst,pdf]'
Verify:
gnosis-mcp --version # should print the semver (≥ 0.10.13)
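To see which optional extras actually resolved in your environment, you can probe their Python modules directly. The module names below (onnxruntime for embeddings, psycopg for PostgreSQL, docutils for .rst) are assumptions about what the extras pull in, not confirmed package names; adjust to your install:

```shell
# probe which optional dependencies are importable in this environment
# (module names are guesses for the extras above; adjust as needed)
for mod in onnxruntime psycopg docutils; do
  python3 -c "import $mod" 2>/dev/null && echo "$mod: ok" || echo "$mod: missing"
done
```

A "missing" line just means that extra (or its dependency) isn't installed; it doesn't affect core keyword search.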
Step 2 — Pick a backend (usually SQLite)
SQLite (default): zero config. Database auto-creates at
~/.local/share/gnosis-mcp/docs.db. Great up to ~100 k chunks.
PostgreSQL: use when you have multiple concurrent writers, need a shared index across machines, or corpus > 100 k chunks.
# Postgres (only if you need it)
export GNOSIS_MCP_DATABASE_URL="postgresql://user:pass@localhost:5432/mydb"
Check that pgvector is available on the server:
psql "$GNOSIS_MCP_DATABASE_URL" -c "CREATE EXTENSION IF NOT EXISTS vector;"
Step 3 — Initialise the database
Idempotent. Safe to run any time.
gnosis-mcp init-db
Step 4 — Ingest your docs
If $ARGUMENTS includes a path, use it. Otherwise ask the user where
their docs live.
# Preview (dry-run): lists files that would be ingested
gnosis-mcp ingest /path/to/docs --dry-run
# Real ingest, with embeddings for hybrid search
gnosis-mcp ingest /path/to/docs --embed
Default chunk size is 2000 characters (peak of the v0.11 sweep on
real dev-docs). Tune it for your corpus with /gnosis:tune after
you've got some real queries to score against.
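Before tuning, it can help to know roughly how many chunks the default size will produce. A quick back-of-envelope sketch (the ./docs path and the *.md glob are assumptions; point them at your corpus):

```shell
# rough estimate: total corpus characters / 2000-char default chunk size
# (./docs and *.md are placeholders; substitute your real docs folder)
total=$(find ./docs -type f -name '*.md' -exec cat {} + | wc -c)
echo "~$(( (total + 1999) / 2000 )) chunks at the default 2000-char size"
```

This ignores chunk overlap and per-file boundaries, so treat it as an order-of-magnitude check (e.g. whether you're near the ~100 k-chunk SQLite comfort zone), not an exact count.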
Watch mode — auto re-ingests on file changes, no cron needed:
gnosis-mcp serve --watch /path/to/docs --transport streamable-http --rest
Step 5 — Verify
gnosis-mcp check # DB connectivity + schema sanity
gnosis-mcp stats # doc / chunk / embedding counts
Expect something like:
Backend: sqlite
Version: SQLite 3.46.0
chunks_table: ✓ (1,742 rows)
fts_table: ✓
sqlite_vec: ✓ (1,742 vectors)
links_table: ✓ (812 rows)
If anything says ✗, run /gnosis:status for diagnosis.
Step 6 — Wire your editor
Pick one. Each config goes in the project root (or global config — see each editor's docs).
Claude Code — .claude/mcp.json or ~/.claude/mcp.json
{
"mcpServers": {
"gnosis": {
"command": "gnosis-mcp",
"args": ["serve"]
}
}
}
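If you'd rather script the setup, the same file can be written and sanity-checked from the shell (this sketch uses the project-root path from above):

```shell
# write the project-root Claude Code MCP config
mkdir -p .claude
cat > .claude/mcp.json <<'EOF'
{
  "mcpServers": {
    "gnosis": {
      "command": "gnosis-mcp",
      "args": ["serve"]
    }
  }
}
EOF
# fail fast on malformed JSON before the editor ever reads it
python3 -m json.tool .claude/mcp.json > /dev/null && echo "mcp.json: valid JSON"
```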
For shared-state setups (agent teams, parallel tabs) — use HTTP:
{
"mcpServers": {
"gnosis": {
"type": "url",
"url": "http://127.0.0.1:8000/mcp"
}
}
}
Start the server separately:
gnosis-mcp serve --transport streamable-http --rest --watch ./docs
Cursor — .cursor/mcp.json
Same shape as Claude Code. Drop the JSON above.
Windsurf — ~/.codeium/windsurf/mcp_config.json
Same shape.
VS Code (GitHub Copilot) — .vscode/mcp.json
Key is "servers", not "mcpServers":
{
"servers": {
"gnosis": {
"command": "gnosis-mcp",
"args": ["serve"]
}
}
}
JetBrains — Settings → Tools → AI Assistant → MCP Servers
Command: gnosis-mcp, args: serve.
Cline — Cline's MCP panel in the sidebar
Command: gnosis-mcp, args: ["serve"].
Full per-editor guidance (including auth, env vars for write mode, and team / remote setups) lives in llms-install.md.
Step 7 — Report
Print a compact summary to the user:
## Setup complete
| Setting | Value |
|-------------------|-------|
| Backend | sqlite |
| Database | ~/.local/share/gnosis-mcp/docs.db |
| Documents | 42 |
| Chunks | 170 |
| Embeddings | 170 / 170 (100 %) |
| Chunk size | 2000 chars (default) |
| Writable | false (set GNOSIS_MCP_WRITABLE=true to enable upsert/delete) |
Next:
/gnosis:search "your first query" — sanity check retrieval works
/gnosis:tune — find your chunk-size optimum (optional)
/gnosis:ingest git <repo> — index commit history too (optional)
Optional — extras
Git history
gnosis-mcp ingest-git . --since 6m --embed
Web crawl (vendor docs → local index)
gnosis-mcp crawl https://docs.stripe.com --sitemap --embed
Writes from your agent
Off by default for safety. Enable only if you want your agent to call
upsert_doc / delete_doc:
export GNOSIS_MCP_WRITABLE=true
REST API on the same port
gnosis-mcp serve --transport streamable-http --rest
Then: curl 'http://127.0.0.1:8000/api/search?q=auth&limit=5' (quote the URL so the shell doesn't treat the & as a background operator).
Add Bearer auth for any non-localhost exposure:
export GNOSIS_MCP_API_KEY="$(python -c 'import secrets; print(secrets.token_urlsafe(32))')"
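On the client side, the key would then travel in a standard Authorization: Bearer header. A sketch of what a client sends (the header scheme is the usual Bearer convention; the /api/search endpoint is the one from the REST example above):

```shell
# reuse the exported key, or generate one if it isn't set yet
GNOSIS_MCP_API_KEY="${GNOSIS_MCP_API_KEY:-$(python3 -c 'import secrets; print(secrets.token_urlsafe(32))')}"
# this is the header a client attaches; the server must see the same key
printf 'Authorization: Bearer %s\n' "$GNOSIS_MCP_API_KEY"
# e.g.: curl -H "Authorization: Bearer $GNOSIS_MCP_API_KEY" 'http://127.0.0.1:8000/api/search?q=auth&limit=5'
```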
Notes
- Re-running ingest is cheap — content hashes skip unchanged files
- Watch mode beats cron — mtime polling + debounce, no fsnotify dep
- Rerankers are off by default: they help in some domains, hurt in
others. Run /gnosis:tune full to find out which applies to your
corpus before enabling
- PostgreSQL is a drop-in swap: set GNOSIS_MCP_DATABASE_URL and
re-run init-db + ingest. Same tool API.