CLI Reference
Every subcommand of gnosis-mcp. All commands honour GNOSIS_MCP_* env
vars (see config.md) — flags override env.
Quick map
| Command | Purpose |
|---|---|
| serve | Start the MCP server (stdio / HTTP). |
| init-db | Create tables, indexes, triggers. |
| ingest | Ingest local files (md / txt / ipynb / toml / csv / json / rst / pdf). |
| prune | Delete chunks whose source file is gone. |
| ingest-git | Index a git repo's commit history. |
| crawl | Crawl a documentation website and ingest pages. |
| search | Run a search from the command line (sanity check). |
| embed | Backfill embeddings for NULL rows. |
| stats | Print doc / chunk / embedding / access counts. |
| export | Dump documents as JSON or markdown. |
| diff | Dry-run re-ingest: show what would change. |
| check | Verify DB connection, schema, and extensions. |
| cleanup | Purge old access-log rows. |
| fix-link-types | One-off migration for pre-0.10 git-history links. |
| eval | Retrieval-quality harness (Hit@K, MRR, Precision@K). |
serve
Start the MCP server.
gnosis-mcp serve [--transport {stdio,streamable-http,sse}]
[--host HOST] [--port PORT]
[--ingest PATH] [--watch PATH]
[--rest]
| Flag | Description |
|---|---|
| --transport | stdio (default, for editor clients), streamable-http (serve over HTTP), or sse. |
| --host | HTTP bind address (default 127.0.0.1; env GNOSIS_MCP_HOST). |
| --port | HTTP port (default 8000; env GNOSIS_MCP_PORT). |
| --ingest | Ingest this path before starting. |
| --watch | Watch this path for changes and re-ingest automatically (implies --ingest). Uses mtime polling with debounce. |
| --rest | Enable the REST API on the same HTTP port. See rest-api.md. |
Examples
# Default: stdio, SQLite at ~/.local/share/gnosis-mcp/docs.db
gnosis-mcp serve
# Network-accessible MCP + REST with local embeddings
GNOSIS_MCP_EMBED_PROVIDER=local \
gnosis-mcp serve --transport streamable-http --host 0.0.0.0 --rest
# Live-updating from a watched folder
gnosis-mcp serve --watch ./knowledge
init-db
Create the tables, FTS5 / tsvector indexes, triggers, and (on Postgres) the HNSW vector index. Idempotent.
gnosis-mcp init-db [--dry-run]
--dry-run prints the SQL without running it.
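A minimal sketch of capturing the dry-run SQL for review before applying it. It assumes the SQL is printed to standard output; the GNOSIS variable and the preview_schema function are illustrative names, not part of gnosis-mcp.

```shell
# GNOSIS is a hypothetical override so the binary path can be swapped (e.g. in tests).
GNOSIS="${GNOSIS:-gnosis-mcp}"

# Write the SQL that init-db would run to a file without touching the database.
# Assumes --dry-run prints the SQL to stdout.
preview_schema() {
  out="${1:-schema-preview.sql}"
  "$GNOSIS" init-db --dry-run > "$out" || return 1
  echo "SQL preview written to $out"
}
```

For example, run `preview_schema /tmp/schema.sql`, read the file, then run `gnosis-mcp init-db` once the statements look right.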
ingest
Ingest local files. Walks directories, respects file type, chunks by heading depth, skips unchanged files via content hash.
gnosis-mcp ingest PATH
[--dry-run] [--force] [--embed]
[--prune] [--wipe] [--include-crawled]
| Flag | Description |
|---|---|
| PATH | File or directory to ingest. |
| --dry-run | Show what would happen, write nothing. |
| --force | Re-ingest every file even if content hash matches. |
| --embed | Generate embeddings for new/changed chunks (requires an embed provider). |
| --prune | After ingest, delete chunks whose source file is gone. |
| --wipe | Delete every document first (full reset; the nuclear option). |
| --include-crawled | When pruning, also consider crawled URLs. Default is to leave them alone. |
Supported formats
| Extension | Enabled by |
|---|---|
| .md | core |
| .txt | core |
| .ipynb | core (code + markdown cells joined) |
| .toml | core |
| .csv | core |
| .json | core |
| .rst | pip install gnosis-mcp[rst] |
| .pdf | pip install gnosis-mcp[pdf] |
Frontmatter
Ingest extracts YAML frontmatter:
- title: overrides the first-H1 heuristic
- category:, audience:, tags: stored as metadata
- relates_to: inline or list form; emits related edges
- relations: typed-edge block (type: prerequisite, etc.)
Body links ([text](path.md), [[wikilinks]]) become content_link edges.
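To make the frontmatter keys concrete, the sketch below writes a sample document to disk. Every value in it (title, category, tags, file names) is made up for illustration, the YAML list syntax for tags is an assumption, and write_sample_doc is a hypothetical helper rather than anything shipped with gnosis-mcp. The relations: block is omitted because its exact schema is not described here.

```shell
# Write a sample markdown file with YAML frontmatter and one body link.
# All titles, tags, and paths below are placeholder values.
write_sample_doc() {
  dir="${1:-./knowledge}"
  mkdir -p "$dir"
  cat > "$dir/install.md" <<'EOF'
---
title: Installation Guide
category: guides
audience: developers
tags: [setup, install]
relates_to:
  - configuration.md
---
# Install

See [configuration](configuration.md) for the available settings.
EOF
}
```

Running `write_sample_doc ./knowledge` followed by `gnosis-mcp ingest ./knowledge --dry-run` previews the ingest. Going by the rules above, the relates_to entry should yield a related edge and the body link a content_link edge.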
prune
Delete chunks whose source file no longer exists on disk.
gnosis-mcp prune PATH [--dry-run] [--include-crawled]
Safer than --wipe: only touches chunks whose file_path was a local file
under PATH and that file is gone. Crawled URLs are skipped unless you
pass --include-crawled.
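Because prune deletes data, a wrapper that defaults to the dry run can be a useful guard. This is only a sketch: prune_step, its "apply" keyword, and the GNOSIS variable are hypothetical names.

```shell
# GNOSIS is a hypothetical override so the binary path can be swapped (e.g. in tests).
GNOSIS="${GNOSIS:-gnosis-mcp}"

# Dry-run by default; delete only when the second argument is "apply".
prune_step() {
  path="$1"
  if [ "${2:-}" = "apply" ]; then
    "$GNOSIS" prune "$path"
  else
    "$GNOSIS" prune "$path" --dry-run
  fi
}
```

`prune_step ./docs` shows what would be removed; `prune_step ./docs apply` performs the deletion.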
ingest-git
Ingest commit history as searchable documents. One markdown doc per file,
listing the latest commits that touched it. Cross-file co-edit links get
git_co_change; source-file mentions get git_ref.
gnosis-mcp ingest-git REPO
[--since WHEN] [--until WHEN] [--author SUB]
[--max-commits-per-file N]
[--include GLOB] [--exclude GLOB]
[--include-merges]
[--dry-run] [--force] [--embed]
| Flag | Description |
|---|---|
| --since, --until | Date windows. 6m / 2w / 2025-01-01 all work. |
| --author | Filter by author name or email substring. |
| --max-commits-per-file | Default 10 (most recent). |
| --include, --exclude | Glob filters on the file set. |
| --include-merges | Default off (merge commits excluded). |
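A sketch of a typical invocation, combining a date window with a per-file cap and an exclusion. The 'vendor/*' glob, the cap of 5, and the ingest_recent_history and GNOSIS names are placeholders chosen for illustration.

```shell
# GNOSIS is a hypothetical override so the binary path can be swapped (e.g. in tests).
GNOSIS="${GNOSIS:-gnosis-mcp}"

# Index the last six months of history, keeping 5 commits per file
# and skipping a vendored directory. Extra flags are passed through.
ingest_recent_history() {
  repo="$1"; shift
  "$GNOSIS" ingest-git "$repo" \
    --since 6m \
    --max-commits-per-file 5 \
    --exclude 'vendor/*' \
    "$@"
}
```

`ingest_recent_history . --dry-run` previews the run; `ingest_recent_history . --embed` ingests and generates embeddings.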
crawl
Crawl a documentation website and ingest pages as markdown. The web
dependencies are optional and imported only when this command runs;
install them with pip install gnosis-mcp[web].
gnosis-mcp crawl URL
[--sitemap] [--max-depth N]
[--include GLOB] [--exclude GLOB]
[--max-pages N]
[--dry-run] [--force] [--embed]
| Flag | Description |
|---|---|
| --sitemap | Discover URLs via sitemap.xml. Best for large doc sites. |
| --max-depth | BFS link-crawl depth when --sitemap is off (default 1). |
| --include / --exclude | Path glob filters. |
| --max-pages | Safety cap (default 5000). |
| --force | Ignore the ETag / Last-Modified / hash cache. |
Caching. A JSON sidecar at ~/.local/share/gnosis-mcp/crawl-cache.json
stores ETag and hash metadata so subsequent crawls can skip unchanged pages
via conditional requests.
robots.txt. Respected. Same-host redirect on robots.txt is treated as
disallow to block redirect-based spoofing.
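A sketch of a cautious first crawl: sitemap discovery with a page cap well below the default. The cap of 200, the example URL, and the crawl_docs and GNOSIS names are illustrative assumptions.

```shell
# GNOSIS is a hypothetical override so the binary path can be swapped (e.g. in tests).
GNOSIS="${GNOSIS:-gnosis-mcp}"

# Crawl via sitemap.xml with a tight page cap; extra flags are passed through.
crawl_docs() {
  url="$1"; shift
  "$GNOSIS" crawl "$url" --sitemap --max-pages 200 "$@"
}
```

`crawl_docs https://docs.example.com --dry-run` previews the page set. Dropping --dry-run ingests it, and adding --force bypasses the crawl cache described above.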
search
Quick retrieval sanity check from the shell.
gnosis-mcp search "your query" [-n 10] [-c guides] [--embed]
| Flag | Description |
|---|---|
| -n, --limit | Max results (default 5). |
| -c, --category | Filter by category. |
| --embed | Auto-embed the query for hybrid search (needs an embed provider). |
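For a quick post-ingest sanity pass, several queries can be run in one go. The smoke_search function and the GNOSIS variable below are hypothetical; the sketch assumes search exits non-zero on failure.

```shell
# GNOSIS is a hypothetical override so the binary path can be swapped (e.g. in tests).
GNOSIS="${GNOSIS:-gnosis-mcp}"

# Run each argument as a query and print its top three results under a header line.
smoke_search() {
  for q in "$@"; do
    echo "== $q"
    "$GNOSIS" search "$q" -n 3 || return 1
  done
}
```

Example: `smoke_search "install steps" "configuration"`.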
embed
Backfill embeddings for chunks where embedding IS NULL.
gnosis-mcp embed
[--provider {openai,ollama,custom,local}]
[--model NAME] [--batch-size N] [--dry-run]
Flags override the GNOSIS_MCP_EMBED_* env vars.
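A sketch of a backfill followed by a coverage check. The batch size of 32 is an arbitrary value, and backfill_and_report and GNOSIS are illustrative names.

```shell
# GNOSIS is a hypothetical override so the binary path can be swapped (e.g. in tests).
GNOSIS="${GNOSIS:-gnosis-mcp}"

# Backfill missing embeddings with the local provider, then print stats
# so the embedding coverage can be compared before and after.
backfill_and_report() {
  "$GNOSIS" embed --provider local --batch-size 32 "$@" || return 1
  "$GNOSIS" stats
}
```

Pass --dry-run (`backfill_and_report --dry-run`) to preview the backfill first.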
stats
Print a snapshot: doc count, chunk count, coverage of embeddings, top categories, access-log size.
gnosis-mcp stats
export
Dump documents for external pipelines.
gnosis-mcp export [-f {json,markdown}] [-c CATEGORY]
JSON output is a stream of {file_path, title, category, content, ...}
objects — friendly for piping into jq.
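A sketch of the jq pipeline mentioned above, listing the file_path of every exported document in one category. It assumes jq is installed and that the output really is a stream of separate objects rather than a single array; exported_paths and GNOSIS are hypothetical names.

```shell
# GNOSIS is a hypothetical override so the binary path can be swapped (e.g. in tests).
GNOSIS="${GNOSIS:-gnosis-mcp}"

# Print one file_path per exported document in the given category.
# jq applies the filter to each top-level object in the stream.
exported_paths() {
  "$GNOSIS" export -f json -c "$1" | jq -r '.file_path'
}
```

Example: `exported_paths guides`.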
diff
Dry-run re-ingest: show which files would be re-chunked and which would skip (hash match).
gnosis-mcp diff PATH
check
Verify that:
- The database is reachable.
- All required tables / extensions are present.
- FTS5 (SQLite) or tsvector (Postgres) is functional.
Exit 0 on healthy, non-zero with a remediation hint otherwise. Good for
Docker HEALTHCHECK and CI smoke tests.
gnosis-mcp check
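In CI, the database may still be starting when the smoke test runs, so a small retry loop around check can help. This is a sketch: the attempt count, the delay, and the wait_until_healthy and GNOSIS names are illustrative choices.

```shell
# GNOSIS is a hypothetical override so the binary path can be swapped (e.g. in tests).
GNOSIS="${GNOSIS:-gnosis-mcp}"

# Retry `check` until it exits 0 or the attempts run out.
# Usage: wait_until_healthy [attempts] [delay-seconds]
wait_until_healthy() {
  attempts="${1:-5}"
  delay="${2:-2}"
  i=1
  while [ "$i" -le "$attempts" ]; do
    if "$GNOSIS" check; then
      return 0
    fi
    echo "check failed (attempt $i/$attempts)" >&2
    i=$((i + 1))
    # Sleep only if another attempt is coming.
    if [ "$i" -le "$attempts" ]; then
      sleep "$delay"
    fi
  done
  return 1
}
```

`wait_until_healthy 10 3 && gnosis-mcp serve` would start the server only after a check has passed.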
cleanup
Purge old access-log rows.
gnosis-mcp cleanup [--days N]
The default keeps the last 90 days. get_context's popularity signal still
works after trimming because it weights recent accesses more heavily.
fix-link-types
One-off migration. Pre-0.10 git-history docs used generic relates_to
edges; this command re-classifies them as git_co_change / git_ref so
get_graph_stats() can separate curated links from the noisier git-derived
ones. Safe to run multiple times.
gnosis-mcp fix-link-types
eval
Retrieval-quality harness. Runs a small built-in query set against the indexed corpus and reports Hit@5, MRR, Precision@5.
gnosis-mcp eval [--json]
--json emits the metrics only, suitable for piping into CI dashboards.
When benchmarking, re-ingest with --force first so that every file is re-processed before eval runs.
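A sketch of a benchmark step that pairs a forced re-ingest with a saved metrics file. It assumes --json writes to standard output; run_benchmark, the default report name, and the GNOSIS variable are hypothetical.

```shell
# GNOSIS is a hypothetical override so the binary path can be swapped (e.g. in tests).
GNOSIS="${GNOSIS:-gnosis-mcp}"

# Re-ingest every file regardless of content hash, then save the eval metrics.
# Usage: run_benchmark CORPUS_PATH [REPORT_FILE]
run_benchmark() {
  corpus="$1"
  report="${2:-eval-report.json}"
  "$GNOSIS" ingest "$corpus" --force || return 1
  "$GNOSIS" eval --json > "$report"
}
```

Example: `run_benchmark ./docs metrics.json`, after which metrics.json can be uploaded as a CI artifact.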
Environment overrides
Every flag mentioned above has an env-var equivalent under GNOSIS_MCP_*
(see config.md). Environment variables override the built-in defaults;
flags override environment variables.
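The precedence can be illustrated with the port setting. GNOSIS_MCP_PORT and --port come from the serve section; the serve_http wrapper, the GNOSIS variable, and the port numbers are illustrative.

```shell
# GNOSIS is a hypothetical override so the binary path can be swapped (e.g. in tests).
GNOSIS="${GNOSIS:-gnosis-mcp}"

# Session-wide value taken from the environment (overrides the built-in 8000).
export GNOSIS_MCP_PORT=9000

# Extra arguments are forwarded, so a --port flag passed here
# takes precedence over GNOSIS_MCP_PORT.
serve_http() {
  "$GNOSIS" serve --transport streamable-http "$@"
}
```

Under the rules above, `serve_http` would listen on 9000 and `serve_http --port 9100` on 9100.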