# Troubleshooting

Common failure modes, what they mean, and how to recover. Run
`gnosis-mcp check` first — it often names the problem in one line.
## Install

### `ERROR: Could not find a version that satisfies the requirement gnosis-mcp`

You're on Python < 3.11. `python --version` should report 3.11 or newer.
### `ImportError: sqlite3` or cannot load sqlite-vec

Your Python's `sqlite3` was built against an old SQLite (< 3.42) without
loadable extensions. Options:

- Upgrade Python from python.org (its builds ship a recent SQLite).
- Use Postgres: `pip install "gnosis-mcp[postgres]"`, then set `GNOSIS_MCP_DATABASE_URL=postgresql://…`.
- On Linux distros, install `libsqlite3-dev` and rebuild Python.
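To see which side of that cutoff your interpreter is on, a quick diagnostic helps (a sketch: `sqlite_supports_vec` is an illustrative helper, not part of gnosis-mcp):

```python
import sqlite3

def sqlite_supports_vec() -> bool:
    """Rough check: sqlite-vec needs SQLite >= 3.42 plus
    loadable-extension support in the sqlite3 module."""
    version = tuple(int(part) for part in sqlite3.sqlite_version.split("."))
    # Absent when Python was built without --enable-loadable-sqlite-extensions
    has_ext = hasattr(sqlite3.Connection, "enable_load_extension")
    return version >= (3, 42) and has_ext

print(sqlite3.sqlite_version, sqlite_supports_vec())
```

If this prints `False`, one of the options above (new Python build or Postgres) is the fix.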
### `ONNXRuntimeError: LoadLibrary failed`

ONNX Runtime couldn't load its native binary. Usually a mismatch between
`onnxruntime` and glibc on very old Linux. Options:

- `pip install onnxruntime==1.17.*` to pin a version with wider glibc support.
- Switch to a remote embed provider (set `GNOSIS_MCP_EMBED_PROVIDER=openai` or `ollama`) — avoids the native dep entirely.
## Database

### `gnosis-mcp check` reports `no such table: documentation_chunks`

You haven't initialised the database. Run `gnosis-mcp init-db` once.

### `gnosis-mcp check` reports `vec0` table not present (SQLite)

You installed the core package without embeddings. This is fine if you're
only using keyword search. To enable hybrid:

```shell
pip install "gnosis-mcp[embeddings]"
gnosis-mcp init-db  # re-run; idempotent, creates the vec0 table
```
### Postgres: extension "vector" is not available

pgvector isn't installed on your Postgres server.

- Docker: use the `pgvector/pgvector:pg15` image instead of `postgres:15`.
- Managed DB (Supabase, Neon, RDS): enable pgvector in the extensions panel (most managed providers support it).
- Bare-metal: install from source (https://github.com/pgvector/pgvector).

Then run `CREATE EXTENSION vector;` in your database and re-run `gnosis-mcp init-db`.
### "could not read a page from server" / pool saturation

The default `pool_max=3` is small for heavy write loads. Raise it:

```shell
export GNOSIS_MCP_POOL_MAX=15
```
## MCP client integration

### Agent says "I can't find any docs on X"

Check in order:

1. `gnosis-mcp stats` — is the DB actually populated? If zero docs, you haven't ingested.
2. `gnosis-mcp search "X"` from the shell — does it work outside MCP? If yes, the MCP transport is the issue.
3. `gnosis-mcp check` — any errors?
4. Editor config — did you actually wire the server? See llms-install.md.
### Stdio client hangs / "server not responding"

- Another `gnosis-mcp serve` process is holding the SQLite DB open in exclusive mode. Run `ps aux | grep gnosis-mcp` and kill stragglers.
- Your client is writing non-JSON to stdio. Check the client log — look for pre-handshake noise (warnings, print statements).
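To spot that noise, a throwaway filter can be pointed at a captured stdio stream (a sketch, not part of gnosis-mcp; it treats everything before the first JSON-RPC 2.0 message as noise):

```python
import json

def pre_handshake_noise(lines):
    """Return the lines a client wrote to stdio before its first valid
    JSON-RPC message. Any such lines break stdio MCP transports."""
    noise = []
    for line in lines:
        try:
            msg = json.loads(line)
            if isinstance(msg, dict) and msg.get("jsonrpc") == "2.0":
                break  # handshake traffic starts here; stop scanning
        except json.JSONDecodeError:
            pass
        noise.append(line)
    return noise

# Example: a stray warning emitted before the initialize request
captured = [
    "warning: plugin X deprecated",
    '{"jsonrpc": "2.0", "id": 1, "method": "initialize"}',
]
print(pre_handshake_noise(captured))  # ['warning: plugin X deprecated']
```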
"403 / auth" over HTTP
You set GNOSIS_MCP_API_KEY but the client isn't sending the
Authorization: Bearer <key> header, or the header value doesn't match
the env var. curl -H "Authorization: Bearer $KEY" from the shell to
confirm the key itself works.
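For intuition, the check the server presumably performs looks roughly like this (a sketch under assumptions: `key_matches` is an illustrative name and the real implementation may differ):

```python
import hmac

def key_matches(header_value: str, expected_key: str) -> bool:
    """Strip the Bearer prefix and compare against the configured key.
    compare_digest avoids leaking the match position via timing."""
    prefix = "Bearer "
    if not header_value.startswith(prefix):
        return False
    return hmac.compare_digest(header_value[len(prefix):], expected_key)

print(key_matches("Bearer s3cret", "s3cret"))   # True
print(key_matches("bearer s3cret", "s3cret"))   # False: prefix mismatch in this sketch
print(key_matches("Bearer other", "s3cret"))    # False: value mismatch
```

The usual failure is the second case: the header is present but malformed, so the value never reaches the comparison.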
## Ingestion

### Some files were skipped silently

Content hashing: if a file is unchanged since the last ingest, it's skipped.
Use `--force` to re-ingest anyway, or `diff` to preview what would happen.
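The skip rule can be sketched like this (illustrative only: `should_skip` and the in-memory `seen` dict stand in for gnosis-mcp's stored hash state):

```python
import hashlib

def should_skip(content: bytes, path: str, seen: dict, force: bool = False) -> bool:
    """Skip a file when its content hash matches the last ingest,
    unless --force was passed."""
    digest = hashlib.sha256(content).hexdigest()
    if not force and seen.get(path) == digest:
        return True
    seen[path] = digest  # record the new hash for next time
    return False

seen = {}
print(should_skip(b"# Docs v1", "a.md", seen))              # False: first ingest
print(should_skip(b"# Docs v1", "a.md", seen))              # True: unchanged, skipped
print(should_skip(b"# Docs v2", "a.md", seen))              # False: content changed
print(should_skip(b"# Docs v2", "a.md", seen, force=True))  # False: --force re-ingests
```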
"Chunk split inside a code fence" — wait, we promised that can't happen
It shouldn't. File an issue with a repro. In the meantime, move the oversized code block to a sibling file and link to it.
### .ipynb ingested but the cells look weird

We join cell sources in document order, stripping outputs. Executed cells
with escape codes can produce noisy strings. Clear outputs first
(`jupyter nbconvert --clear-output ...`) or export the notebook to
markdown.
### .pdf ingest dies on a particular file

pypdf (our extractor) fails on malformed / scanned PDFs. Extract the text
manually with `pdftotext` and ingest the `.txt` result.
### Watch mode doesn't notice changes

We use mtime polling (works on every OS, no fsnotify dependency). Some
editors save atomically (temp file + rename), which changes the inode, but
mtime still updates, so those saves are detected. If you're on a networked
filesystem with mtime quirks, `touch` the file manually to force detection.
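A minimal sketch of that mtime check (illustrative names; the `os.utime` call below simulates a manual `touch`):

```python
import os
import tempfile

def changed_since(path: str, last_mtime: float) -> bool:
    """The file changed if its st_mtime moved past the last value we saw.
    Atomic saves (temp file + rename) still bump mtime, so they register."""
    return os.stat(path).st_mtime > last_mtime

with tempfile.NamedTemporaryFile("w", suffix=".md", delete=False) as f:
    f.write("v1")
    path = f.name
baseline = os.stat(path).st_mtime
print(changed_since(path, baseline))            # False: nothing happened yet
os.utime(path, (baseline + 10, baseline + 10))  # simulate `touch`
print(changed_since(path, baseline))            # True: detected
os.unlink(path)
```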
"No documents indexed. Run: gnosis-mcp ingest "
Your search query returned zero results AND the chunks table is empty.
Exactly what the message says. Ingest something.
## Search

### Hybrid search isn't faster — in fact it's slower than keyword alone

Reranking is off by default, but if you enabled it, expect about +20 ms per
query. To confirm:

```shell
time gnosis-mcp search "your query"
GNOSIS_MCP_RERANK_ENABLED=false time gnosis-mcp search "your query"
```
### Semantic results don't look semantic

Check:

- `gnosis-mcp stats` — are embeddings actually populated? It reports "N chunks with NULL embeddings".
- You ingested with `--embed`, or you ran `gnosis-mcp embed` afterwards.
- `GNOSIS_MCP_EMBED_PROVIDER` is set. Without it, queries don't get embedded and hybrid degrades to keyword.
### Tuning RRF

`GNOSIS_MCP_RRF_K` (default 60). Higher values flatten the rank curve and
let vector scores contribute more. If your queries are all keyword-ish
(code, identifiers), lower `k` (~30). If they're all natural language,
raise it (~120).
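The mechanics behind that advice, sketched with the standard RRF formula (gnosis-mcp's exact scoring may differ in detail):

```python
def rrf_score(ranks, k: float) -> float:
    """Reciprocal Rank Fusion: a document's score is the sum of
    1/(k + rank) over every result list it appears in."""
    return sum(1.0 / (k + r) for r in ranks)

# How much a top keyword hit out-scores a rank-10 hit, at different k:
for k in (30, 60, 120):
    gap = rrf_score([1], k) - rrf_score([10], k)
    print(f"k={k}: rank-1 vs rank-10 advantage = {gap:.5f}")
# Larger k flattens the curve: the advantage shrinks, so ranks from the
# vector list weigh in more against a strong keyword match.
```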
## Writes

### `upsert_doc` returns "write tools disabled"

`GNOSIS_MCP_WRITABLE` is unset or false. Set it to `true` in the env
of the server process (not the client).

### `upsert_doc` returns "content exceeds max_doc_bytes"

The cap is 50 MB. Either split the doc or bump `GNOSIS_MCP_MAX_DOC_BYTES`. The
default exists so a client bug can't flood your DB with a 2 GB blob.
### Webhook never fires after writes

- `GNOSIS_MCP_WEBHOOK_URL` is unset.
- The target resolves to a private / loopback / link-local IP. By default we refuse those (SSRF guard); the logs will say so. Set `GNOSIS_MCP_WEBHOOK_ALLOW_PRIVATE=true` for intentional loopback setups.
- The target returned non-2xx — we don't retry. Check your webhook server log.
## Web crawl

### "robots.txt disallows this URL"

Respect it. If you own the target and it's a config mistake, fix the
robots.txt. `gnosis-mcp` will not bypass it.
### Crawl finishes with fewer pages than expected

- You passed `--max-pages N` (default 5000) and hit it.
- Many links point to non-HTML assets (PDFs without the `[pdf]` extra, images).
- `--include`/`--exclude` globs filtered them out.
- Rate limiting by the target — retry with exponential backoff (currently manual).
### Re-crawl re-fetches everything

The ETag / Last-Modified / content-hash cache lives at
`~/.local/share/gnosis-mcp/crawl-cache.json`. If you deleted it, or
passed `--force`, you pay full cost.
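The revalidation rule can be sketched like this (illustrative: the in-memory `cache` dict stands in for crawl-cache.json, whose real on-disk format may differ):

```python
import hashlib

def needs_fetch(url: str, cache: dict, etag, body: bytes) -> bool:
    """Skip a page when its ETag or content hash matches the cached entry;
    otherwise record the new values and re-ingest."""
    entry = cache.get(url, {})
    digest = hashlib.sha256(body).hexdigest()
    if etag is not None and entry.get("etag") == etag:
        return False
    if entry.get("hash") == digest:
        return False
    cache[url] = {"etag": etag, "hash": digest}
    return True

cache = {}
print(needs_fetch("https://example.com/a", cache, 'W/"v1"', b"<html>v1</html>"))  # True: first visit
print(needs_fetch("https://example.com/a", cache, 'W/"v1"', b"<html>v1</html>"))  # False: cached
cache.clear()  # what deleting crawl-cache.json amounts to
print(needs_fetch("https://example.com/a", cache, 'W/"v1"', b"<html>v1</html>"))  # True: full cost again
```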
## Performance

### Ingestion is slow

Profile:

- `gnosis-mcp ingest --dry-run` first — shows the file list without writing.
- Check whether `--embed` is on; if so, embedding dominates ingest time.
- Split into stages: `ingest` first, `embed` separately, so you can restart just the embedding pass if it fails.
### Search latency climbs over time

Vacuum the DB:

```sql
-- sqlite
VACUUM;
-- postgres
VACUUM ANALYZE documentation_chunks;
```

Or run `gnosis-mcp cleanup --days 30` to shrink the access-log table,
which can grow unbounded.
### Memory climbs in watch mode

Python's `weakref` isn't always enough with long-lived servers. Known
workaround: restart once a week via systemd `RuntimeMaxSec=` or Docker
`--restart unless-stopped` plus a healthcheck.
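A minimal systemd sketch of that workaround (unit path, ExecStart path, and flags are placeholders; adapt them to your install):

```ini
# /etc/systemd/system/gnosis-mcp.service (illustrative)
[Service]
ExecStart=/usr/local/bin/gnosis-mcp serve
Restart=always
# Recycle the long-lived process weekly to cap memory growth
RuntimeMaxSec=1w
```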
## Still stuck?

- `gnosis-mcp check` — the first 5 lines of output usually localise the problem.
- `gnosis-mcp serve --transport stdio 2>/tmp/gnosis.log` — full log stream.
- `GNOSIS_MCP_LOG_LEVEL=DEBUG gnosis-mcp …` — more detail.
- Open an issue at github.com/nicholasglazer/gnosis-mcp/issues with the command you ran, the exact error, and the output of `gnosis-mcp check`.