docs/troubleshooting.md

Troubleshooting

Common failure modes, what they mean, and how to recover. Run gnosis-mcp check first — it often names the problem in one line.


Install

ERROR: Could not find a version that satisfies the requirement gnosis-mcp

You're on Python < 3.11. python --version should report 3.11 or newer.
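A quick sanity check from any interpreter (standalone Python, not gnosis-mcp code):

```python
import sys

# gnosis-mcp needs Python 3.11+; report what this interpreter actually is
ok = sys.version_info >= (3, 11)
print("Python", sys.version.split()[0], "- OK" if ok else "- too old, need 3.11+")
```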

ImportError: sqlite3 or cannot load sqlite-vec

Your Python's sqlite3 was built against an old SQLite (< 3.42) without loadable extensions. Options:

  1. Upgrade Python via the python.org installer (it ships a recent SQLite).
  2. Use Postgres — pip install gnosis-mcp[postgres] then set GNOSIS_MCP_DATABASE_URL=postgresql://….
  3. On Linux distros, install libsqlite3-dev and rebuild Python.
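To see which of these applies, inspect what your interpreter's sqlite3 was built with. A standalone sketch (not gnosis-mcp code):

```python
import sqlite3

# sqlite-vec needs a reasonably recent SQLite and loadable-extension support
print("SQLite library:", sqlite3.sqlite_version)  # want 3.42+
can_load = hasattr(sqlite3.Connection, "enable_load_extension")
print("loadable extensions compiled in:", can_load)
```

If `can_load` is False, your Python was compiled without extension loading and no pip package can fix it; use option 1, 2, or 3 above.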

ONNXRuntimeError: LoadLibrary failed

ONNX Runtime couldn't load its native binary. Usually a mismatch between onnxruntime and glibc on very old Linux. Options:

  • pip install onnxruntime==1.17.* to pin a version with wider glibc support.
  • Switch to a remote embed provider (set GNOSIS_MCP_EMBED_PROVIDER=openai or ollama) — avoids the native dep entirely.

Database

gnosis-mcp check reports no such table: documentation_chunks

You haven't initialised the database. Run gnosis-mcp init-db once.

gnosis-mcp check reports vec0 table not present (SQLite)

You installed the core package without embeddings. This is fine if you're only using keyword search. To enable hybrid:

pip install "gnosis-mcp[embeddings]"
gnosis-mcp init-db   # re-run; idempotent, creates vec0 table

Postgres: extension "vector" is not available

pgvector isn't installed on your Postgres server.

  • Docker: use pgvector/pgvector:pg15 instead of postgres:15.
  • Managed DB (Supabase, Neon, RDS): enable pgvector in the extensions panel (most managed providers support it).
  • Bare-metal: install from source (https://github.com/pgvector/pgvector).

Then run CREATE EXTENSION vector; in your database and re-run gnosis-mcp init-db.

could not read a page from server / pool saturation

pool_max=3 (default) is small for heavy write loads. Raise it:

export GNOSIS_MCP_POOL_MAX=15

MCP client integration

Agent says "I can't find any docs on X"

Check in order:

  1. gnosis-mcp stats — is the DB actually populated? If zero docs, you haven't ingested.
  2. gnosis-mcp search "X" from the shell — does it work outside MCP? If yes, the MCP transport is the issue.
  3. gnosis-mcp check — any errors?
  4. Editor config — did you actually wire the server? See llms-install.md.

Stdio client hangs / "server not responding"

  • Another gnosis-mcp serve process is holding the SQLite DB open in exclusive mode. ps aux | grep gnosis-mcp and kill stragglers.
  • Your client is writing non-JSON to stdio. Check the client log — look for pre-handshake noise (warnings, print statements).

"403 / auth" over HTTP

You set GNOSIS_MCP_API_KEY but the client isn't sending the Authorization: Bearer <key> header, or the header value doesn't match the env var. Run curl -H "Authorization: Bearer $KEY" against the server from the shell to confirm the key itself works.
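For intuition, the server-side check presumably looks something like this sketch (the function name and details are hypothetical, not gnosis-mcp's actual code):

```python
import hmac

def authorized(auth_header, api_key):
    # Hypothetical sketch of a bearer-token check; not gnosis-mcp internals.
    prefix = "Bearer "
    if not auth_header or not auth_header.startswith(prefix):
        return False  # missing or malformed header -> 403
    # constant-time comparison so the key can't be probed via timing
    return hmac.compare_digest(auth_header[len(prefix):], api_key)

print(authorized("Bearer s3cret", "s3cret"))  # True
print(authorized("Bearer wrong", "s3cret"))   # False
print(authorized(None, "s3cret"))             # False
```

Both the scheme ("Bearer", capital B) and the exact key bytes have to match.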


Ingestion

Some files were skipped silently

Content hashing: if the file is unchanged since last ingest, it's skipped. Use --force to re-ingest anyway, or diff to preview what would happen.
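The skip decision is just a content-hash comparison, roughly like this sketch (helper names are illustrative, not gnosis-mcp internals):

```python
import hashlib
import os
import tempfile

def digest(path):
    # hash file contents, not mtime, so a bare `touch` doesn't trigger re-ingest
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()

def should_skip(path, seen):
    return seen.get(path) == digest(path)

with tempfile.NamedTemporaryFile("w", suffix=".md", delete=False) as f:
    f.write("# unchanged doc\n")
    path = f.name

seen = {path: digest(path)}   # what was stored at the last ingest
skip = should_skip(path, seen)
print(skip)                   # True: content unchanged, so the file is skipped
os.unlink(path)
```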

"Chunk split inside a code fence" — wait, we promised that can't happen

It shouldn't. File an issue with a repro. In the meantime, move the oversized code block to a sibling file and link to it.

.ipynb ingested but the cells look weird

We join cell sources in document order, stripping outputs. Executed cells with escape codes can produce noisy strings. Clear outputs first (jupyter nbconvert --clear-output ...) or export the notebook to markdown.
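The flattening amounts to joining cell sources and dropping outputs, roughly like this sketch (illustrative, not the actual extractor):

```python
import json

def notebook_to_text(nb):
    # join cell sources in document order; outputs never reach the chunk text
    return "\n\n".join("".join(cell["source"]) for cell in nb["cells"])

nb = json.loads("""{"cells": [
    {"cell_type": "markdown", "source": ["# Title\\n"]},
    {"cell_type": "code", "source": ["print('hi')\\n"], "outputs": ["ignored"]}
]}""")
text = notebook_to_text(nb)
print(text)
```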

.pdf ingest dies on a particular file

pypdf (our extractor) fails on malformed / scanned PDFs. Extract manually with pdftotext and ingest the .txt result.

Watch mode doesn't notice changes

We use mtime polling (works on every OS, no fsnotify dependency). Some editors save atomically (write a temp file, then rename), which changes the inode; mtime still updates, so polling catches it. If you're on a networked filesystem with mtime quirks, touch the file manually to force detection.
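Mtime polling reduces to a comparison like this (hypothetical helper, not the real watcher); the `os.utime` call plays the role of a manual `touch`:

```python
import os
import tempfile

def changed(path, last_mtime):
    # one poll tick: compare the current mtime against what we saw last time
    mtime = os.stat(path).st_mtime
    return mtime != last_mtime, mtime

fd, path = tempfile.mkstemp()
os.close(fd)
_, seen = changed(path, None)   # first poll records the baseline

st = os.stat(path)
os.utime(path, (st.st_atime, st.st_mtime + 1))   # simulate `touch`
dirty, _ = changed(path, seen)
print(dirty)                    # True: the bumped mtime is detected
os.unlink(path)
```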

"No documents indexed. Run: gnosis-mcp ingest "

Your search query returned zero results AND the chunks table is empty. Exactly what the message says. Ingest something.


Search

Hybrid search isn't faster — in fact it's slower than keyword alone

Hybrid runs keyword and vector search and fuses the results, so it does more work than keyword search alone by design. Reranking is off by default, but if you enabled it, expect roughly +20 ms per query. To confirm:

time gnosis-mcp search "your query"
time GNOSIS_MCP_RERANK_ENABLED=false gnosis-mcp search "your query"

Semantic results don't look semantic

Check:

  1. gnosis-mcp stats — are embeddings actually populated? It reports "N chunks with NULL embeddings".
  2. Did you ingest with --embed, or run gnosis-mcp embed afterwards?
  3. Is GNOSIS_MCP_EMBED_PROVIDER set? Without it, queries aren't embedded and hybrid degrades to keyword-only.

Tuning RRF

GNOSIS_MCP_RRF_K (default 60). Higher values flatten the rank curve and let vector scores contribute more. If your queries are all keyword-ish (code, identifiers), lower k (~30). If they're all natural language, raise it (~120).
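For intuition, Reciprocal Rank Fusion scores a document as the sum of 1/(k + rank) over the result lists it appears in, so k controls how much rank 1 dominates rank 10:

```python
def rrf(ranks, k=60):
    # Reciprocal Rank Fusion: score(d) = sum over result lists of 1 / (k + rank_of_d)
    return sum(1.0 / (k + r) for r in ranks)

# Ratio of a rank-1 score to a rank-10 score at different k values:
for k in (5, 60, 120):
    print(k, round(rrf([1], k) / rrf([10], k), 2))
# small k -> top ranks dominate; large k -> contributions flatten out
```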


Writes

upsert_doc returns "write tools disabled"

GNOSIS_MCP_WRITABLE is unset or false. Set it to true in the env of the server process (not the client).

upsert_doc returns "content exceeds max_doc_bytes"

50 MB cap. Either split the doc or bump GNOSIS_MCP_MAX_DOC_BYTES. The default exists so a client bug can't flood your DB with a 2 GB blob.
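Note the cap is on bytes, not characters. Assuming it's applied to the UTF-8 encoding (a reasonable guess; check the server's accounting if you're near the limit), multibyte text trips it earlier than len() suggests:

```python
MAX_DOC_BYTES = 50 * 1024 * 1024   # mirrors the documented 50 MB default

def fits(content):
    # measure encoded bytes, not characters
    return len(content.encode("utf-8")) <= MAX_DOC_BYTES

print(fits("a" * 100))             # True
print(len("é".encode("utf-8")))    # 2: non-ASCII characters count more than once
```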

Webhook never fires after writes

  • GNOSIS_MCP_WEBHOOK_URL unset.
  • The target resolves to a private / loopback / link-local IP. By default we refuse those (SSRF guard). Logs will say so. Set GNOSIS_MCP_WEBHOOK_ALLOW_PRIVATE=true for intentional loopback setups.
  • Target returned non-2xx — we don't retry. Check your webhook server log.
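The SSRF guard's refusal amounts to an address-class check on the resolved IP, like this sketch (illustrative; the real guard may differ in detail):

```python
import ipaddress

def blocked(ip):
    # refuse anything that isn't a public address
    addr = ipaddress.ip_address(ip)
    return addr.is_private or addr.is_loopback or addr.is_link_local

print(blocked("127.0.0.1"))      # True: loopback
print(blocked("10.0.0.5"))       # True: RFC 1918 private
print(blocked("169.254.1.1"))    # True: link-local
print(blocked("93.184.216.34"))  # False: public, webhook allowed
```

Because the check runs on the resolved address, a public hostname that resolves to 127.0.0.1 is refused too.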

Web crawl

"robots.txt disallows this URL"

Respect it. If you own the target and it's a config mistake, fix the robots.txt. gnosis-mcp will not bypass.

Crawl finishes with fewer pages than expected

  • You passed --max-pages N (default 5000) and hit it.
  • Many links point to non-HTML assets (PDFs without [pdf] extra, images).
  • --include / --exclude globs filtered them out.
  • Rate limiting by the target — retry with exponential backoff (currently manual).

Re-crawl re-fetches everything

The ETag / Last-Modified / content-hash cache lives at ~/.local/share/gnosis-mcp/crawl-cache.json. If you deleted it, or passed --force, you pay full cost.
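Each cache entry enables a conditional re-fetch, roughly like this (field names are assumed, not necessarily what crawl-cache.json stores):

```python
def conditional_headers(entry):
    # a 304 Not Modified reply means the cached page is reused instead of re-fetched
    headers = {}
    if entry.get("etag"):
        headers["If-None-Match"] = entry["etag"]
    if entry.get("last_modified"):
        headers["If-Modified-Since"] = entry["last_modified"]
    return headers

print(conditional_headers({"etag": '"abc123"'}))  # revalidates against the ETag
print(conditional_headers({}))                    # {}: no entry, full unconditional GET
```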


Performance

Ingestion is slow

Profile:

  1. gnosis-mcp ingest --dry-run first — shows the file list without writing.
  2. Check if --embed is on. Embedding dominates ingest time if so.
  3. Split into stages: ingest first, embed separately, so you can restart just the embedding pass if it fails.

Search latency climbs over time

Vacuum the DB:

-- sqlite
VACUUM;
-- postgres
VACUUM ANALYZE documentation_chunks;

Or run gnosis-mcp cleanup --days 30 to shrink the access-log table, which can grow unbounded.

Memory climbs in watch mode

Long-lived Python server processes accumulate memory (caches, heap fragmentation) that garbage collection and weak references don't fully reclaim. Known workaround: restart once a week via systemd RuntimeMaxSec= or Docker --restart unless-stopped plus a healthcheck.
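A minimal systemd drop-in for the weekly-restart workaround (unit name and path are assumptions; adjust to how you run the server):

```ini
# /etc/systemd/system/gnosis-mcp.service.d/restart.conf  (hypothetical path)
[Service]
Restart=always
# 604800 s = 7 days: systemd stops the unit after this runtime, Restart= relaunches it
RuntimeMaxSec=604800
```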


Still stuck?

  1. gnosis-mcp check — first 5 lines of output usually localise the problem.
  2. gnosis-mcp serve --transport stdio 2>/tmp/gnosis.log — full log stream.
  3. GNOSIS_MCP_LOG_LEVEL=DEBUG gnosis-mcp … — more detail.
  4. Open an issue at github.com/nicholasglazer/gnosis-mcp/issues with the command you ran, the exact error, and the output of gnosis-mcp check.