
Troubleshooting

Common issues and solutions when using Aletheia.

Ingestion Issues

"Invalid entity IDs in edge extraction"

Symptom: Warning messages during ingestion about invalid entity IDs.

Cause: FTM entities reference other entities by ID, but the referenced entities are processed in different episodes, so the IDs cannot be resolved at extraction time.

Solution: This is expected behavior. Graphiti creates edges through entity resolution after processing. Verify edges were created:

# FalkorDB
redis-cli GRAPH.QUERY my_graph "MATCH ()-[r:SANCTION]->() RETURN count(r)"

# Neo4j
cypher-shell -d my_graph "MATCH ()-[r:SANCTION]->() RETURN count(r)"

"Token limit exceeded"

Symptom: Episode fails to process with token limit error.

Cause: Episode markdown is too large for the LLM context window.

Solution: Split large entities into smaller episodes in your episode builder:

def build_episode(entity: MyEntity) -> str:
    # Truncate large fields
    description = entity.description[:2000] if entity.description else ""
    ...

"Rate limit exceeded"

Symptom: Ingestion fails with rate limit errors.

Cause: Too many API calls to the LLM provider.

Solution:

  1. Wait and resume: aletheia build-knowledge-graph ... --resume
  2. Use a local LLM for extraction
  3. Reduce the batch size in configuration
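If you drive the LLM provider directly, wrapping calls in exponential backoff with jitter also smooths over transient rate limits. A generic sketch (this is not Aletheia's built-in retry logic; substitute your provider's rate-limit exception for RuntimeError):

```python
import random
import time

def with_backoff(call, max_retries: int = 5, base_delay: float = 1.0):
    """Retry a callable with exponential backoff plus jitter on rate-limit errors."""
    for attempt in range(max_retries):
        try:
            return call()
        except RuntimeError:  # assumption: stands in for the provider's rate-limit error
            if attempt == max_retries - 1:
                raise
            # Sleep 1s, 2s, 4s, ... plus jitter so parallel workers don't retry in lockstep
            time.sleep(base_delay * (2 ** attempt) + random.uniform(0, 0.1))
```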

Build stuck or hanging

Symptom: Ingestion appears frozen with no progress.

Cause: Usually an LLM API timeout or network issue.

Solution:

  1. Check network connectivity
  2. Verify the API key is valid
  3. Interrupt (Ctrl+C) and resume with --resume

Evaluation Issues

Low Context Recall

Symptom: Context recall score < 0.5.

Possible Causes:

  1. BFS not traversing to relevant nodes
  2. Search limit too low
  3. Data not properly ingested

Solutions:

  1. Increase BFS depth:

    config = SearchConfig(
        node_config=NodeSearchConfig(
            bfs_max_depth=3,  # Increase from default
        ),
    )
    

  2. Increase search limit:

    aletheia evaluate-ragas ... --limit 20
    

  3. Verify data exists:

    aletheia show-graph --knowledge-graph my_graph
    

Low Context Precision

Symptom: Context precision score < 0.5.

Cause: Too many irrelevant results in retrieved context.

Solutions:

  1. Reduce the search limit
  2. Use query filters (--use-query-filters)
  3. Improve question specificity

Low Faithfulness

Symptom: Faithfulness score < 0.5 despite good recall.

Cause: LLM answering from parametric knowledge, not context.

Solutions:

  1. Use --grounding-mode strict
  2. Use domain-specific data the LLM hasn't seen
  3. Check whether the questions are too general

High Answer Similarity but Low Faithfulness

Diagnosis: LLM answering correctly from parametric knowledge.

Solution: This is a dataset problem, not a retrieval problem. Use:

  - More obscure data
  - Recent data (after the LLM's training cutoff)
  - Synthetic questions generated from your graph

RAGAS Scores All Zero

Symptom: All metrics return 0.0.

Cause: Usually using wrong RAGAS metric class.

Solution: Ensure using correct metrics:

from ragas.metrics._context_precision import LLMContextPrecisionWithoutReference
from ragas.metrics._answer_similarity import AnswerSimilarity

Many INSUFFICIENT_CONTEXT Responses

Symptom: High rate of "INSUFFICIENT_CONTEXT" answers.

Possible Causes:

  1. Retrieval not finding relevant data
  2. Questions require data not in the graph
  3. Questions are unsuitable for GraphRAG

Solutions:

  1. Check that retrieval is returning results (use --verbose)
  2. Verify questions can be answered from graph data
  3. Remove SQL-like questions (counting, aggregation)
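A quick heuristic for flagging aggregation-style questions in an evaluation set. The keyword list here is an assumption; tune it for your own data:

```python
# Rough markers of counting/aggregation questions that suit SQL better than GraphRAG
AGGREGATION_MARKERS = ("how many", "count", "total number", "average", "sum of")

def is_sql_like(question: str) -> bool:
    """Flag questions that ask for counts or aggregates rather than facts."""
    q = question.lower()
    return any(marker in q for marker in AGGREGATION_MARKERS)

def filter_questions(questions: list[str]) -> list[str]:
    """Keep only questions answerable by graph traversal."""
    return [q for q in questions if not is_sql_like(q)]
```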

Database Issues

Cannot connect to FalkorDB

Symptom: Connection refused or timeout.

Solutions:

  1. Verify FalkorDB is running:

    docker ps | grep falkordb
    

  2. Check configuration:

    echo $FALKORDB_HOST $FALKORDB_PORT
    

  3. Test connection:

    redis-cli -h localhost -p 6379 ping
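For a quick check from Python without the redis CLI, a plain TCP probe distinguishes "server down" from configuration or auth problems (stdlib only, no FalkorDB client required):

```python
import socket

def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

Call it with your configured host and port, e.g. is_port_open("localhost", 6379). If this returns False, the server isn't reachable at all; if it returns True but the client still fails, look at credentials or graph configuration instead.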
    

Cannot connect to Neo4j

Symptom: Authentication failed or connection refused.

Solutions:

  1. Verify Neo4j is running and accessible
  2. Check credentials in the environment:

    echo $NEO4J_URI $NEO4J_USERNAME
    

  3. Test connection:

    cypher-shell -a bolt://localhost:7687 -u neo4j -p password
    

Graph not found

Symptom: "Graph 'my_graph' not found" error.

Cause: Graph doesn't exist or wrong database configuration.

Solutions:

  1. List existing graphs:

    aletheia list-graphs
    

  2. Check that the database type matches where the graph was created:

    echo $ALETHEIA_DATABASE_TYPE
    

LLM Issues

"API key invalid"

Solution: Verify API key is set:

echo $OPENAI_API_KEY

Slow extraction

Cause: Using large/slow model for extraction.

Solution: Use faster model for extraction:

export ALETHEIA_FAST_MODEL=gpt-4o-mini

Inconsistent extractions

Cause: Model temperature too high or wrong model.

Solution: Extraction uses deterministic settings by default. If you still see inconsistency:

  1. Verify the model configuration
  2. Check for API provider issues

Schema Inference Issues

"Too many relationship types"

Symptom: Schema has dozens of semantically similar relationship types.

Cause: Either you are using none mode, or Phase 4 consolidation did not merge all duplicate types.

Solution: Use graph-hybrid or ontology-first mode with an ontology. These modes constrain relationship types to ontology-defined vocabulary. Phase 4 consolidation (applied to all modes) attempts to merge redundant types, but an ontology provides stronger constraints.

Duplicate edge types in parallel ingestion

Symptom: Same edge (e.g., "Barcelona-El Prat LOCATED_IN Spain") appears twice.

Cause: Known race condition — parallel episode ingestion can create duplicate edges when both episodes extract the same relationship and neither is persisted yet.

Solution: This is a known issue. Deduplicate after ingestion or ingest serially for small datasets.
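One way to deduplicate after ingestion is to group edges by their identifying triple and keep the first occurrence of each. A sketch over plain tuples (adapt the key to however your driver returns edges; this is not an Aletheia command):

```python
def dedupe_edges(edges: list[tuple[str, str, str]]) -> list[tuple[str, str, str]]:
    """Drop repeated (source, relation, target) triples, keeping the first occurrence."""
    seen: set[tuple[str, str, str]] = set()
    unique: list[tuple[str, str, str]] = []
    for edge in edges:
        if edge not in seen:
            seen.add(edge)
            unique.append(edge)
    return unique
```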

Scalar/list type mismatch errors

Symptom: Pydantic validation errors during extraction about expected str vs list[str].

Cause: The LLM returns a scalar where the schema expects a list, or vice versa.

Solution: Ensure your schema uses CoerciveBaseModel as the base class (this is the default for generated schemas). It automatically wraps scalars in lists and extracts first elements from unexpected lists.
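Conceptually, the coercion behaves like the sketch below. This is a simplified stand-in to show the idea, not CoerciveBaseModel's actual implementation:

```python
def coerce(value, want_list: bool):
    """Reconcile scalar/list mismatches the way a coercive schema base would.

    - Schema expects a list but got a scalar: wrap it in a one-element list.
    - Schema expects a scalar but got a list: take the first element (or None if empty).
    """
    if want_list:
        return value if isinstance(value, list) else [value]
    if isinstance(value, list):
        return value[0] if value else None
    return value
```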

Getting More Help

If you're still stuck:

  1. Check logs with --verbose flag
  2. Search existing issues on GitHub
  3. Open a new issue with:
     - Aletheia version
     - Full error message
     - Minimal reproduction steps