Graphiti MCP Server vs Aletheia Evaluation

This FAQ explains the differences between using the Graphiti MCP server directly (e.g., with Claude Desktop) and using Aletheia's evaluation framework, and when to use each.

The Question

"I discovered the Graphiti MCP server. How does it compare to Aletheia? What's the difference between Claude Desktop answering questions via MCP versus Aletheia's evaluation pipeline?"

Quick Answer

  • Graphiti MCP Server: A retrieval and analytics interface that lets an LLM access your knowledge graph. The LLM synthesizes answers using its own judgment.
  • Aletheia: A controlled evaluation framework that forces grounded answers with programmatic verification and quality metrics.

Example: Complex Aviation Query

To illustrate the differences, consider this question:

"What incidents in Barcelona or Madrid involved Boeing aircraft and resulted in emergency declarations (MAYDAY or PAN-PAN)?"


Graphiti MCP Server (Aletheia Fork)

Available Tools

The MCP server exposes 15 tools across 6 groups:

| Group | Tools | Purpose |
|---|---|---|
| Semantic Discovery | search, explore_node | Semantic search with BFS traversal, entity-centric exploration |
| Schema & Ontology | get_schema, search_ontology, explore_ontology | Graph structure discovery, ontology concept search |
| Graph Profiling | profile_graph | Property coverage, language detection, relationship validation |
| Cypher Analytics | run_cypher | Read-only Cypher queries with 4-stage security pipeline |
| Community Intelligence | build_communities | Label propagation clustering |
| Data Management | add_memory, get_episodes, get_episode_context, delete_entity_edge, delete_episode, clear_graph, get_status | CRUD operations and health checks |

Self-Describing Connectors

Each MCP server instance introspects its own graph at startup via DomainProfile, generating domain-specific:

  • Tool descriptions — the LLM sees aviation-specific guidance when connected to the aviation graph
  • Entity catalogs — types, counts, and sample entities
  • MCP resources — graphiti://domain_summary, graphiti://entity_catalog, graphiti://relationship_types

Four domain connectors are configured in Aletheia: Aviation Safety, Safety Recommendations, Airworthiness Directives, and Terrorist Organizations.

How Claude Desktop Answers

User: "What incidents in Barcelona or Madrid involved Boeing aircraft
       and resulted in emergency declarations (MAYDAY or PAN-PAN)?"

Claude Desktop (via MCP):
1. Calls search(query="Barcelona Madrid Boeing MAYDAY PAN-PAN")
2. Optionally calls run_cypher() for structured analytics
3. Optionally calls explore_node() to traverse from a known entity
4. Uses parametric knowledge + returned results to synthesize answer
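
This flow can be sketched as a toy loop: a stub `search` stands in for the MCP tool, and `synthesize` for the LLM step. All names here are illustrative, not the server's API; the point is that nothing constrains the synthesis step to the returned results.

```python
def search(query: str) -> list[dict]:
    """Stand-in for the MCP `search` tool: returns matching facts
    from a tiny in-memory 'graph'."""
    graph = [
        {"fact": "Occurrence-2024-0157-EU occurred at Barcelona Airport"},
        {"fact": "Boeing 737-800 EC-LUT was the aircraft involved"},
    ]
    return [f for f in graph
            if any(t.lower() in f["fact"].lower() for t in query.split())]

def synthesize(query: str, results: list[dict]) -> str:
    """The LLM step: free-form synthesis with no citation requirement.
    Nothing here stops parametric knowledge from slipping in."""
    return " ".join(r["fact"] for r in results)

print(synthesize("Barcelona Boeing", search("Barcelona Boeing")))
```

The answer is whatever the model chooses to say about the results, which is exactly the gap the limitations table below describes.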

MCP Limitations for Evaluation

| Limitation | Impact |
|---|---|
| No grounding verification | LLM may hallucinate details not in results |
| No evaluation metrics | No way to measure retrieval quality |
| No evidence citations | Can't trace answer to source |
| Parametric knowledge leakage | LLM's training data may pollute answer |

Aletheia Evaluation Framework

Pipeline Architecture

Question
    ▼ [1. Query Analysis]
    Detect patterns: "Barcelona OR Madrid", "Boeing", "MAYDAY"
    → Generate SearchFilters (node_labels, edge_types)
    ▼ [2. Graphiti Search]
    Multi-method search with:
    - Cosine similarity + BFS traversal
    - RRF reranking
    - Optional community search
    - Custom search filters
    ▼ [3. Evidence Transformation]
    transform_search_results(results) → EvidenceUnit[]
    Each unit has: eid, subject_name, predicate, object_name, fact
    ▼ [4. Context Building]
    Atomic evidence units with citation IDs
    Optional COMMUNITY_CONTEXT section
    ▼ [5. LLM Answer Generation]
    Structured JSON with mandatory citations
    ▼ [6. Grounding Verification]
    Programmatic check that answer entities appear in cited evidence
    ▼ [7. RAGAS Evaluation]
    Quality metrics: precision, recall, faithfulness, similarity
    ▼ [8. Grounding Report]
    Pass rates, rejection reasons, diagnostics
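
Collapsed to pure functions over toy data, the stages look roughly like this. This is a sketch under assumed data shapes, with the Graphiti search and the LLM call stubbed out; none of the function names are Aletheia's actual API.

```python
# Tiny in-memory stand-in for the knowledge graph.
GRAPH = [
    {"eid": "E1", "subject": "Occurrence-2024-0157-EU",
     "predicate": "OCCURRED_AT", "object": "Barcelona Airport"},
    {"eid": "E2", "subject": "Occurrence-2024-0157-EU",
     "predicate": "INVOLVED_AIRCRAFT", "object": "Boeing 737-800"},
]

def analyze_query(question):          # 1. Query analysis: detect patterns
    return [t for t in ("Barcelona", "Madrid", "Boeing") if t in question]

def search(terms):                    # 2. Graphiti search (stubbed)
    return [e for e in GRAPH if any(t in e["object"] for t in terms)]

def build_context(evidence):          # 3-4. Atomic units with citation IDs
    return "\n".join(
        f"[{e['eid']}] {e['subject']} {e['predicate']} {e['object']}"
        for e in evidence)

def answer(question, evidence):       # 5. LLM step, stubbed as a template
    return {"answer": "Occurrence-2024-0157-EU",
            "citations": [e["eid"] for e in evidence]}

def verify(ans, evidence):            # 6. Grounding gate: cited IDs valid?
    valid = {e["eid"] for e in evidence}
    return all(c in valid for c in ans["citations"])

q = "What incidents in Barcelona involved Boeing aircraft?"
ev = search(analyze_query(q))
a = answer(q, ev)
print(build_context(ev))
print(verify(a, ev))
```

Stages 7-8 (RAGAS metrics and the grounding report) aggregate over many such question/answer pairs rather than acting on a single one.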

Example Evidence Output

EVIDENCE:
[E1]
subject: Occurrence-2024-0157-EU
predicate: OCCURRED_AT
object: Barcelona Airport
fact: The aviation incident Occurrence-2024-0157-EU occurred at Barcelona Airport
source: aviation_safety

[E2]
subject: Occurrence-2024-0157-EU
predicate: INVOLVED_AIRCRAFT
object: Boeing 737-800
fact: Boeing 737-800 registration EC-LUT was the aircraft involved
source: aviation_safety
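
A minimal sketch of an evidence unit and its renderer, assuming the fields named above (eid, subject_name, predicate, object_name, fact); the `EvidenceUnit` class and `render` helper here are illustrative, not Aletheia's actual types.

```python
from dataclasses import dataclass

@dataclass
class EvidenceUnit:
    eid: str
    subject_name: str
    predicate: str
    object_name: str
    fact: str
    source: str

def render(unit: EvidenceUnit) -> str:
    """Render one unit in the citation-ready format shown above."""
    return (f"[{unit.eid}]\n"
            f"subject: {unit.subject_name}\n"
            f"predicate: {unit.predicate}\n"
            f"object: {unit.object_name}\n"
            f"fact: {unit.fact}\n"
            f"source: {unit.source}")

e1 = EvidenceUnit(
    "E1", "Occurrence-2024-0157-EU", "OCCURRED_AT", "Barcelona Airport",
    "The aviation incident Occurrence-2024-0157-EU occurred at Barcelona Airport",
    "aviation_safety")
print(render(e1))
```

Because each unit carries its own `eid`, the answer-generation step can cite it and the verification step can check the citation mechanically.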

Grounding Verification

Aletheia programmatically verifies that:

  1. All cited evidence IDs are valid
  2. Entities mentioned in the answer appear in the cited evidence
  3. INSUFFICIENT_CONTEXT is returned when evidence is lacking

Three grounding modes control strictness:

| Mode | Behavior |
|---|---|
| strict (default) | Rejects ungrounded answers |
| lenient | Warns but includes all answers |
| off | No verification |
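
A minimal sketch of the grounding gate with the three modes, assuming a simple answer/evidence shape; `check_grounding` and its field names are hypothetical, not Aletheia's API.

```python
def check_grounding(ans, evidence, mode="strict"):
    """Return (accepted, problems) for a structured answer."""
    if mode == "off":                       # no verification at all
        return True, []
    valid_ids = {e["eid"] for e in evidence}
    problems = []
    # 1. All cited evidence IDs must be valid.
    for cid in ans["citations"]:
        if cid not in valid_ids:
            problems.append(f"unknown citation {cid}")
    # 2. Entities in the answer must appear in the cited evidence facts.
    cited_facts = " ".join(e["fact"] for e in evidence
                           if e["eid"] in ans["citations"])
    for entity in ans["entities"]:
        if entity not in cited_facts:
            problems.append(f"ungrounded entity {entity}")
    if mode == "strict":
        return not problems, problems       # reject on any problem
    return True, problems                   # lenient: warn but accept

evidence = [{"eid": "E1",
             "fact": "Occurrence-2024-0157-EU occurred at Barcelona Airport"}]
ans = {"citations": ["E1"],
       "entities": ["Occurrence-2024-0157-EU", "Boeing 737-800"]}
print(check_grounding(ans, evidence, mode="strict"))
# strict mode rejects: "Boeing 737-800" does not appear in the cited evidence
```

The same answer passes in lenient mode (with a warning attached) and trivially in off mode, which is what makes strict the sensible default for evaluation runs.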

Side-by-Side Comparison

| Dimension | Graphiti MCP Server | Aletheia Evaluation |
|---|---|---|
| Tools | 15 (search, Cypher, schema, profiling, ontology, CRUD) | Programmatic pipeline |
| Search | search (configurable methods + reranker) | Multi-method with custom filters |
| Analytics | run_cypher (read-only Cypher queries) | RAGAS metrics |
| Schema Awareness | get_schema discovers graph structure | Schema inference engine (7 modes) |
| Context Format | Raw search results | Atomic evidence units with IDs |
| Answer Generation | LLM's discretion | Structured JSON with mandatory citations |
| Grounding | None | Programmatic verification gate |
| Parametric Knowledge | Can leak into answer | Blocked by prompt + verification |
| Evaluation Metrics | None | RAGAS (precision, recall, faithfulness, similarity) |
| Self-Description | DomainProfile generates per-domain guidance | N/A |

When to Use Which

Use MCP Server

  • Interactive exploration of knowledge graphs
  • Conversational memory for agentic workflows
  • Structured analytics via Cypher queries
  • Quick answers where auditability is less critical
  • Building AI agents that need persistent memory
  • Schema discovery and ontology browsing

Use Aletheia Evaluation

  • Measuring retrieval system quality
  • Ensuring answers are grounded in evidence
  • Preventing parametric knowledge leakage
  • Generating quality metrics for system improvements
  • Production systems requiring auditability
  • Comparing search configurations (baseline comparisons)
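
For the last point, comparing configurations reduces to computing retrieval metrics per configuration over the same question set. Here is a toy set-based precision/recall stand-in for the RAGAS metrics the pipeline actually computes; the evidence IDs and configurations are made up for illustration.

```python
def precision_recall(retrieved, relevant):
    """Plain set precision/recall over evidence IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    tp = len(retrieved & relevant)
    return tp / len(retrieved), tp / len(relevant)

relevant = ["E1", "E2"]                 # ground-truth evidence for a question
baseline = ["E1", "E3", "E4"]           # e.g. default search config
filtered = ["E1", "E2"]                 # e.g. with custom SearchFilters
print(precision_recall(baseline, relevant))   # lower precision and recall
print(precision_recall(filtered, relevant))   # perfect on this toy question
```

Averaging such per-question scores across a benchmark is what turns "the filters seem to help" into a measured comparison.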

Integration

The MCP server and Aletheia are complementary:

  1. Build your knowledge graph with Aletheia (parsers, episode builders, schema inference)
  2. Evaluate retrieval quality with Aletheia's RAGAS pipeline
  3. Serve the graph to LLMs via MCP connectors for interactive use
  4. Analyze the graph with Cypher queries via the MCP server

The same graph database powers both — Aletheia builds and evaluates it, MCP serves it.


Summary

| If you need... | Use... |
|---|---|
| Quick interactive Q&A | MCP Server |
| Structured graph analytics | MCP Server (run_cypher) |
| Measured retrieval quality | Aletheia |
| Grounded, auditable answers | Aletheia |
| Agentic memory | MCP Server |
| Production evaluation | Aletheia |