
Aletheia vs FalkorDB GraphRAG-SDK

This document compares Aletheia's approach to ontology handling and schema inference with FalkorDB's GraphRAG-SDK.

Overview

Both frameworks build knowledge graphs from unstructured data using LLMs for entity and relationship extraction. They differ in how they handle ontologies and schema discovery.

Framework Philosophy
GraphRAG-SDK: "Let the LLM figure it out"
Aletheia: "LLM proposes, ontology validates"

GraphRAG-SDK Approach

Ontology Autodiscovery

GraphRAG-SDK uses single-pass LLM extraction from raw documents. A fixed system prompt instructs the LLM to output JSON schema directly.

# GraphRAG-SDK: Ontology from documents
ontology = Ontology.from_sources(
    sources=[PDF("report.pdf"), URL("https://example.com")],
    model=model,
)

Key Characteristics

Aspect | Implementation
--- | ---
Discovery method | Single-pass LLM extraction
Prompt strategy | Fixed system prompt outputs JSON directly
Entity definition | Runtime Python objects (label, attributes, description)
Relationship handling | Extracted alongside entities in the same pass
Merging | Per-document ontologies merged via o.merge_with()
Validation | Post-hoc LLM correction via FIX_ONTOLOGY_PROMPT
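The merge step in the table above can be sketched as label-based matching: ontologies extracted from different documents are combined by comparing entity labels as strings. This is illustrative only; `SimpleOntology` and `Entity` below are stand-ins, not the SDK's actual classes.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    label: str
    attributes: set = field(default_factory=set)


@dataclass
class SimpleOntology:
    entities: dict = field(default_factory=dict)

    def merge_with(self, other):
        """Merge by exact label match; union attributes on collision."""
        merged = SimpleOntology(
            {l: Entity(e.label, set(e.attributes)) for l, e in self.entities.items()}
        )
        for label, entity in other.entities.items():
            if label in merged.entities:
                merged.entities[label].attributes |= entity.attributes
            else:
                merged.entities[label] = Entity(entity.label, set(entity.attributes))
        return merged
```

Note that under this scheme `Person` and `Individual` would never merge, because only identical labels match; that limitation is what Aletheia's semantic alignment is designed to address.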

Ontology Sources

GraphRAG-SDK supports three ways to obtain an ontology:

  1. from_sources() - LLM extracts from documents
  2. from_kg_graph() - Extract from existing FalkorDB graph
  3. from_ttl() - Parse RDF/Turtle files

Prompt Architecture

Document Text → Fixed System Prompt → JSON Schema → Ontology Object

The system prompt instructs:

  • Capture entities, relationships, and attributes
  • Use basic types (e.g., "person" not "mathematician")
  • Maintain consistent entity references
  • Output JSON inline with no spaces
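To make the JSON Schema → Ontology Object step of this pipeline concrete, here is a minimal sketch of parsing an LLM's inline JSON into runtime objects. The JSON field names (`entities`, `relations`, etc.) are assumptions for illustration, not the SDK's documented output format.

```python
import json
from dataclasses import dataclass

# Illustrative LLM output: the exact JSON shape requested by the system
# prompt is an assumption here, not GraphRAG-SDK's documented format.
raw_llm_output = json.dumps({
    "entities": [
        {"label": "Person", "attributes": ["name", "birth_year"]},
        {"label": "Organization", "attributes": ["name"]},
    ],
    "relations": [
        {"label": "WORKS_AT", "source": "Person", "target": "Organization"},
    ],
})


@dataclass
class EntitySpec:
    label: str
    attributes: list


def parse_schema(text):
    """Turn the LLM's inline JSON into runtime objects (the final arrow
    in the Document Text -> Prompt -> JSON -> Ontology pipeline)."""
    data = json.loads(text)
    return {
        "entities": [EntitySpec(e["label"], e["attributes"]) for e in data["entities"]],
        "relations": data["relations"],
    }
```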

Aletheia Approach

Two-Stage Meta-Prompt Architecture

Aletheia separates domain analysis from schema extraction:

  1. Stage 1: Domain Analysis - LLM acts as "knowledge graph architect" to generate a domain-specific extraction prompt
  2. Stage 2: Schema Extraction - Uses the generated prompt to extract a structured schema

# Aletheia: Schema with ontology alignment
engine = SchemaInferenceEngine(
    llm_client=client,
    schema_mode=SchemaMode.GRAPH_HYBRID,
    ontology=ontology,
    parser=parser,
)
schema = await engine.extract_schema(db_name, sample_data_dir)

Schema Modes

Aletheia offers six distinct schema inference modes (plus an inference alias for llm):

Mode | Description
--- | ---
none | Use Graphiti defaults
llm | Two-stage LLM inference
ontology | Extract schema from an ontology file
hybrid | LLM + ontology validation
graph-hybrid | LLM-first + semantic graph alignment
ontology-first | Ontology as primary, LLM enhances

All modes except none apply a Phase 4 consolidation step: an LLM reviews the final schema for redundancies and merges semantically similar types. Ontology-derived types are protected from removal.
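The consolidation idea can be sketched with a hard-coded synonym table standing in for the LLM review (Aletheia uses an LLM for this judgment, not a lookup table), including the rule that ontology-derived types always survive:

```python
# Stand-in for the LLM's semantic-similarity judgment; illustrative only.
SYNONYMS = {"individual": "person", "company": "organization"}


def consolidate(types, from_ontology):
    """Map each type name to its canonical survivor, protecting ontology types."""
    mapping = {}
    for t in types:
        canonical = SYNONYMS.get(t.lower(), t.lower())
        if t in from_ontology:
            # Ontology-derived types are protected from removal.
            mapping[t] = t
        else:
            # Fold into a matching ontology type if one exists.
            match = next((o for o in from_ontology if o.lower() == canonical), None)
            mapping[t] = match or canonical.capitalize()
    return mapping
```

Here `Individual` collapses into the ontology-protected `Person`, while `Company` merely normalizes its name.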

The graph-hybrid mode provides the best balance of flexibility and rigor:

┌─────────────────────────────────────────────────────────┐
│ Phase 1: LLM-First Inference (Unbiased)                 │
│   • Extract schema without ontology guidance            │
│   • Avoids anchoring bias from seeing ontology first    │
└─────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ Phase 2: Semantic Alignment via Knowledge Graph         │
│   • Vector search matches LLM concepts → ontology       │
│   • Confidence scores for each alignment                │
│   • Unaligned concepts flagged for review               │
└─────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│ Phase 3: Property Enrichment (Data-Driven)              │
│   • Add ontology properties that appear in actual data  │
│   • Filter properties not present in source data        │
│   • Prevents schema bloat                               │
└─────────────────────────────────────────────────────────┘
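Phase 2 can be sketched with toy embeddings: cosine similarity matches each LLM-inferred concept to its nearest ontology class, records a per-concept confidence score, and flags anything below the threshold for review. The 3-d vectors and the 0.7 threshold are illustrative; the real pipeline queries FalkorDB's vector index rather than an in-memory dict.

```python
import math

# Toy ontology-class embeddings; in practice these come from a vector index.
ONTOLOGY_VECS = {"Person": [1.0, 0.1, 0.0], "Organization": [0.0, 1.0, 0.2]}


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.hypot(*a) * math.hypot(*b))


def align(concepts, threshold=0.7):
    """Align LLM concepts to ontology classes; flag low-confidence matches."""
    aligned, review = {}, []
    for name, vec in concepts.items():
        best = max(ONTOLOGY_VECS, key=lambda o: cosine(vec, ONTOLOGY_VECS[o]))
        score = cosine(vec, ONTOLOGY_VECS[best])
        if score >= threshold:
            aligned[name] = (best, round(score, 3))  # confidence per alignment
        else:
            review.append(name)  # unaligned: flagged for human review
    return aligned, review
```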

Ontology as Knowledge Graph

Aletheia loads OWL/TTL ontologies into FalkorDB as searchable knowledge graphs:

# Load ontology to graph (once)
aletheia build-ontology-graph \
  --use-case my_case \
  --knowledge-graph my_ontology

# Build data graph with alignment
aletheia build-knowledge-graph \
  --use-case my_case \
  --knowledge-graph my_graph \
  --schema-mode graph-hybrid \
  --ontology-graph my_ontology

Detailed Comparison

Ontology Role

Aspect | GraphRAG-SDK | Aletheia
--- | --- | ---
Purpose | IS the extraction schema | Guides and validates the extraction schema
Relationship | Ontology = schema | Ontology → schema (separate artifacts)
Authority | LLM-derived | Domain expert-defined

Formal Ontology Support

Aspect | GraphRAG-SDK | Aletheia
--- | --- | ---
OWL support | Limited (TTL import only) | First-class (loaded into a graph)
Ontology storage | Runtime Python objects | FalkorDB knowledge graph
Searchability | No | Vector + BFS search
Reusability | Per-session | Persistent, shared across runs

LLM Bias Mitigation

Aspect | GraphRAG-SDK | Aletheia
--- | --- | ---
Anchoring bias | No mitigation (LLM sees prompts directly) | Graph-hybrid: LLM infers blind
Hallucination control | Post-hoc fix prompts | Ontology alignment validation
Schema drift | Risk across documents | Ontology provides an anchor

Alignment Mechanism

Aspect | GraphRAG-SDK | Aletheia
--- | --- | ---
Method | String matching via merge | Semantic search (cosine + BFS)
Confidence tracking | No | Yes, per-concept scores
Alignment reports | No | Yes, JSON reports with rationale
Failed alignments | Silent merge | Explicit warnings and review flags

Property Handling

Aspect | GraphRAG-SDK | Aletheia
--- | --- | ---
Property source | LLM extraction only | LLM + ontology enrichment
Filtering | None | Data-driven (only properties present in data)
Schema bloat | Risk | Prevented by filtering
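The data-driven filtering rule reduces to a set intersection: of the properties the ontology suggests, keep only those actually observed in sampled records. The property names below are made up for illustration.

```python
def enrich_properties(ontology_props, sample_records):
    """Ontology suggests candidate properties; the sampled data decides."""
    observed = {key for record in sample_records for key in record}
    return ontology_props & observed  # drop anything never seen in data
```

Properties the ontology defines but the data never exhibits (here `tax_id`) are filtered out, which is what prevents schema bloat.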

Schema Persistence

Aspect | GraphRAG-SDK | Aletheia
--- | --- | ---
Storage | Runtime Python objects | Generated Python modules
Location | Memory only | schemas/<graph_name>.py
Versioning | No | Git-trackable files
Reuse | Rebuilt each session | Load an existing schema
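The persistence pattern can be sketched as writing a small Python module to disk and importing it back on the next run, mirroring the schemas/<graph_name>.py layout. The generated module's contents here are a guess at the shape, not Aletheia's actual generated code.

```python
import importlib.util
import tempfile
from pathlib import Path


def save_schema(directory, graph_name, entity_types):
    """Persist a schema as a Git-trackable Python module."""
    path = Path(directory) / f"{graph_name}.py"
    path.write_text(f"ENTITY_TYPES = {entity_types!r}\n")
    return path


def load_schema(path):
    """Reload a previously generated schema module instead of re-inferring."""
    spec = importlib.util.spec_from_file_location(path.stem, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module
```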

Code Examples

GraphRAG-SDK: Basic Usage

from graphrag_sdk import Ontology, KnowledgeGraph
from graphrag_sdk.source import URL

# Auto-discover ontology from web pages
ontology = Ontology.from_sources(
    sources=[URL("https://example.com/article")],
    model=model,
)

# Create knowledge graph
kg = KnowledgeGraph(
    name="my_graph",
    ontology=ontology,
    model=model,
)

# Populate from sources
kg.process_sources([URL("https://example.com/data")])

Aletheia: Graph-Hybrid Mode

from aletheia.core.schema import SchemaInferenceEngine, SchemaMode

# Create engine with ontology
engine = SchemaInferenceEngine(
    llm_client=client,
    schema_mode=SchemaMode.GRAPH_HYBRID,
    ontology=ontology,  # Loaded from OWL/TTL
    parser=parser,
    alignment_confidence=0.7,
)

# Extract with alignment
schema = await engine.extract_schema(
    db_name="my_graph",
    sample_data_dir=Path("data/"),
)

# Access alignment report
report = engine._last_alignment_report
print(f"Aligned: {report.successful_count}")
print(f"Failed: {report.failed_count}")

When to Use Which

Scenario | Recommended
--- | ---
Quick prototyping from documents | GraphRAG-SDK
Domain with a formal ontology (FTM, FIBO, etc.) | Aletheia
Need for an audit trail / alignment reports | Aletheia
Multi-source data requiring entity resolution | Aletheia
Simple RAG chatbot | GraphRAG-SDK
Regulatory compliance / explainability | Aletheia
Ad-hoc document analysis | GraphRAG-SDK
Production knowledge graph with governance | Aletheia

Summary

GraphRAG-SDK optimizes for simplicity and speed. It works well for general-purpose document analysis where schema consistency is less critical.

Aletheia optimizes for rigor and auditability. It excels in domains with established ontologies where schema fidelity and alignment transparency matter.

The key insight is that these represent different points on the flexibility-rigor spectrum. GraphRAG-SDK prioritizes developer experience; Aletheia prioritizes domain modeling correctness.