# Aletheia vs FalkorDB GraphRAG-SDK

This document compares Aletheia's approach to ontology handling and schema inference with FalkorDB's GraphRAG-SDK.

## Overview
Both frameworks build knowledge graphs from unstructured data using LLMs for entity and relationship extraction. They differ in how they handle ontologies and schema discovery.
| Framework | Philosophy |
|---|---|
| GraphRAG-SDK | "Let the LLM figure it out" |
| Aletheia | "LLM proposes, ontology validates" |
## GraphRAG-SDK Approach

### Ontology Autodiscovery

GraphRAG-SDK uses single-pass LLM extraction from raw documents. A fixed system prompt instructs the LLM to output JSON schema directly.

```python
# GraphRAG-SDK: ontology auto-discovered from documents
ontology = Ontology.from_sources(
    sources=[PDF("report.pdf"), URL("https://example.com")],
    model=model,
)
```
### Key Characteristics
| Aspect | Implementation |
|---|---|
| Discovery Method | Single-pass LLM extraction |
| Prompt Strategy | Fixed system prompt outputs JSON directly |
| Entity Definition | Runtime Python objects (label, attributes, description) |
| Relationship Handling | Extracted alongside entities in same pass |
| Validation | Post-hoc LLM correction via `FIX_ONTOLOGY_PROMPT` |
| Merging | Document ontologies merged via `o.merge_with()` |
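The merge step can be illustrated with a small, hypothetical sketch (plain Python, not GraphRAG-SDK code): per-document ontologies are unified by entity label, and attribute sets are combined.

```python
# Illustrative sketch of merge semantics in the spirit of `o.merge_with()`.
# Ontologies are modeled as {label: set_of_attribute_names} dicts.

def merge_ontologies(a: dict, b: dict) -> dict:
    """Merge two ontologies: same-label entities are unified, attributes combined."""
    merged = {label: set(attrs) for label, attrs in a.items()}
    for label, attrs in b.items():
        merged.setdefault(label, set()).update(attrs)
    return merged

doc1 = {"Person": {"name"}, "Company": {"name", "industry"}}
doc2 = {"Person": {"name", "birth_date"}, "Product": {"name"}}

combined = merge_ontologies(doc1, doc2)
# "Person" gains "birth_date"; "Product" is added alongside "Company".
```

Without a validating ontology, every document can introduce new labels this way, which is where the schema-drift risk discussed below comes from.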
### Ontology Sources

GraphRAG-SDK supports three ways to obtain an ontology:

- `from_sources()` - LLM extracts from documents
- `from_kg_graph()` - Extract from an existing FalkorDB graph
- `from_ttl()` - Parse RDF/Turtle files
### Prompt Architecture
The system prompt instructs:
- Capture entities, relationships, and attributes
- Use basic types (e.g., "person" not "mathematician")
- Maintain consistent entity references
- Output JSON inline with no spaces
## Aletheia Approach

### Two-Stage Meta-Prompt Architecture

Aletheia separates domain analysis from schema extraction:

- **Stage 1: Domain Analysis** - the LLM acts as a "knowledge graph architect" to generate a domain-specific extraction prompt
- **Stage 2: Schema Extraction** - the generated prompt is used to extract a structured schema
```python
# Aletheia: schema inference with ontology alignment
engine = SchemaInferenceEngine(
    llm_client=client,
    schema_mode=SchemaMode.GRAPH_HYBRID,
    ontology=ontology,
    parser=parser,
)
schema = await engine.extract_schema(db_name, sample_data_dir)
```
### Schema Modes

Aletheia offers six distinct schema inference modes (plus an `inference` alias for `llm`):

| Mode | Description |
|---|---|
| `none` | Use Graphiti defaults |
| `llm` | Two-stage LLM inference |
| `ontology` | Extract schema from an ontology file |
| `hybrid` | LLM + ontology validation |
| `graph-hybrid` | LLM-first + semantic graph alignment |
| `ontology-first` | Ontology as primary, LLM enhances |
All modes except `none` apply a Phase 4 consolidation step: an LLM reviews the final schema for redundancies and merges semantically similar types. Ontology-derived types are protected from removal.
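The consolidation step can be sketched as follows. This is a hypothetical illustration: the real Phase 4 uses an LLM to judge semantic similarity, while here `difflib.SequenceMatcher` stands in for that judgment.

```python
# Sketch of Phase 4 consolidation: fold near-duplicate type names together,
# but never drop a protected (ontology-derived) type.
from difflib import SequenceMatcher

def consolidate(types: list[str], protected: set[str],
                threshold: float = 0.8) -> list[str]:
    kept: list[str] = []
    for t in types:
        dup = next(
            (k for k in kept
             if SequenceMatcher(None, t.lower(), k.lower()).ratio() >= threshold),
            None,
        )
        if dup is None or t in protected:
            kept.append(t)  # protected types always survive consolidation
    return kept

types = ["Organization", "Organisation", "Person", "Company"]
result = consolidate(types, protected={"Organization"})
# "Organisation" folds into "Organization"; "Person" and "Company" are kept.
```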
### Graph-Hybrid Mode (Recommended)

The `graph-hybrid` mode provides the best balance of flexibility and rigor:

```text
┌─────────────────────────────────────────────────────────┐
│ Phase 1: LLM-First Inference (Unbiased)                 │
│ • Extract schema without ontology guidance              │
│ • Avoids anchoring bias from seeing ontology first      │
└─────────────────────────────────────────────────────────┘
                            │
                            ▼
┌─────────────────────────────────────────────────────────┐
│ Phase 2: Semantic Alignment via Knowledge Graph         │
│ • Vector search matches LLM concepts → ontology         │
│ • Confidence scores for each alignment                  │
│ • Unaligned concepts flagged for review                 │
└─────────────────────────────────────────────────────────┘
                            │
                            ▼
┌─────────────────────────────────────────────────────────┐
│ Phase 3: Property Enrichment (Data-Driven)              │
│ • Add ontology properties that appear in actual data    │
│ • Filter properties not present in source data          │
│ • Prevents schema bloat                                 │
└─────────────────────────────────────────────────────────┘
```
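Phase 2's alignment can be illustrated with a toy sketch. All names and vectors below are made up; the actual pipeline runs vector search inside FalkorDB over real embeddings, but the thresholding and review-flagging logic has the same shape.

```python
# Toy sketch of semantic alignment: match each LLM-inferred concept to its
# nearest ontology class by cosine similarity, keep a confidence score, and
# flag low-confidence concepts for human review.
from math import sqrt

def cosine(u: list[float], v: list[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

# Toy embeddings; in practice these come from an embedding model.
ontology_classes = {"Person": [1.0, 0.0, 0.1], "Organization": [0.0, 1.0, 0.1]}
llm_concepts = {"Employee": [0.9, 0.1, 0.1], "Widget": [0.1, 0.1, 0.9]}

THRESHOLD = 0.7  # analogous to alignment_confidence
aligned, unaligned = {}, []
for concept, vec in llm_concepts.items():
    best, score = max(
        ((cls, cosine(vec, cvec)) for cls, cvec in ontology_classes.items()),
        key=lambda pair: pair[1],
    )
    if score >= THRESHOLD:
        aligned[concept] = (best, score)  # keep the per-concept confidence
    else:
        unaligned.append(concept)  # flagged for review
```

The point of the blind Phase 1 followed by this alignment pass is that the LLM's naming never gets anchored by the ontology, yet every concept it proposes is still reconciled against the ontology afterwards.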
### Ontology as Knowledge Graph

Aletheia loads OWL/TTL ontologies into FalkorDB as searchable knowledge graphs:

```shell
# Load the ontology into a graph (once)
aletheia build-ontology-graph \
    --use-case my_case \
    --knowledge-graph my_ontology

# Build the data graph with alignment
aletheia build-knowledge-graph \
    --use-case my_case \
    --knowledge-graph my_graph \
    --schema-mode graph-hybrid \
    --ontology-graph my_ontology
```
## Detailed Comparison

### Ontology Role
| Aspect | GraphRAG-SDK | Aletheia |
|---|---|---|
| Purpose | IS the extraction schema | Guides/validates extraction schema |
| Relationship | Ontology = Schema | Ontology → Schema (separate artifacts) |
| Authority | LLM-derived | Domain expert-defined |
### Formal Ontology Support
| Aspect | GraphRAG-SDK | Aletheia |
|---|---|---|
| OWL support | Limited (TTL import only) | First-class (loaded to graph) |
| Ontology storage | Runtime Python objects | FalkorDB knowledge graph |
| Searchability | No | Vector + BFS search |
| Reusability | Per-session | Persistent, shared across runs |
### LLM Bias Mitigation
| Aspect | GraphRAG-SDK | Aletheia |
|---|---|---|
| Anchoring bias | Not mitigated - schema comes from a single prompt | Graph-hybrid: LLM infers blind, then aligns |
| Hallucination control | Post-hoc fix prompts | Ontology alignment validation |
| Schema drift | Risk across documents | Ontology provides anchor |
### Alignment Mechanism
| Aspect | GraphRAG-SDK | Aletheia |
|---|---|---|
| Method | String matching via merge | Semantic search (cosine + BFS) |
| Confidence tracking | No | Yes - per-concept scores |
| Alignment reports | No | Yes - JSON reports with rationale |
| Failed alignments | Silent merge | Explicit warnings, review flags |
### Property Handling
| Aspect | GraphRAG-SDK | Aletheia |
|---|---|---|
| Property source | LLM extraction only | LLM + ontology enrichment |
| Filtering | None | Data-driven (only properties in data) |
| Schema bloat | Risk | Prevented by filtering |
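The data-driven filtering can be sketched in a few lines (hypothetical helper name; the actual enrichment logic lives inside Aletheia's Phase 3):

```python
# Sketch of data-driven property enrichment: ontology properties are added
# only when they actually occur in the sampled records, which is what keeps
# the schema from bloating with unused fields.

def enrich_properties(llm_props: set, ontology_props: set,
                      sample_records: list) -> set:
    observed = {key for record in sample_records for key in record}
    # Start from what the LLM extracted, add only ontology properties seen in data.
    return set(llm_props) | {p for p in ontology_props if p in observed}

sample = [{"name": "Acme", "industry": "Mining"}, {"name": "Widget Co"}]
props = enrich_properties(
    llm_props={"name"},
    ontology_props={"industry", "lei_code", "founded"},  # candidates from the ontology
    sample_records=sample,
)
# "industry" is added (seen in data); "lei_code" and "founded" are filtered out.
```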
### Schema Persistence
| Aspect | GraphRAG-SDK | Aletheia |
|---|---|---|
| Storage | Runtime Python objects | Generated Python modules |
| Location | Memory only | `schemas/<graph_name>.py` |
| Versioning | No | Git-trackable files |
| Reuse | Rebuild each session | Load existing schema |
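As a rough illustration of why generated modules are git-trackable, a `schemas/<graph_name>.py` file might contain something like the following. This is a hypothetical layout; the actual format emitted by Aletheia's generator may differ.

```python
# Hypothetical shape of a generated schema module: the inferred schema lives
# in a plain Python file that can be diffed, reviewed, and versioned in git,
# then loaded on later runs instead of being re-inferred.

ENTITY_TYPES = {
    "Person": {"name": "str", "birth_date": "date"},
    "Organization": {"name": "str", "industry": "str"},
}

EDGE_TYPES = {
    "WORKS_FOR": ("Person", "Organization"),
}
```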
## Code Examples

### GraphRAG-SDK: Basic Usage

```python
from graphrag_sdk import Ontology, KnowledgeGraph
from graphrag_sdk.source import URL

# Auto-discover an ontology from web pages
ontology = Ontology.from_sources(
    sources=[URL("https://example.com/article")],
    model=model,
)

# Create the knowledge graph
kg = KnowledgeGraph(
    name="my_graph",
    ontology=ontology,
    model=model,
)

# Populate it from sources
kg.process_sources([URL("https://example.com/data")])
```
### Aletheia: Graph-Hybrid Mode

```python
from pathlib import Path

from aletheia.core.schema import SchemaInferenceEngine, SchemaMode

# Create the engine with an ontology
engine = SchemaInferenceEngine(
    llm_client=client,
    schema_mode=SchemaMode.GRAPH_HYBRID,
    ontology=ontology,  # loaded from OWL/TTL
    parser=parser,
    alignment_confidence=0.7,
)

# Extract the schema with alignment
schema = await engine.extract_schema(
    db_name="my_graph",
    sample_data_dir=Path("data/"),
)

# Access the alignment report
report = engine._last_alignment_report
print(f"Aligned: {report.successful_count}")
print(f"Failed: {report.failed_count}")
```
## When to Use Which
| Scenario | Recommended |
|---|---|
| Quick prototyping from documents | GraphRAG-SDK |
| Domain with formal ontology (FTM, FIBO, etc.) | Aletheia |
| Need audit trail / alignment reports | Aletheia |
| Multi-source data requiring entity resolution | Aletheia |
| Simple RAG chatbot | GraphRAG-SDK |
| Regulatory compliance / explainability | Aletheia |
| Ad-hoc document analysis | GraphRAG-SDK |
| Production knowledge graph with governance | Aletheia |
## Summary
GraphRAG-SDK optimizes for simplicity and speed. It works well for general-purpose document analysis where schema consistency is less critical.
Aletheia optimizes for rigor and auditability. It excels in domains with established ontologies where schema fidelity and alignment transparency matter.
The key insight is that the two frameworks occupy different points on the flexibility-rigor spectrum: GraphRAG-SDK prioritizes developer experience, while Aletheia prioritizes domain-modeling correctness.