Schema Modes
Schema modes control how Aletheia extracts entities and relationships from your data. Each mode balances structure against flexibility differently.
Available Modes
| Mode | Description | Use When |
|---|---|---|
| none | No schema — Graphiti extracts freely | Quick prototyping |
| llm | LLM infers schema from sample data | Unknown data structure |
| inference | Alias for llm | Same as llm |
| ontology | Direct extraction from ontology, no LLM | Formal domain models, strict adherence |
| hybrid | LLM extraction validated against ontology | Known domain, some flexibility needed |
| graph-hybrid | LLM-first, then semantic alignment via ontology graph | Recommended for FTM data |
| ontology-first | Ontology primary, LLM discovers edge cases | Complete ontologies with known gaps |
inference is an alias
The inference mode is identical to llm. Both use the same two-stage meta-prompt pipeline. The alias exists for backward compatibility.
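Because both names select the same pipeline, a script that wraps Aletheia can fold the alias into its canonical name before comparing or logging modes. The following is a minimal sketch in Python; `canonical_mode`, `VALID_MODES`, and `ALIASES` are hypothetical helpers for illustration and are not part of Aletheia's API. The mode names themselves come from the table above.

```python
# Hypothetical helper: not part of Aletheia's API.
# Mode names are taken from the "Available Modes" table.
VALID_MODES = {
    "none", "llm", "ontology", "hybrid", "graph-hybrid", "ontology-first",
}
ALIASES = {"inference": "llm"}  # backward-compatible alias


def canonical_mode(name: str) -> str:
    """Return the canonical schema mode name, resolving known aliases."""
    mode = ALIASES.get(name, name)
    if mode not in VALID_MODES:
        raise ValueError(f"unknown schema mode: {name!r}")
    return mode
```

With this helper, `canonical_mode("inference")` and `canonical_mode("llm")` both yield `"llm"`, so downstream comparisons only need to handle one name.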
Choosing a Mode
graph TD
A[Do you have an ontology?] -->|Yes| B[Is the ontology complete?]
A -->|No| C[Is data structure known?]
B -->|Yes, covers all types| D[ontology-first]
B -->|Mostly, some gaps| E[graph-hybrid]
B -->|Partial coverage| F[hybrid]
C -->|Yes| G[llm]
C -->|No| H[none]
Common Final Step: Phase 4 Consolidation
Every mode that produces a schema ends with a Phase 4 consolidation step; the none mode has no schema to consolidate, so it skips this step. An LLM reviews the complete schema for:
- Redundant types — merges semantically similar entity or relationship types
- Over-specialized types — consolidates overly narrow types into broader ones
- Naming inconsistencies — normalizes type names
Ontology-derived and extension types are protected from removal during consolidation.
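The protection rule can be pictured as a filter over whatever merges the reviewing LLM proposes. The sketch below is only an illustration under stated assumptions: merge proposals are modeled as a plain mapping from a type to the type it should be folded into, and `apply_merges`, `proposed`, and `protected` are hypothetical names. This page does not describe Aletheia's internal data structures, so the real implementation may look quite different.

```python
def apply_merges(types, proposed, protected):
    """Apply proposed merges, never removing a protected type.

    types:     set of type names in the inferred schema
    proposed:  dict mapping a type to the type it should be merged into
               (assumed shape of the LLM's suggestions)
    protected: ontology-derived and extension types that must survive
    """
    result = set(types)
    applied = {}
    for source, target in proposed.items():
        if source not in result or source == target:
            continue  # nothing to merge
        if source in protected:
            continue  # protected types stay even if flagged as redundant
        result.discard(source)
        result.add(target)
        applied[source] = target
    # Chained merges (A into B, then B into C) are not resolved in this sketch.
    return result, applied
```

A proposal to merge a protected type is simply ignored, while unprotected duplicates are folded into their targets.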
Mode Details
none
No schema constraints. Graphiti's LLM extracts whatever entities and relationships it finds.
- Fastest to set up
- No guarantees on type consistency across episodes
- No Phase 4 consolidation (no schema to consolidate)
aletheia build-knowledge-graph \
--use-case my_case \
--knowledge-graph my_graph \
--schema-mode none
llm
A two-stage meta-prompt pipeline: Stage 1 generates a domain-specific extraction prompt from sample data, Stage 2 uses that prompt to extract the schema.
- Good for exploring unknown data
- Schema generated once per run
- Data-driven pruning removes types not found in the parser's schema_distribution
aletheia build-knowledge-graph \
--use-case my_case \
--knowledge-graph my_graph \
--schema-mode llm
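The data-driven pruning step mentioned above can be sketched as a simple filter. This is an illustration, not Aletheia's code: `prune_types` is a hypothetical function, and `schema_distribution` is assumed here to be a mapping from type name to the number of records observed by the parser. Its actual shape is not documented in this section, and treating a count of zero as "not found" is a further assumption.

```python
def prune_types(inferred_types, schema_distribution):
    """Keep only inferred types that were observed in the parsed data.

    schema_distribution is assumed to map type name -> observed record count.
    """
    observed = {
        name for name, count in schema_distribution.items() if count > 0
    }
    # Preserve the original ordering of the inferred types.
    return [t for t in inferred_types if t in observed]
```

For example, given inferred types `["Person", "Vessel", "Company"]` and a distribution in which `Vessel` has no records, only `Person` and `Company` survive.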
ontology
Direct extraction from a TTL/OWL ontology file. No LLM involvement — class classification uses transitive ancestry to distinguish entity classes, relationship classes, and abstract classes.
- Strictest mode
- Only produces types defined in the ontology
- Requires a well-defined, complete ontology
aletheia build-knowledge-graph \
--use-case my_case \
--knowledge-graph my_graph \
--schema-mode ontology
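Classification by transitive ancestry means a class is categorized by walking up its superclass chain rather than by looking only at its direct parent. The sketch below shows one way this could work with a plain dictionary standing in for the parsed ontology. Everything here is an assumption made for illustration: the `parents` mapping, the `relationship_roots` and `abstract_classes` sets, and the rule that a class is a relationship if any ancestor is a relationship root. The example class names are sample data only.

```python
def ancestors(cls, parents):
    """Return every transitive superclass of cls.

    parents maps a class name to the list of its direct superclasses.
    """
    seen = set()
    stack = list(parents.get(cls, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against cycles and diamond inheritance
        seen.add(current)
        stack.extend(parents.get(current, []))
    return seen


def classify(cls, parents, relationship_roots, abstract_classes):
    """Classify a class as 'abstract', 'relationship', or 'entity'."""
    if cls in abstract_classes:
        return "abstract"
    if cls in relationship_roots or ancestors(cls, parents) & relationship_roots:
        return "relationship"
    return "entity"


# Sample hierarchy (illustrative only)
parents = {
    "Ownership": ["Interest"],
    "Interest": ["Interval"],
    "Company": ["Organization"],
    "Organization": ["LegalEntity"],
    "LegalEntity": ["Thing"],
}
relationship_roots = {"Interval"}
abstract_classes = {"Thing", "Interval"}
```

In this sample, `Ownership` is classified as a relationship because `Interval` appears two levels up its chain, even though its direct parent is `Interest`.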
hybrid
LLM infers a schema from data, then validates and enriches it against the ontology via fuzzy matching.
- Ontology guides but doesn't constrain
- LLM-discovered types kept alongside ontology types
- Property enrichment adds ontology-defined properties to matched entities
aletheia build-knowledge-graph \
--use-case my_case \
--knowledge-graph my_graph \
--schema-mode hybrid \
--ontology-graph my_ontology
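To make the fuzzy-matching idea concrete, here is a small sketch that uses Python's standard-library `difflib` as a stand-in matcher. The source does not say which fuzzy-matching algorithm Aletheia uses, so `difflib`, the `cutoff` value, and the function name `validate_against_ontology` are all assumptions chosen for illustration.

```python
from difflib import get_close_matches


def validate_against_ontology(inferred_types, ontology_types, cutoff=0.8):
    """Map each inferred type to its closest ontology type, if any.

    Returns (matched, unmatched). matched maps an inferred name to the
    ontology name it resembles. unmatched lists LLM-discovered types that
    are kept as-is, since the ontology guides but does not constrain.
    """
    by_lower = {name.lower(): name for name in ontology_types}
    matched, unmatched = {}, []
    for name in inferred_types:
        hits = get_close_matches(
            name.lower(), list(by_lower), n=1, cutoff=cutoff
        )
        if hits:
            matched[name] = by_lower[hits[0]]
        else:
            unmatched.append(name)
    return matched, unmatched
```

A spelling variant such as `Organisation` would be matched to an ontology type named `Organization`, while a type with no close counterpart stays in the schema as an LLM discovery.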
graph-hybrid (Recommended for FTM)
Three-phase pipeline designed to prevent ontology bias in the initial extraction, followed by the common consolidation step:
- Phase 1 — LLM-first inference: LLM analyzes sample data without seeing the ontology, producing an unbiased schema.
- Phase 2 — Semantic alignment: Each inferred type is matched against ontology concepts in the graph (exact match → alt-label → embedding similarity → LLM reranking).
- Phase 3 — Property enrichment: Ontology properties are added to aligned entities, filtered to only those present in the actual data.
- Phase 4 — Consolidation: Common final step (see above).
# 1. Load ontology into graph (once)
aletheia build-ontology-graph \
--use-case my_case \
--knowledge-graph my_ontology
# 2. Ingest with ontology reference
aletheia build-knowledge-graph \
--use-case my_case \
--knowledge-graph my_graph \
--schema-mode graph-hybrid \
--ontology-graph my_ontology
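The Phase 2 alignment cascade can be sketched as a sequence of progressively looser matchers. The code below is one reading of "exact match → alt-label → embedding similarity → LLM reranking", in which the embedding stage builds a shortlist and the reranker picks from it. All names are hypothetical: `align`, the `concepts` record shape, the `threshold` and `top_k` values, and the `embed` and `rerank` callables, which stand in for an embedding model and an LLM call.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def align(name, concepts, embed, rerank, threshold=0.75, top_k=3):
    """Align one inferred type name to an ontology concept.

    concepts: list of dicts with 'label' and 'alt_labels' (assumed shape)
    embed:    callable str -> vector (stand-in for an embedding model)
    rerank:   callable (name, candidate labels) -> label or None
              (stand-in for the LLM reranking call)
    Returns (label, stage), or (None, 'unaligned') if nothing fits.
    """
    key = name.lower()
    # Stage 1: exact label match (case-insensitive)
    for c in concepts:
        if c["label"].lower() == key:
            return c["label"], "exact"
    # Stage 2: alternative labels
    for c in concepts:
        if key in (a.lower() for a in c.get("alt_labels", [])):
            return c["label"], "alt-label"
    # Stage 3: shortlist by embedding similarity
    query = embed(name)
    scored = sorted(
        ((cosine(query, embed(c["label"])), c["label"]) for c in concepts),
        reverse=True,
    )
    shortlist = [label for score, label in scored[:top_k] if score >= threshold]
    if not shortlist:
        return None, "unaligned"
    # Stage 4: the reranker chooses among the shortlisted candidates
    choice = rerank(name, shortlist)
    if choice in shortlist:
        return choice, "reranked"
    return None, "unaligned"
```

Ordering the stages from cheapest to most expensive means the embedding model and the LLM are only consulted for names that the label lookups cannot resolve.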
ontology-first
The ontology defines the baseline schema; an LLM fills in gaps the ontology doesn't cover.
- Phase 1 — Load ontology: Extracts entity types from concrete classes and relationship types from both reified classes (e.g., Ownership) and non-reified object properties (e.g., locatedIn).
- Phase 2 — LLM enhancement: Discovers patterns not in the ontology. Uses directive hints to prevent duplicating committed relationship types.
- Phase 3 — Merge: Combines ontology base with LLM discoveries. Data-driven pruning removes types not present in the data.
- Phase 4 — Consolidation: Common final step (see above).
aletheia build-knowledge-graph \
--use-case my_case \
--knowledge-graph my_graph \
--schema-mode ontology-first \
--ontology-graph my_ontology
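The Phase 3 merge can be thought of as layering LLM discoveries on top of the ontology baseline, with the ontology taking precedence on any clash. The sketch below assumes both schemas are dictionaries keyed by type name. `merge_schemas`, the `source` tag, and the case-insensitive duplicate check are illustrative assumptions rather than documented Aletheia behavior.

```python
def merge_schemas(ontology_types, llm_types):
    """Combine ontology types with LLM discoveries; ontology wins on clashes.

    Both arguments map a type name to a dict describing the type
    (assumed shape). Names are compared case-insensitively so that an
    LLM discovery such as 'ownership' does not duplicate 'Ownership'.
    """
    merged = {
        name: {**spec, "source": "ontology"}
        for name, spec in ontology_types.items()
    }
    taken = {name.lower() for name in merged}
    for name, spec in llm_types.items():
        if name.lower() in taken:
            continue  # already covered by the ontology baseline
        merged[name] = {**spec, "source": "llm"}
        taken.add(name.lower())
    return merged
```

A pruning pass against the data, as described in Phase 3, would then run over the merged result.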
Learn More
- Schema Inference Deep Dive — Full pipeline documentation per mode
- Building Graphs — Build options and flags
- FTM Data — Schema considerations for FollowTheMoney data