Schema Modes¶

Schema modes control how Aletheia extracts entities and relationships from your data. Each mode balances structure against flexibility differently.

Available Modes¶

Mode	Description	Use When
`none`	No schema — Graphiti extracts freely	Quick prototyping
`llm`	LLM infers schema from sample data	Unknown data structure
`inference`	Alias for `llm`	Same as `llm`
`ontology`	Direct extraction from ontology, no LLM	Formal domain models, strict adherence
`hybrid`	LLM extraction validated against ontology	Known domain, some flexibility needed
`graph-hybrid`	LLM-first, then semantic alignment via ontology graph	Recommended for FTM data
`ontology-first`	Ontology primary, LLM discovers edge cases	Complete ontologies with known gaps

inference is an alias

The `inference` mode is identical to `llm`: both use the same two-stage meta-prompt pipeline. The alias exists for backward compatibility.

Choosing a Mode¶

graph TD
    A[Do you have an ontology?] -->|Yes| B[Is the ontology complete?]
    A -->|No| C[Is data structure known?]
    B -->|Yes, covers all types| D[ontology-first]
    B -->|Mostly, some gaps| E[graph-hybrid]
    B -->|Partial coverage| F[hybrid]
    C -->|Yes| G[llm]
    C -->|No| H[none]
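
The flowchart above can also be read as a small lookup. The sketch below mirrors it branch for branch; the function name and the coverage labels (`complete`, `mostly`, `partial`) are hypothetical and are not part of Aletheia.

```python
from typing import Optional

# Hypothetical labels for the three "Is the ontology complete?" branches.
COVERAGE_TO_MODE = {
    "complete": "ontology-first",  # covers all types
    "mostly": "graph-hybrid",      # some gaps
    "partial": "hybrid",           # partial coverage
}


def choose_schema_mode(
    has_ontology: bool,
    coverage: Optional[str] = None,
    structure_known: bool = False,
) -> str:
    """Return the schema mode the flowchart would suggest."""
    if has_ontology:
        if coverage not in COVERAGE_TO_MODE:
            raise ValueError(f"coverage must be one of {sorted(COVERAGE_TO_MODE)}")
        return COVERAGE_TO_MODE[coverage]
    # No ontology: the only remaining question is whether the data structure is known.
    return "llm" if structure_known else "none"


if __name__ == "__main__":
    print(choose_schema_mode(True, coverage="mostly"))
    print(choose_schema_mode(False, structure_known=True))
```

The returned string is the value you would pass to `--schema-mode`.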

Common Final Step: Phase 4 Consolidation¶

Every mode that produces a schema ends with a Phase 4 consolidation step; `none` skips it because there is no schema to consolidate. During consolidation, an LLM reviews the complete schema for:

Redundant types — merges semantically similar entity or relationship types
Over-specialized types — consolidates overly narrow types into broader ones
Naming inconsistencies — normalizes type names

Ontology-derived and extension types are protected from removal during consolidation.
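
As a rough illustration of how protection could interact with merging, the sketch below applies a set of merge proposals while leaving protected types untouched. It is not Aletheia's implementation: the function name is hypothetical, and the LLM's proposals are assumed to arrive as a plain `old_name -> new_name` mapping.

```python
from typing import Dict, Iterable, List, Set


def apply_consolidation(
    types: Iterable[str],
    proposed_merges: Dict[str, str],
    protected: Set[str],
) -> List[str]:
    """Apply merge proposals, never removing a protected type.

    `proposed_merges` is an assumed stand-in for the LLM's review output.
    """
    result: List[str] = []
    for name in types:
        # A protected type ignores any proposal that would merge it away.
        target = name if name in protected else proposed_merges.get(name, name)
        if target not in result:
            result.append(target)
    return result


if __name__ == "__main__":
    schema = ["Company", "Corporation", "Firm", "Person"]
    merges = {"Corporation": "Company", "Firm": "Company"}
    # "Firm" is treated as ontology-derived here, so it survives the merge.
    print(apply_consolidation(schema, merges, protected={"Firm"}))
```

In this toy run, `Corporation` is folded into `Company`, while `Firm` is kept because it is marked as protected.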

Mode Details¶

none¶

No schema constraints. Graphiti's LLM extracts whatever entities and relationships it finds.

Fastest to set up
No guarantees on type consistency across episodes
No Phase 4 consolidation (no schema to consolidate)

aletheia build-knowledge-graph \
  --use-case my_case \
  --knowledge-graph my_graph \
  --schema-mode none

llm¶

A two-stage meta-prompt pipeline: Stage 1 generates a domain-specific extraction prompt from sample data; Stage 2 uses that prompt to extract the schema.

Good for exploring unknown data
Schema generated once per run
Data-driven pruning removes types not found in the parser's `schema_distribution`

aletheia build-knowledge-graph \
  --use-case my_case \
  --knowledge-graph my_graph \
  --schema-mode llm
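
The data-driven pruning step mentioned above can be pictured as a simple filter. This is a minimal sketch, assuming `schema_distribution` is a mapping from type name to the number of occurrences the parser observed; the real structure may differ, and `prune_schema` is a hypothetical helper.

```python
from typing import Dict, Iterable, List


def prune_schema(
    schema_types: Iterable[str],
    schema_distribution: Dict[str, int],
) -> List[str]:
    """Keep only types that occur at least once in the parsed data.

    Assumes `schema_distribution` maps type name -> observed count.
    """
    return [t for t in schema_types if schema_distribution.get(t, 0) > 0]


if __name__ == "__main__":
    inferred = ["Person", "Company", "Vessel"]
    distribution = {"Person": 120, "Company": 45}  # no vessels in the data
    print(prune_schema(inferred, distribution))
```

Types the LLM proposed but the data never exhibits (here `Vessel`) drop out, which keeps the schema tied to what was actually parsed.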

ontology¶

Direct extraction from a TTL/OWL ontology file, with no LLM involvement. Each class is classified by its transitive ancestry as an entity class, a relationship class, or an abstract class.

Strictest mode
Only produces types defined in the ontology
Requires a well-defined, complete ontology

aletheia build-knowledge-graph \
  --use-case my_case \
  --knowledge-graph my_graph \
  --schema-mode ontology
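
To make "classification by transitive ancestry" concrete, here is a small sketch over a hand-written class hierarchy. It rests on several assumptions: the hierarchy is given as a child-to-parents mapping, relationship classes are those descending from a designated root, and abstract classes are supplied explicitly. The class names are borrowed from FollowTheMoney for illustration only, and the rules Aletheia actually applies may differ.

```python
from typing import Dict, Iterable, Set


def ancestors(cls: str, parents: Dict[str, Iterable[str]]) -> Set[str]:
    """Collect every ancestor of `cls` by walking parent links transitively."""
    seen: Set[str] = set()
    stack = list(parents.get(cls, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def classify(
    cls: str,
    parents: Dict[str, Iterable[str]],
    relationship_roots: Set[str],  # hypothetical input
    abstract: Set[str],            # hypothetical input
) -> str:
    """Label a class as 'abstract', 'relationship', or 'entity'."""
    if cls in abstract:
        return "abstract"
    lineage = ancestors(cls, parents) | {cls}
    return "relationship" if lineage & relationship_roots else "entity"


if __name__ == "__main__":
    hierarchy = {
        "Ownership": ["Interest"],
        "Interest": ["Interval"],
        "Company": ["Organization"],
        "Organization": ["LegalEntity"],
        "LegalEntity": ["Thing"],
    }
    for name in ("Ownership", "Company", "Thing"):
        print(name, classify(name, hierarchy, {"Interval"}, {"Thing", "Interval"}))
```

`Ownership` is two steps below `Interval`, so a direct-parent check alone would miss it; the transitive walk is what catches it.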

hybrid¶

LLM infers a schema from data, then validates and enriches it against the ontology via fuzzy matching.

Ontology guides but doesn't constrain
LLM-discovered types kept alongside ontology types
Property enrichment adds ontology-defined properties to matched entities

aletheia build-knowledge-graph \
  --use-case my_case \
  --knowledge-graph my_graph \
  --schema-mode hybrid \
  --ontology-graph my_ontology
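
The fuzzy-matching step can be approximated with Python's standard `difflib`. This is only a sketch of the idea: the matcher Aletheia uses, its threshold, and the helper name `match_to_ontology` are assumptions here.

```python
import difflib
from typing import Dict, Iterable, Optional


def match_to_ontology(
    inferred: Iterable[str],
    ontology_types: Iterable[str],
    cutoff: float = 0.8,  # assumed similarity threshold
) -> Dict[str, Optional[str]]:
    """Map each LLM-inferred type to its closest ontology type, or None."""
    by_lower = {t.lower(): t for t in ontology_types}
    matches: Dict[str, Optional[str]] = {}
    for name in inferred:
        hits = difflib.get_close_matches(
            name.lower(), list(by_lower), n=1, cutoff=cutoff
        )
        matches[name] = by_lower[hits[0]] if hits else None
    return matches


if __name__ == "__main__":
    result = match_to_ontology(["Organisation", "Vessel"], ["Organization", "Person"])
    print(result)
```

A type that maps to `None` has no ontology counterpart; in `hybrid` mode such LLM-discovered types are kept alongside the ontology types instead of being discarded.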

graph-hybrid (Recommended for FTM)¶

A pipeline designed to prevent ontology bias in the initial extraction. Three mode-specific phases are followed by the common consolidation step:

Phase 1 — LLM-first inference: LLM analyzes sample data without seeing the ontology, producing an unbiased schema.
Phase 2 — Semantic alignment: Each inferred type is matched against ontology concepts in the graph (exact match → alt-label → embedding similarity → LLM reranking).
Phase 3 — Property enrichment: Ontology properties are added to aligned entities, filtered to only those present in the actual data.
Phase 4 — Consolidation: Common final step (see above).

# 1. Load ontology into graph (once)
aletheia build-ontology-graph \
  --use-case my_case \
  --knowledge-graph my_ontology

# 2. Ingest with ontology reference
aletheia build-knowledge-graph \
  --use-case my_case \
  --knowledge-graph my_graph \
  --schema-mode graph-hybrid \
  --ontology-graph my_ontology
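
The Phase 2 cascade (exact match, then alt-label, then embedding similarity, then LLM reranking) can be sketched as a chain of fallbacks. Everything below is illustrative: the concept dictionaries, the toy two-dimensional vectors, the threshold, and the `rerank` callable standing in for an LLM call are all assumptions, not Aletheia's API.

```python
import math
from typing import Callable, Dict, List, Optional, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def align(
    name: str,
    vector: Sequence[float],
    concepts: List[Dict],
    rerank: Callable[[str, List[str]], str],  # stand-in for an LLM reranker
    threshold: float = 0.8,                   # assumed similarity cutoff
) -> Tuple[Optional[str], str]:
    """Return (ontology label or None, name of the stage that decided)."""
    key = name.lower()
    for c in concepts:
        if c["label"].lower() == key:
            return c["label"], "exact"
    for c in concepts:
        if key in (alt.lower() for alt in c["alt_labels"]):
            return c["label"], "alt-label"
    candidates = [
        c["label"] for c in concepts if cosine(vector, c["vector"]) >= threshold
    ]
    if not candidates:
        return None, "unaligned"
    if len(candidates) == 1:
        return candidates[0], "embedding"
    # Several plausible concepts: defer to the reranker to pick one.
    return rerank(name, candidates), "llm-rerank"


if __name__ == "__main__":
    concepts = [
        {"label": "Company", "alt_labels": ["Firm"], "vector": [1.0, 0.0]},
        {"label": "Person", "alt_labels": ["Individual"], "vector": [0.0, 1.0]},
    ]
    pick_first = lambda _name, options: options[0]
    print(align("firm", [0.0, 0.0], concepts, pick_first))
    print(align("Business", [0.9, 0.1], concepts, pick_first))
```

Ordering the stages from cheapest to most expensive means the reranker is consulted only when the earlier stages leave more than one candidate.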

ontology-first¶

The ontology defines the baseline schema; an LLM fills in gaps the ontology doesn't cover.

Phase 1 — Load ontology: Extracts entity types from concrete classes and relationship types from both reified classes (e.g., Ownership) and non-reified object properties (e.g., locatedIn).
Phase 2 — LLM enhancement: Discovers patterns not in the ontology. Uses directive hints to prevent duplicating committed relationship types.
Phase 3 — Merge: Combines ontology base with LLM discoveries. Data-driven pruning removes types not present in the data.
Phase 4 — Consolidation: Common final step (see above).

aletheia build-knowledge-graph \
  --use-case my_case \
  --knowledge-graph my_graph \
  --schema-mode ontology-first \
  --ontology-graph my_ontology
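
The Phase 3 merge can be pictured as "ontology types first, LLM discoveries appended, then prune". The sketch below assumes types are plain names, that duplicates are detected case-insensitively, and that the observed types arrive as a set; the function name and all three assumptions are hypothetical.

```python
from typing import Iterable, List, Set


def merge_ontology_first(
    ontology_types: Iterable[str],
    llm_types: Iterable[str],
    observed: Set[str],  # assumed: type names actually present in the data
) -> List[str]:
    """Combine ontology and LLM types, then drop types absent from the data."""
    merged: List[str] = []
    known: Set[str] = set()
    # Ontology types come first so they win any name clash with LLM types.
    for name in list(ontology_types) + list(llm_types):
        if name.lower() not in known:
            known.add(name.lower())
            merged.append(name)
    return [t for t in merged if t in observed]


if __name__ == "__main__":
    base = ["Person", "Company", "Ownership"]
    discovered = ["company", "Sanction"]
    seen_in_data = {"Person", "Company", "Sanction"}
    print(merge_ontology_first(base, discovered, seen_in_data))
```

Here the LLM's `company` is dropped as a duplicate of the ontology's `Company`, `Sanction` is added as a new discovery, and `Ownership` is pruned because it never appears in the data.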

Learn More¶

Schema Inference Deep Dive — Full pipeline documentation per mode
Building Graphs — Build options and flags
FTM Data — Schema considerations for FollowTheMoney data