Schema Modes

Schema modes control how Aletheia extracts entities and relationships from your data. Each mode balances structure against flexibility differently.

Available Modes

| Mode | Description | Use When |
| --- | --- | --- |
| none | No schema; Graphiti extracts freely | Quick prototyping |
| llm | LLM infers schema from sample data | Unknown data structure |
| inference | Alias for llm | Same as llm |
| ontology | Direct extraction from ontology, no LLM | Formal domain models, strict adherence |
| hybrid | LLM extraction validated against ontology | Known domain, some flexibility needed |
| graph-hybrid | LLM-first, then semantic alignment via ontology graph | Recommended for FTM data |
| ontology-first | Ontology primary, LLM discovers edge cases | Complete ontologies with known gaps |

inference is an alias

The inference mode is identical to llm. Both use the same two-stage meta-prompt pipeline. The alias exists for backward compatibility.
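
One way to picture the alias is as a lookup applied before the mode is dispatched. The snippet below is a minimal sketch, not Aletheia's code: the names `SCHEMA_MODE_ALIASES`, `VALID_MODES`, and `normalize_schema_mode` are hypothetical, and only the mode names themselves come from this page.

```python
# Hypothetical alias table: only the "inference" -> "llm" mapping is documented.
SCHEMA_MODE_ALIASES = {"inference": "llm"}

# Canonical mode names as listed in the table above.
VALID_MODES = {"none", "llm", "ontology", "hybrid", "graph-hybrid", "ontology-first"}


def normalize_schema_mode(mode: str) -> str:
    """Resolve an alias to its canonical mode name and reject unknown modes."""
    canonical = SCHEMA_MODE_ALIASES.get(mode, mode)
    if canonical not in VALID_MODES:
        raise ValueError(f"unknown schema mode: {mode!r}")
    return canonical
```

With this sketch, `normalize_schema_mode("inference")` and `normalize_schema_mode("llm")` both resolve to `"llm"`, so everything downstream sees a single mode.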

Choosing a Mode

graph TD
    A[Do you have an ontology?] -->|Yes| B[Is the ontology complete?]
    A -->|No| C[Is data structure known?]
    B -->|Yes, covers all types| D[ontology-first]
    B -->|Mostly, some gaps| E[graph-hybrid]
    B -->|Partial coverage| F[hybrid]
    C -->|Yes| G[llm]
    C -->|No| H[none]
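
The same decision chart can be written as a small function, which is handy when the mode is picked by a script rather than by hand. This is an illustrative sketch only: `choose_schema_mode` and the coverage labels `"complete"`, `"mostly"`, and `"partial"` are assumptions made for the example, not part of Aletheia.

```python
from typing import Optional


def choose_schema_mode(
    has_ontology: bool,
    ontology_coverage: Optional[str] = None,
    data_structure_known: bool = False,
) -> str:
    """Mirror the decision chart above and return a schema mode name."""
    if has_ontology:
        # Hypothetical labels for the three ontology branches of the chart.
        by_coverage = {
            "complete": "ontology-first",  # covers all types
            "mostly": "graph-hybrid",      # some gaps
            "partial": "hybrid",           # partial coverage
        }
        if ontology_coverage not in by_coverage:
            raise ValueError(f"unknown ontology coverage: {ontology_coverage!r}")
        return by_coverage[ontology_coverage]
    # No ontology: the choice depends only on whether the data structure is known.
    return "llm" if data_structure_known else "none"
```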

Common Final Step: Phase 4 Consolidation

Every mode that builds a schema ends with a Phase 4 consolidation step; the none mode skips it because it has no schema to consolidate. In this step, an LLM reviews the complete schema for:

  • Redundant types — merges semantically similar entity or relationship types
  • Over-specialized types — consolidates overly narrow types into broader ones
  • Naming inconsistencies — normalizes type names

Ontology-derived and extension types are protected from removal during consolidation.
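
The protection rule can be sketched as a filter over the merges the reviewing LLM proposes. Everything here is hypothetical: the `SchemaType` record, the `source` labels, and the shape of the `merges` mapping are assumptions chosen to illustrate the rule, not Aletheia's internal data model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SchemaType:
    name: str
    source: str  # hypothetical labels: "ontology", "extension", or "llm"


# Types from these sources are never removed by consolidation.
PROTECTED_SOURCES = {"ontology", "extension"}


def apply_merges(types, merges):
    """Drop types that should be folded into another type, unless protected.

    `merges` maps a type name to the name of the type it should be merged into.
    """
    kept = []
    for t in types:
        target = merges.get(t.name)
        is_redundant = target is not None and target != t.name
        if is_redundant and t.source not in PROTECTED_SOURCES:
            continue  # redundant LLM-discovered type: removed in favor of target
        kept.append(t)
    return kept
```

In this sketch a proposed merge of an extension type is ignored, while the same proposal for an LLM-discovered type removes it.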

Mode Details

none

No schema constraints. Graphiti's LLM extracts whatever entities and relationships it finds.

  • Fastest to set up
  • No guarantees on type consistency across episodes
  • No Phase 4 consolidation (no schema to consolidate)

aletheia build-knowledge-graph \
  --use-case my_case \
  --knowledge-graph my_graph \
  --schema-mode none

llm

A two-stage meta-prompt pipeline: Stage 1 generates a domain-specific extraction prompt from sample data, Stage 2 uses that prompt to extract the schema.

  • Good for exploring unknown data
  • Schema generated once per run
  • Data-driven pruning removes types not found in the parser's schema_distribution

aletheia build-knowledge-graph \
  --use-case my_case \
  --knowledge-graph my_graph \
  --schema-mode llm
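
Data-driven pruning can be pictured as a filter over the inferred type names. The sketch below assumes that `schema_distribution` is a mapping from type name to the number of records of that type seen by the parser; that shape and the `prune_schema` helper are assumptions for illustration, since this page does not specify them.

```python
def prune_schema(inferred_types, schema_distribution):
    """Keep only the inferred types that actually occur in the parsed data.

    `schema_distribution` is assumed to map type name -> observed record count.
    """
    return [t for t in inferred_types if schema_distribution.get(t, 0) > 0]


# Example: "Aircraft" was inferred by the LLM but never appears in the data.
inferred = ["Person", "Company", "Aircraft"]
distribution = {"Person": 120, "Company": 45}
pruned = prune_schema(inferred, distribution)
```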

ontology

Direct extraction from a TTL/OWL ontology file. No LLM involvement — class classification uses transitive ancestry to distinguish entity classes, relationship classes, and abstract classes.

  • Strictest mode
  • Only produces types defined in the ontology
  • Requires a well-defined, complete ontology

aletheia build-knowledge-graph \
  --use-case my_case \
  --knowledge-graph my_graph \
  --schema-mode ontology
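
To illustrate what classification by transitive ancestry can look like, the sketch below walks a subclass hierarchy held in a plain dictionary. It is not Aletheia's implementation: the class names, the `relationship_roots` and `abstract_classes` sets, and both helper functions are hypothetical, and a real run would read the hierarchy from the TTL/OWL file.

```python
def ancestors(cls, parents):
    """Return every direct and indirect superclass of `cls`."""
    seen = set()
    stack = list(parents.get(cls, ()))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against cycles and repeated paths
        seen.add(current)
        stack.extend(parents.get(current, ()))
    return seen


def classify(cls, parents, relationship_roots, abstract_classes):
    """Label a class as "abstract", "relationship", or "entity"."""
    if cls in abstract_classes:
        return "abstract"
    if cls in relationship_roots or ancestors(cls, parents) & relationship_roots:
        return "relationship"
    return "entity"


# Hypothetical hierarchy: child class -> list of direct parent classes.
PARENTS = {
    "Ownership": ["Interest"],
    "Interest": ["Relationship"],
    "Company": ["LegalEntity"],
    "LegalEntity": ["Thing"],
}
RELATIONSHIP_ROOTS = {"Relationship"}
ABSTRACT = {"Thing", "Relationship", "Interest"}
```

Here `Ownership` is classified as a relationship because `Relationship` is reachable through `Interest`, even though it is not a direct parent.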

hybrid

LLM infers a schema from data, then validates and enriches it against the ontology via fuzzy matching.

  • Ontology guides but doesn't constrain
  • LLM-discovered types kept alongside ontology types
  • Property enrichment adds ontology-defined properties to matched entities

aletheia build-knowledge-graph \
  --use-case my_case \
  --knowledge-graph my_graph \
  --schema-mode hybrid \
  --ontology-graph my_ontology
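
The validate-and-enrich idea can be sketched with the standard library's `difflib`. This page does not say which fuzzy-matching method or threshold Aletheia uses, so the `SequenceMatcher` ratio, the 0.8 threshold, and both function names are assumptions made for the example.

```python
from difflib import SequenceMatcher


def best_ontology_match(name, ontology_types, threshold=0.8):
    """Return the ontology type whose name is most similar to `name`, or None."""
    best, best_score = None, 0.0
    for candidate in ontology_types:
        score = SequenceMatcher(None, name.lower(), candidate.lower()).ratio()
        if score > best_score:
            best, best_score = candidate, score
    return best if best_score >= threshold else None


def validate_against_ontology(inferred, ontology_props):
    """Map matched types onto ontology names and add their ontology properties.

    Both arguments map a type name to a list of property names.
    Types without a close ontology match are kept unchanged.
    """
    schema = {}
    for name, props in inferred.items():
        match = best_ontology_match(name, ontology_props)
        if match is None:
            schema[name] = sorted(props)  # LLM-discovered type, kept as-is
        else:
            schema[match] = sorted(set(props) | set(ontology_props[match]))
    return schema
```

For instance, an inferred `Organisation` type would be matched to an ontology `Organization` and pick up its properties, while an unrelated `Vessel` type would be kept alongside it.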

graph-hybrid

A three-phase pipeline, followed by the common consolidation step, designed to prevent ontology bias in the initial extraction:

  1. Phase 1 — LLM-first inference: LLM analyzes sample data without seeing the ontology, producing an unbiased schema.
  2. Phase 2 — Semantic alignment: Each inferred type is matched against ontology concepts in the graph (exact match → alt-label → embedding similarity → LLM reranking).
  3. Phase 3 — Property enrichment: Ontology properties are added to aligned entities, filtered to only those present in the actual data.
  4. Phase 4 — Consolidation: Common final step (see above).

# 1. Load ontology into graph (once)
aletheia build-ontology-graph \
  --use-case my_case \
  --knowledge-graph my_ontology

# 2. Ingest with ontology reference
aletheia build-knowledge-graph \
  --use-case my_case \
  --knowledge-graph my_graph \
  --schema-mode graph-hybrid \
  --ontology-graph my_ontology
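
The Phase 2 cascade (exact match, then alt-label, then embedding similarity, then LLM reranking) can be sketched as a chain of fallbacks. The code below is a simplified illustration under stated assumptions: the `Concept` record, the tiny hand-written embeddings, the 0.7 similarity cutoff, and the `rerank` callable standing in for the LLM are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Concept:
    label: str
    alt_labels: tuple = ()
    embedding: tuple = ()


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def align(name, embedding, concepts, rerank, min_similarity=0.7, top_k=3):
    """Return (ontology label or None, name of the stage that decided)."""
    key = name.lower()
    # Stage 1: exact label match.
    for c in concepts:
        if c.label.lower() == key:
            return c.label, "exact"
    # Stage 2: alternative labels.
    for c in concepts:
        if key in (alt.lower() for alt in c.alt_labels):
            return c.label, "alt-label"
    # Stage 3: shortlist the most similar concepts by embedding.
    scored = sorted(
        ((cosine(embedding, c.embedding), c) for c in concepts),
        key=lambda pair: pair[0],
        reverse=True,
    )
    candidates = [c for score, c in scored[:top_k] if score >= min_similarity]
    if not candidates:
        return None, "unaligned"
    # Stage 4: `rerank` stands in for the LLM choosing among the shortlist.
    chosen = rerank(name, candidates)
    return (chosen.label, "embedding+rerank") if chosen else (None, "unaligned")
```

Ordering the stages from cheapest to most expensive means the LLM is consulted only for types that the string and embedding checks could not settle.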

ontology-first

The ontology defines the baseline schema; an LLM fills in gaps the ontology doesn't cover.

  1. Phase 1 — Load ontology: Extracts entity types from concrete classes and relationship types from both reified classes (e.g., Ownership) and non-reified object properties (e.g., locatedIn).
  2. Phase 2 — LLM enhancement: Discovers patterns not in the ontology. Uses directive hints to prevent duplicating committed relationship types.
  3. Phase 3 — Merge: Combines ontology base with LLM discoveries. Data-driven pruning removes types not present in the data.
  4. Phase 4 — Consolidation: Common final step (see above).

aletheia build-knowledge-graph \
  --use-case my_case \
  --knowledge-graph my_graph \
  --schema-mode ontology-first \
  --ontology-graph my_ontology
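
The Phase 3 merge can be pictured as a union that favors the ontology, followed by the data-driven pruning described above. This is a rough sketch: `merge_ontology_first`, the case-insensitive duplicate check, and the `observed_counts` mapping (type name to record count) are assumptions for illustration, not Aletheia's actual merge logic.

```python
def merge_ontology_first(ontology_types, llm_types, observed_counts):
    """Combine ontology types with LLM discoveries, then prune unseen types."""
    merged = list(ontology_types)  # the ontology defines the baseline
    known = {t.lower() for t in ontology_types}
    for t in llm_types:
        if t.lower() not in known:  # skip anything the ontology already covers
            merged.append(t)
            known.add(t.lower())
    # Data-driven pruning: drop types with no records in the data.
    return [t for t in merged if observed_counts.get(t, 0) > 0]


result = merge_ontology_first(
    ontology_types=["Company", "Ownership", "Vessel"],
    llm_types=["company", "Sanction"],
    observed_counts={"Company": 10, "Ownership": 4, "Sanction": 2},
)
```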
