Hybrid Mode

Hybrid mode combines LLM-based schema inference with ontology validation based on fuzzy string matching. It is a middle ground between pure LLM inference and strict ontology adherence.

Overview

Aspect	Value
Ontology Required	Yes
LLM Calls for Schema	2 (same as LLM mode)
Type Consistency	Good
Setup Time	Medium
Best For	Balanced approach, known + unknown data

How It Works

┌─────────────────────────────────────────────────────────┐
│                    PHASE 1                               │
│              LLM Schema Inference                        │
├─────────────────────────────────────────────────────────┤
│  Same as LLM mode:                                      │
│  - Stage 1: Domain analysis                             │
│  - Stage 2: Schema extraction                           │
│  Output: Inferred entity/relationship types             │
└─────────────────────────────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────┐
│                    PHASE 2                               │
│              Ontology Validation                         │
├─────────────────────────────────────────────────────────┤
│  For each inferred type:                                │
│  - Fuzzy string match against ontology names            │
│  - If match found: Use ontology name                    │
│  - If no match: Keep inferred name                      │
└─────────────────────────────────────────────────────────┘
                          │
                          ▼
┌─────────────────────────────────────────────────────────┐
│                    PHASE 3                               │
│              Property Enrichment                         │
├─────────────────────────────────────────────────────────┤
│  For aligned entities:                                  │
│  - Add properties from ontology                         │
│  - Merge with LLM-inferred properties                   │
└─────────────────────────────────────────────────────────┘
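
Phases 2 and 3 can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not Aletheia's implementation: the function name `validate_and_enrich`, the dict-of-sets schema shape, and the 0.8 cutoff are all hypothetical, and the standard library's `difflib.get_close_matches` stands in for whatever fuzzy matcher the tool actually uses.

```python
from difflib import get_close_matches


def validate_and_enrich(inferred, ontology, cutoff=0.8):
    """Align inferred types with ontology names and merge their properties.

    Both arguments map a type name to a set of property names.
    The cutoff (0.0 to 1.0) is an assumed value.
    """
    schema = {}
    for name, llm_props in inferred.items():
        # Phase 2: fuzzy-match the inferred name against the ontology names.
        hits = get_close_matches(name, list(ontology), n=1, cutoff=cutoff)
        if hits:
            # Phase 3: adopt the ontology name and merge both property sets.
            merged = set(llm_props) | set(ontology[hits[0]])
            schema.setdefault(hits[0], set()).update(merged)
        else:
            # No match: keep the inferred type and its properties as-is.
            schema.setdefault(name, set()).update(llm_props)
    return schema


inferred = {"Persons": {"name"}, "TerroristGroup": {"name", "region"}}
ontology = {
    "Person": {"name", "birthDate"},
    "Organization": {"name", "foundedOn"},
}
schema = validate_and_enrich(inferred, ontology)
print(schema)
```

In this toy input, "Persons" is close enough to "Person" to be renamed and enriched, while "TerroristGroup" has no close ontology name and is kept unchanged.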

Usage

aletheia build-knowledge-graph \
  --use-case my_case \
  --knowledge-graph my_graph \
  --schema-mode hybrid

Comparison with LLM Mode

Aspect	LLM Mode	Hybrid Mode
LLM inference	Yes	Yes
Ontology validation	No	Yes (string matching)
Property enrichment	No	Yes
Handles unknown types	Yes	Yes (keeps as-is)

String Matching

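Hybrid mode uses fuzzy string matching to align inferred types with ontology type names. An inferred name is replaced only when its similarity score reaches the matching threshold: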

from thefuzz import fuzz  # assumed library; rapidfuzz provides a compatible fuzz.ratio

# LLM infers: "TerroristOrganization"
# Ontology has: "Organization"

# String matching:
match_score = fuzz.ratio("TerroristOrganization", "Organization")
# Result: 73 (below threshold)
# Outcome: Keep "TerroristOrganization"

# LLM infers: "Person"
# Ontology has: "Person"

# String matching:
match_score = fuzz.ratio("Person", "Person")
# Result: 100 (above threshold)
# Outcome: Use "Person"
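
The keep-or-replace decision can be written as a small function. The sketch below is illustrative only: the `THRESHOLD` value of 80 is an assumption (the actual cutoff is not stated here), `similarity` and `align` are hypothetical names, and `difflib.SequenceMatcher` from the standard library is used as a dependency-free stand-in for `fuzz.ratio`.

```python
from difflib import SequenceMatcher

THRESHOLD = 80  # assumed cutoff on a 0-100 scale


def similarity(a: str, b: str) -> int:
    """Return a 0-100 similarity score, similar in spirit to fuzz.ratio."""
    return round(100 * SequenceMatcher(None, a, b).ratio())


def align(inferred: str, ontology_names: list[str]) -> str:
    """Return the best ontology name if it clears THRESHOLD, else the inferred name."""
    if not ontology_names:
        return inferred
    best = max(ontology_names, key=lambda name: similarity(inferred, name))
    return best if similarity(inferred, best) >= THRESHOLD else inferred


names = ["Person", "Organization"]
print(align("Person", names))                 # Person
print(align("TerroristOrganization", names))  # TerroristOrganization (kept)
```

Picking only the single best candidate keeps the behavior predictable: an inferred type is either renamed to exactly one ontology type or left alone.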

Matching Limitations

String matching compares characters, not meaning, so it has significant limitations:

LLM Infers	Ontology Has	Match?	Reason
Person	Person	Yes	Exact match
Persona	Person	Maybe	Language difference
Airport	Aerodrome	No	Synonym
Company	Corporation	No	Semantic equivalent

For better semantic matching, use Graph-Hybrid Mode.
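
To make the table concrete, the same pairs can be scored with a character-level matcher. This is a rough sketch: the 0.8 `CUTOFF` is an assumed value, `ratio` is a hypothetical helper, and `difflib.SequenceMatcher` is a stand-in for the matcher hybrid mode actually uses, so exact scores may differ.

```python
from difflib import SequenceMatcher

CUTOFF = 0.8  # assumed cutoff on a 0.0-1.0 scale

pairs = [
    ("Person", "Person"),
    ("Persona", "Person"),
    ("Airport", "Aerodrome"),
    ("Company", "Corporation"),
]


def ratio(a: str, b: str) -> float:
    """Case-insensitive character similarity between two type names."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


# Map each (inferred, ontology) pair to whether it clears the cutoff.
results = {pair: ratio(*pair) >= CUTOFF for pair in pairs}

for (inferred, onto), matched in results.items():
    print(f"{inferred:10} {onto:12} {'match' if matched else 'no match'}")
```

The synonym pairs share too few characters to clear any reasonable cutoff, while a near-spelling such as "Persona" passes or fails depending entirely on where the cutoff is set.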

Console Output

🔀 Hybrid Mode: LLM-first + Ontology Validation

📊 Phase 1: LLM Schema Inference
   [Same output as LLM mode]
   ✓ Inferred 8 entity types, 5 relationship types

📋 Phase 2: Ontology Validation
   Validating against ontology: my_ontology.ttl
   ✓ Person → Person (exact match)
   ✓ Organization → Organization (exact match)
   ⚠️ TerroristGroup → (no match, kept as-is)
   ⚠️ Airport → (no match, kept as-is)
   ✓ Aligned 5/8 entities

📚 Phase 3: Property Enrichment
   ✓ Person: +12 properties from ontology
   ✓ Organization: +8 properties from ontology

Phase 4: Consolidation

After property enrichment, the schema passes through Phase 4 consolidation, the final step shared by all modes. An LLM reviews the schema for redundancies and normalizes naming. Ontology-validated types are protected from removal.
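
The protection rule can be pictured as a filter applied to whatever the reviewing LLM proposes. The sketch below is hypothetical: `apply_consolidation` and its arguments are illustrative names, and the removal proposals are hard-coded rather than produced by an LLM.

```python
def apply_consolidation(schema_types, proposed_removals, validated):
    """Drop types flagged as redundant, except ontology-validated ones.

    schema_types: ordered list of type names in the schema.
    proposed_removals: names the reviewer suggests removing (assumed input).
    validated: names that matched the ontology in Phase 2.
    """
    removable = set(proposed_removals) - set(validated)
    return [t for t in schema_types if t not in removable]


types = ["Person", "Individual", "Organization", "TerroristGroup"]
kept = apply_consolidation(
    types,
    proposed_removals={"Individual", "Person"},
    validated={"Person", "Organization"},
)
print(kept)  # ['Person', 'Organization', 'TerroristGroup']
```

Here "Person" survives even though it was proposed for removal, because it was validated against the ontology; "Individual" was not, so it is dropped.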

Pros and Cons

Advantages

- Flexible: Handles types that are not in the ontology
- Some normalization: Exact and close matches are aligned to ontology names
- Property enrichment: Matched types gain the ontology's properties
- No graph required: Needs only the ontology file, with no graph loading

Disadvantages

- String matching limits: Misses semantic equivalents
- No embedding alignment: Can't match "Airport" to "Aerodrome"
- Inconsistent results: Similar names may or may not match, depending on the score
- Language barriers: Non-English terms are unlikely to match

When to Use

Use hybrid mode when:

- Quick validation needed: You want some ontology alignment without full setup
- Names are similar: Your data uses names close to those in the ontology
- Graph not available: You can't load the ontology into a graph database
- Partial alignment OK: Some unaligned types are acceptable

When NOT to Use

Avoid hybrid mode when:

- Semantic alignment needed: Your terms don't match the ontology's strings
- FTM data: Use graph-hybrid for better alignment
- Full ontology compliance: Use ontology or ontology-first mode
- Multilingual data: String matching fails across languages

Migration Path

If hybrid mode isn't aligning well, consider:

- Graph-Hybrid: For semantic alignment via embeddings
- Ontology-First: For complete ontology coverage

See also:

- LLM Mode - Pure LLM inference (no validation)
- Graph-Hybrid Mode - Semantic alignment (recommended)
- Overview - Comparison of all modes