Hybrid Mode

The hybrid mode combines LLM-based schema inference with ontology validation using string matching. It's a middle ground between pure LLM inference and strict ontology adherence.

Overview

Aspect                Value
--------------------  ---------------------------------------
Ontology Required     Yes
LLM Calls for Schema  2 (same as LLM mode)
Type Consistency      Good
Setup Time            Medium
Best For              Balanced approach, known + unknown data

How It Works

┌─────────────────────────────────────────────────────────┐
│                    PHASE 1                               │
│              LLM Schema Inference                        │
├─────────────────────────────────────────────────────────┤
│  Same as LLM mode:                                      │
│  - Stage 1: Domain analysis                             │
│  - Stage 2: Schema extraction                           │
│  Output: Inferred entity/relationship types             │
└─────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│                    PHASE 2                               │
│              Ontology Validation                         │
├─────────────────────────────────────────────────────────┤
│  For each inferred type:                                │
│  - Fuzzy string match against ontology names            │
│  - If match found: Use ontology name                    │
│  - If no match: Keep inferred name                      │
└─────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│                    PHASE 3                               │
│              Property Enrichment                         │
├─────────────────────────────────────────────────────────┤
│  For aligned entities:                                  │
│  - Add properties from ontology                         │
│  - Merge with LLM-inferred properties                   │
└─────────────────────────────────────────────────────────┘
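
The source does not say how Phase 3 resolves overlaps between ontology properties and LLM-inferred ones. The sketch below assumes a simple union: inferred properties come first, and ontology properties are appended unless a name already exists (compared case-insensitively). The function name `enrich_properties` and the sample property names are hypothetical.

```python
def enrich_properties(inferred_props, ontology_props):
    """Merge ontology properties into LLM-inferred ones.

    Assumption: duplicates are detected by case-insensitive name,
    and the LLM-inferred spelling wins when both sides define one.
    """
    merged = list(inferred_props)
    seen = {prop.lower() for prop in merged}
    for prop in ontology_props:
        if prop.lower() not in seen:
            merged.append(prop)
            seen.add(prop.lower())
    return merged


# Hypothetical properties for an aligned "Person" type
inferred = ["name", "birthDate"]
ontology = ["name", "nationality", "birthDate", "alias"]

merged = enrich_properties(inferred, ontology)
print(merged)  # ['name', 'birthDate', 'nationality', 'alias']
```

Only aligned entities would go through this step; types kept under their inferred name have no ontology entry to draw properties from.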

Usage

aletheia build-knowledge-graph \
  --use-case my_case \
  --knowledge-graph my_graph \
  --schema-mode hybrid

Comparison with LLM Mode

Aspect                 LLM Mode  Hybrid Mode
---------------------  --------  ---------------------
LLM inference          Yes       Yes
Ontology validation    No        Yes (string matching)
Property enrichment    No        Yes
Handles unknown types  Yes       Yes (keeps as-is)

String Matching

Hybrid mode uses fuzzy string matching to align inferred type names with ontology type names:

from rapidfuzz import fuzz  # assumed library; thefuzz exposes the same fuzz.ratio call

# LLM infers: "TerroristOrganization"
# Ontology has: "Organization"

# String matching:
match_score = fuzz.ratio("TerroristOrganization", "Organization")
# Result: about 73 (below threshold)
# Outcome: Keep "TerroristOrganization"

# LLM infers: "Person"
# Ontology has: "Person"

# String matching:
match_score = fuzz.ratio("Person", "Person")
# Result: 100 (above threshold)
# Outcome: Use "Person"
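
The match-or-keep decision can be sketched as a small function. This is a minimal illustration, not the tool's implementation: it uses `difflib.SequenceMatcher` from the standard library as a stand-in for a fuzzy ratio, and the cutoff `THRESHOLD = 85` is a hypothetical value, since the actual threshold is not stated here. The names `similarity` and `align` are also illustrative.

```python
from difflib import SequenceMatcher

THRESHOLD = 85  # hypothetical cutoff; the real value is not documented here


def similarity(a, b):
    """Character-level similarity on a 0-100 scale (stdlib stand-in)."""
    return round(100 * SequenceMatcher(None, a, b).ratio())


def align(inferred, ontology_names, threshold=THRESHOLD):
    """Return the best-matching ontology name, or keep the inferred name."""
    best = max(ontology_names, key=lambda name: similarity(inferred, name))
    if similarity(inferred, best) >= threshold:
        return best
    return inferred


ontology_names = ["Person", "Organization", "Aerodrome"]
print(align("Person", ontology_names))                 # Person
print(align("TerroristOrganization", ontology_names))  # TerroristOrganization
```

Raising or lowering the threshold trades false alignments against missed ones, which is why near-identical names may or may not match.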

Matching Limitations

String matching compares characters, not meaning, so it misses synonyms and translations:

LLM Infers  Ontology Has  Match?  Reason
----------  ------------  ------  -------------------
Person      Person        Yes     Exact match
Persona     Person        Maybe   Language difference
Airport     Aerodrome     No      Synonym
Company     Corporation   No      Semantic equivalent

For better semantic matching, use Graph-Hybrid Mode.
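
To see why the synonym rows fail, the pairs from the table can be scored directly. The snippet below is a rough illustration using `difflib.SequenceMatcher` from the standard library rather than the matcher hybrid mode actually uses, so the exact numbers may differ from the tool's; the helper name `score` is hypothetical.

```python
from difflib import SequenceMatcher


def score(a, b):
    """0-100 character-level similarity; a stdlib stand-in for a fuzzy ratio."""
    return round(100 * SequenceMatcher(None, a, b).ratio())


pairs = [
    ("Person", "Person"),
    ("Persona", "Person"),
    ("Airport", "Aerodrome"),
    ("Company", "Corporation"),
]

scores = {pair: score(*pair) for pair in pairs}
for (inferred, ontology), value in scores.items():
    print(f"{inferred:<8} vs {ontology:<12} {value}")
```

"Persona" lands close to "Person" because the strings share most characters, so whether it aligns depends on the threshold. "Airport" and "Company" score far lower than their ontology counterparts deserve, because the shared meaning is not visible in the spelling.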

Console Output

🔀 Hybrid Mode: LLM-first + Ontology Validation

📊 Phase 1: LLM Schema Inference
   [Same output as LLM mode]
   ✓ Inferred 8 entity types, 5 relationship types

📋 Phase 2: Ontology Validation
   Validating against ontology: my_ontology.ttl
   ✓ Person → Person (exact match)
   ✓ Organization → Organization (exact match)
   ⚠️ TerroristGroup → (no match, kept as-is)
   ⚠️ Airport → (no match, kept as-is)
   ✓ Aligned 5/8 entities

📚 Phase 3: Property Enrichment
   ✓ Person: +12 properties from ontology
   ✓ Organization: +8 properties from ontology

Phase 4: Consolidation

After property enrichment, the schema passes through Phase 4 consolidation, the final step shared by all modes. An LLM reviews the schema for redundancies and normalizes naming. Ontology-validated types are protected from removal.
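
The protection rule can be pictured as a filter applied to whatever removals the reviewing LLM proposes. This is a sketch under assumptions: the function name `apply_consolidation`, its parameters, and the sample type names are hypothetical, and the real consolidation step may represent its decisions differently.

```python
def apply_consolidation(schema_types, proposed_removals, ontology_validated):
    """Drop types flagged as redundant, except ontology-validated ones."""
    removable = set(proposed_removals) - set(ontology_validated)
    return [t for t in schema_types if t not in removable]


types = ["Person", "Individual", "Organization", "TerroristGroup"]

# Suppose the reviewer flags both "Individual" and "Person" as redundant.
kept = apply_consolidation(
    types,
    proposed_removals=["Individual", "Person"],
    ontology_validated={"Person", "Organization"},
)
print(kept)  # ['Person', 'Organization', 'TerroristGroup']
```

"Person" survives because it was validated against the ontology in Phase 2, while the unvalidated duplicate "Individual" is removed.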

Pros and Cons

Advantages

  • Flexible: Handles types not in ontology
  • Some normalization: Exact/close matches are aligned
  • Property enrichment: Gets ontology properties for matched types
  • No graph required: Ontology file only, no graph loading

Disadvantages

  • String matching limits: Misses semantic equivalents
  • No embedding alignment: Can't match "Airport" to "Aerodrome"
  • Inconsistent results: Similar names may or may not match
  • Language barriers: Non-English terms unlikely to match

When to Use

Use hybrid mode when:

  1. Quick validation needed: Want some ontology alignment without full setup
  2. Names are similar: Your data uses names close to ontology
  3. Graph not available: Can't load ontology into a graph database
  4. Partial alignment OK: Some unaligned types are acceptable

When NOT to Use

Avoid hybrid mode when:

  1. Semantic alignment needed: Your terms don't match ontology strings
  2. FTM data: Use graph-hybrid for better alignment
  3. Full ontology compliance: Use ontology or ontology-first
  4. Multilingual data: String matching fails across languages

Migration Path

If hybrid mode isn't aligning well, consider:

  1. Graph-Hybrid: For semantic alignment via embeddings
  2. Ontology-First: For complete ontology coverage