Ontology Mode

Ontology mode extracts the schema directly from an ontology file (TTL/OWL) with no LLM involvement, so the resulting schema adheres strictly to the formal domain model.

Overview

Aspect                  Value
Ontology Required       Yes
LLM Calls for Schema    None
Type Consistency        Excellent
Setup Time              Medium
Best For                Formal domains with complete ontologies

How It Works

┌─────────────────────────────────────────────────────────┐
│                    INPUT                                 │
├─────────────────────────────────────────────────────────┤
│  Ontology File (TTL/OWL)                                │
│  use_cases/<name>/ontology/*.ttl                        │
└─────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│              RDF PARSING (rdflib)                        │
├─────────────────────────────────────────────────────────┤
│  Extract:                                                │
│  - owl:Class → Entity types                             │
│  - rdfs:subClassOf → Class hierarchy                    │
│  - owl:ObjectProperty → Relationship types              │
│  - owl:DatatypeProperty → Properties                    │
└─────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│              CLASS CLASSIFICATION                        │
├─────────────────────────────────────────────────────────┤
│  Classify each class:                                   │
│  - Class: Concrete entity (Person, Organization)        │
│  - AbstractClass: Non-instantiable (Thing, LegalEntity) │
│  - RelationshipClass: Interstitial (Ownership)          │
└─────────────────────────────────────────────────────────┘
┌─────────────────────────────────────────────────────────┐
│                    OUTPUT                                │
├─────────────────────────────────────────────────────────┤
│  SchemaDefinition with all ontology concepts            │
└─────────────────────────────────────────────────────────┘

Usage

aletheia build-knowledge-graph \
  --use-case my_case \
  --knowledge-graph my_graph \
  --schema-mode ontology

Ontology File Format

Place your ontology in the use case directory:

use_cases/
└── my_case/
    └── ontology/
        ├── my_ontology.ttl     # Main ontology
        └── extensions/          # Optional extensions
            └── custom.ttl

TTL Format Example

@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix : <http://example.org/ontology#> .

# Classes
:Person a owl:Class ;
    rdfs:label "Person" ;
    rdfs:comment "A natural person" .

:Organization a owl:Class ;
    rdfs:label "Organization" ;
    rdfs:comment "An organization or company" .

# Relationship Classes (interstitial entities)
:Ownership a owl:Class ;
    rdfs:subClassOf :Interval ;
    rdfs:label "Ownership" ;
    rdfs:comment "An ownership relationship between entities" .

# Object Properties (become relationships)
:owns a owl:ObjectProperty ;
    rdfs:domain :Person ;
    rdfs:range :Organization ;
    rdfs:label "owns" .

# Datatype Properties (become entity properties)
:birthDate a owl:DatatypeProperty ;
    rdfs:domain :Person ;
    rdfs:label "birthDate" .

:jurisdiction a owl:DatatypeProperty ;
    rdfs:domain :Organization ;
    rdfs:label "jurisdiction" .

Class Classification

The ontology loader classifies each class:

Classification      Criteria                        Examples
Class               Concrete, instantiable          Person, Organization, Aircraft
AbstractClass       Parent-only, not instantiated   Thing, LegalEntity, Value
RelationshipClass   Subclass of Interval            Ownership, Directorship, Membership

Classification Rules

  1. AbstractClass patterns:
       • Named Thing, LegalEntity, Interval, Value, Analyzable
       • Has subclasses but is never instantiated directly

  2. RelationshipClass:
       • Subclass of Interval (FollowTheMoney pattern)
       • Represents a relationship as an entity (reification)

  3. Class:
       • Everything else that's owl:Class
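
The rules above can be sketched as a small pure-Python function. This is an illustration under stated assumptions rather than the loader's real code: the names `ABSTRACT_NAMES`, `ancestors`, and `classify` are hypothetical, the named-pattern check is assumed to take precedence, and the Interval check is assumed to be transitive. The "never instantiated directly" criterion is left out because it cannot be decided from class names and hierarchy alone.

```python
# Names treated as abstract, taken from the rule list above.
ABSTRACT_NAMES = {"Thing", "LegalEntity", "Interval", "Value", "Analyzable"}


def ancestors(name: str, parents: dict[str, list[str]]) -> set[str]:
    """Return every transitive superclass of `name`.

    `parents` maps a class name to its direct superclasses.
    """
    seen: set[str] = set()
    stack = list(parents.get(name, []))
    while stack:
        current = stack.pop()
        if current in seen:
            continue  # guards against cycles in a malformed hierarchy
        seen.add(current)
        stack.extend(parents.get(current, []))
    return seen


def classify(name: str, parents: dict[str, list[str]]) -> str:
    """Classify one owl:Class as AbstractClass, RelationshipClass, or Class."""
    if name in ABSTRACT_NAMES:
        return "AbstractClass"
    if "Interval" in ancestors(name, parents):
        return "RelationshipClass"
    return "Class"


hierarchy = {
    "LegalEntity": ["Thing"],
    "Person": ["LegalEntity"],
    "Ownership": ["Interval"],
}
for cls in ("Person", "LegalEntity", "Interval", "Ownership"):
    print(cls, "->", classify(cls, hierarchy))
```

Checking the abstract names first matters here: Interval itself must come out as an AbstractClass, while only its descendants count as relationship classes.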

Property Extraction

Properties are extracted from the ontology and grouped by their domain class:

:birthDate a owl:DatatypeProperty ;
    rdfs:domain :Person ;
    rdfs:label "birthDate" ;
    rdfs:comment "Date of birth" .

Becomes:

from pydantic import BaseModel, Field

class Person(BaseModel):
    birth_date: str | None = Field(None, description="Date of birth")

Property Name Conversion

Ontology property names are converted to Python-safe names:

Ontology Name   Python Name
birthDate       birth_date
from            from_ (Python keyword)
class           class_ (Python keyword)
inn             inn
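
One way to reproduce the conversions in the table is sketched below, using only the standard library. The function name `to_python_name` is hypothetical, and the exact rules Aletheia applies (for example to acronyms or non-ASCII names) are not stated here, so treat this as an approximation that covers the four rows shown.

```python
import keyword
import re

# Matches the position between a lowercase letter or digit and an uppercase letter.
_CAMEL_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])")


def to_python_name(ontology_name: str) -> str:
    """Convert a camelCase ontology name to a Python-safe snake_case name."""
    snake = _CAMEL_BOUNDARY.sub("_", ontology_name).lower()
    # Reserved words get a trailing underscore so they are valid identifiers.
    if keyword.iskeyword(snake):
        snake += "_"
    return snake


for name in ("birthDate", "from", "class", "inn"):
    print(f"{name} -> {to_python_name(name)}")
```

Appending an underscore follows the common PEP 8 convention for names that collide with keywords.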

Console Output

📚 Ontology Mode: Extracting schema from ontology

Loading ontology: use_cases/my_case/ontology/my_ontology.ttl
   ✓ Parsed 45 classes
   ✓ Loaded 1 extension file(s)

Classifying concepts:
   ✓ Classes: 38
   ✓ Abstract classes: 5
   ✓ Relationship classes: 12

Extracting properties:
   ✓ Person: 12 properties
   ✓ Organization: 8 properties
   ✓ ...

Generating schema:
   ✓ Entity types: 38
   ✓ Relationship types: 15

Ontology Extensions

You can extend upstream ontologies without modifying them:

use_cases/my_case/ontology/
├── followthemoney.ttl     # Upstream (don't modify)
└── extensions/
    └── custom.ttl          # Your additions

Extension Example

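@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix ftm: <https://followthemoney.tech/ns#> .
@prefix : <http://example.org/custom#> .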

# Add new class
:CustomEntity a owl:Class ;
    rdfs:subClassOf ftm:LegalEntity ;
    rdfs:label "CustomEntity" .

# Add new relationship
:HAS_CUSTOM a owl:ObjectProperty ;
    rdfs:label "HAS_CUSTOM" .

Extensions are:

  • Loaded in alphabetical order
  • Merged into the same RDF graph
  • Able to reference classes from the main ontology
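
The load order described here can be sketched with pathlib. The function `ontology_load_order` is hypothetical, and the assumption that main ontology files are read before the extensions is ours, not something stated above. Each returned file would then be parsed into one shared graph, for example by calling rdflib's `Graph.parse` once per file.

```python
from pathlib import Path


def ontology_load_order(ontology_dir: Path) -> list[Path]:
    """Return the .ttl files in the order they would be loaded.

    Assumption: main ontology files come first, then extensions/,
    each group sorted alphabetically by file name.
    """
    main_files = sorted(ontology_dir.glob("*.ttl"), key=lambda p: p.name)

    extension_dir = ontology_dir / "extensions"
    extension_files = []
    if extension_dir.is_dir():  # the extensions directory is optional
        extension_files = sorted(extension_dir.glob("*.ttl"), key=lambda p: p.name)

    return main_files + extension_files


if __name__ == "__main__":
    for path in ontology_load_order(Path("use_cases/my_case/ontology")):
        print(path)
```

Because all files end up in the same graph, load order only matters if a later step depends on it. Sorting keeps the result reproducible across file systems.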

Pros and Cons

Advantages

  • Complete control: Schema exactly matches ontology
  • No LLM variability: Deterministic schema generation
  • Formal semantics: Based on well-defined domain model
  • Reproducible: Same ontology always produces same schema

Disadvantages

  • Requires ontology: Must have or create a formal ontology
  • No flexibility: Can't handle data outside ontology
  • Manual updates: Ontology must be updated for new concepts
  • May include unused types: All ontology classes are included

When to Use

Use ontology mode when:

  1. Complete ontology exists: You have a formal, comprehensive domain model
  2. Strict adherence required: Data must conform exactly to ontology
  3. No LLM needed: You don't want LLM involvement in schema
  4. Reproducibility matters: Schema must be deterministic

When NOT to Use

Avoid ontology mode when:

  1. Data varies from ontology: Real data has types not in ontology
  2. Ontology incomplete: Important concepts are missing
  3. Want LLM discovery: You want LLM to find patterns
  4. FTM with extensions: Use graph-hybrid for better flexibility

FollowTheMoney Ontology

Aletheia includes the FollowTheMoney (FTM) ontology used by OpenSanctions:

use_cases/anticorruption/ontology/followthemoney.ttl

FTM defines:

  • Entity types: Person, Company, Organization, Sanction, etc.
  • Relationship classes: Ownership, Directorship, Membership
  • Properties: Extensive property definitions

To use FTM:

# Copy to your use case
cp use_cases/anticorruption/ontology/followthemoney.ttl \
   use_cases/my_case/ontology/

# Build with ontology mode
aletheia build-knowledge-graph \
  --use-case my_case \
  --knowledge-graph my_graph \
  --schema-mode ontology

Comparison with Graph-Hybrid

Aspect                 Ontology Mode   Graph-Hybrid Mode
Ontology as primary    Yes             No (LLM primary)
LLM involvement        None            Yes (discovery)
Semantic alignment     No              Yes (embeddings)
Handles new types      No              Yes
Recommended for FTM    No              Yes

For FTM data, prefer graph-hybrid or ontology-first modes.