
Understanding Ontologies

An ontology is a formal representation of knowledge within a domain. It defines the types of entities that exist, their properties, and the relationships between them.

What is an Ontology?

In the context of knowledge graphs, an ontology serves as a schema or blueprint that describes:

  • Classes: The types of entities (e.g., Person, Organization, Aircraft)
  • Properties: Attributes of entities (e.g., name, date, location)
  • Relationships: How entities connect to each other (e.g., WORKS_FOR, LOCATED_IN)
  • Constraints: Rules about valid combinations (e.g., a Person can have only one birth date)

The ontology defines the types; the knowledge graph holds the instances that conform to them:

Ontology (Schema)              Knowledge Graph (Data)
─────────────────              ─────────────────────
Person                    →    John Smith
  - name: string               - name: "John Smith"
  - birthDate: date            - birthDate: 1985-03-15

Organization              →    Acme Corp
  - name: string               - name: "Acme Corp"
  - founded: date              - founded: 2010-01-01

WORKS_FOR                 →    John Smith ──WORKS_FOR──► Acme Corp
  - since: date                - since: 2020-06-01
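
The split between schema and data shown above can be sketched in Python. This is a minimal illustration rather than a prescribed format: the `ClassDef` type and the `validate` helper are hypothetical names introduced here, and the second instance carries a deliberately wrong value type so the check has something to report.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class ClassDef:
    """One ontology class: a name plus the expected type of each property."""
    name: str
    properties: dict


# Ontology (schema): what may exist.
PERSON = ClassDef("Person", {"name": str, "birthDate": date})
ORGANIZATION = ClassDef("Organization", {"name": str, "founded": date})


def validate(class_def, instance):
    """Return a list of problems found when checking an instance against its class."""
    problems = []
    for prop, value in instance.items():
        expected = class_def.properties.get(prop)
        if expected is None:
            problems.append(f"{class_def.name} has no property '{prop}'")
        elif not isinstance(value, expected):
            problems.append(f"{prop} should be {expected.__name__}")
    return problems


# Knowledge graph (data): what actually exists.
john = {"name": "John Smith", "birthDate": date(1985, 3, 15)}
acme = {"name": "Acme Corp", "founded": "2010-01-01"}  # string instead of date, on purpose

john_problems = validate(PERSON, john)        # no problems expected
acme_problems = validate(ORGANIZATION, acme)  # reports the mistyped 'founded'
```

The schema objects never hold instance values, which is what lets many graphs share one ontology.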

Why Ontologies Matter

1. Consistency

Without an ontology, the same concept might be extracted differently:

Without Ontology                          With Ontology
────────────────                          ─────────────
"company", "firm", "business", "corp"     Organization
"works at", "employed by", "staff of"     WORKS_FOR
"located in", "based in", "HQ in"         LOCATED_IN

An ontology ensures that semantically equivalent concepts map to the same canonical type.
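
One simple way to picture this mapping is a lookup from raw extracted labels to canonical types. The sketch below assumes hand-written synonym tables (`ENTITY_SYNONYMS`, `RELATION_SYNONYMS`) and a hypothetical `canonicalize` helper. A real pipeline would more likely derive such mappings from the ontology or leave the decision to the extraction model.

```python
# Hypothetical synonym tables, written by hand for illustration only.
ENTITY_SYNONYMS = {
    "company": "Organization",
    "firm": "Organization",
    "business": "Organization",
    "corp": "Organization",
}
RELATION_SYNONYMS = {
    "works at": "WORKS_FOR",
    "employed by": "WORKS_FOR",
    "staff of": "WORKS_FOR",
    "located in": "LOCATED_IN",
    "based in": "LOCATED_IN",
    "hq in": "LOCATED_IN",
}


def canonicalize(label, table):
    """Map a raw extracted label to its canonical type, or None if unknown."""
    return table.get(label.strip().lower())


raw_relations = ["Employed by", "works at", "HQ in", "reports to"]
canonical = [canonicalize(r, RELATION_SYNONYMS) for r in raw_relations]
# "reports to" is not covered by the tables, so it maps to None
```

Returning `None` for unknown labels keeps out-of-ontology relationships visible instead of silently forcing them into the nearest type.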

2. Interoperability

Ontologies enable data from different sources to be integrated:

Source A: "Boeing 737-800"     ─┐
Source B: "B738"               ─┼──► Aircraft (type: Boeing 737-800)
Source C: "737-800 aircraft"   ─┘

When multiple systems use the same ontology, their records share type and relationship names, so they can be merged without a custom mapping for each source.

3. Domain Expertise Capture

Ontologies encode expert knowledge about a domain:

  • Aviation: An "Occurrence" involves Aircraft, happens at an Airport, has a Flight Phase
  • Financial Crime: A "Sanction" targets an Entity, issued by an Authority, has a Program ID
  • Healthcare: A "Diagnosis" relates to a Patient, made by a Physician, has an ICD code

This expertise guides extraction and ensures domain-relevant relationships are captured.

4. Query Intelligence

With an ontology, systems can understand that:

  • A query for "airlines" should match entities of type Operator
  • A query for "plane crashes" refers to Occurrence entities with certain Events
  • A query for "sanctioned companies" refers to Organization entities that have a SANCTION relationship

Ontology vs Schema vs Taxonomy

Concept    Definition                             Example
───────    ──────────                             ───────
Taxonomy   Hierarchical classification            Aircraft → Commercial → Wide-body → Boeing 787
Schema     Data structure definition              {name: string, date: date}
Ontology   Formal knowledge model with reasoning  Aircraft subClassOf Vehicle; if X manufactured Y, then Y hasManufacturer X

An ontology is the most expressive of the three, supporting:

  • Inheritance: A CommercialAircraft inherits properties from Aircraft
  • Inference: If A is part of B, and B is located in C, then A is located in C
  • Constraints: An Occurrence must have exactly one primary_cause
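
Inheritance and inference from the list above can be sketched in a few lines of Python. The class hierarchy, the property names (`manufacturer`, `registration`, `seatCount`), and the airport facts are made up for illustration. The sketch also assumes each class has a single parent, the hierarchy has no cycles, and each thing has one location.

```python
# Hypothetical single-parent class hierarchy and per-class properties.
SUBCLASS_OF = {"CommercialAircraft": "Aircraft", "Aircraft": "Vehicle"}
PROPERTIES = {
    "Vehicle": {"manufacturer"},
    "Aircraft": {"registration"},
    "CommercialAircraft": {"seatCount"},
}


def inherited_properties(cls):
    """Collect properties declared on a class and on all of its ancestors."""
    props = set()
    while cls is not None:
        props |= PROPERTIES.get(cls, set())
        cls = SUBCLASS_OF.get(cls)  # None once the root is passed
    return props


def infer_locations(part_of, located_in):
    """Apply the rule: A partOf B and B locatedIn C implies A locatedIn C."""
    inferred = dict(located_in)
    changed = True
    while changed:  # repeat until no new fact is derived
        changed = False
        for part, whole in part_of.items():
            if part not in inferred and whole in inferred:
                inferred[part] = inferred[whole]
                changed = True
    return inferred


part_of = {"Gate 12": "Terminal 2", "Terminal 2": "Heathrow Airport"}
located_in = {"Heathrow Airport": "London"}

props = inherited_properties("CommercialAircraft")
locations = infer_locations(part_of, located_in)
```

The loop in `infer_locations` runs to a fixed point, which is why the location of "Gate 12" is derived even though it is two steps away from the only stated location fact.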

Standard Ontology Formats

OWL (Web Ontology Language)

The W3C standard for ontologies, typically serialized as Turtle (.ttl):

@prefix ex: <http://example.org/> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

ex:Aircraft a owl:Class ;
    rdfs:label "Aircraft" ;
    rdfs:comment "A vehicle capable of flight" .

ex:Operator a owl:Class ;
    rdfs:label "Operator" ;
    rdfs:comment "An organization that operates aircraft" .

ex:operatedBy a owl:ObjectProperty ;
    rdfs:domain ex:Aircraft ;
    rdfs:range ex:Operator .
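
In RDFS and OWL semantics, rdfs:domain and rdfs:range license inference rather than validation: a triple that uses ex:operatedBy lets a reasoner conclude that its subject is an ex:Aircraft and its object is an ex:Operator. The sketch below mimics that rule in plain Python over tuples, without an RDF library. The instance names `ex:N12345` and `ex:AcmeAir` and the `infer_types` helper are hypothetical.

```python
# Schema facts taken from the Turtle above, held as plain Python data.
DOMAIN = {"ex:operatedBy": "ex:Aircraft"}
RANGE = {"ex:operatedBy": "ex:Operator"}


def infer_types(triples):
    """Derive (resource, class) type facts from domain and range declarations."""
    types = set()
    for subject, predicate, obj in triples:
        if predicate in DOMAIN:
            types.add((subject, DOMAIN[predicate]))
        if predicate in RANGE:
            types.add((obj, RANGE[predicate]))
    return types


# Hypothetical instance data: neither resource carries an explicit type.
data = [("ex:N12345", "ex:operatedBy", "ex:AcmeAir")]
inferred = infer_types(data)
```

If the goal is to reject malformed data rather than infer types, a separate validation layer such as SHACL is the usual choice.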

FollowTheMoney (FTM)

A data model designed for investigative journalism and anti-corruption work, shown here in simplified form:

Person:
  properties:
    - name
    - birthDate
    - nationality

Organization:
  properties:
    - name
    - jurisdiction
    - registrationNumber

Ownership:
  properties:
    - owner: Person | Organization
    - asset: Organization
    - percentage

Domain-Specific Ontologies

Different domains have established ontologies:

Domain            Ontology         Description
──────            ────────         ───────────
Aviation Safety   ECCAIRS          EU standard for occurrence reporting
Financial Crime   FollowTheMoney   Investigative journalism standard
Biomedicine       SNOMED CT        Clinical terminology
Geography         GeoNames         Place names and relationships
General           Schema.org       Web-wide entity types

Using established ontologies provides:

  • Standardization: Industry-accepted terminology
  • Completeness: Years of domain expert refinement
  • Interoperability: Data exchange with other systems

Aletheia's Ontology Processing

Aletheia's GenericOntologyLoader processes ontologies with several capabilities beyond basic OWL parsing:

  • Transitive class classification: Determines whether each class is an entity type, relationship type, or abstract class by checking the full ancestry chain (not just direct parents).
  • Non-reified relationship extraction: Object properties between entity classes are extracted as relationship types alongside reified relationship classes.
  • ModelingProfile support: Optional explicit classification hints that override heuristic rules, useful when an ontology's structure doesn't fit the default patterns.
  • Enriched docstrings: Property names and descriptions from the ontology are appended to entity type docstrings, giving the LLM concrete extraction signal.
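
The idea behind transitive class classification can be illustrated with a short sketch. This is not Aletheia's implementation. The root class names `BaseEntity` and `BaseRelationship`, the sample hierarchy, and the rule that a class with no classified ancestor counts as abstract are all assumptions made for the example.

```python
# Hypothetical ontology: class name -> list of direct parents.
PARENTS = {
    "BaseEntity": [],
    "BaseRelationship": [],
    "Agent": ["BaseEntity"],
    "Operator": ["Agent"],            # an entity only via its grandparent
    "Ownership": ["BaseRelationship"],
    "Annotation": [],                 # no classified ancestor
}


def ancestors(cls, parents):
    """Return every ancestor of cls, following parent links transitively."""
    seen = set()
    stack = list(parents.get(cls, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def classify(cls, parents):
    """Label a class by looking at its whole ancestry, not only direct parents."""
    chain = ancestors(cls, parents)
    if "BaseRelationship" in chain:
        return "relationship"
    if "BaseEntity" in chain:
        return "entity"
    return "abstract"  # assumption: anything unclassified is treated as abstract


labels = {c: classify(c, PARENTS) for c in ("Operator", "Ownership", "Annotation")}
```

A check limited to direct parents would miss `Operator` here, since its only direct parent is `Agent`. Walking the full chain is what makes the classification transitive.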

See Integration for the full workflow.

Next Steps