
Use Cases

A use case is a self-contained data domain in Aletheia. Each use case defines how to parse, transform, and structure data for a specific domain.

Structure

Each use case lives in use_cases/<name>/ and provides the following components (all required except Evaluation):

| Component | File | Purpose |
| --- | --- | --- |
| Parser | parser.py | Transform source format to entities |
| Episode Builder | episode_builder.py | Convert entities to rich text |
| Ontology | ontology/*.ttl | Schema for graph-hybrid mode |
| Init | __init__.py | Register components |
| Evaluation | evaluation/ | (Optional) Counterfactual data, substitution maps |
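
As a quick sanity check on the layout above, a small script can report which required components are missing from a use case directory. This is a minimal sketch and not part of Aletheia: the helper name missing_components and the REQUIRED_FILES constant are hypothetical, and the check only looks at file presence, not content.

```python
from pathlib import Path

# Required entries taken from the component table; evaluation/ is optional.
REQUIRED_FILES = ("parser.py", "episode_builder.py", "__init__.py")


def missing_components(use_case_dir: Path) -> list[str]:
    """Return the required components that are absent from use_case_dir."""
    missing = [
        name for name in REQUIRED_FILES if not (use_case_dir / name).is_file()
    ]
    # The ontology is one or more Turtle files under ontology/.
    ontology_dir = use_case_dir / "ontology"
    if not (ontology_dir.is_dir() and any(ontology_dir.glob("*.ttl"))):
        missing.append("ontology/*.ttl")
    return missing
```

Calling missing_components(Path("use_cases/my_case")) returns an empty list when every required component is in place.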

Creating a Use Case

# use_cases/my_case/__init__.py
from aletheia.core.episodes import register_episode_builder
from aletheia.core.ontology import GenericOntologyLoader

from .episode_builder import build_episode
from .parser import MyParser

Parser = MyParser
Ontology = GenericOntologyLoader

register_episode_builder(
    "my_case",
    build_episode,
    source_description="My data source",
)
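
The register_episode_builder call suggests a registry keyed by use case name. Aletheia's actual implementation is not shown here, so the following stand-alone sketch only illustrates the general pattern. The names register_builder, get_builder, BuilderEntry, and _BUILDERS are hypothetical, and rejecting duplicate names is an assumption rather than documented behavior.

```python
from typing import Callable, Dict, NamedTuple


class BuilderEntry(NamedTuple):
    """A registered builder plus its human-readable source description."""
    build: Callable[[object], str]
    source_description: str


# Hypothetical module-level registry, keyed by use case name.
_BUILDERS: Dict[str, BuilderEntry] = {}


def register_builder(
    name: str,
    build: Callable[[object], str],
    source_description: str = "",
) -> None:
    """Record a builder under a use case name, rejecting duplicates."""
    if name in _BUILDERS:
        raise ValueError(f"episode builder already registered for {name!r}")
    _BUILDERS[name] = BuilderEntry(build, source_description)


def get_builder(name: str) -> BuilderEntry:
    """Look up a previously registered builder by use case name."""
    return _BUILDERS[name]
```

Registering at import time, as the __init__.py above does, means a use case becomes available as soon as its package is imported.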

Parser Interface

A parser takes the data directory in its constructor and exposes a parse() method that yields entities:

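from pathlib import Path
from typing import Iterator

# Entity is Aletheia's entity type; import it from the module that defines it.

class MyParser:
    def __init__(self, data_dir: Path):
        # Directory containing the raw source files for this use case.
        self.data_dir = data_dir

    def parse(self) -> Iterator[Entity]:
        """Yield entities from the data source."""
        ...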
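
To make the interface concrete, here is a minimal sketch of a parser that reads JSON Lines files. Everything specific in it is an assumption for illustration: the Entity dataclass is a stand-in for Aletheia's entity type (its name, type, and description fields mirror those used in the episode builder example), and the JsonLinesParser class and the *.jsonl input format are hypothetical.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Iterator


@dataclass
class Entity:
    """Stand-in for Aletheia's entity type (assumed fields)."""
    name: str
    type: str
    description: str = ""


class JsonLinesParser:
    """Hypothetical parser: one JSON object per line in each *.jsonl file."""

    def __init__(self, data_dir: Path):
        self.data_dir = data_dir

    def parse(self) -> Iterator[Entity]:
        """Yield entities from the data source."""
        # Sort the files so the output order is stable across runs.
        for path in sorted(self.data_dir.glob("*.jsonl")):
            with path.open(encoding="utf-8") as handle:
                for line in handle:
                    if not line.strip():
                        continue  # skip blank lines
                    record = json.loads(line)
                    yield Entity(
                        name=record["name"],
                        type=record["type"],
                        description=record.get("description", ""),
                    )
```

Because parse() is a generator, entities are produced one at a time and large sources do not have to fit in memory.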

Episode Builder Interface

An episode builder is a function that takes an entity and returns its episode as a markdown string:

def build_episode(entity: Entity) -> str:
    """Convert entity to markdown episode."""
    return f"""
# Entity: {entity.name}

## Properties
- **Type**: {entity.type}
- **Description**: {entity.description}
"""
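
The triple-quoted template above starts with a blank line and always prints the description, even when it is empty. A variant that assembles the episode line by line avoids both. This is a sketch under assumptions: the Entity dataclass is a stand-in for Aletheia's entity type, and dropping the description bullet when the field is empty is a design choice made here, not documented behavior.

```python
from dataclasses import dataclass


@dataclass
class Entity:
    """Stand-in for Aletheia's entity type (assumed fields)."""
    name: str
    type: str
    description: str = ""


def build_episode(entity: Entity) -> str:
    """Convert entity to a markdown episode without stray blank lines."""
    lines = [
        f"# Entity: {entity.name}",
        "",
        "## Properties",
        f"- **Type**: {entity.type}",
    ]
    # Omit the description bullet when the field is empty.
    if entity.description:
        lines.append(f"- **Description**: {entity.description}")
    return "\n".join(lines) + "\n"
```

For example, build_episode(Entity("Acme", "Company")) produces a heading, a Properties section, and a single Type bullet.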

Available Use Cases

| Use Case | Description | Data Source | Schema Types |
| --- | --- | --- | --- |
| anticorruption | EU financial sanctions | OpenSanctions FTM | 6 FTM schemas, 2 relationship types |
| terrorist_orgs | Multi-authority FTO designations (US, UK, Australia) | OpenSanctions FTM | 2 FTM schemas, HAS_ALIAS extension |
| aviation_safety | European aviation incidents | Synthetic (10 incidents) | 7 entity types, 6 relationship types |
| safety_recommendations | EASA safety recommendations | EASA Annual Review | EU Reg 996/2010 ontology |
| airworthiness_directives | EASA airworthiness directives | EASA ADs/EADs/PADs | Part-21 ontology |
| operation_tango | Multi-dataset investigation | OpenSanctions FTM | 18 FTM schemas across 4 files |
| evaluation | MuSiQue evaluation benchmark | MuSiQue | N/A |

Learn More