Use Cases
A use case is a self-contained data domain in Aletheia. Each use case defines how to parse, transform, and structure data for a specific domain.
Structure
Each use case lives in use_cases/<name>/ and provides the following components. All are required except Evaluation:
| Component | File | Purpose |
|---|---|---|
| Parser | parser.py | Transform source format to entities |
| Episode Builder | episode_builder.py | Convert entities to rich text |
| Ontology | ontology/*.ttl | Schema for graph-hybrid mode |
| Init | __init__.py | Register components |
| Evaluation | evaluation/ | (Optional) Counterfactual data, substitution maps |
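The layout in the table can be checked mechanically before a new use case is wired in. The following is a minimal sketch, not part of Aletheia: the helper `missing_components` and the constant `REQUIRED_FILES` are hypothetical names, and the file names are taken from the table above.

```python
from pathlib import Path

# Required entries, taken from the table above.
# evaluation/ is optional, so it is not checked.
REQUIRED_FILES = ("__init__.py", "parser.py", "episode_builder.py")


def missing_components(use_case_dir: Path) -> list[str]:
    """Return the required components that are absent from use_case_dir.

    Hypothetical helper for illustration; not part of Aletheia.
    """
    missing = [
        name for name in REQUIRED_FILES
        if not (use_case_dir / name).is_file()
    ]
    # The ontology requirement is met by any Turtle file under ontology/.
    ontology_dir = use_case_dir / "ontology"
    if not (ontology_dir.is_dir() and any(ontology_dir.glob("*.ttl"))):
        missing.append("ontology/*.ttl")
    return missing
```

Calling `missing_components(Path("use_cases/my_case"))` returns an empty list when every required component is present.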
Creating a Use Case

    # use_cases/my_case/__init__.py
    from .parser import MyParser
    from aletheia.core.ontology import GenericOntologyLoader
    from aletheia.core.episodes import register_episode_builder
    from .episode_builder import build_episode

    # Module-level names for this use case's parser and ontology loader.
    Parser = MyParser
    Ontology = GenericOntologyLoader

    # Register the episode builder under the use case name.
    register_episode_builder(
        "my_case",
        build_episode,
        source_description="My data source",
    )
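The registration call suggests a registry that maps a use case name to its builder. The following is a minimal sketch of that general pattern, not Aletheia's actual implementation: `BuilderEntry`, `_BUILDERS`, `register`, and `get_builder` are hypothetical names, and the real `register_episode_builder` may behave differently.

```python
from typing import Callable, Dict, NamedTuple


class BuilderEntry(NamedTuple):
    """A registered builder plus its human-readable source description."""
    build: Callable[[object], str]
    source_description: str


# Hypothetical module-level registry keyed by use case name.
_BUILDERS: Dict[str, BuilderEntry] = {}


def register(name: str, build: Callable[[object], str],
             source_description: str = "") -> None:
    """Store a builder under a use case name (illustrative only)."""
    if name in _BUILDERS:
        raise ValueError(f"episode builder already registered: {name}")
    _BUILDERS[name] = BuilderEntry(build, source_description)


def get_builder(name: str) -> BuilderEntry:
    """Look up a previously registered builder."""
    return _BUILDERS[name]


register("my_case", lambda entity: f"# Entity: {entity}",
         source_description="My data source")
print(get_builder("my_case").build("demo"))
```

Rejecting duplicate names is a design choice in this sketch. It makes an accidental double registration fail early instead of silently replacing a builder.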
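Parser Interface
A parser must implement a constructor that takes the data directory and a parse() method that yields entities:

    from pathlib import Path
    from typing import Iterator

    # Entity is the project's entity type; its import path is not shown here.

    class MyParser:
        def __init__(self, data_dir: Path):
            # Directory that holds the raw source files for this use case.
            self.data_dir = data_dir

        def parse(self) -> Iterator[Entity]:
            """Yield entities from the data source."""
            ...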
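For illustration, here is a minimal sketch of a parser that satisfies this interface. It assumes a hypothetical source layout (one JSON object per line in `*.jsonl` files) and uses a stand-in `Entity` dataclass with `name`, `type`, and `description` fields. Aletheia's real `Entity` type and source formats may differ.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Iterator


@dataclass
class Entity:
    """Stand-in for the project's Entity type (fields are assumed)."""
    name: str
    type: str
    description: str


class JsonLinesParser:
    """Hypothetical parser: reads every *.jsonl file under data_dir."""

    def __init__(self, data_dir: Path):
        self.data_dir = data_dir

    def parse(self) -> Iterator[Entity]:
        """Yield one Entity per non-empty line, in sorted file order."""
        for path in sorted(self.data_dir.glob("*.jsonl")):
            with path.open(encoding="utf-8") as handle:
                for line in handle:
                    if not line.strip():
                        continue  # skip blank lines
                    record = json.loads(line)
                    yield Entity(
                        name=record["name"],
                        type=record["type"],
                        description=record.get("description", ""),
                    )
```

Because `parse()` is a generator, entities are produced one at a time, so a large source file does not need to fit in memory.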
Episode Builder Interface
Episode builders convert entities to markdown:

    def build_episode(entity: Entity) -> str:
        """Convert entity to markdown episode."""
        return f"""
    # Entity: {entity.name}
    ## Properties
    - **Type**: {entity.type}
    - **Description**: {entity.description}
    """
Available Use Cases
| Use Case | Description | Data Source | Schema Types |
|---|---|---|---|
| anticorruption | EU financial sanctions | OpenSanctions FTM | 6 FTM schemas, 2 relationship types |
| terrorist_orgs | Multi-authority FTO designations (US, UK, Australia) | OpenSanctions FTM | 2 FTM schemas, HAS_ALIAS extension |
| aviation_safety | European aviation incidents | Synthetic (10 incidents) | 7 entity types, 6 relationship types |
| safety_recommendations | EASA safety recommendations | EASA Annual Review | EU Reg 996/2010 ontology |
| airworthiness_directives | EASA airworthiness directives | EASA ADs/EADs/PADs | Part-21 ontology |
| operation_tango | Multi-dataset investigation | OpenSanctions FTM | 18 FTM schemas across 4 files |
| evaluation | MuSiQue evaluation benchmark | MuSiQue | N/A |
Learn More
- Creating Use Cases - Detailed guide
- FTM Data - FollowTheMoney format
- Episodes - Episode builder details