Creating Use Cases
This guide walks through creating a new use case in Aletheia.
Overview
A use case is a self-contained data domain. To create one:
- Create directory structure
- Implement a parser
- Implement an episode builder
- Register components
- (Optional) Add an ontology
Directory Structure
use_cases/my_case/
├── __init__.py          # Registration
├── parser.py            # Data parser
├── episode_builder.py   # Markdown builder
├── ontology/            # (Optional) Ontology files
│   └── schema.ttl
└── data/                # Source data
    └── records.json
Step 1: Implement the Parser
The parser transforms source data into entities:
# use_cases/my_case/parser.py
from pathlib import Path
from dataclasses import dataclass
from typing import Iterator
import json


@dataclass
class MyEntity:
    """Entity from my data source."""

    id: str
    name: str
    type: str
    properties: dict


class MyParser:
    """Parser for my data format."""

    def __init__(self, data_dir: Path):
        self.data_dir = data_dir

    def parse(self) -> Iterator[MyEntity]:
        """Parse data files and yield entities."""
        data_file = self.data_dir / "records.json"
        with open(data_file) as f:
            records = json.load(f)

        for record in records:
            yield MyEntity(
                id=record["id"],
                name=record["name"],
                type=record["type"],
                properties=record.get("properties", {}),
            )
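To try the parser without touching real data, a short smoke test along these lines may help. It is only a sketch: the class definitions repeat the ones above so the snippet runs on its own, and the temporary directory and sample records are invented for illustration.

```python
import json
import tempfile
from dataclasses import dataclass
from pathlib import Path
from typing import Iterator


@dataclass
class MyEntity:
    id: str
    name: str
    type: str
    properties: dict


class MyParser:
    def __init__(self, data_dir: Path):
        self.data_dir = data_dir

    def parse(self) -> Iterator[MyEntity]:
        with open(self.data_dir / "records.json") as f:
            records = json.load(f)
        for record in records:
            yield MyEntity(
                id=record["id"],
                name=record["name"],
                type=record["type"],
                properties=record.get("properties", {}),
            )


# Hypothetical sample data, written to a throwaway directory.
sample = [
    {"id": "p1", "name": "Ada Example", "type": "Person", "properties": {"country": "UK"}},
    {"id": "o1", "name": "Example Ltd", "type": "Organization"},
]

with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    (data_dir / "records.json").write_text(json.dumps(sample))
    # Consume the generator while the temporary file still exists.
    entities = list(MyParser(data_dir).parse())

print([entity.name for entity in entities])
```

Note that the second record has no "properties" key, so the parser falls back to an empty dict for it.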
Step 2: Implement the Episode Builder
The episode builder converts entities to markdown:
# use_cases/my_case/episode_builder.py
from .parser import MyEntity


def build_episode(entity: MyEntity) -> str:
    """Convert entity to markdown episode."""
    # Build properties section
    props_lines = []
    for key, value in entity.properties.items():
        props_lines.append(f"- **{key.title()}**: {value}")
    props_section = "\n".join(props_lines) if props_lines else "- No properties"

    return f"""
# Entity: {entity.name}
## Metadata
- **ID**: {entity.id}
- **Type**: {entity.type}
## Properties
{props_section}
## Context
This is a {entity.type} entity named {entity.name}.
""".strip()
Step 3: Register Components
Register the parser and episode builder:
# use_cases/my_case/__init__.py
from aletheia.core.episodes import register_episode_builder
from aletheia.core.ontology import GenericOntologyLoader

from .episode_builder import build_episode
from .parser import MyParser, MyEntity

# Export parser class
Parser = MyParser

# Export ontology loader
Ontology = GenericOntologyLoader

# Register episode builder
register_episode_builder(
    "my_case",
    build_episode,
    source_description="My custom data source",
)
Step 4: (Optional) Add Ontology
For graph-hybrid mode, add an ontology:
# use_cases/my_case/ontology/schema.ttl
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix : <http://example.org/mycase#> .

# Classes
:Person a owl:Class ;
    rdfs:label "Person" .

:Organization a owl:Class ;
    rdfs:label "Organization" .

# Properties
:name a owl:DatatypeProperty ;
    rdfs:domain :Person ;
    rdfs:range xsd:string .

# Relationships
:worksFor a owl:ObjectProperty ;
    rdfs:domain :Person ;
    rdfs:range :Organization .
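A common mistake in hand-written Turtle is using a prefix that was never declared, which makes the file fail to parse. The following is a rough, regex-based sanity check, not a real Turtle parser; `undeclared_prefixes` is a hypothetical helper, and the sample text deliberately omits one declaration to show the result.

```python
import re


def undeclared_prefixes(turtle_text: str) -> set[str]:
    """Return prefixes used in prefixed names but never declared with @prefix."""
    declared = set(re.findall(r"@prefix\s+(\w*):", turtle_text))
    used: set[str] = set()
    for line in turtle_text.splitlines():
        line = re.sub(r"<[^>]*>", "", line)   # drop IRIs such as <http://...#>
        line = re.sub(r'"[^"]*"', "", line)   # drop string literals
        line = re.sub(r"#.*", "", line)       # drop comments
        used.update(re.findall(r"(\w*):[A-Za-z_]\w*", line))
    return used - declared


sample = """
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix : <http://example.org/mycase#> .

:name a owl:DatatypeProperty ;
    rdfs:range xsd:string .
"""

print(undeclared_prefixes(sample))  # {'xsd'}
```

For anything beyond a quick check, loading the file with a proper RDF library is the more reliable test.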
Step 5: Test the Use Case
List Use Cases
Build Graph
aletheia build-knowledge-graph \
    --use-case my_case \
    --knowledge-graph my_case_graph \
    --schema-mode none
Verify
Reusing the FTM Parser
For FTM data, reuse the existing parser:
# use_cases/my_ftm_case/__init__.py
from aletheia.core.episodes import register_episode_builder
from aletheia.core.ontology import GenericOntologyLoader
from use_cases.anticorruption.episode_builder import build_ftm_episode_content
from use_cases.anticorruption.parser import FTMParser, FTMEntity

# Reuse FTM parser
Parser = FTMParser
Ontology = GenericOntologyLoader

# Reuse FTM episode builder
register_episode_builder(
    "my_ftm_case",
    build_ftm_episode_content,
    source_description="OpenSanctions FTM data",
)
Advanced: Custom Entity Resolution
Override entity resolution behavior:
def build_episode(entity: MyEntity) -> str:
    """Episode with explicit entity markers for resolution."""
    # Add explicit entity markers
    entities = [f"[[{entity.name}]]"]
    for alias in entity.properties.get("aliases", []):
        entities.append(f"[[{alias}]]")

    return f"""
# Entity: {entity.name}
## Known As
{', '.join(entities)}
## Properties
...
"""
Advanced: Custom Relationship Extraction
Add relationship hints for edge extraction:
def build_episode(entity: MyEntity) -> str:
    """Episode with relationship hints."""
    relationships = []
    for rel in entity.properties.get("relationships", []):
        relationships.append(
            f"- {entity.name} {rel['type']} {rel['target']}"
        )

    return f"""
# Entity: {entity.name}
## Relationships
{chr(10).join(relationships) if relationships else 'No relationships'}
"""
Step 6: (Optional) Add MCP Config
For MCP server integration, add a config file referencing the shared base:
# use_cases/my_case/mcp_config.yaml
base: ../../mcp-base-config.yaml
graphiti:
  group_id: my_case
  ontology_graph: my_case_ontology
The base: path is resolved relative to the overlay file's directory. See MCP Connectors for details.
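To make the resolution rule concrete, the sketch below joins the `base:` value onto the directory that contains the overlay file and normalizes the result. It illustrates the rule only and is not Aletheia's loader; `resolve_base` is a hypothetical helper and `/repo` is an assumed checkout location.

```python
import posixpath


def resolve_base(overlay_path: str, base: str) -> str:
    """Resolve `base` relative to the directory containing the overlay file."""
    overlay_dir = posixpath.dirname(overlay_path)
    return posixpath.normpath(posixpath.join(overlay_dir, base))


print(resolve_base("/repo/use_cases/my_case/mcp_config.yaml", "../../mcp-base-config.yaml"))
# /repo/mcp-base-config.yaml
```

The practical consequence is that the overlay keeps working regardless of the directory the server is started from, as long as the relative layout of the two files is unchanged.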
Step 7: (Optional) Implement schema_distribution
If using ontology-first or graph-hybrid modes with data-driven pruning, implement schema_distribution on your parser:
@property
def schema_distribution(self) -> dict[str, int]:
    """Return entity type counts from the data."""
    return {"Person": 42, "Organization": 15, "Sanction": 30}
This drives data-driven pruning: entity types that do not appear in the distribution are removed from the schema.
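In practice the counts would normally be computed from the data instead of hard-coded. The sketch below shows one way to do that with `collections.Counter`, followed by a toy version of the pruning idea. Both functions are hypothetical helpers written for this example; the actual pruning logic in Aletheia is not shown in this guide and may differ.

```python
from collections import Counter


def count_entity_types(records: list[dict]) -> dict[str, int]:
    """Count how many records carry each entity type."""
    return dict(Counter(record["type"] for record in records))


def prune_schema(schema_types: list[str], distribution: dict[str, int]) -> list[str]:
    """Keep only the schema types that occur in the data (illustrative)."""
    return [t for t in schema_types if distribution.get(t, 0) > 0]


# Invented sample records and schema types.
records = [{"type": "Person"}, {"type": "Person"}, {"type": "Organization"}]
distribution = count_entity_types(records)

print(distribution)  # {'Person': 2, 'Organization': 1}
print(prune_schema(["Person", "Organization", "Vessel"], distribution))
# ['Person', 'Organization']
```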
Step 8: (Optional) Add Evaluation Data
For counterfactual testing (parametric knowledge detection), add an evaluation/ directory:
use_cases/my_case/evaluation/
├── substitutions.json # Entity substitution maps
├── counterfactual_testset.json # Generated test set
└── generate_counterfactual_testset.py # Generator script
The substitutions.json file defines domain-specific entity swaps used by the counterfactual mutation framework in aletheia.core.evaluation.counterfactual. See use_cases/terrorist_orgs/evaluation/ for a working example.
Checklist
- [ ] Directory structure created
- [ ] Parser implemented and returns entities
- [ ] Episode builder returns valid markdown
- [ ] Components registered in __init__.py
- [ ] (Optional) Ontology TTL files added
- [ ] (Optional) MCP config with base reference
- [ ] (Optional) schema_distribution property on parser
- [ ] (Optional) Counterfactual substitution data in evaluation/
- [ ] Use case appears in list-use-cases
- [ ] Graph builds successfully
- [ ] Entities extracted correctly
Learn More
- Architecture - System overview
- Use Cases Concept - Use case design
- FTM Data - FTM format details