Aviation Safety - Technical Reference¶
This page documents the technical implementation of the aviation safety use case.
Architecture¶
use_cases/aviation_safety/
├── __init__.py # Use case registration
├── parser.py # Markdown incident parser
├── episode_builder.py # Episode text builder
├── data/ # Markdown incident reports
│ ├── incident_2024_0157.md
│ ├── incident_2024_0289.md
│ └── ...
├── ontology/
│ ├── aviation_safety.ttl # Basic ontology
│ └── eccairs_aviation.ttl # ECCAIRS-derived (optional)
├── eccairs_taxonomy/ # ECCAIRS XML parser
│ └── eccairs_parser.py
├── evaluation_questions.json # Full evaluation set (50q)
└── evaluation_questions_curated.json # Curated set (35q)
Components¶
Parser¶
The AviationSafetyParser parses structured markdown incident reports:
# use_cases/aviation_safety/parser.py
class AviationSafetyParser(BaseParser):
    """Parser for aviation safety markdown incident reports."""

    def parse_all(self) -> Iterator[IncidentRecord]:
        """Parse all markdown files in data directory."""
        for file in self.data_dir.glob("*.md"):
            yield self.parse_file(file)

    def parse_file(self, path: Path) -> IncidentRecord:
        """Parse a single incident report."""
        content = path.read_text()
        sections = self._split_sections(content)
        return IncidentRecord(
            id=self._extract_id(sections),
            date=self._extract_field(sections, "Metadata", "Date"),
            location=self._extract_field(sections, "Metadata", "Location"),
            aircraft=self._parse_aircraft(sections),
            description=sections.get("Incident Description", ""),
            findings=self._parse_findings(sections),
            # ...
        )
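The helper _split_sections is called above but not shown. A minimal sketch of how such a splitter could work, assuming every section starts with a level-2 "## " heading as listed under Markdown Format. The function name split_sections and its return shape are illustrative assumptions, not the project's actual code:

```python
import re

# Assumption: sections are introduced by level-2 headings ("## Name").
SECTION_HEADING = re.compile(r"^##[ \t]+(.+?)[ \t]*$")


def split_sections(content: str) -> dict[str, str]:
    """Map each '## Heading' to the text beneath it (hypothetical helper)."""
    sections: dict[str, str] = {}
    current = None
    buffer: list[str] = []
    for line in content.splitlines():
        match = SECTION_HEADING.match(line)
        if match:
            # A new section starts: store the previous one first.
            if current is not None:
                sections[current] = "\n".join(buffer).strip()
            current = match.group(1)
            buffer = []
        elif current is not None:
            # "###" subsections do not match and stay in the parent body.
            buffer.append(line)
    if current is not None:
        sections[current] = "\n".join(buffer).strip()
    return sections
```

Text before the first "## " heading, such as the report title, is ignored by this sketch, so the ID would have to be read from the title line separately.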
Data Models¶
@dataclass
class IncidentRecord:
    """Aviation safety incident record."""
    id: str
    date: str = ""
    time: str = ""
    location: str = ""
    country: str = ""
    flight_phase: str = ""
    aircraft: AircraftInfo = field(default_factory=AircraftInfo)
    description: str = ""
    outcome: Outcome = field(default_factory=Outcome)
    weather: WeatherConditions = field(default_factory=WeatherConditions)
    findings: Findings = field(default_factory=Findings)
    safety_recommendations: list[SafetyRecommendation] = field(default_factory=list)

@dataclass
class AircraftInfo:
    """Aircraft information from incident report."""
    aircraft_type: str = ""
    registration: str = ""
    operator: str = ""

@dataclass
class Findings:
    """Incident findings and analysis."""
    primary_cause: str = ""
    contributing_factors: list[str] = field(default_factory=list)
    human_factors: str = ""
    wildlife: dict[str, str] = field(default_factory=dict)
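The models rely on field(default_factory=...) so that every record gets its own nested objects and lists. A trimmed, self-contained sketch of that behaviour follows; it keeps only a few of the fields above and is not the full model. In a single module the nested dataclasses have to be defined before IncidentRecord, because default_factory=AircraftInfo is evaluated when the class body runs:

```python
from dataclasses import dataclass, field


@dataclass
class AircraftInfo:
    aircraft_type: str = ""
    registration: str = ""
    operator: str = ""


@dataclass
class Findings:
    primary_cause: str = ""
    contributing_factors: list[str] = field(default_factory=list)


@dataclass
class IncidentRecord:
    id: str
    aircraft: AircraftInfo = field(default_factory=AircraftInfo)
    findings: Findings = field(default_factory=Findings)


# Only the ID is required; everything else falls back to empty defaults.
first = IncidentRecord(id="2024-0412-EU")
second = IncidentRecord(id="2024-0157-EU")

# Each record owns its own list, so mutating one does not leak into the other.
first.findings.contributing_factors.append("example factor")
```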
Episode Builder¶
The episode builder converts parsed records to markdown for Graphiti:
# use_cases/aviation_safety/episode_builder.py
def build_episode_content(record: IncidentRecord) -> str:
    """Build markdown episode from incident record."""
    lines = [
        f"# Aviation Safety Occurrence: {record.id}",
        "",
        "## Occurrence Metadata",
        f"- **Occurrence ID**: {record.id}",
        f"- **Date**: {record.date}",
        f"- **Location**: {record.location}",
        f"- **Country**: {record.country}",
        f"- **Flight Phase**: {record.flight_phase}",
    ]
    # The aircraft section is emitted only when a type was parsed.
    if record.aircraft.aircraft_type:
        lines.extend([
            "",
            "## Aircraft",
            f"- **Type**: {record.aircraft.aircraft_type}",
            f"- **Registration**: {record.aircraft.registration}",
            f"- **Operator**: {record.aircraft.operator}",
        ])
    lines.extend([
        "",
        "## Occurrence Description",
        record.description,
    ])
    # The findings section is emitted only when a primary cause is known.
    if record.findings.primary_cause:
        lines.extend([
            "",
            "## Findings",
            "### Primary Cause",
            record.findings.primary_cause,
        ])
    return "\n".join(lines)
Episode Output Example¶
# Aviation Safety Occurrence: 2024-0412-EU
## Occurrence Metadata
- **Occurrence ID**: 2024-0412-EU
- **Date**: 2024-03-22
- **Location**: En route, 45 nm west of Barcelona (LEBL)
- **Country**: Spain
- **Flight Phase**: Cruise
## Aircraft
- **Type**: Embraer ERJ-195
- **Registration**: CS-TTW
- **Operator**: TAP Air Portugal
## Occurrence Description
The aircraft encountered severe clear air turbulence at FL380 without
prior warning. Two cabin crew members sustained minor injuries...
## Findings
### Primary Cause
Unpredicted clear air turbulence associated with jetstream boundary
Registration¶
# use_cases/aviation_safety/__init__.py
from .parser import AviationSafetyParser, IncidentRecord
from aletheia.core.ontology import GenericOntologyLoader
from aletheia.core.episodes import register_episode_builder
from .episode_builder import build_episode_content
Parser = AviationSafetyParser
Ontology = GenericOntologyLoader
DATA_DIR = "use_cases/aviation_safety/data"
ONTOLOGY_DIR = "use_cases/aviation_safety/ontology"
register_episode_builder(
    "aviation_safety",
    build_episode_content,
    source_description="Aviation safety data",
)
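The implementation of register_episode_builder is not shown on this page. To illustrate what name-based registration typically involves, here is a minimal sketch of a registry. The names register_builder, lookup_builder and BuilderEntry are hypothetical stand-ins and should not be read as aletheia's actual API:

```python
from typing import Callable, NamedTuple


class BuilderEntry(NamedTuple):
    """A registered builder plus its metadata (hypothetical shape)."""
    build: Callable[[object], str]
    source_description: str


# Module-level registry keyed by use case name.
_BUILDERS: dict[str, BuilderEntry] = {}


def register_builder(
    name: str,
    build: Callable[[object], str],
    source_description: str = "",
) -> None:
    """Store the builder under its use case name, replacing any earlier entry."""
    _BUILDERS[name] = BuilderEntry(build, source_description)


def lookup_builder(name: str) -> BuilderEntry:
    """Return the registered builder or raise a descriptive KeyError."""
    try:
        return _BUILDERS[name]
    except KeyError:
        raise KeyError(f"no episode builder registered for {name!r}") from None
```

With a registry of this kind, the ingestion code only needs the use case name to find the right builder, which is why registration happens at import time in __init__.py.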
Data Pipeline¶
Markdown Incident Reports
│
▼ [AviationSafetyParser]
IncidentRecord objects
│
▼ [Episode Builder]
Markdown episodes
│
▼ [Graphiti]
Knowledge Graph
│
├──► Nodes: Occurrence, Aircraft, Airport, Operator, Manufacturer, Country
└──► Edges: HAS_AIRCRAFT, HAS_OPERATOR, HAS_AIRPORT, LOCATED_IN, MANUFACTURED_BY
Graph Schema¶
Node Types¶
| Type | Count | Description |
|---|---|---|
| Occurrence | 10 | Aviation incidents |
| Aircraft | 11 | Aircraft involved |
| Airport | 10 | Locations |
| Operator | 9 | Airlines |
| Country | 5 | Countries |
| Manufacturer | 5 | Aircraft makers |
| Episodic | 10 | Source episodes |
Edge Types¶
| Type | Count | Description |
|---|---|---|
| MENTIONS | 60 | Episode → Entity |
| RELATES_TO | 15 | General relationships |
| HAS_OPERATOR | 10 | Occurrence → Operator |
| LOCATED_IN | 10 | Airport → Country |
| HAS_AIRPORT | 10 | Occurrence → Airport |
| HAS_AIRCRAFT | 10 | Occurrence → Aircraft |
| MANUFACTURED_BY | 10 | Aircraft → Manufacturer |
ECCAIRS Taxonomy¶
Converting ECCAIRS XML to OWL¶
python use_cases/aviation_safety/eccairs_taxonomy/eccairs_parser.py \
"use_cases/aviation_safety/eccairs_taxonomy/Eccairs Aviation 7.0.0.1.xml" \
-o use_cases/aviation_safety/ontology/eccairs_aviation.ttl \
-v
Parser Features¶
The ECCAIRSTaxonomyParser converts ECCAIRS XML (UTF-16) to OWL:
class ECCAIRSTaxonomyParser:
    """Convert ECCAIRS XML taxonomy to OWL ontology."""

    def parse(self) -> Iterator[ECCAIRSEntity]:
        """Parse ECCAIRS XML file, yielding one entity at a time."""
        tree = ET.parse(self.xml_path)
        root = tree.getroot()
        for entity in root.findall(".//ENTITY"):
            yield ECCAIRSEntity(
                id=entity.get("Id"),
                name=entity.get("Description"),
                attributes=self._parse_attributes(entity),
            )

    def to_ttl(self, output_path: Path) -> None:
        """Generate OWL ontology."""
        # Creates owl:Class for each ENTITY
        # Creates owl:DatatypeProperty for attributes
        # Creates owl:ObjectProperty for references
        ...
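The body of to_ttl is only outlined above. As an illustration of the first step (one owl:Class per ENTITY), here is a minimal sketch using only the standard library. The sample XML, its TAXONOMY root element and the Id values are invented for the example; only the ENTITY element with its Id and Description attributes and the eccairs prefix come from this page:

```python
import re
import xml.etree.ElementTree as ET

# Invented sample input; the real taxonomy file is far larger.
SAMPLE_XML = """<TAXONOMY>
  <ENTITY Id="1" Description="Occurrence"/>
  <ENTITY Id="2" Description="Aircraft"/>
</TAXONOMY>"""

PREFIXES = [
    "@prefix eccairs: <http://eccairs.jrc.ec.europa.eu/ontology#> .",
    "@prefix owl: <http://www.w3.org/2002/07/owl#> .",
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .",
]


def entities_to_turtle(xml_text: str) -> str:
    """Emit one owl:Class per ENTITY element (classes only, no properties)."""
    root = ET.fromstring(xml_text)
    blocks = ["\n".join(PREFIXES)]
    for entity in root.findall(".//ENTITY"):
        description = entity.get("Description", "")
        # Turn the description into a safe local name, e.g. "Air Space" -> "Air_Space".
        name = re.sub(r"\W+", "_", description).strip("_")
        if not name:
            continue
        # Escape backslashes and quotes for the Turtle string literal.
        label = description.replace("\\", "\\\\").replace('"', '\\"')
        blocks.append(f'eccairs:{name} a owl:Class ;\n    rdfs:label "{label}" .')
    return "\n\n".join(blocks) + "\n"
```

Datatype and object properties would need the attribute and reference structure of the XML, which this page does not describe, so they are left out of the sketch.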
Generated Ontology¶
@prefix eccairs: <http://eccairs.jrc.ec.europa.eu/ontology#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .

eccairs:Occurrence a owl:Class ;
    rdfs:label "Occurrence" ;
    rdfs:comment "An aviation safety occurrence" .

eccairs:Aircraft a owl:Class ;
    rdfs:label "Aircraft" ;
    rdfs:comment "An aircraft involved in an occurrence" .

eccairs:hasAircraft a owl:ObjectProperty ;
    rdfs:domain eccairs:Occurrence ;
    rdfs:range eccairs:Aircraft .
Markdown Format¶
Required Sections¶
| Section | Required | Content |
|---|---|---|
| # Incident Report {ID} | Yes | Title with ID |
| ## Metadata | Yes | Date, location, flight phase |
| ## Aircraft | Yes | Type, registration, operator |
| ## Incident Description | Yes | Narrative text |
| ## Outcome | No | Injuries, damage |
| ## Weather Conditions | No | Visibility, wind |
| ## Findings | No | Primary cause, contributing factors |
| ## Safety Recommendations | No | EASA recommendations |
Field Extraction¶
Fields are extracted from bullet points:
Parser regex:
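The bullet example and the parser's regex are not reproduced on this page. A minimal sketch of such an extraction, assuming the incident reports use the same bullet shape as the generated episodes ("- **Key**: value"). The pattern and the extract_fields helper are illustrative, not the parser's actual code:

```python
import re

# Assumed bullet shape: "- **Key**: value" (also accepts "*" as the bullet).
FIELD_PATTERN = re.compile(
    r"^[ \t]*[-*][ \t]+\*\*(?P<key>[^*]+)\*\*[ \t]*:[ \t]*(?P<value>.*?)[ \t]*$",
    re.MULTILINE,
)


def extract_fields(section_text: str) -> dict[str, str]:
    """Return every 'Key: value' bullet in a section as a dictionary."""
    return {
        match.group("key").strip(): match.group("value")
        for match in FIELD_PATTERN.finditer(section_text)
    }


metadata = extract_fields(
    "- **Date**: 2024-03-22\n"
    "- **Location**: En route, 45 nm west of Barcelona (LEBL)\n"
)
```

Lines that are not bullets of this shape are skipped, so free text inside a section does not produce fields.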
Evaluation Questions¶
Question Format¶
{
  "questions": [
    {
      "id": "av_q1",
      "question": "What caused incident 2024-0157-EU?",
      "answer": "Hydraulic pump failure due to manufacturing defect",
      "type": "cause_lookup"
    }
  ]
}
Question Type Distribution (Curated)¶
| Type | Count | % |
|---|---|---|
| cause_lookup | 10 | 28.6% |
| incident_at_location | 8 | 22.9% |
| entity_description | 6 | 17.1% |
| aircraft_lookup | 5 | 14.3% |
| operator_lookup | 4 | 11.4% |
| recommendation_lookup | 2 | 5.7% |
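A distribution like the one above can be recomputed from a question file. A minimal sketch, assuming the file follows the question format shown earlier (a top-level "questions" list whose items carry a "type" key); the helper names are illustrative:

```python
import json
from collections import Counter
from pathlib import Path


def load_questions(path: str) -> list[dict]:
    """Read the 'questions' list from an evaluation file."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    return data["questions"]


def type_distribution(questions: list[dict]) -> list[tuple[str, int, float]]:
    """Return (type, count, percent) rows, most frequent type first."""
    counts = Counter(question["type"] for question in questions)
    total = sum(counts.values())
    return [
        (qtype, count, round(100 * count / total, 1))
        for qtype, count in counts.most_common()
    ]
```

For example, type_distribution(load_questions("use_cases/aviation_safety/evaluation_questions_curated.json")) would yield one row per question type.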
Configuration¶
Environment Variables¶
# Database
FALKORDB_HOST=localhost
FALKORDB_PORT=6379
# LLM
OPENAI_API_KEY=sk-...
# Embeddings (optional)
EMBEDDING_PROVIDER=local
EMBEDDING_MODEL=BAAI/bge-small-en-v1.5
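These variables can be read once at startup and passed around as a typed object. A minimal sketch follows; the Settings class and load_settings function are hypothetical, and using the sample values above as fallbacks for the database settings is an assumption of this example, not documented behaviour:

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    falkordb_host: str
    falkordb_port: int
    openai_api_key: Optional[str]
    embedding_provider: Optional[str]
    embedding_model: Optional[str]


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Build Settings from a mapping, defaulting to the process environment."""
    env = os.environ if env is None else env
    return Settings(
        # Assumed fallbacks: the sample values shown above.
        falkordb_host=env.get("FALKORDB_HOST", "localhost"),
        falkordb_port=int(env.get("FALKORDB_PORT", "6379")),
        openai_api_key=env.get("OPENAI_API_KEY"),
        # Embedding settings are optional and stay None when unset.
        embedding_provider=env.get("EMBEDDING_PROVIDER"),
        embedding_model=env.get("EMBEDDING_MODEL"),
    )
```

Accepting a mapping instead of reading os.environ directly keeps the function easy to test with a plain dictionary.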
Extending the Use Case¶
Adding New Incidents¶
- Create a markdown file following the format
- Name it incident_YYYY_NNNN.md
- Include all required sections
- Rebuild the graph with --reset
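Before rebuilding the graph, a new report can be checked against the naming rule and the required sections. A minimal sketch, assuming four-digit year and sequence numbers (as in incident_2024_0157.md) and the required headings listed under Markdown Format; check_incident_file is a hypothetical helper, not part of the use case:

```python
import re

# Assumption: YYYY and NNNN are both exactly four digits.
FILENAME_PATTERN = re.compile(r"^incident_\d{4}_\d{4}\.md$")
REQUIRED_HEADINGS = ("## Metadata", "## Aircraft", "## Incident Description")


def check_incident_file(name: str, content: str) -> list[str]:
    """Return a list of problems; an empty list means the file looks valid."""
    problems: list[str] = []
    if not FILENAME_PATTERN.match(name):
        problems.append(f"unexpected file name: {name}")
    lines = [line.strip() for line in content.splitlines()]
    # The title must carry the incident ID after "# Incident Report".
    if not any(line.startswith("# Incident Report ") for line in lines):
        problems.append("missing title: # Incident Report {ID}")
    for heading in REQUIRED_HEADINGS:
        if heading not in lines:
            problems.append(f"missing section: {heading}")
    return problems
```

The check looks only at headings; whether the bullets inside each section are complete is left to the parser.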
Custom Entity Extraction¶
Modify the episode builder to extract additional entities:
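def build_episode_content(record: IncidentRecord) -> str:
    # Existing sections are built here as in episode_builder.py
    lines = [f"# Aviation Safety Occurrence: {record.id}"]
    # Add custom entity extraction
    species = record.findings.wildlife.get("species")
    if species:
        lines.append(f"- **Bird Species**: {species}")
    return "\n".join(lines)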
Adding ECCAIRS Attributes¶
- Generate new ontology from ECCAIRS XML
- Load ontology graph
- Rebuild knowledge graph with graph-hybrid mode