Investigations
When a question needs more than a quick lookup — multi-step reasoning across knowledge graphs, an auditable evidence chain, or systematic coverage of a complex topic — you run an investigation. Phronesis, the reasoning engine, takes your approved briefing and autonomously plans, executes, and synthesizes an answer. Every finding traces back to the specific tool calls and graph nodes that produced it.
What Investigations Are For
Investigations answer open-ended questions over structured knowledge — questions where the answer has to be assembled from multiple entities, connectors, or hops across a graph:
| Pattern | What's being investigated |
|---|---|
| Lookup with context | A specific entity, then its relationships and properties |
| Comparison | Two or more entities against each other |
| Causal / temporal chain | A sequence of events and how they connect |
| Classification | Which category something belongs to and why |
| Cross-graph correlation | Facts that only emerge when two knowledge graphs are joined |
| Negative / unsatisfiable | "Is there anything that…" where the honest answer may be "no" |
The value isn't any single query — it's an agent that classifies the question, discovers which data sources are relevant, plans an approach, executes it, recovers from dead ends, and produces a synthesis where every sentence is traceable to evidence.
Not everything needs an investigation
If the answer is a single lookup, a count, or a short chain of pivots, exploratory analysis is faster. Investigations are for questions that require planning, multi-step execution, and provenance.
What Investigations Don't Cover
Be precise about scope — it matters for how you frame a question:
- Unstructured corpora. Phronesis reasons over knowledge graphs, not raw document collections. If the answer lives in PDFs that haven't been ingested, ingest first.
- Open-web research. There is no web browsing. Every tool call routes through a connected knowledge graph.
- Numeric optimization. If the question is "what is the best allocation under these constraints?", that's a decision, not an investigation.
- Ground-truth judgment. Phronesis reports what the graphs say. Conflicting evidence surfaces as a human pause with options, not a silent majority vote.
- Cross-session memory. Each investigation has its own working memory. Lessons from one investigation don't automatically carry into the next.
Starting an Investigation
After a briefing is elicited and approved, start the investigation from chat:
"Start the investigation."
The assistant launches Phronesis. The launch call returns immediately and the investigation runs in the background. The workspace shows a lifecycle stepper tracking progress: Planning → Running → Synthesizing → Verifying → Complete.
Complexity hint
start_investigation accepts a complexity hint: simple for quick lookups, standard for typical questions, analytical for multi-hop cross-graph work. This controls the cycle budget and time cap.
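As a rough illustration of how a complexity hint could translate into a cycle budget and time cap, the sketch below maps the three documented hint values to a budget record. The `Budget` type, the `budget_for` helper, and every number in the table are placeholders invented for this example; the engine's actual limits are not stated on this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Budget:
    max_cycles: int   # hypothetical cap on investigation cycles
    max_minutes: int  # hypothetical wall-clock cap


# Placeholder values for illustration only, not the engine's real limits.
BUDGETS = {
    "simple": Budget(max_cycles=10, max_minutes=5),
    "standard": Budget(max_cycles=25, max_minutes=15),
    "analytical": Budget(max_cycles=40, max_minutes=30),
}


def budget_for(complexity: str = "standard") -> Budget:
    """Return the budget for a hint, rejecting unknown values early."""
    try:
        return BUDGETS[complexity]
    except KeyError:
        raise ValueError(f"unknown complexity hint: {complexity!r}") from None
```

Rejecting an unknown hint up front, rather than silently falling back to a default, keeps a typo from quietly running a multi-hop question on a small budget.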
Following Progress
While the investigation runs:
- The lifecycle stepper in the workspace shows the current stage with live cycle and finding counts
- Ask the assistant for updates: "How is the investigation going?"
- The assistant reports the current cycle, finding count, and status — the full detail is in the workspace viewer
A typical investigation runs 10–40 cycles, depending on the question's complexity and the data available.
When the Investigation Pauses
This is where the human-in-the-loop design matters most. When Phronesis hits a question it cannot answer from the data alone, it pauses and asks you:
Phronesis: "Two operators match 'Iberia' in the aviation-safety graph — Iberia LAE and Iberia Express. Which one should the investigation follow?"
- Iberia LAE
- Iberia Express
- Both
The investigation presents structured options with a reason for the pause. You pick an option or provide a free-text answer. Your input is recorded in the investigation graph — fully auditable, attributed to you.
Pause Reasons
| Reason | What happened | What you do |
|---|---|---|
| Ambiguity | Multiple interpretations of an entity, scope, or term | Pick the right one |
| Dead end | A search path returned nothing useful | Redirect, narrow, or confirm it's OK to stop |
| Scope | The investigation hit the edge of what the briefing covers | Expand scope or confirm the boundary |
| Judgment | The data supports multiple conclusions | Choose the interpretation |
| Prior findings | New evidence contradicts earlier findings | Confirm which finding stands |
Your response is classified by type — clarification, selection, redirect, or override — which tells the engine how to factor it in. If you want to see what context the engine will receive before committing your answer, ask the assistant to preview the input first.
After you respond, the investigation resumes with your input factored into the next cycle.
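The pause-and-respond exchange can be pictured as two small records: the request the engine raises and the input you send back. The sketch below is a minimal model under stated assumptions. The class and field names are hypothetical, and the default rule in `answer_request` (a listed option counts as a selection, anything else as a clarification) is an assumption made for the example, not the engine's documented classifier.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class PauseReason(Enum):
    AMBIGUITY = "ambiguity"
    DEAD_END = "dead_end"
    SCOPE = "scope"
    JUDGMENT = "judgment"
    PRIOR_FINDINGS = "prior_findings"


class ResponseType(Enum):
    CLARIFICATION = "clarification"
    SELECTION = "selection"
    REDIRECT = "redirect"
    OVERRIDE = "override"


@dataclass
class HumanRequest:
    reason: PauseReason
    question: str
    options: list[str]


@dataclass
class HumanInput:
    response_type: ResponseType
    answer: str
    author: str  # recorded so the input stays attributable


def answer_request(
    request: HumanRequest,
    answer: str,
    author: str,
    response_type: Optional[ResponseType] = None,
) -> HumanInput:
    """Build the recorded input; the default typing rule is an assumption."""
    if response_type is None:
        picked_option = answer in request.options
        response_type = (
            ResponseType.SELECTION if picked_option else ResponseType.CLARIFICATION
        )
    return HumanInput(response_type, answer, author)


# The ambiguity pause from the example above, answered by picking an option.
request = HumanRequest(
    PauseReason.AMBIGUITY,
    "Which operator should the investigation follow?",
    ["Iberia LAE", "Iberia Express", "Both"],
)
recorded = answer_request(request, "Both", author="analyst")
```

Redirects and overrides change the direction of the run, so in this sketch they have to be passed explicitly instead of being inferred from the answer text.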
Reviewing Results
When the investigation reaches a terminal status, the workspace shows five views of the result:
Summary
The synthesized answer and the claims it stands on, each tagged by type. This is your primary view — the answer to the briefing's question, with every claim linked to its evidence.
Timeline
Every cycle in chronological order: what the engine did, what it found, where it paused. Useful when an answer feels surprising and you want to retrace the reasoning.
Findings
A filterable list of every finding with its status, confidence, and the cycle that produced it:
| Status | Meaning |
|---|---|
| confirmed | Supported by evidence from the graph |
| hypothesis | Plausible but not fully verified |
| rejected | Contradicted by later evidence |
| anomaly | Unexpected: worth noting but not central |
| negative_finding | The investigation looked and found nothing |
| unsatisfiable | The question cannot be answered from the available data |
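To make the filterable list concrete, here is a minimal sketch of findings as records with a status, a confidence, and the cycle that produced them. The status values come from the table above; the `Finding` fields, the 0 to 1 confidence scale, and `filter_findings` are assumptions for illustration, not the workbench's actual schema.

```python
from dataclasses import dataclass
from enum import Enum


class FindingStatus(Enum):
    CONFIRMED = "confirmed"
    HYPOTHESIS = "hypothesis"
    REJECTED = "rejected"
    ANOMALY = "anomaly"
    NEGATIVE_FINDING = "negative_finding"
    UNSATISFIABLE = "unsatisfiable"


@dataclass
class Finding:
    text: str
    status: FindingStatus
    confidence: float  # assumed to lie in [0, 1]
    cycle: int         # cycle that produced the finding


def filter_findings(findings, statuses, min_confidence=0.0):
    """Keep findings whose status is wanted and whose confidence is high enough."""
    wanted = set(statuses)
    return [
        f for f in findings
        if f.status in wanted and f.confidence >= min_confidence
    ]
```

A review pass might start with `filter_findings(findings, [FindingStatus.CONFIRMED], min_confidence=0.8)` and then widen to hypotheses and anomalies once the core answer has been checked.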
Evidence Chain
Starting from any finding or the final answer, trace backwards through the evidence: which tool calls produced which findings, and which findings informed the synthesis. This is the auditability layer — every claim has a path back to the raw data.
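Tracing backwards is a graph walk from a claim toward the tool calls beneath it. The sketch below runs that walk over a tiny in-memory dictionary standing in for the investigation graph; the node names and the `trace_back` helper are hypothetical, and the real provenance graph is queried through the workspace, not through this structure.

```python
from collections import deque


def trace_back(derived_from: dict[str, list[str]], start: str) -> list[str]:
    """Breadth-first walk from a node to everything it was derived from."""
    seen = {start}
    order = []
    queue = deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for source in derived_from.get(node, []):
            if source not in seen:  # a tool call may support several findings
                seen.add(source)
                queue.append(source)
    return order


# Toy provenance: a synthesis informed by two findings, which rest on tool calls.
derived_from = {
    "synthesis": ["finding-1", "finding-2"],
    "finding-1": ["toolcall-a"],
    "finding-2": ["toolcall-a", "toolcall-b"],
}
chain = trace_back(derived_from, "synthesis")
# chain == ["synthesis", "finding-1", "finding-2", "toolcall-a", "toolcall-b"]
```

Starting from `"finding-2"` instead of the synthesis returns only that finding and its two tool calls, which is the narrower view you want when a single claim looks suspicious.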
Graph
The raw provenance topology. Every other view is a projection of this graph.
Terminal Statuses
| Status | Meaning | What to do |
|---|---|---|
| Completed | Verified synthesis produced | Review findings, file to the case |
| Incomplete | Engine converged without a confident answer | Check the last few cycles for an explanation |
| Error | Fatal failure during execution | Check the error reason, adjust briefing, re-run |
For incomplete investigations, the partial findings are preserved. Often, providing new context in the briefing and re-running is enough.
What Happens After
When an investigation completes, the assistant:
- Summarizes the key findings (3–5 sentences)
- Checks whether the findings address the original RFI requirement
- Flags any gaps — questions the investigation was expected to answer but didn't
- Asks whether you'd like to review the findings for filing into the case
Nothing is filed automatically. You review each finding and decide what enters the case layer.
Multi-Graph Investigations
With two or more knowledge graphs connected, Phronesis can investigate across them:
- Discovery registers each graph's schema and ontology in a shared capability model
- Cross-graph mapping identifies entity types that align between graphs and caches the mappings
- Planning detects cross-graph intent and generates steps that pivot between connectors
- Evidence chains tag every tool call with its source connector, so you see where each fact came from
If no viable mapping exists between the graphs the question needs, the engine pauses and asks — it will not silently fabricate the bridge.
Under the Hood
The investigation pipeline
An investigation flows through these stages:
| Stage | What happens |
|---|---|
| Discovery (Phase 0) | Schemas and ontologies fetched from every connected MCP data server and cached (default staleness: 24h). Cold: 30–60s; warm: instant. |
| Planning | The planner builds a question profile and generates a functional plan: objectives and intents describing what to establish, not how. No tools or parameters committed yet. |
| Cycle execution | The executor picks the current step, chooses the concrete tool call at runtime, runs it, and records findings. The executor — not the planner — decides tools and parameters. |
| Recovery | Three tiers: smart retry (tool switching), step fallback (pre-encoded alternative), adaptive replan (new plan from accumulated findings). Max two replans per investigation. |
| Convergence | Loop stops when: max cycles reached, max time reached, findings stabilized (5 cycles without new material), or redundant tool calls detected. |
| Synthesis | A Synthesis node is written with the answer text plus INFORMED_BY edges to the findings it consumed. |
| Verification | Deterministic checks confirm every claim in the answer has a supporting finding. |
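The four stop conditions in the Convergence row can be sketched as a single check run at the end of each cycle. This is an illustrative sketch only: `LoopState`, `stop_reason`, and the rule used to detect redundant tool calls (the last three calls being identical) are assumptions. The only value taken from the table is the five-cycle stabilization threshold.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class LoopState:
    cycle: int
    elapsed_seconds: float
    cycles_since_new_finding: int
    # Each entry is a hashable (tool_name, params) pair, most recent last.
    recent_calls: list = field(default_factory=list)


def stop_reason(
    state: LoopState,
    max_cycles: int,
    max_seconds: float,
    stable_after: int = 5,   # from the table: 5 cycles without new material
    repeat_window: int = 3,  # assumed redundancy rule, not documented
) -> Optional[str]:
    """Return why the loop should stop, or None to keep going."""
    if state.cycle >= max_cycles:
        return "max_cycles"
    if state.elapsed_seconds >= max_seconds:
        return "max_time"
    if state.cycles_since_new_finding >= stable_after:
        return "findings_stabilized"
    window = state.recent_calls[-repeat_window:]
    if len(window) == repeat_window and len(set(window)) == 1:
        return "redundant_tool_calls"
    return None
```

Returning the reason, not just a boolean, lets the timeline record why a run ended, which helps when an investigation finishes as Incomplete.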
The investigation graph is a FalkorDB database named investigation_<id> containing W3C PROV-O nodes (Investigation, Finding, ToolCall, HumanRequest, HumanInput, Synthesis) and the edges connecting them. See ADR-004 (PROV-O audit trails) and ADR-006 (investigation graph per session) in the project's decision records.
Recovery and replanning
- Smart retry — the executor re-prompts with a hint to switch tools or parameters. Resolves most transient failures.
- Fallback — the planner attached a fallback strategy to the step ahead of time. Runs automatically.
- Replan — the planner receives completed steps, findings, and the failure reason, and returns a new functional plan for the remaining work. Capped at two replans.
Replanning abandons the old plan; findings persist.
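The escalation order across the three tiers can be written as one small decision function. The sketch below assumes each tier is tried once per failing step. The function name, its flags, and the `"exhausted"` result are hypothetical, and this page does not say what the engine does once both replans are used up.

```python
MAX_REPLANS = 2  # documented cap on replans per investigation


def next_recovery_action(
    retry_tried: bool,
    has_fallback: bool,
    fallback_tried: bool,
    replans_used: int,
) -> str:
    """Pick the cheapest recovery tier that has not been used yet."""
    if not retry_tried:
        return "smart_retry"
    if has_fallback and not fallback_tried:
        return "step_fallback"
    if replans_used < MAX_REPLANS:
        return "adaptive_replan"
    # Behaviour past this point is not documented here; the caller decides.
    return "exhausted"
```

For a step whose retry has failed and which has no pre-encoded fallback, `next_recovery_action(True, False, False, 0)` skips straight to `"adaptive_replan"`.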
Investigation history
Every investigation is persisted in FalkorDB as its own graph, with its metadata stored in the workbench SQLite database. Past investigations are browsable with the same views as live runs. Active investigations cannot be deleted.
Vocabulary
| Term | Meaning |
|---|---|
| Briefing | The scoped question the investigation answers |
| Cycle | One iteration: action → tool call → finding → convergence check |
| Finding | A single piece of evidence with status, confidence, and source |
| Human request | A pause: the investigation needs your input |
| Replan | Regenerating the approach mid-flight after smart retry and step fallback are exhausted |
| Synthesis | The final answer, traceable to the findings that produced it |
| Claim type | How a finding appears in the summary: positive ✓, negative ✗, enumeration ≡, uncertainty ~, or implication → |
Learn More
- Briefings — How to scope the question before investigating
- Cases — Filing investigation findings into the case layer
- Decisions — When the question shifts to "what should we do?"
- Exploratory Analysis — Answering questions without a formal investigation