Investigations

When a question needs more than a quick lookup — multi-step reasoning across knowledge graphs, an auditable evidence chain, or systematic coverage of a complex topic — you run an investigation. Phronesis, the reasoning engine, takes your approved briefing and autonomously plans, executes, and synthesizes an answer. Every finding traces back to the specific tool calls and graph nodes that produced it.

What Investigations Are For

Investigations answer open-ended questions over structured knowledge — questions where the answer has to be assembled from multiple entities, connectors, or hops across a graph:

| Pattern | What's being investigated |
| --- | --- |
| Lookup with context | A specific entity, then its relationships and properties |
| Comparison | Two or more entities against each other |
| Causal / temporal chain | A sequence of events and how they connect |
| Classification | Which category something belongs to and why |
| Cross-graph correlation | Facts that only emerge when two knowledge graphs are joined |
| Negative / unsatisfiable | "Is there anything that…" where the honest answer may be "no" |

The value isn't any single query — it's an agent that classifies the question, discovers which data sources are relevant, plans an approach, executes it, recovers from dead ends, and produces a synthesis where every sentence is traceable to evidence.

Not everything needs an investigation

If the answer is a single lookup, a count, or a short chain of pivots, exploratory analysis is faster. Investigations are for questions that require planning, multi-step execution, and provenance.

What Investigations Don't Cover

Be precise about scope — it matters for how you frame a question:

  • Unstructured corpora. Phronesis reasons over knowledge graphs, not raw document collections. If the answer lives in PDFs that haven't been ingested, ingest first.
  • Open-web research. There is no web browsing. Every tool call routes through a connected knowledge graph.
  • Numeric optimization. If the question is "what is the best allocation under these constraints?", that's a decision, not an investigation.
  • Ground-truth judgment. Phronesis reports what the graphs say. Conflicting evidence surfaces as a human pause with options, not a silent majority vote.
  • Cross-session memory. Each investigation has its own working memory. Lessons from one investigation don't automatically carry into the next.

Starting an Investigation

After a briefing is elicited and approved, start the investigation from chat:

"Start the investigation."

The assistant launches Phronesis, which returns immediately and begins running in the background. The workspace shows a lifecycle stepper tracking progress: Planning → Running → Synthesizing → Verifying → Complete.

Complexity hint

start_investigation accepts a complexity hint: simple for quick lookups, standard for typical questions, analytical for multi-hop cross-graph work. This controls the cycle budget and time cap.
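In code, the hint can be thought of as selecting a budget profile. A minimal sketch, assuming hypothetical budget values — the hint names come from the docs, but these specific cycle and time numbers are illustrative, not the engine's real configuration:

```python
# Illustrative only: map a complexity hint to a (cycle budget, time cap)
# profile. The hint names are from the docs; the numbers are assumptions.
BUDGETS = {
    "simple":     {"max_cycles": 10, "max_minutes": 5},
    "standard":   {"max_cycles": 25, "max_minutes": 15},
    "analytical": {"max_cycles": 40, "max_minutes": 30},
}

def budget_for(hint: str) -> dict:
    """Resolve a complexity hint, defaulting to `standard` for unknown values."""
    return BUDGETS.get(hint, BUDGETS["standard"])
```

An unrecognized hint falls back to the standard profile rather than failing, which matches the spirit of a hint: advisory, not mandatory.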

Following Progress

While the investigation runs:

  • The lifecycle stepper in the workspace shows the current stage with live cycle and finding counts
  • Ask the assistant for updates: "How is the investigation going?"
  • The assistant reports the current cycle, finding count, and status — the full detail is in the workspace viewer

A typical investigation runs 10–40 cycles, depending on the question's complexity and the data available.
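The assistant's one-line status report could be rendered from a status snapshot like this — a sketch with assumed field names (`stage`, `cycle`, `findings`), not the product's actual payload:

```python
def progress_line(status: dict) -> str:
    """Render a one-line progress report from a status snapshot.
    The field names here are assumptions for the sketch."""
    return f"{status['stage']}: cycle {status['cycle']}, {status['findings']} findings so far"
```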

When the Investigation Pauses

This is where the human-in-the-loop design matters most. When Phronesis hits a question it cannot answer from the data alone, it pauses and asks you:

Phronesis: "Two operators match 'Iberia' in the aviation-safety graph — Iberia LAE and Iberia Express. Which one should the investigation follow?"

  • Iberia LAE
  • Iberia Express
  • Both

The investigation presents structured options with a reason for the pause. You pick an option or provide a free-text answer. Your input is recorded in the investigation graph — fully auditable, attributed to you.

Pause Reasons

| Reason | What happened | What you do |
| --- | --- | --- |
| Ambiguity | Multiple interpretations of an entity, scope, or term | Pick the right one |
| Dead end | A search path returned nothing useful | Redirect, narrow, or confirm it's OK to stop |
| Scope | The investigation hit the edge of what the briefing covers | Expand scope or confirm the boundary |
| Judgment | The data supports multiple conclusions | Choose the interpretation |
| Prior findings | New evidence contradicts earlier findings | Confirm which finding stands |

Your response is classified by type — clarification, selection, redirect, or override — which tells the engine how to factor it in. If you want to see what context the engine will receive before committing your answer, ask the assistant to preview the input first.
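The four response types could be distinguished by a classifier along these lines. The type names come from the docs; the matching rules below are purely illustrative heuristics, not the engine's actual classification logic:

```python
def classify_response(text: str, options: list[str]) -> str:
    """Heuristically classify a human response by type. The four type
    names are from the docs; these rules are illustrative assumptions."""
    t = text.strip().lower()
    if any(t == o.lower() for o in options):
        return "selection"          # picked one of the structured options
    if t.startswith(("instead", "stop", "switch")):
        return "redirect"           # steer the investigation elsewhere
    if t.startswith(("override", "ignore")):
        return "override"           # overrule the engine's framing
    return "clarification"          # free-text context for the pause
```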

After you respond, the investigation resumes with your input factored into the next cycle.

Reviewing Results

When the investigation reaches a terminal status, the workspace shows five views of the result:

Summary

The synthesized answer and the claims it stands on, each tagged by type. This is your primary view — the answer to the briefing's question, with every claim linked to its evidence.

Timeline

Every cycle in chronological order: what the engine did, what it found, where it paused. Useful when an answer feels surprising and you want to retrace the reasoning.

Findings

A filterable list of every finding with its status, confidence, and the cycle that produced it:

| Status | Meaning |
| --- | --- |
| confirmed | Supported by evidence from the graph |
| hypothesis | Plausible but not fully verified |
| rejected | Contradicted by later evidence |
| anomaly | Unexpected — worth noting but not central |
| negative_finding | The investigation looked and found nothing |
| unsatisfiable | The question cannot be answered from the available data |
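Filtering by status and confidence is the core of this view. A minimal sketch, assuming findings are records with `status` and `confidence` fields (the shape is an assumption, the status values are from the docs):

```python
def filter_findings(findings, statuses=None, min_confidence=0.0):
    """Filter findings by status set and confidence floor.
    The dict shape (`status`, `confidence`) is assumed for the sketch."""
    return [
        f for f in findings
        if (statuses is None or f["status"] in statuses)
        and f.get("confidence", 0.0) >= min_confidence
    ]
```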

Evidence Chain

Starting from any finding or the final answer, trace backwards through the evidence: which tool calls produced which findings, and which findings informed the synthesis. This is the auditability layer — every claim has a path back to the raw data.
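Conceptually, this backward trace is a breadth-first walk over "drew on" edges. A self-contained sketch over an in-memory edge map — the traversal order and data shape are illustrative, not the product's implementation:

```python
from collections import deque

def evidence_chain(start: str, informed_by: dict[str, list[str]]) -> list[str]:
    """Walk backwards from a synthesis or finding through the nodes it
    drew on (findings, tool calls), breadth-first. The `informed_by`
    edge map is an assumed in-memory stand-in for the provenance graph."""
    seen, order, queue = {start}, [], deque([start])
    while queue:
        node = queue.popleft()
        order.append(node)
        for src in informed_by.get(node, []):
            if src not in seen:
                seen.add(src)
                queue.append(src)
    return order
```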

Graph

The raw provenance topology. Every other view is a projection of this graph.

Terminal Statuses

| Status | Meaning | What to do |
| --- | --- | --- |
| Completed | Verified synthesis produced | Review findings, file to the case |
| Incomplete | Engine converged without a confident answer | Check the last few cycles for an explanation |
| Error | Fatal failure during execution | Check the error reason, adjust the briefing, re-run |

For incomplete investigations, the partial findings are preserved. Often, providing new context in the briefing and re-running is enough.

What Happens After

When an investigation completes, the assistant:

  1. Summarizes the key findings (3–5 sentences)
  2. Checks whether the findings address the original RFI requirement
  3. Flags any gaps — questions the investigation was expected to answer but didn't
  4. Asks whether you'd like to review the findings for filing into the case

Nothing is filed automatically. You review each finding and decide what enters the case layer.

Multi-Graph Investigations

With two or more knowledge graphs connected, Phronesis can investigate across them:

  • Discovery registers each graph's schema and ontology in a shared capability model
  • Cross-graph mapping identifies entity types that align between graphs and caches the mappings
  • Planning detects cross-graph intent and generates steps that pivot between connectors
  • Evidence chains tag every tool call with its source connector, so you see where each fact came from

If no viable mapping exists between the graphs the question requires, the engine pauses and asks rather than silently fabricating a bridge.
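The viability check amounts to intersecting the entity types the two schemas expose. A sketch under the assumption that each schema can be summarized as a set of entity-type names (the real capability model is richer):

```python
def viable_mapping(schema_a: set[str], schema_b: set[str]) -> set[str]:
    """Entity types shared by two graph schemas. An empty result means
    there is no bridge, and the engine would pause for human input
    instead of fabricating one. Schemas-as-sets is a simplification."""
    return schema_a & schema_b
```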

Under the Hood

The investigation pipeline

An investigation flows through these stages:

| Stage | What happens |
| --- | --- |
| Discovery (Phase 0) | Schemas and ontologies are fetched from every connected MCP data server and cached (default staleness: 24h). Cold: 30–60s; warm: instant. |
| Planning | The planner builds a question profile and generates a functional plan: objectives and intents describing what to establish, not how. No tools or parameters are committed yet. |
| Cycle execution | The executor picks the current step, chooses the concrete tool call at runtime, runs it, and records findings. The executor — not the planner — decides tools and parameters. |
| Recovery | Three tiers: smart retry (tool switching), step fallback (pre-encoded alternative), adaptive replan (new plan from accumulated findings). Max two replans per investigation. |
| Convergence | The loop stops when: max cycles reached, max time reached, findings stabilized (5 cycles without new material), or redundant tool calls detected. |
| Synthesis | A Synthesis node is written with the answer text plus INFORMED_BY edges to the findings it consumed. |
| Verification | Deterministic checks confirm every claim in the answer has a supporting finding. |
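The four convergence conditions in the pipeline stages above can be sketched as one predicate. The stabilization window of 5 cycles is from the docs; the default budgets and the duplicate-call heuristic are illustrative assumptions:

```python
def should_stop(cycle, elapsed_s, cycles_since_new_finding, last_calls,
                max_cycles=40, max_seconds=1800, stable_window=5):
    """Evaluate the four stop conditions: cycle budget, time cap,
    stabilized findings, redundant tool calls. Defaults are assumed."""
    if cycle >= max_cycles or elapsed_s >= max_seconds:
        return True                                  # budget exhausted
    if cycles_since_new_finding >= stable_window:
        return True                                  # findings stabilized
    if len(last_calls) >= 3 and len(set(last_calls[-3:])) == 1:
        return True                                  # same call repeated
    return False
```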

The investigation graph is a FalkorDB database named investigation_<id> containing W3C PROV-O nodes (Investigation, Finding, ToolCall, HumanRequest, HumanInput, Synthesis) and the edges connecting them. See ADR-004 (PROV-O audit trails) and ADR-006 (investigation graph per session) in the project's decision records.
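Because the investigation graph is an ordinary FalkorDB database, it can be queried directly with the official FalkorDB Python client. A sketch: the node labels and the INFORMED_BY edge come from the docs, but the rest of the schema (property keys, other edge types) is assumed:

```python
def synthesis_evidence_query() -> str:
    """Cypher returning the findings the final synthesis consumed.
    Labels and INFORMED_BY are from the docs; other schema details
    are assumptions."""
    return "MATCH (s:Synthesis)-[:INFORMED_BY]->(f:Finding) RETURN f"

def fetch_evidence(graph_name: str):
    """Run the query against a local FalkorDB instance using the
    official client (`pip install falkordb`)."""
    from falkordb import FalkorDB
    graph = FalkorDB(host="localhost", port=6379).select_graph(graph_name)
    return graph.query(synthesis_evidence_query())
```

You would pass the investigation's own graph name (the `investigation_<id>` database) to `fetch_evidence`.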

Recovery and replanning
  1. Smart retry — the executor re-prompts with a hint to switch tools or parameters. Resolves most transient failures.
  2. Fallback — the planner attached a fallback strategy to the step ahead of time; it runs automatically.
  3. Replan — the planner receives completed steps, findings, and the failure reason, and returns a new functional plan for the remaining work. Capped at two replans.

Replanning abandons the old plan; findings persist.
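The escalation order can be sketched as a tier selector. The two-replan cap is from the docs; the single-retry threshold and the state shape are assumptions for the sketch:

```python
def next_recovery(step_has_fallback: bool, retries: int, replans: int):
    """Pick the next recovery tier after a failed step, escalating
    retry -> fallback -> replan. Returns None when recovery is
    exhausted. The retry threshold of 1 is an assumption; the
    two-replan cap is documented."""
    if retries < 1:
        return "smart_retry"
    if step_has_fallback:
        return "fallback"
    if replans < 2:
        return "replan"
    return None
```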

Investigation history

Every investigation is persisted in FalkorDB as its own graph and in the workbench SQLite database for metadata. Past investigations are browsable with identical views to live runs. Active investigations cannot be deleted.

Vocabulary

| Term | Meaning |
| --- | --- |
| Briefing | The scoped question the investigation answers |
| Cycle | One iteration: action → tool call → finding → convergence check |
| Finding | A single piece of evidence with status, confidence, and source |
| Human request | A pause: the investigation needs your input |
| Replan | Regenerating the approach mid-flight after recovery is exhausted |
| Synthesis | The final answer, traceable to the findings that produced it |
| Claim type | How a finding appears in the summary: positive ✓, negative ✗, enumeration ≡, uncertainty ~, or implication → |

Learn More

  • Briefings — How to scope the question before investigating
  • Cases — Filing investigation findings into the case layer
  • Decisions — When the question shifts to "what should we do?"
  • Exploratory Analysis — Answering questions without a formal investigation