Agentic AI for Integrated Pathology Reporting
A 45-minute hands-on workshop. You'll run three LangFlow workflows that progressively show what an agentic workflow does that a chatbot can't. Everything you need (login info, PDFs, user prompts, and system prompts) is on this one page.
How to read this handbook
Three types of paste-able blocks appear throughout the handbook, each with a distinct color so you can scan and know what kind of thing you're looking at without reading every line. Most long blocks are collapsed by default — click the row to expand. Each block also has a Copy button (top right) that puts its contents on your clipboard.
Amber — system prompt
Instructions to the LLM that already live inside a node's System Prompt field. You'd open the textarea in the node's right-side panel and replace it (or edit it) to change the model's behaviour.
Blue — user prompt
What you type into Playground. The chat directive that runs the flow. Just copy-paste the contents into the chat input and press send.
Violet — code / alternative
Python or alternative prompt content you'd paste into a node's Code tab or System Prompt field to replace the default. Useful for experiments — paste, re-run, see what changes.
▸ Collapsible row
Anything longer than ten lines is closed by default to keep the handbook scannable. Click the row to expand, read or copy the content, click again to collapse.
Quick start
1. Log in
Open https://pi-2026-workshop.javadilab.org in any browser. Your facilitator will hand out usernames in the form pi-user-NNN and a shared password.
2. Open a flow
After login you'll see "Starter Project." Six flows are pre-loaded for you:
- chatbot: the Part 1 warm-up
- pathology_report_integration: the Part 2 agentic workflow (you also rebuild this from scratch in Part 3)
- extras_wikipedia_agent, extras_variant_tournament, extras_longitudinal_notes, extras_case_routing: bonus flows, not part of the live talk
Click any name to open the flow's canvas.
3. Use Playground
Click the Playground button at the top-right of the canvas. A chat panel opens. Type your prompt in the input, press send, watch the flow run. Edits to any node's right-side detail panel apply on the next run — no save button needed.
The workshop's warm-up baseline. Three nodes: Chat Input → General Chatbot → Chat Output. A single LLM call with file attachments. This is the "before" that we compare Part 2 against.
The four AML PDFs
Download these to your machine. You'll attach them via the paperclip in Playground.
| # | File | Modality | What it carries |
|---|---|---|---|
| 1 | 01_bone_marrow_morphology.pdf | Bone marrow morphology | Manual blast count, cytochemistry, hedges on lineage |
| 2 | 02_flow_cytometry.pdf | Flow cytometry | Gated blast %, immunophenotype, resolves the morphology hedge |
| 3 | 03_cytogenetics_fish.pdf | Cytogenetics + FISH | Karyotype, AML panel — normal here |
| 4 | 04_molecular_ngs.pdf | Molecular NGS (54-gene) | NPM1, FLT3-ITD, DNMT3A, other variants |
One fictional patient: adult male, 58y · leukocytosis, anemia, thrombocytopenia · 41% peripheral blasts.
The prompt to type
After attaching all four PDFs, paste this into the chat input.
This is the same task Part 2 (Scenario D) tackles. Identical input, identical ask, very different output.
How to run, what to notice
- Open chatbot from "Starter Project."
- Click Playground.
- Click the paperclip icon and attach all four PDFs.
- Paste the prompt above. Press send.
- Wait 10–30 seconds.
Four things to notice in the reply. These are the gaps the agentic workflow will close.
- Which PDF supported each sentence? The chatbot won't tell you. Part 2's Part B trace will.
- Where did DNMT3A land? It often drifts into the diagnosis line. DNMT3A is prognostic, not classifying — it belongs in prognostic notes only.
- Run it again. Same prompt, same files, fresh chat. How much does the structure change?
- If a downstream LIS needed JSON, what could it parse? Run the same prompt a third time. Is the structure stable enough to write a parser against?
Write down what you see. We revisit these answers in the side-by-side at the end of the workshop.
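One way to make the "is the structure stable enough to write a parser against?" question concrete is to compare the heading outline of several replies. The sketch below is only an illustration: it assumes the chatbot answers in markdown with `#`-style headings, and the two sample replies are invented placeholders, not real workshop output.

```python
import re

def heading_outline(markdown_reply: str) -> list[str]:
    """Return the reply's markdown headings, lowercased, in document order."""
    outline = []
    for line in markdown_reply.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*\S)\s*$", line)
        if match:
            outline.append(match.group(2).lower())
    return outline

def structure_is_stable(replies: list[str]) -> bool:
    """True only if every reply has a non-empty outline identical to the first."""
    outlines = [heading_outline(reply) for reply in replies]
    return bool(outlines) and bool(outlines[0]) and all(o == outlines[0] for o in outlines)

# Invented sample replies, standing in for what you paste from two chat runs.
run_1 = "## Diagnosis\nAML with NPM1 mutation\n## Prognostic notes\nDNMT3A detected"
run_2 = "## Final Diagnosis\nAML, NPM1-mutated\n## Molecular\nDNMT3A, FLT3-ITD"

print(structure_is_stable([run_1, run_1]))  # True
print(structure_is_stable([run_1, run_2]))  # False
```

If the outlines differ between runs, any parser keyed on section names would break, which is the gap the structured Part 2 output is meant to close.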
Variants if you finish early
Tighter format
Ask for an evidence trace
Lane discipline stress test
Single-source classifying
Expected answer for the last one: NPM1 + FLT3-ITD live only in 04_molecular_ngs.pdf. Sometimes the chatbot catches this, sometimes it doesn't.
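The single-source question can be phrased as a small set computation: a finding is single-source when exactly one file reports it. The sketch below is a simplified illustration, with finding labels taken from the "What it carries" column of the PDF table rather than from the PDFs themselves; the function name is hypothetical.

```python
from collections import defaultdict

# Coarse labels from the handbook's PDF table (not extracted from the real PDFs).
FINDINGS_BY_FILE = {
    "01_bone_marrow_morphology.pdf": {"blast count", "cytochemistry"},
    "02_flow_cytometry.pdf": {"blast count", "immunophenotype"},
    "03_cytogenetics_fish.pdf": {"karyotype"},
    "04_molecular_ngs.pdf": {"NPM1", "FLT3-ITD", "DNMT3A"},
}

def single_source_findings(findings_by_file: dict[str, set[str]]) -> dict[str, str]:
    """Map each finding reported by exactly one file to that file's name."""
    files_for = defaultdict(list)
    for filename, findings in findings_by_file.items():
        for finding in findings:
            files_for[finding].append(filename)
    return {f: files[0] for f, files in files_for.items() if len(files) == 1}

single = single_source_findings(FINDINGS_BY_FILE)
print(single["NPM1"])           # 04_molecular_ngs.pdf
print(single["FLT3-ITD"])       # 04_molecular_ngs.pdf
print("blast count" in single)  # False, because two reports carry a blast count
```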
The General Chatbot component's source is langflow_flows/components/api_scenario_zero/general_chatbot.py in the repo.
The workshop's headline case study. Nine custom components plus a stock Text Input holding WHO 5e classification rules. Reads the same four AML PDFs as Part 1 but emits a structured 11-section integrated report, a per-sentence evidence trace, and a QA-flag section. Case design and the original Stage 1 / Stage 2 prompts: Omar Baba, MD (see Authors at the bottom of this page).
The pipeline
[ChatInput] → [PipelineConfig] → [PDF Intake] ──┬─► [Morphology Parser]   ─┐
                                 (5 outputs)    ├─► [Flow Parser]         ─┤
                                                ├─► [Cytogenetics Parser] ─┼─► [WHO Classifier] → [QA Reviewer] → [Report Formatter] → [ChatOutput]
                                                ├─► [Molecular Parser]    ─┤
                                                └─► (cross-report Data)   ─┤
                                                                           │
                                                [WHO Instructions] ────────┘
                                                (Text Input, prefilled)
PDF Intake (Stage 1) runs five LLM calls: four per-source extractions (one per modality — morphology, flow, cytogenetics, molecular) plus one cross-report analysis that compares the four extractions for concordances, discordances, and single-source findings. Five outputs feed four dedicated parser nodes plus one direct line to the WHO Classifier.
WHO Classifier (Stage 2) takes six inputs: the four modality syntheses, the cross-report data, and the WHO 5e Instructions text. It produces the 11-section report plus the Part B evidence trace.
There are three editable system prompts across the pipeline (per-source extraction, cross-report analysis, WHO Classifier integration), plus the WHO Instructions text attendees can edit directly in the canvas without modifying any prompt.
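The Stage 1 call pattern (four per-source extractions, then one cross-report comparison) can be sketched in a few lines. This is not the workshop's PDF Intake code: `fake_llm` is a stand-in for a real model call, the prompt strings are placeholders, and running the four extractions concurrently is an assumption. Only the five output names come from the node's output handles.

```python
from concurrent.futures import ThreadPoolExecutor

SOURCES = ["morphology", "flow", "cytogenetics", "molecular"]

def fake_llm(system_prompt: str, user_text: str) -> dict:
    """Stand-in for a real model call; it only echoes what it was asked to do."""
    return {"prompt": system_prompt, "chars_read": len(user_text)}

def run_stage_1(reports: dict[str, str], llm=fake_llm) -> dict[str, dict]:
    # Fan-out: one extraction call per modality, each independent of the others.
    with ThreadPoolExecutor(max_workers=len(SOURCES)) as pool:
        futures = {s: pool.submit(llm, f"extract:{s}", reports[s]) for s in SOURCES}
        outputs = {f"{s}_data": future.result() for s, future in futures.items()}
    # Fifth call: compare the four extractions with each other.
    combined = "\n".join(str(outputs[f"{s}_data"]) for s in SOURCES)
    outputs["cross_report_data"] = llm("cross-report", combined)
    return outputs

reports = {s: f"text of the {s} report" for s in SOURCES}
result = run_stage_1(reports)
print(sorted(result))
```

The cross-report call has to wait for all four extractions, which is why it cannot join the fan-out.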
Directives to type
You don't upload PDFs here. The four AML files are pre-registered in the case manifest — typing a directive loads them automatically.
Start here
This is the workshop's standard exercise. Loads the four AML PDFs, runs both LLM stages, emits the full integrated report + Part B trace + QA flags in markdown.
Output format variations
HTML renders styled output. JSON returns only the machine-readable form — what a downstream LIS would parse. Narrative collapses 11 sections into one short paragraph. Hide the qa flags suppresses the QA section in the rendered output (the QA Reviewer still runs).
Other cases the same workflow handles
The pipeline isn't AML-specific. Same nine components, different PDFs. The AML case is the one with all four planted pedagogical features, so focus on it during the workshop. Save the others for after.
What good AML output looks like
A passing run satisfies all of these:
- Final diagnosis line contains NPM1
- Final diagnosis line does NOT contain DNMT3A (it belongs in prognostic notes only)
- Section 8 (Integrated Interpretation) mentions both blast numbers — 18% from morphology and 22% from flow — and reconciles them out loud
- Part B trace has zero UNSUPPORTED rows
- Prognostic notes section mentions DNMT3A
- QA flags section is empty or only contains low-severity items
If something's missing, that's a learning opportunity — keep reading.
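The checklist above translates naturally into a handful of string and list checks. The sketch below assumes the JSON output has been loaded into a dict; every field name (`final_diagnosis`, `integrated_interpretation`, `prognostic_notes`, `evidence_trace`, `qa_flags`, `status`, `severity`) is hypothetical and would need to be matched to the real JSON output.

```python
def check_aml_report(report: dict) -> list[str]:
    """Return the checklist items that fail; an empty list means a passing run."""
    failures = []
    diagnosis = report["final_diagnosis"]
    if "NPM1" not in diagnosis:
        failures.append("diagnosis line is missing NPM1")
    if "DNMT3A" in diagnosis:
        failures.append("DNMT3A leaked into the diagnosis line")
    interpretation = report["integrated_interpretation"]
    if not ("18%" in interpretation and "22%" in interpretation):
        failures.append("section 8 does not name both blast counts")
    if any(row["status"] == "UNSUPPORTED" for row in report["evidence_trace"]):
        failures.append("Part B trace has UNSUPPORTED rows")
    if "DNMT3A" not in report["prognostic_notes"]:
        failures.append("prognostic notes do not mention DNMT3A")
    if any(flag["severity"] != "low" for flag in report["qa_flags"]):
        failures.append("QA flags above low severity")
    return failures

# Invented example reports, shaped only to exercise the checks.
good = {
    "final_diagnosis": "Acute myeloid leukemia with NPM1 mutation",
    "integrated_interpretation": "Morphology counts 18% blasts and flow counts 22% blasts.",
    "prognostic_notes": "DNMT3A variant noted.",
    "evidence_trace": [{"sentence": "example sentence", "status": "SUPPORTED"}],
    "qa_flags": [],
}
bad = dict(good, final_diagnosis="AML with NPM1 and DNMT3A mutations")

print(check_aml_report(good))  # []
print(check_aml_report(bad))   # ['DNMT3A leaked into the diagnosis line']
```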
Edit a prompt and re-run
This is the workshop's "your turn" moment for Part 2. Each LLM stage carries an editable system prompt. Edit one, re-run, watch the output change.
The cleanest one-line exercise:
- Run the baseline once with "run the aml case". Note which sentence in section 8 talks about the blast count.
- Click the WHO Classifier (Integrator) node on the canvas. Open the System Prompt field on the right.
- Add this sentence somewhere in the rules section:
Suggested edit (reconcile blast counts in section 8): Always begin section 8 with a one-sentence reconciliation of the morphologic blast count vs the flow blast count, naming both numbers explicitly.
- Re-run with "run the aml case". Section 8 should now lead with the numeric comparison; the Part B trace updates to match.
That single edit ripples through the whole pipeline. No saving, no rebuild — changes apply on the next run.
Stage 1 — PDF Intake has TWO editable prompts
The PDF Intake node fires five LLM calls internally: four per-source extractions plus one cross-report analysis. Each kind of call has its own editable prompt in the right-side detail panel.
System prompts How the two PDF Intake prompts work + how to edit them (click to expand)
Per-Source Extraction Prompt — applies to all four per-source LLM calls. Tells the model how to read ONE component report (morphology, flow, cytogenetics, or molecular) and emit a structured JSON of that source's findings. The model adapts based on the source_id header in each call. Rules cover: extract-not-interpret, verbatim_support on every finding, classifying=true/false on every variant.
Cross-Report Analysis Prompt — runs after the four per-source extractions finish. Reads the four per-source JSONs and emits concordances, discordances (with resolution + basis), and single-source findings.
Full text of both prompts lives in langflow_flows/components/api_scenario_d/d2_pdf_intake.py (see DEFAULT_PER_SOURCE_PROMPT and DEFAULT_CROSS_REPORT_PROMPT) and is visible in the PDF Intake node's right-side panel.
To edit: click the PDF Intake node → right-side panel → either System Prompt textarea → paste your edited version → run. Edits to the per-source prompt change what gets extracted from every PDF; edits to the cross-report prompt change how concordances, discordances, and single-source findings are derived.
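To see what the per-source rules buy you, it helps to picture a check that could run on one extraction. The sketch below assumes a JSON shape with `findings` and `variants` lists; only `verbatim_support`, `classifying`, and `source_id` are names from the prompt description, and the rest (including the sample report text) is invented for illustration.

```python
def validate_extraction(extraction: dict, source_text: str) -> list[str]:
    """List rule violations for one per-source extraction."""
    problems = []
    for finding in extraction.get("findings", []):
        quote = finding.get("verbatim_support", "")
        if not quote:
            problems.append(f"{finding['name']}: no verbatim_support")
        elif quote not in source_text:
            # A quote that is not literally in the report is not verbatim.
            problems.append(f"{finding['name']}: quote not found in source")
    for variant in extraction.get("variants", []):
        if not isinstance(variant.get("classifying"), bool):
            problems.append(f"{variant['gene']}: classifying must be true or false")
    return problems

# Made-up report text and extraction, not taken from the workshop PDFs.
source_text = "Variant detected: NPM1. Variant detected: DNMT3A."
extraction = {
    "source_id": "molecular",
    "findings": [
        {"name": "NPM1", "verbatim_support": "Variant detected: NPM1"},
        {"name": "FLT3-ITD", "verbatim_support": ""},
    ],
    "variants": [
        {"gene": "NPM1", "classifying": True},
        {"gene": "DNMT3A", "classifying": "no"},
    ],
}
print(validate_extraction(extraction, source_text))
```

Here the empty quote and the non-boolean flag are both reported, while the NPM1 entries pass.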
Stage 2 — WHO Classifier system prompt
The WHO Classifier takes six inputs (four parser outputs + cross-report data + WHO Instructions text) and emits the integrated report + Part B evidence trace. To edit: click the WHO Classifier (Integrator) node → System Prompt textarea.
System prompt The seven rules the WHO Classifier prompt enforces (click to expand)
Full text lives in langflow_flows/components/api_scenario_d/d2_who_classifier.py (see DEFAULT_SYSTEM_PROMPT). Paraphrased:
1. Use only what you were given. Every clinical claim in interpretation/diagnosis must trace to a parser input, the cross-report data, or a classification rule from the WHO Instructions block.
2. Resolve discordances out loud. Each discordance is named, with both numbers + the resolution + why it holds.
3. Name single-source findings. Findings the cross-report marks as single-source are flagged plainly in the interpretation.
4. Lane discipline. A variant in prognostic_variants belongs in molecular summary + prognostic notes, NEVER in the final diagnosis line.
5. Diagnosis in classification terms. Apply the WHO Instructions block to translate combined findings into formal WHO/ICC language.
6. Carry limitations forward. Stated limitations from any modality go in limitations_pending.
7. Evidence trace integrity. Every sentence in interpretation + diagnosis has a trace row; any UNSUPPORTED row is a pipeline failure caught by the QA Reviewer.
Note that the WHO 5e classification rules themselves are NOT in this prompt — they're in the separate WHO Instructions Text Input node. Edit those rules directly in that node without touching this prompt.
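The evidence-trace rule is the easiest of the seven to check mechanically: every sentence needs a trace row, and no row may be UNSUPPORTED. The sketch below shows the idea under stated assumptions. The row shape (`sentence`, `source`) is hypothetical, the sentence splitter is deliberately naive, and this is not the QA Reviewer's actual implementation.

```python
import re

def split_sentences(text: str) -> list[str]:
    """Naive splitter: breaks after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def trace_problems(text: str, trace: list[dict]) -> list[tuple[str, str]]:
    """Report sentences with no trace row or with an UNSUPPORTED row."""
    by_sentence = {row["sentence"]: row for row in trace}
    problems = []
    for sentence in split_sentences(text):
        row = by_sentence.get(sentence)
        if row is None:
            problems.append(("MISSING_ROW", sentence))
        elif row["source"] == "UNSUPPORTED":
            problems.append(("UNSUPPORTED", sentence))
    return problems

# Invented interpretation text and trace rows.
text = ("Blasts are 18% by morphology and 22% by flow. "
        "NPM1 mutation is present. Outlook is favorable.")
trace = [
    {"sentence": "Blasts are 18% by morphology and 22% by flow.", "source": "morphology, flow"},
    {"sentence": "Outlook is favorable.", "source": "UNSUPPORTED"},
]
for problem in trace_problems(text, trace):
    print(problem)
```

A real checker would need a sturdier sentence splitter, since decimals and abbreviations defeat this one.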
When the output looks wrong
| Symptom | First place to look |
|---|---|
| UNSUPPORTED rows in Part B | Stage 2 wrote a sentence Stage 1's extraction didn't support. Tighten Rule 1 in the WHO Classifier prompt, or click the PDF Intake node after a run to inspect its output JSON for missing findings. |
| DNMT3A in the diagnosis line | Lane discipline slipped. Tighten Rule 4 in the WHO Classifier prompt. |
| Blast discordance not addressed | Stage 2 silently picked a number. Tighten Rule 2 in the WHO Classifier prompt. |
| Sections 1–7 look thin | Stage 1 didn't extract enough. Click the PDF Intake node after a run — its output is JSON you can read directly. |
You've just run pathology_report_integration end-to-end in Part 2. Now build the same flow from a blank canvas in six steps. The nine custom components are already in your sidebar; the WHO Instructions text is a stock Text Input. It takes about ten minutes of drag-and-wire, with no Python required.
Stuck? Open the completed pathology_report_integration from "Starter Project" side-by-side and compare your wiring.
[Chat Input] → [Pipeline Config] → [PDF Intake] ──┬─► [Morphology Parser]   ─┐
                                   (5 outputs)    ├─► [Flow Parser]         ─┤
                                                  ├─► [Cytogenetics Parser] ─┼─► [WHO Classifier] → [QA Reviewer] → [Report Formatter] → [Chat Output]
                                                  ├─► [Molecular Parser]    ─┤
                                                  └─► (cross-report data)   ─┤
                                                                             │
                                                  [WHO Instructions] ────────┘
                                                  (Text Input)
Build it from scratch — six steps
Each step adds one or two components, wires them, and ends with a small diagram showing what your canvas looks like so far. Drag-then-wire-then-drag, not all-drag-then-all-wire.
Step 1 — Start a blank flow and add the inputs
From the dashboard ("Starter Project"), click + New flow and pick the blank template.
Drag: from Input / Output, drag in Chat Input. From api_scenario_d on the left sidebar, drag in Pipeline Config.
Wire: Chat Input's message output → Pipeline Config's user_message input.
Step 2 — Add PDF Intake (Stage 1)
Drag: from api_scenario_d, drag in PDF Intake to the right of Pipeline Config.
Wire: Pipeline Config's run_config output → PDF Intake's run_config input.
Click the PDF Intake node — you'll see five output handles on its right edge: morphology_data, flow_data, cytogenetics_data, molecular_data, and cross_report_data. We wire them in the next step.
Step 3 — Add the four parallel parsers (the fan-out)
Drag: from api_scenario_d, drag in four parsers — Morphology Parser, Flow Parser, Cytogenetics Parser, and Molecular Parser. Stack them vertically to the right of PDF Intake.
Wire four edges: PDF Intake's morphology_data → Morphology Parser's morphology_data input. Then flow_data → Flow Parser, cytogenetics_data → Cytogenetics Parser, molecular_data → Molecular Parser.
This is the most visually striking part of the canvas: four wires emerging from a single PDF Intake node, each going to a dedicated parser. LangFlow only allows compatible types to connect, so any wrong wiring is visually rejected.
Step 4 — Add WHO Classifier (the fan-in)
Drag: from api_scenario_d, drag in WHO Classifier (Integrator) to the right of the four parsers.
Wire five edges: each of the four parsers' outputs into its matching input on WHO Classifier (morphology_synthesis, flow_synthesis, cytogenetics_synthesis, molecular), plus PDF Intake's cross_report_data output → WHO Classifier's cross_report input.
Step 5 — Add the WHO Instructions Text Input
Drag: from Input / Output, drag in a Text Input. Place it below the parsers.
Paste: open the Text Input node's right-side panel, find its Text field, and paste the WHO Instructions content from the collapsible block in the next subsection.
Wire: Text Input's text output → WHO Classifier's who_instructions input.
The Text Input is a separate node, not a system-prompt field on another node. Its content is the fourth editable knob in the pipeline (alongside PDF Intake's two prompts and WHO Classifier's system prompt).
Step 6 — Add the tail and run
Drag: from api_scenario_d, drag in QA Reviewer and Report Formatter. From Input / Output, drag in Chat Output.
Wire three edges: WHO Classifier's integrated output → QA Reviewer's integrated input. QA Reviewer's reviewed output → Report Formatter's reviewed input. Report Formatter's report output → Chat Output's input_value input.
Run: click Playground. Type run the aml case. Press send. Wait 60–90 seconds.
Stage 1 (PDF Intake) runs five LLM calls. The three LLM-backed parsers run the sixth, seventh, and eighth. The WHO Classifier runs the ninth. The QA Reviewer's deterministic checks run last. The full integrated report + Part B evidence trace + QA flags arrives in the chat panel.
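If you want to double-check your wiring on paper, the six steps reduce to fifteen edges. The sketch below writes them as (source node, target node, target input) triples and derives a valid run order with the standard library. Input names come from the steps above where they are stated; the Flow, Cytogenetics, and Molecular Parser input names are assumed to follow the same pattern as the Morphology Parser's.

```python
from collections import Counter
from graphlib import TopologicalSorter  # Python 3.9+

EDGES = [
    ("Chat Input", "Pipeline Config", "user_message"),              # Step 1
    ("Pipeline Config", "PDF Intake", "run_config"),                # Step 2
    ("PDF Intake", "Morphology Parser", "morphology_data"),         # Step 3
    ("PDF Intake", "Flow Parser", "flow_data"),                     # assumed name
    ("PDF Intake", "Cytogenetics Parser", "cytogenetics_data"),     # assumed name
    ("PDF Intake", "Molecular Parser", "molecular_data"),           # assumed name
    ("Morphology Parser", "WHO Classifier", "morphology_synthesis"),    # Step 4
    ("Flow Parser", "WHO Classifier", "flow_synthesis"),
    ("Cytogenetics Parser", "WHO Classifier", "cytogenetics_synthesis"),
    ("Molecular Parser", "WHO Classifier", "molecular"),
    ("PDF Intake", "WHO Classifier", "cross_report"),
    ("WHO Instructions", "WHO Classifier", "who_instructions"),     # Step 5
    ("WHO Classifier", "QA Reviewer", "integrated"),                # Step 6
    ("QA Reviewer", "Report Formatter", "reviewed"),
    ("Report Formatter", "Chat Output", "input_value"),
]

inputs_per_node = Counter(dst for _, dst, _ in EDGES)
outputs_per_node = Counter(src for src, _, _ in EDGES)

# TopologicalSorter expects {node: set of predecessors}.
graph = {}
for src, dst, _ in EDGES:
    graph.setdefault(dst, set()).add(src)
order = list(TopologicalSorter(graph).static_order())

print(inputs_per_node["WHO Classifier"])  # 6
print(outputs_per_node["PDF Intake"])     # 5
print(order[-1])                          # Chat Output
```

A canvas with fewer than six edges into the WHO Classifier, or fewer than five out of PDF Intake, is missing a wire.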
WHO Instructions — the text to paste in Step 5
This is the AML excerpt of WHO 5e classification rules. Copy-paste into the Text Input node's Text field. The full version with glioma / medulloblastoma / breast covered too is at docs/who5e_instructions.md in the workshop repo.
Paste content WHO Instructions — AML excerpt (~35 lines) (click to expand and copy)
Test it
Wait ~60–90 seconds. The full integrated report + Part B trace + QA flags arrive in the chat panel.
Edits worth trying once it runs
Each edit changes one editable text block, then you re-run run the aml case and watch what shifts in the output. Edits don't modify Python — they live in the System Prompt or Text Input fields you can see in each node's right-side panel.
Add a new tumor family to WHO Instructions
Open the Text Input node. Add a new section for a tumor family not currently covered (e.g., lymphoma or sarcoma). The integrator will use whatever rules are in the Text Input. (Loading PDFs from a new family also needs an entry in tools/scenario_d/pdf_io.py's case manifest, but the classifier prompt change itself is just editing the Text Input.)
Tighten lane discipline
In the WHO Classifier's system prompt, find Rule 4 (lane discipline). Add the sentence below. Re-run and see if non-classifying variants now stay out of the per-modality summaries too, not just the diagnosis line.
Change a per-source extraction prompt
Click PDF Intake → Per-Source Extraction Prompt. Add the sentence below. Re-run and watch the Molecular Parser's downstream output carry VAF numbers consistently across all variants.
Alternative code to try
Each block below is a paste-able alternative that replaces a specific prompt or Python file. Try one at a time, re-run, and see what changes. None of these are required — they're invitations to experiment.
Alt A · System prompt Cytogenetics Parser — refuse to interpret, ISCN only (click to expand)
Replace the default Cytogenetics Parser system prompt with the block below. The parser will quote karyotypes verbatim and stop hedging or summarizing in clinical language.
Paste into: Cytogenetics Parser → System Prompt textarea.
What changes: when you re-run the AML case, the cytogenetics_summary in section 6 of the integrated report becomes terser and more nomenclature-faithful (e.g., it'll say "46,XY[20]" verbatim instead of "normal male karyotype with 20 metaphases analyzed").
Alt B · System prompt Morphology Parser — 2-sentence summary instead of 4–7 (click to expand)
The default Morphology Parser writes a 4–7 sentence paragraph. This variant collapses it to two sentences.
Paste into: Morphology Parser → System Prompt textarea.
What changes: section 4 (Morphology summary) of the integrated report becomes a tight two-sentence summary. The lane-discipline and Part B evidence trace are unaffected — they live in the WHO Classifier, not here.
Alt C · System prompt PDF Intake — add a third "VUS" bucket to molecular classification (click to expand)
The default per-source extraction marks each molecular variant as classifying=true|false — binary. This alternative adds a third state: classifying="vus" for variants of uncertain significance the WHO doesn't yet classify either way.
Paste into: PDF Intake → Per-Source Extraction Prompt textarea (this is the variant of Rule 4 only; the rest of the default prompt stays).
What changes: the Molecular Parser will now create a third array vus_variants alongside classifying_variants and prognostic_variants. The WHO Classifier ignores VUS by default — but you can tell it to add a "VUS observed" sentence to the prognostic notes by editing Rule 4 of the WHO Classifier prompt accordingly.
Caveat: the Molecular Parser's Python (which does the split) only knows about true|false today. To actually surface the VUS bucket downstream, you'd also need to edit d2_molecular_parser.py — see Alt D below.
Alt D · Python code Molecular Parser — handle the new "vus" bucket from Alt C (click to expand)
If you applied Alt C above, the Molecular Parser's existing Python only recognizes true|false and lumps any "vus" variant into the prognostic bucket. This alternative changes the parser to surface a third vus_variants array.
Paste into: Molecular Parser → Code tab (advanced). The Code tab shows the underlying Python; replace the body of run_split with the snippet below.
What changes: the Molecular Parser's output Data now carries a vus_variants field. The WHO Classifier doesn't read it yet — you'd need to extend the Classifier prompt and inputs to do something with it.
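For orientation, a three-way split of this kind might look like the sketch below. It is not the workshop's run_split body: the function name `split_variants`, the variant dict shape, and the `EXAMPLE_GENE` entry are hypothetical, and only the three bucket names and the true / false / "vus" values come from Alt C and Alt D.

```python
def split_variants(variants: list[dict]) -> dict[str, list[dict]]:
    """Sort variants into classifying, prognostic, and VUS buckets."""
    buckets = {
        "classifying_variants": [],
        "prognostic_variants": [],
        "vus_variants": [],
    }
    for variant in variants:
        flag = variant.get("classifying")
        if flag is True:
            buckets["classifying_variants"].append(variant)
        elif isinstance(flag, str) and flag.lower() == "vus":
            buckets["vus_variants"].append(variant)
        else:
            # false, or anything unrecognized, stays in the prognostic lane
            buckets["prognostic_variants"].append(variant)
    return buckets

variants = [
    {"gene": "NPM1", "classifying": True},
    {"gene": "FLT3-ITD", "classifying": True},
    {"gene": "DNMT3A", "classifying": False},
    {"gene": "EXAMPLE_GENE", "classifying": "vus"},  # placeholder, not a real finding
]
buckets = split_variants(variants)
print([v["gene"] for v in buckets["vus_variants"]])  # ['EXAMPLE_GENE']
```

Sending unrecognized flags to the prognostic bucket mirrors the current true|false behavior described above, so the diagnosis line is never widened by a malformed value.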
Bonus reference — Scenarios A, B, C
These three scenarios are in your account as bonus material. They're not part of the live talk but the patterns are worth seeing if you want to explore after the workshop.
| Scenario | What it does | The pattern it shows |
|---|---|---|
| A — Variant Tournament | Ranks germline variants against rubric criteria using an LLM "tournament judge." | LLM-as-judge with a structured rubric; parallel evidence fetches (ClinVar, gnomAD, PubMed). |
| B — Longitudinal Ghost | Finds contradictions in a 14-note patient timeline. The "ghost": a 2022 note documenting tamoxifen use that the current request contradicts. | Temporal synthesis + rule-based contradiction detection. Includes an HITL gate. |
| C — Digital Thread | Routes 30 pending cases to 8 subspecialty pathologists. The "trap": two pathologists are over-loaded by design. | LLM-decides-then-deterministic-rule-clamps. Workshop-relevant fatigue-cap edit. |
Open any of them from "Starter Project" and click Playground. Each scenario's prompts and editable levers are documented in the repo at langflow_flows/components/api_scenario_{a,b,c}/.
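The "LLM-decides-then-deterministic-rule-clamps" pattern from Scenario C is worth seeing in miniature. In the sketch below the LLM's part is represented by a ranked candidate list per case, and a plain rule then enforces a per-pathologist cap. The function name, case IDs, pathologist names, and cap value are all hypothetical; the actual Scenario C components may work differently.

```python
from collections import Counter
from typing import Optional

def clamp_assignments(proposals: dict[str, list[str]], cap: int) -> dict[str, Optional[str]]:
    """Give each case its highest-ranked pathologist who is still under the cap."""
    load = Counter()
    final = {}
    for case_id, ranked in proposals.items():
        chosen = next((p for p in ranked if load[p] < cap), None)
        if chosen is not None:
            load[chosen] += 1
        final[case_id] = chosen  # None means no candidate had capacity left
    return final

# Stand-in for LLM output: each case lists preferred pathologists, best first.
proposals = {
    "case-01": ["dr_a", "dr_b"],
    "case-02": ["dr_a", "dr_b"],
    "case-03": ["dr_a"],
}
print(clamp_assignments(proposals, cap=1))
# {'case-01': 'dr_a', 'case-02': 'dr_b', 'case-03': None}
```

The model keeps the judgment call (who fits the case) while the cap stays a hard rule it cannot talk its way past.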
Troubleshooting
| Symptom | Likely cause / fix |
|---|---|
| "No model selected" | Click the Agent node → Language Model dropdown → pick OpenAI + openai/gpt-4.1-mini. |
| Node turns red with a "!" badge | Click the badge for the error. Most common is a rate-limit timeout — wait 60 seconds and retry. The proxy caps each attendee at 30K tokens/min. |
| "401 Unauthorized" or auth error | Check the Agent's API Key field is blank. Pasting a key overrides the KeyBroker auth. |
| Connection dropped from a tool to the Agent | Check Tool Mode is ON on the tool node. The toggle is in the node header. Without it the tool exposes JSON, not the Tool handle. |
| Flow runs but the output looks empty / truncated | Click each node after the run and inspect its output panel. You'll find which stage produced nothing — usually that's where to investigate. |
| Edited a prompt and the result got worse | Click the node's Reset button (top of its detail panel) to restore the default prompt. |
| Last resort | Wave at the facilitator. They have a reset-attendee script that rebuilds your flows from scratch in ~10 seconds. |
Authors + repo
- Hesam H. Javadi, Ph.D. (Medical College of Wisconsin · Children's Wisconsin). Workshop infrastructure, LangFlow components, slides, handbook.
- Srikar Chamala, Ph.D. (Children's Hospital of Los Angeles · USC Keck School of Medicine).
- Omar Baba, MD (Clinical Pathologist, Pathologist Informaticist, Henry Ford Health System). AML case design, planted pedagogical features, Stage 1 and Stage 2 system prompts (see docs/Integrated_report_demo_Omar/).
Workshop infrastructure: LangFlow 1.9.2 · OpenRouter via in-house KeyBroker proxy · Phoenix (currently disabled) · pre-provisioned attendee accounts on pi-2026-workshop.javadilab.org.
Repository: github.com/hesamhakim/agentic-pathology-workshop
- Slide deck: docs/slides/AI Agentic workflow case study - API summit 2026/Beyond the Chatbot - API Summit 2026.html
- This handbook: docs/attendee_handbook.html
- Component sources: langflow_flows/components/
- Case PDFs and ground truth: data/scenario_d/