Sprint Two: Map and Decide

Operational Task Mapping

A continuous activity with three visible touchpoints across Weeks 1-3 plus a workshop touchpoint. A selection conversation with the CEO picks 3-4 functions to study. Per-function task work produces a Function Task Brief per lead. The workshop touchpoint converts the master task table into a ranked candidate set that the executive team scores and uses to select the first pilot.

Module info: Weeks 1-3 + workshop touchpoint
21-25h facilitator
1h CEO + workshop participation
2-2.5h per functional lead
30 min per frontline doer (2 of 4)
60-90 min per exec at workshop
0h board

When: Selection conversation in Week 1; per-function task work Weeks 2-3; workshop touchpoint during the single Sprint Two workshop trip (4-5 hours single block OR ~6 hours across two consecutive days; one trip to Septapod's headquarters).

Time per person

Facilitator (Brent): 21-25 h. Selection conversation (2h) + per-function work across 4 leads (10-12h) + observation layer 2 of 4 (2h) + scoring and master table (5h) + workshop segment design and facilitation (2h).

CEO: 1 h + workshop. 60-min selection conversation in Week 1, plus workshop participation as part of her overall workshop time (no incremental hours added beyond what the workshop already costs).

Each functional lead (3-4 selected): 2-2.5 h. 60-90 min Function Task Brief pre-work (async default, Brent-facilitated when helpful) + 45-60 min calibration session with Brent + 3 min presentation at workshop. Add ~30 min if Brent runs the pre-work as a guided session for this lead.

Each frontline doer (2 of 4 functions): 30 min. Observation by Brent during natural work flow. No prep, no follow-up.

Each exec in workshop touchpoint: 60-90 min. Added to existing workshop participation for the candidate-review segment (lead presentations + scoring + 10-15 min AI-fit gate + pilot selection).

Each board member: n/a. Not engaged in this module.

What actually happens

In Week 1, Brent and the CEO meet for 60 minutes to choose 3-4 functions to study, applying four written criteria (volume of work, drag points named in survey or CEO interview, governance risk profile against the AI Policy, strategic relevance to a Program of Work). In Weeks 2-3, each lead completes a Function Task Brief async (60-90 min) or in a Brent-facilitated session, then a 45-60 minute calibration session refines tags and surfaces drag. For 2 of the 4 functions, Brent observes a frontline doer for 30 minutes before the lead's calibration. Brent maintains a master task table (Google Sheet) populated from the Function Task Briefs, scored on impact and feasibility (5-point each, composite 1-25). The workshop touchpoint surfaces 12-15 minutes of lead presentations, exec team scoring against the rubric, a 10-15 minute AI-fit and learning-quality gate on the finalists, and pilot selection from the top tier.

Through-line

Generates: A Function Task Brief per functional lead (3-4 of them, held by their owners as printed documents). A master task table (Google Sheet) populated with 50-70 tasks across the functions, three-way tagged (Automate / Augment / Preserve Human) and scored. A ranked candidate set the workshop opens with. An AI-fit and learning-quality read on the top finalists. A first pilot selected with documented rationale.
Value to Septapod: The executive team picks the first pilot from data the leads themselves built, not from intuition or politics. The finalist gate distinguishes work the credit union can implement from work where current AI is a strong fit and the pilot can produce a fast, trustworthy learning signal. Functional leads emerge with a tangible artifact in their function and a lens for seeing automation opportunities they can keep using. The master task table becomes a Septapod reference asset that survives the engagement and informs every future AI conversation. The three-way tag lands the values dimension where it belongs, protecting mission-aligned work without forcing every decision into a values audit.
How Septapod uses it: The pilot selection anchors Sprint Three. The decisive unknown from the finalist gate becomes the first pilot's learning charter. The master task table becomes the reference Brent and the team return to for second and third pilot selection. The Function Task Briefs become each lead's starting point for any future automation work in their function.
Feeds into: Step 2 (Workshop) consumes the candidate set as the workshop's pilot-selection material. Sprint Three carries the finalist's decisive unknown into its machine and practice hypotheses. Later pilots are selected from the ranked remaining candidates after the workshop's first selection. The Annual AI Plan inherits the master task table as a foundational reference document. Step 3 (Direction Synthesis) surfaces the OTM outputs in its synthesis sections.

Research & methods anchors

Task-level mapping draws on LIMRA's three-phase AI Automation Identification Framework (Job Inventory → Task Analysis → Capabilities Mapping). Function Task Brief structure adapts BAMPT's workflow documentation template. Scoring uses BAMPT's Automation Opportunity Scorecard frame. Tacit-knowledge interrogation prompts in the calibration session draw on the AI Use Case Discovery Whitepaper. The three-way tag (Automate / Augment / Preserve Human) extends Google PAIR Guidebook's automate-vs-assist frame with an explicit values-protection tag for mission-aligned work. The agentic-AI framing aligns with Anthropic's Building Effective Agents pattern catalog.

LIMRA AI Automation Identification Framework: three-phase task decomposition (Job Inventory, Task Analysis, Capabilities Mapping). Worked CU examples for the three-way tagging rubric come from LIMRA's published task examples, adapted to credit union operational vocabulary.
BAMPT Workflow Documentation Template: eight-part workflow documentation structure adapted for the Function Task Brief.
BAMPT Automation Opportunity Scorecard: impact and feasibility scoring with concrete anchors per score level.
AI Use Case Discovery Whitepaper: tacit-knowledge and undocumented-workarounds interrogation prompts for the calibration session.
Google PAIR Guidebook: automate-vs-assist frame extended to the three-way tag.
Anthropic Building Effective Agents: agentic-AI pattern catalog informing the Automate tag's "agent as system of record" framing.

Week 1: Selection conversation
60 min, remote, Brent + the CEO (plus anyone the CEO invites)

Pick 3-4 functions to study, with criteria written down, so the selection traces to documentation the CEO can hand to anyone who asks why these functions and not others. The CEO decides who else (a VP Risk lead, etc.) joins the conversation.

Goal: Pick 3-4 functions to study, with criteria written down, so the selection traces to documentation the CEO can hand to anyone who asks why these functions and not others.
Facilitation moves: Open with the survey synthesis and CEO interview signal (drag points named, areas of concern, comfort distribution). Walk the four criteria as evaluation lenses, applying each to the candidate functions. Capture rationale in writing during the conversation. Agree on observation opt-ins for 2 of the 4 functions. Assign async-vs-facilitated pre-work mode per lead (see Engagement modes card). Send lead-notification emails within 24 hours.
Success indicators: The CEO and any others she invited agree on the 3-4 functions. Written rationale exists for each. Each function has an assigned pre-work mode. Leads have been notified and confirmed. Observation invitations have been sent for 2 of the 4 functions.
If drift, repair: Function selection feels political rather than criteria-driven: revisit the criteria explicitly; if a function does not satisfy any criterion, surface that to the CEO. The CEO defers selection without reaching agreement: schedule a 30-min follow-up; do not proceed to Touchpoint 2 with ambiguous selection. Lead expresses concern or push-back on being selected: 15-min call with the lead before Touchpoint 2 begins to address concerns.

Four selection criteria

Volume of work in the function. Does this function consume meaningful capacity that AI could reshape? Look at FTEs allocated, hours per week on routine tasks, throughput numbers Brent has from the Vendor AI Audit and the foundation pillar scores.
Drag points named in the executive survey or CEO interview. Has this function been surfaced as a friction point in Sprint One's earlier work (survey responses on Q4 about what changed in the past 12 months, Q5 task naming, or CEO interview Q3 about a recent stall-or-move-forward moment)?
Governance risk profile against the AI Policy. Does this function fall under either of the AI Policy's board-approval categories (member-facing AI displacing human agents; AI making autonomous access decisions)? Functions with high governance complexity may need more careful pilot design and benefit from being mapped early.
Strategic relevance to a Program of Work. Does this function tie to a named Program of Work in the strategic plan (Annual AI Plan, Copilot rollout, member-service initiatives, etc.)?

Selection rationale (write here during the conversation)
Functional leads identified per function
Observation opt-ins (2 of 4)

Multi-branch scope decision

If the credit union operates across multiple branches, functions may operate identically across branches, or they may have meaningful branch-level variation. Decide here whether the OTM work is institution-level (default) or accounts for branch-level differences for one or more functions. Default to institution-level unless the CEO flags branch variation as material for a specific selected function.

Institution-level (default). OTM treats each function as a single entity across all branches. Branch-specific operational variation surfaces in pilot design (Sprint Three) if it appears, not in the OTM module.

Branch-aware for one or more functions. The lead documents branch-level variation in the Function Task Brief (e.g., lending at one branch vs another). Add 30-45 min to that lead's calibration session for branch-by-branch task review. Per-functional-lead time budget shifts from 2-2.5h to 2.5-3.25h for branch-aware leads.

If branch-aware: which functions and why

Lead-notification email template

Suggested copy. Adapt per lead.

Hi [Lead name],

The CEO and I are running a focused piece of work in Sprint Two of the Septapod AI strategy engagement, and we'd like your participation. We'd love to spend ~2 hours with you over the next two weeks mapping the operational tasks in your function and identifying where AI might fit (and where it deliberately should not).

Here's what's involved:

- A short template I'll send you in the next day or two, which you'll spend 60-90 minutes filling in async between now and our calibration session [OR for Brent-facilitated leads: which we'll work through together in a 60-90 minute guided session].
- A 45-60 minute calibration session with me on [date/time TBD] where we walk through what you've documented and tag each task on a three-way decision (Automate, Augment, Preserve Human).
- A 3-minute presentation of your function's map to the executive team at the workshop on [date].

I'll also send a separate note about observation, if your function is one of the two we're shadowing.

The goal is for you to leave with a tangible artifact about your function and a lens for seeing automation opportunities you can keep using. The CEO and I picked your function because of [criterion from selection rationale].

Let me know if you have questions, or if [date/time] doesn't work for the calibration session.

Brent

Observation-invitation template (for 2 of 4 functions)

Suggested copy. Adapt per function.

Hi [Doer name],

I'm working with [Lead name] in [function] as part of the Septapod AI strategy engagement. With [Lead name]'s OK, I'd like to spend 30 minutes watching you work through [a typical example of the task] sometime in the next two weeks.

This is observation, not assessment. I'm watching for the moves that don't show up in process documents: keyboard shortcuts, workarounds, the actual decision rhythm. What I see feeds into the conversation [Lead name] and I will have, and ultimately into the executive team's pilot selection.

You don't need to prepare anything. I'll come find you at [time you suggest] and just watch what you'd normally do for that half hour. Stop me anytime if it gets in the way.

Let me know what works on your end.

Brent

Why this is here

Source: Addresses the political-selection problem identified in the adversarial review of the proposal. Without criteria-based selection, function selection becomes "the loudest exec's function gets mapped." The four criteria here are anchored to existing governance (AI Policy authorities) and strategic frame (Programs of Work in the strategic plan).

Failure mode prevented: Faction-driven selection that bakes the executive team's existing political dynamics into the candidate set before mapping even begins.

Facilitation move: The criteria are read aloud and applied transparently to each candidate function. Rationale is captured visibly during the conversation, not after.

Watch for: The CEO treating the criteria as a checkbox exercise rather than a real evaluation. Push back gently if needed. If the CEO invites others into the conversation, the same criteria-driven discipline applies to them.

Weeks 2-3: Materials and method
What each functional lead receives and how the work is shaped

Each lead receives a packaged set of materials: the Function Task Brief template, the three-way tagging rubric, a one-page method packet drawn from the LIMRA AI Automation Identification Framework, and a framing message. Brent does NOT pre-populate function-specific task lists. The lead does the task decomposition themselves using the framework and rubric provided.

Goal: Equip each functional lead to break their function down into 12-18 atomic tasks and apply the three-way tag (Automate / Augment / Preserve Human) to each. Brent provides the framework, rubric, and template; the lead provides the function-specific operational knowledge.
Facilitation moves: Send the materials within 24 hours of the selection conversation (per U2). Include the lead-notification email's framing about what to complete async vs in the calibration session. For leads assigned to facilitated mode, schedule the guided pre-work session before sending the template (they complete it together with Brent, not alone).
Success indicators: Each lead has received the template, the rubric, and the method packet within 24 hours of selection. Each lead knows their pre-work mode (async vs facilitated). Each lead has a scheduled calibration session date.
If drift, repair: Lead doesn't respond within 3 business days: follow up directly (call, not email). Lead pushes back on the task-decomposition framework as too abstract: switch to facilitated mode and walk them through the first 3-4 tasks together. Lead's function appears too narrow for 12-18 tasks: revisit selection rationale; consider whether the function should be re-scoped or merged with an adjacent function.

Three-way tagging rubric

For each task, the lead applies one of three tags. The borderline subsection addresses tasks that sit between two tags or shift over time.

Automate

Remove the human; the agent runs this task end-to-end with exception-only human review.

Decision question: "Could the agent be the system of record for this task, with humans intervening only on exceptions?"

Worked CU examples populate from U8 (LIMRA-adapted research + Brent's review).

Augment

Keep the human in the loop; AI assists. The human decides; AI drafts, retrieves, classifies, summarizes, or suggests.

Decision question: "Does the value come from a human's judgment, with AI making that judgment faster or better-informed?"

Worked CU examples populate from U8.

Preserve Human

Keep fully human; AI deliberately stays out of the path.

Decision question: "Is this task one where AI presence would erode something Septapod is trying to preserve (member trust, mission integrity, judgment under uncertainty)?"

Worked CU examples populate from U8.

Borderline tasks

When a task sits between two tags: apply the decision question for the more conservative tag first (Preserve Human, then Augment, then Automate). If the answer is uncertain, default to the more conservative tag and capture the uncertainty in the lead's notes. Brent and the lead revisit borderline tags during the calibration session; tags can be revised based on conversation. The borderline subsection populates with specific worked examples drawn from U8.
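
The conservative-first rule above can be sketched in a few lines of Python. This is an illustrative sketch only, not part of the module's tooling: the function name `resolve_borderline`, the `answers` mapping, and the fallback when every decision question is answered "no" are assumptions introduced here.

```python
# Most conservative tag first, following the order stated in the rubric.
CONSERVATIVE_ORDER = ["Preserve Human", "Augment", "Automate"]


def resolve_borderline(candidates, answers):
    """Pick a tag for a task that sits between several candidate tags.

    candidates: the tags the task sits between.
    answers: maps a tag to True (decision question answered yes),
             False (answered no), or None / missing (uncertain).
    Returns (tag, needs_calibration_note).
    """
    ordered = [tag for tag in CONSERVATIVE_ORDER if tag in candidates]
    if not ordered:
        raise ValueError("candidates must include at least one known tag")
    for tag in ordered:
        answer = answers.get(tag)
        if answer is True:
            return tag, False
        if answer is None:
            # Uncertain: default to the more conservative tag and record
            # the uncertainty in the lead's notes.
            return tag, True
    # Assumption: if every question was answered "no", keep the least
    # conservative candidate but flag it for the calibration session.
    return ordered[-1], True


# Hypothetical example: the lead is unsure whether the task is Augment.
print(resolve_borderline(["Automate", "Augment"], {"Augment": None}))
```

A flagged result is a reminder to revisit the tag with the lead during calibration, not a final decision.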

Function Task Brief template

Each lead's owned artifact for their function. The brief is a standalone HTML page intentionally distinct from the wizard's app register, designed as a printed business document with serif typography, generous margins, masthead, and signature block.

→ Open the Function Task Brief template (opens in new tab; print to PDF for the lead's records)

Why this is here

Source: The three-way tagging rubric extends Google PAIR Guidebook's automate-vs-assist frame with an explicit values-protection tag (Preserve Human) appropriate for a CDFI credit union whose mission constrains automation. Task-level decomposition follows LIMRA's three-phase framework (Job Inventory, Task Analysis, Capabilities Mapping). The Function Task Brief template structure adapts BAMPT's workflow documentation template.

Failure mode prevented: A binary Automate-vs-don't-automate frame loses the values dimension entirely. Tagging the values dimension separately at Touchpoint 2 (not at the workshop as an audit) keeps mission-aligned tasks explicit from the start.

Facilitation move: Brent does not pre-populate function-specific task lists. The lead's own knowledge is the source of truth for what tasks exist. Brent provides the framework, rubric, and template, and verifies, interrogates, and refines during the calibration session.

Watch for: A lead who tags everything Automate (over-eagerness to free capacity; surface in calibration). A lead who tags everything Preserve Human (defensiveness about the function; surface gently). A lead who can't break their function down past 4-5 tasks (the framework hasn't landed; switch to facilitated mode for the next session).

Weeks 2-3: Engagement modes
How Brent engages with each lead individually

Two decisions Brent makes per lead during or right after the selection conversation: which pre-work mode the lead uses (async or facilitated), and whether the function gets a 30-minute observation layer (2 of 4 functions, opt-in).

Goal: Match each lead to a pre-work mode they'll actually complete, and surface 2 functions where Brent observes a frontline doer to feed the calibration session with tacit-knowledge probes.
Facilitation moves: Default each lead to async mode. Switch to facilitated mode when the lead's function is unfamiliar to them at the task level, the lead's schedule rules out 60-90 min of focused async work, the lead's thinking is known to unlock better in conversation than alone, or the lead specifically requests it. For observation: nominate the 2 functions during the selection conversation; send observation invitations within 24 hours; brief the doer before observation begins.
Success indicators: Each lead has an assigned pre-work mode and knows what to expect. For async-mode leads: the template has arrived. For facilitated-mode leads: the guided session is scheduled. 2 of 4 functions have confirmed observation invitations.
If drift, repair: An async-mode lead doesn't complete pre-work by the calibration session date: convert that lead's calibration session to a facilitated pre-work session followed by an abbreviated calibration. A nominated observation function declines: substitute the next-priority function. Fewer than 2 functions consent to observation: the calibration session for non-observed functions uses enhanced interrogation prompts (longer drag-point probing, more open-ended workflow walking) in place of observation probes.

Pre-work mode per lead

Observation layer (2 of 4 functions)

30 minutes shadowing a frontline doer in their natural work flow, BEFORE the lead's calibration session. Brent watches for keyboard shortcuts, workarounds, system-switching patterns, decision rhythm, and the moves the lead would not think to write down. Notes feed the calibration session as specific probes.

Why this is here

Source: The async-default-with-facilitated-fallback design draws on the AI Use Case Discovery Whitepaper's observation that pre-work quality varies by participant; structured options work better than rigid uniformity. The observation layer addresses the tacit-knowledge gap: lead-reported task descriptions hide the moves agents are actually good at (keyboard shortcuts, system-switching, copy-paste-validate loops). 30 minutes of observation surfaces what 60 minutes of interview would miss.

Failure mode prevented: All-async pre-work forces every lead through a self-directed task decomposition; some leads stall and arrive at the calibration session without a usable artifact. All-facilitated pre-work doubles Brent's time and undermines lead ownership. The hybrid matches each lead to what works for them.

Facilitation move: Brent picks the mode per lead, ideally during the U2 selection conversation. The lead-notification email reflects the assigned mode. Observation is a separate opt-in conversation with each nominated lead.

Watch for: A lead who would benefit from facilitated mode but doesn't want to ask for the help: assign anyway based on Brent's read. A doer who agrees to observation but later seems uncomfortable on the day: make the observation shorter or skip it; trust matters more than data.

Weeks 2-3: Calibration session
45-60 min per lead, with Brent

The synchronous moment where Brent and the lead interrogate the lead's pre-work together, refine tags, surface drag, and draft the synthesis paragraph. Five phases.

Goal: Convert the lead's pre-work into a finalized Function Task Brief with accurate tags, surfaced drag points, captured tacit knowledge, and a synthesis paragraph the lead signs off on.
Facilitation moves: Open with a 5-minute review of the lead's draft. Read it back to confirm understanding. Then interrogate (15-20 min) using the prompts below. Then validate tags (10-15 min): apply the decision question for each task, push back when something feels mis-tagged. Then draft the synthesis paragraph together (10 min). Then lead sign-off (5 min). For observed functions, weave observation notes into the interrogation phase as specific probes.
Success indicators: The lead's Function Task Brief is signed off and exported to PDF. Brent has populated the master task table from the brief. Tags feel defensible: the lead can explain each one. Drag points are named (not just inferred). Tacit knowledge surfaced: at least 2-3 things the lead didn't write down but mentioned during conversation.
If drift, repair: The lead pushes back on a tag Brent wants to change: revisit the decision question for that tag; if the lead's reading is reasonable, accept it and note the variance for the workshop. The session runs long: drop the synthesis paragraph drafting (Brent writes a v0 after, lead reviews async). The lead seems guarded about a task: shelve it; sometimes the right move is to come back to it after building trust. A task the lead tagged turns out to not actually exist in the function: drop it; document the surprise.

Session structure (45-60 min)

Opening review (5 min). Brent reads the lead's draft Function Task Brief back to them: function summary, 3-4 most prominent tasks, lead's initial tag rationale. The lead corrects anything Brent misread.
Structured interrogation (15-20 min). Brent walks through the tasks using interrogation prompts (below). For observed functions, observation notes become specific probes (e.g., "I saw you switch between two systems several times on that inquiry; what was that?"). The lead surfaces drag points and tacit knowledge that didn't make the pre-work template.
Tag validation (10-15 min). For each task with a non-obvious tag, Brent and the lead apply the decision question for that tag together. Tags get refined. Borderline tasks are flagged.
Synthesis paragraph drafting (10 min). Together, Brent and the lead draft 3-5 sentences naming the function's strongest AI candidates and the principles that should guide deployment in this function. Drafted live in the document, not after.
Lead sign-off (5 min). The lead reads the full brief back. Any final corrections. The lead signs the brief. Brent confirms it's now their owned artifact.

Interrogation prompts (from AI Use Case Discovery Whitepaper, adapted)

Drag prompts: "When in your week is the most frustrating moment in this task? What's happening?" / "What's the workaround you've built that wasn't in the documentation?"
Tacit-knowledge prompts: "If you stopped doing this tomorrow, who would notice first? How would they know?" / "What's the thing you do that the new hire would take six months to learn by watching?"
Tag-pressure prompts: "If we automated this completely, what would break that you couldn't see today?" / "If we deliberately kept AI out of this, who would be served better, and who would be served worse?"
Borderline prompts: "Why isn't this one easier to tag?" / "What changes about this task in a year? In five?"

Per-lead session notes

Why this is here

Source: The calibration session structure is interrogation-not-confirmation; that distinction comes from the AI Use Case Discovery Whitepaper's note that structured probing surfaces what survey instruments miss. The five-phase timing balances depth with the 45-60 min budget that lead schedules realistically support.

Failure mode prevented: A confirmation-shaped session ("does this look right to you?") accepts the lead's pre-work as-is. The lead leaves feeling validated; Brent leaves without the drag, tacit, or borderline content the workshop needs.

Facilitation move: Brent leads with interrogation prompts, not with re-reading the document. Observation notes (for 2 of 4 functions) become specific probes that move the conversation from abstract to concrete.

Watch for: A lead who treats the session as a status update rather than a conversation: re-anchor with a tag-pressure prompt. A session that runs past 60 minutes: protect the synthesis paragraph slot by cutting interrogation rather than skipping the synthesis. A lead who wants to defer sign-off "until I think about it more": accept once, follow up within 48 hours.

Weeks 2-3: Master task table
Brent's view across all functions; lives as a Google Sheet

As each calibration session completes, Brent populates a master task table that aggregates all functions' tasks into one sortable, filterable view. The table is the synthesis artifact that converts per-function briefs into a ranked candidate set the workshop opens with.

Goal: Maintain a master task table that aggregates 50-70 tasks across the 3-4 mapped functions, tagged on the three-way decision, scored on impact and feasibility, sortable by composite score, with the top tier visible as the candidate set for the workshop's pilot selection.
Facilitation moves: Create the Sheet from a Brent-maintained template before Touchpoint 2 begins. Populate per function as each calibration session completes (typically within 24 hours). Score impact (1-5) and feasibility (1-5) per task using the anchored criteria; composite = impact × feasibility. After the last calibration session, transcribe the three-way tag totals and the top 5 candidates into the prototype inputs below so the Diagnosis Summary populates.
Success indicators: The Sheet URL is in the input below. All 50-70 tasks present with function, tag, impact, feasibility, composite. The top tier (composite of 16 or higher, typically) is visible as the candidate set. Three-way tag totals transcribed. Top 5 candidates transcribed.
If drift, repair: The Sheet falls behind the calibration sessions: catch up the same day; the master table is the workshop's input and cannot be incomplete on the day of the workshop. Scoring feels arbitrary: re-anchor against the 5-point criteria; document Brent's reasoning in the lead's notes column so the exec team can challenge it at the workshop. The top tier ends up smaller or larger than expected (under 3 or over 10): surface in the calibration synthesis paragraph. Sometimes the function genuinely has few candidates, and that's data.

Sheet structure (columns)

Sample rows are intentionally NOT shown here in the prototype. The Sheet is the source of truth; this card holds the column structure plus the link and the transcribed summary data Brent fills in after scoring.

Scoring approach

Impact (1-5): anchored to volume × strategic relevance × drag burden. 1 = low-volume task with no strategic tie and no drag. 3 = moderate volume OR moderate strategic relevance OR meaningful drag. 5 = high-volume task tied to a named Program of Work with significant drag.

Feasibility (1-5): anchored to data readiness × vendor availability × governance complexity × technical lift. 1 = no data readiness, no off-the-shelf vendor, complex governance, heavy technical lift. 3 = mixed. 5 = data is clean and accessible, off-the-shelf vendor exists, governance authority is clear from the AI Policy, lift is light.

Composite: impact × feasibility (1-25). Top tier (composite of 16 or higher, typically) becomes the first-pilot candidate set.
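
As a check on the arithmetic, the composite and top-tier rule can be sketched in Python. This is a minimal illustration, not the Sheet's actual formulas: the row dictionaries, task names, and function names below are hypothetical, and the threshold of 16 is the "typical" value from the text.

```python
TOP_TIER_THRESHOLD = 16  # "typically" per the scoring approach; adjustable


def composite(impact, feasibility):
    """Multiply two 1-5 scores into a 1-25 composite."""
    for name, score in (("impact", impact), ("feasibility", feasibility)):
        if score not in range(1, 6):
            raise ValueError(f"{name} must be an integer from 1 to 5, got {score!r}")
    return impact * feasibility


def rank_tasks(tasks):
    """Return copies of the rows with a composite column, highest first."""
    scored = [dict(t, composite=composite(t["impact"], t["feasibility"])) for t in tasks]
    return sorted(scored, key=lambda t: t["composite"], reverse=True)


def top_tier(tasks, threshold=TOP_TIER_THRESHOLD):
    """Rows whose composite meets the top-tier threshold, in ranked order."""
    return [t for t in rank_tasks(tasks) if t["composite"] >= threshold]


# Hypothetical rows; real rows come from the master task table.
sample = [
    {"task": "Task A", "impact": 5, "feasibility": 4},  # composite 20
    {"task": "Task B", "impact": 5, "feasibility": 3},  # composite 15
    {"task": "Task C", "impact": 4, "feasibility": 4},  # composite 16
]
print([t["task"] for t in top_tier(sample)])  # ['Task A', 'Task C']
```

With 1-5 scores, a composite of 16 or higher is only reachable when both impact and feasibility are at least 4, which is why Task B (5 × 3 = 15) falls just outside the top tier.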

Sheet URL

Paste the Google Sheet URL after creating the master table

Three-way tag totals across the master table

Transcribe from the live Sheet after the calibration sessions complete. These feed the Diagnosis Summary's Master task table reference section.

Automate total

Augment total

Preserve Human total

Top 5 candidate tasks (by composite score)

Transcribe the top 5 ranked candidates from the live Sheet. These open the workshop's candidate-review segment.
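
The two transcription steps (tag totals and top 5 candidates) can be cross-checked against the Sheet with a short Python sketch. It assumes the rows have been copied out of the Sheet into dictionaries with `tag`, `impact`, and `feasibility` keys; those key names, the function names, and the sample rows are hypothetical.

```python
from collections import Counter

TAGS = ("Automate", "Augment", "Preserve Human")


def tag_totals(rows):
    """Count tasks per tag, reporting zero for any tag that never appears."""
    counts = Counter(row["tag"] for row in rows)
    unknown = set(counts) - set(TAGS)
    if unknown:
        raise ValueError(f"unexpected tags: {sorted(unknown)}")
    return {tag: counts.get(tag, 0) for tag in TAGS}


def top_candidates(rows, n=5):
    """Highest composite first; ties keep their order from the Sheet."""
    ranked = sorted(rows, key=lambda r: r["impact"] * r["feasibility"], reverse=True)
    return ranked[:n]


# Hypothetical rows standing in for the master task table.
rows = [
    {"task": "Task A", "tag": "Automate", "impact": 5, "feasibility": 4},
    {"task": "Task B", "tag": "Augment", "impact": 3, "feasibility": 3},
    {"task": "Task C", "tag": "Augment", "impact": 4, "feasibility": 5},
]
print(tag_totals(rows))
print([r["task"] for r in top_candidates(rows, n=2)])
```

The live Sheet remains the source of truth; a check like this only helps catch transcription slips before the workshop.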

Function lead summary (for the Diagnosis Summary)

Capture each lead's name, function, and calibration completion status. Feeds the Diagnosis Summary's Function Task Briefs section.

Why this is here

Source: The master table follows BAMPT's Automation Opportunity Scorecard frame for impact × feasibility scoring. The composite score ranges from 1 to 25, and the top-tier threshold of around 16 is a cutoff readers can verify arithmetically: with 1-5 scores, a composite of 16 or higher requires at least a 4 on both impact and feasibility. Google Sheet as the storage layer matches most credit unions' existing tooling and avoids introducing a new persistence mechanism.

Failure mode prevented: Per-function task lists are not directly comparable; without aggregation, the workshop opens with "look at four separate documents" rather than "look at the top 8 candidates across all four functions." The master table is what makes cross-function ranking defensible.

Facilitation move: Brent populates the table immediately after each calibration session, not at the end of all sessions. This catches scoring inconsistencies while the conversation is fresh.

Watch for: The table getting behind the calibrations: never let it. The top tier dominated by one function: surface in the workshop opener as data, not as a problem to solve. A composite score that doesn't match Brent's intuition: re-anchor against the 5-point criteria; if the intuition still doesn't match, the criteria need refinement (out of scope for this engagement; flag for the next iteration).

Workshop touchpoint: candidate-review segment
60-90 min within the Sprint Two workshop

Inside the workshop, the candidate-review segment converts the master task table into a selected first pilot. Four phases: lead presentations (12-15 min), exec team scoring (20-25 min), a 10-15 minute AI-fit and learning-quality gate on the finalists, and pilot selection discussion (15-20 min).

Goal: Surface the per-function task work to the exec team in a way that makes data-driven pilot selection possible. Convert the top-tier candidate set into a single selected first pilot with documented rationale.
Facilitation moves: Brief each lead before the workshop on the 3-minute presentation format (function summary, top 2-3 candidate tasks, the lead's tag rationale, one drag point that motivated the candidate). Have the master task table and the per-function briefs visible in the room. Brent pre-scores impact and feasibility in the table; the exec team adjusts during the scoring phase. Pilot selection happens after the scoring discussion, not in parallel.
Success indicators: Each lead presented confidently (not reading from notes). Exec team engaged with the scoring criteria rather than scoring on faction or politics. A first pilot is selected from the top tier with documented rationale captured in the Pilot selection output card. Ranked remaining candidates are clearly visible for second and third pilot selection in Sprint Three.
If drift, repair: A lead reads from notes rather than presenting confidently: Brent gently picks up the thread mid-presentation and re-anchors with a question to the lead. Exec team scores converge too quickly (suggests groupthink or deference): re-anchor by asking each exec to score independently first, then surface the deltas. Selected pilot does not match the top-tier scoring without documented rationale: pause and capture the rationale; if no good rationale exists, return to the top tier.

Segment structure (60-90 min)

Lead presentations (12-15 min total, 3 min per lead). Each lead presents their function's mapped work to the executive team. Format: function summary (30 sec), top 2-3 candidate tasks with tags and rationale (90 sec), one drag point that motivated the candidate (30 sec), Q&A with the exec team (30 sec). Brent moderates time.
Exec team scoring (20-25 min). The exec team reviews the top tier from the master task table. For each candidate, the team applies the impact and feasibility anchors (see Scoring template card) and either confirms Brent's pre-scoring or adjusts. Deltas between Brent's pre-score and the team's adjusted score are surfaced explicitly as data, not litigated.
Pilot selection discussion (15-20 min). From the (potentially re-ranked) top tier, the exec team discusses 2-3 finalists. Selection criteria: composite score is the starting point, but the team can weight other factors (governance complexity, lead capacity, signaling value to the organization). One pilot is selected. Rationale is captured live in the Pilot selection output card (4f below) so the Diagnosis Summary populates.
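Surfacing the deltas from the scoring phase could look like the sketch below. It assumes scores are held as `(impact, feasibility)` pairs keyed by candidate name and that a candidate the team does not adjust counts as confirmed at Brent's pre-score. The function names and sample values are hypothetical.

```python
def score_deltas(pre_scores, team_scores):
    """Per-candidate change in composite between the pre-score and the team's score.

    Both arguments map candidate name -> (impact, feasibility).
    """
    deltas = {}
    for name, (pre_i, pre_f) in pre_scores.items():
        # A candidate missing from team_scores is treated as confirmed (assumption).
        team_i, team_f = team_scores.get(name, (pre_i, pre_f))
        pre, team = pre_i * pre_f, team_i * team_f
        deltas[name] = {"pre": pre, "team": team, "delta": team - pre}
    return deltas

def rerank(deltas):
    """Candidate names ordered by the team's composite, highest first."""
    return sorted(deltas, key=lambda name: deltas[name]["team"], reverse=True)

# Hypothetical scores for illustration only.
pre = {"Task A": (4, 4), "Task B": (5, 4), "Task C": (4, 5)}
team = {"Task A": (5, 4), "Task B": (4, 4)}  # Task C left unadjusted

deltas = score_deltas(pre, team)
ranking = rerank(deltas)
```

Keeping `pre`, `team`, and `delta` side by side supports the "data, not litigated" framing: the room sees what moved and by how much before anyone argues about why.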

Materials in the room

Each functional lead has their Function Task Brief (printed PDF or open laptop) for reference during their presentation.
Brent has the master task table visible (Sheet projected or shared screen) with the top tier highlighted.
Exec team has the scoring template (next card) visible, either projected, printed, or shared screen.
One member of the exec team (Brent suggests the CEO or whoever the CEO delegates) captures the pilot selection live in the Pilot selection output card (4f below).

Why this is here

Source: The 60-90 min segment within the existing workshop is the load-bearing handoff from Touchpoint 2's per-function work to the engagement's first pilot. Without lead presentations, the workshop's pilot selection happens against material the exec team has never seen. Without exec team scoring, the pilot selection inherits Brent's pre-scoring without scrutiny.

Failure mode prevented: "Look at four separate documents and pick a pilot" doesn't survive a 7-8 person three-faction exec team. Pre-structured segments (presentations, then scoring, then selection) provide enough scaffolding for a real decision in a 90-minute window.

Facilitation move: Brent moderates strictly on time during the lead presentations (3 min each) so the scoring and selection phases get their full budget. The deltas between pre-scoring and team scoring are surfaced as data, not as Brent-versus-team disagreement.

Watch for: A lead who runs long: cut their time, not the next lead's. The team converging on a pilot that the master table doesn't support: ask for the documented rationale, and if it's thin, return to the top tier. The exec team picking a pilot from outside the top tier without examining the scoring deltas: surface the deltas explicitly and ask whether the team's adjustments would change the rank.

Workshop touchpoint: scoring template 5-point impact x 5-point feasibility with anchored criteria

Two dimensions, five points each, composite = product (1-25). The exec team applies this rubric to the top-tier candidates during the workshop's scoring phase. Brent's pre-scoring (captured in the master task table) is the starting point; the team adjusts using the anchors below. Top tier (composite of 16 or higher, typically) is the first-pilot candidate set.

Goal: Give the exec team a defensible scoring rubric they can apply in 20-25 min. Each score level has a concrete anchor description so the rubric survives Architect-Dc scrutiny.
Facilitation moves: Display the rubric prominently during the scoring phase. Walk through one candidate together as a calibration exercise before the team scores in parallel. When the team's score diverges from Brent's pre-score, surface Brent's rationale (captured in the master table) and let the team decide whether to adjust.
Success indicators: The exec team can articulate why each top-tier candidate scored where it did. Score deltas (Brent vs team) are explicit and addressed. The final top-tier ranking after team scoring is what the pilot selection works from.
If drift, repair: Scoring feels arbitrary or rushed: pause and walk one candidate through the anchors out loud as a calibration exercise. Team scores converge too quickly on Brent's pre-scores (suggests deference): ask each exec to score independently before discussing. Wide score divergence between execs on the same candidate: that's data, not a problem. Surface the divergence and ask each exec what their anchor was.
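If execs score independently first, the convergence and divergence checks above could be sketched like this. The thresholds are assumptions, not part of the rubric: identical scores from every exec are treated as a prompt to check for deference, and a spread of 2 or more points as wide divergence. Exec labels and scores are hypothetical.

```python
def score_spread(scores_by_exec):
    """Range (max - min) of one candidate's scores on one dimension."""
    values = list(scores_by_exec.values())
    return max(values) - min(values)

def flag_for_discussion(candidate_scores, wide=2):
    """Flag candidates whose independent scores are identical or widely split.

    candidate_scores maps candidate -> {exec label: score (1-5)}.
    The `wide` threshold is an illustrative assumption.
    """
    flags = {}
    for candidate, by_exec in candidate_scores.items():
        spread = score_spread(by_exec)
        if spread == 0:
            flags[candidate] = "unanimous: check for deference"
        elif spread >= wide:
            flags[candidate] = "wide divergence: ask each exec for their anchor"
    return flags

# Hypothetical independent impact scores for illustration only.
impact_scores = {
    "Task A": {"Exec 1": 4, "Exec 2": 4, "Exec 3": 4},
    "Task B": {"Exec 1": 2, "Exec 2": 5, "Exec 3": 3},
    "Task C": {"Exec 1": 3, "Exec 2": 4, "Exec 3": 3},
}
flags = flag_for_discussion(impact_scores)
```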

Impact (1-5): volume x strategic relevance x drag burden

Score	Anchor
1	Low-volume task with no strategic tie and no drag. Examples: rare ad-hoc requests with no pattern.
3	Moderate volume OR moderate strategic relevance OR meaningful drag. Most tasks land here by default; reserve 1 and 5 for clear outliers.
5	High-volume task tied to a named Program of Work in the strategic plan, with significant drag (lead-reported friction, observed workarounds, capacity that could be redirected).

Feasibility (1-5): data readiness x vendor availability x governance complexity x technical lift

Score	Anchor
1	No data readiness (data is scattered, dirty, or inaccessible), no off-the-shelf vendor solution, complex governance (board approval required), heavy technical lift (custom build).
3	Mixed: some dimensions ready (e.g., good data, available vendor) but others harder (e.g., governance complexity or technical lift). Most candidates land here by default.
5	Data is clean and accessible, off-the-shelf vendor solution exists (and may already be in use somewhere at Septapod), governance authority is clear from the AI Policy (no board approval needed), lift is light (configuration not custom build).

Composite (1-25)

Impact x Feasibility. Top tier (composite of 16 or higher, typically) is the first-pilot candidate set. Below 8 is generally not worth pursuing as a first pilot. 8-15 may be worth pursuing later but not now.
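The arithmetic can be checked with a short sketch. The tier labels and function names are hypothetical, and the cutoffs follow the bands above (16 or higher, 8-15, below 8), with the top-tier threshold left adjustable because the text qualifies it with "typically."

```python
def composite(impact, feasibility):
    """Composite score as the product of two 1-5 scores."""
    for name, value in (("impact", impact), ("feasibility", feasibility)):
        if value not in range(1, 6):
            raise ValueError(f"{name} must be an integer from 1 to 5, got {value!r}")
    return impact * feasibility

def tier(score, top_threshold=16):
    """Map a composite score to the three bands described above."""
    if score >= top_threshold:
        return "top tier"
    if score >= 8:
        return "later"
    return "not a first pilot"

# Not every value from 1 to 25 is reachable as a product of two 1-5 scores.
REACHABLE = sorted({i * f for i in range(1, 6) for f in range(1, 6)})
TOP_TIER_VALUES = [s for s in REACHABLE if s >= 16]
```

One consequence worth noting: only 16, 20, and 25 clear the threshold, so under integer scoring a candidate reaches the top tier only when both impact and feasibility are at least 4. A 5 on one dimension cannot compensate for a 3 on the other (5 x 3 = 15).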

Why this is here

Source: The 5-point scale with anchored criteria follows BAMPT's Automation Opportunity Scorecard frame. The "arithmetic-verifiable" framing makes the scoring defensible for skeptical readers, not just intuitive.

Failure mode prevented: A scoring rubric without anchors collapses into intuition-by-another-name. Each exec scores on different unstated criteria, and the top tier is whatever the loudest exec thinks should win. Anchored criteria make the scoring debate substantive.

Facilitation move: Walk one candidate through the anchors out loud as a calibration exercise before the team scores in parallel. This catches the "I default to 3 on everything" failure mode and the "I'm scoring on overall enthusiasm, not the anchors" failure mode.

Watch for: Anchors getting collapsed in practice ("just give it a 4"): re-state the anchor for that score level. Composite scores clustering tightly (suggests the team isn't using the full range): ask whether the team can articulate why the candidates aren't more differentiated.

Workshop touchpoint: AI fit and learning-quality gate 10-15 min inside the existing finalist review

Impact and feasibility identify valuable work the credit union can plausibly deliver. This gate asks a different question: is the finalist a good match for current AI, and can a short pilot produce evidence strong enough to support a real decision? Apply it only to the top 2-3 finalists, not the full task inventory.

Goal: Keep an attractive, purchasable idea from becoming the first pilot when its output is difficult to verify, its essential context is tacit, or useful feedback will arrive too late.
Facilitation moves: Name how the candidate entered the conversation: a technology capability, an operational problem, or a practice the team wants to improve. Then cross-check the other two. Walk the six dimensions quickly. Any weak dimension needs a concrete mitigation or the candidate is reshaped or held.
Success indicators: The selected pilot has a named learning advantage: the team can observe results within the pilot window, verify quality without heroic effort, and explain what evidence would change the decision.
If drift, repair: The team treats vendor availability as proof of AI fit: return to verifiability and feedback speed. Scores become a second ranking contest: stop scoring and name the decisive unknown. A finalist is strategically important but weak for a first pilot: keep it on the roadmap and select a faster-learning pilot first.

Finalist being tested · How it entered the conversation

Verifiability

Can the team tell whether an output is correct without creating a second full workflow?

Context accessibility

Is the needed context documented and available, or held in people's heads and local workarounds?

Feedback density

How quickly will the pilot reveal whether the output helped, harmed, or changed nothing?

Semantic stability

Do the people involved share a stable definition of a good result?

Source trustworthiness

Are inputs factual and clean, or filtered through incentives, self-reporting, or unresolved quality issues?

Pattern availability

Is this a common, well-represented task for current AI, or rare and proprietary?

Decisive unknown and how the pilot can resolve it · Gate decision
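The rule that any weak dimension needs a concrete mitigation can be sketched as a simple check. This is an illustration under stated assumptions: each dimension is recorded as weak or not with an optional mitigation note, and the decision labels ("proceed", "reshape or hold") are hypothetical stand-ins for whatever the team writes in the gate decision field.

```python
# The six dimensions walked during the gate.
DIMENSIONS = (
    "verifiability",
    "context accessibility",
    "feedback density",
    "semantic stability",
    "source trustworthiness",
    "pattern availability",
)

def gate_decision(assessment):
    """Return (decision, unmitigated weak dimensions).

    assessment maps dimension -> {"weak": bool, "mitigation": str} (assumed shape).
    """
    missing = [d for d in DIMENSIONS if d not in assessment]
    if missing:
        raise ValueError(f"Dimensions not assessed: {missing}")
    unmitigated = [
        d for d in DIMENSIONS
        if assessment[d]["weak"] and not assessment[d].get("mitigation", "").strip()
    ]
    return ("reshape or hold", unmitigated) if unmitigated else ("proceed", [])

# Hypothetical assessment for illustration only.
sample = {d: {"weak": False, "mitigation": ""} for d in DIMENSIONS}
sample["verifiability"] = {"weak": True, "mitigation": "Lead spot-checks a sample of outputs"}
sample["feedback density"] = {"weak": True, "mitigation": ""}

decision, open_items = gate_decision(sample)
```

The check deliberately produces no score, in keeping with the gate being a finalist filter and not a second ranking.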

Why this is here

Source: Adapted from John Cutler's "Learning Faster Than the Hype" presentation. Used as a finalist gate, not a new scoring framework.

Difference from feasibility: Feasibility asks whether the credit union can implement the work. This gate asks whether current AI fits the work and whether the pilot can teach the team enough to decide.

Time discipline: Ten to fifteen minutes inside the existing finalist review. No new participant time and no second score for the full inventory.

Workshop touchpoint: pilot selection output captured live during the workshop; feeds the Diagnosis Summary

What the workshop emits: a single selected first pilot with documented rationale and ranked remaining candidates for second and third pilot selection in Sprint Three. Four input fields capture the pilot selection live so the Diagnosis Summary populates in real time.

Goal: Capture the workshop's pilot selection with enough detail that Sprint Three can begin immediately, the Direction Synthesis surfaces the decision, and any future reader (a new exec, the board, an auditor) can understand why this pilot was chosen.
Facilitation moves: Designate one exec (the CEO or a delegate) to capture the decision in these fields during the pilot-selection-discussion phase. Brent prompts for each field as the discussion converges. The rationale is the most important field; it should name the composite score, the lead's drag point, and the strategic relevance.
Success indicators: All four fields filled by the end of the workshop. Rationale is specific enough that someone reading it later can understand the scoring and the trade-offs. Pilot owner is named and present in the room.
If drift, repair: Rationale field stays empty or thin ("it just felt right"): prompt for the composite score, the drag point, and the strategic relevance separately. Pilot owner is not in the room: defer the owner field until the CEO confirms within 48 hours; the rest of the selection holds.

Pilot selection

Selected pilot (task name) · Documented rationale · Pilot owner (Sprint Three)
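A sketch of the capture record and its completeness check, assuming the four fields are the ones this card's rationale lists as "what, why, values-check, owner." The class name, attribute names, and sample text are hypothetical.

```python
from dataclasses import dataclass

@dataclass
class PilotSelection:
    """One record per workshop; attribute names are illustrative assumptions."""
    selected_pilot: str = ""  # what
    rationale: str = ""       # why: composite score, drag point, strategic relevance
    values_check: str = ""    # values-check
    pilot_owner: str = ""     # owner (may be deferred until the CEO confirms)

    def missing_fields(self):
        """Names of fields still empty, in card order."""
        return [name for name, value in vars(self).items() if not value.strip()]

# Hypothetical capture partway through the discussion.
record = PilotSelection(
    selected_pilot="Hypothetical task name",
    rationale="Composite 20; lead-reported drag point; tied to a Program of Work",
)
still_open = record.missing_fields()
```

Checking `missing_fields()` as the discussion converges, not at the end, matches the live-capture intent: an empty rationale is visible while the people who can fill it are still in the room.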

Why this is here

Source: Capturing the pilot selection live (rather than as a follow-up after the workshop) eliminates the most common failure mode in strategy workshops: a decision is made in the room, captured in someone's memory, and reconstructed inconsistently afterward. The four fields here are the minimum viable documentation: what, why, values-check, owner.

Failure mode prevented: "We picked the lending document extraction pilot" without the rationale leaves Sprint Three starting from a name without a scope. The rationale field is what makes the pilot ready to start immediately.

Facilitation move: One exec captures live, prompted by Brent. The four fields are filled as the discussion converges, not at the end.

Watch for: The rationale field becoming a marketing description rather than a decision-rationale. Push for the actual scoring, the drag, and the strategic relevance.

Workshop Planning

The on-site workshop in Sprint Two, structured to hold what the exec team can only do together: strategic identity broadening, the Playing to Win cascade, and the candidate-review segment from Step 1's Touchpoint 3 (lead presentations + scoring + pilot selection). Pre-work and post-work bracket the synchronous workshop. The workshop runs as one on-site trip in one of two formats: a single 4-5 hour block, OR ~6 hours split across two consecutive days.

Module info Weeks 4-5 17-19h facilitator 6h CEO 4.5-5.5h per exec 1h board chair One trip; 4-5h single block OR ~6h two-day

When: Weeks 4-5 of Sprint Two. Week 4: async pre-work (individual embryonic issues collected, Brent synthesizes themes, packages pre-read with maturity discrepancies and draft challenge statement, finalizes the candidate-review segment from Step 1's outputs). Week 5: on-site workshop (one trip to Septapod's headquarters; 4-5 hours single-day block OR ~6 hours across two consecutive days), 30-min activity selection call, 1-hour reinvestment posture session with the CEO and board chair.

Time per person

Facilitator (Brent) 17-19 h 4-5h workshop facilitation, or ~6h in the two-day format (including candidate-review segment from Touchpoint 3) + 1h reinvestment + 0.5h activity call + 11.5h synthesis / design / prep / pre-read packaging

CEO 6 h 30min async pre-work + 4-5h workshop + 30min activity call + 1h reinvestment

Each senior exec 4.5-5.5 h 30min async pre-work + 4-5h workshop (includes 60-90 min Touchpoint 3 candidate-review segment) · cumulative depends on team size

Functional leads (3-4 selected) 3 min presentation each Inside the workshop's candidate-review segment. Leads attend the segment but not the full workshop unless the CEO chooses to include them.

Board chair only 1 h Reinvestment posture session with CEO + Brent

Other board members n/a Briefed in Step 3, not engaged here

What actually happens

Pre-work (Week 4): The CEO and the executive team each answer the four embryonic issues prompts individually and in writing (~15 min async). Individual written responses surface more honest answers than a live group exercise. Brent synthesizes anonymous themes, drafts a v1 challenge statement from Sprint One's discovery data, packages a pre-read (maturity discrepancy view from Sprint One maturity scores, embryonic issues themes, draft challenge statement), and finalizes the candidate-review segment from Step 1's master task table (top-tier candidates pre-scored, lead presentations sequenced). Distributed 2-3 days before the workshop.

Workshop (Week 5, single-day 4-5h format): Opens with brief grounding on the pre-read themes (~10 min). Refines the challenge statement from Brent's draft (~15 min). Asks "What business is Septapod really in?" as the broadening question before the cascade (~25 min). Runs the Playing to Win cascade adapted for AI (~2h). Then the candidate-review segment from Touchpoint 3 (lead presentations 12-15 min + exec team scoring 20-25 min + pilot selection discussion 15-20 min). Closes with capture and next steps (~10 min).
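The single-day agenda can be summed as a quick sanity check against the 4-5 hour block. The sketch treats the approximate durations above as exact minutes and does not add breaks or transitions, both of which are simplifying assumptions; the `AGENDA` structure and function name are hypothetical.

```python
# (segment, minimum minutes, maximum minutes), taken from the single-day agenda.
AGENDA = [
    ("Grounding on pre-read themes", 10, 10),
    ("Challenge statement refinement", 15, 15),
    ("What business is Septapod really in?", 25, 25),
    ("Playing to Win cascade", 120, 120),
    ("Lead presentations", 12, 15),
    ("Exec team scoring", 20, 25),
    ("Pilot selection discussion", 15, 20),
    ("Capture and next steps", 10, 10),
]

def total_range(agenda):
    """Return (low, high) total minutes across all segments."""
    low = sum(lo for _, lo, _ in agenda)
    high = sum(hi for _, _, hi in agenda)
    return low, high

low, high = total_range(AGENDA)
slack_in_4h = 4 * 60 - high  # minutes left in a 4-hour block at the high end
slack_in_5h = 5 * 60 - high  # minutes left in a 5-hour block at the high end
```

The listed segments total 227-240 minutes, so a 4-hour block has no slack at the high end, while a 5-hour block leaves about an hour for breaks and overruns.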

Workshop (Week 5, two-day ~6h format): Day 1 typically holds the strategic identity broadening and the PTW cascade. Day 2 holds the candidate-review segment and pilot selection. The split lets the cascade get the time it needs without compressing pilot selection; it also lets the leads and exec team carry the cascade's frame into the candidate review. Brent decides the split with the CEO closer to the engagement based on exec team scheduling.

Post-workshop: Brent proposes PAIR activities for Sprint Three based on workshop outputs and the selected pilot; the CEO confirms in a 30-min call. Brent runs a separate 1-hour reinvestment posture session with the CEO + the board chair, setting the 60-80% reinvestment commitment before pilots ship.

Through-line

Generates: Embryonic issues surfaced (dirty secrets, cultural hypocrisies, unresolved tensions, slow-burning issues). Challenge statement ("How might Septapod use AI to..."). Strategic identity statement ("What business is Septapod really in?"). Playing to Win cascade answers (winning aspiration, where to play, how to win, capabilities, management systems). Selected first pilot with documented rationale (from Touchpoint 3's pilot selection output card in Step 1). PAIR activity selection for Sprint Three, confirmed in the post-workshop call. Reinvestment rate commitment from CEO + board chair before efficiency gains arrive.
Value to Septapod: Pre-empts the risks that would otherwise surface a year out as "we didn't see this coming." Broadens the team's sense of what business Septapod is in so the cascade produces real strategic options, not incremental ones. Translates exec-team alignment into a specific strategic frame and a selected first pilot in a single trip. Creates the board-presentable narrative for AI investment. Locks in the reinvestment posture before the standard CU outcome takes hold (savings absorbed into the budget; no new capability funded).
How Septapod uses it: The challenge statement and strategic identity statement become the recurring frame for Sprint Three and Four. Embryonic issues become standing watch-items in the governance work. The PTW cascade anchors the Annual AI Plan. The selected first pilot becomes Sprint Three's anchor; Step 1's master task table remains available for second and third pilot selection. Reinvestment commitment becomes board-tracked.
Feeds into: Step 3 (synthesis pulls cascade outputs and selected pilot as core outputs). Sprint Three (pilot selection executes immediately; cascade frames every subsequent decision). Sprint Four (Annual AI Plan inherits the cascade structure; embryonic issues become the input register for scenario building; remaining top-tier candidates from Step 1's master task table inform second and third pilot selection).

Facilitation note: PTW cascade sizing concern (open issue)

The 2-hour PTW cascade in the single-day format may feel tight for a 7-8 person exec team with meaningful internal differences in posture. The two-day format mitigates this by giving the cascade Day 1 entirely. If the single-day format is used, the cascade may compress to "directional alignment captured in writing" rather than "all five questions fully worked." This is an acknowledged trade-off, not a solved problem. The broader PTW cascade sizing concern is deferred to a separate iteration; this module's reflow partially addresses it by extending the workshop to 4-5 hours total but does not solve the cascade-depth question.

Research & methods anchors

IBM CEO Study (2026): "Before the next budget cycle closes, agree on a fixed reinvestment rate for AI-driven productivity gains (typically 60% to 80%)." The reinvestment conversation matters most when it happens BEFORE pilots ship. Credit union boards default to cost reduction; without a pre-committed rate, AI savings disappear into margin.

Tighe's argument: PTW cascade without identity broadening tends to produce incremental answers anchored in current operations; the "what business" question opens up real options.

Playing to Win (Roger Martin): five-question cascade adapted for AI context
Steve Tighe, Rethinking Strategy, Ch.6: four embryonic-issue prompts (dirty secrets, cultural hypocrisies, unresolved tensions, slow-burning issues)
Steve Tighe, Rethinking Strategy, Ch.11: "what business are we in?" as the broadening move before strategic positioning
Google PAIR Workshop Facilitator's Guide: challenge statement template and activity library
IBM CEO Study Play #2 (AI-agent flywheel): reinvestment rate framing and 60-80% benchmark
WEF Empowering AI Leadership Toolkit: board-meeting-aligned modules; the reinvestment session adapts the financial-stewardship module

Pre-Workshop: Async Collection (Week 4)

Embryonic Issues Surfacing 15 min async per person, individual written responses

Four prompts sent to each exec individually during Week 4. Written responses collected async. Brent synthesizes anonymous themes into the pre-read package. Individual responses produce more honest answers than live group discussion, and the themes inform the workshop's strategic conversations naturally.

1. Dirty secrets. What AI practices are happening at Septapod right now, or about to happen, that members would feel differently about if they knew?
2. Cultural hypocrisies. What claims does Septapod make about its relationship with members or its values that AI deployment could expose as more fragile than they look?
3. Unresolved tensions. What frustrations do members or coworkers have today that AI could either resolve or amplify?
4. Slow-burning issues. What AI risks are building gradually at Septapod that nobody is naming?

Challenge Statement Builder

Source: PAIR Workshop Facilitator's Guide. Brent drafts v1 before the workshop from Steps 1-3 data; the team refines it live in ~15 min.

Template: "How might [organization] use AI to [desired outcome] without [key constraint]?"

Organization context · Desired outcome: What should AI help accomplish? · Key constraint: What must not be compromised?

How might Septapod Financial use AI to ___ without ___?

On-Site Workshop (Week 5, 4-5 hours single block or ~6 hours over two consecutive days)

What Business Is Septapod Really In? 25 min, opens the strategic conversation

One broadening question before the cascade. Without it, the cascade tends to produce incremental answers anchored in current operations. With it, real strategic options become visible.

Pick the framing that fits the CEO's style best:

"What business is Septapod really in? Not what does Septapod do, but what does Septapod contribute to?"
"For what purpose does Septapod do what it does? What are the ultimate benefits members get that go beyond the financial products?"
"Strip away the products and services Septapod offers today. What would members lose if Septapod stopped existing tomorrow that they couldn't easily replace with another financial institution?"

Group answer (the exec team produces this, not the facilitator):

Tighe's worked example: a public library broadened from "access to information" to "solutions to society's information needs." That single reframe opened creative and community-library strategic options that had been invisible. Septapod's equivalent might surface community wealth-building, financial wellbeing, or trust-based relationships with money.

Playing to Win Cascade for AI

Source: Playing to Win (Roger Martin), adapted for AI strategy context

Question 1 of 5

Winning Aspiration: What does it look like for Septapod to win with AI in a way that strengthens its mission?

Question 2 of 5

Where to Play: Which programs of work, member segments, or internal functions get AI investment first? Which do we explicitly deprioritize?

Question 3 of 5

How to Win: What is Septapod's distinctive approach to AI? Build internally, buy from vendors, partner? Centralized or distributed? Fast-follow or lead?

Question 4 of 5

Capabilities Required: What must Septapod build that it doesn't have today? (people, data infrastructure, vendor relationships, governance maturity)

Question 5 of 5

Management Systems: How will Septapod track whether AI is working? What gets measured, how often, by whom?

Post-Workshop Follow-up

PAIR Activity Selector

Source: PAIR Workshop Facilitator's Guide / PAIR Guidebook

Brent selects PAIR activities based on workshop outputs and proposes to the CEO in a 30-min follow-up call. These activities shape how Sprint Three pilots are facilitated.

Critical Moments Mapping 40 min

Map member/employee touchpoints where AI could help or harm.

PAIR Workshop Guide, Day 1

Automation vs. Augmentation Sort 30 min

For each use case, should AI replace or assist the human?

PAIR Guidebook Ch.1

Errors Audit 30 min

What can go wrong? What are the consequences? How does the user recover?

PAIR Workshop Guide, Day 2

Trust Calibration 20 min

When should users trust the AI output? When should they override?

PAIR Guidebook Ch.4

Controls Audit 20 min

What can the user adjust, correct, or turn off?

PAIR Workshop Guide, Day 2

AI Onboarding Design 30 min

How do you introduce this to 200 coworkers?

PAIR Workshop Guide, Day 2

Explainability Audit 25 min

What explanations do users need and when?

PAIR Workshop Guide, Day 2

Feedback Audit 20 min

How does user feedback improve the system over time?

PAIR Workshop Guide, Day 2

Total selected time: 0 min 0 activities
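The running total can be sketched from the durations listed on the cards above. The dictionary name, function name, and the example selection are hypothetical; which activities Brent proposes depends on the workshop outputs.

```python
# Activity name -> minutes, as listed on the cards above.
PAIR_ACTIVITIES = {
    "Critical Moments Mapping": 40,
    "Automation vs. Augmentation Sort": 30,
    "Errors Audit": 30,
    "Trust Calibration": 20,
    "Controls Audit": 20,
    "AI Onboarding Design": 30,
    "Explainability Audit": 25,
    "Feedback Audit": 20,
}

def selection_total(selected):
    """Return (total minutes, number of activities) for the chosen activity names."""
    unknown = [name for name in selected if name not in PAIR_ACTIVITIES]
    if unknown:
        raise ValueError(f"Not in the activity list: {unknown}")
    return sum(PAIR_ACTIVITIES[name] for name in selected), len(selected)

# Hypothetical selection for illustration only.
example = selection_total(["Errors Audit", "Trust Calibration"])
everything = selection_total(list(PAIR_ACTIVITIES))
```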

Direction Synthesis

Aggregated view of what Sprint Two produced: the operational task mapping outputs, the workshop's strategic choices, and the first pilot selection. Print or copy as markdown for the strategic direction document and the board briefing.

Module info End of Sprint Two 10h facilitator 2h CEO 1h per exec 1h per board member

When: Final week of Sprint Two. Synthesis through board briefing, after the workshop and the first pilot selection.

Time per person

Facilitator (Brent) 10 h 2h sync (exec review + board briefing) + 8h writing the strategic direction and governance readiness assessment

CEO 2 h 1h exec review + 1h board briefing

Each senior exec 1 h Sync review meeting · cumulative depends on team size

Each board member 1 h Board briefing · cumulative depends on board size

What actually happens

Brent writes Sprint Two's durable outputs. The Strategic Direction document names where AI gets investment and where it does not, built from the Playing to Win cascade and the ranked task mapping. The Governance Readiness Assessment shows where the AI Policy's authorities are operational and where they need support before pilots begin. The first pilot is documented with scope, team, and rationale. Brent runs a 1-hour sync exec review to capture edits, then delivers a 1-hour board briefing with the CEO present.

Through-line

Generates: Sprint Two's durable outputs: the strategic direction for AI, the governance readiness assessment, the board briefing, and the first pilot selected with scope and team. Each one is built from the master task table and the workshop cascade rather than from intuition.
Value to Septapod: Gives the CEO a strategic direction grounded in the task-level evidence the executive team produced together, and a first pilot ready to start. Names where AI gets investment and where it does not.
How Septapod uses it: The board briefing follows directly from the package. Sprint Three starts pilot work from the selected first pilot with no rework. The master task table stays the reference for second and third pilot selection.
Feeds into: Sprint Three (the strategic direction and the first pilot anchor the test-and-build work). Sprint Four (the direction is the frame the Annual AI Plan completes).

Governance Readiness Assessment: the supervision lens

A component of the Governance Readiness Assessment · applies to any pilot where AI acts without a person approving each action

The first pilots a credit union runs are typically AI assistance: a person stays in the loop and approves the output before it goes anywhere. Those pilots run under the AI Policy's existing authorities. A smaller set of pilots has the AI take an action on its own (sending a member message, routing or approving a request, making an access decision). The AI Policy already routes member-facing automation and autonomous access decisions to board approval. This lens is what the executive team works through before a pilot of that second kind starts.

For any function where AI would act without a person approving each action, answer three questions in writing:

Who supervises it? Name the person or system that watches what the AI does and can stop it. Define what they watch for and how often they look.
Where is it boxed in? Name the limits on what the AI can reach and do: which systems, which records, which dollar amounts, which member segments. Everything outside the box is off-limits by default.
Which actions need a person to approve before they run? List the actions the AI may never take on its own (moving money above a set amount, closing or freezing an account, denying an application, reporting a member to a credit bureau). Those actions stop and wait for a person.

An answer of "not decided yet" on any of the three means the pilot is not ready to run without a person approving each action. Run it with a person in the loop until all three are answered.
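The readiness rule above can be sketched as a check over the three written answers. The question keys and function name are hypothetical, and treating a blank answer the same as "not decided yet" is an assumption.

```python
# Hypothetical keys for the three supervision questions.
QUESTIONS = ("who_supervises", "where_boxed_in", "actions_needing_approval")

def ready_without_per_action_approval(answers):
    """Return (ready, undecided question keys).

    answers maps each question key -> the written answer (str).
    A blank answer counts as undecided (assumption).
    """
    undecided = []
    for question in QUESTIONS:
        text = answers.get(question, "").strip()
        if not text or text.lower() == "not decided yet":
            undecided.append(question)
    return (not undecided, undecided)

# Hypothetical answers for illustration only.
draft = {
    "who_supervises": "Named function lead reviews the action log daily",
    "where_boxed_in": "Not decided yet",
    "actions_needing_approval": "",
}
ready, open_questions = ready_without_per_action_approval(draft)
```

When `ready` is false, the pilot stays in person-in-the-loop mode until every key in `open_questions` has a written answer.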

Source: Adapts the agent-as-insider-threat frame from Google DeepMind's AI Control Roadmap (v0.1, June 2026) to credit union scale. The full framework's threat taxonomy and control levels are built for organizations deploying autonomous agents with a security team behind them. The three questions here carry the usable core for a CU running its first pilots.

What Sprint Two Delivers

Four durable outputs · the summary sections below are working artifacts that compose them

Strategic Direction for AI. Where AI gets investment and where it does not, across learning, communications, capacity building, and implementation. Built from the Playing to Win cascade and the ranked task mapping.
Governance Readiness Assessment. Shows where the AI Policy's existing authorities are operational and where they need support before pilots begin. For any pilot where AI would act without a person approving each action, it carries the supervision lens (who supervises it, where it is boxed in, which actions wait for a person).
Board Briefing. A 1-hour delivered session that establishes the recurring format for ongoing board AI conversations.
First Pilot Selected. Scope, team, and rationale documented, ready for Sprint Three to start.

Workshop Insights (Embryonic Issues + Strategic Identity)

Complete Step 2's embryonic issues and strategic identity cards to populate.

Strategic Choices (Playing to Win)

No strategic choices defined. Complete Step 2 to populate.

Workshop Plan

No workshop designed. Complete Step 2 to populate.

Operational Task Mapping: Function Task Briefs

No functional leads named. Complete Step 1's selection conversation card and per-function calibration sessions to populate.

Operational Task Mapping: Master task table reference

No master task table data. Paste the Sheet URL and transcribe the three-way tag totals in Step 1's Master task table card to populate.

Operational Task Mapping: Ranked candidate set

No ranked candidates entered. Transcribe the top 5 candidate tasks from the live Sheet into Step 1's Master task table card to populate.

Operational Task Mapping: AI fit and learning quality

No finalist gate captured. Complete the AI fit and learning-quality card in Step 1 to populate.

Operational Task Mapping: First pilot selection

No pilot selected. Complete the workshop touchpoint's Pilot selection output card in Step 1 to populate.