Pilot Phase Guide
"Week 1-2: Prove the concept. Calibrate the system. Build the foundation."
The pilot phase is designed to validate HEAT with minimal risk and maximum learning. By the end of Week 2, you'll have baseline data, calibrated thresholds, and proof of value to justify full team rollout.
Pilot Objectives
✅ Validate signal quality — Do Pain Streaks correlate with actual grind?
✅ Calibrate intensity scale — Is x8 meaningful vs x5?
✅ Identify tag patterns — Which work types matter for your team?
✅ Measure adoption friction — Does 30-second tagging actually work?
✅ Prove business value — Can you catch one burnout signal early?
Success criteria: At least 1 actionable insight per pilot participant by Week 2
Pilot Team Selection
Option A: Individual Pilot (Safest)
Who: 1-2 volunteer developers + their manager
Pros:
- Minimal disruption
- Easy to calibrate
- Safe to iterate
Cons:
- Limited pattern detection
- No team heatmap view
- Slower to prove ROI
Best for: Conservative organizations, initial validation
Option B: Small Team Pilot (Recommended)
Who: 3-5 developers + 1 manager (e.g., single squad)
Pros:
- Team heatmap visible
- Bus Factor detectable
- Faster ROI proof
- Realistic workflow test
Cons:
- Requires group buy-in
- More coordination
Best for: Agile teams, fast-moving organizations
Selection Criteria
Look for:
| Criterion | Why It Matters |
|---|---|
| Volunteers | Adoption works when people want it |
| Diverse work types | Mix of Feature, Bug, Support, Config work |
| Reasonable manager | Will actually use the data, not weaponize it |
| Stable project | Not in crisis mode (skews baseline) |
| Technical comfort | Can install browser extension |
Avoid:
❌ Teams in firefighting mode (skews data)
❌ Skeptical managers (kills adoption)
❌ Solo developers with no manager buy-in
❌ Teams with <1 week left on project
Week 1: Setup & Calibration
Day 1: Monday — Installation & Onboarding
Morning (30 minutes):
Install browser extension
- Download from internal deployment or public store
- Configure API endpoint
- Authenticate with SSO or team token
Quick training (15 minutes)
- Watch 5-minute tagging demo video
- Review Observable Signals guide
- Download Quick Reference Card
What to tag:
Work Type: Feature | Bug | Blocker | Support | Config | Research
Intensity: x1 (routine) to x10 (high load)
Rule of thumb:
- x1-x3: No mental fatigue
- x4-x6: Concentrated effort
- x7-x10: Grinding, stuck, frustrated
Tag your first task
- Pick task you're currently on
- Choose work type
- Estimate intensity (make your best guess)
- Click "Save Tag"
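Under the hood, a saved tag is just a small record: the task, the work type, the intensity, and a timestamp. The exact schema isn't part of this guide, so treat the field names below as a hypothetical sketch of what the extension might send to the API.

```typescript
// Hypothetical shape of a single HEAT tag. Field names are illustrative,
// not necessarily the extension's actual schema.
type WorkType = "Feature" | "Bug" | "Blocker" | "Support" | "Config" | "Research";

interface HeatTag {
  task: string;        // e.g. a ticket ID like "PROJ-42"
  workType: WorkType;  // one of the six pilot work types
  intensity: number;   // x1 (routine) through x10 (high load)
  note?: string;       // optional context, e.g. "waiting on DB team"
  taggedAt: string;    // ISO timestamp, recorded when you click "Save Tag"
}

const firstTag: HeatTag = {
  task: "PROJ-42",
  workType: "Feature",
  intensity: 4,
  taggedAt: new Date().toISOString(),
};
```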
Afternoon:
- Tag 2-3 more tasks as you switch work
- Note which parts feel unclear (we'll calibrate tomorrow)
Expected tags Day 1: 3-5 per person (getting comfortable)
Day 2: Tuesday — Intensity Calibration
Morning standup (5 minutes):
Quick round-robin:
- "What intensity did you log yesterday?"
- "Did any task feel harder than the number you gave it?"
- "Any confusion on work types?"
Intensity calibration exercise:
Review these scenarios together:
| Scenario | Initial Guess | Calibrated | Why |
|---|---|---|---|
| Reviewed PR for familiar code | x2 | x1 | Too easy to be x2, truly routine |
| Debugged SQL performance | x6 | x8 | "I was stuck for 3 hours, felt exhausting" |
| Built CRUD endpoint | x4 | x3 | Standard feature work, not intense |
| Production incident | x9 | x9 | Correctly calibrated (crisis mode) |
The Physical Test:
x1: Could do this while half-asleep
x3: Normal focus, feel fine after
x5: Deep focus, need a break after
x7: Frustrated, want to walk away
x10: Mentally fried, done for the day
Afternoon:
- Re-tag yesterday's work with calibrated understanding
- Tag today's work with new mental model
Expected outcome: Everyone aligns on intensity scale meaning
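If you want the Physical Test handy outside this doc (say, as tooltip text in an internal tool), the anchor points reduce to a small lookup. A minimal sketch: only the anchor descriptions come from the guide; the intermediate intensities and the nearest-anchor helper are assumptions.

```typescript
// Physical Test anchor points from the exercise above. Intermediate values
// (x2, x4, x6, x8, x9) are left to judgment; only the anchors are from the guide.
const physicalTest: Record<number, string> = {
  1: "Could do this while half-asleep",
  3: "Normal focus, feel fine after",
  5: "Deep focus, need a break after",
  7: "Frustrated, want to walk away",
  10: "Mentally fried, done for the day",
};

// Return the nearest anchor description for an intensity from 1 to 10.
function describeIntensity(intensity: number): string {
  const anchors = Object.keys(physicalTest).map(Number);
  const nearest = anchors.reduce((best, a) =>
    Math.abs(a - intensity) < Math.abs(best - intensity) ? a : best
  );
  return physicalTest[nearest];
}

describeIntensity(8); // "Frustrated, want to walk away"
```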
Day 3: Wednesday — Work Type Clarity
Morning standup (5 minutes):
Discuss edge cases:
| Scenario | Tag | Why |
|---|---|---|
| "Helping teammate debug their code" | Support, x2 | Assisting others, low personal effort |
| "Fixing bug in feature I just built" | Bug, x4 | It's a defect, even if recent |
| "Stuck waiting for QA environment" | Blocker, x5 | Can't proceed, frustration building |
| "Reading docs for new library" | Research, x3 | Exploration, not building yet |
| "Updating CI/CD config" | Config, x2 | Tooling/environment work |
Common confusions resolved:
Q: "What if I'm blocked on a bug?" A: Tag as Blocker (primary) + note "waiting on DB team" (we track bottlenecks, not just bug fixing)
Q: "I spent 4 hours on a feature, but 2 hours was routine and 2 hours was hard. How do I tag?" A: Two tags: Feature, x2 and Feature, x7. Be granular.
Q: "Do I tag meetings?" A: Only if it's substantive work (architecture review = Research, x4). Skip standup.
Afternoon:
- Tag with confidence on work types
- Note any remaining edge cases for Friday review
Expected outcome: <10% tag ambiguity
Day 4: Thursday — Pattern Recognition
Morning:
Pilot participants get access to Developer View dashboard:
Your Heatmap (This Week):
├── Monday: 12 intensity (🟩 Normal)
├── Tuesday: 18 intensity (🟨 Warm) — Calibration adjustments
├── Wednesday: 8 intensity (🟦 Cool)
└── Thursday: (in progress)
Tag Distribution:
├── Feature: 45%
├── Bug: 25%
├── Blocker: 15%
├── Support: 10%
└── Config: 5%
Self-reflection questions (5 minutes):
- Does my heatmap match how I felt this week?
- Am I surprised by my tag distribution?
- Did I catch any Pain Streaks forming?
Pair discussion (10 minutes):
- Share your heatmap with one other pilot participant
- Discuss: "What does this data tell you about your week?"
- Look for patterns: "Why was Tuesday warm?"
Afternoon:
- Continue tagging
- Start thinking about patterns
Expected outcome: First "aha moment" — data matches reality
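The heatmap totals and distribution percentages above are plain aggregations over your tags; the dashboard computes them for you. The sketch below shows the arithmetic so you can sanity-check a number, reusing the hypothetical tag shape from Day 1 and assuming the distribution is weighted by intensity (the guide doesn't state whether it's by count or by intensity).

```typescript
// Same hypothetical shapes as the Day 1 sketch.
type WorkType = "Feature" | "Bug" | "Blocker" | "Support" | "Config" | "Research";
interface HeatTag { workType: WorkType; intensity: number; taggedAt: string }

// Daily totals: sum of tag intensities per calendar day (the heatmap numbers).
function dailyIntensity(tags: HeatTag[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const t of tags) {
    const day = t.taggedAt.slice(0, 10); // "YYYY-MM-DD"
    totals.set(day, (totals.get(day) ?? 0) + t.intensity);
  }
  return totals;
}

// Tag distribution: each work type's share of total intensity.
function tagDistribution(tags: HeatTag[]): Map<WorkType, number> {
  const total = tags.reduce((sum, t) => sum + t.intensity, 0);
  const share = new Map<WorkType, number>();
  if (total === 0) return share;
  for (const t of tags) {
    share.set(t.workType, (share.get(t.workType) ?? 0) + t.intensity / total);
  }
  return share; // e.g. Feature -> 0.45 means 45% of your intensity was Feature work
}
```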
Day 5: Friday — Week 1 Retrospective
Pilot team meeting (30 minutes):
Agenda:
Data review (10 minutes)
Manager shares (anonymized or open, team decides):
- Team heatmap view
- Tag distribution
- Any Pain Streaks detected (probably none yet, Week 1 is baseline)
Calibration check (5 minutes)
- Does everyone agree on intensity scale now?
- Any work types still confusing?
- Are tags taking >30 seconds? (If yes, simplify)
Early insights (10 minutes)
Go around the table:
- "One thing I learned about my work this week?"
Examples:
- "I didn't realize 30% of my time was Support"
- "Tuesday felt bad because I had 3 blockers stacked"
- "I thought I was productive Wednesday, but it was all Config work"
Week 2 goals (5 minutes)
- Goal: Catch one Pain Streak before it becomes a problem
- Goal: Identify one Bus Factor = 1 situation
- Goal: 90%+ tagging consistency
Expected outcome: Team is comfortable, sees value, ready to continue
Week 2: Pattern Detection
Day 6: Monday — Baseline Established
Morning standup (3 minutes):
- Quick reminder: Tag as you work
- New goal: "Let's see if we can detect a Pain Streak this week"
Manager setup:
Enable Pain Streak Alerts:
- Threshold: 3 days on same task/tag combo
- Alert delivery: Slack DM or email
- Auto-check: Daily at 5pm
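Conceptually, the alert is a consecutive-day check: the same task/tag combo tagged on N days in a row triggers a notification. HEAT's actual implementation isn't shown in this guide; the sketch below is one way to express the rule, reusing the hypothetical tag shape from Day 1.

```typescript
// Same hypothetical shape as the Day 1 sketch.
type WorkType = "Feature" | "Bug" | "Blocker" | "Support" | "Config" | "Research";
interface HeatTag { task: string; workType: WorkType; intensity: number; taggedAt: string }

const STREAK_THRESHOLD_DAYS = 3; // the pilot setting above

// Count consecutive calendar days, working backwards from today, on which the
// same task/work-type combo was tagged. (HEAT's real rule may skip weekends
// or weigh intensity; this sketch keeps it simple.)
function painStreakDays(tags: HeatTag[], task: string, workType: WorkType): number {
  const days = new Set(
    tags
      .filter((t) => t.task === task && t.workType === workType)
      .map((t) => t.taggedAt.slice(0, 10))
  );
  let streak = 0;
  const cursor = new Date();
  while (days.has(cursor.toISOString().slice(0, 10))) {
    streak += 1;
    cursor.setDate(cursor.getDate() - 1);
  }
  return streak;
}

// The 5pm check: if the streak reaches the threshold, send the Slack DM or email.
// if (painStreakDays(allTags, "PROJ-42", "Blocker") >= STREAK_THRESHOLD_DAYS) { ... }
```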
Developers:
Tag as usual. Week 1 was calibration, Week 2 is real signal detection.
Day 7: Tuesday — First Pain Streak Alert (Maybe)
Scenario: Alice gets a "🔥 Streak: 2 days" notification
Task: PROJ-42 "Fix payment gateway timeout"
Tag: Blocker, x8
Streak: Day 2 (Monday + Tuesday)
Manager action:
Check-in (5 minutes): "Hey Alice, I see you've been on the payment bug for 2 days at high intensity. Need help?"
Options:
- Pair with senior developer
- Escalate to vendor support
- Deprioritize for now if not critical
Outcome:
- Alice paired with Bob (knows payment system)
- Blocker resolved by end of day
- Cascade prevented: Rushed fix that could have caused a production incident
Value demonstrated: $245K cascade avoided (see Cascade Effect example)
Day 8: Wednesday — Bus Factor Detection
Manager View shows:
Module Ownership (Last 2 Weeks):
├── Payment Gateway: Alice (100%)
├── Auth Service: Carol (90%), David (10%)
├── API Core: Bob (60%), Alice (40%)
└── Frontend: David (70%), Carol (30%)
🚨 Bus Factor = 1: Payment Gateway
Manager action:
- Immediate: Schedule knowledge transfer session (Friday)
- This sprint: Pair Carol with Alice on payment work
- Next month: Cross-training plan for all critical modules
Value demonstrated: If Alice leaves, payment module continuity preserved
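The guide doesn't spell out how HEAT derives Bus Factor from ownership shares, but one plausible rule is: the smallest number of people who cover most of a module's recent activity. A sketch of that rule, with an assumed 80% coverage cutoff:

```typescript
// One plausible Bus Factor rule: the number of people who together account for
// at least 80% of a module's recent tag intensity. The coverage cutoff is an
// assumption; HEAT may use a different rule.
function busFactor(ownership: Record<string, number>, coverage = 0.8): number {
  const shares = Object.values(ownership).sort((a, b) => b - a);
  let covered = 0;
  let people = 0;
  for (const share of shares) {
    covered += share;
    people += 1;
    if (covered >= coverage) break;
  }
  return people;
}

busFactor({ Alice: 1.0 });            // 1: the Payment Gateway silo flagged above
busFactor({ Bob: 0.6, Alice: 0.4 });  // 2: API Core looks healthier
```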
Day 9: Thursday — Tag Analysis Insight
Tag Analysis View shows:
Team Tag Distribution (Week 2):
├── Feature: 140 intensity (good, innovation work)
├── Bug: 80 intensity (normal)
├── Blocker: 180 intensity (🔴 Alert: 2× baseline)
├── Support: 60 intensity (normal)
└── Config: 120 intensity (🟡 Warning: Environment issues?)
Drill-down on Blocker spike:
- 3 of 5 developers blocked on "QA environment down"
- Total lost capacity: ~12 hours this week
Manager action:
- Immediate: Escalate to DevOps (QA env priority)
- Root cause: Deployment script broken Monday, not fixed until Thursday
- Prevention: QA env monitoring alert added
Value demonstrated: Systemic issue caught, future cascades prevented
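The "2× baseline" alert is a ratio check against the Week 1 numbers. The exact thresholds HEAT ships with aren't documented here, so the alert and warning cutoffs in the sketch below are assumptions.

```typescript
type WorkType = "Feature" | "Bug" | "Blocker" | "Support" | "Config" | "Research";
type TagTotals = Partial<Record<WorkType, number>>;

// Compare current-week intensity per work type against the Week 1 baseline.
// The 2.0 (alert) and 1.5 (warning) ratios are assumptions for illustration.
function flagSpikes(baseline: TagTotals, current: TagTotals) {
  const flags: { workType: WorkType; ratio: number; level: "alert" | "warning" }[] = [];
  for (const workType of Object.keys(current) as WorkType[]) {
    const base = baseline[workType] ?? 0;
    const now = current[workType] ?? 0;
    if (base === 0) continue; // no baseline yet, nothing to compare against
    const ratio = now / base;
    if (ratio >= 2.0) flags.push({ workType, ratio, level: "alert" });
    else if (ratio >= 1.5) flags.push({ workType, ratio, level: "warning" });
  }
  return flags;
}

// A Week 1 Blocker baseline of 90 against this week's 180 gives ratio 2.0: alert.
flagSpikes({ Blocker: 90 }, { Blocker: 180 });
```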
Day 10: Friday — Pilot Phase Retrospective
Pilot team meeting (45 minutes):
Agenda:
Success stories (15 minutes)
Share wins:
- Alice's Pain Streak caught early
- Bus Factor = 1 on Payment identified
- Config spike revealed QA env issue
Metrics review (10 minutes)
Adoption:
- Tagging consistency: 85%+ (target: 90%+)
- Avg time per tag: 25 seconds (target: <30s)
- Dashboard usage: Manager checked 4×/week
Value:
- Pain Streaks detected: 1
- Cascades prevented: 1 (estimated $245K savings)
- Systemic issues surfaced: 1 (QA env downtime)
- Bus Factor = 1 situations: 1 (cross-training initiated)
Learnings (10 minutes)
What worked:
- Daily tagging became habit by Day 3
- Manager using data proactively, not punitively
- Dashboard matched "gut feel" (validated framework)
What needs improvement:
- Tag selector UI could be faster (1-click for common tags?)
- Some edge cases still unclear (support vs collaboration?)
- Need reminder to tag at end of day
Go/No-Go decision (10 minutes)
Criteria for full rollout:
Criterion Status Notes Tagging adoption >85% ✅ 85% Good enough At least 1 actionable insight ✅ 3 insights Exceeded Team sentiment positive ✅ Unanimous Everyone wants to continue Manager buy-in ✅ Strong Already using data Technical stability ✅ No issues Extension + API solid Decision: ✅ GO for full team rollout
Post-Pilot: Preparing for Rollout
Week 3 Prep Activities
Manager:
Document pilot wins
- Write 1-page summary for leadership
- Include: Alice's Pain Streak, Bus Factor finding, QA env issue
- Quantify value: $245K cascade prevented (conservative)
Plan team onboarding
- Schedule 30-minute team kickoff meeting
- Prepare demo using pilot team's real data (anonymized if needed)
- Assign onboarding buddy for each new user
Review thresholds
- Based on pilot data, are default thresholds right?
- Intensity color bands: <5 (cool), 5-15 (normal), 15-30 (warm), 30+ (critical)
- Adjust if needed for team norms
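If you change the bands, write the mapping down somewhere visible so everyone reads the colors the same way. A sketch of the defaults listed above (the band names are the guide's; the function itself is an illustration):

```typescript
// Default daily-intensity color bands from the pilot guide:
// <5 cool, 5-15 normal, 15-30 warm, 30+ critical. Adjust for team norms.
type Band = "cool" | "normal" | "warm" | "critical";

function colorBand(dailyIntensity: number): Band {
  if (dailyIntensity < 5) return "cool";
  if (dailyIntensity < 15) return "normal";
  if (dailyIntensity < 30) return "warm";
  return "critical";
}

colorBand(12); // "normal", e.g. the Monday total from the Day 4 heatmap
colorBand(18); // "warm", the Tuesday total
```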
Pilot participants:
Become champions
- Share your experience in team channels
- Offer to pair with new users for first tagging session
- Be honest about what worked and what didn't
Refine workflows
- Identify optimal tagging moments (end of day vs real-time?)
- Share tips (e.g., "I tag in standup for yesterday's work")
Common Pilot Issues & Solutions
Issue 1: "I keep forgetting to tag"
Solutions:
- Browser extension badge reminder
- End-of-day Slack reminder bot
- Buddy system (pair checks in daily)
- Make it a standup ritual
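If your team already runs a Slack bot, the end-of-day reminder mentioned above is a few lines. A sketch assuming the @slack/web-api client and a bot token; deciding who still needs the nudge (i.e. who hasn't tagged today) depends on whatever HEAT data access you have.

```typescript
import { WebClient } from "@slack/web-api";

const slack = new WebClient(process.env.SLACK_BOT_TOKEN);

// DM one pilot participant an end-of-day nudge. This sketch only covers the
// reminder itself; the "has this person tagged today?" check is up to you.
async function sendTagReminder(userId: string): Promise<void> {
  await slack.chat.postMessage({
    channel: userId, // a Slack user ID addresses the bot's DM with that user
    text: "End-of-day reminder: take 30 seconds to tag today's work in HEAT 🔥",
  });
}
```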
Issue 2: "I'm not sure if this is x5 or x7"
Solution:
Use comparative questions:
- "Was this harder than the last x5 task I did?"
- "Did I need a break after, or could I continue?"
- "On a scale of 1-10 physical exhaustion, how tired am I?"
Remember: Precision doesn't matter as much as consistency. If you're consistently calling similar work x6, the patterns will emerge.
Issue 3: "Manager is using this against me"
RED FLAG. Stop pilot immediately.
Required conversation:
"HEAT is a team health tool, not a performance review system. Using it punitively will kill adoption and trust."
If manager doesn't course-correct, escalate or abandon pilot.
Issue 4: "This feels like surveillance"
Clarify:
| HEAT Measures | HEAT Does NOT Measure |
|---|---|
| Effort intensity patterns | Hours worked |
| Team health signals | Individual productivity scores |
| Systemic friction | Who is "best" or "worst" |
| Knowledge distribution | Surveillance metrics |
Opt-in philosophy:
- You choose what to tag
- Your data is visible to you first
- Manager sees patterns, not granular activity
If concern persists, review Architecture privacy section together.
Issue 5: "Tags don't match my reality"
Investigate:
Calibration issue?
- Review intensity scale together
- Use Physical Test framework
Work type confusion?
- Review Observable Signals
- Clarify edge cases
Legitimate gap?
- Maybe HEAT isn't capturing your work type (e.g., heavy meeting load)
- Consider custom tags or note this as limitation
Remember: Pilot is for finding these gaps. Iterate.
Pilot Success Metrics
Quantitative (Minimum Thresholds)
| Metric | Target | Acceptable | Concern |
|---|---|---|---|
| Tagging consistency | >90% | >80% | <80% |
| Avg time per tag | <20s | <30s | >30s |
| Dashboard usage (manager) | Daily | 3×/week | <2×/week |
| Actionable insights | 3+ | 1+ | 0 |
| Pain Streaks detected | 1+ | 0 (Week 1 only) | 0 (Week 2) |
Qualitative (Must-Haves)
✅ Team sentiment: Majority positive or neutral
✅ Manager buy-in: Using data proactively
✅ Technical stability: No blocking bugs
✅ Privacy comfort: No surveillance concerns
✅ Value proof: At least 1 "aha moment"
Pilot Deliverables (End of Week 2)
For Leadership:
One-page summary
- Pilot team: 5 developers, 2-week trial
- Key findings: [List 3-5 insights]
- Value demonstrated: [$X cascade prevented]
- Recommendation: Proceed to full rollout
Example dashboard screenshot (anonymized)
- Team heatmap
- Tag distribution
- Pain Streak example
For Team:
Onboarding guide (based on pilot learnings)
- 5-minute quick start
- Common edge cases FAQ
- Tag cheat sheet
Manager playbook
- Daily/weekly/monthly rituals
- Intervention matrix (when to act on signals)
- Example check-in scripts