THE SCORECARD
Three Engineering objectives at the function level — squad reliability, on-call distribution, and platform health
Your CTO gets graded on architecture and the long-term tech bets. Your VP Engineering gets graded on shipping the roadmap. You and your squads get graded on something different. Can squads ship what they commit to, sprint after sprint? Is on-call work distributed enough to keep your senior ICs from quitting? Is the internal platform healthy, or starved every time product features need shipping?
The three objectives below are what an Engineering leader would actually write down for the quarter. They're operational. They're measurable. And they're the ones that fail quietly — long before the roadmap misses.
| Objective | Key Result | Benchmark / Threshold | Target |
|---|---|---|---|
| Improve squad reliability so 90% of sprint commits ship to production. When squads consistently ship what they commit, the VP Eng can give the CEO a roadmap with real dates. When they don't, every quarter ends with three teams explaining why their work slipped. | Each squad ships ≥ 90% of its sprint commitment to production | 60–75% typical at this stage¹ (Benchmark) | ≥ 90% |
| | Cut mid-sprint scope changes below 10% of committed points | 25–40% typical² (Benchmark) | < 10% |
| | Hold unplanned interrupt work below 20% of squad capacity | 35–45% typical¹ (Benchmark) | < 20% |
| Distribute on-call load so no engineer carries more than 4 pages per week. When 3 senior engineers are taking 70% of the pages, you're one resignation away from losing the people who know your most fragile systems. Spreading the load is how you keep them. | Top on-call engineer's page count ≤ 1.5× the squad median, rolling 4 weeks | 2.5–4× typical³ (Threshold) | ≤ 1.5× |
| | Weekend pages distributed across ≥ 4 engineers per squad per quarter | Concentrated in 1–2 typical (Threshold) | ≥ 4 / squad |
| | Senior+ engineer regretted attrition tied to on-call below 5% / qtr | 10–15% at this stage⁴ (Benchmark) | < 5% |
| Run the platform team like an internal product team with measurable SLOs. Platform engineers serve every other squad in the org, but their work is usually the first thing cut when product features need shipping. Treating them as an internal product team protects the leverage everyone else depends on. | Platform team holds 99.5% SLO on internal CI/CD, deploy tooling, observability | Often unmeasured at this stage (Threshold) | ≥ 99.5% |
| | Internal NPS from product squads ≥ 40 (platform-as-product satisfaction) | −10 to +20 typical⁵ (Benchmark) | ≥ 40 |
| | Platform team's quarterly capacity protected from product reallocation ≥ 80% | 40–60% typical; gets cut for features (Threshold) | ≥ 80% |
Why on-call distribution (O2) is the senior IC retention problem
Every VP Engineering will tell you on-call is "rotated." Pull the PagerDuty data. Three engineers are carrying 70% of weekend pages — because they're the only ones who know the legacy auth system, the billing service, or the ETL pipeline.
The rotation looks fair on the calendar — until those three quit. O2 isn't really about fair scheduling. It's about whether the team has built enough operational coverage that no one engineer is a single point of failure for an entire system.
STRATEGIC BETS
The three strategic bets inside the Engineering stack — what to focus on this quarter
Your squads are already running the recurring work — standups, refinement, retros, code review, on-call rotations, deploys, incident response, postmortems. That's table stakes and it doesn't stop. Strategy at the function level is which three transformations you commit to this quarter, on top of the regular work. The three below are the most common bets an Engineering leader makes at this stage, and the specific initiatives that make each one real.
Strategy 1 — Make squad commits visible across the org, not buried in Jira
→ O1
1.1
Publish a weekly squad-commit dashboard — every squad's commit, deploy, and slip rate visible to peer EMs and the VP Eng
All EMs + Data
1.2
Force T-shirt sizing on every story above 2 points at refinement — squads that skip refinement get no commit credit
Internal
1.3
Lock mid-sprint scope changes behind a written exception — Product owns the form, VP Eng signs off
VP Product + VP Eng
1.4
Run a monthly "commit reliability" review across squads — name patterns, not people; surface what blocks predictability
Internal
Strategy 2 — Spread on-call ownership before the senior IC writes the resignation
→ O2
2.1
Map every production system to ≥ 3 trained on-call responders — single-knowledge systems get capacity to fix that this quarter
Internal
2.2
Track per-engineer page count weekly — anyone trending past 4/week gets re-routed before the rolling-4-week threshold breaks
Internal
2.3
Invest 20% of squad capacity in runbook + automation work for the top 5 page sources per quarter — not optional, not deferred
VP Product + VP Eng
2.4
Run a quarterly on-call retro per squad — not blameless culture noise, actual data on what fired and who absorbed it
Internal
Strategy 3 — Treat the platform team like the most expensive customer the org has
→ O3
3.1
Publish platform SLOs the same way you publish customer SLAs — uptime, deploy time, build time, rollback time
Internal
3.2
Run quarterly internal-NPS surveys from product squads — what's slowing them down, what they'd pay for if platform were a vendor
Internal
3.3
Lock platform-team capacity at 80% protected — VP Eng signs every reallocation request, not the EM under pressure
VP Eng + CFO
3.4
Pick one platform investment per quarter that compounds — CI speed, deploy automation, observability — and ship it like a customer feature
Internal
How this differs from your VP Engineering's scorecard
Your VP Engineering is judged on whether the roadmap ships. You and your squads are judged on whether you can keep shipping that roadmap quarter after quarter.
That depends on senior ICs not quitting, the platform team not getting starved, and EM cadence holding consistent. The roadmap can ship for a few quarters even when the team underneath is straining. But eventually the senior IC writes the resignation email — and the next two quarters slip.
ENFORCEMENT LAYER
Enforcement triggers for Engineering OKRs — the cadence layer above Jira and PagerDuty
Jira shows you tickets. Linear shows you issue state. PagerDuty shows you pages. Each does its own job. But none of them tells you when a squad's commit-to-deploy ratio has been quietly drifting for 3 sprints, or when one engineer's page count has crossed the burnout threshold for the second month running. That's what enforcement does — it's the layer that sits above your Engineering tools and watches the cadence.
ShiftFocus watches seven trigger types on every Engineering KR. Two of them are the ones you'll see fire most often in the Engineering team of a 200–500-person SaaS company: Velocity Drop (Trigger 2) and Dependency SLA Breach (Trigger 6). Most Engineering OKR misses trace back to one of these, and they almost always show up at the perf cycle, not in week 4 when you could have fixed them.
The two that fire hardest at the Engineering function layer
Trigger 2 · Velocity Drop — the on-call concentration killer
⚡ Fires when: Progress on commit-to-deploy, on-call distribution, or platform-SLO KRs falls below 50% of planned pace by mid-cycle. (Threshold)
▎ Why this matters
Engineering KRs miss in slow-motion. The on-call distribution KR is "no engineer above 4 pages/week." Week 6: one engineer hits 7 pages. The squad EM thinks it's a one-off. Week 8: same engineer hits 9. Now it's a pattern but the senior IC is already drafting the resignation. Trigger 2 fires when the math says the per-engineer threshold is going to break — at week 6, not at the resignation email.
▎ Example scenario
Q3 rolling-4-week page count: senior IC at 7.2 (threshold: 4). Squad median: 1.8. Ratio = 4.0× (threshold ≤ 1.5×). Trigger 2 fires. EM gets the auto-brief — 3 production systems route only to this engineer, 2 squads have unfilled rotation slots, projected burnout window 4–6 weeks. Re-route or re-train before the senior IC quits.
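The ratio in this scenario is plain arithmetic: the most-paged engineer's rolling-4-week rate divided by the squad median. Below is a minimal Python sketch of that check. The input shape (engineer mapped to average pages per week), the engineer names, and the choice to include the top engineer in the median are all assumptions for illustration, not ShiftFocus's or PagerDuty's actual data model.

```python
from statistics import median

PAGES_PER_WEEK_LIMIT = 4.0   # per-engineer threshold from the KR
MEDIAN_RATIO_LIMIT = 1.5     # top engineer vs squad median, from the KR


def concentration_check(avg_weekly_pages: dict[str, float]) -> dict:
    """Compare the most-paged engineer to the squad median.

    avg_weekly_pages maps engineer -> average pages/week over the
    rolling 4-week window (hypothetical input shape).
    """
    top_engineer = max(avg_weekly_pages, key=avg_weekly_pages.get)
    top_rate = avg_weekly_pages[top_engineer]
    squad_median = median(avg_weekly_pages.values())
    # Guard against a squad whose median is zero pages.
    ratio = round(top_rate / squad_median, 2) if squad_median else float("inf")
    return {
        "engineer": top_engineer,
        "ratio": ratio,
        "breach": top_rate > PAGES_PER_WEEK_LIMIT or ratio > MEDIAN_RATIO_LIMIT,
    }


# Illustrative squad matching the scenario: top engineer at 7.2, median 1.8.
squad = {"eng_a": 7.2, "eng_b": 1.8, "eng_c": 1.5, "eng_d": 2.0, "eng_e": 1.8}
result = concentration_check(squad)
print(result)
```

With these illustrative numbers the ratio comes out at 4.0, well past the 1.5× limit, so the check flags a breach on both conditions.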
Trigger 6 · Dependency SLA Breach — the squad-blocking-squad killer
⚡ Fires when: Cross-squad dependency (API delivery, library version, schema migration, shared service) misses its agreed delivery date. (Threshold)
▎ Why this matters
Squad A's commit-to-deploy KR depends on Squad B shipping an API. Squad B slips by 2 weeks. Squad A's commit looks at-risk on paper but Jira shows the ticket "in progress" — the actual blocker is in someone else's backlog. Trigger 6 catches the dependency breach the day Squad B misses, not 3 weeks later when Squad A's KR turns red.
▎ Example scenario
Squad A committed to ship feature X by sprint 4, conditional on Squad B's auth API by sprint 2. Sprint 2 close: API not shipped. Trigger 6 fires immediately to both EMs and the VP Eng — auto-brief shows downstream blast radius (Squad A's KR, 2 customer commits at risk). Now Squad B's miss is a tracked breach, not a Slack thread.
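One way to picture the dependency check is as a record of who owes what to whom by which sprint, evaluated at each sprint close. The sketch below is a hypothetical model in Python. The `Dependency` fields and the `breached` helper are invented for illustration and do not describe how ShiftFocus stores or evaluates dependencies.

```python
from dataclasses import dataclass


@dataclass
class Dependency:
    provider: str          # squad that owes the deliverable
    consumer: str          # squad whose KR depends on it
    deliverable: str
    due_sprint: int        # agreed delivery sprint
    shipped: bool = False


def breached(deps: list[Dependency], closed_sprint: int) -> list[Dependency]:
    """Return dependencies whose agreed sprint has closed without delivery."""
    return [d for d in deps if not d.shipped and closed_sprint >= d.due_sprint]


# The scenario above: Squad B owes Squad A an auth API by sprint 2.
deps = [Dependency("Squad B", "Squad A", "auth API", due_sprint=2)]

at_sprint_1 = breached(deps, closed_sprint=1)   # not yet due
at_sprint_2 = breached(deps, closed_sprint=2)   # due and unshipped
print(len(at_sprint_1), len(at_sprint_2))
```

The breach is visible at the close of sprint 2, two sprints before Squad A's own commitment comes due.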
The other 5 that also fire on Engineering KRs
Trigger 1 · Missed Check-in
⚡ When: EM, tech lead, or platform-team owner skips the weekly KR update. 48h auto-nudge, then escalates.
▎ Example scenario
Platform-team EM skips Friday SLO check-in for 2 weeks running. Trigger 1 nudges, then routes to VP Eng with the missed metrics flagged.
Trigger 3 · Momentum Decay
⚡ When: Commit reliability, mid-sprint scope-change rate, or on-call distribution trends in the wrong direction 2+ sprints running.
▎ Example scenario
Squad commit ratio: sprint 1 = 88%, sprint 2 = 82%, sprint 3 = 76%. Three-sprint drift down. Trigger fires before the squad crosses the 70% structural-debt threshold.
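"Trending in the wrong direction 2+ sprints running" can be read as two or more consecutive sprint-over-sprint moves in the bad direction. The Python sketch below implements that reading. The function name and the `higher_is_better` flag are assumptions made for illustration, and the actual trigger logic may differ.

```python
def momentum_decay(readings: list[float], min_moves: int = 2,
                   higher_is_better: bool = True) -> bool:
    """True if the latest `min_moves` sprint-over-sprint changes
    all went in the wrong direction."""
    if len(readings) < min_moves + 1:
        return False  # not enough sprints to judge a trend
    recent = readings[-(min_moves + 1):]
    deltas = [later - earlier for earlier, later in zip(recent, recent[1:])]
    if higher_is_better:
        return all(d < 0 for d in deltas)
    return all(d > 0 for d in deltas)


# Commit ratio from the scenario: 88% -> 82% -> 76%.
commit_ratio_fires = momentum_decay([88, 82, 76])

# A metric where lower is better, e.g. mid-sprint scope-change rate (made-up values).
scope_change_fires = momentum_decay([12, 15, 19], higher_is_better=False)

print(commit_ratio_fires, scope_change_fires)
```

A single recovery sprint resets the check, which is why the scenario stresses catching the drift before the third reading lands.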
Trigger 4 · KPI Drift
⚡ When: Underlying KPI (deploy frequency, change failure rate, MTTR, build time) crosses an operating threshold without the parent KR flagging.
▎ Example scenario
Build time creeps from 4 min → 7 min over 6 weeks. Aggregate platform SLO still green. Trigger 4 catches the drift before it becomes a developer-experience complaint.
Trigger 5 · Owner Absence
⚡ When: A KR has no active owner-driven progress for 5+ business days: the owner is OOO, transitioning, or quietly disengaged.
▎ Example scenario
Platform-team EM is out on PTO for 2 weeks. The SLO KR shows no movement during that window. Trigger 5 fires on day 6, and VP Eng assigns an interim owner before SLO drift becomes invisible debt.
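The "5+ business days" rule depends on how days are counted. The Python sketch below shows one possible convention: count weekdays after the last owner update, up to and including today, and ignore public holidays. Both the convention and the function names are assumptions, not a description of how the trigger is actually implemented.

```python
from datetime import date, timedelta

ABSENCE_LIMIT = 5  # business days without an owner update, per the trigger


def business_days_since(last_update: date, today: date) -> int:
    """Count weekdays after last_update, up to and including today."""
    count = 0
    day = last_update
    while day < today:
        day += timedelta(days=1)
        if day.weekday() < 5:  # Monday=0 ... Friday=4
            count += 1
    return count


def owner_absent(last_update: date, today: date) -> bool:
    return business_days_since(last_update, today) >= ABSENCE_LIMIT


last_update = date(2024, 3, 1)                       # a Friday
print(owner_absent(last_update, date(2024, 3, 7)))   # Thursday: 4 business days
print(owner_absent(last_update, date(2024, 3, 8)))   # Friday: 5 business days
```

Under this convention the weekend does not count against the owner, so a Friday update keeps the KR quiet until the following Friday.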
Trigger 7 · Projected Miss
⚡ When: Projected end-of-quarter completion on a function KR drops below 70% at week 6. The quarter still has 7 weeks left, but the trajectory is broken.
▎ Example scenario
"Platform NPS ≥ 40" KR for end of Q2. Week 6 survey: 18. Trajectory projects 22 by quarter close. Trigger 7 fires now — re-prioritize platform investments while there's still a quarter to recover.
What this catches that Jira + PagerDuty miss
Jira shows you ticket state. PagerDuty shows you page volume. Neither tells you that a senior IC has been carrying 4× the squad median for 3 weeks running, or that Squad A's KR is at risk because Squad B's API is shipping a sprint late. ShiftFocus watches the rhythm of progress on every KR, across squads, systems, and sprints, and surfaces the problem while you still have time to fix it.
ESCALATION DESIGN
The Engineering OKR escalation chain — 5 levels on a 48-hour clock
Right now, Engineering escalation is informal. The squad lead mentions a problem at sprint retro. The EM DMs the VP. The VP hears about it at the perf cycle. By the time it reaches the VP, the senior IC has already been carrying 70% of the on-call load for 6 sprints.
The chain below replaces that. Every level has a 48-hour clock. If the person above doesn't resolve it in 48 hours, it auto-routes up. Below is one example — on-call concentration crossing the burnout threshold — walked through all 5 levels.
L1
Auto-Nudge — to the squad EM
Tuesday week 6: senior IC's rolling-4-week page count hits 7.2 (threshold: 4). Squad-median ratio = 4.0× (threshold: 1.5×). EM gets Slack + email with the engineer name redacted to peers, the page-source breakdown, and the SLA they breached.
Immediate
L2
Peer Flag — adjacent EMs + platform lead see it
Thursday: page count uncorrected. Adjacent squad EMs get pinged — knowledge-sharing or rotation-borrowing options surfaced. Platform lead sees if any production system can be re-mapped to remove the SPoF.
+48h
L3
VP Engineering Brief — escalation lands on the desk
Saturday: still uncorrected. VP Eng gets a brief — engineer named, 3 systems route only to them, 2 unfilled rotation slots in adjacent squads, modeled retention risk window 4-6 weeks, suggested actions (cross-train next sprint, freeze new commits on those 3 systems, route weekend pages to platform lead). Owns the next move.
+48h
L4
CTO Brief — function-level exposure
Week 8 auto-check: senior IC's page count still ≥ 6/week. CTO gets a one-page brief — what's failing, why, what to do. Specifically: a senior IC with 18 months tenure is in a measurable burnout window. Replacement cost modeled at $480K (1.5× FLC + 6mo ramp). Decision required.
Week 8
L5
Intervention — exec war room
3 weeks before quarter close. On-call concentration unresolved across 2+ engineers. War room fires. CTO + VP Eng + CHRO + CFO. Re-allocate platform capacity, freeze feature commits on at-risk systems, or accept the budgeted attrition cost — locked within 48 hours.
T-3 weeks
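The first three levels of the walkthrough are spaced exactly 48 hours apart (Tuesday, Thursday, Saturday). The Python sketch below reproduces that spacing under stated assumptions: the start date is arbitrary, the level labels are copied from the walkthrough, and L4 and L5 are left out because the walkthrough schedules them by week rather than by the 48-hour clock.

```python
from datetime import datetime, timedelta

CLOCK = timedelta(hours=48)

# Levels that advance on the 48-hour clock in the walkthrough.
LEVELS = [
    "L1 Auto-Nudge (squad EM)",
    "L2 Peer Flag (adjacent EMs + platform lead)",
    "L3 VP Engineering Brief",
]


def escalation_schedule(fired_at: datetime) -> list[tuple[str, datetime]]:
    """Return (level, time) pairs assuming each level goes unresolved."""
    return [(level, fired_at + i * CLOCK) for i, level in enumerate(LEVELS)]


fired = datetime(2024, 3, 5, 9, 0)  # an arbitrary Tuesday morning
schedule = escalation_schedule(fired)
for level, when in schedule:
    print(f"{when:%a %d %b %H:%M}  {level}")
```

Resolving the issue at any level would stop the chain, so the schedule is the worst case where nobody acts.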
What this kills
The familiar Engineering story: a senior IC quits in week 11 of the quarter. The squad's velocity drops 30% the next sprint. The post-mortem concludes "we didn't see it coming."
With this chain, Trigger 2 catches the on-call concentration the first time the ratio crosses 2.5×. Same facts, six weeks earlier, with the right person on it.
EXECUTION INTELLIGENCE
Five execution metrics that track every Engineering OKR
Your Engineering tools tell you what shipped. ShiftFocus tells you whether you're going to hit your OKRs — using five simple metrics that run on every KR. The same five metrics run on every team's KRs in the company. So when you walk into your VP Eng 1:1, you already know what they're seeing.
What the leakage actually costs
Engineering team failures don't show up as one number. They show up across senior-IC attrition, platform-team turnover, deploy reliability, and customer-facing reliability incidents. The numbers below are sourced; the scenario is a $40M ARR SaaS at 300 employees.
| Leakage source | Cost model | Modeled cost |
|---|---|---|
| Senior IC attrition tied to on-call burden | 2 senior+ engineers / qtr × $480K replacement cost (1.5× FLC $220K + 6mo ramp)¹ | -$960K |
| Platform team capacity cut for product features | Avg 30% of platform capacity reallocated each sprint × $180K/eng FLC × 4 engineers × 1 qtr² | -$216K |
| Customer-facing incidents from deferred platform work | Avg 3-5 incidents / qtr × $60-180K avg cost (revenue + remediation + comms)³ | -$540K |
| Sprint-replan overhead from mid-sprint scope changes | Avg 6 squads × 2 mid-sprint replans / sprint × 4 hrs / replan × $200/hr blended² | -$58K |
| Slow build / deploy time across all squads | 12 min build vs 4 min target × 30 builds/eng/wk × 60 engineers × 12 weeks × $150/hr⁴ | -$130K |
| Senior engineer time on toil instead of leverage work | 10 senior+ engineers × 30% toil time × $250K FLC × 1 quarter⁴ | -$190K |
| Quarterly cost band of running engineering without enforcement | | $2.4M – $4.1M |
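For readers who want to trace the arithmetic, the Python sketch below works through three of the rows above. Two inputs are assumptions the source does not state: six sprints per quarter for the replan row, and FLC read as an annual figure for the toil row. The results land on the table's figures after rounding.

```python
# Inputs copied from the cost rows; the two items below are assumptions.
SPRINTS_PER_QUARTER = 6   # assumed: two-week sprints over roughly 12 weeks
QUARTERS_PER_YEAR = 4     # assumed: FLC is an annual figure

# Senior IC attrition: 2 engineers per quarter at $480K replacement cost each.
attrition_cost = 2 * 480_000

# Replan overhead: 6 squads x 2 replans/sprint x 4 hrs x $200/hr, per sprint.
replan_cost = 6 * 2 * 4 * 200 * SPRINTS_PER_QUARTER

# Toil: 10 engineers x 30% of a $250K FLC, for one quarter of the year.
toil_cost = 10 * 250_000 * 30 // 100 // QUARTERS_PER_YEAR

for label, cost in [("attrition", attrition_cost),
                    ("replans", replan_cost),
                    ("toil", toil_cost)]:
    print(f"{label:>10}: ${cost:,}")
```

The replan and toil results ($57,600 and $187,500) round to the table's $58K and $190K.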
The ROI math for an Engineering function
Modeled quarterly cost: $2.4M–$4.1M. Annual: $9.6M–$16.4M.
Stop one senior IC resignation tied to on-call burnout, or catch one platform-team capacity drift before it becomes a customer reliability incident — and the tool has paid for itself several times over. The point isn't "another sprint dashboard on top of Jira." It's making cadence visible across squads before the senior IC quits.
▶ Pilot-verifiable
See where your engineering org's hero-debt is going to break — before the senior IC writes the resignation.
Connect your Jira or Linear plus PagerDuty. We'll audit the last 4 sprints for commit-reliability drift, on-call concentration patterns, and cross-squad dependency breaches — and show you which squad's hero-debt is the next attrition risk.