Purpose
Judgment, not trivia
The cases surface the habits senior analysts need: metric humility, causal caution, operational risk, fairness burden, and uncertainty communication.
Data Science Judgment Lab
This is a 25-case room for practicing the hard part of data science: deciding what messy evidence can support, what it cannot support, and how confident you should be when pressure is in the room.
Method
You work the evidence before the explanation appears. The struggle is the point: tempting wrong stories become visible enough to learn from.
Record
Completing reviewed cases creates a local judgment record. It is a learning artifact, not a proctored credential.
Case library
A launch-week chart jumps, a deck is due, and several teams have reasons to claim the movement.
A checkout readout lands just before planning closes, and different artifacts point toward different launch stories.
A polished retention model arrives with a renewal deadline, a crowded outreach queue, and a promise that the save team can act sooner.
A city inspection team has a new routing screen, a long backlog, and one week to decide how much authority the score should have.
A district impact brief is headed to a funding vote after students who used a tutoring platform show stronger spring gains.
A city housing office must set winter overflow capacity from a forecast that fits ordinary nights better than pressure weeks.
A state benefits agency wants to use a verification score to cut backlog, but the burden may land unevenly on applicants with messier administrative records.
A benefits agency chatbot handles routine questions well, but evaluation logs show confident wrong answers on high-stakes claim situations.
The same benefits agency must choose a payment-hold threshold that catches fraud without turning suspicion into broad payment delay.
The benefits modernization program changes its executive metric, and the new dashboard may reward faster closure while hiding reopened cases and payment delay.
A customer research survey appears decisive until response patterns reveal who never had a real chance to answer.
A familiar hospital operations field powers a clean improvement story while source systems leave conflicting traces.
A clinical risk report looks stable after dropping incomplete records, but missingness follows staffing, language access, and acuity.
A de-identified public health export clears a checklist, but linkage, consent scope, and lifecycle controls make the release less simple.
A board packet turns an early operational shift into a dramatic story, and the chart frame is doing more work than it first appears.
A regional media test appears to win, but market matching, spillover, seasonality, and operational changes keep the counterfactual unsettled.
A policy brief claims a workforce pilot raised employment, but the comparison group was already drifting away before launch.
An eligibility cutoff seems to prove a rental-assistance navigator prevented evictions, until sorting around the threshold weakens the design.
A product experiment gets a fast no-go recommendation, but the exposure record and interval width leave more than one interpretation alive.
A subscription checkout test lifts paid starts, but refunds, retention, and support burden make the growth claim less settled.
A hospital readmission score looks unusually strong in validation, and the launch team wants it in the discharge workflow next month.
A moderation model beats the old rules engine on a vendor benchmark, but the benchmark labels may be measuring vendor behavior more than policy truth.
An ETA model drift alarm is real, but the deeper failure is that monitoring is not connected to owned operational response.
A holiday replenishment model looks accurate enough to override planners, but store operations leave clues that demand may not be fully visible.
An enterprise assistant performs well on clean sales workflows, and revenue operations wants tool-enabled expansion before renewal season.