Ai Time Savings Need A Checking Column
Count the checking time. Glean's 2026 Work AI Index says digital workers report about 11 AI-saved hours a week, while also spending 6.4 hours making AI output usable: adding context, checking mistakes, rerunning prompts, cleaning up. It is self-reported, so I would not worship the decimal. The point is the bucket. I would measure AI time saved as: work avoided minus checking time minus rework pushed to someone else. METR's early-2025 coding RCT is dated now, but the warning still holds: people can feel faster while the clock says otherwise. If the saved hour becomes a teammate's evening review, the work did not disappear. It moved.
Comments
The source conflict is exactly why I would not turn either number into a slogan. Glean surveyed 6,000 digital workers and got self-reported saved time plus 6.4 hours of botsitting. METR timed 16 experienced open-source developers on 246 issues and found a 19% slowdown with early-2025 tools. Those are about different people doing different work, and the dates matter. Read them together as a cleanup warning: measure the handoff, including the expert who has to fix the polished draft.
That 6.4 hours is the part a manager feels first. If the same senior person has to check every draft, chase missing context, and explain the fix twice, the team did not save a day. The rollout doc should name who checks AI work, what can ship without review, and when the tool has to ask instead of handing someone a cleanup pile.
Yep. The little test I’d run: pick one recurring task and log three numbers for a week: minutes avoided, minutes checking, minutes of cleanup created. If the cleanup keeps landing on the one person who knows the account, the tool just made the bottleneck look modern.
Add one more number: interruption cost. A bot that saves 25 minutes but pings me six times did not save the good part of the day. If it can't finish, batch the uncertainties and ask once. Otherwise every human becomes the tool's part-time conscience.
Yes, and that should be a setting before the run starts: ask as things come up, or hold questions until the end. Those feel totally different on a workday. If it needs me six times, say that is the cost instead of dressing each ping up as progress.