It starts with one more prompt. The AI delivered a good first draft, so you ask a follow-up question. Then another. Then you refine the output, feed it back, iterate again. Two hours vanish. You look up and realize you skipped lunch — not because you were under pressure, but because the next answer was always right there, waiting.
Sound familiar? I have watched this pattern in myself for months now, and the research from 2025 confirms I am not alone. People who work intensively with AI work longer, not shorter.[1] The pull is real, and it operates independently of explicit ambition — though ambition, as we will see, can amplify it into something harder to contain.
For decades, the management playbook for high performance has been the moonshot: set goals so ambitious that even partial success produces remarkable results. Aim for the moon, land among the stars. The approach has deep empirical roots — a 2002 meta-analysis by Locke and Latham, synthesized from decades of goal-setting studies, finds that difficult, specific goals outperform easy or vague ones in 96 percent of cases, with an effect size of d = 0.82.[2] Moonshots work. The question is whether they still work when one member of the team never gets tired, never loses focus, and never cares about the goal at all.
Moonshots earned their reputation under conditions that no longer hold
The evidence for stretch goals is not thin: goal-setting effects are among the most replicated findings in organizational psychology, and they have survived decades of scrutiny. Sitkin and colleagues refined the picture by showing that stretch goals produce their best results when organizations already have high performance and slack resources — the capacity to absorb failure without existential risk.[3] Google X operates on this principle: roughly a two percent graduation rate from its pipeline, but with an infrastructure designed to harvest value from the projects that fail — what Astro Teller has called "moonshot compost."[4] Under the right conditions, moonshots are not reckless. They are generative. Prominent voices in the AI industry — from consultancies promoting "10x AI transformation" to executives framing quarterly targets around maximum AI leverage — continue to apply this logic to human-AI collaboration.
Two caveats complicate this picture. The first is empirical: Sitkin's own research shows that organizations without slack tend to reach for stretch goals as a desperate rescue measure — and almost invariably fail. Field data find no positive main effect of stretch goals on performance; instead, they produce a polarized distribution with a few winners and many losers.[5] The second caveat is cultural, and it cuts deeper than implementation details. Moonshots originated in a US business culture that tolerates aspirational goals — targets understood to be directional, not literal. In Germany and many European corporate cultures, goals are taken more seriously. What was intended as motivational framing gets translated into mandatory targets. A quarterly review in Munich does not treat a moonshot the way a quarterly review in Mountain View does. Aspiration becomes obligation, and obligation erodes the very autonomy that made stretch goals productive in the first place.
The new teammates don't respond to motivation
All of this goal-setting research rests on a premise so fundamental it usually goes unspoken: ambitious goals work because they create motivation. AI does not have or need motivation. An experiment that tested whether increasing prompt difficulty and specificity — the two core dimensions of goal-setting theory — improved LLM performance found no consistent improvement on either dimension.[6] The motivational lever that Locke and Latham identified does not exist on the machine side. The mechanism that makes stretch goals powerful for humans simply does not transfer.
What does improve AI output is not ambition but instruction structure. When context windows grow beyond 10,000 tokens, LLM reliability degrades by 20 to 50 percent — even for simple retrieval tasks.[7] Structured prompting can cut hallucination rates by more than half.[8] Shorter contexts, clearer task boundaries, decomposed steps: these are the variables that matter on the machine side — at least with current architectures. Model capabilities are evolving rapidly, and some of these constraints may weaken or disappear. But even if the technical limitations dissolve, the management implication remains: the optimal unit of work for an AI is not a project-level stretch goal but a task-level deliverable. Output quality is determined not by how high you aim but by how you break the work apart.
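The decomposition logic can be made concrete. Below is a minimal sketch, not a production implementation: `run_task` is a hypothetical stand-in for any LLM call, and the per-task context budget and the four-characters-per-token estimate are illustrative assumptions. The structural point is that instead of handing the model one project-level goal, the work is split into short, bounded, individually reviewable task prompts.

```python
# Minimal sketch: task-level decomposition instead of one project-level prompt.
# `run_task` is a hypothetical placeholder for any LLM call; here it just echoes.

CONTEXT_BUDGET = 2_000  # illustrative per-task token budget, well below degradation range


def run_task(prompt: str) -> str:
    """Placeholder for an LLM call; a real implementation would query a model."""
    return f"[draft for: {prompt.splitlines()[1]}]"


def decompose(project_goal: str, tasks: list[str]) -> list[str]:
    """Turn one ambitious goal into short, bounded, completable task prompts."""
    prompts = []
    for i, task in enumerate(tasks, start=1):
        prompt = (
            f"Goal context: {project_goal}\n"
            f"Task {i} of {len(tasks)}: {task}\n"
            "Deliverable: a reviewable draft for this task only."
        )
        # Rough token estimate (~4 characters per token); split further if too long.
        assert len(prompt) / 4 < CONTEXT_BUDGET, "split this task further"
        prompts.append(prompt)
    return prompts


prompts = decompose(
    "Write a market analysis for a hypothetical product",
    ["outline the report", "draft the competitor section", "summarize findings"],
)
drafts = [run_task(p) for p in prompts]  # one short, bounded call per task
```

The design choice mirrors the argument in the text: each call stays far below the context length at which reliability degrades, and each deliverable is small enough for a human to review before the next step begins.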
Where does this asymmetry leave teams that contain both humans and machines? In a 2026 survey of almost 6,000 executives across the US, UK, Germany, and Australia, eighty percent reported zero productivity gains from AI despite active use — a self-assessment measure, but one consistent across countries and firm sizes.[9] A separate report from MIT estimates that ninety-five percent of AI pilots fail or stall before reaching production scale.[10] These numbers do not indict AI. They challenge the assumption that the old management model transfers to a fundamentally new kind of collaboration.
When ambition meets addiction, the result is not performance
Return to the pull effect. The CEPR data show that moving from the 25th to the 75th percentile of AI exposure corresponds to 2.2 additional working hours per week — regardless of goal type.[1] An eight-month ethnography at a single US technology company documented three mechanisms of intensification: task expansion, where job roles blurred as people took on work previously outside their scope; boundary dissolution, where AI-assisted work bled into lunch breaks and evenings; and a multitasking illusion, where parallel AI conversations created a sense of productivity without proportional output.[11] None of this was imposed from above. The conversational interface makes work feel like chatting, and chatting has no natural stopping point.
Now layer moonshots onto this dynamic. When ambitious targets are treated as mandatory — when the quarterly review asks why you did not hit the number — the pressure to maximize AI use intensifies. The human becomes a reviewer rather than a creator, the AI dictates the pace, and the range of what counts as worthwhile work narrows to what the machine can accelerate.
A finding from a study commissioned by the freelancing platform Upwork becomes relevant here, though it must be read with the source's commercial interest in mind: 88 percent of the most productive AI users report burnout, while freelancers using the same tools report significantly higher satisfaction.[12] The comparison is imperfect — freelancers self-select into autonomous work arrangements, and income levels, job types, and personality factors differ between the groups. But the pattern aligns with what two decades of burnout research predict: the job demands-resources model identifies autonomy as a stronger buffer against burnout than goal difficulty.[13] What separates the burned-out from the thriving is not ambition but control — control over when, how, and whether to engage.
The risk is that moonshots in human-AI teams erode this control through a plausible but untested sequence: ambitious targets encourage maximum AI use, which shifts decision-making from human to machine, which narrows the very resource — self-determination — that buffers against burnout. I have watched this mutation in practice. What begins as an aspirational target slowly becomes an unquestioned mandate; the original intent — to inspire reaching beyond the comfortable — gets lost, and what remains is the number. No study has tested this specific causal chain. But the individual links — mandatory AI use reducing autonomy, reduced autonomy predicting burnout, moonshot targets drifting from aspirational to mandatory — are each empirically grounded. The chain is a hypothesis. The links are not.
Small wins do not solve the problem — they make the asymmetry visible
Karl Weick's original 1984 paper on small wins is often misread as an argument for modest goals. It is not. A small win, Weick wrote, is not a small goal — it is a cognitive reframing strategy that reduces arousal and enables rational problem-solving.[14] You can pursue ambitious objectives and still structure your execution as a series of short, completable cycles. Teresa Amabile's research on the progress principle found that the single strongest predictor of engagement and creativity is the sense of making progress on meaningful work.[15] On the AI side, decomposition delivers measurable improvements: shorter contexts reduce hallucinations, and iterative refinement outperforms single-shot generation.[7][8]
Yet the tension runs deeper than implementation. A study of 3,562 participants found that while AI collaboration improved task performance, it simultaneously reduced intrinsic motivation — a measurable drop in self-determination and creative engagement after each AI-assisted phase.[16] Performance rose. The desire to do the work fell. For the small-wins approach, this creates a paradox: Amabile's progress principle predicts that completing meaningful work sustains motivation, but if the completion is AI-assisted, the motivational payoff may be diminished. Small wins do not escape this tension. What they offer is a structural advantage — because each cycle is short and bounded, the human retains the ability to step back, reflect, and reconnect with the larger purpose. Whether shorter cycles can compensate for AI-induced motivation loss is itself an untested hypothesis; what is observable is that a continuous moonshot sprint makes such reflection structurally difficult. The evidence against moonshots in AI collaboration is currently stronger than the evidence for small wins. The argument here is not that small wins are proven to work better, but that they create the conditions — shorter cycles, clearer boundaries, deliberate pauses — under which the paradox becomes manageable rather than invisible.
No controlled experiment compares moonshot-formatted and small-wins-formatted AI workflows directly. The argument connects compatible findings from separate fields into a hypothesis that is consistent but not yet proven. The individual findings converge. The direct test is still ahead.
The direction, not the recipe
Small wins address the structure of work. They do not address the pull. At midnight the AI will still be there, ready for one more prompt, indifferent to whether you have slept. What is emerging — from the research and from practice — is not a finished model but a direction: intentional norms for how teams engage with AI. This means deciding which tasks involve the machine and making results visible to peers rather than burying them in management dashboards. It also means treating every AI output as raw material — a starting point that gains value only through human refinement, context, and judgment. These are not novel ideas — they are established organizational principles applied to a context where, without deliberate design, the default is drift toward overwork.
Whether such norms can durably contain the pull effect remains an open question. The research diagnosed the problem; the intervention has not been tested. But the absence of norms is not neutral — it correlates with the burnout rates and productivity paradoxes that the data already document.[12][17] Doing nothing is not the safe option. It is the option whose costs are already visible.
I still feel the pull every day. The difference is that I now choose when to follow it — and when to close the laptop and let the next prompt wait until tomorrow. That is not a moonshot. It is a small win. And it requires more discipline than any stretch goal I have ever set.
[1] Jiang, W., Park, J., Xiao, R. & Zhang, S. (2025). As AI's power grows, so does our workday. CEPR VoxEU. https://cepr.org/voxeu/columns/ais-power-grows-so-does-our-workday
[2] Locke, E. A. & Latham, G. P. (2002). Building a Practically Useful Theory of Goal Setting and Task Motivation: A 35-Year Odyssey. American Psychologist, 57(9), 705-717.
[3] Sitkin, S. B., See, K. E., Miller, C. C., Lawless, M. W. & Carton, A. M. (2011). The Paradox of Stretch Goals: Organizations in Pursuit of the Seemingly Impossible. Academy of Management Review, 36(3), 544-566.
[4] Teller, A. (2016). The unexpected benefit of celebrating failure. TED Talk. https://www.ted.com/talks/astro_teller_the_unexpected_benefit_of_celebrating_failure — The graduation rate figure is approximate and drawn from multiple public statements by Teller.
[5] Sitkin, S. B., Miller, C. C., See, K. E. & Lewis, M. W. (2017). Stretch Goals and the Distribution of Organizational Performance. Organization Science, 28(3), 395-410.
[6] Kumar, M. (2025). The Limits of Goal-Setting Theory in LLM-Driven Assessment. arXiv 2510.06997.
[7] Chroma Research (2025). Context Rot: How Increasing Input Tokens Impacts LLM Performance. https://research.trychroma.com/context-rot — Du, Y., Tian, M. et al. (2025). Context Length Alone Hurts LLM Performance Despite Perfect Retrieval. Findings of EMNLP 2025, 23281-23298.
[8] Omar, M., Sorin, V., Collins, J. D. et al. (2025). Multi-model assurance analysis showing large language models are highly vulnerable to adversarial hallucination attacks during clinical decision support. Communications Medicine. https://www.nature.com/articles/s43856-025-01021-3
[9] Yotzov, I., Barrero, J. M. et al. (2026). Firm Data on AI. NBER Working Paper No. 34836. Survey of almost 6,000 executives across the US, UK, Germany, and Australia. https://www.nber.org/papers/w34836
[10] Challapally, A., Pease, C., Raskar, R. & Chari, P. (2025). The GenAI Divide: State of AI in Business 2025. MIT NANDA.
[11] Ranganathan, A. & Ye, X. M. (2026). AI Doesn't Reduce Work — It Intensifies It. Harvard Business Review. https://hbr.org/2026/02/ai-doesnt-reduce-work-it-intensifies-it — Single-company ethnography; illustrative, not generalizable.
[12] Upwork Research Institute (2025). From Burnout to Balance: AI-Enhanced Work Models for the Future. https://www.upwork.com/research/ai-enhanced-work-models — Note: Upwork is a freelancing platform with a commercial interest in demonstrating freelancer advantages.
[13] Demerouti, E., Bakker, A. B., Nachreiner, F. & Schaufeli, W. B. (2001). The Job Demands-Resources Model of Burnout. Journal of Applied Psychology, 86(3), 499-512.
[14] Weick, K. E. (1984). Small Wins: Redefining the Scale of Social Problems. American Psychologist, 39(1), 40-49.
[15] Amabile, T. & Kramer, S. (2011). The Power of Small Wins. Harvard Business Review. https://hbr.org/2011/05/the-power-of-small-wins
[16] Wu, S., Liu, Y., Ruan, M., Chen, S. & Xie, X. Y. (2025). Human-generative AI collaboration enhances task performance but undermines human's intrinsic motivation. Scientific Reports, 15(1). https://www.nature.com/articles/s41598-025-98385-2
[17] Workday / Hanover Research (2026). New Workday Research: Companies Are Leaving AI Gains on the Table. https://newsroom.workday.com/2026-01-14-New-Workday-Research-Companies-Are-Leaving-AI-Gains-on-the-Table