What a simple illustration request revealed about training data bias.

I was working on a story that needed a simple illustration: a young girl carrying a bundle of wood in her arms, seen from behind. A straightforward image to support the narrative moment. I turned to an AI image generator, as I often do for story development, expecting this would take a few attempts to get the composition right.
After dozens of tries and increasingly specific prompts, the AI kept producing the same anatomically impossible result: a girl with wood on her back, one arm twisted behind her in ways that defied human physiology. I refined my language. I tried different phrasings. Nothing worked. The AI seemed unable to understand what I was asking for.
Perhaps you’ve experienced something similar—a moment when an AI tool consistently misunderstands a request that feels perfectly clear to you, and no amount of prompt refinement seems to bridge the gap.
When the Problem Isn’t the AI
That’s when I shifted my approach. If the AI couldn’t generate what I needed, maybe I could find a reference image to work from. I started searching: “person carrying wood shown from behind,” “woman carrying firewood in arms seen from behind,” variations on this theme across multiple search engines and image databases.
I found many images—but not what I needed. As I kept searching, I noticed they all fell into two distinct categories.
The first: documentary-style photographs of African and rural Asian women carrying enormous loads—massive bundles strapped to their backs or balanced on their heads. The photography emphasized the scale of the burden, the hardship of the labor. These images appeared in contexts about traditional practices, rural life, development work.
The second: lifestyle aesthetic shots of young white women holding small amounts of wood styled as props. Smiling faces, carefully styled outfits, cozy backdrops of stacked firewood. The wood was a prop in a narrative about rustic charm or outdoor living.
What I didn’t find: an ordinary person doing ordinary work, photographed from a practical angle that simply showed the task being done. Not romanticized. Not exoticized. Just the mundane reality of carrying a moderate amount of wood in your arms while your back faces the camera.
Questions About What Gets Documented
This pattern made visible something I’d been noticing but hadn’t fully examined. The same task—carrying wood—seemed to receive completely different photographic treatment depending on who was performing it and why someone was taking the picture. Marginalized women’s labor was documented to emphasize burden and tradition. Western women’s interaction with firewood was styled as lifestyle aesthetic. But the everyday, unremarkable version of this task? That seemed to exist in a gap between these two frameworks.
I found myself wondering: whose activities get photographed, and how do those choices reflect what’s considered worth documenting? When photographers throughout history decided what merited capturing—when painters chose their subjects, when writers selected which stories to preserve—what got included and what got overlooked?
Perhaps more importantly for our current moment: if those gaps and biases in historical documentation now form the training data for AI systems, what happens to those absences? They don’t just persist—they become encoded, automated, seemingly authoritative. The AI could draw from abundant examples of both the lifestyle aesthetic and the documentary hardship—but not the everyday reality that fell between them.
Beyond Absence: Encoded Bias
The issue extends beyond what’s missing to what patterns are present. I noticed this while using AI for another purpose: generating simple character sketches for story development. I asked for drawings of three 10- to 11-year-old girls, specifying modest, age-appropriate, non-sexualized clothing. The AI produced three figures representing different ethnicities. The white and Asian characters appeared in jeans and t-shirts. The Black character consistently appeared in revealing, age-inappropriate clothing—an exposed midriff, tight-fitting outfits—no matter how many times I restated those requirements.
This raised different questions. The AI wasn’t working from absence here but from presence—from patterns in its training data where Black girls were apparently depicted differently than other children. Even explicit counter-instructions couldn’t override whatever weight that pattern carried in the system.
Both examples—the missing ordinary perspective and the embedded stereotype—point to the same underlying issue: AI systems learn from what humans chose to create, document, and archive. Those choices weren’t neutral. They reflected who had cameras and power, what narratives seemed worth telling, which subjects were considered interesting, important, or worthy of preservation.
This isn’t new. Medical research long relied on data from predominantly white male subjects, establishing what counted as “normal” health baselines and treatment protocols without accounting for how those measures differ across sexes, races, and ethnicities. Now we’re seeing similar patterns play out in training data for AI systems—the gaps and biases in what was historically documented become the gaps and biases in what AI can generate.
What This Means for Design and Narrative
As someone working at the intersection of design and storytelling, this feels particularly relevant. Writers make choices about which stories to tell, which perspectives to center. Designers make choices about what to build, what to make visible, what to prioritize. These have always been consequential decisions.
But now those choices are becoming training data. What we design, what we document, what we consider worth creating—these decisions may shape what AI systems can generate tomorrow. The gaps we leave become gaps that scale. The biases we embed become biases that get automated.
This isn’t an argument against AI. It’s an observation about responsibility. If our work today helps train the systems of tomorrow, then the choices about what to include, how to frame it, whose perspectives to prioritize—these matter in ways they perhaps didn’t before. Or maybe they always mattered this much, and we’re only now seeing the consequences made visible through AI’s limitations.
An Invitation to Notice
I’m still thinking through the implications of this. What strikes me is how easy it would be not to notice—to assume the AI simply couldn’t understand my request, rather than recognizing that the gap was in what the AI had learned from us.
You might find yourself noticing similar patterns. Requests that should be simple but produce consistently wrong results. Gaps that reveal themselves not through what AI generates, but through what it repeatedly cannot generate despite clear instructions.
These moments might be worth paying attention to. They may tell us something about what we’ve collectively chosen to document, create, and preserve—and what we’ve chosen to overlook. And they raise questions worth asking as we continue building the reference libraries that will shape what becomes possible to generate, design, and imagine in the years ahead.