Surgeons, nurses, and the words AI reaches for

What a GPT-4 study found — and the surprising part everyone skips

Key takeaway: When AI hears "surgeon," it pictures a man. But here's the twist that made me sit up: the bias in its head didn't show up in how it treated women.

By Dear Sarah · 2026-06-27

[Image: A woman doctor in scrubs wearing a blue surgical mask, looking toward the camera.]

Quick gut check. Picture a surgeon. Now picture a nurse. Did the surgeon come out as a man and the nurse as a woman? Don't feel bad if so. We've all been fed that picture. The machines have too.

Kirsten Morehouse and a research team that included the psychologist Mahzarin Banaji put GPT-4 to the test, and they found exactly what you'd expect: the model carries strong gender-occupation associations. Surgeon reads as male. Nurse reads as female. Those associations are baked into the language it learned from, which is to say, baked in from us.

But here's the part that made me stop and reread the paper. When they asked GPT-4 to actually do something, namely evaluate cover letters from real applicants, both men and women, the bias in its head didn't carry through to the outcome. It rated women's and men's materials about equally. As the researchers put it, biased associations don't necessarily translate into biased results.

I want to sit with that honestly, because it cuts both ways.

It's genuinely good news. It means the stereotype humming in the background doesn't automatically become a closed door. That matters, because a lot of us have quietly assumed the worst about these tools and braced for it.

And it's not a free pass. An association that stays invisible in one test can still surface somewhere else — in the words a model reaches for, the examples it offers, the defaults it assumes when no one's checking. "It didn't discriminate in this study" is not the same as "it's fair everywhere." The honest read is that the picture is messier than the scary headline and messier than the reassuring one.

Why this is yours to know

If you're a woman in medicine, law, code, a trade, any field where the default mental image still isn't you, you've felt this. The reflex assumption. The "oh, you're the surgeon?" And now some of that reflex lives in tools you might use to draft a bio, prep for an interview, or write a letter of recommendation for someone you believe in.

Knowing the association exists means you get to keep your eyes open without spiraling. You can use the tool and still be the editor. You can notice when its first draft quietly hands the leadership role to a "he" and just fix it.

One thing to try today: next time an AI writes you something with a person in it (a sample email, a story, an example), notice the pronoun it picked by default. If it reached for "he" where the role was open, change it. Tiny edit. But you're teaching yourself to see the reflex, and once you see it, you can't un-see it.

You were never the exception to the rule. You're proof the rule was too small.

Quote to sit with:

"There's no independent machine values. Machine values are human values." — Fei-Fei Li

The machines reach for the words we taught them. Which means the words can change. So can the picture.

💌 Sarah

#gender-bias-in-ai
#women-in-medicine
#gpt-4
#ai-fairness
#women-in-stem

Sources

Gender Bias in GPT-4: Strong Associations, Weak Outcomes (ICML 2024 workshop) — OpenReview
Bias Transmission in LLMs — Aymara