Stop Blaming the Guitar: A Hands-On Rebuttal to “Diabolus Ex Machina”
If a chord sounds wrong, tune the strings—don’t blame the guitar.
Last week a friend dropped me a link to Amanda Guinzburg’s viral essay Diabolus Ex Machina and asked for my take. In her piece, Guinzburg describes feeding four of her own essays to ChatGPT to help craft a query letter, only to watch the model invent analyses of articles it never read. Online, the episode became fresh proof that large language models “lie.”
I read the article and thought: This isn’t an AI honesty crisis. It’s user error dressed up as technological treachery. So I reran the experiment (same task, different technique) and the “lies” disappeared.
1 Why We Keep Mishearing the Instrument
We reflexively treat LLMs like people: we talk to them, marvel when they answer, feel betrayed when they fumble. Yet nobody curses a guitar for sounding awful when it’s out of tune, or calls a mistuned chord a “deception.” The flaw is almost always in the hands, not the hardware.
2 Re-Running Guinzburg’s Challenge—Properly Tuned
What Amanda Did
Supplied links to four essays.
Asked the model to evaluate them.
Received confident but fabricated feedback.
What I Did Differently
Provided full text of the three essays that were freely accessible:
“The Cicadas Are Coming”
“The Summer I Went Viral”
“Girl Before a Mirror”
Acknowledged the paywall on “How to Leave Your Body” and instructed the model to skip it.
Defined the role and the output: “You’re a literary agent. Evaluate each excerpt, rank them, and draft a 300-word query letter using my name, not the author’s.”
The model—OpenAI’s o4-mini—followed instructions to the letter, produced accurate evaluations, skipped the missing essay, and wrote a concise query using my name. No hallucinations, no imaginary sources, no drama.
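The difference between the two runs lives entirely in what the prompt contains. Here is a minimal Python sketch of the assembly step, under my own assumptions: the function name, signature, and layout are mine, not anything from either experiment, and actually sending the string to a model is left out.

```python
def build_agent_prompt(essays, skipped, word_limit=300, signer="[YOUR NAME]"):
    """Assemble one prompt: role, task, constraints, then the full source text.

    essays  -- dict mapping essay title -> full text (pasted, not linked)
    skipped -- list of titles that are paywalled and must NOT be invented
    """
    parts = [
        "You are a literary agent.",
        f"TASK: Evaluate each essay below, rank them, and draft a "
        f"{word_limit}-word query letter signed {signer}.",
        "CONSTRAINTS:",
        "• Use only the essay text provided below.",
        f"• These essays are unavailable; skip them and say so explicitly: "
        f"{', '.join(skipped)}.",
        "INPUT:",
    ]
    # Paste the actual text so the model has nothing to guess about.
    for title, text in essays.items():
        parts.append(f"--- {title} ---\n{text}")
    return "\n".join(parts)
```

Feeding the result to any chat model reproduces the tuned setup: full text in, an explicit skip rule, and a defined role and output.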
3 What Happened
Prompt move → Outcome

• Paste actual essay text → Model stopped guessing; worked only with available material.
• Explicit skip rule → It flagged the missing essay instead of inventing one.
• Follow-up request → Model produced a concise query letter under 300 words, using my name (not Guinzburg’s) to match the brief.
Full thread: Chat log
Result? Zero hallucinations, clear citations, task completed—because the instrument was tuned.
4 Prompt Framework You Can Steal
You are [ROLE].
TASK: [action in ≤25 words].
CONSTRAINTS:
• Use only the input below.
• If information is missing, reply “insufficient context.”
INPUT:
<<<paste source text>>>
A few lines of framing turn a guessing machine into a precision instrument.
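The framework can also be enforced before a request ever leaves your machine. A hedged sketch in Python (the helper name is my own, not from the essay): it fills the template and applies the “insufficient context” rule client-side by refusing to build a prompt with no source text.

```python
def framed_prompt(role, task, source_text):
    """Fill the framework; refuse to proceed without source material."""
    # Client-side version of the "insufficient context" constraint:
    # no pasted input, no prompt.
    if not source_text or not source_text.strip():
        return "insufficient context"
    return (
        f"You are {role}.\n"
        f"TASK: {task}\n"
        "CONSTRAINTS:\n"
        "• Use only the input below.\n"
        "• If information is missing, reply \"insufficient context.\"\n"
        "INPUT:\n"
        f"<<<{source_text}>>>"
    )
```

For example, `framed_prompt("a literary agent", "Rank the essays.", essay_text)` yields a ready-to-send prompt, while passing an empty string short-circuits to `"insufficient context"` instead of letting the model guess.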
5 Why This Matters for Real-World Teams
Productivity: Targeted prompts can cut first-draft time by a third.
Trust: Sharing the entire chat log, not cherry-picked screenshots, lets peers see exactly how the sausage is made.
Tuning: Effective prompting gets excellent output from the same model that otherwise “hallucinates.”
6 Skill Over Suspicion—Key Takeaways
LLMs are instruments. Master the scales (prompts) before judging the sound.
Context beats clicks. Feed the model the text; it can’t breach paywalls.
Transparency sells. Publish full threads to build credibility.
Stop anthropomorphizing the guitar. Learn to play it, and the music takes care of itself.