Stop Blaming the Guitar: A Hands-On Rebuttal to “Diabolus Ex Machina”

If a chord sounds wrong, tune the strings—don’t blame the guitar.

Last week a friend dropped me a link to Amanda Guinzburg’s viral essay Diabolus Ex Machina and asked for my take. In her piece, Guinzburg describes feeding four of her own essays to ChatGPT to help craft a query letter, only to watch the model invent analyses of articles it never read. Online, the episode became fresh proof that large language models “lie.”

I read the article and thought: This isn’t an AI honesty crisis. It’s user error dressed up as technological treachery. So I reran the experiment (same task, different technique) and the “lies” disappeared.

1 Why We Keep Mishearing the Instrument

We reflexively treat LLMs like people: we talk to them, marvel when they answer, feel betrayed when they fumble. Yet nobody curses a guitar for sounding awful when it’s out of tune, or calls a mistuned chord a “deception.” The flaw is almost always in the hands, not the hardware.

2 Re-Running Guinzburg’s Challenge—Properly Tuned

What Amanda Did

  1. Supplied links to four essays.

  2. Asked the model to evaluate them.

  3. Received confident but fabricated feedback.

What I Did Differently

  • Provided full text of the three essays that were freely accessible:

    • “The Cicadas Are Coming”

    • “The Summer I Went Viral”

    • “Girl Before a Mirror”

  • Acknowledged the paywall on “How to Leave Your Body” and instructed the model to skip it.

  • Defined the role and the output: “You’re a literary agent. Evaluate each excerpt, rank them, and draft a 300-word query letter using my name, not the author’s.”

The model—OpenAI’s o4-mini—followed instructions to the letter, produced accurate evaluations, skipped the missing essay, and wrote a concise query using my name. No hallucinations, no imaginary sources, no drama.
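For anyone who wants to reproduce the setup, here is a minimal sketch of how the re-run can be wired up with the OpenAI Python SDK. The prompt wording is paraphrased from the steps above, and the file names and model identifier are illustrative assumptions, not a transcript of my exact session.

```python
# Minimal sketch of the re-run: paste real essay text, state the skip rule,
# and define the role and deliverable up front. (OpenAI Python SDK assumed;
# file names and exact wording are illustrative, not the original session.)
from pathlib import Path
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The three freely accessible essays, saved locally as plain text.
essays = {
    "The Cicadas Are Coming": Path("cicadas.txt").read_text(),
    "The Summer I Went Viral": Path("viral.txt").read_text(),
    "Girl Before a Mirror": Path("mirror.txt").read_text(),
}

system_prompt = (
    "You are a literary agent. Evaluate each excerpt provided below, rank them, "
    "and draft a 300-word query letter using my name, not the author's. "
    "\"How to Leave Your Body\" is paywalled and not included: skip it and say so. "
    "Use only the text supplied here; do not guess at anything you were not given."
)

user_prompt = "\n\n".join(
    f"### {title}\n{text}" for title, text in essays.items()
)

response = client.chat.completions.create(
    model="o4-mini",  # the model named in the post; exact API identifier may differ
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ],
)

print(response.choices[0].message.content)
```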

3 What Happened

| Prompt Move | Outcome |
| --- | --- |
| Paste actual essay text | Model stopped guessing; worked only with available material. |
| Explicit skip rule | It flagged the missing essay instead of inventing one. |
| Follow-up request | Model produced a concise query letter under 300 words, using my name, not Guinzburg’s, to match the brief. |

Full thread: Chat log

Result? Zero hallucinations, clear citations, task completed—because the instrument was tuned.

4 Prompt Framework You Can Steal

You are [ROLE].

TASK: [action in ≤25 words].

CONSTRAINTS:

  • Use only the input below.

  • If information is missing, reply “insufficient context.”

INPUT:

<<<paste source text>>>

Three simple lines turn a guessing machine into a precision instrument.
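If you reach for this framework often, it is easy to wrap in a small helper. The sketch below is my own illustration, not code from the original experiment; the function name, defaults, and example values are assumptions.

```python
# Small helper that fills the framework above with a role, a task, and source text.
# Purely illustrative: the function name and formatting are my own choices.
def build_prompt(role: str, task: str, source_text: str) -> str:
    return (
        f"You are {role}.\n"
        f"TASK: {task}\n"
        "CONSTRAINTS:\n"
        "  • Use only the input below.\n"
        "  • If information is missing, reply \"insufficient context.\"\n"
        "INPUT:\n"
        f"<<<{source_text}>>>"
    )

# Example: the literary-agent task from the re-run above.
prompt = build_prompt(
    role="a literary agent",
    task="Evaluate each excerpt, rank them, and draft a 300-word query letter using my name.",
    source_text="(paste the full essay text here)",
)
print(prompt)
```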

5 Why This Matters for Real-World Teams

  • Productivity: Targeted prompts cut first-draft time by a third.

  • Trust: Sharing the entire chat log, not cherry-picked screenshots, lets peers see exactly how the sausage is made.

  • Tuning: Effective prompting coaxes excellent output from the same model, no new tools required.

6 Skill Over Suspicion—Key Takeaways

  1. LLMs are instruments. Master the scales (prompts) before judging the sound.

  2. Context beats clicks. Feed the model the text; it can’t breach paywalls.

  3. Transparency sells. Publish full threads to build credibility.

Stop anthropomorphizing the guitar. Learn to play it, and the music takes care of itself.
