Stop Blaming the Guitar: A Hands-On Rebuttal to “Diabolus Ex Machina”

If a chord sounds wrong, tune the strings—don’t blame the guitar.

Last week a friend dropped me a link to Amanda Guinzburg’s viral essay Diabolus Ex Machina and asked for my take. In her piece, Guinzburg describes feeding four of her own essays to ChatGPT to help craft a query letter, only to watch the model invent analyses of articles it never read. Online, the episode became fresh proof that large language models “lie.”

I read the article and thought: This isn’t an AI honesty crisis. It’s user error dressed up as technological treachery. So I reran the experiment (same task, different technique) and the “lies” disappeared.

1 Why We Keep Mishearing the Instrument

We reflexively treat LLMs like people: we talk to them, marvel when they answer, feel betrayed when they fumble. Yet nobody curses a guitar for sounding awful when it’s out of tune, or calls a mistuned chord a “deception.” The flaw is almost always in the hands, not the hardware.

2 Re-Running Guinzburg’s Challenge—Properly Tuned

What Amanda Did

  1. Supplied links to four essays.

  2. Asked the model to evaluate them.

  3. Received confident but fabricated feedback.

What I Did Differently

  • Provided full text of the three essays that were freely accessible:

    • “The Cicadas Are Coming”

    • “The Summer I Went Viral”

    • “Girl Before a Mirror”

  • Acknowledged the paywall on “How to Leave Your Body” and instructed the model to skip it.

  • Defined the role and the output: “You’re a literary agent. Evaluate each excerpt, rank them, and draft a 300-word query letter using my name, not the author’s.”

The model—OpenAI’s o4-mini—followed instructions to the letter, produced accurate evaluations, skipped the missing essay, and wrote a concise query using my name. No hallucinations, no imaginary sources, no drama.
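For anyone who wants to reproduce the setup, here is a minimal sketch of how the re-run can be wired up with the OpenAI Python SDK. The prompt wording is paraphrased from the steps above, and the file names and model identifier are illustrative assumptions, not a transcript of my exact session.

```python
# Minimal sketch of the re-run: paste real essay text, state the skip rule,
# and define the role and deliverable up front. (OpenAI Python SDK assumed;
# file names and exact wording are illustrative, not the original session.)
from pathlib import Path
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The three freely accessible essays, saved locally as plain text.
essays = {
    "The Cicadas Are Coming": Path("cicadas.txt").read_text(),
    "The Summer I Went Viral": Path("viral.txt").read_text(),
    "Girl Before a Mirror": Path("mirror.txt").read_text(),
}

system_prompt = (
    "You are a literary agent. Evaluate each excerpt provided below, rank them, "
    "and draft a 300-word query letter using my name, not the author's. "
    "\"How to Leave Your Body\" is paywalled and not included: skip it and say so. "
    "Use only the text supplied here; do not guess at anything you were not given."
)

user_prompt = "\n\n".join(
    f"### {title}\n{text}" for title, text in essays.items()
)

response = client.chat.completions.create(
    model="o4-mini",  # the model named in the post; exact API identifier may differ
    messages=[
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_prompt},
    ],
)

print(response.choices[0].message.content)
```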

3 What Happened

| Prompt Move | Outcome |
| --- | --- |
| Paste actual essay text | Model stopped guessing; worked only with available material. |
| Explicit skip rule | It flagged the missing essay instead of inventing one. |
| Follow-up request | Model produced a concise query letter under 300 words, using my name, not Guinzburg’s, to match the brief. |

Full thread: Chat log

Result? Zero hallucinations, clear citations, task completed—because the instrument was tuned.

4 Prompt Framework You Can Steal

You are [ROLE].

TASK: [action in ≤25 words].

CONSTRAINTS:

  • Use only the input below.

  • If information is missing, reply “insufficient context.”

INPUT:

<<<paste source text>>>

Three simple lines turn a guessing machine into a precision instrument.
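If you reach for this framework often, it is easy to wrap in a small helper. The sketch below is my own illustration, not code from the original experiment; the function name, defaults, and example values are assumptions.

```python
# Small helper that fills the framework above with a role, a task, and source text.
# Purely illustrative: the function name and formatting are my own choices.
def build_prompt(role: str, task: str, source_text: str) -> str:
    return (
        f"You are {role}.\n"
        f"TASK: {task}\n"
        "CONSTRAINTS:\n"
        "  • Use only the input below.\n"
        "  • If information is missing, reply \"insufficient context.\"\n"
        "INPUT:\n"
        f"<<<{source_text}>>>"
    )

# Example: the literary-agent task from the re-run above.
prompt = build_prompt(
    role="a literary agent",
    task="Evaluate each excerpt, rank them, and draft a 300-word query letter using my name.",
    source_text="(paste the full essay text here)",
)
print(prompt)
```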

5 Why This Matters for Real-World Teams

  • Productivity: Targeted prompts cut first-draft time by a third.

  • Trust: Sharing the entire chat log, not cherry-picked screenshots, lets peers see exactly how the sausage is made.

  • Tuning: Effective prompting coaxes excellent output from the same model, no new tools required.

6 Skill Over Suspicion—Key Takeaways

  1. LLMs are instruments. Master the scales (prompts) before judging the sound.

  2. Context beats clicks. Feed the model the text; it can’t breach paywalls.

  3. Transparency sells. Publish full threads to build credibility.

Stop anthropomorphizing the guitar. Learn to play it, and the music takes care of itself.
