RLHF trained Claude to be verbose. Here's the proof


DEV Community·Saulo Linares·19 days ago

#ai #claude #llm #model #prompt #response

The moment that made me want to understand this

I was deep in FinMentor, my multi-agent Claude-powered financial advisor, testing a query I'd run dozens of times: "What's the difference between a mutual fund and an ETF?"

The answer came back in 400 words. Four paragraphs. Bullet points. A disclaimer about individual circumstances. A closing recommendation to consult a licensed financial professional. The actual difference fits in two sentences.

I had written nothing in my system prompt requesting elaboration. No "be thorough." No "explain in detail." The verbosity was coming from somewhere else.

I rewrote the system prompt: "Be concise. Answer only what's asked." The response shortened, but not proportionally. The hedging stayed. The paragraph structure stayed. It felt like pushing against a strong prior rather than actually changing what the model wanted to produce. I was overriding behavior, not removing it.

That distinction, override vs. remove, is what sent me to the InstructGPT paper.…
