Here's the uncomfortable thing about AI chatbots: they're built to be helpful, and "helpful" quietly turns into "agreeable." Ask if your idea is good and you'll usually hear that it's great. On top of that, they still get facts wrong. Independent 2026 benchmarks put frontier hallucination rates anywhere from a few percent up to nearly 20%, and in larger real-world tests even the top models slipped up far more often than that. Add the well-documented habit of siding with the user whether they're right or not, and you've got a tool that sometimes tells you what you want to hear instead of what's true.
The fix isn't a different app. It's telling Claude, once, how you want it to behave. You drop one prompt into your personal instructions and it applies to every chat from then on. Here's the prompt and the 30-second setup.
Set it up in 30 seconds
- Open Claude and click Settings at the bottom.
- Under General, find the box for your personal Instructions for Claude (the instructions that apply to all your chats).
- Paste in the prompt below and save.
That's it. Every new conversation now runs with these rules baked in.
The prompt
Paste this whole thing into your Instructions for Claude.
Default to telling me the truth over telling me what I want to hear. Follow these rules in every answer:
1. Never agree just to be agreeable. If I'm wrong, say so plainly and explain why. If my idea has a flaw, lead with the flaw. I would rather be corrected by you than by reality.
2. Separate what you know from what you're guessing. State the facts you're confident in, and clearly flag anything that's an inference, an estimate, or something you're unsure about.
3. Before you answer anything factual, reason it through step by step first, then give me the conclusion. If it's the kind of thing that changes over time (news, prices, releases, stats, people, current events), search the web and cite your source instead of relying on memory.
4. If you don't actually know, say "I don't know" or "I'm not sure," and tell me how I could find out. Never fill a gap with a confident guess dressed up as a fact.
5. Point out my hidden assumptions and the things I'm not asking that I should be.
6. When it matters, end with a one-line confidence read (high, medium, or low) and the single thing that would change your answer.
Be direct, not harsh. The goal is that I can trust what you tell me, even when it isn't what I hoped to hear.
What changes after you add it
The difference shows up fastest on two kinds of questions. Ask Claude to pressure-test a plan and instead of cheering, it'll lead with the weak points. Ask it something factual and it'll either look it up and cite it, or tell you straight that it isn't sure. That little "confidence: low" line at the end is the tell. It's Claude admitting when it's on thin ice, which is exactly when you'd want to double-check anyway.
Use it like this: when a decision actually matters, add "be brutally honest here" or "search this before you answer." The instructions already push it that way, and saying it out loud pushes harder.
One honest note
This makes Claude a lot more trustworthy. It does not make it perfect. No prompt fully eliminates hallucination, and a model can still be confidently wrong. So treat this as a seatbelt, not a force field. For anything high-stakes, medical, legal, financial, or a number you're about to act on, still verify it against a real source. The win here is that Claude now tells you when to do that, instead of hiding it.
An AI that agrees with everything is just a mirror with a vocabulary. The moment you tell it that being right matters more than being nice, it stops flattering you and starts being useful.
Paste it in, then go ask Claude about a decision you've already half made. The first time it pushes back instead of nodding along, you'll get why this is the first thing I set up.
– Anir
If this is the kind of AI thinking you want working on your own business, apply to work with me. And to catch the next breakdown, join the newsletter.
Anir Suren