When AI Goes Wrong: Real Mistakes and What They Cost

I'm going to tell you about some AI failures. Not to scare you off, because I genuinely think AI is useful for small businesses, but because understanding how things go wrong is the best way to make sure they don't go wrong for you.

None of these are client stories. They're composites and public examples, but they represent patterns I've seen play out more than once.

The Chatbot That Made Stuff Up

A small e-commerce company set up a customer service chatbot powered by a large language model. They fed it their FAQ, their return policy, and their product catalog. For the first two weeks, it worked beautifully. Customers got fast answers. Support ticket volume dropped.

Then a customer asked about a product warranty, and the chatbot invented one. It confidently stated that the product came with a two-year replacement guarantee. No such guarantee existed. The customer bought the product based on that warranty. When it broke nine months later, they wanted a replacement and had the chatbot transcript to back up their claim.

The company honored the warranty because fighting it would have cost more in reputation damage than the product was worth. But it could have been worse. Imagine a chatbot inventing a return policy, a discount, or a service commitment.

What went wrong: The chatbot wasn't restricted to answering only from the source documents. It could generate responses based on general training data, which included other companies' warranty policies. There was no fallback to "I don't know, let me connect you with a person."

The fix is simple. Any customer-facing AI needs guardrails that limit responses to verified information and a clear escalation path when it can't find the answer. "I'm not sure about that, let me check with the team" is a perfectly acceptable chatbot response.
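
Here's a rough sketch of what that guardrail looks like in code. It's a simplified illustration, not a production chatbot: the VERIFIED_ANSWERS table, the sample answers in it, and the keyword matching are all made up for the example, and a real setup would more likely search your actual documents. The part that matters is the last line of the function: if nothing verified matches, the bot escalates instead of improvising.

```python
import re

# Hypothetical verified answers, keyed by the words that must all appear
# in the question. In practice these would come from your own FAQ and policies.
VERIFIED_ANSWERS = {
    frozenset({"return", "policy"}): "You can return unused items within 30 days.",
    frozenset({"shipping", "time"}): "Orders ship within 2 business days.",
}

ESCALATION = "I'm not sure about that, let me check with the team."


def answer(question: str) -> str:
    """Return a verified answer if one matches, otherwise escalate."""
    words = set(re.findall(r"[a-z]+", question.lower()))
    for keywords, verified_text in VERIFIED_ANSWERS.items():
        # Only answer when every keyword for a verified entry is present.
        if keywords <= words:
            return verified_text
    # No verified source covers this question, so hand off to a person.
    return ESCALATION
```

With this table, a warranty question falls through to the escalation message, because nothing verified covers warranties.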

The Automation That Sent the Wrong Emails

A consulting firm automated their proposal follow-ups. If a proposal hadn't received a response within five days, an AI-drafted follow-up would go to the prospect. Smart setup, in theory.

The problem was a data mismatch. The automation pulled client names from one field in their CRM and project details from another. For about two weeks, a bug in the integration meant that follow-up emails referenced the wrong projects. "Just checking in on the kitchen renovation proposal" went to someone who had received a branding strategy proposal.

Most recipients probably just deleted it or shrugged. But one prospect had been considering two firms, and the muddled follow-up tipped their decision toward the competitor who seemed more organized.

What went wrong: The automation was built and launched without a testing phase. Nobody sent themselves test emails to verify the data mapping. And there was no monitoring in place to catch the mismatch after launch.

The fix: Test automations with real data before going live. Send to yourself first. And build in a weekly check for the first month: just glance at the log to make sure outgoing messages look right. After that, monthly audits are usually enough.
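
If whoever builds your automations is comfortable with a little Python, the pre-launch test can be as small as the sketch below. Everything here is an assumption for illustration: the FollowUp fields, the shape of the proposals lookup, and the function names don't come from any particular CRM. The idea is to compare each drafted email against the proposal record it claims to be about, and to reroute every draft to your own inbox for a dry run.

```python
from dataclasses import dataclass


@dataclass
class FollowUp:
    proposal_id: str
    recipient_email: str
    client_name: str
    project_title: str


def find_mismatches(drafts, proposals):
    """Return drafts whose name or project differs from the proposal record.

    `proposals` is a hypothetical dict mapping proposal_id to a record with
    "client_name" and "project_title" keys.
    """
    mismatches = []
    for draft in drafts:
        record = proposals.get(draft.proposal_id)
        if (
            record is None
            or record["client_name"] != draft.client_name
            or record["project_title"] != draft.project_title
        ):
            mismatches.append(draft)
    return mismatches


def reroute_for_test(drafts, test_inbox):
    """Copy the drafts with every recipient replaced by your own address."""
    return [
        FollowUp(d.proposal_id, test_inbox, d.client_name, d.project_title)
        for d in drafts
    ]
```

Run find_mismatches before launch and again during the weekly log check. An empty list means the names and projects line up with the records.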

The Prompt That Shared Client Data

This one's quieter but potentially more serious. An employee at a small accounting firm was using Claude to help draft client communications. To give the AI better context, they pasted in segments of client financial data, including revenue figures, tax situations, and in one case, a partial Social Security number.

Nobody found out. No breach was reported. The data sat in the AI provider's system, covered by their terms of service, probably not used for anything. Probably.

But "probably" isn't good enough when you're handling client financial data. If a client had asked, "Do you share my data with third-party AI services?" the honest answer would have been yes, and most engagement letters don't contemplate that.

What went wrong: No clear policy existed about what could and couldn't go into AI tools. The employee was trying to do good work faster. They didn't think of the AI tool as a "third party" because it felt like a personal productivity tool.

The fix: This is the easiest one to prevent. Create a short, clear list of data categories that are off-limits for AI tools. Share it with your team. Revisit it when you onboard new people. The safety post has a framework for this.
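
A written policy is the real fix, but a small automated check can back it up. The sketch below is only an illustration under my own assumptions: the category names and patterns are examples, not a complete list, and simple pattern matching will miss plenty of sensitive data (names, account details written out in words, and so on). Treat it as a speed bump that catches the obvious cases before text gets pasted into an AI tool.

```python
import re

# Hypothetical off-limits categories; extend these to match your own policy.
OFF_LIMITS_PATTERNS = {
    "social security number": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "masked social security number": re.compile(
        r"\bx{3}-x{2}-\d{4}\b", re.IGNORECASE
    ),
    "dollar amount": re.compile(r"\$\s?\d[\d,]*(?:\.\d{2})?"),
}


def find_off_limits(text: str) -> list[str]:
    """Return the names of off-limits categories that appear in the text."""
    return [
        name
        for name, pattern in OFF_LIMITS_PATTERNS.items()
        if pattern.search(text)
    ]
```

If the function returns anything other than an empty list, the text needs to be scrubbed or kept out of the AI tool entirely.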

The Content That Nobody Checked

A small marketing agency started using AI to generate first drafts of blog posts for clients. The workflow was sensible: AI drafts, human edits, client approves. But as the team got comfortable with the AI output, the "human edits" step got shorter and shorter. Skim it, looks fine, ship it.

One post included a fabricated statistic about industry growth rates. Another attributed a quote to someone who never said it. A third recommended a specific product that didn't exist.

None of these were catastrophic. But when a client Googled the statistic and couldn't find a source, trust eroded. The agency's credibility was their product, and they'd outsourced the quality check.

What went wrong: Process erosion. The review step existed but wasn't enforced. As confidence in the AI grew, vigilance dropped. This is a natural human tendency, and it needs to be designed against.

The fix: Make the review step structural, not optional. Use a checklist. "Have you verified all statistics? Have you confirmed all quotes? Have you checked that all products/companies mentioned are real?" It takes two minutes and prevents the kind of mistakes that damage client relationships.
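
"Structural" can be taken literally. If drafts move through any kind of script or publishing tool, the checklist can become a gate. The sketch below assumes a hypothetical workflow where the reviewer's confirmed items are recorded as a set of strings; the function names are mine, not from any particular tool. The post simply isn't ready until every item is confirmed.

```python
# The checklist items mirror the three review questions above.
REVIEW_CHECKLIST = (
    "All statistics verified against a source",
    "All quotes confirmed",
    "All products and companies mentioned are real",
)


def unchecked_items(confirmed):
    """Return the checklist items the reviewer has not confirmed yet."""
    return [item for item in REVIEW_CHECKLIST if item not in confirmed]


def ready_to_ship(confirmed):
    """A draft is ready only when nothing on the checklist is outstanding."""
    return not unchecked_items(confirmed)
```

A skim that confirms only one item leaves two outstanding, so ready_to_ship returns False and the draft stays put.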

The Pattern

Four different failures, one common thread: the technology worked roughly as expected. The failures were all about how people interacted with the technology.

Not enough guardrails on what AI could say. Not enough testing before launch. Not enough clarity about what data goes where. Not enough discipline in the review process.

AI mistakes aren't usually dramatic. They're quiet. A wrong email here, a fabricated fact there, a data leak nobody notices. They compound over time into credibility damage, lost clients, or regulatory exposure.

The good news is that every one of these failures is preventable with basic protocols. You don't need a risk management department. You need a short list of rules, a testing habit, and a review step that actually happens.

If you're setting up AI tools and want to make sure you're not building in these kinds of risks, that's worth a conversation. Sometimes an outside perspective catches the gaps you've gotten used to stepping over.