Guide

AI Hallucinations in Law: The Courtroom Has a Judge. Your Back Office Doesn’t.

A hallucinated court filing gets a lawyer sanctioned, because a judge reads it. A hallucinated invoice, a dropped client email, or a missed deadline gets no such review. Here is why the business of law needs hallucination-free automation, and how Caddi is built for it.

Caddi • June 16, 2026

The word everyone in legal AI is afraid of

Ask any partner what worries them about AI and you will hear the same word within a minute: hallucination. It has become the single biggest reason firms hesitate, pilot forever, and keep a human glued to every output.

An AI hallucination is a confident, fluent answer that happens to be false. A citation to a case that does not exist. A figure that was never in the document. A clause the model invented to round out a paragraph. The reason it is so dangerous is that it does not look like an error. It looks exactly like a correct answer, written in the same authoritative tone, which is why it slides past a tired reviewer at 6pm.

For a profession built entirely on precise references (real cases, exact numbers, named parties, and hard deadlines), that is a uniquely poisonous failure mode. The one thing the work cannot tolerate is a fabricated fact delivered with total confidence.

The courtroom is already full of cautionary tales

The stories that made the headlines all happened in litigation, where the stakes are public and the referee is right there. It started with Mata v. Avianca in 2023, when two New York lawyers filed a brief full of cases ChatGPT had simply made up, and it has only accelerated since.

A California attorney was fined $10,000 after 21 of the 23 quotes in his opening brief turned out to be fabricated by AI. He admitted he had not read the model’s output before filing it.
A law firm defending the Chicago Housing Authority was sanctioned nearly $60,000 after an attorney used ChatGPT and filed a citation no one verified, in violation of the firm’s own AI policy.
And these are no longer isolated headlines. Tracker after tracker now logs new sanction orders for AI-fabricated citations on a rolling basis, across state and federal courts.

The lesson everyone took from these cases is “be careful with AI in court filings.” That is the right lesson, but it is the smaller half of the story.

In court, a judge catches it. In the back office, no one does.

Here is what actually saved those firms from something worse: a judge read the filing. Opposing counsel checked the citations. The system is adversarial by design, so a hallucination gets caught, named, and punished. Embarrassing and expensive, but contained.

The business of law has no judge. Intake, conflicts, billing, collections, client email, document movement, calendaring: this is the machinery that actually keeps a firm running, and almost none of it is reviewed by an adversary looking for errors. When an AI hallucinates here, there is no opposing counsel to flag it and no court to throw it out. The error simply enters the workflow and proceeds as if it were true.

That is the quiet risk almost no one is pricing in. The dramatic failure is a fake case in a brief. The expensive failure is a hallucination in the work nobody double-checks.

What a back-office hallucination actually costs

Move the same failure mode out of the courtroom and into operations and the damage looks very different. No headline, no sanction order, just erosion:

Emails get dropped or misrouted. A model that summarizes an inbox and decides what is urgent will, on a bad day, quietly file a furious client’s message under “low priority.” Nothing alarms anyone. The client just never hears back.
Numbers get invented in billing. A hallucinated time entry, matter number, or invoice total does not announce itself. It goes out to the client, and now the firm is either writing it off or defending it.
Deadlines land in the wrong place. A calendared date that the model confidently misread is far more dangerous than a date it missed, because everyone trusts the calendar.
The wrong client gets the right document. A confident misfile in a document or email workflow is a confidentiality incident, not a typo.

None of these get you sanctioned. They do something slower and worse. Clients leave. They rarely tell you it was the dropped email or the wrong invoice; they just decide the firm feels sloppy and they quietly move their work. The business suffers one papercut at a time, and because no single hallucination ever gets caught and named, the firm never traces the bleeding back to its source.

Why Caddi is laser-focused on being hallucination-free

Most AI tools ask a language model to figure out the task fresh, every single time it runs. That is exactly the setup that produces hallucinations: an improviser, working without a script, in a job where improvisation is the one thing you cannot afford.

Caddi is built the opposite way. Instead of generating the work on every run, Caddi watches how a task is done once, turns that recording into a verified, API-driven workflow, and then runs that exact workflow deterministically every time after. We call it record-to-code. The reliable path is fixed in advance, reviewed before it ever touches a client, and repeated identically, rather than re-invented on each run by a model that might guess.

That distinction is the whole point on the business side of law. A deterministic workflow pulls the real invoice total from the system of record instead of estimating it. It routes the email by a rule you can read instead of a vibe. It files the document against a real matter ID instead of a plausible one. There is no room for a confident fabrication, because nothing is being fabricated. The work runs the same way it ran when a person reviewed and approved it.
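
To make the contrast concrete, here is a minimal Python sketch of what deterministic steps like these can look like. It is an illustration under stated assumptions, not Caddi's actual code: the in-memory tables, the rule list, and every function name are hypothetical stand-ins for a real billing system, mail system, and document store.

```python
# Hypothetical stand-ins for a firm's systems of record (illustration only).
INVOICES = {"INV-1001": {"matter_id": "M-2024-017", "total_cents": 125000}}
KNOWN_MATTERS = {"M-2024-017", "M-2024-022"}


class WorkflowError(Exception):
    """Raised when a step cannot be completed from real data."""


def invoice_total_cents(invoice_id: str) -> int:
    """Look the total up in the system of record; never estimate it."""
    record = INVOICES.get(invoice_id)
    if record is None:
        # A missing record stops the run instead of producing a guess.
        raise WorkflowError(f"invoice {invoice_id!r} not found")
    return record["total_cents"]


# Routing rules are ordered, readable data: (keyword in subject, queue).
ROUTING_RULES = [
    ("invoice", "billing"),
    ("deadline", "docketing"),
    ("conflict", "conflicts"),
]
DEFAULT_QUEUE = "human_review"


def route_email(subject: str) -> str:
    """Return the queue for the first matching rule, else send to a person."""
    lowered = subject.lower()
    for keyword, queue in ROUTING_RULES:
        if keyword in lowered:
            return queue
    return DEFAULT_QUEUE


def file_document(doc_name: str, matter_id: str) -> str:
    """File only against a matter ID that actually exists."""
    if matter_id not in KNOWN_MATTERS:
        raise WorkflowError(f"unknown matter ID {matter_id!r}")
    return f"{matter_id}/{doc_name}"
```

In this sketch, a failed lookup stops the run with an error instead of continuing with a plausible value, and an email that matches no rule goes to a person rather than to a guessed queue.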

It is the same reason the courtroom stories stayed contained: a check sits between the AI and the outcome. Caddi builds that check into the back office, where there is no judge to provide one.

The takeaway

The court-filing hallucinations are a warning, not the whole danger. They were caught precisely because someone was looking. The work that runs your firm (intake, billing, client email, the calendar) has no one looking the same way. That is exactly where automation has to be hallucination-free by design, not by diligence.

If you are evaluating AI for the business of law, the question is not “how good are its answers.” It is “what happens on the run nobody checks.” That is the question Caddi was built to answer.

Frequently asked questions

What is an AI hallucination?

An AI hallucination is a confident, fluent output that is simply false: a citation to a case that does not exist, a number that was never in the source, or a fact the model invented to fill a gap. The danger is that hallucinations read exactly like correct answers, so they pass a quick human glance.

Why are AI hallucinations such a problem in law?

Legal work runs on precise references: real cases, exact figures, named parties, firm deadlines. A hallucination breaks the one thing the work depends on. In court it gets a lawyer sanctioned. In the back office it quietly corrupts billing, intake, and client communication, where no judge is reviewing the output.

How is Caddi different from a chatbot that can hallucinate?

Caddi does not improvise the work at runtime. It watches how a task is done once, turns that recording into a verified, API-driven workflow, and then runs that same workflow deterministically every time. The reliable path is fixed in advance and reviewed, rather than re-generated by a language model on each run.