
When AI Is the Wrong Tool (and Cheaper Options Win)
Not every problem needs an LLM. A candid look at where rules, search, or plain software beat AI, and how to tell before you spend the budget.
Key takeaways
- AI earns its place on problems that are messy and language-heavy, tolerant of being wrong, and expensive to handle by hand.
- When the same input must always produce the same output, like tax, pricing, or permission checks, use plain code rather than a probabilistic model.
- If a rule fits in a sentence and holds, write the rule: a fixed-format invoice number is a regex and email validation is a library call, not a language model.
- Before building, run the math on latency and cost, because a 200ms model call inside a loop run ten million times is an incident waiting to happen.
Here is the short version, since you are busy and probably already behind on whatever your board asked for last quarter. A lot of the problems people want to solve with an LLM are better solved with a database query, a regex, a lookup table, or a hundred lines of ordinary code. AI earns its keep on a specific kind of problem: messy, language-heavy, tolerant of being wrong sometimes, and expensive to handle by hand. If your problem is not that, an LLM will cost you more, run slower, and fail in ways that are harder to debug than the boring solution you skipped.
I run an engineering shop, so this is not a neutral take. We make money building AI systems. And I still talk clients out of AI on a regular basis, because the fastest way to lose a customer is to sell them a thing that does not work and then bill them for the privilege. The honest answer to "where should we add AI" is sometimes "not here."
So let me make the case for the boring solution, and then draw the line where AI actually wins.
The pressure to add AI is not a reason to add AI
You know the situation. A board member read something, or a competitor put "AI-powered" in a press release, and now there is a line item that says "AI strategy" with your name next to it. The temptation is to find a place, any place, to bolt an LLM on so you have something to show. That is how you end up with a chatbot nobody uses and a support summarizer that hallucinates refund policies.
The better move is to start from the problem and ask a flat question: would a deterministic answer be better here than a probabilistic one? Most of the time, in the parts of a business that actually matter (billing, compliance, access control, anything with a number in it), the answer is yes. And a probabilistic tool is a bad fit for a deterministic need.
Five places where AI is the wrong tool
1. When the answer has to be the same every time
If the same input must always produce the same output, you want code, not a model. Tax calculation, eligibility rules, pricing tiers, permission checks. These have correct answers. An LLM gives you an answer that is usually right, which is a different and much worse property. You can lower the temperature and tighten the prompt all you want; you are still asking a probabilistic system to do a job that has a known formula.
I have seen a team try to use a model to "classify" which discount applied to an order, when the discount logic was four if-statements. The if-statements never had an outage, never cost a cent per call, and never invented a fifth discount that did not exist.
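For a sense of scale, here is a minimal sketch of what four-if-statement discount logic can look like. The rules, thresholds, field names, and discount codes are all invented for illustration; they are not that team's actual logic.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Order:
    # All fields are hypothetical, chosen only for this sketch.
    subtotal: float
    is_employee: bool = False
    is_first_order: bool = False
    coupon: Optional[str] = None


def pick_discount(order: Order) -> str:
    """Return exactly one discount code: same input, same output, every time."""
    if order.is_employee:
        return "EMPLOYEE_20"
    if order.coupon == "SPRING10":
        return "COUPON_10"
    if order.is_first_order:
        return "WELCOME_5"
    if order.subtotal >= 500:
        return "BULK_5"
    return "NONE"
```

The function can only ever return one of the five values written in it, so a nonexistent discount is ruled out by construction, and every branch can be covered by a unit test.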
2. When rules or a regex already cover it
A surprising share of "AI" requests are pattern matching in a trench coat. Extracting an invoice number that always looks like INV- followed by six digits is a regex, not a language model. Routing a ticket based on which product name appears in the subject line is a keyword match. Validating an email format is a library you already have installed.
The test I use: if you can write down the rule in a sentence and it holds, write the rule. You get speed, near-zero cost, and an answer you can explain to an auditor. Reach for a model only when the rule has a long, fuzzy tail of exceptions that you genuinely cannot enumerate.
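As a rough sketch of the two rules mentioned above, here is the invoice-number pattern and a keyword router in Python. The product names and queue names are hypothetical placeholders, and the word-boundary handling is one reasonable reading of "INV- followed by six digits".

```python
import re
from typing import Optional

# "INV-" followed by exactly six digits, not embedded in a longer token.
INVOICE_RE = re.compile(r"\bINV-(\d{6})\b")

# Hypothetical product names mapped to hypothetical queue names.
PRODUCT_QUEUES = {
    "billing portal": "billing-team",
    "mobile app": "mobile-team",
}


def extract_invoice_number(text: str) -> Optional[str]:
    """Return the first invoice number found, or None if there is none."""
    match = INVOICE_RE.search(text)
    return match.group(0) if match else None


def route_ticket(subject: str, default: str = "general") -> str:
    """Route by the first known product name that appears in the subject line."""
    lowered = subject.lower()
    for product, queue in PRODUCT_QUEUES.items():
        if product in lowered:
            return queue
    return default
```

Both functions run in microseconds, cost nothing per call, and can be explained to an auditor by reading them aloud.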
3. When a wrong answer is expensive and you cannot check it
This is the one that bites hardest. An LLM will be confidently wrong sometimes, and that is a permanent property, not a bug to be patched. So the real question is not "how often is it wrong" but "what happens when it is, and will you catch it before it matters?"
If a human reviews every output before it goes anywhere, the failure is cheap and AI can be a real accelerant. A draft email that a person reads before sending is fine. A summary a human skims is fine. But if the output goes straight to a customer, a ledger, a medical record, or a legal filing with no human in the loop, then a confident wrong answer is a liability with your company's name on it. The cost of the mistake plus the difficulty of verifying it is the whole decision.
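One way to make that decision explicit is a gate that refuses to release model output to a high-stakes destination until a named human has approved it. This is a sketch under assumptions: the destination names and the `Draft` type are hypothetical, and a real system would persist a review queue rather than return a boolean.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical destinations where a confident wrong answer is expensive.
HIGH_STAKES = {"customer", "ledger", "medical_record", "legal_filing"}


@dataclass(frozen=True)
class Draft:
    text: str
    destination: str
    approved_by: Optional[str] = None  # reviewer's name, once a human signs off


def can_release(draft: Draft) -> bool:
    """Model output reaches a high-stakes destination only after human approval."""
    if draft.destination in HIGH_STAKES:
        return draft.approved_by is not None
    return True
```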
4. When latency or cost cannot absorb it
A model call is slow and not free. Expect tens to hundreds of milliseconds on a good day, sometimes seconds, plus a real per-call price that multiplies with your traffic. For a feature a user invokes a few times a day, nobody notices. For something on the hot path of every request, every page load, or every row of a batch job over millions of records, you have just signed up for a latency tax and a bill that scales with success.
Run the math before you build. A 200ms model call inside a loop that runs ten million times is not a feature, it is an incident waiting for a busy Tuesday. Often a cached result, a precomputed table, or a small classifier you train once will do the same job at a fraction of the time and price.
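The math for that example fits in a few lines. The call count and latency come from the text; the per-call price and the concurrency level are assumptions picked purely for illustration, so substitute your own numbers.

```python
CALLS = 10_000_000       # loop iterations, from the example above
LATENCY_S = 0.200        # 200 ms per model call
PRICE_PER_CALL = 0.001   # ASSUMPTION: illustrative price in dollars, not a real quote
CONCURRENCY = 50         # ASSUMPTION: parallel requests your rate limit allows

sequential_seconds = CALLS * LATENCY_S
sequential_days = sequential_seconds / 86_400
parallel_hours = sequential_seconds / CONCURRENCY / 3_600
total_cost = CALLS * PRICE_PER_CALL

print(f"sequential: {sequential_days:.1f} days")
print(f"at {CONCURRENCY}x concurrency: {parallel_hours:.1f} hours")
print(f"cost: ${total_cost:,.0f}")
```

Under these assumptions the job takes about 23 days if the calls run one at a time, or roughly 11 hours at 50 concurrent requests, and costs $10,000 at the illustrative price.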
5. When you do not have the data
People imagine an LLM as a thing that knows your business. It does not. It knows the public internet up to a training cutoff. If the answer lives in your private data, the model only helps once you have built the retrieval, the access controls, and the data plumbing to feed it the right context at the right moment. That plumbing is most of the work, and if your data is scattered, stale, or locked in PDFs nobody has parsed, the model has nothing to stand on. AI does not fix a data problem. It exposes one.
So where does AI actually earn its place
I do not want to leave you thinking the answer is always "no," because it is not. AI is genuinely the best tool when the problem is language-shaped and rules fall apart.
Summarizing long, unstructured documents into something a person reads next. Drafting first versions of text that a human edits. Classifying open-ended free text where the categories are fuzzy and the input is endlessly varied. Extracting structured fields from documents that have a thousand different layouts. Powering a search or support experience over messy internal knowledge, where keyword search misses too much. Translating between languages or registers. Writing code under a developer's supervision.
Notice the common threads. The input is messy human language. There is no clean rule that covers the long tail. A wrong answer is recoverable, usually because a human sees it before it counts. And doing the job by hand would be slow and expensive. When all four are true, AI is not a gimmick, it is leverage.
A quick reference table
| Problem type | Better tool | Why |
|---|---|---|
| Pricing, tax, eligibility, permissions | Plain code | Deterministic, auditable, no per-call cost |
| Extracting fixed-format IDs or fields | Regex or parser | Faster, free, exact |
| Ticket or content routing by keyword | Rules engine | Explainable and cheap to change |
| Exact lookups (customer, order, inventory) | Database query | Correct by construction |
| Finding known records by attribute | Search index | Built for it, fast at scale |
| Recommendations at huge scale on the hot path | Trained classifier or precomputed table | Lower latency and cost per call |
| Summarizing long unstructured text | LLM | No rule covers it, human reviews output |
| Drafting text a person will edit | LLM | Cheap mistakes, real time saved |
| Classifying fuzzy open-ended free text | LLM | Categories resist hard rules |
| Extracting fields from varied documents | LLM (often with retrieval) | Layouts defeat parsers |
| Search over messy internal knowledge | LLM with retrieval | Keyword search misses too much |
The pattern across the top half: a known answer, a fixed format, a need for speed or proof. The pattern across the bottom half: messy language, no clean rule, a human checking the result. Most real systems are a mix. The skill is drawing the line in the right place, not picking one tool for the whole product.
The advisor who says "you don't need AI here"
Here is the part that does not get said enough. A partner who tells you to skip the LLM and write the four if-statements is worth more than one who says yes to everything. The first one is optimizing for your system working. The second one is optimizing for their invoice.
When we do AI consulting, a real chunk of the value is in the "no." No, that does not need a model. No, you would be paying for latency you cannot afford. No, your data is not ready, and here is what to fix first. That advice is unglamorous, but it saves clients real money, and it is why they come back when they do have a problem that AI is genuinely good at.
So the next time someone hands you "add AI," push back with a better question. What problem are we solving, how often must the answer be exactly right, and what happens when it is not? Answer those honestly and the right tool usually picks itself. Sometimes it is an LLM. Often it is a regex you could have shipped last week.
If you want a second opinion on where AI fits in your stack and where it does not, that is a conversation we are happy to have, including the parts where the answer is no.