
How to spot an AI use case worth doing

The six prerequisites that decide whether an AI use case is worth building — with a scoring rubric, two worked examples, and the red flags that say walk away.

Miguel Vicente Jr · Head of Operations

Most AI projects that stall don't stall on the model. They stall because the use case was wrong before anyone wrote a line of code.

The pattern is consistent. A company picks a project because it sounds good in a board update, builds for three months, runs a demo that gets a round of applause, and then quietly shelves it. The model worked. The use case didn't. The data wasn't there, no one owned the outcome, the task was too rare to matter, or a simple rule would have done the same job for a hundredth of the cost.

This is the most expensive mistake in applied AI, and it happens before the technical work starts. Choosing the right use case is cheaper than any model you'll build on top of it, and it determines whether the build pays back or becomes a line item someone cuts next year.

This is a working guide to that choice: the six prerequisites that predict impact, a scoring rubric, two worked examples, and the failure signals to watch for.

Why use-case selection is the real bottleneck

The model is rarely the constraint anymore. Foundation models, fine-tuning, retrieval, agents — the toolkit is mature and getting cheaper every quarter. What separates a project that ships and saves money from one that dies in a slide deck is whether the problem was a fit for automation in the first place.

Think of it as a filter. Dozens of ideas go in. Most should be rejected fast and cheaply — on paper, in an afternoon — so that the engineering effort lands on the few that will still be running a year later. The prerequisites below are that filter.

The six prerequisites

1. The task is frequent and repetitive

AI pays back on volume. A task done 500 times a week has 500 times the return of a task of the same length done once a week. If a person does it constantly, with little variation, that's where automation compounds. If it happens twice a quarter, the build will almost always cost more than the time it saves.

The simple math: (times per week) × (minutes per time ÷ 60) × (loaded hourly cost) gives you the weekly return surface in money terms. A task done 400 times a week at 6 minutes each is 40 hours, a full person. A task done 5 times a week at 30 minutes is 2.5 hours; interesting, but the build has to be cheap to justify it.
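To make that arithmetic concrete, here is a minimal sketch of the calculation. Minutes are divided by 60 so they line up with an hourly cost. The function name and the hourly cost of 50 are assumptions made for illustration; only the task volumes and durations come from the examples above.

```python
def weekly_return(
    times_per_week: float,
    minutes_per_time: float,
    loaded_hourly_cost: float,
) -> tuple[float, float]:
    """Return (hours per week, cost per week) for a recurring task."""
    hours = times_per_week * minutes_per_time / 60
    return hours, hours * loaded_hourly_cost


# Placeholder loaded hourly cost; substitute your own figure.
HOURLY_COST = 50.0

print(weekly_return(400, 6, HOURLY_COST))  # (40.0, 2000.0)
print(weekly_return(5, 30, HOURLY_COST))   # (2.5, 125.0)
```

The first case is a full working week of effort; the second is a couple of hours, which is why the build behind it has to be cheap.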

  • Good fit: classifying and routing 800 inbound support tickets a day.
  • Poor fit: writing the annual strategy memo. High value, but it happens once, and the judgement is the point.

Ask: how often, and how long each time? Multiply. If the number is small, stop here — no amount of model quality fixes a thin return surface.

2. The data already exists and you can reach it

There is no AI without the data underneath it. If answering the task needs information that lives in three systems nobody has joined, or in PDFs nobody has parsed, the real project is a data project — and it should be scoped and costed as one, not discovered halfway through the build.

Be specific about three things: where the data lives, who can access it, and what state it's in. "It's all in the CRM" is not an answer until someone has confirmed the fields are populated, current, and exportable.

  • Good fit: the last two years of invoices, already in the accounting system, in a consistent format.
  • Poor fit: "tribal knowledge" that lives in people's heads and a Slack history nobody can search.

Ask: can you put your hands on a clean sample of the exact data the task needs, today? If not, the data work comes first.

3. The cost of a wrong answer is bounded

AI is probabilistic. It will be wrong sometimes. The question is what happens when it is, and whether someone catches it before it does damage.

A task where a person reviews the output before it's used, or where a mistake is cheap to notice and reverse, is a strong fit. A task where one wrong answer moves money, breaches a rule, or reaches a customer unreviewed is not — at least not without a control in front of it.

This is also where the level of autonomy gets decided. "AI proposes, a person approves" suits high-stakes work. "AI runs unattended" suits work where errors are visible, cheap, and rare. Most good first projects start at propose-and-approve and earn more autonomy as the error rate proves out.

Ask: what is the worst a wrong output can do, and who catches it before it lands? If the answer is "nothing catches it" and "the damage is large," the use case needs a control before it needs a model.

4. There's an owner and a process to improve

AI improves a process that already exists. It rarely invents a good one from nothing. The use cases that ship have a named owner who runs the workflow today and feels the pain personally. The ones that drift are owned by "the company" and championed by no one.

This is the prerequisite that has nothing to do with technology and predicts success better than any of the others. An owner defines what "good" means, supplies real examples, catches the edge cases, and defends the project at budget time. Without one, the build floats.

Ask: who runs this process now, and do they want it changed? If you can't name that person, you don't have a use case yet — you have an idea.

5. You can measure the before and after

If you can't measure it, you can't prove it worked, and a project you can't prove gets cut at the first budget review. You need a baseline before you start: cycle time, error rate, cost per unit, hours spent, backlog size. Something concrete and current.

"It'll make us more efficient" is not a number. "Average handling time is 14 minutes; backlog is 3 days" is. The second one lets you say, three months in, exactly what changed — and that sentence is what funds the next project.
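One way to keep that comparison honest is to write the baseline down as data before the build starts and compute the change against it later. The sketch below assumes that approach. The baseline figures are the ones quoted above; the "after" values and the metric names are invented purely for illustration.

```python
def percent_change(before: float, after: float) -> float:
    """Signed percentage change from the baseline; negative means the number fell."""
    if before == 0:
        raise ValueError("baseline must be non-zero to compute a percentage change")
    return (after - before) / before * 100


# Baseline from the text: 14-minute average handling time, 3-day backlog.
baseline = {"avg_handling_minutes": 14.0, "backlog_days": 3.0}
# Hypothetical measurements three months in (not real results).
after = {"avg_handling_minutes": 10.5, "backlog_days": 1.5}

for metric, before_value in baseline.items():
    change = percent_change(before_value, after[metric])
    print(f"{metric}: {before_value} -> {after[metric]} ({change:+.1f}%)")
# avg_handling_minutes: 14.0 -> 10.5 (-25.0%)
# backlog_days: 3.0 -> 1.5 (-50.0%)
```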

Ask: what number does this move, and what is that number today?

6. A plain rule wouldn't do the job better

This is the prerequisite people skip, and skipping it wastes the most money. Some tasks don't need AI — they need a script, a filter, or a fixed rule. If the logic is deterministic ("route every invoice over €10,000 to finance for sign-off"), write the rule. It's cheaper, faster, fully predictable, and never hallucinates.
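For a sense of scale, the invoice rule quoted above fits in a few lines. This is only a sketch: the function name and the queue labels are hypothetical, and only the €10,000 threshold comes from the example.

```python
from decimal import Decimal

# Threshold taken from the example rule; amounts are assumed to be in euros.
SIGN_OFF_THRESHOLD_EUR = Decimal("10000")


def route_invoice(amount_eur: Decimal) -> str:
    """Deterministic routing: invoices over the threshold go to finance for sign-off."""
    if amount_eur > SIGN_OFF_THRESHOLD_EUR:
        return "finance_sign_off"
    return "standard_processing"  # hypothetical name for the default queue


print(route_invoice(Decimal("12500.00")))  # finance_sign_off
print(route_invoice(Decimal("10000.00")))  # standard_processing
```

The same input always produces the same output, which is the property that makes a rule the right tool when the logic is this simple.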

Reach for AI when the task involves judgement, natural language, images, or messy input that rules can't capture — reading a free-text complaint and deciding what it's about, pulling fields from documents that all look slightly different, drafting a reply that fits the context. The honest test: could a competent engineer solve this with a week of plain code and a few if statements? If yes, do that.

Ask: is the hard part judgement and messy input, or is it just logic nobody has written down yet?

Score it before you build it

Take a candidate and rate it 1–5 on each prerequisite. The two that act as gates — data and ownership — should be weighted hardest, because a low score on either kills the project no matter how good everything else looks.

Prerequisite               Weight       Score (1–5)
Frequent & repetitive      ×2
Data exists & reachable    ×3 (gate)
Error cost is bounded      ×2
Clear owner                ×3 (gate)
Measurable baseline        ×2
Beats a plain rule         ×1

A weighted total is useful, but the gates override it: if data or ownership scores a 1 or 2, stop, regardless of the sum. A use case that scores high on excitement and low on data is the classic trap.
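The rubric and its gate override can be sketched in a few lines. The weights and the "1 or 2 fails the gate" rule are taken from the table and the paragraph above; the key names, the function, and the example scores are assumptions made for illustration. The article sets no pass mark for the weighted total, so the sketch reports the total without judging it.

```python
WEIGHTS = {
    "frequent_repetitive": 2,
    "data_reachable": 3,      # gate
    "error_cost_bounded": 2,
    "clear_owner": 3,         # gate
    "measurable_baseline": 2,
    "beats_plain_rule": 1,
}
GATES = ("data_reachable", "clear_owner")
GATE_FAILS_AT_OR_BELOW = 2
MAX_TOTAL = 5 * sum(WEIGHTS.values())  # 65 with the weights above


def score_use_case(scores: dict[str, int]) -> dict:
    """Compute the weighted total and apply the gate override."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    for name in WEIGHTS:
        if not 1 <= scores[name] <= 5:
            raise ValueError(f"{name} must be between 1 and 5")

    total = sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)
    failed_gates = [g for g in GATES if scores[g] <= GATE_FAILS_AT_OR_BELOW]
    return {
        "weighted_total": total,
        "max_total": MAX_TOTAL,
        "failed_gates": failed_gates,
        # True only means no gate failed; it is not a verdict on the total.
        "proceed": not failed_gates,
    }


# Hypothetical candidate: strong everywhere except the data gate.
hypothetical_candidate = {
    "frequent_repetitive": 5,
    "data_reachable": 2,
    "error_cost_bounded": 4,
    "clear_owner": 5,
    "measurable_baseline": 4,
    "beats_plain_rule": 5,
}
print(score_use_case(hypothetical_candidate))
# {'weighted_total': 52, 'max_total': 65, 'failed_gates': ['data_reachable'], 'proceed': False}
```

A total of 52 out of 65 looks healthy, yet the candidate still stops at the data gate, which is exactly the trap the override is there to catch.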

Two worked examples

Candidate A — "An AI assistant that answers any internal question." Frequency: high. Data: scattered across wikis, drives, and Slack, none of it clean (gate fails, score 2). Error cost: medium. Owner: "everyone," so no one (gate fails, score 2). Measurable: vague. Beats a rule: yes. Verdict: reject for now. It's two gate failures wearing an exciting title. The real first project is getting the knowledge into one searchable, current place — a data project.

Candidate B: "Extract supplier, amount, and due date from incoming invoices and pre-fill the entry." Frequency: 600/week (strong). Data: invoices already arrive by email in PDF, two years of history (gate passes, 5). Error cost: bounded, because a person reviews before posting (4). Owner: the finance ops lead, who is sick of manual entry (gate passes, 5). Measurable: 6 minutes per invoice today, 4% keying errors (5). Beats a rule: yes, the layouts vary too much for rules (5). Verdict: build it. Every prerequisite is green, with a clear baseline to prove the result against.

The difference between A and B isn't ambition or technology. It's that B passes the gates and A doesn't.

Red flags that a use case will fail

  • No owner, or the owner is a committee. Enthusiasm without a single accountable person is the most reliable predictor of a project that drifts.
  • "We'll figure out the data later." The data is the project. Later means never, or a surprise three-month detour.
  • The success metric is a feeling. "More efficient," "better experience," "modern." If you can't baseline it, you can't defend it.
  • It only works in the demo. Demos use clean, hand-picked inputs. Ask what happens on the ugliest 10% of real cases — that's where value leaks out.
  • A rule would do. If "if amount > X" solves it, AI is the expensive answer to a cheap question.
  • Boil-the-ocean scope. "Automate the whole department" is not a use case. The first shippable slice of it is.

How we run this

Most of what we do in a discovery sprint is exactly this: take a shortlist of candidate use cases, pressure-test each against these prerequisites, score them, and come out with one or two worth building and a clear reason for parking the rest. Two weeks, fixed fee, before anyone commits to a build. The output isn't the most impressive project — it's the one that will still be running, and still saving time, a year from now.

The technology is rarely the constraint. The choice of where to point it is. Get that right and the rest is engineering.
