What To Look For When Hiring An AI Developer
How a non-technical founder vets an AI developer: how to tell shipped work from demos, the right questions to ask, and the red flags that signal trouble before you pay.
How does a founder who cannot read code tell a real AI developer from someone who is good at sounding like one? That is the actual question behind every hiring decision in this space, and right now it is harder than it has ever been. The tools that build software have gotten so good that anyone can produce a slick demo in an afternoon. A demo is not a product, and the gap between the two is exactly where founders lose money. This post is about closing that gap before you sign anything.
I build AI products and automation for clients, so I am on the other side of these conversations often. I am going to tell you what I would check if I were the one writing the check, in plain terms, with no jargon you need to look up.
Shipped Work Beats Every Demo
The single most useful thing you can do is separate what someone has actually put in front of real users from what they built to impress you. A demo runs once, on the developer's machine, with the inputs they chose. Shipped work runs every day, for people who did not read the manual, on inputs nobody predicted. The skills are not the same, and the second one is the one you are paying for.
So ask for live links. Not screenshots, not a recorded video, not a private repo you cannot open. A real URL you can visit yourself, an app in a store you can download, a tool you can sign up for and click around in. When someone has shipped, they are usually proud to hand you the address. When they hedge, when everything is "under NDA" or "we took it down" or "I can show you on a call," treat that as information. Some NDAs are genuine, but if every single thing in a portfolio is invisible, the simplest explanation is that there is less there than the conversation suggests.
When you do get a live link, poke at it like a confused customer, not like an engineer. Try the obvious wrong thing. Leave a field blank. Click the button twice. Refresh in the middle. Real shipped software handles that gracefully because real users have already done all of it. A polished thing that falls apart the moment you go off the happy path was probably built to be watched, not used.
The same logic applies to AI features specifically. Anyone can show you a chatbot that answers the one question they prepared. Ask it the second question, the awkward one, the one with a typo in it. See whether the thing degrades politely or says something embarrassing. That tells you whether the developer thought about the messy reality of an AI product or just wired up the demo path.
The Questions That Actually Reveal Skill
You do not need to be technical to ask questions that separate the real builders from the talkers. You just need questions where a vague answer is itself the answer.
Ask them to walk you through one project end to end, from the first conversation with the client to the day it went live and what happened after. People who have actually shipped tell this story with texture. They mention the thing that broke in week two, the feature the client thought they wanted and did not, the boring part that took longer than the exciting part. People who have not shipped stay at the level of adjectives. They tell you it was "scalable" and "robust" and "cutting edge" and never once mention a specific decision they had to make.
Ask what they would do if the AI part of your project turns out to be unreliable or too expensive at scale. This is a fair question because for AI products it is the question. A serious developer will not flinch. They will talk about fallbacks, about caching, about measuring cost per request, about when a simpler approach beats a fancy model. Someone who only knows how to call an API and hope will get uncomfortable, because you have just asked about the part they have never had to own.
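You do not need to read code to ask that question, but it can help to see roughly what a good answer looks like once it is built. The sketch below is a minimal illustration, not anyone's production setup. It wraps a model call with a cache, a polite fallback, and a running cost-per-request figure. Every name in it (`AnswerService`, `call_model`, `PRICE_PER_1K_TOKENS`, `fake_model`) is hypothetical, and the price is a made-up number for illustration only.

```python
from dataclasses import dataclass, field
from typing import Callable

# Assumed price for illustration only; real pricing depends on the provider and model.
PRICE_PER_1K_TOKENS = 0.01


@dataclass
class AnswerService:
    # Hypothetical model caller: takes a question, returns (answer, tokens_used).
    call_model: Callable[[str], tuple[str, int]]
    fallback_answer: str = "Sorry, I can't answer that right now. A person will follow up."
    cache: dict[str, str] = field(default_factory=dict)
    requests: int = 0
    total_cost: float = 0.0

    def answer(self, question: str) -> str:
        self.requests += 1
        key = question.strip().lower()

        # Caching: a repeated question is served without paying for the model again.
        if key in self.cache:
            return self.cache[key]

        # Fallback: if the model call fails, degrade politely instead of crashing.
        try:
            text, tokens = self.call_model(question)
        except Exception:
            return self.fallback_answer

        # Cost tracking: add up what each real model call costs.
        self.total_cost += tokens / 1000 * PRICE_PER_1K_TOKENS
        self.cache[key] = text
        return text

    def cost_per_request(self) -> float:
        # Average spend across all requests, including cached and failed ones.
        return self.total_cost / self.requests if self.requests else 0.0


def fake_model(question: str) -> tuple[str, int]:
    # Stand-in for a real model API so the sketch runs on its own.
    if "boom" in question:
        raise RuntimeError("model unavailable")
    return f"Answer to: {question}", 500


service = AnswerService(call_model=fake_model)
first = service.answer("What are your opening hours?")
repeat = service.answer("  what are your opening hours?  ")  # served from the cache
failed = service.answer("boom")  # model call fails, fallback is returned
```

The point is not the code itself. A developer who has owned this problem can describe each of these three pieces in plain language and tell you which ones your project actually needs.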
Ask who maintains it after launch and what handoff looks like. The honest answer involves documentation, access to the accounts and keys, and a plan for you to either keep them on retainer or take it elsewhere without being held hostage. If the answer is fuzzy, you may be looking at someone who builds things that only they can ever touch, which is a slow trap.
Finally, ask them to explain a technical tradeoff in your project in language you understand. The best builders are translators. If someone cannot make their choices legible to you, the problem is not your intelligence. Either they do not understand those choices deeply, or they do not respect that you need to understand them.
Red Flags Worth Walking Away From
Some signals are reliable enough that I would slow down or stop entirely when I see them.
The first is a portfolio with no reachable work and no clear reason why. Covered above, but it earns the top spot because it is the most common and the most expensive to ignore.
The second is a quote that arrives instantly with total confidence and no questions. Real scoping requires understanding what you actually need, and that takes a conversation. A number that lands before anyone has asked what success looks like is a number pulled from the air. The opposite extreme is also a flag: an estimate so padded and hedged that it tells you the person has no idea. But the instant confident quote is the more seductive trap because it feels like decisiveness.
The third is fluency in buzzwords paired with silence on specifics. If every answer is "we use agents and RAG and the latest models" and no answer is "here is the exact thing your users will see and how it behaves when it fails," you are talking to marketing, not engineering. Buzzwords are free. Decisions are not.
The fourth is any resistance to a small paid first step. A developer confident in their work is usually happy to prove it on a contained piece before you commit to the whole thing. Someone who insists on a large upfront commitment with no checkpoint is asking you to take all the risk. You can structure almost any project so the first milestone is small, real, and shippable, and the people worth hiring tend to prefer it that way too.
The last one is subtle. Pay attention to whether they are curious about your business or only about the technology. The developer who keeps steering back to what your customers need and what outcome you are actually buying is the one who will build the right thing. The one who only lights up about the stack will build something technically impressive that misses the point.
How To Structure The First Engagement
Even after you have found someone who passes every check, structure protects you. Do not hand over a single large fixed contract for an unproven relationship.
Start with a small, well-defined first piece of work with a clear deliverable you can see and use. It might be one feature, one working slice of the product, one automation that does one job. The goal is to learn how this person actually works, how they communicate when something gets hard, and whether the thing they hand you matches what they promised. You learn more from one real milestone than from ten reference calls.
Get the boring ownership questions settled in writing before money moves: who owns the code, where the accounts and keys live (they should be in your name), and what happens if you part ways. None of this is hostile. It is just the difference between a clean relationship and a painful one, and good developers expect it.
If you want a sense of what a serious build looks like end to end, my AI agent development work is structured around exactly these principles: real shippable milestones, plain communication, and software you own rather than rent. And if you would rather just talk through your specific project and figure out whether it is even worth building yet, you can book a call and we will sort that out before anyone writes a line of code.
The throughline across all of this is simple. You are not trying to evaluate code, you are trying to evaluate judgment, and judgment shows up in shipped work, in specific answers, and in a willingness to prove it on a small piece first. Hold to those three and you will filter out almost everyone who would have wasted your time.
I am Kevin Gabeci, a software engineer who builds this kind of thing for clients, solo and fast. If you want it built, book a call.