Five simple questions every founder and CTO should agree on before starting an AI project. The strongest projects ship because these are answered in week one.
Every AI project that ships and pays back has one thing in common. The founder and the technical lead agreed on the same answers to a small set of questions before any code was written. The questions are not technical. They are the kind of questions a founder and a CTO can answer together in a single meeting, and the answers shape every decision that follows.
This post walks through five of them. They are the five we have seen separate the projects that work from the projects that stall. In Forrester's 2026 analysis of agent deployments, 41 percent of underperforming projects came from one cause alone: the team had not agreed on what success looked like. The other four questions cover the same ground for cost, ownership, measurement, and exit, and the pattern holds. Projects that get these answered early ship. Projects that leave them to "we will figure it out as we go" run into avoidable problems later.
The good news is that all five questions can be answered in one structured 90-minute meeting between the founder and the technical lead, before the first kickoff. The hard part is having the meeting. The questions themselves are clear, and the right answers tend to surface quickly once the right people are in the room.
The first question is about being specific.
A vague answer sounds like "we want an AI agent that handles tier-one support tickets." A clear answer sounds like "cut median tier-one ticket resolution time from 47 minutes to under 15 minutes within six months, with customer satisfaction staying above 4.2 out of 5."
The vague answer describes a capability. The clear answer describes the change in the business that capability is meant to produce. The difference is not stylistic. It is the difference between a project that has a referee and a project that has an open debate every Monday.
When the success metric is written down with a number, every later decision has a tiebreaker. Should the team use one model or several? Whichever moves the metric. Should the system call a frontier model on every request or a cheaper model most of the time? Whichever holds the metric while staying inside the budget. Should we build the eval system now or later? Now, because the eval system is how we know whether we are moving the metric at all.
A clear sentence in week one removes the largest single source of wasted AI budget in the field. It is also the cheapest piece of risk reduction available, and the answer almost always becomes obvious once the founder and the CTO sit down to write it together.
The second question is about the economic envelope.
Most teams discover their cost ceiling in production, usually after the first month of real traffic. The strong move is to set it in week one. The math is simple and worth doing on a whiteboard. A workflow that costs 50 cents per call during testing turns into a $50,000 monthly bill at 100,000 monthly calls, which is roughly the volume most mid-market AI features hit by month four. Doing the math early shapes the architecture before it is too expensive to change.
Knowing the cost ceiling forces useful decisions early. Does the system use a frontier model on every call, or a cheaper model for the easy 80 percent and a frontier model only for the harder 20 percent? Does it cache aggressively, or not at all? Does it run single-shot prompts or longer reasoning chains? Each option has a different unit cost, and the right choice depends on what the business can spend per customer interaction. That number is a CFO and CTO call made together.
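As a rough sketch of the whiteboard math behind that choice, assuming purely illustrative per-call prices for a frontier model and a cheaper model, a few lines of Python show how routing the easy 80 percent changes the monthly bill:

```python
# Back-of-the-envelope cost model. All prices are illustrative
# placeholders, not quotes from any specific provider.
MONTHLY_CALLS = 100_000
FRONTIER_COST_PER_CALL = 0.50   # assumed cost of a frontier-model call
BUDGET_COST_PER_CALL = 0.05     # assumed cost of a cheaper-model call
EASY_SHARE = 0.80               # share of calls the cheaper model can handle

# Option A: frontier model on every call
single_model_bill = MONTHLY_CALLS * FRONTIER_COST_PER_CALL

# Option B: route the easy 80 percent to the cheaper model
routed_bill = MONTHLY_CALLS * (
    EASY_SHARE * BUDGET_COST_PER_CALL
    + (1 - EASY_SHARE) * FRONTIER_COST_PER_CALL
)

print(f"Frontier on every call: ${single_model_bill:,.0f}/month")  # $50,000
print(f"Routed 80/20:           ${routed_bill:,.0f}/month")        # $14,000
```

The exact prices will differ for every stack; the point is to see the spread between the options before the architecture is committed.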
The latency ceiling is the same idea in a different unit. A support agent that takes 12 seconds to respond is not the same product as one that takes 800 milliseconds. The user experience is different even if the answer is identical. Where the line should sit is a business judgment, and the technical team can build to whatever number the business agrees on.
When this question is answered together, the architecture team has a clear envelope to design inside, and the project avoids the most common cost surprise in production.
The third question is the one most teams underestimate, and the one with the biggest payoff when it is answered early.
AI systems need ownership the same way any other production system does. Models drift. Prompts decay. Vendor APIs change. Costs creep. None of that fixes itself, and none of it sits inside the engineering team's regular workload by default.
The market has noticed. By 2026, 56 percent of enterprises have created a formal AI agent owner role. In 2024 it was 11 percent. That fivefold jump in 18 months is the largest organisational shift in the field, because companies that tried it without naming an owner found themselves with a system that nobody was actively keeping in shape.
The right way to think about this is to ask one question. When the AI system needs attention, whose name is on it? The named owner runs the production monitoring, owns the eval results, approves prompt changes, decides when to swap models, and is the escalation point when the business notices a quality issue. The role is operational. It outlasts the build by a wide margin, and the cost of staffing it is part of the project budget, not a surprise that shows up in month six.
In most mid-market companies, this role is not a new hire. It is an existing senior person with explicit authority over the system, the right time allocation, and a clear accountability line back to the success metric from question one. Naming that person in week one is one of the cheapest, highest-leverage decisions in the entire project.
The fourth question is about how the team will tell whether the system is getting better.
This is the eval question, and it has the biggest single effect on production reliability of any decision in the project. Production data from 2026 shows that agents with proper eval coverage get rolled back at a 9 percent rate, while agents without it get rolled back at a 47 percent rate. Same engineering teams. Same models. The single difference is whether the eval system was built early.
A solid eval setup has three parts. A small dataset of representative inputs paired with the expected outputs, usually 50 to 200 examples to start. Automated scoring against that dataset every time the prompt changes, the model changes, or the retrieval changes. And a small set of judgment-based checks for things like tone, format, and reasoning quality, where exact matching is not the right tool.
None of this is technically hard. Teams that build it early ship faster after launch, because every change can be tested confidently before it goes live. Teams that postpone it spend the first few months after launch making changes they cannot fully verify, then build the eval system anyway after the first incident. The earlier choice is much cheaper, and almost always the right one.
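To show how small the first version can be, here is a minimal sketch of the first two parts: a golden dataset plus an automated exact-match score, run on every prompt or model change. The dataset path and the classify_ticket function are placeholders for whatever the real system exposes:

```python
import json

def classify_ticket(text: str) -> str:
    """Placeholder for the real model call under test."""
    raise NotImplementedError

def run_evals(dataset_path: str = "evals/tier1_tickets.jsonl") -> float:
    """Score the current prompt/model against the golden dataset.

    Each line of the file is a JSON object: {"input": ..., "expected": ...}.
    Run this on every prompt, model, or retrieval change.
    """
    with open(dataset_path) as f:
        examples = [json.loads(line) for line in f]
    correct = sum(
        classify_ticket(ex["input"]).strip().lower() == ex["expected"].strip().lower()
        for ex in examples
    )
    score = correct / len(examples)
    print(f"{correct}/{len(examples)} correct ({score:.0%})")
    return score
```

The third part, the judgment-based checks for tone and format, sits on top of the same loop, often with a second model acting as the grader.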
The question for the founder and the CTO is straightforward. Is the eval system in scope from week one? The answer should be yes, and the budget should reflect it.
The fifth question is about staying flexible as the market moves.
The AI vendor landscape is moving fast. In April 2026, Anthropic moved Claude enterprise from fixed pricing to dynamic pricing. The change is expected to double or triple costs for heavy users. A Q1 2026 survey found that 81 percent of enterprise leaders flag vendor dependency as a top concern. Companies that do migrate between providers report average migration costs of around $315,000 per project. The market is mature enough that pricing changes, model deprecations, and policy shifts are now part of the normal operating environment.
The good news is that staying flexible costs almost nothing if the work is done in week one. The pattern is called a model abstraction layer, and it is simple in concept. The application code expresses what it needs in business terms. Classify this support ticket. Summarise this contract. Generate this draft response. Underneath that, the system wires to a specific provider. When pricing changes or a better model becomes available, the team changes the wiring, not the application.
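A minimal sketch of that layer in Python, with hypothetical names throughout, looks like this. The application only ever calls the business-level functions; the vendor SDK lives behind one small class:

```python
from typing import Protocol

class LLMProvider(Protocol):
    """The only model surface the application code ever sees."""
    def complete(self, prompt: str) -> str: ...

class ProviderA:
    """Wraps one vendor's SDK. Swapping vendors means rewriting only this class."""
    def complete(self, prompt: str) -> str:
        # The vendor-specific SDK call goes here and nowhere else.
        raise NotImplementedError

# Business-level functions: this is what the rest of the codebase imports.
def classify_ticket(llm: LLMProvider, ticket: str) -> str:
    return llm.complete(f"Classify this support ticket into a tier-one category:\n{ticket}")

def summarise_contract(llm: LLMProvider, contract: str) -> str:
    return llm.complete(f"Summarise the key obligations in this contract:\n{contract}")

def draft_response(llm: LLMProvider, ticket: str) -> str:
    return llm.complete(f"Draft a reply to this support ticket:\n{ticket}")
```

When a pricing change makes another vendor attractive, the team writes one new class against the same complete signature and nothing upstream changes.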
Teams that build this layer in week one have negotiating leverage and the option to swap providers when it makes sense. Teams that wire directly against a single vendor's API can still get there, but the work is much larger after the fact. The question for the founder and the CTO is whether to pay the small cost now or the larger cost later. The answer is almost always now.
A founder and a CTO who answer all five questions together in week one have a project with a measurable target, a clear cost envelope, a named owner, an eval system in scope, and an exit path built in. That is a strong foundation, and the projects built on it tend to ship inside the 8 to 12 week window that boutique AI builds run on in 2026.
The questions also surface useful information about the project itself. If two of the five questions cannot be answered confidently in the meeting, that is a signal the project needs another week of scoping before kickoff. If four or five can be answered, the project is ready to go and the team can move fast with confidence.
This is the kind of meeting Verttx runs at the start of every engagement. We pressure-test the answers, name the architecture that fits inside the cost envelope, build the eval system in week one, and ship in weeks rather than quarters, with full code ownership handed over to you at the end. You arrive with the answers to the five questions. We turn them into a working system.
We partner with ambitious teams to solve real problems, ship better products, and drive lasting results.