Forget chasing viral ads — build a creative testing system that scales
How to engineer repeatable growth through systematic creative testing, for any budget

Summary
Creative testing should focus on scalable, repeatable processes — not post-hoc explanations of ad performance. Teams should establish clear baseline KPIs like CAC, spend, and conversion events, and adapt testing setups based on budget to optimize creative volume and increase ‘shots at goal’.
Creatives are one of the most powerful levers for growth — and with the speed and possibilities AI offers, their importance has only multiplied. We all know that producing more creatives should increase your odds of finding winners, but there’s a problem.
There’s too much noise. Every other post on LinkedIn introduces a new complex AI system that promises to ‘100x your creative process’. My feed is full of posts bragging about testing ‘thousands of creatives per week’. It sounds impressive, but it raises the question: do you need volume to compete?
Is the solution to automate creative and pump out hundreds of ads at a time? Or is this just the latest AI hype?
In this article, we’ll look at what actually matters in creative testing:
- Identifying which creatives move the growth needle
- Aligning your ad testing strategy with your budget
Stop producing for the sake of testing — start producing for the sake of scalability
Before you can identify winners and recreate that success, you need to have a good setup that serves your needs, while adapting to budget constraints.
This is my tried-and-tested setup that allows you to scale creative success by allocating more of your budget to winning creatives:
[Image: campaign setup that shifts budget toward winning creatives]
But that’s not the only setup that works for scalability.
If your limitations are tighter, ask yourself these two questions to figure out the best setup for your creative testing:
1. Do I already have winning creatives, or do I have to test from scratch?
This question determines your initial setup. If you’ve already run campaigns and seen what kind of creatives drive good performance, then you can start the new campaigns using these ideas. If not, your setup will have to continuously adapt until you discover the concepts that unlock growth.
2. How much money can I spend per day?
This answer will determine the regions and platforms you can target, the event that you can optimize for, and the number of campaigns, ad groups and creatives that make sense. Before diving into the mathematics of answering this question, let’s take a detour to determine what a winner looks like in terms of metrics.
Three types of ad creatives to identify
As with everything in life, ad success isn't black and white. There aren't just ‘winners’ and ‘losers’; there are grey areas in between.
Broadly speaking, here are the three types of ad creative you'll come up against.
Winning creatives are the ones that drastically improve performance across the board. They don't tank when you put more budget behind them, and they take much longer to burn out.
Poorly-performing creatives are pretty self-explanatory: they never perform at the level you need.
Then you have average creatives. This is the grey area — average creatives are those that get some spend and perform decently (though worse than winning ads), but you can’t scale them very aggressively. Average creatives are still important, as they add variety in your ad group and diversify your spend once winning creatives start to flag.
Within this group we also include the false positives and false negatives:
- False positives: Perform well when they have low spend, but perform badly when you force the algorithms to spend on them
- False negatives: Don’t get spend when among winning creatives, but can perform well when isolated in separate ad groups
TL;DR: the spend/traffic a creative receives is a critical variable that determines the best action to take on that specific ad.
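To make that concrete, here's a minimal sketch (an illustration, not a rule from any ad platform) of how you might turn those two signals, spend share and CPA versus your baseline, into a suggested next action. The 0.5 and 0.1 spend-share thresholds are assumptions for the example.

```python
def next_action(spend_share: float, cpa: float, baseline_cpa: float) -> str:
    """Suggest a next step for a creative from two signals:
    how much spend it attracts and how its CPA compares to your baseline.

    spend_share  - fraction of the ad group's daily spend it received (0.0-1.0)
    cpa          - its cost per target action so far
    baseline_cpa - the CPA your proven winners achieve
    Thresholds are illustrative, not prescriptive.
    """
    if spend_share >= 0.5 and cpa <= baseline_cpa:
        return "scale: likely winner, let it absorb more budget"
    if spend_share >= 0.5 and cpa > baseline_cpa:
        return "watch: possible false positive, pause if CPA stays high"
    if spend_share < 0.1 and cpa <= baseline_cpa:
        return "isolate: possible false negative, confirm in its own ad group"
    return "rotate: average or poor, plan to swap in a new concept"
```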
What does a winning creative look like for app growth?
The heart of this article is to lay out what a winning creative looks like — and since I'm a believer in educating through real data, here's real data showing the performance a winner brings to a campaign, compared to average creatives.
[Image: performance data comparing a winning creative with average creatives]
Impressive, right? With this creative, we reduced the cost-per-event by 65%, and it fully absorbed our spend. The creative also performed well in upper-funnel metrics like CPI, CTR, hook rate and hold rate.
This is what defines a real winning creative:
- They drastically improve performance, not only on your optimization goal, but also on engagement-related upper-funnel metrics
- They do this consistently, and for longer than other creatives
- They do this with a higher spend*
*In the example above, you see a short time because these winners were moved to isolated ad groups — more on this later!
How to establish a baseline that filters your winning creatives
Once you’ve got your creatives defined, you need to know how to rank their success. This means establishing a baseline for the different ad metrics and KPIs you’ll use to measure success and plan actions for ads. Your baseline KPIs will essentially tell you if your next creative has potential to be a winner.
This baseline should be determined by the winning creative — if you were able to produce a great performance with these, it means you can do it again (if you produce creatives of a similar quality!).
Here’s what I look at, in order from highest to lowest priority:
- Customer acquisition cost (CAC) / cost-per-acquisition (CPA): The primary action you’re optimizing for should be your north-star metric — keep this cost at or below your target, based on your business economics
- Spend: Winning creatives typically receive 80–95% of daily spend — if a creative receives less than 50% of the budget after two days, treat it as a likely loser or false positive
- Install to conversion event: Winning creatives convert to your optimization goal at a faster pace — this metric helps identify why some users install but fail to complete the target action
- Cost-per-install (CPI): Doesn’t need to be the lowest, but high-performing creatives usually deliver better-than-average CPI
- CTR: Indicates user intent and how effectively the creative captures attention
- Install rate: Measures how efficiently users who click go on to install the app — winning creatives should convert better than average creatives
- Hook rate: A critical indicator of early potential — normally much higher in winning creatives
- Hold rate: Measures how long users stay engaged; though it can fluctuate, it’s a strong signal of creative quality and retention
- Installs-per-mille (IPM): Winning creatives tend to consistently drive higher IPM
- Ad score: A broad summary of social engagement to identify which creatives drive meaningful interaction — to calculate it: (Reactions x 2) + (Comments x 5) + (Saves/Shares x 10)
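As a quick illustration of that last metric, here's the ad score calculation from the list expressed as a function; the engagement counts are whatever your platform export calls reactions, comments and saves/shares.

```python
def ad_score(reactions: int, comments: int, saves_or_shares: int) -> int:
    """Ad score as defined above: heavier engagement is weighted more strongly."""
    return reactions * 2 + comments * 5 + saves_or_shares * 10

# Example: 120 reactions, 14 comments and 9 saves/shares
print(ad_score(120, 14, 9))  # 240 + 70 + 90 = 400
```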
💡 A note on CAC/CPA
It’s important to consider which event your campaign is optimizing for. If you’re optimizing for an upper-funnel action like registration or trial start, you might see creatives with an impressively low CPA — but very poor conversion to paid subscription afterward.
This often happens when algorithms over-deliver ads to younger users (ages 18–24) who are curious enough to try the app but rarely subscribe.
In these cases, make sure the audience attracted by that creative aligns with your actual target segment. If not, monitor your conversion rate closely — a downward trend may signal that your best-performing creatives are simply attracting the wrong users.
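If you want a systematic check for this, here's a small sketch (field names like 'trials' and 'paid' are placeholders for your own reporting columns) that flags creatives whose trial-to-paid conversion lags well behind the account average, even if their CPA looks cheap.

```python
def flag_wrong_audience(creatives, min_ratio=0.7):
    """Flag creatives whose trial-to-paid conversion lags the account average.

    creatives: list of dicts with 'name', 'trials' and 'paid' counts
               (field names are assumptions for this sketch).
    min_ratio: how far below the account-wide rate a creative may fall
               before being flagged (0.7 = 30% worse than average).
    """
    total_trials = sum(c["trials"] for c in creatives)
    total_paid = sum(c["paid"] for c in creatives)
    account_rate = total_paid / total_trials if total_trials else 0.0

    flagged = []
    for c in creatives:
        rate = c["paid"] / c["trials"] if c["trials"] else 0.0
        if rate < account_rate * min_ratio:
            flagged.append((c["name"], round(rate, 3)))
    return account_rate, flagged
```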
For example, this is a snapshot of my Meta accounts:
[Image: snapshot of baseline metrics from the author's Meta ad accounts]
You can check other metrics like frequency, CPMs, cost per 1k accounts reached, or cost per 6s view, but the list above includes the ones I suggest using as a consistent baseline to measure success.
Creative ad optimization for every budget size: three testing frameworks
So, you know what a winner looks like by metrics, but how should you approach creative optimization at each budget stage?
From $0 to $500
So you have less room to test, but you can still make it work. In these cases, I always recommend testing one platform (normally iOS) and one GEO (normally the US), while optimizing for your main event (e.g. trial start, or direct purchase if you go with a hard paywall).
Your setup should be very simple: one campaign, and one ad group focused on the main event. If you split ad groups, you won’t generate enough events per day, meaning you won’t finish the learning phase and your performance will tank.
In terms of the number of creatives, I always go with eight–10, distributed depending on your answer to the first question above:
- If you already have winners from previous campaigns, run three–four winning creatives and two–three new concepts you want to test
- If you’re starting from scratch, simply invest in the best creatives you can produce
[Image: example single-campaign, single-ad-group setup for a sub-$500 budget]
If you run a channel like Meta or TikTok, you'll know within a couple of days whether the test concepts are winners, since these networks quickly push most spend towards the best performers.
There will be false positives and negatives, but if you push new concepts and they don't get spend, you can be almost certain they won't outperform your current top-spending creatives.
Whether you start from scratch or with pre-existing winners, you must rotate out the creatives that don't spend every two–three days, otherwise they'll never deliver good performance.
Also rotate winning assets if you start to see their KPIs worsen over time. Just as a test concept can become a winner, a winning asset can become a loser due to ad fatigue. Ultimately, there will always be a slot open for tests, either because the previous test hasn't spent or because a winner has fatigued.
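If it helps to see that rotation rule written down, here's a minimal sketch under assumed inputs: rotate anything that has gone roughly two–three days without meaningful spend, and rotate winners whose CPA has drifted past your target.

```python
def should_rotate(days_without_spend: int, cpa: float, target_cpa: float,
                  no_spend_limit: int = 3) -> bool:
    """Rotate a creative if it hasn't spent for ~2-3 days, or if a former
    winner's CPA has fatigued past the target. Thresholds are illustrative."""
    starved = days_without_spend >= no_spend_limit
    fatigued = cpa > target_cpa * 1.2  # 20% drift tolerance (an assumption)
    return starved or fatigued
```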
From $500 to $5,000
This is my favorite budget stage, as it allows you to split ad groups for testing purposes while still controlling performance and spending the majority of your budget on the best assets.
At this stage, you should already know which concepts are the best for your goal.
This is the setup I suggest, since you can launch three test ad groups and two BAU (business-as-usual) ad groups per campaign (assuming you run iOS campaigns with SKAN reporting). This allows you to put up to 30–50 creatives into testing every week, alongside up to 20 winners that absorb most of the budget.
You’ll likely see some testing assets get most of the spend, but with a worse CAC/CPA than your BAU assets. In this case, these should all be considered losers — pause them and rotate. (But remember to analyze and compare the engagement metrics, in case you can iterate on the idea and find a real winning asset)
[Image: example campaign structure with test ad groups and BAU ad groups]
Note: there are fewer ads pictured than I recommend, purely for clarity in the image
With this setup, you have the opportunity to double-confirm false positives and negatives in isolated ad groups. It's normal to get a lot of false positives. In this instance, there are two possibilities:
- If the asset performed better than BAU, create an isolated ad group (if necessary, create a new campaign if you don’t have more space in the existing campaign) and see how it evolves when you put a significant amount of spend on it. If it keeps performing better, it’s a real winner — make the most of it in the isolated ad group!
- If it quickly gets a much worse CAC than BAU, it means it was a false positive. As usual, don’t forget to check all the metrics in case the concept has winner potential.
False negatives will start to appear at this stage as well. The best approach is to follow the same strategy: isolate them in a new ad group and wait one–two days. Typically, you'll see poor performance when you force spend on them — if not, you've hit gold and identified a genuine false negative that turned into a winner!
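A compact way to encode that double-confirmation step, with assumed inputs, is to compare the asset's CAC in the shared test ad group and after isolation against your BAU benchmark. This is only a sketch of the decision logic described above.

```python
def confirm_in_isolation(test_cac: float, isolated_cac: float, bau_cac: float) -> str:
    """Second check for a suspected false positive or false negative.

    test_cac     - CAC the asset showed in the shared test ad group
    isolated_cac - CAC after 1-2 days in its own ad group with real spend
    bau_cac      - CAC of your business-as-usual (winning) ad groups
    """
    if isolated_cac <= bau_cac:
        return "real winner: keep scaling it in the isolated ad group"
    if test_cac <= bau_cac < isolated_cac:
        return "false positive: looked good at low spend, pause and iterate on the concept"
    return "loser: pause it, but review engagement metrics before discarding the idea"
```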
From $5,000 to infinity
At the higher end of budget, things get messy. You have a huge mass of assets to rotate, double-confirm, and scale all at once.
You’ll likely have different GEOs and multiple campaigns per GEO where you can have four–five BAU ad groups, 10–15 testing groups, and five–10 isolated groups to double-confirm false positives/negatives.
The number of ad groups that run as BAU is really determined by performance. In my experience, I've had accounts where I could run five campaigns with three BAU ad groups each, and others where ad fatigue on my winning creatives meant I had to rotate the winners faster — which obviously limited the number of BAUs.
This requires a ton of manual work, but it also accelerates your creative process proportionally. In these situations, what you really need to focus on is keeping your BAU ad groups' performance stable, since they're spending most of the budget.
If you see the CAC of these ad groups trending upwards, it's time to reduce the number of BAU ad groups and focus on optimizing their performance before adding more test ad groups. Otherwise, you risk rising CPA/CAC and undermining the logic behind your setup.
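One simple way to spot that upward drift early, shown here as an illustration over an assumed list of daily CAC values, is to compare the most recent few days against the previous window for each BAU ad group.

```python
def cac_trending_up(daily_cac: list[float], window: int = 3,
                    tolerance: float = 1.10) -> bool:
    """True if the latest `window` days' average CAC exceeds the previous
    window's average by more than `tolerance` (10% drift by default).

    daily_cac: chronological daily CAC values for one BAU ad group.
    Window size and tolerance are assumptions for this sketch.
    """
    if len(daily_cac) < 2 * window:
        return False  # not enough history to call it a trend
    recent = sum(daily_cac[-window:]) / window
    previous = sum(daily_cac[-2 * window:-window]) / window
    return previous > 0 and recent > previous * tolerance
```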
There's no perfect formula for ad wins
So there you have it — several detailed setups for optimizing your ads, and a breakdown of how to measure success.
However, no one can give you a perfect setup for every case. Everyone has techniques for winning ads, whether they're KPI formulas or AI tools — but each app is unique.
Your app and creatives have their own intricacies and idiosyncrasies, so don't be afraid to tweak this strategy to work for you. You might not have the budget to double-confirm assets, or maybe your winners perform well for longer than average, so you can leverage that performance without many rotations.
Stop focusing on creative volume, stop chasing winners without understanding them first — start with the basics, and learn along the way. There’s no better teacher than real data, and real experiments.