A/B Testing
Engage lets you run 2-4 variants on a campaign and auto-promote the winner. Behind the scenes it uses a Bayesian model - which sounds intimidating but is actually simpler to read than the t-tests most marketing tools use.
Navigation
A/B variants are configured inside the Campaign Builder under the A/B Variants tab.
What You Can Test
| Element | Examples |
|---|---|
| Subject line | "Your scooter misses you" vs "Come back for 20% off" |
| Body copy | Long-form vs short-form, formal vs casual |
| CTA text | "Ride now" vs "Take a ride" vs "Unlock a scooter" |
| Send time | 10 AM vs 6 PM local |
| Channel | Email vs push (rare - usually a journey decision) |
You can test any combination of these by creating distinct variants. The traffic split is configurable per variant.
How the Test Runs
- You attach 2-4 variants to a campaign.
- Each recipient gets assigned to one variant deterministically (seeded by `customer_id`), so a single rider always sees the same variant if they show up in multiple sends.
- As engagement events stream in (delivered, opened, clicked), Engage updates its belief about each variant's true performance.
- Once each variant has at least 500 sends AND one variant has a 95%+ probability of being best, the winner is locked in.
- If you toggled Send remaining to winner, the rest of the audience receives the winning variant.
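The deterministic assignment step can be illustrated with a hash-based sketch. This is not Engage's actual implementation: the function name `assign_variant`, the `salt` parameter, and the choice of SHA-256 are all assumptions made for illustration. The idea is only that hashing `customer_id` gives the same answer every time, while still spreading riders across variants according to the weights.

```python
import hashlib

def assign_variant(customer_id: str, weights: dict[str, float], salt: str = "demo") -> str:
    """Map a customer to a variant; identical inputs always give the same variant.

    `salt` is a hypothetical extra input (for example a campaign identifier).
    """
    if not weights:
        raise ValueError("at least one variant is required")
    # Hash salt + customer_id into a stable number between 0 and 1.
    digest = hashlib.sha256(f"{salt}:{customer_id}".encode("utf-8")).digest()
    point = int.from_bytes(digest[:8], "big") / 2**64
    # Walk the cumulative weights until the point falls inside a variant's slice.
    total = sum(weights.values())
    cumulative = 0.0
    for name, weight in weights.items():
        cumulative += weight / total
        if point < cumulative:
            return name
    return name  # guard against floating-point rounding at the top edge

weights = {"A": 1.0, "B": 1.0}
print(assign_variant("rider-42", weights))
```

Because nothing is stored, the same rider can be re-assigned at any time and will land in the same variant, which is the property the list above describes.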
What Bayesian Actually Means Here
You may have learned A/B testing through "p-values" and "statistical significance." That approach (called frequentist) asks: "if the variants were actually identical, how often would I see results this extreme?" It's a useful question, but slow to give you a clear answer and easy to misuse.
The Bayesian approach asks the question you actually want answered: "What's the probability variant B is better than variant A?"
Engage gives you that probability directly. The campaign analytics page shows something like:
Variant A: 4.1% click rate (n=523), 12% chance of being best
Variant B: 5.8% click rate (n=518), 87% chance of being best
Variant C: 3.9% click rate (n=515), 1% chance of being best
When any variant crosses 95% and every variant has at least 500 sends, that's your winner.
The Math (Lightly Explained)
Skip this section if you are not a stats nerd - the tool works without it.
Engage models each variant as a Beta-Binomial:
- The Binomial part is the basic mechanic - of N sends, some succeeded (opened, clicked, whatever you defined).
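- The Beta part is a prior distribution over the true success rate. Engage uses a weakly-informative uniform prior, Beta(1, 1), which treats every success rate as equally likely before any data arrives.
- After observing data, the posterior is also a Beta distribution: Beta(1 + successes, 1 + failures).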
To estimate "P(variant B is best)," Engage takes many random samples from each variant's posterior and counts how often each variant comes out on top. The Beta samples are drawn via the Marsaglia-Tsang Gamma algorithm (a Beta draw can be built from two Gamma draws), which is fast and numerically stable at any reasonable variant count.
The practical upshot: you get a clean probability number that updates in real time as data comes in, with no p-hacking and no early-stopping bias.
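For readers who want to see the sampling idea in code, here is a minimal sketch under stated assumptions. It uses Python's standard-library `random.betavariate` rather than Engage's own sampler, the function name `probability_best` is hypothetical, and the click counts are invented to roughly match the rates in the earlier example. It is not the production algorithm, only the same technique: draw from each Beta posterior, record which variant has the highest draw, and repeat.

```python
import random

def probability_best(results, draws=20_000, seed=0):
    """results: {variant: (successes, sends)} -> {variant: estimated P(best)}."""
    rng = random.Random(seed)
    wins = {name: 0 for name in results}
    for _ in range(draws):
        # One posterior draw per variant, using the uniform Beta(1, 1) prior:
        # Beta(1 + successes, 1 + failures).
        sampled = {
            name: rng.betavariate(1 + s, 1 + n - s)
            for name, (s, n) in results.items()
        }
        # The variant with the highest sampled rate wins this round.
        wins[max(sampled, key=sampled.get)] += 1
    return {name: count / draws for name, count in wins.items()}

# Hypothetical click counts (clicks, sends); not real campaign data.
example = {"A": (21, 523), "B": (30, 518), "C": (20, 515)}
print(probability_best(example))
```

More draws give a smoother estimate at the cost of run time; the three probabilities always add up to 1 because exactly one variant wins each round.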
Setting Up an A/B Test
- In the Campaign Builder, open the A/B Variants tab.
- Click Add Variant.
- For each variant:
  - Give it a name (`A: original subject`, `B: emoji subject`)
  - Attach a template (variants typically share a template; subject-line variants override just the `subject` field)
  - Set the traffic split weight (defaults are equal weighting)
- Optional: toggle Hold out 10% to keep a control group that never sees any variant - useful for measuring incremental impact.
- Optional: toggle Send remaining to winner so once a winner is picked, the unsent audience receives the winning variant.
- Save and send.
Reading the Results
The campaign analytics page shows per-variant breakdowns:
| Column | What it means |
|---|---|
| Sends | Recipients in this variant |
| Delivered | Provider confirmed delivery |
| Open rate | Rate at which recipients opened the message (email/push only) |
| Click rate | Rate at which recipients clicked a link (email/SMS only) |
| Conversion rate | Hit the goal in the attribution window |
| P(best) | Probability this variant is the best of the set |
Once one variant hits 95% and all have at least 500 sends, you'll see a "Winner declared" banner.
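The two-part gate behind the banner can be written out as a short check. This is a sketch of the rule as described in this article, not Engage's code; the function name `declare_winner` and its parameters are hypothetical.

```python
def declare_winner(p_best, sends, min_sends=500, threshold=0.95):
    """Return the winning variant's name, or None if the gate is not met.

    p_best: {variant: probability of being best}
    sends:  {variant: number of sends so far}
    """
    # Gate 1: every variant needs enough sends.
    if any(sends[name] < min_sends for name in p_best):
        return None
    # Gate 2: the leading variant must reach the probability threshold.
    leader = max(p_best, key=p_best.get)
    return leader if p_best[leader] >= threshold else None

print(declare_winner({"A": 0.03, "B": 0.97}, {"A": 600, "B": 610}))
```

Both conditions must hold at the same time, which is why a variant sitting at 97% with only 300 sends does not trigger the banner.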
Minimum Sample Size
The 500-sends-per-variant gate exists to prevent calling early winners on noisy data. Even with a strong Bayesian framework, 50 sends per variant won't tell you much.
If your audience is smaller than 500 per variant:
- The test still runs, but no winner gets declared automatically.
- Use the manual Force winner button on the analytics page if you've reviewed the data and want to lock in one variant.
- Or skip A/B testing for small audiences - you'd be better off comparing two full sends across two weeks.
Holdout Groups
If you toggled Hold out 10%, that 10% gets no message at all. Their conversion rate becomes the "baseline" - the rate at which the goal would have happened without any send.
The incremental lift of your campaign = winning variant conversion rate minus baseline conversion rate. This is the true "did the campaign matter?" number.
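As a worked example of the lift arithmetic, here is a small sketch. The function name and all counts are hypothetical and chosen only to make the subtraction easy to follow.

```python
def incremental_lift(winner_conversions, winner_sends, holdout_conversions, holdout_size):
    """Winning variant conversion rate minus holdout (baseline) conversion rate."""
    winner_rate = winner_conversions / winner_sends
    baseline_rate = holdout_conversions / holdout_size
    return winner_rate - baseline_rate

# Hypothetical numbers: 120 of 2,000 winner recipients converted (6%),
# and 40 of 1,000 held-out riders converted with no message (4%).
lift = incremental_lift(120, 2000, 40, 1000)
print(f"Incremental lift: {lift:.1%}")
```

In this made-up case the campaign adds about two percentage points of conversions on top of what would have happened anyway.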
A/B Testing Inside Journeys
Journeys do not currently support per-step A/B testing. If you want to test a step inside a journey, run a standalone campaign for that step first, pick the winner, then bake the winning template into the journey.
Best Practices
- Test one thing at a time. If A has a new subject AND new body, you don't know which one moved the needle.
- Be patient. Hitting 95% probability with realistic effect sizes usually takes 1,000-5,000 sends per variant. If you only have 200 riders per variant, expect to wait or accept a less-confident result.
- Watch for confounders. A subject-line test that runs only on Tuesday is also a "Tuesday vs other day" test. Run for at least a full week unless your audience is huge.
- Document your winner. Once a variant wins, update your default template so the next campaign starts from the better baseline.
Troubleshooting
Variant traffic split is uneven
Variant assignment is deterministic per recipient, not a strict round-robin, so small audiences rarely split exactly on target. Below ~200 sends per variant, the observed split can sit noticeably off your target weights. At 500+ it should look close to them.
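To get a feel for how much drift is normal, the following sketch estimates the range a variant's observed share is likely to fall in. It assumes the assignment behaves like an independent random draw per recipient (a reasonable approximation for a hash-style assignment, but an assumption nonetheless), and the function name `expected_share_range` is hypothetical.

```python
import math

def expected_share_range(weight, total_sends, z=2.0):
    """Approximate range for a variant's observed share of all sends.

    weight: the target share (0.5 for an even two-way split)
    z: width of the range in standard errors (2.0 covers roughly 95%)
    """
    standard_error = math.sqrt(weight * (1 - weight) / total_sends)
    return (weight - z * standard_error, weight + z * standard_error)

for total in (400, 1000, 10000):
    low, high = expected_share_range(0.5, total)
    print(f"{total} total sends: {low:.1%} to {high:.1%}")
```

With an even two-way split and 400 total sends, anything between roughly 45% and 55% is ordinary noise under this approximation; the range narrows as sends accumulate.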
No winner declared after 5,000 sends
This means no variant has hit 95% probability yet, which usually means the variants are genuinely similar. Either accept that and pick whichever you prefer, or stop the test and design a more differentiated B variant.
Click rate is zero on all variants
Check the funnel - if delivered is also zero, the dispatch is broken. If delivered is high and clicked is zero, your link tracking is broken (usually a misformatted URL with double-braces left unresolved).
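A quick way to spot unresolved double-braces is to scan the rendered link for leftover `{{...}}` placeholders. The helper below is a hypothetical sketch, not an Engage feature, and the URLs are made-up examples.

```python
import re

# Matches any leftover {{placeholder}} in a rendered string.
UNRESOLVED = re.compile(r"\{\{.*?\}\}")

def unresolved_placeholders(rendered_url: str) -> list[str]:
    """Return every {{...}} placeholder still present in a rendered URL."""
    return UNRESOLVED.findall(rendered_url)

print(unresolved_placeholders("https://example.com/ride?c={{campaign_id}}"))
print(unresolved_placeholders("https://example.com/ride?c=1234"))
```

An empty list means no double-brace placeholders remain; anything else points at a template variable that was never filled in.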
Need Help?
For A/B testing help, contact support@levyelectric.com.