There MUST be a hypothesis for your test. If you don’t know what you are testing, then collecting data is pretty pointless. To call it a test, there needs to be something specific being tested; you can’t just throw spaghetti against the wall. Sometimes a radical redesign is the right call, but most of the time you should be testing one specific thing. Hero shots, calls to action, and headlines are all things that you should be testing.
For an A/B test to be meaningful and insightful you should follow these rules to keep your insights and data clean:
- The original cannot have content edits during the test
- All variations should have one hypothesis to test against the Original
- Traffic distributions should stay proportional
- Don’t end tests on less than 50% confidence
- Define a minimum sample size
The original CANNOT have content edits during the test
If you change your original in a way that affects your hypothesis then you have to start your data over from scratch.
If you are fixing something like a broken link or a typo, obviously don’t worry, but if you are changing the headline you should be making a variation, not changing the original.
If you change the original your data will no longer be representative of your landing page.
All variations should have ONE hypothesis to test against the Original
You can have one variation that tests headline, one variation that tests CTA, and one variation that tests form length, but you shouldn’t have one variation that tests all three at once.
Testing Tip: Remember to name your variations based on what they are testing.
Traffic distributions should stay proportional
Changing your traffic distribution in the middle of a test is not horrible, but it does complicate any insights you may have.
If you can, introduce the new hypothesis in the next generation instead of adding new variations to an ongoing test.
Examples of Bad Changes
Control (80%), Test A (20%) »» Control (50%), Test A (50%)
Control (80%), Test A (20%) »» Control (50%), Test A (25%), Test B (25%)
Better Changes
Control (80%), Test A (10%), Test B (10%) »» Control (60%), Test A (20%), Test B (20%)
Control (80%), Test A (20%) »» Control (60%), Test A (20%), Test B (20%)
Notice that I say better change, not good change: changing your traffic mix always adds some noise to your data.
If you can save the new hypothesis for the next test you should.
Don’t end tests on less than 50% confidence
I know that many people are having an aneurysm even thinking about a test that isn’t being taken to 95% confidence.
Here is what I have to say: I’m not a scientist, I’m a marketer.
I don’t calculate standard deviation; I calculate return on investment, and that means moving on to the next test that improves my results and income instead of getting into the mathematical weeds on any given scenario. #Amen (Editor’s note).
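If your testing tool doesn’t report a confidence number, here is a rough sketch of one way to get it. Treat it as an assumption rather than the definition used here: it reads “confidence” as the chance that the variation beats the control under a one-sided two-proportion z-test with a normal approximation, and your tool may calculate it differently. The function name and parameters are made up for illustration.

```python
import math


def confidence_variation_beats_control(
    control_visitors, control_conversions,
    variation_visitors, variation_conversions,
):
    """Approximate confidence (0.0 to 1.0) that the variation converts better.

    Assumption: one-sided two-proportion z-test with a pooled conversion rate.
    """
    control_rate = control_conversions / control_visitors
    variation_rate = variation_conversions / variation_visitors

    # Pooled conversion rate across both versions.
    pooled = (control_conversions + variation_conversions) / (
        control_visitors + variation_visitors
    )
    std_err = math.sqrt(
        pooled * (1 - pooled) * (1 / control_visitors + 1 / variation_visitors)
    )
    if std_err == 0:
        # No conversions at all (or everyone converted): nothing to compare.
        return 0.5

    z = (variation_rate - control_rate) / std_err
    # Standard normal CDF, written with the error function.
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


# Hypothetical numbers: 100 visitors each, 10 vs. 15 conversions.
print(confidence_variation_beats_control(100, 10, 100, 15))
```

Under this reading, a variation that converts exactly as well as the control sits at 50%, so the 50% floor amounts to “the variation is at least ahead.”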
Define a minimum sample size
The simplest rule is to choose a minimum sample that lets all versions get 100 visitors. Why 100, you may ask? Because then all of your percentages are easy to talk about and understand. If you don’t have what you need you can keep going, but don’t cut and run from what could be a solid improvement.
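To see what that rule means in total traffic, here is a quick sketch. It assumes your traffic split is given in whole percentages, like the Control (80%), Test A (20%) examples above, and that visitors land in each version roughly in line with that split. The function name is hypothetical.

```python
def total_visitors_needed(traffic_split, per_version_minimum=100):
    """Total visitors needed so the smallest version expects the minimum.

    traffic_split maps each version name to its whole-number percentage
    of traffic, e.g. {"Control": 80, "Test A": 20}.
    """
    if sum(traffic_split.values()) != 100:
        raise ValueError("traffic percentages must add up to 100")

    smallest_share = min(traffic_split.values())
    if smallest_share <= 0:
        raise ValueError("every version needs some traffic")

    # Integer ceiling division: the version with the smallest share
    # is the last one to reach the minimum.
    return (per_version_minimum * 100 + smallest_share - 1) // smallest_share


print(total_visitors_needed({"Control": 80, "Test A": 20}))                # 500
print(total_visitors_needed({"Control": 80, "Test A": 10, "Test B": 10}))  # 1000
```

The version with the smallest slice of traffic sets the pace, so a lopsided split means a longer wait before every version reaches 100 visitors.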
In Conclusion: Now You Know (and Knowing is Half The Battle)
That means that if you have to defend your decisions or hand-off control you will have a clear demonstration of your success.
So how’s your testing hygiene?
Tell me about some of your A/B testing successes and failures. Also, if you have some questions, comment!