What is A/B testing?
Product managers use A/B tests to create products that resonate most with users. By splitting the audience and testing different features or changes in content or UI/UX, product managers (PMs) can gain insights into what yields the best results. Yet, A/B tests are not only about randomly splitting a product’s audience: many finer points need to be considered to avoid inaccurate results.
In particular, how do you select the product’s audience before the test and during the analysis phase to see the clearest picture? In this article, we’ll dive into some audience specifics to make the most of A/B testing.
Researching users and tracking product changes
Running an A/B test is crucial for most product managers dealing with digital products that have a reasonably sized audience. It is essential to research users and measure the changes we are making to our product because our views on the product may be biased, and usually, the bias leans toward a much more positive perception of the change.
This happens for several reasons: the users of the product are very different from the people who created it, customers usually spend much less time with the product and don’t know how it works very well, and lastly, product people are partial to the product they made simply because it’s their creation. So, we must verify significant changes with real users.
There are many ways to research the changes we are making (or are going to make) and to study users, each relevant in different situations and stages of product development. A/B testing is the ultimate tool to determine if the change you are making to your product is impactful. Why is that?
A/B testing metrics: The ultimate performance measuring tool
Research methods can be grouped into qualitative and quantitative, as well as behavioral and attitudinal. Qualitative methods are primarily helpful for understanding why users act or think in a certain way and don’t give any solid statistical information. In contrast, quantitative methods don’t answer why but provide actual statistics.
Behavioral methods show how users really act, while attitudinal ones show more of what users think (and we know that their behavior can be very different from their storytelling). Thus, we must apply quantitative behavioral methods if we need actual statistics about how users interact with the product.
Among these methods, we usually have usability testing, Alpha/Beta testing, and A/B testing. It’s generally difficult to run enough usability test sessions to reach statistical significance, and the usability test setting is not 100% the same as the product’s actual usage environment. Alpha/Beta testing is typically done with the most motivated users, and this audience is not a representative statistical sample.
Conversely, A/B tests are done on randomly split audiences, so the difference we see cannot be attributed to a difference between the audiences. Once the A/B test is implemented, we can run it on as many users as we need, and with enough users in the test, we can detect even a small change in metrics.
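One common way to get a random yet stable split is to hash the user ID together with an experiment name. The snippet below is a minimal sketch of that idea, not a description of any particular tool: the function name `assign_group`, the experiment name `new-checkout`, and the `user-N` IDs are all hypothetical.

```python
import hashlib


def assign_group(user_id: str, experiment: str, groups=("A", "B")) -> str:
    """Deterministically map a user to one of the groups.

    Hashing the experiment name together with the user ID keeps a user in
    the same group on every visit, while different experiments get
    independent splits.
    """
    key = f"{experiment}:{user_id}".encode("utf-8")
    digest = hashlib.sha256(key).hexdigest()
    bucket = int(digest, 16) % len(groups)
    return groups[bucket]


# Hypothetical usage: split 10,000 made-up user IDs and count group sizes.
counts = {"A": 0, "B": 0}
for i in range(10_000):
    counts[assign_group(f"user-{i}", "new-checkout")] += 1
print(counts)  # the two counts should be close to each other
```

Because the assignment depends only on the IDs, it can be recomputed during analysis without storing a separate lookup table.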
We run A/B tests on the actual product, and users usually don’t know that their behavior is being tested, so it is not distorted by the test setting. To sum up, A/B tests show through statistics how users truly reacted to the change in the product.
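To make "through statistics" concrete, here is a minimal sketch of one standard way to compare conversion rates between two groups: a two-sided, pooled two-proportion z-test. The article does not prescribe a specific test, so treat this as one reasonable choice; the function name and the sample counts are hypothetical.

```python
from math import erfc, sqrt


def two_proportion_z_test(conv_a: int, n_a: int, conv_b: int, n_b: int):
    """Return (z, two-sided p-value) for the difference in conversion rates.

    conv_* are the numbers of converted users, n_* the group sizes.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both groups convert equally.
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        raise ValueError("Standard error is zero; the rates are all 0 or all 1.")
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution.
    p_value = erfc(abs(z) / sqrt(2))
    return z, p_value


# Hypothetical counts: 500 of 10,000 users convert in A, 560 of 10,000 in B.
z, p = two_proportion_z_test(500, 10_000, 560, 10_000)
print(f"z = {z:.2f}, p-value = {p:.3f}")
```

The normal approximation used here assumes reasonably large groups; for very small samples an exact test would be a safer choice.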
Why selecting a proper user audience is important
It’s essential to select the audience carefully for several reasons. Different segments of users can behave differently, so if you don’t carefully select the audience of the A/B test, in the best-case scenario, you will not detect the difference and, in the worst-case scenario, you can see misleading results. Consequently, you might make suboptimal decisions and thus lead your product and business in the wrong direction.
We should think about the user audience both before setting up the test and after it has reached the analysis phase. You shouldn’t include users in the test for whom the change is irrelevant (for example, showing a new onboarding flow to old users who have already completed onboarding). On the other hand, sometimes you need to dig deeper into how different segments of users in the test reacted to the change and analyze them separately. A/B testing is a tool not only to measure but also to gain some new insights about your customers.
Selecting audience for A/B tests: How to do it right
There are several factors you need to take into consideration when selecting your audience:
The proper segment of users. You need to think about what audience behavior you need to study during the test. Some types of segmentation to consider are new users vs. old users, different traffic sources, different countries, different platforms, and different user problems that are solved by your product.
The example here could be a redesign of your product. Usually, it’s almost impossible to improve any metrics for the old users during the redesign because they are very much used to the old design. On the other hand, new users don’t have such bias, so it might be more important to satisfy them, especially if you are planning to acquire a significant number of new users in the near future.
However, you should avoid over-targeting: even if you solved some user problem for a narrow audience, ran a successful A/B test, and saw a statistically significant improvement, you have only proven that the improvement is impactful for this particular audience. It doesn’t mean that the change would have the same effect on other audiences. Also, a considerable improvement for a specific audience might be insignificant for the whole business. For example, a 20% improvement for 10% of the total audience is only a 2% improvement for the entire product (assuming that audience accounts for a proportional 10% of the metric).
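The dilution arithmetic in the example above can be written as a one-line helper. This is only a sketch: `overall_lift` is a hypothetical name, and it assumes the segment's share of the metric equals its share of the audience and that the rest of the audience is unaffected.

```python
def overall_lift(segment_share: float, segment_lift: float) -> float:
    """Relative lift of the whole-product metric when only one segment improves.

    segment_share: fraction of the total metric produced by the segment
                   (assumed here to equal its share of the audience).
    segment_lift:  relative improvement measured inside that segment.
    """
    return segment_share * segment_lift


# The example from the text: a 20% lift for 10% of the audience.
print(f"{overall_lift(0.10, 0.20):.1%}")  # prints 2.0%
```

If the segment produces more (or less) of the metric than its headcount suggests, pass its actual share of the metric instead of its share of users.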
Advanced users. You must also pay attention to advanced users, such as power users. Imagine a situation where ten users generate the largest share of your profit. It’s possible that seven of those ten users end up in the same group and thus skew the results. Don’t forget to check for such pitfalls inside your product. Running an A/A test, in which both groups see the same version, can also help to detect such situations.
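A quick sanity check for this pitfall is to look at how the highest-revenue users were distributed across the groups. The sketch below assumes you can export revenue per user and the group each user was assigned to; the function name, user IDs, and numbers are all made up for illustration.

```python
def top_user_split(revenue_by_user: dict, group_by_user: dict, top_n: int = 10) -> dict:
    """Count how many of the top_n highest-revenue users landed in each group."""
    top_users = sorted(revenue_by_user, key=revenue_by_user.get, reverse=True)[:top_n]
    counts = {}
    for user in top_users:
        group = group_by_user[user]
        counts[group] = counts.get(group, 0) + 1
    return counts


# Hypothetical data: three heavy spenders and three light ones.
revenue = {"u1": 900, "u2": 850, "u3": 800, "u4": 20, "u5": 15, "u6": 10}
groups = {"u1": "A", "u2": "A", "u3": "B", "u4": "B", "u5": "A", "u6": "B"}

print(top_user_split(revenue, groups, top_n=3))  # prints {'A': 2, 'B': 1}
```

A strongly lopsided count is a signal to re-check revenue-based metrics, for example by analyzing them with and without these users.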
Only users that had a chance to experience the change. Not all users will interact with our change, and product managers and data analysts commonly overlook this. As a result, the impact is not detectable during the analysis because the relevant audience is diluted among all the other users.
It is important to analyze only those users who saw the change. For example, if you have a mobile application or a website and not all users visit a particular section where you’ve added a new feature, you need to analyze only those users who visited this section. Don’t forget to add proper logging to isolate the relevant audience.
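The filtering step described above could look like the following sketch. It assumes a hypothetical log of exposure events with `user_id` and `section` fields; your logging schema will differ. The exposure condition should be defined the same way in both groups (for example, "opened the section"), otherwise the filter itself can bias the comparison.

```python
def exposed_users(assignments: dict, exposure_events: list, section: str) -> dict:
    """Keep only the users who were logged as having opened `section`."""
    seen = {e["user_id"] for e in exposure_events if e["section"] == section}
    return {user: group for user, group in assignments.items() if user in seen}


def conversion_by_group(assignments: dict, converted: set) -> dict:
    """Conversion rate per group among the given users."""
    totals, hits = {}, {}
    for user, group in assignments.items():
        totals[group] = totals.get(group, 0) + 1
        hits[group] = hits.get(group, 0) + (1 if user in converted else 0)
    return {group: hits[group] / totals[group] for group in totals}


# Hypothetical data: only u1 and u2 ever opened the "settings" section.
assignments = {"u1": "A", "u2": "B", "u3": "A", "u4": "B"}
events = [
    {"user_id": "u1", "section": "settings"},
    {"user_id": "u2", "section": "settings"},
    {"user_id": "u3", "section": "home"},
]
relevant = exposed_users(assignments, events, "settings")
print(relevant)                               # prints {'u1': 'A', 'u2': 'B'}
print(conversion_by_group(relevant, {"u2"}))  # prints {'A': 0.0, 'B': 1.0}
```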
Timings. Not all periods during the week and the year are the same, so if we select the wrong timeframe for the start and the end of the test, it might not be representative. For many products, user behavior on business days and on weekends is very different, so it’s better to run your A/B test for a whole number of weeks. If you don’t, and, for example, run an A/B test from Monday to Friday, you exclude weekend days that might show different behavior.
Also, there are some special periods, such as holidays (e.g., Christmas holidays), when the behavior of customers does not represent the rest of the year, so it’s better to avoid running any tests during such periods.
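Both timing rules can be checked with a few lines of date arithmetic. The sketch below rounds a required duration up to whole weeks and flags overlap with a period you want to avoid. The function names and the dates, including the holiday range, are assumptions chosen for illustration.

```python
from datetime import date, timedelta
from math import ceil


def plan_test_window(start: date, min_days: int):
    """Return (start, end) covering at least min_days, rounded up to whole weeks.

    The end date is inclusive, so every weekday appears the same number of times.
    """
    weeks = max(1, ceil(min_days / 7))
    end = start + timedelta(days=7 * weeks - 1)
    return start, end


def overlaps(start: date, end: date, blackout_start: date, blackout_end: date) -> bool:
    """True if the inclusive range [start, end] intersects the blackout period."""
    return start <= blackout_end and blackout_start <= end


# Hypothetical plan: we need at least 10 days of data, starting on a Monday.
start, end = plan_test_window(date(2024, 3, 4), min_days=10)
print(start, end)  # prints 2024-03-04 2024-03-17

# Hypothetical holiday period to avoid.
print(overlaps(start, end, date(2024, 12, 24), date(2025, 1, 1)))  # prints False
```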
Conclusion
A/B testing is a mighty tool for PMs, but only when used correctly. It’s always possible to run tests that lead to wrong results and poor decision-making. Choosing the audience incorrectly is among the most important causes of such outcomes.
Understanding what exactly is going to be tested, what segment of users will be the most representative, and choosing the right time for testing on the right audience are all crucial steps for running effective A/B tests.