
1. From hypotheses to tests

Conversion rate optimization often revolves around the concept of a "test". However, before testing anything there needs to be a hypothesis to evaluate. You need to have a business assumption in mind that the test is then designed to either reinforce or invalidate.

Let’s address the elephant in the room, first. The term CRO is terribly misleading. When you run an experimentation process, you are not trying to optimize the conversion rate – you are trying to positively impact growth by persuading your app and site visitors to perform actions that are beneficial to your business.

You are optimizing your customer flows, your site’s conversion funnels, or even individual conversions. You are not trying to game a metric. Yes, “conversion rate” is a metric. In CRO, you are trying to improve the underlying business process that this metric measures rather than the metric itself.

That being said, CRO is so widely used to describe the formal approach to experimentation in the digital marketing industry that we can allow this misconception to persist.

However, conversion rate is only the very last thing you’ll evaluate in an experimentation process. It’s what you end up with when an experiment has run its course.

But to get to that evaluation stage, you first need to go through the flow of generating a variation based on a hypothesis, designing the test to either support or refute that hypothesis, and implementing the test so that it produces fair and reproducible results.

At its best, CRO follows the scientific method closely.

A structured, hypothesis-driven approach tends to align your experimentation goals with the overall goals of the business. It turns into a feedback loop of constant data evaluation, research, experimentation, and implementation.

As a technical marketer, you’ll find that many of the CRO tools and software suites handle the most mechanical parts of the process for you. You don’t necessarily need to have a degree in statistics, and you don’t always need to build the technical components of an online experiment yourself when using these tools.

But one thing the tools cannot do for you is generate hypotheses about your business, your market segment, and your products and services in a way that helps you find new opportunities for growth.

Hypothesis design

Think back to the last time you made a significant design change to your website, your app, or a marketing campaign.

Maybe you updated the hero image on your product’s landing page, or perhaps you decided to go with a more persuasive headline in your holiday social media campaign.

Why did you make that change? What was the reason for it?

A hypothesis is an assumption that forms the foundation for the experiment. In the hypothesis, you declare what you are changing, what the expected outcome of the change is, and what the reason for this outcome is. The experiment then strives to prove or disprove this hypothesis.

Don’t miss this fact!

It’s important to understand that all outcomes of a (properly run) experiment are valuable. You might think that if the experiment disproves your variation, it was a bad experiment. However, by disproving the variation, you learn more about your business and your visitors, and you can use this knowledge to refine your assumptions in the future.

Before creating designs or deploying experiments, you should start by formalizing the hypothesis. Here is a popular template for hypothesis design:

If <what is changed>, then <desired outcome>, because <rationale for the outcome>, and this will be good for everyone because <rationale for the business decision>

Example

You have been analyzing your landing page data, only to find that visitors are not interacting with the call-to-action (CTA) on that page. You are fairly sure this is because the CTA sits so far down the page that most visitors simply never reach it, whether out of impatience, fatigue, or some other reason.

In this case, you formulate a hypothesis:

If we update the CTA to be easier to find,
then we’ll see an uptick in interactions with the CTA without hurting sales,
because most users don’t scroll deep enough to find the CTA,
and this will be good for everyone because this could lead to more sales while offering a better user experience to our users.

Note that this is a single hypothesis about a single outcome. It’s possible the landing page has far more opportunities for improving conversions, but it’s quite fine to take an atomic approach, too. However, you should always consider the bigger picture when generating hypotheses. You don’t want your change to the CTA to conflict with other messaging on the landing page.

Remember that a strong hypothesis does not guarantee a “winning” experiment. You might think that all the time you spent in research and hypothesis design is wasted if the hypothesis is rejected. But this is not true! You’ve learned something valuable about how your audience behaves, and you can use this to refine your approach for future experiments.

Deep Dive

Variable, result, and rationale

The simple hypothesis template above comprises three parts:

  1. Variable (what is being changed). What is the element (or elements) that needs to be changed for the desired outcome to manifest? Coming up with variables is part of the research process. Start by looking at your analytics data to find poorly performing pages or forms, for example, and then try to figure out (also with qualitative data!) what a potential reason for the poor performance could be.
  2. Result (what is the predicted outcome of this hypothesis). You need to have a result in mind when making assumptions about the selected variable(s). What is the key metric that changes? What will you measure against? With landing pages and forms, for example, the expected result is typically that conversions against a call-to-action improve. But it could also be a noisier signal, such as an uptick in dwell time for visitors to the page.
  3. Rationale (what is the reason for this experiment in the first place). This is where you show that the hypothesis isn’t just something you cooked up on a whim. You need to show that the experiment’s origins are rooted in research and an understanding of your visitors and their behavior.

There are more comprehensive hypothesis templates available, too. For example, the template from Craig Sullivan starts with the rationale and expands on the other components as well.

Tip: Whichever hypothesis template you end up using, document your hypotheses together with the background research. When the experiments are concluded, include their underlying hypotheses prominently when presenting the results. Create a wiki or other database with a comprehensive history of these experiments – it can be an extremely valuable research tool for your entire organization.
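
For example, here is a minimal sketch of what one entry in such an experiment log could look like, assuming a Python-based workflow. Every field name here is illustrative rather than taken from any particular tool:

from dataclasses import dataclass, field

@dataclass
class ExperimentRecord:
    name: str
    change: str            # "If <what is changed>"
    expected_outcome: str  # "then <desired outcome>"
    rationale: str         # "because <rationale for the outcome>"
    business_case: str     # "and this will be good for everyone because ..."
    research_links: list = field(default_factory=list)
    result: str = "pending"

cta_record = ExperimentRecord(
    name="Landing page CTA position",
    change="Move the CTA higher up the landing page",
    expected_outcome="More CTA interactions without hurting sales",
    rationale="Most users never scroll deep enough to see the CTA",
    business_case="More sales and a better user experience",
    research_links=["<link to the scroll-depth analysis>"],
)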

Ready for a quick break?

One thing you should not experiment with is whether to take a break or not. Breaks are always beneficial to you, and this statement is as statistically significant as it gets.

Testing the hypothesis

You now have a hypothesis ready. You’ve made assumptions about your visitors’ behavior, about your website, about your business goals, and you are now ready to test whether the variation you created with the hypothesis has a significant effect or not.

While the CRO software suites will typically handle the nuts and bolts of your experiments for you (check the next Topic for more details), it’s still important to understand what goes into experiment design itself.

There are many approaches to designing the experiment. Here are some of the fundamental decisions you’ll need to make:

  1. What are clear and measurable objectives for the test?
  2. What type of test do you need? An A/B test? A multivariate test? A redirect test?
  3. How will you ensure test validity? Do you have a large enough sample size? Are you assigning users to different variations in an appropriately randomized manner? Are you able to eliminate biases from test design?

Much of experiment design boils down to experience and research. The better you understand your business (and its goals), your audiences, and the limitations that might impact what types of experiments to run, the more robust designs you’ll be able to conjure.

An A/B test means that you’re testing one or more variants (or variations) against the original (also known as the control). Typically, you’ll distribute traffic to the page equally between these variations, so that users only see the variation intended for them for the entire duration of the experiment.
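
As a rough illustration of the mechanics, here is a minimal sketch of how a testing tool might implement that kind of sticky, evenly distributed assignment: hash a stable visitor ID together with the experiment name, so the same visitor always lands in the same bucket. The function and experiment names are hypothetical:

import hashlib

def assign_variation(visitor_id: str, experiment: str,
                     variations=("control", "variant_a")) -> str:
    """Deterministically bucket a visitor into one of the variations."""
    digest = hashlib.sha256(f"{experiment}:{visitor_id}".encode()).hexdigest()
    return variations[int(digest, 16) % len(variations)]

# The same visitor always gets the same variation for the same experiment.
print(assign_variation("visitor-123", "cta-position-test"))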

But before you can do this, you’ll need to figure out what the duration of the experiment needs to be. You’ll also need to determine which conversion rate you are measuring against and how many visitors your experiment should be seen by to achieve a meaningful result.

The sample size (number of visitors per variation) and experiment length (how long the experiment needs to run to reach a statistically valid result) can be determined with dedicated calculators (see this excellent calculator from abtestguide.com, for example).

The testing tools you use are designed to help determine the parameters for any given test, but you need to provide input about the conversion rate you’re trying to impact and the minimum impact you’d be satisfied with.

One thing you’ll learn when using these calculators is that the more traffic your site has, the easier it will be to run experiments. In fact, if your site doesn’t attract enough traffic, you won’t be able to run A/B tests at all, because your sample sizes will be too small to produce reliable results in a reasonable amount of time.
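
If you’re curious about what these calculators do under the hood, here is a minimal sketch of the standard two-proportion z-test approximation for the required sample size per variation. The function name and defaults are illustrative assumptions, not the formula of any specific tool:

from math import ceil, sqrt
from scipy.stats import norm

def sample_size_per_variation(baseline_cr, relative_mde,
                              alpha=0.05, power=0.80, one_sided=True):
    """Visitors needed per variation to detect the given relative lift."""
    p1 = baseline_cr                       # control conversion rate
    p2 = baseline_cr * (1 + relative_mde)  # variation rate at the minimum detectable effect
    p_bar = (p1 + p2) / 2
    z_alpha = norm.ppf(1 - alpha) if one_sided else norm.ppf(1 - alpha / 2)
    z_beta = norm.ppf(power)
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)

# For example: a 1.7% baseline conversion rate and a +20% minimum detectable effect.
print(sample_size_per_variation(0.017, 0.20))  # roughly 20,000 visitors per variation

With numbers like these, detecting even a modest improvement requires roughly twenty thousand visitors per variation, which is exactly why low-traffic sites struggle to run meaningful A/B tests.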

CRO is one of those disciplines where you can delegate much of the nuts and bolts to the testing software you use. While you could do everything with a calculator and a spreadsheet, modern testing tools are designed to guide you from hypothesis generation to experiment design, all the way to producing results that can be fed back into the hypothesis machine.

As a technical marketer, it’s useful to understand the underlying concepts so that you can communicate results more effectively within your organization. You’ll also be able to help fine-tune the testing tools so that they work properly on your site and produce usable results.

We’ll talk more about CRO tools in the next Topic.

Deep Dive

Calculators for tests

There are many test calculators available. Most testing tools come with their own calculators, but there are standalone online resources, too, such as the calculator from abtestguide.com.

Here’s an explanation of the different inputs it asks for:

  1. Conversion rate Control is the baseline rate you currently have on the site. In this case, 1.7% of visitors to the landing page subscribe to the newsletter.
  2. Expected improvement over control (also known as the minimum detectable effect) is the minimum improvement to the conversion rate that you want the experiment to detect. In this case, 1.7% * 1.2 = 2.04%. In other words, the variation’s conversion rate needs to reach at least 2.04% for the test to detect the effect at the required levels of power and confidence (see below).
  3. Unique visitors on your test page per week is what it says it is: the expected or average number of unique weekly visitors to the test page. This number can be sourced from an analytics tool. It is combined with the minimum sample size to determine the test duration.
  4. Max number of weeks for AB-test is how many weeks you intend to run the test. This might result in a calculation that doesn’t meet your expectations – for example, with the numbers in the form, the minimum test duration is 4 weeks. If you enter 3 as the maximum number of weeks, then the minimum expected improvement over control will be 2.09% (+23%) and not 2.04% (+20%).
  5. One-sided vs. two-sided hypothesis configures whether you’re only interested in detecting an improvement (one-sided), or whether you also want to detect a decline in the conversion rate (two-sided). A two-sided hypothesis requires a larger sample size. Most A/B tests are one-sided, as you’re only interested in detecting an improvement.
  6. Power is the probability that the test detects a true effect when one exists. The higher the power, the more sensitive the test is to a meaningful difference. This is usually kept at 80%. If you increase the power of the test, the minimum sample size will also grow.
  7. Required confidence level is your tolerance for false positives. For example, a confidence level of 95% means there’s a 5% probability that the test detects an effect (a conversion improvement) when there actually is none – a false positive result. A confidence level of 95% is common; reducing the probability of false positives (i.e. increasing the confidence level) requires a larger sample size.

To use a calculator like this, you’ll need to feed in some numbers that you can easily get from your analytics tool of choice. Sometimes it can be difficult to decide what minimum detectable effect you’re satisfied with, but this is largely dictated by how much traffic your site gets and how long you’re willing to run your experiments.
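
As a rough, worked illustration of how inputs like these turn into a test duration, here is a short sketch using the same approximation as above. The weekly visitor count is a made-up placeholder, not the figure from the calculator’s form:

from math import ceil, sqrt
from scipy.stats import norm

baseline_cr = 0.017        # control conversion rate (1.7%)
relative_mde = 0.20        # minimum detectable effect (+20%, i.e. 2.04%)
alpha, power = 0.05, 0.80  # 95% confidence (one-sided), 80% power
weekly_visitors = 10_000   # hypothetical unique visitors per week on the test page
variations = 2             # control plus one variant

p1 = baseline_cr
p2 = baseline_cr * (1 + relative_mde)
p_bar = (p1 + p2) / 2
z_alpha = norm.ppf(1 - alpha)  # one-sided test
z_beta = norm.ppf(power)
n_per_variation = ceil(
    (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
     + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    / (p2 - p1) ** 2
)

weeks_needed = ceil(n_per_variation * variations / weekly_visitors)
print(n_per_variation, weeks_needed)  # visitors per variation, weeks to run

Dividing the total required sample by the weekly traffic (and rounding up) gives the minimum number of weeks the test needs to run, which is the duration figure a calculator like this reports.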

Key takeaway #1: A good hypothesis establishes the parameters of the test

In order to experiment, you need to have a business question, assumption, and goal in mind. Testing just for the sake of running tests is silly: it’s expensive and time-consuming. By starting with a solid hypothesis, you establish the baseline criteria for running the test and for evaluating its results. The hypothesis typically declares what you are changing, what the expected outcome of the change is, and why you expect that outcome. The experiment then strives to prove or disprove the hypothesis.

Key takeaway #2: From A/B tests to multivariate tests

The most popular test type is the A/B test. In an A/B test, one or more variations of the original version (the control) are tested against the control, and against each other, to see which one performs best against the hypothesis and the metrics that measure the expected outcome. In CRO, the metric goals typically have to do with conversions. Another popular test type is the multivariate test, where multiple variables are tested in different combinations within the same test.

Key takeaway #3: Calculations for setting up the test

To run a test successfully (i.e. so that it produces a valid result), you need to be aware of a number of different metrics and measures that test design incorporates. For example, conversion rate is the metric you probably want to improve. Power is the probability of detecting an effect when there is an effect to be detected. Confidence level reflects how confident you can be that the test result is not a false positive. A good calculator will help you generate these metrics.

Quiz: From Hypotheses To Tests

Ready to test what you've learned? Dive into the quiz below!

1. Which of the following is not part of the hypothesis template?

2. Which of the following statements is false regarding A/B tests?

3. What is the minimum detectable effect?


