
2. Tools of the CRO trade

The implementation of an experiment is often a tricky technical process. From calculations to variation assignment, and from choosing content versions to collecting the data, CRO requires an extraordinary amount of cooperation between different components of the average digital technology stack.

Conversion Rate Optimization is situated precariously between many different parts of a digital organization.

It’s a marketing discipline because most of the time its purpose is to generate business growth through optimizing marketing efforts.

It’s a software development discipline because it adds additional processing requirements on the client-side code.

It’s an analytics discipline because it relies on robust data collection to validate the experiment results.

It’s a user experience design discipline because it frequently experiments with removing bottlenecks from the site or app that hinder the visitor’s path to conversion.

On top of these, CRO relies on certain technologies that can be controversial in their own right.

Example

Visitors are randomly distributed to different groups based on which variant they should see and interact with. Information about this needs to be persisted so that the visitors stay in their groups throughout the experiment. To enable this, browser cookies or other persistent storage mechanisms are frequently used.

Similarly, A/B tests are often run with client-side technologies. This means that it’s possible that the actual variant isn’t applied until the user loads the web page or app. This, in turn, results in the risk of the so-called flicker effect, where the original (control) version of the experiment is briefly shown to the user before the client-side code kicks in and replaces it with the variation.

This delicate balance between running the experiment and preserving the general usability of the site has resulted in a proliferation of different tools and technical solutions for running experiments. In this Topic, we’ll take a look at some of these and discuss their implications from a technical marketer’s perspective.

Variant assignment and persistence

One of the first technical problems you need to solve with an experimentation tool is how to assign visitors to variants and how to persist this information.

Example

You want to run a simple A/B test with just the control and one variant to experiment with. You’ve decided on a 50/50 split of traffic without any further targeting rules.

This sounds simple, right? For every visitor to the page, just use a simple JavaScript method to pick one of two options at random, assign the visitor to a group based on this calculation, and show them the correct variant for the duration of the experiment.

Well, generating a random number is simple enough (although if you want to go down the rabbit-hole you can research how difficult true randomness is in a computational context). But assigning it to a visitor consistently is more difficult.

Firstly, to persist something on the web you need browser storage. This was discussed in an earlier Topic. Typically, browser cookies would be used for this, but other browser storage can also be utilized.

So once you have determined that the visitor should be either in group A or group B, you need to store this information in their browser so that for the duration of the experiment they will always be in group A or group B. Again, sounds simple, right?
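
To make that concrete, here’s a minimal sketch of what such an assignment-and-persistence step might look like in client-side code. The cookie name and the 30-day lifetime are arbitrary assumptions for this example:

```typescript
// Minimal sketch: assign the visitor to group "A" or "B" once, then persist it in a cookie.
// The cookie name and the 30-day lifetime are arbitrary choices for this example.
const COOKIE_NAME = "exp_cta_test";

function readGroup(): string | null {
  const match = document.cookie.match(new RegExp(`(?:^|; )${COOKIE_NAME}=([^;]*)`));
  return match ? match[1] : null;
}

function assignGroup(): string {
  let group = readGroup();
  if (!group) {
    // 50/50 split: Math.random() is pseudo-random, but fine for traffic allocation.
    group = Math.random() < 0.5 ? "A" : "B";
    const maxAge = 60 * 60 * 24 * 30; // keep the assignment for 30 days
    document.cookie = `${COOKIE_NAME}=${group}; path=/; max-age=${maxAge}`;
  }
  return group;
}

const visitorGroup = assignGroup();
```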

But what if the visitor clears their browser storage? What if they browse with incognito or private browsing mode, which deletes all storage after each session? What if they visit again with a different browser or device?

This is a risk of running experiments on the user’s device: there’s no way for you to adequately control visitor behavior. Depending on your market segment, your users might clear their cookies more frequently than visitors to other sites do, which directly increases the risk of a single visitor being exposed to different variants of the same experiment.

Seeing different variants of the same experiment can introduce mistrust and confusion, making the visitor’s journey to conversion even more of an uphill struggle. It can also dilute the measured effect size of the experiment, because this partial inability to recognize returning users increases the likelihood of false negatives in the experiment results.

Example

Let’s say you’ve run an experiment where the control had a 5% conversion rate and the variation had a 6% conversion rate. In this case, the effect size was 6 / 5 = +20%. Users who saw both the control and the variation (because their group assignment wasn’t persisted) would have a conversion rate somewhere between 5% and 6%, for example 5.5%. Let’s assume the percentage of users who were exposed to both versions was 30%.

When taking into account these variables, we can calculate that the conversion rate for the control was 30% * 5.5% + 70% * 5% = 5.15% and for the variation 30% * 5.5% + 70% * 6% = 5.85%. Thus the measured effect size was 5.85 / 5.15 = +13.6% instead of the real effect size of +20%.
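
If you want to play with these numbers, here’s a small sketch that computes the diluted effect size for any combination of true conversion rates and cross-exposure share. The figures in the comments match the example above, assuming cross-exposed users convert at the midpoint of the two rates:

```typescript
// Sketch: how cross-exposure dilutes the measured effect size.
// Conversion rates are percentages; contamination is a fraction (0.3 = 30% saw both versions).
function dilutedEffect(control: number, variation: number, contamination: number) {
  const mixed = (control + variation) / 2; // assume cross-exposed users convert at the midpoint, e.g. 5.5%
  const measuredControl = contamination * mixed + (1 - contamination) * control;     // 5.15%
  const measuredVariation = contamination * mixed + (1 - contamination) * variation; // 5.85%
  return {
    trueEffect: variation / control - 1,                     // +20%
    measuredEffect: measuredVariation / measuredControl - 1, // ≈ +13.6%
  };
}

console.log(dilutedEffect(5, 6, 0.3));
```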

Factoring in the uncertainty of group assignment is part of running client-side experiments. Calibrating your measurement to account for the size of that uncertainty, for example by segmenting the data by browser, goes a long way toward figuring out how many false negatives were recorded.

Ultimately, it might be useful to move away from the fragility of browser-based persistence and instead run the most important experiments against audiences with more deterministic group assignment, such as logged-in users.

Deep Dive

Improve the persistence of group distribution

The fickle nature of browser storage might tempt your organization to look at other solutions for persisting the visitor’s group assignment. After all, the better you can ensure each visitor consistently interacts with the same experiment variant, the more you can trust that the results are valid.

There are some solutions to this.

For example, if you’re OK with only showing experiments to logged-in visitors, you can store their experiment metadata directly in the user data table of your database. Naturally, this would exclude from the experiments all visitors who have not signed up for your service.

Another option is fingerprinting, which is often floated as a solution to the fragility of browser storage. The group information could be keyed to a browser fingerprint, which would improve the chances of the visitor always being assigned to the same group as long as they always use the same browser. Since it still breaks down across browsers and devices, it doesn’t really solve the bigger picture, and it introduces severe privacy risks in return.

Don’t miss this fact!

While persisting group information with 100% reliability is practically impossible, it’s good to understand the inherently unreliable nature of using browser storage for persistence. When you have a large enough sample size to run experiments but have a hard time reaching significant results, it could be wise to investigate whether your A/B testing solution has a problem recognizing returning users in your experiment.

Running the experiment client-side

Designing the test typically happens with a tool or service that lets you dynamically select the elements on a web page that you want to experiment with.

Note that many drag-and-drop solutions generate poorly designed JavaScript for the code to be executed. It’s a good idea to loop in developers when working on test design, so that you can customize the code to be more performant and reliable.

Example

If you want to test two different CTA elements, the tool would let you select the element on the page, apply the changes to it, and then the tool can make use of this information when its scripts are loaded in the visitor’s browser.

When the experimentation tool is then loaded in the visitor’s browser and when the visitor’s group has been determined (either through random assignment or by loading the information from browser storage), the changes stored in the experiment are applied to the page.

In the case of a classic A/B test, for example, individual elements on the page might change to reflect the variant the user should be exposed to. Following the example above, instead of seeing the text “Buy now!” (the control), they’ll see “Subscribe now!”.
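
In heavily simplified terms, and ignoring for a moment the flicker concerns discussed below, the variant application could look something like this sketch; the selector and the copy are hypothetical placeholders:

```typescript
// Sketch: apply the CTA variant once the group assignment is known.
// The selector and the copy are hypothetical placeholders.
function applyVariant(group: string): void {
  const cta = document.querySelector<HTMLElement>("#main-cta");
  if (!cta) return; // the element isn't on this page, so there's nothing to do

  if (group === "B") {
    cta.textContent = "Subscribe now!"; // the variation
  }
  // Group "A" keeps the original "Buy now!" (the control).
}

// In practice the group comes from the assignment step sketched earlier.
applyVariant("B");
```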

With multivariate tests, there can be many different elements that change on the page.

There are also redirect tests where instead of changing individual elements, client-side code automatically redirects the user to a different URL if their group assignment requires it. This can be a useful remedy for the flicker problem, as long as the redirection is done server-side. If the redirection is done client-side with JavaScript, it will be impacted by the flicker effect too.
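
For reference, the client-side version of a redirect test is usually not much more than the following sketch (the paths are placeholders). Because the original page has already started rendering by the time this runs, the flicker caveat applies:

```typescript
// Sketch: client-side redirect test. The original page has already started rendering
// by the time this runs, so a flash of the original content is still possible.
function redirectToVariant(group: string): void {
  if (group === "B" && window.location.pathname === "/landing") {
    // replace() keeps the original URL out of the browser history.
    window.location.replace("/landing-variant-b");
  }
}

redirectToVariant("B"); // in practice the group comes from the assignment step
```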

As the visitor sees the element being treated by the experiment, they might change their behavior on the site as a result.

In experiments, data is constantly collected about the visitor to see whether they are converting against the objectives set for the experiment.

For example, if the website is testing different checkout steps, then it’s important to collect data about how the visitor moves through the ecommerce funnel.

Deep Dive

The flicker effect

Flicker, or “flash of original content” (FOOC), refers to the phenomenon where the experimentation tool applies the change to the page element (based on the user’s assignment to a variant) with a small delay.

There’s thus a very short period of time when the visitor sees the original content before the variant is applied.

This can be confusing to the user, especially if the site is testing different sales copy versions and the visitor sees a different pitch before the variant is applied.

There are solutions to the flicker effect, which include:

  1. Only change elements below the fold – that way the visitor needs to scroll down to see the element, which gives the experiment script more time to update the element. Note! This lowers the Power of the experiment, because you’ll only be able to impact the behavior of users who scrolled down far enough.
  2. Hide the original element initially – regardless of the visitor’s group assignment, the element is hidden until the script is ready to show either the control or the variant (see the sketch after this list). This naturally leads to the page having an empty spot where the element should be, but sometimes this is better than a flicker of the wrong element.
  3. Hide the entire page initially – some tools take the drastic stance of hiding the entire page until the element is ready to be rendered. The good thing about this is that it leads to less confusion because it just shows up as a slightly slower page load. The bad thing is that you are delaying the page load in favor of an A/B test, which might be difficult to justify in your organization.
  4. Serve the updated content from the server – this is related to server-side experiments (see below). Instead of dynamically updating the element in client-side code, the web server serves the HTML with the correct element in place. This requires much more work than purely client-side solutions.
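
As an illustration of option 2 above, here’s a minimal sketch of an anti-flicker snippet that hides the element until the experiment script has made its decision. The selector and the two-second safety timeout are assumptions for this example:

```typescript
// Sketch of option 2: hide the element until the experiment script has made its decision.
// The selector and the two-second safety timeout are arbitrary choices.
const style = document.createElement("style");
style.id = "anti-flicker";
style.textContent = "#main-cta { visibility: hidden; }";
document.head.appendChild(style);

function revealElement(): void {
  document.getElementById("anti-flicker")?.remove();
}

// Call revealElement() as soon as the experiment script has applied (or skipped) the change.
// The safety timeout guarantees the element never stays hidden if the script fails to load.
setTimeout(revealElement, 2000);
```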

The flicker effect might not always be such a huge problem. Visitors are used to seeing elements loading lazily on the web, and they might not even notice the flicker most of the time.

Ready for a quick break?

Your eyelids are already flickering – it’s time to take a break.

Data collection

When collecting data about an experiment, it’s of course important to always include the visitor’s group assignment in the collected data.

Note that most A/B testing tools add the user to an experiment group as soon as they load a page that’s part of the experiment. This means that when you analyze results, the experiment might include data from users who might never have seen or even interacted with the element that was changed. It’s a good idea to “tag” users who were exposed to the changed element for an appropriate amount of time, to make it easier to focus your analysis on this cohort.
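
As a sketch, exposure could be recorded for later segmentation along these lines; the dataLayer push follows a common tag-management pattern, but the event and field names are assumptions:

```typescript
// Sketch: record the exposure so that analytics data can be segmented by experiment group.
// The event name and field names are hypothetical; adapt them to your analytics setup.
const dataLayer: Record<string, unknown>[] =
  (window as any).dataLayer ?? ((window as any).dataLayer = []);

function trackExposure(experimentId: string, group: string): void {
  dataLayer.push({
    event: "experiment_exposure",
    experiment_id: experimentId,
    experiment_group: group,
  });
}

// Ideally fire this only once the changed element has actually been in view for a while,
// so the analysis can focus on visitors who were really exposed to the change.
trackExposure("checkout_cta_test", "B");
```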

Once the analytics tool collects experiment data from visitors, subsequent analysis can include information such as:

  1. How many visitors have been added to each group. This is useful if you want to verify that traffic allocation to different variants works as intended.
  2. How many visitors are converting against the objectives set for the experiment.
  3. What else these visitors are doing on the site.

The first two points are instrumental in validating an experiment. Once the experiment is over, this data is used to calculate the statistical significance of the experiment result and whether the variant(s) or the control “won”.

If a group “wins” the experiment, it means that you have measured an uplift in the conversion metric (compared to the other variant) that is most likely not the result of random chance.

The winning variant as shown by https://abtestguide.com/calc/
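
For the curious, calculators like the one linked above typically boil down to a two-proportion z-test, roughly sketched below; real tools add refinements such as one- vs. two-sided tests and various corrections:

```typescript
// Sketch: two-proportion z-test, the kind of calculation A/B test calculators run.
function zScore(convA: number, visitorsA: number, convB: number, visitorsB: number): number {
  const pA = convA / visitorsA;
  const pB = convB / visitorsB;
  const pooled = (convA + convB) / (visitorsA + visitorsB);
  const standardError = Math.sqrt(pooled * (1 - pooled) * (1 / visitorsA + 1 / visitorsB));
  return (pB - pA) / standardError; // |z| > 1.96 ≈ significant at the 95% level (two-sided)
}

// Example: 500/10,000 conversions for the control vs. 600/10,000 for the variant.
console.log(zScore(500, 10000, 600, 10000)); // ≈ 3.1 → statistically significant
```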

While it’s important to stick to the conversion goals you configured when starting the experiment, and while it’s equally important to let the experiment run its full duration, analyzing secondary effects of the experiment can be very fruitful for generating inputs for new tests (with more evidence, of course).

Analytics tools allow you to segment the data based on the visitor’s assignment to different experiments and groups. Sometimes you might see patterns that you didn’t originally envision as the outcome of the experiment.

Example

Your checkout steps test didn’t result in a statistically significant “winner” with regard to the conversion goal you picked. However, you can still use your analytics data to see that perhaps visitors in a certain group were more likely to struggle with checkout errors than users in another group. This could be valuable input for a new test on how checkout errors are exposed, for example.

Don’t miss this fact!

Remember that even though your experimentation tool relies on a shortlist of conversion goals to validate the experiment, you have the full capacity of your analytics tool to dig deeper into user behavior while the experiment was running. Look for patterns in user behavior when segmenting by experiment group. Use this data to generate additional hypotheses for future experiments!

Server-side experiments

Running experimentation logic in the web server rather than in the visitor’s browser is an interesting prospect. In fact, it seems to solve many of the problems with client-side experimentation.

With server-side experiments, the visitor’s group assignment and the content delivery are handled by the web server itself. In other words, when you navigate to a website that’s running an A/B test, you have already been assigned to a variant by the time the page appears in your browser, and the page loaded from the web server has all the variant elements in place.
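
As a minimal sketch of this approach, assuming a Node-style server and a stable user identifier (for example, a logged-in user ID), group assignment can be made deterministic by hashing that identifier; anonymous visitors would still need a cookie-based ID as a fallback:

```typescript
// Sketch: deterministic server-side assignment based on a stable user identifier.
// Hashing the ID means the same user always lands in the same group without browser storage;
// anonymous visitors would still need a cookie-based ID as a fallback.
import { createHash } from "crypto";

function assignServerSide(userId: string, experimentId: string): "control" | "variation" {
  const hash = createHash("sha256").update(`${experimentId}:${userId}`).digest();
  // The first byte of the hash gives a stable, evenly distributed 50/50 split.
  return hash[0] < 128 ? "control" : "variation";
}

// The server (or its CMS integration) then renders the matching HTML directly:
const group = assignServerSide("user-42", "checkout_cta_test");
const ctaHtml = group === "variation" ? "<button>Subscribe now!</button>" : "<button>Buy now!</button>";
```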

Another option is feature flagging, where the content served from the web server includes both the control and the variation, and a toggle evaluated client-side determines which one is shown to the user.

This can be a huge asset, because it means that the visitor’s browser doesn’t have to load heavy and disruptive JavaScript libraries just for determining the visitor’s group and updating the elements on the page.

It doesn’t solve the problem of persisting the visitor’s group assignment, but it does counter other issues with client-side testing such as the flicker effect.

However, the beauty of client-side experimentation is that it’s all handled by a JavaScript library. All you need is for the visitor’s browser to load that library, and it takes care of the rest.

With server-side experiments, you need the web server to run all this additional logic. There are so many different web server software stacks that it’s not as simple as the plug-and-play JavaScript approach.

You also need the server-side process to communicate with the site’s content management system (CMS), so that it knows to deliver the correct variation.

When server-side experimentation is discussed, the opportunity to build a custom solution can be tempting. After all, running simple A/B tests doesn’t really require that much specialized technology.

Group assignment can be done with a randomization algorithm and cookies, variation selection can be done with the CMS and some metadata, and data collection can be done with whatever analytics tool the organization is already using.

Calculating things like experiment winners and statistical significance can be done using online calculators that are readily available.

Nevertheless, server-side experimentation often requires more maturity from the organization that wants to try it. Modifying server-side processes to benefit experimentation might be an even more difficult pill for developers to swallow than the overhead of running experiments on visitors’ devices.

As a technical marketer, you are uniquely positioned to consult your organization about opportunities like these. It’s important to understand the limitations of running experiments for visitors, both client-side and server-side, before deciding which approach to follow.

You can experiment with experimentation, too! Try different approaches, tools, and services before figuring out which works best in your unique organization.

Key takeaway #1: User’s device stores information about their test group

If a user is included in an experiment, it’s important that they remain in their test group for the duration of the test. If they saw a different version of the experiment content on repeat visits, the data collected from them would become noisy and dilute the measured effect. A commonly used technology for retaining the user in the assigned group is browser storage – the user’s test participation is stored in a cookie or other browser storage that persists for the duration of the test.

Key takeaway #2: Flicker can ruin a test

When running tests in the user’s browser or device, the scripts are often loaded asynchronously. This means that the page has time to render before the user’s group assignment is determined. If the user is part of a variation group, they might briefly see the original content before it’s changed to the variant. This is called the “flicker” effect, and it can ruin the test by producing conflicting visual cues.

Key takeaway #3: Don’t collect data just for the experiment

By collecting information about experiments running on the page and the user’s group assignments into the analytics tool, you can segment your visitors based on test participation. While most of the time you’re probably focused on the conversion rate, some tests might impact the user’s behavior on the site in other, unexpected ways. These insights can feed into new hypotheses for additional tests down the line.

Quiz: Tools Of The CRO Trade

Ready to test what you've learned? Dive into the quiz below!

1. What is the flicker effect?

2. Why is it important to collect experiment information into your main analytics tool?

3. What are the benefits of server-side experiments? Select all that apply.
