Statistical and Financial Considerations in Website Optimization: Part 1
By Marcia Gulesian
Most companies want to deploy Web pages that fulfill a business purpose. Their goal might be selling a product (at a profit!), distributing information (such as newly legislated prerequisites for obtaining a driver’s license), or gathering information (such as the demographics of potential customers) through an online form that doesn’t turn people away because it’s too complicated or intrusive.
The question is: what do you put on your page(s) to achieve your goal(s)? You could decide on your own or together with a team of in-house experts and hope for the best. Or you could throw the process of Website design open to the world of users beyond your firewall. In doing so, you would be democratizing the process by letting the public tell you, through their actual behavior, which of a number of possible designs gets them to use your Web page(s) in the way that you had hoped they would. As pointed out in a recent New York Times article, the latter approach has now become mainstream.
We will focus primarily on one of the several competing products that you could use to democratize this process, Google’s online Website Optimizer (GWO), chosen for, among other reasons, the fact that Google doesn’t charge for its use. Moreover, in a future article, I’ll discuss the collateral use of some of Google’s other online services:
- Webmaster Tools
Also, Urchin, Google’s counterpart to Analytics, which, unlike Analytics, you can install on a local server located behind your firewall.
First, I’ll elaborate on Optimizer’s use of some elementary statistics in the preparation of its reports. I’ll then apply additional statistical methods to the raw data collected by GWO, after exporting this data to Microsoft Excel (to which I have added the statistical tool StatTools from Palisade, Inc.).
GWO allows you to try different layouts, alternate content, new buttons, and even new colors through multivariate testing or split A/B testing without making permanent changes to your site; these tests are described in some detail below. Better yet, your visitors themselves ultimately determine which combination of site elements causes them to perform the actions you desire, such as filling out a form, adding an item to the shopping cart, or taking the big plunge and entering their credit card details.
One problem with this approach is that it modifies the original website, and some of the experimental variations may cause some visitors to abandon your site. This may not be acceptable for businesses with low-traffic, high-stakes websites, among others.
Choose the pages and content to test
Using GWO’s Web-based interface, you provide Google with the content — headlines, images, or text, for example — and design alternatives that you’d like to test.
Test these changes with your visitors
GWO will then show these content and design alternatives to your site visitors, all the while monitoring which combinations lead to the highest conversion rates (another term that I’ll define in some detail below). An outline of this process is shown in Figure 1.
Sections: elements you test on a page; e.g., headers, images, buttons, forms, and text.
Variations: different ways you can design and word these elements.
Combinations: different ways variations across elements can be matched up.
Conversion page: the page that, when reached by a user, means business results for you. It’s the “Goal” in Figure 1. Depending on your type of site, it may be the page where a user will complete a purchase or fill out an interest form.
Conversions aren’t always associated with user clicks, page visits, or other actions in the browser. A user who finishes reading an article, for example, won’t always click a button or register another event in the browser to indicate that she’s done.
In those cases, it may be useful to count a conversion after the page has been loaded in the browser for a certain amount of time. This way, users who stay on the page for 20 seconds (or for any amount of time you set) will trigger a conversion in GWO. We’ll cover this subject in a bit.
Conversion rate: the percentage of your visitors that end up reaching a given goal during the time period in question.
When you start testing more than one element on a page at the same time, you’re multivariate testing, with its combination of sections and variations.
In GWO’s A/B test, you treat the entire page as a section; you are testing pages, not variables. Because there are only two combinations, it is particularly suited for pages with little traffic; you don’t need a whole lot of traffic to be able to identify a difference (if there is one).
How do I select a good conversion page?
In general, you should choose a page where users complete a defined action that produces desirable business results for you.
If your site has a lot of products and you’re trying to test overall conversion improvements, you might want to use a ‘Thank You’ page as your conversion page – this will enable you to capture any successful action users take. However, if you’re trying to test completion of a unique goal, you can narrow your focus with a conversion page that’s unique to a specific product – for example, the purchase page for that product.
Landing pages are key pages to optimize because they are your visitors’ first, and all too frequently last, impression of your website. If a visitor lands on a page that doesn’t provide the information she’s looking for, she’ll probably leave without clicking any further. For high-traffic landing pages, this can add up to a lot of lost visitors.
That’s why it’s so important to find, and fix, high-traffic landing pages that lose a high percentage of visitors. Look at the “Top Landing Pages” report (see Figure 3) within the Content section of Google Analytics (GA). Pages that have both a high Bounce Rate (the percentage of visits that resulted in the visitor immediately leaving the site) and a large number of Entrances need to be redesigned.
Note: There is a close relationship between Google Website Optimizer and Google Analytics – the conversion data used in Website Optimizer reports comes from the Analytics database system.
Don’t forget about funnel pages
Other high value pages are those that lead visitors to your goal page(s) (See Figure 1). Visitors reach a goal page once they have made a purchase or completed another desired action, such as a registration or download. In GA, you can specify up to ten pages in a defined funnel representing the path that you expect visitors to take on their way to the goal page (conversion). A page that is part of a goal funnel is another great place to focus website optimization efforts.
The Funnel Visualization report within the Goals section of GA shows you how many visitors exit the funnel at each step in the path towards the goal page. In the funnel visualization below, you can see that most visitors in this funnel are lost in the transition from the “View Shopping Cart” step to the “Login” step. Only 7% of visitors move past this step, but of those who do, many go on to complete an order! Before you set up your Website Optimizer experiment, you can examine the Funnel Visualization report to see whether your funnel could be improved simply by limiting the steps in the path to a goal, like the “View Shopping Cart” step below.
Keep in mind that sometimes the further the goal is from your tested page, the more traffic you will need.
Two kinds of Experiments
Multivariate tests, the primary focus of this article, allow you to test multiple variables — in this case, sections of a page — simultaneously. For example, you could identify the headline, image, and promo text as parts of your page you’d like to improve, and try out three different versions of each one. Website Optimizer would then show users different combinations of those versions (let’s say, Headline #2, Image #3, and Promo Text #1) to see what users respond to best. Multivariate tests are more complicated and typically require higher page traffic than A/B tests.
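To see why multivariate tests need more traffic, it can help to count the combinations. The following is a minimal sketch, assuming three sections with three variations each as in the example above; the dictionary keys and labels are illustrative names, not anything GWO defines.

```python
from itertools import product

# Hypothetical section names and variation labels, for illustration only.
sections = {
    "headline": ["Headline #1", "Headline #2", "Headline #3"],
    "image": ["Image #1", "Image #2", "Image #3"],
    "promo_text": ["Promo Text #1", "Promo Text #2", "Promo Text #3"],
}

# Every combination picks exactly one variation from each section.
combinations = list(product(*sections.values()))

print(len(combinations))  # 3 * 3 * 3 = 27
print(combinations[0])
```

Each of those 27 combinations receives only a share of your visitors, which is why the traffic requirement grows so quickly as you add sections or variations.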
Figure 6. A/B Experiment
An A/B experiment, on the other hand, allows you to test the performance of two (or more!) entirely different versions of a page. You can change the content of a page, alter the look and feel, or move around the layout of your alternate pages; there’s plenty of design freedom with A/B testing. It’s the simpler type of test, and works best with pages that don’t get a lot of traffic.
Two kinds of reports
In multivariate testing, there are two kinds of reports: a combination report and a page section report. Each of the columns in these reports provides a different insight into the performance of combinations, page sections and variations.
The Combination Report
A combination report will show the performance results for all of the page combinations made from the page section variations you created for your experiment. By seeing how well a particular combination performs in comparison with the original and the other combinations, you can choose the most successful one to improve your business.
The chance to beat original column shows the likelihood, expressed as a probability, that a particular combination will be more successful than your original content. It is very possible that there can be more than one combination which has a good chance to beat the original. When this number goes above 95% or below 5%, the corresponding bar will be all green or all red, respectively.
Observed improvement displays the percent improvement over the original combination. Because this percentage is derived from the ratio of the conversion rate of a combination to the conversion rate of the original, it will often vary widely. You should only concentrate on the observed improvement when a large amount of data has been collected and it can be considered more reliable.
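As a rough illustration of that ratio, the sketch below computes a relative lift from raw counts. The function name is hypothetical, and the formula (combination rate divided by original rate, minus one) is an assumption about how such a percentage is typically derived, not a confirmed description of GWO's internals. The counts are the Original and Combination 11 figures quoted later in this article.

```python
def observed_improvement(combo_conversions, combo_visits,
                         orig_conversions, orig_visits):
    """Relative lift of a combination's conversion rate over the original's."""
    combo_rate = combo_conversions / combo_visits
    orig_rate = orig_conversions / orig_visits
    return combo_rate / orig_rate - 1.0

# Counts quoted in this article: Original 125/401, Combination 11 160/411.
lift = observed_improvement(160, 411, 125, 401)
print(f"{lift:.1%}")
```

Because both rates are themselves estimates, a handful of extra conversions on either side can move this ratio a great deal when the counts are small.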
What’s a high-confidence winner?
A high-confidence winner is a tested combination that you can feel confident will bring you more conversions than your original website configuration. Here’s an example: Suppose you tested three combinations: X, Y, and Z. X showed a 23% chance of beating the original, Y showed a 44% chance of beating the original, and Z showed a 99% chance of beating the original. In that case, Z (with its very high probability of improvement) would be marked as a high-confidence winner.
GWO will alert you in your report whenever a high-confidence winner is found in one of your experiments. You’ll see the phrase ‘High-confidence winners found,’ followed by details on those high-confidence combinations. There may be zero, one, or more than one high-confidence winner in any report, depending on the results of your experiments.
Use these high-confidence winners on your pages to boost your conversion rate. Then keep experimenting! A high-confidence winner is a very good start, but you may be able to improve your conversions even more.
You may not see a high-confidence winner with every experiment. In some cases no significant difference will be found between the combinations tested and your original. In that case, no high-confidence winner will be declared.
Sometimes you simply need more data to be able to reach a level of high confidence. A tested combination typically needs around 200 conversions for you to judge its performance with certainty.
If the results are inconclusive, what do I do?
If the results are inconclusive, it may mean that you weren’t able to sample a significant number of users, or that there wasn’t a significant difference between the combinations tested. If there wasn’t a significant difference between combinations, but several combinations performed better than your original content, you or your webmaster may wish to replace the original content on your live site with the HTML content for one of the better-performing combinations. Alternatively, you may prefer to edit and run your experiment again.
Conversions/impressions is just that — the raw data of how many conversions and visits a particular combination generated. It represents the number of visitors who reached the conversion page after viewing the test page where the given combination was presented.
Let’s take a look, in greater detail, at the Original, Combination 11 and Combination 7 rows in the Combinations section shown in Figure 6.
Estimated Conversion Rate
- 125/401 => 31.2% (point estimate for Original)
- 160/411 => 38.9% (point estimate for Combination 11)
- 143/453 => 31.6% (point estimate for Combination 7)
Estimated Conversion Rate Range
Figure 8. Standard error of sample proportion (Eqn. 1) and confidence interval for a proportion (Eqn. 2)
The Website Optimizer uses an 80% confidence level to estimate the conversion rate range; that is, the real conversion rate delivered by a combination will fall within the estimated range 80% of the time. The Estimated Conversion Rate Ranges reported in Figure 7 use Equation 2 with a z-multiple of 1.28 (signifying 80%).
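The range calculation can be sketched in a few lines. This assumes that Equations 1 and 2 are the textbook formulas: the standard error of a sample proportion is sqrt(p(1 - p)/n), and the interval is p plus or minus z times that standard error. The function name is hypothetical, and the counts are the three rows discussed above.

```python
import math

Z_80 = 1.28  # z-multiple for an 80% confidence level, as stated in the article

def conversion_rate_range(conversions, visits, z=Z_80):
    """Return (point estimate, low, high) for a conversion rate."""
    p = conversions / visits
    std_err = math.sqrt(p * (1.0 - p) / visits)  # standard error of a sample proportion
    return p, p - z * std_err, p + z * std_err

for name, conv, visits in [("Original", 125, 401),
                           ("Combination 11", 160, 411),
                           ("Combination 7", 143, 453)]:
    p, low, high = conversion_rate_range(conv, visits)
    print(f"{name}: {p:.1%} (range {low:.1%} to {high:.1%})")
```

Note that the range narrows as the number of visits grows, which is one reason low-traffic pages take longer to produce a clear result.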
Chance to Beat Original
Using the formula for confidence interval for the difference between proportions in Equation 4, the Excel spreadsheet shown in Figure 8 computes the values shown in the Chance to Beat Original column in the report shown in Figure 7.
Figure 9. Standard error of difference between sample proportions (Eqn. 3) and confidence interval for difference between proportions (Eqn. 4)
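One way to approximate a chance-to-beat-original figure from the standard error of the difference is sketched below. It assumes a one-sided normal approximation with an unpooled standard error; the function names are hypothetical, and GWO's own computation may differ in detail.

```python
import math

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def chance_to_beat(conv_a, visits_a, conv_b, visits_b):
    """Approximate probability that B's true rate exceeds A's."""
    p_a = conv_a / visits_a
    p_b = conv_b / visits_b
    # Standard error of the difference between two sample proportions.
    se_diff = math.sqrt(p_a * (1 - p_a) / visits_a + p_b * (1 - p_b) / visits_b)
    return normal_cdf((p_b - p_a) / se_diff)

# Original is 125/401; the two combinations are compared against it.
print(f"Combination 11: {chance_to_beat(125, 401, 160, 411):.1%}")
print(f"Combination 7:  {chance_to_beat(125, 401, 143, 453):.1%}")
```

Under these assumptions Combination 11 comes out close to 99%, while Combination 7 is only slightly better than a coin flip, which matches the intuition from the point estimates.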
Suppose, for example, that the high conversion rate for Combination 11 came from the fact that it displayed a $350 price for a camera and the only slightly higher conversion rate for Combination 7 came from the fact that it displayed a $425 price for the same camera, while the Original page offered the camera for $450. And, suppose that your cost to make and advertise this camera was $300.
Then, for every 100 customers who arrived on your landing page, you could expect to make a profit of about
- $50(.3893)(100) = $1946 for Combination 11
- $125(.3157)(100) = $3946 for Combination 7
- $150(.3117)(100) = $4676 for Original
This is a pretty extreme example, but it does demonstrate the point that achieving your business goal – maximum profit, in this case – isn’t necessarily the same as maximizing your conversion rate. I’ll discuss the financial aspects of Website optimization further in my next article, Statistical and Financial Considerations in Website Optimization: Part 2.
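The profit comparison above can be reproduced with a short script. This is a minimal sketch using the prices, the $300 cost, and the conversion counts from the example; the function and variable names are illustrative.

```python
UNIT_COST = 300  # cost to make and advertise one camera, from the example

# (label, displayed price, conversions, visits) taken from the example above
variants = [
    ("Combination 11", 350, 160, 411),
    ("Combination 7", 425, 143, 453),
    ("Original", 450, 125, 401),
]

def profit_per_100_visitors(price, conversions, visits, unit_cost=UNIT_COST):
    """Expected profit from 100 landing-page visitors."""
    return (price - unit_cost) * (conversions / visits) * 100

for label, price, conv, visits in variants:
    print(f"{label}: ${profit_per_100_visitors(price, conv, visits):,.0f}")

# Rank by profit rather than by conversion rate.
best = max(variants, key=lambda v: profit_per_100_visitors(*v[1:]))
print("Most profitable:", best[0])
```

Ranking by profit instead of conversion rate reverses the order here: the combination with the best conversion rate is the least profitable of the three.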
Interactions between the variables
It’s important to understand that Google Website Optimizer reports on only the main effects within your Web page (the significance of the individual variable values). However, there is frequently significant interaction between the variables. Some interactions are unexpected while others are intentional. As a general rule, you want to create interactions that yield positive, not negative, synergies. The Cambridge University Press book “Nature’s Magic” cited in the References elaborates on this subject in considerable detail.
With a knowledge of which independent variables, if any, interact, you can refine your experiment. To incorporate this interaction and/or other realities into your analysis, you can start by exporting the raw conversion rate data from your GWO report.
Figure 11. Export to an XML, CSV (comma delimited), TSV (tab-separated values) or print file (pdf with Adobe installed)
I use the CSV option to export the Combination Report to an Excel spreadsheet with the statistics application StatTools add-in. You can do everything that follows with the regression data analysis tool that comes with Excel, but, as I’ll explain below (and in my next article), StatTools can make the job a good deal easier.
Regression analysis is a statistical technique that can be used to develop a mathematical equation showing how variables are related. In regression terminology, the variable that is being predicted is called the dependent or response variable. The variable or variables being used to predict the value of the dependent variable are called the independent or predictor variables. An important part of the regression analysis procedure should focus on the selection of the set of independent variables that provides the best forecasting model.
As presented in this article, regression analysis may seem simple, but often it is not. For example, the possibly cyclical nature of the data can make it difficult to decide how long you need to collect data before you can make statistically significant (and helpful) decisions.
In the equation below, Conversion Rate is the dependent variable and the Xi’s are the independent variables.
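For orientation, a generic model of this kind can be written with the coefficients ci and variables Xi used in this article. This is an assumed textbook form, not necessarily the exact equation the author had in mind. The first line is purely additive; the second adds a hypothetical two-way interaction term with coefficient c12.

```latex
\begin{align}
\text{Conversion Rate} &= c_0 + c_1 X_1 + c_2 X_2 + \dots + c_k X_k \\
\text{Conversion Rate} &= c_0 + c_1 X_1 + c_2 X_2 + c_{12} X_1 X_2
\end{align}
```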
Using the report’s estimated conversion rate and prior knowledge of which variables are represented in each combination (including the original), the next step in the analysis entails using StatTools (or Excel alone in very simple cases) to derive regression coefficients for a multivariate statistical model.
StatTools (or Excel alone) then computes values for the coefficients ci, the magnitude of the contribution of each effect, which can be either positive or negative and which appear in the Coefficient column of the regression report shown in Figure 13. Further explanation of this computation and report will appear in my next article, Statistical and Financial Considerations in Website Optimization: Part 2.
Note: This explanation, like the regression report shown in Figure 13, has been simplified for the sake of clarity. In practice, however, the choices you make for each section (e.g., image1, image2 or image3) are usually replaced by dummy variables as described in Chapters 7 and 8 of the S. Christian Albright book listed in References. A dummy variable is a variable with possible values 0 and 1. It equals 1 if a particular variation (e.g., image2) is the one used in its section and 0 if it is not.
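Dummy coding is easy to illustrate. The sketch below assumes the common convention of treating the first variation as the baseline and giving it no column of its own, so that a section with three images yields two dummy variables. The function name and the `is_` prefix are hypothetical.

```python
def dummy_encode(choice, levels):
    """Encode one categorical choice as 0/1 dummy variables.

    The first level is the baseline and gets no column of its own,
    so a section with k variations produces k - 1 dummy variables.
    """
    return {f"is_{level}": int(choice == level) for level in levels[1:]}

image_levels = ["image1", "image2", "image3"]  # image1 = original (baseline)

for choice in image_levels:
    print(choice, dummy_encode(choice, image_levels))
```

Dropping the baseline column keeps the dummy variables from being redundant with the regression's constant term; the baseline is then represented by all dummies being 0.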
In multiple regression analysis, we make the initial assumption that the effects of the independent variables on the dependent variable are additive. In short, we assume that the dependent variable can be predicted most accurately by a linear function of the independent variables. However, the effects of independent variables on a dependent variable are not always additive. We refer to the presence of non-additive effects as interaction (typically represented by product terms in a regression equation). Interaction occurs whenever the effect of an independent variable on a dependent variable is not constant over all of the values of the other independent variables. Although interaction is a somewhat difficult concept to envision in the abstract, it is not difficult to conceive of situations that would entail interactions between variables: for example, the combined effect of temperature and rainfall on the number of orchids harvested.
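A two-by-two example makes the idea concrete. In the sketch below the conversion rates are invented purely to illustrate the arithmetic and do not come from any GWO report; the headline and image labels are likewise hypothetical.

```python
# Hypothetical conversion rates for a 2x2 test; the numbers are invented
# purely to illustrate the arithmetic and are not from any GWO report.
rates = {
    ("headline_A", "image_A"): 0.30,
    ("headline_B", "image_A"): 0.33,
    ("headline_A", "image_B"): 0.32,
    ("headline_B", "image_B"): 0.41,
}

# Effect of switching the headline, measured separately for each image.
headline_effect_with_image_a = (
    rates[("headline_B", "image_A")] - rates[("headline_A", "image_A")]
)
headline_effect_with_image_b = (
    rates[("headline_B", "image_B")] - rates[("headline_A", "image_B")]
)

# If the effects were purely additive these two numbers would match;
# their difference is the interaction effect.
interaction = headline_effect_with_image_b - headline_effect_with_image_a

print(f"Headline effect with image A: {headline_effect_with_image_a:+.2f}")
print(f"Headline effect with image B: {headline_effect_with_image_b:+.2f}")
print(f"Interaction: {interaction:+.2f}")
```

In this made-up case the new headline helps three times as much when paired with image B, which a report limited to main effects would not reveal.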
Note: Everything in this discussion of the multiple regressions used to understand the factors that bring about conversion rates could be applied, equally well, to understand the factors that optimize other dependent variables such as Return On Investment (ROI). Moreover, the Xi’s can also represent information about conditions or events outside of the Web page itself (indicated by the “External variable” term in the Regression Table section in Figure 13): for example, the language, location, or Web browser of the viewers (information available from products like Google Analytics and Urchin), marketing campaign information, etc., in addition to information about the combinations of Web page elements used in your test.
I’ll discuss the statistical aspects of Website optimization in greater detail in my next article, Statistical and Financial Considerations in Website Optimization: Part 2.
The Page Sections report
If you’re running a multivariate test, you’ll notice that you have two sub-tabs: reports by combination, as discussed above, and reports by page section.
Page section reports have the same columns as combination reports, plus one more: relevance rating.
Relevance rating shows how much impact a particular page section has on your experiment. For example, if your headline page section showed a relevance rating of 0, you’d know that the headlines you used did not significantly distinguish themselves. Alternatively, a relevance rating of 5 for your image page section would show that there were one or more images which significantly differentiated themselves from the others, and that the images page section is important for conversions.
Why might the relevance rating be 0 for all my page sections?
Seeing relevance ratings of zero for all your sections could suggest that the difference between your variations is subtle, and requires more data to become apparent, or that the content tested in your experiment isn’t having a significant effect on your users.
The ROI / Business case for testing
Sometimes it’s hard to convince your boss that your company should invest precious resources into running content experiments. One way to communicate the business value of testing is to run the numbers.
As an example, let’s say your company currently acquires customers or sells products online. Let’s assume every year, 1 million visitors reach your site and 5% proceed past your home page, 10% of those proceed past your product/service details page, and 20% of those ultimately become customers. Let’s also assume the average sale is $250. This would equate to 1,000 customers and $250,000 in revenue for every 1,000,000 visitors. Not too shabby, right?
Now assume that by testing alternate content and page designs, you were able to increase the advance rates for each step of the sales process by a small percentage. Let’s say that now 6% of visitors advanced past the home page, 12% past the details page, and 24% to the final checkout/contact page. By improving only 3 pages by small percentages, with the same number of visitors each year, instead of acquiring 1,000 customers and $250,000 in revenue, you’d now acquire 1,728 customers and generate $432,000 in revenue!
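If you want to run these numbers for your own site, the calculation is a simple product of the advance rates. The sketch below uses the figures from this example; the function name is hypothetical.

```python
def funnel(visitors, step_rates, average_sale):
    """Return (customers, revenue) after applying each step's advance rate."""
    customers = visitors
    for rate in step_rates:
        customers *= rate
    customers = round(customers)  # avoid floating-point residue such as 1727.999...
    return customers, customers * average_sale

before = funnel(1_000_000, [0.05, 0.10, 0.20], 250)
after = funnel(1_000_000, [0.06, 0.12, 0.24], 250)

print("Before:", before)
print("After: ", after)
print(f"Revenue increase: {after[1] / before[1] - 1:.1%}")
```

Because the step rates multiply, a 20% relative improvement at each of three steps compounds to roughly a 73% increase in customers and revenue.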
So at the end of the day, using experiments to find better-performing content can dramatically increase your sales without increasing your spending. After using Website Optimizer to improve your ROI, you might even find your boss asking you why you aren’t testing more often.
User Experience Testing
Because usability is a huge factor in conversion rate improvement, user experience (UX) people are usually heavily involved in the development of a written test plan (helping to determine the elements to be tested, along with the specific alternatives for each element to be considered). Remember that successfully testing all of the elements on a page may require the collection of a great deal of raw data. So, if UX testers can correctly eliminate the need to include some elements in your initial optimization testing, their efforts will shorten the time required to collect a statistically significant amount of data.
UX experts can also readily construct the awareness, interest, desire, action (AIDA) decision process steps for your business.
- AIDA is a model of consumer behavior that traces the sequence of cognitive events leading to a purchase decision or other action; it is also called the hierarchy of readiness. For example, in a political campaign, one first becomes aware of the candidates. After receiving additional information, an interest develops in one or more candidates, eventually resulting in a desire to see one candidate elected, and the act of casting a vote for that candidate. The AIDA model is used by marketers as a guideline for creating communications. This requires an understanding of where the market for a product currently lies along the AIDA continuum. Marketing of an innovation requires building awareness. Marketing of an established product may require building desire. Recipes that present new ways to use established brands are one way often used to build desire for an existing product.
UX practitioners are generalists. They may have been involved in the design of many websites on a variety of topics. For this reason, it is important to team them with a subject matter expert (SME). Without the support of someone knowledgeable in your industry or business, UX practitioners may miss important aspects of your conversion process or business goals.
UX people are usually good at the functional and architectural aspects of your designs (i.e., common usability issues that are likely to affect all of your visitors). They are generally weaker on the content issues, such as text, copy, marketing message, and graphical design.
The question is: should you deploy a Website decided upon by a conference room of experts (e.g., usability testers, marketing gurus, etc.), by a self-contained tool like Website Optimizer, by an advanced statistical analysis like the one outlined above, or by some combination of them? The answer, as you may have guessed, is that it depends. It depends on your budget (dollars and time), your people’s level of experience and expertise in such matters (IT, Finance, Marketing, etc.), and the importance of the ROI you think you can achieve.
Remember, without a positive ROI, advertising is a cost, not an investment!
- Tullis et al., Measuring the User Experience, Morgan Kaufmann (2008)
- Ash, T., Landing Page Optimization, Sybex (2008)
- Eisenberg et al., The Complete Guide to Google Website Optimizer, Sybex (2008)
- Clifton, B., Advanced Web Metrics with Google Analytics, Sybex (2008)
- Albright, S. C., Learning Statistics with StatTools, Palisade (2003)
- King, A., Website Optimization, O’Reilly (2008)
- Carlberg, C., Business Analysis with Microsoft Excel, 3rd Ed., Que (2007)
- Corning, P., Nature’s Magic, Synergy and Fate, Cambridge University Press (2003)
- Motulsky, H., Intuitive Biostatistics, Oxford University Press (1995)
- Website Optimizer Overview
- Choosing a statistical test:
- Content testing ideas: