Google Analytics
By Justin Cutroni
Copyright © 2007 O'Reilly Media, Inc.
ISBN: 01234567890
Web analytics is the process of measuring your website, analyzing the data and making changes based on the analysis. Many businesses are just starting to learn how they can increase the performance of their website by using web analytics. For many people their first exposure to web analytics is Google Analytics, a free tool available to everyone.
Although analysis is vital to web analytics, you can’t do analysis without good data. Configuring Google Analytics correctly is the key to collecting good data. That’s why this ShortCut exists: to help you configure Google Analytics correctly. This PDF provides a thorough description of how the Google Analytics system works, information about many different types of implementations and ways to avoid common pitfalls. It also shares some best practices to get your setup correct the first time.
Find more at shortcuts.oreilly.com
Contents
Getting Setup Correct
How Google Analytics Works
Profiles and Profile Settings
Filters
Goals & Funnels
Common Website Configurations
Marketing Campaign Tracking
E-Commerce Tracking
Custom Segmentation
CRM Integration
Tips & Tricks
Reference
Conclusion
Copyright Section
Getting Setup Correct
I wrote this ShortCut for one primary reason: to help you configure Google Analytics correctly. If it’s not configured correctly then you may have incorrect data leading to incorrect analysis. In my experience, getting Google Analytics configured correctly is the biggest roadblock to using it effectively. Throughout the ShortCut I try to explain how Google Analytics works so you can understand the impact of various configuration choices. Remember, Google Analytics is used to analyze business data, which means each business will configure it differently. You need to identify what’s best for your business and configure Google Analytics accordingly.
I believe this ShortCut can be used in two distinct ways. First, it can be used as a reference manual. If you’re already using Google Analytics, and you have a question about filters, just flip to the filters section. Or, if you’re not sure you’ve set up cross domain tracking correctly, read Tracking Across Multiple Domains in the Common Website Configurations section. My goal is to make each section a resource that can be used without reading the entire ShortCut.
Second, you can view this ShortCut as a complete work. I’ve tried to structure the sections to follow a typical implementation. One of the tips I include in the Tips & Tricks section is a short implementation process. If you’re just getting started with Google Analytics you may want to review that process first and keep it in mind as you progress through this ShortCut.
How Google Analytics Works
Understanding the architecture of the Google Analytics system (how it collects data, identifies visitors and creates reports) is the key to understanding many of the advanced topics that will be discussed later in this ShortCut. Before I begin discussing filters, goals and advanced implementations, let’s review the fundamentals of how the system works.
Data Collection & Processing
I’m going to explain how Google Analytics collects, processes and displays data using Figure 1. The data collection process begins when a visitor requests a page from the web server. The server responds by sending the requested page back to the visitor’s browser (step #1). As the browser processes the data it contacts other servers that may host parts of the requested page. This is the case with the Google Analytics Tracking Code (GATC).
Figure 1: Google Analytics processing flow.
The visitor’s browser requests the code from a Google Analytics server (step #2) and the server responds by sending the code to the visitor’s browser. All of the code is contained within one file named urchin.js. Once the browser receives the code, the GATC begins to execute while the rest of the page loads. During execution, the code identifies attributes of the visitor and their browsing environment, such as how many times they’ve been to your site, where they came from, etc. After all the appropriate data has been collected, the GATC sets (or updates, depending on the situation) a number of cookies (step #3), which will be discussed in a later section. The cookies are used to store information about the visitor.
After writing the cookies, the tracking code sends the data back to the Google Analytics server. The data is transmitted to the server via a request for an invisible GIF file (step #4). When the Google Analytics server receives this request it stores the data in a large text file called a log file (step #5). There is one line in the log file for each pageview created by Google Analytics. Each line in the log file contains numerous attributes of the pageview. This includes:
• When the pageview occurred (date and time)
• Where the visitor came from (referring website, search engine, etc.)
• How many times the visitor has been to the site (number of visits)
• Where the visitor is located (geographic location)
• Who the visitor is (IP address)
After the pageview is in the log file the data collection process is complete. The next step is data processing. At some regular interval, usually every few hours, Google Analytics processes the data in the log file. During processing, each line is split into pieces, one piece for each attribute of the pageview. Here’s a sample log file line. (Note that this is a representation, not an actual log file line from Google Analytics.)
65.57.245.11 www.epikone.com - [21/Nov/2006:19:05:06 -0600] "GET /__utm.gif?utmwv=1&utmn=323703347&utmcs=utf-8&utmsr=1600x1200&utmsc=32-bit&utmul=enus&utmje=1&utmfl=8.0&utmcn=1&utmdt=EpikOne%20%20Google%20Analytics%20Support%2C%20Training%20%20Urchin%205%20Software%2C%20Analytics%20Consulting&utmhn=www.epikone.com&utmr=&utmp=/ HTTP/1.1" 200 35 "http://www.epikone.com/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 1.1.4322; .NET CLR 2.0.50727)" "__utma=100957269.323703347.1164157501.1164157501.1164157501.1; __utmb=100957269; __utmc=100957269; __utmz=100957269.1164157501.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none)"
While most of this data is difficult to understand, a few things stand out. The date and time (Nov 21, 2006 at 19:05:06) and the IP address of the visitor (65.57.245.11) are easily identifiable. Google Analytics turns each piece of data in the log file line into a data element called a ‘field’. For example, the IP address would become the ‘Visitor IP’ field. It’s important to understand that each pageview has many, many attributes and that each one is stored in a different field.
After each line has been broken into fields (step #6), filters are applied to the data (step #7). Filters are business rules that you add to Google Analytics. They control what data appears in your reports and how it appears. Finally, after the filters have been applied, the reports are created (step #8) and stored in the database (step #9). Each report in Google Analytics is created by comparing a field, like the Visitor City, to a piece of numeric data (Visits, Pageviews, Bounce Rate, Conversion Rate, etc.). Once the data is in the database the process is complete. When you, or any other user, request a report, the appropriate data is retrieved from the database and sent to the browser.
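The splitting of a request into separate pieces of data can be sketched in a few lines of JavaScript. This is an illustration only, not Google's processing code: the helper name parseUtmRequest is hypothetical, and the parameter names (utmhn, utmp, utmr, utmsr) are simply those that appear in the sample log file line above.

```javascript
// Illustrative sketch only: split the query string of a __utm.gif request
// into name/value pairs, roughly as processing splits a log line into pieces.
// parseUtmRequest is a hypothetical helper, not part of Google Analytics.
function parseUtmRequest(requestPath) {
  var fields = {};
  var queryStart = requestPath.indexOf('?');
  if (queryStart === -1) {
    return fields; // no query string, so there is nothing to split
  }
  var pairs = requestPath.substring(queryStart + 1).split('&');
  for (var i = 0; i < pairs.length; i++) {
    var eq = pairs[i].indexOf('=');
    if (eq === -1) {
      continue; // skip malformed pairs that have no '='
    }
    var name = pairs[i].substring(0, eq);
    var value = decodeURIComponent(pairs[i].substring(eq + 1));
    fields[name] = value;
  }
  return fields;
}

// A shortened version of the sample request shown above.
var sampleRequest = '/__utm.gif?utmwv=1&utmcs=utf-8&utmsr=1600x1200' +
                    '&utmhn=www.epikone.com&utmr=&utmp=/';
var utmFields = parseUtmRequest(sampleRequest);
```

In this sketch utmFields.utmhn holds the hostname from the sample line and utmFields.utmp holds the requested page. A real processing step would also read the IP address, timestamp and cookies from the rest of the log line.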
Warning
Once Google Analytics has processed the data and it is in the database, it can never be changed. This means historical data can never be altered or re-processed. Any mistakes made during setup or configuration can permanently affect the quality of the data. This also means that any changes made to the configuration of Google Analytics will not alter historical data.
About the GATC
Google Analytics uses a very common web analytics technology called Page Tagging to identify visitors, track their actions and collect the data. Each page on your website that you want to track must be ‘tagged’ with a small snippet of JavaScript. If the tracking code is not on a page then that page will not be tracked. The following JavaScript snippet is the standard GATC.
<script src="http://www.google-analytics.com/urchin.js" type="text/javascript">
</script>
<script type="text/javascript">
_uacct = "UA-XXXXX-X";
urchinTracker();
</script>
Google suggests that you place the tracking code immediately before the </BODY> tag of each page. That way, if the browser has any problems requesting the urchin.js file from a Google Analytics server (shown in Figure 1, step #2), the rest of the page is not slowed down. However, there may be cases where the tracking code must be placed at the beginning of the page (these will be discussed later). In these cases you can place the GATC right after the <BODY> tag or even right before the </HEAD> tag.
If your website uses a content management system or some type of templating engine, then the tracking code can be added to the template files or other mechanism that automatically generates common HTML elements. This is a fast, effective way to tag all website pages.
If you cannot place the tracking code before the closing </BODY> tag, it is possible to add it to some other part of the web page. It can be placed in the <HEAD> tag or almost anywhere in the main <BODY> tag. If you do place the GATC in the <BODY> tag, make sure it appears inline and is not nested within another tag. However, placing the code earlier in the page can have a negative effect on the visitor’s experience. If there is any latency in the Google Analytics server then the browser will pause while waiting for urchin.js to download, and the visitor will experience a delay while the website loads. Remember, the visitor experience is very important. Anything that can degrade the experience should be minimized.
Note
It is possible to host the urchin.js file on your own server. To do so, copy the contents of urchin.js by viewing the file in your browser. Just enter http://www.google-analytics.com/urchin.js into your browser, copy the resulting code and place it in a file on your server. Then, update the GATC to reference the new file location on your server and not the urchin.js located on the Google Analytics server.
<script src="http://www.myserver.com/myfile.js" type="text/javascript">
</script>
<script type="text/javascript">
_uacct = "UA-XXXXX-X";
urchinTracker();
</script>
It should be noted that Google updates urchin.js without notifying users. If you decide to host urchin.js on your own servers, make sure you periodically check for updates.
There are two versions of the Google Analytics Tracking Code, a secure version and a non-secure version. The secure version should be used on secure web pages. If the non-secure version is used on a secure page then the browser will display a security warning to the visitor. Security warnings can negatively impact a visitor’s engagement with your website. To simplify the installation, the secure version can be used on both secure and non-secure pages.
<script src="https://ssl.google-analytics.com/urchin.js" type="text/javascript">
</script>
<script type="text/javascript">
_uacct = "UA-XXXXX-X";
urchinTracker();
</script>
Customizing the GATC
There are many settings and features that can be enabled or modified by adding or changing variables in the GATC. A complete list of these variables, and how they can be used to modify Google Analytics tracking, can be found in the Reference section of this ShortCut.
About urchinTracker()
The most important part of the GATC is a JavaScript function named urchinTracker(). This function is used to collect visitor data, store that data in cookies and send the data to the Google Analytics server. urchinTracker() appears in the GATC:
<script src="http://www.google-analytics.com/urchin.js" type="text/javascript">
</script>
<script type="text/javascript">
_uacct = "UA-XXXXX-X";
urchinTracker();
</script>
Every time urchinTracker() executes, a pageview is created and the data is sent to Google Analytics (Figure 1, step #4). Each pageview has a unique ‘name’ that can be found in the URL column of the Top Content report. Here is some sample data from the Top Content report.
Figure 2: Top Content report showing pageview ‘names’.
During the data collection process urchinTracker() extracts the information from the location bar of your browser. It modifies the value by removing the domain name and domain extension. The only things left are the directories, file name and query string variables. This is called the Request URI and it’s one of the fields created during data processing (Figure 1, step #6).
Here’s an example. The URL http://www.epikone.com/pages/index.php?id=110 would appear in the Top Content report as /pages/index.php?id=110. So, in this example, the Request URI is the part of the URL that comes after www.epikone.com.
That’s the default behavior of urchinTracker(). You can override this behavior and specify how urchinTracker() names a pageview by passing a value to urchinTracker(). For example, to change the way the pageview for /index.php appears in Google Analytics you would modify urchinTracker() on the index.php page as follows:
urchinTracker('index page')
This modification forces urchinTracker() to name the pageview ‘index page’ rather than /index.php. The deeper effect of this change is that the value of the Request URI is no longer /index.php but index page. This will have an impact on other configuration settings. I’ll discuss this later in this ShortCut.
urchinTracker() is just like any other JavaScript function. This means that it can be executed anywhere a normal JavaScript function can be executed. So, if you place urchinTracker() in the onClick attribute of an image, a pageview will be created in Google Analytics when a visitor clicks on the image. How will the pageview appear in Google Analytics? By default, it will use the Request URI. However, if you pass a value to the function you can name the click anything you want.
This technique can be used to track visitor clicks, actions and other browser events. For example, to track clicks on links to other websites (called outbound links) simply add the urchinTracker() function to the onClick attribute of the appropriate anchor tags. Don’t forget to pass urchinTracker() a value so the visitor click is identifiable. There are more tricks about using urchinTracker() to track Flash, JavaScript and non-HTML files in the Tips & Tricks section.
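A minimal sketch of the outbound link technique follows. The helper name trackOutbound and the /outbound/ naming prefix are assumptions chosen for illustration; only urchinTracker() itself comes from the GATC. The call is guarded so the snippet does nothing harmful on a page where urchin.js has not loaded.

```javascript
// Hypothetical helper: build a pageview name for a click on an outbound link
// and pass it to urchinTracker(). The '/outbound/' prefix is an arbitrary
// naming convention, not something Google Analytics requires.
function trackOutbound(href) {
  // Drop the protocol so the name reads like a path,
  // e.g. /outbound/www.example.com/page
  var name = '/outbound/' + href.replace(/^https?:\/\//, '');
  // urchinTracker() only exists once urchin.js has loaded, so guard the call.
  if (typeof urchinTracker === 'function') {
    urchinTracker(name);
  }
  return name;
}

// In a page, the helper would be called from the link's onClick attribute:
//   <a href="http://www.example.com/page"
//      onClick="trackOutbound(this.href);">Example</a>
var outboundName = trackOutbound('http://www.example.com/page');
```

Using a common prefix is one possible design choice: it groups all outbound clicks together in the Top Content report and makes them easy to tell apart from real pages.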
About the Tracking Cookies
Google Analytics uses up to five first-party cookies to track a visitor and store information about that visitor. These cookies, set by the urchinTracker() function, track attributes of the visitor such as how many times they have been to the site and where they came from. Note that the cookies do not store any personally identifiable information about the visitor. Here is a list of all the tracking cookies, their format and other information:
__utma
Expiration: 2038
Format: domain-hash.unique-id.ftime.ltime.stime.session-counter
The __utma cookie is the visitor identifier. The unique-id value is a number that identifies this visitor. ftime, ltime and stime are all used to compute visit length (along with the __utmb and __utmc cookies). The final value in the cookie is the session counter. It tracks how many times the visitor has visited the site. It is incremented every time a new visit begins.
__utmb
Expiration: End of session
Format: hashcode
The __utmb cookie, in conjunction with the __utmc cookie, is used to compute the visit length.
__utmc
Expiration: End of session
Format: hashcode
The __utmc cookie, in conjunction with the __utmb cookie, is used to compute visit length.
__utmz
Expiration: By default, 6 months, but it can be customized
Format: domainhash.ctime.nsessions.nresponses.utmcsr=X(|utmccn=X|utmctr=X|utmcmd=X|utmcid=X|utmcct=X|utmgclid=X)
The __utmz cookie is the referral-tracking cookie. It tracks all referral information regardless of the referral medium or source. This means that all organic, CPC, campaign or plain referral information is stored in the __utmz cookie. Data about the referrer is stored in a number of name-value pairs, one for each attribute of the referral:
utmcsr. Identifies a search engine, newsletter name, or other source specified in the utm_source query parameter. See the Marketing Campaign Tracking section for more information about query parameters.
utmccn. Stores the campaign name or the value in the utm_campaign query parameter.
utmctr. Identifies the keywords used in an organic search or the value in the utm_term query parameter.
utmcmd. A campaign medium or the value of the utm_medium query parameter.
utmcct. Campaign content or the content of a particular ad (used for A/B testing). The value from the utm_content query parameter.
utmgclid. A unique identifier used when AdWords auto tagging is enabled. This value is reconciled during data processing with information from AdWords.
The expiration date for the campaign cookie can be set in the GATC. See the Reference section for more information about how to change the default value. Also, there is more information about referral tracking in the Marketing Campaign Tracking section.
__utmv
Expiration: 2038
Format: domain-hash.value
Custom Segmentation cookie. This cookie is not present unless custom segmentation has been implemented. The cookie is created using the __utmSetVar() function which will be discussed in a later section. The value passed to __utmSetVar() is stored in the value section of the __utmv cookie.
In this ShortCut I don’t discuss any of the issues regarding cookies and visitor tracking. There are numerous studies, white papers and blog posts estimating the rate at which cookies are blocked by browsers and deleted by users. Eric Peterson first wrote about the pitfalls of cookies in a 2005 study for Jupiter Research. A summary of his work can be found in this press release: http://www.jupitermedia.com/corporate/releases/05.03.14newjupresearch.html
In my opinion, the best course of action to mitigate visitor behavior, and its effects on your data, is to look for trends and patterns in your data and to avoid absolute numbers.
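To make the cookie formats listed above more concrete, here is a small sketch that splits a __utma value into its six parts and reads the name-value pairs from a __utmz value. The helper names parseUtma and parseUtmzPairs are hypothetical and are not functions provided by urchin.js; the sample values are the ones from the sample log file line earlier in this section.

```javascript
// Illustrative sketch: split a __utma cookie value into the parts named in
// its documented format. parseUtma is a hypothetical helper.
function parseUtma(value) {
  var parts = value.split('.');
  if (parts.length !== 6) {
    return null; // not in the six-part format described above
  }
  return {
    domainHash: parts[0],
    uniqueId: parts[1],
    ftime: Number(parts[2]),
    ltime: Number(parts[3]),
    stime: Number(parts[4]),
    sessionCounter: Number(parts[5])
  };
}

// Read the name=value pairs of a __utmz cookie (the part that follows the
// four leading dot-separated values) into an object.
function parseUtmzPairs(value) {
  var pairs = {};
  // The pairs begin at the first 'utm' name and are separated by '|'.
  var start = value.indexOf('utm');
  if (start === -1) {
    return pairs;
  }
  var items = value.substring(start).split('|');
  for (var i = 0; i < items.length; i++) {
    var eq = items[i].indexOf('=');
    if (eq !== -1) {
      pairs[items[i].substring(0, eq)] = items[i].substring(eq + 1);
    }
  }
  return pairs;
}

// Values taken from the sample log file line earlier in this section.
var utma = parseUtma('100957269.323703347.1164157501.1164157501.1164157501.1');
var utmz = parseUtmzPairs(
  '100957269.1164157501.1.1.utmccn=(direct)|utmcsr=(direct)|utmcmd=(none)');
```

For the sample visitor, utma.sessionCounter is 1 (a first visit) and utmz.utmcsr is (direct), meaning no referring source was recorded.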
Note
Google Analytics does not track any visitor who has configured their browser to block first-party cookies or any visitor who has disabled JavaScript. There is no way to circumvent this limitation. Also, if a visitor deletes their cookies, they will appear as a new visitor the next time they visit the website.
Profiles and Profile Settings
The data for each website that you track is stored in a profile. Most documentation describes a profile as data for a website. But a profile is more than just data. Each profile has a number of settings that can affect the data within the profile. A more accurate way to describe a profile is a collection of data and business rules. The business rules modify the data in the profile. In Google Analytics, the business rules are called filters (I’ll discuss them more in a later section) and each profile can have different filters.
Multiple profiles can be created for the same website. Each profile can have different filters, thus changing the data in each additional profile created for a website. So, even though you may have two profiles for www.epikone.com, the data in the reports could be dramatically different because of the different filters applied to each profile.
Why would you create multiple profiles for a single website? To create different sets of data for different types of analysis. I’ll discuss this more in the Filters section and in the Tips & Tricks section where I suggest some profiles that you should consider creating.
In addition to filters, there are other settings that are common to each profile. Understanding how each setting alters the data in the profile is important when you’re setting things up.
Website URL
The Website URL is used for two simple tasks. First, it is used to check the installation of the tracking code. After a profile has been created, Google Analytics will spider the Website URL value and search for the tracking code to ensure that it has been installed correctly. The Website URL is also used to create the Site Overlay report. When the Site Overlay report is generated, Google Analytics requests the page at the Website URL value and displays it in a separate window. It then adds the appropriate data to each link on the page.
Profile Name
The Profile Name identifies each profile in a list. There are no restrictions on how to name a profile. You can even create two profiles with the same name, but I don’t recommend this (how would you differentiate them in a list?). I suggest naming profiles something descriptive that all users will understand. If a profile has a filter applied to it then include a small description explaining how the filter changes the data in the profile. For example, I might name a profile ‘www.epikone.com - vermont traffic only’. When viewing this profile in a list I can easily understand that it is for the www.epikone.com domain and that it contains only traffic from Vermont.
Tip
Wondering when a profile was created? Add the ‘start date’ to the profile name. That way you’ll always know how much data exists in the profile.
Time Zone
This setting can only be changed if your Google Analytics account is not linked to an AdWords account. If you’ve linked your Analytics account to an AdWords account then the Time Zone setting will be the time zone you defined in your Google AdWords account. Applying the AdWords time zone to the Analytics data ensures that the Google AdWords reporting in Google Analytics is accurate.
Default Page
Setting the default page for a website is a simple configuration step that ensures the quality of your Google Analytics report data. The default page for a website is the page shown to a visitor when they enter just the website domain into the browser’s location bar. If you type http://www.epikone.com/ into your browser the web server returns index.php. You won’t see index.php in the browser’s location bar, but that’s the page the server returns. This is the same for directories within your website. http://www.epikone.com/blog also returns index.php.
Why does this matter? When the GATC executes it creates pageviews using the page name that the visitor requested. What if there is no page name, as is the case in this example? Google Analytics creates a pageview and names it /. However, when the user types http://www.epikone.com/index.php, Google Analytics creates a pageview for index.php. Although the visitor sees the same page, Google Analytics creates a pageview for / and a pageview for index.php: two pageviews for the same page. Pageviews for a single piece of content should be summarized as a single line item, not two. Figure 3 illustrates how two pageviews can exist for a single page.
Figure 3: In the above image, /index.html and / are the same page and should be consolidated into a single line item.
To remedy this problem enter the default page for your website in the ‘Default page:’ field in the ‘Main Website Profile Information’ configuration section. Be sure to enter only the page name. Do not include a slash before the page name and do not use regular expressions. As Figure 4 shows, just input the name of the page, nothing else.
Figure 4: The Default Page setting.
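The consolidation that the Default Page setting is meant to achieve can be sketched as follows. This is a sketch under a stated assumption, not a description of Google's implementation: it assumes the setting behaves like appending the default page name to any Request URI whose path ends in a slash. The helper name applyDefaultPage is hypothetical.

```javascript
// Sketch under an assumption: treat the Default Page setting as appending the
// default page name to any Request URI whose path ends in a slash.
// applyDefaultPage is a hypothetical helper, not a Google Analytics function.
function applyDefaultPage(requestUri, defaultPage) {
  // Separate the path from any query string so only the path is inspected.
  var queryStart = requestUri.indexOf('?');
  var path = queryStart === -1 ? requestUri : requestUri.substring(0, queryStart);
  var query = queryStart === -1 ? '' : requestUri.substring(queryStart);
  if (path.charAt(path.length - 1) === '/') {
    path += defaultPage;
  }
  return path + query;
}

// Count pageviews per consolidated name.
var rawPageviews = ['/', '/index.php', '/blog/', '/'];
var pageviewCounts = {};
for (var i = 0; i < rawPageviews.length; i++) {
  var pageName = applyDefaultPage(rawPageviews[i], 'index.php');
  pageviewCounts[pageName] = (pageviewCounts[pageName] || 0) + 1;
}
```

With the default page set to index.php, the pageviews for / and /index.php are counted under a single name instead of two separate line items.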
Exclude URL Query Parameters
See the Dynamic Website Configuration section for coverage of this setting. Even if you do not have a dynamic website you should still read that section.
E-Commerce Settings
There are two e-commerce settings in the profile information section. The first, ‘E-Commerce Website’, is a switch. When set to ‘Yes’ a series of e-commerce reports will be added to the reporting interface. By default this setting is set to ‘No’. So, if you have an e-commerce website make sure you change this setting when creating a profile.
Figure 5: Profile e-commerce settings.
The second e-commerce setting formats the currency in your Google Analytics reports. Google Analytics can display currency in eight different formats. Changing the currency setting will alter the way e-commerce data is displayed in the reporting interface.
Apply AdWords Cost Data
Most profile settings can be configured in the Google Analytics interface. However, one profile setting, Apply AdWords Cost Data (Figure 6), can only be activated via the AdWords interface. This setting is used to automatically import cost data from AdWords into Google Analytics.
Figure 6: How to apply Google AdWords cost data to a Google Analytics profile.
You can read more about the Apply Cost Data setting in the Marketing Campaign Tracking section.
Note
Remember, making changes to any profile settings will not alter the data that has already been processed by Google Analytics. The changes will only affect data processed after the settings have been changed.
Filters
There is no Google Analytics concept that is more important, but less understood, than filters. Functionally, filters are business rules. They are added to a profile because there is a business need to modify the data in a profile. For example, it is very common to exclude website traffic generated by internal employees. This data can skew the data generated by actual customers, thus causing incorrect analysis.
I believe the key to understanding filters is understanding how website data is structured in Google Analytics. I discussed this in the How Google Analytics Works section. If you have not read that section please do so.
There are two types of filters in Google Analytics: Predefined Filters and Custom Filters. Predefined Filters are common filters that most people use. Google has bundled these common filters together and simplified the implementation. Custom Filters are different. You need to do all the configuration work when creating a Custom Filter. While it can be challenging, Custom Filters truly offer you advanced control over the data in your profiles. In general, Custom Filters and Predefined Filters work from the same premise.
How Profile Filters Work
Figure 7 displays the three common components of a filter:
Figure 7: The three primary parts of a filter.
• Filter Field
• Filter Pattern
• Filter Type
I find it easy to define filters as a process involving these components. As Google Analytics processes site data it executes the filters that have been applied to the profile. It compares the filter field against the filter pattern and, if the field matches the pattern, the filter performs an action. When the action occurs the data in the profile is changed.
Multiple filters can be applied to a profile. When more than one filter is applied to a profile they are executed sequentially, in the order they are listed. The output from one filter is used as the input to the next filter.
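The process described above can be modelled in a few lines of JavaScript. This is a conceptual sketch, not Google's implementation: the object shapes, the field names (hostname, requestUri, visitorIp), the internal IP range and the function names are all assumptions made for illustration, and only include and exclude actions are modelled.

```javascript
// Conceptual sketch of filter processing, not Google's implementation.
// A hit is an object of fields; a filter names a field, a pattern and a type.
function applyOneFilter(hits, filter) {
  var kept = [];
  for (var i = 0; i < hits.length; i++) {
    var value = hits[i][filter.field] || '';
    var matches = filter.pattern.test(value);
    // 'include' keeps matching hits; any other type is treated as 'exclude'
    // in this sketch and keeps the hits that do not match.
    if ((filter.type === 'include') === matches) {
      kept.push(hits[i]);
    }
  }
  return kept;
}

function applyFilters(hits, filters) {
  var remaining = hits;
  // Filters run in listed order; each one receives the previous one's output.
  for (var i = 0; i < filters.length; i++) {
    remaining = applyOneFilter(remaining, filters[i]);
  }
  return remaining;
}

// Hypothetical data: field names and IP range are made up for the example.
var sampleHits = [
  { hostname: 'www.epikone.com', requestUri: '/index.php', visitorIp: '10.0.0.5' },
  { hostname: 'www.epikone.com', requestUri: '/blog/', visitorIp: '65.57.245.11' },
  { hostname: 'test.epikone.com', requestUri: '/index.php', visitorIp: '65.57.245.11' }
];
var sampleFilters = [
  { field: 'visitorIp', pattern: /^10\.0\.0\./, type: 'exclude' },       // drop internal traffic
  { field: 'hostname', pattern: /^www\.epikone\.com$/, type: 'include' } // keep the main site only
];
var filteredHits = applyFilters(sampleHits, sampleFilters);
```

Only the /blog/ hit survives both filters. Because each filter works on the output of the one before it, changing the order or adding a filter can change which hits reach the later filters.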
Warning
Filters forever modify the data in the profile. This means an incorrect filter will forever alter profile data. Be careful when applying filters to your profiles. Test your filters on a ‘test’ profile to ensure they work as expected. (See the Reference section for more information about test profiles.)
Filter Field
The first part of a filter is the filter field. These data elements are created when Google Analytics processes the data in the log file (Figure 1, step #6). Each filter field is an attribute of a pageview in Google Analytics. There are 37 different filter fields in Google Analytics, each of which can be used to create a filter. A complete list of filter fields, and what they represent, can be found in the Google Analytics support documents: http://www.google.com/support/googleanalytics/bin/answer.py?answer=55588
Some of the most common filter fields are listed in Table 1.
Table 1: The most commonly used Filter Fields
Filter Field | Description
Request URI | The Request URI is created using the information in the location bar of the browser. Google Analytics removes the sub domain, the hostname and the domain extension. Everything remaining becomes the Request URI.
Hostname | This is the primary and sub domain (if present), listed in the location bar of the visitor’s browser.
Visitor IP Address | The IP address of the person visiting your website. While this field is available for use in filters, it is not visible. The value of the IP address, which is protected by the Google Analytics Privacy Policy, cannot be displayed in reports.
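As an illustration of the Request URI field in Table 1, the following sketch derives a Request URI from a full URL by removing the protocol and host. The helper name toRequestUri is hypothetical, and the regular expression is only one simple way to do the split; it is not taken from urchin.js.

```javascript
// Illustrative sketch: derive the Request URI from a full URL by removing the
// protocol and the host. toRequestUri is a hypothetical helper.
function toRequestUri(url) {
  // Strip "http://host" or "https://host"; what remains is the path
  // (directories and file name) plus any query string.
  var rest = url.replace(/^https?:\/\/[^\/?#]+/, '');
  // A bare domain has no path, so treat it as the root page.
  return rest === '' ? '/' : rest;
}

var requestUri = toRequestUri('http://www.epikone.com/pages/index.php?id=110');
```

For the example URL used earlier in this section, requestUri is /pages/index.php?id=110, which is how the page would be named in the Top Content report.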
Tip
Each field represents a piece of data which can have many values. For example, the field for Visitor City can contain Boston, San Francisco, Seattle, etc. To find the different values stored in a field use the Google Analytics reports. As mentioned previously, each report is constructed using a field, so all the values in a field are displayed in the report built from that field. At the time of writing there is no reliable documentation about which reports are created from each filter field. This information should appear in the Google Analytics support documentation soon.
Filter Pattern
The second part of a filter is the filter pattern. The pattern is applied to the filter field and, if the pattern matches any part of the field, the pattern returns a positive result, thus causing an action to occur. The patterns used in Google Analytics are called regular expressions. A regular expression is a set of characters that represents a larger set of data. These characters may be standard alphanumeric characters (like letters or numbers) or special characters (like the * or +).
Rather than begin a discussion of regular expressions in this section, I feel that it is better to continue the conceptual discussion of filters. If you understand the basic concept of a regular expression, that it is a pattern applied to data, then you will be able to follow this section. (More information about regular expressions can be found in the Reference section of this ShortCut.)
Filter Type
The final part of a filter is the filter type. The filter type determines what happens to the data if the filter pattern matches the filter field. There are seven different types of filters, each with a distinct function. They are all Custom Filters and are described next.
Include/Exclude Filters
Include/exclude filters are the most common custom filters in Google Analytics. They’re also the easiest to understand. The action for an include filter is ‘inclusion’. This means that if the regular expression matches the filter field the data is included in the profile. Exclude filters operate in the opposite manner. If the filter pattern matches the filter field then the data will be excluded from the profile.
Include/Exclude filters are extremely powerful because they can be used to segment your data in different ways. For example, to analyze the visitation habits of visitors from California, create an include filter as follows:
Figure 8: An include filter that ‘lets in’ data from California.
The result of this filter would be that all data stored in the profile, and displayed in the reports, would be for visitors from ‘california’. (The Visitor Region field stores the US state name.) A similar example (Figure 9) would be an include filter for visitors from New York:
Figure 9: A filter to include visitors from New York.
Now, how would you include visitors from New York and California? The answer is not as easy as applying two filters to the profile. Remember, filters are applied sequentially during processing and the output from filter #1 is used as the input for filter #2. Applying two filters to a profile, one to include visitors from New York and one to include visitors from California, would not work. The reason is that the first filter would exclude the very data that the second filter is meant to match. To combine the functionality of these two filters a single filter, shown in Figure 10, must be used.
Figure 10: A filter to include visitors from New York and California.
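To make the sequential behavior concrete, here is a minimal sketch of include filters modeled as plain JavaScript functions. This is not how Google Analytics is implemented. The function names, the `visitorRegion` property, the case-insensitive flag and the pattern `california|new york` are all assumptions made for illustration.

```javascript
// Hypothetical model of an include filter: keep only the hits whose field
// matches the pattern anywhere in its value. Case-insensitive matching is
// an assumption made for this sketch.
function includeFilter(field, pattern) {
  const regex = new RegExp(pattern, "i");
  return (hits) => hits.filter((hit) => regex.test(hit[field]));
}

// Filters run in order; the output of one filter is the input of the next.
function applyFilters(hits, filters) {
  return filters.reduce((remaining, filter) => filter(remaining), hits);
}

const hits = [
  { visitorRegion: "California" },
  { visitorRegion: "New York" },
  { visitorRegion: "Vermont" },
];

// Two include filters in sequence: the first removes New York before the
// second filter ever sees it, so nothing survives.
const twoFilters = applyFilters(hits, [
  includeFilter("visitorRegion", "california"),
  includeFilter("visitorRegion", "new york"),
]);

// One include filter whose pattern uses | ("or") keeps both regions.
const oneFilter = applyFilters(hits, [
  includeFilter("visitorRegion", "california|new york"),
]);

console.log(twoFilters.length); // 0
console.log(oneFilter.map((hit) => hit.visitorRegion)); // California and New York
```

In this sketch the two-filter profile ends up empty, while the single filter with the combined pattern keeps both regions.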
The filter in Figure 10 uses a regular expression to indicate that the Visitor Region must be “New York” or “California”.

Search & Replace Filter
The Search and Replace filter is a simple way to replace one piece of data with a different piece of data. This is most often used to replace long, unreadable URLs with more “human readable” information. Search and Replace filters are slightly different from other filters because they do not have a filter pattern. Instead they have a search string, which works the same way as a filter pattern. When a Search and Replace filter is applied to a profile the filter searches the filter field for the search string. If the search string is found in the filter field, then the filter replaces the entire search string with the replace string. Figure 11 illustrates a common search and replace filter.
Figure 11: A common Search and Replace filter.
This filter searches the Request URI field for the pattern category_id=1234. If the search string is found in the Request URI then the entire Request URI will be replaced with the string Chairs.
Warning
The Replace String is standard text. It is not a regular expression. Also note that the entire Filter Field is replaced with the Replace String. So, if there are any other filters attached to a profile, and those filters use the same filter field as the Search and Replace filter, then those filters may not work because the Search and Replace filter modified the value in the filter field.

Lowercase/Uppercase Filters
Lowercase/Uppercase filters (Figure 12) are different from other filters in that they do not require a filter pattern, only a filter field. Simply put, a Lowercase or Uppercase filter changes the selected filter field to all lowercase characters or all uppercase characters.
Figure 12: Lowercase filter setup form. Uppercase filters have the same settings.
Why is this filter needed? Some web servers, particularly Microsoft IIS servers, create pageviews with mixed-case URLs. This means that Google Analytics creates multiple line items for the same physical page in various reports. Here’s an example. The URLs http://www.epikone.com/default.asp and http://www.epikone.com/Default.asp will generate the same page for the visitor. However, Google Analytics will create two line items in the Top Content report, one for default.asp and one for Default.asp. Obviously these are the same page and should be tracked as a single line item. A Lowercase filter forces the filter field (in this case the Request URI) to a consistent case, thereby consolidating all versions of the same page into a single line item in the Google Analytics reports.
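The consolidation described above can be sketched in a few lines of JavaScript. This is only an illustration of the idea, not Google's processing code. The `countPageviews` helper and the sample data are assumptions.

```javascript
// Tally pageviews per Request URI, optionally transforming each URI first.
function countPageviews(requestUris, transform = (uri) => uri) {
  const counts = {};
  for (const uri of requestUris) {
    const key = transform(uri);
    counts[key] = (counts[key] || 0) + 1;
  }
  return counts;
}

const requestUris = ["/default.asp", "/Default.asp", "/default.asp"];

// Without a filter the report has two line items for one physical page.
const withoutFilter = countPageviews(requestUris);

// A lowercase transform, standing in for the Lowercase filter, merges them.
const withLowercaseFilter = countPageviews(requestUris, (uri) => uri.toLowerCase());

console.log(Object.keys(withoutFilter).length);   // 2
console.log(withLowercaseFilter["/default.asp"]); // 3
```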
Tip
Another good use of the lowercase/uppercase filter is for keywords. Many users want to see ‘EpikOne’, ‘epikone’ and ‘EPIKONE’ as the same keyword, not three different keywords. An uppercase or lowercase filter, applied to the Campaign Term field, will change the keyword case.

Lookup Table
Lookup Table Filters are not currently active in Google Analytics. If you were previously an Urchin on Demand customer, and were using Lookup Table filters, then they will still work. However, new Lookup Table Filters cannot be created. The idea behind lookup tables is that they are an automated search and replace filter. A lookup table is a text file that can be uploaded to Google Analytics. Google Analytics then applies the information in the lookup table to a specific filter field. For example, a lookup table could be used to replace product IDs in Google Analytics with the name of the actual product. This filter can be extremely useful. Hopefully it will be added to Google Analytics soon.

Advanced Filters
Advanced Filters can alter data fields by combining elements from multiple filter fields, removing unnecessary parts of filter fields or replacing one filter field with another. Unlike most filters, advanced filters have two filter fields, Field A and Field B. Along with each filter field there is an Extract field. The Extract field is synonymous with the Filter Pattern; it is the regular expression that is applied to the filter field. So, Extract A is applied to Field A and Extract B is applied to Field B. It should be noted that you don’t have to use both fields. The reason why the Filter Patterns are named Extract for advanced filters is that certain parts of Field A and Field B can be removed, or extracted, from each field. The part of the filter field that is extracted is specified using a regular expression.

In Figure 13, two fields are referenced: the Request URI for Field A and the Hostname for Field B. The pattern applied to Field A means ‘capture all the characters in the Request URI and retain those characters’. The pattern applied to Field B means ‘capture all the characters in the Hostname and retain those characters’.
Figure 13: Advanced Filters have two filter fields, Field A and Field B. A portion of each filter field can be extracted using regular expressions.
What happens to those characters that are captured in the Extract fields? Google Analytics allows you to combine the extracted pieces of data and output them to another field, called a Constructor. The Constructor is simply a field. After a part of the field has been captured Google Analytics stores it in a variable. The names of variables captured by Extract A start with $A and the names of variables captured by Extract B start with $B. You can configure Google Analytics to permanently change the value of the constructor using the Override Output Field setting (see Figure 13). When you select ‘Yes’ the data in the constructor overwrites the Output Field value. So, any reports that are created using that field will be modified. Table 2 is meant to break down the process of combining two extracts and exporting them to a constructor.

Table 2: How two extracts can be combined to modify an existing field.
Captured part of Request URI [$A1] | Captured part of Hostname [$B1] | Output to Constructor: Request URI [$B1$A1]
/pages/index.html | www.epikone.com | www.epikone.com/pages/index.html
If this filter is applied to a profile, then all the reports based on the Request URI change; they will include the hostname as well as the directory path, filename and query string variables.
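The process in Table 2 can be sketched as a small JavaScript function. This is a hypothetical model, not Google's implementation. The `applyAdvancedFilter` function, its option names, and the `(.*)` extract patterns are assumptions (the exact patterns in Figure 13 are not reproduced here). The sketch also treats both fields as required.

```javascript
// Hypothetical model of an advanced filter (not Google's implementation).
// Capture groups from Extract A become $A1, $A2, ...; capture groups from
// Extract B become $B1, $B2, ...
function applyAdvancedFilter(hit, filter) {
  const matchA = new RegExp(filter.extractA).exec(hit[filter.fieldA]);
  const matchB = new RegExp(filter.extractB).exec(hit[filter.fieldB]);
  if (matchA === null || matchB === null) {
    return hit; // in this sketch both fields are required, so nothing changes
  }
  // Substitute each $A<n> or $B<n> in the constructor with its captured text.
  const output = filter.constructorPattern.replace(
    /\$([AB])(\d)/g,
    (whole, field, index) => (field === "A" ? matchA : matchB)[Number(index)] || ""
  );
  return { ...hit, [filter.outputTo]: output };
}

const hit = { requestUri: "/pages/index.html", hostname: "www.epikone.com" };

const filtered = applyAdvancedFilter(hit, {
  fieldA: "requestUri",
  extractA: "(.*)", // assumed pattern: capture the whole Request URI
  fieldB: "hostname",
  extractB: "(.*)", // assumed pattern: capture the whole Hostname
  outputTo: "requestUri",
  constructorPattern: "$B1$A1",
});

console.log(filtered.requestUri); // www.epikone.com/pages/index.html
```

Any characters in the constructor other than the $A and $B variables are copied through literally, which is how separators such as commas or slashes end up in the output field.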
Warning
Modifying the Request URI field using an advanced filter can affect other settings in Google Analytics, most notably goal settings. Goals can be calculated using the Request URI. So, if an advanced filter changes the Request URI make sure to check the effect on your goal settings.

There are two other settings specific to an Advanced Filter: Field A Required and Field B Required. These settings control the logic of an Advanced filter. Setting either one of these options to ‘Yes’ places a constraint on when the filter takes action. If Field A is required and does not match the pattern in Extract A then the filter will not ‘execute’. The same goes for Field B.

Here’s one way an advanced filter can be used to gain more insight into the actions of visitors. In some applications it may be useful to identify which organizations are consuming content on your website. Maybe it’s a competitor or potential client. An advanced filter can be used to attach the ISP or Network name to the Request URI. This will display the name of the organization’s network along with the page requested. Figure 14 shows the settings for such a filter.
Figure 14: Filter to concatenate the visitor’s ISP organization and Request URI.
The above filter would modify the Constructor (Request URI in this case) by adding the Visitor’s ISP or organization’s name to the Request URI. So, all reports created using the Request URI would be modified. The result of this filter is shown in Figure 15.
Figure 15: The Top Content report displaying the modified Request URI.
Now, let’s take the above example one step further. Let’s say we’d also like to see the keyword the visitor used to find the website along with the network name. We can use a series of filters, passing data from one filter to the next, to modify a constructor. The first filter would be an advanced filter to add the keyword to the ISP or network name. While this filter is similar to the previous filter there is one main difference. The constructor used in this filter, shown in Figure 16, is Custom Value 1. This field is a temporary field that is not used by any reports. It is meant to be used when you need to pass data from one filter to another.
Figure 16: Advanced filter that stores data in a temporary variable
The second filter, shown in Figure 17, must add the Request URI to the value we previously created and stored in Custom Value 1. I’ve used a comma in the constructor to separate the Campaign Term from the Visitor ISP Organization. I like using the comma to separate values because if I export the data from Google Analytics into Excel I can easily import it as a comma separated file and Excel will place each value in a new column.
Figure 17: Advanced filter that modifies a value stored in a temporary variable
I’ve chosen to use the Visitor Java Enabled? field as the constructor for the second filter. The reason is that I want to place the data in a field that is not used by any ‘major’ reports. If I used the Request URI as the constructor then all of the reports that are based on the Request URI would break. So, by using the Visitor Java Enabled? field, I only break the Java Enabled report. In my experience this report is not used very often and can be sacrificed. If you use the Java Enabled report then consider using a different field for the constructor or create an additional profile for this filter. Remember, for the above filters to work correctly they must be in a specific order.
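Why the order matters can be seen in a rough sketch that models each filter as a function reading and writing fields on a hit. The property names (`campaignTerm`, `visitorIspOrganization`, `customValue1`, `visitorJavaEnabled`), the sample values and the comma separator in the second filter are assumptions for illustration only.

```javascript
// Filter 1 (hypothetical): store "keyword, network" in the temporary field.
const storeKeywordAndNetwork = (hit) => ({
  ...hit,
  customValue1: `${hit.campaignTerm}, ${hit.visitorIspOrganization}`,
});

// Filter 2 (hypothetical): combine the temporary field with the Request URI
// and write the result to a field that no major report depends on.
const appendRequestUri = (hit) => ({
  ...hit,
  visitorJavaEnabled: `${hit.customValue1}, ${hit.requestUri}`,
});

// Filters are applied in sequence; each one sees the previous one's output.
const runFilters = (hit, filters) =>
  filters.reduce((current, filter) => filter(current), hit);

const hit = {
  campaignTerm: "google analytics",
  visitorIspOrganization: "example networks",
  requestUri: "/pages/index.html",
  visitorJavaEnabled: "Yes",
};

const correctOrder = runFilters(hit, [storeKeywordAndNetwork, appendRequestUri]);
const wrongOrder = runFilters(hit, [appendRequestUri, storeKeywordAndNetwork]);

console.log(correctOrder.visitorJavaEnabled);
// google analytics, example networks, /pages/index.html
console.log(wrongOrder.visitorJavaEnabled);
// undefined, /pages/index.html
```

In the wrong order the second filter runs before the temporary field has been filled, so the combined value is missing the keyword and network name.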
Google Analytics does not limit the number of extracts for each field. Multiple parts of Extract A and Extract B can be captured. If more than one part of an extract is captured then Google Analytics will retain multiple variables for that extract field. Figure 18 shows multiple values extracted from a filter field.
Figure 18: Multiple extracts using an advanced filter
The above filter will capture all of Extract A (the entire hostname) and two parts of Extract B. The first part of Extract B that will be captured is ‘v’ followed by any character. The second extract from Field B will be everything after ‘/23/’. Then, the filter will combine the hostname with the two extracts from Field B. It will separate the values in $A1, $B1 and $B2 with slashes. The characters entered in the Output To field are literal characters, meaning that they appear exactly as you type them, except for those that begin with $A or $B.

Predefined Filters
Before we conclude our discussion of filters, I should mention that Google Analytics has three predefined filters. These are the three most commonly used filters. Google has simplified the implementation by removing the Filter Field and Filter Pattern. With a predefined filter, just choose the Filter Type and enter the appropriate data into the form. Figure 19 lists the three Predefined Filters.
Figure 19: Predefined filters.
The ‘Exclude all traffic from a domain’ filter uses a reverse lookup to identify the domain of the site visitors. The visitor domain is usually the ISP of the visitor or, in the case of large corporations, the name of the company. ‘Exclude all traffic from an IP address’ removes all data coming from the address entered into the filter pattern. This filter is primarily used to exclude internal company resources. The ‘Include only traffic to a subdirectory’ filter isolates data for a specific directory on the website. This filter is usually used to create profiles that focus on one part of the website.
Goals & Funnels
Another common profile configuration is the creation of goals and funnels. While it is not necessary to create any goals or funnels, it is highly recommended. Google Analytics automatically segments report data and displays the goal conversion rate for each line item in the report. So, if you’re looking at a report containing keywords, Google Analytics will display the goal conversion rate for each keyword. This type of segmentation is extremely valuable when analyzing data but is only possible if you set up goals for your profiles.
Goals
Google Analytics Goals are a way to measure conversion activities on your website. A goal is simply a pageview that indicates the visitor has completed some type of high-value process. This process could be filling out a contact form, purchasing a product or downloading a file. Each process usually concludes with some type of ‘thank you’ page. In Google Analytics this is called the goal page. A goal is defined by the URL of the goal page. As Google Analytics processes site data, it increments the goal counter each time a goal page is found. If the goal page is found multiple times during a single visit the goal counter is only incremented once. This is important because it means that a visitor can only convert once during a visit.
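The once-per-visit rule can be sketched as follows. This is a simplified illustration rather than Google's processing logic. The `countGoalConversions` function and the sample visits are assumptions.

```javascript
// Count goal conversions: a visit converts at most once, no matter how many
// times the goal page appears among its pageviews.
function countGoalConversions(visits, goalPage) {
  let conversions = 0;
  for (const pageviews of visits) {
    if (pageviews.includes(goalPage)) {
      conversions += 1;
    }
  }
  return conversions;
}

// Each inner array is the list of pageviews in one visit.
const visits = [
  ["/index.html", "/contact.html", "/thankyou.php", "/thankyou.php"],
  ["/index.html"],
  ["/thankyou.php"],
];

const conversions = countGoalConversions(visits, "/thankyou.php");
console.log(conversions); // 2
```

The first visit reaches the goal page twice but contributes a single conversion.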
There are multiple ways to define a goal depending on the complexity of your website. The easiest way to create a goal is to paste the URL of your goal page into the Goal URL field. Figure 20 shows the goal setup form and the Goal URL field. So, if your checkout process ends with http://www.epikone.com/thankyou.php, enter http://www.epikone.com/thankyou.php in the Goal URL field.
Figure 20: Paste the URL for a goal page in the Goal URL text field.
If the URL of the goal page is http://www.epikone.com/thankyou.php?submit=true then enter http://www.epikone.com/thankyou.php?submit=true into the Goal URL field.

A goal can also be defined using a regular expression. Rather than enter an exact URL in the Goal URL field you can enter a regular expression. This is particularly helpful if the website is dynamic and contains query string parameters that may differ from one visitor to the next. If the goal page contains a unique identifier then you can’t copy and paste a URL into the Goal URL field, because every goal URL will be different. You need to use a regular expression for the Goal URL. I’ll discuss this below in the Additional Settings section.

The Goal setup form also includes a field for Goal name. The Goal name will be used to identify the goal in the reports.

The Activate Goal setting is an on-off switch. Switching the setting to ‘Off’ will stop tracking for the goal. Why would you want to turn a goal off? Google Analytics will calculate an overall website conversion rate using all of the goals you define for the site. If you create a goal that is temporary, say for a specific campaign, then it could artificially skew the overall site conversion rate if you leave the goal on after the campaign ends.
Note
In reality, Google Analytics only uses the Request URI when calculating goals. So, even if you specify the entire URL as a goal page, Google Analytics will only use the Request URI. This also means that if you modify the Request URI using a filter (like an uppercase filter or lowercase filter) you may need to change your Goal URL. For example, if a goal is defined as /pages/html/thankyou.html, but an advanced filter has been applied to the profile and changes the Request URI to /pages/thankyou.html, then the goal will not work.
Funnels
A funnel is a series of pre-defined steps, or pages, that a visitor must view before reaching a goal. Not every goal will have an associated funnel, so defining a funnel is optional. You should set up a funnel if you have a predefined process that the visitor must go through before reaching the goal. This could be as simple as specifying the form used on a contact us page or as complicated as a multi-step checkout process. The funnel is an excellent way to visualize problems in the conversion process. Setting up a funnel is very similar to setting up a Goal. Each step in a funnel is a pageview. So, to create a funnel, paste the URL for each page in your process into the setup form (Figure 21).
Figure 21: The defined funnel setup form.
The ‘Required Step’ check box can affect the number of goal conversions in the Funnel Visualization report. When selected, visitors who complete the goal, without starting at the first step in the defined funnel, will not be shown as completing the goal in the Funnel Visualization report. However, the conversion will be recorded in other conversion reports.
Warning
Google Analytics will ‘back fill’ your predefined funnels. For example, if you have a four step funnel, and a visitor completes only the first step and the final step, Google Analytics will go back and indicate that the visitor actually hit every step in the funnel process.
Additional Settings
Each goal/funnel has an Additional Settings section that can aid configuration in unique situations. Any changes made in this section will be applied to both the goal and the steps in the funnel. Figure 22 shows the options available in the Additional Settings section.
Figure 22: The additional settings for a Goal and Funnel.
The Case sensitive setting can be used with websites that have mixed-case URLs. If the Goal URL, or any of the steps in your funnel, is case sensitive, check this checkbox. Remember, filters can affect this setting. If an Uppercase or Lowercase Custom filter has been applied to the profile then there will be no mixed-case URLs and this setting will be irrelevant.

The Match Type setting is a powerful setting that can facilitate goal tracking. For example, if each goal page contains a unique customer identifier then it will be impossible to paste a single URL into the Goal URL field. The reason is that each URL will be unique, so your website will not have a single URL that represents the goal page. Google Analytics has three different match types that can be used to match multiple URLs and resolve goal setup issues. When selected, each match type will change how Google Analytics applies the value in the Goal URL field to the data it processes.
Exact Match
The value in the Goal URL field must exactly match the URL in the location bar of the visitor’s browser.

Head Match
The Head Match setting can be used when a small part of the goal URL differs from one visitor to another. When using a head match the beginning of the URL in the visitor’s browser must exactly match the value in the Goal URL. If there is any additional data at the end of the visitor’s URL, which does not appear in the Head Match value, the goal will still count. The Head Match will match both path data and query string variables.

Regular Expression
This setting defines a goal using a regular expression rather than a static URL. If the regular expression entered into the Goal URL matches any part of the URL in the visitor’s browser, then the goal is counted. This includes sub domains, primary domains, path information and query string variables. (To learn more about regular expressions, see the Regular Expressions section in the Reference material.)
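The difference between the three match types can be sketched with ordinary string and regular expression operations. This is an approximation of the behavior described above, not Google's code. The `matchesGoal` function, the match type labels and the sample URL with an `id` parameter are assumptions.

```javascript
// Approximate each match type with a standard JavaScript comparison.
function matchesGoal(url, goalUrl, matchType) {
  switch (matchType) {
    case "exact":
      return url === goalUrl; // the whole value must be identical
    case "head":
      return url.startsWith(goalUrl); // extra data at the end is allowed
    case "regex":
      return new RegExp(goalUrl).test(url); // pattern may match any part
    default:
      throw new Error(`Unknown match type: ${matchType}`);
  }
}

// A goal page whose query string carries a unique customer identifier.
const visitedUrl = "/thankyou.php?id=1234";

const exactResult = matchesGoal(visitedUrl, "/thankyou.php", "exact");
const headResult = matchesGoal(visitedUrl, "/thankyou.php", "head");
const regexResult = matchesGoal(visitedUrl, "thankyou\\.php\\?id=\\d+", "regex");

console.log(exactResult); // false
console.log(headResult);  // true
console.log(regexResult); // true
```

With a unique identifier on the end of the URL, only the head match and the regular expression count the goal.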
Warning
The Case sensitive and Match Type settings are applied to both the values in the Goal URL and Funnel steps. It is impossible to use a match type of Exact Match for your funnel steps and a Regular Expression match type for the Goal URL.
Tip
You can define goals and funnels for data created by urchinTracker(). Remember, if you pass a value to urchinTracker() then that data becomes a pageview in Google Analytics. These pageviews can then be defined as goals by placing the value passed to urchinTracker() in the Goal URL field.

The final option in the Additional Settings section is Goal Value. Use this field to monetize non-e-commerce goals. For example, if each Contact Form submitted by a user is worth $100, enter 100 in the Goal Value field. Google Analytics will use 100 to calculate return on investment (ROI) and other revenue based calculations. If e-commerce tracking is active for a profile, and you would like to use e-commerce data for your goals, simply leave this field blank.

You should really try to monetize your non-e-commerce goal values. The reason is that Google Analytics will use the goal value to calculate a $Index value. This is a metric that indicates how much each page on your website is worth. You can use the $Index value to determine which content is most important to the conversion process. The $Index value can be found in the Top Content report.
Using Regular Expressions to Extend Goals
Some users are frustrated by the limit of four goals per profile. With a little creativity the limit can be circumvented by tracking multiple conversion activities in a single goal. This can be done with a regular expression. Remember, the Goal URL can be a regular expression. This means that multiple URLs on a website can match the regular expression defined as a goal. Here's an example. The following two pages represent a conversion on a site:
http://www.analyticstalk.com/blog/outbound/rss/google
http://www.analyticstalk.com/blog/outbound/rss/rss
The regular expression used for the Goal URL could be ‘/rss/(google|rss)$’. Both URLs match the regular expression and would count towards the goal tally. To drill down into the data, and differentiate which URL generated more goals, use the Goal Verification report. This report, shown in Figure 23, segments a goal by the different pages that contribute to the goal.
Figure 23: The Goal Verification report.
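As a rough illustration, the regular expression from the example can be tested against a handful of Request URIs and tallied per page, which is similar in spirit to what the Goal Verification report shows. The sample pageviews are assumptions, and the tally counts pageviews rather than visits to keep the sketch short.

```javascript
// The Goal URL pattern from the example, written as a JavaScript regex.
const goalPattern = /\/rss\/(google|rss)$/;

// Assumed sample of Request URIs seen during processing.
const pageviews = [
  "/blog/outbound/rss/google",
  "/blog/outbound/rss/rss",
  "/blog/outbound/rss/google",
  "/blog/index.html",
];

// Tally how often each matching page contributed to the goal.
const tally = {};
for (const uri of pageviews) {
  if (goalPattern.test(uri)) {
    tally[uri] = (tally[uri] || 0) + 1;
  }
}

console.log(tally["/blog/outbound/rss/google"]); // 2
console.log(tally["/blog/outbound/rss/rss"]);    // 1
```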
Note
What goals should you configure for your website? That depends on your online business model. Remember, Google Analytics collects business data, so the goals for an e-commerce business may be very different from those for a lead generation business. In general, goals usually involve one of the following:
• Completing an e-commerce transaction
• Submitting a ‘Contact Us’ form
• Subscribing to an email newsletter
• Viewing certain content on a website
Common Website Configurations
This section of the ShortCut addresses common website architectures that can cause problems with Google Analytics tracking. Remember, the technology that Google Analytics uses to track visitors, called page tagging, is based on JavaScript and cookies. So any website architecture that affects cookies or JavaScript, such as the use of multiple domain names, can interfere with tracking. Most of the changes required to deal with these website configurations are made to your website, not to Google Analytics directly. If your website contains five static HTML pages then it is very likely that this section will not apply to you. However, if you have a dynamic website that crosses multiple domains and sub domains, then this section will offer you valuable information about how, and why, you should configure Google Analytics.
Dynamic Websites
A dynamic website is one that uses query string parameters, or variables, to determine which content the visitor is consuming. As discussed in the section on urchinTracker(), Google Analytics includes query string parameters when it creates a pageview. Table 3 illustrates how a URL in the browser’s location bar would appear in a Google Analytics report.

Table 3: How Google Analytics creates page ‘names’.
URL in Browser | Resulting URL in Google Analytics
http://www.mysite.com/dir/index.php?sess=1234&cat=3&prod=foo&var2=bar | /dir/index.php?sess=1234&cat=3&prod=foo&var2=bar
http://www.mysite.com/dir/index.php?sess=4567&cat=6&prod=bar&var2=foo | /dir/index.php?sess=4567&cat=6&prod=bar&var2=foo
However, not all query string parameters are created equal. Some query string parameters indicate the content that a visitor is viewing. These parameters are necessary for analysis. Other query string parameters are used by your web server or web application and provide no insight into the visitor’s actions or the content they view. These variables are not needed and should be eliminated from Google Analytics.
To configure Google Analytics to remove query string parameters during processing simply list the unwanted parameters in the ‘Exclude URL Query Parameters’ field in the ‘Main Website Profile Information’ section (Figure 24). List multiple query string parameters as a comma separated list.
Figure 24: Enter unwanted query string variables in the ‘Exclude URL Query Parameters’ text box.
In the above example the query string variables sess and var2 would be removed from Google Analytics during processing. Table 4 indicates how a URL will appear after certain parameters have been excluded.

Table 4: How a URL looks after removing unnecessary query string parameters.
URL in Google Analytics | Resulting Value after Excluding Parameters
/dir/index.php?sess=1234&cat=3&prod=foo&var2=bar | /dir/index.php?cat=3&prod=foo
/dir/index.php?sess=4567&cat=6&prod=bar&var2=foo | /dir/index.php?cat=6&prod=bar
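The effect shown in Table 4 can be reproduced with a short JavaScript function. This is a sketch of the transformation only. The `excludeQueryParameters` function is a hypothetical helper and not part of Google Analytics.

```javascript
// Remove the listed query string parameters from a Request URI.
function excludeQueryParameters(requestUri, excluded) {
  const [path, query = ""] = requestUri.split("?");
  const kept = query
    .split("&")
    .filter((pair) => pair !== "" && !excluded.includes(pair.split("=")[0]));
  return kept.length > 0 ? `${path}?${kept.join("&")}` : path;
}

// The same comma separated list that would go in the settings field.
const excludedParameters = "sess,var2".split(",");

const cleaned = excludeQueryParameters(
  "/dir/index.php?sess=1234&cat=3&prod=foo&var2=bar",
  excludedParameters
);

console.log(cleaned); // /dir/index.php?cat=3&prod=foo
```

Two URLs that differ only in an excluded parameter collapse to the same value, which is why the pageviews aggregate into one line item.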
Excluding query string parameters from Google Analytics will affect other parts of the application. Once a query string variable has been added to the list it will be completely removed from the system. This means that the parameter data will not be accessible via filters, goal settings or funnel settings. So, if a filter utilizes a particular query string variable, and the variable is excluded, then the filter will break. This also holds true for goal settings and funnel settings.

What parameters should you eliminate? Any parameter that does not provide insight into what the visitor is doing, or what the visitor is viewing, should be removed. How will you know which query string parameters to exclude and which ones to include? I’ve found the easiest way is to let Google Analytics collect some data. Then use the Top Content report to identify all query string parameters. Create a list of the parameters and check with your IT staff to learn what each one means. This process is not easy, but it is important.
Warning
It is very common for websites to use a query string variable called a Session ID to identify each individual visitor. A session ID is a unique string that may appear in the query string of every page. A session ID will make every pageview unique because each session ID is unique. Session IDs should be eliminated from Google Analytics using the method described above. If every page comes through as unique, because of the unique session ID, every page will have only one pageview. Only when you aggregate the data, by removing the session ID from the query string, will you fix the problem. Some websites add the session ID as a directory in the file path. In this case an advanced filter must be used to restructure the Request URI field. Please see the section on Advanced Filters for more information.
As usual, changing the ‘Exclude URL Query Parameters’ setting will not affect data that has already been processed by Google Analytics. Only data processed in the future will reflect this change.
Warning
It is against the Google Analytics Privacy Policy to store any personally identifiable information in Google Analytics. If your website uses query string parameters to pass personal information about your visitors then that information will be stored in Google Analytics, thus violating the privacy policy. You must exclude all query string variables that may contain personally identifiable information.
Tracking Across Multiple Domains
Google Analytics has the ability to track visitors across multiple domains. This functionality is primarily used on websites that have a third party shopping cart, but can be used for other purposes. If your website traverses multiple domains, then you will want to track your visitors as they move from one domain to another. If you do not track them across domains then each visitor will appear as a new visitor (i.e. a new person) each time they move from one of your websites to another. However, tracking across multiple domains should only be implemented if there is some functional connection between the websites. If there is no business relationship between the websites then there may not be a need to track visitors between domains. Only you can decide if you need to track visitors between multiple domains.

Critical to cross-domain tracking is the concept of first party cookies. First party cookies are cookies whose domain is the same as the website that the visitor is currently visiting. For example, cookies for a user visiting www.epikone.com have a domain of epikone.com. Google Analytics uses first party cookies. Therefore, the GATC on www.epikone.com can only interact with cookies that have a domain of epikone.com. If the visitor leaves www.epikone.com, as is the case when a website uses a third party shopping cart, then the tracking cookies cannot be accessed by the GATC on the shopping cart pages.

How it Works
When a visitor arrives at the website for the first time the Google Analytics Tracking Code sets a number of cookies that uniquely identify the visitor. No matter where the visitor goes on the website they can always be identified by the cookies. Things change if the visitor leaves the website. The tracking cookies are first-party cookies, which means they can only be used by the website that sets them. If the visitor leaves the site to use a shopping cart located on a different domain, then the tracking cookies will no longer work. There needs to be some mechanism to transfer the cookies, along with the visitor, from one domain to another. This is shown in Figure 25.
Figure 25: When a visitor moves from one domain to another their tracking cookies must move with them.
Google Analytics provides two functions to transfer the tracking cookies between domains: __utmLinker() and __utmLinkPost(). Both functions operate in the same manner. They extract the tracking cookie values from the cookies and place the data in the destination page URL as query string parameters. The tracking cookies in Figure 25, colored green, are passed in the query string as the visitor moves from epikone.com to cutroni.com. The name of each tracking cookie is highlighted in orange. When the visitor lands on cutroni.com the GATC removes the cookie values from the query string and resets the tracking cookies on cutroni.com. When the process is complete the visitor has two sets of cookies with the same values. One set of cookies is for epikone.com and one set is for cutroni.com.

There are two critical conditions that must be met for this technique to work:
• Both domains must have the GATC installed
• The third party domain must accept query string parameters
If the third party domain does not meet both of these conditions then Google Analytics cannot track visitors from one domain to the other.

Implementation
First, make sure the pages on both websites have the GATC installed. If the tracking code cannot be added to both websites then tracking will not work. In addition to adding the tracking code, it must be modified as follows:

<script src="http://www.google-analytics.com/urchin.js" type="text/javascript">
</script>
<script type="text/javascript">
_uacct="UA-xxxx-x";
_udn="none";
_ulink=1;
urchinTracker();
</script>
The above modifications are necessary to change how the tracking code interacts with, and configures, the tracking cookies. The _udn variable determines the domain for the tracking cookies. Normally Google Analytics uses the sub domain and primary domain in the location bar of the browser for the cookie domain. By setting _udn to “none” the tracking code uses the entire hostname for the cookie domain. The _ulink variable is a switch that activates a security feature within the tracking code. This feature creates a ‘key’ that ensures that the tracking variables
are set with the same values on both domains. You can see the key in the URL. It is stored in a query string parameter named __utmk.

Once the tracking code has been modified and installed, __utmLinker() or __utmLinkPost() must be added to your website. As mentioned above, these functions extract the tracking cookie values and add them to the destination URL as query string parameters. If the website transfers the visitor between domains using standard anchor tags then __utmLinker() must be added as follows:

<a href="http://www.b.com" onclick="javascript:__utmLinker(this.href); return false;">Buy Now</a>

If a form is used to transfer the visitor between domains then __utmLinkPost() must be added to the necessary forms. Modify all appropriate forms as follows:

<form action="http://www.b.com/page.php" onSubmit="javascript:__utmLinkPost(this)">

__utmLinkPost() will change the form action, by adding query string parameters to the value in the action attribute, when the visitor submits the form.

It is important to note that you may need to tag the links and forms on both websites. Why? Any page on Website A or Website B can appear in search engine results, so any page can be the starting point of a visit. However, if there is no chance that the visitor's visit will start on Website B, and then move to Website A, then there is no need to change the links or forms on Website B.
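To make the cookie hand-off concrete, here is a minimal sketch of the idea behind the linker functions: cookie values are written into the destination URL as query string parameters and read back on the other domain. This is not the urchin.js implementation. The helper names appendCookiesToUrl and readCookiesFromUrl and the cookie values are hypothetical, and the sketch leaves out the security key and assumes the destination URL has no fragment.

```javascript
// Hypothetical helper: copy tracking cookie values onto a destination URL
// as query string parameters. Names and values are placeholders.
function appendCookiesToUrl(destinationUrl, cookies) {
  const pairs = Object.entries(cookies).map(
    ([name, value]) => `${name}=${encodeURIComponent(value)}`
  );
  if (pairs.length === 0) return destinationUrl;
  // Use "?" for the first parameter and "&" when a query string already exists.
  const separator = destinationUrl.includes("?") ? "&" : "?";
  return destinationUrl + separator + pairs.join("&");
}

// Hypothetical helper for the receiving domain: read the values back out
// of the landing page URL so the cookies can be set again there.
function readCookiesFromUrl(url, cookieNames) {
  const params = new URL(url).searchParams;
  const found = {};
  for (const name of cookieNames) {
    if (params.has(name)) found[name] = params.get(name);
  }
  return found;
}

const linkedUrl = appendCookiesToUrl("http://www.b.com/cart", {
  __utma: "placeholder-visitor-id",
  __utmz: "placeholder-campaign-data",
});
console.log(linkedUrl);
console.log(readCookiesFromUrl(linkedUrl, ["__utma", "__utmz"]));
```

The sketch also shows why the destination domain must accept query string parameters: if the parameters are stripped before the page loads, there is nothing for the receiving tracking code to read.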
Tip: Many people have been experimenting with DOM scripts to dynamically call __utmLinker() or __utmLinkPost() rather than manually add the functions to the HTML. While these scripts will work some of the time, success has been inconsistent due to variations in the DOM from one browser to another. Care should be taken when experimenting with this customization.

Once the tracking code has been modified, and __utmLinker() or __utmLinkPost() have been added, visitors will be tracked between domains. To aid in reporting it is a good idea to add a filter that attaches the website hostname to the request URI. This makes it easier to identify common content on each domain. The settings for the filter are as follows:
Figure 26: Advanced filter to add the Hostname to the Request URI
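The effect of this filter on report data can be sketched in a few lines. This is only an illustration of the transformation, not how Google Analytics applies filters internally. The function name addHostnameToRequestUri and the sample hits are assumptions made for the example.

```javascript
// Hypothetical stand-in for the advanced filter: prefix the Request URI
// with the Hostname so identical paths on different domains stay distinct.
function addHostnameToRequestUri(hostname, requestUri) {
  return hostname + requestUri;
}

// Two hits for the same path on two domains (sample data for illustration).
const sampleHits = [
  { hostname: "www.epikone.com", requestUri: "/index.html" },
  { hostname: "www.cutroni.com", requestUri: "/index.html" },
];

for (const hit of sampleHits) {
  console.log(addHostnameToRequestUri(hit.hostname, hit.requestUri));
}
```

Without the filter both hits would be reported under the same Request URI. With it, each domain's page appears as its own row in the content reports.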
Tracking Across Multiple Sub Domains

Like tracking across multiple domains, the primary issue with tracking across multiple sub domains has to do with the cookie domain. By default, the GATC includes the website sub domain in the cookie domain. This means that a cookie set by the GATC while the visitor is visiting one sub domain cannot be utilized by the GATC on a different sub domain. So, a visitor who visits multiple sub domains on a website will receive a different set of tracking cookies for each sub domain. I've illustrated this issue in Table 5.

Table 5: How sub domains affect cookie domains.
Domain             Cookie Domain        Can be accessed by
support.foo.com    .support.foo.com     support.foo.com only
secure.foo.com     .secure.foo.com      secure.foo.com only
To resolve this issue the cookie domain must be consistent from one sub domain to another. The sub domain must be removed from the cookie domain. Once the sub domain is removed the cookie can be accessed by the GATC that appears on any sub domain, as illustrated in Table 6.
Table 6: Changing cookie domain enables tracking across different sub domains.
Domain             Cookie Domain    Can be accessed by
support.foo.com    .foo.com         support.foo.com or secure.foo.com
secure.foo.com     .foo.com         support.foo.com or secure.foo.com
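The rule behind Tables 5 and 6 is the browser's ordinary cookie domain matching, which can be sketched as follows. The function name cookieVisibleTo is hypothetical, and the sketch deliberately ignores cookie paths, expiry and other attributes.

```javascript
// Sketch of cookie domain matching: a cookie is visible to a page when the
// page's hostname equals the cookie domain or is a sub domain of it.
function cookieVisibleTo(cookieDomain, hostname) {
  const domain = cookieDomain.replace(/^\./, "").toLowerCase();
  const host = hostname.toLowerCase();
  return host === domain || host.endsWith("." + domain);
}

// Table 5: the sub domain is part of the cookie domain.
console.log(cookieVisibleTo(".support.foo.com", "support.foo.com")); // true
console.log(cookieVisibleTo(".support.foo.com", "secure.foo.com"));  // false

// Table 6: the sub domain has been removed from the cookie domain.
console.log(cookieVisibleTo(".foo.com", "support.foo.com")); // true
console.log(cookieVisibleTo(".foo.com", "secure.foo.com"));  // true
```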
The tracking cookie domain can be changed using the _udn variable. In the default configuration, _udn is set to a value of "auto", causing the sub domain to be included in the cookie domain. _udn can be set to a specific value which will in turn be used for the cookie domain. So, setting _udn to the website's primary domain lets the tracking code access the cookies on various sub domains.

Implementation

Configuring Google Analytics to track visitors across multiple sub domains is a three-step process:
1. Modify tracking code to include _udn variable
2. Apply filter to clarify Google Analytics reports
3. Segment traffic into multiple profiles for improved reporting (this step is optional but recommended)

Begin by modifying the GATC to include the _udn variable. Set this variable to the primary domain for the website.

<script type="text/javascript">
_uacct = "UA-XXXXX-X";
_udn = "primarydomain.com";
urchinTracker();
</script>
Once the tracking code has been modified and installed, you have to add a filter to the appropriate profile (Figure 27). The filter will differentiate pages that appear on multiple sub domains. For example, the page index.html may appear on multiple sub domains but will appear as index.html in the reports. Adding the hostname to the Request URI will differentiate multiple versions of the same page.
Figure 27: An advanced filter that concatenates the Hostname and the Request URI.
The final step in configuring multiple sub domains is optional but recommended. It is a good idea to create a separate profile for each sub domain. This provides a greater level of reporting and more insight into visitor actions on each sub domain.
Warning: This filter, or any filter that modifies the Request URI field, will break the Site Overlay report. The reason is that the Site Overlay report uses the Request URI to identify which links in the Site Overlay report correspond with specific data (like clicks and visits).

To create the additional profiles use an include filter (Figure 28) based on the Hostname field. When complete, there should be one main profile that contains summary data for all sub domains and individual profiles for each sub domain.
Figure 28: An include filter used to create a profile for a specific sub domain.
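As a rough illustration of what an include filter on the Hostname field does, the sketch below keeps only the hits whose hostname matches a pattern. The function name makeHostnameIncludeFilter, the pattern and the sample hits are assumptions for the example. The actual filter is configured in the Google Analytics interface, not in code.

```javascript
// Hypothetical model of an include filter: keep only hits whose Hostname
// field matches the given regular expression pattern.
function makeHostnameIncludeFilter(pattern) {
  const expression = new RegExp(pattern);
  return (hits) => hits.filter((hit) => expression.test(hit.hostname));
}

// Sample hits across two sub domains (made up for illustration).
const allHits = [
  { hostname: "support.foo.com", requestUri: "/index.html" },
  { hostname: "secure.foo.com", requestUri: "/index.html" },
  { hostname: "support.foo.com", requestUri: "/faq.html" },
];

// Profile for a single sub domain: dots are escaped so they match literally.
const supportOnly = makeHostnameIncludeFilter("^support\\.foo\\.com$");
console.log(supportOnly(allHits).length); // 2
```

The main profile would have no such filter and so would keep all three hits, while each sub domain profile would see only its own.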
Tracking Across Multiple Domains with Multiple Sub Domains

Tracking visitors across multiple primary domains, each of which contains multiple sub domains, can be done. The key to a successful implementation is making sure the Google Analytics tracking cookies are set with the correct domain and that the cookies are passed between the primary domains. There are three steps to configure this type of tracking:
1. Modify tracking code on each sub domain and primary domain
2. Modify links and forms on both sites to use __utmLinker() or __utmLinkPost()
3. Add a filter to clarify the data within reports

Many of the GATC modifications for this configuration are similar to the settings used in the multiple domains and multiple sub domain tracking. _udn is used to remove the sub domain from the cookie domain, thus making tracking across each sub domain possible. _ulink is used to trigger certain actions in the tracking code necessary for cross domain tracking. Table 7 lists how the tracking code should be modified for each domain.

Table 7: The GATC configuration for a website that uses multiple domains and multiple sub-domains.
Site 1 Hostnames: products.site1.com, secure.site1.com
Tracking Code:
<script src="http://www.google-analytics.com/urchin.js" type="text/javascript"></script>
<script type="text/javascript">
_uacct = "UA-XXXXX-X";
_uhash = "off";
_udn = "site1.com";
_ulink = 1;
urchinTracker();
</script>

Site 2 Hostnames: secure.site2.com, support.site2.com
Tracking Code:
<script src="http://www.google-analytics.com/urchin.js" type="text/javascript"></script>
<script type="text/javascript">
_uacct = "UA-XXXXX-X";
_uhash = "off";
_udn = "site2.com";
_ulink = 1;
urchinTracker();
</script>
The primary difference is the _uhash variable. _uhash creates a unique hash (or numerical representation) of the domain name. This number is then placed in the tracking cookies.
Originally _uhash was created to speed the processing of profile data and to ensure the integrity of the tracking cookies. With improvements to the tracking code and the server architecture of Google Analytics, _uhash has become antiquated. However, it must be set appropriately for tracking to work.

In addition to changing the tracking code, each site must be modified to use __utmLinker() or __utmLinkPost(). Remember, these functions pass the Google Analytics tracking cookies between the domains via the query string. If the cookies are not passed between domains then the visitor's session will not be tracked between websites.

Finally, to help clarify the data in your reports, use an Advanced Custom filter to attach the Hostname to the Request URI. To aid in analysis, it is wise to create separate profiles for each primary domain and/or each sub domain. Do this by using a simple include filter based on the Hostname field. Sample filters can be found in Figure 27 and Figure 28.

Theoretically, there is no limit to the number of primary domains or sub domains that you can track a visitor across. However, it may be impractical to track across more than two or three primary domains.
Frames & IFrames

You can use Google Analytics on sites that have frames. However, take care during the installation and configuration. The most common problem with sites that use frames is that the original referral information can become distorted. This can lead to problems when tracking online marketing.

When implementing Google Analytics on a site that uses frames, make sure that both the frameset and the pages within the frame are tagged with the Google Analytics tracking code. If both pages are not tagged, then Google Analytics does not track referral information correctly. A side effect of tagging both the frameset page and the pages within the frameset is that there will be an artificially high number of pageviews for some pages, specifically the frameset. If the frameset page is not critical (i.e. it is simply a navigation menu or page header) then consider removing it from the profile using an exclude filter.

Warning: If your website uses frames, and the number of pageviews is a critical metric for your business, be sure you filter your profile data appropriately to ensure accurate metrics.
IFrames

The effect of IFrames on Google Analytics tracking is similar to that of standard frames. If the outer page is not tagged with the Google Analytics tracking code then the original referral information will be lost. Another common issue with IFrames is the use of third party shopping carts. If you are using a third party shopping cart, and embedding the shopping cart pages in an IFrame, it will be impossible to track the visitor session from the originating website to the shopping cart website. The reason is that the __utmLinker() and __utmLinkPost() functions cannot be used in the SRC attribute of an IFrame.
Warning: Some have tried to pass the Google Analytics tracking cookies directly to the source of an IFrame. This technique will not work. When __utmLinker() or __utmLinkPost() execute they create a hash, or key, based on the values of the cookies. The hash is sent to the third party domain and used to check the accuracy of the values.
Marketing Campaign Tracking

Another important part of setting up Google Analytics correctly is the identification of all URLs used in online marketing. Unlike other configuration steps, marketing campaign tracking is not done in the Google Analytics administrative interface or on your website. Marketing campaign tracking involves changing the links used in your marketing activities. I'll discuss this more in a moment.

The reason why marketing campaign tracking is so important is that, by default, Google Analytics places your visitors in three basic referral segments:

organic: Visitors clicking on a search engine results page.
referral: Visitors that click on a link on some other website.
direct: Visitors that go directly to your website by typing the URL into their browser.

While these segments are useful, they do not identify paid marketing activities. You want to measure paid marketing activities so you can better understand if they're successful! This can only be done via marketing campaign tracking.
How it Works

Marketing campaign tracking is based on the process of link tagging, which is adding extra information to the destination URLs used in your online marketing activities. The extra information is actually a number of query string parameters that describe the marketing activity. I've illustrated how link tagging identifies your marketing activity in Figure 29.
Figure 29: How link tagging works.
It all begins with the ad that the visitor sees (step #1). In this example the ad is a paid search ad on Google AdWords. When the user clicks on the ad they are sent to a destination URL. Within the destination URL there are additional query string parameters that Google Analytics uses to identify the ad (step #2). When the visitor arrives on the website landing page the urchinTracker() function begins to execute. It examines the URL in the location bar and identifies the query string parameters that identify the URL as a campaign URL. urchinTracker() extracts the query string parameters (step #3). Then, it splits the query string parameters into their name-value pairs, reformats them (step #4) and finally stores them in the __utmz cookie (step #5). Because the values are now stored in a cookie, any actions that the visitor performs can be linked to the ad that drove them to the site.

Let's dig a bit deeper and learn about the specific query string parameters used in link tagging. Table 8 shows how the tagged link in Figure 29 was created:
Table 8: A destination URL before and after link tagging.
Link Before Tagging: http://www.google.com/analytics/
Link After Tagging: http://www.google.com/analytics/?utm_source=google&utm_medium=CPC&utm_campaign=en&utm_term=google%20analytics
Parsing the tagged link above identifies the query string parameters used for identifying the ad. Table 9 identifies each parameter and value: Table 9: The name value pairs extracted from a destination URL.
Parameter       Value
utm_source      google
utm_medium      CPC
utm_campaign    en
utm_term        google%20analytics
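The parsing step can be sketched with the standard URL API. This is an illustration only and not the code inside urchinTracker(). The function name extractCampaignParams is hypothetical, and the sketch stops at returning the name-value pairs rather than writing the __utmz cookie.

```javascript
// Campaign parameters described in this section.
const CAMPAIGN_PARAMS = [
  "utm_source",
  "utm_medium",
  "utm_campaign",
  "utm_term",
  "utm_content",
];

// Hypothetical helper: pull the campaign name-value pairs out of a landing
// page URL, ignoring any other query string parameters.
function extractCampaignParams(landingUrl) {
  const params = new URL(landingUrl).searchParams;
  const campaign = {};
  for (const name of CAMPAIGN_PARAMS) {
    if (params.has(name)) campaign[name] = params.get(name);
  }
  return campaign;
}

// The tagged link from Table 8.
const taggedLink =
  "http://www.google.com/analytics/?utm_source=google&utm_medium=CPC" +
  "&utm_campaign=en&utm_term=google%20analytics";
console.log(extractCampaignParams(taggedLink));
```

Note that the URL API decodes the values, so the %20 in utm_term comes back as a space.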
Now I’ll describe what each parameter actually represents.

utm_campaign: The name of the marketing campaign. Think of this as a bucket. It holds all of the marketing activities in some bigger effort. For example, buying some keywords on Google, running some banner ads and sending out an email blast may all be part of the marketing plan for some type of sale. These three activities, which are all part of the same campaign, can be grouped together for easy reporting.

utm_medium: The medium is the mechanism, or how the message is delivered to the recipient. Some popular mediums are email, banner and cpc.

utm_source: Think of the source as the ‘who’. With whom are you partnering to distribute the message? If you’re tagging CPC links the source may be Google, Yahoo! or MSN. If you’re using banner ads the source could be the name of the website where the banner ad is displayed.

utm_term: The search term or keyword that the visitor entered into the search engine. This value is automatically set for organic links but must be set for CPC links.
utm_content: The version of the ad. This is used for A/B testing. You can identify two versions of the same ad using this variable. This parameter is not included in Figure 29.

It’s important to note that not all parameters are required. The core parameters are utm_campaign, utm_source and utm_medium. These three should always be used when tagging a marketing link. utm_term should be used for tracking paid search advertising and utm_content can be used for A/B testing advertising.

You determine the value for each parameter. In reality it does not matter what value you use. Whatever data you do use will appear in Google Analytics. However, it is important to follow some basic guidelines:
• Keep the value short
• Use alphanumeric characters and avoid white spaces
• Make sure the value is understandable to you, or whoever uses these reports

The value of each parameter will be imported directly into the Google Analytics reports. This is very powerful. Google Analytics is importing information that is specific to your business, like the name of a marketing campaign, and segmenting your data based on the values. Google Analytics displays the data, exactly as it appears in the query string parameter, in a series of reports. These reports segment website traffic and conversions, thus providing insight into which marketing activities are working. Figure 30, the Campaign Report, shows how Google Analytics segments visitation data based on a marketing campaign.
Figure 30: The campaigns report automatically segments site data based on the utm_campaign value.
Warning: All cost-per-click links that are not tagged will be categorized as 'organic'. This can artificially inflate organic traffic volume, leading to incorrect analysis.

If you're using Google AdWords it is highly recommended that Auto Tagging be enabled. If other paid search systems are used, like Yahoo! Search Marketing or Microsoft AdCenter, then the destination URLs must be manually tagged. This is absolutely vital to configuring Google Analytics correctly.

How to Tag Links

The process of link tagging is simple. Start by identifying the marketing information to be placed in the query string parameters. Specifically, you need to identify the campaigns, mediums, sources and potentially keyword and content values. Remember, the keyword parameter is only used for tracking search-based ads and the content parameter is used to identify different variations of an ad. I recommend using some type of spreadsheet to organize the information.

Once all the parameter values have been identified, modify the destination URLs to include the parameters and values. Place a question mark at the end of the destination URL followed by the query string parameters. Separate each name-value pair using an ampersand (&). If the destination URL already has query string parameters, simply add the Google Analytics parameters at the end of the URL. Separate the Google Analytics parameters from the existing parameters using an ampersand (&).
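If you keep the campaign values in a spreadsheet, a small script can build the tagged links for you. The sketch below is one possible way to do it using the standard URL API. The function name tagLink, the example.com address and the campaign values are made up for illustration. It relies on the API to choose between a question mark and an ampersand, and it refuses to build a link when one of the three core parameters is missing.

```javascript
// The three parameters that should always be present on a tagged link.
const REQUIRED_PARAMS = ["utm_campaign", "utm_source", "utm_medium"];

// Hypothetical helper: append campaign parameters to a destination URL,
// keeping any query string parameters the URL already has.
function tagLink(destinationUrl, campaign) {
  for (const name of REQUIRED_PARAMS) {
    if (!campaign[name]) {
      throw new Error("Missing required campaign parameter: " + name);
    }
  }
  const url = new URL(destinationUrl);
  for (const [name, value] of Object.entries(campaign)) {
    url.searchParams.set(name, value);
  }
  return url.toString();
}

// A destination URL that already carries a parameter of its own.
console.log(
  tagLink("http://www.example.com/landing?id=7", {
    utm_campaign: "spring_sale",
    utm_source: "newsletter",
    utm_medium: "email",
  })
);
```

Keeping the values short and free of white space, as recommended above, also avoids any surprises in how the values are encoded in the URL.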
Warning: If your website uses redirects on the landing pages then there may be trouble with link tagging. The Google Analytics campaign tracking parameters must be present in the URL of the landing page. If the URL does not physically contain the tracking parameters then the visit will not be attributed to the correct ad.

Link tagging works for any destination URL. So, if you are sending out emails or using banner ads you should be tagging the destination URLs. In general, any time you pay for advertising on the web you should try to tag the URL used in the ad.
Tip: Some destination URLs, especially those used in email marketing, can be very long before the addition of the Google Analytics campaign tags. One trick is to create a custom URL on your website and direct all traffic from the email to the custom URL. Then, when a visitor lands on the custom URL, dynamically append the campaign tracking variables to the URL. This can be done using application level code or with a simple HTML META refresh. More information about dynamically tagging campaign URLs can be found in this Conversion University article:
http://www.google.com/analytics/cu/tt_offline_campaigns.html
Understanding Conversion Referrals

Visitor campaign information is stored in the __utmz cookie on the visitor's machine. This cookie stores not only campaign information but all referral information, including organic referrals, tagged campaign links, un-tagged referral links and direct visits. Each time a visitor visits the website the urchinTracker() function updates the __utmz cookie with the appropriate campaign information. When the cookie is updated Google Analytics discards the previous campaign information. As a result Google Analytics only tracks the current campaign information, not previous campaign information.

With that said, there is a hierarchy of data importance that Google Analytics references before it updates the __utmz cookie and overwrites the referral information. Remember, Google Analytics buckets traffic in four basic ways:

Campaigns: Links that are tagged with campaign information
Referrals: Visitors who click on an untagged link residing on a web page
Direct: Visitors who type the URL directly into the browser
Organic: Visitors who click on an organic search result

Here is how Google Analytics updates the campaign-tracking cookie based on referrer:
• Direct traffic is always overwritten by referrals, organic and tagged links
• Referral, organic or tagged links always override existing campaign information

For example, a user may visit a site via a tagged link in a newsletter. When the visitor leaves the site the campaign tracking cookie persists for 6 months and indicates that the visitor arrived via the newsletter. The same visitor decides to come back to the site one day later and types the website URL into the browser. This is a direct visit. The campaign cookie will still indicate that the visitor arrived via the newsletter because the second visit was a direct visit, and direct traffic does not overwrite existing campaign information. One day later the visitor clicks on a tagged CPC link. The __utmz cookie is updated to indicate the visitor clicked on a paid search link and the visit is attributed to the CPC link.
Note: The timeout value for the campaign cookie can be changed. By default it is set to 6 months. This value can be altered by changing the _ucto variable in the GATC.

Google Analytics can be configured to retain the original campaign data stored in the __utmz cookie. To enable this feature add an additional query string parameter to a destination URL. The query string parameter, utm_nooverride=1, will alert Google Analytics that the existing campaign information should be retained. While helpful, this technique does not prevent the GATC from updating the campaign cookie if a visitor arrives by organic search or untagged referral link. This technique is only helpful in preventing tagged campaign links from overwriting previous referral information.
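The overwrite rules can be modeled as a small decision function. This is a sketch of the rules as described in this section and not the logic inside the GATC. The function name updateCampaignCookie, the visit objects and the noOverride flag (standing in for utm_nooverride=1) are assumptions for the example, and cookie expiry is ignored.

```javascript
// Hypothetical model of the campaign cookie update rules.
// existing: the campaign data currently stored, or null on a first visit.
// visit: { type: "direct" | "referral" | "organic" | "campaign", ... }
function updateCampaignCookie(existing, visit) {
  // First visit: record however the visitor arrived.
  if (existing === null) return visit;
  // Direct traffic never overwrites existing campaign information.
  if (visit.type === "direct") return existing;
  // A tagged link carrying utm_nooverride=1 leaves existing data in place.
  if (visit.type === "campaign" && visit.noOverride) return existing;
  // Referral, organic and ordinary tagged links overwrite what was stored.
  return visit;
}

// The newsletter example from the text, visit by visit.
let campaignCookie = null;
campaignCookie = updateCampaignCookie(campaignCookie, {
  type: "campaign",
  source: "newsletter",
});
campaignCookie = updateCampaignCookie(campaignCookie, { type: "direct" });
console.log(campaignCookie.source); // newsletter

campaignCookie = updateCampaignCookie(campaignCookie, {
  type: "campaign",
  source: "google",
  medium: "cpc",
});
console.log(campaignCookie.source); // google
```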
Tracking AdWords

The Google Analytics and Google AdWords systems are connected. To take advantage of the interconnectivity, the AdWords account must be linked to the Analytics account. If the accounts are connected, there are two primary features that become enabled: Auto Tagging and Cost Data sharing.

Auto Tagging automates the link tagging process that is usually used to track CPC campaigns. When Auto Tagging is enabled, as shown in Figure 31, Google AdWords automatically adds a query string parameter to the destination URL that identifies Google AdWords as the referring site. While this parameter is different than the standard link tagging parameters, it does the same thing. The query string parameter is named gclid and it contains a random value.
Figure 31: The Auto Tagging feature is activated from the Account preferences section of the My Account tab.
The second benefit of linking an AdWords account and Analytics account is the Apply Cost Data feature (Figure 32). If you enable this option, Google Analytics imports your AdWords cost data and uses it in ROI (Return On Investment) and other calculations. It is recommended that the Apply Cost Data feature be activated as it provides an extremely powerful view of how AdWords ad campaigns are performing.
Figure 32: Use the Apply Cost Data option to view your AdWords cost data in Google Analytics.
Warning: When Apply Cost Data is activated, cost data from the entire AdWords account is applied to each profile in the Analytics account. Google Analytics does not match the campaigns in AdWords to the profiles in Analytics. This means if you are managing AdWords campaigns for multiple websites, and you link the AdWords account to Analytics, cost data for the entire AdWords account is applied to each Analytics profile, even though each profile contains data for a single website. You can resolve this issue by using the filter in Figure 33. This filter only includes AdWords data whose destination URL is the same as the profile URL.
Figure 33: Filter used to remove incorrect AdWords cost data from a profile.
E-Commerce Tracking

The Google Analytics e-commerce tracking feature provides a rich set of reports that you can use for product and customer analysis. If you are an e-commerce company, then it is recommended that you implement the e-commerce tracking code to collect transaction data. The resulting reporting is very valuable and helps you gain additional insight into the performance of your online business. As with most things, understanding how Google Analytics collects, processes and stores e-commerce data is key to a proper implementation.
Warning: Google Analytics e-commerce tracking should not be used in place of an accounting package. While the tracking is fairly accurate, there are too many external forces that can affect the data quality. It is best to analyze larger sets of e-commerce data and look for trends that provide insight into customer actions. Do not rely on it for accounting tasks.
How it Works

Figure 34 illustrates the basic process used to track e-commerce transactions. Tracking begins when a visitor submits a transaction (step #1) and it is received by the web server (step #2). The web server usually passes the data to an application server where it is processed (step #3). This processing may include adding the data to a database, validating a credit card or emailing the customer. Once the application server has processed the transaction it usually creates some type of receipt page for the visitor (step #4).
Figure 34: How e-commerce transactions are collected by Google Analytics.
At this point in the process a modification must be made to accommodate Google Analytics. Before the application server sends the receipt page back to the web server, the application must add information about the visitor's transaction to the receipt page. So, if you are using PHP, ColdFusion or .NET you must create application code that adds the transaction data, in a specific format, to the receipt page. This is the step that most people do not grasp. The format of the data is as follows:

<form style="display:none;" name="utmform">
<textarea id="utmtrans">
UTM:T|[order-id]|[affiliation]|[total]|[tax]|[shipping]|[city]|[state]|[country]
UTM:I|[order-id]|[sku/code]|[productname]|[category]|[price]|[quantity]
</textarea>
</form>
All of the data in brackets must be replaced with actual transaction information. Again, this modification must be made to your application server.

The first line within the text area is a transaction line and contains summary information about the transaction. This includes transaction ID, total, tax, shipping, etc. Following the transaction line there is an item line. There will be one item line for each distinct product purchased by the visitor. This usually means one item line per SKU or unique product ID. The data in an item line includes product name, product category, unit price, quantity, etc. Tables 10 and 11 list the data in each line.

Both transaction lines and item lines follow the same general format. Each is a pipe (|) separated list. Transaction lines start with UTM:T| and item lines start with UTM:I|.

Returning to the process in Figure 34, once the application server completes processing it sends the receipt page to the web server which sends it back to the visitor's browser (step #5). As it renders in the browser the __utmSetTrans() function executes. Like urchinTracker(), __utmSetTrans() is used to send data back to the Google Analytics server. In fact, __utmSetTrans() uses a request for an invisible GIF file to send the data to Google Analytics. The primary difference is that __utmSetTrans() only sends e-commerce data. It parses your receipt page, looks for the hidden form and, once found, reads the data stored in the text area (step #6). Every time it finds a transaction line or an item line it sends the data to Google Analytics in the form of a pageview (step #7).

Once the e-commerce data is in the log file the processing is similar to that of a pageview. Google Analytics processes the log file at a regular interval, applies filters to the fields created during processing, and then stores the data in a database. It should be noted that special fields are created when an e-commerce transaction is processed. Like all other fields in Google Analytics, e-commerce fields can be used in filters.
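Because the application code that writes the hidden form is where most implementations go wrong, here is a minimal sketch of generating the transaction and item lines. It is written in JavaScript to match the other examples. Your application server may use PHP, ColdFusion or .NET instead, and the same approach applies. The function names, the order object and its field names are hypothetical. Only the UTM:T and UTM:I line formats come from the text above.

```javascript
// Remove characters that would break the pipe-separated, one-line-per-record
// format: pipes and line breaks are replaced with spaces.
function cleanField(value) {
  return String(value).replace(/[|\r\n]/g, " ").trim();
}

// Hypothetical helper: build the text that goes inside the utmtrans text area.
// Monetary values are passed as strings without currency symbols or commas.
function buildTransactionLines(order) {
  const transactionLine = [
    "UTM:T",
    order.id,
    order.affiliation,
    order.total,
    order.tax,
    order.shipping,
    order.city,
    order.state,
    order.country,
  ].map(cleanField).join("|");

  // One item line per distinct product in the order.
  const itemLines = order.items.map((item) =>
    ["UTM:I", order.id, item.sku, item.name, item.category, item.price, item.quantity]
      .map(cleanField)
      .join("|")
  );

  return [transactionLine, ...itemLines].join("\n");
}

// Sample order (made-up data).
const sampleOrder = {
  id: "1001",
  affiliation: "Main Store",
  total: "28.50",
  tax: "1.50",
  shipping: "5.00",
  city: "Burlington",
  state: "VT",
  country: "USA",
  items: [
    { sku: "SKU-1", name: "Blue Widget", category: "Widgets", price: "11.00", quantity: 2 },
  ],
};
console.log(buildTransactionLines(sampleOrder));
```

Cleaning each field before joining is a design choice for the sketch: a stray pipe or line break in a product name would otherwise shift every field after it.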
Table 10: Description of Transaction line data variables.
Variable           Description
Order id           The internal tracking ID for the order.
Affiliation        If the transaction was generated by a partner or specific store, the value can be stored here.
Total              Total amount of the transaction. This value should be a plain number. Do not use a monetary symbol or comma in the value.
Tax                The amount of tax applied to the transaction. This value should be a plain number. Do not use a monetary symbol or comma in the value.
Shipping           The shipping charge applied to the transaction. This value should be a plain number. Do not use a monetary symbol or comma in the value.
City               The city entered by the customer. This can be the ship to city or bill to city.
State or region    The state (or region if used outside of the US) entered by the customer. This can be the ship to or bill to state/region.
Country            The country entered by the customer. This can be the ship to country or bill to country.
Note that the geographic location data for e-commerce transactions is generated in the same manner as other geo data in Google Analytics. It is created using a network mapping of the visitor's IP address.
Table 11: Description of item line data variables.
Variable            Description
order id            This is the same order ID used in the transaction line. Google Analytics uses the order ID to group the products contained in a transaction.
sku                 Product SKU
product name        The name of the product purchased
product category    The category that the product belongs to
price               The unit price for the product
quantity            How many units were purchased
Implementation

There are four steps to configuring a website and Google Analytics to track e-commerce transactions.

1. Verify that the tracking code is installed on all pages. The __utmSetTrans() function resides in the urchin.js library. If the reference to the urchin.js file is missing from the receipt page then the __utmSetTrans() function will not execute and the transaction data will not be sent to Google Analytics.

2. Enable E-commerce reports for the profile. By default, the e-commerce reports are disabled. This means that the e-commerce menu does not appear in the left hand navigation. You must modify the profile setting, as seen in Figure 35, and specify that the website is an e-commerce website.
Figure 35: How to enable e-commerce reports in your website profile.
3. Add the hidden form field to the receipt page and create the application logic to populate the field with transaction data. The actual implementation of this step differs from one website to another depending on the application architecture.

4. Finally, add the __utmSetTrans() function to the receipt page. Like urchinTracker(), this function can be placed almost anywhere within the page. Make sure that the __utmSetTrans() function is below the hidden form. Some common implementations include:

<body onLoad="javascript:__utmSetTrans();">

or

<body onLoad="javascript:myFunction();__utmSetTrans();">

or

<script type="text/javascript">
__utmSetTrans();
</script>
Once the __utmSetTrans() function has been added to the receipt page, e-commerce data should begin to populate the e-commerce reports. If it does not, there may be a problem with the installation.
Warning If you place the __utmSetTrans() function above the GATC then the transaction will not be recorded. The reason is that the __utmSetTrans() function is defined in the urchin.js file. If the browser has not yet loaded the urchin.js file, and you try to use a function from that file, then the browser will raise an error. To resolve this issue make sure the GATC appears above the __utmSetTrans() function.
Common E-Commerce Problems
E-commerce tracking is fairly straightforward and should work. However, problems do occur. Here are some of the most common issues and their solutions.
Garbled Data in E-commerce Reports
If the data in the Product performance reports (particularly the Product Overview and Product Categories reports) is garbled or contains extra characters then there may be an issue with the data in your hidden form. Verify that the transaction line and product lines only contain alphanumeric characters. Monetary data should only contain numbers and a period to separate dollars from cents. Do not include a dollar sign or commas.
All Transaction Sources are Your Website
This is a common problem for websites using a third party shopping cart. The problem is probably not the implementation of the e-commerce tracking code. It is more likely that the cross-domain tracking is incorrect. A simple way to identify this problem is to examine the data in the All Traffic Sources report. Use the E-Commerce tab to identify which traffic sources are driving revenue. If your website’s hostname is listed in the report then there is probably an issue with the cross-domain tracking.
Missing Transactions
It is not uncommon for some transactions to be missing from Google Analytics. This is usually caused by visitors navigating away from the receipt page before the data is transmitted. If the number of transactions in Google Analytics is off by more than 10% when compared to your accounting software then there may be a bigger problem. In general, Google Analytics should be tracking most transactions. If a significant number are missing then double check the implementation. Is the hidden form field present and in the correct format? Is the __utmSetTrans() function executing?
Tracking Third Party E-Commerce Platforms
E-commerce tracking may not be feasible for all third party shopping carts. If the shopping cart provider does not permit modifications to the receipt page then the standard implementation of e-commerce tracking will not work and you should try a workaround. Each shopping cart is different, so there is no single workaround that will solve every problem.
Google Analytics and Miva 5
https://www.sebenza.com/help/index.php?_m=knowledgebase&_a=viewarticle&kbarticleid=68
Google Analytics and osCommerce
http://www.oscommerce.com/community/contributions,3756
Yahoo! Store Transactions
As of July 2007 the standard e-commerce implementation process will not work with Yahoo! Stores. There are various companies that charge a monthly fee to integrate Yahoo! Store e-commerce data and Google Analytics data. Personally, I’ve used Monitus’ Web Analytics Connector (http://www.monitus.net/content/blogcategory/23/57). Their service connects your Yahoo! store data with your Google Analytics data. Monitus charges a monthly fee for this service based on the number of pageviews your site generates. If you’re using a Yahoo! store you should look into their service.
Custom Segmentation
Segmentation of data involves dividing the data based on some visitor characteristic. For example, a common segmentation is dividing the data into two groups, new and returning visitors. Grouping the data by new and returning visitors means you can examine your website traffic and identify what portion is generated from new visitors, what portion is generated from returning visitors and what those groups did while on the site. Google Analytics has a number of pre-defined segments, including:
• Visitor type (new and returning)
• Geo-location
• Language
• Browser Type
• Computer operating system
In addition to the default segments, you can configure Google Analytics to use a custom segment. This means traffic can be divided into various groups, using a characteristic defined by you, the site owner. Creating a custom segment involves adding an additional Google Analytics tracking cookie to the user’s computer. The cookie, named __utmv, is created using a JavaScript function named __utmSetVar().
__utmSetVar() can be added to a web page anywhere JavaScript can be added. This means that __utmSetVar() can be placed in many HTML attributes such as:
• onLoad
• onChange
• onSubmit
The value passed into the function will be stored in the __utmv cookie. When Google Analytics processes your profile data it creates a field named ‘User Defined Value’ and populates it with the value in the __utmv cookie. This field is just like any other field in Google Analytics. You can create filters using this field or view reports constructed from this field.
For example, say you have a contact form on your site that contains a drop-down box where the user chooses their gender. If you want to capture their answer, and segment your visitors by gender, call the __utmSetVar() function when the form is submitted. This creates the custom segment cookie and sets the user-chosen value as the custom segment. Here’s how the HTML code might look:
<form onSubmit="__utmSetVar(this.gender.options[this.gender.selectedIndex].value);">
<select name="gender">
<option value="Female">Female</option>
<option value="Male">Male</option>
</select>
</form>
Once the data is in Google Analytics there are a number of ways to view how your custom segments perform. The easiest way is using the Visitors > User Defined report, shown in Figure 36. This report shows a wealth of information for your custom segments including basic visitation data (visits, average pageviews per visit, etc.) and conversion data.
Figure 36: You can compare the performance of your custom segments using the User-Defined report.
Visitors that are not in a custom segment will not have a __utmv cookie. As a result they will appear in a single line item named ‘(not set)’. Another way to utilize the custom segment value is by segmenting existing reports using your custom segment. There is a ‘Segmentation’ drop-down box below the narrative description of many reports (Figure 37). Choose ‘User Defined Value’ from the list and the report you are viewing will be segmented based on your user-defined values.
Figure 37: Segmenting a report using the User Defined value.
Warning There is a limit to the number of custom segments that can be set. Google Analytics allows one custom segment cookie, therefore a visitor can only belong to a single custom segment. If __utmSetVar() is called multiple times, then only the last value will be retained. To learn more about how to use Custom Segmentation creatively see the Tips & Tricks section.
Tip One of the most useful ways to use custom segments is to identify people who have purchased from you in the past (customers) and people who have not purchased from you in the past (non-customers). It is extremely useful to analyze the habits and actions that differentiate customers from non-customers. To segment these two groups of visitors add the __utmSetVar() function to the onLoad event of the BODY tag on the receipt page.
<body onLoad="javascript:__utmSetVar('customer');">
The function can also be added in a block of JavaScript:
<script type="text/javascript">
__utmSetVar('customer');
</script>
When the above code executes the visitor will be placed in the ‘customer’ segment. There is no need to add non-customers to a custom segment, as they will be grouped in the ‘(not set)’ segment.
CRM Integration
The data stored in the Google Analytics tracking cookies can be used in other applications. After all, the cookies are standard first party cookies that can be accessed by JavaScript or server side application code. One popular way to use Google Analytics cookie data is with a CRM system.
Google Analytics stores marketing data in a cookie. This data can be extracted from the cookie and added to a lead generation form. When the visitor submits the form the marketing material that the visitor responded to is connected to other information that the individual provides (usually their name and other contact information). Knowing the marketing message that an individual responds to is a valuable piece of information for a sales team.
If the contact form is integrated with a CRM application, like SalesForce.com or NetSuite, it may be possible to store the marketing information with the individual’s contact information. Direct CRM integration depends on the CRM platform. Some systems allow form fields to be pulled directly into the application. Check with your CRM provider for information about your specific system. The technique described next has many different applications and should serve as a template for your implementation.
As I’ve discussed before, Google Analytics stores all visitor referral information (i.e. marketing information) in the __utmz cookie. The data can be extracted and manipulated using very simple JavaScript. The basic process to extract and use the data is as follows:
1. Extract marketing data, using JavaScript, from the __utmz cookie
2. Manipulate data as needed, using JavaScript
3. Place data in hidden form fields
When the visitor submits the form the data is passed back to the server where it can be manipulated by your CRM application or other server side code.
To simplify the implementation I use a function that exists in the urchin.js file. The function _uGC() can be used to extract data from the __utmz cookie. Using this function means that I do not need to write a function that extracts data from the tracking cookies. _uGC() takes three arguments:
• A string to search (target string)
• A start pattern
• An end pattern
Here’s how the function actually works. It searches the target string for the start pattern. Once the start pattern is found it returns all characters between the start pattern and the end pattern. The returned value can be stored in a variable for use in your code.
The sample HTML emulates the process I describe above; it extracts campaign information from the __utmz cookie and adds it to hidden form elements. I’ve also added some code that extracts the custom segment value, stored in the __utmv cookie, and adds it to the hidden form. When the visitor submits the form the values in the hidden elements are transmitted back to the server.
<html>
<head>
<script src="http://www.google-analytics.com/urchin.js" type="text/javascript"></script>
<script type="text/javascript">
_uacct = "XXXXXX-X";
urchinTracker();
</script>
<script type="text/javascript">
//
// Get the __utmz cookie value. This is the cookie that
// stores all campaign information.
//
var z = _uGC(document.cookie, '__utmz=', ';');
//
// The cookie has a number of name-value pairs.
// Each identifies an aspect of the campaign.
//
// utmcsr = campaign source
// utmcmd = campaign medium
// utmctr = campaign term (keyword)
// utmcct = campaign content (used for A/B testing)
// utmccn = campaign name
// utmgclid = unique identifier used when AdWords auto tagging is enabled
//
// This is very basic code. It separates the campaign-tracking cookie
// and populates a variable with each piece of campaign info.
//
var source = _uGC(z, 'utmcsr=', '|');
var medium = _uGC(z, 'utmcmd=', '|');
var term = _uGC(z, 'utmctr=', '|');
var content = _uGC(z, 'utmcct=', '|');
var campaign = _uGC(z, 'utmccn=', '|');
var gclid = _uGC(z, 'utmgclid=', '|');
//
// The gclid is ONLY present when auto tagging has been enabled.
// All other variables, except the term variable, will be '(not set)'.
// Because the gclid is only present for Google AdWords we can
// populate some other variables that would normally be left blank.
// _uGC() returns a dash when the start pattern is not found, so
// compare against the dash instead of testing for an empty value.
//
if (gclid != '-') {
  source = 'google';
  medium = 'cpc';
}
// Data from the custom segmentation cookie can also be passed
// back to your server via a hidden form field. The cookie value
// is a number, a period and then the custom segment value.
var csegment = _uGC(document.cookie, '__utmv=', ';');
var csmatch = csegment.match(/[0-9]*?\.(.*)/);
csegment = csmatch ? csmatch[1] : '';
function populateHiddenFields(f) {
  f.source.value = source;
  f.medium.value = medium;
  f.term.value = term;
  f.content.value = content;
  f.campaign.value = campaign;
  f.segment.value = csegment;
  return true;
}
</script>
</head>
<body>
<form name='contactform' onSubmit="javascript:populateHiddenFields(this);">
<input type='hidden' name='source' />
<input type='hidden' name='medium' />
<input type='hidden' name='term' />
<input type='hidden' name='content' />
<input type='hidden' name='campaign' />
<input type='hidden' name='segment' />
</form>
</body>
</html>
Note Extracting referral information from the Google Analytics tracking cookies does not violate the Google Analytics privacy policy. The information in the __utmz and __utmv cookies is not personally identifiable.
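If you would rather read the campaign values on the server, or simply want to experiment with the cookie format outside a browser, the same name-value pairs can be split apart without urchin.js. The following is a minimal sketch under stated assumptions: parseUtmz() is a hypothetical helper, the sample cookie string is made up, and the code assumes the campaign data begins at the first "utm" key and that the pairs are separated by pipes, as in the sample page above.

```javascript
// Hypothetical helper, not part of urchin.js: collect the campaign
// name-value pairs from a raw cookie string such as document.cookie.
function parseUtmz(cookieString) {
  var fields = {};
  var cookies = cookieString.split(/;\s*/);
  for (var i = 0; i < cookies.length; i++) {
    if (cookies[i].indexOf("__utmz=") !== 0) continue;
    var value = cookies[i].substring("__utmz=".length);
    // Skip the leading numbers; campaign data starts at the first "utm" key.
    var start = value.indexOf("utm");
    if (start === -1) break;
    var pairs = value.substring(start).split("|");
    for (var j = 0; j < pairs.length; j++) {
      var eq = pairs[j].indexOf("=");
      if (eq > 0) {
        fields[pairs[j].substring(0, eq)] = pairs[j].substring(eq + 1);
      }
    }
  }
  return fields;
}

// Made-up cookie string for illustration.
var sample = "__utma=1.2.3.4.5.6; " +
  "__utmz=1.1190000000.1.1.utmcsr=newsletter|utmccn=spring_sale|utmcmd=email";
var campaignData = parseUtmz(sample);
console.log(campaignData.utmcsr, campaignData.utmcmd, campaignData.utmccn);
```

Keys that are absent from the cookie are simply missing from the returned object, so the calling code can decide what default to store in the CRM.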
Tips & Tricks
Tracking Visitor Clicks, Outbound Links and Non-HTML Files
As discussed in Section 1, urchinTracker() creates pageviews. Because urchinTracker() is standard JavaScript it can be added to any HTML event handler and thus executed whenever a visitor performs an action. Therefore, almost any visitor action can be captured as a pageview within Google Analytics.
The simple implementation for tracking visitor actions, or clicks, involves adding the urchinTracker() function to an HTML tag. For example, to track a visitor click on an image just add urchinTracker() to the onClick event of that element.
<img src="/image.jpg" onClick="javascript:urchinTracker('/image.jpg');" />
When a visitor clicks on the above image a pageview will be created for /image.jpg. This exact method can be used to track non-HTML files.
<a href="/schedule.pdf" onClick="javascript:urchinTracker('/files/pdf/schedule.pdf');">PDF</a>
Tip When creating pageviews for non-HTML files try to use a consistent naming convention. This will make it easier to identify them in the reporting interface. For example, you may want to create a virtual directory structure using urchinTracker(). In the previous code example I added /files/ to the value passed to urchinTracker(). This makes it easy to identify the non-HTML files in the reports.
Outbound links are tracked in the same manner:
<a href="http://www.lunametrics.com" onClick="javascript:urchinTracker('/outbound/'+this.href);">www.lunametrics.com</a>
The above outbound link will appear as /outbound/http://www.lunametrics.com in the reports.
Warning Clicks on outbound links are not ‘real’ pageviews. If you need an accurate count of the number of pageviews your website generates, then make sure you filter out any clicks on outbound links. An exclude filter, using the Request URI and a filter pattern that matches your outbound link structure, will do the job.
There is an easier way to track outbound links and non-HTML files. Create a simple DOM script, like the one below, to automatically apply the urchinTracker() function to links at the moment a visitor clicks.
<html>
<head></head>
<body>
<h3>A simple page to demonstrate how a DOM link tracker works.</h3>
<a target="new" href="http://www.google.com">google</a><br />
<script src="http://www.google-analytics.com/urchin.js" type="text/javascript"></script>
<script type="text/javascript">
//
// By default, this script will track links to the following file
// types: .doc, .xls, .exe, .zip and .pdf
//
var fileTypes = [".doc", ".xls", ".exe", ".zip", ".pdf"];
//
// This is a debug mode flag. Change to '' for production. In debug mode
// the script will display an alert box and skip sending data to Google
// Analytics.
//
var debug = '1';
//
// This variable controls how outbound links will appear in
// the GA reports. By default, external links will appear as
// '/outbound/<URL>' where URL is the URL in the anchor tag.
//
var extIdentifier = '/outbound/';
/// No need to change anything below this line ///
// Returns true if the path contains one of the tracked file extensions.
function isTrackedFile(path) {
  for (var i = 0; i < fileTypes.length; i++) {
    if (path.indexOf(fileTypes[i]) != -1) return true;
  }
  return false;
}
if (document.getElementsByTagName) {
  // Initialize external link handlers
  var hrefs = document.getElementsByTagName('a');
  for (var l = 0; l < hrefs.length; l++) {
    // Anchor properties: protocol, host, hostname, port, pathname, search, hash
    // Compare hostname with hostname; location.host would include the port.
    if (hrefs[l].hostname == location.hostname) {
      if (isTrackedFile(hrefs[l].pathname)) {
        startListening(hrefs[l], "click", trackDocuments);
      }
    } else {
      startListening(hrefs[l], "click", trackExternalLinks);
    }
  }
}
function startListening(obj, evnt, func) {
  if (obj.addEventListener) {
    obj.addEventListener(evnt, func, false);
  } else if (obj.attachEvent) {
    obj.attachEvent("on" + evnt, func);
  }
}
function trackDocuments(evnt) {
  var url = (evnt.srcElement) ? "/" + evnt.srcElement.pathname : this.pathname;
  if (typeof(urchinTracker) == "function") {
    if (!debug) {
      urchinTracker(url);
    } else {
      alert(url);
      return false;
    }
  }
}
function trackExternalLinks(evnt) {
  var lnk;
  if (evnt.srcElement) {
    // Walk up from the clicked element to the enclosing anchor tag.
    var elmnt = evnt.srcElement;
    while (elmnt.tagName != "A") {
      elmnt = elmnt.parentNode;
    }
    lnk = extIdentifier + elmnt.hostname + "/" + elmnt.pathname + elmnt.search;
  } else {
    lnk = extIdentifier + this.hostname + this.pathname + this.search;
  }
  if (typeof(urchinTracker) == "function") {
    if (!debug) {
      urchinTracker(lnk);
    } else {
      alert(lnk);
      return false;
    }
  }
}
</script>
</body>
</html>
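The decision the DOM script makes for each link (outbound link, tracked file, or ordinary page) can also be written as a small function that takes plain values instead of DOM elements, which makes it easy to try outside a browser. This is a sketch only: virtualPageview() and the constant names are hypothetical, and unlike the script above it compares the end of the path against each extension rather than searching the whole path.

```javascript
// Hypothetical helper for illustration; it is not part of urchin.js.
var FILE_TYPES = [".doc", ".xls", ".exe", ".zip", ".pdf"];
var EXT_IDENTIFIER = "/outbound/";

// link is a plain object with hostname, pathname and search properties,
// mirroring the properties of an anchor element.
// Returns the value that would be passed to urchinTracker(), or null
// when the click should not create an additional pageview.
function virtualPageview(link, siteHostname) {
  if (link.hostname !== siteHostname) {
    return EXT_IDENTIFIER + link.hostname + link.pathname + link.search;
  }
  var path = link.pathname.toLowerCase();
  for (var i = 0; i < FILE_TYPES.length; i++) {
    var ext = FILE_TYPES[i];
    // Compare the end of the path so "/report.pdf.html" is not treated as a PDF.
    if (path.slice(-ext.length) === ext) return link.pathname;
  }
  return null;
}

// Made-up links on a made-up site.
var site = "www.example.com";
console.log(virtualPageview({ hostname: "www.google.com", pathname: "/", search: "" }, site));
console.log(virtualPageview({ hostname: site, pathname: "/files/schedule.pdf", search: "" }, site));
console.log(virtualPageview({ hostname: site, pathname: "/about.html", search: "" }, site));
```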
Recommended Profiles & Filters
There are some filters and profiles that you should be using regardless of your business model. Each can help create more refined sets of data that can aid in analysis. Remember, you cannot create a new profile and reprocess historical data. So it’s best to create these profiles during the initial setup, even if you don’t need them. As you use Google Analytics you will become a better analyst, your data needs will change and these profiles will become useful.
Filters
Exclude Internal Traffic Filter
Any profile that analyzes visitor data should include a filter to exclude internal company traffic. Traffic generated by you or your employees dilutes true visitor data. A simple predefined filter (Exclude all traffic from an IP address), shown in Figure 38, will remove internally generated traffic.
Figure 38: A predefined filter to exclude internally generated traffic from an IP range.
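When you type an IP address into the filter, it helps to remember that a period has a special meaning in a regular expression. The sketch below assumes the filter pattern field is interpreted as a regular expression; ipToFilterPattern() is a hypothetical helper and the addresses are made up. It shows why the dots should be escaped.

```javascript
// Hypothetical helper: escape the dots in an IP address (or the leading
// part of one) so it can be used as a regular-expression filter pattern.
function ipToFilterPattern(ip, exact) {
  var pattern = "^" + ip.replace(/\./g, "\\.");
  // For a single address, anchor the end as well.
  return exact ? pattern + "$" : pattern;
}

var rangePattern = ipToFilterPattern("192.168.1.", false);
var range = new RegExp(rangePattern);
var naive = new RegExp("^192.168.1.");

console.log(rangePattern);                 // ^192\.168\.1\.
console.log(range.test("192.168.1.25"));   // true
console.log(range.test("192.168.10.25"));  // false
console.log(naive.test("192.168.10.25"));  // true, an unescaped dot matches any character
```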
Remember, you may have external contractors or other workers who do not have the same IP address as your employees. In this case, set up additional filters for each IP address. If your company or contractors do not have a static IP address, which is normally the case for those that use cable modems or DSL connections, try using the technique I describe in the ‘Excluding Yourself from Google Analytics Data’ section.
Include Valid Traffic Filter
The Google Analytics tracking code is plain text and visitors can copy it by viewing your website HTML. A visitor could copy your tracking code and install it on a different website. This would artificially inflate the data within your profiles. To prevent this use a simple include filter (Figure 39) based on your website hostname.
Figure 39: Include filter that ensures valid website traffic in a profile.
Data coming from the perpetrator’s site will carry a different hostname, so this filter will remove it from the profile.
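To see the effect of such an include filter, it can be simulated on a handful of hits. This is a sketch under assumptions: the hits and hostnames are invented, both helper names are hypothetical, and the pattern shown is just one way to match a domain together with its subdomains.

```javascript
// Hypothetical helper: a pattern that matches a domain and its subdomains.
function hostnameFilterPattern(domain) {
  return "(^|\\.)" + domain.replace(/\./g, "\\.") + "$";
}

// Keep only the hits whose hostname matches the pattern, the way an
// include filter would.
function applyIncludeFilter(hits, pattern) {
  var re = new RegExp(pattern);
  var kept = [];
  for (var i = 0; i < hits.length; i++) {
    if (re.test(hits[i].hostname)) kept.push(hits[i]);
  }
  return kept;
}

// Made-up hits; the last one comes from a site that copied the tracking code.
var hits = [
  { hostname: "www.example.com", page: "/" },
  { hostname: "shop.example.com", page: "/cart" },
  { hostname: "copycat-site.net", page: "/" }
];
var valid = applyIncludeFilter(hits, hostnameFilterPattern("example.com"));
console.log(valid.length); // 2
```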
Note While it is highly unlikely that this will happen, it is best to ensure the quality of your data using this filter. If you do not wish to use this filter, and your data seems abnormally high, check the Visitors > Network Properties > Hostnames report. If any inappropriate hostnames are listed then it may be that your tracking code was hijacked and installed on an unrelated website.
Profiles
Test Profile
I recommend all users create at least one test profile within their Google Analytics account. Use this profile to test goal settings, profile settings and filters before you apply them to a ‘master’ profile. Remember, an incorrect profile setting will forever change the data in your Google Analytics reports. Take the time to test all of your settings before applying changes to a production set of data.
Master Profile
The Master profile should be a profile that is not altered or changed by any employee. The data within this profile should be refined as much as possible using filters & profile settings and, once the data is considered accurate, the profile should not change unless absolutely necessary. If any changes must be made to a master profile, test them first using a test profile.
Campaign/Source/Medium Profiles
There are some reports in Google Analytics, like the Funnel Visualization report, which cannot be segmented. But it is often helpful to segment these reports to identify if a particular segment of traffic is causing a problem. Maybe paid search traffic navigates the funnel more successfully than organic traffic? It is possible to answer this question, and others like it, by creating profiles for specific campaigns, sources or mediums. By applying a filter to a profile, like the one in Figure 40, you are segmenting the data in every report within the profile.
Figure 40: Include filter for a specific medium.
Custom Segmentation Hacks
The custom segmentation framework is very flexible and can be used in a number of ways. The reason the framework is so flexible is that it can store almost any data. Remember, when you call __utmSetVar() you simply pass it a string. It does not matter what that string is. Whatever you pass it is stored in the __utmv cookie. This means that you can place a lot of data in the custom segment cookie. An ideal use for this is to store multiple custom segments in the custom segment cookie.
For example, let’s say I want to segment my visitors using two custom segments, gender and age. To do this I can concatenate the values for the segments as one string. Then I can pass that string to __utmSetVar(). I do this using a small JavaScript function. It takes the gender data and age data from a form, combines it into a single string, and then sets the custom segment cookie. Here’s how the code and HTML might look:
<script type="text/javascript">
function setSegment(f) {
  var s1 = f.gender.options[f.gender.selectedIndex].value;
  var s2 = f.age.options[f.age.selectedIndex].value;
  __utmSetVar(s1 + ':' + s2);
  return true;
}
</script>
<form onSubmit="javascript:setSegment(this);">
<select name="gender">
<option value="Female">Female</option>
<option value="Male">Male</option>
</select>
<select name="age">
<option value="18-24">18-24</option>
<option value="25-35">25-35</option>
<option value="35+">35+</option>
</select>
</form>
The resulting data appears as ‘Male:18-24’ in the User Defined report.
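If you export the User Defined report for further analysis, the combined value needs to be split apart again. The sketch below shows one possible pair of helpers for combining and splitting the values. The function names are hypothetical, and the colon separator simply follows the example above; any character that cannot appear in a segment value would do.

```javascript
// Hypothetical helpers; the ':' separator follows the example above.
var SEPARATOR = ":";

// Join several segment values into the single string passed to __utmSetVar().
function packSegments(values) {
  for (var i = 0; i < values.length; i++) {
    if (String(values[i]).indexOf(SEPARATOR) !== -1) {
      throw new Error("Segment values must not contain '" + SEPARATOR + "'");
    }
  }
  return values.join(SEPARATOR);
}

// Split a combined value back into named segments.
function unpackSegments(packed, names) {
  var parts = packed.split(SEPARATOR);
  var result = {};
  for (var i = 0; i < names.length; i++) {
    result[names[i]] = parts[i] !== undefined ? parts[i] : "(not set)";
  }
  return result;
}

var packed = packSegments(["Male", "18-24"]);
var segments = unpackSegments(packed, ["gender", "age"]);
console.log(packed);                        // Male:18-24
console.log(segments.gender, segments.age); // Male 18-24
```

Rejecting values that contain the separator keeps the combined string unambiguous when it is split later.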
Warning The Google Analytics Privacy Policy forbids the storage of any personally identifiable information in the Google Analytics system. As a result, you cannot store any personal visitor information in the custom segment cookie. This includes:
• Username
• Email address
• Name
Excluding Yourself from Google Analytics Data
As mentioned earlier, you should remove all website traffic generated by you or your employees from your Google Analytics profiles. You can do this by using an exclude filter based on the Visitor IP Address field. While this is effective, the results can be a bit ‘broad’. You may not want to remove all of the traffic from an IP address.
The solution is to create an exclude filter based on another filter field, more specifically the User Defined field. This method isn’t as broad as an exclude filter based on Visitor IP Address. This method uses a filter based on a cookie, which is specific to a computer. With this method you can eliminate all of the traffic from an individual computer without affecting data created by others.
To set this up, first add the custom segment cookie to all of the computers you want to exclude. Do this by using the simple HTML page shown after Figure 41. Place the page on your server and direct all your employees to fill out the form. Make sure everyone enters the same value into the form. The second and final step in the process is to create a filter that excludes website traffic based on the User Defined field. An example filter is shown in Figure 41. If you entered ‘internal-employee’ in the form then use ‘internal-employee’ as the value in the filter pattern field.
Figure 41: Exclude filter to remove a custom segment.
<html>
<head>
<script src="http://www.google-analytics.com/urchin.js" type="text/javascript"></script>
<script type="text/javascript">
///////////////////////////////////////////////
// Set Segment
//
function setSegment(f) {
  if (f.cs.value) {
    urchinTracker();
    __utmSetVar(f.cs.value);
    alert("Custom Segment set with value of '" + f.cs.value + "'.");
    alert("Don't forget to create an 'exclude' filter!");
  }
  // The cookie is set in the browser, so the form does not need
  // to be posted back to the server.
  return false;
}
</script>
</head>
<body>
<h2>Set Custom Segment Cookie Script</h2>
<p>Use the form below to set the Google Analytics custom segment cookie. Whatever value is entered into the form will be stored on this computer in the __utmv cookie.</p>
<form action="" onSubmit="return setSegment(this);" method="post">
<input type="text" name="cs" size="30" maxlength="30" />
<input type="submit" value="Set Custom Segment Cookie" />
</form>
</body>
</html>
Keep in mind that the Google Analytics cookies are first party cookies, which are specific to a domain. So, if you want to use this script for multiple websites you need to repeat this process for each website.
Tip If you’re using Firefox, there is a quick way to set a custom segment directly from the browser. You don’t need to use an HTML form. First, navigate to the website whose domain matches the Google Analytics profile you want to be excluded from. Then type the following in the location bar of Firefox:
javascript:__utmSetVar('foo');
Replace ‘foo’ with the value that you specified in your exclude filter. Press enter and Firefox will execute the JavaScript and set the custom segment cookie.
If you are already using custom segmentation on your website then you should change how the custom segment is set for your visitors. The reason is that if your employees use the website there is the chance that they may erase their custom segment (which excludes them from Google Analytics) by accidentally executing the code intended for your visitors. A small change to the way you call the __utmSetVar() function protects your employees and your data quality. The code change checks if the __utmv cookie currently exists. If it does exist then __utmSetVar() will not execute.
if (!document.cookie.match('__utmv')) __utmSetVar('s');
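The same guard can be written as a small function that checks for the cookie by name rather than searching the whole cookie string. This is only a sketch: hasCookie() and setSegmentOnce() are hypothetical names, and the function that sets the segment is passed in as an argument so the example can run outside a browser. On a real page you would pass document.cookie and __utmSetVar.

```javascript
// Hypothetical helper: true if a cookie with this exact name is present.
function hasCookie(cookieString, name) {
  var cookies = cookieString.split(/;\s*/);
  for (var i = 0; i < cookies.length; i++) {
    if (cookies[i].indexOf(name + "=") === 0) return true;
  }
  return false;
}

// Set the custom segment only when no __utmv cookie exists yet.
// setter stands in for __utmSetVar so the sketch can run anywhere.
function setSegmentOnce(cookieString, value, setter) {
  if (hasCookie(cookieString, "__utmv")) return false;
  setter(value);
  return true;
}

// Made-up cookie strings for illustration.
var calls = [];
function record(value) { calls.push(value); }
console.log(setSegmentOnce("__utma=1; __utmv=123.internal-employee", "s", record)); // false
console.log(setSegmentOnce("__utma=1", "s", record));                               // true
```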
Creating an Implementation Plan
Implementing Google Analytics is not a complicated process. However, it does take some planning and foresight. The Google Analytics support documentation does contain a rough implementation guide that includes the various steps to get Google Analytics installed and running. I have modified that process as follows:
1. Create a Google Analytics account
2. Analyze the website
3. Create and Configure profiles
   • Create filters
   • Create goals and funnels
   • Create recommended profiles
4. Edit the tracking code
5. Modify the website (if necessary, determined in step 2)
6. Add the tracking code to website pages
7. Tag marketing campaigns
8. Enable e-commerce transaction tracking (optional)
9. Implement Custom Segmentation (optional)
10. Configure other administrative features
   • User accounts and report access
   • Automated email report delivery
The critical step in the process is step 2: Analyze the Website. During this step you should identify any aspect of the website that may require modifications to the tracking code, the website HTML or the profile. Ask questions like:
• Does the website have multiple domains?
• Does the website have multiple sub domains?
• Is the website dynamic?
• What are the goals of the website?
The answers to these questions drive the modifications you need to make in steps 3, 4, and 5.
Remember, once the tracking code has been added to the website data will begin to populate the profiles. If a setting is incorrect or missing then the resulting data will be inaccurate. It is a good idea to plan ahead. If you anticipate any special data needs, like a funnel segmented by a specific campaign, create the necessary profile as soon as possible.
To some extent, the process is iterative. Don’t expect to get it right the first time. Once the tracking code has been installed, and you have some data in the reports, check the data. Does it make sense? Should you add another filter to exclude additional data? It may be that you need to add an additional filter or change a profile setting to refine the data.
The key to a successful implementation is to take a structured approach, take your time and document everything you do.
Keeping Track of Your Configuration Changes
There are many things that can change the data in your Google Analytics account: new marketing campaigns, website changes, etc. For example, if you launch a new CPC campaign your website traffic will probably increase. Understanding why the data changes is the reason we do web analytics. It is imperative that you know if any non-business forces are affecting your data. Specifically, did a modification to your Google Analytics settings cause the data to change?
To help answer this question I recommend you use a simple ‘change log’ to track the changes made to your Google Analytics settings. Create a very basic spreadsheet and record every modification that you, or other employees, make to a profile configuration. The change log does not need to be complicated. Start with the following columns:
• Date
• Profile name
• Name of person who made the change
• Description of the change
It may also be useful to create a business activity log to record major business events that could affect the data. Such activities might be when you launch, or stop, an AdWords campaign or send an email blast to your email list. A spreadsheet is an easy way to determine if a business decision changed site traffic.
Troubleshooting Data Accuracy
In general, Google Analytics data should be fairly accurate when compared to other systems that use a similar tracking method. If you do engage in a data comparison process make sure the other applications are using JavaScript and first party cookies to track visitors. If they use a different tracking method then it is likely that the data will be different.
There are a few simple ways to validate that the data in your Google Analytics profiles is accurate. Check your reports for a line item named ‘(other)’. Specifically, check the Top Content report. ‘(other)’ indicates that you have filled the profile database. This means there is probably an issue with the profile configuration. The usual cause is a dynamic website passing a session identifier in the query string. Make sure the profile has been configured to remove the appropriate query string parameters (see the section on Dynamic Websites located in the Common Website Configurations section).
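To see why a session identifier can fill a profile, consider what happens to three views of the same product page. Google Analytics removes query string parameters through a profile setting, not through code; the sketch below only illustrates the effect. The stripQueryParams() helper, the sid parameter name and the sample URLs are all made up.

```javascript
// Hypothetical helper for illustration only: remove the named
// parameters from the query string of a request URI.
function stripQueryParams(uri, paramsToRemove) {
  var q = uri.indexOf("?");
  if (q === -1) return uri;
  var path = uri.substring(0, q);
  var pairs = uri.substring(q + 1).split("&");
  var kept = [];
  for (var i = 0; i < pairs.length; i++) {
    var name = pairs[i].split("=")[0];
    if (paramsToRemove.indexOf(name) === -1) kept.push(pairs[i]);
  }
  return kept.length ? path + "?" + kept.join("&") : path;
}

// Three views of the same page, each with a different session identifier.
var requests = [
  "/product.php?id=7&sid=a81f",
  "/product.php?id=7&sid=c20e",
  "/product.php?id=7&sid=9d44"
];
var uniquePages = {};
for (var r = 0; r < requests.length; r++) {
  uniquePages[stripQueryParams(requests[r], ["sid"])] = true;
}
var uniqueCount = Object.keys(uniquePages).length;
console.log(uniqueCount); // 1
```

Without the clean-up, each of these requests would be recorded as a separate page, and on a busy site the number of distinct pages grows with every visit.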
If you have an e-commerce website it is a good idea to check the number of transactions in Google Analytics against the number of transactions in your accounting software. As with all Google Analytics data, the numbers should be fairly accurate. However, don’t be surprised if a few transactions are missing. This is normally due to visitors navigating away from the receipt page before the e-commerce tracking code has a chance to execute.
If you believe the goal tracking for your profile is incorrect, check the Top Content report to validate the pageviews for your Goal URL. Find your Goal URL in the Top Content report and identify the number of unique pageviews. Does this match the Goal count? If the Top Content report lists more unique pageviews for the goal URL than there are goal conversions, then there is a problem with the way the goal is defined.
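Both checks are simple arithmetic and could be scripted as part of a monthly review. The sketch below uses hypothetical function names and made-up counts. The 10% default is the rule of thumb from the Missing Transactions discussion, not a limit enforced by Google Analytics.

```javascript
// Hypothetical helpers for a periodic data check.

// Share of accounting transactions that differ from Google Analytics.
function transactionGap(analyticsCount, accountingCount) {
  if (accountingCount === 0) return 0;
  return Math.abs(accountingCount - analyticsCount) / accountingCount;
}

// True when the gap is larger than the threshold (10% unless specified).
function needsInvestigation(analyticsCount, accountingCount, threshold) {
  var limit = threshold === undefined ? 0.10 : threshold;
  return transactionGap(analyticsCount, accountingCount) > limit;
}

// True when the goal URL has more unique pageviews than recorded goals.
function goalLooksMisdefined(uniquePageviews, goalConversions) {
  return uniquePageviews > goalConversions;
}

// Made-up counts.
console.log(needsInvestigation(95, 100));   // false
console.log(needsInvestigation(80, 100));   // true
console.log(goalLooksMisdefined(120, 100)); // true
```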
Reference
Google Analytics Functions
Below is a list of common Google Analytics functions. Many of these are explained at various points in this ShortCut.
urchinTracker(string)
This is the main Google Analytics function. It collects visitor information, sets the tracking cookies and sends the data to Google Analytics. More information about urchinTracker() can be found in Section 1: About Google Analytics.
__utmLinker(url)
The __utmLinker() function is used in cross domain tracking. It appends the Google Analytics tracking cookies to the end of the URL passed to the function and then forwards the browser to that URL. See Common Website Configurations, Tracking Across Multiple Domains for usage.
__utmLinkPost(url)
Similar to the __utmLinker() function, __utmLinkPost() is used to transfer Google Analytics tracking cookies from one domain to another by adding the tracking cookies to the value in the form action. See Common Website Configurations, Tracking Across Multiple Domains for usage.
__utmSetTrans()
This function sends e-commerce data to the Google Analytics server. First, it examines the HTML and looks for the hidden form field. Once found, it parses the data and sends it to Google Analytics. Each transaction line and item line in the hidden form field is sent to Google Analytics as a request for an invisible GIF file. See the section on E-commerce for more information.
_uGC(string, start-pattern, end-pattern)
Technically, the _uGC() function will parse any string and look for a start pattern. If the start pattern is found then the function will capture all characters until the end pattern is found and return the characters. If the start pattern is not found then the function will return a dash. Because website cookies are plain strings, this function is very useful for extracting cookie values.
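The behavior described for _uGC() can be restated in a few lines of JavaScript. This is only a sketch of that description, not the urchin.js source: the name extractBetween is hypothetical, the cookie string is made up, and capturing to the end of the string when the end pattern is missing is my assumption.

```javascript
// Illustrative restatement of the behavior described for _uGC();
// this is NOT the code from urchin.js.
function extractBetween(text, startPattern, endPattern) {
  const start = text.indexOf(startPattern);
  if (start === -1) return "-"; // start pattern not found: return a dash
  const from = start + startPattern.length;
  const end = text.indexOf(endPattern, from);
  // Assumption: if the end pattern never appears, capture to the end of the string.
  return end === -1 ? text.substring(from) : text.substring(from, end);
}

// Made-up cookie string, for demonstration only.
const cookies = "theme=dark; __utmz=12345.utmcsr=google|utmcmd=organic; lang=en";

console.log(extractBetween(cookies, "__utmz=", ";"));  // value of the __utmz cookie
console.log(extractBetween(cookies, "missing=", ";")); // "-"
```

In a browser you would pass document.cookie as the first argument instead of a hard-coded string.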
Tracking Code Variables
There are many JavaScript variables in the urchin.js file that can be modified. Changing these variables usually alters how urchinTracker() functions. In most cases, it is not necessary to change any variable. However, understanding what each variable does can help expand the usage of Google Analytics. To change a variable, simply add that variable to the GATC before the call to urchinTracker(). Here’s an example that sets the Google Analytics account number:
<script src="http://www.google-analytics.com/urchin.js" type="text/javascript">
</script>
<script type="text/javascript">
_uacct = "UA-XXXXX-X";
urchinTracker();
</script>
The list of available variables is as follows:
_uacct="UA-XXXXX-Y"
This is the Google Analytics account number. The format of the value is "UA-XXXXX-Y" where XXXXX is the account number and Y is the profile number. Each profile that is created for a new domain is assigned a new profile number. Profiles that are created using an existing domain have the same profile number.
_udn="auto" or "none" or "domain.com"
The _udn variable sets the domain for the Google Analytics tracking cookies. By default, the value of _udn is "auto", which forces Google Analytics to use the sub domain (if present) and primary domain as the domain for the tracking cookies. If the value is set to "none" then Google Analytics will use the entire domain as the cookie domain. Setting _udn to a specific domain name sets the cookie domain to that value. This variable is primarily used when tracking visitors across multiple domains or sub domains. See the section Common Website Configurations: Cross Domain Tracking or Sub Domain Tracking for implementation information.
_ulink=1 or 0
_ulink is a flag used to identify implementations that span multiple domains. This variable should only be used when tracking a visitor across multiple domains. See the section Common Website Configurations: Cross Domain Tracking for more information.
_uff=1 or 0
This variable identifies whether urchinTracker() has been executed. Once urchinTracker() has been executed, it can only be executed again on the same page if either the function is called with a value, e.g. urchinTracker('index'), or _uff is reset to 0. Using this variable, visitation data can be tracked in multiple Google Analytics accounts. To do so, modify the tracking code as follows:
<script src="http://www.google-analytics.com/urchin.js" type="text/javascript">
</script>
<script type="text/javascript">
_uacct = "UA-YYYYY-1";
urchinTracker();
_uff = 0;
_uacct = "UA-XXXXX-1";
urchinTracker();
</script>
Warning
If you’re using _uff on an e-commerce website be conscious of where the e-commerce data is sent. If you are using _uff, and sending pageview data to multiple accounts, you may want to send the e-commerce data to multiple accounts as well. This means calling __utmSetTrans() multiple times.
<script type="text/javascript">
_uacct = "UA-YYYYY-1";
urchinTracker();
__utmSetTrans();
_uff = 0;
_uacct = "UA-XXXXX-1";
urchinTracker();
__utmSetTrans();
</script>
_uOsr[] and _uOkw[]
By default, Google Analytics will track 28 search engines. The list of search engines is maintained by the GA team and can be found in the urchin.js file. Every now and then the GA team will add a new search engine to the list. You can modify this list to include additional search engines. This is particularly helpful if there is a special search engine for your industry that you would like to identify in Google Analytics. To add an additional search engine you must identify two things: the name of the search engine and how it passes the search term to your website. Once this information has been identified, add the following code to your GATC:
_uOsr[100]="search-engine-name";_uOkw[100]="search-parameter";
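To make the idea of two parallel arrays concrete, here is a simplified sketch of how a referring URL might be matched against them. It is not the urchin.js source. The arrays engines and keywords stand in for _uOsr and _uOkw, the entry at index 100 ("examplesearch" with the parameter "query") is hypothetical, and the matching rule is my own simplification.

```javascript
// Simplified illustration of the parallel-array idea; NOT the urchin.js source.
const engines = [];  // plays the role of _uOsr (search engine names)
const keywords = []; // plays the role of _uOkw (query string parameters)

engines[0] = "google";
keywords[0] = "q";
engines[1] = "yahoo";
keywords[1] = "p";

// Hypothetical custom engine, added at index 100 like the line above.
engines[100] = "examplesearch";
keywords[100] = "query";

function identifySearch(referrer) {
  const url = new URL(referrer);
  for (let i = 0; i < engines.length; i++) {
    if (engines[i] === undefined) continue; // skip unused slots in the sparse array
    if (url.hostname.includes(engines[i] + ".")) {
      const term = url.searchParams.get(keywords[i]);
      if (term !== null) return { engine: engines[i], keyword: term };
    }
  }
  return null; // not a recognized search engine referral
}

console.log(identifySearch("http://www.examplesearch.com/find?query=blue+widgets"));
console.log(identifySearch("http://example.org/some-page"));
```

The same index is used in both arrays, so the name at position 100 is always paired with the parameter at position 100.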
The first variable, _uOsr, is an array that stores the names of the search engines. The second variable, _uOkw, is an array that stores the query string parameters used to identify the search term. These arrays can grow over time, which is why I started at index 100: if the Google Analytics team adds additional search engines to urchin.js, they will not overwrite your changes.
_uhash="on"
The _uhash variable controls a domain hash placed in the tracking cookies. This variable should only be used when implementing Google Analytics on a website that has multiple domains and multiple sub domains. This topic is covered in Common Website Configurations: Tracking Across Multiple Domains with Multiple Sub Domains.
_usample=1-100
This variable controls the percentage of website traffic that Google Analytics will track. It is used for data sampling. If a website is filling the Google Analytics database, the volume of traffic tracked can be reduced with this variable. Simply set the variable to a value between 1 (for 1%) and 100 (for 100%). It is very rare to implement traffic sampling for a website, but it can be done as follows:
<script src="http://www.google-analytics.com/urchin.js" type="text/javascript">
</script>
<script type="text/javascript">
_uacct = "UA-YYYYY-1";
_usample=71;
urchinTracker();
</script>
Identifying the value for _usample is an iterative process that may take some time. The only way to verify that your setting is working is by changing the value and checking the data in your reports.
_utimeout=1800
This is the timeout length for a session, in seconds. If a visitor is inactive for this amount of time, the session will be terminated. By default, Google Analytics sessions time out after 1800 seconds, or 30 minutes. 30 minutes is an industry standard and should only be changed if you have a specific business need.
_ucto=15768000
_ucto is how long the campaign tracking cookie (__utmz) will persist on the visitor’s machine. By default, the cookie will last 6 months. To change the timeout value, add the _ucto variable to the GATC and specify when the cookie should expire in seconds (not months). Here’s how to set the cookie to time out in 30 days:
<script type="text/javascript">
_uacct = "UA-XXXXX-X";
_ucto = 2592000;
urchinTracker();
</script>
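Because both _utimeout and _ucto are expressed in seconds, it is easy to make an arithmetic slip. The sketch below shows the conversions; the helper names are hypothetical, and the exact comparison Google Analytics uses to end a session is an assumption on my part.

```javascript
const SECONDS_PER_DAY = 24 * 60 * 60; // 86400

// Convert a number of days into the seconds value that _ucto expects.
function daysToSeconds(days) {
  return days * SECONDS_PER_DAY;
}

// Hypothetical helper: has a session gone stale, given timestamps in seconds?
// Assumption: a session ends once the idle time exceeds the timeout.
function sessionExpired(lastActivity, now, timeout = 1800) {
  return now - lastActivity > timeout;
}

console.log(daysToSeconds(30));          // 2592000, the 30-day value used above
console.log(15768000 / SECONDS_PER_DAY); // 182.5 days, roughly six months
console.log(sessionExpired(0, 1801));    // true: more than 30 minutes idle
```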
Regular Expressions
A regular expression (reg ex) contains a mix of regular characters (like letters and numbers) and special characters, to form a pattern. The pattern is applied to a piece of data and if the pattern matches then the regular expression returns a positive result. Many regular expressions include regular alphanumeric characters. For example, you may have a list of keywords and you need to identify those keywords that contain ‘goo’. These three characters are a valid regular expression. Google Analytics will apply ‘goo’ to the target data, in this case the keywords. If ‘goo’ matches any part of the data then the reg ex will return a positive result. Table 12 shows some simple patterns and explains what type of data each will match.
Table 12: Regular expressions that only contain alphanumeric characters
Pattern   Description                Example Matches
go        Match the characters go    google, go, merry-go-round, golf
bos       Match the characters bos   boss, boston, my boss, emboss
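The patterns in Table 12 are easy to try out in JavaScript. This sketch assumes JavaScript's built-in RegExp treats these simple patterns the same way Google Analytics does, which should hold for plain alphanumeric characters; the keyword list is made up.

```javascript
// Plain alphanumeric patterns match anywhere inside the target text.
const keywords = ["google", "merry-go-round", "boston", "yahoo"];

const containsGo = keywords.filter((k) => /go/.test(k));
const containsBos = keywords.filter((k) => /bos/.test(k));

console.log(containsGo);  // keywords containing "go"
console.log(containsBos); // keywords containing "bos"
```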
As illustrated in Table 12, a regular expression does not need to be a complicated mix of special characters. However, it’s the special characters that make regular expressions powerful and flexible. There are four basic types of special characters in regular expressions: wildcards, quantifiers, operators and anchors.
Note
Google Analytics uses Perl Compatible Regular Expressions (PCRE). Any special characters that are specific to Perl Compatible Regular Expressions will work in Google Analytics.
Wildcards
Wildcards are used to indicate what type of characters the regular expression should match. When you type the letter ‘g’ the regular expression looks for the letter ‘g’. But what if you need to match any character, not just the letter ‘g’? This is where wildcards come in. The most common wildcards are shown in Table 13.
Table 13: Basic regular expression wildcards
Pattern   Description                                                        Example    Will Match
.         Match any character                                                a.c        abc, aec, adc, a3c
[]        Match one item in the list of characters within the brackets       a[bB]c     abc, aBc
[^ ]      Match one item NOT in the list of characters within the brackets   a[^bB]c    adc, a3c
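Here is a short sketch that runs the three wildcard examples from Table 13 against a handful of sample strings. As before, JavaScript's RegExp is used as a stand-in for the engine in Google Analytics, and the sample strings are my own.

```javascript
const samples = ["abc", "aBc", "adc", "a3c", "ac"];

// "." matches any single character between "a" and "c".
const anyChar = samples.filter((s) => /a.c/.test(s));

// "[bB]" matches exactly one character from the list: b or B.
const inList = samples.filter((s) => /a[bB]c/.test(s));

// "[^bB]" matches exactly one character that is NOT b or B.
const notInList = samples.filter((s) => /a[^bB]c/.test(s));

console.log(anyChar);
console.log(inList);
console.log(notInList);
```

Note that "ac" matches none of the three patterns, because each wildcard must consume exactly one character.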
Quantifiers
Quantifiers (Table 14) indicate how many times a character can be matched. Wildcards describe what type of character to match and the quantifier describes how many times to match it. Quantifiers are applied to the character directly before the quantifier. So, you can apply a quantifier to a wildcard OR to a standard alphanumeric character.
Table 14: Basic regular expression quantifiers
Pattern   Description                                    Example   Example Matches
*         Match zero or more of the previous character   ab*c      ac, abc, abbc, abbbbc
+         Match one or more of the previous character    ab+c      abc, abbc, abbbc
?         Match zero or one of the previous character    ab?c      ac, abc
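The quantifiers in Table 14 can be checked the same way. This is a small sketch with sample strings of my own choosing, again assuming JavaScript's RegExp behaves like the engine in Google Analytics for these basic patterns.

```javascript
const samples = ["ac", "abc", "abbc", "adc"];

// "b*" allows zero or more b characters between "a" and "c".
const zeroOrMore = samples.filter((s) => /ab*c/.test(s));

// "b+" requires at least one b.
const oneOrMore = samples.filter((s) => /ab+c/.test(s));

// "b?" allows at most one b.
const zeroOrOne = samples.filter((s) => /ab?c/.test(s));

console.log(zeroOrMore);
console.log(oneOrMore);
console.log(zeroOrOne);
```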
Operators
Operators (Table 15) are used to perform some type of logic within the regular expression. The most common operator is the escape character. When used in a regular expression, the escape character transforms a special character into a regular character. For example, the question mark is a special character in regular expressions. However, if you place the escape character before the question mark, like this: \? , then the question mark becomes a literal question mark. Rather than trying to interpret the question mark as a special character, the regular expression will interpret it as an actual question mark.
Table 15: Common regular expression operators
Pattern   Description                                                                               Example     Will Match
\         Escape any pattern. The character after the escape symbol will be interpreted literally.   a\.c        a.c
|         OR operator. Match one item or another.                                                    abc|def     abc, def
()        Group characters into a pattern or capture the characters in the parentheses               (abc|123)   abc or 123, and return whichever is matched
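The operators in Table 15 can be exercised with a few lines of JavaScript. This is a sketch under the same assumption as before, namely that JavaScript's RegExp handles these operators the way Google Analytics does; the test strings are invented.

```javascript
// Escape: "\." matches a literal period instead of "any character".
const literalDot = /a\.c/;
console.log(literalDot.test("a.c")); // true
console.log(literalDot.test("abc")); // false

// OR: match any one of several alternatives.
const eitherOne = /abc|def/;
console.log(eitherOne.test("xxdefxx")); // true

// Grouping: apply a quantifier to a whole group.
const repeated = /(abc)+/;
console.log(repeated.test("abcabc")); // true

// Capturing: keep the part of the data that the group matched.
const captured = "xx a2b yy".match(/(a.b)/);
console.log(captured[1]); // "a2b"
```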
The pipe character is the logical OR operator, which literally means, “match this OR match that.” This is very powerful because it can be used to create lists of things to match. For example, to match a list of keywords, the following expression could be used: justin|cutroni|epikone
The final operator is the parentheses. This operator has two uses. First, it can be used to group expressions and then apply a regular expression operator, e.g. a quantifier, to the entire group. For example, the expression (abc)+ means match ‘abc’ one or more times. Parentheses can also be used to retain part of the data. This feature is used with Advanced filters. Placing an expression in parentheses, like (a.b), will force Google Analytics to retain the value that the pattern a.b matches. This could be aab, abb, a2b, etc.
Anchors
Anchors (listed in Table 16) describe where the regular expression should be applied to the data that is being evaluated. When an anchor is used, the value that the reg ex is applied to must begin (or end) with the appropriate pattern. Essentially, they mean, match at the beginning or match at the end.
Table 16: Regular expression anchors
Pattern   Description                          Example   Will Match
^         Match at the beginning of the data   ^abc      abc, abcdef, abc123
$         Match at the end of the data         abc$      abc, dfghabc, 12jdkjfabc
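Here is a brief sketch of the two anchors from Table 16, using sample strings of my own. The last pattern, which combines both anchors to require an exact match, is not covered in the table but is standard regular expression behavior.

```javascript
const samples = ["abc", "abcdef", "dfghabc", "xabcx"];

// "^" requires the data to begin with the pattern.
const startsWith = samples.filter((s) => /^abc/.test(s));

// "$" requires the data to end with the pattern.
const endsWith = samples.filter((s) => /abc$/.test(s));

// Both anchors together require the whole value to be exactly "abc".
const exact = samples.filter((s) => /^abc$/.test(s));

console.log(startsWith);
console.log(endsWith);
console.log(exact);
```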
Note LunaMetrics, a web analytics consulting company in Pittsburgh, has a wonderful Regular Expressions tutorial. I highly recommend it: http://www.lunametrics.com/blog/category/web-analytics/googleanalytics/regular-expressions/
Tools for Debugging Google Analytics
Here is a list of must-have tools for debugging Google Analytics. All of these tools help investigate what data Google Analytics is storing in the tracking cookies and what information is sent to the Google Analytics servers.
LiveHTTPHeaders
LiveHTTPHeaders is a Firefox plug-in that displays all of the headers sent between a web page and the various servers that contribute content to that page. Using this plug-in you can validate that a request is made to the google-analytics server for both the urchin.js file and the __utm.gif file. If you are using Internet Explorer version 6 or 7 you can use HttpWatch.
Firefox Developer’s Toolbar
If you’re working with web pages then you probably already have this installed. I like the Developer’s Toolbar because it provides quick access to the Google Analytics tracking cookies and HTML source code for a page. Validating that the Google Analytics cookies are set, and are set correctly, is one of the first things you should do when debugging a Google Analytics problem.
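If you do not have a toolbar handy, a few lines of JavaScript can list the tracking cookies as well. This is a minimal sketch: the function name is hypothetical, the cookie string is made up, and I am assuming that the Google Analytics cookies are the ones whose names begin with "__utm".

```javascript
// Parse a cookie string and keep only the Google Analytics cookies,
// assumed here to be those whose names start with "__utm".
function gaCookies(cookieString) {
  const found = {};
  for (const pair of cookieString.split(";")) {
    const trimmed = pair.trim();
    const eq = trimmed.indexOf("=");
    if (eq === -1) continue; // skip anything that is not a name=value pair
    const name = trimmed.substring(0, eq);
    if (name.startsWith("__utm")) found[name] = trimmed.substring(eq + 1);
  }
  return found;
}

// In a browser console you could pass document.cookie instead;
// a made-up string is used here so the sketch runs anywhere.
const sample = "__utma=1.2.3.4.5.6; lang=en; __utmz=1.2.3.4.utmcsr=google";
console.log(Object.keys(gaCookies(sample)));
```

An empty result on a page that carries the tracking code is a sign that the cookies are not being set.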
RegEx Coach
This is the tool for testing your regular expressions. The RegEx Coach is a graphical application for Windows and Linux which can be used to experiment with regular expressions interactively. If you have any questions about the validity of your regular expressions you should test them with the RegEx Coach.
Time
The most challenging part of working with Google Analytics is waiting for your changes to take effect. Once a filter or profile setting is changed it may be 3 or more hours before the data in Google Analytics is affected. Debugging problems in Google Analytics takes time, so be patient. The more you can do to understand the implications of a change, prior to making the modification, the better.
Conclusion
I’ve covered a lot of content in this ShortCut. From tracking across multiple domains to filters, I’ve tried to explain how the system works and the pitfalls to avoid. The goal is to help you get Google Analytics set up correctly for your website and your business needs. Remember, getting Google Analytics configured correctly is vital to performing good analysis. Take the time to review the necessary sections of this ShortCut before, during, and after your installation. And don’t be afraid to try something new. If you understand how Google Analytics works you can use the system in new and creative ways. If you take one thing away from this ShortCut I hope it is a realization that you can do a lot of things with Google Analytics. Your data needs will drive your configuration and the information in this ShortCut will help you get it right.
Copyright Section
Copyright © 2007 O'Reilly Media, Inc. All rights reserved.