How does Google Analytics work?

Last Updated: 08 November 2017

As discussed in more detail in an earlier guide, Google Analytics is a service that tracks the usage of a website or web application, and provides an interface for the viewing and analysis of that data. In this guide, we will take a look at how Google Analytics collects the information it does, and how that data is then displayed to the website owner through the Google Analytics dashboard.

The Tracking Code


The Google Analytics tracking code. This piece of code, or a reference to it, must be present on each page of a website in order for data to be collected about that page.

The first place to start is with the Google Analytics tracking code. This piece of code, or a reference to it, is included on every page of a website that the owner wishes to track. When users of your website view a page with this piece of code on it, the code references a JavaScript file called analytics.js stored on Google's servers, which in turn starts collecting information about the user who is viewing the page. It collects this information from two main sources:

  • The HTTP request - When you view a page on the internet, your browser makes what is called an HTTP request to the server that is hosting the website. It does this in order to receive the information that your browser needs to display the page you want to view. When your browser makes that request, it provides certain points of information about your system, such as the browser type, the referrer if there is one, and the language settings of the browser. In addition, most browsers will also provide access to more detailed browser and system information, such as Java and Flash support and screen resolution.

  • First-party cookies - The first party cookies in this case are cookies placed on your computer by Google Analytics to track information about your interactions with the website, including things such as:

    • if you have visited this website before (i.e. is there already a Google Analytics cookie for this website on your computer)
    • which page you are viewing, and
    • which page you came from.

From these pieces of information, almost all the metrics that are available for viewing on the Google Analytics dashboard can be calculated.
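To make the HTTP request part a little more concrete, here is a minimal sketch in Python of reading those points of information out of a request. The request text, the browser name and the function name read_request_info are all made up for illustration; this is not how Google Analytics itself is written, just a picture of what the browser hands over.

```python
# A made-up HTTP request, similar in shape to what a browser sends.
RAW_REQUEST = (
    "GET /index.html HTTP/1.1\r\n"
    "Host: example.com\r\n"
    "User-Agent: Mozilla/5.0 (X11; Linux x86_64) ExampleBrowser/1.0\r\n"
    "Referer: http://www.example.org/some-page\r\n"
    "Accept-Language: en-US,en;q=0.8\r\n"
    "\r\n"
)


def read_request_info(raw_request):
    """Pull out the headers that describe the visitor's browser."""
    lines = raw_request.split("\r\n")
    headers = {}
    # Skip the first line (the request line); stop at the blank line
    # that marks the end of the headers.
    for line in lines[1:]:
        if not line:
            break
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return {
        "browser": headers.get("user-agent"),
        "referrer": headers.get("referer"),
        "language": headers.get("accept-language"),
    }


info = read_request_info(RAW_REQUEST)
for label, value in info.items():
    print(f"{label}: {value}")
```

Note that the referrer header really is spelled "Referer" in HTTP, a long-standing misspelling that stuck.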

How the data gets to Google Analytics

So now that we have all this data, how does it get from your browser back to Google Analytics to be compiled and presented to the website owner? Since the 2014 update, Google Analytics has three options for how it can collect the data from your browser. The three options are rather cryptically labelled 'beacon', 'xhr', and 'image'. In this guide we'll focus on 'image', as it is perhaps the most common method, and definitely the most interesting. But to understand that, first we need a little background.

At the start we briefly talked about how when you try to view a page on a website, your browser will make a request to the server that hosts that website. In return the server will send a bunch of data files back to the browser, which the browser interprets to display the page that you see. Adding an additional level of complexity, when the browser tries to create that page, the files it uses will often have references to other files that are in different places on the internet, and in order to create the page, the browser has to make additional requests to other web servers to get those files.

It is this functionality that Google Analytics uses to send the information back to Google's servers. When your browser opens a new page, Google Analytics tells your browser to make a request for a GIF (yep, one of those moving image files) from Google's server. In this case though, the intention is not to receive the GIF and display it on the page. The intention is that when the browser makes the request for that file, it passes all the data it has collected to the server by adding it to the request URL.

About now I can hear you saying "woah, woah, slow down egghead." Ok, let's go back one step: what is a URL and how does one add data to it? A URL (Uniform Resource Locator) is simply a fancy name for a web address. When your browser makes the request for the GIF, it uses a URL, just like you do when you go to a website. In this case the URL for the GIF is simply http://www.google-analytics.com/collect. You can even click on that link like any other link. The only difference is that instead of a proper webpage, the only thing there is that single one-pixel GIF, which you cannot see because in addition to being tiny, it is also transparent.

Now, the next step, adding data to a URL. Sometimes, when you go to a URL, you may notice that at the end of the URL, there will be some extra stuff, like:

www.example.com?something=something

For example, often when you click on a link someone posts on Facebook or Twitter, if you check the URL in your address bar, you will see something like:

www.example.com?spref=fb

The part of the URL after the question mark (spref=fb in the example above) is called a URL parameter, and it is a way of passing data to the server hosting the website. In the case shown above, this URL parameter is likely telling the people at example.com that I got to their website by clicking a link on Facebook, information they will use to work out how many people came from that source.
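If you are curious what reading a URL parameter looks like on the receiving end, here is a small Python sketch using the standard urllib.parse module. It is only an illustration of the idea; what example.com actually runs on its servers is anyone's guess.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://www.example.com?spref=fb"

# Everything after the "?" is the query string.
query = urlsplit(url).query

# parse_qs turns "name=value" pairs into a dictionary of lists,
# because the same name is allowed to appear more than once.
params = parse_qs(query)

source = params["spref"][0]
print(source)
```

Here source ends up holding the value fb, which the website owner can then count as one more visitor arriving from Facebook.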

You may be able to see where I am going with this now. When your browser makes the request for that little one pixel GIF, Google Analytics adds URL parameters, lots of them separated with '&'s, to the URL http://www.google-analytics.com/collect. The URL that the GIF is requested from actually ends up looking more like this:

https://www.google-analytics.com/collect?v=1&_v=j65&a=616242100&t=pageview&_s=1&dl=http%3A%2F%2Fexample.com%2F&ul=en-us&de=UTF-8&dt=Example&sd=24-bit&sr=1280x800&vp=1279x651&je=1&_u=AACAAMQAI~&jid=&gjid=&cid=2055012442.1509440828&tid=UA-XXXXXXXX-3&_gid=335469808.3109550828&z=6054873098

These URL parameters are passing all the information we talked about being collected to Google Analytics. For a full explanation of what all the parameters are, check out the Google Analytics documentation. However, just from reading the URL above we can make out several pieces of information that have been sent to Google Analytics:

  • My browser is using US English (ul=en-us)
  • My screen resolution is 1280x800 pixels (sr=1280x800)
  • My browser window size is 1279x651 pixels (vp=1279x651 - vp stands for viewport)
  • The page I visited was http://example.com/ (dl=http%3A%2F%2Fexample.com%2F - certain special characters cannot be used in a URL and have to be replaced)

Many of the more personal details (browser type and device type for example) are also provided here, but are encoded for privacy reasons.
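The same reading-off can be done by a program. The sketch below, again in Python, takes a shortened version of the URL shown earlier and pulls out the four parameters discussed in the list. The LABELS dictionary and the describe_hit function are names invented for this example, and the descriptions are just the plain-English readings given above.

```python
from urllib.parse import urlsplit, parse_qs

# A shortened version of the request URL shown earlier in this guide.
HIT_URL = (
    "https://www.google-analytics.com/collect?v=1&t=pageview"
    "&dl=http%3A%2F%2Fexample.com%2F&ul=en-us&de=UTF-8&dt=Example"
    "&sd=24-bit&sr=1280x800&vp=1279x651&tid=UA-XXXXXXXX-3"
)

# Plain-English labels for the parameters discussed in the list above.
LABELS = {
    "ul": "browser language",
    "sr": "screen resolution",
    "vp": "viewport size",
    "dl": "page visited",
}


def describe_hit(url):
    """Return the labelled values found in the URL's parameters."""
    # parse_qs also undoes the %3A and %2F replacements for us.
    params = parse_qs(urlsplit(url).query)
    return {
        label: params[key][0]
        for key, label in LABELS.items()
        if key in params
    }


hit = describe_hit(HIT_URL)
for label, value in hit.items():
    print(f"{label}: {value}")
```

Notice that the page address comes back as http://example.com/ with its colon and slashes restored, which is the reverse of the character replacement mentioned in the list.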

Processing the Data

Now that we understand how Google Analytics is receiving the data, the next step is to understand how they process it into a format that website owners find useful.

Firstly, let's recap what this data will look like as Google receives it. For a given website with the Google Analytics tracking code installed, Google will receive one request with a parameter-filled URL (a 'datapoint') each time a page is visited or the current page is refreshed. In addition to telling Google about the computer and which page sent that particular datapoint, each datapoint also carries an ID that allows Google to determine which datapoints were created by the same user and which were created by other users. This is important, not because they want to know who you are, but because it allows them to group all the different datapoints according to the user who created them. Once the datapoints for a given user in a given session (see our Basic Terminology Guide if you are unsure what a session is) can be identified, then some very useful information can be created, including:

  • The first page the user landed on - Commonly called the landing page, this will be the page from the first datapoint in the session
  • The last page the user viewed before leaving - Commonly called the exit page, this will be the page from the last datapoint in the session
  • Total Pages visited - The total number of datapoints received
  • Total Session Duration - The time between the first and the last datapoints
  • Time spent on a page - The time between that page's datapoint and the next page's datapoint

We won't go through all the different metrics and how you could calculate them using the information above, but hopefully you get the idea!
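For the curious, here is a rough Python sketch of the calculations in the list above. It assumes the datapoints for one user's session have already been grouped together, and that each one is simply a time and a page. The times, pages and the summarise_session function are invented for this example; Google's real processing is certainly more involved.

```python
from datetime import datetime

# Invented datapoints for a single session: (time received, page).
datapoints = [
    (datetime(2017, 11, 8, 10, 0, 0), "/"),
    (datetime(2017, 11, 8, 10, 1, 30), "/pricing"),
    (datetime(2017, 11, 8, 10, 4, 0), "/contact"),
]


def summarise_session(points):
    """Work out the basic session metrics from a list of datapoints."""
    points = sorted(points)  # oldest datapoint first
    times = [time for time, _ in points]
    pages = [page for _, page in points]

    # Time on a page is the gap until the next datapoint arrives, so
    # the final page of the session has no measurable time.
    time_on_page = [
        (page, (later - earlier).total_seconds())
        for page, earlier, later in zip(pages, times, times[1:])
    ]

    return {
        "landing_page": pages[0],
        "exit_page": pages[-1],
        "pages_visited": len(points),
        "session_duration_seconds": (times[-1] - times[0]).total_seconds(),
        "time_on_page_seconds": time_on_page,
    }


summary = summarise_session(datapoints)
for name, value in summary.items():
    print(f"{name}: {value}")
```

One thing the sketch makes visible is that the exit page gets no time at all, because there is no later datapoint to measure against.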

Summing Up

Google Analytics is, in many ways, a complex piece of software, and understanding all the technical details of how it works requires more than a couple of pages of text. As such, the above is really just a high-level look at some of the basics, and in places does simplify to avoid getting overly technical. However, for the less technically minded, hopefully it has given you a basic understanding of how Google Analytics works.