As a website owner, I consider Google Analytics one of my most important tools. It gives me statistics about my visitors and target group, shows which pages people find interesting, and tells me how often someone returns. The online dashboard of Google Analytics shows the most important metrics, and you can easily adapt it yourself. But sometimes you want more sophisticated analyses, such as correlations and predictions. The online dashboard is not the right tool for this, but a statistician or analyst can do this kind of work in R with the help of this tutorial.
Google Analytics features
Google Analytics has so many features that you can easily get lost. I will write another blog post about the main features Google Analytics offers. For now, I will focus on how to set up a connection between Google Analytics and R, and how to import and analyze the data in R.
Let’s get started!
Before we can start analyzing our Google Analytics data in R, we have to set up a project first.
- Go to https://console.developers.google.com and search for Analytics API in the ‘Library’ section.
- Click on ‘Create Project’.
- On the next page click on ‘Create’.
- Now type your project name. For this post I use the name ‘Test-project’. Select your e-mail preferences and click ‘Create’.
- After a few seconds, your project is set up. Now go to the ‘Credentials’ section and click on ‘OAuth consent screen’. Enter your website details and click ‘Save’.
- Go back to the ‘Credentials’ section, click to create new credentials and select ‘OAuth client ID’.
- On the next screen we select the type of client. In order to get our Google Analytics data in R, we select ‘Other’ and give our client ID a name.
- After giving your client ID a name, you see a page with your client ID and client secret. Keep this screen open to copy the credentials into R later.
- Now let’s go to the R environment. First, we need to install and load the package ‘RGoogleAnalytics’. We also create two variables to store the client ID and client secret. Then we create a token based on the two variables.
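    # Install the package once, then load it in every session
    install.packages("RGoogleAnalytics")
    library(RGoogleAnalytics)
    # Paste the values from the credentials page here
    client_id <- 'xxx'
    client_secret <- 'xxx'
    # Start the OAuth flow and store the resulting token
    token <- Auth(client_id, client_secret)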
- The last line of code asks you if you want to use a local file to cache OAuth access credentials between R sessions. When you type ‘No’, the console gives you a link which you put in your browser. In return, it gives you an authorization code you need to paste into R.
    httpuv not installed, defaulting to out-of-band authentication
    Please point your browser to the following url:
    https://accounts.google.com/o/oauth2/auth?client_id=xxx
    Enter authorization code: xxx
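- You can check that your token is still valid, and refresh it if it has expired, by using the function ValidateToken(token). Note the capital ‘V’: function names in R are case-sensitive.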
- Now the fun part can start! The connection is set up, so we can extract the data we want to work with. First, create a query with the dimensions and metrics you need in your data frame. A list of all dimensions and metrics can be found in Google’s Dimensions & Metrics Explorer. You should also specify the start and end dates, the maximum number of results to return, and the dimension on which you want to sort the output. Finally, provide your table.id, which is the view ID from your account on analytics.google.com prefixed with ‘ga:’. Define the query with Init(), pass it to QueryBuilder() to create the query object, and finally create your data frame by using GetReportData().
    # Define the query: daily pageviews, newest day first
    query <- Init(start.date = "2017-08-01",
                  end.date = "2017-09-04",
                  dimensions = "ga:date",
                  metrics = "ga:pageviews",
                  max.results = 1000,
                  sort = "-ga:date",
                  table.id = "ga:xxx")
    # Build the query object and fetch the data, one request per day
    query_builder <- QueryBuilder(query)
    data <- GetReportData(query_builder, token, split_daywise = TRUE)
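If you run the R steps above regularly, you may not want to hard-code your credentials or the date range. The sketch below is one possible way to handle that, not something the package requires. The environment variable names GA_CLIENT_ID and GA_CLIENT_SECRET and both helper functions are my own assumptions for illustration. The calls to Auth(), Init(), QueryBuilder() and GetReportData() are left as comments because they need real credentials.

```r
# Sketch only: GA_CLIENT_ID and GA_CLIENT_SECRET are hypothetical
# environment variable names, not something RGoogleAnalytics defines.
read_ga_credentials <- function() {
  id <- Sys.getenv("GA_CLIENT_ID")
  secret <- Sys.getenv("GA_CLIENT_SECRET")
  if (id == "" || secret == "") {
    stop("Set GA_CLIENT_ID and GA_CLIENT_SECRET before running this script.")
  }
  list(client_id = id, client_secret = secret)
}

# Hypothetical helper: build the arguments for Init() so that the query
# covers the `days` days up to and including `today`.
last_days_query <- function(table_id, days = 30, today = Sys.Date()) {
  list(
    start.date  = format(today - days, "%Y-%m-%d"),
    end.date    = format(today, "%Y-%m-%d"),
    dimensions  = "ga:date",
    metrics     = "ga:pageviews",
    max.results = 1000,
    sort        = "-ga:date",
    table.id    = table_id
  )
}

# A fixed `today` keeps this example reproducible
args <- last_days_query("ga:xxx", days = 30, today = as.Date("2017-09-04"))
str(args)

# With the package loaded and real credentials in place, the calls from
# this post would then look like this:
# library(RGoogleAnalytics)
# creds <- read_ga_credentials()
# token <- Auth(creds$client_id, creds$client_secret)
# query <- do.call(Init, args)
# data  <- GetReportData(QueryBuilder(query), token, split_daywise = TRUE)
```

Keeping the secret out of the script means you can share or publish the code without exposing your client secret.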
So that’s it: your Google Analytics data is now in R! You can work with it like any other data frame in R.
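To give an idea of the correlations and predictions mentioned at the start, here is a small sketch in base R. It does not use real data. The data frame is simulated to look like the result of the query in this post, and I assume the date column arrives as text in the form YYYYMMDD. Check str(data) on your own result, since the column types may differ.

```r
# Simulated stand-in for the data frame returned by GetReportData().
# Assumption: `date` is text like "20170801" and `pageviews` is a count.
set.seed(42)
days <- seq(as.Date("2017-08-01"), as.Date("2017-09-04"), by = "day")
data <- data.frame(
  date = format(days, "%Y%m%d"),
  pageviews = round(200 + 2 * seq_along(days) + rnorm(length(days), sd = 15)),
  stringsAsFactors = FALSE
)

# Convert the columns to proper types and sort from oldest to newest
data$date <- as.Date(data$date, format = "%Y%m%d")
data$pageviews <- as.numeric(data$pageviews)
data <- data[order(data$date), ]
data$day_index <- seq_len(nrow(data))

# Correlation between time and pageviews
trend_cor <- cor(data$day_index, data$pageviews)

# Simple linear trend and a forecast for the next 7 days
fit <- lm(pageviews ~ day_index, data = data)
future <- data.frame(day_index = max(data$day_index) + 1:7)
forecast <- predict(fit, newdata = future)

# Average pageviews per weekday
weekday_means <- tapply(data$pageviews, weekdays(data$date), mean)

print(trend_cor)
print(round(forecast))
print(round(weekday_means))
```

A straight-line trend is only a starting point. For real traffic data you would probably want a model that accounts for weekly patterns as well.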