4 Examples to visualize a correlation matrix in R

One of the first things you probably do with a dataset is check the number of records, count the number of variables and work out what the variables mean. Soon after, you will probably check whether any of the variables are correlated. This gives you a good understanding of the data, and unexpected correlations may appear.

Correlations are usually presented in a matrix. A matrix gives you the information you need, but it takes time to go through all the correlation values, and you can easily miss the one correlation that matters for further analysis. I keep the correlation matrix as background information and only sometimes show it to clients and business people.
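As a starting point, here is a minimal sketch of how such a matrix can be computed with base R. It assumes the built-in mtcars dataset as example data (this post does not name a dataset) and the object name cor_mat is my own choice.

```r
# Assumption: the built-in mtcars dataset stands in for your own data.
data(mtcars)

# Pearson correlations between all numeric columns, rounded for readability.
cor_mat <- round(cor(mtcars, method = "pearson"), 2)

# Show the first rows of the correlation matrix.
print(head(cor_mat, 3))
```

Even with only 11 variables the matrix already holds 121 values, which shows why reading it cell by cell gets slow.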

The best way to show correlations is to visualize them in a correlation plot. Below I’ve listed four ways to quickly visualize a correlation matrix in R. There are several packages available for this kind of visualization. I will use the packages corrplot, GGally, ggcorrplot and ggplot2.

Corrplot package

The corrplot package is the easiest way to get a good-looking visual of the correlations. It only takes seconds to have your visual ready, and you can then adapt it with a few handy arguments. Some of the available arguments:

  • method = "circle", "square", "ellipse", "number", "pie", "shade" or "color".
  • type = "full", "upper" or "lower".
  • order = "original", "AOE", "FPC", "hclust" or "alphabet".
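A minimal sketch of how these arguments fit together in a call to corrplot(). The mtcars data and the specific argument values are my assumptions for illustration, not necessarily the settings behind the plot in this post.

```r
library(corrplot)

# Assumption: mtcars is used as example data.
cor_mat <- cor(mtcars)

corrplot(
  cor_mat,
  method = "circle",  # shape drawn for each correlation
  type   = "upper",   # show only the upper triangle
  order  = "hclust",  # reorder variables by hierarchical clustering
  tl.col = "black"    # colour of the variable labels
)
```

Swapping method, type or order for any of the other values listed above is enough to change the look of the plot.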

Visualization of correlation matrix with corrplot package

GGally package

The function ggcorr() from the GGally package is another way to plot a correlation matrix. This function has fewer options but does everything you need. It also lets you choose different shapes by using geom = "tile", "circle", "text" or "blank". By default, the lower triangle is plotted.
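A minimal sketch of a ggcorr() call, again assuming mtcars as example data. Note that ggcorr() takes the data frame itself and computes the correlations internally. The object name p_ggally and the argument values are my own choices.

```r
library(GGally)

# Assumption: mtcars is used as example data.
p_ggally <- ggcorr(
  mtcars,
  geom    = "circle",  # alternatives: "tile", "text" or "blank"
  nbreaks = 5          # bin the colour scale into 5 classes
)

print(p_ggally)
```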

Correlation plot with GGally package

ggcorrplot

The third function in this article is ggcorrplot(), from the ggcorrplot package. This function is very similar to the one from the GGally package, but this time you can also apply ggplot2 functions to the result, which makes it much more advanced. Like most functions, it lets you plot only the upper or lower triangle, order the variables and apply different colors.
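A minimal sketch of how this could look, assuming mtcars as example data. The colours, the title and the object name p_ggcorrplot are illustrative choices of mine. The last two lines show ordinary ggplot2 functions being added on top of the result.

```r
library(ggcorrplot)
library(ggplot2)

# Assumption: mtcars is used as example data.
cor_mat <- round(cor(mtcars), 2)

p_ggcorrplot <- ggcorrplot(
  cor_mat,
  type     = "lower",  # plot only the lower triangle
  hc.order = TRUE,     # order variables by hierarchical clustering
  lab      = TRUE,     # print the correlation values in the cells
  colors   = c("steelblue", "white", "firebrick")
) +
  labs(title = "Correlations in mtcars") +
  theme(plot.title = element_text(hjust = 0.5))

print(p_ggcorrplot)
```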

Correlation matrix ggcorrplot

ggplot2

The most advanced option is to use ggplot2 directly, which allows you to modify the correlation plot as much as you want. The basic code to start with is shown below. The plot doesn’t look too fancy, but with some additional code you can achieve the same result as in the previous examples. If you want to show only the upper or lower triangle, you need to do this in the data preparation. The same holds for ordering the variables.
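The sketch below is one possible version of such a basic plot, not necessarily the exact code behind the figure in this post. It assumes mtcars as example data, reshapes the matrix with base R instead of an extra package, and uses column names (var1, var2, correlation) that I picked for illustration. The triangle selection happens in the data preparation, as described above.

```r
library(ggplot2)

# Assumption: mtcars is used as example data.
cor_mat <- round(cor(mtcars), 2)

# Data preparation: blank out the upper triangle so only the lower one is kept.
cor_mat[upper.tri(cor_mat)] <- NA

# Reshape to long format: one row per pair of variables.
cor_long <- as.data.frame(as.table(cor_mat), responseName = "correlation")
names(cor_long)[1:2] <- c("var1", "var2")
cor_long <- cor_long[!is.na(cor_long$correlation), ]

# Basic heatmap: one tile per pair, coloured by correlation.
p_ggplot <- ggplot(cor_long, aes(x = var1, y = var2, fill = correlation)) +
  geom_tile(colour = "white") +
  scale_fill_gradient2(low = "steelblue", mid = "white", high = "firebrick",
                       midpoint = 0, limits = c(-1, 1)) +
  coord_fixed() +
  labs(x = NULL, y = NULL, fill = "Correlation") +
  theme_minimal()

print(p_ggplot)
```

To change the order of the variables, reorder the rows and columns of cor_mat before reshaping. The factor levels of var1 and var2 follow the order of the matrix.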

Correlation plot with ggplot

I hope this post helps you choose the right function for your case. Depending on how much you want to customize and how quickly you want results, there is always a function that fits your needs.

Who I am


Hi! My name is Claudia, a freelance data analyst/scientist. This is my space on the internet where I share knowledge and experience with everyone who wants to become a better analyst. Read more about my work as a freelancer here.
