How to, Maps, R, Shapefiles, Visual

R tutorial: plot maps from shapefiles

In my opinion, visualizations are always better than showing data tables. Whenever I work with spatial data, I like to plot the data on a map. There are several ways to visualize shapefile data on a map in R, I will walk you through a couple of them below.

In this tutorial I will use shapefiles to plot spatial data. A shapefile contains all the coordinates and additional data, such as region names. I’ve downloaded a shapefile of Soutch Africa, because I love this country ūüôā You can download the shapefile on¬†http://www.gadm.org/download. This file contains all the coordinates which we are going to plot. In addition, I will also use¬†population data from Wikipedia and coordinates of the five biggest cities of South Africa from Tageo.

 

Load shapefile data and plot map

So let’s start by loading the¬†maptools¬†library and store the location of the .shp file in a new object. Then I load the population data and coordinates of the cities into the right format. Next, I’ll read the shapefile into an object and plot the first map. In order to read and open the shapefile, I’ll use the¬†maptools¬†package.

library(maptools)

file <- 'C:/.../ZAF_adm1.shp'

# source: https://en.wikipedia.org/wiki/List_of_South_African_provinces_by_population
population <- data.frame(province = c("Gauteng", "KwaZulu-Natal", "Western Cape", "Eastern Cape", "Limpopo", "Mpumalanga", "North West", "Free State", "Northern Cape"),
                         population = c(14278669, 11074784, 6510312, 6498683, 5778442, 4444212, 3856174, 2866678, 1213996)
)

# source: http://www.tageo.com/index-e-sf-cities-ZA.htm
coordinates_cities <- data.frame(city = c("Cape Town", "Durban", "Johannesburg", "Pretoria", "Soweto"),
                                 lat = c(-33.930, -29.870, -26.190, -25.730, -26.280),
                                 lon = c(18.460, 30.990, 28.040, 28.220, 27.840)
                                 )

# Read data
shapefile <- readShapeSpatial(file)
plot(shapefile)

This simple code already produces the following map:

Plot shapefile in R using maptools package

That wasn’t difficult at all! The shapefile includes all kinds of information. I always start to look what kind of information is included. You can use the¬†@¬†and¬†$¬†signs to extract objects from the data. The following examples show you which information is in the shapefile of South Africa.

# View shapefile information
summary(shapefile)
#Object of class SpatialPolygonsDataFrame
#Coordinates:
#        min       max
#x  16.45189  32.89125
#y -34.83514 -22.12503
#Is projected: NA 
#proj4string : [NA]
#Data attributes:
#      ID_0      ISO             NAME_0       ID_1             NAME_1      HASC_1      CCN_1       CCA_1         TYPE_1     ENGTYPE_1 NL_NAME_1
# Min.   :211   ZAF:9   South Africa:9   Min.   :1   Eastern Cape :1   ZA.EC  :1   Min.   :0   EC     :1   Provinsie:9   Province:9   NA's:9   
# 1st Qu.:211                            1st Qu.:3   Free State   :1   ZA.FS  :1   1st Qu.:0   FS     :1                                       
# Median :211                            Median :5   Gauteng      :1   ZA.GT  :1   Median :0   GT     :1                                       
# Mean   :211                            Mean   :5   KwaZulu-Natal:1   ZA.MP  :1   Mean   :0   KZN    :1                                       
# 3rd Qu.:211                            3rd Qu.:7   Limpopo      :1   ZA.NC  :1   3rd Qu.:0   LIM    :1                                       
# Max.   :211                            Max.   :9   Mpumalanga   :1   ZA.NL  :1   Max.   :0   MP     :1                                       
#                                                    (Other)      :3   (Other):3               (Other):3                                       
#                                                     VARNAME_1
# Eastern Transvaal                                        :1  
# Natal and Zululand                                       :1  
# Noord-Kaap                                               :1  
# Noordelike Provinsie|Northern Transvaal|Northern Province:1  
# North-West|Noordwes                                      :1  
# Oos-Kaap                                                 :1  
# (Other)                                                  :3 

shapefile@data
#  ID_0 ISO       NAME_0 ID_1        NAME_1 HASC_1 CCN_1 CCA_1    TYPE_1 ENGTYPE_1 NL_NAME_1                                                 VARNAME_1
#0  211 ZAF South Africa    1  Eastern Cape  ZA.EC     0    EC Provinsie  Province      <NA>                                                  Oos-Kaap
#1  211 ZAF South Africa    2    Free State  ZA.FS     0    FS Provinsie  Province      <NA>                                Orange Free State|Vrystaat
#2  211 ZAF South Africa    3       Gauteng  ZA.GT     0    GT Provinsie  Province      <NA>                               Pretoria/Witwatersrand/Vaal
#3  211 ZAF South Africa    4 KwaZulu-Natal  ZA.NL     0   KZN Provinsie  Province      <NA>                                        Natal and Zululand
#4  211 ZAF South Africa    5       Limpopo  ZA.NP     0   LIM Provinsie  Province      <NA> Noordelike Provinsie|Northern Transvaal|Northern Province
#5  211 ZAF South Africa    6    Mpumalanga  ZA.MP     0    MP Provinsie  Province      <NA>                                         Eastern Transvaal
#6  211 ZAF South Africa    7    North West  ZA.NW     0    NW Provinsie  Province      <NA>                                       North-West|Noordwes
#7  211 ZAF South Africa    8 Northern Cape  ZA.NC     0    NC Provinsie  Province      <NA>                                                Noord-Kaap
#8  211 ZAF South Africa    9  Western Cape  ZA.WC     0    WC Provinsie  Province      <NA>                                                  Wes-Kaap

shapefile@data$NAME_1
#[1] Eastern Cape  Free State    Gauteng       KwaZulu-Natal Limpopo       Mpumalanga    North West    Northern Cape Western Cape 
#Levels: Eastern Cape Free State Gauteng KwaZulu-Natal Limpopo Mpumalanga North West Northern Cape Western Cape

shapefile@plotOrder
#[1] 8 1 9 2 5 7 4 6 3

shapefile@bbox
#        min       max
#x  16.45189  32.89125
#y -34.83514 -22.12503

 

Color shapefile based on population number

Now we know what kind of information is available and how to plot the first map from the shapefile, it’s time to color the areas. I want to color the provinces according to their population intensity. The file does not include population data, sometimes shapefiles include information like this. But it’s not a problem, because we can add columns to the shapefile.

After adding the population column to the shapefile, I’m going to use the¬†colorRampPalette()¬†function to create HEX color coding in a new column called¬†color¬†. Next, I’ll tell the¬†plot()¬†function to apply this color and to not show the borders.

# Merge population data
shapefile@data <- merge(shapefile@data, population, by.x = c("NAME_1"), by.y = c("province"))
shapefile@data

# Assign colors to provinces based on population
population_data <- shapefile@data$population
colors <- (population_data - min(population_data)) / (max(population_data) - min(population_data))*254+1
shapefile@data$color =  colorRampPalette(c('#9eceff', '#004081'))(255)[colors]
plot(shapefile, col = shapefile@data$color, border=NA)

Plot shapefile in R using maptools package

That looks good but I want to add a legend with the population numbers corresponding to the colors. You can add a legend by using the¬†legend()¬†function. The legend needs to know which colors to plot and the corresponding values, so I’ll create bins of the population data and use the¬†colorRampPalette()¬†¬†again to get the right color codes.

# Create legend
library(dplyr)
shapefile@data <- shapefile@data %>%
  arrange(population, color)
shapefile@data$population_bins <- cut(shapefile@data$population/1000000, 5)
shapefile@data$color_bins =  colorRampPalette(c('#9eceff', '#004081'))(5)[shapefile@data$population_bins]
legend("topleft", fill = unique(shapefile@data$color_bins), legend = unique(shapefile@data$population_bins), col = unique(shapefile@data$color_bins))

Plot shapefile and legend in R using maptools package

Now we have the colors and legend in place, I want to finish the map with the five biggest cities. We can plot the cities based on their longitude and latitude. We can use the function points() to add points to the map created with the shapefile. You can change some arguments such as the color and type of point you want to plot.

# Plot cities
points(coordinates_cities$lon, coordinates_cities$lat, pch=19, col="red", cex=1)

Plot shapefile, legend and points in R using maptools package

Et voila, there we have our map of South Africa created with a shapefile. With some additional data we added the colors and points on the map.

 

Ggplot2 for shapefiles

But this was just a start, know I will show you how to use¬†ggplot()¬†to create a map with shapefiles. First, we need to activate the library¬†ggplot2¬†to use the function¬†ggplot()¬†.¬† We need to tell the function which shapefile we want to use, but also the longitude and latitude columns, and which column contains the regions. I’ll also add black borders and make sure that the map is plotted using the right scale.

library("ggplot2")
ggplot(shapefile, aes(x = long, y = lat, group = group)) + 
  geom_polygon(fill = "white") +
  geom_path(color = "black") +
  coord_equal()

Plot shapefile in R using ggplot2

This looks nice already, but I want to recreate the map with colors based on the population, a legend and points on the map for the five biggest cities.

In order to color the regions, we need to transform the shapefile into a data frame. But I start by reordering the shapefile data, such that the order starts at 1 again. This is something we need to do, because otherwise the coloring will go wrong. As soon as the order of the regions is back to the original order, I’m using the function¬†fortify()¬†to transform the shapefile to a data frame. This creates a data frame with all the coordinates on a new line. Each line/coordinate needs to know the population value, so I merge the data to add a new column containing the population. Then I create an object of the map including the colors for the regions and a legend.

# Plot map and legend with colors
shapefile@data <- shapefile@data[order(shapefile@data$ID_1),] 
shapefile_data <- fortify(shapefile, region = "NAME_1")
shapefile_data <- merge(shapefile_data, shapefile@data[, c('NAME_1', 'population')], by.x='id', by.y='NAME_1', all.x=TRUE)
ggplot_southafrica <- ggplot_southafrica +
  geom_polygon(aes(fill = population/1000000)) +
  geom_path(color = "black") + 
  scale_fill_gradient(low = '#9eceff', high = '#004081', name = "Population x 1,000,000")
ggplot_southafrica

Plot shapefile, colors and legend in R using ggplot2

I want to end by adding some points for the five biggest cities on the map.

# Plot cities
ggplot_southafrica <- ggplot_southafrica + 
  geom_point(data = coordinates_cities, aes(x = lon, y = lat), col = "red", inherit.aes = FALSE)
ggplot_southafrica

Plot shapefile, colors, legend and points in R using ggplot2

So that’s the final map! I’ve shown you two options how you could plot a map from a shapefile. There are more ways to do this, but I think these two are the best to know.

Leave a comment

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.