Maps Example
2021-01-12 Tim Essam
vignette ggplot
Required packages
You will need a couple new packages to run the code below. The gisr
and glitr
packages are SI developed packages to help with gis and data munging related tasks, respectively. You can install the packages with the following chunks of code.
devtools::install_github("USAID-OHA-SI/glamr")
devtools::install_github("USAID-OHA-SI/gisr")
# Setup knitr defaults and folder paths
knitr::opts_chunk$set(echo = TRUE, message = FALSE, warning = FALSE, out.width = '100%')
pub_images <- "public_images"
images <- "Images"
# Set up caption object
caption <- paste0("Source: Testing data from glitr package | Created on: ", Sys.Date())
# Add libraries needed
library(tidyverse)
library(sf)
library(scales)
library(glitr)
library(here)
library(patchwork)
library(gisr)
library(glitr)
Spatial
Maps convey geographic patterns of data using coordinates and polygons. Maps are useful for showing spatial correlations or the geographic distribution of a metric. Combining a map with a bar graph can be a powerful way to summarize a metrics distribution by magnitude and geography. In the example below, we walk through how to create a choropleth map and combine it with a bar graph. As before, we use the hts
and hts_geo
data available from the glitr
package.
We will start by calculating the hts_dev_wide
data frame that is also presented in the Deviation
tutorial. We make use of the simple features sf
package to load and manipulate data that has spatial defined spatial properties.
How to check what type of data you are wrangling
To check the property of an R object or data frame you can use the str()
function. This function compactly displays the internal structure of an R object. As we will be working with the hts_geo
data set, we can check its properties by loading the glitr
package (done in the chunk above) and then passing the data object to the str
() function.
str(hts_geo)
In the first line, you will notice that the classes of the object are ‘sf’ and ‘data.frame’. The ‘sf’ portion of line indicates that the object is a collection of simple features that includes attributes and geometries in the form of a data frame. The major advantage of working with an ‘sf’ object is that it is compatible with the tidyverse
. Practically, this means that you can use the pipe %>%
operator and pass the object through tidyverse
functions. When plotting, ggplot2
has a dedicated geom geom_sf()
to visualize simple features.
In the example below, we walk through how to join attribute data from the hts
data to the hts_geo
data frame. You may find yourself doing a similar exercise if you have a .shp
(shapefile) file that is lacking attribution information you would like to visualize or analyze. See sf::st_read()
for details on how to read a .shp
as an sf object.
# Munge the hts data to create the hts_dev_wide data frame used in other tutorials
hts_dev <-
hts %>%
filter(indicator == "HTS_TST", period == "FY49", period_type != "results") %>%
group_by(primepartner, period_type, indicator) %>%
summarise(partner_totals = sum(value)) %>%
ungroup()
# Spread the data to make the acheivement calculations a bit easier
hts_dev_wide <-
hts_dev %>%
pivot_wider(names_from = period_type, values_from = partner_totals) %>%
mutate(achievement = cumulative / targets) %>%
group_by(indicator) %>%
mutate(annual_results = sum(cumulative),
annual_targets = sum(targets),
annual_achievement = annual_results / annual_targets,
deviation = achievement - annual_achievement) %>%
# Remove dedups
filter(primepartner != "Dedup")
# Merge the hts_dev_wide data frame with the hts_geo
hts_geo_dev <- left_join(hts_geo, hts_dev_wide)
# Use the glamr function (https://github.com/USAID-OHA-SI/glamr) to print results
glamr::prinf(hts_geo_dev)
Th output of the join operation shows that the hts_geo
data merged 13 features (rows) with 11 fields (columns). We can see that partner Leo has NAs for all calculated metrics. Looking at the original hts
data you can see that Leo in fact only started reporting on testing results in FY50. As the hts
data is filtered to FY49, it makes sense that Leo has NAs for the calculated metrics. Instead of removing Leo with a filter operation, we leave it in to show how to visualize NA values when creating maps.
Plotting the joined data
To plot the joined data we can use ggplot2
and the geom_sf()
function. The code chunk for creating a basic map is below.
hts_geo_dev %>%
ggplot() +
geom_sf()
Creating a boundary
The first map lacked quite a few things to make the visual useful. The lack of polygon labels and colors to encode testing achievement puts this map in the category of a poorly done base map. Below, we show how to encode target achievement with colors, resulting in a choropleth map. We also take advantage of sf
’s summarise
()
command to create an outline of the polygons. This can be useful for helping to define the boundary of the area being mapped. To show how different geom parameters affect the map’s look, two maps are presented. The first is based on using default parameters and the second version shows how incorporating SIEI colors and helper functions from glitr
and gisr
can improve the product.
# Create an admin0 to outline the admin1 shapefile
# We will pass this sf object to ggplot as a new data source
hts_geo_admin0 <- hts_geo_dev %>% summarise(cumulative)
# Original basemap with a fill value passed to asethics
map_lhs <- hts_geo_dev %>%
ggplot() +
geom_sf(aes(fill = achievement))
# Build on our original basemap
map_rhs <-
hts_geo_dev %>%
ggplot() +
geom_sf(aes(fill = achievement),
color = "white",
stroke = 0.5,
alpha = 0.85,
) +
geom_sf(data = hts_geo_admin0,
color = grey90k,
stroke = 2,
fill = NA) +
scale_fill_si(palette = "denims",
discrete = FALSE,
label = percent,
na.value = grey20k) + # this controls the fill on missing values (Leo)
gisr::si_style_map()
# compare maps
map_lhs + map_rhs
The visualization on the right is getting us closer to a presentation ready product. We still need to incorporate polygon labels, annotations and titles. Below, we make use of the geom_sf_text()
function to add in polygon labels.
# Define a vector with label colors
label_color <- if_else(hts_geo_dev$achievement > 2, "white", "black")
map_lhs_adorned <-
hts_geo_dev %>%
ggplot() +
geom_sf(aes(fill = achievement),
color = "white",
size = 0.5,
alpha = 0.85) +
geom_sf_text(aes(label = paste0(primepartner, "\n", percent(achievement, 2))),
color = label_color,
size = 2.5) +
geom_sf(data = hts_geo_admin0,
color = grey90k,
stroke = 2,
fill = NA) +
scale_fill_si(palette = "denims",
discrete = FALSE,
label = percent,
na.value = grey20k) +
gisr::si_style_map() +
theme(legend.position = "none") +
labs(title = "VIRGO LEADS TESTING ACHIEVEMENT AT NEARLY 400%",
caption = caption)
Much better, but we can still make a few tweaks that will help this product stand on its own. Complementing the map with bar graph can help the reader compare magnitude differences on the metric being presented without having to do too much math (remember, your job is to make the reader understand a product within 5-10 seconds of seeing it).
bar_graph <-
hts_dev_wide %>%
ggplot(aes(y = fct_reorder(primepartner, achievement),
x = achievement,
fill = achievement)) +
geom_col(aes(x = 1),
fill = grey10k,
alpha = 0.85) +
geom_col() +
geom_vline(xintercept = 1,
size = 1.5,
color = "white") +
scale_fill_si(palette = "denims",
discrete = FALSE,
label = percent) +
scale_x_continuous(labels = percent) +
labs(x = NULL,
y = NULL,
caption = caption) +
si_style_xline() +
theme(legend.position = "none")
# We can remove the title and caption from the map by passing NULL values to labs()
bar_graph + labs(caption = NULL) +
map_lhs_adorned +
labs(title = NULL) +
plot_annotation(title = "VIRGO AND SERPENS LEAD TESTING ACHEIVEMENT IN FY49")
To export this object for additional editing in a vector graphics software, we use the si_save()
function. This function is a wrapper for ggsave and uses default options to save the plot to a size optimal for presentations (10 x 5.625 inches).
combined_plot <-
bar_graph +
labs(caption = NULL) +
map_lhs_adorned +
labs(title = NULL) +
plot_annotation(title = "VIRGO AND SERPENS LEAD TESTING ACHEIVEMENT IN FY49")
si_save(here("images", "HTS_Saturn_bar_map_combined.svg"), plot = combined_plot)