Extract facilities location data
Baboyma Kagniniwa
2022-05-03
Source: vignettes/extract-facilities.Rmd
Introduction
This vignette provides guidance to USAID/OHA and other PEPFAR data analysts on how to extract location data for PEPFAR-supported facilities.
Facilities location datasets
PEPFAR uses DATIM to manage data for its global HIV/AIDS programs. Location data for supported health facilities are managed in the same system. Each facility has a Universal Unique Identifier (UUID), the UIDs of all its parent organizational units, and the latitude and longitude of the site.
Below is one of the many ways to extract PEPFAR facilities location data for specific countries.
Workspace
Before jumping into data extraction, we recommend preparing a directory to host data. The best directory for this type of data is under the OHA/Vector path. Follow these steps.
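# Packages used throughout this vignette
library(dplyr)   # pipes and data wrangling verbs
library(stringr) # str_sub()
library(sf)      # st_as_sf(), st_crs()
# Define a folder to host data - Using FY
dir_sites <- Sys.Date() %>%
  glamr::convert_date_to_qtr() %>%
  str_sub(1, 4)
dir_data <- glamr::si_path("path_vector") %>%
  paste0("../", .) %>% # This is needed only with R Markdown files
  paste0("/OU-Sites/", dir_sites)
# Create the folders (including any missing parent folders)
dir.create(dir_data, recursive = TRUE, showWarnings = FALSE)
dir_geodata <- paste0(dir_data, "/SHP")
dir.create(dir_geodata, showWarnings = FALSE)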
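The folder name above is built from the first four characters of the fiscal-quarter label. As a minimal base R sketch of that idea, assuming the label looks like "FY22Q3" and that the fiscal year starts on October 1, the hypothetical helper `fy_label()` below derives the same kind of prefix without any package. It is an illustration only, not the `glamr` implementation.

```r
# Hypothetical helper (not part of glamr): derive a fiscal-year label
# such as "FY22" from a date, assuming the fiscal year starts on October 1.
fy_label <- function(date = Sys.Date()) {
  date <- as.Date(date)
  yr   <- as.integer(format(date, "%Y"))
  mth  <- as.integer(format(date, "%m"))
  # October through December belong to the next fiscal year
  fy <- ifelse(mth >= 10, yr + 1L, yr)
  paste0("FY", sprintf("%02d", fy %% 100L))
}

# Label for the date of this vignette
fy_demo <- fy_label(as.Date("2022-05-03"))
```

Keeping one folder per fiscal year makes it easy to compare site lists across years without overwriting earlier extracts.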
Data Extraction
Two pieces of information are needed for facility data extraction: (a) the OU/country name and (b) the organizational level for facilities.
ou <- "Nigeria"
level_fac <- get_ouorglevel(operatingunit = ou, org_type = "facility")
Now proceed with the facility data extraction.
# extract location data
df_facs <- extract_locations(country = ou, level = level_fac)
# Get a clean version of the facilities
df_facs <- df_facs %>% extract_facilities()
# Get rid of extra columns
df_facs <- df_facs %>% select(-c(geom_type:nested))
Save the extracted data as a .csv file in the predefined location.
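The saving step can be sketched as follows. This is a minimal illustration, not the vignette's original code: `df_csv_demo` is a toy stand-in for `df_facs`, a temporary folder stands in for `dir_data`, and the file name pattern is an assumption that mirrors the one used for the shapefile later on.

```r
# Toy stand-in for df_facs (hypothetical values, for illustration only)
df_csv_demo <- data.frame(
  name      = c("Facility A", "Facility B"),
  latitude  = c(9.05, NA),
  longitude = c(7.49, NA)
)

# In practice this would be dir_data; a temporary folder keeps the sketch runnable
dir_out <- tempdir()

# Assumed file name pattern: "<ou> - facilities_locations_<date>.csv"
ou_demo  <- "Nigeria"
file_csv <- file.path(
  dir_out,
  paste0(ou_demo, " - facilities_locations_",
         format(Sys.Date(), "%Y-%m-%d"), ".csv")
)

# Write without row names; missing values become empty cells
write.csv(df_csv_demo, file = file_csv, row.names = FALSE, na = "")
```

Writing missing coordinates as empty cells rather than the string "NA" keeps the file easier to read back in other tools.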
Convert extracted location data into a shapefile
CSV files are good for most data analyses, but there are times when the location data is needed in shapefile format. Below is how to generate a shapefile from the extracted data.
# By default, location information is stored in these 2 columns
# sf expects x (longitude) first, then y (latitude)
loc_cols <- c("longitude", "latitude")
# create a spatial dataframe - excluding data with no valid lat/long
spdf <- df_facs %>%
  filter(if_all(all_of(loc_cols), ~ !is.na(.x))) %>%
  mutate(across(all_of(loc_cols), ~ as.numeric(.x)))
# Make sure the Coordinate Reference System is WGS 84
spdf <- spdf %>% st_as_sf(coords = loc_cols, crs = st_crs(4326))
# Shapefile column names are limited to 10 characters, so shorten the long ones
spdf <- spdf %>%
  rename(ou_iso = operatingunit_iso,
         ou = operatingunit,
         cntry_iso = countryname_iso,
         cntry = countryname)
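The filter used when building the spatial dataframe drops only missing coordinates. If out-of-range or non-numeric values are also a concern, a stricter check could look like the sketch below. The helper `has_valid_coords()` and the toy data are hypothetical and are not part of the packages used in this vignette.

```r
# Hypothetical helper: TRUE only when both coordinates are present,
# numeric, and within the valid WGS 84 ranges.
has_valid_coords <- function(latitude, longitude) {
  lat <- suppressWarnings(as.numeric(latitude))
  lon <- suppressWarnings(as.numeric(longitude))
  !is.na(lat) & !is.na(lon) & abs(lat) <= 90 & abs(lon) <= 180
}

# Toy data (illustration only): one good row, one missing, one out of range
df_coords_demo <- data.frame(
  name      = c("Facility A", "Facility B", "Facility C"),
  latitude  = c("9.05", NA, "95.2"),
  longitude = c("7.49", "3.40", "7.00"),
  stringsAsFactors = FALSE
)

# Keep only rows that pass the check
df_coords_valid <- df_coords_demo[
  has_valid_coords(df_coords_demo$latitude, df_coords_demo$longitude),
]
```

Applying a check like this before converting to a spatial object avoids points that fall outside any valid map extent.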
Save the shapefile in the predefined directory.
export_spdf(spdf = spdf,
            name = paste0(dir_geodata, "/", ou,
                          " - facilities_locations_",
                          format(Sys.Date(), "%Y-%m-%d")))
Give it a try and let us know how it went.
Enjoy!