Introduction
This vignette serves as a data dictionary for the UNAIDS Estimates Data pulled from the EDMS. It will cover the data structure and variables, as well as provide an overview of the indicators included. Finally, it will touch briefly on common use cases and which data tab to access for those analytics.
UNAIDS Clean Data Structure
Let’s first take a look at the data structure.
df_unaids <- load_unaids(pepfar_only = TRUE)
glimpse(df_unaids)
#> Rows: 67,868
#> Columns: 16
#> $ year <int> 1990, 1990, 1990, 1990, 1990, 1990, 1990, 1990, 1990,…
#> $ iso <chr> "AGO", "AGO", "AGO", "AGO", "AGO", "AGO", "AGO", "AGO…
#> $ country <chr> "Angola", "Angola", "Angola", "Angola", "Angola", "An…
#> $ pepfar <lgl> TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE, TRUE,…
#> $ region <chr> "Eastern and southern Africa", "Eastern and southern …
#> $ indicator <chr> "Prevalence", "Prevalence", "Prevalence", "Number AID…
#> $ indicator_type <chr> "Rate", "Rate", "Rate", "Integer", "Integer", "Intege…
#> $ age <chr> "15-49", "15-24", "15-24", "All", "0-14", "15+", "All…
#> $ sex <chr> "All", "Female", "Male", "All", "All", "All", "All", …
#> $ estimate <dbl> 8.0e-01, 7.0e-01, 3.0e-01, 2.8e+03, 1.2e+03, 1.6e+03,…
#> $ lower_bound <dbl> 5.0e-01, 3.0e-01, 1.0e-01, 1.6e+03, 7.0e+02, 9.1e+02,…
#> $ upper_bound <dbl> 1.0, 1.2, 0.5, 3800.0, 1500.0, 2300.0, 100.0, 6600.0,…
#> $ estimate_flag <chr> NA, NA, NA, NA, NA, NA, "less than", NA, NA, NA, NA, …
#> $ achv_95_plhiv <lgl> NA, NA, NA, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,…
#> $ achv_95_relative <lgl> NA, NA, NA, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE,…
#> $ achv_epi_control <lgl> NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, N…
The data are a panel dataset of countries from 1990 to 2023 reporting on 19 different indicators. Each of those indicators may be disaggregated by some combination of age and sex, as seen in the table below. It's also important to mention that the values are provided as estimates, which have lower and upper bounds.
Estimates may have been reported with character values, e.g. “<100” or “>98”. In order to analyze and plot these data, we have removed the character values and noted them in the estimate_flag column.
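The flag handling described above can be pictured with a small sketch. This is not the package's actual cleaning code: the column name estimate_raw and the label "greater than" are assumptions made for illustration (only "less than" appears in the output above), and the raw strings are hypothetical.

```r
library(dplyr)

# Hypothetical raw values as they might look before cleaning
df_raw <- tibble(estimate_raw = c("<100", "2800", ">98"))

df_clean <- df_raw %>%
  mutate(
    # Record the direction of the inequality; "greater than" is an assumed label
    estimate_flag = case_when(
      startsWith(estimate_raw, "<") ~ "less than",
      startsWith(estimate_raw, ">") ~ "greater than",
      TRUE ~ NA_character_
    ),
    # Strip the inequality sign so the value can be treated as a number
    estimate = as.numeric(gsub("[<>]", "", estimate_raw))
  )
```

When analyzing the real data, a check such as filter(df_unaids, !is.na(estimate_flag)) would show which estimates were originally reported this way.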
We have also provided a few additional columns to the dataset.
- pepfar: a logical value that notes whether the country is a PEPFAR country or not
- indicator_type: a character value that tells you whether the estimate is an integer, percentage, rate, or ratio
- achv_95_plhiv and achv_95_relative: logical values that identify whether the country in a given year has achieved all three 95 goals (with PLHIV or relative base) with the point estimate. This achievement is broken down by each age and sex group (All, 0-14, 15+, Female/15+, Male/15+)
- achv_epi_control: a logical value that identifies whether a country in a given year has achieved all three requirements for epidemic control (IMR less than 1 and both new infections and deaths declining)
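As a rough illustration of the three epidemic control requirements listed above, the sketch below flags years in which all of them hold. The values are made up for a hypothetical country, the column names are illustrative, and treating "declining" as a year-over-year decrease is an assumption; the achv_epi_control column in the dataset may be derived differently.

```r
library(dplyr)

# Made-up values for a hypothetical country; these are NOT UNAIDS estimates
df_epi <- tibble(
  year           = 2020:2022,
  imr            = c(0.9, 0.8, 0.8),
  new_infections = c(1000, 900, 800),
  deaths         = c(1100, 1150, 1000)
) %>%
  arrange(year) %>%
  mutate(
    # Assumption: "declining" means lower than the previous year
    infections_declining = new_infections < lag(new_infections),
    deaths_declining     = deaths < lag(deaths),
    # All three requirements must hold; the first year is NA (no prior year)
    epi_control = imr < 1 & infections_declining & deaths_declining
  )
```

In this toy series the flag is NA for 2020, FALSE for 2021 (deaths rose), and TRUE for 2022.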
Integer Indicators | |
Number AIDS Related Deaths | All, 0-14, 15+ |
Number Deaths Averted by ART | All |
Number Infections Averted by PMTCT | 0-14 |
Number Known Status of PLHIV | All, 0-14, 15+, Female/15+, Male/15+ |
Number New HIV Infections | All, 0-14, 15+ |
Number PLHIV | All, 0-14, 15+, Female/15+ |
Number PMTCT Needing ART | Female/15-49 |
Number PMTCT Receiving ART | Female/15-49 |
Number Total Deaths to HIV Population | All, Female, Male |
Number VLS of PLHIV | All, 0-14, 15+, Female/15+, Male/15+ |
Number on ART of PLHIV | All, 0-14, 15+, Female/15+, Male/15+ |
Percent Indicators | |
Percent Known Status of PLHIV | All, 0-14, 15+, Female/15+, Male/15+ |
Percent VLS of PLHIV | All, 0-14, 15+, Female/15+, Male/15+ |
Percent VLS on ART | All, 0-14, 15+, Female/15+, Male/15+ |
Percent on ART of PLHIV | All, 0-14, 15+, Female/15+, Male/15+ |
Percent on ART with Known Status | All, 0-14, 15+, Female/15+, Male/15+ |
Rate Indicators | |
Incidence (per 1,000) | All |
Prevalence | 15-49, Female/15-24, Male/15-24 |
Ratio Indicators | |
Incidence mortality ratio (IMR) | All |
As you can see, the available disaggregates vary considerably across indicators. For instance, “Number AIDS Related Deaths” has a total and age disaggregates for 0-14 and 15+, but it is not broken down by sex.
As such, it is important to filter for the age, sex, and indicator type disaggregates that you are interested in before diving further into your analytics, to avoid any potential double counting.
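To make the double-counting risk concrete, here is a minimal sketch. The three rows appear to correspond to the Angola 1990 "Number AIDS Related Deaths" values visible in the glimpse() output earlier; they are rebuilt by hand here as a toy table, so treat them as illustrative rather than authoritative.

```r
library(dplyr)

# Toy rows mirroring the Angola 1990 values shown in the glimpse() output
df_toy <- tibble(
  year      = 1990L,
  country   = "Angola",
  indicator = "Number AIDS Related Deaths",
  age       = c("All", "0-14", "15+"),
  sex       = "All",
  estimate  = c(2800, 1200, 1600)
)

# Summing every row counts each death twice:
# once in "All" and once in its age band
naive_total <- df_toy %>%
  summarise(total = sum(estimate)) %>%
  pull(total)

# Filtering to a single disaggregate first gives the intended total
filtered_total <- df_toy %>%
  filter(age == "All", sex == "All") %>%
  summarise(total = sum(estimate)) %>%
  pull(total)
```

Here naive_total comes out at 5600, twice the filtered_total of 2800, because the 0-14 and 15+ rows already add up to the "All" row.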