Skip to contents

Introduction

This vignette serves as a data dictionary for the UNAIDS 2022 Estimates Data. It will cover the data structure and variables, as well as provide an overview of the indicators across the "HIV Estimates" tab and the "HIV Test & Treat" tab. Finally, it will touch briefly on common use cases and which data tab to access for those analytics.

UNAIDS Clean Data Structure

Let’s first take a look at the data structure, using the HIV Estimates tab as an example.

df_pct <- pull_unaids(data_type = "HIV Estimates", pepfar_only = TRUE)
knitr::kable(head(df_pct, n = 6L), format = "html")
year iso country region indicator age sex estimate lower_bound upper_bound estimate_flag sheet indic_type pepfar achv_95_plhiv achv_95_relative epi_ratio_2023 epi_control
1990 KHM Cambodia Asia and the Pacific Number AIDS Related Deaths All All 100 100 100 TRUE HIV Estimates Integer TRUE FALSE FALSE 0.7608939 TRUE
1990 KHM Cambodia Asia and the Pacific Number AIDS Related Deaths 0-14 All 100 100 100 TRUE HIV Estimates Integer TRUE FALSE FALSE 0.7608939 TRUE
1990 KHM Cambodia Asia and the Pacific Number AIDS Related Deaths 15+ All 100 100 100 TRUE HIV Estimates Integer TRUE FALSE FALSE 0.7608939 TRUE
1990 KHM Cambodia Asia and the Pacific Number PLHIV 0-14 All 100 100 100 TRUE HIV Estimates Integer TRUE FALSE FALSE 0.7608939 TRUE
1990 KHM Cambodia Asia and the Pacific Number PLHIV 15+ Female 200 200 200 TRUE HIV Estimates Integer TRUE FALSE FALSE 0.7608939 TRUE
1990 KHM Cambodia Asia and the Pacific Number PLHIV 15+ All 500 500 500 TRUE HIV Estimates Integer TRUE FALSE FALSE 0.7608939 TRUE

Within each of the UNAIDS datasets, there are disaggregates by age, sex, and indic_type. Let’s take a look at the disaggregates in the "HIV Estimates" tab.

Indic_type/Age/Sex Count Table from the HIV Estimates tab
indicator indic_type age sex n
Deaths averted by ART Integer All All 1732
Infections averted by PMTCT Integer 0-14 All 1596
Number AIDS Related Deaths Integer 0-14 All 1870
Number AIDS Related Deaths Integer 15+ All 1870
Number AIDS Related Deaths Integer All All 1870
Number New HIV Infections Integer 0-14 All 1870
Number New HIV Infections Integer 15+ All 1870
Number New HIV Infections Integer All All 1870
Number PLHIV Integer 0-14 All 1870
Number PLHIV Integer 15+ All 1870
Number PLHIV Integer 15+ Female 1870
Number PLHIV Integer All All 1870
Number PMTCT Needing ART Integer All Female 1870
Percent Incidence Percent 15-49 All 1870
Percent Incidence Percent All All 1870
Percent Prevalence Percent 15-24 Female 1870
Percent Prevalence Percent 15-24 Male 1870
Percent Prevalence Percent 15-49 All 1870
Total deaths to HIV Population Integer All All 1732
Total deaths to HIV Population Integer All Female 1732
Total deaths to HIV Population Integer All Male 1732

As you can see, there are multiple different disaggregates, especially across indicators. For instance, if you were looking at Percent Incidence (a percent indicator), there are age disaggregates for ages 15-49 and all ages, but no disaggregates for sex. If you were looking at Percent Prevalence, there are sex disaggregates for young adults (15-24), but no sex disaggregates for the age group 15-49.

As such, it is important to filter for the age/sex/indicator type disaggregates that are you interested in before diving further into your analytics.

HIV Estimates Indicators and Use Cases

The HIV Estimates data covers epidemiological indicators, such as HIV prevalence/incidence, new infections, AIDS-related death, etc. There are 2 percent indicators (Percent Prevalence and Percent Incidence) and 7 integer indicators (Number PLHIV, AIDS Related Deaths, Total Deaths to HIV Population, Number PMTCT Needing ART, Number New HIV Infections, Deaths averted by ART, and Infections averted by PMTCT).

To pull the HIV Estimates indicators for PEPFAR countries, use pull_unaids(data_type = "HIV Estimates", pepfar_only = TRUE).

Percent Indicators
Percent Prevalence HIV Prevalence
Percent Incidence HIV Incidence (per 1,000 uninfected population)
Integer Indicators
Number PLHIV Estimated number people living with HIV
Number AIDS Related Deaths AIDS Related Deaths
Total deaths to HIV Population Estimated number of deaths from all causes among people living with HIV
Number PMTCT Needing ART Pregnant women needing antiretrovirals for preventing mother-to-child transmission
Number New HIV Infections New HIV Infections
Deaths averted by ART Deaths averted by antiretroviral treatment
Infections averted by PMTCT Infections averted by preventing mother-to-child transmission

Use Case #1: Epidemic Control Curves

PEPFAR defines HIV epidemic control as the “point at which the total number of new HIV infections falls below the total number of deaths from all causes among HIV-infected individuals, with both declining.” The integer indicators in the HIV Estimates data can be used to analyze and plot these trends.

To assess progress towards this definition of epidemic control, we will use the "HIV Estimates" data:

df_est <- pull_unaids(data_type = "HIV Estimates", pepfar_only = TRUE)
Integer Indicators
PLHIV Estimated people living with HIV
Number AIDS Related Deaths AIDS Related Deaths
Total deaths to HIV Population Estimated number of deaths from all causes among people living with HIV
Number PMTCT Needing ART Pregnant women needing antiretrovirals for preventing mother-to-child transmission
Number New HIV Infections New HIV Infections
Deaths averted by ART Deaths averted by antiretroviral treatment
Infections averted by PMTCT Infections averted by preventing mother-to-child transmission
  • To create the epidemic control curves, you will need 2 indicators from the HIV Estimates dataset:
    • Number New HIV Infections
    • Total deaths to HIV Population

For more information on how to create these plots, see the article Epidemic Control Plots.

Use Case #2: Incidence/Prevalence Curves

To plot HIV Incidence and Prevalence curves with confidence intervals, we will use the percent indicators in the "HIV Estimates" data:

df_est <- pull_unaids(data_type = "HIV Estimates", pepfar_only = TRUE) %>% 
  filter(indic_type == "Percent")
Percent Indicators
Percent Prevalence HIV Prevalence
Percent Incidence HIV Incidence (per 1,000 uninfected population)
  • To create the incidence/prevalence curves, you will need 2 percent indicators from the HIV Estimates tab:
    • Percent Prevalence
    • Percent Incidence

HIV Test & Treat Indicators & Use Cases

The HIV Test & Treat Data covers indicators in the clinical cascade, that can be used to track progress to the UNAIDS 95-95-95 targets. There are 6 percent indicators and 5 integer indicators.

To pull the HIV Test & Treat indicators for PEPFAR countries, use pull_unaids(data_type = "HIV Test & Treat", pepfar_only = TRUE).

Percent Indicators
Percent Known Status of PLHIV Among PLHIV, Percent who known their HIV status
Percent on ART of PLHIV Among PLHIV, Percent on ART
Percent on ART with Known Status Among people who know their HIV status, Percent of ART
Percent VLS of PLHIV Among PLHIV, Percent with Suppressed Viral Load
Percent VLS on ART Among people on ART, Percent with Suppressed Viral Load
Percent PMTCT on ART Estimated Percentage of Women Living with HIV on ART for preventing mother-to-child transmission
Integer Indicators
Number Known Status of PLHIV Number who know their HIV status
Number on ART of PLHIV Number on ART
Number VLS of PLHIV Number with Suppressed Viral Load
Number PMTCT Needing ART Number of Pregnant women needing antiretrovirals for preventing mother-to-child transmission
Number PMTCT on ART Estimated Number of Women Living with HIV on ART for preventing mother-to-child transmission

Use Case #3: Progress to 95-95-95 Targets

The percent indicators in the HIV Test & Treat data can be used to assess progress to the UNAIDS 95-95-95 targets that are set to be achieved by 2030, where 95% of PLHIV know their status, 95% of those who know their status are accessing treatment, and 95% of those who are accessing treatment are virally suppressed. These can also be used to assess progress to the 90-90-90 goals, which were set to be achieved in 2020.

To assess progress to 95-95-95 targets, we will use the percent indicators in the "HIV Test & Treat" data:

df_tt <- pull_unaids(data_type = "HIV Test & Treat", pepfar_only = TRUE) %>%
  filter(indic_type == "Percent")
Percent Indicators
Percent Known Status of PLHIV Among PLHIV, Percent who known their HIV status
Percent on ART of PLHIV Among PLHIV, Percent on ART
Percent on ART with Known Status Among people who know their HIV status, Percent of ART
Percent VLS of PLHIV Among PLHIV, Percent with Suppressed Viral Load
Percent VLS on ART Among people on ART, Percent with Suppressed Viral Load
Percent PMTCT on ART Estimated Percentage of Women Living with HIV on ART for preventing mother-to-child transmission

As you may notice, there are multiple test & treat percent indicators for the clinical cascade. This is because there are 2 types of metrics that can be used to calculate progress to 95-95-95 targets - the relative base/conditional denominator or the PLHIV base/denominator.

  • If you are using the relative base to track 95-95-95 progress, use the following percent indicators from the "HIV Test & Treat" data tab.
    • Percent Known Status of PLHIV: Percent of PLHIV who known their HIV status
    • Percent on ART with Known Status: Among those who know their status, percent on ART
    • Percent VLS on ART: Among those who are on ART, percent virally suppressed
  • If you are using the PLHIV base, use the following percent indicators from the "HIV Test & Treat" data tab.
    • Percent Known Status of PLHIV: Percent of PLHIV who known their HIV status
    • Percent on ART of PLHIV: Among PLHVIV, percent on ART
    • Percent VLS of PLHIV: Among PLHIV, percent virally suppressed

For more information about the distinction between relative/PLHIV base and how to assess 95-95-95 progress, see the article 95-95-95 Target Progress: Relative Base vs. PLHIV Base.