widr
R interface to the World Inequality Database (WID). The package downloads distributional national accounts (DINA) data, validates and decodes WID variable codes, and provides helpers for currency conversion, inequality measurement, and plotting.
Installation
# CRAN
install.packages("widr")
# Development version
remotes::install_github("cherylisabella/widr")
Variable codes
WID variables follow the grammar <type><concept>[age][pop]:
| Component | Width | Example | Meaning |
|---|---|---|---|
| type | 1 letter | s | share |
| concept | 5–6 letters | ptinc | pre-tax national income |
| age | 3 digits | 992 | adults 20+ |
| pop | 1 letter | j | equal-split between spouses |
sptinc992j = share of pre-tax national income for equal-split adults aged 20+.
wid_decode("sptinc992j")
#> $series_type "s"
#> $concept "ptinc"
#> $age "992"
#> $pop "j"
wid_encode("s", "ptinc", age = "992", pop = "j") # "sptinc992j"
wid_is_valid(series_type = "s", concept = "ptinc") # TRUE
wid_search("national income") # search the concept table
Full catalogue: https://wid.world/codes-dictionary/
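To make the grammar concrete, here is a minimal base-R sketch that splits a code with a regular expression mirroring `<type><concept>[age][pop]`. `parse_wid_code()` is a hypothetical helper written for illustration; it is not part of widr and is not claimed to be how `wid_decode()` works.

```r
# Hypothetical helper, not part of widr: split a WID variable code using a
# regular expression that follows <type><concept>[age][pop].
parse_wid_code <- function(code) {
  pattern <- "^([a-z])([a-z]{5,6})([0-9]{3})?([a-z])?$"
  m <- regmatches(code, regexec(pattern, code))[[1]]
  # regmatches() returns character(0) when the pattern does not match
  if (length(m) == 0) stop("not a well-formed WID code: ", code)
  # m[1] is the full match; m[2:5] are the four capture groups
  list(series_type = m[2], concept = m[3], age = m[4], pop = m[5])
}

parts <- parse_wid_code("sptinc992j")
str(parts)
```

A pattern alone cannot tell a six-letter concept from a five-letter concept followed by a population letter when the age is omitted, so a real validator would also need a lookup against the concept table.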
Download data
library(widr)
# Top 1% pre-tax income share, United States, 2000–2022
top1 <- download_wid(
  indicators = "sptinc992j",
  areas      = "US",
  perc       = "p99p100",
  years      = 2000:2022
)
top1
#> <wid_df> 23 rows | 1 countries | 1 variables
#>   country   variable percentile year value age pop
#> 1      US sptinc992j    p99p100 2000 0.168 992   j
download_wid() returns a wid_df, a data.frame subclass that works natively with dplyr, ggplot2, and base R without any $data unwrapping.
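The sketch below shows why a data.frame subclass needs no unwrapping. It assumes a `wid_df` is an ordinary data.frame carrying one extra class; that is an assumption about the mechanism, not a description of widr's internals. The rows are placeholder values, not real WID data.

```r
# Toy rows with placeholder values; these are NOT real WID figures.
toy <- data.frame(
  country = "US",
  year    = 2000:2002,
  value   = c(0.10, 0.11, 0.12)
)

# Assumption: a wid_df-like object is a data.frame with an extra class in front.
class(toy) <- c("wid_df", class(toy))

inherits(toy, "data.frame")        # base R still treats it as a data.frame
recent <- toy[toy$year >= 2001, ]  # ordinary row subsetting works unchanged
mean(recent$value)
```

Because the object is a data.frame all the way down, any function that accepts a data.frame can take it directly.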
Key parameters
| Parameter | Default | Description |
|---|---|---|
| indicators | "all" | Variable codes |
| areas | "all" | ISO-2 country / region codes |
| years | "all" | Integer vector or "all" |
| perc | "all" | Percentile codes, e.g. "p99p100" |
| ages | "992" | Three-digit age code |
| pop | "j" | Population unit |
| metadata | FALSE | Attach source info as attr(., "wid_meta") |
| include_extrapolations | TRUE | Include interpolated and extrapolated points |
| cache | TRUE | Cache responses to disk |
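Percentile codes such as "p99p100" (the top 1% in the example above) can be read as a lower and an upper percentile bound. The following base-R sketch parses that two-bound form. `parse_perc()` is a hypothetical helper, not a widr function, and it assumes every code has exactly two bounds.

```r
# Hypothetical helper, not part of widr: turn "p<lower>p<upper>" into
# population fractions between 0 and 1.
parse_perc <- function(code) {
  m <- regmatches(code, regexec("^p([0-9.]+)p([0-9.]+)$", code))[[1]]
  if (length(m) == 0) stop("unexpected percentile code: ", code)
  bounds <- as.numeric(m[2:3]) / 100
  c(lower = bounds[1], upper = bounds[2], width = bounds[2] - bounds[1])
}

parse_perc("p99p100")  # top 1% bracket
parse_perc("p0p50")    # bottom 50% bracket
```

The `width` element is the population share covered by the bracket, which is handy when checking that a set of brackets covers the whole distribution.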
Tidyverse integration
library(dplyr)
library(ggplot2)
top1 |>
  wid_tidy(country_names = FALSE) |>
  filter(year >= 1990) |>
  ggplot(aes(year, value)) +
  geom_line(colour = "#58a6ff", linewidth = 0.9) +
  scale_y_continuous(labels = scales::percent_format()) +
  labs(title = "Top 1% pre-tax income share: United States", x = NULL, y = NULL) +
  theme_minimal()
Inequality measures
dist <- download_wid("sptinc992j", areas = c("US", "FR"), perc = "all", years = 1990:2022)
wid_gini(dist) # Gini coefficient
wid_top_share(dist, top = 0.01) # top 1% share
wid_top_share(dist, top = 0.10) # top 10% share
thresh <- download_wid("tptinc992j", areas = "US", perc = "all")
wid_percentile_ratio(thresh) # P90/P10
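For intuition about what a Gini coefficient computed from bracket shares involves, here is a minimal base-R sketch using the trapezoid rule on the Lorenz curve. `gini_from_shares()` is a hypothetical helper; it is not claimed to match the method used by `wid_gini()`, and the inputs are toy distributions chosen so the answer is known in advance.

```r
# Hypothetical helper, not widr's implementation.
# pop_share: population share of each bracket, ordered poorest to richest
# inc_share: income share of each bracket, in the same order
gini_from_shares <- function(pop_share, inc_share) {
  stopifnot(length(pop_share) == length(inc_share))
  lorenz <- c(0, cumsum(inc_share))   # Lorenz ordinates at the bracket edges
  left   <- lorenz[-length(lorenz)]   # ordinate at each bracket's lower edge
  right  <- lorenz[-1]                # ordinate at each bracket's upper edge
  area   <- sum(pop_share * (left + right) / 2)  # trapezoid rule
  1 - 2 * area
}

gini_from_shares(rep(0.2, 5), rep(0.2, 5))  # perfect equality across quintiles
gini_from_shares(c(0.5, 0.5), c(0, 1))      # the top half holds all income
```

Linear interpolation between bracket edges ignores inequality within each bracket, so a figure computed this way is a lower bound that tightens as the brackets get finer.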
Currency conversion
download_wid("aptinc992j", areas = c("US", "FR"), perc = "p0p50") |>
  wid_convert(target = "ppp", base_year = "2022")
Supported targets: "lcu", "usd", "eur", "gbp", "ppp", "yppp".
Plotting
wid_plot_timeseries(dist) # time series, one line per country
wid_plot_compare(dist, year = 2020) # cross-country bar chart
wid_plot_lorenz(dist, country = "US") # Lorenz curve
Reusable queries
q <- wid_query(indicators = "sptinc992j", areas = c("US", "FR"))
q <- wid_filter(q, years = 2010:2022)
wid_fetch(q)
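The build, refine, fetch pattern above can be pictured as a plain list of parameters that is copied and updated before anything is downloaded. The sketch below illustrates that idea in base R; `new_query()`, `refine_query()`, and the `toy_query` class are hypothetical names and say nothing about how widr stores its queries.

```r
# Hypothetical names throughout; this is not widr's internal code.
new_query <- function(...) structure(list(...), class = "toy_query")

# Return a modified copy; the query passed in is left untouched.
refine_query <- function(q, ...) {
  structure(utils::modifyList(unclass(q), list(...)), class = class(q))
}

base_q  <- new_query(indicators = "sptinc992j", areas = c("US", "FR"))
recent  <- refine_query(base_q, years = 2010:2022)  # adds a field
us_only <- refine_query(recent, areas = "US")       # overrides a field

str(unclass(us_only))
```

Keeping each refinement as a new object means one base query can be reused for several variations without re-specifying the shared parameters.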
Reference tables
Six tables synchronised from WID:
| Table | Contents |
|---|---|
| wid_series_types | Series type codes (s, a, m, …) |
| wid_concepts | Concept codes (ptinc, hweal, nninc, …) |
| wid_ages | Age group codes (992, 999, …) |
| wid_pop_types | Population unit codes (j, i, t, …) |
| wid_countries | Country and region codes |
| wid_percentiles | Percentile codes (p99p100, p0p50, …) |
widr is an independent implementation, unaffiliated with the World Inequality Lab (WIL) or the Paris School of Economics. The data are maintained by WIL.