Description

Normalize and Match City Names to NUTS Regions.

Normalizes city names for EEA countries and matches them to NUTS 3 regions using provided crosswalks. Features include comprehensive normalization rules, cascading matching logic (Exact NUTS -> Exact LAU -> Fuzzy), and single-source data synthesis. The package implements the NUTS classification as described in the NUTS methodology (Eurostat (2021) <https://ec.europa.eu/eurostat/web/nuts>).

placematchr: Robust City Normalization and NUTS Matching for Europe

placematchr is an R package designed to normalize city names and map them to standard NUTS 3 and LAU (Local Administrative Unit) codes across more than 30 European countries.

It is intended for harmonizing messy geographical data (such as survey responses and address lists) with official regional identifiers for analysis.

Key Features

  • Broad Coverage: Supports all 27 EU member states plus LI, NO, CH, MK, TR, and UK (see Supported Countries).
  • English Exonym Support: Handles common English names for major cities (e.g., "Munich" matches "München", "Prague" matches "Praha", "Florence" matches "Firenze", "Cologne" matches "Köln").
  • Robust Suburb Handling (Resolves to the Suburb's LAU, not the Central City):
    • Normalizes suburb suffixes by stripping them (e.g., "Garching b. München" -> "Garching"). The matching logic then aligns with the suburb's specific LAU, ensuring precise local mapping rather than defaulting to the central metropolitan area.
    • Correctly disambiguates complex cases like "Frankfurt (Oder)" vs "Frankfurt am Main".
    • Handles article inversions in Spanish/French (e.g., "Rozas, Las" -> "Las Rozas").
  • Cascading Matching Logic:
    1. Exact NUTS Match: Checks if the normalized input matches a NUTS 3 region name directly (e.g., "Berlin").
    2. Exact LAU Match: Checks if it matches a Local Administrative Unit (e.g., "Garching").
    3. Fuzzy LAU Match: Fuzzy string matching for slight variations (e.g., typos).
    4. Fuzzy NUTS Match: Fallback to fuzzy NUTS identification.
  • Zero-Config Data: All necessary geographical data (NUTS/LAU tables) is processed and bundled with the package.
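
The cascading order above can be illustrated with a minimal sketch in base R. This is not the package's implementation: the helper names (exact_lookup, fuzzy_lookup, cascade_match), the toy reference tables, and the edit-distance threshold of 2 are all assumptions made for illustration, and the names are assumed to be already lowercased and ASCII-folded.

```r
# Toy reference tables (assumption: the bundled NUTS/LAU data looks different).
nuts <- data.frame(
  name    = c("berlin", "munchen, landeshauptstadt"),
  nuts_id = c("DE300", "DE212"),
  stringsAsFactors = FALSE
)
lau <- data.frame(
  name    = c("garching", "munchen"),
  nuts_id = c("DE21H", "DE212"),
  stringsAsFactors = FALSE
)

# Exact lookup: returns the NUTS 3 code or NULL when there is no hit.
exact_lookup <- function(x, table) {
  hit <- match(x, table$name)
  if (is.na(hit)) NULL else table$nuts_id[hit]
}

# Fuzzy lookup: nearest name by Levenshtein distance, accepted only
# if it lies within max_dist edits (hypothetical threshold).
fuzzy_lookup <- function(x, table, max_dist = 2) {
  d <- utils::adist(x, table$name)[1, ]
  best <- which.min(d)
  if (d[best] <= max_dist) table$nuts_id[best] else NULL
}

# Try each step in order and stop at the first one that yields a code.
cascade_match <- function(x, nuts, lau, max_dist = 2) {
  x <- tolower(trimws(x))
  steps <- list(
    exact_nuts = function() exact_lookup(x, nuts),
    exact_lau  = function() exact_lookup(x, lau),
    fuzzy_lau  = function() fuzzy_lookup(x, lau, max_dist),
    fuzzy_nuts = function() fuzzy_lookup(x, nuts, max_dist)
  )
  for (step in names(steps)) {
    id <- steps[[step]]()
    if (!is.null(id)) {
      return(data.frame(input = x, nuts_id = id, method = step,
                        stringsAsFactors = FALSE))
    }
  }
  data.frame(input = x, nuts_id = NA_character_, method = NA_character_,
             stringsAsFactors = FALSE)
}

cascade_match("Berlin", nuts, lau)    # method: exact_nuts, nuts_id: DE300
cascade_match("Garching", nuts, lau)  # method: exact_lau,  nuts_id: DE21H
cascade_match("Garchng", nuts, lau)   # method: fuzzy_lau,  nuts_id: DE21H
cascade_match("Berlim", nuts, lau)    # method: fuzzy_nuts, nuts_id: DE300
```

Recording which step produced the match (the method column in this sketch) makes it easy to review fuzzy hits separately from exact ones.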

Usage

Basic Matching

library(placematchr)

# Match a single city in Germany
match_city("Munich", country = "DE")
# Returns a data frame with NUTS_ID: DE212, Name: München, Landeshauptstadt

# Match a list of cities in Italy
cities <- c("Rome", "Milan", "Naples", "Venice")
match_city(cities, country = "IT")
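
How the package resolves English exonyms internally is not described here. As a rough sketch of the idea, a lookup table can translate known exonyms to local names before matching. The table contents come from the examples in this README; the names exonyms and resolve_exonym are hypothetical and not part of placematchr.

```r
# Hypothetical exonym table, keyed by lowercase English name.
exonyms <- c(
  munich   = "München",
  prague   = "Praha",
  florence = "Firenze",
  cologne  = "Köln"
)

# Replace known exonyms with the local name; leave other inputs unchanged.
resolve_exonym <- function(x) {
  x <- trimws(x)
  hit <- exonyms[tolower(x)]   # NA where no exonym is known
  unname(ifelse(is.na(hit), x, hit))
}

resolve_exonym(c("Munich", " Cologne ", "Garching"))
# "München" "Köln" "Garching"
```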

Handling Suburbs and Variations

# Input: "Garching b. München" (Suburb)
# Result: The suffix "b. München" is stripped. The base name "Garching" is matched against LAUs.
# Final Match: "Garching" (LAU). Mapped to Munich District (NUTS 3: DE21H), *not* Munich City (DE212).
match_city("Garching b. München", country = "DE")

# Input: "Champs-sur-Marne" (French Suburb)
# Result: Matches "Champs" (LAU).
match_city("Champs-sur-Marne", country = "FR")
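
The normalization rules mentioned above (suffix stripping and article inversion) can be approximated with two regular expressions. This is a sketch under stated assumptions, not the package's rule set: the function names are hypothetical, only the "b." and "bei" qualifiers are handled, and the article list is a small illustrative sample.

```r
# Drop a trailing "b. <City>" or "bei <City>" qualifier
# (assumption: these are the only suffix forms handled here).
strip_suburb_suffix <- function(x) {
  sub("\\s+(b\\.|bei)\\s+.*$", "", x, perl = TRUE)
}

# Move a trailing article to the front: "Rozas, Las" -> "Las Rozas".
fix_article_inversion <- function(x) {
  sub("^(.*),\\s*(El|La|Los|Las|Le|Les)$", "\\2 \\1", x, perl = TRUE)
}

# Apply both rules after trimming surrounding whitespace.
normalize_city <- function(x) {
  fix_article_inversion(strip_suburb_suffix(trimws(x)))
}

normalize_city("Garching b. München")  # "Garching"
normalize_city("Rozas, Las")           # "Las Rozas"
normalize_city("Frankfurt (Oder)")     # unchanged
normalize_city("Frankfurt am Main")    # unchanged
```

Neither pattern touches the two Frankfurt names, so their disambiguation would have to happen in the matching step rather than in normalization.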

Supported Countries

BE, BG, CZ, DK, DE, EE, IE, EL, ES, FR, HR, IT, CY, LV, LT, LU, HU, MT, NL, AT, PL, PT, RO, SI, SK, FI, SE, LI, NO, CH, MK, TR, UK.
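
The list uses Eurostat country codes, which differ from ISO 3166-1 alpha-2 in two cases: Greece is EL (ISO: GR) and the United Kingdom is UK (ISO: GB). If your data carries ISO codes, a small helper like the one below can translate them before calling the package. The helper names are hypothetical, and whether match_city() accepts ISO codes directly is not stated here.

```r
# Country codes as listed above (Eurostat convention).
supported <- c(
  "BE", "BG", "CZ", "DK", "DE", "EE", "IE", "EL", "ES", "FR", "HR",
  "IT", "CY", "LV", "LT", "LU", "HU", "MT", "NL", "AT", "PL", "PT",
  "RO", "SI", "SK", "FI", "SE", "LI", "NO", "CH", "MK", "TR", "UK"
)

# Map ISO 3166-1 alpha-2 codes to Eurostat codes where they differ.
to_eurostat_code <- function(iso2) {
  code <- toupper(trimws(iso2))
  code[code == "GR"] <- "EL"
  code[code == "GB"] <- "UK"
  code
}

# TRUE for codes that appear in the supported list after translation.
is_supported <- function(code) {
  to_eurostat_code(code) %in% supported
}

to_eurostat_code(c("gr", "GB", "de"))  # "EL" "UK" "DE"
is_supported(c("GR", "IS"))            # TRUE FALSE
```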

License

MIT.

Metadata

Version

0.2.4

License

Unknown

Platforms (78)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    uefi
    WASI
    Windows
  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-uefi
  • aarch64-windows
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-linux
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-uefi
  • x86_64-windows