Description

Tidy Access to Women's Tennis Association (WTA) Data.

Scrapes and tidies publicly available data from the Women's Tennis Association website (<https://www.wtatennis.com>). Provides helpers to retrieve player biographies, singles and doubles career overviews, match histories, live rankings and aggregate statistics. Dynamic pages are rendered through a headless 'Chrome' session so 'JavaScript'-generated content is fully captured, and all outputs are returned as tidy data frames suitable for downstream analysis or visualisation.

matchpointR

Lifecycle: experimental

matchpointR turns the public pages of the Women's Tennis Association (https://www.wtatennis.com) into tidy data frames. It ships helpers for player biographies, career highlights, full match histories and live rankings.

Dynamic content (matches, rankings) is rendered through a headless Chrome session via chromote, so JavaScript-generated sections are fully captured before parsing. Where the WTA site exposes structured schema.org JSON-LD data, matchpointR reads from that in preference to CSS selectors for resilience against site redesigns.
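As a rough illustration of the JSON-LD-first approach described above, the sketch below pulls a schema.org block out of a page and turns it into a one-row data frame. It is not matchpointR's internal code: the HTML fragment, the read_json_ld() helper and the field names (name, nationality) are assumptions made for the example, and the real WTA markup may differ. It relies only on xml2 and jsonlite.

```r
library(xml2)
library(jsonlite)

# Hypothetical page fragment; the real WTA markup may look different.
html <- '<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Person",
 "name": "Example Player", "nationality": "CZE"}
</script>
</head><body><h1 class="player-name">Example Player</h1></body></html>'

# Illustrative helper (not part of matchpointR): parse every JSON-LD
# block found on a page into a plain R list.
read_json_ld <- function(page) {
  nodes <- xml_find_all(page, "//script[@type='application/ld+json']")
  lapply(trimws(xml_text(nodes)), fromJSON, simplifyVector = FALSE)
}

page <- read_html(html)
blocks <- read_json_ld(page)

# Keep the first block typed as a schema.org Person.
person <- Filter(function(b) identical(b[["@type"]], "Person"), blocks)[[1]]

player <- data.frame(
  name = person[["name"]],
  nationality = person[["nationality"]],
  stringsAsFactors = FALSE
)
player
```

Because the structured block is keyed by schema.org property names rather than CSS classes, a visual redesign that renames the player-name class would leave this path untouched.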

Installation

You can install the development version of matchpointR from GitHub with:

# install.packages("pak")
pak::pak("Angnar-97/matchpointR")

Or, once it lands on CRAN:

install.packages("matchpointR")

The dynamic scrapers also need a working Chrome or Chromium install; chromote launches and manages the headless session automatically.
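To check up front whether a browser can be found, something along these lines may help. It is a minimal sketch that assumes chromote::find_chrome() returns the browser path, or nothing usable when no browser is located; chrome_available() is a hypothetical helper, not a matchpointR function.

```r
# Hypothetical helper: TRUE when chromote is installed and reports a
# Chrome/Chromium path, FALSE otherwise.
chrome_available <- function() {
  if (!requireNamespace("chromote", quietly = TRUE)) {
    return(FALSE)
  }
  path <- suppressMessages(chromote::find_chrome())
  !is.null(path) && length(path) >= 1 && nzchar(path[[1]])
}

if (!chrome_available()) {
  message("No Chrome/Chromium found; the dynamic scrapers may not work.")
}
```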

Usage

library(matchpointR)

# Build a canonical player URL from id + slug
url <- wta_player_url(320301, "katerina-siniakova")

# Player biography (name, nationality, headshot, flag, ...)
wta_get_player_basics(url, download_images = FALSE)

# Career highlights (ranks, titles, prize money)
wta_get_player_overview(url)

# Full match history (walks the "Show more" button)
matches_url <- wta_player_url(320301, "katerina-siniakova", "matches")
wta_get_player_matches(matches_url)

# Live rankings
wta_get_rankings("singles", top = 50)

Terms of service and fair use

matchpointR reads publicly accessible pages of https://www.wtatennis.com using a single headless browser session per call. Before using this package, especially at scale, you are responsible for checking and complying with the WTA website's Terms and Conditions. The package does not bypass paywalls, authentication walls or any technical access controls, does not interact with user accounts, and exposes no bulk-download helpers. Please scrape considerately: the chromote session hits the site as a real browser would, so iterating over many players or ranking dates back-to-back is equivalent to a human browsing the site rapidly. Add an explicit Sys.sleep() between calls in loops.
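One way to space out requests is a small wrapper like the one below. fetch_politely() is a hypothetical helper written for this example, not part of matchpointR, and the 5-second default is an arbitrary assumption rather than a limit published by the WTA.

```r
# Apply `fetch` to each URL in turn, pausing `delay` seconds between
# consecutive calls (no pause before the first one).
fetch_politely <- function(urls, fetch, delay = 5) {
  results <- vector("list", length(urls))
  for (i in seq_along(urls)) {
    if (i > 1) Sys.sleep(delay)
    results[[i]] <- fetch(urls[[i]])
  }
  results
}

# Usage with matchpointR (not run here, since it needs network access):
# urls <- c(wta_player_url(320301, "katerina-siniakova"))
# overviews <- fetch_politely(urls, wta_get_player_overview, delay = 5)
```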

Scope and caveats

  • HTML selectors may drift as the site is redesigned. Where possible matchpointR reads from the page's schema.org JSON-LD block for stability. File an issue on GitHub if a function stops returning data.
  • Functions return every column as character to stay faithful to the rendered page; cast to numeric or date types in a follow-up step.
  • Tour-wide statistics leaderboards are not yet implemented; tracked in issue #1.
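The casting step mentioned above might look like the following in base R. The data frame, its column names (date, rank, prize_money) and the value formats are invented for illustration; check the columns and formats that the functions actually return before adapting it.

```r
# Stand-in for scraped output: every column arrives as character.
matches <- data.frame(
  date = c("2024-01-15", "2024-02-03"),
  rank = c("12", "9"),
  prize_money = c("$1,250,000", "$98,500"),
  stringsAsFactors = FALSE
)

# Parse ISO dates, convert ranks to integers, and strip currency
# symbols and thousands separators before converting to numeric.
matches$date <- as.Date(matches$date, format = "%Y-%m-%d")
matches$rank <- as.integer(matches$rank)
matches$prize_money <- as.numeric(gsub("[^0-9.]", "", matches$prize_money))

str(matches)
```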

License

Apache License (>= 2).

Author

Alejandro Navas González (Angnar).

Metadata

Version

0.1.0

License

Apache License (>= 2)

Platforms (80)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    uefi
    WASI
    Windows
  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-uefi
  • aarch64-windows
  • aarch64_be-none
  • arc-linux
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-linux
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • sh4-linux
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-uefi
  • x86_64-windows