Description

Utilities for Geo-Spatial Cluster Detection and Significance Classification.

Provides utilities for manipulating time series of location-based event counts to detect geo-spatial clusters. The significance of these clusters is determined using a set of models that classify based on a learned relationship between the observed count and the log(observed/expected) ratio of counts. The approach implemented here is similar to prospective space-time estimation of clusters using the scan statistic.
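As a rough illustration of the log(observed/expected) ratio mentioned above, the sketch below computes it for a single candidate cluster in base R. All counts are made up, and the way the expected count is formed (the cluster keeps its baseline share of all events) is an assumption borrowed from common scan-statistic practice, not a description of how the package computes it.

```r
# Toy counts for one candidate cluster (all values are invented)
baseline_cluster <- 40    # events in the cluster during the baseline period
baseline_total   <- 1000  # events in all locations during the baseline period
test_total       <- 70    # events in all locations during the test period
observed         <- 9     # events in the cluster during the test period

# Assumed expectation: the cluster keeps its baseline share of all events
expected <- test_total * baseline_cluster / baseline_total

# Log ratio of observed to expected counts; values above 0 indicate an excess
log_obs_exp <- log(observed / expected)
log_obs_exp
```

A classifier of the kind described would then take the pair (observed, log_obs_exp) as input; the thresholds it applies are learned by the package and are not reproduced here.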

gsClusterDetect

Description

An R package for implementing geospatial cluster identification from time series of counts, by location. Locations can be expressed as counties, zip codes, census tracts, or other user-defined geographies. Users provide:

  1. a data.frame of counts by location and date
  2. a distance object that contains the distance between each location and its neighbors

The package provides functions to create these distance objects in either matrix or list format. These can be generated for census tracts, zip codes, or counties (FIPS), or can be constructed for custom locations by providing a data frame with columns for latitude and longitude (i.e., the centroid of each location).
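To make the idea of a distance object built from centroids concrete, here is a minimal base R sketch that computes pairwise great-circle distances from a data frame of latitude and longitude. The location names, coordinates, and the helper haversine_miles() are hypothetical, and the use of the haversine formula is an assumption; the package's own constructors should be preferred in practice.

```r
# Hypothetical custom locations with centroid coordinates (illustrative values)
locs <- data.frame(
  location  = c("A", "B", "C"),
  latitude  = c(39.96, 41.50, 39.10),
  longitude = c(-83.00, -81.69, -84.51)
)

# Great-circle (haversine) distance in miles between points given in degrees
haversine_miles <- function(lat1, lon1, lat2, lon2, radius = 3958.8) {
  to_rad <- pi / 180
  dlat <- (lat2 - lat1) * to_rad
  dlon <- (lon2 - lon1) * to_rad
  a <- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * radius * asin(pmin(1, sqrt(a)))
}

# Pairwise distances: outer() supplies every (i, j) pair of row indices
idx <- seq_len(nrow(locs))
dist_mat <- outer(idx, idx, function(i, j) {
  haversine_miles(locs$latitude[i], locs$longitude[i],
                  locs$latitude[j], locs$longitude[j])
})
dimnames(dist_mat) <- list(locs$location, locs$location)
round(dist_mat, 1)
```

The result is a symmetric matrix with zeros on the diagonal, labeled by location on both axes.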

Installation

Install the gsClusterDetect package from CRAN as follows:

install.packages("gsClusterDetect")

Install the development version from GitHub as follows:

devtools::install_github("lmullany/gsClusterDetect")

Getting Started:

  1. Load the package and provide a data frame with location, date, and count columns.
library(gsClusterDetect)
df <- example_count_data
tail(df)

   location       date count
     <char>     <IDat> <int>
1:    39171 2025-02-04     1
2:    39171 2025-02-05     0
3:    39173 2025-02-04     6
4:    39173 2025-02-05     7
5:    39175 2025-02-04     2
6:    39175 2025-02-05     0
  2. Generate the distance matrix for these locations. In this case, the synthetic data has counts from counties (FIPS codes) in the state of Ohio, so we use county_distance_matrix() and pass the state abbreviation:
ohio_dm <- county_distance_matrix("OH")

# This is a named list of two elements
cat("Class:", class(ohio_dm), "\nNames:", names(ohio_dm))

Class: list 
Names: loc_vec distance_matrix
  3. Set the end of your target period. This is called the detect_date, and it is a parameter that must be passed to the find_clusters() function. Typically, this is the current (or last available) date.
detect_date <- max(df[, date])
  4. Call the find_clusters() function; see ?find_clusters for the full set of options. Below, we pass the minimum required elements (cases, distance_matrix, and detect_date) and set the distance_limit (the maximum size of the clusters) to 50 miles.
clusters <- find_clusters(
    cases = df,
    distance_matrix = ohio_dm[["distance_matrix"]],
    detect_date = detect_date,
    distance_limit = 50
)
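To give a feel for what a distance_limit of 50 miles implies, the following base R sketch lists, for each candidate center, the locations that lie within the limit. The distance values are invented and the sketch does not call the package; how find_clusters() applies the limit internally is not shown here and should be checked in its documentation.

```r
# Toy symmetric distance matrix in miles (all distances are invented)
locs <- c("39001", "39003", "39005", "39007")
dm <- matrix(
  c( 0, 30, 80, 45,
    30,  0, 60, 90,
    80, 60,  0, 20,
    45, 90, 20,  0),
  nrow = 4, byrow = TRUE, dimnames = list(locs, locs)
)

distance_limit <- 50

# For each center, keep the locations no farther away than the limit,
# ordered from nearest to farthest (the center itself comes first)
neighbors <- lapply(locs, function(center) {
  d <- dm[center, ]
  names(sort(d[d <= distance_limit]))
})
names(neighbors) <- locs
neighbors[["39001"]]
```

A smaller distance_limit shrinks each neighborhood, so fewer and more compact candidate clusters are considered.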

Contacts:

Copyright 2026 The Johns Hopkins University Applied Physics Laboratory LLC.

Metadata

Version

1.0.0

License

Unknown

Platforms (80)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    uefi
    WASI
    Windows
  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-uefi
  • aarch64-windows
  • aarch64_be-none
  • arc-linux
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-linux
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • sh4-linux
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-uefi
  • x86_64-windows