Description

Retail Shopping Data.

Retail shopping transactions for 2,469 households over one year. The data originate from the 84.51° Complete Journey 2.0 source files <https://www.8451.com/area51>, which also include useful metadata on products, coupons, campaigns, and promotions.

completejourney


Overview

completejourney is an R data package that provides access to data from the Complete Journey Study provided by 84.51°. The data represent grocery store shopping transactions over one year from a group of 2,469 households who are frequent shoppers at a retailer. They cover all of each household's purchases, not just those from a limited number of categories. For certain households, demographic information and direct marketing contact history are also included. The package provides the following data sets:

  • campaigns: campaigns received by each household
  • campaign_descriptions: campaign metadata (length of time active)
  • coupons: coupon metadata (UPC code, campaign, etc.)
  • coupon_redemptions: coupon redemptions (household, day, UPC code, campaign)
  • demographics: household demographic data (age, income, family size, etc.)
  • products: product metadata (brand, description, etc.)
  • promotions_sample: a sampling of the product placement in mailers and in stores corresponding to advertising campaigns
  • transactions_sample: a sampling of the products purchased by households
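
As a quick illustration of working with the built-in data, the sketch below totals spending per household in transactions_sample using base R only. It assumes that transactions_sample has the same columns as the full transactions data set printed further down in this README (in particular household_id and sales_value); the object name spend is just illustrative.

```r
library(completejourney)

# Sum sales_value within each household (assumes these two columns exist
# in transactions_sample, as they do in the full transactions data).
spend <- aggregate(sales_value ~ household_id,
                   data = transactions_sample,
                   FUN = sum)

# Order households from highest to lowest total spend.
spend <- spend[order(spend$sales_value, decreasing = TRUE), ]

# Inspect the top of the ranking.
head(spend)
```

The same pattern works on the full data returned by get_transactions(), at the cost of a longer run time.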

Installation

Install the released version from CRAN with:

install.packages("completejourney")

To get a bug fix, or use a feature from the development version, you can install completejourney from GitHub with:

# install.packages("remotes")
remotes::install_github("bradleyboehmke/completejourney")

Downloading full data sets

Due to the size of the transactions and promotions data, the package provides a sampling of the data built-in with transactions_sample and promotions_sample. However, you can access the full promotions and transactions data sets from the source GitHub repository with the following:

library(completejourney)

# get the full transactions data set
transactions <- get_transactions()
transactions
## # A tibble: 1,469,307 x 11
##    household_id store_id basket_id product_id quantity sales_value retail_disc
##    <chr>        <chr>    <chr>     <chr>         <dbl>       <dbl>       <dbl>
##  1 900          330      31198570… 1095275           1        0.5        0    
##  2 900          330      31198570… 9878513           1        0.99       0.1  
##  3 1228         406      31198655… 1041453           1        1.43       0.15 
##  4 906          319      31198705… 1020156           1        1.5        0.290
##  5 906          319      31198705… 1053875           2        2.78       0.8  
##  6 906          319      31198705… 1060312           1        5.49       0.5  
##  7 906          319      31198705… 1075313           1        1.5        0.290
##  8 1058         381      31198676… 985893            1        1.88       0.21 
##  9 1058         381      31198676… 988791            1        1.5        1.29 
## 10 1058         381      31198676… 9297106           1        2.69       0    
## # … with 1,469,297 more rows, and 4 more variables: coupon_disc <dbl>,
## #   coupon_match_disc <dbl>, week <int>, transaction_timestamp <dttm>

# get the full promotions data set
promotions <- get_promotions()
promotions
## # A tibble: 20,940,529 x 5
##    product_id store_id display_location mailer_location  week
##    <chr>      <chr>    <fct>            <fct>           <int>
##  1 1000050    316      9                0                   1
##  2 1000050    337      3                0                   1
##  3 1000050    441      5                0                   1
##  4 1000092    292      0                A                   1
##  5 1000092    293      0                A                   1
##  6 1000092    295      0                A                   1
##  7 1000092    298      0                A                   1
##  8 1000092    299      0                A                   1
##  9 1000092    304      0                A                   1
## 10 1000092    306      0                A                   1
## # … with 20,940,519 more rows

# a convenience function to get both (%<-% is the unpacking assignment operator from zeallot)
c(promotions, transactions) %<-% get_data(which = 'both', verbose = FALSE)
dim(promotions)
## [1] 20940529        5

dim(transactions)
## [1] 1469307      11
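
The printed column lists above show that transactions and promotions share product_id, store_id, and week, so the two tables can be matched on those keys. The following is a minimal sketch of that join on the built-in samples, assuming transactions_sample and promotions_sample carry the same columns as their full counterparts; the names keys and promoted are illustrative only.

```r
library(completejourney)

# Shared key columns, taken from the printed output of the full data sets.
keys <- c("product_id", "store_id", "week")

# Inner join: keep purchases that have a promotion record for the same
# product, store, and week. A purchase can appear more than once if
# several promotion records match it.
promoted <- merge(transactions_sample, promotions_sample, by = keys)

# Number of matched rows and a look at a few of them.
nrow(promoted)
head(promoted[, c(keys, "sales_value", "display_location", "mailer_location")])
```

Running the same join on the full data sets is possible but memory intensive, since the full promotions table has over 20 million rows.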

Learn more

Learn more about the completejourney data, and the type of insights you can look for, at http://bit.ly/completejourney.

Source

The Complete Journey data is available at: http://www.8451.com/area51/.

Metadata

Version

1.1.0

License

Unknown

Platforms (75)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
  • aarch64-darwin
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows