Description

Dataset of the 'Contoso' Company.

A collection of synthetic datasets simulating sales transactions from a fictional company. The dataset includes various related tables that contain essential business and operational data, useful for analyzing sales performance and other business insights. Key tables included in the package are:

  • "sales": Contains data on individual sales transactions, including order details, pricing, quantities, and customer information.
  • "customer": Stores customer-specific details such as demographics, geographic location, occupation, and birthday.
  • "store": Provides information about stores, including location, size, status, and operational dates.
  • "orders": Contains details about customer orders, including order and delivery dates, store, and customer data.
  • "product": Contains data on products, including attributes such as product name, category, price, cost, and weight.
  • "calendar": A time-based table that includes date-related attributes like year, month, quarter, day, and working day indicators.

This dataset is ideal for practicing data analysis, performing time-series analysis, creating reports, or simulating business intelligence scenarios.

Contoso is a synthetic dataset containing sample sales transaction data for the fictional “Contoso” company. It includes various supporting tables for business intelligence, such as customer, store, product, and currency exchange data.

This dataset is perfect for practicing time series analysis, joins, financial modeling, or any business intelligence-related tasks.

It comes with a built-in dataset as well as the ability to create an in-memory database with DuckDB.

The package comes with the following tables:

  • sales:
    • Contains information about sales transactions, including the total sales amount, customer, store, and product involved.
  • customer:
    • Contains details about customers, such as customer key, name, address, and demographic information.
  • store:
    • Contains information about stores, including store key, name, location, and related details.
  • product:
    • Contains information about products, such as product key, name, category, and price.
  • fx:
    • Contains foreign exchange rate data, mapping currency pairs to their exchange rates on specific dates.
  • calendar:
    • Contains date-related information, including date, week, month, quarter, and year for use in time-based analysis.
  • orders:
    • Contains information about individual orders, including order key, customer key, order date, and store information.
  • orderrows:
    • Contains detailed line items for each order, including product key, quantity, and price for each item in the order.
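As a rough illustration of how the orderrows line items roll up to order level, the sketch below sums quantity times price per order_key. It runs on a toy data frame in base R, not on the package data; the column names quantity and unit_price are assumptions made for the example and may not match the real table.

```r
# Toy stand-in for the orderrows table.
# Only order_key and product_key are named in this README;
# quantity and unit_price are hypothetical column names.
toy_orderrows <- data.frame(
  order_key   = c(1001, 1001, 1002),
  product_key = c(11, 12, 11),
  quantity    = c(2, 1, 3),
  unit_price  = c(10.00, 25.50, 10.00)
)

# Line total = quantity * unit price for each line item.
toy_orderrows$line_total <- toy_orderrows$quantity * toy_orderrows$unit_price

# Sum the line totals per order to get one row per order_key.
order_totals <- aggregate(line_total ~ order_key, data = toy_orderrows, FUN = sum)
print(order_totals)
```

The same pattern should carry over to the real orderrows table once the hypothetical column names are swapped for the actual ones.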

Built into the package is the 10K row version of the dataset.

The columns are labelled with the labelled package, so you can see each column's label when you open a table with view().

The inspiration for using labelled comes from Crystal Lewis's excellent blog post.
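To make the idea of column labels concrete, here is a minimal base R sketch. It assumes only that a variable label is stored as a "label" attribute on the column, which is the attribute labelled::var_label() reads and writes. The data frame and the label texts are invented for illustration and are not taken from the package.

```r
# Toy data frame; the real contoso tables ship with labels already attached.
toy_sales <- data.frame(order_key = c(1001, 1002), quantity = c(2, 3))

# Attach a variable label to each column via the "label" attribute.
# The label texts here are hypothetical.
attr(toy_sales$order_key, "label") <- "Unique order identifier"
attr(toy_sales$quantity, "label") <- "Units sold on the order line"

# Collect every column's label into a named character vector.
labels <- vapply(toy_sales, function(col) attr(col, "label"), character(1))
print(labels)
```

With the labelled package installed, labelled::var_label(toy_sales) should report the same labels as a named list.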

For larger datasets, use create_contoso_duckdb() with one of the following sizes:

Size      Approx. Sales Rows
small     ~8,000
medium    ~2.3 million
large     ~47 million
mega      ~237 million

Source

The data is originally sourced from the SQLBI GitHub site.

Dataset overview

The relationship keys that join each of the tables are listed below.

sales          customer      product      store      orders        orderrows    fx
order_key      -             -            -          order_key     order_key    -
customer_key   customer_key  -            -          customer_key  -            -
store_key      -             -            store_key  store_key     -            -
product_key    -             product_key  -          -             product_key  -
currency_code  -             -            -          -             -            from_currency
Installation

You can install the package from CRAN:

install.packages("contoso")

Or install the development version from Codeberg:

# install.packages("pak")
pak::pak("git::https://codeberg.org/usrbinr/contoso")

Example

library(contoso)

# Create a DuckDB connection to Contoso datasets
db <- create_contoso_duckdb(size = "medium")

# Access the sales dataset
db$sales |> head()

# Launch the DuckDB UI to explore all tables interactively
launch_ui(db$con)

# Clean up when done
DBI::dbDisconnect(db$con, shutdown = TRUE)

Metadata

Version

2.1.0

License

Unknown

Platforms (78)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    uefi
    WASI
    Windows
  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-uefi
  • aarch64-windows
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-linux
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-uefi
  • x86_64-windows