Description

Scrutinizing Collections of Structured Datasets.

Provides a coherent interface for exploring and transforming multiple related data frames that share a common structure. Complements single-dataset inspection tools by operating across an entire collection at once. Also includes lightweight utilities for related file and folder management tasks.

scrutr

scrutr was previously developed under the name industtry (2024-2025) and has been renamed ahead of its first CRAN release.

scrutr is an R toolkit for scrutinizing collections of structured datasets (data frames). It targets workflows where you need to apply the same inspection, profiling, or conversion procedure across many related tables.

Two guiding ideas:

  1. "Collections first": common tasks (inspecting, detecting schema differences, batch conversion) operate at the collection level, not one dataset at a time.
  2. "Operational tooling": lightweight helpers for paths, duplicates, join checks, and string replacements that recur in production data pipelines.
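
To make the "collections first" idea concrete, here is a minimal base R sketch of applying one profiling procedure to several tables and stacking the results. It does not use scrutr and is not the package's implementation; `profile_one()` and the column names it returns are hypothetical, chosen only for illustration.

```r
# Hypothetical helper (not part of scrutr): one row per variable of a data frame.
profile_one <- function(df) {
  data.frame(
    variable   = names(df),
    class      = vapply(df, function(x) class(x)[1], character(1)),
    n_missing  = vapply(df, function(x) sum(is.na(x)), integer(1)),
    n_distinct = vapply(df, function(x) length(unique(x)), integer(1)),
    row.names  = NULL,
    stringsAsFactors = FALSE
  )
}

# A small collection of tables, using datasets shipped with R.
collection <- list(cars = cars, mtcars = mtcars, iris = iris)

# Apply the same procedure to every table and label each result with its source.
profiles <- lapply(names(collection), function(nm) {
  cbind(dataset = nm, profile_one(collection[[nm]]), stringsAsFactors = FALSE)
})

# Stack the per-table profiles into one collection-level table.
collection_profile <- do.call(rbind, profiles)
head(collection_profile)
```

The point of the sketch is the shape of the result: one table describing every variable of every dataset, rather than one report per dataset.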

Installation

# install.packages("devtools")
devtools::install_github("danielrak/scrutr")

API overview

Area | Main intent | Key functions
Inspection & profiling | Profile one dataset or a whole folder; export diagnostics to Excel. | inspect(), inspect_write(), inspect_vars()
Schema detection | Detect variable presence across datasets; compare classes. | vars_detect(), vars_compclasses(), detect_chars_structure_datasets()
Batch conversion / renaming | Convert file formats or rename files at scale via Excel masks. | convert_r(), convert_all(), mask_convert_r(), mask_rename_r(), rename_r()
Data hygiene | Duplicate diagnostics, join checks, proportions. | dupl_show(), dupl_sources(), ljoin_checks(), table_prop()
Paths & filesystem | Replicate folder structures, move through paths. | folder_structure_replicate(), path_move()
Utilities | Batch string replacement. | replace_multiple()
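
As an illustration of what schema detection across a collection involves, the following base R sketch builds a variable-presence matrix and flags variables whose class differs between datasets. It is a conceptual sketch only and does not call `vars_detect()` or `vars_compclasses()`, whose arguments and output format are not described here. The two survey tables are made-up data.

```r
# Made-up example data: two related tables with partly different schemas.
survey_2023 <- data.frame(id = 1:3, age = c(34, 51, 28), region = c("N", "S", "E"))
survey_2024 <- data.frame(id = c("4", "5"), age = c(45, 39), income = c(2100, 1850))
collection <- list(survey_2023 = survey_2023, survey_2024 = survey_2024)

# Union of all variable names found in the collection.
all_vars <- sort(unique(unlist(lapply(collection, names))))

# Presence matrix: one row per variable, one column per dataset.
presence <- vapply(collection, function(df) all_vars %in% names(df),
                   logical(length(all_vars)))
rownames(presence) <- all_vars

# First class of each variable in each dataset (NA where the variable is absent).
classes <- vapply(collection, function(df) {
  vapply(all_vars, function(v) {
    if (v %in% names(df)) class(df[[v]])[1] else NA_character_
  }, character(1))
}, character(length(all_vars)))

# Variables that appear with more than one class across the collection.
class_conflict <- apply(classes, 1, function(x) length(unique(x[!is.na(x)])) > 1)
conflicting_vars <- all_vars[class_conflict]

presence
classes
conflicting_vars
```

Here `id` is stored as an integer in one table and as character in the other, which is the kind of mismatch that tends to surface later as a failed bind or join.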

Quick examples

Inspect a single data frame

library(scrutr)
result <- inspect(CO2)
head(result)

Inspect a folder of datasets and export to Excel

mydir <- file.path(tempdir(), "example")
dir.create(mydir, showWarnings = FALSE)

saveRDS(cars, file.path(mydir, "cars.rds"))
saveRDS(mtcars, file.path(mydir, "mtcars.rds"))

inspect_vars(
  input_path = mydir,
  output_path = mydir,
  output_label = "diagnostics",
  considered_extensions = "rds"
)

Batch conversion with convert_all()

convert_all(
  input_folderpath = mydir,
  considered_extensions = "rds",
  to = "csv",
  output_folderpath = file.path(mydir, "csv_output")
)
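
For readers who want to see what a batch conversion amounts to, here is a rough base R equivalent of the rds-to-csv case. It is an assumption about the general idea, not the code behind `convert_all()`, and the folder names `sketch_in` and `sketch_out` are hypothetical.

```r
# Set up a throwaway input folder with two .rds files.
in_dir <- file.path(tempdir(), "sketch_in")
out_dir <- file.path(tempdir(), "sketch_out")
dir.create(in_dir, showWarnings = FALSE)
dir.create(out_dir, showWarnings = FALSE)
saveRDS(cars, file.path(in_dir, "cars.rds"))
saveRDS(mtcars, file.path(in_dir, "mtcars.rds"))

# Find every .rds file, read it, and write a .csv with the same base name.
rds_files <- list.files(in_dir, pattern = "\\.rds$", full.names = TRUE)
for (f in rds_files) {
  out_name <- paste0(tools::file_path_sans_ext(basename(f)), ".csv")
  # row.names = FALSE drops row names, so mtcars loses its car-name labels here.
  write.csv(readRDS(f), file.path(out_dir, out_name), row.names = FALSE)
}

list.files(out_dir)
```

A loop like this is easy to write once; a dedicated function becomes worthwhile when several input and output formats have to be handled consistently.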

Mask-driven conversion with convert_r()

# 1. Generate an Excel mask template
mask_convert_r(output_path = mydir)

# 2. Fill in the mask, then run:
convert_r(
  mask_filepath = file.path(mydir, "mask_convert_r.xlsx"),
  output_path = mydir
)

Metadata

Version

0.3.1

License

Unknown

Platforms (80)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    uefi
    WASI
    Windows
  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-uefi
  • aarch64-windows
  • aarch64_be-none
  • arc-linux
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-linux
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • sh4-linux
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-uefi
  • x86_64-windows