Description

'SAS'-Style 'PROC FORMAT' for R.

Provides 'SAS' 'PROC FORMAT'-like functionality for creating and applying value formats in R. Supports discrete and range-based mapping of values to labels, reverse formatting (invalue), date/time/datetime formatting with built-in 'SAS' format names, multi-label formats, expression labels evaluated at apply-time, case-insensitive matching, import/export of format definitions, and proper handling of missing values (NA, NULL, NaN).

ksformat

SAS-style PROC FORMAT for R: create and apply value formats, range-based formatting, reverse formatting (invalue), and consistent handling of missing values (NA, NULL, NaN).

Repository: github.com/crow16384/ksformat (source code, issue tracker, and development).

Installation

From GitHub (requires the remotes package):

# install.packages("remotes")
remotes::install_github("crow16384/ksformat")

From local source (run from the package root directory):

install.packages(".", repos = NULL, type = "source")
# or
devtools::install()

Features

  • Format creation — Value-to-label mappings like SAS PROC FORMAT
  • Format application — Apply formats to vectors and data frames
  • Reverse formatting — Convert labels back to values (INVALUE)
  • Missing value handling — NA, NULL, NaN, and empty values
  • Range support — Numeric ranges with inclusive/exclusive bounds
  • Multilabel — A single value can match multiple labels (fput_all)
  • Expression labels — Dynamic labels with .x1, .x2, … evaluated at apply-time
  • Case-insensitive matching (ignore_case = TRUE in fnew)
  • Date/time/datetime — Built-in SAS format names and custom strftime patterns
  • Import/export — Parse SAS-like text (fparse), export (fexport), import CNTLOUT CSV (fimport)
  • Format library — Register and retrieve formats globally

Quick start

Discrete formatting

library(ksformat)

fnew(
  "M" = "Male",
  "F" = "Female",
  .missing = "Unknown",
  name = "sex"
)

fput(c("M", "F", NA, "X"), "sex")
# [1] "Male"    "Female"  "Unknown" "X"

Numeric ranges

fparse(text = '
VALUE age (numeric)
  [0, 18)   = "Child"
  [18, 65)  = "Adult"
  [65, HIGH] = "Senior"
  .missing   = "Age Unknown"
;
')

fputn(c(5, 25, 70, NA), "age")
# [1] "Child"       "Adult"       "Senior"      "Age Unknown"
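
To make the bracket notation concrete, here is a minimal base-R sketch of the same half-open lookup. It is not how ksformat is implemented; the helper name age_label and its arguments are hypothetical. Unlike fputn, this sketch returns NA for values that fall outside every range.

```r
# Illustration only: base-R range lookup, not ksformat's implementation.
breaks <- c(0, 18, 65, Inf)               # [0, 18), [18, 65), [65, Inf]
labels <- c("Child", "Adult", "Senior")

age_label <- function(x, missing_label = "Age Unknown") {
  # findInterval() treats intervals as left-closed, right-open;
  # rightmost.closed = TRUE makes the last interval include its upper bound.
  idx <- findInterval(x, breaks, rightmost.closed = TRUE)
  # Values below the first break (index 0) have no label.
  idx[!is.na(idx) & (idx < 1 | idx > length(labels))] <- NA
  out <- labels[idx]
  # Missing input gets the dedicated label.
  out[is.na(x)] <- missing_label
  out
}

age_label(c(5, 25, 70, NA))
```

A value of exactly 18 lands in "Adult" because the lower bound is inclusive and the upper bound of the previous range is exclusive.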

Reverse formatting (invalue)

finput("Male" = 1, "Female" = 2, name = "sex_inv")

finputn(c("Male", "Female", "Unknown"), "sex_inv")
# [1]  1  2 NA

Format library

fprint()              # list all registered formats
fmt <- format_get("sex")
fclear("sex")         # remove one format
fclear()              # clear all

Interactive library browser (Shiny)

if (interactive() && requireNamespace("shiny", quietly = TRUE)) {
  format_library_app()
}

The app shows both VALUE (ks_format) and INVALUE (ks_invalue) objects, supports name/type filtering, shows a formatted mapping table, and includes library management actions (remove selected, clear all, or quit).

In RStudio, you can also open it from Addins as Format Library Browser.

Data frames

df <- data.frame(
  sex = c("M", "F", "M", NA),
  age = c(15, 25, 70, 35)
)

fput_df(df, sex = format_get("sex"), age = format_get("age"), suffix = "_label")

Multilabel formats

With multilabel = TRUE, a single value can match multiple labels. Use fput_all() to collect all matches:

fnew(
  "0,17,TRUE,TRUE"  = "Pediatric",
  "18,Inf,TRUE,TRUE" = "Adult",
  "3,5,TRUE,TRUE"   = "Serious",
  name = "ae_age", type = "numeric", multilabel = TRUE
)

fput_all(c(10, 25, 4), "ae_age")
# [[1]] "Pediatric"
# [[2]] "Adult"
# [[3]] "Pediatric" "Serious"
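
The idea behind multilabel matching can be sketched in base R as follows. This is an illustration under stated assumptions, not the package's code: the ranges table and the all_labels helper are hypothetical, and every range is treated as closed on both ends, as in the TRUE,TRUE keys above.

```r
# Illustration only: collect every label whose closed range contains the value.
ranges <- data.frame(
  low   = c(0, 18, 3),
  high  = c(17, Inf, 5),
  label = c("Pediatric", "Adult", "Serious"),
  stringsAsFactors = FALSE
)

all_labels <- function(x) {
  # One list element per input value, holding all matching labels in table order.
  lapply(x, function(v) ranges$label[v >= ranges$low & v <= ranges$high])
}

all_labels(c(10, 25, 4))
```

Because the ranges overlap, the value 4 matches both "Pediatric" and "Serious", which is why the result is a list rather than a character vector.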

Date/time/datetime formats

SAS date format names are auto-resolved — no pre-creation needed:

fputn(Sys.Date(), "DATE9.")
# [1] "25MAR2026"    (when run on 2026-03-25)

fputn(Sys.Date(), "MMDDYY10.")
# [1] "03/25/2026"   (when run on 2026-03-25)

# Custom strftime pattern
fnew_date("%d.%m.%Y", name = "ru_date", type = "date")
fput(Sys.Date(), "ru_date")
# [1] "25.03.2026"   (when run on 2026-03-25)

Time (seconds since midnight) and datetime are also supported:

fputn(3600, "TIME8.")
# [1] "1:00:00"

fputn(Sys.time(), "DATETIME20.")
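
For readers unfamiliar with the SAS format names, the following base-R sketch shows roughly equivalent strftime-style formatting for a fixed date. It does not call ksformat and is not its implementation; the variable names are hypothetical, and the exact padding ksformat applies (for example to the hour in TIME8.) is not assumed here.

```r
# Illustration only: base-R equivalents for a fixed date and time.
d <- as.Date("2026-03-25")

# DATE9.-style (ddMONyyyy); month.abb keeps the month name locale-independent.
date9 <- paste0(
  format(d, "%d"),
  toupper(month.abb[as.integer(format(d, "%m"))]),
  format(d, "%Y")
)

# MMDDYY10.-style (mm/dd/yyyy).
mmddyy10 <- format(d, "%m/%d/%Y")

# TIME8.-style for seconds since midnight (hour left unpadded in this sketch).
secs <- 3600
time8 <- sprintf("%d:%02d:%02d", secs %/% 3600, (secs %% 3600) %/% 60, secs %% 60)

c(date9, mmddyy10, time8)
```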

Expression labels

Labels containing .x1, .x2, etc. are evaluated as R expressions at apply-time. Pass extra arguments through fput(x, fmt, ...):

stat_fmt <- fnew(
  "n"   = "sprintf('%s', .x1)",
  "pct" = "sprintf('%.1f%%', .x1 * 100)",
  name = "stat", type = "character"
)

fput(c("n", "pct"), stat_fmt, c(42, 0.053))
# [1] "42"   "5.3%"
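
One way to picture apply-time evaluation is the base-R sketch below. It assumes that extra arguments are bound to .x1, .x2, and so on, and that the label string is then parsed and evaluated; the helper eval_label is hypothetical and the package's actual mechanism may differ.

```r
# Illustration only: evaluate a label string with .x1, .x2, ... bound to extra arguments.
eval_label <- function(label, ...) {
  args <- list(...)
  if (length(args) > 0) names(args) <- paste0(".x", seq_along(args))
  # The list acts as the evaluation environment; base functions come from enclos.
  eval(parse(text = label), envir = args, enclos = baseenv())
}

labels <- c(
  n   = "sprintf('%s', .x1)",
  pct = "sprintf('%.1f%%', .x1 * 100)"
)
keys <- c("n", "pct")
vals <- c(42, 0.053)

# Pair each key with its own value, as in the fput() call above.
result <- unname(mapply(function(k, v) eval_label(labels[[k]], v), keys, vals))
result
```

Since labels are evaluated as code, label strings from untrusted sources should be handled with the same care as any other call to eval().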

Use e() to mark a label for evaluation even without .xN placeholders:

fnew("ts" = e("format(Sys.time(), '%Y-%m-%d')"), name = "demo")
fput("ts", "demo")

Case-insensitive matching

fnew("M" = "Male", "F" = "Female", name = "sex_nc",
     type = "character", ignore_case = TRUE)

fput(c("m", "F", "M", "f"), "sex_nc")
# [1] "Male"   "Female" "Male"   "Female"

Missing value handling

Priority order:

  1. NA, NULL, NaN → .missing label if defined, otherwise NA
  2. Exact match → value–label mapping
  3. Range match → range label (numeric formats)
  4. No match → .other label or original value
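
The priority order above can be sketched as a small base-R function. This is a simplified illustration, not ksformat's code: the function apply_format and its arguments are hypothetical, ranges are assumed to be half-open, and NULL is not covered because it cannot be an element of an atomic vector.

```r
# Illustration only: resolve each value by the four-step priority order.
apply_format <- function(x, exact = character(), ranges = NULL,
                         missing = NA_character_, other = NULL) {
  vapply(x, function(v) {
    if (is.na(v)) return(missing)                        # 1. NA / NaN
    key <- as.character(v)
    if (key %in% names(exact)) return(exact[[key]])      # 2. exact match
    if (!is.null(ranges) && is.numeric(v)) {             # 3. range match
      hit <- which(v >= ranges$low & v < ranges$high)
      if (length(hit) > 0) return(ranges$label[hit[1]])
    }
    if (!is.null(other)) other else key                  # 4. no match
  }, character(1), USE.NAMES = FALSE)
}

age_ranges <- data.frame(
  low = c(0, 18), high = c(18, Inf), label = c("Child", "Adult"),
  stringsAsFactors = FALSE
)

res <- apply_format(c(99, 5, 70, NA, -1),
                    exact = c("99" = "Refused"),
                    ranges = age_ranges,
                    missing = "Unknown")
res
```

The value 99 shows why the order matters: it falls inside the "Adult" range, but the exact match is checked first and wins.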

Options: keep_na = TRUE, na_if, include_empty = TRUE.

Cheat sheet

  • In R: run ksformat_cheatsheet() to open the cheat sheet in your browser (HTML), or ksformat_cheatsheet("pdf") for the PDF.
  • In this repo: HTML | PDF

Function reference

Area           Functions
Creation       fnew(), finput(), fnew_bid(), fnew_date(), fparse(), e()
Application    fput(), fputn(), fputc(), fput_all(), fput_df()
Reverse        finputn(), finputc()
Library        format_get(), fprint(), fclear(), fexport(), fimport(), format_library_app()
Utilities      is_missing(), range_spec()
Documentation  ksformat_cheatsheet() (opens the cheat sheet)

Development

install.packages(c("roxygen2", "testthat", "devtools"))
devtools::document()
devtools::test()
devtools::check()

When bumping the package version, update DESCRIPTION and then run
Rscript scripts/sync-version.R to refresh version references in cran-comments.md and any other synced files.

License

GPL-3. See https://www.gnu.org/licenses/gpl-3.0.html.

Metadata

Version

0.7.1
