Description

String Functions for Compact R Code.

A collection of string functions designed for writing compact and expressive R code. 'yasp' (Yet Another String Package) is simple, fast, dependency-free, and written in pure R. The package provides: a coherent set of abbreviations for paste() from package 'base' with a variety of defaults, such as p() for "paste" and pcc() for "paste and collapse with commas"; wrap(), bracket(), and others for wrapping a string in flanking characters; unwrap() for removing pairs of characters (at any position in a string); and sentence() for cleaning whitespace around punctuation and capitalization appropriate for prose sentences.

yasp: String Functions for Compact R Code


yasp is a small R package for working with character vectors. It is written in pure base R and has no dependencies. It includes:

paste wrappers with short names and various defaults

function       mnemonic                  collapse=  sep=
p(), p0()      paste, paste0             NULL       ""
ps(), pss()    paste (sep) space         NULL       " "
psh()          paste sep hyphen          NULL       "-"
psu()          paste sep underscore      NULL       "_"
psnl()         paste sep newline         NULL       "\n"
pc()           paste collapse            ""         ""
pcs()          paste collapse space      " "        ""
pcc()          paste collapse comma      ", "       ""
pcsc()         paste collapse semicolon  "; "       ""
pcnl()         paste collapse newline    "\n"       ""
pc_and()       paste collapse and        varies     ""
pc_or()        paste collapse or         varies     ""
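
To make the table concrete, each wrapper can be thought of as base paste() with different defaults for sep and collapse. The sketch below is an approximation, not yasp's source: psu_sketch and pcc_sketch are hypothetical stand-ins, defined here only so the example runs without the package installed.

```r
# Hypothetical stand-ins for two of the wrappers, built on base paste().
# They illustrate the defaults listed in the table; they are not yasp's code.
psu_sketch <- function(...) paste(..., sep = "_")                  # collapse = NULL
pcc_sketch <- function(...) paste(..., sep = "", collapse = ", ")  # collapse = ", "

# sep joins the arguments element-wise, so the result stays a vector
psu_sketch("model", 1:3)
#> [1] "model_1" "model_2" "model_3"

# collapse joins the elements of the result into a single string
pcc_sketch(letters[1:3])
#> [1] "a, b, c"
```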

pc_and() and pc_or() collapse vectors of length 3 or greater using a serial comma (also known as the Oxford comma):

pc_and( letters[1:2] )  # "a and b"
pc_and( letters[1:3] )  # "a, b, and c"
pc_or( letters[1:2] )  # "a or b"
pc_or( letters[1:3] )  # "a, b, or c"
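
The serial-comma rule can be sketched in base R as follows. This is a minimal illustration under the assumption that the behavior is exactly what the four examples above show; collapse_serial and its conj argument are hypothetical names, not part of yasp.

```r
# Hypothetical base-R sketch of serial-comma collapsing; not yasp's implementation.
collapse_serial <- function(x, conj = "and") {
  n <- length(x)
  # Zero or one element: nothing to join.
  if (n <= 1L) return(paste(x, collapse = ""))
  # Two elements: conjunction only, no comma.
  if (n == 2L) return(paste(x, collapse = paste0(" ", conj, " ")))
  # Three or more: comma-separate, with ", <conj> " before the last element.
  paste0(paste(x[-n], collapse = ", "), ", ", conj, " ", x[n])
}

collapse_serial(letters[1:2])         # "a and b"
collapse_serial(letters[1:3])         # "a, b, and c"
collapse_serial(letters[1:4], "or")   # "a, b, c, or d"
```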

wrap and variants

Wrap a string with some characters

wrap("abc", "__")  #  __abc__
dbl_quote("abc")   #   "abc"
sngl_quote("abc")  #   'abc'
parens("abc")      #   (abc)
bracket("abc")     #   [abc]
brace("abc")       #   {abc}
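
Conceptually, wrapping is a paste0() with the string in the middle. The sketch below assumes that the right flank defaults to the left one, as wrap("abc", "__") above suggests; wrap_sketch is a hypothetical stand-in, not the package's function.

```r
# Hypothetical stand-in for wrap(): flank each element of x with left and right.
# Assumption: right defaults to left, consistent with wrap("abc", "__") above.
wrap_sketch <- function(x, left, right = left) paste0(left, x, right)

wrap_sketch("abc", "__")             # "__abc__"
wrap_sketch(c("a", "b"), "[", "]")   # "[a]" "[b]"
```

Because paste0() is vectorized, the sketch wraps every element of a character vector in one call.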

unwrap, unparens

Remove pairs of characters from a string

label <- ps("name", parens("attribute"))

label             # "name (attribute)"
unparens(label)   # "name attribute"

# by default, removes all matching pairs of left and right
x <- c("a", "(a)", "((a))", "(a) b", "a (b)", "(a) (b)" )
data.frame( x, unparens(x), check.names = FALSE )
#>         x unparens(x)
#> 1       a           a
#> 2     (a)           a
#> 3   ((a))           a
#> 4   (a) b         a b
#> 5   a (b)         a b
#> 6 (a) (b)         a b

Specify n_pairs to remove a specific number of pairs:

x <- c("(a)", "((a))", "(((a)))", "(a) (b)", "(a) (b) (c)", "(a) (b) (c) (d)")
data.frame( x, "n_pairs=1"   = unparens(x, n_pairs = 1),
               "n_pairs=2"   = unparens(x, n_pairs = 2),
               "n_pairs=3"   = unparens(x, n_pairs = 3),
               "n_pairs=Inf" = unparens(x), # the default 
               check.names = FALSE)
  
#>                 x     n_pairs=1   n_pairs=2 n_pairs=3 n_pairs=Inf
#> 1             (a)             a           a         a           a
#> 2           ((a))           (a)           a         a           a
#> 3         (((a)))         ((a))         (a)         a           a
#> 4         (a) (b)         a (b)         a b       a b         a b
#> 5     (a) (b) (c)     a (b) (c)     a b (c)     a b c       a b c
#> 6 (a) (b) (c) (d) a (b) (c) (d) a b (c) (d) a b c (d)     a b c d

Use unwrap() to specify any pair of strings for left and right:

x <- "A string with some \\emph{latex tags}."
unwrap(x, "\\emph{", "}")
#> [1] "A string with some latex tags."
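
One way to picture pair removal is as a loop that strips the first left and the first right from every string that still contains both. The sketch below reproduces the outputs shown above under that assumption. It is not yasp's implementation: unwrap_sketch is a hypothetical name, it assumes left and right are non-empty and different, and it does not check that left occurs before right.

```r
# Hypothetical sketch of pair removal using fixed-string matching in base R.
# Assumptions: left and right are non-empty and distinct; order is not checked.
unwrap_sketch <- function(x, left, right, n_pairs = Inf) {
  done <- 0
  repeat {
    # Only strings that still contain both delimiters count as having a pair.
    has_pair <- grepl(left, x, fixed = TRUE) & grepl(right, x, fixed = TRUE)
    if (done >= n_pairs || !any(has_pair)) break
    # Remove the first occurrence of each delimiter in those strings.
    x[has_pair] <- sub(left,  "", x[has_pair], fixed = TRUE)
    x[has_pair] <- sub(right, "", x[has_pair], fixed = TRUE)
    done <- done + 1
  }
  x
}

x <- c("(a)", "((a))", "(a) (b)", "a)")
unwrap_sketch(x, "(", ")", n_pairs = 1)   # "a" "(a)" "a (b)" "a)"
unwrap_sketch(x, "(", ")")                # "a" "a"   "a b"   "a)"
```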

By default, only complete pairs are removed. Set left or right to "" to remove unpaired characters as well.

x <- c("a)", "a))", "(a", "((a" )
data.frame(x, unparens(x), 'left=""' = unwrap(x, left = "", right = ")"),
           check.names = FALSE)
  
#>     x unparens(x) left=""
#> 1  a)          a)       a
#> 2 a))         a))       a
#> 3  (a          (a      (a
#> 4 ((a         ((a     ((a

sentence

sentence() works like paste(), with some additional string cleaning (mostly concerning whitespace) appropriate for prose sentences. It:

  • trims leading and trailing whitespace
  • collapses runs of multiple whitespace into a single space
  • appends a period . if there is no terminal punctuation mark (., ?, or !)
  • removes spaces preceding punctuation characters: .?!,;:
  • collapses sequences of punctuation characters (.?!,;:) (possibly separated by spaces), into a single punctuation character. The first punctuation character of the sequence is used, with priority given to terminal punctuation marks .?! if present
  • makes sure a space or end-of-string follows every one of .?!,;:, with an exception for the special case of .,: followed by a digit, indicating the punctuation is a decimal period, number separator, or time delimiter
  • capitalizes the first letter of each sentence (start-of-string or following a .?!)
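
Several of the rules above map directly onto base-R regular expressions. The sketch below covers only a subset (trimming, whitespace collapsing, removing spaces before punctuation, appending a period, and capitalizing) and skips the punctuation-sequence and digit rules. sentence_sketch is a hypothetical name; this is not how yasp implements sentence().

```r
# Hypothetical sketch of a subset of the cleaning rules; not yasp's sentence().
sentence_sketch <- function(x) {
  x <- trimws(x)                                      # trim leading/trailing whitespace
  x <- gsub("\\s+", " ", x)                           # collapse whitespace runs
  x <- gsub(" +([.?!,;:])", "\\1", x)                 # drop spaces before punctuation
  x <- ifelse(grepl("[.?!]$", x), x, paste0(x, "."))  # append a period if none
  # Capitalize the first letter at start-of-string or after ". ", "? ", "! ".
  gsub("(^|[.?!] )([a-z])", "\\1\\U\\2", x, perl = TRUE)
}

sentence_sketch("  the cat   sat , then left . it was tired  ")
#> [1] "The cat sat, then left. It was tired."
```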
  

Some examples in ?sentence:

compare <- function(x) cat(sprintf(' in: "%s"\nout: "%s"\n', x, sentence(x)))

#>  in: "capitalized and period added"
#> out: "Capitalized and period added."

#>  in: "whitespace:added ,or removed ; like this.and this"
#> out: "Whitespace: added, or removed; like this. And this."

#>  in: "periods and commas in numbers like 1,234.567 are fine !"
#> out: "Periods and commas in numbers like 1,234.567 are fine!"

#>  in: "colons can be punctuation or time : 12:00 !"
#> out: "Colons can be punctuation or time: 12:00!"

#>  in: "only one punctuation at a time!.?,;"
#> out: "Only one punctuation at a time!"

#>  in: "The first mark ,; is kept;,,with priority for terminal marks  ;,."
#> out: "The first mark, is kept; with priority for terminal marks."

# vectorized like paste()
sentence(
 "The", c("first", "second", "third"), "letter is", letters[1:3],
 parens("uppercase:", sngl_quote(LETTERS[1:3])), ".")
#> [1] "The first letter is a (uppercase: 'A')." 
#> [2] "The second letter is b (uppercase: 'B')."
#> [3] "The third letter is c (uppercase: 'C')."

Installation

You can install 'yasp' from CRAN with:

install.packages("yasp")

Or install from GitHub with:

# install.packages("devtools")
devtools::install_github("t-kalinowski/yasp")
Metadata

Version

0.2.0

License

Unknown

Platforms (75)

    Darwin
    FreeBSD 13
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
  • aarch64-darwin
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd13
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd13
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows