Description

Convert Values to NA.

Provides a replacement for dplyr::na_if(). Allows you to specify multiple values to be replaced with NA using a single function.

fauxnaif

License: MIT

faux-naïf (/ˌfoʊ.naɪˈif/): a person who pretends to be simple or innocent

fauxnaif: an R package for simplifying data by pretending values are NA

Overview

fauxnaif provides an extension to dplyr::na_if(). Unlike dplyr’s na_if(), na_if_in() lets you specify multiple values to be replaced with NA in a single call. fauxnaif also includes a complementary function, na_if_not(), which replaces every value except the ones you specify.

Installation

You can install fauxnaif from CRAN:

install.packages("fauxnaif")

Or the development version from GitHub:

# install.packages("remotes")
remotes::install_github("rossellhayes/fauxnaif")

Usage

library(dplyr)
library(fauxnaif)

The basics

Let’s say we want to remove an unwanted negative value from a vector of numbers:

-1:10
#>  [1] -1  0  1  2  3  4  5  6  7  8  9 10

We can replace -1…

… explicitly:

na_if_in(-1:10, -1)
#>  [1] NA  0  1  2  3  4  5  6  7  8  9 10

… by specifying values to keep:

na_if_not(-1:10, 0:10)
#>  [1] NA  0  1  2  3  4  5  6  7  8  9 10

… using a formula:

na_if_in(-1:10, ~ . < 0)
#>  [1] NA  0  1  2  3  4  5  6  7  8  9 10
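
For intuition, the three calls above correspond roughly to the base R subsetting below. This is only a sketch of the behaviour shown in the outputs, not fauxnaif's implementation; the variable names are illustrative.

```r
x <- -1:10

# Replace listed values, comparable to na_if_in(x, -1)
explicit <- replace(x, x %in% -1, NA)

# Keep only listed values, comparable to na_if_not(x, 0:10)
kept <- replace(x, !(x %in% 0:10), NA)

# Replace where a condition holds, comparable to na_if_in(x, ~ . < 0)
by_condition <- replace(x, x < 0, NA)

# All three approaches give the same vector
print(explicit)
print(identical(explicit, kept) && identical(kept, by_condition))
```

The package versions save you from writing the logical index yourself, which matters most once several values or conditions are combined.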

A little more complex

messy_string <- c("abc", "", "def", "NA", "ghi", 42, "jkl", "NULL", "mno")

We can replace unwanted values…

… one at a time:

na_if_in(messy_string, "")
#> [1] "abc"  NA     "def"  "NA"   "ghi"  "42"   "jkl"  "NULL" "mno"

… or all at once:

na_if_in(messy_string, "", "NA", "NULL", 1:100)
#> [1] "abc" NA    "def" NA    "ghi" NA    "jkl" NA    "mno"
na_if_in(messy_string, c("", "NA", "NULL", 1:100))
#> [1] "abc" NA    "def" NA    "ghi" NA    "jkl" NA    "mno"
na_if_in(messy_string, list("", "NA", "NULL", 1:100))
#> [1] "abc" NA    "def" NA    "ghi" NA    "jkl" NA    "mno"
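
The numeric range 1:100 matches the element "42" because c() coerced 42 to a string when messy_string was built. A base R sketch of the same idea, assuming the matching behaves like %in% (which compares values after coercing them to a common type):

```r
messy_string <- c("abc", "", "def", "NA", "ghi", 42, "jkl", "NULL", "mno")

# c() coerces mixed input to character, so 42 is stored as "42"
print(is.character(messy_string))

# Combining strings and numbers again yields a character vector
unwanted <- c("", "NA", "NULL", 1:100)

# Set every element found in `unwanted` to NA
cleaned <- replace(messy_string, messy_string %in% unwanted, NA)
print(cleaned)
```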

… or using a clever formula:

grepl("[a-z]{3,}", messy_string)
#> [1]  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE FALSE  TRUE
na_if_not(messy_string, ~ grepl("[a-z]{3,}", .))
#> [1] "abc" NA    "def" NA    "ghi" NA    "jkl" NA    "mno"
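
One way to read a one-sided formula such as ~ . < 0 is as a function whose argument is named `.`. The helper below, as_predicate(), is hypothetical and written in base R purely for illustration; fauxnaif's own handling of formulas may differ.

```r
# Hypothetical helper (not part of fauxnaif): turn a one-sided formula
# into a function of one argument named `.`
as_predicate <- function(f) {
  stopifnot(inherits(f, "formula"), length(f) == 2L)
  # f[[2]] is the expression to the right of the tilde
  function(.) eval(f[[2L]], list(. = .), environment(f))
}

is_negative <- as_predicate(~ . < 0)
print(is_negative(-1:2))

messy_string <- c("abc", "", "def", "NA", "ghi", 42, "jkl", "NULL", "mno")
has_letters <- as_predicate(~ grepl("[a-z]{3,}", .))

# Keep elements where the predicate is TRUE, set the rest to NA
kept <- replace(messy_string, !has_letters(messy_string), NA)
print(kept)
```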

With data frames

faux_census
#> # A tibble: 5 × 4
#>   state    age  income gender                      
#>   <chr>  <dbl>   <dbl> <chr>                       
#> 1 TX        57 9999999 Gender is a social construct
#> 2 Canada    49  149000 Male                        
#> 3 NY       557   90750 f                           
#> 4 LA         2   61000 Male                        
#> 5 TN        64 9999999 M

na_if_in() is particularly useful inside dplyr::mutate():

faux_census %>%
  mutate(
    income = na_if_in(income, 9999999),
    age    = na_if_in(age, ~ . < 18, ~ . > 120),
    state  = na_if_not(state, ~ grepl("^[A-Z]{2,}$", .)),
    gender = na_if_in(gender, ~ nchar(.) > 20)
  )
#> # A tibble: 5 × 4
#>   state   age income gender
#>   <chr> <dbl>  <dbl> <chr> 
#> 1 TX       57     NA <NA>  
#> 2 <NA>     49 149000 Male  
#> 3 NY       NA  90750 f     
#> 4 LA       NA  61000 Male  
#> 5 TN       64     NA M

Or you can use dplyr::across() on data frames:

faux_census %>%
  mutate(
    across(age, na_if_in, ~ . < 18, ~ . > 120),
    across(state, na_if_not, ~ grepl("^[A-Z]{2,}$", .)),
    across(where(is.character), na_if_in, ~ nchar(.) > 20),
    across(everything(), na_if_in, 9999999)
  )
#> # A tibble: 5 × 4
#>   state   age income gender
#>   <chr> <dbl>  <dbl> <chr> 
#> 1 TX       57     NA <NA>  
#> 2 <NA>     49 149000 Male  
#> 3 NY       NA  90750 f     
#> 4 LA       NA  61000 Male  
#> 5 TN       64     NA M
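
As of dplyr 1.1.0, passing extra arguments through the `...` of across() is deprecated, so the calls above may emit a deprecation warning on recent versions. A sketch of the same cleaning step with explicit anonymous functions follows. The census tibble is a local copy typed in from the printed faux_census above, and the names census and cleaned are illustrative.

```r
library(dplyr)
library(fauxnaif)

# Local copy of the data shown above, rebuilt from the printed output
census <- tibble(
  state  = c("TX", "Canada", "NY", "LA", "TN"),
  age    = c(57, 49, 557, 2, 64),
  income = c(9999999, 149000, 90750, 61000, 9999999),
  gender = c("Gender is a social construct", "Male", "f", "Male", "M")
)

# Wrap each call in a function so no arguments pass through across()'s `...`
cleaned <- census %>%
  mutate(
    across(age, function(x) na_if_in(x, ~ . < 18, ~ . > 120)),
    across(state, function(x) na_if_not(x, ~ grepl("^[A-Z]{2,}$", .))),
    across(where(is.character), function(x) na_if_in(x, ~ nchar(.) > 20)),
    across(everything(), function(x) na_if_in(x, 9999999))
  )

print(cleaned)
```

The rules are unchanged, so the result is expected to match the table printed above.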

Hex sticker fonts are Bodoni* by indestructible type* and Source Code Pro by Adobe.

Image adapted from icon made by Freepik from flaticon.com.

Please note that fauxnaif is released with a Contributor Code of Conduct.

Metadata

Version

0.7.1

License

Unknown

Platforms (75)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
  • aarch64-darwin
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows