MyNixOS website logo
Description

Consistent Anonymisation Across Datasets.

A simple function that anonymises a list of variables in a consistent way: anonymised factors are not recycled and the same original levels receive the same anonymised factor even if located in different datasets.

simplanonym

The goal of simplanonym is to easily anonymise individuals that appear in multiple datasets, in a consistent way.

Consistent anonymisation means that:

  • each unique individual gets the same single anonymised factor, even if appearing in more than one dataset;
  • the same anonymised factor always refers back to the same individual.

Installation

You can install the development version of simplanonym from GitHub with:

# install.packages("devtools")
devtools::install_github("dkgaraujo/simplanonym")

Example

This is a basic example which shows you how to solve a common problem:

library(simplanonym)

rand_tbl_1 <- vroom::gen_tbl(10, 4, col_types = "fffd")
rand_tbl_2 <- vroom::gen_tbl(10, 2, col_types = "fd")
rand_tbl_2$X3 <- rand_tbl_1$X3

# note:
# * rand_tbl_1 and rand_tbl_2 share three column names,
#   of which X2 is a factor in one but not the other.
# * X1 factors do not overlap, but their anonymisation
#   should still be consistent (ie, different levels should
#   have their own unique anonymised factors).
# * For X3, the anonymised factors should consider the levels
#   at both `rand_tbl_1$X3` and `rand_tbl_2$X3`.

data_list <- list(rand_tbl_1, rand_tbl_2)
data_list
#> [[1]]
#> # A tibble: 10 × 4
#>    X1                X2              X3                     X4
#>    <fct>             <fct>           <fct>               <dbl>
#>  1 different_gorilla thankful_pony   scary_mustang     1.53   
#>  2 different_gorilla own_leopard     beautiful_marten  0.204  
#>  3 purring_donkey    witty_tiger     straight_deer     1.05   
#>  4 itchy_badger      own_leopard     adorable_panther  0.00518
#>  5 young_snake       calm_finch      white_rhinoceros -1.37   
#>  6 chubby_rabbit     public_lizard   scary_mustang     1.16   
#>  7 hot_bunny         loose_eagle     glamorous_puma   -1.83   
#>  8 gentle_hare       deep_goat       hissing_puppy    -1.09   
#>  9 brief_weasel      green_impala    adorable_panther -1.28   
#> 10 cool_gazelle      nutritious_bird beautiful_marten -1.70   
#> 
#> [[2]]
#> # A tibble: 10 × 3
#>    X1                   X2 X3              
#>    <fct>             <dbl> <fct>           
#>  1 good_chinchilla -0.461  scary_mustang   
#>  2 defeated_cougar -0.0267 beautiful_marten
#>  3 sweet_kangaroo   0.152  straight_deer   
#>  4 good_chinchilla -0.883  adorable_panther
#>  5 good_chinchilla  2.36   white_rhinoceros
#>  6 sweet_kangaroo  -0.570  scary_mustang   
#>  7 rotten_bee       0.297  glamorous_puma  
#>  8 huge_eagle      -0.0890 hissing_puppy   
#>  9 rotten_bee      -1.06   adorable_panther
#> 10 good_chinchilla -0.308  beautiful_marten

data_list |> anonymise(return_original_levels = TRUE)
#> Joining, by = c("X1", "X2", "X3")
#> Joining, by = c("X1", "X3")
#> [[1]]
#> # A tibble: 10 × 7
#>    X1                X2              X3         X1_anon X2_anon X3_anon       X4
#>    <fct>             <fct>           <fct>      <fct>   <fct>   <fct>      <dbl>
#>  1 different_gorilla thankful_pony   scary_mus… 19      12      12       1.53   
#>  2 different_gorilla own_leopard     beautiful… 19      09      04       0.204  
#>  3 purring_donkey    witty_tiger     straight_… 05      02      16       1.05   
#>  4 itchy_badger      own_leopard     adorable_… 26      09      19       0.00518
#>  5 young_snake       calm_finch      white_rhi… 21      08      06      -1.37   
#>  6 chubby_rabbit     public_lizard   scary_mus… 02      19      12       1.16   
#>  7 hot_bunny         loose_eagle     glamorous… 30      11      03      -1.83   
#>  8 gentle_hare       deep_goat       hissing_p… 25      21      01      -1.09   
#>  9 brief_weasel      green_impala    adorable_… 13      16      19      -1.28   
#> 10 cool_gazelle      nutritious_bird beautiful… 12      07      04      -1.70   
#> 
#> [[2]]
#> # A tibble: 10 × 5
#>    X1              X3               X1_anon X3_anon      X2
#>    <fct>           <fct>            <fct>   <fct>     <dbl>
#>  1 good_chinchilla scary_mustang    04      12      -0.461 
#>  2 defeated_cougar beautiful_marten 33      04      -0.0267
#>  3 sweet_kangaroo  straight_deer    23      16       0.152 
#>  4 good_chinchilla adorable_panther 04      19      -0.883 
#>  5 good_chinchilla white_rhinoceros 04      06       2.36  
#>  6 sweet_kangaroo  scary_mustang    23      12      -0.570 
#>  7 rotten_bee      glamorous_puma   01      03       0.297 
#>  8 huge_eagle      hissing_puppy    11      01      -0.0890
#>  9 rotten_bee      adorable_panther 01      19      -1.06  
#> 10 good_chinchilla beautiful_marten 04      04      -0.308

Acknowledgements

The hex sticker for the package is based on an icon provided free of charge by www.flaticon.com. René magritte icons created by Freepik - Flaticon.

Metadata

Version

0.1.0

License

Unknown

Platforms (77)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
Show all
  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-windows
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows