Description
Detect Data Containing Personally Identifiable Information.
Allows users to quickly and easily detect data containing Personally Identifiable Information (PII) through convenience functions.
detector
detector makes detecting data containing Personally Identifiable Information (PII) quick, easy, and scalable. It provides high-level functions that can take vectors and data.frames and return important summary statistics in a convenient data.frame. Once complete, detector will be able to detect the following types of PII:
- Full name
- Home address
- E-mail address
- National identification number
- Passport number
- Social Security number
- IP address
- Vehicle registration plate number
- Driver's license number
- Credit card number
- Date of birth
- Birthplace
- Telephone number
- Latitude and longitude
State of the Union
Complete!
- E-mail address
- Telephone number
- National identification number
Needs more work...
- Credit card number
Haven't even started :(
- Full name
- Date of birth
- Home address
- IP address
- Vehicle registration plate number
- Driver's license number
- Birthplace
- Latitude and longitude
Installation
You can install:
the latest released version from CRAN with
install.packages("detector")
the latest development version from GitHub with
if (packageVersion("devtools") < 1.6) {
  install.packages("devtools")
}
devtools::install_github("paulhendricks/detector")
If you encounter a clear bug, please file a minimal reproducible example on GitHub.
API
Generate data containing fake PII
library(dplyr, warn.conflicts = FALSE)
library(generator)
n <- 6
ashley_madison <-
  data.frame(name = r_full_names(n),
             email = r_email_addresses(n),
             phone_number = r_phone_numbers(n, use_hyphens = TRUE,
                                            use_spaces = TRUE),
             stringsAsFactors = FALSE)
ashley_madison %>%
  knitr::kable(format = "markdown")
name | email | phone_number |
---|---|---|
Leonardo Rodriguez | [email protected] | 254- 851- 6814 |
Dee Rice | [email protected] | 597- 978- 5193 |
Conception Marquardt | [email protected] | 184- 962- 8153 |
Collette Nitzsche | [email protected] | 475- 723- 2947 |
Norman Pfannerstill | [email protected] | 153- 674- 4219 |
Katelin Gislason | [email protected] | 831- 847- 1568 |
Detect data containing PII
library(detector)
ashley_madison %>%
  detect %>%
  knitr::kable(format = "markdown")
column_name | has_email_addresses | has_phone_numbers | has_national_identification_numbers |
---|---|---|---|
name | FALSE | FALSE | FALSE |
email | TRUE | FALSE | FALSE |
phone_number | FALSE | TRUE | FALSE |
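To make the shape of that summary concrete, here is a minimal base R sketch of a column-wise check that produces a similar one-row-per-column table. It is not detector's implementation: the helper names `any_match` and `sketch_detect`, the toy data, and both regular expressions are assumptions made for illustration, and the patterns are deliberately simplistic.

```r
# Hypothetical helpers for illustration only; these are NOT functions from detector.
# Simplified patterns: real e-mail addresses and phone numbers are far more varied.
email_pattern <- "^[[:alnum:]._%+-]+@[[:alnum:].-]+\\.[[:alpha:]]{2,}$"
phone_pattern <- "^\\(?[0-9]{3}\\)?[- .]*[0-9]{3}[- .]*[0-9]{4}$"

# TRUE if at least one value in x matches the pattern.
any_match <- function(x, pattern) {
  any(grepl(pattern, as.character(x)), na.rm = TRUE)
}

# Summarise a data.frame with one row per column and one flag per pattern.
sketch_detect <- function(df) {
  data.frame(
    column_name = names(df),
    has_email_addresses = vapply(df, any_match, logical(1),
                                 pattern = email_pattern, USE.NAMES = FALSE),
    has_phone_numbers = vapply(df, any_match, logical(1),
                               pattern = phone_pattern, USE.NAMES = FALSE),
    stringsAsFactors = FALSE
  )
}

# Made-up toy data, formatted like the generated example above.
toy <- data.frame(
  name = c("Ada Example", "Bob Sample"),
  email = c("ada@example.com", "bob@example.org"),
  phone_number = c("254- 851- 6814", "597- 978- 5193"),
  stringsAsFactors = FALSE
)

result <- sketch_detect(toy)
print(result)
```

A column is flagged as soon as a single value matches, so the flags answer "does this column appear to contain any PII of this type?" rather than "is every value PII?". Whether detector uses the same rule is not stated here, so treat that as an assumption of the sketch.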