MyNixOS website logo
Description

Tools for Scholarly and Academic Identifiers.

Tools for detecting, normalizing, classifying, and extracting scholarly identifier strings. The package provides lightweight, dependency-free helpers for common identifier systems such as DOIs, ORCID iDs, ISBNs, ISSNs, arXiv identifiers, and PubMed identifiers. Functions are designed to be vectorized, predictable, and suitable as low-level building blocks for other R packages and data workflows.

scholid

R-CMD-check Codecov testcoverage

scholid provides lightweight, dependency-free utilities for working with scholarly identifiers in R. The package is designed as a small, well-tested foundation that can be safely reused by other packages and data workflows.

Installation

Install the released version from CRAN:

install.packages("scholid")

Scope

The package focuses on common identifier systems used in scholarly communication:

  • DOI
  • ORCID iD
  • ISBN
  • ISSN
  • arXiv
  • PubMed (PMID)
  • PubMed Central (PMCID)

Interface

User-available functions:

FunctionPurpose
scholid_types()List supported scholarly identifier types
is_scholid(x, type)Test whether values conform to a given identifier type
normalize_scholid(x, type)Normalize identifiers to canonical form
extract_scholid(text, type)Extract identifiers of a given type from free text
classify_scholid(x)Guess the identifier type of each input value
detect_scholid_type(x)Detect identifier types from canonical or wrapped input values

Examples

# list supported scholarly identifier types
scholid::scholid_types()
## [1] "arxiv" "doi"   "isbn"  "issn"  "orcid" "pmcid" "pmid"
# test whether values match a given identifier type
scholid::is_scholid(
  x    = "10.1000/182",
  type = "doi"
)
## [1] TRUE
# normalize identifiers to canonical form
scholid::normalize_scholid(
  x    = "https://doi.org/10.1000/182",
  type = "doi"
)
## [1] "10.1000/182"
# extract identifiers of a given type from free text
scholid::extract_scholid(
  text = "See https://doi.org/10.1000/182 for details.",
  type = "doi"
)
## [[1]]
## [1] "10.1000/182"
# classify the identifier type of each input value
scholid::classify_scholid(
  x = c(
    "10.1000/182",
    "0000-0002-1825-0097",
    "not an id"
  )
)
## [1] "doi"   "orcid" NA
# detect identifier types from canonical or wrapped input values
scholid::detect_scholid_type(
  x = c(
    "https://doi.org/10.1000/182",
    "ORCID: 0000-0002-1825-0097",
    "arXiv:2101.00001",
    "not an id"
  )
)
## [1] "doi"   NA      "arxiv" NA

For more detailed usage patterns, including extraction from text and classification of mixed identifier columns, see the Get started vignette.

License

MIT.

Metadata

Version

0.1.0

License

Unknown

Platforms (78)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    uefi
    WASI
    Windows
Show all
  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-uefi
  • aarch64-windows
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-linux
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-uefi
  • x86_64-windows