Description

Rapid Automatic Keyword Extraction (RAKE).

A 'Java' implementation of the RAKE algorithm ('Rose', S., 'Engel', D., 'Cramer', N. and 'Cowley', W. (2010) <doi:10.1002/9780470689646.ch1>), which can be used to extract keywords from documents without any training data.

rapidraker

A fast version of the Rapid Automatic Keyword Extraction (RAKE) algorithm

Installation

You can get the stable version on CRAN:

install.packages("rapidraker")

The development version of the package requires you to compile the latest Java source code in rapidrake-java, so it cannot be installed with a single call to devtools::install_github().

What is rapidraker?

rapidraker is an R package that provides an implementation of the same keyword extraction algorithm (RAKE) as slowraker. The difference is the implementation language: rapidraker::rapidrake() is written in Java, whereas slowraker::slowrake() is written in R. As a result, you can expect rapidrake() to be considerably faster than slowrake().
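To give a feel for what a RAKE-style scorer does, here is a minimal base R sketch. It is not the code used by rapidraker or slowraker: the function name rake_sketch, the tiny stop word list, and the punctuation rule are assumptions made for illustration, and the sketch skips the stemming that produces the stem column as well as any other preprocessing the packages perform. Candidate phrases are runs of words between stop words or punctuation, each word is scored as degree divided by frequency, and a phrase's score is the sum of its word scores.

```r
# Simplified RAKE-style scoring (illustrative sketch, not the package's code).
# The stop word list below is a small hypothetical sample.
rake_sketch <- function(txt,
                        stop_words = c("a", "an", "and", "are", "for",
                                       "in", "is", "of", "the", "to")) {
  # Lower-case the text and cut it at anything that is not a letter or a
  # space, so that candidate phrases never span punctuation.
  chunks <- unlist(strsplit(tolower(txt), "[^a-z ]+"))

  # Within each chunk, a candidate phrase is a run of non-stop words.
  phrases <- list()
  for (chunk in chunks) {
    words <- strsplit(trimws(chunk), " +")[[1]]
    current <- character(0)
    for (w in words) {
      if (w == "" || w %in% stop_words) {
        if (length(current) > 0) phrases[[length(phrases) + 1]] <- current
        current <- character(0)
      } else {
        current <- c(current, w)
      }
    }
    if (length(current) > 0) phrases[[length(phrases) + 1]] <- current
  }
  if (length(phrases) == 0) {
    return(data.frame(keyword = character(0), score = numeric(0),
                      stringsAsFactors = FALSE))
  }

  # Word frequency, and word degree (the summed length of every phrase
  # occurrence the word appears in).
  all_words <- unlist(phrases)
  freq <- tapply(rep(1, length(all_words)), all_words, sum)
  deg <- tapply(rep(lengths(phrases), lengths(phrases)), all_words, sum)
  word_score <- deg / freq

  # A phrase's score is the sum of the scores of its words.
  out <- data.frame(
    keyword = vapply(phrases, paste, character(1), collapse = " "),
    score = vapply(phrases, function(p) sum(word_score[p]), numeric(1)),
    stringsAsFactors = FALSE
  )
  out <- out[!duplicated(out$keyword), ]
  out <- out[order(-out$score), ]
  rownames(out) <- NULL
  out
}

res <- rake_sketch("Assistance dogs are trained for the blind. Assistance dogs help.")
print(res)
```

Longer phrases made of words that tend to occur inside long phrases receive the highest scores, which is why multi-word keywords usually lead the ranking.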

Usage

rapidrake() has the same arguments as slowrake(), and both functions output the same type of object. You can therefore substitute rapidrake() for slowrake() without making any additional changes to your code.

library(slowraker)
library(rapidraker)

data("dog_pubs")
rakelist <- rapidrake(txt = dog_pubs$abstract[1:5])

rapidrake() outputs a list of data frames. Each data frame contains the keywords that were extracted for an element of txt:

rakelist
#> 
#> # A rakelist containing 5 data frames:
#>  $ :'data.frame':    61 obs. of  4 variables:
#>   ..$ keyword:"assistance dog identification tags" ...
#>   ..$ freq   :1 1 ...
#>   ..$ score  :11 ...
#>   ..$ stem   :"assist dog identif tag" ...
#>  $ :'data.frame':    90 obs. of  4 variables:
#>   ..$ keyword:"current dog suitability assessments focus" ...
#>   ..$ freq   :1 1 ...
#>   ..$ score  :22 ...
#>   ..$ stem   :"current dog suitabl assess focus" ...
#> #...With 3 more data frames.

You can bind these data frames together using slowraker::rbind_rakelist():

rakedf <- rbind_rakelist(rakelist, doc_id = dog_pubs$doi[1:5])
head(rakedf, 5)
#>                         doc_id                            keyword freq score                   stem
#> 1 10.1371/journal.pone.0132820 assistance dog identification tags    1  10.8 assist dog identif tag
#> 2 10.1371/journal.pone.0132820          animal control facilities    1   9.0     anim control facil
#> 3 10.1371/journal.pone.0132820          emotional support animals    1   9.0      emot support anim
#> 4 10.1371/journal.pone.0132820                   small body sizes    1   9.0        small bodi size
#> 5 10.1371/journal.pone.0132820       seemingly inappropriate dogs    1   7.9    seem inappropri dog
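Once the keywords are in a single data frame, ordinary data frame tools apply. The sketch below keeps the highest-scoring keywords for each document using base R only. The helper top_keywords() is hypothetical (it is not part of slowraker or rapidraker), and the toy data frame is a made-up stand-in with the same columns as the output above, so the example runs without either package installed.

```r
# Toy stand-in for the output of rbind_rakelist(); the document ids and
# most values here are made up for illustration.
rakedf <- data.frame(
  doc_id  = c("doc-1", "doc-1", "doc-1", "doc-2", "doc-2"),
  keyword = c("assistance dog identification tags",
              "animal control facilities",
              "small body sizes",
              "dog suitability assessments",
              "guide dogs"),
  freq    = c(1, 1, 1, 1, 2),
  score   = c(10.8, 9.0, 9.0, 8.5, 4.0),
  stem    = c("assist dog identif tag", "anim control facil",
              "small bodi size", "dog suitabl assess", "guid dog"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the n highest-scoring keywords per document.
top_keywords <- function(df, n = 2) {
  pieces <- lapply(split(df, df$doc_id), function(d) {
    d <- d[order(-d$score), , drop = FALSE]  # highest score first
    head(d, n)
  })
  out <- do.call(rbind, pieces)
  rownames(out) <- NULL
  out
}

top <- top_keywords(rakedf, n = 2)
print(top[, c("doc_id", "keyword", "score")])
```

With real data you would pass the rakedf created by rbind_rakelist() instead of the toy data frame.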

Learning more

  • To learn more about the API used by both slowraker and rapidraker, head over to slowraker’s webpage.
Metadata

Version

0.1.3

License

Unknown

Platforms (77)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-windows
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows