Description

Yet Another 'ARFF' Reader.

A parser and writer for the 'WEKA' Attribute-Relation File Format <https://waikato.github.io/weka-wiki/arff_stable/> in pure R, with no dependencies. Unlike other R implementations, this package can read both standard (dense) files and sparse files, i.e. those where each row lists only its nonzero components. Unlike 'RWeka', 'yarr' requires neither a 'Java' installation nor any other external software. This implementation is generalized from those in the packages 'mldr' and 'mldr.datasets'.

yarr

Yet Another ARFF Reader

License: GPL v3

So you need to read an ARFF file. You can use:

  • foreign::read.arff, but it only reads the dense format.
  • farff::readARFF, but it only reads the dense format.
  • RWeka::read.arff, but you need to configure Java.

yarr::read.arff can read dense and sparse ARFF files, and it's implemented in pure R.
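
To make the dense/sparse distinction concrete, here is a minimal base R sketch of how a single sparse data row maps onto a dense one. The helper `expand_sparse_row` is hypothetical and is not yarr's implementation; it assumes numeric attributes only, 0-based attribute indices, and values without embedded commas or spaces.

```r
# Hypothetical helper (not part of yarr): expand one sparse ARFF data row,
# such as "{0 1, 3 2.5}", into a dense numeric vector of length n_attrs.
expand_sparse_row <- function(row, n_attrs) {
  dense <- numeric(n_attrs)                      # omitted attributes are 0
  body <- trimws(gsub("^\\s*\\{|\\}\\s*$", "", row))  # strip the braces
  if (nchar(body) == 0) return(dense)            # "{}" is an all-zero row
  for (pair in strsplit(body, ",", fixed = TRUE)[[1]]) {
    parts <- strsplit(trimws(pair), "\\s+")[[1]]
    index <- as.integer(parts[1]) + 1L           # 0-based index to R's 1-based
    # "?" marks a missing value in ARFF
    dense[index] <- if (parts[2] == "?") NA_real_ else as.numeric(parts[2])
  }
  dense
}

dense_row <- expand_sparse_row("{0 1, 3 2.5}", n_attrs = 5)
print(dense_row)
```

A dense file would store the same instance as `1,0,0,2.5,0`; the sparse form pays off when most components are zero, as in the dexter dataset used below.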

The implementations in R are derivatives of those in mldr and mldr.datasets, packages for management of multilabel learning datasets.

Usage

remotes::install_github("fdavidcl/yarr")
library(yarr)
download.file("https://www.openml.org/data/download/1681111/phpEUwA95", "dexter.arff")
dexter <- read.arff("dexter.arff")
dexter
# An ARFF dataset: dexter
# 20001 attributes and 600 instances
#   V0: numeric (min = 0, mean = 0.12, max = 72) 0, 0, 0, 0, 0...
#   V1: numeric (min = 0, mean = 0, max = 0) 0, 0, 0, 0, 0...
#   V2: numeric (min = 0, mean = 0, max = 0) 0, 0, 0, 0, 0...
#   V3: numeric (min = 0, mean = 1.165, max = 427) 0, 0, 0, 0, 0...
#   V4: numeric (min = 0, mean = 0.57, max = 342) 0, 0, 0, 0, 0...
#   V5: numeric (min = 0, mean = 0.386666666666667, max = 116) 0, 0, 0, 0, 0...
#   V6: numeric (min = 0, mean = 0, max = 0) 0, 0, 0, 0, 0...
#   V7: numeric (min = 0, mean = 0.448333333333333, max = 121) 0, 0, 0, 0, 0...
#   V8: numeric (min = 0, mean = 0.358333333333333, max = 84) 0, 0, 0, 0, 0...
#   ...
#   class: {-1,1} (-1: 300, 1: 300) 1, 1, 1, -1, 1...
relation(dexter)
# [1] "dexter"
write.arff(dexter, "dexter-new.arff")

Supported features

  • Dense format
  • Sparse format ({index value}, e.g. {0 1, 20 1})
  • Missing values with ?
  • MEKA parameter c in @relation.
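
As an illustration of the last item, the sketch below extracts a MEKA-style option from an `@relation` header in base R. It assumes the option appears inside the relation name as `-C` followed by an integer; the helper `meka_label_count` and the sample header are made up for this example and do not reflect how yarr parses the header internally.

```r
# Hypothetical helper (not yarr's API): return the integer that follows "-C"
# in an @relation header line, or NA when the option is absent.
meka_label_count <- function(relation_line) {
  hit <- regmatches(relation_line,
                    regexpr("-[Cc]\\s+-?[0-9]+", relation_line))
  if (length(hit) == 0) return(NA_integer_)
  as.integer(sub("-[Cc]\\s+", "", hit))
}

header <- "@relation 'example: -C 3'"   # made-up header for illustration
label_count <- meka_label_count(header)
print(label_count)
```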

Unsupported features

  • Date attributes are read as mere strings
  • Relational attributes are not supported
  • Instance weights.
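
Since date attributes arrive as strings, they can be converted after reading. The sketch below uses a toy data frame in place of a real dataset; the column names and values are invented, and the format string assumes the file uses ARFF's default ISO-8601 pattern (yyyy-MM-dd'T'HH:mm:ss). Adjust it if the attribute declares a different pattern.

```r
# Toy stand-in for a dataset whose date attribute was read as strings.
# Column names and values are made up for illustration.
readings <- data.frame(
  timestamp = c("2020-01-31T08:15:00", "2020-02-01T17:45:30"),
  value = c(1.5, 2.0),
  stringsAsFactors = FALSE
)

# Parse the strings into date-times using base R; UTC is an assumption here.
readings$timestamp <- as.POSIXct(readings$timestamp,
                                 format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")
str(readings)
```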

Metadata

Version

0.1.2

License

Unknown

Platforms (75)

    Darwin
    FreeBSD 13
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
  • aarch64-darwin
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-darwin
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-darwin
  • i686-freebsd13
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd13
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows