Description

Parse and Manipulate R Code.

Parsing R code is key to building tools such as linters and stylers. This package provides a binding to the 'Rust' crate 'ast-grep' so that one can parse and explore R code.

astgrepr

astgrepr provides R bindings to the ast-grep Rust crate. ast-grep is a tool that parses code into an abstract syntax tree (AST) and then searches and rewrites that code based on the tree. This is very useful for building linters and stylers and for many other kinds of code analysis.
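
To make the idea of searching a syntax tree concrete, here is a minimal base R sketch. It does not use astgrepr or tree-sitter: it walks the tree produced by R's own `parse()`, and the helper `find_any_duplicated()` is a hypothetical name introduced only for this illustration. It shows why matching on structure differs from matching on text: the occurrence inside a comment is not reported, while the one nested in an `if` condition is.

```r
# Base R only: this walks R's own parse tree, not the tree-sitter tree that
# astgrepr uses. `find_any_duplicated()` is a hypothetical helper.
find_any_duplicated <- function(expr) {
  hits <- list()
  if (is.call(expr)) {
    is_target <- identical(expr[[1]], quote(any)) &&
      length(expr) == 2 &&
      is.call(expr[[2]]) &&
      identical(expr[[2]][[1]], quote(duplicated))
    if (is_target) {
      hits <- list(expr)
    }
    # Recurse into the function position and every argument.
    # Assumes no empty arguments (e.g. `x[1, ]`), which would need a guard.
    for (child in as.list(expr)) {
      hits <- c(hits, find_any_duplicated(child))
    }
  }
  hits
}

src <- "x <- rnorm(100, mean = 2)
any(duplicated(y))
# any(duplicated(z)) in a comment is ignored by the parser
plot(x)
if (any(duplicated(x))) stop('duplicates')"

parsed <- parse(text = src, keep.source = FALSE)
hits <- do.call(c, lapply(as.list(parsed), find_any_duplicated))
found <- vapply(hits, deparse, character(1))
print(found)
#> [1] "any(duplicated(y))" "any(duplicated(x))"
```

With astgrepr, the hand-written tree walk is replaced by a declarative pattern such as `any(duplicated($A))`, as in the demo further down.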

See the example below and the “Getting started” vignette for a gentle introduction to astgrepr.

Since astgrepr can be used as a low-level foundation for other tools (such as linters), the number of R dependencies is kept low:

> pak::local_deps_tree()
✔ Loading metadata database ... done
local::. 0.0.1 [new][bld]                                                  
├─checkmate 2.3.1 [new][dl] (746.54 kB)
│ └─backports 1.5.0 [new]
├─rrapply 1.2.7 [new]
└─yaml 2.3.8 [new][dl] (119.08 kB)

Key:  [new] new | [dl] download | [bld] build

Installation

install.packages('astgrepr', repos = c('https://etiennebacher.r-universe.dev', 'https://cloud.r-project.org'))

Demo

library(astgrepr)

src <- "library(tidyverse)
x <- rnorm(100, mean = 2)
any(duplicated(y))
plot(x)
any(duplicated(x))
any(is.na(variable))"

root <- src |>
  tree_new() |>
  tree_root()

# get everything inside rnorm()
root |>
  node_find(ast_rule(pattern = "rnorm($$$A)")) |>
  node_get_multiple_matches("A") |>
  node_text_all()
#> $rule_1
#> $rule_1[[1]]
#> [1] "100"
#> 
#> $rule_1[[2]]
#> [1] ","
#> 
#> $rule_1[[3]]
#> [1] "mean = 2"

# find occurrences of any(duplicated())
root |>
  node_find_all(ast_rule(pattern = "any(duplicated($A))")) |>
  node_text_all()
#> $rule_1
#> $rule_1$node_1
#> [1] "any(duplicated(y))"
#> 
#> $rule_1$node_2
#> [1] "any(duplicated(x))"

# find some nodes and replace them with something else
nodes_to_replace <- root |>
  node_find_all(
    ast_rule(id = "any_na", pattern = "any(is.na($VAR))"),
    ast_rule(id = "any_dup", pattern = "any(duplicated($VAR))")
  )

fixes <- nodes_to_replace |>
  node_replace_all(
    any_na = "anyNA(~~VAR~~)",
    any_dup = "anyDuplicated(~~VAR~~) > 0"
  )

# original code
cat(src)
#> library(tidyverse)
#> x <- rnorm(100, mean = 2)
#> any(duplicated(y))
#> plot(x)
#> any(duplicated(x))
#> any(is.na(variable))

# new code
tree_rewrite(root, fixes)
#> library(tidyverse)
#> x <- rnorm(100, mean = 2)
#> anyDuplicated(y) > 0
#> plot(x)
#> anyDuplicated(x) > 0
#> anyNA(variable)
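
The demo above captures several nodes at once with `$$$A`. As a complementary sketch, the snippet below captures a single node with `$A`. It assumes that `node_get_match()` and `node_text()` are the single-node counterparts of `node_get_multiple_matches()` and `node_text_all()`; check the package reference if your installed version differs.

```r
library(astgrepr)

src <- "x <- rnorm(100, mean = 2)
plot(x)"

root <- src |>
  tree_new() |>
  tree_root()

# `$A` captures exactly one node, so the single-match helpers are used here
# (assumed counterparts of the functions shown in the demo).
plot_arg <- root |>
  node_find(ast_rule(pattern = "plot($A)")) |>
  node_get_match("A") |>
  node_text()

print(plot_arg)
```

The result is a list with one element per rule, like the other outputs in the demo.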

Related tools

Some recent projects link tree-sitter and R. They do not compete with astgrepr but complement it:

  • r-lib/tree-sitter-r: provides the R grammar for tools built on tree-sitter. astgrepr relies on this grammar under the hood.
  • DavisVaughan/r-tree-sitter: a companion to r-lib/tree-sitter-r. It gives a way to get the tree-sitter representation of some code directly in R. This is useful for learning how tree-sitter represents the R grammar, which is required for advanced use of astgrepr. However, it doesn’t provide an easy way to select specific nodes (e.g. based on patterns).
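
As a rough illustration of why the tree-sitter representation matters for advanced use, the sketch below matches nodes by their kind instead of by a pattern. It assumes that `ast_rule()` accepts a `kind` argument, that `node_kind()` returns the kind of a matched node, and that the R grammar names function calls `"call"`; treat these as assumptions to verify against the package and grammar documentation.

```r
library(astgrepr)

root <- "any(is.na(variable))" |>
  tree_new() |>
  tree_root()

# Inspect the tree-sitter kind of a node matched by a pattern.
kind <- root |>
  node_find(ast_rule(pattern = "is.na($A)")) |>
  node_kind()

# Match on the node kind directly. "call" is assumed to be the kind that the
# R grammar assigns to function calls.
calls <- root |>
  node_find_all(ast_rule(kind = "call")) |>
  node_text_all()

print(kind)
print(calls)
```

Kind-based rules are broader than pattern-based ones: the second query is expected to return every call in the code, including nested ones.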

Metadata

Version

0.1.1

License

Unknown

Platforms (75)

    Darwin
    FreeBSD
    Genode
    GHCJS
    Linux
    MMIXware
    NetBSD
    none
    OpenBSD
    Redox
    Solaris
    WASI
    Windows
  • aarch64-darwin
  • aarch64-freebsd
  • aarch64-genode
  • aarch64-linux
  • aarch64-netbsd
  • aarch64-none
  • aarch64-windows
  • aarch64_be-none
  • arm-none
  • armv5tel-linux
  • armv6l-linux
  • armv6l-netbsd
  • armv6l-none
  • armv7a-linux
  • armv7a-netbsd
  • armv7l-linux
  • armv7l-netbsd
  • avr-none
  • i686-cygwin
  • i686-freebsd
  • i686-genode
  • i686-linux
  • i686-netbsd
  • i686-none
  • i686-openbsd
  • i686-windows
  • javascript-ghcjs
  • loongarch64-linux
  • m68k-linux
  • m68k-netbsd
  • m68k-none
  • microblaze-linux
  • microblaze-none
  • microblazeel-linux
  • microblazeel-none
  • mips-linux
  • mips-none
  • mips64-linux
  • mips64-none
  • mips64el-linux
  • mipsel-linux
  • mipsel-netbsd
  • mmix-mmixware
  • msp430-none
  • or1k-none
  • powerpc-netbsd
  • powerpc-none
  • powerpc64-linux
  • powerpc64le-linux
  • powerpcle-none
  • riscv32-linux
  • riscv32-netbsd
  • riscv32-none
  • riscv64-linux
  • riscv64-netbsd
  • riscv64-none
  • rx-none
  • s390-linux
  • s390-none
  • s390x-linux
  • s390x-none
  • vc4-none
  • wasm32-wasi
  • wasm64-wasi
  • x86_64-cygwin
  • x86_64-darwin
  • x86_64-freebsd
  • x86_64-genode
  • x86_64-linux
  • x86_64-netbsd
  • x86_64-none
  • x86_64-openbsd
  • x86_64-redox
  • x86_64-solaris
  • x86_64-windows