Description
R Toolkit for 'Databricks'.
Description
A collection of utilities that make it easier to use 'Databricks' from R: primarily functions that wrap specific 'Databricks' APIs (<https://docs.databricks.com/api>), 'RStudio' Connections pane support, and quality-of-life functions that make 'Databricks' simpler to use.
README.md
brickster
{brickster} is the R toolkit for Databricks. It includes:
- Wrappers for Databricks APIs (e.g. `db_cluster_list`, `db_volume_read`)
- Browsing workspace assets via the RStudio Connections Pane (`open_workspace()`)
- Access to the `databricks-sql-connector` via {reticulate} (docs)
- An interactive Databricks REPL
Installation
remotes::install_github("databrickslabs/brickster")
Quick Start
library(brickster)
# only `DATABRICKS_HOST` is required if using OAuth U2M
# the first request will open a browser window to log in
Sys.setenv(DATABRICKS_HOST = "<workspace-prefix>.cloud.databricks.com")
# list all SQL warehouses
warehouses <- db_sql_warehouse_list()
# read `data.csv` from a volume
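file <- db_volume_read(
  path = "/Volumes/<catalog>/<schema>/<volume>/data.csv",
  # `fileext` gives the temporary file a real .csv extension;
  # `pattern` would only set a prefix for the file name
  tempfile(fileext = ".csv")
)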
volume_csv <- readr::read_csv(file)
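The `path` argument above follows the layout `/Volumes/<catalog>/<schema>/<volume>/<file>`. As a minimal sketch, the path can be assembled from its parts so the placeholders are filled in one place. `volume_path()` is a hypothetical helper written for illustration; it is not part of {brickster}, and the catalog, schema, and volume names used below are made up.

```r
# Hypothetical helper (not part of {brickster}): build a volume file path
# of the form /Volumes/<catalog>/<schema>/<volume>/<file>.
volume_path <- function(catalog, schema, volume, file) {
  parts <- c(catalog, schema, volume, file)
  # every component must be a single non-empty string
  stopifnot(is.character(parts), length(parts) == 4L, all(nzchar(parts)))
  paste0("/Volumes/", paste(parts, collapse = "/"))
}

# Example with made-up names
csv_path <- volume_path("main", "default", "landing", "data.csv")
print(csv_path)
```

The resulting string can then be passed as `path` to `db_volume_read()` in place of the hand-written placeholder path.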
Refer to the "Connect to a Databricks Workspace" article for more details on getting authentication configured.
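Before making the first request it can help to check which authentication path the current environment suggests. The sketch below is an assumption-laden illustration, not {brickster}'s own logic: it assumes a personal access token, when used, is supplied through a `DATABRICKS_TOKEN` environment variable, and that OAuth U2M applies when only `DATABRICKS_HOST` is set (as the Quick Start comments describe). `describe_databricks_auth()` is a hypothetical helper.

```r
# Hypothetical helper (not part of {brickster}): report which auth path the
# given host/token values suggest. Defaults read the environment variables.
describe_databricks_auth <- function(host = Sys.getenv("DATABRICKS_HOST"),
                                     token = Sys.getenv("DATABRICKS_TOKEN")) {
  if (!nzchar(host)) {
    return("not configured: DATABRICKS_HOST is missing")
  }
  if (nzchar(token)) {
    # assumption: a non-empty DATABRICKS_TOKEN means token-based auth
    return("personal access token")
  }
  "OAuth U2M (browser login on first request)"
}

# Host only, no token
describe_databricks_auth(
  host = "<workspace-prefix>.cloud.databricks.com",
  token = ""
)
```

The linked article remains the authoritative description of how credentials are resolved; this check only inspects the two values passed to it.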
API Coverage
{brickster} is deliberate about which APIs it wraps. It isn't intended to replace IaC tooling (e.g. Terraform) or to be used for account/workspace administration.
API | Available | Version |
---|---|---|
DBFS | Yes | 2.0 |
Secrets | Yes | 2.0 |
Repos | Yes | 2.0 |
MLflow Model Registry | Yes | 2.0 |
Clusters | Yes | 2.0 |
Libraries | Yes | 2.0 |
Workspace | Yes | 2.0 |
Endpoints | Yes | 2.0 |
Query History | Yes | 2.0 |
Jobs | Yes | 2.1 |
Volumes (Files) | Yes | 2.0 |
SQL Statement Execution | Yes | 2.0 |
REST 1.2 Commands | Partially | 1.2 |
Unity Catalog | Partially | 2.1 |
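The Version column corresponds to the version segment in the Databricks REST URL, for example `/api/2.1/jobs/list` for Jobs and `/api/2.0/clusters/list` for Clusters. The sketch below shows how such a URL is composed. `databricks_api_url()` is a hypothetical helper for illustration only; {brickster} builds its requests internally, and its wrappers should be used rather than hand-built URLs.

```r
# Hypothetical helper (not part of {brickster}): compose a REST URL from the
# workspace host, an API version from the table, and an endpoint path.
databricks_api_url <- function(host, version, endpoint) {
  host <- sub("^https?://", "", host)  # tolerate a host given with a scheme
  host <- sub("/+$", "", host)         # drop trailing slashes
  endpoint <- sub("^/+", "", endpoint) # drop leading slashes
  sprintf("https://%s/api/%s/%s", host, version, endpoint)
}

# Jobs is listed at version 2.1, Clusters at 2.0
jobs_url <- databricks_api_url("example.cloud.databricks.com", "2.1", "jobs/list")
clusters_url <- databricks_api_url("https://example.cloud.databricks.com/", "2.0", "/clusters/list")
print(jobs_url)
print(clusters_url)
```

The host `example.cloud.databricks.com` is a placeholder; substitute the workspace host used for `DATABRICKS_HOST`.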