R Tools for Data Copy-Pasta.
datapasta 3.1.0 'Leave to Simmer'
The Goods
Introducing datapasta
datapasta
is about reducing resistance associated with copying and pasting data to and from R. It is a response to the realisation that I often found myself using intermediate programs like Sublime to munge text into suitable formats. Addins and functions in datapasta
support a wide variety of input and output situations, so it (probably) "just works". Hopefully tools in this package will remove such intermediate steps and associated frustrations from our data slinging workflows.
Prerequisites
- Linux users will need to install either
xsel
orxclip
. These applications provide an interface to X selections (clipboard-like).- For example:
sudo apt-get install xsel
- it's 72kb...
- For example:
- Windows and MacOS have nothing extra to do.
Installation
- Get the package:
install.packages("datapasta")
- Set the keyboard shortcuts using Tools -> Addins -> Browse Addins, then click Keyboard Shortcuts...
Usage
Use with RStudio
Getting data into source
At the moment this package contains these RStudio addins that paste data to the cursor:
tribble_paste
which pastes a table as a nicely formatted call totibble::tribble()
- Recommend Ctrl + Shift + t as shortcut.
- Table can be delimited with tab, comma, pipe or semicolon.
vector_paste
which will paste delimited data as a vector definition, e.g.c("a", "b")
etc.- Recommend Ctrl + Alt + Shift + v as shortcut.
vector_paste_vertical
which will paste delimited data as a vertically formatted vector definition.- Recommend Ctrl + Shift + v as shortcut
- example output:
c("Mint",
"Fedora",
"Debian",
"Ubuntu",
"OpenSUSE")
df_paste
which pastes a table on the clipboard as a standarddata.frame
definition rather than atribble
call. This has certain advantages in the context of reproducible examples and educational posts. Many thanks to Jonathan Carroll for getting this rolling and coding the bulk of the feature.- Recommend Ctrl + Alt + Shift + d as shortcut.
dt_paste
which is the same asdf_paste
, but fordata.table
.
Massaging data in source
There are two Addins that can help with creating and aligning data in your editor:
Fiddle Selection
will perform magic on a selection. It can be used to:- Turn raw data delimited by any combination of commas, spaces, and newlines into a
c()
expression - Pivot a
c()
expr between horizontal and vertical layout. - Reflow messy
tribble()
anddata.frame()
exprs. - Recommend Ctrl +Shift + f as shortcut.
- Turn raw data delimited by any combination of commas, spaces, and newlines into a
Toggle Vector Quotes
will toggle ac()
expr between all elements wrapped in""
and all bare unquoted form. Handy in combination with above to save mucho keystrokes.- Recommend Ctrl +Shift + q as shortcut.
Getting Data out of an R session
There are two R functions available that accept R objects and output formatted text for pasting to a reprex or other application:
dpasta
accepts tibbles, data.frames, and vectors. Data is output in a format that matches in input class. Formatted text is pasted at the cursor.dmdclip
accepts the same inputs asdpasta
but inserts the formatted text onto the clipboard, preceded by 4 spaces so that is can be as pasted as a preformatted block to Github, Stackoverflow etc.
Use with other editors
The only hard dependency of datapasta
is readr
for type guessing. All the above *paste
functions can be called directly instead of as an addin, and will fall back to console output if the rstudioapi
is not available.
On system without access to the clipboard (or without clipr
installed) datapasta
can still be used to output R objects from an R session. dpasta
is probably the only function you care about in this scenario.
Custom Installation
datapasta
imports clipr
and rstudioapi
so as to make installation smooth and easy for most users. If you wish to avoid installing an rstudioapi
you will never use you can use:
install.packages("datapasta", dependencies = "Depends")
.- Followed by
install.packages("clipr")
to enable clipboard features.
Pitfalls
tribble_paste
works well with CSVs, excel files, and html tables, but is currently brittle with respect to irregular table structures like merged cells or multi-line column headings. For some reason Wikipedia seems chock full of these. :(- Quoted csv data, where the quotes contain commas will not be parsed correctly.
- Nested list columns have limited support with
tribble_paste()
/dpasta()
. Nested lists of length 1 fail unless all are length 1 - It's complicated. You still get some output so it might be viable to fix and reflow withFiddle Selection
. Tread with caution.
Prior art
This package is made possible by mdlincon's clipr, and Hadley's packages tibble and readr (for data-type guessing). I especially appreciate clipr's
thoughtful approach to the clipboard on Linux, which pretty much every other R clipboard package just nope'd out on.
Future developments
I am interested in expanding the types of objects supported by the output functions dpasta
. I would also like to eventually have Fiddle Selection
to pivot function calls and named vectors. Feel free to contribute your ideas to the open issues.
Bonus
0 to datapasta
in 64 seconds via a video vignette: