A Bit of Glue Between R and Stata.
RStata
A simple R -> Stata
interface allowing the user to:
- execute Stata commands (both inline and from a
.do
file) from R; - pass a
data.frame
to Stata; - return a
data.frame
to R; - include Stata 'chunks' in knitr documents (however each chunk is independent from the others; feature tested with LaTeX/Linux).
Install
install.packages('RStata')
Setup
In order to use this package you need to setup two options
:
options("RStata.StataPath")
options("RStata.StataVersion")
Once done, the main function in the package is stata
(see Examples below).
RStata.StataPath
RStata.StataPath
contains the path to Stata executable (Windows user have to delete the extension .exe
from the path ).
You can use
chooseStataBin()
to set this option the first time:
in Linux it searches for the 4 (cli) Stata binaries (
stata-mp
,stata-se
,stata
,stata-sm
) in the search path and prompt a menu to choose one;in Windows does a
file.choose
(and a bit of manipulation, in Windows, to obtain a valid path).other system are not handled by
chooseStataBin
(mainly because of my ignorance regarding both the system itself and Stata installation on it, but contributions are welcome).Mac users are a notable group: at this stage they should set the path for the command line executable (not the GUI app). Eg for a Stata13 SE installation in El Capitain (with default options) the path is
'/Applications/Stata/StataSE.app/Contents/MacOS/stata-se'
. In other cases (Stata licenses, installation path and so on) the path could be slightly different (BTW, i've found the path 'exploring' the filesystem with a terminal, not the graphics explorer since it 'stops' at 'StataSE.app').
chooseStataBin
set up RStata.StataPath
accordingly to user advice and return the path; but you may consider to put the path returned in .Rprofile
options for the next time you start R.
RStata.StataVersion
RStata.StataVersion
is needed for right management of data export/import to/from Stata. Eg, for Stata 13:
options("RStata.StataVersion" = 13)
As above, you may consider to put this statement in your .Rprofile
.
Other options
See ?stata
.
Troubleshooting
In some setup the package seems to freeze even with the simplest example (eg stata('set obs 10')
). This is likely dued to Stata more
option, which has to be set to off:
set more off, permanently
Examples
One inline command
> stata("help regress") #<- this won't work in Windows dued to needed
# batch mode
. help regress
Title
-----
[R] regress -- Linear regression
Syntax
------
regress depvar [indepvars] [if] [in] [weight] [, options]
[...]
Many inline commands
You can do this (eg.):
> stata_src <- "
version 10
sysuse auto
reg mpg weight
"
> stata(stata_src)
[...]
External .do file sourceing
> stata("foo.do")
Data input to Stata
> x <- data.frame(a = rnorm(3), b = letters[1:3])
> stata("sum a", data.in = x)
. sum a
Variable | Obs Mean Std. Dev. Min Max
------------+--------------------------------------------------------
a | 3 -.5603985 .3000552 -.7720861 -.2170166
Data output from Stata
Eg in order to obtain auto
dataset
> auto <- stata("sysuse auto", data.out = TRUE)
. sysuse auto
(1978 Automobile Data)
> head(auto)
make price mpg rep78 headroom trunk weight length turn displacement
1 AMC Concord 4099 22 3 2.5 11 2930 186 40 121
2 AMC Pacer 4749 17 3 3.0 11 3350 173 40 258
3 AMC Spirit 3799 22 NA 3.0 12 2640 168 35 121
4 Buick Century 4816 20 3 4.5 16 3250 196 40 196
5 Buick Electra 7827 15 4 4.0 20 4080 222 43 350
6 Buick LeSabre 5788 18 3 4.0 21 3670 218 43 231
gear_ratio foreign
1 3.58 Domestic
2 2.53 Domestic
3 3.08 Domestic
4 2.93 Domestic
5 2.41 Domestic
6 2.73 Domestic
Data input/output
> x <- data.frame(a = rnorm(3), b = letters[1:3])
> (y <- stata("replace a = 2", data.in = x, data.out = TRUE))
. replace a = 2
(3 real changes made)
a b
1 2 a
2 2 b
3 2 c
Contribute
You are welcome to:
- submit suggestions and bug-reports at: http://github.com/lbraglia/RStata/issues
- fork and send a pull request on: http://github.com/lbraglia/RStata/
- send an e-mail to: [email protected].