Description

Stochastic Frontier Analysis Routines.

Description

Maximum likelihood estimation for stochastic frontier analysis (SFA) of production (profit) and cost functions. The package includes the basic stochastic frontier for cross-sectional or pooled data with several distributions for the one-sided error term (i.e., Rayleigh, gamma, Weibull, lognormal, uniform, generalized exponential and truncated skewed Laplace), the latent class stochastic frontier model (LCM) as described in Dakpo et al. (2021) <doi:10.1111/1477-9552.12422>, for cross-sectional and pooled data, and the sample selection model as described in Greene (2010) <doi:10.1007/s11123-009-0159-1>, and applied in Dakpo et al. (2021) <doi:10.1111/agec.12683>. Several possibilities in terms of optimization algorithms are proposed.

README.md

cran.r-project.org

sfaR: Stochastic Frontier Analysis Using R

sfaR provides a set of tools (maximum likelihood and maximum simulated likelihood) for various specifications of stochastic frontier analysis.

Three categories of models are available in sfaR:

Classic Stochastic Frontier Model

This model allows the estimation of the frontier for a cross-sectional or pooled data. Basically we have

$$y_i = \mathbf{x_i'}\boldsymbol{\beta} + v_i - Su_i$$

where $S = 1$ for production function and $S = -1$ for cost function. $v$ follows a normal distribution $\mathcal{N}(0, \sigma_v^2)$. For $u$ ten different distributions are available. These distributions include:

Half-Normal
Truncated Normal
Exponential
Rayleigh
Gamma
Generalized Exponential
Lognormal
Truncated Skewed Laplace
Uniform
Weibull

In the case of the Gamma, lognormal and Weibull distributions, maximum simulated likelihood is used with the possibility of four possibilities to construct the draws: Halton, Generalized Halton, Sobol and uniform.

Heteroscedasticity in both error terms can be implemented, in addition to heterogeneity in the truncated mean parameter in the case of the truncated normal and lognormal distributions. In addition, in the case of the truncated normal distribution, the scaling property can be estimated. The main function for this class of model is sfacross.

Latent Class Stochastic Frontier Model

This model accounts for technological heterogeneity by splitting the observations into a maximum number of five classes. The classification operates based on a logit functional form that can be specified using some covariates (namely, the separating variables allowing the separation of observations in several classes). Only the half normal distribution is available for the one-sided error term. Heteroscedasticity in both error terms is possible. The choice of the number of classes can be guided by several information criteria (i.e. AIC, BIC or HQIC). The main function for this class of model is sfalcmcross.

Sample Selection Correction Stochastic Frontier Model

This model solves the selection bias due to the correlation between the two-sided errors terms in both the selection and the frontier equations, in the case of cross-sectional or pooled data. The main function for this class of model is sfaselectioncross.

An important features of sfaR is to provide eleven different optimization algorithms. For complex problem, several algorithms can be combined especially non-gradient based in a first step.

Installation

You can install the development version of sfaR from GitHub with:

# install.packages("devtools")
devtools::install_github("hdakpo/sfaR")

Install the current version on CRAN with

# install.packages("sfaR")

Example

This subsection provides set of examples introducing some important features of sfaR.

library(sfaR)
#> * Please cite the 'sfaR' package as:
#>   Dakpo KH., Desjeux Y., Henningsen A., and Latruffe L. (2023). sfaR: Stochastic Frontier Analysis Routines. R package version 1.0.0.
#> 
#> See also: citation("sfaR")
#> 
#> * For any questions, suggestions, or comments on the 'sfaR' package, please make use of Tracker facilities at:
#>   https://github.com/hdakpo/sfaR/issues
## basic examples code

## let's estimate the classic frontier for different distributions using 
## the utility dataset, which contains data on fossil fuel fired 
## steam electric power generation plants in the United States
hlf <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
 log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
 I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
 udist = 'hnormal', uhet = ~ regu, data = utility, S = -1, method = 'bfgs')

trnorm <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
 log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
 I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
 udist = 'tnormal', muhet = ~ regu, data = utility, S = -1, method = 'bfgs')

tscal <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
 log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
 I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
 udist = 'tnormal', muhet = ~ regu, uhet = ~ regu, data = utility, 
 S = -1, method = 'bfgs', scaling = TRUE)

expo <- sfacross(formula = log(tc/wf) ~ log(y) + I(1/2 * (log(y))^2) +
 log(wl/wf) + log(wk/wf) + I(1/2 * (log(wl/wf))^2) + I(1/2 * (log(wk/wf))^2) +
 I(log(wl/wf) * log(wk/wf)) + I(log(y) * log(wl/wf)) + I(log(y) * log(wk/wf)),
 udist = 'exponential', uhet = ~ regu, data = utility, S = -1, method = 'bfgs')

Outputs of estimation can be exported using the texreg package. For instance, using the command screenreg(list(hlf, trnorm, tscal, expo)) yields the following output

sfacross

## For the latent class stochastic frontier we have:
lcm2c1 <- sfalcmcross(formula = ly ~ lk + ll + yr, thet = ~initStat, 
 data = worldprod)
#> Initialization: SFA + halfnormal - normal distributions...
#> LCM 2 Classes Estimation...
lcm2c2 <- sfalcmcross(formula = ly ~ lk + ll + yr, uhet = ~initStat, 
 data = worldprod)
#> Initialization: SFA + halfnormal - normal distributions...
#> LCM 2 Classes Estimation...

The command screenreg(list(lcm2c1, lcm2c2)) generates the following

sfalcmcross

## The following simulation is used for the sample selection
 N <- 2000  # sample size
 set.seed(12345)
 z1 <- rnorm(N)
 z2 <- rnorm(N)
 v1 <- rnorm(N)
 v2 <- rnorm(N)
 e1 <- v1
 e2 <- 0.7071 * (v1 + v2)
 ds <- z1 + z2 + e1
 d <- ifelse(ds > 0, 1, 0)
 u <- abs(rnorm(N))
 x1 <- rnorm(N)
 x2 <- rnorm(N)
 y <- x1 + x2 + e2 - u
 data <- cbind(y = y, x1 = x1, x2 = x2, z1 = z1, z2 = z2, d = d)
 
 ## Estimation using quadrature (Gauss-Kronrod)
 
 selecRes1 <- sfaselectioncross(selectionF = d ~ z1 + z2, frontierF = y ~ x1 + x2, 
 modelType = 'greene10', method = 'bfgs',
 logDepVar = TRUE, data = as.data.frame(data),
 S = 1L, udist = 'hnormal', lType = 'kronrod', Nsub = 100, uBound = Inf,
 simType = 'halton', Nsim = 300, prime = 2L, burn = 10, antithetics = FALSE,
 seed = 12345, itermax = 2000, printInfo = FALSE)
#> First step probit model...
#> Second step Frontier model...
 
 ## Estimation using quadrature (Gauss-Hermite)
 
 selecRes2 <- sfaselectioncross(selectionF = d ~ z1 + z2, frontierF = y ~ x1 + x2, 
 modelType = 'greene10', method = 'bfgs',
 logDepVar = TRUE, data = as.data.frame(data),
 S = 1L, udist = 'hnormal', lType = 'ghermite', Nsub = 100, uBound = Inf,
 simType = 'halton', Nsim = 300, prime = 2L, burn = 10, antithetics = FALSE,
 seed = 12345, itermax = 2000, printInfo = FALSE)
#> First step probit model...
#> Second step Frontier model...

The command screenreg(list(selecRes1, selecRes2))

selectioncross

r-sfaR

sfaR: Stochastic Frontier Analysis Using R

Installation

Example

Version

License

Status

Source

Homepage

Platforms (75)

sfaR: Stochastic Frontier Analysis Using R

Installation

Example

Version

License

Status

Source

Homepage

Platforms75 (75)

Platforms (75)