Group Regression Models for Risk Protein Complex Identification.
PCLassoReg
Protein complex-based group regression models for risk protein complex identification.
Installation
To install the released version from CRAN:
install.packages("PCLassoReg")
To install the latest development version from GitHub:
devtools::install_github("weiliu123/PCLassoReg")
Details
The package implements two protein complex-based group regression models, PCLasso and PCLasso2, for risk protein complex identification. PCLasso is a prognostic group lasso-Cox model that identifies risk protein complexes associated with survival. PCLasso2 is a group lasso-logistic classification model that identifies risk protein complexes associated with class labels.
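Both models take protein complexes as groups of variables. As a minimal sketch of what such a grouping can look like, the base R snippet below builds a named list of complexes and matches it against the columns of an expression matrix. The toy matrix, the protein and complex names, and the minimum-size filter are all assumptions made for illustration. They are not taken from the package, which may handle this step differently.

```r
# Hypothetical toy data: 6 samples x 5 proteins (all names are made up).
set.seed(1)
x <- matrix(rnorm(30), nrow = 6,
            dimnames = list(paste0("S", 1:6), c("A", "B", "C", "D", "E")))

# Hypothetical protein complexes as a named list of member proteins.
# Complexes may overlap: protein "B" belongs to two of them.
complexes <- list(PC1 = c("A", "B"),
                  PC2 = c("B", "C", "D"),
                  PC3 = c("E", "F"))   # "F" is not measured in x

# Keep only the members that are measured in x, then drop complexes
# left with fewer than two members (an illustrative threshold).
groups <- lapply(complexes, intersect, colnames(x))
groups <- groups[lengths(groups) >= 2]

# Proteins covered by at least one retained complex.
vars.unique <- unique(unlist(groups, use.names = FALSE))
```

Because complexes can share proteins, the groups overlap, which is the setting addressed by the sparse overlapping group lasso cited in the Reference section.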
Example
library(PCLassoReg)
#################### PCLasso ####################
# load data
data(survivalData)
data(PCGroups)
x <- survivalData$Exp
y <- survivalData$survData
# get human protein complexes
PC.Human <- getPCGroups(Groups = PCGroups, Organism = "Human",
                        Type = "EntrezID")
# split samples into training (2/3) and test (1/3) sets
set.seed(20150122)
idx.train <- sample(nrow(x), round(nrow(x)*2/3))
x.train <- x[idx.train,]
y.train <- y[idx.train,]
x.test <- x[-idx.train,]
y.test <- y[-idx.train,]
# fit cv.PCLasso model
cv.fit1 <- cv.PCLasso(x = x.train, y = y.train, group = PC.Human, nfolds = 5)
# predict risk scores of samples in x.test
s <- predict(object = cv.fit1, x = x.test, type = "link",
             lambda = cv.fit1$cv.fit$lambda.min)
# selected risk protein complexes (groups with nonzero coefficients)
sel.groups <- predict(object = cv.fit1, type = "groups",
                      lambda = cv.fit1$cv.fit$lambda.min)
# selected risk proteins (unique variables with nonzero coefficients)
sel.vars.unique <- predict(object = cv.fit1, type = "vars.unique",
                           lambda = cv.fit1$cv.fit$lambda.min)
#################### PCLasso2 ####################
# load data
data(classData)
data(PCGroups)
x <- classData$Exp
y <- classData$Label
# get human protein complexes
PC.Human <- getPCGroups(Groups = PCGroups, Organism = "Human",
                        Type = "GeneSymbol")
# split samples into training (2/3) and test (1/3) sets
set.seed(20150122)
idx.train <- sample(nrow(x), round(nrow(x)*2/3))
x.train <- x[idx.train,]
y.train <- y[idx.train]
x.test <- x[-idx.train,]
y.test <- y[-idx.train]
# fit cv.PCLasso2 model
cv.fit1 <- cv.PCLasso2(x = x.train, y = y.train, group = PC.Human,
                       penalty = "grLasso", family = "binomial", nfolds = 5)
# predict class labels of samples in x.test
s <- predict(object = cv.fit1, x = x.test, type = "class",
             lambda = cv.fit1$cv.fit$lambda.min)
# selected risk protein complexes (groups with nonzero coefficients)
sel.groups <- predict(object = cv.fit1, type = "groups",
                      lambda = cv.fit1$cv.fit$lambda.min)
# selected risk proteins (unique variables with nonzero coefficients)
sel.vars.unique <- predict(object = cv.fit1, type = "vars.unique",
                           lambda = cv.fit1$cv.fit$lambda.min)
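The examples stop at prediction. As a hedged sketch of how held-out predictions might then be scored, the base R snippet below computes Harrell's concordance index for risk scores and plain accuracy for class labels. The helper c_index and all of the toy vectors are hypothetical and written for illustration only. They are not part of PCLassoReg, and other evaluation measures may suit a given study better.

```r
# Harrell's concordance index (hypothetical helper, base R only).
# time:   follow-up time
# status: 1 = event observed, 0 = censored
# risk:   predicted risk score (higher = worse prognosis)
c_index <- function(time, status, risk) {
  concordant <- 0
  usable <- 0
  n <- length(time)
  for (i in seq_len(n)) {
    if (status[i] != 1) next          # a pair must be anchored on an event
    for (j in seq_len(n)) {
      if (time[j] > time[i]) {        # j outlived i: comparable pair
        usable <- usable + 1
        if (risk[i] > risk[j]) {
          concordant <- concordant + 1
        } else if (risk[i] == risk[j]) {
          concordant <- concordant + 0.5   # ties count as half
        }
      }
    }
  }
  concordant / usable
}

# Toy survival data (made up): four samples, one censored.
time   <- c(5, 10, 15, 20)
status <- c(1, 1, 0, 1)
risk   <- c(2.0, 0.7, 1.0, 0.5)
ci <- c_index(time, status, risk)

# Toy class labels (made up): accuracy and confusion table.
truth <- c(0, 1, 1, 0, 1)
pred  <- c(0, 1, 0, 0, 1)
acc <- mean(pred == truth)
confusion <- table(predicted = pred, actual = truth)
```

With the objects from the PCLasso example, a call such as c_index(y.test[, "time"], y.test[, "status"], s) would apply the same idea, assuming the survival data carries columns with those names. For PCLasso2, mean(s == y.test) would give the test accuracy, assuming the predicted and true labels use the same coding.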
Reference
PCLasso2: a protein complex-based, group Lasso-logistic model for risk protein complex discovery. To be published.
PCLasso: a protein complex-based, group lasso-Cox model for accurate prognosis and risk protein complex discovery. Brief Bioinform, 2021.
Park, H., Niida, A., Miyano, S. and Imoto, S. (2015) Sparse overlapping group lasso for integrative multi-omics analysis. Journal of Computational Biology, 22, 73-84.