superpc.st.Rd
Identify p_path significant features, extract principal components (PCs) from those specific features to construct a data matrix, predict the response with this data matrix, and record the model fit statistic of this prediction.
superpc.st( fit, data, n.threshold = 20, threshold.ignore = 0, n.PCs = 1, min.features = 3, epsilon = 1e-06 )
fit | An object of class |
---|---|
data | A list of test data: |
n.threshold | The number of bins into which to split the feature scores returned in the |
threshold.ignore | Calculate the model for feature scores above this percentile of the threshold. We have observed that the smallest threshold values (0% - 40%) largely have no effect on model t-scores. Defaults to 0.00 (0%). |
n.PCs | The number of PCs to extract from the pathway. |
min.features | What is the smallest number of genes allowed in each pathway? This argument must be kept constant across all calls to this function which use the same pathway list. Defaults to 3. |
epsilon | A small constant used when comparing the absolute score values to each value of the threshold vector. Why this adjustment matters is not documented. Defaults to 1e-06. |
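As a rough illustration of how `n.threshold`, `threshold.ignore`, and `epsilon` could interact, the sketch below bins a handful of made-up feature scores. It is not the package's internal code: the score values, the use of evenly spaced quantiles, and the reading of `epsilon` as a small comparison tolerance are all assumptions made for this example.

```r
# Hypothetical feature scores for a six-gene pathway (made-up values).
scores <- c(g1 = -2.1, g2 = 0.3, g3 = 1.4, g4 = -0.8, g5 = 2.6, g6 = 0.1)

n.threshold      <- 5
threshold.ignore <- 0.4
epsilon          <- 1e-06

# Assumption: thresholds are evenly spaced quantiles of the absolute scores,
# starting at the threshold.ignore percentile.
probs      <- seq(threshold.ignore, 1, length.out = n.threshold)
thresholds <- quantile(abs(scores), probs = probs)

# Assumption: a feature passes a threshold when its absolute score, padded by
# epsilon, is at least as large as that threshold.
keep  <- outer(abs(scores) + epsilon, thresholds, FUN = ">=")
nKept <- colSums(keep)  # number of features retained at each threshold

print(thresholds)
print(nKept)
```

Under these assumptions, raising `threshold.ignore` simply skips the lowest bins, which the argument description says rarely change the model t-scores.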
A list containing:
thresholds
: A labelled vector of quantile values of the score vector in the fit object.
n.threshold
: The number of splits to make in the score vector.
scor
: A matrix of model fit statistics. Each column is a threshold level of predictors allowed into the model, and each row is a PC included. Which genes are included in the matrix before PC extraction is governed by comparing their model score to the quantile value of the scores at each threshold value.
tscor
: A matrix of model t-statistics for each PC included (rows) at each threshold level (columns).
type
: Which model was called? Options are survival, regression, or binary.
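The shape of the `tscor` matrix (PCs in rows, thresholds in columns) can be sketched with base R alone. Everything below is an assumption made for illustration: the data are simulated, the feature scores are univariate regression t-statistics, the response type is regression, and the top quantile is capped so that three features always remain. It does not reproduce the internals of `superpc.train` or `superpc.st`.

```r
set.seed(1)

# Simulated data (hypothetical): 8 features in rows, 40 samples in columns.
x <- matrix(rnorm(8 * 40), nrow = 8,
            dimnames = list(paste0("gene", 1:8), NULL))
y <- as.numeric(0.9 * x[1, ] - 0.7 * x[2, ] + rnorm(40))

# Stand-in feature scores: t-statistic of each univariate regression slope.
scores <- apply(x, 1, function(g) summary(lm(y ~ g))$coefficients[2, 3])

n.threshold  <- 6
n.PCs        <- 2
min.features <- 3
epsilon      <- 1e-06

# Assumption: cap the highest quantile so min.features features always pass.
probs      <- seq(0, 1 - min.features / nrow(x), length.out = n.threshold)
thresholds <- quantile(abs(scores), probs = probs)

tscor <- matrix(NA_real_, nrow = n.PCs, ncol = n.threshold,
                dimnames = list(paste0("PC", seq_len(n.PCs)),
                                names(thresholds)))

for (j in seq_len(n.threshold)) {
  # Features whose absolute score clears the j-th threshold.
  keep <- abs(scores) + epsilon >= thresholds[j]
  # PCs of the retained features (samples in rows for prcomp).
  pcs <- prcomp(t(x[keep, , drop = FALSE]))$x[, seq_len(n.PCs), drop = FALSE]
  for (k in seq_len(n.PCs)) {
    # Regress the response on the first k PCs; keep the t-value of PC k.
    fit_k <- lm(y ~ pcs[, seq_len(k), drop = FALSE])
    tscor[k, j] <- summary(fit_k)$coefficients[k + 1, 3]
  }
}

print(round(tscor, 2))
```

Each column of the resulting matrix corresponds to one threshold and each row to one PC, matching the layout described for `tscor`.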
NOTE: the number of thresholds at which to test (n.threshold) can be larger than the number of features to bin. This will result in constant t-statistics for the first few bins because the model isn't changing.
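A small sketch of why extra bins add nothing: with four features there are at most four distinct feature sets, so twelve thresholds must repeat some of them. The scores and the quantile-based thresholds are hypothetical and are not taken from the package.

```r
# Hypothetical absolute scores for a four-feature pathway.
absScores   <- c(0.2, 0.9, 1.5, 2.4)
n.threshold <- 12
epsilon     <- 1e-06

# Assumption: thresholds are evenly spaced quantiles of the absolute scores.
thresholds <- quantile(absScores, probs = seq(0, 1, length.out = n.threshold))

# Count the features passing each threshold.
nKept <- vapply(
  thresholds,
  function(th) sum(absScores + epsilon >= th),
  integer(1)
)

# Neighbouring bins with equal counts select the same features, so a model
# fit on them would give identical statistics.
repeatedBins <- which(diff(nKept) == 0)

print(nKept)
print(repeatedBins)
```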
See https://web.stanford.edu/~hastie/Papers/spca_JASA.pdf.
# DO NOT CALL THIS FUNCTION DIRECTLY.
# Use SuperPCA_pVals() instead
if (FALSE) {
  data("colon_pathwayCollection")
  data("colonSurv_df")

  colon_OmicsSurv <- CreateOmics(
    assayData_df = colonSurv_df[, -(2:3)],
    pathwayCollection_ls = colon_pathwayCollection,
    response = colonSurv_df[, 1:3],
    respType = "surv"
  )

  asthmaGenes_char <-
    getTrimPathwayCollection(colon_OmicsSurv)[["KEGG_ASTHMA"]]$IDs

  data_ls <- list(
    x = t(getAssay(colon_OmicsSurv))[asthmaGenes_char, ],
    y = getEventTime(colon_OmicsSurv),
    censoring.status = getEvent(colon_OmicsSurv),
    featurenames = asthmaGenes_char
  )

  superpcFit <- superpc.train(
    data = data_ls,
    type = "surv"
  )

  superpc.st(
    fit = superpcFit,
    data = data_ls
  )
}