Read a .gmt file in as a pathwayCollection object
read_gmt.Rd
Read a set list file in Gene Matrix Transposed (.gmt) format, with special performance consideration for large files. Present this object as a pathwayCollection object.
read_gmt(
  file,
  setType = c("pathways", "genes", "regions"),
  description = FALSE,
  nChars = 1e+07,
  delim = "\t"
)
file | A path to a file or a connection. This file must be a .gmt file. |
---|---|
setType | What type of set this is: pathways of genes, gene sites in RNA or DNA, or regions of CpGs. Defaults to "pathways". |
description | Should the "description" field (the second field on each line of the .gmt file) be included in the output? Defaults to FALSE. |
nChars | The number of characters to read from a connection. The largest |
delim | The field delimiter of the .gmt file. Defaults to "\t" (tab). |
A pathwayCollection list of sets. This list has three elements:
'setType': A named list of character vectors. Each vector contains the names of the individual genes, sites, or CpGs within that set as a vector of character strings. The name of this list entry is equal to the value specified in setType.
TERMS: A character vector the same length as the 'setType' list with the proper names of the sets.
description: (OPTIONAL) A character vector the same length as the 'setType' list with a note on that set (for the .gmt file included with this package, this field contains hyperlinks to the MSigDB description card for that pathway). This field is included when description = TRUE.
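To make the three-element layout concrete, here is a minimal base R sketch that builds a list of the same shape from two invented .gmt lines. It is an illustration under stated assumptions, not the package's implementation: parse_gmt_lines is a hypothetical helper, the set names, URLs, and gene symbols are made up, and naming each member vector by its set name is an assumption.

```r
# Hypothetical helper: turn already-read .gmt lines into a three-element list.
# NOT the pathwayPCA implementation; it only illustrates the output shape.
parse_gmt_lines <- function(lines, setType = "pathways", delim = "\t") {
  # Split every line into its delimited fields
  fields <- strsplit(lines, split = delim, fixed = TRUE)
  # Field 1 is the set name, field 2 is the description
  terms <- vapply(fields, function(x) x[1], character(1))
  notes <- vapply(fields, function(x) x[2], character(1))
  # Everything after the first two fields is a member of the set
  sets <- lapply(fields, function(x) x[-(1:2)])
  names(sets) <- terms  # assumption: vectors are named by set name
  out <- list(sets, terms, notes)
  # The first entry takes its name from the setType value
  names(out) <- c(setType, "TERMS", "description")
  out
}

# Two invented lines: set name, description, then members (tab-delimited)
toy_lines <- c(
  "SET_A\thttp://example.org/SET_A\tGENE1\tGENE2\tGENE3",
  "SET_B\thttp://example.org/SET_B\tGENE2\tGENE4"
)
toy_ls <- parse_gmt_lines(toy_lines)
str(toy_ls)
```

Calling parse_gmt_lines(toy_lines, setType = "genes") would name the first entry "genes" instead, mirroring how the first element's name follows setType.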
This function uses R's readChar function to improve character input performance over readLines (and to greatly improve input performance over scan).
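The readChar approach can be sketched in base R as follows. This is a rough illustration, not the function's internal code: the temporary file and its contents are invented, and passing 1e+07 as nchars simply mirrors the nChars default shown in the usage above.

```r
# Write a tiny, made-up .gmt file to a temporary location
tmp <- tempfile(fileext = ".gmt")
writeLines(
  c("SET_A\tnote A\tGENE1\tGENE2", "SET_B\tnote B\tGENE3"),
  con = tmp
)

# One call reads up to nchars characters; a shorter file is returned whole
raw_chr <- readChar(tmp, nchars = 1e+07, useBytes = TRUE)

# Split the single string into lines
lines_rc <- strsplit(raw_chr, split = "\n", fixed = TRUE)[[1]]
# Drop carriage returns in case the file has Windows line endings
lines_rc <- sub("\r$", "", lines_rc)

# The line-by-line reader should recover the same lines
lines_rl <- readLines(tmp)
identical(lines_rc, lines_rl)

unlink(tmp)
```

The trade-off is that readChar needs an upper bound on the input size up front, which is the role the nChars argument plays when reading from a connection.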
See the Broad Institute's "Data Formats" page for a description of the Gene Matrix Transposed file format: https://software.broadinstitute.org/cancer/software/gsea/wiki/index.php/Data_formats#GMT:_Gene_Matrix_Transposed_file_format_.28.2A.gmt.29
# If you have installed the package:
data_path <- system.file(
  "extdata", "c2.cp.v6.0.symbols.gmt",
  package = "pathwayPCA", mustWork = TRUE
)
geneset_ls <- read_gmt(data_path, description = TRUE)

# If you are using the development version from GitHub:
# geneset_ls <- read_gmt(
#   "inst/extdata/c2.cp.v6.0.symbols.gmt",
#   description = TRUE
# )