Check if any or all of the elements of a short atomic vector are contained within a supplied long atomic vector.

Contains(long, short, matches = c("any", "all"), partial = FALSE)

Arguments

long

A vector possibly containing any or all elements of short

short

A short vector or scalar, some elements of which may be contained in long

matches

Should the function require any or all of the elements of short to be present in long? Defaults to "any", meaning that the function returns TRUE if any of the elements of short are contained in long. The other option is "all".

partial

Should partial string matching be allowed? Defaults to FALSE. Partial string matching means that an element of long counts as a match if it starts with the value supplied to short.

Value

A logical scalar. If matches = "any", this indicates if any of the elements of short are contained in long. If matches = "all", this indicates if all of the elements of short are contained in long. If partial = TRUE, the returned logical indicates whether or not any of the character strings in long start with the character scalar supplied to short.

Details

This is a helper function to find out if a gene symbol or some similar character string (or character vector) is contained in a pathway. Currently, this function uses base R, but we can write it in a compiled language (such as C++) to increase speed later.
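As a rough illustration of the "any" / "all" logic in base R, the following sketch may help. It is not the package's source code: the name contains_sketch is hypothetical, and the use of %in% is an assumption about one reasonable way to do exact matching.

```r
# Hypothetical sketch for illustration only; not the actual Contains() source.
contains_sketch <- function(long, short, matches = c("any", "all")) {
  # Resolve the choice; the first option, "any", is the default
  matches <- match.arg(matches)

  # One logical per element of short: is it present in long?
  hits <- short %in% long

  if (matches == "any") any(hits) else all(hits)
}

contains_sketch(1:10, 8)
#> [1] TRUE
contains_sketch(LETTERS, c("A", "!"), matches = "any")
#> [1] TRUE
contains_sketch(LETTERS, c("A", "!"), matches = "all")
#> [1] FALSE
```

Collapsing the per-element results with any() or all() keeps the return value a logical scalar, as described under Value.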

For partial matching (partial = TRUE), long must be an atomic vector of type character, short must be an atomic scalar (a vector with length of 1) of type character, and matches should be set to "any". Because this function is designed to match gene symbols or CpG locations, we care if the symbol or location starts with the string supplied. For example, if we set short = "PIK", then we want to find if any of the gene symbols in the supplied long vector belong to the PIK gene family. We don't care if this string appears elsewhere in a gene symbol.
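The prefix-only behaviour described above can be sketched with base R's startsWith() (available since R 3.3.0). This is an assumption about how such a check could be written, not the package's implementation; starts_with_any is a hypothetical name, and "XPIK1" is a made-up string used only to show that a match in the middle of a symbol does not count.

```r
# Hypothetical sketch for illustration only; not the actual Contains() source.
starts_with_any <- function(long, short) {
  # Partial matching expects a character vector and a character scalar
  stopifnot(is.character(long), is.character(short), length(short) == 1L)

  # TRUE if at least one element of long begins with short
  any(startsWith(long, short))
}

starts_with_any(c("PI4KA", "PIK3CA", "PITPNB"), "PIK")
#> [1] TRUE

# "XPIK1" is a made-up string: "PIK" appears, but not at the start
starts_with_any(c("PI4KA", "XPIK1"), "PIK")
#> [1] FALSE
```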

Examples

Contains(1:10, 8)
#> [1] TRUE
Contains(LETTERS, c("A", "!"), matches = "any")
#> [1] TRUE
Contains(LETTERS, c("A", "!"), matches = "all")
#> [1] FALSE
genesPI <- c(
  "PI4K2A", "PI4K2B", "PI4KA", "PI4KB", "PIK3C2A", "PIK3C2B", "PIK3C2G",
  "PIK3C3", "PIK3CA", "PIK3CB", "PIK3CD", "PIK3CG", "PIK3R1", "PIK3R2",
  "PIK3R3", "PIK3R4", "PIK3R5", "PIK3R6", "PIKFYVE", "PIP4K2A", "PIP4K2B",
  "PIP5K1B", "PIP5K1C", "PITPNB"
)
Contains(genesPI, "PIK3", partial = TRUE)
#> [1] TRUE