Compute pathway activity analysis using PCA, flagging PCs driven by a single gene.

For each module provided as input, runs PCA on the module's genes and tests the informativity of successive PCs, so that a single module can yield several distinct activity scores.
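The per-module PCA step can be illustrated with base R. This is a minimal sketch on simulated data, not the function's actual implementation: the toy matrix, the module, and the use of squared loadings as "gene contributions" are assumptions, and the 0.5 cutoff simply mirrors the default of max_contrib without claiming that this is how the parameter is applied internally.

```r
set.seed(1)

# Toy expression matrix: 12 genes (rows) x 50 cells (columns).
expr <- matrix(rnorm(12 * 50), nrow = 12,
               dimnames = list(paste0("gene", 1:12), paste0("cell", 1:50)))
module_genes <- paste0("gene", 1:10)  # hypothetical module

# prcomp() expects observations in rows, so transpose to cells x genes.
pca <- prcomp(t(expr[module_genes, ]), center = TRUE, scale. = TRUE)

# Fraction of variance explained by each PC.
var_explained <- pca$sdev^2 / sum(pca$sdev^2)

# Each loading vector has unit length, so squared loadings sum to 1 per PC
# and can be read as the share each gene contributes to that PC.
contrib <- pca$rotation^2

# Largest single-gene share on each of the first 5 PCs; a PC whose top gene
# exceeds the cutoff is essentially driven by that one gene.
max_gene_contrib <- apply(contrib[, 1:5], 2, max)
single_gene_pcs <- names(max_gene_contrib)[max_gene_contrib > 0.5]
```

Here pca$x holds the per-cell scores on each PC, which is the kind of quantity an activity score would be derived from.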

run_activity_analysis(
  expr_mat,
  modules_list,
  norm = FALSE,
  nb_comp_max = 5,
  min_cells_pct = 0.05,
  nCores = 1,
  min_module_size = 10,
  max_contrib = 0.5,
  scale_before_pca = TRUE,
  all_PCs_in_range = FALSE
)

Arguments

expr_mat: Numeric matrix, with genes as rows and cells as columns. This matrix should be normalized. If it is not, set parameter norm to TRUE to perform logCPM normalization.
modules_list: Named list, with module names as list names and, for each module, a character vector of the genes it contains. For example, the output of read_gmt.
norm: Logical, set to TRUE if your data is not normalized; logCPM normalization will then be performed.
nb_comp_max: Integer, maximum number of PCs to test per module, i.e. the maximum number of ways a module can be activated.
min_cells_pct: Numeric between 0 and 1, minimum fraction of cells that must be above the informativity threshold for an activity score to be considered interesting.
nCores: Integer, number of cores to use. Set to 1 if working in a Windows environment; otherwise, parallel::detectCores() reports how many cores are available.
min_module_size: Integer, minimum number of genes a module must contain for its activity to be computed. Default is 10.
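The inputs described above can be prepared along these lines. This is a sketch on simulated counts: the exact logCPM transform applied when norm = TRUE is not stated here, so log2(CPM + 1) is shown only as one common convention, and counting module genes present in the matrix is an assumed reading of min_module_size. All object names are illustrative.

```r
set.seed(42)

# Toy raw count matrix: 20 genes (rows) x 30 cells (columns).
counts <- matrix(rpois(20 * 30, lambda = 5), nrow = 20,
                 dimnames = list(paste0("gene", 1:20), paste0("cell", 1:30)))

# One common logCPM convention: rescale each cell to one million counts,
# then take log2(x + 1).
lib_size <- colSums(counts)
cpm <- t(t(counts) / lib_size) * 1e6
log_cpm <- log2(cpm + 1)

# modules_list: a named list of character vectors of gene names.
modules_list <- list(
  module_A = paste0("gene", 1:12),
  module_B = paste0("gene", 15:20)
)

# Number of genes of each module found in the matrix; modules below the
# size cutoff (10 by default) would be too small to analyse.
n_present <- vapply(modules_list,
                    function(g) sum(g %in% rownames(log_cpm)),
                    integer(1))
large_enough <- names(n_present)[n_present >= 10]

# Use a single core on Windows, otherwise ask how many are available.
n_cores <- if (.Platform$OS.type == "windows") 1L else parallel::detectCores()

# Guarded call, run only when the package providing the function is loaded.
if (exists("run_activity_analysis")) {
  res <- run_activity_analysis(log_cpm, modules_list, norm = FALSE,
                               nCores = n_cores, min_module_size = 10)
}
```

Passing the raw counts with norm = TRUE would be the alternative to normalizing beforehand.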

Value

List with one element per module. Each element stores the activity scores (raw, or scaled between 0 and 1) from the informative PCs, the gene contributions to each informative PC, the variance explained by each PC, the threshold used for informativity, and the number of cells above that threshold.
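To make the relationship between scaled scores, the informativity threshold, and min_cells_pct concrete, here is a small sketch on simulated scores. Min-max scaling is assumed as the way of mapping scores to the 0 to 1 range, and the threshold value is hypothetical; neither is confirmed as what the function computes.

```r
set.seed(7)

# Stand-in for one PC's raw activity scores across 200 cells.
raw_scores <- rnorm(200)

# Min-max scaling maps the scores onto the interval [0, 1].
score_range <- max(raw_scores) - min(raw_scores)
scaled_scores <- (raw_scores - min(raw_scores)) / score_range

# Hypothetical informativity threshold on the scaled scores.
threshold <- 0.8

# Fraction of cells above the threshold, compared with min_cells_pct.
n_above <- sum(scaled_scores > threshold)
frac_above <- n_above / length(scaled_scores)
is_interesting <- frac_above >= 0.05
```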