pairot.pp.select_genes
- pairot.pp.select_genes(de_res_ova, de_res_ava, n_genes_ova=10, n_genes_ava=3, n_genes_max=None, overlap_threshold=0.3, overlap_n_genes=10, remove_duplicated_genes=True)
Select and combine DE results from OVA (one vs all) and AVA (all vs all) settings.
This function combines the OVA (one vs. all) and AVA (all vs. all) DE results by selecting the top DE genes from both settings. The AVA results are added only for clusters that are similar enough, as measured by the Jaccard overlap of their top OVA DE genes.
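The similarity criterion can be illustrated with a minimal sketch. This is not pairot's implementation: the helper `jaccard_overlap` and the gene lists are hypothetical, and the sketch assumes only the standard Jaccard index (intersection size divided by union size) applied to the top OVA genes of two clusters.

```python
def jaccard_overlap(genes_a, genes_b):
    """Jaccard index of two gene collections: |A & B| / |A | B|."""
    set_a, set_b = set(genes_a), set(genes_b)
    union = set_a | set_b
    if not union:
        # Two empty gene lists share nothing; treat the overlap as zero.
        return 0.0
    return len(set_a & set_b) / len(union)


# Hypothetical top OVA genes for two clusters (illustrative values only).
top_ova = {
    "cluster_a": ["CD3D", "CD3E", "IL7R", "CCR7", "LTB"],
    "cluster_b": ["CD3D", "CD3E", "IL7R", "GZMK", "CCL5"],
}

overlap_threshold = 0.3  # same value as the documented default
overlap = jaccard_overlap(top_ova["cluster_a"], top_ova["cluster_b"])

# Assumption: AVA genes are considered only when the overlap exceeds the threshold.
use_ava = overlap > overlap_threshold
print(f"overlap={overlap:.3f}, use_ava={use_ava}")
```

Here the two clusters share 3 of 7 distinct genes, so the overlap exceeds the default threshold of 0.3 and the pair would qualify for AVA refinement under this sketch.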
- de_res_ova
OVA (one vs. all) DE results from pairot.pp.filter_genes_ova().
- de_res_ava
AVA (all vs. all) DE results from pairot.pp.filter_genes_ava().
- n_genes_ova
Number of top DE genes to select from the OVA results for each cluster.
- n_genes_ava
Number of top DE genes to select from the AVA results for each cluster pair.
- n_genes_max
Maximum number of DE genes to return for each cluster after combining OVA and AVA results. If None, all genes are returned.
- overlap_threshold
Jaccard overlap threshold to determine if two clusters are similar enough to refine the DE results using AVA results. If the overlap of the top overlap_n_genes genes between the OVA results of two clusters is greater than this threshold, the top n_genes_ava AVA results will be added to refine the DE results.
- overlap_n_genes
Number of genes to use for the overlap calculation between clusters.
- remove_duplicated_genes
If True, remove duplicated genes after combining OVA and AVA results.
- Return type:
- Returns:
combined_de_results: Dictionary containing the combined DE results for each cluster.
Examples
>>> import pairot as pr
>>>
>>> de_res_ova, de_res_ava = pr.pp.rank_genes_limma(
...     adata,
...     cluster_label="cell_type_col",
...     sample_label="sample_col",
... )
>>> de_res_ova_sorted_and_filtered = pr.pp.filter_genes_ova(
...     de_res_ova,
...     logfc_threshold=1.0,
...     aucroc_threshold=0.6,
...     adj_pval_threshold=0.05,
...     gene_filtering=True,
... )
>>> de_res_ava_sorted_and_filtered = pr.pp.filter_genes_ava(
...     de_res_ava,
...     logfc_threshold=1.0,
...     aucroc_threshold=0.6,
...     adj_pval_threshold=0.05,
...     gene_filtering=True,
... )
>>> combined_de_results = pr.pp.select_genes(
...     de_res_ova_sorted_and_filtered,
...     de_res_ava_sorted_and_filtered,
...     n_genes_ova=10,
...     n_genes_ava=3,
... )
>>> combined_de_results
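The roles of n_genes_max and remove_duplicated_genes can be pictured with a small standalone sketch. It is an illustration only, not pairot's code: the helper `combine_genes` is hypothetical, the gene names are placeholders, and both the ordering (OVA genes first, then AVA genes) and the choice to deduplicate before applying the cap are assumptions.

```python
def combine_genes(ova_genes, ava_genes, n_genes_max=None, remove_duplicated_genes=True):
    """Hypothetical helper: merge two ranked gene lists for one cluster."""
    # Assumption: OVA genes come first, followed by AVA genes.
    combined = list(ova_genes) + list(ava_genes)
    if remove_duplicated_genes:
        # dict.fromkeys keeps the first occurrence of each gene and preserves order.
        combined = list(dict.fromkeys(combined))
    if n_genes_max is not None:
        # Assumption: the cap is applied after duplicates are removed.
        combined = combined[:n_genes_max]
    return combined


# Placeholder gene names, not real results.
ova = ["GENE_A", "GENE_B", "GENE_C"]
ava = ["GENE_C", "GENE_D"]

merged = combine_genes(ova, ava)
capped = combine_genes(ova, ava, n_genes_max=3)
with_duplicates = combine_genes(ova, ava, remove_duplicated_genes=False)
print(merged, capped, with_duplicates, sep="\n")
```

In this sketch GENE_C appears in both inputs, so it is kept once when remove_duplicated_genes=True and twice otherwise.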