pairot.pp.select_genes

pairot.pp.select_genes(de_res_ova, de_res_ava, n_genes_ova=10, n_genes_ava=3, n_genes_max=None, overlap_threshold=0.3, overlap_n_genes=10, remove_duplicated_genes=True)

Select and combine DE results from OVA (one vs all) and AVA (all vs all) settings.

This function combines the OVA (one vs. all) and AVA (all vs. all) DE results by selecting the top DE genes from both settings. AVA results are added only for pairs of clusters that are similar enough, as measured by the Jaccard overlap of their top OVA DE genes.
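The similarity check described above can be illustrated with a minimal sketch of a Jaccard overlap between two top-gene lists. This is not pairot's internal code: the helper `jaccard_overlap` and the gene lists are hypothetical, and only the comparison against `overlap_threshold` follows the parameter description in this page.

```python
def jaccard_overlap(genes_a, genes_b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two gene collections."""
    set_a, set_b = set(genes_a), set(genes_b)
    union = set_a | set_b
    if not union:
        # Convention chosen for this sketch: two empty sets do not overlap.
        return 0.0
    return len(set_a & set_b) / len(union)


# Hypothetical top OVA genes for two clusters (illustrative values only).
top_genes_cluster_a = ["CD3D", "CD3E", "IL7R", "CCR7", "TRAC"]
top_genes_cluster_b = ["NKG7", "GNLY", "CD3E", "TRAC", "KLRD1"]

overlap = jaccard_overlap(top_genes_cluster_a, top_genes_cluster_b)

# Mirrors the documented default; AVA genes would be added only if the
# overlap is greater than this threshold.
overlap_threshold = 0.3
clusters_are_similar = overlap > overlap_threshold
```

Here the two lists share 2 of 8 distinct genes, so the overlap is 0.25 and this pair would not count as similar under the default threshold of 0.3.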

de_res_ova

OVA (one vs. all) DE results from pairot.pp.filter_genes_ova().

de_res_ava

AVA (all vs. all) DE results from pairot.pp.filter_genes_ava().

n_genes_ova

Number of top DE genes to select from the OVA results for each cluster.

n_genes_ava

Number of top DE genes to select from the AVA results for each cluster pair.

n_genes_max

Maximum number of DE genes to return for each cluster after combining OVA and AVA results. If None, all genes are returned.

overlap_threshold

Jaccard overlap threshold used to decide whether two clusters are similar enough for their DE results to be refined with AVA results. If the overlap between the top overlap_n_genes OVA genes of two clusters is greater than this threshold, the top n_genes_ava AVA results are added to refine the DE results.

overlap_n_genes

Number of genes to use for the overlap calculation between clusters.

remove_duplicated_genes

If True, remove duplicated genes after combining OVA and AVA results.
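As a rough illustration of what remove_duplicated_genes and n_genes_max describe, the sketch below combines two toy tables with pandas, drops repeated genes, and caps the result. It is an assumption-laden example, not pairot's implementation: the column names `gene` and `source`, the order of the two steps, and the choice to keep the first occurrence are all hypothetical.

```python
import pandas as pd

# Hypothetical per-cluster tables; the "gene" and "source" columns are
# assumptions for this sketch, not the documented pairot output schema.
ova_top = pd.DataFrame({"gene": ["CD3D", "CD3E", "IL7R"], "source": "ova"})
ava_top = pd.DataFrame({"gene": ["CD3E", "CCR7"], "source": "ava"})

# Stack OVA genes first, then the AVA genes that refine them.
combined = pd.concat([ova_top, ava_top], ignore_index=True)

# Analogous to remove_duplicated_genes=True: keep each gene once
# (first occurrence wins in this sketch).
deduplicated = combined.drop_duplicates(subset="gene", keep="first")

# Analogous to n_genes_max: cap the number of genes kept per cluster.
n_genes_max = 3
capped = deduplicated.head(n_genes_max)
```

With these toy inputs, `CD3E` appears in both tables and is kept only once, and the cap then retains the first three genes.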

Return type:

dict[str, DataFrame]

Returns:

combined_de_results: Dictionary containing the combined DE results for each cluster.

Examples

>>> import pairot as pr
>>>
>>> de_res_ova, de_res_ava = pr.pp.rank_genes_limma(
...     adata,
...     cluster_label="cell_type_col",
...     sample_label="sample_col",
... )
>>> de_res_ova_sorted_and_filtered = pr.pp.filter_genes_ova(
...     de_res_ova,
...     logfc_threshold=1.0,
...     aucroc_threshold=0.6,
...     adj_pval_threshold=0.05,
...     gene_filtering=True,
... )
>>> de_res_ava_sorted_and_filtered = pr.pp.filter_genes_ava(
...     de_res_ava,
...     logfc_threshold=1.0,
...     aucroc_threshold=0.6,
...     adj_pval_threshold=0.05,
...     gene_filtering=True,
... )
>>> combined_de_results = pr.pp.select_genes(
...     de_res_ova_sorted_and_filtered,
...     de_res_ava_sorted_and_filtered,
...     n_genes_ova=10,
...     n_genes_ava=3,
... )
>>> combined_de_results