troutpy.tl.cluster_distribution_from_source
- troutpy.tl.cluster_distribution_from_source(sdata, gene_key='gene', distance_key='distance', n_clusters=3, n_bins=20, copy=False)
Clusters genes based on the distribution of distances of extracellular transcripts from their source cell.
For each gene in sdata['source_score'].obs, the function computes a normalized histogram of distances (with n_bins bins) over the distance range. The resulting histogram vectors are then standardized and clustered with KMeans.
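The steps described above can be illustrated with a minimal sketch. This is not troutpy's implementation: the helper name cluster_distance_histograms, the synthetic obs table, the choice of bin edges shared across all genes, and the use of scikit-learn's StandardScaler for the standardization step are all assumptions made for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler


def cluster_distance_histograms(obs, gene_key="gene", distance_key="distance",
                                n_clusters=3, n_bins=20, random_state=0):
    """Hypothetical helper: cluster genes by their distance histograms."""
    # Assumption: one set of bin edges spanning the global distance range,
    # so that histograms of different genes are directly comparable.
    edges = np.linspace(obs[distance_key].min(), obs[distance_key].max(), n_bins + 1)

    genes, rows = [], []
    for gene, dist in obs.groupby(gene_key)[distance_key]:
        counts, _ = np.histogram(dist.to_numpy(), bins=edges)
        rows.append(counts / counts.sum())  # normalize so each histogram sums to 1
        genes.append(gene)

    # Standardize each bin (feature) across genes, then run KMeans.
    X = StandardScaler().fit_transform(np.vstack(rows))
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=random_state)
    return pd.Series(km.fit_predict(X), index=genes, name="cluster")


# Synthetic stand-in for sdata['source_score'].obs: three genes whose
# transcripts stay close to the source cell and three that travel far.
rng = np.random.default_rng(0)
names = ["near_0", "near_1", "near_2", "far_0", "far_1", "far_2"]
near = rng.exponential(2.0, size=(3, 200))
far = rng.normal(30.0, 3.0, size=(3, 200))
obs = pd.DataFrame({
    "gene": np.repeat(names, 200),
    "distance": np.concatenate([near.ravel(), far.ravel()]),
})

labels = cluster_distance_histograms(obs, n_clusters=2, n_bins=20)
print(labels)
```

On this toy data the two groups of genes have clearly different distance profiles, so they should fall into separate clusters. The cluster numbers themselves are arbitrary and carry no ordering.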
- Parameters:
sdata (spatialdata.SpatialData) – Spatial data object containing a ‘source_score’ layer with an obs DataFrame.
gene_key (str) – Column name that contains the gene names.
distance_key (str) – Column name that contains the distance from the source cell.
n_clusters (int) – Number of clusters to form.
n_bins (int) – Number of bins for the histogram representation.
- Returns: