troutpy.tl.cluster_distribution_from_source
- troutpy.tl.cluster_distribution_from_source(sdata, gene_key='gene', distance_key='distance', n_clusters=3, n_bins=20, copy=False)
Cluster genes by the distribution of their transcripts’ distances to source cells.
For each gene in sdata["source_score"].obs, computes a normalized histogram (with n_bins bins) of distance_key values, standardizes these histogram vectors, and clusters them with KMeans. Results are stored in sdata["xrna_metadata"].var["kmeans_distribution"].
- Parameters:
sdata (SpatialData) – SpatialData object containing a "source_score" table with an obs DataFrame.
gene_key (str (default: 'gene')) – Column in sdata["source_score"].obs containing gene identifiers.
distance_key (str (default: 'distance')) – Column in sdata["source_score"].obs containing the distance from the source cell.
n_clusters (int (default: 3)) – Number of KMeans clusters to form.
n_bins (int (default: 20)) – Number of histogram bins used to represent each gene’s distance distribution.
copy (bool (default: False)) – If True, return a modified copy of sdata. Otherwise modify in place.
- Returns:
If copy=True, a modified copy of sdata. Otherwise None, modifying sdata in place.
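
The feature-construction step described above (a normalized histogram per gene, followed by standardization) can be sketched with NumPy alone. This is a minimal illustration under stated assumptions, not troutpy's actual implementation: the helper name histogram_features is hypothetical, and both the shared bin edges across genes and the per-bin standardization across genes are assumptions about details the description does not specify.

```python
import numpy as np


def histogram_features(genes, distances, n_bins=20):
    """Hypothetical helper: one standardized distance histogram per gene."""
    genes = np.asarray(genes)
    distances = np.asarray(distances, dtype=float)

    # Assumption: all genes share the same bin edges, so that bin i
    # covers the same distance range for every gene.
    edges = np.linspace(distances.min(), distances.max(), n_bins + 1)

    names = np.unique(genes)
    hists = np.zeros((len(names), n_bins))
    for i, name in enumerate(names):
        counts, _ = np.histogram(distances[genes == name], bins=edges)
        # Normalize so each gene's histogram sums to 1.
        hists[i] = counts / counts.sum()

    # Assumption: standardize each bin (column) across genes to zero mean
    # and unit variance; constant bins are left at zero.
    mean = hists.mean(axis=0)
    std = hists.std(axis=0)
    std[std == 0] = 1.0
    return names, (hists - mean) / std


# Toy data: genes A and B sit close to the source, gene C is spread out.
rng = np.random.default_rng(0)
genes = ["A"] * 200 + ["B"] * 200 + ["C"] * 200
distances = np.concatenate([
    rng.exponential(2.0, 200),
    rng.exponential(2.0, 200),
    rng.uniform(0.0, 20.0, 200),
])
names, features = histogram_features(genes, distances, n_bins=10)
```

Each row of features is one gene's standardized histogram vector. Passing this matrix to a KMeans implementation, for example scikit-learn's KMeans(n_clusters=3).fit_predict(features), would yield one cluster label per gene, analogous to the labels stored in the "kmeans_distribution" column.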