troutpy.pp.get_transcript_categories

troutpy.pp.get_transcript_categories(sdata, layer='transcripts', struct_table_key='structure_table', metadata_key='xrna_metadata')

Classify transcripts into a hierarchy of intracellular/extracellular categories.

Transcripts are split, in order, into the following groups:

  • intracellular transcripts;

  • cell-like transcripts (outside a cell, yet extracellular is False), split by structural connectivity;

  • transcripts in high-density extracellular structures, split by connectivity;

  • noise-spectrum genes (high fdr_noise in metadata_key);

  • the remaining diffuse extracellular transcripts, split into diffusion-compatible and diffusion-incompatible genes based on the Kolmogorov-Smirnov p-value (ks_pval) in metadata_key.
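The ordered split described above can be sketched on toy data with plain pandas and NumPy. This is an illustration only, not troutpy's implementation: the "high_density" label for enrichment_class, the 0.5 cutoff on fdr_noise, the 0.05 cutoff on ks_pval, the direction of the ks_pval test (high p-value taken as diffusion-compatible), and the reading of overlaps_cell as the intracellular flag are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

CATEGORIES = [
    "Intracellular", "Cell-Like Connected", "Cell-Like Unconnected",
    "High-Density Connected", "High-Density Unconnected",
    "Noise Spectrum", "Diffusion Compatible", "Diffusion Incompatible",
]

# Toy stand-in for the points layer (all values are made up).
transcripts = pd.DataFrame({
    "gene": ["A", "A", "B", "B", "C", "D", "E", "F"],
    "overlaps_cell": [True, False, False, False, False, False, False, False],
    "extracellular": [False, False, False, True, True, True, True, True],
    "enrichment_class": ["none", "none", "none", "high_density",
                         "high_density", "none", "none", "none"],
    "structure_id": [-1, 1, 2, 3, 4, -1, -1, -1],
})
# Toy structure table, indexed by "struct_<id>".
structs = pd.DataFrame(
    {"is_physically_connected": [True, False, True, False]},
    index=["struct_1", "struct_2", "struct_3", "struct_4"],
)
# Toy per-gene metadata (the .var of the metadata table).
gene_meta = pd.DataFrame(
    {"fdr_noise": [0.1, 0.1, 0.1, 0.9, 0.1, 0.1],
     "ks_pval": [0.5, 0.5, 0.5, 0.5, 0.6, 0.001]},
    index=["A", "B", "C", "D", "E", "F"],
)

# Look up connectivity per transcript; unknown structures count as unconnected.
keys = "struct_" + transcripts["structure_id"].astype(str)
connected = keys.map(structs["is_physically_connected"].to_dict()).eq(True).to_numpy()

intracellular = transcripts["overlaps_cell"].to_numpy()
cell_like = (~transcripts["overlaps_cell"] & ~transcripts["extracellular"]).to_numpy()
high_density = transcripts["enrichment_class"].eq("high_density").to_numpy()  # assumed label
noise = (transcripts["gene"].map(gene_meta["fdr_noise"]) > 0.5).to_numpy()    # assumed cutoff
compatible = (transcripts["gene"].map(gene_meta["ks_pval"]) > 0.05).to_numpy()  # assumed cutoff

# np.select picks the first condition that holds, which encodes the ordering.
labels = np.select(
    [intracellular, cell_like & connected, cell_like,
     high_density & connected, high_density, noise, compatible],
    CATEGORIES[:-1],
    default=CATEGORIES[-1],
)
counts = pd.Series(labels).value_counts().reindex(CATEGORIES, fill_value=0)
print(counts)
```

Because each condition is only consulted when every earlier one has failed, a transcript that is both in a high-density structure and from a noisy gene is counted under the high-density category, not under "Noise Spectrum".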

Parameters:
  • sdata (spatialdata.SpatialData) – SpatialData object containing the points layer named by layer, the struct_table_key table with an is_physically_connected column indexed by "struct_<id>", and the metadata_key table whose .var holds the fdr_noise and ks_pval columns.

  • layer (str, optional) – Points layer with overlaps_cell, extracellular, enrichment_class, structure_id, and gene columns. Defaults to "transcripts".

  • struct_table_key (str, optional) – Key of the table in sdata describing extracellular structures. Defaults to "structure_table".

  • metadata_key (str, optional) – Key of the table in sdata holding per-gene uRNA metadata. Defaults to "xrna_metadata".

Returns:

pandas.Series – Transcript counts per category: "Intracellular", "Cell-Like Connected", "Cell-Like Unconnected", "High-Density Connected", "High-Density Unconnected", "Noise Spectrum", "Diffusion Compatible", and "Diffusion Incompatible".
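Since the result is a plain pandas.Series of counts, it can be turned into proportions with ordinary pandas arithmetic. The sketch below builds the Series by hand with made-up numbers so that it runs without a SpatialData object; in real use it would come from the call shown in the comment.

```python
import pandas as pd

# In real use:
#   counts = troutpy.pp.get_transcript_categories(sdata)
# Hand-built stand-in with invented counts, for illustration only.
counts = pd.Series({
    "Intracellular": 700,
    "Cell-Like Connected": 50,
    "Cell-Like Unconnected": 30,
    "High-Density Connected": 60,
    "High-Density Unconnected": 20,
    "Noise Spectrum": 40,
    "Diffusion Compatible": 70,
    "Diffusion Incompatible": 30,
})

# Share of all transcripts falling into each category.
fractions = counts / counts.sum()

# Share of transcripts that were not classified as intracellular.
non_intracellular = 1.0 - fractions["Intracellular"]
print(fractions.round(3))
```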