troutpy.tl.celltype_contact_matrix


troutpy.tl.celltype_contact_matrix(sdata, cell_type_key='leiden', radius=50.0, min_score=0.1, gene_list=None, gene_key='gene', normalize=False, cell_types=None, tile_size=1000.0, store_in_sdata=False, obsp_key='transcript_contacts', obs_key='contact_cell_ids')

Compute a cell-type x cell-type contact matrix from the spatial proximity of “owned” transcripts.

Each transcript is assigned to an owner cell: either the cell it physically overlaps (overlaps_cell) or, if it is unassigned but has assignment_score >= min_score, its predicted_parent cell from sdata.tables["source_score"]. Any two owned transcripts that lie within radius of each other define a directed cell-cell contact between their owner cells; these pairs are found tile by tile for memory efficiency. The contacts are then aggregated into a cell_type_key x cell_type_key count matrix.
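The ownership rule and the aggregation step can be illustrated with a minimal sketch in plain Python. It is not the troutpy implementation. The toy records flatten predicted_parent and assignment_score into each transcript (the real data keeps them in a separate table), the helper names owner_of and contact_counts are hypothetical, and the choices to ignore same-cell pairs, to count each pair of cells once per direction, and to treat a distance equal to radius as a contact are assumptions, since the text does not specify them.

```python
from collections import Counter
from itertools import combinations
from math import hypot


def owner_of(tx, min_score=0.1):
    """Return the owner cell id of a transcript record, or None if it is unowned."""
    if tx["overlaps_cell"]:
        return tx["cell_id"]
    if tx["assignment_score"] >= min_score:
        return tx["predicted_parent"]
    return None


def contact_counts(transcripts, cell_type, radius=50.0, min_score=0.1):
    """Count directed cell-type contacts from owned transcripts (brute force)."""
    owned = [(t["x"], t["y"], owner_of(t, min_score)) for t in transcripts]
    owned = [o for o in owned if o[2] is not None]
    cell_pairs = set()
    for (x1, y1, a), (x2, y2, b) in combinations(owned, 2):
        # Assumption: same-cell pairs are ignored and each cell pair is
        # recorded once in each direction.
        if a != b and hypot(x1 - x2, y1 - y2) <= radius:
            cell_pairs.add((a, b))
            cell_pairs.add((b, a))
    return Counter((cell_type[a], cell_type[b]) for a, b in cell_pairs)


# Toy data: one overlapping transcript, two predicted ones, one far away.
transcripts = [
    {"x": 0.0, "y": 0.0, "cell_id": "c1", "overlaps_cell": True,
     "predicted_parent": None, "assignment_score": 0.0},
    {"x": 30.0, "y": 0.0, "cell_id": None, "overlaps_cell": False,
     "predicted_parent": "c2", "assignment_score": 0.8},
    {"x": 60.0, "y": 0.0, "cell_id": None, "overlaps_cell": False,
     "predicted_parent": "c3", "assignment_score": 0.05},  # below min_score
    {"x": 500.0, "y": 0.0, "cell_id": "c3", "overlaps_cell": True,
     "predicted_parent": None, "assignment_score": 0.0},
]
cell_type = {"c1": "A", "c2": "B", "c3": "B"}

counts = contact_counts(transcripts, cell_type)
```

In this toy data only the first two transcripts are both owned and within 50 units of each other, so the result holds one A-to-B contact and one B-to-A contact.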

Parameters:
  • sdata (spatialdata.SpatialData) – SpatialData object with a "table" AnnData (cell metadata, including cell_id and cell_type_key in .obs), a "source_score" table with predicted_parent and assignment_score in .obs, and a "transcripts" points layer with x, y, cell_id, overlaps_cell, and gene_key columns.

  • cell_type_key (str, optional) – Column in sdata["table"].obs with cell type labels. Defaults to "leiden".

  • radius (float, optional) – Maximum distance between two owned transcripts for their owner cells to be considered in contact. Defaults to 50.0.

  • min_score (float, optional) – Minimum assignment_score for an unassigned transcript to be attributed to its predicted_parent cell. Defaults to 0.1.

  • gene_list (str or list of str, optional) – If given, restrict the analysis to transcripts of this gene (or genes). Defaults to None (all genes).

  • gene_key (str, optional) – Column in the transcripts table holding gene identity. Defaults to "gene".

  • normalize (bool, optional) – If True, normalize each row of the output matrix to sum to 1. Defaults to False.

  • cell_types (list of str, optional) – If given, restrict owners and contacts to these cell types. Defaults to None (all cell types).

  • tile_size (float, optional) – Side length of the square tiles used to find nearby transcript pairs. Defaults to 1000.0.

  • store_in_sdata (bool, optional) – If True, also store a per-cell sparse adjacency matrix in sdata["table"].obsp[obsp_key] and a per-cell neighbor-id string in sdata["table"].obs[obs_key]. Defaults to False.

  • obsp_key (str, optional) – Key for the sparse adjacency matrix in sdata["table"].obsp when store_in_sdata=True. Defaults to "transcript_contacts".

  • obs_key (str, optional) – Key for the per-cell neighbor-id string in sdata["table"].obs when store_in_sdata=True. Defaults to "contact_cell_ids".
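To show how radius and tile_size can interact, the following sketch buckets points into square tiles and compares each point only with points in its own tile and the eight adjacent ones. This is a generic grid search written for illustration, not the library's code: the function name pairs_within_radius is hypothetical, and both the neighbouring-tile lookup and the requirement that tile_size be at least radius are assumptions, because the text does not describe how pairs that straddle a tile border are handled.

```python
from collections import defaultdict
from math import floor, hypot


def pairs_within_radius(points, radius=50.0, tile_size=1000.0):
    """Return index pairs (i, j) with i < j whose points are at most `radius` apart."""
    if tile_size < radius:
        # With smaller tiles, a 3 x 3 neighbourhood could miss valid pairs.
        raise ValueError("this sketch requires tile_size >= radius")

    # Bucket point indices by the tile that contains them.
    tiles = defaultdict(list)
    for i, (x, y) in enumerate(points):
        tiles[(floor(x / tile_size), floor(y / tile_size))].append(i)

    pairs = set()
    for (tx, ty), members in tiles.items():
        # Candidates: the tile itself plus its eight neighbours.
        candidates = [
            j
            for dx in (-1, 0, 1)
            for dy in (-1, 0, 1)
            for j in tiles.get((tx + dx, ty + dy), ())
        ]
        for i in members:
            xi, yi = points[i]
            for j in candidates:
                xj, yj = points[j]
                if i < j and hypot(xi - xj, yi - yj) <= radius:
                    pairs.add((i, j))
    return pairs


# Points 0 and 1 sit on opposite sides of the border at x = 1000.
points = [(990.0, 10.0), (1010.0, 10.0), (10.0, 10.0), (1990.0, 1990.0)]
close_pairs = pairs_within_radius(points)
```

Larger tiles mean fewer buckets but more candidate comparisons per tile, which is the usual memory-versus-work trade-off behind a parameter such as tile_size.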

Return type:

DataFrame

Returns:

A pandas.DataFrame containing the cell_type_key x cell_type_key matrix of directed contact counts (or row-normalized proportions if normalize=True).
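As a small illustration of what row normalization does to a count matrix, the sketch below works on a plain list of lists rather than a DataFrame. The helper row_normalize is hypothetical, and leaving rows without contacts as all zeros is an assumption; the text does not say how the function treats such rows.

```python
def row_normalize(counts):
    """Scale each row of a count matrix so that it sums to 1."""
    normalized = []
    for row in counts:
        total = sum(row)
        # Assumption: a row with no contacts stays all-zero instead of
        # triggering a division by zero.
        normalized.append([value / total if total else 0.0 for value in row])
    return normalized


# Rows and columns are cell types A and B, in that order.
counts = [[2, 6], [6, 0]]
proportions = row_normalize(counts)
# proportions == [[0.25, 0.75], [1.0, 0.0]]
```

Even when the raw counts are symmetric, the proportions generally are not: here A-to-B is 0.75 while B-to-A is 1.0, because each row is divided by its own total.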