gridgene.binsom module

Module for obtaining the tumor/stroma mask by binning the image and applying SOM clustering.

class gridgene.binsom.GetBins(bin_size, unique_targets, logger=None)

Bases: object

Bin spatial transcriptomics data into grid cells and create AnnData objects.

get_bin_df(df, df_name)

Convert a DataFrame of cells with spatial coordinates and target labels into a binned AnnData object.

Parameters:
  • df (pd.DataFrame) – DataFrame with columns [‘X’, ‘Y’, ‘target’] representing cell positions and target labels.

  • df_name (str) – Identifier for the dataset.

Returns:

AnnData object with spatial bins and counts per target.

Return type:

ad.AnnData
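The binning step described above can be illustrated without gridgene. The following is a minimal NumPy sketch, not the library's implementation: the helper name `bin_counts`, the use of floor division to assign points to square bins, and the example coordinates are all assumptions made for illustration.

```python
import numpy as np

def bin_counts(x, y, targets, bin_size):
    """Count targets per square bin (hypothetical helper, not part of gridgene)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Integer grid index of each point along both axes.
    gx = np.floor(x / bin_size).astype(int)
    gy = np.floor(y / bin_size).astype(int)
    # Unique occupied bins become rows, unique target labels become columns.
    bins, row = np.unique(np.column_stack([gx, gy]), axis=0, return_inverse=True)
    names, col = np.unique(np.asarray(targets), return_inverse=True)
    counts = np.zeros((len(bins), len(names)), dtype=int)
    # np.add.at accumulates correctly when the same (row, col) pair repeats.
    np.add.at(counts, (row.ravel(), col.ravel()), 1)
    return bins, names, counts

# Illustrative data: five points, two target labels, 10-unit bins.
bins, names, counts = bin_counts(
    x=[1, 4, 12, 13, 25],
    y=[2, 3, 1, 8, 25],
    targets=["A", "B", "A", "A", "B"],
    bin_size=10,
)
print(bins.tolist())    # grid positions of the occupied bins
print(names.tolist())   # column order of the count matrix
print(counts.tolist())  # one row per bin, one column per target
```

In an AnnData layout, `counts` would correspond to the data matrix, `bins` to per-bin observations, and `names` to the variables.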

get_bin_cohort(df_list, df_name_list, cohort_name)

Process multiple datasets into binned AnnData objects and concatenate them into a cohort.

Parameters:
  • df_list (List[pd.DataFrame]) – List of DataFrames to process.

  • df_name_list (List[str]) – List of dataset names corresponding to each DataFrame.

  • cohort_name (str) – Name of the cohort to assign to all data.

Return type:

None

preprocess_bin(min_counts=10, adata=None)

Filter and normalize the binned AnnData.

Parameters:
  • min_counts (int, optional) – Minimum total counts per bin to retain it, by default 10

  • adata (Optional[ad.AnnData], optional) – AnnData object to preprocess; if None, the internally stored object is used, by default None

Return type:

None
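The filter-then-normalize idea can be sketched on a plain count matrix. The documentation does not state which normalization preprocess_bin applies, so the total-count scaling followed by log1p shown here is an assumption, as are the helper name `filter_and_normalize`, the `target_sum` parameter, and the inclusive threshold.

```python
import numpy as np

def filter_and_normalize(counts, min_counts=10, target_sum=1e4):
    """Drop sparse bins, then scale and log-transform (hypothetical helper)."""
    counts = np.asarray(counts, dtype=float)
    totals = counts.sum(axis=1)
    # Keep bins that reach the threshold; empty bins are always dropped
    # so the division below is safe.
    keep = (totals >= min_counts) & (totals > 0)
    kept = counts[keep]
    # Assumed normalization: scale every bin to the same total, then log1p.
    normed = np.log1p(kept / totals[keep, None] * target_sum)
    return normed, keep

# Illustrative matrix: three bins (rows), two targets (columns).
raw = [[8, 4], [1, 2], [0, 20]]
normed, keep = filter_and_normalize(raw, min_counts=10, target_sum=100)
print(keep.tolist())  # which bins survived the filter
print(normed.shape)   # filtered matrix shape
```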

class gridgene.binsom.GetContour(adata, logger=None)

Bases: object

Perform SOM clustering on spatial bins and evaluate clusters.

run_som(som_shape=(2, 1), n_iter=5000, sigma=0.5, learning_rate=0.5, random_state=42)

Apply SOM clustering on the AnnData object.

Parameters:
  • som_shape (Tuple[int, int], optional) – Shape of the SOM grid (rows, columns), by default (2, 1)

  • n_iter (int, optional) – Number of iterations for SOM training, by default 5000

  • sigma (float, optional) – Width of the Gaussian neighborhood function, by default 0.5

  • learning_rate (float, optional) – Learning rate for SOM training, by default 0.5

  • random_state (int, optional) – Random seed for reproducibility, by default 42

Return type:

None
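To show how the parameters above interact, here is a minimal self-organizing map written in plain NumPy. It is a sketch under stated assumptions and not necessarily what run_som does internally: the linear decay of the learning rate and neighborhood width, the initialization from random samples, and the helper names `train_som` and `assign_clusters` are all illustrative choices.

```python
import numpy as np

def train_som(data, som_shape=(2, 1), n_iter=5000, sigma=0.5,
              learning_rate=0.5, random_state=42):
    """Train a small SOM with online updates (hypothetical helper)."""
    rng = np.random.default_rng(random_state)
    data = np.asarray(data, dtype=float)
    rows, cols = som_shape
    # Grid coordinates of each node, used by the neighborhood function.
    grid = np.array([(r, c) for r in range(rows) for c in range(cols)], dtype=float)
    # Assumed initialization: start each node at a randomly chosen sample.
    weights = data[rng.choice(len(data), size=rows * cols, replace=False)].copy()
    for t in range(n_iter):
        sample = data[rng.integers(len(data))]
        # Best matching unit: the node whose weights are closest to the sample.
        bmu = np.argmin(((weights - sample) ** 2).sum(axis=1))
        # Assumed schedule: linear decay of learning rate and neighborhood width.
        decay = 1.0 - t / n_iter
        lr = learning_rate * decay
        sig = max(sigma * decay, 1e-3)
        # Gaussian neighborhood centered on the BMU, measured on the grid.
        d2 = ((grid - grid[bmu]) ** 2).sum(axis=1)
        h = np.exp(-d2 / (2 * sig ** 2))
        weights += lr * h[:, None] * (sample - weights)
    return weights

def assign_clusters(data, weights):
    """Label each row with the index of its nearest SOM node."""
    data = np.asarray(data, dtype=float)
    d = ((data[:, None, :] - weights[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

# Illustrative data: two well-separated blobs, clustered with a 2x1 map.
rng = np.random.default_rng(0)
blobs = np.vstack([rng.normal(0, 0.5, (50, 2)), rng.normal(10, 0.5, (50, 2))])
weights = train_som(blobs, som_shape=(2, 1), n_iter=1000)
labels = assign_clusters(blobs, weights)
```

With a (2, 1) grid each bin receives one of two labels, which matches the two-class tumor/stroma use case named in the module description.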

eval_som_statistical(top_n=20)

Compute and log top ranked features per SOM cluster.

Parameters:

top_n (int, optional) – Number of top features to retrieve for each cluster, by default 20

Return type:

None
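The documentation does not say which statistic is used to rank features. As one possible illustration, the sketch below scores each feature by the difference between its mean inside a cluster and its mean in all other bins, then keeps the top_n. The helper name, the scoring rule, and the feature names are assumptions, and the sketch expects at least two clusters.

```python
import numpy as np

def top_features_per_cluster(X, labels, feature_names, top_n=20):
    """Rank features per cluster by mean difference (hypothetical helper)."""
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    ranked = {}
    for cluster in np.unique(labels):
        inside = labels == cluster
        # Assumed score: mean within the cluster minus mean over the rest.
        score = X[inside].mean(axis=0) - X[~inside].mean(axis=0)
        order = np.argsort(score)[::-1][:top_n]
        ranked[int(cluster)] = [feature_names[i] for i in order]
    return ranked

# Illustrative matrix: four bins, three features with made-up names.
X = [[5, 0, 1], [6, 1, 1], [0, 4, 1], [1, 5, 1]]
ranked = top_features_per_cluster(
    X, labels=[0, 0, 1, 1], feature_names=["EPCAM", "COL1A1", "ACTB"], top_n=2
)
print(ranked)
```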

create_cluster_image(adata, grid_size)

Reconstruct an image from cluster annotations in the AnnData object.

Parameters:
  • adata (ad.AnnData) – AnnData object containing clustering results and grid positions.

  • grid_size (int) – Size of each grid cell in pixels.

Returns:

2D array with cluster IDs as pixel values.

Return type:

np.ndarray
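The reconstruction can be pictured as painting one square block of pixels per bin. The sketch below works on plain arrays rather than an AnnData object. It assumes that bins carry integer grid positions, that rows follow the Y axis, and that unoccupied pixels are filled with -1; none of these details are confirmed by the documentation, and `cluster_image` is a hypothetical helper.

```python
import numpy as np

def cluster_image(grid_x, grid_y, clusters, grid_size, background=-1):
    """Paint each bin's cluster ID into a grid_size x grid_size pixel block."""
    gx = np.asarray(grid_x, dtype=int)
    gy = np.asarray(grid_y, dtype=int)
    clusters = np.asarray(clusters, dtype=int)
    height = (gy.max() + 1) * grid_size
    width = (gx.max() + 1) * grid_size
    # Pixels not covered by any bin keep the background value.
    image = np.full((height, width), background, dtype=int)
    for x, y, c in zip(gx, gy, clusters):
        image[y * grid_size:(y + 1) * grid_size,
              x * grid_size:(x + 1) * grid_size] = c
    return image

# Illustrative input: three occupied bins on a 2x2 grid, 2-pixel cells.
image = cluster_image(grid_x=[0, 1, 1], grid_y=[0, 0, 1],
                      clusters=[0, 1, 1], grid_size=2)
print(image)
```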

plot_som(som_image, cmap=None, path=None, show=False, figsize=(10, 10), ax=None, legend_labels=None)

Visualize the SOM cluster map.

Parameters:
  • som_image (np.ndarray) – 2D array representing the SOM clusters.

  • cmap (Optional[Any], optional) – Colormap to use for visualization, by default None (uses ‘tab10’)

  • path (Optional[str], optional) – Optional path to save the plot image, by default None

  • show (bool, optional) – Whether to display the plot, by default False

  • figsize (Tuple[int, int], optional) – Size of the figure, by default (10, 10)

  • ax (Optional[plt.Axes], optional) – Matplotlib Axes to plot on, by default None (creates new figure)

  • legend_labels (Optional[Dict[int, str]], optional) – Dictionary mapping cluster indices to labels for legend, by default None

Returns:

The matplotlib Axes object containing the plot.

Return type:

plt.Axes
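A cluster map of this kind can be drawn with matplotlib alone. The sketch below is an independent illustration, not the body of plot_som: the helper name `show_cluster_map`, the fixed `vmin`/`vmax` that align cluster IDs with 'tab10' colors, and the legend built from patches are assumptions.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so the sketch runs headless
import matplotlib.pyplot as plt
import numpy as np
from matplotlib.patches import Patch

def show_cluster_map(som_image, legend_labels=None, figsize=(10, 10), ax=None):
    """Draw a 2D array of cluster IDs with an optional legend (hypothetical helper)."""
    if ax is None:
        _, ax = plt.subplots(figsize=figsize)
    cmap = plt.get_cmap("tab10")
    # vmin/vmax pin cluster ID k to the k-th 'tab10' color (IDs 0 to 9 assumed).
    ax.imshow(som_image, cmap=cmap, vmin=0, vmax=9, interpolation="nearest")
    ax.set_axis_off()
    if legend_labels:
        handles = [Patch(color=cmap(idx), label=name)
                   for idx, name in legend_labels.items()]
        ax.legend(handles=handles, loc="upper right")
    return ax

# Illustrative input: a tiny two-cluster map with assumed label names.
som_image = np.array([[0, 0, 1], [0, 1, 1]])
ax = show_cluster_map(som_image, legend_labels={0: "tumor", 1: "stroma"},
                      figsize=(4, 4))
```

Returning the Axes, as plot_som is documented to do, lets the caller add titles or save the figure afterwards, for example with `ax.figure.savefig(...)`.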