Influence of the kernel size parameter

This notebook shows how the kernel size influences mask generation.

The stroma mask remains constant, while the cancer mask is computed using kernels of varying size for the convolutional sum.

Larger kernel sizes produce smoother and more generalized contours, while smaller kernels result in more fine-grained and detailed masks, potentially capturing localized structures.

These parameters can also be adjusted to define regions with varying levels of granularity, ranging from highly detailed, cell-level distinctions to broader, smoother masks.
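To make the smoothing effect concrete, here is a minimal sketch of a convolutional sum with a square kernel on a toy count array. It uses only NumPy. `box_sum` and the toy data are illustrative assumptions and are not part of gridgene, whose `get_conv_sum` may be implemented differently.

```python
import numpy as np


def box_sum(counts, kernel_size):
    """Sum of `counts` in a square window of side `kernel_size` around each pixel.

    Hypothetical helper (zero padding at the borders), not the gridgene API.
    """
    before = kernel_size // 2
    after = kernel_size - 1 - before
    padded = np.pad(counts, ((before, after), (before, after)))
    # Integral image with a leading row and column of zeros.
    integral = np.pad(padded, ((1, 0), (1, 0))).cumsum(axis=0).cumsum(axis=1)
    k = kernel_size
    return (integral[k:, k:] - integral[:-k, k:]
            - integral[k:, :-k] + integral[:-k, :-k])


# Toy data: two small transcript clusters separated by a 4-pixel gap.
counts = np.zeros((20, 20), dtype=np.int64)
counts[5:8, 3:8] = 1
counts[5:8, 12:17] = 1

density_threshold = 3  # same threshold for both kernels
mask_small = box_sum(counts, 3) >= density_threshold
mask_large = box_sum(counts, 7) >= density_threshold

print("pixels in mask, kernel 3:", int(mask_small.sum()))
print("pixels in mask, kernel 7:", int(mask_large.sum()))
print("gap pixel (6, 10) covered:", bool(mask_small[6, 10]), bool(mask_large[6, 10]))
```

With the same threshold, the larger kernel bridges the gap and yields one broad region, whereas the smaller kernel keeps the two clusters apart.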

Useful imports

%load_ext autoreload

import os
import sys
import time
import logging
import re
import copy

import cv2
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import tifffile as tiff
from natsort import natsorted
from PIL import Image
from tqdm import tqdm

sys.path.append(os.path.dirname(os.getcwd()))
from gridgene import get_arrays as ga
from gridgene import contours 
from gridgene import get_masks

Kernel on CosMx data

As a reminder, CosMx data has:

  • Resolution: 1 px = 125 nm = 0.125 µm, so 1 µm = 8 px

  • Number of transcripts: 999, plus system and negative controls
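Because kernel sizes in this notebook are given in pixels, it can help to translate them into physical units. The sketch below applies the 8 px per µm conversion stated above to the CosMx kernel sizes used later (60, 80 and 100 px). `PX_PER_UM` and `px_to_um` are illustrative names, not part of gridgene.

```python
PX_PER_UM = 8  # 1 px = 0.125 um, as stated above


def px_to_um(pixels):
    """Convert a length in CosMx pixels to micrometres (hypothetical helper)."""
    return pixels / PX_PER_UM


for kernel_size_px in (60, 80, 100):
    print(f"{kernel_size_px} px = {px_to_um(kernel_size_px)} um")
```

Under this conversion, an 80 px kernel spans 10 µm on a side.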

# transcripts for CosMx Cancer
target_tum =  ['EPCAM',  'KRT19', 'KRT8', 'KRT18','KRT17','CEACAM6','SPINK1', 'CD24', 'S100A6','RPL37','S100P',]  

Use just one file for display purposes

cosmx_path_s0 =  '../../cosmx_data/S0/S0/20230628_151317_S4/AnalysisResults/iz38iruwno'

folder_names_s0 = [folder_name for folder_name in os.listdir(cosmx_path_s0) if
                os.path.isdir(os.path.join(cosmx_path_s0, folder_name))]

target_files_s0 = [
    os.path.join(cosmx_path_s0, folder, file)
    for folder in os.listdir(cosmx_path_s0)
    if os.path.isdir(os.path.join(cosmx_path_s0, folder))
    for file in os.listdir(os.path.join(cosmx_path_s0, folder))
    if '__target_call_coord.csv' in file
]


files_names = natsorted(target_files_s0)
file_csv = files_names[0]   # 5

df_total = pd.read_csv(file_csv)

df_total['target'].value_counts()
df_total['X'] = (round(df_total['x'])).astype(int)
df_total['Y'] = (round(df_total['y'])).astype(int)
n_genes = len(df_total['target'].unique())
height = max(df_total['X'] + 1)
width = max(df_total['Y'] + 1)

# this makes the sparse df to an array with the spatial information 
target_dict_total = {target: index for index, target in enumerate(df_total['target'].unique())}
array_total = ga.transform_df_to_array(df = df_total, target_dict=target_dict_total, array_shape = (height, width,len(target_dict_total))).astype(np.int8)

# creating subsets 
df_subset_tum, array_subset_tum, target_indices_subset_tum = ga.get_subset_arrays(df_total, array_total,target_dict_total,
                                                                     target_list=target_tum, target_col = 'target')
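For intuition, the step that turns the sparse transcript table into a dense array can be sketched as follows. This is only an assumption about what `ga.transform_df_to_array` produces (one count channel per target, indexed by X, Y and target index), not the library's implementation. `points_to_array` and the toy inputs are hypothetical.

```python
import numpy as np


def points_to_array(xs, ys, targets, target_dict, array_shape):
    """Count transcripts per (x, y, target) cell. Hypothetical helper."""
    arr = np.zeros(array_shape, dtype=np.int32)
    channel = np.array([target_dict[t] for t in targets])
    # np.add.at accumulates correctly when the same cell appears several times.
    np.add.at(arr, (np.asarray(xs), np.asarray(ys), channel), 1)
    return arr


toy_target_dict = {"EPCAM": 0, "KRT8": 1}
toy_array = points_to_array(
    xs=[0, 0, 2],
    ys=[1, 1, 3],
    targets=["EPCAM", "EPCAM", "KRT8"],
    target_dict=toy_target_dict,
    array_shape=(3, 4, len(toy_target_dict)),
)
print(toy_array[0, 1, 0], toy_array[2, 3, 1], toy_array.sum())
```

Selecting a subset of targets then amounts to keeping only the channels listed in `target_tum`.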

Define the parameters for the 3 kernels

min_area_th_tum =  1000 
min_area_th_empty = 2000 

parameters = {
    'i': {'density_th_tum':40, 'kernel_size_tum':80, 'density_th_empty':140, 'kernel_size_empty':80},
    'ii':{'density_th_tum':50, 'kernel_size_tum':100, 'density_th_empty':50, 'kernel_size_empty':100},
    'iii':{'density_th_tum':30, 'kernel_size_tum':60, 'density_th_empty':30, 'kernel_size_empty':60},}
for name, i in parameters.items():
    density_th_tum = i['density_th_tum']
    kernel_size_tum = i['kernel_size_tum']
    density_th_empty = i['density_th_empty']
    kernel_size_empty = i['kernel_size_empty']
    
    # obtain contours 
    CTum = contours.ConvolutionContours(array_subset_tum, contour_name='tum')
    CTum.get_conv_sum(kernel_size=kernel_size_tum, kernel_shape='square')
    CTum.contours_from_sum(density_threshold = density_th_tum,
                           min_area_threshold = min_area_th_tum , directionality = 'higher')
    
    CEmpty = contours.ConvolutionContours(array_total, contour_name='empty')
    CEmpty.get_conv_sum(kernel_size=kernel_size_empty, kernel_shape='square')
    CEmpty.contours_from_sum(density_threshold = density_th_empty,
                           min_area_threshold = min_area_th_empty, directionality = 'lower') # attention that directionality is lower here 
        
    # Cancer contours
    fig1, ax1 = plt.subplots(figsize=(7, 5))
    CTum.plot_conv_sum(cmap='plasma', c_countour='white', ax=ax1)
    ax1.set_title('Cancer contours')
    fig1.savefig(f'results/kernel/cancer_contours{name}.png', dpi=300, bbox_inches='tight')

    # Empty points and contours
    fig2, ax2 = plt.subplots(figsize=(7, 5))
    CEmpty.plot_conv_sum(cmap='plasma', c_countour='white', ax=ax2)
    ax2.set_title('Total points and contours for empty')
    fig2.savefig(f'results/kernel/empty_contours{name}.png', dpi=300, bbox_inches='tight')

    
    #### obtain masks
    GM = get_masks.GetMasks(image_shape = (height, width))
    
    mask_empty = GM.create_mask(CEmpty.contours)
    mask_tum = GM.create_mask(CTum.contours)
    mask_stroma = GM.subtract_masks(np.ones((height, width), dtype=np.uint8), mask_tum, mask_empty)          
    mask_stroma = GM.filter_binary_mask_by_area(mask_stroma, min_area=700)
    
    GM.plot_masks(masks=[mask_stroma, mask_tum], mask_names=['Stroma', 'Cancer'],
                  background_color=(1, 1, 1), mask_colors={'Stroma': (65, 105, 225), 'Cancer': (255, 165, 0)},
                  path=f"results/kernel/", show=True, ax=None, figsize=(6, 6))
2025-06-12 00:43:51,585 - gridgen.contours.tum - INFO - Initialized GetContour
get_conv_sum took 0.5010 seconds
2025-06-12 00:43:52,408 - gridgen.contours.tum - INFO - Number of contours after filtering no counts: 38
2025-06-12 00:43:52,409 - gridgen.contours.empty - INFO - Initialized GetContour
contours_from_sum took 0.3213 seconds
get_conv_sum took 8.8959 seconds
2025-06-12 00:44:09,846 - gridgen.contours.empty - INFO - Number of contours after filtering no counts: 35
contours_from_sum took 8.5411 seconds
2025-06-12 00:44:12,853 - gridgen.get_masks.GetMasks - INFO - Initialized GetMasks
2025-06-12 00:44:12,866 - gridgen.get_masks.GetMasks - INFO - Subtracted masks from base mask.
2025-06-12 00:44:16,425 - gridgen.get_masks.GetMasks - INFO - Plot saved at results/kernel/masks_stroma_cancer.png
[Figures for parameter set i: cancer contours, total points and contours for empty, stroma and cancer masks]
2025-06-12 00:44:18,948 - gridgen.contours.tum - INFO - Initialized GetContour
get_conv_sum took 0.5078 seconds
2025-06-12 00:44:19,776 - gridgen.contours.tum - INFO - Number of contours after filtering no counts: 30
2025-06-12 00:44:19,776 - gridgen.contours.empty - INFO - Initialized GetContour
contours_from_sum took 0.3194 seconds
get_conv_sum took 8.7913 seconds
2025-06-12 00:44:37,111 - gridgen.contours.empty - INFO - Number of contours after filtering no counts: 6
contours_from_sum took 8.5432 seconds
2025-06-12 00:44:40,094 - gridgen.get_masks.GetMasks - INFO - Initialized GetMasks
2025-06-12 00:44:40,105 - gridgen.get_masks.GetMasks - INFO - Subtracted masks from base mask.
2025-06-12 00:44:43,596 - gridgen.get_masks.GetMasks - INFO - Plot saved at results/kernel/masks_stroma_cancer.png
[Figures for parameter set ii: cancer contours, total points and contours for empty, stroma and cancer masks]
2025-06-12 00:44:46,109 - gridgen.contours.tum - INFO - Initialized GetContour
get_conv_sum took 0.5043 seconds
2025-06-12 00:44:46,953 - gridgen.contours.tum - INFO - Number of contours after filtering no counts: 79
2025-06-12 00:44:46,955 - gridgen.contours.empty - INFO - Initialized GetContour
contours_from_sum took 0.3402 seconds
get_conv_sum took 8.9319 seconds
2025-06-12 00:45:04,579 - gridgen.contours.empty - INFO - Number of contours after filtering no counts: 13
contours_from_sum took 8.6921 seconds
2025-06-12 00:45:07,623 - gridgen.get_masks.GetMasks - INFO - Initialized GetMasks
2025-06-12 00:45:07,637 - gridgen.get_masks.GetMasks - INFO - Subtracted masks from base mask.
2025-06-12 00:45:11,536 - gridgen.get_masks.GetMasks - INFO - Plot saved at results/kernel/masks_stroma_cancer.png
[Figures for parameter set iii: cancer contours, total points and contours for empty, stroma and cancer masks]
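The loop above thresholds the convolutional sum in two directions: cancer regions are where the tumour-marker sum is high, empty regions are where the total transcript sum is low, and stroma is what remains. A minimal NumPy sketch of that logic follows. `threshold_mask`, the inclusive comparisons, and the toy values are assumptions for illustration, and the real pipeline additionally extracts contours and filters them by minimum area.

```python
import numpy as np


def threshold_mask(conv_sum, density_threshold, directionality):
    """Boolean mask from a convolutional sum. Hypothetical helper."""
    if directionality == "higher":
        return conv_sum >= density_threshold
    if directionality == "lower":
        return conv_sum <= density_threshold
    raise ValueError("directionality must be 'higher' or 'lower'")


# Toy convolutional sums for tumour markers and for all transcripts.
conv_tum = np.array([[0, 10, 60],
                     [0, 45, 70],
                     [5, 20, 30]])
conv_total = np.array([[10, 200, 300],
                       [20, 250, 400],
                       [30, 150, 180]])

mask_tum = threshold_mask(conv_tum, 40, "higher")
mask_empty = threshold_mask(conv_total, 140, "lower")
# Stroma: tissue that is neither cancer nor empty.
mask_stroma = ~mask_tum & ~mask_empty

print(mask_tum.astype(int))
print(mask_empty.astype(int))
print(mask_stroma.astype(int))
```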

Xenium

How does the kernel size affect Xenium data?

target_tum = ['EPCAM', 'SMIM22','CLDN3', 'KRT18','LGALS4', 'KRT8', 'ELF3','TSPAN8', 'STMN1', 'CD47', 'MYC', 'LGALS3'] 
file_csv =  '../../xenium_data/HLA/GD_TMA1_S3/fov_filtered/TMA1_Selection13_filtered.csv'

df_total = pd.read_csv(file_csv)
df_total = df_total[['x_location', 'y_location', 'feature_name']]
df_total = df_total.rename(columns={'feature_name': 'target'})
df_total = df_total[~df_total['target'].str.contains('System|egative')]
df_total['X'] = df_total['x_location'] - min(df_total['x_location'])
df_total['Y'] = df_total['y_location'] - min(df_total['y_location'])

n_genes = len(df_total['target'].unique())
height = int(max(df_total['X'])) + 1
width = int(max(df_total['Y'])) + 1
target_dict_total = {target: index for index, target in enumerate(df_total['target'].unique())}
array_total = ga.transform_df_to_array(df = df_total, target_dict=target_dict_total, array_shape = (height, width,len(target_dict_total))).astype(np.int8)

# creating subsets 
df_subset_tum, array_subset_tum, target_indices_subset_tum = ga.get_subset_arrays(df_total, array_total,target_dict_total,
                                                                     target_list=target_tum, target_col = 'target')

Define parameters

Here, we will maintain the same tissue contour and only change the cancer one.

min_area_th_empty = 400 #400
min_area_th_tum =  700 

parameters = {
    'i': {'density_th_tum':20, 'kernel_size_tum':10, 'density_th_empty':30, 'kernel_size_empty':10},
    'ii':{'density_th_tum':20, 'kernel_size_tum':20, 'density_th_empty':30, 'kernel_size_empty':10},
    'iii':{'density_th_tum':20, 'kernel_size_tum':30, 'density_th_empty':30, 'kernel_size_empty':10},
    'iv':{'density_th_tum':20, 'kernel_size_tum':7, 'density_th_empty':30, 'kernel_size_empty':10},}
for name, i in parameters.items():
    density_th_tum = i['density_th_tum']
    kernel_size_tum = i['kernel_size_tum']
    density_th_empty = i['density_th_empty']
    kernel_size_empty = i['kernel_size_empty']
    
    
    # obtain contours 
    CTum = contours.ConvolutionContours(array_subset_tum, contour_name='tum')
    CTum.get_conv_sum(kernel_size=kernel_size_tum, kernel_shape='square')
    CTum.contours_from_sum(density_threshold = density_th_tum,
                           min_area_threshold = min_area_th_tum , directionality = 'higher')

    CEmpty = contours.ConvolutionContours(array_total, contour_name='empty')
    CEmpty.get_conv_sum(kernel_size=kernel_size_empty, kernel_shape='square')
    CEmpty.contours_from_sum(density_threshold = density_th_empty,
                           min_area_threshold = min_area_th_empty, directionality = 'lower') # attention that directionality is lower here 

    fig, axs = plt.subplots(1, 2, figsize=(15, 10))

    CTum.plot_conv_sum(cmap='plasma', c_countour='white', ax=axs[0])
    axs[0].set_title('Tum points and tum contours')

    CEmpty.plot_conv_sum(cmap='plasma', c_countour='white', ax=axs[1])
    axs[1].set_title('total points and contours for empty')

    plt.show()
        
    # Cancer contours
    fig1, ax1 = plt.subplots(figsize=(7, 5))
    CTum.plot_conv_sum(cmap='plasma', c_countour='white', ax=ax1)
    ax1.set_title('Cancer contours')
    ax1.axis('off')  # hides both x and y axes, ticks, and frame
    fig1.savefig(f'results/kernel/x_cancer_contours{name}.png', dpi=300, bbox_inches='tight')

    # Empty points and contours
    fig2, ax2 = plt.subplots(figsize=(7, 5))
    CEmpty.plot_conv_sum(cmap='plasma', c_countour='white', ax=ax2)
    ax2.set_title('Total points and contours for empty')
    ax2.axis('off')  # hides both x and y axes, ticks, and frame
    fig2.savefig(f'results/kernel/x_empty_contours{name}.png', dpi=300, bbox_inches='tight')


    #### obtain masks
    GM = get_masks.GetMasks(image_shape = (height, width))

    mask_empty = GM.create_mask(CEmpty.contours)
    mask_tum = GM.create_mask(CTum.contours)
    mask_stroma = GM.subtract_masks(np.ones((height, width), dtype=np.uint8), mask_tum, mask_empty)          
    mask_stroma = GM.filter_binary_mask_by_area(mask_stroma, min_area=700)

    GM.plot_masks(masks=[mask_stroma, mask_tum], mask_names=['Stroma', 'Cancer'],
                  background_color=(1, 1, 1), mask_colors={'Stroma': (65, 105, 225), 'Cancer': (255, 165, 0)},
                  path=f'results/kernel/_{name}', show=True, ax=None, figsize=(6, 4))
2025-06-12 00:49:27,951 - gridgen.contours.tum - INFO - Initialized GetContour
2025-06-12 00:49:28,096 - gridgen.contours.tum - INFO - Number of contours after filtering no counts: 54
2025-06-12 00:49:28,097 - gridgen.contours.empty - INFO - Initialized GetContour
get_conv_sum took 0.0819 seconds
contours_from_sum took 0.0629 seconds
get_conv_sum took 0.7006 seconds
2025-06-12 00:49:29,475 - gridgen.contours.empty - INFO - Number of contours after filtering no counts: 29
contours_from_sum took 0.6778 seconds
[Figure for parameter set i: tum points and tum contours, total points and contours for empty]
2025-06-12 00:49:31,190 - gridgen.get_masks.GetMasks - INFO - Initialized GetMasks
2025-06-12 00:49:31,195 - gridgen.get_masks.GetMasks - INFO - Subtracted masks from base mask.
2025-06-12 00:49:32,451 - gridgen.get_masks.GetMasks - INFO - Plot saved at results/kernel/_i/masks_stroma_cancer.png
[Figures for parameter set i: cancer contours, total points and contours for empty, stroma and cancer masks]
2025-06-12 00:49:32,994 - gridgen.contours.tum - INFO - Initialized GetContour
2025-06-12 00:49:33,135 - gridgen.contours.tum - INFO - Number of contours after filtering no counts: 26
2025-06-12 00:49:33,136 - gridgen.contours.empty - INFO - Initialized GetContour
get_conv_sum took 0.0792 seconds
contours_from_sum took 0.0617 seconds
get_conv_sum took 0.6823 seconds
2025-06-12 00:49:34,500 - gridgen.contours.empty - INFO - Number of contours after filtering no counts: 29
contours_from_sum took 0.6814 seconds
[Figure for parameter set ii: tum points and tum contours, total points and contours for empty]
2025-06-12 00:49:36,158 - gridgen.get_masks.GetMasks - INFO - Initialized GetMasks
2025-06-12 00:49:36,162 - gridgen.get_masks.GetMasks - INFO - Subtracted masks from base mask.
2025-06-12 00:49:37,372 - gridgen.get_masks.GetMasks - INFO - Plot saved at results/kernel/_ii/masks_stroma_cancer.png
[Figures for parameter set ii: cancer contours, total points and contours for empty, stroma and cancer masks]
2025-06-12 00:49:37,910 - gridgen.contours.tum - INFO - Initialized GetContour
2025-06-12 00:49:38,062 - gridgen.contours.tum - INFO - Number of contours after filtering no counts: 21
2025-06-12 00:49:38,063 - gridgen.contours.empty - INFO - Initialized GetContour
get_conv_sum took 0.0854 seconds
contours_from_sum took 0.0667 seconds
get_conv_sum took 0.7069 seconds
2025-06-12 00:49:39,431 - gridgen.contours.empty - INFO - Number of contours after filtering no counts: 29
contours_from_sum took 0.6612 seconds
[Figure for parameter set iii: tum points and tum contours, total points and contours for empty]
2025-06-12 00:49:41,215 - gridgen.get_masks.GetMasks - INFO - Initialized GetMasks
2025-06-12 00:49:41,219 - gridgen.get_masks.GetMasks - INFO - Subtracted masks from base mask.
2025-06-12 00:49:42,419 - gridgen.get_masks.GetMasks - INFO - Plot saved at results/kernel/_iii/masks_stroma_cancer.png
[Figures for parameter set iii: cancer contours, total points and contours for empty, stroma and cancer masks]
2025-06-12 00:49:42,963 - gridgen.contours.tum - INFO - Initialized GetContour
2025-06-12 00:49:43,128 - gridgen.contours.tum - INFO - Number of contours after filtering no counts: 57
2025-06-12 00:49:43,129 - gridgen.contours.empty - INFO - Initialized GetContour
get_conv_sum took 0.0824 seconds
contours_from_sum took 0.0830 seconds
get_conv_sum took 0.7013 seconds
2025-06-12 00:49:44,508 - gridgen.contours.empty - INFO - Number of contours after filtering no counts: 29
contours_from_sum took 0.6785 seconds
[Figure for parameter set iv: tum points and tum contours, total points and contours for empty]
2025-06-12 00:49:46,133 - gridgen.get_masks.GetMasks - INFO - Initialized GetMasks
2025-06-12 00:49:46,138 - gridgen.get_masks.GetMasks - INFO - Subtracted masks from base mask.
2025-06-12 00:49:47,416 - gridgen.get_masks.GetMasks - INFO - Plot saved at results/kernel/_iv/masks_stroma_cancer.png
[Figures for parameter set iv: cancer contours, total points and contours for empty, stroma and cancer masks]
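The four Xenium runs keep `density_th_tum` at 20 while `kernel_size_tum` changes. Assuming the convolutional sum simply adds up counts inside a square window (an assumption about `get_conv_sum`, not a confirmed detail), a fixed threshold corresponds to a very different average density per pixel for each kernel, which is one way to read the logged contour counts (54, 26, 21 and 57 for kernel sizes 10, 20, 30 and 7). The sketch below works out that equivalent density.

```python
# Values taken from the Xenium parameter sets above.
kernel_sizes_tum = {"i": 10, "ii": 20, "iii": 30, "iv": 7}
density_th_tum = 20

# Average transcripts per pixel needed inside a k x k window to reach the threshold.
per_pixel_density = {
    name: density_th_tum / kernel_size ** 2
    for name, kernel_size in kernel_sizes_tum.items()
}

for name, density in per_pixel_density.items():
    k = kernel_sizes_tum[name]
    print(f"{name}: kernel {k} px -> {density:.3f} transcripts per pixel")
```

If comparable stringency across kernel sizes is wanted, one option is to scale the threshold with the kernel area rather than holding it fixed.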