Tutorial 3: Supervised Model for Identifying Metastasis-Related Niches Using Primary Colorectal Cancer and Liver Metastasis Slices

Building on Tutorial 2, this tutorial presents our re-analysis aimed at identifying metastasis-related niches. The trained model presented in our paper is available at: https://github.com/cmzuo11/stClinic/blob/main/Datasets/CRCLM/integrated_adata_CRCLM24_niche_model.h5ad.

Preparation

import os
import scanpy as sc
import random
import torch
import numpy as np
import pandas as pd
import warnings
import re
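from pathlib import Path
import matplotlib.pyplot as plt        # used in the niche importance plot
import matplotlib.colors as mcolors    # used to build the bar color gradient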

import stClinic as stClinic

warnings.filterwarnings("ignore")

Set parameters

used_device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
parser      = stClinic.parameter_setting()
args        = parser.parse_args()

args.out_dir  = args.input_dir + 'stClinic/'
Path(args.out_dir).mkdir(parents=True, exist_ok=True)
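
String concatenation as in `args.input_dir + 'stClinic/'` only yields the intended path when `input_dir` ends with a path separator. A minimal sketch of a separator-safe alternative using `pathlib`; the directory names here are placeholders, not stClinic defaults:

```python
import tempfile
from pathlib import Path

# Placeholder input directory without a trailing slash
input_dir = Path(tempfile.mkdtemp()) / 'CRCLM'

# '/' joins components correctly with or without a trailing separator
out_dir = input_dir / 'stClinic'
out_dir.mkdir(parents=True, exist_ok=True)

print(out_dir.is_dir())
```

The same joining style would apply when building the path to the `.h5ad` file, if the paths are kept as `Path` objects.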

Set seed

seed = 666
np.random.seed(seed)
random.seed(seed)
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
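
The seeding calls above can be wrapped in a single helper. The following is a minimal sketch and not part of stClinic: `set_seed` is a hypothetical name, and the NumPy and PyTorch branches are guarded so the snippet also runs where those packages are absent. The cuDNN flags are an optional addition that the tutorial itself does not set.

```python
import random


def set_seed(seed: int = 666) -> None:
    """Seed every available random number generator (hypothetical helper)."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy not installed; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)
        torch.cuda.manual_seed_all(seed)  # covers every visible GPU
        # Optional: trade some speed for more deterministic cuDNN kernels.
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False
    except ImportError:
        pass  # PyTorch not installed; nothing to seed


# Re-seeding reproduces the same sequence of draws.
set_seed(666)
first = [random.random() for _ in range(3)]
set_seed(666)
second = [random.random() for _ in range(3)]
print(first == second)
```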

Load data

adata = sc.read_h5ad(args.input_dir + 'integrated_adata_CRCLM24.h5ad')
adata.obs['louvain'] = adata.obs['louvain'].astype('int')
adata.obs['louvain'] = adata.obs['louvain'].astype('category')

Data preparation

def extract_number(s):
    return int(re.findall(r'\d+', s)[0])
sorted_batch = sorted(np.unique(adata.obs['batch_name']), key=extract_number)
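
The numeric sort key matters because a plain lexicographic sort would place a name such as `slice10` before `slice2`. A small sketch with made-up batch names (the actual values in `adata.obs['batch_name']` may look different):

```python
import re


def extract_number(s):
    # First run of digits in the name, converted to an integer
    return int(re.findall(r'\d+', s)[0])


# Hypothetical batch names, for illustration only
names = ['slice10', 'slice2', 'slice1', 'slice24']

lexicographic = sorted(names)
numeric = sorted(names, key=extract_number)

print(lexicographic)  # ['slice1', 'slice10', 'slice2', 'slice24']
print(numeric)        # ['slice1', 'slice2', 'slice10', 'slice24']
```

Note that `extract_number` raises an `IndexError` for a name that contains no digits, so this key assumes every batch name carries a number.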

Six statistical measures per cluster

adata        = stClinic.stClinic_Statistics_Measures(adata, sorted_batch)

Clinical information (binary encoding: Metastasis = 1, otherwise 0)

All_type = []
for bid in sorted_batch:
    batch_obs = adata.obs[ adata.obs['batch_name'] == bid ]
    All_type.append( np.unique( batch_obs['type'] )[0] )
All_type = np.array(All_type)
type_idx = np.zeros([len(All_type)], dtype=int)
type_idx[All_type == 'Metastasis'] = 1
adata.uns['grading'] = type_idx
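
The loop above takes the first unique `type` value per slice, which assumes that every spot in a slice carries the same label. A minimal pure-Python sketch of the same encoding with that assumption checked explicitly; `encode_slice_labels`, the label `'Primary'`, and the toy records are hypothetical and only stand in for `adata.obs`:

```python
def encode_slice_labels(records, sorted_batch, positive='Metastasis'):
    """Return one 0/1 label per slice, in the order given by sorted_batch.

    records is an iterable of (batch_name, type) pairs, one per spot.
    """
    types_per_batch = {}
    for batch, tissue_type in records:
        types_per_batch.setdefault(batch, set()).add(tissue_type)

    labels = []
    for batch in sorted_batch:
        types = types_per_batch[batch]
        if len(types) != 1:
            raise ValueError(f'{batch} has mixed types: {sorted(types)}')
        labels.append(1 if next(iter(types)) == positive else 0)
    return labels


# Toy spot-level annotations (hypothetical values)
records = [('slice1', 'Primary'), ('slice1', 'Primary'),
           ('slice2', 'Metastasis'), ('slice3', 'Primary')]
labels = encode_slice_labels(records, ['slice1', 'slice2', 'slice3'])
print(labels)  # [0, 1, 0]
```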

Run stClinic for supervised prediction

adata = stClinic.train_Prediction_Model(adata, pred_type='grading', lr=args.lr_prediction, device=used_device)

adata.obsm['stClinic'] stores the stClinic latent features and adata.obsm['X_umap'] stores the UMAP embedding; adata.obs['louvain'] holds the spatial cluster labels; and adata.uns['Cluster_importance'] stores the weight of each cluster.

Visualization of niche importance

cluster_contrib = pd.DataFrame({'cluster': np.arange(1, 14, 1).astype('str'), 'contrib': adata.uns['Cluster_importance']})
cluster_contrib = cluster_contrib.sort_values(by='contrib', ascending=False)

plt.figure(figsize=(10, 6), dpi=100)
ax = plt.subplot(111)
ax.spines['right'].set_visible(False)
ax.spines['top'].set_visible(False)
ax.yaxis.set_ticks_position('left')
cmap     = mcolors.LinearSegmentedColormap.from_list('my_colormap', ['#FE0000', '#1E0551'], len(cluster_contrib))
rgb_list = [mcolors.to_rgb(cmap(i)) for i in range(len(cluster_contrib))]
plt.bar(x='cluster', height='contrib', data=cluster_contrib, color=rgb_list)
plt.tight_layout()
plt.show()
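
The plotting code hard-codes 13 cluster labels via `np.arange(1, 14, 1)`, so building the DataFrame raises an error if the number of clusters differs. A minimal sketch, assuming clusters are labelled with consecutive integers starting at 1 as in the tutorial, that derives the labels from the importance vector itself; `rank_clusters` and the weights are hypothetical:

```python
def rank_clusters(importance, first_label=1):
    """Pair each weight with its cluster label and sort by weight, descending."""
    labels = [str(first_label + i) for i in range(len(importance))]
    return sorted(zip(labels, importance), key=lambda pair: pair[1], reverse=True)


# Made-up weights for four clusters
weights = [0.10, 0.45, 0.05, 0.40]
ranking = rank_clusters(weights)
for label, weight in ranking:
    print(label, weight)
```

With real data, `importance` would be `adata.uns['Cluster_importance']`, and the resulting order corresponds to the bar order in the plot.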
