Tutorial 3: Supervised Model for Identifying Metastasis-Related Niches Using Primary Colorectal Cancer and Liver Metastasis Slices
After completing Tutorial 2, we present a re-analysis that identifies metastasis-related niches. To use the trained models presented in our paper, download them from: https://github.com/cmzuo11/stClinic/blob/main/Datasets/CRCLM/integrated_adata_CRCLM24_niche_model.h5ad.
Preparation
import os
import scanpy as sc
import random
import torch
import numpy as np
import pandas as pd
import warnings
import re
from pathlib import Path
import matplotlib.pyplot as plt
import matplotlib.colors as mcolors
import stClinic as stClinic
warnings.filterwarnings("ignore")
Set parameters
used_device = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu')
parser = stClinic.parameter_setting()
args = parser.parse_args()
args.out_dir = args.input_dir + 'stClinic/'
Path(args.out_dir).mkdir(parents=True, exist_ok=True)
Set seed
seed = 666
np.random.seed(seed)
random.seed(seed)
torch.manual_seed(seed)
torch.cuda.manual_seed(seed)
Load data
adata = sc.read_h5ad(args.input_dir + 'integrated_adata_CRCLM24.h5ad')
adata.obs['louvain'] = adata.obs['louvain'].astype('int')
adata.obs['louvain'] = adata.obs['louvain'].astype('category')
Data preparation
def extract_number(s):
    # Return the first run of digits in the batch name as an integer
    return int(re.findall(r'\d+', s)[0])
sorted_batch = sorted(np.unique(adata.obs['batch_name']), key=extract_number)
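The numeric key matters because a plain lexicographic sort would place a name ending in 10 before one ending in 2. A minimal sketch, assuming hypothetical batch names of the form `sliceN` (the actual values in `adata.obs['batch_name']` may be named differently):

```python
import re

def extract_number(s):
    # Same helper as in the tutorial: first run of digits in the name, as an int.
    return int(re.findall(r'\d+', s)[0])

# Hypothetical batch names, for illustration only.
batch_names = ['slice10', 'slice2', 'slice1', 'slice24']

lexicographic = sorted(batch_names)                 # string order
numeric = sorted(batch_names, key=extract_number)   # order by embedded number

print(lexicographic)
print(numeric)
```

Note that `extract_number` raises an `IndexError` for a name that contains no digits, so this approach assumes every batch name carries a number.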
Six statistical measures per cluster
adata = stClinic.stClinic_Statistics_Measures(adata, sorted_batch)
Clinical information (binary encoding: 1 = Metastasis, 0 = otherwise)
All_type = []
for bid in sorted_batch:
    # Each slice is assumed to carry a single tissue type
    batch_obs = adata.obs[adata.obs['batch_name'] == bid]
    All_type.append(np.unique(batch_obs['type'])[0])
All_type = np.array(All_type)
type_idx = np.zeros([len(All_type)], dtype=int)
type_idx[All_type == 'Metastasis'] = 1
adata.uns['grading'] = type_idx
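The labelling step above can be checked on a toy table. This is a sketch under stated assumptions: the `obs` DataFrame, the slice names, and the `'Primary'` label are hypothetical stand-ins (only `'Metastasis'` appears in the tutorial), and `groupby(...).first()` is used in place of the loop, which gives the same result only when each slice has a single tissue type.

```python
import numpy as np
import pandas as pd

# Toy stand-in for adata.obs: three slices, two spots each (hypothetical values).
obs = pd.DataFrame({
    'batch_name': ['s1', 's1', 's2', 's2', 's3', 's3'],
    'type': ['Primary', 'Primary', 'Metastasis', 'Metastasis', 'Primary', 'Primary'],
})
sorted_batch = ['s1', 's2', 's3']

# Sanity check: every slice should carry exactly one tissue type.
assert (obs.groupby('batch_name')['type'].nunique() == 1).all()

# One label per slice, in the same order as sorted_batch.
slice_type = obs.groupby('batch_name')['type'].first().reindex(sorted_batch)

# 1 for metastasis slices, 0 for everything else.
type_idx = (slice_type.to_numpy() == 'Metastasis').astype(int)
print(type_idx)
```

The result has one entry per slice, not per spot, so its length should equal `len(sorted_batch)`.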
Run stClinic for supervised prediction
adata = stClinic.train_Prediction_Model(adata, pred_type='grading', lr=args.lr_prediction, device=used_device)
After training, adata.obsm['stClinic'] and adata.obsm['X_umap'] store the stClinic latent features and the UMAP embedding, respectively; adata.obs['louvain'] holds the spatial cluster labels; and adata.uns['Cluster_importance'] stores the weight of each cluster.
Visualization of niche importance
cluster_contrib = pd.DataFrame({'cluster': np.arange(1, 14, 1).astype('str'), 'contrib': adata.uns['Cluster_importance']})
cluster_contrib = cluster_contrib.sort_values(by='contrib', ascending=False)
plt.figure(figsize=(10, 6), dpi=100)
ax = plt.subplot(111)
ax.spines['right'].set_visible(False)
ax.spines['top'].set_visible(False)
ax.yaxis.set_ticks_position('left')
cmap = mcolors.LinearSegmentedColormap.from_list('my_colormap', ['#FE0000', '#1E0551'], len(cluster_contrib))
rgb_list = [mcolors.to_rgb(cmap(i)) for i in range(len(cluster_contrib))]
plt.bar(x='cluster', height='contrib', data=cluster_contrib, color=rgb_list)
plt.tight_layout()
plt.show()
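To try the plotting code without a trained model, the same ranking and colouring can be run on made-up weights. A minimal sketch, assuming five hypothetical clusters with invented importance values (not results from the paper); it uses a headless backend and writes to a hypothetical file name instead of calling `plt.show()`:

```python
import matplotlib
matplotlib.use('Agg')  # headless backend so the sketch runs without a display
import matplotlib.colors as mcolors
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Hypothetical importance weights for five clusters (illustrative only).
weights = np.array([0.05, 0.40, 0.10, 0.30, 0.15])
cluster_contrib = pd.DataFrame({
    'cluster': np.arange(1, len(weights) + 1).astype(str),
    'contrib': weights,
}).sort_values(by='contrib', ascending=False)

# Sample one colour per bar from a red-to-dark-purple gradient.
n = len(cluster_contrib)
cmap = mcolors.LinearSegmentedColormap.from_list('my_colormap', ['#FE0000', '#1E0551'], n)
rgb_list = [mcolors.to_rgb(cmap(i)) for i in range(n)]

fig, ax = plt.subplots(figsize=(6, 4))
ax.bar(cluster_contrib['cluster'], cluster_contrib['contrib'], color=rgb_list)
ax.set_xlabel('cluster')
ax.set_ylabel('contrib')
fig.tight_layout()
fig.savefig('cluster_importance_toy.png')  # hypothetical output file name
```

Because the table is sorted in descending order before the colours are assigned, the most important cluster gets the red end of the gradient and the least important the dark end.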