Patterns of protein- and metabolite-based cell-cell communication (MEBOCOST)¶

This tutorial is focused on running Tensor-cell2cell v2 using the tools cell2cell and MEBOCOST to predict protein- and metabolite-mediated cell-cell communication (CCC), respectively.

In this case, we use samples from cortical brain organoids across different time points, previously published by Trujillo et al (2019).

Read this if you plan using a GPU to speed-up the analysis. Before running this notebook, make sure to have a proper NVIDIA GPU driver (https://www.nvidia.com/Download/index.aspx) as well as the CUDA toolkit (https://developer.nvidia.com/cuda-toolkit) installed.

Then, make sure to create an environment with Pytorch >= v1.8.0 following these instructions to enable CUDA.

https://pytorch.org/get-started/locally/

Once you have everything installed, run the next blocks two blocks of code before everything.

If you are using a NVIDIA GPU, with PyTorch and CUDA, set the following variable to be True

use_gpu = True, otherwise set it use_gpu = False

In [1]:

Copied!





use_gpu = True

import tensorly as tl
if use_gpu:
    backend = 'pytorch'
    device = 'cuda'
    tl.set_backend(backend)
else:
    backend = 'numpy'
    device = 'cpu'
use_gpu = True

import tensorly as tl
if use_gpu:
    backend = 'pytorch'
    device = 'cuda'
    tl.set_backend(backend)
else:
    backend = 'numpy'
    device = 'cpu'

Importing packages to use

In [2]:

Copied!





import cell2cell as c2c
import scanpy as sc

import numpy as np
import pandas as pd

from tqdm.auto import tqdm

import matplotlib.pyplot as plt
import seaborn as sns
from mpl_toolkits.axes_grid1 import make_axes_locatable

%matplotlib inline
import cell2cell as c2c
import scanpy as sc

import numpy as np
import pandas as pd

from tqdm.auto import tqdm

import matplotlib.pyplot as plt
import seaborn as sns
from mpl_toolkits.axes_grid1 import make_axes_locatable

%matplotlib inline

In [3]:

Copied!

import warnings
warnings.filterwarnings('ignore')
import warnings
warnings.filterwarnings('ignore')

IMPORTANT: In this notebook, the version 0.8.3 of cell2cell is used. Versions older than 0.8.0 do not implement the Coupled Tensor Component Analysis (CTCA).

1 - Load Data¶

Specify folders where the data is located, and where the outputs will be written:

In [4]:

Copied!





import os

data_folder = '/data/'
directory = os.fsencode(data_folder)

output_folder = '/results/Tensor-cell2cell/'
if not os.path.isdir(output_folder):
    os.mkdir(output_folder)
import os

data_folder = '/data/'
directory = os.fsencode(data_folder)

output_folder = '/results/Tensor-cell2cell/'
if not os.path.isdir(output_folder):
    os.mkdir(output_folder)

Import 4D communication tensors¶

We will import the communication tensors created for both modalities (protein- and metabolite-mediated CCC), as previously done.

In [5]:

Copied!

tensor_lr = c2c.io.load_tensor('/data/Tensor-cell2cell/Initial-c2c-Tensor.pkl', backend=backend, device=device)
tensor_met = c2c.io.load_tensor('/data/Tensor-cell2cell/Initial-MEBOCOST-Tensor.pkl', backend=backend, device=device)
tensor_lr = c2c.io.load_tensor('/data/Tensor-cell2cell/Initial-c2c-Tensor.pkl', backend=backend, device=device)
tensor_met = c2c.io.load_tensor('/data/Tensor-cell2cell/Initial-MEBOCOST-Tensor.pkl', backend=backend, device=device)

In [6]:

Copied!

meta_tf = c2c.io.load_variable_with_pickle('/data/Tensor-cell2cell/Meta-Initial-c2c-Tensor.pkl')
meta_tf2 = c2c.io.load_variable_with_pickle('/data/Tensor-cell2cell/Meta-Initial-MEBOCOST-Tensor.pkl')
meta_tf = c2c.io.load_variable_with_pickle('/data/Tensor-cell2cell/Meta-Initial-c2c-Tensor.pkl')
meta_tf2 = c2c.io.load_variable_with_pickle('/data/Tensor-cell2cell/Meta-Initial-MEBOCOST-Tensor.pkl')

2 - Generate coupled tensors¶

Once we have loaded our 4D tensors, we start creating a CoupledInteractionTensorobject in Tensor-cell2cell v2. This object will later perform the CTCA.

In [7]:

Copied!





coupled_tensor = c2c.tensor.CoupledInteractionTensor(tensor1=tensor_lr,
                                                     tensor2=tensor_met,
                                                     tensor1_name='Protein Tensor',
                                                     tensor2_name='Metabolite Tensor',
                                                     mode_mapping={'shared' : [(0,0), (2,2), (3,3)]}
                                                    )
coupled_tensor = c2c.tensor.CoupledInteractionTensor(tensor1=tensor_lr,
                                                     tensor2=tensor_met,
                                                     tensor1_name='Protein Tensor',
                                                     tensor2_name='Metabolite Tensor',
                                                     mode_mapping={'shared' : [(0,0), (2,2), (3,3)]}
                                                    )

We can observe that this object includes both tensors simultaneously.

In [8]:

Copied!

coupled_tensor.shape
coupled_tensor.shape

Out[8]:

(torch.Size([4, 422, 5, 5]), torch.Size([4, 52, 5, 5]))

3 - Run Analysis¶

Next, we apply a non-negative canonical coupled component analysis (CTCA). Briefly, this tensor decomposition identifies a low-rank tensor. Briefly, tensor decomposition identifies a low-rank tensor for each modality's tensor that better approximates each original tensor. These low-rank tensors can be represented as the sum of a set of rank-1 tensors (in this case, the rank, or number of rank-1 tensors corresponds to number of factors). Each rank-1 tensor represents a factor in the decomposition and can be further represented as the outer product of n vectors, where n represents the number of tensor dimensions. Each vector represents one of the n tensor dimensions for that factor and its values, corresponding to individual elements in each dimension, represent the factor loadings.

In the CTCA, we have vectors that are shared between the coupled tensors (here contexts, sender cells, and receiver cells), while we have other vectors that are private for each modality (protein-based LR pair and metabolite-based LR pair dimensions). The loadings shared between both tensors behave exactly the same in both modalities, representing that we are capturing LR pairs of each modality that present similar trends in each factor. If each tensor was decomposed separately with the standard TCA in Tensor-cell2cell v1, we would not ensure that the key LR pairs for each factor would behave the same, as this could lead to loadings for the context, sender cells and receiver cells with different behaviors between modalities.

tensor-fact

Selecting number of factors¶

For selecting the number of factors to decompose CCC, we could do it either manually or automatically from an Elbow analysis. Typically this comes from running the CTCA with different number of factors; then, selecting the elbow with the lowest error.

However, here we combined the similarity across runs in the same rank and the reconstruction error. Then, we manually selected the number under the criteria of achieving a similarity of at least 0.9 while keeping the error as low as possible.

If you want to stick to using the elbow analysis based on the error, just run the code after the following cell and change the parameter automatic_elbowto be True if you want to automatically select the number of factors.

In [9]:

Copied!





elbow, error = coupled_tensor.elbow_rank_selection(upper_rank=20,
                                                   runs=20,
                                                   init='svd',
                                                   metric='similarity',
                                                   automatic_elbow=False,
                                                   random_state=888,
                                                   show_individual=True,
                                                   )

plt.savefig(output_folder + 'Elbow-similarity.pdf', dpi=300, bbox_inches='tight')
elbow, error = coupled_tensor.elbow_rank_selection(upper_rank=20,
                                                   runs=20,
                                                   init='svd',
                                                   metric='similarity',
                                                   automatic_elbow=False,
                                                   random_state=888,
                                                   show_individual=True,
                                                   )

plt.savefig(output_folder + 'Elbow-similarity.pdf', dpi=300, bbox_inches='tight')

100%|██████████| 20/20 [04:43<00:00, 14.19s/it]

No description has been provided for this image

In [10]:

Copied!





elbow, error = coupled_tensor.elbow_rank_selection(upper_rank=20,
                                                   runs=20,
                                                   init='svd',
                                                   metric='error',
                                                   automatic_elbow=False,
                                                   random_state=888,
                                                   show_individual=True,
                                                   )

plt.savefig(output_folder + 'Elbow-error.pdf', dpi=300, bbox_inches='tight')
elbow, error = coupled_tensor.elbow_rank_selection(upper_rank=20,
                                                   runs=20,
                                                   init='svd',
                                                   metric='error',
                                                   automatic_elbow=False,
                                                   random_state=888,
                                                   show_individual=True,
                                                   )

plt.savefig(output_folder + 'Elbow-error.pdf', dpi=300, bbox_inches='tight')

100%|██████████| 20/20 [04:46<00:00, 14.33s/it]

The number of factors we picked here is 6, which keeps a good balance between similarity (>0.9) and error (as low as possible).

In [11]:

Copied!

n_factors = 6
n_factors = 6

Fine-tuning balancing weights for CTCA¶

Our CTCA introduces two balancing weights, α1 and α2, that sum up to one, to control how each tensor contributes to the factorization results. By default, Tensor-cell2cell prioritizes both tensors equally as previously implemented in other coupled factorization approaches (Li et al. 2024). However, balancing weights can be changed by the user upon reasonable criteria to influence the discovered patterns.

In this case, after selecting the number of factors, we fine tune α1 and α2. To do so, we pick different combinations starting from α1 = 0.1 ; α2 = 0.9 until α1 = 0.9 ; α2 = 0.1, with changes of 0.1 in each step:

In [12]:

Copied!

weights = np.linspace(0.1, 0.9, 9)
weights = np.linspace(0.1, 0.9, 9)

In [13]:

Copied!





factors = []
names = []
weight_pairs = dict()
for i, a1 in tqdm(enumerate(weights), total=weights.shape[0]):
    a1 = np.round(a1, 1)
    a2 = np.round(1. - a1, 1)
    name = 'α1={}; α2={}'.format(a1, a2) 
    coupled_tensor.compute_tensor_factorization(rank=n_factors,
                                                init='svd',
                                                random_state=888, # Same initialization, different weights
                                                manual_weights=(a1, a2),
                                                runs=1, tol=1e-8, n_iter_max=500 # Robust version of tensor decomposition
                                               )

    factors.append(coupled_tensor.factors.copy())
    names.append(name)
    weight_pairs[name] = (a1, a2)
factors = []
names = []
weight_pairs = dict()
for i, a1 in tqdm(enumerate(weights), total=weights.shape[0]):
    a1 = np.round(a1, 1)
    a2 = np.round(1. - a1, 1)
    name = 'α1={}; α2={}'.format(a1, a2) 
    coupled_tensor.compute_tensor_factorization(rank=n_factors,
                                                init='svd',
                                                random_state=888, # Same initialization, different weights
                                                manual_weights=(a1, a2),
                                                runs=1, tol=1e-8, n_iter_max=500 # Robust version of tensor decomposition
                                               )

    factors.append(coupled_tensor.factors.copy())
    names.append(name)
    weight_pairs[name] = (a1, a2)

HBox(children=(HTML(value=''), FloatProgress(value=0.0, max=9.0), HTML(value='')))

Then, we use CorrIndex for similarity (1-CorrIndex) and evaluate which combination is the most balanced (that means less distinct from the other combinations).

In [14]:

Copied!

df = c2c.tensor.metrics.pairwise_correlation_index(factors)
df.columns = names
df.index = names
df = c2c.tensor.metrics.pairwise_correlation_index(factors)
df.columns = names
df.index = names

Visualizations of Similarities

In [15]:

Copied!





# Generate a mask for the upper triangle
mask = np.tril(np.ones_like(1. - df, dtype=bool), k=-1)

# Create figure and axes
fig, ax = plt.subplots(figsize=(8, 6))

# Create heatmap without colorbar
hm = sns.heatmap(1. - df, mask=mask, annot=True, cmap='Blues', 
                 square=True, linewidths=0.5, cbar=False, ax=ax)

# Create colorbar on the left
divider = make_axes_locatable(ax)
cax = divider.append_axes("left", size="5%", pad=0.5)
cbar = plt.colorbar(hm.collections[0], cax=cax)
cbar.set_label('Similarity\n(1 - CorrIndex)', rotation=90, labelpad=15)
cax.yaxis.set_ticks_position('left')
cax.yaxis.set_label_position('left')

plt.sca(ax)  # Set current axes back to main plot
plt.title('Sensitivity Analysis of Tensor Weights')

# Move x-axis labels to top and rotate 90 degrees
ax.xaxis.tick_top()
ax.xaxis.set_label_position('top')
plt.setp(ax.xaxis.get_majorticklabels(), rotation=90, ha='center')

# Move y-axis labels to right and set horizontal (0 degrees)
ax.yaxis.tick_right()
ax.yaxis.set_label_position('right')
plt.setp(ax.yaxis.get_majorticklabels(), rotation=0)

plt.tight_layout()

    
plt.savefig(output_folder + '/CorrIndex_similarity_weights.pdf',
            dpi=150, 
            bbox_inches='tight',
            edgecolor='none')
# Generate a mask for the upper triangle
mask = np.tril(np.ones_like(1. - df, dtype=bool), k=-1)

# Create figure and axes
fig, ax = plt.subplots(figsize=(8, 6))

# Create heatmap without colorbar
hm = sns.heatmap(1. - df, mask=mask, annot=True, cmap='Blues', 
                 square=True, linewidths=0.5, cbar=False, ax=ax)

# Create colorbar on the left
divider = make_axes_locatable(ax)
cax = divider.append_axes("left", size="5%", pad=0.5)
cbar = plt.colorbar(hm.collections[0], cax=cax)
cbar.set_label('Similarity\n(1 - CorrIndex)', rotation=90, labelpad=15)
cax.yaxis.set_ticks_position('left')
cax.yaxis.set_label_position('left')

plt.sca(ax)  # Set current axes back to main plot
plt.title('Sensitivity Analysis of Tensor Weights')

# Move x-axis labels to top and rotate 90 degrees
ax.xaxis.tick_top()
ax.xaxis.set_label_position('top')
plt.setp(ax.xaxis.get_majorticklabels(), rotation=90, ha='center')

# Move y-axis labels to right and set horizontal (0 degrees)
ax.yaxis.tick_right()
ax.yaxis.set_label_position('right')
plt.setp(ax.yaxis.get_majorticklabels(), rotation=0)

plt.tight_layout()

    
plt.savefig(output_folder + '/CorrIndex_similarity_weights.pdf',
            dpi=150, 
            bbox_inches='tight',
            edgecolor='none')

We can also visualize the distribution of these values:

In [16]:

Copied!





# Create violin plot
plt.figure(figsize=(10, 6))
sns.boxplot(data=1.-df)
plt.xlabel('Weights')
plt.ylabel('Similarity (1 - CorrIndex)')
plt.title('Distribution of Similarity Values across Weight Pairs')
plt.xticks(rotation=45)
plt.tight_layout()
plt.savefig(output_folder + '/CorrIndex_similarity_weights_dist.pdf',
            dpi=150, 
            bbox_inches='tight',
            edgecolor='none')
# Create violin plot
plt.figure(figsize=(10, 6))
sns.boxplot(data=1.-df)
plt.xlabel('Weights')
plt.ylabel('Similarity (1 - CorrIndex)')
plt.title('Distribution of Similarity Values across Weight Pairs')
plt.xticks(rotation=45)
plt.tight_layout()
plt.savefig(output_folder + '/CorrIndex_similarity_weights_dist.pdf',
            dpi=150, 
            bbox_inches='tight',
            edgecolor='none')

We pick the pair of weights with highest median of similarity across different combinations:

In [17]:

Copied!

weight_pair = weight_pairs[(1.-df).median().idxmax()]
print(f'Chosen weights: {weight_pair}')
weight_pair = weight_pairs[(1.-df).median().idxmax()]
print(f'Chosen weights: {weight_pair}')

Chosen weights: (0.7, 0.3)

This resulted in weights of α1=0.7 and α2=0.3, which means that the Protein Tensor will contribute more than the Metabolite Tensor to the factorization results.

Perform tensor factorization¶

After selecting the number of factors (6 in this case), and the balancing weights (α1=0.7 and α2=0.3), we can then perform our definite CTCA. To make our results more robust, we perform the initialization by running SVD, and by running the CTCA 100 times, each with a different initialization seed. Additionally, we add parameters to ensure a robust stop criteria (tol=1e-8and n_iter_max=500). From the 100 runs, we select the CTCA with lowest error.

In [18]:

Copied!





coupled_tensor.compute_tensor_factorization(rank=n_factors,
                                            init='svd',
                                            random_state=888,
                                            manual_weights=weight_pair,
                                            runs=100, tol=1e-8, n_iter_max=500 # Robust version of tensor decomposition
                                           )
coupled_tensor.compute_tensor_factorization(rank=n_factors,
                                            init='svd',
                                            random_state=888,
                                            manual_weights=weight_pair,
                                            runs=100, tol=1e-8, n_iter_max=500 # Robust version of tensor decomposition
                                           )

100%|██████████| 100/100 [05:58<00:00,  3.59s/it]

Best coupled model has a combined normalized error of: 0.231

Then, we can visualize the reconstruction error, separately for each tensor or combined, across iterations:

In [19]:

Copied!

errors, err_fig = coupled_tensor.get_factorization_errors(plot=True, filename=output_folder + '/Factorization_loss.pdf')
errors, err_fig = coupled_tensor.get_factorization_errors(plot=True, filename=output_folder + '/Factorization_loss.pdf')

3 - Results¶

Generate unified metadata for CTCA results

By using the metadata generated separately for each tensor, we generate a new metadata that incorporates metadata for the shared and private dimensions between both modalities.

In [20]:

Copied!

new_meta_tf = coupled_tensor.reorder_metadata(meta_tf, meta_tf2)
new_meta_tf = coupled_tensor.reorder_metadata(meta_tf, meta_tf2)

Plot factor loadings¶

After performing the CTCA, a set of loadings is obtained in a factor-specific way, for each tensor. These loadings provide details of what elements of each dimension are important within each factor. Here we plot first the loadings for the shared dimensions, followed by the private dimensions of the first and second tensor (protein- and metabolite-based, respectively).

In [21]:

Copied!

# Color palettes for each of the coupled tensor dimensions, ordered as in coupled_tensor.factors.keys()
cmaps = ['viridis', 'tab20', 'tab20', 'Dark2', 'Dark2', ]
# Color palettes for each of the coupled tensor dimensions, ordered as in coupled_tensor.factors.keys()
cmaps = ['viridis', 'tab20', 'tab20', 'Dark2', 'Dark2', ]

In [22]:

Copied!





factors, axes = c2c.plotting.tensor_factors_plot(interaction_tensor=coupled_tensor,
                                                 metadata=new_meta_tf,
                                                 sample_col='Element',
                                                 group_col='Category',
                                                 meta_cmaps=cmaps,
                                                 order_sorting=['Contexts', 'Sender Cells', 'Receiver Cells', 'Protein LRIs', 'Metabolite LRIs'],
                                                 fontsize=14,
                                                 filename=output_folder + 'Tensor-Factorization.pdf'
                                                )
factors, axes = c2c.plotting.tensor_factors_plot(interaction_tensor=coupled_tensor,
                                                 metadata=new_meta_tf,
                                                 sample_col='Element',
                                                 group_col='Category',
                                                 meta_cmaps=cmaps,
                                                 order_sorting=['Contexts', 'Sender Cells', 'Receiver Cells', 'Protein LRIs', 'Metabolite LRIs'],
                                                 fontsize=14,
                                                 filename=output_folder + 'Tensor-Factorization.pdf'
                                                )

Each factor represents a context-dependent communication pattern. In this visualization, each row is a factor and each column is a tensor dimension in that factor (Shared dimensions: contexts, sender-cells, and receiver-cells; Private dimensions: Protein LR pairs and Metabolite LR pairs).

The loadings represent the contribution of each element to that pattern, and their combinations between all dimensions represent the overall pattern. For example, in Factor 1, the identified communication pattern occurs in early stages (month 1), where all cell types but GABAergic neurons participate as sender cells. Then we can identify the top LR pairs contributing from each modality. Thus, those that are top ranked for the protein modality act in coordination with those top ranked for the metabolite modality.

Top LR pairs¶

Top-5 LR pairs

In [23]:

Copied!

for i in range(coupled_tensor.rank):
    print(coupled_tensor.get_top_factor_elements('Protein LRIs', 'Factor {}'.format(i+1), 5))
    print('')
for i in range(coupled_tensor.rank):
    print(coupled_tensor.get_top_factor_elements('Protein LRIs', 'Factor {}'.format(i+1), 5))
    print('')

MDK^NCL      0.372804
MDK^SDC2     0.317517
CDH2^CDH2    0.302039
CD99^CD99    0.267838
MDK^LRP1     0.177680
Name: Factor 1, dtype: float64

PTN^NCL       0.507021
MDK^NCL       0.434334
PTN^PTPRZ1    0.354619
MDK^PTPRZ1    0.301730
CDH2^CDH2     0.256842
Name: Factor 2, dtype: float64

PTN^PTPRZ1    0.514414
MDK^PTPRZ1    0.501849
PTN^NCL       0.245779
PTN^SDC3      0.233022
MDK^NCL       0.228345
Name: Factor 3, dtype: float64

CD99^CD99     0.464812
MDK^PTPRZ1    0.460041
MDK^NCL       0.417492
MDK^SDC2      0.199778
MDK^LRP1      0.170363
Name: Factor 4, dtype: float64

MDK^SDC2     0.436574
PTN^NCL      0.425681
MDK^NCL      0.387236
PTN^SDC2     0.380880
CD99^CD99    0.282566
Name: Factor 5, dtype: float64

NCAM1^NCAM1    0.439778
CADM1^CADM1    0.322127
NRXN1^NLGN2    0.213717
MDK^NCL        0.210292
NRXN1^NLGN1    0.210099
Name: Factor 6, dtype: float64

In [24]:

Copied!

for i in range(coupled_tensor.rank):
    print(coupled_tensor.get_top_factor_elements('Metabolite LRIs', 'Factor {}'.format(i+1), 5))
    print('')
for i in range(coupled_tensor.rank):
    print(coupled_tensor.get_top_factor_elements('Metabolite LRIs', 'Factor {}'.format(i+1), 5))
    print('')

Farnesyl pyrophosphate^LPAR2      0.323938
Arachidonic acid^ANXA2            0.320029
gamma-Aminobutyric acid^GABRB3    0.293607
Sphingosine^TRPM3                 0.288527
Retinal^RARA                      0.280857
Name: Factor 1, dtype: float64

Iron^TFRC                         0.401326
Retinal^RXRB                      0.358571
Taurine^GLRB                      0.330157
Retinal^RARA                      0.296951
gamma-Aminobutyric acid^GABBR1    0.230053
Name: Factor 2, dtype: float64

gamma-Aminobutyric acid^GABRB3    0.646834
Iron^TFRC                         0.363304
gamma-Aminobutyric acid^GABBR2    0.355796
Retinal^RARB                      0.229937
Testosterone^AR                   0.229856
Name: Factor 3, dtype: float64

Farnesyl pyrophosphate^LPAR2    0.528710
Desmosterol^NR1H2               0.396541
Thyroxine^THRA                  0.309135
Desmosterol^NR1H3               0.234776
Retinal^RXRB                    0.225645
Name: Factor 4, dtype: float64

Sphingosine^TRPM3                 0.515894
Arachidonic acid^S100A10          0.369580
Arachidonic acid^ANXA2            0.341487
Leukotriene B4^PPARA              0.220591
gamma-Aminobutyric acid^GABRB3    0.218846
Name: Factor 5, dtype: float64

Iron^TFRC                         0.358893
Arachidonic acid^ANXA2            0.336427
gamma-Aminobutyric acid^GABBR1    0.277284
Thyroxine^THRA                    0.257077
Desmosterol^NR1H2                 0.248295
Name: Factor 6, dtype: float64

Export Loadings¶

THESE VALUES CAN BE USED FOR DOWNSTREAM ANALYSES

In [27]:

Copied!

coupled_tensor.export_factor_loadings(output_folder + 'Coupled-Loadings.xlsx')
coupled_tensor.export_factor_loadings(output_folder + 'Coupled-Loadings.xlsx')

Coupled tensor factor loadings saved to /results/Tensor-cell2cell/Coupled-Loadings.xlsx

4 - Downstream Analyses¶

Generate factor-specific networks¶

We can generate factor-specific communication networks connecting sender-to-receiver cells.

In [28]:

Copied!

networks = c2c.analysis.tensor_downstream.get_factor_specific_ccc_networks(coupled_tensor.factors, 
                                                                           sender_label='Sender Cells',
                                                                           receiver_label='Receiver Cells')
networks = c2c.analysis.tensor_downstream.get_factor_specific_ccc_networks(coupled_tensor.factors, 
                                                                           sender_label='Sender Cells',
                                                                           receiver_label='Receiver Cells')

Generate matrix of cell-cell pairs by factors

In [29]:

Copied!

network_by_factors = c2c.analysis.tensor_downstream.flatten_factor_ccc_networks(networks, orderby='receivers')
network_by_factors = c2c.analysis.tensor_downstream.flatten_factor_ccc_networks(networks, orderby='receivers')

Select cell-cell pairs with high potential of interaction

In [30]:

Copied!

_ = plt.hist(network_by_factors.values.flatten(), bins = 20)
_ = plt.hist(network_by_factors.values.flatten(), bins = 20)

In [31]:

Copied!

# To consider only cell pairs with high potential to interact in each factor
cci_threshold = 0.2
# To consider only cell pairs with high potential to interact in each factor
cci_threshold = 0.2

Visualize networks of factors with significant differences between groups

In [32]:

Copied!





net_fig, net_ax = c2c.plotting.ccc_networks_plot(coupled_tensor.factors,
                                                 included_factors=None,
                                                 ccc_threshold=cci_threshold, # Only important communication
                                                 nrows=2,
                                                 panel_size=(8, 8),
                                                 network_layout='circular',
                                                 filename=output_folder + 'Networks.svg'
                                                )
net_fig, net_ax = c2c.plotting.ccc_networks_plot(coupled_tensor.factors,
                                                 included_factors=None,
                                                 ccc_threshold=cci_threshold, # Only important communication
                                                 nrows=2,
                                                 panel_size=(8, 8),
                                                 network_layout='circular',
                                                 filename=output_folder + 'Networks.svg'
                                                )

Clustermaps¶

Similarly, we can visualize the contribution of each dimension element across factors by plotting their loadings

In [33]:

Copied!

factor_sorted = ['Factor {}'.format(f) for f in range(1, coupled_tensor.rank+1)]
factor_sorted = ['Factor {}'.format(f) for f in range(1, coupled_tensor.rank+1)]

Cluster cell-cell pairs by their potential across factors

Here we display the outer product between the sender and receiver cell dimensions, which are the value governing the factor-specific networks above.

In [34]:

Copied!





cci_df = network_by_factors[factor_sorted]
cci_df = cci_df[(cci_df.T > cci_threshold).any()]
cci_df = (cci_df / cci_df.max())

cci_cm = c2c.plotting.loading_clustermap(cci_df,
                                         use_zscore=False, 
                                         loading_threshold=0.,
                                         figsize=(20, 9),
                                         tick_fontsize=16,
                                         cmap='YlOrBr',
                                         filename=output_folder + 'Clustermap-Factor-specific-CCI.pdf',
                                         row_cluster=False
                                        )
cci_df = network_by_factors[factor_sorted]
cci_df = cci_df[(cci_df.T > cci_threshold).any()]
cci_df = (cci_df / cci_df.max())

cci_cm = c2c.plotting.loading_clustermap(cci_df,
                                         use_zscore=False, 
                                         loading_threshold=0.,
                                         figsize=(20, 9),
                                         tick_fontsize=16,
                                         cmap='YlOrBr',
                                         filename=output_folder + 'Clustermap-Factor-specific-CCI.pdf',
                                         row_cluster=False
                                        )

Cluster LR pairs by their importance across factors

We can plot the key LR pairs from each modality. First we identify what could represent a high loading given their separate distributions.

In [35]:

Copied!

_ = plt.hist(coupled_tensor.factors['Protein LRIs'].values.flatten(), bins = 20)
_ = plt.hist(coupled_tensor.factors['Protein LRIs'].values.flatten(), bins = 20)

In [36]:

Copied!

_ = plt.hist(coupled_tensor.factors['Metabolite LRIs'].values.flatten(), bins = 20)
_ = plt.hist(coupled_tensor.factors['Metabolite LRIs'].values.flatten(), bins = 20)

And set and arbitrary threshold to filter out those that have low loadings

In [37]:

Copied!

# To consider only most relevant LR pairs
lr_threshold1 = 0.1
lr_threshold2 = 0.2
# To consider only most relevant LR pairs
lr_threshold1 = 0.1
lr_threshold2 = 0.2

Since we have loadings from different modalities, we want to visualize them in a comparable way, so we normalize each of them by their maximum value in each case, in a factor-wise manner.

In [38]:

Copied!





df1 = coupled_tensor.factors['Protein LRIs'].copy()
df1 = df1[(df1.T > lr_threshold1).any()]
df1 = (df1 / df1.max())

df2 = coupled_tensor.factors['Metabolite LRIs'].copy()
df2 = df2[(df2.T > lr_threshold2).any()]
df2 = (df2 / df2.max())
df1 = coupled_tensor.factors['Protein LRIs'].copy()
df1 = df1[(df1.T > lr_threshold1).any()]
df1 = (df1 / df1.max())

df2 = coupled_tensor.factors['Metabolite LRIs'].copy()
df2 = df2[(df2.T > lr_threshold2).any()]
df2 = (df2 / df2.max())

In [39]:

Copied!





lr_cm = c2c.plotting.loading_clustermap(df1,
                                        loading_threshold=0., # To consider only top LRs
                                        use_zscore=False,
                                        figsize=(10, 5),
                                        cbar_fontsize=20,
                                        tick_fontsize=10,
                                        cmap='YlOrBr',
                                        filename=output_folder + 'Clustermap-Factor-specific-ProtLRs.pdf',
                                        row_cluster=False
                                       )
lr_cm = c2c.plotting.loading_clustermap(df1,
                                        loading_threshold=0., # To consider only top LRs
                                        use_zscore=False,
                                        figsize=(10, 5),
                                        cbar_fontsize=20,
                                        tick_fontsize=10,
                                        cmap='YlOrBr',
                                        filename=output_folder + 'Clustermap-Factor-specific-ProtLRs.pdf',
                                        row_cluster=False
                                       )

In [40]:

Copied!





lr_cm = c2c.plotting.loading_clustermap(df2,
                                        loading_threshold=0., # To consider only top LRs
                                        use_zscore=False,
                                        figsize=(8, 6),
                                        cbar_fontsize=20,
                                        tick_fontsize=10,
                                        cmap='YlOrBr',
                                        filename=output_folder + 'Clustermap-Factor-specific-MetLRs.pdf',
                                        row_cluster=False
                                       )
lr_cm = c2c.plotting.loading_clustermap(df2,
                                        loading_threshold=0., # To consider only top LRs
                                        use_zscore=False,
                                        figsize=(8, 6),
                                        cbar_fontsize=20,
                                        tick_fontsize=10,
                                        cmap='YlOrBr',
                                        filename=output_folder + 'Clustermap-Factor-specific-MetLRs.pdf',
                                        row_cluster=False
                                       )

Given that we max-scaled them, we can also put them together in an unified plot

In [41]:

Copied!

df_both = pd.concat([df1, df2])
df_both = pd.concat([df1, df2])

In [42]:

Copied!





lr_cm = c2c.plotting.loading_clustermap(df_both,
                                        loading_threshold=0.15, # To consider only top LRs
                                        use_zscore=False,
                                        figsize=(24, 5),
                                        cbar_fontsize=20,
                                        tick_fontsize=10,
                                        cmap='YlOrBr',
                                        filename=output_folder + 'Clustermap-Factor-specific-LRs.pdf',
                                        row_cluster=False
                                       )
lr_cm = c2c.plotting.loading_clustermap(df_both,
                                        loading_threshold=0.15, # To consider only top LRs
                                        use_zscore=False,
                                        figsize=(24, 5),
                                        cbar_fontsize=20,
                                        tick_fontsize=10,
                                        cmap='YlOrBr',
                                        filename=output_folder + 'Clustermap-Factor-specific-LRs.pdf',
                                        row_cluster=False
                                       )