cell2cell
  • Home
  • API Documentation

cell2cell Tutorials

  • Cell-cell communication from bulk dataset
  • Cell-cell communication from single-cell dataset

Tensor-cell2cell Tutorials

  • Obtaining patterns of cell-cell communication with Tensor-cell2cell
  • Downstream analysis 1: Factor-specific analyses
  • Downstream analysis 2: Gene Set Enrichment Analysis
  • Inspecting CCC patterns from spatial transcriptomics
  • Running Tensor-cell2cell on your own GPU or on Google Colab's GPU

Tensor-cell2cell v2 Tutorials

  • Building multimodal 4D communication tensors (MEBOCOST)
  • Patterns of protein- and metabolite-based cell-cell communication (MEBOCOST)
    • 1 - Load Data
      • Import 4D communication tensors
    • 2 - Generate coupled tensors
    • 3 - Run Analysis
      • Elbow analysis
      • Perform tensor factorization
    • 3 - Results
      • Plot factor loadings
      • Top LR pairs
      • Export Loadings
    • 4 - Downstream Analyses
      • Generate factor-specific networks
      • Clustermaps
  • Patterns of cell-cell communication and transcription factors (DecoupleR)
  • Patterns of cell-cell communication and metabolic activities (scCellFie)
  • Patterns of protein- and metabolite-based cell-cell communication (LIANA+)
cell2cell
  • Tensor-cell2cell v2 Tutorials
  • Patterns of protein- and metabolite-based cell-cell communication (MEBOCOST)
  • Edit on GitHub

Patterns of protein- and metabolite-based cell-cell communication (MEBOCOST)¶

This tutorial is focused on running Tensor-cell2cell v2 using the tools cell2cell and MEBOCOST to predict protein- and metabolite-mediated cell-cell communication (CCC), respectively.

In this case, we use samples from cortical brain organoids across different time points, previously published by Trujillo et al (2019).

Read this if you plan using a GPU to speed-up the analysis. Before running this notebook, make sure to have a proper NVIDIA GPU driver (https://www.nvidia.com/Download/index.aspx) as well as the CUDA toolkit (https://developer.nvidia.com/cuda-toolkit) installed.

Then, make sure to create an environment with Pytorch >= v1.8.0 following these instructions to enable CUDA.

https://pytorch.org/get-started/locally/

Once you have everything installed, run the next blocks two blocks of code before everything.

If you are using a NVIDIA GPU, with PyTorch and CUDA, set the following variable to be True

use_gpu = True, otherwise set it use_gpu = False

In [1]:
Copied!
use_gpu = True

import tensorly as tl
if use_gpu:
    tl.set_backend('pytorch')
use_gpu = True import tensorly as tl if use_gpu: tl.set_backend('pytorch')

Importing packages to use

In [2]:
Copied!
import cell2cell as c2c
import scanpy as sc

import numpy as np
import pandas as pd

from tqdm.auto import tqdm

import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline
import cell2cell as c2c import scanpy as sc import numpy as np import pandas as pd from tqdm.auto import tqdm import matplotlib.pyplot as plt import seaborn as sns %matplotlib inline
In [3]:
Copied!
import warnings
warnings.filterwarnings('ignore')
import warnings warnings.filterwarnings('ignore')

IMPORTANT: In this notebook, the version 0.8.0 of cell2cell is used. Older versions do not implement the Coupled Tensor Component Analysis (CTCA).

1 - Load Data¶

Specify folders where the data is located, and where the outputs will be written:

In [4]:
Copied!
import os

data_folder = './data/'
directory = os.fsencode(data_folder)

output_folder = './results/Tensor-cell2cell/'
if not os.path.isdir(output_folder):
    os.mkdir(output_folder)
import os data_folder = './data/' directory = os.fsencode(data_folder) output_folder = './results/Tensor-cell2cell/' if not os.path.isdir(output_folder): os.mkdir(output_folder)

Import 4D communication tensors¶

We will import the communication tensors created for both modalities (protein- and metabolite-mediated CCC), as previously done.

In [5]:
Copied!
tensor_lr = c2c.io.load_tensor('./data/Tensor-cell2cell/Initial-c2c-Tensor.pkl')
tensor_met = c2c.io.load_tensor('./data/Tensor-cell2cell/Initial-MEBOCOST-Tensor.pkl')
tensor_lr = c2c.io.load_tensor('./data/Tensor-cell2cell/Initial-c2c-Tensor.pkl') tensor_met = c2c.io.load_tensor('./data/Tensor-cell2cell/Initial-MEBOCOST-Tensor.pkl')
In [6]:
Copied!
meta_tf = c2c.io.load_variable_with_pickle('./data/Tensor-cell2cell/Meta-Initial-c2c-Tensor.pkl')
meta_tf2 = c2c.io.load_variable_with_pickle('./data/Tensor-cell2cell/Meta-Initial-MEBOCOST-Tensor.pkl')
meta_tf = c2c.io.load_variable_with_pickle('./data/Tensor-cell2cell/Meta-Initial-c2c-Tensor.pkl') meta_tf2 = c2c.io.load_variable_with_pickle('./data/Tensor-cell2cell/Meta-Initial-MEBOCOST-Tensor.pkl')

2 - Generate coupled tensors¶

Once we have loaded our 4D tensors, we start creating a CoupledInteractionTensorobject in Tensor-cell2cell v2. This object will later perform the CTCA.

In [7]:
Copied!
coupled_tensor = c2c.tensor.CoupledInteractionTensor(tensor1=tensor_lr,
                                                     tensor2=tensor_met,
                                                     tensor1_name='cell2cell',
                                                     tensor2_name='mebocost',
                                                     mode_mapping={'shared' : [(0,0), (2,2), (3,3)]}
                                                    )
coupled_tensor = c2c.tensor.CoupledInteractionTensor(tensor1=tensor_lr, tensor2=tensor_met, tensor1_name='cell2cell', tensor2_name='mebocost', mode_mapping={'shared' : [(0,0), (2,2), (3,3)]} )

We can observe that this object includes both tensors simultaneously.

In [8]:
Copied!
coupled_tensor.shape
coupled_tensor.shape
Out[8]:
(torch.Size([4, 422, 5, 5]), torch.Size([4, 52, 5, 5]))

Assume that missing values are real zeros

Instead of imputing missing values with our method, we will assume that they are real zeros.

In [9]:
Copied!
coupled_tensor.mask1 = None
coupled_tensor.mask2 = None
coupled_tensor.mask1 = None coupled_tensor.mask2 = None

3 - Run Analysis¶

Elbow analysis¶

For selecting the number of factors to decompose CCC, we could do it either manually or automatically from an Elbow analysis.

In this case, after running the Elbow analysis, we manually selected the number of factors. Change the parameter automatic_elbowto be True if you want to automatically select the number of factors.

In [11]:
Copied!
elbow, error = coupled_tensor.elbow_rank_selection(upper_rank=25,
                                                   runs=20,
                                                   init='svd',
                                                   automatic_elbow=False,
                                                   random_state=888,
                                                   )

plt.savefig(output_folder + 'Elbow.pdf', dpi=300, bbox_inches='tight')
elbow, error = coupled_tensor.elbow_rank_selection(upper_rank=25, runs=20, init='svd', automatic_elbow=False, random_state=888, ) plt.savefig(output_folder + 'Elbow.pdf', dpi=300, bbox_inches='tight')
100%|██████████| 25/25 [05:03<00:00, 12.14s/it]
The rank at the elbow is: 18
No description has been provided for this image

Here, the error measures how different is the original tensor from the sum of R rank-1 tensors used to approximate it. The idea is to find a good trade-off between a small number of factors and a small error.

Perform tensor factorization¶

Next, we apply a non-negative canonical coupled component analysis (CTCA) – the same performed in the elbow analysis above. Briefly, this tensor decomposition identifies a low-rank tensor. Briefly, tensor decomposition identifies a low-rank tensor (here, a rank of 6) for each modality's tensor that better approximates each original tensor. These low-rank tensors can be represented as the sum of a set of rank-1 tensors (6 of them in this case). Each rank-1 tensor represents a factor in the decomposition and can be further represented as the outer product of n vectors, where n represents the number of tensor dimensions. Each vector represents one of the n tensor dimensions for that factor and its values, corresponding to individual elements in each dimension, represent the factor loadings.

In the CTCA, we have vectors that are shared between the coupled tensors (here contexts, sender cells, and receiver cells), while we have other vectors that are private for each modality (protein-based LR pair and metabolite-based LR pair dimensions). The loadings shared between both tensors behave exactly the same in both modalities, representing that we are capturing LR pairs of each modality that present similar trends in each factor. If each tensor was decomposed separately with the standard TCA in Tensor-cell2cell v1, we would not ensure that the key LR pairs for each factor would behave the same, as this could lead to loadings for the context, sender cells and receiver cells with different behaviors between modalities.

tensor-fact

In [12]:
Copied!
coupled_tensor.compute_tensor_factorization(rank=6, # This is just for illustrative purposes - to keep a low number of factors
                                            init='svd',
                                            random_state=888,
                                            runs=100, tol=1e-8, n_iter_max=500 # Robust version of tensor decomposition
                                           )
coupled_tensor.compute_tensor_factorization(rank=6, # This is just for illustrative purposes - to keep a low number of factors init='svd', random_state=888, runs=100, tol=1e-8, n_iter_max=500 # Robust version of tensor decomposition )
100%|██████████| 100/100 [03:22<00:00,  2.02s/it]
Best coupled model has a combined normalized error of: 0.604

3 - Results¶

Generate unified metadata for CTCA results

By using the metadata generated separately for each tensor, we generate a new metadata that incorporates metadata for the shared and private dimensions between both modalities.

In [10]:
Copied!
new_meta_tf = coupled_tensor.reorder_metadata(meta_tf, meta_tf2)
new_meta_tf = coupled_tensor.reorder_metadata(meta_tf, meta_tf2)

Plot factor loadings¶

After performing the CTCA, a set of loadings is obtained in a factor-specific way, for each tensor. These loadings provide details of what elements of each dimension are important within each factor. Here we plot first the loadings for the shared dimensions, followed by the private dimensions of the first and second tensor (protein- and metabolite-based, respectively).

In [13]:
Copied!
# Color palettes for each of the coupled tensor dimensions, ordered as in coupled_tensor.factors.keys()
cmaps = ['viridis', 'tab20', 'tab20', 'Dark2', 'Dark2', ]
# Color palettes for each of the coupled tensor dimensions, ordered as in coupled_tensor.factors.keys() cmaps = ['viridis', 'tab20', 'tab20', 'Dark2', 'Dark2', ]
In [14]:
Copied!
factors, axes = c2c.plotting.tensor_factors_plot(interaction_tensor=coupled_tensor,
                                                 metadata=new_meta_tf,
                                                 sample_col='Element',
                                                 group_col='Category',
                                                 meta_cmaps=cmaps,
                                                 order_sorting=['Contexts', 'Sender Cells', 'Receiver Cells', # Shared dimensions
                                                                'Protein LRIs', 'Metabolite LRIs' # Private dimensions
                                                               ],
                                                 fontsize=14,
                                                 filename=output_folder + 'Tensor-Factorization.pdf'
                                                )
factors, axes = c2c.plotting.tensor_factors_plot(interaction_tensor=coupled_tensor, metadata=new_meta_tf, sample_col='Element', group_col='Category', meta_cmaps=cmaps, order_sorting=['Contexts', 'Sender Cells', 'Receiver Cells', # Shared dimensions 'Protein LRIs', 'Metabolite LRIs' # Private dimensions ], fontsize=14, filename=output_folder + 'Tensor-Factorization.pdf' )
No description has been provided for this image

Each factor represents a context-dependent communication pattern. In this visualization, each row is a factor and each column is a tensor dimension in that factor (Shared dimensions: contexts, sender-cells, and receiver-cells; Private dimensions: Protein LR pairs and Metabolite LR pairs).

The loadings represent the contribution of each element to that pattern, and their combinations between all dimensions represent the overall pattern. For example, in Factor 1, the identified communication pattern occurs in early stages (month 1), where all cell types but GABAergic neurons participate as sender and receiver cells. Then we can identify the top LR pairs contributing from each modality. Thus, those that are top ranked for the protein modality act in coordination with those top ranked for the metabolite modality.

Top LR pairs¶

Top-5 LR pairs

In [15]:
Copied!
for i in range(coupled_tensor.rank):
    print(coupled_tensor.get_top_factor_elements('Protein LRIs', 'Factor {}'.format(i+1), 5))
    print('')
for i in range(coupled_tensor.rank): print(coupled_tensor.get_top_factor_elements('Protein LRIs', 'Factor {}'.format(i+1), 5)) print('')
MDK^NCL      0.407387
CDH2^CDH2    0.307432
MDK^SDC2     0.285180
CD99^CD99    0.269068
PTN^NCL      0.224309
Name: Factor 1, dtype: float64

MDK^PTPRZ1    0.471079
MDK^NCL       0.462388
CD99^CD99     0.382835
PTN^NCL       0.349663
PTN^PTPRZ1    0.329195
Name: Factor 2, dtype: float64

PTN^PTPRZ1    0.441224
PTN^NCL       0.433979
MDK^PTPRZ1    0.403027
MDK^NCL       0.361902
CDH2^CDH2     0.276318
Name: Factor 3, dtype: float64

MDK^SDC2           0.509185
PTN^SDC2           0.429248
CD99^CD99          0.287311
MDK^ITGA6&ITGB1    0.225686
MDK^NCL            0.208107
Name: Factor 4, dtype: float64

NCAM1^NCAM1    0.385062
PTN^NCL        0.302485
PTN^PTPRZ1     0.290058
MDK^PTPRZ1     0.282025
CADM1^CADM1    0.275014
Name: Factor 5, dtype: float64

NCAM1^NCAM1    0.329649
MDK^LRP1       0.295148
CADM1^CADM1    0.284719
MDK^NCL        0.284322
PTN^SDC3       0.223877
Name: Factor 6, dtype: float64

In [16]:
Copied!
for i in range(coupled_tensor.rank):
    print(coupled_tensor.get_top_factor_elements('Metabolite LRIs', 'Factor {}'.format(i+1), 5))
    print('')
for i in range(coupled_tensor.rank): print(coupled_tensor.get_top_factor_elements('Metabolite LRIs', 'Factor {}'.format(i+1), 5)) print('')
Farnesyl pyrophosphate^LPAR2    0.388216
Arachidonic acid^ANXA2          0.373286
Retinal^RARA                    0.303664
Thyroxine^THRA                  0.259081
Arachidonic acid^S100A10        0.254810
Name: Factor 1, dtype: float64

Thyroxine^THRA                  0.479568
Iron^TFRC                       0.432253
Retinal^RARG                    0.329154
Farnesyl pyrophosphate^LPAR2    0.268704
Desmosterol^NR1H2               0.245748
Name: Factor 2, dtype: float64

Iron^TFRC                         0.577320
Retinal^RARB                      0.393141
Retinal^RXRA                      0.380903
gamma-Aminobutyric acid^GABBR1    0.316231
5-HETE^TRPV1                      0.226016
Name: Factor 3, dtype: float64

Sphingosine^TRPM3           0.430887
Arachidonic acid^ANXA2      0.394077
Iron^TFRC                   0.373948
Arachidonic acid^S100A10    0.360215
Retinal^RXRB                0.356561
Name: Factor 4, dtype: float64

Farnesyl pyrophosphate^LPAR2    0.768104
Desmosterol^NR1H2               0.464831
Taurine^GLRB                    0.272342
Testosterone^AR                 0.263091
Retinal^RARG                    0.178612
Name: Factor 5, dtype: float64

Retinal^RXRB                      0.465276
gamma-Aminobutyric acid^GABRB3    0.367336
Retinal^RARA                      0.342311
Iron^TFR2                         0.325170
gamma-Aminobutyric acid^GABBR1    0.302204
Name: Factor 6, dtype: float64

Export Loadings¶

THESE VALUES CAN BE USED FOR DOWNSTREAM ANALYSES

In [19]:
Copied!
coupled_tensor.export_factor_loadings(output_folder + 'Coupled-Loadings.xlsx')
coupled_tensor.export_factor_loadings(output_folder + 'Coupled-Loadings.xlsx')
Coupled tensor factor loadings saved to /results/Tensor-cell2cell/Coupled-Loadings.xlsx

4 - Downstream Analyses¶

Generate factor-specific networks¶

We can generate factor-specific communication networks connecting sender-to-receiver cells.

In [20]:
Copied!
networks = c2c.analysis.tensor_downstream.get_factor_specific_ccc_networks(coupled_tensor.factors, 
                                                                           sender_label='Sender Cells',
                                                                           receiver_label='Receiver Cells')
networks = c2c.analysis.tensor_downstream.get_factor_specific_ccc_networks(coupled_tensor.factors, sender_label='Sender Cells', receiver_label='Receiver Cells')

Generate matrix of cell-cell pairs by factors

In [21]:
Copied!
network_by_factors = c2c.analysis.tensor_downstream.flatten_factor_ccc_networks(networks, orderby='receivers')
network_by_factors = c2c.analysis.tensor_downstream.flatten_factor_ccc_networks(networks, orderby='receivers')

Select cell-cell pairs with high potential of interaction

In [22]:
Copied!
_ = plt.hist(network_by_factors.values.flatten(), bins = 20)
_ = plt.hist(network_by_factors.values.flatten(), bins = 20)
No description has been provided for this image
In [23]:
Copied!
# To consider only cell pairs with high potential to interact in each factor
cci_threshold = 0.2
# To consider only cell pairs with high potential to interact in each factor cci_threshold = 0.2

Visualize networks of factors with significant differences between groups

In [24]:
Copied!
net_fig, net_ax = c2c.plotting.ccc_networks_plot(coupled_tensor.factors,
                                                 included_factors=None,
                                                 ccc_threshold=cci_threshold, # Only important communication
                                                 nrows=2,
                                                 panel_size=(8, 8),
                                                 network_layout='circular',
                                                 filename=output_folder + 'Networks.svg'
                                                )
net_fig, net_ax = c2c.plotting.ccc_networks_plot(coupled_tensor.factors, included_factors=None, ccc_threshold=cci_threshold, # Only important communication nrows=2, panel_size=(8, 8), network_layout='circular', filename=output_folder + 'Networks.svg' )
No description has been provided for this image

Clustermaps¶

Similarly, we can visualize the contribution of each dimension element across factors by plotting their loadings

In [25]:
Copied!
factor_sorted = ['Factor {}'.format(f) for f in range(1, coupled_tensor.rank+1)]
factor_sorted = ['Factor {}'.format(f) for f in range(1, coupled_tensor.rank+1)]

Cluster cell-cell pairs by their potential across factors

Here we display the outer product between the sender and receiver cell dimensions, which are the value governing the factor-specific networks above.

In [26]:
Copied!
cci_cm = c2c.plotting.loading_clustermap(network_by_factors[factor_sorted],
                                         use_zscore=False, 
                                         loading_threshold=cci_threshold, # To consider relevant CCIs
                                         figsize=(20, 9),
                                         tick_fontsize=16,
                                         cmap='YlOrBr',
                                         filename=output_folder + 'Clustermap-Factor-specific-CCI.pdf',
                                         row_cluster=False
                                        )
cci_cm = c2c.plotting.loading_clustermap(network_by_factors[factor_sorted], use_zscore=False, loading_threshold=cci_threshold, # To consider relevant CCIs figsize=(20, 9), tick_fontsize=16, cmap='YlOrBr', filename=output_folder + 'Clustermap-Factor-specific-CCI.pdf', row_cluster=False )
No description has been provided for this image

Cluster LR pairs by their importance across factors

We can plot the key LR pairs from each modality. First we identify what could represent a high loading given their separate distributions.

In [27]:
Copied!
_ = plt.hist(coupled_tensor.factors['Protein LRIs'].values.flatten(), bins = 20)
_ = plt.hist(coupled_tensor.factors['Protein LRIs'].values.flatten(), bins = 20)
No description has been provided for this image
In [28]:
Copied!
_ = plt.hist(coupled_tensor.factors['Metabolite LRIs'].values.flatten(), bins = 20)
_ = plt.hist(coupled_tensor.factors['Metabolite LRIs'].values.flatten(), bins = 20)
No description has been provided for this image

And set and arbitrary threshold to filter out those that have low loadings

In [29]:
Copied!
# To consider only most relevant LR pairs
lr_threshold1 = 0.1
lr_threshold2 = 0.2
# To consider only most relevant LR pairs lr_threshold1 = 0.1 lr_threshold2 = 0.2

Since we have loadings from different modalities, we want to visualize them in a comparable way, so we normalize each of them by their maximum value in each case.

In [30]:
Copied!
df1 = coupled_tensor.factors['Protein LRIs'].copy()
df1 = df1[(df1.T > lr_threshold1).any()]
df1 = (df1 / df1.max().max())

df2 = coupled_tensor.factors['Metabolite LRIs'].copy()
df2 = df2[(df2.T > lr_threshold2).any()]
df2 = (df2 / df2.max().max())
df1 = coupled_tensor.factors['Protein LRIs'].copy() df1 = df1[(df1.T > lr_threshold1).any()] df1 = (df1 / df1.max().max()) df2 = coupled_tensor.factors['Metabolite LRIs'].copy() df2 = df2[(df2.T > lr_threshold2).any()] df2 = (df2 / df2.max().max())
In [31]:
Copied!
lr_cm = c2c.plotting.loading_clustermap(df1,
                                        loading_threshold=lr_threshold1, # To consider only top LRs
                                        use_zscore=False,
                                        figsize=(10, 5),
                                        cbar_fontsize=20,
                                        tick_fontsize=10,
                                        cmap='YlOrBr',
                                        filename=output_folder + 'Clustermap-Factor-specific-ProtLRs.pdf',
                                        row_cluster=False
                                       )
lr_cm = c2c.plotting.loading_clustermap(df1, loading_threshold=lr_threshold1, # To consider only top LRs use_zscore=False, figsize=(10, 5), cbar_fontsize=20, tick_fontsize=10, cmap='YlOrBr', filename=output_folder + 'Clustermap-Factor-specific-ProtLRs.pdf', row_cluster=False )
No description has been provided for this image
In [32]:
Copied!
lr_cm = c2c.plotting.loading_clustermap(df2,
                                        loading_threshold=lr_threshold2, # To consider only top LRs
                                        use_zscore=False,
                                        figsize=(8, 6),
                                        cbar_fontsize=20,
                                        tick_fontsize=10,
                                        cmap='YlOrBr',
                                        filename=output_folder + 'Clustermap-Factor-specific-MetLRs.pdf',
                                        row_cluster=False
                                       )
lr_cm = c2c.plotting.loading_clustermap(df2, loading_threshold=lr_threshold2, # To consider only top LRs use_zscore=False, figsize=(8, 6), cbar_fontsize=20, tick_fontsize=10, cmap='YlOrBr', filename=output_folder + 'Clustermap-Factor-specific-MetLRs.pdf', row_cluster=False )
No description has been provided for this image

Given that we max-scaled them, we can also put them together in an unified plot

In [33]:
Copied!
df = pd.concat([df1, df2])
df = pd.concat([df1, df2])
In [34]:
Copied!
lr_cm = c2c.plotting.loading_clustermap(df,
                                        loading_threshold=0.15, # To consider only top LRs
                                        use_zscore=False,
                                        figsize=(24, 5),
                                        cbar_fontsize=20,
                                        tick_fontsize=10,
                                        cmap='YlOrBr',
                                        filename=output_folder + 'Clustermap-Factor-specific-LRs.pdf',
                                        row_cluster=False
                                       )
lr_cm = c2c.plotting.loading_clustermap(df, loading_threshold=0.15, # To consider only top LRs use_zscore=False, figsize=(24, 5), cbar_fontsize=20, tick_fontsize=10, cmap='YlOrBr', filename=output_folder + 'Clustermap-Factor-specific-LRs.pdf', row_cluster=False )
No description has been provided for this image
Previous Next

Built with MkDocs using a theme provided by Read the Docs.
GitHub « Previous Next »