TiRank is a comprehensive tool for integrating and analyzing RNA-seq and scRNA-seq data. It enables researchers to identify phenotype-associated spots by integrating spatial transcriptomics or single-cell RNA sequencing data with bulk RNA sequencing data. TiRank supports various analysis modes, including survival analysis (Cox), classification, and regression.
- Quickstart tutorials
- Model input
- Features
- Installation
- Result Interpretation
- Hyperparameters
- Support
- License
- 1. Example for integrate single-cell RNA-seq data of melanoma and response information.
- 2. Example for integrate spatial transcriptomics (ST) and bulk transcriptomics data to identify phenotype-associated spots and determine significant clusters.
- 3. A comprehensive downstream analysis of the spatial transcriptome: We present a series of downstream analyses based on TiRank results, demonstrating that TiRank can facilitate the identification and characterization of spatial structures associated with specific clinical phenotypes in spatial transcriptome studies.
- Spatial transcriptome data or single-cell data that we want to characterize.
- Bulk transcriptomics data: Expression profiles and pre-processed clinical information files. The format of the pre-processed clinical information file should align with our sample data. You can download the sample data from Google Drive.
- Integration of Bulk and Single-cell Data: Seamlessly integrates bulk RNA-seq data with single-cell or spatial transcriptomics data.
- Multiple Analysis Modes: Supports Cox survival analysis, classification, and regression modes.
- Visualization Tools: Provides functions for visualizing results, including UMAP plots and spatial maps.
- Customizable Hyperparameters: Offers flexibility in tuning hyperparameters to optimize results.
TiRank can be installed using one of the following methods. We recommend creating a new conda environment for TiRank to ensure compatibility and isolation from other Python packages.
- Anaconda or Miniconda: For managing Python environments.
- Python 3.9: TiRank requires Python version 3.9.
-
Set up a new conda environment:
conda create -n TiRank python=3.9 -y conda activate TiRank
-
Clone the TiRank repository from GitHub:
git clone [email protected]:LenisLin/TiRank.git
-
Install TiRank via pip:
pip install TiRank
-
Install required dependencies:
TiRank depends on the timm==0.5.4
package from TransPath. Follow these steps to install it:
- Install the package:
pip install ./TiRank/timm-0.5.4.tar # Replace with your actual path
- Reference Link.
-
Prepare Example Data:
- Download the example data from Google Drive
-
(Optional, for Spatial Transcriptomics): Download the pre-trained CTransPath model weights.
(Instructions to be provided)
-
Install TiRank: Follow the installation steps described in Method 1
-
Activate the Web Server:
- Navigate to the Web directory:
cd TiRank/Web
- Set up data directories:
mkdir data
- (Optional, for Spatial Transcriptomics)Inside the
data
directory, create anpretrainModel
folder and download the pre-trained model weights:
cd data mkdir pretrainModel
Download the pre-trained CTransPath model weights into the
pretrainModel
directory.- Inside the
data
directory, create anExampleData
folder and download the sample data:
cd ../ mkdir ExampleData
Download the sample data from Google Drive into the
ExampleData
directory.- Return to the
Web
directory:
cd ../ cd ../
- Verify the directory structure:
Web/ ├── assets/ ├── components/ ├── img/ ├── layout/ ├── data/ │ ├── pretrainModel │ │ ├── ctranspath.pth │ ├── ExampleData │ │ ├── CRC_ST_Prog/ │ │ └── SKCM_SC_Res/ ├── tiRankWeb/ └── app.py
- Run the web application:
python app.py
Note: If you encounter any issues with image loading, ensure that you are running the program from the Web
directory.
For more tutorials on using the web interface, please refer to the "Tutorials" section within the web application.
Please choose the installation method that best suits your setup. If you encounter any issues, feel free to open an issue on the TiRank GitHub Issues page.
After running TiRank, you can find the results in the savePath/3_Analysis/
directory. The key output file is spot_predict_score.csv
, where the Rank_Label
column represents the TiRank prediction results.
-
For
Cox
mode:Rank+
spots are associated with worse survival.Rank-
spots are associated with better survival.
-
For
Classification
mode:Rank+
spots are associated with the phenotype of the group encoded as1
.Rank-
spots are associated with the phenotype of the group encoded as0
.
-
For
Regression
mode:Rank+
spots are associated with high phenotype label scores.Rank-
spots are associated with low phenotype label scores.- For example, if the input is the IC50 values of different cell lines,
Rank+
spots are associated with drug resistance, andRank-
spots are associated with drug sensitivity.
TiRank provides several hyperparameters that can be adjusted to optimize the analysis. The first three hyperparameters are crucial for feature selection in bulk transcriptomics, while the latter three are used for training the multilayer perceptron network. TiRank automatically selects suitable combinations for the training hyperparameters within a predefined range.
-
top_var_genes
:- Description: The number of top variable genes to select from the bulk RNA-seq data.
- Default:
2000
- Recommendation: If you find that the number of filtered genes is low, consider increasing
top_var_genes
.
-
p_value_threshold
:- Description: The p-value threshold for selecting genes significantly associated with the phenotype.
- Default:
0.05
- Recommendation: If too few genes are selected, consider increasing
p_value_threshold
.
-
top_gene_pairs
:- Description: The number of top gene pairs to select based on variability.
- Default:
2000
-
alphas
:- Description: Weights of different components in the total loss computation.
- Details: Adjusts the influence of each loss component during training.
-
n_epochs
:- Description: The number of training epochs.
- Recommendation: Increase if the model has not converged.
-
lr
(Learning Rate):- Description: Controls the step size during parameter updates.
- Recommendation: A lower
lr
leads to slower but more stable convergence. A higherlr
may speed up convergence but can cause the model to overshoot optimal solutions.