# DeepCAT
**Repository Path**: lomo0625/DeepCAT
## Basic Information
- **Project Name**: DeepCAT
- **Description**: Deep Learning Method to Identify Cancer Associated TCRs
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No
## Statistics
- **Stars**: 0
- **Forks**: 0
- **Created**: 2019-12-31
- **Last Updated**: 2020-12-19
## README
# DeepCAT
### Deep CNN Model for Cancer Associated TCRs
DeepCAT is a computational method based on a convolutional neural network that exclusively identifies cancer-associated beta-chain TCR hypervariable CDR3 sequences. The input data were generated from tumor RNA-seq data and TCR repertoire sequencing data of healthy donors. Users do not need to perform training or evaluation; instead, they can apply the PredictCancer function in the package directly after downloading the CHKP folder.
### Standard pipeline of using DeepCAT
1. Clone the GitHub repository into a folder of your choice on your machine.
In a terminal:
```bash
git clone https://github.com/s175573/DeepCAT.git
```
2. Go to the DeepCAT folder and unzip the DeepCAT_CHKP.zip file, which contains the pre-trained model
```bash
cd DeepCAT
unzip DeepCAT_CHKP.zip
```
3. Running DeepCAT requires Python 3 and the biopython, tensorflow (version 1.14) and matplotlib packages. Python 3 itself is not installed through pip and must already be present. If the packages are not installed on your machine, run:
```bash
pip install biopython tensorflow==1.14 matplotlib
```
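Before running the pipeline, it can help to confirm that the required packages are visible to the Python interpreter. The following is a minimal sketch, not part of DeepCAT: the helper name `find_missing` and the `REQUIRED_MODULES` table are hypothetical, and the only assumption beyond the list above is that biopython is imported under the module name `Bio`.

```python
import importlib.util
import sys

# Hypothetical lookup table: pip package name -> module name used in "import".
# biopython is imported as "Bio"; the other two share their pip name.
REQUIRED_MODULES = {
    "biopython": "Bio",
    "tensorflow": "tensorflow",
    "matplotlib": "matplotlib",
}


def find_missing(required=REQUIRED_MODULES):
    """Return the pip names of packages whose module cannot be located."""
    return sorted(
        name
        for name, module in required.items()
        if importlib.util.find_spec(module) is None
    )


if __name__ == "__main__":
    print("Python version:", sys.version.split()[0])
    missing = find_missing()
    print("Missing packages:", ", ".join(missing) if missing else "none")
```

This only checks that each module can be found; it does not verify the installed tensorflow version.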
4. Now we are ready to run DeepCAT to predict cancer scores
***4A. The user does not have raw TCR repertoire sequencing data.***
In this case, use the data in the SampleData folder as example input.
This folder contains 4 files, all profiled by Adaptive Biotechnologies and available for download from immuneACCESS (https://clients.adaptivebiotech.com/immuneaccess).
Files 1 and 2 come from early-stage breast cancer patients; 3 and 4 from healthy donors.
To process the input files, call Script_DeepCAT.sh:
```bash
bash Script_DeepCAT.sh -t SampleData/Control
bash Script_DeepCAT.sh -t SampleData/Cancer
```
DeepCAT will output two files, Cancer_score_Control.txt and Cancer_score_Cancer.txt.
```bash
$ head Cancer_score_Control.txt
TestReal-HIP09051.tsv_ClusteredCDR3s_7.5.txt 0.22574785
TestReal-HIP09559.tsv_ClusteredCDR3s_7.5.txt 0.16407683
TestReal-HIP09364.tsv_ClusteredCDR3s_7.5.txt 0.21816333
TestReal-HIP09062.tsv_ClusteredCDR3s_7.5.txt 0.17059885
TestReal-HIP09190.tsv_ClusteredCDR3s_7.5.txt 0.17043449
TestReal-HIP09022.tsv_ClusteredCDR3s_7.5.txt 0.16097252
TestReal-HIP09029.tsv_ClusteredCDR3s_7.5.txt 0.172395
TestReal-HIP09159.tsv_ClusteredCDR3s_7.5.txt 0.17491624
TestReal-HIP09775.tsv_ClusteredCDR3s_7.5.txt 0.21496484
TestReal-HIP10377.tsv_ClusteredCDR3s_7.5.txt 0.19861585
```
Here the first column contains the name of the input file, and the second column is the mean cancer score over all sequences in that file.
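The two-column output can be loaded for further analysis with a few lines of Python. This is a minimal sketch under the assumption that the columns are separated by whitespace and that file names contain no spaces, as in the listing above; the function name `parse_cancer_scores` is hypothetical and not part of DeepCAT.

```python
def parse_cancer_scores(lines):
    """Map each input file name to its mean cancer score.

    Assumes two whitespace-separated columns per line: file name, score.
    """
    scores = {}
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        name, score = fields
        scores[name] = float(score)
    return scores


# Two lines copied from the example output above.
example = [
    "TestReal-HIP09051.tsv_ClusteredCDR3s_7.5.txt 0.22574785",
    "TestReal-HIP09559.tsv_ClusteredCDR3s_7.5.txt 0.16407683",
]
scores = parse_cancer_scores(example)
highest = max(scores, key=scores.get)  # sample with the highest mean score
print(highest, scores[highest])
```

To read a real output file, pass an open file handle instead of the example list, for instance `with open("Cancer_score_Control.txt") as fh: scores = parse_cancer_scores(fh)`.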
Let’s make boxplots of the cancer scores and an ROC curve for early-stage breast cancer patients (16 samples) and healthy donors (30 samples).