Update README

parent f9ce38a0
# STR detection pipeline
# GAD STR expansion detection pipeline
- ASDP PIPELINE
- Author: Anne-Sophie Denommé-Pichon
- Version: 0.0.1
- Licence: AGPLv3
- Description: How to launch the scripts to get STR genotypes from genomes at all tested loci
- Description: This pipeline genotypes STRs at the specified loci in short-read genomes using ExpansionHunter, Tredparse and GangSTR. It compiles the genotypes called by these tools and identifies STR expansions using three outlier detection methods that highlight abnormal repeat counts.
1. Fill in the configuration file `config.sh`.
2. Create `samples.list` (BAM file names without the `.bam` extension).
## Setup
- Fill in the configuration file `config.sh`. An example is provided in the repository.
- Create `samples.list` (BAM file names without the `.bam` extension). An example is provided in the repository.
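
As an illustration of the second step, `samples.list` could be generated from a directory of BAM files. This is a minimal sketch, not part of the pipeline: the helper name `write_samples_list` is hypothetical, and the one-name-per-line layout is an assumption that should be checked against the example file in the repository.

```python
from pathlib import Path


def write_samples_list(bam_dir, out_path="samples.list"):
    """Write BAM file names, stripped of '.bam', to out_path.

    Assumption: samples.list holds one sample name per line.
    """
    # Path.stem drops only the final suffix, so "a.sorted.bam" becomes "a.sorted".
    # The "*.bam" pattern does not match index files such as "a.bam.bai".
    names = sorted(p.stem for p in Path(bam_dir).glob("*.bam"))
    Path(out_path).write_text("".join(name + "\n" for name in names))
    return names
```

Calling `write_samples_list("/path/to/bams")` (a placeholder path) would write `samples.list` to the current directory and return the sorted list of sample names.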
## Usage
For now, the scripts must be launched from the directory where the repository was cloned.
3. Launch `launch_pipeline.sh`: `nohup ./launch_pipeline.sh samples.list &`. Dependencies:
- `config.sh`
- `samples.list`
- `pipeline.sh`
- `wrapper_delete.sh`
- `wrapper_ehdn.sh`
- `wrapper_expansionhunter.sh`
- `wrapper_gangstr.sh`
- `wrapper_transfer.sh`
- `wrapper_tredparse.sh`
4. Optional: launch `launch_pipeline_ehdn_outlier.sh`: `nohup ./launch_pipeline_ehdn_outlier.sh samples.list &`. Dependencies:
- `config.sh`
- `samples.list`
- `pipeline_ehdn_outlier.sh`
- `wrapper_ehdn_outlier.sh`
5. Launch `launch_results.sh`: `nohup ./launch_results.sh samples.list &`. Dependencies:
- `config.sh`
- `samples.list`
- `patho.csv`
- `getResults.py`
- `launch_str_outliers.sh`
- `str_outliers.py`
6. Optional: launch `launch_str_plotly.sh`.
7. Retrieve the result files (e.g. `scp 'username@ssh-ccub.u-bourgogne.fr:/work/gad/shared/analyse/STR/results/*' .`)
### Calling STRs
Launch `launch_pipeline.sh`:
```sh
nohup ./launch_pipeline.sh samples.list &
```
Dependencies:
- `config.sh`
- `samples.list`
- `pipeline.sh`
- `wrapper_delete.sh`
- `wrapper_ehdn.sh`
- `wrapper_expansionhunter.sh`
- `wrapper_gangstr.sh`
- `wrapper_transfer.sh`
- `wrapper_tredparse.sh`
### Identifying outliers
To highlight abnormal repeat counts, the pipeline identifies outliers using three methods. A repeat count at a given locus is flagged when it is:
1. above the normal range (in the gray zone or pathological zone),
2. above the 99th percentile, or
3. at least 4 standard deviations above the mean (Z-score ≥ 4).
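
The three criteria above can be sketched in a few lines of Python. This is an illustration only, not the code in `str_outliers.py`: the function name `flag_outliers`, the `normal_max` threshold argument, the use of the population standard deviation, and the inclusive percentile interpolation are all assumptions.

```python
import statistics


def flag_outliers(counts, normal_max, z_cutoff=4.0):
    """Report which of the three criteria each sample meets at one locus.

    counts: dict mapping sample name -> repeat count (at least 2 samples)
    normal_max: largest repeat count still considered normal (hypothetical input)
    """
    values = list(counts.values())
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # assumption: population standard deviation
    # quantiles(n=100) returns 99 cut points; index 98 is the 99th percentile.
    p99 = statistics.quantiles(values, n=100, method="inclusive")[98]
    flags = {}
    for sample, count in counts.items():
        # With no spread in the cohort, no sample can be a Z-score outlier.
        z = (count - mean) / sd if sd > 0 else 0.0
        flags[sample] = {
            "above_normal": count > normal_max,
            "above_p99": count > p99,
            "z_score_outlier": z >= z_cutoff,
        }
    return flags
```

Note that the percentile and Z-score criteria depend on the cohort: in a small cohort a single expanded sample inflates the mean and standard deviation, so a Z-score of 4 may be unreachable.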
Launch `launch_results.sh`:
```sh
nohup ./launch_results.sh samples.list &
```
Dependencies:
- `config.sh`
- `samples.list`
- `patho.csv`
- `getResults.py`
- `launch_str_outliers.sh`
- `str_outliers.py`
## Future work
Another tool, ExpansionHunter DeNovo, will be added to the pipeline.