Philippine Garret committed
# Mitochondria pipeline README
2019-06-06


## Pre-requisites
This pipeline requires docker.
The GRCh38 reference file must be placed in </path/to/references/directory>, decompressed, indexed with BWA, and accompanied by a sequence dictionary generated with Picard tools. The GRCh38 genome version is mandatory (hg19 cannot be used). You can download it from http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz
The sample.R1.fastq.gz and sample.R2.fastq.gz files must be placed directly in </path/to/fastq/gz/samples/files/directory>, without subfolders. Each sample must be sequenced paired-end.
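
The reference preparation described above could be scripted roughly as follows. This is a minimal sketch, not the authors' procedure: it assumes `wget`, `gunzip`, `bwa` and a `picard` launcher are installed on the host, and the helper names `run` and `prepare_reference` as well as the `hg38.dict` output name are illustrative. Setting `DRY_RUN=1` only prints the commands.

```shell
# Sketch only: prepare the GRCh38 reference (assumed tools: wget, gunzip, bwa, picard).

# Print the command instead of running it when DRY_RUN is set.
run() {
    if [ -n "${DRY_RUN:-}" ]; then echo "$@"; else "$@"; fi
}

prepare_reference() {
    local ref_dir="$1"
    local url="http://hgdownload.soe.ucsc.edu/goldenPath/hg38/bigZips/hg38.fa.gz"
    # 1. Download the compressed FASTA into the references directory.
    run wget -P "$ref_dir" "$url" || return 1
    # 2. Decompress it to hg38.fa.
    run gunzip "$ref_dir/hg38.fa.gz" || return 1
    # 3. Build the BWA index files next to the FASTA.
    run bwa index "$ref_dir/hg38.fa" || return 1
    # 4. Generate the sequence dictionary (output name is an assumption).
    run picard CreateSequenceDictionary R="$ref_dir/hg38.fa" O="$ref_dir/hg38.dict"
}

# Usage (hypothetical directory): DRY_RUN=1 prepare_reference /data/docker-references
```

With a reference prepared this way, the file name to pass as REFNAME would be `hg38.fa`.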

## Hardware requirements
We strongly recommend a computer with at least:
- 16 GB of memory.
- 2 to 16 CPU cores.
- 30 GB of disk space.

## Download Pipeline
We provide a Docker version for local use. Download the Docker image pipelinemitov1.tar from http://gitlab.gad-bioinfo.org/gad-public/pipelinemito


## Load and run pipeline  
Commands:
	docker load -i pipelinemitov1.tar
	docker run  -v </path/to/fastq/gz/samples/files/directory>:/data:rw -v </path/to/references/directory>:/mitopipeline:ro --env THREAD=<int> --env REFNAME=<ref.fa> pipelinemitov1

Example: 
	docker load -i pipelinemitov1.tar
	docker run  -v /data/docker-data:/data:rw -v /data/docker-references:/mitopipeline:ro --env THREAD=2 --env REFNAME=grch38.fa pipelinemitov1
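
Before launching the container, it may help to check that the FASTQ directory meets the prerequisites (paired files, no subfolders). The function below is a hedged sketch: `check_fastq_pairs` is a hypothetical helper, not part of the pipeline, and it assumes files are named `<sample>.R1.fastq.gz` and `<sample>.R2.fastq.gz`.

```shell
# Sketch only: pre-flight check of the FASTQ directory (hypothetical helper).
check_fastq_pairs() {
    local dir="$1" status=0 found=0 fq mate
    # The pipeline expects a flat directory, so report any subfolder.
    if [ -n "$(find "$dir" -mindepth 1 -type d)" ]; then
        echo "subfolder found in $dir" >&2
        status=1
    fi
    for fq in "$dir"/*.R[12].fastq.gz; do
        [ -e "$fq" ] || continue    # the glob matched nothing
        found=1
        # Derive the expected name of the other read file of the pair.
        case "$fq" in
            *.R1.fastq.gz) mate="${fq%.R1.fastq.gz}.R2.fastq.gz" ;;
            *)             mate="${fq%.R2.fastq.gz}.R1.fastq.gz" ;;
        esac
        [ -e "$mate" ] || { echo "missing mate for $fq" >&2; status=1; }
    done
    [ "$found" -eq 1 ] || { echo "no FASTQ files in $dir" >&2; status=1; }
    return "$status"
}

# Usage with the paths from the example above:
#   check_fastq_pairs /data/docker-data && \
#     docker run -v /data/docker-data:/data:rw -v /data/docker-references:/mitopipeline:ro \
#       --env THREAD=2 --env REFNAME=grch38.fa pipelinemitov1
```

The function returns a non-zero status on the first kind of problem it can detect, so chaining it with `&&` keeps `docker run` from starting on an incomplete input directory.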


## Input File Format
The input format is fastq.gz. 


## Output File Formats
The output formats are TSV and VCF.
Each variant is annotated with:
- Chromosome
- Position
- Reference and alternative alleles
- Genomic change
- Filter flags: synonymous variant (SYN), frequent in GenBank (FM), frequent in our cohort (FB), haplogroup-defining (HPG), or not filtered (PASS)
- Genotype (0/0, 0/1, 1/1)
- GenBank frequency (0 to 100.0%)
- Haplogroups defined by the variant
- Heteroplasmy rate (0 to 100.0%)
- MitoTIP score when available (0 to 100.0%)

The TSV report contains all mitochondrial variants flagged "PASS".
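
The same "PASS" selection could be reproduced from the VCF output. The sketch below assumes the pipeline writes its filter flags to the standard seventh (FILTER) column of the VCF, which this README does not confirm; `pass_variants` and `sample.vcf` are hypothetical names.

```shell
# Sketch only: keep header lines and records whose FILTER column is exactly PASS.
# Assumption: filter flags are stored in the standard 7th VCF column.
pass_variants() {
    awk -F'\t' '/^#/ || $7 == "PASS"' "$1"
}

# Usage (hypothetical file name): pass_variants sample.vcf > sample.pass.vcf
```

Records carrying any other flag (SYN, FM, FB or HPG) are dropped, since the comparison requires the column to be exactly `PASS`.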


## Cite

## Contact