# Supplementary workflow details for Florio et al. 2018

This repo contains scripts used to validate the genomic qPCR data from the paper:

**Evolution and cell-type specificity of human-specific genes preferentially expressed in progenitors of fetal neocortex**<br>
Marta Florio, Michael Heide, Anneline Pinson, Holger Brandl, Mareike Albert, Sylke Winkler, Pauline Wimberger, Wieland B. Huttner and Michael Hiller<br>

## Materials & Methods summary taken from the publication:

Paired-end data were trimmed using cutadapt (v1.15; `-m 20 -q 25 -a file:${Ill_ADAPTERS} -A file:${Ill_ADAPTERS}`) and mapped with STAR (v2.5.2b; `--alignSJoverhangMin 100 --outFilterType BySJout --sjdbGTFfile ${gtfFile}`). bedtools intersect (v2.25.0) was used to determine the number of overlapping alignments at each locus of interest, and samtools flagstat was used to determine the library size. Final data integration and visualization were implemented in R.
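
As a rough illustration of the counting and library-size steps described above, the following sketch wraps them in two small shell helpers. The function names and the file arguments are hypothetical, the use of `bedtools intersect -c` with a BED file of loci is an assumption, and taking the QC-passed total from the first line of the `samtools flagstat` report is only one possible reading of "library size". The actual commands are in `quantify_offtargets.sh`.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and file arguments are hypothetical.

# Print the QC-passed total read count from a `samtools flagstat` report
# read on stdin. The first report line has the form
#   "<passed> + <failed> in total (QC-passed reads + QC-failed reads)".
library_size() {
    awk 'NR == 1 && /in total/ { print $1; found = 1 } END { exit !found }'
}

# Append, to each locus in a BED file ($1), the number of alignments in a
# BAM file ($2) that overlap it. Assumes bedtools is on the PATH.
count_locus_overlaps() {
    bedtools intersect -c -a "$1" -b "$2"
}
```

A call such as `samtools flagstat sample.bam | library_size` would then print a single number, where `sample.bam` stands for any aligned sample.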

We share these computational protocols in the spirit of open data and reproducible research. Feel free to provide comments, report errors, or suggest improvements.

## Workflow

[`quantify_offtargets.sh`](./quantify_offtargets.sh) contains all steps performed:

1. Data download, QC, and trimming
2. Alignment and locus count intersection
3. Off-target ratio calculation and reporting

The region model used is also included as [ortho_model_hsap_ppan_ptro.fixed.txt](./ortho_model_hsap_ppan_ptro.fixed.txt)
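
To make the third workflow step more concrete, here is a minimal sketch of how per-locus counts might be related to library size. This README does not define the off-target ratio, so expressing each locus count as a fraction of the library is an assumption made for illustration only; the function name and the two-column input layout are likewise hypothetical. The definition actually used is in `quantify_offtargets.sh`.

```shell
#!/usr/bin/env bash
# Sketch only: the ratio definition and input layout are assumptions.

# stdin:  tab-separated "locus<TAB>count" lines (hypothetical layout)
# $1:     library size, e.g. the total reported by samtools flagstat
# stdout: "locus<TAB>count<TAB>fraction-of-library" lines
locus_fractions() {
    awk -F'\t' -v n="$1" '
        n > 0 { printf "%s\t%d\t%.6f\n", $1, $2, $2 / n }
    '
}
```

For example, `printf 'locusA\t50\n' | locus_fractions 1000` prints the locus name, its count, and the fraction `0.050000`. Nothing is printed when the library size is zero, which avoids a division by zero.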

## Usage & Disclaimer

The scripts are provided as is under the [MIT License](https://opensource.org/licenses/MIT), without the intention that an interested reader will be able to run them directly. We share our protocol here, not a ready-to-use tool. Still, we think they should provide enough technical detail to allow our analysis to be replicated.

This repo does not detail the steps needed to prepare a Linux environment with the listed tools and dependencies. Required tools include recent versions of R, samtools, STAR, and bedtools, as well as:

* https://git.mpi-cbg.de/bioinfo/ngs_tools, which is a collection of deep-sequencing utilities. The version used to build the data was `6747f0add32ba9bc41a3cd04de72dad69afdbb6d`
* https://github.com/holgerbrandl/joblist, which is an HPC task manager
* [rend.R](https://github.com/holgerbrandl/datautils/tree/master/tools/rendr), which is a wrapper around `knitr`

## Support

Feel free to contact [us](mailto:bioinformatics@mpi-cbg.de) or any of the other authors listed in the original publication if you have any questions.