ngs_tools issues
https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues
Updated: 2020-10-13T06:42:05Z

gsea: improvements
https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/97
Updated: 2020-10-13T06:42:05Z
Author: domingue

A few things to add:
- [x] include a table with the number of Entrez IDs per gene set at the beginning of the report. This would also help to see directly why some of the gene sets were not tested (i.e. because of too few genes) and could help to adjust the settings accordingly.
- [x] add how many genes we miss because no corresponding Entrez ID was found. We did something similar in the `cp_enrichment.R` script, where we give the percentage of ‘lost’ genes.
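
A minimal sketch of the two summaries described above, written in Python purely for illustration. The function names, the dict-based inputs and the threshold values are assumptions made for this example; they are not taken from the gsea script or from `cp_enrichment.R`.

```python
def gene_set_summary(gene_sets, min_size, max_size):
    """Return one row per gene set: name, number of Entrez IDs, tested or not.

    gene_sets maps a gene set name to a list of Entrez IDs (hypothetical input).
    """
    rows = []
    for name, entrez_ids in sorted(gene_sets.items()):
        n = len(set(entrez_ids))  # count each Entrez ID only once
        rows.append({
            "gene_set": name,
            "n_entrez": n,
            # a set is only tested if its size lies within both thresholds
            "tested": min_size <= n <= max_size,
        })
    return rows


def percent_lost(genes, gene_to_entrez):
    """Percentage of input genes without a corresponding Entrez ID."""
    if not genes:
        return 0.0
    lost = [g for g in genes if not gene_to_entrez.get(g)]
    return 100.0 * len(lost) / len(genes)


# Hypothetical toy data, only to show the shape of the output
gene_sets = {"SET_A": ["1", "2", "3"], "SET_B": ["4"]}
summary = gene_set_summary(gene_sets, min_size=2, max_size=10)
lost = percent_lost(["g1", "g2", "g3", "g4"], {"g1": "1", "g2": "2", "g3": None})

for row in summary:
    print(row["gene_set"], row["n_entrez"], "tested" if row["tested"] else "skipped")
print(f"genes without Entrez ID: {lost:.1f}%")
```

Reporting the `tested` flag next to the count makes it visible at a glance which gene sets fall outside the size thresholds.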
Some issues to fix:
- [ ] when a single gene list is analysed and it has more genes than `--maxSize` or fewer genes than `--minSize`, the script fails without a meaningful error. Add an error message so that it fails gracefully.
- [ ] related: add a table with the gene lists analysed, the number of genes per list, and whether they pass the thresholds.

Assignee: domingue

Stranded counts
https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/88
Updated: 2020-07-13T10:52:16Z
Author: domingue

Change the function `dge_star_counts2matrix` to extract read counts based on the library stranding.

Assignee: domingue

dge_workflow: report number of reads per gene
https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/51
Updated: 2020-06-30T12:11:08Z
Author: herseman

STAR only reports reads mapping to exons; however, it would also be useful to have the total number of reads mapping per gene, as this would indicate whether a gene is genuinely lowly expressed (only a few reads, but more or less all map to exons) or whether the low count is only background noise (many, if not most, of the reads map to non-exonic regions).

Assignees: herseman, domingue

ms_workflow: move to peptide level
https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/45
Updated: 2019-12-10T08:27:03Z
Author: herseman

try and verify the following:
- apply filter for label-free quantification to get rid of mis-cleaved and modified peptides
- use only proteotypic peptides (matching only 1 protein in the set)
- use only MS/MS and by_matching and only the top 3 hits

Assignees: herseman, domingue

ms_workflow: handling of technical replicates
https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/40
Updated: 2019-12-09T16:32:07Z
Author: herseman

So far technical replicates are not taken into account, and we only process MaxQuant outputs with a single technical replicate. It would be nice to integrate the information on technical variability, but this should of course be verified with an appropriate test data set. For now we should at least modify our pipeline so that it averages technical replicates as part of the workflow and reports the technical variation.

Assignees: herseman, domingue

ms_workflow: improve the results heatmap
https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/66
Updated: 2019-08-23T13:54:28Z
Author: domingue

Suggested by Olya:
- [x] add gene names to the rows
- [x] include plots as long as there is _any_ hit
- [ ] move legend to the bottom
Some of the text accompanying the plots is now showing.

Assignee: domingue

dge workflow: add GTF as an argument for differential gene expression
https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/57
Updated: 2019-06-28T17:51:47Z
Author: domingue

Relates to the [DGE analysis](https://git.mpi-cbg.de/bioinfo/ngs_tools/blob/master/dge_workflow/featcounts_deseq_mf.R) and it would have two uses:
1. retrieval of "accurate" gene lengths, exonic regions only, to calculate RPKM and FPM using `DESeq2` in-built functionality (more details [here](https://www.rdocumentation.org/packages/DESeq2/versions/1.12.3/topics/fpkm))
2. GTFs already contain a wealth of information which currently needs to be retrieved with `biomaRt`. Getting it from the GTF would make the process faster and more reproducible (in my experience `biomaRt` changes quite often), and it would work even for organisms not present in biomart or other marts (e.g. planaria).

Assignees: herseman, domingue

general: create templates for R scripts and shinyApps and revise the dge_star_template
https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/50
Updated: 2019-05-14T14:13:16Z
Author: herseman

- create templates to facilitate uniform script layout and the reporting of important bits and pieces, e.g. session information and session file

Assignee: herseman

sc_workflow: write single script for Seurat workflow with tested and approved settings
https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/35
Updated: 2019-03-22T12:26:06Z
Author: herseman
Assignee: herseman

general: create customized docker files for individual analyses done with our ngs_tools scripts
https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/39
Updated: 2019-03-14T17:34:12Z
Author: herseman

Deviating R and R package versions can have a severe impact on the results when trying to reproduce published data. To ensure reproducibility, it may be useful either to provide customized docker files along with the published data or to develop a script/tool to create those docker files. Information on the R and R package versions can be found in the .sessionInfo.txt files, and it is also planned to provide copies of the ngs_tools scripts actually used along with the data (see #38), so that we 'only' have to extract and combine all necessary information.

Assignee: herseman

ms_workflow: report number of peptides (unique and razor peptides) for each protein group
https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/31
Updated: 2019-02-08T14:31:53Z
Author: herseman
Assignee: herseman

ms_workflow: aggregate protein isoforms data on gene level
https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/32
Updated: 2019-02-08T14:31:39Z
Author: herseman

- extract data on gene IDs and uniprot IDs from Ensembl
- check correlation of ensembl and uniprot IDs
- check if it's possible to further aggregate protein groups based on gene level

Assignee: herseman

DESeq2 check current clustering method
https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/24
Updated: 2018-11-29T15:23:53Z
Author: herseman

Does our clustering method correspond to the one in the current DESeq2 vignette?

report proportion of multi-mappers in STAR report
https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/20
Updated: 2017-11-16T16:58:57Z
Author: herseman

also explain alignment efficiency in report

run kallisto parallel to STAR
https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/21
Updated: 2017-11-09T15:29:31Z
Author: herseman

Compare results:
- write a report with abundance estimates correlations
- report problematic genes
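
A rough sketch of how such a comparison could look, in Python for illustration only. It assumes both tools' results have already been summarised to gene level and expressed in the same unit (for example TPM). The function names, the pseudocount of 1 and the log2 difference cutoff of 2 are hypothetical choices, not part of any existing ngs_tools script.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long lists of numbers."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (sd_x * sd_y)


def compare_abundances(star, kallisto, max_abs_log2_diff=2.0):
    """Correlate log2 abundances and flag genes that disagree strongly.

    star and kallisto map gene IDs to abundances in the same unit
    (an assumption of this sketch). Only genes present in both are compared.
    """
    genes = sorted(star.keys() & kallisto.keys())
    log_star = [math.log2(star[g] + 1) for g in genes]      # pseudocount of 1
    log_kal = [math.log2(kallisto[g] + 1) for g in genes]
    r = pearson(log_star, log_kal)
    flagged = [
        (g, k - s)
        for g, s, k in zip(genes, log_star, log_kal)
        if abs(k - s) > max_abs_log2_diff
    ]
    return r, flagged


# Hypothetical toy data
star = {"geneA": 100.0, "geneB": 50.0, "geneC": 0.0, "geneD": 10.0}
kallisto = {"geneA": 110.0, "geneB": 45.0, "geneC": 31.0, "geneD": 12.0}
r, flagged = compare_abundances(star, kallisto)

print(f"Pearson r on log2 abundances: {r:.3f}")
for gene, diff in flagged:
    print(f"{gene}: log2 difference (kallisto - STAR) = {diff:+.2f}")
```

Working on log-transformed values keeps a handful of highly expressed genes from dominating the correlation; a rank-based correlation would be a reasonable alternative.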