ngs_tools issues
https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues
2020-03-17T15:07:56Z

https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/86
ms_workflow: ms_ms_prop and reorder information are missing for protein IDs without fasta_header information
2020-03-17T15:07:56Z herseman
herseman, herseman

https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/83
ms_workflow: refine reorder information of protein groups
2020-02-18T14:07:47Z herseman
Currently, we only report whether a protein group was reordered prior to merging of the tables. This does not distinguish two cases: either the (alphabetical) reordering took place in all samples, so that the protein groups, although reordered, were originally all the same; or the protein groups of individual samples were merged only because they could be matched after reordering, although they differed in their original protein ID order.
herseman, herseman

https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/79
updating ms_workflow
2020-07-13T09:28:16Z herseman
herseman, domingue, herseman

https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/73
Implement renvs
2020-02-20T14:49:29Z domingue
It started with packrat in https://git.mpi-cbg.de/bioinfo/ngs_tools/issues/53# and after some testing I decided that it was worth adding as a function to our workflow. Still under testing.
domingue, domingue

https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/69
ms_workflow: bug in heatmap clustering
2019-08-28T09:04:01Z domingue
We calculate the Euclidean distances outside the plotting function and feed them in as the input matrix, but `d3heatmap` (and `pheatmap`, for that matter) also calculates distances internally, leading to overclustering. Fix it.
Lines:
https://git.mpi-cbg.de/bioinfo/ngs_tools/blob/master/ms_workflow/02-ms-DEP-analysis.R#L482
domingue, domingue

https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/68
ms_workflow: Add table with % of contaminants in the most abundant proteins
2019-08-26T08:24:52Z domingue
This can be useful to exclude situations in which the bulk of protein intensity comes from "contaminants". Since the contaminants are, to some extent, defined by MaxQuant, it could be that they are in fact of interest for the biological process being studied. We will leave the decision on how to handle contaminants to the project owner.
domingue, domingue

https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/66
ms_workflow: improve the results heatmap
2019-08-23T13:54:28Z domingue
Suggested by Olya:
- [x] add gene names to the rows
- [x] include plots as long as there is _any_ hit
- [ ] move legend to the bottom
Some of the text accompanying the plots is now showing.
domingue, domingue

https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/65
DEP installation on falcon - nc-config missing
2019-09-17T07:17:00Z domingue
I was trying to install an R/BioC package on falcon r/3.5.1, and ran into a dependency issue:
```R
* installing *source* package ‘ncdf4’ ...
** package ‘ncdf4’ successfully unpacked and MD5 sums checked
configure.ac: starting
checking for nc-config... no
-----------------------------------------------------------------------------------
Error, nc-config not found or not executable. This is a script that comes with the
netcdf library, version 4.1-beta2 or later, and must be present for configuration
to succeed.
If you installed the netcdf library (and nc-config) in a standard location, nc-config
should be found automatically. Otherwise, you can specify the full path and name of
the nc-config script by passing the --with-nc-config=/full/path/nc-config argument
flag to the configure script. For example:
./configure --with-nc-config=/sw/dist/netcdf4/bin/nc-config
Special note for R users:
-------------------------
To pass the configure flag to R, use something like this:
R CMD INSTALL --configure-args="--with-nc-config=/home/joe/bin/nc-config" ncdf4
where you should replace /home/joe/bin etc. with the location where you have
installed the nc-config script that came with the netcdf 4 distribution.
-----------------------------------------------------------------------------------
ERROR: configuration failed for package ‘ncdf4’
```
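The log's own suggestion is to pass the location of `nc-config` as a configure argument. Below is a minimal sketch of doing that from within R. It assumes `nc-config` already exists somewhere on the system (for example through an environment module, which is an assumption here, not something stated above). `nc_config_arg` is a hypothetical helper, and the actual install call is left commented out.

```R
# Hypothetical helper: build the configure flag named in the ncdf4 error message.
nc_config_arg <- function(nc_config_path) {
  sprintf("--with-nc-config=%s", nc_config_path)
}

# Look for nc-config on the PATH; Sys.which() returns "" when it is not found.
nc_config <- unname(Sys.which("nc-config"))

if (nzchar(nc_config)) {
  message("nc-config found at: ", nc_config)
  # Source install with the flag passed through to ./configure:
  # install.packages("ncdf4", configure.args = nc_config_arg(nc_config))
} else {
  message("nc-config is not on the PATH; the netcdf library has to be provided first.")
}
```

If `nc-config` is not available at all, this does not help, and the library itself has to be installed by someone with the necessary permissions.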
I had the same issue when installing this package on my own computer, also running Linux, and if I remember correctly the solution was to install some missing libraries with sudo. I don’t have sudo on falcon, so I contacted HPC support to get it fixed.
domingue, domingue

https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/64
ms_workflow: QC improvements
2019-08-13T13:55:45Z domingue
During a meeting with Olya she mentioned that our QC looked very much like that of a package created by the Kempa lab. There is an `R` package and a [paper](https://doi.org/10.1021/acs.jproteome.5b00780) to go with it, which I will peruse to see if there is something we could use for our workflow.
domingue, domingue

https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/63
ms_workflow: add more explanations
2019-08-09T08:21:02Z domingue
Currently the text explanations are very sparse. Things to add:
- what are uniq and prop proteins
- LFQ vs raw
domingue, domingue

https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/62
ms_workflow: replace PCA with plotly
2019-08-09T14:40:25Z domingue
Currently, using colour and shape to define condition / replicate is neither visually pleasant nor readable. I was using `DEP::plot_pca`, but I will replace it with plotly.
domingue, domingue

https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/61
ms_workflow: add % of NAs to results table
2019-08-09T09:58:43Z domingue
@herseman suggested that it is more informative to know the % of samples from which a protein is missing, and to have this information for both conditions being compared, since it could indicate major differences between the conditions.
domingue, domingue

https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/60
ms_workflow: add intensities to results table
2019-08-09T09:35:40Z domingue
Somehow the protein intensities for each sample (and the average for each condition) are missing - fix this.
domingue, domingue

https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/41
ms_workflow: ms_limma.R returns error when no differentially abundant proteins were found
2019-02-14T14:15:01Z herseman
herseman, herseman

https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/40
ms_workflow: handling of technical replicates
2019-12-09T16:32:07Z herseman
So far technical replicates are not taken into account, and we only process MaxQuant outputs with a single technical replicate. It would be nice to integrate the information on technical variability, but this should of course be verified with an appropriate test data set. For now we should at least modify our pipeline so that it averages technical replicates as part of the workflow and reports the technical variation.
herseman, domingue, herseman

https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/37
ms_workflow: handling of contaminations
2019-12-10T08:11:26Z herseman
- REV__ entries are nonsense sequences (there was a match against the reverse of an entry of the database of interest); REV__ entries can be removed right at the beginning of the script
- think about removal of keratin as a standard set-up (maybe Anna can provide us with a list of most common contaminations and we can start by reporting those specifically)
herseman, domingue, herseman

https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/36
ms_workflow: add additional quality metrics to the differential abundance analysis output
2019-03-08T13:24:24Z herseman
- take the intensities of the standard as a minimal threshold to mark low-abundant proteins
- summarize MS/MS information as numeric value per gene and condition (e.g. percentage of MS/MS identifications per gene across all replicates); additionally include some summary of identification types per sample in the ms_data_prep.R script
- report for which entries the proteinGroups order has been changed by sorting
- add information per gene whether the value for any replicate per condition has been imputed
herseman, herseman

https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/33
ms_workflow: report settings from MaxQuant log file
2019-02-12T10:40:41Z herseman
herseman, herseman

https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/32
ms_workflow: aggregate protein isoforms data on gene level
2019-02-08T14:31:39Z herseman
- extract data on gene IDs and UniProt IDs from Ensembl
- check correlation of Ensembl and UniProt IDs
- check if it's possible to further aggregate protein groups based on gene level
herseman, herseman

https://git.mpi-cbg.de/bioinfo/ngs_tools/-/issues/31
ms_workflow: report number of peptides (unique and razor peptides) for each protein group
2019-02-08T14:31:53Z herseman
herseman, herseman
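
For the gene-level aggregation idea in issue 32, the check could start as in the sketch below. This is only an illustration, not the workflow's actual code. The mapping table, the IDs and the helper `genes_for_group` are all hypothetical, and a real mapping would come from an Ensembl export. The sketch maps each protein group (semicolon-separated UniProt IDs) to its set of genes and flags groups that resolve to exactly one gene.

```R
# Hypothetical UniProt -> gene mapping, standing in for a table exported from Ensembl.
id_map <- data.frame(
  uniprot = c("P00001", "P00002", "P00003", "P00004"),
  gene    = c("GENE_A", "GENE_A", "GENE_B", "GENE_C"),
  stringsAsFactors = FALSE
)

# Hypothetical protein groups: semicolon-separated UniProt IDs.
protein_groups <- c("P00001;P00002", "P00002", "P00003;P00004", "P99999")

# Return the sorted, unique gene names behind one protein group.
# IDs without a mapping give NA, which sort() drops by default.
genes_for_group <- function(group, map) {
  ids <- strsplit(group, ";", fixed = TRUE)[[1]]
  sort(unique(map$gene[match(ids, map$uniprot)]))
}

gene_sets <- lapply(protein_groups, genes_for_group, map = id_map)

summary_tbl <- data.frame(
  protein_group = protein_groups,
  genes         = vapply(gene_sets, paste, character(1), collapse = ";"),
  n_genes       = lengths(gene_sets),
  stringsAsFactors = FALSE
)

# Groups that map to exactly one gene are candidates for gene-level aggregation;
# candidates sharing the same gene could be merged.
candidates <- summary_tbl[summary_tbl$n_genes == 1, ]
print(summary_tbl)
print(table(candidates$genes))
```

Groups that map to several genes, or to none, are left out of `candidates` here. How to treat them is a separate decision that the sketch does not make.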