Defects in the neuroendocrine axis cause global development delay in a Drosophila model of NGLY1 Deficiency, bioRxiv, 2018-01-02
Abstract: N-glycanase 1 (NGLY1) Deficiency is a rare monogenic multi-system disorder first described in 2014. NGLY1 is evolutionarily conserved in model organisms, including the Drosophila melanogaster NGLY1 homolog, Pngl. Here we conducted a natural history study and chemical-modifier screen on a new fly model of NGLY1 Deficiency engineered with a nonsense mutation in Pngl at codon 420, resulting in truncation of the C-terminal carbohydrate-binding PAW domain. Homozygous mutant animals exhibit global developmental delay, pupal lethality and small body size as adults. We developed a 96-well-plate, image-based, quantitative assay of Drosophila larval size for use in a screen of the 2,650-member Microsource Spectrum compound library of FDA-approved drugs, bioactive tool compounds, and natural products. We found that the cholesterol-derived ecdysteroid molting hormone 20-hydroxyecdysone (20E) rescued the global developmental delay in mutant homozygotes. Targeted expression of a human NGLY1 transgene to tissues involved in ecdysteroidogenesis, e.g., prothoracic gland, also rescues global developmental delay in mutant homozygotes. Finally, the proteasome inhibitor bortezomib is a potent enhancer of global developmental delay in our fly model, evidence of a defective proteasome “bounce-back” response that is also observed in nematode and cellular models of NGLY1 Deficiency. Together, these results demonstrate the therapeutic relevance of a new fly model of NGLY1 Deficiency for drug discovery, biomarker discovery, pharmacodynamics studies, and gene modifier screens.
biorxiv genetics 0-100-users 2018
Comparison of computational methods for imputing single-cell RNA-sequencing data, bioRxiv, 2018-01-01
Abstract: Single-cell RNA-sequencing (scRNA-seq) is a recent breakthrough technology that paves the way for measuring RNA levels at single-cell resolution to study precise biological functions. One of the main challenges when analyzing scRNA-seq data is the presence of zeros, or dropout events, which may mislead downstream analyses. To compensate for the dropout effect, several methods have been developed to impute gene expression since the first Bayesian method was proposed in 2016. However, these methods differ widely in their model hypotheses and imputation performance, so a large-scale comparison and evaluation is urgently needed. To this end, we compared eight imputation methods, evaluated their power in recovering original real data, and performed broad analyses to explore their effects on clustering cell types, detecting differentially expressed genes, and reconstructing lineage trajectories in the context of both simulated and real data. Simulated datasets and case studies highlight that no single method performs best in all situations. Some defects of these methods, such as limited scalability, limited robustness, and unavailability in some situations, need to be addressed in future studies.
biorxiv bioinformatics 0-100-users 2018
DeepGS: Predicting phenotypes from genotypes using Deep Learning, bioRxiv, 2018-01-01
Abstract: Motivation: Genomic selection (GS) is a new breeding strategy by which the phenotypes of quantitative traits are usually predicted based on genome-wide markers of genotypes using conventional statistical models. However, the GS prediction models typically make strong assumptions and perform linear regression analysis, limiting their accuracy since they do not capture the complex, non-linear relationships within genotypes, and between genotypes and phenotypes. Results: We present a deep learning method, named DeepGS, to predict phenotypes from genotypes. Using a deep convolutional neural network, DeepGS uses hidden variables that jointly represent features in genotypic markers when making predictions; it also employs convolution, sampling and dropout strategies to reduce the complexity of high-dimensional marker data. We used a large GS dataset to train DeepGS and compare its performance with other methods. In terms of mean normalized discounted cumulative gain value, DeepGS achieves an increase of 27.70% to 246.34% over a conventional neural network in selecting the top-ranked 1% of individuals with high phenotypic values for the eight tested traits. Additionally, compared with the widely used method RR-BLUP, DeepGS still yields a relative improvement ranging from 1.44% to 65.24%. Through extensive simulation experiments, we also demonstrated the effectiveness and robustness of DeepGS in the absence of outlier individuals and subsets of genotypic markers. Finally, we illustrated the complementarity of DeepGS and RR-BLUP with an ensemble learning approach for further improving prediction performance. Availability: DeepGS is provided as an open source R package available at https://github.com/cma2015/DeepGS.
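The ranking metric named in the DeepGS abstract, normalized discounted cumulative gain, can be illustrated with a minimal Python sketch. This assumes the common log2 position discount and non-negative phenotypic values; the exact definition used by DeepGS may differ, and the function name `ndcg_at_k` is hypothetical.

```python
import math


def ndcg_at_k(true_values, predicted_scores, k):
    """NDCG@k under the common log2 discount (an assumption, not
    necessarily the definition used in the DeepGS paper).

    true_values: observed phenotypic values (assumed non-negative).
    predicted_scores: model predictions used to rank individuals.
    k: number of top-ranked individuals to evaluate.
    """
    n = len(true_values)

    def dcg(order):
        # Gain of each selected individual, discounted by its rank position.
        return sum(
            true_values[i] / math.log2(pos + 2)
            for pos, i in enumerate(order[:k])
        )

    by_prediction = sorted(range(n), key=lambda i: predicted_scores[i], reverse=True)
    by_truth = sorted(range(n), key=lambda i: true_values[i], reverse=True)

    ideal = dcg(by_truth)
    return dcg(by_prediction) / ideal if ideal > 0 else 0.0
```

A value of 1.0 means the top-k individuals chosen by the model match the best possible selection; averaging this value over a range of k would give a mean NDCG.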
biorxiv bioinformatics 0-100-users 2018
Disequilibrium in Gender Ratios among Authors who Contributed Equally, bioRxiv, 2018-01-01
Abstract: In recent decades, the biomedical literature has witnessed an increasing number of authors per article together with a concomitant increase of authors claiming to have contributed equally. In this study, we analyzed over 3000 publications from 1995–2017 claiming equal contributions for authors sharing the first author position for author number, gender, and gender position. The frequency of dual pairings contributing equally was male-male > mixed gender > female-female. For mixed-gender pairs, males were more often at the first position, although the disparity has lessened in the past decade. Among author associations claiming equal contribution and containing three or more individuals, males predominated in both the first position and the number of gender-exclusive groupings. Our results show a disequilibrium in gender ratios among authors who contributed equally, relative to the ratios expected had the ordering been done randomly or alphabetically. Given the importance of the first author position in assigning credit for a publication, the finding of fewer than expected females in associations involving shared contributions raises concerns about women not receiving their fair share of expected credit. The results suggest a need for journals to request clarity on the method used to decide author order among individuals claiming to have made equal contributions to a scientific publication.
biorxiv scientific-communication-and-education 100-200-users 2018
The Functional False Discovery Rate with Applications to Genomics, bioRxiv, 2017-12-31
Abstract: The false discovery rate measures the proportion of false discoveries among a set of hypothesis tests called significant. This quantity is typically estimated based on p-values or test statistics. In some scenarios, there is additional information available that may be used to more accurately estimate the false discovery rate. We develop a new framework for formulating and estimating false discovery rates and q-values when an additional piece of information, which we call an “informative variable”, is available. For a given test, the informative variable provides information about the prior probability that a null hypothesis is true or about the power of that particular test. The false discovery rate is then treated as a function of this informative variable. We consider two applications in genomics. Our first is a genetics of gene expression (eQTL) experiment in yeast where every pair of genetic marker and gene expression trait is tested for association. The informative variable in this case is the distance between each genetic marker and gene. Our second application is to detect differentially expressed genes in an RNA-seq study carried out in mice. The informative variable in this study is the per-gene read depth. The framework we develop is quite general, and it should be useful in a broad range of scientific applications.
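For context on how a false discovery rate is typically controlled from p-values alone, a minimal Python sketch of the standard Benjamini-Hochberg adjustment follows. This is not the functional FDR framework of the abstract above, which additionally conditions on an informative variable; it only illustrates the p-value-only baseline, and the function name `bh_adjust` is hypothetical.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order.

    This is the classical p-value-only procedure, shown as a baseline;
    it ignores any informative variable.
    """
    m = len(pvalues)
    # Indices of the tests, ordered from smallest to largest p-value.
    order = sorted(range(m), key=lambda i: pvalues[i])

    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

Calling a test significant when its adjusted value is at most 0.05 controls the expected proportion of false discoveries at 5% under the procedure's usual assumptions.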
biorxiv genomics 0-100-users 2017
A genome-wide association study for shared risk across major psychiatric disorders in a nation-wide birth cohort implicates fetal neurodevelopment as a key mediator, bioRxiv, 2017-12-30
Abstract: There is mounting evidence that seemingly diverse psychiatric disorders share genetic etiology, but the biological substrates mediating this overlap are not well characterized. Here, we leverage the unique iPSYCH study, a nationally representative cohort ascertained through clinical psychiatric diagnoses indicated in Danish national health registers. We confirm previous reports of individual and cross-disorder SNP-heritability for major psychiatric disorders and perform a cross-disorder genome-wide association study. We identify four novel genome-wide significant loci encompassing variants predicted to regulate genes expressed in radial glia and interneurons in the developing neocortex during midgestation. This epoch is supported by partitioning cross-disorder SNP-heritability, which is enriched at regulatory chromatin active during fetal neurodevelopment. These findings indicate that dysregulation of genes that direct neurodevelopment by common genetic variants results in general liability for many later psychiatric outcomes.
biorxiv genetics 0-100-users 2017