Characterization of prevalence and health consequences of uniparental disomy in four million individuals from the general population, bioRxiv, 2019-02-06
Meiotic nondisjunction and resulting aneuploidy can lead to severe health consequences in humans. Aneuploidy rescue can restore euploidy but may result in uniparental disomy (UPD), the inheritance of both homologs of a chromosome from one parent with no representative copy from the other. Current understanding of UPD is limited to ~3,300 cases for which UPD was associated with clinical presentation due to imprinting disorders or recessive diseases. Thus, the prevalence of UPD and its phenotypic consequences in the general population are unknown. We searched for instances of UPD in over four million consented research participants from the personal genetics company 23andMe, Inc., and 431,094 UK Biobank participants. Using computationally detected DNA segments identical-by-descent (IBD) and runs of homozygosity (ROH), we identified 675 instances of UPD across both databases. Here we present the first characterization of UPD prevalence in the general population, a machine-learning framework to detect UPD using ROH, and a novel association between autism and UPD of chromosome 22.
biorxiv genomics 0-100-users 2019
Comparative analysis of commercially available single-cell RNA sequencing platforms for their performance in complex human tissues, bioRxiv, 2019-02-06
The past five years have witnessed tremendous growth in single-cell RNA-seq methodologies. Currently, there are three major commercial platforms for single-cell RNA-seq: Fluidigm C1, Clontech iCell8 (formerly Wafergen), and 10x Genomics Chromium. Here, we provide a systematic comparison of the throughput, sensitivity, cost and other performance statistics for these three platforms using single cells from primary human islets. The primary human islets represent a complex biological system where multiple cell types coexist, with varying cellular abundance, diverse transcriptomic profiles and differing total RNA contents. We apply standard pipelines optimized for each system to derive gene expression matrices. We further evaluate the performance of each system by benchmarking single-cell data with bulk RNA-seq data from sorted cell fractions. Our analyses can be generalized to a variety of complex biological systems and serve as a guide to newcomers to the field of single-cell RNA-seq when selecting platforms.
biorxiv genomics 100-200-users 2019
Deep learning reveals cancer metastasis and therapeutic antibody targeting in whole body, bioRxiv, 2019-02-06
Reliable detection of disseminated tumor cells and of the biodistribution of tumor-targeting therapeutic antibodies within the entire body has long been needed to better understand and treat cancer metastasis. Here, we developed an integrated pipeline for automated quantification of cancer metastases and therapeutic antibody targeting, named DeepMACT. First, we enhanced the fluorescent signal of tumor cells more than 100-fold by applying the vDISCO method to image single cancer cells in intact transparent mice. Second, we developed deep learning algorithms for automated quantification of metastases with an accuracy matching human expert manual annotation. Deep learning-based quantifications in a model of spontaneous metastasis using human breast cancer cells allowed us to systematically analyze clinically relevant features such as size, shape, spatial distribution, and the degree to which metastases are targeted by a therapeutic monoclonal antibody in whole mice. DeepMACT can thus considerably improve the discovery of effective therapeutic strategies for metastatic cancer.
biorxiv cancer-biology 200-500-users 2019
Emergence of stable coexistence in a complex microbial community through metabolic cooperation and spatio-temporal niche partitioning, bioRxiv, 2019-02-06
Microbial communities in nature often feature complex compositional dynamics yet also stable coexistence of diverse species. The mechanistic underpinnings of such dynamic stability remain unclear as system-wide studies have been limited to small engineered communities or synthetic assemblies. Here we show how kefir, a natural milk-fermenting community, realizes stable coexistence through spatio-temporal orchestration of species and metabolite dynamics. During milk fermentation, kefir grains (a polysaccharide matrix synthesized by kefir microbes) grow in mass but remain unchanged in composition. In contrast, the milk is colonized in a dynamic fashion, with early members opening metabolic niches for the followers. Through large-scale mapping of metabolic preferences and inter-species interactions, we show how microbes poorly suited for milk survive in, and even dominate, the community through metabolic cooperation and uneven partitioning between the grain and the liquid phase. Overall, our findings reveal how spatio-temporal dynamics promote stable coexistence and have implications for deciphering and modulating complex microbial ecosystems.
biorxiv microbiology 0-100-users 2019
Revealing neural correlates of behavior without behavioral measurements, bioRxiv, 2019-02-06
Measuring neuronal tuning curves has been instrumental for many discoveries in neuroscience but requires a priori assumptions regarding the identity of the encoded variables. We applied unsupervised learning to large-scale neuronal recordings in behaving mice from circuits involved in spatial cognition, and uncovered a highly organized internal structure of ensemble activity patterns. This emergent structure allowed us to define for each neuron an 'internal tuning-curve' that characterizes its activity relative to the network activity, rather than relative to any pre-defined external variable, revealing place-tuning in the hippocampus and head-direction tuning in the thalamus and postsubiculum without relying on measurements of place or head-direction. A similar investigation in prefrontal cortex revealed schematic representations of distances and actions, and exposed a previously unknown variable, the 'trajectory-phase'. The structure of ensemble activity patterns was conserved across mice, allowing one animal's data to be used to decode another animal's behavior. Thus, the internal structure of neuronal activity itself enables reconstructing internal representations and discovering new behavioral variables hidden within a neural code.
biorxiv neuroscience 100-200-users 2019
The functional landscape of the human phosphoproteome, bioRxiv, 2019-02-06
Protein phosphorylation is a key post-translational modification regulating protein function in almost all cellular processes. While tens of thousands of phosphorylation sites have been identified in human cells to date, the extent and functional importance of the phosphoproteome remain largely unknown. Here, we have analyzed 6,801 publicly available phospho-enriched mass spectrometry proteomics experiments, creating a state-of-the-art phosphoproteome containing 119,809 human phosphosites. To prioritize functional sites, 59 features indicative of proteomic, structural, regulatory or evolutionary relevance were integrated into a single functional score using machine learning. We demonstrate how this prioritization identifies regulatory phosphosites across different molecular mechanisms and pinpoints genetic susceptibilities at a genomic scale. Several novel regulatory phosphosites were experimentally validated, including a role in neuronal differentiation for phosphosites present in the SWI/SNF complex member SMARCC2. The scored reference phosphoproteome and its annotations identify the most relevant phosphorylations for a given process or disease, addressing a major bottleneck in cell signaling studies.
biorxiv genomics 0-100-users 2019