pathwayPCA: an R package for integrative pathway analysis with modern PCA methodology and gene selection, bioRxiv, 2019-04-23
ABSTRACT: With the advance in high-throughput technology for molecular assays, multi-omics datasets have become increasingly available. However, most currently available pathway analysis software provides little or no functionality for analyzing multiple types of -omics data simultaneously. In addition, most tools do not provide sample-specific estimates of pathway activities, which are important for precision medicine. To address these challenges, we present pathwayPCA, a unique R package for integrative pathway analysis that utilizes modern statistical methodology, including supervised PCA and adaptive elastic-net PCA, for principal component analysis. pathwayPCA can analyze continuous, binary, and survival outcomes in studies with multiple covariate and/or interaction effects. We provide three case studies to illustrate pathway analysis with gene selection, integrative analysis of multi-omics datasets to identify driver genes, estimating and visualizing sample-specific pathway activities in ovarian cancer, and identifying sex-specific pathway effects in kidney cancer. pathwayPCA is an open source R package, freely available to the research community. We expect pathwayPCA to be a useful tool for empowering the wide scientific community in the analysis and interpretation of the wealth of multi-omics data recently made available by TCGA, CPTAC and other large consortiums.
biorxiv bioinformatics 0-100-users 2019
Stacks 2: Analytical Methods for Paired-end Sequencing Improve RADseq-based Population Genomics, bioRxiv, 2019-04-23
Abstract: For half a century population genetics studies have put type II restriction endonucleases to work. Now, coupled with massively-parallel, short-read sequencing, the family of RAD protocols that wields these enzymes has generated vast genetic knowledge from the natural world. Here we describe the first software capable of using paired-end sequencing to derive short contigs from de novo RAD data natively. Stacks version 2 employs a de Bruijn graph assembler to build contigs from paired-end reads and overlap those contigs with the corresponding single-end loci. The new architecture allows all the individuals in a metapopulation to be considered at the same time as each RAD locus is processed. This enables a Bayesian genotype caller to provide precise SNPs, and a robust algorithm to phase those SNPs into long haplotypes – generating RAD loci that are 400-800 bp in length. To prove its recall and precision, we test the software with simulated data and compare reference-aligned and de novo analyses of three empirical datasets. We show that the latest version of Stacks is highly accurate and outperforms other software in assembling and genotyping paired-end de novo datasets.
biorxiv genomics 100-200-users 2019
Alzheimer’s patient brain myeloid cells exhibit enhanced aging and unique transcriptional activation, bioRxiv, 2019-04-19
Abstract: Gene expression changes in brain microglia from mouse models of Alzheimer’s disease (AD) are highly characterized and reflect specific myeloid cell activation states that could modulate AD risk or progression. While some groups have produced valuable expression profiles for human brain cells [1–4], the cellular clarity with which we now view transcriptional responses in mouse AD models has not yet been realized for human AD tissues, due to limited availability of fresh tissue samples and the technological hurdles of recovering transcriptomic data with cell-type resolution from frozen samples. We developed a novel method for isolating multiple cell types from frozen post-mortem specimens of superior frontal gyrus for RNA-Seq and identified 66 genes differentially expressed between AD and control subjects in the myeloid cell compartment. Myeloid cells sorted from fusiform gyrus of the same subjects showed similar changes, and whole tissue RNA analyses further corroborated our findings. The changes we observed did not resemble the “damage-associated microglia” (DAM) profile described in mouse AD models [5], or other known activation states from other disease models. Instead, roughly half of the changes were consistent with an “enhanced human aging” phenotype, whereas the other half, including the AD risk gene APOE, were altered in AD myeloid cells but not differentially expressed with age. We refer to this novel profile in human Alzheimer’s microglia/myeloid cells as the HAM signature. These results, which can be browsed at http://research-pub.gene.com/BrainMyeloidLandscape/reviewVersion, highlight considerable differences between myeloid activation in mouse models and human disease, and provide a genome-wide picture of brain myeloid activation in human AD.
biorxiv neuroscience 100-200-users 2019
Automated analysis of whole brain vasculature using machine learning, bioRxiv, 2019-04-19
SUMMARY: Tissue clearing methods enable imaging of intact biological specimens without sectioning. However, reliable and scalable analysis of such large imaging data in 3D remains a challenge. Towards this goal, we developed a deep learning-based framework to quantify and analyze the brain vasculature, named Vessel Segmentation & Analysis Pipeline (VesSAP). Our pipeline uses a fully convolutional network with a transfer learning approach for segmentation. We systematically analyzed vascular features of whole brains, including length, bifurcation points and radius at the micrometer scale, by registering them to the Allen mouse brain atlas. We reported the first evidence of secondary intracranial collateral vascularization in CD1-Elite mice and found reduced vascularization in the brainstem as compared to the cerebrum. VesSAP thus enables unbiased and scalable quantifications for the angioarchitecture of the cleared intact mouse brain and yields new biological insights related to the vascular brain function. Supporting material of VesSAP is available at http://DISCOtechnologies.org/VesSAP.
biorxiv neuroscience 200-500-users 2019
Diversity begets diversity in microbiomes, bioRxiv, 2019-04-19
Abstract: Microbes are embedded in complex microbiomes where they engage in a wide array of inter- and intra-specific interactions [1–4]. However, whether these interactions are a significant driver of natural biodiversity is not well understood. Two contrasting hypotheses have been put forward to explain how species interactions could influence diversification. ‘Ecological Controls’ (EC) predicts a negative diversity-diversification relationship, where the evolution of novel types becomes constrained as available niches become filled [5]. In contrast, ‘Diversity Begets Diversity’ (DBD) predicts a positive relationship, with diversity promoting diversification via niche construction and other species interactions [6]. Using the Earth Microbiome Project, the largest standardized survey of global biodiversity to date [7], we provide support for DBD as the dominant driver of microbiome diversity. Only in the most diverse microbiomes does DBD reach a plateau, consistent with increasingly saturated niche space. Genera that are strongly associated with a particular biome show a stronger DBD relationship than non-residents, consistent with prolonged evolutionary interactions driving diversification. Genera with larger genomes also experience a stronger DBD response, which could be due to a higher potential for metabolic interactions and niche construction offered by more diverse gene repertoires. Our results demonstrate that the rate at which microbiomes accumulate diversity is crucially dependent on resident diversity. This fits a scenario in which species interactions are important drivers of microbiome diversity. Further (population genomic or metagenomic) data are needed to elucidate the nature of these biotic interactions in order to more fully inform predictive models of biodiversity and ecosystem stability [4,5].
biorxiv evolutionary-biology 100-200-users 2019
GWAS of brain volume on 54,407 individuals and cross-trait analysis with intelligence identifies shared genomic loci and genes, bioRxiv, 2019-04-19
Abstract: The phenotypic correlation between human intelligence and brain volume (BV) is considerable (r≈0.40), and has been shown to be due to shared genetic factors [1]. To further examine specific genetic factors driving this correlation, we present genomic analyses of the genetic overlap between intelligence and BV using genome-wide association study (GWAS) results. First, we conducted the largest BV GWAS meta-analysis to date (N=54,407 individuals), followed by functional annotation and gene-mapping. We identified 35 genomic loci (27 novel), implicating 362 genes (346 novel) and 23 biological pathways for BV. Second, we used an existing GWAS for intelligence (N=269,867 individuals [2]), and estimated the genetic correlation (rg) between BV and intelligence to be 0.23. We show that the rg is driven by physical overlap of GWAS hits in 5 genomic loci. We identified 67 shared genes between BV and intelligence, which are mainly involved in important signaling pathways regulating cell growth. Of these 67 genes, we prioritized 32 that are most likely to have functional impact. These results provide new information on the genetics of BV and provide biological insight into BV’s shared genetic etiology with intelligence.
biorxiv genetics 100-200-users 2019