7 Tesla MRI of the ex vivo human brain at 100 micron resolution, bioRxiv, 2019-05-31
Abstract: We present an ultra-high resolution MRI dataset of an ex vivo human brain specimen. The brain specimen was donated by a 58-year-old woman who had no history of neurological disease and died of non-neurological causes. After fixation in 10% formalin, the specimen was imaged on a 7 Tesla MRI scanner at 100 μm isotropic resolution using a custom-built 31-channel receive array coil. Single-echo multi-flip Fast Low-Angle SHot (FLASH) data were acquired over 100 hours of scan time (25 hours per flip angle), allowing derivation of a T1 parameter map and synthesized FLASH volumes. This dataset provides an unprecedented view of the three-dimensional neuroanatomy of the human brain. To optimize the utility of this resource, we warped the dataset into standard stereotactic space. We now distribute the dataset in both native space and stereotactic space to the academic community via multiple platforms. We envision that this dataset will have a broad range of investigational, educational, and clinical applications that will advance understanding of human brain anatomy in health and disease.
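As a rough illustration of how a T1 value can be derived from multi-flip FLASH data, the sketch below fits a single voxel with the standard variable-flip-angle linearisation of the ideal spoiled gradient-echo signal equation. The abstract does not state which fitting procedure the authors used, so this is an assumed approach and not their pipeline; the function names, repetition time, and flip angles are hypothetical, and T2* decay and B1 inhomogeneity are ignored.

```python
import math


def flash_signal(m0, t1, tr, flip_deg):
    """Ideal spoiled gradient-echo (FLASH) signal, ignoring T2* decay."""
    a = math.radians(flip_deg)
    e1 = math.exp(-tr / t1)
    return m0 * math.sin(a) * (1 - e1) / (1 - e1 * math.cos(a))


def fit_t1(signals, flips_deg, tr):
    """Estimate (T1, M0) for one voxel from signals at several flip angles.

    Uses the linear form S/sin(a) = E1 * S/tan(a) + M0 * (1 - E1),
    where E1 = exp(-TR/T1), and an ordinary least-squares line fit.
    """
    xs = [s / math.tan(math.radians(f)) for s, f in zip(signals, flips_deg)]
    ys = [s / math.sin(math.radians(f)) for s, f in zip(signals, flips_deg)]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    intercept = mean_y - slope * mean_x
    t1 = -tr / math.log(slope)       # slope is the estimate of E1
    m0 = intercept / (1 - slope)     # intercept is M0 * (1 - E1)
    return t1, m0


# Hypothetical acquisition: four flip angles, TR = 40 ms, true T1 = 300 ms.
flips = (10, 20, 30, 40)
signals = [flash_signal(1000.0, 0.3, 0.04, f) for f in flips]
t1_est, m0_est = fit_t1(signals, flips, 0.04)
print(t1_est, m0_est)
```

On noiseless synthetic input the fit recovers the simulated T1 and M0 up to floating-point error. With real data the linearised fit is sensitive to noise, which is one reason long acquisitions per flip angle help.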
biorxiv neuroscience 500+-users 2019
Deep learning does not outperform classical machine learning for cell-type annotation, bioRxiv, 2019-05-31
Abstract: Deep learning has revolutionized image analysis and natural language processing with remarkable accuracies in prediction tasks, such as image labeling or word identification. The origin of this revolution was arguably the deep learning approach by the Hinton lab in 2012, which halved the error rate of existing classifiers in the then 2-year-old ImageNet database [1]. In hindsight, the combination of algorithmic and hardware advances with the appearance of large and well-labeled datasets led up to this seminal contribution. The emergence of large amounts of data from single-cell RNA-seq and the recent global effort to chart all cell types in the Human Cell Atlas have attracted interest in deep-learning applications. However, all current approaches are unsupervised, i.e., they learn latent spaces without using any cell labels, even though supervised learning approaches are often more powerful in feature learning and are by far the most popular approach in the current AI revolution. Here, we ask why this is the case. In particular, we ask whether supervised deep learning can be used for cell annotation, i.e., to predict cell-type labels from single-cell gene expression profiles. After evaluating 6 classification methods across 14 datasets, we find that deep learning does not outperform classical machine-learning methods in this task. Thus, cell-type prediction based on gene-signature-derived cell-type labels is potentially too simplistic a task for complex non-linear methods, which demands better labels of functional single-cell readouts. We, therefore, are still waiting for the “ImageNet moment” in single-cell genomics.
biorxiv bioinformatics 100-200-users 2019
Genomic diversity affects the accuracy of bacterial SNP calling pipelines, bioRxiv, 2019-05-31
Abstract: Background: Accurately identifying SNPs from bacterial sequencing data is an essential requirement for using genomics to track transmission and predict important phenotypes such as antimicrobial resistance. However, most previous performance evaluations of SNP calling have been restricted to eukaryotic (human) data. Additionally, bacterial SNP calling requires choosing an appropriate reference genome to align reads to, which, together with the bioinformatic pipeline, affects the accuracy and completeness of a set of SNP calls obtained. This study evaluates the performance of 41 SNP calling pipelines using simulated data from 254 strains of 10 clinically common bacteria and real data from environmentally-sourced and genomically diverse isolates within the genera Citrobacter, Enterobacter, Escherichia and Klebsiella. Results: We evaluated the performance of 41 SNP calling pipelines, aligning reads to genomes of the same or a divergent strain. Irrespective of pipeline, a principal determinant of reliable SNP calling was reference genome selection. Across multiple taxa, there was a strong inverse relationship between pipeline sensitivity and precision, and the Mash distance (a proxy for average nucleotide divergence) between reads and reference genome. The effect was especially pronounced for diverse, recombinogenic bacteria such as Escherichia coli, but less dominant for clonal species such as Mycobacterium tuberculosis. Conclusions: The accuracy of SNP calling for a given species is compromised by increasing intra-species diversity. When reads were aligned to the same genome from which they were sequenced, among the highest performing pipelines was Novoalign/GATK. However, across the full range of (divergent) genomes, among the consistently highest-performing pipelines was Snippy.
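The sensitivity and precision mentioned above can be made concrete with a minimal sketch that compares a pipeline's SNP calls against a known truth set, as is possible with simulated data. The function name and the representation of a SNP as a (position, alternate base) pair are assumptions for illustration, not the evaluation code used in the study.

```python
def snp_call_accuracy(truth, called):
    """Return (sensitivity, precision) of called SNPs against a truth set.

    Each SNP is a hashable key such as (position, alt_base).
    """
    truth, called = set(truth), set(called)
    true_pos = len(truth & called)    # SNPs that are real and were called
    false_pos = len(called - truth)   # called SNPs that are not real
    false_neg = len(truth - called)   # real SNPs that were missed
    sensitivity = true_pos / (true_pos + false_neg) if truth else float("nan")
    precision = true_pos / (true_pos + false_pos) if called else float("nan")
    return sensitivity, precision


# Hypothetical example: four true SNPs, three calls, two of them correct.
truth = {(10, "A"), (20, "C"), (30, "G"), (40, "T")}
called = {(10, "A"), (20, "C"), (35, "G")}
sensitivity, precision = snp_call_accuracy(truth, called)
print(sensitivity, precision)
```

In this toy case half of the true SNPs are recovered (sensitivity 0.5) and two of the three calls are correct (precision 2/3). Both metrics would be expected to fall as the reference genome diverges from the sequenced strain, which is the relationship the abstract reports.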
biorxiv bioinformatics 100-200-users 2019
Dense neuronal reconstruction through X-ray holographic nano-tomography, bioRxiv, 2019-05-30
Abstract: Elucidating the structure of neuronal networks provides a foundation for understanding how the nervous system processes information to generate behavior. Despite technological breakthroughs in visible light and electron microscopy, imaging dense nanometer-scale neuronal structures over millimeter-scale tissue volumes remains a challenge. Here, we demonstrate that X-ray holographic nano-tomography is capable of imaging large tissue volumes with sufficient resolution to disentangle dense neuronal circuitry in Drosophila melanogaster and mammalian central and peripheral nervous tissue. Furthermore, we show that automatic segmentation using convolutional neural networks enables rapid extraction of neuronal morphologies from these volumetric datasets. The technique we present allows rapid data collection and analysis of multiple specimens, and can be used correlatively with light microscopy and electron microscopy on the same samples. Thus, X-ray holographic nano-tomography provides a new avenue for discoveries in neuroscience and life sciences in general.
biorxiv neuroscience 100-200-users 2019
Accelerating Sequence Alignment to Graphs, bioRxiv, 2019-05-28
Abstract: Aligning DNA sequences to an annotated reference is a key step for genotyping in biology. Recent scientific studies have demonstrated improved inference by aligning reads to a variation graph, i.e., a reference sequence augmented with known genetic variations. Given a variation graph in the form of a directed acyclic string graph, the sequence to graph alignment problem seeks to find the best matching path in the graph for an input query sequence. Solving this problem exactly using a sequential dynamic programming algorithm takes quadratic time in terms of the graph size and query length, making it difficult to scale to high throughput DNA sequencing data. In this work, we propose the first parallel algorithm for computing sequence to graph alignments that leverages multiple cores and single-instruction multiple-data (SIMD) operations. We take advantage of the available inter-task parallelism, and provide a novel blocked approach to compute the score matrix while ensuring high memory locality. Using a 48-core Intel Xeon Skylake processor, the proposed algorithm achieves peak performance of 317 billion cell updates per second (GCUPS), and demonstrates near linear weak and strong scaling on up to 48 cores. It delivers significant performance gains compared to existing algorithms, and results in run-time reduction from multiple days to three hours for the problem of optimally aligning high coverage long (PacBio/ONT) or short (Illumina) DNA reads to an MHC human variation graph containing 10 million vertices. Availability: The implementation of our algorithm is available at https://github.com/ParBLiSS/PaSGAL. Data sets used for evaluation are accessible using https://alurulab.cc.gatech.edu/PaSGAL.
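The sequential dynamic program that the abstract takes as its baseline can be sketched as follows. This is a minimal illustration and not the parallel blocked algorithm of the paper or the PaSGAL implementation: it assumes a DAG with one character per vertex, vertices numbered in topological order, unit edit costs, and an alignment that may start and end at any vertex. The function name and input layout are hypothetical.

```python
def align_to_dag(labels, preds, query):
    """Minimum edit distance between `query` and any path in a labelled DAG.

    labels[v] is the single character on vertex v.
    preds[v] lists the predecessors of v; vertices are assumed to be
    numbered in topological order, so every predecessor index is < v.
    """
    m = len(query)
    # Virtual start row: query[:j] aligned before entering the graph
    # costs j insertions. It is a candidate predecessor of every vertex
    # because a path may begin anywhere.
    start_row = list(range(m + 1))
    rows = []
    best = m  # aligning the whole query to an empty path
    for v, ch in enumerate(labels):
        prev_rows = [start_row] + [rows[u] for u in preds[v]]
        row = [0] * (m + 1)
        # Empty query prefix: the character on v is deleted.
        row[0] = min(p[0] for p in prev_rows) + 1
        for j in range(1, m + 1):
            mismatch = 0 if query[j - 1] == ch else 1
            diagonal = min(p[j - 1] for p in prev_rows) + mismatch
            deletion = min(p[j] for p in prev_rows) + 1   # skip graph char
            insertion = row[j - 1] + 1                    # skip query char
            row[j] = min(diagonal, deletion, insertion)
        rows.append(row)
        best = min(best, row[m])  # a path may end at any vertex
    return best


# Hypothetical variation graph: A -> (C | G) -> T
labels = ["A", "C", "G", "T"]
preds = [[], [0], [0], [1, 2]]
print(align_to_dag(labels, preds, "AGT"))
```

Each vertex row costs time proportional to the query length times the number of incoming edges, so the total work grows with the product of graph size and query length, which is the quadratic behaviour the abstract describes. The only dependencies of a row are the rows of its predecessors, which is what leaves room for parallel evaluation.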
biorxiv bioinformatics 100-200-users 2019
Estimations of the weather effects on brain functions using functional MRI – a cautionary tale, bioRxiv, 2019-05-28
Abstract: The influences of environmental factors such as weather on the human brain are still largely unknown. A few neuroimaging studies have demonstrated seasonal effects, but were limited by their cross-sectional design or sample sizes. Most importantly, the stability of the MRI scanner, which may also be affected by the environment, has not been taken into account. In the current study, we analyzed longitudinal resting-state functional MRI (fMRI) data from eight individuals, who were scanned over months to years. We applied machine learning regression to predict different weather and environmental parameters from different resting-state parameters, including amplitude of low-frequency fluctuations (ALFF), regional homogeneity (ReHo), and the functional connectivity matrix. As a careful control, the raw EPI and the anatomical images were also used in the prediction analysis. We first found that daylight length and temperatures could be reliably predicted from resting-state parameters under cross-validation. However, similar prediction accuracies could also be achieved by using one frame of EPI image, and even higher accuracies could be achieved by using segmented or even the raw anatomical images. Finally, we verified that the signals outside of the brain in the anatomical images and signals in phantom scans could also achieve higher prediction accuracies, suggesting that the predictability may be due to the baseline signals of the MRI scanner. Overall, we did not identify detectable influences of weather on brain functions other than the influences on the stability of MRI scanners. The results highlight the difficulty of studying long-term effects on the brain using MRI.
biorxiv neuroscience 100-200-users 2019