Discovering event structure in continuous narrative perception and memory, bioRxiv, 2016-10-15
Summary: During realistic, continuous perception, humans automatically segment experiences into discrete events. Using a novel model of neural event dynamics, we investigate how cortical structures generate event representations during continuous narratives, and how these events are stored and retrieved from long-term memory. Our data-driven approach enables identification of event boundaries and event correspondences across datasets without human-generated stimulus annotations, and reveals that different regions segment narratives at different timescales. We also provide the first direct evidence that narrative event boundaries in high-order areas (overlapping the default mode network) trigger encoding processes in the hippocampus, and that this encoding activity predicts pattern reinstatement during recall. Finally, we demonstrate that these areas represent abstract, multimodal situation models, and show anticipatory event reinstatement as subjects listen to a familiar narrative. Our results provide strong evidence that brain activity is naturally structured into semantically meaningful events, which are stored in and retrieved from long-term memory.
biorxiv neuroscience 100-200-users 2016
Nanopore DNA Sequencing and Genome Assembly on the International Space Station, bioRxiv, 2016-09-28
Abstract: The emergence of nanopore-based sequencers greatly expands the reach of sequencing into low-resource field environments, enabling in situ molecular analysis. In this work, we evaluated the performance of the MinION DNA sequencer (Oxford Nanopore Technologies) in-flight on the International Space Station (ISS), and benchmarked its performance off-Earth against the MinION, Illumina MiSeq, and PacBio RS II sequencing platforms in terrestrial laboratories. Samples contained mixtures of genomic DNA extracted from lambda bacteriophage, Escherichia coli (strain K12) and Mus musculus (BALB/c). The in-flight sequencing experiments generated more than 80,000 total reads with mean 2D accuracies of 85–90%, mean 1D accuracies of 75–80%, and median read lengths of approximately 6,000 bases. We were able to construct directed assemblies of the ~4.7 Mb E. coli genome, ~48.5 kb lambda genome, and a representative M. musculus sequence (the ~16.3 kb mitochondrial genome), at 100%, 100%, and 96.7% pairwise identity, respectively, and de novo assemblies of the lambda and E. coli genomes generated solely from nanopore reads yielded 100% and 99.8% genome coverage, respectively, at 100% and 98.5% pairwise identity. Across all surveyed metrics (base quality, throughput, stays/base, skips/base), no decrease in MinION performance was observed while sequencing DNA in space. Simulated runs of in-flight nanopore data using an automated bioinformatic pipeline and cloud- or laptop-based genomic assembly demonstrated the feasibility of real-time sequencing analysis and direct microbial identification in space. Applications of sequencing for space exploration include infectious disease diagnosis, environmental monitoring, evaluating biological responses to spaceflight, and even potentially the detection of extraterrestrial life on other planetary bodies.
biorxiv genomics 100-200-users 2016
DNA Fountain enables a robust and efficient storage architecture, bioRxiv, 2016-09-10
Abstract: DNA is an attractive medium to store digital information. Here, we report a storage strategy, called DNA Fountain, that is highly robust and approaches the information capacity per nucleotide. Using our approach, we stored a full computer operating system, movie, and other files with a total of 2.14 × 10^6 bytes in DNA oligos and perfectly retrieved the information from a sequencing coverage equivalent of a single tile of Illumina sequencing. We also tested a process that can allow 2.18 × 10^15 retrievals using the original DNA sample and were able to perfectly decode the data. Finally, we explored the limit of our architecture in terms of bytes per molecule and obtained a perfect retrieval from a density of 215 petabytes/gram of DNA, orders of magnitude higher than previous techniques.
biorxiv synthetic-biology 100-200-users 2016
Re-evaluation of SNP heritability in complex human traits, bioRxiv, 2016-09-10
SNP heritability, the proportion of phenotypic variance explained by SNPs, has been reported for many hundreds of traits. Its estimation requires strong prior assumptions about the distribution of heritability across the genome, but the assumptions in current use have not been thoroughly tested. By analyzing imputed data for a large number of human traits, we empirically derive a model that more accurately describes how heritability varies with minor allele frequency, linkage disequilibrium and genotype certainty. Across 19 traits, our improved model leads to estimates of common SNP heritability on average 43% (SD 3) higher than those obtained from the widely-used software GCTA, and 25% (SD 2) higher than those from the recently-proposed extension GCTA-LDMS. Previously, DNaseI hypersensitivity sites were reported to explain 79% of SNP heritability; using our improved heritability model their estimated contribution is only 24%.
biorxiv genetics 100-200-users 2016
Improving genetic diagnosis in Mendelian disease with transcriptome sequencing, bioRxiv, 2016-09-09
Abstract: Exome and whole-genome sequencing are becoming increasingly routine approaches in Mendelian disease diagnosis. Despite their success, the current diagnostic rate for genomic analyses across a variety of rare diseases is approximately 25-50%. Here, we explore the utility of transcriptome sequencing (RNA-seq) as a complementary diagnostic tool in a cohort of 50 patients with genetically undiagnosed rare muscle disorders. We describe an integrated approach to analyze patient muscle RNA-seq, leveraging an analysis framework focused on the detection of transcript-level changes that are unique to the patient compared to over 180 control skeletal muscle samples. We demonstrate the power of RNA-seq to validate candidate splice-disrupting mutations and to identify splice-altering variants in both exonic and deep intronic regions, yielding an overall diagnosis rate of 35%. We also report the discovery of a highly recurrent de novo intronic mutation in COL6A1 that results in a dominantly acting splice-gain event, disrupting the critical glycine repeat motif of the triple helical domain. We identify this pathogenic variant in a total of 27 genetically unsolved patients in an external collagen VI-like dystrophy cohort, thus explaining approximately 25% of patients clinically suggestive of collagen VI dystrophy in whom prior genetic analysis is negative. Overall, this study represents a large systematic application of transcriptome sequencing to rare disease diagnosis and highlights its utility for the detection and interpretation of variants missed by current standard diagnostic approaches.
One Sentence Summary: Transcriptome sequencing improves the diagnostic rate for Mendelian disease in patients for whom genetic analysis has not returned a diagnosis.
biorxiv genomics 100-200-users 2016
Power Analysis of Single Cell RNA-Sequencing Experiments, bioRxiv, 2016-09-09
Abstract: High-throughput single cell RNA sequencing (scRNA-seq) has become an established and powerful method to investigate transcriptomic cell-to-cell variation, and has revealed new cell types, and new insights into developmental processes and stochasticity in gene expression. There are now several published scRNA-seq protocols, which all sequence transcriptomes from a minute amount of starting material. Therefore, a key question is how these methods compare in terms of sensitivity of detection of mRNA molecules, and accuracy of quantification of gene expression. Here, we assessed the sensitivity and accuracy of many published data sets based on standardized spike-ins with a uniform raw data processing pipeline. We developed a flexible and fast UMI counting tool (https://github.com/vals/umis) which is compatible with all UMI-based protocols. This allowed us to relate these parameters to sequencing depth, and discuss the trade-offs between the different methods. To confirm our results, we performed experiments on cells from the same population using three different protocols. We also investigated the effect of RNA degradation on spike-in molecules, and the average efficiency of scRNA-seq on spike-in molecules versus endogenous RNAs.
biorxiv genomics 100-200-users 2016
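The UMI counting mentioned in the scRNA-seq abstract above can be illustrated with a minimal sketch. This is not the implementation of the tool the authors describe; it only shows the general idea that reads sharing the same cell barcode, gene, and UMI are collapsed to one molecule. The function name `count_umis`, the tuple layout, and the example barcodes and gene names are all hypothetical.

```python
from collections import defaultdict


def count_umis(reads):
    """Count distinct UMIs per (cell, gene) pair.

    `reads` is an iterable of (cell_barcode, gene, umi) tuples.
    This layout is an assumption made for illustration only.
    """
    seen = defaultdict(set)
    for cell, gene, umi in reads:
        # Reads with an identical UMI for the same cell and gene are
        # treated as PCR duplicates of a single original molecule.
        seen[(cell, gene)].add(umi)
    return {key: len(umis) for key, umis in seen.items()}


# Hypothetical reads: the second entry duplicates the first.
reads = [
    ("cellA", "Actb", "AACG"),
    ("cellA", "Actb", "AACG"),
    ("cellA", "Actb", "TTGC"),
    ("cellB", "Actb", "AACG"),
    ("cellA", "ERCC-00002", "GGAT"),
]
counts = count_umis(reads)
```

Exact-match collapsing is the simplest choice; it ignores sequencing errors within the UMI itself, which real pipelines may handle separately.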