Why Does the Neocortex Have Columns, A Theory of Learning the Structure of the World, bioRxiv, 2017-07-13
ABSTRACT: Neocortical regions are organized into columns and layers. Connections between layers run mostly perpendicular to the surface, suggesting a columnar functional organization. Some layers have long-range excitatory lateral connections, suggesting interactions between columns. Similar patterns of connectivity exist in all regions, but their exact role remains a mystery. In this paper, we propose a network model composed of columns and layers that performs robust object learning and recognition. Each column integrates its changing input over time to learn complete predictive models of observed objects. Excitatory lateral connections across columns allow the network to more rapidly infer objects based on the partial knowledge of adjacent columns. Because columns integrate input over time and space, the network learns models of complex objects that extend well beyond the receptive field of individual cells. Our network model introduces a new feature to cortical columns. We propose that a representation of location relative to the object being sensed is calculated within the sub-granular layers of each column. The location signal is provided as an input to the network, where it is combined with sensory data. Our model contains two layers and one or more columns. Simulations show that, using Hebbian-like learning rules, small single-column networks can learn to recognize hundreds of objects, with each object containing tens of features. Multi-column networks recognize objects with significantly fewer movements of the sensory receptors. Given the ubiquity of columnar and laminar connectivity patterns throughout the neocortex, we propose that columns and regions have more powerful recognition and modeling capabilities than previously assumed.
biorxiv neuroscience 100-200-users 2017
Text mining of 15 million full-text scientific articles, bioRxiv, 2017-07-12
Abstract: Across academia and industry, text mining has become a popular strategy for keeping up with the rapid growth of the scientific literature. Text mining of the scientific literature has mostly been carried out on collections of abstracts, due to their availability. Here we present an analysis of 15 million English scientific full-text articles published during the period 1823–2016. We describe the development in article length and publication sub-topics during these nearly 250 years. We showcase the potential of text mining by extracting published protein–protein, disease–gene, and protein subcellular associations using a named entity recognition system, and quantitatively report on their accuracy using gold standard benchmark data sets. We subsequently compare the findings to corresponding results obtained on 16.5 million abstracts included in MEDLINE and show that text mining of full-text articles consistently outperforms using abstracts only.
biorxiv bioinformatics 100-200-users 2017
Enhanced proofreading governs CRISPR-Cas9 targeting accuracy, bioRxiv, 2017-07-07
The RNA-guided CRISPR-Cas9 nuclease from Streptococcus pyogenes (SpCas9) has been widely repurposed for genome editing [1-4]. High-fidelity (SpCas9-HF1) and enhanced specificity (eSpCas9(1.1)) variants exhibit substantially reduced off-target cleavage in human cells, but the mechanism of target discrimination and the potential to further improve fidelity were unknown [5-9]. Using single-molecule Förster resonance energy transfer (smFRET) experiments, we show that both SpCas9-HF1 and eSpCas9(1.1) are trapped in an inactive state [10] when bound to mismatched targets. We find that a non-catalytic domain within Cas9, REC3, recognizes target mismatches and governs the HNH nuclease to regulate overall catalytic competence. Exploiting this observation, we identified residues within REC3 involved in mismatch sensing and designed a new hyper-accurate Cas9 variant (HypaCas9) that retains robust on-target activity in human cells. These results offer a more comprehensive model to rationalize and modify the balance between target recognition and nuclease activation for precision genome editing.
biorxiv biochemistry 100-200-users 2017
“Unexpected mutations after CRISPR-Cas9 editing in vivo” are most likely pre-existing sequence variants and not nuclease-induced mutations, bioRxiv, 2017-07-06
Schaefer et al. recently advanced the provocative conclusion that CRISPR-Cas9 nuclease can induce off-target alterations at genomic loci that do not resemble the intended on-target site [1]. Using high-coverage whole genome sequencing (WGS), these authors reported finding SNPs and indels in two CRISPR-Cas9-treated mice that were not present in a single untreated control mouse. On the basis of this association, Schaefer et al. concluded that these sequence variants were caused by CRISPR-Cas9. This newly proposed CRISPR-Cas9 off-target activity runs contrary to previously published work [2–8] and, if the authors are correct, could have profound implications for research and therapeutic applications. Here, we demonstrate that the simplest interpretation of Schaefer et al.’s data is that the two CRISPR-Cas9-treated mice are actually more closely related genetically to each other than to the control mouse. This strongly suggests that the so-called “unexpected mutations” simply represent SNPs and indels shared by these mice prior to nuclease treatment. In addition, given the genomic and sequence distribution profiles of these variants, we show that it is challenging to explain how CRISPR-Cas9 might be expected to induce such changes. Finally, we argue that the lack of appropriate controls in Schaefer et al.’s experimental design precludes assignment of causality to CRISPR-Cas9. Given these substantial issues, we urge Schaefer et al. to revise or re-state the original conclusions of their published work so that misleading and unsupported statements do not persist in the literature.
biorxiv molecular-biology 100-200-users 2017
NanoJ-SQUIRREL quantitative mapping and minimisation of super-resolution optical imaging artefacts, bioRxiv, 2017-07-03
Most super-resolution microscopy methods depend on steps that contribute to the formation of image artefacts. Here we present NanoJ-SQUIRREL, an ImageJ-based analytical approach providing a quantitative assessment of super-resolution image quality. By comparing diffraction-limited images and super-resolution equivalents of the same focal volume, this approach generates a quantitative map of super-resolution defects, as well as methods for their correction. To illustrate its broad applicability to super-resolution approaches, we apply our method to Localization Microscopy, STED and SIM images of a variety of in-cell structures including microtubules, poxviruses, neuronal actin rings and clathrin coated pits. We particularly focus on single-molecule localisation microscopy, where super-resolution reconstructions often feature imperfections not present in the original data. By showing the quantitative evolution of data quality over these varied sample preparation, acquisition and super-resolution methods, we display the potential of NanoJ-SQUIRREL to guide optimization of super-resolution imaging parameters.
biorxiv biophysics 100-200-users 2017
Chromatin accessibility dynamics of myogenesis at single cell resolution, bioRxiv, 2017-06-27
Abstract: Over a million DNA regulatory elements have been cataloged in the human genome, but linking these elements to the genes that they regulate remains challenging. We introduce Cicero, a statistical method that connects regulatory elements to target genes using single cell chromatin accessibility data. We apply Cicero to investigate how thousands of dynamically accessible elements orchestrate gene regulation in differentiating myoblasts. Groups of co-accessible regulatory elements linked by Cicero meet criteria of “chromatin hubs”, in that they are physically proximal, interact with a common set of transcription factors, and undergo coordinated changes in histone marks that are predictive of gene expression. Pseudotemporal analysis revealed a subset of elements bound by MYOD in myoblasts that exhibit early opening, potentially serving as the initial sites of recruitment of chromatin remodeling and histone-modifying enzymes. The methodological framework described here constitutes a powerful new approach for elucidating the architecture, grammar and mechanisms of cis-regulation on a genome-wide basis.
biorxiv genomics 100-200-users 2017