Chiron: Translating nanopore raw signal directly into nucleotide sequence using deep learning, bioRxiv, 2017-08-24
ABSTRACT: Sequencing by translocating DNA fragments through an array of nanopores is a rapidly maturing technology which offers faster and cheaper sequencing than other approaches. However, accurately deciphering the DNA sequence from the noisy and complex electrical signal is challenging. Here, we report Chiron, the first deep learning model to achieve end-to-end basecalling, directly translating the raw signal to DNA sequence without the error-prone segmentation step. We show that our model, trained on only a small set of 4000 reads, provides state-of-the-art basecalling accuracy even on previously unseen species. Chiron achieves basecalling speeds of over 2000 bases per second using desktop computer graphics processing units.
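To make "without the segmentation step" concrete, here is a minimal sketch of one common way a segmentation-free model can turn per-frame predictions into a sequence: a greedy CTC-style collapse. The abstract does not name the decoding scheme Chiron uses, so the choice of CTC, the blank symbol, and the function name `ctc_greedy_collapse` are all assumptions made for illustration only.

```python
from itertools import groupby

BLANK = "-"  # hypothetical blank symbol; not taken from the abstract


def ctc_greedy_collapse(frame_labels):
    """Collapse per-frame labels into a base sequence, CTC style.

    Step 1: merge runs of identical consecutive labels.
    Step 2: drop the blank symbol.
    No explicit signal segmentation into events is needed.
    """
    merged = [label for label, _ in groupby(frame_labels)]
    return "".join(label for label in merged if label != BLANK)


# Illustrative input: one predicted label per signal frame.
print(ctc_greedy_collapse("AA-A-CC--GT"))
```

The blank between the two runs of `A` is what lets the collapse keep a genuine repeated base instead of merging it away.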
biorxiv bioinformatics 100-200-users 2017
High Aspect Ratio Nanomaterials Enable Delivery of Functional Genetic Material Without DNA Integration in Mature Plants, bioRxiv, 2017-08-23
Genetic engineering of plants is at the core of sustainability efforts, natural product synthesis, and agricultural crop engineering. The plant cell wall is a barrier that limits the ease and throughput with which exogenous biomolecules can be delivered to plants. Current delivery methods suffer from host range limitations, low transformation efficiencies, tissue damage, or unavoidable DNA integration into the host genome. Here, we demonstrate efficient diffusion-based biomolecule delivery into tissues and organs of intact plants of several species with a suite of pristine and chemically-functionalized high aspect ratio nanomaterials. Efficient DNA delivery and strong protein expression without transgene integration are accomplished in Nicotiana benthamiana (Nb), Eruca sativa (arugula), Triticum aestivum (wheat) and Gossypium hirsutum (cotton) leaves and arugula protoplasts. We also demonstrate a second nanoparticle-based strategy in which small interfering RNA (siRNA) is delivered to Nb leaves and silences a gene with 95% efficiency. We find that nanomaterials not only facilitate biomolecule transport into plant cells but also protect polynucleotides from nuclease degradation. Our work provides a tool for species-independent and passive delivery of genetic material, without transgene integration, into plant cells for diverse biotechnology applications.
biorxiv plant-biology 0-100-users 2017
Design and specificity of long ssDNA donors for CRISPR-based knock-in, bioRxiv, 2017-08-22
Update November 12th, 2019. The conclusions of this pre-print are outdated. See Authors' note on page 2. CRISPR/Cas technologies have transformed our ability to manipulate genomes for research and gene-based therapy. In particular, homology-directed repair after genomic cleavage allows for precise modification of genes using exogenous donor sequences as templates. While both single-stranded DNA (ssDNA) and double-stranded DNA (dsDNA) forms of donors have been used as repair templates, a systematic comparison of the performance and specificity of repair using ssDNA versus dsDNA donors is still lacking. Here, we describe an optimized method for the synthesis of long ssDNA templates and demonstrate that ssDNA donors can drive efficient integration of gene-sized reporters in human cell lines. We next define a set of rules to maximize the efficiency of ssDNA-mediated knock-in by optimizing donor design. Finally, by comparing ssDNA donors with equivalent dsDNA sequences (PCR products or plasmids), we demonstrate that ssDNA templates have a unique advantage in terms of repair specificity while dsDNA donors can lead to a high rate of off-target integration. Our results provide a framework for designing high-fidelity CRISPR-based knock-in experiments, in both research and therapeutic settings.
biorxiv molecular-biology 0-100-users 2017
Genome-wide association studies of brain structure and function in the UK Biobank, bioRxiv, 2017-08-22
Summary: The genetic basis of brain structure and function is largely unknown. We carried out genome-wide association studies of 3,144 distinct functional and structural brain imaging derived phenotypes in UK Biobank (discovery dataset: 8,428 subjects). We show that many of these phenotypes are heritable. We identify 148 clusters of SNP-imaging associations with lead SNPs that replicate at p<0.05, when we would expect 21 to replicate by chance. Notable significant and interpretable associations include iron transport and storage genes, related to changes in T2* in subcortical regions; extracellular matrix and the epidermal growth factor genes, associated with white matter micro-structure and lesion volume; genes regulating mid-line axon guidance development associated with pontine crossing tract organisation; and overall 17 genes involved in development, pathway signalling and plasticity. Our results provide new insight into the genetic architecture of the brain with relevance to complex neurological and psychiatric disorders, as well as brain development and aging. The full set of results is available on the interactive Oxford Brain Imaging Genetics (BIG) web browser.
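The replication claim (148 observed against 21 expected by chance at p<0.05) can be put in perspective with a rough calculation. The sketch below is illustrative only: it assumes the expected count was obtained as (number of clusters tested) x 0.05 and that the tests are independent. Neither assumption is stated in the abstract, and the inferred number of clusters tested is not a figure quoted there.

```python
from math import comb

ALPHA = 0.05             # replication threshold quoted in the abstract
EXPECTED_BY_CHANCE = 21  # quoted in the abstract
OBSERVED = 148           # quoted in the abstract

# Assumption: expected = n_tested * ALPHA, so n_tested is inferred, not quoted.
n_tested = round(EXPECTED_BY_CHANCE / ALPHA)


def binomial_upper_tail(n, k, p):
    """P(X >= k) for X ~ Binomial(n, p), summed term by term."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Chance of seeing at least OBSERVED replications if every test were null
# and independent (both are simplifying assumptions).
tail = binomial_upper_tail(n_tested, OBSERVED, ALPHA)
enrichment = OBSERVED / EXPECTED_BY_CHANCE

print(f"inferred clusters tested: {n_tested}")
print(f"fold enrichment over chance: {enrichment:.1f}")
print(f"binomial upper-tail probability: {tail:.3g}")
```

Under these assumptions the observed replication count is roughly seven times the chance expectation, and the tail probability is vanishingly small.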
biorxiv genetics 200-500-users 2017
An atlas of genetic associations in UK Biobank, bioRxiv, 2017-08-17
ABSTRACT: Genome-wide association studies have revealed many loci contributing to the variation of complex traits, yet the majority of loci that contribute to the heritability of complex traits remain elusive. Large study populations with sufficient statistical power are required to detect the small effect sizes of the yet unidentified genetic variants. However, the analysis of huge cohorts, like UK Biobank, is complicated by incidental structure present when collecting such large cohorts. For instance, UK Biobank comprises 107,162 third degree or closer related participants. Traditionally, GWAS have removed related individuals because they comprised an insignificant proportion of the overall sample size; however, removing related individuals in UK Biobank would entail a substantial loss of power. Furthermore, modelling such structure using linear mixed models is computationally expensive and requires a computational infrastructure that may not be accessible to all researchers. Here we present an atlas of genetic associations for 118 non-binary and 599 binary traits of 408,455 related and unrelated UK Biobank participants of White-British descent. Results are compiled in a publicly accessible database that allows querying genome-wide association summary results for 623,944 genotyped and HapMap2 imputed SNPs, as well as downloading whole GWAS summary statistics for over 30 million imputed SNPs from the Haplotype Reference Consortium panel. Our atlas of associations (GeneATLAS, http://geneatlas.roslin.ed.ac.uk) will help researchers to query UK Biobank results in an easy way without the need to incur high computational costs.
biorxiv genomics 100-200-users 2017
Frequent lack of repressive capacity of promoter DNA methylation identified through genome-wide epigenomic manipulation, bioRxiv, 2017-08-17
Abstract: It is widely assumed that the addition of DNA methylation at CpG rich gene promoters silences gene transcription. However, this conclusion is largely drawn from the observation that promoter DNA methylation inversely correlates with gene expression in natural conditions. The effect of induced DNA methylation on endogenous promoters has yet to be comprehensively assessed. Here, we induced the simultaneous methylation of thousands of promoters in the genome of human cells using an engineered zinc finger-DNMT3A fusion protein, enabling assessment of the effect of forced DNA methylation upon transcription, histone modifications, and DNA methylation persistence after the removal of the fusion protein. We find that DNA methylation is frequently insufficient to transcriptionally repress promoters. Furthermore, DNA methylation deposited at promoter regions associated with H3K4me3 is rapidly erased after removal of the zinc finger-DNMT3A fusion protein. Finally, we demonstrate that induced DNA methylation can exist simultaneously on promoter nucleosomes that possess the active histone modification H3K4me3, or DNA bound by the initiated form of RNA polymerase II. These findings suggest that promoter DNA methylation is not generally sufficient for transcriptional inactivation, with implications for the emerging field of epigenome engineering.
One Sentence Summary: Genome-wide epigenomic manipulation of thousands of human promoters reveals that induced promoter DNA methylation is unstable and frequently does not function as a primary instructive biochemical signal for gene silencing and chromatin reconfiguration.
biorxiv genomics 500+-users 2017