Purification of Cross-linked RNA-Protein Complexes by Phenol-Toluol Extraction, bioRxiv, 2018-05-30
Recent methodological advances have allowed the identification of an increasing number of RNA-binding proteins (RBPs) and their RNA-binding sites. Most of those methods rely, however, on capturing proteins associated with polyadenylated RNAs, which neglects RBPs bound to non-adenylate RNA classes (tRNA, rRNA, pre-mRNA) as well as the vast majority of species that lack poly-A tails in their mRNAs (including all archaea and bacteria). To overcome these limitations, we have developed a novel protocol, Phenol Toluol extraction (PTex), that does not rely on a specific RNA sequence or motif for isolation of cross-linked ribonucleoproteins (RNPs), but rather purifies them based entirely on their physicochemical properties. PTex captures RBPs that bind to RNA as short as 30 nt, recovers RNPs directly from animal tissue, and can be used to simplify complex workflows such as PAR-CLIP. Finally, we provide a first global RNA-bound proteome of human HEK293 cells and of Salmonella Typhimurium as a bacterial species.
biorxiv molecular-biology 0-100-users 2018
Thousands of large-scale RNA sequencing experiments yield a comprehensive new human gene list and reveal extensive transcriptional noise, bioRxiv, 2018-05-28
We assembled the sequences from 9,795 RNA sequencing experiments, collected from 31 human tissues and hundreds of subjects as part of the GTEx project, to create a new, comprehensive catalog of human genes and transcripts. The new human gene database contains 43,162 genes, of which 21,306 are protein-coding and 21,856 are noncoding, and a total of 323,824 transcripts, for an average of 7.5 transcripts per gene. Our expanded gene list includes 4,998 novel genes (1,178 coding and 3,819 noncoding) and 97,511 novel splice variants of protein-coding genes as compared to the most recent human gene catalogs. We detected over 30 million additional transcripts at more than 650,000 sites, nearly all of which are likely to be nonfunctional, revealing a heretofore unappreciated amount of transcriptional noise in human cells.
biorxiv genomics 500+-users 2018
Fast animal pose estimation using deep neural networks, bioRxiv, 2018-05-25
Recent work quantifying postural dynamics has attempted to define the repertoire of behaviors performed by an animal. However, a major drawback to these techniques has been their reliance on dimensionality reduction of images, which destroys information about which parts of the body are used in each behavior. To address this issue, we introduce a deep learning-based method for pose estimation, LEAP (LEAP Estimates Animal Pose). LEAP automatically predicts the positions of animal body parts using a deep convolutional neural network with as few as 10 frames of labeled data for training. This framework consists of a graphical interface for interactive labeling of body parts and software for training the network and fast prediction on new data (1 hr to train, 185 Hz predictions). We validate LEAP using videos of freely behaving fruit flies (Drosophila melanogaster) and track 32 distinct points on the body to fully describe the pose of the head, body, wings, and legs with an error rate of <3% of the animal’s body length. We recapitulate a number of reported findings on insect gait dynamics and show LEAP’s applicability as the first step in unsupervised behavioral classification. Finally, we extend the method to more challenging imaging situations (pairs of flies moving on a mesh-like background) and movies from freely moving mice (Mus musculus) where we track the full conformation of the head, body, and limbs.
biorxiv animal-behavior-and-cognition 500+-users 2018
Genetic compensation is triggered by mutant mRNA degradation, bioRxiv, 2018-05-22
Genetic compensation by transcriptional modulation of related gene(s) (also known as transcriptional adaptation) has been reported in numerous systems [1–3]; however, whether and how such a response can be activated in the absence of protein feedback loops is unknown. Here, we develop and analyze several models of transcriptional adaptation in zebrafish and mouse that we show are not caused by loss of protein function. We find that the increase in transcript levels is due to enhanced transcription, and observe a correlation between the levels of mutant mRNA decay and transcriptional upregulation of related genes. To assess the role of mutant mRNA degradation in triggering transcriptional adaptation, we use genetic and pharmacological approaches and find that mRNA degradation is indeed required for this process. Notably, uncapped RNAs, themselves subjected to rapid degradation, can also induce transcriptional adaptation. Next, we generate alleles that fail to transcribe the mutated gene and find that they do not show transcriptional adaptation, and exhibit more severe phenotypes than those observed in alleles displaying mutant mRNA decay. Transcriptome analysis of these different alleles reveals the upregulation of hundreds of genes with enrichment for those showing sequence similarity with the mutated gene’s mRNA, suggesting a model whereby mRNA degradation products induce the response via sequence similarity. These results expand the role of the mRNA surveillance machinery in buffering against mutations by triggering the transcriptional upregulation of related genes. Besides implications for our understanding of disease-causing mutations, our findings will help design mutant alleles with minimal transcriptional adaptation-derived compensation.
biorxiv genetics 200-500-users 2018
Efficient long single molecule sequencing for cost effective and accurate sequencing, haplotyping, and de novo assembly, bioRxiv, 2018-05-17
Obtaining accurate sequences from long DNA molecules is very important for genome assembly and other applications. Here we describe single tube long fragment read (stLFR), a technology that enables this at low cost. It is based on adding the same barcode sequence to sub-fragments of the original long DNA molecule (DNA co-barcoding). To achieve this efficiently, stLFR uses the surface of microbeads to create millions of miniaturized barcoding reactions in a single tube. Using a combinatorial process, up to 3.6 billion unique barcode sequences were generated on beads, enabling practically non-redundant co-barcoding with 50 million barcodes per sample. Using stLFR, we demonstrate efficient unique co-barcoding of over 8 million 20-300 kb genomic DNA fragments. Analysis of the human genome NA12878 with stLFR demonstrated high quality variant calling and phasing into contigs with an N50 of up to 34 Mb. We also demonstrate detection of complex structural variants and complete diploid de novo assembly of NA12878. These analyses were all performed using single stLFR libraries, and their construction did not significantly add to the time or cost of whole genome sequencing (WGS) library preparation. stLFR represents an easily automatable solution that enables high quality sequencing, phasing, SV detection, scaffolding, cost-effective diploid de novo genome assembly, and other long DNA sequencing applications.
biorxiv genomics 0-100-users 2018
The genetic prehistory of the Greater Caucasus, bioRxiv, 2018-05-16
Archaeogenetic studies have described the formation of Eurasian ‘steppe ancestry’ as a mixture of Eastern and Caucasus hunter-gatherers. However, it remains unclear when and where this ancestry arose and whether it was related to a horizon of cultural innovations in the 4th millennium BCE that subsequently facilitated the advance of pastoral societies likely linked to the dispersal of Indo-European languages. To address this, we generated genome-wide SNP data from 45 prehistoric individuals along a 3000-year temporal transect in the North Caucasus. We observe a genetic separation between the groups of the Caucasus and those of the adjacent steppe. The Caucasus groups are genetically similar to contemporaneous populations south of it, suggesting that – unlike today – the Caucasus acted as a bridge rather than an insurmountable barrier to human movement. The steppe groups from Yamnaya and subsequent pastoralist cultures show evidence for previously undetected farmer-related ancestry from different contact zones, while Steppe Maykop individuals harbour additional Upper Palaeolithic Siberian and Native American related ancestry.
biorxiv genomics 200-500-users 2018