Mapping Vector Field of Single Cells, bioRxiv, 2019-07-09
Abstract: Understanding how gene expression in single cells progresses over time is vital for revealing the mechanisms governing cell fate transitions. RNA velocity, which infers immediate changes in gene expression by comparing levels of new (unspliced) versus mature (spliced) transcripts (La Manno et al. 2018), represents an important advance to these efforts. A key remaining question is whether it is possible to predict the most probable cell state backward or forward over arbitrary time-scales. To this end, we introduce an inclusive model (termed Dynamo) capable of predicting cell states over extended time periods, which incorporates promoter state switching, transcription, splicing, translation and RNA/protein degradation by taking advantage of scRNA-seq and the co-assay of transcriptome and proteome. We also implement scSLAM-seq by extending SLAM-seq to plate-based scRNA-seq (Hendriks et al. 2018; Erhard et al. 2019; Cao, Zhou, et al. 2019) and augment the model by explicitly incorporating the metabolic labelling of nascent RNA. We show that through careful design of labelling experiments and an efficient mathematical framework, the entire kinetic behavior of a cell from this model can be robustly and accurately inferred. Aided by the improved framework, we show that it is possible to reconstruct the transcriptomic vector field from sparse and noisy vector samples generated by single cell experiments. The reconstructed vector field further enables global mapping of potential landscapes that reflect the relative stability of a given cell state, and the minimal transition time and most probable paths between any cell states in the state space. This work thus foreshadows the possibility of predicting long-term trajectories of cells during a dynamic process instead of short-time velocity estimates.
Our methods are implemented as an open source tool, dynamo (https://github.com/aristoteleo/dynamo-release).
biorxiv systems-biology 100-200-users 2019
Recent evolutionary history of tigers highlights contrasting roles of genetic drift and selection, bioRxiv, 2019-07-09
Abstract: Tigers are among the most charismatic of endangered species, yet little is known about their evolutionary history. We sequenced 65 individual genomes representing the extant tiger geographic range. We found strong genetic differentiation between putative tiger subspecies, divergence within the last 10,000 years, and demographic histories dominated by population bottlenecks. Indian tigers have substantial genetic variation and substructure stemming from population isolation and intense recent bottlenecks. Despite high genetic diversity across India, individual tigers host longer runs of homozygosity, potentially suggesting recent inbreeding. Amur tiger genomes revealed the strongest signals of selection and over-representation of gene ontology categories potentially involved in metabolic adaptation to cold. Novel insights highlight the antiquity of northeast Indian tigers. Our results demonstrate recent evolution, with differential isolation, selection and drift in extant tiger populations, providing insights for conservation and future survival.
biorxiv genomics 0-100-users 2019
Hierarchical Compression Reveals Sub-Second to Day-Long Structure in Larval Zebrafish Behaviour, bioRxiv, 2019-07-08
Abstract: Animal behaviour is dynamic, evolving over multiple timescales from milliseconds to days and even across a lifetime. To understand the mechanisms governing these dynamics, it is necessary to capture multi-timescale structure from behavioural data. Here, we develop computational tools and study the behaviour of hundreds of larval zebrafish tracked continuously across multiple 24-hour day/night cycles. We extracted millions of movements and pauses, termed bouts, and used unsupervised learning to reduce each larva’s behaviour to an alternating sequence of active and inactive bout types, termed modules. Through hierarchical compression, we identified recurrent behavioural patterns, termed motifs. Module and motif usage varied across the day/night cycle, revealing structure at sub-second to day-long timescales. We further demonstrate that module and motif analysis can uncover novel pharmacological and genetic mutant phenotypes. Overall, our work reveals the organisation of larval zebrafish behaviour at multiple timescales and provides tools to identify structure from large-scale behavioural datasets.
biorxiv neuroscience 0-100-users 2019
Structural basis for recognition of RALF peptides by LRX proteins during pollen tube growth, bioRxiv, 2019-07-08
Abstract: Plant reproduction relies on the highly regulated growth of the pollen tube for proper sperm delivery. This process is controlled by secreted RALF signaling peptides, which have previously been shown to be perceived by CrRLK1L membrane receptor-kinases and leucine-rich repeat (LRR) extensin proteins (LRXs). Here we demonstrate that RALF peptides are active as folded, disulfide bond-stabilized proteins, which can bind to the LRR domain of LRX proteins with nanomolar affinity. Crystal structures of the LRX-RALF signaling complexes reveal LRX proteins as constitutive dimers. The N-terminal LRR domain containing the RALF binding site is tightly linked to the extensin domain via a cysteine-rich tail. Our biochemical and structural work reveals a complex signaling network by which RALF ligands may instruct different signaling proteins – here CrRLK1Ls and LRXs – through structurally different binding modes to orchestrate cell wall remodeling in rapidly growing pollen tubes.
Significance: Plant reproduction relies on proper pollen tube growth to reach the female tissue and release the sperm cells. This process is highly regulated by a family of secreted signaling peptides that are recognized by cell-wall monitoring proteins to enable plant fertilization. Here, we report the crystal structure of the LRX-RALF cell-wall complex and we demonstrate that RALF peptides are active as folded proteins. RALFs are autocrine signaling proteins able to instruct LRX cell-wall modules and CrRLK1L receptors through structurally different binding modes to coordinate pollen tube integrity.
biorxiv plant-biology 0-100-users 2019
Systems-level immunomonitoring using self-sampled capillary blood, bioRxiv, 2019-07-08
Abstract: Comprehensive profiling of the human immune system in patients with cancer or autoimmune disease, and during infections, is providing valuable information that helps us understand disease states, discriminate productive from inefficient immune responses, and identify possible targets for immune modulation. Recent technical advances now allow all immune cell populations and hundreds of plasma proteins to be detected using small-volume blood samples. To democratize such systems-immunological analyses, further simplified blood sampling and preservation will be important. Here we describe how 100 microliters of capillary blood, obtained via a nearly painless self-sampling device and then preserved and frozen, can simplify systems-level immunomonitoring studies.
biorxiv immunology 100-200-users 2019
Transcriptome assembly from long-read RNA-seq alignments with StringTie2, bioRxiv, 2019-07-08
Abstract: RNA sequencing using the latest single-molecule sequencing instruments produces reads that are thousands of nucleotides long. The ability to assemble these long reads can greatly improve the sensitivity of long-read analyses. Here we present StringTie2, a reference-guided transcriptome assembler that works with both short and long reads. StringTie2 includes new computational methods to handle the high error rate of long-read sequencing technology, which previous assemblers could not tolerate. It also offers the ability to work with full-length super-reads assembled from short reads, which further improves the quality of assemblies. On 33 short-read datasets from humans and two plant species, StringTie2 is 47.3% more precise and 3.9% more sensitive than Scallop. On multiple long-read datasets, StringTie2 on average correctly assembles 8.3 and 2.6 times as many transcripts as FLAIR and Traphlor, respectively, with substantially higher precision. StringTie2 is also faster and has a smaller memory footprint than all comparable tools.
biorxiv genomics 100-200-users 2019