GeneRax: A tool for species tree-aware maximum likelihood based gene tree inference under gene duplication, transfer, and loss, bioRxiv, 2019-09-27
Abstract: Inferring gene trees is difficult because alignments are often too short, and thus contain insufficient signal, while substitution models inevitably fail to capture the complexity of the evolutionary processes. To overcome these challenges, species tree-aware methods seek to use information from a putative species tree. However, there are few methods available that implement a full likelihood framework or account for horizontal gene transfers. Furthermore, these methods often require expensive data pre-processing (e.g., computing bootstrap trees), and rely on approximations and heuristics that limit the exploration of tree space. Here we present GeneRax, the first maximum likelihood species tree-aware gene tree inference software. It simultaneously accounts for substitutions at the sequence level and gene-level events, such as duplication, transfer, and loss, and uses established maximum likelihood optimization algorithms. GeneRax can infer rooted gene trees for an arbitrary number of gene families, directly from the per-gene sequence alignments and a rooted, but undated, species tree. We show that, compared to competing tools, on simulated data GeneRax infers trees that are the closest to the true tree in 90% of the simulations in terms of relative Robinson-Foulds distance. On empirical datasets, GeneRax is the fastest among all tested methods when starting from aligned sequences, and it infers trees with the highest likelihood score, based on our model. GeneRax completed tree inferences and reconciliations for 1099 Cyanobacteria families in eight minutes on 512 CPU cores. Thus, its advanced parallelization scheme enables large-scale analyses. GeneRax is available under GNU GPL at https://github.com/BenoitMorel/GeneRax.
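The relative Robinson-Foulds distance used as the accuracy measure above can be illustrated with a minimal sketch. This is not GeneRax code: the nested-tuple tree encoding and the function names `bipartitions` and `relative_rf` are assumptions made for illustration, and the distance is normalized by the total number of non-trivial splits in both trees.

```python
def leaf_set(tree):
    """Return the set of leaf names in a nested-tuple tree (leaves are strings)."""
    if isinstance(tree, str):
        return frozenset([tree])
    return frozenset().union(*(leaf_set(child) for child in tree))


def bipartitions(tree):
    """Collect the non-trivial splits of the leaf set induced by internal branches."""
    all_leaves = leaf_set(tree)
    splits = set()

    def visit(node):
        if isinstance(node, str):
            return frozenset([node])
        below = frozenset().union(*(visit(child) for child in node))
        other = all_leaves - below
        # Keep only splits with at least two leaves on each side.
        if len(below) >= 2 and len(other) >= 2:
            # Store one canonical side so the same split is never counted twice.
            splits.add(min(below, other, key=lambda s: (len(s), sorted(s))))
        return below

    visit(tree)
    return splits


def relative_rf(tree_a, tree_b):
    """Fraction of non-trivial splits that the two trees do not share."""
    splits_a, splits_b = bipartitions(tree_a), bipartitions(tree_b)
    total = len(splits_a) + len(splits_b)
    return len(splits_a ^ splits_b) / total if total else 0.0


true_tree = ((("A", "B"), "C"), ("D", "E"))
inferred_tree = ((("A", "C"), "B"), ("D", "E"))
print(relative_rf(true_tree, inferred_tree))
```

A value of 0 means the two topologies agree on every split, and 1 means they share none; both trees must be defined over the same set of leaves.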
biorxiv bioinformatics 0-100-users 2019
Ancient genomic regulatory blocks are a major source for gene deserts in vertebrates after whole genome duplications, bioRxiv, 2019-09-26
Abstract: We investigated how the two rounds of whole genome duplication that occurred at the base of the vertebrate lineage have impacted ancient microsyntenic associations involving developmental regulators (known as genomic regulatory blocks, GRBs). We showed that the majority of GRBs present in the last common ancestor of chordates have been maintained as a single copy in humans. We found evidence that dismantling of the additional GRB copies occurred early in vertebrate evolution, often through the differential retention of the regulatory gene but loss of the bystander gene’s exonic sequences. Despite the large evolutionary scale, the presence of duplicated highly conserved non-coding regions provided unambiguous proof for this scenario for dozens of ancient GRBs. Remarkably, the dismantling of ancient GRB duplicates has contributed to the creation of large gene deserts associated with regulatory genes in vertebrates, providing a widespread mechanism for the origin of these enigmatic genomic traits.
biorxiv evolutionary-biology 0-100-users 2019
Tximeta: reference sequence checksums for provenance identification in RNA-seq, bioRxiv, 2019-09-26
Abstract: Correct annotation metadata is critical for reproducible and accurate RNA-seq analysis. When files are shared publicly or among collaborators with incorrect or missing annotation metadata, it becomes difficult or impossible to reproduce bioinformatic analyses from raw data. It also makes it more difficult to locate the transcriptomic features, such as transcripts or genes, in their proper genomic context, which is necessary for overlapping expression data with other datasets. We provide a solution in the form of an R/Bioconductor package, tximeta, that performs numerous annotation and metadata gathering tasks automatically on behalf of users during the import of transcript quantification files. The correct reference transcriptome is identified via a hashed checksum stored in the quantification output, and key transcript databases are downloaded and cached locally. The computational paradigm of automatically adding annotation metadata based on reference sequence checksums can greatly facilitate genomic workflows by helping to reduce overhead during bioinformatic analyses, preventing costly bioinformatic mistakes, and promoting computational reproducibility. The tximeta package is available at https://bioconductor.org/packages/tximeta.
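The checksum-based identification described above can be sketched in a few lines. This is a simplified illustration in Python, not the tximeta implementation (which is an R package): the hashing scheme, the function names, and the metadata record are all hypothetical.

```python
import hashlib


def reference_checksum(sequences):
    """Hash an ordered list of transcript sequences into one hex digest."""
    digest = hashlib.sha256()
    for seq in sequences:
        digest.update(seq.upper().encode("ascii"))
        digest.update(b"\n")  # separator so sequence boundaries affect the hash
    return digest.hexdigest()


def identify_reference(checksum, known_references):
    """Look up annotation metadata for a checksum; None if it is unknown."""
    return known_references.get(checksum)


# Hypothetical registry mapping checksums to made-up provenance records.
transcripts = ["ATGGCGTAA", "ATGCCCTGA"]
known = {
    reference_checksum(transcripts): {"source": "ExampleDB", "release": "1"},
}

# A quantification file that stored the same checksum resolves to its metadata.
print(identify_reference(reference_checksum(transcripts), known))

# Any change to the reference sequences yields a checksum with no match.
print(identify_reference(reference_checksum(["ATGGCGTAA"]), known))
```

The point of the design is that provenance travels with the data: the checksum is derived from the sequences themselves, so it stays valid even when file names or accompanying notes are lost.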
biorxiv bioinformatics 0-100-users 2019
A processive rotary mechanism couples substrate unfolding and proteolysis in the ClpXP degradation machinery, bioRxiv, 2019-09-25
Abstract: The ClpXP degradation machine consists of a hexameric AAA+ unfoldase (ClpX) and a pair of heptameric serine protease rings (ClpP) that unfold, translocate, and subsequently degrade client proteins. ClpXP is an important target for drug development against infectious diseases. Although structures are available for isolated ClpX and ClpP rings, it remains unknown how symmetry-mismatched ClpX and ClpP work in tandem for processive substrate translocation into the ClpP proteolytic chamber. Here we present cryo-EM structures of the substrate-bound ClpXP complex from Neisseria meningitidis at 2.3 to 3.3 Å resolution. The structures allow development of a model in which the cyclical hydrolysis of ATP is coupled to concerted motions of ClpX loops that lead to directional substrate translocation and ClpX rotation relative to ClpP. Our data add to the growing body of evidence that AAA+ molecular machines generate translocating forces by a common mechanism.
biorxiv biophysics 0-100-users 2019
Dendritic calcium signals in rhesus macaque motor cortex drive an optical brain-computer interface, bioRxiv, 2019-09-25
Abstract: Calcium imaging has rapidly developed into a powerful tool for recording from large populations of neurons in vivo. Imaging in rhesus macaque motor cortex can enable the discovery of new principles of motor cortical function and can inform the design of next-generation brain-computer interfaces (BCIs). Surface two-photon (2P) imaging, however, cannot presently access somatic calcium signals of neurons from all layers of macaque motor cortex due to photon scattering. Here, we demonstrate an implant and imaging system capable of chronic, motion-stabilized 2P imaging of calcium signals in macaques engaged in a motor task. By imaging apical dendrites, some of which originated from deep layer 5 neurons, as well as superficial cell bodies, we achieved optical access to large populations of deep and superficial cortical neurons across dorsal premotor (PMd) and gyral primary motor (M1) cortices. Dendritic signals from individual neurons displayed tuning for different directions of arm movement, which was stable across many weeks. Combining several technical advances, we developed an optical BCI (oBCI) driven by these dendritic signals and successfully decoded movement direction online. By fusing 2P functional imaging with CLARITY volumetric imaging, we verify that an imaged dendrite, which contributed to oBCI decoding, originated from a putative Betz cell in motor cortical layer 5. This approach establishes new opportunities for studying motor control and designing BCIs.
biorxiv neuroscience 0-100-users 2019
Hue tuning curves in V4 change with visual context, bioRxiv, 2019-09-25
Abstract: To understand activity in the visual cortex, researchers typically investigate how parametric changes in stimuli affect neural activity. A fundamental tenet of this approach is that the response properties of neurons in one context, e.g. color stimuli, are representative of responses in other contexts, e.g. natural scenes. This assumption is not often tested. Here, for neurons in macaque area V4, we first estimated tuning curves for hue by presenting artificial stimuli of varying hue, and then tested whether these would correlate with hue tuning curves estimated from responses to natural images. We found that neurons’ hue tuning on artificial stimuli was not representative of their hue tuning on natural images, even if the neurons were strongly color-responsive. One explanation of this result is that neurons in V4 respond to interactions between hue and other visual features. This finding exemplifies how tuning curves estimated by varying a small number of stimulus features can communicate a small and potentially unrepresentative slice of the neural response function.
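The comparison described above, estimating a hue tuning curve in each context and then correlating the two, can be sketched as follows. This is a deliberately simplified illustration, not the authors' analysis: it assumes each trial has a single hue value in degrees, uses plain binned averages, and the function names and toy data are hypothetical.

```python
from math import sqrt


def tuning_curve(hues_deg, responses, n_bins):
    """Mean response per hue bin, with hue given in degrees on a 360-degree circle."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for hue, response in zip(hues_deg, responses):
        bin_index = int((hue % 360.0) / 360.0 * n_bins) % n_bins
        sums[bin_index] += response
        counts[bin_index] += 1
    if 0 in counts:
        raise ValueError("every hue bin needs at least one trial")
    return [s / c for s, c in zip(sums, counts)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Toy data for one hypothetical neuron: the same hues shown in two contexts.
hues = [0, 90, 180, 270, 0, 90, 180, 270]
artificial_responses = [1, 2, 3, 4, 1, 2, 3, 4]
natural_responses = [4, 3, 2, 1, 4, 3, 2, 1]

curve_artificial = tuning_curve(hues, artificial_responses, n_bins=4)
curve_natural = tuning_curve(hues, natural_responses, n_bins=4)
print(pearson(curve_artificial, curve_natural))
```

A correlation near 1 would indicate that tuning measured with artificial stimuli carries over to natural images; values near zero or below indicate that it does not. In real natural images hue co-varies with other features, so a binned average is only a rough stand-in for a proper model of the responses.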
biorxiv neuroscience 0-100-users 2019