Species better track the shifting isotherms in the oceans than on lands, bioRxiv, 2019-09-11
Despite mounting evidence of species redistribution as climate warms, our knowledge of the coupling between species range shifts and isotherm shifts is limited. Compiling a global geo-database of 30,534 range shifts from 12,415 taxa, we show that only marine taxa closely track the shifting isotherms. In the oceans, the velocity of isotherm shifts interacts synergistically with anthropogenic disturbances and baseline temperatures such that isotherm tracking by marine life happens either in warm and undisturbed waters (e.g. Central Pacific Basin) or in colder waters where human activities are more pronounced (e.g. North Sea). On lands, increasing anthropogenic activities and temperatures negatively impact the capacity of terrestrial taxa to track isotherm shifts in latitude and elevation, respectively. This suggests that terrestrial taxa are lagging behind the shifting isotherms, most likely due to their wider thermal safety margin, more constrained physical environment for dispersal and higher availability of thermal microrefugia at shorter distances.
biorxiv ecology 0-100-users 2019
A compact vocabulary of paratope-epitope interactions enables predictability of antibody-antigen binding, bioRxiv, 2019-09-10
Antibody-antigen binding relies on the specific interaction of amino acids at the paratope-epitope interface. It has been a long-standing question whether antibody-antigen binding is predictable. A fundamental premise for the predictability of paratope-epitope interactions is the existence of structural units that are universally shared among antibody-antigen binding complexes. Here, we screened the largest available set of non-redundant antibody-antigen structures for binding patterns and identified structural interaction motifs, which together compose a vocabulary of paratope-epitope interactions that is universally shared among investigated antibody-antigen structures. The vocabulary (i) is compact, less than 10^4 motifs, (ii) is immunity-specific and distinct from non-immune protein-protein interactions, (iii) mediates specific oligo- and polyreactive interactions between paratope-epitope pairs, and (iv) enables the machine learnability of paratope-epitope interactions. Collectively, our results demonstrate the predictability of antibody-antigen binding.
biorxiv immunology 100-200-users 2019
Direct-fit to nature: an evolutionary perspective on biological (and artificial) neural networks, bioRxiv, 2019-09-10
Evolution is a blind fitting process by which organisms, over generations, adapt to the niches of an ever-changing environment. Does the mammalian brain use similar brute-force fitting processes to learn how to perceive and act upon the world? Recent advances in training deep neural networks have exposed the power of optimizing millions of synaptic weights to map millions of observations along ecologically relevant objective functions. This class of models has dramatically outstripped simpler, more intuitive models, operating robustly in real-life contexts spanning perception, language, and action coordination. These models do not learn an explicit, human-interpretable representation of the underlying structure of the data; rather, they use local computations to interpolate over task-relevant manifolds in a high-dimensional parameter space. Furthermore, counterintuitively, over-parameterized models, similarly to evolutionary processes, can be simple and parsimonious as they provide a versatile, robust solution for learning a diverse set of functions. In contrast to traditional scientific models, where the ultimate goal is interpretability, over-parameterized models eschew interpretability in favor of solving real-life problems or tasks. We contend that over-parameterized blind fitting presents a radical challenge to many of the underlying assumptions and practices in computational neuroscience and cognitive psychology. At the same time, this shift in perspective informs longstanding debates and establishes unexpected links with evolution, ecological psychology, and artificial life.
biorxiv neuroscience 100-200-users 2019
Quantifying the tradeoff between sequencing depth and cell number in single-cell RNA-seq, bioRxiv, 2019-09-10
The allocation of a sequencing budget when designing single-cell RNA-seq experiments requires consideration of the tradeoff between the number of cells sequenced and the read depth per cell. One approach to the problem is to perform a power analysis for a univariate objective such as differential expression. However, many of the goals of single-cell analysis, such as clustering, require consideration of the multivariate structure of gene expression. We introduce an approach to quantifying the impact of sequencing depth and cell number on the estimation of a multivariate generative model for gene expression that is based on error analysis in the framework of a variational autoencoder. We find that at shallow depths, the marginal benefit of deeper sequencing per cell significantly outweighs the benefit of increased cell numbers. Above about 15,000 reads per cell the benefit of increased sequencing depth is minor. Code for the workflow reproducing the results of the paper is available at https://github.com/pachterlab/SBP_2019.
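The budget constraint behind this tradeoff is simple arithmetic: for a fixed number of reads, every increase in depth per cell reduces the number of cells that can be sequenced. The sketch below only enumerates that constraint; it is not the authors' variational autoencoder analysis. The function name `feasible_designs`, the budget of 150 million reads, and the candidate depths are hypothetical values chosen for illustration, apart from the 15,000 reads per cell figure quoted in the abstract.

```python
def feasible_designs(total_reads, depths):
    """Return (reads_per_cell, n_cells) pairs that fit a fixed read budget.

    Hypothetical helper: it encodes only total_reads ~ n_cells * reads_per_cell
    and says nothing about which design estimates the expression model best.
    """
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    designs = []
    for depth in depths:
        if depth <= 0:
            raise ValueError("each depth must be positive")
        # Integer division: a partially sequenced cell is not counted.
        designs.append((depth, total_reads // depth))
    return designs


if __name__ == "__main__":
    budget = 150_000_000  # assumed budget, not taken from the paper
    candidate_depths = [1_000, 5_000, 15_000, 50_000]  # 15,000 is the abstract's threshold
    for depth, cells in feasible_designs(budget, candidate_depths):
        print(f"{depth:>6} reads/cell -> {cells:>7} cells")
```

Under the abstract's finding, designs with more than about 15,000 reads per cell would spend budget on depth that yields little further benefit, so the remaining reads are arguably better spent on additional cells.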
biorxiv genomics 200-500-users 2019
A multi-view model for relative and absolute microbial abundances, bioRxiv, 2019-09-09
The absolute abundance of bacterial taxa in human host-associated environments plays a critical role in reproductive and gastrointestinal health. However, obtaining the absolute abundance of many bacterial species is typically prohibitively expensive. In contrast, relative abundance data for many species are comparatively cheap and easy to collect (e.g., with universal primers for the 16S rRNA gene). In this paper, we propose a method to jointly model relative abundance data for many taxa and absolute abundance data for a subset of taxa. Our method provides point and interval estimates for the absolute abundance of all taxa. Crucially, our proposal accounts for differences in the efficiency of taxon detection in the relative and absolute abundance data. We show that modeling taxon-specific efficiencies substantially reduces the estimation error for absolute abundance, and controls the coverage of interval estimators. We demonstrate the performance of our proposed method via a simulation study, a sensitivity study where we jackknife the taxa with observed absolute abundances, and a study of women with bacterial vaginosis.
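To make the problem setting concrete, the sketch below shows a naive baseline, not the authors' method: it estimates the total bacterial load from the taxa that have both measurements and then scales the remaining relative abundances by that total. It assumes equal detection efficiency for every taxon, which is exactly the assumption the abstract says its model relaxes, and it produces point estimates only. The function name `naive_absolute_abundance` and the example numbers are hypothetical.

```python
from statistics import median


def naive_absolute_abundance(relative, absolute_subset):
    """Scale relative abundances to absolute ones using a subset of taxa.

    relative:        dict taxon -> relative abundance (fractions of the sample)
    absolute_subset: dict taxon -> measured absolute abundance (subset of taxa)

    Naive baseline that assumes equal detection efficiency across taxa;
    it is NOT the joint model described in the abstract.
    """
    # Each taxon with both measurements implies an estimate of the total load.
    ratios = [
        absolute_subset[taxon] / relative[taxon]
        for taxon in absolute_subset
        if relative.get(taxon, 0) > 0
    ]
    if not ratios:
        raise ValueError("need at least one taxon with both measurements")
    total_load = median(ratios)  # median damps the effect of one odd taxon
    # Keep measured values; fill in the rest by scaling relative abundance.
    return {
        taxon: absolute_subset.get(taxon, fraction * total_load)
        for taxon, fraction in relative.items()
    }


if __name__ == "__main__":
    rel = {"taxon_A": 0.5, "taxon_B": 0.3, "taxon_C": 0.2}  # made-up values
    measured = {"taxon_A": 500.0, "taxon_B": 300.0}  # made-up values
    print(naive_absolute_abundance(rel, measured))
```

If detection efficiency differs between taxa, the per-taxon ratios disagree and this baseline becomes biased, which is the situation that motivates modeling taxon-specific efficiencies.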
biorxiv genomics 0-100-users 2019
BrainSpace: a toolbox for the analysis of macroscale gradients in neuroimaging and connectomics datasets, bioRxiv, 2019-09-09
Understanding how higher-order cognitive function emerges from the underlying brain structure depends on quantifying how the behaviour of discrete regions is integrated within the broader cortical landscape. Recent work has established that this macroscale brain organization and function can be quantified in a compact manner through the use of multivariate machine learning approaches that identify manifolds often described as cortical gradients. By quantifying topographic principles of macroscale organization, cortical gradients lend an analytical framework to study structural and functional brain organization across species, throughout development and aging, and its perturbations in disease. More generally, their macroscale perspective on brain organization offers novel possibilities to investigate the complex relationships between brain structure, function, and cognition in a quantified manner. Here, we present a compact workflow and open-access toolbox that allows for (i) the identification of gradients (from structural or functional imaging data), (ii) their alignment (across subjects or modalities), and (iii) their visualization (in embedding or cortical space). Our toolbox also allows for controlled association studies between gradients and other brain-level features, adjusted with respect to several null models that account for spatial autocorrelation. The toolbox is implemented in both Python and Matlab, programming languages widely used by the neuroimaging and network neuroscience communities. Several use-case examples and validation experiments demonstrate the usage and consistency of our tools for the analysis of functional and microstructural gradients across different spatial scales.
biorxiv neuroscience 0-100-users 2019