An Algorithmic Barrier to Neural Circuit Understanding, bioRxiv, 2019-05-27
Abstract: Neuroscience is witnessing extraordinary progress in experimental techniques, especially at the neural circuit level. These advances are largely aimed at enabling us to understand how neural circuit computations mechanistically cause behavior. Here, using techniques from Theoretical Computer Science, we examine how many experiments are needed to obtain such an empirical understanding. It is proved, mathematically, that establishing the most extensive notions of understanding needs exponentially many experiments in the number of neurons, in general, unless a widely-posited hypothesis about computation is false. Worse still, the feasible experimental regime is one where the number of experiments scales sub-linearly in the number of neurons, suggesting a fundamental impediment to such an understanding. Determining which notions of understanding are algorithmically tractable thus becomes an important new endeavor in Neuroscience.
biorxiv neuroscience 100-200-users 2019
Population history and genetic adaptation of the Fulani nomads: Inferences from genome-wide data and the lactase persistence trait, bioRxiv, 2019-05-27
Abstract: Human population history in the Holocene was profoundly impacted by changes in lifestyle following the invention and adoption of food-production practices. These changes triggered significant increases in population sizes and expansions over large distances. Here we investigate the population history of the Fulani, a pastoral population extending throughout the African Sahel/Savannah belt. Based on genome-wide analyses we propose that ancestors of the Fulani population experienced admixture between a West African group and a group carrying both European and North African ancestries. This admixture was likely coupled with newly adopted herding practices, as it resulted in signatures of genetic adaptation in contemporary Fulani genomes, including the control element of the LCT gene enabling carriers to digest lactose throughout their lives. The lactase persistence (LP) trait in the Fulani is conferred by the presence of the allele T-13910, which is also present at high frequencies in Europe. We establish that the T-13910 LP allele in Fulani individuals analysed in this study lies on a European haplotype background, thus excluding parallel convergent evolution. Our findings further suggest that Eurasian admixture and the European LP allele were introduced into the Fulani through contact with a North African population. We furthermore confirm the link between the lactose digestion phenotype in the Fulani and the MCM6/LCT locus by reporting the first genome-wide association study (GWAS) of the lactase persistence trait. We also further explored signals of recent adaptation in the Fulani and identified additional candidates for selection to adapt to herding lifestyles.
biorxiv evolutionary-biology 100-200-users 2019
The advantages and disadvantages of short- and long-read metagenomics to infer bacterial and eukaryotic community composition, bioRxiv, 2019-05-27
Abstract
Background: The first step in understanding ecological community diversity and dynamics is quantifying community membership. An increasingly common method for doing so is through metagenomics. Because of the rapidly increasing popularity of this approach, a large number of computational tools and pipelines are available for analysing metagenomic data. However, the majority of these tools have been designed and benchmarked using highly accurate short-read data (i.e. Illumina), with few studies benchmarking classification accuracy for long error-prone reads (PacBio or Oxford Nanopore). In addition, few tools have been benchmarked for non-microbial communities.
Results: Here we use simulated error-prone Oxford Nanopore and high-accuracy Illumina read sets to systematically investigate the effects of sequence length and taxon type on classification accuracy for metagenomic data from both microbial and non-microbial communities. We show that, very generally, classification accuracy is far lower for non-microbial communities, even at low taxonomic resolution (e.g. family rather than genus).
Conclusions: We then show that for two popular taxonomic classifiers, long error-prone reads can significantly increase classification accuracy, and this is most pronounced for non-microbial communities. This work provides insight on the expected accuracy for metagenomic analyses for different taxonomic groups, and establishes the point at which read length becomes more important than error rate for assigning the correct taxon.
biorxiv bioinformatics 100-200-users 2019
A shared genetic basis for personality traits and local cortical grey matter structure?, bioRxiv, 2019-05-25
Abstract: Personality traits are key indices of inter-individual variation. Personality is heritable and has been associated with brain structure and function. To date, it is unknown whether the relation between personality and brain macrostructure can be explained by genetic factors. In a large-scale twin sample (Human Connectome Project), we performed genetic correlation analyses to evaluate whether personality traits (NEO-FFI) and local brain structure have a shared genetic basis. We found a genetic overlap between personality traits and local brain structure in 11 of 22 observed phenotypic associations in predominantly frontal cortices. In these regions the proportion of phenotypic covariance accounted for by shared genetic effects was between 82 and 100%. Second, in the case of Agreeableness, Conscientiousness, and Openness, the phenotypic correlation between personality and local brain structure was observed to reflect genetic, more than environmental, factors. These observations indicate that genetic factors influence the relationship between personality traits and local brain structure. Importantly, observed associations between personality traits and cortical thickness replicated only partially in two independent large-scale samples of unrelated individuals. Taken together, our findings demonstrate that genes impact the relationship between personality and local brain structure, but that phenotypic associations are, to a large extent, non-generalizable. These observations provide a novel perspective on the nature and nurture of the biological basis of personality.
biorxiv neuroscience 100-200-users 2019
Algorithms for efficiently collapsing reads with Unique Molecular Identifiers, bioRxiv, 2019-05-25
Abstract
Background: Unique Molecular Identifiers (UMIs) are used in many experiments to find and remove PCR duplicates. Although there are many tools for deduplicating reads by finding reads with the same alignment coordinates and UMIs, many of them either cannot handle substitution errors or require expensive pairwise UMI comparisons that do not scale efficiently to larger datasets.
Results: We formulate the problem of deduplicating UMIs in a manner that enables optimizations to be made and more efficient data structures to be used. We implement our data structures and optimizations in a tool called UMICollapse, which is able to deduplicate over one million unique UMIs of length 9 at a single alignment position in around 26 seconds.
Conclusions: We present a new formulation of the UMI deduplication problem, and show that it can be solved faster with more sophisticated data structures.
biorxiv bioinformatics 100-200-users 2019
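To make the UMI deduplication problem from the last abstract concrete, here is a minimal Python sketch of the naive pairwise baseline that the abstract describes as expensive. It is not the UMICollapse algorithm, and the function names, the greedy most-frequent-first merging rule, and the `max_mismatches` parameter are assumptions chosen for illustration. All UMIs are assumed to come from reads at a single alignment position.

```python
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Number of substitution differences between two equal-length UMIs."""
    if len(a) != len(b):
        raise ValueError("UMIs must have the same length")
    return sum(x != y for x, y in zip(a, b))


def collapse_umis(umis, max_mismatches=1):
    """Greedily merge UMIs that lie within max_mismatches substitutions.

    Hypothetical baseline: UMIs are visited from most to least frequent,
    and each one is merged into the first existing representative that is
    close enough. The pairwise comparisons make this quadratic in the
    number of unique UMIs in the worst case.
    """
    counts = Counter(umis)
    # Most frequent first; ties broken alphabetically so the result is deterministic.
    ordered = sorted(counts, key=lambda u: (-counts[u], u))
    representatives = {}  # representative UMI -> total read count
    for umi in ordered:
        for rep in representatives:
            if hamming(umi, rep) <= max_mismatches:
                representatives[rep] += counts[umi]
                break
        else:
            # No representative is close enough, so this UMI starts a new group.
            representatives[umi] = counts[umi]
    return representatives


if __name__ == "__main__":
    reads = ["AAAA", "AAAA", "AAAT", "CCCC", "CCCG", "CCCC", "GGGG"]
    print(collapse_umis(reads))
```

Because every unique UMI may be compared against every representative, the running time grows quadratically in the worst case, which is the scaling problem the abstract says better data structures are meant to avoid.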