A comparison of single-cell trajectory inference methods towards more accurate and robust tools, bioRxiv, 2018-03-06
Using single-cell omics data, it is now possible to computationally order cells along trajectories, allowing the unbiased study of cellular dynamic processes. Since 2014, more than 50 trajectory inference methods have been developed, each with its own set of methodological characteristics. As a result, choosing a method to infer trajectories is often challenging, since a comprehensive assessment of the performance and robustness of each method is still lacking. In order to facilitate the comparison of the results of these methods to each other and to a gold standard, we developed a global framework to benchmark trajectory inference tools. Using this framework, we compared the trajectories from a total of 29 trajectory inference methods on a large collection of real and synthetic datasets. We evaluated methods using several metrics, including accuracy of the inferred ordering, correctness of the network topology, code quality and user friendliness. We found that some methods, including Slingshot, TSCAN and Monocle DDRTree, clearly outperform other methods, although their performance depended on the type of trajectory present in the data. Based on our benchmarking results, we therefore developed a set of guidelines for method users. However, our analysis also indicated that there is still a lot of room for improvement, especially for methods detecting complex trajectory topologies. Our evaluation pipeline can therefore be used to spearhead the development of new scalable and more accurate methods, and is available at github.com/dynverse/dynverse.
To our knowledge, this is the first comprehensive assessment of trajectory inference methods. For now, we exclusively evaluated the methods on their default parameters, but plan to add a detailed parameter tuning procedure in the future.
We gladly welcome any discussion and feedback on key decisions made as part of this study, including the metrics used in the benchmark, the quality control checklist, and the implementation of the method wrappers. These discussions can be held at github.com/dynverse/dynverse/issues.
biorxiv bioinformatics 100-200-users 2018
Spatial organization of the somatosensory cortex revealed by cyclic smFISH, bioRxiv, 2018-03-05
The global efforts towards the creation of a molecular census of the brain using single-cell transcriptomics are generating a large catalog of molecularly defined cell types lacking spatial information. Thus, new methods are needed to map a large number of cell-specific markers simultaneously on large tissue areas. Here, we developed a cyclic single molecule fluorescence in situ hybridization methodology and defined the cellular organization of the somatosensory cortex using markers identified by single-cell transcriptomics.
biorxiv neuroscience 100-200-users 2018
Meta-analysis of genome-wide association studies for height and body mass index in ∼700,000 individuals of European ancestry, bioRxiv, 2018-03-03
Genome-wide association studies (GWAS) stand as powerful experimental designs for identifying DNA variants associated with complex traits and diseases. In the past decade, both the number of such studies and their sample sizes have increased dramatically. Recent GWAS of height and body mass index (BMI) in ∼250,000 European participants have led to the discovery of ∼700 and ∼100 nearly independent SNPs associated with these traits, respectively. Here we combine summary statistics from those two studies with GWAS of height and BMI performed in ∼450,000 UK Biobank participants of European ancestry. Overall, our combined GWAS meta-analysis reaches N∼700,000 individuals and substantially increases the number of GWAS signals associated with these traits. We identified 3,290 and 716 near-independent SNPs associated with height and BMI, respectively (at a revised genome-wide significance threshold of p < 1 × 10⁻⁸), including 1,185 height-associated SNPs and 554 BMI-associated SNPs located within loci not previously identified by these two GWAS. The genome-wide significant SNPs explain ∼24.6% of the variance of height and ∼5% of the variance of BMI in an independent sample from the Health and Retirement Study (HRS). Correlations between polygenic scores based upon these SNPs and actual height and BMI in HRS participants were 0.44 and 0.20, respectively. By integrating GWAS and eQTL data using Summary-data-based Mendelian Randomization (SMR), we identified an enrichment of eQTLs amongst lead height and BMI signals, prioritising 684 and 134 genes, respectively. Our study demonstrates that, as previously predicted, increasing GWAS sample sizes continues to deliver by discovering new loci, increasing prediction accuracy, and providing additional data to achieve deeper insight into complex trait biology. All summary statistics are made available for follow-up studies.
biorxiv genetics 200-500-users 2018
Persistent Underrepresentation of Women’s Science in High Profile Journals, bioRxiv, 2018-03-03
Yiqin Alicia Shen, Jason M. Webster, Yuichi Shoda, and Ione Fine. Department of Psychology, University of Washington.
Past research has demonstrated an underrepresentation of female editors and reviewers in top scientific journals, but less is known about the representation of women authors within original research articles. We collected research article publication records from 15 high-profile multidisciplinary and neuroscience journals for 2005-2017 and analyzed the representation of women over time, as well as its relationship with journal impact factor. We find that women authors have been persistently underrepresented in high-profile journals. This underrepresentation has persisted over more than a decade, with glacial improvement over time. Even within our limited group of high-profile journals, the percentage of female first and last authors is negatively associated with journal impact factor. Since publishing in high-profile journals is a gateway to academic success, this underrepresentation of women may contribute to the lack of women at the top of the academic ladder.
biorxiv scientific-communication-and-education 500+-users 2018
Transcriptional burst initiation and polymerase pause release are key control points of transcriptional regulation, bioRxiv, 2018-03-03
Transcriptional regulation occurs via changes to the rates of various biochemical processes. Sequencing-based approaches that average together many cells have suggested that polymerase binding and polymerase release from promoter-proximal pausing are two key regulated steps in the transcriptional process. However, single cell studies have revealed that transcription occurs in short, discontinuous bursts, suggesting that transcriptional burst initiation and termination might also be regulated steps. Here, we develop and apply a quantitative framework to connect changes in both Pol II ChIP-seq and single cell transcriptional measurements to changes in the rates of specific steps of transcription. Using a number of global and targeted transcriptional regulatory perturbations, we show that burst initiation rate is indeed a key regulated step, demonstrating that transcriptional activity can be frequency modulated. Polymerase pause release is a second key regulated step, but the rate of polymerase binding is not changed by any of the biological perturbations we examined. Our results establish an important role for transcriptional burst regulation in the control of gene expression.
biorxiv systems-biology 100-200-users 2018
Unsupervised discovery of temporal sequences in high-dimensional datasets, with applications to neuroscience, bioRxiv, 2018-03-03
Identifying low-dimensional features that describe large-scale neural recordings is a major challenge in neuroscience. Repeated temporal patterns (sequences) are thought to be a salient feature of neural dynamics, but are not succinctly captured by traditional dimensionality reduction techniques. Here we describe a software toolbox, called seqNMF, with new methods for extracting informative, non-redundant sequences from high-dimensional neural data, testing the significance of these extracted patterns, and assessing the prevalence of sequential structure in data. We test these methods on simulated data under multiple noise conditions, and on several real neural and behavioral data sets. In hippocampal data, seqNMF identifies neural sequences that match those calculated manually by reference to behavioral events. In songbird data, seqNMF discovers neural sequences in untutored birds that lack stereotyped songs. Thus, by identifying temporal structure directly from neural data, seqNMF enables dissection of complex neural circuits without relying on temporal references from stimuli or behavioral outputs.
biorxiv neuroscience 100-200-users 2018