Large-scale simultaneous measurement of epitopes and transcriptomes in single cells, bioRxiv, 2017-03-03
Recent high-throughput single-cell sequencing approaches have been transformative for understanding complex cell populations, but are unable to provide additional phenotypic information, such as protein levels of cell-surface markers. Using oligonucleotide-labeled antibodies, we integrate measurements of cellular proteins and transcriptomes into an efficient, sequencing-based readout of single cells. This method is compatible with existing single-cell sequencing approaches and will readily scale as the throughput of these methods increases.
biorxiv genomics 100-200-users 2017
orco mutagenesis causes loss of antennal lobe glomeruli and impaired social behavior in ants, bioRxiv, 2017-03-01
Life inside ant colonies is orchestrated with a diverse set of pheromones, but it is not clear how ants perceive these social cues. It has been proposed that pheromone perception in ants evolved via expansions in the numbers of odorant receptors (ORs) and antennal lobe glomeruli. Here we generate the first mutant lines in ants by disrupting orco, a gene required for the function of all ORs. We find that orco mutants exhibit severe deficiencies in social behavior and fitness, suggesting that they are unable to perceive pheromones. Surprisingly, unlike in Drosophila melanogaster, orco mutant ants also lack most of the approximately 500 antennal lobe glomeruli found in wild-types. These results illustrate that ORs are essential for ant social organization, and raise the possibility that, similar to mammals, receptor function is required for the development and/or maintenance of the highly complex olfactory processing areas in the ant brain.
biorxiv evolutionary-biology 100-200-users 2017
A global perspective on bioinformatics training needs, bioRxiv, 2017-02-28
In the last decade, life-science research has become increasingly data-intensive and computational. Nevertheless, basic bioinformatics and data stewardship are still only rarely taught in life-science degree programmes, creating a widening skills gap that spans educational levels and career roles. To better understand this situation, we ran surveys to determine how the skills dearth is affecting the need for bioinformatics training worldwide. Perhaps unsurprisingly, we found that respondents wanted more short courses to help boost their expertise and confidence in data analysis and interpretation. However, it was evident that most respondents appreciated their need for training only after designing their experiments and collecting their data. This is clearly rather late in the research workflow, and suboptimal from a training perspective, as skills acquired to address a specific need at a particular time are seldom retained, engendering a cycle of low confidence in trainees. To ensure that such skill gaps do not continue to create barriers to the progress of research, we argue that universities should strive to bring their life-science curricula into the digital-data era. Meanwhile, the demand for point-of-need training in bioinformatics and data stewardship will grow. While this situation persists, international groups like GOBLET are increasing their efforts to enlarge the community of trainers and quench the global thirst for bioinformatics training.
biorxiv scientific-communication-and-education 100-200-users 2017
MAGIC: A diffusion-based imputation method reveals gene-gene interactions in single-cell RNA-sequencing data, bioRxiv, 2017-02-26
Single-cell RNA-sequencing is fast becoming a major technology that is revolutionizing biological discovery in fields such as development, immunology and cancer. The ability to simultaneously measure thousands of genes at single cell resolution allows, among other prospects, for the possibility of learning gene regulatory networks at large scales. However, scRNA-seq technologies suffer from many sources of significant technical noise, the most prominent of which is ‘dropout’ due to inefficient mRNA capture. This results in data that has a high degree of sparsity, with typically only ~10% non-zero values. To address this, we developed MAGIC (Markov Affinity-based Graph Imputation of Cells), a method for imputing missing values, and restoring the structure of the data. After MAGIC, we find that two- and three-dimensional gene interactions are restored and that MAGIC is able to impute complex and non-linear shapes of interactions. MAGIC also retains cluster structure, enhances cluster-specific gene interactions and restores trajectories, as demonstrated in mouse retinal bipolar cells, hematopoiesis, and our newly generated epithelial-to-mesenchymal transition dataset.
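The general idea of Markov-affinity diffusion named in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a plain Gaussian kernel with a fixed bandwidth, and the function name `diffusion_impute` and the parameters `sigma` and `t` are hypothetical choices made for illustration only. Cell-to-cell affinities are row-normalized into a Markov transition matrix, which is raised to a power and applied to the expression matrix so that each cell borrows signal from similar cells.

```python
import numpy as np


def diffusion_impute(X, sigma=1.0, t=3):
    """Smooth a cells-by-genes matrix by diffusing values over a cell graph.

    Illustrative sketch only: fixed-bandwidth Gaussian kernel (an assumption),
    no dimensionality reduction, no rescaling of the output.
    """
    X = np.asarray(X, dtype=float)
    # Pairwise squared Euclidean distances between cells (rows).
    sq_norms = np.sum(X ** 2, axis=1)
    d2 = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    d2 = np.maximum(d2, 0.0)  # guard against tiny negative round-off
    # Gaussian affinity between every pair of cells.
    affinity = np.exp(-d2 / (2.0 * sigma ** 2))
    # Row-normalize so each row is a probability distribution (Markov matrix).
    markov = affinity / affinity.sum(axis=1, keepdims=True)
    # t-step diffusion, then apply to the data.
    return np.linalg.matrix_power(markov, t) @ X


# Toy data: two groups of cells, with zeros standing in for dropout.
X = np.array([[5.0, 0.0],
              [4.0, 1.0],
              [0.0, 6.0],
              [1.0, 5.0]])
print(diffusion_impute(X, sigma=2.0, t=2))
```

Because every row of the transition matrix sums to one, each imputed value is a weighted average of the observed values for that gene, so zeros caused by dropout are pulled toward the values seen in neighboring cells. Larger `t` smooths more aggressively and can blur real differences between cell populations.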
biorxiv bioinformatics 100-200-users 2017
Modern machine learning outperforms GLMs at predicting spikes, bioRxiv, 2017-02-25
Neuroscience has long focused on finding encoding models that effectively ask “what predicts neural spiking?” and generalized linear models (GLMs) are a typical approach. It is often unknown how much of explainable neural activity is captured, or missed, when fitting a GLM. Here we compared the predictive performance of GLMs to three leading machine learning methods: feedforward neural networks, gradient boosted trees (using XGBoost), and stacked ensembles that combine the predictions of several methods. We predicted spike counts in macaque motor (M1) and somatosensory (S1) cortices from standard representations of reaching kinematics, and in rat hippocampal cells from open field location and orientation. In general, the modern methods (particularly XGBoost and the ensemble) produced more accurate spike predictions and were less sensitive to the preprocessing of features. This discrepancy in performance suggests that standard feature sets may often relate to neural activity in a nonlinear manner not captured by GLMs. Encoding models built with machine learning techniques, which can be largely automated, more accurately predict spikes and can offer meaningful benchmarks for simpler models.
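For readers unfamiliar with the GLM baseline that the abstract above refers to, the following is a minimal sketch of a Poisson GLM with a log link, fitted to simulated spike counts by gradient ascent. It is not the authors' code. The function names `fit_poisson_glm` and `predict_rate`, the learning rate, and the simulated features are all hypothetical and chosen only for illustration.

```python
import numpy as np


def fit_poisson_glm(X, y, lr=0.05, n_iter=2000):
    """Fit weights (intercept first) by gradient ascent on the Poisson log-likelihood."""
    n, d = X.shape
    Xb = np.hstack([np.ones((n, 1)), X])  # prepend an intercept column
    w = np.zeros(d + 1)
    for _ in range(n_iter):
        rate = np.exp(Xb @ w)             # log link: rate = exp(linear predictor)
        grad = Xb.T @ (y - rate) / n      # gradient of the mean log-likelihood
        w += lr * grad
    return w


def predict_rate(X, w):
    """Predicted mean spike count for each row of X."""
    Xb = np.hstack([np.ones((X.shape[0], 1)), X])
    return np.exp(Xb @ w)


# Simulated example: two standardized features drive a Poisson spike count.
rng = np.random.default_rng(0)
X = rng.normal(size=(5000, 2))
true_w = np.array([0.5, 0.8, -0.4])       # intercept, feature 1, feature 2
y = rng.poisson(np.exp(true_w[0] + X @ true_w[1:]))

w_hat = fit_poisson_glm(X, y)
print("estimated weights:", w_hat)
```

The model is linear in the features on the log-rate scale, which is why it can miss nonlinear relationships between features and firing of the kind the abstract describes. A flexible learner fitted to the same `X` and `y` would be the natural comparison.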
biorxiv neuroscience 100-200-users 2017
Single-cell epigenomics maps the continuous regulatory landscape of human hematopoietic differentiation, bioRxiv, 2017-02-22
Normal human hematopoiesis involves cellular differentiation of multipotent cells into progressively more lineage-restricted states. While epigenomic landscapes of this process have been explored in immunophenotypically-defined populations, the single-cell regulatory variation that defines hematopoietic differentiation has been hidden by ensemble averaging. We generated single-cell chromatin accessibility landscapes across 8 populations of immunophenotypically-defined human hematopoietic cell types. Using bulk chromatin accessibility profiles to scaffold our single-cell data analysis, we constructed an epigenomic landscape of human hematopoiesis and characterized epigenomic heterogeneity within phenotypically sorted populations to find epigenomic lineage-bias toward different developmental branches in multipotent stem cell states. We identify and isolate sub-populations within classically-defined granulocyte-macrophage progenitors (GMPs) and use ATAC-seq and RNA-seq to confirm that GMPs are epigenomically and transcriptomically heterogeneous. Furthermore, we identified transcription factors and cis-regulatory elements linked to changes in chromatin accessibility within cellular populations and across a continuous myeloid developmental trajectory, and observe relatively simple TF motif dynamics give rise to a broad diversity of accessibility dynamics at cis-regulatory elements. Overall, this work provides a template for exploration of complex regulatory dynamics in primary human tissues at the ultimate level of granular specificity – the single cell.
One Sentence Summary: Single cell chromatin accessibility reveals a high-resolution, continuous landscape of regulatory variation in human hematopoiesis.
biorxiv genomics 100-200-users 2017