Continuity and admixture in the last five millennia of Levantine history from ancient Canaanite and present-day Lebanese genome sequences, bioRxiv, 2017-05-27

The Canaanites inhabited the Levant region during the Bronze Age and established a culture which became influential in the Near East and beyond. However, the Canaanites, unlike most other ancient Near Easterners of this period, left few surviving textual records and thus their origin and relationship to ancient and present-day populations remain unclear. In this study, we sequenced five whole-genomes from ~3,700-year-old individuals from the city of Sidon, a major Canaanite city-state on the Eastern Mediterranean coast. We also sequenced the genomes of 99 individuals from present-day Lebanon to catalogue modern Levantine genetic diversity. We find that a Bronze Age Canaanite-related ancestry was widespread in the region, shared among urban populations inhabiting the coast (Sidon) and inland populations (Jordan) who likely lived in farming societies or were pastoral nomads. This Canaanite-related ancestry derived from mixture between local Neolithic populations and eastern migrants genetically related to Chalcolithic Iranians. We estimate, using linkage-disequilibrium decay patterns, that admixture occurred 6,600-3,550 years ago, coinciding with massive population movements in the mid-Holocene triggered by aridification ~4,200 years ago. We show that present-day Lebanese derive most of their ancestry from a Canaanite-related population, which therefore implies substantial genetic continuity in the Levant since at least the Bronze Age. In addition, we find Eurasian ancestry in the Lebanese not present in Bronze Age or earlier Levantines. We estimate this Eurasian ancestry arrived in the Levant around 3,750-2,170 years ago during a period of successive conquests by distant populations such as the Persians and Macedonians.

biorxiv genetics 0-100-users 2017

A comparison between single cell RNA sequencing and single molecule RNA FISH for rare cell analysis, bioRxiv, 2017-05-19

The development of single cell RNA sequencing technologies has emerged as a powerful means of profiling the transcriptional behavior of single cells, leveraging the breadth of sequencing measurements to make inferences about cell type. However, there is still little understanding of how well these methods perform at measuring single cell variability for small sets of genes and what “transcriptome coverage” (e.g. genes detected per cell) is needed for accurate measurements. Here, we use single molecule RNA FISH measurements of 26 genes in thousands of melanoma cells to provide an independent reference dataset to assess the performance of the DropSeq and Fluidigm single cell RNA sequencing platforms. We quantify the Gini coefficient, a measure of rare-cell expression variability, and find that the correspondence between RNA FISH and single cell RNA sequencing for Gini, unlike for mean, increases markedly with per-cell library complexity up to a threshold of ∼2000 genes detected. A similar complexity threshold also allows for robust assignment of multi-genic cell states such as cell cycle phase. Our results provide guidelines for selecting sequencing depth and complexity thresholds for single cell RNA sequencing. More generally, our results suggest that if the number of genes whose expression levels are required to answer any given biological question is small, then greater transcriptome complexity per cell is likely more important than obtaining very large numbers of cells.
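For readers unfamiliar with the Gini coefficient mentioned above, the following is a minimal sketch of the standard sample Gini coefficient applied to per-cell counts of one gene. The abstract does not state which estimator the authors used, so the function name `gini` and the formula choice here are assumptions for illustration only. A gene expressed evenly across cells scores near 0, while a gene expressed in only a few rare cells scores close to 1.

```python
from typing import Sequence


def gini(counts: Sequence[float]) -> float:
    """Sample Gini coefficient of non-negative per-cell counts.

    Hypothetical helper for illustration; not the authors' code.
    Uses the sorted-rank form:
        G = 2 * sum(i * x_i) / (n * sum(x)) - (n + 1) / n
    with x sorted ascending and ranks i = 1..n.
    """
    values = sorted(counts)
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        # No cells or no expression at all: treat as perfectly even.
        return 0.0
    weighted = sum((rank + 1) * v for rank, v in enumerate(values))
    return (2 * weighted) / (n * total) - (n + 1) / n


if __name__ == "__main__":
    # Made-up counts, not data from the study.
    evenly_expressed = [5, 5, 5, 5]
    rare_cell_gene = [0, 0, 0, 10]
    print(gini(evenly_expressed))
    print(gini(rare_cell_gene))
```

With this estimator the maximum value for n cells is (n - 1) / n, which is reached when a single cell carries all of the counts.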

biorxiv genomics 0-100-users 2017

A Next Generation Connectivity Map L1000 Platform And The First 1,000,000 Profiles, bioRxiv, 2017-05-11

We previously piloted the concept of a Connectivity Map (CMap), whereby genes, drugs and disease states are connected by virtue of common gene-expression signatures. Here, we report more than a 1,000-fold scale-up of the CMap as part of the NIH LINCS Consortium, made possible by a new, low-cost, high throughput reduced representation expression profiling method that we term L1000. We show that L1000 is highly reproducible, comparable to RNA sequencing, and suitable for computational inference of the expression levels of 81% of non-measured transcripts. We further show that the expanded CMap can be used to discover mechanism of action of small molecules, functionally annotate genetic variants of disease genes, and inform clinical trials. The 1.3 million L1000 profiles described here, as well as tools for their analysis, are available at https://clue.io.

Highlights:
- A new gene expression profiling method, L1000, dramatically lowers cost
- The Connectivity Map database now includes 1.3 million publicly accessible L1000 perturbational profiles
- This expanded Connectivity Map facilitates discovery of small molecule mechanism of action and functional annotation of genetic variants
- The work establishes feasibility and utility of a truly comprehensive Connectivity Map

biorxiv genomics 0-100-users 2017

 

Created with the audiences framework by Jedidiah Carlson
