Talent Identification at the limits of Peer Review: an analysis of the EMBO Postdoctoral Fellowships Selection Process, bioRxiv, 2018-12-04
Scientific peer review is still the most common system for fund allocation despite having been shown in multiple instances to lack accuracy in identifying the most meritorious applications among high quality ones. This study evaluates two aspects of the selection process of the top-ranked applicants to the EMBO Long-Term Fellowship program in 2007. First, the accuracy of the system is evaluated by comparing the level of career progression of the candidates in 2017 with the original award decisions made in 2007. The second aspect explores the relationship between career progression and indicators derived from the information available to evaluators at the time of application. The results obtained suggest that the peer review system is not substantially better than random selection in identifying the best candidates once an initial pre-selection of the most promising ones is performed. Moreover, the analysis of the indicators studied, some of which have not been analyzed in detail in the past, suggests that, among other potential sources of uncertainty, the information available at the time of application is not sufficiently predictive of career progression. As previously described, however, we find differences in career progression between men and women. We propose a new mixed model of fellowship evaluation in which peer review is used to select high quality applications, and random allocation of funds is subsequently used to award fellowships among these top-ranked candidates.
biorxiv scientific-communication-and-education 500+-users 2018
The skull of StW 573, a 3.67 Ma Australopithecus skeleton from Sterkfontein Caves, South Africa, bioRxiv, 2018-12-04
Here we present the first full anatomical description of the 3.67 million-year-old Australopithecus skull StW 573 that was recovered with its skeleton from the Sterkfontein Member 2 breccia in the Silberberg Grotto. Analysis demonstrates that it is most similar in multiple key morphological characters to a group of fossils from Sterkfontein Member 4 and Makapansgat that are here distinguished morphologically as A. prometheus. This taxon contrasts with another group of fossils from those sites assigned to A. africanus. The anatomical reasons for why these groupings should not be lumped together (as is frequently done for the South African fossils) are discussed in detail. In support of this classification, we also present for the first time a palate (StW 576 from Sterkfontein Member 4) newly reconstructed by RJC, which has a uniquely complete adult dentition of an A. africanus. The StW 573 skull also has certain similarities with other earlier Australopithecus fossils in East Africa, A. afarensis and A. anamensis, which are discussed. One of its most interesting features is a pattern of very heavy anterior dental wear unlike that found in A. africanus but resembling that found in A. anamensis at 4.17 Ma. While StW 573 is the only hominid fossil in Sterkfontein Member 2, we conclude that competitive exclusion probably accounts for the synchronous and sympatric presence of two species of Australopithecus in the younger deposits at Makapansgat and Sterkfontein Member 4. Because the StW 573 skull is associated with a near-complete skeleton that is also described for the first time in this special issue, we are now able to use this individual to improve our understanding of more fragmentary finds in the South African fossil record of Australopithecus.
biorxiv paleontology 100-200-users 2018
Ultra-deep, long-read nanopore sequencing of mock microbial community standards, bioRxiv, 2018-12-04
Background: Long sequencing reads are information-rich, aiding de novo assembly and reference mapping, and consequently have great potential for the study of microbial communities. However, the best approaches for analysis of long-read metagenomic data are unknown. Additionally, rigorous evaluation of bioinformatics tools is hindered by a lack of long-read data from validated samples with known composition. Methods: We sequenced two commercially available mock communities containing ten microbial species (ZymoBIOMICS Microbial Community Standards) with Oxford Nanopore GridION and PromethION. Isolates from the same mock community were sequenced individually with Illumina HiSeq. Data: We generated 14 and 16 Gbp from GridION flowcells and 146 and 148 Gbp from PromethION flowcells for the even and log communities, respectively. Read length N50 was 5.3 Kbp and 5.2 Kbp for the even and log communities, respectively. Basecalls and corresponding signal data are made available (4.2 TB in total). Results: Alignment to Illumina-sequenced isolates demonstrated the expected microbial species at anticipated abundances, with the limit of detection for the lowest abundance species below 50 cells (GridION). De novo assembly of metagenomes recovered long contiguous sequences without the need for pre-processing techniques such as binning. Conclusions: We present ultra-deep, long-read nanopore datasets from a well-defined mock community. These datasets will be useful for those developing bioinformatics methods for long-read metagenomics and for the validation and comparison of current laboratory and software pipelines.
biorxiv bioinformatics 100-200-users 2018
Engineering Brain Parasites for Intracellular Delivery of Therapeutic Proteins, bioRxiv, 2018-12-03
Protein therapy has the potential to alleviate many neurological diseases; however, delivery mechanisms for the central nervous system (CNS) are limited, and intracellular delivery poses additional hurdles. To address these challenges, we harnessed the protist parasite Toxoplasma gondii, which can migrate into the CNS and secrete proteins into cells. Using a fusion protein approach, we engineered T. gondii to secrete therapeutic proteins for human neurological disorders. We tested two secretion systems, generated fusion proteins that localized to the secretory organelles of T. gondii and assessed their intracellular targeting in various mammalian cells including neurons. We show that T. gondii expressing GRA16 fused to the Rett syndrome protein MeCP2 deliver a fusion protein that mimics the endogenous MeCP2, binding heterochromatic DNA in neurons. This demonstrates the potential of T. gondii as a therapeutic protein vector, which could provide either transient or chronic, in situ synthesis and delivery of intracellular proteins to the CNS.
biorxiv synthetic-biology 500+-users 2018
Direct RNA nanopore sequencing of full-length coronavirus genomes provides novel insights into structural variants and enables modification analysis, bioRxiv, 2018-12-01
Sequence analyses of RNA virus genomes remain challenging due to the exceptional genetic plasticity of these viruses. Because of high mutation and recombination rates, genome replication by viral RNA-dependent RNA polymerases leads to populations of closely related viruses that are generally referred to as ‘quasispecies’. Although standard (short-read) sequencing technologies make it possible to readily determine consensus sequences for these ‘quasispecies’, it is far more difficult to reconstruct large numbers of full-length haplotypes of (i) RNA virus genomes and (ii) subgenome-length (sg) RNAs comprised of noncontiguous genome regions that may be present in these virus populations. Here, we used a full-length, direct RNA sequencing (DRS) approach without any amplification step to characterize viral RNAs produced in cells infected with a human coronavirus representing one of the largest RNA virus genomes known to date. Using DRS, we were able to map the longest (~26 kb) contiguous read to the viral reference genome. By combining Illumina and nanopore sequencing, a highly accurate consensus sequence of the human coronavirus (HCoV) 229E genome (27.3 kb) was reconstructed. Furthermore, using long reads that did not require an assembly step, we were able to identify, in infected cells, diverse and novel HCoV-229E sg RNAs that remain to be characterized. Also, the DRS approach, which does not require reverse transcription and amplification of RNA, allowed us to detect methylation sites in viral RNAs. Our work paves the way for haplotype-based analyses of viral quasispecies by demonstrating the feasibility of intra-sample haplotype separation.
We also show how supplementary short-read sequencing (Illumina) can be used to reduce the error rate of nanopore sequencing. Even though a number of technical challenges remain to be addressed to fully exploit the potential of the nanopore technology, our work illustrates that direct RNA sequencing may significantly advance genomic studies of complex virus populations, including predictions on long-range interactions in individual full-length viral RNA haplotypes.
biorxiv genomics 100-200-users 2018
Direct RNA nanopore sequencing of full-length coronavirus genomes provides novel insights into structural variants and enables modification analysis, bioRxiv, 2018-12-01
Sequence analyses of RNA virus genomes remain challenging due to the exceptional genetic plasticity of these viruses. Because of high mutation and recombination rates, genome replication by viral RNA-dependent RNA polymerases leads to populations of closely related viruses, so-called 'quasispecies'. Standard (short-read) sequencing technologies are ill-suited to reconstruct large numbers of full-length haplotypes of (i) RNA virus genomes and (ii) subgenome-length (sg) RNAs comprised of noncontiguous genome regions. Here, we used a full-length, direct RNA sequencing (DRS) approach based on nanopores to characterize viral RNAs produced in cells infected with a human coronavirus. Using DRS, we were able to map the longest (~26 kb) contiguous read to the viral reference genome. By combining Illumina and nanopore sequencing, we reconstructed a highly accurate consensus sequence of the human coronavirus (HCoV) 229E genome (27.3 kb). Furthermore, using long reads that did not require an assembly step, we were able to identify, in infected cells, diverse and novel HCoV-229E sg RNAs that remain to be characterized. Also, the DRS approach, which circumvents reverse transcription and amplification of RNA, allowed us to detect methylation sites in viral RNAs. Our work paves the way for haplotype-based analyses of viral quasispecies by demonstrating the feasibility of intra-sample haplotype separation. Even though several technical challenges remain to be addressed to exploit the potential of the nanopore technology fully, our work illustrates that direct RNA sequencing may significantly advance genomic studies of complex virus populations, including predictions on long-range interactions in individual full-length viral RNA haplotypes.
biorxiv genomics 100-200-users 2018