An improved de novo assembly and annotation of the tomato reference genome using single-molecule sequencing, Hi-C proximity ligation and optical maps, bioRxiv, 2019-09-14

Abstract: The original Heinz 1706 reference genome was produced by a large team of scientists from across the globe from a variety of input sources that included 454 sequences in addition to full-length BACs and BAC and fosmid ends sequenced with Sanger technology. We present here the latest tomato reference genome (SL4.0), assembled de novo from PacBio long reads and scaffolded using Hi-C contact maps. The assembly was validated using Bionano optical maps and 10X linked-read sequences. This assembly is highly contiguous, with fewer gaps than previous genome builds, and almost all scaffolds have been anchored and oriented to the 12 tomato chromosomes. We have found more repeats than in previous versions, and one of the largest repeat classes identified is the LTR retrotransposons. We also describe updates to the reference genome and annotation since the last publication. The corresponding ITAG4.0 annotation has 4,794 novel genes along with 29,281 genes preserved from ITAG2.4. Most of the updated genes have extensions in the 5’ and 3’ UTRs, resulting in a doubling of annotated UTRs per gene. The genome and annotation can be accessed through SGN via the BLAST database, the Pathway database (SolCyc), Apollo, the JBrowse genome browser and FTP, available at https://solgenomics.net.

biorxiv genomics 0-100-users 2019

Self-organised symmetry breaking in zebrafish reveals feedback from morphogenesis to pattern formation, bioRxiv, 2019-09-14

A fundamental question in developmental biology is how the early embryo breaks initial symmetry to establish the spatial coordinate system later important for the organisation of the embryonic body plan. In zebrafish, this is thought to depend on the inheritance of maternal mRNAs [1–3], cortical rotation to generate a dorsal pole of beta-catenin activity [4–8] and the release of Nodal signals from the yolk syncytial layer (YSL) [9–12]. Recent work aggregating mouse embryonic stem cells has shown that symmetry breaking can occur in the absence of extra-embryonic tissue [19,20]. To test whether this is also true in zebrafish, we separated embryonic cells from the yolk and allowed them to develop as aggregates. These aggregates break symmetry autonomously to form elongated structures with an anterior-posterior pattern. Extensive cell mixing shows that any pre-existing asymmetry is lost prior to the breaking of morphological symmetry, revealing that the maternal pre-pattern is not strictly required for early embryo patterning. Following early signalling events after isolation of embryonic cells reveals that a pole of Nodal activity precedes and is required for elongation. Blocking PCP-dependent convergence and extension movements disrupts the establishment of opposing poles of BMP and Wnt/TCF activity and the patterning of anterior-posterior neural tissue. These results lead us to suggest that convergence and extension plays a causal role in the establishment of morphogen gradients and pattern formation during zebrafish gastrulation.

biorxiv developmental-biology 0-100-users 2019

Complete characterization of the human immune cell transcriptome using accurate full-length cDNA sequencing, bioRxiv, 2019-09-13

Abstract: The human immune system relies on highly complex and diverse transcripts and the proteins they encode. These include transcripts for Human Leukocyte Antigen (HLA) class I and II receptors, which are essential for self/non-self discrimination by the immune system, as well as transcripts encoding B cell and T cell receptors (BCR and TCR), which recognize, bind, and help eliminate foreign antigens. HLA genes are highly diverse within the human population, with each individual possessing two of thousands of different alleles in each of the 9 major HLA genes. Determining which combination of alleles an individual possesses for each HLA gene (high-resolution HLA-typing) is essential to establish donor-recipient compatibility in organ and bone-marrow transplantations. BCR and TCR genes in turn are generated by recombining a diverse set of gene segments on the DNA level in each maturing B and T cell, respectively. This process generates adaptive immune receptor repertoires (AIRR) composed of unique transcripts expressed by each B and T cell. These repertoires carry a vast amount of health-relevant information. Both short-read RNA-seq based HLA-typing [1] and adaptive immune receptor repertoire sequencing [2–5] currently rely heavily on our incomplete knowledge of the genetic diversity at HLA [6] and BCR/TCR loci [7,8]. Here we used our nanopore sequencing based Rolling Circle to Concatemeric Consensus (R2C2) protocol [9] to generate over 10,000,000 full-length cDNA sequences at a median accuracy of 97.9%. We used this dataset to demonstrate that deep and accurate full-length cDNA sequencing can, in addition to providing isoform-level transcriptome analysis for over 9,000 loci, be used to generate accurate sequences of HLA alleles for HLA allele typing and discovery as well as detailed AIRR data for the analysis of the adaptive immune system, without requiring specific knowledge of the diversity at HLA and BCR/TCR loci.

biorxiv genomics 0-100-users 2019

 
