RootNav 2.0 Deep Learning for Automatic Navigation of Complex Plant Root Architectures, bioRxiv, 2019-07-20

AbstractWe present a new image analysis approach that provides fully-automatic extraction of complex root system architectures from a range of plant species in varied imaging setups. Driven by modern deep-learning approaches, RootNav 2.0 replaces previously manual and semi-automatic feature extraction with an extremely deep multi-task Convolutional Neural Network architecture. The network has been designed to explicitly combine local pixel information with global scene information in order to accurately segment small root features across high-resolution images. In addition, the network simultaneously locates seeds, and first and second order root tips to drive a search algorithm seeking optimal paths throughout the image, extracting accurate architectures without user interaction. The proposed method is evaluated on images of wheat (Triticum aestivum L.) from a seedling assay. The results are compared with semi-automatic analysis via the original RootNav tool, demonstrating comparable accuracy, with a 10-fold increase in speed. We then demonstrate the ability of the network to adapt to different plant species via transfer learning, offering similar accuracy when transferred to an Arabidopsis thaliana plate assay. We transfer for a final time to images of Brassica napus from a hydroponic assay, and still demonstrate good accuracy despite many fewer training images. The tool outputs root architectures in the widely accepted RSML standard, for which numerous analysis packages exist (<jatsext-link xmlnsxlink=httpwww.w3.org1999xlink ext-link-type=uri xlinkhref=httprootsystemml.github.io>httprootsystemml.github.io<jatsext-link>), as well as segmentation masks compatible with other automated measurement tools.

biorxiv bioinformatics 0-100-users 2019

Human Genome Assembly in 100 Minutes, bioRxiv, 2019-07-17

AbstractDe novo genome assembly provides comprehensive, unbiased genomic information and makes it possible to gain insight into new DNA sequences not present in reference genomes. Many de novo human genomes have been published in the last few years, leveraging a combination of inexpensive short-read and single-molecule long-read technologies. As long-read DNA sequencers become more prevalent, the computational burden of generating assemblies persists as a critical factor. The most common approach to long-read assembly, using an overlap-layout-consensus (OLC) paradigm, requires all-to-all read comparisons, which quadratically scales in computational complexity with the number of reads. We assert that recently achievements in sequencing technology (i.e. with accuracy ~99% and read length ~10-15k) enables a fundamentally better strategy for OLC that is effectively linear rather than quadratic. Our genome assembly implementation, Peregrine uses sparse hierarchical minimizers (SHIMMER) to index reads thereby avoiding the need for an all-to-all read comparison step. Peregrine can assemble 30x human PacBio CCS read datasets in less than 30 CPU hours and around 100 wall-clock minutes to a high contiguity assembly (N50 &gt; 20Mb). The continued advance of sequencing technologies coupled with the Peregrine assembler enables routine generation of human de novo assemblies. This will allow for population scale measurements of more comprehensive genomic variations -- beyond SNPs and small indels -- as well as novel applications requiring rapid access to de novo assemblies.

biorxiv bioinformatics 100-200-users 2019

 

Created with the audiences framework by Jedidiah Carlson

Powered by Hugo