Complete genome characterisation of a novel coronavirus associated with severe human respiratory disease in Wuhan, China, bioRxiv, 2020-01-26

Emerging and re-emerging infectious diseases, such as SARS, MERS, Zika and highly pathogenic influenza, present a major threat to public health [1–3]. Despite intense research effort, how, when and where novel diseases appear remain sources of considerable uncertainty. A severe respiratory disease was recently reported in the city of Wuhan, Hubei province, China. At the time of writing, at least 62 suspected cases have been reported since the first patient was hospitalized on December 12th 2019. Epidemiological investigation by the local Center for Disease Control and Prevention (CDC) suggested that the outbreak was associated with a seafood market in Wuhan. We studied seven patients who were workers at the market, and collected bronchoalveolar lavage fluid (BALF) from one patient who exhibited a severe respiratory syndrome including fever, dizziness and cough, and who was admitted to Wuhan Central Hospital on December 26th 2019. Next generation metagenomic RNA sequencing [4] identified a novel RNA virus from the family Coronaviridae, designated WH-Human-1 coronavirus (WHCV). Phylogenetic analysis of the complete viral genome (29,903 nucleotides) revealed that WHCV was most closely related (89.1% nucleotide similarity) to a group of Severe Acute Respiratory Syndrome (SARS)-like coronaviruses (genus Betacoronavirus, subgenus Sarbecovirus) previously sampled from bats in China and that have a history of genomic recombination. This outbreak highlights the ongoing capacity of viral spill-over from animals to cause severe disease in humans.
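The abstract reports a nucleotide similarity figure but does not say how it was computed. As a rough illustration only, a minimal sketch of one common definition, percent identity over the gap-free columns of a pairwise alignment, might look like the following. The function name `percent_identity` and the toy sequences are assumptions for illustration and do not come from the study.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of gap-free alignment columns where both sequences agree.

    Assumes the two sequences were already aligned to equal length
    by some external tool; this function does no alignment itself.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy example (hypothetical sequences): one mismatch in ten columns.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))
```

Other conventions (for example, counting gaps as mismatches) give different numbers, so a reported similarity is only comparable across studies that use the same definition.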

biorxiv pathology 100-200-users 2020

Interpretable multimodal deep learning for real-time pan-tissue pan-disease pathology search on social media, bioRxiv, 2018-08-21

Abstract
Background: Pathologists are responsible for rapidly providing a diagnosis on critical health issues, from infection to malignancy. Challenging cases benefit from additional opinions of pathologist colleagues. In addition to on-site colleagues, there is an active worldwide community of pathologists on social media for complementary opinions. Such access to pathologists worldwide has the capacity to (i) improve diagnostic accuracy and (ii) generate broader consensus on next steps in patient care.
Methods and findings: From Twitter we curate 13,626 images from 6,351 tweets from 25 pathologists from 13 countries. We supplement the Twitter data with 113,161 images from 1,074,484 PubMed articles. We develop machine learning and deep learning models to (i) accurately identify histopathology stains, (ii) discriminate between tissues, and (iii) differentiate disease states. For deep learning, we derive novel regularization and activation functions for set representations related to set cardinality and the Heaviside step function. Area Under Receiver Operating Characteristic is 0.805-0.996 for these tasks. We repurpose the disease classifier to search for similar disease states given an image and clinical covariates. We report precision@k=1 = 0.701±0.003 (chance 0.397±0.004, mean±stdev). The classifiers find texture and tissue are important clinico-visual features of disease. For search, deep features and cell nuclei features are less important. We implement a social media bot (@pathobot on Twitter) to use the trained classifiers to aid pathologists in obtaining real-time feedback on challenging cases. The bot activates when mentioned in a social media post containing pathology text and images. The bot generates quantitative predictions of disease state (normal/artifact/infection/injury/nontumor, pre-neoplastic/benign/low-grade-malignant-potential, or malignant) and provides a ranked list of similar cases across social media and PubMed.
Conclusions: Our project has become a globally distributed expert system that facilitates pathological diagnosis and brings expertise to underserved regions or hospitals with less expertise in a particular disease. This is the first pan-tissue pan-disease (i.e. from infections to malignancy) method for prediction and search on social media, and the first pathology study prospectively tested in public on social media. We expect our project to cultivate a more connected world of physicians and improve patient care worldwide.
Author summary
Why was this study done?
- No publicly available pan-tissue pan-disease dataset exists for computational pathology. This limits the general application of machine learning in histopathology.
- Pathologists use social media to obtain both (i) opinions for challenging patient cases and (ii) continuing education. Connecting pathologists and linking to similar cases leads to more informative exchanges than computational predictions; e.g. to diagnose best, pathologists may discuss patient history and next tests to order. Additionally, pathologists seek the most interesting rare cases and new articles.
What did the researchers do and find?
- We generated a pan-tissue, pan-disease dataset comprising 10,000+ images from social media and 100,000+ images from PubMed. Classifiers applied to social media data suggest texture and tissue are important clinico-visual features of disease. Learning from both clinical covariates (e.g. tissue type or marker mentions) and visual features (e.g. local binary patterns or deep learning image features), these classifiers are multimodal.
- These data and classifiers power the first social media bot for pathology. It responds to pathologists in real time, searches for similar cases, and encourages collaboration.
What do these findings mean?
- This diverse dataset will be a critical test for machine learning in computational pathology, e.g. search for cures of rare diseases.
- Interpretable real-time classifiers can be successfully applied to images on social media and PubMed to find similar diseases and generate disease predictions. Going forward, similar methods may elucidate important clinico-visual features of specific diseases.
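The search result above is reported as precision@k=1. As a hedged sketch of what that metric measures, the code below ranks every other item by distance to a query and checks whether the top k results share the query's disease-state label. The Euclidean nearest-neighbour ranking is a stand-in assumption; the study's search repurposes its disease classifier, which is not reproduced here. The feature vectors and labels are invented toy values.

```python
import math


def precision_at_k(features, labels, k=1):
    """Mean fraction of each query's k nearest other items sharing its label."""
    n = len(features)
    if n <= k:
        raise ValueError("need more items than k")
    total = 0.0
    for i in range(n):
        # Rank all other items by Euclidean distance to query item i.
        ranked = sorted(
            (j for j in range(n) if j != i),
            key=lambda j: math.dist(features[i], features[j]),
        )
        hits = sum(1 for j in ranked[:k] if labels[j] == labels[i])
        total += hits / k
    return total / n


# Toy data (hypothetical): two tight clusters, one per disease state.
toy_features = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
toy_labels = ["malignant", "malignant", "benign", "benign"]
print(precision_at_k(toy_features, toy_labels, k=1))
```

Comparing the score against a chance baseline, as the abstract does, matters because precision@1 is inflated when one label dominates the dataset.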

biorxiv pathology 100-200-users 2018

Adversarial childhood events are associated with Sudden Infant Death Syndrome (SIDS): an ecological study, bioRxiv, 2018-06-07

Abstract: Sudden Infant Death Syndrome (SIDS) is the most common cause of postneonatal infant death. The allostatic load hypothesis posits that SIDS is the result of perinatal cumulative painful, stressful, or traumatic exposures that tax neonatal regulatory systems. To test it, we explored the relationships between SIDS and two common stressors, male neonatal circumcision (MNC) and prematurity, using latitudinal data from 15 countries and over 40 US states during the years 1999-2016. We used linear regression analyses and likelihood ratio tests to calculate the association between SIDS and the stressors. SIDS prevalence was significantly and positively correlated with MNC and prematurity rates. MNC explained 14.2% of the variability of SIDS’s male bias in the US, reminiscent of the Jewish myth of Lilith, the killer of infant males. Combined, the stressors increased the likelihood of SIDS. Ecological analyses are useful to generate hypotheses but cannot provide strong evidence of causality. Biological plausibility is provided by a growing body of experimental and clinical evidence linking adversary preterm and early-life events with SIDS. Together with historical evidence, our findings emphasize the necessity of cohort studies that consider these environmental stressors with the aim of improving the identification of at-risk infants and reducing infant mortality.
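A statement that one variable "explained" a percentage of the variability in another usually refers to the coefficient of determination (R²) of a linear regression, although the abstract does not spell this out. The sketch below fits a one-predictor least-squares line and reports R². The function name `fit_line` and the numbers are invented for illustration and are not data from the study.

```python
def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x, plus R^2."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length, at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R^2 = 1 - (residual sum of squares / total sum of squares)
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    r_squared = 1.0 - ss_res / ss_tot
    return slope, intercept, r_squared


# Toy values (hypothetical, not study data): a weak negative trend.
slope, intercept, r2 = fit_line([0, 1, 2, 3], [1, 0, 1, 0])
print(f"slope={slope:.2f} intercept={intercept:.2f} R^2={r2:.2f}")
```

As the abstract itself notes, an ecological association of this kind describes variation across populations and cannot by itself establish causality for individuals.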

biorxiv pathology 100-200-users 2018

H&E-stained Whole Slide Image Deep Learning Predicts SPOP Mutation State in Prostate Cancer, bioRxiv, 2016-07-18

A quantitative model to genetically interpret the histology in whole microscopy slide images is desirable to guide downstream immuno-histochemistry, genomics, and precision medicine. We constructed a statistical model that predicts whether or not SPOP is mutated in prostate cancer, given only the digital whole slide after standard hematoxylin and eosin [H&E] staining. Using a TCGA cohort of 177 prostate cancer patients where 20 had mutant SPOP, we trained multiple ensembles of residual networks, accurately distinguishing SPOP mutant from SPOP non-mutant patients (test AUROC=0.74, p=0.0007 Fisher’s Exact Test). We further validated our full metaensemble classifier on an independent test cohort from MSK-IMPACT of 152 patients where 19 had mutant SPOP. Mutants and non-mutants were accurately distinguished despite TCGA slides being frozen sections and MSK-IMPACT slides being formalin-fixed paraffin-embedded sections (AUROC=0.86, p=0.0038). Moreover, we scanned an additional 36 MSK-IMPACT patients having mutant SPOP, trained on this expanded MSK-IMPACT cohort (test AUROC=0.75, p=0.0002), tested on the TCGA cohort (AUROC=0.64, p=0.0306), and again accurately distinguished mutants from non-mutants using the same pipeline. Importantly, our method demonstrates tractable deep learning in this “small data” setting of 20-55 positive examples and quantifies each prediction’s uncertainty with confidence intervals. To our knowledge, this is the first statistical model to predict a genetic mutation in cancer directly from the patient’s digitized H&E-stained whole microscopy slide. Moreover, this is the first time quantitative features learned from patient genetics and histology have been used for content-based image retrieval, finding similar patients for a given patient where the histology appears to share the same genetic driver of disease, i.e. SPOP mutation (p=0.0241 Kost’s Method), and finding similar patients for a given patient that does not have that driver mutation (p=0.0170 Kost’s Method).
Significance Statement: This is the first pipeline predicting gene mutation probability in cancer from digitized H&E-stained microscopy slides. To predict whether or not the speckle-type POZ protein [SPOP] gene is mutated in prostate cancer, the pipeline (i) identifies diagnostically salient slide regions, (ii) identifies the salient region having the dominant tumor, and (iii) trains ensembles of binary classifiers that together predict a confidence interval of mutation probability. Through deep learning on small datasets, this enables automated histologic diagnoses based on probabilities of underlying molecular aberrations and finds histologically similar patients by learned genetic-histologic relationships.
Conception, Writing: AJS, TJF. Algorithms, Learning, CBIR: AJS. Analysis: AJS, MAR, TJF. Supervision: MAR, TJF.
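The results above are summarised with AUROC. As a minimal sketch of what that number means, AUROC equals the probability that a randomly chosen positive case (here, a mutant) receives a higher score than a randomly chosen negative case, with ties counted as one half. The function name `auroc`, the scores, and the labels below are hypothetical; the sketch says nothing about how the study's classifiers produce their scores.

```python
def auroc(scores, labels):
    """AUROC via pairwise comparison of positive and negative scores.

    labels: 1 for positive (e.g. mutant), 0 for negative.
    Runs in O(P * N) time, which is fine for small cohorts.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0  # positive correctly ranked above negative
            elif p == q:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


# Toy example (hypothetical scores): 3 of 4 positive/negative pairs are ordered correctly.
print(auroc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0]))
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which gives context for the reported range of 0.64 to 0.86.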

biorxiv pathology 0-100-users 2016

 

Created with the audiences framework by Jedidiah Carlson
