Congratulations to Kerui Peng, a PhD rotation student in the Mangul Lab at USC, on the acceptance of her oral presentation, “Reuse of publicly available omics data across 1 million research publications,” at the 2019 9th International Conference on Computational Advances in Bio and Medical Sciences (ICCABS), and on winning a conference travel award from the Graduate Affairs Office in the USC School of Pharmacy!

Along with co-authors Dat Duong, Nicholas Darci-Maher (Dept. of Computer Science at UCLA), Serghei Mangul (Dept. of Clinical Pharmacy, USC), and Jaqueline Brito (PhD, University of São Paulo, Brazil), Kerui examined over 1 million biomedical publications to assess how often data from two public repositories are reused: the NCBI Sequence Read Archive (SRA) and the NCBI Gene Expression Omnibus (GEO). The team found that raw sequencing data are reused significantly less often than gene expression data.

Among the 1 million surveyed publications, 13.2% referred to the two public repositories. Only 10.1% of the surveyed publications with SRA accession numbers reused the sequencing data. Among publications containing GEO accession numbers, 41.5% were publications generating novel datasets. The team found that datasets hosted on GEO were reused significantly more frequently than SRA datasets (Mann–Whitney U test, p-value = 2.58×10⁻²⁸⁵). Kerui and her team then examined how often the datasets represented by SRA and GEO IDs were reused. Only 7.2% of all SRA datasets were reused, while 26.4% of GEO datasets were subject to secondary analysis.
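
To give a sense of what scanning a publication for repository accession numbers can look like, here is a minimal Python sketch. It is not the team's actual pipeline; the function name `find_accessions` and the two regular expressions are assumptions made for illustration, based on the common accession shapes (for example SRR1234567 or SRP098765 for SRA, and GSE12345 or GSM12345 for GEO).

```python
import re

# Assumed patterns, for illustration only (not taken from the study):
# SRA-style IDs: S/E/D + R + R/X/S/P followed by at least six digits.
SRA_PATTERN = re.compile(r"\b[SED]R[RXSP]\d{6,}\b")
# GEO-style IDs: GSE (series), GSM (sample), GPL (platform), GDS (dataset).
GEO_PATTERN = re.compile(r"\bG(?:SE|SM|PL|DS)\d+\b")


def find_accessions(text):
    """Return the unique SRA-like and GEO-like accession IDs found in text."""
    return {
        "SRA": sorted(set(SRA_PATTERN.findall(text))),
        "GEO": sorted(set(GEO_PATTERN.findall(text))),
    }


# Hypothetical sentence, not from any real publication.
example = "Data are available under GSE12345 and SRP098765; see also SRR1234567."
print(find_accessions(example))
```

Finding an accession number is only the first step: deciding whether a publication generated a dataset or reused an existing one needs further information, which this sketch does not attempt.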

This work represents one of the first efforts to assess the current status of secondary analysis of omics data. With the rapid growth of raw sequencing data, using bioinformatics tools and developing a standard protocol for omics data reuse could lead to new scientific and translational findings from existing omics data.

Kerui will present her work at the 9th Workshop on Computational Advances for Next Generation Sequencing, part of ICCABS 2019, which takes place November 15 through 17 at Florida International University in Miami, Florida. ICCABS aims to bring together leading academic and industry researchers to discuss the latest advances in computational methods for bio and medical sciences.