Big data methods under development at UGA will help tackle diseases

Athens, Ga. – The University of Georgia’s Ping Ma will use a new grant to crunch big data numbers, not uncommon for a statistics professor. What is unusual is that his work may help save lives.

Ma has been awarded $1.3 million in funding from the National Institutes of Health to develop statistical tools to further clarify the causes of many diseases, including cancer, heart disease and aging-related illnesses. Over four years, Ma and his team of researchers will look at something known as small RNAs, hoping to unravel their regulatory role in abnormal variations in genetic transcription.

RNA, or ribonucleic acid, is present in all living cells and is incredibly important in the human body. It primarily acts as a messenger for DNA, and small RNAs in particular regulate various biological processes.

Ma, a professor in UGA’s Franklin College of Arts and Sciences department of statistics and lead investigator on the project, will work to analyze big data sets that contain biomedical information on various diseases and create smart algorithms. His goal is to allow researchers to accurately analyze large sets of data without the need for expensive supercomputers.

“Multiple interconnected research programs for tackling the challenge of big data have been actively pursued by my lab,” he said. “An example of exciting progress, achieved through a collaborative project, is our finding that by sampling very small representative sub-data sets using smart algorithms, one can effectively extract almost all of the relevant information contained in the original vast data sets.”
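To give a rough sense of how a small, carefully chosen sample can stand in for a very large data set, the sketch below fits a linear regression on a weighted subsample and compares it with the fit on the full data. It uses leverage-score sampling, one well-known subsampling technique; that this matches the algorithms developed in Ma's lab is an assumption, not something stated in this article. The function names, the simulated data and the sample sizes are all hypothetical and chosen only for illustration.

```python
import numpy as np


def leverage_subsample_ols(X, y, r, seed=0):
    """Estimate least-squares coefficients from r rows sampled by leverage score.

    Illustrative only: a generic leverage-score sampler, not the project's software.
    """
    rng = np.random.default_rng(seed)
    n, _ = X.shape
    # Leverage scores are the squared row norms of Q from a thin QR of X.
    Q, _ = np.linalg.qr(X)
    lev = np.sum(Q ** 2, axis=1)
    prob = lev / lev.sum()
    # Draw r rows with replacement, favoring high-leverage (more informative) rows.
    idx = rng.choice(n, size=r, replace=True, p=prob)
    # Reweight sampled rows by 1/sqrt(r * prob) to offset the unequal sampling.
    w = 1.0 / np.sqrt(r * prob[idx])
    beta, *_ = np.linalg.lstsq(X[idx] * w[:, None], y[idx] * w, rcond=None)
    return beta


def demo(n=100_000, p=5, r=1_000, seed=0):
    """Simulate data, then return (full-data fit, subsample fit)."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, p))
    beta_true = np.arange(1, p + 1, dtype=float)  # hypothetical true coefficients
    y = X @ beta_true + rng.standard_normal(n)
    beta_full, *_ = np.linalg.lstsq(X, y, rcond=None)
    beta_sub = leverage_subsample_ols(X, y, r, seed=seed)
    return beta_full, beta_sub


if __name__ == "__main__":
    full, sub = demo()
    print("full-data fit:", np.round(full, 3))
    print("subsample fit:", np.round(sub, 3))
```

In this toy setting the subsample uses 1 percent of the rows. How closely a subsample fit tracks the full fit on real biomedical data would depend on the data and the model.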

These statistical methods allow biomedical researchers who may not have direct access to supercomputers to analyze biomedical data accurately and scale outcomes to larger data sets. The result of developing useful statistical methods for analysis, he said, is that biomedical researchers can use their desktop computers, iPads and smartphones to analyze data.

“The advent of new biotechnologies has great potential to view the gene expression at unprecedented detail and clarity, which opens many new doors for studying the mechanisms of alternative splicing of various abnormal splicing related diseases,” he said. “Given the huge volumes of data, we believe that this is an opportune time for taking an analytical approach to study small RNAs’ regulatory role on alternative splicing.”

Recent studies have indicated that over 95 percent of human genes undergo alternative splicing. Aberrant splicing of pre-mRNAs can cause various human diseases.

“Small RNAs regulate alternative splicing,” he said, “and enhanced understanding of the regulation of alternative splicing is crucial for finding therapeutic targets and providing better treatment.”

Small RNAs, including microRNAs and short interfering RNAs, regulate gene expression through complementary base pairing with target RNAs. Together with their protein partners, small RNAs have been implicated in multiple aspects of gene functions. Very recently, siRNAs have also been implicated in alternative splicing processes in human cells, but the global regulatory role of small non-coding RNA on alternative splicing remains elusive.

Advances in biomedical sciences and technologies in the past decade have created an extraordinary amount of biomedical data that was previously inaccessible, offering biomedical researchers an unprecedented opportunity to tackle much larger and more complex research challenges. The opportunity has not yet been fully realized because effective and efficient statistical and computing tools for analyzing super-large data sets are still lacking.

“The proposed work will establish a comprehensive statistical framework and computational strategies to investigate the global mechanisms of alternative splicing regulation by small RNAs,” he said. “As a byproduct of this effort, we will also be able to provide an efficient, robust, publicly available and user-friendly software for the analysis.”