Supercomputer, AI to speed up drug discoveries

The Tianhe-2 supercomputer in south China's Guangdong Province. (Photo: CFP)

Using artificial intelligence and one of the world's fastest supercomputers, Chinese scientists are engineering otherwise unknown chemicals designed for future clinical use.

The Tianhe-2 supercomputer, located at the National Supercomputer Center in Guangzhou, Guangdong province, was ranked the world's fastest supercomputer in June 2013. It held the top spot for three years and still ranks among the world's top 10.

Tianhe-2 has been used as a platform for drug discovery, and AI-based algorithms are now making the machine even smarter.

Scientists from Sun Yat-sen University and Beijing-based AI startup Galixir, along with colleagues from the Georgia Institute of Technology and the Massachusetts Institute of Technology, have used Tianhe-2 to create a practical deep-learning toolkit that predicts the biosynthetic pathways for natural products (NPs) or NP-like compounds.

Natural products are the primary source of clinical drug discovery. More than 60 percent of the small-molecule drugs approved by the US Food and Drug Administration are NPs or their derivatives.

More than 300,000 NPs have been recorded to date, but owing to the complex production know-how required, only one-tenth have been developed as substrates or products, and more computer-aided screening is urgently needed.

In a recent study published in Nature Communications, the researchers presented a tool called BioNavi-NP, which proposes optimal NP biosynthetic pathways from simple building blocks without relying on already-known biochemical rules.

A single-step bio-retrosynthesis prediction model is trained to generate candidate precursors for a target NP, and the fully data-driven model achieves a prediction accuracy 1.7 times that of the previous rule-based model, according to the study.

After this process, an automated retro-biosynthesis route planning system efficiently samples plausible biosynthetic pathways.
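The two-stage idea described above, a single-step model that proposes precursors followed by a planner that chains those steps back to simple building blocks, can be illustrated with a minimal Python sketch. This is not the BioNavi-NP implementation: the molecule names, the scores, and the functions `propose_precursors` and `plan_route` are all hypothetical, a lookup table stands in for the trained model, and a plain best-first search stands in for whatever route planner the study uses.

```python
import heapq
from itertools import count

# Hypothetical toy data: every molecule name and score here is made up.
BUILDING_BLOCKS = {"A", "B", "C"}

# Stand-in for a trained single-step model:
# target molecule -> list of (candidate precursors, model score in (0, 1]).
SINGLE_STEP = {
    "target": [(("X", "A"), 0.6), (("Y",), 0.3)],
    "X": [(("B", "C"), 0.9)],
    "Y": [(("A", "B"), 0.5)],
}


def propose_precursors(molecule):
    """Return candidate precursor sets for one molecule (toy lookup)."""
    return SINGLE_STEP.get(molecule, [])


def plan_route(target, max_expansions=100):
    """Best-first search from the target back to building blocks.

    A route's score is the product of its step scores. Returns
    (score, steps) for the first complete route found, or None.
    """
    tie = count()  # tiebreaker so the heap never compares tuples of names
    # Heap entries: (-score, tiebreak, unsolved molecules, steps so far).
    heap = [(-1.0, next(tie), (target,), ())]
    expansions = 0
    while heap and expansions < max_expansions:
        neg_score, _, unsolved, steps = heapq.heappop(heap)
        # Molecules that are already building blocks need no further work.
        unsolved = tuple(m for m in unsolved if m not in BUILDING_BLOCKS)
        if not unsolved:
            return -neg_score, list(steps)
        expansions += 1
        molecule, rest = unsolved[0], unsolved[1:]
        for precursors, score in propose_precursors(molecule):
            heapq.heappush(
                heap,
                (
                    neg_score * score,
                    next(tie),
                    rest + precursors,
                    steps + ((molecule, precursors),),
                ),
            )
    return None


if __name__ == "__main__":
    print(plan_route("target"))
```

Because every step score is at most 1, a route's score can only fall as steps are added, so the first complete route the search pops is also the highest-scoring one in this toy setting. A molecule with no known precursors that is not a building block simply ends that branch.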

The study revealed that the toolkit can successfully identify biosynthetic pathways for 90.2 percent of 368 test compounds.

The researchers combined the toolkit with an existing enzyme prediction tool to provide a user-friendly, publicly available web server that can predict biosynthetic pathways. It can also score the biological feasibility of those pathways based on the estimated preferences of species and enzymes.

By entering an NP molecule into the online toolkit, users can obtain multiple predicted synthesis routes, in many cases within a few minutes.

The quick results are made possible by Tianhe-2's strong parallel computing capability and its customized graphics processing unit, which together cut the training and testing time from more than two weeks to less than a day.

Tianhe-2 has been widely used for research on health and medicine.

A previous study resulted in a cost-efficient tool to discern types of gastric cancer using Tianhe-2 and an AI-based model called EBVNet.

A gene-screening model running on Tianhe-2 can effectively detect signs of nasopharyngeal cancer among high-risk populations. The two studies were published in Nature Communications in May and April, respectively.

In March, another study published in the journal Cell Metabolism showed that scientists had used Tianhe-2 to find three chemicals that offer a conceptually new strategy for treating complications of COVID-19.

Chinese scientists have also used Tianhe-2 to build what is described as the world's first deep-learning model that non-invasively screens for and identifies liver and biliary diseases using ocular imaging.