Nankai and Stanford Researchers Propose ‘DeepDrug’: A Python-Based Deep Learning Framework for Predicting Drug Relationships

Source: https://www.biorxiv.org/content/10.1101/2020.11.09.375626v1.full.pdf

Drug discovery involves the search for biomedical links between chemical compounds (drugs, chemicals) and protein targets. Drugs interact with biological systems at a fundamental level by binding to target proteins and influencing their downstream activity. Predicting drug-target interactions (DTIs) is therefore crucial for identifying and characterizing therapeutic targets. DTI insights can also help in understanding and predicting higher-level information such as side effects and therapeutic mechanisms, and can suggest new opportunities for drug repositioning.

Sildenafil, for example, was originally created to treat pulmonary hypertension, but after its side effects were discovered, it was repurposed to treat erectile dysfunction. Polypharmacy has also become a viable strategy because most human diseases are complex biological processes that are resistant to the activity of any single drug. Drug-drug interaction (DDI) prediction and validation can sometimes identify possible synergy in drug combinations, allowing individual drugs to be more effective.

Additionally, negative DDIs are a major source of adverse drug reactions (ADRs), especially in the elderly, who are more likely to take many medications. Drugs such as mibefradil and cerivastatin have been withdrawn from the market in the United States following critical DDIs. The early discovery of negative DDIs or unacceptable toxicity therefore helps ensure drug safety while avoiding the investment of additional resources in non-viable candidates.

Various biochemical databases, such as DrugBank, TwoSides, RCSB Protein Data Bank, and PubChem, have emerged over the past decade, providing quick reference for DTIs and DDIs for healthcare professionals. The prediction of new biochemical interactions, on the other hand, remains a difficult task. In vitro procedures are reliable, but they are both expensive and time-consuming. Due to their cost-effectiveness and improved prediction accuracy, in silico techniques have garnered much attention. Machine learning algorithms that combine large-scale biochemical data are used in advanced computational methods for interaction prediction.

The majority of these methods are based on the idea that similar drugs have similar target proteins and vice versa. The most widely used framework therefore treats DTI and DDI prediction as a classification task that takes similarity functions as inputs. Researchers have also explored deep learning-based algorithms that combine different feature extraction techniques with different neural network designs, such as DeepDDI for DDI prediction and DeepDTA for DTI prediction.
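The similarity assumption can be illustrated with a minimal sketch. It is not taken from any of the methods named above: the Tanimoto coefficient, the `neighbor_score` helper, and the feature sets are all illustrative assumptions. A candidate drug is scored for a target by its highest similarity to drugs already known to bind that target.

```python
def tanimoto(features_a, features_b):
    """Tanimoto (Jaccard) similarity between two collections of feature ids."""
    a, b = set(features_a), set(features_b)
    union = a | b
    # Convention chosen here: two empty feature sets have similarity 0.
    return len(a & b) / len(union) if union else 0.0

def neighbor_score(candidate_features, known_binder_features):
    """Score a drug for a target as its maximum similarity to known binders."""
    return max(
        (tanimoto(candidate_features, known) for known in known_binder_features),
        default=0.0,
    )

# Hypothetical substructure ids for one candidate and two known binders.
candidate = {1, 2, 3}
known_binders = [{2, 3, 4}, {1, 2, 3, 5}]
score = neighbor_score(candidate, known_binders)
```

A real similarity-based classifier would feed such scores, or whole similarity vectors, into a trained model rather than using the maximum directly.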

Another common method is to construct a heterogeneous network in the chemogenomic space and use random walks on it to predict likely interactions. Natural language processing (NLP) techniques have also been applied to large biomedical text corpora to automate the extraction of relationships from research articles. Over the past two decades, the emergence of machine learning approaches and their integration into biomedical science has dramatically boosted drug research.

Deep learning frameworks based on variants of graph neural networks, such as graph convolutional networks (GCN), graph attention networks (GAT), and gated graph neural networks (GGNN), have recently demonstrated breakthrough performance in social sciences, natural sciences, knowledge graphs, and a variety of other fields. GCNs have been used to solve a variety of biochemical problems, including molecular fingerprinting, where each node in the graph represents an atom and each edge represents a chemical bond, and protein classification, where each node represents a residue and each edge represents the distance between residues. Since structural properties are the primary source of pharmacological and genetic similarities, graph representations of biological entities have been shown to capture structural characteristics better than Euclidean representations, without requiring feature engineering.
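As a rough illustration of the atom-and-bond graph described above, the following sketch encodes a small molecule as an adjacency matrix plus one-hot node features. The molecule (the heavy atoms of ethanol), the three-symbol vocabulary, and the helper names are assumptions chosen for the example, not DeepDrug's actual featurization.

```python
# Hand-written example: the heavy atoms of ethanol (C-C-O), hydrogens omitted.
atoms = ["C", "C", "O"]      # node i corresponds to atoms[i]
bonds = [(0, 1), (1, 2)]     # each undirected edge is a chemical bond

def adjacency_matrix(num_nodes, edges):
    """Return a symmetric 0/1 adjacency matrix as a list of rows."""
    adj = [[0] * num_nodes for _ in range(num_nodes)]
    for i, j in edges:
        adj[i][j] = 1
        adj[j][i] = 1
    return adj

def one_hot_features(symbols, vocabulary):
    """Encode each atom symbol as a one-hot row over a fixed vocabulary."""
    return [[1 if s == v else 0 for v in vocabulary] for s in symbols]

A = adjacency_matrix(len(atoms), bonds)
X = one_hot_features(atoms, vocabulary=["C", "N", "O"])
```

In practice the atom and bond lists would be parsed from a SMILES string or a PDB file by a cheminformatics toolkit rather than written by hand.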

Based on these findings, researchers from Stanford University and Nankai University propose DeepDrug, a graph-based deep learning framework for learning drug relationships such as pairwise DDIs or DTIs. The proposed model differs from existing methods for predicting drug relationships in the following ways: 1) DeepDrug only requires graph representations of drugs and proteins as input data for learning structural features, taking advantage of the fact that drugs and proteins are naturally represented as graphs; 2) DeepDrug uses GCN modules to capture the intrinsic structure between the atoms of a compound and the residues of a protein.

According to extensive testing on multiple benchmark datasets, DeepDrug can successfully learn both DDIs and DTIs from graph features in multiple tasks, such as binary classification and multi-class classification, and it outperforms other state-of-the-art models. To further assess the robustness of the model, the team creates additional datasets with varying ratios of positive and negative samples. Using visualization approaches and Dice similarity scores calculated between drugs, the team also shows that the graph-based model learns structural information that is not explicitly introduced into the prediction framework.
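For reference, the Dice similarity between two feature sets A and B is 2|A ∩ B| / (|A| + |B|). The sketch below is a minimal version over hypothetical sets of substructure identifiers; the article does not say which features the team compared, so the inputs here are assumptions.

```python
def dice_similarity(features_a, features_b):
    """Dice coefficient: 2 * |A & B| / (|A| + |B|), ranging from 0 to 1."""
    a, b = set(features_a), set(features_b)
    if not a and not b:
        # Convention chosen for this sketch: two empty sets count as identical.
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))

# Hypothetical substructure ids for two drugs.
drug_a = {1, 2, 3}
drug_b = {2, 3, 4}
similarity = dice_similarity(drug_a, drug_b)
```

Two drugs that share most of their substructures score close to 1, while drugs with nothing in common score 0.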

DeepDrug uses a unified GCN-based framework to extract structural information from drugs and proteins to predict downstream DDIs and DTIs. Unlike hand-crafted features (e.g., molecular fingerprints) or string-based features (e.g., SMILES sequences), DeepDrug's architecture can automatically capture structural information by taking into account the interactions between nodes and edges in the input graphs. The authors report that DeepDrug outperforms competing methods in both DDI and DTI prediction tasks.
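To make the idea of graph convolution concrete, here is a minimal sketch of a single GCN layer in plain Python, using the widely used propagation rule ReLU(D^-1/2 (A + I) D^-1/2 H W). The article does not specify which GCN variant, layer count, or pooling DeepDrug uses, so the rule, the mean pooling step, the toy graph, and the weight values are all assumptions for illustration.

```python
import math

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def normalize_adjacency(adj):
    """Add self-loops, then scale symmetrically: D^-1/2 (A + I) D^-1/2."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1 if i == j else 0) for j in range(n)]
             for i in range(n)]
    degree = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(degree[i] * degree[j]) for j in range(n)]
            for i in range(n)]

def gcn_layer(adj, features, weights):
    """Average neighbor features, apply a linear map, then ReLU."""
    propagated = matmul(normalize_adjacency(adj), features)
    return [[max(0.0, v) for v in row] for row in matmul(propagated, weights)]

def mean_pool(node_embeddings):
    """Collapse per-node embeddings into one vector for the whole graph."""
    n = len(node_embeddings)
    return [sum(column) / n for column in zip(*node_embeddings)]

# Toy 3-node path graph with one-hot node features and fixed example weights.
adjacency = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
node_features = [[1, 0, 0], [1, 0, 0], [0, 0, 1]]
weights = [[0.5, -0.2], [0.1, 0.3], [-0.4, 0.8]]

node_embeddings = gcn_layer(adjacency, node_features, weights)
graph_embedding = mean_pool(node_embeddings)
```

For a pairwise task such as DDI or DTI prediction, one plausible design is to compute a graph embedding for each of the two entities and pass their concatenation to a classifier; in a trained model the weights would be learned rather than fixed.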

The team demonstrates the performance of DeepDrug through extensive experiments that include binary classification of DDIs, multi-class classification of DDIs, and binary classification of DTIs, highlighting the strong and robust predictive power of the graph representation strategy and the GCN architecture. The structural features learned by DeepDrug are visualized, supporting the view that biological structure can influence function and that drugs with similar structures have similar targets. These results suggest that DeepDrug could be a valuable tool for modeling DDIs and DTIs and thus for accelerating the drug discovery process.

Because chemical drugs are small molecules that are often easier to turn into graphs with little ambiguity, the team first evaluates DeepDrug's performance in a binary classification setting for DDI prediction. DeepDrug is compared to a baseline method based on a random forest classifier (RFC) and to another deep learning method, DeepDDI. The baseline accepts graph representations, whereas the original DeepDDI framework accepts only SMILES strings. Three different datasets are used.

According to the paper, DeepDrug consistently beats the other approaches, with a 13.2% higher AUROC (area under the receiver operating characteristic curve) and a 15.1% higher AUPRC (area under the precision-recall curve) than the second-best method. Compared with DeepDDI, DeepDrug has a 31.0% higher AUROC and a 17.0% higher AUPRC, a gap the authors attribute to DeepDDI using only SMILES sequence information as input. DeepDrug, by contrast, uses a graph representation and can learn the underlying structural characteristics to improve performance.
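For readers unfamiliar with these two metrics, the sketch below computes them in plain Python on made-up labels and scores. AUROC is calculated as the fraction of positive-negative pairs that are ranked correctly, and AUPRC is estimated by average precision, one common estimator. The data and function names are illustrative assumptions and have no connection to the paper's experiments.

```python
def auroc(labels, scores):
    """Probability that a random positive outscores a random negative."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative label")
    # Ties between a positive and a negative count as half a correct pair.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in positives for n in negatives)
    return wins / (len(positives) * len(negatives))

def average_precision(labels, scores):
    """Mean of the precision values measured at each positive in ranked order."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits, precision_sum = 0, 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank
    if hits == 0:
        raise ValueError("need at least one positive label")
    return precision_sum / hits

# Made-up predictions for four drug pairs (1 = interaction, 0 = none).
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
roc = auroc(labels, scores)             # 3 of 4 pairs ranked correctly
ap = average_precision(labels, scores)  # (1/1 + 2/3) / 2
```

AUPRC is generally the more informative of the two when positives are rare, which is relevant to the datasets with varying positive-to-negative ratios mentioned earlier.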

Conclusion

The paper proposes DeepDrug, a novel end-to-end deep learning framework for DDI and DTI prediction. DeepDrug takes drug SMILES strings and protein PDB files as inputs, converts these biological entities into graph representations, and then uses GCNs to learn latent feature representations that improve predictive accuracy. Thanks to its graph-based design, DeepDrug can handle both DDI and DTI prediction within one general framework. It can also be applied to new entities, provided their graph representations can be extracted.

Article: https://www.biorxiv.org/content/10.1101/2020.11.09.375626v1.full.pdf

Github: https://github.com/wanwenzeng/deepdrug