[month] [year]

Vineeth Ravindra Chelur – Dual Degree CNS

Vineeth Ravindra Chelur received his MS Dual Degree in Computational Natural Sciences (CNS). His research work was supervised by Prof. Deva Priyakumar. Here’s a summary of his research work on Sequence-Based Predictions of Binding Residues and Secondary Structures of Proteins using Deep Learning:

With the number of known protein sequences increasing rapidly, it becomes imperative to have a basic idea of the function and structure of a protein before its three-dimensional structure becomes available. Protein-drug interactions play essential roles in many biological processes and therapeutics. Predicting the active binding site of a protein helps discover and optimise these interactions, leading to the design of better ligand molecules. The secondary structure provides clues to the shape that the protein can be expected to take. It tells us whether an amino acid belongs to a coil turn, alpha-helix or beta-sheet structure. Deep learning is a class of machine learning algorithms that uses multiple layers to progressively extract higher-level features from raw input. Deep learning methods reduce the need for feature engineering in supervised learning tasks by translating the raw inputs into intermediate representations that capture more abstract and composite information while removing redundancies in the original input. The rapid adoption and success of deep learning algorithms in various areas of structural biology motivate their use for accurate binding site detection and secondary structure prediction.

The tertiary structure of a protein determines the binding sites available to the drug molecule. Predicting the binding site quickly and accurately from sequence alone, without utilising the three-dimensional structure, is challenging. In the first study, a Residual Neural Network (leveraging skip connections) is implemented to predict a protein’s most active binding site. sc-PDB, an annotated database of druggable binding sites from the Protein Data Bank, is used for training the network. Features extracted from the Multiple Sequence Alignments (MSAs) of the protein generated using DeepMSA, such as the Position-Specific Scoring Matrix (PSSM), Secondary Structure (SS3), and Relative Solvent Accessibility (RSA), are provided as input to the network. Because binding residues are far fewer than non-binding residues, a weighted binary cross-entropy loss function is used to counter this substantial class imbalance. The network performs well on single-chain proteins, predicting a pocket that has good interactions with a ligand.
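
The weighted binary cross-entropy loss mentioned above can be illustrated with a minimal sketch. The summary does not say how the class weights were chosen, so the inverse-frequency weighting below is an assumption, and the function names `class_weights` and `weighted_bce` are hypothetical and not taken from the thesis.

```python
import math


def class_weights(labels):
    """Return (w_pos, w_neg), each inversely proportional to class frequency.

    Assumption: inverse-frequency weighting; the thesis may use another scheme.
    Requires at least one binding (1) and one non-binding (0) residue.
    """
    n = len(labels)
    n_pos = sum(labels)
    n_neg = n - n_pos
    return n / (2 * n_pos), n / (2 * n_neg)


def weighted_bce(probs, labels, w_pos, w_neg, eps=1e-7):
    """Mean weighted binary cross-entropy over the residues of one sequence.

    probs  : predicted probability that each residue is a binding residue
    labels : 1 for binding residues, 0 for non-binding residues
    """
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1 - eps)  # clamp to avoid log(0)
        total -= w_pos * y * math.log(p) + w_neg * (1 - y) * math.log(1 - p)
    return total / len(labels)


# Toy sequence: 1 binding residue among 10 (illustrative data only).
labels = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
w_pos, w_neg = class_weights(labels)

# Cost of missing the binding residue vs. a false alarm on a non-binding one.
missed_binding = weighted_bce([0.1], [1], w_pos, w_neg)
false_alarm = weighted_bce([0.9], [0], w_pos, w_neg)
print(w_pos, w_neg, missed_binding > false_alarm)
```

With these weights, an error on the rare binding class contributes more to the loss than an equally confident error on the abundant non-binding class, which discourages the network from predicting "non-binding" for every residue.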