The Protein Data Bank (PDB) file format remains a popular format, used and supported by many software packages, for representing the coordinates of macromolecular structures. It suffers, however, from drawbacks such as error-prone manual editing. Various software toolkits have therefore been developed to facilitate its editing and manipulation but, to date, no online tool has been available for this purpose. Here we present PDB-Tools Web, a flexible online service for manipulating PDB files. It offers a rich and user-friendly graphical user interface that allows users to mix and match more than 40 individual tools from the pdb-tools suite. These can be combined in a few clicks to build complex pipelines, which can be saved and uploaded. The resulting processed PDB files can be visualized online and downloaded. The web server is freely available at https://wenmr.science.uu.nl/pdbtools.
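To illustrate why manual editing of PDB files is error-prone, the following minimal sketch selects one chain from PDB text by reading the fixed-width columns directly. It is not the pdb-tools implementation; the helper names `atom_line` and `select_chain` are hypothetical, and the sketch assumes the standard layout in which the chain identifier of ATOM/HETATM records sits in column 22.

```python
def atom_line(serial, name, resname, chain, resseq, x, y, z):
    """Build a fixed-width ATOM record (hypothetical demo helper)."""
    return (f"ATOM  {serial:>5} {name:<4} {resname:>3} {chain}{resseq:>4}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}")


def select_chain(pdb_text, chain_id):
    """Keep ATOM/HETATM records of one chain; pass all other records through."""
    kept = []
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if record in ("ATOM", "HETATM"):
            # The chain identifier is column 22, i.e. string index 21.
            if len(line) > 21 and line[21] == chain_id:
                kept.append(line)
        else:
            kept.append(line)
    return "\n".join(kept)


sample = "\n".join([
    "HEADER    DEMO",
    atom_line(1, " N", "MET", "A", 1, 27.340, 24.430, 2.614),
    atom_line(2, " CA", "MET", "A", 1, 26.266, 25.413, 2.842),
    atom_line(3, " N", "GLY", "B", 1, 10.000, 11.000, 12.000),
    "END",
])
chain_a = select_chain(sample, "A")
print(chain_a)
```

Because every field is positional, a single misplaced space shifts the chain identifier out of column 22, which is the kind of mistake dedicated tools are meant to prevent.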
Predicting the range of substrates accepted by an enzyme from its amino acid sequence is challenging. Although sequence- and structure-based annotation approaches are often accurate for predicting broad categories of substrate specificity, they generally cannot predict which specific molecules will be accepted as substrates for a given enzyme, particularly within a class of closely related molecules. Combining targeted experimental activity data with structural modeling, ligand docking, and physicochemical properties of proteins and ligands with various machine learning models provides complementary information that can lead to accurate predictions of substrate scope for related enzymes. Here we describe such an approach that can predict the substrate scope of bacterial nitrilases, which catalyze the hydrolysis of nitrile compounds to the corresponding carboxylic acids and ammonia. Each of the four machine learning models (linear regression, random forest, gradient-boosted decision trees, and support vector machines) performed similarly (average ROC = 0.9, average accuracy = ~82%) for predicting substrate scope for this dataset. The approach is intended to be highly modular with respect to physicochemical property calculations and software used for docking and modeling.
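As a minimal sketch of the two reported metrics, the code below computes accuracy and the area under the ROC curve for a binary "substrate accepted / not accepted" classifier. It assumes that the reported "ROC" value refers to the area under the curve, which the abstract does not state explicitly; the function names, the 0.5 threshold, and the example scores are hypothetical and are not data from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise ranking (Mann-Whitney statistic)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0      # positive ranked above negative
            elif p == n:
                wins += 0.5      # ties count as half
    return wins / (len(pos) * len(neg))


def accuracy(labels, scores, threshold=0.5):
    """Fraction of examples whose thresholded score matches the label."""
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(1 for y, p in zip(labels, predictions) if y == p)
    return correct / len(labels)


# Hypothetical labels (1 = accepted substrate) and model scores.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
print(roc_auc(labels, scores), accuracy(labels, scores))
```

The pairwise formulation is quadratic in the number of examples, which is acceptable for small activity datasets; a sort-based version would be preferable for larger ones.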
Natural products and natural product-derived compounds have been widely used as pharmaceuticals for many years, and the search for new natural products with potentially interesting activity is ongoing. Abyssomicins are natural products that have antibiotic activity via inhibition of the folate synthesis pathway in microbiota. These compounds also appear to undergo a required [4+2] cycloaddition in their biosynthetic pathway. Here we report the structure of an FAD-dependent reductase, AbsH3, from the biosynthetic gene cluster of novel abyssomicins found in Streptomyces sp. LC-6-2.
The focal adhesion kinase (FAK) and the proline-rich tyrosine kinase 2-beta (PYK2) are implicated in cancer progression and metastasis and represent promising biomarkers and targets for cancer therapy. FAK and PYK2 are recruited to Focal Adhesions (FAs) via interactions between their Focal Adhesion Targeting (FAT) domains and conserved segments (LD motifs) on the proteins Paxillin, Leupaxin and Hic-5. A promising new approach for the inhibition of FAK and PYK2 targets interactions of the FAK domains with proteins that promote localization at Focal Adhesions. Advances toward this goal include the development of surface plasmon resonance, HSQC-NMR and fluorescence polarization assays for the identification of fragments or compounds interfering with the FAK-Paxillin interaction. We have recently validated this strategy, showing that Paxillin-mimicking polypeptides with 2-3 LD motifs displace FAK from FAs and block kinase-dependent and -independent functions of FAK, including downstream integrin signalling and FA localization of the protein p130Cas. In the present work, we use all-atom molecular dynamics simulations to study the recognition of peptides with the Paxillin and Leupaxin LD motifs by the FAK-FAT and PYK2-FAT domains. Our simulations and free-energy analysis interpret experimental data on binding of Paxillin and Leupaxin LD motifs at FAK-FAT and PYK2-FAT binding sites, and assess the roles of consensus LD regions and flanking residues. Our results can assist in the design of effective inhibitory peptides of the FAK-FAT:Paxillin and PYK2-FAT:Leupaxin complexes and the construction of pharmacophore models for the discovery of potential small-molecule inhibitors of the FAK-FAT and PYK2-FAT focal adhesion-based functions.
Isoflavonoids are a group of flavonoids that play pivotal roles in the survival of land plants. Chalcone synthase (CHS), the first enzyme of the isoflavonoid biosynthetic pathway, catalyzes the formation of a common isoflavonoid precursor. We have previously reported that an isozyme of soybean CHS (termed GmCHS1) is a key component of the isoflavonoid metabolon, a protein complex that enhances the efficiency of isoflavonoid production. Here, we determined the crystal structure of GmCHS1 as a first step toward understanding the metabolon structure, as well as to better understand the catalytic mechanism of GmCHS1.
This paper reports the results of research aimed at translating biometric 3D face recognition concepts and algorithms into the field of protein biophysics in order to precisely and rapidly classify morphological features of protein surfaces. Both human faces and protein surfaces are free-forms, and some descriptors used in differential geometry can be used to describe them by applying the principles of feature extraction developed for computer vision and pattern recognition. The first part of this study focused on building the protein dataset using a simulation tool and performing feature extraction using novel geometrical descriptors. The second part tested the method on two examples: the first involved a classification of tubulin isotypes and the second compared tubulin with the FtsZ protein, which is its bacterial analogue. An additional test involved several unrelated proteins. Different classification methodologies were used: a classic approach with a Support Vector Machine (SVM) classifier and unsupervised learning with a k-means approach. The best result was obtained with SVM and the radial basis function (RBF) kernel. The results are significant and competitive with state-of-the-art protein classification methods. This opens a new area for protein structure analysis.
Structural characterization of alternatively folded and partially disordered protein conformations remains challenging. Outer surface protein A (OspA) is a pivotal protein in infection by Borrelia, the etiological agent of Lyme disease. OspA exists in equilibrium with intermediate conformations, in which the central and the C-terminal regions of the protein have lower stabilities than the N-terminal region. Here, we characterize pressure- and temperature-stabilized intermediates of OspA by nuclear magnetic resonance spectroscopy combined with paramagnetic relaxation enhancement (PRE). We found that the C-terminal region of the intermediate was partially disordered; however, it retained weak specific contact with the N-terminal region, owing to a twist of the central β-sheet and increased flexibility in the polypeptide chain. The disordered C-terminal region of the pressure-stabilized intermediate was more compact than that of the temperature-stabilized form. Further, molecular dynamics simulation demonstrated that temperature-induced disordering of the β-sheet was initiated at the C-terminal region and continued through to the central region. An ensemble of simulation snapshots qualitatively described the PRE data from the intermediate and indicated that the intermediate structures of OspA may expose tick receptor-binding sites more readily than does the basic folded conformation.
Expansins have the remarkable ability to loosen plant cell walls and cellulose material without showing catalytic activity and therefore have potential applications in biomass degradation. To support the study of sequence-structure-function relationships and the search for novel expansins, the Expansin Engineering Database (ExED, https://exed.biocatnet.de) collects sequence and structure data on expansins from Bacteria, Fungi, and Viridiplantae, and on expansin-like homologues such as carbohydrate binding modules, glycoside hydrolases, loosenins, swollenins, cerato-platanins, and EXPNs. Global sequence alignment and protein sequence network analysis showed that the sequences are highly diverse. However, many similarities were found between the expansin domains. Newly created profile hidden Markov models of the two expansin domains enable standard numbering schemes, comprehensive conservation analyses, and genome annotation. Conserved key amino acids in the expansin domains were identified, a refined classification of expansins and carbohydrate binding modules was proposed, and new sequence motifs facilitate the search for novel candidate genes and the engineering of expansins.
The M42 aminopeptidases are a family of dinuclear aminopeptidases widely distributed in Prokaryotes. They are potentially associated with the proteasome, achieving complete peptide destruction. Their most peculiar characteristic is their quaternary structure, a tetrahedron-shaped particle made of twelve subunits. The catalytic site of M42 aminopeptidases is defined by seven conserved residues. Five of them are involved in metal ion binding, which is important for maintaining both the activity and the oligomeric state. The sixth conserved residue, a glutamate, is the catalytic base that deprotonates the water molecule during peptide bond hydrolysis. The seventh residue is an aspartate whose function remains poorly understood. This aspartate residue, however, must have a critical role, as it is strictly conserved in all MH clan enzymes. It forms a kind of catalytic triad with the histidine residue and the metal ion of the M2 binding site. We assess its role in TmPep1050, an M42 aminopeptidase of Thermotoga maritima, through a mutational approach. Asp-62 was substituted with an alanine, asparagine, or glutamate residue. The three Asp-62 substitutions completely abolished TmPep1050 activity and impeded dodecamer formation. They also interfered with metal ion binding, as only one cobalt ion was bound per subunit instead of two. The structural data showed that the Asp62Ala substitution affects the active-site fold, which becomes similar to that of the TmPep1050 dimer. We propose a structural role for Asp-62: helping to stabilize a crucial loop in the active site and to correctly position the catalytic base and a metal ion ligand of the M1 site.
Accurate prediction of protein secondary structure (alpha-helix, beta-strand and coil) is a crucial step for protein inter-residue contact prediction and ab initio tertiary structure prediction. In a previous study, we developed a deep belief network-based protein secondary structure prediction method (DNSS1) and successfully advanced the prediction accuracy beyond 80%. In this work, we developed multiple advanced deep learning architectures (DNSS2) to further improve secondary structure prediction. The major improvements over the DNSS1 method include (i) designing and integrating six advanced one-dimensional deep convolutional/recurrent/residual/memory/fractal/inception networks to predict secondary structure, and (ii) using more sensitive profile features inferred from hidden Markov models (HMM) and multiple sequence alignments (MSA). Most of the deep learning architectures are novel for protein secondary structure prediction. DNSS2 was systematically benchmarked on two independent test datasets against eight state-of-the-art tools and consistently ranked as one of the best methods. In particular, DNSS2 was tested on the 82 protein targets of the 2018 CASP13 experiment and achieved the best Q3 score of 83.74% and SOV score of 72.46%. DNSS2 is freely available at: https://github.com/multicom-toolbox/DNSS2.
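The Q3 score quoted above is the percentage of residues whose three-state label (helix, strand, coil) is predicted correctly. A minimal sketch follows; the function name `q3_score`, the one-letter encoding H/E/C, and the example strings are assumptions for illustration and are not taken from the DNSS2 code.

```python
def q3_score(predicted, observed):
    """Percentage of residues with the correct three-state label (H, E or C)."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed strings must have equal length")
    if not observed:
        raise ValueError("empty sequence")
    allowed = set("HEC")
    if not (set(predicted) <= allowed and set(observed) <= allowed):
        raise ValueError("labels must be H (helix), E (strand) or C (coil)")
    correct = sum(1 for p, o in zip(predicted, observed) if p == o)
    return 100.0 * correct / len(observed)


# Hypothetical eight-residue example with one mismatch at position 6.
example_q3 = q3_score("HHHEECCC", "HHHEEECC")
print(f"Q3 = {example_q3:.2f}%")
```

Q3 treats every residue independently; the SOV score reported alongside it instead measures overlap of whole secondary structure segments and is not reproduced here.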
Protein-protein interactions (PPIs) are ubiquitous and of great functional importance in biological systems. Hence, the accurate prediction of PPIs by protein-protein docking and scoring tools is highly desirable in order to characterize their structure and biological function. Ab initio docking protocols are divided into two stages: sampling of docking poses to produce at least one near-native structure, followed by evaluation of the large number of candidate structures by scoring. Concurrent development in both sampling and scoring is crucial for the deployment of protein-protein docking software. In the present work, we apply a machine learning model to pairwise potentials to refine the detection of native protein quaternary structures among decoys. A decoy set was featurized using the Knowledge and Empirical Combined Scoring Algorithm 2 (KECSA2) pairwise potential. The highly unbalanced decoy set was then balanced using a comparison concept between native and decoy structures. The resultant comparison descriptors were used to train a logistic regression (LR) classifier. The LR model outperformed conventional scoring functions for native detection among decoys, while performing less well for the detection of low root mean square deviation (RMSD) decoy structures. Its deployment on an independent benchmark set confirms that the scoring function performs competitively relative to other scoring functions. All data and scripts used are available at: https://github.com/TanemuraKiyoto/PPI-native-detection-via-LR.
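One way a comparison between native and decoy structures can balance a dataset with one native and many decoys is sketched below. The abstract does not specify the construction, so this is an assumption: each native/decoy pair yields two difference vectors, native minus decoy (label 1) and decoy minus native (label 0), giving equal class counts. The function name and the feature values are hypothetical and are not KECSA2 output.

```python
def comparison_descriptors(native, decoys):
    """Build a balanced training set from pairwise feature differences.

    Assumed scheme: (native - decoy) is labelled 1 and (decoy - native)
    is labelled 0, so each decoy contributes one example to each class.
    """
    features, labels = [], []
    for decoy in decoys:
        if len(decoy) != len(native):
            raise ValueError("feature vectors must have equal length")
        diff = [n - d for n, d in zip(native, decoy)]
        features.append(diff)
        labels.append(1)
        features.append([-v for v in diff])
        labels.append(0)
    return features, labels


# Hypothetical two-dimensional feature vectors for one complex.
native_features = [1.0, 2.0]
decoy_features = [[0.5, 2.5], [1.5, 1.0]]
X, y = comparison_descriptors(native_features, decoy_features)
print(X, y)
```

A classifier trained on such pairs learns which of two structures is more native-like, so at prediction time candidates would be ranked by pairwise comparison rather than by an absolute score.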
Allostery governing two conformational states is one of the proposed mechanisms for catch-bond behavior in adhesion proteins. In FimH, a catch-bond protein expressed by pathogenic bacteria, separation of two domains disrupts inhibition by the pili domain. Thus, tensile force can induce a conformational change in the lectin domain, from an inactive state to an active state with high affinity. To better understand allosteric inhibition in two-domain FimH (H2 inactive), we use molecular dynamics simulations to study the lectin domain alone, which has high affinity (HL active), and also the lectin domain stabilized in the low-affinity conformation by an Arg-60-Pro mutation (HL mutant). Because ligand-binding induces an allostery-like conformational change in HL mutant, this more experimentally tractable version has been proposed as a “minimal model” for FimH. We find that HL mutant has larger backbone fluctuations than both H2 inactive and HL active, at the binding pocket and allosteric interdomain region. We use an internal coordinate system of dihedral angles to identify protein regions with differences in backbone and sidechain dynamics beyond the putative allosteric pathway sites. By characterizing HL mutant dynamics for the first time, we provide additional insight into the transmission of allosteric information across the lectin domain and build upon structural and thermodynamic data in the literature to further support the use of HL mutant as a “minimal model.” Understanding how to alter protein dynamics to prevent the allosteric conformational change may guide drug development to prevent infection by blocking FimH adhesion.
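The internal coordinates mentioned above are dihedral angles, each defined by four consecutive atoms. The following minimal sketch computes one dihedral angle from Cartesian coordinates using the standard projection formula; it is not the authors' analysis code, and the function names and test coordinates are hypothetical.

```python
import math


def _sub(a, b):
    return [a[i] - b[i] for i in range(3)]


def _dot(a, b):
    return sum(a[i] * b[i] for i in range(3))


def _cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def dihedral(p0, p1, p2, p3):
    """Dihedral angle in degrees defined by four points (e.g. backbone atoms)."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    if norm == 0.0:
        raise ValueError("central bond has zero length")
    b1 = [c / norm for c in b1]
    # Project the outer bonds onto the plane perpendicular to the central bond.
    v = [b0[i] - _dot(b0, b1) * b1[i] for i in range(3)]
    w = [b2[i] - _dot(b2, b1) * b1[i] for i in range(3)]
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates: a cis, a perpendicular and a trans arrangement.
cis = dihedral([1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 1.0])
perp = dihedral([1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [0.0, 1.0, 1.0])
trans = dihedral([1.0, 0.0, 0.0], [0.0, 0.0, 0.0], [0.0, 0.0, 1.0], [-1.0, 0.0, 1.0])
print(cis, perp, trans)
```

Because dihedral angles are periodic, comparing their distributions between simulations requires circular statistics rather than ordinary means and variances.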