New insight into the In-Silico prediction and molecular docking of Giardia intestinalis protease resistance to nitroimidazole

Background/Objective. Giardia intestinalis is a flagella protozoan residing in human intestine causing diarrhea that affecting most common among the children. Usually metronidazole and nitroimidazole were used for the treatment of giardiasis worldwide. Pyruvate Ferredoxin oxidoreductase (PFeO) protease enzyme may play a role inactivation of these drugs that may lead to resistance. Studies regarding drug resistance are very less. Therefore, present study was carried out to predict the structure and characterize them structurally as well as functionally by using appropriate in-silico methods. Methods: The proteins sequences of wild type and mutant strain i.e. Pyruvate Ferredoxin oxidoreductase (PFeO) and Pyruvate flavodoxin oxidoreductase (PFIO) were retracted from the NCBI and aligned using ClustalW. Bioinformatic tools were carried out like molecular dynamics simulations and docking to understand the three-dimensional (3D) conformational structure and interaction behavior of mutant and wild types with the ligand. Findings: The wild complex maintained an approximate 4H-bonds during simulation while the mutant complex could hardly maintain single H-bond by the end of the simulation, suggesting that 4-nitroimidazole is highly stable in the wild complex. Since the compound occupies this binding site by making stable H-bond with key interacting residues, the function of mutant may be affected. For the very reason, the mutant may be resistant to the nitroimidazole. Applications: Molecular dynamic simulation and docking results have increased our understanding of resistance mechanisms and may also help for the design derivatives of nitroimidazole that inhibit protein function more efficiently.


Introduction
Giardia intestinalis (syn. Giardia lamblia or Giardia duodenalis) belong to protozoa which are intestinal parasites residing in the small intestine of the human, this parasite causes diarrhea in human and domestic animals as well, while children's is known to be more susceptible to this parasite infection. The infections occurred worldwide, however, they estimated that more than four hundred million cases annually are affecting by this infection and increased with their incidence rate (1,2) . Most commonly seen in the developing countries with a poor hygienic setting, 15% approximately Giardia infections mostly occur it is children group which is less than 10 years of age and it considers as the second foremost cause of death in children under five years old (3)(4)(5)(6) . Most of the cases were found to be asymptomatic they act as a carrier, but in the symptomatic cases may lead to chronic and acute symptoms which include nausea, diarrhea, dehydration, malabsorption syndrome (1) .
G. intestinalis infection can cause like irritable bowel syndrome, it has been found that the infection has some impact on the patient with obesity and type II diabetes (7,8) . Presently worldwide treatment of giardiasis constrained to the nitro heterocyclic group of drugs like metronidazole, nitroimidazole, and furazolidone compounds (9,10) . Some of the studies abroad has been reported that clinical isolates of Giardia intestinalis have diverse levels of resistance against these nitro groups of drugs may be due to overexpressing some protease enzymes related to drug resistance (10)(11)(12)(13) . But the mechanistic basis for this resistance is not known. It has been reported that the protease enzyme known as pyruvate ferredoxin oxidoreductase (PFeO) mutant type which is present in the Giardia showed resistance to metronidazole and nitroimidazole drugs by inactivating the activity of these particular drugs (14) . The metronidazole inactive oxide form enters the cell of Giardia through passive diffusion methods and inside mitochondria organelle of Giardia the nitrogen being converted to toxic radicals by reduced pyruvate-ferredoxin oxidoreductase enzyme (PFeO) mutant type (15,16) . This mutation in the gene of pyruvate ferredoxin oxidoreductase of G. intestinalis can able to decrease the protein binding affinity process which brings down the expression of the protease enzyme and its resistance to nitroimidazole and metronidazole.
Previously various studies regarding nitro heterocyclic drug resistance by Giardia has been evaluated in different aspect like regulation of Oxidoreductase enzymes in mRNA expression levels that take place in the parasite (17)(18)(19) . However, there are no complete study has been showed at the protein level of replication in Giardia which is related to nitro heterocyclic resistance of G. intestinalis isolates. In western countries they find it is one of the major problems in treating nitro imidazole-resistant clinical isolates of Giardia in the patients, these nitro group of drug their permeability does not alter on the trophozoites and developed cysts of the parasite and there are no significant changes at lower concentrations to stop the multiplication of cyst production of the parasite (20,21) . Pyruvate-ferredoxin oxidoreductase enzyme is known to be involved as nitro-binding protein in drug resistance that presents in G. intestinalis. Therefore, resistance to this nitroimidazole drug in G. intestinalis may be due to pyruvate ferredoxin oxidoreductase activity of this enzyme, So, therefore, the present study aimed to understand the drug-resistant mechanism involve in the G. intestinalis by using appropriate in-silico bioinformatics tools like simulation and molecular docking of pyruvate-ferredoxin oxidoreductase enzyme with the particular drug ligand nitroimidazole.

Homology modeling/template selection and sequence alignment
The protein sequences of mutant and wild types of Giardia intestinalis were extracted from National Centre for Biotechnology Information (NCBI) protein search with accession numbers AAA74894.1 and XP_001708948.1, respectively. The Basic Local Alignment Search Tool protein (BLASTp) search was performed to retrieve the most similar template structure for homology modeling (22) . Both mutant and wild protein sequences have structural similarity with pyruvate-ferredoxin oxidoreductase protein structure from Moorella thermoacetica American Type Culture Collection (ATCC) 39073 with an E-value of 0.0. With this template, the wild protein showed a sequence identity of 47.04% and query coverage of 98% while the mutant protein exhibited a sequence identity of 37.16% and query coverage of 97%. Then, homology modeling was performed to predict the 3D protein structure using the Modeller9v19 script (23) . A set of 50 protein models was generated relying on sequence alignment and template structure assuring the spatial constraints. The model with low molecular PDF (mol pdf) score was evaluated with Rampage Ramachandran plot (24) . The residues lying in the disallowed region were subjected to loop modeling if they are present in the loop region of the protein. Again, a set of 50 models was generated with a loop modeling procedure and the model with the lowest molecular PDF (mol pdf) score was selected for further studies. https://www.indjst.org/

Molecular dynamics (MD) simulation
To analyze the stability of the modeled structure and to extract the minimized structure, Molecular dynamic simulation was performed using GROMACS 5 (25) . The Gromos 43A1 force field was applied and the protein was placed at the geometrical center of the triclinic box with a distance of 1.0 nm from the box edge. The box was solvated with simple point charged (SPC) explicit water molecules. The redundant charge of the system was neutralized with a sufficient amount of counter ions by replacing solvent molecules. Then, the energy minimization of the entire system was done using the steepest descent algorithm with a tolerance force for convergence of <1000.0 kJ/mol/nm. Subsequently, the system was equilibrated with the NVT ensemble using a leap-frog integrator with a time step of 2 fs, where initial velocities were assigned from Maxwell distribution to gradually heat the system to 300K. For temperature coupling, the modified Berendsen or V-rescale thermostat was used with 0.1 ps time constant. The NPT ensemble was then applied using Parrinello-Rahman barostat with 0.2 ps time constant to attain 1bar pressure. All bonds have pertained with LINear Constraint Solver (LINCS) holonomic constraints for van der Waals and electrostatic interactions with 10 Å cut-off (26) . For long-range electrostatics, the Particle-Mesh-Ewald (PME) coulomb type with a cubic interpolation order of 4 and a grid spacing of 0.16 nm was applied (27) . Finally, the MD run was performed in the NPT ensemble for a time scale of 10 ns. The same protocol was implemented for protein-ligand simulation while the ligand parameters were obtained from the swissparam server with CHARMM force field (28) .

Essential dynamics
The principal component analysis (PCA) and free energy landscape (FEL) were plotted for all four structures i.e., wild, mutant, complex-wild and complex-mutant to attain the minimum energy structure. At first, the covariance matrix was constructed from backbone atomic fluctuations of protein structure by avoiding the removal of rotational and translational movement. The sum of eigenvalues was computed from the diagonalization of the covariance matrix. Then, the detailed analysis of motion was performed along with eigenvector direction by projecting onto individual eigenvectors using gmx anaeig. From the analysis, the protein structure with the lowest free energy was extracted from the trajectory for further studies.

Binding site prediction
The binding site of both wild and mutant types proteins was predicted by Computed Atlas of Surface Topography of proteins (CASTp) server (6) . Both protein structures were submitted to the CASTp server individually in PDB format as input and a probe radius of 2.0 Å was given for computing solvent accessible surface area (SASA).

Molecular docking
As explained earlier, the protein structure of wild and mutant types was extracted from the FEL plot while the 3D structure of ligand i.e. 4-nitroimidazole was downloaded from PubChem (ID: 18208). Molecular docking was performed using AutoDock to predict the binding pocket of 4-nitroimidazole and to obtain its binding affinity (29) . Initially, the protein (both wild and mutant) was prepared for docking by adding the polar hydrogens and Kollman charges. In the second step, the ligand was prepared by employing Gasteiger charges (30) . The prepared protein and ligand files were transformed into pdbqt format that contains atom coordinates, partial charges, and solvation parameters. In the third step, the grid maps were calculated using the AutoGrid program with binding site residues predicted by the CASTp server. At last, the docking was performed using the Lamarckian genetic algorithm (GA) which is an adaptive local search method. During docking, the protein was kept rigid and the torsional bonds of 4-nitroimidazole were allowed to be flexible. Lamarckian GA was employed for a set of 50 generation cycles composed of mapping, fitness energy evaluation for a maximum value of 2.5 x 106 with a selection, crossover and mutation rate of 0.02. The generated conformers were scored by AutoDock energy-based scoring function that includes scoring for H-bond, short-range van der Waals and electrostatic interactions and loss of entropy due to solvation and ligand binding (30,31) . The conformation with the lowest binding energy in the largest cluster was considered for further analysis.

Structural analysis of predicted mutant and wild models
Among the 50 generated models, wild type had the lowest mol pdf value of 6464.68652 and mutant type had a score of 7084.38086. Structure validation (after loop refinement) of wild type by Ramachandran plot showed 95.8% of residues in the favored region, 3.6% residues in allowed and 0.6% residues in the outlier region as shown in ( Figure 1 ). In the case of mutant, 94.6% residues lied in the favored region, 3.9% residues in allowed while 1.4% residues fallen in the outlier region as shown https://www.indjst.org/

MD simulation analysis
During the MD simulation, the behavior of protein was studied as a function of time. The root-mean-square deviation (RMSD) plot explains that the wild protein has converged after 6 ns and was quite stable for the rest of the simulation compared to mutant protein simulation as shown in ( Figure 3 A). In the root-mean-square fluctuation (RMSF) plot, the highest fluctuation was observed at C-terminus which generally happens due to free end as shown in ( Figure 3B). As per the Rg plot, the compactness of both the proteins increased during the simulation. If the compactness of protein is more then, the protein is more stable as shown in ( Figure 3C). The intensive motion in trajectories through the eigenvectors of the covariance matrix of atomic fluctuation was studied with the help of PCA analyses. The dihedral angles i.e. Φ, Ψ of proteins were used to define the atomic fluctuation during simulation which was then taken to state the cosine content of the principal component of the covariance matrix. In general, the representative structure of a protein can be defined based on the cosine contribution of the first eigenvector. However, in many cases, the cosine contribution of the first eigenvector features large scale motion in protein dynamics. Thus, it cannot be used to depict protein behavior in terms of FEL. In recent studies, the first two PCs with cosine content values below 0.2 are considered to produce qualitatively related and smooth results. In the present study, the first 20 PCs of wild and mutant proteins were extracted and their cosine content was analyzed to find out the first two PCs having cosine content less than 0.2 as shown in ( Figure 4 A and Figure 5 A). The coordinates of the first two PCs with such values were considered to construct the FEL plot. A single minimum energy cluster was observed in the 2D counter map of FEL of wild protein as shown in ( Figure 4B). Based on the FEL 3D plot, the most representative structure of the minimum energy cluster was extracted as shown in ( Figure 4C and D). Though there are two clusters in the 2D FEL counter map of the mutant protein, the lowest minimum energy cluster was attained from the 3D FEL plot as shown in ( Figure 5B and C). The minimum energy structure of the mutant protein was extracted from the FEL counter map as shown in ( Figure 5D). https://www.indjst.org/

Prediction of the binding site for mutant and wild proteins
Since the binding site of 4-nitroimidazole is not yet known on these two proteins, it was predicted by CASTp server. It implements alpha shape method to identify topological features and to measure area and volume of pockets of protein to compute imprints. In wild protein, the predicted binding pockets residue numbers are 9

MD simulation analysis of mutant and wild complexes
Since the docking procedure is static, the complexes were further subjected to MD simulation analysis to understand the stability of H-bond interactions. As shown in Figure 7, the RMSD plot of both wild and mutant complexes was less than 0.4 nm. The major fluctuation in the RMSF graph of wild was seen at residue Val700 which is from structural linker (domain IV) followed by N-and C-terminal ends. Higher fluctuation at terminals is expected due to their free ends. The fluctuation of residues in the mutant complex was relatively less than the wild complex as shown in ( Figure 7B). The compactness of both the complexes as per the Rg plot was more or less similar as shown in ( Figure 7C). Since the stability of the compound is to be analyzed, H-bond plays an important role in the analysis. It can be observed in Figure 7D that wild complex maintained an approximate 4H-bonds during simulation while the mutant complex could hardly maintain single H-bond by the end of the simulation. This explains that 4-nitroimidazole is highly stable in wild complex rather than in the mutant complex. Although Figure 7 explains the higher number of H-bonds in wild complex, it does not elucidate the interacting residues and if new interactions are formed or not. Hence, clustering was performed for an entire trajectory with a cut-off of 0.1 to extract the most representative structure. The compound could maintain H-bond interactions with Asp431 (4 H-bonds) and Thr 467 while the rest of the residues from P-loop and Val433, Val434, Tyr458, Phe466, and Gly465 formed hydrophobic interaction. While in the mutant, the compound-maintained H-bond interaction with Cys857 and Gly859. The details of H-bond, its distance, and hydrophobic interactions are written in Table 2. As the compound 4-nitroimidazole could not make H-bond or hydrophobic interaction with key residues of the mutant protein, the compound may show less efficiency on mutant protein. https://www.indjst.org/

Discussion
Pyruvate Ferredoxin oxidoreductase is a protease enzyme that takes part in the synthesis pathway in some of the bacteria and parasites. The role of this enzyme is inactivation or inactivation some of the anti-parasite drugs, to date, this enzyme has not been crystallized. And thus, homology modeling was performed to predict the 3D protein structure using the Modeller9v19 script (23) . On sequence alignments, it was observed that the sequence similarity between mutant strain and wild type is 51% and the protein structure of both the types can be modeled on the same template protein.
To analyze the stability of the modeled structure and to extract the minimized structure, MD simulation was performed using GROMACS 5 (25) . All bonds pertained with LINear Constraint Solver (LINCS) holonomic constraints for van der Waals and electrostatic interactions with 10 Å cutoff (26) . The binding site of both wild and mutant pyruvate ferredoxin oxidoreductase proteins was predicted by Computed Atlas of Surface Topography of proteins (CASTp) server. For long-range electrostatics, the Particle-Mesh-Ewald (PME) coulomb type with a cubic interpolation order of 4 and a grid spacing of 0.16 nm was applied (27) . Finally, the MD run was performed in the NPT ensemble for a time scale of 10 ns. The same protocol was implemented for protein-ligand simulation while the ligand parameters were obtained from the swissparam server with CHARMM force field (28) . Molecular docking was performed using AutoDock to predict the binding pocket of 4-nitroimidazole and to obtain its binding affinity (29) . Both wild and mutant structures composed of six domains, three iron-sulfur [4Fe-4S] clusters, one thiamine pyrophosphate (TPP) and one Mg2+ molecule. In the root-mean-square fluctuation (RMSF) plot, the highest fluctuation was observed at C-terminus which generally https://www.indjst.org/ happens due to free end. As per the Rg plot, the compactness of both the proteins increased during the simulation. In the docking studies, the compound 4-nitroimidazole formed H-bonds with Asp431, Gly432 (P-loop residues), Gly465 and Thr467 residues of domain III of wild or PFlO protein and hydrophobic interaction with key residues i.e., Gly427, Met428, Ser430, Arg563 and Val433, Val434, Tyr458, Gly464, Phe466, and Ser493. The RMSD plot of both wild and mutant complexes was less than 0.4 nm. The major fluctuation in the RMSF graph of wild was seen at residue Val700 which is from structural linker (domain IV) followed by N-and C-terminal ends. It can be observed in that wild complex maintained an approximate 4H-bonds during simulation while the mutant complex could hardly maintain single H-bond by the end of the simulation. This explains that 4-nitroimidazole is highly stable in the wild complex rather than in the mutant complex. As the compound 4-nitroimidazole could not make H-bond or hydrophobic interaction with key residues of the mutant protein, the compound may show less efficiency on mutant protein.

Conclusion
The sequence similarity between wild type and mutant type is 51% and the 3D structure of both proteins could be built based on the same template protein. However, only mutant type is resistant to 4-nitroimidazole. The structural insights behind this resistance are unknown so far. In our study, the predicted tertiary structures of wild and mutant types were necessary to understand wild/mutant types 4-nitroimidazole interactions in computer-aided drug discovery procedures. This study predicts that 4-nitroimidazole may interrupt the binding of Co-A in wild type which is required for its regular function. Since the compound occupies this binding site by making stable H-bond with key interacting residues, the function of wild type may be affected. For the very reason, the mutant type may be resistant to the 4-nitroimidazole. Thus, this study predicted tertiary structures of wild and mutant types, the most probable binding site and also interacting residues of 4-nitroimidazole on both proteins. Besides, this study would be useful for the design derivatives of 4-nitroimidazole that inhibit protein function more efficiently.

Limitation
Nevertheless, the present study incorporated the suggested amino acid residues for site-directed drug-resistant results, so studies in vitro can be carried out to prove the above proposed in silico study.