Structure

Add SDF to PDB

Adds a small molecule to a protein structure. The docked pose of the molecule in the protein must be computed beforehand, e.g., with the Diff Dock node. The binding affinity of the combined structure can then be calculated with Protein Ligand Binding Affinity (see Chemistry).

Input:

  • Molecule: The molecule to add to the protein structure, as a loaded SDF file.
  • Structure: The protein structure, as a loaded PDB structure.

Output: Combined structure.
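
As a rough illustration of what combining the two files involves, the sketch below appends the atoms of a V2000 SDF record to a PDB file as HETATM records. It is not the node's implementation: the function names, the residue name LIG, and the chain ID L are assumptions made for this example, the PDB is assumed to contain a single model, and bond information (CONECT records) is left out.

```python
def sdf_atoms(sdf_text):
    """Return (element, x, y, z) tuples from the first V2000 record of an SDF string."""
    lines = sdf_text.splitlines()
    n_atoms = int(lines[3][0:3])  # counts line: atom count sits in columns 1-3
    atoms = []
    for line in lines[4:4 + n_atoms]:
        x, y, z = float(line[0:10]), float(line[10:20]), float(line[20:30])
        atoms.append((line[31:34].strip(), x, y, z))
    return atoms


def add_sdf_to_pdb(pdb_text, sdf_text, resname="LIG", chain="L"):
    """Append the ligand atoms as HETATM records (resname and chain are assumed defaults)."""
    # Keep every PDB line except blank lines and the final END record.
    body = [l for l in pdb_text.splitlines() if l.strip() and l.strip() != "END"]
    # Continue atom serial numbers after the highest one already in use.
    serials = [int(l[6:11]) for l in body if l.startswith(("ATOM", "HETATM"))]
    serial = max(serials, default=0)
    per_element = {}
    for elem, x, y, z in sdf_atoms(sdf_text):
        serial += 1
        per_element[elem] = per_element.get(elem, 0) + 1
        name = f"{elem}{per_element[elem]}"[:4]  # e.g. C1, C2, O1
        body.append(
            f"HETATM{serial:5d} {name:<4} {resname:>3} {chain}{1:4d}    "
            f"{x:8.3f}{y:8.3f}{z:8.3f}{1.0:6.2f}{0.0:6.2f}          {elem:>2}"
        )
    return "\n".join(body + ["END"]) + "\n"
```

The ligand coordinates are copied unchanged, which is why the pose has to come from a docking step: the SDF and PDB coordinates must already share the same reference frame.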

Alpha Fold

Computes a protein tertiary structure using AlphaFold2.

Input: FASTA file containing the protein sequence whose structure is to be predicted.

Input Parameters:

  • Max template date: Cutoff date for template structures. Only structures from the Alpha Fold databases released up to this date are included in the structural predictions. Can be useful for repeating older analyses.
  • Models to relax: Determines whether the node applies a molecular dynamics relaxation step to only the best predicted structure (structure 0), to all predicted structures, or to none. Relaxation can refine an Alpha Fold prediction by optimizing bond lengths and other parameters, potentially yielding a more accurate and stable structure.

Output: Predictions of all five Alpha Fold models as PDB files, ranked from best to worst prediction with the first prediction being the best.

Citations:

Jumper, J., Evans, R., Pritzel, A., Green, T., Figurnov, M., Ronneberger, O., ... & Hassabis, D. (2021). Highly accurate protein structure prediction with AlphaFold. Nature, 596(7873), 583-589. DOI: https://doi.org/10.1038/s41586-021-03819-2.

Alpha Fold Multimer

Predicts protein complex structures from multiple fasta sequences. Each fasta sequence in the supplied fasta file is assumed to be one chain of a multichain protein complex.

Input: Multi-FASTA file containing one sequence per chain of the protein complex to predict.
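
Before submitting a file, it can help to check how many chains it will be interpreted as. The following is a minimal sketch in plain Python, assuming a standard FASTA layout with ">" header lines; the function name read_chains is hypothetical and not part of xyna.bio or AlphaFold.

```python
def read_chains(fasta_text):
    """Return a list of (header, sequence) pairs, one per FASTA record."""
    chains = []
    header, parts = None, []
    for raw in fasta_text.splitlines():
        line = raw.strip()
        if not line:
            continue  # ignore blank lines between records
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                chains.append((header, "".join(parts)))
            header, parts = line[1:].strip(), []
        elif header is not None:
            parts.append(line.upper())  # sequences may be wrapped over several lines
        else:
            raise ValueError("sequence data found before the first '>' header")
    if header is not None:
        chains.append((header, "".join(parts)))
    return chains
```

Calling len(read_chains(text)) then gives the number of chains the complex would contain, and each pair shows which sequence belongs to which header.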

Input Parameters:

  • Number of models to relax: Which predicted models to relax. Options are all, none, and best.
  • max_template_date: Latest release date of the template files to use. Useful for replicating older AlphaFold analyses on less recent databases. For a prediction against the up-to-date database, use the current date in the format yyyy-mm-dd.
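
The yyyy-mm-dd format can be produced and checked with Python's standard library. This is only a small sketch; the helper name below is hypothetical and not part of AlphaFold or xyna.bio.

```python
from datetime import date


def template_date_string(value=None):
    """Return a yyyy-mm-dd string: today's date if no value is given,
    otherwise the validated input date."""
    # date.fromisoformat raises ValueError for strings that are not valid dates.
    d = date.today() if value is None else date.fromisoformat(value)
    return d.isoformat()
```

Passing an explicit date such as "2021-07-15" reproduces an older analysis, while calling the helper without arguments yields the current date for an up-to-date prediction.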

Output:

  • Prediction1 to Prediction5: Separate structure outputs, one per model. AlphaFold consists of five separate models that each produce their own prediction; the outputs are ordered by model confidence.

Diff Dock

Performs molecular docking with the Diff Dock-L diffusion machine learning model. It docks a small-molecule ligand, given as an SDF file, into a given protein structure. The resulting docked ligand can then be combined with the protein using the Add SDF to PDB node.

Input:

  • Receptor: The protein structure to dock the ligand to, as a loaded PDB structure.
  • Ligand: The small molecule to dock, as a loaded SDF structure (ChemicalStructure).

Output:

  • Docked Ligand: Ligand in predicted docked pose (ChemicalStructure, SDF file).
  • Confidence Score: Confidence score of the prediction as float.

Citations:

Corso, G., Deng, A., Fry, B., Polizzi, N., Barzilay, R., & Jaakkola, T. (2024). Deep Confident Steps to New Pockets: Strategies for Docking Generalization. arXiv preprint arXiv:2402.18396. DOI: https://doi.org/10.48550/arXiv.2402.18396.

AutoDock Vina

AutoDock Vina is a docking tool designed to estimate the binding affinity between a ligand and a protein receptor. It uses a scoring function built from physical and chemical interaction terms to estimate the Gibbs free energy of binding for the ligand-receptor complex, measured in kcal/mol. A lower (more negative) value indicates a stronger binding interaction.

Unlike DiffDock, which predicts poses with a neural network and has no explicit physical model, AutoDock Vina scores poses with an explicit energy function, so its predictions can be interpreted as physical quantities.

Limitations

AutoDock Vina is not suitable for metalloproteins. If a protein contains metal ions, the tool processes it as though it contained none, which can lead to inaccurate results.

Input:

  • Receptor Protein Structure: A PDB-file of the target receptor.
  • Ligand SMILES String: A SMILES string of the ligand molecule.
  • Docked Ligand File for Re-Docking: An SDF-file containing the output conformation of a previously docked ligand.

Input Requirements:

The AutoDock Vina node always requires a PDB-file for the receptor protein. The ligand can be supplied in two formats:

  • As a SMILES string
  • As a ligand conformation SDF-file

Parameters:

The computational effort in the docking simulation can be adjusted using the ‘Exhaustiveness’ input parameter:

  • 8 (default): fast results
  • 64: precise and fast enough for most applications
  • 512 (maximum value): best possible docking accuracy

Output:

  • Docking Conformation 1...3: Contains the SDF-file of the docked ligand. The numbers refer to the ranked output (1 is the best result and so on).
  • Binding Energy for Conformation 1...3: Gibbs free energy in kcal/mol for each docking result.
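
To get a feeling for what a binding energy in kcal/mol means, it can be converted into an approximate dissociation constant with the standard thermodynamic relation ΔG = RT ln(Kd). The sketch below assumes a temperature of 298.15 K and a 1 M standard state; the function name is hypothetical, and because docking energies are estimates, the resulting Kd should be read as an order-of-magnitude guide only.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def dissociation_constant(delta_g_kcal, temperature_k=298.15):
    """Convert a binding free energy (kcal/mol) into an approximate Kd in mol/L."""
    # Rearranged from delta_G = R * T * ln(Kd).
    return math.exp(delta_g_kcal / (R_KCAL * temperature_k))


# A more negative energy gives a smaller Kd, i.e. tighter binding.
for energy in (-6.0, -8.0, -10.0):
    print(f"{energy:6.1f} kcal/mol -> Kd ~ {dissociation_constant(energy):.2e} M")
```

At room temperature, each additional -1.36 kcal/mol or so corresponds to roughly a tenfold decrease in Kd, which is why differences of a few kcal/mol between conformations or ligands are significant.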

Interpreting Results

To visualize interactions between the ligand and the protein:

  • Use the Add SDF to PDB node to map the ligand onto the protein structure.
  • The AutoDock Vina node also outputs a cleaned protein structure, which can be analyzed in visualization tools like Mol* (Molstar).
  • For accurate structural representation, use the protein structure file in conjunction with the SDF ligand conformation file via the Add SDF to PDB node.

This ensures correct alignment and a meaningful interpretation of the docking results.

Citations:

Eberhardt, J., Santos-Martins, D., Tillack, A. F., & Forli, S. (2021). AutoDock Vina 1.2.0: New Docking Methods, Expanded Force Field, and Python Bindings. Journal of Chemical Information and Modeling. DOI: https://doi.org/10.1021/acs.jcim.1c00203.

Trott, O., & Olson, A. J. (2010). AutoDock Vina: Improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. Journal of Computational Chemistry, 31(2), 455-461. DOI: https://doi.org/10.1002/jcc.21334.

O’Boyle, N. M., Banck, M., James, C. A., Morley, C., Vandermeersch, T., & Hutchison, G. R. (2011). Open Babel: An open chemical toolbox. Journal of Cheminformatics, 3, 33. DOI: https://doi.org/10.1186/1758-2946-3-33.

Omega Fold

Computes a protein tertiary structure de novo (without requiring templates or a multiple sequence alignment) using Omega Fold. This is considerably faster than computing the structure with Alpha Fold, at the cost of slightly lower accuracy.

Input: Protein sequence to predict the structure for.

Input Parameters: Subbatch size, which can be lowered to use less VRAM. The subbatch size determines how much of the structure is computed in one computational batch. For long protein sequences, large batches can cause the structure computation to fail. Set -1 to use the number of residues in the sequence and compute everything in one batch.
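
The effect of the setting can be pictured with a small calculation. The sketch below only illustrates the rule described above (-1 resolves to the sequence length, giving a single batch); it is a hypothetical helper, not Omega Fold's or xyna.bio's actual code, and the real memory use depends on the model internals.

```python
import math


def plan_subbatches(n_residues, subbatch_size=-1):
    """Return (effective_subbatch_size, number_of_batches) for a sequence.

    A subbatch_size of -1 means "use the whole sequence", as described above.
    """
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    if subbatch_size == -1:
        subbatch_size = n_residues  # everything in one batch
    if subbatch_size <= 0:
        raise ValueError("subbatch_size must be positive or -1")
    return subbatch_size, math.ceil(n_residues / subbatch_size)
```

For a 300-residue sequence, plan_subbatches(300) gives a single batch of 300 residues, while plan_subbatches(300, 128) splits the work into three smaller batches that each need less memory.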

Output: Predicted protein structure.

Citations:

Wu, R., Ding, F., Wang, R., Shen, R., Zhang, X., Luo, S., ... & Peng, J. (2022). High-resolution de novo structure prediction from primary sequence. bioRxiv, 2022-07. DOI: https://doi.org/10.1101/2022.07.21.500999.

Smiles To Structure

Converts a SMILES (Simplified Molecular Input Line Entry System) string, given as text, into a structure in SDF format. The SDF file format stores three-dimensional structural data of molecules such as drugs or metabolites.

Input Parameters:

  • SMILES: SMILES string to convert.
  • Optimize: Determines whether xyna.bio optimizes the three-dimensional geometry of the molecule using molecular mechanics. This can be useful to obtain a more accurate three-dimensional structure of the molecule.

Output: Generated chemical structure as SDF file.