Nanobody Aggregation
These nodes form a toolset for predicting the aggregation properties of nanobodies.
ANARCI
Antigen Receptor Numbering And Receptor ClassIfication (ANARCI) is a tool for classifying and numbering each amino acid in an anti- or nanobody. Because these proteins vary widely in length and amino acid composition, it is hard to assign an amino acid to a functional region (such as the CDR loops or framework regions) based only on its position in the sequence. For this reason, several numbering schemes exist, each suited to different applications.
Reference:
Dunbar, J., & Deane, C. M. (2016). ANARCI: antigen receptor numbering and receptor classification. Bioinformatics, 32(2), 298-300. https://doi.org/10.1093/bioinformatics/btv552
The user can choose between the following numbering schemes:
- IMGT: IMGT stands for International ImMunoGeneTics Information System. It ensures standardization and structural consistency by aligning and annotating sequences based on conserved residues. For instance, Complementarity Determining Regions (CDRs) are defined with fixed positions. It is the most widely used numbering scheme and is recommended for comparisons across datasets; it is therefore the default in xyna.bio.
Reference: Lefranc MP, Pommié C, Ruiz M, Giudicelli V, Foulquier E, Truong L, Thouvenin-Contet V, Lefranc G. IMGT unique numbering for immunoglobulin and T cell receptor variable domains and Ig superfamily V-like domains. Dev Comp Immunol. 2003 Jan;27(1):55-77.
- Kabat: The Kabat scheme was developed based on the location of regions of high sequence variation between sequences of the same domain type. For instance, CDRs are defined based on regions of high variability. Kabat's strength is that it captures the natural diversity of anti- and nanobodies.
Reference: Kabat E.A., et al. (1991) Sequences of Proteins of Immunological Interest. Fifth Edition. NIH Publication No. 91-3242.
- Chothia: Chothia's scheme refines Kabat by aligning sequences to known antibody structures, making it suitable for 3D modeling and structural prediction.
Reference: Al-Lazikani B., et al. (1997) Standard conformations for the canonical structures of immunoglobulins. J. Mol. Biol., 273, 927–948.
- Martin: Martin's scheme is an enhanced version of Chothia with further structural corrections for higher accuracy.
Reference: Abhinandan K.R., Martin A.C.R. (2008) Analysis and improvements to Kabat and structurally correct numbering of antibody variable domains. Mol. Immunol., 45, 3832–3839.
- AHo: AHo stands for Antibody Homology. The scheme is based on structure-based homology and uses fixed lengths to place residues. It is well-suited for machine learning and structural bioinformatics.
Reference: Honegger A., Plückthun A. (2001) Yet another numbering scheme for immunoglobulin variable domains: an automatic modeling and analysis tool. J. Mol. Biol., 309, 657–670.
- Wolfguy: The Wolfguy scheme is also structure-based. It defines CDRs by combining the Kabat and Chothia definitions.
Reference: Bujotzek A, Dunbar J, Lipsmeier F et al. Prediction of VH–VL domain orientation for antibody variable domain modeling. Proteins Struct. Funct. Bioinforma. 2015;83:681–95. 10.1002/prot.24756.
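As an illustration of what "fixed positions" means in the IMGT scheme, the following minimal sketch maps an IMGT position to its region. The region boundaries follow the IMGT unique numbering for variable domains (FR2 is 39-55, as used elsewhere in this documentation). The names `imgt_region`, `IMGT_REGION_STARTS`, and `IMGT_REGION_NAMES` are hypothetical and not part of ANARCI or xyna.bio, and FR4 is treated as open-ended here for simplicity.

```python
from bisect import bisect_right

# First IMGT position of each region in a variable domain.
# FR4 starts at 118; its upper bound is left open in this sketch (assumption).
IMGT_REGION_STARTS = [1, 27, 39, 56, 66, 105, 118]
IMGT_REGION_NAMES = ["FR1", "CDR1", "FR2", "CDR2", "FR3", "CDR3", "FR4"]


def imgt_region(position: int) -> str:
    """Return the region name for a 1-based IMGT position."""
    if position < 1:
        raise ValueError("IMGT positions start at 1")
    # Find the last region whose start is <= position.
    index = bisect_right(IMGT_REGION_STARTS, position) - 1
    return IMGT_REGION_NAMES[index]


print(imgt_region(45))   # FR2
print(imgt_region(110))  # CDR3
```

Because the boundaries are fixed, the same lookup works for any IMGT-numbered sequence regardless of its CDR lengths. The other schemes use different boundaries and would need their own tables.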
ANARCI PDB
The ANARCI PDB node renumbers an anti- or nanobody PDB structure using ANARCI.
Input
- Structure: The anti- or nanobody structure to renumber in .pdb format.
- Numbering Scheme: The ANARCI numbering scheme used.
Output
- Renumbered Structure: The renumbered structure in .pdb format.
NanobodyBuilder2
NanobodyBuilder2 is a tool for predicting the 3D structure of nanobodies from their amino acid sequence using a deep learning model.
Reference:
Abanades, B., Wong, W. K., Boyles, F., Georges, G., Bujotzek, A., & Deane, C. M. (2023). ImmuneBuilder: Deep-Learning models for predicting the structures of immune proteins. Communications Biology, 6(1), 575. https://doi.org/10.1038/s42003-023-04927-7
Browser application: NanoBodyBuilder2
Input
- Sequences: A fasta file containing the sequence(s) of prospective nanobodies to be analysed.
Input Parameters
- Numbering Scheme: The ANARCI numbering scheme used.
Output
- Structures: A list of predicted 3D structures in .pdb format. This output should be directed into a Batch node to process and calculate the aggregation score of each nanobody in a single run.
Intramolecular Hydrophobic Interactions
This node implements a simplified algorithm to determine intramolecular hydrophobic interactions between the amino acid residues of a nanobody. The algorithm is partially inspired by the Contacts of Structural Units (CSU) algorithm. For each residue of the nanobody, all hydrophobic side-chain C-atoms are considered, that is, all side-chain C-atoms that are not covalently bound to an oxygen, nitrogen, or sulfur atom. For each hydrophobic side-chain C-atom of a residue Ri, the distances to all hydrophobic side-chain C-atoms of every other residue Rj of the nanobody structure are determined from the PDB file. If any of these distances is lower than the pre-defined cutoff, residue Rj is classified as an interaction partner of residue Ri. The number of intramolecular hydrophobic interactions of residue Ri is its number of interaction partners.
Reference:
Sobolev, V., Sorokine, A., Prilusky, J., Abola, E. E., & Edelman, M. (1999). Automated analysis of interatomic contacts in proteins. Bioinformatics (Oxford, England), 15(4), 327-332. https://doi.org/10.1093/bioinformatics/15.4.327
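The counting procedure described above can be sketched in plain Python. This is not the node's actual implementation: the function names (`parse_atoms`, `hydrophobic_carbons`, `count_partners`) are hypothetical, and the rule for detecting a covalent bond to N, O, or S (any such atom of the same residue within 1.9 Å) is an assumption made for this sketch, since the text does not say how bonding is determined.

```python
import math
from collections import defaultdict

BOND_CUTOFF = 1.9  # Å; assumed upper bound for a covalent C-N, C-O or C-S bond


def parse_atoms(pdb_text):
    """Read ATOM records into (residue_key, atom_name, element, xyz) tuples."""
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        name = line[12:16].strip()
        # Residue key: chain ID, residue number, insertion code.
        res_key = (line[21], int(line[22:26]), line[26:27].strip())
        xyz = (float(line[30:38]), float(line[38:46]), float(line[46:54]))
        element = (line[76:78].strip() or name[0]).upper()
        atoms.append((res_key, name, element, xyz))
    return atoms


def hydrophobic_carbons(atoms):
    """Map residue -> coordinates of side-chain carbons not bound to N, O or S."""
    by_residue = defaultdict(list)
    for atom in atoms:
        by_residue[atom[0]].append(atom)
    carbons = {}
    for res_key, res_atoms in by_residue.items():
        hetero = [xyz for _, _, el, xyz in res_atoms if el in {"N", "O", "S"}]
        kept = [
            xyz
            for _, name, el, xyz in res_atoms
            if el == "C"
            and name not in {"C", "CA"}  # skip backbone carbons
            and all(math.dist(xyz, h) > BOND_CUTOFF for h in hetero)
        ]
        if kept:
            carbons[res_key] = kept
    return carbons


def count_partners(atoms, cutoff=4.0):
    """Return {residue: number of interaction partners} for every residue."""
    carbons = hydrophobic_carbons(atoms)
    counts = {res_key: 0 for res_key in sorted({a[0] for a in atoms})}
    keys = list(carbons)
    for i, ri in enumerate(keys):
        for rj in keys[i + 1:]:
            # Partners if any pair of hydrophobic carbons is closer than cutoff.
            if any(math.dist(a, b) < cutoff
                   for a in carbons[ri] for b in carbons[rj]):
                counts[ri] += 1
                counts[rj] += 1
    return counts


# Tiny hand-made example: three residues with one side-chain carbon each.
demo = [
    (("A", 1, ""), "CB", "C", (0.0, 0.0, 0.0)),
    (("A", 2, ""), "CB", "C", (3.5, 0.0, 0.0)),
    (("A", 3, ""), "CB", "C", (10.0, 0.0, 0.0)),
    (("A", 3, ""), "OG", "O", (11.4, 0.0, 0.0)),  # makes CB of residue 3 polar
]
print(count_partners(demo, cutoff=4.0))
```

For a real structure, the atoms would come from `parse_atoms` applied to the text of the PDB file. The pairwise loop is quadratic in the number of residues, which is acceptable for a single nanobody domain.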
Input
- Input Structure: A single .pdb formatted structure file containing the ANARCI numbered structure of an anti- or nanobody.
Input Parameters
- Distance Cutoff: The distance in Å below which two residues are considered to interact. Set it to 4 Å to reproduce the parameter settings of Geyer et al., 2025 (citation TBD).
Output
- Hydrophobic Interactions: A list containing the number of interaction partners for each amino acid of the given structure.
Exposed Surface Hydropathy
A major factor in aggregation is the hydropathy of the exposed surface area of the former VH-VL interface, the FR2 region (residues 39-55 in IMGT numbering). This node determines the exposed surface hydrophobicity by calculating the mean product of the exposed surface area (Shrake-Rupley algorithm) and the hydrophobicity of each amino acid (Wimley and White hydrophobicity scale). A more negative result for a particular residue reflects a more hydrophilic exposed surface. A more positive result reflects a more hydrophobic exposed surface.
References:
Shrake-Rupley algorithm: Shrake, A., & Rupley, J. A. (1973). Environment and exposure to solvent of protein atoms. Lysozyme and insulin. Journal of molecular biology, 79(2), 351-371. https://doi.org/10.1016/0022-2836(73)90011-9 (Implementation: https://github.com/biopython/biopython/blob/master/Bio/PDB/SASA.py)
Hydrophobicity scale: Wimley, W. C., & White, S. H. (1996). Experimentally determined hydrophobicity scale for proteins at membrane interfaces. Nature structural biology, 3(10), 842-848. https://doi.org/10.1038/nsb1096-842
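The product of exposed area and hydrophobicity can be sketched as follows. This is only an illustration under stated assumptions: per-residue exposed areas are taken as already computed (for example with the Biopython Shrake-Rupley implementation linked above), the function names are hypothetical, and the scale values in the example are made-up placeholders, not the Wimley and White values. How the node averages the products internally is not specified here, so the region mean shown is one plausible reading.

```python
from statistics import mean


def exposed_surface_hydropathy(residues, sasa, scale):
    """Per-residue product of exposed area and hydrophobicity.

    residues: list of (position, residue_name) pairs
    sasa:     {position: exposed surface area in Å²}
    scale:    {residue_name: hydrophobicity}, positive = hydrophobic
    """
    return {pos: sasa[pos] * scale[name] for pos, name in residues}


def region_mean(values, roi):
    """Mean of the per-residue values over the positions in roi."""
    return mean(values[pos] for pos in roi if pos in values)


# Placeholder numbers for illustration only (NOT the Wimley-White scale).
toy_scale = {"LEU": 0.5, "LYS": -1.0}
toy_residues = [(39, "LEU"), (40, "LYS")]
toy_sasa = {39: 10.0, 40: 20.0}

per_residue = exposed_surface_hydropathy(toy_residues, toy_sasa, toy_scale)
print(per_residue)                        # {39: 5.0, 40: -20.0}
print(region_mean(per_residue, [39, 40]))  # -7.5
```

In this toy case the buried-leaning leucine contributes a small positive value, while the well-exposed lysine dominates with a negative one, so the region as a whole reads as hydrophilic.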
Input
- Nanobody Structure: A single .pdb file containing a nanobody structure numbered with ANARCI. It can, for instance, be the output of the ANARCI PDB node or a batch item of the NanobodyBuilder2 output.
Input Parameters
- N Points: Number of points at which the surface is probed. A higher number may increase accuracy but also increases runtime. Unless stated otherwise, start with the default of 100.
Output
- Exposed Area: A list containing the hydropathy of the exposed surface for each amino acid of the given structure.
Nanobody Aggregation Score
This node scores nanobody aggregation following Geyer et al., 2025 (citation TBD). It calculates the aggregation score of a given nanobody from the hydropathy of its exposed surface area, its intramolecular hydrophobic interactions, and its inherent (in-)stability.
Input
- Nanobody Structure: A single .pdb file containing the ANARCI numbered structure of the nanobody. It can, for instance, be the output of the ANARCI PDB node or a batch item of the NanobodyBuilder2 output.
- Surface Properties: A list containing the hydropathy of the exposed surface for each amino acid of the given structure. Output of the Exposed Surface Hydropathy node.
- Interactions: A list containing the number of interaction partners for each amino acid of the given structure. Output of the Intramolecular Hydrophobic Interactions node.
Input Parameters
- Surface Area Hydropathy ROI: Region of Interest (ROI) for the surface area hydropathy on which the analysis should be performed. The selected default is the FR2 region and residue 118 (according to IMGT numbering). The indices should be provided according to the used ANARCI numbering scheme. Separate the indices by '-' for ranges and by ',' for single indices (e.g.: 39-55,118 for the proposed region). If the field is left blank, the entire nanobody is used.
- Hydrophobic Interactions ROI: ROI for the hydrophobic interactions on which the analysis should be performed. Again, the selected default is the FR2 region and residue 118 (according to IMGT numbering). The indices should be provided as stated for the Surface Area Hydropathy ROI parameter.
- Instability Index ROI: ROI for the instability index on which the analysis should be performed. The selected default is the FR2 region (according to IMGT numbering). The indices should be provided as stated for the Surface Area Hydropathy ROI parameter.
- Job Name: A name for the given job that is used to aggregate the results of one batch into one output file.
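The ROI syntax described above ('-' for ranges, ',' between entries, blank for the entire nanobody) could be parsed along the following lines. This is a minimal sketch rather than the node's own parser; the function name `parse_roi` is hypothetical, and insertion codes (such as 111A) are deliberately not handled.

```python
def parse_roi(text):
    """Parse an ROI string such as '39-55,118' into a sorted list of indices.

    A blank string returns None, meaning "use the entire nanobody".
    """
    if not text.strip():
        return None
    indices = set()
    for part in text.split(","):
        part = part.strip()
        if not part:
            continue  # tolerate stray commas
        if "-" in part:
            start, end = (int(x) for x in part.split("-", 1))
            if start > end:
                raise ValueError(f"invalid range: {part!r}")
            indices.update(range(start, end + 1))  # ranges are inclusive
        else:
            indices.add(int(part))
    return sorted(indices)


roi = parse_roi("39-55,118")
print(len(roi), roi[0], roi[-1])  # 18 39 118
```

Returning None for a blank field keeps "no restriction" distinct from an empty selection, so downstream code can fall back to all residues explicitly.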
Output
- Aggregation Score: The final aggregation score of each nanobody in the batch, stored as one .csv file.
Advanced settings
Open the side panel and toggle Save AA specific CSV data to additionally store the per-amino-acid values for the interactions, surface area, and hydrophobicity.