Introduction
Antibodies are a primary adaptive defense mechanism against infection, and function by recognizing and binding to non-self antigens. It is their ability to differentiate between antigens that makes them indispensable tools in medical and biological applications. While most of the sequence of all antibodies of a given individual is identical, relatively small variations in six stretches of amino acids turn each antibody into a specific binder of one antigen. The attempt to identify the regions in which antigen binding actually occurs has been the focus of extensive research for the last few decades.

It is widely assumed that antigen binding sites correspond to the so-called Complementarity Determining Regions (CDRs) of the antibody, namely the elements that differ most between antibodies. We analysed all known antibody-antigen complexes and found that about 20% of the residues that actually bind the antigen fall outside the CDRs. However, we also found that virtually all antigen binding residues fall within regions of structural consensus between antibodies, and are organized along the sequence of the antibody. Moreover, antigen binding residues that reside within these structural consensus regions but outside of the CDRs make a significant energetic contribution to antigen binding. For further details, please see Kunik et al. and Supplementary Material.

Paratome - identification of Antigen Binding Regions (ABRs) in antibodies
Given the amino acid sequence or 3D structure of an antibody, the Paratome server identifies the antigen binding regions within the query antibody. Paratome was constructed by structurally aligning a non-redundant set of all known antibody-antigen complexes in the PDB (see train dataset, light chain Multiple Structure Alignment and heavy chain Multiple Structure Alignment), from which structural consensus elements that are commonly involved in antigen binding across antibodies were identified. The lists of the train and test set ABRs and Ab-Ag contacts that were used to construct Paratome can be found in train set ABRs, test set ABRs, train set contacts and test set contacts. For further details, please see Materials and Methods in Kunik et al. Our method for identifying ABRs is outlined in Figures 1 and 2.

Sequence based ABRs identification
A BLAST search is performed using the query antibody sequence against the dataset of non-redundant PDB antibodies. Using the best hit from the BLAST search, the framework regions (FRs) of the query and annotated antibodies are aligned, and the ABRs of the query sequence are then inferred from the location of the ABRs in the annotated sequence (see Figure 1).
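The transfer step can be illustrated with a minimal sketch. It assumes that a gapped pairwise alignment between the query and its best hit is already available (for example, taken from the BLAST output) and that the ABRs of the hit are given as 0-based, inclusive ranges over its ungapped sequence. The function name `transfer_abrs` and the toy sequences are hypothetical and are not part of the Paratome server; the server itself aligns the framework regions, which this simplified version does not reproduce.

```python
from typing import Dict, List, Optional, Tuple


def transfer_abrs(query_aln: str, template_aln: str,
                  template_abrs: List[Tuple[int, int]]) -> List[Tuple[int, int]]:
    """Map ABR ranges from an annotated template onto a query sequence.

    Both alignment strings use '-' for gaps. Ranges are 0-based and
    inclusive, in ungapped sequence coordinates.
    """
    if len(query_aln) != len(template_aln):
        raise ValueError("aligned strings must have equal length")

    # For each ungapped template position, record the aligned query
    # position, or None when the template residue faces a gap.
    t_to_q: Dict[int, Optional[int]] = {}
    q_pos = t_pos = 0
    for q_char, t_char in zip(query_aln, template_aln):
        if t_char != "-":
            t_to_q[t_pos] = q_pos if q_char != "-" else None
        if q_char != "-":
            q_pos += 1
        if t_char != "-":
            t_pos += 1

    query_abrs = []
    for start, end in template_abrs:
        mapped = [t_to_q[i] for i in range(start, end + 1)
                  if t_to_q.get(i) is not None]
        if mapped:
            # Query insertions inside the region are kept, because the
            # range spans from the first to the last mapped residue.
            query_abrs.append((min(mapped), max(mapped)))
    return query_abrs


# Toy alignment (made-up sequences, for illustration only).
print(transfer_abrs("ACDEFG-HIK", "ACD-FGWHIK", [(2, 4), (5, 6)]))
```

Reporting each region as a single range, rather than as a list of individually mapped residues, keeps insertions in the query inside the inferred ABR.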

Figure 1. Sequence based ABRs identification.

Structure based ABRs identification
A BLAST search is performed using the sequence of the query antibody versus our dataset of antibodies. Using the best hit from the BLAST search, the query and annotated antibodies are structurally aligned. The ABRs of the query antibody are inferred based on the location of the annotated antibody ABRs (see Figure 2).
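One way to picture the inference step is sketched below. It assumes that the two structures have already been superimposed by some structural alignment tool, and that each residue is represented by its C-alpha coordinate. A query residue is labelled as part of an ABR when its nearest template residue lies within a distance cutoff and belongs to an annotated ABR. The function name, the 2.0 Å cutoff and the coordinates are illustrative assumptions, not values taken from Paratome.

```python
import math
from typing import List, Sequence, Set, Tuple

Coord = Tuple[float, float, float]


def infer_abr_positions(query_ca: Sequence[Coord],
                        template_ca: Sequence[Coord],
                        template_abr_idx: Set[int],
                        cutoff: float = 2.0) -> List[int]:
    """Return indices of query residues matched to template ABR residues.

    Coordinates are assumed to be in the same frame, i.e. the structures
    were superimposed beforehand. The cutoff (in angstroms) is an
    arbitrary illustrative value.
    """
    if not template_ca:
        return []

    hits = []
    for qi, q in enumerate(query_ca):
        # Find the nearest template residue to this query residue.
        ti, dist = min(
            ((ti, math.dist(q, t)) for ti, t in enumerate(template_ca)),
            key=lambda pair: pair[1],
        )
        if dist <= cutoff and ti in template_abr_idx:
            hits.append(qi)
    return hits


# Made-up coordinates: the last query residue has no close counterpart.
template = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
query = [(0.3, 0.0, 0.0), (3.9, 0.5, 0.0), (7.5, 0.0, 0.0), (20.0, 0.0, 0.0)]
print(infer_abr_positions(query, template, {1, 2}))
```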

Figure 2. Structure based ABRs identification.

Performance
Paratome was thoroughly benchmarked against a set of antibodies (see test dataset) that were extracted from the PDB following the construction of the tool. While residues identified by Paratome cover virtually all the antigen binding sites (94%), the CDRs (as identified by the commonly used CDR identification tools, i.e. Kabat, Chothia and IMGT) miss significant portions of them (up to 20%). The precision of all methods is essentially the same. Figure 3 depicts the recall and precision.
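For readers unfamiliar with the two measures, the sketch below shows how recall and precision are conventionally computed from a set of predicted residues and a set of residues that actually contact the antigen. It uses the standard definitions and made-up residue numbers; it is not the benchmarking code used for Paratome, and the function name is hypothetical.

```python
from typing import Hashable, Set, Tuple


def recall_precision(predicted: Set[Hashable],
                     binding: Set[Hashable]) -> Tuple[float, float]:
    """Return (recall, precision) for a residue-level prediction.

    recall    = fraction of true binding residues that were predicted
    precision = fraction of predicted residues that truly bind
    """
    true_pos = len(predicted & binding)
    recall = true_pos / len(binding) if binding else 0.0
    precision = true_pos / len(predicted) if predicted else 0.0
    return recall, precision


# Made-up example: 10 binding residues, 12 predicted, 9 in common.
binding = set(range(1, 11))
predicted = set(range(1, 10)) | {11, 12, 13}
print(recall_precision(predicted, binding))  # (0.9, 0.75)
```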

Figure 3. Recall and precision of antigen binding sites identification.

We refer to the antigen binding residues that are identified by Paratome but not by any of the common CDR identification methods as Paratome-unique residues. Similarly, antigen binding residues that are identified by any of the common CDR identification methods but not by Paratome are referred to as CDRs-unique residues. We used FoldX, molecular modelling software that computationally predicts the effect of mutations on the binding energy, to assess the contribution of Paratome-unique and CDRs-unique antigen binding residues to antibody-antigen binding. Residues that fall outside of the traditionally defined CDRs are at least as important to antigen binding as residues within the CDRs, and in some cases, they are even more important energetically. Furthermore, antigen binding residues that fall outside the structural consensus regions but within the CDRs show a relatively marginal energetic contribution to antigen binding. Figure 4 presents the results. For further details, please see Kunik et al. and Supplementary Material.
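The two definitions at the start of this paragraph reduce to simple set operations, as the following sketch shows. The function name and the residue numbers are made up for illustration; only the definitions themselves come from the text above.

```python
from typing import Dict, Set, Tuple


def unique_binding_residues(binding: Set[int],
                            paratome: Set[int],
                            cdr_methods: Dict[str, Set[int]]
                            ) -> Tuple[Set[int], Set[int]]:
    """Return (Paratome-unique, CDRs-unique) antigen binding residues."""
    # Residues identified by at least one CDR identification method.
    any_cdr = set().union(*cdr_methods.values())
    # Binding residues found by Paratome but by none of the CDR methods.
    paratome_unique = (binding & paratome) - any_cdr
    # Binding residues found by some CDR method but not by Paratome.
    cdrs_unique = (binding & any_cdr) - paratome
    return paratome_unique, cdrs_unique


# Made-up residue numbers, for illustration only.
binding = {1, 2, 3, 4, 5}
paratome = {1, 2, 3, 4, 9}
cdrs = {"Kabat": {2, 3}, "Chothia": {3, 5}, "IMGT": {3, 10}}
print(unique_binding_residues(binding, paratome, cdrs))
```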

Figure 4. The contribution of Paratome-unique and CDRs-unique residues to the binding energy in antibody-antigen complexes.