
Major histocompatibility complex class II (MHC-II) molecules present peptide fragments to T cells for immune recognition. Current predictors of peptide-MHC-II binding are trained on binding affinity data, which are generated in vitro and therefore carry no information about antigen processing.

Major histocompatibility complex class II (MHC-II) molecules play a central role in the immune system of vertebrates. MHC-II present exogenous, digested peptide fragments on the surface of antigen-presenting cells, forming peptide-MHC-II complexes (pMHCII). On the cell surface, these pMHCII complexes are scrutinized, and if certain stimulatory conditions are met, a T helper lymphocyte may recognize the pMHCII and initiate an immune response [1].


The precise rules of MHC class II antigen presentation are influenced by many factors including internalization and digestion of extracellular proteins, the peptide binding motif specific for each MHC class II molecule, and the transport and surface half-life of the pMHCIIs. The MHC-II binding groove, unlike MHC class I, is open at both ends. This attribute facilitates peptide protrusion out of the groove, thereby allowing longer peptides (and potentially whole proteins) to be loaded onto MHC-II molecules [2, 3]. Peptide binding to MHC-II is mainly determined by interactions within the peptide binding groove, which most commonly encompass a peptide with a consecutive stretch of nine amino acids [4]. Ligand residues protruding from either side of the MHC binding groove are commonly known as peptide flanking regions (PFRs). The PFRs are variable in length and composition and affect both the peptide MHC-II binding [5] and the subsequent interaction with T cells [6,7,8]. The open characteristic of the MHC-II binding groove does not constrain the peptides to a certain length, thereby increasing the diversity of sequences that a given MHC-II molecule can present. Also, MHC-II molecules are highly polymorphic, and their binding motifs have appeared to be more degenerate than MHC-I motifs [9,10,11].
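The relationship between a ligand, its nine-residue binding core, and the peptide flanking regions can be illustrated with a minimal sketch. The function names, the example peptide, and the chosen core offset are hypothetical and serve only to show the bookkeeping; in practice the core position is not known in advance and has to be inferred by the prediction model.

```python
CORE_LENGTH = 9  # most common binding-core length, as stated in the text


def split_ligand(peptide, core_offset, core_length=CORE_LENGTH):
    """Split a ligand into (N-terminal PFR, binding core, C-terminal PFR).

    core_offset is the 0-based position at which the core starts.
    """
    if core_offset < 0 or core_offset + core_length > len(peptide):
        raise ValueError("binding core does not fit inside the peptide")
    n_pfr = peptide[:core_offset]
    core = peptide[core_offset:core_offset + core_length]
    c_pfr = peptide[core_offset + core_length:]
    return n_pfr, core, c_pfr


def candidate_cores(peptide, core_length=CORE_LENGTH):
    """Enumerate every possible core placement within the peptide."""
    n_placements = len(peptide) - core_length + 1
    return [split_ligand(peptide, i, core_length) for i in range(n_placements)]


if __name__ == "__main__":
    example = "PKYVKQNTLKLAT"  # illustrative 13-residue peptide
    print(split_ligand(example, 2))
    print(len(candidate_cores(example)))
```

Because the groove is open at both ends, a peptide of length L admits L - 8 possible core placements, which is one reason the binding register is harder to determine for MHC-II than for MHC-I.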

Given all of these factors, characterizing MHC-II motifs and rationally identifying MHC-II ligands and epitopes is a highly challenging and costly endeavor. Because MHC-II is a crucial player in the exogenous antigen presentation pathway, considerable effort has been dedicated to developing efficient experimental techniques for quantifying peptide binding to MHC-II. The traditional approach relies on measuring binding affinity, either as the dissociation constant (Kd) of the complex [12, 13] or as the IC50 (the concentration of the query peptide that displaces 50% of a bound reference peptide) [14]. To date, data repositories such as the Immune Epitope Database (IEDB) [15] have collected more than 150,000 measurements of peptide-MHC-II binding interactions. Over the last decades, such data have been used to develop several methods for predicting binding affinities to the different alleles of MHC class II. While the accuracy of these predictors has increased substantially, owing to the development of novel machine learning frameworks and the growing amount of peptide binding data available for training [16], state-of-the-art methods still fail to accurately predict MHC class II ligands and T cell epitopes [17, 18].
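As a brief illustration of how IC50 values are typically prepared for model training, the sketch below rescales them to the interval [0, 1] with the log transformation 1 - log(IC50)/log(50,000), a convention used by the NetMHC family of tools. That this exact transformation applies to the data discussed here is an assumption; the passage above does not state it, and the function names are hypothetical.

```python
import math

MAX_IC50_NM = 50_000.0  # assumed upper bound, in nM, for the rescaling


def ic50_to_score(ic50_nm, max_ic50=MAX_IC50_NM):
    """Map an IC50 in nM to [0, 1]; stronger binders get higher scores."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be positive")
    score = 1.0 - math.log(ic50_nm) / math.log(max_ic50)
    # Clip values outside the 1 nM to max_ic50 range.
    return min(1.0, max(0.0, score))


def score_to_ic50(score, max_ic50=MAX_IC50_NM):
    """Invert the transformation for scores inside [0, 1]."""
    return max_ic50 ** (1.0 - score)


if __name__ == "__main__":
    for ic50 in (1.0, 50.0, 500.0, 50_000.0):
        print(ic50, round(ic50_to_score(ic50), 3))
```

The log scale compresses the several orders of magnitude spanned by measured affinities into a range that suits a sigmoid output neuron.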

Because naturally eluted ligands incorporate information about properties of antigen presentation beyond what is obtained from in vitro binding affinity measurements, large MS-derived sets of peptides can be used to generate more accurate prediction models of MHC antigen presentation [20, 21, 25]. As shown recently, generic machine learning tools such as NNAlign [9, 29] can be readily applied to individual MS data sets, which in turn can be employed for further downstream analyses of the immunopeptidome [30]. The number of MHC molecules characterized by MS eluted ligand data is, however, still limited. This led us to propose a machine learning framework in which peptide binding data from both MS and in vitro binding assays are merged when training the prediction method [25]. This approach has proven highly powerful for MHC class I but has not, to the best of our knowledge, been applied to MHC class II.

HLA class II peptidome data were obtained from two recent MS studies. Three data sets correspond to HLA-DRB1*01:01 (DR1 Ph, DR1 Pm [26], and DR1 Sm [24]), two to DRB1*15:01 (DR15 Ph and DR15 Pm), and one to DRB5*01:01 (DR51 Ph); for details see Table 1. Here, data sets with the suffix h were obtained from human cell lines, and data sets with the suffix m were obtained from human MHC-II molecules transfected into MHC-II-deficient mouse cell lines. Details on how the data were generated are provided in the original publications. Note that the DR15 Ph and DR51 Ph data sets were obtained from a heterozygous EBV-transformed B lymphoblastoid cell line (BLCL), IHW09013 (also known as SCHU), which expresses two HLA-DR molecules, HLA-DRB1*15:01 and HLA-DRB5*01:01 (abbreviated here as DR15/51). The DR1 Ph data set was likewise extracted from a BLCL culture (IHW09004). The DR1 Pm, DR1 Sm, and DR15 Pm data sets, on the other hand, were extracted from HLA transgenic mice and therefore cover only the human alleles of interest. These cells are treated here as monoallelic.

Two types of model were trained: one with a single data type (eluted ligand or binding affinity) as input, and one with a mixed input of the two data types. Single-data-type models for each data set and allele were trained as previously described, with either binding affinity or eluted ligand data as input [30]. All models were built as an ensemble of 250 individual networks, generated from 10 different seeds; 2, 10, 20, 40, and 60 hidden neurons; and 5 partitions for cross-validation (10 × 5 × 5 = 250). Models were trained for 400 iterations without early stopping. Additional settings in the network architecture were used as previously described for MHC class II [30]. Combined models were trained as described earlier [25], with both binding affinity and eluted ligand data as input. Training was balanced so that, on average, the same number of data points of each data type (binding affinity or eluted ligand) is used in each training iteration.
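The ensemble layout and the balanced sampling described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the concrete seed values, the dictionary keys, and the function names are hypothetical, and balancing is approximated by picking the data type with equal probability for every example.

```python
import itertools
import random

SEEDS = range(1, 11)                  # 10 seeds; the actual values are assumed
HIDDEN_NEURONS = (2, 10, 20, 40, 60)  # hidden-layer sizes given in the text
N_PARTITIONS = 5                      # cross-validation partitions


def ensemble_configs():
    """Enumerate one configuration per network in the ensemble."""
    grid = itertools.product(SEEDS, HIDDEN_NEURONS, range(N_PARTITIONS))
    return [
        {"seed": seed, "hidden": hidden, "test_partition": part}
        for seed, hidden, part in grid
    ]


def balanced_stream(affinity_data, ligand_data, n_examples, rng):
    """Yield (data_type, example) pairs, drawing each data type with
    probability 0.5 regardless of the sizes of the two data sets."""
    for _ in range(n_examples):
        if rng.random() < 0.5:
            yield "affinity", rng.choice(affinity_data)
        else:
            yield "ligand", rng.choice(ligand_data)


if __name__ == "__main__":
    print(len(ensemble_configs()))  # 10 * 5 * 5 networks
    rng = random.Random(0)
    stream = balanced_stream(["a1", "a2"], ["l1", "l2", "l3"], 6, rng)
    print(list(stream))
```

Sampling the data type first, and the example second, keeps the larger data set from dominating the weight updates.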

Earlier work on MHC class I has demonstrated that the information contained within eluted ligand and peptide binding affinity data is, to some degree, complementary and that a prediction model can benefit from being trained integrating both data types [25]. Here, we investigate if a similar observation could be made for MHC class II. As proposed by Jurtz et al., we extended the NNAlign neural network model to handle peptides from both binding affinity and elution assays. In short, this is achieved by including an additional output neuron to the neural network prediction model allowing one prediction for each data type. In this setup, weights are shared between the input and hidden layer for the two input types (binding affinity and eluted ligand), whereas the weights connecting the hidden and output layer are specific for each input type. During neural network training, an example is randomly selected from either data set and submitted to forward and back propagation, according to the NNAlign algorithm. The weight sharing allows information to be transferred between the two data types and potentially results in a boost in predictive power (for more details on the algorithm, refer to [25]).
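The weight-sharing scheme described above can be illustrated with a toy network written in plain Python. This is a simplified sketch and not the NNAlign algorithm: it omits bias terms, peptide encoding, and binding-core alignment, and the class name, learning rate, and initialization range are assumptions made for the example. It shows only that the input-to-hidden weights are shared while each data type has its own hidden-to-output weights.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


class TwoOutputNet:
    """One shared hidden layer and one output neuron per data type."""

    def __init__(self, n_in, n_hidden, rng):
        def init(n):
            return [rng.uniform(-0.1, 0.1) for _ in range(n)]
        self.w_hidden = [init(n_in) for _ in range(n_hidden)]  # shared
        self.w_out = {"affinity": init(n_hidden), "ligand": init(n_hidden)}

    def forward(self, x):
        hidden = [sigmoid(sum(w * xi for w, xi in zip(row, x)))
                  for row in self.w_hidden]
        outputs = {kind: sigmoid(sum(w * h for w, h in zip(ws, hidden)))
                   for kind, ws in self.w_out.items()}
        return hidden, outputs

    def train_step(self, x, target, data_type, lr=0.05):
        """One gradient step on the squared error of a single example.

        Only the output neuron matching data_type is updated; the shared
        hidden weights are updated by every example.
        """
        hidden, outputs = self.forward(x)
        y = outputs[data_type]
        delta_out = (y - target) * y * (1.0 - y)
        # Hidden deltas are computed before the output weights change.
        delta_hidden = [delta_out * w * h * (1.0 - h)
                        for w, h in zip(self.w_out[data_type], hidden)]
        self.w_out[data_type] = [w - lr * delta_out * h
                                 for w, h in zip(self.w_out[data_type], hidden)]
        for j, row in enumerate(self.w_hidden):
            self.w_hidden[j] = [w - lr * delta_hidden[j] * xi
                                for w, xi in zip(row, x)]
        return 0.5 * (y - target) ** 2


if __name__ == "__main__":
    net = TwoOutputNet(n_in=3, n_hidden=4, rng=random.Random(0))
    print(net.train_step([1.0, 0.0, 1.0], 1.0, "ligand"))
```

Because every example updates the shared layer, what is learned from eluted ligands can influence the binding affinity output and vice versa.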

Models were trained and evaluated by fivefold cross-validation with the same model hyper-parameters that were used for the single data type models. When the performance of the single data type models (Table 2) is compared with that of the multiple data type models for the different data sets (Table 3), a consistent improvement in predictive performance is observed when the two data types are combined. This is particularly the case for the PPV performance values, where the combined model outperforms the single data type model in all cases. This is in line with what we have previously observed for MHC class I predictions [25].
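For readers unfamiliar with the PPV metric, the sketch below computes one common variant: the fraction of true ligands among the top N scoring peptides, where N is the number of true ligands in the evaluation set. That this is the exact definition behind Tables 2 and 3 is an assumption, since the passage above does not spell it out; the function name is hypothetical.

```python
def ppv_at_top_n(scores, labels):
    """Fraction of positives among the top-N predictions, with N equal
    to the number of positives (labels are 1 for ligands, 0 otherwise)."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    n_pos = sum(labels)
    if n_pos == 0:
        raise ValueError("at least one positive example is required")
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    return sum(label for _, label in ranked[:n_pos]) / n_pos


if __name__ == "__main__":
    # One of the two ligands is ranked inside the top two predictions.
    print(ppv_at_top_n([0.9, 0.8, 0.7, 0.1], [1, 0, 1, 0]))
```

Under this definition, a perfect ranking gives a PPV of 1.0, and a random ranking gives, on average, the fraction of ligands in the data set.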

Having developed improved models for prediction of MHC class II ligand binding, we next analyzed whether the models could be used to identify signals of antigen processing in the MS eluted ligand data sets. We hypothesized that information concerning antigen processing should be present in the regions around the N and C termini of the ligand. These regions comprise residues that flank the MHC binding core called peptide flanking regions (PFRs) and residues from the ligand source protein sequence located outside the ligand (see lower part of Fig. 4 for a schematic overview).
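The regions examined in such an analysis can be extracted with a short helper like the one below. It is a sketch under assumptions: the window size of three residues, the padding symbol X for positions beyond the protein ends, and all names are hypothetical choices for illustration and are not taken from the study.

```python
def processing_context(protein, start, length, flank=3, pad="X"):
    """Collect residues around the termini of a ligand.

    protein: source protein sequence
    start:   0-based position of the first ligand residue
    length:  ligand length
    Returns the source-protein residues just outside the ligand and the
    terminal residues of the ligand itself.
    """
    end = start + length
    if length <= 0 or start < 0 or end > len(protein):
        raise ValueError("ligand does not lie within the source protein")
    ligand = protein[start:end]
    upstream = protein[max(0, start - flank):start].rjust(flank, pad)
    downstream = protein[end:end + flank].ljust(flank, pad)
    return {
        "upstream": upstream,             # outside the ligand, N-terminal side
        "ligand_n_term": ligand[:flank],  # first residues of the ligand
        "ligand_c_term": ligand[-flank:], # last residues of the ligand
        "downstream": downstream,         # outside the ligand, C-terminal side
    }


if __name__ == "__main__":
    toy_protein = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"  # placeholder sequence
    print(processing_context(toy_protein, start=2, length=15))
```

Comparing amino acid frequencies in these windows against a background distribution is one way to look for a processing signal.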

Secondly, we demonstrated that high-accuracy prediction models for peptide-MHC-II interaction can be constructed from MS-derived MHC-II eluted ligand data, that the accuracy of these models can be improved by training on both binding affinity and eluted ligand data sets, and that the improved models can identify both eluted ligands and T cell epitopes in independent data sets at an unprecedented level of accuracy. This observation strongly suggests that eluted ligand data contain information about the MHC-peptide interaction that is not contained in in vitro binding affinity data. This notion is further supported by the subtle differences observed between the binding motifs derived from eluted ligand data and those derived from in vitro binding affinity data. Similar observations have been made for MHC class I [20, 25]. At this point, we have no evidence for the source of these differences, but a natural hypothesis is that they are imposed by molecular chaperones (such as HLA-DM), which are present when eluted ligands are generated in cells but absent from in vitro binding assays. An alternative explanation could be that the eluted peptide ligands reflect peptide-MHC class II stability rather than affinity, which would imply that stability is a better correlate of immunogenicity than affinity [54].