分析测试百科网

A pipeline for the identification of intact N-glycopeptides (Part 4)

2020.5.18

王辉


Interpretation of MS3 data. MS3 data were converted to “.ms3” format by pXtract within pFind Studio (version 2.8) [26,27] and then analyzed with pFind 2.8. The protein database consisted of the six standard glycoproteins mixed with 500 protein sequences randomly selected from SwissProt (v12.05, Homo sapiens) as background. A concatenated forward-reverse database search was performed to estimate the peptide FDR. In each N-glycosylation sequon (N-X-S/T/C, X ≠ P), “N” was replaced with “J” (the two share the same mass), as applied previously [28,29]. The enzyme was semi-trypsin, with at most 2 missed cleavages. The fixed modification was carbamidomethylation of all Cys residues (C + 57.022 Da). The variable modifications were oxidation of Met (M + 15.995 Da) and HexNAc on the N-glycosylation sequon (J + 203.079 Da). The precursor mass tolerance for MS3 spectra was set to ± 3 Da. Because MS3 acquisition was performed in the low-resolution ion trap, the fragment-ion mass tolerance was set to ± 0.5 Da. Neutral loss of the HexNAc (203.079 Da) was considered in pFind. Only peptide-spectrum matches (PSMs) carrying HexNAc on “J” were kept as valid candidates, each being a potential identity of the Y1 ion. The PSM FDR was set to 1% for these candidate Y1 PSMs.
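The sequon relabeling and the final PSM filter described above can be sketched in a few lines of Python. This is only an illustrative sketch, not pFind's implementation: the function names `mark_sequons` and `has_hexnac_on_j`, and the representation of modifications as a `{position: name}` dictionary, are assumptions made for the example.

```python
import re

# Sequon N-X-S/T/C with X != P. The lookahead leaves X and S/T/C unconsumed,
# so overlapping sequons (e.g. "NNTS") are all relabeled.
SEQUON = re.compile(r"N(?=[^P][STC])")


def mark_sequons(protein: str) -> str:
    """Return the sequence with every sequon asparagine rewritten as 'J'."""
    return SEQUON.sub("J", protein)


def has_hexnac_on_j(peptide: str, mods: dict) -> bool:
    """True if a PSM carries HexNAc on at least one 'J'.

    `mods` maps 0-based residue positions to modification names
    (a hypothetical representation chosen for this sketch).
    """
    return any(
        name == "HexNAc" and peptide[pos] == "J" for pos, name in mods.items()
    )


if __name__ == "__main__":
    print(mark_sequons("ANGSK"))  # sequon N-G-S is relabeled
    print(mark_sequons("ANPSK"))  # X = P, so N is left unchanged
    print(has_hexnac_on_j("AJGSK", {1: "HexNAc"}))
```

Because “J” has the same mass as “N”, the relabeling does not change any theoretical peptide mass; it only lets the search engine restrict the HexNAc modification to sequon positions.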

 

FDR analysis for glycan identification. After identification of an HCD- and CID-MS/MS spectrum pair, its top-ranked glycan, together with the deduced peptide backbone mass, was output as the glycopeptide candidate, but the error rate was unknown. FDR estimation for glycan identification has puzzled researchers for a long time [30]. In pGlyco, we developed a novel decoy method coupled with a finite mixture model to give an approximate solution to this problem. For peptide identification, the target-decoy strategy has become a routine method to estimate the FDR, using a sequence-based decoy database generated from the target database. However, it is difficult to design a corresponding tree-based target-decoy strategy for glycan identification, so a new decoy method was needed.

In pGlyco, we investigated a spectrum-based decoy method to estimate the glycan FDR. In the sequence-based decoy method for peptide identification, each decoy peptide is theoretically fragmented into a decoy spectrum, which is then matched against the experimental spectrum to obtain a decoy match. Although a tree-based decoy of a glycan structure is difficult to generate, it is easy to generate a theoretical decoy spectrum directly and match it against the experimental glycopeptide spectrum. We named this approach the spectrum-based decoy method. When searching an experimental spectrum, the theoretical target glycopeptide spectrum was generated after the masses of the Y ions had been deduced from the putative peptide backbone mass. We then added a random mass in the range 1–30 Da to the mass of each deduced Y ion to generate a theoretical decoy spectrum. The theoretical target and decoy spectra were competitively matched against the experimental spectrum. The key assumption of the target-decoy method is that “the number of incorrect identifications from target or decoy sequences are equally likely” [31]. However, this assumption might not always hold for the spectrum-based decoy method, so the FDR estimate might be biased. To adjust for this possible bias, a finite mixture model (FMM) was employed, which had previously been used for peptide identification [32]. The method was validated against the conventional sequence-based target-decoy approach for peptide identification on public datasets and performed well (see the section “The spectrum-based decoy method and the FMM” in the Supporting Information).
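A minimal Python sketch of the spectrum-based decoy idea follows. It is a toy illustration under several assumptions that are not taken from the text: the random offset is drawn from a continuous uniform distribution, the score is simply the number of matched Y ions within a ppm tolerance, ties go to the target, and the FDR is the plain decoy-to-target ratio without the FMM correction. All function names are hypothetical.

```python
import random
from bisect import bisect_left


def decoy_y_ions(target_masses, rng, low=1.0, high=30.0):
    """Shift each theoretical Y-ion mass by its own random offset in [low, high] Da."""
    return [m + rng.uniform(low, high) for m in target_masses]


def count_matches(theoretical, observed_sorted, tol_ppm=20.0):
    """Count theoretical masses that have an observed peak within +/- tol_ppm."""
    hits = 0
    for m in theoretical:
        tol = m * tol_ppm * 1e-6
        i = bisect_left(observed_sorted, m - tol)  # first peak >= lower bound
        if i < len(observed_sorted) and observed_sorted[i] <= m + tol:
            hits += 1
    return hits


def competitive_match(target_masses, observed, rng):
    """Return ("target" or "decoy", score); the target wins ties in this toy rule."""
    observed_sorted = sorted(observed)
    t = count_matches(target_masses, observed_sorted)
    d = count_matches(decoy_y_ions(target_masses, rng), observed_sorted)
    return ("target", t) if t >= d else ("decoy", d)


def naive_fdr(matches, threshold):
    """Uncorrected target-decoy estimate: #decoy / #target at or above threshold."""
    targets = sum(1 for lab, s in matches if lab == "target" and s >= threshold)
    decoys = sum(1 for lab, s in matches if lab == "decoy" and s >= threshold)
    return decoys / targets if targets else 0.0


if __name__ == "__main__":
    rng = random.Random(0)
    y_ions = [1611.898, 1815.0, 2000.0]  # arbitrary example masses
    print(competitive_match(y_ions, y_ions, rng))
```

The sketch stops at the uncorrected ratio on purpose: the FMM step described above exists precisely because shifted spectra may not behave like incorrect target matches, which would bias this naive estimate.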

 


 

Figure 2. An example of analyzing an HCD- and CID-MS/MS spectrum pair of the glycan (6, 5, 1, 0, 1).

(a) An example of trimannosyl core ions matched against the HCD-MS/MS spectrum. The deduced peptide backbone mass is 1408.819 Da. In this spectrum, the Y1 ion is the most intense ion in the range above m/z 700. (b) The sister CID-MS/MS spectrum of the HCD-MS/MS spectrum in (a). By combining the HCD- and CID-MS/MS spectrum analyses, the glycan is confidently identified. The fragment-ion tolerance for both HCD- and CID-MS/MS is ± 20 ppm.

 

After applying the spectrum-based decoy method and the FMM, the score distributions of correct and incorrect matches were drawn, and the FDR could be estimated (see Fig. S-2 in the Supporting Information). Before glycan FDR filtering, pGlyco obtained 2,704 glycan-spectrum matches (GSMs); at 1% glycan FDR, 1,720 GSMs were identified. Each GSM was represented by a glycan identified from an HCD/CID-MS/MS spectrum pair, together with the deduced mass of the peptide backbone, but the sequence of each peptide backbone remained to be identified. An example of analyzing the HCD- and CID-MS/MS spectrum pair for the glycan (6, 5, 1, 0, 1) is illustrated in Fig. 2. (In this manuscript, a glycan composition is represented by a vector of the form (#Hex, #HexNAc, #NeuAc, #NeuGc, #dHex); the vector (6, 5, 1, 0, 1) denotes the glycan with composition Hex6HexNAc5NeuAc1dHex1.) In Fig. 2a, highly intense trimannosyl core ions were matched in the HCD-MS/MS spectrum. With the identified glycan (6, 5, 1, 0, 1), the peptide backbone mass was deduced as 1408.819 Da. The matched ions in the HCD-MS/MS spectrum in Fig. 2a show that the relative intensity of the Y1 ion was high enough to trigger data-dependent acquisition of MS3. Figure 2b shows the annotation of a CID-MS/MS spectrum (the sister spectrum of the HCD-MS/MS spectrum in Fig. 2a). Many Y ions were matched, and most of the matched fragment mass errors were between −10 and +10 ppm, resulting in a highly confident identification. The most probable glycan structure from our glycan database is also drawn in Fig. 2a.
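The composition vector and the backbone-mass deduction can be made concrete with a short Python sketch. It assumes standard monoisotopic residue masses for the five monosaccharides and treats the peptide backbone mass as the neutral glycopeptide mass minus the summed glycan residue masses; the function names are hypothetical and this is not pGlyco's code.

```python
# Monoisotopic residue masses (Da), in the vector order used in the text:
# (#Hex, #HexNAc, #NeuAc, #NeuGc, #dHex).
RESIDUES = (
    ("Hex", 162.05282),
    ("HexNAc", 203.07937),
    ("NeuAc", 291.09542),
    ("NeuGc", 307.09033),
    ("dHex", 146.05791),
)


def composition_name(vec):
    """Format a composition vector, skipping monosaccharides with count 0."""
    return "".join(f"{name}{n}" for (name, _), n in zip(RESIDUES, vec) if n > 0)


def glycan_mass(vec):
    """Sum of residue masses, i.e. the mass the glycan adds to the peptide."""
    return sum(mass * n for (_, mass), n in zip(RESIDUES, vec))


def backbone_mass(glycopeptide_mass, vec):
    """Deduce the peptide backbone mass from a neutral glycopeptide mass."""
    return glycopeptide_mass - glycan_mass(vec)


def ppm_error(observed, theoretical):
    """Signed mass error in parts per million."""
    return (observed - theoretical) / theoretical * 1e6


if __name__ == "__main__":
    glycan = (6, 5, 1, 0, 1)
    print(composition_name(glycan))       # Hex6HexNAc5NeuAc1dHex1
    print(round(glycan_mass(glycan), 3))  # 2424.867
```

With the backbone mass of 1408.819 Da from Fig. 2a, the neutral Y1 mass (peptide plus one HexNAc) would be 1408.819 + 203.079 = 1611.898 Da, which is the precursor that the MS3 search then tries to identify.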

