Cryptic antimicrobial peptides (AMPs) in the human gut: a potential case study

Amy Houseman, Célio Dias Santos-Junior, Luis Pedro Coelho

tl;dr Maybe there are a lot of cryptic peptides with antimicrobial properties, which, with a few mutations, can become independent genes.

Right now, we have some speculative results on a case study, a molecule we are calling HG4. Our current work aims at making this more robust and trying to figure out if it can be a general mechanism for AMP evolution.

Introduction I: what are AMPs?

Antimicrobial peptides (AMPs) are short molecules, usually 10-100 residues in length, that interfere with microbial cells. Because AMPs are so short, their identification usually relies not on homology, but on machine learning methods (Meher et al., 2017). Since their discovery, predominantly in the 1980s, AMPs have been shown to have a broad spectrum of activity: some AMPs are anti-bacterial (Jiravanichpaisal et al., 2007; Li et al., 2017), anti-fungal (Gupta and Srivastava, 2014; Do et al., 2014) or anti-cancer (Hilchie et al., 2011). Interestingly, some AMPs have also been shown to be effective against antibiotic-resistant bacteria (Lázár et al., 2018). Given the prominent antimicrobial-resistance (AMR) crisis (Chernov et al., 2019), AMPs could represent an alternative to conventional antibiotics.

In the Antimicrobial Peptide Database (APD3) (Wang et al., 2016), which contains known AMP sequences, only 12% of reported sequences belong to prokaryotes, suggesting that prokaryotic AMPs are still under-explored.

AMPs are a diverse group of molecules, but they are commonly cationic, meaning that they carry a net positive charge, and they usually have organised faces with hydrophobic and hydrophilic regions. These regions enable AMPs to interact with microbial membranes (Cunha et al., 2017).
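As a rough illustration of what "cationic" means in practice, the sketch below estimates the net charge of a peptide at neutral pH by counting charged residues. It is a deliberate simplification (histidine and the terminal groups are ignored), and the example peptide is a hypothetical toy sequence, not HG4.

```python
def net_charge(seq: str) -> int:
    """Crude net charge at neutral pH: +1 per K/R, -1 per D/E.

    Simplifying assumption: histidine and the N-/C-terminal groups are ignored.
    """
    seq = seq.upper()
    positive = sum(seq.count(aa) for aa in "KR")
    negative = sum(seq.count(aa) for aa in "DE")
    return positive - negative


def hydrophobic_fraction(seq: str, hydrophobic: str = "AILMFVW") -> float:
    """Fraction of residues in a (simplified) hydrophobic set."""
    seq = seq.upper()
    return sum(aa in hydrophobic for aa in seq) / len(seq)


# Hypothetical toy peptide, for illustration only.
toy_peptide = "GKLLKKLAEKLLKK"
print(net_charge(toy_peptide))  # 6 lysines - 1 glutamate = 5
```

A positive net charge together with a sizeable hydrophobic fraction is the kind of signal that AMP predictors tend to pick up on, although real tools use many more features than these two.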

Introduction II: what are cryptic AMPs?

A cryptic AMP is an AMP that is embedded in a larger protein. One well-documented way in which cryptic AMPs can be activated is with the use of naturally occurring proteases, which cleave the parent protein, releasing the AMP.

One example is lactoferricin, which is released from its parent protein, lactoferrin. Lactoferrin is typically found in human and cow milk, although it has also been found in saliva and other bodily fluids. Lactoferricin, a cryptic peptide, was discovered in 1992 through the pepsin cleavage of lactoferrin in the human gut (Bellamy et al., 1992). Gifford et al. (2005) highlighted lactoferricin’s potential antibacterial, antiviral and antitumor activities.

Cryptic AMPs can also be formed by the activation of protein precursors. This activation can be triggered by a variety of situations, such as external stress conditions. For example, in plants, phytohormone treatment induced cryptic AMPs in moss (Fesenko et al., 2019).

Searching for AMPs in human gut metagenome data led us to a potential cryptic peptide that became an independent gene.

HG4: Probably an AMP

We used Macrel v0.4 (Santos-Junior et al., 2020) to predict AMPs in 184 human gut metagenomes (Heintz-Buschart et al., 2016).

(See the Macrel preprint for more details on Macrel and the full results on these metagenomes.)

HG4 is a putative AMP found in three metagenomes from the same individual. Using homology, we found the HG4 sequence within a protein from Prevotella melaninogenica. We aligned the contig containing the HG4 sequence with the P. melaninogenica genome and discovered that the portion containing the HG4 AMP was disrupted. Specifically, a nucleotide was deleted upstream of the HG4 peptide start codon; this deletion disrupted the amino terminus and created a ribosome binding site with an 11 bp spacer (see Fig. 1).

Further downstream, a point mutation generated a stop codon, thus forming a complete ORF that can be transcribed from the original gene promoter.

Figure 1. Diagram of the mutations that result in the formation of the AMP HG4. A point deletion (Vdel) led to the creation of a ribosome binding site. Further downstream, at the 25th residue, a second mutation, a point mutation (G -> A), results in a stop codon.
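To make the effect of the two kinds of mutation concrete, here is a minimal sketch using the standard genetic code. The DNA sequence is a hypothetical toy example (it is not the HG4 locus); it only illustrates how a single G -> A substitution can turn a tryptophan codon into a stop codon, and how a single-nucleotide deletion changes the sequence read downstream of it.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna: str) -> str:
    """Translate from position 0 until the first stop codon (or the end)."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def point_mutation(dna: str, pos: int, new_base: str) -> str:
    """Substitute the base at 0-based position `pos`."""
    return dna[:pos] + new_base + dna[pos + 1:]


def deletion(dna: str, pos: int) -> str:
    """Delete the base at 0-based position `pos`."""
    return dna[:pos] + dna[pos + 1:]


# Hypothetical toy ORF: ATG AAA TGG CTG AAA
toy = "ATGAAATGGCTGAAA"
print(translate(toy))                          # MKWLK
print(translate(point_mutation(toy, 8, "A")))  # TGG -> TGA (stop): MK
print(translate(deletion(toy, 3)))             # frame shifted downstream: MNG
```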

A few further tests strengthened our prediction that HG4 has antimicrobial properties. As seen in Figure 2, the region containing the AMP (first 100 residues) shows a transition from hydrophobic to hydrophilic regions. This is a typical amphiphilic profile of AMPs (Kumar et al., 2018).

Figure 2. Hydrophobicity profile of the reconstituted protein containing HG4. With the majority of points having a score above 0, we can say that this sequence is amphiphilic and matches that of a typical AMP.
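A hydrophobicity profile of this kind can be computed with a sliding window over a per-residue scale. The sketch below assumes the Kyte-Doolittle scale and a 9-residue window; both are assumptions made for illustration, since the scale and window used for Figure 2 are not stated here.

```python
# Kyte-Doolittle hydropathy values (positive = hydrophobic).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(seq: str, window: int = 9) -> list[float]:
    """Mean hydropathy of every window of `window` consecutive residues."""
    scores = [KD[aa] for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]


# Hypothetical toy sequence: a hydrophobic stretch followed by a polar one.
profile = hydropathy_profile("LLLKKK", window=3)
print([round(x, 2) for x in profile])  # [3.8, 1.23, -1.33, -3.9]
```

Plotting the returned values against the window start position gives a curve comparable in spirit to Figure 2; a transition from positive to negative values corresponds to a hydrophobic-to-hydrophilic transition.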

AMPs usually have alternating sections of polar and hydrophobic residues, which is essential for them to facilitate membrane movement (Dennison et al., 2005). HG4 has polar residues mostly arranged along one side of the helical wheel presented in Figure 3. For AMPs to interact with their target microbes, they require an electrostatic charge. An electrostatic attraction exists between the negative charge of a bacterial membrane, and the positively charged AMP. The positive charge in the AMP is responsible for the formation of pores. These pores lead to cytoplasm leakage in the target cell, causing the cell’s death (Lei et al., 2019).

Figure 3. Helical wheel of HG4, showing polar residues (green circles) on the left-hand side.
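One common way to put a number on what a helical wheel shows is the hydrophobic moment: each residue is placed 100 degrees further around the helix axis than the previous one, and the hydropathy values are summed as vectors. The sketch below is an illustration under assumptions, not the analysis behind Figure 3: it uses the Kyte-Doolittle scale (Eisenberg's original formulation uses a different consensus scale) and hypothetical toy sequences.

```python
import math

# Kyte-Doolittle hydropathy values (an assumption; other scales are common).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_moment(seq: str, angle_deg: float = 100.0) -> float:
    """Mean hydrophobic moment, assuming an ideal alpha-helix (100 degrees per residue)."""
    angle = math.radians(angle_deg)
    seq = seq.upper()
    sin_sum = sum(KD[aa] * math.sin(i * angle) for i, aa in enumerate(seq))
    cos_sum = sum(KD[aa] * math.cos(i * angle) for i, aa in enumerate(seq))
    return math.hypot(sin_sum, cos_sum) / len(seq)


# Hypothetical toy sequences.
uniform = "L" * 18                # same residue all around the wheel
amphiphilic = "LKKL" * 4 + "LK"   # polar and hydrophobic residues segregate

print(hydrophobic_moment(uniform) < hydrophobic_moment(amphiphilic))  # True
```

A sequence whose polar residues cluster on one face of the wheel has a large moment, while a sequence with residues of similar hydropathy spread evenly around the wheel has a moment close to zero.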

Can HG4 originate from a cryptic peptide?

One hypothesis would be that wild-type P. melaninogenica produces the whole protein, which is then degraded into the HG4 peptide as it passes through the human digestive tract. Figure 4 shows the protease cleavage sites predicted with PeptideCutter (Artimo et al., 2012) for the protein containing the HG4 sequence. This suggests that general proteases from the human intestine would be able to cleave the complete protein, releasing HG4 by proteolysis.
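The idea of an in silico digest can be sketched with the textbook rule for trypsin: cleave after lysine (K) or arginine (R), unless the next residue is proline (P). This is a simplification of the rules implemented in PeptideCutter, and the sequence below is a hypothetical toy example rather than the reconstituted protein.

```python
def trypsin_sites(seq: str) -> list[int]:
    """1-based positions after which trypsin is predicted to cleave.

    Simplified rule: after K or R, except when followed by P.
    """
    seq = seq.upper()
    return [
        i + 1
        for i, aa in enumerate(seq[:-1])
        if aa in "KR" and seq[i + 1] != "P"
    ]


def digest(seq: str) -> list[str]:
    """Fragments produced by cutting at every predicted site."""
    fragments, start = [], 0
    for site in trypsin_sites(seq):
        fragments.append(seq[start:site])
        start = site
    fragments.append(seq[start:])
    return fragments


# Hypothetical toy protein: the R at position 6 is followed by P, so it is skipped.
toy_protein = "MKTAYRPLLKGG"
print(trypsin_sites(toy_protein))  # [2, 10]
print(digest(toy_protein))         # ['MK', 'TAYRPLLK', 'GG']
```

A cryptic peptide can be released intact by a protease only if predicted sites flank it and none fall inside it.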

The reconstituted protein belongs to a specific family of oxidoreductases. It has a signal peptide, which was detected using SignalP 4.0 (Nielsen, 2017). Signal peptides in prokaryotic proteins are generally used for secretion, so our reconstituted protein is probably secreted. Oxidoreductases are dependent on oxygen, which is depleted in the human gut, so a disruption of this gene would be less harmful for a gut microbe such as Prevotella sp. Furthermore, its secretion also makes the protein available for proteolysis.

Figure 4. Predicted cleavage sites in the reconstituted protein containing the peptide HG4 (red dashed box) for proteases from the human intestinal environment. Trypsin and high-specificity chymotrypsin are both able to cleave the protein at positions that favour the release of HG4.

HG4 also had the longest predicted half-life among all the possible peptides (generated in silico) formed from the reconstituted protein. Half-life was calculated for an intestine-like environment because proteases present in the intestine have the ability to release HG4. Using the HLP server (Sharma et al., 2014), we predicted that this peptide would be stable in the human gut for at least 3 seconds, a relatively long half-life (Figure 5).

Thus, if the reconstituted protein is present in the human gut for a prolonged period of time and proteases release HG4 at specific sites, this AMP would have enough time to interact with its target.

Figure 5. Half-life (seconds) plotted against peptide positions of the reconstituted HG4 gene sliced into 25-mers. HG4 has a predicted half-life of 3 seconds. The peptides at all other positions have a shorter half-life than HG4 and are therefore degraded faster in an intestine-like environment.

Figure 6. Decay rate of each peptidyl position of the reconstituted protein. HG4 is in the position labelled with a red circle and has the lowest decay in comparison to all other peptide positions. This means it remains in its original state for a period of time longer than the other positions on the peptide.
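The slicing step behind Figures 5 and 6 can be sketched as a simple window over the protein sequence. Whether the 25-mers overlap is not stated above, so the step size is left as a parameter (an assumption), and the half-life prediction itself (done with the HLP server) is not reproduced here.

```python
def slice_kmers(seq: str, k: int = 25, step: int = 1) -> list[tuple[int, str]]:
    """Return (1-based start position, k-mer) pairs.

    step=1 gives fully overlapping windows; step=k gives non-overlapping ones.
    """
    return [
        (i + 1, seq[i:i + k])
        for i in range(0, len(seq) - k + 1, step)
    ]


# Hypothetical toy sequence of 30 residues.
toy = "A" * 30
print(len(slice_kmers(toy)))           # 6 overlapping 25-mers
print(len(slice_kmers(toy, step=25)))  # 1 non-overlapping 25-mer
```

Each k-mer would then be scored by a half-life predictor, and the scores plotted against the start positions.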

We also searched for the HG4 sequence in the Global Microbial Gene Catalogue (version 1) and found hits in two saliva samples from the same donor. These hits were also predicted to be from P. melaninogenica. Given this species’ niche (it is known to cause tooth cavities), this suggests that human proteases could be acting in both the mouth and the gut to generate the HG4 AMP.

What is the target of HG4?

The prediction tool available in the DBAASP database (Pirtskhalava et al., 2016) predicts that HG4 is active against Klebsiella pneumoniae (0.85 PPV). Therefore, we searched the Global Microbial Gene Catalogue data to see whether there was any correlation between the abundance of genes coding for the cryptic HG4 and the abundance of K. pneumoniae.

The Spearman correlation is not statistically significant (correlation=0.14, p-value=0.09). However, Fisher’s exact test on presence/absence between HG4 and K. pneumoniae is statistically significant (p-value=1.155·10⁻⁶). This suggests that there is a non-random association between the AMP and its potential target, but it’s a noisy one.
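For readers unfamiliar with the presence/absence test, the sketch below implements a two-sided Fisher's exact test on a 2x2 table using only the standard library. The counts in the example are hypothetical and chosen only to show the mechanics; they are not the counts behind the p-value reported above.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same
    margins that are no more likely than the observed one.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x: int) -> float:
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_obs = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_value = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_obs * (1 + 1e-9)  # tolerance for float ties
    )
    return min(1.0, p_value)


# Hypothetical layout of the table (samples counted by presence/absence):
#                      K. pneumoniae present   K. pneumoniae absent
# HG4 present                    a                      b
# HG4 absent                     c                      d
print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))  # 0.4857
```

Unlike the Spearman correlation, this test ignores abundance values and only asks whether the two features co-occur in samples more (or less) often than expected by chance.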

Insights and valuable ideas from HG4

The model we have put forward is that HG4 is a cryptic peptide embedded in a P. melaninogenica protein. This protein is secreted into the environment and, in the human gut, can be cleaved to release HG4, which can potentially interact with K. pneumoniae. However, with a few mutations, the peptide stops being cryptic and is now transcribed and translated on its own.

Figure 7. Purple boxes represent the process by which proteases are able to cleave HG4 from a protein found in P. melaninogenica. Green boxes represent the process by which two mutations occur within the protein sequence from P. melaninogenica, leading to the formation of HG4.

For now, we have an isolated case study, based on a series of speculative steps, but it is interesting to ask whether it can be representative of a more widespread phenomenon. In particular, we are now asking how often similar events happen across the microbial world. Hopefully, we will be able to give you an update in the not so distant future with stronger evidence.


References

Artimo, P., Jonnalagedda, M., Arnold, K., Baratin, D., Csardi, G., De Castro, E., Duvaud, S., Flegel, V., Fortier, A., Gasteiger, E. and Grosdidier, A., 2012. ExPASy: SIB bioinformatics resource portal. Nucleic acids research, 40(W1), pp.W597-W603. DOI: 10.1093/nar/gks400.

Bellamy, W., Takase, M., Yamauchi, K., Wakabayashi, H., Kawase, K. and Tomita, M., 1992. Identification of the bactericidal domain of lactoferrin. Biochimica et Biophysica Acta (BBA)-Protein Structure and Molecular Enzymology, 1121(1-2), pp.130-136. DOI: 10.1016/0167-4838(92)90346-f.

Cunha, N.B., Cobacho, N.B., Viana, J.F., Lima, L.A., Sampaio, K.B., Dohms, S.S., Ferreira, A.C., de la Fuente-Núñez, C., Costa, F.F., Franco, O.L. and Dias, S.C., 2017. The next generation of antimicrobial peptides (AMPs) as molecular therapeutic tools for the treatment of diseases with social and economic impacts. Drug discovery today, 22(2), pp.234-248. DOI: 10.1016/j.drudis.2016.10.017.

Dennison, S.R., Wallace, J., Harris, F. and Phoenix, D.A., 2005. Amphiphilic α-helical antimicrobial peptides and their structure/function relationships. Protein and peptide letters, 12(1), pp.31-39. DOI: 10.2174/0929866053406084.

Dias Santos-Junior, C., Pan, S., Zhao, X.M. and Coelho, L.P. MACREL: antimicrobial peptide screening in genomes and metagenomes. bioRxiv, 2019.12.17.880385. DOI: 10.1101/2019.12.17.880385.

Do, N., Weindl, G., Grohmann, L., Salwiczek, M., Koksch, B., Korting, H.C. and Schäfer‐Korting, M., 2014. Cationic membrane‐active peptides–anticancer and antifungal activity as well as penetration into human skin. Experimental dermatology, 23(5), pp.326-331. DOI: 10.1111/exd.12384.

Fesenko, I., Azarkina, R., Kirov, I., Kniazev, A., Filippova, A., Grafskaia, E., Lazarev, V., Zgoda, V., Butenko, I., Bukato, O. and Lyapina, I., 2019. Phytohormone treatment induces generation of cryptic peptides with antimicrobial activity in the Moss Physcomitrella patens. BMC plant biology, 19(1), pp.1-16. DOI: 10.1186/s12870-018-1611-z.

Gifford, J.L., Hunter, H.N. and Vogel, H.J., 2005. Lactoferricin. Cellular and molecular life sciences, 62(22), p.2588. DOI: 10.1007/s00018-005-5373-z.

Gupta, R. and Srivastava, S., 2014. Antifungal effect of antimicrobial peptides (AMPs LR14) derived from Lactobacillus plantarum strain LR/14 and their applications in prevention of grain spoilage. Food microbiology, 42, pp.1-7. DOI: 10.1016/

Heintz-Buschart, A., May, P., Laczny, C.C., Lebrun, L.A., Bellora, C., Krishna, A., Wampach, L., Schneider, J.G., Hogan, A., De Beaufort, C. and Wilmes, P., 2016. Integrated multi-omics of the human gut microbiome in a case study of familial type 1 diabetes. Nature microbiology, 2(1), pp.1-13. DOI: 10.1038/nmicrobiol.2016.180.

Hilchie, A.L., Doucette, C.D., Pinto, D.M., Patrzykat, A., Douglas, S. and Hoskin, D.W., 2011. Pleurocidin-family cationic antimicrobial peptides are cytolytic for breast carcinoma cells and prevent growth of tumor xenografts. Breast cancer research, 13(5), p.R102. DOI: 10.1186/bcr3043.

Huerta-Cepas, J., Szklarczyk, D., Heller, D., Hernández-Plaza, A., Forslund, S.K., Cook, H., Mende, D.R., Letunic, I., Rattei, T., Jensen, L.J. and von Mering, C., 2019. eggNOG 5.0: a hierarchical, functionally and phylogenetically annotated orthology resource based on 5090 organisms and 2502 viruses. Nucleic acids research, 47(D1), pp.D309-D314. DOI: 10.1093/nar/gky1085.

Jiravanichpaisal, P., Lee, S.Y., Kim, Y.A., Andrén, T. and Söderhäll, I., 2007. Antibacterial peptides in hemocytes and hematopoietic tissue from freshwater crayfish Pacifastacus leniusculus: characterization and expression pattern. Developmental & Comparative Immunology, 31(5), pp.441-455. DOI: 10.1016/j.dci.2006.08.002.

Kumar, P., Kizhakkedathu, J.N. and Straus, S.K., 2018. Antimicrobial peptides: Diversity, mechanism of action and strategies to improve the activity and biocompatibility in vivo. Biomolecules, 8(1), p.4. DOI: 10.3390/biom8010004.

Lázár, V., Martins, A., Spohn, R., Daruka, L., Grézal, G., Fekete, G., Számel, M., Jangir, P.K., Kintses, B., Csörgő, B. and Nyerges, Á., 2018. Antibiotic-resistant bacteria show widespread collateral sensitivity to antimicrobial peptides. Nature microbiology, 3(6), pp.718-731. DOI: 10.1038/s41564-018-0164-0.

Lei, J., Sun, L., Huang, S., Zhu, C., Li, P., He, J., Mackey, V., Coy, D.H. and He, Q., 2019. The antimicrobial peptides and their potential clinical applications. American journal of translational research, 11(7), p.3919. PMCID: PMC6684887.

Li, N.N., Li, J.Z., Liu, P., Pranantyo, D., Luo, L., Chen, J.C., Kang, E.T., Hu, X.F., Li, C.M. and Xu, L.Q., 2017. An antimicrobial peptide with an aggregation-induced emission (AIE) luminogen for studying bacterial membrane interactions and antibacterial actions. Chemical Communications, 53(23), pp.3315-3318. DOI: 10.1039/c6cc09408b.

Meher, P.K., Sahu, T.K., Saini, V. and Rao, A.R., 2017. Predicting antimicrobial peptides with improved accuracy by incorporating the compositional, physico-chemical and structural features into Chou’s general PseAAC. Scientific reports, 7(1), pp.1-12. DOI: 10.1038/srep42362.

Nielsen, H., 2017. Predicting secretory proteins with SignalP. In Protein function prediction (pp. 59-73). Humana Press, New York, NY. DOI: 10.1007/978-1-4939-7015-5_6.

Pirtskhalava, M., Gabrielian, A., Cruz, P., Griggs, H.L., Squires, R.B., Hurt, D.E., Grigolava, M., Chubinidze, M., Gogoladze, G., Vishnepolsky, B., Alekseev, V., Rosenthal, A. and Tartakovsky, M., 2016. DBAASP v.2: an enhanced database of structure and antimicrobial/cytotoxic activity of natural and synthetic peptides. Nucleic acids research, 44(D1), pp.D1104-D1112. DOI: 10.1093/nar/gkv1174.

Sharma, A., Singla, D., Rashid, M. and Raghava, G.P.S., 2014. Designing of peptides with desired half-life in intestine-like environment. BMC bioinformatics, 15, p.282. DOI: 10.1186/1471-2105-15-282.

Chernov, Vladislav M., Chernova, Olga A., Mouzykantov, Alexey A., Lopukhov, Leonid L. and Aminov, Rustam I., 2019. Omics of antimicrobials and antimicrobial resistance. Expert opinion on drug discovery, 14(5), pp.455-468. DOI: 10.1080/17460441.2019.1588880.

Wang, G., Li, X. and Wang, Z., 2016. APD3: the antimicrobial peptide database as a tool for research and education. Nucleic acids research, 44(D1), pp.D1087-D1093. DOI: 10.1093/nar/gkv1278.