tl;dr Maybe there are a lot of cryptic peptides with antimicrobial
properties, which, with a few mutations, can become independent genes.
Right now, we have some speculative results on a case study, a molecule we are
calling HG4. Our current work aims at making this more robust and trying to
figure out if it can be a general mechanism for AMP evolution.
Introduction I: what are AMPs?
Antimicrobial Peptides are short molecules, usually 10-100 residues in length,
that interfere with microbial cells. Due to the reduced size of AMPs, their
identification usually relies not on homology, but on machine learning methods
(Meher et al., 2017). Since their discovery, largely in the 1980s, AMPs
have been shown to have a broad spectrum of activity: some AMPs are
antibacterial (Jiravanichpaisal et al., 2007; Li et al., 2017), antifungal
(Gupta and Srivastava, 2014; Do et al., 2014) or anticancer (Hilchie et al.,
2011). Interestingly, some AMPs have also been shown to be effective against
antibiotic-resistant bacteria (Lázár et al., 2018). With the ongoing
antimicrobial resistance (AMR) crisis (Chernov et al., 2019), AMPs could
represent an alternative to conventional antibiotics.
In the Antimicrobial Peptide Database (APD3; Wang et al., 2016), which
contains known AMP sequences, only 12% of reported sequences belong to
prokaryotes, suggesting that prokaryotic AMPs are still under-explored.
AMPs are a diverse group of molecules. They are commonly cationic, meaning
they carry a net positive charge, and they usually have distinct hydrophobic
and hydrophilic faces. These features enable AMPs to interact with microbial
membranes (Cunha et al., 2017).
Introduction II: what are cryptic AMPs?
A cryptic AMP is an AMP that is embedded in a larger protein. One
well-documented way in which cryptic AMPs can be activated is with the use of
naturally occurring proteases, which cleave the parent protein, releasing the
AMP.
One example is lactoferricin, which is released from its parent protein,
lactoferrin. Lactoferrin is typically found in human and cow milk, although it
has also been found in saliva and other bodily fluids. Lactoferricin, a cryptic
peptide, was discovered in 1992 through the pepsin cleavage of lactoferrin in
the gut of humans (Bellamy et al., 1992). Gifford et al. (2005) highlighted
lactoferricin's potential antibacterial, antiviral and antitumor activities.
Cryptic AMPs can also be formed by the activation of protein precursors. This
activation can be triggered by a variety of situations, notably external stress
conditions. For example, phytohormone treatment induced the generation of
cryptic AMPs in the moss Physcomitrella patens (Fesenko et al., 2019).
Searching for AMPs in human gut metagenome data led us to find a potential
cryptic peptide which became an independent gene.
HG4: Probably an AMP
We used Macrel v0.4 (Santos-Junior et al., 2020) to predict AMPs in 184 human
gut metagenomes (Heintz-Buschart et al., 2016).
(See the Macrel preprint for more details on Macrel and the full results on
these metagenomes.)
HG4 was a putative AMP found in three metagenomes from the same individual.
Using homology, we found the HG4 sequence within a protein from Prevotella
melaninogenica. We aligned the contig containing HG4 sequence with the
P. melaninogenica genome, and discovered that the portion containing
the HG4 AMP was disrupted. Specifically, a single-nucleotide deletion upstream
of the HG4 start codon disrupted the amino-terminal region of the parent
protein and created a ribosome binding site with an 11 bp spacer (see Fig. 1).
Further downstream, a point mutation generated a stop codon, thus forming a
complete ORF that can be transcribed from the original gene promoter.

Figure 1. Diagram of the mutations that result in the formation of the AMP
HG4. A point deletion (Vdel) led to the creation of a ribosome binding
site. Further downstream, at the 25th residue, a second point mutation
(G -> A) results in a stop codon.
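To make the second step concrete, here is a minimal sketch of how a single G -> A substitution can turn a sense codon into an in-frame stop codon. The toy sequence and the mutated position are hypothetical and chosen only for illustration; they are not the real P. melaninogenica sequence.

```python
from typing import Optional

STOP_CODONS = {"TAA", "TAG", "TGA"}

def first_stop_codon(orf: str) -> Optional[int]:
    """Return the 0-based index of the first in-frame stop codon, or None."""
    for i in range(0, len(orf) - 2, 3):
        if orf[i:i + 3] in STOP_CODONS:
            return i // 3
    return None

def point_mutation(seq: str, pos: int, new_base: str) -> str:
    """Return a copy of seq with the base at 0-based position pos replaced."""
    return seq[:pos] + new_base + seq[pos + 1:]

# Hypothetical toy reading frame: ATG AAA CTG TGG GCT GAA (no stop codon).
toy_orf = "ATGAAACTGTGGGCTGAA"

# G -> A at the third base of TGG (Trp) gives TGA, a stop codon.
mutant_orf = point_mutation(toy_orf, 11, "A")

print(first_stop_codon(toy_orf))     # no in-frame stop in the toy original
print(first_stop_codon(mutant_orf))  # stop now appears at codon index 3
```

In this toy case the reading frame that used to run on is cut short at the mutated codon, which is the same kind of event that closes the HG4 ORF.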
A few further tests served to strengthen our prediction that HG4 has
antimicrobial properties. As seen in Figure 2, the sequence containing the AMP
(first 100 residues) shows a transition from hydrophobic to hydrophilic
regions. This is a typical amphiphilic profile of AMPs (Kumar et al., 2018).

Figure 2. Hydrophobicity profile of the reconstituted protein containing
HG4. With the majority of points scoring above 0, the profile is consistent
with an amphiphilic sequence and matches that of a typical AMP.
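A hydrophobicity profile of this kind can be sketched as a sliding-window average over a per-residue scale. The example below assumes the Kyte-Doolittle scale and a window of 9 residues; the post does not state which scale or window was used for Figure 2, and the sequence is a made-up placeholder rather than the real protein.

```python
# Kyte-Doolittle hydropathy scale (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydropathy_profile(seq, window=9):
    """Mean hydropathy of every full window along the sequence."""
    scores = [KYTE_DOOLITTLE[aa] for aa in seq.upper()]
    return [
        sum(scores[i:i + window]) / window
        for i in range(len(scores) - window + 1)
    ]

# Hypothetical toy sequence, NOT the real HG4 parent protein:
# a hydrophobic stretch followed by a polar, charged stretch.
toy_protein = "MKKLLIAVLLGSAAFADEKRNSTQDEKR"

profile = hydropathy_profile(toy_protein, window=9)
for start, score in enumerate(profile, start=1):
    print(f"window starting at residue {start}: {score:+.2f}")
```

Windows with a positive mean are hydrophobic on this scale and windows with a negative mean are hydrophilic, so a profile that crosses zero along the sequence reflects a hydrophobic-to-hydrophilic transition.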
AMPs usually have alternating sections of polar and hydrophobic residues, which
is essential for them to facilitate membrane movement (Dennison et al.,
2005). HG4 has polar residues mostly arranged along one side of the helical
wheel presented in Figure 3. For
AMPs to interact with their target microbes, they require an electrostatic
charge. An electrostatic attraction exists between the negative charge of a
bacterial membrane, and the positively charged AMP. The positive charge in the
AMP is responsible for the formation of pores. These pores lead to cytoplasm
leakage in the target cell, causing the cell’s death (Lei et al., 2019).

Figure 3. Helical wheel of HG4, showing the polar residues (green circles)
on the left-hand side.
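The two properties discussed here, net charge and the segregation of polar and hydrophobic residues around the helix, can be summarised numerically. The sketch below computes a crude net charge and an Eisenberg-style hydrophobic moment, assuming an ideal alpha-helix (100 degrees per residue) and using the Kyte-Doolittle scale as a stand-in for the hydrophobicity values. The peptide is hypothetical, not the HG4 sequence, and none of this is the method used to draw Figure 3.

```python
import math

# Kyte-Doolittle hydropathy scale (positive = hydrophobic).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def hydrophobic_moment(seq, angle_deg=100.0):
    """Mean hydrophobic moment per residue for an ideal helix."""
    sin_sum = 0.0
    cos_sum = 0.0
    for i, aa in enumerate(seq.upper()):
        theta = math.radians(i * angle_deg)  # position around the wheel
        h = KYTE_DOOLITTLE[aa]
        sin_sum += h * math.sin(theta)
        cos_sum += h * math.cos(theta)
    return math.hypot(sin_sum, cos_sum) / len(seq)

def net_charge(seq):
    """Crude net charge: +1 per K/R, -1 per D/E (His and termini ignored)."""
    seq = seq.upper()
    positive = seq.count("K") + seq.count("R")
    negative = seq.count("D") + seq.count("E")
    return positive - negative

# Hypothetical cationic, amphipathic toy peptide (not HG4).
toy_peptide = "KLLKKLLKKLLKKLLK"

print("net charge:", net_charge(toy_peptide))
print("hydrophobic moment:", round(hydrophobic_moment(toy_peptide), 2))
```

A larger moment means the hydrophobic and polar residues sit on opposite faces of the wheel; a uniform sequence whose residues are spread evenly around the wheel gives a moment close to zero.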
Can HG4 originate from a cryptic peptide?
One hypothesis would be that wild-type P. melaninogenica produces the whole
protein, and through the human digestive tract it is degraded into the HG4
peptide. Figure 4 shows the protease cleavage sites predicted with
PeptideCutter (Artimo et al., 2012) for
the protein containing the HG4 sequence. This suggests that general proteases
from the human intestine would be able to cleave the complete protein,
releasing HG4 by proteolysis.
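As a rough illustration of what such a prediction involves, the sketch below applies two simplified cleavage rules: trypsin cutting after K or R, and high-specificity chymotrypsin cutting after F, Y or W, in both cases not before P. PeptideCutter's actual rule set includes further exceptions, so this is an approximation rather than a reimplementation, and the sequence is a hypothetical placeholder.

```python
def cleavage_sites(seq, residues, blocked_by="P"):
    """Return 1-based positions after which the protease is predicted to cut.

    A cut is predicted after any residue in `residues` unless the next
    residue is in `blocked_by`. The last residue is never a cut site.
    """
    seq = seq.upper()
    sites = []
    for i in range(len(seq) - 1):
        if seq[i] in residues and seq[i + 1] not in blocked_by:
            sites.append(i + 1)
    return sites

TRYPSIN = "KR"               # simplified rule, exceptions ignored
CHYMOTRYPSIN_HIGH = "FYW"    # simplified high-specificity rule

# Hypothetical toy sequence, NOT the real HG4 parent protein.
toy_protein = "MKTAYRPLKFW"

print("trypsin:", cleavage_sites(toy_protein, TRYPSIN))
print("chymotrypsin (high):", cleavage_sites(toy_protein, CHYMOTRYPSIN_HIGH))
```

A peptide is released intact only if predicted cuts flank it and none fall inside it, which is the pattern to look for around the boxed HG4 region.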
The reconstituted protein belongs to a specific family of oxidoreductases. It
carries a signal peptide, which was detected using
SignalP 4.0 (Nielsen, 2017). Signal
peptides present in prokaryotic proteins are likely used for the secretion of
said protein, so our reconstituted protein is probably
secreted. Oxidoreductases are dependent on oxygen, which is depleted in the
human gut. So, a disruption in this gene would be less harmful for a microbe in
the gut, such as Prevotella sp. Furthermore, its secretion also makes the
protein available for proteolysis.

Figure 4. Predicted cleavage sites in the reconstituted protein containing
the peptide HG4 (red dashed box) for proteases from the human intestinal
environment. Trypsin and Chymotrypsin-high specificity are both able to cleave
the protein at positions that favour the release of HG4.
HG4 also had the longest predicted half-life among all the possible peptides
(generated in silico) formed from the reconstituted protein. Half-life was
calculated for an intestine-like environment because proteases present in the
intestine have the ability to release HG4. Using the HLP
server (Sharma et al., 2014),
we predicted that this peptide would be stable in the human gut for at
least 3 seconds, a relatively long half-life (Figure 5).
Thus, if the reconstituted protein is present in the human gut for a prolonged
period of time and proteases release HG4 at specific sites, this AMP would have
enough time to interact with its target.
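The in silico peptides can be generated by sliding a 25-residue window along the protein and scoring each window. The sketch below assumes overlapping windows with a step of one residue, which the post does not spell out. The scoring function is a deliberately fake placeholder: the real predictions came from the HLP web server, which this code does not call or imitate.

```python
def sliding_kmers(protein, k=25):
    """Return (1-based start, peptide) for every k-residue window."""
    return [(i + 1, protein[i:i + k]) for i in range(len(protein) - k + 1)]

def most_stable(protein, half_life, k=25):
    """Pick the window with the largest predicted half-life."""
    return max(sliding_kmers(protein, k), key=lambda item: half_life(item[1]))

def toy_half_life(peptide):
    """Placeholder score, NOT the HLP model: fewer K/R residues scores higher."""
    return 1.0 / (1 + peptide.count("K") + peptide.count("R"))

# Hypothetical toy protein: an alanine stretch flanked by K/R residues.
toy_protein = "MKR" + "A" * 25 + "KRK"

start, peptide = most_stable(toy_protein, toy_half_life)
print(f"most stable window starts at residue {start}: {peptide}")
```

With a real predictor plugged in as `half_life`, the same loop yields one value per start position, which is the kind of series plotted against peptide position.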

Figure 5. Half-life (seconds) plotted against peptide positions of the
reconstituted HG4 gene sliced into 25-mers. HG4 has a predicted half-life of 3
seconds. Peptides at all other positions have a shorter predicted half-life
and would therefore be degraded faster than HG4 in an intestine-like
environment.

Figure 6. Decay rate of each peptidyl position of the reconstituted
protein. HG4 is in the position labelled with a red circle and has the lowest
decay in comparison to all other peptide positions. This means it remains in
its original state for a period of time longer than the other positions on the
peptide.
We also searched for the HG4 sequence in the Global Microbial Gene Catalogue
(version 1) and found hits in two saliva samples from
the same donor. These hits were also predicted to
be from P. melaninogenica. Since this species, known to cause tooth cavities,
also occupies the oral niche, human proteases could be acting in both
the mouth and the gut to generate the HG4 AMP.
What is the target of HG4?
The prediction tool available in the DBAASP database
(Pirtskhalava et al., 2016) predicts that HG4 is active against Klebsiella
pneumoniae (PPV = 0.85). Therefore, we searched the Global Microbial Gene
Catalogue data to see whether there was any correlation between the abundance
of genes coding for the cryptic HG4 and that of K. pneumoniae.
The Spearman correlation is not statistically significant (correlation=0.14,
p-value=0.09). However, Fisher’s exact test on presence/absence between HG4 and
K. pneumoniae is statistically significant (p-value=1.155·10⁻⁶). This
suggests that there is a non-random association between the AMP and its
potential target, but it's a noisy one.
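For readers unfamiliar with the two tests, here is a minimal standard-library sketch of both: Spearman correlation on abundances (Pearson correlation of the ranks) and a two-sided Fisher's exact test on a 2x2 presence/absence table. All numbers below are invented for illustration and are not the values from the gene catalogue; in practice a statistics library would normally be used instead.

```python
from math import comb, sqrt

def ranks(values):
    """Ranks starting at 1, with ties given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    rx, ry = ranks(x), ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(k):
        # Hypergeometric probability of k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k) / comb(total, col1)

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    p_value = sum(
        prob(k) for k in range(lowest, highest + 1)
        if prob(k) <= p_observed * (1 + 1e-7)  # tolerance for float ties
    )
    return min(1.0, p_value)

# Invented abundances in six hypothetical samples.
hg4_abundance = [0.0, 0.0, 1.2, 0.4, 0.0, 2.5]
kpn_abundance = [0.0, 0.3, 0.9, 0.0, 0.0, 1.1]
print("Spearman rho:", round(spearman(hg4_abundance, kpn_abundance), 3))

# Invented presence/absence counts:
# rows = HG4 present/absent, columns = K. pneumoniae present/absent.
print("Fisher p-value:", fisher_exact_two_sided(12, 3, 40, 200))
```

The two tests answer different questions: the correlation asks whether abundances rise and fall together, while the exact test only asks whether the two are found in the same samples more often than expected by chance.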
Insights and valuable ideas from HG4
The model we have put forward is that HG4 is a cryptic peptide embedded in a
P. melaninogenica protein. This protein is secreted into the environment and,
in the human gut, can be cleaved to release HG4, which can potentially interact
with K. pneumoniae. However, with a few mutations, the peptide stops being
cryptic and is now transcribed and translated on its own.

Figure 7. Purple boxes represent the process by which proteases are able
to cleave HG4 from a protein found in P. melaninogenica. Green boxes
represent the process by which two mutations occur within the protein sequence
from P. melaninogenica, leading to the formation of HG4.
For now, we have an isolated case study, based on a series of speculative
steps, but it is interesting to ask whether it is representative of a more
widespread phenomenon. In particular, we are now asking how often similar
events happen across the microbial world. Hopefully, we will be able to give
you an update in the not-so-distant future with stronger evidence.
References
Artimo, P., Jonnalagedda, M., Arnold, K., Baratin, D., Csardi, G., De
Castro, E., Duvaud, S., Flegel, V., Fortier, A., Gasteiger, E. and
Grosdidier, A., 2012. ExPASy: SIB bioinformatics resource portal.
Nucleic acids research, 40(W1), pp.W597-W603.
DOI: 10.1093/nar/gks400.
Bellamy, W., Takase, M., Yamauchi, K., Wakabayashi, H., Kawase, K. and
Tomita, M., 1992. Identification of the bactericidal domain of
lactoferrin.
Biochimica et Biophysica Acta (BBA)-Protein Structure and
Molecular Enzymology, 1121(1-2), pp.130-136.
DOI: 10.1016/0167-4838(92)90346-f.
Cunha, N.B., Cobacho, N.B., Viana, J.F., Lima, L.A., Sampaio, K.B.,
Dohms, S.S., Ferreira, A.C., de la Fuente-Núñez, C., Costa, F.F.,
Franco, O.L. and Dias, S.C., 2017. The next generation of antimicrobial
peptides (AMPs) as molecular therapeutic tools for the treatment of
diseases with social and economic impacts.
Drug discovery today,
22(2), pp.234-248.
DOI: 10.1016/j.drudis.2016.10.017.
Dennison, S.R., Wallace, J., Harris, F. and Phoenix, D.A., 2005.
Amphiphilic α-helical antimicrobial peptides and their
structure/function relationships.
Protein and peptide letters, 12(1),
pp.31-39.
DOI: 10.2174/0929866053406084.
Santos-Junior, C.D., Pan, S., Zhao, X.M. and Coelho, L.P. MACREL:
antimicrobial peptide screening in genomes and metagenomes.
bioRxiv,
2019.12.17.880385.
DOI: 10.1101/2019.12.17.880385.
Do, N., Weindl, G., Grohmann, L., Salwiczek, M., Koksch, B., Korting,
H.C. and Schäfer‐Korting, M., 2014. Cationic membrane‐active
peptides–anticancer and antifungal activity as well as penetration into
human skin.
Experimental dermatology, 23(5), pp.326-331.
DOI: 10.1111/exd.12384.
Fesenko, I., Azarkina, R., Kirov, I., Kniazev, A., Filippova, A.,
Grafskaia, E., Lazarev, V., Zgoda, V., Butenko, I., Bukato, O. and
Lyapina, I., 2019. Phytohormone treatment induces generation of cryptic
peptides with antimicrobial activity in the Moss Physcomitrella patens.
BMC plant biology, 19(1), pp.1-16.
DOI: 10.1186/s12870-018-1611-z.
Gifford, J.L., Hunter, H.N. and Vogel, H.J., 2005.
Lactoferricin.
Cellular and molecular life sciences, 62(22), p.2588.
DOI: 10.1007/s00018-005-5373-z.
Gupta, R. and Srivastava, S., 2014. Antifungal effect of antimicrobial
peptides (AMPs LR14) derived from Lactobacillus plantarum strain LR/14
and their applications in prevention of grain spoilage.
Food
microbiology, 42, pp.1-7.
DOI: 10.1016/j.fm.2014.02.005.
Heintz-Buschart, A., May, P., Laczny, C.C., Lebrun, L.A., Bellora, C.,
Krishna, A., Wampach, L., Schneider, J.G., Hogan, A., De Beaufort, C.
and Wilmes, P., 2016. Integrated multi-omics of the human gut microbiome
in a case study of familial type 1 diabetes.
Nature microbiology,
2(1), pp.1-13.
DOI: 10.1038/nmicrobiol.2016.180.
Hilchie, A.L., Doucette, C.D., Pinto, D.M., Patrzykat, A., Douglas, S.
and Hoskin, D.W., 2011. Pleurocidin-family cationic antimicrobial
peptides are cytolytic for breast carcinoma cells and prevent growth of
tumor xenografts.
Breast cancer research, 13(5), p.R102.
DOI: 10.1186/bcr3043.
Huerta-Cepas, J., Szklarczyk, D., Heller, D., Hernández-Plaza, A.,
Forslund, S.K., Cook, H., Mende, D.R., Letunic, I., Rattei, T., Jensen,
L.J. and von Mering, C., 2019. eggNOG 5.0: a hierarchical, functionally
and phylogenetically annotated orthology resource based on 5090
organisms and 2502 viruses.
Nucleic acids research, 47(D1),
pp.D309-D314.
DOI: 10.1093/nar/gky1085.
Jiravanichpaisal, P., Lee, S.Y., Kim, Y.A., Andrén, T. and Söderhäll,
I., 2007. Antibacterial peptides in hemocytes and hematopoietic tissue
from freshwater crayfish Pacifastacus leniusculus: characterization and
expression pattern.
Developmental & Comparative Immunology, 31(5),
pp.441-455.
DOI: 10.1016/j.dci.2006.08.002.
Kumar, P., Kizhakkedathu, J.N. and Straus, S.K., 2018. Antimicrobial
peptides: Diversity, mechanism of action and strategies to improve the
activity and biocompatibility in vivo.
Biomolecules, 8(1), p.4.
DOI: 10.3390/biom8010004.
Lázár, V., Martins, A., Spohn, R., Daruka, L., Grézal, G., Fekete, G.,
Számel, M., Jangir, P.K., Kintses, B., Csörgő, B. and Nyerges, Á., 2018.
Antibiotic-resistant bacteria show widespread collateral sensitivity to
antimicrobial peptides.
Nature microbiology, 3(6), pp.718-731.
DOI: 10.1038/s41564-018-0164-0.
Lei, J., Sun, L., Huang, S., Zhu, C., Li, P., He, J., Mackey, V., Coy,
D.H. and He, Q., 2019. The antimicrobial peptides and their potential
clinical applications.
American journal of translational research,
11(7), p.3919.
PMCID: PMC6684887.
Li, N.N., Li, J.Z., Liu, P., Pranantyo, D., Luo, L., Chen, J.C., Kang,
E.T., Hu, X.F., Li, C.M. and Xu, L.Q., 2017. An antimicrobial peptide
with an aggregation-induced emission (AIE) luminogen for studying
bacterial membrane interactions and antibacterial actions.
Chemical
Communications, 53(23), pp.3315-3318.
DOI: 10.1039/c6cc09408b.
Meher, P.K., Sahu, T.K., Saini, V. and Rao, A.R., 2017. Predicting
antimicrobial peptides with improved accuracy by incorporating the
compositional, physico-chemical and structural features into Chou’s
general PseAAC.
Scientific reports, 7(1), pp.1-12.
DOI: 10.1038/srep42362.
Nielsen, H., 2017. Predicting secretory proteins with SignalP. In
Protein function prediction (pp. 59-73).
Humana Press, New York, NY.
DOI: 10.1007/978-1-4939-7015-5_6.
Pirtskhalava M, Gabrielian A, Cruz P, Griggs HL, Squires RB, Hurt DE,
Grigolava M, Chubinidze M, Gogoladze G, Vishnepolsky B, Alekseev V,
Rosenthal A, and Tartakovsky M. DBAASP v.2: an Enhanced Database of
Structure and Antimicrobial/Cytotoxic Activity of Natural and Synthetic
Peptides.
Nucl. Acids Res., 2016, 44 (D1), D1104-D1112.
DOI: 10.1093/nar/gkv1174.
Sharma A., Singla D., Rashid M. and Raghava G.P.S (2014) Designing of
peptides with desired half-life in intestine-like environment.
BMC
Bioinformatics 2014, 15:282.
DOI: 10.1186/1471-2105-15-282.
Chernov, V.M., Chernova, O.A., Mouzykantov, A.A., Lopukhov, L.L. and
Aminov, R.I., 2019. Omics of antimicrobials
and antimicrobial resistance.
Expert Opinion on Drug Discovery, 14(5),
pp.455-468.
DOI: 10.1080/17460441.2019.1588880.
Wang, G., Li, X. and Wang, Z., 2016. APD3: the antimicrobial peptide
database as a tool for research and education.
Nucleic acids research,
44(D1), pp.D1087-D1093.
DOI: 10.1093/nar/gkv1278.