The Metagenomics Medicine mapping system (MetaMed) is a novel, integrative, system-wide correlation mapping system that links bacterial functions to medicine therapeutics. It applies a well-defined similarity score between microbial metabolite entities and medicine entities to connect microbiota functions with existing therapeutics. We demonstrate with comprehensive evidence that this linking strategy, which is straightforward but to our knowledge has not been tried before, can help achieve accurate predictions of microbial effects on the human body. The key idea behind MetaMed is to measure the similarity between metabolites derived from microbial biosynthetic gene clusters (BGCs) and publicly available drugs. We use KEGG pathway annotations to justify the proposed MetaMed scoring scheme, and the score can be used to identify metabolites with potential therapeutic effects. With this scoring system, MetaMed integrates microbe and public medicine information from MIBiG, DrugBank, LINCS, SIDER and other sources, providing a comprehensive data source for in-depth investigation of microbial impact on the human body. Linking microbial metabolites to the functions of known drugs yields a systematic map that includes microbe-drug, microbe-treatment indication, microbe-side effect and microbe-immune status transition links, among others. The aim of our study is to present a novel computational strategy for deciphering microbial effects on human health, and we validate the identified relationships against evidence from the literature. Future experimental validation and investigation of these derived hypotheses and predicted results are encouraged.
MetaMed uses a well-defined similarity score to measure the similarity between metabolites derived from microbial biosynthetic gene clusters (BGCs) and publicly available drugs. This score considers both compound structure and the corresponding perturbation effects on cells, measured by transcriptional readout as recorded in LINCS and CMAP. Specifically, the similarity between one metabolite and one drug is a weighted average of their structural similarity and their perturbation-based transcriptional similarity. Structural similarity between two compounds is computed as the Tanimoto coefficient of their structural fingerprints, which are generated with Open Babel and ChemmineR. Perturbation-based transcriptional similarity is computed on LINCS data using the L1000CD web tool.
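As an illustration of the scoring described above, the following Python sketch computes a Tanimoto coefficient between two fingerprints and combines it with a transcriptional similarity as a weighted average. The function names, the representation of a fingerprint as a set of on-bit indices, the default weight of 0.5, and the assumption that both similarities lie in [0, 1] are illustrative choices only; they are not MetaMed's actual implementation or parameters.

```python
def tanimoto(fp_a: set[int], fp_b: set[int]) -> float:
    """Tanimoto coefficient of two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        # Convention chosen for this sketch: two empty fingerprints score 0.
        return 0.0
    shared = len(fp_a & fp_b)
    return shared / (len(fp_a) + len(fp_b) - shared)


def combined_score(structure_sim: float,
                   transcription_sim: float,
                   weight: float = 0.5) -> float:
    """Weighted average of structural and transcriptional similarity.

    `weight` is the share given to structural similarity. The value 0.5 is
    a placeholder assumption, not a value taken from MetaMed.
    """
    if not 0.0 <= weight <= 1.0:
        raise ValueError("weight must be between 0 and 1")
    return weight * structure_sim + (1.0 - weight) * transcription_sim


# Hypothetical metabolite and drug fingerprints (on-bit indices).
metabolite_fp = {1, 2, 3}
drug_fp = {2, 3, 4}

structure_sim = tanimoto(metabolite_fp, drug_fp)
score = combined_score(structure_sim, transcription_sim=1.0)
print(structure_sim, score)
```

In practice the fingerprints would come from a cheminformatics toolkit and the transcriptional similarity from LINCS signatures; here both are stand-in values so that the arithmetic is easy to follow.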
To the best of our knowledge, MetaMed is the first well-curated and annotated database to provide comprehensive information bridging microbe research and personalized medicine. MetaMed contains 827 microbes, 1,157 metabolite-producing BGCs, 8,226 drugs and ~1,193,324 BGC-drug interactions.
Another highlight of MetaMed is that it links microbes with personalized medicine information by using compound structure and transcriptional profiles as a bridge. Through this bridge, MetaMed (1) can be used to analyze the potential implications of microbes for personalized medicine; (2) annotates metabolites with known side effects (SIDER); and (3) predicts the impact of microbial metabolites on the immune system using the defined scoring system.
In summary, MetaMed provides rich information to help achieve accurate predictions of microbial impact on the human body.
An Immune Status Transition reflects immune cell responses to perturbations as a differentially expressed gene signature between two immunological states. For example, DC.pDC.8+.SLN-DC.pDC.8+.Sp denotes the transition from dendritic cells (DC) with pDC.8+ surface markers isolated from skin-draining lymph nodes to dendritic cells with the same surface markers isolated from spleen.
Abbreviated cell type labels: Neut = neutrophil, Mono = monocyte, NKT = natural killer T cell, NK = natural killer cell, Mac = macrophage, T4 = CD4+ T cell, T8 = CD8+ T cell, Tgd = gamma-delta T cell, DC = dendritic cell. Abbreviated tissue labels: BM = bone marrow, scLN = subcutaneous lymph nodes, msLN = mesenteric lymph nodes, sdLN = skin-draining lymph nodes, SI = small intestine, FLN = follicular lymph nodes, IE = intraepithelial, mdLN = mediastinal lymph nodes, PP = Peyer’s patch, AA = abdominal adipose, LN = lymph nodes, MsLN = medullary sinus lymph nodes, PLN = pancreatic lymph nodes, SF = synovial fluid, ssLN = subcapsular lymph nodes. For details, please refer to .
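To make the label convention concrete, the following Python sketch splits a transition label into its source and target states and expands the cell type abbreviation. It assumes that a label has the form "state-state", that each state is dot-separated with the cell type first and the tissue last, and that the hyphen between the two states is directly followed by a known cell type abbreviation. These parsing rules and the function names are hypothetical, inferred from the single example above rather than from a documented MetaMed format.

```python
import re

# Cell type abbreviations as listed above.
CELL_TYPES = {
    "Neut": "neutrophil", "Mono": "monocyte", "NKT": "natural killer T cell",
    "NK": "natural killer cell", "Mac": "macrophage", "T4": "CD4+ T cell",
    "T8": "CD8+ T cell", "Tgd": "gamma-delta T cell", "DC": "dendritic cell",
}

# Split only at a hyphen that is followed by a known cell type, so that a
# hyphen inside a marker (for example "8-") is not treated as a separator.
_ALTERNATION = "|".join(sorted(map(re.escape, CELL_TYPES), key=len, reverse=True))
_SEPARATOR = re.compile(rf"-(?=(?:{_ALTERNATION})(?:\.|$))")


def parse_state(state: str) -> dict:
    """Parse one state such as 'DC.pDC.8+.SLN' (assumed layout)."""
    parts = state.split(".")
    return {
        "cell_type": CELL_TYPES.get(parts[0], parts[0]),
        "markers": ".".join(parts[1:-1]),
        "tissue": parts[-1] if len(parts) > 1 else "",  # tissue code left unexpanded
    }


def parse_transition(label: str) -> dict:
    """Parse a label of the assumed form '<source state>-<target state>'."""
    states = _SEPARATOR.split(label)
    if len(states) != 2:
        raise ValueError(f"cannot split {label!r} into two states")
    return {"source": parse_state(states[0]), "target": parse_state(states[1])}


transition = parse_transition("DC.pDC.8+.SLN-DC.pDC.8+.Sp")
print(transition["source"])
print(transition["target"])
```

The tissue code is returned as written because the example label uses codes (SLN, Sp) that do not appear verbatim in the abbreviation list above; a tissue lookup table could be added in the same way as the cell type table.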