Haemophilus parasuis, an important swine pathogen, was recently shown to invade endothelial or epithelial cells in vitro. NOD1/2 are specialized NLRs that participate in the recognition of pathogens able to invade intracellularly; we therefore assessed the contribution of NOD1/2 to inflammatory responses during H. parasuis infection. We observed that H. parasuis infection enhanced NOD2 expression and RIP2 phosphorylation in porcine kidney 15 cells. Our results also showed that knockdown of NOD1/2 or RIP2 expression significantly decreased H. parasuis-induced NF-κB activity, while the phosphorylation level of p38, JNK, or ERK was not changed. Moreover, real-time PCR results showed that NOD1, NOD2, or RIP2 was involved in the expression of CCL4, CCL5, and IL-8. Inhibition of NOD1 and NOD2 significantly reduced CCL5 promoter activity, and did so more effectively than inhibition of TLR.
Epstein-Barr virus (EBV) transforms B cells to continuously proliferating lymphoblastoid cell lines (LCLs), which represent an experimental model for EBV-associated cancers. EBV nuclear antigens (EBNAs) and LMP1 are EBV transcriptional regulators that are essential for LCL establishment, proliferation, and survival. Starting with the 3D genome organization map of LCL, we constructed a comprehensive EBV regulome encompassing 1,992 viral/cellular genes and enhancers. Approximately 30% of genes essential for LCL growth were linked to EBV enhancers. Deleting EBNA2 sites significantly reduced their target gene expression. Additional EBV super-enhancer (ESE) targets included MCL1, IRF4, and EBF. MYC ESE looping to the transcriptional start site of MYC was dependent on EBNAs. Deleting MYC ESEs greatly reduced MYC expression and LCL growth. EBNA3A/3C altered CDKN2A/B spatial organization to suppress senescence. EZH2 inhibition decreased the looping at the CDKN2A/B loci and reduced LCL growth. This study provides a comprehensive view of the spatial organization of chromatin during EBV-driven cellular transformation.
Primary effusion lymphoma (PEL) is a largely incurable malignancy of B cell origin with plasmacytic differentiation. Here, we report the identification of a highly effective inhibitor of PEL. This compound, 6-ethylthioinosine (6-ETI), is a nucleoside analog with toxicity to PEL in vitro and in vivo, but not to other lymphoma cell lines tested. We developed and performed resistome analysis, an unbiased approach based on RNA sequencing of resistant subclones, to discover the molecular mechanisms of sensitivity. We found different adenosine kinase-inactivating (ADK-inactivating) alterations in all resistant clones and determined that ADK is required to phosphorylate and activate 6-ETI. Further, we observed that 6-ETI induces ATP depletion and cell death accompanied by S phase arrest and DNA damage only in ADK-expressing cells. Immunohistochemistry for ADK served as a biomarker approach to identify 6-ETI-sensitive tumors, which we documented for other lymphoid malignancies with plasmacytic features. Notably, multiple myeloma (MM) expresses high levels of ADK, and 6-ETI was toxic to MM cell lines and primary specimens and had a robust antitumor effect in a disseminated MM mouse model. Several nucleoside analogs are effective in treating leukemias and T cell lymphomas, and 6-ETI may fill this niche for the treatment of PEL, plasmablastic lymphoma, MM, and other ADK-expressing cancers.
Epstein-Barr virus (EBV) is a major cause of immunosuppression-related B-cell lymphomas and Hodgkin lymphoma (HL). In these malignancies, EBV latent membrane protein 1 (LMP1) and LMP2A provide infected B cells with surrogate CD40 and B-cell receptor growth and survival signals. To gain insights into their synergistic in vivo roles in germinal center (GC) B cells, from which most EBV-driven lymphomas arise, we generated a mouse model with conditional GC B-cell LMP1 and LMP2A coexpression. LMP1 and LMP2A had limited effects in immunocompetent mice. However, upon T- and NK-cell depletion, LMP1/2A caused massive plasmablast outgrowth, organ damage, and death. RNA-sequencing analyses identified EBV oncoprotein effects on GC B-cell target genes, including up-regulation of multiple proinflammatory chemokines and master regulators of plasma cell differentiation. LMP1/2A coexpression also up-regulated key HL markers, including CD30 and mixed hematopoietic lineage markers. Collectively, our results highlight synergistic EBV membrane oncoprotein effects on GC B cells and provide a model for studies of their roles in immunosuppression-related lymphoproliferative diseases.
Nasopharyngeal carcinoma (NPC) most frequently occurs in southern China and southeast Asia. Epidemiology studies link NPC to genetic predisposition, Epstein-Barr virus (EBV) infection, and environmental factors. Genetic studies indicate that mutations in chromatin-modifying enzymes are the most frequent genetic alterations in NPC. Here, we used H3K27ac chromatin immunoprecipitation followed by deep sequencing (ChIP-seq) to define the NPC epigenome in primary NPC biopsies, NPC xenografts, and an NPC cell line, and compared them to immortalized normal nasopharyngeal or oral epithelial cells. We identified NPC-specific enhancers and found that these enhancers were enriched for motifs of nuclear factor κB (NF-κB), interferon regulatory factor 1 (IRF1) and IRF2, and the ETS family member ETS1. Normal cell-specific enhancers were enriched for basic leucine zipper family member and TP53 motifs. NPC super-enhancers with extraordinarily broad and high H3K27ac signals were also identified, and they were linked to genes important for oncogenesis, including ETV6. ETV6 was also highly expressed in NPC biopsies by immunohistochemistry. High ETV6 expression correlated with a poor prognosis. Furthermore, we defined the EBV episome epigenetic landscapes in primary NPC tissue.
Epstein-Barr virus (EBV) super-enhancers (ESEs) are essential for lymphoblastoid cell line (LCL) growth and survival. Reanalyses of LCL global run-on sequencing (GRO-seq) data found abundant enhancer RNAs (eRNAs) being transcribed at ESEs. Inactivation of the ESE components EBV nuclear antigen 2 (EBNA2) and bromodomain-containing protein 4 (BRD4) significantly decreased eRNAs at ESEs 428 and 525 kb upstream of the MYC oncogene transcription start site (TSS). shRNA knockdown of the MYC -428 and -525 ESE eRNAs caused LCL growth arrest. Furthermore, MYC ESE eRNA knockdown also significantly reduced MYC expression, ESE H3K27ac signals, and MYC ESE looping to the MYC TSS. These data indicate that ESE eRNAs strongly affect cell gene expression and enable LCL growth.
BACKGROUND: Indwelling arterial catheters (IACs) are used extensively in the ICU for hemodynamic monitoring and for blood gas analysis. IAC use also poses potentially serious risks, including bloodstream infections and vascular complications. The purpose of this study was to assess whether IAC use was associated with mortality in patients who are mechanically ventilated and do not require vasopressor support.
METHODS: This study used the Multiparameter Intelligent Monitoring in Intensive Care II database, consisting of >24,000 patients admitted to the Beth Israel Deaconess Medical Center ICU between 2001 and 2008. Patients requiring mechanical ventilation who did not require vasopressors or have a diagnosis of sepsis were identified, and the primary outcome was 28-day mortality. A model based on patient demographics, comorbidities, vital signs, and laboratory results was developed to estimate the propensity for IAC placement. Patients were then propensity matched, and the McNemar test was used to evaluate the association of IAC with 28-day mortality.
RESULTS: We identified 1,776 mechanically ventilated patients who met inclusion criteria. There were no differences in the covariates included in the final propensity model between the IAC and non-IAC propensity-matched groups. For the matched cohort, there was no difference in 28-day mortality between the IAC group and the non-IAC group (14.7% vs 15.2%; OR, 0.96; 95% CI, 0.62-1.47).
CONCLUSIONS: In hemodynamically stable patients who are mechanically ventilated, the presence of an IAC is not associated with a difference in 28-day mortality. Validation in other datasets, as well as further analyses in other subgroups, is warranted.
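The matched-pair comparison described in METHODS can be illustrated with a minimal Python sketch of the McNemar test. It assumes the continuity-corrected chi-square form of the test and uses hypothetical discordant-pair counts; the function names and the numbers are illustrative and are not taken from the study.

```python
import math


def mcnemar_test(b: int, c: int) -> tuple[float, float]:
    """Continuity-corrected McNemar test on discordant matched pairs.

    b: pairs in which only the IAC patient died (hypothetical labelling)
    c: pairs in which only the non-IAC patient died
    Returns (chi-square statistic, two-sided p-value).
    """
    if b + c == 0:
        return 0.0, 1.0
    stat = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


def matched_odds_ratio(b: int, c: int) -> float:
    """Conditional odds ratio for matched pairs: ratio of discordant counts."""
    return b / c


# Hypothetical counts, for illustration only.
stat, p = mcnemar_test(15, 5)
print(f"chi2 = {stat:.2f}, p = {p:.4f}, OR = {matched_odds_ratio(15, 5):.2f}")
```

Only discordant pairs enter the statistic, which is why the test suits propensity-matched cohorts: pairs in which both or neither patient died carry no information about the exposure.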
Epstein-Barr Virus (EBV) conversion of B-lymphocytes to Lymphoblastoid Cell Lines (LCLs) requires four EBV nuclear antigen (EBNA) oncoproteins: EBNA2, EBNALP, EBNA3A, and EBNA3C. EBNA2 and EBNALP associate with EBV and cell enhancers, up-regulate the EBNA promoter, MYC, and EBV Latent infection Membrane Proteins (LMPs), which up-regulate BCL2 to protect EBV-infected B-cells from MYC proliferation-induced cell death. LCL proliferation induces p16(INK4A) and p14(ARF)-mediated cell senescence. EBNA3A and EBNA3C jointly suppress p16(INK4A) and p14(ARF), enabling continuous cell proliferation. Analyses of the EBNA3A human genome-wide ChIP-seq landscape revealed 37% of 10,000 EBNA3A sites to be at strong enhancers; 28% to be at weak enhancers; 4.4% to be at active promoters; and 6.9% to be at weak and poised promoters. EBNA3A colocalized with BATF-IRF4, ETS-IRF4, RUNX3, and other B-cell Transcription Factors (TFs). EBNA3A sites clustered into seven unique groups, with differing B-cell TFs and epigenetic marks. EBNA3A coincidence with BATF-IRF4 or RUNX3 was associated with stronger EBNA3A ChIP-Seq signals. EBNA3A was at MYC, CDKN2A/B, CCND2, CXCL9/10, and BCL2, together with RUNX3, BATF, IRF4, and SPI1. ChIP-re-ChIP revealed complexes of EBNA3A on DNA with BATF. These data strongly support a model in which EBNA3A is tethered to DNA through BATF-containing protein complexes to enable continuous cell proliferation.
Super-enhancers are clusters of gene-regulatory sites bound by multiple transcription factors that govern cell transcription, development, phenotype, and oncogenesis. By examining Epstein-Barr virus (EBV)-transformed lymphoblastoid cell lines (LCLs), we identified four EBV oncoproteins and five EBV-activated NF-κB subunits co-occupying ∼1,800 enhancer sites. Of these, 187 had markedly higher and broader histone H3K27ac signals, characteristic of super-enhancers, and were designated "EBV super-enhancers." EBV super-enhancer-associated genes included the MYC and BCL2 oncogenes, which enable LCL proliferation and survival. EBV super-enhancers were enriched for B cell transcription factor motifs and had high co-occupancy of STAT5 and NFAT transcription factors (TFs). EBV super-enhancer-associated genes were more highly expressed than other LCL genes. Disrupting EBV super-enhancers by the bromodomain inhibitor JQ1 or conditionally inactivating an EBV oncoprotein or NF-κB decreased MYC or BCL2 expression and arrested LCL growth. These findings provide insight into mechanisms of EBV-induced lymphoproliferation and identify potential therapeutic interventions.
Epstein-Barr virus (EBV) infects germinal center (GC) B cells and establishes persistent infection in memory B cells. EBV-infected B cells can cause B-cell malignancies in humans with T- or natural killer-cell deficiency. We now find that EBV-encoded latent membrane protein 2A (LMP2A) mimics B-cell antigen receptor (BCR) signaling in murine GC B cells, causing altered humoral immune responses and autoimmune diseases. Investigation of the impact of LMP2A on B-cell differentiation in mice that conditionally express LMP2A in GC B cells or all B-lineage cells found LMP2A expression enhanced not only BCR signals but also plasma cell differentiation in vitro and in vivo. Conditional LMP2A expression in GC B cells resulted in preferential selection of low-affinity antibody-producing B cells despite apparently normal GC formation. GC B-cell-specific LMP2A expression led to systemic lupus erythematosus-like autoimmune phenotypes in an age-dependent manner. Epigenetic profiling of LMP2A B cells found increased H3K27ac and H3K4me1 signals at the zinc finger and bric-a-brac, tramtrack domain-containing protein 20 locus. We conclude that LMP2A reduces the stringency of GC B-cell selection and may contribute to persistent EBV infection and pathogenesis by providing GC B cells with excessive prosurvival effects.
The Epstein-Barr virus (EBV) encoded oncoprotein Latent Membrane Protein 1 (LMP1) signals through two C-terminal tail domains to drive cell growth, survival and transformation. The LMP1 membrane-proximal TES1/CTAR1 domain recruits TRAFs to activate MAP kinase, non-canonical and canonical NF-kB pathways, and is critical for EBV-mediated B-cell transformation. TRAF1 is amongst the most highly TES1-induced target genes and is abundantly expressed in EBV-associated lymphoproliferative disorders. We found that TRAF1 expression enhanced LMP1 TES1 domain-mediated activation of the p38, JNK, ERK and canonical NF-kB pathways, but not non-canonical NF-kB pathway activity. To gain insights into how TRAF1 amplifies LMP1 TES1 MAP kinase and canonical NF-kB pathways, we performed proteomic analysis of TRAF1 complexes immuno-purified from cells uninduced or induced for LMP1 TES1 signaling. Unexpectedly, we found that LMP1 TES1 domain signaling induced an association between TRAF1 and the linear ubiquitin chain assembly complex (LUBAC), and stimulated linear (M1)-linked polyubiquitin chain attachment to TRAF1 complexes. LMP1 or TRAF1 complexes isolated from EBV-transformed lymphoblastoid B cell lines (LCLs) were highly modified by M1-linked polyubiquitin chains. The M1-ubiquitin binding proteins IKK-gamma/NEMO, A20 and ABIN1 each associate with TRAF1 in cells that express LMP1. TRAF2, but not the cIAP1 or cIAP2 ubiquitin ligases, plays a key role in LUBAC recruitment and M1-chain attachment to TRAF1 complexes, implicating the TRAF1:TRAF2 heterotrimer in LMP1 TES1-dependent LUBAC activation. Depletion of either TRAF1 or the LUBAC ubiquitin E3 ligase subunit HOIP markedly impaired LCL growth. Likewise, LMP1 or TRAF1 complexes purified from LCLs were decorated by lysine 63 (K63)-linked polyubiquitin chains. LMP1 TES1 signaling induced K63-polyubiquitin chain attachment to TRAF1 complexes, and TRAF2 was identified as a K63-Ub chain target. Co-localization of M1- and K63-linked polyubiquitin chains on LMP1 complexes may facilitate downstream canonical NF-kB pathway activation. Our results highlight LUBAC as a novel potential therapeutic target in EBV-associated lymphoproliferative disorders.
MOTIVATION: Lipids, an essential class of biomolecules, are receiving increasing attention in the research community, especially with the development of analytical techniques such as mass spectrometry. Gene Ontology (GO) is the de facto standard function annotation scheme for gene products. Identification of both explicit and implicit lipid-related GO terms will help lipid research in many ways, e.g. assigning lipid function in protein function prediction.
RESULTS: We have constructed a Web site 'LipidGO' that facilitates browsing and searching lipid-related GO terms. An expandable hierarchical GO tree is constructed that allows users to find lipid-related GO terms easily. To support large-scale analysis, a user is able to upload a list of gene products or a list of GO terms to find out which of them are lipid related. Finally, we demonstrate the usefulness of 'LipidGO' by two applications: (i) identifying lipid-related gene products in model organisms and (ii) discovering potential novel lipid-related molecular functions.
AVAILABILITY AND IMPLEMENTATION: LipidGO is available at http://compbio.ddns.comp.nus.edu.sg/%7elipidgo/index.php.
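The distinction between explicit and implicit lipid-related terms can be illustrated with a small Python sketch. It rests on an assumption that is not stated in the abstract: a term is treated as explicit when its name contains a lipid keyword and as implicit when one of its ancestors in the hierarchy does. The term identifiers, the toy hierarchy and the keyword list are all hypothetical, and this is not the LipidGO implementation.

```python
# Hypothetical toy ontology: term id -> (term name, list of parent ids).
TOY_TERMS = {
    "T:1": ("metabolic process", []),
    "T:2": ("lipid metabolic process", ["T:1"]),
    "T:3": ("fatty acid metabolic process", ["T:2"]),
    "T:4": ("translation", ["T:1"]),
}


def ancestors(term_id, terms):
    """Collect all ancestors of a term by walking parent links."""
    seen = set()
    stack = list(terms[term_id][1])
    while stack:
        parent = stack.pop()
        if parent not in seen:
            seen.add(parent)
            stack.extend(terms[parent][1])
    return seen


def classify(terms, keywords=("lipid",)):
    """Label terms 'explicit' (keyword in name) or 'implicit' (keyword in an ancestor's name)."""
    def matches(term_id):
        return any(k in terms[term_id][0].lower() for k in keywords)

    labels = {}
    for term_id in terms:
        if matches(term_id):
            labels[term_id] = "explicit"
        elif any(matches(a) for a in ancestors(term_id, terms)):
            labels[term_id] = "implicit"
    return labels


print(classify(TOY_TERMS))
```

In this toy hierarchy the fatty acid term is picked up only through its parent, which is the kind of term a plain keyword search would miss.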
BACKGROUND: Epstein-Barr Virus (EBV) is a globally prevalent herpesvirus associated with infectious mononucleosis and many malignancies. Surveys of EBV prevalence are important for studying EBV-related diseases and for determining when to administer a prophylactic vaccine. The purpose of this retrospective study was to collect baseline information about the prevalence of EBV infection in Chinese children.
METHODOLOGY/PRINCIPAL FINDINGS: We collected 1778 serum samples from healthy children aged 0 to 10, who were enrolled in conventional health and nutrition examinations without any EBV-related symptom in 2012 and 2013 in North China (n = 973) and South China (n = 805). We detected four EBV-specific antibodies, i.e., anti-VCA-IgG and IgM, anti-EBNA-IgG and anti-EA-IgG, by ELISA, representing all of the phases of EBV infection. The overall EBV seroprevalence in samples from North and South China was 80.78% and 79.38%, respectively. The EBV seropositivity rates dropped slightly at age 2, and then increased gradually with age. The seroprevalence stabilized at over 90% after age 8. In this study, the seroprevalence trends between North and South China showed no difference (P>0.05), and the trends of average antibody concentrations were similar as well (P>0.05).
CONCLUSIONS/SIGNIFICANCE: EBV seroprevalence exceeded 50% before age 3 in Chinese children, and exceeded 90% after age 8. This study can help in studying the relationship between EBV and EBV-associated diseases, and can support EBV vaccine development and implementation.
Epstein-Barr virus nuclear antigen 3C (EBNA3C) repression of CDKN2A p14(ARF) and p16(INK4A) is essential for immortal human B-lymphoblastoid cell line (LCL) growth. EBNA3C ChIP-sequencing identified >13,000 EBNA3C sites in LCL DNA. Most EBNA3C sites were associated with active transcription; 64% were strong H3K4me1- and H3K27ac-marked enhancers and 16% were active promoters marked by H3K4me3 and H3K9ac. Using ENCODE LCL transcription factor ChIP-sequencing data, EBNA3C sites coincided (±250 bp) with RUNX3 (64%), BATF (55%), ATF2 (51%), IRF4 (41%), MEF2A (35%), PAX5 (34%), SPI1 (29%), BCL11a (28%), SP1 (26%), TCF12 (23%), NF-κB (23%), POU2F2 (23%), and RBPJ (16%). EBNA3C sites separated into five distinct clusters: (i) Sin3A, (ii) EBNA2/RBPJ, (iii) SPI1, and (iv) strong or (v) weak BATF/IRF4. EBNA3C signals were positively affected by RUNX3, BATF/IRF4 (AICE) and SPI1/IRF4 (EICE) cooccupancy. Gene set enrichment analyses correlated EBNA3C/Sin3A promoter sites with transcription down-regulation (P < 1.6 × 10(-4)). EBNA3C signals were strongest at BATF/IRF4 and SPI1/IRF4 composite sites. EBNA3C bound strongly to the p14(ARF) promoter through SPI1/IRF4/BATF/RUNX3, establishing RBPJ-, Sin3A-, and REST-mediated repression. EBNA3C immune precipitated with Sin3A and conditional EBNA3C inactivation significantly decreased Sin3A binding at the p14(ARF) promoter (P < 0.05). These data support a model in which EBNA3C binds strongly to BATF/IRF4/SPI1/RUNX3 sites to enhance transcription and recruits RBPJ/Sin3A- and REST/NRSF-repressive complexes to repress p14(ARF) and p16(INK4A) expression.
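The ±250 bp coincidence percentages reported above can be illustrated with a short Python sketch. It assumes that each site is reduced to a single coordinate (for example a peak summit) and that two sites coincide when their coordinates on the same chromosome differ by at most 250 bp. The abstract does not say how coincidence was computed, so the function name, the representation and the toy coordinates are all hypothetical.

```python
from bisect import bisect_left


def fraction_coincident(sites, tf_peaks, window=250):
    """Fraction of `sites` with at least one `tf_peaks` entry within +/- window bp.

    Both arguments are lists of (chromosome, position) tuples.
    """
    by_chrom = {}
    for chrom, pos in tf_peaks:
        by_chrom.setdefault(chrom, []).append(pos)
    for positions in by_chrom.values():
        positions.sort()

    hits = 0
    for chrom, pos in sites:
        positions = by_chrom.get(chrom, [])
        # First TF peak at or beyond the left edge of the window.
        i = bisect_left(positions, pos - window)
        if i < len(positions) and positions[i] <= pos + window:
            hits += 1
    return hits / len(sites) if sites else 0.0


# Toy coordinates, for illustration only.
sites = [("chr1", 1000), ("chr1", 5000), ("chr2", 300)]
tf = [("chr1", 1200), ("chr1", 9000), ("chr2", 600)]
print(fraction_coincident(sites, tf))
```

Sorting each chromosome once and using binary search keeps the lookup fast enough for the tens of thousands of sites typical of ChIP-seq peak sets.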
The nuclear factor κB (NF-κB) subunits RelA, RelB, cRel, p50, and p52 are each critical for B cell development and function. To systematically characterize their responses to canonical and noncanonical NF-κB pathway activity, we performed chromatin immunoprecipitation followed by high-throughput DNA sequencing (ChIP-seq) analysis in lymphoblastoid B cell lines (LCLs). We found a complex NF-κB-binding landscape, which did not readily reflect the two NF-κB pathway paradigms. Instead, 10 subunit-binding patterns were observed at promoters and 11 at enhancers. Nearly one-third of NF-κB-binding sites lacked κB motifs and were instead enriched for alternative motifs. The oncogenic forkhead box protein FOXM1 co-occupied nearly half of NF-κB-binding sites and was identified in protein complexes with NF-κB on DNA. FOXM1 knockdown decreased NF-κB target gene expression and ultimately induced apoptosis, highlighting FOXM1 as a synthetic lethal target in B cell malignancy. These studies provide a resource for understanding mechanisms that underlie NF-κB nuclear activity and highlight opportunities for selective NF-κB blockade.
BACKGROUND: H. sapiens-M. tuberculosis H37Rv protein-protein interaction (PPI) data are essential for understanding the infection mechanism of the formidable pathogen M. tuberculosis H37Rv. Computational prediction is an important strategy to fill the gap in experimental H. sapiens-M. tuberculosis H37Rv PPI data. Homology-based prediction is frequently used in predicting both intra-species and inter-species PPIs. However, some limitations are not properly resolved in several published works that predict eukaryote-prokaryote inter-species PPIs using intra-species template PPIs.
RESULTS: We develop a stringent homology-based prediction approach by taking into account (i) differences between eukaryotic and prokaryotic proteins and (ii) differences between inter-species and intra-species PPI interfaces. We compare our stringent homology-based approach to a conventional homology-based approach for predicting host-pathogen PPIs, based on cellular compartment distribution analysis, disease gene list enrichment analysis, pathway enrichment analysis and functional category enrichment analysis. These analyses support the validity of our prediction result, and clearly show that our approach has better performance in predicting H. sapiens-M. tuberculosis H37Rv PPIs. Using our stringent homology-based approach, we have predicted a set of highly plausible H. sapiens-M. tuberculosis H37Rv PPIs which might be useful for many related studies. Based on our analysis of the H. sapiens-M. tuberculosis H37Rv PPI network predicted by our stringent homology-based approach, we have discovered several interesting properties which are reported here for the first time. We find that both host proteins and pathogen proteins involved in the host-pathogen PPIs tend to be hubs in their own intra-species PPI network. Also, both host and pathogen proteins involved in host-pathogen PPIs tend to have longer primary sequences, more domains, and greater hydrophilicity. In addition, the protein domains from both host and pathogen proteins involved in host-pathogen PPIs tend to have lower charge and to be more hydrophilic.
CONCLUSIONS: Our stringent homology-based prediction approach provides a better strategy in predicting PPIs between eukaryotic hosts and prokaryotic pathogens than a conventional homology-based approach. The properties we have observed from the predicted H. sapiens-M. tuberculosis H37Rv PPI network are useful for understanding inter-species host-pathogen PPI networks and provide novel insights for host-pathogen interaction studies.
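For orientation, the conventional homology-based (interolog) transfer that the stringent approach is compared against can be sketched in Python as follows. This is a minimal illustration under stated assumptions: homolog assignments are given as precomputed mappings, all protein identifiers are hypothetical, and the stringent filters described in RESULTS (eukaryote versus prokaryote differences, interface differences) are not reproduced.

```python
def predict_interologs(template_ppis, host_homologs, pathogen_homologs):
    """Transfer template PPIs to host-pathogen pairs via homology.

    template_ppis: iterable of (protein_a, protein_b) known interactions
    host_homologs / pathogen_homologs: dict mapping a protein to the set of
        template proteins it is homologous to (assumed precomputed)
    Returns a set of predicted (host_protein, pathogen_protein) pairs.
    """
    templates = {frozenset(pair) for pair in template_ppis}
    predicted = set()
    for host, host_hits in host_homologs.items():
        for pathogen, pathogen_hits in pathogen_homologs.items():
            if any(frozenset((a, b)) in templates
                   for a in host_hits for b in pathogen_hits):
                predicted.add((host, pathogen))
    return predicted


# Hypothetical identifiers, for illustration only.
template = [("tA", "tB"), ("tC", "tD")]
host = {"H1": {"tA"}, "H2": {"tD"}, "H3": {"tE"}}
pathogen = {"P1": {"tB"}, "P2": {"tC"}}
print(sorted(predict_interologs(template, host, pathogen)))
```

A stringent variant would add filtering steps before a pair is accepted; what those filters are is described only at a high level in the abstract, so they are left out here.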
Epstein-Barr virus (EBV) nuclear antigens EBNALP (LP) and EBNA2 (E2) are coexpressed in EBV-infected B lymphocytes and are critical for lymphoblastoid cell line outgrowth. LP removes NCOR and RBPJ repressive complexes from promoters, enhancers, and matrix-associated deacetylase bodies, whereas E2 activates transcription from distal enhancers. LP ChIP-seq analyses identified 19,224 LP sites, of which ~50% were within ±2 kb of a transcriptional start site. LP sites were enriched for B-cell transcription factors (TFs), YY1, SP1, PAX5, BATF, IRF4, ETS1, RAD21, PU.1, CTCF, RBPJ, ZNF143, SMC3, NFκB, TBLR, and EBF. E2 sites were also highly enriched for LP-associated cell TFs and were more highly occupied by RBPJ and EBF. LP sites were highly marked by H3K4me3, H3K27ac, H2Az, H3K9ac, RNAPII, and P300, indicative of activated transcription. LP sites were 29% colocalized with E2 (LP/E2). LP/E2 sites were more similar to LP than to E2 sites in associated cell TFs, RNAPII, P300, and histone H3K4me3, H3K9ac, H3K27ac, and H2Az occupancy, and were more highly transcribed than LP or E2 sites. Genes affected by CTCF and LP cooccupancy were more highly expressed than genes affected by CTCF alone. LP was at enhancers and promoters of MYC and of MYC-regulated CCND2, 23 MED complex components, and the MYC-regulated cell survival genes IGF2R and BCL2. These data implicate LP and associated TFs and DNA looping factors CTCF, RAD21, SMC3, and YY1/INO80 chromatin-remodeling complexes in repressor depletion and gene activation necessary for lymphoblastoid cell line growth and survival.
Host-pathogen interactions are important for understanding infection mechanisms and developing better treatment and prevention of infectious diseases. Many computational studies on host-pathogen interactions have been published. Here, we review recent progress and results in this field and provide a systematic summary, comparison and discussion of computational studies on host-pathogen interactions, including prediction and analysis of host-pathogen protein-protein interactions; basic principles revealed from host-pathogen interactions; and database and software tools for host-pathogen interaction data collection, integration and analysis.
BACKGROUND: H. sapiens-M. tuberculosis H37Rv protein-protein interaction (PPI) data are important for illuminating the infection mechanism of M. tuberculosis H37Rv. However, current H. sapiens-M. tuberculosis H37Rv PPI data are very scarce, which seriously limits the study of the interaction between this important pathogen and its host H. sapiens. Computational prediction of H. sapiens-M. tuberculosis H37Rv PPIs is an important strategy to fill this gap. Domain-domain interaction (DDI) based prediction is one of the frequently used computational approaches in predicting both intra-species and inter-species PPIs. However, the performance of DDI-based host-pathogen PPI prediction has been rather limited.
RESULTS: We develop a stringent DDI-based prediction approach with emphasis on (i) differences between the specific domain sequences on annotated regions of proteins under the same domain ID and (ii) calculation of the interaction strength of predicted PPIs based on the interacting residues in their interaction interfaces. We compare our stringent DDI-based approach to a conventional DDI-based approach for predicting PPIs based on gold standard intra-species PPIs and coherent informative Gene Ontology terms assessment. The assessment results show that our stringent DDI-based approach achieves much better performance in predicting PPIs than the conventional approach. Using our stringent DDI-based approach, we have predicted a small set of reliable H. sapiens-M. tuberculosis H37Rv PPIs which could be very useful for a variety of related studies. We also analyze the H. sapiens-M. tuberculosis H37Rv PPIs predicted by our stringent DDI-based approach using cellular compartment distribution analysis, functional category enrichment analysis and pathway enrichment analysis. The analyses support the validity of our prediction result. Also, based on an analysis of the H. sapiens-M. tuberculosis H37Rv PPI network predicted by our stringent DDI-based approach, we have discovered some important properties of domains involved in host-pathogen PPIs. We find that both host and pathogen proteins involved in host-pathogen PPIs tend to have more domains than proteins involved in intra-species PPIs, and these domains have more interaction partners than domains on proteins involved in intra-species PPI.
CONCLUSIONS: The stringent DDI-based prediction approach reported in this work provides a stringent strategy for predicting host-pathogen PPIs. It also performs better than a conventional DDI-based approach in predicting PPIs. We have predicted a small set of accurate H. sapiens-M. tuberculosis H37Rv PPIs which could be very useful for a variety of related studies.
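The conventional DDI-based prediction used as the baseline above can be sketched in a few lines of Python: a host protein and a pathogen protein are predicted to interact when any of their annotated domains form a known interacting pair. The sketch assumes that domain annotations and DDIs are already available as plain collections. All identifiers are hypothetical, and the stringent steps (sequence-level comparison under the same domain ID, interface-based interaction strength) are not reproduced.

```python
def predict_by_ddi(host_domains, pathogen_domains, ddis):
    """Predict host-pathogen PPIs from known domain-domain interactions.

    host_domains / pathogen_domains: dict mapping a protein to its set of domain ids
    ddis: iterable of (domain_x, domain_y) known interacting domain pairs
    Returns {(host, pathogen): sorted list of supporting (host_domain, pathogen_domain)}.
    """
    known = {frozenset(pair) for pair in ddis}
    predictions = {}
    for host, h_doms in host_domains.items():
        for pathogen, p_doms in pathogen_domains.items():
            support = sorted(
                (x, y) for x in h_doms for y in p_doms
                if frozenset((x, y)) in known
            )
            if support:
                predictions[(host, pathogen)] = support
    return predictions


# Hypothetical annotations, for illustration only.
host = {"H1": {"D1", "D2"}, "H2": {"D3"}}
pathogen = {"P1": {"D4"}, "P2": {"D5"}}
ddis = [("D1", "D4"), ("D2", "D4"), ("D3", "D9")]
print(predict_by_ddi(host, pathogen, ddis))
```

Returning the supporting domain pairs, rather than a bare yes/no, leaves room for a later scoring step such as the interaction-strength calculation mentioned in RESULTS.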
BACKGROUND: Pathway data are important for understanding the relationship between genes, proteins and many other molecules in living organisms. Pathway gene relationships are crucial information for guidance, prediction, reference and assessment in biochemistry, computational biology, and medicine. Many well-established databases--e.g., KEGG, WikiPathways, and BioCyc--are dedicated to collecting pathway data for public access. However, the effectiveness of these databases is hindered by issues such as incompatible data formats, inconsistent molecular representations, inconsistent molecular relationship representations, inconsistent referrals to pathway names, and incomplete data from different databases.
RESULTS: In this paper, we overcome these issues through extraction, normalization and integration of pathway data from several major public databases (KEGG, WikiPathways, BioCyc, etc). We build a database that not only hosts our integrated pathway gene relationship data for public access but also maintains the necessary updates in the long run. This public repository is named IntPath (Integrated Pathway gene relationship database for model organisms and important pathogens). Four organisms--S. cerevisiae, M. tuberculosis H37Rv, H. sapiens and M. musculus--are included in this version (V2.0) of IntPath. IntPath uses the "full unification" approach to ensure no deletion and no introduced noise in this process. Therefore, IntPath contains much richer pathway-gene and pathway-gene pair relationships and a much larger number of non-redundant genes and gene pairs than any of the single-source databases. The gene relationships of each gene (measured by average node degree) per pathway are significantly richer. The gene relationships in each pathway (measured by average number of gene pairs per pathway) are also considerably richer in the integrated pathways. Moderate manual curation is involved to remove errors and noise from source data (e.g., the gene ID errors in WikiPathways and relationship errors in KEGG). We turn complicated and incompatible XML data formats and inconsistent gene and gene relationship representations from different source databases into normalized and unified pathway-gene and pathway-gene pair relationships neatly recorded in simple tab-delimited text format and MySQL tables, which facilitates convenient automatic computation and large-scale referencing in many related studies. IntPath data can be downloaded in text format or as a MySQL dump. IntPath data can also be retrieved and analyzed conveniently through a web service by local programs or through a web interface by mouse clicks. Several useful analysis tools are also provided in IntPath.
CONCLUSIONS: We have overcome in IntPath the issues of compatibility, consistency, and comprehensiveness that often hamper effective use of pathway databases. We have included four organisms in the current release of IntPath. Our methodology and programs described in this work can be easily applied to other organisms; and we will include more model organisms and important pathogens in future releases of IntPath. IntPath maintains regular updates and is freely available at http://compbio.ddns.comp.nus.edu.sg:8080/IntPath.
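The idea of merging pathway gene pairs from several sources and measuring the result by average node degree can be sketched in Python as follows. This is a simplified union-based illustration, not the IntPath "full unification" procedure, whose details are not given in the abstract. The source names, pathway names, gene identifiers and the alias map are hypothetical.

```python
def integrate(sources, id_map):
    """Merge gene pairs per pathway across sources after normalizing names and ids.

    sources: {source_name: {pathway_name: [(gene_a, gene_b), ...]}}
    id_map: maps alternative gene ids to one canonical id (assumed given)
    Returns {normalized_pathway_name: set of frozenset gene pairs}.
    """
    merged = {}
    for pathways in sources.values():
        for name, pairs in pathways.items():
            key = name.strip().lower()  # crude pathway-name normalization
            bucket = merged.setdefault(key, set())
            for a, b in pairs:
                a, b = id_map.get(a, a), id_map.get(b, b)
                bucket.add(frozenset((a, b)))  # undirected, deduplicated
    return merged


def average_node_degree(pairs):
    """Average number of relationships per gene in one pathway."""
    genes = set().union(*pairs) if pairs else set()
    return 2 * len(pairs) / len(genes) if genes else 0.0


# Hypothetical sources, for illustration only.
sources = {
    "dbA": {"Glycolysis": [("g1", "g2")]},
    "dbB": {"glycolysis ": [("G2_alias", "g1"), ("g2", "g3")]},
}
merged = integrate(sources, {"G2_alias": "g2"})
print(len(merged["glycolysis"]), average_node_degree(merged["glycolysis"]))
```

Because pairs are stored as unordered sets under canonical identifiers, the same relationship reported by two sources under different gene aliases is counted once, which is what makes the "non-redundant" counts comparable across databases.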