Identification of a novel variant of LMP-1 of EBV in patients with endemic Burkitt lymphoma in western Kenya

Background: Epstein Barr virus (EBV) is a gammaherpesvirus that is associated with nasopharyngeal carcinoma (NPC) and endemic Burkitt lymphoma (eBL). EBV carries several latent genes that contribute to oncogenesis including the latent membrane protein 1 (LMP-1), a known oncogene and constitutively active CD40 homolog. Variation in the C terminal region of LMP-1 has been linked to NPC pathogenesis, but little is known regarding LMP-1 variation and eBL.
Results: In the present study, peripheral blood samples were obtained from 38 eBL patients and 22 healthy controls in western Kenya, where the disease is endemic. The LMP-1 C-terminal region from these samples was sequenced and analyzed. The frequency of a 30 base pair deletion of LMP-1 previously linked to NPC was not associated with eBL compared to healthy controls. However a novel LMP-1 variant was identified, called K for Kenya and for the G318K mutation that characterizes it. The K variant LMP-1 was found in 40.5% of eBL sequences and 25.0% of healthy controls. All K variant sequences contained mutations in both of the previously described minimal T cell epitopes in the C terminal end of LMP-1. These mutations occurred in the anchor residue at the C-terminal binding groove of both epitopes, a pocket necessary for MHC loading.
Conclusions: Overall, our results suggest that there is a novel K variant of LMP-1 in Kenya that may be associated with eBL. Further studies are necessary to determine the functional implications of the LMP-1 variant on early events in eBL genesis.


Background
Epstein Barr virus (EBV) is a well known infectious cofactor involved in the development of several malignancies, including endemic Burkitt lymphoma (eBL) and nasopharyngeal carcinoma (NPC) (reviewed in [1]). Still under question, however, is how EBV functions to drive malignancy. One possibility is that genetic variation in EBV leads to immune evasion of virally infected cells.
EBV encodes a number of genes that contribute to maintaining cell proliferation, blocking apoptosis, and contributing to the malignant phenotype of cancer cells [2][3][4][5]. One of the main EBV encoded oncogenes is latent membrane protein-1 (LMP-1) [6]. Latent membrane protein-1 is expressed during primary B cell infection, functioning as a constitutively active CD40 homolog and affecting many cellular proteins including TRADD, JAK3, PI3K, and RIPs [4,7,8]. Overexpression of LMP-1 in EBV-negative cell lines has shown that LMP-1 blocks apoptosis, increases cytokine production, cellular migration and transformation, and decreases cellular adhesion [8,9]. The structure of LMP-1 includes six transmembrane regions starting at the N terminus, with a long cytoplasmic tail containing three C terminal activating regions (CTAR), responsible for activating signaling cascades (Figure 1) [8].
Genetic variation of LMP-1 has been classified using different schemes [10][11][12][13]. These schemes were developed from sequences of different geographic areas and cellular origins. Sandvej and colleagues published the first of these classification schemes using a variety of healthy European sequences [12]. In this study Sandvej and colleagues identified 4 variants of LMP-1 sequences in healthy Caucasians and labeled them A, B, C, and D [12]. The most frequent LMP-1 variant observed was variant A (41.2%), followed by variant C (26.5%), variant D (17.6%), variant B (11.8%), and uncharacterized (2.9%) [12]. Previous sequencing studies had been performed using tumor tissue rather than peripheral blood from healthy individuals [14], potentially selecting for certain viral sequences.
Mutations and deletions within the CTARs of LMP-1 have been associated with disease [15][16][17]. Specifically, a 10 amino acid deletion mutant of LMP-1 as compared to the prototypical B95.8 EBV strain has been associated with NPC cases in Asia, Europe, and North Africa [18][19][20]. In a retrospective study of EBV-positive lymphoproliferative disorders, the LMP-1 deletion mutant was linked to malignant phenotypes [21]. Deletions in LMP-1 have also been associated with other types of EBV-positive lymphomas [22][23][24]. One study of children in Turkey with Burkitt lymphoma reported a high frequency of the larger 69 base pair deletion variant of LMP-1, but this study did not compare incidence to healthy controls [25]. A study in Brazil reported that a similar high proportion of Burkitt lymphoma patients and controls harbored deletion variants of LMP-1 [26]. Other studies have examined the association of EBV variants with eBL and produced conflicting results [13,[27][28][29][30]. Focused studies on EBV variation in eBL patients relative to healthy controls are needed to clarify these divergent observations. To our knowledge, no study has examined the extent of genetic diversity of LMP-1 in an area endemic for BL or in eBL patients.
Genetic variation in LMP-1 has been shown to correlate with differences in T cell immunity [31][32][33]. Two ways that variant LMP-1 can decrease T cell immunity are through enhancement of regulatory T cells (Tregs) and immune evasion. The role of Tregs in NPC was examined by Pai et al. wherein an NPC-associated LMP-1 variant failed to stimulate T cells as effectively as wildtype LMP-1 in a mixed lymphocyte reaction [33]. The NPC-associated LMP-1 variant led to enhanced IL-10 production by antigen presenting cells, enhancing regulatory T cell function and reducing T cell responses to LMP-1 [33]. LMP-1 is also a target for EBV cytotoxic T lymphocytes (CTL) and has well described T cell epitopes [32,34]. Duraiswami and colleagues showed that there are 6 LMP-1 peptide sequences that stimulate LMP-1 specific T cells to produce IFN-γ. Each of these regions was broken down into the minimal peptide sequences that were T cell epitopes. One of the T cell epitope regions within LMP-1 falls within CTAR3 [34], an area with known sequence variation [11,12,35]. A sequencing study of LMP-1 T cell epitopes from NPC patients showed no association with disease, however it has not been shown whether LMP-1 variation within the T cell epitope region is associated with immune evasion in eBL [34]. While LMP-1 is not expressed in eBL, T cell control of EBV during primary infection of B cells may be impaired by different LMP-1 variants.
Figure 1 Diagram of LMP-1 structural and functional motifs. Cytoplasmic terminal activating regions are labeled CTAR1-3 and labeled with their corresponding amino acid numbers. The region that we sequenced is labeled, along with the positions of amino acid mutations in the K variant sequence, designated with *. The 10 amino acid deletion associated with NPC is labeled with X. The T cell epitope region of CTAR3 is labeled TCE, the JAK3 binding region is labeled JAK3, and the TRADD motif of CTAR2 is labeled TRADD.
The current study sought to answer several outstanding questions. First, what is the diversity of LMP-1 sequence variation in an area endemic for eBL? Second, are certain LMP-1 genotypes associated with eBL compared to healthy controls? Finally, what does LMP-1 variation suggest about EBV pathogenesis? To answer these questions the C terminus of LMP-1 was sequenced from eBL patients and healthy controls from an eBL endemic area of western Kenya. A novel LMP-1 variant was observed in the Kenyan population, was highly prevalent in eBL patients, and carried mutations in the C terminal amino acids of both minimal T cell epitopes found in the portion of LMP-1 studied. These results may have implications for EBV-mediated immune evasion in the early events of Burkitt lymphomagenesis.

Study populations
Endemic Burkitt lymphoma patients and healthy controls were selected based on their availability from our previously reported case control study [36]. In this study only 13% of eBL patients were parasitemic by blood smear at admission, although nearly all resided in a malaria holoendemic area [37]. Also 28% of parents reported giving their child antimalarial treatment in the two weeks prior to presentation (Moormann, unpublished observation). Therefore point prevalence malaria status for eBL patients at presentation to this tertiary care hospital is not an accurate indicator of recent malaria. We have previously reported that 68% of this group of healthy controls were malaria positive at sampling [36]. Additional controls (C17-C24) were included from a nearby area of western Kenya [38], and of these 57% were PCR positive for malaria. Although acute malaria increases EBV load and possibly detectability [39], we were able to amplify EBV DNA from all eBL patients and healthy controls sampled, suggesting a low rate of detection bias of EBV. After sequencing it was pathologically determined that two eBL patients had tumors other than eBL (BL16 and BL39), and their sequencing data were excluded from the analysis but can be found in Additional file 1: Table S1. The mean age of eBL patients was 90 months and for healthy controls was 54 months. For eBL patients 56.8% were male and for healthy controls, 40.9% were male. A summary of demographic data on the study populations is shown in Table 1.

Coinfection with multiple EBV variants
Coinfection with different EBV LMP-1 deletion variants was determined by difference in the product size among clones. One eBL patient and two healthy controls had two discernible variants in LMP-1 size as determined by the size of the cloned PCR product when analyzed by gel electrophoresis (Figure 2). Both of the variants for the three study participants were sequenced and pooled with the results of the remaining sequences for analysis, resulting in 39 eBL sequences and 24 healthy control sequences.

Diversity of LMP-1 sequence variants
The T cell epitope region of CTAR3 through the 30 base pair deletion region to the 3′ end of the LMP-1 gene that was sequenced is shown in Figure 1. Isolates were then categorized into the scheme defined by Sandvej and colleagues and also compared with the prototypic B95.8 strain of EBV [15,35]. Because Sandvej et al. sequenced LMP-1 from many healthy Europeans [12], and compared the sequences to lymphoma patients [35], this classification scheme was chosen for the present study. In the present study of the C terminus of LMP-1, in contrast to Sandvej et al., variant A was not observed, while B, C, D, and B95.8 EBV LMP-1 variants were observed. Table 2 represents the full array of mutations observed in this study population, and the frequency of each variant in healthy control and eBL samples is shown in Figure 3. The only variant sequence represented exactly as described by Sandvej was the C variant, which was present in 15 (40.5%) eBL sequences and 7 (29.2%) control sequences (p=0.42, OR 1.65, 95% CI 0.55-4.97). However other variants could be characterized as similar to C type, differing only by single amino acid substitutions. These variants were denoted C' and when combined with true C variant totaled 17 (45.9%) eBL samples and 10 (41.7%) healthy controls (p=0.80, OR 1.19, 95% CI 0.42-3.36). Thus no difference in the frequency of C variant was observed between eBL and healthy control sequences.
Variants of several other previously described LMP-1 isolates were observed, including B, D, and B95.8.
Sequences are grouped according to the scheme devised by Sandvej et al. [12], with amino acid mutations labeled at sites along LMP-1 protein sequence. Novel K variant amino acid changes are also labeled. # Stands for B95.8 reference sequence. * Stands for amino acid deletion.
Presence of the 30 base pair deletion LMP-1 mutant detected by gel electrophoresis or by sequencing was compared and 100% concordance was observed between electrophoresis and sequencing studies in detecting the LMP-1 deletion (Figure 4, other data not shown). Next the frequency of the deletion mutant was compared between eBL cases and healthy controls. The 30 base pair deletion mutant was present in 17 (45.9%) eBL sequences and 10 (41.7%) healthy controls (p=0.80, OR 1.19, 95% CI 0.42-3.36).
No mutations were observed in the TRADD/RIP binding sequence of CTAR2, which occurs from amino acids 379-385 of LMP-1. Of the 63 sequence reads, 55 produced clean traces through the end of the LMP-1 coding sequence. The other 8 sequences were amplified with primers that did not include the last 8 amino acids of LMP-1, and this portion has been excluded from their analysis. In all 55 complete traces, the TRADD/RIP binding motif at the C terminal end of CTAR2 was 100% conserved.

LMP-1 T cell epitope variants
Duraiswami and colleagues showed that only specific LMP-1 epitopes are able to elicit interferon-γ production from T cells [34]. One of these epitopes occurs in CTAR3, from amino acids 307 to 323. Within this region it was determined that there were two minimal sequences of 9 amino acids necessary for recognition by EBV-specific T cells. The minimal T cell epitope sequences within CTAR3 were AGNDGGPPQ and PSDSAGNDG. When the K sequence was mapped onto these epitopes, it was found that the K variant was mutated at the C terminal amino acid of both minimal T cell epitopes, creating sequences AGNDEGPPK and PSDSAGNDE. A diagram of the possible effects of these mutations on MHC-I loading is shown in Figure 5. The G318K mutation was highly linked to the Q322E mutation, such that all 22 sequences observed containing G318K also contained Q322E. An amino acid mutation at Q322 in the C terminal of the T cell epitope was detected in 55 of 61 samples analyzed. While all K variant sequences contained two amino acid mutations in the T cell epitope region of CTAR3, all but two other sequences with mutations in this region harbored mutations only in Q322. Of the two sequences with multiple T cell epitope mutations, one was an alternate C variant sequence (BL36), with mutations in both terminal amino acids, to AGNDGGPSN. The other was a B variant sequence (C2), and contained the sequence AGNDNGPPE.
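The substitution naming used in this section can be illustrated with a minimal Python sketch. It compares a variant epitope against the reference epitope AGNDGGPPQ and reports each difference in the form reference residue, position, variant residue. The helper name `substitutions` and the start position of 314 are illustrative assumptions: the start position is inferred from the text placing Q322 at the C terminal residue of this 9 amino acid epitope, and it is not stated explicitly in the study. Only the two non-K sequences quoted above (C2 and BL36) are used as inputs.

```python
def substitutions(reference, variant, start):
    """List substitutions as <ref residue><position><variant residue>, e.g. Q322E.

    `start` is the LMP-1 amino acid number of the first residue of `reference`.
    Sequences must be the same length; insertions and deletions are not handled.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must be the same length")
    return [
        f"{ref}{start + offset}{var}"
        for offset, (ref, var) in enumerate(zip(reference, variant))
        if ref != var
    ]


REFERENCE_EPITOPE = "AGNDGGPPQ"
# Assumption: the epitope spans residues 314-322, inferred from Q322 being
# its C terminal residue; this numbering is not given directly in the text.
EPITOPE_START = 314

# Variant epitope sequences quoted in the text for isolates C2 and BL36.
isolates = {"C2": "AGNDNGPPE", "BL36": "AGNDGGPSN"}

for name, sequence in isolates.items():
    print(name, substitutions(REFERENCE_EPITOPE, sequence, EPITOPE_START))
```

Under the assumed numbering, both isolates show a change at position 322, consistent with the observation that Q322 was the most frequently mutated residue in this region.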

Discussion
The main goals of this study were to determine the genetic variation of the C terminus of LMP-1 in children residing in western Kenya, whether variation was linked to eBL versus healthy controls, and what LMP-1 variation suggests about EBV biology. To address the first goal of our study, the LMP-1 sequences obtained from Kenyan study participants were compared to previously reported sequences from healthy Caucasians [12]. The major LMP-1 sequences observed in the Kenyan population were the C variant and a previously unreported K variant sequence. We are unaware of any previous studies describing the characteristic G318K mutation of the K variant sequence. Other LMP-1 variants observed included the B, D, and B95.8. No A variant sequences were observed among this population from western Kenya, in contrast to the high prevalence observed in the European population [12]. This general pattern of EBV variants could suggest historical movement of EBV among populations [11]. For example, the A variant virus in the European population may have arisen independently of mutation in the African setting. Further studies using larger regions of the EBV genome and sequences from diverse geographical regions are necessary to validate these observations across the global population.
The second aim of this study was to determine if certain LMP-1 genotypes were associated with eBL as compared to healthy controls. None of the previously characterized LMP-1 variants observed were associated with eBL, including B, C, D, and B95.8. The novel K variant LMP-1 was found in 40.5% of eBL sequences and 25.0% of healthy controls (p=0.27). Larger sample sizes are needed to confirm whether K variant LMP-1 is associated with eBL in Kenya. Still undetermined is whether the K variant sequence is associated with eBL in other areas endemic for Burkitt lymphoma, which would support an immune evasive phenotype of K variant LMP-1, or if it arose independently in the Kenyan population. The selection of EBV genetic variants in cancer agrees with previous work suggesting that EBV-associated Hodgkin's disease selects for certain LMP-1 variants, which differ from the distribution of variants in the general population [35]. Similarly in eBL, previous work on EBNA-1 has suggested that certain EBNA-1 variants are more oncogenic than others [27]. Although some research has suggested the selection of specific EBNA-1 genetic variants in lymphomas, other work has suggested that specific EBNA-1 variants are associated only with geographic areas and not with eBL [28].
T cell control of EBV is critical for the development of protective immunity [40]. It was recently confirmed in a mouse model that T cell control of LMP-1 is necessary for inhibiting lymphomagenesis [41]. It has also been determined that only specific LMP-1 epitopes generate interferon-γ responses from T cells [34]. The possible link to T cell immune evasion in K variant LMP-1 derives from the mutated anchor residues in the C terminal binding groove of both of the two known minimal T cell recognition sequences of CTAR3 in the K variant. In addition to their specific location within the anchor position, these mutations resulted in changes in the polarity of the amino acid. The first mutation was from the small and uncharged glycine at position 318 to larger and positively charged lysine. The second mutation at amino acid 322 was from uncharged glutamine to negatively charged glutamic acid. Mutations in the C terminal binding groove affect the ability of peptides to be loaded onto appropriate MHC class I molecules [42,43], so these mutations may play an important role in MHC loading, decreasing the ability of LMP-1 derived peptides to be presented at the cell surface. Our study did not evaluate the MHC specificity of these variants, but the Kenyan population has very high MHC heterogeneity [44], and it is possible that people with certain MHC variants are unable to present these novel LMP-1 peptides. Functional studies are necessary to characterize the MHC specificity of the novel LMP-1 variants identified in this study.
Given the immune evasion hypothesis it is interesting that we did not observe a difference in the frequency of K type LMP-1 between eBL patients and controls. There are multiple possible explanations for this. One possibility is that the sample size of the current study was too small to detect a difference between these populations. Sampling a larger population was unfortunately not possible for this study. Another possibility is that LMP-1 variants of eBL patients and controls differ in critical T cell epitopes outside of the region sequenced here. It is known that LMP-1 T cell epitopes exist outside of CTAR2 and that amino acid variation leads to functional consequences [34], so this remains a possibility that should be examined by future studies. Another possibility is that K type LMP-1 in healthy individuals clusters spatially with high-risk eBL clusters [37,45]. Spatial data were not recorded in the current study, possibly altering the frequency of K type LMP-1 that would be observed in high versus low risk healthy controls. We believe that future studies including the entire coding region of LMP-1 with larger sample sizes will help resolve this apparent discrepancy.
A major limitation of this study was that LMP-1 was sequenced from DNA extracted from peripheral blood lymphocytes rather than eBL tumor tissue. We were unable to obtain biopsy tissue for these studies. However previous work showed that EBV isolated from eBL biopsy samples contained the same EBNA-1 sequence as EBV obtained from peripheral blood of the same individual, indicating that tumor and peripheral blood EBV isolates were genetically identical [28].

Conclusions
The C-terminus of LMP-1 was sequenced from peripheral blood of eBL patients and healthy controls in western Kenya. The Kenyan population demonstrated an altered distribution of LMP-1 variants compared to previous studies in Europe. A previously undocumented LMP-1 variant was also observed, called K for Kenya and for its novel lysine (K) substitution. The K variant LMP-1 is characterized by amino acid mutations in the C terminal anchor residues of both minimal T cell epitopes of LMP-1 CTAR3, which may lead to functional differences in MHC loading. The K variant was found at a higher frequency in eBL sequences than in healthy control sequences (40.5% versus 25.0%), although this difference was not statistically significant (p=0.27). Since this variant has not been described in eBL samples previously, larger patient populations will need to be studied to confirm the linkage between K variant and eBL development. Future studies are also needed to confirm the functional role of K variant mutations on MHC loading and T cell immune evasion.

Samples
Endemic BL patients were enrolled when presenting to the New Nyanza Provincial General Hospital in Kisumu, Kenya and healthy controls were enrolled from a nearby malaria holoendemic area as previously described [46]. Additional controls (C17-C24) were included from a subset of samples of a separate study of healthy children living in a nearby area of Kisumu, Kenya [38]. After obtaining informed consent, approximately five milliliters of peripheral blood was drawn from children with eBL and healthy controls. Whole blood was frozen at -80°C until use. From these frozen samples, 38 eBL patients and 22 healthy controls were randomly selected for sequencing. After beginning the study it was pathologically determined that 2 eBL patients (BL16 and BL39) had non-eBL tumors and their sequencing data were excluded from analysis.

Ethical approval
Ethical approval was obtained from the Institutional Review Boards at The State University of New York Upstate Medical University (Rochford), The University of Massachusetts Medical School (Moormann), and the Ethical Review Committee at the Kenya Medical Research Institute, Nairobi, Kenya. Parents of minor study participants provided individual, written informed consent in accordance with the Declaration of Helsinki.

DNA extraction
DNA was extracted from whole blood using the QIAamp DNA Mini Kit (Qiagen, Germantown, MD, USA) according to the manufacturer's instructions.

Cloning
After confirming the appropriate product length, PCR products were cloned using the TOPO TA pCR 2.1 cloning kit with TOP10 chemically competent Escherichia coli according to the manufacturer's instructions (Invitrogen, Carlsbad, CA, USA). Five clones per sample were selected and run on an agarose gel to visualize the presence of the LMP-1 product and the size of the amplicon.
Plasmid DNA was purified from E. coli using a Qiagen Plasmid Purification Mini Kit (Germantown, MD, USA) according to the manufacturer's instructions and eluted in HPLC grade water. To confirm the presence of the LMP-1 insert, plasmid DNA was digested with EcoRI (New England Biolabs, Ipswich, MA, USA) according to the manufacturer's instructions. A total of 5 clones per sample were digested. Digestion products were run on a 2% agarose gel as described above to confirm the presence of LMP-1 insert DNA.

Sequence analysis
Plasmids containing cloned LMP-1 PCR products were sent to Genewiz (South Plainfield, NJ, USA) for sequencing using M13R universal primers. Sequences were aligned using Unipro UGENE software (Novosibirsk, Russia).

Statistical analysis
Fisher's exact test with odds ratios (OR), and 95% confidence intervals (95% CI) in GraphPad Prism, version 5.0b (La Jolla, CA, USA) were used to compare the frequency of LMP-1 variants between eBL patients and healthy controls.
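As an illustration of this type of comparison, the following Python sketch computes a two-sided Fisher's exact p-value, an odds ratio, and a 95% confidence interval for a 2x2 table. It is not the GraphPad Prism procedure itself. Two points are assumptions: the confidence interval uses the Woolf (log-scale) approximation, which may differ from the method applied by the software, and the counts of sequences without the 30 base pair deletion (20 eBL, 14 control) are inferred from the reported counts and percentages (17 of 37 and 10 of 24). The function names are hypothetical.

```python
from math import comb, exp, log, sqrt


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables no more likely than the observed one.
    p_value = sum(
        p for p in map(prob, range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


def odds_ratio_woolf(a, b, c, d, z=1.96):
    """Sample odds ratio with a Woolf (log-scale) confidence interval.

    Assumption: all four cells are non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    se = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return odds_ratio, exp(log(odds_ratio) - z * se), exp(log(odds_ratio) + z * se)


# Rows: eBL, controls. Columns: deletion present, deletion absent.
# The "absent" counts are inferred from the reported percentages.
table = (17, 20, 10, 14)
p = fisher_exact_two_sided(*table)
odds, ci_low, ci_high = odds_ratio_woolf(*table)
print(f"p={p:.2f}, OR={odds:.2f}, 95% CI {ci_low:.2f}-{ci_high:.2f}")
```

With these inferred counts, the odds ratio and Woolf interval agree with the values reported for the 30 base pair deletion comparison (OR 1.19, 95% CI 0.42-3.36).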

Additional file
Additional file 1: Table S1. Amino acid sequences of patients excluded from study with non-BL tumors.