Human papillomavirus infection and risk of breast cancer: a meta-analysis of case-control studies

Background Although systematic reviews (SRs) report that human papillomavirus (HPV) increases the risk of breast cancer, this association remains disputed. In particular, it has been argued that the risk level differs depending on nationality, type of tissue, subtype of HPV, and publication year. Because the literature searches of the previous SRs ended in June 2013, an updated meta-analysis is needed. Methods Starting from the articles selected in the previous SRs, we compiled a list of their references, cited articles, and related articles from the PubMed and Scopus databases. Of these, only publications with data from case-control studies on HPV DNA-positivity in tissues were chosen. The summary odds ratio (SOR) and 95 % confidence interval (CI) were calculated through meta-analysis. Meta-regression analysis was performed for nationality, type of tissue, subtype of HPV, and publication year. Results Twenty-two case-control studies were selected, with 1897 and 948 individuals in the case and control groups, respectively. In the meta-analysis of the 22 publications, HPV infection increased the risk of breast cancer (SOR = 4.02, 95 % CI: 2.42–6.68; I-squared = 44.7 %). Meta-regression found no statistical significance for any of the four variables that some researchers consider sources of heterogeneity: nationality, type of tissue, subtype of HPV, and publication year. Conclusions The results of the present study support the argument that HPV infection increases the risk of breast cancer. Age-matched case-control studies are needed in the future.

paraffin-embedded tissue (PET) to test for HPV DNA-positivity because HPV DNA can be destroyed and become contaminated during the treatment procedure, meaning that PET will have more measurement errors than fresh frozen tissue (FFT) [15]. Although Li et al. [12] emphasized that HPV 33 was detected in all Asians, it was suggested that these regional differences can be attributed to differences in the testing method [15]. In addition, Zhou et al. [13] stressed that the risk of HPV infection was influenced by geographic region, HPV DNA source, PCR primer used, and publication year. However, in the subgroup analysis, the confidence intervals of the SORs overlapped with one another. Therefore, it is necessary to further examine whether these variables indeed cause heterogeneity. Furthermore, given that the final search period of the 3 SRs was June 2013 [11], the meta-analysis needs to be updated by additionally selecting literature published up to September 2015. The objective of this study was to re-conduct the meta-analysis, with meta-regression, on the relationship between HPV infection and the risk of breast cancer.
Figure 1 depicts the process of selecting articles for the final analysis through a data search. Starting from the 3 SRs on the association between HPV and the prevalence and odds ratio (OR) of breast cancer, we compiled a list containing 85 references and 8122 cited and related articles from PubMed and Scopus. We sequentially applied the selection criteria to the total of 8207 papers and excluded (1) 8113 articles with a different hypothesis, (2) 21 articles that were expert reviews or systematic reviews, (3) 45 articles that were case-only studies, (4) 2 articles that were case-control studies without HPV DNA-positivity in both groups [16,17], and (5) 2 articles published using duplicate samples [18,19]. The older publication in 2005 by Tsai et al. [18] was excluded because the samples used were the same as those in a 2007 publication [20] by the same group. In addition, the studies published in 2009 by Lawson et al. [19] and Hang et al. [21] used the same DNA specimens as each other; of these, Lawson et al. [19] was excluded based on the suitability of the hypothesis for our study.
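The sequential exclusion counts can be checked with simple arithmetic. The sketch below uses only the counts stated in the text; the variable names and category labels are illustrative and are not taken from Figure 1.

```python
# Counts as stated in the text; names and labels are illustrative only.
references = 85
cited_and_related = 8122
total = references + cited_and_related

# Exclusion criteria in the order they were applied.
exclusions = {
    "different hypothesis": 8113,
    "expert or systematic reviews": 21,
    "case-only studies": 45,
    "no HPV DNA-positivity in both groups": 2,
    "duplicate samples": 2,
}

remaining = total
for reason, n_excluded in exclusions.items():
    remaining -= n_excluded
    print(f"after excluding {reason}: {remaining} remaining")
```

Under these counts, the compiled list and the five exclusion steps are consistent with the number of publications carried forward to the analysis.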

Results and discussion
Following the aforementioned exclusion process, 24 publications were selected for the meta-analysis [10,14,. Table 1 summarizes the numbers of HPV DNA-positive and HPV DNA-negative individuals in the case and control groups in these 24 case-control studies, organized according to the nationality of the study subjects, type of DNA specimen, and 3 HPV subtypes. Of these studies, He et al. [28] and Fu et al. [40] ... [21] was used for analyzing the HPV 16 results. Therefore, in the 22 publications of case-control studies excluding the 2 articles that used DNA specimens from the same hospital [21,28], there were 1897 and 948 individuals in the case and control groups, respectively. When categorized by region, there were 10 articles from far-east Asia, 5 from middle-east Asia, and 7 from other regions. By specimen type, 15 articles used PET and 7 used FFT. By HPV subtype, there were 11 articles on HPV 16, 10 on HPV 18, and 5 on HPV 33. Regardless of HPV subtype, the risk of breast cancer was 4.02-fold higher (95 % CI: 2.42–6.68; I-squared = 44.7 %) for HPV DNA-positive individuals (Fig. 2). The Egger test was used to assess publication bias; the bias coefficient was 0.91, which was not statistically significant (p = 0.165) (Fig. 3).
The meta-regression analysis was performed on 26 datasets created around the three subtypes, with nationality, type of tissue, subtype, and publication year as the variables. None of the variables showed statistical significance (not shown).
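For readers unfamiliar with the bias coefficient, Egger's test is commonly formulated as a linear regression of the standardized effect (log OR divided by its standard error) on precision (the reciprocal of the standard error); the intercept is the bias coefficient. The following is a minimal sketch of that regression, not the STATA routine used in this study. The function name and the input values are hypothetical, and the p-value, which would come from a t distribution with n - 2 degrees of freedom, is not computed here.

```python
import math

def egger_intercept(log_ors, ses):
    """Regress the standardized effect (log OR / SE) on precision (1 / SE).

    Returns (intercept, slope, t_statistic, degrees_of_freedom); the
    intercept plays the role of the bias coefficient.
    """
    x = [1.0 / se for se in ses]                    # precision
    y = [b / se for b, se in zip(log_ors, ses)]     # standardized effect
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # Residual variance and ordinary least squares standard error of the intercept.
    rss = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    df = n - 2
    se_intercept = math.sqrt(rss / df * (1.0 / n + x_bar ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")
    return intercept, slope, t_stat, df

# Hypothetical log odds ratios and standard errors, not values from Table 1.
example_log_ors = [1.6, 1.2, 0.9, 1.8, 1.1]
example_ses = [0.8, 0.5, 0.3, 0.9, 0.4]
intercept, slope, t_stat, df = egger_intercept(example_log_ors, example_ses)
print(f"bias coefficient = {intercept:.2f}, t = {t_stat:.2f} on {df} df")
```

An intercept close to zero indicates that small and large studies give similar standardized effects, which is the pattern expected in the absence of small-study effects.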
To satisfy the criteria for proving that a specific virus causes cancer [42], case-control studies must be performed instead of case-only studies [43]. However, tumor-based case-control studies are susceptible to measurement errors [44,45], and thus systematic reviews are needed to overcome this shortcoming.
According to the meta-analysis of results from 22 case-control studies, the risk of breast cancer due to HPV infection was 4.02-fold higher. Even when the results were categorized into four regions, two types of DNA specimen, and two publication periods, the risk of breast cancer due to HPV remained statistically significant. These findings provide supporting evidence for HPV infection as a risk factor for breast cancer. Additionally, the CIs of the SORs calculated in the subgroup analysis overlapped with one another, and the meta-regression analysis showed that none of the 4 variables was a statistically significant source of heterogeneity. These findings support the validity of the SOR calculated in the meta-analysis.
The estimated SOR in this study was similar to previous meta-analysis results (Table 3). However, our meta-analysis drew on results from 22 case-control studies and therefore has a narrower confidence interval, because we were able to retrieve publications that were not selected through electronic search. The list of 22 publications gathered in this manner will be important for updated meta-analyses in the future.
Early study results were confusing due to inappropriate experimental design, small sample sizes, and unstandardized HPV DNA detection methods [11,14,15]. However, Li et al. [12] commented that consistent study results have been reported since 2006. Therefore, we tried to conduct a subgroup analysis dividing the studies into those before and after 2006, but because only 3 of the 21 publications were before 2006, we performed the analysis with 2010 as the cut point. Regarding the region variable, 9 of the 16 studies selected in Zhou et al. [13] had Asian subjects, whereas 15 of the 22 studies in this study did. Thus, in this study, the 15 studies were separated into 10 far-east Asia and 5 middle-east Asia studies before analysis. Also, Zhou et al. [13] reported the difference for each PCR primer even though the CIs of the SORs overlapped. In this study, we used the HPV subtype variable in lieu of the PCR primer variable. That is, we created 26 datasets after dividing HPV into 3 subtypes (16, 18, and 33) and examined the SOR by subtype. Not only did the CIs of the SORs calculated by subtype overlap, but the meta-regression analysis also showed no statistical significance.
Regarding the link between Epstein-Barr virus infection and breast cancer, it has been argued that different kinds of control tissue cause heterogeneity [46]. Of the 22 selected studies, we found that only 2 studies used adjacent normal cells from the cancer tissue [24,41], and the remaining 20 studies used normal breast cells of non-cancer tissues. Therefore, an additional analysis by type of control tissue was not performed.
It has been proposed that not only HPV but also herpesvirus, polyomavirus, and beta retrovirus increase the risk of breast cancer [47]. Proving these theories related to viral infection is of great significance because it opens up the possibility of using antiviral drugs to treat breast cancer and vaccines to prevent breast cancer [8,48].

Conclusions
In conclusion, this meta-analysis supports the hypothesis that HPV infection is a risk factor for breast cancer. In the near future, it is anticipated that nested case-control studies will be actively performed, along with age-matched case-control studies.

Search and selection of related articles
Since we were using 3 previously published systematic reviews [11,12,13], we used the hand search method rather than the electronic search method [49,50]. Publications were first identified by searching the references of the articles selected in these 3 systematic reviews. Then, the lists of "cited articles" and "similar (related) articles" provided by the PubMed (www.ncbi.nlm.nih.gov/pubmed) and Scopus (www.elsevier.com/solutions/scopus) databases for each article were also considered for inclusion. This search strategy assumes that studies conducted with the 'same research hypothesis' have a high possibility of being cited in related articles and that they will have similar findings [51].
The final selection criterion was case-control studies that detected HPV DNA in the tissue. Based on the titles and abstracts of the papers in the compiled list, the following 5 exclusion criteria were applied sequentially: (1) articles with a different hypothesis, (2) expert reviews or systematic reviews, (3) case-only studies, (4) case-control studies without HPV DNA-positivity in both groups, and (5) articles published using the same DNA samples as another study. The case-control studies remaining after applying these 5 criteria were selected as publications for the final analysis.

Statistical analysis
Two researchers applied the exclusion criteria to each publication and retrieved the HPV-related data: the number of HPV DNA-positive and HPV DNA-negative individuals in the case and control groups, nationality of study subjects, type of DNA specimen, HPV subtype, and publication period. Using the number of HPV DNA-positive and HPV DNA-negative individuals in the case and control groups, the OR and 95 % CI were calculated for each article. Based on the prevalence of HPV subtypes reported by Zhou et al. [13], data on the high-risk type-specific HPV 16, 18, and 33 were organized separately. Based on the nationality of study subjects, studies were categorized into far-east Asia (Korea, China, and Japan), middle-east Asia (Turkey, Iran, and Iraq), America, and Europe & Oceania regions. Specimen types were classified into PET and FFT groups. Publication year was divided into 2 groups with 2010 as the cut point.
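The per-article OR and 95 % CI can be derived from the four cell counts of each study's 2x2 table. The following is a minimal sketch using the log (Woolf) method, which is an assumption, since the text does not state which interval method was applied. The 0.5 correction for zero cells is likewise an assumption, and the function name and example counts are hypothetical rather than taken from Table 1.

```python
import math

def odds_ratio_ci(case_pos, case_neg, ctrl_pos, ctrl_neg, z=1.96):
    """Return (OR, lower, upper) for a 2x2 table using the log (Woolf) method."""
    cells = [case_pos, case_neg, ctrl_pos, ctrl_neg]
    # Assumption: add 0.5 to every cell when any cell is zero, so that the
    # OR and its standard error stay finite.
    if any(c == 0 for c in cells):
        cells = [c + 0.5 for c in cells]
    a, b, c, d = cells
    log_or = math.log((a * d) / (b * c))
    # Standard error of the log odds ratio.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (
        math.exp(log_or),
        math.exp(log_or - z * se),
        math.exp(log_or + z * se),
    )

# Hypothetical counts: 20 of 50 cases and 5 of 50 controls are HPV DNA-positive.
or_, lower, upper = odds_ratio_ci(20, 30, 5, 45)
print(f"OR = {or_:.2f}, 95 % CI: {lower:.2f} to {upper:.2f}")
```

Working on the log scale keeps the interval asymmetric around the OR, which is why the limits are exponentiated only at the end.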
The presence of heterogeneity in the meta-analysis was assessed using the I-squared value (%). The summary odds ratio (SOR) and its 95 % CI were calculated first under a random effect model, because when the I-squared value is 0.0 %, a random effect model and a fixed effect model give the same value. To assess publication bias, Egger's test for small-study effects was conducted [52]. Additionally, a subgroup analysis and a meta-regression analysis were conducted using the 4 variables thought to be potential causes of heterogeneity in risk: geographic region, HPV DNA source, publication year, and subtype of HPV. A P-value of less than 0.05 was considered statistically significant, and the STATA version 14 statistics program (www.stata.com) was used.
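The relationship between I-squared and the two models can be illustrated with a short sketch. The text states only that a random effect model was fitted in STATA; the DerSimonian-Laird estimator of the between-study variance used below is an assumption, and the function name and inputs are hypothetical.

```python
import math

def random_effects_summary(log_ors, ses, z=1.96):
    """Pool log odds ratios with a DerSimonian-Laird random effect model."""
    w = [1 / se ** 2 for se in ses]                      # fixed effect weights
    fixed = sum(wi * yi for wi, yi in zip(w, log_ors)) / sum(w)
    # Cochran's Q and I-squared (in percent).
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, log_ors))
    df = len(log_ors) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    # Between-study variance, truncated at zero.
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    w_re = [1 / (se ** 2 + tau2) for se in ses]          # random effect weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, log_ors)) / sum(w_re)
    se_pooled = math.sqrt(1 / sum(w_re))
    return {
        "sor": math.exp(pooled),
        "ci": (math.exp(pooled - z * se_pooled), math.exp(pooled + z * se_pooled)),
        "fixed_sor": math.exp(fixed),
        "i2": i2,
        "tau2": tau2,
    }

# Hypothetical inputs: identical effects, so Q is (numerically) zero.
homogeneous = random_effects_summary([math.log(2)] * 3, [0.5, 0.4, 0.3])
# Hypothetical inputs: two studies with clearly different effects.
heterogeneous = random_effects_summary([0.0, 2.0], [0.5, 0.5])
print(homogeneous["i2"], heterogeneous["i2"])
```

When Q does not exceed its degrees of freedom, both I-squared and the between-study variance are truncated to zero, the random effect weights reduce to the fixed effect weights, and the two models return the same SOR, which is the property the paragraph above relies on.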