Molecular and phylogenetic analysis of HIV-1 variants circulating in Italy

Objective The continuous identification of HIV-1 non-B subtypes and recombinant forms in Italy indicates the need for constant molecular epidemiological surveillance of the genetic forms circulating and transmitted in the resident population. Methods The distribution of HIV-1 subtypes was evaluated in 25 seropositive individuals residing in Italy, most of whom were infected through a sexual route during the 1995–2005 period. Each sample was characterized by detailed molecular and phylogenetic analyses. Results 18 of the 25 samples were positive at HIV-1 PCR amplification. Three samples showed a nucleotide divergence compatible with a non-B subtype classification. The phylogenetic analysis, performed on both HIV-1 env and gag regions, confirms the molecular sub-typing prediction, given that 1 sample falls into the C subtype and 2 into the G subtype. The B subtype isolates show high levels of intra-subtype nucleotide divergence, compatible with a long-lasting epidemic and a progressive HIV-1 molecular diversification. Conclusion The Italian HIV-1 epidemic is still mostly attributable to the B subtype, regardless of the transmission route, and this subtype shows an increasing nucleotide heterogeneity. Heterosexual transmission and interracial blending, however, are slowly introducing novel HIV-1 subtypes. Therefore, molecular monitoring is needed to follow the constant evolution of the HIV-1 epidemic.


Introduction
Human immunodeficiency virus type 1 (HIV-1) shows an extensive genetic variability and can be classified into 9 phylogenetic subtypes (A-K), which are approximately equidistant from one another, and several circulating recombinant forms (CRFs), resulting from recombination events occurring between different HIV-1 subtypes co-circulating in a specific geographic region [1].
The first phase of the HIV epidemic in Italy was mainly confined to the injecting drug users (IDUs) risk group, with an absolute predominance of HIV-1 B subtype, in accordance with other Western countries. In particular, among the total AIDS cases reported in the adult population between 1982 and 2006, 56.0% were IDUs (including homosexual IDUs), with similar percentages in men and women (57.3% and 51.5%, respectively) [2]. The annual percentage of AIDS cases reported in IDUs has gradually decreased from 65.8% in 1987 to 27.6% in 2006 [2], in part as a consequence of prevention programs implemented in Italy to discourage syringe sharing [3,4]. In parallel, the overall AIDS cases reported in heterosexual individuals account for 19.5% of total epidemic cases, with a significantly higher percentage in women than in men (41.2% vs 13.6%). However, the annual percentage of AIDS cases related to heterosexual transmission has dramatically increased over the years, becoming in 2006 the most prevalent risk factor for AIDS (40.4%) [2].
Although almost 25% of heterosexual individuals diagnosed with AIDS in Italy are partners of long-term HIV-1 infected individuals, carrying a "historical" B-subtype virus, more than 10% of them are either immigrants from endemic regions for HIV-1 (6.87%) or their Italian partners (3.03%), while the risk is unidentified for 64% of them [2]. This epidemiological evidence, based only on the reported AIDS cases and not considering the HIV-1 infections acquired while travelling abroad, suggests that at least 10% of the viruses transmitted through heterosexual contacts could potentially belong to non-B subtypes and CRFs. In fact, HIV-1 isolates genetically related to subtypes novel to the Italian epidemic have recently been identified and described with increasing frequency [5-17].
In this framework, the introduction and the possible spread of different HIV-1 subtypes and/or recombinant forms, which could require the future development of adequate diagnostic, treatment, and prevention strategies, needs to be constantly monitored.
For the present study, 25 HIV-seropositive individuals residing in Italy, with an HIV infection diagnosed in the period 1995-2005, were enrolled at sentinel Centers. The molecular study was performed on the hypervariable C2-V3 region of the env gene as well as on the more conserved 5' region of the gag p17 sequence. Three non-B-subtype HIV-1 isolates were identified and phylogenetically classified as C (1 isolate) and G (2 isolates) subtype.

Sample Collection
Blood samples were collected from 25 HIV-positive individuals attending Italian sentinel Centers in Bologna, Parma and Naples. For all of them, HIV infection was diagnosed during the 1995 to 2005 period, and most participants were infected through sexual contact (18 of 25, 72%). HIV-1 infection was diagnosed by immunologic methods (ELISA, Western blot), and the viral load was evaluated by viral RNA quantification. The full designation of samples, according to WHO-proposed nomenclature, is CI05.00XE or CI05.00XG, where CI stands for the city of enrolment, 05 stands for the year of study, 00 for the enrolment number and E (or G) stands for env (or gag). For the sake of simplicity, however, in this paper the samples are indicated in the shortened form CI05.01 (e.g. BO05.01).

Polymerase Chain Reaction (PCR)
Peripheral blood mononuclear cells (PBMCs) were purified from fresh HIV-1-positive blood samples by Leucoprep density gradient centrifugation, and cellular lysates (approximately 6 × 10^6 cells) were prepared by Proteinase K digestion at 56°C.
The quality of target DNA was verified by PCR amplification of the housekeeping p53 cellular gene. Amplification of the HIV-1 V3-V5 region of the env gene and of the p17 region of the gag gene was performed by nested PCR, using 1.5 × 10^5 cells (corresponding to approximately 1 μg of genomic DNA) as a template. The V3-V5 region of the HIV-1 env gene (666 bp) was amplified, as previously described, using the primer pairs ED5-ED12 and ES7-ES8 for the first and the second round of amplification, respectively [31,32]. The p17 region (474 bp) was amplified, as previously described, using the primer pairs CL1028-CL1033 and CL1029-CL1032 for the first and the second round of amplification, respectively [14,33].

DNA Sequencing
Direct sequencing reactions were performed on PCR products purified with a rapid method developed in our laboratory, following the Sequenase protocol (United States Biochemical, Cleveland OH), modified in the labelling step (3 minutes on ice) [31,34]. The internal annealing oligonucleotides, V3B and GAG B SENSE (annealing to a 19 bp fragment of the C2-V3 region and a 17 bp fragment of the p17 region, respectively), were used to prime sense sequence reactions. Sequences were then analyzed on a 6% polyacrylamide wedge sequencing gel.

Analysis of Sequences
The env and gag sequences obtained were aligned. Multiple sequence alignments were performed with the MegAlign application of the Lasergene software (DNASTAR Inc., Madison, WI) using the Clustal method. Phylogenetic trees were generated by using the neighbor-joining method with the PHYLIP software package (version 3.52c; Joseph Felsenstein, University of Washington). Briefly, the SEQBOOT program was run to generate 100 data sets that represent randomly re-sampled versions of the input-aligned sequences, in order to test the reliability of the final tree topology. Evolutionary distances were estimated by the DNADIST program, using either the Kimura 2-parameter method or the maximum likelihood distance method, and the phylogenetic relationships were determined by the NEIGHBOR program. A consensus tree was constructed using the CONSENSE program with the majority rule criterion and was drawn with the NJPLOT application.
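To make the two numerical steps above more concrete, the following minimal sketch shows the textbook closed-form Kimura 2-parameter distance and one round of column resampling with replacement, which is the idea behind bootstrap data sets. It is an illustration only and is not the code of PHYLIP's DNADIST or SEQBOOT programs; the function names `k2p_distance` and `bootstrap_alignment` and the toy sequences are assumptions introduced here.

```python
import math
import random

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Closed-form Kimura 2-parameter distance between two aligned sequences.

    Columns where either sequence has a gap or an ambiguous base are skipped.
    """
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        # A<->G and C<->T are transitions; every other change is a transversion.
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q); math.log raises ValueError
    # when the sequences are too divergent for the estimate to exist.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


def bootstrap_alignment(alignment, rng):
    """Return one pseudo-replicate: alignment columns drawn with replacement."""
    length = len(next(iter(alignment.values())))
    columns = [rng.randrange(length) for _ in range(length)]
    return {name: "".join(seq[i] for i in columns)
            for name, seq in alignment.items()}


if __name__ == "__main__":
    # Hypothetical toy alignment, not data from this study.
    toy = {"s1": "ACGTACGTAC", "s2": "ACGTACGTAT", "s3": "GCGTTCGTAC"}
    rng = random.Random(0)
    replicates = [bootstrap_alignment(toy, rng) for _ in range(100)]
    print(len(replicates), k2p_distance(toy["s1"], toy["s3"]))
```

In a full analysis, each of the 100 replicates would yield its own distance matrix and neighbor-joining tree, and the majority-rule consensus would summarize how often each grouping recurs.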

Amplification of Italian samples by PCR
The C2-V5 region of the HIV-1 env gene and the p17 region of the gag gene were amplified by nested PCR using primers and conditions previously described [14]. Overall, 60% of the samples (15 of 25) were positive in the PCR amplification reactions for both env and gag subgenomic regions. Considered separately, 64% of the samples (16 of 25) were positive for env amplification and 68% (17 of 25) for gag amplification. Therefore, one sample (BO07.14) was positive only in env and two samples (NA05.05 and NA05.06) were positive only in gag; the remaining 7 samples were negative for both sub-genomic regions. The negative results might be the consequence of either mutations in the primers' annealing regions, reducing the melting temperature, or a proviral DNA quantity below the sensitivity threshold. An example of the results obtained by nested PCR for env (666 bp) and gag (474 bp) is shown in Figure 1A and 1B.
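The counts reported above can be cross-checked with simple inclusion-exclusion arithmetic. The short sketch below uses only the figures given in the text; the variable names are illustrative.

```python
# Counts reported in the text.
total = 25
env_positive = 16
gag_positive = 17
both_positive = 15

# Samples positive in exactly one region.
env_only = env_positive - both_positive
gag_only = gag_positive - both_positive

# Inclusion-exclusion: samples positive in at least one region.
any_positive = env_positive + gag_positive - both_positive
both_negative = total - any_positive

print(env_only, gag_only, any_positive, both_negative)  # 1 2 18 7
print(100 * both_positive / total,
      100 * env_positive / total,
      100 * gag_positive / total)  # 60.0 64.0 68.0
```

The 18 samples positive in at least one region match the figure given in the abstract, and the 7 double-negative samples match the remainder reported here.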

Molecular analysis of env gene
The nucleotide sequence analysis was performed on the C2-V3 region of the env gene, directly from the PCR products without a sub-cloning step. [...] (Figure 2A). Nevertheless, specific samples show a >30% divergence versus the B clade, strongly suggesting a non-B clade classification. In particular, the sample PR06.07 shows an average divergence of 15% versus CRF02_AG and 23.2% versus all other clades (Figure 2B); the sample PR06.03 shows an average divergence of 18% versus the C clade and 22.7% versus all other clades (Figure 2C). Therefore, the results suggest a B-clade classification for the vast majority of the analyzed Italian sequences and a C-clade and CRF02_AG classification for the latter samples.
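The divergence-based subtype prediction described above can be illustrated with a small sketch: compute the average percentage divergence of a query sequence against the reference sequences of each clade, then take the clade with the lowest value as the candidate classification. This sketch uses a plain uncorrected p-distance, which is an assumption and not necessarily the measure computed by MegAlign; the function names and the toy sequences are hypothetical and are not data from this study.

```python
def p_distance(seq1, seq2):
    """Fraction of differing sites, ignoring gaps and ambiguous bases."""
    pairs = [(a, b) for a, b in zip(seq1.upper(), seq2.upper())
             if a in "ACGT" and b in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def mean_divergence_by_clade(query, references):
    """Average percentage divergence of the query versus each clade."""
    return {
        clade: 100 * sum(p_distance(query, ref) for ref in refs) / len(refs)
        for clade, refs in references.items()
    }


def closest_clade(query, references):
    """Clade with the lowest average divergence from the query."""
    divergence = mean_divergence_by_clade(query, references)
    return min(divergence, key=divergence.get)


if __name__ == "__main__":
    # Hypothetical aligned toy sequences, for illustration only.
    references = {
        "B": ["ACGTACGTAA", "ACGTACGTAC"],
        "C": ["TTGTACGAAC", "ACGAACGTTT"],
    }
    query = "ACGTACGTAC"
    print(mean_divergence_by_clade(query, references))
    print(closest_clade(query, references))
```

A prediction of this kind is only a first screen; in the study it is the phylogenetic analysis that confirms the classification.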

Molecular analysis of gag gene
The nucleotide sequence analysis was performed on the p17 region of the gag gene, directly from the PCR products without a sub-cloning step. As for the env gene, the HIV-1 nucleotide sequences were aligned with the Clustal method to HIV-1 reference standards of different subtypes, groups and CRF02_AG, in order to determine the homology values between the analyzed samples. [...] (Figure 3A). Nevertheless, specific samples show a >22% divergence versus the B clade, strongly suggesting a non-B clade classification. In particular, the sample PR06.07, also in the gag region, shows an average divergence of 13% versus CRF02_AG and 24.34% versus all other clades (Figure 3B). Similarly, the sample PR06.03, also in the gag region, shows an average divergence of 22% versus the C clade and 32.08% versus all other clades (Figure 3D). Finally, the sample NA05.05, negative for env, shows an average divergence of 13.9% versus CRF02_AG and 25.31% versus all other clades (Figure 3C). Therefore, also in the gag region, the results suggest a B-clade classification for the vast majority of the analyzed Italian sequences and confirm a C-clade or CRF02_AG classification for the same samples identified in env. Moreover, all isolates show a concordant subtype classification in both env and gag sub-genomic regions, suggesting the absence of intra-genomic recombination events.

Figure 1. Analysis of DNA fragments obtained by nested PCR on the C2-V5 env (A) and the p17 gag (B) genes of HIV-1.

Peptide analysis and comparison
The V3 and p17 region amino acid sequences were deduced for each isolate by computer analysis and aligned following their subtype classification. The V3 region of the Italian B-subtype isolates identified in this study shows an amino acid variability mainly localized outside the V3 loop region. The consensus derived from the alignment, in fact, shows that the amino acid residues conserved in 100% of the aligned sequences are all localized in the V3 loop (9 of 35, 25.71%). However, besides the subtype-specific "fingerprint" GPGR sequence at the tip of the V3 loop, less-represented sequences have been identified at the tip (GPGG, GPGS, GPGQ), suggesting a constant diversification in the B clade (Figure 6A).
With regard to the non-B clade isolates identified in this study, which are distributed in two different clades (C and CRF02_AG), the number of samples is too small to derive a consensus sequence or to draw conclusions; nevertheless, it is worth mentioning that they all show the "fingerprint" GPGQ tetramer at the tip of their V3 loop sequences (data not shown).

Moreover, mutations in amino acid residues conferring resistance to fusion/binding inhibitors, in association with other mutations along the env sequence not analyzed in the present study, have been observed (e.g. Q296K), suggesting the transmission of isolates resistant to this class of antiretrovirals (Figure 6A) [35-37].

Figure 3. Average nucleotide divergence of p17 gag gene versus standard sequences of different clades.
The p17 region of the B clade isolates shows a low amino acid variability, and the consensus derived from the alignment shows an overall rate of amino acid conservation of 44.44% (36 of 81 residues). None of the observed amino acid changes is found in gag residues conferring drug resistance (Figure 6B). Similar results are also observed for non-B clade isolates.
[...] observed divergence values are compatible with samples identified in a geographical area characterized by a long-lasting HIV epidemic and confirm the more pronounced genetic evolution of the env gene compared to gag.
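The conservation percentages reported in this section (9 of 35 residues, 36 of 81 residues) are the share of alignment positions at which all sequences carry the same amino acid. A minimal sketch of that calculation follows; the function names are illustrative and the toy peptides are hypothetical fragments, not the study's alignment.

```python
def conserved_positions(aligned_peptides):
    """Indices where every aligned sequence carries the same residue."""
    length = len(aligned_peptides[0])
    if any(len(seq) != length for seq in aligned_peptides):
        raise ValueError("sequences must be aligned to the same length")
    return [
        i for i in range(length)
        if aligned_peptides[0][i] != "-"
        and len({seq[i] for seq in aligned_peptides}) == 1
    ]


def conservation_rate(aligned_peptides):
    """Percentage of alignment positions conserved in 100% of sequences."""
    conserved = conserved_positions(aligned_peptides)
    return 100 * len(conserved) / len(aligned_peptides[0])


if __name__ == "__main__":
    # Hypothetical toy fragments built around the tetramers named in the text.
    toy = ["GPGRAF", "GPGGAF", "GPGSAY"]
    print(conserved_positions(toy))
    print(round(conservation_rate(toy), 2))
    # Percentages reported in the text, recomputed from the stated counts.
    print(round(100 * 9 / 35, 2), round(100 * 36 / 81, 2))
```

The last line recomputes the two reported percentages from the stated residue counts and gives 25.71 and 44.44.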

Discussion
The phylogenetic analysis confirmed the subtype prediction based on nucleotide divergence for both B and non-B-subtype classification; in particular, of the 3 non-B isolates, 2 cluster with CRF02_AG (NA05.05 and PR06.07) and 1 with the C subtype (PR06.03). Moreover, for all the B- and non-B-subtype sequences, the phylogenetic classification matches in the gag and env sub-genomic regions, suggesting the absence of intra-genomic recombination events.
The phenetic analysis of the V3 region shows a significant amino acid stability in 9 residues of the V3 loop for B-subtype isolates, confirming the strong selection for specific sequences involved in strategic functions regarding the immune response as well as cellular tropism and transmission. The different HIV-1 isolates identified in the current study show the subtype-specific tetrameric sequence at the apex of the V3 loop (GPGR for B clade, GPGQ for non-B clades), which is considered the target for the anti-V3 neutralizing antibodies. However, less-represented sequences have been identified at the tip of the B clade isolates (GPGG, GPGS, GPGQ), suggesting a constant diversification in this clade.
The overall results suggest that the B subtype is still largely predominant in the HIV-1 epidemic in Italy and is circulating among all risk groups. In contrast, HIV-1 non-B subtypes in Italy are strictly associated with heterosexual transmission and are identified in infections acquired in the most recent period (2001-2005). These results confirm that in Italy, as in other Western European countries, non-B subtypes or recombinant forms are introduced by immigrants/migrants and transmitted at a low rate to the indigenous population. This would also explain the lower prevalence of non-B subtypes in Italy compared with other European countries with an older tradition of immigration waves and much tighter historic and economic links with African countries.
The presented data are representative of a nationwide molecular survey and, regardless of the small sample size, are part of recurrent studies giving a constantly updated picture of the genetic evolution of the HIV-1 epidemic in different risk groups in Italy.

Figure 5. Phylogenetic classification of the p17 region of gag gene. The p17 region of the gag gene from Italian samples has been aligned with standard sequences of HIV-1 group M, including some known CRFs. Sequences from groups N and O have been used as outgroup. Italian sequences are underlined. Reliability has been estimated by bootstrap analysis. The bar shows a 10% divergence.

http://www.infectagentscancer.com/content/3/1/13