Mannose-binding lectin 2 gene polymorphisms and their association with tuberculosis in a Chinese population

Background Immune- and inflammation-related genes (IIRGs) play an important role in the pathogenesis of tuberculosis (TB). However, the relationship between IIRG polymorphisms and TB risk remains unknown. In this study, the gene polymorphisms and their association with tuberculosis were determined in a Chinese population.
Methods We performed a case-control study involving 1016 patients with TB and 507 healthy controls of Han Chinese origin. Sixty-four single-nucleotide polymorphisms (SNPs) belonging to 18 IIRGs were genotyped by the PCR-MassArray assay, and the obtained data were analyzed with the χ²-test, Bonferroni correction, and unconditional logistic regression analysis.
Results We observed significant differences in the allele frequency of LTA rs2229094*C (P = 0.015), MBL2 rs2099902*C (P = 0.001), MBL2 rs930507*G (P = 0.004), MBL2 rs10824793*G (P = 0.004), and IL12RB1 rs2305740*G (P = 0.040) between the TB and healthy groups. Increased TB risk was identified in the rs930507 G/G genotype (Padjusted = 0.027) under a codominant genetic model as well as in the rs2099902 (C/T + C/C) vs T/T genotype (Padjusted = 0.020), rs930507 (C/G + G/G) vs C/C genotype (Padjusted = 0.027), and rs10824793 (G/A + G/G) vs A/A genotype (Padjusted = 0.017) under a dominant genetic model after Bonferroni correction in the analysis of the overall TB group rather than the TB subgroups. Furthermore, the rs10824793_rs7916582*GT and rs10824793_rs7916582*GC haplotypes were significantly associated with increased TB risk (P = 0.001, odds ratio [OR] = 1.421, 95% confidence interval [CI]: 1.152–1.753; and P = 0.018, OR = 1.364, 95% CI: 1.055–1.765, respectively). Moreover, the rs10824793_rs7916582*AT/AT diplotype showed a protective effect (P = 0.003, OR = 0.530, 95% CI: 0.349–0.805), whereas the rs10824793_rs7916582*GT/GT diplotype showed a harmful effect (P = 0.009, OR = 1.396, 95% CI: 1.087–1.793) on the development of TB.
Conclusions This study indicated that MBL2 polymorphisms, haplotypes, and diplotypes were associated with TB susceptibility in the Han Chinese population. Additionally, larger sample size studies are needed to further confirm these findings in the future.


Background
Tuberculosis (TB) is a global infectious disease in humans. It is a severe and even lethal disease and was responsible for 1.2 million deaths worldwide in 2018 [1]. The main reason worldwide TB eradication is so difficult is that smear-positive TB patients are the most important source of infection. They often transmit the TB bacterium via droplets produced by coughing, sneezing, etc. It was found that a TB patient typically infects 10–15 people from the onset of the disease until diagnosis and treatment and that these infected people can, in turn, become new sources of infection. A healthy person's chances of being infected with Mycobacterium tuberculosis depend on the number of droplets inhaled, the duration of exposure, and the individual's immune status.
However, previous studies have mostly focused on the association between polymorphisms in one or several related genes and susceptibility to TB, rather than multiple IIRGs. It is well known that the interaction between M. tuberculosis and its host leads to a very complex immune response. As such, studies that focus on a single gene or a small number of genes may overlook potential associations between multiple genes: for example, they may ignore linkage disequilibrium between numerous single-nucleotide polymorphisms (SNPs). Therefore, it is imperative to study the association between various SNPs and susceptibility to TB in as many IIRGs as possible.
In this study, 64 SNPs in 18 IIRGs were selected, and the association between these SNPs and TB risk was evaluated using the polymerase chain reaction (PCR)-MassArray method in a large case-control population of Han Chinese origin.

Patients, controls, and ethics statement
This case-control study was performed in the 8th Medical Center of the Chinese PLA General Hospital (Beijing, China) from June 2009 to March 2019 and was approved by the Research Ethics Committee of the 8th Medical Center of the Chinese PLA General Hospital. All DNA samples were extracted from residual blood after a liver function test. Informed consent was obtained from all participants. In total, 1016 patients (597 males and 419 females, mean age 39.5 ± 19.3 years) with a TB diagnosis according to smear, M. tuberculosis culture, radiological examination, and histological examination were randomly included from the patients in the 8th Medical Center of the Chinese PLA General Hospital (Beijing, China). In the same period, 507 healthy volunteers (289 males and 218 females, mean age 51.8 ± 10.6 years) with retrospectively confirmed non-tuberculous diseases were included from the physical examination center of the 8th Medical Center of the Chinese PLA General Hospital. All TB patients and controls were HIV-negative.

DNA extraction
Blood samples (2 ml) from each participant were collected (the residual portion of the blood samples obtained for a liver function test) and stored in citrate-anticoagulated glass tubes at −40°C until use. The Whole Blood DNA Extraction Kit (Tiangen Biotech, Co., Ltd., Beijing, China) was used to extract total genomic DNA from 1 ml of the stored blood samples, following the manufacturer's instructions. Then, the extracted genomic DNA was resuspended in 0.1 × Tris-EDTA buffer (10 mmol/L Tris, 1 mmol/L EDTA, pH 8.0) and stored at −20°C.

Screening of target SNPs
Data from the International HapMap Project (http://hapmap.ncbi.nlm.nih.gov) were used to screen potential SNPs using an estimated r² threshold of > 0.8 for the untyped SNPs as reported in a previous study [15]. The genotype data for the Han Chinese population were obtained from the Haploview 4.2 program (http://www.broad.mit.edu/haploview) and used to select SNPs that have a minor allele frequency (MAF) of > 0.05.
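The MAF criterion above can be illustrated with a minimal sketch. The function names and the genotype counts are hypothetical and serve only to show the arithmetic; this is not the authors' screening pipeline, which relied on HapMap data and Haploview.

```python
def minor_allele_frequency(n_AA, n_Aa, n_aa):
    """Return the minor allele frequency from the three genotype counts."""
    total_alleles = 2 * (n_AA + n_Aa + n_aa)
    if total_alleles == 0:
        raise ValueError("no genotypes supplied")
    freq_A = (2 * n_AA + n_Aa) / total_alleles
    # The minor allele is whichever of the two alleles is rarer.
    return min(freq_A, 1.0 - freq_A)


def select_common_snps(genotype_counts, maf_threshold=0.05):
    """Keep SNP ids whose MAF is strictly greater than the threshold."""
    return [
        snp
        for snp, counts in genotype_counts.items()
        if minor_allele_frequency(*counts) > maf_threshold
    ]


# Hypothetical genotype counts (AA, Aa, aa) for two made-up SNPs.
example_counts = {
    "snp_common": (60, 30, 10),  # MAF = (2*10 + 30) / 200 = 0.25
    "snp_rare": (96, 4, 0),      # MAF = 4 / 200 = 0.02
}

if __name__ == "__main__":
    print(select_common_snps(example_counts))
```

With the threshold of 0.05 used in the text, only the first of the two hypothetical SNPs would be retained.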
(3) Extension: The concentrations of the extension primers were adjusted to equilibrate the signal-to-noise ratios. Then, termination mix (100 μmol), DNA polymerase (0.05 U, Sequenom, Inc., San Diego, United States), and extension primers (625 to 1250 nmol/L) in a final volume of 9 μl were pooled together and detected using an iPLEX Gold Kit (Sequenom, Inc., San Diego, United States) at 94°C for 30 s, followed by 5 s at 94°C and 5 cycles of 5 s at 52°C and 5 s at 80°C. An additional 40 annealing and extension cycles were then performed, with 5 s at 94°C and 5 cycles of 5 s at 52°C and 5 s at 80°C. The final extension was carried out at 72°C for 3 min; then, the sample was cooled to 4°C. (4) MALDI-TOF-MS: The samples were then manually desalted using 6 mg of clean resin and a dimple plate and subsequently transferred to a 384-well Spectro-CHIP (Sequenom, Inc., San Diego, United States) using a nano-dispenser. The mass spectra were acquired using the Compact Mass Spectrometer and analyzed via the MassArray Typer 4.0 Software (Sequenom, Inc., San Diego, United States). The PCR assay was performed with two no-template controls and four duplicated samples in each 384-well format as quality controls. Each genotyping result was generated and analyzed by laboratory staff who were unaware of the patient's status.

Statistical analyses
All statistical analyses were performed using the Stata statistical package (version 10.0; StataCorp LP, College Station, TX, USA), and all P values were two-tailed. The statistical differences in allele and genotype frequencies between the TB and control groups were evaluated using the χ²-test. In the χ²-test, Bonferroni-corrected P values of < 0.05 were considered significant. The Hardy-Weinberg Equilibrium (HWE) was tested via the χ²-test for goodness of fit using a web program (http://ihg.gsf.de/cgi-bin/hw/hwa1.pl). Moreover, Akaike's information criterion was used to select the genetic model with maximum parsimony for each SNP. Odds ratios (ORs) as well as 95% confidence intervals (CIs) were calculated via unconditional logistic regression analysis with adjustment for age and gender.
The pairwise linkage disequilibrium (LD) among the SNPs was determined using Lewontin's standardized coefficient D' and the LD coefficient r² as described in a previous study [17], whereas haplotype blocks were defined in Haploview 4.2 (https://www.broadinstitute.org/haploview/haploview) with default settings following the criteria published in a previous study [18]. In addition, the haplotype frequencies were estimated using the PHASE 2.1 Bayesian algorithm [19] and HAPLO.STATS [20]. The haplotypes were then pooled into a combined group if their frequency was less than 0.03. Empirical P values, based on 100,000 simulations, were computed for the global score test and each of the haplotype-specific score tests. The diplotype (haplotype dosage, an estimate of the number of copies of the haplotype) was the most probable haplotype pair for each individual. Unconditional logistic regression analysis was used to evaluate the ORs and 95% CIs for participants carrying 1 to 2 copies versus 0 copies of each common haplotype for the dichotomized diplotypes.
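As an illustration of the HWE goodness-of-fit test mentioned above, the following sketch computes the χ² statistic with one degree of freedom from the three genotype counts of a biallelic SNP. It is an assumed, simplified stand-in for the web program used in the study; the function names and example counts are hypothetical.

```python
import math


def chi2_sf_1df(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(x / 2.0))


def hwe_chi2_test(n_AA, n_Aa, n_aa):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    Returns (chi2, p_value) for one biallelic SNP.
    """
    n = n_AA + n_Aa + n_aa
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_AA + n_Aa) / (2 * n)  # frequency of allele A
    q = 1.0 - p                      # frequency of allele a
    observed = (n_AA, n_Aa, n_aa)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    # Skip classes with zero expected count (monomorphic SNPs).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return chi2, chi2_sf_1df(chi2)


if __name__ == "__main__":
    # Hypothetical control-group counts that sit exactly at HWE proportions.
    print(hwe_chi2_test(25, 50, 25))
    # Hypothetical counts with no heterozygotes, a strong departure from HWE.
    print(hwe_chi2_test(50, 0, 50))
```

A SNP would be considered consistent with HWE when the returned P value exceeds the chosen cutoff.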
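For readers unfamiliar with the two LD measures, a minimal sketch of how D' and r² follow from two-locus haplotype frequencies is given below. The input frequencies are hypothetical; in the study these quantities were obtained with Haploview, so this is an illustration of the definitions rather than the software's implementation.

```python
def ld_coefficients(p_AB, p_Ab, p_aB, p_ab):
    """Return (D, |D'|, r^2) for two biallelic loci.

    Arguments are the frequencies of the four haplotypes AB, Ab, aB, ab.
    """
    total = p_AB + p_Ab + p_aB + p_ab
    if abs(total - 1.0) > 1e-6:
        raise ValueError("haplotype frequencies must sum to 1")
    p_A = p_AB + p_Ab  # allele A frequency at locus 1
    p_B = p_AB + p_aB  # allele B frequency at locus 2
    d = p_AB - p_A * p_B
    # Lewontin's normalisation: divide |D| by its maximum given allele frequencies.
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r2


if __name__ == "__main__":
    # Hypothetical frequencies: only three of the four haplotypes are present,
    # so D' equals 1 while r^2 stays below 1.
    print(ld_coefficients(0.6, 0.0, 0.2, 0.2))
```

The example shows why both measures are usually reported together: D' reaches 1 whenever one of the four haplotypes is absent, whereas r² reaches 1 only when the two SNPs carry the same information.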

Results
The distribution of 64 SNP alleles in TB patients and healthy controls
One thousand and sixteen patients with a TB diagnosis and 507 healthy controls were recruited. Among the TB patients, 680 (66.9%) had total pulmonary TB (TPTB), including 388 with simple PTB and 74 with simple TB pleurisy (TBP), 166 (16.3%) had extrapulmonary TB (EPTB), and 170 (16.7%) had concomitant PTB and EPTB (PTB + EPTB).
Sixty-four SNPs from 18 IIRGs were selected and genotyped, and all allele distributions in the control group were consistent with HWE (P > 0.01, Table 1). The results showed that the allele distributions of LTA rs2229094*C (P = 0.015), MBL2 rs2099902*C (P = 0.001), MBL2 rs930507*G (P = 0.004), MBL2 rs10824793*G (P = 0.004), and IL12RB1 rs2305740*G (P = 0.040) were significantly different between the TB patients and healthy controls (Table 1), whereas the allele distributions of the other SNPs were not.
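The allele-level comparison reported here amounts to a χ² test on a 2 × 2 table of allele counts (cases vs controls). The sketch below shows the calculation under the assumption of a Pearson χ² test without continuity correction; the genotype counts are hypothetical and are not taken from Table 1.

```python
import math


def allele_counts(n_AA, n_Aa, n_aa):
    """Convert genotype counts into counts of allele A and allele a."""
    return 2 * n_AA + n_Aa, 2 * n_aa + n_Aa


def chi2_2x2(a, b, c, d):
    """Pearson chi-square test (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if min(row1, row2, col1, col2) == 0:
        raise ValueError("a row or column total is zero")
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail, 1 df
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical genotype counts (AA, Aa, aa) in cases and controls.
    case_A, case_a = allele_counts(30, 50, 20)
    ctrl_A, ctrl_a = allele_counts(20, 50, 30)
    print(chi2_2x2(case_A, case_a, ctrl_A, ctrl_a))
```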

The genotypic frequencies of SNPs and their associations with TB risk
When investigating the TB group, the unconditional logistic regression analysis showed that 14 SNPs of IL18R1, IL1A, STAT1, LTA, IFNGR1, MBL2, VDR, and IL12RB1 were associated with TB risk under a codominant genetic model (Table S1) and under dominant and recessive genetic models (Table S2). However, after the Bonferroni correction, only SNPs in the MBL2 gene remained associated with TB risk. Therefore, we next focused on the MBL2 gene. Our results showed that: 1) Under a codominant genetic model (Table 2), the rs2099902 C/T and C/C genotypes, rs930507 C/G genotype, rs10824793 G/A and G/G genotypes, and rs7916582 T/C genotype were associated with increased risk of TB. After the Bonferroni correction, increased TB risk was still observed in patients with a rs930507 G/G genotype (P adjusted = 0.027). 2) Under dominant and recessive genetic models (Table 3), the rs2099902 (C/T + C/C) vs T/T and C/C vs (T/T + C/T) genotypes, rs930507 (C/G + G/G) vs C/C genotype, rs10824793 (G/A + G/G) vs A/A as well as G/G vs (A/A + G/A) genotypes, and rs7916582 (T/C + C/C) vs T/T genotype were associated with increased risk of TB. Interestingly, increased TB risk was still observed for the rs2099902 (P adjusted = 0.020), rs930507 (P adjusted = 0.027), and rs10824793 (P adjusted = 0.017) SNPs under a dominant genetic model after the Bonferroni correction.
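The dominant and recessive contrasts used above, such as (C/T + C/C) vs T/T, can be made concrete with a short sketch that collapses three genotype counts into a 2 × 2 table and computes a crude odds ratio with a Woolf-type 95% CI. This is a simplification: the study used unconditional logistic regression adjusted for age and gender, which this example does not do, and all counts and function names are hypothetical.

```python
import math


def collapse(counts, model):
    """Collapse (ref_hom, het, risk_hom) counts into (exposed, unexposed)."""
    ref_hom, het, risk_hom = counts
    if model == "dominant":   # (het + risk_hom) vs ref_hom
        return het + risk_hom, ref_hom
    if model == "recessive":  # risk_hom vs (ref_hom + het)
        return risk_hom, ref_hom + het
    raise ValueError("model must be 'dominant' or 'recessive'")


def crude_odds_ratio(case_counts, control_counts, model):
    """Unadjusted OR with a 95% CI computed on the log scale (Woolf method)."""
    a, b = collapse(case_counts, model)     # exposed / unexposed cases
    c, d = collapse(control_counts, model)  # exposed / unexposed controls
    if 0 in (a, b, c, d):
        raise ValueError("zero cell: the crude OR is undefined")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    low = math.exp(math.log(odds_ratio) - 1.96 * se_log_or)
    high = math.exp(math.log(odds_ratio) + 1.96 * se_log_or)
    return odds_ratio, low, high


if __name__ == "__main__":
    cases = (40, 40, 20)     # hypothetical T/T, C/T, C/C counts in cases
    controls = (50, 40, 10)  # hypothetical counts in controls
    print(crude_odds_ratio(cases, controls, "dominant"))
    print(crude_odds_ratio(cases, controls, "recessive"))
```

The codominant model differs in that each non-reference genotype is compared separately with the reference homozygote instead of being pooled.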

The distribution of the MBL2 SNP genotype frequency
To further confirm the differences in the distribution of the MBL2 SNP genotype frequency between the TB subgroups (TPTB, PTB, EPTB, and PTB + EPTB) and healthy controls, we performed unconditional logistic regression analysis under codominant, dominant, and recessive genetic models. The results indicated that the rs2099902 C/T and C/C genotypes, rs930507 C/G genotype, rs10824793 G/A and G/G genotypes were associated with increased TB risk in the TB subgroups (Table S3). However, these statistically significant differences [...] (Table S4).

The distribution of the MBL2 haplotypes and diplotypes
To investigate the LD patterns among these four SNPs, we used Haploview to plot their haplotype blocks. We identified one haplotype block composed of rs10824793 and rs7916582 (r² = 0.98). However, the rs2099902 and rs930507 SNPs in the MBL2 gene were outside this haplotype block (Fig. 1). In the haplotype analysis, three common haplotypes (rs10824793_rs7916582*AT, GT, and GC) were observed among the participants; the total percentage of these common haplotypes was as high as 99.85% in the TB group and 99.9% in the control group. Conversely, the total percentage of other haplotypes was only 0.15% and 0.10% in the TB and control groups, respectively (Table 4). The global score test indicated that the haplotype frequencies in this block differed significantly between the TB and control groups (global P = 0.00222, P sim = 0.00207). Interestingly, statistical differences were observed in the frequency of the rs10824793_rs7916582*AT (P = 0.00014) and rs10824793_rs7916582*GT (P = 0.003) haplotypes between the TB and control groups. Moreover, these differences remained significant after the Bonferroni correction (rs10824793_rs7916582*AT, P adjusted = 0.00042; rs10824793_rs7916582*GT, P adjusted = 0.009). Furthermore, the rs10824793_rs7916582*GT and rs10824793_rs7916582*GC haplotypes were significantly associated with increased TB risk (P = 0.001, OR: 1.421, 95% CI: 1.152-1.753; and P = 0.018, OR: 1.364, 95% CI: 1.055-1.765, respectively) in the logistic regression analysis when compared to the rs10824793_rs7916582*AT haplotype (Table 4). Moreover, the association between the diplotypes of the MBL2 gene polymorphisms and TB risk was also analyzed. As shown in Table 5, the diplotype composed of the rs10824793_rs7916582*AT haplotypes had a considerably decreased TB risk in the 2-copy logistic regression analysis compared with 0-copy (P = 0.003, OR = 0.530, 95% CI: 0.349-0.805). Moreover, this significant protective effect was still observed after Bonferroni correction (P adjusted = 0.009). In contrast, increased TB risk was found in the diplotype composed of the rs10824793_rs7916582*GT (P = 0.009, OR = 1.396, 95% CI: 1.087-1.793) or rs10824793_rs7916582*GC (P = 0.05, OR = 1.330, 95% CI: 1.000-1.768) haplotypes in the 1-copy logistic regression analysis compared with 0-copy. However, after Bonferroni correction, this difference remained significant only for the diplotype composed of the rs10824793_rs7916582*GT haplotype (P adjusted = 0.027).
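The adjusted P values and confidence intervals reported in this section can be checked with simple arithmetic. The sketch below assumes that the Bonferroni factor is 3 (the number of common haplotypes); this factor is inferred from the reported numbers and is not stated explicitly in the text. It also uses the fact that a Wald-type CI from logistic regression is symmetric on the log scale, so the geometric mean of its limits should recover the OR.

```python
import math


def bonferroni(p_values, n_tests=None):
    """Multiply each P value by the number of tests, capping the result at 1."""
    m = len(p_values) if n_tests is None else n_tests
    return [min(1.0, p * m) for p in p_values]


def log_scale_midpoint(ci_low, ci_high):
    """Geometric mean of the CI limits, i.e. the midpoint on the log scale."""
    return math.sqrt(ci_low * ci_high)


if __name__ == "__main__":
    # Haplotype-specific P values for AT and GT, with an ASSUMED factor of 3.
    print(bonferroni([0.00014, 0.003], n_tests=3))
    # Diplotype P values for AT (2-copy) and GT (1-copy), same assumed factor.
    print(bonferroni([0.003, 0.009], n_tests=3))
    # Reported CIs; each midpoint should be close to the reported OR.
    for reported_or, low, high in [
        (1.421, 1.152, 1.753),
        (1.364, 1.055, 1.765),
        (0.530, 0.349, 0.805),
        (1.396, 1.087, 1.793),
    ]:
        print(reported_or, round(log_scale_midpoint(low, high), 3))
```

Under this assumption the computed values agree with the adjusted P values given above, and each log-scale midpoint matches the corresponding reported OR to within rounding.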

Discussion
In this study, we genotyped 64 SNPs from 18 IIRGs in a Han Chinese population. We first showed that the rs930507 G/G, rs2099902 [(C/T + C/C) vs T/T], rs930507 [(C/G + G/G) vs C/C], and rs10824793 [(G/A + G/G) vs A/A] genotypes were risk factors for TB under a codominant or dominant genetic model in TB patients and healthy controls (Fig. 2). Interestingly, these significant associations were not observed under any genetic model between subgroups of the TB patients and controls. This may be attributed to the low number of patients included in each tuberculosis subgroup. Therefore, to further improve the accuracy of the study, the sample size of each tuberculosis subgroup should be increased in the future. The MBL protein is encoded by the MBL2 gene and is secreted by the liver; it activates the complement system via the lectin pathway to combat pathogens during host infection [21]. Although the mechanisms by which the MBL2 mutations regulate TB progression remain unclear, there is no doubt that MBL2 plays a vital role in the pathophysiology of TB. To our knowledge, this is the first study to report that the rs930507, rs2099902, and rs10824793 polymorphisms can affect TB development in a population of Han Chinese origin. It is worth mentioning that several meta-analysis studies have reported that five MBL2 SNPs (rs1800450, rs1800451, rs5030737, rs7095891, and rs7096206) are associated with an increased or decreased TB risk [22][23][24][25][26]. However, there are insufficient data regarding the role of rs930507, rs2099902, and rs10824793 in TB susceptibility. Several studies on diseases other than TB have revealed that rs930507 was associated with an increased risk of invasive pneumococcal disease (IPD) in African Americans [27] and otitis media in children younger than 2 years of age [28]. Moreover, it was also shown to be associated with sodium-lithium countertransport (SLC) and systolic blood pressure [29]. Previous studies have found no association between rs2099902 and the risk of recurrent vulvovaginal infections [30] or severe dengue [31]; however, another study performed by Zanetti et al. suggested that rs2099902 was associated with increased risk of colon cancer in African Americans [32]. These data indicate that the same SNP can differ in its association with susceptibility and pathogenicity across diseases. As such, the mechanisms that underlie these differences deserve further investigation.
The above-mentioned evidence indicated an association between the MBL2 gene and TB risk genotypes. Herein, we found an association between them by linkage disequilibrium, haplotype, and diplotype analyses. In this study (Fig. 2), the rs7916582 polymorphism was not found to be significantly associated with TB susceptibility. However, when the rs10824793 and rs7916582 SNPs were combined in haplotypes, the rs10824793*G/rs7916582*T and rs10824793*G/rs7916582*C alleles were found to be significantly associated with TB risk, which is similar to the haplotype block rs7095891*G/rs1800450*C/rs1800451*C/rs4935047*A/rs930509*G/rs2120131*G/rs2099902*C yielded by LD analysis in a previous study [31]. LD is the non-random combination of alleles at different loci and is influenced by several factors, such as selection, genetic drift, recombination rate, mutation rate, and population structure as well as genetic linkage. A haplotype is a group of genes in an organism that are inherited together from a single parent. Haplotypes are critical for investigating the genetics of common diseases, which have been studied in humans through the International HapMap Project [33]. Analyses of polymorphism data based on LD and haplotype structure are becoming increasingly important; both have been successfully used to determine the association between MBL2 polymorphisms and TB susceptibility. A previous study indicated that MBL2 gene diplotypes might be significantly more common in TB patients than in the control group [24]. It is well known that the haplotype or genotype information can be statistically defined as complete or incomplete data because the genotype data can be extracted from the haplotype data, but the reverse is not true.
Consequently, it seems more important to determine the association between polymorphism and phenotype based on the configuration of haplotypes and diplotypes compared with alleles and genotypes. Recently, some studies have indicated that reactions to drugs and phenotypes are associated with the arrangement of haplotypes or diplotypes rather than genotypes [34], which is consistent with the results of our present study. Although there were no significant differences in the MBL2 alleles observed between the TB and control groups, haplotype or diplotype configuration analysis found that the rs10824793_rs7916582*AT/AT diplotype had a significantly decreased TB risk in the 2-copy logistic regression analysis compared with 0-copy, but the rs10824793_rs7916582*GT/GT diplotype had a considerably increased TB risk.
However, one limitation of the present study is that we did not analyze the relationship between MBL levels and TB risk. It has been reported that serum MBL levels were significantly higher in patients with active TB than in healthy controls [35], which may protect against the early development of pulmonary TB after infection [36].
Figure caption (fragment): [...] represented by solid dots. The size and color of each dot represent its connection degree; red represents the maximum connection degree, and blue represents the minimum connection degree. Three genes (MBL2, LTA, and IL12RB1) and their significant SNPs are shown in red.