Spatial distribution of leprosy in India: an ecological study

Background As leprosy elimination becomes an increasingly realistic goal, it is essential to determine the factors that contribute to its persistence. We evaluate social and economic factors as predictors of leprosy annual new case detection rates within India, where the majority of leprosy cases occur. Methods We used correlation and linear mixed effect regressions to assess whether poverty, illiteracy, nighttime satellite radiance (an index of development), and other covariates can explain district-wise annual new case detection rate and Grade 2 disability diagnoses. Results We find only weak evidence of an association between poverty and annual new case detection rates at the district level, though illiteracy and satellite radiance are statistically significant predictors of leprosy at the district level. We find no evidence of rapid decline over the period 2008–2015 in either new case detection or new Grade 2 disability. Conclusions Our findings suggest a somewhat higher rate of leprosy detection, on average, in poorer districts; the overall effect is weak. The divide between leprosy case detection and true incidence of clinical leprosy complicates these results, particularly given that the detection rate is likely disproportionately lower in impoverished settings. Additional information is needed to distinguish the determinants of leprosy case detection and transmission during the elimination epoch.


Background
Leprosy (Hansen's Disease) is caused by chronic infection with Mycobacterium leprae [1][2][3]. Long stigmatized in many cultures, leprosy is curable today with multidrug therapy [4]. While a concerted global effort to meet the World Health Organization (WHO) goals of elimination has greatly reduced the case burden in recent decades, over 200 000 new cases are still reported globally each year [5,6]. Current WHO targets focus on decreasing the rate of new diagnoses with Grade 2 disability, and on the reversal of legislation enabling leprosy discrimination [7]. Previous work has indicated that leprosy case detection in India was significantly associated with enhanced case finding activity and exhibited evidence of spatial autocorrelation [15], but it is not yet clear what factors may exacerbate leprosy burden. Here, we use publicly available district-level data on reported annual new case detection rates (ANCDR) and Grade 2 disability rates [18,19]. We examine the association between these epidemiological outcome variables and poverty, based on other available measures of district wealth and development.

Data sources
Leprosy
The Indian Ministry of Health reports annual new case counts for leprosy for the period 2008-2015 for each district in India (see Spatial boundaries) [18][19][20][21][22][23][24][25][26][27][28][29][30][31][32]. In accordance with case report data, we define each year as the twelve-month period ending March 31. The National Leprosy Eradication Program also provides annual estimated populations for each district, the number of new cases of Grade 2 disability (defined by the WHO as visible deformity to the hands or feet or severe visual impairment) at the district level, as well as state-level estimates for the fraction of multibacillary cases, the fraction of cases among children, and the fraction with Grade 2 disability at diagnosis.

Census
The 2011 Census of India contains district-level data on illiteracy, unemployment, scheduled caste and scheduled tribe populations, rural population, and poverty [33][34][35][36][37]. In our data set, a poverty index was defined as the absence of a defined set of assets included in the census survey. A household was considered to be impoverished if it owned none of the following: a radio, a TV, a computer (with or without internet access), a mobile phone, a landline, a bicycle, or a motorized two- or four-wheel vehicle (including a scooter or car) [37,38]. This definition is more restrictive than other economic measures of poverty (which routinely place between 20 and 30% of the population in poverty); only about 18% of households meet this criterion of poverty.
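The asset-based definition above can be illustrated with a minimal sketch. The asset keys, the household records, and the choice to summarize a district as the fraction of impoverished households are assumptions made for illustration; they are not taken from the census files.

```python
# Hypothetical asset keys; the census questionnaire uses its own item names.
ASSETS = (
    "radio", "tv", "computer", "mobile_phone",
    "landline", "bicycle", "motor_vehicle",
)


def is_impoverished(household):
    """Return True when the household owns none of the listed assets."""
    return not any(household.get(asset, False) for asset in ASSETS)


def poverty_index(households):
    """Fraction of households in a district that own none of the assets.

    Summarizing a district by this fraction is an assumption of the sketch.
    """
    if not households:
        raise ValueError("no households supplied")
    return sum(is_impoverished(h) for h in households) / len(households)


# Made-up households for one hypothetical district.
households = [
    {"radio": True},                  # owns a radio: not impoverished
    {},                               # owns nothing listed: impoverished
    {"bicycle": True, "tv": True},    # not impoverished
    {"mobile_phone": False},          # owns nothing listed: impoverished
]
print(poverty_index(households))  # 0.5
```

Because a single owned asset is enough to exclude a household, the criterion is strict, which is consistent with the comparatively low share of households it classifies as poor.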
Illiteracy is defined as the inability to both read and write in any language; children 6 years old or younger are automatically considered illiterate in the census. An individual is considered unemployed (specifically, a "nonworker") if he or she did not partake in an economically productive activity in the 12 months preceding the census survey. This includes students, homemakers, children, retirees, and beggars; it does not include subsistence farmers or others whose primary activity was producing food for self-consumption. Therefore, unemployment here does not necessarily indicate a desire to work, or an active pursuit of employment. The census also reports the fraction of a district's population that lives in a rural area (defined as a region not registered as a statutory town or municipality, with fewer than 5000 individuals, with greater than 75% of working individuals employed in agriculture, or with population density less than 400 per km²) [33].
The Constitution of India includes provisions for individuals in scheduled castes and scheduled tribes (indigenous tribal persons). Historically, these two groups experienced higher levels of discrimination, exclusion, and poverty [39]. The Census reports the number of individuals in scheduled castes and in scheduled tribes per district [35,36], though this was not reported in 84 of the 604 analytic districts (Table 1; see Spatial boundaries, below).

State-level predictors
While we primarily focused on district-level analysis, we examined two possible state-level predictors of leprosy burden, collected from the Centre for Monitoring Indian Economy database [40]. The first, per-capita net domestic product (NDP), is thought to be a more direct measurement of community development and wealth than poverty or other socio-demographic variables. The second, the number of government hospitals in each state, may be related to healthcare availability and accessibility.

Satellite imagery
Nighttime satellite imagery data has proven useful in assessing economic conditions in the developing world [42][43][44][45]. We obtained nighttime cloud-free composites providing average visible lights and stable lights (which excludes impermanent sources of light, such as fires or other background noise), at 30 arc second resolution (roughly 1 km; Fig. 1) [46]. In the most dense, brightly lit areas, the satellite sensors become saturated and cannot record values above a certain threshold. In India, this threshold obscures subtle differences in illumination from the country's largest cities, including Delhi, Kolkata, Bangalore, and Mumbai. Radiance, a readjusted illumination measure produced from the same satellite imagery, may provide a better indicator of economic activity and development [44,46]. Radiance data were derived from images taken in 2010 and 2011, and were computed by averaging radiance over the areas of each district. We computed the radiance divided by the estimated population, yielding a ratio which exhibits outliers (the largest value is approximately 14 times the average value). To minimize the occurrence of potential high-leverage points, we used the rank transformed values as a predictor. Additionally, we calculated a binary low visibility indicator, defined as 1 if a district was in the lowest decile of mean visibility index, as well as a similar low radiance indicator. No effort was made to identify oil flares or other causes of high illuminance unrelated to socioeconomic development.
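The per-capita radiance, its rank transformation, and the lowest-decile indicator described above can be sketched as follows. The radiance and population values are invented, and the handling of ties (average ranks) and of the decile cutoff (rank no greater than one tenth of the number of districts) are assumptions, since the text does not specify them.

```python
def average_ranks(values):
    """Rank values from 1 to n, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def lowest_decile_indicator(values):
    """1 if a district's rank lies in the lowest tenth of districts, else 0."""
    ranks = average_ranks(values)
    cutoff = len(values) / 10
    return [1 if r <= cutoff else 0 for r in ranks]


# Invented district values; the fourth district is a deliberate outlier.
radiance = [5.0, 50.0, 20.0, 700.0, 8.0, 12.0, 30.0, 9.0, 15.0, 40.0]
population = [1200, 900, 1500, 1000, 800, 1100, 1300, 700, 1000, 950]

per_capita = [r / p for r, p in zip(radiance, population)]
ranked = average_ranks(per_capita)        # predictor robust to the outlier
low_radiance = lowest_decile_indicator(per_capita)
```

After ranking, the outlying district simply receives the highest rank, so it no longer acts as a high-leverage point in a regression.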

Spatial boundaries
There were several rearrangements of state and district boundaries over the study period. Spatial analysis was based on the GADM (Global Administrative Areas) database for administrative boundaries [47], supplemented by an updated version for selected jurisdictions [48]. If a district or state was divided into multiple districts or states during the study period, we combined data from the resulting new districts to estimate what the counts would have been for the old district boundaries, to obtain a longitudinally consistent set of reporting districts and states. Likewise, if two or more regions were merged, the data from these regions was combined throughout the study period into a single analytic district. This procedure yielded 604 analytic districts from 2008-2015 (Table 1; [15]).
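A minimal sketch of the boundary harmonization described above: counts reported for newly created districts are summed back onto the original district so that every year uses the same analytic units. The crosswalk, district names, and counts are hypothetical.

```python
from collections import defaultdict

# Hypothetical crosswalk from reporting district to analytic district.
CROSSWALK = {"New North": "Old District", "New South": "Old District"}


def to_analytic_districts(records, crosswalk):
    """Sum cases and population over reporting districts that share an
    analytic district, separately for each year."""
    totals = defaultdict(lambda: {"cases": 0, "population": 0})
    for rec in records:
        analytic = crosswalk.get(rec["district"], rec["district"])
        key = (analytic, rec["year"])
        totals[key]["cases"] += rec["cases"]
        totals[key]["population"] += rec["population"]
    return dict(totals)


# Invented reports: the district is split in two partway through the period.
records = [
    {"district": "Old District", "year": 2008, "cases": 120, "population": 1_000_000},
    {"district": "New North", "year": 2012, "cases": 50, "population": 600_000},
    {"district": "New South", "year": 2012, "cases": 40, "population": 450_000},
]

combined = to_analytic_districts(records, CROSSWALK)
```

Summing both the cases and the populations before forming a rate keeps the detection rate for the analytic district correctly population-weighted.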

Program activities
A group of 209 districts was identified as high leprosy districts, based on 2010-2011 reports [49], and these regions were targeted for subsequent enhanced surveillance activities through the National Leprosy Eradication Program. As in our previous analysis [15], we entered this list of districts for use as a binary regressor.

Statistical methods
Outcomes
The primary outcome variables were the leprosy annual new case detection rate (ANCDR), defined as the number of new cases in a district divided by the estimated population of the district during that year, and the rate of new cases with Grade 2 disability relative to the same population. We also explored the heterogeneity in the proportion of reported leprosy cases that displayed Grade 2 disability (Grade 2 fraction).
We computed Spearman's rho (ρ) correlation coefficients for the outcomes of interest and the potential predictors. We then conducted multivariate linear mixed effects regression [50] of the longitudinal outcomes, using the district- and state-level predictors. All models include a random slope and intercept, year as a fixed effect, and a fixed effect in 2012 and 2013 for each of the 209 enhanced case finding districts mentioned above. We derived confidence intervals by spatial block bootstrap (1000 replicates), which helps account for spatial dependence and often yields a conservative confidence interval [51]. The marginal and conditional R² values estimate the variability explained by fixed effect predictors, and by both fixed and random effects, respectively [52,53]. To improve normality and homoskedasticity, we used the log transformation for the new case detection rates (per 10 000 inhabitants) and the per-capita Grade 2 rate (per million inhabitants), with zeros modeled as 0.5 divided by the district population (as in [15]). All analyses were conducted in R v. 3.2 for Macintosh (R Foundation for Statistical Computing, Vienna, Austria), using packages sp, maptools, spdep, lme4 (which provides the lmer function), and sperrorest.
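The log transformation with the zero-count adjustment can be sketched as below. This is an illustration in Python rather than the R code used for the analysis, and it assumes that a zero count is replaced by 0.5 cases before the rate is scaled to the reporting unit; the function name and the example numbers are hypothetical.

```python
import math


def log_rate(cases, population, per=10_000):
    """Natural log of a detection rate per `per` inhabitants.

    A district reporting zero cases is treated as having 0.5 cases, so
    its rate is 0.5 / population (an assumption about how the adjustment
    interacts with the per-10 000 scaling).
    """
    if population <= 0:
        raise ValueError("population must be positive")
    effective_cases = cases if cases > 0 else 0.5
    return math.log(effective_cases / population * per)


# Invented examples: 100 cases per million is 1 per 10 000 (log close to 0).
ancdr_log = log_rate(100, 1_000_000)
zero_log = log_rate(0, 500_000)                     # uses 0.5 / 500 000
grade2_log = log_rate(3, 2_000_000, per=1_000_000)  # per million inhabitants
```

The adjustment keeps districts with no reported cases in the model, at a value that decreases as the district population grows.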

Results
For 2008, district-level new case detection rates averaged 1.06 per 10 000 population (range 0-9.45). By 2015, the average ANCDR had decreased slightly to 0.929 per 10 000 population (range 0-13.6). While some districts reported no new cases of Grade 2 disability, others reported as many as 1.27 cases per 10 000 population, and in fifteen districts, all newly reported cases presented with Grade 2 disability. There was also substantial heterogeneity in district-level measures of poverty and development (Table 1).
The raw radiance variable was strongly negatively associated with poverty (-0.68, P < 0.001), illiteracy (-0.49, P < 0.001), rural population (-0.64, P < 0.001), and scheduled caste and tribe populations (-0.40, P < 0.001). There was no significant relationship with ANCDR but a weak, positive association with rate of Grade 2 cases (0.12, P < 0.001). Though other metrics of satellite visibility, including the scaled radiance term and visibility, were significantly associated with at least one of the three primary outcomes and other covariates, the relationships were weak and inconsistent. We therefore use the unadjusted radiance term in the remainder of our analysis.

District-level predictors in Madhya Pradesh
We compared district-wise per-capita income in the state of Madhya Pradesh to variables under evaluation using the Spearman correlation to assess their utility in estimating economic and social conditions. Per-capita income is significantly positively correlated with satellite radiance (Spearman ρ = 0.57, P < 0.001) in Madhya Pradesh. Per-capita income is also, unsurprisingly, linked to the census-derived index of poverty (ρ = -0.50, P < 0.001), total visibility (ρ = 0.59, P < 0.001), illiteracy (ρ = -0.54, P < 0.001) and rural population (ρ = -0.71, P < 0.001). Per-capita income is not a significant correlate of unemployment or scheduled tribe and caste population. We also computed univariate Spearman correlations between per-capita income and the three leprosy outcomes in Madhya Pradesh, but these correlations were small.

District-level analysis of leprosy trends
We first computed nonparametric correlation coefficients of leprosy case detection rates with the poverty index, for every year from 2008 to 2015. Values ranged from 0.0686 to 0.113 (Holm-adjusted P-values all less than 0.012). For illiteracy, the median of these yearly correlations was 0.141 (Holm-adjusted P-values all less than 7.6 × 10⁻⁶). For the rural fraction, the median of these yearly correlations was 0.0805 (Holm-adjusted P-values all less than 0.076). Other predictors gave smaller univariate correlations (not reported).
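Because one correlation is tested per year, the eight resulting P-values are adjusted for multiplicity. A minimal sketch of the Holm step-down adjustment follows; the P-values shown are illustrative placeholders, not results from this study.

```python
def holm_adjust(p_values):
    """Holm step-down adjustment, returned in the original order.

    The smallest P-value is multiplied by m, the next by m - 1, and so
    on; adjusted values are capped at 1 and forced to be non-decreasing
    along the sorted order.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        candidate = min(1.0, (m - rank) * p_values[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Illustrative P-values for eight yearly tests (not study results).
yearly_p = [0.0004, 0.0011, 0.0009, 0.0015, 0.0007, 0.0012, 0.0002, 0.0010]
yearly_p_adjusted = holm_adjust(yearly_p)
```

Reporting that all adjusted P-values fall below a bound, as in the text, is then a statement about the largest element of the adjusted list.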
Beginning with a base model which included the effect of time trend (year) and enhanced case finding, together with a random effect for district, we individually added each of the following predictors: (1) poverty index, (2) illiteracy fraction, (3) unemployment fraction, (4) fraction rural, (5) fraction in scheduled tribes, (6) fraction in scheduled castes, (7) log-transformed satellite radiance, and (8) the binary low visibility indicator (similar results, not shown, obtained for the binary low radiance indicator).
Illiteracy, scheduled tribe population, and radiance (including the binary low visibility indicator) are all independently significant predictors of district-level annual new case detection rate (Table 2). For the district-level rate of Grade 2 disability, the only statistically significant predictors were the fraction in scheduled tribes and the binary indicator of radiance. The fraction of cases with Grade 2 disability is significantly associated with illiteracy and unemployment rates, as well as the fraction in scheduled tribes and the binary radiance indicators. While illiteracy is a positive predictor of ANCDR, it is negatively associated with the fraction of cases with Grade 2 disability. Moreover, while scheduled tribe fraction and radiance are both negatively associated with ANCDR and Grade 2 disability rate, they are positively associated with the fraction of Grade 2 cases in each district. We also explored the role of per-capita income as a predictor of leprosy in Madhya Pradesh, incorporating a temporal dimension and random effect for district, but found non-significant relationships with all three leprosy outcomes.
We then performed multivariate linear mixed effects regression to determine which covariates, in combination, produced the best-fit model (as determined by Akaike's Information Criterion) (Table 3). We found that illiteracy, radiance, and time are included in the best models of all three leprosy outcomes. Poverty is included in the models of ANCDR and fraction of Grade 2 cases, while unemployment rate and the fraction of population that is rural are included only in the models of Grade 2 fraction and ANCDR, respectively. The coefficient for time was negative, corresponding to a (slight) decrease in ANCDR from 2008 to 2015; it appears that both the population rate and fraction of Grade 2 cases have been increasing over the same period.
Table 2 (note): All models include calendar time in years, a covariate for the effect of enhanced case finding, a random slope, and a random intercept. Each covariate in the left hand column is separately added to the model. Confidence intervals derived by spatial block bootstrap (with a radius of 1.5 degrees); see text for details.
Table 3 (note): Models were selected using Akaike Information Criterion (AIC) from all subsets of the regressors: poverty index, illiteracy, unemployment, rural population fraction, and scaled radiance (see text for details). All models include calendar time in years, the enhanced case finding covariate, a random slope, and a random intercept. Marginal R² values indicate the fraction of variance explained by the fixed effects, and conditional R² values indicate the fraction of variance explained by both fixed and random effects, as described in the text. Confidence intervals derived by spatial block bootstrap (with a radius of 1.5 degrees); see text for details.
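The all-subsets search by AIC can be illustrated with a simplified sketch. Ordinary least squares is used here as a stand-in for the linear mixed effects models of the analysis (no random effects are fitted), the AIC is computed up to an additive constant, and the predictor names and data are invented.

```python
import itertools
import math


def ols_rss(X, y):
    """Residual sum of squares of a least-squares fit (normal equations)."""
    n, p = len(X), len(X[0])
    # Augmented normal-equation matrix [X'X | X'y].
    A = [
        [sum(X[r][i] * X[r][j] for r in range(n)) for j in range(p)]
        + [sum(X[r][i] * y[r] for r in range(n))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(p):
        pivot = max(range(c, p), key=lambda r: abs(A[r][c]))
        A[c], A[pivot] = A[pivot], A[c]
        for r in range(p):
            if r != c:
                factor = A[r][c] / A[c][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    beta = [A[i][p] / A[i][i] for i in range(p)]
    return sum(
        (y[r] - sum(b * x for b, x in zip(beta, X[r]))) ** 2 for r in range(n)
    )


def gaussian_aic(rss, n, n_coef):
    """Gaussian AIC up to a constant; +1 counts the error variance."""
    return n * math.log(rss / n) + 2 * (n_coef + 1)


def best_subset(columns, y):
    """Fit every subset of predictors (always with an intercept) and
    return the subset with the lowest AIC together with all scores."""
    n = len(y)
    names = sorted(columns)
    scores = {}
    for size in range(len(names) + 1):
        for subset in itertools.combinations(names, size):
            X = [[1.0] + [columns[name][r] for name in subset] for r in range(n)]
            scores[subset] = gaussian_aic(ols_rss(X, y), n, len(subset) + 1)
    return min(scores, key=scores.get), scores


# Invented data: y depends on x1; x2 is unrelated to the outcome.
columns = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
    "x2": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0],
}
y = [2.1, 3.8, 6.05, 8.15, 9.9, 12.2, 13.85, 15.95]

best, scores = best_subset(columns, y)
```

With k candidate regressors the search fits 2^k models, which remains tractable for the five regressors considered here.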

State-level analysis
At the state level, neither net domestic product nor number of government hospitals (adjusted and unadjusted for population) was a significant predictor of ANCDR, Grade 2 disability rate, or Grade 2 disability fraction in univariate analysis (Table 4). Healthcare availability, estimated by the frequency of government hospitals, does not appear to substantially influence reported ANCDR across states.

Discussion
Leprosy incidence has decreased dramatically in recent years, spurred by ambitious WHO goals for elimination and by concerted effort by many of the most affected countries. Nonetheless, uncertainty remains regarding the factors underlying its persistence in certain geographic regions. Here, we examined the role of poverty and other measures of socioeconomic status in explaining variation in the district-level new case detection rates of leprosy in India. Modest relationships between leprosy annual new case detection rates and a census-derived poverty index were seen in univariate analysis. Higher rates of illiteracy were associated with a higher ANCDR but a lower fraction of Grade 2 cases; the inverse is true of the scheduled tribe population fraction, which is negatively correlated with ANCDR and Grade 2 detection rate, but positively associated with the fraction of Grade 2 cases. Other variables (unemployment, scheduled caste population, rural population) yielded nonsignificant relationships. The unadjusted radiance term is a significant predictor of higher ANCDR in both univariate and multivariate analysis. However, the binary variable indicating whether a district is in the darkest 10% of all districts is significantly negatively associated with ANCDR. Considerable reporting heterogeneity between states or districts may make rates difficult to compare between regions, and may be associated with many of the covariates studied here. The poorest districts may have less capacity for detection and surveillance, resulting in lower ANCDRs than would be expected. A number of independent reports have also indicated that leprosy incidence might be considerably higher than reported incidence, and that many cases continue to go undetected by national surveillance systems [16,54,55].
Beyond our finding of a modest decrease in the detection rate of new leprosy cases from 2008 to 2015 (consistent with a previous study [15]), we also found a slight increase in the rate of detection of Grade 2 cases and the fraction of detected cases presenting with Grade 2 disability. Others have noted a similar increase or stability in the rate and fraction of Grade 2 cases, even in the event of an overall reduction in leprosy burden [56][57][58][59]. Those factors that were significantly positively predictive of ANCDR were all significant negative predictors of the fraction of Grade 2 cases in univariate analysis (Table 2). Again, reporting capacity may be a confounding factor. While there could be a true increase in the incidence of Grade 2 disability relative to the number of new leprosy cases, it is also possible that districts with higher ANCDR have better surveillance or reporting systems, finding cases before they progress to Grade 2. Conversely, districts with less detection or reporting capacity (and consequently, lower ANCDR) could be more likely to detect mostly severe, Grade 2 cases. Poverty may both increase exposure to conditions favoring the transmission of disease and reduce detection and reporting.
Several limitations apply to this analysis. As discussed above, ANCDR does not perfectly reflect true leprosy incidence. Moreover, the census-derived poverty index and satellite radiance do not fully characterize poverty. While these covariates were strongly correlated with per-capita income in one state, many other aspects of poverty, healthcare availability, and development status may be important drivers of leprosy persistence. Our analysis is also limited by its ecological nature; from these data, it is impossible to ascertain the relationship between poverty and leprosy within a district or at the individual level. Furthermore, several determinants of leprosy persistence may be manifested on a geographic scale smaller than that studied here. There is some evidence that leprosy occurs in relatively small spatial clusters (even within districts) [17,59,60,61]. Analysis at a finer spatial scale may be needed to more definitively identify the key drivers of leprosy transmission and case detection.

Conclusion
We found evidence of a modest relationship between poverty and leprosy at the district level for India, in the context of a slowly declining incidence. Our results also emphasize the role of surveillance capacity in the detection, treatment, and prevention of leprosy cases; indeed, a large scale population-based detection campaign has been recently undertaken across endemic districts [62]. More information at the individual level, from cross-sectional population-based surveys and assessment of surveillance capacity, is needed to understand the relationship between poverty and leprosy, and to overcome poverty and stigma as obstacles to leprosy elimination.