Human and spatial-temporal clustering analysis of human brucellosis in mainland China from 2012 to 2016

Objective: The aim of the present study was to better understand the epidemiology of brucellosis in mainland China. We set out to investigate the human, temporal and spatial distribution and clustering characteristics of the disease. Methods: Human brucellosis data from mainland China between 2012 and 2016 were collected from the China Information System for Disease Control and Prevention. The geographic information system ArcGIS 10.3 (ESRI, Redlands) and SaTScan software were used to identify potential changes in the spatial and temporal distribution of human brucellosis in mainland China during the study period. Results: A total of 244,348 cases of human brucellosis were reported during the study period. During 2012-2016, the average incidence of human brucellosis was higher in the 40-65 age group. The temporal clustering analysis showed that the high incidence of brucellosis occurred between March and July each year. The spatial clustering analysis of the incidence of human brucellosis showed that the location of brucellosis clustering in mainland China remained relatively fixed, mainly concentrated in most parts of northern China. The spatial-temporal clustering analysis identified a primary cluster area in Heilongjiang and three secondary cluster areas in Tibet, Shanxi and Hubei. Conclusion: Human brucellosis remains a widespread challenge, particularly in northern China. The clustering analysis highlights potential high-risk populations, times and areas that may require special plans and resources for monitoring and controlling the disease.

statutory infectious diseases, it rose from 16th place in 2000 to 6th place in 2014. A strong upward trend such as this is extremely rare among all such reported infectious diseases.
Previous studies have demonstrated that the global epidemiology of brucellosis has changed dramatically over the past decades, particularly in industrialized countries [15]. The spatial distribution characteristics of brucellosis outbreaks have also changed distinctly [1], although whether this change shows temporal and spatial covariance requires further investigation.
Based on national surveillance data, this study applied a human, spatial and temporal distribution model http://www.phsciencedata.cn/Share/index.html? aafa8285-42ae-4dbc-a828-152c2cef6396). This database has collected all reported cases since 2004, when the direct network reporting system for infectious diseases in China was opened. The main contents of the human brucellosis database include the number of cases, the incidence rate, the number of deaths and the mortality rate, broken down by region, age group, gender and occupation. According to the national brucellosis surveillance program in China, both clinically diagnosed cases and confirmed cases are reported in this system. Clinically diagnosed cases are defined as presumptive cases (those with both an epidemiological history and clinical manifestations) with any positive screening test result. Confirmed cases are defined as presumptive or clinically diagnosed cases with any one item of confirmatory laboratory evidence.

Statistical methods
First, age clustering was performed by the hierarchical clustering method using SPSS (23.0) software. Hierarchical clustering is a statistical analysis technique that divides objects into relatively homogeneous groups; objects are classified according to their characteristics, which reduces the number of objects to be considered. Then, time series decomposition analysis was performed to explore the impact of seasonality on human brucellosis incidence. A seasonal decomposition was used to break the monthly time series data down into four series: the original time series data, trend, seasonality and random effect. The time series decomposition analysis was carried out in R software.
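The decomposition described above can be illustrated with a minimal sketch of a classical additive decomposition (centred moving-average trend, then average seasonal effects, then the remainder). This is a Python illustration only: the study used R, and the exact R routine and settings are not stated, so the additive form, the function name `decompose_additive` and the 12-month period are assumptions made for the example.

```python
from statistics import mean


def decompose_additive(series, period=12):
    """Classical additive decomposition: series = trend + seasonal + remainder.

    Illustrative sketch only; not the authors' R code.
    Requires at least two full periods of data.
    """
    n = len(series)
    half = period // 2
    trend = [None] * n
    # Centred moving average; for an even period the two end points of the
    # window get half weight so that the window stays centred on t.
    for t in range(half, n - half):
        window = series[t - half:t + half + 1]
        if period % 2 == 0:
            total = 0.5 * window[0] + sum(window[1:-1]) + 0.5 * window[-1]
            trend[t] = total / period
        else:
            trend[t] = mean(window)
    # Average the detrended values for each position in the seasonal cycle.
    raw = []
    for k in range(period):
        vals = [series[i] - trend[i]
                for i in range(k, n, period) if trend[i] is not None]
        raw.append(mean(vals))
    # Centre the seasonal effects so that they sum to zero over one cycle.
    centre = mean(raw)
    seasonal = [raw[i % period] - centre for i in range(n)]
    # Whatever is left after removing trend and seasonality.
    remainder = [series[i] - trend[i] - seasonal[i] if trend[i] is not None else None
                 for i in range(n)]
    return trend, seasonal, remainder
```

Called on a list of 60 monthly incidence values (2012 to 2016), the function returns three lists of the same length; the trend and remainder are undefined (`None`) for the first and last six months because the moving average needs a full window.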
Global Moran's I value
Global Moran's I value was used to determine whether there was global spatial auto-correlation between provinces. Global Moran's I ranges from -1 to 1. I > 0 indicates a positive spatial correlation; I < 0 indicates a negative spatial correlation; if I is close to 0, no spatial correlation exists [18]. The larger the absolute value of I, the stronger the correlation. When |Z| > 1.96, P < 0.05 was considered statistically significant, indicating spatial auto-correlation [19][20]. Global spatial auto-correlation analysis was conducted in ArcGIS (10.6) software using the Spatial Autocorrelation tool.
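As a rough illustration of the statistic described above, the following sketch computes global Moran's I from a list of regional incidence values and a spatial weight matrix. It is not the ArcGIS implementation used in the study; the function name `global_morans_i`, the binary adjacency weights and the four-region toy layout are assumptions made for the example, and the Z score and P value are not computed here.

```python
def global_morans_i(values, weights):
    """Global Moran's I for regional values and an n x n spatial weight matrix.

    I = (n / S0) * sum_ij w_ij (x_i - mean)(x_j - mean) / sum_i (x_i - mean)^2,
    where S0 is the sum of all weights. Illustrative sketch only.
    """
    n = len(values)
    mean_x = sum(values) / n
    dev = [x - mean_x for x in values]
    s0 = sum(sum(row) for row in weights)
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * (cross / sum(d * d for d in dev))


# Hypothetical layout: four regions in a row, each adjacent to its neighbours.
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]
smooth = global_morans_i([1, 2, 3, 4], chain)       # similar neighbours, I > 0
alternating = global_morans_i([1, 4, 1, 4], chain)  # dissimilar neighbours, I < 0
```

In the toy example, a smooth gradient of values gives a positive I and an alternating pattern gives a negative I, matching the interpretation of the sign given in the text.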

Space-time scan
SaTScan software, developed by Kulldorff, was used to explore the spatial, temporal and spatial-temporal clusters of human brucellosis and to verify whether the temporal and geographic clustering of human brucellosis was caused by random variation [21]. SaTScan is based on spatial dynamic window scan statistics. It calculates the likelihood ratio of the spatial unit attributes inside and outside the dynamic window for different centers and radii, makes statistical inferences from it, and uses Monte Carlo simulation to evaluate statistical significance in order to find the most likely clustering area [22]. For each possible spatial-temporal clustering area with p < 0.05, the larger the log-likelihood ratio (LLR), the more likely it is that the area covered by the scanning window is a clustering area [23]. Finally, the window with the largest LLR value is selected as the most likely clustering area, and other statistically significant windows are secondary clustering areas. A retrospective spatial analysis method and a Poisson distribution model were used to analyze the high spatial clustering areas of brucellosis in mainland China.
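The likelihood ratio comparison described above can be sketched for a single candidate window under the Poisson model. The sketch below is not SaTScan code: it evaluates one window only, leaves out the Monte Carlo step, and the function names and the numbers in the example are hypothetical. It assumes the expected count in the window is the total case count multiplied by the window's share of the population at risk.

```python
import math


def poisson_llr(cases_in, expected_in, total_cases):
    """Log-likelihood ratio of one scanning window under the Poisson model.

    Only windows with more cases than expected (high-rate clusters) score
    above zero. Illustrative sketch only.
    """
    c, e, total = cases_in, expected_in, total_cases
    if c <= e:
        return 0.0
    inside = c * math.log(c / e)
    # If every case falls inside the window, the outside term vanishes.
    outside = (total - c) * math.log((total - c) / (total - e)) if total > c else 0.0
    return inside + outside


def relative_risk(cases_in, expected_in, total_cases):
    """Risk inside the window divided by the risk outside it."""
    inside_ratio = cases_in / expected_in
    outside_ratio = (total_cases - cases_in) / (total_cases - expected_in)
    return inside_ratio / outside_ratio


# Hypothetical window: 30 of 100 cases observed where 20 were expected.
llr = poisson_llr(30, 20, 100)
rr = relative_risk(30, 20, 100)
```

In a full scan, this LLR would be computed for every candidate window, the window with the largest value would be reported as the most likely cluster, and its significance would be judged against the largest LLR obtained in each of many randomly simulated data sets.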
The selection of the maximum radius of the scanning window and the maximum length of the temporal scanning window is very important, since the results of a spatial-temporal scan are sensitive to them [24]. The default settings for the spatial window size and the temporal window size are usually 50%, but some studies have questioned whether this is appropriate [25]. A high false positive rate may occur if the window size is too large. Similarly, a high false negative rate may occur if the window size is too small [26]. Many studies have explored how to select an appropriate scan window. The main rule from these studies is that the number of areas covered by a single clustering area should not exceed 15% of the number of overlapping areas [27][28]. Some studies have also used this standard in regular scan statistics [27], and we drew on this experience in our study. Finally, we selected a spatial window covering 30% of the population at risk over the whole study period.
For the analysis, three data sets were supplied to the programme: one with the latitudes and longitudes of the centroids of each province, another with the populations of the provinces by year, and a third with the number of cases per province and year [29].

Descriptive analysis
A total of 244,348 cases of human brucellosis were reported across 31 provinces during the study period. Of all years studied, 2014 had the largest number of cases (57,222), with all provinces except Tibet reporting cases. Incidence increased slightly during the study period (from 2.9328/100,000 in 2012 to 3.4388/100,000 in 2016).

Clustering analysis

Human clustering analysis
At present, the population affected by brucellosis is gradually expanding, from the young and middle-aged population to all age groups. During 2012-2016, the average incidence of human brucellosis was higher in the 40-65 age group. The incidence in other age groups was relatively low, and it was much lower in the age groups under 20 and over 80. The incidence distribution of each age group is shown in Table 1.
According to the hierarchical clustering result (Fig 1), the average incidence of human brucellosis among different age groups from 2012 to 2016 was divided into three clusters. The highest incidence was in the 40-65 groups. The second cluster grouped the age groups labeled 10, 15, 80 and above, 20 and 70. The third cluster grouped the age groups labeled 30, 35, 5, 25, 70 and 0.

Temporal clustering analysis
The seasonal decomposition of brucellosis incidence showed strong seasonal characteristics (Fig.2).
Similarly, the temporal cluster analysis gave a consistent result, with the high incidence of brucellosis occurring between March and July each year (Table 2), in line with previous studies. Over the whole study period, the high clustering time for human brucellosis was from January 2014 to December 2015. During this period, a total of 113,111 human brucellosis cases were reported, and the risk of human brucellosis was 31% higher than during other periods (RR = 1.31, P = 0.001). In addition, brucellosis incidence increased overall during the study period but declined slightly in 2016.

Spatial clustering analysis
Using provinces as the unit of analysis, we carried out the global auto-correlation analysis and obtained Moran's I value, variance, Z score and P value for each year from 2012 to 2016 (see Table 3). The values of Moran's I were 0.1179 and 0.1181 for 2013 and 2014, respectively, and the Z values were greater than 1.96 (all P < 0.05), indicating that the incidence of brucellosis in China in 2013 and 2014 had a non-random distribution. Further spatial clustering analysis of human brucellosis was therefore needed.
The spatial clustering analysis of the incidence of human brucellosis from 2012 to 2016 showed that the location of brucellosis clustering in mainland China remained relatively stationary, mainly concentrated in most parts of northern China (Fig 3).

Spatial-temporal clustering analysis
A heat map was drawn of human brucellosis incidence by region and time. High brucellosis incidence occurred in Xinjiang, Ningxia, Heilongjiang, Inner Mongolia and Shanxi during the study period (Figure 4A), where the incidence was much higher than in other regions during the same period. Brucellosis tended to occur from March to August, with the highest incidence in May, particularly in 2014 and 2015 (Figure 4B).
Finally, both spatial and temporal clusters of high-incidence districts per zone were identified. The results of the spatial-temporal cluster analysis of reported human brucellosis in 31 provinces of mainland China from 2012 to 2016 are shown in Table 4.

Discussion:
In recent years, brucellosis has continued to be regarded as a serious public health problem because of its resurgence in China and worldwide. The importance of restarting control strategies cannot be overemphasized. Describing the clustering of brucellosis across populations, space and time is the basis for preventing and eliminating the disease.
Although brucellosis has no age preference and people of all ages can be infected, our findings still show a clear age difference. The incidence of brucellosis in people over 25 years old is significantly higher than that in people under 25 years old, and the incidence in people over 65 years old is also lower. It is not difficult to see that the age distribution depends on whether a person is an important part of the family labor force. In rural Chinese families, most middle-aged and elderly people aged between 40 and 65 raise livestock, so the incidence of brucellosis is highest in this age group. Cases in other age groups may be related to drinking unpasteurized milk [30][31].
Identifying seasonality is a key step in any brucellosis prevention strategy. Our temporal cluster analysis shows that human brucellosis has a distinct seasonality. The main cluster time of reported cases is March, April, May, June and July, which together account for 58.86% of the total incidence from 2012 to 2016. These months are the lambing season in the agricultural and pastoral areas of northern China. The pattern may also be due to warm temperatures that favor the transmission of zoonoses, and similar seasonal characteristics have been reported in other countries [32][33][34][35][36]. In addition, according to some research findings, temperature, sunshine, wind speed, altitude and rainfall affect the introduction of brucellosis [37][38][39].
Spatial cluster analysis of human brucellosis incidence in 31 provinces in mainland China showed that spatial clusters existed in every year from 2012 to 2016. The distribution of the clustering regions was similar in each year, which means that the high-incidence areas of brucellosis in China are concentrated in the northern animal husbandry areas and their adjacent areas; other studies have reached the same conclusion (Ningxia, Xinjiang, Qinghai and Inner Mongolia are the four major pastoral areas in China) [40]. Meanwhile, Heilongjiang, Jilin, Liaoning, Gansu and Shanxi provinces also belong to the cluster zone, possibly because of their proximity to high-prevalence areas and the close exchange of livestock and meat with them. At the same time, there is no doubt that most of the high-incidence areas are China's economically underdeveloped areas, with weak economies, a high proportion of ethnic minorities and less developed medical services. The government and individuals there have invested less in public health, so quarantine and immunization have not been carried out well or in time, which has led to a serious brucellosis situation.
Spatial-temporal cluster analysis identified one primary cluster and three secondary clusters. The primary cluster was located in Heilongjiang, Jilin, Liaoning and Inner Mongolia, with the clustering time concentrated between January 2012 and December 2013, which indicates that prevention and control measures in these regions still need to be strengthened, not only in the past but also in recent years. The secondary clusters were distributed in northwestern China and scattered across central China, mainly from January 2014 to December 2016, which suggests that brucellosis is moving from north to south; prevention and control awareness should therefore also be established in these areas.
Our study is not without limitations. First, the incidence of human brucellosis was underestimated to some extent, because our data were passively collected by a monitoring system, and the quality of surveillance data is influenced by many factors, such as the capacity of local health workers, the availability of laboratory diagnostics and the level of awareness of the need to visit a doctor, all of which may have affected the study's accuracy. Second, we used spatial-temporal scan statistics to detect clusters in different spaces and time periods, but this method relies only on circular spatial scans and cylindrical spatial-temporal scans and does not consider irregularly shaped areas.
In summary, given developments in animal husbandry in China, the prevention and detection of brucellosis urgently need to be strengthened. This spatial-temporal clustering study of brucellosis helps identify high-risk areas and times for brucellosis and, to a certain extent, provides a basis for decision-making by the relevant departments. Given the distribution characteristics of the brucellosis epidemic found in this study, we suggest that the detection of brucellosis in northern China should be further strengthened, that monitoring in central China should not be ignored, and that effective methods for controlling brucellosis should be found in southern China, where the incidence of brucellosis is relatively low. Governments at all levels should attach importance to establishing joint prevention and control mechanisms among high-incidence areas. Relevant departments in different regions should strengthen both prevention and control throughout the year. Resources for prevention and control should be increased appropriately in these areas to curb the spread of brucellosis.

Conclusion:
It may be concluded that human brucellosis continues to be a widespread challenge in mainland China, especially in the northwest provinces, which are high-risk areas. This study used ArcGIS 10.3 (ESRI, Redlands) and SaTScan to analyze the spatial and temporal distribution of brucellosis in China, thus contributing to the study of high-incidence seasons and areas for brucellosis. However, further research should focus on an analysis of environmental, humanistic and socioeconomic factors in order to determine the risk factors affecting the occurrence and transmission of brucellosis. Such information has the potential to provide critical guidelines for policy makers to initiate prevention measures and control strategies aimed at susceptible areas that might be at high risk, and to prevent or lessen the incidence of human brucellosis in these areas.

Figure 1
Age cluster distribution of human brucellosis incidence in mainland China from 2012 to 2016. The average incidence of human brucellosis among different age groups from 2012 to 2016 was divided into three clusters; the highest incidence was in the 40-60 groups.

Figure 2
The seasonal distribution of monthly human brucellosis in mainland China from 2012 to 2016. The seasonal decomposition of brucellosis incidence showed that there was an increasing trend of human brucellosis with distinct seasonality.

Figure 3
Spatial clustering of reported brucellosis in mainland China from 2012 to 2016. The darker color represents the primary clustering area, and the lighter color represents the secondary clustering area. It is obvious that high incidence areas were concentrated in the north of China. (Color is required for this figure.) Note: The designations employed and the presentation of the material on this map do not imply the expression of any opinion whatsoever on the part of Research Square concerning the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries. This map has been provided by the authors.

Figure 4
Spatial-temporal clustering of reported brucellosis in mainland China from 2012 to 2016. Figure A represents the distribution of human brucellosis incidence in different provinces of China from 2012 to 2016, and Figure B represents the distribution of human brucellosis incidence in each month from 2012 to 2016. Red represents a high incidence, while blue represents a low incidence. (Color is required for this figure.)

Figure 5
Spatial-temporal clustering of reported brucellosis in mainland China from 2012 to 2016. The circles represent the clustering areas: the largest represents the primary clustering area, and the rest represent the secondary clustering areas. (Color is required for this figure.)