AI and Data-Mining for Understanding Disease Transmission Patterns
(Jiming Liu, William K. Cheung, et al.)

In this project, computational techniques and tools will be developed for collecting, mining, and visualizing source data as well as disease transmission/diffusion networks, in order to detect, monitor, and control emerging disease spread or epidemic outbreaks in a timely manner. Unlike existing SaTScan clustering, we define the temporal-spatial disease diffusion network to be detected as a set of nodes and links: the nodes may be the cases reported or observed over time, and the directional links connecting the nodes may correspond to the probability or likelihood of disease “diffusion” from one node to another over time. In the surveillance of certain infectious or parasitic diseases, the discovered links may provide insights into “transmission” patterns that arise from hidden ecological or household environments within the region, e.g., hidden patterns of mosquito ecology or evolution over space and time. The discovered links could therefore reveal those hidden “transmission” pathways.
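The node-and-link definition above can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration, not the project's method: the `Case` record, the `diffusion_links` function, the exponential decay in distance and time, and the two scale parameters are all hypothetical. Each reported case is a node, and each earlier case gets a directed link to each later case, weighted so that the incoming weights of a case sum to 1.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Case:
    """A reported case (hypothetical record layout)."""
    case_id: str
    day: float   # report time, in days
    x_km: float  # location, in km
    y_km: float


def diffusion_links(cases, dist_scale_km=10.0, time_scale_days=7.0):
    """Return {(source_id, target_id): weight} for every earlier -> later pair.

    The weight is an assumed kernel that decays with distance and time lag,
    normalised so the incoming weights of each target sum to 1.
    """
    links = {}
    for target in cases:
        raw = {}
        for source in cases:
            lag = target.day - source.day
            if lag <= 0:
                continue  # links only run forward in time
            dist = math.hypot(target.x_km - source.x_km,
                              target.y_km - source.y_km)
            raw[source.case_id] = (math.exp(-dist / dist_scale_km)
                                   * math.exp(-lag / time_scale_days))
        total = sum(raw.values())
        if total > 0:
            for source_id, weight in raw.items():
                links[(source_id, target.case_id)] = weight / total
    return links


# Toy cases, not real surveillance data.
cases = [Case("a", 0, 0, 0), Case("b", 3, 2, 0), Case("c", 5, 2, 1)]
links = diffusion_links(cases)
```

In this toy example, case `c` receives links from both `a` and `b`, and the link from `b` is the stronger one because `b` is closer to `c` in both space and time.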


Recent Findings:

  1. Inferring the potential risks of H7N9 infection in eastern China
    • We have developed a diffusion model to spatiotemporally characterize the impacts of bird migration and poultry distribution on the geographic spread of H7N9 infection.
    • Based on available environmental and meteorological data for eastern China over 12 weeks (4 February to 28 April 2013), we have estimated the likelihood of bird migration from Zhejiang, Shanghai and Jiangsu to Liaoning, Jilin and Heilongjiang. From the estimated likelihood of migrant birds moving over these provinces, we have assessed the impact of migration on the potential spatiotemporal spread of H7N9 infection.
    • Based on the poultry production and consumption in 31 provinces/municipalities in the Mainland, we have found that although the majority of early H7N9 cases were found in Shanghai, the city is not a major poultry exporter according to the model, and the results showed limited transmission via this route. In contrast, Jiangsu distributes poultry to Shanghai, Zhejiang and beyond, so transmission via this route poses a greater risk of spreading H7N9 infection (Figure 1).
    • We have also developed a map displaying the estimated spatiotemporal patterns of the integrated risk caused by both bird migration and poultry distribution in eastern China (Figure 2).

      Figure 1. The map of relative H7N9 infection risk in mainland China, estimated by assuming that the infection originated from the southeastern regions of Zhejiang, Shanghai, and Jiangsu (Jiangsu has a slightly higher likelihood).

      Figure 2. The estimated spatiotemporal patterns of integrated risk caused by both bird migration and poultry distribution in eastern China. During the first 8 weeks, in addition to Jiangsu, Shanghai, and Zhejiang, other provinces and municipalities such as Hebei, Anhui, Shandong, and Beijing also show a risk of H7N9 infection. After Week 8, sustained infection risks can still be observed in Hebei, Tianjin, and Beijing until Week 12.
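How the two risk sources are merged into the integrated risk is not spelled out here. As one possible illustration only, the sketch below combines a per-province bird-migration risk and a poultry-distribution risk for a single week, assuming that the two routes act independently. The function name `integrated_risk`, the independence assumption, and all numeric values are hypothetical placeholders, not the study's model or data.

```python
def integrated_risk(bird_risk, poultry_risk):
    """Combine two per-province risks, each in [0, 1].

    Assumption (not from the study): the two routes are independent, so a
    province escapes infection only if it escapes both routes.
    A province missing from one input is treated as having zero risk there.
    """
    provinces = set(bird_risk) | set(poultry_risk)
    return {
        p: 1.0 - (1.0 - bird_risk.get(p, 0.0)) * (1.0 - poultry_risk.get(p, 0.0))
        for p in provinces
    }


# Placeholder values for one week, NOT taken from the study.
bird = {"Jiangsu": 0.30, "Hebei": 0.10}
poultry = {"Jiangsu": 0.50, "Shanghai": 0.20}
risk = integrated_risk(bird, poultry)
```

Repeating the calculation for each of the 12 weeks would give one risk map per week, which is the kind of spatiotemporal pattern Figure 2 displays.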

  2. Inferring Plasmodium vivax transmission networks in Yunnan, China
    • We have proposed a general machine learning framework, which consists of the interactions between malaria transmission models and machine learning models, to infer underlying malaria transmission networks (Figure 3).
    • We have implemented a spatial transmission model and a recurrent neural network method to infer the transmission networks of Plasmodium vivax among 62 towns in four adjacent counties of Yunnan province, China, which have experienced high malaria endemicity in recent years (Figure 4).
    • By conducting scenario analysis, we have (i) examined the significance of the inferred transmission networks, (ii) estimated the number of imported cases for each individual town, and (iii) quantified the roles of individual towns in the geographical spread of Plasmodium vivax.

      Figure 3. An illustration of the proposed machine learning approach to predicting the patterns of malaria transmission. On the one hand, the surveillance data can serve as continuous inputs for a malaria transmission model, which is used to predict malaria transmission patterns. On the other hand, the surveillance data can also serve as measurements for an appropriate machine learning model, so that both the malaria transmission model and its parameters can be adjusted accordingly.

      Figure 4. The inferred Plasmodium vivax transmission networks for scenarios with 60%, 70%, 80%, and 90% imported cases. The colors represent the relative strength of malaria transmission from one town to another. Note that the diagonal entries refer to the self-propagation of Plasmodium vivax within individual towns.
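Once a town-to-town transmission matrix like the one in Figure 4 is available, quantities such as imported cases per town and each town's role in the spread can be read from it. The sketch below is a hedged illustration of that bookkeeping, not the inference method itself. It assumes a hypothetical matrix `flow` in which `flow[i][j]` is the expected number of cases in town `j` attributable to town `i`, so diagonal entries are self-propagation; the town names and numbers are toy values.

```python
def summarise_network(flow, towns):
    """Summarise a transmission matrix town by town.

    flow[i][j] is assumed to be the expected number of cases in town j
    attributable to town i (row = source, column = destination).
    """
    n = len(towns)
    summary = {}
    for k, town in enumerate(towns):
        # Cases arriving from other towns (column k, off-diagonal).
        imported = sum(flow[i][k] for i in range(n) if i != k)
        # Cases this town seeds elsewhere (row k, off-diagonal).
        exported = sum(flow[k][j] for j in range(n) if j != k)
        local = flow[k][k]  # self-propagation within the town
        total = imported + local
        summary[town] = {
            "local": local,
            "imported": imported,
            "exported": exported,
            "imported_share": imported / total if total > 0 else 0.0,
        }
    return summary


# Toy 3-town matrix, not the inferred Yunnan network.
towns = ["A", "B", "C"]
flow = [
    [4.0, 1.0, 0.0],
    [2.0, 3.0, 1.0],
    [0.0, 0.0, 5.0],
]
summary = summarise_network(flow, towns)
```

In this toy matrix, town B exports the most cases while town C only receives them, which is the kind of distinction meant by quantifying the roles of individual towns.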


Grant Support:

This project is supported by the Research Grants Council (RGC), Hong Kong SAR, China (Project HKBU211212) and the Hong Kong Baptist University Strategic Development Grant (Project SDF10-0526-P08).

For further information on this research topic, please contact Prof. Jiming Liu.