Advanced Machine Learning in Health Risk Prediction and Mapping: Theories and Applications
(Jiming Liu, Yang Liu, et al.)


Project Description:

The vigorous development of artificial intelligence (AI) in recent years has offered a science-grounded approach to addressing challenges in health risk prediction and mapping. Toward this end, machine learning (ML) methods have been proposed and have shown promising results in disease transmission modeling, prediction, risk assessment, spatiotemporal analytics, and healthcare resource allocation. However, most existing methods share one major limitation: they do not adequately tackle the difficult real-world ML issues in health risk prediction and mapping, such as data heterogeneity and incompleteness, spatiotemporal heterogeneity and multi-scale dependencies of disease dynamics, and resource scarcity. As a result, the real-world impact of these methods remains unsatisfactory. Moreover, given a specific health risk prediction and mapping task and the available data sources, the underlying theoretical ML issues of how to select the most appropriate ML model, theoretically analyze the model’s learning behavior, and quantitatively characterize the model’s learning capacity remain largely unexplored.

Building on our ongoing research in AI and ML, as well as our previous experience working with domain experts on AI/ML-enabled disease control and prevention, this research project aims to make a major step forward by developing a novel ML framework and an information-theoretic approach to understanding, modeling, and analyzing spatiotemporal domain tasks in general, and public health and epidemiology tasks in particular. Specifically, the project will consist of three key milestones:

  1. We will design and demonstrate a novel ML framework that addresses the issues of how to integrate data from heterogeneous sources, how to capture information at multiple spatiotemporal scales, and how to integrate information from different scales for subsequent learning tasks such as health risk prediction.


    Figure 1. An illustration of the designed Interactively and Integratively connected Deep Recurrent Neural Network (I2DRNN) model.

  2. We will theoretically formulate and prove that the designed models are able to adequately learn complex multi-scale spatiotemporal dependencies in risk prediction and mapping. We will develop an information-theoretic approach to examining the information-based learning capacities of the proposed models. In so doing, we expect to answer some of the important open questions in ML, e.g., how to determine the appropriate configuration of a designed learning model for a given learning task.


    Figure 2. Analysis of the model behavior.


    Figure 3. Analysis of the learning capacity of the designed model.

  3. We will validate the designed learning models and the information-theoretic approach by systematically conducting a series of experiments on public health and disease risk modeling, prediction, and mapping, involving both synthetic and real-world datasets. Moreover, we will examine the learning behaviors of the designed models as well as the model configurations derived from the information-theoretic analysis in the real-world context.


    Figure 4. Real-world validations: disease transmission network discovery.


    Figure 5. Real-world validations: risk mapping and optimization of resource allocation.
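As a rough illustration of the kind of quantity an information-theoretic analysis (milestone 2) might examine, the sketch below estimates the mutual information between a model's inputs and a discretized internal representation. It is a generic plug-in estimator on toy data, not the project's actual method; the function name `mutual_information` and the toy sequences are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in bits from paired discrete samples.

    Illustrative only: a generic estimator, not the project's method.
    """
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    joint = Counter(zip(xs, ys))  # counts of each (x, y) pair
    px = Counter(xs)              # marginal counts of x
    py = Counter(ys)              # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2( p(x, y) / (p(x) * p(y)) ), written with counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


# Hypothetical toy data: binary inputs and two candidate "representations".
inputs = [0, 0, 1, 1, 0, 1, 0, 1]
copy_repr = list(inputs)         # keeps all information about the input
const_repr = [0] * len(inputs)   # discards all information about the input

mi_copy = mutual_information(inputs, copy_repr)
mi_const = mutual_information(inputs, const_repr)
print(mi_copy, mi_const)
```

In this toy setting, a representation that copies a balanced binary input retains one bit of information, while a constant representation retains none. Comparing such quantities across model configurations is one simple way to think about information-based learning capacity, under the assumptions stated above.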



This project will enrich interdisciplinary research in AI, ML, public health, and epidemiology. Its outcomes will provide timely methodological and practical foundations for health risk prediction and mapping, strengthen academic and public health collaborations both regionally and internationally, and, most importantly, contribute to the well-being of society.


Publications:

[1] H. Pei, B. Yang, J. Liu, and K. Chang, Active surveillance via group sparse Bayesian learning, IEEE Transactions on Pattern Analysis and Machine Intelligence, 44(3): 1133-1148, 2022.

[2] J. Ren, M. Liu, Y. Liu, and J. Liu, Optimal resource allocation with spatiotemporal transmission discovery for effective disease control, Infectious Diseases of Poverty, 11(1): 34, 2022.

[3] Q. Tan, Y. Liu, and J. Liu, Demystifying deep learning in predictive spatiotemporal analytics: An information-theoretic framework, IEEE Transactions on Neural Networks and Learning Systems, 32(8): 3538-3552, 2021.

[4] Y. Liu, Z. Gu, and J. Liu, Uncovering transmission patterns of COVID-19 outbreaks: A region-wide comprehensive retrospective study in Hong Kong, EClinicalMedicine, The LANCET Discovery Science, 36: 100929, 2021.

[5] Y. Liu, Z. Gu, S. Xia, B. Shi, X.-N. Zhou, Y. Shi, and J. Liu, What are the underlying transmission patterns of COVID-19 outbreak? – An age-specific social contact characterization, EClinicalMedicine, The LANCET Discovery Science, 22: 100354, 2020.


For further information on this research topic, please contact Prof. Jiming Liu or Dr. Yang Liu.