Advanced Data Processing of Pancreatic Cancer Data Integrating Ontologies and Machine Learning Techniques to Create Holistic Health Records 

Our research paper, “Advanced Data Processing of Pancreatic Cancer Data Integrating Ontologies and Machine Learning Techniques to Create Holistic Health Records”, has been published in the journal “Sensors” (IF: 3.9).

The goal of this paper is to evaluate the implementation of an advanced data processing and harmonization mechanism, with a specific focus on a real-world use case that leverages data related to pancreatic cancer. Its contributions include:

  • The introduction of an end-to-end and holistic reference architecture and data ingestion mechanism for advanced data processing and analysis in a modern HIS;
  • A set of practical recommendations and implementations for the integration of techniques from the domains of data science, ML, and the Semantic Web;
  • The realization of the Holistic Health Record (HHR) data model through the integration, standardization, and harmonization of primary and secondary data;
  • An analysis and discussion of the industry-centric challenges that researchers in the healthcare domain face in data processing and analysis, such as data arriving in divergent formats and data that are not semantically interoperable.

The applicability of the introduced mechanism was validated on data collected in the HDM pilot of the iHelp project. The pilot focuses mainly on developing a predictive model that assesses an individual’s risk of developing pancreatic cancer, based on smoking cessation and the adoption of healthier lifestyle habits by the participants in the clinical study. It also seeks to establish solutions for the real-time monitoring of physiological parameters and for interaction with study participants through the tools created by the iHelp project. To this end, data from the hospital’s EHR system (primary data) and from Garmin wearable devices (secondary data) have been collected. Integrating, processing, and analyzing these data efficiently and effectively is a key challenge in modern healthcare: it is needed to capture all of an individual’s health determinants and to provide more precise and personalized recommendations and care plans.
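
To make the idea of combining primary and secondary data more concrete, here is a minimal Python sketch. It is not the mechanism or the HHR schema from the paper: the `HolisticRecord` class, the `build_records` function, the field names, and the sample values are all hypothetical and chosen only for illustration. It simply groups EHR rows and wearable readings under one record per participant.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class HolisticRecord:
    """Hypothetical, simplified stand-in for an HHR entry (not the paper's schema)."""
    patient_id: str
    primary: dict[str, Any] = field(default_factory=dict)          # EHR-derived fields
    secondary: list[dict[str, Any]] = field(default_factory=list)  # wearable readings


def build_records(ehr_rows, wearable_rows):
    """Group primary (EHR) and secondary (wearable) rows into one record per patient."""
    records: dict[str, HolisticRecord] = {}
    for row in ehr_rows:
        pid = row["patient_id"]
        rec = records.setdefault(pid, HolisticRecord(pid))
        # Later EHR rows for the same patient overwrite earlier values of the same field.
        rec.primary.update({k: v for k, v in row.items() if k != "patient_id"})
    for row in wearable_rows:
        pid = row["patient_id"]
        rec = records.setdefault(pid, HolisticRecord(pid))
        # Wearable readings are kept as a list, one entry per reading.
        rec.secondary.append({k: v for k, v in row.items() if k != "patient_id"})
    return records


# Illustrative, made-up input rows (not data from the study).
ehr_rows = [{"patient_id": "p1", "smoker": True, "age": 58}]
wearable_rows = [
    {"patient_id": "p1", "date": "2023-01-01", "steps": 4200},
    {"patient_id": "p1", "date": "2023-01-02", "steps": 6100},
]

records = build_records(ehr_rows, wearable_rows)
print(records["p1"])
```

A real pipeline would also need the standardization and semantic harmonization steps that the paper describes; this sketch covers only the final grouping step.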

The paper discusses these challenges and presents solutions that can advance the state of data-driven personalized decision support systems. Its main contribution is an advanced mechanism for health-related data processing that integrates Semantic Web and ML techniques and leverages the potential of integrated primary and secondary data in the HHR format. The viability of this approach has been evaluated on heterogeneous healthcare datasets pertaining to risk identification, individual monitoring, and care planning.
