Organizations use information systems to collect, filter, process, create and distribute data. In healthcare delivery, patients must share information with certain categories of health personnel to support correct diagnosis and appropriate treatment. There have been cases of unauthorized access to patient information by health personnel, some of whom went on to cause great harm to patients by divulging sensitive information. Existing Data Privacy Preservation (DPP) models are designed for Clinical Decision Support Systems, and little information is available on DPP in Health Information Systems (HIS) in Nigeria. This research therefore focused on developing a DPP model for HIS to address this inadequacy.
A model for DPP in HIS was developed using the iterative design technique. The model comprises a local database that holds the health information of patients, the Random Forest Decision Tree (RFDT) algorithm, an attribute blocking module that employs the RFDT algorithm, an attribute unblocking module that also uses the RFDT algorithm, and a module that computes the time elapsed in unblocking attributes. Mandatory Role-based Access Control was used to restrict the access health professionals have to patient data; each category of health worker can view only the attribute(s) needed to provide the service their role requires. An application based on the RFDT algorithm was developed to instantiate the model, following the Waterfall Software Development Life Cycle. The programming environment consisted of the NetBeans Integrated Development Environment, MySQL server, Java Development Kit 8, Scene Builder 2.0 and the Navicat 8 query editor. The application was evaluated against a machine learning approach to DPP based on classification: its efficiency in ensuring DPP was compared with that of the RFDT algorithm run in the Waikato Environment for Knowledge Analysis (WEKA) version 3.8 software.
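The role-based restriction and the timing of attribute unblocking described above can be illustrated with a minimal sketch. It is not the application developed in the study: the role names, attribute names, and the functions `block_attributes` and `timed_unblock` are hypothetical, and a static lookup table stands in for the RFDT-driven blocking and unblocking modules.

```python
import time

# Hypothetical role-to-attribute mapping; the roles and attribute names
# are illustrative assumptions, not taken from the study.
ROLE_ATTRIBUTES = {
    "doctor": {"name", "age", "diagnosis", "prescription"},
    "pharmacist": {"name", "prescription"},
    "records_officer": {"name", "age"},
}


def block_attributes(record, role):
    """Return a copy of the record holding only the attributes the role may view."""
    allowed = ROLE_ATTRIBUTES.get(role, set())  # unknown roles see nothing
    return {key: value for key, value in record.items() if key in allowed}


def timed_unblock(record, role):
    """Return the role's view of the record and the seconds spent producing it."""
    start = time.perf_counter()
    view = block_attributes(record, role)
    elapsed = time.perf_counter() - start
    return view, elapsed


# Made-up patient record used only to exercise the sketch.
patient = {
    "name": "A. Patient",
    "age": 42,
    "diagnosis": "Condition Y",
    "prescription": "Drug X",
}

view, elapsed = timed_unblock(patient, "pharmacist")
print(sorted(view))  # attributes visible to the pharmacist role
print(f"unblocking took {elapsed:.6f} s")
```

Denying every attribute to an unrecognized role is a deliberate default in this sketch, so that a missing role entry cannot expose patient data.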
The model developed in this study provides a generic framework for DPP in HIS that identifies the necessary components, and it offers a template that could be adapted for use in other studies on DPP in HIS. The application provides health personnel with Graphical User Interfaces that reflect each professional’s access to the patient database while restricting access to attributes not permitted for that category of health worker. The RFDT algorithm in WEKA gave an efficiency of 73.77% for DPP, while the application gave an efficiency of 78.32%.
The model presented in this study would help protect sensitive patient data from access by health workers who are not authorized to view it. The study showed that the application is more efficient than the WEKA software in ensuring DPP using the RFDT algorithm. The proposed DPP model could also be applied in domains outside the health sector to curb the challenges that result from weak DPP.
Keywords: Health Information System, Machine Learning, Data Privacy Preservation Model, Software Development Life Cycle, Random Forest Decision Tree
Word Count: 463