When applied to healthcare, machine learning ushers in a new age of data-driven medical practice that holds great promise for better patient outcomes and individualized treatment. However, this evolution is not without significant difficulties, such as striking a balance between patient confidentiality and data use. In this study, we use ε-Differential Privacy as a privacy-protecting technique and a number of machine learning models to quantify the value of the data collected. Our research demonstrates a subtle trade-off, where more stringent privacy safeguards often result in less useful data, and vice versa. We stress the need for ethical frameworks, patient permission, and flexible privacy restrictions as means of negotiating this space. Achieving responsible and successful machine learning-enabled healthcare calls for a number of future steps, including optimization of privacy settings, adoption of federated learning, data ownership through blockchain, real-world validation, and extensive ethical guidance.
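The privacy-utility trade-off described above can be illustrated with a minimal sketch of the Laplace mechanism, one standard way to achieve ε-differential privacy for a numeric query. This is not the study's implementation: the heart-rate values, the clipping bounds (40 to 180), and the helper names `laplace_noise`, `private_mean`, and `mean_abs_error` are all assumptions made for illustration.

```python
import random

def laplace_noise(scale, rng):
    """Draw one sample from a zero-mean Laplace distribution with the given scale."""
    # The difference of two i.i.d. exponential draws with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def private_mean(values, lower, upper, epsilon, rng):
    """Release the mean of `values` with epsilon-differential privacy."""
    clipped = [min(max(v, lower), upper) for v in values]
    true_mean = sum(clipped) / len(clipped)
    # Replacing one record changes the clipped mean by at most (upper - lower) / n.
    sensitivity = (upper - lower) / len(clipped)
    return true_mean + laplace_noise(sensitivity / epsilon, rng)

def mean_abs_error(values, lower, upper, epsilon, trials=2000, seed=0):
    """Average absolute gap between the private mean and the clipped true mean."""
    rng = random.Random(seed)
    clipped_mean = sum(min(max(v, lower), upper) for v in values) / len(values)
    errors = [abs(private_mean(values, lower, upper, epsilon, rng) - clipped_mean)
              for _ in range(trials)]
    return sum(errors) / trials

# Hypothetical heart-rate readings (beats per minute), made up for illustration.
heart_rates = [62, 71, 88, 95, 67, 74, 80, 59, 102, 77]

for eps in (0.1, 1.0, 10.0):
    err = mean_abs_error(heart_rates, 40, 180, eps)
    print(f"epsilon={eps}: mean absolute error = {err:.2f}")
```

Because the noise scale is the sensitivity divided by ε, a smaller ε (stronger privacy) yields a proportionally larger expected error, which is the trade-off the abstract reports in qualitative terms.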
The use of wearables facilitates data collection at a previously unobtainable scale, enabling the construction of complex predictive models with the potential to improve health. However, the highly personal nature of these data requires strong privacy protection against data breaches and the use of data in a way that users do not intend. One method to protect user privacy while taking advantage of sharing data across users is federated learning, a technique that allows a machine learning model to be trained using data from all users while only storing a user's data on that user's device. By keeping data on users' devices, federated learning protects users' private data from data leaks and breaches on the researcher's central server and provides users with more control over how and when their data are used. However, there are few rigorous studies on the effectiveness of federated learning in the mobile health (mHealth) domain.
We review federated learning and assess whether it can be useful in the mHealth field, especially for addressing common mHealth challenges such as privacy concerns and user heterogeneity. The aims of this study are to describe federated learning in an mHealth context, apply a simulation of federated learning to an mHealth data set, and compare the performance of federated learning with the performance of other predictive models.
We applied a simulation of federated learning to predict the affective state of 15 subjects using physiological and motion data collected from a chest-worn device for approximately 36 minutes. We compared the results from this federated model with those from a centralized or server model and with the results from training individual models for each subject.
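The comparison between a federated model and a server model can be sketched with federated averaging (FedAvg), a common federated learning algorithm. The sketch below is a toy stand-in, not the study's setup: it fits a one-feature linear regression on synthetic data rather than a 3-class classifier on physiological signals. The 15 simulated clients merely echo the 15 subjects, and every function name and hyperparameter is an assumption.

```python
import random

def local_update(weights, data, lr=0.1, epochs=5):
    """Run a few epochs of gradient descent on one data set; return new weights."""
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = sum((w * x + b - y) * x for x, y in data) / n
        grad_b = sum((w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return (w, b)

def federated_average(client_weights, client_sizes):
    """Average client weights, weighting each client by its number of samples."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)

def run_fedavg(clients, rounds=50):
    """Each round: clients train locally, the server averages only the weights."""
    global_weights = (0.0, 0.0)
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        global_weights = federated_average(updates, [len(d) for d in clients])
    return global_weights

def make_clients(n_clients=15, seed=0):
    """Synthetic per-client data drawn from y = 2x + 1 plus Gaussian noise."""
    rng = random.Random(seed)
    clients = []
    for _ in range(n_clients):
        n = rng.randint(20, 40)
        xs = [rng.uniform(-1, 1) for _ in range(n)]
        clients.append([(x, 2.0 * x + 1.0 + rng.gauss(0, 0.1)) for x in xs])
    return clients

clients = make_clients()
fed_weights = run_fedavg(clients)

# "Server" baseline: pool every client's raw data centrally and train once.
pooled = [row for data in clients for row in data]
server_weights = local_update((0.0, 0.0), pooled, epochs=250)

print("federated:", fed_weights)
print("server:   ", server_weights)
```

In the federated run, raw samples never leave a client; only the two model weights are exchanged. The server baseline needs all raw data in one place, which is the privacy cost the federated approach avoids.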
In a 3-class classification problem using physiological and motion data to predict whether the subject was undertaking a neutral, amusing, or stressful task, the federated model achieved 92.8% accuracy on average, the server model achieved 93.2% accuracy on average, and the individual model achieved 90.2% accuracy on average.
Our findings support the potential for using federated learning in mHealth. The results showed that the federated model performed better than a model trained separately on each individual and nearly as well as the server model. As federated learning offers more privacy than a server model, it may be a valuable option for designing sensitive data collection methods.
The exploitation of synthetic data in health care is at an early stage. Synthetic data could unlock the potential within health care datasets that are too sensitive for release. Several synthetic data generators have been developed to date; however, studies evaluating their efficacy and generalizability are scarce.
This work sets out to understand the difference in performance of supervised machine learning models trained on synthetic data compared with those trained on real data.
A total of 19 open health datasets were selected for experimental work. Synthetic data were generated using three synthetic data generators that apply classification and regression trees, parametric, and Bayesian network approaches. Real and synthetic data were used (separately) to train five supervised machine learning models: stochastic gradient descent, decision tree, k-nearest neighbors, random forest, and support vector machine. Models were tested only on real data to determine whether a model developed by training on synthetic data can be used to accurately classify new, real examples. The impact of statistical disclosure control on model performance was also assessed.
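The evaluation protocol described above (train on real or synthetic data, test only on real data) can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's pipeline: the "real" data are a made-up two-class Gaussian toy set, the generator is a simple per-class parametric (independent Gaussian) sampler, and a small hand-written k-nearest neighbors classifier stands in for the five models. All names are hypothetical.

```python
import random
import statistics
from collections import Counter

def make_real_data(n_per_class, rng):
    """Toy 'real' data set: two features, two well-separated Gaussian classes."""
    centers = {0: (0.0, 0.0), 1: (3.0, 3.0)}
    return [((rng.gauss(cx, 1.0), rng.gauss(cy, 1.0)), label)
            for label, (cx, cy) in centers.items()
            for _ in range(n_per_class)]

def parametric_synthesize(real, n_per_class, rng):
    """Fit an independent Gaussian per class and feature, then sample new rows."""
    synthetic = []
    for label in sorted({lab for _, lab in real}):
        rows = [x for x, lab in real if lab == label]
        params = [(statistics.mean(col), statistics.stdev(col)) for col in zip(*rows)]
        for _ in range(n_per_class):
            synthetic.append((tuple(rng.gauss(m, s) for m, s in params), label))
    return synthetic

def knn_predict(train, x, k=5):
    """Majority vote among the k training rows closest to x (squared distance)."""
    nearest = sorted(
        train, key=lambda row: sum((a - b) ** 2 for a, b in zip(row[0], x))
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def accuracy(train, test):
    """Fraction of test rows whose predicted label matches the true label."""
    return sum(knn_predict(train, x) == y for x, y in test) / len(test)

rng = random.Random(42)
real_train = make_real_data(100, rng)
real_test = make_real_data(100, rng)      # held-out real data, never synthesized
synthetic_train = parametric_synthesize(real_train, 100, rng)

acc_real = accuracy(real_train, real_test)
acc_synth = accuracy(synthetic_train, real_test)
print(f"trained on real, tested on real:      {acc_real:.3f}")
print(f"trained on synthetic, tested on real: {acc_synth:.3f}")
print(f"deviation in accuracy:                {acc_real - acc_synth:+.3f}")
```

The quantity of interest is the deviation between the two accuracies. On a toy problem this easy the gap is expected to be small, so the sketch shows the protocol rather than reproducing the study's findings.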
A total of 92% of models trained on synthetic data have lower accuracy than those trained on real data. Tree-based models trained on synthetic data have deviations in accuracy from models trained on real data of 0.177 (18%) to 0.193 (19%), while other models have lower deviations of 0.058 (6%) to 0.072 (7%). The winning classifier when trained and tested on real data versus models trained on synthetic data and tested on real data is the same in 26% (5/19) of cases for classification and regression tree and parametric synthetic data and in 21% (4/19) of cases for Bayesian network-generated synthetic data. Tree-based models perform best with real data and are the winning classifier in 95% (18/19) of cases. This is not the case for models trained on synthetic data. When tree-based models are not considered, the winning classifier for real and synthetic data is matched in 74% (14/19), 53% (10/19), and 68% (13/19) of cases for classification and regression tree, parametric, and Bayesian network synthetic data, respectively. Statistical disclosure control methods did not have a notable impact on data utility.
The results of this study are promising with small decreases in accuracy observed in models trained with synthetic data compared with models trained with real data, where both are tested on real data. Such deviations are expected and manageable. Tree-based classifiers have some sensitivity to synthetic data, and the underlying cause requires further investigation. This study highlights the potential of synthetic data and the need for further evaluation of their robustness. Synthetic data must ensure individual privacy and data utility are preserved in order to instill confidence in health care departments when using such data to inform policy decision-making.
With recent technological advances, health recommendation systems are gaining public attention as a way to obtain health care services online. Traditional health recommendation systems are insecure because they lack safeguards against intruders, and they are poorly suited to suggesting appropriate recommendations. This makes people hesitant to share sensitive medical information. Hence, it is essential to design a privacy-preserving health recommendation system that guarantees privacy and also suggests top-N recommendations to users based on their preferences and earlier feedback. To address these issues, we propose a stacked discriminative de-noising convolution auto-encoder–decoder with a two-way recommendation scheme that provides secure and efficient health data to end users. In this scheme, privacy is assured to users through the modified blowfish algorithm. For structuring the big data collected from patients, the Hadoop transform is used. Here, the two-way system analyzes and learns more effective features from the explicit and implicit information of the patient individually, and finally, all the learned features are fused to provide an efficient recommendation. The performance of the proposed system is analyzed with different statistical metrics and compared with recent approaches. The results show that the proposed system performs better than the earlier approaches.
Cyber resilience is an organization's capability to handle risks and to prepare for responding to and recovering from them. A cyber-resilient organization can handle unknown or little-known threats and is ready to face such adversities and challenges. Machine learning (ML)-based systems that use healthcare datasets to detect diseases such as Streptococcus pharyngitis will be expected to operate in contested and adversarial environments. Every operation these datasets support depends on their capacity to adjust to threats. To minimize the risk of misdiagnosis and to support early diagnosis of the disease, an intelligent ML method is required. ML has achieved significant success in almost all domains and has proved its ability in the healthcare sector as well. This research presents a comparison of different ML algorithms for detecting pharyngitis. The study revealed that, with a reduced feature set, Random Forest performs best with 70.11% accuracy and outperformed all other implemented techniques. The authors propose a new privacy framework to protect patient health care data.
Abstract
Privacy and security in the medical field are major aspects to consider in the current era. This is due to the great need for data among providers, payers, and patients. Recently, more researchers have been contributing to this field from different angles. However, security still needs further enhancement. In this context, this paper proposes a new IoT-dependent health care privacy preservation model that relies on a machine learning algorithm. Here, the data from IoT devices are subjected to the proposed sanitization process by generating an optimal key. In this work, the machine learning model serves as the main route to generating optimal keys, as it is already trained with a neural network. Moreover, identifying the optimal key is the main challenge, which is addressed by a new Improved Dragonfly Algorithm. Thereby, the sanitization process works, and finally, the sanitized data are uploaded to IoT. Data restoration is the inverse process of sanitization, which gives back the original data. Finally, the performance of the proposed work is validated against state-of-the-art models in terms of sanitization and restoration analysis.
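The relationship between sanitization and restoration (restoration as the exact inverse, given the same key) can be illustrated with a minimal sketch. The abstract does not specify the transform, so the XOR masking below is purely an assumed stand-in, and the record, the key bytes, and the function names `sanitize` and `restore` are hypothetical. The paper selects its key through optimization, which this sketch does not attempt, and a repeating-key XOR is not a secure scheme on its own.

```python
def sanitize(data: bytes, key: bytes) -> bytes:
    """Mask each byte with the repeating key using XOR (assumed stand-in transform)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def restore(sanitized: bytes, key: bytes) -> bytes:
    """Inverse of sanitize: XOR with the same key recovers the original bytes."""
    return sanitize(sanitized, key)

# Hypothetical IoT health record and placeholder key, made up for illustration.
record = b"patient_id=0001;heart_rate=72"
key = bytes([0x5A, 0xC3, 0x1F, 0x88])

masked = sanitize(record, key)
recovered = restore(masked, key)

print("sanitized:", masked.hex())
print("restored: ", recovered.decode())
```

Only a holder of the key can undo the masking, which is why key selection carries the weight in the scheme the abstract describes.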